This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\articlenumber

12

Height of walks with resets, the Moran model, and the discrete Gumbel distribution

Rafik Aguech1\orcid0000-0002-4483-9356 Asma Althagafi2\orcid0000-0001-5499-0810  and  Cyril Banderier3\orcid0000-0003-0755-3022 1 Department of Statistics and Operations Research, King Saud Univ., Saudi Arabia, and Department of Mathematics, University of Monastir, Tunisia;
\websitehttps://faculty.ksu.edu.sa/en/raguech
2 Department of Statistics and Operations Research, King Saud Univ., Saudi Arabia; \websitehttps://www.researchgate.net/profile/Asma-Althagafi 3 Laboratoire d’Informatique de Paris Nord, Univ. Sorbonne Paris Nord, France; \websitehttp://lipn.fr/ banderier

Abstract.

In this article, we consider several models of random walks in one or several dimensions, additionally allowing, at any unit of time, a reset (or “catastrophe”) of the walk with probability qq. We establish the distribution of the final altitude. We prove algebraicity of the generating functions of walks of bounded height hh (showing in passing the equivalence between Lagrange interpolation and the kernel method). To get these generating functions, our approach offers an algorithm of cost O(1)O(1), instead of cost O(h3)O(h^{3}) if a Markov chain approach would be used. The simplest nontrivial model corresponds to famous dynamics in population genetics: the Moran model.

We prove that the height of these Moran walks asymptotically follows a discrete Gumbel distribution. For q=1/2q=1/2, this generalizes a model of carry propagation over binary numbers considered e.g. by von Neumann and Knuth. For generic qq, using a Mellin transform approach, we show that the asymptotic height exhibits fluctuations for which we get an explicit description (and, in passing, new bounds for the digamma function). We end by showing how to solve multidimensional generalizations of these walks (where any subset of particles is attributed a different probability of dying) and we give an application to the soliton wave model.

Key words and phrases:
Random walks, renewal process, Moran model, analytic combinatorics, discrete Gumbel distribution, Mellin transform, kernel method, digamma function

1. Introduction

The height of random walks is a fundamental parameter which occurs in many domains: in computer science (evolution of a stack, tree traversals, or cache algorithms [39]), in reliability or failure theory (maximal age of a component and inference statistics on the longevity before replacement [24]), in queueing theory (maximal length of the queue, with e.g. applications to traffic jam analysis [37]), in mathematical finance (e.g. in risk theory [28]), in bioinformatics (pattern matching and sequence alignment [2]), etc.

In combinatorics, random walks are studied via the corresponding notion of lattice paths, which play a central role, not only for intrinsic properties of such paths, but also as they are in bijection with many fundamental structures (trees, words, maps, …). We refer to the nice magnum opus of Flajolet and Sedgewick on analytic combinatorics [22] for many enumerative and asymptotic examples.

While the behavior of an extremal parameter such as the height is well understood for walks corresponding to Brownian motion theory, it becomes more subtle when a notion of reset/renewal/resetting/catastrophe [8, 14, 33, 9, 29, 42, 40] is introduced in the model: indeed, typical behaviors in this model are often established by conditioning on events of probability zero in the model without reset, leading to possibly counterintuitive results.

In this article, we give several enumerative and asymptotic results on different statistics (final altitude, waiting time, height) of walks with resets, focusing on the so-called Moran walks (walks related to biological/population models considered by Moran in 1958; see Section 5 for more on this).


Plan of the article.

In Section 2, we consider a generic model of walks with resets (allowing any finite set of steps and a reset step). We describe the behavior of their final altitude (at finite time, and asymptotically). We obtain an algebraic closed form for the bivariate generating function (length/final altitude) for walks of bounded height hh. Our approach uses a variant of the so-called kernel method, which has the advantage to avoid any case-by-case computation based on Markov chains/transfer matrices of size h×hh\times h. In passing, we show the intimate link between Lagrange interpolation and the kernel method.

In Section 3, we consider Moran walks, a model described in Figure 1, for which we generalize an enumerative formula due to Pippenger [45]. We show that their height asymptotically follows a distribution which involves non-trivial fluctuations. We prove that this distribution is a discrete Gumbel distribution, and we clarify its links with the continuous Gumbel distribution. We give an application to the waiting time for reaching any given altitude.

In Section 4, we begin with a brief presentation of the Mellin transform method, and then use it to derive a precise analysis of the asymptotic average and variance of the height. The second asymptotic term involves some O(1)O(1) fluctuations given by a Fourier series (which we prove to be infinitely differentiable, and for which we also derive generic bounds of independent interest). This extends (and fixes some error terms) in earlier analyses by von Neumann, Knuth, Flajolet and Sedgewick [13, 38, 22].

In Section 5, we tackle some multidimensional generalizations of Moran walks, with applications to a model in population genetics and to a wave propagation model (a soliton model), as considered by Itoh, Mahmoud, and Takahashi in [35, 34].

In Section 6, we conclude with a few possible extensions for future work.

Refer to caption
Figure 1. A Moran walk is a random walk which makes a jump +1+1 with probability pp, and a reset (a jump to 0) with probability 1p1-p. Above, one sees such a walk of length n=30n=30. Its final altitude is Yn=1Y_{n}=1, the height is Hn=5H_{n}=5 (reached twice, in red), having 7 resets (the 7 blue dots). In this article, we tackle the enumeration and asymptotics of such paths (and of generalizations involving more general step sets and higher dimension). We also prove that this simple model of walks leads to some noteworthy nontrivial asymptotic behavior of their height HnH_{n}.

2. Walks with resets: final altitude and height

We consider walks with steps in 𝒮\mathcal{S} (where 𝒮\mathcal{S} is a nonempty finite subset of \mathbb{Z}), which can additionally have a reset at any altitude. That is, we have the following process on \mathbb{Z}:

Y0\displaystyle Y_{0} =0\displaystyle=0
Yn+1\displaystyle Y_{n+1} ={Yn+k, with probability pk (for each k, with pk:=0 if k𝒮),0, with probability q (with q+k𝒮pk=1).\displaystyle=\left\{\begin{array}[]{ll}Y_{n}+k,&\hbox{ with probability }p_{k}\text{\qquad(for each $k\in\mathbb{Z}$, with $p_{k}:=0$ if $k\not\in\mathcal{S}$)},\\ \\ 0,&\hbox{ with probability }q\text{\qquad(with $q+\sum_{k\in\mathcal{S}}p_{k}=1$)}.\end{array}\right.

(So if Yn=0Y_{n}=0 we have Yn+1=0Y_{n+1}=0 with probability p0+qp_{0}+q.)

Thus, YnY_{n} is the altitude of the process after nn steps and Hn:=max(Y0,,Yn)H_{n}:=\max(Y_{0},\dots,Y_{n}) is its height. It is convenient to encode the steps and their probabilities by the Laurent polynomial

P(u):=k=cdpkuk (with c:=min𝒮 and d:=max𝒮).P(u):=\sum_{k=c}^{d}p_{k}u^{k}\text{\qquad(with $c:=\min{\mathcal{S}}$ and $d:=\max{\mathcal{S}}$)}. (1)

We assume 0<q<10<q<1 to avoid degenerate cases. We do not require that c<0c<0 or d>0d>0. Of course, if c0c\geq 0, the walk will live by design in \mathbb{N} (it is e.g. the case for Moran walks of Figure 1). In Section 2.1, we determine the distribution of the final altitude (as illustrated in Figure 2 for different families of steps) and we investigate the height in Section 2.2.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 2. Plot of (Yn=k)\mathbb{P}(Y_{n}=k), the distribution of the altitudes of walks with resets, for n=100n=100 and different P(u)P(u). It has its support in the \mathbb{N}-linear combinations of steps from 𝒮\mathcal{S}. The final altitude is of order O(1)O(1) and the probability to end at higher altitudes decreases exponentially fast (see Theorem 2.1 for closed-form expressions of the mean and the distribution).

2.1. Final altitude YnY_{n}

Let us start with a simple result which paves the way for the more subtle generating function manipulations for the height that we tackle later in Section 2.2.

We use the classical convenient notations:

  • [zn]G(z)[z^{n}]G(z) stands for the coefficient of znz^{n} in the power series G(z)G(z),

  • ujF(z,1)\partial_{u}^{j}F(z,1) is the jj-th derivative of F(z,u)F(z,u) with respect to uu, evaluated at u=1u=1.

Theorem 2.1 (Final altitude at finite time).

The final altitude of walks with resets follows a discrete law with probability generating function

F(z,u)=n0𝔼[uYn]zn=1+qz/(1z)1zP(u),F(z,u)=\sum_{n\geq 0}\mathbb{E}[u^{Y_{n}}]z^{n}=\frac{1+qz/(1-z)}{1-zP(u)},\vspace{-1.1mm} (2)

where P(u)P(u) is the Laurent polynomial encoding the allowed steps (a finite subset of \mathbb{Z}). Equivalently, for kk\in\mathbb{Z}, we have

(Yn=k)=[uk]P(u)n+q[uk]j=0n1P(u)j.\mathbb{P}(Y_{n}=k)=[u^{k}]P(u)^{n}+q[u^{k}]\sum_{j=0}^{n-1}P(u)^{j}.\vspace{-1.5mm} (3)

Let δ:=P(1)\delta:=P^{\prime}(1) be the drift111We recall that P(1)=1qP(1)=1-q, so another convention could have been to call drift the quantity P(1)/(1q)P^{\prime}(1)/(1-q), i.e., we would then condition on having no reset (instead of considering walks without reset, weighted by the initial model (1)). This alternative convention does not simplify the subsequent formulas. of the walk without reset, and V:=P′′(1)V:=P^{\prime\prime}(1) its second factorial moment. The mean and the variance of the final altitude of the walk with resets are given by

𝔼[Yn]=δ/q+(1q)n1(δδ/q),\mathbb{E}[Y_{n}]=\delta/q+(1-q)^{n-1}(\delta-\delta/q),
𝕍ar[Yn]=(V+δ)q+δ2q2+(1q)n(2δ2n(q1)qV+δq)(1q)2nδ2q2.\mathbb{V}{\rm ar}[Y_{n}]=\frac{\left(V+\delta\right)q+\delta^{2}}{q^{2}}+(1-q)^{n}\left(2\,\frac{\delta^{2}n}{(q-1)q}-\frac{V+\delta}{q}\right)-(1-q)^{2n}\frac{\delta^{2}}{q^{2}}.

For Moran walks (i.e., P(u)=puP(u)=pu and p=1qp=1-q), the mean and the variance simplify to

𝔼[Yn]=pq(1pn) and 𝕍ar[Yn]=pq2(1pn(pn+1+(1+2n)q)).\mathbb{E}[Y_{n}]=\frac{p}{q}\Big{(}1-p^{n}\Big{)}\text{\quad and \quad}\mathbb{V}{\rm ar}[Y_{n}]=\frac{p}{q^{2}}\Big{(}1-p^{n}\big{(}p^{n+1}+(1+2n)q\big{)}\Big{)}.
Proof 2.2.

The probability generating function can be written as

F(z,u)=n0(kn(Yn=k)uk)zn=n0fn(u)zn,F(z,u)=\sum_{n\geq 0}\left(\sum_{k\in\mathbb{Z}}^{n}\mathbb{P}(Y_{n}=k)u^{k}\right)z^{n}=\sum_{n\geq 0}f_{n}(u)z^{n},\vspace{-1mm}

where the fn(u)f_{n}(u)’s are Laurent polynomials encoding the location of the walk at time nn; thus we have fn+1(u)=P(u)fn(u)+qfn(1)f_{n+1}(u)=P(u)f_{n}(u)+qf_{n}(1), with f0(u)=1f_{0}(u)=1. Multiplying both sides of this recurrence by zn+1z^{n+1}, and summing over nn, one gets

F(z,u)(1zP(u))=1+qzF(z,1).F(z,u)(1-zP(u))=1+qzF(z,1).

As F(z,1)=1/(1z)F(z,1)=1/(1-z), one obtains Formula (2). Note that the generating function can also be obtained by using a regular expression encoding these walks (by factorizing the walk in factors ending by a reset): (𝒮q)(𝒮)({\mathcal{S}}^{*}q)^{*}(\mathcal{S})^{*}, which translates to

F(z,u)=11qz11zP(1)11zP(u),F(z,u)=\frac{1}{1-qz\frac{1}{1-zP(1)}}\frac{1}{1-zP(u)},

where the occurrences of P(1)P(1) and P(u)P(u) reflect that only the altitudes after the last reset contribute to the final altitude of the full walk. Using P(1)=1qP(1)=1-q, we get Formula (2).

The mean of YnY_{n} is then obtained via μn:=𝔼[Yn]=[zn]uF(z,1)\mu_{n}:=\mathbb{E}[Y_{n}]=[z^{n}]\partial_{u}F(z,1), while its variance is obtained via a second-order derivative: 𝕍ar[Yn]=[zn]u2F(z,1)+μnμn2\mathbb{V}{\rm ar}[Y_{n}]=[z^{n}]\partial_{u}^{2}F(z,1)+\mu_{n}-\mu_{n}^{2}.

We can now establish the corresponding limit distribution.

Theorem 2.3 (Final altitude: asymptotics).

Consider walks with 0𝒮0\not\in\mathcal{S}, gcd𝒮=1\gcd\mathcal{S}=1, and d=max𝒮>0{d=\max\mathcal{S}>0} (these three constraints bring no loss of generality222 There is no loss of generality. Indeed, if the walk as a periodic support (i.e., if gcd(𝒮)=g\gcd(\mathcal{S})=g with g>1g>1) we rescale (without loss of generality) the step set 𝒮\mathcal{S} by dividing each step by gg. Now, if max𝒮<0\max\mathcal{S}<0, then we multiply each step by 1-1. Last, if 0𝒮0\in\mathcal{S} we consider instead the equivalent model 𝒮:=𝒮{0}\mathcal{S}:=\mathcal{S}\setminus\{0\} and q:=q+p0q:=q+p_{0}.). Therefore the support of the walk is either \mathbb{Z} (with all altitudes being reachable), or \mathbb{N} (with a finite set of altitudes impossible to reach, known as the unreachable set in the coin-exchange problem of Frobenius). The final altitude of these walks with resets behaves asymptotically according to these two cases.

  • a)

    For walks with min𝒮0\min{\mathcal{S}}\geq 0, we have for kk\in\mathbb{N} (not in the Frobenius unreachable set):

    q(mini𝒮pi)klimn(Yn=k)q(maxi𝒮pi)k/d.q\cdot(\min_{i\in\mathcal{S}}p_{i})^{k}\leq\lim_{n}\mathbb{P}(Y_{n}=k)\leq q\cdot(\max_{i\in\mathcal{S}}p_{i})^{k/d}. (4)

    In particular, for Moran walks, we have (Yn=k)=qpk\mathbb{P}(Y_{n}=k)=qp^{k} for 0k<n0\leq k<n and (Yn=n)=pn\mathbb{P}(Y_{n}=n)=p^{n} so limYn=Geom(q)1\lim Y_{n}=\operatorname{Geom}(q)-1.

  • b)

    For walks with min𝒮<0\min\mathcal{S}<0 and max𝒮>0\max\mathcal{S}>0, we have for kk\in\mathbb{Z}:

    (Yn=k)=qWk(1q)+(1q)1τk+112πnP′′(τ)+O(1n).\mathbb{P}(Y_{n}=k)=qW_{k}(1-q)+(1-q)\frac{1}{\tau^{k+1}}\frac{1}{\sqrt{2\pi nP^{\prime\prime}(\tau)}}+O\left(\frac{1}{n}\right).

Moreover, both in Case a) and in Case b), (Yn=k)\mathbb{P}(Y_{n}=k) has a geometric decay for large kk.

Proof 2.4.

In Case a), we have min𝒮1\min\mathcal{S}\geq 1; the definition of P(u)P(u) in (1) then entails [uk]P(u)j=0[u^{k}]P(u)^{j}=0 for large jj. The limit of Equation (3) thus gives

limn+(Yn=k)=q[uk]j=0kP(u)j.\lim_{n\rightarrow+\infty}\mathbb{P}(Y_{n}=k)=q[u^{k}]\sum_{j=0}^{k}P(u)^{j}. (5)

In particular, when it is not 0, this quantity is lower bounded by q(mini𝒮pi)kq\cdot(\min_{i\in\mathcal{S}}p_{i})^{k} and upper bounded by q(maxi𝒮pi)k/dq\cdot(\max_{i\in\mathcal{S}}p_{i})^{k/d}, and therefore decreases geometrically.

In Case b), the proof is more complicated and will recycle ingredients of the asymptotics of walks without reset. To this aim, first set P~(u):=P(u)/P(1)\widetilde{P}(u):=P(u)/P(1), i.e., the step setprobabilities are renormalized to have global mass P~(1)=1\widetilde{P}(1)=1. Let Wk(z)W_{k}(z) be the probability generating function of walks without reset, i.e., Wk(z)=[uk]11zP~(u)=n0wn,kznW_{k}(z)=[u^{k}]\frac{1}{1\scalebox{0.85}[1.11]{$\,-\,$}z\widetilde{P}(u)}=\sum_{n\geq 0}w_{n,k}z^{n}. We then rewrite Equation (3) as

(Yn=k)\displaystyle\mathbb{P}(Y_{n}=k) =P(1)n[uk]P~(u)n+q[uk]j=0n1P(1)jP~(u)j\displaystyle=P(1)^{n}[u^{k}]\widetilde{P}(u)^{n}+q[u^{k}]\sum_{j=0}^{n-1}P(1)^{j}\widetilde{P}(u)^{j}
=(1q)P(1)nwn,k+qj=0nP(1)jwj,k\displaystyle=(1-q)P(1)^{n}w_{n,k}+q\sum_{j=0}^{n}P(1)^{j}w_{j,k}
=(1q)P(1)nwn,k+qP(1)n[zn]11z/P(1)Wk(z).\displaystyle=(1-q)P(1)^{n}w_{n,k}+qP(1)^{n}[z^{n}]\frac{1}{1-z/P(1)}W_{k}(z). (6)

If min𝒮<0\min{\mathcal{S}}<0 and max𝒮>0\max{\mathcal{S}}>0, then there is a unique real τ>0\tau>0 such that P~(τ)=0\widetilde{P}^{\prime}(\tau)=0. It is proven in [5] that ρ=1/P~(τ)\rho=1/\widetilde{P}(\tau) is the radius of convergence of Wk(z)W_{k}(z) and that wn,kτkCP~(τ)n/2πnw_{n,k}\sim\tau^{-k}C\widetilde{P}(\tau)^{n}/\sqrt{2\pi n}, where C:=1τP~(τ)/P~′′(τ)C:=\frac{1}{\tau}\sqrt{\widetilde{P}(\tau)/\widetilde{P}^{\prime\prime}(\tau)}.

Note that, as we have a probability generating function, we have ρ=P~(τ)=1\rho=\widetilde{P}(\tau)=1. The asymptotics of (6) then follows by singularity analysis, as 1/(1z/P(1))1/(1-z/P(1)) is singular at z=P(1)=1qz=P(1)=1-q, that is, before Wk(z)W_{k}(z) which is singular at z=1z=1:

(Yn=k)=qWk(1q)+(1q)τkCP(τ)n2πn+O(1n).\mathbb{P}(Y_{n}=k)=qW_{k}(1-q)+(1-q)\tau^{-k}C\frac{P(\tau)^{n}}{\sqrt{2\pi n}}+O\left(\frac{1}{n}\right). (7)

Note that Formulas (10) and (11) in [5, Theorem 1] give a closed form for Wk(z)W_{k}(z). It implies in particular

0<Wk(1q)<(1q)(c+d)C1/C2|k|+1,0<W_{k}(1-q)<(1-q)(c+d)C_{1}/C_{2}^{|k|+1}, (8)

where C1>0C_{1}>0 and C2>1C_{2}>1 are constants independent of kk; thus Wk(1q)W_{k}(1-q) decays geometrically for k±k\rightarrow\pm\infty. This concludes our analysis of Case b) and gives the theorem.

These limiting behaviors are thus in sharp contrast with the asymptotic behavior of the final altitude of walks on \mathbb{Z} with no resets, which is δn±O(n)\delta n\pm O(\sqrt{n}), with fluctuations given by a continuous distribution (Rayleigh or Gaussian; see [5]).

2.2. The height HnH_{n}

In order to study the height of these walks with resets, one considers the subset of them made of walks conditioned to have a height smaller than hh. We want to obtain an explicit formula for their generating function

Fh(z,u):=n=0+𝔼(uYn1I{Y1h,Y2h,,Ynh})zn.F^{\leq h}(z,u):=\sum_{n=0}^{+\infty}\mathbb{E}\Big{(}u^{Y_{n}}{{\rm 1\!I}}_{\{Y_{1}\leq h,Y_{2}\leq h,\dots,Y_{n}\leq h\}}\Big{)}z^{n}.

If these walks are generated by a step set 𝒮\mathcal{S} having only positive jumps, a natural but naive approach to enumerate them would be to create a deterministic finite automaton (a finite discrete Markov chain) with hh states encoding the possible altitudes of the process. It leads to a system of linear equations which would allow us to get the corresponding rational generating function. However, this approach to obtain the generating function (given hh and the transition probabilities) suffers from three drawbacks:

  • it would be of complexity h3h^{3} (computing determinants of h×hh\times h matrices),

  • it would be a case-by-case approach (new computations are needed for each hh),

  • it would fail if the step set 𝒮\mathcal{S} has some negative steps (then the support of the walkis [,+h][-\infty,+h], and thus one would need an automaton with an infinite number of states).

So, we prefer here to use a more efficient approach, which relies on a powerful method (namely, the kernel method [7]): the complexity to obtain a closed-form formula for Fh(z,u)F^{\leq h}(z,u) then drops333The PhD thesis of Louis Dumont [17] compares the cost of different methods to compute the coefficients of such generating functions (which can be related to diagonals of rational functions); the full analysis has to take into account the space and time complexities, and some precomputation steps, of cost of course higher than O(1)O(1), but in all cases it is more efficient than a Markov chain approach (see however Bacher [3] for a clever use of a transfer matrix point of view). from O(h3)O(h^{3}) to O(1)O(1) for any finite step set 𝒮\mathcal{S}\subset\mathbb{Z} ! This leads to the following theorem.

Theorem 2.5.

Let Fh(z,u)F^{\leq h}(z,u) be the probability generating function of walks on \mathbb{Z} of height h\leq h with resets, where the length and the final altitude of the walks are respectively encoded by the exponents of zz and uu. Let P(u)P(u) encode the allowed jumps as in (1). One has

Fh(z,u)\displaystyle F^{\leq h}(z,u) =n=0+𝔼(uYn1I{Y1h,Y2h,,Ynh})zn=Wh(z,u)1zqWh(z,1),\displaystyle=\sum_{n=0}^{+\infty}\mathbb{E}\Big{(}u^{Y_{n}}{{\rm 1\!I}}_{\{Y_{1}\leq h,Y_{2}\leq h,\dots,Y_{n}\leq h\}}\Big{)}z^{n}=\frac{W^{\leq h}(z,u)}{1-zqW^{\leq h}(z,1)}, (9)

where

Wh(z,u)\displaystyle W^{\leq h}(z,u) :=1i=1d(uui)h+11jd,jiujuujui1zP(u)\displaystyle:=\frac{\displaystyle{1-\sum_{i=1}^{d}\left(\frac{u}{u_{i}}\right)^{h+1}\prod_{1\leq j\leq d,j\neq i}\frac{u_{j}-u}{u_{j}-u_{i}}}}{\displaystyle{1-zP(u)}} (10)

is the generating function of walks of height h\leq h without reset, and where u1,,udu_{1},\dots,u_{d} are the roots of 1zP(u)=01-zP(u)=0 such that limz0|ui(z)|=+\lim_{z\rightarrow 0}|u_{i}(z)|=+\infty.

Remark 2.6 (A rational simplification).

These generating functions are algebraic, as they rationally depends on the roots ui(z)u_{i}(z), which are themselves algebraic functions. Now, when the step set 𝒮\mathcal{S} has only positive steps, WhW^{\leq h} is a polynomial and FhF^{\leq h} simplifies to a rational function (despite the fact that their closed forms (10) and (9) involve algebraic functions!). This simplification can be seen either by the automaton point of view and the Kleene theorem, or by using the Vieta formulas on Newton sums (as, when one has only positive jumps, the uiu_{i}’s are then all the roots of the kernel 1zP(u)1-zP(u)). For example, for P(u)=u/3+u2/2P(u)=u/3+u^{2}/2 and h=3h=3, we have

u1(z)=z+z2+18z3z and u2(z)=zz2+18z3zu_{1}(z)=\frac{-z+\sqrt{z^{2}+18z}}{3z}\text{ \qquad and \qquad}u_{2}(z)=\frac{-z-\sqrt{z^{2}+18z}}{3z} (11)

(the Vieta formulas are here: u1(z)+u2(z)=2/3u_{1}(z)+u_{2}(z)=-2/3 and u1(z)u2(z)=2/zu_{1}(z)u_{2}(z)=-2/z); then, the quotient (9) involving these algebraic functions u1u_{1} and u2u_{2} simplifies, leading to

W3(z,u)\displaystyle W^{\leq 3}(z,u) =11zP(u)(1(uu1(z))4u2(z)uu2(z)u1(z)(uu2(z))4u1(z)uu1(z)u2(z))\displaystyle=\frac{1}{1-zP(u)}\left(1-\left(\frac{u}{u_{1}(z)}\right)^{4}\frac{u_{2}(z)-u}{u_{2}(z)-u_{1}(z)}-\left(\frac{u}{u_{2}(z)}\right)^{4}\frac{u_{1}(z)-u}{u_{1}(z)-u_{2}(z)}\right)
=1+z(u22+u3)+z2(u33+u29)+z3u327,\displaystyle=1+z\left({\frac{u^{2}}{2}}+{\frac{u}{3}}\right)+z^{2}\left({\frac{u^{3}}{3}}+{\frac{u^{2}}{9}}\right)+\frac{z^{3}u^{3}}{27},
F3(z,u)\displaystyle F^{\leq 3}(z,u) =(1+z(u22+u3)+z2(u33+u29)+z3u327)1zq(1+5z6+4z29+z327).\displaystyle=\frac{\left(1+z\left({\frac{u^{2}}{2}}+{\frac{u}{3}}\right)+z^{2}\left({\frac{u^{3}}{3}}+{\frac{u^{2}}{9}}\right)+\frac{z^{3}u^{3}}{27}\right)}{1-zq\left(1+{\frac{5z}{6}}+{\frac{4{z}^{2}}{9}}+{\frac{z^{3}}{27}}\right)}.
Proof 2.7 (Proof of Theorem 2.5).

The probability generating function can be written as

Fh(z,u)=n0fnh(u)zn=k=0hFkh(z)uk,F^{\leq h}(z,u)=\sum_{n\geq 0}f_{n}^{\leq h}(u)z^{n}=\sum_{k=0}^{h}F^{\leq h}_{k}(z)u^{k},

where fnh(u)f_{n}^{\leq h}(u) encodes the possible values of YnY_{n} (constrained to be bounded by hh over the full process), and where

Fkh(z)=n=0+fn,khzn=n=0+(Y1h,Y2h,,Yn1h,Yn=kh)znF^{\leq h}_{k}(z)=\sum_{n=0}^{+\infty}f_{n,k}^{\leq h}z^{n}=\sum_{n=0}^{+\infty}\mathbb{P}\Big{(}Y_{1}\leq h,\,Y_{2}\leq h,\dots,Y_{n-1}\leq h,\,Y_{n}=k\leq h\Big{)}z^{n}

is the probability generating function of bounded walks ending at altitude kk.

The dynamics of the process then entails the recurrence

fn+1h(u)=P(u)fnh(u){u>h}P(u)fn,hhuh+qfnh(1),f_{n+1}^{\leq h}(u)=P(u)f_{n}^{\leq h}(u)-\{u^{>h}\}P(u)f_{n,h}^{\leq h}u^{h}+qf_{n}^{\leq h}(1),

where {u>h}\{u^{>h}\} extracts monomials having a degree in uu strictly larger than hh. This mimics that at time n+1n+1, either, with probability pkp_{k}, we increase by kk the altitude of where we were at time nn (that is, we multiply by uku^{k}, and this is allowed as long as the walk stays at some altitude h\leq h, thus we removed here the cases corresponding to the walks which would reach an altitude >h>h at time n+1n+1); or, with probability qq, we have a reset to altitude 0 (i.e., all the mass of the walks at any altitude kk, corresponding to the coefficient of uku^{k}, is sent back to u0u^{0}; this is thus captured by the substitution u=1u=1).

This directly translates to the functional equation

Fh(z,u)=1+zP(u)Fh(z,u)k=0d1Fhkh(z)uhk(zj=k+1dpjuj)+zqFh(z,1).F^{\leq h}(z,u)=1+zP(u)F^{\leq h}(z,u)-\sum_{k=0}^{d-1}F_{h-k}^{\leq h}(z)u^{h-k}\left(z\sum_{j=k+1}^{d}p_{j}u^{j}\right)+zqF^{\leq h}(z,1).

Setting q=0q=0, we get the functional equation for the generating function WhW^{\leq h} of walks of height h\leq h without reset:

Wh(z,u)=1+zP(u)Wh(z,u)k=0d1Whkh(z)uhk(zj=k+1dpjuj).W^{\leq h}(z,u)=1+zP(u)W^{\leq h}(z,u)-\sum_{k=0}^{d-1}W_{h-k}^{\leq h}(z)u^{h-k}\left(z\sum_{j=k+1}^{d}p_{j}u^{j}\right). (12)

Of course, the factorization of walks with resets into (𝒮q)(𝒮)({\mathcal{S}}^{*}q)^{*}(\mathcal{S})^{*} entails Fh(z,u)=Seq(Wh(z,1)q)Wh(z,u)F^{\leq h}(z,u)=\operatorname{Seq}(W^{\leq h}(z,1)q)W^{\leq h}(z,u), which is Formula (9). So if we find a closed form for WhW^{\leq h}, we are happy as this also solves the initial problem for FhF^{\leq h}. Now, on the right-hand side of (12), the sum for kk from 0 to d1d-1 is a polynomial in uu, which we conveniently rewrite as

Wh(z,u)(1zP(u))=1uhk=1dGk(z)uk.W^{\leq h}(z,u)(1-zP(u))=1-u^{h}\sum_{k=1}^{d}G_{k}(z)u^{k}. (13)

It is possible to solve such an equation via the kernel method: the kernel is the factor 1zP(u)1-zP(u) in (13), and if one considers the equation on the variety defined by 1zP(u)=01-zP(u)=0, this brings additional equations which will allow us to get a closed form for Wh(z,u)W^{\leq h}(z,u). First, observe that this kernel is a (Laurent) polynomial in uu of “positive” degree dd. Then, from an analysis of its Newton polygon, one gets that it has dd roots u1(z),,ud(z)u_{1}(z),\dots,u_{d}(z) such that ui(z)z1/du_{i}(z)\approx z^{-1/d} for z0+z\sim 0^{+} (the other roots being convergent at z0+z\sim 0^{+}; see [5] for more on this issue). Thus, setting u=ui(z)u=u_{i}(z) (for i=1,,di=1,\dots,d) in the functional equation (13) gives dd new equations. Some care is required in this step: we have to check that one does not create series involving an infinite number of monomials with negative exponents444Let RR be the ring of series nanzn\sum_{n\in\mathbb{Z}}a_{n}z^{n}. The Cauchy product of two series in RR is well defined only with some additional convergence conditions, and, even if we restrict ourselves to series for which the product is well defined, we have to take care to the fact that they do not form an integral ring: indeed, we have many divisors of zero (e.g. for S(z):=nZznS(z):=\sum_{n\in Z}z^{n}, we have zS=SzS=S and thus (z1)S=0(z-1)S=0). Most algebraic manipulations in this ring, if they are temporarily handling quantities which are not in the subring of power series (or Laurent/Puiseux/Fourier series), would lead to invalid identities in [[z]]{\mathbb{C}}[[z]]..

In fact, in our case, the substitution u=uiu=u_{i} is legitimate as Wh(z,ui)W^{\leq h}(z,u_{i}) becomes a well-defined Puiseux series in zz: this follows from the fact that the coefficients fnh(u)f_{n}^{\leq h}(u) are (Laurent) polynomials with “positive” degree bounded by hh (and “negative” degree lower bounded by cn-cn), so fnh(ui(z))f_{n}^{\leq h}(u_{i}(z)) is a Puiseux series with exponents from h/d-h/d to ++\infty. Then, multiplying by znz^{n} and summing over nn, only a finite number of summands contribute to each monomial of Wh(z,ui)W^{\leq h}(z,u_{i}), which is thus well defined. Via these substitutions u=uiu=u_{i}, we obtain a linear system of dd equations (which only contains the GkG_{k}’s as unknowns). Then, by Cramer’s rule, we get Gk=det(Vk)/det(V)G_{k}=\det(V_{k})/\det(V), where

V=(u1h+1u1h+2u1h+du2h+1u2h+2u2h+dudh+1udh+2udh+d) and Vk=(u1h+1u1h+k11u1h+k+1u1h+du2h+1u2h+k11u2h+k+1u2h+dudh+1udh+k11udh+k+1udh+d),V=\begin{pmatrix}u_{1}^{h+1}&u_{1}^{h+2}&\dots&{u_{1}}^{h+d}\\ u_{2}^{h+1}&u_{2}^{h+2}&\dots&{u_{2}}^{h+d}\\ \vdots&\vdots&\vdots&&\vdots\\ u_{d}^{h+1}&u_{d}^{h+2}&\dots&{u_{d}}^{h+d}\end{pmatrix}\text{\quad and \quad}V_{k}=\begin{pmatrix}u_{1}^{h+1}&\dots&u_{1}^{h+k-1}&1&u_{1}^{h+k+1}&\dots&{u_{1}}^{h+d}\\ u_{2}^{h+1}&\dots&u_{2}^{h+k-1}&1&u_{2}^{h+k+1}&\dots&{u_{2}}^{h+d}\\ \vdots&\vdots&\vdots&&\vdots\\ u_{d}^{h+1}&\dots&u_{d}^{h+k-1}&1&u_{d}^{h+k+1}&\dots&{u_{d}}^{h+d}\end{pmatrix},

that is, VkV_{k} is the matrix VV with its kk-th column entries replaced by 11. Thus, as VV is a Vandermonde matrix, its determinant is

det(V)=(i=1duih+1)1i<jd(ujui).\det(V)=\left(\prod_{i=1}^{d}u_{i}^{h+1}\right)\prod_{1\leq i<j\leq d}(u_{j}-u_{i}). (14)

Now, to compute det(Vk)\det(V_{k}), one first proves that

Δ=det(u11u1k11u1k+1u1du21u2k11u2k+1u2dud1udk11udk+1udd)=edk(u1,,ud)1i<jd(ujui),\displaystyle\Delta=\det\begin{pmatrix}u_{1}^{1}&\dots&u_{1}^{k-1}&1&u_{1}^{k+1}&\dots&{u_{1}}^{d}\\ u_{2}^{1}&\dots&u_{2}^{k-1}&1&u_{2}^{k+1}&\dots&{u_{2}}^{d}\\ \vdots&\vdots&\vdots&&\vdots\\ u_{d}^{1}&\dots&u_{d}^{k-1}&1&u_{d}^{k+1}&\dots&{u_{d}}^{d}\\ \end{pmatrix}=e_{d-k}(u_{1},\dots,u_{d})\prod_{1\leq i<j\leq d}(u_{j}-u_{i}), (15)

where we used the classical notation for the elementary symmetric polynomials:

ek(x1,,xd):=[tk]i=1d(1+txi),e_{k}(x_{1},\dots,x_{d}):=[t^{k}]\prod_{i=1}^{d}(1+tx_{i}), (16)

e.g., e3(x1,,x5)=x1x2x3+x1x2x4+x1x2x5+x1x3x4+x1x3x5+x1x4x5+x2x3x4+x2x3x5+x2x4x5+x3x4x5e_{3}(x_{1},\dots,x_{5})=x_{1}x_{2}x_{3}+x_{1}x_{2}x_{4}+x_{1}x_{2}x_{5}+x_{1}x_{3}x_{4}+x_{1}x_{3}x_{5}+x_{1}x_{4}x_{5}+x_{2}x_{3}x_{4}+x_{2}x_{3}x_{5}+x_{2}x_{4}x_{5}+x_{3}x_{4}x_{5}. Formula (15) follows from 2 facts:

  • If ui=uju_{i}=u_{j}, then two rows of VkV_{k} are equal and thus the determinant is 0; this explains the Vandermonde product Π:=1i<jd(ujui)\Pi:=\prod_{1\leq i<j\leq d}(u_{j}-u_{i}) on the right-hand side of Formula (15).

  • Now writing the determinant as a sum over the d!d! permutations of the entries gives a sum of monomials, each of total degree (1+2++d)k(1+2+...+d)-k in the uiu_{i}’s. Π\Pi being of total degree (d2)=d(d1)/2\binom{d}{2}=d(d-1)/2, it implies that Δ/Π\Delta/\Pi is a polynomial which is symmetric and homogeneous of total degree dkd-k. Up to a constant factor (determined to be 1, by comparing any monomial), this polynomial has to be edke_{d-k}, which captures exactly the missing uiu_{i}’s in each of the d!d! summands.

Then, performing a Laplace expansion of det(Vk)\det(V_{k}) on its kk-th column and using Formula (15), one gets (after simplification in the Cramer formula):

Gk(z)==1duh1(1)k+dedk(u1,,ud)|u=01jdj1uuj.G_{k}(z)=\sum_{\ell=1}^{d}u_{\ell}^{-h-1}(-1)^{k+d}e_{d-k}(u_{1},\dots,u_{d})_{|u_{\ell}=0}\prod_{\begin{subarray}{c}1\leq j\leq d\\ j\neq\ell\end{subarray}}\frac{1}{u_{\ell}-u_{j}}. (17)

Now, using k=0d(1)dkedk(u1,,ud)uk=i=1d(uui)\sum_{k=0}^{d}(-1)^{d-k}e_{d-k}(u_{1},\dots,u_{d})u^{k}=\prod_{i=1}^{d}(u-u_{i}) (which is equivalent to the definition (16)), and regrouping the powers ukh1u_{k}^{-h-1}, we get

k=1dGk(z)uk1=k=1dukh11jd,jkujuujuk.\sum_{k=1}^{d}G_{k}(z)u^{k-1}=\sum_{k=1}^{d}u_{k}^{-h-1}\prod_{1\leq j\leq d,j\neq k}\frac{u_{j}-u}{u_{j}-u_{k}}. (18)

Combining Equations (18) and (12), we get Formula (10) for Wh(z,u)W^{\leq h}(z,u), and thus the closed form for Fh(z,u)F^{\leq h}(z,u).

Remark 2.8 (Link with Lagrange interpolation).

As we know the evaluation of the right-hand side of (13) in each of the uku_{k}, Formula (18) is also equivalent to the Lagrange interpolation formula (which we thus reproved en passant). Moreover, this Lagrange interpolation approach offers a nice advantage: it is circumventing the fact that the factorization argument used to get the closed forms for the generating functions in [12, 5] works only if the walks start at altitude 0.

Now, if we go back to Moran walks (i.e., for P(u)=puP(u)=pu; see Figure 1), the generating function simplifies to the following noteworthy shape.

Corollary 2.9.

The probability generating function of Moran walks of height h\leq h is

Fh(z,u)=(1pz)(1(pzu)h+1)(1puz)(1z+(pz)h+1zq),\displaystyle F^{\leq h}(z,u)=\frac{(1-pz)(1-(pzu)^{h+1})}{(1-puz)(1-z+(pz)^{h+1}zq)}, (19)

where, in the power series, the length and the final altitude of the walks are respectively encoded by the exponents of zz and uu. Accordingly,

(Hnh)\displaystyle\mathbb{P}(H_{n}\leq h) =[zn]Fh(z,1)=[zn]1(pz)h+11z+(pz)h+1zq\displaystyle=[z^{n}]F^{\leq h}(z,1)=[z^{n}]\frac{1-(pz)^{h+1}}{1-z+(pz)^{h+1}zq} (20)
=k=0nh+1(qph+1)k((nk(h+1)k)ph+1(n(k+1)(h+1)k)),\displaystyle=\sum_{k=0}^{\left\lfloor\frac{n}{h+1}\right\rfloor}(-qp^{h+1})^{k}\left(\binom{n-k(h+1)}{k}-p^{h+1}\binom{n-(k+1)(h+1)}{k}\right),\qquad (21)

with the convention that (mk)=0\binom{m}{k}=0 if m<0m<0.

Proof 2.10.

The closed form (21) is obtained via the power series expansion 1/(1T)=Tj1/(1-T)=\sum T^{j} by applying the binomial theorem to each term TjT^{j}, with T=z+(pz)h+1zqT=z+(pz)^{h+1}zq.

The binomial sum (21) generalizes a formula obtained (for p=1/2p=1/2) by Pippenger in [45]. Therein, it is derived by an inclusion-exclusion principle (guided by the combinatorics of the carry propagation in binary words); for his problem, the generating function, and thus the corresponding binomial sum, are a little bit simpler than (20) and (21), and are then used to perform some real analysis for the asymptotics of the expected length.

In our case, equipped with this explicit expression for the probability generating function of Moran walks of bounded height, we can now tackle the question of the asymptotic distribution of this extremal parameter.

3. Asymptotic height of Moran walks

In this section, we establish a local limit law for the distribution of the height of Moran walks. One noteworthy consequence of the generating function explicit formula that we get in the previous section is that it allows us to have very efficient computations and simulations of the process at time nn, for large nn, as stressed by the following remark.

Remark 3.1 (Fast computation scheme for any given nn and hh).

One does not need to run the process for nn steps to have the exact distribution of HnH_{n}. Indeed, using the rational generating function from Corollary 2.9, for any pp, hh, and nn, it is possible to get the exact value of (Hn=h)=[zn](Fh(z,1)Fh1(z,1))\mathbb{P}\left(H_{n}=h\right)=[z^{n}]\left(F^{\leq h}(z,1)-F^{\leq h-1}(z,1)\right) in time O(ln(n))O(\ln(n)) via binary exponentiation.

This allows us to plot the distribution HnH_{n}, for quite large values of nn (as an example, see Figure 3). Note that for our other generating functions, which are algebraic, there exists a fast algorithm of cost nln(n)\sqrt{n}\ln(n) to compute their nn-th coefficient (this algorithm works more generally for all D-finite functions). This algorithm due to the brothers Chudnovsky is e.g. implemented in the Maple computer algebra system via the package Gfun; see [49]

Refer to caption
Refer to caption
Figure 3. The distribution of HnH_{n}, for n=225n=2^{25} (for p=1/2p=1/2 on the left and p=1/4p=1/4 on the right). One observes a sharp concentration around the height 2525 for p=1/2p=1/2 and 12.512.5 for p=1/4p=1/4, suggesting a logarithmic link in base 1/p1/p between nn and HnH_{n}. We prove and refine this claim in the next pages.
(Hn=h)\mathbb{P}(H_{n}=h)(for p=12p=\frac{1}{2})(Hn=h)\mathbb{P}(H_{n}=h)(for p=14p=\frac{1}{4})

3.1. Localization of the dominant singularity

As Fh(z,1)F^{\leq h}(z,1) (as given by Equation (19)) is a rational function, all its singularities are poles. The asymptotic behavior of the coefficients of Fh(z,1)F^{\leq h}(z,1) is governed by the closest pole(s) to zero (also called “dominant singularities” of FhF^{\leq h}). A natural candidate for being such a dominant singularity of Fh(z,1)F^{\leq h}(z,1) would be z=1/pz=1/p, but it is in fact a removable singularity, as one has (e.g. via L’Hôpital’s rule) Fh(1/p,1)=p(h+1)2p1qhF^{\leq h}(1/p,1)=\frac{p(h+1)}{2p-1-qh}. Thus, we can focus on the other roots of the denominator D(z)D(z) of Fh(z,1)F^{\leq h}(z,1).

Lemma 3.2 (Localization of the singularities of FhF^{\leq h}).

For p(0,1)p\in(0,1), the h+2h+2 roots z1(h),,zh+2(h)z_{1}(h),\dots,z_{h+2}(h) of D(z)=1z+qph+1zh+2D(z)=1-z+qp^{h+1}z^{h+2} are such that we have for hh large enough:

  • (i)

    z1(h)z_{1}(h) is the unique root strictly between 1 and 1/p1/p;

  • (ii)

    z2(h)=1/pz_{2}(h)=1/p is the unique root of modulus 1/p1/p;

  • (iii)

    the remaining hh roots z3(h),,zh+2(h)z_{3}(h),\dots,z_{h+2}(h) are all of modulus >1/p>1/p, and arbitrarily close (in modulus) to 1/p1/p (for h+h\rightarrow+\infty);

  • (iv)

    all the roots are simple.

Accordingly, z1(h)z_{1}(h) is the dominant singularity of Fh(z,1)F^{\leq h}(z,1).

Proof 3.3.

Let z(h)z_{*}(h) be the unique positive zero of D(z)=1+(h+2)qph+1zh+1D^{\prime}(z)=-1+(h+2)qp^{h+1}z^{h+1} given by

z(h)=1p(1q(h+2))1h+1.z_{*}(h)=\frac{1}{p}\left(\frac{1}{q(h+2)}\right)^{\frac{1}{h+1}}.

As z(h)z_{*}(h) tends to 1p\frac{1}{p} from the left, we thus have 0<z(h)<1/p0<z_{*}(h)<1/p for hh large enough. Moreover, D(z)D(z) is decreasing for all zz in the interval [0,z(h)][0,z_{*}(h)] and increasing in the interval [z(h),+][z_{*}(h),+\infty]. As D(1/p)=0D(1/p)=0, one thus has D(z(h))<0D(z_{*}(h))<0. And since D(1)>0D(1)>0, the intermediate value theorem implies the existence of (at least) one zero of DD between 11 and z(h)z_{*}(h). Combined with the (non)decreasing properties of DD, this entails the unicity of this zero; let us call it z1(h)z_{1}(h). Then, Pringsheim’s theorem (see e.g. [22]) asserts that FhF^{\leq h} has a real positive dominant singularity which is thus z1(h)z_{1}(h), the first real positive zero of DD. As Fh(z)F^{\leq h}(z) is a probability generating function, all its singularities are of modulus 1{\geq 1}. So we have 1<z1(h)<z(h)<1/p{1<z_{1}(h)<z_{*}(h)<1/p} and thus proved (i).

We now prove (ii). The fact that z2(h)=1/pz_{2}(h)=1/p is a root follows from 11/p+q/p=01-1/p+q/p=0. Is there any other root of the same modulus? If z=exp(iθ)/pz=\exp(i\theta)/p (with θ[0,2π]\theta\in[0,2\pi]) would be a root of D(z)D(z), then this would imply p=exp(iθ)qexp(i(h+2)θ)p=\exp(i\theta)-q\exp(i(h+2)\theta). By the reverse triangle inequality ||x||y|||xy|\Big{|}|x|-|y|\Big{|}\leq|x-y| (with equality only if xy=0xy=0 or x/y+x/y\in\mathbb{R}^{+}), this would entail θ=0\theta=0.

To prove (iii), we use the following version of Rouché’s theorem: if |Dg|<|g||D-g|<|g| on the boundary of a disk 𝒟\mathcal{D}, then DD and gg have the same number of roots inside 𝒟\mathcal{D}. We can apply this theorem to DD with g(z):=1zg(z):=1-z, for the disk 𝒟(0,1ϵp){\mathcal{D}}(0,\frac{1-\epsilon}{p}): on its boundary, one indeed has |D(z)g(z)|=qp|pz|h+2qp|1ϵ|h+2<qp|1ϵ|2/q<qϵp|g(z)||D(z)-g(z)|=\frac{q}{p}|pz|^{h+2}\leq\frac{q}{p}|1-\epsilon|^{h+2}<\frac{q}{p}|1-\epsilon|^{2/q}<\frac{q-\epsilon}{p}\leq|g(z)|, where the first strict inequality holds for h2/qh\geq 2/q and the next strict inequality holds for any small enough ϵ\epsilon (independently of hh), as we have then ln(1ϵ/q)ln(1ϵ)<2/q\frac{\ln(1-\epsilon/q)}{\ln(1-\epsilon)}<2/q. As the constraint on hh is independent of ϵ\epsilon, letting ϵ0\epsilon\rightarrow 0, we infer that DD has only one root strictly inside 𝒟(0,1p){\mathcal{D}}(0,\frac{1}{p}).

Now we can also apply this theorem to DD with g(z):=1+zh+2g(z):=1+z^{h+2}: on the boundary of the disk 𝒟(0,1+ϵp){\mathcal{D}}(0,\frac{1+\epsilon}{p}), one indeed has, for hh large enough (depending on ϵ\epsilon),

|D(z)g(z)|(1+εp)h+2(1qph+1)+1+εp<(1+εp)h+21|g(z)|,|D(z)-g(z)|\leq\left(\frac{1+\varepsilon}{p}\right)^{h+2}\left(1-qp^{h+1}\right)+\frac{1+\varepsilon}{p}<\left(\frac{1+\varepsilon}{p}\right)^{h+2}-1\leq|g(z)|,

where the last 1-1 is just a crude bound of the term qp(1+ε)h+2+1+εp-\frac{q}{p}(1+\varepsilon)^{h+2}+\frac{1+\varepsilon}{p} which converges to -\infty for h+h\rightarrow+\infty. So DD, like gg, has h+2h+2 roots inside this disk.
To prove (iv), note that the equation D(z)=D(z)=0D(z)=D^{\prime}(z)=0 is forcing z=1+1h+1z=1+\frac{1}{h+1}, but D(1+1h+1)1D^{\prime}(1+\frac{1}{h+1})\rightarrow-1 for h+h\rightarrow+\infty, therefore all the zeros are simple for hh large enough.

See Figure 5 on page 5 for an illustration of the location of the roots.

3.2. Limit distribution of the height: the discrete Gumbel distribution

The height distribution exhibits some a priori surprising asymptotic aspects, having a flavor of number theory/Diophantine approximation. Such phenomena, however, appear for a few other probabilistic processes where some statistics could have different asymptotic behaviors depending on some resonance between lnp\ln p and lnq\ln q (see e.g. Janson [36] or Flajolet, Vallée, and Roux [21] for some examples related to tries or binary search trees). In our case, it appears that a resonance between lnp\ln p and lnn\ln n plays a role.

Theorem 3.4 (Distribution of the height of Moran walks).

We have

(Hnh)=exp(qnph+1)(1+O((lnn)3n)),\mathbb{P}\left(H_{n}\leq h\right)=\exp\left(-qnp^{h+1}\right)\left(1+O\left(\frac{(\ln n)^{3}}{n}\right)\right), (22)

where the error term is uniform for h[0,n]h\in[0,n]. Accordingly, (Hn=h)\mathbb{P}(H_{n}=h) is unimodal, with a peak at h=h(n)h=h^{*}(n), the closest integer to c(n)ln(n)ln(1/p)c^{*}(n)\frac{\ln(n)}{\ln(1/p)}, where c(n):=1ln(ln(1/p)/q2)ln(n)c^{*}(n):=1-\frac{\ln(\ln(1/p)/q^{2})}{\ln(n)}, and we have

(Hn=h(n))pp/qp1/q.\mathbb{P}(H_{n}=h^{*}(n))\sim p^{p/q}-p^{1/q}. (23)

Moreover, the mass is sharply concentrated around lnnln(1/p)\frac{\ln n}{\ln(1/p)}, as better seen by the following result, with a uniform error term in kk:

(Hnlnnln(1/p)+k)=exp(qα(n)pk+1)(1+O((lnn)3n)),\mathbb{P}\left(H_{n}\leq\left\lfloor\frac{\ln n}{\ln(1/p)}\right\rfloor+k\right)=\exp\left(-q\alpha(n)p^{k+1}\right)\left(1+O\left(\frac{(\ln n)^{3}}{n}\right)\right),

with α(n):=p{lnnlnp}\alpha(n):=p^{-\{\frac{\ln n}{-\ln p}\}} (where {x}\{x\} stands for the fractional part of xx, and where x\lfloor x\rfloor stands for the floor function of xx). [See Figure 3 on page 3 for an illustration of the distribution of HnH_{n} and Figure 4 for the behavior of the function α(n)\alpha(n).]

Refer to caption
Figure 4. Plot of the function α(n)=p{lnnlnp}\alpha(n)=p^{-\{\frac{\ln n}{-\ln p}\}} (for p=1/2p=1/2), which occurs in the fluctuations of the height of Moran walks (as stated in Theorem 3.4). The function α(n)\alpha(n) is taking values in [1,1/p)[1,1/p) for integers n1n\geq 1. It has a sawtooth wave shape, with frequencies getting larger and larger (with peaks at powers of 1/p1/p).
Proof 3.5.

In the sequel, as the context is explicit, we simply denote by z1,,zh+2z_{1},\dots,z_{h+2} the zeros z1(h),,zh+2(h)z_{1}(h),\dots,z_{h+2}(h) of D(z)=1z+qph+1zh+2D(z)=1-z+qp^{h+1}z^{h+2}. From Lemma 3.2, for hh large enough, all these zeros ziz_{i} are simple; the partial fraction decomposition of 1/D1/D is then

1D(z)=i=1h+21D(zi)(zzi)\frac{1}{D(z)}=\sum_{i=1}^{h+2}\frac{1}{D^{\prime}(z_{i})\left(z-z_{i}\right)}

and as D(zi)=1+(h+2)(zi1)/ziD^{\prime}(z_{i})=-1+(h+2)(z_{i}-1)/{z_{i}}, one thus gets

Fh(z,1)\displaystyle F^{\leq h}(z,1) =1(pz)h+1D(z)=i=1h+21(pz)h+1D(zi)(zzi)\displaystyle=\frac{1-(pz)^{h+1}}{D(z)}=\sum_{i=1}^{h+2}\frac{1-(pz)^{h+1}}{D^{\prime}(z_{i})\left(z-z_{i}\right)}
=i=1h+2(1zi(zi1)(h+2)(n=0+zinzn)ph+1zi(zi1)(h+2)n=h+1+zin+h+1zn)\displaystyle=\sum_{i=1}^{h+2}\left(\frac{1}{z_{i}\scalebox{0.85}[1.11]{$\,-\,$}\left(z_{i}\scalebox{0.85}[1.11]{$\,-\,$}1\right)(h\scalebox{0.95}{$\,+\,$}2)}\left(\sum_{n=0}^{+\infty}z_{i}^{-n}z^{n}\right)\scalebox{0.85}[1.11]{$\,-\,$}\frac{p^{h+1}}{z_{i}\scalebox{0.85}[1.11]{$\,-\,$}\left(z_{i}\scalebox{0.85}[1.11]{$\,-\,$}1\right)(h\scalebox{0.95}{$\,+\,$}2)}\sum_{n=h+1}^{+\infty}z_{i}^{-n+h+1}z^{n}\right)
=i=1h+2(1zi(zi1)(h+2)(n=0hzinzn)+1(pzi)h+1zi(zi1)(h+2)n=h+1+zinzn).\displaystyle=\sum_{i=1}^{h+2}\left(\frac{1}{z_{i}\scalebox{0.85}[1.11]{$\,-\,$}\left(z_{i}\scalebox{0.85}[1.11]{$\,-\,$}1\right)(h\scalebox{0.95}{$\,+\,$}2)}\left(\sum_{n=0}^{h}z_{i}^{-n}z^{n}\right)+\frac{1\scalebox{0.85}[1.11]{$\,-\,$}(pz_{i})^{h+1}}{z_{i}\scalebox{0.85}[1.11]{$\,-\,$}\left(z_{i}\scalebox{0.85}[1.11]{$\,-\,$}1\right)(h\scalebox{0.95}{$\,+\,$}2)}\sum_{n=h+1}^{+\infty}z_{i}^{-n}z^{n}\right).

It is combinatorially obvious that (Hnh)=1\mathbb{P}\left(H_{n}\leq h\right)=1 for all nhn\leq h. So we now focus on n>hn>h, for which we have, as (pzi)h+1=zi1qzi(pz_{i})^{h+1}=\frac{z_{i}-1}{qz_{i}} and 1zi1qzi=1pziqzi1-\frac{z_{i}-1}{qz_{i}}=\frac{1-pz_{i}}{qz_{i}}:

(Hnh)=[zn]Fh(z,1)\displaystyle\mathbb{P}\left(H_{n}\leq h\right)=[z^{n}]F^{\leq h}(z,1) =i=1h+21(pzi)h+1zi(zi1)(h+2)zin\displaystyle=\sum_{i=1}^{h+2}\frac{1-(pz_{i})^{h+1}}{z_{i}-\left(z_{i}-1\right)(h+2)}z_{i}^{-n}
=i=1h+21pziq(1+(1zi)(h+1))zin1\displaystyle=\sum_{i=1}^{h+2}\frac{1-pz_{i}}{q\left(1+\left(1-z_{i}\right)(h+1)\right)}z_{i}^{-n-1}
=Z1(n,h)+O(hMpn+1),\displaystyle=Z_{1}(n,h)+O\left(hMp^{n+1}\right), (24)

where M=maxi=3,,h+2|1pziq(1+(1zi)(h+1))|=O(1)M=\max_{i=3,\dots,h+2}\left|\frac{1-pz_{i}}{q\left(1+\left(1-z_{i}\right)(h+1)\right)}\right|=O(1) (note that the summand involving z2=1/pz_{2}=1/p cancels), and where Z1(n,h):=1pz1q[1+(1z1)(h+1)]z1n1Z_{1}(n,h):=\frac{1-pz_{1}}{q\left[1+\left(1-z_{1}\right)(h+1)\right]}z_{1}^{-n-1} is the contribution coming from the pole z1z_{1}.

Refer to caption
Figure 5. The roots of D(z)=1z+qph+1zh+2D(z)=1-z+qp^{h+1}z^{h+2} (here, with p=1/3p=1/3 and h=51h=51). For large hh, D(z)D(z) has one dominant root z1z_{1} just after 11, one root at z=1/pz=1/p, and the other roots have a slightly larger modulus, all asymptotically close to the circle |z|=1/p|z|=1/p; see Lemma 3.2.

Set z1:=1+εhz_{1}:=1+\varepsilon_{h}. Then D(z1)=1(1+εh)+qph+1(1+εh)h+2=0D(z_{1})=1-(1+\varepsilon_{h})+qp^{h+1}\left(1+\varepsilon_{h}\right)^{h+2}=0, thus this implies εh=qph+1(1+εh)h+2\varepsilon_{h}=qp^{h+1}\left(1+\varepsilon_{h}\right)^{h+2}; therefore we have z1=1+εh=1+qph+1+O(hp2h)z_{1}=1+\varepsilon_{h}=1+qp^{h+1}+O(hp^{2h}). Now, for h=h(n)h=h(n) tending to ++\infty, this entails that the contribution Z1(n,h)Z_{1}(n,h) of the pole z1z_{1} (as given by Equation (24)) satisfies

Z1(n,h)\displaystyle\!\!\!\!\!\!\!\!\!\!\!\!\!\!Z_{1}(n,h) =1ph+2+O(hp2h)1(h+1)qph+1+O(h2p2h)(1+εh)n1\displaystyle=\frac{1-p^{h+2}+O\left(hp^{2h}\right)}{1-(h+1)qp^{h+1}+O(h^{2}p^{2h})}(1+\varepsilon_{h})^{-n-1} (25)
=(1+q(h+1)ph+1ph+2+O(h2p2h))exp((n+1)ln(11+εh))\displaystyle=\Big{(}1+q(h+1)p^{h+1}-p^{h+2}+O(h^{2}p^{2h})\Big{)}\exp\left((n+1)\ln\left(\frac{1}{1+\varepsilon_{h}}\right)\right) (26)
=(1+q(h+1)ph+1ph+2+O(h2p2h))exp((n+1)εh+Θ((n+1)εh2)).\displaystyle=\left(1+q(h+1)p^{h+1}-p^{h+2}+O(h^{2}p^{2h})\right)\exp\left(-(n+1)\varepsilon_{h}+\Theta((n+1)\varepsilon_{h}^{2})\right)\!. (27)

Observe that

 if h=cln(n)ln(1/p)+cln(ln(n))ln(1/p) then ph=1ncln(n)c.\text{ if \qquad}h=c\frac{\ln(n)}{\ln(1/p)}+c^{\prime}\frac{\ln(\ln(n))}{\ln(1/p)}\text{ \qquad then \qquad}p^{h}=\frac{1}{n^{c}\ln(n)^{c^{\prime}}}. (28)

(Here and in the sequel we always consider c>1/2c>1/2 and c0c^{\prime}\geq 0. In fact, c>0c^{\prime}>0 is not needed right now, but this will be required for the asymptotics of the mean of HnH_{n} in Section 4.)

For such values of hh, the asymptotics of the first factor in Equation (27) is

1+q(h+1)ph+1ph+2+O(h2p2h)\displaystyle 1+q(h+1)p^{h+1}-p^{h+2}+O(h^{2}p^{2h}) =1+O(1ncln(n)c1),\displaystyle=1+O\left(\frac{1}{n^{c}\ln(n)^{c^{\prime}-1}}\right), (29)

and the asymptotics of the second factor in Equation (27) is

exp((n+1)εh+O((n+1)εh2))=exp(nqph+1+O(nhp2h)εh+Θ(n12c/ln(n)2c))\displaystyle\exp\left(-(n+1)\varepsilon_{h}+O((n+1)\varepsilon_{h}^{2})\right)=\exp\left(-nqp^{h+1}+O(nhp^{2h})-\varepsilon_{h}+\Theta(n^{1-2c}/\ln(n)^{2c^{\prime}})\right) (30)
=exp(nqph+1)(1+O(n12cln(n)12c)O(ncln(n)c)+Θ(n12c/ln(n)2c))).\displaystyle\qquad=\exp\left(-nqp^{h+1}\right)\left(1+O(n^{1-2c}\ln(n)^{1-2c^{\prime}})-O(n^{-c}\ln(n)^{-c^{\prime}})+\Theta(n^{1-2c}/\ln(n)^{2c^{\prime}}))\right). (31)

In this expansion, one now has to check which error term dominates. It is the big-oh term with ncn^{-c} if c>1c>1 and the big-oh with n12cn^{1-2c} if c1c\leq 1. Multiplying with the asymptotic expansion from Equation (29) and using the approximation (24), we get the following result (in which we simplified the ln\ln part of the error term in a non-optimal way which will be enough for our purpose):

(Hnh)=exp(nqph+1)(1+O(lnnnmin(c,2c1))).\mathbb{P}\left(H_{n}\leq h\right)=\exp\left(-nqp^{h+1}\right)\left(1+O\left(\frac{\ln n}{n^{\min(c,2c-1)}}\right)\right). (32)

Moreover, this approximation holds for all h[0,n]h\in[0,n]: first, for h12ln(n)/ln(1/p)h\ll\frac{1}{2}\ln(n)/\ln(1/p) this follows from the fact that (Hnh)\mathbb{P}\left(H_{n}\leq h\right) is increasing with respect to hh, and then for hcln(n)h\gg c\ln(n) this follows from the bound (50) hereafter.

In conclusion, for h=lnnln(1/p)+kh=\left\lfloor\frac{\ln n}{\ln(1/p)}\right\rfloor+k, for any kk such that h[c1ln(n)ln(1/p),c2ln(n)ln(1/p)]h\in\left[c_{1}\frac{\ln(n)}{\ln(1/p)},c_{2}\frac{\ln(n)}{\ln(1/p)}\right] (with 1/2<c1<c21/2<c_{1}<c_{2}), we have uniformly in kk (when n+n\to+\infty):

(Hnh)\displaystyle\mathbb{P}\left(H_{n}\leq h\right) =exp(nqplnnln(1/p)+k+1)(1+O((lnn)3n))\displaystyle=\exp\left(-nqp^{\left\lfloor\frac{\ln n}{\ln(1/p)}\right\rfloor+k+1}\right)\left(1+O\left(\frac{(\ln n)^{3}}{n}\right)\right)
=exp(qp{lnnlnp}+k+1)(1+O((lnn)3n)),\displaystyle=\exp\left(-qp^{-\{\frac{\ln n}{-\ln p}\}+k+1}\right)\left(1+O\left(\frac{(\ln n)^{3}}{n}\right)\right),

and we get Theorem 3.4 by setting α(n):=p{lnnlnp}\alpha(n):=p^{-\{\frac{\ln n}{-\ln p}\}}.

If p=q=1/2p=q=1/2, we have α(n)=2{lg(n)}\alpha(n)=2^{\{\operatorname{lg}(n)\}} (where the symbol lg\operatorname{lg} stands for the binary logarithm, lg(x)=log2(x)\operatorname{lg}(x)=\log_{2}(x)). This subcase of particular interest corresponds to a problem initially considered in 1946 by Burks, Goldstine, and von Neumann [13]: the study of carry propagation in computer binary arithmetic; it constitutes one of the first analyses of the cost of an algorithm! They gave crude bounds which were deeply improved by Knuth in 1978 [38]. This problem can also be seen as runs in binary words, and, as such, is analyzed by Flajolet and Sedgewick [22, Theorem V.1]. Therein, the analysis unfortunately contains a few typos which affect some of the error terms. Our proofs are incidentally fixing this issue.

These extremal parameters (runs, longest carry) are archetypal examples of problems leading to a Gumbel distribution (or a discrete version of it). This distribution indeed often appears in combinatorics as the distribution of parameters encoding a maximal value: e.g., maximum of i.i.d. geometric distributions [51], longest repetition of a pattern in lattice paths [46], runs in integer compositions [23], carry propagation in signed digit representations [30], largest part in some integer compositions, longest chain of nodes with a given arity in trees, maximum degree in some families of trees [47], the maximum protection number in simply generated trees [31]. For some of these examples, it was proven only for some specific families of structures, but there is no doubt that it holds generically. A general framework leading to such double exponential laws is given by Gourdon [26, Theorem 4] for the largest component in supercritical composition schemes (see also Bender and Gao [10]). We refer to Figure 6 for an illustration of some of these parameters.

Longest up run
in Dyck paths
Refer to caption
Longest chain of unary nodes
Refer to caption
Largest part in
integer compositions:
100=11+1+11+9100=11+1+11+9
+39+14+15\qquad+{\color[rgb]{1,0,0}39}+14+15.
Longest plateau
in Motzkin paths
Refer to caption
Maximal protection
number in trees
Refer to caption
Longest run in
integer compositions:
20=1+4+4+1\!\!\!\!20=1+4+4+1
+3+3+3+1\qquad\ \,+{\color[rgb]{1,0,0}3+3+3}+1.
Figure 6. Many combinatorial structures have some parameters which asymptotically follow a discrete Gumbel distribution.

The Gumbel distribution is also called the “double exponential distribution”, or the “type-I generalized extreme value distribution”, and can also be expressed as a subcase of the Fisher–Tippett distribution. Let us give a formal definition.

Definition 3.6 (Gumbel distribution).

A continuous random variable XX with support [,+][-\infty,+\infty] follows a Gumbel distribution (of parameters μ\mu and β\beta), denoted by Gumbel(μ,β)\operatorname{Gumbel}(\mu,\beta), if

(Xx)=exp(exp(xμβ)).\mathbb{P}(X\leq x)=\exp\left(-\exp\left(-\frac{x-\mu}{\beta}\right)\right).

Its mean satisfies 𝔼[X]=μ+γβ\mathbb{E}[X]=\mu+\gamma\beta (where γ=0.5772\gamma=0.5772\dots is Euler’s constant) and its variance satisfies 𝕍ar[X]=π26β2\mathbb{V}{\rm ar}[X]=\frac{\pi^{2}}{6}\beta^{2}. It is unimodal with a peak at x=μx=\mu and its median is at x=μβln(ln(2))x=\mu-\beta\ln(\ln(2)).

Definition 3.7 (Discrete Gumbel distribution).

A discrete random variable YY follows a discrete Gumbel distribution of parameters μ\mu and β\beta, which we denote Gumbel(μ,β)\operatorname{Gumbel}(\mu,\beta)555With a slight abuse of notation, we use the same notation Gumbel(μ,β)\operatorname{Gumbel}(\mu,\beta) for both the continuous distribution and the discrete distribution, adding the right adjective if needed to remove any ambiguity., if

(Yh)=exp(exp(hμβ)), for all h.\mathbb{P}(Y\leq h)=\exp\left(-\exp\left(-\frac{h-\mu}{\beta}\right)\right),\text{\qquad for all $h\in\mathbb{Z}$}. (33)

In particular, one can always write Y=XY=\lceil X\rceil, where XX follows a continuous Gumbel(μ,β)\operatorname{Gumbel}(\mu,\beta); note on the other side that X\lfloor X\rfloor follows a discrete Gumbel(μ1,β)\operatorname{Gumbel}(\mu-1,\beta).

To obtain a nice formula for the mean and variance of a discrete Gumbel distribution remains an open problem: for example, for Y=dGumbel(0,1)Y\stackrel{{\scriptstyle d}}{{=}}\operatorname{Gumbel}(0,1), we have

𝔼[Y]=h=h(exp(exp(h))exp(exp(h+1))=1.077240905953631072609\mathbb{E}[Y]=\sum_{h=-\infty}^{\infty}h\left(\exp(-\exp(-h))-\exp(-\exp(-h+1)\right)=1.077240905953631072609\dots

(and it takes 5 seconds to get thousands of digits, as the terms decrease doubly exponentially fast), but will anybody find a closed form for this mysterious constant? Some insight on the variance of the discrete distribution YY can be obtained from the continuous distribution XX via the following trivial but useful bounds which hold more generally as soon as |XY|<1|X-Y|<1:

|𝔼[Y]𝔼[X]|<1 and |𝕍ar[Y]𝕍ar[X]|<2+4|𝔼[X]|.\left|\mathbb{E}[Y]-\mathbb{E}[X]\right|<1\text{\qquad and \qquad}\left|\mathbb{V}{\rm ar}[Y]-\mathbb{V}{\rm ar}[X]\right|<2+4|\mathbb{E}[X]|. (34)

We can now restate our previous theorem in terms of this discrete Gumbel distribution.

Corollary 3.8 (Gumbel limit law).

The sequence of random variables Hnln(pqn)ln(1/p)\lceil H_{n}-\frac{\ln(pqn)}{\ln(1/p)}\rceil converges for n+n\rightarrow+\infty (in distribution and in moments) to the discrete Gumbel(0,β)\operatorname{Gumbel}(0,\beta) distribution with β=1ln(1/p)\beta=\frac{1}{\ln(1/p)}. Accordingly, it implies that

𝔼[Hn]ln(pqn)ln(1/p)+γβ+an error smaller than 1,\mathbb{E}[H_{n}]\sim\frac{\ln(pqn)}{\ln(1/p)}+\gamma\beta+\text{an error smaller than $1$},
𝕍ar[Hn]π26ln(p)2+ an error smaller than 2+4γβ.\mathbb{V}{\rm ar}[H_{n}]\sim\frac{\pi^{2}}{6\ln(p)^{2}}+\text{ an error smaller than $2+4\gamma\beta$}.
Proof 3.9.

Consider the sequence of random variables Yn:=HnμnY_{n}:=\lceil H_{n}-\mu_{n}\rceil. Then, the change of variable hh+μnh\mapsto h+\mu_{n} in Equation (22), with μn=ln(pqn)ln(1/p)\mu_{n}=\frac{\ln(pqn)}{\ln(1/p)} allows us to match Y:=limnYnY:=\lim_{n}Y_{n} (where the limit is in distribution) with the discrete Gumbel defined in (33), for μ=0\mu=0 and β=1ln(1/p)\beta=\frac{1}{\ln(1/p)}. Due to the exponentially small uniform error term in (22) on the support [0,n][0,n] of HnH_{n}, we have a convergence in moments of YnY_{n} to YY. Then, the asymptotics of the moments follow by applying the bounds (34) on the link between the mean/variance of the discrete and continuous Gumbel distribution.

These moment asymptotics already constitute a notable result (falling as a good ripe fruit!), but a very interesting phenomenon is hidden in these imprecise errors terms: some bodacious fluctuations, that we fully describe in Section 4.

3.3. Waiting time

Let us end this section with an application to a natural statistic: the waiting time τh\tau_{h}, i.e., the number of steps spent by the random walk when it reaches a given altitude hh for the first time. There is an intimate relationship between height and waiting time (stated more formally in Equation (37) hereafter); it is thus natural that they have enumerative and asymptotic formulas of a similar nature, as better shown by the following corollary.

Corollary 3.10.

The waiting time τh\tau_{h} for reaching height hh satisfies

(τh=n)\displaystyle\mathbb{P}(\tau_{h}=n) =[zn](1pz)phzh1z+qph1zh.\displaystyle=[z^{n}]\frac{(1-pz)p^{h}z^{h}}{1-z+qp^{h-1}z^{h}}. (35)

The distribution function of τh\tau_{h} satisfies

(τhn)=1exp(qα(n)2nph)+O((lnn)3n).\mathbb{P}(\tau_{h}\leq n)=1-\exp\left(-q\alpha(n)^{2}np^{h}\right)+O\left(\frac{(\ln n)^{3}}{n}\right). (36)
Proof 3.11.

Consider a walk reaching for the first time altitude hh at time nn. Cut it after each reset. It gives a sequence of factors of length khk\leq h, followed by a last factor with hh up steps. This translates into the combinatorial formula

(τh=n)\displaystyle\mathbb{P}(\tau_{h}=n) =[zn]phzh1k=1h1pk1qzk,\displaystyle=[z^{n}]\frac{p^{h}z^{h}}{1-\sum_{k=1}^{h-1}p^{k-1}qz^{k}},

which simplifies to Formula (35). Now, for the distribution function, instead of redoing a full analysis based on a partial fraction decomposition of this generating function, it is more convenient to use the relation

(τh=n)=(Hn=h and Hn1<h),\displaystyle\mathbb{P}(\tau_{h}=n)=\mathbb{P}(H_{n}=h\text{ and }H_{n-1}<h), (37)

thus this waiting time also satisfies

(τhn)=(Hnh)=1(Hnh1).\displaystyle\mathbb{P}(\tau_{h}\leq n)=\mathbb{P}(H_{n}\geq h)=1-\mathbb{P}(H_{n}\leq h-1). (38)

Then, using Theorem 3.4, we also have

(Hnh1)\displaystyle\mathbb{P}(H_{n}\leq h-1) =(Hnlnnln(1/p)+h1lnnln(1/p))\displaystyle=\mathbb{P}\left(H_{n}\leq\left\lfloor\frac{\ln n}{\ln(1/p)}\right\rfloor+h-1-\left\lfloor\frac{\ln n}{\ln(1/p)}\right\rfloor\right)
=exp(qα(n)phlnnln(1/p))+O((lnn)3n)\displaystyle=\exp\left(-q\alpha(n)p^{h-\left\lfloor\frac{\ln n}{\ln(1/p)}\right\rfloor}\right)+O\left(\frac{(\ln n)^{3}}{n}\right)
=exp(qα(n)2ph+lnnlnp)+O((lnn)3n).\displaystyle=\exp\left(-q\alpha(n)^{2}p^{h+\frac{\ln n}{\ln p}}\right)+O\left(\frac{(\ln n)^{3}}{n}\right).

Via Formula (38) linking the waiting time τh\tau_{h} and the height HnH_{n}, this entails (36).

We now turn to a finer analysis of the mean and variance of HnH_{n}.

4. Mean and variance of the height

4.1. Fundamental properties of the Mellin transform

In order to get a fine estimation of the average height, we use a Mellin transform, which, as we shall see, is the key tool to handle the corresponding asymptotics. We now present the needed definitions and formulas. We refer e.g. to Flajolet, Gourdon, and Dumas [19] or to the book Analytic Combinatorics [22, Appendix B.7] for more on the Mellin transform and numerous applications to asymptotics of harmonic sums, digital sums, and divide-and-conquer recurrences.

Definition 4.1 (Mellin transform).

Let f(t)f(t) be a continuous function defined on the positive real axis 0<t<+0<t<+\infty. The Mellin transform ff^{*} of ff is the function defined by

f(s):=0+f(t)ts1𝑑t.f^{*}(s):=\int_{0}^{+\infty}f(t)t^{s-1}dt.

This integral exists only for ss such that the function f(t)ts1f(t)t^{s-1} is integrable on (0,+)\left(0,\;+\infty\right). Thus, if there exist two real numbers aa and bb, such that a>ba>b and

f(t)={O(ta), if t0O(tb), if t+,f(t)=\begin{cases}O(t^{a}),&\mbox{ if }t\to 0\\ O(t^{b}),&\mbox{ if }t\to+\infty\end{cases}, (39)

then the function ff^{*} is well defined for any complex number ss with real part such that a<(s)<b-a<\Re(s)<-b; this domain is called the fundamental strip of ff^{*}. Moreover, for all cc in this domain, if f(s)f^{*}(s) converges uniformly to 0 for s=c±is=c\pm i\infty, then the function ff can be expressed for t(0,+)t\in(0,+\infty) as the following inverse Mellin transform:

f(t)=12iπcic+if(s)ts𝑑s.f(t)=\frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty}f^{*}(s)t^{-s}ds. (40)

As an example, let us consider the gamma function, which illustrates well the role of the fundamental strip (and this example will also play a role in the next pages).

Example 4.2 (The gamma function as a Mellin transform).

The gamma function satisfies

Γ(s)\displaystyle\Gamma(s) =0+exp(t)ts1𝑑t (for 0<(s)<+),\displaystyle=\int_{0}^{+\infty}\exp(-t)t^{s-1}dt\text{\qquad(for $0<\Re(s)<+\infty$)}, (41)
Γ(s)\displaystyle\Gamma(s) =0+(1exp(t))ts1𝑑t (for 1<(s)<0).\displaystyle=\int_{0}^{+\infty}\left(1-\exp(-t)\right)t^{s-1}dt\text{\qquad(for $-1<\Re(s)<0$)}. (42)

An important consequence of Formula (40) is that, if ff is a meromorphic function on \mathbb{C}, and if limc+cic+if(s)ts𝑑s=0\lim_{c\rightarrow+\infty}\int_{c-i\infty}^{c+i\infty}f^{*}(s)t^{-s}ds=0, then one can push the integration contour of Formula (40) to the right (taking limc+\lim_{c\rightarrow+\infty}) and one then collects in passing the contributions from the residue at each pole sks_{k} to the right of the fundamental strip. Now, for t>0t>0 and aa\!\in\!{\mathbb{C}}, multiplying ts=ta0ln(t)(as)/!t^{-s}=t^{-a}\sum_{\ell\geq 0}\ln(t)^{\ell}(a-s)^{\ell}/\ell! by the Laurent series of f(s)f^{*}(s) at s=sks\!=\!s_{k}, we see that Res[f(s)ts,sk]\operatorname{Res}[f^{*}(s)t^{-s},s_{k}] can be expressed666The notation Res[g(s),sk]\operatorname{Res}[g(s),s_{k}] stands for the residue of g(s)g(s) at s=sks=s_{k}. as a sum of order(sk)\operatorname{order}(s_{k}) terms, and one gets

f(t)=\displaystyle f(t)= sk pole of f(s)ts(sk)bRes[f(s)ts,sk]\displaystyle\sum_{\begin{subarray}{c}\text{$s_{k}$ pole of $f^{*}(s)t^{-s}$}\\ \text{$\Re(s_{k})\geq-b$}\end{subarray}}\operatorname{Res}[f^{*}(s)t^{-s},s_{k}] (43)
=\displaystyle= sk pole of f(sk)bj=1order(sk)Res[(ssk)j1f(s),sk]tsk(1)j(j1)!(lnt)j1.\displaystyle\sum_{\begin{subarray}{c}\text{$s_{k}$ pole of $f^{*}$}\\ \text{$\Re(s_{k})\geq-b$}\end{subarray}}\sum_{j=1}^{\operatorname{order}(s_{k})}\operatorname{Res}[(s-s_{k})^{j-1}f^{*}(s),s_{k}]\ t^{-s_{k}}\frac{(-1)^{j}}{(j-1)!}\,(\ln t)^{j-1}. (44)

4.2. Average height of Moran walks

We now state the main result of this section.

Theorem 4.3 (Average height).

The average height of Moran walks of length nn is given by

𝔼[Hn]\displaystyle\mathbb{E}[H_{n}] =lnnln(1/p)γlnp12lnqlnp+Q(ln(qn))lnp+O((lnn)4n),\displaystyle=\frac{\ln n}{\ln(1/p)}-\frac{\gamma}{\ln p}-\frac{1}{2}-\frac{\ln q}{\ln p}+\frac{Q(\ln(qn))}{\ln p}+O\left(\frac{(\ln n)^{4}}{n}\right), (45)

where γ=.57721\gamma=.57721\dots is Euler’s constant, and where QQ is an oscillating function (a Fourier series of period ln(1/p)\ln(1/p)) given by

Q(x)\displaystyle Q(x) :=k{0}Γ(sk)exp(skx) where sk:=2ikπlnp.\displaystyle:=\sum_{k\in\mathbb{Z}\setminus\{0\}}\Gamma(s_{k})\exp(-s_{k}x)\text{\quad where $s_{k}:=\frac{2ik\pi}{\ln p}$}. (46)
Remark 4.4 (Fourier series representation).

The fact that QQ is a Fourier series of period ln(1/p)\ln(1/p) and is real for xx\in\mathbb{R} is better seen via the alternative equivalent expression

Q(x)\displaystyle Q(x) =2k1((Γ(sk))cos(2kπxln(p))+(Γ(sk))sin(2kπxln(p))),\displaystyle=2\sum_{k\geq 1}\left(\Re(\Gamma(s_{k}))\cos\left(\frac{2k\pi x}{\ln(p)}\right)+\Im(\Gamma(s_{k}))\sin\left(\frac{2k\pi x}{\ln(p)}\right)\right),

where \Re and \Im stands for the real and imaginary parts. This is illustrated in Figure 7.

Remark 4.5 (Fourier series differentiability).

Such asymptotics involving fluctuations dictated by a Fourier series are typical of results obtained via Mellin transforms. They often appear in the asymptotic cost of divide-and-conquer algorithms, or of expressions involving digital sums, harmonic sums, or finite differences (see the work of de Bruijn, Knuth, and Rice [15, 38], or Flajolet, Gourdon, and Dumas [19]). It is sometimes also possible to get them via some real analysis (like Pippenger did [45]), or like in the seminal work of Delange [16] on the sum of digits. Note that the Delange series is nowhere differentiable, while our Fourier series is infinitely differentiable, as proven in Theorem 4.15.

Refer to caption
Q(x)Q(x) (for p=1/2p=1/2)
Refer to caption
  Q(ln(px))Q(\ln(px)) (for p=1/2p=1/2)
Figure 7. The height of Moran walks involves asymptotic fluctuations encoded by a Fourier series Q(x)Q(x), of period ln(1/p)\ln(1/p), and weak amplitude. More precisely, it involves Q(ln(px))Q(\ln(px)) which thus oscillates an infinite number of times for x0+x\rightarrow 0^{+}, and these oscillations get larger and larger for x+x\rightarrow+\infty. Moreover, QQ oscillates faster when pp tends to 11. We shall encounter later another Fourier series, R(x)R(x), which shares all these properties.
Proof 4.6 (Proof of Theorem 4.3).

The proof exploits the fact that the mean 𝔼[Hn]\mathbb{E}[H_{n}] asymptotically behaves like h=0+(1exp(nqph+1))\sum_{h=0}^{+\infty}\big{(}1-\exp(-nqp^{h+1})\big{)}; this is proven by rewriting 𝔼[Hn]\mathbb{E}[H_{n}] as follows:

𝔼[Hn]=h=0n(1(Hnh))=Σ0+Σ1+Σ2+Σ3Σ4+Σ,\mathbb{E}[H_{n}]=\sum_{h=0}^{n}\left(1-\mathbb{P}\left(H_{n}\leq h\right)\right)=\Sigma_{0}+\Sigma_{1}+\Sigma_{2}+\Sigma_{3}-\Sigma_{4}+\Sigma_{\infty}, (47)

with

Σ0\displaystyle\Sigma_{0} :=0h<h1(exp(nqph+1)(Hnh)),\displaystyle:=\sum_{0\leq h<h_{1}}\left(\exp(-nqp^{h+1})-\mathbb{P}\left(H_{n}\leq h\right)\right),
Σ1\displaystyle\Sigma_{1} :=h1h<h2(exp(nqph+1)(Hnh)),\displaystyle:=\sum_{h_{1}\leq h<h_{2}}\left(\exp(-nqp^{h+1})-\mathbb{P}\left(H_{n}\leq h\right)\right),
Σ2\displaystyle\Sigma_{2} :=h2h<h3(exp(nqph+1)(Hnh)),\displaystyle:=\sum_{h_{2}\leq h<h_{3}}\left(\exp(-nqp^{h+1})-\mathbb{P}\left(H_{n}\leq h\right)\right),
Σ3\displaystyle\Sigma_{3} :=h3hn(1(Hnh)),\displaystyle:=\sum_{h_{3}\leq h\leq n}\big{(}1-\mathbb{P}\left(H_{n}\leq h\right)\big{)},
Σ4\displaystyle\Sigma_{4} :=h=h3+(1exp(nqph+1)),\displaystyle:=\sum_{h=h_{3}}^{+\infty}\big{(}1-\exp(-nqp^{h+1})\big{)},
Σ\displaystyle\Sigma_{\infty} :=h=0+(1exp(nqph+1)).\displaystyle:=\sum_{h=0}^{+\infty}\big{(}1-\exp(-nqp^{h+1})\big{)}.

The key is to prove that, for some h1h_{1}, h2h_{2}, and h3h_{3} adequately chosen, the sums Σ0,Σ1,Σ2\Sigma_{0},\Sigma_{1},\Sigma_{2}, Σ3\Sigma_{3}, and Σ4\Sigma_{4} are asymptotically negligible, while the main contribution to 𝔼[Hn]\mathbb{E}[H_{n}] comes from the last sum (namely, Σ\Sigma_{\infty}), which we will evaluate via a Mellin transform approach.

The reader not enjoying delta-epsilon proofs could have the feeling that “cutting epsilons into 5 parts” like above is a little bit discouraging but this is the price to pay to get the O((ln(n)4/n)O((\ln(n)^{4}/n) error term in Formula (45). In fact, in Equation (47) for 𝔼[Hn]\mathbb{E}[H_{n}], it is possible to cut the sum into only 4 parts, but then this would lead to a final weaker O(1/n)O(1/\sqrt{n}) error term.

So let’s be brave and begin with Σ0\Sigma_{0}. Here, for the range 0h<h10\leq h<h_{1}, with h1:=34ln(n)ln(1/p)h_{1}:=\frac{3}{4}\frac{\ln(n)}{\ln(1/p)},
we get

|Σ0|\displaystyle|\Sigma_{0}| h1×(max0h<h1(exp(nqph+1)+max0h<h1(Hnh)))\displaystyle\leq h_{1}\times\left(\max_{0\leq h<h_{1}}\left(\exp(-nqp^{h+1})+\max_{0\leq h<h_{1}}\mathbb{P}\left(H_{n}\leq h\right)\right)\right)
=h1×(exp(nqph1+1)+(Hnh1))\displaystyle=h_{1}\times\left(\exp(-nqp^{h_{1}+1})+\mathbb{P}\left(H_{n}\leq h_{1}\right)\right)
=h1×(2exp(qpn1/4)+O((lnn)3n))\displaystyle=h_{1}\times\left(2\exp(-qpn^{1/4})+O\left(\frac{(\ln n)^{3}}{n}\right)\right)
=O((lnn)4n),\displaystyle=O\left(\frac{(\ln n)^{4}}{n}\right),

where, for the second line we used that the sequences are increasing with respect to hh, and for the third line we used Formula (28) for php^{h} and the approximation of Theorem 3.4. Note that this bound for |Σ0||\Sigma_{0}| also implies the uniform bound

(Hnh)=O((lnn)4n)(for h<h1).\mathbb{P}(H_{n}\leq h)=O\left(\frac{(\ln n)^{4}}{n}\right)\qquad\text{(for $h<h_{1}$)}. (48)

Now, for Σ1\Sigma_{1}, in the range h1h<h2h_{1}\leq h<h_{2}, with h2:=ln(n)ln(1/p)+ln(ln(n))ln(1/p)h_{2}:=\frac{\ln(n)}{\ln(1/p)}+\frac{\ln(\ln(n))}{\ln(1/p)}, we rewrite hh as h:=(1t)h1+th2h:=(1-t)h_{1}+th_{2}. Such values of hh correspond to using c=(t+3)/4c=(t+3)/4 and c=tc^{\prime}=t in the Formula (28) for php^{h}.

Via the exponential bound on HnH_{n} from Formula (32), we get

|Σ1|\displaystyle|\Sigma_{1}| (h2h1)×(maxh1h<h2(exp(nqph+1)+maxh1h<h2(Hnh)))\displaystyle\leq(h_{2}-h_{1})\times\left(\max_{h_{1}\leq h<h_{2}}\left(\exp(-nqp^{h+1})+\max_{h_{1}\leq h<h_{2}}\mathbb{P}\left(H_{n}\leq h\right)\right)\right)
h2×(exp(nqph2+1)+(Hnh2))=O((lnn)4/n).\displaystyle\leq h_{2}\times\left(\exp(-nqp^{h_{2}+1})+\mathbb{P}\left(H_{n}\leq h_{2}\right)\right)=O((\ln n)^{4}/n).

Then, for Σ2\Sigma_{2}, in the range h2h3h_{2}\leq h_{3}, with h3:=4ln(n)ln(1/p)h_{3}:=\frac{4\ln(n)}{\ln(1/p)}, we rewrite hh as h:=(1t)h2+th3h:=(1-t)h_{2}+th_{3}. Such values of hh correspond to using c=1+3tc=1+3t and c=1tc^{\prime}=1-t in the Formula (28) for php^{h}. Via Formula (32), we get |Σ2|=O((lnn)3/n)|\Sigma_{2}|=O((\ln n)^{3}/n).

For the next sum, using the power series expansion of the exponential in Equation (27) (and keeping in mind that our choice of h3h_{3} implies ph3=1/n4p^{h_{3}}=1/n^{4}), we get

Σ3=h=h3n(1(Hnh))\displaystyle\Sigma_{3}=\sum_{h=h_{3}}^{n}\left(1-\mathbb{P}\left(H_{n}\leq h\right)\right) (n+1h3)(1(Hnh3))\displaystyle\leq(n+1-h_{3})\left(1-\mathbb{P}\left(H_{n}\leq h_{3}\right)\right) (49)
n(1exp((n+1)qph3+1))(1+o(1))=O(1n2).\displaystyle\leq n(1-\exp(-(n+1)qp^{h_{3}+1}))(1+o(1))=O\left(\frac{1}{n^{2}}\right).\qquad (50)

Finally, for the sum Σ4\Sigma_{4}, we use the power series expansions of exp(x)\exp(x) and of 1/(1p)1/(1-p) and we get:

Σ4=hh3(1exp(nqph+1))=nqph3+11phh3k2(nqph+1)kk!<nph3+1=O(1n3).\Sigma_{4}=\sum_{h\geq h_{3}}(1-\exp(-nqp^{h+1}))=\frac{nqp^{h_{3}+1}}{1-p}-\sum_{h\geq h_{3}}\sum_{k\geq 2}\frac{(-nqp^{h+1})^{k}}{k!}<np^{h_{3}+1}=O\left(\frac{1}{n^{3}}\right).

We got that Σ0\Sigma_{0}, Σ1\Sigma_{1}, Σ2\Sigma_{2}, Σ3\Sigma_{3}, and Σ4\Sigma_{4} are o(1)o(1). It remains to evaluate Σ=h0(1enqph+1)\Sigma_{\infty}=\sum_{h\geq 0}(1-e^{-nqp^{h+1}}). Such a sum is typical of expressions which can be evaluated by Mellin transform techniques. To this aim, let ϕ(t)=h0(1etqph+1)\phi(t)=\sum_{h\geq 0}(1-e^{-tqp^{h+1}}) and set f(t):=1etpqf(t):=1-e^{-tpq} and μh:=ph\mu_{h}:=p^{h}, then

ϕ(t)=h0f(μht).\phi(t)=\sum_{h\geq 0}f(\mu_{h}t).

Let ϕ\phi^{*} and ff^{*} be, respectively, the Mellin transform of the functions ϕ\phi and ff. Using Identity (42) given in Example 4.2, we have f(s)=(pq)sΓ(s)f^{*}(s)=-(pq)^{-s}\Gamma(s) on its fundamental strip 1<(s)<0-1<\Re(s)<0 and, as ϕ\phi is a harmonic sum, its Mellin transform is

ϕ(s)=f(s)h0μhs=qsΓ(s)1ps.\phi^{*}(s)=f^{*}(s)\sum_{h\geq 0}\mu_{h}^{-s}=\frac{q^{-s}\Gamma(s)}{1-p^{s}}. (51)

This function extends analytically to the full complex plane, with isolated poles at the negative integers (due to poles of Γ(s)\Gamma(s) there), and with another set of isolated poles (the roots of ps=1p^{s}=1). These two sets of poles have s=0s=0 in common. This implies that for (s)>1\Re(s)>-1 the poles of ϕ\phi^{*} are

{sk=2ikπlnp for k,k0 (all are poles of order 1),s0=0(the only pole of order 2).\displaystyle\begin{cases}s_{k}=\frac{2ik\pi}{\ln p}\text{ for $k\in\mathbb{Z},k\neq 0$ \qquad(all are poles of order 1)},\\ s_{0}=0\qquad\text{(the only pole of order 2)}.\end{cases} (52)

Using Formula (44) for the inverse Mellin transform, we obtain

ϕ(t)\displaystyle\phi(t) =Res[sϕ,0]lntRes[ϕ,0]k{0}Res[ϕ,sk]tsk\displaystyle=\operatorname{Res}[s\phi^{*},0]\ \ln t-\operatorname{Res}[\phi^{*},0]-\sum_{k\in\mathbb{Z}\setminus\{0\}}\operatorname{Res}[\phi^{*},s_{k}]\ t^{-s_{k}}
=lntlnp(γlnp+12+lnqlnp)+1lnpk{0}Γ(sk)qsktsk.\displaystyle=\frac{\ln t}{-\ln p}-\left(\frac{\gamma}{\ln p}+\frac{1}{2}+\frac{\ln q}{\ln p}\right)+\frac{1}{\ln p}\sum_{k\in\mathbb{Z}\setminus\{0\}}\Gamma(s_{k})q^{-s_{k}}t^{-s_{k}}.

We finally get the claim of the theorem by noting that 𝔼[Hn]=ϕ(n)+O((lnn)4n)\mathbb{E}[H_{n}]=\phi(n)+O\left(\frac{(\ln n)^{4}}{n}\right).

4.3. Variance of the height of Moran walks

We now prove that the height of Moran walks, despite a mean of order O(lnn)O(\ln n) and a second moment of order O((lnn)2)O((\ln n)^{2}), has a variance which involves surprising cancellations at these two orders, leading to an oscillating function of order O(1)O(1) (in nn), as implied by the following much more precise asymptotics.

Theorem 4.7.

The variance of the height of Moran walks satisfies

𝕍ar[Hn]=1ln(p)2(Q2(ln(qn))+2γQ(ln(qn))+2R(ln(qn))+π26)+112+O((lnn)5n),\mathbb{V}{\rm ar}[H_{n}]=\frac{1}{\ln(p)^{2}}\left(Q^{2}(\ln(qn))+2\gamma Q(\ln(qn))+2R(\ln(qn))+\frac{\pi^{2}}{6}\right)+\frac{1}{12}+O\left(\frac{(\ln n)^{5}}{n}\right),

where QQ and RR are Fourier series of small amplitudes given by Formulas (46) and (56).

Proof 4.8.

To obtain the variance of HnH_{n} we first consider the second moment

𝔼[Hn2]\displaystyle\mathbb{E}[H_{n}^{2}] =h0(Hn=h)h2=h0(Hn2>h),\displaystyle=\sum_{h\geq 0}\mathbb{P}(H_{n}=h)h^{2}=\sum_{h\geq 0}\mathbb{P}(H_{n}^{2}>h), (53)

where we know from Theorem 3.4 that the summand can be approximated by

(Hn2>h)=1(Hnh)=1exp(nqph+1)+O((lnn)3n).\mathbb{P}(H_{n}^{2}>h)=1-\mathbb{P}\left(H_{n}\leq\sqrt{h}\right)=1-\exp\left(-nqp^{\left\lfloor\sqrt{h}\right\rfloor+1}\right)+O\left(\frac{(\ln n)^{3}}{n}\right).

Then, partitioning the last sum in (53) into the same intervals as in Formula (47), we get that 𝔼[Hn2]=ϕvar(n)+O((lnn)4n)\mathbb{E}[H_{n}^{2}]=\phi_{\rm var}(n)+O\left(\frac{(\ln n)^{4}}{n}\right), where ϕvar\phi_{\rm var} is the function defined by

ϕvar(x)=h0(1exp(xqph+1)).\phi_{\rm var}(x)=\sum_{h\geq 0}\left(1-\exp\left(-xqp^{\left\lfloor\sqrt{h}\right\rfloor+1}\right)\right).

From the behavior of ϕvar(x)\phi_{\rm var}(x) at x=0x=0 and x=+x=+\infty, using the property given in (39), we get that the Mellin transform of ϕvar\phi_{\rm var} is defined on the fundamental strip (1, 0)(-1,\,0). Using the harmonic sum summation (51), one gets for ss in this strip:

ϕvar(s)=f(s)h0(ph)s=Γ(s)(pq)sh0(ph)s.\phi_{\rm var}^{*}(s)=f^{*}(s)\sum_{h\geq 0}\left(p^{\left\lfloor\sqrt{h}\right\rfloor}\right)^{-s}=-\Gamma(s)(pq)^{-s}\sum_{h\geq 0}\left(p^{\left\lfloor\sqrt{h}\right\rfloor}\right)^{-s}.

Here, as we have

h0(ph)s=n0h=n2(n+1)21(ps)n=n0(2n+1)(ps)n=1+ps(1ps)2,\displaystyle\sum_{h\geq 0}\left(p^{\left\lfloor\sqrt{h}\right\rfloor}\right)^{-s}=\sum_{n\geq 0}\,\sum_{h=n^{2}}^{(n+1)^{2}-1}\left(p^{-s}\right)^{n}=\sum_{n\geq 0}\,\left(2n+1\right)\left(p^{-s}\right)^{n}=\frac{1+p^{-s}}{\left(1-p^{-s}\right)^{2}},

we finally get

ϕvar(s)=Γ(s)qs(1+ps)(ps1)2.\phi_{\rm var}^{*}(s)=\frac{-\Gamma(s)q^{-s}(1+p^{s})}{(p^{s}-1)^{2}}. (54)

What are the poles of ϕvar(s)\phi_{\rm var}^{*}(s)? These are s=0s=0 (a pole of order 3) and s=sk=2ikπs=s_{k}=2ik\pi (for k,k0k\in\mathbb{Z},k\neq 0, which are poles of order 2). Using Formula (44) for the inverse Mellin transform, one thus obtains

ϕvar(t)=\displaystyle\phi_{\rm var}(t)= ln(t)2ln(p)2+ln(t)ln(p)+2ln(q)+2γ2Q(ln(qt))ln(p)2\displaystyle\frac{\ln(t)^{2}}{\ln(p)^{2}}+\ln(t)\frac{\ln(p)+2\ln(q)+2\gamma-2Q(\ln(qt))}{\ln(p)^{2}}
ln(p)+2ln(q)ln(p)2Q(ln(qt))+2ln(p)2R(ln(qt))\displaystyle-\frac{\ln(p)+2\ln(q)}{\ln(p)^{2}}Q(\ln(qt))+\frac{2}{\ln(p)^{2}}R(\ln(qt))
+13+γ+ln(q)ln(p)+π2/6+γ2ln(p)2+2γln(q)+ln(q)2ln(p)2,\displaystyle+{\frac{1}{3}}+{\frac{\gamma+\ln(q)}{\ln(p)}}+\frac{\pi^{2}/6+\gamma^{2}}{\ln(p)^{2}}+\frac{2\gamma\,\ln(q)+\ln(q)^{2}}{\ln(p)^{2}}, (55)

with the same Q(x)Q(x) as in (46), and where R(x)R(x) is another Fourier series given by

R(x)\displaystyle R(x) =k{0}Γ(sk)exp(skx).\displaystyle=\sum_{k\in\mathbb{Z}\setminus\{0\}}\Gamma^{\prime}(s_{k})\exp(-s_{k}x). (56)

(Similarly to Q(x)Q(x), this Fourier series R(x)R(x) is always real, as can be seen by replacing Γ\Gamma by Γ\Gamma^{\prime} in Remark 4.4.)

Now that we obtained the asymptotic behavior of 𝔼[Hn2]\mathbb{E}[H_{n}^{2}], we conclude and obtain Theorem 4.7 via 𝕍ar[Hn]=𝔼[Hn2]𝔼[Hn]2\mathbb{V}{\rm ar}[H_{n}]=\mathbb{E}[H_{n}^{2}]-\mathbb{E}[H_{n}]^{2}, where 𝔼[Hn]\mathbb{E}[H_{n}] was computed in Theorem 4.3.

4.4. Height of excursions

Excursions are walks in 2\mathbb{N}^{2} ending at altitude 0 (where, as previously, time is encoded by the xx-axis, and altitude by the yy-axis). As in previous sections, let YnY_{n} and HnH_{n} be the final altitude and height of a walk, and let the random variable H~n{\widetilde{H}}_{n} be the height of a walk of length nn conditioned to be an excursion, that is, H~n=Hn|{Yn=0}{\widetilde{H}}_{n}=H_{n}|\{Y_{n}=0\}. For Moran walks, we get the following behavior.

Theorem 4.9 (Distribution and moments of the height of Moran excursions).

The distribution of the height of excursions satisfies (for a uniform error term in kk)

(H~nlnnln(1/p)+k)=exp(qα(n1)pk+1)+O((lnn)3n),\displaystyle\mathbb{P}\left({\widetilde{H}}_{n}\leq\left\lfloor\frac{\ln n}{\ln(1/p)}\right\rfloor+k\right)=\exp\left(-q\alpha(n-1)p^{k+1}\right)+O\left(\frac{(\ln n)^{3}}{n}\right), (57)

with α(n):=p{lnnln(1/p)}\alpha(n):=p^{-\{\frac{\ln n}{\ln(1/p)}\}} (where {x}\{x\} stands for the fractional part of xx, and where x\lfloor x\rfloor stands for the floor function of xx).

Introducing temporarily the quantity n:=ln(q(n1))\ell_{n}:=\ln(q(n-1)), and with the same Fourier series QQ and RR as in Theorems  4.3 and 4.7, the average and the variance are given by

𝔼[H~n]\displaystyle\mathbb{E}[{\widetilde{H}}_{n}] =lnnln(1/p)γlnp12lnqlnp+Q(n)lnp+O((lnn)4n),\displaystyle=\frac{\ln n}{\ln(1/p)}-\frac{\gamma}{\ln p}-\frac{1}{2}-\frac{\ln q}{\ln p}+\frac{Q(\ell_{n})}{\ln p}+O\left(\frac{(\ln n)^{4}}{n}\right), (58)
𝕍ar[H~n]=1ln(p)2(Q2(n)+2γQ(n)+2R(n)+π26)+112+O((lnn)5n).\mathbb{V}{\rm ar}[{\widetilde{H}}_{n}]=\frac{1}{\ln(p)^{2}}\left(Q^{2}(\ell_{n})+2\gamma Q(\ell_{n})+2R(\ell_{n})+\frac{\pi^{2}}{6}\right)+\frac{1}{12}+O\left(\frac{(\ln n)^{5}}{n}\right). (59)
Proof 4.10.

As a Moran excursion necessarily ends by a reset, we have

(H~nh)=(Hnh|{Yn=0})=q(Hn1h)/(Yn=0).\mathbb{P}(\widetilde{H}_{n}\leq h)=\mathbb{P}\left(H_{n}\leq h|\{Y_{n}=0\}\right)=q\mathbb{P}(H_{n-1}\leq h)/\mathbb{P}(Y_{n}=0). (60)

Thus, we have (H~nh)=(Hn1h)\mathbb{P}(\widetilde{H}_{n}\leq h)=\mathbb{P}(H_{n-1}\leq h), 𝔼[H~n]=𝔼[Hn1]\mathbb{E}[{\widetilde{H}}_{n}]=\mathbb{E}[H_{n-1}], and 𝕍ar[H~n]=𝕍ar[Hn1]\mathbb{V}{\rm ar}[{\widetilde{H}}_{n}]=\mathbb{V}{\rm ar}[H_{n-1}], we can therefore directly recycle the results of Theorems 3.44.3, and 4.7 to get the asymptotic distribution/mean/variance.

In this recycling, some care has to be brought while performing the substitution nn1n\rightarrow n-1 in the asymptotic formulas for the walks: indeed, this could impact intermediate asymptotic terms (smaller than the main asymptotic term, but larger than the error term); however, in our case, all is safe as we have

(ln(n±1))m(n±1)m=(lnn)mnm+O((lnn)mnm+1).\frac{(\ln(n\pm 1))^{m}}{(n\pm 1)^{m^{\prime}}}=\frac{(\ln n)^{m}}{n^{m^{\prime}}}+O\left(\frac{(\ln n)^{m}}{n^{m^{\prime}+1}}\right).

This result is a simple consequence of the combinatorially obvious identity (60), so this direct link between the asymptotics of walks and excursions holds in wider generality for any model of walks with resets for which the step set 𝒮\mathcal{S} contains only positive steps.

4.5. Fourier series: bounds and infinite differentiability

In his seminal work [38], Knuth mentions at the end of his Section 3 that if one assumes that ln(qn)\ln(qn) is equidistributed mod 1, then the sum Q(ln(qn))Q(\ln(qn)) is of “average 0”. Let us amend a little bit Knuth’s assertion. Indeed, Weyl’s criterion asserts that a sequence ana_{n} is equidistributed mod 1 if and only if, for any positive integer \ell, we have

limN+1Nn=1Nexp(2iπan)=0.\lim_{N\rightarrow+\infty}\frac{1}{N}\sum_{n=1}^{N}\exp(2i\pi\ell a_{n})=0.

Considering this sum with =1\ell=1 and an=ln(qn)a_{n}=\ln(qn), and applying the Euler–Maclaurin formula to it, one gets that it does not converge to 0, and therefore ln(qn)\ln(qn) is not equidistributed mod 1.

However, it is indeed true that the oscillating Q(x)Q(x) and R(x)R(x) are of mean value zero over their period (i.e., 0ln(1/p)Q(x)𝑑x=0\int_{0}^{\ln(1/p)}Q(x)dx=0; see Figure 7 on page 7), and that Q(ln(qn))Q(\ln(qn)) and R(ln(qn))R(\ln(qn)) are “almost” of mean value zero and that they possess small fluctuations. Let us give an explicit bound on their amplitude. To this aim, we first need to bound the digamma function777This is a rather misleading name: indeed, the digamma function is traditionally denoted by the letter psi (i.e., ψ\psi), while it should logically be denoted by the Greek letter digamma (i.e., ϝ\digamma, a letter which looks like a big Γ\Gamma stack on a small Γ\Gamma, which later gave birth to the more familiar letter FF in the Latin alphabet). This paradox is due to the fact that Stirling, who introduced this function, did initially use the notation digamma ϝ\digamma, but later authors switched the notation to ψ\psi, while the initial name remained., defined by

ψ(z):=Γ(z)/Γ(z).\psi(z):=\Gamma^{\prime}(z)/\Gamma(z). (61)

The function ψ\psi can be seen as an analytic continuation of harmonic numbers and satisfies ψ(t+1)=ψ(t)+1/t\psi(t+1)=\psi(t)+1/t. While several bounds for ψ(z)\psi(z) exist in the literature (see e.g. [52]), most of them are dedicated to zz\in\mathbb{R} (for example we have ψ(t)<ln(t)1/(2t)\psi(t)<\ln(t)-1/(2t) for t>0t>0), so we now establish a lemma for ziz\in i\mathbb{R} (which we believe to be new, and which has its own interest beyond our application hereafter to bounds of Fourier series).

Lemma 4.11 (A bound for the digamma function on the imaginary axis).

For t>0t>0, we have

|ψ(it)|12ln(1+t2)+(π2+1γ)+1t,\left|\psi(it)\right|\leq\frac{1}{2}\ln\left(1+t^{2}\right)+\left(\frac{\pi}{2}+1-\gamma\right)+\frac{1}{t}, (62)

which also implies the bound

|ψ(it)|(π2+1γ+ln22)+(ln(t)𝟙{t1}+1t).\left|\psi(it)\right|\leq\left(\frac{\pi}{2}+1-\gamma+\frac{\ln 2}{2}\right)+\left(\ln(t){\mathbbm{1}}_{\{t\geq 1\}}+\frac{1}{t}\right).
Proof 4.12.

Using Euler’s representation of the gamma function as an infinite product, i.e.,

Γ(z)=1zk1(1+1/k)z/(1+z/k)=exp(γz)zk1exp(z/k)1+z/k,\Gamma(z)=\frac{1}{z}\prod_{k\geq 1}(1+1/k)^{z}/(1+z/k)=\frac{\exp(-\gamma z)}{z}\prod_{k\geq 1}\frac{\exp(z/k)}{1+z/k},

we get that its logarithmic derivative, ψ(z)=Γ(z)/Γ(z)\psi(z)=\Gamma^{\prime}(z)/\Gamma(z), satisfies, for z,zz\in\mathbb{C},z\notin-{\mathbb{N}} :

ψ(z)=1zγ+k=1+zk(k+z).\psi(z)=-\frac{1}{z}-\gamma+\sum_{k=1}^{+\infty}\frac{z}{k(k+z)}. (63)

We refer to [18, Section 1.1] for more details on these formulas. Now, setting z=itz=it (with t>0t>0), and regrouping the imaginary and real parts gives

ψ(it)=i(1t+n=1+tn2+t2)+(n=1+t2n(n2+t2)γ),\psi(it)=i\left(\frac{1}{t}+\sum_{n=1}^{+\infty}\frac{t}{n^{2}+t^{2}}\right)+\left(\sum_{n=1}^{+\infty}\frac{t^{2}}{n\left(n^{2}+t^{2}\right)}-\gamma\right),

and thus, by the triangle inequality

|ψ(it)|(1t+n=1+tn2+t2)+(n=1+t2n(n2+t2)γ).\left|\psi(it)\right|\leq\left(\frac{1}{t}+\sum_{n=1}^{+\infty}\frac{t}{n^{2}+t^{2}}\right)+\left(\sum_{n=1}^{+\infty}\frac{t^{2}}{n\left(n^{2}+t^{2}\right)}-\gamma\right). (64)

Here, note that for all nu<n+1n\leq u<n+1, we have n2+t2u2+t2<(n+1)2+t2n^{2}+t^{2}\leq u^{2}+t^{2}<(n+1)^{2}+t^{2}, and thus

t(n+1)2+t2nn+1tu2+t2𝑑utn2+t2.\frac{t}{(n+1)^{2}+t^{2}}\leq\int_{n}^{n+1}\frac{t}{u^{2}+t^{2}}du\leq\frac{t}{n^{2}+t^{2}}.

Summing for nn from 0 to ++\infty, we obtain

n=1+tn2+t2n=0+nn+1tu2+t2𝑑u=0+tu2+t2𝑑u=π2.\displaystyle\sum_{n=1}^{+\infty}\frac{t}{n^{2}+t^{2}}\leq\sum_{n=0}^{+\infty}\int_{n}^{n+1}\frac{t}{u^{2}+t^{2}}du=\int_{0}^{+\infty}\frac{t}{u^{2}+t^{2}}du=\frac{\pi}{2}.

So the first infinite sum in (64) is bounded by π/2\pi/2. For the second infinite sum, it is convenient to split it in the contribution from the summand for n=1n=1, which is bounded by

maxt0(t21+t2)=1,\max_{t\geq 0}\left(\frac{t^{2}}{1+t^{2}}\right)=1,

plus the remaining part (i.e., the sum of the terms for n2n\geq 2):

n=2+t2n(n2+t2)t1+1u(u2+1)𝑑u=12ln(1+t2).\displaystyle\sum_{n=2}^{+\infty}\frac{t^{2}}{n\left(n^{2}+t^{2}\right)}\leq\int_{t^{-1}}^{+\infty}\frac{1}{u(u^{2}+1)}du\ =\ \frac{1}{2}\ln(1+t^{2}).

Plugging these two bounds in (64) proves our lemma.

Equipped with the previous lemma, we can now give our bounds for Q(x)Q(x) and R(x)R(x).

Proposition 4.13 (Uniform bounds for the oscillations).

The oscillating functions Q(x)Q(x) and R(x)R(x) are uniformly bounded by

supx+|Q(x)|\displaystyle\sup_{x\in\mathbb{R}^{+}}|Q(x)| ln(p)πlnexp(p,45π2),\displaystyle\leq\frac{\ln(p)}{\pi}\operatorname{lnexp}\left(p,\frac{4}{5}\pi^{2}\right), (65)
supx+|R(x)|\displaystyle\sup_{x\in\mathbb{R}^{+}}|R(x)| ln(p)π[lnexp(p,45π2)+(π2+1γln(p)2π)lnexp(p,114155π2)],\displaystyle\leq\frac{\ln(p)}{\pi}\left[\operatorname{lnexp}\left(p,\frac{4}{5}\pi^{2}\right)+\left(\frac{\pi}{2}\!+\!1\!-\!\gamma\!-\!\frac{\ln(p)}{2\pi}\right)\operatorname{lnexp}\left(p,\frac{114}{155}\pi^{2}\right)\right],\qquad (66)

where

lnexp(p,β):=ln(1exp(βln(p))).\operatorname{lnexp}(p,\beta):=\ln\left(1-\exp\left(\frac{\beta}{\ln(p)}\right)\right). (67)

For p=1/2p=1/2, we have more precisely

supx+|Q(x)|=1.090430×106 and supx+|R(x)|=2.987768×106.\sup_{x\in\mathbb{R}^{+}}|Q(x)|=1.090430\dots\times 10^{-6}\text{\qquad and \qquad}\sup_{x\in\mathbb{R}^{+}}|R(x)|=2.987768\dots\times 10^{-6}.
Proof 4.14.

Applying the triangle inequality on the definition of Q(x)Q(x) in (46), we get

|Q(x)|k{0}|Γ(sk)|×|exp(skx)|2k1|Γ(sk)||Q(x)|\leq\sum_{k\in\mathbb{Z}\setminus\{0\}}|\Gamma(s_{k})|\times|\exp(-s_{k}x)|\leq 2\sum_{k\geq 1}\left|\Gamma(s_{k})\right|

(a quantity independent of xx, as |exp(skx)|=1|\exp(-s_{k}x)|=1). Then, using the complement formula for the gamma function, we have Γ(z)Γ(z)=πzsin(π(z+1))\Gamma(-z)\Gamma(z)=\frac{\pi}{z\sin(\pi(z+1))} (for zz\not\in\mathbb{Z}). Using this relation for z=itz=it (with tt\in\mathbb{R}) together with the relation Γ(z)¯=Γ(z¯)\overline{\Gamma(z)}=\Gamma(\bar{z}), we infer that

|Γ(it)|2=Γ(it)Γ(it)=πtsinh(πt).|\Gamma(it)|^{2}=\Gamma(it)\Gamma(-it)=\frac{\pi}{t\sinh(\pi t)}. (68)

Thus, for t=2πlnpt=\frac{2\pi}{-\ln p}, this gives

supx+|Q(x)|\displaystyle\sup_{x\in\mathbb{R}^{+}}|Q(x)| 2k1πktsinh(πkt)=ln(1/p)2k11ksinh(πkt).\displaystyle\leq 2\sum_{k\geq 1}\sqrt{\frac{\pi}{kt\sinh(\pi kt)}}=\sqrt{\frac{\ln(1/p)}{2}}\sum_{k\geq 1}\sqrt{\frac{1}{k\sinh(\pi kt)}}. (69)

As, for x0x\geq 0, we have sinh(x)(1/4)xexp(4x/5)\sinh(x)\geq(1/4)x\exp(4x/5), we get

supx+|Q(x)|\displaystyle\sup_{x\in\mathbb{R}^{+}}|Q(x)| ln(1/p)2k11(1/4)πk2texp(4πkt/5)\displaystyle\leq\sqrt{\frac{\ln(1/p)}{2}}\sum_{k\geq 1}\sqrt{\frac{1}{(1/4)\pi k^{2}t\exp(4\pi kt/5)}}
=ln(1/p)k11πkexp(25πkt)\displaystyle={\ln(1/p)}\sum_{k\geq 1}\frac{1}{\pi k\exp\left(\frac{2}{5}\pi kt\right)} (70)
=ln(p)πln(1exp(4π25ln(p))).\displaystyle=\frac{\ln(p)}{\pi}\ln\left(1-\exp\left(\frac{4\pi^{2}}{5\ln(p)}\right)\right). (71)

Note that the more relaxed bound (71) is quite close to the stricter bound (69): e.g. for p=1/2p=1/2 the bound (69) gives the upper bound 1.090430×1061.090430\dots\times 10^{-6} (and one can numerically check that these first digits also constitute a lower bound), while the bound (71) gives the upper bound 2.49×1062.49\times 10^{-6}.

Now, for bounding R(x)R(x), we use the identity Γ(z)=ψ(z)Γ(z)\Gamma^{\prime}(z)=\psi(z)\Gamma(z), with the bound (62) from Lemma 4.11 for |ψ(it)||\psi(it)|, and the bound (70) for |Γ(it)||\Gamma(it)|:

|R(x)|\displaystyle|R(x)| 2k1|Γ(sk)|=2k1|ψ(sk)||Γ(sk)|\displaystyle\leq 2\sum_{k\geq 1}\left|\Gamma^{\prime}(s_{k})\right|=2\sum_{k\geq 1}\left|\psi(s_{k})\right|\left|\Gamma(s_{k})\right|
k1(12ln(1+(2πkln(p))2)+π2+1γlnp2πk)ln(1/p)πkexp(45π2k/ln(p)).\displaystyle\leq\sum_{k\geq 1}\left(\frac{1}{2}\ln\left(1\!+\!\left(\frac{2\pi k}{\ln(p)}\right)^{2}\right)\!+\!\frac{\pi}{2}\!+\!1\!-\!\gamma\!-\!\frac{\ln p}{2\pi k}\right)\frac{\ln(1/p)}{\pi k\exp\left(-\frac{4}{5}\pi^{2}k/\ln(p)\right)}.\qquad (72)

Now, it is easy to check that we have 12ln(1+x2)exp(131πx)\frac{1}{2}\ln\left(1+x^{2}\right)\leq\exp\left(\frac{1}{31}\pi x\right) for all x>0x>0. Then, noting t=2π/ln(p)t=-2\pi/\ln(p), we get

k112ln(1+(kt)2)ln(1/p)πkexp(25πkt)\displaystyle\sum_{k\geq 1}\frac{1}{2}\ln\left(1+(kt)^{2}\right)\frac{\ln(1/p)}{\pi k\exp\left(\frac{2}{5}\pi kt\right)} k1ln(1/p)πkexp(57155πkt)\displaystyle\leq\sum_{k\geq 1}\frac{\ln(1/p)}{\pi k\exp\left(\frac{57}{155}\pi kt\right)}
=ln(p)πln(1exp(114π2155ln(p))).\displaystyle=\frac{\ln(p)}{\pi}\ln\left(1-\exp\left(\frac{114\pi^{2}}{155\ln(p)}\right)\right).

Together with the contribution of the remaining summands in (72), this gives the bound (66) for |R(x)||R(x)|.

From this, we can establish the infinite differentiability of our fluctuations.

Theorem 4.15 (Fourier series infinite differentiability).

The Fourier series

Q(x)=k{0}Γ(sk)exp(skx) and R(x)=k{0}Γ(sk)exp(skx)Q(x)=\sum_{k\in\mathbb{Z}\setminus\{0\}}\Gamma(s_{k})\exp(-s_{k}x)\text{\qquad and \qquad}R(x)=\sum_{k\in\mathbb{Z}\setminus\{0\}}\Gamma^{\prime}(s_{k})\exp(-s_{k}x)

(where sk=2ikπlnps_{k}=\frac{2ik\pi}{\ln p}) are infinitely differentiable on {\mathbb{R}}.

Proof 4.16.

A Fourier series f(x)=kckexp(ikx)f(x)=\sum_{k\in{\mathbb{Z}}}c_{k}\exp(-ikx) satisfies the Weierstrass MM-test if there exists a sequence MnM_{n} such that |ckexp(ikx)|+|ckexp(ikx)|<Mk|c_{k}\exp(-ikx)|+|c_{-k}\exp(ikx)|<M_{k} (for all xx\in\mathbb{R}) and k0Mk\sum_{k\geq 0}M_{k} converges. If f(x)f(x) and g(x):=ikkckexp(ikx)g(x):=-i\sum_{k\in{\mathbb{Z}}}kc_{k}\exp(-ikx) both satisfy the Weierstrass MM-test, then they converge absolutely and uniformly in \mathbb{R}, and f=gf^{\prime}=g.

Thus, by successive application of this MM-test, if the coefficients decay polynomially, i.e., we have |ck|+|ck|=O(|k|d1)|c_{-k}|+|c_{k}|=O(|k|^{-d-1}), then f(x)f(x) is in 𝒞d{\mathcal{C}}^{d} (that is, dd times differentiable) and f(x)f(x) is in 𝒞{\mathcal{C}}^{\infty} (that is, infinitely differentiable) if its coefficients decay faster than any polynomial rate. By Equation (68), the coefficients Γ(sk)\Gamma(s_{k}) decay like exp(kπ/ln(p))\approx\exp(-k\pi/\ln(p)), so Q(x)Q(x) is in 𝒞{\mathcal{C}}^{\infty}. By Equation (72), the coefficients Γ(sk)\Gamma^{\prime}(s_{k}) also decay like an exponential, so R(x)R(x) is in 𝒞{\mathcal{C}}^{\infty}.

It is interesting to compare this smoothness result with the situation observed by Delange [16] in his seminal work on the sum of digits of nn in base 1/p1/p (when 1/p1/p is an integer). Therein, he proved an asymptotic behavior involving fluctuations dictated by a Fourier series, which can also be obtained by a Mellin transform approach, quite similarly to the road followed in our article. It appears that his Fourier series (already mentioned in Remark 4.5) has coefficients ζ(sk)/((1+sk)sk)k1.5\zeta(s_{k})/((1+s_{k})s_{k})\approx k^{-1.5}; it is thus not surprising that the Delange series is nowhere differentiable, in sharp contrast with the smoothness of our Fourier series (see Figure 8).

This concludes our analysis of the height and the corresponding fluctuations.

Refer to caption
Q(x)Q(x) (for p=1/2p=1/2)
Refer to caption
R(x)R(x) (for p=1/2p=1/2)
Refer to caption
Delange(x)Delange(x) (for p=1/2p=1/2)
Figure 8. Our Fourier series QQ and RR are infinitely differentiable, while the Fourier series obtained by Delange is nowhere differentiable. This follows from the asymptotics of their coefficients, as explained in the proof of Theorem 4.15.

5. Some results for the Moran model in dimension m>1m>1

5.1. Joint distribution of ages for the Moran model with m>1m>1

Moran processes are models of population evolution (or mutation transmission) where the population is of constant size (some individuals could die but are then immediately replaced by a new individual). Depending on the applications, several variants were considered in the literature starting with the seminal work of Moran himself [43, 44], up to more recent extensions (for example to spatially structured population [41].

Motivated by the model with resets of Itoh, Mahmoud, and Takahashi [35, 34], we now define the Moran model with mm individuals. It is a process parametrized by some probabilities pp and pip_{i}’s such that p+i=0mpi=1p+\sum_{i=0}^{m}p_{i}=1, and which starts at time 0 with mm individuals of age 0. Then, at each new unit of time,

  • either, with probability pp, all survive (their age increases by 1),

  • either, with probability pip_{i} (for 1im1\leq i\leq m), the ii-th individual dies (it is then replaced by a new ii-th individual of age 0), while the age of the m1m-1 surviving individuals increases by 1,

  • either, with probability p0p_{0}, all die and are replaced by mm new individuals of age 0.

Now, we define the sequence of multivariate polynomials fn(x1,,xm)f_{n}(x_{1},\dots,x_{m}) (for nn\in\mathbb{N}) by the fact that the coefficient of x1k1xmkmx_{1}^{k_{1}}\cdots x_{m}^{k_{m}} in fn(x1,,xm)f_{n}(x_{1},\dots,x_{m}) is the probability that, at time nn, the ii-th individual has age kik_{i} (for i=1,,mi=1,\dots,m). Accordingly, F(t,x1,,xm):=n0fn(x1,,xm)tnF(t,x_{1},\dots,x_{m}):=\sum_{n\geq 0}f_{n}(x_{1},\dots,x_{m})t^{n} is the probability generating function associated to the above Moran model, where the time is encoded by the exponent of tt.

Theorem 5.1.

The probability generating function of the Moran model is a rational function, and it admits the closed form

F(t,x1,,xm)=k=02m1(1)kPktkΔ,F(t,x_{1},\dots,x_{m})=\frac{\sum_{k=0}^{2^{m}-1}(-1)^{k}P_{k}t^{k}}{\Delta}, (73)

where the PkP_{k}’s are polynomials (given in the proof) in the xix_{i}’s, pp, pip_{i}’s, and where Δ\Delta is the following polynomial of degree 2m2^{m} in tt:

Δ=I{1,,m}(1t(p+p0[[I={1,,m}]]+iIpi)iIxi).\Delta=\prod_{I\subseteq\{1,\dots,m\}}\left(1-t\left(p+p_{0}[\![{I=\{1,\dots,m\}}]\!]+\sum_{i\in I}p_{i}\right)\prod_{i\not\in I}x_{i}\right). (74)
Proof 5.2.

The Moran model evolution is encoded by the following functional equation for the probability generating function FF:

F(t,x1,,xm)=1\displaystyle F(t,x_{1},\dots,x_{m})=1 +tpx1xmF(t,x1,,xm)+tp0F(t,1,,1)\displaystyle+tpx_{1}\cdots x_{m}F(t,x_{1},\dots,x_{m})+tp_{0}F(t,1,\dots,1)
+t(i=1mpix1xmxiF(t,x1,,xm)|xi=1),\displaystyle+t\left(\sum_{i=1}^{m}p_{i}\frac{x_{1}\cdots x_{m}}{x_{i}}F(t,x_{1},\dots,x_{m})_{|x_{i}=1}\right), (75)

where F|xi=1F_{|x_{i}=1} means FF evaluated at xi=1x_{i}=1.

To solve this single functional equation (which has m+2m+2 unknowns888We temporarily count F(t,1,,1)F(t,1,\dots,1) as unknown, even if it is obviously equal to 1/(1t)1/(1-t), as FF is a probability generating function.), the trick is to transform it into a linear system of equations with… 2m2^{m} unknowns! Indeed, by substituting xi=1x_{i}=1 (in all the possible ways) in the functional equation (5.2), we get a system of 2m2^{m} equations.

Then, we encode this system by a matrix MM, where we cleverly (sic!) choose the order in which unknowns are associated to the lines/columns of MM. Let us define this order; to this aim consider the Cartesian product 𝒳:={1,x1}××{1,xm}{\mathcal{X}}:=\{1,x_{1}\}\times\cdots\times\{1,x_{m}\}. For any pair of mm-tuples 𝐗\mathbf{X} and 𝐘\mathbf{Y} from 𝒳{\mathcal{X}}, one writes 𝐗𝐘\mathbf{X}\prec\mathbf{Y} if the number of 1’s in 𝐗\mathbf{X} is less than the number of 1’s in 𝐘\mathbf{Y}, or, when they have the same number of 1’s, if 𝐗\mathbf{X} is smaller than 𝐘\mathbf{Y} in the lexicographical order induced by x1xm1x_{1}\prec\dots\prec x_{m}\prec 1. For example, we have (x1,x2)(x1,1)(1,x2)(1,1)(x_{1},x_{2})\prec(x_{1},1)\prec(1,x_{2})\prec(1,1). Listing all the elements of 𝒳\mathcal{X} in increasing order, we get a list of 2m2^{m} tuples X1,,X2mX_{1},\dots,X_{2^{m}}. The matrix MM encoding the aforementioned system of equations is constructed such that the ii-th line of the matrix MM corresponds to the unknown F(t,Xi)F(t,X_{i}) and the jj-th column corresponds to the unknown F(t,Xj)F(t,X_{j}).

With this order, the matrix MM is an upper triangular matrix (as each of the substitution of some xix_{i}’s by some 1’s in Equation (5.2) leads from some tuple 𝐗\mathbf{X} to m+2m+2 larger tuples 𝐘\mathbf{Y}), and thus the determinant of MM is the product of its diagonal terms:

detM=I{1,,m}(1t(p+p0[[I={1,,m}]]+iIpi)iIxi),\det M=\ \prod_{I\subseteq\{1,\dots,m\}}\left(1-t\left(p+p_{0}[\![{I=\{1,\dots,m\}}]\!]+\sum_{i\in I}p_{i}\right)\prod_{i\not\in I}x_{i}\right), (76)

where we use Iverson’s bracket notation999This notation, [[assertion]][\![\text{assertion}]\!], is 1 if the assertion is true, and 0 if not. It was introduced in the semantics of the language APL by its founder, Kenneth Iverson. It was later popularized in mathematics by Graham, Knuth, and Patashnik [27]..

As this determinant Δ:=detM\Delta:=\det M is not zero, this entails by Cramer’s rule that F(t,x1,,xm)F(t,x_{1},\dots,x_{m}) can be written as a rational function with denominator Δ\Delta (note that, for some specific real values of pp and the pip_{i}’s, it is not excluded that the numerator could have a shared factor with Δ\Delta). Of course, computing the determinant of each comatrix, and using the relation p0=1(p+p1++pm)p_{0}=1-(p+p_{1}+\dots+p_{m}), we get symmetric polynomial expressions for the PkP_{k}’s occurring in (73), e.g.:

P0\displaystyle P_{0} =1,\displaystyle=1,
P1\displaystyle P_{1} =p(i=1m(1+xi)i=1mxi)+i=1mxij=1,,mjipj,\displaystyle=p\left(\prod_{i=1}^{m}(1+x_{i})-\prod_{i=1}^{m}x_{i}\right)+\sum_{i=1}^{m}x_{i}\sum_{\begin{subarray}{c}j=1,\dots,m\\ j\neq i\end{subarray}}p_{j},
\displaystyle\vdots
P2m1\displaystyle P_{2^{m}-1} =(i=1mxim)I{1,,m}(p+iIpi).\displaystyle=\left(\prod_{i=1}^{m}x_{i}^{m}\right)\prod_{I\subsetneq\{1,\dots,m\}}\left(p+\sum_{i\in I}p_{i}\right).

Note that the case p0=0p_{0}=0, pi=1/mp_{i}=1/m for i=1,,mi=1,\dots,m (with m2m\geq 2) was analyzed by Itoh and Mahmoud [34]: they proved that the age of each individual converges to a shifted geometric distribution, namely Geom(1/m)1\operatorname{Geom}(1/m)-1. They also show that the number of individuals of age kk at time nn converges to a Bernoulli distribution, namely Ber((m/(m1))k)\operatorname{Ber}((m/(m-1))^{k}).Our Theorem 5.1 constitutes a joint law version of these results, at discrete times, for generic pip_{i}’s. For example, introducing G(t,v):=j=1m(mj)vj[x1kxjk]F(t,x1,,xm)G(t,v):=\sum_{j=1}^{m}\binom{m}{j}v^{j}[x_{1}^{k}\dots x_{j}^{k}]F(t,x_{1},\dots,x_{m}), the coefficient [tn]vG(t,1)[t^{n}]\partial_{v}G(t,1) gives the average number of individuals of age kk at time nn. (Note that the sum with the binomial coefficients (mj)\binom{m}{j} has to be replaced by a sum over the subsets of {1,,m}\{1,\dots,m\} if the pip_{i}’s and the initial conditions for the xisx_{i}^{\prime}s are not symmetric.)

5.2. A multidimensional generalization of the Moran model

Interestingly, the same strategy of proof allows us to solve a wide generalization of the Moran model, where

  • with probability pIp_{I}, all the individuals from the subset II of {1,,m}\{1,\dots,m\} die (they are then replaced by new individuals of age 0), while the age of each surviving individual increases by 1.

  • the process starts with mm individuals of any (possibly distinct) ages, encoded by a monomial f0(x1,,xm)f_{0}(x_{1},\dots,x_{m}).

This translates to the following single functional equation, involving 2m2^{m} unknowns:

F(t,x1,,xm)=f0(x1,,xm)+tI{1,,m}pIF(t,𝐗I)iIxi,F(t,x_{1},\dots,x_{m})=f_{0}(x_{1},\dots,x_{m})+t\sum_{I\in\{1,\dots,m\}}p_{I}F(t,{\mathbf{X}}_{I})\ \prod_{i\not\in I}x_{i}, (77)

where 𝐗I=(x1,,xm)|xi=1 for all iI{\mathbf{X}}_{I}=(x_{1},\dots,x_{m})_{|\text{$x_{i}=1$ for all\ $i\in I$}}.

Obviously, by taking f0=1f_{0}=1, p=pp_{\varnothing}=p, p{1,,m}=p0p_{\{1,\dots,m\}}=p_{0}, p{i}=pip_{\{i\}}=p_{i}, and all other pI=0p_{I}=0, the generalized model simplifies to the classical Moran model of Theorem 5.1. Another natural set of probabilities is pI=qk(1q)mkp_{I}=q^{k}(1-q)^{m-k}, where kk is the number of elements in II. It encodes the model where, at each unit of time, each individual dies with probability qq.

More generally, for any set of pIp_{I}’s, one gets the following result.

Theorem 5.3.

The probability generating function of the generalized Moran model is a rational function:

F(t,x1,,xm)=k=02m1(1)kQktkΔ,F(t,x_{1},\dots,x_{m})=\frac{\sum_{k=0}^{2^{m}-1}(-1)^{k}Q_{k}t^{k}}{\Delta}, (78)

where the QkQ_{k}’s are polynomials in the xix_{i}’s and pIp_{I}’s for I{1,,m}I\subset\{1,\dots,m\}, and where Δ\Delta is the following polynomial of degree 2m2^{m} in tt:

Δ=I{1,,m}(1t(p+p{1,,m}[[I={1,,m}]]+iIp{i})iIxi).\Delta=\prod_{I\subseteq\{1,\dots,m\}}\left(1-t\left(p_{\varnothing}+p_{\{1,\dots,m\}}[\![{I=\{1,\dots,m\}}]\!]+\sum_{i\in I}p_{\{i\}}\right)\prod_{i\not\in I}x_{i}\right). (79)

Note that, for this generalized model, the denominator Δ\Delta is the same as in Theorem 5.1, and the QkQ_{k}’s are a lifting of the PkP_{k}’s from Theorem 5.1, involving more terms and variables (namely, all the pIp_{I}’s). For these two models, these polynomials PkP_{k} and QkQ_{k} are variants of symmetric functions. We comment more on this fact now.

Remark 5.4 (Links with bi-indexed families of symmetric functions).

Many problems related to lattice paths lead to generating functions expressible in terms of symmetric functions; this results from the kernel method, which involves a Vandermonde-like determinant, and thus leads to variants of Schur functions [4, 11, 6]. For the generalized Moran model we also get symmetric expressions, as the problem is by design symmetric, but in a more subtle way: one does not get formulas nicely expressible in terms of classical symmetric functions. This is due to the fact that we have to play with two distinct sets of variables (the pip_{i}’s and the xix_{i}’s), the occurrences of which are not fully independent. It appears that thesesubtle dependencies are well encoded by the MacMahon elementary symmetric functions, defined by ej,k:=[txjtpk]i=1m(1+txxi+tppi)e_{j,k}:=[t_{x}^{j}t_{p}^{k}]\prod_{i=1}^{m}(1+t_{x}x_{i}+t_{p}p_{i}). For example, we have e2,1=x1x2p3+x2x3p1+x3x1p2e_{2,1}=x_{1}x_{2}p_{3}+x_{2}x_{3}p_{1}+x_{3}x_{1}p_{2}. They allow us to provide more compact formulas for our generating functions, like P1=e1,1+pj=1mej,0P_{1}=e_{1,1}+p\sum_{j=1}^{m}e_{j,0}. We plan to study these aspects in a forthcoming work. Note that these MacMahon symmetric functions also appear in problems a priori unrelated to our multidimensional Moran walks, see e.g. the articles of Gessel [25] and Rosas [48].

5.3. Application to the soliton wave model

The soliton wave model (as considered by Itoh, Mahmoud, and Takahashi [35]) is a stochastic system of particles encoding a unidirectional wave. The number of particles is constant during the full process: we have mm particles on \mathbb{Z} which can only moves to the left as follows. At time n=0n=0, the initial configuration consists of mm particles, at xx-coordinates 1,,m1,\dots,m. Then, at each unit of time n=1, 2,n=1,\,2,\dots, uniformly at random, one of the mm particles jumps just to the left of the first particle (the wave front), thus leaving an empty space at its starting position:

[Uncaptioned image]

\ \longrightarrow\ [Uncaptioned image]

Note that at time nn the location of the leftmost particle has thus xx-coordinate 1n1-n. See Figure 9 for an illustration of 6 iterations of this process, where, for drawing convenience, we shift the origin of the xx-axis after each step, so that the first particle is always at xx-coordinate 1.

Then, applying Theorem 5.3 to this model, we get the following proposition.

Proposition 5.5.

The joint distribution F(t,x1,,xm)F(t,x_{1},\dots,x_{m}) of the time/positions of the particles in the soliton wave model is given by Formula (78), by taking as initial condition f0=x11x22xmmf_{0}=x_{1}^{1}x_{2}^{2}\dots x_{m}^{m}, and, as probabilities of transition, p{i}=1/mp_{\{i\}}=1/m and all other pI=0p_{I}=0; what is more, the denominator of F(t,x1,,xm)F(t,x_{1},\dots,x_{m}) thus simplifies to

Δ=I{1,,m}(1t|I|m)iIxi,\Delta=\prod_{I\subseteq\{1,\dots,m\}}\left(1-t\frac{|I|}{m}\right)\prod_{i\not\in I}x_{i},

where |I||I| stands for the number of elements of the set II.

Wave Time nn Length LnL_{n}
Refer to caption
0 4
Refer to caption
1 5
Refer to caption
2 6
Refer to caption
3 6
Refer to caption
4 4
Refer to caption
5 5
Refer to caption
6 6
Figure 9. The soliton wave model: a wave is a sequence of particles (the sequence may have some inner holes), and at each unit of time, one particle is selected and jumps at the very start of the wave (and thus leaves an empty slot where it was). Trailing empty slots are ignored (this occurs when the last particle is selected, e.g. from step 3 to 4 above).

Figure 9 also shows that this model has one degree of freedom, that is, the soliton wave model with mm particles can be modeled as m1m-1 interactive urns U1,,Um1U_{1},\dots,U_{m-1}: the urn UkU_{k} contains the number of white cells between the kk-th and (k+1)(k+1)-th blue particle. Accordingly, this interactive urn process starts with Uk(0)=0U_{k}(0)=0 for all kk, and then, at each unit of time, we have one of the following mm events (with probability 1/m1/m):

  • U1(n+1)=U1(n)+1U_{1}(n+1)=U_{1}(n)+1 and other urns are unchanged.

  • for k=2,,m1k=2,\dots,m-1: U1(n+1)=0U_{1}(n+1)=0, Uj(n+1):=Uj1(n)U_{j}(n+1):=U_{j-1}(n) (for j=2,,k1j=2,\dots,k-1), Uk(n+1):=Uk1(n)+Uk(n)+1U_{k}(n+1):=U_{k-1}(n)+U_{k}(n)+1, and remaining urns are unchanged.

  • U1(n+1)=0U_{1}(n+1)=0 and, for k2k\geq 2, Uk(n+1):=Uk1(n)U_{k}(n+1):=U_{k-1}(n).

The length of the soliton is then given by Ln=m+U1(n)++Um1(n)L_{n}=m+U_{1}(n)+\dots+U_{m-1}(n); it can equivalently be viewed as the maximum of the xx-coordinates (at time nn) of each particle.

6. Conclusion and future works

In this article, we considered several statistics (final altitude, waiting time, height) associated to walks with resets, for any given finite step set. For the case of the simplest non-trivial model (namely, for Moran walks), we prove that the asymptotic height exhibits some subtle behavior related to the discrete Gumbel distribution. In a forthcoming article, we plan to consider the asymptotic analysis of the height for more general walks.

In our formulas for walks of length nn, taking q:=q/nq^{\prime}:=q/n (and more generally q=q(n)q^{\prime}=q(n)) as the probability of reset leads to models which can counterbalance the infinite negative drift of the initial model, and thus present a different type of asymptotic behavior. Studying these models and their phase transitions in more detail would be interesting.

In Section 5, we considered several multidimensional extensions of such walks, with applications to the soliton wave model, or to models in genetics. More multidimensional variants of Moran models allowing both positive and negative jumps (and with or without resets) can be handled using the approach presented in this article (see [1]). One interesting example is the one where each dimension evolves like a Motzkin path, this model was e.g. considered in the haploid Moran model [32], where the authors use a Markov chain approach, using duality/reversibility to establish links with Ornstein–Uhlenbeck processes. Note that even if one adds resets to such Motzkin-like models, one keeps nice links with continuous fractions associated to birth and death processes; see [20]. The analysis becomes much more complicated as soon as jumps of amplitude 2\geq 2 are allowed; in such cases, our approach based on the kernel method strikes again.

Another natural extension is to consider walks in the quarter plane with resets (a natural model of two queues evolving in parallel); even for walks with jumps of amplitude 11, the exact enumeration and the asymptotic behavior of the (maximal) height remain open. Other more ad hoc extensions consider some age-dependent probabilities pip_{i}’s, then leading to partial differential equations for the corresponding generating functions. Some specific cases lead to closed-form solutions.

All these variants of Moran models are parametrized by the pip_{i}’s. One can then turn to the tuning of several statistical tests: having some experimental data, it is natural to look for maximum likelihood estimators of the pip_{i}’s, and to study if they are unbiased, sufficient, and consistent (for more on these notions, see e.g. [50]). In conclusion, the Moran model offers a large variety of interesting models, with many aspects to explore!

Acknowledgement: The work of the first author is supported by the Researchers Supporting Project RSPD2023R987 of King Saud University. We thank Hosam Mahmoud for introducing us to the Moran walks, and asking us about the distribution of their height. We are indebted to Rosa Orellana and Mike Zabrowski for pinpointing us that the bi-indexed symmetric functions which we introduced in Remark 5.4, and which already occurred in the literature under the name MacMahon symmetric functions. Last but not least, we also deeply thank the two referees for their kind detailed reports, which helped to improve several parts of this article.

References