
On a random model of forgetting

Noga Alon Department of Mathematics, Princeton University, Princeton, NJ 08544, USA and Schools of Mathematics and Computer Science, Tel Aviv University, Tel Aviv, Israel. Email: [email protected]. Research supported in part by NSF grant DMS-1855464, ISF grant 281/17, BSF grant 2018267 and the Simons Foundation.    Dor Elboim Department of Mathematics, Princeton University, Princeton, NJ 08544, USA. Email: [email protected].    Allan Sly Department of Mathematics, Princeton University, Princeton, NJ 08544, USA. Email: [email protected]. Research supported in part by NSF grant DMS-1855527, a Simons Investigator grant and a MacArthur Fellowship.
Abstract

Georgiou, Katkov and Tsodyks considered the following random process. Let $x_1,x_2,\ldots$ be an infinite sequence of independent, identically distributed, uniform random points in $[0,1]$. Starting with $S=\{0\}$, the elements $x_k$ join $S$ one by one, in order. When an entering element is larger than the current minimum element of $S$, this minimum leaves $S$. Let $S(1,n)$ denote the content of $S$ after the first $n$ elements $x_k$ join. Simulations suggest that the size $|S(1,n)|$ of $S$ at time $n$ is typically close to $n/e$. Here we first give a rigorous proof that this is indeed the case, and that in fact the symmetric difference of $S(1,n)$ and the set $\{x_k\geq 1-1/e:1\leq k\leq n\}$ is of size at most $\tilde{O}(\sqrt{n})$ with high probability. Our main result is a more accurate description of the process implying, in particular, that as $n$ tends to infinity $n^{-1/2}\big(|S(1,n)|-n/e\big)$ converges to a normal random variable with variance $3e^{-2}-e^{-1}$. We further show that the dynamics of the symmetric difference of $S(1,n)$ and the set $\{x_k\geq 1-1/e:1\leq k\leq n\}$ converge with proper scaling to a three-dimensional Bessel process.

1 Introduction

The following random process, which we denote by $P$, was considered by Georgiou, Katkov and Tsodyks in [8], motivated by the study of a possible simplified model for the process of forgetting. Let $x_1,x_2,\ldots$ be independent, identically distributed, uniform random points in $[0,1]$. Starting with the set $S(1,0)=\{0\}=\{x_0\}$, the elements $x_k$ enter the set one by one, in order, and if $x_k$, upon entering $S(1,k-1)$, is larger than the minimum member $\min(S(1,k-1))$ of $S(1,k-1)$, this minimum leaves it. Let us note that in this process the fact that the $x_i$ are uniformly distributed is not crucial. Indeed, all that matters is that the ordering of the arriving elements is a uniform random ordering, and therefore the uniform distribution can be replaced with any other non-atomic distribution.
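
To make the model concrete, here is a minimal simulation sketch in Python (our own illustration, not part of the original paper). Since only the current minimum is ever removed, a min-heap suffices; the function name `simulate_memory` is ours.

```python
import heapq
import math
import random

def simulate_memory(n, seed=None):
    """Run the process P(n) and return the memory S(1, n) as a list.

    Starting from S = {0}, each arriving uniform point always joins S;
    if it exceeds the current minimum of S, that minimum is removed first.
    """
    rng = random.Random(seed)
    S = [0.0]  # min-heap: S[0] is the current minimum of the memory
    for _ in range(n):
        x = rng.random()
        if x > S[0]:
            heapq.heappop(S)   # the current minimum leaves the memory
        heapq.heappush(S, x)   # the entering element always joins
    return S

if __name__ == "__main__":
    n = 100_000
    S = simulate_memory(n, seed=0)
    print(len(S), n / math.e)  # typically close, in line with Theorem 1
```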

The set $S(1,n)$ will be called the memory. We let $P(n)$ be the finite process that stops at time $n$. Simulations indicate that the expected size of the memory $S(1,n)$ at time $n$ is $(1+o(1))n/e$, where $e$ is the base of the natural logarithm, and that in fact with high probability (that is, with probability approaching $1$ as $n$ tends to infinity) the size at the end is very close to the expectation. Tsodyks [12] suggested the problem of proving this behavior rigorously. Our first result here is a proof that this is indeed the case, as stated in the following theorem. Recall that for two functions $f,g$ the notation $f(n)=\tilde{O}(g(n))$ means that for all large $n$ one has $f(n)\leq C(\log n)^{C}g(n)$.

Theorem 1.

In the process $P(n)$ the size of the memory $S(1,n)$ at time $n$ is, with high probability, $(1+o(1))n/e$. Moreover, with high probability, the symmetric difference between the final content of the memory and the set $\{x_k\geq 1-1/e:k\leq n\}$ is of size at most $\tilde{O}(\sqrt{n})$.

A result similar to Theorem 1 (without any quantitative estimates) has been proved independently by Friedgut and Kozma [7] using different ideas.

Figure 1: Four independent samples of the process obtained from a computer simulation. The blue points are the content of the memory set after $n=200$ steps in the first two pictures and after $n=1000$ steps in the last two. In the main theorems we analyze the evolution of the memory set and determine how close it is to the set of uniform variables that are larger than $1-1/e$.

Our main result provides a more accurate description of the process $P$, as stated in the following theorems. To state the next theorem we let $S(1,n)$ be the content of the list at time $n$ and let $s(1,n):=|S(1,n)|$ be the size of the list.

Theorem 2.

Let $B_t$ be a standard Brownian motion. We have that

\Big\{\frac{s(1,tn)-tn/e}{\sqrt{n}}\Big\}_{t>0}\overset{d}{\longrightarrow}\Big\{\frac{\sqrt{3-e}}{e}\cdot B_{t}\Big\}_{t>0},\quad n\to\infty.
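
As an illustrative sanity check (our own sketch, not from the paper), one can estimate the variance constant $(3-e)/e^{2}\approx 0.038$ of Theorem 2 at $t=1$ by Monte Carlo:

```python
import heapq
import math
import random
import statistics

def memory_size(n, seed):
    """Size of the memory after n steps of the process."""
    rng = random.Random(seed)
    S = [0.0]
    for _ in range(n):
        x = rng.random()
        if x > S[0]:
            heapq.heappop(S)
        heapq.heappush(S, x)
    return len(S)

n, reps = 20_000, 200
vals = [(memory_size(n, r) - n / math.e) / math.sqrt(n) for r in range(reps)]
print(statistics.variance(vals), (3 - math.e) / math.e**2)  # both close to 0.038
```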

Our next theorem describes the final set of points obtained in this process. In particular, it strengthens the estimate in Theorem 1 by removing the poly-logarithmic factor in the bound on the size of the symmetric difference between $S(1,n)$ and the set $\{x_k\geq z_0:k\leq n\}$, where $z_0:=1-1/e$. The theorem also provides the limiting distribution of this difference. To state the theorem we define

L_{n}:=\big|S(1,n)\setminus\big\{x_{k}\geq z_{0}:k\leq n\big\}\big|\quad\text{and}\quad R_{n}:=\big|\big\{x_{k}\geq z_{0}:k\leq n\big\}\setminus S(1,n)\big|.
Theorem 3.

As $n$ tends to infinity we have

\Big\{\Big(\frac{L_{tn}}{\sqrt{n}}\,,\,\frac{R_{tn}}{\sqrt{n}}\Big)\Big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{2}e^{-1}\big(M_{t}-B_{t}\,,\,M_{t}\big)\big\}_{t>0},

where $B_t$ is a standard Brownian motion and $M_t:=\sup_{s\leq t}B_s$ is the maximum process. In particular, the evolution of the size of the symmetric difference is given by

\Big\{\frac{L_{tn}+R_{tn}}{\sqrt{n}}\Big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{2}e^{-1}X_{t}\big\}_{t>0}, (1)

where $X_t$ is a three-dimensional Bessel process. It follows that the differences and the symmetric difference satisfy the following limit laws:

\frac{L_{n}}{\sqrt{n}}\overset{d}{\longrightarrow}\sqrt{2}e^{-1}|N|,\qquad\frac{R_{n}}{\sqrt{n}}\overset{d}{\longrightarrow}\sqrt{2}e^{-1}|N|\qquad\text{and}\qquad\frac{L_{n}+R_{n}}{\sqrt{n}}\overset{d}{\longrightarrow}X, (2)

where $N$ is a standard normal random variable and $X$ has the density

f(x)=\frac{e^{3}x^{2}}{2\sqrt{\pi}}\,e^{-e^{2}x^{2}/4}\,\mathds{1}\{x>0\}. (3)
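
The quantities $L_n$ and $R_n$ are easy to read off from a simulation. The following sketch (ours; it reuses the min-heap idea from the introduction) computes them directly from the definitions:

```python
import heapq
import math
import random

def L_R(n, seed=0):
    """Return (L_n, R_n) for one run of the process, with z0 = 1 - 1/e."""
    z0 = 1 - 1 / math.e
    rng = random.Random(seed)
    S = [0.0]
    arrived_above = 0  # |{k <= n : x_k >= z0}|
    for _ in range(n):
        x = rng.random()
        if x >= z0:
            arrived_above += 1
        if x > S[0]:
            heapq.heappop(S)
        heapq.heappush(S, x)
    in_memory_above = sum(1 for v in S if v >= z0)
    L = len(S) - in_memory_above         # memory elements below z0
    R = arrived_above - in_memory_above  # arrivals above z0 that were forgotten
    return L, R

n = 1_000_000
L, R = L_R(n)
print(L / math.sqrt(n), R / math.sqrt(n))  # both of order one, as in Theorem 3
```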

In fact, in the following theorem, we give a complete description of the evolution of the process in an $n^{-1/2}$ neighbourhood around the critical point $z_0=1-1/e$. For $0<z<1$ and $n\geq 1$ we let $S(z,n):=S(1,n)\cap[0,z]$ be the set of elements in the list that are smaller than $z$ at time $n$ and let $s(z,n):=|S(z,n)|$.

Theorem 4.

As $n$ tends to infinity we have

\bigg\{\frac{s\big(z_{0}+yn^{-1/2},tn\big)}{\sqrt{n}}\bigg\}_{\substack{t>0\\ y\in\mathbb{R}}}\overset{d}{\longrightarrow}\Big\{\sqrt{2}e^{-1}B_{t}+yt-\inf_{x\leq t}\big(\sqrt{2}e^{-1}B_{x}+yx\big)\Big\}_{\substack{t>0\\ y\in\mathbb{R}}}.
Remark 1.1.

The convergence in Theorems 2, 3 and 4 is weak convergence in the relevant Skorohod spaces. It is equivalent to the following statements. In Theorem 2, we define the random function $f_n(t):=n^{-1/2}\big(s(1,\lfloor tn\rfloor)-tn/e\big)$. Then, for all $M>0$ there exists a coupling of the sequence $f_n$ and a Brownian motion $B_t$ such that almost surely

\sup_{t\leq M}\Big|f_{n}(t)-\frac{\sqrt{3-e}}{e}\cdot B_{t}\Big|\to 0,\quad n\to\infty.

In Theorem 4, we define $f_n(t,y):=n^{-1/2}s\big(z_{0}+yn^{-1/2},\lfloor tn\rfloor\big)$. Then, for all $M>0$, there exists a coupling such that almost surely

\sup_{t,|y|\leq M}\Big|f_{n}(t,y)-\Big(\sqrt{2}e^{-1}B_{t}+yt-\inf_{x\leq t}\big(\sqrt{2}e^{-1}B_{x}+yx\big)\Big)\Big|\to 0,\quad n\to\infty.

The convergence in Theorem 3 is similar.

Figure 2: A computer simulation of the process running for $n=10^{6}$ steps. Each picture is an independent sample of the process. The first picture shows the centered size of the list $s(1,k)-k/e$ as a function of $k\leq n$. Theorem 2 states that this is a rescaled Brownian motion. The second picture shows the processes $R_k$ in red and $L_k$ in blue. These processes scale to the maximum process $M_t$ and to $M_t-B_t$ respectively by Theorem 3. Note that $R_k$ can grow only when $L_k$ is zero and similarly, in the limit, $M_t$ can grow only when $M_t-B_t$ is zero. The third picture shows $s(z_{0}+yn^{-1/2},k)$ as a function of $k$ for $y=-2$ in yellow, $y=-1$ in green, $y=0$ in blue, $y=1$ in purple and $y=2$ in red. Theorem 4 gives the scaling limit of these processes.

The rest of the paper is organized as follows. In the next section we describe the proof of Theorem 1. It is based on subadditivity and martingale concentration applied to the natural Doob martingale associated with the process. The concentration part resembles the basic approach in [1]. In Section 3 we prove Theorem 2, Theorem 3 and Theorem 4. The main idea in the proofs is to consider the process

W(z,n):=\sum_{x\in S(z,n)}\frac{1}{1-x} (4)

which we call “the branching martingale.” When $S(z,n)$ is not empty, this process has zero drift when $z=1-1/e$, a negative drift when $z<1-1/e$ and a positive drift when $z>1-1/e$. When $S(z,n)$ is empty, $W$ can only increase in all cases. The intuition behind this magical formula comes from a certain multi-type branching process. In a multi-type branching process there are individuals of $k$ different types. For each type $i$, an offspring distribution $\mu_i$ on $\mathbb{N}^{k}$ is given. In each step, every individual of type $i$ gives birth to a random number of individuals of each type according to the distribution $\mu_i$. Recall that a single-type branching process is critical when the expected number of offspring is $1$, and in this case the size of the generation is a martingale. Similarly, for a multi-type branching process one defines the expectation matrix $M$, where $M_{i,j}:=\int x_{j}\,d\mu_{i}(x)$ is the expected number of children of type $j$ of an individual of type $i$. The process is then critical when the maximal eigenvalue of $M$ is one, and in this case the process

N_{t}:=u\cdot Z(t) (5)

is a martingale, where $u$ is the eigenvector of $M$ with eigenvalue $1$ and $Z(t)=(Z_{1}(t),\dots,Z_{k}(t))$ is the number of individuals of each type at time $t$. For more background on multi-type branching processes see Chapters V and VI in [3].

We think of our process as a multi-type branching process in which the type space is continuous. Given $0<z<1$, we define a branching process with type space $[0,z]$. In this branching process, the offspring distribution of an individual of type $x<z$ is given as follows. With probability $1-z$, the individual has no offspring; with probability $z-x$ it has one offspring whose type is uniformly distributed in $[x,z]$; and with probability $x$ it has two offspring, one of type $x$ and one whose type is uniformly distributed in $[0,x]$.
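
In code, the offspring rule reads as follows (a hypothetical sampler of ours, purely to illustrate the three cases):

```python
import random

def offspring(x, z, rng=random):
    """Sample the offspring types of an individual of type x < z."""
    u = rng.random()
    if u < 1 - z:
        return []                            # no offspring
    elif u < 1 - z + (z - x):
        return [x + (z - x) * rng.random()]  # one child, uniform in [x, z]
    else:
        return [x, x * rng.random()]         # two children: x and uniform in [0, x]
```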

We now explain the connection between this branching process and our process. Suppose we start the process with a list $S$ whose minimum is $z$, and we want to study the time until $z$ is removed from the list. We think of each point $x<z$ that is added to the list as an individual of type $x$ in the branching process and ignore points that fall to the right of $z$. We claim that the number of individuals born before the extinction of the branching process described above is exactly the time until $z$ is removed in the minimum process. Indeed, instead of exposing the branching in the usual way according to the generations, we expose the offspring of an individual when it becomes the current minimum of the list. When the element $x$ becomes the minimum, with probability $1-z$ it is removed and $S(z,n)$ decreases by $1$; with probability $z-x$ we remove $x$ from $S(z,n)$ and replace it with a uniform element in $[x,z]$; and with probability $x$ we keep $x$ and add another element to $S(z,n)$ that is uniformly distributed in $[0,x]$.

The branching process described above can be shown to be subcritical when $z<z_0$, critical when $z=z_0$ and supercritical when $z>z_0$, where $z_0=1-1/e$. Now, in the critical case we look for a martingale of the form (5), which translates in the continuous setting to the process

N_{n}:=\sum_{x\in S(z_{0},n)}f(x) (6)

where $f$ is an eigenfunction of the expectation operator with eigenvalue $1$. In order to find $f$ we let $m_n<z_0$ be the minimum of the list at time $n$ and write

\mathbb{E}\big[N_{n+1}-N_{n}\ |\ \mathcal{F}_{n}\big]=\int_{0}^{z_{0}}f(x)\,dx-(1-m_{n})f(m_{n}). (7)

The last expression vanishes for every value of $m_n$ only when $f(x)=1/(1-x)$ (up to a constant multiple), which leads to (4).
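
Indeed, for $f(x)=1/(1-x)$ the two terms in (7) cancel for every position of the minimum, precisely because $z_0=1-1/e$:

\int_{0}^{z_{0}}\frac{dx}{1-x}-(1-m_{n})\cdot\frac{1}{1-m_{n}}=-\log(1-z_{0})-1=-\log(1/e)-1=0.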

2 Proof of Theorem 1

For any integer $0\leq\ell\leq n$ and any $x\in[0,1]$ let $S(x,\ell)$ denote the set of numbers that are in $S$ and lie in $[0,x]$ after the first $\ell$ steps, that is, after $x_1,x_2,\ldots,x_\ell$ have been inserted. Put $s(x,\ell)=|S(x,\ell)|$. Therefore $s(1,n)$ is the size of $S$ at the end of the process. Let $m_i$ denote the value $\min(S(1,i-1))$ just before $x_i$ enters it. Thus $m_1=0$ and $m_2=x_1$.

We start with the following lemma, showing that the expected size of $S$ at the end is linear in $n$.

Lemma 2.1.

For all $n\geq 1$ we have

\mathbb{E}\big[|S(1,n)|\big]\geq n/4.

Proof:  For each $1\leq i\leq n$ let $X_i$ be the indicator random variable whose value is $1$ if $x_i<m_i$ (and $0$ otherwise). Let $Y_i$ be the indicator random variable with value $1$ if $x_i>m_i$ and $m_i\leq 1/2$. Note that $\sum_{i=1}^{n}Y_i$ is at most the number of $x_i$, $0\leq i<n$, whose values are in $[0,1/2]$. Indeed, whenever $Y_i=1$ an element $x_j\leq 1/2$ with $j<i$ leaves the memory set. The expected value of this number is $1+(n-1)/2$, as $x_0=0<1/2$ and any other $x_i$ lies in $[0,1/2]$ with probability $1/2$. Therefore the expectation of $\sum_{i=1}^{n}Y_i$ is at most $(n+1)/2$. We claim that for every $1\leq i\leq n$, the expected value of $2X_i+Y_i$ is at least $1$. Indeed, if $m_i>1/2$ the expected value of $2X_i$ is $2m_i>1$ (and the expectation of $Y_i$ is $0$). If $m_i\leq 1/2$, the expected value of $2X_i$ is $2m_i$, and that of $Y_i$ is $1-m_i$. By linearity of expectation, in this case the expected value of $2X_i+Y_i$ is $2m_i+(1-m_i)=1+m_i\geq 1$. This proves the claim. Using linearity of expectation again we conclude that the expectation of $\sum_{i=1}^{n}(2X_i+Y_i)$ is at least $n$. Since the expectation of $\sum_{i=1}^{n}Y_i$ is at most $(n+1)/2$, it follows that the expectation of $\sum_{i=1}^{n}X_i$ is at least $(n-1)/4$. Note that the size of $S$ at the end is exactly $1+\sum_{i=1}^{n}X_i$, as it has size $1$ at the beginning, its size never decreases, and it increases by $1$ in step $i$ if and only if $X_i=1$. The assertion of the lemma follows. $\Box$

We also need the following simple deterministic statement.

Lemma 2.2.

Let $T_1,T_2$ be arbitrary finite disjoint subsets of $[0,1]$. Let $P=P(n)$ be the process above with the sequence $x_1,x_2,\ldots,x_n\in[0,1]$, starting with $S=T_1$. Let $P'$ denote the process with the same sequence $x_i$, starting with $S=T_1\cup T_2$. For $x\in[0,1]$ and $0\leq\ell\leq n$, let $S(x,\ell)$ denote the set of numbers in $S$ that lie in $[0,x]$ after the first $\ell$ steps in the process $P$. Similarly, let $S'(x,\ell)$ denote the set of numbers in $S$ that lie in $[0,x]$ after the first $\ell$ steps in the process $P'$. Then $S'(x,\ell)$ contains $S(x,\ell)$ and moreover

|S'(x,\ell)|-|S(x,\ell)|\leq|T_{2}|.

Proof:  We first describe the proof for $x=1$. Apply induction on $\ell$. The result is trivial for $\ell=0$. Assuming it holds for $\ell$, consider step number $\ell+1$ in the two processes, when $x=x_{\ell+1}$ enters the memory $S$. If $x$ is smaller than the minimum of both sets $S(1,\ell)$ and $S'(1,\ell)$ then it joins the memory in both processes and the desired properties clearly hold. If it is larger than the minimum in $S(1,\ell)$ then it is also larger than the minimum in $S'(1,\ell)$; it joins the memory in both cases, and both minima leave. It is easy to check that the desired properties hold in this case too. Finally, if $x$ is smaller than the minimum in $S(1,\ell)$ but larger than the minimum in $S'(1,\ell)$ then $x$ joins both memories and the minimum of $S'(1,\ell)$ leaves. In this case $S'$ still contains $S$, but the difference between their sizes decreases by $1$. This also satisfies the desired properties, completing the proof of the induction step for $x=1$. The statement for general $x$ follows from the statement for $x=1$. $\Box$

The above lemma has two useful consequences which are stated and proved in what follows.

Corollary 2.3.

For each fixed $x$ the function $f(\ell)=\mathbb{E}[s(x,\ell)]$ is subadditive.

Proof:  We have to show that for every $p,q$, $f(p+q)\leq f(p)+f(q)$. Consider the process $P$ with the points $x_1,x_2,\ldots,x_p,y_1,y_2,\ldots,y_q$. The process starts with the first $p$ steps; let $T$ be the content of the memory $S$ after these $p$ steps. We can now compare the process $P$ with the points $y_1,y_2,\ldots,y_q$ starting with the empty set $S$ (which is identical to starting with $S=\{0\}$), and the process $P'$ with the same points $y_i$ starting with $S=T$. By Lemma 2.2, throughout the two processes, the number of elements in the process $P'$ that lie in $[0,x]$ is always at most the number of elements in $P$ that lie in $[0,x]$, plus $|T|$. Taking expectations on both sides of this inequality implies the desired subadditivity. $\Box$

Corollary 2.4.

For each $x\in[0,1]$ and any integer $\ell$, the random variable $s(x,\ell)$ is $2$-Lipschitz, that is, its value changes by at most $2$ if we change the value of a single $x_i$. Therefore for any $\lambda>0$ the probability that it deviates from its expectation by more than $\lambda\sqrt{\ell}$ is at most $2e^{-\lambda^{2}/8}$.

Proof:  The effect that changing the value of $x_i$ to $x_i'$ has on the content of $S$ after step $i$ is a change of $x_i$ to $x_i'$ in $S$, and possibly a removal of the minimum element of $S$ before step $i$ in one scenario and not in the other. In any case this means that one process can be converted to the other by adding an element to the memory and then removing one or two elements from it. By Lemma 2.2, for each fixed choice of all other $x_j$ the first modification can only increase the value of $s(x,\ell)$, and can increase it by at most $1$. The second modification can then only decrease this value, decreasing it by at most $2$. This implies that the value changes by at most $2$, that is, the function is $2$-Lipschitz. The concentration inequality thus follows from Azuma's Martingale Inequality, cf. [4, Theorem 1.1]. $\Box$

Proof of Theorem 1:  The symmetric difference of the sets $S(1,n)$ and $S(1,n+1)$ is of size at most two, and therefore we may assume that $n$ is even throughout the proof of the theorem. We may also assume that $n$ is sufficiently large. Without any attempt to optimize the absolute constants, let $y\in[0,1]$ be a real number so that the expected value $\mathbb{E}[s(y,n/2)]$ of the number of elements in the memory $S$ that lie in $[0,y]$ at the end of the process with the first $n/2$ elements $x_i$ is $5\sqrt{n\log n}$. Since $\mathbb{E}[s(0,n/2)]=0$, $\mathbb{E}[s(1,n/2)]>n/8$ by Lemma 2.1, and the function $f(y)=\mathbb{E}[s(y,n/2)]$ is continuous, there is such a value (provided $n/8\geq 5\sqrt{n\log n}$). We claim that the expected number of indices $i$ so that $n/2<i\leq n$ and the minimum in the memory satisfies $m_i>y$ is smaller than $1/n^2$. Indeed, it follows from Lemma 2.2 that the function $\mathbb{E}[s(y,m)]$ is monotone increasing in $m$. Thus, for every $\ell\in(n/2,n]$, $\mathbb{E}[s(y,\ell-1)]\geq 5\sqrt{n\log n}$. By Corollary 2.4 (and using that $\{m_\ell>y\}=\{s(y,\ell-1)=0\}$) this implies that for every fixed $\ell$ in this range the probability that $m_\ell$ is larger than $y$ is at most $2e^{-25n\log n/(8\ell)}<1/n^3$. Therefore, by linearity of expectation, the expected number of steps $i\in(n/2,n]$ in which $m_i>y$ is smaller than $1/n^2$, proving the claim.

By this claim, the expected number of steps $i\in(n/2,n]$ so that $m_i>y$ and $x_i>m_i$ is, of course, also at most $1/n^2$.

On the other hand, by the subadditivity established in Corollary 2.3, the expected value of $s(y,n)$ is at most $10\sqrt{n\log n}$.

We next show that $y$ deviates from $1-1/e$ by $O\big(\frac{\sqrt{\log n}}{\sqrt{n}}\big)$. To this end we compute the expectation of the random variable $Y$ counting the number of indices $i\in(n/2,n]$ for which $x_i\leq y$ and $x_i$ is removed at a step in which the arriving element exceeds $y$. By definition $Y=\sum_{n/2<i\leq n}X_i$, where $X_i$ denotes the indicator random variable whose value is $1$ if $x_i\leq y$ and $x_i$ is removed at a step in which the arriving element is larger than $y$.

For all $n/2<i\leq n$ and $i<j\leq n$, let $\mathcal{A}_{i,j}$ be the event that $x_i$ is removed by $x_j$. Conditioning on $x_i$ and on the event $\mathcal{A}_{i,j}$, the random variable $x_j$ is uniformly distributed on the interval $[x_i,1]$, and therefore on the event $\{x_i\leq y\}$ we have

\mathbb{P}\big(x_{j}\geq y\mid x_{i},\mathcal{A}_{i,j}\big)=\frac{1-y}{1-x_{i}}. (8)

Thus, we obtain

\begin{split}\mathbb{E}[X_{i}]&=\mathbb{E}\bigg[\mathds{1}\{x_{i}\leq y\}\,\mathbb{P}\bigg(\bigcup_{j=i+1}^{n}\mathcal{A}_{i,j}\cap\{x_{j}\geq y\}\ \Big|\ x_{i}\bigg)\bigg]\\ &=\mathbb{E}\bigg[\mathds{1}\{x_{i}\leq y\}\sum_{j=i+1}^{n}\mathbb{P}\big(\mathcal{A}_{i,j}\mid x_{i}\big)\,\mathbb{P}\big(x_{j}\geq y\mid x_{i},\mathcal{A}_{i,j}\big)\bigg]\\ &=\mathbb{E}\Big[\mathds{1}\{x_{i}\leq y\}\,\frac{1-y}{1-x_{i}}\cdot\mathbb{P}\big(x_{i}\text{ is removed before time }n\mid x_{i}\big)\Big].\end{split}

The expected number of $i\in(n/2,n]$ for which $x_i\leq y$ and $x_i$ is not removed is at most $10\sqrt{n\log n}$, as all these elements belong to $S(y,n)$. Thus,

\begin{split}\mathbb{E}[Y]&=\sum_{n/2<i\leq n}\mathbb{E}[X_{i}]=O(\sqrt{n\log n})+\sum_{n/2<i\leq n}\mathbb{E}\Big[\mathds{1}\{x_{i}\leq y\}\frac{1-y}{1-x_{i}}\Big]\\ &=O(\sqrt{n\log n})+\frac{n}{2}\int_{0}^{y}\frac{1-y}{1-x}\,dx=O(\sqrt{n\log n})+\frac{n}{2}(1-y)\log\Big(\frac{1}{1-y}\Big),\end{split}

where in the third equality we used that $x_i$ is uniform in $[0,1]$.

On the other hand, the expected number of steps $j\in(n/2,n]$ in which the element $x_j>y$ arrived and caused the removal of an element $x_i=m_j<y$ for some $i\in(n/2,n]$ is $(n/2)(1-y)-O(\sqrt{n\log n})$. Indeed, the expected number of $j\in(n/2,n]$ with $x_j>y$ is $(n/2)(1-y)$, and almost each such $x_j$ removes an element $m_j$, where the expected number of such $m_j$ that exceed $y$ is $O(1/n^2)$. In some of these steps the removed element may be $x_i$ for some $i\leq n/2$, but the expected number of such indices $i$ is at most the expectation of $s(y,n/2)$, which is only $O(\sqrt{n\log n})$.

It follows that

\frac{n}{2}(1-y)\log\Big(\frac{1}{1-y}\Big)=\frac{n}{2}(1-y)+O(\sqrt{n\log n}).

Dividing by $n(1-y)/2$ (noting that $1-y$ is bounded away from $0$) we conclude that $y=1-1/e+O\big(\frac{\sqrt{\log n}}{\sqrt{n}}\big)$.

Since the expected number of elements $x_i$ for $i\in(n/2,n]$ that fall in the interval

\Big[y-O\Big(\frac{\sqrt{\log n}}{\sqrt{n}}\Big)\,,\,y+O\Big(\frac{\sqrt{\log n}}{\sqrt{n}}\Big)\Big]

is $O(\sqrt{n\log n})$, we conclude that the expected number of elements $x_i$ with $i\in(n/2,n]$ and $x_i>1-1/e$ that leave the memory during the process is $O(\sqrt{n\log n})$. In addition, since $\mathbb{E}[s(y,n)]\leq 10\sqrt{n\log n}$, it follows that the expected number of elements $x_i$ with $i\in(n/2,n]$ and $x_i<1-1/e$ that stay in the memory at the end of the process is also at most $O(\sqrt{n\log n})$.

By splitting the set of all $n$ steps into dyadic intervals we conclude that the expected size of the symmetric difference between the final content of the memory and the set of all elements $x_i$ larger than $z=1-1/e$ is $O(\sqrt{n\log n})$. This clearly also implies that the expected value of $s(1,n)$ deviates from $n/e$ by $O(\sqrt{n\log n})$. Finally, note that by Corollary 2.4, for any positive $\lambda$ the probability that either $s(1,n)$ or the above-mentioned symmetric difference deviates from its expectation by more than $\lambda\sqrt{n}$ is at most $O(e^{-\Omega(\lambda^{2})})$. $\Box$

3 The branching martingale

In this section we prove Theorem 2, Theorem 3 and Theorem 4. The main idea in the proof is the following martingale which we call the branching martingale.

Let $0<z<1$ and let $z_0:=1-1/e$ be the critical point. Recall that $S(z,n)$ is the set of elements in the list at time $n$ that are smaller than $z$. Define the processes

W(z,n):=\sum_{x\in S(z,n)}\frac{1}{1-x}\,,\qquad Z(z,n):=\sum_{k=1}^{n}W(z,k)\cdot\mathds{1}\big\{W(z,k-1)=0\big\},

and let $X(z,n):=W(z,n)-Z(z,n)$. The following claim is the fundamental reason to consider these processes.

Proposition 3.1.

On the event $\{W(z,n)\neq 0\}$ we have that

\mathbb{E}\big[W(z,n+1)-W(z,n)\,|\,\mathcal{F}_{n}\big]=-\log(1-z)-1,

and therefore $X(z_{0},n)$ is a martingale. Moreover, $Z$ is roughly minus the minimum process of $X$. More precisely, we have almost surely

0\leq Z(z,n)+\min_{k\leq n}X(z,k)\leq\frac{1}{1-z}. (9)
Remark 3.2.

Proposition 3.1 essentially explains all the results in this paper; see Theorem 4 for example. Since $X(z_{0},n)$ is a martingale it is a Brownian motion in the limit, and $Z(z_{0},n)$ is minus the minimum process of this Brownian motion. Thus $W(z_{0},n)$, which is closely related to the number of elements in the list that are smaller than $z_0$, is the difference between the Brownian motion and its minimum.

Proof of Proposition 3.1.

On the event $\{W(z,n)\neq 0\}$ we have

W(z,n+1)-W(z,n)=\mathds{1}\{x_{n+1}\leq z\}\frac{1}{1-x_{n+1}}-\mathds{1}\{x_{n+1}\geq m_{n}\}\frac{1}{1-m_{n}}, (10)

where $m_n$ is the minimum of the list at time $n$ and $x_{n+1}$ is the uniform variable that arrives at time $n+1$ in the process. Thus, on the event $\{W(z,n)\neq 0\}$ we have

\mathbb{E}\big[W(z,n+1)-W(z,n)\ \big|\ \mathcal{F}_{n}\big]=\int_{0}^{z}\frac{dx}{1-x}-\int_{m_{n}}^{1}\frac{dx}{1-m_{n}}=-\log(1-z)-1.

It follows that $X(z_{0},n)$ is a martingale. Indeed, on the event $\{W(z_{0},n)\neq 0\}$ we have

\mathbb{E}\big[X(z_{0},n+1)-X(z_{0},n)\ \big|\ \mathcal{F}_{n}\big]=\mathbb{E}\big[W(z_{0},n+1)-W(z_{0},n)\ \big|\ \mathcal{F}_{n}\big]=-\log(1-z_{0})-1=0,

whereas on the event $\{W(z_{0},n)=0\}$ we have $X(z_{0},n+1)=X(z_{0},n)$.

We next turn to prove the second statement of the proposition. For all $k\leq n$ we have

X(z,k)=W(z,k)-Z(z,k)\geq-Z(z,k)\geq-Z(z,n),

and therefore $\min_{k\leq n}X(z,k)\geq-Z(z,n)$. Next, if $W(z,n)=0$ then

\min_{k\leq n}X(z,k)\leq X(z,n)=-Z(z,n).

If $W(z,n)\neq 0$, we let $n'\leq n$ be the last time before $n$ for which $W(z,n')=0$ (assuming it exists). We have

\min_{k\leq n}X(z,k)\leq X(z,n')=-Z(z,n')\leq-Z(z,n)+\frac{1}{1-x_{n'+1}}\leq-Z(z,n)+\frac{1}{1-z}.

Finally, suppose there is no such $n'\leq n$ for which $W(z,n')=0$. Then $Z(z,k)=0$ for all $k\leq n$, and hence $X(z,k)=W(z,k)$ for all $k$, and $Z(z,n)+\min_{k\leq n}X(z,k)=1\leq 1/(1-z)$. This finishes the proof of the proposition. ∎
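
The drift formula in Proposition 3.1 is easy to test numerically; the following sketch (our own, with illustrative parameter choices) tracks the increments of $W(z,k)$ over the steps where $S(z,k)$ is nonempty:

```python
import heapq
import math
import random

def drift_estimate(z, n=200_000, seed=1):
    """Average one-step increment of W(z, k) on the event {W(z, k) != 0}."""
    rng = random.Random(seed)
    S = [0.0]      # min-heap holding the memory
    cnt = 1        # number of memory elements <= z (so cnt > 0 iff W != 0)
    total = steps = 0.0
    for _ in range(n):
        x = rng.random()
        nonempty = cnt > 0
        dW = 0.0
        if x > S[0]:               # the current minimum is removed
            m = heapq.heappop(S)
            if m <= z:
                cnt -= 1
                dW -= 1.0 / (1.0 - m)
        heapq.heappush(S, x)       # the arriving element always joins
        if x <= z:
            cnt += 1
            dW += 1.0 / (1.0 - x)
        if nonempty:
            total += dW
            steps += 1
    return total / steps

z0 = 1 - 1 / math.e
for z in (0.5, z0, 0.7):
    print(z, drift_estimate(z), -math.log(1 - z) - 1)  # empirical vs predicted
```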

In the following section we give some rough bounds on the process that hold with very high probability. We will later use these a priori bounds to obtain more precise estimates and to prove the main theorems.

3.1 First control of the process

Definition 3.1.

We say that an event $\mathcal{A}$ holds with very high probability (WVHP) if there are absolute constants $C,c>0$ such that $\mathbb{P}(\mathcal{A})\geq 1-C\exp(-n^{c})$.

Let $\epsilon:=0.01$ and recall that $z_0:=1-1/e$. Define the events

\mathcal{A}_{1}:=\big\{\forall z<z_{0},\ s(z,n)\leq n^{\epsilon}/(z_{0}-z)\big\}

and

\mathcal{A}_{2}:=\big\{\forall z\geq z_{0},\ s(z,n)\leq(z-z_{0})n+n^{1/2+\epsilon}\big\}.
Lemma 3.3.

The event $\mathcal{A}_{1}\cap\mathcal{A}_{2}$ holds WVHP.

Proof.

Let $z<z_0$ and note that $W(z,n)\geq s(z,n)$. Also note that if $W(z,k-1)=0$ and $W(z,k)\neq 0$, then $W(z,k)=1/(1-x)$ for some $x\leq z<z_0$. Thus, we have that

\big\{s(z,n)\geq n^{\epsilon}/(z_{0}-z)\big\}\subseteq\big\{W(z,n)\geq n^{\epsilon}/(z_{0}-z)\big\}\subseteq\bigcup_{k\leq n}\mathcal{C}_{k}, (11)

where the events $\mathcal{C}_{k}$ are defined by

\mathcal{C}_{k}=\big\{0<W(z,k)\leq e,\ \forall\,k\leq m\leq n,\ W(z,m)\neq 0\ \text{ and }\ W(z,n)\geq n^{\epsilon}/(z_{0}-z)\big\}.

Next, we show that each of these events has negligible probability. To this end, let $k\leq n$ and define

M_{m}:=W(z,m)+(m-k)(\log(1-z)+1),\qquad\tau:=\min\{m\geq k:W(z,m)=0\}.

By Proposition 3.1, the process $M_{m\wedge\tau}$ is a martingale. Moreover, since $\log(1-z)+1\geq z_{0}-z$, we have that

\begin{split}\mathcal{C}_{k}&\subseteq\big\{M_{n\wedge\tau}-M_{k\wedge\tau}\geq n^{\epsilon}/(z_{0}-z)-e+(n-k)(\log(1-z)+1)\big\}\\ &\subseteq\big\{M_{n\wedge\tau}-M_{k\wedge\tau}\geq n^{\epsilon/2}/(z_{0}-z)+(n-k)(z_{0}-z)\big\}.\end{split}

Thus, using that $M_{m\wedge\tau}$ has bounded increments and that $(a+b)^{2}\geq a^{2}+b^{2}$, we get from Azuma's inequality that

\mathbb{P}\big(\mathcal{C}_{k}\big)\leq\exp\Big(-c\,\frac{n^{\epsilon}(z_{0}-z)^{-2}+(n-k)^{2}(z_{0}-z)^{2}}{n-k}\Big)\leq C\exp(-cn^{c}),

where the last inequality clearly holds both when $n-k\leq n^{\epsilon/2}(z_{0}-z)^{-2}$ and when $n-k\geq n^{\epsilon/2}(z_{0}-z)^{-2}$. It follows from (11) that

\mathbb{P}\big(s(z,n)\geq n^{\epsilon}/(z_{0}-z)\big)\leq C\exp(-n^{c}).

In order to obtain this inequality simultaneously for all $z\leq z_{0}$ we first use a union bound to get it simultaneously for all $z\in\{i/n:i\in\mathbb{N},\ i/n\leq z_{0}\}$ and then argue that WVHP $s(z,n)$ does not change by much when $z$ changes by $O(n^{-1})$. The details are omitted. It follows that $\mathbb{P}(\mathcal{A}_{1})\geq 1-C\exp(-n^{c})$.

We turn to bound the probability of $\mathcal{A}_{2}$. To this end, let $z\geq z_{0}$ and $z_{1}:=z_{0}-n^{-1/2}$. The event

\big\{s(z_{1},n)\leq n^{1/2+\epsilon}/2\big\}\cap\big\{\big|\big\{k\leq n:x_{k}\in[z_{1},z]\big\}\big|\leq(z-z_{0})n+n^{1/2+\epsilon}/2\big\}

holds WVHP by the first part of the proof and by Chernoff's inequality for the binomial random variable $\big|\big\{k\leq n:x_{k}\in[z_{1},z]\big\}\big|\sim\mathrm{Bin}(n,z-z_{1})$. On this event we have that $s(z,n)\leq(z-z_{0})n+n^{1/2+\epsilon}$ and therefore, using the discretization argument as above, we obtain $\mathbb{P}(\mathcal{A}_{2})\geq 1-C\exp(-n^{c})$. ∎

The following lemma implies, in particular, that with very high probability the minimum is rarely above $z$ for any $z>z_{0}$. For the statement of the lemma, define the random variable

M_{n}(z):=\big|\big\{k\leq n:m_{k}\geq z\big\}\big|

and the events

\mathcal{B}_{1}:=\big\{\forall z>z_{0},\ M_{n}(z)\leq n^{\epsilon}/(z-z_{0})\big\}

and

\mathcal{B}_{2}:=\big\{\forall z\leq z_{0},\ \big|M_{n}(z)-(\log(1-z)+1)n\big|\leq n^{1/2+\epsilon}\big\}.
Lemma 3.4.

The event $\mathcal{B}_{1}\cap\mathcal{B}_{2}$ holds WVHP.

Proof.

We claim that for any $z<1$ the process

L(z,k):=W(z,k)+(\log(1-z)+1)(k-1)-M_{k-1}(z) (12)

is a martingale. Indeed, if $W(z,k)\neq 0$ then $M_{k}(z)=M_{k-1}(z)$ and therefore by Proposition 3.1 we have $\mathbb{E}[L(z,k+1)-L(z,k)\mid\mathcal{F}_{k}]=0$. Otherwise, using that $M_{k}(z)=M_{k-1}(z)+1$, we have

\mathbb{E}[L(z,k+1)-L(z,k)\mid\mathcal{F}_{k}]=\int_{0}^{z}\frac{dx}{1-x}+\log(1-z)=0.

Moreover, when $z<3/4$ the martingale $L(z,k)$ has bounded increments and therefore by Azuma's inequality we have WVHP

\big|W(z,n+1)+(\log(1-z)+1)n-M_{n}(z)\big|\leq n^{1/2+\epsilon}. (13)

Now, when $z\leq z_{0}$, on the event $\mathcal{A}_{2}$ we have

W(z,n+1)\leq W(z_{0},n+1)\leq e\cdot s(z_{0},n+1)\leq n^{1/2+\epsilon}.

Substituting this back into (13) we obtain that WVHP

\big|M_{n}(z)-(\log(1-z)+1)n\big|\leq n^{1/2+2\epsilon}. (14)

This shows that $\mathcal{B}_{2}$ holds WVHP using the discretization trick. Indeed, WVHP each $z\in[0,z_{0}]$ is the minimum at most $n^{\epsilon}$ times, and therefore WVHP $M_{n}(z)$ does not change by more than $n^{2\epsilon}$ when $z$ changes by $O(n^{-1})$.

We turn to bound the probability of $\mathcal{B}_{1}$. To this end, let $z_{0}\leq z\leq 3/4$ and let $n_{0}:=\lfloor n^{\epsilon}(z-z_{0})^{-2}\rfloor$. Using (13) and the fact that $\log(1-z)+1\leq-e(z-z_{0})$ we obtain that WVHP for all $n_{0}\leq k\leq n$

W(z,k+1)\geq e(z-z_{0})k-|L(z,k)|\geq e(z-z_{0})k-k^{1/2+\epsilon}>0, (15)

where the last inequality follows from the definition of $n_{0}$. Thus, WVHP, for all $n_{0}<k\leq n$ we have that $m_{k}\leq z$ and therefore

M_{n}(z)=M_{n_{0}}(z)\leq M_{n_{0}}(z_{0})\leq n_{0}^{1/2+\epsilon}\leq n^{\epsilon}/(z-z_{0}),

where the second-to-last inequality holds WVHP by (14) with $z=z_{0}$. Of course, the same bound holds when $3/4\leq z\leq 1$ (by slightly changing $\epsilon$) and therefore, using the discretization trick, we get that $\mathcal{B}_{1}$ holds WVHP. ∎

3.2 Proof of Theorem 4

The main step toward proving the theorem is the following proposition, which determines the scaling limit of $X(z,n)$ when $z$ is in a small neighborhood around $z_{0}=1-1/e$.

Proposition 3.5.

We have that

\Big\{\frac{X(z_{0}+yn^{-1/2}\,,\,tn)}{\sqrt{n}}\Big\}_{\substack{t>0\\ y\in\mathbb{R}}}\overset{d}{\longrightarrow}\big\{\sqrt{2}B_{t}+eyt\big\}_{\substack{t>0\\ y\in\mathbb{R}}},\quad n\to\infty.

We start by proving Theorem 4 using Proposition 3.5.

Proof of Theorem 4.

By Proposition 3.5 and (9) we have that the joint distribution

\Big\{\Big(\frac{X(z_{0}+yn^{-1/2}\,,\,tn)}{\sqrt{n}}\ ,\ \frac{Z(z_{0}+yn^{-1/2}\,,\,tn)}{\sqrt{n}}\Big)\Big\}_{\substack{t>0\\ y\in\mathbb{R}}}

converges to $\big(\sqrt{2}B_{t}+eyt\,,\,-\inf_{s\leq t}(\sqrt{2}B_{s}+eys)\big)$. Thus, using that $W(z,n)=X(z,n)+Z(z,n)$, we obtain

\Big\{\frac{W(z_{0}+yn^{-1/2}\,,\,tn)}{\sqrt{n}}\Big\}_{\substack{t>0\\ y\in\mathbb{R}}}\overset{d}{\longrightarrow}\big\{\sqrt{2}B_{t}+eyt-\inf_{s\leq t}\big(\sqrt{2}B_{s}+eys\big)\big\}_{\substack{t>0\\ y\in\mathbb{R}}}. (16)

For the proof of Theorem 4 it suffices to show that $W(z_{0}+yn^{-1/2}\,,\,k)$ in (16) can be replaced by $e\cdot s(z_{0}+yn^{-1/2},k)$. Intuitively, this is because most of the terms in the sum in the definition of $W$ are close to $e$. Formally, let $z_{1}:=z_{0}-n^{-1/4}$ and $z_{2}:=z_{0}+n^{-1/2+\epsilon}$. By Lemma 3.3 we have WVHP that for all $z_{1}\leq z\leq z_{2}$ and $k\leq n^{1+\epsilon}$

\begin{split}\big|W(z,k)-e\cdot s(z,k)\big|&\leq C\,s(z_{1},k)+\sum_{x\in S(z,k)\setminus S(z_{1},k)}\Big|\frac{1}{1-x}-e\Big|\\ &\leq Cn^{1/4+\epsilon}+Cn^{-1/4}s(z_{2},k)\leq n^{1/4+2\epsilon}.\end{split} (17)

This shows that $W$ can be replaced by $e\cdot s$ in (16) and therefore finishes the proof of the theorem. ∎

We turn to prove Proposition 3.5. The proposition follows immediately from the following two propositions.

Proposition 3.6.

We have that

\Big\{\frac{X(z_{0},tn)}{\sqrt{n}}\Big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{2}B_{t}\big\}_{t>0},\quad n\to\infty.
Proposition 3.7.

We have WVHP, for all $z$ with $|z-z_{0}|\leq n^{-1/2+\epsilon}$, that

\big|X(z,n)-X(z_{0},n)-en(z-z_{0})\big|\leq n^{1/4+3\epsilon}.

We start by proving Proposition 3.6. The main tool we use is the following martingale functional central limit theorem [5, Theorem 8.2.8].

Theorem 3.8.

Let $M_{n}$ be a martingale with bounded increments. Recall that the predictable quadratic variation of $M_{n}$ is given by

V_{n}:=\sum_{k=1}^{n}\mathbb{E}\big[\big(M_{k}-M_{k-1}\big)^{2}\ |\ \mathcal{F}_{k-1}\big].

Suppose that for all $t>0$ we have

\frac{V_{tn}}{n}\overset{p}{\longrightarrow}\sigma^{2}t,\quad n\to\infty,

where the convergence is in probability. Then

\Big\{\frac{M_{tn}}{\sqrt{n}}\Big\}_{0<t<1}\overset{d}{\longrightarrow}\big\{\sigma B_{t}\big\}_{0<t<1},

where $B_{t}$ is a standard Brownian motion.

In order to use Theorem 3.8 we need to estimate the predictable quadratic variation of the martingale $X(z_{0},n)$. To this end, we need the following lemma. Let us note that the exponent $3/4$ in the lemma is not tight; however, it suffices for the proof of the main theorems.

Lemma 3.9.

For any bounded function $f:[0,1]\to\mathbb{R}$ that is differentiable on $[0,z_{0}]$ we have

\mathbb{P}\Big(\Big|\sum_{k=1}^{n}f(m_{k})-n\int_{0}^{z_{0}}\frac{f(x)}{1-x}\,dx\Big|\leq n^{3/4+2\epsilon}\Big)\geq 1-C\exp(-n^{c}),

where the constants $C,c$ may depend on the function $f$.

Proof.

Throughout the proof we let the constants $C$ and $c$ depend on the function $f$. For the proof of the lemma we split the interval $[0,z_{0}]$ into $\lfloor n^{1/4}\rfloor$ small intervals $[y_{i-1},y_{i}]$, where $y_{i}:=iz_{0}/\lfloor n^{1/4}\rfloor$. For all $i\leq\lfloor n^{1/4}\rfloor$ we let

J_{i}:=\big\{k\leq n:m_{k}\in[y_{i-1},y_{i}]\big\}. (18)

We have that

\Big|\sum_{k=1}^{n}f(m_{k})-n\int_{0}^{z_{0}}\frac{f(x)}{1-x}\,dx\Big|\leq\sum_{i=1}^{\lfloor n^{1/4}\rfloor}\Big|\sum_{k\in J_{i}}f(m_{k})-n\int_{y_{i-1}}^{y_{i}}\frac{f(x)}{1-x}\,dx\Big|+\sum_{k=1}^{n}f(m_{k})\cdot\mathds{1}\{m_{k}\geq z_{0}\}. (19)

Next, we estimate the size of $J_{i}$. Using Lemma 3.4 we have WVHP

\begin{split}|J_{i}|=M_{n}(y_{i-1})-M_{n}(y_{i})&=n\big(\log(1-y_{i-1})-\log(1-y_{i})\big)+O(n^{1/2+\epsilon})\\ &=\frac{n(y_{i}-y_{i-1})}{1-y_{i}}+O(n^{1/2+\epsilon}),\end{split}

where in the last equality we used the Taylor expansion of the function $\log(1-x)$ around $x=y_{i}$. Thus, using that $f(x)=f(y_{i})+O(n^{-1/4})$ and $f(x)/(1-x)=f(y_{i})/(1-y_{i})+O(n^{-1/4})$ for all $x\in[y_{i-1},y_{i}]$, we obtain

\begin{split}\Big|\sum_{k\in J_{i}}f(m_{k})-n\int_{y_{i-1}}^{y_{i}}\frac{f(x)}{1-x}\,dx\Big|&\leq|f(y_{i})|\cdot\Big||J_{i}|-\frac{n(y_{i}-y_{i-1})}{1-y_{i}}\Big|+O\big(|J_{i}|n^{-1/4}+n^{1/2}\big)\\ &=O(n^{1/2+\epsilon}).\end{split}

Substituting the last estimate back into (19) we obtain

\Big|\sum_{k=1}^{n}f(m_{k})-n\int_{0}^{z_{0}}\frac{f(x)}{1-x}\,dx\Big|\leq O(n^{3/4+\epsilon})+C\,M_{n}(z_{0})=O(n^{3/4+\epsilon}),

where in the last equality we used Lemma 3.4 once again. ∎

We can now prove Proposition 3.6.

Proof of Proposition 3.6.

In order to use Theorem 3.8 we compute the predictable quadratic variation of $X(z_{0},k)$. Clearly, on the event $\{m_{k}\geq z_{0}\}$ we have that $\mathbb{E}\big[(X(z_{0},k+1)-X(z_{0},k))^{2}\ \big|\ \mathcal{F}_{k}\big]=0$. On the event $\{m_{k}\leq z_{0}\}$ we have

\begin{split}\mathbb{E}\big[(X(z_{0},k+1)-X(z_{0},k))^{2}\ \big|\ \mathcal{F}_{k}\big]&=\mathbb{E}\big[(W(z_{0},k+1)-W(z_{0},k))^{2}\ \big|\ \mathcal{F}_{k}\big]\\ &=\int_{0}^{z_{0}}\frac{dx}{(1-x)^{2}}-\frac{2}{1-m_{k}}\int_{m_{k}}^{z_{0}}\frac{dx}{1-x}+\frac{1}{1-m_{k}}\\ &=\frac{1}{1-x}\Big|_{0}^{z_{0}}+\frac{2\log(1-x)}{1-m_{k}}\Big|_{m_{k}}^{z_{0}}+\frac{1}{1-m_{k}}=e-1-\frac{2\log(1-m_{k})+1}{1-m_{k}}.\end{split}

Thus, the predictable quadratic variation of $X(z_{0},k)$ is given by

V_{n}:=\sum_{k=0}^{n-1}\mathbb{E}\big[(X(z_{0},k+1)-X(z_{0},k))^{2}\ \big|\ \mathcal{F}_{k}\big]=\sum_{k=0}^{n-1}f(m_{k}), (20)

where

f(x):=\Big(e-1-\frac{2\log(1-x)+1}{1-x}\Big)\cdot\mathds{1}\{x\leq z_{0}\}.

Next, we have that

\begin{split}\int_{0}^{z_{0}}\frac{f(x)}{1-x}\,dx&=\int_{0}^{z_{0}}\frac{1}{1-x}\Big(e-1-\frac{2\log(1-x)+1}{1-x}\Big)dx\\ &=(1-e)\log(1-x)\Big|_{0}^{z_{0}}-\frac{2\log(1-x)+3}{1-x}\Big|_{0}^{z_{0}}=e-1-e+3=2,\end{split}

and therefore, by Lemma 3.9, we have WVHP that $|V_{n}-2n|\leq Cn^{3/4+3\epsilon}$. It follows that for all $t>0$ we have $V_{tn}/n\to 2t$ in probability, and therefore by Theorem 3.8 we have

\Big\{\frac{X(z_{0},tn)}{\sqrt{n}}\Big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{2}B_{t}\big\}_{t>0}

as needed. ∎
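
Both closed-form integrals in this section (the value $2$ above and the value $3-e$ in the proof of Lemma 3.12 below) can be double-checked symbolically; a minimal sketch, assuming the sympy package is available:

```python
import sympy as sp

x = sp.symbols('x')
z0 = 1 - sp.exp(-1)

# Integrand from the proof of Proposition 3.6; should give 2.
f1 = (sp.E - 1 - (2*sp.log(1 - x) + 1)/(1 - x)) / (1 - x)
print(sp.simplify(sp.integrate(f1, (x, 0, z0))))

# Integrand from the proof of Lemma 3.12; should give 3 - e.
f2 = ((-2*sp.log(1 - x) - 3)/(1 - x) + 2*sp.E - 2) / (1 - x)
print(sp.simplify(sp.integrate(f2, (x, 0, z0))))
```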

We turn to prove Proposition 3.7. The main tool we use is the following martingale concentration result due to Freedman [6]. See also [4, Theorem 1.2].

Theorem 3.10 (Freedman’s inequality).

Let $M_{n}$ be a martingale with increments bounded by $M$ and let

V_{n}:=\sum_{k=1}^{n}\mathbb{E}\big[(M_{k+1}-M_{k})^{2}\ |\ \mathcal{F}_{k}\big]

be the predictable quadratic variation. Then,

\mathbb{P}\big(|M_{n}|\geq x,\ V_{n}\leq y\big)\leq 2\exp\Big(-\frac{x^{2}}{2y+2Mx}\Big).

We can now prove Proposition 3.7.

Proof of Proposition 3.7.

Let $z$ be such that $|z-z_{0}|\leq n^{-1/2+\epsilon}$. It follows from Proposition 3.1 that the process

N(z,k):=X(z,k)+(\log(1-z)+1)(k-1)-(\log(1-z)+1)\cdot M_{k-1}(z)

is a martingale. Next, consider the difference martingale $L(z,k):=N(z,k)-N(z_{0},k)=N(z,k)-X(z_{0},k)$ and note that

\big|L(z,k+1)-L(z,k)\big|\leq C|z-z_{0}|+C\,\mathds{1}\{x_{k+1}\in[z_{0},z]\cup[z,z_{0}]\}+C\,\mathds{1}\{m_{k}\geq z_{0}-n^{-1/2+\epsilon}\}.

Thus, by Lemma 3.4 we have WVHP

V_{n}:=\sum_{k=1}^{n}\mathbb{E}\big[(L(z,k)-L(z,k-1))^{2}\ |\ \mathcal{F}_{k-1}\big]\leq Cn^{1/2+\epsilon}+C\,M_{n}(z_{0}-n^{-1/2+\epsilon})\leq n^{1/2+2\epsilon}.

We obtain using Theorem 3.10 that

\begin{split}\mathbb{P}\big(\big|L(z,n)\big|\geq n^{1/4+2\epsilon}\big)&\leq\mathbb{P}\big(\big|L(z,n)\big|\geq n^{1/4+2\epsilon},\ V_{n}\leq n^{1/2+2\epsilon}\big)+\mathbb{P}\big(V_{n}\geq n^{1/2+2\epsilon}\big)\\ &\leq C\exp(-n^{c}).\end{split}

Finally, using the expansion $\log(1-z)+1=-e(z-z_{0})+O(n^{-1+2\epsilon})$ we get that WVHP

\big|X(z,n)-X(z_{0},n)-en(z-z_{0})\big|\leq\big|L(z,n)\big|+Cn^{2\epsilon}+C\,|z-z_{0}|\cdot M_{n}(z)\leq n^{1/4+3\epsilon}.

This finishes the proof of the proposition using the discretization trick. ∎

3.3 Proof of Theorem 3

The main statement in Theorem 3 is the scaling limit result

\Big\{\Big(\frac{L_{tn}}{\sqrt{n}}\,,\,\frac{R_{tn}}{\sqrt{n}}\Big)\Big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{2}e^{-1}\big(M_{t}-B_{t}\,,\,M_{t}\big)\big\}_{t>0}. (21)

We start by proving (21) and then briefly explain how the other statements in Theorem 3 follow.

By Proposition 3.6 and Proposition 3.1 we have that

\big\{n^{-1/2}\big(W(z_{0},tn)\ ,\ Z(z_{0},tn)\big)\big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{2}\big(B_{t}-\inf_{x<t}B_{x}\ ,\ -\inf_{x<t}B_{x}\big)\big\}_{t>0}. (22)

Moreover, by symmetry

\big\{\sqrt{2}\big(B_{t}-\inf_{x<t}B_{x}\ ,\ -\inf_{x<t}B_{x}\big)\big\}_{t>0}\overset{d}{=}\big\{\sqrt{2}\big(M_{t}-B_{t}\ ,\ M_{t}\big)\big\}_{t>0}. (23)

Next, recall that $L_{n}=s(z_{0},n)$ and therefore, by (17), WVHP

\big|e\cdot L_{n}-W(z_{0},n)\big|\leq n^{1/3}. (24)

The convergence in (21) clearly follows from (22), (23), (24) and the following lemma.

Lemma 3.11.

We have WVHP

\big|e\cdot R_{n}-Z(z_{0},n)\big|\leq n^{1/3}.
Proof.

We write

\big|e\cdot R_{n}-Z(z_{0},n)\big|\leq e\big|R_{n}-R_{n}'\big|+\big|e\cdot R_{n}'-M_{n-1}(z_{0})\big|+\big|M_{n-1}(z_{0})-Z(z_{0},n)\big|, (25)

where $R_{n}':=\sum_{k=1}^{n-1}\mathds{1}\{m_{k}\geq z_{0}\}(1-m_{k})$, and bound each of the terms on the right-hand side of (25).

We start with the first term. Note that $R_{n}=\big|\{2\leq k\leq n:m_{k}\geq m_{k-1}\geq z_{0}\}\big|$ and therefore

\mathbb{E}\big[R_{n+1}-R_{n}\ |\ \mathcal{F}_{n}\big]=\mathds{1}\big\{m_{n}\geq z_{0}\big\}(1-m_{n}).

It follows that $N_{n}:=R_{n}-R_{n}'$ is a martingale. The predictable quadratic variation of $N_{n}$ satisfies

\sum_{k=1}^{n-1}\mathbb{E}\big[(N_{k+1}-N_{k})^{2}\ |\ \mathcal{F}_{k}\big]\leq\sum_{k=1}^{n-1}\mathds{1}\{m_{k-1}\geq z_{0}\}\leq M_{n}(z_{0})\leq n^{1/2+\epsilon},

where the last inequality holds WVHP by Lemma 3.4. Thus, using the same arguments as in the proof of Proposition 3.7 and by Theorem 3.10, we have that $|R_{n}-R_{n}'|\leq n^{1/4+\epsilon}$ WVHP.

We turn to bound the second term on the right-hand side of (25). We have

\begin{split}\big|e\cdot R_{n}'-M_{n-1}(z_{0})\big|&\leq\sum_{k=1}^{n}\mathds{1}\{m_{k}\geq z_{0}\}\big(1-e(1-m_{k})\big)\\ &\leq C\,M_{n}(z_{0})\,n^{-1/4}+M_{n}\big(z_{0}+n^{-1/4}\big)\leq n^{1/4+2\epsilon},\end{split}

where in the second inequality we split the sum according to whether $m_{k}\in[z_{0},z_{0}+n^{-1/4}]$ or $m_{k}\geq z_{0}+n^{-1/4}$, and where the last inequality holds WVHP by Lemma 3.4.

The last term on the right-hand side of (25) is bounded by $n^{1/4+\epsilon}$ WVHP using the same arguments as for the first term. Indeed, $Z(z_{0},n)-M_{n-1}(z_{0})$ is clearly a martingale and its quadratic variation is bounded by $n^{1/2+\epsilon}$ WVHP. ∎

We turn to prove the other results in Theorem 3. By (21) we have that

\Big\{\frac{L_{tn}+R_{tn}}{\sqrt{n}}\Big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{2}e^{-1}(2M_{t}-B_{t})\big\}_{t>0}. (26)

Recall that a three-dimensional Bessel process $X_{t}$ is the distance of a three-dimensional Brownian motion from the origin. In [11] Pitman proved that

\big\{2M_{t}-B_{t}\big\}_{t>0}\overset{d}{=}\{X_{t}\}_{t>0},

and therefore the convergence in (1) follows from (26). Next, using (21) once again we get that

\frac{L_{n}}{\sqrt{n}}\overset{d}{\longrightarrow}\sqrt{2}e^{-1}(M_{1}-B_{1}),\qquad\frac{R_{n}}{\sqrt{n}}\overset{d}{\longrightarrow}\sqrt{2}e^{-1}M_{1}. (27)

It is well known (see [9, Theorem 2.34 and Theorem 2.21]) that $M_{1}-B_{1}\overset{d}{=}M_{1}\overset{d}{=}|N|$. Substituting these identities into (27) finishes the proof of the first two limit laws in (2).

It remains to prove the last limit law in (2). By (1) with $t=1$, it suffices to compute the density of $\sqrt{2}e^{-1}X_{1}$. This is a straightforward computation, as $X_{1}$ is the norm of a three-dimensional standard normal random variable. We omit the details of the computation, which lead to the density given in (3).
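
For the reader's convenience, here is one way to carry out the omitted computation (our own sketch). The norm $X_{1}$ of a standard three-dimensional normal vector has density $\sqrt{2/\pi}\,u^{2}e^{-u^{2}/2}$ on $u>0$, so the change of variables $x=\sqrt{2}e^{-1}u$ yields, for $x>0$,

f(x)=\frac{e}{\sqrt{2}}\cdot\sqrt{\frac{2}{\pi}}\,\Big(\frac{ex}{\sqrt{2}}\Big)^{2}e^{-(ex/\sqrt{2})^{2}/2}=\frac{e^{3}x^{2}}{2\sqrt{\pi}}\,e^{-e^{2}x^{2}/4},

which is exactly (3).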

We note that one can compute the density of $2M_{1}-B_{1}$ without using the result of Pitman [11]. The following elementary argument was given to us by Iosif Pinelis in a MathOverflow answer [10]. By the reflection principle, for all $a>b>0$,

\mathbb{P}\big(M_{1}>a\ ,\ B_{1}<b\big)=\mathbb{P}\big(B_{1}\geq 2a-b\big)=1-F(2a-b). (28)

Equation (28), in which $F$ denotes the standard normal distribution function, determines the joint distribution of $M_{1}$ and $B_{1}$. From here it is straightforward to compute the joint density of $(M_{1},B_{1})$ and the density of $2M_{1}-B_{1}$. Once again, the details of the computation are omitted.
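
For completeness, a brief version of this computation (our own sketch, with $\varphi$ the standard normal density): differentiating (28) shows that $(M_{1},B_{1})$ has joint density $f(a,b)=2(2a-b)\varphi(2a-b)$ for $a>\max(b,0)$. Changing variables to $(a,v)$ with $v=2a-b$ (a map with unit Jacobian) gives, for $v>0$,

f_{2M_{1}-B_{1}}(v)=\int_{0}^{v}2v\,\varphi(v)\,da=\frac{2v^{2}}{\sqrt{2\pi}}\,e^{-v^{2}/2},

matching the density mentioned in the acknowledgment below.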

3.4 Proof of Theorem 2

Define $E_{n}(z):=\big|\{k\leq n:x_{k}\geq z\}\big|$ and consider the martingale

N_{n}:=X(z_{0},n)+eE_{n}(z_{0})-n.

Theorem 2 clearly follows from the following two lemmas.

Lemma 3.12.

We have that

\Big\{\frac{N_{tn}}{\sqrt{n}}\Big\}_{t>0}\overset{d}{\longrightarrow}\big\{\sqrt{3-e}\,B_{t}\big\}_{t>0}.
Lemma 3.13.

We have WVHP

\big|s(1,n)-n/e-N_{n}/e\big|\leq n^{1/3+\epsilon}.

We start by proving Lemma 3.12.

Proof of Lemma 3.12.

On the event $\{m_{k}\leq z_{0}\}$ we have that

N_{k+1}-N_{k}=\mathds{1}\{x_{k+1}\leq z_{0}\}\frac{1}{1-x_{k+1}}-\mathds{1}\{x_{k+1}\geq m_{k}\}\frac{1}{1-m_{k}}+e\cdot\mathds{1}\{x_{k+1}\geq z_{0}\}-1.

Thus, expanding the $10$ terms in $(N_{k+1}-N_{k})^{2}$, we get that on this event

\begin{split}\mathbb{E}\big[(N_{k+1}-N_{k})^{2}\ |\ \mathcal{F}_{k}\big]&=\int_{0}^{z_{0}}\frac{dx}{(1-x)^{2}}-\frac{1}{1-m_{k}}\int_{m_{k}}^{z_{0}}\frac{2\,dx}{1-x}-\int_{0}^{z_{0}}\frac{2\,dx}{1-x}+\frac{1}{1-m_{k}}-\frac{2}{1-m_{k}}+2+e-2+1\\ &=\frac{1}{1-x}\Big|_{0}^{z_{0}}+\frac{2\log(1-x)}{1-m_{k}}\Big|_{m_{k}}^{z_{0}}+2\log(1-x)\Big|_{0}^{z_{0}}-\frac{1}{1-m_{k}}+1+e\\ &=e-1-\frac{2}{1-m_{k}}-\frac{2\log(1-m_{k})}{1-m_{k}}-2-\frac{1}{1-m_{k}}+1+e=\frac{-2\log(1-m_{k})-3}{1-m_{k}}+2e-2.\end{split}

Thus, we can write $\mathbb{E}\big[(N_{k+1}-N_{k})^{2}\ |\ \mathcal{F}_{k}\big]=f(m_{k})$, where $f$ is given by the right-hand side of the last equation when $m_{k}\leq z_{0}$ (the steps in which $m_{k}>z_{0}$ contribute at most $C\,M_{n}(z_{0})\leq Cn^{1/2+\epsilon}$ WVHP, by Lemma 3.4, and are absorbed in the error term below). Moreover, we have that

\int_{0}^{z_{0}}\frac{f(x)}{1-x}\,dx=\frac{-2\log(1-x)-5}{1-x}\Big|_{0}^{z_{0}}-(2e-2)\log(1-x)\Big|_{0}^{z_{0}}=-3e+5+2e-2=3-e.

Thus, by Lemma 3.9 we have with very high probability that

V_{n}:=\sum_{k=1}^{n-1}\mathbb{E}\big[(N_{k+1}-N_{k})^{2}\ |\ \mathcal{F}_{k}\big]=\sum_{k=1}^{n-1}f(m_{k})+O(n^{1/2+\epsilon})=(3-e)n+O(n^{5/6}).

This finishes the proof of the lemma using Theorem 3.8. ∎

We turn to prove Lemma 3.13.

Proof of Lemma 3.13.

Recall that $E_{n}(z):=\big|\big\{k\leq n:x_{k}\geq z\big\}\big|$ and let $z_{1}:=z_{0}+n^{-1/3}$. By Proposition 3.1 the process

N_{k}':=X(z_{1},k)+(\log(1-z_{1})+1)(k-1)-(\log(1-z_{1})+1)\cdot M_{k-1}(z_{1})+eE_{k}(z_{1})-e(1-z_{1})k

is a martingale. It is straightforward to check that

e\cdot s(1,n)-n-N_{n}=A_{1}+A_{2}+A_{3}+A_{4}+A_{5}+A_{6},

where

\begin{split}&A_{1}:=e\big(s(1,n)-s(z_{1},n)-E_{n}(z_{1})\big),\qquad A_{2}:=e\cdot s(z_{1},n)-W(z_{1},n),\\ &A_{3}:=Z(z_{1},n),\qquad A_{4}:=N_{n}'-N_{n},\qquad A_{5}:=e(z_{0}-z_{1})n-(\log(1-z_{1})+1)(n-1)\\ &\text{and}\qquad A_{6}:=(\log(1-z_{1})+1)M_{n-1}(z_{1}).\end{split}

Next, we bound each of the $A_{i}$ WVHP. Any uniform point $x\geq z_{1}$ that arrived before time $n$ such that $x\notin S(1,n)$ can be mapped to the time $k\leq n$ at which it was removed. Thus, using also Lemma 3.4, we obtain that WVHP

0\leq E_{n}(z_{1})-\big(s(1,n)-s(z_{1},n)\big)\leq M_{n}(z_{1})\leq n^{1/3+\epsilon}.

This shows that WVHP $|A_{1}|\leq e\,n^{1/3+\epsilon}$.

Using the same arguments as in (17) we get that $|A_{2}|\leq n^{1/3+2\epsilon}$.

By Lemma 3.4 we have WVHP $|A_{3}|\leq 3M_{n}(z_{1})\leq 3n^{1/3+\epsilon}$.

In order to bound $A_{4}$ we use the same arguments as in the proof of Proposition 3.7. Define the martingale $L_{k}:=N_{k}'-N_{k}$. We have that

\big|L_{k+1}-L_{k}\big|\leq Cn^{-1/3}+C\,\mathds{1}\{x_{k+1}\in[z_{0},z_{1}]\}+C\,\mathds{1}\{m_{k}\geq z_{0}\}.

Thus, using Theorem 3.10 we obtain that WVHP $|N_{n}-N_{n}'|=|L_{n}|\leq n^{1/3+3\epsilon}$.

It follows from a second-order Taylor expansion of $\log(1-z_{1})$ around $z_{0}$ that $|A_{5}|\leq Cn^{1/3}$.

Lastly, by Lemma 3.4 we have WVHP $|A_{6}|\leq Cn^{-1/3}M_{n}(z_{1})\leq Cn^{\epsilon}$. Combining the bounds on $A_{1},\ldots,A_{6}$ (and slightly decreasing $\epsilon$) finishes the proof of the lemma. ∎

Acknowledgment:  We thank Ehud Friedgut and Misha Tsodyks for helpful comments, Ron Peled, Sahar Diskin and Jonathan Zung for fruitful discussions, and Iosif Pinelis for proving in [10] that the density function of $2M_{1}-B_{1}$ is given by $\frac{2x^{2}}{\sqrt{2\pi}}e^{-x^{2}/2}$.

References

  • [1] N. Alon, C. Defant and N. Kravitz, The runsort permuton, arXiv:2106.14762, 2021.
  • [2] N. Alon and J. H. Spencer, The Probabilistic Method, Fourth Edition, Wiley, 2016, xiv+375 pp.
  • [3] K. B. Athreya and P. E. Ney, Branching processes, Courier Corporation, 2004.
  • [4] B. Bercu and A. Touati, Exponential inequalities for self-normalized martingales with applications, The Annals of Applied Probability, 18(5), 1848–1869, 2008.
  • [5] R. Durrett, Probability: theory and examples, volume 49, Cambridge University Press, 2019.
  • [6] D. A. Freedman, On tail probabilities for martingales, The Annals of Probability, 3(1), 100–118, 1975.
  • [7] E. Friedgut and G. Kozma, Private communication.
  • [8] A. Georgiou, M. Katkov and M. Tsodyks, Retroactive interference model of forgetting, Journal of Mathematical Neuroscience (2021) 11:4.
  • [9] P. Mörters and Y. Peres, Brownian motion, volume 30, Cambridge University Press, 2010.
  • [10] I. Pinelis, What is the distribution of $2M_{1}-B_{1}$, where $M_{t}$ is the maximum process of the Brownian motion $B_{t}$? MathOverflow question 409729.
  • [11] J. W. Pitman, One-dimensional Brownian motion and the three-dimensional Bessel process, Advances in Applied Probability, 7(3), 511–526, 1975.
  • [12] M. Tsodyks, Private communication.