
A Mass Transport Proof of the Ergodic Theorem

Calvin Wooyoung Chin
Abstract.

It is known that a gambler who repeats a game with positive expected value has a positive probability of never going broke. We use the mass transport method to prove the generalization of this fact in which the gains from the bets form a stationary, rather than an i.i.d., sequence. Birkhoff's ergodic theorem follows from this by a standard argument.

Let $X_1, X_2, \ldots$ be (real-valued) random variables, and write

\[ S_n := X_1 + \cdots + X_n \qquad \text{for all $n \in \mathbf{N}$.} \]

In this note, we write $\mathbf{N} := \{1, 2, 3, \ldots\}$.

It is known [Chi22] that a gambler who repeats a game with positive expected value has a positive chance of never going broke. More precisely, if $X_1, X_2, \ldots$ are i.i.d. and $\mathbf{E} X_1 > 0$, then

\[ \mathbf{P}(S_n > 0 \text{ for all } n \in \mathbf{N}) > 0. \]

In this note, we show that this holds whenever $X_1, X_2, \ldots$ are stationary and $\mathbf{E} X_1 > 0$ (Lemma 1) using the "mass transport" method. Although no knowledge of the method will be needed, one might want to take a look at the short paper [Häg99] to get a flavor of mass transport.
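As a sanity check in the i.i.d. case, here is a quick Monte Carlo sketch. The walk with steps $+1$ w.p. $0.7$ and $-1$ w.p. $0.3$ is our own illustrative example, not taken from [Chi22]; for it, gambler's-ruin theory gives $\mathbf{P}(S_n > 0 \text{ for all } n) = 0.7 \cdot (1 - 0.3/0.7) = 0.4$, and we estimate the (slightly larger) probability of staying positive up to a finite horizon.

```python
import random

def estimate_survival(p=0.7, horizon=2000, trials=5000, seed=0):
    """Fraction of simulated walks with S_n > 0 for all n <= horizon."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        s = 0
        for _ in range(horizon):
            s += 1 if rng.random() < p else -1
            if s <= 0:          # the gambler goes broke
                break
        else:
            survived += 1
    return survived / trials

est = estimate_survival()
assert 0.35 < est < 0.45        # exact never-go-broke probability is 0.4
```

The truncation at a finite horizon biases the estimate slightly upward, but for a walk with positive drift the bias is negligible at this horizon.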

Birkhoff’s ergodic theorem (Theorem 3) then follows by a standard argument. We include the derivation of the ergodic theorem for completeness.

Lemma 1.

If the sequence $X_1, X_2, \ldots$ is stationary and $\mathbf{E} X_1 > 0$, then

\[ \mathbf{P}(S_n > 0 \text{ for all } n \in \mathbf{N}) > 0. \]

Proof.

By Kolmogorov's extension theorem, we may assume without loss of generality the existence of a doubly-infinite stationary sequence

\[ \ldots, X_{-2}, X_{-1}, X_0, X_1, X_2, \ldots. \]

Define $S_n$ for every $n \in \mathbf{Z}$ so that $S_0 = 0$ and $S_n - S_{n-1} = X_n$ for all $n \in \mathbf{Z}$.

Let $n \in \mathbf{Z}$. We call $m \in \mathbf{Z}$ a record after $n$ if

\[ m > n \qquad\text{and}\qquad S_m = \min\{S_{n+1}, S_{n+2}, \ldots, S_m\}. \]

Notice that $n+1$ is always a record after $n$. We now introduce the "mass" $M(n, m)$ that $n$ sends to each $m \in \mathbf{Z}$ as follows. If $X_{n+1} \leq 0$, then $M(n, m) := 0$ for all $m \in \mathbf{Z}$. If $X_{n+1} > 0$, then let $n+1 = n_0 < n_1 < n_2 < \cdots$ (this might be finite in length) be the enumeration of the records after $n$, and let

(1) \[ M(n, n_j) := \max\{S_{n_{j-1}}, S_n\} - \max\{S_{n_j}, S_n\} \qquad \text{for each $j \in \mathbf{N}$.} \]

For all $m \in \mathbf{Z} \setminus \{n_1, n_2, \ldots\}$, let $M(n, m) := 0$. See Figure 1 for a visualization of the mass transport.

Figure 1. The records after $n$ (filled dots) and the mass that $n$ sends to them. In this particular case, $n$ sends out $a + b + c$ amount of mass, which equals $X_{n+1}$.
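The construction above is easy to make concrete. The following sketch (our own illustration, not part of the paper) computes the records after $n$ and the masses of (1) on a finite path, and checks the telescoping identity behind Figure 1: when the path eventually dips below $S_n$, the total mass sent by $n$ equals $X_{n+1}$.

```python
def records_after(S, n):
    """Indices m > n with S[m] == min(S[n+1], ..., S[m])."""
    recs, running_min = [], float("inf")
    for m in range(n + 1, len(S)):
        running_min = min(running_min, S[m])
        if S[m] == running_min:
            recs.append(m)
    return recs

def masses(S, n):
    """The masses M(n, m) of (1); returns {m: mass} for m = n_1, n_2, ...."""
    if S[n + 1] - S[n] <= 0:          # X_{n+1} <= 0: n sends no mass
        return {}
    recs = records_after(S, n)        # n_0 = n + 1 is always the first record
    return {recs[j]: max(S[recs[j - 1]], S[n]) - max(S[recs[j]], S[n])
            for j in range(1, len(recs))}

# A path with X_1 = 0.7 > 0 that later dips below S_0 = 0.
X = [0.7, 0.2, -0.4, 0.3, -1.1, 0.5, -0.6]
S = [0.0]
for x in X:
    S.append(S[-1] + x)

total = sum(masses(S, 0).values())
assert abs(total - X[0]) < 1e-9      # total mass sent by 0 equals X_1
```

Telescoping makes this automatic: the sum of the masses collapses to $\max\{S_{n_0}, S_n\}$ minus the limiting term, which vanishes once a record drops to $S_n$ or below.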

Since $\ldots, X_{-1}, X_0, X_1, \ldots$ is stationary and $M(n, m) \geq 0$ for all $n, m \in \mathbf{Z}$, we have

(2) \[ \mathbf{E}\biggl[\sum_{n=1}^{\infty} M(0, n)\biggr] = \sum_{n=1}^{\infty} \mathbf{E}[M(0, n)] = \sum_{n=1}^{\infty} \mathbf{E}[M(-n, 0)] = \mathbf{E}\biggl[\sum_{n=1}^{\infty} M(-n, 0)\biggr]. \]

This simple equality is at the heart of the mass transport method. Suppose that $\mathbf{P}(S_n \leq 0 \text{ for some } n \in \mathbf{N}) = 1$. We will derive a contradiction by evaluating each side of (2).

Assume that $X_1 > 0$, and let $\tau := \inf\{n \in \mathbf{N} : S_n \leq 0\}$. Notice that $\tau < \infty$ a.s. Let $1 = n_0 < n_1 < \cdots < n_t = \tau$ be the records after $0$ up to $\tau$. Since every record after $\tau$ receives zero mass from $0$ (both maxima in (1) vanish for it), we have

\[ \sum_{n=1}^{\infty} M(0, n) = \sum_{j=1}^{t} M(0, n_j) = \sum_{j=1}^{t-1} (S_{n_{j-1}} - S_{n_j}) + (S_{n_{t-1}} - S_0) = X_1. \]

Thus, the left side of (2) equals $\mathbf{E}[X_1; X_1 > 0]$.

Let us examine the sum on the right side of (2). See Figure 2 for a visualization. If $X_0 > 0$, then $0$ is not a record after any number below $-1$ (and $0$, being the record $n_0$ after $-1$, receives no mass from $-1$), and thus the sum is $0$. Assume $X_0 \leq 0$, and let $-1 = m_0 > m_1 > m_2 > \cdots$ (which might be finite in length) be the enumeration of the numbers $m < 0$ such that

\[ S_m < \min\{S_{m+1}, \ldots, S_{-1}\}. \]

Let $m < 0$ and let us compute $M(m, 0)$. First assume that $m = m_j$ for some $j \in \mathbf{N}$, and consider the cases (a) $S_{m_{j-1}} < 0$ and (b) $S_{m_{j-1}} \geq 0$. If (a) is the case, then $0$ is not a record after $m_j$, and thus $M(m, 0) = 0$. Assume that (b) is the case. By the definition of $m_0, m_1, \ldots$, we have $S_n \geq S_{m_{j-1}}$ for all $n = m_j + 1, \ldots, m_{j-1}$. This implies that $m_{j-1}$ is a record after $m_j$. Since

\[ 0 \leq S_{m_{j-1}} < \min\{S_{m_{j-1}+1}, \ldots, S_{-1}\}, \]

we see that $m_{j-1}$ and $0$ are consecutive records after $m_j$. Thus,

\[ M(m, 0) = S_{m_{j-1}} - \max\{0, S_{m_j}\}. \]

Combining (a) and (b) yields

\[ M(m, 0) = \max\{S_{m_{j-1}}, 0\} - \max\{S_{m_j}, 0\}. \]

If $m \neq m_j$ for all $j \geq 0$, and $k \geq 0$ is the largest number such that $m < m_k$, then $S_m \geq S_{m_k}$. Since $m_k$ is a record after $m$, the definition of $M$ tells us that $M(m, 0) = 0$; the two maxima in (1) are both $S_m$ even if $0$ is a record after $m$. We now know what $M(m, 0)$ is for all $m < 0$, and this yields

\[ \begin{split} \mathbf{E}\biggl[\sum_{n=1}^{\infty} M(-n, 0)\biggr] &= \mathbf{E}\biggl[\sum_{j \geq 1} \bigl(\max\{S_{m_{j-1}}, 0\} - \max\{S_{m_j}, 0\}\bigr); X_0 \leq 0\biggr] \\ &\leq \mathbf{E}[-X_0; X_0 \leq 0]. \end{split} \]
Figure 2. The mass $0$ receives when $X_0 \leq 0$. In this particular case, $0$ receives $a + b + c = -X_0$, but it might receive less if $\inf_{n<0} S_n > 0$.
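Dually, one can make the bound of Figure 2 concrete. The sketch below (our own illustration, not part of the paper) computes $M(n, 0)$ directly from (1) on a finite history with $S_0 = 0$ and $X_0 \leq 0$; since this particular history dips below $0$, the mass received by $0$ is exactly $-X_0$.

```python
def M(S, n, m):
    """M(n, m) per (1), for a finite path S (list indices play the role of times)."""
    if S[n + 1] - S[n] <= 0:              # X_{n+1} <= 0: n sends no mass
        return 0.0
    recs, rmin = [], float("inf")         # records after n: n_0, n_1, ...
    for k in range(n + 1, len(S)):
        rmin = min(rmin, S[k])
        if S[k] == rmin:
            recs.append(k)
    for j in range(1, len(recs)):         # mass goes only to n_1, n_2, ...
        if recs[j] == m:
            return max(S[recs[j - 1]], S[n]) - max(S[m], S[n])
    return 0.0

# S_{-6}, ..., S_{-1}, S_0 = 0 (list index i stands for time i - 6);
# the history dips below 0 at time -6.
S = [-0.2, 0.7, 0.3, 1.0, 0.5, 0.8, 0.0]
X0 = S[6] - S[5]                          # X_0 = -0.8 <= 0
received = sum(M(S, n, 6) for n in range(6))
assert abs(received - (-X0)) < 1e-9       # here 0 receives exactly -X_0
```

Replacing $S_{-6} = -0.2$ with a positive value makes $\inf_{n<0} S_n > 0$, and the received mass drops strictly below $-X_0$, matching the caption of Figure 2.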

Equation (2) now gives

\[ \mathbf{E}[X_1; X_1 > 0] \leq \mathbf{E}[-X_0; X_0 \leq 0], \]

which implies $\mathbf{E} X_1 \leq 0$ because $X_0$ and $X_1$ have the same distribution; this is a contradiction. ∎

Remark 2.

Our argument actually proves the maximal ergodic theorem [Bil12, Theorem 24.2], which says that if $X_1, X_2, \ldots$ is a stationary sequence with $\mathbf{E}|X_1| < \infty$, then

\[ \mathbf{E}[X_1; S_n \leq 0 \text{ for some } n \in \mathbf{N}] \leq 0. \]

Indeed, the left side of (2) is bounded below by

\[ \mathbf{E}[X_1; X_1 > 0 \text{ and } S_n \leq 0 \text{ for some } n \in \mathbf{N}], \]

while the right side of (2) is bounded above by $\mathbf{E}[-X_0; X_0 \leq 0]$. Thus,

\[ \mathbf{E}[X_1; X_1 > 0 \text{ and } S_n \leq 0 \text{ for some } n \in \mathbf{N}] \leq \mathbf{E}[-X_1; X_1 \leq 0], \]

and therefore, since $\{X_1 \leq 0\} \subseteq \{S_1 \leq 0\}$, we get $\mathbf{E}[X_1; S_n \leq 0 \text{ for some } n \in \mathbf{N}] \leq 0$.

There is a short proof of the maximal ergodic theorem which, however, lacks intuition; see [Bil12, Theorem 24.2], for example. Our proof is an attempt to remedy this problem by utilizing the intuitive principle of mass transport.

We now prove Birkhoff's ergodic theorem. Let $(\Omega, \mathcal{F}, \mathbf{P})$ be the underlying probability space, and let $T \colon \Omega \to \Omega$ be a measurable map that is measure-preserving in the sense that $\mathbf{P}(T^{-1}A) = \mathbf{P}(A)$ for all $A \in \mathcal{F}$. An event $A$ is invariant under $T$ if $T^{-1}A = A$, and we denote the $\sigma$-field of all invariant events by $\mathcal{I}$.

Theorem 3 (Birkhoff’s ergodic theorem).

Let $X_1$ be a random variable with finite mean, and write $X_n := X_1 \circ T^{n-1}$ for $n = 2, 3, \ldots$. If $S_n := X_1 + \cdots + X_n$ for all $n \in \mathbf{N}$, then

\[ S_n/n \to \mathbf{E}[X_1 \mid \mathcal{I}] \qquad \text{a.s.} \]

Proof.

Since $\mathbf{E}[X_1 \mid \mathcal{I}] \circ T = \mathbf{E}[X_1 \mid \mathcal{I}]$, we may assume that $\mathbf{E}[X_1 \mid \mathcal{I}] = 0$. Let $\epsilon > 0$ and $A_\epsilon := \{\liminf_{n \to \infty} S_n/n < -\epsilon\}$. On the event $A_\epsilon$, we have $S_n + n\epsilon < 0$ for some $n \in \mathbf{N}$. Thus,

(3) \[ \mathbf{P}((S_n + n\epsilon)1_{A_\epsilon} > 0 \text{ for all } n \in \mathbf{N}) = 0. \]

Since $A_\epsilon \in \mathcal{I}$, we have

\[ ((X_1 + \epsilon)1_{A_\epsilon}) \circ T^{n-1} = (X_n + \epsilon)1_{A_\epsilon} \qquad \text{for all $n \in \mathbf{N}$.} \]

As $((X_n + \epsilon)1_{A_\epsilon})_{n \in \mathbf{N}}$ is a stationary sequence, (3) and Lemma 1 imply

\[ \mathbf{E}[(X_1 + \epsilon)1_{A_\epsilon}] \leq 0. \]

Since $\mathbf{E}[X_1 \mid \mathcal{I}] = 0$, we have

\[ \mathbf{E}[(X_1 + \epsilon)1_{A_\epsilon}] = \epsilon \mathbf{P}(A_\epsilon), \]

and thus $\mathbf{P}(A_\epsilon) = 0$. As $\epsilon$ is arbitrary, we have $\liminf_{n \to \infty} S_n/n \geq 0$ a.s. Applying this result to $-X_1$ in place of $X_1$ gives $\limsup_{n \to \infty} S_n/n \leq 0$. Therefore, we have $S_n/n \to 0$ a.s. ∎
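To see what the conditional expectation $\mathbf{E}[X_1 \mid \mathcal{I}]$ buys over the plain mean, consider a mixture of two i.i.d. Gaussian sequences, a standard stationary but non-ergodic example (the component means $\pm 1$ below are our own choice, not from the paper): conditionally on the mixing coin, $S_n/n$ converges to the mean of the chosen component, not to the overall mean $\mathbf{E} X_1 = 0$.

```python
import random

def running_average(mean, n, seed):
    """S_n / n for n i.i.d. Gaussian(mean, 1) steps."""
    rng = random.Random(seed)
    return sum(rng.gauss(mean, 1.0) for _ in range(n)) / n

# Conditionally on the coin, the sequence is i.i.d. with mean +1 or -1,
# so E[X_1 | I] = +1 or -1 while the unconditional mean is 0.
avg_plus = running_average(+1.0, 200_000, seed=1)
avg_minus = running_average(-1.0, 200_000, seed=2)
assert abs(avg_plus - 1.0) < 0.02 and abs(avg_minus + 1.0) < 0.02
```

When the process is ergodic, $\mathcal{I}$ is trivial and the limit reduces to the constant $\mathbf{E} X_1$, recovering the strong law of large numbers in the i.i.d. case.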

References

  • [Bil12] Patrick Billingsley. Probability and Measure. Wiley Series in Probability and Statistics. John Wiley & Sons, Hoboken, NJ, anniversary edition, 2012.
  • [Chi22] Calvin Wooyoung Chin. A gambler that bets forever and the strong law of large numbers. Amer. Math. Monthly, 129(2):183–185, 2022.
  • [Häg99] Olle Häggström. Invariant percolation on trees and the mass-transport method. 1999.