
Fluctuation bounds for ergodic averages of amenable groups

Andrew Warren
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
USA
[email protected]
Abstract.

We study fluctuations of ergodic averages generated by actions of amenable groups. In the setting of an abstract ergodic theorem for locally compact second countable amenable groups acting on uniformly convex Banach spaces, we deduce a highly uniform bound on the number of fluctuations of the ergodic average for a class of Følner sequences satisfying an analogue of Lindenstrauss’s temperedness condition. Equivalently, we deduce a uniform bound on the number of fluctuations over long distances for arbitrary Følner sequences. As a corollary, these results imply associated bounds for a continuous action of an amenable group on a $\sigma$-finite $L^{p}$ space with $p\in(1,\infty)$.

In this article, we consider a problem at the interface of effective ergodic theory and the ergodic theory of group actions.

By effective ergodic theory, we mean the following programme: insofar as ergodic theory provides theorems which tell us about the long-term behaviour of dynamical systems, we may ask for more quantitative, or computationally explicit, analogues of those theorems. For instance, the mean ergodic theorem of von Neumann asserts that, whenever $(S,\mu)$ is a $\sigma$-finite measure space, $T:S\rightarrow S$ is a measure-preserving transformation, and $f\in L^{2}(S,\mu)$, the sequence of ergodic averages $A_{n}f:=\frac{1}{n}\sum_{i=0}^{n-1}f\circ T^{-i}$ converges in $L^{2}$ norm; but how fast does this convergence occur?
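To get a feel for this question, the following sketch evaluates the norm of the ergodic average in one explicitly computable case. The example is an illustration chosen here, not taken from the article: the rotation $T(s)=s+\alpha \bmod 1$ on $[0,1)$ with Lebesgue measure, and $f(s)=e^{2\pi i s}$. Assuming the averaging convention $A_{n}f=\frac{1}{n}\sum_{k=0}^{n-1}f\circ T^{-k}$, one has $f\circ T^{-k}=e^{-2\pi i k\alpha}f$ and $\|f\|_{2}=1$, so that $\|A_{n}f\|_{2}=\left|\frac{1}{n}\sum_{k=0}^{n-1}e^{-2\pi i k\alpha}\right|$. The function name `rotation_average_norm` is hypothetical.

```python
import cmath


def rotation_average_norm(alpha: float, n: int) -> float:
    """L^2 norm of A_n f for f(s) = exp(2*pi*i*s) under rotation by alpha.

    Assumption (not from the article): A_n f = (1/n) * sum_{k=0}^{n-1} f o T^{-k},
    with T(s) = s + alpha mod 1. Since f o T^{-k} = exp(-2*pi*i*k*alpha) * f and
    f has norm 1, the norm of A_n f is the modulus of an averaged geometric sum.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    total = sum(cmath.exp(-2j * cmath.pi * k * alpha) for k in range(n))
    return abs(total) / n


if __name__ == "__main__":
    # For a fixed n, the norm is close to 1 when alpha is close to 0,
    # and small when alpha is far from the integers.
    for alpha in (0.001, 0.01, 2 ** 0.5 - 1):
        print(alpha, [round(rotation_average_norm(alpha, n), 4) for n in (10, 100, 1000)])
```

In this toy family the averages tend to $0$ for every irrational $\alpha$, but the number of steps needed to get below a given threshold grows as $\alpha$ approaches $0$. This only illustrates why rate information is delicate; it is not the construction behind the negative results cited in the article.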

Effective ergodic theory is complicated by some rather general negative results. Indeed, it is known [19] that in the aforementioned setting of the mean ergodic theorem, there is no uniform rate of convergence of the averages $A_{n}f$ in norm if one considers arbitrary functions $f$ in $L^{2}(S,\mu)$. Consequently, if we are to find a quantitative analogue of the mean ergodic theorem, viz. one which is more explicit about the nature of the convergence of $A_{n}f$, it is necessary to look for more subtle convergence data than a uniform rate of convergence. One such form of convergence data is the number of $\varepsilon$-fluctuations of a sequence for each $\varepsilon>0$, or possibly $\varepsilon$-fluctuations satisfying some side condition, such as the fluctuations being “over long distances”. This is exactly the type of convergence data that we investigate here; a precise explanation of these terms is given in Definition 1.

On the other hand, the ergodic theory of group actions seeks to modify the machinery of classical ergodic theory, by replacing the action of a single measure-preserving transformation with a measure-preserving action by a group. In fact, since the action by a single invertible measure-preserving transformation can be identified with a measure-preserving action of $\mathbb{Z}$, many theorems in the ergodic theory of group actions contain results from classical ergodic theory as special cases. In particular, this is typically the case where one is interested in the ergodic theory of a class of group actions where the groups involved belong to a class of groups containing $\mathbb{Z}$, such as abelian groups, nilpotent groups, or, in the case of the present article, amenable groups (see Definition 12 below).

Consider, therefore, the following version of the mean ergodic theorem for actions of amenable groups:

Theorem.

Let $L^{p}(S,\mu)$ be such that either $S$ is $\sigma$-finite and $1<p<\infty$, or $\mu(S)<\infty$ and $p=1$, and let $x\in L^{p}(S,\mu)$. Let $G$ be a locally compact second countable amenable group with Haar measure $dg$, let $G$ act continuously on $(S,\mu)$ by measure-preserving transformations, and let $(F_{n})$ be a Følner sequence of compact subsets of $G$. Then $A_{n}x:=\frac{1}{|F_{n}|}\int_{F_{n}}\pi(g^{-1})x\,dg$ converges in $L^{p}$.

This result is originally due to Greenleaf [11], whose proof goes by way of an abstract Banach space analogue of the mean ergodic theorem which is simultaneously general enough to deduce the mean ergodic theorem for an amenable group acting on any reflexive Banach space or any $L^{1}(\mu)$ with $\mu$ a finite measure. Central to Greenleaf’s proof is a fixed point argument which, in particular, does not give any effective convergence information about the averages $A_{n}x$. (If one specialises to the case where $p=2$, a simpler proof is available [7, Thm. 8.13], but this proof also does not give any effective convergence information.)

Here our aim is to give an effective analogue of Greenleaf’s theorem. At the cost of some generality (here, we only consider actions of amenable groups on uniformly convex Banach spaces), we obtain an explicit uniform fluctuation bound for $(A_{n}x)$. In other words, we deduce a result of the following form:

Theorem.

(“Main theorem”) Let $L^{p}(S,\mu)$ be such that $S$ is $\sigma$-finite and $1<p<\infty$, and let $x\in L^{p}(S,\mu)$. Let $G$ be a locally compact second countable amenable group with Haar measure $dg$, let $G$ act continuously on $(S,\mu)$ by measure-preserving transformations, and let $(F_{n})$ be a Følner sequence of compact subsets of $G$. Then $A_{n}x:=\frac{1}{|F_{n}|}\int_{F_{n}}\pi(g^{-1})x\,dg$ converges in $L^{p}$ with a uniform bound on the number of $\varepsilon$-fluctuations over long distances for each $\varepsilon>0$, where the uniform bound (both the number of fluctuations and the “long distances”) depends exclusively, and explicitly, on: the choice of $p\in(1,\infty)$, the quantity $\|x\|_{L^{p}(S,\mu)}$, and the Følner sequence $(F_{n})$.

In fact, the dependence of the number of $\varepsilon$-fluctuations over long distances of $A_{n}x$ on the choice of Følner sequence $(F_{n})$ is only via a specific type of data from the Følner sequence, which expresses the fact that $(F_{n})$ is a Følner sequence in a slightly more quantitatively explicit fashion. We call this data the Følner convergence modulus; see Definition 15. It also turns out to be possible to delete the “over long distances” clause from the main theorem if, instead, one adds a side condition on the Følner sequence used, namely the condition that the Følner sequence be “fast”, as detailed in Definition 17.

We deduce this theorem as a special case of an “explicit fluctuation bound” analogue of an abstract mean ergodic theorem for a certain class of amenable group actions on uniformly convex Banach spaces, which we also call the “main theorem” of the article; this result is stated as Theorem 21 below, which also gives the explicit form of the uniform bound. The modified version of this theorem where the fluctuations are not “over long distances” but the Følner sequence is assumed to be “fast” is given as Corollary 22.

The plan of the article is as follows. In Section 1, we establish a number of background facts from functional analysis and the theory of amenable groups which are needed to state and prove the main theorem. In Section 2, we first provide a specialised proof of an abstract mean ergodic theorem for locally compact second countable (lcsc) amenable groups acting on uniformly convex Banach spaces, and then modify this proof so as to be sufficiently quantitatively explicit that we are able to deduce the main theorem. Lastly, in Section 3, a discussion of some related literature and open problems is provided.

1. Preliminaries

Definition 1.

Fix an $\varepsilon>0$. Given a sequence $(x_{n})$ in some metric space, we say that $(x_{n})$ has at most $N$ $\varepsilon$-fluctuations if for every finite sequence $n_{1}<n_{2}<\ldots<n_{k}$ such that $d(x_{n_{i}},x_{n_{i+1}})\geq\varepsilon$ for each $1\leq i<k$, it always holds that $k<N$. A weaker notion is “$\varepsilon$-fluctuations at distance $\beta$”: given some function $\beta:\mathbb{N}\rightarrow\mathbb{N}$ with $\beta(n)>n$ for every $n$, we say that $(x_{n})$ has at most $N$ $\varepsilon$-fluctuations at distance $\beta$ if for every finite sequence $n_{1}<n_{2}<\ldots<n_{k}$ such that both $n_{i+1}\geq\beta(n_{i})$ and $d(x_{n_{i}},x_{n_{i+1}})\geq\varepsilon$ for each $1\leq i<k$, it always holds that $k<N$.
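For a finite initial segment of a sequence, the quantity in Definition 1 can be computed directly. The following is a minimal sketch, assuming 0-based indices and a real-valued sequence by default; the names `longest_fluctuation_chain` and `has_at_most_fluctuations` are hypothetical. It finds the largest $k$ for which an admissible chain $n_{1}<\ldots<n_{k}$ exists, using a simple quadratic-time dynamic programme.

```python
from typing import Callable, Sequence


def longest_fluctuation_chain(
    xs: Sequence[float],
    eps: float,
    beta: Callable[[int], int] = lambda n: n + 1,
    dist: Callable[[float, float], float] = lambda a, b: abs(a - b),
) -> int:
    """Largest k such that indices n_1 < ... < n_k exist with
    n_{i+1} >= beta(n_i) and dist(x_{n_i}, x_{n_{i+1}}) >= eps for all i.

    With the default beta(n) = n + 1 the index constraint is vacuous,
    which recovers plain eps-fluctuations. Indices are 0-based here.
    """
    # best[j] = length of the longest admissible chain ending at index j.
    best = [1] * len(xs)
    for j in range(len(xs)):
        for i in range(j):
            if j >= beta(i) and dist(xs[i], xs[j]) >= eps:
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)


def has_at_most_fluctuations(xs, n_bound, eps, beta=lambda n: n + 1) -> bool:
    """Definition 1 restricted to the finite segment xs: every chain has k < n_bound."""
    return longest_fluctuation_chain(xs, eps, beta) < n_bound


if __name__ == "__main__":
    xs = [0, 1, 0, 1, 0, 1]
    print(longest_fluctuation_chain(xs, 1))                          # plain fluctuations
    print(longest_fluctuation_chain(xs, 1, beta=lambda n: n + 2))    # at distance beta
```

Requiring jumps of at least two indices rules out most chains in the alternating example, which is the sense in which fluctuations at distance $\beta$ are a weaker form of convergence data.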

Remark.

It holds that a sequence $(x_{n})$ is Cauchy (viz. that for every $\varepsilon>0$ there exists an $N$ such that for $m,n\geq N$, $d(x_{m},x_{n})<\varepsilon$) iff for every $\varepsilon>0$ there exists some $N'$ such that $(x_{n})$ has at most $N'$ $\varepsilon$-fluctuations, iff for any $\varepsilon>0$ and $\beta$ with $\beta(n)>n$, there exists some $N''$ such that $(x_{n})$ has at most $N''$ $\varepsilon$-fluctuations at distance $\beta$. (More precisely: if $(x_{n})$ is Cauchy then for any $\beta:\mathbb{N}\rightarrow\mathbb{N}$ with $\beta(n)>n$, $(x_{n})$ has only finitely many $\varepsilon$-fluctuations at distance $\beta$ for each $\varepsilon>0$; whereas conversely, if there is any such $\beta$ so that $(x_{n})$ has only finitely many $\varepsilon$-fluctuations at distance $\beta$ for each $\varepsilon>0$, then it follows that $(x_{n})$ is Cauchy.) Likewise, it is obvious that if for a specific sequence $(x_{n})$ we happen to know an explicit $N$ witnessing the Cauchy property, then this $N$ also serves as an explicit upper bound for $N'$, and likewise any explicit $N'$ serves as an explicit upper bound on $N''$ (for any $\beta$). However, the converses are all false in a strong sense: there exist examples of sequences where $N'$ is a computable function of $\varepsilon$ but $N$ is not computable [1], and likewise with $N''$ (for $\beta\neq n+1$) and $N'$ respectively [18].

These phenomena are certainly present in ergodic theory. As mentioned in the introduction, it has long been known that for a single measure-preserving transformation acting on $(S,\mu)$ and $1\leq p<\infty$, there exist functions $f\in L^{p}(S,\mu)$ for which the convergence indicated by the mean (and pointwise) ergodic theorem occurs arbitrarily slowly [19]. In other words, $(A_{n}x)$ does not exhibit a uniform rate of convergence. However, it was shown by Avigad and Rute [1] that when $p\in(1,\infty)$ in this setting (in fact, more generally, if the acted-upon space $L^{p}(S,\mu)$ is replaced with any uniformly convex Banach space $\mathcal{B}$ with modulus of uniform convexity $u(\varepsilon)$; see the definition below), there exists a uniform bound on the number of $\varepsilon$-fluctuations in the sequence $(A_{n}x)$ which depends only on $u(\varepsilon)$ and $\|x\|/\varepsilon$.

Definition 2.

A normed vector space $(\mathcal{B},\|\cdot\|)$ is said to be uniformly convex if there exists a nondecreasing function $u(\varepsilon)$ such that for all $x,y\in\mathcal{B}$ with $\|x\|\leq\|y\|\leq 1$ and $\|x-y\|\geq\varepsilon$, it follows that $\left\|\frac{1}{2}(x+y)\right\|\leq\|y\|-u(\varepsilon)$. Such a function $u(\varepsilon)$ is then referred to as a modulus of uniform convexity for $\mathcal{B}$. We say that $\mathcal{B}$ is $p$-uniformly convex if $K\varepsilon^{p+1}$ is a modulus of uniform convexity for $\mathcal{B}$, where $K$ is some constant.

Remark.

There are a number of equivalent ways to define uniform convexity. We have chosen the preceding definition because it is the most convenient for our argument, but it is worth mentioning another characterisation which is perhaps more standard: a space $(\mathcal{B},\|\cdot\|)$ is uniformly convex provided there is a nondecreasing function $\delta(\varepsilon)$ (also called a modulus of uniform convexity) such that for all $x,y\in\mathcal{B}$ with $\|x\|,\|y\|\leq 1$ and $\|x-y\|\geq\varepsilon$, it follows that $\left\|\frac{1}{2}(x+y)\right\|\leq 1-\delta(\varepsilon)$. It is not hard to show (cf. [17, Lemma 3.2]) that a function $\delta(\varepsilon)$ is a modulus of convexity in this sense iff $u(\varepsilon)=\frac{\varepsilon}{2}\delta(\varepsilon)$ is a modulus of uniform convexity in the sense of our definition. (This indicates the origin of the off-by-one issue in our definition of $p$-uniform convexity.)

It is well known [6] that the $L^{p}$ and $\ell^{p}$ spaces are uniformly convex when $p\in(1,\infty)$. Hanner showed [12] that for $p\in[2,\infty)$, the sharp modulus $\delta(\varepsilon)$ for $L^{p}$ has an especially nice form, namely $\delta(\varepsilon)=1-\left(1-\left(\frac{\varepsilon}{2}\right)^{p}\right)^{1/p}$. In particular, this implies that $u(\varepsilon)=\frac{\varepsilon}{2}-\frac{\varepsilon}{2}\left(1-\left(\frac{\varepsilon}{2}\right)^{p}\right)^{1/p}$ is a modulus of uniform convexity, in our sense, for $L^{p}$ with $p\in[2,\infty)$. (The same work shows that, even though $L^{p}$ is also uniformly convex for $p\in(1,2)$, the sharp modulus $\delta(\varepsilon)$ does not have as nice an explicit form.)
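The two formulas above are easy to evaluate numerically. The following sketch, with the hypothetical function names `hanner_delta` and `modulus_u`, computes $\delta(\varepsilon)$ for $p\geq 2$ and the derived modulus $u(\varepsilon)=\frac{\varepsilon}{2}\delta(\varepsilon)$. The restriction to $0\leq\varepsilon\leq 2$ is an assumption made here, reflecting that two vectors of norm at most $1$ are at distance at most $2$.

```python
import math


def hanner_delta(eps: float, p: float) -> float:
    """Modulus delta(eps) = 1 - (1 - (eps/2)^p)^(1/p), as stated for L^p with p >= 2."""
    if p < 2:
        raise ValueError("the closed form is only stated for p >= 2")
    if not 0.0 <= eps <= 2.0:
        raise ValueError("eps must lie in [0, 2]")
    return 1.0 - (1.0 - (eps / 2.0) ** p) ** (1.0 / p)


def modulus_u(eps: float, p: float) -> float:
    """Modulus in the sense of Definition 2: u(eps) = (eps/2) * delta(eps)."""
    return 0.5 * eps * hanner_delta(eps, p)


if __name__ == "__main__":
    for p in (2.0, 4.0, 8.0):
        print(p, [round(modulus_u(eps, p), 6) for eps in (0.25, 0.5, 1.0, 2.0)])
```

For $p=2$ the value $\delta(\varepsilon)=1-\sqrt{1-\varepsilon^{2}/4}$ agrees with what the parallelogram law gives in a Hilbert space, which serves as a quick sanity check on the formula.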

We recall some basic notions from the theory of vector-valued integration. We shall closely follow the recent textbook by Hytönen et al. [13]; for the convenience of the reader, we will sometimes refer directly to specific definitions, theorems, etc. therein.

Consider some measure space $(G,\mathcal{A},\mu)$ with some function $f:G\rightarrow\mathcal{B}$, with $(\mathcal{B},\|\cdot\|)$ a Banach space. We say that $f(g)$ is a simple function with respect to the $\sigma$-algebra $\mathcal{A}$ and space $\mathcal{B}$ if it is of the form $\sum_{i=1}^{N}1_{A_{i}}(g)b_{i}$, with $1_{A_{i}}(g)$ an indicator function for $A_{i}\in\mathcal{A}$, and $b_{i}\in\mathcal{B}$. We then say that a function $f$ is strongly measurable if it is a pointwise limit of simple functions, i.e. if there exists a sequence $f_{n}$ of simple functions such that for every $g\in G$, $\|f(g)-f_{n}(g)\|\rightarrow 0$ [13, Def. 1.1.4]. By contrast, a subtly different notion (but more standard in the vector integration literature) is that of $\mu$-strong measurability (called strong $\mu$-measurability in [13]), which only asserts this limit for $\mu$-almost all $g\in G$, but requires that the sets $A_{i}$ in the definition of simple function have finite $\mu$-measure (see [13, Def. 1.1.13 and Def. 1.1.14]). We do not directly use $\mu$-strong measurability in the main theorem of this paper; in particular, it is too weak a form of measurability for Propositions 8 and 9 below. The relationship between strong measurability and $\mu$-strong measurability is the following (quoting from [13, Prop. 1.1.16]):

Fact 3.

Consider a measure space $(G,\mathcal{A},\mu)$, a Banach space $\mathcal{B}$, and a function $f:G\rightarrow\mathcal{B}$.

(1) If $f$ is strongly $\mu$-measurable, then $f$ is $\mu$-almost everywhere equal to a strongly measurable function.

(2) If $\mu$ is $\sigma$-finite and $f$ is $\mu$-almost everywhere equal to a strongly measurable function, then $f$ is strongly $\mu$-measurable.

As a particular case of Fact 3, we see that when $\mu$ is $\sigma$-finite, a strongly measurable function is also $\mu$-strongly measurable.

For $\mu$-strongly measurable functions, one can define a form of integration, namely the Bochner integral, in direct analogy with the Lebesgue integral. Bochner integration is denoted by $\int_{G}f(g)\,d\mu$. A function is Bochner $\mu$-integrable iff it is both $\mu$-strongly measurable and $\int_{G}\|f(g)\|\,d\mu<\infty$, that is, $\|f\|:G\rightarrow\mathbb{R}$ is integrable in the Lebesgue sense [13, Prop. 1.2.2].

We record some other basic facts about the Bochner integral. (Each of these will be ultimately used in the proof of Lemma 19.)

Fact 4.

Let $(G,\mu)$ be a measure space and $f:G\rightarrow\mathcal{B}$ be $\mu$-strongly measurable.

(1) $\left\|\int_{G}f(g)\,d\mu\right\|\leq\int_{G}\|f(g)\|\,d\mu$. [13, Prop. 1.2.2]

(2) If $T\in\mathcal{L}(\mathcal{B},\mathcal{B})$, then $T\left(\int_{G}f(g)\,d\mu\right)=\int_{G}Tf(g)\,d\mu$. [13, Eqn. 1.2]

(3) Fubini’s theorem holds for the Bochner integral. [13, Prop. 1.2.7] In particular, if $(H,\nu)$ is another measure space, and $\mu$ and $\nu$ are $\sigma$-finite, and $F:G\times H\rightarrow\mathcal{B}$ is Bochner integrable, then

$$\int_{G\times H}F\,d\mu\times d\nu=\int_{H}\left(\int_{G}F\,d\mu\right)d\nu=\int_{G}\left(\int_{H}F\,d\nu\right)d\mu.$$

A more general notion than strong measurability is weak measurability: we say that $f:G\rightarrow\mathcal{B}$ is weakly measurable if for every $b^{*}\in\mathcal{B}^{*}$, the function $b^{*}\circ f:G\rightarrow\mathbb{R}$ is measurable (in the ordinary sense as a function from $(G,\mathcal{A},\mu)$ to $\mathbb{R}$ with the Borel $\sigma$-algebra). The following classical result indicates when weak measurability implies strong measurability.

Proposition 5.

(Pettis measurability criterion [13, Thm. 1.1.6]) Let $(G,\mu)$ be a measure space and $\mathcal{B}$ a Banach space. For a function $f:G\rightarrow\mathcal{B}$ the following are equivalent:

(1) $f$ is strongly measurable.

(2) $f$ is weakly measurable, and $f(G)$ is separable in $\mathcal{B}$.

An easy consequence of the Pettis measurability criterion is the following.

Proposition 6.

If the measure space $(G,\mu)$ is also a separable topological space and every Borel set in $G$ is $\mu$-measurable, and $f:G\rightarrow\mathcal{B}$ is continuous, then $f$ is strongly measurable.

Proof.

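Observe that for any $b^{*}\in\mathcal{B}^{*}$, $b^{*}\circ f$ is a composition of continuous functions, and is therefore continuous. Since every Borel set in $G$ is measurable, $b^{*}\circ f$ is measurable, and hence $f$ is weakly measurable. Moreover, the continuous image of a separable space is separable, so $f(G)$ is separable in $\mathcal{B}$. The claim now follows from Proposition 5. ∎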

Before the next proposition, it will be convenient to introduce the following terminology.

Definition 7.

If $(G,\mathcal{A},\mu)$ is a measure space which is also a topological space, and $\mathcal{A}$ extends the Borel $\sigma$-algebra on $G$, then we say that $f:G\rightarrow\mathcal{B}$ is weakly Borel if $b^{*}\circ f:G\rightarrow\mathbb{R}$ is Borel, for all $b^{*}\in\mathcal{B}^{*}$. Likewise we say that $f:G\rightarrow\mathcal{B}$ is strongly Borel provided it is weakly Borel and $f(G)$ is separable in $\mathcal{B}$.

Of course, in the case where $\mathcal{A}$ is precisely the Borel $\sigma$-algebra on $G$, the notions of weakly and strongly Borel functions $f:G\rightarrow\mathcal{B}$ coincide with weak and strong measurability of $f$.

Proposition 8.

If $(G,\mathcal{A},\mu)$ and $(H,\mathcal{C},\nu)$ are both topological spaces where every Borel set is measurable, $\phi:H\rightarrow G$ is Borel, and $f:G\rightarrow\mathcal{B}$ is strongly Borel, then $f\circ\phi$ is strongly Borel (in particular strongly measurable).

Proof.

By hypothesis, $f$ is also weakly Borel, so for any $b^{*}\in\mathcal{B}^{*}$, we have that $b^{*}\circ f$ is Borel. Therefore $b^{*}\circ(f\circ\phi)=(b^{*}\circ f)\circ\phi$ is Borel, and thus $f\circ\phi$ is weakly Borel. Now, let $A=\phi(H)$ be the image of $\phi$ in $G$. Note that $(f\circ\phi)(H)=f(A)$. Since $f(G)$ is separable in $\mathcal{B}$, it follows that $f(A)$ is also separable in $\mathcal{B}$, since it is contained in $f(G)$. ∎

We now fix some notation and terminology regarding group actions on Banach spaces and measure spaces.

A locally compact group $G$ will always come equipped with a Haar measure. In the countable discrete case this coincides with the counting measure. Regardless of whether the group is discrete or continuous, we will use the notations $dg$ and $|\cdot|$ interchangeably to refer to the Haar measure. Conventions differ on the issue of whether the measure space $(G,dg)$ is understood to come equipped with the Borel $\sigma$-algebra or its completion; in the latter case, the assertion that a function $f:G\rightarrow\mathcal{B}$ is strongly Borel is stronger than the assertion that $f$ is strongly measurable. Therefore, in what follows, we assume only that the $\sigma$-algebra on $G$ contains the Borel $\sigma$-algebra, and carefully note when a function $f:G\rightarrow\mathcal{B}$ is assumed to be strongly Borel rather than strongly measurable. We use the abbreviation lcsc for topological groups which are locally compact and second countable; importantly, for lcsc groups it always holds that the Haar measure is $\sigma$-finite.

In general, we say that a group $G$ acts on a Banach space $(\mathcal{B},\|\cdot\|)$ if there is a function $\pi(g)$ that returns an operator on $\mathcal{B}$ for every $g\in G$, $\pi(e)$ is the identity operator, and for all $g,h\in G$, $\pi(g)\pi(h)=\pi(gh)$. Together these imply that $\pi(g)^{-1}=\pi(g^{-1})$. We say that $G$ acts linearly on $\mathcal{B}$ provided that in addition, $\pi$ maps from $G$ to the space $\mathcal{L}(\mathcal{B},\mathcal{B})$ of linear operators on $\mathcal{B}$. Writing $\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$ to indicate the set of all linear operators from $\mathcal{B}$ to $\mathcal{B}$ with supremum norm $1$, another way to say that $G$ acts both linearly and with unit norm on $\mathcal{B}$ is to say that $G$ acts on $\mathcal{B}$ via $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$. [Footnote 1: We remark that any group that acts via a representation $\pi:G\rightarrow\mathcal{L}(\mathcal{B},\mathcal{B})$ such that every $\pi(g)$ is nonexpansive actually does so via $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$, by the fact that $\pi(g^{-1})=\pi(g)^{-1}$ and the general fact about linear operators that $\|T^{-1}\|\geq\|T\|^{-1}$. Nonexpansivity is required for the proof of our main result.] Likewise, we say that a topological group $G$ acts continuously on $\mathcal{B}$ provided that for every $x\in\mathcal{B}$, if $g\rightarrow e$ then $\|\pi(g)x-x\|\rightarrow 0$. In other words, $g\mapsto\pi(g)x$ is continuous from $G$ to $\mathcal{B}$. In the case where $G$ also acts linearly (resp. and with unit norm) on $\mathcal{B}$, this is equivalent to requiring that $\pi:G\rightarrow\mathcal{L}(\mathcal{B},\mathcal{B})$ (resp. $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$) is continuous when $\mathcal{L}(\mathcal{B},\mathcal{B})$ is equipped with the strong operator topology.

Relatedly, in the case where a group $G$ acts on a measure space $(S,\mathcal{A},\mu)$, we say that $G$ acts by measure-preserving transformations if $\mu(g^{-1}\cdot A)=\mu(A)$ for every $g\in G$ and $A\in\mathcal{A}$. Likewise, we say that $G$ acts continuously on $(S,\mathcal{A},\mu)$ provided that for every $A\in\mathcal{A}$, if $g\rightarrow e$ then $\mu(A\Delta gA)\rightarrow 0$. This notion of continuous group action is related to our previous notion of continuous action on a Banach space by Proposition 10 below (which may be taken as a justification for the terminology).

Furthermore, if $G$ is understood as a measure space, we say that $G$ acts strongly on a Banach space $\mathcal{B}$ provided that for every $x\in\mathcal{B}$, $g\mapsto\pi(g)x$ is strongly measurable from $G$ to $\mathcal{B}$. [Footnote 2: By analogy with the previous situation where $G$ acts continuously on $\mathcal{B}$: in the case where $G$ also acts linearly (resp. and with unit norm) on $\mathcal{B}$, this is equivalent to requiring that $\pi:G\rightarrow\mathcal{L}(\mathcal{B},\mathcal{B})$ (resp. $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$) satisfies a “strongly measurable with respect to the strong operator topology” like condition; for a precise statement, see [13, Cor. 1.4.7]. However, we do not use this alternative, strong operator topology based, characterisation in our argument.] Likewise, we say that $G$ acts Borel strongly on $\mathcal{B}$ provided that for every $x\in\mathcal{B}$, $g\mapsto\pi(g)x$ is strongly Borel from $G$ to $\mathcal{B}$. Our argument will simultaneously require that $G$ acts Borel strongly on $\mathcal{B}$, and linearly and with unit norm. To refer to this last condition, we will say that $G$ acts Borel strongly on $\mathcal{B}$ via the representation $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$.

In the situation where $G$ acts Borel strongly on $\mathcal{B}$ via the representation $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$, we can immediately deduce the following facts, which will be used in the proof of Lemma 19.

Proposition 9.

Suppose that $G$ acts Borel strongly on $\mathcal{B}$ via the representation $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$. Then,

(1) for each $A\subset G$ with $|A|<\infty$, we have that $1_{A}\pi(g^{-1})x$ is Bochner integrable for each $x\in\mathcal{B}$, i.e. $1_{A}\pi(g^{-1})x$ is $\mu$-strongly measurable and $\int_{A}\|\pi(g^{-1})x\|\,dg<\infty$. Moreover,

(2) for each $A,B\subset G$ with $|A|,|B|<\infty$ and for each $x\in\mathcal{B}$, we have that $1_{A\times B}\pi((hg)^{-1})x$ is Bochner integrable.

Proof.

(1) Since $\pi(\cdot)x$ is a strongly Borel function from $G$ to $\mathcal{B}$, and $g\mapsto g^{-1}$ is a continuous and therefore Borel function from $G$ to $G$, it follows from Proposition 8 that $g\mapsto\pi(g^{-1})x$ is strongly measurable (in fact strongly Borel). Since the Haar measure on $G$ is $\sigma$-finite, this implies that $g\mapsto\pi(g^{-1})x$ is $\mu$-strongly measurable, by Fact 3. For the second condition of Bochner integrability, it suffices to observe that since $\pi(g^{-1})\in\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$ for every $g\in G$,

$$\int_{A}\|\pi(g^{-1})x\|\,dg\leq\int_{A}\|x\|\,dg<\infty.$$

(2) Since $G$ is a topological group, it holds that group multiplication is continuous. Therefore $(g,h)\mapsto\pi((hg)^{-1})x$ is strongly measurable (in fact strongly Borel) thanks to Proposition 8, since it is a composition of the continuous (and therefore Borel) multiplication function and the strongly Borel function $\pi(\cdot^{-1})x$. Consequently, Fact 3 implies that $(g,h)\mapsto\pi((hg)^{-1})x$ is also $\mu$-strongly measurable. We also have Bochner integrability because again,

$$\int_{A\times B}\|\pi((hg)^{-1})x\|\,dg\times dh\leq\int_{A\times B}\|x\|\,dg\times dh<\infty.$$

∎

Let us tie all this discussion of Bochner integration and group actions on Banach spaces back to our original setting of interest. In the “concrete version” of Greenleaf’s mean ergodic theorem, stated in the introduction, $G$ acts continuously and by measure-preserving transformations on a $\sigma$-finite measure space $(S,\mu)$, and $f\in L^{p}$ with $p\in(1,\infty)$. In the next proposition and corollary, we briefly check that this particular situation is indeed covered by our abstract “group action on Banach space” framework.

Proposition 10.

Suppose that $G$ is a topological group which acts continuously and by measure-preserving transformations on a $\sigma$-finite measure space $(S,\mu)$. Denote this action by

$$G\times S\rightarrow S,\qquad(g,s)\mapsto g\cdot s.$$

Fix $p\in[1,\infty)$.

(1) This action of $G$ on $S$ induces an action of $G$ on $L^{p}(S,\mu)$, given by

$$G\times L^{p}(S,\mu)\rightarrow L^{p}(S,\mu),\qquad(g,f)\mapsto f\circ g^{-1},$$

where for a fixed $g\in G$, the $L^{p}(S,\mu)$ function $f\circ g^{-1}$ is defined in the following manner: $(f\circ g^{-1})(s):=f(g^{-1}\cdot s)$. In other words, there is a representation $\pi$ of $G$ into the space of operators on $L^{p}(S,\mu)$, with $\pi(g)f=f\circ g^{-1}$.

(2) Moreover, $G$ acts linearly and isometrically on $L^{p}(S,\mu)$, in the sense that $\|\pi(g)f\|_{p}=\|f\|_{p}$ for all $f\in L^{p}(S,\mu)$; so in particular, $\pi:G\rightarrow\mathcal{L}_{1}(L^{p}(S,\mu),L^{p}(S,\mu))$.

(3) In addition, $G$ acts continuously on $L^{p}(S,\mu)$, that is, for each $f\in L^{p}(S,\mu)$, $g\mapsto\pi(g)f$ is a continuous function from $G$ to $L^{p}(S,\mu)$.

Remark.

This proposition is basically a modification of the rather classical Koopman operator formalism; except here, instead of a single measure-preserving transformation, we have a topological group of measure-preserving transformations. In the former case, the argument is quite standard; its generalisation to a group action is straightforward, but we explicitly go through the proof here for completeness. (The argument is nearly given in [7, Ch. 8], for instance, albeit using a more restrictive definition of continuous $G$-action on a measure space.)

Additionally, we note that the statement of the proposition includes the case $p=1$, which is excluded from our main theorem; nor does the proposition require that $G$ be lcsc or amenable.

Proof.

(1) Here, we need only check explicitly that $(g,f)\mapsto f\circ g^{-1}$ is indeed a group action of $G$ on $L^{p}(S,\mu)$. To wit, we must show that $\pi(e)f=f$ and $\pi(gh)f=\pi(g)\pi(h)f$ for every $f\in L^{p}(S,\mu)$ and all $g,h\in G$.

The first is obvious, since

$$\pi(e)f(s)=(f\circ e^{-1})(s)=f(e^{-1}\cdot s)$$

and $e^{-1}\cdot s=e\cdot s=s$ for every $s\in S$, since $(g,s)\mapsto g\cdot s$ is a group action on $S$. Likewise, the second condition also follows directly from the fact that $(g,s)\mapsto g\cdot s$ is a group action on $S$:

$$\pi(gh)f(s)=(f\circ(gh)^{-1})(s)=f((gh)^{-1}\cdot s)=f(h^{-1}\cdot(g^{-1}\cdot s))=(\pi(h)f)(g^{-1}\cdot s)=\pi(g)\pi(h)f(s).$$

(2) To see that $\pi(g)$ is a linear operator on $L^{p}(S,\mu)$ for each $g\in G$, simply note that for any scalar $c$ and $f_{1},f_{2}\in L^{p}(S,\mu)$,

$$\pi(g)\left(cf_{1}+f_{2}\right)(s)=\left(cf_{1}+f_{2}\right)(g^{-1}\cdot s)=cf_{1}(g^{-1}\cdot s)+f_{2}(g^{-1}\cdot s)=c\,\pi(g)f_{1}(s)+\pi(g)f_{2}(s).$$

The isometry property follows immediately from the change of variable formula and the fact that the action of $G$ on $(S,\mu)$ is measure preserving. Explicitly: fix $h\in G$, and compute that

$$\|\pi(h)f\|_{p}^{p}=\int_{S}|f(h^{-1}\cdot s)|^{p}\,d\mu(s)=\int_{S}|f(s)|^{p}\,d\mu(h\cdot s)=\int_{S}|f(s)|^{p}\,d\mu(s)=\|f\|_{p}^{p}.$$

(3) We need to show that, for arbitrary $f\in L^{p}(S,\mu)$, if $g\rightarrow e$ then $\|\pi(g)f-f\|_{p}\rightarrow 0$. We first prove the claim for $f$ an indicator function; then for simple functions; and then pass to general functions in $L^{p}(S,\mu)$ by a density argument.

Fix a measurable set $A\subseteq S$ with $\mu(A)<\infty$. By the assumption that the action of $G$ on $(S,\mu)$ is continuous, it holds that for every $\varepsilon>0$, there exists a neighbourhood $U\ni e$ such that for all $g\in U$, $\mu(gA\Delta A)<\varepsilon$.

Let $\chi_{A}$ denote the indicator function for $A$. Fix $\varepsilon>0$, and let $U$ be a neighbourhood of the identity such that for every $g\in U$, $\mu(gA\Delta A)<\varepsilon^{p}$. Let $g\in U$. Observe that $(\pi(g)\chi_{A})(s)=\chi_{A}(g^{-1}\cdot s)=\chi_{gA}(s)$, and therefore

$$\|\chi_{A}-\chi_{gA}\|_{p}^{p}=\int_{A\Delta gA}1\,d\mu=\mu(A\Delta gA)<\varepsilon^{p}.$$

Since our choice of $\varepsilon$ was arbitrary, we conclude that $g\mapsto\pi(g)f$ is continuous when $f$ is an indicator function of a set of finite measure.

Now let $f$ be a simple function of the form $\sum_{i=1}^{k}c_{i}\chi_{A_{i}}$, where we may assume that each $c_{i}\neq 0$ and each $\mu(A_{i})<\infty$. Then $\pi(g)f=\sum_{i=1}^{k}c_{i}\chi_{gA_{i}}$. In this case, it suffices to pick a neighbourhood $U\ni e$ which is small enough that for each $i$ and every $g\in U$, we have that

$$\mu(A_{i}\Delta gA_{i})<\left(\frac{\varepsilon}{k|c_{i}|}\right)^{p}.$$

Indeed, from the triangle inequality and the previous result for indicator functions, we observe that for every $g\in U$,

$$\|f-\pi(g)f\|_{p}\leq\sum_{i=1}^{k}\left\|c_{i}\left(\chi_{A_{i}}-\chi_{gA_{i}}\right)\right\|_{p}<\sum_{i=1}^{k}|c_{i}|\left(\frac{\varepsilon}{k|c_{i}|}\right)=\varepsilon.$$

Lastly, take $f\in L^{p}(S,\mu)$ to be arbitrary. Since simple functions are dense in $L^{p}(S,\mu)$, there is a simple function $f_{0}$ such that $\|f-f_{0}\|_{p}<\varepsilon/3$. By the previous case, we can pick a neighbourhood $U\ni e$ such that $\|f_{0}-\pi(g)f_{0}\|_{p}<\varepsilon/3$ for all $g\in U$. Now, observe that

$$\|f-\pi(g)f\|_{p}\leq\|f-f_{0}\|_{p}+\|f_{0}-\pi(g)f_{0}\|_{p}+\|\pi(g)f_{0}-\pi(g)f\|_{p}.$$

We have already seen that the first two terms are each $<\varepsilon/3$. For the last term, observe that

$$\|\pi(g)f_{0}-\pi(g)f\|_{p}=\|\pi(g)(f_{0}-f)\|_{p}=\|f_{0}-f\|_{p}$$

by part (2) of the proposition. Consequently, $\|\pi(g)f_{0}-\pi(g)f\|_{p}<\varepsilon/3$ as well, so that $\|f-\pi(g)f\|_{p}<\varepsilon$ as desired. ∎

Corollary 11.

Let GG be a lcsc group acting continuously by measure-preserving transformations on a σ\sigma-finite measure space (S,μ)(S,\mu). Fix p[1,)p\in[1,\infty). Then the induced action of GG on Lp(S,μ)L^{p}(S,\mu) is linear, strongly Borel, and has unit norm.

Proof.

That the induced action of GG on Lp(S,μ)L^{p}(S,\mu) is linear with unit norm is immediate from Proposition 10 (2). From Proposition 10 (3), we know that gπ(g)fg\mapsto\pi(g)f is continuous from GG to Lp(S,μ)L^{p}(S,\mu) for each ff; since GG is separable, Proposition 6 indicates that gπ(g)fg\mapsto\pi(g)f is strongly measurable. Thus, we need only to check that gπ(g)fg\mapsto\pi(g)f is strongly Borel, as per Definition 7.

The continuity of $g\mapsto\pi(g)f$ and the separability of $G$ imply that the image of $G$ under this mapping is separable in $L^{p}(S,\mu)$. Likewise, since $g\mapsto\pi(g)f$ is continuous, post-composing the mapping with an element of the dual space of $L^{p}(S,\mu)$ results in a continuous map from $G$ to the scalar field, hence automatically Borel also. Therefore $g\mapsto\pi(g)f$ is both weakly Borel and has separable image, for any choice of $f$; so we have shown that $G$ acts Borel strongly on $L^{p}(S,\mu)$. ∎

We now turn our attention to amenable groups. The following serves as our preferred characterisation of amenability.

Definition 12.

(1) Let GG be a countable discrete group. A sequence (Fn)(F_{n}) of finite subsets of GG is said to be a Følner sequence if for every ε>0\varepsilon>0 and finite KGK\subset G, there exists an NN such that for all nNn\geq N and for all kKk\in K, |FnΔkFn|<|Fn|ε|F_{n}\Delta kF_{n}|<|F_{n}|\varepsilon.

(2) Let GG be a locally compact second countable (lcsc) group with Haar measure |||\cdot|. A sequence (Fn)(F_{n}) of compact subsets of GG is said to be a Følner sequence if for every ε>0\varepsilon>0 and compact KGK\subset G, there exists an NN such that for all nNn\geq N, there exists a subset KK^{\prime} of KK with |K\K|<|K|ε|K\backslash K^{\prime}|<|K|\varepsilon such that for all kKk\in K^{\prime}, |FnΔkFn|<|Fn|ε|F_{n}\Delta kF_{n}|<|F_{n}|\varepsilon.

Remark.

It has been observed, for instance, by Ornstein and Weiss [23] that (2) is one of several equivalent “correct” generalisations of (1) to the lcsc setting. Note however, that we do not assume (Fn)(F_{n}) is nested (FiFi+1F_{i}\subset F_{i+1} for all ii\in\mathbb{N}) or exhausts GG (nFn=G\bigcup_{n\in\mathbb{N}}F_{n}=G), nor do we assume, in the lcsc case, that GG is unimodular. (Each of these is a common additional technical assumption when working with amenable groups.) Conversely, some authors use a version of (2) where the sets in (Fn)(F_{n}) are merely assumed to have finite volume, rather than compact; thanks to the regularity of the Haar measure, our definition results in no loss of generality.
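To make Definition 12 (1) concrete, the following sketch computes the invariance ratio $|F_{n}\Delta kF_{n}|/|F_{n}|$ for the intervals $F_{n}=\{0,\ldots,n-1\}$ in $G=\mathbb{Z}$ (written additively). The choice of group, the sets, and all function names are assumptions made for illustration. For this family the ratio equals $2|k|/n$ whenever $|k|<n$, so it is decreasing in $n$ and the first index at which the Følner condition holds for a finite set $K$ also works for every later index.

```python
from fractions import Fraction


def folner_set(n: int) -> set:
    """Illustrative Folner set F_n = {0, ..., n-1} in the group Z."""
    return set(range(n))


def invariance_ratio(F: set, k: int) -> Fraction:
    """|F delta (k + F)| / |F|, computed exactly."""
    shifted = {k + x for x in F}
    return Fraction(len(F ^ shifted), len(F))


def first_good_index(K: set, eps: Fraction, n_max: int = 10_000) -> int:
    """Least N <= n_max with ratio < eps for every k in K.

    For intervals the ratio is nonincreasing in n, so this N also
    serves as the threshold required by the definition.
    """
    for N in range(1, n_max + 1):
        if all(invariance_ratio(folner_set(N), k) < eps for k in K):
            return N
    raise ValueError("no index found below n_max")
```

For example, `invariance_ratio(folner_set(10), 3)` is `3/5`, matching $2|k|/n$.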

Definition 13.

If GG is either a countable discrete or lcsc amenable group, and has some distinguished Følner sequence (Fn)(F_{n}), and acts Borel strongly on \mathcal{B} via a representation π:G1(,)\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B}), then we define the nnth ergodic average operator as follows: Anx:=Fnπ(g1)x𝑑gA_{n}x:=\fint_{F_{n}}\pi(g^{-1})xdg. Here \fint denotes the normalised integral, that is, Af(g)𝑑g:=1|A|Af(g)𝑑g\fint_{A}f(g)dg:=\frac{1}{|A|}\int_{A}f(g)dg.

Proposition 14.

With the notation above, An(,)1\|A_{n}\|_{\mathcal{L}(\mathcal{B},\mathcal{B})}\leq 1.

Proof.

Evidently $A_{n}$ is a linear operator from $\mathcal{B}$ to $\mathcal{B}$. From Proposition 9 we know that $1_{F_{n}}\pi(g^{-1})x$ is Bochner integrable, and in particular

\[\|A_{n}x\|=\left\|\fint_{F_{n}}\pi(g^{-1})xdg\right\|\leq\fint_{F_{n}}\|\pi(g^{-1})x\|dg\leq\fint_{F_{n}}\|x\|dg=\|x\|.\] ∎
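The bound $\|A_{n}x\|\leq\|x\|$ can be observed numerically in a toy setting. The sketch below assumes $G=\mathbb{Z}$ acting on $\mathbb{C}^{2}$ by diagonal rotations, with $F_{n}=\{0,\ldots,n-1\}$ and counting measure; the action, the angles, and the function names are all hypothetical choices, not the setting of the text.

```python
import cmath
import math


def act(g, x, angles):
    """pi(g) x for the illustrative isometric action of Z on C^d by diagonal rotations."""
    return [cmath.exp(1j * theta * g) * xi for theta, xi in zip(angles, x)]


def norm(x):
    """Euclidean norm on C^d."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))


def ergodic_average(n, x, angles):
    """A_n x = (1/n) * sum over g in {0, ..., n-1} of pi(-g) x."""
    total = [0j] * len(x)
    for g in range(n):
        total = [t + y for t, y in zip(total, act(-g, x, angles))]
    return [t / n for t in total]
```

With `angles = (0.0, 1.0)` the first coordinate is fixed by the action, so the averages keep it unchanged while the second coordinate averages out towards zero.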

A key piece of quantitative information for us will be how large NN has to be if gg is chosen to be an element of (Fn)(F_{n}), in order for |FNΔgFN|/|FN||F_{N}\Delta gF_{N}|/|F_{N}| to be small. This information is encoded by the following modulus:

Definition 15.

Let GG be an amenable group, either countable discrete or lcsc, with Følner sequence (Fn)(F_{n}). A Følner convergence modulus β(n,ε)\beta(n,\varepsilon) for (Fn)(F_{n}) returns an integer NN such that:

(1) If $G$ is countable discrete,

\[(\forall m\geq N)(\forall g\in F_{n})\left[|F_{m}\Delta gF_{m}|<|F_{m}|\varepsilon\right].\]

(2) If $G$ is lcsc,

\[(\forall m\geq N)(\exists F_{n}^{\prime}\subset F_{n})(\forall g\in F_{n}^{\prime})\left[|F_{n}\backslash F_{n}^{\prime}|<|F_{n}|\varepsilon\wedge|F_{m}\Delta gF_{m}|<|F_{m}|\varepsilon\right].\]

We remark that if $(F_{n})$ is an increasing Følner sequence (that is, $F_{n}\subset F_{m}$ for all $n\leq m$) then it follows trivially that $\beta(n,\varepsilon)$ is a nondecreasing function of $n$ for any fixed $\varepsilon$. However, in what follows we do not always assume that $(F_{n})$ is increasing. In some instances it is technically convenient to assume that $\beta(n,\varepsilon)$ is nondecreasing; in this case, we can upper bound $\beta(n,\varepsilon)$ using an “envelope” of the form $\tilde{\beta}(n,\varepsilon)=\max_{1\leq i\leq n}\beta(i,\varepsilon)$. Hence, in any case we are free to assume that $\beta(n,\varepsilon)$ is nondecreasing in $n$ if necessary.

It should be clear that if we have a “sufficiently explicit” amenable group GG with a “sufficiently explicit” Følner sequence (Fn)(F_{n}), then we can explicitly write down a Følner convergence modulus β(n,ε)\beta(n,\varepsilon) for (Fn)(F_{n}). What is meant by this? Simply, the following: if we know “explicitly” that (Fn)(F_{n}) is a Følner sequence, this amounts to knowing “explicitly”, for each gGg\in G, that |FnΔgFn|/|Fn|0|F_{n}\Delta gF_{n}|/|F_{n}|\rightarrow 0 (resp. the analogous statement for when GG is lcsc), which in turn amounts to being able to write down how exactly this convergence occurs in an explicit way, i.e. that one can explicitly write down a rate of convergence N(g,ε)N(g,\varepsilon) for |FnΔgFn|/|Fn||F_{n}\Delta gF_{n}|/|F_{n}|. But a Følner convergence modulus β(n,ε)\beta(n,\varepsilon) is just a function which dominates N(g,ε)N(g,\varepsilon) for every gFng\in F_{n} — in particular we can take β(n,ε)=maxgFnN(g,ε)\beta(n,\varepsilon)=\max_{g\in F_{n}}N(g,\varepsilon).
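As an illustration of the recipe $\beta(n,\varepsilon)=\max_{g\in F_{n}}N(g,\varepsilon)$, consider again the intervals $F_{n}=\{0,\ldots,n-1\}$ in $\mathbb{Z}$ (an assumed example, with hypothetical function names). Since $|F_{m}\Delta gF_{m}|/|F_{m}|=2|g|/m$ for $|g|<m$, the rate $N(g,\varepsilon)=\lfloor 2|g|/\varepsilon\rfloor+1$ works for each $g$, provided $\varepsilon\leq 2$, and the maximum over $F_{n}$ is attained at $g=n-1$. Exact rational arithmetic avoids rounding issues at the threshold.

```python
from fractions import Fraction


def invariance_ratio(m: int, g: int) -> Fraction:
    """|F_m delta (g + F_m)| / |F_m| for F_m = {0, ..., m-1} in Z."""
    F = set(range(m))
    return Fraction(len(F ^ {g + x for x in F}), m)


def rate(g: int, eps: Fraction) -> int:
    """N(g, eps): least N with 2|g|/m < eps for all m >= N (assumes eps <= 2)."""
    return int(Fraction(2 * abs(g)) // eps) + 1


def beta(n: int, eps: Fraction) -> int:
    """Folner convergence modulus obtained as the maximum of N(g, eps) over g in F_n."""
    return max(rate(g, eps) for g in range(n))


def is_modulus_at(n: int, eps: Fraction, m: int) -> bool:
    """Check the defining inequality of the modulus at a single index m."""
    return all(invariance_ratio(m, g) < eps for g in range(n))
```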

Likewise (but less heuristically), it is easy to see that we can select a Følner sequence in such a way that β(n,ε)\beta(n,\varepsilon) can be chosen to be a computable function (for an appropriate restriction on the domain of the second variable). The following argument has essentially already been observed by previous authors [4, 5, 22] working with slightly different objects, but we include it for completeness. (In the interest of simplicity, we restrict our attention to countable discrete groups; one can say something similar for lcsc groups with computable topology, but we do not pursue this here.)

Proposition 16.

Let GG be a countable discrete finitely generated amenable group with the solvable word property. Fix kk\in\mathbb{N}. Then GG has a Følner sequence (Fn)(F_{n}) such that β(n,k1)=max{n+1,3k}\beta(n,k^{-1})=\max\{n+1,3k\} is a Følner convergence modulus for (Fn)(F_{n}). Moreover (Fn)(F_{n}) can be chosen in a computable fashion.

Proof.

Fix a computable enumeration of the elements of GG, as well as a computable enumeration of the finite subsets of GG. The solvable word property ensures that we can do this, and also that the cardinality of FΔgFF\Delta gF can always be computed for any gGg\in G and finite set FF. So, take F1F_{1} to be an arbitrary finite subset of GG containing g1g_{1}, the first element of GG. Given Fn1F_{n-1}, take F~n\tilde{F}_{n} to be the least (with respect to the enumeration) finite subset of GG containing Fn1F_{n-1}, such that for all gFn1g\in F_{n-1}, |F~nΔgF~n|<|F~n|/n|\tilde{F}_{n}\Delta g\tilde{F}_{n}|<|\tilde{F}_{n}|/n. Such an F~n\tilde{F}_{n} exists since GG is amenable. Then, put Fn=F~n{gn}F_{n}=\tilde{F}_{n}\cup\{g_{n}\} where gng_{n} is the nnth element of GG. This is indeed a Følner sequence: for a fixed gg, we see that |FnΔgFn|/|Fn|<3/n|F_{n}\Delta gF_{n}|/|F_{n}|<3/n for all nn greater than the first NN such that gFNg\in F_{N}, because

|FnΔgFn||Fn|2+|F~nΔgF~n||Fn|2|Fn|+|F~nΔgF~n||F~n|<3n.\frac{|F_{n}\Delta gF_{n}|}{|F_{n}|}\leq\frac{2+|\tilde{F}_{n}\Delta g\tilde{F}_{n}|}{|F_{n}|}\leq\frac{2}{|F_{n}|}+\frac{|\tilde{F}_{n}\Delta g\tilde{F}_{n}|}{|\tilde{F}_{n}|}<\frac{3}{n}.

Hence |FnΔgFn|/|Fn|0|F_{n}\Delta gF_{n}|/|F_{n}|\rightarrow 0. Moreover, we see that if mmax{n+1,3k}m\geq\max\{n+1,3k\}, then

(gFn)|FmΔgFm|<3|Fm|/m|Fm|/k(\forall g\in F_{n})\qquad|F_{m}\Delta gF_{m}|<3|F_{m}|/m\leq|F_{m}|/k

and so β(n,k1):=max{n+1,3k}\beta(n,k^{-1}):=\max\{n+1,3k\} is indeed a Følner convergence modulus for (Fn)(F_{n}). ∎

Remark.

The previous proposition is not sharp. It has been shown that there are groups without the solvable word property which nonetheless have computable Følner sequences with computable convergence behaviour [5]. (The cited paper uses a different explicit modulus of convergence for Følner sequences than the present paper, although the argument carries over to our setting without modification.)

Finally, it will be convenient to introduce the notion of a Følner sequence where the gap between nn and β(n,ε)\beta(n,\varepsilon) is controlled.

Definition 17.

Let GG be a countable discrete or lcsc amenable group and (Fn)(F_{n}) a Følner sequence. Let λ\lambda\in\mathbb{N} and ε>0\varepsilon>0. We say that (Fn)(F_{n}) is a (λ,ε)(\lambda,\varepsilon)-fast Følner sequence if

(1) For $G$ countable discrete: for all $n\in\mathbb{N}$, all $m\geq n+\lambda$, and all $g\in F_{n}$, we have $|F_{m}\Delta gF_{m}|/|F_{m}|<\varepsilon$.

(2) For $G$ lcsc: for all $n\in\mathbb{N}$ and all $m\geq n+\lambda$, there exists a set $F_{n}^{\prime}\subset F_{n}$ with $|F_{n}\backslash F_{n}^{\prime}|<|F_{n}|\varepsilon$ such that for all $g\in F_{n}^{\prime}$, $|F_{m}\Delta gF_{m}|/|F_{m}|<\varepsilon$.

In other words, a Følner sequence (Fn)(F_{n}) is (λ,ε)(\lambda,\varepsilon)-fast provided that β(n,ε)=n+λ\beta(n,\varepsilon)=n+\lambda is a Følner convergence modulus for (Fn)(F_{n}).
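For a quick sanity check of the definition, take the dyadic intervals $F_{n}=\{0,\ldots,2^{n}-1\}$ in $\mathbb{Z}$ (an assumed example). For $g\in F_{n}$ and $m\geq n+\lambda$ one has $|F_{m}\Delta gF_{m}|/|F_{m}|=2g/2^{m}<2^{1-\lambda}$, so this sequence is $(\lambda,\varepsilon)$-fast whenever $2^{1-\lambda}\leq\varepsilon$. The sketch below only tests finitely many pairs $(n,m)$, so it can refute fastness but cannot prove it; the function names are hypothetical.

```python
from fractions import Fraction


def folner_set(n: int) -> set:
    """Illustrative dyadic Folner set F_n = {0, ..., 2^n - 1} in Z."""
    return set(range(2 ** n))


def invariance_ratio(F: set, g: int) -> Fraction:
    """|F delta (g + F)| / |F|, computed exactly."""
    return Fraction(len(F ^ {g + x for x in F}), len(F))


def is_fast_on_range(lam: int, eps: Fraction, n_max: int, extra: int) -> bool:
    """Check the (lam, eps)-fast condition for 1 <= n <= n_max and
    n + lam <= m <= n + lam + extra. A finite check only."""
    for n in range(1, n_max + 1):
        for m in range(n + lam, n + lam + extra + 1):
            F_m = folner_set(m)
            if any(invariance_ratio(F_m, g) >= eps for g in folner_set(n)):
                return False
    return True
```

With $\varepsilon=1/4$ the check passes for $\lambda=3$ and fails for $\lambda=2$, where the pair $n=1$, $m=3$, $g=1$ gives ratio exactly $1/4$.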

Any Følner sequence can be refined (that is, passed to a subsequence) so as to become $(\lambda,\varepsilon)$-fast, as the next proposition records.

Proposition 18.

Given λ\lambda\in\mathbb{N} and ε>0\varepsilon>0, any Følner sequence can be refined into a (λ,ε)(\lambda,\varepsilon)-fast Følner sequence.

Proof.

It suffices to produce a (1,ε)(1,\varepsilon)-fast refinement. For simplicity, we only state the argument for the case where GG is countable and discrete.

Suppose we have already selected the first jj Følner sets in our refinement Fn1,,FnjF_{n_{1}},\ldots,F_{n_{j}}. Then, take Fnj+1F_{n_{j+1}} to be the next element of the sequence (Fn)(F_{n}) after njn_{j} such that, for all gi=1jFnig\in\bigcup_{i=1}^{j}F_{n_{i}}, |Fnj+1ΔgFnj+1|/|Fnj+1|<ε|F_{n_{j+1}}\Delta gF_{n_{j+1}}|/|F_{n_{j+1}}|<\varepsilon. Such a term exists since (Fn)(F_{n}) is a Følner sequence. ∎
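The greedy selection in this proof can be sketched directly. The example below assumes $G=\mathbb{Z}$ and $F_{n}=\{0,\ldots,n-1\}$, starts the refinement at $F_{1}$, and uses hypothetical function names; it returns the indices $n_{1}<n_{2}<\cdots$ of a $(1,\varepsilon)$-fast subsequence.

```python
from fractions import Fraction


def folner_set(n: int) -> set:
    """Illustrative Folner set F_n = {0, ..., n-1} in Z."""
    return set(range(n))


def invariance_ratio(F: set, g: int) -> Fraction:
    """|F delta (g + F)| / |F|, computed exactly."""
    return Fraction(len(F ^ {g + x for x in F}), len(F))


def refine(eps: Fraction, count: int) -> list:
    """Indices of a (1, eps)-fast refinement, chosen greedily as in the proof.

    Each new set must be eps-invariant under every element of the union
    of the previously selected sets.
    """
    indices = [1]
    seen = set(folner_set(1))          # union of the sets selected so far
    n = 1
    while len(indices) < count:
        n += 1
        F = folner_set(n)
        if all(invariance_ratio(F, g) < eps for g in seen):
            indices.append(n)
            seen |= F
    return indices
```

For $\varepsilon=1/2$ the first five selected indices are $1, 2, 5, 17, 65$: after selecting $F_{n_{j}}$, the next index must exceed $4(n_{j}-1)$.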

Less obvious is the relationship between a Følner sequence being fast and the property of being tempered which is used in Lindenstrauss’s pointwise ergodic theorem [20], although they are somewhat similar in spirit. Nonetheless, we quickly observe that the previous proposition indicates that any tempered Følner sequence can be refined into a Følner sequence which is both tempered and (λ,ε)(\lambda,\varepsilon)-fast, simply by the fact that any subsequence of a tempered Følner sequence is again tempered.

2. The Main Theorem

Frequently in ergodic theory, one argues that if KNK\gg N, then AKANxAKxA_{K}A_{N}x\approx A_{K}x. The following lemma makes this precise in terms of the modulus β\beta.

Lemma 19.

Let $(\mathcal{B},\|\cdot\|)$ be a normed vector space. Let $G$ be a lcsc amenable group with Følner sequence $(F_{n})$, and let $G$ act Borel strongly on $\mathcal{B}$ via the representation $\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B})$. Fix $N\in\mathbb{N}$ and $\eta>0$. Let $\beta$ be a Følner convergence modulus for $(F_{n})$ and suppose $K\geq\beta(N,\eta)$. Then for any $x\in\mathcal{B}$, $\|A_{K}x-A_{K}A_{N}x\|<3\eta\|x\|$. (If $G$ is countable discrete, the “strongly Borel” part is trivially satisfied, and we have the sharper estimate $\|A_{K}x-A_{K}A_{N}x\|<\eta\|x\|$.)

Proof.

From the definition of Følner convergence modulus, we know that there exists an $F_{N}^{\prime}\subset F_{N}$ such that $|F_{N}\backslash F_{N}^{\prime}|<|F_{N}|\eta$ and such that for all $h\in F_{N}^{\prime}$, $|F_{K}\Delta hF_{K}|<|F_{K}|\eta$. Now perform the following computation (justification for each step addressed below):

\begin{align*}
\|A_{K}x-A_{K}A_{N}x\| &:=\left\|\fint_{F_{K}}\pi(g^{-1})xdg-\fint_{F_{K}}\pi(g^{-1})\left(\fint_{F_{N}}\pi(h^{-1})xdh\right)dg\right\|\\
&=\left\|\fint_{F_{K}}\pi(g^{-1})xdg-\fint_{F_{K}}\left(\fint_{F_{N}}\pi(g^{-1})(\pi(h^{-1})x)dh\right)dg\right\|\\
&=\left\|\fint_{F_{K}}\pi(g^{-1})xdg-\fint_{F_{K}}\left(\fint_{F_{N}}\pi((hg)^{-1})xdh\right)dg\right\|\\
&=\left\|\fint_{F_{N}}\left(\fint_{F_{K}}\pi(g^{-1})xdg\right)dh-\fint_{F_{N}}\left(\fint_{F_{K}}\pi((hg)^{-1})xdg\right)dh\right\|\\
&\leq\fint_{F_{N}}\left\|\fint_{F_{K}}\pi(g^{-1})xdg-\fint_{F_{K}}\pi((hg)^{-1})xdg\right\|dh\\
&=\fint_{F_{N}}\left\|\fint_{F_{K}}\pi(g^{-1})xdg-\fint_{hF_{K}}\pi(g^{-1})xdg\right\|dh\\
&\leq\fint_{F_{N}}\left(\frac{1}{|F_{K}|}\int_{F_{K}\Delta hF_{K}}\|\pi(g^{-1})x\|dg\right)dh\\
&\leq\fint_{F_{N}}\left(\frac{1}{|F_{K}|}\int_{F_{K}\Delta hF_{K}}\|x\|dg\right)dh\\
&=\fint_{F_{N}}\frac{1}{|F_{K}|}\left(|F_{K}\Delta hF_{K}|\|x\|\right)dh\\
&<\frac{1}{|F_{N}|}\left[\int_{F_{N}^{\prime}}\eta\|x\|dh+\int_{F_{N}\backslash F_{N}^{\prime}}\left(\frac{1}{|F_{K}|}|F_{K}\Delta hF_{K}|\|x\|\right)dh\right]\\
&\leq\eta\|x\|+\frac{1}{|F_{N}|}\int_{F_{N}\backslash F_{N}^{\prime}}\left(2\|x\|\right)dh\leq 3\eta\|x\|.
\end{align*}

If GG is countable discrete, we instead assume that for all hFNh\in F_{N} (rather than FNF_{N}^{\prime}), |FKΔhFK|<|FK|η|F_{K}\Delta hF_{K}|<|F_{K}|\eta. Therefore, the penultimate line reduces to 1|FN|FNηx𝑑h\frac{1}{|F_{N}|}\int_{F_{N}}\eta\|x\|dh, and the last line reduces to ηx\eta\|x\|.

Finally let’s discuss which properties of the Bochner integral we had to use. Thanks to Proposition 9, we have that $g\mapsto 1_{F_{K}}\pi(g^{-1})x$ is Bochner integrable. Given that, for each $g$, $\pi(g)$ is a bounded linear operator, it follows that $\pi(g^{-1})\left(\int\pi(h^{-1})xdh\right)=\int\pi(g^{-1})\pi(h^{-1})xdh$. From Fubini’s theorem and the fact (also from Proposition 9) that $(h,g)\mapsto 1_{F_{N}\times F_{K}}\pi((hg)^{-1})x$ is Bochner integrable, we see that $\int_{F_{K}}\int_{F_{N}}\pi((hg)^{-1})xdhdg=\int_{F_{N}}\int_{F_{K}}\pi((hg)^{-1})xdgdh$. Lastly, we repeatedly invoked the fact that $\|\int_{A}f(g)dg\|\leq\int_{A}\|f(g)\|dg$. It’s worth noting that in the case where $G$ is countable discrete, only the first fact (that $G$ acts by bounded linear operators with unit norm) is needed as an assumption, as the latter two properties hold trivially for finite averages. ∎
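The discrete case of the lemma can be checked numerically in a toy example. The sketch below assumes $G=\mathbb{Z}$ acting on $\mathbb{C}^{2}$ by diagonal rotations, $F_{n}=\{0,\ldots,n-1\}$, and the modulus $\beta(n,\varepsilon)=\lfloor 2(n-1)/\varepsilon\rfloor+1$, which is valid for these intervals when $\varepsilon\leq 2$. The action, the angles, and the function names are illustrative assumptions.

```python
import cmath
import math
from fractions import Fraction

ANGLES = (0.7, 2.0)  # hypothetical rotation angles defining the action of Z on C^2


def act(g, x):
    """pi(g) x: rotate each coordinate by g times its angle (an isometry)."""
    return [cmath.exp(1j * theta * g) * xi for theta, xi in zip(ANGLES, x)]


def norm(x):
    """Euclidean norm on C^2."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))


def average(n, x):
    """A_n x = (1/n) * sum over g in {0, ..., n-1} of pi(-g) x."""
    total = [0j] * len(x)
    for g in range(n):
        total = [t + y for t, y in zip(total, act(-g, x))]
    return [t / n for t in total]


def beta(n, eps):
    """Folner convergence modulus for the intervals {0, ..., n-1} (eps a Fraction, eps <= 2)."""
    return int(Fraction(2 * (n - 1)) // eps) + 1


def lemma19_gap(N, K, x):
    """The quantity ||A_K x - A_K A_N x|| estimated by the lemma."""
    a_k_x = average(K, x)
    a_k_a_n_x = average(K, average(N, x))
    return norm([p - q for p, q in zip(a_k_x, a_k_a_n_x)])
```

With `N = 5` and `eta = Fraction(1, 10)` one gets `K = beta(N, eta) = 81`, and the computed gap can be compared against `float(eta) * norm(x)`.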

Remark.

It is possible to generalise this argument to the case where the action of GG is “power bounded” in the sense that there is some uniform constant CC such that for (dgdg-almost) all gGg\in G, π(g)C\|\pi(g)\|\leq C. However the argument for our main theorem necessitates setting C=1C=1.

The following argument is a generalisation of a proof of Garrett Birkhoff [2] to the amenable setting. The statement of the theorem is weaker than results which are already in Greenleaf’s article [11], but we include the argument for several reasons. One is that it is very short; another is that we will ultimately derive a bound on ε\varepsilon-fluctuations via a modification of this proof; and finally, the proof indicates additional information about the limiting behaviour of the norm of AnxA_{n}x, namely that limnAnx=infnAnx\lim_{n}\|A_{n}x\|=\inf_{n}\|A_{n}x\|.

Theorem 20.

Let GG be a lcsc amenable group with compact Følner sequence (Fn)(F_{n}), and let \mathcal{B} be a uniformly convex Banach space such that GG acts Borel strongly on \mathcal{B} via the representation π:G1(,)\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B}). Then for every xx\in\mathcal{B}, the sequence of averages (Anx)(A_{n}x) converges in norm \|\cdot\|_{\mathcal{B}}.

Proof.

Without loss of generality, we assume $\|x\|\leq 1$ (otherwise, simply replace $x$ with $x/\|x\|$). Define $L:=\inf_{n}\|A_{n}x\|$. Fix arbitrary $\varepsilon_{0}>0$ and $\delta>0$, and let $N$ be some index such that $\|A_{N}x\|<L+\varepsilon_{0}$. Let $u$ denote the modulus of uniform convexity. Fix a Følner convergence modulus $\beta(n,\varepsilon)$ for $(F_{n})$, and let $\eta>0$ be arbitrary. Suppose $M\geq\beta(N,\eta/(3\|x\|))$ is an index such that $\|A_{N}x-A_{M}x\|>\delta$. (If no such $M$ exists then this means that after $\beta(N,\eta/(3\|x\|))$, every average lies within $\delta$ of $A_{N}x$.) Then, from uniform convexity, we know that

12(ANx+AMx)max{ANx,AMx}u(δ).\left\|\frac{1}{2}(A_{N}x+A_{M}x)\right\|\leq\max\{\|A_{N}x\|,\|A_{M}x\|\}-u(\delta).

Additionally, it follows from Lemma 19 that AMxAMANx<η\|A_{M}x-A_{M}A_{N}x\|<\eta, and thus

12(ANx+AMx)<max{ANx,AMANx+η}u(δ).\left\|\frac{1}{2}(A_{N}x+A_{M}x)\right\|<\max\{\|A_{N}x\|,\|A_{M}A_{N}x\|+\eta\}-u(\delta).

But AMANxANx\|A_{M}A_{N}x\|\leq\|A_{N}x\|, so this implies

12(ANx+AMx)<ANx+ηu(δ).\left\|\frac{1}{2}(A_{N}x+A_{M}x)\right\|<\|A_{N}x\|+\eta-u(\delta).

In turn, we know that ANx<L+ε0\|A_{N}x\|<L+\varepsilon_{0}, so

12(ANx+AMx)<L+ε0+ηu(δ).\left\|\frac{1}{2}(A_{N}x+A_{M}x)\right\|<L+\varepsilon_{0}+\eta-u(\delta).

In fact, it follows that $\|\frac{1}{2}A_{K}(A_{N}x+A_{M}x)\|<L+\varepsilon_{0}+\eta-u(\delta)$ also, for any index $K$, since $\|A_{K}\|\leq 1$. Now, choosing $K\geq\max\{\beta(N,\eta/(3\|x\|)),\beta(M,\eta/(3\|x\|))\}$, we have that both $\|A_{K}x-A_{K}A_{N}x\|<\eta$ and $\|A_{K}x-A_{K}A_{M}x\|<\eta$. Thus,

\begin{align*}
\|A_{K}x\| &=\left\|\frac{1}{2}(A_{K}x-A_{K}A_{N}x)+\frac{1}{2}(A_{K}x-A_{K}A_{M}x)+\frac{1}{2}(A_{K}A_{N}x+A_{K}A_{M}x)\right\|\\
&\leq\eta+\left\|\frac{1}{2}A_{K}(A_{N}x+A_{M}x)\right\|\\
&<2\eta+L+\varepsilon_{0}-u(\delta).
\end{align*}

Since η\eta can be chosen to be arbitrarily small (this merely implies that KK and MM are very large), we see that lim supKAKxL+ε0u(δ)\limsup_{K}\|A_{K}x\|\leq L+\varepsilon_{0}-u(\delta). But since our choice of ε0\varepsilon_{0} was arbitrary, it follows that in fact lim supKAKxL\limsup_{K}\|A_{K}x\|\leq L.

Moreover this implies that $(A_{n}x)$ converges in norm. For if this were not the case, then we could find some $\delta_{0}>0$ such that $\|A_{n}x-A_{m}x\|>\delta_{0}$ infinitely often; in fact, there must be some $\delta_{0}>0$ such that $\|A_{n}x-A_{m}x\|>\delta_{0}$ infinitely often, with the further restriction that $m\geq\beta(n,\eta/(3\|x\|))$. (Here we are invoking the fact that a sequence in a metric space converges iff, for any $\beta:\mathbb{N}\rightarrow\mathbb{N}$ with $\beta(n)>n$, the sequence contains only finitely many $\delta$-fluctuations at distance $\beta$, for each $\delta>0$; cf. the discussion of this equivalence in the remark immediately following Definition 1.) Now, pick $\eta$ and $\varepsilon_{0}$ small enough that $2\eta+\varepsilon_{0}<u(\delta_{0})$, and pick $n$ and $m$ sufficiently large that $\|A_{n}x\|,\|A_{m}x\|<L+\varepsilon_{0}$ (which we can do since $\limsup_{K}\|A_{K}x\|\leq L$), and such that $\|A_{n}x-A_{m}x\|>\delta_{0}$ and $m\geq\beta(n,\eta/(3\|x\|))$. The computation above shows that for $K$ larger than $\beta(m,\eta/(3\|x\|))$ and $\beta(n,\eta/(3\|x\|))$ we have that $\|A_{K}x\|<2\eta+L+\varepsilon_{0}-u(\delta_{0})$; but since we chose $2\eta+\varepsilon_{0}<u(\delta_{0})$, this implies that $\|A_{K}x\|<L$, which contradicts the definition of $L$. ∎

We now proceed to deriving a quantitative analogue of this result. We first do so “at distance β\beta”, and then recover a global bound in the case where the Følner sequence is fast (in the sense of Definition 17). The only really non-explicit part of the proof of the preceding theorem was the step where we used the fact that an infimum of a real sequence exists. Therefore the principal innovation in what follows is the avoidance of the direct invocation of this fact.

Fix a nondecreasing Følner convergence modulus $\beta$ for $(F_{n})$. Initially let us consider the case where $\|x\|\leq 1$. Suppose that $\|A_{n_{0}}x-A_{n_{1}}x\|\geq\varepsilon$. Moreover, we pick some $\eta<u(\varepsilon)/2$, and suppose that $n_{1}\geq\beta(n_{0},\eta/(3\|x\|))$. Then the computation from the previous proof shows that

\[\left\|\frac{1}{2}(A_{n_{0}}x+A_{n_{1}}x)\right\|<\|A_{n_{0}}x\|+\eta-u(\varepsilon).\]

More generally, if $\|A_{n_{i}}x-A_{n_{i+1}}x\|\geq\varepsilon$ with $n_{i+1}\geq\beta(n_{i},\eta/(3\|x\|))$, it follows that

\[\left\|\frac{1}{2}(A_{n_{i}}x+A_{n_{i+1}}x)\right\|<\|A_{n_{i}}x\|+\eta-u(\varepsilon).\]

Now, choosing kmax{β(ni+1,η/(3x)),β(ni,η/(3x))}k\geq\max\{\beta(n_{i+1},\eta/(3\|x\|)),\beta(n_{i},\eta/(3\|x\|))\}, we have that both AkxAkAnix<η\|A_{k}x-A_{k}A_{n_{i}}x\|<\eta and AkxAkAni+1x<η\|A_{k}x-A_{k}A_{n_{i+1}}x\|<\eta. Thus,

Akx\displaystyle\|A_{k}x\| =12(AkxAkAnix)+12(AkxAkAni+1x)+12(AkAnix+AkAni+1x)\displaystyle=\left\|\frac{1}{2}(A_{k}x-A_{k}A_{n_{i}}x)+\frac{1}{2}(A_{k}x-A_{k}A_{n_{i+1}}x)+\frac{1}{2}(A_{k}A_{n_{i}}x+A_{k}A_{n_{i+1}}x)\right\|
η+12Ak(Anix+Ani+1x)\displaystyle\leq\eta+\left\|\frac{1}{2}A_{k}(A_{n_{i}}x+A_{n_{i+1}}x)\right\|
<2η+Anixu(ε).\displaystyle<2\eta+\|A_{n_{i}}x\|-u(\varepsilon).

Therefore let $n_{i+2}$ equal the least index greater than $\max\{\beta(n_{i+1},\eta/(3\|x\|)),\beta(n_{i},\eta/(3\|x\|))\}$ (equivalently, greater than $\beta(n_{i+1},\eta/(3\|x\|))$, since $\beta$ is nondecreasing in $n$) such that $\|A_{n_{i+1}}x-A_{n_{i+2}}x\|\geq\varepsilon$. The previous calculation shows that $\|A_{n_{i+2}}x\|<\|A_{n_{i}}x\|+2\eta-u(\varepsilon)$. More generally, we have that

\begin{align*}
\|A_{n_{i}}x\| &<\|A_{n_{0}}x\|-\frac{i}{2}\left(u(\varepsilon)-2\eta\right) && i\geq 2\text{ even},\\
\|A_{n_{i}}x\| &<\|A_{n_{1}}x\|-\frac{i-1}{2}\left(u(\varepsilon)-2\eta\right) && i\geq 3\text{ odd}.
\end{align*}

So simply from the fact that Anix0\|A_{n_{i}}x\|\geq 0, these expressions derive a contradiction on the least ii such that

max{An0x,An1x}<i12(u(ε)2η)\max\left\{\|A_{n_{0}}x\|,\|A_{n_{1}}x\|\right\}<\frac{i-1}{2}(u(\varepsilon)-2\eta)

since this would imply that Ani(x)<0\|A_{n_{i}}(x)\|<0. That is, the contradiction implies that the nin_{i}th epsilon fluctuation could not have occurred. We have no a priori information on the values of An0x\|A_{n_{0}}x\| and An1x\|A_{n_{1}}x\|, except that both are at most x\|x\|. Therefore, we have the following uniform bound:

ix12u(ε)η+1i\leq\left\lfloor\frac{\|x\|}{\frac{1}{2}u(\varepsilon)-\eta}+1\right\rfloor

where ii tracks the indices of the subsequence along which ε\varepsilon-fluctuations occur. This is actually one more than the number of ε\varepsilon-fluctuations, so instead we have that the number of ε\varepsilon-fluctuations is bounded by x12u(ε)η\left\lfloor\frac{\|x\|}{\frac{1}{2}u(\varepsilon)-\eta}\right\rfloor.

If we happen to have any lower bound on the infimum of Anx\|A_{n}x\|, we can sharpen the previous calculation. Instead of using the fact that for all nn, Anx0\|A_{n}x\|\geq 0, we use the fact that AnxL\|A_{n}x\|\geq L for some LL. To wit, if ii is large enough that

x<i12(u(ε)2η)+L\|x\|<\frac{i-1}{2}(u(\varepsilon)-2\eta)+L

then this would imply that Anix<L\|A_{n_{i}}x\|<L, a contradiction. Therefore we have the bound

ixL12u(ε)η+1i\leq\left\lfloor\frac{\|x\|-L}{\frac{1}{2}u(\varepsilon)-\eta}+1\right\rfloor

and so the number of ε\varepsilon-fluctuations is bounded by xL12u(ε)η\left\lfloor\frac{\|x\|-L}{\frac{1}{2}u(\varepsilon)-\eta}\right\rfloor.

In the case where $\|x\|>1$, a small modification must be made. We can make the substitution $x^{\prime}=x/\|x\|$, so that $\|A_{n_{i}}x^{\prime}-A_{n_{i+1}}x^{\prime}\|\geq\varepsilon/\|x\|$, and conclude (provided that $\eta<\frac{1}{2}u(\varepsilon/\|x\|)$) that $(A_{n}x^{\prime})$ has at most $\left\lfloor\frac{\|x^{\prime}\|}{\frac{1}{2}u(\varepsilon/\|x\|)-\eta}\right\rfloor$ $\varepsilon/\|x\|$-fluctuations at distance $\beta(n,\eta/(3\|x^{\prime}\|))$, or rather at most $\left\lfloor\frac{1}{\frac{1}{2}u(\varepsilon/\|x\|)-\eta}\right\rfloor$ $\varepsilon/\|x\|$-fluctuations at distance $\beta(n,\eta/3)$ since $\|x^{\prime}\|=1$. But since $\|A_{n_{i}}x^{\prime}-A_{n_{i+1}}x^{\prime}\|\geq\varepsilon/\|x\|$ iff $\|A_{n_{i}}x-A_{n_{i+1}}x\|\geq\varepsilon$, this actually tells us that $(A_{n}x)$ has at most $\left\lfloor\frac{1}{\frac{1}{2}u(\varepsilon/\|x\|)-\eta}\right\rfloor$ $\varepsilon$-fluctuations at distance $\beta(n,\eta/3)$. Likewise, if we happen to have a lower bound $L$ on the infimum of $\|A_{n}x\|$, since $\inf_{n}\|A_{n}x\|\geq L$ iff $\inf_{n}\|A_{n}x^{\prime}\|\geq L/\|x\|$ we have that $(A_{n}x)$ has at most $\left\lfloor\frac{1-L/\|x\|}{\frac{1}{2}u(\varepsilon/\|x\|)-\eta}\right\rfloor$ $\varepsilon$-fluctuations at distance $\beta(n,\eta/3)$.

To summarise, we have shown that:

Theorem 21.

(“Main theorem”) Let \mathcal{B} be a uniformly convex Banach space with modulus uu. Fix ε>0\varepsilon>0 and xx\in\mathcal{B}. Pick some η>0\eta>0 such that η<12u(ε)\eta<\frac{1}{2}u(\varepsilon) if x1\|x\|\leq 1, or η<12u(ε/x)\eta<\frac{1}{2}u(\varepsilon/\|x\|) if x>1\|x\|>1. Then if GG is a lcsc amenable group that acts Borel strongly on \mathcal{B} via the representation π:G1(,)\pi:G\rightarrow\mathcal{L}_{1}(\mathcal{B},\mathcal{B}), with Følner sequence (Fn)(F_{n}), the sequence (Anx)(A_{n}x) has at most x12u(ε)η\left\lfloor\frac{\|x\|}{\frac{1}{2}u(\varepsilon)-\eta}\right\rfloor ε\varepsilon-fluctuations at distance β(n,η/3x)\beta(n,\eta/3\|x\|) if x1\|x\|\leq 1, or at most 112u(ε/x)η\left\lfloor\frac{1}{\frac{1}{2}u(\varepsilon/\|x\|)-\eta}\right\rfloor ε\varepsilon-fluctuations at distance β(n,η/3)\beta(n,\eta/3) if x>1\|x\|>1. If we know that infAnxL\inf\|A_{n}x\|\geq L, then we can sharpen the bound to xL12u(ε)η\left\lfloor\frac{\|x\|-L}{\frac{1}{2}u(\varepsilon)-\eta}\right\rfloor in the case where x1\|x\|\leq 1, or 1L/x12u(ε/x)η\left\lfloor\frac{1-L/\|x\|}{\frac{1}{2}u(\varepsilon/\|x\|)-\eta}\right\rfloor in the case where x>1\|x\|>1.
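The bounds in the theorem are simple enough to evaluate mechanically once a modulus $u$ is known. The following sketch takes $u$ as a caller-supplied function; the function names are hypothetical, and the quadratic `u_demo` is a stand-in chosen only to make the arithmetic exact, not a claim about any particular space.

```python
import math


def fluctuation_bound(norm_x, eps, eta, u, lower=0.0):
    """Bound on the number of eps-fluctuations at distance beta.

    norm_x : the norm of x
    u      : modulus of uniform convexity, supplied by the caller
    lower  : a known lower bound L on inf ||A_n x|| (0 if none is known)
    """
    if norm_x <= 1:
        half_u = u(eps) / 2
        numerator = norm_x - lower
    else:
        # Rescale to x / ||x||, as in the case ||x|| > 1 of the theorem.
        half_u = u(eps / norm_x) / 2
        numerator = 1 - lower / norm_x
    if not 0 < eta < half_u:
        raise ValueError("eta must satisfy 0 < eta < u(.)/2")
    return math.floor(numerator / (half_u - eta))


def u_demo(t):
    """Hypothetical modulus used only for illustration."""
    return t * t / 8
```

For instance, with `u_demo`, $\varepsilon=1$, $\eta=1/32$ and $\|x\|=1$, the denominator is $1/16-1/32=1/32$, so the bound is $32$; knowing $L=1/2$ halves it to $16$.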

Corollary 22.

In the above setting, suppose that (Fn)(F_{n}) is (λ,η/3x)(\lambda,\eta/3\|x\|)-fast if x1\|x\|\leq 1, or (λ,η/3)(\lambda,\eta/3)-fast if x>1\|x\|>1 respectively. Then the sequence (Anx)(A_{n}x) has at most λx12u(ε)η+λ\lambda\cdot\left\lfloor\frac{\|x\|}{\frac{1}{2}u(\varepsilon)-\eta}\right\rfloor+\lambda ε\varepsilon-fluctuations (resp. λ112u(ε/x)η+λ\lambda\cdot\left\lfloor\frac{1}{\frac{1}{2}u(\varepsilon/\|x\|)-\eta}\right\rfloor+\lambda ε\varepsilon-fluctuations).

Proof.

We know from the theorem that, if x1\|x\|\leq 1, there are at most x12u(ε)η\left\lfloor\frac{\|x\|}{\frac{1}{2}u(\varepsilon)-\eta}\right\rfloor ε\varepsilon-fluctuations at distance λ\lambda. This leaves the possibility that there are some ε\varepsilon-fluctuations in the x12u(ε)η\left\lfloor\frac{\|x\|}{\frac{1}{2}u(\varepsilon)-\eta}\right\rfloor many gaps of width λ\lambda, and also that there are some ε\varepsilon-fluctuations in between the last possible index nin_{i} given by the previous theorem, and the index ni+1n_{i+1} at which contradiction is achieved. This last interval at the end is at most λ\lambda wide as well. The reasoning for the x>1\|x\|>1 case is identical. ∎

Example 23.

We mention two special cases of interest in the above setting. For brevity we restrict ourselves to the case where x1\|x\|\leq 1 and L=0L=0, but the reader can make the obvious substitutions.

(1) In the case where \mathcal{B} is pp-uniformly convex with constant KK, then in this setting (Anx)(A_{n}x) has at most λxK2εp+1η+λ\lambda\cdot\left\lfloor\frac{\|x\|}{\frac{K}{2}\varepsilon^{p+1}-\eta}\right\rfloor+\lambda ε\varepsilon-fluctuations.

(2) In the case where =Lp(μ)\mathcal{B}=L^{p}(\mu) for p[2,)p\in[2,\infty), then in this setting (Anx)(A_{n}x) has at most λxε4ε4(1(ε2)p)1/pη+λ\lambda\cdot\left\lfloor\frac{\|x\|}{\frac{\varepsilon}{4}-\frac{\varepsilon}{4}\left(1-\left(\frac{\varepsilon}{2}\right)^{p}\right)^{1/p}-\eta}\right\rfloor+\lambda ε\varepsilon-fluctuations.
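To get a feel for the size of these bounds, the expression in case (2) can be evaluated numerically. The sketch below transcribes the denominator $\frac{\varepsilon}{4}-\frac{\varepsilon}{4}\left(1-\left(\frac{\varepsilon}{2}\right)^{p}\right)^{1/p}-\eta$ from the text and the shape $\lambda\lfloor\cdot\rfloor+\lambda$ from Corollary 22; the function names and the sample parameter values are assumptions for illustration.

```python
import math


def lp_half_modulus(eps: float, p: float) -> float:
    """The quantity eps/4 - (eps/4) * (1 - (eps/2)^p)^(1/p) from case (2), for p >= 2."""
    if not 0 < eps <= 2:
        raise ValueError("eps must lie in (0, 2]")
    return eps / 4 - (eps / 4) * (1 - (eps / 2) ** p) ** (1 / p)


def fast_fluctuation_bound(lam: int, norm_x: float, half_u: float, eta: float) -> int:
    """lam * floor(||x|| / (half_u - eta)) + lam, for ||x|| <= 1 and L = 0."""
    if not 0 < eta < half_u:
        raise ValueError("eta must satisfy 0 < eta < half_u")
    return lam * math.floor(norm_x / (half_u - eta)) + lam


if __name__ == "__main__":
    half_u = lp_half_modulus(1.0, 2.0)
    print(fast_fluctuation_bound(3, 1.0, half_u, 0.01))
```

Even for $p=2$, $\varepsilon=1$ and $\eta=0.01$ the denominator is only about $0.0235$, so the bound grows quickly as $\varepsilon$ shrinks.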

3. Discussion

The prospect of obtaining quantitative convergence information for ergodic averages via a modification of Garrett Birkhoff’s argument [2] has been previously explored, in the \mathbb{Z}-action setting, by Kohlenbach and Leuştean [17] and subsequently by Avigad and Rute [1].

Our proof was carried out in the setting where the acted upon space was assumed to be uniformly convex, and indeed our bound on the number of fluctuations explicitly depends on the modulus of uniform convexity. Nonetheless, it is natural to ask whether an analogous result might be obtained for a more general class of acted upon spaces.

However, it has already been observed, in the case where G=G=\mathbb{Z}, that there exists a separable, reflexive, and strictly convex Banach space \mathcal{B} such that for every NN and ε>0\varepsilon>0, there exists an xx\in\mathcal{B} such that (Anx)(A_{n}x) has at least NN ε\varepsilon-fluctuations [1]. This counterexample applies equally to bounds on the rate of metastability (for more on metastability and its relationship with fluctuation bounds, see for instance [1, Section 5]). However, this counterexample does not directly eliminate the possibility of a fluctuation bound for =L1(X,μ)\mathcal{B}=L^{1}(X,\mu), so the question of a “quantitative L1L^{1} mean ergodic theorem for amenable groups” remains unresolved.

What about the choice of acting group? Our assumptions on $G$ (amenable and countable discrete or lcsc) were selected because this is the most general class of groups which have Følner sequences. Our argument depends essentially on Følner sequences; indeed, proofs of ergodic theorems for actions of non-amenable groups have a qualitatively different structure. Remarkably, there are certain classes of non-amenable groups whose associated ergodic theorems have much stronger convergence behaviour than the classical ($G=\mathbb{Z}$) setting; for recent progress on quantitative ergodic theorems in the non-amenable setting, we refer the reader to the book and survey article of Gorodnik and Nevo [9, 10].

We should also mention quantitative bounds for pointwise ergodic theorems. For $G=\mathbb{Z}$ such results go as far back as Bishop’s upcrossing inequality [3]. Inequalities of this type have also been found for $\mathbb{Z}^{d}$ by Kalikow and Weiss (for $F_{n}=[-n,n]^{d}$) [16]; more recently Moriakov has modified the Kalikow and Weiss argument to give an upcrossing inequality for symmetric ball averages in groups of polynomial growth [21], and (simultaneously with the preparation of this article) Gabor [8] has extended this strategy to give an upcrossing inequality for countable discrete amenable groups with Følner sequences satisfying a strengthening of Lindenstrauss’s temperedness condition.

For both norm and pointwise convergence of ergodic averages, it is sometimes possible to deduce convergence behaviour which is stronger than $\varepsilon$-fluctuations and/or upcrossings but weaker than an explicit rate of convergence, namely that there exists an appropriate variational inequality. Jones et al. have succeeded in proving numerous variational inequalities, both for norm and pointwise convergence, for a large class of Følner sequences in $\mathbb{Z}$ and $\mathbb{Z}^{d}$ [14, 15]. (In particular, the result proved in the present article is known not to be sharp when specialised to the case where $G=\mathbb{Z}^{d}$ and $\mathcal{B}=L^{p}$, due to these existing results.) However, their methods, which rely on a martingale comparison estimate and a Calderón-Zygmund decomposition argument, exploit numerous incidental geometric properties of $\mathbb{Z}^{d}$ which do not hold for many other groups. It would be interesting to determine which other groups enjoy similar variational inequalities.

Acknowledgements

The author thanks his advisor Jeremy Avigad, under whom this work was completed, for his guidance and support. The author also thanks Clinton Conley, Yves Cornulier, and Henry Towsner for helpful discussions. In addition, the author thanks the anonymous referee, whose comments led to a number of improvements to the exposition of the article.

References

  • [1] Jeremy Avigad and Jason Rute, Oscillation and the mean ergodic theorem for uniformly convex Banach spaces, Ergodic Theory and Dynamical Systems 35 (2015), no. 4, 1009–1027.
  • [2] Garrett Birkhoff, The mean ergodic theorem, Duke Mathematical Journal 5 (1939), no. 1, 19–20.
  • [3] Errett Bishop, A constructive ergodic theorem, Journal of Mathematics and Mechanics 17 (1968), no. 7, 631–639.
  • [4] Matteo Cavaleri, Computability of Følner sets, International Journal of Algebra and Computation 27 (2017), no. 07, 819–830.
  • [5] Matteo Cavaleri, Følner functions and the generic word problem for finitely generated amenable groups, Journal of Algebra 511 (2018), 388–404.
  • [6] James A. Clarkson, Uniformly convex spaces, Transactions of the American Mathematical Society 40 (1936), no. 3, 396–414.
  • [7] Manfred Einsiedler and Thomas Ward, Ergodic theory: with a view towards number theory, vol. 259, Springer, 2010.
  • [8] Uri Gabor, Fluctuations of ergodic averages for amenable group actions, arXiv preprint arXiv:1902.07912 (2019).
  • [9] Alexander Gorodnik and Amos Nevo, The ergodic theory of lattice subgroups, Princeton University Press, 2009.
  • [10] Alexander Gorodnik and Amos Nevo, Quantitative ergodic theorems and their number-theoretic applications, Bulletin of the American Mathematical Society 52 (2015), no. 1, 65–113.
  • [11] Frederick P. Greenleaf, Ergodic theorems and the construction of summing sequences in amenable locally compact groups, Communications on Pure and Applied Mathematics 26 (1973), no. 1, 29–46.
  • [12] Olof Hanner, On the uniform convexity of $L^{p}$ and $l^{p}$, Arkiv för Matematik 3 (1956), no. 3, 239–244.
  • [13] Tuomas Hytönen, Jan van Neerven, Mark Veraar, and Lutz Weis, Analysis in Banach spaces: Volume I: Martingales and Littlewood-Paley theory, vol. 63, Springer, 2016.
  • [14] Roger L. Jones, Robert Kaufman, Joseph M. Rosenblatt, and Máté Wierdl, Oscillation in ergodic theory, Ergodic Theory and Dynamical Systems 18 (1998), no. 4, 889–935.
  • [15] Roger L. Jones, Joseph M. Rosenblatt, and Máté Wierdl, Oscillation in ergodic theory: higher dimensional results, Israel Journal of Mathematics 135 (2003), no. 1, 1–27.
  • [16] Steven Kalikow and Benjamin Weiss, Fluctuations of ergodic averages, Illinois Journal of Mathematics 43 (1999), no. 3, 480–488.
  • [17] Ulrich Kohlenbach and Laurenţiu Leuştean, A quantitative mean ergodic theorem for uniformly convex Banach spaces, Ergodic Theory and Dynamical Systems 29 (2009), no. 6, 1907–1915.
  • [18] Ulrich Kohlenbach and Pavol Safarik, Fluctuations, effective learnability and metastability in analysis, Annals of Pure and Applied Logic 165 (2014), no. 1, 266–304.
  • [19] Ulrich Krengel, On the speed of convergence in the ergodic theorem, Monatshefte für Mathematik 86 (1978), no. 1, 3–6.
  • [20] Elon Lindenstrauss, Pointwise theorems for amenable groups, Inventiones Mathematicae 146 (2001), no. 2, 259–295.
  • [21] Nikita Moriakov, Fluctuations of ergodic averages for actions of groups of polynomial growth, Studia Mathematica 240 (2018), no. 3, 255–273.
  • [22] Nikita Moriakov, On effective Birkhoff’s ergodic theorem for computable actions of amenable groups, Theory of Computing Systems 62 (2018), no. 5, 1269–1287.
  • [23] Donald S. Ornstein and Benjamin Weiss, Entropy and isomorphism theorems for actions of amenable groups, Journal d’Analyse Mathématique 48 (1987), no. 1, 1–141.