
Multiple ergodic averages for variable polynomials

Abstract.

In this paper we study multiple ergodic averages for “good” variable polynomials. In particular, under an additional assumption, we show that these averages converge to the expected limit, making progress related to an open problem posed by Frantzikinakis ([13, Problem 10]). These general convergence results imply several variable extensions of classical recurrence, combinatorial and number theoretical results, which are presented as well.

Key words and phrases:
Variable polynomial sequences, multiple ergodic averages, multiple recurrence, characteristic factors, equidistribution, nilmanifolds, Hardy fields, sublinear functions.
1991 Mathematics Subject Classification:
Primary: 37A44; Secondary: 37A05, 11B25, 11B83, 05D10.

Andreas Koutsogiannis

Aristotle University of Thessaloniki

Department of Mathematics

Thessaloniki, 54124, Greece


Dedicated to the loving memory of Aris Deligiannis, a great mentor.


(Communicated by Zhiren Wang)

1. Introduction

The study of multiple ergodic averages along polynomials dates back to 1977. Furstenberg, exploiting the $L^{2}$ limiting behavior (all the limits in this article are taken with respect to the $L^{2}$ norm, unless otherwise stated), as $N\to\infty,$ of

\[\frac{1}{N}\sum_{n=1}^{N}T^{n}f_{1}\cdot T^{2n}f_{2}\cdot\ldots\cdot T^{\ell n}f_{\ell}, \tag{1}\]

where $\ell\in\mathbb{N},$ $(X,\mathcal{B},\mu,T)$ is a measure preserving system [Footnote 1: I.e., $T:X\to X$ is an invertible measure preserving transformation on a standard Borel probability space $(X,\mathcal{B},\mu).$] and $f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu),$ provided (in [16]) a purely ergodic theoretic proof of Szemerédi’s theorem: every subset of natural numbers of positive upper density [Footnote 2: For a set $A\subseteq\mathbb{N}$ we define its upper density, $\bar{d}(A),$ as $\bar{d}(A):=\limsup_{N}N^{-1}\cdot|A\cap\{1,\ldots,N\}|.$] contains arbitrarily long arithmetic progressions (a result that can be immediately obtained by combining Theorem 2.3 with Theorem 2.4 below).
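The upper density in the footnote above is a limsup and cannot be computed exactly from finitely many terms. As a rough illustration only, the following sketch evaluates the finite ratio $N^{-1}\cdot|A\cap\{1,\ldots,N\}|$ for two sets chosen here as examples (they are not taken from the paper): the multiples of 3, whose density is $1/3$, and the perfect squares, whose density is $0$.

```python
import math


def density_up_to(predicate, N):
    """Return |A ∩ {1,...,N}| / N for the set A = {k : predicate(k)}."""
    return sum(1 for k in range(1, N + 1) if predicate(k)) / N


def is_multiple_of_three(k):
    return k % 3 == 0


def is_perfect_square(k):
    r = math.isqrt(k)
    return r * r == k


# Finite-N proxies only; the true upper density is a limsup over N.
d_multiples = density_up_to(is_multiple_of_three, 30000)
d_squares = density_up_to(is_perfect_square, 30000)
```

Szemerédi’s theorem applies to the first set (positive upper density) but gives no information about the second.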

A polynomial $p\in\mathbb{Q}[t]$ is an integer polynomial if $p(\mathbb{Z})\subseteq\mathbb{Z}.$ It was Bergelson who first viewed the iterates $n,2n,\ldots,\ell n$ in (1) as linear “distinct enough” integer polynomials. The integer polynomials $p_{1},\ldots,p_{\ell},$ $\ell\in\mathbb{N},$ are essentially distinct if $p_{i},$ $p_{i}-p_{j}$ are non-constant for all $i\neq j.$ Bergelson studied (initially in [2]), via the use of van der Corput’s lemma, a crucial tool in “reducing the complexity” of the iterates, averages of the form

\[\frac{1}{N}\sum_{n=1}^{N}T^{p_{1}(n)}f_{1}\cdot\ldots\cdot T^{p_{\ell}(n)}f_{\ell}, \tag{2}\]

for essentially distinct integer polynomials $p_{1},\ldots,p_{\ell}$; this study eventually led to multidimensional polynomial extensions of Szemerédi’s theorem (see [5]).

Bergelson and Leibman conjectured (in [4]) that multiple ergodic averages of the form

\[\frac{1}{N}\sum_{n=1}^{N}T_{1}^{p_{1}(n)}f_{1}\cdot\ldots\cdot T_{\ell}^{p_{\ell}(n)}f_{\ell}, \tag{3}\]

in any system, for multiple commuting $T_{i}$’s (i.e., $T_{i}T_{j}=T_{j}T_{i}$ for all $i,j$) and arbitrary integer polynomials $p_{i}$, always have a limit (as $N\to\infty$). This conjecture was answered in the positive by Walsh, who actually showed it in greater generality (see [27]). No specific expression of the limit was provided by the method. [Footnote 3: The conjecture corresponding to that of Bergelson and Leibman about iterates which are integer parts of real polynomials was shown in [23].]

A natural question is under which conditions, either on the polynomials or on the system, we can explicitly find the limit of the aforementioned expressions. In particular, one asks whether we can find families of polynomials for which we have convergence in a general system to a specific expression (see more about the “expected” limit below); then we can get a number of interesting applications, e.g., find the corresponding arithmetic configurations in “large” subsets of integers. For instance, showing that the characteristic factor coincides with the nilfactor of the system, and exploiting the equidistribution property of the corresponding polynomial sequence in nilmanifolds (all these notions will be defined later), Frantzikinakis proved (in [12]) that the expression

\[\frac{1}{N}\sum_{n=1}^{N}T^{[p(n)]}f_{1}\cdot T^{2[p(n)]}f_{2}\cdot\ldots\cdot T^{\ell[p(n)]}f_{\ell}, \tag{4}\]

where $p\in\mathbb{R}[t]$ with $p(t)\neq cq(t)+d,$ $c,d\in\mathbb{R},$ $q\in\mathbb{Q}[t],$ has the same limit (as $N\to\infty$), in any system, as (1), thus obtaining a refinement of Szemerédi’s theorem. [Footnote 4: Such polynomials $p$ have the property that for every $\lambda\in\mathbb{R}\setminus\{0\},$ $\lambda p$ has at least one non-constant irrational coefficient, which is exactly the case (via Weyl’s criterion) when the corresponding sequence $(\lambda p(n))_{n}$ is equidistributed.]

Generalizing the condition to multiple polynomials, following Frantzikinakis’ approach, Karageorgos and the author showed (in [20]) that for strongly independent real polynomials $p_{1},\ldots,p_{\ell}$ (i.e., any non-trivial linear combination of the $p_{i}$’s with scalars from $\mathbb{R}$ has at least one non-constant irrational coefficient) the expression

\[\frac{1}{N}\sum_{n=1}^{N}T^{[p_{1}(n)]}f_{1}\cdot\ldots\cdot T^{[p_{\ell}(n)]}f_{\ell}, \tag{5}\]

has the “expected” limit. In order to explain what is meant by “expected” limit, we need to recall the ergodicity and weakly mixing notions. $T$ is ergodic if $T^{-1}A=A$ implies $\mu(A)\in\{0,1\}$; $T$ is weakly mixing if $T\times T$ is ergodic. Here, by “expected” limit we mean, in case $T$ is ergodic, that the limit is equal to $\prod_{i=1}^{\ell}\int f_{i}\;d\mu,$ whereas, in the general case, it is equal to $\prod_{i=1}^{\ell}\mathbb{E}(f_{i}|\mathcal{I}(T)),$ where $\mathbb{E}(f_{i}|\mathcal{I}(T))$ is the conditional expectation of $f_{i}$ with respect to the $\sigma$-algebra of the $T$-invariant sets (notice here the connection to independence in probability). Furstenberg showed in [16] that for a weakly mixing $T,$ (1) converges to the expected limit; under the same assumption on $T$, Bergelson showed in [2] that (2) converges to the same limit as well.
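The $\ell=1$ case of the “expected” limit can be observed numerically. The sketch below is an illustration under assumptions chosen here, not an example from the paper: the system is the rotation $Tx=x+\theta \pmod 1$ on $[0,1)$ with Lebesgue measure and $\theta=\sqrt{2}$ (an ergodic system), and $f(x)=\cos^{2}(2\pi x)$, so that $\int f\;d\mu=1/2.$ It evaluates the average at a single point, which is enough for this continuous $f$, whereas the limits in the paper are taken in $L^{2}.$ Rotations are not weakly mixing, so the sketch says nothing about products with $\ell\geq 2.$

```python
import math


def birkhoff_average(f, x, theta, N):
    """(1/N) * sum_{n=1}^{N} f(T^n x) for the rotation T x = x + theta (mod 1)."""
    return sum(f((x + n * theta) % 1.0) for n in range(1, N + 1)) / N


def f(x):
    # cos^2(2*pi*x) = (1 + cos(4*pi*x)) / 2, so its integral over [0, 1) is 1/2.
    return math.cos(2 * math.pi * x) ** 2


theta = math.sqrt(2)  # irrational rotation number (assumption of this sketch)
avg = birkhoff_average(f, 0.1, theta, 20000)
expected = 0.5
```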

We extend the distinctness property of the sequences of iterates of (5) to sequences of real variable polynomials:

Definition 1.1 ([13]).

The sequence $(p_{N})_{N},$ where $p_{N}\in\mathbb{R}[t],$ $N\in\mathbb{N},$ is good if the polynomials have bounded degree and for every non-zero $\alpha\in\mathbb{R}$ we have

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}e^{ip_{N}(n)\alpha}=0. \tag{6}\]

The sequence of $\ell$-tuples of variable polynomials $(p_{1,N},\ldots,p_{\ell,N})_{N},$ where $p_{i,N}\in\mathbb{R}[t],$ $N\in\mathbb{N},$ $1\leq i\leq\ell,$ is good if every non-trivial linear combination of the sequences $(p_{1,N})_{N},\ldots,(p_{\ell,N})_{N}$ is good.

Example 1 ([13]).

For $\ell=2,$ the pair $(p_{1,N},p_{2,N})_{N},$ where $p_{1,N}(n)=n/N^{a},$ $p_{2,N}(n)=n/N^{b},$ $N,n\in\mathbb{N},$ $0<a<b<1,$ is good.

Example 2 ([13]).

For $\ell\in\mathbb{N},$ the $\ell$-tuple $(p_{1,N},\ldots,p_{\ell,N})_{N},$ where $p_{i,N}(n)=n^{i}/N^{a},$ $1\leq i\leq\ell,$ $N,n\in\mathbb{N},$ $0<a<1,$ is good.
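Condition (6) can be explored numerically for the sequences of Example 1. The following is a minimal sketch with parameter values chosen here for illustration ($a=0.5$, and $a=0.3,$ $b=0.6$ for one linear combination, with $\alpha=1$ and $N=10^{5}$); a finite $N$ can only suggest the limiting behavior. For contrast it also evaluates the boundary case $p_{N}(n)=n/N$ (that is, $a=1$, excluded from Example 1), where the average approaches $\int_{0}^{1}e^{it}\,dt\neq 0.$

```python
import cmath


def exp_average(p, N, alpha):
    """(1/N) * sum_{n=1}^{N} exp(i * p(N, n) * alpha), the average in (6)."""
    return sum(cmath.exp(1j * p(N, n) * alpha) for n in range(1, N + 1)) / N


def p_good(N, n):
    return n / N ** 0.5  # Example 1 with a = 0.5 (illustrative choice)


def p_combination(N, n):
    # A non-trivial linear combination with a = 0.3, b = 0.6 (illustrative choice).
    return 2 * n / N ** 0.3 - 3 * n / N ** 0.6


def p_boundary(N, n):
    return n / N  # a = 1: not covered by Example 1


N = 100_000
good_val = abs(exp_average(p_good, N, 1.0))
combination_val = abs(exp_average(p_combination, N, 1.0))
boundary_val = abs(exp_average(p_boundary, N, 1.0))
```

The first two values are small because the sums are geometric with ratio $e^{i\theta_{N}}$, where $N\theta_{N}\to\infty$; in the boundary case $N\theta_{N}$ stays bounded and no cancellation occurs.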

For the class of good polynomial sequences, Frantzikinakis stated the following problem:

Problem 1 (Problem 10, [13]).

Let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be a good $\ell$-tuple of variable polynomials. Is it true that, for every ergodic system $(X,\mathcal{B},\mu,T)$ and functions $f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu),$ we have

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{[p_{1,N}(n)]}f_{1}\cdot\ldots\cdot T^{[p_{\ell,N}(n)]}f_{\ell}=\int f_{1}\;d\mu\cdot\ldots\cdot\int f_{\ell}\;d\mu\;?\]

As stated in [13], Problem 1 is interesting even in the special cases of Examples 1 and 2.

Since (4) has the same limit as (1) for $p\in\mathbb{R}[t]$ with $p(t)\neq cq(t)+d,$ $c,d\in\mathbb{R},$ $q\in\mathbb{Q}[t]$ (this follows from [12, Theorem 2.2]), one is led to the following problem, which is a natural generalization of Frantzikinakis’ result to good variable polynomials:

Problem 2.

Let $(p_{N})_{N}$ be a good polynomial sequence. Is it true that, for every $\ell\in\mathbb{N},$ system $(X,\mathcal{B},\mu,T),$ and $f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu),$ we have

\[\begin{split}&\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{[p_{N}(n)]}f_{1}\cdot T^{2[p_{N}(n)]}f_{2}\cdot\ldots\cdot T^{\ell[p_{N}(n)]}f_{\ell}\\ =&\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{n}f_{1}\cdot T^{2n}f_{2}\cdot\ldots\cdot T^{\ell n}f_{\ell}\;?\end{split}\]

As mentioned in [13], the $\ell=1$ case of Problem 1 (which also coincides with the $\ell=1$ case of Problem 2) can be easily obtained by using the spectral theorem. For general $\ell\in\mathbb{N},$ we make progress towards the solution of both Problems 1 and 2. In particular, under some additional assumptions on the coefficients of the good variable polynomials, we show two general results: Theorems 2.1 and 2.2. In this introductory section, we will present an easier application of each of them, which still covers both Examples 1 and 2.

To this end, we first recall the set of sublinear logarithmico-exponential Hardy field functions (of polynomial degree 0) which converge (as $x\to\infty$) to ($\pm$) infinity: [Footnote 5: Let $R$ be the collection of equivalence classes of real valued functions defined on some halfline $(c,\infty),$ $c\geq 0,$ where two functions that agree eventually are identified. These classes are called germs of functions. A Hardy field is a subfield of the ring $(R,+,\cdot)$ that is closed under differentiation. Here, we use the word function when we refer to elements of $R$ (understanding that all the operations defined and statements made for elements of $R$ are considered only for sufficiently large $x\in\mathbb{R}$). We say that $g$ is a logarithmico-exponential Hardy field function, and we write $g\in\mathcal{LE},$ if it belongs to a Hardy field of real valued functions and it is defined on some $(c,+\infty),$ $c\geq 0,$ by a finite combination of symbols $+,-,\times,\div,\sqrt[n]{\cdot},\exp,\log$ acting on the real variable $x$ and on real constants. For more on Hardy field functions, see [10, 12, 18].]

\[\mathcal{SLE}:=\{g\in\mathcal{LE}:\;1\prec g(x)\prec x\}\]

(we write $g_{2}\prec g_{1}$ if $|g_{1}(x)|/|g_{2}(x)|\to\infty$ as $x\to\infty$). Next, we define an appropriate set of coefficients: For $g_{i}\in\mathcal{SLE},$ $1\leq i\leq l,$ $l\in\mathbb{N},$ with $g_{l}\prec\ldots\prec g_{1},$ [Footnote 6: The different growth relation between the $g_{i}$’s, $1\leq i\leq l,$ is postulated to avoid cases such as $(N+1)^{-1/2}-N^{-1/2}\sim N^{-3/2},$ since $g(x)=x^{3/2}$ is not sublinear (here we write $g_{2}\sim g_{1}$ if $g_{1}(x)/g_{2}(x)\to c\in\mathbb{R}\setminus\{0\}$).] let $\mathcal{C}(g_{1},\ldots,g_{l})$ be the set of all linear combinations of reciprocals of the $g_{i}$’s, i.e.,

\[\mathcal{C}(g_{1},\ldots,g_{l}):=\left\{\sum_{i=1}^{l}\frac{\rho_{i}}{g_{i}}:\;\rho_{i}\in\mathbb{R}\right\}.\]

Extending the definition from [20], we say that the sequence of $\ell$-tuples of variable polynomials $(p_{1,N},\ldots,p_{\ell,N})_{N},$ where for each $1\leq i\leq\ell,$ $p_{i,N}$ has the form:

\[\begin{split}&\quad p_{i,N}(n)=a_{i,d_{i},N}n^{d_{i}}+\dots+a_{i,1,N}n+a_{i,0,N},\\ &\text{with}\;\;(a_{i,0,N})_{N}\;\;\text{bounded, and}\;\;a_{i,j,\cdot}\in\mathcal{C}(g_{1},\ldots,g_{l}),\;1\leq j\leq d_{i},\end{split} \tag{7}\]

is strongly independent if for any $(\lambda_{1},\ldots,\lambda_{\ell})\in\mathbb{R}^{\ell}\setminus\{\vec{0}\}$ we have that $\sum_{i=1}^{\ell}\lambda_{i}p_{i,N}(n)$ is a non-constant polynomial in $n$. For example, the following triple of variable polynomials is strongly independent:

\[\left(\left(\frac{\sqrt{2}}{g_{3}(N)}+\frac{1}{g_{4}(N)}\right)n^{3}-\frac{31}{g_{5}(N)}n+1,\;\frac{1}{g_{3}(N)}n^{3},\;\left(\frac{\sqrt{3}}{g_{1}(N)}-\frac{17}{g_{2}(N)}\right)n^{2}\right)_{N},\]

where $g_{5}(x):=\log x\prec g_{4}(x):=\log x\cdot\log\log x\prec g_{3}(x):=x^{1/2}\prec g_{2}(x):=x^{\pi/4}\log^{3/2}x\prec g_{1}(x):=x^{e/3}.$ [Footnote 7: If $p_{1,N}(n):=\Big(\frac{\sqrt{2}}{g_{3}(N)}+\frac{1}{g_{4}(N)}\Big)n^{3}-\frac{31}{g_{5}(N)}n+1,$ $p_{2,N}(n):=\frac{1}{g_{3}(N)}n^{3},$ and $p_{3,N}(n):=\Big(\frac{\sqrt{3}}{g_{1}(N)}-\frac{17}{g_{2}(N)}\Big)n^{2},$ we have that $\lambda_{1}p_{1,N}+\lambda_{2}p_{2,N}+\lambda_{3}p_{3,N}$ is constant only when $\lambda_{1}=\lambda_{2}=\lambda_{3}=0.$]

Regarding Problem 1, i.e., multiple variable polynomial sequences, we have the following result:

Theorem 1.2.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be a strongly independent $\ell$-tuple of polynomials as in (7). Then, for every ergodic system $(X,\mathcal{B},\mu,T)$ and $f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu),$ we have

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{[p_{1,N}(n)]}f_{1}\cdot\ldots\cdot T^{[p_{\ell,N}(n)]}f_{\ell}=\int f_{1}\;d\mu\cdot\ldots\cdot\int f_{\ell}\;d\mu.\]
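The strong independence hypothesis of Theorem 1.2 can be checked mechanically for polynomials of the form (7). The sketch below rests on a reading that is an assumption of this example rather than a statement from the paper: because the $g_{k}$ have pairwise different growth, the functions $n^{j}/g_{k}(N)$ ($j\geq 1$) are treated as linearly independent, so each $p_{i,N}$ (ignoring its constant term) is identified with its vector of real coefficients $\rho$, and the tuple is strongly independent exactly when these vectors are linearly independent. The helper names `rank` and `is_strongly_independent` are hypothetical. The test data is the triple from the introduction, written in the basis $(n^{3}/g_{3},\,n^{3}/g_{4},\,n/g_{5},\,n^{2}/g_{1},\,n^{2}/g_{2}).$

```python
import math


def rank(rows, tol=1e-9):
    """Rank of a small real matrix via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rk, len(m)), key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


def is_strongly_independent(coefficient_rows):
    """True if no non-trivial real combination of the rows vanishes."""
    return rank(coefficient_rows) == len(coefficient_rows)


# Coefficient vectors of the triple from the introduction (constant terms dropped).
p1 = [math.sqrt(2), 1.0, -31.0, 0.0, 0.0]
p2 = [1.0, 0.0, 0.0, 0.0, 0.0]
p3 = [0.0, 0.0, 0.0, math.sqrt(3), -17.0]
triple_ok = is_strongly_independent([p1, p2, p3])

# A dependent family for contrast: the third row is p1 - 2 * p2.
dependent = [p1, p2, [a - 2 * b for a, b in zip(p1, p2)]]
dependent_ok = is_strongly_independent(dependent)
```

Floating-point elimination with a tolerance is adequate for small examples like this one; exact arithmetic would be needed to certify independence in borderline cases.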

Regarding Problem 2, our result is the following theorem:

Theorem 1.3.

Let $(p_{N})_{N}$ be a polynomial sequence as in (7) with $p_{N}(n)$ non-constant in $n$. Then, for every $\ell\in\mathbb{N},$ system $(X,\mathcal{B},\mu,T),$ and $f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu),$ we have

\[\begin{split}&\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{[p_{N}(n)]}f_{1}\cdot T^{2[p_{N}(n)]}f_{2}\cdot\ldots\cdot T^{\ell[p_{N}(n)]}f_{\ell}\\ =&\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{n}f_{1}\cdot T^{2n}f_{2}\cdot\ldots\cdot T^{\ell n}f_{\ell}.\end{split}\]

Very few convergence results for averages with polynomial iterates, in which we can explicitly find the limit, exist. Results for variable polynomials are even scarcer. We conclude this introduction by mentioning some of them. Kifer ([21]) studied multiple averages for variable polynomials of the form $p_{i,N}(n)=p_{i}(n)+q_{i}(N),$ with $p_{i}$’s essentially distinct and $q_{i}(\mathbb{Z})\subseteq\mathbb{Z},$ for a weakly mixing transformation $T.$ Similarly, for more general polynomials, Kifer studied averages for transformations that are strongly mixing “enough”. Finally, Frantzikinakis (in [12]) found characteristic factors (see Definition 4.1) for averages with variable polynomial iterates $p_{i,N},$ with leading coefficients independent of $N$. It is the arguments from this article ([12]) that we will adapt in order to find characteristic factors for the averages appearing in Theorems 1.2 and 1.3 as well, which is one of the two main ingredients of the proof (the second one is the equidistribution of particular sequences, for which we adapt arguments from [11]).

Notation

We denote by $\mathbb{N}=\{1,2,\ldots\},$ $\mathbb{Z},$ $\mathbb{Q},$ $\mathbb{R},$ and $\mathbb{C}$ the sets of natural, integer, rational, real and complex numbers respectively. For a function $f:X\to\mathbb{C}$ on a space $X$ with a transformation $T:X\to X,$ we denote by $Tf$ the composition $f\circ T.$ For $s\in\mathbb{N},$ $\mathbb{T}^{s}=\mathbb{R}^{s}/\mathbb{Z}^{s}$ denotes the $s$ dimensional torus. For $a,b\geq 0$ we write $a\ll b,$ if there exists $C>0$ such that $a\leq Cb.$

2. Main results and applications

Here we will state our most general results and some applications. For the proofs, we follow [10] and [20], adapting the corresponding arguments to the variable polynomial case.

We first cover Problem 1 for a subclass of good polynomial sequences:

Theorem 2.1.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be a good and super nice [Footnote 8: The “super niceness” property is rather technical and will be defined in Section 4.] $\ell$-tuple of polynomials. Then, for every ergodic system $(X,\mathcal{B},\mu,T)$ and $f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu),$ we have

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{[p_{1,N}(n)]}f_{1}\cdot\ldots\cdot T^{[p_{\ell,N}(n)]}f_{\ell}=\int f_{1}\;d\mu\cdot\ldots\cdot\int f_{\ell}\;d\mu. \tag{8}\]

We also cover the following case of Problem 2:

Theorem 2.2.

Let $(p_{N})_{N}\subseteq\mathbb{R}[t]$ be a good polynomial sequence such that, for all $\ell\in\mathbb{N},$ $(p_{N},2p_{N},\ldots,\ell p_{N})_{N}$ is super nice. Then, for every $\ell\in\mathbb{N},$ system $(X,\mathcal{B},\mu,T),$ and $f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu),$ we have

\[\begin{split}&\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{[p_{N}(n)]}f_{1}\cdot T^{2[p_{N}(n)]}f_{2}\cdot\ldots\cdot T^{\ell[p_{N}(n)]}f_{\ell}\\ =&\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{n}f_{1}\cdot T^{2n}f_{2}\cdot\ldots\cdot T^{\ell n}f_{\ell}.\end{split} \tag{9}\]

We will show that Theorem 2.1 implies Theorem 1.2 (resp. Theorem 2.2 implies Theorem 1.3), and that it holds for any polynomial family $\{p_{1},\ldots,p_{\ell}\}$ (resp. $\{p,2p,\ldots,\ell p\}$ for Theorem 2.2) which is independent of $N$ and for which non-trivial linear combinations of its members satisfy (6). In particular, it generalizes [20, Theorem 2.1] for strongly independent polynomials (the same is true for Theorem 2.2 for the single polynomial case).

The approach we follow to show these results is similar to the one in [12, 20], with a few extra twists. Namely, one has to find the characteristic factors of (8) and (9), and show some equidistribution results in nilmanifolds. The “super niceness” property (Definition 4.8) will be introduced so we can deal with the former, while the “goodness” property (Definition 1.1) implies the latter.

As was mentioned in the previous section, the ergodicity assumption in Theorem 2.1 can be dropped. [Footnote 9: The limit in this case is equal to $\prod_{i=1}^{\ell}\mathbb{E}(f_{i}|\mathcal{I}(T)).$] Indeed, if $\mu=\int\mu_{t}\;d\lambda(t)$ denotes the ergodic decomposition of $\mu,$ it suffices to show that if $\mathbb{E}(f_{i}|\mathcal{I}(T))=0$ for some $i,$ then the averages converge to $0.$ Since $\mathbb{E}(f_{i}|\mathcal{I}(T))=0,$ we have that $\int f_{i}\;d\mu_{t}=0$ for $\lambda$-a.e. $t.$ By (8), we have that the averages go to $0$ in $L^{2}(\mu_{t})$ for $\lambda$-a.e. $t,$ hence the limit is equal to $0$ in $L^{2}(\mu)$ by the Dominated Convergence Theorem. Hence, the theorems hold for any system; their strong nature is also reflected in the fact that they have immediate recurrence and combinatorial implications which we discuss next.

2.1. Single sequence consequences

We first deal with a single variable polynomial sequence, assuming the validity of Theorem 2.2.

2.1.1. Recurrence

The following theorem due to Furstenberg will help us obtain recurrence results:

Theorem 2.3 (Furstenberg Multiple Recurrence Theorem,  [16]).

Let $(X,\mathcal{B},\mu,T)$ be a system. Then, for every $\ell\in\mathbb{N}$ and every set $A\in\mathcal{B}$ with $\mu(A)>0,$ we have

\[\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu(A\cap T^{-n}A\cap T^{-2n}A\cap\ldots\cap T^{-\ell n}A)>0.\]
Remark 1.

As we mentioned before, the liminf in the expression of Theorem 2.3 is actually a limit.

Combining Theorem 2.2 and Theorem 2.3 (with the choices $f_{i}:=1_{A}$), we get the following corollary:

Corollary 1.

Let $(p_{N})_{N}\subseteq\mathbb{R}[t]$ be as in Theorem 2.2. Then, for every $\ell\in\mathbb{N},$ every system $(X,\mathcal{B},\mu,T),$ and every set $A\in\mathcal{B}$ with $\mu(A)>0,$ we have

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu\left(A\cap T^{-[p_{N}(n)]}A\cap T^{-2[p_{N}(n)]}A\cap\ldots\cap T^{-\ell[p_{N}(n)]}A\right)>0.\]

2.1.2. Combinatorics

Via Furstenberg’s Correspondence Principle, one gets combinatorial results from recurrence ones. We present here a reformulation of this principle from [1].

Theorem 2.4 (Furstenberg Correspondence Principle, [16][1]).

Let $E$ be a subset of integers. There exist a system $(X,\mathcal{B},\mu,T)$ and a set $A\in\mathcal{B}$ with $\mu(A)=\bar{d}(E)$ such that

\[\bar{d}(E\cap(E-n_{1})\cap\ldots\cap(E-n_{\ell}))\geq\mu(A\cap T^{-n_{1}}A\cap\ldots\cap T^{-n_{\ell}}A) \tag{10}\]

for every $\ell\in\mathbb{N}$ and $n_{1},\ldots,n_{\ell}\in\mathbb{Z}.$

Using Corollary 1 and Theorem 2.4, we have the following combinatorial result:

Corollary 2.

Let $(p_{N})_{N}\subseteq\mathbb{R}[t]$ be as in Theorem 2.2. Then, for every $\ell\in\mathbb{N}$ and every set $E\subseteq\mathbb{N}$ with $\bar{d}(E)>0,$ we have

\[\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bar{d}\left(E\cap(E-[p_{N}(n)])\cap(E-2[p_{N}(n)])\cap\ldots\cap(E-\ell[p_{N}(n)])\right)>0.\]

Hence, we immediately get the following refinement of Szemerédi’s theorem:

Corollary 3.

Let $(p_{N})_{N}\subseteq\mathbb{R}[t]$ be as in Theorem 2.2. Then, for every $\ell\in\mathbb{N}$, every set $E\subseteq\mathbb{N}$ with $\bar{d}(E)>0$ contains arithmetic progressions of the form:

\[\{m,m+[p_{N}(n)],m+2[p_{N}(n)],\ldots,m+\ell[p_{N}(n)]\},\]

for some $m\in\mathbb{Z},$ $N\in\mathbb{N},$ and $1\leq n\leq N,$ with $[p_{N}(n)]\neq 0.$
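The shape of the progressions in Corollary 3 can be made concrete with a brute-force search. The sketch below is only an illustration under assumptions chosen here: $E$ is a finite truncation of the multiples of 3 (a set of upper density $1/3$), the sequence is $p_{N}(n)=n/N^{1/2}$ (of the form (7) with $g(x)=x^{1/2}$), $[\cdot]$ is implemented as the floor, and the function name `find_progression` is hypothetical. A finite search shows what is being looked for; it does not prove the corollary.

```python
import math


def find_progression(E, ell, p, max_N):
    """Search for (m, N, n, d) with 1 <= n <= N <= max_N, d = floor(p(N, n)) != 0,
    and m, m + d, ..., m + ell * d all contained in the finite set E."""
    candidates = sorted(E)
    for N in range(1, max_N + 1):
        for n in range(1, N + 1):
            d = math.floor(p(N, n))
            if d == 0:
                continue
            for m in candidates:
                if all(m + j * d in E for j in range(ell + 1)):
                    return m, N, n, d
    return None


# Finite truncation of the multiples of 3 (illustrative choice of E).
E = {k for k in range(1, 200) if k % 3 == 0}
result = find_progression(E, ell=3, p=lambda N, n: n / math.sqrt(N), max_N=50)
```

For this particular $E$ the common difference must itself be a multiple of 3, so the search has to wait until $[p_{N}(n)]$ reaches such a value.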

2.2. Multiple sequences consequences

As in Subsection 2.1, assuming the validity of Theorem 2.1, we have various implications for multiple variable polynomial sequences.

2.2.1. Recurrence

Our first recurrence result is the following (we skip the proof as the argument is the same as the one in [11, Theorem 2.8]):

Theorem 2.5.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. If $(X,\mathcal{B},\mu,T)$ is a system and $A_{0},A_{1},\ldots,A_{\ell}\in\mathcal{B}$ such that

\[\mu\left(A_{0}\cap T^{k_{1}}A_{1}\cap\ldots\cap T^{k_{\ell}}A_{\ell}\right)=\alpha>0\]

for some $k_{1},\ldots,k_{\ell}\in\mathbb{Z},$ then

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu\left(A_{0}\cap T^{-[p_{1,N}(n)]}A_{1}\cap\ldots\cap T^{-[p_{\ell,N}(n)]}A_{\ell}\right)\geq\alpha^{\ell+1}.\]

Setting $A_{i}=A$ and $k_{i}=0$ we immediately get the following:

Corollary 4.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. Then, for every system $(X,\mathcal{B},\mu,T)$ and every set $A\in\mathcal{B},$ we have

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\mu\left(A\cap T^{-[p_{1,N}(n)]}A\cap\ldots\cap T^{-[p_{\ell,N}(n)]}A\right)\geq(\mu(A))^{\ell+1}.\]

2.2.2. Combinatorics

Theorem 2.5, via [14, Proposition 3.3], which is a variant of Theorem 2.4 for several sets, implies the following (we are skipping the routine details):

Theorem 2.6.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. If $E_{0},E_{1},\ldots,E_{\ell}\subseteq\mathbb{N}$ are such that

\[\bar{d}(E_{0}\cap(E_{1}+k_{1})\cap\ldots\cap(E_{\ell}+k_{\ell}))=\alpha>0\]

for some $k_{1},\ldots,k_{\ell}\in\mathbb{Z},$ then

\[\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bar{d}(E_{0}\cap(E_{1}-[p_{1,N}(n)])\cap\ldots\cap(E_{\ell}-[p_{\ell,N}(n)]))\geq\alpha^{\ell+1}.\]

Setting $E_{i}=E$ and $k_{i}=0$ in the previous result, we get:

Corollary 5.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. Then, for every set $E\subseteq\mathbb{N},$ we have

\[\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bar{d}(E\cap(E-[p_{1,N}(n)])\cap\ldots\cap(E-[p_{\ell,N}(n)]))\geq(\bar{d}(E))^{\ell+1}.\]

So, we immediately obtain the following combinatorial result:

Corollary 6.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. Then every set $E\subseteq\mathbb{N}$ with $\bar{d}(E)>0$ contains arithmetic configurations of the form

\[\{m,m+[p_{1,N}(n)],m+[p_{2,N}(n)],\ldots,m+[p_{\ell,N}(n)]\},\]

for some $m\in\mathbb{Z},$ $N\in\mathbb{N},$ and $1\leq n\leq N,$ with $[p_{i,N}(n)]\neq 0,$ for all $1\leq i\leq\ell.$

A set $E\subseteq\mathbb{N}$ is called syndetic if finitely many translations of it cover $\mathbb{N}.$ The cardinality of such a set of translations is a syndeticity constant of $E$. Applying Theorem 2.6 to syndetic sets $E_{0},E_{1},\ldots,E_{\ell}$ and $\alpha=(\prod_{i=1}^{\ell}r_{i})^{-1},$ where $r_{i}$ is a syndeticity constant of $E_{i},$ $1\leq i\leq\ell,$ we have:

Corollary 7.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. If $E_{0},E_{1},\ldots,E_{\ell}\subseteq\mathbb{N}$ are syndetic sets, then there exist $m\in\mathbb{Z},$ $N\in\mathbb{N},$ and $1\leq n\leq N,$ with $[p_{i,N}(n)]\neq 0,$ for all $1\leq i\leq\ell,$ such that

\[m\in E_{0},\;m+[p_{1,N}(n)]\in E_{1},\;\ldots,\;m+[p_{\ell,N}(n)]\in E_{\ell}.\]

In particular, for a syndetic set $E\subseteq\mathbb{N}$, setting $E_{i}=c_{i}E:=\{c_{i}n:\;n\in E\},$ $0\leq i\leq\ell,$ where $c_{0},c_{1},\ldots,c_{\ell}\in\mathbb{N},$ Corollary 7 above implies that we can find $x_{0},x_{1},\ldots,x_{\ell}\in E,$ $N\in\mathbb{N},$ and $1\leq n\leq N,$ that solve the system of equations $c_{i}x_{i}-c_{i-1}x_{i-1}=[p_{i,N}(n)],$ $1\leq i\leq\ell.$

2.2.3. Topological dynamics

Let $(X,T)$ be a (topological) dynamical system, i.e., $(X,d)$ is a compact metric space and $T:X\to X$ an invertible continuous transformation. $T$ (and consequently the system) is minimal, if, for all $x\in X,$ we have $\overline{\{T^{n}x:\;n\in\mathbb{N}\}}=X.$

Analogously to [20, Theorem 2.5], we get the following result:

Theorem 2.7.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. If $(X,T)$ is a minimal dynamical system, then, for a residual and $T$-invariant set of $x\in X,$ we have

\[\overline{\Big\{(T^{[p_{1,N}(n)]}x,\ldots,T^{[p_{\ell,N}(n)]}x):\;N\in\mathbb{N},\;1\leq n\leq N\Big\}}=X\times\cdots\times X. \tag{11}\]
Proof.

There exists a $T$-invariant Borel measure which gives positive value to every non-empty open set. So, due to the syndeticity of the orbit of every point, for every $x\in X$ and every non-empty open set $U$ we have

\[\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}1_{U}(T^{n}x)>0. \tag{12}\]

As we mentioned before, Theorem 2.1 implies that

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{[p_{1,N}(n)]}f_{1}\cdot\ldots\cdot T^{[p_{\ell,N}(n)]}f_{\ell}=\prod_{i=1}^{\ell}\mathbb{E}(f_{i}|\mathcal{I}(T)). \tag{13}\]

Since $\mathbb{E}(f_{i}|\mathcal{I}(T))=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}T^{n}f_{i},$ combining (13) with (12), we get for almost every $x\in X$ (hence for a dense set) and every $U_{1},\ldots,U_{\ell}$ from a given countable basis of non-empty open sets that

\[\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}1_{U_{1}}(T^{[p_{1,N}(n)]}x)\cdot\ldots\cdot 1_{U_{\ell}}(T^{[p_{\ell,N}(n)]}x)>0,\]

proving that the set of points that satisfy (11), say $R,$ is dense. To see that $R$ is $G_{\delta},$ take $\ell=1$ (the general case is analogous). Then

\[R=\left\{x\in X:\forall\;m,r\in\mathbb{N},\exists\;N\in\mathbb{N}\;\text{and}\;1\leq n\leq N\;\text{with}\;T^{[p_{1,N}(n)]}x\in B(x_{m},1/r)\right\},\]

where $\{x_{m}:\;m\in\mathbb{N}\}$ is a countable, dense subset of $X$ and $B(x_{m},1/r)$ denotes the open ball centered at $x_{m}$ with radius $1/r.$ The claim now follows since

\[R=\bigcap_{m,r\in\mathbb{N}}\bigcup_{\substack{N\in\mathbb{N},\\ 1\leq n\leq N}}T^{-[p_{1,N}(n)]}B(x_{m},1/r).\]

Since $T^{[p_{i,N}(n)]}(Tx)=T(T^{[p_{i,N}(n)]}x),$ we also get the $T$-invariance of $R$. ∎

Using Zorn’s lemma, we know that every dynamical system has a minimal subsystem. This fact together with Theorem 2.7 implies the following corollary:

Corollary 8.

For $\ell\in\mathbb{N},$ let $(p_{1,N},\ldots,p_{\ell,N})_{N}$ be as in Theorem 2.1. If $(X,T)$ is a dynamical system, then, for a non-empty and $T$-invariant set of $x\in X,$ we have

\[\begin{split}&\overline{\Big\{(T^{[p_{1,N}(n)]}x,\ldots,T^{[p_{\ell,N}(n)]}x):\;N\in\mathbb{N},1\leq n\leq N\Big\}}\\ =&\overline{\{T^{n}x:n\in\mathbb{N}\}}\times\cdots\times\overline{\{T^{n}x:n\in\mathbb{N}\}}.\end{split}\]
Remark 2.

Following the method of [22] (which extended the one from [15]), as it was adapted in [20], the interested reader who is somewhat familiar with the topic can state and prove the convergence results corresponding to Theorems 2.1 and 2.2 along prime numbers (or, for the sake of simplicity, Theorems 1.2 and 1.3), together with the corresponding corollaries, as well as recurrence results along primes shifted by $\pm 1$.

Although this is not trivial, it can be achieved, because the uniformity estimates that allow one to pass from averages along natural numbers to the corresponding ones along primes can be used for variable polynomial iterates of bounded degree (i.e., one can deal with the “good” and “super nice” variable iterates under consideration).

3. Some background material

In this section we collect some background material that will be used for the multiple average case.

3.1. Factors

A homomorphism from a system $(X,\mathcal{B},\mu,T)$ onto a system $(Y,\mathcal{Y},\nu,S)$ is a measurable map $\pi:X^{\prime}\to Y^{\prime}$, where $X^{\prime}$ is a $T$-invariant subset of $X$ and $Y^{\prime}$ is an $S$-invariant subset of $Y$, both of full measure, such that $\mu\circ\pi^{-1}=\nu$ and $S\circ\pi(x)=\pi\circ T(x)$ for $x\in X^{\prime}$. When we have such a homomorphism we say that the system $(Y,\mathcal{Y},\nu,S)$ is a factor of the system $(X,\mathcal{B},\mu,T)$. If the factor map $\pi:X^{\prime}\to Y^{\prime}$ can be chosen to be injective, then we say that the systems $(X,\mathcal{B},\mu,T)$ and $(Y,\mathcal{Y},\nu,S)$ are isomorphic. A factor can also be characterised by $\pi^{-1}(\mathcal{Y})$ which is a $T$-invariant sub-$\sigma$-algebra of $\mathcal{B}$, and, conversely, any $T$-invariant sub-$\sigma$-algebra of $\mathcal{B}$ defines a factor. By abusing the terminology, we denote by the same letter the $\sigma$-algebra $\mathcal{Y}$ and its inverse image by $\pi$, so, if $(Y,\mathcal{Y},\nu,S)$ is a factor of $(X,\mathcal{B},\mu,T)$, we think of $\mathcal{Y}$ as a sub-$\sigma$-algebra of $\mathcal{B}$.

3.1.1. Seminorms

We follow [19] and [6] for the inductive definition of the seminorms ||||||k.\lvert\!|\!|\cdot|\!|\!\rvert_{k}. More specifically, the definition that we use here follows from [19] (in the ergodic case), [6] (in the general case) and the use of von Neumann’s mean ergodic theorem.

Let $(X,\mathcal{B},\mu,T)$ be a system and $f\in L^{\infty}(\mu).$ We define inductively the seminorms $\lvert\!|\!|f|\!|\!\rvert_{k,\mu,T}$ (or just $\lvert\!|\!|f|\!|\!\rvert_{k}$ if there is no room for confusion) as follows: For $k=1$ we set

\[\lvert\!|\!|f|\!|\!\rvert_{1}:=\left\|\mathbb{E}(f|\mathcal{I}(T))\right\|_{2}.\]

Recall that the conditional expectation $\mathbb{E}(f|\mathcal{I}(T))$ satisfies $\int\mathbb{E}(f|\mathcal{I}(T))\;d\mu=\int f\;d\mu$ and $T\mathbb{E}(f|\mathcal{I}(T))=\mathbb{E}(Tf|\mathcal{I}(T)).$

For $k\geq 1$ we let

\[\lvert\!|\!|f|\!|\!\rvert^{2^{k+1}}_{k+1}:=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\lvert\!|\!|\bar{f}\cdot T^{n}f|\!|\!\rvert^{2^{k}}_{k}.\]

All these limits exist and $\lvert\!|\!|\cdot|\!|\!\rvert_{k}$ define seminorms on $L^{\infty}(\mu)$ ([19]). Also, we remark that for all $k\in\mathbb{N}$ we have $\lvert\!|\!|f|\!|\!\rvert_{k}\leq\lvert\!|\!|f|\!|\!\rvert_{k+1}$ and $\lvert\!|\!|f\otimes\bar{f}|\!|\!\rvert_{k,\mu\times\mu,T\times T}\leq\lvert\!|\!|f|\!|\!\rvert^{2}_{k+1,\mu,T}.$
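As a quick illustration of the inductive definition, here is a sketch of how the first two seminorms unpack. It uses an extra assumption that is not part of the definition above: $T$ is taken to be ergodic, so that $\mathbb{E}(g|\mathcal{I}(T))=\int g\;d\mu$ for every $g\in L^{\infty}(\mu)$.

```latex
\begin{align*}
\lvert\!|\!|f|\!|\!\rvert_{1} &= \Big|\int f\;d\mu\Big|,\\
\lvert\!|\!|f|\!|\!\rvert_{2}^{4}
 &= \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\lvert\!|\!|\bar{f}\cdot T^{n}f|\!|\!\rvert_{1}^{2}
  = \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\Big|\int \bar{f}\cdot T^{n}f\;d\mu\Big|^{2}.
\end{align*}
```

Under this assumption, the second seminorm measures the average size of the correlations $\int \bar{f}\cdot T^{n}f\;d\mu$.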

3.1.2. Nilfactors

Using the seminorms defined above, we can construct factors $\mathcal{Z}_{k}=\mathcal{Z}_{k}(T)$ of $X$ characterized by:

\[\text{for }f\in L^{\infty}(\mu),\;\;\;\mathbb{E}(f|\mathcal{Z}_{k-1})=0\text{ if and only if }\lvert\!|\!|f|\!|\!\rvert_{k}=0.\]

The following profound fact from [19] (see also the independent work of [28]) shows that for every kk\in\mathbb{N} the factor 𝒵k\mathcal{Z}_{k} has a purely algebraic structure; approximately, we can assume that it is a kk-step nilsystem (see Subsection 3.2 below for the definitions):

Theorem 3.1 (Structure Theorem, [19, 28]).

Let $(X,\mathcal{B},\mu,T)$ be an ergodic system and $k\in\mathbb{N}$. Then the factor $\mathcal{Z}_{k}(T)$ is an inverse limit of $k$-step nilsystems. (By this we mean that there exist $T$-invariant sub-$\sigma$-algebras $\mathcal{Z}_{k,i}$, $i\in\mathbb{N}$, of $\mathcal{B}$ such that $\mathcal{Z}_{k}=\bigcup_{i\in\mathbb{N}}\mathcal{Z}_{k,i}$ and for every $i\in\mathbb{N}$, the factors induced by the $\sigma$-algebras $\mathcal{Z}_{k,i}$ are isomorphic to $k$-step nilsystems.)

Because of this result, we call 𝒵k\mathcal{Z}_{k} the kk-step nilfactor of the system. The smallest factor that is an extension of all finite step nilfactors is denoted by 𝒵=𝒵(T)\mathcal{Z}=\mathcal{Z}(T), meaning, 𝒵=k𝒵k\mathcal{Z}=\bigvee_{k\in\mathbb{N}}\mathcal{Z}_{k}, and is called the nilfactor of the system. The nilfactor 𝒵\mathcal{Z} is of particular interest because it controls the limiting behaviour in L2(μ)L^{2}(\mu) of the averages in (8) and (9).

3.2. Nilmanifolds

Let $G$ be a $k$-step nilpotent Lie group, meaning $G_{k+1}=\{e\}$ for some $k\in\mathbb{N}$, where $G_{k}=[G,G_{k-1}]$ denotes the $k$-th commutator subgroup, and $\Gamma$ a discrete cocompact subgroup of $G$. The compact homogeneous space $X=G/\Gamma$ is called a $k$-step nilmanifold (or nilmanifold). The group $G$ acts on $G/\Gamma$ by left translations, where the translation by an element $b\in G$ is given by $T_{b}(g\Gamma)=(bg)\Gamma$. We denote by $m_{X}$ the normalized Haar measure on $X$, i.e., the unique probability measure that is invariant under the action of $G$, and by $\mathcal{G}/\Gamma$ the Borel $\sigma$-algebra of $G/\Gamma$. If $b\in G$, we call the system $(G/\Gamma,\mathcal{G}/\Gamma,m_{X},T_{b})$ a $k$-step nilsystem (or nilsystem) and the elements of $G$ nilrotations.

3.2.1. Equidistribution

For a connected and simply connected Lie group G,G, let exp:𝔤G\exp:\mathfrak{g}\to G be the exponential map, where 𝔤\mathfrak{g} is the Lie algebra of GG. For bGb\in G and ss\in\mathbb{R} we define the element bsb^{s} of GG as follows: If X𝔤X\in\mathfrak{g} is such that exp(X)=b\exp(X)=b, then bs=exp(sX)b^{s}=\exp(sX) (this is well defined since under the aforementioned assumptions exp\exp is a bijection).

If $(a(n))_{n}$ is a sequence of real numbers and $X=G/\Gamma$ is a nilmanifold with $G$ connected and simply connected, we say that the sequence $(b^{a(n)}x)_{n}$, $b\in G$, is equidistributed in a subnilmanifold $Y$ of $X$, if for every $F\in C(X)$ we have

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F(b^{a(n)}x)=\int F\;dm_{Y}.\tag{14}\]
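A standard instance of (14), included only as an illustration: take $G=\mathbb{R}$ and $\Gamma=\mathbb{Z}$, so that $X=\mathbb{T}$ is a $1$-step nilmanifold written additively, let $b=\alpha$ be irrational and $a(n)=n$. Then $b^{a(n)}x=x+n\alpha \pmod 1$, and Weyl's equidistribution theorem gives, for every $F\in C(\mathbb{T})$ and every $x\in\mathbb{T}$:

```latex
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F(x+n\alpha)=\int_{\mathbb{T}}F\;dm_{\mathbb{T}},
\qquad \alpha\in\mathbb{R}\setminus\mathbb{Q},
\]
% i.e. (x+n\alpha)_n is equidistributed in Y = X = \mathbb{T}.
```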

For the following claims, one can check the linear case in [25] ([25, Section 2], and in particular the theorem in [25, Subsection 2.17], together with [25, Theorem 2.19]) which covers the \mathbb{Z}-actions case, and [26] for the analogous result for \mathbb{R}-actions. A nilrotation bGb\in G is ergodic (or acts ergodically) on XX, if the sequence (bnΓ)n(b^{n}\Gamma)_{n} is dense in X.X. If bGb\in G is ergodic, then for every xXx\in X the sequence (bnx)n(b^{n}x)_{n} is equidistributed in XX. The orbit closure Z:=(bnΓ)¯nZ:=\overline{(b^{n}\Gamma)}_{n} of bGb\in G has the structure of a nilmanifold with (bnΓ)n(b^{n}\Gamma)_{n} being equidistributed in ZZ. Analogously, if GG is connected and simply connected, then W:=(bsΓ)¯sW:=\overline{(b^{s}\Gamma)}_{s\in\mathbb{R}} is a nilmanifold with (bsΓ)s(b^{s}\Gamma)_{s\in\mathbb{R}} being equidistributed in WW.

3.2.2. Change of base point formula

Let X=G/ΓX=G/\Gamma be a nilmanifold. As mentioned before, for every bGb\in G the sequence (bnΓ)n(b^{n}\Gamma)_{n} is equidistributed in Xb:={bnΓ:n}¯X_{b}:=\overline{\{b^{n}\Gamma:n\in\mathbb{N}\}}. Using the identity bng=g(g1bg)nb^{n}g=g(g^{-1}bg)^{n} we see that the nil-orbit (bngΓ)n(b^{n}g\Gamma)_{n} is equidistributed in the set gXg1bggX_{g^{-1}bg}. A similar formula holds when GG is connected and simply connected, where we replace the nn\in\mathbb{N} with ss\in\mathbb{R} and the nilmanifold XbX_{b} with Yb:={bsΓ:s}¯Y_{b}:=\overline{\{b^{s}\Gamma:\;s\in\mathbb{R}\}}.
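For completeness, the identity used above can be checked directly; the following short computation is a sketch of that verification.

```latex
\begin{align*}
(g^{-1}bg)^{n} &= g^{-1}b\,(gg^{-1})\,b\,(gg^{-1})\cdots b\,g = g^{-1}b^{n}g,\\
g\,(g^{-1}bg)^{n}\,\Gamma &= b^{n}g\,\Gamma .
\end{align*}
```

So the orbit of $g\Gamma$ under $b$ is the translate by $g$ of the orbit of $\Gamma$ under $g^{-1}bg$, which is where the set $gX_{g^{-1}bg}$ comes from.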

3.2.3. Lifting argument

Given a topological group $G$, we denote the connected component of its identity element, $e$, by $G_{0}$. To assume that a nilmanifold has a representation $G/\Gamma$ with $G$ connected and simply connected, one can follow, for example, [25]. Since all our results deal with an action on $X$ of finitely many elements of $G$, we can and will assume that the discrete group $G/G_{0}$ is finitely generated (see [25, Subsection 2.1]). In this case one can show (see [25, Subsection 1.11]) that $X=G/\Gamma$ is isomorphic to a sub-nilmanifold of a nilmanifold $\tilde{X}=\tilde{G}/\tilde{\Gamma}$, where $\tilde{G}$ is a connected and simply connected nilpotent Lie group, with all translations from $G$ “represented” in $\tilde{G}$. (In practice this means that for every $F\in C(X)$, $b\in G$ and $x\in X$, there exist $\tilde{F}\in C(\tilde{X})$, $\tilde{b}\in\tilde{G}$ and $\tilde{x}\in\tilde{X}$ such that $F(b^{n}x)=\tilde{F}(\tilde{b}^{n}\tilde{x})$ for every $n\in\mathbb{N}$.) We caution the reader that such a construction is only helpful when our working assumptions impose no restrictions on a nilrotation. Any assumption made about $b\in G$, which acts on a nilmanifold $X$, is typically lost when passing to the lifted nilmanifold $\tilde{X}$.

4. Finding the characteristic factor

In this technical section we find characteristic factors for the expressions that appear in Theorems 2.1 and 2.2. In both cases, we will show that the nilfactor is characteristic (Proposition 2 and Proposition 3 respectively).

We first start with the degree 11 case and then move on to the general one. At this point we recall the notion of a characteristic factor (adapted to our study):

Definition 4.1.

For $\ell\in\mathbb{N}$ let $(X,\mathcal{B},\mu,T)$ be a system. The sub-$\sigma$-algebra $\mathcal{Y}$ of $\mathcal{B}$ is a characteristic factor for the variable tuple of integer-valued sequences $(a_{1,N},\ldots,a_{\ell,N})_{N}$ if it is $T$-invariant and

\[\lim_{N\to\infty}\left\|\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell}T^{a_{i,N}(n)}f_{i}-\frac{1}{N}\sum_{n=1}^{N}\prod_{i=1}^{\ell}T^{a_{i,N}(n)}\tilde{f}_{i}\right\|_{2}=0,\]

for all $f_{i}\in L^{\infty}(\mu)$, where $\tilde{f}_{i}=\mathbb{E}(f_{i}|\mathcal{Y})$, $1\leq i\leq\ell$. (Equivalently, $\lim_{N\to\infty}\left\|\frac{1}{N}\sum_{n=1}^{N}T^{a_{1,N}(n)}f_{1}\cdot\ldots\cdot T^{a_{\ell,N}(n)}f_{\ell}\right\|_{2}=0$ if $\mathbb{E}(f_{i}|\mathcal{Y})=0$ for some $1\leq i\leq\ell$.)

4.1. The base case

The following crucial lemma, which can be understood as a “change of variables” procedure, will be used in the base =1\ell=1 case for degpN=1,\deg p_{N}=1, i.e., pN(n)=aNn+bN.p_{N}(n)=a_{N}n+b_{N}. We will assume that (bN)N(b_{N})_{N} is bounded, so, as such error terms do not affect our averages, we mainly have to deal with the expression 1Nn=1NT[aNn]f.\frac{1}{N}\sum_{n=1}^{N}T^{[a_{N}n]}f.

Lemma 4.2.

Let $(a_{N})_{N}\subseteq(0,+\infty)$ be bounded with $(a_{N}\cdot N)_{N}$ tending increasingly to $\infty$. For any sequence $(c_{N}(n))_{n,N}\subseteq[0,\infty)$ we have

\[\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}c_{[a_{N}N]}([a_{N}n])\ll\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}c_{N}(n).\]

Proof.

For a fixed $N\in\mathbb{N}$, since $(a_{N})_{N}$ is bounded, we have the relation

\[\frac{1}{N}\sum_{n=1}^{N}c_{[a_{N}N]}([a_{N}n])\leq\left(\left[\frac{1}{a_{N}}\right]+1\right)\cdot a_{N}\cdot\frac{[a_{N}N]}{a_{N}N}\cdot\frac{1}{[a_{N}N]}\sum_{n=0}^{[a_{N}N]}c_{[a_{N}N]}(n).\]

Since $a_{N}N\to\infty$, we get that $[a_{N}N]/a_{N}N\to 1.$ Finally, using yet again that $(a_{N})_{N}$ is bounded, we have

\[\left(\left[\frac{1}{a_{N}}\right]+1\right)\cdot a_{N}\leq a_{N}+1\ll 1.\]

The result now follows by taking $\limsup$. ∎
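The inequality for fixed $N$ rests on a counting step that the proof leaves implicit. The following is our reading of it, a sketch in which we write $a=a_{N}$ and $M=[aN]$ for brevity (this shorthand is ours, not the original's) and take $N$ large enough that $M\geq 1$.

```latex
\begin{align*}
[an]=m \;&\Longleftrightarrow\; \frac{m}{a}\le n<\frac{m+1}{a},
&&\text{so}\quad \#\{n\in\mathbb{N}:\,[an]=m\}\le \Big[\frac{1}{a}\Big]+1,\\
\frac{1}{N}\sum_{n=1}^{N}c_{M}([an]) &\le \Big(\Big[\frac{1}{a}\Big]+1\Big)\cdot\frac{1}{N}\sum_{m=0}^{M}c_{M}(m),
&&\text{and}\quad \frac{1}{N}=a\cdot\frac{M}{aN}\cdot\frac{1}{M}.
\end{align*}
```

The first line uses that a half-open interval of length $1/a$ contains at most $[1/a]+1$ integers; the second uses that $[an]\in\{0,\ldots,M\}$ for $1\leq n\leq N$.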

Using Lemma 4.2, following the argument of [12, Lemma 5.2] we get:

Lemma 4.3.

Let $(p_{N})_{N}$ be a sequence of polynomials of degree $1$ of the form

\[p_{N}(n)=a_{N}n+b_{N},\;n,N\in\mathbb{N},\]

where $(a_{N})_{N},(b_{N})_{N}$ are bounded sequences with $(a_{N})_{N}\subseteq(0,+\infty)$ and $(a_{N}\cdot N)_{N}$ tending increasingly to $\infty$. Then, for any system $(X,\mathcal{B},\mu,T)$ and $f_{1}\in L^{\infty}(\mu)$, we have

\[\limsup_{N\to\infty}\sup_{\left\|f_{0}\right\|_{\infty}\leq 1}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{[p_{N}(n)]}f_{1}\;d\mu\right|\ll\lvert\!|\!|f_{1}|\!|\!\rvert_{2}.\tag{15}\]
Proof.

For every NN\in\mathbb{N} we choose functions f0,Nf_{0,N} with f0,N1\left\|f_{0,N}\right\|_{\infty}\leq 1 so that the corresponding average is 1/N1/N close to its supremum supf01.\sup_{\left\|f_{0}\right\|_{\infty}\leq 1}. Inequality (15) follows if we show

lim supN1Nn=1N|f0,[aNN]T[pN(n)]f1𝑑μ||f1|2.\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0,[a_{N}N]}\cdot T^{[p_{N}(n)]}f_{1}\;d\mu\right|\ll\lvert\!|\!|f_{1}|\!|\!\rvert_{2}. (16)

We write [pN(n)]=[aNn+bN]=[aNn]+[bN]+e(n,N),e(n,N){0,1}.[p_{N}(n)]=[a_{N}n+b_{N}]=[a_{N}n]+[b_{N}]+e(n,N),\;e(n,N)\in\{0,1\}.
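The decomposition above can be justified in one line. In the sketch below, $\{x\}=x-[x]$ denotes the fractional part, a notation we introduce only for this computation.

```latex
\begin{align*}
a_{N}n+b_{N} &= [a_{N}n]+[b_{N}]+\big(\{a_{N}n\}+\{b_{N}\}\big),\qquad \{a_{N}n\}+\{b_{N}\}\in[0,2),\\
[a_{N}n+b_{N}] &= [a_{N}n]+[b_{N}]+e(n,N),\qquad e(n,N)=\big[\{a_{N}n\}+\{b_{N}\}\big]\in\{0,1\}.
\end{align*}
```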

Let EE be the finite set where ([bN]+e(n,N))n,N([b_{N}]+e(n,N))_{n,N} takes values. We have that

1Nn=1N|f0,[aNN]T[pN(n)]f1𝑑μ|\displaystyle\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0,[a_{N}N]}\cdot T^{[p_{N}(n)]}f_{1}\;d\mu\right| \displaystyle\ll maxeE1Nn=1N|f0,[aNN]T[aNn]+ef1𝑑μ|.\displaystyle\max_{e\in E}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0,[a_{N}N]}\cdot T^{[a_{N}n]+e}f_{1}\;d\mu\right|.

Taking squares and using the Cauchy-Schwarz inequality, the right-hand side of the previous inequality is bounded by

maxeE1Nn=1N|f0,[aNN]T[aNn]+ef1𝑑μ|2=maxeE1Nn=1NF0,[aNN]S[aNn]+eF1𝑑μ~,\max_{e\in E}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0,[a_{N}N]}\cdot T^{[a_{N}n]+e}f_{1}\;d\mu\right|^{2}=\max_{e\in E}\frac{1}{N}\sum_{n=1}^{N}\int F_{0,[a_{N}N]}\cdot S^{[a_{N}n]+e}F_{1}\;d\tilde{\mu},

where S=T×T,S=T\times T, F0,[aNN]=f0,[aNN]f¯0,[aNN],F_{0,[a_{N}N]}=f_{0,[a_{N}N]}\otimes\bar{f}_{0,[a_{N}N]}, F1=f1f¯1,F_{1}=f_{1}\otimes\bar{f}_{1}, and μ~=μ×μ.\tilde{\mu}=\mu\times\mu. For every eEe\in E, using Lemma 4.2, the lim sup\limsup of the averages on the right-hand side of the previous equality is bounded above by a constant multiple of

lim supN1Nn=1NF0,NSn+eF1𝑑μ~lim supN1Nn=1NSn+eF1L2(μ~),\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\int F_{0,N}\cdot S^{n+e}F_{1}\;d\tilde{\mu}\leq\limsup_{N\to\infty}\left\|\frac{1}{N}\sum_{n=1}^{N}S^{n+e}F_{1}\right\|_{L^{2}(\tilde{\mu})},

where the last inequality follows by Cauchy-Schwarz and the fact that F0,N1\left\|F_{0,N}\right\|_{\infty}\leq 1. Using von Neumann’s mean ergodic theorem, the last term is equal to

𝔼(SeF1|(S))L2(μ~)=𝔼(F1|(S))L2(μ~)|||f1|||22,\left\|\mathbb{E}(S^{e}F_{1}|\mathcal{I}(S))\right\|_{L^{2}(\tilde{\mu})}=\left\|\mathbb{E}(F_{1}|\mathcal{I}(S))\right\|_{L^{2}(\tilde{\mu})}\leq\lvert\!|\!|f_{1}|\!|\!\rvert_{2}^{2},

where we used the fact that SS is measure preserving, the definition of the seminorms ||||||,\lvert\!|\!|\cdot|\!|\!\rvert, and the relationship between the kk-th seminorm of the tensor product and the k+1k+1 seminorm on the base space. Inequality (16) now follows by removing the squares. ∎

Remark 3.

Lemma 4.3 holds also for sequences (aN)N(,0)(a_{N})_{N}\subseteq(-\infty,0) with (aNN)N(a_{N}\cdot N)_{N} tending decreasingly to .-\infty.

Indeed, in this case we write

[pN(n)]=[aNn]+[bN]+e(n,N),e(n,N){1,0},[p_{N}(n)]=-[-a_{N}n]+[b_{N}]+e(n,N),\;\;e(n,N)\in\{-1,0\},

so,

1Nn=1N|f0T[pN(n)]f1𝑑μ|maxeE1Nn=1N|f0T[aNn](Tef1)𝑑μ|,\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{[p_{N}(n)]}f_{1}\;d\mu\right|\ll\max_{e\in E}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{-[-a_{N}n]}(T^{e}f_{1})\;d\mu\right|,

where $E$ is a finite subset of integers. Since $-a_{N}>0$ and $(-a_{N}\cdot N)_{N}$ tends increasingly to $\infty$, we get the conclusion by the previous lemma (working with $T^{-1}$ instead of $T$). (We note that, since in Theorems 2.1 and 2.2 we assume that the transformation $T$ or $T^{-1}$ is ergodic, the seminorms taken with respect to either of those transformations coincide.)

For multiple terms, we use the following variant of the classical van der Corput trick:

Lemma 4.4 (Lemma 4.6, [12]).

Let (vN,n)N,n(v_{N,n})_{N,n} be a bounded sequence in a Hilbert space. Then

lim supN1Nn=1NvN,n24lim supH1Hh=1Hlim supN|1Nn=1NvN,n+h,vN,n|.\limsup_{N\to\infty}\left\|\frac{1}{N}\sum_{n=1}^{N}v_{N,n}\right\|^{2}\leq 4\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}\limsup_{N\to\infty}\left|\frac{1}{N}\sum_{n=1}^{N}\langle v_{N,n+h},v_{N,n}\rangle\right|.

We will now demonstrate the main idea behind the generalization of Lemma 4.3, for which we follow [12, Proposition 5.3, Case 1]. In that statement, to show

lim supNsupf0,fi11Nn=1N|f0T[a1n]f1T[a2n]f2||fj|4,\limsup_{N\to\infty}\sup_{\left\|f_{0}\right\|_{\infty},\left\|f_{i}\right\|_{\infty}\leq 1}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{[a_{1}n]}f_{1}\cdot T^{[a_{2}n]}f_{2}\right|\ll\lvert\!|\!|f_{j}|\!|\!\rvert_{4},

where $(i,j)=(1,2)$ or $(2,1)$, one uses Lemma 4.4, composes with, say, $-[a_{1}n]$, and gets the terms (notice that we keep the $h$-term in the first difference even though it is bounded)

[a1(n+h)][a1n][a1h],[a_{1}(n+h)]-[a_{1}n]\approx[a_{1}h],
[a2(n+h)][a1n][(a2a1)n],and[a_{2}(n+h)]-[a_{1}n]\approx[(a_{2}-a_{1})n],\;\;\text{and}
[a2n][a1n][(a2a1)n],[a_{2}n]-[a_{1}n]\approx[(a_{2}-a_{1})n],

so, after grouping the last two terms together, using the first one as constant (since it only depends on hh–the average along which is crucial for the argument and is taken at the very end), one can use the base =1\ell=1 case. This is also how the inductive step works in the proof of the general \ell\in\mathbb{N} case.
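The “$\approx$” signs above hide bounded error terms. As a sketch of how they arise (our own bookkeeping, using $[x+y]=[x]+[y]+e$ with $e\in\{0,1\}$), the second difference can be expanded as follows.

```latex
\begin{align*}
[x]-[y] &= [x-y]+e,\qquad e\in\{0,1\}\quad(x,y\in\mathbb{R}),\\
[a_{2}(n+h)]-[a_{1}n] &= [(a_{2}-a_{1})n+a_{2}h]+e
= [(a_{2}-a_{1})n]+[a_{2}h]+e',\qquad e'\in\{0,1,2\}.
\end{align*}
```

For fixed $h$, the term $[a_{2}h]+e'$ takes finitely many values, which is why this iterate can be grouped with $[(a_{2}-a_{1})n]$.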

The variable case is more complicated to deal with. We demonstrate the main idea behind it by considering Example 1, i.e., p1,N(n)=a1,Nnp_{1,N}(n)=a_{1,N}n and p2,N(n)=a2,Nn,p_{2,N}(n)=a_{2,N}n, where a1,N=1/Naa_{1,N}=1/N^{a} and a2,N=1/Nba_{2,N}=1/N^{b} for 0<a<b<10<a<b<1. The previous approach cannot be imitated, as, for example,

[a1,N(n+h)][a1,Nn][a1,Nh][a_{1,N}(n+h)]-[a_{1,N}n]\approx[a_{1,N}h]

is in general a variable term and we cannot proceed with the same argument. What we do instead is to transform the iterates in the initial sum to the following:

\[(0,[a_{1,N}n],[a_{2,N}n])\approx\left(0,[a_{1,N}n],\left[\frac{a_{2,N}}{a_{1,N}}[a_{1,N}n]\right]\right)\xrightarrow{\text{Lemma 4.2}}\left(0,n,\left[\frac{a_{2,N}}{a_{1,N}}n\right]\right),\]

and then we use Lemma 4.4 (i.e., the van der Corput trick) to bound, eventually, everything by $\lvert\!|\!|f_{1}|\!|\!\rvert_{4}.$ (To use Lemma 4.2 note the crucial fact that $(a_{2,N}/a_{1,N})_{N}$ is bounded.) To bound our expression by $\lvert\!|\!|f_{2}|\!|\!\rvert_{4}$, the previous argument needs an additional twist, since the quantity $(a_{1,N}/a_{2,N})_{N}$ is unbounded. What we do in this case is to compose with $-[a_{2,N}n]$ to get

(0,[a1,Nn],[a2,Nn])\displaystyle(0,[a_{1,N}n],[a_{2,N}n]) \displaystyle\approx ([a2,Nn],[(a1,Na2,N)n],0)\displaystyle([-a_{2,N}n],[(a_{1,N}-a_{2,N})n],0)
\displaystyle\approx ([a2,Na1,Na2,N[(a1,Na2,N)n]],[(a1,Na2,N)n],0)\displaystyle\left(\left[\frac{-a_{2,N}}{a_{1,N}-a_{2,N}}[(a_{1,N}-a_{2,N})n]\right],[(a_{1,N}-a_{2,N})n],0\right)
\displaystyle\rightarrow ([a2,Na1,Na2,Nn],n,0),\displaystyle\left(\left[\frac{-a_{2,N}}{a_{1,N}-a_{2,N}}n\right],n,0\right),

where we used the change of variables. As (a2,N/(a1,Na2,N))N(a_{2,N}/(a_{1,N}-a_{2,N}))_{N} is bounded, we can now finish the argument as before.

The previous discussion naturally leads to the following assumption on the leading coefficients of the linear (variable) polynomials:

Definition 4.5.

A sequence of real numbers $(a_{N})_{N}$ has the $R_{1}$-property if

  • (i) it is bounded; and

  • (ii) $(a_{N})_{N}\subseteq(0,+\infty)$ or $(-\infty,0)$ and $(|a_{N}|\cdot N)_{N}$ tends increasingly to $+\infty.$

For $\ell\in\mathbb{N}$ the sequences $\{(a_{i,N})_{N}:\;1\leq i\leq\ell\}$ have the $R_{\ell}$-property if for all $1\leq i\leq\ell$:

  • (i) $(a_{i,N})_{N}$ has the $R_{1}$-property; and

  • (ii) at least one of the following three properties holds:

(a) $\exists\;1\leq j_{0}\neq i\leq\ell$ such that $\left\{\left(\frac{a_{j_{0},N}-a_{j,N}}{a_{i,N}}\right)_{N}:\;1\leq j\neq j_{0}\leq\ell\right\}$ have the $R_{\ell-1}$-property.

(b) $\exists\;1\leq j_{0}\neq i\leq\ell$ such that $(a_{i,N}-a_{j_{0},N})_{N}$ has the $R_{1}$-property and the sequences $\left\{\left(\frac{a_{j,N}}{a_{i,N}-a_{j_{0},N}}\right)_{N}:\;1\leq j\neq j_{0}\leq\ell\right\}$ have the $R_{\ell-1}$-property.

(c) $\exists\;1\leq j_{0}\neq i\leq\ell$ such that $(a_{i,N}-a_{j_{0},N})_{N}$ has the $R_{1}$-property and $1\leq k_{0}\neq j_{0},i\leq\ell$ such that $\left\{\left(-\frac{a_{k_{0},N}}{a_{i,N}-a_{j_{0},N}}\right)_{N},\left(\frac{a_{j,N}-a_{k_{0},N}}{a_{i,N}-a_{j_{0},N}}\right)_{N}:\;1\leq j\neq k_{0},j_{0}\leq\ell\right\}$ have the $R_{\ell-1}$-property.

Remark 4.

The polynomial family of Example 1, i.e., p1,N(n)=n/Na,p_{1,N}(n)=n/N^{a}, p2,N(n)=n/Nb,p_{2,N}(n)=n/N^{b}, n,N,n,N\in\mathbb{N}, where 0<a<b<1,0<a<b<1, has the R2R_{2}-property.

Indeed, skipping the trivial calculations, both sequences (1/Na)N,(1/N^{a})_{N}, (1/Nb)N(1/N^{b})_{N} have the R1R_{1}-property and for i=1i=1 we have the (ii)(ii) (a)(a) case, while for i=2i=2 the (ii)(ii) (b)(b) case.
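For the reader who wants to see the skipped calculations, here is a sketch of the sequences involved, writing $c=b-a\in(0,1)$ (our shorthand) and restricting to $N\geq 2$, since the differences vanish at $N=1$.

```latex
\begin{align*}
&i=1,\ j_{0}=2\ \text{(case (ii)(a))}: &
\frac{a_{2,N}-a_{1,N}}{a_{1,N}} &= N^{-c}-1\in(-1,0), &
\Big|\frac{a_{2,N}-a_{1,N}}{a_{1,N}}\Big|\cdot N &= N-N^{1-c}\to+\infty;\\
&i=2,\ j_{0}=1\ \text{(case (ii)(b))}: &
a_{2,N}-a_{1,N} &= -(N^{-a}-N^{-b})<0, &
|a_{2,N}-a_{1,N}|\cdot N &= N^{1-a}-N^{1-b}\to+\infty,\\
& &
\frac{a_{2,N}}{a_{2,N}-a_{1,N}} &= -\frac{1}{N^{c}-1}<0, &
\Big|\frac{a_{2,N}}{a_{2,N}-a_{1,N}}\Big|\cdot N &= \frac{N}{N^{c}-1}\to+\infty.
\end{align*}
```

All three sequences are bounded for $N\geq 2$. A derivative computation suggests that $N-N^{1-c}$ and $N^{1-a}-N^{1-b}$ are increasing throughout, while $N/(N^{c}-1)$ is increasing only once $N^{c}>1/(1-c)$. We therefore read the monotonicity requirement in Definition 4.5 as holding for all large $N$; this reading is our assumption.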

We are now ready to extend Lemma 4.3 to multiple terms along polynomials of degree 1,1, following the main idea of [12, Proposition 5.3, Case 1]:

Proposition 1.

Let (p1,N)N,,(p,N)N(p_{1,N})_{N},\ldots,(p_{\ell,N})_{N} be polynomial sequences of degree 11 of the form

pi,N(n)=ai,Nn+bi,N,n,N,1i,p_{i,N}(n)=a_{i,N}n+b_{i,N},\;\;n,N\in\mathbb{N},1\leq i\leq\ell,

where the sequences (ai,N)N,(a_{i,N})_{N}, 1i,1\leq i\leq\ell, have the RR_{\ell}-property and (bi,N)N,(b_{i,N})_{N}, 1i,1\leq i\leq\ell, are bounded. Then, for every f1L(μ),f_{1}\in L^{\infty}(\mu), we have

lim supNsupf0,f2,,f11Nn=1N|f0i=1T[pi,N(n)]fidμ||f1|2.\limsup_{N\to\infty}\sup_{\left\|f_{0}\right\|_{\infty},\left\|f_{2}\right\|_{\infty},\ldots,\left\|f_{\ell}\right\|_{\infty}\leq 1}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot\prod_{i=1}^{\ell}T^{[p_{i,N}(n)]}f_{i}\;d\mu\right|\ll\lvert\!|\!|f_{1}|\!|\!\rvert_{2\ell}. (17)
Proof.

We use induction on .\ell. The base case, =1,\ell=1, follows from Lemma 4.3. We assume that 2\ell\geq 2 and that the statement holds for 1.\ell-1.

Case 1: For i=1,i=1, the property (ii) (a) from the Definition 4.5 holds.

1Nn=1N|f0T[a1,Nn+b1,N]f1T[a,Nn+b,N]f𝑑μ|=1Nn=1N|f0T[a1,Nn]+e1(n,N)f1i=2T[ai,Na1,N[a1,Nn]]+ei(n,N)fidμ|maxe1,,eE1Nn=1N|f0T[a1,Nn](Te1f1)i=2T[ai,Na1,N[a1,Nn]](Teifi)dμ|,\begin{split}&\quad\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{[a_{1,N}n+b_{1,N}]}f_{1}\cdot\ldots\cdot T^{[a_{\ell,N}n+b_{\ell,N}]}f_{\ell}\;d\mu\right|\\ &=\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{[a_{1,N}n]+e_{1}(n,N)}f_{1}\cdot\prod_{i=2}^{\ell}T^{\left[\frac{a_{i,N}}{a_{1,N}}[a_{1,N}n]\right]+e_{i}(n,N)}f_{i}\;d\mu\right|\\ &\ll\max_{e_{1},\ldots,e_{\ell}\in E}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{[a_{1,N}n]}(T^{e_{1}}f_{1})\cdot\prod_{i=2}^{\ell}T^{\left[\frac{a_{i,N}}{a_{1,N}}[a_{1,N}n]\right]}(T^{e_{i}}f_{i})\;d\mu\right|,\end{split} (18)

where EE is a finite subset of integers (the error terms ei(n,N)e_{i}(n,N), as (bi,N)N,(b_{i,N})_{N}, and (ai,N/a1,N)N(a_{i,N}/a_{1,N})_{N} are bounded for 1i,1\leq i\leq\ell, take finitely many values).

For every NN\in\mathbb{N} we now choose functions fi,Nf_{i,N} with fi,N1\left\|f_{i,N}\right\|_{\infty}\leq 1 for i{0,2,,},i\in\{0,2,\ldots,\ell\}, so that the last term in (18) is 1/N1/N close to supf0,f2,,f1.\sup_{\left\|f_{0}\right\|_{\infty},\left\|f_{2}\right\|_{\infty},\ldots,\left\|f_{\ell}\right\|_{\infty}\leq 1}. Using the Cauchy-Schwarz inequality and the fact that |a1,N|N,|a_{1,N}|\cdot N\to\infty, we have that (17) follows if we show, for each choice of e1,,eE,e_{1},\ldots,e_{\ell}\in E, that

lim supN1Nn=1N|f0,[a1,NN]T[a1,Nn](Te1f1)i=2T[ai,Na1,N[a1,Nn]](Teifi,[a1,NN])dμ|2\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0,[a_{1,N}N]}\cdot T^{[a_{1,N}n]}(T^{e_{1}}f_{1})\cdot\prod_{i=2}^{\ell}T^{\left[\frac{a_{i,N}}{a_{1,N}}[a_{1,N}n]\right]}(T^{e_{i}}f_{i,[a_{1,N}N]})\;d\mu\right|^{2} (19)

is bounded above by a constant multiple of |f1|22.\lvert\!|\!|f_{1}|\!|\!\rvert^{2}_{2\ell}. Using Lemma 4.2 it suffices to show

lim supN1Nn=1N|f0,NTn(Te1f1)i=2T[ai,Na1,Nn](Teifi,N)dμ|2|f1|22.\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0,N}\cdot T^{n}(T^{e_{1}}f_{1})\cdot\prod_{i=2}^{\ell}T^{\left[\frac{a_{i,N}}{a_{1,N}}n\right]}(T^{e_{i}}f_{i,N})\;d\mu\right|^{2}\ll\lvert\!|\!|f_{1}|\!|\!\rvert^{2}_{2\ell}. (20)

The left-hand side of (20) is equal to

A:=lim supN1Nn=1NF0,NSn(Se1F1)i=2S[ai,Na1,Nn](SeiFi,N)dμ~,A:=\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\int F_{0,N}\cdot S^{n}(S^{e_{1}}F_{1})\cdot\prod_{i=2}^{\ell}S^{\left[\frac{a_{i,N}}{a_{1,N}}n\right]}(S^{e_{i}}F_{i,N})\;d\tilde{\mu},

where S=T×T,S=T\times T, F1=f1f¯1,F_{1}=f_{1}\otimes\bar{f}_{1}, Fi,N=fi,Nf¯i,N,F_{i,N}=f_{i,N}\otimes\bar{f}_{i,N}, i=0,2,,,i=0,2,\ldots,\ell, and μ~=μ×μ.\tilde{\mu}=\mu\times\mu. Using the Cauchy-Schwarz inequality and Lemma 4.4, we have that

|A|2lim supH1Hh=1HAh,|A|^{2}\ll\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}A_{h},

where

Ah:=lim supN1Nn=1N|Sn+h(Se1F1)i=2S[ai,Na1,N(n+h)](SeiFi,N)Sn(Se1F¯1)i=2S[ai,Na1,Nn](SeiF¯i,N)dμ~|.\begin{split}&\quad A_{h}:=\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int S^{n+h}(S^{e_{1}}F_{1})\cdot\prod_{i=2}^{\ell}S^{\left[\frac{a_{i,N}}{a_{1,N}}(n+h)\right]}(S^{e_{i}}F_{i,N})\ \\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot S^{n}(S^{e_{1}}\overline{F}_{1})\cdot\prod_{i=2}^{\ell}S^{\left[\frac{a_{i,N}}{a_{1,N}}n\right]}(S^{e_{i}}\overline{F}_{i,N})\;d\tilde{\mu}\bigg{|}.\end{split}

Precomposing with the term SnS^{-n} we get

Ah=lim supN1Nn=1N|Sh(Se1F1)Se1F¯1i=2S[(ai,Na1,N1)n]+[ai,Na1,Nh]+ej(n,h,N)(SeiFi,N)i=2S[(ai,Na1,N1)n]+e~j(n,h,N)(SeiF¯i,N)dμ~|=lim supN1Nn=1N|F1,hi=2S[(ai,Na1,N1)n]Fi,h,n,Ndμ~|,\begin{split}&\quad A_{h}=\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int S^{h}(S^{e_{1}}F_{1})\cdot S^{e_{1}}\overline{F}_{1}\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{i=2}^{\ell}S^{\left[\left(\frac{a_{i,N}}{a_{1,N}}-1\right)n\right]+\left[\frac{a_{i,N}}{a_{1,N}}h\right]+e_{j}(n,h,N)}(S^{e_{i}}F_{i,N})\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{i=2}^{\ell}S^{\left[\left(\frac{a_{i,N}}{a_{1,N}}-1\right)n\right]+\tilde{e}_{j}(n,h,N)}(S^{e_{i}}\overline{F}_{i,N})\;d\tilde{\mu}\bigg{|}\\ &\quad\quad=\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int F_{1,h}\cdot\prod_{i=2}^{\ell}S^{\left[\left(\frac{a_{i,N}}{a_{1,N}}-1\right)n\right]}F_{i,h,n,N}\;d\tilde{\mu}\bigg{|},\end{split} (21)

where F1,h=Sh(Se1F1)Se1F¯1,F_{1,h}=S^{h}(S^{e_{1}}F_{1})\cdot S^{e_{1}}\overline{F}_{1}, Fi,h,n,N=S[ai,Na1,Nh]+ej(n,h,N)(SeiFi,N)Se~j(n,h,N)(F_{i,h,n,N}=S^{\left[\frac{a_{i,N}}{a_{1,N}}h\right]+e_{j}(n,h,N)}(S^{e_{i}}F_{i,N})\cdot S^{\tilde{e}_{j}(n,h,N)}\\ ( SeiF¯i,N),S^{e_{i}}\overline{F}_{i,N}), 2i.2\leq i\leq\ell. Using the hypothesis, for i=1,i=1, there exists 2j02\leq j_{0}\leq\ell such that the sequences {(aj0,Naj,Na1,N)N: 1jj0}\bigg{\{}\left(\frac{a_{j_{0},N}-a_{j,N}}{a_{1,N}}\right)_{N}:\;1\leq j\neq j_{0}\leq\ell\bigg{\}} have the R1R_{\ell-1}-property. Precomposing with S[(aj0,Na1,N1)n]S^{-\left[\left(\frac{a_{j_{0},N}}{a_{1,N}}-1\right)n\right]} in the right-hand side of (21) we have that

Ah=lim supN1Nn=1N|F1,hi=2S[(ai,Na1,N1)n]Fi,h,n,Ndμ~|=lim supN1Nn=1N|Fj0,h,n,NS[(aj0,Na1,N1)n]+e1(n,N)F1,h2ij0S[(aj0,Nai,Na1,N)n]F~i,h,n,Ndμ~|,\begin{split}&\quad A_{h}=\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int F_{1,h}\cdot\prod_{i=2}^{\ell}S^{\left[\left(\frac{a_{i,N}}{a_{1,N}}-1\right)n\right]}F_{i,h,n,N}\;d\tilde{\mu}\bigg{|}\\ &\quad\quad=\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int F_{j_{0},h,n,N}\cdot S^{\left[-\left(\frac{a_{j_{0},N}}{a_{1,N}}-1\right)n\right]+e^{\prime}_{1}(n,N)}F_{1,h}\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{2\leq i\neq j_{0}\leq\ell}S^{\left[-\left(\frac{a_{j_{0},N}-a_{i,N}}{a_{1,N}}\right)n\right]}\tilde{F}_{i,h,n,N}\;d\tilde{\mu}\bigg{|},\end{split}

where F~i,h,n,N=Sei(n,N)Fi,h,n,N\tilde{F}_{i,h,n,N}=S^{e^{\prime}_{i}(n,N)}F_{i,h,n,N} for some error terms ei(n,N){0,1}.e^{\prime}_{i}(n,N)\in\{0,1\}.

As we previously highlighted, for every fixed N,N, we can partition the set of integers so that e1(n,N)e_{1}^{\prime}(n,N) is constant. So, fixing e1{0,1},e_{1}^{\prime}\in\{0,1\}, using the induction hypothesis, we have

Ahlim supNsupF0,F2,,F11Nn=1N|F0S[(aj0,Na1,N1)n](Se1F1,h)2ij0S[(aj0,Nai,Na1,N)n]Fidμ~||Se1F1,h|2(1)=|F1,h|2(1)=|Sh(Se1F1)Se1F¯1|2(1)=|(Th+e1f1Te1f¯1)(Th+e1f1Te1f¯1)¯|2(1)|Th+e1f1Te1f¯1|212.\begin{split}&\quad A_{h}\ll\limsup_{N\to\infty}\sup_{\left\|F_{0}\right\|_{\infty},\left\|F_{2}\right\|_{\infty},\ldots,\left\|F_{\ell}\right\|_{\infty}\leq 1}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int F_{0}\cdot S^{\left[-\left(\frac{a_{j_{0},N}}{a_{1,N}}-1\right)n\right]}(S^{e^{\prime}_{1}}F_{1,h})\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{2\leq i\neq j_{0}\leq\ell}S^{\left[-\left(\frac{a_{j_{0},N}-a_{i,N}}{a_{1,N}}\right)n\right]}F_{i}\;d\tilde{\mu}\bigg{|}\\ &\quad\quad\ll\lvert\!|\!|S^{e^{\prime}_{1}}F_{1,h}|\!|\!\rvert_{2(\ell-1)}=\lvert\!|\!|F_{1,h}|\!|\!\rvert_{2(\ell-1)}=\lvert\!|\!|S^{h}(S^{e_{1}}F_{1})\cdot S^{e_{1}}\overline{F}_{1}|\!|\!\rvert_{2(\ell-1)}\\ &\quad\quad=\lvert\!|\!|(T^{h+e_{1}}f_{1}\cdot T^{e_{1}}\overline{f}_{1})\otimes\overline{(T^{h+e_{1}}f_{1}\cdot T^{e_{1}}\overline{f}_{1})}|\!|\!\rvert_{2(\ell-1)}\leq\lvert\!|\!|T^{h+e_{1}}f_{1}\cdot T^{e_{1}}\overline{f}_{1}|\!|\!\rvert^{2}_{2\ell-1}.\end{split}

So, using Hölder inequality and the definition of the seminorms ||||||,\lvert\!|\!|\cdot|\!|\!\rvert, we have

|A|2\displaystyle|A|^{2} \displaystyle\ll lim supH1Hh=1HAhlim supH1Hh=1H|Th+e1f1Te1f¯1|212\displaystyle\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}A_{h}\ll\limsup_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}\lvert\!|\!|T^{h+e_{1}}f_{1}\cdot T^{e_{1}}\overline{f}_{1}|\!|\!\rvert^{2}_{2\ell-1}
\displaystyle\leq lim supH(1Hh=1H|Th+e1f1Te1f¯1|21221)1/22(1)=|Te1f1|24=|f1|24,\displaystyle\limsup_{H\to\infty}\left(\frac{1}{H}\sum_{h=1}^{H}\lvert\!|\!|T^{h+e_{1}}f_{1}\cdot T^{e_{1}}\overline{f}_{1}|\!|\!\rvert^{2^{2\ell-1}}_{2\ell-1}\right)^{1/2^{2(\ell-1)}}=\lvert\!|\!|T^{e_{1}}f_{1}|\!|\!\rvert^{4}_{2\ell}=\lvert\!|\!|f_{1}|\!|\!\rvert^{4}_{2\ell},

hence, (19) is bounded above by a constant multiple of |f1|22\lvert\!|\!|f_{1}|\!|\!\rvert^{2}_{2\ell} as was to be shown.
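The exponents in the last display can be tracked as follows. This is a sketch in our own shorthand: $g=T^{e_{1}}f_{1}$, $k=2\ell-1$ and $x_{h}=\lvert\!|\!|T^{h}g\cdot\bar{g}|\!|\!\rvert_{k}\geq 0$.

```latex
\begin{align*}
\frac{1}{H}\sum_{h=1}^{H}x_{h}^{2}
&\le\Big(\frac{1}{H}\sum_{h=1}^{H}x_{h}^{2^{k}}\Big)^{2/2^{k}}
&&\text{(H\"older, since } 2^{k}\ge 2\text{)},\\
\lim_{H\to\infty}\frac{1}{H}\sum_{h=1}^{H}x_{h}^{2^{k}}
&=\lvert\!|\!|g|\!|\!\rvert_{k+1}^{2^{k+1}}
&&\text{(definition of } \lvert\!|\!|\cdot|\!|\!\rvert_{k+1}\text{)},\\
\big(\lvert\!|\!|g|\!|\!\rvert_{2\ell}^{2^{2\ell}}\big)^{1/2^{2(\ell-1)}}
&=\lvert\!|\!|g|\!|\!\rvert_{2\ell}^{4}
&&\text{(as } 2/2^{k}=1/2^{2(\ell-1)}\text{ and } 2^{2\ell}/2^{2\ell-2}=4\text{)}.
\end{align*}
```

Hence $|A|^{2}\ll\lvert\!|\!|f_{1}|\!|\!\rvert^{4}_{2\ell}$, which is the claimed bound $|A|\ll\lvert\!|\!|f_{1}|\!|\!\rvert^{2}_{2\ell}$ after taking square roots.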

Cases 2 & 3: For i=1,i=1, we either have property (ii) (b) or (ii) (c) in Definition 4.5.

Here we will skip the details already outlined in Case 1. If 2j02\leq j_{0}\leq\ell is the integer guaranteed by Definition 4.5, the integrand in the last part of equation (18) will become (setting, without loss, ei=0e_{i}=0)

fj0T[(a1,Naj0,N)n]f1T[aj0,Na1,Naj0,N[(a1,Naj0,N)n]]f02jj0T[aj,Naj0,Na1,Naj0,N[(a1,Naj0,N)n]]fj\begin{split}&\quad f_{j_{0}}\cdot T^{[(a_{1,N}-a_{j_{0},N})n]}f_{1}\cdot T^{\left[-\frac{a_{j_{0},N}}{a_{1,N}-a_{j_{0},N}}[(a_{1,N}-a_{j_{0},N})n]\right]}f_{0}\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{2\leq j\neq j_{0}\leq\ell}T^{\left[\frac{a_{j,N}-a_{j_{0},N}}{a_{1,N}-a_{j_{0},N}}[(a_{1,N}-a_{j_{0},N})n]\right]}f_{j}\end{split}

and the one in equation (21)

F1,hS[(aj0,Na1,Naj0,N1)n]F0,h,n,N2jj0S[(aj,Naj0,Na1,Naj0,N1)n]Fi,h,n,N.F_{1,h}\cdot S^{\left[\left(\frac{-a_{j_{0},N}}{a_{1,N}-a_{j_{0},N}}-1\right)n\right]}F_{0,h,n,N}\cdot\prod_{2\leq j\neq j_{0}\leq\ell}S^{\left[\left(\frac{a_{j,N}-a_{j_{0},N}}{a_{1,N}-a_{j_{0},N}}-1\right)n\right]}F_{i,h,n,N}.

Precomposing with the term $S^{-\left[\left(\frac{-a_{j_{0},N}}{a_{1,N}-a_{j_{0},N}}-1\right)n\right]}=S^{-\left[\frac{-a_{1,N}}{a_{1,N}-a_{j_{0},N}}n\right]}$ (for Case 2) and with $S^{-\left[\left(\frac{a_{k_{0},N}-a_{j_{0},N}}{a_{1,N}-a_{j_{0},N}}-1\right)n\right]}=S^{-\left[\frac{a_{k_{0},N}-a_{1,N}}{a_{1,N}-a_{j_{0},N}}n\right]}$ (for Case 3, where $2\leq k_{0}\neq j_{0}\leq\ell$ is the one guaranteed by Definition 4.5), we can continue (using the induction hypothesis) and finish the argument as in Case 1. The proof of the statement is now complete. ∎

Remark 5.

To the best of our knowledge, when we deal with norm convergence of averages of (non-variable) polynomial iterates, we can always replace the conventional Cesàro averages, i.e., limN1Nn=1N,\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}, with the corresponding uniform ones, i.e., limNM1NMn=MN1.\lim_{N-M\to\infty}\frac{1}{N-M}\sum_{n=M}^{N-1}. Our method though, exactly because of the choice of functions fi,[a1,NN]f_{i,[a_{1,N}N]} (to go from equation (15) to (16) and from equation (18) to (19)), cannot guarantee the corresponding uniform results.

4.2. The general case

We start by recalling (see, for example, [2] and [12]) the definition of the degree and type of a polynomial family that we will adapt in our study:

Definition 4.6.

For $\ell\in\mathbb{N}$ let $\mathcal{P}=\{p_{1},\ldots,p_{\ell}\}$ be a family of non-constant real polynomials. We denote by $\deg(\mathcal{P})$ the maximum degree of the polynomials $p_{i}$ and we call it the degree of $\mathcal{P}.$ If $w_{i}$ denotes the number of distinct leading coefficients of polynomials from $\mathcal{P}$ of degree $i$ and $d=\deg(\mathcal{P}),$ then the vector $(d,w_{d},\ldots,w_{1})$ is the type of $\mathcal{P}.$ We order all the possible type vectors lexicographically. [Footnote 17: I.e., $(d,w_{d},\ldots,w_{1})>(d^{\prime},w^{\prime}_{d},\ldots,w^{\prime}_{1})$ iff, reading from left to right, at the first instance where the two vectors disagree the coordinate of the first vector is greater than that of the second one.]
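The definition of the type can be made concrete with a short computation. The following is only a sketch: the coefficient-list representation (constant term first), the function names, and the sample family are our own illustrative assumptions and are not taken from [2] or [12].

```python
from fractions import Fraction

def degree(p):
    """Degree of a polynomial stored as [c_0, c_1, ..., c_d] (constant term first)."""
    d = len(p) - 1
    while d > 0 and p[d] == 0:
        d -= 1
    return d

def family_type(family):
    """Return the type (d, w_d, ..., w_1) of a family of non-constant polynomials."""
    d = max(degree(p) for p in family)
    leading = {i: set() for i in range(1, d + 1)}
    for p in family:
        k = degree(p)
        leading[k].add(Fraction(p[k]))   # leading coefficient of a degree-k member
    return (d,) + tuple(len(leading[i]) for i in range(d, 0, -1))

# Hypothetical family {n^2, 2n^2, n^2 + n, 3n}.
family = [[0, 0, 1], [0, 0, 2], [0, 1, 1], [0, 3]]
family_type_example = family_type(family)

# Python compares tuples lexicographically, which matches the ordering of types.
comparison_examples = [(2, 2, 1) > (2, 1, 7), (2, 1, 1) > (1, 5)]
```

In this hypothetical family there are two distinct leading coefficients in degree $2$ and one in degree $1$, so the type is $(2,2,1)$.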

In order to reduce the complexity (i.e., the type) of a polynomial family, one has to use the classic PET (i.e., Polynomial Exhaustion Technique) induction.

At this point we remind the reader that the real polynomials p1,,pp_{1},\ldots,p_{\ell} are called essentially distinct if they are, together with their pairwise differences, non-constant. Given such a family of polynomials 𝒫={p1,,p},\mathcal{P}=\{p_{1},\ldots,p_{\ell}\}, p𝒫p\in\mathcal{P} and h,h\in\mathbb{N}, the van der Corput operation (vdC-operation), acting on 𝒫,\mathcal{P}, gives the family

$$\mathcal{P}(p,h):=\{p_{1}(t+h)-p(t),\ldots,p_{\ell}(t+h)-p(t),p_{1}(t)-p(t),\ldots,p_{\ell}(t)-p(t)\},$$

where we then remove all the terms that are bounded [Footnote 19: This is justified with the use of the Cauchy-Schwarz inequality.] and we group the ones of degree $1$ with bounded difference (i.e., of the same leading coefficient), thus obtaining a new family of essentially distinct polynomials.
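The vdC-operation just described can be sketched in a few lines for polynomials with explicit rational coefficients. The helper names, the coefficient-list representation, and the sample family $\{n^{2},n^{2}+n\}$ with $p=n^{2}$ and $h=3$ are illustrative assumptions; the sketch drops constant terms and keeps one representative for each leading coefficient among the degree $1$ terms.

```python
from fractions import Fraction
from math import comb

def trim(p):
    """Drop trailing zero coefficients (polynomials are [c_0, ..., c_d])."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def sub(p, q):
    """Coefficients of p - q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def shift(p, h):
    """Coefficients of t -> p(t + h), via the binomial theorem."""
    out = [Fraction(0)] * len(p)
    for k, c in enumerate(p):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * h ** (k - j)
    return trim(out)

def vdc(family, p, h):
    """Sketch of P(p, h): constants dropped, equal-slope linear terms grouped."""
    candidates = [sub(shift(q, h), p) for q in family] + [sub(q, p) for q in family]
    result, slopes = [], set()
    for q in candidates:
        if len(q) == 1:          # constant, hence bounded: removed
            continue
        if len(q) == 2:          # degree 1: keep one representative per slope
            if q[1] in slopes:
                continue
            slopes.add(q[1])
        result.append(q)
    return result

# Hypothetical family {n^2, n^2 + n}, with p = n^2 and h = 3.
family = [[0, 0, 1], [0, 1, 1]]
new_family = vdc(family, family[0], 3)
leading = [q[-1] for q in new_family]
```

Here the original family has type $(2,1,0)$, while the resulting family consists of three linear polynomials with distinct leading coefficients, hence has the smaller type $(1,3)$.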

The following lemma states that there exists a choice of a polynomial in a family of essentially distinct polynomials, via which the vdC-operation reduces its type:

Lemma 4.7 (Lemma 4.5, [12]).

Let \ell\in\mathbb{N} and 𝒫={p1,,p}\mathcal{P}=\{p_{1},\ldots,p_{\ell}\} be a family of essentially distinct polynomials with deg(𝒫)=deg(p1)2.\deg(\mathcal{P})=\deg(p_{1})\geq 2. Then there exists p𝒫p\in\mathcal{P} (of minimum degree in the polynomial family) such that for every large hh the family 𝒫(p,h)\mathcal{P}(p,h) has type smaller than that of 𝒫,\mathcal{P}, and deg(𝒫(p,h))=deg(p1(t+h)p(t)).\deg(\mathcal{P}(p,h))=\deg(p_{1}(t+h)-p(t)).

What is crucial for us is that every decreasing sequence of types is eventually (after finitely many steps) stationary and that, by using the previous lemma, there is a point at which all the polynomials have degree 11. Also, by its definition, the vdC-operation preserves the essential distinctness property.

We will deal with sequences of families of real polynomials, $(\mathcal{P}_{N})_{N},$ where $\mathcal{P}_{N}=\{p_{1,N},\ldots,p_{\ell,N}\},$ $N\in\mathbb{N},$ that, for large $N$, have type independent of $N,$ so that we can use the facts that we just mentioned. Abusing the notation, we write $(\mathcal{P}_{N})_{N}=(p_{1,N},\ldots,p_{\ell,N})_{N}$.

Next we define the subclass of variable polynomials that we will deal with.

Definition 4.8.

For \ell\in\mathbb{N} let (𝒫N)N=(p1,N,,p,N)N(\mathcal{P}_{N})_{N}=(p_{1,N},\ldots,p_{\ell,N})_{N} be a sequence of \ell-tuples of real polynomials with bounded coefficients. We say that (𝒫N)N(\mathcal{P}_{N})_{N} is super nice if, for every (large enough) NN\in\mathbb{N}:

  • (i)

    the polynomials pi,Np_{i,N} and, for all ij,i\neq j, pi,Npj,Np_{i,N}-p_{j,N} are non-constant and their degrees are independent of NN;

  • (ii)

    after performing, if needed, (finitely many) vdC-operations to (𝒫N)N(\mathcal{P}_{N})_{N} to obtain only polynomials of degree 1,1, say kk((𝒫N)N)k\equiv k((\mathcal{P}_{N})_{N}) many, the leading coefficients (for large enough hih_{i}’s–from the vdC-operations) have the RkR_{k}-property; and

  • (ii)′

    if deg((pi0,N)N)=deg((𝒫N)N),\deg((p_{i_{0},N})_{N})=\deg((\mathcal{P}_{N})_{N}), then (ii) holds for the polynomial sequence (𝒫N)N(\mathcal{P}^{\prime}_{N})_{N} :=(p1,Npi0,N,,pi01,Npi0,N,pi0,N,pi0+1,Npi0,N,,p,Npi0,N)N.:=(p_{1,N}-p_{i_{0},N},\ldots,p_{i_{0}-1,N}-p_{i_{0},N},-p_{i_{0},N},p_{i_{0}+1,N}-p_{i_{0},N},\ldots,p_{\ell,N}-p_{i_{0},N})_{N}.

Remark 6.

(1)(1) It is not clear to us whether (ii)(ii) implies (ii)(ii)^{\prime}.

Consider for example the sequence of polynomials $(\mathcal{P}_{N})_{N}=(p_{1,N},p_{2,N})_{N},$ where $p_{1,N}(n)=-a_{N}n^{2}-b_{N}n$ and $p_{2,N}(n)=(a_{N}-b_{N})n.$ After performing the vdC-operation twice we get the triple $\{-2a_{N}(h^{\prime}+h)n,-2a_{N}hn,-2a_{N}h^{\prime}n\},$ while for the sequence $(\mathcal{P}_{N}^{\prime})_{N}=(-p_{1,N},p_{2,N}-p_{1,N})_{N},$ after a single use of the vdC-operation we get $\{(a_{N}-b_{N})n,2a_{N}hn,((2h+1)a_{N}-b_{N})n\}$. So, in the second case we have to impose assumptions on both $(a_{N})_{N}$ and $(b_{N})_{N},$ while in the first one only on $(a_{N})_{N}.$
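The leading coefficients in this example can be checked numerically. The sketch below fixes hypothetical values $a_{N}=3,$ $b_{N}=5,$ $h=2,$ $h^{\prime}=7$ and, as an assumption on our part, applies the vdC-operation with a polynomial of minimum degree at each step; the helper functions are illustrative and only the leading coefficients of the resulting linear terms are compared.

```python
from fractions import Fraction
from math import comb

def trim(p):
    """Drop trailing zero coefficients (polynomials are [c_0, ..., c_d])."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def sub(p, q):
    """Coefficients of p - q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([x - y for x, y in zip(p, q)])

def shift(p, h):
    """Coefficients of t -> p(t + h)."""
    out = [Fraction(0)] * len(p)
    for k, c in enumerate(p):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * h ** (k - j)
    return trim(out)

def vdc(family, p, h):
    """Sketch of P(p, h): constants dropped, equal-slope linear terms grouped."""
    candidates = [sub(shift(q, h), p) for q in family] + [sub(q, p) for q in family]
    result, slopes = [], set()
    for q in candidates:
        if len(q) == 1:
            continue
        if len(q) == 2:
            if q[1] in slopes:
                continue
            slopes.add(q[1])
        result.append(q)
    return result

a, b, h, h2 = 3, 5, 2, 7           # hypothetical values of a_N, b_N, h, h'
p1 = [0, -b, -a]                   # -a n^2 - b n
p2 = [0, a - b]                    # (a - b) n

step1 = vdc([p1, p2], p2, h)       # p2 has minimum degree
step2 = vdc(step1, step1[-1], h2)  # step1[-1] is p1 - p2 = -a n^2 - a n
first = sorted(q[-1] for q in step2)

neg_p1 = [0, b, a]                 # -p1
q2 = sub(p2, p1)                   # p2 - p1
second = sorted(q[-1] for q in vdc([neg_p1, q2], neg_p1, h))
```

For these values the first family yields the leading coefficients $-54,-42,-12,$ which involve only $a_{N},$ while the second yields $-2,10,12,$ which involve both $a_{N}$ and $b_{N}.$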

(2)(2) The degree and type of every super nice sequence, together with the integer kk in (ii)(ii) (and, analogously, in (ii)(ii)^{\prime} as well), are independent of N.N.

$(3)$ Every family of essentially distinct polynomials that does not depend on $N$ is super nice.

Indeed, since $(i)$ is immediate, we show $(ii)$ ($(ii)^{\prime}$ follows by the same argument). As mentioned before, the vdC-operation preserves the essential distinctness property; hence all the $k$ linear polynomials will have distinct leading coefficients, which, as they are independent of $N,$ will have the $R_{k}$-property.

(4)(4) The set of super nice variable polynomial sequences is non-empty. Actually, the \ell-tuple (p1,N,,p,N)N,(p_{1,N},\ldots,p_{\ell,N})_{N}, where pi,N(n)=ni/Na,p_{i,N}(n)=n^{i}/N^{a}, 1i,1\leq i\leq\ell, N,n,N,n\in\mathbb{N}, and 0<a<1,0<a<1, from Example 2, is super nice (see Lemma 6.2 below for a more general statement).

Indeed, as the variable part of the coefficients of the polynomials, after applying vdC-operations, is the same for all terms (and equal to 1/Na1/N^{a}), at each step we have that the ratios of the coefficients are independent of NN, hence we have all the required properties.

(5)(5) Even though the number kk of degree 11 terms (that appears in (ii)(ii) and (ii)(ii)^{\prime}) is not a priori known, when we have a single variable polynomial sequence

pN(n)=ad,Nnd++a1,Nn+a0,N,p_{N}(n)=a_{d,N}n^{d}+\ldots+a_{1,N}n+a_{0,N},

where (ad,N)N(a_{d,N})_{N} has the R1R_{1}-property and all (ai,N)N,(a_{i,N})_{N}, 0id,0\leq i\leq d, are bounded, we have that for all ,\ell\in\mathbb{N}, (𝒫N)N=(pN,2pN,,pN)N(\mathcal{P}_{N})_{N}=(p_{N},2p_{N},\ldots,\ell p_{N})_{N} is super nice.

It suffices to show only (ii). We start with $(\ell p_{N}(n),\ldots,p_{N}(n))$ and use the vdC-operation, which leads to differences of polynomials. [Footnote 20: Notice that, for every $h\in\mathbb{N},$ $\Delta_{1}(p(n);h):=p(n+h)-p(n)$ reduces the degree of $p$ by $1$.] Precomposing with $-p_{N}(n)$ we get the family of polynomials

(1)pN(n+h1)+pN(n+h1)pN(n),,pN(n+h1)pN(n),(1)pN(n),,pN(n).(\ell-1)p_{N}(n+h_{1})+p_{N}(n+h_{1})-p_{N}(n),\ldots,p_{N}(n+h_{1})-p_{N}(n),(\ell-1)p_{N}(n),\ldots,p_{N}(n).

In the next iteration of the vdC-operation we precompose with $-p_{N}(n+h_{1})+p_{N}(n)$, and then with $-(p_{N}(n+h_{1}+h_{2})-p_{N}(n+h_{2}))+(p_{N}(n+h_{1})-p_{N}(n))$ (i.e., polynomials of minimum degree at each step). We will keep track of the leading coefficients of polynomials of maximum degree at each step [Footnote 21: Here, as we only have distinct non-zero multiples of the same polynomial, it is not hard to do so; for more general coefficient tracking methods see [7, 9].]; we have the following cases in this procedure:

\bullet The polynomial that is chosen according to Lemma 4.7 has degree strictly less than the one of the polynomial of maximum degree (e.g., this happens in the second iteration of the vdC-operation). In this case the leading coefficient of the latter polynomial doesn’t change.

\bullet The polynomial that is chosen according to Lemma 4.7, say qN,q_{N}, has degree, say D,D, equal to the one of the polynomial of maximum degree (hence all the polynomials have the same degree DD–this is the case in the first application of the vdC-operation). Here, because of the nature of the (essentially distinct) iterates, the leading coefficients will be multiples of the leading coefficient of qN.q_{N}. The scheme will continue by picking for the next step the polynomial qN(n+h)qN(n)q_{N}(n+h)-q_{N}(n) (for the corresponding shift hh\in\mathbb{N} with leading coefficient DD times the leading coefficient of qNq_{N}) which is of minimum degree.

Continuing the procedure, we eventually arrive at, say kk many, degree 11 iterates with distinct leading coefficients (because of the essential distinctness property), which are all multiples of d!ad,Nd!\cdot a_{d,N} (i.e., the coefficient of pN(d1)(n)p_{N}^{(d-1)}(n)). As all the iterated ratios of these coefficients are independent of ad,Na_{d,N} and non-zero, we get that they satisfy the RkR_{k}-property.
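The appearance of the factor $d!\cdot a_{d,N}$ can be seen by iterating the difference operator $\Delta_{1}$ on a single polynomial. The following sketch uses a hypothetical cubic and hypothetical shifts $h_{1}=2,$ $h_{2}=5;$ it only illustrates that $d-1$ differences of a degree $d$ polynomial leave a linear polynomial with leading coefficient $d!\cdot a_{d}\cdot h_{1}\cdots h_{d-1},$ and it does not reproduce the full bookkeeping of the argument above.

```python
from fractions import Fraction
from math import comb, factorial

def difference(p, h):
    """Coefficients of n -> p(n + h) - p(n), for p = [a_0, ..., a_d]."""
    out = [Fraction(0)] * len(p)
    for k, c in enumerate(p):
        for j in range(k):            # the n^k terms cancel, so only j < k
            out[j] += c * comb(k, j) * h ** (k - j)
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

# Hypothetical data: p(n) = (5/3) n^3 - n + 7, shifts h_1 = 2, h_2 = 5.
a_d = Fraction(5, 3)
p = [7, -1, 0, a_d]
shifts = [2, 5]

q = p
for h in shifts:
    q = difference(q, h)              # each step lowers the degree by one

d = len(p) - 1
expected_leading = factorial(d) * a_d * shifts[0] * shifts[1]
```

In this example the final polynomial is $100n+350,$ and $100=3!\cdot\frac{5}{3}\cdot 2\cdot 5.$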

(6)(6) If (𝒫N)N(\mathcal{P}_{N})_{N} is super nice, then (𝒫N)N(\mathcal{P}^{\prime}_{N})_{N} is super nice too.

Looking at Property (i)(i) for (𝒫N)N,(\mathcal{P}_{N})_{N}, we have that each polynomial (sequence) in (𝒫N)N(\mathcal{P}^{\prime}_{N})_{N} is non-constant and has degree independent of N,N, equal to deg(pi0,Np1,N).\deg(p_{i_{0},N}-p_{1,N}). If we let

$$q_{i,N}:=\begin{cases}p_{i,N}-p_{i_{0},N}&i\neq i_{0}\\ -p_{i_{0},N}&i=i_{0}\end{cases},\;\text{then we have}\;q_{i,N}-q_{j,N}=\begin{cases}p_{i,N}-p_{j,N}&i,j\neq i_{0}\\ p_{i,N}&j=i_{0}\\ -p_{j,N}&i=i_{0}\end{cases},$$

so (i) follows for (𝒫N)N(\mathcal{P}^{\prime}_{N})_{N} as well. (ii)(ii) and (ii)(ii)^{\prime} follow by the fact that ((𝒫)N)N=(𝒫N)N.((\mathcal{P}^{\prime})^{\prime}_{N})_{N}=(\mathcal{P}_{N})_{N}.

$(7)$ Property $(i)$ is invariant under the vdC-operation. [Footnote 22: So, for sequences $(\mathcal{P}_{N})_{N}$ with degree $\geq 2,$ the vdC-operation preserves the super niceness property.]

Indeed, if pi0,Np_{i_{0},N} is the polynomial guaranteed by Lemma 4.7, then we have the iterates: p1,N(n+h)pi0,N(n),,p,N(n+h)pi0,N(n),p_{1,N}(n+h)-p_{i_{0},N}(n),\ldots,p_{\ell,N}(n+h)-p_{i_{0},N}(n), and p1,N(n)pi0,N(n),,pi01,N(n)pi0,N(n),pi0+1,N(n)pi0,N(n),,p,N(n)pi0,N(n).p_{1,N}(n)-p_{i_{0},N}(n),\ldots,p_{i_{0}-1,N}(n)-p_{i_{0},N}(n),p_{i_{0}+1,N}(n)-p_{i_{0},N}(n),\ldots,p_{\ell,N}(n)-p_{i_{0},N}(n).

The degrees of these polynomials satisfy

deg(pi,N(n+h)pi0,N(n))={deg(pi,N(n)pi0,N(n))ii0deg(pi0,N)1i=i0.\deg(p_{i,N}(n+h)-p_{i_{0},N}(n))=\begin{cases}\deg(p_{i,N}(n)-p_{i_{0},N}(n))&i\neq i_{0}\\ \deg(p_{i_{0},N})-1&i=i_{0}\end{cases}.

For the pairwise differences part, for ij,i\neq j, we have

pi,N(n+h)pi0,N(n)(pj,N(n+h)pi0,N(n))=pi,N(n+h)pj,N(n+h),p_{i,N}(n+h)-p_{i_{0},N}(n)-(p_{j,N}(n+h)-p_{i_{0},N}(n))=p_{i,N}(n+h)-p_{j,N}(n+h),
pi,N(n)pi0,N(n)(pj,N(n)pi0,N(n))=pi,N(n)pj,N(n),p_{i,N}(n)-p_{i_{0},N}(n)-(p_{j,N}(n)-p_{i_{0},N}(n))=p_{i,N}(n)-p_{j,N}(n),

and, finally,

pi,N(n+h)pi0,N(n)(pj,N(n)pi0,N(n))=pi,N(n+h)pj,N(n),p_{i,N}(n+h)-p_{i_{0},N}(n)-(p_{j,N}(n)-p_{i_{0},N}(n))=p_{i,N}(n+h)-p_{j,N}(n),

so everything follows by Property (i) for $(\mathcal{P}_{N})_{N}.$ [Footnote 24: Recall here that if, in the case where $i=j,$ we have $\deg(p_{i,N})=1,$ then $\deg(p_{i_{0},N})=1$ (as $p_{i_{0},N}$ is a non-constant polynomial of minimum degree in $(\mathcal{P}_{N})_{N}$), so the vdC-operation will group the terms $p_{i,N}(n+h)-p_{i_{0},N}(n)$ and $p_{i,N}(n)-p_{i_{0},N}(n)$ together, as they are of degree $1$ with bounded difference.]

Notice that Remark 6 (5) implies that Theorem 1.3, via Theorem 2.2, holds for a larger class of variable polynomial sequences; even with coefficients that oscillate.

A real-valued function gg which is continuously differentiable on [c,),[c,\infty), where c0,c\geq 0, is called Fejér if the following hold:

\bullet g(x)g^{\prime}(x) tends monotonically to 0 as x;x\to\infty; and

\bullet $\lim_{x\to\infty}x|g^{\prime}(x)|=\infty.$ [Footnote 25: For a study of averages with general sublinear iterates one is referred to [8], and to [3] and [24] for more general functions, e.g. tempered functions.]

Any such function is eventually monotonic and satisfies the growth conditions logxg(x)x,\log x\prec g(x)\prec x, hence (1/g(N))N(1/g(N))_{N} has the R1R_{1}-property. So, modulo the goodness property, Theorem 2.2 will also hold for polynomial sequences of the form:

(5g1(N)n3+p1,N(n))N,or(7g2(N)n17+p2,N(n))N,\left(\frac{\sqrt{5}}{g_{1}(N)}n^{3}+p_{1,N}(n)\right)_{N},\;\text{or}\;\left(\frac{7}{g_{2}(N)}n^{17}+p_{2,N}(n)\right)_{N},

where $g_{1}(x)=x^{1/2}(2+\cos\sqrt{\log x}),$ $g_{2}(x)=x^{1/40}(1/10+\sin\log x)^{3},$ and $p_{1,N},$ $p_{2,N}$ are polynomials of degrees less than $3$ and $17$ respectively with bounded coefficients. This is a non-trivial generalization because, while the functions $g_{1}$ and $g_{2}$ are Fejér, they oscillate and hence do not belong to $\mathcal{SLE}.$

The following result shows that the nilfactor 𝒵\mathcal{Z} is characteristic for a super nice collection of polynomial sequences (p1,N,,p,N)N.(p_{1,N},\ldots,p_{\ell,N})_{N}.

Proposition 2.

For \ell\in\mathbb{N} let (p1,N,,p,N)N(p_{1,N},\ldots,p_{\ell,N})_{N} be a super nice sequence of polynomials, (X,,μ,T)(X,\mathcal{B},\mu,T) a system, and suppose that at least one of the functions f1,,fL(μ)f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu) is orthogonal to the nilfactor 𝒵.\mathcal{Z}. Then, we have

limN1Nn=1NT[p1,N(n)]f1T[p,N(n)]f2=0.\lim_{N\to\infty}\left\|\frac{1}{N}\sum_{n=1}^{N}T^{[p_{1,N}(n)]}f_{1}\cdot\ldots\cdot T^{[p_{\ell,N}(n)]}f_{\ell}\right\|_{2}=0. (22)
Proof.

We assume without loss of generality that f1f_{1} is orthogonal to 𝒵.\mathcal{Z}. As in [12, Lemma 4.7], to show (22), it suffices to show:

limNsupf0,f2,,f11Nn=1N|f0T[p1,N(n)]f1T[p,N(n)]f𝑑μ|=0.\lim_{N\to\infty}\sup_{\left\|f_{0}\right\|_{\infty},\left\|f_{2}\right\|_{\infty},\ldots,\left\|f_{\ell}\right\|_{\infty}\leq 1}\frac{1}{N}\sum_{n=1}^{N}\left|\int f_{0}\cdot T^{[p_{1,N}(n)]}f_{1}\cdot\ldots\cdot T^{[p_{\ell,N}(n)]}f_{\ell}\;d\mu\right|=0. (23)

We claim next that we can further assume that deg(p1,N)=deg(𝒫N).\deg(p_{1,N})=\deg(\mathcal{P}_{N}). If this is not the case and deg(p1,N)<deg(pi0,N)=deg(𝒫N),\deg(p_{1,N})<\deg(p_{i_{0},N})=\deg(\mathcal{P}_{N}), then, precomposing with T[pi0,N(n)],T^{-[p_{i_{0},N}(n)]}, (23) becomes

limNsupf0,f2,,f11Nn=1N|fi0T[pi0,N(n)]f0,n,N1ii0T[pi,N(n)pi0,N(n)]fi,n,Ndμ|=0,\begin{split}&\quad\lim_{N\to\infty}\sup_{\left\|f_{0}\right\|_{\infty},\left\|f_{2}\right\|_{\infty},\ldots,\left\|f_{\ell}\right\|_{\infty}\leq 1}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int f_{i_{0}}\cdot T^{[-p_{i_{0},N}(n)]}f_{0,n,N}\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{1\leq i\neq i_{0}\leq\ell}T^{[p_{i,N}(n)-p_{i_{0},N}(n)]}f_{i,n,N}\;d\mu\bigg{|}=0,\end{split}

where fi,n,N=Tei(n,N)fif_{i,n,N}=T^{e_{i}(n,N)}f_{i} for some ei(n,N){0,1}.e_{i}(n,N)\in\{0,1\}. It suffices to show that for all e{0,1}e\in\{0,1\}

limNsupf0,f2,,f11Nn=1N|fi0T[pi0,N(n)]f0T[p1,N(n)pi0,N(n)](Tef1)2ii0T[pi,N(n)pi0,N(n)]fidμ|=0.\begin{split}&\lim_{N\to\infty}\sup_{\left\|f_{0}\right\|_{\infty},\left\|f_{2}\right\|_{\infty},\ldots,\left\|f_{\ell}\right\|_{\infty}\leq 1}\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int f_{i_{0}}\cdot T^{[-p_{i_{0},N}(n)]}f_{0}\cdot T^{[p_{1,N}(n)-p_{i_{0},N}(n)]}(T^{e}f_{1})\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{2\leq i\neq i_{0}\leq\ell}T^{[p_{i,N}(n)-p_{i_{0},N}(n)]}f_{i}\;d\mu\bigg{|}=0.\end{split}

The claim follows by Remark 6 (6), as the family (pi0,Np1,N,,pi0,Npi01,N,(p_{i_{0},N}-p_{1,N},\ldots,p_{i_{0},N}-p_{i_{0}-1,N}, pi0,N,pi0,Npi0+1,N,,pi0,Np,N)Np_{i_{0},N},p_{i_{0},N}-p_{i_{0}+1,N},\ldots,p_{i_{0},N}-p_{\ell,N})_{N} is super nice with degree=deg(pi0,Np1,N)=\deg(p_{i_{0},N}-p_{1,N}).

If all the $p_{i,N}$’s are of degree $1,$ the result follows from Proposition 1. For $\deg(p_{1,N})\geq 2,$ we use induction on the type of the sequence of polynomial families $(\mathcal{P}_{N})_{N}.$

For every NN\in\mathbb{N} we choose functions fi,Nf_{i,N} with fi,N1\left\|f_{i,N}\right\|_{\infty}\leq 1 for i{0,2,,},i\in\{0,2,\ldots,\ell\}, so that the average in (23) is 1/N1/N close to supf0,f2,,f1.\sup_{\left\|f_{0}\right\|_{\infty},\left\|f_{2}\right\|_{\infty},\ldots,\left\|f_{\ell}\right\|_{\infty}\leq 1}. If S=T×T,S=T\times T, F1=f1f¯1,F_{1}=f_{1}\otimes\overline{f}_{1}, Fi,N=fi,Nf¯i,N,F_{i,N}=f_{i,N}\otimes\overline{f}_{i,N}, i=0,2,,,i=0,2,\ldots,\ell, and μ~=μ×μ,\tilde{\mu}=\mu\times\mu, using Cauchy-Schwarz, (23) follows if

limN1Nn=1NS[p1,N(n)]F1i=2S[pi,N(n)]Fi,NL2(μ~)=0.\lim_{N\to\infty}\left\|\frac{1}{N}\sum_{n=1}^{N}S^{[p_{1,N}(n)]}F_{1}\cdot\prod_{i=2}^{\ell}S^{[p_{i,N}(n)]}F_{i,N}\right\|_{L^{2}(\tilde{\mu})}=0. (24)

By Lemma 4.4, (24) follows if, for large enough h,h, for NN\to\infty we have

1Nn=1N|S[p1,N(n+h)]F1i=2S[pi,N(n+h)]Fi,NS[p1,N(n)]F¯1i=2S[pi,N(n)]F¯i,Ndμ~|\frac{1}{N}\sum_{n=1}^{N}\bigg{|}\int S^{[p_{1,N}(n+h)]}F_{1}\cdot\prod_{i=2}^{\ell}S^{[p_{i,N}(n+h)]}F_{i,N}\cdot S^{[p_{1,N}(n)]}\overline{F}_{1}\cdot\prod_{i=2}^{\ell}S^{[p_{i,N}(n)]}\overline{F}_{i,N}\;d\tilde{\mu}\bigg{|} (25)

goes to 0.0. Picking pj0,Np_{j_{0},N} as guaranteed by Lemma 4.7 (the degrees of the pi,Np_{i,N}’s are fixed, so the choice of j0j_{0} is independent of NN), we precompose with the term S[pj0,N(n)]S^{-[p_{j_{0},N}(n)]} in the integrand of (25) (notice that some error terms ei(n,h,N),e_{i}(n,h,N), e~i(n,N){0,1}\tilde{e}_{i}(n,N)\in\{0,1\} will appear). Next, we group the degree 11 iterates. More specifically, if deg(pi,N)=1\deg(p_{i,N})=1 for some 2i2\leq i\leq\ell, then [pi,N(n+h)]=[pi,N(n)]+[ci,Nh]+ei(n,h,N),[p_{i,N}(n+h)]=[p_{i,N}(n)]+[c_{i,N}h]+e^{\prime}_{i}(n,h,N), for some error terms in {0,1}.\{0,1\}. Hence, we have

S[pi,N(n+h)pj0,N(n)]+ei(n,h,N)Fi,NS[pi,N(n)pj0,N(n)]+e~i(n,h,N)F¯i,N=S[pi,N(n)pj0,N(n)](S[ci,Nh]+ei(n,h,N)+ei(n,h,N)Fi,NSe~i(n,h,N)F¯i,N).\begin{split}&\quad S^{[p_{i,N}(n+h)-p_{j_{0},N}(n)]+e_{i}(n,h,N)}F_{i,N}\cdot S^{[p_{i,N}(n)-p_{j_{0},N}(n)]+\tilde{e}_{i}(n,h,N)}\overline{F}_{i,N}\\ &=S^{[p_{i,N}(n)-p_{j_{0},N}(n)]}(S^{[c_{i,N}h]+e_{i}(n,h,N)+e^{\prime}_{i}(n,h,N)}F_{i,N}\cdot S^{\tilde{e}_{i}(n,h,N)}\overline{F}_{i,N}).\end{split}

We treat this product as one iterate. After this grouping, assuming that rr many terms remain, it suffices to show, for large h,h, and every choice of e{0,1}e\in\{0,1\} that

limNsupF0,F2,,Fr1|1Nn=1NF0S[p1,h,N(n)](SeF1)i=2rS[pi,h,N(n)]Fidμ~|=0,\begin{split}&\quad\lim_{N\to\infty}\sup_{\left\|F_{0}\right\|_{\infty},\left\|F_{2}\right\|_{\infty},\ldots,\left\|F_{r}\right\|_{\infty}\leq 1}\bigg{|}\frac{1}{N}\sum_{n=1}^{N}\int F_{0}\cdot S^{[p_{1,h,N}(n)]}(S^{e}F_{1})\\ &\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\cdot\prod_{i=2}^{r}S^{[p_{i,h,N}(n)]}F_{i}\;d\tilde{\mu}\bigg{|}=0,\end{split} (26)

where the polynomial sequences (pi,h,N)N(p_{i,h,N})_{N} form (𝒫N(pj0,N,h))N,(\mathcal{P}_{N}(p_{j_{0},N},h))_{N}, a polynomial family with p1,h,N(n)=p1,N(n+h)pj0,N(n)p_{1,h,N}(n)=p_{1,N}(n+h)-p_{j_{0},N}(n) and deg(p1,h,N)=deg(𝒫N(pj0,N,h)).\deg(p_{1,h,N})=\deg(\mathcal{P}_{N}(p_{j_{0},N},h)).

The left-hand side of (26) has the same form as that of (23), with the polynomial family of the former having type strictly smaller than that of the latter (by Lemma 4.7). Using Remark 6 (7), we are done by induction. ∎

For the expression of Theorem 2.2, writing i[pN(n)]=[ipN(n)]+ei,N(n),i[p_{N}(n)]=[ip_{N}(n)]+e_{i,N}(n), for some ei,N(n){i,,1,0},e_{i,N}(n)\in\{-i,\ldots,-1,0\}, using Proposition 2, we get the following result:

Proposition 3.

For \ell\in\mathbb{N} let (pN,2pN,,pN)N(p_{N},2p_{N},\ldots,\ell p_{N})_{N} be a super nice sequence of polynomials, (X,,μ,T)(X,\mathcal{B},\mu,T) a system, and suppose that at least one of the functions f1,,fL(μ)f_{1},\ldots,f_{\ell}\in L^{\infty}(\mu) is orthogonal to the nilfactor 𝒵.\mathcal{Z}. Then, we have

limN1Nn=1NT[pN(n)]f1T2[pN(n)]f2T[pN(n)]f2=0.\lim_{N\to\infty}\left\|\frac{1}{N}\sum_{n=1}^{N}T^{[p_{N}(n)]}f_{1}\cdot T^{2[p_{N}(n)]}f_{2}\cdot\ldots\cdot T^{\ell[p_{N}(n)]}f_{\ell}\right\|_{2}=0.
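The passage from $i[p_{N}(n)]$ to $[ip_{N}(n)]$ used to deduce Proposition 3 rests on an elementary property of the integer part, which can be checked exactly on rational inputs. The sample points $k/7$ and the range of multipliers below are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from math import floor

def floor_error(i, x):
    """The integer e with i * [x] = [i * x] + e."""
    return i * floor(x) - floor(i * x)

# Exact rational sample points k/7 (a hypothetical choice), multipliers i = 1..5.
samples = [Fraction(k, 7) for k in range(-30, 31)]
errors = {i: {floor_error(i, x) for x in samples} for i in range(1, 6)}
```

Since $i[x]-[ix]=-[i\{x\}]$ and $0\leq i\{x\}<i,$ the error always lies in $\{-(i-1),\ldots,0\},$ which is contained in the set $\{-i,\ldots,-1,0\}$ used above; the computation agrees with this on the sampled points.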

5. Equidistribution

In order to prove our main equidistribution result (Theorem 5.2), we start with some definitions and facts, following [11] (see [11, Subsubsection 2.3.2] for more details).

If GG is a nilpotent group, then a sequence g:Gg:\mathbb{N}\to G of the form g(n)=b1p1(n)bkpk(n),g(n)=b_{1}^{p_{1}(n)}\cdots b_{k}^{p_{k}(n)}, where biG,b_{i}\in G, and pip_{i} are integer polynomials, is called a polynomial sequence in GG. If the maximum degree of the polynomials pip_{i}’s is at most dd we say that the degree of g(n)g(n) is at most d.d.

Given a nilmanifold X=G/ΓX=G/\Gamma the horizontal torus is defined to be the compact abelian group Z=G/([G,G]Γ)Z=G/([G,G]\Gamma). If XX is connected, then ZZ is isomorphic to some finite dimensional torus 𝕋s\mathbb{T}^{s}. A horizontal character χ:G\chi:G\to\mathbb{C} is a continuous homomorphism that satisfies χ(gγ)=χ(g)\chi(g\gamma)=\chi(g) for every γΓ\gamma\in\Gamma and can be thought of as a character of 𝕋s\mathbb{T}^{s}, in which case there exists a unique κs\kappa\in\mathbb{Z}^{s} such that χ(ts)=e(κt)\chi(t\mathbb{Z}^{s})=e(\kappa\cdot t), where “\cdot” denotes the inner product operation, and e(x):=e2πixe(x):=e^{2\pi ix}.

Let $p:\mathbb{Z}\to\mathbb{R}$ be a polynomial sequence of degree $d$ of the form $p(n)=\sum_{i=0}^{d}a_{i}n^{i}$, where $a_{i}\in\mathbb{R},$ $0\leq i\leq d.$ We define the smoothness norm by

$$\left\|e(p(n))\right\|_{C^{\infty}[N]}:=\max_{1\leq i\leq d}(N^{i}\left\|a_{i}\right\|),$$ (27)

where \left\|\cdot\right\| denotes the distance to the closest integer, i.e., x:=d(x,)\left\|x\right\|:=d(x,\mathbb{Z}).
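As a small illustration of (27), the smoothness norm can be computed exactly for rational coefficients. The function names and the sample coefficients are our own assumptions; note that the constant term $a_{0}$ plays no role.

```python
from fractions import Fraction

def dist_to_integer(x):
    """||x||: the distance from x to the closest integer."""
    frac = x - (x // 1)              # fractional part, in [0, 1)
    return min(frac, 1 - frac)

def smoothness_norm(coeffs, N):
    """max over 1 <= i <= d of N^i * ||a_i||, for coeffs = [a_0, ..., a_d]."""
    return max(N ** i * dist_to_integer(a) for i, a in enumerate(coeffs) if i >= 1)

# Hypothetical polynomial p(n) = 1/2 + n/3 + (199/100) n^2 and N = 10.
coeffs = [Fraction(1, 2), Fraction(1, 3), Fraction(199, 100)]
norm_value = smoothness_norm(coeffs, 10)
```

Here $N\left\|a_{1}\right\|=10/3$ and $N^{2}\left\|a_{2}\right\|=1,$ so the norm equals $10/3;$ a polynomial with integer coefficients has smoothness norm $0.$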

Given NN\in\mathbb{N}, a finite sequence (g(n)Γ)1nN(g(n)\Gamma)_{1\leq n\leq N} is said to be δ\delta-equidistributed in XX, if

|1Nn=1NF(g(n)Γ)XF𝑑mX|δFLip(X)\left|\frac{1}{N}\sum_{n=1}^{N}F(g(n)\Gamma)-\int_{X}Fdm_{X}\right|\leq\delta\left\|F\right\|_{{\text{\rm Lip}}(X)}

for every Lipschitz function F:X,F:X\to\mathbb{C}, where

FLip(X)=F+supx,yX,xy|F(x)F(y)|dX(x,y)\left\|F\right\|_{{\text{\rm Lip}}(X)}=\left\|F\right\|_{\infty}+\sup_{x,y\in X,x\neq y}\frac{|F(x)-F(y)|}{d_{X}(x,y)}

for some appropriate metric dXd_{X}.

At this point we quote [11, Theorem 2.9], a direct consequence of [17, Theorem 2.9]:

Theorem 5.1 (Green & Tao, [17]).

Let $X=G/\Gamma$ be a nilmanifold with $G$ connected and simply connected, and $d\in\mathbb{N}$. Then for every small enough $\delta>0$ there exists a positive constant $M\equiv M(X,d,\delta)$ with the following property: For every $N\in\mathbb{N}$, if $g:\mathbb{Z}\to G$ is a polynomial sequence of degree at most $d$ such that the finite sequence $(g(n)\Gamma)_{1\leq n\leq N}$ is not $\delta$-equidistributed, then for some non-trivial horizontal character $\chi$ with $\left\|\chi\right\|\leq M$ we have

$$\left\|\chi(g(n))\right\|_{C^{\infty}[N]}\leq M$$

(χ\chi here is thought of as a character of the horizontal torus Z=𝕋sZ=\mathbb{T}^{s} and g(n)g(n) as a polynomial sequence in 𝕋s\mathbb{T}^{s}).

Adapting the notion of equidistribution of a sequence in a nilmanifold (recall (14)) to our case, abusing the notation, we say that (baN(n)x)1nN,(b^{a_{N}(n)}x)_{1\leq n\leq N}, where (aN(n))1nN(a_{N}(n))_{1\leq n\leq N} is a variable sequence of real numbers and X=G/ΓX=G/\Gamma is a nilmanifold with GG connected and simply connected, is equidistributed in a subnilmanifold YY of X,X, if for every FC(X)F\in C(X) we have

limN1Nn=1NF(baN(n)x)=F𝑑mY.\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F(b^{a_{N}(n)}x)=\int F\;dm_{Y}.
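To get a feeling for this definition, consider the simplest case $X=\mathbb{R}/\mathbb{Z}$ (so $G=\mathbb{R},$ $\Gamma=\mathbb{Z}$), $b=\beta\in\mathbb{R}$ and $x=\mathbb{Z},$ where $b^{a_{N}(n)}x=\beta a_{N}(n)\mathbb{Z}.$ The sketch below takes the variable sequence $a_{N}(n)=n/N^{1/2}$ (our illustrative choice, of the form appearing in Remark 6 (4) with $i=1$ and $a=1/2$) and evaluates the averages only for the characters $F(t\mathbb{Z})=e(kt),$ whose integral is $0$ for $k\neq 0;$ the function name and the value $\beta=\sqrt{2}$ are assumptions.

```python
import cmath
import math

def character_average(beta, N, k=1, exponent=0.5):
    """(1/N) * sum_{n=1}^{N} e(k * beta * a_N(n)), where a_N(n) = n / N**exponent."""
    scale = N ** exponent
    total = sum(cmath.exp(2j * math.pi * k * beta * n / scale) for n in range(1, N + 1))
    return total / N

beta = math.sqrt(2)            # hypothetical choice of b in G = R
averages = {N: abs(character_average(beta, N)) for N in (100, 10_000)}
```

For this particular sequence the sum is geometric, and for $\beta/\sqrt{N}\leq 1/2$ its average is bounded in absolute value by $1/(2\beta\sqrt{N}),$ so the computed values are small and consistent with equidistribution on the whole circle.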

In order for us to prove Theorems 2.1 and 2.2, we prove the following equidistribution theorem, which is the main result of this section:

Theorem 5.2.

Let (p1,N,,p,N)N(p_{1,N},\ldots,p_{\ell,N})_{N} be a good sequence of \ell-tuples of polynomials.

  1. (i)

    If Xi=Gi/Γi,X_{i}=G_{i}/\Gamma_{i}, 1i,1\leq i\leq\ell, are nilmanifolds with GiG_{i} connected and simply connected, then for every biGib_{i}\in G_{i} and xiXix_{i}\in X_{i} the sequence

    (b1p1,N(n)x1,,bp,N(n)x)1nN(b_{1}^{p_{1,N}(n)}x_{1},\ldots,b_{\ell}^{p_{\ell,N}(n)}x_{\ell})_{1\leq n\leq N}

    is equidistributed in the nilmanifold (b1sx1)¯s××(bsx)¯s.\overline{(b_{1}^{s}x_{1})}_{s\in\mathbb{R}}\times\cdots\times\overline{(b_{\ell}^{s}x_{\ell})}_{s\in\mathbb{R}}.

  2. (ii)

    If Xi=Gi/Γi,X_{i}=G_{i}/\Gamma_{i}, 1i,1\leq i\leq\ell, are nilmanifolds, then for every biGib_{i}\in G_{i} and xiXix_{i}\in X_{i} the sequence

    (b1[p1,N(n)]x1,,b[p,N(n)]x)1nN(b_{1}^{[p_{1,N}(n)]}x_{1},\ldots,b_{\ell}^{[p_{\ell,N}(n)]}x_{\ell})_{1\leq n\leq N}

    is equidistributed in the nilmanifold (b1nx1)¯n××(bnx)¯n.\overline{(b_{1}^{n}x_{1})}_{n}\times\cdots\times\overline{(b_{\ell}^{n}x_{\ell})}_{n}.

Remark 7.

In order to prove Theorem 5.2, we can assume that X1==X=X.X_{1}=\ldots=X_{\ell}=X.

Indeed, in the general case we consider the nilmanifold X~=X1××X.\tilde{X}=X_{1}\times\cdots\times X_{\ell}. Then X~=G~/Γ~,\tilde{X}=\tilde{G}/\tilde{\Gamma}, where G~=G1××G\tilde{G}=G_{1}\times\cdots\times G_{\ell} is connected and simply connected and Γ~=Γ1××Γ\tilde{\Gamma}=\Gamma_{1}\times\cdots\times\Gamma_{\ell} is a discrete cocompact subgroup of G~.\tilde{G}. Each bib_{i} can be considered as an element of G~\tilde{G} and each xix_{i} as an element of X~.\tilde{X}. Changing the base point we can also assume that x=Γ.x=\Gamma.

Part (ii) of the previous result follows from Part (i) (see [11, Lemma 5.1]):

Lemma 5.3.

Let $\ell\in\mathbb{N}$ and $(a_{1,N},\ldots,a_{\ell,N})_{N}$ be a sequence of $\ell$-tuples of real numbers. Suppose that for every nilmanifold $X=G/\Gamma,$ with $G$ connected and simply connected, and every $b_{1},\ldots,b_{\ell}\in G$ the sequence

(b1a1,N(n)Γ,,ba,N(n)Γ)1nN(b_{1}^{a_{1,N}(n)}\Gamma,\ldots,b_{\ell}^{a_{\ell,N}(n)}\Gamma)_{1\leq n\leq N}

is equidistributed in the nilmanifold (b1sΓ)¯s××(bsΓ)¯s.\overline{(b_{1}^{s}\Gamma)}_{s\in\mathbb{R}}\times\cdots\times\overline{(b_{\ell}^{s}\Gamma)}_{s\in\mathbb{R}}. Then, for every nilmanifold X=G/Γ,X=G/\Gamma, b1,,bGb_{1},\ldots,b_{\ell}\in G and x1,,xX,x_{1},\ldots,x_{\ell}\in X, the sequence

(b1[a1,N(n)]x1,,b[a,N(n)]x)1nN(b_{1}^{[a_{1,N}(n)]}x_{1},\ldots,b_{\ell}^{[a_{\ell,N}(n)]}x_{\ell})_{1\leq n\leq N}

is equidistributed in the nilmanifold (b1nx1)¯n××(bnx)¯n.\overline{(b_{1}^{n}x_{1})}_{n}\times\cdots\times\overline{(b_{\ell}^{n}x_{\ell})}_{n}.

Sketch of the proof.

Following [11, Lemma 4.1], we show the =1\ell=1 case, as the general one follows with some straightforward modifications.

Let X=G/ΓX=G/\Gamma be a nilmanifold, bGb\in G and xX.x\in X. Using some standard reductions (namely, the lifting argument and the change of base point formula from Subsections 3.2.3 and  3.2.2), we can and will assume that GG is connected and simply connected and that x=Γ.x=\Gamma.

Letting Xb:=(bnΓ)¯nX_{b}:=\overline{(b^{n}\Gamma)}_{n} and mXbm_{X_{b}} the corresponding normalized Haar measure, we will show that for every FC(X)F\in C(X) we have

limN1Nn=1NF(b[aN(n)]Γ)=XbF𝑑mXb.\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F(b^{[a_{N}(n)]}\Gamma)=\int_{X_{b}}F\ dm_{X_{b}}. (28)

Using our assumption for the case X~:=G~/Γ~,\tilde{X}:=\tilde{G}/\tilde{\Gamma}, where G~:=×G\tilde{G}:=\mathbb{R}\times G is connected and simply connected, Γ~:=×Γ\tilde{\Gamma}:=\mathbb{Z}\times\Gamma and b~:=(1,b),\tilde{b}:=(1,b), for every HC(X~)H\in C(\tilde{X}) we have

limN1Nn=1NH(b~aN(n)Γ~)=X~b~H𝑑mX~b~,\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}H(\tilde{b}^{a_{N}(n)}\tilde{\Gamma})=\int_{\tilde{X}_{\tilde{b}}}H\ dm_{\tilde{X}_{\tilde{b}}}, (29)

where $\tilde{X}_{\tilde{b}}:=\overline{(s\mathbb{Z},b^{s}\Gamma)}_{s\in\mathbb{R}}$ and $m_{\tilde{X}_{\tilde{b}}}$ is its corresponding normalized Haar measure. [Footnote 26: Here we adopt the notation $z\mathbb{Z},$ which is more convenient than $z\ (\text{mod}\ 1).$]

Let FC(X),F\in C(X), and define F~:X~\tilde{F}:\tilde{X}\to\mathbb{C} with F~(t,gΓ):=F(b{t}gΓ).\tilde{F}(t\mathbb{Z},g\Gamma):=F(b^{-\{t\}}g\Gamma). While F~\tilde{F} may be discontinuous, for every 0<δ<1/20<\delta<1/2 there exists F~δC(X~)\tilde{F}_{\delta}\in C(\tilde{X}) that equals F~\tilde{F} on X~δ=Iδ×X,\tilde{X}_{\delta}=I_{\delta}\times X, where Iδ={t:tδ},I_{\delta}=\{t\mathbb{Z}:\;\left\|t\right\|\geq\delta\}, and it is uniformly bounded by 2F.2\left\|F\right\|_{\infty}.

Since $\tilde{b}^{a_{N}(n)}=(a_{N}(n),b^{a_{N}(n)}),$ our assumption implies that $a_{N}(n)\mathbb{Z}\in I_{\delta},$ and so $\tilde{b}^{a_{N}(n)}\tilde{\Gamma}\in\tilde{X}_{\delta},$ for a set of $n$’s with density $1-2\delta.$ [Footnote 27: By this we mean $\lim_{N\to\infty}N^{-1}\cdot|\{1\leq n\leq N:\;a_{N}(n)\mathbb{Z}\in I_{\delta}\}|=1-2\delta.$] So,

lim supN1Nn=1N|F~(b~aN(n)Γ~)F~δ(b~aN(n)Γ~)|4δF,\limsup_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}|\tilde{F}(\tilde{b}^{a_{N}(n)}\tilde{\Gamma})-\tilde{F}_{\delta}(\tilde{b}^{a_{N}(n)}\tilde{\Gamma})|\leq 4\delta\left\|F\right\|_{\infty},

hence, since (29) holds for every F~δ,\tilde{F}_{\delta}, it also holds for F~.\tilde{F}.

The map (s,gΓ)b{s}gΓ(s\mathbb{Z},g\Gamma)\mapsto b^{-\{s\}}g\Gamma sends X~b~\tilde{X}_{\tilde{b}} onto Xb.X_{b}. Defining the measure mm on XbX_{b} by

XbF𝑑m:=X~b~F(b{s}gΓ)𝑑mX~b~(s,gΓ),\int_{X_{b}}F\ dm:=\int_{\tilde{X}_{\tilde{b}}}F(b^{-\{s\}}g\Gamma)\ dm_{\tilde{X}_{\tilde{b}}}(s\mathbb{Z},g\Gamma),

we have (see [11, Lemma 4.1] for details) that m=mXb.m=m_{X_{b}}. Thus, since

F~(b~aN(n)Γ~)=F~(aN(n),baN(n)Γ)=F(b{aN(n)}baN(n)Γ)=F(b[aN(n)]Γ),\tilde{F}(\tilde{b}^{a_{N}(n)}\tilde{\Gamma})=\tilde{F}(a_{N}(n)\mathbb{Z},b^{a_{N}(n)}\Gamma)=F(b^{-\{a_{N}(n)\}}b^{a_{N}(n)}\Gamma)=F(b^{[a_{N}(n)]}\Gamma),

using (29) for the function F~,\tilde{F}, we get

limN1Nn=1NF(b[aN(n)]Γ)=X~b~F~𝑑mX~b~=X~b~F(b{s}gΓ)𝑑mX~b~(s,gΓ)=XbF𝑑mXb,\begin{split}&\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F(b^{[a_{N}(n)]}\Gamma)=\int_{\tilde{X}_{\tilde{b}}}\tilde{F}\ dm_{\tilde{X}_{\tilde{b}}}\\ =&\int_{\tilde{X}_{\tilde{b}}}F(b^{-\{s\}}g\Gamma)\ dm_{\tilde{X}_{\tilde{b}}}(s\mathbb{Z},g\Gamma)=\int_{X_{b}}F\ dm_{X_{b}},\end{split}

so we have (28). ∎

Recalling that a sequence of \ell-tuples of variable polynomials (p1,N,,p,N)N(p_{1,N},\ldots,p_{\ell,N})_{N} is good if every non-trivial linear combination of (p1,N)N,,(p,N)N(p_{1,N})_{N},\ldots,(p_{\ell,N})_{N} is good, we have the following:

Lemma 5.4.

Let (p1,N,,p,N)N(p_{1,N},\ldots,p_{\ell,N})_{N} be a good sequence of \ell-tuples of polynomials, Xi=Gi/ΓiX_{i}=G_{i}/\Gamma_{i} nilmanifolds, with GiG_{i} connected and simply connected, and suppose that biGib_{i}\in G_{i} acts ergodically on Xi,X_{i}, 1i1\leq i\leq\ell. Then the sequence

(b1p1,N(n)Γ1,,bp,N(n)Γ)1nN(b_{1}^{p_{1,N}(n)}\Gamma_{1},\ldots,b_{\ell}^{p_{\ell,N}(n)}\Gamma_{\ell})_{1\leq n\leq N}

is equidistributed in X1××XX_{1}\times\cdots\times X_{\ell}.

Proof.

We follow [11, Lemma 5.3]. As the general case is similar, we assume that $X_{1}=\ldots=X_{\ell}=X$. Arguing by contradiction, we will also assume that for some $\delta>0$, $(b_{1}^{p_{1,N}(n)}\Gamma,\ldots,b_{\ell}^{p_{\ell,N}(n)}\Gamma)_{1\leq n\leq N}$ is not $\delta$-equidistributed in $X^{\ell}$.

If $p_{i,N}(t)=\sum_{k=0}^{d_{i}}c_{i,k,N}t^{k}$, then

\[b_{i}^{p_{i,N}(n)}=b_{i,0,N}\cdot b_{i,1,N}^{n}\cdots b_{i,d_{i},N}^{n^{d_{i}}},\]

where $b_{i,j,N}=b_{i}^{c_{i,j,N}}$, $0\leq j\leq d_{i}$, $1\leq i\leq\ell$, so, for all $N\in\mathbb{N}$, $(b_{1}^{p_{1,N}(n)},\ldots,b_{\ell}^{p_{\ell,N}(n)})_{n}$ is a polynomial sequence in $G^{\ell}$.

Applying Theorem 5.1, we have a constant $M\equiv M(\delta,X,d_{1},\ldots,d_{\ell})$ and a non-trivial horizontal character $\chi$ of $X^{\ell}$ with $\left\|\chi\right\|\leq M$ such that

\[\left\|\chi(b_{1}^{p_{1,N}(n)},\ldots,b_{\ell}^{p_{\ell,N}(n)})\right\|_{C^{\infty}[N]}\leq M.\]

Let $\pi(b_{i})=(\beta_{i,1}\mathbb{Z},\ldots,\beta_{i,s}\mathbb{Z})$, $1\leq i\leq\ell$, where $\beta_{i,j}\in\mathbb{R}$, be the projection of $b_{i}$ on the horizontal torus $\mathbb{T}^{s}$ (the integer $s$ is bounded by the dimension of $X$). Using the ergodicity assumption on the $b_{i}$'s, for all $1\leq i\leq\ell$, the set $\{1,\beta_{i,1},\ldots,\beta_{i,s}\}$ consists of rationally independent elements. For $t\in\mathbb{R}$ we have $\pi(b_{i}^{t})=(t\tilde{\beta}_{i,1}\mathbb{Z},\ldots,t\tilde{\beta}_{i,s}\mathbb{Z})$ for some elements $\tilde{\beta}_{i,j}\in\mathbb{R}$ with $\tilde{\beta}_{i,j}\mathbb{Z}=\beta_{i,j}\mathbb{Z}$ (note that, for all $1\leq i\leq\ell$, the $\tilde{\beta}_{i,j}$'s are also rationally independent), so we have that

\[\chi(b_{1}^{p_{1,N}(n)},\ldots,b_{\ell}^{p_{\ell,N}(n)})=e\left(\sum_{i=1}^{\ell}p_{i,N}(n)\sum_{j=1}^{s}\lambda_{i,j}\tilde{\beta}_{i,j}\right)\]

for some integers $\lambda_{i,j}\in\mathbb{Z}$.

If, for $n\in\mathbb{N}$, we set

\[p_{N}(n):=\sum_{i=1}^{\ell}p_{i,N}(n)\sum_{j=1}^{s}\lambda_{i,j}\tilde{\beta}_{i,j}=\sum_{k=0}^{d}c_{k,N}n^{k},\]

we have that the sequence $(p_{N})_{N}$, being a non-trivial (as $\chi$ is non-trivial and the $\tilde{\beta}_{i,j}$'s are rationally independent) linear combination of the $(p_{i,N})_{N}$'s, is good. Combining the last three relations, we get $M\geq\left\|e(p_{N}(n))\right\|_{C^{\infty}[N]}\geq\max_{1\leq j\leq d}\left(N^{j}\left\|c_{j,N}\right\|\right)$, which contradicts $\lim_{N\to\infty}\max_{1\leq j\leq d}\left(N^{j}\left\|c_{j,N}\right\|\right)=\infty$, a condition that the coefficients of a good variable polynomial sequence satisfy (see [13]). ∎
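The quantity $\max_{1\leq j\leq d}\left(N^{j}\left\|c_{j,N}\right\|\right)$ that drives the contradiction can be computed directly for small examples. The sketch below is only illustrative: the helper names are hypothetical, the sample polynomial $p_{N}(n)=n/\sqrt{N}$ is an assumed instance of the form $h(N)n$ with $h(N)=1/g(N)$ and $g(N)=\sqrt{N}$, and $p_{N}(n)=n/N$ is included purely as a contrast case in which the quantity stays bounded.

```python
import math

def dist_to_nearest_int(x: float) -> float:
    """Return ||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))

def coefficient_growth(coeffs, N: int) -> float:
    """Return the maximum over 1 <= j <= d of N^j * ||c_{j,N}||.

    coeffs = [c_0, c_1, ..., c_d] are the coefficients of p_N for this N;
    the constant term c_0 is ignored, as in the displayed maximum.
    """
    return max(N**j * dist_to_nearest_int(c)
               for j, c in enumerate(coeffs) if j >= 1)

def sublinear_example(N: int):
    """Hypothetical p_N(n) = n / sqrt(N): the quantity grows like sqrt(N)."""
    return [0.0, 1.0 / math.sqrt(N)]

def bounded_example(N: int):
    """Contrast case p_N(n) = n / N: here N * ||c_{1,N}|| equals 1."""
    return [0.0, 1.0 / N]

if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        print(N,
              coefficient_growth(sublinear_example(N), N),
              coefficient_growth(bounded_example(N), N))
```

For the first example the printed values grow like $\sqrt{N}$, so no fixed bound $M$ can hold for all large $N$; for the contrast case the value stays at $1$, so the divergence condition fails there.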

The last ingredient in proving Part (i) of Theorem 5.2 is the following lemma:

Lemma 5.5 ([11, Lemma 5.2]).

Let $X=G/\Gamma$ be a nilmanifold with $G$ connected and simply connected. Then, for every $b_{1},\ldots,b_{\ell}\in G$, there exists an $s_{0}\in\mathbb{R}$ such that for all $1\leq i\leq\ell$ the element $b_{i}^{s_{0}}$ acts ergodically on the nilmanifold $\overline{(b_{i}^{s}\Gamma)}_{s\in\mathbb{R}}$.

We are now ready to prove Theorem 5.2.

Proof of Theorem 5.2.

Using Lemma 5.3 we see that Part (ii) of Theorem 5.2 follows from Part (i). To establish Part (i), let $b_{1},\ldots,b_{\ell}\in G$. By Lemma 5.5 there exists a non-zero $s_{0}\in\mathbb{R}$ such that for every $1\leq i\leq\ell$ the element $b_{i}^{s_{0}}$ acts ergodically on the nilmanifold $\overline{(b_{i}^{s}\Gamma)}_{s\in\mathbb{R}}$. Applying Lemma 5.4 to the elements $b_{i}^{s_{0}}$ and the polynomials $p_{i,N}/s_{0}$ (which still form a good sequence of $\ell$-tuples of polynomials), we get that the sequence $(b_{1}^{p_{1,N}(n)}\Gamma,\ldots,b_{\ell}^{p_{\ell,N}(n)}\Gamma)_{1\leq n\leq N}$ is equidistributed in the nilmanifold $\overline{(b_{1}^{s}\Gamma)}_{s\in\mathbb{R}}\times\cdots\times\overline{(b_{\ell}^{s}\Gamma)}_{s\in\mathbb{R}}$, which gives the conclusion. ∎

6. Proof of main results

To prove our main results, we first show that the polynomial sequences from Theorems 1.2 and 1.3 are good and super nice. If either $g_{2}\prec g_{1}$ or $g_{2}\sim g_{1}$, we write $g_{2}\precsim g_{1}$.

Lemma 6.1.

The polynomial sequences from Theorems 1.2 and 1.3 are good.

Proof.

Let $\lambda_{1}p_{1,N}+\ldots+\lambda_{\ell}p_{\ell,N}$ be a non-trivial linear combination of strongly independent variable polynomials as in (7), which is also of the same form. In case this combination is a polynomial of degree $1$, precomposing with the opposite of its constant term, without loss of generality, we can assume that it is of the form $h(N)n$, where $h(N)\sim 1/g(N)$, with $1\prec g(N)\prec N$ (hence $h(N)\to 0$ and $|h(N)|N\to\infty$ monotonically as $N\to\infty$). For any $\alpha\neq 0$, as $N\to\infty$, we have that

\[\frac{1}{N}\sum_{n=0}^{N-1}e^{i\alpha h(N)n}=\frac{1}{N}\cdot\frac{1-e^{i\alpha h(N)N}}{1-e^{i\alpha h(N)}}=\frac{h(N)}{1-e^{i\alpha h(N)}}\cdot\frac{1-e^{i\alpha h(N)N}}{h(N)N}\to\frac{i}{\alpha}\cdot 0=0.\]

In case the combination is a polynomial of degree $d$, after using Lemma 4.4 $(d-1)$ times, we get a polynomial of degree $1$, hence the result follows from the previous step. ∎
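The geometric-sum computation in the degree $1$ case can be checked numerically. The following is a minimal sketch, assuming the sample choice $h(N)=1/\sqrt{N}$ (that is, $g(N)=\sqrt{N}$) and $\alpha=1$; the function names are hypothetical and the choice of $h$ is for illustration only.

```python
import cmath
import math

def direct_average(N: int, alpha: float, h: float) -> complex:
    """(1/N) * sum_{n=0}^{N-1} exp(i * alpha * h * n), summed term by term."""
    return sum(cmath.exp(1j * alpha * h * n) for n in range(N)) / N

def closed_form(N: int, alpha: float, h: float) -> complex:
    """Closed form of the same average via the geometric series.

    Requires alpha * h not to be a multiple of 2*pi.
    """
    numerator = 1 - cmath.exp(1j * alpha * h * N)
    denominator = N * (1 - cmath.exp(1j * alpha * h))
    return numerator / denominator

def h_example(N: int) -> float:
    """Assumed sample coefficient h(N) = 1/sqrt(N), so h(N) -> 0 and h(N)*N -> infinity."""
    return 1.0 / math.sqrt(N)

if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        h = h_example(N)
        print(N, abs(direct_average(N, 1.0, h)), abs(closed_form(N, 1.0, h)))
```

Since $|1-e^{i\alpha h(N)N}|\leq 2$ and $|1-e^{i\alpha h(N)}|$ is comparable to $|\alpha h(N)|$, the modulus of the average is of order at most $1/(|\alpha h(N)|N)$, which is $1/\sqrt{N}$ for this sample choice.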

Recall that when we want to check that a $k$-tuple, for $k>1$, has the $R_{k}$-property, we have to check (according to Definition 4.5) that for every $1\leq i\leq k$ the corresponding $(k-1)$-tuple has the $R_{k-1}$-property. If a $(k-1)$-tuple corresponds to the index $i_{0}$, we say that it is descending from the $i_{0}$ term of the previous step.

Lemma 6.2.

The polynomial sequences from Theorems 1.2 and 1.3 are super nice.

Proof.

For a single polynomial sequence as in (7), the result follows immediately from Remark 6 (5) and the properties of Hardy field functions.

For multiple sequences, (i) follows by the form (7) that the variable polynomial sequences have. As $(\mathcal{P}_{N})_{N}$ and $(\mathcal{P}^{\prime}_{N})_{N}$ consist of polynomials of the same form, (ii) and (ii)$^{\prime}$ will both follow by the same argument.

After performing, if needed, finitely many vdC-operations to the polynomial families of interest, assuming that we have $k$ many essentially distinct terms of the form $a_{i,N}n$, $1\leq i\leq k$, we have $a_{i,\cdot}\in\mathcal{C}(g_{1},\ldots,g_{l})$ and $a_{k,N}\precsim\ldots\precsim a_{2,N}\precsim a_{1,N}$. (This happens because vdC-operations preserve the essential distinctness property of the polynomials and at each step the coefficient functions belong to $\mathcal{C}(g_{1},\ldots,g_{l})$.) In order to show that the sequences $\{(a_{i,N})_{N}:\;1\leq i\leq k\}$ have the $R_{k}$-property, we present an algorithmic way of finding the corresponding terms at the steps $1$ and $\lambda$, for $\lambda\geq 2$:

Step 1: For $i=1$ we pick $j_{0}=k$ (i.e., the largest index). In this case we will show that we have property (ii) (a) (of Definition 4.5). The terms become:

\[\frac{a_{k,N}-a_{j,N}}{a_{1,N}}\sim\frac{a_{j,N}}{a_{1,N}},\;\;1\leq j\leq k-1.\]

For $i>1$, we pick $j_{0}=1$ (i.e., the smallest index). In this case we will show that we have property (ii) (b). The terms become:

\[\frac{a_{j,N}}{a_{i,N}-a_{1,N}}\sim\frac{a_{j,N}}{a_{1,N}},\;\;2\leq j\leq k.\]

Step $\lambda$: After we order them from largest to smallest growth, we denote the $j$-th term at the $\lambda$-th step with $a_{\lambda,j,N}$. We have two cases:

• The sequence of coefficients is descending from the $i=1$ term of the $(\lambda-1)$-th step.

For $i=1$ we pick $j_{0}=k-\lambda+1$ and show property (ii) (a) (for the $i=1$ case we always pick the largest index $j_{0}$ and show property (ii) (a)). For $1\leq j\leq j_{0}-1$, we have

\[\frac{a_{\lambda,j_{0},N}-a_{\lambda,j,N}}{a_{\lambda,1,N}}=\frac{a_{\lambda-1,j,N}-a_{\lambda-1,j_{0},N}}{a_{\lambda-1,j_{0}+1,N}-a_{\lambda-1,1,N}}\sim\frac{a_{\lambda-1,j,N}}{a_{\lambda-1,1,N}},\]

where the numerator comes from the difference $(a_{\lambda-1,j_{0}+1,N}-a_{\lambda-1,j_{0},N})-(a_{\lambda-1,j_{0}+1,N}-a_{\lambda-1,j,N})$, and the (common) denominators are canceled.

For $i>1$ we pick $j_{0}=1$ and show property (ii) (b) (for the $i>1$ case we always pick $j_{0}=1$ and show property (ii) (b)). For $2\leq j\leq k-\lambda+1$ we have

\[\frac{a_{\lambda,j,N}}{a_{\lambda,i,N}-a_{\lambda,1,N}}=\frac{a_{\lambda-1,k-\lambda+2,N}-a_{\lambda-1,j,N}}{a_{\lambda-1,1,N}-a_{\lambda-1,i,N}}\sim\frac{a_{\lambda-1,j,N}}{a_{\lambda-1,1,N}},\]

where the denominator comes from the difference $(a_{\lambda-1,k-\lambda+2,N}-a_{\lambda-1,i,N})-(a_{\lambda-1,k-\lambda+2,N}-a_{\lambda-1,1,N})$, and, as in the previous case, the (common) denominators are canceled.

• The sequence of coefficients is descending from the $i>1$ term of the $(\lambda-1)$-th step.

For $i=1$, we choose $j_{0}=k-\lambda+1$. For all $1\leq j\leq j_{0}-1$ we have:

\[\frac{a_{\lambda,j_{0},N}-a_{\lambda,j,N}}{a_{\lambda,1,N}}=\frac{a_{\lambda-1,j_{0}+1,N}-a_{\lambda-1,j+1,N}}{a_{\lambda-1,2,N}}\sim\frac{a_{\lambda-1,j+1,N}}{a_{\lambda-1,2,N}}.\]

For $i>1$, we choose $j_{0}=1$. For all $2\leq j\leq k-\lambda+1$, we have

\[\frac{a_{\lambda,j,N}}{a_{\lambda,i,N}-a_{\lambda,1,N}}=\frac{a_{\lambda-1,j+1,N}}{a_{\lambda-1,i+1,N}-a_{\lambda-1,2,N}}\sim\frac{a_{\lambda-1,j+1,N}}{a_{\lambda-1,2,N}}.\]

Note that each of the aforementioned terms, at each step, is (up to a sign) of the form

\[\frac{a_{i,N}-a_{j,N}}{a_{s,N}-a_{t,N}},\;s>i>j\geq t,\;\frac{a_{i,N}-a_{j,N}}{a_{t,N}},\;i>j\geq t,\;\text{or},\;\frac{a_{j,N}}{a_{i,N}-a_{t,N}},\;t<\min\{i,j\},\]

i.e., combinations of terms from the initial sequence (because of the cancellations mentioned above) which are all asymptotically equivalent ($\sim$) to $a_{j,N}/a_{t,N}$. The claim now follows by the properties of elements from $\mathcal{LE}$, as each coefficient is a logarithmico-exponential Hardy function, hence eventually monotone, which is either $\sim 1$ or $\sim 1/g(N)$ with $1\prec g(N)\prec N$ by the construction. ∎

We now prove Theorem 2.1 (which implies Theorem 1.2):

Proof of Theorem 2.1.

We start by using Proposition 2 in order to get that the nilfactor $\mathcal{Z}$ is characteristic for the multiple average in (8) (which can be used as the polynomial iterates are super nice). Via Theorem 3.1 we can assume without loss of generality that our system is an inverse limit of nilsystems. By a standard approximation argument, we can further assume that it is actually a nilsystem.

Let $(X=G/\Gamma,\mathcal{G}/\Gamma,m_{X},T_{b})$ be a nilsystem, where $b\in G$ is ergodic, and $F_{1},\ldots,F_{\ell}\in L^{\infty}(m_{X})$. Our objective now is to show that

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F_{1}(b^{[p_{1,N}(n)]}x)\cdot\ldots\cdot F_{\ell}(b^{[p_{\ell,N}(n)]}x)=\int F_{1}\;dm_{X}\cdot\ldots\cdot\int F_{\ell}\;dm_{X},\tag{30}\]

where the convergence takes place in $L^{2}(m_{X})$. By density, we can assume that the functions $F_{1},\ldots,F_{\ell}$ are continuous. In this case we will show that (30) holds for all $x\in X$, hence we will obtain the result by using the Dominated Convergence Theorem. By applying Theorem 5.2 to the nilmanifold $X^{\ell}$, the nilrotation $\tilde{b}=(b,\ldots,b)\in G^{\ell}$, the point $\tilde{x}=(x,\ldots,x)\in X^{\ell}$, $x\in X$, and the continuous function $\tilde{F}(x_{1},\ldots,x_{\ell})=F_{1}(x_{1})\cdot\ldots\cdot F_{\ell}(x_{\ell})$ (here we are using the goodness property of the $p_{i,N}$'s), we get that

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\tilde{F}(b^{[p_{1,N}(n)]}x,\ldots,b^{[p_{\ell,N}(n)]}x)=\int\tilde{F}\;dm_{X^{\ell}}.\]

This implies that (30) holds for every $x\in X$, completing the proof. ∎

Next, we show Theorem 2.2 (which in turn implies Theorem 1.3):

Proof of Theorem 2.2.

As in the previous proof, our objective is to show that if $(p_{N})_{N}\subseteq\mathbb{R}[t]$ is a good polynomial sequence with $(p_{N},2p_{N},\ldots,\ell p_{N})_{N}$ being super nice, then for every nilsystem $(X=G/\Gamma,\mathcal{G}/\Gamma,m_{X},T_{b})$, where $b\in G$ is ergodic, and $F_{1},\ldots,F_{\ell}\in L^{\infty}(m_{X})$, we have that the limit

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F_{1}(b^{[p_{N}(n)]}x)\cdot F_{2}(b^{2[p_{N}(n)]}x)\cdot\ldots\cdot F_{\ell}(b^{\ell[p_{N}(n)]}x)\tag{31}\]

is equal to the limit

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}F_{1}(b^{n}x)\cdot F_{2}(b^{2n}x)\cdot\ldots\cdot F_{\ell}(b^{\ell n}x).\tag{32}\]

As in the proof above, we assume that every $F_{i}$ is continuous. Applying Theorem 5.2 to $X^{\ell}$, the nilrotation $\tilde{b}=(b,b^{2},\ldots,b^{\ell})$, the point $\tilde{x}=(x,x,\ldots,x)$, $x\in X$, and the continuous function $\tilde{F}(x_{1},\ldots,x_{\ell})=F_{1}(x_{1})\cdot F_{2}(x_{2})\cdot\ldots\cdot F_{\ell}(x_{\ell})$, we get

\[\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\tilde{F}(\tilde{b}^{[p_{N}(n)]}\tilde{x})=\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\tilde{F}(\tilde{b}^{n}\tilde{x}).\]

This implies that the limits in (31) and (32) exist for every $x\in X$ and are equal. ∎
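A small worked example may help to see why the common value of (31) and (32) need not be the product of the integrals. It is only an illustrative sketch under assumptions that are not taken from the text: $\ell=2$, the system is the rotation $b\cdot x=x+\beta$ on $X=\mathbb{R}/\mathbb{Z}$ with $\beta$ irrational, and the functions $F_{1}(x)=e(2x)$, $F_{2}(x)=e(-x)$, where $e(x)=e^{2\pi ix}$, are chosen purely for illustration.

```latex
% Assumed setting: X = R/Z, b^m x = x + m\beta with \beta irrational, e(x) = e^{2\pi i x}.
% For every integer m the two shifts cancel exactly:
\[
F_{1}(b^{m}x)\cdot F_{2}(b^{2m}x)
  = e\bigl(2(x+m\beta)\bigr)\cdot e\bigl(-(x+2m\beta)\bigr)
  = e(x).
\]
% Taking m = [p_N(n)] and m = n respectively, both averages are constant in N:
\[
\frac{1}{N}\sum_{n=1}^{N}F_{1}(b^{[p_{N}(n)]}x)\cdot F_{2}(b^{2[p_{N}(n)]}x)
  = e(x)
  = \frac{1}{N}\sum_{n=1}^{N}F_{1}(b^{n}x)\cdot F_{2}(b^{2n}x),
\]
% whereas the product of the integrals vanishes:
\[
\int F_{1}\;dm_{X}\cdot\int F_{2}\;dm_{X}=0.
\]
```

In this toy case the two limits agree trivially, for any integer-valued iterates, and differ from the product of the integrals, which is consistent with the statement that (31) equals (32) rather than $\int F_{1}\,dm_{X}\cdots\int F_{\ell}\,dm_{X}$.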

6.1. Closing comments and problems

In the generality in which it is stated, Problem 1 (i.e., [13, Problem 10]) remains open except in the $\ell=1$ case. In this article, we first showed that, under the additional super niceness assumption, the nilfactor of a system is characteristic for the corresponding sequence of iterates. Second, we showed that the goodness property alone is enough to imply the required equidistribution properties. This comes as no surprise since, as we have already mentioned, the goodness property is a strong equidistribution notion. Hence, to completely resolve the problem, one has to answer the following problem in the positive:

Problem 3.

For $\ell\in\mathbb{N}$ let $(\mathcal{P}_{N})_{N}=(p_{1,N},\ldots,p_{\ell,N})_{N}$ be a good sequence of $\ell$-tuples of polynomials. Is it true that for every system its nilfactor $\mathcal{Z}$ is characteristic for $(\mathcal{P}_{N})_{N}$?

Analogously, to solve Problem 2, it suffices to answer the following:

Problem 4.

Let $(p_{N})_{N}$ be a good sequence of polynomials. Is it true that for every $\ell\in\mathbb{N}$ and every system its nilfactor $\mathcal{Z}$ is characteristic for the sequence $(\mathcal{P}_{N})_{N}=(p_{N},2p_{N},\ldots,\ell p_{N})_{N}$?

As our results give convergence to the “expected” limit, it is reasonable to study the corresponding pointwise results along natural numbers. So, we naturally close this article with the following problem:

Problem 5.

Find classes of good variable polynomial iterates (e.g., the ones in Theorems 2.1 and 2.2) for which we have the corresponding pointwise convergence results. (One can start by studying the pointwise convergence for the special cases of averages with iterates from Examples 1 and 2.)

Acknowledgments

Thanks go to D. Karageorgos with whom I started discussing the problem; N. Frantzikinakis for his constant support and fruitful discussions during the writing of this article; and N. Kotsonis for his detailed corrections on the text. I am also deeply thankful to the anonymous Referees, X and Y, whose detailed feedback led to numerous clarifications, improving the readability and quality of the article.

References

  • [1] (MR891243) [10.1090/conm/065/891243] V. Bergelson, Ergodic Ramsey theory, Logic and Combinatorics (Arcata, Calif., 1985), Contemp. Math. Amer. Math. Soc., Providence, RI, 65 (1987), 63–87.
  • [2] (MR912373) [10.1017/S0143385700004090] V. Bergelson, Weakly mixing PET, Ergodic Theory Dynam. Systems, 7 (1987), 337–349.
  • [3] (MR2545011) [10.1017/S0143385708000862] V. Bergelson and I. Håland-Knutson, Weakly mixing implies mixing of higher orders along tempered functions, Ergodic Theory Dynam. Systems, 29 (2009), 1375–1416.
  • [4] (MR1881925) [10.1007/s002220100179] V. Bergelson and A. Leibman, A nilpotent Roth theorem, Invent. Math., 147 (2002), 429–470.
  • [5] (MR1325795) [10.1090/S0894-0347-96-00194-4] V. Bergelson and A. Leibman, Polynomial extensions of van der Waerden’s and Szemerédi’s theorems, J. Amer. Math. Soc., 9 (1996), 725–753.
  • [6] (MR2795725) [10.1112/plms/pdq037] Q. Chu, N. Frantzikinakis and B. Host, Ergodic averages of commuting transformations with distinct degree polynomial iterates, Proc. Lond. Math. Soc., 102 (2011), 801–842.
  • [7] S. Donoso, A. Ferré Moragues, A. Koutsogiannis and W. Sun, Decomposition of multicorrelation sequences and joint ergodicity, preprint, 2021, arXiv:2106.01058.
  • [8] (MR4092858) [10.1017/etds.2018.118] S. Donoso, A. Koutsogiannis and W. Sun, Pointwise multiple averages for sublinear functions, Ergodic Theory Dynam. Systems, 40 (2020), 1594–1618.
  • [9] [10.1007/s11854-021-0186-z] S. Donoso, A. Koutsogiannis and W. Sun, Seminorms for multiple averages along polynomials and applications to joint ergodicity, J. d’Analyse Math., (2021).
  • [10] (MR3347186) [10.1090/S0002-9947-2014-06275-2] N. Frantzikinakis, A multidimensional Szemerédi theorem for Hardy sequences of different growth, Trans. Amer. Math. Soc., 367 (2015), 5653–5692.
  • [11] (MR2585398) [10.1007/s11854-009-0035-y] N. Frantzikinakis, Equidistribution of sparse sequences on nilmanifolds, J. Anal. Math., 109 (2009), 353–395.
  • [12] (MR2762998) [10.1007/s11854-010-0026-z] N. Frantzikinakis, Multiple recurrence and convergence for Hardy sequences of polynomial growth, J. Anal. Math., 112 (2010), 79–135.
  • [13] (MR3613710) N. Frantzikinakis, Some open problems on multiple ergodic averages, Bull. Hellenic Math. Soc., 60 (2016), 41–90.
  • [14] (MR3829173) [10.1093/imrn/rnx002] N. Frantzikinakis, An averaged Chowla and Elliott conjecture along independent polynomials, Int. Math. Res. Not. IMRN, 2018 (2018), 3721–3743.
  • [15] (MR3047073) [10.1007/s11856-012-0132-y] N. Frantzikinakis, B. Host and B. Kra, The polynomial multidimensional Szemerédi theorem along shifted primes, Israel J. Math., 194 (2013), 331–348.
  • [16] (MR498471) [10.1007/BF02813304] H. Furstenberg, Ergodic behavior of diagonal measures and a theorem of Szemerédi on arithmetic progressions, J. Analyse Math., 31 (1977), 204–256.
  • [17] (MR2877065) [10.4007/annals.2012.175.2.2] B. Green and T. Tao, The quantitative behaviour of polynomial orbits on nilmanifolds, Ann. of Math., 175 (2012), 465–540.
  • [18] [10.1112/plms/s2-10.1.54] G. H. Hardy, Properties of logarithmico-exponential functions, Proc. London Math. Soc., s2-10 (1912), 54–90.
  • [19] (MR2150389) [10.4007/annals.2005.161.397] B. Host and B. Kra, Nonconventional ergodic averages and nilmanifolds, Annals of Math., 161 (2005), 397–488.
  • [20] (MR3999460) [10.4064/sm171102-18-9] D. Karageorgos and A. Koutsogiannis, Integer part independent polynomial averages and applications along primes, Studia Math., 249 (2019), 233–257.
  • [21] (MR3809055) [10.3934/dcds.2018113] Y. Kifer, Ergodic theorems for nonconventional arrays and an extension of the Szemerédi theorem, Discrete Contin. Dyn. Syst., 38 (2018), 2687–2716.
  • [22] (MR3774837) [10.1017/etds.2016.40] A. Koutsogiannis, Closest integer polynomial multiple recurrence along shifted primes, Ergodic Theory Dynam. Systems, 38 (2018), 666–685.
  • [23] (MR3789175) [10.1017/etds.2016.67] A. Koutsogiannis, Integer part polynomial correlation sequences, Ergodic Theory Dynam. Systems, 38 (2018), 1525–1542.
  • [24] (MR4201837) [10.3934/dcds.2020314] A. Koutsogiannis, Multiple ergodic averages for tempered functions, Discrete Contin. Dyn. Syst., 41 (2021), 1177–1205.
  • [25] (MR2122919) [10.1017/S0143385704000215] A. Leibman, Pointwise convergence of ergodic averages for polynomial sequences of translations on a nilmanifold, Ergodic Theory Dynam. Systems, 25 (2005), 201–213.
  • [26] (MR1106945) [10.1215/S0012-7094-91-06311-8] M. Ratner, Raghunathan’s topological conjecture and distribution of unipotent flows, Duke Math. J., 63 (1991), 235–280.
  • [27] (MR2912715) [10.4007/annals.2012.175.3.15] M. Walsh, Norm convergence of nilpotent ergodic averages, Annals of Math., 175 (2012), 1667–1688.
  • [28] (MR2257397) [10.1090/S0894-0347-06-00532-7] T. Ziegler, Universal characteristic factors and Furstenberg averages, J. Amer. Math. Soc., 20 (2007), 53–97.

Received November 2021; revised March 2022; early access May 2022.