Multiple ergodic averages for variable polynomials
Abstract.
In this paper we study multiple ergodic averages for “good” variable polynomials. In particular, under an additional assumption, we show that these averages converge to the expected limit, making progress related to an open problem posed by Frantzikinakis ([13, Problem 10]). These general convergence results imply several variable extensions of classical recurrence, combinatorial and number theoretical results, which are presented as well.
Key words and phrases:
Variable polynomial sequences, multiple ergodic averages, multiple recurrence, characteristic factors, equidistribution, nilmanifolds, Hardy fields, sublinear functions.
1991 Mathematics Subject Classification:
Primary: 37A44; Secondary: 37A05, 11B25, 11B83, 05D10.
Andreas Koutsogiannis
Aristotle University of Thessaloniki
Department of Mathematics
Thessaloniki, 54124, Greece
Dedicated to the loving memory of Aris Deligiannis, a great mentor.
(Communicated by Zhiren Wang)
1. Introduction
The study of multiple ergodic averages along polynomials dates back to 1977. Furstenberg, exploiting the limiting behavior (all the limits in this article are taken with respect to the $L^2$ norm, unless otherwise stated), as $N\to\infty$, of
(1) $\frac{1}{N}\sum_{n=1}^{N} T^{n}f_1\cdot T^{2n}f_2\cdot\ldots\cdot T^{\ell n}f_\ell,$
where $(X,\mathcal{B},\mu,T)$ is a measure preserving system [Footnote 1: I.e., $T$ is an invertible measure preserving transformation on a standard Borel probability space $(X,\mathcal{B},\mu)$.] and $f_1,\ldots,f_\ell\in L^\infty(\mu)$, provided (in [16]) a purely ergodic theoretic proof of Szemerédi’s theorem: every subset of natural numbers of positive upper density [Footnote 2: For a set $E\subseteq\mathbb{N}$ we define its upper density as $\bar{d}(E)=\limsup_{N\to\infty}\frac{|E\cap\{1,\ldots,N\}|}{N}$.] contains arbitrarily long arithmetic progressions (a result that can be immediately obtained by combining Theorem 2.3 with Theorem 2.4 below).
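The two combinatorial notions in the statement of Szemerédi’s theorem can be made concrete with a small finite experiment. The following is a minimal sketch, assuming the usual definition of upper density through the initial intervals $\{1,\ldots,N\}$; the function names `density_up_to` and `find_arithmetic_progression` are hypothetical and chosen only for this illustration. A finite computation of course only approximates the limsup in the definition.

```python
from fractions import Fraction


def density_up_to(is_member, N):
    """Return |E ∩ {1,...,N}| / N, where E is given by a membership predicate."""
    count = sum(1 for n in range(1, N + 1) if is_member(n))
    return Fraction(count, N)


def find_arithmetic_progression(elements, length):
    """Brute-force search in a finite set for a, a+d, ..., a+(length-1)d, d >= 1.

    Returns the pair (a, d) of the first progression found, or None.
    """
    members = set(elements)
    if not members:
        return None
    top = max(members)
    for a in sorted(members):
        d = 1
        while a + (length - 1) * d <= top:
            if all(a + j * d in members for j in range(length)):
                return (a, d)
            d += 1
    return None


# Multiples of 3 have density 1/3 along N = 300, 600, ...
print(density_up_to(lambda n: n % 3 == 0, 300))
# A 4-term progression inside the multiples of 3 below 50.
print(find_arithmetic_progression([n for n in range(1, 50) if n % 3 == 0], 4))
```

The search is exponential in spirit and is only meant for tiny examples; it illustrates the statement and is not related to the ergodic theoretic proof.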
A polynomial $p\in\mathbb{Q}[t]$ is an integer polynomial if $p(\mathbb{Z})\subseteq\mathbb{Z}$. It was Bergelson who first viewed the iterates in (1) as linear, “distinct enough” integer polynomials. The integer polynomials $p_1,\ldots,p_\ell$ are essentially distinct if $p_i$ and $p_i-p_j$ are non-constant for all $i\neq j$. Bergelson studied (initially in [2]), via the use of van der Corput’s lemma, a crucial tool in “reducing the complexity” of the iterates, averages of the form
(2) $\frac{1}{N}\sum_{n=1}^{N} T^{p_1(n)}f_1\cdot\ldots\cdot T^{p_\ell(n)}f_\ell$
for essentially distinct integer polynomials $p_1,\ldots,p_\ell$; this study eventually led to multidimensional polynomial extensions of Szemerédi’s theorem (see [5]).
Bergelson and Leibman conjectured (in [4]) that multiple ergodic averages of the form
(3) $\frac{1}{N}\sum_{n=1}^{N} T_1^{p_1(n)}f_1\cdot\ldots\cdot T_\ell^{p_\ell(n)}f_\ell$
in any system, for multiple commuting $T_i$’s (i.e., $T_iT_j=T_jT_i$ for all $i,j$) and arbitrary integer polynomials $p_i$, always have a limit (as $N\to\infty$). This conjecture was answered in the positive by Walsh, who actually showed it in greater generality (see [27]). No specific expression of the limit was provided by the method. [Footnote 3: The conjecture corresponding to that of Bergelson and Leibman, about iterates which are integer parts of real polynomials, was shown in [23].]
One of the natural questions to ask is under which conditions, on either the polynomials or the system, we can explicitly find the limit of the aforementioned expressions. In particular, one asks whether we can find families of polynomials for which we have convergence in a general system to a specific expression (see more about the “expected” limit below); one can then get a number of interesting applications, e.g., find the corresponding arithmetic configurations in “large” subsets of integers. For instance, showing that the characteristic factor coincides with the nilfactor of the system, and exploiting the equidistribution property of the corresponding polynomial sequence in nilmanifolds (all these notions will be defined later), Frantzikinakis proved (in [12]) that the expression
(4) |
where with has the same limit (as $N\to\infty$), in any system, as (1), thus obtaining a refinement of Szemerédi’s theorem. [Footnote 4: Such polynomials have the property that for every has at least one non-constant irrational coefficient, which is exactly the case (via Weyl’s criterion) when the corresponding sequence is equidistributed.]
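Weyl’s criterion, mentioned in the footnote above, states that a real sequence $(x_n)$ is equidistributed modulo $1$ exactly when the averages $\frac{1}{N}\sum_{n=1}^{N}e^{2\pi i k x_n}$ tend to $0$ for every non-zero integer $k$. The following minimal numerical sketch illustrates this for two linear sequences; the helper name `weyl_average` and the choices $x_n=\sqrt{2}\,n$ and $x_n=n/2$ are assumptions made only for this illustration, and a finite computation is evidence rather than proof.

```python
import cmath
import math


def weyl_average(sequence, k, N):
    """Return (1/N) * sum_{n=1}^{N} exp(2*pi*i*k*sequence(n))."""
    total = sum(cmath.exp(2j * math.pi * k * sequence(n)) for n in range(1, N + 1))
    return total / N


def irrational_slope(n):
    # x_n = sqrt(2) * n: non-constant irrational coefficient, equidistributed mod 1.
    return math.sqrt(2) * n


def rational_slope(n):
    # x_n = n / 2: takes only the values 0 and 1/2 mod 1, not equidistributed.
    return n / 2


# The average is small for the irrational slope ...
print(abs(weyl_average(irrational_slope, 1, 10_000)))
# ... but has modulus close to 1 for the rational slope with frequency k = 2.
print(abs(weyl_average(rational_slope, 2, 10_000)))
```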
Generalizing the condition to multiple polynomials, following Frantzikinakis’ approach, Karageorgos and the author showed (in [20]) that for strongly independent real polynomials (i.e., any non-trivial linear combination of the ’s with scalars from has at least one non-constant irrational coefficient) the expression
(5) |
has the “expected” limit. In order to explain what is meant by “expected” limit, we need to recall the notions of ergodicity and weak mixing. $T$ is ergodic if $T^{-1}A=A$ implies $\mu(A)\in\{0,1\}$; $T$ is weakly mixing if $T\times T$ is ergodic. Here, by “expected” limit we mean, in case $T$ is ergodic, that the limit is equal to $\prod_{i=1}^{\ell}\int f_i\,d\mu$, whereas, in the general case, it is equal to $\prod_{i=1}^{\ell}\mathbb{E}(f_i|\mathcal{I}(T))$, where $\mathbb{E}(f|\mathcal{I}(T))$ is the conditional expectation of $f$ with respect to the $\sigma$-algebra $\mathcal{I}(T)$ of the $T$-invariant sets (notice here the connection to independence in probability). Furstenberg showed in [16] that for a weakly mixing $T$, (1) converges to the expected limit; under the same assumption on $T$, Bergelson showed in [2] that (2) converges to the same limit as well.
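The weak mixing assumption for the linear iterates $n, 2n$ cannot simply be replaced by ergodicity. A minimal sketch of the standard counterexample follows, assuming the circle rotation $Tx=x+\alpha \pmod 1$ with $\alpha=\sqrt{2}$ (ergodic but not weakly mixing) and the characters $f_1(x)=e^{2\pi i\cdot 2x}$, $f_2(x)=e^{-2\pi i x}$; these choices and the name `linear_average` are made only for this illustration. Both functions have integral $0$, so the product of integrals is $0$, yet $f_1(T^nx)f_2(T^{2n}x)=e^{2\pi i x}$ for every $n$.

```python
import cmath
import math

ALPHA = math.sqrt(2)  # assumed rotation number (irrational, so T is ergodic)


def e(x):
    """The character x -> exp(2*pi*i*x) on the circle R/Z."""
    return cmath.exp(2j * math.pi * x)


def linear_average(f1, f2, x, N):
    """Return (1/N) * sum_{n=1}^{N} f1(T^n x) * f2(T^{2n} x) for T x = x + ALPHA."""
    total = 0
    for n in range(1, N + 1):
        total += f1(x + n * ALPHA) * f2(x + 2 * n * ALPHA)
    return total / N


def f1(x):
    return e(2 * x)


def f2(x):
    return e(-x)


x0 = 0.3
# The average stays at e(x0), far from the product of integrals, which is 0.
print(abs(linear_average(f1, f2, x0, 2000) - e(x0)))
# For f1 = f2 = e the terms are e(2x) * e(3*n*ALPHA) and the average does tend to 0.
print(abs(linear_average(e, e, x0, 2000)))
```

This is the kind of obstruction that the independence assumptions on the iterates, such as strong independence in (5), are designed to rule out.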
We extend the distinctness property of the sequences of iterates of (5) to sequences of real variable polynomials:
Definition 1.1 ([13]).
The sequence where is good if the polynomials have bounded degree and for every non-zero we have
(6) |
The sequence of -tuples of variable polynomials where is good if every non-trivial linear combination of the sequences is good.
Example 1 ([13]).
For the pair where is good.
Example 2 ([13]).
For the -tuple where is good.
For the class of good polynomial sequences, Frantzikinakis stated the following problem:
Problem 1 (Problem 10, [13]).
Let be a good -tuple of variable polynomials. Is it true that, for every ergodic system and functions we have
Showing that (4) has the same limit as (1) for with which follows from [12, Theorem 2.2], one comes to the following problem, which is a natural generalization of Frantzikinakis’ result to good-variable-polynomials:
Problem 2.
Let be a good polynomial sequence. Is it true that, for every system and we have
As mentioned in [13], the case of Problem 1 (which also coincides with the case of Problem 2) can be easily obtained by using the spectral theorem. For general we make progress towards the solution of both Problems 1 and 2. In particular, under some additional assumptions on the coefficients of the good variable polynomials, we show two general results: Theorems 2.1 and 2.2. In this introductory section, we will present an easier application of each of them, which still covers both Examples 1 and 2.
To this end, we first recall the set of sublinear logarithmico-exponential Hardy field functions (of polynomial degree ) which converge (as ) to () infinity: [Footnote 5: Let be the collection of equivalence classes of real valued functions defined on some halfline where two functions that agree eventually are identified. These classes are called germs of functions. A Hardy field is a subfield of the ring that is closed under differentiation. Here, we use the word function when we refer to elements of (understanding that all the operations defined and statements made for elements of are considered only for sufficiently large ). We say that is a logarithmico-exponential Hardy field function, and we write if it belongs to a Hardy field of real valued functions and it is defined on some by a finite combination of symbols acting on the real variable and on real constants. For more on Hardy field functions, see [10, 12, 18].]
(we write ). Next, we define an appropriate set of coefficients: For with [Footnote 6: The different growth relation between the ’s is postulated to avoid cases such as, e.g., since is not sublinear (here we write if ).] let be the set of all linear combinations of reciprocals of the ’s, i.e.,
Extending the definition from [20], we say that the sequence of -tuple of variable polynomials where for each has the form:
(7) |
is strongly independent if for any we have that is a non-constant polynomial in . For example, the following triple of variable polynomials is strongly independent:
where [Footnote 7: If and we have that is constant only when]
Regarding Problem 1, i.e., multiple variable polynomial sequences, we have the following result:
Theorem 1.2.
For let be a strongly independent -tuple of polynomials as in (7). Then, for every ergodic system and we have
Regarding Problem 2, our result is the following theorem:
Theorem 1.3.
Let be a polynomial sequence as in (7) with non-constant in . Then, for every system and we have
Very few convergence results for averages with polynomial iterates, in which we can explicitly find the limit, exist. Results for variable polynomials are even scarcer. We will conclude this introduction, by mentioning some of them. Kifer ([21]) studied multiple averages for variable polynomials of the form with ’s essentially distinct and for a weakly mixing transformation Similarly, for more general polynomials, Kifer studied averages for strongly mixing “enough” transformations. Finally, Frantzikinakis (in [12]) found characteristic factors (see Definition 4.1) for averages with variable polynomial iterates with leading coefficients independent of . It is the arguments from this article ([12]) that we will adapt, in order to find characteristic factors for the averages appearing in Theorems 1.2 and 1.3 as well, which is one of the main two ingredients of the proof (the second one is the equidistribution of particular sequences for which we adapt arguments from [11]).
Notation
We denote by $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ the sets of natural, integer, rational, real and complex numbers respectively. For a function $f$ on a space $X$ with a transformation $T$ we denote by $Tf$ the composition $f\circ T$. For denotes the dimensional torus. For we write if there exists such that
2. Main results and applications
Here we will state our most general results and some applications. For the proofs, we follow [10] and [20], adapting the corresponding arguments to the variable polynomial case.
We first cover Problem 1 for a subclass of good polynomial sequences:
Theorem 2.1.
For let be a good and super nice [Footnote 8: The “super niceness” property is rather technical and will be defined in Section 4.] -tuple of polynomials. Then, for every ergodic system and we have
(8) |
We also cover the following case of Problem 2:
Theorem 2.2.
Let be a good polynomial sequence such that, for all is super nice. Then, for every system and we have
(9) |
We will show that Theorem 2.1 implies Theorem 1.2 (resp. Theorem 2.2 implies Theorem 1.3), and that it holds for any polynomial family (resp. for Theorem 2.2) which is independent of and for which non-trivial linear combinations of its members satisfy (6). In particular, it generalizes [20, Theorem 2.1] for strongly independent polynomials (the same is true for Theorem 2.2 for the single polynomial case).
The approach we follow to show these results is similar to the one in [12, 20], with a few extra twists. Namely, one has to find the characteristic factors of (8) and (9), and show some equidistribution results in nilmanifolds. The “super niceness” property (Definition 4.8) will be introduced so we can deal with the former, while the “goodness” property (Definition 1.1) implies the latter.
As was mentioned in the previous section, the ergodicity assumption in Theorem 2.1 can be dropped. [Footnote 9: The limit in this case is equal to] Indeed, if denotes the ergodic decomposition of it suffices to show that if for some then the averages converge to Since we have that for -a.e. By (8), we have that the averages go to in for -a.e. hence the limit is equal to in by the Dominated Convergence Theorem. Hence, the theorems hold for any system; their strong nature is also reflected in the fact that they have immediate recurrence and combinatorial implications which we discuss next.
2.1. Single sequence consequences
We first deal with a single variable polynomial sequence, assuming the validity of Theorem 2.2.
2.1.1. Recurrence
The following theorem due to Furstenberg will help us obtain recurrence results:
Theorem 2.3 (Furstenberg Multiple Recurrence Theorem, [16]).
Let be a system. Then, for every and every set with we have
Remark 1.
As we mentioned before, the liminf in the expression of Theorem 2.3 is actually a limit.
Corollary 1.
Let be as in Theorem 2.2. Then, for every every system and every set with we have
2.1.2. Combinatorics
Via Furstenberg’s Correspondence Principle, one gets combinatorial results from recurrence ones. We present here a reformulation of this principle from [1].
Theorem 2.4 (Furstenberg Correspondence Principle, [16], [1]).
Let be a subset of integers. There exists a system and a set with such that
(10) |
for every and
Corollary 2.
Let be as in Theorem 2.2. Then, for every and every set with we have
Hence, we immediately get the following refinement of Szemerédi’s theorem:
Corollary 3.
Let be as in Theorem 2.2. Then, for every , every set with contains arithmetic progressions of the form:
for some and with
2.2. Multiple sequences consequences
As in Subsection 2.1, assuming the validity of Theorem 2.1, we have various implications for multiple variable polynomial sequences.
2.2.1. Recurrence
Our first recurrence result is the following (we skip the proof as the argument is the same as the one in [11, Theorem 2.8]):
Theorem 2.5.
Setting and we immediately get the following:
Corollary 4.
For let be as in Theorem 2.1. Then, for every system and every set we have
2.2.2. Combinatorics
Theorem 2.5, via [14, Proposition 3.3], which is a variant of Theorem 2.4 for several sets, implies the following (we are skipping the routine details):
Theorem 2.6.
Setting and in the previous result, we get:
Corollary 5.
For let be as in Theorem 2.1. Then, for every set we have
So, we immediately obtain the following combinatorial result:
Corollary 6.
For let be as in Theorem 2.1. Then every set with contains arithmetic configurations of the form
for some and with for all
A set is called syndetic if finitely many translations of it cover The cardinality of such a set of translations is a syndeticity constant of . Applying Theorem 2.6 to syndetic sets and where is a syndeticity constant of we have:
Corollary 7.
For let be as in Theorem 2.1. If are syndetic sets, then there exist and with for all such that
In particular, for a syndetic set , setting where Corollary 7 above implies that we can find and that solve the system of equations
2.2.3. Topological dynamics
Let be a (topological) dynamical system, i.e., is a compact metric space and an invertible continuous transformation. (and consequently the system) is minimal, if, for all we have
Analogously to [20, Theorem 2.5], we get the following result:
Theorem 2.7.
For let be as in Theorem 2.1. If is a minimal dynamical system, then, for a residual and -invariant set of we have
(11) |
Proof.
There exists a -invariant Borel measure which gives positive value to every non-empty open set. So, due to the syndeticity of the orbit of every point, for every and every non-empty open set we have
(12) |
As we mentioned before, Theorem 2.1 implies that
(13) |
Since combining (13) with (12), we get for almost every (hence for a dense set) and every from a given countable basis of non-empty open sets that
proving that the set of points that satisfy (11), say is dense. To see that is take (the general case is analogous). Then
where is a countable, dense subset of and denotes the open ball centered at with radius The claim now follows since
Since we also get the -invariance of . ∎
Using Zorn’s lemma, we know that every dynamical system has a minimal subsystem. This fact, together with Theorem 2.7, implies the following corollary:
Corollary 8.
For let be as in Theorem 2.1. If is a dynamical system, then, for a non-empty and -invariant set of we have
Remark 2.
Following the method of [22] (which extended the one from [15]), as it was adapted in [20], the interested reader who is somewhat familiar with the topic can state and prove the convergence results corresponding to Theorems 2.1 and 2.2 along prime numbers (or, for the sake of simplicity, Theorems 1.2 and 1.3), together with the corresponding corollaries, as well as recurrence results along primes shifted by $\pm 1$.
While this is not trivial, it can be achieved, because the uniformity estimates that allow one to pass from averages along natural numbers to the corresponding ones along primes can be used for variable polynomial iterates of bounded degree (i.e., one can deal with the “good” and “super nice” variable iterates under consideration).
3. Some background material
In this section we collect some background material that will be used for the multiple average case.
3.1. Factors
A homomorphism from a system onto a system is a measurable map , where is a -invariant subset of and is an -invariant subset of , both of full measure, such that and for . When we have such a homomorphism we say that the system is a factor of the system . If the factor map can be chosen to be injective, then we say that the systems and are isomorphic. A factor can also be characterised by which is a -invariant sub--algebra of , and, conversely, any -invariant sub--algebra of defines a factor. By abusing the terminology, we denote by the same letter the -algebra and its inverse image by , so, if is a factor of , we think of as a sub--algebra of .
3.1.1. Seminorms
We follow [19] and [6] for the inductive definition of the seminorms More specifically, the definition that we use here follows from [19] (in the ergodic case), [6] (in the general case) and the use of von Neumann’s mean ergodic theorem.
Let be a system and We define inductively the seminorms (or just if there is no room for confusion) as follows: For we set
Recall that the conditional expectation satisfies and
For we let
All these limits exist and define seminorms on ([19]). Also, we remark that for all we have and
3.1.2. Nilfactors
Using the seminorms we defined above, we can construct factors of characterized by:
The following profound fact from [19] (see also the independent work of [28]) shows that for every the factor has a purely algebraic structure; approximately, we can assume that it is a -step nilsystem (see Subsection 3.2 below for the definitions):
Theorem 3.1 (Structure Theorem, [19, 28]).
Let be an ergodic system and . Then the factor is an inverse limit of -step nilsystems. [Footnote 10: By this we mean that there exist -invariant sub--algebras , of such that and for every , the factors induced by the -algebras are isomorphic to -step nilsystems.]
Because of this result, we call the -step nilfactor of the system. The smallest factor that is an extension of all finite step nilfactors is denoted by , meaning, , and is called the nilfactor of the system. The nilfactor is of particular interest because it controls the limiting behaviour in of the averages in (8) and (9).
3.2. Nilmanifolds
Let be a -step nilpotent Lie group, meaning for some , where denotes the -th commutator subgroup, and a discrete cocompact subgroup of . The compact homogeneous space is called -step nilmanifold (or nilmanifold). The group acts on by left translations, where the translation by an element is given by . We denote by the normalized Haar measure on i.e., the unique probability measure that is invariant under the action of , and by the Borel -algebra of . If , we call the system -step nilsystem (or nilsystem) and the elements of nilrotations.
3.2.1. Equidistribution
For a connected and simply connected Lie group let be the exponential map, where is the Lie algebra of . For and we define the element of as follows: If is such that , then (this is well defined since under the aforementioned assumptions is a bijection).
If is a sequence of real numbers and is a nilmanifold with connected and simply connected, we say that the sequence is equidistributed in a subnilmanifold of , if for every we have
(14) |
For the following claims, one can check the linear case in [25] ([25, Section 2], and in particular the theorem in [25, Subsection 2.17], together with [25, Theorem 2.19]) which covers the -actions case, and [26] for the analogous result for -actions. A nilrotation is ergodic (or acts ergodically) on , if the sequence is dense in If is ergodic, then for every the sequence is equidistributed in . The orbit closure of has the structure of a nilmanifold with being equidistributed in . Analogously, if is connected and simply connected, then is a nilmanifold with being equidistributed in .
3.2.2. Change of base point formula
Let be a nilmanifold. As mentioned before, for every the sequence is equidistributed in . Using the identity we see that the nil-orbit is equidistributed in the set . A similar formula holds when is connected and simply connected, where we replace the with and the nilmanifold with .
3.2.3. Lifting argument
Given a topological group we denote the connected component of its identity element, e, by To assume that a nilmanifold has a representation with connected and simply connected, one can follow for example [25]. Since all our results deal with an action on of finitely many elements of we can and will assume that the discrete group is finitely generated (see [25, Subsection 2.1]). In this case one can show (see [25, Subsection 1.11]) that is isomorphic to a sub-nilmanifold of a nilmanifold , where is a connected and simply connected nilpotent Lie group, with all translations from “represented” in . [Footnote 12: In practice this means that for every , and , there exists , and , such that for every .] We caution the reader that such a construction is only helpful when our working assumptions impose no restrictions on a nilrotation. Any assumption made about which acts on a nilmanifold , is typically lost when passing to the lifted nilmanifold .
4. Finding the characteristic factor
In this technical section we find characteristic factors for the expressions that appear in Theorems 2.1 and 2.2. In both cases, we will show that the nilfactor is characteristic (Proposition 2 and Proposition 3 respectively).
We start with the degree $1$ case and then move on to the general one. At this point we recall the notion of a characteristic factor (adapted to our study):
Definition 4.1.
For let be a system. The sub--algebra of is a characteristic factor for the variable tuple of integer-valued sequences if it is -invariant and
for all where , [Footnote 13: Equivalently, if for some]
4.1. The base case
The following crucial lemma, which can be understood as a “change of variables” procedure, will be used in the base case for i.e., We will assume that is bounded, so, as such error terms do not affect our averages, we mainly have to deal with the expression
Lemma 4.2.
Let bounded with tending increasingly to . For any sequence we have
Proof.
For a fixed since is bounded, we have the relation
Since we get that Finally, using yet again that is bounded, we have
The result now follows by taking ∎
Lemma 4.3.
Let be a sequence of polynomials of degree of the form
where are bounded sequences with and tending increasingly to Then, for any system and we have
(15) |
Proof.
For every we choose functions with so that the corresponding average is close to its supremum Inequality (15) follows if we show
(16) |
We write
Let be the finite set where takes values. We have that
Taking squares and using the Cauchy-Schwarz inequality, the right-hand side of the previous inequality is bounded by
where and For every , using Lemma 4.2, the of the averages on the right-hand side of the previous equality is bounded above by a constant multiple of
where the last inequality follows by Cauchy-Schwarz and the fact that . Using von Neumann’s mean ergodic theorem, the last term is equal to
where we used the fact that is measure preserving, the definition of the seminorms and the relationship between the -th seminorm of the tensor product and the seminorm on the base space. Inequality (16) now follows by removing the squares. ∎
Remark 3.
Lemma 4.3 holds also for sequences with tending decreasingly to
Indeed, in this case we write
so,
where is a finite subset of integers. Since and tends increasingly to we get the conclusion by the previous lemma (working with instead of ). [Footnote 14: We note that, since in Theorems 2.1 and 2.2 we assume that the transformation or is ergodic, the seminorms taken with respect to either of those transformations coincide.]
For multiple terms, we use the following variant of the classical van der Corput trick:
Lemma 4.4 (Lemma 4.6, [12]).
Let be a bounded sequence in a Hilbert space. Then
We will now demonstrate the main idea behind the generalization of Lemma 4.3, for which we follow [12, Proposition 5.3, Case 1]. In that statement, to show
where or one uses Lemma 4.4, composes with, say, and gets the terms (notice that we keep the -term in the first difference even though it is bounded)
so, after grouping the last two terms together, using the first one as constant (since it only depends on –the average along which is crucial for the argument and is taken at the very end), one can use the base case. This is also how the inductive step works in the proof of the general case.
The variable case is more complicated to deal with. We demonstrate the main idea behind it by considering Example 1, i.e., and where and for . The previous approach cannot be imitated, as, for example,
is in general a variable term and we cannot proceed with the same argument. What we do instead is to transform the iterates in the initial sum to the following:
and then we use Lemma 4.2 (i.e., change of variables) to bound, eventually, everything by (To use Lemma 4.2 note the crucial fact that is bounded.) Additionally, to bound our expression by the previous argument needs an additional twist to work since the quantity is unbounded. What we do in this case is to compose with to get
where we used the change of variables. As is bounded, we can now finish the argument as before.
The previous discussion, naturally leads to the following assumption on the leading coefficients of the linear (variable) polynomials:
Definition 4.5.
A sequence of real numbers has the -property if
-
(i)
it is bounded; and
-
(ii)
or and tends increasingly to
For the sequences have the -property if for all :
-
(i)
has the -property; and
-
(ii)
at least one of the following three properties holds:
(a) such that have the -property.
(b) such that has the -property and the sequences have the -property.
(c) such that has the -property and such that have the -property.
Remark 4.
The polynomial family of Example 1, i.e., where has the -property.
Indeed, skipping the trivial calculations, both sequences have the -property and for we have the case, while for the case.
We are now ready to extend Lemma 4.3 to multiple terms along polynomials of degree following the main idea of [12, Proposition 5.3, Case 1]:
Proposition 1.
Let be polynomial sequences of degree of the form
where the sequences have the -property and are bounded. Then, for every we have
(17) |
Proof.
We use induction on The base case, follows from Lemma 4.3. We assume that and that the statement holds for
Case 1: For the property (ii) (a) from Definition 4.5 holds.
(18) |
where is a finite subset of integers (the error terms , as and are bounded for take finitely many values).
For every we now choose functions with for so that the last term in (18) is close to Using the Cauchy-Schwarz inequality and the fact that we have that (17) follows if we show, for each choice of that
(19) |
is bounded above by a constant multiple of Using Lemma 4.2 it suffices to show
(20) |
The left-hand side of (20) is equal to
where and Using the Cauchy-Schwarz inequality and Lemma 4.4, we have that
where
Precomposing with the term we get
(21) |
where Using the hypothesis, for there exists such that the sequences have the -property. Precomposing with in the right-hand side of (21) we have that
where for some error terms
As we previously highlighted, for every fixed we can partition the set of integers so that is constant. So, fixing using the induction hypothesis, we have
So, using Hölder’s inequality and the definition of the seminorms we have
hence, (19) is bounded above by a constant multiple of as was to be shown.
Cases 2 & 3: For we either have property (ii) (b) or (ii) (c) in Definition 4.5.
Here we will skip the details already outlined in Case 1. If is the integer guaranteed by Definition 4.5, the integrand in the last part of equation (18) will become (setting, without loss, )
and the one in equation (21)
Precomposing with the term (for Case 2) and with (for Case 3–where is the one guaranteed by Definition 4.5), we can continue (using the induction hypothesis) and finish the argument as in Case 1. The proof of the statement is now complete. ∎
Remark 5.
To the best of our knowledge, when we deal with norm convergence of averages of (non-variable) polynomial iterates, we can always replace the conventional Cesàro averages, i.e., with the corresponding uniform ones, i.e., Our method though, exactly because of the choice of functions (to go from equation (15) to (16) and from equation (18) to (19)), cannot guarantee the corresponding uniform results.
4.2. The general case
We start by recalling (see, for example, [2] and [12]) the definition of the degree and type of a polynomial family that we will adapt in our study:
Definition 4.6.
For let be a family of non-constant real polynomials. We denote by the maximum degree of the polynomials ’s and we call it the degree of If denotes the number of distinct leading coefficients of polynomials from of degree and then the vector is the type of We order all the possible type vectors lexicographically. [Footnote 17: I.e., iff, reading from left to right, at the first instance where the two vectors disagree, the coordinate of the first vector is greater than that of the second one.]
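The degree and type of a concrete family can be computed mechanically. The following is a minimal sketch, assuming the convention of [12] in which the type of a family of degree $d$ is the vector $(d, w_d, \ldots, w_1)$, with $w_i$ the number of distinct leading coefficients among the members of degree $i$. A polynomial is represented by its list of coefficients, lowest degree first; the names `degree` and `family_type` are hypothetical. Python tuples are compared lexicographically, which matches the ordering described in the footnote.

```python
from fractions import Fraction


def degree(p):
    """Degree of p = [c0, c1, ..., cd]; trailing zero coefficients are ignored."""
    d = len(p) - 1
    while d > 0 and p[d] == 0:
        d -= 1
    return d


def family_type(family):
    """Return (d, w_d, ..., w_1) for a family of non-constant polynomials."""
    degrees = [degree(p) for p in family]
    d = max(degrees)
    # leading[i] collects the distinct leading coefficients in degree i.
    leading = {i: set() for i in range(1, d + 1)}
    for p, k in zip(family, degrees):
        if k >= 1:
            leading[k].add(Fraction(p[k]))
    return (d,) + tuple(len(leading[i]) for i in range(d, 0, -1))


# {t^2, 2t^2 + t, t}: two leading coefficients in degree 2, one in degree 1.
first = family_type([[0, 0, 1], [0, 1, 2], [0, 1]])
# {t^2, t^2 + t, 3t}: one leading coefficient in degree 2, one in degree 1.
second = family_type([[0, 0, 1], [0, 1, 1], [0, 3]])
print(first, second, first > second)
```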
In order to reduce the complexity (i.e., the type) of a polynomial family, one has to use the classic PET (i.e., Polynomial Exhaustion Technique) induction.
At this point we remind the reader that the real polynomials are called essentially distinct if they are, together with their pairwise differences, non-constant. Given such a family of polynomials and the van der Corput operation (vdC-operation), acting on gives the family
where we then remove all the terms that are bounded [Footnote 19: This is justified by the use of the Cauchy-Schwarz inequality.] and we group the ones of degree with bounded difference (i.e., of the same leading coefficient), thus obtaining a new family of essentially distinct polynomials.
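The vdC-operation can be sketched in a few lines for polynomials with constant coefficients. The sketch below assumes the standard form used in [12], namely that $(p,h)$ acting on $(p_1,\ldots,p_\ell)$ produces the polynomials $p_i(t+h)-p(t)$ and $p_i(t)-p(t)$, after which constant terms are removed and linear terms with the same leading coefficient are grouped; this formula, the coefficient-list representation (lowest degree first) and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def trim(p):
    """Convert to Fractions and drop trailing zeros (keeping at least one entry)."""
    p = [Fraction(c) for c in p]
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def shift(p, h):
    """Coefficients of t -> p(t + h), via the binomial expansion of (t + h)^k."""
    q = [Fraction(0)] * len(p)
    for k, c in enumerate(p):
        for j in range(k + 1):
            q[j] += c * comb(k, j) * Fraction(h) ** (k - j)
    return trim(q)


def subtract(p, q):
    """Coefficients of p - q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])


def vdc_operation(family, p, h):
    """Apply the assumed vdC-operation (p, h) to the family."""
    family = [trim(q) for q in family]
    p = trim(p)
    candidates = [subtract(shift(q, h), p) for q in family]
    candidates += [subtract(q, p) for q in family]
    result, seen_linear = [], set()
    for q in candidates:
        if len(q) == 1:
            continue  # constant polynomial: bounded, so it is removed
        if len(q) == 2:
            if q[1] in seen_linear:
                continue  # linear with an already seen leading coefficient
            seen_linear.add(q[1])
        result.append(q)
    return result


# Start from {t^2, 2t^2} and use p = t^2, h = 1.
step1 = vdc_operation([[0, 0, 1], [0, 0, 2]], [0, 0, 1], 1)
# Continue with the member of minimum degree, p = 2t + 1, then with p = t^2.
step2 = vdc_operation(step1, [1, 2], 1)
step3 = vdc_operation(step2, [0, 0, 1], 1)
print(step1, step2, step3, sep="\n")
```

With the type convention $(d, w_d, \ldots, w_1)$ of [12], a hand computation gives the types $(2,2,0)$, $(2,1,1)$, $(2,1,0)$ and $(1,4)$ for the four families above, so the type decreases at each step until only linear polynomials remain. The value $h=1$ is used only to keep the numbers small; the statements in the text concern large $h$.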
The following lemma states that there exists a choice of a polynomial in a family of essentially distinct polynomials, via which the vdC-operation reduces its type:
Lemma 4.7 (Lemma 4.5, [12]).
Let and be a family of essentially distinct polynomials with Then there exists (of minimum degree in the polynomial family) such that for every large the family has type smaller than that of and
What is crucial for us is that every decreasing sequence of types is eventually (after finitely many steps) stationary and that, by using the previous lemma, there is a point at which all the polynomials have degree . Also, by its definition, the vdC-operation preserves the essential distinctness property.
We will deal with sequences of families of real polynomials, where that, for large , has type independent of to be able to use the facts that we just mentioned. Abusing the notation, we write .
Next we define the subclass of variable polynomials that we will deal with.
Definition 4.8.
For let be a sequence of -tuples of real polynomials with bounded coefficients. We say that is super nice if, for every (large enough) :
-
(i)
the polynomials and, for all are non-constant and their degrees are independent of ;
-
(ii)
after performing, if needed, (finitely many) vdC-operations to to obtain only polynomials of degree say many, the leading coefficients (for large enough ’s–from the vdC-operations) have the -property; and
-
(ii)′
if then (ii) holds for the polynomial sequence
Remark 6.
It is not clear to us whether implies .
Consider for example the sequence of polynomials where and After performing the vdC-operation twice we get the triple while for the sequence after a single use of the vdC-operation we get . So, in the second case we have to impose assumptions on both while in the first one only on
The degree and type of every super nice sequence, together with the integer in (and, analogously, in as well), are independent of
Every family of essentially distinct polynomials that does not depend on N is super nice.
Indeed, since is immediate, we are showing ( follows by the same argument). As was mentioned before, the vdC-operation preserves the essential distinctness property; hence, all the linear polynomials will have distinct leading coefficients, which, as they are independent of will have the -property.
The set of super nice variable polynomial sequences is non-empty. Actually, the -tuple where and from Example 2, is super nice (see Lemma 6.2 below for a more general statement).
Indeed, as the variable part of the coefficients of the polynomials, after applying vdC-operations, is the same for all terms (and equal to ), at each step we have that the ratios of the coefficients are independent of , hence we have all the required properties.
Even though the number of degree terms (that appears in and ) is not a priori known, when we have a single variable polynomial sequence
where has the -property and all are bounded, we have that for all is super nice.
It suffices to show only (ii). We start with and use the vdC-operation which leads to differences of polynomials (notice that, for every reduces the degree of by ). Precomposing with we get the family of polynomials
In the next iteration of the vdC-operation we precompose with , and then (i.e., polynomials of minimum degree at each step). We will keep track of the leading coefficients of polynomials of maximum degree at each step (here, as we only have distinct non-zero multiples of the same polynomial it is not hard to do so; for more general coefficient tracking methods see [7, 9]); we have the following cases in this procedure:
The polynomial that is chosen according to Lemma 4.7 has degree strictly less than the one of the polynomial of maximum degree (e.g., this happens in the second iteration of the vdC-operation). In this case the leading coefficient of the latter polynomial doesn’t change.
The polynomial that is chosen according to Lemma 4.7, say has degree, say equal to the one of the polynomial of maximum degree (hence all the polynomials have the same degree –this is the case in the first application of the vdC-operation). Here, because of the nature of the (essentially distinct) iterates, the leading coefficients will be multiples of the leading coefficient of The scheme will continue by picking for the next step the polynomial (for the corresponding shift with leading coefficient times the leading coefficient of ) which is of minimum degree.
Continuing the procedure, we eventually arrive at, say many, degree iterates with distinct leading coefficients (because of the essential distinctness property), which are all multiples of (i.e., the coefficient of ). As all the iterated ratios of these coefficients are independent of and non-zero, we get that they satisfy the -property.
If is super nice, then is super nice too.
Looking at Property for we have that each polynomial (sequence) in is non-constant and has degree independent of equal to If we let
so (i) follows for as well. and follow by the fact that
Property is invariant under the vdC-operation (so, for sequences with degree the vdC-operation preserves the super niceness property).
Indeed, if is the polynomial guaranteed by Lemma 4.7, then we have the iterates: and
The degrees of these polynomials satisfy
For the pairwise differences part, for we have
and, finally,
so, everything follows by Property (i) for (recall here that if, in the case where it happens then (as a non-constant polynomial of minimum degree in ), so the vdC-operation will group the terms and together, being of degree with bounded difference).
Notice that Remark 6 (5) implies that Theorem 1.3, via Theorem 2.2, holds for a larger class of variable polynomial sequences; even with coefficients that oscillate.
A real-valued function which is continuously differentiable on where is called Fejér if the following hold:
tends monotonically to as and
(For a study of averages with general sublinear iterates the reader is referred to [8], and to [3] and [24] for more general functions, e.g., tempered functions.)
Any such function is eventually monotonic and satisfies the growth conditions hence has the -property. So, modulo the goodness property, Theorem 2.2 will also hold for polynomial sequences of the form:
where and are polynomials of degrees less than and respectively with bounded coefficients. This is a non-trivial generalization because while the functions and are Fejér, in view of the fact that they oscillate, do not belong to
The following result shows that the nilfactor is characteristic for a super nice collection of polynomial sequences
Proposition 2.
For let be a super nice sequence of polynomials, a system, and suppose that at least one of the functions is orthogonal to the nilfactor Then, we have
(22) |
Proof.
We assume without loss of generality that is orthogonal to As in [12, Lemma 4.7], to show (22), it suffices to show:
(23) |
We claim next that we can further assume that If this is not the case and then, precomposing with (23) becomes
where for some It suffices to show that for all
The claim follows by Remark 6 (6), as the family is super nice with degree.
If all the ’s are of degree the result follows from Proposition 1. For we use induction on the type of the polynomial family of -tuple of sequences.
For every we choose functions with for so that the average in (23) is close to If and using Cauchy-Schwarz, (23) follows if
(24) |
By Lemma 4.4, (24) follows if, for large enough for we have
(25) |
goes to Picking as guaranteed by Lemma 4.7 (the degrees of the ’s are fixed, so the choice of is independent of ), we precompose with the term in the integrand of (25) (notice that some error terms will appear). Next, we group the degree iterates. More specifically, if for some , then for some error terms in Hence, we have
We treat this product as one iterate. After this grouping, assuming that many terms remain, it suffices to show, for large and every choice of that
(26) |
where the polynomial sequences form a polynomial family with and
For the expression of Theorem 2.2, writing for some using Proposition 2, we get the following result:
Proposition 3.
For let be a super nice sequence of polynomials, a system, and suppose that at least one of the functions is orthogonal to the nilfactor Then, we have
5. Equidistribution
In order to prove our main equidistribution result (Theorem 5.2), we start with some definitions and facts, following [11] (see [11, Subsubsection 2.3.2] for more details).
If is a nilpotent group, then a sequence of the form where and are integer polynomials, is called a polynomial sequence in . If the maximum degree of the polynomials ’s is at most we say that the degree of is at most
Given a nilmanifold the horizontal torus is defined to be the compact abelian group . If is connected, then is isomorphic to some finite dimensional torus . A horizontal character is a continuous homomorphism that satisfies for every and can be thought of as a character of , in which case there exists a unique such that , where “” denotes the inner product operation, and .
Let be a polynomial sequence of degree of the form , where We define the smoothness norm by
(27) |
where denotes the distance to the closest integer, i.e., .
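A small numerical sketch may help fix ideas. It assumes the smoothness norm has the form used in [11], namely the maximum over r ≥ 1 of N^r times the distance of the r-th coefficient to the closest integer, for a polynomial written as a sum of a_r n^r; since the displayed formula (27) is not legible here, this form is an assumption, and the function names are hypothetical.

```python
def dist_to_nearest_int(x):
    """Distance from the real number x to the closest integer."""
    return abs(x - round(x))

def smoothness_norm(coeffs, N):
    """Assumed form: max over r >= 1 of N**r * ||a_r||, where
    p(n) = sum_r a_r * n**r and coeffs[r] is a_r.
    The constant term coeffs[0] plays no role."""
    return max((N ** r * dist_to_nearest_int(a)
                for r, a in enumerate(coeffs) if r >= 1), default=0.0)
```

Under this assumed form, a polynomial whose non-constant coefficients are all integers has smoothness norm zero, while a coefficient far from the integers in high degree makes the norm grow like a power of N.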
Given , a finite sequence is said to be -equidistributed in , if
for every Lipschitz function where
for some appropriate metric .
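To illustrate the kind of averaging error that the definition controls, here is a minimal sketch on the simplest nilmanifold, the circle R/Z. It only evaluates the error for one test function at a time (the real part of a character, whose integral is 0), so it does not verify the definition, which quantifies over all Lipschitz functions. The names `average_error` and `character`, and the two sample orbits, are assumptions made for the example.

```python
import math

def average_error(seq, F, integral):
    """|(1/N) * sum F(x_n) - integral| for a finite sequence of points."""
    values = [F(x % 1.0) for x in seq]
    return abs(sum(values) / len(values) - integral)

def character(k):
    """Real part of x -> e(kx) on the circle; its integral is 0 for k != 0."""
    return lambda x: math.cos(2 * math.pi * k * x)

N = 1000
# Orbit of an irrational rotation: expected to spread out over the circle.
irrational_orbit = [n * math.sqrt(2) for n in range(1, N + 1)]
# Orbit of a rational rotation: only visits the points 0 and 1/2.
rational_orbit = [n * 0.5 for n in range(1, N + 1)]
```

Calling `average_error(irrational_orbit, character(1), 0.0)` gives a value close to 0, whereas `average_error(rational_orbit, character(2), 0.0)` stays close to 1, since that character is constant on the two points visited by the rational orbit.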
Theorem 5.1 (Green & Tao, [17]).
Let be a nilmanifold with connected and simply connected, and . Then for every small enough there exists a positive constant with the following property: For every , if is a polynomial sequence of degree at most such that the finite sequence is not -equidistributed, then for some non-trivial horizontal character with we have
( here is thought of as a character of the horizontal torus and as a polynomial sequence in ).
Adapting the notion of equidistribution of a sequence in a nilmanifold (recall (14)) to our case, abusing the notation, we say that where is a variable sequence of real numbers and is a nilmanifold with connected and simply connected, is equidistributed in a subnilmanifold of if for every we have
In order for us to prove Theorems 2.1 and 2.2, we prove the following equidistribution theorem, which is the main result of this section:
Theorem 5.2.
Let be a good sequence of -tuples of polynomials.
-
If are nilmanifolds with connected and simply connected, then for every and the sequence
is equidistributed in the nilmanifold
-
If are nilmanifolds, then for every and the sequence
is equidistributed in the nilmanifold
Remark 7.
In order to prove Theorem 5.2, we can assume that
Indeed, in the general case we consider the nilmanifold Then where is connected and simply connected and is a discrete cocompact subgroup of Each can be considered as an element of and each as an element of Changing the base point we can also assume that
Part (ii) of the previous result follows from Part (i) (see [11, Lemma 5.1]):
Lemma 5.3.
Let and be a sequence of -tuples of real numbers. Suppose that for every nilmanifold with connected and simply connected, and every the sequence
is equidistributed in the nilmanifold Then, for every nilmanifold and the sequence
is equidistributed in the nilmanifold
Sketch of the proof.
Following [11, Lemma 4.1], we show the case, as the general one follows with some straightforward modifications.
Let be a nilmanifold, and Using some standard reductions (namely, the lifting argument and the change of base point formula from Subsections 3.2.3 and 3.2.2), we can and will assume that is connected and simply connected and that
Letting and the corresponding normalized Haar measure, we will show that for every we have
(28) |
Using our assumption for the case where is connected and simply connected, and for every we have
(29) |
where and is its corresponding normalized Haar measure (here we adopt the notation which is more convenient than ).
Let and define with While may be discontinuous, for every there exists that equals on where and it is uniformly bounded by
Since our assumption implies that and so for a set of ’s with density (by this we mean ). So,
hence, since (29) holds for every it also holds for
Recalling that a sequence of -tuples of variable polynomials is good if every non-trivial linear combination of is good, we have the following:
Lemma 5.4.
Let be a good sequence of -tuples of polynomials, nilmanifolds, with connected and simply connected, and suppose that acts ergodically on . Then the sequence
is equidistributed in .
Proof.
We follow [11, Lemma 5.3]. As the general case is similar, we assume that Arguing by contradiction, we will also assume that for some is not -equidistributed in
If then
where so, for all is a polynomial sequence in
Applying Theorem 5.1, we have a constant and a horizontal character of with such that
Let where be the projection of on the horizontal torus (the integer is bounded by the dimension of ). Using the ergodicity assumption on the ’s, for all the set consists of rationally independent elements. For we have for some elements with (note here that, for all the ’s are also rationally independent), so we have that
for some integers
If, for we set
we have that the sequence being a non-trivial (as is non-trivial and ’s are rationally independent) linear combination of the ’s, is good. Combining the last three relations, we get which is a contradiction to ; a condition that the coefficients of a good variable polynomial sequence satisfy (see [13]). ∎
The last ingredient in proving Part (i) of Theorem 5.2 is the following lemma:
Lemma 5.5 (Lemma 5.2, [11]).
Let be a nilmanifold with connected and simply connected. Then, for every there exists an such that for all the element acts ergodically on the nilmanifold
We are now ready to prove Theorem 5.2.
Proof of Theorem 5.2.
Using Lemma 5.3 we see that Part (ii) of Theorem 5.2 follows from Part (i). To establish Part (i) let . By Lemma 5.5 there exists a non-zero such that for every the element acts ergodically on the nilmanifold Using Lemma 5.4 for the elements and the polynomials (which still form a good sequence of -tuples of polynomials) we get that the sequence is equidistributed in the nilmanifold , hence we get the conclusion. ∎
6. Proof of main results
To prove our main results, we first show that the polynomial sequences from Theorems 1.2 and 1.3 are good and super nice. If either or we write
Proof.
Let be a non-trivial linear combination of strongly independent variable polynomials as in (7), which is also of the same form. In case this combination is a polynomial of degree precomposing with the opposite of its constant term, without loss of generality, we can assume that it is of the form where with (hence and monotonically as ). For any as we have that
In case the combination is a polynomial of degree after using Lemma 4.4 times, we get a polynomial of degree hence the result follows from the previous step. ∎
Recall that when we want to check that a -tuple, for has the -property, we have to check (according to Definition 4.5) that for every the corresponding -tuple has the -property. If a -tuple corresponds to the index we say that it is descending from the term of the previous step.
Proof.
For a single polynomial sequence as in (7), the result follows immediately from Remark 6 (5) and the properties of Hardy field functions.
For multiple sequences, follows by the form (7) that the variable polynomial sequences have. As and consist of polynomials of the same form, and will both follow by the same argument.
After performing, if needed, finitely many vdC-operations to the polynomial families of interest, assuming that we have many essentially distinct terms of the form we have and (this happens because vdC-operations preserve the essential distinctness property of the polynomials and at each step the coefficient functions belong to ). In order to show that the sequences have the -property, we present an algorithmic way of finding the corresponding terms at the steps and for :
Step 1: For we pick (i.e., the largest index). In this case we will show that we have property (ii) (a) (of Definition 4.5). The terms become:
For we pick (i.e., the smallest index). In this case we will show that we have property (ii) (b). The terms become:
Step : After we order them from largest to smallest growth, we denote the -th term at the -th step with . We have two cases:
The sequence of coefficients is descending from the term of the -th step.
For we pick and show property (ii) (a) (for the case we always pick the largest index and show property (ii) (a)). For we have
where the numerator comes from the difference and the (common) denominators are canceled.
For we pick and show property (ii) (b) (for the case we always pick and show property (ii) (b)). For we have
where the denominator comes from the difference and, as in the previous case, the (common) denominators are canceled.
The sequence of coefficients is descending from the term of the -th step.
For we choose . For all we have:
For we choose . For all we have
Note that each of the aforementioned terms, at each step, is (up to a sign) of the form
i.e., combinations of terms from the initial sequence (because of the cancellations mentioned above) which are all to The claim now follows by the properties of elements from as each coefficient is a logarithmico-exponential Hardy function, hence eventually monotone, which is either or with by the construction. ∎
Proof of Theorem 2.1.
We start by using Proposition 2 in order to get that the nilfactor is characteristic for the multiple average in (8) (which can be used as the polynomial iterates are super nice). Via Theorem 3.1 we can assume without loss of generality that our system is an inverse limit of nilsystems. By a standard approximation argument, we can further assume that it is actually a nilsystem.
Let be a nilsystem, where is ergodic, and . Our objective now is to show that
(30) |
where the convergence takes place in . By density, we can assume that the functions are continuous. In this case we will show that (30) holds for all hence we will obtain the result by using the Dominated Convergence Theorem. By applying Theorem 5.2 to the nilmanifold , the nilrotation , the point , and the continuous function (here we are using the goodness property of the ’s), we get that
This implies that (30) holds for every , completing the proof. ∎
Proof of Theorem 2.2.
As in the previous proof, our objective is to show that if is a good polynomial sequence with being super nice, then for every nilsystem , where is ergodic, and we have that the limit
(31) |
is equal to the limit
(32) |
As in the proof above, we assume that every is continuous. Applying Theorem 5.2 to the nilrotation the point and the continuous function we get
This implies that the limits in (31) and (32) exist for every and are equal. ∎
6.1. Closing comments and problems
In the generality in which it is stated, Problem 1 (i.e., [13, Problem 10]) remains open except in the case. In this article, we first showed that the nilfactor of a system is characteristic for the corresponding sequence of iterates under the additional super niceness assumption. Second, we showed that the goodness property alone was enough to imply the required equidistribution properties. This comes as no surprise since, as we have already mentioned, the goodness property is a strong equidistribution notion. Hence, to completely resolve the problem, one has to answer the following problem in the affirmative:
Problem 3.
For let be a good sequence of -tuples of polynomials. Is it true that for every system its nilfactor is characteristic for ?
Analogously, to solve Problem 2, it suffices to answer the following:
Problem 4.
Let be a good sequence of polynomials. Is it true that for every and every system its nilfactor is characteristic for the sequence ?
As our results give convergence to the “expected” limit, it is reasonable to study the corresponding pointwise results along natural numbers. So, we naturally close this article with the following problem:
Problem 5.
Acknowledgments
Thanks go to D. Karageorgos with whom I started discussing the problem; N. Frantzikinakis for his constant support and fruitful discussions during the writing of this article; and N. Kotsonis for his detailed corrections on the text. I am also deeply thankful to the anonymous Referees, X and Y, whose detailed feedback led to numerous clarifications, improving the readability and quality of the article.
References
- [1] (MR891243) [10.1090/conm/065/891243] V. Bergelson, Ergodic Ramsey theory, Logic and Combinatorics (Arcata, Calif., 1985), Contemp. Math. Amer. Math. Soc., Providence, RI, 65 (1987), 63–87.
- [2] (MR912373) [10.1017/S0143385700004090] V. Bergelson, Weakly mixing PET, Ergodic Theory Dynam. Systems, 7 (1987), 337–349.
- [3] (MR2545011) [10.1017/S0143385708000862] V. Bergelson and I. Håland-Knutson, Weakly mixing implies mixing of higher orders along tempered functions, Ergodic Theory Dynam. Systems, 29 (2009), 1375–1416.
- [4] (MR1881925) [10.1007/s002220100179] V. Bergelson and A. Leibman, A nilpotent Roth theorem, Invent. Math., 147 (2002), 429–470.
- [5] (MR1325795) [10.1090/S0894-0347-96-00194-4] V. Bergelson and A. Leibman, Polynomial extensions of van der Waerden’s and Szemerédi’s theorems, J. Amer. Math. Soc., 9 (1996), 725–753.
- [6] (MR2795725) [10.1112/plms/pdq037] Q. Chu, N. Frantzikinakis and B. Host, Ergodic averages of commuting transformations with distinct degree polynomial iterates, Proc. Lond. Math. Soc., 102 (2011), 801–842.
- [7] S. Donoso, A. Ferré Moragues, A. Koutsogiannis and W. Sun, Decomposition of multicorrelation sequences and joint ergodicity, preprint, 2021, arXiv:2106.01058.
- [8] (MR4092858) [10.1017/etds.2018.118] S. Donoso, A. Koutsogiannis and W. Sun, Pointwise multiple averages for sublinear functions, Ergodic Theory Dynam. Systems, 40 (2020), 1594–1618.
- [9] [10.1007/s11854-021-0186-z] S. Donoso, A. Koutsogiannis and W. Sun, Seminorms for multiple averages along polynomials and applications to joint ergodicity, J. d’Analyse Math., (2021).
- [10] (MR3347186) [10.1090/S0002-9947-2014-06275-2] N. Frantzikinakis, A multidimensional Szemerédi theorem for Hardy sequences of different growth, Trans. Amer. Math. Soc., 367 (2015), 5653–5692.
- [11] (MR2585398) [10.1007/s11854-009-0035-y] N. Frantzikinakis, Equidistribution of sparse sequences on nilmanifolds, J. Anal. Math., 109 (2009), 353–395.
- [12] (MR2762998) [10.1007/s11854-010-0026-z] N. Frantzikinakis, Multiple recurrence and convergence for Hardy sequences of polynomial growth, J. Anal. Math., 112 (2010), 79–135.
- [13] (MR3613710) N. Frantzikinakis, Some open problems on multiple ergodic averages, Bull. Hellenic Math. Soc., 60 (2016), 41–90.
- [14] (MR3829173) [10.1093/imrn/rnx002] N. Frantzikinakis, An averaged Chowla and Elliott conjecture along independent polynomials, Int. Math. Res. Not. IMRN, 2018 (2018), 3721–3743.
- [15] (MR3047073) [10.1007/s11856-012-0132-y] N. Frantzikinakis, B. Host and B. Kra, The polynomial multidimensional Szemerédi theorem along shifted primes, Israel J. Math., 194 (2013), 331–348.
- [16] (MR498471) [10.1007/BF02813304] H. Furstenberg, Ergodic behavior of diagonal measures and a theorem of Szemerédi on arithmetic progressions, J. Analyse Math., 31 (1977), 204–256.
- [17] (MR2877065) [10.4007/annals.2012.175.2.2] B. Green and T. Tao, The quantitative behaviour of polynomial orbits on nilmanifolds, Ann. of Math., 175 (2012), 465–540.
- [18] [10.1112/plms/s2-10.1.54] G. H. Hardy, Properties of logarithmico-exponential functions, Proc. London Math. Soc., s2-10 (1912), 54–90.
- [19] (MR2150389) [10.4007/annals.2005.161.397] B. Host and B. Kra, Nonconventional ergodic averages and nilmanifolds, Ann. of Math., 161 (2005), 397–488.
- [20] (MR3999460) [10.4064/sm171102-18-9] D. Karageorgos and A. Koutsogiannis, Integer part independent polynomial averages and applications along primes, Studia Math., 249 (2019), 233–257.
- [21] (MR3809055) [10.3934/dcds.2018113] Y. Kifer, Ergodic theorems for nonconventional arrays and an extension of the Szemerédi theorem, Discrete Contin. Dyn. Syst., 38 (2018), 2687–2716.
- [22] (MR3774837) [10.1017/etds.2016.40] A. Koutsogiannis, Closest integer polynomial multiple recurrence along shifted primes, Ergodic Theory Dynam. Systems, 38 (2018), 666–685.
- [23] (MR3789175) [10.1017/etds.2016.67] A. Koutsogiannis, Integer part polynomial correlation sequences, Ergodic Theory Dynam. Systems, 38 (2018), 1525–1542.
- [24] (MR4201837) [10.3934/dcds.2020314] A. Koutsogiannis, Multiple ergodic averages for tempered functions, Discrete Contin. Dyn. Syst., 41 (2021), 1177–1205.
- [25] (MR2122919) [10.1017/S0143385704000215] A. Leibman, Pointwise convergence of ergodic averages for polynomial sequences of translations on a nilmanifold, Ergodic Theory Dynam. Systems, 25 (2005), 201–213.
- [26] (MR1106945) [10.1215/S0012-7094-91-06311-8] M. Ratner, Raghunathan’s topological conjecture and distribution of unipotent flows, Duke Math. J., 63 (1991), 235–280.
- [27] (MR2912715) [10.4007/annals.2012.175.3.15] M. Walsh, Norm convergence of nilpotent ergodic averages, Ann. of Math., 175 (2012), 1667–1688.
- [28] (MR2257397) [10.1090/S0894-0347-06-00532-7] T. Ziegler, Universal characteristic factors and Furstenberg averages, J. Amer. Math. Soc., 20 (2007), 53–97.
Received November 2021; revised March 2022; early access May 2022.