Pointwise convergence of ergodic averages with Möbius weight
Abstract.
Let $(X,\nu,T)$ be a measure-preserving system, and let $P_1,\ldots,P_k$ be polynomials with integer coefficients. We prove that, for any $f_1,\ldots,f_k\in L^{\infty}(X)$, the Möbius-weighted polynomial multiple ergodic averages
$$\frac{1}{N}\sum_{n\leq N}\mu(n)f_1(T^{P_1(n)}x)\cdots f_k(T^{P_k(n)}x)$$
converge to $0$ pointwise almost everywhere. Specialising to $P_1(y)=y$, $P_2(y)=2y$, this solves a problem of Frantzikinakis. We also prove pointwise convergence for a more general class of multiplicative weights for multiple ergodic averages involving distinct degree polynomials. For the proofs we establish some quantitative generalised von Neumann theorems for polynomial configurations that are of independent interest.
1. Introduction
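Let $(X,\nu)$ be a probability space and $T\colon X\to X$ an invertible measure-preserving map, meaning that $\nu(T^{-1}(E))=\nu(E)$ for all measurable sets $E\subset X$. The triple $(X,\nu,T)$ is called a measure-preserving system. Given functions $f_1,\ldots,f_k\in L^{\infty}(X)$ and polynomials $P_1,\ldots,P_k$ with integer coefficients, we form the polynomial multiple ergodic averages
$$\frac{1}{N}\sum_{n\leq N}f_1(T^{P_1(n)}x)\cdots f_k(T^{P_k(n)}x). \tag{1.1}$$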
The convergence properties of these averages as $N\to\infty$ have been studied intensively. The question of their norm convergence was settled by Host–Kra [20] and Leibman [24] after substantial progress by several authors. The question of pointwise convergence, however, remains open and is the subject of the celebrated Furstenberg–Bergelson–Leibman conjecture [2, Section 5.5]. Pointwise convergence has so far been established in two notable cases, namely the case where $k=1$ and the case where $k=2$ and $P_1$ is linear. The former follows from celebrated work of Bourgain [3], and the latter follows from another well known work of Bourgain [4] if $P_2$ is linear and from a recent breakthrough of Krause, Mirek and Tao [22] if $P_2$ has degree at least $2$.
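Let $\mu$ be the Möbius function, defined by $\mu(n)=(-1)^{j}$ if $n$ is the product of $j$ distinct primes and by $\mu(n)=0$ if $n$ is divisible by the square of a prime. In this paper, we shall consider the Möbius-weighted polynomial multiple ergodic averages
$$\frac{1}{N}\sum_{n\leq N}\mu(n)f_1(T^{P_1(n)}x)\cdots f_k(T^{P_k(n)}x).$$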
Based on the Möbius randomness principle (see [21, Section 13]), it is natural to conjecture the following.
Conjecture 1.1.
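Let $k\in\mathbb{N}$, and let $P_1,\ldots,P_k$ be polynomials with integer coefficients. Let $(X,\nu,T)$ be a measure-preserving system. Then, for any $f_1,\ldots,f_k\in L^{\infty}(X)$, we have
$$\lim_{N\to\infty}\frac{1}{N}\sum_{n\leq N}\mu(n)f_1(T^{P_1(n)}x)\cdots f_k(T^{P_k(n)}x)=0$$
for almost all $x\in X$.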
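To make the objects in Conjecture 1.1 concrete, the following is a minimal numerical sketch and not part of the argument. It assumes the simplest possible system, namely the circle rotation $x\mapsto x+\alpha$ on $\mathbb{R}/\mathbb{Z}$ with a single function $f(x)=e^{2\pi i x}$ and the polynomial $P(n)=n^{d}$. The function names `mobius_sieve` and `mobius_weighted_average` are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Möbius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Every multiple of p gains one prime factor, so flip the sign.
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # Multiples of p^2 are not squarefree, so the value is 0.
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    mu[0] = 0  # index 0 is unused
    return mu


def mobius_weighted_average(N, alpha, x=0.0, degree=2):
    """Compute (1/N) * sum_{n <= N} mu(n) * f(T^{n^degree} x).

    Assumption for illustration only: T is the rotation x -> x + alpha
    on R/Z and f(x) = exp(2*pi*i*x).
    """
    mu = mobius_sieve(N)
    total = 0j
    for n in range(1, N + 1):
        point = (x + alpha * n ** degree) % 1.0
        total += mu[n] * cmath.exp(2j * math.pi * point)
    return total / N


if __name__ == "__main__":
    for N in (10 ** 2, 10 ** 3, 10 ** 4):
        print(N, abs(mobius_weighted_average(N, math.sqrt(2))))
```

Such an experiment only probes one orbit of one very special system, so it can suggest decay of the averages but proves nothing about almost everywhere convergence in general.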
The corresponding norm convergence result was proven by Frantzikinakis and Host [10] (this could also be deduced from the proof of [5, Theorem 1.3], combined with [17, Theorem 1.1]). The only case we are aware of where Conjecture 1.1 was previously known is the case , proven in [7, Theorem 2.2] (see also [8, Proposition 3.1] for the case of a linear polynomial and [10, Theorem C] for a generalisation of that result to other multiplicative functions).
Our first main theorem settles Conjecture 1.1 in full. In fact, somewhat unusually for ergodic theorems, we get a quantitative (polylogarithmic) rate of convergence. This result goes beyond what is currently known about the unweighted averages (1.1).
Theorem 1.2.
Let , and let be polynomials with integer coefficients. Let satisfy . Let be a measure-preserving system. Then, for any and , we have
for almost all .
Let us make a few remarks about Theorem 1.2.
(1) Specialising to and (and , ), Theorem 1.2 settles the Möbius case of Problem 12 of Frantzikinakis’ open problems survey [9]. This problem was also stated by Frantzikinakis and Host in [10]. (Frantzikinakis also asked about the convergence of the same multiple ergodic averages weighted by any multiplicative function , taking values in the unit disc, which has a mean value in every arithmetic progression. Theorem 1.6 below makes progress on this more general question.)
(2) In the case where the are linear, we can allow iterates of different commuting transformations in the result; see Theorem 1.3 below.
(3) It is likely that, at least for and linear, the region of in the theorem could be improved to for some , hence “breaking duality” in this problem. This is thanks to the $L^p$-improving estimates of Lacey [23] (see also [6]) and Han–Kovač–Lacey–Madrid–Yang [19]. See also [22, Section 11]. We leave the details to the interested reader.
(4) The theorem continues to hold if we replace the Möbius function $\mu$ with the Liouville function $\lambda$ (defined by $\lambda(n)=(-1)^{\Omega(n)}$, where $\Omega(n)$ is the number of prime factors of $n$ with multiplicities); see the more general Theorem 7.1 along with Remark 5.3. Similarly, all the results in this paper regarding the Möbius function also hold for the Liouville function.
1.1. Ergodic averages with commuting transformations
Let a probability space and invertible, commuting, measure-preserving maps be given. For any polynomials with integer coefficients and , one can consider polynomial ergodic averages with commuting transformations
(1.2) |
and their Möbius-weighted versions
(1.3) |
The commuting case seems to be rather more difficult, since in the unweighted case (1.2) pointwise convergence is not currently known even for and linear (see Problem 19 of [9]). However, we mention that Walsh [35] proved norm convergence of the averages (1.2) in a groundbreaking work, and the Furstenberg–Bergelson–Leibman conjecture asserts that pointwise convergence should hold also with commuting transformations.
Our next theorem states that for the Möbius averages (1.3) we have pointwise convergence in the case where the are linear.
Theorem 1.3.
Let . Let be a probability space and let be invertible, commuting, measure-preserving maps. Let satisfy , and let . Then, for any , we have
for almost all .
Naturally, this theorem implies Theorem 1.2 in the case of linear polynomials, by taking the commuting transformations to be powers of the same transformation.
It seems likely that also the general case of Theorem 1.2 could be obtained for commuting transformations by extending Theorem 4.2, which goes into its proof, to multivariate functions (which could likely be done with a more complicated PET induction scheme). We leave the details to the interested reader.
1.2. Multiplicative weights
We also consider more general weighted polynomial multiple ergodic averages
(1.4) |
with a function. Already for and linear, a necessary condition for the pointwise convergence of these averages is that has convergent means, meaning that exists for all (this is seen by taking to be a finite set).
In Theorem 7.1 we will show, assuming a mild growth condition on , that good decay bounds on the Gowers norms of (or for certain weaker norms that depend on the maximal correlation with polynomial phases) imply the convergence of the averages (1.4) to . Such results should have applications also in cases where has no arithmetic structure (for example, for random weights); however, here we focus on applications with being multiplicative. (We say that a function $f\colon\mathbb{N}\to\mathbb{C}$ is multiplicative if $f(mn)=f(m)f(n)$ whenever $m,n$ are coprime.)
Frantzikinakis [9] conjectured that if is $1$-bounded, multiplicative and has convergent means, then we have the pointwise almost everywhere convergence of
There is an obvious extension of this conjecture to polynomial multiple ergodic averages.
Conjecture 1.4.
Let , and let be polynomials with integer coefficients. Let be a multiplicative function taking values in the unit disc and having convergent means. Let be a measure-preserving system. Then, for any , the limit
exists for almost all .
While we are not able to prove this statement in full, we can prove it for a natural class of multiplicative functions, namely those satisfying a Siegel–Walfisz assumption (stated below). Most practically occurring multiplicative functions of mean satisfy this property, and this class arises naturally in several problems in analytic number theory, in particular in connection with the Bombieri–Vinogradov theorem (see e.g. [15]).
Definition 1.5.
We say that a function satisfies the Siegel–Walfisz assumption if the following hold:
(1) is divisor-bounded: for some , we have for all , with denoting the number of positive divisors of .
(2) For all and we have
Examples of multiplicative functions satisfying the Siegel–Walfisz assumption include for any integer , real and Dirichlet character , where is any bounded multiplicative function that is “pretending” to be the Möbius function in the sense that . (That these functions are examples can be verified by using Perron’s formula and standard estimates for Dirichlet $L$-functions close to the $1$-line.)
We are now ready to state a result on the pointwise convergence of multiple ergodic averages with a multiplicative weight satisfying the Siegel–Walfisz assumption.
Theorem 1.6.
Let , and let be polynomials with integer coefficients and with distinct degrees. Let satisfy . Let be a measure-preserving system. Let be a multiplicative function satisfying the Siegel–Walfisz assumption. Then, for any , we have
for almost all .
Note that we allow the function to be unbounded, hence proving pointwise convergence in some cases not covered by Conjecture 1.4.
1.3. Prime ergodic averages
The arguments presented in this paper are not limited to polynomial ergodic averages weighted by the Möbius function, and indeed apply to similar ergodic averages with any weight that satisfies certain Gowers uniformity assumptions as well as some weak upper bound assumptions that are easy to verify; see Theorem 7.1. In particular, thanks to the quantitative Gowers uniformity estimates for the von Mangoldt function in [25], these general theorems can be applied to reduce the problem of convergence of polynomial ergodic averages weighted by the primes to the problem of convergence of the same averages weighted by integers with no small prime factors, which is an easier problem (though still highly nontrivial). Pointwise convergence of polynomial multiple ergodic averages weighted by the primes will be studied in a future joint work with Krause, Mousavi and Tao.
1.4. Further applications of the proof method
Key ingredients in the proofs of our main theorems are some new polynomial generalised von Neumann theorems with quantitative dependencies that we establish in Section 4. These results are likely to have applications also to other problems, such as to bounds for sets of integers lacking progressions of the form with prime and with polynomials of distinct degrees with . Such applications will be investigated in future works.
1.5. Acknowledgements
The author thanks Nikos Frantzikinakis, Ben Krause, Sarah Peluse, Sean Prendiville and Terence Tao for helpful discussions and suggestions. The author was supported by funding from European Union’s Horizon Europe research and innovation programme under Marie Skłodowska-Curie grant agreement No 101058904.
2. Proof ideas
We now give an overview of the arguments used to prove Theorems 1.2 and 1.6, presenting the steps of the proof in a somewhat different order than in the actual proof and focusing on the case of functions for simplicity.
We begin with Theorem 1.2. Let be polynomials with integer coefficients and with highest degree . The first step in the proofs of our pointwise convergence results is a lacunary subsequence trick, which combined with the Borel–Cantelli lemma and Markov’s inequality reduces matters to obtaining strong quantitative pointwise bounds of the form
(2.1) |
for -almost all and all , with large enough. This reduction is presented in Section 7.
We then establish a polynomial generalised von Neumann theorem for counting operators on the left-hand side of (2.1), which bounds the averages in (2.1) in terms of the Gowers norm of for some ; see Theorem 4.2. This result is proven by repeated applications of van der Corput’s inequality coupled with the PET induction scheme. Crucially, the bounds we obtain for (2.1) in terms of the Gowers norm are quantitative with polynomial (in fact, linear) dependencies.
After establishing this generalised von Neumann theorem, we can conclude the proof by applying the strongest known quantitative bounds for the norms of the Möbius function (see Lemma 5.1), which save an arbitrary power of logarithm, thanks to recent work of Leng [25] that builds on the work of Leng–Sah–Sawhney [26].
In the case of Theorem 1.6, we repeat the lacunary subsequence argument to reduce to (2.1), with replaced by a multiplicative function satisfying the Siegel–Walfisz assumption. If one were now to apply Theorem 4.2 again, one would not be able to obtain a sufficiently strong bound on the norm of , since the Leng–Sah–Sawhney inverse theorem [26] is quasipolynomial rather than polynomial. We overcome this by establishing a different polynomial generalised von Neumann theorem, Theorem 4.1, that (perhaps unexpectedly at first) allows bounding (2.1) in terms of a weaker norm than the norm of the weight function. This weaker norm, called the norm and defined in (3.3), expresses the maximal correlation of the weight with a polynomial phase of degree at most . The proof of Theorem 4.1 draws motivation from the Peluse–Prendiville degree lowering theory [28], [29]. The proof proceeds by induction on the length of the progression and involves showing that the first two functions in a weighted progression can be assumed to be “locally linear” phase functions in a suitable sense. This conclusion is then boosted to global linearity with some extra work, which allows reducing the length of the progression, hence completing the induction.
Since the norm already involves correlations with polynomial phases, we are able to bypass the need for the inverse theorem for the norm when working with distinct degree polynomials. In Subsection 5.2, we show (using in particular a restriction to typical factorisations and bilinear estimates for polynomial exponential sums) that if is multiplicative and satisfies the Siegel–Walfisz assumption, then is close in norm to a function whose norms decay faster than any power of logarithm. This together with Theorem 4.1 mentioned above suffices for concluding the proof of Theorem 1.6. The Siegel–Walfisz assumption gives just the right decay for (2.1): with any weaker assumption we would not be able to prove this (although the averages (1.4) should still converge).
We lastly remark that the approach based on Theorem 4.1 also gives a different and arguably simpler proof of Theorem 1.2 for distinct degree polynomials that is independent of any inverse theorems for the Gowers norms and only uses classical analytic number theory input (the only property needed of the Möbius function is an exponential sum estimate that essentially goes back to the work of Vinogradov [34] from 1939; see Remark 5.2).
3. Notation and preliminaries
3.1. Asymptotic notation, indicators and averages
We use the Vinogradov and Landau asymptotic notations . Thus, we write , or if there is a constant such that . We use to denote . We write as if for some function as . If we add subscripts to these notations, then the implied constants can depend on these subscripts. Thus, for example means that for some depending on .
For a set $A$, we define the indicator function $1_A$ as the function that equals $1$ if $x\in A$ and equals $0$ otherwise. Similarly, if $P$ is a proposition, the expression $1_P$ equals $1$ if $P$ is true and $0$ if $P$ is false.
For a nonempty finite set and a function , we define the averages
For a real number , we denote . For integers , we denote their greatest common divisor by and write to mean that for some natural number . Unless otherwise specified, all our sums and averages run over the positive integers, with the exception that the symbol is reserved for primes.
3.2. Gowers norms
For and a function with finite support, we define its unnormalised Gowers norm as
where is the complex conjugation operator and for a vector we write . For , we then define the Gowers norm of a function as
As is well known (see for example [16, Appendix B]), for the norm is indeed a norm and for it is a seminorm, and the function is increasing.
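As a sanity check on the definitions, the following minimal sketch computes Gowers norms of short sequences by brute force. It assumes the standard definitions, since the displayed formulas are not reproduced here: the unnormalised norm satisfies $\|f\|_{\widetilde{U}^s}^{2^s}=\sum_{h\in\mathbb{Z}}\|\Delta_h f\|_{\widetilde{U}^{s-1}}^{2^{s-1}}$ with $\Delta_h f(x)=f(x+h)\overline{f(x)}$ and $\|f\|_{\widetilde{U}^1}^{2}=|\sum_x f(x)|^2$, and the norm on an interval is obtained by dividing by the corresponding norm of the indicator of that interval. The names `gowers_power`, `gowers_norm` and `phase` are hypothetical.

```python
import cmath
import math


def gowers_power(f, s):
    """Return the unnormalised Gowers norm of f raised to the power 2^s.

    f is a list of complex values on consecutive integers (zero elsewhere).
    Uses the recursion over multiplicative derivatives; the shifts h and -h
    contribute equally, so only h >= 0 is enumerated.
    """
    if s == 1:
        return abs(sum(f)) ** 2
    total = gowers_power([z * z.conjugate() for z in f], s - 1)  # h = 0
    for h in range(1, len(f)):
        derivative = [f[x + h] * f[x].conjugate() for x in range(len(f) - h)]
        total += 2 * gowers_power(derivative, s - 1)
    return total


def gowers_norm(f, s):
    """Gowers U^s norm on an interval, normalised by the indicator of the interval."""
    ones = [1.0] * len(f)
    return (gowers_power(f, s) / gowers_power(ones, s)) ** (1.0 / 2 ** s)


def phase(alpha, degree, N):
    """The polynomial phase n -> exp(2*pi*i*alpha*n^degree) for 1 <= n <= N."""
    return [cmath.exp(2j * math.pi * alpha * n ** degree) for n in range(1, N + 1)]


if __name__ == "__main__":
    N = 24
    quadratic = phase(math.sqrt(2), 2, N)
    print("U^2 norm of a quadratic phase:", gowers_norm(quadratic, 2))
    print("U^3 norm of a quadratic phase:", gowers_norm(quadratic, 3))
```

The running time grows like $N^{s}$, so this is only suitable for small experiments. A quadratic phase is a convenient test case, since it is invisible to the $U^2$ norm in the sense of having small norm, while its $U^3$ norm equals that of the constant function.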
We observe the classical inverse theorem: if satisfies and , then
(3.1) |
where . This follows from the identity (which can be verified by expanding out the right-hand side) combined with Parseval’s identity.
For the norms we have the Gowers–Cauchy–Schwarz inequality (see for example [32, (4.2)]), which states that, for any functions from to with finite support, we have
(3.2) |
3.3. Van der Corput’s inequality
For , define the weight
where is the floor function of . (This weight should not be confused with the Möbius function .) For an integer , we define the differencing operator by setting for any and .
We will frequently use van der Corput’s inequality in the following form.
Lemma 3.1.
For any and any function supported on , we have
(3.4) |
Proof.
See for example [30, Lemma 3.1]. ∎
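For orientation, here is a minimal numerical check of one standard form of van der Corput's inequality. The normalisation is an assumption and may differ from the one in (3.4): for $f$ supported on $N$ consecutive integers and an integer $H\geq 1$, the form checked is $|\sum_n f(n)|^2\leq \frac{N+H-1}{H}\sum_{|h|<H}(1-\frac{|h|}{H})\sum_n f(n+h)\overline{f(n)}$, which follows from averaging over $H$ shifts and applying the Cauchy–Schwarz inequality. The helper names are hypothetical.

```python
import cmath
import math
import random


def correlation(f, h):
    """Return sum_n f(n + h) * conj(f(n)) for f supported on range(len(f))."""
    shift = abs(h)
    value = sum(f[n + shift] * f[n].conjugate() for n in range(len(f) - shift))
    # The correlation at -h is the complex conjugate of the correlation at h.
    return value if h >= 0 else value.conjugate()


def van_der_corput_sides(f, H):
    """Return (lhs, rhs) of the inequality described in the lead-in."""
    N = len(f)
    lhs = abs(sum(f)) ** 2
    weighted = sum((1 - abs(h) / H) * correlation(f, h) for h in range(-H + 1, H))
    rhs = (N + H - 1) / H * weighted
    return lhs, rhs.real  # the right-hand side is real up to rounding


if __name__ == "__main__":
    random.seed(0)
    f = [cmath.exp(2j * math.pi * random.random()) for _ in range(200)]
    for H in (1, 10, 100):
        print(H, van_der_corput_sides(f, H))
```

For $H=1$ the inequality reduces to the Cauchy–Schwarz bound $|\sum_n f(n)|^2\leq N\sum_n|f(n)|^2$, and larger $H$ trade this for information about the correlations of $f$ with its shifts.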
3.4. Vinogradov’s Fourier expansion
For a real number , we write for the distance from to the nearest integer(s).
We shall need a Fourier approximation for the indicator function of an interval that goes back to Vinogradov.
Lemma 3.2.
For any real numbers and , there exists a -periodic function with the following properties.
(1) for , for , and for all .
(2) For some , we have the pointwise convergent Fourier representation
(3) For any , we have
Proof.
This follows from [33, Lemma 12] (taking there). ∎
4. Polynomial generalised von Neumann theorems
In this section, we prove generalised von Neumann theorems for the weighted polynomial counting operators
(4.1) |
where and is a weight function (which in applications we take to be a multiplicative function) and are functions supported on (with ). Thus, we bound the expression (4.1) in terms of some Gowers norm (or related norm) of or . It is important for the proofs of our main theorems that the obtained results are quantitative, with polynomial dependencies. The two main results of this section (of independent interest) are Theorems 4.1 and 4.2; they are used for proving Theorems 1.6 and 1.2, respectively.
The first main result of this section states that if have distinct degrees, then the polynomial counting operator (4.1) is bounded in terms of the norm of the weight for some , with polynomial dependencies. It is important to have the norm rather than the norm here, since for the norm we do not currently have a polynomial inverse theorem, meaning that the Siegel–Walfisz assumption from Definition 1.5 would be insufficient if we only had a bound in terms of these norms.
Theorem 4.1 (A polynomial generalised von Neumann theorem with control).
Let and . Let be polynomials with integer coefficients satisfying . Let , and let be functions supported on with for all , and let be a function with . Then, for some , we have
The proof of Theorem 4.1 is given in the next three subsections. In Subsection 4.1, we show that can be assumed to be “locally linear phase functions” in a suitable sense. In Subsection 4.2, we prove the case of the theorem using the circle method; this works as a base case for the proof which is by induction on . Finally, in Subsection 4.3, we use the conclusions of the preceding subsections together with an iterative argument for the function to conclude the proof.
The second main result of this section is that, for any polynomials , the polynomial counting operator (4.1) is always bounded by some norm of the weight , with linear dependence on the Gowers norm. In what follows, for any finite nonempty collection of polynomials with integer coefficients, we define its degree as the largest of the degrees of the polynomials in .
Theorem 4.2 (A polynomial generalised von Neumann theorem with control).
Let and . Let , and let be a function. Let be a finite collection of polynomials with integer coefficients satisfying and for all . For each let be a function supported on with . Then, for some natural number , we have
(4.2) |
The proof of Theorem 4.2 is based on the PET induction scheme and is given in Subsection 4.4. We also present there a multidimensional version of the special case where (Lemma 4.7); this will be needed for the proof of Theorem 1.3.
4.1. Transferring to locally linear functions
The first step in the proof of Theorem 4.1 is to show that if a polynomial average of the form (4.1) is large, then the functions can be assumed to be locally linear phase functions. In what follows, we say that a function is a locally linear phase function of resolution if for some real numbers we have for all and if additionally there is a partition of into discrete intervals of length such that is constant on the cells of that partition. We call the set the spectrum of .
Proposition 4.3 (Reduction to locally linear phases).
Let , , and let be polynomials with integer coefficients and with . Let , and let be functions supported on and with for all . Let be a function with . Then, for some , we have
for some locally linear phase functions of resolution . Moreover, we may assume that the spectra of belong to .
Proposition 4.3 may be compared with, and is motivated by, the work of Peluse and Prendiville [29, Theorem 1.5], where in the case , and , it is proven that can be replaced more strongly with major arc locally linear phase functions. In the more general setup of Proposition 4.3, it is not possible to reduce to major arc locally linear phase functions.
For the proof of Proposition 4.3, we need Peluse’s inverse theorem.
Lemma 4.4 (Peluse’s inverse theorem).
Let and . Let be polynomials with integer coefficients satisfying for all and , and with all the coefficients of the polynomials being bounded by in modulus.
Let and . Let be functions supported on with for all . Then there exists such that for either we have
(4.3) |
Proof.
This will follow from [28, Theorem 3.3] after some reductions. It suffices to show that for each there exists such that (4.3) holds, since we may increase if necessary.
Suppose first that . Using the notation (4.1), let
(4.4) |
Then is equal to the left-hand side of (4.3). We may clearly assume that and that for any given constant . We may further assume that for any given constant , since otherwise by taking the claim follows (with in (4.3)) from the crude triangle inequality bound
Now, applying [28, Theorem 3.3] (with in place of ; in [28, Theorem 3.3], the functions are assumed to be supported on instead of , but this makes no difference in the argument), we see that there exists some such that
which in view of (4.4) implies the claim.
Suppose then that . Then, making the change of variables in (4.1), we see that
Now the claim follows from the case handled above. ∎
Proof of Proposition 4.3.
We begin with a few reductions. Firstly, we may extend to a function on by setting it equal to outside . Secondly, we may assume that for by translating the functions if necessary. Thirdly, we may assume that is large enough in terms of so that
Let
(4.5) |
We may assume that for any given constant depending on , as otherwise the claim readily follows.
We first wish to replace with a locally linear phase function. For , define the first dual function
Then by (4.5) we have
so by the Cauchy–Schwarz inequality we get
(4.6) |
By the definition of , (4.6) expands out as
Denote . Applying Cauchy–Schwarz and van der Corput’s inequality (Lemma 3.1), this implies that
Noting that , from the triangle inequality and the pigeonhole principle we now see that
(4.7) |
for integers .
Applying Lemma 4.4 and the pigeonhole principle, from (4.7) we conclude that there exists a constant and an integer such that
(4.8) |
Using the simple bound
(4.9) |
valid for any bounded sequence and , and setting
from (4.8) we deduce that
Splitting the average into intervals of length and applying the pigeonhole principle, we see that there exists some integer such that
From Cauchy–Schwarz and van der Corput’s inequality, we then see that
(4.10) |
We wish to remove the weight from (4.10). To this end, we note the easily verified Fourier expansion
for , which allows us to write for the expansion
Substituting this to (4.10) and expanding and using and , we see that
where and for all . Hence, by the Gowers–Cauchy–Schwarz inequality (3.2), we find
Applying the pigeonhole principle, we deduce that
(4.11) |
for integers .
Note that for any complex number we have . For , let be the set of for which
(4.12) |
Then, by the pigeonhole principle, the inverse theorem (3.1) and (4.11), we have for some . For any , let be a point where the supremum in (4.12) is attained, and let be an element of nearest to . Recalling the definition of , we have
For , note that by Parseval’s identity there is some for which
Now, extend the definition of from to all of by letting , where is the largest element of that is at most . Then, define the locally linear phase function , which has resolution . We now have
Recalling the definition of , we conclude that
(4.13) |
where .
We proceed to replace also with a locally linear phase function. For , define the second dual function as
Making a change of variables, from (4.13) it follows that
Arguing verbatim as above, we deduce that there exists a locally linear phase function of resolution and with spectrum in such that
where .
Recalling the definition of , this means that
This gives the desired claim. ∎
4.2. A circle method bound
The proof of Theorem 4.1 will proceed by induction on , so we first need to bound the weighted averages (4.1) with . These averages can be controlled simply by using classical Fourier analysis.
Lemma 4.5.
Let and . Let be a polynomial of degree with integer coefficients. Let , and let be functions supported on with for both , and let be a function. Then we have
(4.14) |
Proof.
Let (depending on ) be such that . Then, by the orthogonality of characters, the left-hand side of (4.14) without absolute values equals to
Now the claim follows by bounding the exponential sum involving pointwise by and by applying the Cauchy–Schwarz inequality and Parseval’s identity to the remaining two exponential sums. ∎
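The mechanism of this proof can be tested numerically. The sketch below is an illustration under stated assumptions rather than a restatement of (4.14), whose normalisation is not reproduced here. It works with unnormalised sums and with the Fourier transform on a cyclic group $\mathbb{Z}/M\mathbb{Z}$ that is large enough to avoid wraparound, and it checks that $|\sum_{n}\sum_{x}\theta(n)f_0(x)f_1(x+P(n))|\leq \max_{j}|\sum_n\theta(n)e(jP(n)/M)|\cdot\|f_0\|_{2}\|f_1\|_{2}$. All function names are hypothetical.

```python
import cmath
import math
import random


def counting_sum(theta, f0, f1, P):
    """Return sum over n and x of theta[n] * f0[x] * f1[x + P(n)].

    f0 and f1 are lists supported on range(N); theta maps n to a weight.
    Assumes P(n) >= 0 for every n in theta.
    """
    N = len(f0)
    total = 0j
    for n, weight in theta.items():
        shift = P(n)
        for x in range(N - shift):  # empty when the shift leaves the support
            total += weight * f0[x] * f1[x + shift]
    return total


def fourier_bound(theta, f0, f1, P):
    """Return the maximal exponential sum of theta times the l^2 norms of f0 and f1."""
    N = len(f0)
    M = 2 * N + max(P(n) for n in theta)  # large modulus, so no wraparound occurs
    sup = max(
        abs(sum(w * cmath.exp(2j * math.pi * j * P(n) / M) for n, w in theta.items()))
        for j in range(M)
    )
    norm0 = math.sqrt(sum(abs(z) ** 2 for z in f0))
    norm1 = math.sqrt(sum(abs(z) ** 2 for z in f1))
    return sup * norm0 * norm1


if __name__ == "__main__":
    random.seed(0)
    N = 60
    f0 = [cmath.exp(2j * math.pi * random.random()) for _ in range(N)]
    f1 = [cmath.exp(2j * math.pi * random.random()) for _ in range(N)]
    # Möbius values for n = 1..7, so that n^2 stays below N.
    theta = {1: 1, 2: -1, 3: -1, 4: 0, 5: -1, 6: 1, 7: -1}
    square = lambda n: n * n
    print(abs(counting_sum(theta, f0, f1, square)), fourier_bound(theta, f0, f1, square))
```

The point of the bound is that the weight enters only through its exponential sums along $P$, while the two functions are handled by the Cauchy–Schwarz inequality and Parseval's identity.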
4.3. Proof of Theorem 4.1
We are now ready to prove the claimed estimate for the operator (4.1) in the case of distinct degree polynomials.
Proof of Theorem 4.1.
We use induction on . The base case follows from Lemma 4.5. Suppose that the case has been proven for some , and consider the case .
Step 1: Reduction to locally linear phase functions. Let . We may assume that for any large constants and , as otherwise there is nothing to prove. By Proposition 4.3, there exist and locally linear phase functions of resolution for some , and with the spectra of belonging to , such that
(4.15) |
Step 2: An iteration for the locally linear phase function. We can write for some , with being constant on the intervals for some and all . For any set , write .
Claim. If and are large, the following holds. For any and any finite (possibly empty) set , if
(4.16) |
then either
(4.17) |
or there exists such that for integers and
(4.18) |
To prove this claim, we first apply van der Corput’s inequality (Lemma 3.1) to (4.16) to conclude that there is a set of size such that for we have
From Lemma 4.4 and the pigeonhole principle, we now conclude that for some constant , some integer and some , we have
for integers . Let be the set of such . By (4.9) we also have
(4.19) |
for , where .
From the pigeonhole principle, we see that there exist and with such that for all . Let be the set of for which
Then . Note that, for any , and , we have
Hence, recalling (4.19) and applying the pigeonhole principle, we see that there is some with such that for all we have and
By the geometric sum formula, we conclude that, for all , we have
Recalling that , by the pigeonhole principle we conclude that there is some constant and some such that for integers . We have , since for . Now the claim follows by writing and applying the pigeonhole principle.
Step 3: Concluding the argument. We now apply the claim established above repeatedly, starting with (in which case the assumption (4.16) holds for by (4.15)). After iterations, (4.18) cannot hold (since the number of different values that takes at least times is ), so (4.17) holds with for some constant .
Now that (4.17) holds with , by the pigeonhole principle there exist intervals of length and of length such that, denoting , we have
But since the function is constant on intervals of the form with , this implies
By the induction assumption and the fact that , for some we now obtain
By Vinogradov’s Fourier expansion (Lemma 3.2), we can write
for some real numbers , some complex numbers and some satisfying . Hence, we conclude that
as desired. ∎
4.4. Proof of Theorem 4.2
For the proof of Theorem 4.2, we need the following generalised von Neumann theorem for arithmetic progressions. This is well known (see for example [11, Lemma 2]), although the result is typically presented for functions defined on a cyclic group.
Lemma 4.6 (A generalised von Neumann theorem for arithmetic progressions).
Let , , , and let . Let be polynomials with integer coefficients of degree at most satisfying for all . Let be functions supported on with for . Then we have
Lemma 4.6 could be proven directly without difficulty, but we deduce it as an immediate consequence of the following more general lemma that deals with multidimensional averages. This multidimensional version is needed for proving Theorem 1.3 (but for Theorem 4.2 the one-dimensional case suffices).
Lemma 4.7 (A multidimensional generalised von Neumann theorem for arithmetic progressions).
Let , , , and let . Let be polynomials with integer coefficients of degree at most satisfying for all . Let be functions supported on with for . Then we have
(4.20) |
Proof of Lemma 4.7.
For convenience, we extend the definition of to all of by setting it equal to outside .
We use induction on . In the case , the claim is immediate by making the change of variables and noting that .
Suppose then that the claim holds in the case and consider the case . Let be the expression inside the absolute values in (4.20). By making the change of variables , we have
By the Cauchy–Schwarz inequality, we obtain
In what follows, for a function and , denote . From van der Corput’s inequality (Lemma 3.1), we conclude that
where is such that . By the induction assumption and the fact that , we see that
(4.21) |
But by Hölder’s inequality and the definition of the norm, the right-hand side of (4.21) is
This completes the induction. ∎
Proof of Theorem 4.2.
For convenience, we extend the definition of to all of by setting it equal to outside .
We apply the PET induction scheme of Bergelson and Leibman [1]. For any finite collection of polynomials, define its type as , where is the number of different leading coefficients among the polynomials in the subcollection . We introduce the lexicographic order on the types of polynomials. In other words, we write if there exists such that the first coordinates of the vectors are equal and the coordinate of order is larger for the second vector. In this way, we have introduced an order on the set of all finite collections of polynomials based on the order of their type. Note that the length of any descending chain of collections of polynomials with maximal element is bounded as a function of and .
We shall prove Theorem 4.2 by induction on the type of . The base case is that of collections of type with , that is, collections where all the polynomials have degree at most . This case follows readily from Lemma 4.6.
Suppose that is a finite collection of polynomials for which Theorem 4.2 fails and that there is no smaller collection with this property. Then . Let us write , where is one of the polynomials of with the least positive degree.
Let be the left-hand side of (4.2). By the Cauchy–Schwarz inequality, van der Corput’s inequality (3.4) and a change of variables, we have
(4.22) |
where are some -bounded functions and is the collection (of size and degree ) given by
We claim that for every the type of is less than the type of . Let . Let be the type of and let be the type of . We have , and if , then for . Moreover, , since if are the distinct leading coefficients of the degree polynomials in , the leading coefficients of degree polynomials in are . Hence we have .
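The type comparison can be illustrated with a short sketch. It rests on two assumptions, because the displayed definitions are not reproduced here: the type of a collection is taken to be the vector $(t_d,\ldots,t_1)$, listed from the highest degree down, where $t_i$ is the number of distinct leading coefficients among the polynomials of degree $i$; and the differenced collection is taken to be the usual one from PET induction, consisting of $P(n)-Q(n)$ and $P(n+h)-Q(n)$ for $P$ in the collection, where $Q$ has least positive degree. Polynomials are lists of integer coefficients with `poly[i]` the coefficient of $n^i$, and all function names are hypothetical.

```python
from math import comb


def degree(poly):
    """Degree of a coefficient list; the zero polynomial gets degree -1."""
    for i in range(len(poly) - 1, -1, -1):
        if poly[i] != 0:
            return i
    return -1


def shift(poly, h):
    """Coefficients of P(n + h), expanded with the binomial theorem."""
    out = [0] * len(poly)
    for i, c in enumerate(poly):
        for j in range(i + 1):
            out[j] += c * comb(i, j) * h ** (i - j)
    return out


def subtract(p, q):
    """Coefficients of P - Q."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return [a - b for a, b in zip(p, q)]


def pet_type(collection, d):
    """Return (t_d, ..., t_1); assumes every polynomial has degree at most d."""
    leading = {i: set() for i in range(1, d + 1)}
    for poly in collection:
        deg = degree(poly)
        if deg >= 1:
            leading[deg].add(poly[deg])
    return tuple(len(leading[i]) for i in range(d, 0, -1))


def van_der_corput_step(collection, h):
    """Differenced collection {P - Q, P(. + h) - Q}, with Q of least positive degree."""
    q = min((p for p in collection if degree(p) >= 1), key=degree)
    new = []
    for p in collection:
        new.append(subtract(p, q))
        new.append(subtract(shift(p, h), q))
    return new


if __name__ == "__main__":
    collection = [[0, 1], [0, 0, 1]]  # the polynomials n and n^2
    print(pet_type(collection, 2), pet_type(van_der_corput_step(collection, 3), 2))
```

Python compares tuples lexicographically, so with the coordinates listed from the highest degree down, the built-in `<` on the returned tuples matches the order on types under the assumptions above.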
Since by assumption , basic linear algebra gives that the coefficients of are in modulus. Then, by the mean value theorem, for any we have
so for all we have for some . Now, by the induction assumption (and the fact that the functions may be assumed to be supported on ), we conclude that for some natural number we have
Applying Hölder’s inequality as in the proof of Lemma 4.6, this implies that
This completes the induction. ∎
5. Quantitative uniformity of multiplicative functions
In this section, we give quantitative bounds for the uniformity norms of the Möbius function and other multiplicative functions that we need for the proofs of the main theorems.
5.1. The Möbius function
The only arithmetic property we need of the Möbius function is encapsulated in the following recent result of Leng [25], building on [26] and improving on [32].
Lemma 5.1 (Quantitative -uniformity of ).
Let , and . Then we have
Proof.
Remark 5.2.
Remark 5.3.
Lemma 5.1 continues to hold if the Möbius function is replaced with the Liouville function . This follows easily from the identities
which can be truncated to for any at the cost of an error term that is bounded by in norm.
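As a numerical sanity check of how $\mu$ and $\lambda$ determine each other, the following Python sketch verifies the classical identity $\lambda(n)=\sum_{d^2\mid n}\mu(n/d^2)$ for small $n$. The choice of this particular identity, the helper names and the tested range are assumptions made for illustration; the identities used in the remark may be stated in a different form.

```python
def factorize(n):
    """Prime factorisation of n >= 1 as {prime: exponent}, by trial division."""
    f = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            f[p] = f.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def mobius(n):
    """Moebius function: 0 if n is not squarefree, else (-1)^(number of primes)."""
    f = factorize(n)
    if any(e > 1 for e in f.values()):
        return 0
    return -1 if len(f) % 2 else 1

def liouville(n):
    """Liouville function: (-1)^(number of prime factors with multiplicity)."""
    return -1 if sum(factorize(n).values()) % 2 else 1

def liouville_via_mobius(n):
    """Right-hand side: sum of mobius(n / d^2) over all d with d^2 dividing n."""
    total = 0
    d = 1
    while d * d <= n:
        if n % (d * d) == 0:
            total += mobius(n // (d * d))
        d += 1
    return total

# The two sides agree for every n in the tested range.
assert all(liouville(n) == liouville_via_mobius(n) for n in range(1, 2001))
```

The sum over $d$ is what gets truncated in the remark: large $d$ contribute only on the sparse set of integers divisible by $d^2$.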
5.2. Multiplicative functions satisfying the Siegel–Walfisz condition
Our goal in this subsection is to show that any multiplicative function satisfying the Siegel–Walfisz assumption (Definition 1.5) is close to a function whose norm decays faster than any power of the logarithm.
Proposition 5.4.
Let and . Let be a multiplicative function satisfying the Siegel–Walfisz property. Then we have a decomposition with satisfying and such that for we have
and for we have
Throughout this section, let
for and , and define the function
(5.1) |
In other words, is the restriction of to those integers having at least one prime factor from . The following lemma shows that the function is close to in norm.
Lemma 5.5.
Let , , and let satisfy for all . Then for we have
(5.2) |
Proof.
The next lemma shows that the condition on the prime factors in the definition of can be replaced with a sieve weight up to a small error.
Lemma 5.6.
Let and . There exist real numbers such that for any we have
(5.3) |
Proof.
By splitting intervals into shorter ones if necessary, we may assume that is large enough in terms of . Let be the upper bound linear sieve coefficients of level and sifting range , as defined in [13, Section 12]. From the definition we have . Let . Then the left-hand side of (5.3) is
Using Shiu’s bound [31, Theorem 1], we see that
by the assumption that is large in terms of .
Since
for , we have
By the fundamental lemma of sieve theory ([21, Lemma 6.8]) and Mertens’s theorem, this is
which suffices. ∎
We are now ready to show that the Siegel–Walfisz property for implies the same property for .
Lemma 5.7.
Let be a multiplicative function satisfying the Siegel–Walfisz property. Then satisfies the Siegel–Walfisz property.
Proof.
Since , it suffices to show that the function
satisfies the Siegel–Walfisz property. By splitting a long interval into shorter ones, it suffices to show that for any large and any we have
By Lemma 5.6, it suffices to show that for any we have
Exchanging the order of summation and applying the triangle inequality, it suffices to show that
Let be the set of with . Then by Shiu’s bound we have
say. In view of this, it suffices to show that for all we have
By writing the sum over as a difference of two sums, it suffices to show that for we have
(5.4) |
We can uniquely factorise , where and . By multiplicativity of , we then reduce to showing that
(5.5) |
Let denote the part of the left-hand side of (5.5) with , and let denote the part with . By Shiu’s bound, we can crudely estimate
(5.6) |
where for the last line we used the simple inequality
Since for , we see that .
The remaining task is to show that . For this, it suffices to show that for any integer we have
(5.7) |
From Shiu’s bound and the Siegel–Walfisz assumption on , for any integer and any we have
(5.8) |
Substituting the Möbius inversion formula
and (5.8) into (5.7), and estimating , we reduce to showing that
Estimating crudely using for , and recalling that is large, it suffices to show that
(5.9) |
say. Since , the left-hand side is
which suffices. ∎
For the proof of Proposition 5.4 we also need a bilinear estimate for polynomial phases.
Lemma 5.8.
Let , , and let , be complex sequences with for , and with supported on . Let be a polynomial with real coefficients, and suppose that
(5.10) |
Then there exists an integer such that
for all integers .
Proof.
We may assume that is large and that is large in terms of . Write and , where
Then, since is large in terms of , we have
(5.11) |
and similarly with in place of . Now, applying the decompositions , , (5.11) and Cauchy–Schwarz, the assumption (5.10) yields
where , . Since are -bounded, the result follows, for example, from [27, Proposition 2.2] (which is an exponential sum estimate over short intervals; weaker results would also suffice). ∎
Proof of Proposition 5.4.
Recall the definition of from (5.1). We take , . Then we immediately have . In view of Lemma 5.5, it suffices to show that .
Let be such that for all . By splitting into short intervals, it suffices to show that for any and any polynomial we have
(5.12) |
Let
Write for brevity. Then for we have
Hence, by Shiu’s bound, for any with we can estimate
(5.13) |
Hence, it suffices to prove (5.12) with in place of .
For an interval, let denote the number of prime factors from without multiplicities. Then we immediately have the Ramaré identity
By multiplicativity, and unless , so
We trivially have
(5.14) |
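The Ramaré identity invoked above can be checked numerically in its standard form: for an integer $n$ with at least one prime factor in an interval $I$, the sum over primes $p\in I$ with $pm=n$ of $1/(\omega_I(m)+1_{p\nmid m})$ equals $1$, and the sum is empty otherwise. The Python sketch below is a minimal illustration; the half-open interval $(\mathrm{lo},\mathrm{hi}]$, the function names and the test range are hypothetical choices, not the parameters of the proof.

```python
from fractions import Fraction

def prime_factors(n):
    """Set of distinct prime factors of n >= 1, by trial division."""
    ps = set()
    p = 2
    while p * p <= n:
        if n % p == 0:
            ps.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.add(n)
    return ps

def omega_in(n, lo, hi):
    """Number of distinct prime factors p of n with lo < p <= hi."""
    return sum(1 for p in prime_factors(n) if lo < p <= hi)

def ramare_sum(n, lo, hi):
    """Sum over primes p in (lo, hi] with p*m = n of
    1 / (omega_in(m) + [p does not divide m]), in exact arithmetic."""
    total = Fraction(0)
    for p in prime_factors(n):
        if lo < p <= hi:
            m = n // p
            total += Fraction(1, omega_in(m, lo, hi) + (1 if m % p else 0))
    return total

# The sum is the indicator of "n has a prime factor in (lo, hi]".
lo, hi = 5, 50
for n in range(1, 2001):
    expected = 1 if omega_in(n, lo, hi) > 0 else 0
    assert ramare_sum(n, lo, hi) == expected
```

The check works because, for $n=pm$ with $p\in I$, the denominator $\omega_I(m)+1_{p\nmid m}$ always equals $\omega_I(n)$, so each of the $\omega_I(n)$ admissible primes contributes $1/\omega_I(n)$.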
6. Lemmas for the main proofs
6.1. A lacunary subsequence trick
We use a lacunary subsequence trick in the proofs of our pointwise convergence results. Such a trick roughly states that if are some measurable functions and we have strong quantitative decay for for some lacunary sequence , then provided that does not vary too much on intervals of the form in norm, the sequence must converge to in norm. Variants of this idea are frequently used to establish convergence of ergodic averages; see for example [12, Section 5].
Lemma 6.1.
Let . Let be a probability space, and for let be a measurable function. If for any we have
(6.1) |
and for and for -almost all we have
(6.2) |
then for -almost all we have .
6.2. A simple bound
We also need a simple estimate for ergodic averages that follows from Hölder’s inequality.
Lemma 6.2.
Let . Let be a probability space, and let be invertible measure-preserving maps. Let be functions. Also let satisfy . Then, for any , and , we have
where satisfies .
Proof.
By Hölder’s inequality and the -invariance of , we have
as claimed. ∎
7. Proofs of the pointwise ergodic theorems
All of our main theorems will be proven in the more general setting of weighted polynomial ergodic averages
(7.1) |
where is a function satisfying suitable uniformity norm estimates. We note that the following result does not require to be multiplicative, and that its hypotheses are also satisfied, for example, when is a suitably normalised version of the von Mangoldt function (by [25, Theorem 6]).
Theorem 7.1 (Pointwise convergence of polynomial ergodic averages with nice weight).
Let . Let satisfy
(7.2) |
for all . Let be polynomials with integer coefficients satisfying . Suppose that one of the following holds:
(i) We have
for any , and .
(ii) We have
for any and , and the polynomials have pairwise distinct degrees.
Let be a measure-preserving system, and let satisfy . Then, for any and , we have
for almost all .
Let us first see how this theorem implies our main theorems.
Proof of Theorems 1.2 and 1.6 assuming Theorem 7.1.
For proving Theorem 1.6, we first use Proposition 5.4 to obtain a decomposition with , and . Applying Theorem 7.1, we reduce to showing that
for almost all .
Since , we can find some and such that . Then by Hölder’s inequality we have
In view of the bound , it suffices to show for small enough that for all we have
(7.3) |
We will reduce Theorem 7.1 to the following quantitative estimate.
Proposition 7.2.
Proof that Proposition 7.2 implies Theorem 7.1.
We are going to apply Lemma 6.1. Recall the notation (7.1). Let satisfy . Applying first the triangle inequality, then Lemma 6.2 and finally (7.2) and Shiu’s bound, we see that for any , and we have
If is large enough, the exponent of the logarithm above is at most , say. Hence, by Lemma 6.1, the conclusion of Theorem 7.1 follows from (7.4). ∎
Proof of Proposition 7.2.
We first reduce the proof of Proposition 7.2 to the case .
Reduction to the case of bounded functions. We claim that it suffices to prove Proposition 7.2 in the case . Suppose that this case has been proven. Let and for (we can assume that all the are , since we have for any and any measurable ). Also let , and let be large enough in terms of .
For and , we split
Then by linearity we see that
(7.5) |
Since for , by the case of Proposition 7.2 we have
To bound the error term in (7.5), it suffices to show that for we have
for some , since can be taken to be large enough in terms of so that .
For , there is some such that ; for the sake of notation, assume that . Fix some and such that . Using the triangle inequality, Lemma 6.2 and Shiu’s bound combined with the assumption , we obtain
Now the proof of Proposition 7.2 has been reduced to the case .
The case of bounded functions. It now remains to show (7.4) in the case . In the rest of the proof, we abbreviate .
By Cauchy–Schwarz, it suffices to show that
The assumption gives
so it suffices to show that for any we have
(7.6) |
By the -invariance of , the claim (7.6) is equivalent to
(7.7) |
By the definition of the norm, there exists a set such that
Restricting the integral in (7.7) to , it suffices to show that for all we have the bound
(7.8) |
Write where , . Since in the support of , we have
(7.9) |
if is large enough in terms of .
Now, if assumption (i) of the theorem holds, by the triangle inequality and (7.9) we have
(7.10) |
If instead assumption (ii) holds, we similarly have
(7.11) |
In either case, since we may assume that are supported on for some and since , we can use either Theorem 4.2 or 4.1 (depending on whether we have (7.11) or (7.10)) to conclude that (7.8) holds with in place of . Hence it suffices to show that (7.8) holds with in place of . For this it suffices to show that
Here the left-hand side is
if is large enough in terms of . This completes the proof. ∎
Lastly, we prove Theorem 1.3.
References
- [1] V. Bergelson and A. Leibman. Polynomial extensions of van der Waerden’s and Szemerédi’s theorems. J. Amer. Math. Soc., 9(3):725–753, 1996.
- [2] V. Bergelson and A. Leibman. A nilpotent Roth theorem. Invent. Math., 147(2):429–470, 2002.
- [3] J. Bourgain. On the pointwise ergodic theorem on $L^p$ for arithmetic sets. Israel J. Math., 61(1):73–84, 1988.
- [4] J. Bourgain. Double recurrence and almost sure convergence. J. Reine Angew. Math., 404:140–161, 1990.
- [5] Q. Chu. Convergence of weighted polynomial multiple ergodic averages. Proc. Amer. Math. Soc., 137(4):1363–1369, 2009.
- [6] Y. Do, R. Oberlin, and E. A. Palsson. Variation-norm and fluctuation estimates for ergodic bilinear averages. Indiana Univ. Math. J., 66(1):55–99, 2017.
- [7] T. Eisner. A polynomial version of Sarnak’s conjecture. C. R. Math. Acad. Sci. Paris, 353(7):569–572, 2015.
- [8] E. H. El Abdalaoui, J. Kułaga-Przymus, M. Lemańczyk, and T. de la Rue. The Chowla and the Sarnak conjectures from ergodic theory point of view. Discrete Contin. Dyn. Syst., 37(6):2899–2944, 2017.
- [9] N. Frantzikinakis. Some open problems on multiple ergodic averages. Bull. Hellenic Math. Soc., 60:41–90, 2016.
- [10] N. Frantzikinakis and B. Host. Multiple ergodic theorems for arithmetic sets. Trans. Amer. Math. Soc., 369(10):7085–7105, 2017.
- [11] N. Frantzikinakis, B. Host, and B. Kra. Multiple recurrence and convergence for sequences related to the prime numbers. J. Reine Angew. Math., 611:131–144, 2007.
- [12] N. Frantzikinakis, E. Lesigne, and M. Wierdl. Random sequences and pointwise convergence of multiple ergodic averages. Indiana Univ. Math. J., 61(2):585–617, 2012.
- [13] J. Friedlander and H. Iwaniec. Opera de cribro, volume 57 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2010.
- [14] W. T. Gowers. A new proof of Szemerédi’s theorem. Geom. Funct. Anal., 11(3):465–588, 2001.
- [15] A. Granville and X. Shao. When does the Bombieri-Vinogradov theorem hold for a given multiplicative function? Forum Math. Sigma, 6:Paper No. e15, 23, 2018.
- [16] B. Green and T. Tao. Linear equations in primes. Ann. of Math. (2), 171(3):1753–1850, 2010.
- [17] B. Green and T. Tao. The Möbius function is strongly orthogonal to nilsequences. Ann. of Math. (2), 175(2):541–566, 2012.
- [18] B. Green, T. Tao, and T. Ziegler. An inverse theorem for the Gowers $U^{s+1}[N]$-norm. Ann. of Math. (2), 176(2):1231–1372, 2012.
- [19] R. Han, V. Kovač, M. T. Lacey, J. Madrid, and F. Yang. Improving estimates for discrete polynomial averages. J. Fourier Anal. Appl., 26(3):Paper No. 42, 11, 2020.
- [20] B. Host and B. Kra. Convergence of polynomial ergodic averages. Israel J. Math., 149:1–19, 2005. Probability in mathematics.
- [21] H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.
- [22] B. Krause, M. Mirek, and T. Tao. Pointwise ergodic theorems for non-conventional bilinear polynomial averages. Ann. of Math. (2), 195(3):997–1109, 2022.
- [23] M. T. Lacey. The bilinear maximal functions map into $L^p$ for $2/3 < p \leq 1$. Ann. of Math. (2), 151(1):35–57, 2000.
- [24] A. Leibman. Convergence of multiple ergodic averages along polynomials of several variables. Israel J. Math., 146:303–315, 2005.
- [25] J. Leng. Efficient equidistribution of nilsequences. arXiv e-prints, arXiv:2312.10772, December 2023.
- [26] J. Leng, A. Sah, and M. Sawhney. Quasipolynomial bounds on the inverse theorem for the Gowers $U^{s+1}[N]$-norm. arXiv e-prints, arXiv:2402.17994, February 2024.
- [27] K. Matomäki and X. Shao. Discorrelation between primes in short intervals and polynomial phases. Int. Math. Res. Not. IMRN, (16):12330–12355, 2021.
- [28] S. Peluse. Bounds for sets with no polynomial progressions. Forum Math. Pi, 8:e16, 55, 2020.
- [29] S. Peluse and S. Prendiville. A polylogarithmic bound in the nonlinear Roth theorem. Int. Math. Res. Not. IMRN, (8):5658–5684, 2022.
- [30] S. Prendiville. Quantitative bounds in the polynomial Szemerédi theorem: the homogeneous case. Discrete Anal., Paper No. 5, 34 pp., 2017.
- [31] P. Shiu. A Brun-Titchmarsh theorem for multiplicative functions. J. Reine Angew. Math., 313:161–170, 1980.
- [32] T. Tao and J. Teräväinen. Quantitative bounds for Gowers uniformity of the Möbius and von Mangoldt functions. To appear in J. Eur. Math. Soc.
- [33] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Dover Publications, Inc., Mineola, NY, 2004. Translated from the Russian, revised and annotated by K. F. Roth and Anne Davenport, Reprint of the 1954 translation.
- [34] I. Vinogradow. Simplest trigonometrical sums with primes. C. R. (Doklady) Acad. Sci. URSS (N.S.), 23:615–617, 1939.
- [35] M. N. Walsh. Norm convergence of nilpotent ergodic averages. Ann. of Math. (2), 175(3):1667–1688, 2012.