Subsets of without L-shaped configurations
Abstract.
Fix a prime . We show that there exists a positive integer such that any subset of containing no nontrivial configurations of the form must have density , where denotes the -fold iterated logarithm. This gives the first reasonable bound in the multidimensional Szemerédi theorem for a two-dimensional four-point configuration in any setting.
1. Introduction
Szemerédi’s famous theorem on arithmetic progressions, which states that any subset of the integers with positive upper density contains arbitrarily long arithmetic progressions, has the following multidimensional generalization due to Furstenberg and Katznelson [4]:
Theorem 1.1.
Let be a finite, nonempty subset of . If contains no nontrivial homothetic copy of , then .
Here we use the standard notation . There has been great interest over the past few decades in proving a quantitative version of this theorem with reasonable bounds, i.e., with an upper bound for whose savings over the trivial bound of grows at least as quickly as a finite number of iterated logarithms. Indeed, Gowers has posed the problem of proving such a result on several occasions [5, 7, 8], and others, such as Graham [11], have asked for bounds for sets lacking particular multidimensional configurations. While reasonable bounds are known in Szemerédi’s theorem due to work of Gowers [6, 8], none are known in the Furstenberg–Katznelson theorem in general. Furstenberg and Katznelson’s original proof, which was via ergodic theory, produces no explicit bounds, while the hypergraph regularity proofs of Nagle, Rödl, Schacht, and Skokan [16, 18], Gowers [9], and Tao [22] each give a saving over the trivial bound of inverse Ackermann type.
Reasonable bounds in Theorem 1.1 are currently known for only one genuinely multidimensional configuration: two-dimensional corners
(1.1) |
(and, thus, their linear images,) due to work of Shkredov [19, 20], who proved that any subset of containing no nontrivial corners has size at most for some absolute constant . No reasonable bounds are known for any two-or-more-dimensional four-point configuration, such as three-dimensional corners,
(1.2) |
or axis-aligned squares,
(1.3) |
The latter of these two configurations is the topic of a conjecture of Graham [11], which states that any subset for which diverges must contain an axis-aligned square. Graham also conjectured, more generally, that if diverges, then must contain a homothetic copy of for every positive integer . This is a two-dimensional generalization of the famous and still unresolved conjecture of Erdős that every subset for which diverges must contain arbitrarily long arithmetic progressions.
Proving reasonable bounds for sets lacking the four-point configurations (1.2) and (1.3) seems to be out of reach. This is because no one has managed yet to prove anything useful about a certain two-dimensional directional uniformity norm that naturally appears in the study of these configurations. Details on this difficulty can be found in the work of Austin [1, 2], where he demonstrates how enormously complicated and difficult even 100% and 99% inverse theorems can be for directional uniformity norms.
The purpose of this paper is to identify the first two-dimensional four-point configuration for which reasonable bounds in the multidimensional Szemerédi theorem can be proven, and to prove such bounds in the finite field model setting. We will study the configuration
(1.4) |
which, when plotted on a two-dimensional integer grid, takes the shape of the capital letter “L”. Because of this, we refer to (1.4) as an L-shaped configuration, and an L-shaped configuration with as a nontrivial L-shaped configuration.
Theorem 1.2.
There exists a natural number and a constant such that the following holds. Fix a prime , and set . If , then all containing no nontrivial L-shaped configurations satisfy
The obtained in the theorem is huge, so we do not attempt to compute it. The bulk of the size of comes from our use of a recent quantitative inverse theorem for the -norm on due to Gowers and Milićević, who in [10] give a rough upper bound for the number of iterated exponentials appearing in their result. Based on this, is likely at least 24 trillion. The use of this inverse theorem is necessary in our proof, and no amount of care to argue efficiently in the rest of the argument can reduce by much. So, we have not tried to optimize the proof of Theorem 1.2, choosing instead to present the simplest argument that gives a reasonable upper bound.
It is likely that the proof of Theorem 1.2 can be adapted to the integer setting to prove a reasonable bound for subsets of lacking L-shaped configurations, with the bound obtained being far more reasonable than the bound in Theorem 1.2. This is because the quantitative aspects of Manners’s [15] inverse theorem for the -norm on cyclic groups are better than those of Gowers and Milićević’s inverse theorem when . It is also likely that Theorem 1.2 can be extended to more general L-shaped configurations with a longer vertical “leg”,
in both the finite field model and integer settings. We expect, however, that understanding L-shaped configurations with two longer “legs”,
is significantly more difficult, for some of the same reasons that proving reasonable bounds for sets lacking three-dimensional corners or axis-aligned squares seems out of reach.
While progress in proving a quantitative version of the multidimensional Szemerédi theorem has so far been extremely limited, there has been a bit more success in proving reasonable bounds for sets lacking multidimensional configurations with more degrees of freedom than those in Theorem 1.1. Prendiville [17] has proven reasonable bounds for subsets of lacking any sufficiently nondegenerate three- or four-term matrix progression, and one consequence of his work is that any subset of containing no four vertices of any square (not necessarily axis-aligned) has size at most for some absolute constant .
The remainder of this paper is organized as follows. In Section 2, we give a detailed outline of our proof of Theorem 1.2, including statements of the three main components of the density-increment argument: control of the count of L-shaped configurations by directional uniformity norms, obtaining a density-increment on a structured set, and pseudorandomizing the structured set previously obtained. After introducing additional technical preliminaries in Sections 3 and 4, we prove these three main components in Sections 5, 6, and 7, respectively. We then carry out the density increment argument in Section 8, proving Theorem 1.2.
Acknowledgments
The author thanks Ben Green and Freddie Manners for helpful discussions, and Ben Green, Noah Kravitz, Terry Tao, and the anonymous referees for useful comments on earlier drafts.
The author is supported by the NSF Mathematical Sciences Postdoctoral Research Fellowship Program under Grant No. DMS-1903038 and the Oswald Veblen fund, and also gratefully acknowledges the support and hospitality of the Hausdorff Institute for Mathematics, where the bulk of this paper was written.
2. Outline of the proof of Theorem 1.2
We begin this section by introducing the minimum amount of notation and preliminaries needed to understand our proof outline. We will use the standard asymptotic notation and , along with Vinogradov’s notation and . For any two quantities and , the relations , , , and all mean that for some absolute constant . We will write to represent a quantity that is and to represent a quantity that is . When any of these asymptotic symbols appears with a subscript, the implied constant is allowed to depend on the parameters in the subscript. Since we fix a prime in Theorem 1.2, the implied constants appearing throughout the paper will sometimes depend on even though we will not alert the reader to this with a subscript. We will use to denote the -fold iterated logarithm, so that and for all , as well as to denote the -fold iterated exponential, so that and for all .
We will frequently denote the indicator function of a set by the letter itself, so that
For any pair of finite sets with , we denote the density of in by
For any function , we denote the average of over by
When , we will usually drop “” and just write for . Whenever satisfies for all in its domain, we say that it is -bounded. Note that the indicator function of any set is -bounded.
For any and , we define the Fourier coefficient of at using the normalization
where and denotes the usual dot product in . With this choice of normalization, the Fourier inversion formula and Parseval’s identity read
and
respectively.
Let be any abelian group and . For any , we define the function by
and, for any , define the -fold iterated differencing operator by
Note that for any permutation of .
Now we can recall the definition of the Gowers uniformity norms.
Definition 2.1.
Let , be an abelian group, and . The -norm of is defined by
The basic properties of these norms can be found in [23]. One such fact needed in the upcoming outline is the inverse theorem for the -norm, which is a simple consequence of Fourier inversion and Parseval’s identity.
Lemma 2.2.
Let be an abelian group and be -bounded. If , then there exists a such that
We will also sometimes need the notion of the -norm on an affine subspace of , which is defined by . The corresponding inverse theorem for these norms follows from Lemma 2.2.
2.1. A review of Shkredov’s argument in the finite field model setting
Before we outline the proof of Theorem 1.2, it will be instructive to review Shkredov’s argument for corners (1.1) in the finite field model setting. A detailed account of the argument can be found in the expositions of Green [12, 13].
Shkredov’s proof proceeds via a density-increment argument. As in all analytic approaches to Szemerédi’s theorem and its generalizations, we begin by defining a multilinear average over the configuration of interest. For , set
Then, for any , the quantity equals the normalized count,
of the number of corners in . Setting , we let denote the density of in and denote the balanced function of . It follows from the trilinearity of that
Since by the Cauchy–Schwarz inequality, if the normalized count of corners in is far below the expected for a random set of density , which is the case when has no nontrivial corners and is sufficiently large in terms of , then must be large.
It can then be shown, by an appropriate sequence of applications of the Cauchy–Schwarz inequality, that must have large box norm
If is large, it follows by an averaging argument that has density at least on a product set for some large .
One may then hope to continue the density-increment argument by proving the following generalization of the result just sketched: if is a subset of density of a product set and contains no nontrivial corners, then has density at least on a product set contained in .
It turns out, however, that the Cauchy–Schwarz argument mentioned previously yields a lower bound on the box norm of large enough size only when and are sufficiently Fourier pseudorandom, meaning that their balanced functions and both have small -norm. The components of the product set just obtained are essentially arbitrary aside from being large. They are, in particular, not guaranteed to be Fourier pseudorandom.
To overcome this difficulty, Shkredov introduced a pseudorandomizing step into his proof. He used an energy increment argument incorporating the -inverse theorem to partition into products of large affine subspaces of the form
(2.1) |
for most of which the sets and are Fourier pseudorandom in . By an averaging argument, there must exist such a product of affine subspaces (2.1) on which the restrictions of and are both sufficiently dense and Fourier pseudorandom, and such that still has increased density on the intersection of with .
By passing to this product of cosets and using that corners are preserved by translation and invertible linear transformations of the form , one can then continue the density-increment argument with replaced by , where . If contains no nontrivial corners and and are sufficiently Fourier pseudorandom, then must have large box norm localized to . One must then prove that has a further density-increment on a product set contained in , which is, fortunately, of exactly the same difficulty whether or some other large product set. By applying the pseudorandomizing procedure to the factors of the product set just produced, one can then deduce that if is a subset of density of a product set , where and are large and sufficiently Fourier pseudorandom, and contains no nontrivial corners, then has density at least on a product set contained in , where and are also large and sufficiently Fourier pseudorandom. The density increment iteration can be carried out repeatedly to produce a good bound for subsets of lacking corners.
2.2. An outline of our argument
The obstructions to uniformity for L-shaped configurations are not just (skew) product sets, as was the case for corners, but also very general sets of the form
where each is an affine subspace of . For example, assume that , and consider the set
This set has density
in , but
L-shaped configurations, in contrast to the expected in a random subset of of density . Similarly, the number of L-shaped configurations in the sets
and
where and are now chosen uniformly at random, is also with high probability, while the sets have density with high probability. These new sorts of obstructions to uniformity are the main reason why the study of L-shaped configurations is significantly more difficult than that of corners, and must be taken into account to prove Theorem 1.2.
For any functions , we define
(2.2) |
so that equals the normalized count of L-shaped configurations in any subset of . The multilinearity of implies that
(2.3) |
where, as before, is the balanced function of . Thus, if the normalized count of L-shaped configurations in is far from the random normalized count , one of , , or must be large. In particular, when contains no nontrivial L-shaped configurations and is sufficiently large in terms of , one of these quantities will be larger than . It then follows from several applications of the Cauchy–Schwarz inequality that one of the following directional uniformity norms of must be larger than :
(2.4) |
(2.5) |
or
(2.6) |
Here is only a semi-norm, while and are genuine norms. Since these are all Gowers box norms, one can find a proof that they are (semi-)norms in Appendix B of [14]. The norm had previously been studied, in the setting of cyclic groups, in work of Shkredov [21].
Directional uniformity norms with two differencing parameters,
for fixed nonzero , are well-understood. Either and are scalar multiples of each other, in which case the norm is just the -norm on averaged over cosets of , or they are linearly independent, as in the definition of , in which case the norm is, after a change of variables, equivalent to the two-dimensional box norm. Directional uniformity norms with three differencing parameters,
for fixed nonzero analogously fall into one of three cases: either and are collinear, lie on exactly two lines, or are in general position. In the first case, the norm is just the -norm on averaged over cosets of . In the third case, the norm is linearly equivalent to the intractable norm that arises in the study of -dimensional corners and axis-aligned squares. The norm we encounter falls into the second case, and the study and fruitful use of this norm turns out to be possible (though still complicated) due to its structure as a “-norm”.
The upshot is that if contains no nontrivial L-shaped configurations, then it must have density at least on a set of the form
(2.7) |
where is of the form
(2.8) |
for some element and collection of subspaces of , where are large and is small for each . Note that this set is not quite as general as the one appearing at the very beginning of this subsection, as the element of does not vary with . It takes some extra work to show that we can guarantee to be of this special form, which turns out to be necessary for our density-increment iteration. We say more about this point in Section 6.
We would like to continue the density increment iteration and show that , which also lacks L-shaped configurations, has a further density increment of at least the same size as the first on a subset of of the same general form (2.7). Analogously to Shkredov’s argument for corners, we can only hope to do this if and are sufficiently pseudorandom, for some appropriate notions of pseudorandomness. We will need to control the count of L-shaped configurations by the norms , , and defined in (2.4), (2.5), and (2.6) with no loss of density factors, i.e., show that
(2.9) |
(2.10) |
and
(2.11) |
and also obtain a density-increment with no loss of density factors when some localized norm of the balanced function of a set is large, i.e., show that if
where now , then there exists a subset of the same general form,
as on which has a density increment
depending only on . Such results are needed so that the density increment obtained at each step of the iteration is independent of the step. If one is not sufficiently careful, it is easy to end up with a density increment that gets smaller as the subset of gets sparser, which is not enough to close the density increment iteration.
To carry out these arguments, we will need and to be pseudorandom with respect to the -norm. The situation for is more complicated, and deciding on a good measure of pseudorandomness for that is amenable to a Shkredov-like pseudorandomization procedure and can also be used to analyze the various averages appearing throughout our argument is one of the challenges of the proof of Theorem 1.2. A suitable condition on turns out to be that it is pseudorandom with respect to the -norm. This condition is not, on its own, immediately useful in the arguments of Sections 5 and 6, since the various averages that appear are not controllable by the -norm of . It takes a bit of work to show that it implies a roughly equivalent statement about the typical codimensions of certain affine subspaces obtained from . We prove this in Section 4, deriving some new results on the combinatorics of approximate polynomials along the way.
The proof of the implications (2.9), (2.10), and (2.11) when and are sufficiently pseudorandom consists of many careful applications of the Cauchy–Schwarz inequality, along with appeals to standard facts about the number of linear configurations of bounded Cauchy–Schwarz complexity in products of pseudorandom sets intersected with subspaces of bounded codimension. We carry out this argument in Section 5.
Obtaining a large enough density-increment when is large for requires some new ideas and a significant amount of extra work beyond the proof of the non-localized case, in contrast to the situation for the box norm localized to product sets, where the argument is the same as the non-localized case. In order to get such a density-increment that only depends on and not on the densities of or , one of the key ingredients is a density-preserving inverse theorem for the -norms on pseudorandom sets derived from and , which we prove using a version of the transference principle. We carry out this argument in Section 6.
As was the case for corners, the sets and obtained in the previous paragraph are not guaranteed to be pseudorandom. We must also carry out a pseudorandomizing procedure to locate a product of large affine subspaces of the form on which and are sufficiently pseudorandom and still has a large density increment on . Our pseudorandomization procedure is similar to Shkredov’s, but with some new complications coming from our desire for and and to be pseudorandom with respect to the - and -norms, respectively, and from ’s particular structure as a union of affine subspaces in the second factor of . To handle the first complication, we use a recent quantitative inverse theorem of Gowers and Milićević [10] for the -norms on vector spaces over finite fields, combined with a result of Cohen and Tal [3] that allows us to partition into large affine subspaces on which any finite collection of bounded degree polynomials are all constant. The structure of has the potential to cause issues in a Shkredov-like pseudorandomization argument, since the intersection of with a cell may no longer be the union of affine subspaces all having the same dimension. We will explain how this complication is dealt with in Section 7, since it requires a bit of set up.
2.3. Key intermediate results
We finish this section by stating the key intermediate results needed to prove Theorem 1.2 that we just described in the outline. Recall that denotes the balanced function of .
Lemma 2.3 (Estimation of ).
There exist absolute constants such that the following holds. Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Define by (2.7) and suppose that has density in . Let and assume that
Then
As a consequence, we get that if is small enough, is large enough, and has no nontrivial L-shaped configurations, then
Lemma 2.4 (Control by norms).
Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Assume that
Define by (2.7) and suppose that are -bounded functions supported on . Then
(2.12) |
(2.13) |
and
(2.14) |
Thus, if is small enough, is large enough, and has no nontrivial L-shaped configurations, then one of
is .
Theorem 2.5 (, , or large implies a density-increment).
There exist absolute constants such that the following holds. Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Let and
and assume that
Define by (2.7) and assume that has density in . Suppose that
or
Then, has density at least on a subset of the form
where the densities of are all , and the set takes the form
where each is a subspace of of codimension .
The first three lemmas combined tell us that if has density and contains no nontrivial L-shaped configurations, then one can find a subset of of the form (2.7) on which has density at least . The next lemma tells us that, after restricting to a product of large affine subspaces, we can get this same conclusion with and as pseudorandom as we need, which will allow us to continue the density-increment iteration.
Lemma 2.6.
There exist absolute constants such that the following holds. Let be a nonnegative integer, and set . Let , and suppose that have densities , respectively, and that and takes the form
where each is a subspace of of codimension . Define by (2.7), and assume that has density in , as well as that
Then there exists a subspace of dimension
, and such that, on setting ,
-
•
,
-
•
,
-
•
,
-
•
and ,
-
•
,
-
•
,
-
•
,
-
•
,
-
•
,
-
•
, and
-
•
,
we have
, and
By combining the previous four lemmas and using that L-shaped configurations are preserved by translation and invertible linear transformations of the form , we thus deduce the following density-increment lemma, which we will iterate in Section 8 to prove Theorem 1.2.
Lemma 2.7.
There exist absolute constants such that the following holds. Let be a nonnegative integer, and set . Suppose that that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Define by (2.7) and let have density in . Let , and assume that
Let , and suppose that has no nontrivial L-shaped configurations. Then either
-
(1)
or
-
(2)
there exists natural numbers and satisfying
and , subsets of densities , respectively, a subset of the form
where each is a subspace of of codimension (so that has density , where ), and a subset , where
of density at least in , such that
, and contains no nontrivial L-shaped configurations.
3. Additional preliminaries
In this section, we present some more preliminaries that were not needed for the outline of the proof of Theorem 1.2, but will be convenient to have for the proof itself. We begin with the notion of Cauchy–Schwarz complexity, first defined by Green and Tao in [14].
Definition 3.1 (Cauchy–Schwarz complexity).
Let be a collection of linear forms in variables. We say that has Cauchy–Schwarz complexity at most if, for every , there exists a partition of into at most subsets such that is not contained in the linear span of any of the subsets.
The smallest such that has Cauchy–Schwarz complexity at most is called the Cauchy–Schwarz complexity of .
For example, four term arithmetic progressions,
have Cauchy–Schwarz complexity .
Any system of linear forms of complexity at most can be shown to be controlled by the -norm using repeated applications of the Cauchy–Schwarz inequality. In particular, carrying out the proof of the generalized von Neumann theorem of Green and Tao in [14] in the finite field model setting (where the technical details are much simpler) produces the following useful result.
Theorem 3.2.
Let be a collection of linear forms in variables with Cauchy–Schwarz complexity at most . For any -bounded functions , we have
Further, if all of are supported on a set of density that satisfies , then
We will use the following immediate corollary of Theorem 3.2 numerous times throughout the proof of Theorem 1.2.
Corollary 3.3.
Let be a collection of linear forms in variables with Cauchy–Schwarz complexity at most . Suppose that are -bounded functions having average values , respectively, and that
for all . Then
We will also need the Gowers–Cauchy–Schwarz inequality.
Lemma 3.4 (Gowers–Cauchy–Schwarz inequality).
Let be a natural number, be an abelian group, and for every . We have
Next, we record the basic fact that a function on with small -norm has small average on affine subspaces of small codimension.
Lemma 3.5.
Let be a -bounded function satisfying and be an affine subspace of codimension . Then
Proof.
The indicator function of can be written as
so we have that
by the Gowers–Cauchy–Schwarz inequality. Since , the desired result follows. ∎
The last lemma of this section will be used to analyze the various averages appearing in the proofs of Lemmas 2.3, 2.4, and 2.5.
Lemma 3.6.
Let be a collection of linear forms such that the coefficient of in each of is nonzero, and be a function of the form
for some -bounded functions with average value , each satisfying . If the Cauchy–Schwarz complexity of the set of linear forms
(3.1) |
in the variables , is at most , then
4. Pseudorandomness of
The main purpose of this section is to show that if is small and is a subset of the form
where each is a subspace of density in and has density in , then, whenever is a collection of linear forms of Cauchy–Schwarz complexity at most and , the affine subspaces
typically have maximum possible codimension. This allows us to transform the condition that is -pseudorandom into a more useful property for evaluating the various averages that arise in the proof of Theorem 1.2.
Lemma 4.1.
For each nonnegative integer and positive integer , there exist constants such that the following holds. Let be a nonnegative integer, and set . Let , have density , and be a set of the form
where each is a subspace of codimension . Assume that
Let be a collection of linear forms of Cauchy–Schwarz complexity at most . Then, for all but at most a -proportion of -tuples for which , we must have that
for all .
We begin by showing that if is pseudorandom with respect to the -norm, then is pseudorandom with respect to the -norm. This result will also be useful at a few other points in the proof of Theorem 1.2.
Lemma 4.2.
Let be a nonnegative integer, and set . Let have density and be of the form
where each is a subspace of of codimension . Then, for every natural number , we have
Proof.
We write
and then, for each , insert the identity
to get that equals
We can average the above quantity over and make the change of variables to get that it equals
(4.1) |
Applying the Gowers–Cauchy–Schwarz inequality bounds (4.1) above by
which gives us the conclusion of the lemma. ∎
To prove Lemma 4.1, we will need the notion of an approximate polynomial of bounded degree, which we define using the additive discrete difference operator . For and , define by
and, for , the -fold additive difference operator by
Definition 4.3.
Let be an abelian group, , and . We say that is an -approximate polynomial of degree at most on if
for at least an -proportion of -tuples for which for all .
We will also need the following result, which is the key combinatorial input into the proof of Lemma 4.1.
Lemma 4.4.
For each nonnegative integer , there exist constants such that the following holds. Let have density , with
and be a -approximate polynomial of degree at most on . Then, for at least a -proportion of -dimensional parallelopipeds
in , the derivative of on the -dimensional face ,
vanishes for all and .
Proof.
We proceed by induction on , beginning with the case . If is a -approximate polynomial of degree at most on , then , so that, by the pigeonhole principle, there exists some for which . Set , and consider the set of quadruples
Note that if , then
so certainly the derivatives
of on each of the -dimensional faces of the parallelopiped vanish. Since
we must have . Taking , the total number of quadruples in is by Corollary 3.3, which means that consists of at least a -proportion of parallelograms in if is chosen small enough. Thus, for at least a -proportion of parallelograms in , the derivative of on each -dimensional face vanishes, as desired.
Now suppose that the result holds for a general degree , and let be a -approximate polynomial of degree at most on . By Corollary 3.3,
so that, as long as is sufficiently small and is sufficiently large, it follows from Markov’s inequality that
and thus
as well, for all but a -proportion of . Thus, for at least a -proportion of , we have , , and that the function is a -approximate polynomial of degree at most on . Denoting the set of such by , so that , the induction hypothesis then says that, for each , there are at least a -proportion of -dimensional parallelopipeds in for which the derivative of on each -dimensional face vanishes.
Summing over all , it follows that, for at least a -proportion of -tuples for which and are both in , one has
for all and . By Corollary 3.3,
and so, by Markov’s inequality,
(4.2) |
for all but a -proportion of in . By taking small enough and large enough, there are therefore at least a -proportion of -tuples in for which (4.2) holds and, for at least a -proportion of pairs such that and are both in , one also has
for all and . For each such -tuple , it follows from the pigeonhole principle that there exists a in the set
such that, for at least a -proportion of , one has
for all and .
Now set ,
so that , and
Note that if and only if
for all , , and . Thus, whenever , we have
and
for all and . That is, the derivative of vanishes on all -dimensional faces of the -dimensional parallelopiped .
As in the case,
so that
for at least a -proportion of -tuples . Each ordered quadruple for which corresponds to a unique -dimensional parallelopiped
in . Thus
In comparison, the number of -dimensional parallelopipeds in is
by Corollary 3.3. The conclusion of the lemma now follows as long as is sufficiently small and is sufficiently large. ∎
With a bit more work, it is possible to prove a version of Lemma 4.4 with -dimensional parallelopipeds replaced by -dimensional parallelopipeds (which is optimal), and thus a version of Lemma 4.1 with the -norm replaced by the -norm, but this would make a negligible difference in Theorem 1.2.
Now we can prove Lemma 4.1.
Proof of Lemma 4.1.
We proceed by induction on and , beginning with the , case111Note that if a system of linear forms has finite Cauchy–Schwarz complexity, then it has Cauchy–Schwarz complexity at most .. Since for all , certainly
for all for which , and this case follows trivially without even needing the assumption that is small.
Now let or , and assume that the result holds for all pairs of integers satisfying
-
(1)
and or
-
(2)
,
and let be at most the minimum of and be at least the maximum of over all such pairs with . As long as , it follows from the induction hypothesis that for all but a -proportion of for which , we must have
for all . If the codimension of some is not for one of these typical , then it is either or at most , which means that
(4.3) |
in either case, since . Squaring both sides of (4.3), multiplying by , averaging over all , swapping the order of summation, and applying Lemma 4.2 (to deduce the uniformity of ) and Theorem 3.2 yields
By Hölder’s inequality, we then have
It follows from this, the induction hypothesis, our assumption that , and Lemma 4.2 that
since the Cauchy–Schwarz complexity of any proper subset of
is at most .
Thus, by taking sufficiently small and sufficiently large, we get that
(4.4) |
for at least a -proportion of -tuples for which . The condition (4.4) implies that, for each such , there exist vectors , , not all of which are zero, such that
Fix a basis of for each . We can write every vector in terms of this basis, giving us that
for some -tuple of constants , not all of which are zero.
We apply the pigeonhole principle to deduce that there is some -tuple of constants , not all of which are zero, such that
for at least a -proportion of -tuples for which . Defining
for each , the above says that, for at least a -proportion of -tuples for which , we must have
where at least one of the functions does not have the zero vector in its image (because is always a nontrivial linear combination of linearly independent vectors). Let be any such . It then follows by applying Corollary 3.3 to the inside average of
that for at least a -proportion of -tuples for which , again provided that is sufficiently small and is sufficiently large. That is, is a -approximate polynomial of degree at most on .
If is small enough and is large enough, Lemmas 4.2 and 4.4 then imply that for at least a -proportion of -dimensional parallelopipeds in , the derivative of on each -dimensional face vanishes, i.e.,
for all and . Call the set of -tuples corresponding to such -dimensional parallelopipeds . Consider , which, plugging in the expression
for , equals
The above has size , coming from the contribution of for each and . On the other hand, we have
so that
Taking sufficiently small and sufficiently large will thus yield a contradiction if
for some for a -proportion of for which . ∎
5. Control by directional uniformity norms
This section is devoted to proving Lemmas 2.3 and 2.4. Our arguments mostly consist of careful, repeated applications of the Cauchy–Schwarz inequality to ensure that there is no loss of density factors and using the results of Sections 3 and 4 to analyze the resulting averages. As a simple warm-up, we begin by showing that if has the form (2.7) and and are sufficiently pseudorandom, then has density close to the product density .
Lemma 5.1.
Let be a nonnegative integer, and set . Let and assume that have densities and , respectively, and satisfy
and that takes the form
where each is a subspace of of codimension . Then the set defined by (2.7) has density
Proof.
Now we can prove Lemma 2.3.
Proof of Lemma 2.3.
The quantity of interest is
which, after a change of variables, can be written as
where equals
We will show that is very close to the constant value for almost every pair , from which it will then follow that is close to .
The first moment equals
Applying Lemma 3.6 yields
where
It therefore follows from Lemma 3.5 that
for all but a -proportion of . Thus,
By arguing as in the proof of Lemma 5.1, we have
and thus conclude that
The second moment equals
Applying Lemma 3.6 again yields
where
and applying Lemma 4.1 yields
for all . Thus, by Lemma 3.5,
for all but a -proportion of , so that
By Lemmas 3.6 and 4.1 again, we have
and
where
so that
by Lemma 3.5. Using Corollary 3.3 to estimate , we thus conclude that
Our estimates for the first and second moments of imply that has variance . It follows that
When is sufficiently small and is sufficiently large, this gives the desired lower bound for . ∎
To finish this section, we prove Lemma 2.4.
Proof of Lemma 2.4.
We prove (2.12), (2.13), and then (2.14), proceeding in decreasing order of the number of applications of Cauchy–Schwarz required. For (2.12), we make the change of variables to write as
which, by applying the Cauchy–Schwarz inequality in the and variables, has modulus squared bounded by
The first factor equals by Lemma 5.1. Expanding the square and making a change of variables, the second factor equals
By another application of the Cauchy–Schwarz inequality in the and variables, the modulus squared of this is at most
(5.1) |
times
The first factor (5.1) can be estimated in the same manner as the averages appearing in the proof of Lemma 2.3, and equals . Expanding the square and making a change of variables, we get that the second factor equals
A final application of the Cauchy–Schwarz inequality in the and variables bounds the modulus squared of this by
(5.2) |
times
By Lemmas 3.6 and 4.1, we have
for all , and
where
and
It then follows from Lemma 3.5 that (5.2) equals
which equals
It remains to relate
to . Expanding the square and making a change of variables yields
which can be written as
where
We will show that, for almost every -tuple for which , is very close to the constant value . Indeed, the first moment is
Lemmas 3.6 and 4.1 tell us that
and
where
so it follows from Lemma 3.5 that the first moment equals
which equals
by Corollary 3.3. The second moment is
which, noting that
we can write as
Lemmas 3.6 and 4.1, analogously to the case of the first moment, tell us that
and
where
so it follows from Lemma 3.5 that the second moment equals times
(5.3) |
To estimate the main term of (5.3), we note that
and apply Lemmas 3.6 and 4.1 again to get that
and
where
Thus, by Lemma 3.5, we have that
equals
which equals
by Corollary 3.3. It therefore follows that the second moment is
Thus,
and we conclude that
Putting everything together gives
as desired.
For (2.13), we make a change of variables and apply the Cauchy–Schwarz inequality to bound by
Expanding the square in the second quantity and making a change of variables yields
By applying the Cauchy–Schwarz inequality again in the and variables, the modulus squared of this is bounded above by
(5.4) |
times
Expanding the square and making a change of variables, the second factor equals
which we can write as
where
The quantity (5.4) and the weight can be analyzed in the same manner as the corresponding quantity and weight in the proof of (2.12), so that
Finally, for (2.14), we make a change of variables and apply the Cauchy–Schwarz inequality to bound by times
Expanding the square in the second quantity and making a change of variables yields
which can be written as
where
This weight can again be analyzed in the same manner as the weights appearing in the proof of (2.12), giving
and completing the proof of the lemma. ∎
6. Obtaining a density-increment
As was mentioned in Section 2, one ingredient in the proof of Theorem 2.5 is an inverse theorem for the -norm localized to pseudorandom sets. Applying the standard, nonlocalized inverse theorem for the -norm would yield a density-increment that gets weaker as becomes sparser, which is inadequate to close the density-increment iteration. To get a strong enough localized version of this inverse theorem, we will need to use the transference principle. The particular instance of it required is an immediate consequence of the dense model lemma from [24], which appears as Lemma 3.3 in that paper.
Lemma 6.1 (Dense model lemma for the -norm on subspaces).
For every natural number , there exists a constant such that the following holds. Let , be a subspace, and be functions satisfying
-
(1)
,
-
(2)
, and
-
(3)
.
Then there exists a such that and .
One can see that this lemma is a consequence of Zhao’s lemma by using the Gowers–Cauchy–Schwarz inequality to translate between his -discrepancy pair condition and our -uniformity condition.
In the course of the proof of Theorem 2.5, we will encounter various averages of linear forms that turn out to be controlled by certain degree and directional uniformity norms. Because of this, we will also need to obtain a density increment when these norms of the balanced function are large. The first two subsections of this section are devoted to proving that this is possible.
6.1. Results on degree norms
We first show that the relevant fibers of any set of the form (2.7) typically have close to their average density, provided that and are sufficiently pseduorandom.
Lemma 6.2.
Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that has density in and takes the form
where each is a subspace of of codimension . Let and assume that
Define by (2.7). Then
and
for any .
Proof.
Now we can obtain a density-increment when the degree uniformity norms controlled by and are large. The proof is essentially an averaging argument, like the proof of the analogous Lemma 3.1 in [12]. The most substantial new feature, which will arise many times in this section, is that we must now verify that the set on which we claim to have found a density-increment actually has close to the correct density in . In the case of corners, any product set trivially has density equal to the product of the densities of and . This is not, in general, the case for sets of the form (2.7) unless further assumptions are made about and .
Lemma 6.3.
There exist absolute constants such that the following holds. Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Let and , and assume that
Define by (2.7), and let have density in . Suppose that
(6.1) |
(6.2) |
or
(6.3) |
Then has density at least on some subset of of the form
where the densities of are , , and , respectively.
Proof.
The assumption (6.1) can be written as
Since for at most a -proportion of by Lemma 6.2, it follows that, as long as is sufficiently large, there exists a subset of relative density at least for which
for all , provided that is small enough and is large enough. Note that is a real number, and thus is either positive or negative. There must therefore exist a subset of density at least in such that either
for every or
for every .
In the first case, setting and , we have
by Lemma 6.2 whenever is small enough and is large enough, so that
Since by Lemma 6.2, the conclusion of the lemma follows when is small enough and is large enough by taking .
6.2. Results on degree norms
To obtain a density-increment when the relevant degree directional uniformity norms of are large, we first need to show that certain degree “inner products” are controlled by the degree directional uniformity norms studied in the previous subsection.
Lemma 6.4.
Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Let and assume that
Define by (2.7), and let and .
If
(6.4) |
where equals or for all , at least one equals , and at least one equals , then
If
(6.5) |
where equals or for all , at least one equals , and at least one equals , then
or
If
(6.6) |
where equals or for all , at least one equals , and at least one equals , then
or
Proof.
We rewrite the various assumptions that (6.4), (6.5), and (6.6) hold when at least two of the ’s equal as
and
where
and
Using the definition (2.7) of and arguing as in Section 5 using Lemmas 3.5, 3.6, and 4.1 gives that each of is typically very close to its average value on its support. Precisely, we have the estimates
and
which imply that
and
Since by an application of the Cauchy–Schwarz inequality, the conclusion of the lemma easily follows starting from any one of the assumptions (6.4), (6.5), or (6.6) when at least two of the ’s equal .
To prove the lemma when only one in (6.4), (6.5), or (6.6) equals , we will apply the Cauchy–Schwarz inequality once, and then argue analogously. By making a change of variables, we may start from the assumption that
or
We apply the Cauchy–Schwarz inequality in each of these three cases to get that
(6.7) |
times is at least ,
(6.8) |
times is at least , or
(6.9) |
times is at least .
Now we are almost ready to prove our desired density-increment result for the localized degree directional uniformity norms controlled by :
Lemma 6.5.
There exist absolute constants such that the following holds. Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Let and , and assume that
Define by (2.7), and let have density in . Suppose that
(6.10) |
or
(6.11) |
Then has density at least on some subset of of the form
where the densities of are both , and is of the form
where each is a subspace of of codimension .
To prove this lemma starting from the assumption (6.10), we will apply a localized -norm inverse theorem for many fixed , which will produce a density-increment on a set of the form
(6.12) |
where
This is not yet what the conclusion of the lemma promises, since varies with . To show that we can select a fixed (at the cost of shrinking the size of a bit) we will need Lemma 6.6 below.
The reader may wonder why we cannot just run the density-increment argument on sets of the form (6.12) and skip having to prove Lemma 6.6. The issue with this hypothetical proof is that -uniformity of a set of the form is not strong enough to guarantee that the analogue of Lemma 4.1 is true, regardless of how large is taken to be. Thus, such sets are not as amenable to a Shkredov-like pseudorandomization procedure.
Lemma 6.6.
There exist absolute constants such that the following holds. Let be a nonnegative integer, and set . Let have density , take the form
where and each is a subspace of of codimension , and satisfy
for every and every affine subspace of of codimension . Let , assume that , and suppose that has density at least in , where . Then there exists a and a subset such that has density at least in , where
and .
Proof.
Define
and
for every , and set , so that . By our assumption on , we have
and
for all .
Note that
and set
for each . Then we have
where
so that
when is small enough and is large enough. The contribution to coming from for which is obviously at most , which implies that
which is clearly positive. We thus conclude that there must exist a for which and
The conclusion of the lemma now follows by taking and . ∎
Now we can prove Lemma 6.5.
Proof of Lemma 6.5.
First assume that (6.10) holds. By writing , we see that equals
plus terms of the form
(6.13) |
where at least one equals and at least one other equals , and
If one of the terms (6.13) has absolute value larger than , then combining Lemmas 6.4 and 6.3 produces the desired density increment. Thus, we may proceed under the assumption that all have size at most , so that
For each , set , , and . Lemmas 3.5, 3.6, and 4.1 tell us that
and that
or else Lemma 6.3 will again give the desired density increment. It follows that there exists a subset of density in such that
for every . Setting
for all we then have , , , and
provided that is small enough and is large enough. Thus, as long as is sufficiently large, Lemma 6.1 tells us that there exists a function such that and . As a consequence, since , we must have
as well. Set and let , so that
Since , it follows that there exists a nonzero such that
where we have crucially used that is -bounded. As , it therefore follows that
for every .
Extend from to by picking a nonzero arbitrarily for all . We now split the average over above into an average of averages over cosets of in and average over all of to get that
and use the fact that
to deduce that
By applying the pigeonhole principle in the and variables, it follows that there exists a subset of density in and, for each , an element for which
Thus, recalling the definition of , we have
(6.14) |
Define by taking for all and to be an arbitrary element of for all , and similarly extend from to by taking to be an arbitrary element of for which
for all . Such an element must exist by the pigeonhole principle. Set
so that for every , and . Then (6.14) can be rewritten as
It remains to check that the density is close to , so that we indeed have the desired density-increment. But by Lemmas 3.5 and 3.6, we have
for all but a -proportion of , from which it follows that
Thus, has density at least on
provided that is large enough. The conclusion of the lemma now follows from Lemma 6.6
Now suppose that (6.11) holds. By writing and arguing as in the first case, we may proceed under the assumption that
We will first show that either
for almost every pair , or else we can deduce the desired density-increment using Lemma 6.3.
Consider the average
Using that , the above can be written as
(6.15) |
plus seven other terms of the form
(6.16) |
where and all equal or and at least one equals . By Lemmas 3.5, 3.6, and 4.1, the quantity (6.15) equals . Suppose that of the functions and in (6.16) equal . By a similar argument to those used to prove Lemma 6.4, if any term of the form (6.16) has size at least , then
or
so that the desired density-increment follows from Lemma 6.3. Thus, we may proceed under the assumption that
Now consider the average
Using that , the above can be written as
(6.17) |
plus 31 other terms of the form
(6.18) |
where and all equal or and at least one equals .
By Lemmas 3.5, 3.6, and 4.1, the quantity (6.17) equals . Suppose that of the functions in (6.18) equal . Analogously to the situation for the first moment, if any of the terms of the form (6.18) has size at least , then we will be able to deduce the desired density-increment. The most involved case is when . All other cases can be handled using a simpler version of the argument we are about to carry out.
So, consider this most involved case, i.e., that
has size at least . Applying the Cauchy–Schwarz inequality in the variables and gives that
times
is at least . By Lemmas 3.5, 3.6, and 4.1, . Expanding the square in the average above, this means that
is , provided that is small enough and is large enough. We can write the above as
where
Yet more applications of Lemmas 3.5, 3.6, and 4.1 give the estimate
so that
It follows from one more application of the Cauchy–Schwarz inequality in the variables and and a similar analysis to above that
which, combined with Lemma 6.3, gives the desired density-increment.
Thus, we may also proceed under the assumption that
By Markov’s inequality, we therefore have
(6.19) |
for all but a -proportion of . As a consequence, there exists a pair for which both (6.19) holds and
which together imply that
when we take , , , , and in the definition of . ∎
6.3. More preliminaries for
We will similarly need that certain -inner products are controlled by the degree and directional uniformity norms appearing in the previous subsections. The proof of the lemma below is similar to the proof of Lemma 6.4, but with an extra application of the Cauchy–Schwarz inequality in some cases.
Lemma 6.7.
Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Let , and assume that
Define by (2.7), and let and . If
where at least one equals and another equals , then
or
Our final preliminary lemma says that, for almost every , the function is supported on a Fourier uniform subset of the affine subspace .
Lemma 6.8.
Let be a nonnegative integer, and set . Suppose that have densities , respectively, and that takes the form
where each is a subspace of of codimension . Let , and assume that
Setting
and
then the probability
is .
Proof.
By Lemmas 3.6 and 4.1, we have that or for at most a -proportion of pairs . For all of these typical pairs , we have
which is at most
by inserting the identity
for every . For each fixed triple , the interior average above is bounded by by the Gowers–Cauchy–Schwarz inequality. Since there are possible triples to sum over, we must have
from which the conclusion of the lemma follows. ∎
6.4. Proof of Theorem 2.5
Now we can finally finish the proof of Theorem 2.5.
Proof of Theorem 2.5.
First assume that
Analogously to the proof of Lemma 6.5, by writing , we see that equals
plus 62 terms of the form
(6.20) |
where at least one equals and at least one other equals . Note that
by Lemmas 3.5, 3.6, and 4.1. If any of the terms (6.20) have absolute value at least , then combining Lemma 6.7 with Lemma 6.3 or Lemma 6.5 produces the desired density increment. We may thus proceed under the assumption that these terms are all small, so that
For each pair , let and be as in Lemma 6.8, and set . By Markov’s inequality, either
(6.21) |
or else is , in which case a combination of Lemmas 6.3, 6.4, and 6.5 will produce the desired density increment. We may thus proceed under the assumption that (6.21) holds.
By (6.21) and Lemma 6.8, there exists a subset of density in such that
and for all . Note that is supported on , and, setting
we have , , , , and , provided that is small enough and is large enough. Applying Lemma 6.1 on the affine subspace yields a function such that
provided that is large enough. We must then also have
Arguing as in the proof of the first part of Lemma 6.5, it follows that, for every pair , there exists a nonzero such that
so that
as well.
Extend from to the set by picking a nonzero arbitrarily for all pairs outside of . We split the average over up into an average of averages over cosets of and average over all pairs such that to get that
(6.22) |
for some absolute constant . Note that
so that, if it were the case that
then we would be able to deduce the desired density-increment from Lemma 6.3. Thus, we may proceed under the assumption
which we can combine with (6.22), a change of variables, and an application of the pigeonhole principle in the variable to deduce that
(6.23) |
for some fixed . Set
(6.24) |
To deduce the desired density-increment by applying the pigeonhole principle to (6.23), we will have to show that almost every set of the form
has close to the “correct” density. Most of the remainder of our argument for is devoted to this task.
Set . We start by showing that either is small for almost every pair , or else we can deduce a density-increment from Lemmas 6.3 and 6.5. Note first that
where . By Lemmas 3.5, 3.6, and 4.1, we have
so that . Similarly, the average equals
and equals
where
By Lemmas 3.5, 3.6, and 4.1, we have
so that
equals
Thus, if
for some (to be chosen later), and is small enough and is large enough, then
in which case the desired density increment follows from Lemma 6.3. We may thus proceed under the assumption that
so that, by Markov’s inequality, we have
It follows that equals
If
then we could deduce the desired density increment from Lemma 6.5. So, we may proceed under the assumption that this inequality does not hold, which implies that . It then follows from Markov’s inequality that
(6.25) |
Next, we will show that, for typical , the average size of is not very large. Certainly,
for every . Setting
Lemma 3.6 says that
Thus, by 3.5, as long as is large enough, we have
(6.26) |
say.
Now, setting , the contribution to the left-hand side of (6.23) coming from pairs for which or is . Thus, by the pigeonhole principle, there exists an and a subset of density in such that
and for every . Recalling the definition of , the above displayed equation says that
We now use Lemma 6.6 to find a subset of density and a such that
where
This gives the conclusion of the lemma.
The proofs of the two remaining cases are very similar to those appearing in earlier subsections, so we will be more brief in our arguments. Next, assume that
As in the second part of the proof of Lemma 6.5, we may proceed under the assumption that
and use it to show that either
for almost every pair for which , or else the desired density-increment follows from Lemma 6.3. By Lemmas 3.5, 3.6, and 4.1 and the Cauchy–Schwarz inequality, either
and
or else we have the desired density-increment from Lemma 6.3. It then follows from Markov’s inequality that
for all but a -proportion of pairs for which , which means that we can find such a pair for which we also have
so that we get the desired density-increment by taking , , , , and in the definition of .
Now assume that
which means
By Lemma 6.2 and the pigeonhole principle, there exists a subset of density at least for which
whenever . As in the proof of Lemma 6.3, there is a subset of density at least in such that either or for all . As in the proof of Lemma 6.3, we can simply take and in these cases, respectively, since, by Lemma 6.2, the fibers typically have very close to their average size. ∎
7. Pseudorandomization
This section begins with yet more preliminaries. To prove Lemma 2.6, we will need a result of Cohen and Tal, which says that, for any finite set of polynomials in , one can find a partition of into affine subspaces of relatively large dimension on which all of the polynomials are constant.
Theorem 7.1 (Cohen and Tal, Theorem 3.6 of [3] specialized to prime fields).
Let and be natural numbers. There exists a positive integer satisfying
such that, for any polynomials of degree at most , there is a partition of into affine subspaces of dimension such that are constant on each affine subspace.
To see that this formulation of Cohen and Tal’s theorem is equivalent to Theorem 3.6 of [3], note that one can make all of the affine subspaces in the partitions produced by their theorem have the same dimension by simply partitioning each subspace not of the minimum possible dimension into more subspaces.
We will also need a “bilinear” version of Cohen and Tal’s result, which we will use to find partitions of into product spaces of the form on which polynomials in two sets of variables are constant.
Corollary 7.2.
Let and be natural numbers. There exists a positive integer satisfying
such that, for any polynomial of degree at most , there is a partition of into products of affine subspaces of the form
with each , such that is constant on each product of affine subspaces.
Proof.
Write
where satisfy and for all . Applying Theorem 7.1 to gives us a positive integer and a partition
of , with for each , such that are all constant on each affine subspace .
Now write
Since restricting a polynomial to a subspace cannot increase its degree, for each , we can apply Theorem 7.1 on each affine subspace to the polynomial
to get that there exists a positive integer satisfying
and a partition
with each subspace having , such that is constant on each .
Thus, we can write
where is constant on each product . Since we can further refine this partition to one of the desired form
on each part of which is still constant. ∎
Finally, we will recall the recent quantitative inverse theorem of Gowers and Milićević for the -norms on vector spaces over finite fields:
Theorem 7.3 (Gowers and Milićević, Theorem 7 of [10]).
Let be a natural number and assume that . There exist constants so that, if is a -bounded function satisfying
then there exists a polynomial of degree such that
7.1. Proof of Lemma 2.6
Our proof of Lemma 2.6 is modeled after the corresponding pseudorandomization arguments in [19, 20] and [12]. As was said in the outline, some new features arise from our desire for and to be uniform with respect to -norms of degree greater than and from ’s particular structure as a union of affine subspaces of the second factor of . The first of these two complications can be overcome by using Theorem 7.3 and then Theorem 7.1 or Corollary 7.2. We will discuss how ’s structure influences our proof shortly.
The proof of Lemma 2.6 proceeds via an energy-increment argument. Each step of the energy-increment iteration will produce a partition of into cells of the form
for some subspace . For each cell in a partition , we set
-
•
,
-
•
,
-
•
, and
-
•
,
and, correspondingly,
-
•
,
-
•
,
-
•
, and
-
•
,
so that
Analogously to the pseudorandomization procedure for corners, we will show that if or is large for a substantial portion of cells in a partition , then there exists a refinement of which has substantially larger energy. But even if is a union of affine subspaces of the same codimension for most cells in the partition, this may not be the case for the ’s corresponding to the cells in the refinement . The codimensions of the affine subspaces can range from to , so before even considering how to obtain a pseudorandom , we have already found an obstacle to even getting a set of the same general form as .
To get around this issue, we will pseudorandomize each of the sets
(7.1) |
for , instead of just itself. The definition (7.1) of selects all of the subspaces of the second factor of comprising that have codimension at most . This will pseudorandomize each of
as well. At the end of the proof of Lemma 2.6, we will use an averaging argument to choose to be some suitable .
For each , set . We define the energy of a partition of by
Note that the energy of any partition is bounded above by . We will now prove a couple of lemmas concerning this energy.
Lemma 7.4.
Let and be nonnegative integers, , be of the form
where each is a subspace of of codimension between and , be a partition of with each cell taking the form
for some subspace of codimension , and suppose that is a refinement of with each cell taking the form
for some subspace of codimension . Then
and
for every .
Proof.
Note that it suffices to prove the result with the sum over restricted to a single cell and the sum over restricted to all cells contained in , since one can then just sum over in to get the desired result. So, we may assume without loss of generality that is the trivial partition .
Let and denote the densities of and , respectively, in . Since has density in , we have
so that, by the Cauchy–Schwarz inequality,
as desired, since
for all cells of . Similarly, since has density in and has density in , we have
and
it follows again from the Cauchy–Schwarz inequality that
and
The argument for the ’s is also essentially identical, but with one small difference. For ease of notation, set and for all . Then we actually have
instead of equality, since (which is why we run the energy-increment argument with the ’s, instead of the ’s). It therefore follows yet again from the Cauchy–Schwarz inequality that
for all . ∎
Lemma 7.5.
Let and be nonnegative integers, , be of the form
where each is a subspace of of codimension between and , and be a partition of with each cell taking the form
for some subspace of dimension . There exists a positive integer
and positive integers , such that the following holds.
Let .
-
(1)
If , then there exists a partition of with each cell taking the form
with each , such that
-
(2)
If , then there exists a partition of with each cell taking the form
with each , such that
-
(3)
If , then there exists a partition of with each cell taking the form
with each , such that
-
(4)
Let . If , then there exists a partition of with each cell taking the form
with each , such that
Proof.
Let and denote the constants and , respectively, from Theorem 7.3, and denote the smaller of the two minimum values of appearing in Theorem 7.1 when we take as in this lemma, , and and appearing in Corollary 7.2 when we take as in this lemma and .
First assume that . Then applying Theorem 7.3 with yields a polynomial of degree at most such that
By Theorem 7.1, there exists a partition of into affine subspaces of of dimension , on each of which is constant. Thus, by the triangle inequality,
so that, by the Cauchy–Schwarz inequality,
Expanding the square, this means that
(7.2) |
Now we partition the whole cell of interest by writing
and, for each , use that to partition into cosets of to get
Since for each , (7.2) reads
The arguments for and are again analogous, but we include them for the sake of completeness. Next, assume that . Applying Theorem 7.3 yields a polynomial of degree at most such that
By Theorem 7.1, there exists a partition of into affine subspaces of of dimension on each of which is constant. Thus, by the Cauchy–Schwarz inequality again,
Now we partition the whole cell by writing
Since , we conclude that
Now assume that . Applying Theorem 7.3 yields a polynomial of degree at most such that
By Theorem 7.1, there exists a partition of into affine subspaces of of dimension on each of which is constant. Thus, by the Cauchy–Schwarz inequality yet again,
Now we partition the whole cell by writing
Since , we conclude that
Finally, suppose that for some . Theorem 7.3 then says that there exists a polynomial of degree at most such that
By Corollary 7.2, there exists a partition of into affine subspaces of the form with , on each of which is constant. Thus, by the Cauchy–Schwarz inequality, we have
Since
the conclusion
now follows. ∎
Now we can prove Lemma 2.6.
Proof of Lemma 2.6.
We proceed via an energy-increment argument, as described at the beginning of the subsection. A cell in a partition is said to be expired if or is less than , and a nonexpired cell is said to be uniform if
and
for all . We will denote the subset of expired cells of by , the subset of uniform cells by , and the subset of nonexpired, nonuniform cells by , so that , and partition . For any subset , we define to be the measure of all cells indexed by :
Finally, we define a sequence of integers by setting and, for every , to be the minimum of the value of appearing in Theorem 7.1 when we take , , and and of appearing in Corollary 7.2 when we take and , so that
for some absolute constants .
Set to be the trivial partition of . Letting and be as in Lemma 7.5, then, as long as and , there exists a refinement of such that
-
(1)
for every and whenever and
-
(2)
.
Indeed, suppose that . Each cell in must be of dimension . By Lemmas 7.4 and 7.5, there exists a partition of each such that
is at least
and each is of the form with . Taking
we see that multiplying both sides of the above by and summing over yields
Since for all partitions , this iteration must terminate for some , at which point either or . Assuming that ensures that the latter case cannot occur.
Since , we have
This certainly implies that
so that, by the pigeonhole principle, there exists a cell in for which
Since , another application of the pigeonhole principle tells us that there exists a for which we also have the density-increment
As noted at the beginning of this subsection,
so that the conclusion of the lemma now follows. ∎
8. The density-increment argument
Proof of Theorem 1.2.
Suppose that has density and contains no nontrivial L-shaped configurations. Set , , , , , and . Applying Lemma 2.7 repeatedly produces sequences of ’s, ’s, ’s, ’s, ’s, ’s, ’s, ’s, and ’s, with and of the form
where each is a subspace of codimension , such that, on setting
, , , , and , we have, for each , that
-
(1)
has density in , where ,
-
(2)
,
-
(3)
,
-
(4)
,
-
(5)
,
-
(6)
,
-
(7)
and contains no nontrivial L-shaped configurations,
provided that
(8.1) |
Since no set can have density larger than , the lower bound (8.1) must fail for some . Thus, there exists an absolute constant such that
while, on the other hand
Comparing the upper and lower bounds for and taking the -fold iterated logarithm of both sides yields the bound in Theorem 1.2. ∎
References
- [1] T. Austin. Partial difference equations over compact Abelian groups, I: modules of solutions. 2013. arXiv:1305.7269.
- [2] T. Austin. Partial difference equations over compact Abelian groups, II: step-polynomial solutions. 2013. arXiv:1309.3577.
- [3] G. Cohen and A. Tal. Two structural results for low degree polynomials and applications. In Approximation, randomization, and combinatorial optimization. Algorithms and techniques, volume 40 of LIPIcs. Leibniz Int. Proc. Inform., pages 680–709. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2015.
- [4] H. Furstenberg and Y. Katznelson. An ergodic Szemerédi theorem for commuting transformations. J. Analyse Math., 34:275–291 (1979), 1978.
- [5] W. T. Gowers. Fourier analysis and Szemerédi’s theorem. In Proceedings of the International Congress of Mathematicians, Vol. I (Berlin, 1998), number Extra Vol. I, pages 617–629, 1998.
- [6] W. T. Gowers. A new proof of Szemerédi’s theorem for arithmetic progressions of length four. Geom. Funct. Anal., 8(3):529–551, 1998.
- [7] W. T. Gowers. Arithmetic progressions in sparse sets. In Current developments in mathematics, 2000, pages 149–196. Int. Press, Somerville, MA, 2001.
- [8] W. T. Gowers. A new proof of Szemerédi’s theorem. Geom. Funct. Anal., 11(3):465–588, 2001.
- [9] W. T. Gowers. Hypergraph regularity and the multidimensional Szemerédi theorem. Ann. of Math. (2), 166(3):897–946, 2007.
- [10] W. T. Gowers and L. Milićević. An inverse theorem for Freiman multi-homomorphisms. preprint, 2020. arXiv:2002.11667.
- [11] R. L. Graham. Euclidean Ramsey theory. In Handbook of discrete and computational geometry, CRC Press Ser. Discrete Math. Appl., pages 153–166. CRC, Boca Raton, FL, 1997.
- [12] B. Green. An argument of Shkredov in the finite field setting. preprint, 2005. http://people.maths.ox.ac.uk/greenbj/papers/corners.pdf.
- [13] B. Green. Finite field models in additive combinatorics. In Surveys in combinatorics 2005, volume 327 of London Math. Soc. Lecture Note Ser., pages 1–27. Cambridge Univ. Press, Cambridge, 2005.
- [14] B. Green and T. Tao. Linear equations in primes. Ann. of Math. (2), 171(3):1753–1850, 2010.
- [15] F. Manners. Quantitative bounds in the inverse theorem for the Gowers -norms over cyclic groups. preprint, 2018. arXiv:1811.00718.
- [16] B. Nagle, V. Rödl, and M. Schacht. The counting lemma for regular -uniform hypergraphs. Random Structures Algorithms, 28(2):113–179, 2006.
- [17] S. Prendiville. Matrix progressions in multidimensional sets of integers. Mathematika, 61(1):14–48, 2015.
- [18] V. Rödl and J. Skokan. Regularity lemma for -uniform hypergraphs. Random Structures Algorithms, 25(1):1–42, 2004.
- [19] I. D. Shkredov. On a generalization of Szemerédi’s theorem. Proc. London Math. Soc. (3), 93(3):723–760, 2006.
- [20] I. D. Shkredov. On a problem of Gowers. Izv. Ross. Akad. Nauk Ser. Mat., 70(2):179–221, 2006.
- [21] I. D. Shkredov. On an inverse theorem for -norm. In Modern problems of mathematics and mechanics, Mathematics, Dynamical systems, volume 4, 2, pages 55–127, 2009.
- [22] T. Tao. A variant of the hypergraph removal lemma. J. Combin. Theory Ser. A, 113(7):1257–1280, 2006.
- [23] T. Tao. Higher order Fourier analysis, volume 142 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2012.
- [24] Y. Zhao. An arithmetic transference proof of a relative Szemerédi theorem. Math. Proc. Cambridge Philos. Soc., 156(2):255–261, 2014.