
Quasipolynomial bounds on the inverse theorem for the Gowers $U^{s+1}[N]$-norm

James Leng Department of Mathematics, UCLA, Los Angeles, CA 90095, USA [email protected] Ashwin Sah  and  Mehtaab Sawhney Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA {asah,msawhney}@mit.edu
Abstract.

We prove quasipolynomial bounds on the inverse theorem for the Gowers $U^{s+1}[N]$-norm. The proof is modeled after work of Green, Tao, and Ziegler and uses as a crucial input recent work of the first author regarding the equidistribution of nilsequences. In a companion paper, this result will be used to improve the bounds on Szemerédi’s theorem.

Leng was supported by NSF Graduate Research Fellowship Grant No. DGE-2034835. Sah and Sawhney were supported by NSF Graduate Research Fellowship Program DGE-2141064.

1. Introduction

We recall the definition of the Gowers $U^{s}$-norm on $\mathbb{Z}/N\mathbb{Z}$ and $[N]$. Throughout we let $[N]=\{1,\ldots,N\}$.

Definition 1.1.

Given $f\colon\mathbb{Z}/N\mathbb{Z}\to\mathbb{C}$ and $s\geq 1$, we define

\[\lVert f\rVert_{U^{s}(\mathbb{Z}/N\mathbb{Z})}^{2^{s}}=\mathbb{E}_{x,h_{1},\ldots,h_{s}\in\mathbb{Z}/N\mathbb{Z}}\Delta_{h_{1},\ldots,h_{s}}f(x)\]

where $\Delta_{h}f(x)=f(x)\overline{f(x+h)}$ is the multiplicative discrete derivative (extended to lists by composition). Given a natural number $N$ and a function $f\colon[N]\to\mathbb{C}$, we choose a number $\widetilde{N}\geq 2^{s}N$ and define $\widetilde{f}\colon\mathbb{Z}/\widetilde{N}\mathbb{Z}\to\mathbb{C}$ via $\widetilde{f}(x)=f(x)$ for $x\in[N]$ and $0$ otherwise. Then

\[\lVert f\rVert_{U^{s}[N]}:=\lVert\widetilde{f}\rVert_{U^{s}(\mathbb{Z}/\widetilde{N}\mathbb{Z})}/\lVert\mathbbm{1}_{[N]}\rVert_{U^{s}(\mathbb{Z}/\widetilde{N}\mathbb{Z})}.\]
Remark.

This is known to be well-defined and independent of $\widetilde{N}$, and a norm if $s\geq 2$; see [27, Lemma B.5].
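For readers who want to experiment with Definition 1.1, the following is a minimal brute-force sketch in Python. The function names `gowers_norm_cyclic` and `gowers_norm_interval` and the `pad` parameter are illustrative assumptions, not notation from the paper; the code uses the recursion $\lVert f\rVert_{U^{s}}^{2^{s}}=\mathbb{E}_{h}\lVert\Delta_{h}f\rVert_{U^{s-1}}^{2^{s-1}}$ with base case $\lVert f\rVert_{U^{1}}^{2}=|\mathbb{E}f|^{2}$, and is only practical for very small $N$ and $s$.

```python
import cmath


def gowers_norm_cyclic(f, s):
    """Brute-force ||f||_{U^s(Z/NZ)} for a list f of numbers, N = len(f)."""

    def power(g, k):
        # Returns ||g||_{U^k}^{2^k}, a nonnegative real number.
        n = len(g)
        if k == 1:
            return abs(sum(g) / n) ** 2
        total = 0.0
        for h in range(n):
            # Multiplicative derivative: Delta_h g(x) = g(x) * conj(g(x + h)).
            dg = [g[x] * g[(x + h) % n].conjugate() for x in range(n)]
            total += power(dg, k - 1)
        return total / n

    # Guard against tiny negative values caused by rounding.
    return max(power(f, s), 0.0) ** (1.0 / 2 ** s)


def gowers_norm_interval(f, s, pad=0):
    """||f||_{U^s[N]}: extend f by zero to Z/N~Z with N~ = 2^s * N + pad.

    The list f holds the values f(1), ..., f(N); placing them at residues
    0, ..., N-1 is harmless because the cyclic norm is translation invariant.
    """
    n = len(f)
    n_tilde = 2 ** s * n + pad
    extended = list(f) + [0] * (n_tilde - n)
    indicator = [1] * n + [0] * (n_tilde - n)
    return gowers_norm_cyclic(extended, s) / gowers_norm_cyclic(indicator, s)


# A linear phase has U^2-norm 1 on the cyclic group.
phase = [cmath.exp(2j * cmath.pi * x / 7) for x in range(7)]
print(gowers_norm_cyclic(phase, 2))
```

Calling `gowers_norm_interval` with two different values of `pad` gives a quick numerical sanity check of the independence of $\widetilde{N}$ asserted in the remark.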

Our main result is quasi-polynomial bounds on the inverse theorem for the Gowers $U^{s+1}$-norm over the integers. This builds on earlier work [43, Section 8] of the first author which handled the case of the $U^{4}$-norm.

Theorem 1.2.

Fix $\delta\in(0,1/2)$. Suppose that $f\colon[N]\to\mathbb{C}$ is $1$-bounded and

\[\lVert f\rVert_{U^{s+1}[N]}\geq\delta.\]

Then there exists a nilmanifold $G/\Gamma$ of degree $s$, complexity at most $M$, and dimension at most $d$ as well as a function $F$ on $G/\Gamma$ which is at most $K$-Lipschitz such that

\[|\mathbb{E}_{n\in[N]}[f(n)\overline{F(g(n)\Gamma)}]|\geq\varepsilon,\]

where we may take

\[d\leq\log(1/\delta)^{O_{s}(1)}\text{ and }\varepsilon^{-1},K,M\leq\exp(\log(1/\delta)^{O_{s}(1)}).\]
Remark.

Throughout this paper, we will abusively write $\log$ for $\max(\log(\cdot),e^{e})$; this is to avoid issues with small numbers.

We have not formally defined a nilmanifold or notions of complexity; our definition is identical to that in work of Green and Tao [29] and will be recalled precisely in Sections 2 and 3.

In the companion paper to this work [46], we will use Theorem 1.2 in order to improve the long standing bounds of Gowers [16, 18] on Szemerédi’s theorem.

Theorem 1.3 (Theorem 1.1 in [46]).

Let $r_{k}(N)$ denote the size of the largest $S\subseteq[N]$ such that $S$ has no $k$-term arithmetic progressions. For $k\geq 5$, there is $c_{k}\in(0,1)$ such that

\[r_{k}(N)\ll N\exp(-(\log\log N)^{c_{k}}).\]
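To make the quantity $r_{k}(N)$ concrete, here is a small exhaustive-search sketch. The helper names `has_k_term_ap` and `r` are hypothetical, and the search is exponential in $N$, so it is only meant for tiny values and says nothing about the asymptotic bounds discussed here.

```python
from itertools import combinations


def has_k_term_ap(subset, k):
    """True if subset contains a k-term arithmetic progression (k >= 2)."""
    elems = set(subset)
    for a in elems:
        for b in elems:
            d = b - a  # candidate common difference; a + d must lie in the set
            if d > 0 and all(a + j * d in elems for j in range(k)):
                return True
    return False


def r(k, n):
    """Size of the largest subset of {1, ..., n} with no k-term progression."""
    for size in range(n, 0, -1):
        for subset in combinations(range(1, n + 1), size):
            if not has_k_term_ap(subset, k):
                return size
    return 0


print([r(3, n) for n in range(1, 10)])
```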

1.1. History and previous results

A long standing conjecture of Erdős and Turán [13] stated that $r_{k}(N)=o(N)$. In full generality, this conjecture remained open until a combinatorial tour de force of Szemerédi [54, 55] which established the Erdős and Turán conjecture.

Theorem 1.4.

For $k\geq 3$, we have that

\[r_{k}(N)=o_{k}(N).\]

Due to uses of the van der Waerden theorem and the regularity lemma (which was introduced in this work), Szemerédi’s density saving over the trivial bound was exceedingly small. In particular, Szemerédi’s result provided no improvement on known bounds for van der Waerden’s theorem which was part of Erdős and Turán’s original motivation.

The first result in the effort to prove reasonable bounds for $r_{k}(N)$, e.g. giving a density saving of at least a finite iterated logarithmic type, came from work of Roth [50] which proved

\[r_{3}(N)\ll N(\log\log N)^{-1}.\]

Being based on Fourier analysis, the methods used in that paper did not obviously generalize to $k\geq 4$. An estimate for $r_{k}(N)$ which was “reasonable” would have to wait until pioneering work of Gowers [16, 18].

The starting point of work of Gowers [16, 18] is noting via an iterative application of the Cauchy–Schwarz inequality that if a set $A$ of density $\delta$ in $[N]$ has no $(s+2)$-term arithmetic progressions then $\lVert f\rVert_{U^{s+1}[N]}\geq\delta^{O_{s}(1)}$ where $f$ is a shifted indicator function of the set. In doing so, Gowers provided the correct notion of “pseudorandomness” generalizing Fourier coefficients which was suitable for understanding arithmetic patterns in subsets of the integers and therefore created “higher order Fourier analysis”. The key technical ingredient in work of Gowers was a certain “local inverse theorem” for the $U^{s+1}[N]$-norm. Gowers proved that given a $1$-bounded function $f$ such that $\lVert f\rVert_{U^{s+1}[N]}\geq\delta$, there exists a decomposition of $[N]$ into arithmetic progressions of length roughly $N^{c_{s}}$ and a $1$-bounded function $g$ which is constant along these arithmetic progressions such that

\[\mathbb{E}_{x\in[N]}f(x)\overline{g(x)}\geq\delta^{O_{s}(1)};\]

i.e., $f$ correlates with $g$. This result, coupled with the density increment strategy as introduced by Roth [50], provided the bound

\[r_{k}(N)\ll N(\log\log N)^{-c_{k}}\]

for Szemerédi’s theorem. These bounds have remained the best known for general $k$ until this work. For the sake of comparison, a long sequence of works have attacked the special case of $k=3$, culminating in a recent breakthrough work of Kelley and Meka [41] which proved

\[r_{3}(N)\ll N\exp(-c(\log N)^{1/12});\]

the constant $1/12$ was refined to $1/9$ in work of Bloom and Sisask [5]. The only other improvements to the bound of Gowers were due to works of Green and Tao [25, 30] which ultimately established that

\[r_{4}(N)\ll N(\log N)^{-c},\]

and very recent work of the authors [45] which handled the case $k=5$ of Theorem 1.3.

Notice however that the “local inverse theorem” of Gowers only gives correlations on arithmetic progressions of length $N^{c_{s}}$ and that the converse of this result is not true. In particular, a function may have small $U^{s+1}[N]$-norm and still correlate with a function which is constant on progressions of length $N^{c_{s}}$. To construct such an example, break $[N]$ into consecutive segments of length $\sqrt{N}$ and include each segment with probability $1/2$; while this set with high probability has large “local correlations” it has polynomially small Gowers norm. To obtain a full inverse result (analogous to the quality of Freiman’s theorem, say), one must carefully pin down the global structure as well. Such a task is not straightforward, since the natural generalization of Fourier characters to exponentials of polynomials does not suffice.

A crucial development in the theory towards the inverse conjecture for the Gowers norm was the discovery of the role of nilpotent Lie groups. In groundbreaking work, Furstenberg [14] gave an alternate proof of Szemerédi’s theorem based on ergodic theory; this work naturally led to seeking to understand certain nonconventional ergodic averages. In works of Conze and Lesigne [11] and Furstenberg and Weiss [15] regarding nonconventional ergodic averages, nilmanifolds $G/\Gamma$ where $G$ is nilpotent and $\Gamma$ is a discrete cocompact subgroup were brought to the forefront. Host and Kra [38], and independently Ziegler [62], proved convergence of such nonconventional ergodic averages. Crucial to these works was establishing that such averages are controlled by projections on certain characteristic factors which naturally give rise to nilmanifolds. The role of nilsequences (derived from polynomial sequences on nilmanifolds) was further highlighted in work of Bergelson, Host, and Kra [3].

The statement of the inverse conjecture (without the given quantification) we will prove was first formulated in work of Green and Tao [27]. Conditional on this inverse conjecture and on the conjecture that the Möbius function does not correlate with nilsequences, Green and Tao were able to prove asymptotic counts for all linear patterns in the primes of “finite complexity”, vastly generalizing the celebrated Green–Tao theorem [24]. Both of these conjectures were resolved; the second being resolved in work of Green and Tao [28] while the first was resolved in work of Green, Tao, and Ziegler [34]. We remark the cases $s=2$ and $s=3$ of the inverse conjecture were proven earlier by Green and Tao [23] and Green, Tao, and Ziegler [32] respectively. A crucial ingredient in the cases $s\geq 3$ was work of Green and Tao [29] on the equidistribution behavior of polynomial orbits on nilmanifolds. An alternative approach to the inverse conjecture was initiated by Szegedy [53], involving the development of the theory of nilspaces by Camarena and Szegedy [6]; a detailed treatment of these papers was given by Candela [8, 7]. This nilspace approach has been further developed in works of Gutman, Manners, and Varjú [36, 35, 37]. Both of these approaches to the inverse theorem, however, at least formally, gave no bounds on the complexity or dimension of the nilsequences with which the function correlates in the cases $s\geq 4$. A third approach due to Manners [47] will be discussed later in this section.

We remark that the study of the inverse conjecture for the Gowers norm makes sense beyond the setting of functions on the interval or on the cyclic group $\mathbb{Z}/N\mathbb{Z}$. Work of Bergelson, Tao, and Ziegler [4] and Tao and Ziegler [59, 60] resolved the analogue of the inverse conjecture for the Gowers norm over $\mathbb{F}_{p}^{n}$. Candela and Szegedy [10] gave a version of the inverse conjecture for the Gowers norm over all compact abelian groups. This final work falls within the context of giving proofs which, broadly speaking, attempt to handle various abelian groups in a uniform manner. There has been substantial further work in this rough direction including works of Jamneshan and Tao [40], Jamneshan, Shalom, and Tao [39], and Candela, González-Sánchez, and Szegedy [9].

The inverse theorem has had numerous further applications within additive combinatorics; we highlight just two. First, Tao and Ziegler [61] gave an asymptotic for the number of polynomial patterns $x+P_{1}(y),\ldots,x+P_{j}(y)$ in the primes where $P_{1}(0)=\cdots=P_{j}(0)=0$ with top degree terms $P_{1},\ldots,P_{j}$ being distinct. Second, works of Green and Tao [26] and Altman [2, 1] used the inverse conjecture in combination with an arithmetic regularity lemma to establish the true complexity conjectures of Gowers and Wolf [21].

Due to its importance in the theory of additive patterns, establishing quantitative bounds on the inverse theorem for the Gowers norm has been seen as a central problem in additive combinatorics, with Green suggesting it as “perhaps the biggest open question in the subject” [22, Problem 56]. For the case of $s=2$, work of Green and Tao [23] gave quantitative bounds for the inverse theorem over the integers and work of Sanders [51] combined with the strategy in [23] proves Theorem 1.2 for the case of $s=2$. For general $s$, until roughly five years ago no quantitative bounds were known for the inverse theorem and this was considered a major open problem. This state of affairs was substantially improved in remarkable work of Manners [47] which proves a version of the inverse theorem where, in the notation of Theorem 1.2,

\[d\leq\delta^{-O_{s}(1)}\text{ and }\varepsilon^{-1},K,M\leq\exp(\exp(\delta^{-O_{s}(1)})).\]

This result was subsequently used as a crucial input in work of Tao and Teräväinen [57] to give an effective result for the counts of linear equations in the primes. We remark that a quantitative version of the inverse conjecture over finite fields of high characteristic was proven in work of Gowers and Milićević [20, 19].

At the highest level, the quantitative proofs of Manners [47] and Gowers and Milićević [20, 19] examine when the iterated derivatives of a function are 0 with positive probability. Deriving useful information from this hypothesis over finite fields and the integers are very different problems but fundamentally one glues information from higher derivatives together into information regarding lower derivatives iteratively.

Our proof instead operates via induction on $s$ and attempts to glue degree $(s-1)$ nilmanifolds into a degree $s$ one exactly as in work of Green, Tao, and Ziegler [34]. Our proof in fact is very closely modeled on their work and borrows large sections of their work essentially verbatim. In fact, we believe that the proof in [34], if appropriately quantified, itself yields a bound involving $O(s^{2})$ many iterated exponentials. The primary improvement of our proof over theirs stems from the use of improved quantitative equidistribution results on nilmanifolds [43, 42] rather than the results of [29]. The reason we obtain quasi-polynomial bounds is that our proof, even though it inducts on $s$, gives quasi-polynomial bounds for each step of the induction. Since an iterated composition of finitely many quasi-polynomial functions is still quasi-polynomial, it follows that our bounds should remain quasi-polynomial. In contrast, we believe that the proof in [34], appropriately quantified, results in adding $O(t)$ iterated exponentials in each step $t$ of the induction, which when iterated totals $O(s^{2})$ iterated exponentials. Here the results of [43, 42] play a crucial role in eliminating the logarithms accumulated in the induction step. We further remark that the case $s=3$ of the main theorem (i.e., the $U^{4}$-inverse theorem) of the strength in Theorem 1.2 was proven earlier by the first author in [43, Section 8] and may be a useful stepping stone for reading this paper (although this paper is logically independent).
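The closure of quasi-polynomial bounds under composition, which the paragraph above relies on, comes down to a one-line computation. As a sketch only, using the hypothetical notation $Q_{a}(x)=\exp(\log(x)^{a})$ for $x\geq e$ and $a\geq 1$ (this notation is not from the paper, and the $\max(\cdot,e^{e})$ convention for $\log$ is ignored here):

```latex
\[
Q_{a}(Q_{b}(x))
  =\exp\big((\log Q_{b}(x))^{a}\big)
  =\exp\big((\log(x)^{b})^{a}\big)
  =\exp\big(\log(x)^{ab}\big)
  =Q_{ab}(x).
\]
```

By contrast, for $E_{a}(x)=\exp(x^{a})$ one has $E_{a}(E_{a}(x))=\exp(\exp(ax^{a}))$, so each composition of exponential-type bounds adds a further exponential.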

1.2. Organization of the paper I

We briefly discuss the next three sections of the paper. In Section 2, we define a number of basic notions regarding nilmanifolds and set various conventions which will be used throughout the paper. Our conventions differ in various extremely minor ways from those in the work of Green, Tao, and Ziegler [34] but we record them explicitly to recall a number of definitions which will be used throughout the paper. In Section 3, we set various complexity notions that will be used throughout the paper. In the case of nilmanifolds which are given a degree filtration (as is the case in Theorem 1.2), our conventions match those of Green and Tao [29]. With these notions in hand, we will be in position to outline the main proof in greater detail in Section 4.

Acknowledgements

The first author thanks Terence Tao for advisement. The authors thank Ben Green and Terence Tao for useful discussions regarding [34, 31]. The authors are grateful to Dan Altman, Ben Green, and Zach Hunter for comments. Finally the authors are especially grateful to Sarah Peluse for exceptionally detailed and useful comments on the manuscript.

2. Conventions on nilmanifolds

We will recall a large portion of setup regarding nilsequences. In order to discuss this in a quantitative manner, various complexity notions are required which are formally defined in Section 3. This section contains little more than bare definitions; a number of these concepts are developed and motivated in a beautiful manner in [34, Section 6].

2.1. Basic group theory

We briefly record various basic group theory notations which will be used throughout the paper; our notation is identical to that of [34, Section 3].

Given a group $G$ and a subset $A$, we define $\langle A\rangle$ to be the subgroup generated by the subset $A$. Given a collection of subgroups $(H_{i})_{i\in I}$ in $G$, we define $\bigvee_{i\in I}H_{i}$ to be the smallest subgroup containing all the $H_{i}$. Given $h,k\in G$, we denote the commutator of $h$ and $k$ to be

\[[h,k]=h^{-1}k^{-1}hk.\]

Given a sequence of elements $g_{1},\ldots,g_{r}\in G$, we define the set of $(r-1)$-fold commutators inductively. The $0$-fold commutator of the single element $g_{i}$ is simply $g_{i}$. For $r>1$, an $(r-1)$-fold commutator is $[w,w^{\prime}]$ where $w$ and $w^{\prime}$ are $(s-1)$-fold and $(s^{\prime}-1)$-fold commutators of $g_{i_{1}},\ldots,g_{i_{s}}$ and $g_{i_{1}^{\prime}},\ldots,g_{i_{s^{\prime}}^{\prime}}$ respectively with $\{i_{1},\ldots,i_{s}\}\cup\{i_{1}^{\prime},\ldots,i_{s^{\prime}}^{\prime}\}=\{1,\ldots,r\}$ and $s+s^{\prime}=r$. For instance, $[[g_{3},g_{4}],[g_{1},g_{2}]]$ and $[g_{1},[g_{3},[g_{2},g_{4}]]]$ are $3$-fold commutators of $g_{1}$, $g_{2}$, $g_{3}$, and $g_{4}$.
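As an illustration of the commutator convention $[h,k]=h^{-1}k^{-1}hk$, the sketch below works in the integer Heisenberg group, an example chosen here and not taken from the paper. A triple `(a, b, c)` is assumed to stand for the upper unitriangular matrix with entries $a$, $b$ above the diagonal and $c$ in the corner; the helper names `mul`, `inv`, and `comm` are hypothetical.

```python
def mul(g, h):
    """Product in the Heisenberg group, encoded as triples (a, b, c)."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(g):
    """Inverse of (a, b, c) under the product above."""
    a, b, c = g
    return (-a, -b, -c + a * b)


def comm(h, k):
    """Commutator [h, k] = h^{-1} k^{-1} h k, matching the convention above."""
    return mul(mul(inv(h), inv(k)), mul(h, k))


x, y = (1, 0, 0), (0, 1, 0)
one_fold = comm(x, y)         # a 1-fold commutator of x and y
two_fold = comm(x, one_fold)  # a 2-fold commutator of x, x, y
print(one_fold, two_fold)
```

In this group every 1-fold commutator is central, so all 2-fold commutators are trivial, reflecting that the group is 2-step nilpotent.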

We let $H\leqslant G$ denote that $H$ is a subgroup of $G$. Given $H,K\leqslant G$, we denote the commutator subgroup

\[[H,K]=\langle[h,k]\colon h\in H,k\in K\rangle.\]

The following pair of elementary lemmas will be used throughout the paper to verify various commutator identities; the first is [34, Lemma 3.1].

Lemma 2.1.

Let $H=\langle A\rangle$ and $K=\langle B\rangle$ be normal subgroups of a nilpotent group $G$. Then $[H,K]$ is also normal and is generated by the $(i+j-1)$-fold iterated commutators of $a_{1},\ldots,a_{i},b_{1},\ldots,b_{j}$ over all choices of $a_{1},\ldots,a_{i}\in A$, $b_{1},\ldots,b_{j}\in B$ and $i,j\geq 1$.

This implies (see [34, p. 1242]) that for families $(H_{i})_{i\in I}$, $(K_{j})_{j\in J}$ which are normal in a nilpotent group $G$,

\[\Big[\bigvee_{i\in I}H_{i},\bigvee_{j\in J}K_{j}\Big]=\bigvee_{i\in I,j\in J}[H_{i},K_{j}].\]

We next require that normality and various filtration conditions can be checked at the level of generators.

Lemma 2.2.

Suppose $K\leqslant H$ with $H=\langle A\rangle,K=\langle B\rangle$ where $A=A^{-1}$ and $B=B^{-1}$. Then:

  • If $[a,b]\in K$ for all $a\in A$ and $b\in B$ then $K$ is normal in $H$.

  • Suppose $L\leqslant K\cap H$ is a normal subgroup with respect to both $K$ and $H$, and suppose for $a\in A$, $b\in B$, we have $[a,b]\in L$. Then $[H,K]\leqslant L$.

Remark.

Suppose we wish to prove that $(G_{i})_{i\in I}$ forms an $I$-filtration (see Definition 2.3). This lemma implies that it suffices to check the commutator filtration conditions simply at the level of generators: if for each $i,j\in I$ we know $[g_{i},g_{j}]\in G_{i+j}$ for all generators $g_{i}$ for $G_{i}$ and $g_{j}$ for $G_{j}$, then we can deduce that $G_{i+j}$ is normal in $G_{i}$ using the first bullet point above, and then deduce that $[G_{i},G_{j}]\leqslant G_{i+j}$ using the second bullet point above.

Proof.

For $a\in A,b\in B$ we have $[a,b]\in K$ hence $a^{-1}b^{-1}a\in K$. Since $B=B^{-1}$ generates $K$, we find $a^{-1}Ka\leqslant K$. Since $A$ generates $H$, we deduce that $K$ is normal in $H$.

For the second item, note that

\[[xy,z]=y^{-1}[x,z]y\cdot[y,z]\text{ and }[x,zy]=[x,y]\cdot y^{-1}[x,z]y.\]

Repeatedly expanding $[h,k]$ for $h\in H,k\in K$ into generators proves the result. ∎
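The two commutator identities used in the proof hold in any group, and can be checked mechanically. The sketch below verifies them in the symmetric group on four letters; the choice of group and the helper names `compose`, `inverse`, `comm`, and `identities_hold` are assumptions made for illustration.

```python
from itertools import permutations


def compose(p, q):
    """Product of permutations as tuples: (p * q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation."""
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def comm(h, k):
    """Commutator [h, k] = h^{-1} k^{-1} h k."""
    return compose(compose(inverse(h), inverse(k)), compose(h, k))


def conj(y, g):
    """The conjugate y^{-1} g y."""
    return compose(compose(inverse(y), g), y)


def identities_hold(x, y, z):
    """Check [xy, z] = y^{-1}[x, z]y [y, z] and [x, zy] = [x, y] y^{-1}[x, z]y."""
    first = comm(compose(x, y), z) == compose(conj(y, comm(x, z)), comm(y, z))
    second = comm(x, compose(z, y)) == compose(comm(x, y), conj(y, comm(x, z)))
    return first and second


group = list(permutations(range(4)))
print(all(identities_hold(x, y, z) for x in group for y in group for z in group))
```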

Finally, and most importantly, we will require the following versions of the Baker–Campbell–Hausdorff formula (see [34, (3.2)]). Given $g_{1},g_{2}$ in a nilpotent group $G$ and $n_{1},n_{2}\in\mathbb{N}$, we have

\[g_{1}^{n_{1}}g_{2}^{n_{2}}=g_{2}^{n_{2}}g_{1}^{n_{1}}\prod_{a}g_{a}^{P_{a}(n_{1},n_{2})}\tag{2.1}\]

where $g_{a}$ ranges over all iterated commutators of $g_{1}$ and $g_{2}$ with at least $1$ copy of each and $P_{a}(n_{1},n_{2})\colon\mathbb{Z}\times\mathbb{Z}\to\mathbb{Z}$ is a polynomial in $n_{1}$ and $n_{2}$. Furthermore if $g_{a}$ involves $d_{1}$ copies of $g_{1}$ and $d_{2}$ copies of $g_{2}$ we have that $P_{a}$ has degree at most $d_{1}$ in $n_{1}$ and degree at most $d_{2}$ in $n_{2}$. Here the $a$ have been ordered in some arbitrary manner.
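In a 2-step nilpotent group the only nontrivial iterated commutator in (2.1) is $[g_{1},g_{2}]$ itself, and the formula reduces to $g_{1}^{n_{1}}g_{2}^{n_{2}}=g_{2}^{n_{2}}g_{1}^{n_{1}}[g_{1},g_{2}]^{n_{1}n_{2}}$, so that $P_{a}(n_{1},n_{2})=n_{1}n_{2}$. The following sketch checks this special case in the integer Heisenberg group; the group, its encoding as triples, and all function names are illustrative assumptions and not part of the paper.

```python
def mul(g, h):
    """Product in the Heisenberg group, encoded as triples (a, b, c)."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(g):
    a, b, c = g
    return (-a, -b, -c + a * b)


def comm(h, k):
    """Commutator [h, k] = h^{-1} k^{-1} h k."""
    return mul(mul(inv(h), inv(k)), mul(h, k))


def power(g, n):
    """g^n for a natural number n, by repeated multiplication."""
    result = (0, 0, 0)
    for _ in range(n):
        result = mul(result, g)
    return result


def bch_two_step_holds(g1, g2, n1, n2):
    """Check g1^n1 g2^n2 == g2^n2 g1^n1 [g1, g2]^(n1 * n2)."""
    lhs = mul(power(g1, n1), power(g2, n2))
    rhs = mul(mul(power(g2, n2), power(g1, n1)), power(comm(g1, g2), n1 * n2))
    return lhs == rhs


print(all(bch_two_step_holds((2, 3, 5), (7, 1, 4), n1, n2)
          for n1 in range(6) for n2 in range(6)))
```

Here $P_{a}$ has degree $1$ in each of $n_{1}$ and $n_{2}$, consistent with the degree constraint stated above for a commutator involving one copy of each generator.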

If $G$ is a connected, simply connected nilpotent Lie group, then we denote the Lie algebra of $G$ as $\log G$ and let $\exp\colon\log G\to G$ denote the exponential map while $\log\colon G\to\log G$ is the inverse (the exponential map being a homeomorphism in this situation). When we refer to nilpotent Lie groups, they will henceforth be connected and simply connected. For $g\in G$ and $t\in\mathbb{R}$, we define

\[g^{t}=\exp(t\log g).\]

The Baker–Campbell–Hausdorff formula also implies that

\[\exp(t_{1}\log g_{1}+t_{2}\log g_{2})=g_{1}^{t_{1}}g_{2}^{t_{2}}\prod_{a}g_{a}^{R_{a}(t_{1},t_{2})}\]

where $g_{a}$ ranges over all iterated commutators of $g_{1}$ and $g_{2}$ with at least $1$ copy of each and $R_{a}$ is a polynomial with rational coefficients satisfying identical degree constraints to $P_{a}$. Finally we require the following, most standard, version of the Baker–Campbell–Hausdorff formula, which states that if $X,Y\in\log G$, then

\[\exp(X)\exp(Y)=\exp\Big(X+Y+\frac{1}{2}[X,Y]+\cdots\Big)\]

where the remaining terms in the expansion are iterated commutators in $X$ and $Y$ with all higher terms having at least one “copy” of $X$ and $Y$ within them. In particular, this implies that

\[\exp(-X)\exp(-Y)\exp(X)\exp(Y)=\exp\big([X,Y]+\cdots\big)\tag{2.2}\]

where all higher order terms have at least one copy of $X$ and $Y$ in them and are $r$-fold commutators with $r\geq 3$. In all versions of Baker–Campbell–Hausdorff, it is important for us that nilpotency means these expressions are finite.

2.2. Filtrations

We next require the notion of an ordering and an associated filtration (see [34, Definition 6.7]).

Definition 2.3.

An ordering $I=(I,\preceq,+,0)$ is a set $I$ with a distinguished element $0$, binary operation $+\colon I\times I\to I$, and a partial order $\preceq$ on $I$ such that

  • $+$ is associative and commutative with $0$ acting as an identity element;

  • $\preceq$ has $0$ as the minimal element;

  • For all $i,j,k\in I$, if $i\preceq j$ then $i+k\preceq j+k$;

  • The initial segments $\{i\in I\colon i\preceq d\}$ are finite for all $d$.

We define the following three orderings, with addition being the standard addition:

  • The degree ordering is given by the standard ordering on $\mathbb{N}$, denoted $I=\mathbb{N}$ for short;

  • The degree-rank ordering is given by $\{(d,r)\in\mathbb{N}^{2}\colon 0\leq r\leq d\}$ with the ordering that $(d^{\prime},r^{\prime})\preceq(d,r)$ if $d^{\prime}<d$ or $d^{\prime}=d$ and $r^{\prime}\leq r$, denoted $I=\mathrm{DR}$ for short;

  • The multidegree ordering is given by $\mathbb{N}^{k}$ with $(i_{1}^{\prime},\ldots,i_{k}^{\prime})\preceq(i_{1},\ldots,i_{k})$ when $i_{j}^{\prime}\leq i_{j}$ for all $1\leq j\leq k$, denoted $I=\mathbb{N}^{k}$ for short.
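The axioms of an ordering can be spot-checked on a finite window of the degree-rank and multidegree orderings just listed. The sketch below is only such a finite check, not a proof, and the function names are hypothetical.

```python
def dr_elements(max_d):
    """All degree-rank pairs (d, r) with 0 <= r <= d <= max_d."""
    return [(d, r) for d in range(max_d + 1) for r in range(d + 1)]


def dr_leq(i, j):
    """Degree-rank ordering: compare degrees first, then ranks."""
    (d1, r1), (d2, r2) = i, j
    return d1 < d2 or (d1 == d2 and r1 <= r2)


def multi_leq(i, j):
    """Multidegree ordering: coordinatewise comparison."""
    return all(a <= b for a, b in zip(i, j))


def add(i, j):
    """Standard (coordinatewise) addition."""
    return tuple(a + b for a, b in zip(i, j))


def initial_segment(elements, leq, top):
    """Elements of the given finite window lying below top."""
    return [i for i in elements if leq(i, top)]


def is_translation_monotone(elements, leq):
    """Check that i <= j implies i + k <= j + k on the finite window."""
    return all(leq(add(i, k), add(j, k))
               for i in elements for j in elements if leq(i, j)
               for k in elements)


# Every pair below (2, 1) has degree at most 2, so this window captures
# the whole initial segment.
print(initial_segment(dr_elements(4), dr_leq, (2, 1)))
print(is_translation_monotone(dr_elements(3), dr_leq))
```

One difference visible in such experiments is that the degree-rank ordering is a total order, while in the multidegree ordering elements such as $(1,0)$ and $(0,1)$ are incomparable.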

An $I$-filtration of $G$ is a collection of subgroups $G_{I}=(G_{i})_{i\in I}$ such that $G_{0}=G$ and:

  • (Nesting) If $i,j\in I$ are such that $i\preceq j$ then $G_{i}\geqslant G_{j}$;

  • (Commutator) For $i,j\in I$, we have $[G_{i},G_{j}]\leqslant G_{i+j}$.

We say that a filtered group $G$ has degree $\leq d$ (for $d\in I$) if $G_{i}$ is trivial for $i\not\preceq d$. $G$ has degree $\subseteq J$ for a downset $J$ if $G_{i}$ is trivial whenever $i\notin J$.

Note that the commutator condition implies nested subgroups are normal within each other. We next define degree, degree-rank, and multidegree filtrations.

Definition 2.4.

Given $d\in\mathbb{N}$, we say a group $G$ is given a degree filtration of degree $d$ if:

  • $G$ is given a $\mathbb{N}$-filtration $(G_{i})_{i\in\mathbb{N}}$ with degree $\leq d$;

  • $G_{0}=G_{1}$.

Given $(d,r)\in\mathbb{N}^{2}$ with $0\leq r\leq d$, $G$ is given a degree-rank filtration of degree-rank $(d,r)$ if:

  • $G$ is given a $\mathrm{DR}$-filtration $(G_{i})_{i\in\mathrm{DR}}$ with degree $\leq(d,r)$;

  • $G_{(0,0)}=G_{(1,0)}$ and $G_{(i,0)}=G_{(i,1)}$ for $i\geq 1$. (We also let $G_{(i,j)}=G_{(i+1,0)}$ for $j>i$.)

The associated degree filtration with respect to this degree-rank filtration is $(G_{(i,0)})_{i\geq 0}$.

Given $(d_{1},\ldots,d_{k})\in\mathbb{N}^{k}$, $G$ is given a multidegree filtration of multidegree $J$ (where $J\subseteq\mathbb{N}^{k}$ is a downset) if:

  • $G$ is given a $\mathbb{N}^{k}$-filtration $(G_{i})_{i\in\mathbb{N}^{k}}$ with degree $\subseteq J$;

  • $G_{\vec{0}}=\bigvee_{i=1}^{k}G_{\vec{e_{i}}}$.

The associated degree filtration with respect to the multidegree filtration is $(\bigvee_{|\vec{i}|=i}G_{\vec{i}})_{i\geq 0}$. Here $|\vec{i}|=i_{1}+\ldots+i_{k}$.

Remark.

This definition imposes some additional equalities of subgroups in order to say a group is given a degree-rank filtration versus a $\mathrm{DR}$-filtration (for example). In particular, the concept of “degree-rank” filtration and $\mathrm{DR}$-filtration are distinct. The difference is minor, but causes a number of technical checks to be required, most notably in Appendix C. We will almost exclusively operate with these additional conditions; this is so that we can invoke equidistribution theory safely.

We now define polynomial sequences of an $I$-filtered group. The notion of a polynomial sequence for a group $G$ given a degree-rank filtration will be the same as treating this ordering as a $\mathrm{DR}$-filtration; the same applies for degree and multidegree filtrations.

Definition 2.5.

Given $g\colon H\to G$ a map between groups (not necessarily a homomorphism) and $h\in H$, we define the derivative $\partial_{h}g\colon H\to G$ via $\partial_{h}g(n)=g(hn)g(n)^{-1}$ for all $n\in H$. If $H,G$ are $I$-filtered, we say that this map $g$ is polynomial if for all $m\geq 0$ and $i_{1},\ldots,i_{m}\in I$, we have

\[\partial_{h_{1}}\cdots\partial_{h_{m}}g(n)\in G_{i_{1}+\cdots+i_{m}}\]

for all choices of $h_{j}\in H_{i_{j}}$ and $n\in H_{0}$. The space of all polynomial maps with respect to this data is denoted $\operatorname{poly}(H_{I}\to G_{I})$.
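The derivative condition above can be tested numerically in a small example. The sketch below takes $H=\mathbb{Z}$ (written additively, so $\partial_{h}g(n)=g(n+h)g(n)^{-1}$) and $G$ the integer Heisenberg group with the filtration $G_{0}=G_{1}=G$, $G_{2}$ the center, and $G_{i}$ trivial for $i\geq 3$. With the domain filtration $H_{0}=H_{1}=\mathbb{Z}$ and $H_{i}=\{0\}$ for $i\geq 2$, the nontrivial content of the condition is that $m$-fold derivatives land in $G_{m}$, which is what is checked. The example group, the finite window of test values, and all function names are assumptions made for illustration; a finite check of this kind is evidence, not a proof.

```python
from itertools import product


def mul(g, h):
    """Product in the Heisenberg group, encoded as triples (a, b, c)."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(g):
    a, b, c = g
    return (-a, -b, -c + a * b)


def power(g, n):
    """g^n for any integer n."""
    if n < 0:
        g, n = inv(g), -n
    result = (0, 0, 0)
    for _ in range(n):
        result = mul(result, g)
    return result


def derivative(f, h):
    """The map n -> f(n + h) f(n)^{-1} on the additive group Z."""
    return lambda n: mul(f(n + h), inv(f(n)))


def in_filtration(g, i):
    """Membership in G_i: G_0 = G_1 = G, G_2 = center, G_i trivial for i >= 3."""
    if i <= 1:
        return True
    if i == 2:
        return g[0] == 0 and g[1] == 0
    return g == (0, 0, 0)


def looks_polynomial(f, max_order=3, window=range(-2, 3)):
    """Check that m-fold derivatives land in G_m on a finite window."""
    for m in range(max_order + 1):
        for shifts in product(window, repeat=m):
            d = f
            for h in shifts:
                d = derivative(d, h)
            if not all(in_filtration(d(n), m) for n in window):
                return False
    return True


x, y = (1, 0, 0), (0, 1, 0)
print(looks_polynomial(lambda n: mul(power(x, n), power(y, n))))
print(looks_polynomial(lambda n: power(x, n ** 3)))
```

The first sequence $n\mapsto x^{n}y^{n}$ passes the check, while $n\mapsto x^{n^{3}}$ fails it because its second derivatives do not lie in the center.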

We will require various general properties of polynomial sequences established in [34, Appendix B]. We will only consider $H=\mathbb{Z}^{k}$ for $k\geq 1$ and the following $I$-filtrations on $H$.

Definition 2.6.

We define the following filtrations on $H=\mathbb{Z}^{k}$:

  • The (domain) degree filtration is with $I=\mathbb{N}$ the degree ordering and $H_{0}=H_{1}=\mathbb{Z}^{k}$, and $H_{i}=\{0\}$ for $i\geq 2$;

  • The (domain) multidegree filtration is with $I=\mathbb{N}^{k}$ the multidegree ordering, $H_{\vec{0}}=\mathbb{Z}^{k}$, $H_{\vec{e}_{i}}=\mathbb{Z}\vec{e}_{i}$ for $i\in[k]$, and $H_{\vec{v}}=\{0\}$ otherwise, where $\vec{e}_{i}$ forms the standard basis of $\mathbb{Z}^{k}$;

  • The (domain) degree-rank filtration is with $I=\mathrm{DR}$ the degree-rank ordering and $H_{(0,0)}=H_{(1,0)}=\mathbb{Z}^{k}$ and $H_{(d,r)}=\{0\}$ otherwise.

We now define the notion of a nilmanifold, which is essentially a compact quotient of a filtered nilpotent Lie group.

Definition 2.7.

We define an $I$-filtered nilmanifold $G/\Gamma$ to be the data of a connected, simply connected nilpotent Lie group $G$ with $I$-filtration (of Lie subgroups) and discrete cocompact subgroup $\Gamma\leqslant G$ which is rational with respect to $G_{I}$ (i.e., $\Gamma_{i}:=\Gamma\cap G_{i}$ is cocompact in $G_{i}$ for all $i\in I$). We say it has degree $\leq d$ or $\subseteq J$ if $G$ has degree $\leq d$ or $\subseteq J$.

If $I=\mathbb{N}$ and the $I$-filtration is furthermore a degree filtration with degree $\leq d$, then $G/\Gamma$ is a degree $d$ nilmanifold. If $I=\mathrm{DR}$ and the $I$-filtration is furthermore a degree-rank filtration with degree $\leq(d,r)$, then $G/\Gamma$ is a degree-rank $(d,r)$ nilmanifold. Finally if $I=\mathbb{N}^{k}$ and the $I$-filtration is furthermore a multidegree filtration with degree $\subseteq J$, then $G/\Gamma$ is a multidegree $J$ nilmanifold.

Remark.

Note that $\Gamma$ can naturally be given the structure of an $I$-filtered group $\Gamma_{I}$.

We finally (very occasionally) will require the lower central series of a group $G$.

Definition 2.8.

Given a nilpotent group $G$, define the lower central series inductively via $G_{(0)}=G_{(1)}=G$ and $G_{(i+1)}=[G,G_{(i)}]$. The step of $G$ is the minimal $j$ such that $G_{(j+1)}=\mathrm{Id}_{G}$.
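For a finite nilpotent group the lower central series and the step can be computed directly from Definition 2.8. The sketch below does this for the dihedral group of order 16, realized as permutations of $\{0,\ldots,7\}$; this example group and all function names are illustrative assumptions (the paper itself works with Lie groups, where such an enumeration is not available).

```python
def compose(p, q):
    """Product of permutations as tuples: (p * q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)


def commutator(h, k):
    """[h, k] = h^{-1} k^{-1} h k."""
    return compose(compose(inverse(h), inverse(k)), compose(h, k))


def generated_subgroup(gens, n):
    """Subgroup of S_n generated by gens (closure under right multiplication)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)


def lower_central_series(group, n):
    """Return [G_(1), G_(2), ...] down to the trivial group."""
    series = [frozenset(group)]
    while len(series[-1]) > 1:
        comms = {commutator(g, h) for g in group for h in series[-1]}
        nxt = generated_subgroup(comms, n)
        if nxt == series[-1]:
            raise ValueError("group is not nilpotent")
        series.append(nxt)
    return series


rotation = tuple((i + 1) % 8 for i in range(8))
reflection = tuple((-i) % 8 for i in range(8))
dihedral = generated_subgroup([rotation, reflection], 8)
series = lower_central_series(dihedral, 8)
step = len(series) - 1  # minimal j with G_(j+1) trivial
print([len(term) for term in series], step)
```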

2.3. Horizontal tori and Taylor coefficients

The next notion, that of a horizontal character, plays a vital role when discussing the equidistribution of nilsequences.

Definition 2.9.

Given a connected, simply connected nilpotent group $G$ and a discrete, cocompact subgroup $\Gamma$, a horizontal character $\eta$ is a continuous homomorphism $\eta\colon G\to\mathbb{R}$ such that $\eta(\Gamma)\subseteq\mathbb{Z}$. We say a horizontal character is nontrivial when $\eta$ is not identically zero.

Remark.

Throughout the literature on nilmanifolds, horizontal characters are continuous homomorphisms $\eta\colon G\to\mathbb{R}/\mathbb{Z}$ such that $\eta$ annihilates $\Gamma$. It is straightforward to prove (via using Mal’cev bases) that these two notions are identical up to taking $\mathrm{mod}~1$. The reason we operate with the above definition is that the kernel of $\eta$ as defined is then a subspace of $G/[G,G]\simeq\mathbb{R}^{\dim(G)-\dim([G,G])}$.
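A concrete instance may help fix ideas. In the Heisenberg group, with $\Gamma$ the subgroup of integer points, the maps $\eta(a,b,c)=k_{1}a+k_{2}b$ with $k_{1},k_{2}\in\mathbb{Z}$ are horizontal characters in the sense of Definition 2.9. The sketch below checks the homomorphism property and integrality on sample points using exact rational arithmetic; the example group, the encoding as triples, and the names `mul` and `make_eta` are assumptions made for illustration.

```python
from fractions import Fraction


def mul(g, h):
    """Product in the Heisenberg group, encoded as triples (a, b, c)."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)


def make_eta(k1, k2):
    """The map (a, b, c) -> k1 * a + k2 * b for integers k1, k2."""
    return lambda g: k1 * g[0] + k2 * g[1]


eta = make_eta(2, -3)
g = (Fraction(1, 2), Fraction(3, 4), Fraction(5, 7))
h = (Fraction(-2, 3), Fraction(1, 5), Fraction(9, 2))

# Homomorphism property on sample points, and integrality on an integer point.
print(eta(mul(g, h)) == eta(g) + eta(h))
print(eta((4, 1, -7)))
```

Such a map depends only on the first two coordinates, that is, on the image of the point in $G/[G,G]\simeq\mathbb{R}^{2}$, in line with the dimension count $\dim(G)-\dim([G,G])=3-1$.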

We next require the notion of horizontal tori with respect to a degree-rank filtration. These tori will play a starring role in Sections 8, 9, and 10; our definition is exactly that of [34, Definition 9.6].

Definition 2.10.

Let $G$ be a degree-rank filtered nilpotent Lie group with filtration $G_{\mathrm{DR}}=(G_{(d,r)})_{(d,r)\in\mathrm{DR}}$. Given a subgroup $\Gamma$ of $G$, we define various horizontal tori for $i\geq 1$ as

\begin{align*}
\operatorname{Horiz}_{i}(G)&:=G_{(i,1)}/G_{(i,2)},\\
\operatorname{Horiz}_{i}(\Gamma)&:=(\Gamma\cap G_{(i,1)})/(\Gamma\cap G_{(i,2)}),\\
\operatorname{Horiz}_{i}(G/\Gamma)&:=\operatorname{Horiz}_{i}(G)/\operatorname{Horiz}_{i}(\Gamma).
\end{align*}

Given a polynomial sequence $g\in\operatorname{poly}(\mathbb{Z}_{\mathrm{DR}}\to G_{\mathrm{DR}})$ we define the $i$-th horizontal Taylor coefficient to be

\begin{align*}
\operatorname{Taylor}_{i}(g)&:=\partial_{1}\cdots\partial_{1}g(n)~\mathrm{mod}~G_{(i,2)}\in\operatorname{Horiz}_{i}(G),\\
\operatorname{Taylor}_{i}(g\Gamma)&:=\operatorname{Taylor}_{i}(g)~\mathrm{mod}~\operatorname{Horiz}_{i}(\Gamma)\in\operatorname{Horiz}_{i}(G/\Gamma),
\end{align*}

where we take $i$ iterated derivatives.

We also require the notion of ii-th horizontal characters.

Definition 2.11.

Consider a nilmanifold G/ΓG/\Gamma with a degree-rank filtration. A continuous homomorphism η:G(i,1)\eta\colon G_{(i,1)}\to\mathbb{R} is an ii-th horizontal character if η(G(i,2))=0\eta(G_{(i,2)})=0 and η(G(i,1)Γ)\eta(G_{(i,1)}\cap\Gamma)\subseteq\mathbb{Z}.

The name Taylor coefficient is also used in the context of Taylor coefficients of polynomial factorizations. The following elementary lemma relates these two notions; we remark that a very closely related proof appears in [26, Lemma A.8].

Lemma 2.12.

Let GG be given a degree-rank filtration of degree-rank (d,r)(d,r) and consider a sequence gpoly(DRGDR)g\in\operatorname{poly}(\mathbb{Z}_{\mathrm{DR}}\to G_{\mathrm{DR}}). Then we may write g(n)=i=0dgi(ni)g(n)=\prod_{i=0}^{d}g_{i}^{\binom{n}{i}} for elements giG(i,0)g_{i}\in G_{(i,0)} and for 1id1\leq i\leq d we have

Taylori(g)=gimodG(i,2).\operatorname{Taylor}_{i}(g)=g_{i}~{}\mathrm{mod}~{}G_{(i,2)}.
Proof.

The representation of g(n)g(n) in the specified product form follows immediately from the existence of Taylor expansion, see [34, Lemma B.9].

We next prove Taylorj(g)=gjmodG(j,2)\operatorname{Taylor}_{j}(g)=g_{j}~{}\mathrm{mod}~{}G_{(j,2)} for each 1jd1\leq j\leq d individually. Notice that it suffices to consider g~(n)\widetilde{g}(n) which is g(n)modG(j,2)g(n)~{}\mathrm{mod}~{}G_{(j,2)}, i.e., we consider the group G/G(j,2)G/G_{(j,2)} with quotiented filtration. This group is easily seen to be at most jj-step nilpotent and furthermore [G(i,1)/G(j,2),G(ji,1)/G(j,2)]=IdG/G(j,2)[G_{(i,1)}/G_{(j,2)},G_{(j-i,1)}/G_{(j,2)}]=\mathrm{Id}_{G/G_{(j,2)}} for 0ij0\leq i\leq j (one should check the cases j=1j=1 and i{0,j}i\in\{0,j\} manually). Let G~i=G(i,1)/G(j,2)\widetilde{G}_{i}=G_{(i,1)}/G_{(j,2)} for 0ij0\leq i\leq j and note G~0=G~1\widetilde{G}_{0}=\widetilde{G}_{1}.

We see that G~0G~j\widetilde{G}_{0}\geqslant\cdots\geqslant\widetilde{G}_{j} is an \mathbb{N}-filtration for G~0\widetilde{G}_{0} with [G~i,G~ji]=IdG~0[\widetilde{G}_{i},\widetilde{G}_{j-i}]=\mathrm{Id}_{\widetilde{G}_{0}} for all 0ij0\leq i\leq j. Note that g~(n)=i=0jg~i(ni)\widetilde{g}(n)=\prod_{i=0}^{j}\widetilde{g}_{i}^{\binom{n}{i}} where g~i\widetilde{g}_{i} is gimodG(j,2)g_{i}~{}\mathrm{mod}~{}G_{(j,2)}.

It suffices to prove the claim that g~(n+1)g~(n)1=i=0j1(g~i)(ni)\widetilde{g}(n+1)\widetilde{g}(n)^{-1}=\prod_{i=0}^{j-1}(\widetilde{g}_{i}^{\prime})^{\binom{n}{i}} with g~iG~i+1\widetilde{g}_{i}^{\prime}\in\widetilde{G}_{i+1} and g~j1=g~j\widetilde{g}_{j-1}^{\prime}=\widetilde{g}_{j}. If this is the case, then we may modify the filtration G~0G~1G~j\widetilde{G}_{0}\geqslant\widetilde{G}_{1}\geqslant\cdots\geqslant\widetilde{G}_{j} by stripping off the top group, which maintains the necessary inductive properties. Iterating this procedure jj times we obtain the desired Taylor equality.

This claim is a consequence of the Taylor expansion for general polynomial sequences, the Baker–Campbell–Hausdorff formula, and counting the depths of nested commutators. The crucial reason that $\widetilde{g}_{j-1}^{\prime}=\widetilde{g}_{j}$ is that any “higher order” terms which arise in the Baker–Campbell–Hausdorff formula and could contribute are in fact annihilated due to $[\widetilde{G}_{i},\widetilde{G}_{j-i}]=\mathrm{Id}_{\widetilde{G}_{0}}$ for $0\leq i\leq j$. ∎

We also have the following linearity of the ii-th Taylor coefficients.

Lemma 2.13.

Assume the setup of Lemma 2.12. We have

Taylori(gh)=Taylori(g)+Taylori(h)\operatorname{Taylor}_{i}(gh)=\operatorname{Taylor}_{i}(g)+\operatorname{Taylor}_{i}(h)

and if

g(n)=exp(i=0dgi(ni))g(n)=\exp\bigg{(}\sum_{i=0}^{d}g_{i}\binom{n}{i}\bigg{)}

for gilog(Gi)g_{i}\in\log(G_{i}) we have

Taylori(g)=exp(gi)modG(i,2).\operatorname{Taylor}_{i}(g)=\exp(g_{i})~{}\mathrm{mod}~{}G_{(i,2)}.
Remark 2.14.

Note that G(i,1)/G(i,2)G_{(i,1)}/G_{(i,2)} is abelian and hence additive notation may be used when considering Taylor coefficients.

Proof.

The first claim follows from Lemma 2.12, the Baker–Campbell–Hausdorff formula, and the commutator relationship that [G(i,0),G(ji,0)]=[G(i,1),G(ji,1)]G(j,2)[G_{(i,0)},G_{(j-i,0)}]=[G_{(i,1)},G_{(j-i,1)}]\subseteq G_{(j,2)}. (Note that this is using G(0,0)=G(0,1)=G(1,0)G_{(0,0)}=G_{(0,1)}=G_{(1,0)} in the case i=0i=0.)

For the second claim, suppose that

\[g(n)=\exp\bigg(\sum_{i=0}^{d}g_{i}\binom{n}{i}\bigg)=\prod_{i=0}^{d}(g_{i}^{\prime})^{\binom{n}{i}}.\]

Via iterated applications of the Baker–Campbell–Hausdorff formula and the commutator relationship that [G(i,0),G(ji,0)]=[G(i,1),G(ji,1)]G(j,2)[G_{(i,0)},G_{(j-i,0)}]=[G_{(i,1)},G_{(j-i,1)}]\subseteq G_{(j,2)}, we see that gj=exp(gj)modG(j,2)g_{j}^{\prime}=\exp(g_{j})~{}\mathrm{mod}~{}G_{(j,2)} and the result follows. ∎

2.4. Vertical tori and nilcharacters

Given a polynomial sequence gg on an II-filtered Lie group with I=I=\mathbb{N}, we can define a sequence of vectors by considering a smooth vector-valued function FF on G/ΓG/\Gamma and looking at F(g(n)Γ)F(g(n)\Gamma). However, we will be particularly interested in those which “have a Fourier coefficient” with respect to various subgroups of the center.

Definition 2.15.

Consider a nilmanifold G/ΓG/\Gamma and a function F:G/ΓF\colon G/\Gamma\to\mathbb{C}. Given a connected, simply connected subgroup TT of the center Z(G)Z(G) which is rational (i.e., ΓT\Gamma\cap T is cocompact in TT) and a continuous homomorphism η:T\eta\colon T\to\mathbb{R} such that η(TΓ)\eta(T\cap\Gamma)\subseteq\mathbb{Z}, if

F(gx)=e(η(g))F(x) for all gTF(gx)=e(\eta(g))F(x)\emph{ for all }g\in T

we say that FF has a TT-vertical character (or TT-vertical frequency) η\eta.

Remark.

Note that T/(ΓT)T/(\Gamma\cap T) is isomorphic to a torus and thus one can modify functions under consideration to have vertical characters via appropriate Fourier decomposition.

A particular case which will arise frequently in our applications comes from the fact that given a filtration satisfying the conditions of Definition 2.4, we have that the “bottom group” is contained in the center. For example, a group GG given a degree filtration of degree dd satisfies [G,Gd]=[G1,Gd]=IdG[G,G_{d}]=[G_{1},G_{d}]=\mathrm{Id}_{G} hence GdZ(G)G_{d}\leqslant Z(G). One special class of functions with a vertical frequency which will be of particular importance is that of nilcharacters.

Definition 2.16.

A nilcharacter of degree $d$ and output dimension $D$ is the following data. Consider an $I$-filtered nilmanifold $G/\Gamma$ of degree $d$ such that $[G,G_{d}]=\mathrm{Id}_{G}$ and an $I$-filtered abelian group $H$. Let $g\in\operatorname{poly}(H_{I}\to G_{I})$ and consider a function $F\colon G/\Gamma\to\mathbb{C}^{D}$ such that:

  • F(x)2=1\lVert F(x)\rVert_{2}=1 for all xG/Γx\in G/\Gamma pointwise;

  • F(gdx)=e(η(gd))F(x)F(g_{d}x)=e(\eta(g_{d}))F(x) for all gdGdg_{d}\in G_{d} where η\eta is some continuous homomorphism GdG_{d}\to\mathbb{R} such that η(ΓGd)\eta(\Gamma\cap G_{d})\subseteq\mathbb{Z}.

The values of the nilcharacter are given by χ:HD\chi\colon H\to\mathbb{C}^{D} where χ(n)=F(g(n)Γ)\chi(n)=F(g(n)\Gamma) for nHn\in H.

Remark.

We work with vector-valued nilcharacters for precisely the same topological reason given in [34, p. 1254].

2.5. Additional miscellaneous conventions

We end this section with a brief discussion of various miscellaneous conventions. Throughout the paper we use {}\{\cdot\} to denote the map (1/2,1/2]\mathbb{R}\to(-1/2,1/2] (or /(1/2,1/2]\mathbb{R}/\mathbb{Z}\to(-1/2,1/2], abusively) which takes the representative mod1~{}\mathrm{mod}~{}1 closest to 0. Furthermore given x/x\in\mathbb{R}/\mathbb{Z} and yy\in\mathbb{R} we will treat xy/x-y\in\mathbb{R}/\mathbb{Z} in the obvious manner. As used above, we let e:/e\colon\mathbb{R}/\mathbb{Z}\to\mathbb{C} denote the exponential function e(x)=exp(2πix)e(x)=\exp(2\pi ix), which is lifted to \mathbb{R} in the obvious manner.
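As a small illustration of these two conventions, the following Python sketch implements the map $\{\cdot\}$ and the phase $e(\cdot)$. The function names `centered_frac` and `e` are chosen here for illustration only and are not notation from the paper.

```python
import cmath
import math
from fractions import Fraction


def centered_frac(x):
    """Return the representative of x mod 1 lying in (-1/2, 1/2]."""
    # Subtracting ceil(x - 1/2) moves x into the half-open interval (-1/2, 1/2].
    return x - math.ceil(x - Fraction(1, 2))


def e(x):
    """Return e(x) = exp(2*pi*i*x); the value depends only on x mod 1."""
    return cmath.exp(2j * cmath.pi * float(x))
```

Passing `Fraction` inputs to `centered_frac` keeps the arithmetic exact, which matters at the boundary points $\pm 1/2$, both of which are sent to $1/2$.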

We use standard asymptotic notation. Given functions f=f(n)f=f(n) and g=g(n)g=g(n), we write f=O(g)f=O(g), fgf\ll g, g=Ω(f)g=\Omega(f), or gfg\gg f to mean that there is a constant CC such that |f(n)|Cg(n)|f(n)|\leq Cg(n) for sufficiently large nn. We write fgf\asymp g or f=Θ(g)f=\Theta(g) to mean that fgf\ll g and gfg\ll f, and write f=o(g)f=o(g) or g=ω(f)g=\omega(f) to mean f(n)/g(n)0f(n)/g(n)\to 0 as nn\to\infty. Subscripts indicate dependence on parameters.

Finally in various arguments throughout the paper it will be convenient to denote appropriately bounded functions as b(n)b(n) or b(n1,,nk)b(n_{1},\ldots,n_{k}), and B(n),B(n1,,nk)B(n),B(n_{1},\ldots,n_{k}) when vector-valued. When using such notation, the functions b,Bb,B may change from line to line and within a line may refer to different functions.

3. Various complexity notions

3.1. Rationality of bases and Lipschitz norms

We will now discuss the definitions chosen for complexity of nilmanifolds. We start by defining first- and second-kind coordinates given a basis 𝒳\mathcal{X} for logG\log G.

Definition 3.1.

Consider a connected, simply connected nilpotent Lie group GG of dimension dd. Given a basis 𝒳={X1,,Xd}\mathcal{X}=\{X_{1},\ldots,X_{d}\} of logG\log G and gGg\in G, there exists (t1,,td)d(t_{1},\ldots,t_{d})\in\mathbb{R}^{d} such that

g=exp(t1X1+t2X2++tdXd).g=\exp(t_{1}X_{1}+t_{2}X_{2}+\cdots+t_{d}X_{d}).

We define Mal’cev coordinates of first-kind ψexp=ψexp,𝒳:Gd\psi_{\exp}=\psi_{\exp,\mathcal{X}}\colon G\to\mathbb{R}^{d} for gg relative to 𝒳\mathcal{X} by

ψexp(g):=(t1,,td).\psi_{\exp}(g):=(t_{1},\ldots,t_{d}).

Given gGg\in G there also exists (u1,,ud)d(u_{1},\ldots,u_{d})\in\mathbb{R}^{d} such that

g=exp(u1X1)exp(udXd),g=\exp(u_{1}X_{1})\cdots\exp(u_{d}X_{d}),

and we define the Mal’cev coordinates of second-kind ψ=ψ𝒳:Gd\psi=\psi_{\mathcal{X}}\colon G\to\mathbb{R}^{d} for gg relative to 𝒳\mathcal{X} by

ψ(g):=(u1,,ud).\psi(g):=(u_{1},\ldots,u_{d}).
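To make the two coordinate systems concrete, here is a minimal Python sketch for the three-dimensional Heisenberg group of upper unitriangular matrices, which is not an example taken from the paper. We assume the basis $X_{1}=E_{12}$, $X_{2}=E_{23}$, $X_{3}=E_{13}$ of elementary matrices and store a group element by its matrix entries $(x,y,z)$ in positions $(1,2)$, $(2,3)$, $(1,3)$; all function names are hypothetical.

```python
from fractions import Fraction


def multiply(g, h):
    """Matrix product of two unitriangular 3x3 matrices stored as (x, y, z)."""
    x1, y1, z1 = g
    x2, y2, z2 = h
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)


def from_first_kind(t):
    """exp(t1*X1 + t2*X2 + t3*X3); the series stops after the quadratic term."""
    t1, t2, t3 = t
    return (t1, t2, t3 + t1 * t2 * Fraction(1, 2))


def first_kind(g):
    """Mal'cev coordinates of first kind of g."""
    x, y, z = g
    return (x, y, z - x * y * Fraction(1, 2))


def from_second_kind(u):
    """exp(u1*X1) * exp(u2*X2) * exp(u3*X3)."""
    u1, u2, u3 = u
    return multiply(multiply((u1, 0, 0), (0, u2, 0)), (0, 0, u3))


def second_kind(g):
    """Mal'cev coordinates of second kind of g."""
    x, y, z = g
    return (x, y, z - x * y)
```

In this example the integer matrix with $(x,y,z)=(1,1,0)$ has second-kind coordinates $(1,1,-1)$ but first-kind coordinates $(1,1,-1/2)$, so the two systems can differ in integrality on the lattice of integer matrices.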

Note that the above definition does not account for the cocompact subgroup Γ\Gamma. The next set of definitions account for how “rational” 𝒳\mathcal{X} is with respect to itself and Γ\Gamma.

Definition 3.2.

The height of a number xx is max(|a|,|b|)\max(|a|,|b|) if x=a/bx=a/b with gcd(a,b)=1\gcd(a,b)=1 and \infty if xx is irrational.
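For rational inputs the height can be computed directly. The sketch below uses Python's `fractions.Fraction`, which stores rationals in lowest terms; the function name `height` is ours, and irrational numbers (of height $\infty$) are outside its scope.

```python
from fractions import Fraction


def height(x):
    """Height max(|a|, |b|) of a rational x = a/b written in lowest terms."""
    x = Fraction(x)  # Fraction normalises so that gcd(a, b) = 1 and b > 0
    return max(abs(x.numerator), abs(x.denominator))
```

For example, $-6/4=-3/2$ has height $3$, and $0=0/1$ has height $1$.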

Definition 3.3.

Given a nilmanifold G/ΓG/\Gamma of dimension dd, consider a basis 𝒳={X1,,Xd}\mathcal{X}=\{X_{1},\ldots,X_{d}\} of 𝔤=logG\mathfrak{g}=\log G. 𝒳\mathcal{X} is said to be a weak basis of rationality QQ with respect to Γ\Gamma if:

  • There exist rationals cijkc_{ijk} of height at most QQ such that

    [Xi,Xj]=kcijkXk;[X_{i},X_{j}]=\sum_{k}c_{ijk}X_{k};
  • There exists integer 1qQ1\leq q\leq Q such that

    qdψexp,𝒳(Γ)q1d.q\cdot\mathbb{Z}^{d}\subseteq\psi_{\mathrm{exp},\mathcal{X}}(\Gamma)\subseteq q^{-1}\cdot\mathbb{Z}^{d}.

𝒳\mathcal{X} is a Mal’cev basis of logG\log G with respect to Γ\Gamma of rationality QQ if:

  • There exist rationals cijkc_{ijk} of height at most QQ such that

    [Xi,Xj]=kcijkXk;[X_{i},X_{j}]=\sum_{k}c_{ijk}X_{k};
  • ψ𝒳(Γ)=d\psi_{\mathcal{X}}(\Gamma)=\mathbb{Z}^{d}.

We say that $\mathcal{X}$ has the degree $k$ nesting property if there exist $\ell_{1}\leq\cdots\leq\ell_{k}$ such that, setting $\mathfrak{g}_{t}=\operatorname{span}_{\mathbb{R}}(X_{\ell_{t}+1},\ldots,X_{d})$, we have $[\mathfrak{g},\mathfrak{g}]\subseteq\mathfrak{g}_{1}$, $[\mathfrak{g},\mathfrak{g}_{t}]\subseteq\mathfrak{g}_{t+1}$ for $1\leq t<k$, and $[\mathfrak{g},\mathfrak{g}_{k}]=0$.

Finally we say that a Mal’cev basis is adapted to a sequence of nested subgroups $G=G_{0}\geqslant G_{1}\geqslant G_{2}\geqslant\cdots\geqslant G_{\ell}\geqslant\mathrm{Id}_{G}$ if

span({Xj:ddim(Gi)<jd})=logGi\operatorname{span}_{\mathbb{R}}(\{X_{j}\colon d-\dim(G_{i})<j\leq d\})=\log G_{i}

for 1i1\leq i\leq\ell.

We next state the definition of the Lipschitz property for a function on G/ΓG/\Gamma.

Definition 3.4.

We define a metric d=dG,𝒳d=d_{G,\mathcal{X}} on GG by

d(x,y):=inf{i=1nmin(ψ(xixi+11),ψ(xi+1xi1)):n,x1,,xn+1G,x1=x,xn+1=y},d(x,y):=\inf\bigg{\{}\sum_{i=1}^{n}\min(\lVert\psi(x_{i}x_{i+1}^{-1})\rVert,\lVert\psi(x_{i+1}x_{i}^{-1})\rVert)\colon n\in\mathbb{N},x_{1},\ldots,x_{n+1}\in G,x_{1}=x,x_{n+1}=y\bigg{\}},

where $\lVert\cdot\rVert$ denotes the $\ell^{\infty}$-norm on $\mathbb{R}^{\dim(G)}$, and define a metric on $G/\Gamma$ by

d(xΓ,yΓ)=infγ,γΓd(xγ,yγ).d(x\Gamma,y\Gamma)=\inf_{\gamma,\gamma^{\prime}\in\Gamma}d(x\gamma,y\gamma^{\prime}).

Furthermore, for any function F:G/ΓF\colon G/\Gamma\to\mathbb{C} we define

FLip:=F+supxyG/Γ|F(x)F(y)|d(x,y).\lVert F\rVert_{\mathrm{Lip}}:=\lVert F\rVert_{\infty}+\sup_{x\neq y\in G/\Gamma}\frac{|F(x)-F(y)|}{d(x,y)}.

Given a function F:G/ΓDF\colon G/\Gamma\to\mathbb{C}^{D} such that F=(F1,,FD)F=(F_{1},\ldots,F_{D}) we define

FLip:=max1iDFiLip.\lVert F\rVert_{\mathrm{Lip}}:=\max_{1\leq i\leq D}\lVert F_{i}\rVert_{\mathrm{Lip}}.
Remark.

Note that the metric on GG is right-invariant. We may omit the subscript 𝒳\mathcal{X} for the distance function when clear from context.

3.2. Complexity of nilmanifolds

We now define the complexity of a nilmanifold with respect to either a degree or a degree-rank filtration.

Definition 3.5.

Let s1s\geq 1 be an integer and let M1M\geq 1. A nilmanifold G/ΓG/\Gamma of degree ss, dimension dd, and complexity at most MM consists of a degree ss filtration of GG along with a Mal’cev basis 𝒳={X1,,Xd}\mathcal{X}=\{X_{1},\ldots,X_{d}\} of logG\log G which satisfies the following:

  • {X1,,Xd}\{X_{1},\ldots,X_{d}\} is a Mal’cev basis for logG\log G with respect to Γ\Gamma of rationality at most MM;

  • 𝒳\mathcal{X} is adapted to the sequence of subgroups (Gi)i(G_{i})_{i\in\mathbb{N}}.

Analogously a nilmanifold G/ΓG/\Gamma of degree-rank (s,r)(s,r), dimension dd, and complexity at most MM consists of a degree-rank (s,r)(s,r) filtration of GG along with a Mal’cev basis 𝒳={X1,,Xd}\mathcal{X}=\{X_{1},\ldots,X_{d}\} of logG\log G which satisfies the following:

  • {X1,,Xd}\{X_{1},\ldots,X_{d}\} is a Mal’cev basis for logG\log G with respect to Γ\Gamma of rationality at most MM;

  • 𝒳\mathcal{X} is adapted to the sequence of subgroups (Gi)iDR(G_{i})_{i\in\mathrm{DR}}.

Remark.

The only difference in complexity for a degree versus degree-rank filtration is that we require the Mal’cev basis to be adapted with respect to the appropriate filtration. This definition unfortunately does not extend to the case of multidegree filtrations since the set of subgroups do not nest in a total order. Furthermore note that a degree-rank nilmanifold of complexity MM is also a degree nilmanifold of the same complexity by taking the associated degree filtration.

Finally, whenever discussing the complexity of nilmanifolds, this is always with respect to a given Mal’cev basis 𝒳\mathcal{X}. We will abusively write phrases such as “nilmanifold G/ΓG/\Gamma of complexity MM” throughout the paper; such a statement should always be understood with a corresponding implicitly provided adapted Mal’cev basis of the Lie algebra.

Remark.

We will also in passing require the notion of a degree 0 nilmanifold. A degree 0 nilmanifold is simply the trivial group IdG\mathrm{Id}_{G}. All scalar-valued functions on degree 0 nilmanifolds are constants and the Lipschitz norm is defined to be the absolute value of this constant.

We will next need the notion of a rational subgroup with respect to a Mal’cev basis; this will be crucial when giving the definition of complexity with respect to a multidegree filtration.

Definition 3.6.

A closed, connected subgroup GGG^{\prime}\leqslant G is QQ-rational with respect to a basis 𝒳={X1,,Xm}\mathcal{X}=\{X_{1},\ldots,X_{m}\} of logG\log G if logG\log G^{\prime} has a basis 𝒳={X1,,Xm}\mathcal{X}^{\prime}=\{X_{1}^{\prime},\ldots,X_{m^{\prime}}^{\prime}\} where Xi=j=1mcijXjX_{i}^{\prime}=\sum_{j=1}^{m}c_{ij}X_{j} for 1im1\leq i\leq m^{\prime} with cijc_{ij}\in\mathbb{Q} having heights bounded by QQ.

We will repeatedly use the following fact about rational subgroups without further comment.

Fact 3.7.

Suppose GG is a connected, simply connected nilpotent Lie group of step ss and dimension dd with a discrete cocompact subgroup Γ\Gamma. Suppose that G/ΓG/\Gamma has a weak basis 𝒳\mathcal{X} of rationality at most QQ. Let H1,,HjH_{1},\ldots,H_{j} be subgroups which are each QQ-rational and normal in GG. Then

H=i=1jHiH=\bigvee_{i=1}^{j}H_{i}

is an Os(QOs(dOs(1)))O_{s}(Q^{O_{s}(d^{O_{s}(1)})})-rational subgroup.

Proof.

Let $\mathcal{X}^{i}$ denote the underlying basis of $H_{i}$ witnessing low height. By applying Baker–Campbell–Hausdorff, we have that $\log H$ is spanned by taking all $(\leq s)$-fold commutators of elements in $\mathcal{X}^{i}$ (possibly for different $i$). Each such element of the Lie algebra is easily seen to be an $O_{s}(Q^{O_{s}(d^{O_{s}(1)})})$-rational combination of $\mathcal{X}$ (using the weak basis property of $\mathcal{X}$). Taking a subset of these commutators which forms a basis of $\log H$ gives the desired result. ∎

We are now in position to define the complexity of a multidegree nilsequence. This definition is admittedly rather artificial but is designed to be the most flexible given various lemmas scattered throughout the literature.

Definition 3.8.

Consider a downset $J$ with respect to the multidegree ordering on $\mathbb{N}^{k}$. Consider a group $G$ with a multidegree filtration of degree $\subseteq J$. Recall the associated degree filtration

Gi=v:|v|=iGvG_{i}=\bigvee_{\vec{v}:|\vec{v}|=i}G_{\vec{v}}

and define the associated degree to be supvJ|v|\sup_{\vec{v}\in J}|\vec{v}|. We say a multidegree JJ nilmanifold G/ΓG/\Gamma of dimension dd with Mal’cev basis 𝒳\mathcal{X} has complexity at most MM if:

  • {X1,,Xd}\{X_{1},\ldots,X_{d}\} is a Mal’cev basis for logG\log G with respect to Γ\Gamma of rationality at most MM;

  • 𝒳\mathcal{X} is adapted to the sequence of subgroups (Gi)i(G_{i})_{i\in\mathbb{N}};

  • GvG_{\vec{v}} is an MM-rational subgroup for all vk\vec{v}\in\mathbb{N}^{k}.

We next note the trivial fact that complexity is bounded appropriately with respect to taking direct products; we implicitly invoke this when handling the complexity of direct products.

Fact 3.9.

Consider nilmanifolds G/ΓG/\Gamma, H/ΓH/\Gamma^{\prime} given degree ss filtrations (Gi)i0(G_{i})_{i\geq 0}, (Hi)i0(H_{i})_{i\geq 0} and adapted Mal’cev bases 𝒳,𝒳\mathcal{X},\mathcal{X}^{\prime} each of complexity at most MM. Then (G×H)/(Γ×Γ)(G\times H)/(\Gamma\times\Gamma^{\prime}) has complexity at most MM with respect to the Mal’cev basis

𝒳={(X,0):X𝒳}{(0,X):X𝒳}.\mathcal{X}^{\ast}=\{(X,0)\colon X\in\mathcal{X}\}\cup\{(0,X^{\prime})\colon X^{\prime}\in\mathcal{X}^{\prime}\}.

𝒳\mathcal{X}^{\ast} may be adapted to the degree ss filtration Gi×HiG_{i}\times H_{i} by creating an ordering with suffixes

\[\big\{(X_{j},0)\colon X_{j}\in\mathcal{X},\,0\leq\dim(G)-j<\dim(G_{i})\big\}\cup\big\{(0,X_{j}^{\prime})\colon X_{j}^{\prime}\in\mathcal{X}^{\prime},\,0\leq\dim(H)-j<\dim(H_{i})\big\}.\]

Furthermore given F:G/ΓF\colon G/\Gamma\to\mathbb{C} and F:H/ΓF^{\prime}\colon H/\Gamma^{\prime}\to\mathbb{C} which are MM-Lipschitz,

F~((g,h)(Γ×Γ)):=F(gΓ)F(hΓ)\widetilde{F}((g,h)(\Gamma\times\Gamma^{\prime})):=F(g\Gamma)F^{\prime}(h\Gamma^{\prime})

is 3M23M^{2}-Lipschitz on (G×H)/(Γ×Γ)(G\times H)/(\Gamma\times\Gamma^{\prime}). Analogous statements hold for degree-rank filtrations and multidegree filtrations.

We finally end by noting that quotients by normal subgroups of bounded rationality have appropriate complexity.

Lemma 3.10.

Consider a nilmanifold G/ΓG/\Gamma with GG given a degree ss filtration (Gi)(G_{i}) and of complexity at most MM with respect to an adapted Mal’cev basis 𝒳\mathcal{X}.

Suppose that HH is a normal subgroup of GG which is MM-rational with respect to 𝒳\mathcal{X}. Then the quotient nilmanifold (G/H)/(Γ/(ΓH))(G/H)/(\Gamma/(\Gamma\cap H)) may be given an adapted Mal’cev basis 𝒳\mathcal{X}^{\ast}, where the degree ss filtration is (Gi/(GiH))(G_{i}/(G_{i}\cap H)), which is an MOs(dOs(1))M^{O_{s}(d^{O_{s}(1)})}-rational combination of

𝒳={XmodlogH:X𝒳}.\mathcal{X}^{\prime}=\{X~{}\mathrm{mod}~{}\log H\colon X\in\mathcal{X}\}.

Analogous statements hold for degree-rank filtrations and multidegree filtrations. Finally if HZ(G)H\leqslant Z(G) and FF is an MM-Lipschitz function on G/ΓG/\Gamma which is HH-invariant then FF descends to (G/H)/(Γ/(ΓH))(G/H)/(\Gamma/(\Gamma\cap H)) and is MOs(dOs(1))M^{O_{s}(d^{O_{s}(1)})}-Lipschitz with respect to 𝒳\mathcal{X}^{\ast}.

Proof.

We may find a subset SS such that

𝒳={XimodlogH:Xi𝒳,iS}\mathcal{X}^{\prime}=\{X_{i}~{}\mathrm{mod}~{}\log H\colon X_{i}\in\mathcal{X},i\in S\}

is a basis for $\log(G/H)$. Since $H$ is $M$-rational with respect to $\mathcal{X}$, it follows from Cramer’s rule that for $j\notin S$, $X_{j}~\mathrm{mod}~\log H$ is an $M^{O_{s}(d^{O_{s}(1)})}$-rational combination of the $X_{i}~\mathrm{mod}~\log H$ with $i\in S$. Hence, $\mathcal{X}^{\prime}$ is a weak basis for $(G/H)/(\Gamma/(\Gamma\cap H))$ of rationality $M^{O_{s}(d^{O_{s}(1)})}$. By [42, Lemma B.11], we may find a Mal’cev basis adapted to $(G/H)/(\Gamma/(\Gamma\cap H))$ with complexity $M^{O_{s}(d^{O_{s}(1)})}$. Now, if $H\leqslant Z(G)$ and $F$ is an $M$-Lipschitz function on $G/\Gamma$ which is $H$-invariant, then $F$ descends to a function $\overline{F}$ on $(G/H)/(\Gamma/(\Gamma\cap H))$. The Lipschitz bounds for $\overline{F}$ follow from [42, Lemma B.3]. ∎

3.3. Size of vertical and horizontal characters

We now define the size of vertical and horizontal characters. We first define the size of a horizontal character.

Definition 3.11.

Given a nilmanifold G/ΓG/\Gamma and a Mal’cev basis 𝒳\mathcal{X}, note that any horizontal character η:G\eta\colon G\to\mathbb{R} can be expressed in the form

η(g)=kψ(g)\eta(g)=k\cdot\psi(g)

for some kdim(G)k\in\mathbb{Z}^{\dim(G)}. We define the size of the horizontal character as k\lVert k\rVert_{\infty}.
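As an illustration outside the paper's own examples, consider again the Heisenberg group of upper unitriangular $3\times 3$ matrices with second-kind coordinates $\psi(g)=(x,y,z-xy)$ for the assumed basis $E_{12},E_{23},E_{13}$. The Python sketch below (hypothetical names throughout) checks numerically that $k\cdot\psi$ is additive when the last entry of $k$ vanishes, and that it can fail to be additive otherwise.

```python
def multiply(g, h):
    """Matrix product of unitriangular 3x3 matrices stored as (x, y, z)."""
    x1, y1, z1 = g
    x2, y2, z2 = h
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)


def second_kind(g):
    """Second-kind coordinates for the basis E12, E23, E13."""
    x, y, z = g
    return (x, y, z - x * y)


def eta(k, g):
    """Candidate horizontal character eta(g) = k . psi(g)."""
    return sum(ki * ui for ki, ui in zip(k, second_kind(g)))


def size(k):
    """Size of the character: the sup-norm of the integer vector k."""
    return max(abs(ki) for ki in k)
```

With $k=(2,-3,0)$, $g=(1,2,5)$ and $h=(4,-1,7)$ one finds $\eta(g)=-4$, $\eta(h)=11$ and $\eta(gh)=7$, and the size of this character is $3$. With $k=(0,0,1)$ additivity fails for the same pair, reflecting that the third coordinate direction spans $\log[G,G]$.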

We next define the size of an ii-th horizontal character.

Definition 3.12.

Consider a nilmanifold G/ΓG/\Gamma with GG given a degree-rank filtration of degree-rank (s,r)(s,r) and a Mal’cev basis 𝒳={X1,,Xdim(G)}\mathcal{X}=\{X_{1},\ldots,X_{\dim(G)}\} adapted to the degree-rank filtration. Note that any ii-th horizontal character ηi:G(i,1)\eta_{i}\colon G_{(i,1)}\to\mathbb{R} can be expressed in the form

ηi(g)=kψ(g)\eta_{i}(g)=k\cdot\psi(g)

with kdim(G)k\in\mathbb{Z}^{\dim(G)} for some kk which is nonzero only on coordinates between dim(G)dim(G(i,1))<jdim(G)dim(G(i,2))\dim(G)-\dim(G_{(i,1)})<j\leq\dim(G)-\dim(G_{(i,2)}). We define the size of the ii-th horizontal character as k\lVert k\rVert_{\infty}.

We finally define the size of a vertical character.

Definition 3.13.

Consider a nilmanifold G/ΓG/\Gamma with GG given a degree filtration of degree kk and a Mal’cev basis 𝒳={X1,,Xdim(G)}\mathcal{X}=\{X_{1},\ldots,X_{\dim(G)}\} adapted to the degree filtration. Consider a continuous vertical character ξ:T\xi\colon T\to\mathbb{R} from a rational subgroup TZ(G)T\leqslant Z(G). We define the height of ξ\xi as

supxyT/(ΓT)|ξ(x)ξ(y)|dG(xΓ,yΓ);\sup_{x\neq y\in T/(\Gamma\cap T)}\frac{|\xi(x)-\xi(y)|}{d_{G}(x\Gamma,y\Gamma)};

this will be denoted as |ξ||\xi|.

Remark.

We now justify the terminology “height” given for the complexity of a vertical character. Suppose that $G/\Gamma$ has dimension $d$ and complexity $M$ (given $\mathcal{X}$) with respect to a degree filtration of degree $k$, and that $T$ is $Q$-rational. We have that $T$ has a Mal’cev basis which is a $(QM)^{O_{k}(d^{O(1)})}$-rational combination of $\mathcal{X}$ by [42, Lemma B.12]; denote this $\mathcal{X}^{\prime}$. By [42, Lemma B.9], we have that for $x,y\in T$,

dG,𝒳(xΓ,yΓ)(QM)Ok(dO(1))dT,𝒳(x(ΓT),y(ΓT))(QM)Ok(dO(1))dG,𝒳(xΓ,yΓ).d_{G,\mathcal{X}}(x\Gamma,y\Gamma)\leq(QM)^{O_{k}(d^{O(1)})}d_{T,\mathcal{X}^{\prime}}(x(\Gamma\cap T),y(\Gamma\cap T))\leq(QM)^{O_{k}(d^{O(1)})}d_{G,\mathcal{X}}(x\Gamma,y\Gamma).

With respect to 𝒳={X1,,Xdim(T)}\mathcal{X}^{\prime}=\{X_{1}^{\prime},\ldots,X_{\dim(T)}^{\prime}\}, we have that ξ\xi is an integer vector and the definition of height is equivalent up to a multiplicative factor of (QM)Ok(dO(1))(QM)^{O_{k}(d^{O(1)})} to the height of this vector.

3.4. Correlation

We will also require the notion of a sequence being biased of some order.

Definition 3.14.

A function f:[N]Df\colon[N]\to\mathbb{C}^{D} is ss-biased of correlation η\eta, complexity MM, and dimension dd if there exists a nilmanifold G/ΓG/\Gamma with a degree ss filtration such that GG has dimension at most dd, G/ΓG/\Gamma has complexity at most MM, and there exists an MM-Lipschitz function FF and a polynomial sequence gpoly(G)g\in\operatorname{poly}(\mathbb{Z}_{\mathbb{N}}\to G_{\mathbb{N}}) such that

𝔼n[N][f(n)F(g(n)Γ)¯]η.\lVert\mathbb{E}_{n\in[N]}[f(n)\overline{F(g(n)\Gamma)}]\rVert_{\infty}\geq\eta.

We will denote this as fCorr(s,η,M,d)f\in\operatorname{Corr}(s,\eta,M,d).
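The simplest instance of this definition is the degree $1$ case $G=\mathbb{R}$, $\Gamma=\mathbb{Z}$, $F(x\Gamma)=e(x)$ and $g(n)=\alpha n$, where the nilsequence is a linear phase. The following Python sketch, with hypothetical helper names and scalar-valued $f$ (so $D=1$), computes the correlation appearing in the definition by direct summation.

```python
import cmath


def e(x):
    """The phase e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def correlation(f, nilsequence, N):
    """|E_{n in [N]} f(n) * conj(nilsequence(n))| for scalar-valued inputs."""
    total = sum(f(n) * nilsequence(n).conjugate() for n in range(1, N + 1))
    return abs(total / N)


def linear_phase(alpha):
    """n -> F(g(n)Gamma) for G = R, Gamma = Z, F = e, g(n) = alpha * n."""
    return lambda n: e(alpha * n)
```

Taking $f(n)=e(n/7)$ and $N=14$, the correlation with `linear_phase(1/7)` is $1$ up to rounding, while the correlation with `linear_phase(2/7)` vanishes up to rounding because the sum runs over two full periods.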

3.5. Miscellaneous complexity notions

We will also require the following definition regarding smoothness norms of polynomial sequences.

Definition 3.15.

Given vk\vec{v}\in\mathbb{N}^{k} and nk\vec{n}\in\mathbb{N}^{k}, we define

(nv)=i=1k(nivi).\binom{\vec{n}}{\vec{v}}=\prod_{i=1}^{k}\binom{n_{i}}{v_{i}}.

Any polynomial sequence g:kg\colon\mathbb{Z}^{k}\to\mathbb{R} can be expressed uniquely as

g(n)=kα(n)g(\vec{n})=\sum_{\vec{\ell}\in\mathbb{N}^{k}}\alpha_{\vec{\ell}}\binom{\vec{n}}{\vec{\ell}}

with α\alpha_{\vec{\ell}}\in\mathbb{R}. We define

gC[N]:=max0N||α/\lVert g\rVert_{C^{\infty}[N]}:=\max_{\vec{\ell}\neq\vec{0}}N^{|\vec{\ell}|}\cdot\lVert\alpha_{\vec{\ell}}\rVert_{\mathbb{R}/\mathbb{Z}}

where ||=j=1kj|\vec{\ell}|=\sum_{j=1}^{k}\ell_{j}.

Remark.

Note that the above definition is only sensitive to the values of gmod1g~{}\mathrm{mod}~{}1.
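In the one-variable case $k=1$ the coefficients $\alpha_{\ell}$ are the iterated forward differences of $g$ at $0$, so the norm can be computed from finitely many values. The Python sketch below assumes that $g$ has degree at most $D$ and is given by its exact rational values $g(0),\ldots,g(D)$; the function names are ours.

```python
from fractions import Fraction
from math import comb


def binomial_coefficients(values):
    """alpha_l = sum_j (-1)^(l-j) C(l, j) g(j), from values g(0), ..., g(D)."""
    return [
        sum((-1) ** (l - j) * comb(l, j) * values[j] for j in range(l + 1))
        for l in range(len(values))
    ]


def dist_to_integers(x):
    """Distance from the rational x to the nearest integer."""
    x = Fraction(x)
    frac = x - (x.numerator // x.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def smoothness_norm(values, N):
    """max over l != 0 of N^l * ||alpha_l||_{R/Z} for a one-variable polynomial."""
    alphas = binomial_coefficients(values)
    return max(
        (N ** l * dist_to_integers(a) for l, a in enumerate(alphas) if l > 0),
        default=Fraction(0),
    )
```

For $g(n)=5n+\tfrac{1}{3}\binom{n}{2}$ the values $0,5,31/3$ give $\alpha_{1}=5$ and $\alpha_{2}=1/3$, so with $N=10$ the norm is $100/3$. Shifting the values by integers leaves the result unchanged, in line with the remark above.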

We now define when a polynomial sequence is rational and smooth.

Definition 3.16.

Consider a nilmanifold $G/\Gamma$ given either a degree, degree-rank, or multidegree filtration with Mal’cev basis $\mathcal{X}$, and let $g\colon\mathbb{Z}^{k}\to G$ be a polynomial sequence with respect to the given filtration. We say that $g$ is $(M,N)$-smooth if:

  • dG,𝒳(g(0),idG)Md_{G,\mathcal{X}}(g(\vec{0}),\mathrm{id}_{G})\leq M;

  • dG,𝒳(g(v),g(v+ei))MN1d_{G,\mathcal{X}}(g(\vec{v}),g(\vec{v}+\vec{e}_{i}))\leq M\cdot N^{-1} for v[N]k\vec{v}\in[N]^{k} and 1ik1\leq i\leq k.

We say that gg is MM-rational if there is 1mM1\leq m\leq M such that for all nk\vec{n}\in\mathbb{N}^{k} we have that

ψ𝒳(g(n))1mdim(G).\psi_{\mathcal{X}}(g(\vec{n}))\in\frac{1}{m}\cdot\mathbb{Z}^{\dim(G)}.

4. Proof outline

We are now in position to discuss the proof of Theorem 1.2; as our proof is closely modeled on that of Green, Tao, and Ziegler [34], the announcement of [33] may prove a useful starting point for certain readers. For various parts of this outline we will restrict to the case of the U5U^{5}-inverse theorem and discuss the proof as if the analysis were performed with bracket polynomials.

4.1. Induction on degree and additive quadruples

Suppose that f:[N]f\colon[N]\to\mathbb{C} is 11-bounded such that

fU5[N]δ.\lVert f\rVert_{U^{5}[N]}\geq\delta.

Via the inductive definition of the Gowers norm, there are at least $\delta^{O(1)}N$ values of $h\in[N]$ for which

ΔhfU4[N]δO(1).\lVert\Delta_{h}f\rVert_{U^{4}[N]}\geq\delta^{O(1)}.

Call this set of indices HH. Applying Theorem 1.2 inductively (when converted to bracket polynomials; see e.g. [45, Proposition 1.4]) we may choose d1,d2,d3log(1/δ)O(1)d_{1},d_{2},d_{3}\leq\log(1/\delta)^{O(1)} and coefficients ai,ha_{i,h} etc. such that

\begin{align*}
\bigg|\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\cdot e\bigg(&\sum_{i=1}^{d_{1}}a_{i,h}n[b_{i,h}n][c_{i,h}n]+\sum_{i=1}^{d_{2}}d_{i,h}n^{2}[e_{i,h}n]+\sum_{i=1}^{d_{3}}f_{i,h}n[g_{i,h}n]\\
&\qquad+j_{h}n^{3}+\ell_{h}n^{2}+m_{h}n\bigg)\bigg|\geq\exp(-\log(1/\delta)^{O(1)});
\end{align*}

we have padded with extra coefficients to make the dimensions did_{i} not hh-dependent. Set

Gh(n)¯=e(i=1d1ai,hn[bi,hn][ci,hn]+i=1d2di,hn2[ei,hn]+i=1d3fi,hn[gi,hn]+jhn3+hn2+mhn).\overline{G_{h}(n)}=e\bigg{(}\sum_{i=1}^{d_{1}}a_{i,h}n[b_{i,h}n][c_{i,h}n]+\sum_{i=1}^{d_{2}}d_{i,h}n^{2}[e_{i,h}n]+\sum_{i=1}^{d_{3}}f_{i,h}n[g_{i,h}n]+j_{h}n^{3}+\ell_{h}n^{2}+m_{h}n\bigg{)}.

For the sake of clarity, we will let Lh(n)L_{h}(n) denote terms of degree 2\leq 2 which are possibly hh-dependent. We have

|𝔼n[N]Δhf(n)e(i=1d1ai,hn[bi,hn][ci,hn]+i=1d2di,hn2[ei,hn]+jhn3+Lh(n))|exp(log(1/δ)O(1)).\bigg{|}\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\cdot e\bigg{(}\sum_{i=1}^{d_{1}}a_{i,h}n[b_{i,h}n][c_{i,h}n]+\sum_{i=1}^{d_{2}}d_{i,h}n^{2}[e_{i,h}n]+j_{h}n^{3}+L_{h}(n)\bigg{)}\bigg{|}\geq\exp(-\log(1/\delta)^{O(1)}).

The first crucial step, via a Cauchy–Schwarz argument due to Gowers [16] (see [32, Proposition 6.1] or Lemma 7.2), is to show that for many additive quadruples $(h_{1},h_{2},h_{3},h_{4})$, i.e. $h_{1}+h_{2}=h_{3}+h_{4}$, we have

|𝔼n[N]Gh1(n)Gh2(n+h1h4)Gh3(n)¯Gh4(n+h1h4)¯|exp(log(1/δ)O(1)).|\mathbb{E}_{n\in[N]}G_{h_{1}}(n)G_{h_{2}}(n+h_{1}-h_{4})\overline{G_{h_{3}}(n)}\overline{G_{h_{4}}(n+h_{1}-h_{4})}|\geq\exp(-\log(1/\delta)^{O(1)}).
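The inductive step at the start of this subsection rests on the identity $\lVert f\rVert_{U^{s+1}}^{2^{s+1}}=\mathbb{E}_{h}\lVert\Delta_{h}f\rVert_{U^{s}}^{2^{s}}$. The following Python sketch checks it by brute force on a small cyclic group $\mathbb{Z}/N\mathbb{Z}$ as in Definition 1.1; this is a simplification, since the argument above takes place on $[N]$, and the function names are hypothetical.

```python
import cmath
from itertools import product


def gowers_power(f, s):
    """||f||_{U^s(Z/N)}^{2^s} by brute force; f is a list of N complex values."""
    N = len(f)
    total = 0
    for x, *hs in product(range(N), repeat=s + 1):
        term = 1
        for omega in product((0, 1), repeat=s):
            value = f[(x + sum(w * h for w, h in zip(omega, hs))) % N]
            # A vertex of the cube with an odd number of shifts is conjugated.
            term *= value.conjugate() if sum(omega) % 2 else value
        total += term
    return total / N ** (s + 1)


def derivative(f, h):
    """Multiplicative derivative (Delta_h f)(x) = f(x) * conj(f(x + h))."""
    N = len(f)
    return [f[x] * f[(x + h) % N].conjugate() for x in range(N)]


def quadratic_phase(N):
    """The function x -> e(x^2 / N) on Z/N."""
    return [cmath.exp(2j * cmath.pi * (x * x) / N) for x in range(N)]
```

For the quadratic phase on $\mathbb{Z}/5\mathbb{Z}$ the sketch returns $\lVert f\rVert_{U^{2}}^{4}=1/5$ and $\lVert f\rVert_{U^{3}}^{8}=1$ up to rounding, and averaging `gowers_power(derivative(f, h), 2)` over $h$ reproduces the latter value.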

4.2. Sunflower and linearization for the top degree-rank

Via bracket polynomial manipulations, we see that the “top degree-rank” term of the above expression is

i=1d1(ai,h1n[bi,h1n][ci,h1n]+ai,h2n[bi,h2n][ci,h2n]ai,h3n[bi,h3n][ci,h3n]ai,h4n[bi,h4n][ci,h4n]).\sum_{i=1}^{d_{1}}(a_{i,h_{1}}n[b_{i,h_{1}}n][c_{i,h_{1}}n]+a_{i,h_{2}}n[b_{i,h_{2}}n][c_{i,h_{2}}n]-a_{i,h_{3}}n[b_{i,h_{3}}n][c_{i,h_{3}}n]-a_{i,h_{4}}n[b_{i,h_{4}}n][c_{i,h_{4}}n]).

The heart of the proof is demonstrating that these “top degree-rank terms line up” in an appropriate sense across a dense set of additive tuples in $H$. Such a conclusion is at least plausible since for generic coefficients the associated bracket polynomial equidistributes $\mathrm{mod}~1$, which would violate the given condition on $G_{h_{1}}(n)G_{h_{2}}(n+h_{1}-h_{4})\overline{G_{h_{3}}(n)}\,\overline{G_{h_{4}}(n+h_{1}-h_{4})}$. One possibility where the top degree-rank term is exactly zero is when we can write $a_{i,h_{1}}=a_{i}h_{1}$, $b_{i,h_{1}}=b_{i}^{\ast}$, $c_{i,h_{1}}=c_{i}^{\ast}$. The key point is that, up to controlled modifications, this is the only way for that to occur in a robust sense.

The first modification is that we can replace in the above example the expression ai,h1=aih1a_{i,h_{1}}=a_{i}h_{1} with ai,h1=Θi{Θih1}a_{i,h_{1}}=\Theta_{i}\{\Theta_{i}^{\prime}h_{1}\} or more generally a bracket linear form. The second modification is that we may not get a description that respects the presented structure of the sum. Instead the coordinates of the bracket linear form may only appear in these “fixed”, “fixed”, “bracket linear” triples after a linear change of variables. We prove the existence of this structure in two steps, as in [34]. The first step proves that the bracket form is “fixed”, “fixed”, “hh-dependent” and the second step then proves that the “hh-dependent” part in fact has a bracket linear structure. These steps will fall under the names sunflower and linearization respectively.

4.3. Degree-rank iteration

Once we have learned this refined form for i=1d1ai,hn[bi,hn][ci,hn]\sum_{i=1}^{d_{1}}a_{i,h}n[b_{i,h}n][c_{i,h}n], we iterate and then learn the refined form for the next highest degree-rank term i=1d2di,hn2[ei,hn]\sum_{i=1}^{d_{2}}d_{i,h}n^{2}[e_{i,h}n], and then finally we learn the refined form for jhn3j_{h}n^{3}. Given these refined forms, Green, Tao, and Ziegler prove that the top degree terms in fact have the form of a multidegree (1,3)(1,3) nilsequence (in variables hh and nn). Finally given such a correlation, a symmetrization argument as in [34] concludes the proof. We remark here that while terms such as ai,ha_{i,h} and ei,he_{i,h} correspond to Taylor coefficients on the top degree horizontal torus, terms such as di,hd_{i,h} belong to the second horizontal torus, and jhj_{h} to the third horizontal torus. Furthermore to handle terms of the form i=1d2di,hn2[ei,hn]\sum_{i=1}^{d_{2}}d_{i,h}n^{2}[e_{i,h}n] correctly we must realize such terms via a degree-rank (3,2)(3,2) nilmanifold, hence the need for the finer degree-rank notion.

4.4. Nilcharacters and horizontal tori

We now make this description more precise in terms of nilcharacters and horizontal tori. Let F(gh(n)Γ)=Gh(n)F(g_{h}(n)\Gamma)=G_{h}(n) be a nilcharacter of degree-rank (s,r)(s,r); here e(an[bn][cn])e(an[bn][cn]) should be thought of as an “almost” degree-rank (3,3)(3,3) nilcharacter and e(an[bn2])e(an[bn^{2}]) as an “almost” degree-rank (3,2)(3,2) nilcharacter. The sunflower step proves that the nilsequence F(gh(n)Γ)F(g_{h}(n)\Gamma) can be realized as a bracket polynomial whose top degree-rank part is a sum of terms with (r1)(r-1) iterated brackets where each term consists of (r1)(r-1) hh-independent phases of ghg_{h}, and possibly one hh-dependent phase of ghg_{h}. Here, “phase” will correspond to components of the Taylor coefficients of ghg_{h}, Taylori(gh)\operatorname{Taylor}_{i}(g_{h}). This corresponds to showing that the ii-th horizontal torus G(i,1)/G(i,2)G_{(i,1)}/G_{(i,2)} contains vector spaces Vi,DepViV_{i,\mathrm{Dep}}\leqslant V_{i} such that:

  • Taylori(gh)Taylori(gh)Vi,Dep\operatorname{Taylor}_{i}(g_{h})-\operatorname{Taylor}_{i}(g_{h^{\prime}})\in V_{i,\mathrm{Dep}} and Taylori(gh)Vi\operatorname{Taylor}_{i}(g_{h})\in V_{i};

  • If i1++ir=si_{1}+\cdots+i_{r}=s, then [vi1,vi2,,vir]=0[v_{i_{1}},v_{i_{2}},\ldots,v_{i_{r}}]=0 whenever vijVijv_{i_{j}}\in V_{i_{j}} and there are at least two indices jj such that vijVij,Depv_{i_{j}}\in V_{i_{j},\mathrm{Dep}}.

Here we have implicitly descended an iterated commutator to the vector spaces G(i,1)/G(i,2)G_{(i,1)}/G_{(i,2)} which corresponds to a multilinear form in this case. Such a result is proven via combining quantitative equidistribution theory of nilsequences [43, 42] with a “Furstenberg–Weiss argument” as in [32, 34, 43]; see [56] for further examples of the Furstenberg–Weiss argument.

The linearization step then proves that the remaining hh-dependent phases are “bracket linear” in hh. In practice, we require an additional case that the hh-dependent phase may be a petal phase: a top degree-rank term with the petal phase can be realized as a “lower order term”, or more precisely a bracket phase with at most (r2)(r-2) iterated brackets or of total degree at most s1s-1. Thus, the statement we ultimately prove is that we may decompose a subspace of the ii-th horizontal torus into the sum of three linearly disjoint vector spaces Wi,W_{i,\ast}, Wi,LinW_{i,\mathrm{Lin}}, and Wi,PetW_{i,\mathrm{Pet}} such that:

Taylori(gh)\displaystyle\operatorname{Taylor}_{i}(g_{h}) Wi,+Wi,Lin+Wi,Pet,\displaystyle\in W_{i,\ast}+W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}},
Taylori(gh)Taylori(gh)\displaystyle\operatorname{Taylor}_{i}(g_{h})-\operatorname{Taylor}_{i}(g_{h^{\prime}}) Wi,Lin+Wi,Pet,\displaystyle\in W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}},

and the projection of Taylori(gh)\operatorname{Taylor}_{i}(g_{h}) onto Wi,LinW_{i,\mathrm{Lin}} is bracket linear. In addition, we require that if i1++ir=si_{1}+\cdots+i_{r}=s, then [vi1,vi2,,vir]=0[v_{i_{1}},v_{i_{2}},\ldots,v_{i_{r}}]=0 whenever viWi,+Wi,Lin+Wi,Petv_{i}\in W_{i,\ast}+W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}} and either vijWij,Petv_{i_{j}}\in W_{i_{j},\mathrm{Pet}} for at least one index jj or vijWij,Linv_{i_{j}}\in W_{i_{j},\mathrm{Lin}} for at least two distinct indices jj. Thus even though we have not improved our understanding of the Taylor coefficients on Wi,PetW_{i,\mathrm{Pet}}, we have improved the vanishing of the top degree-rank commutator bracket on this vector space. The linearization step is proved by a combination of quantitative equidistribution theory of nilmanifolds [43, 42] and inverse sumset theory. We refer the reader to [43] for a simpler case of the argument given here.

4.5. Quantitative bounds

The heart of this paper is performing the sunflower and linearization steps efficiently. Green, Tao, and Ziegler [34] accomplish this (when unwinding the correspondence between nilmanifolds and bracket polynomials) via iteratively learning relations between the coefficients ai,h,bi,h,ci,ha_{i,h},b_{i,h},c_{i,h} and performing a dimension reduction argument. (Footnote 1: This is performed in [34, Section 10] via a “rank minimality” argument; this requires passing to an ultralimit. When performed in finitary language this becomes a dimension reduction argument and is also present in the proof of [34, Theorem D.5].) Furthermore, the underlying equidistribution theorem used in the work of Green, Tao, and Ziegler [34], proven in work of Green and Tao [29], relies on an induction on dimension argument. The use of any induction on dimension argument essentially immediately results in O(s)O(s) iterated logarithms and thus must be avoided.

The use of induction on dimension in the equidistribution theorem was avoided in work of the first author [43, 42]. The key point in Sections 8 and 9 therefore is to perform the sunflower and linearization steps without any use of induction on dimension. The precise details, while mainly utilizing elementary linear algebra, require some care. This argument, extending the case of the U4U^{4}-inverse theorem from [43], demonstrates that a dimension-independent number of applications of equidistribution theory is sufficient to derive the necessary decrease in degree-rank. (Note that the argument in [34] morally uses that one can in fact assume that there are no “short linear relations” between various coefficients, but such a result necessitates exponential in dimension dependencies in the exponent.) Another crucial point in our work is that the length of the associated bracket linear form that is obtained is not “very long”. This is, by now, a standard consequence of the quasi-polynomial bounds of Sanders [52] towards the polynomial Bogolyubov conjecture.

We finally remark that the quantitative equidistribution theorem we use is slightly different than the one derived in work of the first author [43, 42]. The work of the first author is most naturally phrased as factoring ill-distributed polynomial sequences into a smooth part, a rational part, and a polynomial sequence which (up to taking a certain quotient) lives in a lower step nilmanifold. For our purposes, it is critical to instead lower the degree of the nilmanifold. This is most easily seen from the above bracket polynomial example where we are attempting to linearize a function of the form

e(i=1d3fi,hn[gi,hn]+jhn3+hn2+mhn).e\bigg{(}\sum_{i=1}^{d_{3}}f_{i,h}n[g_{i,h}n]+j_{h}n^{3}+\ell_{h}n^{2}+m_{h}n\bigg{)}.

At this step we wish to linearize jhn3j_{h}n^{3} instead of handling the terms fi,hn[gi,hn]f_{i,h}n[g_{i,h}n]; the jhn3j_{h}n^{3} term, while having the highest degree, does not correspond to the highest step part of the nilmanifold. This phenomenon only occurs when proving the Us+1U^{s+1}-inverse theorem for s4s\geq 4. Thus a crucial ingredient in our work is bootstrapping, as a black box, the efficient version of equidistribution with respect to step in order to obtain an efficient version of equidistribution with respect to degree; this is Theorem 5.4.
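To make the distinction between degree and step concrete, the following sketch records two standard model cases. It is illustrative only and not taken from the source: the choice of filtration in (a) and the use of the Heisenberg group in (b) are assumptions made for the example.

```latex
% Illustrative only: two standard model cases, not taken from the paper.
% (a) A pure cubic phase is a degree 3 nilsequence on a 1-step (abelian) group:
\[
G = \mathbb{R}, \quad \Gamma = \mathbb{Z}, \quad
G_0 = G_1 = G_2 = G_3 = \mathbb{R}, \quad G_4 = \{0\}, \quad
g(n) = j_h n^3, \quad F(x\Gamma) = e(x).
\]
% (b) A bracket quadratic e(f n [g n]) is modeled, up to lower order terms, on the
% 2-step Heisenberg group, whose commutator subgroup G_{(2)} = [G,G] is nontrivial:
\[
G = \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix},
\quad
\Gamma = \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.
\]
```

In (a) the cubic coefficient sits in the third group of the filtration but in the first (abelian) step, while the bracket term in (b) is what occupies the top step; this is the mismatch between degree and step that the paragraph above describes.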

4.6. Organization of the paper II

In Section 5 we prove the necessary quantitative equidistribution theorem with respect to degree. In Section 6, we perform the setup and give various definitions which will be used to perform the sunflower and linearization steps. In Section 7, we derive that many additive quadruples exhibit a bias. In Section 8 we perform the sunflower step while in Section 9 we perform the linearization step. In Sections 10 and 11 we then convert information regarding the Taylor coefficients into correlation with a multidegree (1,s1)(1,s-1) nilsequence and a nilsequence of lower degree-rank. Iterating this argument we eventually obtain correlation with a multidegree (1,s1)(1,s-1) nilsequence. In Section 12, we symmetrize this nilsequence to obtain Theorem 1.2.

Appendix A collects certain standard results regarding approximate homomorphisms (this is ultimately where work of Sanders [52] is invoked). In Appendix B, we collect a number of miscellaneous propositions which are deferred throughout the paper. Finally in Appendix C we collect a number of propositions regarding nilcharacters.

5. Efficient equidistribution theory of nilsequences

In order to state the primary equidistribution input of this paper we will need the notion of when an element in G/[G,G]G/[G,G] and a horizontal character are orthogonal.

Definition 5.1.

Consider a nilmanifold G/ΓG/\Gamma, a horizontal character η:G\eta\colon G\to\mathbb{R}, and wG/[G,G]w\in G/[G,G]. We say that η\eta and ww are orthogonal if η(w)=0\eta(w)=0.

The primary equidistribution input into our results will be the following result of the first author [42, Theorem 3]. This result is ultimately the driving force of this paper.

Theorem 5.2.

Fix an integer 1\ell\geq 1, δ(0,1/10)\delta\in(0,1/10), M,d1M,d\geq 1, and F:G/ΓF\colon G/\Gamma\to\mathbb{C}. Suppose that GG is a dimension dd, at most ss-step connected, simply connected nilpotent Lie group with a given degree kk filtration, and the nilmanifold G/ΓG/\Gamma is complexity at most MM with respect to this filtration. Let gg be a polynomial sequence on GG with respect to this filtration.

Furthermore suppose that FLip1\lVert F\rVert_{\mathrm{Lip}}\leq 1 and FF has G(s)G_{(s)}-vertical frequency ξ\xi such that the height of ξ\xi is bounded by M/δM/\delta. Suppose that N(M/δ)Ωk,(dΩk,(1))N\geq(M/\delta)^{\Omega_{k,\ell}(d^{\Omega_{k,\ell}(1)})} and

|𝔼n[N]F(g(n)Γ)|δ.\big{|}\mathbb{E}_{\vec{n}\in[N]^{\ell}}F(g(\vec{n})\Gamma)\big{|}\geq\delta.

Then there exists an integer 0rdim(G/[G,G])0\leq r\leq\dim(G/[G,G]) such that:

  • We have horizontal characters η1,,ηr:G\eta_{1},\ldots,\eta_{r}\colon G\to\mathbb{R} with heights bounded by (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})};

  • For all 1ir1\leq i\leq r, we have ηigC[N](M/δ)Ok,(dOk,(1))\lVert\eta_{i}\circ g\rVert_{C^{\infty}[N]}\leq(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}

  • For any w1,,wsG/[G,G]w_{1},\ldots,w_{s}\in G/[G,G] such that wiw_{i} are orthogonal to all of η1,,ηr\eta_{1},\ldots,\eta_{r}, we have

    ξ([[[w1,w2],w3],,ws])=0.\xi([[[w_{1},w_{2}],w_{3}],\ldots,w_{s}])=0.
Remark 5.3.

Note that G(s)G_{(s)} (and in fact any group in the lower central series) is seen to be Os,k(MOs,k(1))O_{s,k}(M^{O_{s,k}(1)})-rational due to Lemma 2.1. This guarantees that the height definitions used in [42] and here are compatible.

Remark.

Let W=i=1rker(ηi)W=\bigcap_{i=1}^{r}\operatorname{ker}(\eta_{i}). The crucial property of the output of Theorem 5.2 is that

G~:=W/ker(ξ)\widetilde{G}:=W/\ker(\xi)

is trivially seen to be at most (s1)(s-1)-step nilpotent. (Note that if GG is abelian then we have that G~\widetilde{G} is trivial.) This is due to the fact that defining W=W0=W1W=W_{0}=W_{1} and Wj=[W1,Wj1]W_{j}=[W_{1},W_{j-1}] for j2j\geq 2 yields W(s)G(s)W_{(s)}\leqslant G_{(s)} and ξ(W(s))=0\xi(W_{(s)})=0. Additionally, the statement in [42, Theorem 3] assumes GG is exactly ss-step nilpotent and ξ\xi is nonzero. In the case when GG is strictly less than ss-step nilpotent, taking no horizontal characters (i.e., W=GW=G) gives the desired statement. Furthermore when ξ\xi is zero we may similarly take no horizontal characters and note that the final statement is vacuous.

The following variant of Theorem 5.2 will essentially be the primary equidistribution tool in our paper. For the sake of argumentation, we first prove the result in the case when the vertical frequency considered lives on a 11-dimensional torus and then bootstrap to the general case.

This theorem and its proof are motivated by [34, Lemma E.11]. The key point is that Theorem 5.2 allows us to give a procedure that relies on an induction on step rather than an induction on dimension. The main technical issue is that at each stage we pass to a quotient group, given by quotienting by the kernel of a certain vertical character, and thus we must iteratively “lift” these factorizations.

Theorem 5.4.

Let 1\ell\geq 1 be an integer, δ(0,1/10)\delta\in(0,1/10), M1M\geq 1, and F:G/ΓF\colon G/\Gamma\to\mathbb{C}. Suppose that GG is dimension dd, is ss-step nilpotent with a given degree kk filtration, and the nilmanifold G/ΓG/\Gamma is complexity at most MM with respect to this filtration.

Suppose that TZ(G)T\leqslant Z(G) is a 11-dimensional subgroup of the center which is MM-rational with respect to GG. Further suppose that FF has a nonzero TT-vertical character ξ\xi with |ξ|M/δ|\xi|\leq M/\delta, FLipM\lVert F\rVert_{\mathrm{Lip}}\leq M, N(M/δ)Ωk,(dΩk,(1))N\geq(M/\delta)^{\Omega_{k,\ell}(d^{\Omega_{k,\ell}(1)})}, and gg is a polynomial sequence with respect to the degree kk filtration such that g(0)=idGg(0)=\mathrm{id}_{G}. Then if

|𝔼n[N]F(g(n)Γ)|δ\big{|}\mathbb{E}_{\vec{n}\in[N]^{\ell}}F(g(\vec{n})\Gamma)\big{|}\geq\delta

there exists a factorization

g=εgγg=\varepsilon g^{\prime}\gamma

such that:

  • ε(0)=g(0)=γ(0)=idG\varepsilon(0)=g^{\prime}(0)=\gamma(0)=\mathrm{id}_{G};

  • gg^{\prime} lives in an (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational subgroup HH such that HT=IdGH\cap T=\mathrm{Id}_{G};

  • γ\gamma is an (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational polynomial sequence;

  • ε\varepsilon is an ((M/δ)Ok,(dOk,(1)),N)((M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})},N)-smooth polynomial sequence.

Proof.

The proof proceeds by iteratively “simplifying” gg to live on successively lower-step nilmanifolds. We treat \ell as constant and allow implicit constants to depend on \ell.

Step 1: Iteration setup. We will define a sequence of parameters Mi,δiM_{i},\delta_{i} and Qi,Ni,viQ_{i},N_{i},v_{i} (where the domain of n\vec{n} at stage ii will be vi+Qi[Ni]v_{i}+Q_{i}\cdot[N_{i}]^{\ell}) satisfying:

Mi+1(Mi/δi)Ok(dOk(1)),δi+1(δi/Mi)Ok(dOk(1));M_{i+1}\leq(M_{i}/\delta_{i})^{O_{k}(d^{O_{k}(1)})},\quad\delta_{i+1}\geq(\delta_{i}/M_{i})^{O_{k}(d^{O_{k}(1)})};
Qi+1Qi(Mi/δi)Ok(dOk(1)),Ni+1Ni(δi/Mi)Ok(dOk(1)),Qi+1Ni+1+vi+1N.Q_{i+1}\leq Q_{i}\cdot(M_{i}/\delta_{i})^{O_{k}(d^{O_{k}(1)})},\quad N_{i+1}\geq N_{i}\cdot(\delta_{i}/M_{i})^{O_{k}(d^{O_{k}(1)})},\quad Q_{i+1}\cdot N_{i+1}+\lVert v_{i+1}\rVert_{\infty}\leq N.
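To see what these recursions give after several stages, one can unfold them. The following is a sketch under simplifying assumptions that are not in the source: a single exponent A (a hypothetical name) is taken to work in all four bounds, the ratios M_i/\delta_i are assumed to be at least 1, and the iteration starts from M_0 = M, \delta_0 = \delta, Q_0 = 1, N_0 = N.

```latex
% Assumptions: one exponent A := C d^{C} with C = C_k \ge 1 works in all four bounds,
% M_i/\delta_i \ge 1 throughout, and M_0 = M, \delta_0 = \delta, Q_0 = 1, N_0 = N.
\begin{align*}
\frac{M_{i+1}}{\delta_{i+1}} &\le \Big(\frac{M_i}{\delta_i}\Big)^{2A}
\quad\Longrightarrow\quad
\frac{M_t}{\delta_t} \le \Big(\frac{M}{\delta}\Big)^{(2A)^{t}}, \\
Q_t &\le \prod_{i<t} \Big(\frac{M_i}{\delta_i}\Big)^{A}
\le \Big(\frac{M}{\delta}\Big)^{A\sum_{i<t}(2A)^{i}}
\le \Big(\frac{M}{\delta}\Big)^{(2A)^{t}}, \\
N_t &\ge N \cdot \Big(\frac{M}{\delta}\Big)^{-(2A)^{t}}.
\end{align*}
```

When the number of stages t is bounded in terms of s, the exponent (2A)^t is again a power of d with constants depending only on k and s, so the accumulated losses have the same shape as the loss in a single stage.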

During the iteration, we have a sequence of nilpotent Lie groups

G0,G1,,Gt,G^{0},G^{1},\ldots,G^{t},\ldots

such that GtG^{t} is at most (st)(s-t)-step nilpotent with associated lattice Γt\Gamma^{t} and is complexity at most MtM_{t}. This in particular will imply that there are at most ss stages in the iteration. We also maintain a sequence of subgroups

K0,,Kt,K^{0},\ldots,K^{t},\ldots

which are MtM_{t}-rational subgroups of GG.

We will define homomorphisms πt+1:GtGt/ker(ξt)=:G~t+1\pi_{t+1}\colon G^{t}\to G^{t}/\operatorname{ker}(\xi_{t})=:\widetilde{G}^{t+1}, where ξt\xi_{t} is a G(st)tG^{t}_{(s-t)}-frequency (recall H(i)H_{(i)} denotes the lower central series filtration of a group HH). Gt+1G^{t+1} will be an appropriately rational subgroup of G~t+1\widetilde{G}^{t+1}. We will always maintain the invariant that ker(ξt)(πtπ1(T))=IdGt\operatorname{ker}(\xi_{t})\cap(\pi_{t}\circ\cdots\circ\pi_{1}(T))=\mathrm{Id}_{G^{t}}. We will furthermore maintain that the function FtF_{t} has a πtπ1(T)\pi_{t}\circ\cdots\circ\pi_{1}(T)-character given by descending ξ\xi on GG via πtπ1\pi_{t}\circ\cdots\circ\pi_{1}.

We inductively maintain the following pair of relations:

  • πtπ1(Kt)=Gt\pi_{t}\circ\cdots\circ\pi_{1}(K^{t})=G^{t};

  • πtπ1(gt)=g~t\pi_{t}\circ\cdots\circ\pi_{1}(g_{t})=\widetilde{g}_{t};

where gtg_{t} and g~t\widetilde{g}_{t} are polynomial sequences living in KtK^{t} and GtG^{t} respectively.

The iteration terminates when Gt(πtπ1(T))=IdGtG^{t}\cap(\pi_{t}\circ\cdots\circ\pi_{1}(T))=\mathrm{Id}_{G^{t}}. Before termination note that Gt(πtπ1(T))=πtπ1(T)G^{t}\cap(\pi_{t}\circ\cdots\circ\pi_{1}(T))=\pi_{t}\circ\cdots\circ\pi_{1}(T) since πtπ1(T)\pi_{t}\circ\cdots\circ\pi_{1}(T) is 11-dimensional. Note that this in particular ensures that before the termination of the iteration, πtπ1(T)\pi_{t}\circ\cdots\circ\pi_{1}(T) is well-defined even though πj\pi_{j} is not fully defined on the image of πj1\pi_{j-1}! Using the invariant that ker(ξt)(πtπ1(T))=IdGt\operatorname{ker}(\xi_{t})\cap(\pi_{t}\circ\cdots\circ\pi_{1}(T))=\mathrm{Id}_{G^{t}} we also have that ξ\xi (defined on TT) naturally descends to GtG^{t}. We define Jt=πtπ1(T)J^{t}=\pi_{t}\circ\cdots\circ\pi_{1}(T).

Furthermore at each stage of the iteration we have that

gt=εt+1gt+1γt+1g_{t}=\varepsilon_{t+1}\cdot g_{t+1}\cdot\gamma_{t+1}

where:

  • εt+1\varepsilon_{t+1} and γt+1\gamma_{t+1} are polynomial sequences lying in KtK^{t};

  • gt+1g_{t+1} is a polynomial sequence lying in Kt+1K^{t+1};

  • γt+1\gamma_{t+1} is Mt+1M_{t+1}-rational;

  • εt+1\varepsilon_{t+1} is (Mt+1,N)(M_{t+1},N)-smooth.

Finally, in each stage of the iteration we will maintain a function Ft:Gt/ΓtF_{t}\colon G^{t}/\Gamma^{t}\to\mathbb{C} such that

|𝔼nvt+Qt[Nt][Ft(gt~(n)Γt)]|δt.|\mathbb{E}_{\vec{n}\in v_{t}+Q_{t}\cdot[N_{t}]^{\ell}}[F_{t}(\widetilde{g_{t}}(\vec{n})\Gamma^{t})]|\geq\delta_{t}.

Throughout the iterations, nilmanifolds at stage ii will have complexity bounded by MiM_{i}, FiF_{i} is MiM_{i}-Lipschitz, and various horizontal and vertical characters constructed will have size and height bounded by MiM_{i}. The starting conditions are G0=GG^{0}=G, Γ0=Γ\Gamma^{0}=\Gamma, M0=MM_{0}=M, F0=FF_{0}=F, N0=NN_{0}=N, v0=0v_{0}=0, Q0=1Q_{0}=1, δ0=δ\delta_{0}=\delta, K0=GK^{0}=G (and J0=TJ^{0}=T), and g0=g~0=gg_{0}=\widetilde{g}_{0}=g.

Step 2: Applying equidistribution. We now run a single step of the iteration. We have

|𝔼nvt+Qt[Nt][Ft(g~t(n)Γt)]|δt.|\mathbb{E}_{\vec{n}\in v_{t}+Q_{t}\cdot[N_{t}]^{\ell}}[F_{t}(\widetilde{g}_{t}(\vec{n})\Gamma^{t})]|\geq\delta_{t}.

By definition, we have that FtF_{t} has a JtJ^{t}-frequency (a descent of ξ\xi); this is not sufficient to apply Theorem 5.2. We perform an additional Fourier-analytic step to obtain a G(st)tG^{t}_{(s-t)}-vertical frequency. Since FtF_{t} is MtM_{t}-Lipschitz, via [42, Lemma A.6] we may write

Ft(zΓt)=|ξ|(Mt/δt)Ok(dOk(1))Fξ,t(zΓt)+τ(zΓt)F_{t}(z\Gamma^{t})=\sum_{|\xi^{\prime}|\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}}F_{\xi^{\prime},t}(z\Gamma^{t})+\tau(z\Gamma^{t})

such that

  • Fξ,tF_{\xi^{\prime},t} has G(st)tG^{t}_{(s-t)}-vertical frequency ξ\xi^{\prime};

  • τδt/2\lVert\tau\rVert_{\infty}\leq\delta_{t}/2;

  • Fξ,tF_{\xi^{\prime},t} is (Mt/δt)Ok(dOk(1))(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}-Lipschitz on Gt/ΓtG^{t}/\Gamma^{t}.

Given this representation, recall that FtF_{t} has ξ\xi (appropriately descended) as a JtJ^{t}-vertical frequency. We abusively write this as ξ\xi. Therefore

Ft(zΓt)\displaystyle F_{t}(z\Gamma^{t}) =gJt/Γte(ξ(g))Ft(zgΓt)𝑑Jt(g)\displaystyle=\int_{g\in J^{t}/\Gamma^{t}}e(-\xi(g))F_{t}(zg\Gamma^{t})dJ^{t}(g)
=|ξ|(Mt/δt)Ok(dOk(1))gJt/Γte(ξ(g))Fξ,t(zgΓt)𝑑Jt(g)+gJt/Γte(ξ(g))τ(zgΓt)𝑑Jt(g)\displaystyle=\sum_{|\xi^{\prime}|\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}}\int_{g\in J^{t}/\Gamma^{t}}e(-\xi(g))F_{\xi^{\prime},t}(zg\Gamma^{t})dJ^{t}(g)+\int_{g\in J^{t}/\Gamma^{t}}e(-\xi(g))\tau(zg\Gamma^{t})dJ^{t}(g)
=|ξ|(Mt/δt)Ok(dOk(1))F~ξ,t(zΓt)+gJt/Γte(ξ(g))τ(zgΓt)𝑑Jt(g),\displaystyle=\sum_{|\xi^{\prime}|\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}}\widetilde{F}_{\xi^{\prime},t}(z\Gamma^{t})+\int_{g\in J^{t}/\Gamma^{t}}e(-\xi(g))\tau(zg\Gamma^{t})dJ^{t}(g),

where dJtdJ^{t} represents the Haar measure on Jt/ΓtJ^{t}/\Gamma^{t}. Thus FtF_{t} may be decomposed into a sum of functions with G(st)tG^{t}_{(s-t)}-vertical characters up to an LL^{\infty} error of δt/2\delta_{t}/2. Furthermore, each vertical character ξ\xi^{\prime} in question must agree with ξ\xi on JtG(st)tJ^{t}\cap G^{t}_{(s-t)}. If not, then the corresponding integral in the second line will average to 0 and we may remove it.
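The vanishing claim in the last two sentences can be checked by a change of variables. The following is a sketch rather than the authors' argument verbatim: it assumes g_0 is an element of the intersection of J^t with G^t_{(s-t)}, and uses that this last group is central in G^t (since G^t is at most (s-t)-step) together with translation invariance of the Haar measure.

```latex
% Sketch. Assumption: g_0 \in J^t \cap G^t_{(s-t)}, which is central in G^t.
\begin{align*}
\widetilde{F}_{\xi',t}(z\Gamma^t)
&= \int_{g\in J^t/\Gamma^t} e(-\xi(g))\, F_{\xi',t}(z g \Gamma^t)\, dJ^t(g) \\
&= \int_{g\in J^t/\Gamma^t} e(-\xi(g g_0))\, F_{\xi',t}(z g g_0 \Gamma^t)\, dJ^t(g) \\
&= e\big(\xi'(g_0) - \xi(g_0)\big)\, \widetilde{F}_{\xi',t}(z\Gamma^t).
\end{align*}
```

Hence if the two frequencies differ at some such g_0, in the sense that the exponential factor is not 1, the function on the left vanishes identically and the corresponding summand can be dropped.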

Applying Pigeonhole, there exists |ξ|(Mt/δt)Ok(dOk(1))|\xi^{\prime}|\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})} such that

(5.1) |𝔼nvt+Qt[Nt][F~ξ,t(g~t(n)Γt)]|(δt/Mt)Ok(dOk(1)).|\mathbb{E}_{\vec{n}\in v_{t}+Q_{t}\cdot[N_{t}]^{\ell}}[\widetilde{F}_{\xi^{\prime},t}(\widetilde{g}_{t}(\vec{n})\Gamma^{t})]|\geq(\delta_{t}/M_{t})^{O_{k}(d^{O_{k}(1)})}.

We have the following trichotomy:

  • ξ\xi^{\prime} is nonzero and JtG(st)t=JtJ^{t}\cap G^{t}_{(s-t)}=J^{t};

  • ξ\xi^{\prime} is nonzero and JtG(st)t=IdGtJ^{t}\cap G^{t}_{(s-t)}=\mathrm{Id}_{G^{t}};

  • ξ=0\xi^{\prime}=0 in G(st)t^\widehat{G^{t}_{(s-t)}}.

We define πt+1:GtGt/ker(ξ)=:G~t+1\pi_{t+1}\colon G^{t}\to G^{t}/\ker(\xi^{\prime})=:\widetilde{G}^{t+1} (in particular we let ξt=ξ\xi_{t}=\xi^{\prime}). Let Γ~t+1:=Γt/(Γtker(ξ))=πt+1(Γt)\widetilde{\Gamma}^{t+1}:=\Gamma^{t}/(\Gamma^{t}\cap\ker(\xi^{\prime}))=\pi_{t+1}(\Gamma^{t}). We now apply Theorem 5.2 to (5.1), obtaining horizontal characters η1,,ηr:Gt\eta_{1},\ldots,\eta_{r}\colon G^{t}\to\mathbb{R}. Let their common kernel be HH^{\ast} and let Gt+1=πt+1(H)G~t+1G^{t+1}=\pi_{t+1}(H^{\ast})\leqslant\widetilde{G}^{t+1}. Note that in the case when GtG^{t} is abelian, we do not necessarily have that ηi\eta_{i} are trivial on ker(ξ)\operatorname{ker}(\xi^{\prime}). However replacing HH^{\ast} by Hker(ξ)H^{\ast}\operatorname{ker}(\xi^{\prime}) (and abusively referring to this as HH^{\ast}), we may then replace ηi\eta_{i} by ηi\eta_{i}^{\prime} which instead cut out Hker(ξ)H^{\ast}\operatorname{ker}(\xi^{\prime}) and note that ηi\eta_{i}^{\prime} may be taken to be (Mt/δt)Ok(dOk(1))(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}-height integer combinations of ηi\eta_{i}. We abusively rename these characters as ηi\eta_{i} and then proceed with the proof in this abelian edge case.

By applying [42, Lemma A.1], we obtain a factorization of g~t(Qtn+vt)\widetilde{g}_{t}(Q_{t}n+v_{t}) into three nilsequences which are “smooth”, supported on a rational subgroup, and “rational”. We may change variables and then apply πt+1\pi_{t+1} to obtain

πt+1(g~t)=:εt+1gt+1γt+1\pi_{t+1}(\widetilde{g}_{t})=:\varepsilon_{t+1}^{\ast}g_{t+1}^{\ast}\gamma_{t+1}^{\ast}

where:

  • gt+1Gt+1g_{t+1}^{\ast}\in G^{t+1}, and Gt+1G^{t+1} is at most (st1)(s-t-1)-step nilpotent. Furthermore Gt+1G^{t+1} is trivially seen to be (Mt/δt)Ok(dOk(1))(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}-rational with respect to G~t+1\widetilde{G}^{t+1};

  • γt+1\gamma_{t+1}^{\ast} is an (Mt/δt)Ok(dOk(1))(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}-rational polynomial sequence within G~t+1\widetilde{G}^{t+1};

  • εt+1\varepsilon_{t+1}^{\ast} is ((Mt/δt)Ok(dOk(1)),N)((M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})},N)-smooth.

We remark that changing variables is easily seen to not affect the smoothness and rationality in a substantial manner due to the bounds on QtQ_{t}. We can see that the step of Gt+1G^{t+1} decreases appropriately.

Step 3: Lifting the factorization data. Note that Gt+1G^{t+1} can be defined via a set of horizontal characters η1,,ηr\eta_{1}^{\prime},\ldots,\eta_{r^{\prime}}^{\prime} of G~t+1\widetilde{G}^{t+1} such that

Gt+1={xG~t+1:ηi(x)=0 for all 1ir}G^{t+1}=\{x\in\widetilde{G}^{t+1}\colon\eta_{i}^{\prime}(x)=0\text{ for all }1\leq i\leq r^{\prime}\}

and we let Γt+1=Gt+1Γ~t+1\Gamma^{t+1}=G^{t+1}\cap\widetilde{\Gamma}^{t+1}. Note that the ηi\eta_{i}^{\prime} are the natural descents of ηi\eta_{i}, since the ηi\eta_{i} are trivial on ker(ξ)\operatorname{ker}(\xi^{\prime}); this is precisely why we earlier modified the characters in the abelian case.

We define

Kt+1={xKt:ηi(πt+1πtπ1(x))=0 for all 1ir}.K^{t+1}=\{x\in K^{t}\colon\eta_{i}^{\prime}(\pi_{t+1}\circ\pi_{t}\circ\cdots\circ\pi_{1}(x))=0\text{ for all }1\leq i\leq r^{\prime}\}.

The trivial (but key) point is that πt+1πtπ1(Kt+1)Gt+1\pi_{t+1}\circ\pi_{t}\circ\cdots\circ\pi_{1}(K^{t+1})\leqslant G^{t+1}. The key issue is noting that the map is well-defined; this is because πtπ1(Kt)Gt\pi_{t}\circ\cdots\circ\pi_{1}(K^{t})\leqslant G^{t} by induction so that we are allowed to apply πt+1\pi_{t+1} to any such values. We further see that πt+1πtπ1(Kt+1)=Gt+1\pi_{t+1}\circ\pi_{t}\circ\cdots\circ\pi_{1}(K^{t+1})=G^{t+1} because πt+1πtπ1(Kt)=πt+1(Gt)=G~t+1\pi_{t+1}\circ\pi_{t}\circ\cdots\circ\pi_{1}(K^{t})=\pi_{t+1}(G^{t})=\widetilde{G}^{t+1} and Kt+1K^{t+1} is the subgroup of KtK^{t} such that the image under πt+1π1\pi_{t+1}\circ\cdots\circ\pi_{1} is precisely in the intersection of kernels defining Gt+1G^{t+1} within G~t+1\widetilde{G}^{t+1}.

Recall by induction that

πtπ1(gt)=g~t\pi_{t}\circ\cdots\circ\pi_{1}(g_{t})=\widetilde{g}_{t}

and thus

πt+1π1(gt)=εt+1gt+1γt+1.\pi_{t+1}\circ\cdots\circ\pi_{1}(g_{t})=\varepsilon_{t+1}^{\ast}g_{t+1}^{\ast}\gamma_{t+1}^{\ast}.

Applying ηi\eta_{i}^{\prime}, we find that there exists a nonzero integer Ti(Mt/δt)Ok(dOk(1))T_{i}\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})} such that

(5.2) Tiηi(πt+1π1(gt))C[N](Mt/δt)Ok(dOk(1)).\lVert T_{i}\cdot\eta_{i}^{\prime}(\pi_{t+1}\circ\cdots\circ\pi_{1}(g_{t}))\rVert_{C^{\infty}[N]}\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}.

We now claim that ηi(πt+1π1())\eta_{i}^{\prime}(\pi_{t+1}\circ\cdots\circ\pi_{1}(\cdot)) is a horizontal character on KtK^{t}. It is a homomorphism since the πi\pi_{i} are homomorphisms and it is well-defined by the above. In addition, we may inductively show that πt+1π1(ΓKt)=Γ~t+1\pi_{t+1}\circ\cdots\circ\pi_{1}(\Gamma\cap K^{t})=\widetilde{\Gamma}^{t+1} and πt+1π1(ΓKt+1)=Γt+1\pi_{t+1}\circ\cdots\circ\pi_{1}(\Gamma\cap K^{t+1})=\Gamma^{t+1}. Hence ηi(πt+1π1(ΓKt))\eta_{i}^{\prime}(\pi_{t+1}\circ\cdots\circ\pi_{1}(\Gamma\cap K^{t}))\leqslant\mathbb{Z}, which verifies the property of being a horizontal character. That the horizontal character has appropriately bounded height is an immediate consequence of induction and the fact that |ξ|(Mt/δt)Ok(dOk(1))|\xi^{\prime}|\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}.

Now we use this data to construct the required factorization. By applying [42, Lemma A.1] with the horizontal characters Tiηi(πt+1π1)T_{i}\cdot\eta_{i}^{\prime}(\pi_{t+1}\circ\cdots\circ\pi_{1}) defined on KtK^{t} with the hypotheses (5.2), we may write

gt=εt+1gt+1γt+1g_{t}=\varepsilon_{t+1}^{\prime}g_{t+1}^{\prime}\gamma_{t+1}^{\prime}

where:

  • gt+1g_{t+1}^{\prime} takes values in Kt+1K^{t+1};

  • εt+1\varepsilon_{t+1}^{\prime} and γt+1\gamma_{t+1}^{\prime} take values in KtK^{t};

  • γt+1\gamma_{t+1}^{\prime} is an (Mt/δt)Ok(dOk(1))(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}-rational polynomial sequence;

  • εt+1\varepsilon_{t+1}^{\prime} is ((Mt/δt)Ok(dOk(1)),N)((M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})},N)-smooth.

Let QQ^{\prime} denote the least common multiple of the periods of the \ell different directions for γt+1Γ\gamma_{t+1}^{\prime}\Gamma; note that such periods exist and we have Q(Mt/δt)Ok(dOk(1))Q^{\prime}\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})} by [42, Lemma B.14]. Divide vt+Qt[Nt]v_{t}+Q_{t}\cdot[N_{t}]^{\ell} into boxes of common difference QtQQ_{t}Q^{\prime}. By Pigeonhole there exists vv^{\prime} such that

|𝔼nv+QQt[Nt/Q][F~ξ,t(g~t(n)Γt)]|(δt/Mt)Ok(dOk(1)).|\mathbb{E}_{\vec{n}\in v^{\prime}+Q^{\prime}Q_{t}\cdot[N_{t}/Q^{\prime}]^{\ell}}[\widetilde{F}_{\xi^{\prime},t}(\widetilde{g}_{t}(\vec{n})\Gamma^{t})]|\geq(\delta_{t}/M_{t})^{O_{k}(d^{O_{k}(1)})}.
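The pigeonhole step here is an averaging identity over residue classes. As a sketch, with notation that is hypothetical and not from the source (P for the original box, P_r for its sub-boxes, f for the function being averaged), and assuming N_t is at least Q' so that every class is nonempty:

```latex
% Hypothetical notation: P := v_t + Q_t \cdot [N_t]^{\ell}, and for each residue class
% r \in (\mathbb{Z}/Q'\mathbb{Z})^{\ell},
% P_r := \{ v_t + Q_t \vec{m} \in P : \vec{m} \equiv r \bmod Q' \}.
\begin{align*}
\mathbb{E}_{\vec{n}\in P} f(\vec{n})
= \sum_{r} \frac{|P_r|}{|P|}\, \mathbb{E}_{\vec{n}\in P_r} f(\vec{n})
\quad\Longrightarrow\quad
\max_{r}\, \big|\mathbb{E}_{\vec{n}\in P_r} f(\vec{n})\big|
\ge \big|\mathbb{E}_{\vec{n}\in P} f(\vec{n})\big|.
\end{align*}
```

Each P_r is a box with common difference Q'Q_t and side lengths equal to N_t/Q' up to rounding, which matches the shape of the box in the display above.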

Note that

g~t=πtπ1(gt)=πtπ1(εt+1)πtπ1(gt+1)πtπ1(γt+1).\widetilde{g}_{t}=\pi_{t}\circ\cdots\circ\pi_{1}(g_{t})=\pi_{t}\circ\cdots\circ\pi_{1}(\varepsilon_{t+1}^{\prime})\cdot\pi_{t}\circ\cdots\circ\pi_{1}(g_{t+1}^{\prime})\cdot\pi_{t}\circ\cdots\circ\pi_{1}(\gamma_{t+1}^{\prime}).

Since the differences we are considering are divisible by QQ^{\prime}, there is γRep\gamma_{\mathrm{Rep}} such that

γRep1γt+1(v+QQtn)Γ\gamma_{\mathrm{Rep}}^{-1}\gamma_{t+1}^{\prime}(v^{\prime}+Q^{\prime}Q_{t}\cdot\vec{n})\in\Gamma

for all n\vec{n}\in\mathbb{Z}^{\ell}, where γRepKt\gamma_{\mathrm{Rep}}\in K^{t} and dG(γRep,idG)(Mt/δt)Ok(dOk(1))d_{G}(\gamma_{\mathrm{Rep}},\mathrm{id}_{G})\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})}. Since πtπ1(ΓKt)Γt\pi_{t}\circ\cdots\circ\pi_{1}(\Gamma\cap K^{t})\leqslant\Gamma^{t} we have that

|𝔼nv+QQt[Nt/Q][F~ξ,t(πtπ1(εt+1γRep)πtπ1(γRep1gt+1γRep)Γt)]|(δt/Mt)Ok(dOk(1)).|\mathbb{E}_{\vec{n}\in v^{\prime}+Q^{\prime}Q_{t}\cdot[N_{t}/Q^{\prime}]^{\ell}}[\widetilde{F}_{\xi^{\prime},t}(\pi_{t}\circ\cdots\circ\pi_{1}(\varepsilon_{t+1}^{\prime}\gamma_{\mathrm{Rep}})\cdot\pi_{t}\circ\cdots\circ\pi_{1}(\gamma_{\mathrm{Rep}}^{-1}g_{t+1}^{\prime}\gamma_{\mathrm{Rep}})\Gamma^{t})]|\geq(\delta_{t}/M_{t})^{O_{k}(d^{O_{k}(1)})}.

Step 4: Completing the induction. The first key polynomial sequence we shall define is

gt+1=γRep1gt+1γRep.g_{t+1}=\gamma_{\mathrm{Rep}}^{-1}\cdot g_{t+1}^{\prime}\cdot\gamma_{\mathrm{Rep}}.

Note that Kt+1K^{t+1} is normal within KtK^{t} and since γRepKt\gamma_{\mathrm{Rep}}\in K^{t} we have that gt+1g_{t+1} takes on values in Kt+1K^{t+1} as desired. Further let εt+1=εt+1γRep\varepsilon_{t+1}=\varepsilon_{t+1}^{\prime}\cdot\gamma_{\mathrm{Rep}} and γt+1=γRep1γt+1\gamma_{t+1}=\gamma_{\mathrm{Rep}}^{-1}\cdot\gamma_{t+1}^{\prime}; these are trivially seen to lie in KtK^{t} and have the necessary rationality and smoothness properties due to the above analysis.

We now break [Nt/Q][N_{t}/Q^{\prime}]^{\ell} into a collection of boxes of length Nt+1Nt/Q(Mt/δt)Ok(dOk(1))N_{t+1}\geq N_{t}/Q^{\prime}\cdot(M_{t}/\delta_{t})^{-O_{k}(d^{O_{k}(1)})}. There exists a box such that

|𝔼nv′′+QQt[Nt+1][F~ξ,t(πtπ1(εt+1γRep)πtπ1(gt+1)Γt)]|(δt/Mt)Ok(dOk(1)).|\mathbb{E}_{\vec{n}\in v^{\prime\prime}+Q^{\prime}Q_{t}\cdot[N_{t+1}]^{\ell}}[\widetilde{F}_{\xi^{\prime},t}(\pi_{t}\circ\cdots\circ\pi_{1}(\varepsilon_{t+1}^{\prime}\cdot\gamma_{\mathrm{Rep}})\cdot\pi_{t}\circ\cdots\circ\pi_{1}(g_{t+1})\Gamma^{t})]|\geq(\delta_{t}/M_{t})^{O_{k}(d^{O_{k}(1)})}.

Taking Nt+1N_{t+1} sufficiently small, we may replace the initial “smooth” polynomial sequence εt+1\varepsilon_{t+1}^{\ast} by εKt\varepsilon^{\ast}\in K^{t} where dG(ε,idG)(Mt/δt)Ok(dOk(1))d_{G}(\varepsilon^{\ast},\mathrm{id}_{G})\leq(M_{t}/\delta_{t})^{O_{k}(d^{O_{k}(1)})} such that

|𝔼nv′′+QQt[Nt+1][F~ξ,t(πtπ1(ε)πtπ1(gt+1)Γt)]|(δt/Mt)Ok(dOk(1)).|\mathbb{E}_{\vec{n}\in v^{\prime\prime}+Q^{\prime}Q_{t}\cdot[N_{t+1}]^{\ell}}[\widetilde{F}_{\xi^{\prime},t}(\pi_{t}\circ\cdots\circ\pi_{1}(\varepsilon^{\ast})\cdot\pi_{t}\circ\cdots\circ\pi_{1}(g_{t+1})\Gamma^{t})]|\geq(\delta_{t}/M_{t})^{O_{k}(d^{O_{k}(1)})}.

The new function Ft+1F_{t+1} is given by descending gF~ξ,t(πtπ1(ε)gΓt)g\mapsto\widetilde{F}_{\xi^{\prime},t}(\pi_{t}\circ\cdots\circ\pi_{1}(\varepsilon^{\ast})\cdot g\Gamma^{t}) from GtG^{t} to G~t+1\widetilde{G}^{t+1} (and later we may implicitly restrict to Gt+1G^{t+1}). Explicitly, for gGtg\in G^{t} we have

F~ξ,t(πtπ1(ε)gΓt)=Ft+1(πt+1(g)Γ~t+1)\widetilde{F}_{\xi^{\prime},t}(\pi_{t}\circ\cdots\circ\pi_{1}(\varepsilon^{\ast})g\Gamma^{t})=F_{t+1}(\pi_{t+1}(g)\widetilde{\Gamma}^{t+1})

which is possible because F~ξ,t\widetilde{F}_{\xi^{\prime},t} has vertical frequency ξ\xi^{\prime}. Therefore we have

(5.3) |𝔼nv′′+QQt[Nt+1][Ft+1(πt+1πtπ1(gt+1(n))Γ~t+1)]|(δt/Mt)Ok(dOk(1)).|\mathbb{E}_{\vec{n}\in v^{\prime\prime}+Q^{\prime}Q_{t}\cdot[N_{t+1}]^{\ell}}[F_{t+1}(\pi_{t+1}\circ\pi_{t}\circ\cdots\circ\pi_{1}(g_{t+1}(\vec{n}))\widetilde{\Gamma}^{t+1})]|\geq(\delta_{t}/M_{t})^{O_{k}(d^{O_{k}(1)})}.

We let

g~t+1:=πt+1πtπ1(gt+1)\widetilde{g}_{t+1}:=\pi_{t+1}\circ\pi_{t}\circ\cdots\circ\pi_{1}(g_{t+1})

and we may replace Γ~t+1\widetilde{\Gamma}^{t+1} with Γt+1=Γ~t+1Gt+1\Gamma^{t+1}=\widetilde{\Gamma}^{t+1}\cap G^{t+1} in (5.3).

We now check that ker(ξ)Jt=idGt\operatorname{ker}(\xi^{\prime})\cap J^{t}=\mathrm{id}_{G^{t}}, which is one of the invariants we are maintaining (we take ξt=ξ\xi_{t}=\xi^{\prime}). We will have to distinguish between cases:

  • If ξ\xi^{\prime} is nonzero and JtG(st)t=JtJ^{t}\cap G^{t}_{(s-t)}=J^{t} note that ker(ξ)Jt=IdGt\operatorname{ker}(\xi^{\prime})\cap J^{t}=\mathrm{Id}_{G^{t}}. This is due to the fact that ξ\xi^{\prime} restricted to JtJ^{t} is (the descended version of) ξ\xi which is nonzero as given.

  • If ξ\xi^{\prime} is nonzero and JtG(st)t=IdGtJ^{t}\cap G^{t}_{(s-t)}=\mathrm{Id}_{G^{t}} then note that ker(ξ)JtJtG(st)t=IdGt\operatorname{ker}(\xi^{\prime})\cap J^{t}\leqslant J^{t}\cap G^{t}_{(s-t)}=\mathrm{Id}_{G^{t}}.

  • If ξ=0\xi^{\prime}=0 then note that as ξ\xi (appropriately descended) was nonzero we have that JtG(st)t=IdGtJ^{t}\cap G^{t}_{(s-t)}=\mathrm{Id}_{G^{t}} is forced in this case. The result then follows as in the previous case.

Now, if $G^{t+1}\cap\pi_{t+1}\circ\cdots\circ\pi_{1}(T)=\pi_{t+1}\circ\cdots\circ\pi_{1}(T)$ then we continue with the iteration and do not terminate. If we have reached termination, we therefore have $G^{t+1}\cap\pi_{t+1}\circ\cdots\circ\pi_{1}(T)=\mathrm{Id}_{G^{t+1}}$. We claim that this implies $K^{t+1}\cap T=\mathrm{Id}_{G}$ (and therefore we may take the output group to be $H=K^{t+1}$). Suppose for the sake of contradiction that this fails; since $T$ is $1$-dimensional, we must then have $T\leqslant K^{t+1}$. Applying $\pi_{t+1}\circ\cdots\circ\pi_{1}$ we have that

πt+1π1(T)πt+1π1(Kt+1)=Gt+1\pi_{t+1}\circ\cdots\circ\pi_{1}(T)\leqslant\pi_{t+1}\circ\cdots\circ\pi_{1}(K^{t+1})=G^{t+1}

which contradicts the termination condition.

Finally, note that if $G^{t+1}\cap\pi_{t+1}\circ\cdots\circ\pi_{1}(T)=\pi_{t+1}\circ\cdots\circ\pi_{1}(T)$ then $F_{t+1}$, when viewed as a function on $G^{t+1}/\Gamma^{t+1}$, is seen to have a nonzero $\pi_{t+1}\circ\cdots\circ\pi_{1}(T)$-vertical character (which is given by descending $\xi$ on $G$ through $\pi_{t+1}\circ\cdots\circ\pi_{1}$ in the obvious manner), so one can continue the iteration in this case.

Step 5: Fixing the value at 0. To see that this completes the proof, if the iteration terminates at some stage tt then note that

g=ε1εtgtγtγ1.g=\varepsilon_{1}\cdots\varepsilon_{t}\cdot g_{t}\cdot\gamma_{t}\cdots\gamma_{1}.

Using that products of smooth sequences are appropriately smooth, and analogously for rational sequences, allows us to deduce the necessary outputs. However, we have not guaranteed that the factors take the value $\mathrm{id}_{G}$ at $0$. For this, let $g_{t}(0)=\{g_{t}(0)\}[g_{t}(0)]$ with $[g_{t}(0)]\in K^{t}\cap\Gamma$ and $d_{G}(\{g_{t}(0)\},\mathrm{id}_{G})\leq(M/\varepsilon)^{O_{k}(d^{O_{k}(1)})}$. We then have that

g=ε1εt{gt(0)}({gt(0)}1gt[gt(0)]1)[gt(0)]γtγ1.g=\varepsilon_{1}\cdots\varepsilon_{t}\cdot\{g_{t}(0)\}\cdot(\{g_{t}(0)\}^{-1}g_{t}[g_{t}(0)]^{-1})\cdot[g_{t}(0)]\cdot\gamma_{t}\cdots\gamma_{1}.

As $g(0)=\mathrm{id}_{G}$, we have that $\tau=[g_{t}(0)]\cdot\gamma_{t}(0)\cdots\gamma_{1}(0)$ satisfies $d_{G}(\tau,\mathrm{id}_{G})\leq(M/\varepsilon)^{O_{k}(d^{O_{k}(1)})}$ and $\tau$ is $(M/\varepsilon)^{O_{k}(d^{O_{k}(1)})}$-rational. Thus

\[g=\varepsilon_{1}\cdots\varepsilon_{t}\cdot\{g_{t}(0)\}\tau\cdot(\tau^{-1}\{g_{t}(0)\}^{-1}g_{t}[g_{t}(0)]^{-1}\tau)\cdot\tau^{-1}[g_{t}(0)]\cdot\gamma_{t}\cdots\gamma_{1}\]

and note that (τ1{gt(0)}1gt[gt(0)]1τ)(\tau^{-1}\{g_{t}(0)\}^{-1}g_{t}[g_{t}(0)]^{-1}\tau) takes value in the conjugated subgroup τ1Ktτ\tau^{-1}K^{t}\tau which is (M/ε)Ok(dOk(1))(M/\varepsilon)^{O_{k}(d^{O_{k}(1)})}-rational by [42, Lemma B.15]. Note however that despite modifying the output group HH via conjugation, we have τ1KtτT=τ1Ktττ1Tτ=IdG\tau^{-1}K^{t}\tau\cap T=\tau^{-1}K^{t}\tau\cap\tau^{-1}T\tau=\mathrm{Id}_{G} as desired. ∎

We now remove the assumption of a 11-dimensional vertical torus via a reduction to this case.

Corollary 5.5.

Let 1\ell\geq 1 be an integer, δ(0,1/10)\delta\in(0,1/10), M1M\geq 1, and F:G/ΓF\colon G/\Gamma\to\mathbb{C}. Suppose that GG is dimension dd, is ss-step nilpotent with a given degree kk filtration, and the nilmanifold G/ΓG/\Gamma is complexity at most MM with respect to this filtration.

Suppose that TZ(G)T\leqslant Z(G) is a subgroup of the center which is MM-rational. Further suppose that FF has a nonzero TT-vertical character ξ\xi with |ξ|M/δ|\xi|\leq M/\delta, FLipM\lVert F\rVert_{\mathrm{Lip}}\leq M, N(M/δ)Ωk,(dΩk,(1))N\geq(M/\delta)^{\Omega_{k,\ell}(d^{\Omega_{k,\ell}(1)})}, and gg is a polynomial sequence with respect to the degree kk filtration. Then if

|𝔼n[N]F(g(n)Γ)|δ\big{|}\mathbb{E}_{\vec{n}\in[N]^{\ell}}F(g(\vec{n})\Gamma)\big{|}\geq\delta

there exists a factorization

g=εgγg=\varepsilon g^{\prime}\gamma

such that:

  • gg^{\prime} lives in an (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational subgroup HH such that ξ(HT)=0\xi(H\cap T)=0;

  • γ\gamma is an (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational polynomial sequence;

  • ε\varepsilon is an ((M/δ)Ok,(dOk,(1)),N)((M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})},N)-smooth polynomial sequence.

Furthermore if g(0)=idGg(0)=\mathrm{id}_{G} then we may take ε(0)=g(0)=γ(0)=idG\varepsilon(0)=g^{\prime}(0)=\gamma(0)=\mathrm{id}_{G}.

Proof.

We first reduce to the case where g(0)=idGg(0)=\mathrm{id}_{G} as is standard. We factor g(0)={g(0)}[g(0)]g(0)=\{g(0)\}[g(0)] such that [g(0)]Γ[g(0)]\in\Gamma and ψG({g(0)})[0,1)dim(G)\psi_{G}(\{g(0)\})\in[0,1)^{\dim(G)}. Replacing FF by F({g(0)})F(\{g(0)\}\cdot) and gg by {g(0)}1g[g(0)]1\{g(0)\}^{-1}g[g(0)]^{-1} we may clearly reduce to the case where g(0)=idGg(0)=\mathrm{id}_{G} at the cost of replacing MM by MOk(dOk(1))M^{O_{k}(d^{O_{k}(1)})} which leaves the conclusion unchanged.

Using Lemma 3.10 to bound the complexity of G/ker(ξ)G/\operatorname{ker}(\xi) and noting that FF descends to an (M/δ)Ok(dOk(1))(M/\delta)^{O_{k}(d^{O_{k}(1)})}-Lipschitz function on G/ker(ξ)G/\operatorname{ker}(\xi), by Theorem 5.4 we have that

(gmodker(ξ))=εgγ(g~{}\mathrm{mod}~{}\operatorname{ker}(\xi))=\varepsilon g^{\prime}\gamma

where ε,g,γ\varepsilon,g^{\prime},\gamma satisfy:

  • ε(0)=g(0)=γ(0)=idG/ker(ξ)\varepsilon(0)=g^{\prime}(0)=\gamma(0)=\mathrm{id}_{G/\operatorname{ker}(\xi)};

  • gg^{\prime} lives in an (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational subgroup HH such that H(T/ker(ξ))=idG/ker(ξ)H\cap(T/\operatorname{ker}(\xi))=\mathrm{id}_{G/\operatorname{ker}(\xi)};

  • γ\gamma is an (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational polynomial sequence;

  • ε\varepsilon is an ((M/δ)Ok,(dOk,(1)),N)((M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})},N)-smooth polynomial sequence.

We now “lift” this factorization. Consider the Mal’cev basis 𝒳\mathcal{X}^{\prime} for G/ker(ξ)G/\operatorname{ker}(\xi). For each element Xi𝒳X_{i}^{\prime}\in\mathcal{X}^{\prime} we may lift to ZilogGZ_{i}\in\log G such that:

  • exp(Xi)=exp(Zi)modker(ξ)\exp(X_{i}^{\prime})=\exp(Z_{i})~{}\mathrm{mod}~{}\operatorname{ker}(\xi);

  • dG(exp(Zi),idG)(M/δ)Ok(dOk(1))d_{G}(\exp(Z_{i}),\mathrm{id}_{G})\leq(M/\delta)^{O_{k}(d^{O_{k}(1)})};

  • ZiZ_{i} is an (M/δ)Ok(dOk(1))(M/\delta)^{O_{k}(d^{O_{k}(1)})}-rational combination of the elements of 𝒳\mathcal{X}.

Writing ε\varepsilon as

ε(n)=exp(|i|kεi(ni))\varepsilon(\vec{n})=\exp\bigg{(}\sum_{|\vec{i}|\leq k}\mathfrak{\varepsilon}_{\vec{i}}\binom{\vec{n}}{\vec{i}}\bigg{)}

where εilog(G|i|/(ker(ξ)G|i|))\mathfrak{\varepsilon}_{\vec{i}}\in\log(G_{|\vec{i}|}/(\operatorname{ker}(\xi)\cap G_{|\vec{i}|})), we lift via the above mapping on 𝒳\mathcal{X}^{\prime} to

ε~(n)=exp(|i|kε~i(ni))\widetilde{\varepsilon}(n)=\exp\bigg{(}\sum_{|\vec{i}|\leq k}\widetilde{\mathfrak{\varepsilon}}_{\vec{i}}\binom{\vec{n}}{\vec{i}}\bigg{)}

where ε~ilog(G|i|)\widetilde{\varepsilon}_{\vec{i}}\in\log(G_{|\vec{i}|}) and analogously for g,γg^{\prime},\gamma.

We easily see that ε~\widetilde{\varepsilon} is an ((M/δ)Ok,(dOk,(1)),N)((M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})},N)-smooth polynomial sequence, that γ~\widetilde{\gamma} is an (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational polynomial sequence, and that g~\widetilde{g}^{\prime} takes values in the subgroup H=exp(log(H)+log(ker(ξ)))H^{\prime}=\exp(\log(H)+\log(\operatorname{ker}(\xi))). Furthermore HH^{\prime} is seen to be (M/δ)Ok,(dOk,(1))(M/\delta)^{O_{k,\ell}(d^{O_{k,\ell}(1)})}-rational and ξ(HT)=0\xi(H^{\prime}\cap T)=0. Finally note that ε~modker(ξ)=ε\widetilde{\varepsilon}~{}\mathrm{mod}~{}\operatorname{ker}(\xi)=\varepsilon and analogously for g~,γ~\widetilde{g}^{\prime},\widetilde{\gamma}. Therefore

g(ε~g~γ~)1idGmodker(ξ)g\cdot(\widetilde{\varepsilon}\widetilde{g}^{\prime}\widetilde{\gamma})^{-1}\equiv\mathrm{id}_{G}~{}\mathrm{mod}~{}\operatorname{ker}(\xi)

as polynomial sequences. Thus

g=g(ε~g~γ~)1(ε~g~γ~)=ε~((g(ε~g~γ~)1)g~)γ~g=g\cdot(\widetilde{\varepsilon}\widetilde{g}^{\prime}\widetilde{\gamma})^{-1}\cdot(\widetilde{\varepsilon}\widetilde{g}^{\prime}\cdot\widetilde{\gamma})=\widetilde{\varepsilon}\cdot((g\cdot(\widetilde{\varepsilon}\widetilde{g}^{\prime}\widetilde{\gamma})^{-1})\cdot\widetilde{g}^{\prime})\cdot\widetilde{\gamma}

gives the desired factorization noting that ker(ξ)H\operatorname{ker}(\xi)\leqslant H^{\prime} and ker(ξ)\operatorname{ker}(\xi) is central and therefore g(ε~g~γ~)1g\cdot(\widetilde{\varepsilon}\widetilde{g}^{\prime}\widetilde{\gamma})^{-1} may be commuted to the right. ∎
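The final rearrangement uses only that the correction term is central. As a sanity check, the following minimal Python sketch verifies the same identity in the integer Heisenberg group, which merely stands in for $G$ here. The helper names `heis` and `heis_inv` and all sample elements are illustrative assumptions, not objects from the proof.

```python
import numpy as np

def heis(x, y, z):
    """Upper unitriangular 3x3 integer matrix; the center is {heis(0, 0, z)}."""
    return np.array([[1, x, z], [0, 1, y], [0, 0, 1]], dtype=np.int64)

def heis_inv(m):
    """Exact inverse of a Heisenberg matrix."""
    x, y, z = int(m[0, 1]), int(m[1, 2]), int(m[0, 2])
    return heis(-x, -y, x * y - z)

# Stand-ins for the lifted factors (hypothetical sample values).
eps, g_prime, gam = heis(1, 2, 3), heis(-4, 5, 6), heis(7, -8, 9)
central = heis(0, 0, 5)                 # plays the role of an element of ker(xi)
g = central @ eps @ g_prime @ gam       # g agrees with eps * g' * gam modulo the center

correction = g @ heis_inv(eps @ g_prime @ gam)   # recovers `central`
refactored = eps @ (correction @ g_prime) @ gam  # correction absorbed into the middle factor
```

Because `correction` is central, moving it next to `g_prime` does not change the product, which is exactly the step used in the proof.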

6. Setup for Sunflower and Linearization Iteration

We now set up the iteration which will take up the bulk of the following four sections. The idea is to inductively assume the statement of Theorem 1.2 for s1s-1 (i.e., the quantitative inverse theorem for the Us[N]U^{s}[N]-norm) and the remaining goal is to prove it for ss. The key step is to show that for many h[N]h\in[N], Δhf\Delta_{h}f correlates with a multidegree (1,s1)(1,s-1) nilcharacter; this is a quantitative version of [34, Theorem 7.1]. For the remainder of the analysis until Section 12 we will be concerned with the notion of a correlation structure, which can be thought of as refining the notion in Definition 3.14 with intermediate bracket information.

Definition 6.1.

A correlation structure associated to the function f:[N]f\colon[N]\to\mathbb{C} with parameters ρ\rho, MM, dd, and DD and degree-rank (s1,r)(s-1,r^{\ast}) is the following data:

  • A subset H[N]H\subseteq[N] such that |H|ρN|H|\geq\rho N;

  • A multidegree (1,s1)(1,s-1) nilcharacter χ(h,n)\chi(h,n) that lives on a nilmanifold G/ΓG^{\ast}/\Gamma^{\ast} where χ\chi has a G(1,s1)G^{\ast}_{(1,s-1)}-vertical frequency η\eta^{\ast}. Furthermore G/ΓG^{\ast}/\Gamma^{\ast} has dimension bounded by dd and complexity bounded by MM, the function FF^{\ast} underlying χ\chi is MM-Lipschitz, η\eta^{\ast} has height bounded by MM, and the output dimension of χ\chi is bounded by DD. We let g(h,n)g(h,n) denote the underlying polynomial sequence of χ\chi;

  • A collection of degree-rank (s1,r)(s-1,r^{\ast}) nilcharacters χh(n)\chi_{h}(n) which live on G/ΓG/\Gamma where every χh\chi_{h} has the same G(s1,r)G_{(s-1,r^{\ast})}-vertical frequency η\eta. Furthermore G/ΓG/\Gamma has dimension bounded by dd and complexity bounded by MM (with Mal’cev basis 𝒳\mathcal{X}), the function underlying χh\chi_{h} is MM-Lipschitz, η\eta has height bounded by MM, and χh\chi_{h} has output dimension bounded by DD. We let gh(n)g_{h}(n) denote the polynomial sequence underlying χh\chi_{h}. Finally, the function underlying χh\chi_{h}, which we will denote FF, is independent of hh;

  • The polynomial sequences satisfy gh(0)=idGg_{h}(0)=\mathrm{id}_{G};

  • For all hHh\in H we have

    Δhf(n)χ(h,n)¯χh(n)¯Corr(s2,ρ,M,d).\Delta_{h}f(n)\otimes\overline{\chi(h,n)}\otimes\overline{\chi_{h}(n)}\in\operatorname{Corr}(s-2,\rho,M,d).

If the input function ff we are considering for the proof of Theorem 1.2 satisfies

fUs+1[N]δ,\lVert f\rVert_{U^{s+1}[N]}\geq\delta,

then our proof will always maintain bounds of the form

ρ1,M,Dexp(log(1/δ)Os(1)) and dlog(1/δ)Os(1)\rho^{-1},M,D\leq\exp(\log(1/\delta)^{O_{s}(1)})\text{ and }d\leq\log(1/\delta)^{O_{s}(1)}

on intermediate correlation structures, although the precise dependence may degrade over roughly $s$ stages (wherein we reduce $r^{\ast}$ from $s-1$ to $0$).

To get started, we first note that given a function ff with large Us+1U^{s+1}-norm we may associate to it a correlation structure of degree-rank (s1,s1)(s-1,s-1); this is little more than chasing definitions and applying induction.

Lemma 6.2.

Fix δ(0,1/2)\delta\in(0,1/2) and s2s\geq 2. Assume Theorem 1.2 for s1s-1. Let f:[N]f\colon[N]\to\mathbb{C} be a 11-bounded function such that

fUs+1[N]δ.\lVert f\rVert_{U^{s+1}[N]}\geq\delta.

Then there exists a degree-rank (s1,s1)(s-1,s-1) correlation structure associated to ff with parameters ρ\rho, MM, dd, and DD such that

ρ1,M,Dexp(log(1/δ)Os(1)) and dlog(1/δ)Os(1).\rho^{-1},M,D\leq\exp(\log(1/\delta)^{O_{s}(1)})\emph{ and }d\leq\log(1/\delta)^{O_{s}(1)}.
Proof.

Note that fUs+1[N]δ\lVert f\rVert_{U^{s+1}[N]}\geq\delta implies that

𝔼h[N]ΔhfUs[N]2sδOs(1);\mathbb{E}_{h\in[N]}\lVert\Delta_{h}f\rVert_{U^{s}[N]}^{2^{s}}\geq\delta^{O_{s}(1)};

this implicitly uses that ΔhfUs[N]=ΔhfUs[N]\lVert\Delta_{h}f\rVert_{U^{s}[N]}=\lVert\Delta_{-h}f\rVert_{U^{s}[N]} and that Δhf\Delta_{h}f is identically zero for |h|>N|h|>N.

Therefore there exists H[N]H\subseteq[N] with |H|δOs(1)N|H|\geq\delta^{O_{s}(1)}N such that

ΔhfUs[N]2sδOs(1)\lVert\Delta_{h}f\rVert_{U^{s}[N]}^{2^{s}}\geq\delta^{O_{s}(1)}

for hHh\in H.
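The two facts used so far, the recursive description of the Gowers norm through multiplicative derivatives and the averaging step that extracts the set $H$, can be illustrated numerically. The sketch below is a toy model under stated assumptions: it works on the cyclic group $\mathbb{Z}/N\mathbb{Z}$ (so the normalization by $\mathbbm{1}_{[N]}$ and the $\delta^{O_s(1)}$ losses on $[N]$ are ignored), and the function names are hypothetical.

```python
import numpy as np

def multiplicative_derivative(f, h):
    """Delta_h f(x) = f(x) * conj(f(x + h)), with indices taken mod N."""
    return f * np.conj(np.roll(f, -h))

def gowers_power(f, s):
    """||f||_{U^s(Z/NZ)}^{2^s}, computed as E_h ||Delta_h f||_{U^{s-1}}^{2^{s-1}}."""
    if s == 1:
        return float(abs(np.mean(f)) ** 2)
    n = len(f)
    return float(np.mean([gowers_power(multiplicative_derivative(f, h), s - 1)
                          for h in range(n)]))

def popular_shifts(values, threshold):
    """Indices h whose value is at least threshold / 2 (the averaging step)."""
    return [h for h, v in enumerate(values) if v >= threshold / 2]

rng = np.random.default_rng(0)
N = 16
f = np.exp(2j * np.pi * rng.random(N))   # a 1-bounded test function
per_shift = [gowers_power(multiplicative_derivative(f, h), 2) for h in range(N)]
u3_power = gowers_power(f, 3)            # equals the mean of per_shift
H = popular_shifts(per_shift, u3_power)
```

Since each entry of `per_shift` lies in $[0,1]$ and their mean is `u3_power`, at least a `u3_power / 2` fraction of the shifts must land in `H`; this is the cyclic analogue of the lower bound on $|H|$.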

By the inductive hypothesis (Theorem 1.2 for $s-1$), we may assume that for all such $h\in H$ there exist a nilmanifold $G_{h}/\Gamma_{h}$ with a degree $s-1$ filtration, a function $F_{h}\colon G_{h}/\Gamma_{h}\to\mathbb{C}$, and an associated polynomial sequence $g_{h}(\cdot)$ such that

\[\mathbb{E}_{n\in[N]}[\Delta_{h}f(n)\overline{F_{h}(g_{h}(n)\Gamma)}]\geq\rho\]

where $G_{h}/\Gamma_{h}$ has complexity bounded by $M$ and dimension bounded by $d$, and $F_{h}$ is $M$-Lipschitz. We may take

M,ρ1exp(log(1/δ)Os(1)) and dlog(1/δ)Os(1).M,\rho^{-1}\leq\exp(\log(1/\delta)^{O_{s}(1)})\text{ and }d\leq\log(1/\delta)^{O_{s}(1)}.

Note that via writing gh(0)={gh(0)}[gh(0)]g_{h}(0)=\{g_{h}(0)\}[g_{h}(0)] where ψGh({gh(0)})[0,1)dim(Gh)\psi_{G_{h}}(\{g_{h}(0)\})\in[0,1)^{\dim(G_{h})} and [gh(0)]Γh[g_{h}(0)]\in\Gamma_{h}, we have that

\begin{align*}
F_{h}(g_{h}(n)\Gamma)&=F_{h}(\{g_{h}(0)\}\{g_{h}(0)\}^{-1}g_{h}(n)[g_{h}(0)]^{-1}\cdot[g_{h}(0)]\Gamma)\\
&=F_{h}(\{g_{h}(0)\}\{g_{h}(0)\}^{-1}g_{h}(n)[g_{h}(0)]^{-1}\Gamma).
\end{align*}

Note that gh(n)={gh(0)}1gh(n)[gh(0)]1g_{h}^{\prime}(n)=\{g_{h}(0)\}^{-1}g_{h}(n)[g_{h}(0)]^{-1} has gh(0)=idGhg_{h}^{\prime}(0)=\mathrm{id}_{G_{h}} and Fh=Fh({gh(0)})F_{h}^{\prime}=F_{h}(\{g_{h}(0)\}\cdot) is appropriately Lipschitz (as {gh(0)}\{g_{h}(0)\} has appropriately bounded coordinates by [42, Lemma B.2]). Therefore without loss we may assume that gh(0)=idGg_{h}(0)=\mathrm{id}_{G} for all hHh\in H.

Next note that there are only Os(M)Os(dO(1))O_{s}(M)^{O_{s}(d^{O(1)})} nilmanifolds of dimension at most dd with degree (s1)(s-1) filtration of complexity bounded by MM (up to isomorphism). This follows from Lie’s third theorem on the correspondence between Lie algebras and connected, simply connected Lie groups and counting the total possible number of different structure constants and filtration choices for the Lie algebra. Therefore by Pigeonhole we may assume, at the cost of decreasing the size of set HH by a multiplicative factor of Os(M)Os(dO(1))O_{s}(M)^{O_{s}(-d^{O(1)})}, that Gh/Γh=G/ΓG_{h}/\Gamma_{h}=G/\Gamma (and the corresponding filtration) is independent of hHh\in H.

We next remove the dependence on hh for the function FhF_{h}. Let γ\gamma be a parameter to be chosen later; by applying Lemma B.3 we may write

Fh(gΓ)=jIτj(gΓ)2Fh(gΓ)F_{h}(g\Gamma)=\sum_{j\in I}\tau_{j}(g\Gamma)^{2}\cdot F_{h}(g\Gamma)

where $|I|\leq(1/\gamma)^{O_{s}(d^{O_{s}(1)})}$, every $g\Gamma$ lies in the support of at most $2^{O_{s}(d)}$ of the $\tau_{j}$, and the $\tau_{j}$ are $(M/\gamma)^{O_{s}(d^{O_{s}(1)})}$-Lipschitz. Furthermore each $\tau_{j}$ is supported on a width $2\gamma$ cube near the origin (in Mal’cev coordinates); see the third item of Lemma B.3 for a precise description. Since $F_{h}$ is an $M$-Lipschitz function, choosing $\gamma$ sufficiently small with respect to $(\rho/M)^{O_{s}(d^{O_{s}(1)})}$, we find that

supgG|Fh(gΓ)jIajτj(gΓ)2|ρ/2\sup_{g\in G}|F_{h}(g\Gamma)-\sum_{j\in I}a_{j}\tau_{j}(g\Gamma)^{2}|\leq\rho/2

by taking aja_{j} to be the mean of FhF_{h} on the support of τj\tau_{j}. Note that |aj|M|a_{j}|\leq M. Pigeonholing over jIj\in I and decreasing ρ\rho and the size of HH by appropriate factors of Os(M)Os(dOs(1))O_{s}(M)^{O_{s}(-d^{O_{s}(1)})}, we may assume that Fh=FF_{h}=F for all hHh\in H.

We finally want to replace FF by a nilcharacter with a vertical frequency and the claimed output dimension bound. We first give GG a degree-rank (s1,s1)(s-1,s-1) filtration induced by its degree s1s-1 filtration. This is done via [34, Example 6.11] (i.e., G(d,r)G_{(d,r)} is generated by iterated commutators which either have filtration depths adding to greater than dd or adding to exactly dd with at least rr participating elements). Lemma 2.1 guarantees each subgroup is MOs(dOs(1))M^{O_{s}(d^{O_{s}(1)})}-rational. Via [42, Lemma B.11], we may give GG a Mal’cev basis adapted to this degree-rank (s1,s1)(s-1,s-1) filtration with complexity MOs(dOs(1))M^{O_{s}(d^{O_{s}(1)})}.

Via Fourier expansion (see [42, Lemma A.6]) and the triangle inequality we may additionally assume that $F$ has a vertical $G_{(s-1,s-1)}$-frequency $\eta$ with height at most $O_{s}(M/\rho)^{O_{s}(d^{O_{s}(1)})}=\exp(\log(1/\delta)^{O_{s}(1)})$. (Here we apply [42, Lemma A.6] to the degree filtration $G_{(0,0)}=G_{(1,0)}\geqslant G_{(2,0)}\geqslant\cdots\geqslant G_{(s-1,0)}\geqslant G_{(s-1,s-1)}\geqslant\mathrm{Id}_{G}$.) Given $F$, there exists a nilcharacter $F_{\eta}$ by Lemma B.4 with vertical frequency $\eta$, output dimension bounded by $2^{O_{s}(d)}$, and such that each coordinate is $O_{s}(M)^{O_{s}(d^{O_{s}(1)})}$-Lipschitz. The function $(F/(2\lVert F\rVert_{\infty}),F_{\eta}\cdot\sqrt{1-|F/(2\lVert F\rVert_{\infty})|^{2}})$ demonstrates that without loss of generality, we may assume $F$ is a coordinate of a nilcharacter.
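The padding construction in the last sentence is easy to test pointwise: if $F_{\eta}$ has unit norm at every point, then so does the padded vector, and $F$ is recovered from its first coordinate up to the scalar $2\lVert F\rVert_{\infty}$. The sketch below checks only this norm computation on hypothetical sampled values; it says nothing about vertical frequencies or Lipschitz constants, and `embed_as_coordinate` is an illustrative name.

```python
import numpy as np

def embed_as_coordinate(F_vals, F_eta_vals):
    """Rows are (F/(2||F||_inf), F_eta * sqrt(1 - |F/(2||F||_inf)|^2))."""
    scaled = F_vals / (2 * np.max(np.abs(F_vals)))   # modulus at most 1/2
    weight = np.sqrt(1 - np.abs(scaled) ** 2)        # real and positive
    return np.concatenate([scaled[:, None], F_eta_vals * weight[:, None]], axis=1)

rng = np.random.default_rng(1)
P, D = 50, 4
# Hypothetical samples of F and of a unit-norm vector-valued F_eta at P points.
F_vals = rng.normal(size=P) + 1j * rng.normal(size=P)
F_eta_vals = rng.normal(size=(P, D)) + 1j * rng.normal(size=(P, D))
F_eta_vals /= np.linalg.norm(F_eta_vals, axis=1, keepdims=True)

chi_vals = embed_as_coordinate(F_vals, F_eta_vals)
row_norms = np.linalg.norm(chi_vals, axis=1)         # should all equal 1
```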

To complete the deduction, we take G/ΓG^{\ast}/\Gamma^{\ast} to be the trivial nilmanifold and g(h,n)g(h,n) to be a constant sequence. ∎

The heart of this paper is the following quantification of [34, Theorem 7.2], the proof of which is the goal of the next few sections culminating in Section 11.2.

Lemma 6.3.

Fix s2s\geq 2 and 1rs11\leq r^{\ast}\leq s-1. Suppose f:[N]f\colon[N]\to\mathbb{C} is a 11-bounded function and Nexp(Ωs((dlog(MD/ρ))Ωs(1)))N\geq\exp(\Omega_{s}((d\log(MD/\rho))^{\Omega_{s}(1)})).

Furthermore suppose that there exists a degree-rank (s1,r)(s-1,r^{\ast}) correlation structure associated to ff with parameters ρ\rho, MM, dd, and DD. Then there exists a degree-rank (s1,r1)(s-1,r^{\ast}-1) correlation structure associated to ff with parameters ρ\rho^{\prime}, MM^{\prime}, dd^{\prime}, and DD^{\prime} such that

ρ1,M,Dexp(Os((dlog(MD/ρ))Os(1))) and dOs((dlog(MD/ρ))Os(1)).\rho^{\prime-1},M^{\prime},D^{\prime}\leq\exp(O_{s}((d\log(MD/\rho))^{O_{s}(1)}))\emph{ and }d^{\prime}\leq O_{s}((d\log(MD/\rho))^{O_{s}(1)}).

Combining Lemma 6.3 with the observation that degree-rank $(s-1,0)$ nilmanifolds induce a degree $(s-2)$ filtration (coming from the groups $G_{(i,0)}$), we immediately obtain the following. In particular, the nilcharacters $\chi_{h}(n)$ can now be “hidden” inside the nilmanifolds implicit in $\operatorname{Corr}(\cdot,\cdot,\cdot,\cdot)$.

Theorem 6.4.

Fix δ(0,1/2)\delta\in(0,1/2) and s2s\geq 2. Assume Theorem 1.2 for s1s-1. Let f:[N]f\colon[N]\to\mathbb{C} be a 11-bounded function such that

fUs+1[N]δ.\lVert f\rVert_{U^{s+1}[N]}\geq\delta.

Then the following data exists:

  • A subset H[N]H\subseteq[N] of size at least ρN\rho N;

  • A multidegree (1,s1)(1,s-1) nilcharacter χ(h,n)\chi(h,n) which lives on a nilmanifold G/ΓG^{\ast}/\Gamma^{\ast} where χ\chi has a G(1,s1)G^{\ast}_{(1,s-1)}-vertical frequency η\eta^{\ast}. Furthermore G/ΓG^{\ast}/\Gamma^{\ast} has dimension bounded by dd and complexity bounded by MM, the function underlying χ\chi is MM-Lipschitz, η\eta^{\ast} has height bounded by MM, and the output dimension of χ\chi is bounded by DD;

  • For all hHh\in H we have that

    Δhf(n)χ(h,n)¯Corr(s2,ρ,M,d).\Delta_{h}f(n)\otimes\overline{\chi(h,n)}\in\operatorname{Corr}(s-2,\rho,M,d).

Furthermore, we can find such data satisfying

ρ1,M,Dexp(log(1/δ)Os(1)) and dlog(1/δ)Os(1).\rho^{-1},M,D\leq\exp(\log(1/\delta)^{O_{s}(1)})\emph{ and }d\leq\log(1/\delta)^{O_{s}(1)}.
Remark.

The case when NN is small (i.e., Nexp(log(1/δ)Os(1))N\leq\exp(\log(1/\delta)^{O_{s}(1)})) is handled via noting that ΔhfL2[N]exp(log(1/δ)Os(1))NO(1)\lVert\Delta_{h}f\rVert_{L^{2}[N]}\geq\exp(\log(1/\delta)^{O_{s}(1)})\cdot N^{-O(1)} for many hh and then applying Fourier analysis. Such an analysis always loses factors of NN and thus is only useful in this crude edge case. We will not comment further on such issues.

7. On a Cauchy–Schwarz Argument of Gowers

The proof of Lemma 6.3 is performed in a sequence of stages. We first deduce that the functions correlating with Δhf\Delta_{h}f are not arbitrary. Indeed for many additive quadruples (h1,h2,h3,h4)(h_{1},h_{2},h_{3},h_{4}) we have that the associated tensor product of χh(n)\chi_{h}(n) exhibits correlation with a degree (s2)(s-2) nilsequence.

We first need the following elementary Fourier-analytic lemma which converts correlation on long progressions to correlation with a major-arc Fourier phase; this is essentially [32, Lemma 3.5(ii)].

Lemma 7.1.

Let δ(0,1/2)\delta\in(0,1/2). Suppose that g:[N]g\colon[N]\to\mathbb{C} is 11-bounded and there exists an arithmetic progression PP of length δN\delta N with common difference qq within [N][N] such that

|𝔼nPg(n)|δ.\big{|}\mathbb{E}_{n\in P}g(n)\big{|}\geq\delta.

Then there exists Θ\Theta\in\mathbb{R} such that qΘ/δO(1)N1\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\leq\delta^{-O(1)}N^{-1} and

\[\big|\mathbb{E}_{n\in[N]}e(\Theta n)g(n)\big|\geq\delta^{O(1)}.\]
Proof.

Extend gg to be zero beyond the interval [N][N]. Let PP^{\prime} be the arithmetic progression of length δ2N\delta^{2}N with common difference qq centered at 0. We have

|n(𝟙P(|P|1𝟙P))(n)g(n)|δO(1)N.\bigg{|}\sum_{n\in\mathbb{Z}}(\mathbbm{1}_{P}\ast(|P^{\prime}|^{-1}\mathbbm{1}_{P^{\prime}}))(n)g(n)\bigg{|}\geq\delta^{O(1)}N.

Via Fourier inversion, we have

|Θ𝕋g^(Θ)𝟙P^(Θ)𝟙P^(Θ)¯𝑑Θ|δO(1)N2.\bigg{|}\int_{\Theta\in\mathbb{T}}\widehat{g}(\Theta)\overline{\widehat{\mathbbm{1}_{P}}(\Theta)\widehat{\mathbbm{1}_{P^{\prime}}}(\Theta)}d\Theta\bigg{|}\geq\delta^{O(1)}N^{2}.

Now via standard bounds on linear exponential sums, we have

\[|\widehat{\mathbbm{1}_{P}}(\Theta)|,\ |\widehat{\mathbbm{1}_{P^{\prime}}}(\Theta)|\lesssim\min(\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}^{-1},N).\]
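The exponential sum bound above is the usual geometric series estimate, and one explicit form of it can be checked numerically. The sketch below assumes the convention $\widehat{\mathbbm{1}_{P}}(\Theta)=\sum_{n\in P}e(-\Theta n)$ and tests the inequality $|\widehat{\mathbbm{1}_{P}}(\Theta)|\leq\min(|P|,1/(2\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}))$; the function names and the random test parameters are illustrative.

```python
import numpy as np

def dist_to_integer(x):
    """Distance from x to the nearest integer, i.e. the R/Z norm."""
    return abs(x - np.round(x))

def ap_fourier_abs(start, q, length, theta):
    """|sum_{n in P} e(-theta * n)| for P = {start + j*q : 0 <= j < length}."""
    n = start + q * np.arange(length)
    return abs(np.sum(np.exp(-2j * np.pi * theta * n)))

def geometric_bound(q, length, theta):
    """min(|P|, 1 / (2 ||q theta||)), from |sin(pi x)| >= 2 ||x||."""
    d = dist_to_integer(q * theta)
    return float(length) if d == 0 else min(float(length), 1.0 / (2.0 * d))

rng = np.random.default_rng(3)
checks = []
for _ in range(200):
    q = int(rng.integers(1, 8))
    length = int(rng.integers(1, 60))
    theta = float(rng.random())
    checks.append(ap_fourier_abs(5, q, length, theta)
                  <= geometric_bound(q, length, theta) + 1e-8)
```

Since $|P|\leq N$, this is consistent with the bound $\min(\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}^{-1},N)$ used in the proof.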

Since |g^(Θ)|N|\widehat{g}(\Theta)|\leq N, we have that

|qΘ/T/Ng^(Θ)𝟙P^(Θ)𝟙P^(Θ)𝑑Θ|N2/T.\bigg{|}\int_{\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\geq T/N}\widehat{g}(\Theta)\widehat{\mathbbm{1}_{P}}(\Theta)\widehat{\mathbbm{1}_{P^{\prime}}}(\Theta)d\Theta\bigg{|}\lesssim N^{2}/T.

Therefore, taking T=δO(1)T=\delta^{-O(1)} sufficiently large we have that

N2qΘ/T/N|g^(Θ)|𝑑Θ|qΘ/T/Ng^(Θ)𝟙P^(Θ)𝟙P^(Θ)𝑑Θ|δO(1)N2.N^{2}\int_{\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\leq T/N}|\widehat{g}(\Theta)|d\Theta\geq\bigg{|}\int_{\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\leq T/N}\widehat{g}(\Theta)\widehat{\mathbbm{1}_{P}}(\Theta)\widehat{\mathbbm{1}_{P^{\prime}}}(\Theta)d\Theta\bigg{|}\geq\delta^{O(1)}N^{2}.

Thus

supqΘ/T/N|g^(Θ)|δO(1)T1N,\sup_{\lVert q\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\leq T/N}|\widehat{g}(\Theta)|\geq\delta^{O(1)}T^{-1}N,

which is exactly the desired conclusion (recalling that T=δO(1)T=\delta^{-O(1)}). ∎

The following lemma is due ultimately to Gowers but essentially appears as [32, Proposition 6.1]. We include the proof for the sake of completeness.

Lemma 7.2.

Suppose δ(0,1/2)\delta\in(0,1/2), f1,f2:[N]f_{1},f_{2}\colon[N]\to\mathbb{C} are 11-bounded, and χh:\chi_{h}\colon\mathbb{Z}\to\mathbb{C} are all 11-bounded. Suppose that

𝔼h[N]|𝔼n[N]f2(n)Δhf1(n)χh(n)¯|δ.\mathbb{E}_{h\in[N]}|\mathbb{E}_{n\in[N]}f_{2}(n)\Delta_{h}f_{1}(n)\overline{\chi_{h}(n)}|\geq\delta.

Then there exists Θ\Theta such that Θ/δO(1)/N\lVert\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\leq\delta^{-O(1)}/N and

𝔼h1+h2=h3+h4hi[N]|𝔼n[N]χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯e(Θn)|δO(1).\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\bigg{|}\mathbb{E}_{n\in[N]}\chi_{h_{1}}(n)\chi_{h_{2}}(n+h_{1}-h_{4})\overline{\chi_{h_{3}}(n)}\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\cdot e\big{(}\Theta n\big{)}\bigg{|}\geq\delta^{O(1)}.
Proof.

Note that we assume that χh(n)=0\chi_{h}(n)=0 for h[N]h\notin[N] and that χh(n)=0\chi_{h}(n)=0 for n[N]n\notin[N] via replacing χh(n)\chi_{h}(n) with χh(n)𝟙n[N]\chi_{h}(n)\cdot\mathbbm{1}_{n\in[N]}; we will remove this truncation at the end of the argument. We extend these functions by 0 to /N~\mathbb{Z}/\widetilde{N}\mathbb{Z} where N~\widetilde{N} is a prime between 4N4N and 8N8N.

By Cauchy–Schwarz, we have

𝔼h/N~|𝔼n/N~f2(n)Δhf1(n)χh(n)¯|2δ2.\mathbb{E}_{h\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}|\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}f_{2}(n)\Delta_{h}f_{1}(n)\overline{\chi_{h}(n)}|^{2}\gg\delta^{2}.

Expanding, this is equivalent to

𝔼h/N~𝔼n1,n2/N~f2(n1)f1(n1)f1(n1+h)¯f2(n2)f1(n2)¯f1(n2+h)χh(n1)¯χh(n2)δ2.\mathbb{E}_{h\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\mathbb{E}_{n_{1},n_{2}\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}f_{2}(n_{1})f_{1}(n_{1})\overline{f_{1}(n_{1}+h)}\overline{f_{2}(n_{2})f_{1}(n_{2})}f_{1}(n_{2}+h)\overline{\chi_{h}(n_{1})}\chi_{h}(n_{2})\gg\delta^{2}.

We set n=n1n=n_{1}, k=n2n1k=n_{2}-n_{1}, and m=n1+hm=n_{1}+h and find that

𝔼m,n/N~,k/N~Δk(f2f1)(n)Δkf1(m)¯Δkχmn(n)¯δ2.\mathbb{E}_{m,n\in\mathbb{Z}/\widetilde{N}\mathbb{Z},k\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\Delta_{k}(f_{2}f_{1})(n)\Delta_{k}\overline{f_{1}(m)}\Delta_{k}\overline{\chi_{m-n}(n)}\gtrsim\delta^{2}.

This implies that

𝔼k/N~|𝔼m,n/N~Δk(f2f1)(n)Δkf1(m)¯Δkχmn(n)¯|4δ8.\mathbb{E}_{k\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}|\mathbb{E}_{m,n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\Delta_{k}(f_{2}f_{1})(n)\Delta_{k}\overline{f_{1}(m)}\Delta_{k}\overline{\chi_{m-n}(n)}|^{4}\gtrsim\delta^{8}.

Recall the box-norm inequality that for a,b,Φa,b,\Phi which are 11-bounded, we have

\begin{align*}
|\mathbb{E}_{n,m\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}a(n)b(m)\Phi(n,m)|^{4}
&\leq\big(\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}|\mathbb{E}_{m\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}b(m)\Phi(n,m)|\big)^{4}\\
&\leq\big(\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}|\mathbb{E}_{m\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}b(m)\Phi(n,m)|^{2}\big)^{2}\\
&=\big(\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\mathbb{E}_{m,m^{\prime}\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}b(m)\overline{b(m^{\prime})}\Phi(n,m)\overline{\Phi(n,m^{\prime})}\big)^{2}\\
&\leq\big(\mathbb{E}_{m,m^{\prime}\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}|\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\Phi(n,m)\overline{\Phi(n,m^{\prime})}|\big)^{2}\\
&\leq\mathbb{E}_{m,m^{\prime}\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}|\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\Phi(n,m)\overline{\Phi(n,m^{\prime})}|^{2}\\
&=\mathbb{E}_{n,n^{\prime},m,m^{\prime}\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\Phi(n,m)\overline{\Phi(n,m^{\prime})}\overline{\Phi(n^{\prime},m)}\Phi(n^{\prime},m^{\prime}).\tag{7.1}
\end{align*}
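The box-norm inequality (7.1) can be checked on random data. The following sketch, with illustrative function names, computes both sides for $1$-bounded $a$, $b$, $\Phi$ on a small index set, and also checks the equality case where $\Phi(n,m)=u(n)v(m)$ is a product of unimodular functions.

```python
import numpy as np

def weighted_average_fourth(a, b, phi):
    """|E_{n,m} a(n) b(m) Phi(n,m)|^4 for a K x K array phi."""
    K = len(a)
    return float(abs(a @ phi @ b / K**2) ** 4)

def box_norm_fourth(phi):
    """E_{n,n',m,m'} Phi(n,m) conj(Phi(n,m')) conj(Phi(n',m)) Phi(n',m')."""
    K = phi.shape[0]
    gram = phi @ phi.conj().T        # gram[n, n'] = sum_m Phi(n,m) conj(Phi(n',m))
    return float(np.sum(np.abs(gram) ** 2) / K**4)

rng = np.random.default_rng(4)
K = 12
a = np.exp(2j * np.pi * rng.random(K))
b = np.exp(2j * np.pi * rng.random(K))
phi = np.exp(2j * np.pi * rng.random((K, K)))
lhs = weighted_average_fourth(a, b, phi)
rhs = box_norm_fourth(phi)

# Equality case: a rank-one unimodular Phi, tested against the conjugate weights.
u = np.exp(2j * np.pi * rng.random(K))
v = np.exp(2j * np.pi * rng.random(K))
rank_one = np.outer(u, v)
```

The right-hand side does not depend on $a$ or $b$, which is why the argument can discard the unknown functions $f_1,f_2$ and retain only the $\chi_h$.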

Applying this for each fixed kk, we have that

𝔼k/N~𝔼n,n,m,m/N~Δkχmn(n)¯Δkχmn(n)¯Δkχmn(n)Δkχmn(n)δ8.\mathbb{E}_{k\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\mathbb{E}_{n,n^{\prime},m,m^{\prime}\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\Delta_{k}\overline{\chi_{m-n}(n)}\Delta_{k}\overline{\chi_{m^{\prime}-n^{\prime}}(n^{\prime})}\Delta_{k}\chi_{m^{\prime}-n}(n)\Delta_{k}\chi_{m-n^{\prime}}(n^{\prime})\gtrsim\delta^{8}.

Take $m^{\prime}-n=h_{1}$, $m-n^{\prime}=h_{2}$, $m-n=h_{3}$, $m^{\prime}-n^{\prime}=h_{4}$. Then $n^{\prime}-n=h_{1}-h_{4}$ and $h_{1}+h_{2}=h_{3}+h_{4}$. Since $n$ and $n+k$ range over the whole cyclic group, this is exactly

𝔼h1+h2=h3+h4hi/N~|𝔼n/N~χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯|2δ8.\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in\mathbb{Z}/\widetilde{N}\mathbb{Z}\end{subarray}}\big{|}\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\chi_{h_{1}}(n)\chi_{h_{2}}(n+h_{1}-h_{4})\overline{\chi_{h_{3}}(n)}\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\big{|}^{2}\gtrsim\delta^{8}.

Since χh(n)=0\chi_{h}(n)=0 identically for h[N]h\notin[N], we in fact have

𝔼h1+h2=h3+h4hi[N]|𝔼n/N~χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯|2δ8.\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\big{|}\mathbb{E}_{n\in\mathbb{Z}/\widetilde{N}\mathbb{Z}}\chi_{h_{1}}(n)\chi_{h_{2}}(n+h_{1}-h_{4})\overline{\chi_{h_{3}}(n)}\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\big{|}^{2}\gtrsim\delta^{8}.

For the inner sum, recall that we “truncated” χh(n)\chi_{h}(n) with 𝟙n[N]\mathbbm{1}_{n\in[N]}. In particular, extracting the truncation term we have that

𝔼h1+h2=h3+h4hi[N]|𝔼n[N]𝟙1n+h1h4Nχh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯|2δ8.\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\bigg{|}\mathbb{E}_{n\in[N]}\mathbbm{1}_{1\leq n+h_{1}-h_{4}\leq N}\chi_{h_{1}}(n)\chi_{h_{2}}(n+h_{1}-h_{4})\overline{\chi_{h_{3}}(n)}\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\bigg{|}^{2}\gtrsim\delta^{8}.

Via an application of Lemma 7.1, there exist choices of Θh\Theta_{\vec{h}} with Θh/δO(1)/N\lVert\Theta_{\vec{h}}\rVert_{\mathbb{R}/\mathbb{Z}}\leq\delta^{-O(1)}/N such that

𝔼h1+h2=h3+h4hi[N]|𝔼n[N]χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯e(Θhn)|2δO(1).\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\big{|}\mathbb{E}_{n\in[N]}\chi_{h_{1}}(n)\chi_{h_{2}}(n+h_{1}-h_{4})\overline{\chi_{h_{3}}(n)}\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}e(\Theta_{\vec{h}}n)\big{|}^{2}\gtrsim\delta^{O(1)}.

Rounding Θh\Theta_{\vec{h}} to a lattice of spacing δO(1)/N\delta^{O(1)}/N and Pigeonholing then gives the desired result. ∎

The next proof requires the notion of when two nilcharacters are “equivalent” (i.e., have the same symbol in a quantified sense of [34, Appendix E]).

Definition 7.3.

We say nilcharacters χ,χ\chi,\chi^{\prime} are (M,D,d)(M,D,d)-equivalent for multidegree JJ if χ,χ\chi,\chi^{\prime} have output dimensions bounded by DD and all coordinates of

χχ¯\chi\otimes\overline{\chi^{\prime}}

can be represented as sums of at most MM nilsequences of multidegree JJ such that the underlying functions of each nilsequence are MM-Lipschitz and the underlying nilmanifolds have complexity bounded by MM and dimension bounded by dd.

The key reason for the definition of equivalence is the following proposition, which states that given equivalent nilcharacters χ\chi and χ\chi^{\prime}, correlations with them are equivalent modulo introducing a term of multidegree JJ. This is a finitary quantification of [34, Lemma E.7].

Lemma 7.4.

Given a function f:ΩLf\colon\Omega\to\mathbb{C}^{L} and nilcharacters χ,χ\chi,\chi^{\prime} which are (M,D,d)(M,D,d)-equivalent for multidegree JJ, if

𝔼nΩf(n)χ(n)ρ\lVert\mathbb{E}_{\vec{n}\in\Omega}f(\vec{n})\otimes\chi(\vec{n})\rVert_{\infty}\geq\rho

then

𝔼nΩf(n)χ(n)ψ(n)(ρ/(MD))O(1),\lVert\mathbb{E}_{\vec{n}\in\Omega}f(\vec{n})\otimes\chi^{\prime}(\vec{n})\cdot\psi(\vec{n})\rVert_{\infty}\geq(\rho/(MD))^{O(1)},

where $\psi$ can be taken to be one of the nilsequences used as part of a representation of one of the coordinates in $\chi\otimes\overline{\chi^{\prime}}$. In particular, $\psi$ is a nilsequence of multidegree $J$ such that the underlying nilmanifold has complexity bounded by $M$ and dimension bounded by $d$ and the underlying function has Lipschitz constant bounded by $M$.

Remark.

The additional condition that ψ\psi can be taken to be an explicit nilsequence occurring in a witness for the equivalence of χ,χ\chi,\chi^{\prime} is used primarily to allow us to Pigeonhole the choice of ψ\psi in cases where we may need to apply this statement “on average”.

Proof.

Notice that since χ\chi^{\prime} is a nilcharacter, we have that the trace of

χχ¯\chi^{\prime}\otimes\overline{\chi^{\prime}}

is the constant function 11. Furthermore note that the trace is the sum of at most DD coordinates of χχ¯\chi^{\prime}\otimes\overline{\chi^{\prime}} and therefore

𝔼nΩf(n)χ(n)χ(n)¯χ(n)ρ/D.\lVert\mathbb{E}_{\vec{n}\in\Omega}f(\vec{n})\otimes\chi(\vec{n})\otimes\overline{\chi^{\prime}(\vec{n})}\otimes\chi^{\prime}(\vec{n})\rVert_{\infty}\geq\rho/D.

Consider the coordinate of $f(\vec{n})\otimes\chi(\vec{n})\otimes\overline{\chi^{\prime}(\vec{n})}\otimes\chi^{\prime}(\vec{n})$ which achieves the $L^{\infty}$ norm above, and in particular the associated coordinate of $\chi(\vec{n})\otimes\overline{\chi^{\prime}(\vec{n})}$ that contributes. Applying the definition of equivalence and the triangle inequality, there exists $\psi(\vec{n})$ of the desired form such that

𝔼nΩf(n)χ(n)ψ(n)(ρ/(MD))O(1).\lVert\mathbb{E}_{\vec{n}\in\Omega}f(\vec{n})\otimes\chi^{\prime}(\vec{n})\cdot\psi(\vec{n})\rVert_{\infty}\geq(\rho/(MD))^{O(1)}.\qed
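In coordinates, the two pigeonhole steps of the proof above can be sketched as follows. The indices $a,b,j,k$ and the decomposition $\chi_{b}\overline{\chi^{\prime}_{j}}=\sum_{k\leq M}\psi_{b,j,k}$ are notation introduced here purely for illustration; the sketch records a loss of a factor $MD$, which is consistent with the polynomial loss allowed in the lemma.

```latex
\begin{align*}
\rho \le \big|\mathbb{E}_{\vec{n}\in\Omega} f_a(\vec{n})\chi_b(\vec{n})\big|
 &= \Big|\sum_{j\le D}\mathbb{E}_{\vec{n}\in\Omega} f_a(\vec{n})\chi_b(\vec{n})\overline{\chi'_j(\vec{n})}\chi'_j(\vec{n})\Big|
 && \text{since } \textstyle\sum_j |\chi'_j|^2 = 1 \\
 &\le D\max_{j}\big|\mathbb{E}_{\vec{n}\in\Omega} f_a(\vec{n})\chi'_j(\vec{n})\cdot\chi_b(\vec{n})\overline{\chi'_j(\vec{n})}\big| \\
 &= D\max_{j}\Big|\sum_{k\le M}\mathbb{E}_{\vec{n}\in\Omega} f_a(\vec{n})\chi'_j(\vec{n})\psi_{b,j,k}(\vec{n})\Big| \\
 &\le DM\max_{j,k}\big|\mathbb{E}_{\vec{n}\in\Omega} f_a(\vec{n})\chi'_j(\vec{n})\psi_{b,j,k}(\vec{n})\big|.
\end{align*}
```

Any pair $(j,k)$ attaining the final maximum gives a coordinate of $f\otimes\chi^{\prime}\cdot\psi$ of size at least $\rho/(MD)$, with $\psi=\psi_{b,j,k}$ taken from the witness for the equivalence.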

We are now in position to prove the quantification of [34, Proposition 7.3]. We remark that there was an error in the published version of [34, Proposition 8.3] which affected the proof of [34, Proposition 7.3]. We quantify a closely related approach to that given in the erratum [31]. For our proof we require various quantifications of [34, Appendix E]; all of these are completely mechanical.

Lemma 7.5.

Fix $s\geq 3$ and $1\leq r^{\ast}\leq s-1$. Let $f\colon[N]\to\mathbb{C}$ be a $1$-bounded function. Suppose that $f$ has a correlation structure with parameters $\rho$, $M$, $d$, and $D$ and associated nilcharacters $\chi(h,n)$ and $\chi_{h}(n)$. Then for at least $(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}N^{3}$ quadruples $h_{1},h_{2},h_{3},h_{4}\in H$ with $h_{1}+h_{2}=h_{3}+h_{4}$ we have

χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯Corr(s2,ρ,M,d)\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\in\operatorname{Corr}(s-2,\rho^{\prime},M^{\prime},d^{\prime})

with

\[\rho^{\prime-1},M^{\prime}\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}\quad\text{and}\quad d^{\prime}\leq O_{s}(d^{O_{s}(1)}).\]

Remark 7.6.

For s=2s=2, the same statement holds modulo a correction term of e(Θn)e(\Theta n) where Θ\Theta is such that Θ/(MD/ρ)Os(dOs(1))/N\lVert\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}/N.

Proof.

By definition of correlation structures we have for hHh\in H that

𝔼n[N](Δhf)(n)χ(h,n)¯χh(n)¯ψh(n)¯ρ\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\otimes\overline{\chi(h,n)}\otimes\overline{\chi_{h}(n)}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\geq\rho

where ψh\psi_{h} is a nilsequence of degree (s2)(s-2) whose underlying function is at most MM-Lipschitz on a nilmanifold of complexity at most MM and dimension at most dd. Setting χh(n)\chi_{h}(n) to be zero for hHh\notin H we have

𝔼h[N]𝔼n[N]f(n)f(n+h)¯χ(h,n)¯χh(n)¯ψh(n)¯ρ2.\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}f(n)\overline{f(n+h)}\otimes\overline{\chi(h,n)}\otimes\overline{\chi_{h}(n)}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\geq\rho^{2}.

Twisting ψh\psi_{h} by an appropriate hh-dependent constant complex phase so as to make the LL^{\infty} values be realized as positive real numbers, we may assume that

𝔼h[N]𝔼n[N]f(n)f(n+h)¯χ(h,n)¯χh(n)¯ψh(n)¯ρ2/D2.\lVert\mathbb{E}_{h\in[N]}\mathbb{E}_{n\in[N]}f(n)\overline{f(n+h)}\otimes\overline{\chi(h,n)}\otimes\overline{\chi_{h}(n)}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\geq\rho^{2}/D^{2}.

By Lemma C.5, we have that χ(h,n)\chi(h,n) is ((MD)Os(dOs(1)),(MD)Os(dOs(1)),dOs(1))((MD)^{O_{s}(d^{O_{s}(1)})},(MD)^{O_{s}(d^{O_{s}(1)})},d^{O_{s}(1)})-equivalent for degree (s1)(s-1) to some χ~(h,n,,n)\widetilde{\chi}(h,n,\ldots,n) which is a multidegree (1,,1)(1,\ldots,1) nilcharacter with output dimension, complexity of underlying nilmanifold, Lipschitz constant of underlying function for each coordinate, and vertical frequency height all bounded by (MD)Os(dOs(1))(MD)^{O_{s}(d^{O_{s}(1)})}. (χ~\widetilde{\chi} has ss total arguments.) Thus, applying Lemma 7.4, we have that

𝔼n,h[N]f(n)f(n+h)¯χ~(h,n,,n)¯χh(n)¯ψh(n)¯ψ~(h,n)(MD/ρ)Os(dOs(1)),\displaystyle\lVert\mathbb{E}_{n,h\in[N]}f(n)\overline{f(n+h)}\otimes\overline{\widetilde{\chi}(h,n,\ldots,n)}\otimes\overline{\chi_{h}(n)}\cdot\overline{\psi_{h}(n)}\cdot\widetilde{\psi}(h,n)\rVert_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})},

where $\widetilde{\psi}(h,n)$ is a degree $(s-1)$ nilsequence where the underlying function has Lipschitz norm and complexity of underlying nilmanifold bounded by $(MD)^{O_{s}(d^{O_{s}(1)})}$ while the dimension of the underlying nilmanifold is bounded by $O_{s}(d^{O_{s}(1)})$. The nilsequence $\widetilde{\psi}(h,n)$ can also be viewed as a multidegree $(0,s-1)\cup(s-1,s-2)$ nilsequence (i.e., we take the union of the down-sets generated by these elements), with the same bounds on the Lipschitz norm, complexity, and dimension.
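The passage from a total-degree bound to a multidegree down-set rests on a small combinatorial containment, which can be checked by brute force. The sketch below is an illustration added here, not part of the paper's argument, and the function names are hypothetical: it verifies that every exponent vector of total degree at most $s-1$ lies in the union of the down-sets generated by the given multidegrees.

```python
from itertools import product

def in_down_set(v, generators):
    """True if v is coordinatewise <= some generator (v lies in the union of down-sets)."""
    return any(all(a <= b for a, b in zip(v, g)) for g in generators)

def total_degree_contained(s, generators, nvars):
    """Check that every exponent vector of total degree <= s - 1 lies in the down-set union."""
    return all(
        in_down_set(v, generators)
        for v in product(range(s), repeat=nvars)
        if sum(v) <= s - 1
    )

# Two variables (h, n): down-sets generated by (0, s-1) and (s-1, s-2).
two_var_ok = all(
    total_degree_contained(s, [(0, s - 1), (s - 1, s - 2)], 2) for s in range(2, 8)
)

# A single generator (0, s-1) is not enough: the monomial h^1 n^0 is missed.
one_gen_fails = not total_degree_contained(3, [(0, 2)], 2)
```

The same helper applies with more variables, for instance to a five-variable multidegree of the shape $(s-1,0,\ldots,0)\cup(s-2,s-1,\ldots,s-1)$.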

Thus, applying Lemma C.6 (splitting) we have

𝔼n,h[N]f(n)f(n+h)¯χ~(h,n,,n)¯χh(n)¯ψh~(n)¯b(n)(MD/ρ)Os(dOs(1))\displaystyle\lVert\mathbb{E}_{n,h\in[N]}f(n)\overline{f(n+h)}\otimes\overline{\widetilde{\chi}(h,n,\ldots,n)}\otimes\overline{\chi_{h}(n)}\cdot\overline{\widetilde{\psi_{h}}(n)}\cdot b(n)\rVert_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}

where ψh~\widetilde{\psi_{h}} are degree (s2)(s-2) nilsequences in nn where complexity and Lipschitz constant are bounded by (MD)Os(dOs(1))(MD)^{O_{s}(d^{O_{s}(1)})} and the dimension of the underlying nilmanifold is bounded by Os(dOs(1))O_{s}(d^{O_{s}(1)}) while b(n)b(n) is (MD)Os(dOs(1))(MD)^{O_{s}(d^{O_{s}(1)})}-bounded. Therefore, applying Lemma 7.2, we have

𝔼h1+h2=h3+h4hi[N]𝔼n[N]χ~(h1,n,,n)χ~(h2,n+h1h4,,n+h1h4)χ~(h3,n,,n)¯\displaystyle\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\lVert\mathbb{E}_{n\in[N]}\widetilde{\chi}(h_{1},n,\ldots,n)\otimes\widetilde{\chi}(h_{2},n+h_{1}-h_{4},\ldots,n+h_{1}-h_{4})\otimes\overline{\widetilde{\chi}(h_{3},n,\ldots,n)}
χ~(h4,n+h1h4,,n+h1h4)¯χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯\displaystyle\otimes\overline{\widetilde{\chi}(h_{4},n+h_{1}-h_{4},\ldots,n+h_{1}-h_{4})}\otimes\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}
ψh1~(n)ψh2~(n)ψh3~(n)¯ψh4~(n)¯e(Θn)(MD/ρ)Os(dOs(1)).\displaystyle\cdot\widetilde{\psi_{h_{1}}}(n)\widetilde{\psi_{h_{2}}}(n)\overline{\widetilde{\psi_{h_{3}}}(n)}\overline{\widetilde{\psi_{h_{4}}}(n)}e(\Theta n)\rVert_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}.

We may combine $\widetilde{\psi_{h_{1}}}(n)\widetilde{\psi_{h_{2}}}(n)\overline{\widetilde{\psi_{h_{3}}}(n)}\overline{\widetilde{\psi_{h_{4}}}(n)}e(\Theta n)$ to form $\psi_{h_{1},h_{2},h_{3},h_{4}}^{\ast}(n)$, which is degree $(s-2)$ in $n$ and has identical complexity bounds to $\widetilde{\psi_{h_{1}}}$ modulo changing implicit constants. Additionally, we may twist $\psi_{h_{1},h_{2},h_{3},h_{4}}^{\ast}$ by an $(h_{1},h_{2},h_{3},h_{4})$-dependent complex phase to bring the outer expectation inside the norm. Thus we have

𝔼h1+h2=h3+h4hi[N]𝔼n[N]χ~(h1,n,,n)χ~(h2,n+h1h4,,n+h1h4)χ~(h3,n,,n)¯\displaystyle\lVert\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\mathbb{E}_{n\in[N]}\widetilde{\chi}(h_{1},n,\ldots,n)\otimes\widetilde{\chi}(h_{2},n+h_{1}-h_{4},\ldots,n+h_{1}-h_{4})\otimes\overline{\widetilde{\chi}(h_{3},n,\ldots,n)}
χ~(h4,n+h1h4,,n+h1h4)¯χh1(n)χh2(n+h1h4)χh3(n)¯\displaystyle\qquad\otimes\overline{\widetilde{\chi}(h_{4},n+h_{1}-h_{4},\ldots,n+h_{1}-h_{4})}\otimes\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}
χh4(n+h1h4)¯ψh1,h2,h3,h4(n)(MD/ρ)Os(dOs(1)).\displaystyle\qquad\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\cdot\psi^{\ast}_{h_{1},h_{2},h_{3},h_{4}}(n)\rVert_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}.

By Lemma C.5, $\widetilde{\chi}(h_{2},n+h_{1}-h_{4},\ldots,n+h_{1}-h_{4})$ is $((MD)^{O_{s}(d^{O_{s}(1)})},(MD)^{O_{s}(d^{O_{s}(1)})},d^{O_{s}(1)})$-equivalent for degree $(s-1)$ to

\[\bigotimes_{k=0}^{s-1}\widetilde{\chi}(h_{2},n,\ldots,n,h_{1}-h_{4},\ldots,h_{1}-h_{4})\]

where there are $s-k-1$ copies of $n$ and $k$ copies of $h_{1}-h_{4}$, and we have a similar expansion for $\widetilde{\chi}(h_{4},n+h_{1}-h_{4},\ldots,n+h_{1}-h_{4})$. Note that all terms in this expansion except for $k=0$ may be absorbed into $\psi^{\ast}$, since for fixed $\vec{h}$ they have degree at most $s-2$ in $n$. Therefore applying Lemma 7.4, we have that

𝔼h1+h2=h3+h4hi[N]𝔼n[N]χ~(h1,n,,n)χ~(h2,n,,n)χ~(h3,n,,n)¯\displaystyle\lVert\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\mathbb{E}_{n\in[N]}\widetilde{\chi}(h_{1},n,\ldots,n)\otimes\widetilde{\chi}(h_{2},n,\ldots,n)\otimes\overline{\widetilde{\chi}(h_{3},n,\ldots,n)}
χ~(h4,n,,n)¯χh1(n)χh2(n+h1h4)χh3(n)¯\displaystyle\qquad\otimes\overline{\widetilde{\chi}(h_{4},n,\ldots,n)}\otimes\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}
χh4(n+h1h4)¯ψh1,h2,h3,h4(n)τ(n,h1,h2,h3,h4)(MD/ρ)Os(dOs(1));\displaystyle\qquad\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\cdot\psi^{\ast}_{h_{1},h_{2},h_{3},h_{4}}(n)\cdot\tau(n,h_{1},h_{2},h_{3},h_{4})\rVert_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})};

here τ(n,h1,h2,h3,h4)\tau(n,h_{1},h_{2},h_{3},h_{4}) is a degree (s1)(s-1) nilsequence where the underlying function has Lipschitz norm and complexity of underlying nilmanifold bounded by (MD)Os(dOs(1))(MD)^{O_{s}(d^{O_{s}(1)})} while the dimension of the underlying nilmanifold is bounded by Os(dOs(1))O_{s}(d^{O_{s}(1)}) and we have folded certain terms into ψ\psi^{\ast} while guaranteeing it is a degree (s2)(s-2) nilsequence (and the complexity bounds have not changed modulo implicit constants). Finally via Lemma C.5, we have that

χ~(h1,n,,n)χ~(h2,n,,n)χ~(h3,n,,n)¯χ~(h4,n,,n)¯\widetilde{\chi}(h_{1},n,\ldots,n)\otimes\widetilde{\chi}(h_{2},n,\ldots,n)\otimes\overline{\widetilde{\chi}(h_{3},n,\ldots,n)}\otimes\overline{\widetilde{\chi}(h_{4},n,\ldots,n)}

and χ~(h1+h2h3h4,n,,n)\widetilde{\chi}(h_{1}+h_{2}-h_{3}-h_{4},n,\ldots,n) are ((MD)Os(dOs(1)),(MD)Os(dOs(1)),dOs(1))((MD)^{O_{s}(d^{O_{s}(1)})},(MD)^{O_{s}(d^{O_{s}(1)})},d^{O_{s}(1)})-equivalent for degree (s1)(s-1). Thus applying Lemma 7.4, we have

𝔼h1+h2=h3+h4hi[N]𝔼n[N]χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯\displaystyle\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\lVert\mathbb{E}_{n\in[N]}\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}
χ~(0,n,,n)ψh1,h2,h3,h4(n)τ(n,h1,h2,h3,h4)(MD/ρ)Os(dOs(1));\displaystyle\qquad\qquad\cdot\widetilde{\chi}(0,n,\ldots,n)\psi^{\ast}_{h_{1},h_{2},h_{3},h_{4}}(n)\tau(n,h_{1},h_{2},h_{3},h_{4})\rVert_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})};

here we have folded in various terms into τ(n,h1,,h4)\tau(n,h_{1},\ldots,h_{4}) and the complexity bounds have not changed modulo implicit constants. Note that by Lemma C.2, χ~(0,n,,n)\widetilde{\chi}(0,n,\ldots,n) is a degree (s1)(s-1) nilsequence in nn and thus may abusively also be absorbed into τ\tau. Finally noting that a degree (s1)(s-1) nilsequence may also be viewed as a multidegree (s1,0,,0)(s2,s1,,s1)(s-1,0,\ldots,0)\cup(s-2,s-1,\ldots,s-1) nilsequence and thus applying Lemma C.6 we have

𝔼h1+h2=h3+h4hi[N]𝔼n[N]χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯ψh1,h2,h3,h4(n)b(n)\displaystyle\lVert\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\mathbb{E}_{n\in[N]}\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\cdot\psi^{\ast}_{h_{1},h_{2},h_{3},h_{4}}(n)b(n)\rVert_{\infty}
(MD/ρ)Os(dOs(1)),\displaystyle\qquad\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})},

where b(n)b(n) is an (MD)Os(dOs(1))(MD)^{O_{s}(d^{O_{s}(1)})}-bounded function and ψ\psi^{\ast} has been modified but the underlying complexity bounds have not changed modulo implicit constants. Note ψ\psi^{\ast} is degree (s2)(s-2).

We now reparameterize with

h1=mn,h2=mn,h3=mn,h4=mn.h_{1}=m-n,h_{2}=m^{\prime}-n^{\prime},h_{3}=m^{\prime}-n,h_{4}=m-n^{\prime}.
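As a sanity check on this change of variables (a worked verification added here for the reader), the substitution respects the additive constraint and turns the shifted argument $n+h_{1}-h_{4}$ into $n^{\prime}$:

```latex
\begin{align*}
h_1+h_2 &= (m-n)+(m'-n') = (m'-n)+(m-n') = h_3+h_4,\\
n+h_1-h_4 &= n+(m-n)-(m-n') = n'.
\end{align*}
```

This is why the nilcharacters indexed by $h_{2}$ and $h_{4}$ are evaluated at $n^{\prime}$ after the reparameterization.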

By approximating with regions where we take m,m,n,nm,m^{\prime},n,n^{\prime} to live in short intervals, there exist intervals I1,,I4I_{1},\ldots,I_{4} each of density (MD/ρ)Os(dOs(1))(MD/\rho)^{-O_{s}(d^{O_{s}(1)})} in [±2N][\pm 2N] such that

𝔼mI1,mI2,nI3,nI4\displaystyle\bigg{\lVert}\mathbb{E}_{m\in I_{1},m^{\prime}\in I_{2},n\in I_{3},n^{\prime}\in I_{4}} χmn(n)χmn(n)χmn(n)¯χmn(n)¯\displaystyle\chi_{m-n}(n)\otimes\chi_{m^{\prime}-n^{\prime}}(n^{\prime})\otimes\overline{\chi_{m^{\prime}-n}(n)}\otimes\overline{\chi_{m-n^{\prime}}(n^{\prime})}
ψmn,mn,mn,mn(n)b(n)(MD/ρ)Os(dOs(1))\displaystyle\qquad\cdot\psi^{\ast}_{m-n,m^{\prime}-n^{\prime},m^{\prime}-n,m-n^{\prime}}(n)b(n)\bigg{\rVert}_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}

where $\psi^{\ast}_{m-n,m^{\prime}-n^{\prime},m^{\prime}-n,m-n^{\prime}}$ is a degree $(s-2)$ nilsequence. Now by Cauchy–Schwarz, duplicating the variable $m$ and denoting the copies by $m,m^{\prime\prime}$, we obtain

𝔼m,m′′I1,mI2,nI3,nI4\displaystyle\bigg{\lVert}\mathbb{E}_{m,m^{\prime\prime}\in I_{1},m^{\prime}\in I_{2},n\in I_{3},n^{\prime}\in I_{4}} χmn(n)χm′′n(n)¯χmn(n)¯χm′′n(n)\displaystyle\chi_{m-n}(n)\otimes\overline{\chi_{m^{\prime\prime}-n}(n)}\otimes\overline{\chi_{m-n^{\prime}}(n^{\prime})}\otimes\chi_{m^{\prime\prime}-n^{\prime}}(n^{\prime})
ψmn,mn,mn,mn,m′′n,m′′n(n)(MD/ρ)Os(dOs(1)).\displaystyle\cdot\psi^{\ast}_{m-n,m^{\prime}-n^{\prime},m^{\prime}-n,m-n^{\prime},m^{\prime\prime}-n,m^{\prime\prime}-n^{\prime}}(n)\bigg{\rVert}_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}.

Note that every term not involving mm was removed using appropriate boundedness. Now we may Pigeonhole on mn=tm^{\prime}-n=t and deduce

𝔼m,m′′I1,nI3,nI4\displaystyle\bigg{\lVert}\mathbb{E}_{m,m^{\prime\prime}\in I_{1},n\in I_{3},n^{\prime}\in I_{4}} χmn(n)χm′′n(n)¯χmn(n)¯χm′′n(n)\displaystyle\chi_{m-n}(n)\otimes\overline{\chi_{m^{\prime\prime}-n}(n)}\otimes\overline{\chi_{m-n^{\prime}}(n^{\prime})}\otimes\chi_{m^{\prime\prime}-n^{\prime}}(n^{\prime})
ψmn,mn,m′′n,m′′n(n)𝟙[n+tI2](MD/ρ)Os(dOs(1)).\displaystyle\qquad\cdot\psi^{\ast}_{m-n,m-n^{\prime},m^{\prime\prime}-n,m^{\prime\prime}-n^{\prime}}(n)\cdot\mathbbm{1}[n+t\in I_{2}]\bigg{\rVert}_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}.

Let m′′n=h1m^{\prime\prime}-n=h_{1}, mn=h2m-n^{\prime}=h_{2}, mn=h3m-n=h_{3}, and m′′n=h4m^{\prime\prime}-n^{\prime}=h_{4} (abusively). We have

\displaystyle\bigg{\lVert} 𝔼n[N]𝔼h1+h2=h3+h4hi[±N]χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯\displaystyle\mathbb{E}_{n\in[N]}\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[\pm N]\end{subarray}}\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}
χh1,h2,h3,h4(n)𝟙[n+h3,n+h1I1,nI3,n+h1h4I4,n+tI2](MD/ρ)Os(dOs(1))\displaystyle\quad\cdot\chi_{h_{1},h_{2},h_{3},h_{4}}(n)\cdot\mathbbm{1}[n+h_{3},n+h_{1}\in I_{1},n\in I_{3},n+h_{1}-h_{4}\in I_{4},n+t\in I_{2}]\bigg{\rVert}_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}

where $\chi_{h_{1},h_{2},h_{3},h_{4}}(n)$ is a degree $(s-2)$ nilsequence (for each fixed $h_{1},h_{2},h_{3},h_{4}$) where the complexity of the underlying nilmanifold and the Lipschitz constant of the underlying function are bounded by $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$ and the dimension is bounded by $O_{s}(d^{O_{s}(1)})$.

Therefore, by the triangle inequality we have that

\[
\begin{aligned}
&\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[\pm N]\end{subarray}}\bigg\lVert\mathbb{E}_{n\in[N]}\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\\
&\qquad\cdot\chi_{h_{1},h_{2},h_{3},h_{4}}(n)\cdot\mathbbm{1}[n+h_{3},n+h_{1}\in I_{1},n\in I_{3},n+h_{1}-h_{4}\in I_{4},n+t\in I_{2}]\bigg\rVert_{\infty}\gtrsim(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}.
\end{aligned}
\]

Finally, the last term is the indicator of an h\vec{h}-dependent interval. Applying Lemma 7.1 (and noting that s3s\geq 3 allows us to fold in the major arc Fourier term) completes the proof. ∎

8. Sunflower Step

For the next stage of our proof, as outlined in Section 4, we wish to provide more structure on hh-dependent nilcharacters χh\chi_{h} given information about additive quadruples as established in Section 7. As setup we will require the notion of a rational subspace with respect to a specified basis, and establish some basic control over Taylor coefficients of bounded polynomial sequences.

Definition 8.1.

A vector subspace VVV^{\prime}\leqslant V is QQ-rational with respect to VV given the basis ={B1,,Bdim(V)}\mathcal{B}=\{B_{1},\ldots,B_{\dim(V)}\} (of VV) if there exists a basis ={B1,,Bdim(V)}\mathcal{B}^{\prime}=\{B_{1}^{\prime},\ldots,B_{\dim(V^{\prime})}^{\prime}\} of VV^{\prime} such that each BjB_{j}^{\prime} is a linear combination of elements of \mathcal{B} with coefficients of height at most QQ.
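For a concrete instance of this definition (an illustration added here; it assumes the common convention that a rational $p/q$ in lowest terms has height $\max(|p|,|q|)$, which this section does not restate):

```latex
\[
V=\mathbb{R}^{3},\qquad \mathcal{B}=\{B_{1},B_{2},B_{3}\},\qquad
V'=\operatorname{span}\big\{B_{1}+\tfrac{2}{3}B_{2},\; B_{3}\big\}
\quad\Longrightarrow\quad V' \text{ is $3$-rational with respect to } \mathcal{B}.
\]
```

By contrast, $\operatorname{span}\{B_{1}+\sqrt{2}B_{2}\}$ is not $Q$-rational for any $Q$, since no nonzero multiple of $B_{1}+\sqrt{2}B_{2}$ has both coefficients rational.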

Lemma 8.2.

Consider a nilmanifold $G/\Gamma$ given a degree-rank filtration of degree-rank $(s,r)$, dimension $d$, and complexity at most $M$. Let $\mathcal{X}$ denote the underlying adapted Mal’cev basis and assign the basis

𝒳i=(𝒳log(G(i,1)))/log(G(i,2))\mathcal{X}_{i}=(\mathcal{X}\cap\log(G_{(i,1)}))/\log(G_{(i,2)})

for G(i,1)/G(i,2)G_{(i,1)}/G_{(i,2)}. Suppose ε\varepsilon is a polynomial sequence such that

dG,𝒳(idG,ε(n))Md_{G,\mathcal{X}}(\mathrm{id}_{G},\varepsilon(n))\leq M

for n[N]n\in[N]. Then for 1is1\leq i\leq s, we have

dG(i,1)/G(i,2),𝒳i(Taylori(ε),idG(i,1)/G(i,2))MOs(dOs(1))Ni.d_{G_{(i,1)}/G_{(i,2)},\mathcal{X}_{i}}(\operatorname{Taylor}_{i}(\varepsilon),\mathrm{id}_{G_{(i,1)}/G_{(i,2)}})\leq M^{O_{s}(d^{O_{s}(1)})}N^{-i}.
Proof.

We may write

ε(n)=exp(j=0sεj(nj))\varepsilon(n)=\exp\bigg{(}\sum_{j=0}^{s}\varepsilon_{j}\binom{n}{j}\bigg{)}

where εjlog(G(j,0))\varepsilon_{j}\in\log(G_{(j,0)}). By Lemma 2.12, we have

Taylori(ε)=exp(εi)modG(i,2).\operatorname{Taylor}_{i}(\varepsilon)=\exp(\varepsilon_{i})~{}\mathrm{mod}~{}G_{(i,2)}.

We have that

ψexp(ε(n))MOs(dOs(1))\lVert\psi_{\mathrm{exp}}(\varepsilon(n))\rVert_{\infty}\leq M^{O_{s}(d^{O_{s}(1)})}

for all n[N]n\in[N] by [42, Lemmas B.1, B.3]. This implies that

t=0j(1)t(jt)ψexp(ε(tN/(2j)+1))MOs(dOs(1)).\bigg{\lVert}\sum_{t=0}^{j}(-1)^{t}\binom{j}{t}\psi_{\mathrm{exp}}(\varepsilon(t\cdot\lfloor N/(2j)\rfloor+1))\bigg{\rVert}_{\infty}\leq M^{O_{s}(d^{O_{s}(1)})}.

This is exactly the jj-th discrete derivative and thus terms coming from εi\varepsilon_{i} with i<ji<j vanish. This implies that

εjNjmodlog(G(j,2))MOs(dOs(1)),\lVert\varepsilon_{j}N^{j}~{}\mathrm{mod}~{}\log(G_{(j,2)})\rVert_{\infty}\leq M^{O_{s}(d^{O_{s}(1)})},

where the basis we assign to $\log(G_{(j,1)}/G_{(j,2)})$ is $\mathcal{X}_{j}$. The result follows by dividing by $N^{j}$ and noting, by say [45, Lemma 2.6], that the distance in first- and second-kind coordinates is comparable. ∎
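The finite-difference mechanism in this proof can be seen in a scalar toy model. The sketch below is an illustration under simplifying assumptions, not the argument on the nilmanifold: it replaces $\psi_{\mathrm{exp}}(\varepsilon(n))$ by a scalar polynomial $p(n)=\sum_{j\leq s}c_{j}\binom{n}{j}$ and only treats the top coefficient. The function names are hypothetical. The alternating sum over the points $t\cdot L+1$ isolates $(-1)^{s}c_{s}L^{s}$, so a bound on the values of $p$ yields $|c_{s}|L^{s}\leq 2^{s}\max|p|$.

```python
from fractions import Fraction
from math import comb

def poly(coeffs, n):
    """Evaluate sum_j coeffs[j] * binom(n, j) at an integer n >= 0."""
    return sum(c * comb(n, j) for j, c in enumerate(coeffs))

def top_difference(coeffs, step):
    """Alternating sum sum_t (-1)^t binom(s, t) p(t * step + 1), where s = deg p."""
    s = len(coeffs) - 1
    return sum(
        (-1) ** t * comb(s, t) * poly(coeffs, t * step + 1) for t in range(s + 1)
    )

# Toy data: a degree-3 polynomial in the binomial basis, sampled inside [N].
coeffs = [Fraction(3), Fraction(-1, 2), Fraction(5, 7), Fraction(2, 9)]
N = 600
s = len(coeffs) - 1
step = N // (2 * s)  # mirrors the spacing floor(N / (2j)) used in the proof

value = top_difference(coeffs, step)
expected = (-1) ** s * coeffs[s] * step ** s  # lower-order terms cancel exactly
sample_max = max(abs(poly(coeffs, t * step + 1)) for t in range(s + 1))
```

Exact rational arithmetic is used so that the cancellation of the lower-order terms is an identity rather than a floating-point approximation.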

We now come to the first of two crucial arguments in this paper where we “improve” the correlation structure. At the cost of restricting the set HH, we force the Taylor coefficients of ghg_{h}, the polynomial sequences underlying the χh\chi_{h}, to live in certain restricted subspaces and their differences to lie in an even finer restriction.

This step is closely related to the “sunflower” arguments of [32, Step 1] and [34, Lemma 11.3]; a quantitative version for the U4U^{4}-inverse theorem due to the first author can be found in [43]. The precise statement of the lemma should also be compared with [34, Theorem 11.1(i)]. We note however that unlike [32, 34], our proof is completely free of any iteration (or equivalently passing to a subgroup where polynomial sequences are “totally equidistributed”, which necessitates too much loss in the relevant parameters).

Thus, the crucial point of the following technical statement is the final condition, which essentially captures that two $h$-dependent frequencies in the improved correlation structure cannot “simultaneously” affect the bottom degree-rank portion.

Lemma 8.3.

Fix $s\geq 2$ and $1\leq r^{\ast}\leq s-1$. Let $f\colon[N]\to\mathbb{C}$ be a $1$-bounded function. Suppose that $f$ has a degree-rank $(s-1,r^{\ast})$ correlation structure with parameters $\rho$, $M$, $d$, and $D$, with data labeled as in Definition 6.1, and that $N\geq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$. Furthermore let $\mathcal{X}_{i}=(\mathcal{X}\cap\log(G_{(i,1)}))/\log(G_{(i,2)})$.

We output a new degree-rank (s1,r)(s-1,r^{\ast}) correlation structure for ff with parameters

ρ1\displaystyle\rho^{\prime-1} (MD/ρ)Os(dOs(1)),MO(M),D=D,dO(d),\displaystyle\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})},\quad M^{\prime}\leq O(M),\quad D^{\prime}=D,\quad d^{\prime}\leq O(d),

with set $H^{\prime}\subseteq H$, with multidegree $(1,s-1)$ nilcharacter $\chi^{\prime}(h,n)={F^{\ast}}^{\prime}(g^{\prime}(h,n){\Gamma^{\ast}}^{\prime})$ on $(G^{\ast})^{\prime}=G^{\ast}\times\mathbb{R}$, and with $h$-dependent nilcharacters $\chi_{h}^{\prime}(n)=F^{\prime}(g_{h}^{\prime}(n)\Gamma)$ having underlying polynomial sequences $g_{h}^{\prime}$ on $G^{\prime}=G$. This correlation structure satisfies:

  • (G)(G^{\ast})^{\prime} is given the multidegree filtration

    (G)(i,j)=(G)(i,j)×{0}(G^{\ast})^{\prime}_{(i,j)}=(G^{\ast})_{(i,j)}\times\{0\}

    if $(i,j)\notin\{(0,0),(0,1)\}$. For $(i,j)\in\{(0,0),(0,1)\}$ we set

    (G)(i,j)=(G)(i,j)×.(G^{\ast})^{\prime}_{(i,j)}=(G^{\ast})_{(i,j)}\times\mathbb{R}.

    We have F((x,z)(Γ×))=F(xΓ)e(z){F^{\ast}}^{\prime}((x,z)(\Gamma^{\ast}\times\mathbb{Z}))=F^{\ast}(x\Gamma^{\ast})\cdot e(z). We have g(h,n)=(g(h,n),Θn)g^{\prime}(h,n)=(g(h,n),\Theta n) for some appropriate value of Θ\Theta;

  • There exists a collection of \mathbb{R}-vector spaces Vi,DepViG(i,1)/G(i,2)V_{i,\mathrm{Dep}}\leqslant V_{i}\leqslant G_{(i,1)}/G_{(i,2)} which are all (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational with respect to exp(𝒳i)\exp(\mathcal{X}_{i}) for each ii;

  • For 1is11\leq i\leq s-1 and h,h1,h2Hh,h_{1},h_{2}\in H^{\prime} we have

    Taylori(gh)Vi,Taylori(gh1)Taylori(gh2)Vi,Dep;\operatorname{Taylor}_{i}(g_{h}^{\prime})\in V_{i},\qquad\operatorname{Taylor}_{i}(g_{h_{1}}^{\prime})-\operatorname{Taylor}_{i}(g_{h_{2}}^{\prime})\in V_{i,\mathrm{Dep}};
  • FF^{\prime} is MM^{\prime}-Lipschitz and has the same vertical frequency η\eta as FF;

  • For integers i1++ir=s1i_{1}+\cdots+i_{r^{\ast}}=s-1, suppose that viViv_{i_{\ell}}\in V_{i_{\ell}} and for at least two distinct indices 1,2\ell_{1},\ell_{2} we have vi1Vi1,Depv_{i_{\ell_{1}}}\in V_{i_{\ell_{1}},\mathrm{Dep}} and vi2Vi2,Depv_{i_{\ell_{2}}}\in V_{i_{\ell_{2}},\mathrm{Dep}}. Then for ww which is any (r1)(r^{\ast}-1)-fold commutator of vi1,,virv_{i_{1}},\ldots,v_{i_{r^{\ast}}}, we have

    η(w)=0.\eta(w)=0.
Remark.

Consider gij=exp(Xij)g_{i_{j}}=\exp(X_{i_{j}}) with gijGij,0g_{i_{j}}\in G_{i_{j},0} for 1jr1\leq j\leq r^{\ast} and i1++ir=s1i_{1}+\cdots+i_{r^{\ast}}=s-1. Fixing any (r1)(r^{\ast}-1)-fold commutator ww of gi1,,girg_{i_{1}},\ldots,g_{i_{r^{\ast}}}, repeated application of the commutator version of Baker–Campbell–Hausdorff (e.g. (2.2)) implies that

w=exp([Xi1,,Xir])w=\exp([X_{i_{1}},\ldots,X_{i_{r^{\ast}}}])

where the associated commutator has the same “form” as that defining ww. (All higher terms are annihilated since GG has degree-rank (s1,r)(s-1,r^{\ast}).) Note that this implies that one can define the associated commutator given inputs in G(i1,1)/G(i1,2),,G(ir,1)/G(ir,2)G_{(i_{1},1)}/G_{(i_{1},2)},\ldots,G_{(i_{r^{\ast}},1)}/G_{(i_{r^{\ast}},2)} and furthermore we see that the associated commutator form on the Lie algebra is a multilinear form of the vector arguments (since G(i,1)/G(i,2)G_{(i,1)}/G_{(i,2)} and G(s1,r)G_{(s-1,r^{\ast})} are real vector spaces and the commutator bracket on the Lie algebra is multilinear).

Proof.

We first note that the statement of the lemma is trivial for r=1r^{\ast}=1 since we may take Vi=Vi,Dep=G(i,1)/G(i,2)V_{i}=V_{i,\mathrm{Dep}}=G_{(i,1)}^{\prime}/G_{(i,2)}^{\prime}; it is impossible to have two distinct indices in the final bullet point. Taking gh(n)=gh(n)g_{h}^{\prime}(n)=g_{h}(n) and g(h,n)=(g(h,n),0)g^{\prime}(h,n)=(g(h,n),0) completes the proof in this case. For s=2s=2, the only possible case is r=1r^{\ast}=1 and therefore for the remainder of the proof we will consider s3s\geq 3. Similarly, if η\eta is trivial, the result is once again immediate. Thus throughout the remainder of the proof we will assume that s1r2s-1\geq r^{\ast}\geq 2 and η\eta is nontrivial.

Step 1: Setup for invoking equidistribution theory. By Lemma 7.5, we have

𝔼[χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯ψh(gh(n)Γ)](MD/ρ)Os(dOs(1))\displaystyle\bigg{\lVert}\mathbb{E}\bigg{[}\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\cdot\psi_{\vec{h}}(g_{\vec{h}}(n)\Gamma^{\prime})\bigg{]}\bigg{\rVert}_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}

for at least a $(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}$ fraction of additive quadruples $h_{1}+h_{2}=h_{3}+h_{4}$. Furthermore $g_{\vec{h}}(n)$ is a polynomial sequence on a group $G_{\mathrm{Error}}$ which has a degree $(s-2)$ filtration and dimension bounded by $O_{s}(d^{O_{s}(1)})$, and the complexity of $G_{\mathrm{Error}}/\Gamma_{\mathrm{Error}}$ and the Lipschitz constant of the function for $\psi_{\vec{h}}$ are bounded by $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$. Note that a priori $G_{\mathrm{Error}}/\Gamma_{\mathrm{Error}}$ and the associated Mal’cev basis depend on $\vec{h}$. However, applying Pigeonhole on the choice of the associated structure constants allows us to assume, at the cost of passing to a density $(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}$ subset of the additive quadruples, that $G_{\mathrm{Error}}/\Gamma_{\mathrm{Error}}$ is independent of $\vec{h}$. Finally, we may assume as usual that $g_{\vec{h}}(0)=\mathrm{id}_{G_{\mathrm{Error}}}$ via by-now standard manipulations.

We now consider the group G~=G×G×G×G×GError\widetilde{G}=G\times G\times G\times G\times G_{\mathrm{Error}}. G~\widetilde{G} may naturally be given a degree-rank (s1,r)(s-1,r^{\ast}) product filtration (where we use [34, Example 6.11] to assign GErrorG_{\mathrm{Error}} a degree-rank (s2,s2)(s-2,s-2) structure) and Mal’cev basis. Furthermore if χhi(n)=F(ghi(n))\chi_{h_{i}}(n)=F(g_{h_{i}}(n)) we have that the five-fold function F(x1Γ)F(x2Γ)F(x3Γ)¯F(x4Γ)¯ψh(x5ΓError)F(x_{1}\Gamma)\otimes F(x_{2}\Gamma)\otimes\overline{F(x_{3}\Gamma)}\otimes\overline{F(x_{4}\Gamma)}\cdot\psi_{\vec{h}}(x_{5}\Gamma_{\mathrm{Error}}) has a vertical frequency ηProd=(η,η,η,η,0)\eta_{\mathrm{Prod}}=(\eta,\eta,-\eta,-\eta,0). (Note that (GError)(s1,i)=IdGError(G_{\mathrm{Error}})_{(s-1,i)}=\mathrm{Id}_{G_{\mathrm{Error}}} for all i0i\geq 0.)

For the sake of convenience, we set

gh(n)=(gh1(n),gh2(n+h1h4),gh3(n),gh4(n+h1h4),gh(n))g_{\vec{h}}^{\ast}(n)=(g_{h_{1}}(n),g_{h_{2}}(n+h_{1}-h_{4}),g_{h_{3}}(n),g_{h_{4}}(n+h_{1}-h_{4}),g_{\vec{h}}(n))

and note that the function FFF¯F¯ψhF\otimes F\otimes\overline{F}\otimes\overline{F}\cdot\psi_{\vec{h}} is seen to be MOs(dOs(1))M^{O_{s}(d^{O_{s}(1)})}-Lipschitz on G~\widetilde{G}. Note that by the second item of Lemma 2.13, we immediately have that

Taylori(gh2(n+h1h4))=Taylori(gh2(n))\operatorname{Taylor}_{i}(g_{h_{2}}(n+h_{1}-h_{4}))=\operatorname{Taylor}_{i}(g_{h_{2}}(n))

for 1is11\leq i\leq s-1 and analogously for gh4(n+h1h4)g_{h_{4}}(n+h_{1}-h_{4}).

Step 2: Invoking equidistribution theory. By applying Corollary 5.5 (since η\eta is nonzero), there exists a (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational subgroup J=JhJ=J_{\vec{h}} of G~\widetilde{G} such that ηProd(JG~(s1,r))=0\eta_{\mathrm{Prod}}(J\cap\widetilde{G}_{(s-1,r^{\ast})})=0 and such that

gh=εhgh~γhg_{\vec{h}}^{\ast}=\varepsilon_{\vec{h}}\cdot\widetilde{g_{\vec{h}}}\cdot\gamma_{\vec{h}}

where:

  • εh(0)=gh~(0)=γh(0)=idG~\varepsilon_{\vec{h}}(0)=\widetilde{g_{\vec{h}}}(0)=\gamma_{\vec{h}}(0)=\mathrm{id}_{\widetilde{G}};

  • gh~\widetilde{g_{\vec{h}}} takes values in JJ;

  • γh\gamma_{\vec{h}} is (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational (with respect to the lattice Γ×Γ×Γ×Γ×ΓError\Gamma\times\Gamma\times\Gamma\times\Gamma\times\Gamma_{\mathrm{Error}});

  • $d(\varepsilon_{\vec{h}}(n),\varepsilon_{\vec{h}}(n-1))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-1}$ for $n\in[N]$.

By passing to a subset of additive quadruples of density (MD/ρ)Os(dOs(1))(MD/\rho)^{-O_{s}(d^{O_{s}(1)})} we may in fact assume that the group JJ is independent of h\vec{h} under consideration.

We define

Ji:=(JG~(i,1))/(JG~(i,2)),Ji:=τi(Ji)J_{i}^{\prime}:=(J\cap\widetilde{G}_{(i,1)})/(J\cap\widetilde{G}_{(i,2)}),\qquad J_{i}:=\tau_{i}(J_{i}^{\prime})

where $\tau_{i}\colon\operatorname{Horiz}_{i}(G)^{\otimes 4}\times\operatorname{Horiz}_{i}(G_{\mathrm{Error}})\to\operatorname{Horiz}_{i}(G)^{\otimes 4}$ is the natural projection map to the four-fold product. Since $\eta_{\mathrm{Prod}}(J\cap\widetilde{G}_{(s-1,r^{\ast})})=0$ (due to the output of Corollary 5.5), we have

ηProd([Ji1,,Jir])=0\eta_{\mathrm{Prod}}([J_{i_{1}}^{\prime},\ldots,J_{i_{r^{\ast}}}^{\prime}])=0

for $i_{1}+\cdots+i_{r^{\ast}}=s-1$, where the commutator bracket is taken with respect to $\widetilde{G}$ and $[\cdot,\ldots,\cdot]$ denotes any possible $(r^{\ast}-1)$-fold commutator bracket.

Since GErrorG_{\mathrm{Error}} has been given a degree-rank <(s1,r)<(s-1,r^{\ast}) filtration, we have that in fact

ηProd([Ji1,,Jir])=0\eta_{\mathrm{Prod}}([J_{i_{1}},\ldots,J_{i_{r^{\ast}}}])=0

where we abusively descend $\eta_{\mathrm{Prod}}$ to $G^{\otimes 4}$. Less formally, we are noting that the final coordinate of elements in $\widetilde{G}$ plays no role in commutators of the depth being considered.

Step 3: Furstenberg–Weiss commutator argument. We now perform the crucial Furstenberg–Weiss commutator argument. Given T[4]T\subseteq[4], we define πT((v1,,v4))=(vi)iT\pi_{T}((v_{1},\ldots,v_{4}))=(v_{i})_{i\in T} with the coordinates represented in increasing order of index.

We define

π123(Ji)\displaystyle\pi_{123}(J_{i})^{\ast} =π123(Ji){(v,0,0):vHorizi(G)},\displaystyle=\pi_{123}(J_{i})\cap\{(v,0,0)\colon v\in\operatorname{Horiz}_{i}(G)\},
π124(Ji)\displaystyle\pi_{124}(J_{i})^{\ast} =π124(Ji){(v,0,0):vHorizi(G)}.\displaystyle=\pi_{124}(J_{i})\cap\{(v,0,0)\colon v\in\operatorname{Horiz}_{i}(G)\}.

Note that π123(Ji)\pi_{123}(J_{i})^{\ast} and π124(Ji)\pi_{124}(J_{i})^{\ast} may (abusively) be viewed as subspaces of Horizi(G)\operatorname{Horiz}_{i}(G). The crucial claim is that

η([vi1,,vir])=0\eta([v_{i_{1}},\ldots,v_{i_{r^{\ast}}}])=0

if i1++ir=s1i_{1}+\cdots+i_{r^{\ast}}=s-1, each viπ1(Ji)v_{i_{\ell}}\in\pi_{1}(J_{i_{\ell}}), and for two distinct indices 1,2\ell_{1},\ell_{2} we have that vi1π123(Ji1)v_{i_{\ell_{1}}}\in\pi_{123}(J_{i_{\ell_{1}}})^{\ast} and vi2π124(Ji2)v_{i_{\ell_{2}}}\in\pi_{124}(J_{i_{\ell_{2}}})^{\ast}. Note that η\eta lives on GG and the commutator brackets are taken with respect to GG, not G4G^{\otimes 4}. The Furstenberg–Weiss commutator argument is required to capture precisely this difference.

Note that an element $v_{i_{\ell}}\in\pi_{1}(J_{i_{\ell}})$ lifts to an element $\widetilde{v_{i_{\ell}}}$ of the form $(v_{i_{\ell}},\cdot,\cdot,\cdot)\in\operatorname{Horiz}_{i_{\ell}}(G)^{\otimes 4}$. Furthermore note that $v_{i_{\ell}}\in\pi_{123}(J_{i_{\ell}})^{\ast}$ “lifts” to an element $\widetilde{v_{i_{\ell}}}$ of the form $(v_{i_{\ell}},0,0,\cdot)\in\operatorname{Horiz}_{i_{\ell}}(G)^{\otimes 4}$, while $v_{i_{\ell}}\in\pi_{124}(J_{i_{\ell}})^{\ast}$ lifts to an element $\widetilde{v_{i_{\ell}}}$ of the form $(v_{i_{\ell}},0,\cdot,0)\in\operatorname{Horiz}_{i_{\ell}}(G)^{\otimes 4}$.

Given the above setup, we have

[vi1~,,vir~]=([vi1,,vir],idG,idG,idG).[\widetilde{v_{i_{1}}},\ldots,\widetilde{v_{i_{{r^{\ast}}}}}]=([v_{i_{1}},\ldots,v_{i_{{r^{\ast}}}}],\mathrm{id}_{G},\mathrm{id}_{G},\mathrm{id}_{G}).

To see this, note that the iterated commutator of elements in $G\times\mathrm{Id}_{G}\times\mathrm{Id}_{G}\times G$ (with any elements in $G^{\otimes 4}$) remains in the subgroup $G\times\mathrm{Id}_{G}\times\mathrm{Id}_{G}\times G$; an analogous fact holds for $G\times\mathrm{Id}_{G}\times G\times\mathrm{Id}_{G}$. Since we assumed that our commutator contains elements in both $G\times\mathrm{Id}_{G}\times\mathrm{Id}_{G}\times G$ and $G\times\mathrm{Id}_{G}\times G\times\mathrm{Id}_{G}$, the commutator must in fact live in $G\times\mathrm{Id}_{G}\times\mathrm{Id}_{G}\times\mathrm{Id}_{G}$, and the first coordinates of the desired commutators are trivially seen to match.
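The coordinate-wise reasoning above can be sanity-checked in a toy model. The following is a minimal sketch, assuming (purely for illustration) that $G$ is the integer Heisenberg group with elements stored as triples `(a, b, c)`; the helper names `mul`, `inv`, `commutator`, and `commutator4` are hypothetical and do not come from the paper. It checks that a commutator of an element of the form $(v,0,0,\cdot)$ with one of the form $(v,0,\cdot,0)$ is supported on the first coordinate only.

```python
# Toy model: Heisenberg group, (a, b, c) <-> [[1, a, c], [0, 1, b], [0, 0, 1]].
IDENTITY = (0, 0, 0)

def mul(x, y):
    """Group law of the Heisenberg group in (a, b, c) coordinates."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 + a * b2)

def inv(x):
    """Inverse element: mul(x, inv(x)) == IDENTITY."""
    a, b, c = x
    return (-a, -b, -c + a * b)

def commutator(x, y):
    """[x, y] = x y x^{-1} y^{-1} (one of the standard conventions)."""
    return mul(mul(x, y), mul(inv(x), inv(y)))

def commutator4(xs, ys):
    """Commutator in the four-fold product group, computed coordinate-wise."""
    return tuple(commutator(x, y) for x, y in zip(xs, ys))

a, c = (1, 0, 0), (0, 1, 0)          # first coordinates of the two lifts
b, d = (2, 3, 5), (7, 1, 4)          # arbitrary entries in the free slots
lift_123 = (a, IDENTITY, IDENTITY, b)  # shape (v, 0, 0, .)
lift_124 = (c, IDENTITY, d, IDENTITY)  # shape (v, 0, ., 0)

result = commutator4(lift_123, lift_124)
```

In this toy example `result` is `(commutator(a, c), IDENTITY, IDENTITY, IDENTITY)`: the free slots `b` and `d` never meet a nontrivial partner, which is the mechanism the argument exploits.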

Recalling that we have

ηProd([Ji1,,Jir])=0,\eta_{\mathrm{Prod}}([J_{i_{1}},\ldots,J_{i_{r^{\ast}}}])=0,

and noting that ηProd\eta_{\mathrm{Prod}} descends to η\eta on the subgroup G(s1,r)×IdG3G_{(s-1,r^{\ast})}\times\mathrm{Id}_{G}^{\otimes 3}, we have

η([vi1,,vir])=0\eta([v_{i_{1}},\ldots,v_{i_{{r^{\ast}}}}])=0

as claimed.

Step 4: Finding (h2,h3)(h_{2},h_{3}) and (h2,h4)(h_{2}^{\prime},h_{4}^{\prime}) which extend to many “good” h1h_{1}. Recall that we are looking at the at least (MD/ρ)Os(dOs(1))(MD/\rho)^{-O_{s}(d^{O_{s}(1)})} fraction of additive quadruples (h1,h2,h3,h4)H4[N]4(h_{1},h_{2},h_{3},h_{4})\in H^{4}\subseteq[N]^{4} which are such that gh~\widetilde{g_{\vec{h}}} lives on a specified subgroup JJ. Call this set of quadruples 𝒮\mathcal{S}.

So by Markov, there are at least (MD/ρ)Os(dOs(1))N(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}N many h1[N]h_{1}\in[N] which extend to at least (MD/ρ)Os(dOs(1))N2(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}N^{2} quadruples in 𝒮\mathcal{S}. Thus there are at least (MD/ρ)Os(dOs(1))N5(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}N^{5} pairs of additive tuples of the form

(h1,h2,h3,h1+h2h3),(h1,h2,h1+h2h4,h4)𝒮.(h_{1},h_{2},h_{3},h_{1}+h_{2}-h_{3}),~{}(h_{1},h_{2}^{\prime},h_{1}+h_{2}^{\prime}-h_{4}^{\prime},h_{4}^{\prime})\in\mathcal{S}.

By averaging, there exists a pair of pairs (h2,h3)(h_{2},h_{3}) and (h2,h4)(h_{2}^{\prime},h_{4}^{\prime}) such that there are at least (MD/ρ)Os(dOs(1))N(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}N many h1[N]h_{1}\in[N] which live in such additive tuples. We fix such a pair of pairs and define 𝒯\mathcal{T} to denote the set of h1[N]h_{1}\in[N] such that (h1,h2,h3,h1+h2h3)𝒮(h_{1},h_{2},h_{3},h_{1}+h_{2}-h_{3})\in\mathcal{S} and (h1,h2,h1+h2h4,h4)𝒮(h_{1},h_{2}^{\prime},h_{1}+h_{2}^{\prime}-h_{4}^{\prime},h_{4}^{\prime})\in\mathcal{S}.

Step 5: Extracting coefficient data. Consider h1𝒯h_{1}\in\mathcal{T} and define

h123=(h1,h2,h3,h1+h2h3),h124=(h1,h2,h1+h2h4,h4).h^{123}=(h_{1},h_{2},h_{3},h_{1}+h_{2}-h_{3}),\quad h^{124}=(h_{1},h_{2}^{\prime},h_{1}+h_{2}^{\prime}-h_{4}^{\prime},h_{4}^{\prime}).

Recall 𝒳i=(𝒳log(G(i,1)))/log(G(i,2))\mathcal{X}_{i}=(\mathcal{X}\cap\log(G_{(i,1)}))/\log(G_{(i,2)}) and assign the basis exp(𝒳i)\exp(\mathcal{X}_{i}) to G(i,1)/G(i,2)G_{(i,1)}/G_{(i,2)} (viewed as a vector space). Finally we assign the basis 𝒵i=Yiexp(𝒳i){(Yi,0,0),(0,Yi,0),(0,0,Yi)}\mathcal{Z}_{i}=\bigcup_{Y_{i}\in\exp(\mathcal{X}_{i})}\{(Y_{i},0,0),(0,Y_{i},0),(0,0,Y_{i})\} to (G(i,1)/G(i,2))3(G_{(i,1)}/G_{(i,2)})^{\otimes 3}.

By Lemma 2.13, we have

Taylori(gh123)\displaystyle\operatorname{Taylor}_{i}(g_{h^{123}}^{\ast}) =Taylori(εh123)+Taylori(gh123~)+Taylori(γh123)\displaystyle=\operatorname{Taylor}_{i}(\varepsilon_{h^{123}})+\operatorname{Taylor}_{i}(\widetilde{g_{h^{123}}})+\operatorname{Taylor}_{i}(\gamma_{h^{123}})
Taylori(gh124)\displaystyle\operatorname{Taylor}_{i}(g_{h^{124}}^{\ast}) =Taylori(εh124)+Taylori(gh124~)+Taylori(γh124)\displaystyle=\operatorname{Taylor}_{i}(\varepsilon_{h^{124}})+\operatorname{Taylor}_{i}(\widetilde{g_{h^{124}}})+\operatorname{Taylor}_{i}(\gamma_{h^{124}})

Therefore, by Lemma 8.2, for all h1𝒯h_{1}\in\mathcal{T} we have

dist(Taylori((gh1,gh2,gh3)),π123(Ji)+Th11Horizi(Γ3))(MD/ρ)Os(dOs(1))Ni,\displaystyle\operatorname{dist}(\operatorname{Taylor}_{i}((g_{h_{1}},g_{h_{2}},g_{h_{3}})),\pi_{123}(J_{i})+T_{h_{1}}^{-1}\operatorname{Horiz}_{i}(\Gamma^{\otimes 3}))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i},
dist(Taylori((gh1,gh2,gh4)),π124(Ji)+Th11Horizi(Γ3))(MD/ρ)Os(dOs(1))Ni,\displaystyle\operatorname{dist}(\operatorname{Taylor}_{i}((g_{h_{1}},g_{h_{2}^{\prime}},g_{h_{4}^{\prime}})),\pi_{124}(J_{i})+T_{h_{1}}^{\prime-1}\operatorname{Horiz}_{i}(\Gamma^{\otimes 3}))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i},

where Th1T_{h_{1}} and Th1T_{h_{1}}^{\prime} are positive integers bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. Here we have identified the basis 𝒵i\mathcal{Z}_{i} (for (G(i,1)/G(i,2))3(G_{(i,1)}/G_{(i,2)})^{\otimes 3}) with the standard basis vectors in 3dim(Horizi(G))\mathbb{R}^{3\dim(\operatorname{Horiz}_{i}(G))} and taken the LL^{\infty} metric on the latter (for the notion of dist\operatorname{dist}). At the cost of shrinking the set 𝒯\mathcal{T} by a multiplicative factor of (MD/ρ)Os(dOs(1))(MD/\rho)^{-O_{s}(d^{O_{s}(1)})} we may assume that Th1=TT_{h_{1}}=T and Th1=TT_{h_{1}}^{\prime}=T^{\prime} for all h1𝒯h_{1}\in\mathcal{T}.

We now consider a basis $\mathcal{B}_{i}$ for $\pi_{123}(J_{i})$ which is in row-echelon form, where one orders the coordinates corresponding to the second copy of $G$ (in the four-fold $G^{\otimes 4}$) at the front, then the third copy, and then the first copy. In particular, the “final block” of basis vectors spans $\pi_{123}(J_{i})^{\ast}$. Note that one can take such $\mathcal{B}_{i}$ so that the coordinates are integers bounded by $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$ due to the rationality of $\pi_{123}(J_{i})$.

For h1,h1𝒯h_{1},h_{1}^{\prime}\in\mathcal{T}, we have

(8.1) Taylori((gh1,gh2,gh3))\displaystyle\operatorname{Taylor}_{i}((g_{h_{1}},g_{h_{2}},g_{h_{3}})) =RjiajRj+T13dim(Horizi(G))+vh1\displaystyle=\sum_{R_{j}\in\mathcal{B}_{i}}a_{j}R_{j}+T^{-1}\mathbb{Z}^{3\dim(\operatorname{Horiz}_{i}(G))}+v_{h_{1}}
(8.2) Taylori((gh1,gh2,gh3))\displaystyle\operatorname{Taylor}_{i}((g_{h_{1}^{\prime}},g_{h_{2}},g_{h_{3}})) =RjiajRj+T13dim(Horizi(G))+vh1\displaystyle=\sum_{R_{j}\in\mathcal{B}_{i}}a_{j}^{\prime}R_{j}+T^{-1}\mathbb{Z}^{3\dim(\operatorname{Horiz}_{i}(G))}+v_{h_{1}^{\prime}}

where $\lVert v_{h_{1}}\rVert_{\infty},\lVert v_{h_{1}^{\prime}}\rVert_{\infty}\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}$. For each basis vector $R_{j}\in\mathcal{B}_{i}$ whose first nonzero entry lies in the coordinates corresponding to the second or third copy of $G$, there exists a dual vector which is zero on the coordinates corresponding to the first copy of $G$ and whose inner product with every element of $\mathcal{B}_{i}$ other than $R_{j}$ is zero.

Call this vector vjv_{j} and note one may take vjv_{j} to have integral coordinates bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})} and divisible by TT. Then from (8.1) and (8.2),

0=v_{j}\cdot\big(\operatorname{Taylor}_{i}((g_{h_{1}},g_{h_{2}},g_{h_{3}}))-\operatorname{Taylor}_{i}((g_{h_{1}^{\prime}},g_{h_{2}},g_{h_{3}}))\big)=M_{j}(a_{j}-a_{j}^{\prime})+\mathbb{Z}\pm(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i},

where $M_{j}$ is a nonzero integer bounded by $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$. (That is, $M_{j}(a_{j}-a_{j}^{\prime})$ is within $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}$ of an integer.)

We may now use this information about such indices jj in conjunction with (8.1) and (8.2). We deduce that for all h1,h1𝒯h_{1},h_{1}^{\prime}\in\mathcal{T},

dist(Taylori(gh1)Taylori(gh1),π123(Ji)+T11dim(Horizi(G)))(MD/ρ)Os(dOs(1))Ni,\operatorname{dist}(\operatorname{Taylor}_{i}(g_{h_{1}})-\operatorname{Taylor}_{i}(g_{h_{1}^{\prime}}),\pi_{123}(J_{i})^{\ast}+{T_{1}}^{-1}\mathbb{Z}^{\dim(\operatorname{Horiz}_{i}(G))})\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i},

where T1T_{1} is an integer of size bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. Analogously,

dist(Taylori(gh1)Taylori(gh1),π124(Ji)+T11dim(Horizi(G)))(MD/ρ)Os(dOs(1))Ni\operatorname{dist}(\operatorname{Taylor}_{i}(g_{h_{1}})-\operatorname{Taylor}_{i}(g_{h_{1}^{\prime}}),\pi_{124}(J_{i})^{\ast}+{T_{1}^{\prime}}^{-1}\mathbb{Z}^{\dim(\operatorname{Horiz}_{i}(G))})\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}

where T1T_{1}^{\prime} is an integer bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. Putting it together, we may deduce that

dist(Taylori(gh1)Taylori(gh1),π123(Ji)π124(Ji)+T21dim(Horizi(G)))(MD/ρ)Os(dOs(1))Ni\operatorname{dist}(\operatorname{Taylor}_{i}(g_{h_{1}})-\operatorname{Taylor}_{i}(g_{h_{1}^{\prime}}),\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast}+T_{2}^{-1}\mathbb{Z}^{\dim(\operatorname{Horiz}_{i}(G))})\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}

with T2T_{2} a nonzero integer bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. To see this, simply construct a bounded integral basis of the orthogonal complement of π123(Ji)π124(Ji)\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast} (treated as a subspace of the dual space to G(i,1)/G(i,2)dim(Horizi(G))G_{(i,1)}/G_{(i,2)}\simeq\mathbb{R}^{\dim(\operatorname{Horiz}_{i}(G))}). Then the two input inequalities imply that any basis vector for the intersection space dual will map T1T1(Taylori(gh1)Taylori(gh1))T_{1}T_{1}^{\prime}(\operatorname{Taylor}_{i}(g_{h_{1}})-\operatorname{Taylor}_{i}(g_{h_{1}^{\prime}})) to a near-integral scalar, which gives the claim.

Now by Lemma 2.13 we therefore have

(8.3) dist(Taylori(gh1gh11),π123(Ji)π124(Ji)+T21dim(Horizi(G)))(MD/ρ)Os(dOs(1))Ni.\operatorname{dist}(\operatorname{Taylor}_{i}(g_{h_{1}}g_{h_{1}^{\prime}}^{-1}),\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast}+T_{2}^{-1}\mathbb{Z}^{\dim(\operatorname{Horiz}_{i}(G))})\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}.

It is also trivial by restricting the factorization to the first coordinate that

(8.4) dist(Taylori(gh1),π1(Ji)+T31dim(Horizi(G)))(MD/ρ)Os(dOs(1))Ni\operatorname{dist}(\operatorname{Taylor}_{i}(g_{h_{1}}),\pi_{1}(J_{i})+T_{3}^{-1}\mathbb{Z}^{\dim(\operatorname{Horiz}_{i}(G))})\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}

with T3T_{3} a nonzero integer bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}.

Step 6: Extracting initial factorizations. For the remainder of the proof fix h1𝒯h_{1}^{\ast}\in\mathcal{T}. Given h1𝒯h_{1}^{\prime}\in\mathcal{T} we have

gh1=gh1gh11gh1.g_{h_{1}^{\prime}}=g_{h_{1}^{\prime}}g_{h_{1}^{\ast}}^{-1}\cdot g_{h_{1}^{\ast}}.

Note that π1(Ji)\pi_{1}(J_{i}) and π123(Ji)π124(Ji)\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast} may each be defined as the kernel of a set of ii-th horizontal characters (on GG) of height at most (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. Recall (8.3) and (8.4). Scaling the horizontal characters by at most (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})} and applying Lemma B.2, we may write

gh1gh11\displaystyle g_{h_{1}^{\prime}}g_{h_{1}^{\ast}}^{-1} =εh1gh1~γh1,\displaystyle=\varepsilon_{h_{1}^{\prime}}\cdot\widetilde{g_{h_{1}^{\prime}}}\cdot\gamma_{h_{1}^{\prime}},
gh1\displaystyle g_{h_{1}^{\ast}} =εg~γ,\displaystyle=\varepsilon\cdot\widetilde{g^{\prime}}\cdot\gamma^{\prime},

where:

  • $\varepsilon_{h_{1}^{\prime}}(0)=\widetilde{g_{h_{1}^{\prime}}}(0)=\gamma_{h_{1}^{\prime}}(0)=\varepsilon(0)=\widetilde{g^{\prime}}(0)=\gamma^{\prime}(0)=\mathrm{id}_{G}$;

  • Taylori(gh1~)π123(Ji)π124(Ji)\operatorname{Taylor}_{i}(\widetilde{g_{h_{1}^{\prime}}})\in\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast} and Taylori(g~)π1(Ji)\operatorname{Taylor}_{i}(\widetilde{g^{\prime}})\in\pi_{1}(J_{i});

  • γh1,γ\gamma_{h_{1}^{\prime}},\gamma^{\prime} are (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational;

  • dG(εh1(n),εh1(n1))+dG(ε(n),ε(n1))(MD/ρ)Os(dOs(1))N1d_{G}(\varepsilon_{h_{1}^{\prime}}(n),\varepsilon_{h_{1}^{\prime}}(n-1))+d_{G}(\varepsilon(n),\varepsilon(n-1))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-1} for n[N]n\in[N].

Therefore

gh1\displaystyle g_{h_{1}^{\prime}} =gh1gh11gh1=εh1ε(ε1gh1~ε)(ε1γh1εγh11)(γh1g~γh11)γh1γ.\displaystyle=g_{h_{1}^{\prime}}g_{h_{1}^{\ast}}^{-1}\cdot g_{h_{1}^{\ast}}=\varepsilon_{h_{1}^{\prime}}\varepsilon\cdot(\varepsilon^{-1}\widetilde{g_{h_{1}^{\prime}}}\varepsilon)\cdot(\varepsilon^{-1}\gamma_{h_{1}^{\prime}}\varepsilon\gamma_{h_{1}^{\prime}}^{-1})\cdot(\gamma_{h_{1}^{\prime}}\cdot\widetilde{g^{\prime}}\gamma_{h_{1}^{\prime}}^{-1})\cdot\gamma_{h_{1}^{\prime}}\gamma^{\prime}.

By Lemma 2.13, we have

Taylori(γh1g~γh11)=Taylori(g~)\displaystyle\operatorname{Taylor}_{i}(\gamma_{h_{1}^{\prime}}\widetilde{g^{\prime}}\gamma_{h_{1}^{\prime}}^{-1})=\operatorname{Taylor}_{i}(\widetilde{g^{\prime}}) π1(Ji),\displaystyle\in\pi_{1}(J_{i}),
Taylori((ε1gh1~ε)(ε1γh1εγh11))=Taylori(gh1~)\displaystyle\operatorname{Taylor}_{i}((\varepsilon^{-1}\widetilde{g_{h_{1}^{\prime}}}\varepsilon)\cdot(\varepsilon^{-1}\gamma_{h_{1}^{\prime}}\varepsilon\gamma_{h_{1}^{\prime}}^{-1}))=\operatorname{Taylor}_{i}(\widetilde{g_{h_{1}^{\prime}}}) π123(Ji)π124(Ji).\displaystyle\in\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast}.
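The rearrangement of the two factorizations into a single product is a formal group identity, and it can be checked pointwise in a toy model. The sketch below is only an illustration under stated assumptions: $G$ is replaced by the integer Heisenberg group, the six factors are arbitrary fixed group elements rather than polynomial sequences, and the helper names (`mul`, `inv`, `product`, `horizontal`) are hypothetical. It also checks that conjugation leaves the horizontal (abelianized) component unchanged and that the commutator-type factor has trivial horizontal component, which is the toy analogue of the two Taylor coefficient identities just stated.

```python
from functools import reduce

def mul(x, y):
    """Heisenberg group law in (a, b, c) coordinates."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 + a * b2)

def inv(x):
    """Inverse element in the Heisenberg group."""
    a, b, c = x
    return (-a, -b, -c + a * b)

def product(*xs):
    """Left-to-right product of several group elements."""
    return reduce(mul, xs)

def horizontal(x):
    """Image in the abelianization; a stand-in for a first Taylor coefficient."""
    return (x[0], x[1])

# Arbitrary stand-ins for eps_{h'}, g~_{h'}, gamma_{h'} and eps, g~', gamma'.
eps_h, g_h, gam_h = (1, 2, 3), (4, -1, 2), (0, 3, 1)
eps, g, gam = (2, 2, -1), (-3, 1, 5), (1, 0, 7)

lhs = product(eps_h, g_h, gam_h, eps, g, gam)

conj_g_h = product(inv(eps), g_h, eps)              # eps^{-1} g~_{h'} eps
middle = product(inv(eps), gam_h, eps, inv(gam_h))  # eps^{-1} gam eps gam^{-1}
conj_g = product(gam_h, g, inv(gam_h))              # gam g~' gam^{-1}

rhs = product(eps_h, eps, conj_g_h, middle, conj_g, gam_h, gam)
```

Here `lhs == rhs` holds for any choice of the six elements, since adjacent inverse pairs cancel; the horizontal-component checks rely on the quotient by the commutator subgroup being abelian.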

We say that h1,h1′′𝒯h_{1}^{\prime},h_{1}^{\prime\prime}\in\mathcal{T} have matching rational parts if

(γh1γ)1(γh1′′γ)(\gamma_{h_{1}^{\prime}}\gamma^{\prime})^{-1}\cdot(\gamma_{h_{1}^{\prime\prime}}\gamma^{\prime})

is a polynomial sequence valued in $\Gamma$. By restricting $\mathcal{T}$ to an appropriate subset of density $(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}$, we may assume that all $h_{1}^{\prime}\in\mathcal{T}$ have matching rational parts. (This is most easily seen in first-kind coordinates: if $\gamma_{h_{1}^{\prime}}\gamma^{\prime}$ and $\gamma_{h_{1}^{\prime\prime}}\gamma^{\prime}$ have all coefficients differing by elements of $T_{4}\cdot\operatorname{span}(\mathcal{X},\mathbb{Z})$, where $T_{4}$ is an appropriate integer of size bounded by $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$, then the two sequences match up to a polynomial sequence in $\Gamma$.)

So, ultimately we may assume that for all h1𝒯h_{1}^{\prime}\in\mathcal{T} we have

gh1=εh1gh1γγh1~\displaystyle g_{h_{1}^{\prime}}=\varepsilon_{h_{1}^{\prime}}^{\ast}\cdot g_{h_{1}^{\prime}}^{\ast}\cdot\gamma^{\ast}\cdot\widetilde{\gamma_{h_{1}^{\prime}}}

where:

  • εh1(0)=gh1(0)=γ(0)=γh1~(0)=idG\varepsilon_{h_{1}^{\prime}}^{\ast}(0)=g_{h_{1}^{\prime}}^{\ast}(0)=\gamma^{\ast}(0)=\widetilde{\gamma_{h_{1}^{\prime}}}(0)=\mathrm{id}_{G};

  • Taylori(gh1(gh1′′)1)π123(Ji)π124(Ji)\operatorname{Taylor}_{i}(g_{h_{1}^{\prime}}^{\ast}\cdot(g_{h_{1}^{\prime\prime}}^{\ast})^{-1})\in\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast} and Taylori(gh1)π1(Ji)\operatorname{Taylor}_{i}(g_{h_{1}^{\prime}}^{\ast})\in\pi_{1}(J_{i}) for all h1′′𝒯h_{1}^{\prime\prime}\in\mathcal{T};

  • $\gamma^{\ast}$ is $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$-rational;

  • γh1~\widetilde{\gamma_{h_{1}^{\prime}}} takes values in Γ\Gamma;

  • dG(εh1(n),εh1(n1))(MD/ρ)Os(dOs(1))N1d_{G}(\varepsilon_{h_{1}^{\prime}}^{\ast}(n),\varepsilon_{h_{1}^{\prime}}^{\ast}(n-1))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-1} for n[N]n\in[N].

Step 7: Removing periodic and smooth pieces of factorization. Let QQ be the period of γΓ\gamma^{\ast}\Gamma and define δ=(MD/ρ)Os(dOs(1))\delta=(MD/\rho)^{-O_{s}(d^{O_{s}(1)})} where δ\delta is to be chosen later. We break [N][N] into a collection of arithmetic progressions with difference QQ and length between δN\delta N and 2δN2\delta N; there are at most δ1\delta^{-1} such progressions. Call these progressions P1,,PP_{1},\ldots,P_{\ell} and note that

𝔼n[N](Δhf)(n)i=1𝟙nPiχ(h,n)χh(n)ψh(n)ρ\bigg{\lVert}\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\sum_{i=1}^{\ell}\mathbbm{1}_{n\in P_{i}}\cdot\chi(h,n)\otimes\chi_{h}(n)\cdot\psi_{h}(n)\bigg{\rVert}_{\infty}\geq\rho

where ψh(n)\psi_{h}(n) is the degree (s2)(s-2) nilsequence coming from the condition Δhf(n)χ(h,n)¯χh(n)¯Corr(s1,ρ,M,d)\Delta_{h}f(n)\otimes\overline{\chi(h,n)}\otimes\overline{\chi_{h}(n)}\in\operatorname{Corr}(s-1,\rho,M,d) from the original correlation structure. For h𝒯h\in\mathcal{T} we may write

χh(n)=F(gh(n)Γ)=F(εhghγΓ);\chi_{h}(n)=F(g_{h}(n)\Gamma)=F(\varepsilon_{h}^{\ast}g_{h}^{\ast}\gamma^{\ast}\Gamma);

here we are using that $\widetilde{\gamma_{h}}$ takes values in $\Gamma$ and so may be dropped for the remainder of the analysis.
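The decomposition of $[N]$ into progressions of common difference $Q$ used at the start of this step can be made concrete. The following is a minimal sketch, not the paper's construction: the function name `split_into_progressions` and the choice to merge a short final block into its predecessor are assumptions made for illustration, and the sketch requires each residue class modulo $Q$ to contain at least $\lceil\delta N\rceil$ elements.

```python
import math

def split_into_progressions(N, Q, delta):
    """Partition {1, ..., N} into progressions of difference Q.

    Each progression has length in [L, 2L) with L = ceil(delta * N), so
    there are at most N / L <= 1 / delta of them.
    """
    L = math.ceil(delta * N)
    progressions = []
    for r in range(1, Q + 1):
        cls = list(range(r, N + 1, Q))      # residue class of r modulo Q
        if len(cls) < L:
            raise ValueError("residue class shorter than delta * N")
        chunks = [cls[k:k + L] for k in range(0, len(cls), L)]
        if len(chunks[-1]) < L:             # merge a short tail into its neighbor
            tail = chunks.pop()
            chunks[-1].extend(tail)
        progressions.extend(chunks)
    return progressions

# Toy parameters (hypothetical): N = 1000, Q = 7, delta = 0.02.
progs = split_into_progressions(1000, 7, 0.02)
```

Along each such progression a $Q$-periodic sequence is constant, which is why $\gamma^{\ast}$ can be replaced by a single value on each $P_{i}$.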

Since QQ is the period of γ\gamma^{\ast}, we may replace γ\gamma^{\ast} by a value γPi\gamma_{P_{i}} for each progression where γPiγ(n)1Γ\gamma_{P_{i}}\gamma^{\ast}(n)^{-1}\in\Gamma for nPin\in P_{i} and ψ(γPi)1\lVert\psi(\gamma_{P_{i}})\rVert_{\infty}\leq 1. Then

𝔼n[N](Δhf)(n)χ(h,n)(i=1𝟙nPiF(εhghγPiΓ))ψh(n)ρ.\bigg{\lVert}\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\cdot\chi(h,n)\otimes\bigg{(}\sum_{i=1}^{\ell}\mathbbm{1}_{n\in P_{i}}F(\varepsilon_{h}^{\ast}g_{h}^{\ast}\gamma_{P_{i}}\Gamma)\bigg{)}\cdot\psi_{h}(n)\bigg{\rVert}_{\infty}\geq\rho.

Furthermore as εh\varepsilon_{h}^{\ast} is sufficiently smooth we may replace εh\varepsilon_{h}^{\ast} with the constant εh,Pi=εh(min(Pi))\varepsilon_{h,P_{i}}=\varepsilon_{h}^{\ast}(\min(P_{i})) and have

𝔼n[N](Δhf)(n)χ(h,n)(i=1𝟙nPiF(εh,PighγPiΓ))ψh(n)ρ/2,\bigg{\lVert}\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\cdot\chi(h,n)\otimes\bigg{(}\sum_{i=1}^{\ell}\mathbbm{1}_{n\in P_{i}}F(\varepsilon_{h,P_{i}}g_{h}^{\ast}\gamma_{P_{i}}\Gamma)\bigg{)}\cdot\psi_{h}(n)\bigg{\rVert}_{\infty}\geq\rho/2,

as long as δ\delta was chosen sufficiently small.

By the triangle inequality there exists some PiP_{i} which is distance at least δ1/2N\delta^{1/2}N from the ends of the interval [N][N] such that

𝔼n[N](Δhf)(n)χ(h,n)𝟙nPiF(εh,PighγPiΓ)ψh(n)δ2.\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\cdot\chi(h,n)\otimes\mathbbm{1}_{n\in P_{i}}F(\varepsilon_{h,P_{i}}g_{h}^{\ast}\gamma_{P_{i}}\Gamma)\cdot\psi_{h}(n)\rVert_{\infty}\geq\delta^{2}.

By paying a δO(1)\delta^{O(1)}-fraction in the size of 𝒯\mathcal{T} we may assume that the choice of index ii is independent of hh, hence writing Pi=PP_{i}=P. Furthermore note that there is a δO(1)\delta^{O(1)}-net of size δOs(d)\delta^{-O_{s}(d)} for the set of gg satisfying dG(g,idG)(MD/ρ)Os(dOs(1))d_{G}(g,\mathrm{id}_{G})\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. If the net size is chosen small enough, we may shift εh,Pi\varepsilon_{h,P_{i}} to a nearby value in the net without much loss. Then we can pay a δOs(d)\delta^{O_{s}(d)}-fraction in the size of 𝒯\mathcal{T} to Pigeonhole onto a single point in the net, writing εh,Pi=εP\varepsilon_{h,P_{i}}=\varepsilon_{P}.

Overall, for all h𝒯h\in\mathcal{T} we have

𝔼n[N](Δhf)(n)χ(h,n)𝟙P(n)F(εPghγPΓ)ψh(n)δ3\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\cdot\chi(h,n)\otimes\mathbbm{1}_{P}(n)F(\varepsilon_{P}g_{h}^{\ast}\gamma_{P}\Gamma)\cdot\psi_{h}(n)\rVert_{\infty}\geq\delta^{3}

for some PP at least δ1/2N\delta^{1/2}N from the endpoints of the interval. Thus by Lemma 7.1, for each h𝒯h\in\mathcal{T} there exists Θh\Theta_{h} with QΘh/δO(1)N1\lVert Q\cdot\Theta_{h}\rVert_{\mathbb{R}/\mathbb{Z}}\leq\delta^{-O(1)}N^{-1} and

(8.5) 𝔼n[N](Δhf)(n)χ(h,n)e(Θhn)F(εghγPΓ)ψh(n)δO(1).\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\cdot\chi(h,n)\otimes e(\Theta_{h}n)F(\varepsilon^{\ast}g_{h}^{\ast}\gamma_{P}\Gamma)\cdot\psi_{h}(n)\rVert_{\infty}\geq\delta^{O(1)}.

Rounding Θh\Theta_{h} to a net of distance δO(1)N1\delta^{O(1)}N^{-1} and paying a Q1δO(1)Q^{-1}\delta^{O(1)}-fraction in the size of 𝒯\mathcal{T} to Pigeonhole the resulting point, we may write Θh=Θ\Theta_{h}=\Theta for all h𝒯h\in\mathcal{T}. We are now finally in position to define the output data. Define

gh\displaystyle g_{h}^{\prime} =γP1ghγP,\displaystyle=\gamma_{P}^{-1}g_{h}^{\ast}\gamma_{P},
F\displaystyle F^{\prime} =F(εγP),\displaystyle=F(\varepsilon^{\ast}\gamma_{P}\cdot),
χh(n)\displaystyle\chi_{h}^{\prime}(n) =F(gh(n)),\displaystyle=F^{\prime}(g_{h}^{\prime}(n)),
χ(h,n)\displaystyle\chi^{\prime}(h,n) =χ(h,n)e(Θn).\displaystyle=\chi(h,n)\cdot e(\Theta n).

Note that g(h,n)=(g(h,n),Θn)g^{\prime}(h,n)=(g(h,n),\Theta n) is the polynomial sequence underlying χ\chi^{\prime}, and χ(h,n)=F(g(h,n)Γ)\chi^{\prime}(h,n)={F^{\ast}}^{\prime}(g^{\prime}(h,n){\Gamma^{\ast}}^{\prime}). It is easy to check the relevant properties of Definition 6.1 to see that we obtain a degree-rank (s1,r)(s-1,r^{\ast}) correlation structure with appropriately modified underlying parameters (we set HH^{\prime} to be the final refined version of 𝒯\mathcal{T}); in particular, (8.5) demonstrates the necessary correlation fact.

Finally, taking

Vi,Dep=π123(Ji)π124(Ji) and Vi=π1(Ji)V_{i,\mathrm{Dep}}=\pi_{123}(J_{i})^{\ast}\cap\pi_{124}(J_{i})^{\ast}\text{ and }V_{i}=\pi_{1}(J_{i})

we finish the proof: in particular, the result from Step 3 demonstrates the final item of the conclusion, and the result from Step 6 demonstrates the third item. ∎

9. Linearization Step

We now come to the second crucial argument of this paper. Prior to this stage we have modified the degree-rank $(s-1,r^{\ast})$ correlation structure to one in which various Taylor coefficients of $g_{h}$ for $h\in H$ (upon factoring) differ only on certain special subspaces. In this next stage, we deduce that either these Taylor coefficients differ on a further refined subspace which is seen to be essentially “annihilated” by $\eta$, or $g_{h}$ has a certain “bracket linear” form. This step is ultimately where we invoke the results of Sanders [52] on quasi-polynomial bounds for the Bogolyubov lemma.

This step is closely modeled after [32, Step 2] and the closely related proof of [34, Lemma 11.5]; a quantitative version for the U4U^{4}-inverse theorem due to the first author can be found in [43]. The precise statement of the lemma should also be compared with [34, Theorem 11.1(ii)].

Lemma 9.1.

Fix $s\geq 2$ and $1\leq r^{\ast}\leq s-1$. Let $f\colon[N]\to\mathbb{C}$ be a $1$-bounded function. Suppose that $f$ has a degree-rank $(s-1,r^{\ast})$ correlation structure with parameters $\rho$, $M$, $d$, and $D$ and that $N\geq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$. Furthermore let $\mathcal{X}_{i}=(\mathcal{X}\cap\log(G_{(i,1)}))/\log(G_{(i,2)})$.

We output a new degree-rank (s1,r)(s-1,r^{\ast}) correlation structure for ff with parameters

ρ1\displaystyle\rho^{\prime-1} exp(Os((dlog(MD/ρ))Os(1))),MO(M),D=D,dO(d),\displaystyle\leq\exp(O_{s}((d\log(MD/\rho))^{O_{s}(1)})),\quad M^{\prime}\leq O(M),\quad D^{\prime}=D,\quad d^{\prime}\leq O(d),

with set $H^{\prime}\subseteq H$, with multidegree $(1,s-1)$ nilcharacter $\chi^{\prime}(h,n)={F^{\ast}}^{\prime}(g^{\prime}(h,n){\Gamma^{\ast}}^{\prime})$ on $(G^{\ast})^{\prime}=G^{\ast}\times\mathbb{R}$, and with $h$-dependent nilcharacters $\chi_{h}^{\prime}(n)=F^{\prime}(g_{h}^{\prime}(n)\Gamma)$ having underlying polynomial sequences $g_{h}^{\prime}$ on $G^{\prime}=G$. This correlation structure satisfies:

  • (G)(G^{\ast})^{\prime} is given the multidegree filtration

    (G)(i,j)=(G)(i,j)×{0}(G^{\ast})^{\prime}_{(i,j)}=(G^{\ast})_{(i,j)}\times\{0\}

    if $(i,j)\notin\{(0,0),(0,1)\}$. For $(i,j)\in\{(0,0),(0,1)\}$, we set

    (G)(i,j)=(G)(i,j)×.(G^{\ast})^{\prime}_{(i,j)}=(G^{\ast})_{(i,j)}\times\mathbb{R}.

    We have F((x,z)(Γ×))=F(xΓ)e(z){F^{\ast}}^{\prime}((x,z)(\Gamma^{\ast}\times\mathbb{Z}))=F^{\ast}(x\Gamma^{\ast})\cdot e(z). We have g(h,n)=(g(h,n),Θn)g^{\prime}(h,n)=(g(h,n),\Theta n) for some appropriate value of Θ\Theta;

  • There is a collection of \mathbb{R}-vector spaces Wi,,Wi,Lin,Wi,PetG(i,1)/G(i,2)W_{i,\ast},W_{i,\mathrm{Lin}},W_{i,\mathrm{Pet}}\leqslant G_{(i,1)}/G_{(i,2)} for each ii;

  • If Wi:=Wi,+Wi,Lin+Wi,PetW_{i}:=W_{i,\ast}+W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}} then dim(Wi)=dim(Wi,)+dim(Wi,Lin)+dim(Wi,Pet)\dim(W_{i})=\dim(W_{i,\ast})+\dim(W_{i,\mathrm{Lin}})+\dim(W_{i,\mathrm{Pet}}), i.e., the three spaces are linearly disjoint;

  • There exist bases 𝒳i,\mathcal{X}_{i,\ast}, 𝒳i,Lin\mathcal{X}_{i,\mathrm{Lin}}, and 𝒳i,Pet\mathcal{X}_{i,\mathrm{Pet}} of the corresponding spaces which are composed of (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational combinations of elements of (𝒳G(i,1))/G(i,2)(\mathcal{X}\cap G_{(i,1)})/G_{(i,2)};

  • For 1is11\leq i\leq s-1 and h,h1,h2Hh,h_{1},h_{2}\in H^{\prime} we have

    Taylori(gh)\displaystyle\operatorname{Taylor}_{i}(g_{h}^{\prime}) Wi,+Wi,Lin+Wi,Pet=Wi,\displaystyle\in W_{i,\ast}+W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}}=W_{i},
    Taylori(gh1)Taylori(gh2)\displaystyle\operatorname{Taylor}_{i}(g_{h_{1}}^{\prime})-\operatorname{Taylor}_{i}(g_{h_{2}}^{\prime}) Wi,Lin+Wi,Pet,\displaystyle\in W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}},
    ProjWi,Lin(Taylori(gh))\displaystyle\operatorname{Proj}_{W_{i,\mathrm{Lin}}}(\operatorname{Taylor}_{i}(g_{h}^{\prime})) =Zi,j𝒳i,Lin(γi,j+k=1dαi,j,k{βkh})Zi,j,\displaystyle=\sum_{Z_{i,j}\in\mathcal{X}_{i,\mathrm{Lin}}}\bigg{(}\gamma_{i,j}+\sum_{k=1}^{d^{\ast}}\alpha_{i,j,k}\{\beta_{k}h\}\bigg{)}Z_{i,j},

    with d(dlog(MD/ρ))Os(1)d^{\ast}\leq(d\log(MD/\rho))^{O_{s}(1)} and βk(1/N)\beta_{k}\in(1/N^{\prime})\mathbb{Z} where NN^{\prime} is a prime in [100N,200N][100N,200N];

  • FF^{\prime} is MM^{\prime}-Lipschitz and has the same vertical frequency η\eta as FF;

  • For any integers i1++ir=s1i_{1}+\cdots+i_{r^{\ast}}=s-1, suppose that vijVijv_{i_{j}}\in V_{i_{j}} for all jj. If for at least one index \ell we have viWi,Petv_{i_{\ell}}\in W_{i_{\ell},\mathrm{Pet}}, then if ww is any (r1)(r^{\ast}-1)-fold commutator of vi1,,virv_{i_{1}},\ldots,v_{i_{r^{\ast}}} we have

    η(w)=0.\eta(w)=0.

    Furthermore, if instead for at least two indices $\ell_{1},\ell_{2}$ we have $v_{i_{\ell_{1}}}\in W_{i_{\ell_{1}},\mathrm{Lin}}$ and $v_{i_{\ell_{2}}}\in W_{i_{\ell_{2}},\mathrm{Lin}}$, then if $w$ is any $(r^{\ast}-1)$-fold commutator of $v_{i_{1}},\ldots,v_{i_{r^{\ast}}}$ we have

    η(w)=0.\eta(w)=0.
Remark.

The projection map ProjWi,Lin:WiWi,Lin\operatorname{Proj}_{W_{i,\mathrm{Lin}}}\colon W_{i}\to W_{i,\mathrm{Lin}} is well-defined due to the linear disjointness condition. Furthermore we have written Taylor coefficients with additive notation, since G(i,1)/G(i,2)G_{(i,1)}/G_{(i,2)} can be identified with dim(Horizi(G))\mathbb{R}^{\dim(\operatorname{Horiz}_{i}(G))}.
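To make the “bracket linear” shape of the $W_{i,\mathrm{Lin}}$-coefficients concrete, here is a minimal sketch that evaluates a single coefficient $\gamma+\sum_{k}\alpha_{k}\{\beta_{k}h\}$ with exact rational arithmetic. Everything specific in it is an assumption made for illustration: the fractional part is taken in $[0,1)$ (the paper's convention for $\{\cdot\}$ may differ), the modulus $N^{\prime}=101$ is a toy prime rather than a prime in $[100N,200N]$, and the names `frac`, `bracket_linear`, and `defect` are hypothetical.

```python
from fractions import Fraction
from math import floor

def frac(x):
    """Fractional part in [0, 1) (an assumed convention for {.})."""
    return x - floor(x)

def bracket_linear(h, gamma, alphas, betas):
    """Evaluate gamma + sum_k alpha_k * {beta_k * h}."""
    return gamma + sum(a * frac(b * h) for a, b in zip(alphas, betas))

# Toy data: one frequency beta = 3/101, slope alpha = 5, constant gamma = 2.
gamma = Fraction(2)
alphas = [Fraction(5)]
betas = [Fraction(3, 101)]

def defect(h1, h2):
    """Failure of additivity: f(h1) + f(h2) - f(h1 + h2) - f(0)."""
    f = lambda h: bracket_linear(h, gamma, alphas, betas)
    return f(h1) + f(h2) - f(h1 + h2) - f(0)

no_wrap = defect(10, 20)    # 30/101 + 60/101 = 90/101, no wrap-around
with_wrap = defect(20, 20)  # 60/101 + 60/101 = 120/101 wraps past 1
```

In this toy example `no_wrap` equals `0` and `with_wrap` equals `5`: the coefficient is additive in $h$ except for jumps by integer multiples of $\alpha_{k}$ whenever a fractional part wraps around.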

Proof.

For the majority of the proof we will assume $s\geq 3$; we indicate the minor changes required for $s=2$ at the end of the proof (and the case $s=2$ is not used in the proof of Theorem 1.2). Note that the case when $\eta$ is trivial follows by taking $W_{i,\mathrm{Pet}}=G_{(i,1)}/G_{(i,2)}$, $W_{i,\mathrm{Lin}}$ and $W_{i,\ast}$ to be trivial, $g_{h}^{\prime}=g_{h}$, and $g^{\prime}(h,n)=(g(h,n),0)$; therefore we may assume that $\eta$ is nontrivial for the remainder of the proof.

Step 1: Applying Lemma 8.3 and linear-algebraic setup. We apply Lemma 8.3 and treat the resulting correlation structure as the input to the lemma. Up to changing implicit constants in the output this leaves the lemma unchanged except for noting that

χ(h,n)=e(Θn)F(g(h,n)Γ)\chi(h,n)=e(\Theta n)\cdot F^{\ast}(g(h,n)\Gamma^{\ast})

which is defined on the group (G)=G×(G^{\ast})^{\prime}=G^{\ast}\times\mathbb{R}. In particular, we will abusively overwrite notation and relabel the resulting HH^{\prime} from the application of Lemma 8.3 as HH, ghg_{h}^{\prime} as ghg_{h}, and χh(n)\chi^{\prime}_{h}(n) as χh(n)\chi_{h}(n) and thus assume the output properties without further comment.

It will also be crucial to define certain linear-algebraic operators on $V_{i}$. Consider a basis for $V_{i,\mathrm{Dep}}$, an extension to a basis of $V_{i}$, and then to $G_{(i,1)}/G_{(i,2)}$ such that all basis elements are $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$-rational combinations of the basis $\exp(\mathcal{X}_{i})~\mathrm{mod}~G_{(i,2)}$. In particular, write

Vi,Dep\displaystyle V_{i,\mathrm{Dep}} =span(wi,1,,wi,dim(Vi,Dep))\displaystyle=\operatorname{span}_{\mathbb{R}}(w_{i,1},\ldots,w_{i,\dim(V_{i,\mathrm{Dep}})})
Vi\displaystyle V_{i} =span(wi,1,,wi,dim(Vi,Dep),wi,dim(Vi,Dep)+1,,wi,dim(Vi)),\displaystyle=\operatorname{span}_{\mathbb{R}}(w_{i,1},\ldots,w_{i,\dim(V_{i,\mathrm{Dep}})},w_{i,\dim(V_{i,\mathrm{Dep}})+1},\ldots,w_{i,\dim(V_{i})}),
G(i,1)/G(i,2)\displaystyle G_{(i,1)}/G_{(i,2)} =span(wi,1,,wi,dim(Horizi(G))).\displaystyle=\operatorname{span}_{\mathbb{R}}(w_{i,1},\ldots,w_{i,\dim(\operatorname{Horiz}_{i}(G))}).

Given vViv\in V_{i}, there is a unique linear combination

v=j=1dim(Vi)αjwi,j.v=\sum_{j=1}^{\dim(V_{i})}\alpha_{j}w_{i,j}.

We define

Piv=j=dim(Vi,Dep)+1dim(Vi)αjwi,j,Qiv=j=1dim(Vi,Dep)αjwi,j.P_{i}v=\sum_{j=\dim(V_{i,\mathrm{Dep}})+1}^{\dim(V_{i})}\alpha_{j}w_{i,j},\qquad Q_{i}v=\sum_{j=1}^{\dim(V_{i,\mathrm{Dep}})}\alpha_{j}w_{i,j}.

By construction Pi2=PiP_{i}^{2}=P_{i}, Qi2=QiQ_{i}^{2}=Q_{i}, Qi(Vi)Pi(Vi)=0Q_{i}(V_{i})\cap P_{i}(V_{i})=0, and Piv+Qiv=vP_{i}v+Q_{i}v=v for vViv\in V_{i}. We also (abusively) extend the operator PiP_{i} to ViV_{i}^{\otimes\ell} and (G(i,1)/G(i,2))4(G_{(i,1)}/G_{(i,2)})^{\otimes 4} in the obvious manners by acting on each copy of ViV_{i} separately (and zeroing out basis elements wi,dim(Vi)+1,,wi,dim(Horizi(G))w_{i,\dim(V_{i})+1},\ldots,w_{i,\dim(\operatorname{Horiz}_{i}(G))}).
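The operators $P_{i}$ and $Q_{i}$ are the complementary coordinate projections attached to the chosen basis, and their listed properties can be checked on a small example. The sketch below is an illustration under stated assumptions only: $V_{i}$ is taken to be all of $\mathbb{Q}^{3}$ (so no basis elements are zeroed out), $V_{i,\mathrm{Dep}}$ is spanned by the first basis vector, and the helpers `coordinates`, `combine`, and `make_projections` are hypothetical names.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve sum_j alpha_j * basis[j] = v by Gauss-Jordan elimination.

    Assumes `basis` is a list of n linearly independent vectors of length n.
    """
    n = len(basis)
    # Augmented matrix whose columns are the basis vectors, then v.
    rows = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
            for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [x / p for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def combine(coeffs, vectors, dim):
    """Return the linear combination sum_j coeffs[j] * vectors[j]."""
    return tuple(sum((c * w[i] for c, w in zip(coeffs, vectors)), Fraction(0))
                 for i in range(dim))

def make_projections(basis, dim_dep):
    """Build (P, Q): Q keeps the first dim_dep coordinates, P keeps the rest."""
    dim = len(basis)
    def Q(v):
        alpha = coordinates(basis, v)
        return combine(alpha[:dim_dep], basis[:dim_dep], dim)
    def P(v):
        alpha = coordinates(basis, v)
        return combine(alpha[dim_dep:], basis[dim_dep:], dim)
    return P, Q

basis = [(1, 1, 0), (0, 1, 1), (0, 0, 1)]  # w_1 spans the toy V_Dep
P, Q = make_projections(basis, dim_dep=1)
v = (2, 5, 4)                              # v = 2*w_1 + 3*w_2 + 1*w_3
```

For this `v` one gets `Q(v) == (2, 2, 0)` and `P(v) == (0, 3, 4)`, which sum to `v`. Note that both projections depend on the chosen extension of the basis and not only on the subspace $V_{i,\mathrm{Dep}}$.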

Step 2: Invoking equidistribution theory. Applying Lemma 7.5 when s3s\geq 3, we have

𝔼[χh1(n)χh2(n+h1h4)χh3(n)¯χh4(n+h1h4)¯ψh(gh(n)Γ)](MD/ρ)Os(dOs(1))\displaystyle\lVert\mathbb{E}[\chi_{h_{1}}(n)\otimes\chi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\chi_{h_{3}}(n)}\otimes\overline{\chi_{h_{4}}(n+h_{1}-h_{4})}\cdot\psi_{\vec{h}}(g_{\vec{h}}(n)\Gamma^{\prime})]\rVert_{\infty}\geq(MD/\rho)^{-O_{s}(d^{O_{s}(1)})}

for a (MD/ρ)Os(dOs(1))(MD/\rho)^{-O_{s}(d^{O_{s}(1)})} density of additive tuples. We define GErrorG_{\mathrm{Error}}, G~\widetilde{G}, and ηProd\eta_{\mathrm{Prod}} as in the proof of Lemma 8.3 and as before we may assume that gh(0)=idGErrorg_{\vec{h}}(0)=\mathrm{id}_{G_{\mathrm{Error}}}. Define

gh(n)=(gh1(n),gh2(n+h1h4),gh3(n),gh4(n+h1h4),gh(n)).g_{\vec{h}}^{\ast}(n)=(g_{h_{1}}(n),g_{h_{2}}(n+h_{1}-h_{4}),g_{h_{3}}(n),g_{h_{4}}(n+h_{1}-h_{4}),g_{\vec{h}}(n)).

By applying Corollary 5.5, we have

gh=εhgh~γhg_{\vec{h}}^{\ast}=\varepsilon_{\vec{h}}\cdot\widetilde{g_{\vec{h}}}\cdot\gamma_{\vec{h}}

with

  • εh(0)=gh~(0)=γh(0)=idG~\varepsilon_{\vec{h}}(0)=\widetilde{g_{\vec{h}}}(0)=\gamma_{\vec{h}}(0)=\mathrm{id}_{\widetilde{G}};

  • gh~\widetilde{g_{\vec{h}}} takes values in KK;

  • γh\gamma_{\vec{h}} is (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational;

  • $d(\varepsilon_{\vec{h}}(n),\varepsilon_{\vec{h}}(n-1))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-1}$ for $n\in[N]$;

where ηProd(KG~(s1,r))=0\eta_{\mathrm{Prod}}(K\cap\widetilde{G}_{(s-1,r^{\ast})})=0 and KK is a (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational subgroup of G~\widetilde{G}. By passing to a subset of additive quadruples of density (MD/ρ)Os(dOs(1))(MD/\rho)^{-O_{s}(d^{O_{s}(1)})} we may in fact assume that the group KK is independent of h\vec{h} under consideration.

Step 3: Linear algebra deductions from equidistribution theory. Note that at present the subgroup KK does not account for the deductions given in Lemma 8.3; these initial deductions are designed essentially to account for this. Let τi:Horizi(G)4×Horizi(GError)Horizi(G)4\tau_{i}\colon\operatorname{Horiz}_{i}(G)^{\otimes 4}\times\operatorname{Horiz}_{i}(G_{\mathrm{Error}})\to\operatorname{Horiz}_{i}(G)^{\otimes 4} be the natural projection to the four-fold product. We define the following set of vector spaces:

\begin{align*}
R_{i}&:=\{(Q_{i}v_{1},Q_{i}v_{2},Q_{i}v_{3},Q_{i}v_{4})\in V_{i}^{\otimes 4}\colon Q_{i}v_{1}+Q_{i}v_{2}-Q_{i}v_{3}-Q_{i}v_{4}=0\},\\
K_{i}&:=\tau_{i}(K\cap\widetilde{G}_{(i,1)}~\mathrm{mod}~\widetilde{G}_{(i,2)}),\\
S_{i}&:=\{(v_{1},v_{2},v_{3},v_{4})\in V_{i}^{\otimes 4}\colon P_{i}v_{1}=P_{i}v_{2}=P_{i}v_{3}=P_{i}v_{4}\},\\
K_{i,1}&:=K_{i}\cap S_{i},\\
\widetilde{K_{i}}&:=K_{i,1}+R_{i},\\
L_{i}&:=\pi_{1}(\widetilde{K_{i}}\cap\{(v_{1},v_{2},v_{3},v_{4})\in V_{i}^{\otimes 4}\colon Q_{i}v_{2}=Q_{i}v_{3}=Q_{i}v_{4}=0\})+Q_{i}(V_{i}).
\end{align*}

By inspection, we have RiSiR_{i}\leqslant S_{i} hence Ki~Si\widetilde{K_{i}}\leqslant S_{i}. Note that

\[\eta_{\mathrm{Prod}}([K_{i_{1},1},\ldots,K_{i_{r^{\ast}},1}])=0\]

whenever $i_{1}+\cdots+i_{r^{\ast}}=s-1$, where $[\cdot,\ldots,\cdot]$ denotes any possible $(r^{\ast}-1)$-fold commutator bracket. This is a consequence of the fact that $\eta_{\mathrm{Prod}}(K\cap\widetilde{G}_{(s-1,r^{\ast})})=0$ together with $K_{i,1}\leqslant K_{i}$. Note that we are implicitly using that $\eta_{\mathrm{Prod}}$ is trivial on $G_{\mathrm{Error}}$ as well, and we abusively descend $\eta_{\mathrm{Prod}}$ to $G^{\otimes 4}$.

We now claim that

ηProd([vi1,,vir])=0\eta_{\mathrm{Prod}}([v_{i_{1}},\ldots,v_{i_{r^{\ast}}}])=0

if viSiv_{i_{\ell}}\in S_{i_{\ell}} for all \ell and there is at least one index jj such that vijRijv_{i_{j}}\in R_{i_{j}}.

To prove this, note by the final bullet point of Lemma 8.3 and multilinearity that

\begin{align*}
\eta_{\mathrm{Prod}}([v_{i_{1}},\ldots,v_{i_{r^{\ast}}}])&=\eta_{\mathrm{Prod}}([P_{i_{1}}v_{i_{1}},\ldots,P_{i_{r^{\ast}}}v_{i_{r^{\ast}}}])+\sum_{k=1}^{r^{\ast}}\eta_{\mathrm{Prod}}([P_{i_{1}}v_{i_{1}},\ldots,Q_{i_{k}}v_{i_{k}},\ldots,P_{i_{r^{\ast}}}v_{i_{r^{\ast}}}])\\
&=\eta_{\mathrm{Prod}}([P_{i_{1}}v_{i_{1}},\ldots,Q_{i_{j}}v_{i_{j}},\ldots,P_{i_{r^{\ast}}}v_{i_{r^{\ast}}}])=0.
\end{align*}

The first equality uses that every bracket with at least two QikvikQ_{i_{k}}v_{i_{k}} has two Vi,DepV_{i,\mathrm{Dep}} terms so is 0, the second equality uses Pijvij=0P_{i_{j}}v_{i_{j}}=0, and the third equality follows by noting that

Pivi{(v1,v2,v3,v4)Vi4:Piv1=Piv2=Piv3=Piv4,Qiv1=Qiv2=Qiv3=Qiv4=0}P_{i}v_{i}\in\{(v_{1},v_{2},v_{3},v_{4})\in V_{i}^{\otimes 4}\colon P_{i}v_{1}=P_{i}v_{2}=P_{i}v_{3}=P_{i}v_{4},Q_{i}v_{1}=Q_{i}v_{2}=Q_{i}v_{3}=Q_{i}v_{4}=0\}

and ηProd=(η,η,η,η)\eta_{\mathrm{Prod}}=(\eta,\eta,-\eta,-\eta). Now, we may ultimately deduce

ηProd([Ki1~,,Kir~])=0\eta_{\mathrm{Prod}}([\widetilde{K_{i_{1}}},\ldots,\widetilde{K_{i_{r^{\ast}}}}])=0

because Ki~=Ki,1+Ri\widetilde{K_{i}}=K_{i,1}+R_{i} and Ri,Ki,1SiR_{i},K_{i,1}\leqslant S_{i}.

Finally, let $\pi_{T}$ for $T\subseteq[4]$ be as in the proof of Lemma 8.3 (namely, an appropriate projection map). We have

π1(Ki~)Li.\pi_{1}(\widetilde{K_{i}})\leqslant L_{i}.

This follows because if

((Qv1,Pv1),(Qv2,Pv2),(Qv3,Pv3),(Qv4,Pv4))Ki~((Qv_{1},Pv_{1}),(Qv_{2},Pv_{2}),(Qv_{3},Pv_{3}),(Qv_{4},Pv_{4}))\in\widetilde{K_{i}}

then

((Q(v1+v2v3v4),Pv1),(0,Pv2),(0,Pv3),(0,Pv4))Ki~.((Q(v_{1}+v_{2}-v_{3}-v_{4}),Pv_{1}),(0,Pv_{2}),(0,Pv_{3}),(0,Pv_{4}))\in\widetilde{K_{i}}.

Step 4: Constructing a decomposition of Qi(Vi)Q_{i}(V_{i}). We will now decompose Qi(Vi)=Vi,DepQ_{i}(V_{i})=V_{i,\mathrm{Dep}} into a pair of subspaces. On one of these subspaces we will deduce an improved vanishing for the commutator while on the other subspace we will deduce an approximate linearity for Taylori(gh)\operatorname{Taylor}_{i}(g_{h}). Let

Li={(v1,v2,v3,v4)Si:Pv1=0,v2=v3=v4=0}Ki~.L_{i}^{\ast}=\{(v_{1},v_{2},v_{3},v_{4})\in S_{i}\colon Pv_{1}=0,v_{2}=v_{3}=v_{4}=0\}\cap\widetilde{K_{i}}.

Note that LiL_{i}^{\ast} may abusively be viewed as a subspace of ViV_{i} (instead of Vi4V_{i}^{\otimes 4}) and under this identification LiQi(Vi)=Vi,DepLiL_{i}^{\ast}\leqslant Q_{i}(V_{i})=V_{i,\mathrm{Dep}}\leqslant L_{i}.

The key claim in our analysis is that if $i_{1}+\cdots+i_{r^{\ast}}=s-1$, $v_{i_{\ell}}\in L_{i_{\ell}}$ for all indices $\ell$, and $v_{i_{j}}\in L_{i_{j}}^{\ast}$ for at least one index $j$, then

\[\eta([v_{i_{1}},\ldots,v_{i_{r^{\ast}}}])=0.\]

To prove this, note that Qijvij=vijQ_{i_{j}}v_{i_{j}}=v_{i_{j}} and Pijvij=0P_{i_{j}}v_{i_{j}}=0 and using the last bullet point of Lemma 8.3, we have

\[\eta([v_{i_{1}},\ldots,v_{i_{r^{\ast}}}])=\eta([P_{i_{1}}v_{i_{1}},\ldots,Q_{i_{j}}v_{i_{j}},\ldots,P_{i_{r^{\ast}}}v_{i_{r^{\ast}}}]),\]

similar to the argument in Step 3.

Next note that PiQiv=0P_{i}Q_{i}v=0 for all vViv\in V_{i} and therefore

Pi(Li)Pi(π1(Ki~{(v1,v2,v3,v4)Vi4:Qiv2=Qiv3=Qiv4=0})).P_{i}(L_{i})\leqslant P_{i}(\pi_{1}(\widetilde{K_{i}}\cap\{(v_{1},v_{2},v_{3},v_{4})\in V_{i}^{\otimes 4}\colon Q_{i}v_{2}=Q_{i}v_{3}=Q_{i}v_{4}=0\})).

Therefore we may lift PiviP_{i_{\ell}}v_{i_{\ell}} for j\ell\neq j to vi~=(Pivi+wi,Pivi,Pivi,Pivi)Ki~\widetilde{v_{i_{\ell}}}=(P_{i_{\ell}}v_{i_{\ell}}+w_{i_{\ell}},P_{i_{\ell}}v_{i_{\ell}},P_{i_{\ell}}v_{i_{\ell}},P_{i_{\ell}}v_{i_{\ell}})\in\widetilde{K_{i_{\ell}}} where wiQi(Vi)w_{i_{\ell}}\in Q_{i_{\ell}}(V_{i_{\ell}}). We lift vijv_{i_{j}} to vij~\widetilde{v_{i_{j}}} which has the form (Qijvij,0,0,0)Kij~(Q_{i_{j}}v_{i_{j}},0,0,0)\in\widetilde{K_{i_{j}}}.

Note that we have

\begin{align*}
0&=\eta_{\mathrm{Prod}}([\widetilde{v_{i_{1}}},\ldots,\widetilde{v_{i_{r^{\ast}}}}])=\eta([P_{i_{1}}v_{i_{1}}+w_{i_{1}},\ldots,Q_{i_{j}}v_{i_{j}},\ldots,P_{i_{r^{\ast}}}v_{i_{r^{\ast}}}+w_{i_{r^{\ast}}}])\\
&=\eta([P_{i_{1}}v_{i_{1}},\ldots,Q_{i_{j}}v_{i_{j}},\ldots,P_{i_{r^{\ast}}}v_{i_{r^{\ast}}}])
\end{align*}

where in the first equality we have used for all \ell that vi~Ki~\widetilde{v_{i_{\ell}}}\in\widetilde{K_{i_{\ell}}} and the result from Step 3, in the second equality that vij~\widetilde{v_{i_{j}}} has the final three coordinates identically zero, and in the final equality that wiQi(Vi)=Vi,Depw_{i_{\ell}}\in Q_{i_{\ell}}(V_{i_{\ell}})=V_{i_{\ell},\mathrm{Dep}} and the final item of Lemma 8.3.

The desired decomposition of spaces for the lemma will have

Wi,Pet\displaystyle W_{i,\mathrm{Pet}} :=Li,Wi,:=Pi(Li)LiPi(Vi).\displaystyle:=L_{i}^{\ast},\quad W_{i,\ast}:=P_{i}(L_{i})\leqslant L_{i}\cap P_{i}(V_{i}).

The fact Pi(Li)LiPi(Vi)P_{i}(L_{i})\leqslant L_{i}\cap P_{i}(V_{i}) is deduced from Qi(Vi)LiQ_{i}(V_{i})\leqslant L_{i}. Wi,LinW_{i,\mathrm{Lin}} will be constructed explicitly in the next step but is chosen so that

Wi,LinQi(Li)=Qi(Vi)=Vi,DepW_{i,\mathrm{Lin}}\leqslant Q_{i}(L_{i})=Q_{i}(V_{i})=V_{i,\mathrm{Dep}}

and Wi,Lin+Wi,Pet=Vi,DepLiW_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}}=V_{i,\mathrm{Dep}}\leqslant L_{i}. Given these properties of Wi,LinW_{i,\mathrm{Lin}}, note that the above analysis, along with Lemma 8.3, establishes the final bullet point for our output.

Step 5: Controlling approximate homomorphisms. Recall Ki~Si\widetilde{K_{i}}\leqslant S_{i} and there is a natural isomorphism of groups

Si{(v,v1,v2,v3,v4):vPi(Vi),v1,,v4Qi(Vi)}.S_{i}\simeq\{(v,v_{1},v_{2},v_{3},v_{4})\colon v\in P_{i}(V_{i}),v_{1},\ldots,v_{4}\in Q_{i}(V_{i})\}.

Using this as an identification, we may write

Ki~=j=1dim(Si)dim(Ki~)ker((ξjPi,ξjQi,ξjQi,ξjQi,ξjQi))\widetilde{K_{i}}=\bigcap_{j=1}^{\dim(S_{i})-\dim(\widetilde{K_{i}})}\operatorname{ker}((\xi_{j}^{P_{i}},\xi_{j}^{Q_{i}},\xi_{j}^{Q_{i}},-\xi_{j}^{Q_{i}},-\xi_{j}^{Q_{i}}))

where ξjPiPi(Vi)\xi_{j}^{P_{i}}\in P_{i}(V_{i})^{\vee} and ξjQiQi(Vi)\xi_{j}^{Q_{i}}\in Q_{i}(V_{i})^{\vee} (i.e., corresponding dual vector spaces). Note that the annihilators all have the special form of (,ξjQi,ξjQi,ξjQi,ξjQi)(\cdot,\xi_{j}^{Q_{i}},\xi_{j}^{Q_{i}},-\xi_{j}^{Q_{i}},-\xi_{j}^{Q_{i}}) since

{(Qiv1,Qiv2,Qiv3,Qiv4)Vi4:Qiv1+Qiv2Qiv3Qiv4=0}=RiKi~.\{(Q_{i}v_{1},Q_{i}v_{2},Q_{i}v_{3},Q_{i}v_{4})\in V_{i}^{\otimes 4}\colon Q_{i}v_{1}+Q_{i}v_{2}-Q_{i}v_{3}-Q_{i}v_{4}=0\}=R_{i}\leqslant\widetilde{K_{i}}.

Note that

Li={vVi:Piv=0 and ξjQi(Qiv)=0 for all j}L_{i}^{\ast}=\{v\in V_{i}\colon P_{i}v=0\text{ and }\xi_{j}^{Q_{i}}(Q_{i}v)=0\text{ for all }j\}

since vLiv\in L_{i}^{\ast} is equivalent under this identification to (0,Qiv,0,0,0)Ki~(0,Q_{i}v,0,0,0)\in\widetilde{K_{i}}. Without loss of generality we may assume that for 1jdim(Vi,Dep)dim(Li)1\leq j\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast}), vectors ξjQi\xi_{j}^{Q_{i}} are independent in Qi(Vi)Q_{i}(V_{i})^{\vee} (and they must span the orthogonal space to LiL_{i}^{\ast} within Qi(Vi)Q_{i}(V_{i})^{\vee}).

By appropriate scaling, we may assume $\xi_{j}^{Q_{i}}(w_{i,j})$ is an integer bounded by $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$ for $1\leq j\leq\dim(V_{i,\mathrm{Dep}})$. We extend each $\xi_{j}^{Q_{i}}$ to an operator on $(G_{(i,1)}/G_{(i,2)})^{\vee}$ by setting $\xi_{j}^{Q_{i}}(w_{i,j})=0$ for $j>\dim(V_{i,\mathrm{Dep}})$. Possibly at the cost of another $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$ scaling, we may assume that $\xi_{j}^{Q_{i}}(\Gamma\cap G_{(i,1)}~\mathrm{mod}~G_{(i,2)})\in\mathbb{Z}$. We extend $\xi_{j}^{P_{i}}(\cdot)$ in an analogous manner to $(G_{(i,1)}/G_{(i,2)})^{\vee}$ by setting $\xi_{j}^{P_{i}}(w_{i,j})=0$ for $1\leq j\leq\dim(V_{i,\mathrm{Dep}})$ and $j>\dim(V_{i})$. Again, we may scale such that $\xi_{j}^{P_{i}}(\Gamma\cap G_{(i,1)}~\mathrm{mod}~G_{(i,2)})\in\mathbb{Z}$. The crucial point here is that now $\xi_{j}^{P_{i}}$ and $\xi_{j}^{Q_{i}}$ are $i$-th horizontal characters of height at most $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$.

We have

τi(Taylori(g~h))Ki\tau_{i}(\operatorname{Taylor}_{i}(\widetilde{g}_{\vec{h}}))\in K_{i}

and thus

dist(τi(Taylori(gh)),Si+T1Horizi(Γ4))(MD/ρ)Os(dOs(1))Ni\operatorname{dist}(\tau_{i}(\operatorname{Taylor}_{i}(g_{\vec{h}}^{\ast})),S_{i}+T^{-1}\operatorname{Horiz}_{i}(\Gamma^{\otimes 4}))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}

after pigeonholing $\vec{h}$ appropriately. Here distance is measured in $L^{\infty}$ after writing both quantities in the basis $\exp(\mathcal{X}_{i})^{\otimes 4}$, and $T$ is an integer bounded by $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$. We have used Lemma 2.13 and the properties of the original factorization; a very similar argument appears in Step 5 of the proof of Lemma 8.3.

Furthermore note that

τi(Taylori(gh))Si\tau_{i}(\operatorname{Taylor}_{i}(g_{\vec{h}}^{\ast}))\in S_{i}

by Lemma 8.3. So if we choose a set of horizontal characters of height (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})} relative to Horizi(Γ4)\operatorname{Horiz}_{i}(\Gamma^{\otimes 4}) which cut out Ki~\widetilde{K_{i}} as their common kernel, then noting that KiSiKi~K_{i}\cap S_{i}\leqslant\widetilde{K_{i}} and applying Lemma B.2 we may assume that

(9.1) τi(Taylori(g~h))Ki~\tau_{i}(\operatorname{Taylor}_{i}(\widetilde{g}_{\vec{h}}))\in\widetilde{K_{i}}

and εh,γh\varepsilon_{\vec{h}},\gamma_{\vec{h}} have identical properties up to changing implicit constants. We will assume this refined property of the factorization for the remainder of our analysis.

Given the factorization of ghg_{\vec{h}}^{\ast}, we thus deduce (taking an appropriate least common multiple)

\begin{align*}
\bigg\lVert T_{1}\cdot\bigg(\xi_{j}^{P_{i}}(\operatorname{Taylor}_{i}(g_{h_{2}}))&+\xi_{j}^{Q_{i}}(\operatorname{Taylor}_{i}(g_{h_{1}}))+\xi_{j}^{Q_{i}}(\operatorname{Taylor}_{i}(g_{h_{2}}))\\
&-\xi_{j}^{Q_{i}}(\operatorname{Taylor}_{i}(g_{h_{3}}))-\xi_{j}^{Q_{i}}(\operatorname{Taylor}_{i}(g_{h_{4}}))\bigg)\bigg\rVert_{\mathbb{R}/\mathbb{Z}}\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}\tag{9.2}
\end{align*}

for all 1jdim(Vi,Dep)dim(Li)1\leq j\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast}) where T1T_{1} is an integer bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. Here we have used that ξjPi(Taylori(gh))\xi_{j}^{P_{i}}(\operatorname{Taylor}_{i}(g_{h})) is equal for all hHh\in H by Lemma 8.3.

We define functions f,g:Hi=1s1dim(Vi,Dep)dim(Li)f,g\colon H\to\mathbb{R}^{\sum_{i=1}^{s-1}\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast})} via

\begin{align*}
f(h)&=(T_{1}\xi_{j}^{Q_{i}}(\operatorname{Taylor}_{i}(g_{h})))_{1\leq i\leq s-1,~1\leq j\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast})},\\
g(h)&=(T_{1}\xi_{j}^{P_{i}}(\operatorname{Taylor}_{i}(g_{h}))+T_{1}\xi_{j}^{Q_{i}}(\operatorname{Taylor}_{i}(g_{h})))_{1\leq i\leq s-1,~1\leq j\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast})}.
\end{align*}

Note that for the additive quadruples on which we have (9.2), we are exactly in the situation necessary to apply results on approximate homomorphisms.

In particular, we may apply Lemma A.1. We see that there exists HHH^{\prime}\subseteq H having density at least exp(Os(dlog(MD/ρ))Os(1))\exp(-O_{s}(d\log(MD/\rho))^{O_{s}(1)}) such that for all i,ji,j and hHh\in H^{\prime}, we have

(9.3) T1ξjQi(Taylori(gh))(γi,j+k=1dαi,j,k{βkh})/(MD/ρ)Os(dOs(1))Ni,\bigg{\lVert}T_{1}\xi_{j}^{Q_{i}}(\operatorname{Taylor}_{i}(g_{h}))-\bigg{(}\gamma_{i,j}+\sum_{k=1}^{d^{\ast}}\alpha_{i,j,k}\{\beta_{k}h\}\bigg{)}\bigg{\rVert}_{\mathbb{R}/\mathbb{Z}}\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i},

where:

  • d(dlog(MD/ρ))Os(1)d^{\ast}\leq(d\log(MD/\rho))^{O_{s}(1)};

  • βk(1/N)\beta_{k}\in(1/N^{\prime})\mathbb{Z} where NN^{\prime} is a prime between 100N100N and 200N200N.
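As a toy illustration of why bracket-linear forms such as the one in (9.3) are the natural models for approximate homomorphisms, the following sketch checks that $h\mapsto\gamma+\sum_{k}\alpha_{k}\{\beta_{k}h\}$ is additive on an additive quadruple $h_{1}+h_{2}=h_{3}+h_{4}$ whenever none of the fractional parts "wrap around". All numerical values (`N_PRIME`, `GAMMA`, `ALPHAS`, `BETAS`, and the range of $h$) are hypothetical choices made only for this example and do not come from the argument above.

```python
from fractions import Fraction
from itertools import product
from math import floor

def frac(x):
    """Fractional part {x} in [0, 1), computed exactly."""
    return x - floor(x)

# Hypothetical parameters, chosen only for illustration.
N_PRIME = 101                      # stand-in for the prime N'
GAMMA = Fraction(1, 4)
ALPHAS = [Fraction(1, 3), Fraction(-2, 5)]
BETAS = [Fraction(7, N_PRIME), Fraction(30, N_PRIME)]

def bracket_linear(h):
    """The model function gamma + sum_k alpha_k * {beta_k * h}."""
    return GAMMA + sum(a * frac(b * h) for a, b in zip(ALPHAS, BETAS))

def quadruple_statistics(H):
    """Scan additive quadruples h1 + h2 = h3 + h4 inside H."""
    H = list(H)
    members = set(H)
    total = exact = 0
    wrap_values = set()
    for h1, h2, h3 in product(H, repeat=3):
        h4 = h1 + h2 - h3
        if h4 not in members:
            continue
        total += 1
        # Each wrap is an integer because beta*(h1 + h2 - h3 - h4) = 0.
        wraps = [frac(b * h1) + frac(b * h2) - frac(b * h3) - frac(b * h4)
                 for b in BETAS]
        wrap_values.update(wraps)
        defect = (bracket_linear(h1) + bracket_linear(h2)
                  - bracket_linear(h3) - bracket_linear(h4))
        # The additivity defect is exactly the alpha-weighted sum of wraps.
        assert defect == sum(a * w for a, w in zip(ALPHAS, wraps))
        if all(w == 0 for w in wraps):
            exact += 1
    return total, exact, wrap_values

total, exact, wrap_values = quadruple_statistics(range(1, 31))
print(f"{exact} of {total} additive quadruples are exactly additive")
```

Each wrap lies in $\{-1,0,1\}$, so the additivity defect takes only boundedly many values; this is the elementary mechanism behind the structure of the model, not a substitute for Lemma A.1.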

At this point, for each ii we find elements Zi,jZ_{i,j} for 1jdim(Vi,Dep)dim(Li)1\leq j\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast}) which are (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational combinations of {wi,j:1jdim(Vi,Dep)}\{w_{i,j}\colon 1\leq j\leq\dim(V_{i,\mathrm{Dep}})\} such that

(9.4) T1ξjQi(Zi,j)=1 and ξjQi(Zi,j)=0T_{1}\xi_{j}^{Q_{i}}(Z_{i,j})=1\text{ and }\xi_{j^{\prime}}^{Q_{i}}(Z_{i,j})=0

for jjj^{\prime}\neq j such that 1j,jdim(Vi,Dep)dim(Li)1\leq j,j^{\prime}\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast}). We define

Wi,Lin=span((Zi,j)1jdim(Vi,Dep)dim(Li)).W_{i,\mathrm{Lin}}=\operatorname{span}_{\mathbb{R}}((Z_{i,j})_{1\leq j\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast})}).
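The existence of vectors satisfying (9.4) is a routine dual-basis computation. A minimal sketch, assuming the functionals $\xi_{j}^{Q_{i}}$ are given as linearly independent integer rows in some fixed coordinates: the Gram-matrix formula used here is just one convenient choice, and the matrix `XI` and the integer `T1` are hypothetical data for the example.

```python
from fractions import Fraction

def solve(A, B):
    """Solve A X = B exactly (A square and invertible) by Gauss-Jordan."""
    n = len(A)
    M = [[Fraction(x) for x in A[r]] + [Fraction(x) for x in B[r]]
         for r in range(n)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def pairing(xi_row, z):
    """Evaluate the functional xi_row on the vector z."""
    return sum(Fraction(x) * y for x, y in zip(xi_row, z))

def dual_vectors(xi, T1):
    """Return Z_1..Z_m with T1*xi_j(Z_j) = 1 and xi_j'(Z_j) = 0 for j' != j."""
    m, n = len(xi), len(xi[0])
    gram = [[sum(xi[a][k] * xi[b][k] for k in range(n)) for b in range(m)]
            for a in range(m)]
    scaled_identity = [[Fraction(int(a == b), T1) for b in range(m)]
                       for a in range(m)]
    C = solve(gram, scaled_identity)          # C = gram^{-1} / T1
    # Z_j is the combination sum_a C[a][j] * xi[a] of the rows of xi.
    return [[sum(C[a][j] * xi[a][k] for a in range(m)) for k in range(n)]
            for j in range(m)]

# Hypothetical example data.
XI = [[1, 2, 0], [0, 1, 3]]
T1 = 6
Z = dual_vectors(XI, T1)
```

The output vectors have rational coordinates whose denominators are controlled by `T1` and the determinant of the Gram matrix, in the spirit of the rationality bound on $Z_{i,j}$ stated above.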

We see that there are no nontrivial linear relations between $W_{i,\ast}$ and $W_{i,\mathrm{Pet}}+W_{i,\mathrm{Lin}}$ since $W_{i,\ast}\leqslant P_{i}(V_{i})$ and $W_{i,\mathrm{Pet}}+W_{i,\mathrm{Lin}}\leqslant Q_{i}(V_{i})$. There are no linear relations between $W_{i,\mathrm{Pet}}$ and $W_{i,\mathrm{Lin}}$ as $W_{i,\mathrm{Pet}}$ lies in the joint kernel of the $\xi_{j}^{Q_{i}}$ and therefore using (9.4) one can prove any such relation is trivial. Furthermore, by construction we have $V_{i,\mathrm{Dep}}=Q_{i}(V_{i})=W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}}$. Finally $L_{i}=P_{i}(L_{i})+Q_{i}(L_{i})=W_{i,\ast}+W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}}$; this implicitly uses $Q_{i}(L_{i})=Q_{i}(V_{i})\leqslant L_{i}$.

Step 6: Constructing the desired factorizations and completing the proof. Using the refined factorization (9.1) implies that

\[\pi_{1}(\tau_{i}(\operatorname{Taylor}_{i}(\widetilde{g}_{\vec{h}})))\in L_{i}\]

since π1(Ki~)Li\pi_{1}(\widetilde{K_{i}})\leqslant L_{i}. Applying gh=εhgh~γhg_{\vec{h}}^{\ast}=\varepsilon_{\vec{h}}\cdot\widetilde{g_{\vec{h}}}\cdot\gamma_{\vec{h}} in the first coordinate then implies that

(9.5) dist(Taylori(gh1),Li+T21Horizi(Γ))(MD/ρ)Os(dOs(1))Ni\operatorname{dist}(\operatorname{Taylor}_{i}(g_{h_{1}}),L_{i}+T_{2}^{-1}\operatorname{Horiz}_{i}(\Gamma))\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-i}

for h1Hh_{1}\in H^{\prime} where distance is in LL^{\infty} after expressing values in terms of exp(𝒳i)\exp(\mathcal{X}_{i}). Here T2T_{2} is an integer bounded by (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}. Furthermore recall from Lemma 8.3 that

(9.6) Taylori(gh)Taylori(gh)Vi,Dep=Qi(Vi)\operatorname{Taylor}_{i}(g_{h})-\operatorname{Taylor}_{i}(g_{h^{\prime}})\in V_{i,\mathrm{Dep}}=Q_{i}(V_{i})

for h,hHHh,h^{\prime}\in H^{\prime}\subseteq H.

Let Yi,jspan(𝒳log(G(i,1))𝒳log(G(i,2)))Y_{i,j}\in\operatorname{span}_{\mathbb{R}}(\mathcal{X}\cap\log(G_{(i,1)})\setminus\mathcal{X}\cap\log(G_{(i,2)})) be such that exp(Yi,j)modG(i,2)=Zi,j\exp(Y_{i,j})~{}\mathrm{mod}~{}G_{(i,2)}=Z_{i,j}. Then for hHh\in H^{\prime}, we define

(9.7) g~h(n)=i=1sj=1dim(Wi,Lin)exp(Yi,j)T11(ni)(γi,j+k=1dαi,j,k{βkh}).\widetilde{g}_{h}(n)=\prod_{i=1}^{s}\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\exp(Y_{i,j})^{T_{1}^{-1}\binom{n}{i}\cdot(\gamma_{i,j}+\sum_{k=1}^{d^{\ast}}\alpha_{i,j,k}\{\beta_{k}h\})}.

By construction and Lemma 2.13, for h,hHh,h^{\prime}\in H^{\prime} we have

\begin{align*}
\operatorname{Taylor}_{i}(\widetilde{g}_{h}^{-1}g_{h})-\operatorname{Taylor}_{i}(\widetilde{g}_{h^{\prime}}^{-1}g_{h^{\prime}})&\in Q_{i}(V_{i}),\\
\operatorname{dist}(\operatorname{Taylor}_{i}(\widetilde{g}_{h}^{-1}g_{h}),L_{i}+T_{2}^{-1}\operatorname{Horiz}_{i}(\Gamma))&\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}\cdot N^{-i},\\
\lVert T_{1}\xi_{j}^{Q_{i}}(\widetilde{g}_{h}^{-1}g_{h})\rVert_{\mathbb{R}/\mathbb{Z}}&\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}\cdot N^{-i},
\end{align*}

where 1is11\leq i\leq s-1 and 1jdim(Vi,Dep)dim(Li)1\leq j\leq\dim(V_{i,\mathrm{Dep}})-\dim(L_{i}^{\ast}). The first line comes from (9.6), the second line from (9.5), and the third from (9.3) and (9.7), in conjunction with (9.4).

We now fix an element h2Hh_{2}\in H^{\prime}. For each h1Hh_{1}\in H^{\prime} we write

gh1\displaystyle g_{h_{1}}^{\prime} =g~h1(g~h11gh1)=g~h1(g~h11gh1)(g~h21gh2)1(g~h21gh2).\displaystyle=\widetilde{g}_{h_{1}}\cdot(\widetilde{g}_{h_{1}}^{-1}g_{h_{1}}^{\prime})=\widetilde{g}_{h_{1}}\cdot(\widetilde{g}_{h_{1}}^{-1}g_{h_{1}}^{\prime})\cdot(\widetilde{g}_{h_{2}}^{-1}g_{h_{2}}^{\prime})^{-1}\cdot(\widetilde{g}_{h_{2}}^{-1}g_{h_{2}}^{\prime}).

By applying Lemma B.2, we may write

(g~h11gh1)(g~h21gh2)1=εh1gh1γh1,(g~h21gh2)=εgγ(\widetilde{g}_{h_{1}}^{-1}g_{h_{1}}^{\prime})\cdot(\widetilde{g}_{h_{2}}^{-1}g_{h_{2}}^{\prime})^{-1}=\varepsilon_{h_{1}}^{\ast}g_{h_{1}}^{\ast}\gamma_{h_{1}}^{\ast},\qquad(\widetilde{g}_{h_{2}}^{-1}g_{h_{2}}^{\prime})=\varepsilon^{\ast}g^{\ast}\gamma^{\ast}

where γ,γh1\gamma^{\ast},\gamma_{h_{1}}^{\ast} are (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational, ε,εh1\varepsilon^{\ast},\varepsilon_{h_{1}}^{\ast} are ((MD/ρ)Os(dOs(1)),N)((MD/\rho)^{O_{s}(d^{O_{s}(1)})},N)-smooth, and we have Taylori(gh1)Li=Wi,Pet\operatorname{Taylor}_{i}(g_{h_{1}}^{\ast})\in L_{i}^{\ast}=W_{i,\mathrm{Pet}} using the first and third lines above and Taylori(g)Li+Pi(Li)=Wi,+Wi,Pet\operatorname{Taylor}_{i}(g^{\ast})\in L_{i}^{\ast}+P_{i}(L_{i})=W_{i,\ast}+W_{i,\mathrm{Pet}} using the second and third lines above. (Recall that LiQi(Vi)L_{i}^{\ast}\leqslant Q_{i}(V_{i}) is cut out by the ξjQi\xi_{j}^{Q_{i}}.) Additionally, these sequences are the identity at 0.

Therefore, for h1Hh_{1}\in H^{\prime} we have

\begin{align*}
g_{h_{1}}^{\prime}&=\varepsilon_{h_{1}}^{\ast}\varepsilon^{\ast}((\varepsilon_{h_{1}}^{\ast}\varepsilon^{\ast})^{-1}\widetilde{g}_{h_{1}}(\varepsilon_{h_{1}}^{\ast}\varepsilon^{\ast}))((\varepsilon^{\ast})^{-1}g_{h_{1}}^{\ast}\varepsilon^{\ast})((\varepsilon^{\ast})^{-1}\gamma_{h_{1}}^{\ast}\varepsilon^{\ast}(\gamma_{h_{1}}^{\ast})^{-1})(\gamma_{h_{1}}^{\ast}g^{\ast}(\gamma_{h_{1}}^{\ast})^{-1})(\gamma_{h_{1}}^{\ast}\gamma^{\ast})\\
&=:(\varepsilon_{h_{1}}^{\ast}\varepsilon^{\ast})\cdot g_{h_{1}}^{\triangle}\cdot(\gamma_{h_{1}}^{\ast}\gamma^{\ast}).
\end{align*}

So, for h3,h4Hh_{3},h_{4}\in H we deduce using Lemma 2.13 and the above analysis that

\begin{align*}
\operatorname{Taylor}_{i}(\widetilde{g}_{h_{3}})&\in W_{i,\mathrm{Lin}},\\
\operatorname{Taylor}_{i}(g_{h_{3}}^{\triangle})&\in L_{i}=W_{i,\ast}+W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}},\\
\operatorname{Proj}_{W_{i,\mathrm{Lin}}}(\operatorname{Taylor}_{i}(g_{h_{3}}^{\triangle}))&=\operatorname{Proj}_{W_{i,\mathrm{Lin}}}(\operatorname{Taylor}_{i}(\widetilde{g}_{h_{3}})),\\
\operatorname{Taylor}_{i}(\widetilde{g}_{h_{3}}^{-1}g_{h_{3}}^{\triangle})-\operatorname{Taylor}_{i}(\widetilde{g}_{h_{4}}^{-1}g_{h_{4}}^{\triangle})&\in W_{i,\mathrm{Pet}}.
\end{align*}

Furthermore note that $\varepsilon_{h_{1}}^{\ast}\varepsilon^{\ast}$ is sufficiently smooth and $\gamma_{h_{1}}^{\ast}\gamma^{\ast}$ is appropriately rational. This nearly gives the desired result, except that we need to remove the rational and smooth parts exactly as in Step 7 of Lemma 8.3. We omit the details, but note that the only difference between $g_{h}^{\triangle}$ and the output is a conjugation by a fixed element, which leaves all properties unchanged, and that the Fourier phase on the $\mathbb{R}$ part of $(G^{\ast})^{\prime}$ may be modified. Additionally, the set $H^{\prime}$ will be made smaller by acceptable factors due to pigeonholing.

Step 7: Handling the exceptional case $s=2$. In this exceptional case, we have $r^{\ast}=1$ and $s=2$, and $\eta$ is nontrivial. The difference from the prior analysis is that the error term $\psi_{\vec{h}}(g_{\vec{h}}(n)\Gamma^{\prime})$ is replaced by $e(\Theta_{\vec{h}}n)$ with $\lVert\Theta_{\vec{h}}\rVert_{\mathbb{R}/\mathbb{Z}}\leq(MD/\rho)^{O_{s}(d^{O_{s}(1)})}N^{-1}$, by Remark 7.6 regarding Lemma 7.5 for $s=2$.

We take $G_{\mathrm{Error}}=\mathbb{R}$, $\Gamma_{\mathrm{Error}}=\mathbb{Z}$, $g_{\vec{h}}(n)=\Theta_{\vec{h}}n$, and $\psi_{\vec{h}}(z)=e(z)$. $\widetilde{G}$ is defined as before. Taking $\eta^{\ast}=(\eta,\eta,-\eta,-\eta,1)$, by Corollary 5.5 we may factor

gh=εhg~hγhg_{\vec{h}}^{\ast}=\varepsilon_{\vec{h}}\cdot\widetilde{g}_{\vec{h}}\cdot\gamma_{\vec{h}}

where εh\varepsilon_{\vec{h}} is ((MD/ρ)Os(dOs(1)),N)((MD/\rho)^{O_{s}(d^{O_{s}(1)})},N)-smooth, γh\gamma_{\vec{h}} is (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational, and g~h\widetilde{g}_{\vec{h}} lies in a (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})}-rational subgroup KK such that η(KG~(1,1))=0\eta^{\ast}(K\cap\widetilde{G}_{(1,1)})=0. Note however that

gh=(idG,idG,idG,idG,Θhn)(τ(gh),0)g_{\vec{h}}^{\ast}=(\mathrm{id}_{G},\mathrm{id}_{G},\mathrm{id}_{G},\mathrm{id}_{G},\Theta_{\vec{h}}n)\cdot(\tau(g_{\vec{h}}^{\ast}),0)

where τ:G~G4\tau\colon\widetilde{G}\to G^{\otimes 4} is the natural projection. Let K=K(G4×{0})K^{\ast}=K\cap(G^{\otimes 4}\times\{0\}) and note that KK^{\ast} can be defined as the joint kernel of certain horizontal characters of height (MD/ρ)Os(dOs(1))(MD/\rho)^{O_{s}(d^{O_{s}(1)})} (namely, ones defining KK along with one of the form (0,0,0,0,1)(0,0,0,0,1)). Since Θh\Theta_{\vec{h}} is small and is the only part in the fifth coordinate, arguments similar to before allow us to refine the first factorization (up to changing implicit constants) and instead assume that g~h\widetilde{g}_{\vec{h}} lies in KK^{\ast}.

Furthermore note that if $\eta_{\mathrm{Prod}}=(\eta,\eta,-\eta,-\eta,0)$ we have that $\eta_{\mathrm{Prod}}(K^{\ast})=0$ as $\eta_{\mathrm{Prod}}$ and $\eta^{\ast}$ agree on the initial four groups. At this point we are exactly in the situation of the earlier analysis and we may complete the proof. (Footnote 3: Various simplifications are possible in this case since the underlying groups are all abelian here; in particular, invoking Corollary 5.5 reduces to summing a geometric series.)

We remark that modulo minor annoyances, the strategy of using Lemma 7.5, deducing an approximate homomorphism, and then applying results coming from the Bogolyubov lemma was introduced by Gowers [16] in his seminal work on four-term arithmetic progressions. It was similarly applied in work of Green and Tao [23] on the U3U^{3}-inverse theorem. In a certain sense, the previous two sections can be thought of as showing that, given an appropriate equidistribution theorem and defining a number of notions for nilmanifolds, this analysis can be modified to make sense in the greater generality of nilmanifolds where the group is not abelian.

10. Setup for extracting a (1,s1)(1,s-1)-nilsequence

Before diving into the formal proof, we motivate how we extract the “top degree-rank” part and why lifting to the universal nilmanifold plays a role in our argument at this stage. We remark that Green, Tao, and Ziegler [34] work with the universal nilmanifold throughout their argument (in the form of a representation of a degree-rank nilcharacter; see [34, Definition 9.11]).

Recall the bracket polynomial U5U^{5}-inverse sketch discussed in Section 4; we started with functions

e(i=1d1ai,hn[bi,hn][ci,hn]+i=1d2di,hn2[ei,hn]+i=1d3fi,hn[gi,hn]+jhn3+hn2+mhn)e\bigg{(}\sum_{i=1}^{d_{1}}a_{i,h}n[b_{i,h}n][c_{i,h}n]+\sum_{i=1}^{d_{2}}d_{i,h}n^{2}[e_{i,h}n]+\sum_{i=1}^{d_{3}}f_{i,h}n[g_{i,h}n]+j_{h}n^{3}+\ell_{h}n^{2}+m_{h}n\bigg{)}

which correlate with Δhf\Delta_{h}f. At this point, we have proven that

i=1d1ai,hn[bi,hn][ci,hn]\sum_{i=1}^{d_{1}}a_{i,h}n[b_{i,h}n][c_{i,h}n]

is equivalent to a bracket polynomial up to lower order terms of degree-rank of the form

i=1d1δi{εih}n[βi,n][γi,n].\sum_{i=1}^{d_{1}^{\prime}}\delta_{i}\{\varepsilon_{i}h\}n[\beta_{i,\ast}n][\gamma_{i,\ast}n].

Our goal at this stage is to isolate

e(i=1d1δi{εih}n[βi,n][γi,n]);e\bigg{(}\sum_{i=1}^{d_{1}^{\prime}}\delta_{i}\{\varepsilon_{i}h\}n[\beta_{i,\ast}n][\gamma_{i,\ast}n]\bigg{)};

in the next section we will then convert this “top degree-rank” bracket phase into a $(1,s-1)$-nilsequence.

The reason lifting to a universal nilmanifold proves so technically useful is that it enables us to isolate various components of the horizontal tori as “separate subgroups”. For the sake of simplicity, consider a 22-step group GG in the U4U^{4}-inverse case given the degree-rank filtration G(0,0)=G(1,0)=G(1,1)=GG_{(0,0)}=G_{(1,0)}=G_{(1,1)}=G, G(2,0)=G(2,1)=G(2,2)=[G,G]G_{(2,0)}=G_{(2,1)}=G_{(2,2)}=[G,G], where the remaining groups are trivial. In this case, the output of Lemma 9.1 gives the linearly disjoint subspaces WW_{\ast}, WLinW_{\mathrm{Lin}}, WPetW_{\mathrm{Pet}} of V=G/[G,G]V=G/[G,G] such that the commutator of any two elements in WLin+WPetW_{\mathrm{Lin}}+W_{\mathrm{Pet}} vanishes and the commutator of any element of WW_{\ast} and WPetW_{\mathrm{Pet}} vanishes.

Let 𝒵\mathcal{Z}_{\ast} denote the rational basis of log(W)mod[G,G]\log(W_{\ast})~{}\mathrm{mod}~{}[G,G] and 𝒵Lin\mathcal{Z}_{\mathrm{Lin}} and 𝒵Pet\mathcal{Z}_{\mathrm{Pet}} be analogous. We also have a decomposition of our polynomial

gh=gh,+gh,Lin+gh,Petmod[G,G]g_{h}=g_{h,\ast}+g_{h,\mathrm{Lin}}+g_{h,\mathrm{Pet}}~{}\mathrm{mod}~{}[G,G]

where

\begin{align*}
g_{h,\mathrm{Lin}}(n)&=\prod_{i=1}^{\dim(W_{\mathrm{Lin}})}\exp(\delta_{i}n\{\varepsilon_{i}h\}Z_{i}^{\mathrm{Lin}}),\qquad g_{h,\ast}(n)=\prod_{i=1}^{\dim(W_{\ast})}\exp(\beta_{i}nZ_{i}^{\ast}),\\
g_{h,\mathrm{Pet}}(n)&=\prod_{i=1}^{\dim(W_{\mathrm{Pet}})}\exp(\gamma_{i}^{h}nZ_{i}^{\mathrm{Pet}}).
\end{align*}

Therefore we may write

gh(n)=gh,(n)gh,Lin(n)gh,Pet(n)gh,Rem(n)g_{h}(n)=g_{h,\ast}(n)g_{h,\mathrm{Lin}}(n)g_{h,\mathrm{Pet}}(n)g_{h,\mathrm{Rem}}(n)

with gh,Rem(n)[G,G]g_{h,\mathrm{Rem}}(n)\in[G,G] pointwise. The top order term which we seek to isolate is heuristically similar to

e(i=1dim(WLin)j=1dim(W)δin{εih}[βjn][ZiLin,Zj]).e\bigg{(}\sum_{i=1}^{\dim(W_{\mathrm{Lin}})}\sum_{j=1}^{\dim(W_{\ast})}\delta_{i}n\{\varepsilon_{i}h\}[\beta_{j}n][Z_{i}^{\mathrm{Lin}},Z_{j}^{\ast}]\bigg{)}.

Note that given the factorization of $g_{h}$, we have established no control over $g_{h,\mathrm{Pet}}$ and $g_{h,\mathrm{Rem}}$. This may suggest that we wish to quotient out by the subgroup $W_{\mathrm{Pet}}[G,G]$ in order to kill these terms; note however that $G/(W_{\mathrm{Pet}}[G,G])$ is now abelian and such a projection “kills” the higher order degree-rank term calculated above. This suggests that the group $W_{\mathrm{Pet}}[G,G]$ is “too large” a quotient. The solution is to “enlarge” the group $G$ so that the subgroup $[W_{\mathrm{Lin}},W_{\ast}]$ and the subgroup $G^{\prime}$ that corresponds to the remaining phases $\gamma_{h}n^{2}+\delta_{h}n$ are disjoint. We can then quotient by $W_{\mathrm{Pet}}G^{\prime}$. This disjointness is accomplished by lifting to the universal nilmanifold of degree-rank $(2,2)$.

10.1. Unwinding the output of Lemma 9.1

We first require the following elementary lemma regarding lattice elements when presented in first-kind coordinates.

Lemma 10.1.

Fix an integer k1k\geq 1. Consider a nilmanifold G/ΓG/\Gamma of dimension dd with a Mal’cev basis 𝒳={X1,,Xd}\mathcal{X}=\{X_{1},\ldots,X_{d}\} of logG\log G which is QQ-rational and such that 𝒳\mathcal{X} has the degree kk nesting property. Then there exists a positive integer QOk(QOk(dOk(1)))Q^{\prime}\leq O_{k}(Q^{O_{k}(d^{O_{k}(1)})}) such that if zjQz_{j}\in Q^{\prime}\cdot\mathbb{Z} then

exp(j=1dzjXj)Γ.\exp\bigg{(}\sum_{j=1}^{d}z_{j}X_{j}\bigg{)}\in\Gamma.
Proof.

Note that $\Gamma=\psi_{\mathcal{X}}(\mathbb{Z}^{d})$. By [42, Lemma B.1], $\psi_{\mathcal{X}}\circ\psi_{\mathrm{exp},\mathcal{X}}^{-1}$ is a degree $O_{k}(1)$ polynomial with coefficients of height at most $Q^{O_{k}(d^{O(1)})}$. The desired result then follows by taking $Q^{\prime}$ to be the least common multiple of all denominators of all coefficients present in this polynomial (since there are only $O_{k}(d^{O_{k}(1)})$ total coefficients). Note that the polynomial corresponding to $\psi_{\mathcal{X}}\circ\psi_{\mathrm{exp},\mathcal{X}}^{-1}$ has no constant term, as can be seen by considering the image of $\mathrm{id}_{G}$. ∎
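For intuition, the simplest instance of Lemma 10.1 can be checked by hand in the Heisenberg group. The sketch below assumes the standard model of $3\times 3$ upper unitriangular matrices with $\Gamma$ the integer matrices and basis $X_{1}=E_{12}$, $X_{2}=E_{23}$, $X_{3}=E_{13}$; this concrete model and the function names are illustrative assumptions, not notation from the lemma. Here $\exp(z_{1}X_{1}+z_{2}X_{2}+z_{3}X_{3})$ has top-right entry $z_{3}+z_{1}z_{2}/2$, so $Q^{\prime}=2$ suffices.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Exact product of two square matrices with Fraction entries."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_first_kind(z1, z2, z3):
    """exp(z1*X1 + z2*X2 + z3*X3) in the Heisenberg group (assumed model)."""
    z1, z2, z3 = Fraction(z1), Fraction(z2), Fraction(z3)
    zero, one = Fraction(0), Fraction(1)
    A = [[zero, z1, z3],
         [zero, zero, z2],
         [zero, zero, zero]]
    A2 = mat_mul(A, A)                 # A is nilpotent with A^3 = 0
    identity = [[one if i == j else zero for j in range(3)] for i in range(3)]
    return [[identity[i][j] + A[i][j] + Fraction(1, 2) * A2[i][j]
             for j in range(3)] for i in range(3)]

def in_lattice(M):
    """Membership in Gamma: every matrix entry is an integer."""
    return all(entry.denominator == 1 for row in M for entry in row)

# Integer first-kind coordinates need not give a lattice point ...
print(in_lattice(exp_first_kind(1, 1, 0)))
# ... but coordinates in 2*Z always do in this model.
print(in_lattice(exp_first_kind(2, 4, 6)))
```

The failure for $(z_{1},z_{2},z_{3})=(1,1,0)$ comes from the denominator $2$ in the Baker–Campbell–Hausdorff correction, which is exactly the kind of denominator that $Q^{\prime}$ clears in the proof above.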

We next require the following additional elementary lemma which gives a Taylor series expansion which is “graded by the Mal’cev basis”.

Lemma 10.2.

Consider a nilmanifold G/ΓG/\Gamma of degree kk with an adapted Mal’cev basis 𝒳={X1,,Xdim(G)}\mathcal{X}=\{X_{1},\ldots,X_{\dim(G)}\} and a polynomial sequence g(n)g(n). There exists a representation

\[g(n)=\prod_{i=0}^{k}\prod_{j=\dim(G)-\dim(G_{i})+1}^{\dim(G)}\exp(X_{j})^{\alpha_{i,j}\cdot\frac{n^{i}}{i!}}\]

where αi,j\alpha_{i,j}\in\mathbb{R}.

Proof.

Note via Baker–Campbell–Hausdorff and existence of Taylor expansions, we may write

\[g(n)=\exp\bigg(\sum_{i=0}^{k}g_{i}\cdot\frac{n^{i}}{i!}\bigg)\]

with gilog(Gi)g_{i}\in\log(G_{i}). Let g0(n)=g(n)g_{0}(n)=g(n) and g0,i=gig_{0,i}=g_{i}. Then iteratively define g+1(n)g_{\ell+1}(n) by the following process: write j=dim(G)dim(G(,0))+1dim(G)α,jXj=g,\sum_{j=\dim(G)-\dim(G_{(\ell,0)})+1}^{\dim(G)}\alpha_{\ell,j}X_{j}=g_{\ell,\ell}. Then let

g+1(n):=(j=dim(G)dim(G)+1dim(G)exp(Xj)α,jn!)1g(n)g_{\ell+1}(n):=\Bigg{(}\prod_{j=\dim(G)-\dim(G_{\ell})+1}^{\dim(G)}\exp(X_{j})^{\alpha_{\ell,j}\cdot\frac{n^{\ell}}{\ell!}}\Bigg{)}^{-1}g_{\ell}(n)

and write

g+1(n)=exp(i=+1sg+1,inii!)g_{\ell+1}(n)=\exp\bigg{(}\sum_{i=\ell+1}^{s}g_{\ell+1,i}\cdot\frac{n^{i}}{i!}\bigg{)}

in order to define $g_{\ell+1,i}$. There exists a valid choice of $\alpha_{\ell,j}$ at each step since $\mathcal{X}$ is a filtered Mal'cev basis, and there exists a valid choice of $g_{\ell+1,i}$ for $i\geq\ell+1$ by Baker–Campbell–Hausdorff. This process terminates with the identity sequence, and unraveling the definitions gives the desired representation. ∎
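To make the graded expansion concrete, here is a minimal sketch in the Heisenberg group, which is an illustrative assumption rather than an example from the text. With $X_{3}=[X_{1},X_{2}]$, the sequence $g(n)=\exp(n(aX_{1}+bX_{2}))$ factors as $\exp(X_{1})^{an}\exp(X_{2})^{bn}\exp(X_{3})^{-ab\cdot n^{2}/2!}$, so in the notation of the lemma $\alpha_{1,1}=a$, $\alpha_{1,2}=b$, $\alpha_{2,3}=-ab$ and all other coefficients vanish. The helper names below are hypothetical.

```python
from fractions import Fraction


def mul(g, h):
    """Product in the Heisenberg group; (x, y, z) stands for the matrix
    [[1, x, z], [0, 1, y], [0, 0, 1]]."""
    return (g[0] + h[0], g[1] + h[1], g[2] + h[2] + g[0] * h[1])


def exp_alg(u, v, w):
    """exp(u*X1 + v*X2 + w*X3) with X1 = E12, X2 = E23, X3 = E13."""
    u, v, w = Fraction(u), Fraction(v), Fraction(w)
    return (u, v, w + u * v / 2)


def graded_product(a, b, n):
    """exp(X1)^(a*n) * exp(X2)^(b*n) * exp(X3)^(-a*b*n^2/2!)."""
    alpha_11, alpha_12, alpha_23 = a, b, -a * b
    factors = [
        exp_alg(alpha_11 * n, 0, 0),
        exp_alg(0, alpha_12 * n, 0),
        exp_alg(0, 0, alpha_23 * Fraction(n * n, 2)),
    ]
    out = (Fraction(0), Fraction(0), Fraction(0))
    for f in factors:
        out = mul(out, f)
    return out


a, b = Fraction(1, 3), Fraction(5, 7)
# The graded product agrees with exp(n*(a*X1 + b*X2)) for every tested n.
print(all(graded_product(a, b, n) == exp_alg(a * n, b * n, 0)
          for n in range(-5, 6)))
```

The degree-two correction term is exactly what the iterative step in the proof peels off after the degree-one factors have been removed.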

Remark.

Note that in the above proof, the reason we do not use the basis $\binom{n}{i}$ is that $\binom{n}{i}\binom{n}{j}$ is not a linear combination of polynomials of the form $\binom{n}{t}$ for $t\geq\max(i,j)+1$, and hence the Baker–Campbell–Hausdorff argument used to construct $g_{\ell+1,i}$ fails (one would need lower-degree terms with $i\leq\ell$).
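The obstruction in this remark can be checked by hand in small cases. The following minimal sketch (the helper names are hypothetical, not from the text) expands $\binom{n}{i}\binom{n}{j}$ in the basis $\binom{n}{t}$ using finite differences at $0$, and shows that a term with $t=\max(i,j)$ already appears. By contrast, $\frac{n^{i}}{i!}\cdot\frac{n^{j}}{j!}=\binom{i+j}{i}\frac{n^{i+j}}{(i+j)!}$ is a single term of degree $i+j$.

```python
from math import comb


def binomial_basis_coeffs(i, j):
    """Coefficients c_t with C(n,i)*C(n,j) = sum_t c_t * C(n,t).

    Uses c_t = (Delta^t p)(0) = sum_m (-1)^(t-m) C(t,m) p(m).
    Only nonzero coefficients are returned.
    """
    def p(n):
        return comb(n, i) * comb(n, j)

    coeffs = {}
    for t in range(i + j + 1):
        c = sum((-1) ** (t - m) * comb(t, m) * p(m) for m in range(t + 1))
        if c:
            coeffs[t] = c
    return coeffs


def monomial_product_coeff(i, j):
    """The c with (n^i/i!)*(n^j/j!) = c * n^(i+j)/(i+j)!."""
    return comb(i + j, i)


print(binomial_basis_coeffs(1, 1))   # n^2 = C(n,1) + 2*C(n,2)
print(binomial_basis_coeffs(1, 2))
print(monomial_product_coeff(1, 2))
```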

We now explicitly unwind, for the sake of clarity, the conclusion of Lemma 9.1. We will use the notation and conclusions here throughout Sections 10 and 11. Suppose we have a $1$-bounded function $f\colon[N]\to\mathbb{C}$ with a degree-rank $(s-1,r^{\ast})$ correlation structure with parameters $\rho,M,D,d$. (Footnote 4: We apologize to the reader; there is a rather incredible amount of data floating around at this point. The crucial details to track are the data regarding Taylor coefficients and the associated decompositions of the vector spaces corresponding to horizontal tori.) Then by Lemma 9.1 and some relabeling there exists a degree-rank $(s-1,r^{\ast})$ correlation structure with parameters

\[\rho^{\prime-1}\leq\exp(O_{s}((d\log(MD/\rho))^{O_{s}(1)})),\quad M^{\prime}\leq O(M),\quad D^{\prime}=D,\quad d^{\prime}\leq O(d)\]

and

  • A subset H[N]H\subseteq[N] with |H|ρN|H|\geq\rho^{\prime}N;

  • A multidegree $(1,s-1)$ nilcharacter $\chi(h,n)$ with a frequency $\eta^{\ast}$ of height at most $M$. Furthermore $\chi$ lives on a nilmanifold $(G^{\ast}\times\mathbb{R})/(\Gamma^{\ast}\times\mathbb{Z})$ with dimension bounded by $d^{\prime}$, output dimension bounded by $D^{\prime}$, and complexity bounded by $M^{\prime}$, and the function underlying $\chi$ is $M^{\prime}$-Lipschitz. We let $g(h,n)$ denote the underlying polynomial sequence;

  • A collection of degree-rank $(s-1,r^{\ast})$ nilcharacters $\chi_{h}(n)$ with a frequency $\eta$ of height at most $M$. Furthermore $\chi_{h}$ lives on a nilmanifold $G/\Gamma$ with dimension bounded by $d$ and output dimension bounded by $D$, $G/\Gamma$ has complexity bounded by $M$, and the function underlying $\chi_{h}$ (which is independent of $h$) is $M^{\prime}$-Lipschitz. We let $g_{h}$ denote the underlying polynomial sequence and we have $g_{h}(0)=\mathrm{id}_{G}$;

  • For all hHh\in H, we have

    Δhf(n)χ(h,n)χh(n)Corr(s2,ρ,M,d);\Delta_{h}f(n)\otimes\chi(h,n)\otimes\chi_{h}(n)\in\operatorname{Corr}(s-2,\rho^{\prime},M^{\prime},d^{\prime});

  • A collection of subspaces $W_{i,\ast},W_{i,\mathrm{Lin}},W_{i,\mathrm{Pet}}\leqslant G_{(i,1)}/G_{(i,2)}$ for $1\leq i\leq s-1$ which are $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$-rational with respect to $\exp(\mathcal{X})\cap G_{(i,1)}~\mathrm{mod}~G_{(i,2)}$;

  • If Wi=Wi,+Wi,Lin+Wi,PetW_{i}=W_{i,\ast}+W_{i,\mathrm{Lin}}+W_{i,\mathrm{Pet}} then dim(Wi)=dim(Wi,)+dim(Wi,Lin)+dim(Wi,Pet)\dim(W_{i})=\dim(W_{i,\ast})+\dim(W_{i,\mathrm{Lin}})+\dim(W_{i,\mathrm{Pet}});

  • Let $Z_{i,1}^{\ast},\ldots,Z_{i,\dim(W_{i,\ast})}^{\ast}$ be a sequence of integral linear combinations of $\mathcal{X}\cap G_{(i,1)}\setminus\mathcal{X}\cap G_{(i,2)}$ such that $\operatorname{span}_{\mathbb{R}}(\exp(Z_{i,1}^{\ast})~\mathrm{mod}~G_{(i,2)},\ldots,\exp(Z_{i,\dim(W_{i,\ast})}^{\ast})~\mathrm{mod}~G_{(i,2)})=W_{i,\ast}$. We may let the coefficients of $Z_{i,j}^{\ast}$ be $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$-bounded and $\exp(Z_{i,j}^{\ast})\in\Gamma$.

  • Let $Z_{i,1}^{\mathrm{Lin}},\ldots,Z_{i,\dim(W_{i,\mathrm{Lin}})}^{\mathrm{Lin}}$ be a sequence of integral linear combinations of $\mathcal{X}\cap G_{(i,1)}\setminus\mathcal{X}\cap G_{(i,2)}$ such that $\operatorname{span}_{\mathbb{R}}(\exp(Z_{i,1}^{\mathrm{Lin}})~\mathrm{mod}~G_{(i,2)},\ldots,\exp(Z_{i,\dim(W_{i,\mathrm{Lin}})}^{\mathrm{Lin}})~\mathrm{mod}~G_{(i,2)})=W_{i,\mathrm{Lin}}$. We may let the coefficients of $Z_{i,j}^{\mathrm{Lin}}$ be $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$-bounded and $\exp(Z_{i,j}^{\mathrm{Lin}})\in\Gamma$.

  • Let $Z_{i,1}^{\mathrm{Pet}},\ldots,Z_{i,\dim(W_{i,\mathrm{Pet}})}^{\mathrm{Pet}}$ be a sequence of integral linear combinations of $\mathcal{X}\cap G_{(i,1)}\setminus\mathcal{X}\cap G_{(i,2)}$ such that $\operatorname{span}_{\mathbb{R}}(\exp(Z_{i,1}^{\mathrm{Pet}})~\mathrm{mod}~G_{(i,2)},\ldots,\exp(Z_{i,\dim(W_{i,\mathrm{Pet}})}^{\mathrm{Pet}})~\mathrm{mod}~G_{(i,2)})=W_{i,\mathrm{Pet}}$. We may let the coefficients of $Z_{i,j}^{\mathrm{Pet}}$ be $(MD/\rho)^{O_{s}(d^{O_{s}(1)})}$-bounded and $\exp(Z_{i,j}^{\mathrm{Pet}})\in\Gamma$.

  • For 1is11\leq i\leq s-1 and hHh\in H, we have

    \begin{align*}
    \operatorname{Taylor}_{i}(g_{h})&=\prod_{j=1}^{\dim(W_{i,\ast})}\exp(Z_{i,j}^{\ast})^{z_{i,j}^{\ast}}\cdot\prod_{j=1}^{\dim(W_{i,\mathrm{Pet}})}\exp(Z_{i,j}^{\mathrm{Pet}})^{z_{i,j}^{h,\mathrm{Pet}}}\\
    &\qquad\qquad\cdot\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\exp(Z_{i,j}^{\mathrm{Lin}})^{z_{i,j}^{h,\mathrm{Lin}}}~\mathrm{mod}~G_{(i,2)}
    \end{align*}

    where

    zi,jh,Lin=γi,j+k=1dαi,j,k{βkh}z_{i,j}^{h,\mathrm{Lin}}=\gamma_{i,j}+\sum_{k=1}^{d^{\ast}}\alpha_{i,j,k}\{\beta_{k}h\}

    where d(dlog(MD/ρ))Os(1)d^{\ast}\leq(d\log(MD/\rho))^{O_{s}(1)} and βk(1/N)\beta_{k}\in(1/N^{\prime})\mathbb{Z} where NN^{\prime} is a prime in [100N,200N][100N,200N].

  • For any integers i1++ir=s1i_{1}+\cdots+i_{r^{\ast}}=s-1, suppose that vijWijv_{i_{j}}\in W_{i_{j}} for all jj. If for at least one index \ell we have viWi,Petv_{i_{\ell}}\in W_{i_{\ell},\mathrm{Pet}}, then if ww is any (r1)(r^{\ast}-1)-fold commutator of vi1,,virv_{i_{1}},\ldots,v_{i_{r^{\ast}}}, we have

    η(w)=0.\eta(w)=0.

    Furthermore, if instead for at least two indices $\ell_{1},\ell_{2}$ we have $v_{i_{\ell_{1}}}\in W_{i_{\ell_{1}},\mathrm{Lin}}$ and $v_{i_{\ell_{2}}}\in W_{i_{\ell_{2}},\mathrm{Lin}}$, then if $w$ is any $(r^{\ast}-1)$-fold commutator of $v_{i_{1}},\ldots,v_{i_{r^{\ast}}}$, we have

    η(w)=0.\eta(w)=0.

We have relabeled $H^{\prime}$ as $H$, $g_{h}^{\prime}$ as $g_{h}$, $g^{\prime}(h,n)$ as $g(h,n)$, $\chi^{\prime}$ as $\chi$, and $\chi_{h}^{\prime}$ as $\chi_{h}$. We have applied Lemma 10.1 and scaling to guarantee that $\exp(Z_{i,j}^{\ast}),\exp(Z_{i,j}^{\mathrm{Lin}}),\exp(Z_{i,j}^{\mathrm{Pet}})\in\Gamma$.
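The dependence of the "linear" Taylor coefficients on $h$ is through fractional parts, so it is bracket-linear rather than linear. A minimal sketch of evaluating $z_{i,j}^{h,\mathrm{Lin}}=\gamma_{i,j}+\sum_{k}\alpha_{i,j,k}\{\beta_{k}h\}$ exactly is given below. The numerical values, including the prime $101$ standing in for $N^{\prime}$, and the function names are assumptions made purely for illustration.

```python
from fractions import Fraction
from math import floor


def frac(x):
    """Fractional part {x}, taking values in [0, 1)."""
    return x - floor(x)


def z_lin(h, gamma, alphas, betas):
    """gamma + sum_k alpha_k * {beta_k * h}, evaluated exactly."""
    return gamma + sum(a * frac(b * h) for a, b in zip(alphas, betas))


N_PRIME = 101  # hypothetical prime playing the role of N'
gamma = Fraction(1, 2)
alphas = [Fraction(3)]
betas = [Fraction(5, N_PRIME)]  # beta_k lies in (1/N') * Z

print(z_lin(21, gamma, alphas, betas))
# The map h -> z_lin(h) - gamma is not additive in h:
print(z_lin(21, gamma, alphas, betas) - gamma,
      (z_lin(10, gamma, alphas, betas) - gamma)
      + (z_lin(11, gamma, alphas, betas) - gamma))
```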

Let 𝒳={X1,,Xdim(G)}\mathcal{X}=\{X_{1},\ldots,X_{\dim(G)}\} denote the filtered Mal’cev basis given for G/ΓG/\Gamma. Via Lemma 10.2, for hHh\in H we may define

\begin{align*}
g_{h}^{\ast}&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\ast})}\exp(Z_{i,j}^{\ast})^{z_{i,j}^{\ast}\cdot\frac{n^{i}}{i!}}\cdot\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\exp(Z_{i,j}^{\mathrm{Lin}})^{\gamma_{i,j}\cdot\frac{n^{i}}{i!}},\\
g_{h}^{\mathrm{Lin}}&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\exp(Z_{i,j}^{\mathrm{Lin}})^{(z_{i,j}^{h,\mathrm{Lin}}-\gamma_{i,j})\cdot\frac{n^{i}}{i!}},\\
g_{h}^{\mathrm{Pet}}&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Pet}})}\exp(Z_{i,j}^{\mathrm{Pet}})^{z_{i,j}^{h,\mathrm{Pet}}\cdot\frac{n^{i}}{i!}},
\end{align*}

and define ghRemg_{h}^{\mathrm{Rem}} via

gh=ghghLinghPetghRem.g_{h}=g_{h}^{\ast}\cdot g_{h}^{\mathrm{Lin}}\cdot g_{h}^{\mathrm{Pet}}\cdot g_{h}^{\mathrm{Rem}}.

Using Lemma 10.2 again, we may write

ghRem=i=1s1j=dim(G)dim(Gi,1)+1dim(G)exp(Xj)κi,jhnii!.g_{h}^{\mathrm{Rem}}=\prod_{i=1}^{s-1}\prod_{j=\dim(G)-\dim(G_{i,1})+1}^{\dim(G)}\exp(X_{j})^{\kappa_{i,j}^{h}\cdot\frac{n^{i}}{i!}}.

The fact that when applying Lemma 10.2 for ghRemg_{h}^{\mathrm{Rem}} we observe no coefficients for nii!\frac{n^{i}}{i!} corresponding to basis elements in 𝒳log(G(i,1))𝒳log(G(i,2))\mathcal{X}\cap\log(G_{(i,1)})\setminus\mathcal{X}\cap\log(G_{(i,2)}) follows from the fact that ghg_{h} and ghghLinghPetg_{h}^{\ast}\cdot g_{h}^{\mathrm{Lin}}\cdot g_{h}^{\mathrm{Pet}} have Taylor coefficients which match exactly for 1is11\leq i\leq s-1.

We now reach the first stage of “rewriting” where we realize the nilsequence χh(n)=F(gh(n)Γ)\chi_{h}(n)=F(g_{h}(n)\Gamma) on a universal nilmanifold.

10.2. Rewriting degree-rank nilsequences on the universal nilmanifold

We recall the universal nilmanifold of a given degree-rank (see [34, Definition 9.1]).

Definition 10.3.

The universal nilmanifold of degree-rank (s1,r)(s-1,r^{\ast}) and the associated discrete cocompact subgroup are defined as follows. We write GUniv=GUnivDG_{\mathrm{Univ}}=G_{\mathrm{Univ}}^{\vec{D}} where D=D+DLin+DPet\vec{D}=\vec{D}^{\ast}+\vec{D}^{\mathrm{Lin}}+\vec{D}^{\mathrm{Pet}} with D,DLin,DPet(0)s1\vec{D}^{\ast},\vec{D}^{\mathrm{Lin}},\vec{D}^{\mathrm{Pet}}\in(\mathbb{Z}_{\geq 0})^{s-1}. We specify GUnivDG_{\mathrm{Univ}}^{\vec{D}} by formal generators of the Lie algebra ei,je_{i,j} for 1is11\leq i\leq s-1 and 1jDi1\leq j\leq D_{i} where Di=Di+DiLin+DiPetD_{i}=D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}+D_{i}^{\mathrm{Pet}} with the relations:

  • Any (r1)(r-1)-fold commutator of ei1,j1,,eir,jre_{i_{1},j_{1}},\ldots,e_{i_{r},j_{r}} with i1++ir>(s1)i_{1}+\cdots+i_{r}>(s-1) vanishes;

  • Any (r1)(r-1)-fold commutator of ei1,j1,,eir,jre_{i_{1},j_{1}},\ldots,e_{i_{r},j_{r}} with i1++ir=(s1)i_{1}+\cdots+i_{r}=(s-1) and r>rr>r^{\ast} vanishes.

The associated discrete group which we will be concerned with is ΓUniv\Gamma_{\mathrm{Univ}} which is the discrete group generated by exp(ei,j)\exp(e_{i,j}) for 1is11\leq i\leq s-1 and 1jDi1\leq j\leq D_{i}.

Remark.

Note that in this definition, GUnivDG_{\mathrm{Univ}}^{\vec{D}} depends only on D\vec{D}; however, the quotient we will consider later depends on D,DLin,DPet\vec{D}^{\ast},\vec{D}^{\mathrm{Lin}},\vec{D}^{\mathrm{Pet}}. Furthermore, we have presented GUnivG_{\mathrm{Univ}} as a Lie algebra and not as a Lie group; via the general theory of nilpotent Lie algebras this is sufficient. Note that the Lie algebra defined is trivially seen to be nilpotent. By the Birkhoff Embedding Theorem (see remark following [12, Theorem 1.1.11]), we may realize any real nilpotent Lie algebra 𝔤\mathfrak{g} as a Lie subalgebra of the n×nn\times n real strictly upper triangular matrices. The proof of [12, Theorem 1.2.1] then realizes the n×nn\times n real strictly upper triangular matrices as a logarithm of a connected, simply connected Lie group NnN_{n} where the exponential map is bijective. The Baker–Campbell–Hausdorff formula then demonstrates 𝔤\mathfrak{g} is the logarithm of a connected, simply connected subgroup GNnG\leqslant N_{n} (and by construction the logarithm is a bijection between GG and 𝔤\mathfrak{g}). The group GG constructed is unique up to isomorphism by Lie’s third theorem.

We first prove the fact that GUnivG_{\mathrm{Univ}} may be given a degree-rank filtration and that GUniv/ΓUnivG_{\mathrm{Univ}}/\Gamma_{\mathrm{Univ}} has reasonable complexity.

Lemma 10.4.

Let GUniv=GUnivDG_{\mathrm{Univ}}=G_{\mathrm{Univ}}^{\vec{D}} and define (GUniv)(d,r)(G_{\mathrm{Univ}})_{(d,r)} by taking the group generated by all (r1)(r^{\prime}-1)-fold iterated commutators of exp(ti1,j1ei1,j1),,exp(tir,jreir,jr)\exp(t_{i_{1},j_{1}}e_{i_{1},j_{1}}),\ldots,\exp(t_{i_{r^{\prime}},j_{r^{\prime}}}e_{i_{r^{\prime}},j_{r^{\prime}}}) with tik,jkt_{i_{k},j_{k}}\in\mathbb{R}, and either i1++ir>di_{1}+\cdots+i_{r^{\prime}}>d or i1++ir=di_{1}+\cdots+i_{r^{\prime}}=d and rrr^{\prime}\geq r.

Then (GUniv)(d,r)(G_{\mathrm{Univ}})_{(d,r)} forms a valid degree-rank (s1,r)(s-1,r^{\ast}) filtration of GUnivG_{\mathrm{Univ}}. Furthermore the dimension of GUnivG_{\mathrm{Univ}} is bounded by Os(DOs(1))O_{s}(\lVert D\rVert_{\infty}^{O_{s}(1)}) and one may find an adapted Mal’cev basis 𝒳Univ\mathcal{X}_{\mathrm{Univ}} such that the complexity of GUniv/ΓUnivG_{\mathrm{Univ}}/\Gamma_{\mathrm{Univ}} is at most exp(DOs(1))\exp(\lVert D\rVert_{\infty}^{O_{s}(1)}).

Proof.

We will be brief with details; that the associated filtration is valid follows via a straightforward computation with Lemma 2.2. Note (GUniv)(i,0)=(GUniv)(i,1)(G_{\mathrm{Univ}})_{(i,0)}=(G_{\mathrm{Univ}})_{(i,1)} as r1r^{\prime}\geq 1 in the set of generators always. Also, (GUniv)(0,0)=(GUniv)(0,1)(G_{\mathrm{Univ}})_{(0,0)}=(G_{\mathrm{Univ}})_{(0,1)} since for all generators ei,je_{i,j} we have i1i\geq 1.

To establish the complexity bounds, the key point is noting that taking all (r1)(r^{\prime}-1)-fold iterated commutators of ei1,j1,,eir,jre_{i_{1},j_{1}},\ldots,e_{i_{r^{\prime}},j_{r^{\prime}}} with i1++irs2i_{1}+\cdots+i_{r^{\prime}}\leq s-2 or i1++ir=s1i_{1}+\cdots+i_{r^{\prime}}=s-1 and rrr^{\prime}\leq r^{\ast} gives a spanning set for log(GUniv)\log(G_{\mathrm{Univ}}). This immediately gives the specified dimension bound. These generators are not linearly independent; however, all relations are generated by either antisymmetry ([x,y]+[y,x]=0[x,y]+[y,x]=0) or the Jacobi identity ([x,[y,z]]+[y,[z,x]]+[z,[x,y]]=0[x,[y,z]]+[y,[z,x]]+[z,[x,y]]=0) applied to the set of generators specified.

To simplify matters, note that all linear relations can be reduced to those between generators of the “same type” (i.e., relations between the set of $(r^{\prime}-1)$-fold commutators of a given set of generators $e_{i_{1},j_{1}},\ldots,e_{i_{r^{\prime}},j_{r^{\prime}}}$). These can be collected into disconnected non-interacting “components” which are $O_{s}(1)$ in size. We may take a linearly independent spanning subset within each component; each generator not in this subset may be written as a linear combination of its elements with coefficients of height $O_{s}(1)$. Define $\mathcal{X}$ to be the union of all these spanning elements in $\log(G_{\mathrm{Univ}})$. This gives us a basis. Note the subspaces $\log((G_{\mathrm{Univ}})_{(d,r)})$ are clearly compatible with natural subsets of these “components” and their associated spanning sets, demonstrating that the basis is appropriately adapted to the vector spaces $\log((G_{\mathrm{Univ}})_{(d,r)})$.

The last matter to check is that there exists Cs1C_{s}\geq 1 such that Csdim(GUniv)ψexp,𝒳(ΓUniv)Cs1dim(GUniv)C_{s}\mathbb{Z}^{\dim(G_{\mathrm{Univ}})}\subseteq\psi_{\mathrm{exp},\mathcal{X}}(\Gamma_{\mathrm{Univ}})\subseteq C_{s}^{-1}\mathbb{Z}^{\dim(G_{\mathrm{Univ}})}. This follows by noting that each element γΓ\gamma\in\Gamma may be written as

γ=k=1texp(eik,jk)sk\gamma=\prod_{k=1}^{t}\exp(e_{i_{k},j_{k}})^{s_{k}}

with sks_{k}\in\mathbb{Z}. We prove the first implication first; we prove that log(γ)\log(\gamma) may be written as a linear combination of iterated commutators where (r1)(r^{\prime}-1)-fold commutators have denominator bounded by CsrC_{s}^{r^{\prime}}. This is trivial to prove inductively via Baker–Campbell–Hausdorff and noting that all ss-fold commutators vanish.

For the reverse direction, consider expressions of the form

γ=exp(αcαeα)\gamma^{\prime}=\exp\bigg{(}\sum_{\alpha}c_{\alpha}e_{\alpha}\bigg{)}

where eαe_{\alpha} ranges over all possible iterated commutators (here e.g. e[(1,2),(1,3)]:=[e(1,2),e(1,3)]e_{[(1,2),(1,3)]}:=[e_{(1,2)},e_{(1,3)}]) where cαc_{\alpha} are sufficiently divisible integers. Let fαf_{\alpha} be defined as the commutator of the exponential of associated elements; e.g. f[(1,2),(1,3)]=[exp(e(1,2)),exp(e(1,3))]f_{[(1,2),(1,3)]}=[\exp(e_{(1,2)}),\exp(e_{(1,3)})]. Choose a generator α\alpha^{\prime} with the fewest number of commutators in γ\gamma^{\prime} such that cα0c_{\alpha^{\prime}}\neq 0. It is straightforward to see via Baker–Campbell–Hausdorff that there is an integer MsM_{s} such that if cαc_{\alpha} are all divisible by MsM_{s} then

fαcαγ=exp(αcαeα)f_{\alpha^{\prime}}^{-c_{\alpha^{\prime}}}\gamma^{\prime}=\exp\bigg{(}\sum_{\alpha}c_{\alpha}^{\ast}e_{\alpha}\bigg{)}

has each cαc_{\alpha}^{\ast} still divisible by MsM_{s} and cα=0c_{\alpha^{\prime}}^{\ast}=0 (without introducing backwards corrections).

The desired result then follows from [42, Lemma B.11], noting that (GUniv)(d,r)(G_{\mathrm{Univ}})_{(d,r)} is the degree-rank ordering forming a nested sequence of subgroups. ∎

We now represent the nilsequences χh(n)=F(gh(n))\chi_{h}(n)=F(g_{h}(n)) on the universal nilmanifold. We define

Di\displaystyle D_{i}^{\ast} =dim(Wi,)+dim(Wi,Lin),DiPet=dim(Wi,Pet)+dim(G(i,2)),DiLin=ddim(Wi,Lin).\displaystyle=\dim(W_{i,\ast})+\dim(W_{i,\mathrm{Lin}}),\quad D_{i}^{\mathrm{Pet}}=\dim(W_{i,\mathrm{Pet}})+\dim(G_{(i,2)}),\quad D_{i}^{\mathrm{Lin}}=d^{\ast}\dim(W_{i,\mathrm{Lin}}).

Note that $D_{i}^{\mathrm{Lin}}\leq(d\log(MD/\rho))^{O_{s}(1)}$ and trivially $D_{i}^{\ast},D_{i}^{\mathrm{Pet}}\leq d$.

Recall that $\mathcal{X}=\{X_{1},\ldots,X_{\dim(G)}\}$ is the filtered Mal'cev basis and $Z_{i,j}^{\ast}$, $Z_{i,j}^{\mathrm{Lin}}$, $Z_{i,j}^{\mathrm{Pet}}$ are representatives of $\log(W_{i,\ast})$, $\log(W_{i,\mathrm{Lin}})$, and $\log(W_{i,\mathrm{Pet}})$ respectively.

We define a homomorphism ϕ:GUnivG\phi\colon G_{\mathrm{Univ}}\to G by defining the map on generators. Define

\[
\phi(\exp(e_{i,j}))=\begin{cases}
\exp(Z_{i,j}^{\ast})&\text{ if }1\leq j\leq\dim(W_{i,\ast}),\\
\exp(Z_{i,j-\dim(W_{i,\ast})}^{\mathrm{Lin}})&\text{ if }\dim(W_{i,\ast})+1\leq j\leq\dim(W_{i,\ast})+\dim(W_{i,\mathrm{Lin}})=D_{i}^{\ast},\\
\exp(Z_{i,\ell}^{\mathrm{Lin}})&\text{ if }1+(\ell-1)d^{\ast}\leq j-D_{i}^{\ast}\leq\ell d^{\ast}\text{ for }1\leq\ell\leq\dim(W_{i,\mathrm{Lin}}),\\
\exp(Z_{i,j-D_{i}^{\ast}-D_{i}^{\mathrm{Lin}}}^{\mathrm{Pet}})&\text{ if }D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}+1\leq j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}+\dim(W_{i,\mathrm{Pet}}),\\
\exp(X_{j-D_{i}+\dim(G)})&\text{ if }D_{i}-\dim(G_{(i,2)})+1\leq j\leq D_{i}.
\end{cases}
\]

That this is a homomorphism is an immediate consequence of the fact that the only relations imposed on $G_{\mathrm{Univ}}$ also hold in the group $G$, since $G$ has degree-rank $(s-1,r^{\ast})$.

The function with which we will be concerned is
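The index bookkeeping in the definition of $\phi$ is easy to get wrong, so the following minimal sketch spells out, for a fixed level $i$, which block a generator index $j$ falls into. The function name and the tuple encoding of the output are hypothetical conventions chosen for this sketch; the block sizes are exactly $D_{i}^{\ast}$, $D_{i}^{\mathrm{Lin}}$, $D_{i}^{\mathrm{Pet}}$ as defined above.

```python
def phi_target(j, dim_w_ast, dim_w_lin, dim_w_pet, dim_g_i2, d_star, dim_g):
    """Return (family, index) naming the image of exp(e_{i,j}) under phi.

    All arguments are level-i quantities from the text; the output
    encoding ("Z*", "ZLin", "ZPet", "X") is a convention of this sketch.
    """
    d_ast = dim_w_ast + dim_w_lin        # D_i^*
    d_lin = d_star * dim_w_lin           # D_i^Lin
    d_pet = dim_w_pet + dim_g_i2         # D_i^Pet
    d_tot = d_ast + d_lin + d_pet        # D_i
    if not 1 <= j <= d_tot:
        raise ValueError("j must lie in [1, D_i]")
    if j <= dim_w_ast:
        return ("Z*", j)
    if j <= d_ast:
        return ("ZLin", j - dim_w_ast)
    if j <= d_ast + d_lin:
        # l is determined by 1 + (l - 1) * d_star <= j - D_i^* <= l * d_star
        return ("ZLin", (j - d_ast - 1) // d_star + 1)
    if j <= d_ast + d_lin + dim_w_pet:
        return ("ZPet", j - d_ast - d_lin)
    return ("X", j - d_tot + dim_g)


# Hypothetical sizes: dim W_* = 2, dim W_Lin = 2, dim W_Pet = 1,
# dim G_(i,2) = 2, d^* = 3, dim G = 10, so D_i = 4 + 6 + 3 = 13.
for j in range(1, 14):
    print(j, phi_target(j, 2, 2, 1, 2, 3, 10))
```

Each $Z_{i,\ell}^{\mathrm{Lin}}$ is thus hit once from the first $D_{i}^{\ast}$ generators (carrying the constant term $\gamma_{i,\ell}$) and $d^{\ast}$ further times from the linear block (one copy per frequency $\beta_{k}$).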

F~(gΓUniv):=F(ϕ(g)Γ).\widetilde{F}(g\Gamma_{\mathrm{Univ}}):=F(\phi(g)\Gamma).

This is well-defined since ϕ(ΓUniv)Γ\phi(\Gamma_{\mathrm{Univ}})\leqslant\Gamma; it suffices to check that the generators exp(ei,j)\exp(e_{i,j}) map to within Γ\Gamma but this is trivial by construction. (This is precisely why we scaled Zi,Z_{i,\cdot}^{\ast}, Zi,LinZ_{i,\cdot}^{\mathrm{Lin}}, and Zi,PetZ_{i,\cdot}^{\mathrm{Pet}} so that when exponentiated they live within Γ\Gamma.)

We now note a series of basic properties of F~\widetilde{F} and the homomorphism ϕ\phi.

Lemma 10.5.

Given the above setup we have:

  • F~2=1\lVert\widetilde{F}\rVert_{2}=1 for all gGUnivg\in G_{\mathrm{Univ}};

  • $\widetilde{F}$ has a vertical frequency $\eta_{\mathrm{Univ}}$ with height at most $(MD/\rho)^{O_{s}(\dim(G_{\mathrm{Univ}})^{O_{s}(1)})}$;

  • $\widetilde{F}$ is $(MD/\rho)^{O_{s}(\dim(G_{\mathrm{Univ}})^{O_{s}(1)})}$-Lipschitz;

  • Consider $e_{i_{1},j_{1}},\ldots,e_{i_{r^{\ast}},j_{r^{\ast}}}$ with $i_{1}+\cdots+i_{r^{\ast}}=s-1$. If for at least one index $\ell$ we have $j_{\ell}>D_{i_{\ell}}^{\ast}+D_{i_{\ell}}^{\mathrm{Lin}}$, then

    \[\eta_{\mathrm{Univ}}([\exp(e_{i_{1},j_{1}}),\ldots,\exp(e_{i_{r^{\ast}},j_{r^{\ast}}})])=0.\]

    Furthermore, if instead for two indices 1,2\ell_{1},\ell_{2} we have j1>Di1j_{\ell_{1}}>D_{i_{\ell_{1}}}^{\ast} and j2>Di2j_{\ell_{2}}>D_{i_{\ell_{2}}}^{\ast} then

    ηUniv([exp(ei1,j1),,exp(eir,jr)])=0.\eta_{\mathrm{Univ}}([\exp(e_{i_{1},j_{1}}),\ldots,\exp(e_{i_{r^{\ast}},j_{r^{\ast}}})])=0.
Proof.

The first property is trivial. For the second property, note that ϕ\phi is an II-filtered homomorphism (e.g. ϕ((GUniv)(s,r1))G(s,r1)\phi((G_{\mathrm{Univ}})_{(s,r^{\ast}-1)})\leqslant G_{(s,r^{\ast}-1)}). Thus given gGUnivg\in G_{\mathrm{Univ}}, g(GUniv)(s,r1)g^{\prime}\in(G_{\mathrm{Univ}})_{(s,r^{\ast}-1)} we have

F~(ggΓUniv)=F(ϕ(g)ϕ(g)Γ)=e(η(ϕ(g)))F(ϕ(g)Γ)\widetilde{F}(gg^{\prime}\Gamma_{\mathrm{Univ}})=F(\phi(g)\phi(g^{\prime})\Gamma)=e(\eta(\phi(g^{\prime})))F(\phi(g)\Gamma)

and therefore we may set ηUniv=ηϕ\eta_{\mathrm{Univ}}=\eta\circ\phi. To check the complexity of ηUniv\eta_{\mathrm{Univ}} it suffices to check the magnitude of ηUniv\eta_{\mathrm{Univ}} on [exp(ei1,j1),,exp(eir,jr)][\exp(e_{i_{1},j_{1}}),\ldots,\exp(e_{i_{r^{\ast}},j_{r^{\ast}}})] where we use Remark 5.3 to convert between this notion and the notion of height defined. The resulting magnitude is bounded because Zi,jZ_{i,j}^{\ast}, Zi,jLinZ_{i,j}^{\mathrm{Lin}}, Zi,jPetZ_{i,j}^{\mathrm{Pet}} are appropriately bounded integral combinations of elements in 𝒳\mathcal{X} which itself has bounded complexity.

We omit a careful justification that $\widetilde{F}$ has an appropriately bounded Lipschitz constant. The crucial point is that the Mal'cev basis constructed in Lemma 10.4 is made up of appropriately bounded linear combinations of commutators of $e_{i_{1},j_{1}},\ldots,e_{i_{r},j_{r}}$, and each such commutator is seen to map to a bounded element of $G$ since $\phi$ maps each generator to a bounded element.

The final property is an immediate consequence of the properties of Wi,LinW_{i,\mathrm{Lin}}, Wi,PetW_{i,\mathrm{Pet}}, and Wi,W_{i,\ast} established in Lemma 9.1 and recorded above. The additional generators which are lifted to the “petal” position on the ii-th level come from G(i,2)G_{(i,2)} and otherwise we have only artificially placed certain elements in the “linear” class upward to the “\ast” class. (These will correspond to the constant terms in the linear part of the nilsequences.) ∎

We now lift the polynomial sequences in question to the universal nilmanifold. We define:

\begin{align*}
g_{h}^{\ast,\mathrm{Univ}}(n)&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\ast})}\exp(e_{i,j})^{z_{i,j}^{\ast}\cdot\frac{n^{i}}{i!}}\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\exp(e_{i,j+\dim(W_{i,\ast})})^{\gamma_{i,j}\cdot\frac{n^{i}}{i!}},\\
g_{h}^{\mathrm{Lin},\mathrm{Univ}}(n)&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\prod_{k=1}^{d^{\ast}}\exp(e_{i,D_{i}^{\ast}+(j-1)d^{\ast}+k})^{\alpha_{i,j,k}\{\beta_{k}h\}\cdot\frac{n^{i}}{i!}},\\
g_{h}^{\mathrm{Pet},\mathrm{Univ}}(n)&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Pet}})}\exp(e_{i,j+D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}})^{z_{i,j}^{h,\mathrm{Pet}}\cdot\frac{n^{i}}{i!}},\\
g_{h}^{\mathrm{Rem},\mathrm{Univ}}(n)&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(G_{(i,2)})}\exp(e_{i,j+D_{i}-\dim(G_{(i,2)})})^{\kappa_{i,j}^{h}\cdot\frac{n^{i}}{i!}}.
\end{align*}

We define

ghUniv:=gh,UnivghLin,UnivghPet,UnivghRem,Univ.g_{h}^{\mathrm{Univ}}:=g_{h}^{\ast,\mathrm{Univ}}\cdot g_{h}^{\mathrm{Lin},\mathrm{Univ}}\cdot g_{h}^{\mathrm{Pet},\mathrm{Univ}}\cdot g_{h}^{\mathrm{Rem},\mathrm{Univ}}.

The key claim, which is trivial by construction, is the following equality.

Claim 10.6.

Given the above setup, we have

F~(ghUniv(n)ΓUniv)=F(gh(n)Γ)=χh(n).\widetilde{F}(g_{h}^{\mathrm{Univ}}(n)\Gamma_{\mathrm{Univ}})=F(g_{h}(n)\Gamma)=\chi_{h}(n).
Proof.

The final equality is by definition of χh(n)\chi_{h}(n). The first equality follows by checking that ϕ(gh,Univ)=gh,ϕ(ghLin,Univ)=ghLin,ϕ(ghPet,Univ)=ghPet, and ϕ(ghRem,Univ)=ghRem\phi(g_{h}^{\ast,\mathrm{Univ}})=g_{h}^{\ast},~{}\phi(g_{h}^{\mathrm{Lin},\mathrm{Univ}})=g_{h}^{\mathrm{Lin}},~{}\phi(g_{h}^{\mathrm{Pet},\mathrm{Univ}})=g_{h}^{\mathrm{Pet}},~{}\text{ and }\phi(g_{h}^{\mathrm{Rem},\mathrm{Univ}})=g_{h}^{\mathrm{Rem}} by construction. Therefore since ϕ\phi is a homomorphism we conclude that ϕ(ghUniv)=gh\phi(g_{h}^{\mathrm{Univ}})=g_{h}. ∎

Note that at this stage we have simply replaced the group $G$ in our correlation structure with $G_{\mathrm{Univ}}$, at the cost of replacing $d$ by $\dim(G_{\mathrm{Univ}})=d^{O_{s}(1)}\log(MD\rho^{-1})^{O_{s}(1)}$ and $M$ by $\exp(d^{O_{s}(1)}\log(MD\rho^{-1})^{O_{s}(1)})$.

This may seem as if we have gone backwards. The key point is that in Lemma 10.5 we have encoded various “vanishing conditions” on the commutator brackets at the level of the generators of the group. This will allow us to translate the “vanishing conditions” obtained in Lemma 9.1 into the statement that, up to a degree-rank $(s-1,r^{\ast}-1)$ error term, we may pass to a quotient nilmanifold.

10.3. Passing to a quotient nilmanifold

We now construct two additional nilmanifolds; these are essentially the quotients $G^{\ast}$ and $\widetilde{G}$ constructed in [34, Section 12]. (Footnote 5: There is a minor issue in [34, p. 1309] when defining $G^{\ast}$; we follow the definitions given in the erratum [31].)

Definition 10.7.

We define GRel=GRelD,DLin,DPetG_{\mathrm{Rel}}=G_{\mathrm{Rel}}^{\vec{D}^{\ast},\vec{D}^{\mathrm{Lin}},\vec{D}^{\mathrm{Pet}}} as the Lie subgroup of GUnivG_{\mathrm{Univ}} where log(GRel)\log(G_{\mathrm{Rel}}) is spanned by:

  • Any (r1)(r-1)-fold commutator of ei1,j1,,eir,jre_{i_{1},j_{1}},\ldots,e_{i_{r},j_{r}} with at least one index \ell such that j>Di+DiLinj_{\ell}>D^{\ast}_{i_{\ell}}+D^{\mathrm{Lin}}_{i_{\ell}};

  • Any (r1)(r-1)-fold commutator of ei1,j1,,eir,jre_{i_{1},j_{1}},\ldots,e_{i_{r},j_{r}} with j>Dij_{\ell}>D^{\ast}_{i_{\ell}} for at least two distinct indices \ell.

We then define GQuotG_{\mathrm{Quot}} as GQuot:=GUniv/GRelG_{\mathrm{Quot}}:=G_{\mathrm{Univ}}/G_{\mathrm{Rel}} and ΓQuot=ΓUniv/(ΓUnivGRel)\Gamma_{\mathrm{Quot}}=\Gamma_{\mathrm{Univ}}/(\Gamma_{\mathrm{Univ}}\cap G_{\mathrm{Rel}}).
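The two spanning conditions for $\log(G_{\mathrm{Rel}})$ depend only on the index pairs $(i_{\ell},j_{\ell})$ of the generators entering a commutator. A minimal sketch of that membership test follows; the function name and the dictionary encoding of $D_{i}^{\ast}$ and $D_{i}^{\mathrm{Lin}}$ are hypothetical, and the test only decides whether a given iterated commutator is one of the listed spanning elements, not whether an arbitrary Lie algebra element lies in $\log(G_{\mathrm{Rel}})$.

```python
def in_rel_spanning_set(gens, d_ast, d_lin):
    """Decide whether a commutator of the generators e_{i,j} listed in
    `gens` (a list of (i, j) pairs) is one of the spanning elements of
    log(G_Rel).

    d_ast[i] and d_lin[i] hold D_i^* and D_i^Lin respectively.
    """
    # Generators past the "*" and "linear" blocks are the petal generators.
    petal = sum(1 for i, j in gens if j > d_ast[i] + d_lin[i])
    # Generators past the "*" block are linear or petal generators.
    beyond_ast = sum(1 for i, j in gens if j > d_ast[i])
    return petal >= 1 or beyond_ast >= 2


# Hypothetical sizes: D_1^* = 2, D_1^Lin = 3, D_2^* = 1, D_2^Lin = 0.
d_ast = {1: 2, 2: 1}
d_lin = {1: 3, 2: 0}
print(in_rel_spanning_set([(1, 6)], d_ast, d_lin))          # single petal generator
print(in_rel_spanning_set([(1, 3)], d_ast, d_lin))          # single linear generator
print(in_rel_spanning_set([(1, 3), (1, 4)], d_ast, d_lin))  # two linear generators
```

Since bracketing with a further generator can only increase both counts, the spanning set is stable under taking commutators with arbitrary generators.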

Remark 10.8.

Note that we may set r=1r=1 in the definition of GRelG_{\mathrm{Rel}}; in particular exp(ei,j)GRel\exp(e_{i,j})\in G_{\mathrm{Rel}} for j>Di+DiLinj>D^{\ast}_{i}+D^{\mathrm{Lin}}_{i}. Additionally, log(GQuot)\log(G_{\mathrm{Quot}}) may be realized as the following. Consider formal generators of a Lie algebra, e~i,j\widetilde{e}_{i,j} for 1jDi+DiLin1\leq j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}, with the property that:

  • Any (r1)(r-1)-fold commutator of e~i1,j1,,e~ir,jr\widetilde{e}_{i_{1},j_{1}},\ldots,\widetilde{e}_{i_{r},j_{r}} with either i1++ir>(s1)i_{1}+\cdots+i_{r}>(s-1) or i1++ir=(s1)i_{1}+\cdots+i_{r}=(s-1) and r>rr>r^{\ast} vanishes;

  • Any $(r-1)$-fold commutator of $\widetilde{e}_{i_{1},j_{1}},\ldots,\widetilde{e}_{i_{r},j_{r}}$ with $j_{\ell}>D^{\ast}_{i_{\ell}}$ for at least two distinct indices $\ell$ vanishes.

This realization is given by taking e~i,j:=log(exp(ei,j)modGRel)\widetilde{e}_{i,j}:=\log(\exp(e_{i,j})~{}\mathrm{mod}~{}G_{\mathrm{Rel}}).

We first check that GQuotG_{\mathrm{Quot}} is well-defined.

Claim 10.9.

For D,DLin,DPet(0)s1\vec{D}^{\ast},\vec{D}^{\mathrm{Lin}},\vec{D}^{\mathrm{Pet}}\in(\mathbb{Z}_{\geq 0})^{s-1}, GRelG_{\mathrm{Rel}} is a well-defined normal subgroup of GUnivG_{\mathrm{Univ}}.

Proof.

It is clear from definition that log(GRel)\log(G_{\mathrm{Rel}}) is closed under brackets, so forms a Lie subalgebra within log(GUniv)\log(G_{\mathrm{Univ}}). Thus GRelG_{\mathrm{Rel}} is indeed a Lie subgroup. To prove that GRelG_{\mathrm{Rel}} is normal it suffices to prove that it is furthermore a Lie algebra ideal, i.e., [log(GUniv),log(GRel)]log(GRel)[\log(G_{\mathrm{Univ}}),\log(G_{\mathrm{Rel}})]\leqslant\log(G_{\mathrm{Rel}}).

Recall that $\log(G_{\mathrm{Univ}})$ is spanned by all the $(r-1)$-fold commutators of $e_{i_{1},j_{1}},\ldots,e_{i_{r},j_{r}}$ (although as discussed in Lemma 10.4 this is not a basis). It suffices to check the containment at the level of generators of the respective Lie algebras. The result then follows since taking a commutator does not decrease the number of “petal” or “linear” generators. ∎

We also have the following complexity bound on $G_{\mathrm{Quot}}$. This may be proved via the Lie algebra presentation given in Remark 10.8 and repeating the proof of Lemma 10.4, or by noting that $G_{\mathrm{Rel}}$ is a sufficiently rational subgroup of $G_{\mathrm{Univ}}$ and applying Lemma 3.10. We omit the details.

Lemma 10.10.

Given the above setup, let GQuot=GQuotDG_{\mathrm{Quot}}=G_{\mathrm{Quot}}^{\vec{D}} and note that GQuotG_{\mathrm{Quot}} has a degree-rank (s1,r)(s-1,r^{\ast}) filtration given by

(GQuot)(d,r)=(GUniv)(d,r)/((GUniv)(d,r)GRel).(G_{\mathrm{Quot}})_{(d,r)}=(G_{\mathrm{Univ}})_{(d,r)}/((G_{\mathrm{Univ}})_{(d,r)}\cap G_{\mathrm{Rel}}).

Furthermore the dimension of $G_{\mathrm{Quot}}$ is bounded by $O_{s}(\lVert D\rVert_{\infty}^{O_{s}(1)})$ and one may find an adapted Mal'cev basis $\mathcal{X}_{\mathrm{Quot}}$ such that the complexity of $G_{\mathrm{Quot}}/\Gamma_{\mathrm{Quot}}$ is at most $\exp(\lVert D\rVert_{\infty}^{O_{s}(1)})$.

A key point in this analysis is that this quotient is compatible with ηUniv\eta_{\mathrm{Univ}}.

Lemma 10.11.

Given the above setup, define ηQuot:(GQuot)(s1,r)\eta_{\mathrm{Quot}}\colon(G_{\mathrm{Quot}})_{(s-1,r^{\ast})}\to\mathbb{R} via

ηQuot(gmodGRel):=ηUniv(g)\eta_{\mathrm{Quot}}(g~{}\mathrm{mod}~{}G_{\mathrm{Rel}}):=\eta_{\mathrm{Univ}}(g)

for all g(GUniv)(s1,r)g\in(G_{\mathrm{Univ}})_{(s-1,r^{\ast})}. The map ηQuot\eta_{\mathrm{Quot}} is well-defined and in fact is a vertical character of GQuotG_{\mathrm{Quot}} of height at most (MD/ρ)Os(dim(GUniv)Os(1))(MD/\rho)^{O_{s}(\dim(G_{\mathrm{Univ}})^{O_{s}(1)})}.

Proof.

To be well-defined as a map, it suffices to show that $G_{\mathrm{Rel}}\cap(G_{\mathrm{Univ}})_{(s-1,r^{\ast})}\leqslant\operatorname{ker}(\eta_{\mathrm{Univ}})$. This comes exactly from the final item of Lemma 10.5. That $\eta_{\mathrm{Quot}}$ is a vertical character then follows as $\Gamma_{\mathrm{Quot}}=\Gamma_{\mathrm{Univ}}/(\Gamma_{\mathrm{Univ}}\cap G_{\mathrm{Rel}})$.

To bound the height of ηQuot\eta_{\mathrm{Quot}} note that taking a quotient by GRelG_{\mathrm{Rel}} maps exp(ei,j)\exp(e_{i,j}) to exp(e~i,j)\exp(\widetilde{e}_{i,j}) in the sense of Remark 10.8. Furthermore the construction of 𝒳Quot\mathcal{X}_{\mathrm{Quot}} has the property that 𝒳Quotlog((GQuot)(s1,r))\mathcal{X}_{\mathrm{Quot}}\cap\log((G_{\mathrm{Quot}})_{(s-1,r^{\ast})}) are sufficiently rational combinations of (r1)(r^{\ast}-1)-fold commutators of e~i1,j1,,e~ir,jr\widetilde{e}_{i_{1},j_{1}},\ldots,\widetilde{e}_{i_{r^{\ast}},j_{r^{\ast}}} with i1++ir=s1i_{1}+\cdots+i_{r^{\ast}}=s-1. By Baker–Campbell–Hausdorff, we have that the (r1)(r^{\ast}-1)-fold commutator of exp(e~i1,j1),,exp(e~ir,jr)\exp(\widetilde{e}_{i_{1},j_{1}}),\ldots,\exp(\widetilde{e}_{i_{r^{\ast}},j_{r^{\ast}}}) is the same modGRel~{}\mathrm{mod}~{}G_{\mathrm{Rel}} as the corresponding one for exp(ei1,j1),,exp(eir,jr)\exp(e_{i_{1},j_{1}}),\ldots,\exp(e_{i_{r^{\ast}},j_{r^{\ast}}}). However, ηUniv\eta_{\mathrm{Univ}} maps the latter commutator to a sufficiently bounded integer by the complexity bound on ηUniv\eta_{\mathrm{Univ}} and the result follows. ∎

We will require

\begin{equation}\tag{10.1}
\begin{split}
g_{h}^{\ast,\mathrm{Quot}}(n)&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\ast})}\exp(\widetilde{e}_{i,j})^{z_{i,j}^{\ast}\cdot\frac{n^{i}}{i!}}\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\exp(\widetilde{e}_{i,j+\dim(W_{i,\ast})})^{\gamma_{i,j}\cdot\frac{n^{i}}{i!}}\\
g_{h}^{\mathrm{Lin},\mathrm{Quot}}(n)&=\prod_{i=1}^{s-1}\prod_{j=1}^{\dim(W_{i,\mathrm{Lin}})}\prod_{k=1}^{d^{\ast}}\exp(\widetilde{e}_{i,D_{i}^{\ast}+(j-1)d^{\ast}+k})^{\alpha_{i,j,k}\{\beta_{k}h\}\cdot\frac{n^{i}}{i!}};
\end{split}
\end{equation}

note that

gh,Quot=gh,UnivmodGRel,ghLin,Quot=ghLin,UnivmodGRel.g_{h}^{\ast,\mathrm{Quot}}=g_{h}^{\ast,\mathrm{Univ}}~{}\mathrm{mod}~{}G_{\mathrm{Rel}},\qquad g_{h}^{\mathrm{Lin},\mathrm{Quot}}=g_{h}^{\mathrm{Lin},\mathrm{Univ}}~{}\mathrm{mod}~{}G_{\mathrm{Rel}}.

Furthermore we have

\[g_{h}^{\mathrm{Pet},\mathrm{Univ}}~\mathrm{mod}~G_{\mathrm{Rel}}=g_{h}^{\mathrm{Rem},\mathrm{Univ}}~\mathrm{mod}~G_{\mathrm{Rel}}=\mathrm{id}_{G_{\mathrm{Quot}}}\]

pointwise. Finally we define

ghQuot:=gh,QuotghLin,Quot.g_{h}^{\mathrm{Quot}}:=g_{h}^{\ast,\mathrm{Quot}}\cdot g_{h}^{\mathrm{Lin},\mathrm{Quot}}.

For the remainder of this section and Section 11, fix a nilcharacter $F^{\ast}$ on $G_{\mathrm{Quot}}$ with a $G_{(s-1,r^{\ast})}$-vertical frequency $\eta_{\mathrm{Quot}}$. Furthermore, by Lemma B.4 (see Footnote 6) we may take $F^{\ast}$ which is $(MD/\rho)^{O_{s}(\dim(G_{\mathrm{Univ}})^{O_{s}(1)})}$-Lipschitz with output dimension bounded by $2^{O_{s}(\dim(G_{\mathrm{Univ}}))}$.

Footnote 6. The lemma is stated for degree filtrations. However, one can give $G_{\mathrm{Quot}}$ the degree filtration

\[(G_{\mathrm{Quot}})_{(0,0)}=(G_{\mathrm{Quot}})_{(1,0)}\geqslant(G_{\mathrm{Quot}})_{(2,0)}\geqslant\cdots\geqslant(G_{\mathrm{Quot}})_{(s-1,0)}\geqslant(G_{\mathrm{Quot}})_{(s-1,r^{\ast})}\geqslant\mathrm{Id}_{G_{\mathrm{Quot}}};\]

a vertical nilcharacter with respect to this filtration is a vertical nilcharacter with respect to the original degree-rank filtration. $\mathcal{X}_{\mathrm{Quot}}$ is adapted to this degree filtration (as it is adapted to the original degree-rank filtration).

The reason it will be sufficient to study $F^{\ast}(g_{h}^{\mathrm{Quot}}\Gamma_{\mathrm{Quot}})$ is the following lemma, which proves that it is equal to $\widetilde{F}(g_{h}^{\mathrm{Univ}}\Gamma_{\mathrm{Univ}})$ up to a term which is lower-order in degree-rank.

Lemma 10.12.

Given the above setup, let

GUniv:={(g,gmodGRel)GUniv×GQuot:gGUniv}G_{\mathrm{Univ}}^{\triangle}:=\{(g,g~{}\mathrm{mod}~{}G_{\mathrm{Rel}})\in G_{\mathrm{Univ}}\times G_{\mathrm{Quot}}\colon g\in G_{\mathrm{Univ}}\}

which is given the degree-rank filtration

(GUniv)(d,r):={(g,gmodGRel)(GUniv)(d,r)×(GQuot)(d,r):g(GUniv)(d,r)}.(G_{\mathrm{Univ}}^{\triangle})_{(d,r)}:=\{(g,g~{}\mathrm{mod}~{}G_{\mathrm{Rel}})\in(G_{\mathrm{Univ}})_{(d,r)}\times(G_{\mathrm{Quot}})_{(d,r)}\colon g\in(G_{\mathrm{Univ}})_{(d,r)}\}.

Define ΓUniv=GUniv(ΓUniv×ΓQuot)\Gamma_{\mathrm{Univ}}^{\triangle}=G_{\mathrm{Univ}}^{\triangle}\cap(\Gamma_{\mathrm{Univ}}\times\Gamma_{\mathrm{Quot}}). We have:

  • (ghUniv,ghQuot)(g_{h}^{\mathrm{Univ}},g_{h}^{\mathrm{Quot}}) is a polynomial sequence on GUnivG_{\mathrm{Univ}}^{\triangle} with respect to the given degree-rank filtration;

  • The function

    (g,g)F~(gΓUniv)F¯(gΓQuot)(g,g^{\prime})\mapsto\widetilde{F}(g\Gamma_{\mathrm{Univ}})\otimes\overline{F^{\ast}}(g^{\prime}\Gamma_{\mathrm{Quot}})

    for (g,g)GUniv(g,g^{\prime})\in G_{\mathrm{Univ}}^{\triangle} is (GUniv)(s1,r)(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}-invariant;

  • GUnivG_{\mathrm{Univ}}^{\triangle} has complexity bounded by (MD/ρ)Os(dim(GUniv)Os(1))(MD/\rho)^{O_{s}(\dim(G_{\mathrm{Univ}})^{O_{s}(1)})};

  • Each coordinate of F~(gΓUniv)F¯(gΓQuot)\widetilde{F}(g\Gamma_{\mathrm{Univ}})\otimes\overline{F^{\ast}}(g^{\prime}\Gamma_{\mathrm{Quot}}) is (MD/ρ)Os(dim(GUniv)Os(1))(MD/\rho)^{O_{s}(\dim(G_{\mathrm{Univ}})^{O_{s}(1)})}-Lipschitz.

Remark 10.13.

The second item implies that F~(gΓUniv)F¯(gΓQuot)\widetilde{F}(g\Gamma_{\mathrm{Univ}})\otimes\overline{F^{\ast}}(g^{\prime}\Gamma_{\mathrm{Quot}}) is (GUniv)(s1,r)(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}-invariant and thus can be realized on a degree-rank (s1,r1)(s-1,r^{\ast}-1) nilmanifold GUniv/(GUniv)(s1,r)G_{\mathrm{Univ}}^{\triangle}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})} with ΓUniv/(ΓUniv(GUniv)(s1,r))\Gamma_{\mathrm{Univ}}^{\triangle}/(\Gamma_{\mathrm{Univ}}^{\triangle}\cap(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}) being the lattice.

Proof.

It is trivial to verify that the degree-rank filtration on GUnivG_{\mathrm{Univ}}^{\triangle} is valid. Noting that

{(Xi,Ximodlog(GRel)):Xi𝒳Univ}\{(X_{i},X_{i}~{}\mathrm{mod}~{}\log(G_{\mathrm{Rel}}))\colon X_{i}\in\mathcal{X}_{\mathrm{Univ}}\}

is a valid Mal’cev basis for GUnivG_{\mathrm{Univ}}^{\triangle} bounds the complexity of GUnivG_{\mathrm{Univ}}^{\triangle}. The complexity bounds on F~(gΓUniv)F¯(gΓQuot)\widetilde{F}(g\Gamma_{\mathrm{Univ}})\otimes\overline{F^{\ast}}(g^{\prime}\Gamma_{\mathrm{Quot}}) follow by noting that F~\widetilde{F} is appropriately Lipschitz on GUniv/ΓUnivG_{\mathrm{Univ}}/\Gamma_{\mathrm{Univ}} and similar for F¯\overline{F^{\ast}}. For FF^{\ast}, we note that each coordinate of {Ximodlog(GRel)}\{X_{i}~{}\mathrm{mod}~{}\log(G_{\mathrm{Rel}})\} is appropriately rational with respect to the Mal’cev basis for 𝒳Quot\mathcal{X}_{\mathrm{Quot}}, by construction.

Furthermore for (h,hmodGRel)(GUniv)(s1,r)(h,h~{}\mathrm{mod}~{}G_{\mathrm{Rel}})\in(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})} we have

F~(ghΓUniv)F¯(g(hmodGRel)ΓQuot)\displaystyle\widetilde{F}(gh\Gamma_{\mathrm{Univ}})\otimes\overline{F^{\ast}}(g^{\prime}(h~{}\mathrm{mod}~{}G_{\mathrm{Rel}})\Gamma_{\mathrm{Quot}})
=F~(gΓUniv)F¯(gΓQuot)e(ηUniv(h))e(ηQuot(hmodGRel))¯\displaystyle\qquad=\widetilde{F}(g\Gamma_{\mathrm{Univ}})\otimes\overline{F^{\ast}}(g^{\prime}\Gamma_{\mathrm{Quot}})\cdot e(\eta_{\mathrm{Univ}}(h))\overline{e(\eta_{\mathrm{Quot}}(h~{}\mathrm{mod}~{}G_{\mathrm{Rel}}))}
=F~(gΓUniv)F¯(gΓQuot)\displaystyle\qquad=\widetilde{F}(g\Gamma_{\mathrm{Univ}})\otimes\overline{F^{\ast}}(g^{\prime}\Gamma_{\mathrm{Quot}})

where in the final line we have used the definition of ηQuot\eta_{\mathrm{Quot}}.

Finally, to verify that $(g_{h}^{\mathrm{Univ}},g_{h}^{\mathrm{Quot}})$ is a polynomial sequence with respect to this degree-rank filtration, note via Taylor expansion (e.g. [34, Lemma B.9]) that all polynomial sequences $h$ with respect to $G_{\mathrm{Univ}}^{\triangle}$ are of the form $(h^{\prime},h^{\prime}~\mathrm{mod}~G_{\mathrm{Rel}})$ where $h^{\prime}$ is a polynomial sequence with respect to $G_{\mathrm{Univ}}$ (and its specified degree-rank filtration). The result then follows due to the property

ghUnivmodGRel=ghQuotg_{h}^{\mathrm{Univ}}~{}\mathrm{mod}~{}G_{\mathrm{Rel}}=g_{h}^{\mathrm{Quot}}

noted above, which was by construction. ∎

11. Extracting a (1,s1)(1,s-1)-nilsequence

The goal of this section is to realize

F(ghQuot(n)ΓQuot)F^{\ast}(g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}})

as a multidegree (1,s1)(1,s-1) nilsequence in (h,n)(h,n). We accomplish this via a construction of Green, Tao, and Ziegler [34, Section 12] and then use this construction in order to complete the proof of Lemma 6.3. After this, the main business of the paper is essentially done and all that remains to prove Theorem 1.2 is the symmetrization argument which will be carried out in the next section.

11.1. Constructing the (1,s1)(1,s-1)-nilsequence

Our analysis at this point is essentially verbatim that of [34, pp. 1313-1315]. We reproduce the details here (and discuss various complexity issues, which are completely routine, in the appendix). For the sake of simplicity, we may clean up notation from (10.1) and write

ghQuot(n)\displaystyle g_{h}^{\mathrm{Quot}}(n) =i=1s1j=1Diexp(e~i,j)γi,jnii!i=1s1j=Di+1Di+DiLinexp(e~i,j)αi,j{βi,jh}nii!,\displaystyle=\prod_{i=1}^{s-1}\prod_{j=1}^{D_{i}^{\ast}}\exp(\widetilde{e}_{i,j})^{\gamma_{i,j}\cdot\frac{n^{i}}{i!}}\cdot\prod_{i=1}^{s-1}\prod_{j=D_{i}^{\ast}+1}^{D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}}\exp(\widetilde{e}_{i,j})^{\alpha_{i,j}\{\beta_{i,j}h\}\cdot\frac{n^{i}}{i!}},

where we have abusively reindexed various coefficients γ,α,β\gamma,\alpha,\beta but nothing else.

We now define GLinG_{\mathrm{Lin}} to be the Lie subgroup of GQuotG_{\mathrm{Quot}} such that log(GLin)\log(G_{\mathrm{Lin}}) is the subspace generated by all (r1)(r-1)-fold iterated commutators (with r1r\geq 1) of e~i1,j1,,e~ir,jr\widetilde{e}_{i_{1},j_{1}},\ldots,\widetilde{e}_{i_{r},j_{r}} with j>Dij_{\ell}>D_{i_{\ell}}^{\ast} for exactly one index \ell. We have the following pair of basic observations.

Claim 11.1.

We have that GLinG_{\mathrm{Lin}} is well-defined, abelian, and normal with respect to GQuotG_{\mathrm{Quot}}.

Proof.

Similar to the proof of Claim 10.9, GLinG_{\mathrm{Lin}} is well-defined and normal. The only modification to the proof is noting that a commutator of e~ik,jk\widetilde{e}_{i_{k},j_{k}} with at least two indices \ell with j>Dij_{\ell}>D_{i_{\ell}}^{\ast} vanishes by the definition of GQuotG_{\mathrm{Quot}}.

To see that GLinG_{\mathrm{Lin}} is abelian, it suffices to prove that the commutator of any pair of generators is the identity. This immediately follows from the fact that commutators with at least two generators of the form e~i,j\widetilde{e}_{i_{\ell},j_{\ell}} with j>Dij_{\ell}>D_{i_{\ell}}^{\ast} vanish. ∎

Due to normality, GQuotG_{\mathrm{Quot}} acts on GLinG_{\mathrm{Lin}} via conjugation. In particular, we define GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}} with the group law given by

(g,g1)(g,g1):=(gg,g1gg1)=(gg,((g)1g1g)g1).(g,g_{1})(g^{\prime},g_{1}^{\prime}):=(gg^{\prime},g_{1}^{g^{\prime}}g_{1}^{\prime})=(gg^{\prime},((g^{\prime})^{-1}g_{1}g^{\prime})g_{1}^{\prime}).
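As a routine sanity check on this group law (a verification added here for convenience, not taken from the source), the identity element is $(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}})$ and the inverse of $(g,g_1)$ is $(g^{-1},gg_1^{-1}g^{-1})$, which is the form used in the normality computations later in this section:

```latex
\begin{align*}
(g,g_1)\cdot(g^{-1},\,g g_1^{-1} g^{-1})
 &= \bigl(g g^{-1},\ \bigl((g^{-1})^{-1} g_1\, g^{-1}\bigr)\cdot g g_1^{-1} g^{-1}\bigr)\\
 &= \bigl(\mathrm{id}_{G_{\mathrm{Quot}}},\ g g_1 g^{-1}\cdot g g_1^{-1} g^{-1}\bigr)
  = (\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}).
\end{align*}
```

Note that the second coordinate $gg_1^{-1}g^{-1}$ lies in $G_{\mathrm{Lin}}$ precisely because $G_{\mathrm{Lin}}$ is normal in $G_{\mathrm{Quot}}$.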

We now introduce a manner in which the additive group R=i=1s1DiLinR=\mathbb{R}^{\sum_{i=1}^{s-1}D_{i}^{\mathrm{Lin}}}, with elements denoted

\[t=(t_{i,j})_{1\leq i\leq s-1,~D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}},\]

acts on GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}. Specifically, we will define an action ρ(t)\rho(t) on this group for all tRt\in R and use this to construct

GMulti=Rρ(GQuotGLin).G_{\mathrm{Multi}}=R\ltimes_{\rho}(G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}).

This action will allow us to simultaneously “raise” parts of GLinG_{\mathrm{Lin}} to various different fractional powers of hh, allowing us to incorporate our “hh-linear” family of nilsequences into a multidegree (1,s1)(1,s-1) nilsequence (in variables (h,n)(h,n)).

For each tRt\in R, we define the homomorphism ggtg\mapsto g^{t} from GQuotG_{\mathrm{Quot}} to itself on generators. We map exp(e~i,j)exp(e~i,j)ti,j\exp(\widetilde{e}_{i,j})\to\exp(\widetilde{e}_{i,j})^{t_{i,j}} for 1is11\leq i\leq s-1 and Di<jDi+DiLinD_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}} while exp(e~i,j)\exp(\widetilde{e}_{i,j}) is fixed for 1is11\leq i\leq s-1 and 1jDi1\leq j\leq D_{i}^{\ast}. The defining relations of GQuotG_{\mathrm{Quot}} are preserved by this transformation, so this is easily seen to be a well-defined homomorphism. At the Lie algebra this transformation is essentially replacing appropriate e~i,j\widetilde{e}_{i,j} by ti,je~i,jt_{i,j}\widetilde{e}_{i,j}.

For gGQuotg\in G_{\mathrm{Quot}} and t,tRt,t^{\prime}\in R we have

(gt)t=gtt,(g^{t})^{t^{\prime}}=g^{tt^{\prime}},

and for g,gGLing,g^{\prime}\in G_{\mathrm{Lin}} we have

gtgt=gt+t and gtgt=(gg)t.g^{t}g^{t^{\prime}}=g^{t+t^{\prime}}\text{ and }g^{t}g^{\prime t}=(gg^{\prime})^{t}.

These are trivial since $G_{\mathrm{Lin}}$ is abelian.

We next claim that if gGQuotg\in G_{\mathrm{Quot}} and gGLing^{\prime}\in G_{\mathrm{Lin}} then

(11.1) (ggg1)t=ggtg1.(gg^{\prime}g^{-1})^{t}=gg^{\prime t}g^{-1}.

To prove this note that it suffices to prove the claim for powers of generators of the groups GQuotG_{\mathrm{Quot}} and GLinG_{\mathrm{Lin}} (since conjugation and ggtg\mapsto g^{t} are homomorphisms). If gGLing\in G_{\mathrm{Lin}} the result is trivial due to the abelian property, and if gGLing\notin G_{\mathrm{Lin}} (and is the power of a generator) then gt=gg^{t}=g by definition so (ggg1)t=gtgt(g1)t=ggtg1(gg^{\prime}g^{-1})^{t}=g^{t}g^{\prime t}(g^{-1})^{t}=gg^{\prime t}g^{-1} as desired.

We now define ρ:RAut(GQuotGLin)\rho\colon R\to\operatorname{Aut}(G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}) by

ρ(t)(g,g1):=(gg1t,g1).\rho(t)(g,g_{1}):=(g\cdot g_{1}^{t},g_{1}).

The map ρ(t)\rho(t) is clearly bijective and we have

ρ(s)(ρ(t)(g,g1))=ρ(s)((gg1t,g1))=(gg1t+s,g1)=ρ(t+s)(g,g1),\rho(s)(\rho(t)(g,g_{1}))=\rho(s)((g\cdot g_{1}^{t},g_{1}))=(g\cdot g_{1}^{t+s},g_{1})=\rho(t+s)(g,g_{1}),

so to check this is a group action it suffices to show ρ(t)\rho(t) gives a valid homomorphism of GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}. This follows because

ρ(t)((g,g1)(g,g1))\displaystyle\rho(t)((g,g_{1})\cdot(g^{\prime},g_{1}^{\prime})) =ρ(t)(gg,(g)1g1gg1)\displaystyle=\rho(t)(gg^{\prime},(g^{\prime})^{-1}g_{1}g^{\prime}g_{1}^{\prime})
=(gg(g)1g1tg(g1)t,(g)1g1gg1)=(gg1tg(g1)t,(g)1g1gg1),\displaystyle=(gg^{\prime}(g^{\prime})^{-1}g_{1}^{t}g^{\prime}(g_{1}^{\prime})^{t},(g^{\prime})^{-1}g_{1}g^{\prime}g_{1}^{\prime})=(gg_{1}^{t}g^{\prime}(g_{1}^{\prime})^{t},(g^{\prime})^{-1}g_{1}g^{\prime}g_{1}^{\prime}),

by (11.1), while

ρ(t)(g,g1)ρ(t)(g,g1)\displaystyle\rho(t)(g,g_{1})\rho(t)(g^{\prime},g_{1}^{\prime}) =(gg1t,g1)(g(g1)t,g1)=(gg1tg(g1)t,(g(g1)t)1g1g(g1)tg1)\displaystyle=(gg_{1}^{t},g_{1})\cdot(g^{\prime}(g_{1}^{\prime})^{t},g_{1}^{\prime})=(gg_{1}^{t}g^{\prime}(g_{1}^{\prime})^{t},(g^{\prime}(g_{1}^{\prime})^{t})^{-1}g_{1}g^{\prime}(g_{1}^{\prime})^{t}g_{1}^{\prime})
=(gg1tg(g1)t,(g1)t((g)1g1g)(g1)tg1)=(gg1tg(g1)t,(g1)t(g1)t((g)1g1g)g1)\displaystyle=(gg_{1}^{t}g^{\prime}(g_{1}^{\prime})^{t},(g_{1}^{\prime})^{-t}((g^{\prime})^{-1}g_{1}g^{\prime})(g_{1}^{\prime})^{t}g_{1}^{\prime})=(gg_{1}^{t}g^{\prime}(g_{1}^{\prime})^{t},(g_{1}^{\prime})^{-t}(g_{1}^{\prime})^{t}((g^{\prime})^{-1}g_{1}g^{\prime})g_{1}^{\prime})
=(gg1tg(g1)t,(g)1g1gg1),\displaystyle=(gg_{1}^{t}g^{\prime}(g_{1}^{\prime})^{t},(g^{\prime})^{-1}g_{1}g^{\prime}g_{1}^{\prime}),

where we have used that GLinG_{\mathrm{Lin}} is abelian and normal.
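The identities above can be checked by machine in a toy model. The following is a minimal sketch, assuming the Heisenberg group as a stand-in for $G_{\mathrm{Quot}}$ (with one "starred" direction $a$, one "linear" direction $b$, and their commutator $c$); this model and the helper names `mul`, `inv`, `power`, `sd_mul`, `rho`, and `check` are hypothetical and do not appear in the paper. It verifies in exact rational arithmetic that $\rho(t)$ respects the semidirect product law and that $\rho(s)\rho(t)=\rho(s+t)$.

```python
from fractions import Fraction
from itertools import product

# Toy stand-in for G_Quot (an assumption for illustration only): the Heisenberg
# group of triples (a, b, c) with law (a,b,c)(a',b',c') = (a+a', b+b', c+c'+a*b').
# Here "a" mimics a starred generator, "b" a linear generator, and "c" their
# commutator, so the toy G_Lin = {(0, b, c)} is abelian and normal.

def mul(g, h):
    return (g[0] + h[0], g[1] + h[1], g[2] + h[2] + g[0] * h[1])

def inv(g):
    a, b, c = g
    return (-a, -b, -c + a * b)

def power(g, t):
    """Toy version of g -> g^t: rescale the linear direction and its commutator."""
    a, b, c = g
    return (a, t * b, t * c)

def sd_mul(x, y):
    """Semidirect law: (g, g1)(g', g1') = (g g', (g')^{-1} g1 g' g1')."""
    (g, g1), (h, h1) = x, y
    return (mul(g, h), mul(mul(mul(inv(h), g1), h), h1))

def rho(t, x):
    """rho(t)(g, g1) = (g * g1^t, g1)."""
    g, g1 = x
    return (mul(g, power(g1, t)), g1)

def check():
    """Test the homomorphism and additivity properties on a small grid."""
    vals = (-1, 2)
    quot = list(product(vals, repeat=3))
    lin = [(0, b, c) for b, c in product(vals, repeat=2)]
    pairs = [(g, g1) for g in quot for g1 in lin]
    t, s = Fraction(1, 3), Fraction(5, 2)
    is_hom = all(rho(t, sd_mul(x, y)) == sd_mul(rho(t, x), rho(t, y))
                 for x in pairs for y in pairs)
    is_additive = all(rho(s, rho(t, x)) == rho(s + t, x) for x in pairs)
    return is_hom, is_additive
```

Calling `check()` returns a pair of booleans for the two properties. Exact fractions are used so that equality of group elements is not affected by rounding.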

We are now in position to define the group of interest which will support the multidegree (1,s1)(1,s-1) nilsequence. Let

GMulti=Rρ(GQuotGLin)G_{\mathrm{Multi}}=R\ltimes_{\rho}(G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}})

where multiplication is given by

(t,(g,g1))(t,(g,g1))=(t+t,(ρ(t)(g,g1))(g,g1)).(t,(g,g_{1}))(t^{\prime},(g^{\prime},g_{1}^{\prime}))=(t+t^{\prime},(\rho(t^{\prime})(g,g_{1}))\cdot(g^{\prime},g_{1}^{\prime})).

This is seen to be a connected, simply connected Lie group. We give it a multidegree filtration (GMulti)(d1,d2)(G_{\mathrm{Multi}})_{(d_{1},d_{2})} defined by:

  • If d1>1d_{1}>1 then (GMulti)(d1,d2)=IdGMulti(G_{\mathrm{Multi}})_{(d_{1},d_{2})}=\mathrm{Id}_{G_{\mathrm{Multi}}};

  • If d2>0d_{2}>0 then (GMulti)(1,d2)={(0,(g,idGLin)):g(GQuot)(d2,0)GLin}(G_{\mathrm{Multi}})_{(1,d_{2})}=\{(0,(g,\mathrm{id}_{G_{\mathrm{Lin}}}))\colon g\in(G_{\mathrm{Quot}})_{(d_{2},0)}\cap G_{\mathrm{Lin}}\};

  • (GMulti)(1,0)={(t,(g,idGLin)):tR,g(GQuot)(0,0)GLin}(G_{\mathrm{Multi}})_{(1,0)}=\{(t,(g,\mathrm{id}_{G_{\mathrm{Lin}}}))\colon t\in R,g\in(G_{\mathrm{Quot}})_{(0,0)}\cap G_{\mathrm{Lin}}\} or equivalently just {(t,(g,idGLin)):tR,gGLin}\{(t,(g,\mathrm{id}_{G_{\mathrm{Lin}}}))\colon t\in R,g\in G_{\mathrm{Lin}}\};

  • If d2>0d_{2}>0 then (GMulti)(0,d2)={(0,(g,g1)):g(GQuot)(d2,0),g1(GQuot)(d2,0)GLin}(G_{\mathrm{Multi}})_{(0,d_{2})}=\{(0,(g,g_{1}))\colon g\in(G_{\mathrm{Quot}})_{(d_{2},0)},g_{1}\in(G_{\mathrm{Quot}})_{(d_{2},0)}\cap G_{\mathrm{Lin}}\};

  • (GMulti)(0,0)=GMulti(G_{\mathrm{Multi}})_{(0,0)}=G_{\mathrm{Multi}}.

Claim 11.2.

(GMulti)(d1,d2)(G_{\mathrm{Multi}})_{(d_{1},d_{2})} is a valid multidegree filtration on GMultiG_{\mathrm{Multi}}.

Proof.

Note that

(t,(g,g1))=(t,(idGQuot,idGLin))(0,(g,g1))(t,(g,g_{1}))=(t,(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(0,(g,g_{1}))

and therefore (GMulti)(0,0)=(GMulti)(1,0)(GMulti)(0,1)(G_{\mathrm{Multi}})_{(0,0)}=(G_{\mathrm{Multi}})_{(1,0)}\vee(G_{\mathrm{Multi}})_{(0,1)}. We next check various commutator relations. First note that

\[[(G_{\mathrm{Multi}})_{(1,0)},(G_{\mathrm{Multi}})_{(1,0)}]=\mathrm{Id}_{G_{\mathrm{Multi}}}.\]

This follows because if g,hGLing,h\in G_{\mathrm{Lin}} we have gh=hggh=hg hence

\[(t,(g,\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(t^{\prime},(h,\mathrm{id}_{G_{\mathrm{Lin}}}))=(t+t^{\prime},(gh,\mathrm{id}_{G_{\mathrm{Lin}}}))=(t^{\prime},(h,\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(t,(g,\mathrm{id}_{G_{\mathrm{Lin}}})).\]

Therefore it suffices to verify that

[(GMulti)(0,a),(GMulti)(0,b)]\displaystyle[(G_{\mathrm{Multi}})_{(0,a)},(G_{\mathrm{Multi}})_{(0,b)}] (GMulti)(0,a+b),\displaystyle\leqslant(G_{\mathrm{Multi}})_{(0,a+b)},
[(GMulti)(1,a),(GMulti)(0,b)]\displaystyle[(G_{\mathrm{Multi}})_{(1,a)},(G_{\mathrm{Multi}})_{(0,b)}] (GMulti)(1,a+b).\displaystyle\leqslant(G_{\mathrm{Multi}})_{(1,a+b)}.

We first tackle the first claim, in which we may reduce to the case a,b>0a,b>0. We wish to show

[(g,g1),(g,g1)]{(h,h1):h(GQuot)(a+b,0),h1(GQuot)(a+b,0)GLin}[(g,g_{1}),(g^{\prime},g_{1}^{\prime})]\in\{(h,h_{1})\colon h\in(G_{\mathrm{Quot}})_{(a+b,0)},h_{1}\in(G_{\mathrm{Quot}})_{(a+b,0)}\cap G_{\mathrm{Lin}}\}

if g,g1(GQuot)(a,0)g,g_{1}\in(G_{\mathrm{Quot}})_{(a,0)}, g,g1(GQuot)(b,0)g^{\prime},g_{1}^{\prime}\in(G_{\mathrm{Quot}})_{(b,0)}, and g1,g1GLing_{1},g_{1}^{\prime}\in G_{\mathrm{Lin}}. Via Lemma 2.2, it suffices to prove (GQuot)(a+b,0)(G_{\mathrm{Quot}})_{(a+b,0)} is normal in (GQuot)(a,0)(G_{\mathrm{Quot}})_{(a,0)} and (GQuot)(b,0)(G_{\mathrm{Quot}})_{(b,0)} and then check at the level of generators.

To check normality, we have

(g,g1)(g,g1)(g,g1)1\displaystyle(g,g_{1})(g^{\prime},g_{1}^{\prime})(g,g_{1})^{-1} =(g,g1)(g,g1)(g1,gg11g1)\displaystyle=(g,g_{1})(g^{\prime},g_{1}^{\prime})(g^{-1},gg_{1}^{-1}g^{-1})
=(gg,(g)1g1gg1)(g1,gg11g1)\displaystyle=(gg^{\prime},(g^{\prime})^{-1}g_{1}g^{\prime}\cdot g_{1}^{\prime})(g^{-1},gg_{1}^{-1}g^{-1})
=(ggg1,(g(g)1)g1(gg1)gg1g1gg11g1)\displaystyle=(gg^{\prime}g^{-1},(g(g^{\prime})^{-1})g_{1}(g^{\prime}g^{-1})\cdot gg_{1}^{\prime}g^{-1}\cdot gg_{1}^{-1}g^{-1})

and the result follows noting that GLin,(GQuot)(j,0)G_{\mathrm{Lin}},(G_{\mathrm{Quot}})_{(j,0)} are normal in GQuotG_{\mathrm{Quot}} for all j0j\geq 0.

Since

\[(g,g_{1})=(g,\mathrm{id}_{G_{\mathrm{Lin}}})\cdot(\mathrm{id}_{G_{\mathrm{Quot}}},g_{1})\]

and it suffices to check the claim on generators, we may reduce to the case where exactly one of $g,g_{1}$ and exactly one of $g^{\prime},g_{1}^{\prime}$ are the identity. The result is clear when $g,g^{\prime}$ are trivial, and the case when $g_{1},g_{1}^{\prime}$ are trivial follows from the fact that we have a valid filtration on $G_{\mathrm{Quot}}$. In the remaining cases we may assume by symmetry that $g_{1}=\mathrm{id}_{G_{\mathrm{Lin}}}$ and $g^{\prime}=\mathrm{id}_{G_{\mathrm{Quot}}}$. We have

(g1,idGLin)(idGQuot,(g1)1)(g,idGLin)(idGQuot,g1)=(idGQuot,g1(g1)1gg1)(g^{-1},\mathrm{id}_{G_{\mathrm{Lin}}})(\mathrm{id}_{G_{\mathrm{Quot}}},(g_{1}^{\prime})^{-1})(g,\mathrm{id}_{G_{\mathrm{Lin}}})(\mathrm{id}_{G_{\mathrm{Quot}}},g_{1}^{\prime})=(\mathrm{id}_{G_{\mathrm{Quot}}},g^{-1}(g_{1}^{\prime})^{-1}gg_{1}^{\prime})

and we see that the final coordinate satisfies [g,g1](GQuot)(a+b,0)GLin[g,g_{1}^{\prime}]\in(G_{\mathrm{Quot}})_{(a+b,0)}\cap G_{\mathrm{Lin}}. We have finished verifying the first claim.

Now note that {(h,idGLin):hGLin}\{(h,\mathrm{id}_{G_{\mathrm{Lin}}})\colon h\in G_{\mathrm{Lin}}\} is a normal subgroup of GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}, since GLinG_{\mathrm{Lin}} is abelian. Thus combining with the first claim gives the second claim, namely

\[[(G_{\mathrm{Multi}})_{(1,a)},(G_{\mathrm{Multi}})_{(0,b)}]\leqslant(G_{\mathrm{Multi}})_{(1,a+b)},\]

for a>0a>0.

The only nontrivial case left is a=0a=0 and b>0b>0 for the second claim. Furthermore, combining what we know it suffices to check the case when (t,(idGQuot,idGLin))(t,(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}})) is the element from (GMulti)(1,0)(G_{\mathrm{Multi}})_{(1,0)}. Note however that

(t,\displaystyle(t, (idGQuot,idGLin))(0,(g,g1))(t,(idGQuot,idGLin))(0,(g,g1))1\displaystyle(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(0,(g,g_{1}))\cdot(-t,(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(0,(g,g_{1}))^{-1}
=(t,(g,g1))(t,(idGQuot,idGLin))(0,(g1,gg11g1))=(0,(gg1t,g1))(0,(g1,gg11g1))\displaystyle=(t,(g,g_{1}))\cdot(-t,(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(0,(g^{-1},gg_{1}^{-1}g^{-1}))=(0,(gg_{1}^{-t},g_{1}))\cdot(0,(g^{-1},gg_{1}^{-1}g^{-1}))
=(0,(gg1tg1,idGLin)).\displaystyle=(0,(gg_{1}^{-t}g^{-1},\mathrm{id}_{G_{\mathrm{Lin}}})).

The claim now follows from the fact that if $g,g_{1}\in(G_{\mathrm{Quot}})_{(b,0)}$ and $g_{1}\in G_{\mathrm{Lin}}$ then $gg_{1}^{-t}g^{-1}\in(G_{\mathrm{Quot}})_{(b,0)}\cap G_{\mathrm{Lin}}$. This holds because if $g_{1}\in(G_{\mathrm{Quot}})_{(b,0)}\cap G_{\mathrm{Lin}}$ then $g_{1}^{t}$ is in the same group, which is normal in $G_{\mathrm{Quot}}$. ∎

Writing $t=(t_{i,j})_{1\leq i\leq s-1,~D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}}$, we define

ΓMulti={(t,(g,g1)):ti,j,gΓQuot,g1ΓQuotGLin}.\Gamma_{\mathrm{Multi}}=\{(t,(g,g_{1}))\colon t_{i,j}\in\mathbb{Z},g\in\Gamma_{\mathrm{Quot}},g_{1}\in\Gamma_{\mathrm{Quot}}\cap G_{\mathrm{Lin}}\}.

To see this is a group, observe that for $g\in\Gamma_{\mathrm{Quot}}$ we have $g^{t}\in\Gamma_{\mathrm{Quot}}$ if all coordinates of $t$ are integral. This is clear for the generators of $\Gamma_{\mathrm{Quot}}$ and the rest follows from recalling that “taking $t$-th powers” is a homomorphism on $G_{\mathrm{Quot}}$.

We now define the relevant functions which will be used to represent $F^{\ast}(g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}})$. Let $\delta=\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)}))$, where the implicit constants are chosen sufficiently large.

Let ϕ:\phi\colon\mathbb{R}\to\mathbb{R} be a 11-bounded, 11-periodic function such that:

  • ϕ(x)=1\phi(x)=1 if |{x}|1/22δ|\{x\}|\leq 1/2-2\delta;

  • ϕ(x)=0\phi(x)=0 if |{x}|1/2δ|\{x\}|\geq 1/2-\delta;

  • ϕ\phi is O(1/δ)O(1/\delta)-Lipschitz.
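One concrete function meeting the three conditions above is a piecewise-linear ramp in the distance to the nearest integer. The sketch below is only an illustration: it assumes $0<\delta<1/4$ and reads $\{x\}$ as the representative of $x$ modulo $1$ in $(-1/2,1/2]$; the names `signed_frac` and `phi` are hypothetical, and the paper does not commit to any particular choice of $\phi$.

```python
from fractions import Fraction
import math

def signed_frac(x):
    """Representative of x modulo 1 in the interval (-1/2, 1/2]."""
    f = x - math.floor(x)  # f lies in [0, 1)
    return f - 1 if f > Fraction(1, 2) else f

def phi(x, delta):
    """Piecewise-linear cutoff (an assumed choice, not the paper's):
    equals 1 when |{x}| <= 1/2 - 2*delta, equals 0 when |{x}| >= 1/2 - delta,
    and interpolates linearly with slope 1/delta in between."""
    ramp = (Fraction(1, 2) - delta - abs(signed_frac(x))) / delta
    return max(0, min(1, ramp))
```

Because $x\mapsto|\{x\}|$ is $1$-Lipschitz and $1$-periodic, this `phi` is $1$-bounded, $1$-periodic, and $(1/\delta)$-Lipschitz. Passing `Fraction` arguments keeps the evaluation exact.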

Define $H^{\ast}\subseteq H$ to be the set of $h\in H$ such that for all $1\leq i\leq s-1$ and $D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}$ we have $|\{\beta_{i,j}h\}|\leq 1/2-2\delta$, so that $\phi(\beta_{i,j}h)=1$. Using that $\beta_{i,j}\in(1/N^{\prime})\mathbb{Z}$ where $N^{\prime}$ is a prime between $100N$ and $200N$, we see that there are at most $O(\delta\cdot N\cdot\sum_{i=1}^{s-1}D_{i}^{\mathrm{Lin}})$ indices $h$ which do not satisfy the criterion, and choosing $\delta$ sufficiently small, we may assume that $H^{\ast}$ is at least half the size of $H$.

Given (t,(g,g1))GMulti(t,(g,g_{1}))\in G_{\mathrm{Multi}}, we may find (t,(g,g1))(t,(g,g1))ΓMulti(t^{\prime},(g^{\prime},g_{1}^{\prime}))\in(t,(g,g_{1}))\Gamma_{\mathrm{Multi}} such that (t)i,j(1/2,1/2](t^{\prime})_{i,j}\in(-1/2,1/2] for all i,ji,j. Define

FMulti((t,(g,g1))ΓMulti)=F(gΓQuot)1is1Di<jDi+DiLinϕ(ti,j);F_{\mathrm{Multi}}((t,(g,g_{1}))\Gamma_{\mathrm{Multi}})=F^{\ast}(g^{\prime}\Gamma_{\mathrm{Quot}})\cdot\prod_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}}\phi(t_{i,j}^{\prime});

we check that this in fact gives a well-defined function on GMulti/ΓMultiG_{\mathrm{Multi}}/\Gamma_{\mathrm{Multi}}. Note that if (t,(g,g1))(t,(g,g1))ΓMulti(t^{\prime},(g^{\prime},g_{1}^{\prime}))\in(t,(g,g_{1}))\Gamma_{\mathrm{Multi}} and ti,j(1/2,1/2]t_{i,j}^{\prime}\in(-1/2,1/2] then ti,j={ti,j}t_{i,j}^{\prime}=\{t_{i,j}\} and hence tt^{\prime} is unique. Furthermore note that

(t,(g,g1))(0,(γ,γ1))=(t,(g,g1)(γ,γ1))=(t,(gγ,(γ)1g1γγ1))(t^{\prime},(g^{\prime},g_{1}^{\prime}))\cdot(0,(\gamma^{\prime},\gamma_{1}^{\prime}))=(t^{\prime},(g^{\prime},g_{1}^{\prime})\cdot(\gamma^{\prime},\gamma_{1}^{\prime}))=(t^{\prime},(g^{\prime}\gamma^{\prime},(\gamma^{\prime})^{-1}g_{1}^{\prime}\gamma^{\prime}\gamma_{1}^{\prime}))

and trivially

F(gΓQuot)=F(gγΓQuot)F^{\ast}(g^{\prime}\Gamma_{\mathrm{Quot}})=F^{\ast}(g^{\prime}\gamma^{\prime}\Gamma_{\mathrm{Quot}})

if γΓQuot\gamma^{\prime}\in\Gamma_{\mathrm{Quot}}. Now recall that

ghQuot(n)\displaystyle g_{h}^{\mathrm{Quot}}(n) =i=1s1j=1Diexp(e~i,j)γi,jnii!i=1s1j=Di+1Di+DiLinexp(e~i,j)αi,j{βi,jh}nii!.\displaystyle=\prod_{i=1}^{s-1}\prod_{j=1}^{D_{i}^{\ast}}\exp(\widetilde{e}_{i,j})^{\gamma_{i,j}\cdot\frac{n^{i}}{i!}}\cdot\prod_{i=1}^{s-1}\prod_{j=D_{i}^{\ast}+1}^{D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}}\exp(\widetilde{e}_{i,j})^{\alpha_{i,j}\{\beta_{i,j}h\}\cdot\frac{n^{i}}{i!}}.

We set

g0(n)\displaystyle g_{0}(n) =i=1s1j=1Diexp(e~i,j)γi,jnii!,g1(n)=i=1s1j=Di+1Di+DiLinexp(e~i,j)αi,jnii!\displaystyle=\prod_{i=1}^{s-1}\prod_{j=1}^{D_{i}^{\ast}}\exp(\widetilde{e}_{i,j})^{\gamma_{i,j}\cdot\frac{n^{i}}{i!}},\quad g_{1}(n)=\prod_{i=1}^{s-1}\prod_{j=D_{i}^{\ast}+1}^{D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}}\exp(\widetilde{e}_{i,j})^{\alpha_{i,j}\cdot\frac{n^{i}}{i!}}

and define

gFinal(h,n)\displaystyle g_{\mathrm{Final}}(h,n) =(0,(g0(n),g1(n)))((βi,jh)1is1Di<jDi+DiLin,(idGQuot,idGLin))\displaystyle=(0,(g_{0}(n),g_{1}(n)))\cdot((\beta_{i,j}h)_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}},(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))
=(0,(g0(n),idGLin))(0,(idGQuot,g1(n)))((βi,jh)1is1Di<jDi+DiLin,(idGQuot,idGLin)).\displaystyle=(0,(g_{0}(n),\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(0,(\mathrm{id}_{G_{\mathrm{Quot}}},g_{1}(n)))\cdot((\beta_{i,j}h)_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}},(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}})).

gFinal(h,n)g_{\mathrm{Final}}(h,n) is seen to be a polynomial sequence with respect to the filtration given to GMultiG_{\mathrm{Multi}} as each piece is trivially a polynomial sequence and the polynomial sequences form a group under pointwise multiplication (see [34, Corollary B.4]).

Note that for all hHh\in H we have

gFinal(h,n)ΓMulti\displaystyle g_{\mathrm{Final}}(h,n)\Gamma_{\mathrm{Multi}} =(0,(g0(n),g1(n)))(({βi,jh})1is1Di<jDi+DiLin,(idGQuot,idGLin))ΓMulti\displaystyle=(0,(g_{0}(n),g_{1}(n)))\cdot((\{\beta_{i,j}h\})_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}},(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))\Gamma_{\mathrm{Multi}}
=(({βi,jh})1is1Di<jDi+DiLin,(idGQuot,idGLin))(0,(g0,h(n),g1(n)))ΓMulti,\displaystyle=((\{\beta_{i,j}h\})_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}},(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(0,(g_{0,h}^{\ast}(n),g_{1}(n)))\Gamma_{\mathrm{Multi}},

writing

g0,h(n)=g0(n)(g1(n))t(h)g_{0,h}^{\ast}(n)=g_{0}(n)(g_{1}(n))^{t(h)}

where t(h)=({βi,jh})1is1,Di<jDi+DiLinRt(h)=(\{\beta_{i,j}h\})_{1\leq i\leq s-1,~{}D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}}\in R. This is precisely the desired sense, discussed earlier, in which we have used the group action to “raise” parts of GLinG_{\mathrm{Lin}} to hh-fractional powers.
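For the reader's convenience, the manipulation above can be unpacked directly from the group law on $G_{\mathrm{Multi}}$. The following worked steps are a sketch using only the definitions of $\rho$ and of the $t$-th power map given earlier; here $\mathrm{id}$ abbreviates the relevant identity elements, $t=t(h)$, and $m$ denotes the integer vector with $\beta_{i,j}h=\{\beta_{i,j}h\}+m_{i,j}$ (the symbol $m$ is introduced here for illustration only).

```latex
\begin{align*}
((\beta_{i,j}h)_{i,j},(\mathrm{id},\mathrm{id}))
 &= (t,(\mathrm{id},\mathrm{id}))\cdot(m,(\mathrm{id},\mathrm{id})),
 \qquad (m,(\mathrm{id},\mathrm{id}))\in\Gamma_{\mathrm{Multi}},\\
(0,(g_0(n),g_1(n)))\cdot(t,(\mathrm{id},\mathrm{id}))
 &= \bigl(t,\ \rho(t)(g_0(n),g_1(n))\cdot(\mathrm{id},\mathrm{id})\bigr)
  = \bigl(t,(g_0(n)\,g_1(n)^{t},\,g_1(n))\bigr),\\
(t,(\mathrm{id},\mathrm{id}))\cdot\bigl(0,(g_0(n)\,g_1(n)^{t},\,g_1(n))\bigr)
 &= \bigl(t,\ \rho(0)(\mathrm{id},\mathrm{id})\cdot(g_0(n)\,g_1(n)^{t},\,g_1(n))\bigr)
  = \bigl(t,(g_0(n)\,g_1(n)^{t},\,g_1(n))\bigr),\\
g_1(n)^{t(h)}
 &= \prod_{i=1}^{s-1}\prod_{j=D_i^{\ast}+1}^{D_i^{\ast}+D_i^{\mathrm{Lin}}}
    \exp(\widetilde{e}_{i,j})^{\alpha_{i,j}\{\beta_{i,j}h\}\cdot\frac{n^{i}}{i!}}.
\end{align*}
```

The second and third lines agree, which gives the displayed identity, and the last line (which uses that $g\mapsto g^{t}$ is a homomorphism scaling each $\widetilde{e}_{i,j}$ with $j>D_i^{\ast}$ by $t_{i,j}$) shows that $g_{0,h}^{\ast}(n)=g_0(n)\,g_1(n)^{t(h)}$ coincides with $g_{h}^{\mathrm{Quot}}(n)$.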

Therefore, for all hHh\in H^{\ast} we have

(11.2) FMulti(gFinal(h,n)ΓMulti)=F(ghQuot(n)ΓQuot).F_{\mathrm{Multi}}(g_{\mathrm{Final}}(h,n)\Gamma_{\mathrm{Multi}})=F^{\ast}(g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}}).

We now state various complexity claims regarding GMulti/ΓMultiG_{\mathrm{Multi}}/\Gamma_{\mathrm{Multi}} and the Lipschitz nature of the function FMultiF_{\mathrm{Multi}}. We defer the rather uninspiring task of checking these bounds to the end of Appendix B.

Lemma 11.3.

Given the above setup, we have that GMulti/ΓMultiG_{\mathrm{Multi}}/\Gamma_{\mathrm{Multi}} has the structure of a multidegree (1,s1)(1,s-1) nilmanifold and it may be given a basis 𝒳Multi\mathcal{X}_{\mathrm{Multi}} of complexity bounded by exp(Os((dlog(MD/ρ))Os(1)))\exp(O_{s}((d\log(MD/\rho))^{O_{s}(1)})). Furthermore FMultiF_{\mathrm{Multi}} is exp(Os((dlog(MD/ρ))Os(1)))\exp(O_{s}((d\log(MD/\rho))^{O_{s}(1)}))-Lipschitz under this metric.

11.2. Extracting correlation

We now complete the proof of Lemma 6.3. The proof is little more than stitching together results proven in this and the previous section and noting that if two nilcharacters “differ by a lower degree-rank term” then one may pass from one to the other at the cost of introducing a lower order term. (This is essentially [34, Lemma E.7].)

Proof of Lemma 6.3.

We return to the correlation structure discussed in Section 10 (that is output by Lemma 9.1). Again, we will abuse notation slightly as discussed. So, for all hHh\in H (where |H|ρN|H|\geq\rho^{\prime}N) we have

𝔼n[N](Δhf)(n)χ(h,n)¯χh(n)¯ψh(n)¯exp(Os((dlog(MD/ρ))Os(1)))\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\otimes\overline{\chi(h,n)}\otimes\overline{\chi_{h}(n)}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\geq\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)}))

where ψh\psi_{h} is a complexity MM^{\prime} nilsequence of degree (s2)(s-2) and dimension at most dd^{\prime}. We adopt the notation developed in Sections 10 and 11. Applying Claim 10.6, we have

𝔼n[N](Δhf)(n)χ(h,n)¯F~(ghUniv(n)ΓUniv)¯ψh(n)¯exp(Os((dlog(MD/ρ))Os(1))).\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\otimes\overline{\chi(h,n)}\otimes\overline{\widetilde{F}(g_{h}^{\mathrm{Univ}}(n)\Gamma_{\mathrm{Univ}})}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\geq\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)})).

Next note that

F(gΓQuot)F(gΓQuot)¯F^{\ast}(g^{\prime}\Gamma_{\mathrm{Quot}})\otimes\overline{F^{\ast}(g^{\prime}\Gamma_{\mathrm{Quot}})}

has trace equal to 11 as FF^{\ast} is a nilcharacter. Since the output dimension of FF^{\ast} is bounded by exp((dlog(MD/ρ))Os(1))\exp((d\log(MD/\rho))^{O_{s}(1)}), we have for all hHh\in H that

𝔼n[N](Δhf)(n)χ(h,n)¯\displaystyle\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)\otimes\overline{\chi(h,n)} F~(ghUniv(n)ΓUniv)¯F(ghQuot(n)ΓQuot)\displaystyle\otimes\overline{\widetilde{F}(g_{h}^{\mathrm{Univ}}(n)\Gamma_{\mathrm{Univ}})}\otimes F^{\ast}(g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}})
F(ghQuot(n)ΓQuot)¯ψh(n)¯exp(Os((dlog(MD/ρ))Os(1))).\displaystyle\otimes\overline{F^{\ast}(g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}})}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\geq\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)})).
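The step inserting $F^{\ast}\otimes\overline{F^{\ast}}$ can be spelled out as follows. This is a sketch of the standard argument, with notation introduced only for this illustration: $X_h(n)$ denotes the vector inside the previous expectation, $F^{\ast}=(F_1,\ldots,F_{D'})$ are the coordinates of $F^{\ast}$ evaluated at $g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}}$, and $D'$ is the output dimension of $F^{\ast}$.

```latex
\begin{align*}
\mathbb{E}_{n\in[N]}X_h(n)
 &= \mathbb{E}_{n\in[N]}X_h(n)\sum_{i=1}^{D'}F_i\overline{F_i}
  = \sum_{i=1}^{D'}\mathbb{E}_{n\in[N]}X_h(n)\,F_i\overline{F_i},\\
\bigl\lVert\mathbb{E}_{n\in[N]}X_h(n)\otimes F^{\ast}\otimes\overline{F^{\ast}}\bigr\rVert_{\infty}
 &\geq \max_{1\leq i\leq D'}\bigl\lVert\mathbb{E}_{n\in[N]}X_h(n)\,F_i\overline{F_i}\bigr\rVert_{\infty}
  \geq \frac{1}{D'}\bigl\lVert\mathbb{E}_{n\in[N]}X_h(n)\bigr\rVert_{\infty}.
\end{align*}
```

The first line uses $\sum_i|F_i|^2=1$ pointwise (the trace condition), and the loss of a factor $D'$ is absorbed into the implicit constants since $D'\leq\exp((d\log(MD/\rho))^{O_s(1)})$.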

Using (11.2), we in fact may write for hHh\in H^{\ast} that

𝔼n[N](Δhf)(n)\displaystyle\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n) χ(h,n)¯F~(ghUniv(n)ΓUniv)¯F(ghQuot(n)ΓQuot)FMulti(gFinal(h,n))¯ψh(n)¯\displaystyle\otimes\overline{\chi(h,n)}\otimes\overline{\widetilde{F}(g_{h}^{\mathrm{Univ}}(n)\Gamma_{\mathrm{Univ}})}\otimes F^{\ast}(g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}})\otimes\overline{F_{\mathrm{Multi}}(g_{\mathrm{Final}}(h,n))}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}
exp(dOs(1)log(MDρ1)Os(1)).\displaystyle\geq\exp(-d^{O_{s}(1)}\log(MD\rho^{-1})^{O_{s}(1)}).

Now we may pay a cost of $\exp(-(d\log(MD/\rho))^{O_{s}(1)})$ in the size of $H^{\ast}$ by Pigeonhole to choose a single coordinate function of $\overline{\widetilde{F}(g_{h}^{\mathrm{Univ}}(n)\Gamma_{\mathrm{Univ}})}\otimes F^{\ast}(g_{h}^{\mathrm{Quot}}(n)\Gamma_{\mathrm{Quot}})$, call it $\psi_{h}^{\ast}(n)$, such that

𝔼n[N](Δhf)(n)\displaystyle\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n) χ(h,n)¯ψh(n)FMulti(gFinal(h,n))¯ψh(n)¯\displaystyle\otimes\overline{\chi(h,n)}\otimes\psi_{h}^{\ast}(n)\otimes\overline{F_{\mathrm{Multi}}(g_{\mathrm{Final}}(h,n))}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}
exp(Os((dlog(MD/ρ))Os(1))).\displaystyle\geq\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)})).

By Lemma 10.12 and Remark 10.13, $\psi_{h}^{\ast}(n)$ can be realized on a nilmanifold with a degree-rank $(s-1,r^{\ast}-1)$ filtration. Furthermore, by Lemma 10.12, the function underlying $\psi_{h}^{\ast}(n)$ has Lipschitz constant bounded by $\exp((d\log(MD/\rho))^{O_{s}(1)})$, and the nilmanifold it lives on has dimension at most $(d\log(MD/\rho))^{O_{s}(1)}$ and complexity bounded by $\exp((d\log(MD/\rho))^{O_{s}(1)})$.

By applying [42, Lemma A.6] with the subgroup corresponding to the $(s-1,r^{\ast}-1)$ degree-rank and Pigeonholing in the associated vertical frequency, we may assume that $\psi_{h}^{\ast}(n)$ has a vertical frequency with height bounded by $\exp((d\log(MD/\rho))^{O_{s}(1)})$; this may reduce the subset of $H^{\ast}$ under consideration by a further admissible fraction. We then extend $\psi_{h}^{\ast}(n)$ to a nilcharacter by using Lemma B.4\footnote{We have that $\psi_{h}^{\ast}$ lives on the group $G_{\mathrm{Univ}}^{\triangle}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}$. We may give it the degree filtration
$G_{\mathrm{Univ}}^{\triangle}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}=(G_{\mathrm{Univ}}^{\triangle})_{(1,0)}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}\geqslant(G_{\mathrm{Univ}}^{\triangle})_{(2,0)}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}\geqslant\cdots\geqslant(G_{\mathrm{Univ}}^{\triangle})_{(s-1,0)}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}\geqslant(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast}-1)}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}\geqslant\mathrm{Id}_{G_{\mathrm{Univ}}^{\triangle}/(G_{\mathrm{Univ}}^{\triangle})_{(s-1,r^{\ast})}}$
and we apply Lemma B.4 to this filtration to get a nilcharacter $H$. We then embed $\psi_{h}^{\ast}$ by taking the underlying function, call it $Q$, and taking the nilcharacter $(Q/(2\cdot\lVert Q\rVert_{\infty}),\sqrt{1-|Q/(2\cdot\lVert Q\rVert_{\infty})|^{2}}\cdot H)$.}; we refer to this nilcharacter as $\psi_{h}^{\mathrm{Output}}$ and note that it is a degree-rank $(s-1,r^{\ast}-1)$ nilcharacter with appropriate complexity. We thus have

\begin{align*}
\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)&\otimes\overline{\chi(h,n)}\otimes\psi_{h}^{\mathrm{Output}}(n)\otimes\overline{F_{\mathrm{Multi}}(g_{\mathrm{Final}}(h,n))}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\\
&\geq\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)})).
\end{align*}

By Pigeonholing in $h$ once again we may pass to $F_{\mathrm{Multi}}^{\ast}$, a fixed coordinate of $F_{\mathrm{Multi}}$, and obtain

\begin{align*}
\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)&\otimes\overline{\chi(h,n)}\otimes\psi_{h}^{\mathrm{Output}}(n)\cdot\overline{F_{\mathrm{Multi}}^{\ast}(g_{\mathrm{Final}}(h,n))}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\\
&\geq\exp(-d^{O_{s}(1)}\log(MD\rho^{-1})^{O_{s}(1)})
\end{align*}

on an $\exp(-(d\log(MD/\rho))^{O_{s}(1)})$ fraction of indices. The function $F_{\mathrm{Multi}}^{\ast}$ lives on the group $G_{\mathrm{Multi}}$. Applying [42, Lemma A.6], Pigeonholing in $h$ so that we have the same frequency, and embedding in a nilcharacter via Lemma B.4 as in the argument above, we have for all $h\in H^{\ast}$ that

\begin{align*}
\lVert\mathbb{E}_{n\in[N]}(\Delta_{h}f)(n)&\otimes\overline{\chi(h,n)}\otimes\psi_{h}^{\mathrm{Output}}(n)\otimes\overline{F_{\mathrm{Multi}}^{\mathrm{Output}}(g_{\mathrm{Final}}(h,n))}\cdot\overline{\psi_{h}(n)}\rVert_{\infty}\\
&\geq\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)}))
\end{align*}

where $F_{\mathrm{Multi}}^{\mathrm{Output}}$ is a multidegree $(1,s-1)$ nilcharacter on $G_{\mathrm{Multi}}$ with vertical frequency height, output dimension, and Lipschitz constant of each coordinate bounded by $\exp((d\log(MD/\rho))^{O_{s}(1)})$, while the dimension of the underlying nilmanifold is bounded by $(d\log(MD/\rho))^{O_{s}(1)}$.

This completes the proof, with $\chi(h,n)\otimes F_{\mathrm{Multi}}^{\mathrm{Output}}(g_{\mathrm{Final}}(h,n))$ being the new multidegree $(1,s-1)$ nilcharacter and $\overline{\psi_{h}^{\mathrm{Output}}(n)}$ being the degree-rank $(s-1,r^{\ast}-1)$ nilcharacter; the density of indices $h$ which remain is at least $\exp(-O_{s}((d\log(MD/\rho))^{O_{s}(1)}))$. ∎

12. Symmetrization argument

We now perform the necessary symmetrization argument. In particular, at this stage in the argument due to Theorem 6.4 we have shown that for many hh, Δhf\Delta_{h}f correlates with χ(h,n)\chi(h,n) which is a multidegree (1,s1)(1,s-1) nilcharacter. We now demonstrate that χ(h,n)\chi(h,n) is “symmetric up to lower order terms” in hh and nn (after multilinearizing the nn variable) via an argument of Green, Tao, and Ziegler [34], which in turn is closely related to an earlier argument of Green and Tao [23] which proved such a result for the U3U^{3}-norm. Our treatment is slightly simpler than in [34]. Importantly, this argument is fundamentally based on a finite number of applications of Cauchy–Schwarz and a single call to equidistribution theory and therefore naturally comes with good bounds.

All references to Appendix C are simply quantified versions of lemmas which appear in the work of Green, Tao, and Ziegler [34, Appendix E] and a discussion of the correspondence is given more carefully in Appendix C. The reader may benefit from glancing at the statements in Appendix C or those in [34, Appendix E].

For the remainder of this section and Appendix C, to lighten statements, we say a nilsequence χ\chi has complexity (M,d)(M,d) if the underlying nilmanifold G/ΓG/\Gamma has complexity MM, the underlying function is MM-Lipschitz, and the dimension of GG is bounded by dd. We will say a nilcharacter χ\chi has complexity (M,d)(M,d) if the underlying nilmanifold G/ΓG/\Gamma has complexity MM, the output dimension of χ\chi is bounded by MM, the underlying function has all coordinates being MM-Lipschitz, the vertical character underlying χ\chi has height bounded by MM, and the dimension of GG is bounded by dd. In this section, MM will always be of the form M(δ):=exp(log(1/δ)Os(1))M(\delta):=\exp(\log(1/\delta)^{O_{s}(1)}) while the underlying dd will be of the form d(δ):=log(1/δ)Os(1)d(\delta):=\log(1/\delta)^{O_{s}(1)} in our analysis, where the implicit constants may, by abuse of notation, vary from line to line.

We now recall the output of Theorem 6.4. We have

\[
\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\chi(h,n)\psi_{h}(n)\rVert_{\infty}\geq M(\delta)^{-1}.
\]

Here $\psi_{h}(n)$ is a degree $(s-2)$ nilsequence and $\chi(h,n)=F(g(h,n)\Gamma)$ is a multidegree $(1,s-1)$ nilcharacter. Furthermore, both $\chi$ and $\psi_{h}(n)$ have complexity $(M(\delta),d(\delta))$.

Our first step is to multilinearize χ\chi in the nn variable, replacing it by a multidegree (1,1,,1)(1,1,\ldots,1) nilcharacter which is symmetric in the final (s1)(s-1) variables.

Lemma 12.1.

Fix $s\geq 2$. Suppose that

\[
\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\chi(h,n)\cdot\psi_{h}(n)\rVert_{\infty}\geq 1/M(\delta)
\]

with $\chi(h,n)$ a periodic multidegree $(1,s-1)$-nilcharacter and $\psi_{h}(n)$ degree $(s-2)$ nilsequences, each of complexity $(M(\delta),d(\delta))$.

Then there exist a multidegree $(1,\ldots,1)$ nilcharacter $\widetilde{\chi}$ (with $s$ ones), a degree $(s-1)$ nilsequence $\widetilde{\psi}$, and degree $(s-2)$ nilcharacters $\widetilde{\psi_{h}}(n)$, all having complexity $(M(\delta),d(\delta))$, such that

\[
\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\widetilde{\chi}(h,n,\ldots,n)\cdot\widetilde{\psi}(n)\otimes\widetilde{\psi_{h}}(n)\rVert_{\infty}\geq 1/M(\delta).
\]

Furthermore $\widetilde{\chi}$ is symmetric in the final $(s-1)$ coordinates, i.e., for any $\sigma\in\mathfrak{S}_{s-1}$ we have

\[
\widetilde{\chi}(h,n_{1},\ldots,n_{s-1})=\widetilde{\chi}(h,n_{\sigma(1)},\ldots,n_{\sigma(s-1)}).
\]
Proof.

This is essentially an immediate consequence of multilinearization (see e.g. [34, Theorem E.10]). By applying Lemma C.5, there is a multidegree $(1,\ldots,1)$ nilcharacter $\widetilde{\chi}$ of complexity $(M(\delta),d(\delta))$ such that $\chi(h,n)$ and $\widetilde{\chi}(h,n,\ldots,n)$ are $(M(\delta),M(\delta),d(\delta))$-equivalent for degree $(s-1)$. Furthermore $\widetilde{\chi}$ is symmetric in the final $(s-1)$ coordinates.

Thus applying Lemma 7.4 (and the remark following), there exists a nilsequence $\psi^{\ast}(h,n)$ of degree $\leq(s-1)$ and complexity $(M(\delta),d(\delta))$ such that

\[
\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\widetilde{\chi}(h,n,\ldots,n)\otimes\psi^{\ast}(h,n)\cdot\psi_{h}(n)\rVert_{\infty}\geq 1/M(\delta).
\]

Note that a degree $(s-1)$ nilsequence of complexity $(M(\delta),d(\delta))$ in two variables $(h,n)$ is also a multidegree $(0,s-1)\cup(s-1,s-2)$ nilsequence of complexity $(M(\delta),d(\delta))$ via taking the filtration $G_{\vec{i}}:=G_{|\vec{i}|}$. Therefore by Lemma C.6 and the first item of Lemma C.2, there exist nilsequences $\widetilde{\psi}(n)$ and $\psi_{h}^{\ast}(n)$ of degree $(s-1)$ and $(s-2)$ respectively and complexity $(M(\delta),d(\delta))$ such that

\[
\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\widetilde{\chi}(h,n,\ldots,n)\otimes\widetilde{\psi}(n)\cdot\psi_{h}^{\ast}(n)\cdot\psi_{h}(n)\rVert_{\infty}\geq 1/M(\delta).
\]

Now, $\psi_{h}^{\ast}(n)\cdot\psi_{h}(n)$ is a degree $(s-2)$ nilsequence of complexity $(M(\delta),d(\delta))$. Applying [42, Lemma A.6], we may replace this product by $\psi_{h}^{\prime}(n)$, which is a degree $(s-2)$ nilsequence of complexity $(M(\delta),d(\delta))$ with a vertical frequency of height $\exp(\log(1/\delta)^{O_{s}(1)})$. Finally, apply Lemma B.4 and embed $\psi_{h}^{\prime}(n)$ as a coordinate of a nilcharacter $\widetilde{\psi_{h}}(n)$ of complexity $(M(\delta),d(\delta))$, as in the proof of Lemma 6.3. We thus have

\[
\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\widetilde{\chi}(h,n,\ldots,n)\cdot\widetilde{\psi}(n)\otimes\widetilde{\psi_{h}}(n)\rVert_{\infty}\geq 1/M(\delta)
\]

where $\widetilde{\psi}$ and $\widetilde{\psi_{h}}$ have the appropriate properties. ∎

We are now in position to complete the proof of Theorem 1.2 via a symmetrization argument. Our argument is analogous to that of Green, Tao, and Ziegler [34, Section 13] modulo certain minor simplifications to the underlying Cauchy–Schwarz arguments.

Proof of Theorem 1.2.

We may assume that s3s\geq 3. The case s=0s=0 is trivial, s=1s=1 is standard Fourier analysis, and the case s=2s=2 follows from work of Sanders [52] (see [44, Theorem 8]). Furthermore, throughout the analysis we will assume implicitly that Nexp(log(1/δ)Ωs(1))N\geq\exp(\log(1/\delta)^{\Omega_{s}(1)}); in the case when NN is small one may deduce the statement via Fourier analysis. We proceed by induction, assuming that the inverse theorem is known for smaller ss.

By Theorem 6.4 and then Lemma 12.1 we may assume that

\begin{equation}\tag{12.1}
\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\chi(h,n,\ldots,n)\cdot\psi(n)\otimes\psi_{h}(n)\rVert_{\infty}\geq 1/M(\delta).
\end{equation}

Here $\chi$ is a multidegree $(1,\ldots,1)$ nilcharacter which is symmetric in the final $(s-1)$ variables, $\psi$ is a degree $(s-1)$ nilsequence, and $\psi_{h}$ are degree $(s-2)$ nilcharacters, all of complexity $(M(\delta),d(\delta))$. For $h\notin[N]$, we take $\psi_{h}(n)$ to be the constant function $1$ (which is a degree $0$ nilcharacter) throughout the argument. Additionally, we may use differently indexed versions of functions $\psi$ that are defined at intermediate stages of the argument; although this is an abuse of notation, the meaning will always be clear from context.

Step 1: Initial setup for Cauchy–Schwarz argument. For the sake of shorthand, we will denote $\widetilde{\chi}(h,n)=\chi(h,n,\ldots,n)$ where there are $(s-1)$ copies of the variable $n$. By Lemma 7.2 (taking $f_{1}=f$ and $f_{2}=\psi(n)$), we have

\begin{align*}
&\mathbb{E}_{\begin{subarray}{c}h_{1}+h_{2}=h_{3}+h_{4}\\ h_{i}\in[N]\end{subarray}}\lVert\mathbb{E}_{n\in[N]}\widetilde{\chi}(h_{1},n)\otimes\widetilde{\chi}(h_{2},n+h_{1}-h_{4})\otimes\overline{\widetilde{\chi}(h_{3},n)}\otimes\overline{\widetilde{\chi}(h_{4},n+h_{1}-h_{4})}\\
&\qquad\qquad\qquad\otimes\psi_{h_{1}}(n)\otimes\psi_{h_{2}}(n+h_{1}-h_{4})\otimes\overline{\psi_{h_{3}}(n)}\otimes\overline{\psi_{h_{4}}(n+h_{1}-h_{4})}\cdot e(\Theta n)\rVert_{\infty}\geq 1/M(\delta)
\end{align*}

for some $\Theta$ with $\lVert\Theta\rVert_{\mathbb{R}/\mathbb{Z}}\leq M(\delta)/N$. Note that Lemma 7.2 is stated for scalar functions; here we use that we may Pigeonhole on coordinates of the vector $\chi(h,n,\ldots,n)\cdot\psi(n)\otimes\psi_{h}(n)$ before applying Lemma 7.2.

We next change variables with $h_{1}=h+x$, $h_{2}=h+y$, $h_{3}=h+x+y$, and $h_{4}=h$, so that $h_{1}+h_{2}=h_{3}+h_{4}$ and $h_{1}-h_{4}=x$. The above then implies that

\begin{align*}
&\mathbb{E}_{h\in[N],x,y\in[\pm N]}\lVert\mathbb{E}_{n\in[N]}\widetilde{\chi}(h+x,n)\otimes\widetilde{\chi}(h+y,n+x)\otimes\overline{\widetilde{\chi}(h+x+y,n)}\otimes\overline{\widetilde{\chi}(h,n+x)}\\
&\qquad\qquad\qquad\otimes\psi_{h+x}(n)\otimes\psi_{h+y}(n+x)\otimes\overline{\psi_{h+x+y}(n)}\otimes\overline{\psi_{h}(n+x)}e(\Theta n)\rVert_{\infty}\geq 1/M(\delta).
\end{align*}
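As a sanity check on the change of variables in Step 1, the following Python sketch confirms on a small grid that the substitution respects the constraint $h_{1}+h_{2}=h_{3}+h_{4}$ and turns the shift $h_{1}-h_{4}$ into $x$. The function names `change_variables` and `check_substitution` are hypothetical helpers introduced only for this illustration.

```python
import itertools


def change_variables(h, x, y):
    """Return (h1, h2, h3, h4) for the substitution h1=h+x, h2=h+y, h3=h+x+y, h4=h."""
    return h + x, h + y, h + x + y, h


def check_substitution(N):
    """Check h1 + h2 == h3 + h4 and h1 - h4 == x for h in [1, N] and x, y in [-N, N]."""
    shifts = range(-N, N + 1)
    for h, x, y in itertools.product(range(1, N + 1), shifts, shifts):
        h1, h2, h3, h4 = change_variables(h, x, y)
        if h1 + h2 != h3 + h4 or h1 - h4 != x:
            return False
    return True
```

The check is purely arithmetic; it says nothing about the ranges of the new variables, which the text handles by letting $x,y$ range over $[\pm N]$.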

By the first item of Lemma C.3, ψh+y(n+x)\psi_{h+y}(n+x) and ψh+y(n)\psi_{h+y}(n) are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s3)(s-3). We use that s3s\geq 3 precisely here so that this is a well-defined term.

Therefore by Lemma 7.4, there exists a collection ψh,x,y(n)\psi_{h,x,y}(n) of degree (s3)(s-3) nilsequences each of complexity (M(δ),d(δ))(M(\delta),d(\delta)) such that

𝔼h[N],x,y[±N]𝔼n[N]χ~(h+x,n)χ~(h+y,n+x)χ~(h+x+y,n)¯χ~(h,n+x)¯\displaystyle\mathbb{E}_{h\in[N],x,y\in[\pm N]}\lVert\mathbb{E}_{n\in[N]}\widetilde{\chi}(h+x,n)\otimes\widetilde{\chi}(h+y,n+x)\otimes\overline{\widetilde{\chi}(h+x+y,n)}\otimes\overline{\widetilde{\chi}(h,n+x)}
ψh+x(n)ψh+y(n)ψh+x+y(n)¯ψh(n+x)¯ψh,x,y(n)e(Θn)1/M(δ).\displaystyle\qquad\qquad\qquad\otimes\psi_{h+x}(n)\otimes\psi_{h+y}(n)\otimes\overline{\psi_{h+x+y}(n)}\otimes\overline{\psi_{h}(n+x)}\psi_{h,x,y}(n)\cdot e(\Theta n)\rVert_{\infty}\geq 1/M(\delta).

We will use BB to denote vector-valued functions (which may vary term to term) with coordinates which are 11-bounded such that the dimension is bounded by M(δ)M(\delta). The key point is that nearly all terms may be folded into 11-bounded terms. In particular, we have

𝔼h[N],x,y[±N]𝔼n[N]\displaystyle\mathbb{E}_{h\in[N],x,y\in[\pm N]}\lVert\mathbb{E}_{n\in[N]} χ~(h+y,n+x)ψh,x,y(n)B(h,x,n)B(h,y,n)B(h,x+y,n)1/M(δ).\displaystyle\widetilde{\chi}(h+y,n+x)\cdot\psi_{h,x,y}(n)\otimes B(h,x,n)\otimes B(h,y,n)\otimes B(h,x+y,n)\rVert_{\infty}\geq 1/M(\delta).

Noting that ψh,x,y(n)\psi_{h,x,y}(n) may be twisted by an appropriate complex phase depending on hh, we may in fact assume that

𝔼h,n[N],x,y[±N]χ~(h+y,n+x)ψh,x,y(n)B(h,x,n)B(h,y,n)B(h,x+y,n)1/M(δ).\displaystyle\lVert\mathbb{E}_{h,n\in[N],x,y\in[\pm N]}\widetilde{\chi}(h+y,n+x)\cdot\psi_{h,x,y}(n)\otimes B(h,x,n)\otimes B(h,y,n)\otimes B(h,x+y,n)\rVert_{\infty}\geq 1/M(\delta).

By applying Pigeonhole in hh, we may fix hh^{\ast} such that

𝔼n[N],x,y[±N]χ~(h+y,n+x)ψh,x,y(n)B(x,n)B(y,n)B(x+y,n)1/M(δ).\displaystyle\lVert\mathbb{E}_{n\in[N],x,y\in[\pm N]}\widetilde{\chi}(h^{\ast}+y,n+x)\cdot\psi_{h^{\ast},x,y}(n)\otimes B(x,n)\otimes B(y,n)\otimes B(x+y,n)\rVert_{\infty}\geq 1/M(\delta).

Taking the coordinate which achieves the infinity norm, we may assume that B(,)B(\cdot,\cdot) are in fact all scalar and thus

𝔼x,y[±N]𝔼n[N]χ~(h+y,n+x)ψx,y(n)b(x,n)b(y,n)b(x+y,n)1/M(δ);\displaystyle\lVert\mathbb{E}_{x,y\in[\pm N]}\mathbb{E}_{n\in[N]}\widetilde{\chi}(h^{\ast}+y,n+x)\cdot\psi_{x,y}(n)\cdot b(x,n)\cdot b(y,n)\cdot b(x+y,n)\rVert_{\infty}\geq 1/M(\delta);

we have dropped hh^{\ast} in one subscript here.

By applying the second item of Lemma C.3 and the second item of Lemma C.2, we have that χ~(h+y,n+x)\widetilde{\chi}(h^{\ast}+y,n+x) and χ~(y,n+x)\widetilde{\chi}(y,n+x) are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). Thus by Lemma 7.4 there exists a nilsequence ψ(x,y,n)\psi^{\ast}(x,y,n) of degree (s1)(s-1) and complexity (M(δ),d(δ))(M(\delta),d(\delta)) such that

𝔼x,y[±N]𝔼n[N]χ~(y,n+x)ψ(x,y,n)ψx,y(n)b(x,n)b(y,n)b(x+y,n)1/M(δ).\displaystyle\lVert\mathbb{E}_{x,y\in[\pm N]}\mathbb{E}_{n\in[N]}\widetilde{\chi}(y,n+x)\cdot\psi^{\ast}(x,y,n)\cdot\psi_{x,y}(n)\cdot b(x,n)b(y,n)b(x+y,n)\rVert_{\infty}\geq 1/M(\delta).

Note that a degree (s1)(s-1) nilsequence in variables x,y,nx,y,n is a multidegree (s1,s1,s3)(1,0,s2)(0,1,s2)(0,0,s1)(s-1,s-1,s-3)\cup(1,0,s-2)\cup(0,1,s-2)\cup(0,0,s-1)-nilsequence. Therefore applying Lemma C.6 and applying Pigeonhole, we may adjust ψx,y\psi_{x,y} and the 11-bounded functions and remove ψ\psi^{\ast} and thus we may assume that

𝔼x,y[±N]𝔼n[N]χ~(y,n+x)ψx,y(n)b(x,n)b(y,n)b(x+y,n)1/M(δ);\displaystyle\lVert\mathbb{E}_{x,y\in[\pm N]}\mathbb{E}_{n\in[N]}\widetilde{\chi}(y,n+x)\cdot\psi_{x,y}(n)\cdot b(x,n)b(y,n)b(x+y,n)\rVert_{\infty}\geq 1/M(\delta);

note that ψx,y\psi_{x,y} and BB have all been modified but we have abusively maintained the same notation. In particular, ψx,y(n)\psi_{x,y}(n) is degree (s3)(s-3).

By Lemma C.4, the second item of Lemma C.2, and Lemma C.1 (and symmetry of χ\chi in the final (s1)(s-1) coordinates), we have that χ~(y,n+x)\widetilde{\chi}(y,n+x) and

k=0s1χ(y,n,,n,x,,x)(s1k)\bigotimes_{k=0}^{s-1}\chi(y,n,\ldots,n,x,\ldots,x)^{\otimes\binom{s-1}{k}}

are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). In this notation there are kk copies of nn and s1ks-1-k copies of xx. Now by Lemma 7.4, we have

𝔼x,y[±N]𝔼n[N]ψ(x,y,n)k=0s1χ(y,n,,n,x,,x)(s1k)ψx,y(n)\displaystyle\bigg{\lVert}\mathbb{E}_{x,y\in[\pm N]}\mathbb{E}_{n\in[N]}\psi^{\ast}(x,y,n)\cdot\bigotimes_{k=0}^{s-1}\chi(y,n,\ldots,n,x,\ldots,x)^{\otimes\binom{s-1}{k}}\cdot\psi_{x,y}(n)
b(x,n)b(y,n)b(x+y,n)1/M(δ)\displaystyle\qquad\qquad\qquad\qquad\cdot b(x,n)b(y,n)b(x+y,n)\bigg{\rVert}_{\infty}\geq 1/M(\delta)

where ψ(x,y,n)\psi^{\ast}(x,y,n) is a new degree (s1)(s-1) nilsequence of complexity (M(δ),d(δ))(M(\delta),d(\delta)). Applying Lemma C.6 as before, we may adjust ψx,y(n)\psi_{x,y}(n) and the 11-bounded functions and remove this term to have that

𝔼x,y[±N]𝔼n[N]k=0s1χ(y,n,,n,x,,x)(s1k)ψx,y(n)b(x,n)b(y,n)b(x+y,n)1/M(δ).\displaystyle\bigg{\lVert}\mathbb{E}_{x,y\in[\pm N]}\mathbb{E}_{n\in[N]}\bigotimes_{k=0}^{s-1}\chi(y,n,\ldots,n,x,\ldots,x)^{\otimes\binom{s-1}{k}}\cdot\psi_{x,y}(n)b(x,n)b(y,n)b(x+y,n)\bigg{\rVert}_{\infty}\geq 1/M(\delta).
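The expansion of $\widetilde{\chi}(y,n+x)$ into the tensor product over $k$ can be illustrated in a toy abelian model, which is an assumption made only for this sketch: replace the nilcharacter by the phase of the symmetric multilinear form $(y,n_{1},\ldots,n_{s-1})\mapsto y\,n_{1}\cdots n_{s-1}$. In that model the equivalence becomes an exact identity (the binomial theorem), and the $k=s-2$ term is $(s-1)\,y\,x\,n^{s-2}$, matching the exponent $\otimes(s-1)$ on $\chi(y,x,n,\ldots,n)$. The helper names below are hypothetical.

```python
from math import comb


def phase(y, ns):
    """Toy multilinear phase: the product of y and all entries of ns."""
    out = y
    for n in ns:
        out *= n
    return out


def diagonal_shifted(y, n, x, s):
    """Phase of chi~(y, n + x): all s-1 trailing slots equal to n + x."""
    return phase(y, [n + x] * (s - 1))


def expanded(y, n, x, s):
    """Sum over k of binom(s-1, k) times the phase with k copies of n and s-1-k copies of x."""
    return sum(
        comb(s - 1, k) * phase(y, [n] * k + [x] * (s - 1 - k))
        for k in range(s)
    )


def top_mixed_term(y, n, x, s):
    """The k = s-2 term: one copy of x, s-2 copies of n, with multiplicity s-1."""
    return comb(s - 1, s - 2) * phase(y, [x] + [n] * (s - 2))
```

In the genuine nilpotent setting the two sides agree only up to lower-order nilsequences, which is what the equivalence for degree $(s-1)$ records.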

Note that the only terms of k=0s1χ(y,n,,n,x,,x)(s1k)\bigotimes_{k=0}^{s-1}\chi(y,n,\ldots,n,x,\ldots,x)^{\otimes\binom{s-1}{k}} which involve all of x,y,nx,y,n with nn appearing at least s2s-2 times have exactly one copy of xx, one copy of yy and nn exactly (s2)(s-2) times. Therefore taking the coordinate of

0ks1ks2χ(y,n,,n,x,,x)(s1k)\bigotimes_{\begin{subarray}{c}0\leq k\leq s-1\\ k\neq s-2\end{subarray}}\chi(y,n,\ldots,n,x,\ldots,x)^{\otimes\binom{s-1}{k}}

which achieves the infinity norm and adjusting ψx,y\psi_{x,y}, bb, and adding a term b(x,y)b(x,y) we have

𝔼x,y[±N]𝔼n[N]χ(y,x,n,,n)(s1)ψx,y(n)b(x,n)b(y,n)b(x+y,n)b(x,y)1/M(δ).\displaystyle\bigg{\lVert}\mathbb{E}_{x,y\in[\pm N]}\mathbb{E}_{n\in[N]}\chi(y,x,n,\ldots,n)^{\otimes(s-1)}\cdot\psi_{x,y}(n)b(x,n)b(y,n)b(x+y,n)b(x,y)\bigg{\rVert}_{\infty}\geq 1/M(\delta).

Step 2: Cauchy–Schwarz to remove 11-bounded functions. Applying Cauchy–Schwarz to each coordinate of the associated vector, duplicating the variable yy, and using that b(x,n)b(x,n) is 11-bounded, we find that

𝔼n[N],x[±N]𝔼y,y[±N]χ(y,x,n,,n)(s1)χ(y,x,n,,n)(s1)¯ψx,y(n)ψx,y(n)¯\displaystyle\lVert\mathbb{E}_{n\in[N],x\in[\pm N]}\mathbb{E}_{y,y^{\prime}\in[\pm N]}\chi(y,x,n,\ldots,n)^{\otimes(s-1)}\otimes\overline{\chi(y^{\prime},x,n,\ldots,n)^{\otimes(s-1)}}\cdot\psi_{x,y}(n)\overline{\psi_{x,y^{\prime}}(n)}
b(y,n)b(y,n)¯b(x+y,n)b(x+y,n)¯b(x,y)b(x,y)¯1/M(δ).\displaystyle\qquad\qquad\qquad\qquad\cdot b(y,n)\overline{b(y^{\prime},n)}\cdot b(x+y,n)\overline{b(x+y^{\prime},n)}\cdot b(x,y)\overline{b(x,y^{\prime})}\rVert_{\infty}\geq 1/M(\delta).
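The Cauchy–Schwarz step that duplicates the variable $y$ rests on the inequality $|\mathbb{E}_{x,n}b(x,n)\mathbb{E}_{y}F(x,y,n)|^{2}\leq\mathbb{E}_{x,n}\mathbb{E}_{y,y^{\prime}}F(x,y,n)\overline{F(x,y^{\prime},n)}$ for $1$-bounded $b$. The sketch below checks this numerically for random unimodular scalar data; the function `cs_sides` and the random test functions are assumptions made for illustration, not objects from the argument.

```python
import cmath
import random


def cs_sides(num_x, num_y, num_n, seed=0):
    """Return (lhs, rhs) of the Cauchy-Schwarz step for random unimodular b and F."""
    rng = random.Random(seed)

    def unit_phase():
        return cmath.exp(2j * cmath.pi * rng.random())

    pairs = [(x, n) for x in range(num_x) for n in range(num_n)]
    b = {p: unit_phase() for p in pairs}
    F = {(x, y, n): unit_phase() for (x, n) in pairs for y in range(num_y)}

    # Inner average over y for each fixed (x, n).
    inner = {(x, n): sum(F[x, y, n] for y in range(num_y)) / num_y for (x, n) in pairs}

    # Left side: squared modulus of the average weighted by the 1-bounded b.
    lhs = abs(sum(b[p] * inner[p] for p in pairs) / len(pairs)) ** 2

    # Right side: b is gone and y has been duplicated into (y, y').
    rhs = sum(
        F[x, y, n] * F[x, yp, n].conjugate()
        for (x, n) in pairs
        for y in range(num_y)
        for yp in range(num_y)
    ) / (len(pairs) * num_y ** 2)
    return lhs, rhs.real
```

The right side equals $\mathbb{E}_{x,n}|\mathbb{E}_{y}F(x,y,n)|^{2}$, so it is real and nonnegative; in the text the same step is applied coordinate by coordinate to vector-valued functions.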

By Lemma C.4, Lemma C.2, and Lemma C.1, we have that

χ(y,x,n,,n)(s1)χ(y,x,n,,n)(s1)¯ and χ(yy,x,n,,n)(s1)\chi(y,x,n,\ldots,n)^{\otimes(s-1)}\otimes\overline{\chi(y^{\prime},x,n,\ldots,n)^{\otimes(s-1)}}\text{ and }\chi(y-y^{\prime},x,n,\ldots,n)^{\otimes(s-1)}

are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). Therefore by Lemma 7.4, there exists ψ(x,y,y,n)\psi^{\ast}(x,y,y^{\prime},n) a degree (s1)(s-1) nilsequence of complexity (M(δ),d(δ))(M(\delta),d(\delta)) such that

𝔼n[N],x,y,y[±N]χ(yy,x,n,,n)(s1)ψ(x,y,y,n)ψx,y(n)ψx,y(n)¯\displaystyle\lVert\mathbb{E}_{n\in[N],x,y,y^{\prime}\in[\pm N]}\chi(y-y^{\prime},x,n,\ldots,n)^{\otimes(s-1)}\cdot\psi^{\ast}(x,y,y^{\prime},n)\cdot\psi_{x,y}(n)\overline{\psi_{x,y^{\prime}}(n)}
b(y,n)b(y,n)¯b(x+y,n)b(x+y,n)¯b(x,y)b(x,y)¯1/M(δ).\displaystyle\qquad\qquad\cdot b(y,n)\overline{b(y^{\prime},n)}\cdot b(x+y,n)\overline{b(x+y^{\prime},n)}\cdot b(x,y)\overline{b(x,y^{\prime})}\rVert_{\infty}\geq 1/M(\delta).

Note that $z=x+y+y^{\prime}$ ranges over the set $[-3N,3N]$. Take $\rho=\exp(-\log(1/\delta)^{O_{s}(1)})$ sufficiently small; the triples $(x,y,y^{\prime})$ with $|z|>(3-\rho)N$ form only an $O(\rho^{3})$ fraction of all triples and hence contribute negligibly. Then there exists $z^{\ast}\in[-(3-\rho)N,(3-\rho)N]$ such that

𝔼n[N]𝔼x,y,y[±N]x+y+y=zχ(yy,x,n,,n)(s1)ψ(x,y,y,n)ψx,y(n)ψx,y(n)¯\displaystyle\lVert\mathbb{E}_{n\in[N]}\mathbb{E}_{\begin{subarray}{c}x,y,y^{\prime}\in[\pm N]\\ x+y+y^{\prime}=z^{\ast}\end{subarray}}\chi(y-y^{\prime},x,n,\ldots,n)^{\otimes(s-1)}\cdot\psi^{\ast}(x,y,y^{\prime},n)\cdot\psi_{x,y}(n)\overline{\psi_{x,y^{\prime}}(n)}
b(y,n)b(y,n)¯b(x+y,n)b(x+y,n)¯b(x,y)b(x,y)¯1/M(δ).\displaystyle\qquad\qquad\cdot b(y,n)\overline{b(y^{\prime},n)}\cdot b(x+y,n)\overline{b(x+y^{\prime},n)}\cdot b(x,y)\overline{b(x,y^{\prime})}\rVert_{\infty}\geq 1/M(\delta).

This implies that

𝔼n[N]𝔼x,y,y[±N]x+y+y=zχ(yy,zyy,n,,n)(s1)ψ(zyy,y,y,n)ψx,y(n)ψx,y(n)¯\displaystyle\lVert\mathbb{E}_{n\in[N]}\mathbb{E}_{\begin{subarray}{c}x,y,y^{\prime}\in[\pm N]\\ x+y+y^{\prime}=z^{\ast}\end{subarray}}\chi(y-y^{\prime},z^{\ast}-y-y^{\prime},n,\ldots,n)^{\otimes(s-1)}\cdot\psi^{\ast}(z^{\ast}-y-y^{\prime},y,y^{\prime},n)\cdot\psi_{x,y}(n)\overline{\psi_{x,y^{\prime}}(n)}
b(y,n)b(y,n)¯b(zy,n)b(zy,n)¯b(zyy,y)b(zyy,y)¯1/M(δ).\displaystyle\cdot b(y,n)\overline{b(y^{\prime},n)}\cdot b(z^{\ast}-y^{\prime},n)\overline{b(z^{\ast}-y,n)}\cdot b(z^{\ast}-y-y^{\prime},y)\overline{b(z^{\ast}-y-y^{\prime},y^{\prime})}\rVert_{\infty}\geq 1/M(\delta).

By applying the first item of Lemma C.3, Lemma C.4, Lemma C.2, and Lemma C.1 we have that

χ(yy,zyy,n,,n)(s1)\chi(y-y^{\prime},z^{\ast}-y-y^{\prime},n,\ldots,n)^{\otimes(s-1)}

and

χ(y,y,n,,n)(s1)χ(y,y,n,,n)(s1)χ(y,y,n,,n)¯(s1)χ(y,y,n,,n)¯(s1)\chi(y^{\prime},y^{\prime},n,\ldots,n)^{\otimes(s-1)}\chi(y^{\prime},y,n,\ldots,n)^{\otimes(s-1)}\overline{\chi(y,y,n,\ldots,n)}^{\otimes(s-1)}\overline{\chi(y,y^{\prime},n,\ldots,n)}^{\otimes(s-1)}

are $(M(\delta),M(\delta),d(\delta))$-equivalent for degree $(s-1)$. Thus by Lemma 7.4, letting $\widetilde{\psi}$ denote a degree $(s-1)$ nilsequence in $y,y^{\prime},n$ of complexity $(M(\delta),d(\delta))$, we have that

𝔼n[N]𝔼y,y[±N]|zyy|Nχ(y,y,n,,n)(s1)χ(y,y,n,,n)(s1)\displaystyle\lVert\mathbb{E}_{n\in[N]}\mathbb{E}_{\begin{subarray}{c}y,y^{\prime}\in[\pm N]\\ |z^{\ast}-y-y^{\prime}|\leq N\end{subarray}}\chi(y^{\prime},y^{\prime},n,\ldots,n)^{\otimes(s-1)}\chi(y^{\prime},y,n,\ldots,n)^{\otimes(s-1)}
χ(y,y,n,,n)¯(s1)χ(y,y,n,,n)¯(s1)ψ~(y,y,n)ψzyy,y(n)ψzyy,y(n)¯\displaystyle\overline{\chi(y,y,n,\ldots,n)}^{\otimes(s-1)}\overline{\chi(y,y^{\prime},n,\ldots,n)}^{\otimes(s-1)}\cdot\widetilde{\psi}(y,y^{\prime},n)\cdot\psi_{z^{\ast}-y-y^{\prime},y}(n)\overline{\psi_{z^{\ast}-y-y^{\prime},y^{\prime}}(n)}
b(y,n)b(y,n)¯b(zy,n)b(zy,n)¯b(zyy,y)b(zyy,y)¯1/M(δ).\displaystyle\cdot b(y,n)\overline{b(y^{\prime},n)}\cdot b(z^{\ast}-y^{\prime},n)\overline{b(z^{\ast}-y,n)}\cdot b(z^{\ast}-y-y^{\prime},y)\cdot\overline{b(z^{\ast}-y-y^{\prime},y^{\prime})}\rVert_{\infty}\geq 1/M(\delta).

Here we have “folded” $\psi^{\ast}(z^{\ast}-y-y^{\prime},y,y^{\prime},n)$ into $\widetilde{\psi}$ via Lemma C.2. We may collapse various $1$-bounded functions (and pass to the coordinates of $\chi(y^{\prime},y^{\prime},n,\ldots,n)^{\otimes(s-1)}$ and $\overline{\chi(y,y,n,\ldots,n)}^{\otimes(s-1)}$ which achieve the $L^{\infty}$ norm) and obtain

𝔼n[N]𝔼y,y[±N]|zyy|Nχ(y,y,n,,n)(s1)χ(y,y,n,,n)¯(s1)\displaystyle\lVert\mathbb{E}_{n\in[N]}\mathbb{E}_{\begin{subarray}{c}y,y^{\prime}\in[\pm N]\\ |z^{\ast}-y-y^{\prime}|\leq N\end{subarray}}\chi(y^{\prime},y,n,\ldots,n)^{\otimes(s-1)}\overline{\chi(y,y^{\prime},n,\ldots,n)}^{\otimes(s-1)}
ψ~(y,y,n)ψy,y(n)b(y,n)b(y,n)b(y,y)1/M(δ);\displaystyle\qquad\qquad\cdot\widetilde{\psi}(y,y^{\prime},n)\cdot\psi_{y,y^{\prime}}(n)\cdot b(y,n)b(y^{\prime},n)b(y,y^{\prime})\rVert_{\infty}\geq 1/M(\delta);

here the $\psi_{y,y^{\prime}}(n)$ are degree $(s-3)$ nilsequences of complexity $(M(\delta),d(\delta))$. Furthermore, as $\widetilde{\psi}(y,y^{\prime},n)$ is a degree $(s-1)$ nilsequence, it is a multidegree $(s-1,s-1,s-3)\cup(1,0,s-1)\cup(0,1,s-1)$ nilsequence. Therefore by Lemma C.6 and Lemma C.2, we may remove $\widetilde{\psi}$ at the cost of adjusting $b$ and $\psi_{y,y^{\prime}}$ to obtain

𝔼n[N]𝔼y,y[±N]|zyy|Nχ(y,y,n,,n)(s1)χ(y,y,n,,n)¯(s1)\displaystyle\lVert\mathbb{E}_{n\in[N]}\mathbb{E}_{\begin{subarray}{c}y,y^{\prime}\in[\pm N]\\ |z^{\ast}-y-y^{\prime}|\leq N\end{subarray}}\chi(y^{\prime},y,n,\ldots,n)^{\otimes(s-1)}\overline{\chi(y,y^{\prime},n,\ldots,n)}^{\otimes(s-1)}
ψy,y(n)b(y,n)b(y,n)b(y,y)1/M(δ).\displaystyle\qquad\qquad\cdot\psi_{y,y^{\prime}}(n)\cdot b(y,n)b(y^{\prime},n)b(y,y^{\prime})\rVert_{\infty}\geq 1/M(\delta).
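The four-term expansion of $\chi(y-y^{\prime},z^{\ast}-y-y^{\prime},n,\ldots,n)^{\otimes(s-1)}$ used earlier in this step is, at the level of phases, just bilinearity in the first two slots. The sketch below checks the corresponding identity for a toy bilinear form $B(u,v)=u^{T}Mv$ on integer vectors with a non-symmetric matrix $M$; this vector-valued model and all names in it are assumptions made only so that $B(y^{\prime},y)$ and $B(y,y^{\prime})$ can differ, as they may for a nilcharacter.

```python
def bilinear(M, u, v):
    """Toy bilinear form B(u, v) = u^T M v with integer entries."""
    return sum(M[i][j] * u[i] * v[j] for i in range(len(u)) for j in range(len(v)))


def sub(u, v):
    """Coordinatewise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def lhs(M, y, yp, z):
    """B(y - y', z - y - y'), the phase of the term before expansion."""
    return bilinear(M, sub(y, yp), sub(sub(z, y), yp))


def main_terms(M, y, yp):
    """B(y', y') + B(y', y) - B(y, y) - B(y, y'): the four terms kept in the text."""
    return (
        bilinear(M, yp, yp)
        + bilinear(M, yp, y)
        - bilinear(M, y, y)
        - bilinear(M, y, yp)
    )


def lower_order(M, y, yp, z):
    """B(y, z) - B(y', z): linear in (y, y') for fixed z, hence lower order."""
    return bilinear(M, y, z) - bilinear(M, yp, z)
```

In this model `lhs` equals `main_terms + lower_order` exactly. The terms involving the fixed $z^{\ast}$ have lower degree in $(y,y^{\prime})$, which is the analogue of their being absorbed into $\widetilde{\psi}$.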

This implies that

𝔼n[N]𝔼y,y[±N]χ(y,y,n,,n)(s1)χ(y,y,n,,n)¯(s1)\displaystyle\lVert\mathbb{E}_{n\in[N]}\mathbb{E}_{y,y^{\prime}\in[\pm N]}\chi(y^{\prime},y,n,\ldots,n)^{\otimes(s-1)}\overline{\chi(y,y^{\prime},n,\ldots,n)}^{\otimes(s-1)}
ψy,y(n)b(y,n)b(y,n)b(y,y)𝟙|zyy|N1/M(δ).\displaystyle\qquad\qquad\cdot\psi_{y,y^{\prime}}(n)\cdot b(y,n)b(y^{\prime},n)b(y,y^{\prime})\cdot\mathbbm{1}_{|z^{\ast}-y-y^{\prime}|\leq N}\rVert_{\infty}\geq 1/M(\delta).

as [𝟙|zyy|N]ρ2\mathbb{P}[\mathbbm{1}_{|z^{\ast}-y-y^{\prime}|\leq N}]\gtrsim\rho^{2}. Note that the final indicator may be absorbed into b(y,y)b(y,y^{\prime}) to obtain

𝔼n[N]𝔼y,y[±N]χ(y,y,n,,n)(s1)χ(y,y,n,,n)¯(s1)\displaystyle\lVert\mathbb{E}_{n\in[N]}\mathbb{E}_{y,y^{\prime}\in[\pm N]}\chi(y^{\prime},y,n,\ldots,n)^{\otimes(s-1)}\overline{\chi(y,y^{\prime},n,\ldots,n)}^{\otimes(s-1)}
ψy,y(n)b(y,n)b(y,n)b(y,y)1/M(δ).\displaystyle\qquad\qquad\cdot\psi_{y,y^{\prime}}(n)\cdot b(y,n)b(y^{\prime},n)b(y,y^{\prime})\rVert_{\infty}\geq 1/M(\delta).
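The lower bound on the probability that $|z^{\ast}-y-y^{\prime}|\leq N$ can be checked by direct counting. The sketch below computes the fraction of pairs $(y,y^{\prime})\in[\pm N]^{2}$ satisfying the constraint and minimizes over integers $z$ with $|z|\leq(3-\rho)N$. The explicit constant $1/8$ used in the accompanying check is an assumption chosen for the illustration; the text only needs a bound of order $\rho^{2}$.

```python
def fraction_admissible(N, z):
    """Fraction of (y, y') in [-N, N]^2 with |z - y - y'| <= N."""
    count = sum(
        1
        for y in range(-N, N + 1)
        for yp in range(-N, N + 1)
        if abs(z - y - yp) <= N
    )
    return count / (2 * N + 1) ** 2


def worst_case_fraction(N, rho):
    """Minimum of fraction_admissible over integers z with |z| <= (3 - rho) * N."""
    bound = int((3 - rho) * N)
    return min(fraction_admissible(N, z) for z in range(-bound, bound + 1))
```

For example, with $N=20$ and $\rho=1/4$ the extreme case $z=55$ leaves exactly $21$ admissible pairs out of $41^{2}$, a fraction of about $0.0125$, which exceeds $\rho^{2}/8$.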

Define G(y,y,n)=χ(y,y,n,,n)(s1)χ(y,y,n,,n)¯(s1)G(y,y^{\prime},n)=\chi(y,y^{\prime},n,\ldots,n)^{\otimes(s-1)}\otimes\overline{\chi(y^{\prime},y,n,\ldots,n)}^{\otimes(s-1)} and we have

𝔼n[N],y,y[±N]G(y,y,n)ψy,y(n)¯b(y,n)b(y,n)b(y,y)1/M(δ).\displaystyle\lVert\mathbb{E}_{n\in[N],y,y^{\prime}\in[\pm N]}G(y,y^{\prime},n)\cdot\overline{\psi_{y,y^{\prime}}(n)}\cdot b(y,n)b(y^{\prime},n)b(y,y^{\prime})\rVert_{\infty}\geq 1/M(\delta).

Applying Cauchy–Schwarz in $n$, then $y$, and then $y^{\prime}$ (analogously to Lemma 7.2), we may remove the bounded functions $b$ and obtain

𝔼n1,n2[N],y1,y2,y1,y2[±N]ε{1,2}3𝒞|ε|1G(yε1,yε2,nε3)ψy1,y2,y1,y2(n1)ψy1,y2,y1,y2(n2)¯\displaystyle\lVert\mathbb{E}_{n_{1},n_{2}\in[N],y_{1},y_{2},y_{1}^{\prime},y_{2}^{\prime}\in[\pm N]}\bigotimes_{\varepsilon\in\{1,2\}^{3}}\mathcal{C}^{|\varepsilon|-1}G(y_{\varepsilon_{1}},y_{\varepsilon_{2}}^{\prime},n_{\varepsilon_{3}})\cdot\psi_{y_{1},y_{2},y_{1}^{\prime},y_{2}^{\prime}}(n_{1})\cdot\overline{\psi_{y_{1},y_{2},y_{1}^{\prime},y_{2}^{\prime}}(n_{2})}\rVert_{\infty}
1/M(δ),\displaystyle\qquad\qquad\qquad\qquad\qquad\geq 1/M(\delta),

where 𝒞\mathcal{C} denotes conjugation and the ψy1,y2,y1,y2(ni)\psi_{y_{1},y_{2},y_{1}^{\prime},y_{2}^{\prime}}(n_{i}) are degree at most (s3)(s-3) nilsequences of complexity (M(δ),d(δ))(M(\delta),d(\delta)). Applying Pigeonhole in n2,y2,y2n_{2},y_{2},y_{2}^{\prime} and applying Lemma C.2 to specialize variables, reindexing n1,y1,y1n_{1},y_{1},y_{1}^{\prime} to n,y,yn,y,y^{\prime}, and taking the maximal coordinate we have

𝔼n[N],y,y[±N]G(y,y,n)ψ1(y,n)ψ2(y,n)ψy,y(n)1/M(δ).\displaystyle\lVert\mathbb{E}_{n\in[N],y,y^{\prime}\in[\pm N]}G(y,y^{\prime},n)\cdot\psi_{1}(y,n)\cdot\psi_{2}(y^{\prime},n)\cdot\psi_{y,y^{\prime}}(n)\rVert_{\infty}\geq 1/M(\delta).

Here ψy,y\psi_{y,y^{\prime}} is degree at most (s3)(s-3) in nn while ψ1(y,n)\psi_{1}(y,n) and ψ2(y,n)\psi_{2}(y,n) are multidegree (1,s2)(1,s-2) and all have complexity (M(δ),d(δ))(M(\delta),d(\delta)). Finally by the triangle inequality we have

𝔼y,y[±N]𝔼n[N]G(y,y,n)ψ1(y,n)ψ2(y,n)ψy,y(n)1/M(δ).\displaystyle\mathbb{E}_{y,y^{\prime}\in[\pm N]}\lVert\mathbb{E}_{n\in[N]}G(y,y^{\prime},n)\cdot\psi_{1}(y,n)\cdot\psi_{2}(y^{\prime},n)\cdot\psi_{y,y^{\prime}}(n)\rVert_{\infty}\geq 1/M(\delta).

Step 3: Converse of the inverse theorem and polarization. By the converse of the inverse theorem, see Lemma B.5, we have that

𝔼y,y[N]G(y,y,)ψ1(y,)ψ2(y,)Us2[N]2s21/M(δ).\displaystyle\mathbb{E}_{y,y^{\prime}\in[N]}\lVert G(y,y^{\prime},\cdot)\psi_{1}(y,\cdot)\psi_{2}(y^{\prime},\cdot)\rVert_{U^{s-2}[N]}^{2^{s-2}}\geq 1/M(\delta).

Expanding out the definition of the Us2U^{s-2}-norm, we find that

𝔼y,y[±N]𝔼n[N],h1,,hs2[±N]ε{0,1}s2(𝒞|ε|+s(G(y,y,n+εh)ψ1(y,n+εh)ψ2(y,n+εh))\displaystyle\bigg{\lVert}\mathbb{E}_{y,y^{\prime}\in[\pm N]}\mathbb{E}_{n\in[N],h_{1},\ldots,h_{s-2}\in[\pm N]}\bigotimes_{\varepsilon\in\{0,1\}^{s-2}}\Big{(}\mathcal{C}^{|\varepsilon|+s}(G(y,y^{\prime},n+\varepsilon\cdot\vec{h})\cdot\psi_{1}(y,n+\varepsilon\cdot\vec{h})\cdot\psi_{2}(y^{\prime},n+\varepsilon\cdot\vec{h}))
𝟙n+εh[N])1/M(δ).\displaystyle\qquad\qquad\cdot\mathbbm{1}_{n+\varepsilon\cdot\vec{h}\in[N]}\Big{)}\bigg{\rVert}_{\infty}\geq 1/M(\delta).

The crucial point is that by repeatedly applying Lemma C.4, Lemma C.2, and Lemma C.1 we have that

ε{0,1}s2𝒞|ε|+s(G(y,y,n+εh))\bigotimes_{\varepsilon\in\{0,1\}^{s-2}}\mathcal{C}^{|\varepsilon|+s}(G(y,y^{\prime},n+\varepsilon\cdot\vec{h}))

and

χ(y,y,h1,,hs2)(s1)!χ(y,y,h1,,hs2)¯(s1)!\chi(y,y^{\prime},h_{1},\ldots,h_{s-2})^{\otimes(s-1)!}\cdot\overline{\chi(y^{\prime},y,h_{1},\ldots,h_{s-2})}^{\otimes(s-1)!}

are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). Therefore by Lemma 7.4, there exists a nilsequence ψ~\widetilde{\psi} of degree (s1)(s-1) and complexity (M(δ),d(δ))(M(\delta),d(\delta)) such that

𝔼y,y[±N]𝔼n[N],h1,,hs2[±N]χ(y,y,h1,,hs2)(s1)!χ(y,y,h1,,hs2)¯(s1)!\displaystyle\lVert\mathbb{E}_{y,y^{\prime}\in[\pm N]}\mathbb{E}_{n\in[N],h_{1},\ldots,h_{s-2}\in[\pm N]}\chi(y,y^{\prime},h_{1},\ldots,h_{s-2})^{\otimes(s-1)!}\cdot\overline{\chi(y^{\prime},y,h_{1},\ldots,h_{s-2})}^{\otimes(s-1)!}
ψ~(n,y,y,h1,,hs2)𝟙n+εh[N]1/M(δ).\displaystyle\qquad\cdot\widetilde{\psi}(n,y,y^{\prime},h_{1},\ldots,h_{s-2})\cdot\mathbbm{1}_{n+\varepsilon\cdot\vec{h}\in[N]}\rVert_{\infty}\geq 1/M(\delta).

Via Fourier expansion (a multidimensional version of the argument in Lemma 7.1), we may fold $\mathbbm{1}_{n+\varepsilon\cdot\vec{h}\in[N]}$ into $\widetilde{\psi}(n,y,y^{\prime},h_{1},\ldots,h_{s-2})$.\footnote{To be precise, we convolve $\mathbbm{1}_{n+\varepsilon\cdot\vec{h}\in[N]}$ with $\mathbbm{1}_{|n|\leq\rho N}\cdot\prod_{i=1}^{s-2}\mathbbm{1}_{|h_{i}|\leq\rho N}$ where $\rho=1/M(\delta)$ is sufficiently small. This function has the necessary Fourier decay to apply the analysis in Lemma 7.1.} We reduce to

𝔼y,y[±N]𝔼n[N],h1,,hs2[±N]χ(y,y,h1,,hs2)(s1)!χ(y,y,h1,,hs2)¯(s1)!\displaystyle\lVert\mathbb{E}_{y,y^{\prime}\in[\pm N]}\mathbb{E}_{n\in[N],h_{1},\ldots,h_{s-2}\in[\pm N]}\chi(y,y^{\prime},h_{1},\ldots,h_{s-2})^{\otimes(s-1)!}\cdot\overline{\chi(y^{\prime},y,h_{1},\ldots,h_{s-2})}^{\otimes(s-1)!}
ψ~(n,y,y,h1,,hs2)1/M(δ).\displaystyle\qquad\cdot\widetilde{\psi}(n,y,y^{\prime},h_{1},\ldots,h_{s-2})\rVert_{\infty}\geq 1/M(\delta).

Applying Pigeonhole in nn and applying the first item of Lemma C.2, we reduce to

(12.2) 𝔼y,y[±N]𝔼h1,,hs2[±N]χ(y,y,h1,,hs2)(s1)!χ(y,y,h1,,hs2)¯(s1)!ψ~(y,y,h1,,hs2)1/M(δ);\displaystyle\begin{split}&\lVert\mathbb{E}_{y,y^{\prime}\in[\pm N]}\mathbb{E}_{h_{1},\ldots,h_{s-2}\in[\pm N]}\chi(y,y^{\prime},h_{1},\ldots,h_{s-2})^{\otimes(s-1)!}\cdot\overline{\chi(y^{\prime},y,h_{1},\ldots,h_{s-2})}^{\otimes(s-1)!}\\ &\qquad\qquad\qquad\qquad\otimes\widetilde{\psi}(y,y^{\prime},h_{1},\ldots,h_{s-2})\rVert_{\infty}\geq 1/M(\delta);\end{split}

once again we have abusively updated ψ~\widetilde{\psi}, which has degree (s1)(s-1).

Step 4: Invoking equidistribution theory. This is the only point in the argument at which we are able to apply equidistribution theory; up to this point we have been applying “elementary” facts regarding nilsequences. Let

χ(y,y,h1,,hs2)=F(g(y,y,h1,,hs2)Γ)\chi(y,y^{\prime},h_{1},\ldots,h_{s-2})=F(g(y,y^{\prime},h_{1},\ldots,h_{s-2})\Gamma)

and let ξ\xi denote the vertical G(1,,1)G_{(1,\ldots,1)} frequency of FF on the multidegree (1,,1)(1,\ldots,1) nilmanifold G/ΓG/\Gamma. We write

ψ~(y,y,h1,,hs2)=F~(g(y,y,h1,,hs2)Γ)\widetilde{\psi}(y,y^{\prime},h_{1},\ldots,h_{s-2})=\widetilde{F}(g^{\ast}(y,y^{\prime},h_{1},\ldots,h_{s-2})\Gamma^{\prime})

on the multidegree (s1)(s-1) nilmanifold G/ΓG^{\prime}/\Gamma^{\prime}. Note that

\[(g(y,y^{\prime},h_{1},\ldots,h_{s-2}),g(y^{\prime},y,h_{1},\ldots,h_{s-2}),g^{\ast}(y,y^{\prime},h_{1},\ldots,h_{s-2}))\]

may be viewed as a polynomial sequence on G×G×GG\times G\times G^{\prime} where GG^{\prime} is given a degree (s1)(s-1) filtration. G×G×GG\times G\times G^{\prime} is given a degree ss filtration where the tt-th group is

(G×G×G)t=|i|=tGi×|i|=tGi×(G)t.(G\times G\times G^{\prime})_{t}=\bigvee_{|\vec{i}|=t}G_{\vec{i}}\times\bigvee_{|\vec{i}|=t}G_{\vec{i}}\times(G^{\prime})_{t}.

Note that $F\otimes\overline{F}\otimes\widetilde{F}$ has $(G\times G\times G^{\prime})_{s}$-vertical frequency $\xi^{\prime}=(\xi,-\xi,0)$, noting that $(G^{\prime})_{s}=\mathrm{Id}_{G^{\prime}}$. By applying Corollary 5.5 with (12.2) to

F(g(y,y,h1,,hs2)Γ)(s1)!F(g(y,y,h1,,hs2)Γ)¯(s1)!F~(g(y,y,h1,,hs2)Γ),F(g(y,y^{\prime},h_{1},\ldots,h_{s-2})\Gamma)^{\otimes(s-1)!}\otimes\overline{F(g(y^{\prime},y,h_{1},\ldots,h_{s-2})\Gamma)}^{\otimes(s-1)!}\cdot\widetilde{F}(g^{\ast}(y,y^{\prime},h_{1},\ldots,h_{s-2})\Gamma^{\prime}),

and restricting the factorization to G×GG\times G, we have

\[\begin{aligned}&(g(y,y^{\prime},h_{1},\ldots,h_{s-2}),g(y^{\prime},y,h_{1},\ldots,h_{s-2}))\\ &\qquad=\varepsilon(y,y^{\prime},h_{1},\ldots,h_{s-2})\cdot g^{\mathrm{Output}}(y,y^{\prime},h_{1},\ldots,h_{s-2})\cdot\gamma(y,y^{\prime},h_{1},\ldots,h_{s-2}),\end{aligned}\]

where

  • gOutputg^{\mathrm{Output}} lives in an M(δ)M(\delta)-rational subgroup HH such that ξ(H(G×G)s)=0\xi^{\prime}(H\cap(G\times G)_{s})=0;

  • γ\gamma is an M(δ)M(\delta)-rational polynomial sequence;

  • ε\varepsilon is (M(δ),N)(M(\delta),N)-smooth.

Note that when we apply Corollary 5.5, the vertical frequency of the function we have is $(s-1)!\cdot\xi^{\prime}$ and we obtain $(s-1)!\xi^{\prime}(H\cap(G\times G)_{s})=0$; we may divide by $(s-1)!$ to obtain the above. Additionally, we have implicitly used that $\xi^{\prime}$ is trivial in the $G^{\prime}$ part and abuse notation to descend $\xi^{\prime}$ to $G\times G$.

Let F=FF¯F^{\ast}=F\otimes\overline{F} and note that therefore

χ(h,n,,n)χ(n,h,n,,n)¯=F(ε(h,n,,n)gOutput(h,n,,n)γ(h,n,,n)(Γ×Γ)).\displaystyle\chi(h,n,\ldots,n)\otimes\overline{\chi(n,h,n,\ldots,n)}=F^{\ast}(\varepsilon(h,n,\ldots,n)g^{\mathrm{Output}}(h,n,\ldots,n)\cdot\gamma(h,n,\ldots,n)(\Gamma\times\Gamma)).

Step 5: The finishing touch. We now recall from (12.1) that

𝔼h[N]𝔼n[N]Δhf(n)χ(h,n,,n)ψ(n)ψh(n)1/M(δ);\mathbb{E}_{h\in[N]}\lVert\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\otimes\chi(h,n,\ldots,n)\cdot\psi(n)\cdot\psi_{h}(n)\rVert_{\infty}\geq 1/M(\delta);

here we have restricted to a coordinate of ψh\psi_{h} and we treat it as a degree (s2)(s-2) nilsequence (rather than using the nilcharacter). By applying Pigeonhole there exist q,q[s]q,q^{\prime}\in[s] such that

𝔼h[N/s]𝔼n[N/s]Δsh+qf(sn+q)χ(sh+q,sn+q,,sn+q)ψ(sn+q)ψh(sn+q)1/M(δ).\mathbb{E}_{h\in[N/s]}\lVert\mathbb{E}_{n\in[N/s]}\Delta_{sh+q^{\prime}}f(sn+q)\otimes\chi(sh+q^{\prime},sn+q,\ldots,sn+q)\cdot\psi(sn+q)\cdot\psi_{h}(sn+q)\rVert_{\infty}\geq 1/M(\delta).

By Lemma C.3, we have that

χ(sh+q,sn+q,,sn+q) and χ(sh,sn,,sn)\chi(sh+q^{\prime},sn+q,\ldots,sn+q)\text{ and }\chi(sh,sn,\ldots,sn)

are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). Applying Lemma C.6 (splitting) and adjusting ψ,ψh\psi,\psi_{h}, we may instead assume that

𝔼h[N/s]𝔼n[N/s]Δsh+qf(sn+q)χ(sh,sn,,sn)ψ(n)ψh(n)1/M(δ)\displaystyle\mathbb{E}_{h\in[N/s]}\lVert\mathbb{E}_{n\in[N/s]}\Delta_{sh+q^{\prime}}f(sn+q)\otimes\chi(sh,sn,\ldots,sn)\cdot\psi(n)\cdot\psi_{h}(n)\rVert_{\infty}\geq 1/M(\delta)

for ψ\psi of degree (s1)(s-1) and ψh\psi_{h} of degree (s2)(s-2). Now define

T(h,n):=χ(n+h,,n+h)χ(h,n,,n)¯χ(n,h,,n)¯(s1)χ(n,n,,n)¯.T(h,n):=\chi(n+h,\ldots,n+h)\otimes\overline{\chi(h,n,\ldots,n)}\otimes\overline{\chi(n,h,\ldots,n)}^{\otimes(s-1)}\otimes\overline{\chi(n,n,\ldots,n)}.

Since this is a nilcharacter, we automatically know

𝔼h[N/s]𝔼n[N/s]Δsh+qf(sn+q)χ(sh,sn,,sn)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\lVert\mathbb{E}_{n\in[N/s]}\Delta_{sh+q^{\prime}}f(sn+q)\otimes\chi(sh,sn,\ldots,sn)\cdot\psi(n)\cdot\psi_{h}(n)
T(h,n)ss1T(h,n)¯ss11/M(δ).\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\otimes T(h,n)^{\otimes s^{s-1}}\otimes\overline{T(h,n)}^{\otimes s^{s-1}}\rVert_{\infty}\geq 1/M(\delta).

We define

f~1(n)\displaystyle\widetilde{f}_{1}(n) =f(sn+q)χ(n,,n)¯ss1,\displaystyle=f(sn+q)\cdot\overline{\chi(n,\ldots,n)}^{\otimes s^{s-1}},
f~2(n+h)\displaystyle\widetilde{f}_{2}(n+h) =f(s(n+h)+q+q)¯χ(n+h,,n+h)ss1,\displaystyle=\overline{f(s(n+h)+q+q^{\prime})}\cdot\chi(n+h,\ldots,n+h)^{\otimes s^{s-1}},

which yields

𝔼h[N/s]𝔼n[N/s]f~1(n)f~2(n+h)χ(sh,sn,,sn)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\lVert\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\otimes\chi(sh,sn,\ldots,sn)\cdot\psi(n)\cdot\psi_{h}(n)
(χ(h,n,,n)¯χ(n,h,n,,n)¯(s1))ss1T(h,n)¯ss11/M(δ).\displaystyle\qquad\qquad\otimes(\overline{\chi(h,n,\ldots,n)}\otimes\overline{\chi(n,h,n,\ldots,n)}^{\otimes(s-1)})^{\otimes s^{s-1}}\otimes\overline{T(h,n)}^{\otimes s^{s-1}}\rVert_{\infty}\geq 1/M(\delta).

By applications of Lemma C.4, Lemma C.2, and Lemma C.1 we have that T(h,n)T(h,n) and

k=1s1χ(h,h,,h,n,,n)(s1k)k=2s1χ(n,h,,h,n,,n)(s1k)\bigotimes_{k=1}^{s-1}\chi(h,h,\ldots,h,n,\ldots,n)^{\binom{s-1}{k}}\otimes\bigotimes_{k=2}^{s-1}\chi(n,h,\ldots,h,n,\ldots,n)^{\binom{s-1}{k}}

are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). (There are k+1k+1 many hh’s in the first term and kk many hh’s in the second term.) Applying Lemma C.6, we may approximate each coordinate as a sum of products of multidegree (s1,s2)(s-1,s-2) and (0,s1)(0,s-1) nilsequences in variables (h,n)(h,n). Furthermore, by the second item of Lemma C.2 this new nilsequence is of similar type. So, folding everything into ψ(n)\psi(n) of degree (s1)(s-1) and the ψh(n)\psi_{h}(n) of degree (s2)(s-2), we find

𝔼h[N/s]𝔼n[N/s]f~1(n)f~2(n+h)χ(sh,sn,,sn)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\otimes\chi(sh,sn,\ldots,sn)\cdot\psi(n)\cdot\psi_{h}(n)
(χ(h,n,,n)¯χ(n,h,,n)¯(s1))ss11/M(δ).\displaystyle\qquad\qquad\qquad\qquad\otimes(\overline{\chi(h,n,\ldots,n)}\otimes\overline{\chi(n,h,\ldots,n)}^{\otimes(s-1)})^{\otimes s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta).

Furthermore note by Lemma C.3 that

χ(sh,sn,,sn) and χ(h,n,,n)ss\chi(sh,sn,\ldots,sn)\text{ and }\chi(h,n,\ldots,n)^{\otimes s^{s}}

are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). Applying Lemma 7.4 and Lemma C.6 and adjusting ψ\psi and ψh\psi_{h} yet again we have

𝔼h[N/s]𝔼n[N/s]f~1(n)f~2(n+h)χ(h,n,,n)ssψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\otimes\chi(h,n,\ldots,n)^{\otimes s^{s}}\cdot\psi(n)\cdot\psi_{h}(n)
(χ(h,n,,n)¯χ(n,h,,n)¯(s1))ss11/M(δ).\displaystyle\qquad\qquad\qquad\qquad\otimes(\overline{\chi(h,n,\ldots,n)}\otimes\overline{\chi(n,h,\ldots,n)}^{\otimes(s-1)})^{\otimes s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta).

Now by Lemma C.3 we have that

χ(h,n,,n)ssχ(h,n,,n)¯ss1 and χ(h,n,,n)(s1)ss1\chi(h,n,\ldots,n)^{\otimes s^{s}}\otimes\overline{\chi(h,n,\ldots,n)}^{\otimes s^{s-1}}\text{ and }\chi(h,n,\ldots,n)^{\otimes(s-1)\cdot s^{s-1}}

are (M(δ),M(δ),d(δ))(M(\delta),M(\delta),d(\delta))-equivalent for degree (s1)(s-1). Thus applying Lemma 7.4 and Lemma C.6 and adjusting ψ\psi and ψh\psi_{h} once again we have

𝔼h[N/s]𝔼n[N/s]f~1(n)f~2(n+h)ψ(n)ψh(n)(χ(h,n,,n)χ(n,h,,n)¯)(s1)ss1\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)\otimes(\chi(h,n,\ldots,n)\otimes\overline{\chi(n,h,\ldots,n)})^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}
1/M(δ).\displaystyle\qquad\qquad\qquad\qquad\qquad\geq 1/M(\delta).

This is finally where we may apply our earlier factorization for χ(h,n,,n)χ(n,h,,n)¯\chi(h,n,\ldots,n)\otimes\overline{\chi(n,h,\ldots,n)}. Recall that

χ(h,n,,n)χ(n,h,n,,n)¯=F(ε(h,n,,n)gOutput(h,n,,n)γ(h,n,,n)(Γ×Γ))\displaystyle\chi(h,n,\ldots,n)\otimes\overline{\chi(n,h,n,\ldots,n)}=F^{\ast}(\varepsilon(h,n,\ldots,n)g^{\mathrm{Output}}(h,n,\ldots,n)\cdot\gamma(h,n,\ldots,n)(\Gamma\times\Gamma))

where γ\gamma is M(δ)M(\delta)-periodic and ε\varepsilon is (M(δ),N)(M(\delta),N)-smooth. Let QQ denote the period of γ\gamma (i.e., changing any argument by a multiple of QQ keeps its Γ×Γ\Gamma\times\Gamma coset the same) and take ρ=exp(log(1/δ)Os(1))\rho=\exp(-\log(1/\delta)^{O_{s}(1)}) where the implicit constant is sufficiently large. Break [N/s][N/s] into arithmetic progressions of length roughly ρN\rho N and common difference QQ; call these 𝒫1,,𝒫\mathcal{P}_{1},\ldots,\mathcal{P}_{\ell}. There exist ε𝒫i,h\varepsilon_{\mathcal{P}_{i,h}} and γ𝒫i,h\gamma_{\mathcal{P}_{i,h}} such that

𝔼h[N/s]𝔼n[N/s]i=1𝟙n𝒫if~1(n)f~2(n+h)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\sum_{i=1}^{\ell}\mathbbm{1}_{n\in\mathcal{P}_{i}}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)
(F(ε𝒫i,hγ𝒫i,h(γ𝒫i,h1gOutput(h,n,,n)γ𝒫i,h)(Γ×Γ)))(s1)ss11/M(δ)\displaystyle\qquad\qquad\otimes(F^{\ast}(\varepsilon_{\mathcal{P}_{i,h}}\gamma_{\mathcal{P}_{i,h}}(\gamma_{\mathcal{P}_{i,h}}^{-1}g^{\mathrm{Output}}(h,n,\ldots,n)\gamma_{\mathcal{P}_{i,h}})(\Gamma\times\Gamma)))^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta)

where dG×G(ε𝒫i,h,idG×G)+dG×G(γ𝒫i,h,idG×G)exp(log(1/δ)Os(1))d_{G\times G}(\varepsilon_{\mathcal{P}_{i,h}},\mathrm{id}_{G\times G})+d_{G\times G}(\gamma_{\mathcal{P}_{i,h}},\mathrm{id}_{G\times G})\leq\exp(\log(1/\delta)^{O_{s}(1)}) and γ𝒫i,h\gamma_{\mathcal{P}_{i,h}} is exp(log(1/δ)Os(1))\exp(\log(1/\delta)^{O_{s}(1)})-rational.

By Pigeonhole, there exists an index ii such that

𝔼h[N/s]𝔼n[N/s]𝟙n𝒫if~1(n)f~2(n+h)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\mathbbm{1}_{n\in\mathcal{P}_{i}}\cdot\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)
(F(ε𝒫i,hγ𝒫i,h(γ𝒫i,h1gOutput(h,n,,n)γ𝒫i,h)(Γ×Γ)))(s1)ss11/M(δ).\displaystyle\qquad\qquad\otimes(F^{\ast}(\varepsilon_{\mathcal{P}_{i,h}}\gamma_{\mathcal{P}_{i,h}}(\gamma_{\mathcal{P}_{i,h}}^{-1}g^{\mathrm{Output}}(h,n,\ldots,n)\cdot\gamma_{\mathcal{P}_{i,h}})(\Gamma\times\Gamma)))^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta).

As γ𝒫i,h\gamma_{\mathcal{P}_{i,h}} is exp(log(1/δ)Os(1))\exp(\log(1/\delta)^{O_{s}(1)})-rational and bounded, it takes on only exp(log(1/δ)Os(1))\exp(\log(1/\delta)^{O_{s}(1)}) possible values. Thus by Pigeonhole, there is γΓ×Γ\gamma\in\Gamma\times\Gamma such that dG×G(γ,idG×G)exp(log(1/δ)Os(1))d_{G\times G}(\gamma,\mathrm{id}_{G\times G})\leq\exp(\log(1/\delta)^{O_{s}(1)}) and γ\gamma is exp(log(1/δ)Os(1))\exp(\log(1/\delta)^{O_{s}(1)})-rational such that

𝔼h[N/s]𝔼n[N/s]𝟙n𝒫if~1(n)f~2(n+h)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\mathbbm{1}_{n\in\mathcal{P}_{i}}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)
(F(ε𝒫i,hγgConj(h,n,,n)(Γ×Γ)))(s1)ss11/M(δ),\displaystyle\qquad\qquad\otimes(F^{\ast}(\varepsilon_{\mathcal{P}_{i,h}}\gamma g^{\mathrm{Conj}}(h,n,\ldots,n)(\Gamma\times\Gamma)))^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta),

where $g^{\mathrm{Conj}}=\gamma^{-1}g^{\mathrm{Output}}\gamma$. Finally, rounding $\varepsilon_{\mathcal{P}_{i,h}}\gamma$ to an $\exp(-\log(1/\delta)^{O_{s}(1)})$-net and noting it is $\exp(\log(1/\delta)^{O_{s}(1)})$-bounded, there exists $\varepsilon$ such that

𝔼h[N/s]𝔼n[N/s]𝟙n𝒫if~1(n)f~2(n+h)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\mathbbm{1}_{n\in\mathcal{P}_{i}}\cdot\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)
(F(εgConj(h,n,,n)(Γ×Γ)))(s1)ss11/M(δ)\displaystyle\qquad\qquad\otimes(F^{\ast}(\varepsilon g^{\mathrm{Conj}}(h,n,\ldots,n)(\Gamma\times\Gamma)))^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta)

and dG×G(ε,idG×G)exp(log(1/δ)Os(1))d_{G\times G}(\varepsilon,\mathrm{id}_{G\times G})\leq\exp(\log(1/\delta)^{O_{s}(1)}), as long as ρ\rho was chosen small enough.

By Lemma 7.1, there exists Θh\Theta_{h} such that

𝔼h[N/s]𝔼n[N/s]e(Θhn)f~1(n)f~2(n+h)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}e(\Theta_{h}n)\cdot\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)
(F(εgConj(h,n,,n)(Γ×Γ)))(s1)ss11/M(δ).\displaystyle\qquad\qquad\otimes(F^{\ast}(\varepsilon g^{\mathrm{Conj}}(h,n,\ldots,n)(\Gamma\times\Gamma)))^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta).

As (s2)1(s-2)\geq 1, we may absorb e(Θhn)e(\Theta_{h}n) into ψh(n)\psi_{h}(n) and obtain

𝔼h[N/s]𝔼n[N/s]f~1(n)f~2(n+h)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)
(F(εgConj(h,n,,n)(Γ×Γ)))(s1)ss11/M(δ).\displaystyle\qquad\qquad\otimes(F^{\ast}(\varepsilon g^{\mathrm{Conj}}(h,n,\ldots,n)(\Gamma\times\Gamma)))^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta).

Replacing FF^{\ast} with FFinal(g)=F(εgΓ)F^{\mathrm{Final}}(g)=F^{\ast}(\varepsilon g\Gamma) and writing gFinal(h,n)=gConj(h,n,,n)g^{\mathrm{Final}}(h,n)=g^{\mathrm{Conj}}(h,n,\ldots,n), we have

𝔼h[N/s]𝔼n[N/s]f~1(n)f~2(n+h)ψ(n)ψh(n)\displaystyle\mathbb{E}_{h\in[N/s]}\bigg{\lVert}\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)
(FFinal(gFinal(h,n)(Γ×Γ)))(s1)ss11/M(δ).\displaystyle\qquad\qquad\otimes(F^{\mathrm{Final}}(g^{\mathrm{Final}}(h,n)(\Gamma\times\Gamma)))^{\otimes(s-1)s^{s-1}}\bigg{\rVert}_{\infty}\geq 1/M(\delta).

Now $g^{\mathrm{Final}}(h,n)$ takes values in $\gamma^{-1}H\gamma$ such that $\xi^{\prime}(\gamma^{-1}H\gamma\cap(G\times G)_{s})=0$. The key point is to note that $F^{\mathrm{Final}}$ is right-invariant under $(\gamma^{-1}H\gamma)\cap(G\times G)_{s}$ since it has $(G\times G)_{s}$-vertical frequency $\xi^{\prime}$. Note that $\gamma^{-1}H\gamma$ has complexity bounded by $M(\delta)$ due to [42, Lemma B.15]. Furthermore $F^{\mathrm{Final}}$ is $M(\delta)$-Lipschitz on $\gamma^{-1}H\gamma$ by [42, Lemma B.9, B.15]. Taking the quotient by $(\gamma^{-1}H\gamma)\cap(G\times G)_{s}$ gives that each coordinate of $(F^{\mathrm{Final}}(g^{\mathrm{Final}}(h,n)(\Gamma\times\Gamma)))^{\otimes s^{s-1}}$ may be realized as a complexity $(M(\delta),d(\delta))$ nilsequence of degree $(s-1)$.

We apply Pigeonhole in the coordinates of $(F^{\mathrm{Final}})^{\otimes s^{s-1}}$ and then Lemma C.6 to approximate the result as a sum of products of multidegree $(s-1,s-2)$ and $(0,s-1)$ nilsequences in variables $(h,n)$. So, again folding everything into $\psi(n)$ of degree $(s-1)$ and the $\psi_{h}(n)$ of degree $(s-2)$, we find

𝔼h[N/s]𝔼n[N/s]f~1(n)f~2(n+h)ψ(n)ψh(n)1/M(δ)\displaystyle\mathbb{E}_{h\in[N/s]}\lVert\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1}(n)\otimes\widetilde{f}_{2}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)\rVert_{\infty}\geq 1/M(\delta)

The functions $\widetilde{f}_{1}$ and $\widetilde{f}_{2}$ are vector-valued, but by Pigeonhole there exist coordinates $j_{1},j_{2}$ of the vectors $\widetilde{f}_{1}$ and $\widetilde{f}_{2}$ such that

𝔼h[N/s]|𝔼n[N/s]f~1,j1(n)f~2,j2(n+h)ψ(n)ψh(n)|1/M(δ).\displaystyle\mathbb{E}_{h\in[N/s]}|\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1,j_{1}}(n)\widetilde{f}_{2,j_{2}}(n+h)\cdot\psi(n)\cdot\psi_{h}(n)|\geq 1/M(\delta).

Since ψh(n)\psi_{h}(n) is a nilsequence of degree (s2)(s-2) and complexity (M(δ),d(δ))(M(\delta),d(\delta)), by the converse of the inverse theorem (see Lemma B.5) we have that

𝔼h[N/s]f~1,j1()f~2,j2(+h)ψ()Us1[N/s]2s11/M(δ).\displaystyle\mathbb{E}_{h\in[N/s]}\lVert\widetilde{f}_{1,j_{1}}(\cdot)\widetilde{f}_{2,j_{2}}(\cdot+h)\psi(\cdot)\rVert_{U^{s-1}[N/s]}^{2^{s-1}}\geq 1/M(\delta).

By the Gowers–Cauchy–Schwarz inequality (e.g. [17, Lemma 3.8]), we have that

𝔼n[N/s]f~1,j1(n)ψ(n)Us[N/s]1/M(δ).\displaystyle\lVert\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1,j_{1}}(n)\psi(n)\rVert_{U^{s}[N/s]}\geq 1/M(\delta).

By induction, there is a nilsequence Θ(n)\Theta(n) of degree (s1)(s-1) and complexity (M(δ),d(δ))(M(\delta),d(\delta)) such that

|𝔼n[N/s]f~1,j1(n)ψ(n)Θ(n)|1/M(δ).\displaystyle|\mathbb{E}_{n\in[N/s]}\widetilde{f}_{1,j_{1}}(n)\psi(n)\Theta(n)|\geq 1/M(\delta).

Now recall that

f~1(n)=f(sn+q)χ(n,,n)¯ss1.\widetilde{f}_{1}(n)=f(sn+q)\cdot\overline{\chi(n,\ldots,n)}^{\otimes s^{s-1}}.

Each coordinate of $\overline{\chi(n,\ldots,n)}^{\otimes s^{s-1}}$ is a degree $s$ nilsequence of complexity $(M(\delta),d(\delta))$; say the $j_{1}$-th coordinate is $\Theta^{\prime}(n)$, and thus we have

|𝔼n[N/s]f(sn+q)Θ(n)ψ(n)Θ(n)|1/M(δ).\displaystyle|\mathbb{E}_{n\in[N/s]}f(sn+q)\Theta^{\prime}(n)\psi(n)\Theta(n)|\geq 1/M(\delta).

This is equivalent to

|𝔼n[N]𝟙[nqmods]f(n)Θ((nq)/s)ψ((nq)/s)Θ((nq)/s)|1/M(δ).\displaystyle|\mathbb{E}_{n\in[N]}\mathbbm{1}[n\equiv q~{}\mathrm{mod}~{}s]f(n)\Theta^{\prime}((n-q)/s)\psi((n-q)/s)\Theta((n-q)/s)|\geq 1/M(\delta).

Note the condition

𝟙[nqmods]=s1j=0s1e(j(nq)s)\mathbbm{1}[n\equiv q~{}\mathrm{mod}~{}s]=s^{-1}\sum_{j=0}^{s-1}e\bigg{(}\frac{j\cdot(n-q)}{s}\bigg{)}

and thus there exists $j$ such that

|𝔼n[N]f(n)Θ((nq)/s)ψ((nq)/s)Θ((nq)/s)e(jn/s)|1/M(δ).\displaystyle|\mathbb{E}_{n\in[N]}f(n)\Theta^{\prime}((n-q)/s)\psi((n-q)/s)\Theta((n-q)/s)e(jn/s)|\geq 1/M(\delta).

The desired nilsequence is then

Θ((nq)/s)ψ((nq)/s)Θ((nq)/s)e(jn/s)¯\overline{\Theta^{\prime}((n-q)/s)\psi((n-q)/s)\Theta((n-q)/s)e(jn/s)}

which is seen to have degree ss and complexity (M(δ),d(δ))(M(\delta),d(\delta)). We have finally won. ∎
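The residue-class indicator identity used near the end of the proof above, $\mathbbm{1}[n\equiv q~\mathrm{mod}~s]=s^{-1}\sum_{j=0}^{s-1}e(j(n-q)/s)$, can be checked numerically. The following is a minimal sketch, assuming the standard convention $e(x)=\exp(2\pi ix)$; the function names `e`, `residue_indicator`, and `check_identity` are illustrative and not taken from the paper.

```python
import cmath

def e(x: float) -> complex:
    """Additive character e(x) = exp(2*pi*i*x) (assumed convention)."""
    return cmath.exp(2j * cmath.pi * x)

def residue_indicator(n: int, q: int, s: int) -> complex:
    """Right-hand side of the identity: s^{-1} * sum_{j=0}^{s-1} e(j(n-q)/s)."""
    return sum(e(j * (n - q) / s) for j in range(s)) / s

def check_identity(s: int, n_max: int = 50, tol: float = 1e-9) -> bool:
    """Compare the exponential sum with the indicator of n = q mod s
    for every residue q and every n in {1, ..., n_max}."""
    for q in range(s):
        for n in range(1, n_max + 1):
            expected = 1.0 if (n - q) % s == 0 else 0.0
            if abs(residue_indicator(n, q, s) - expected) > tol:
                return False
    return True
```

Since the indicator is an average of $s$ characters, a lower bound on the correlation with the indicator passes to at least one of the characters $e(jn/s)$, which is the pigeonhole step used in the proof.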

Appendix A On approximate homomorphisms

In this section, we give a number of basic results regarding approximate homomorphisms. The results in this section are, by now, well known consequences of work of Sanders [52]. The proof we give is essentially that in [43], modulo being forced to deal with slight error terms and operating over \mathbb{Z}. We dispose of these error terms via a rounding trick of Green, Tao, and Ziegler [32, Appendix C].

Lemma A.1.

Fix δ(0,1/2)\delta\in(0,1/2), let H1,H2,H3,H4[N]H_{1},H_{2},H_{3},H_{4}\subseteq[N] and let functions fi:Hidf_{i}\colon H_{i}\to\mathbb{R}^{d} be such that there are at least δN3\delta N^{3} additive tuples h1+h2=h3+h4h_{1}+h_{2}=h_{3}+h_{4} with

(f1(h1)+f2(h2)f3(h3)f4(h4))j/εj\lVert(f_{1}(h_{1})+f_{2}(h_{2})-f_{3}(h_{3})-f_{4}(h_{4}))_{j}\rVert_{\mathbb{R}/\mathbb{Z}}\leq\varepsilon_{j}

for all 1jd1\leq j\leq d. Then there exists H1H1H_{1}^{\prime}\subseteq H_{1} with |H1|exp((dlog(1/δ))O(1))N|H_{1}^{\prime}|\geq\exp(-(d\log(1/\delta))^{O(1)})N such that

(f1(h)i=1dai{αih}b)j/εj\bigg{\lVert}\bigg{(}f_{1}(h)-\sum_{i=1}^{d^{\prime}}a_{i}\{\alpha_{i}h\}-b\bigg{)}_{j}\bigg{\rVert}_{\mathbb{R}/\mathbb{Z}}\leq\varepsilon_{j}

for all hH1h\in H_{1}^{\prime}, for appropriate choices of d(dlog(1/δ))O(1)d^{\prime}\leq(d\log(1/\delta))^{O(1)}, ai,bda_{i},b\in\mathbb{R}^{d}, and αi(1/N)\alpha_{i}\in(1/N^{\prime})\mathbb{Z} where NN^{\prime} is a prime between 100N100N and 200N200N.

We deduce the result from the following variant which is the same statement modulo not having an error term.

Lemma A.2.

Fix $\delta\in(0,1/2)$. Let $H_{1},H_{2},H_{3},H_{4}\subseteq[N]$ and $f_{i}\colon H_{i}\to\mathbb{R}^{d}$ be such that there are at least $\delta N^{3}$ additive tuples $h_{1}+h_{2}=h_{3}+h_{4}$ with

f1(h1)+f2(h2)f3(h3)f4(h4)d.f_{1}(h_{1})+f_{2}(h_{2})-f_{3}(h_{3})-f_{4}(h_{4})\in\mathbb{Z}^{d}.

Then there exists H1H1H_{1}^{\prime}\subseteq H_{1} with |H1|exp(log(1/δ)O(1))N|H_{1}^{\prime}|\geq\exp(-\log(1/\delta)^{O(1)})N such that

f1(h)i=1dai{αih}bdf_{1}(h)-\sum_{i=1}^{d^{\prime}}a_{i}\{\alpha_{i}h\}-b\in\mathbb{Z}^{d}

for all hH1h\in H_{1}^{\prime}, for appropriate choices of dlog(1/δ)O(1)d^{\prime}\leq\log(1/\delta)^{O(1)}, ai,bda_{i},b\in\mathbb{R}^{d}, and αi(1/N)\alpha_{i}\in(1/N^{\prime})\mathbb{Z} where NN^{\prime} is a prime between 100N100N and 200N200N.

We briefly give the deduction, and then in the sequel focus on Lemma A.2.

Proof of Lemma A.1 given Lemma A.2.

Round each value of fif_{i} to the nearest point in the lattice (ε1,,εd)(\varepsilon_{1}\mathbb{Z},\ldots,\varepsilon_{d}\mathbb{Z}) to form fi~\widetilde{f_{i}} (breaking ties arbitrarily). We have that

(f1~(h1)+f2~(h2)f3~(h3)f4~(h4))j/5εj\lVert(\widetilde{f_{1}}(h_{1})+\widetilde{f_{2}}(h_{2})-\widetilde{f_{3}}(h_{3})-\widetilde{f_{4}}(h_{4}))_{j}\rVert_{\mathbb{R}/\mathbb{Z}}\leq 5\varepsilon_{j}

for at least $\delta N^{3}$ additive tuples.

Note however that

f1~(h1)+f2~(h2)f3~(h3)f4~(h4)(ε1,,εd)\widetilde{f_{1}}(h_{1})+\widetilde{f_{2}}(h_{2})-\widetilde{f_{3}}(h_{3})-\widetilde{f_{4}}(h_{4})\in(\varepsilon_{1}\mathbb{Z},\ldots,\varepsilon_{d}\mathbb{Z})

and that there are at most 11d11^{d} lattice points in (ε1,,εd)(\varepsilon_{1}\mathbb{Z},\ldots,\varepsilon_{d}\mathbb{Z}) which are at most 5εj5\varepsilon_{j} in the jj-th direction from the origin in all dd directions. Thus there is a vector w(ε1,,εd)w\in(\varepsilon_{1}\mathbb{Z},\ldots,\varepsilon_{d}\mathbb{Z}) such that

f1~(h1)+f2~(h2)f3~(h3)f4~(h4)+wd\widetilde{f_{1}}(h_{1})+\widetilde{f_{2}}(h_{2})-\widetilde{f_{3}}(h_{3})-\widetilde{f_{4}}(h_{4})+w\in\mathbb{Z}^{d}

for at least $11^{-d}\delta N^{3}$ additive tuples. Applying Lemma A.2 with $\widetilde{f_{1}}$, $\widetilde{f_{2}}$, $\widetilde{f_{3}}$, and $\widetilde{f_{4}}-w$ immediately gives the desired result. ∎
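The two elementary facts behind the rounding step can be illustrated with a short sketch: rounding to the lattice $(\varepsilon_{1}\mathbb{Z},\ldots,\varepsilon_{d}\mathbb{Z})$ moves the $j$-th coordinate by at most $\varepsilon_{j}/2$, and the box of lattice points within $5\varepsilon_{j}$ of the origin in every direction has exactly $11^{d}$ elements. The helper names below are hypothetical, and floating-point arithmetic is used only for illustration.

```python
from itertools import product

def round_to_lattice(v, eps):
    """Round each coordinate v[j] to a nearest multiple of eps[j].
    Ties are broken by Python's round(), which is one arbitrary choice."""
    return tuple(e * round(x / e) for x, e in zip(v, eps))

def count_box_points(d, radius=5):
    """Count integer vectors k in Z^d with |k_j| <= radius for all j.
    These index the lattice points (k_j * eps_j)_j lying within
    radius * eps_j of the origin in the j-th direction."""
    return sum(1 for _ in product(range(-radius, radius + 1), repeat=d))
```

For instance, `count_box_points(d)` agrees with `11 ** d` for small `d`, which is the source of the factor $11^{-d}$ lost when pigeonholing over the offset $w$.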

We now require the notion of a Bohr set in an abelian group.

Definition A.3.

Given an abelian group GG and a set SG^S\subseteq\widehat{G}, we define the Bohr set of radius ρ\rho to be

B(S,ρ):={xG:sx/ρ for all sS}.B(S,\rho):=\{x\in G\colon\lVert s\cdot x\rVert_{\mathbb{R}/\mathbb{Z}}\leq\rho\emph{ for all }s\in S\}.
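As a concrete illustration of this definition in the case $G=\mathbb{Z}/N\mathbb{Z}$, the sketch below enumerates a Bohr set by brute force. It assumes the usual identification of the dual group with $\mathbb{Z}/N\mathbb{Z}$, so that a frequency $r$ acts by $x\mapsto rx/N$; the function names are illustrative only.

```python
def circle_norm_times_N(m: int, N: int) -> int:
    """Return N * ||m/N||_{R/Z}, i.e. the distance from m to the nearest
    multiple of N, computed in exact integer arithmetic."""
    m %= N
    return min(m, N - m)

def bohr_set(N: int, S, rho: float):
    """Elements x of Z/NZ with ||r*x/N||_{R/Z} <= rho for every frequency r in S."""
    return [x for x in range(N)
            if all(circle_norm_times_N(r * x, N) <= rho * N for r in S)]
```

With `N = 12`, `S = [1]`, and `rho = 0.25`, this returns the symmetric interval `[0, 1, 2, 3, 9, 10, 11]` around the origin.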

We first require the fact that the four-fold sumset of a set with small doubling contains a Bohr set of small dimension and large radius. This is an immediate consequence of work of Sanders [52, Theorem 1.1], which produces a large symmetric coset progression, and a proposition of Milićević [48, Proposition 27], which produces a Bohr set inside a large symmetric coset progression. This is explicitly [48, Corollary 28].

Lemma A.4.

Let A/NA\subseteq\mathbb{Z}/N\mathbb{Z} be such that |A|N/K|A|\geq N/K. Then there exists S/N^S\subseteq\widehat{\mathbb{Z}/N\mathbb{Z}} with |S|log(2K)O(1)|S|\leq\log(2K)^{O(1)} and 1/ρlog(2K)O(1)1/\rho\leq\log(2K)^{O(1)} such that B(S,ρ)2A2AB(S,\rho)\subseteq 2A-2A.

We next require the notion of a Freiman homomorphism.

Definition A.5.

A function $f\colon A\to B$ (with $A$ and $B$ being subsets of possibly different abelian groups) is a $k$-Freiman homomorphism if for all $a_{i},a_{i}^{\prime}\in A$ satisfying

a1++ak=a1++aka_{1}+\cdots+a_{k}=a_{1}^{\prime}+\cdots+a_{k}^{\prime}

we have

f(a1)++f(ak)=f(a1)++f(ak).f(a_{1})+\cdots+f(a_{k})=f(a_{1}^{\prime})+\cdots+f(a_{k}^{\prime}).

When kk is not specified, we will implicitly have k=2k=2.
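The definition can be tested by brute force on small examples. The sketch below is an illustration only, assuming $A$ is a finite set of integers and $f$ is given as a dictionary with integer values; `is_freiman_hom` is a hypothetical helper name. Because addition is commutative, it suffices to range over multisets of size $k$.

```python
from itertools import combinations_with_replacement

def is_freiman_hom(f, A, k=2):
    """Check whether f (a dict on A) is a k-Freiman homomorphism:
    equal k-fold sums of elements of A must have equal k-fold sums of values."""
    value_of_sum = {}
    for tup in combinations_with_replacement(sorted(A), k):
        key = sum(tup)                  # a_1 + ... + a_k
        val = sum(f[a] for a in tup)    # f(a_1) + ... + f(a_k)
        if key in value_of_sum and value_of_sum[key] != val:
            return False
        value_of_sum.setdefault(key, val)
    return True
```

An affine map such as $a\mapsto 3a+1$ passes the check for every $k$, while $a\mapsto a^{2}$ on $\{0,\ldots,5\}$ fails already for $k=2$ since $0+2=1+1$ but $0+4\neq 1+1$.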

We will also require the following basic lemma, which converts a Freiman homomorphism on a Bohr set into a “bracket” linear function on a slightly smaller Bohr set; the proof is a simplification of [23, Proposition 10.8].

Lemma A.6.

Consider S/N^S\subseteq\widehat{\mathbb{Z}/N\mathbb{Z}} and ρ(0,1/4)\rho\in(0,1/4) with Freiman homomorphism f:B(S,ρ)/f\colon B(S,\rho)\to\mathbb{R}/\mathbb{Z}. Taking ρ=ρ|S|2|S|\rho^{\prime}=\rho\cdot|S|^{-2|S|}, we have for all nB(S,ρ)n\in B(S,\rho^{\prime}) that

f(n)(αiSai{αin}+γ),f(n)-\Big{(}\sum_{\alpha_{i}\in S}a_{i}\{\alpha_{i}n\}+\gamma\Big{)}\in\mathbb{Z},

for appropriate choices of ai,γa_{i},\gamma\in\mathbb{R}.

Proof.

By [23, Proposition 10.5], we have that

B(S,ρ|S|2|S|)PB(S,ρ)B(S,\rho\cdot|S|^{-2|S|})\subseteq P\subseteq B(S,\rho)

where PP is a proper generalized arithmetic progression {i=1dini:ni[±Ni]}\{\sum_{i=1}^{d}\ell_{i}n_{i}\colon n_{i}\in[\pm N_{i}]\} of rank d|S|d\leq|S|. Furthermore ({αi})αS(\{\alpha\cdot\ell_{i}\})_{\alpha\in S} for 1id1\leq i\leq d are linearly independent as vectors in S\mathbb{R}^{S}.

Note that for |ni|Ni|n_{i}|\leq N_{i}, we have

(A.1) f(i=1dini)f(0)=i=1dni(f(i)f(0)).f\bigg{(}\sum_{i=1}^{d}\ell_{i}n_{i}\bigg{)}-f(0)=\sum_{i=1}^{d}n_{i}(f(\ell_{i})-f(0)).

Furthermore letting Φ:B(S,ρ)S\Phi\colon B(S,\rho)\to\mathbb{R}^{S} denote Φ(x)=({αx})αS\Phi(x)=(\{\alpha\cdot x\})_{\alpha\in S} we have that

Φ(x)+Φ(y)=Φ(x+y);\Phi(x)+\Phi(y)=\Phi(x+y);

we have used crucially that ρ<1/4\rho<1/4 here. Therefore, by a simple inductive argument we see

Φ(i=1dini)=i=1dniΦ(i)\Phi\bigg{(}\sum_{i=1}^{d}\ell_{i}n_{i}\bigg{)}=\sum_{i=1}^{d}n_{i}\Phi(\ell_{i})

if $n_{i}\in[\pm N_{i}]$ for all $1\leq i\leq d$.

By the above linear independence, there exists uiSu_{i}\in\mathbb{R}^{S} such that uiΦ(i)=1u_{i}\cdot\Phi(\ell_{i})=1 and uiΦ(j)=0u_{i}\cdot\Phi(\ell_{j})=0 for jij\neq i. Therefore if nPn\in P is such that n=i=1dinin=\sum_{i=1}^{d}\ell_{i}n_{i}, we have that

\[n_{i}=u_{i}\cdot\sum_{j=1}^{d}n_{j}\Phi(\ell_{j})=u_{i}\cdot\Phi(n)=\sum_{\alpha\in S}(u_{i})_{\alpha}\cdot\{\alpha n\}.\]

The lemma then follows by plugging into (A.1). ∎
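The additivity $\Phi(x)+\Phi(y)=\Phi(x+y)$ used in the proof above, and its dependence on $\rho<1/4$, can be checked in exact arithmetic for a single frequency. This sketch assumes that $\{\cdot\}$ denotes the signed fractional part with values in $(-1/2,1/2]$, which is an assumption about the paper's convention; the function names are illustrative.

```python
from fractions import Fraction
import math

def signed_frac(t: Fraction) -> Fraction:
    """Representative of t mod 1 in (-1/2, 1/2] (assumed meaning of {t})."""
    r = t - math.floor(t)  # representative in [0, 1)
    return r if r <= Fraction(1, 2) else r - 1

def additivity_holds(N: int, r: int, rho: Fraction) -> bool:
    """Check {alpha(x+y)} = {alpha x} + {alpha y} for all x, y in the
    Bohr set B({alpha}, rho) inside Z/NZ, where alpha = r/N."""
    alpha = Fraction(r, N)
    bohr = [x for x in range(N) if abs(signed_frac(alpha * x)) <= rho]
    return all(
        signed_frac(alpha * (x + y)) == signed_frac(alpha * x) + signed_frac(alpha * y)
        for x in bohr for y in bohr
    )
```

With `N = 101` and `r = 7`, additivity holds for `rho = Fraction(1, 5)` but fails for `rho = Fraction(2, 5)`, where two fractional parts near $0.4$ sum to a value outside $(-1/2,1/2]$.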

We now recall the definition of additive energy.

Definition A.7.

Given (finite) subsets A1,A2,A3,A4A_{1},A_{2},A_{3},A_{4} of an abelian group GG, define the additive energy E(A1,A2,A3,A4)E(A_{1},A_{2},A_{3},A_{4}) to be

E(A1,A2,A3,A4)=xiAi𝟙[x1+x2=x3+x4]E(A_{1},A_{2},A_{3},A_{4})=\sum_{x_{i}\in A_{i}}\mathbbm{1}[x_{1}+x_{2}=x_{3}+x_{4}]

and let E(A)=E(A,A,A,A)E(A)=E(A,A,A,A).

Note that one has the trivial bound E(A)|A|3E(A)\leq|A|^{3}. Furthermore via a standard Cauchy–Schwarz argument (similar to e.g. [58, Corollary 2.10]) we have

E(A1,A2,A3,A4)i=14E(Ai)1/4.E(A_{1},A_{2},A_{3},A_{4})\leq\prod_{i=1}^{4}E(A_{i})^{1/4}.
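For small sets of integers, the additive energy and the two bounds just stated can be computed directly. The following is a minimal sketch with a hypothetical helper `additive_energy`; it counts solutions to $x_{1}+x_{2}=x_{3}+x_{4}$ by matching the multiplicities of pairwise sums.

```python
from collections import Counter

def additive_energy(A1, A2, A3, A4):
    """Number of (x1, x2, x3, x4) in A1 x A2 x A3 x A4 with x1 + x2 = x3 + x4."""
    sums12 = Counter(x1 + x2 for x1 in A1 for x2 in A2)
    sums34 = Counter(x3 + x4 for x3 in A3 for x4 in A4)
    # Each common value t contributes (#representations in A1+A2) * (#representations in A3+A4).
    return sum(c * sums34[t] for t, c in sums12.items())

def energy(A):
    """E(A) = E(A, A, A, A)."""
    return additive_energy(A, A, A, A)
```

For the progression $A=\{0,1,2,3\}$ the sums $0,\ldots,6$ have $1,2,3,4,3,2,1$ representations, so `energy(range(4))` equals $1+4+9+16+9+4+1=44$, below the trivial bound $|A|^{3}=64$.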
Proof of Lemma A.2.

Let Γi={(hi,fi(hi)modd):hiHi}×(/)d\Gamma_{i}=\{(h_{i},f_{i}(h_{i})~{}\mathrm{mod}~{}\mathbb{Z}^{d})\colon h_{i}\in H_{i}\}\subseteq\mathbb{Z}\times(\mathbb{R}/\mathbb{Z})^{d}, which is a graph (i.e., for every xx\in\mathbb{Z} there is at most one y(/)dy\in(\mathbb{R}/\mathbb{Z})^{d} with (x,y)Γi(x,y)\in\Gamma_{i}). By assumption we have

E(Γ1,Γ2,Γ3,Γ4)δN3.E(\Gamma_{1},\Gamma_{2},\Gamma_{3},\Gamma_{4})\geq\delta N^{3}.

We have

E(Γ1,Γ2,Γ3,Γ4)i=14E(Γi)1/4E(Γ1)1/4N9/4E(\Gamma_{1},\Gamma_{2},\Gamma_{3},\Gamma_{4})\leq\prod_{i=1}^{4}E(\Gamma_{i})^{1/4}\leq E(\Gamma_{1})^{1/4}N^{9/4}

and therefore E(Γ1)δ4N3E(\Gamma_{1})\geq\delta^{4}N^{3}. By Balog–Szemerédi–Gowers (see [23, Theorem 5.2]), there is ΓΓ1\Gamma^{\prime}\subseteq\Gamma_{1} such that |Γ|δO(1)N|\Gamma^{\prime}|\geq\delta^{O(1)}N while |ΓΓ|δO(1)N|\Gamma^{\prime}-\Gamma^{\prime}|\leq\delta^{-O(1)}N.

Let A=(8Γ8Γ)({0}×(/)d)A=(8\Gamma^{\prime}-8\Gamma^{\prime})\cap(\{0\}\times(\mathbb{R}/\mathbb{Z})^{d}). Since Γ\Gamma^{\prime} is a graph, we have that |Γ+A|=|Γ||A||\Gamma^{\prime}+A|=|\Gamma^{\prime}||A|. However |Γ+A||9Γ8Γ|δO(1)N|\Gamma^{\prime}+A|\leq|9\Gamma^{\prime}-8\Gamma^{\prime}|\leq\delta^{-O(1)}N by the Plünnecke–Ruzsa inequality (e.g. [23, Theorem 5.3]) and thus |A|δO(1)|A|\leq\delta^{-O(1)}.

Now, by abuse of notation we may view AA as a subset of (/)d(\mathbb{R}/\mathbb{Z})^{d}. We claim there exists TdT\subseteq\mathbb{Z}^{d} with |T|O(log(1/δ))|T|\leq O(\log(1/\delta)) such that AB(T,1/4)={0}A\cap B(T,1/4)=\{0\}; we give a proof which is essentially identical to that in [23, Lemma 8.3]. Note that given any w(/)d{0}w\in(\mathbb{R}/\mathbb{Z})^{d}\setminus\{0\} we have

lim supMv{M,,M}d[vw/<1/4]3/4.\limsup_{M\to\infty}\mathbb{P}_{v\in\{-M,\ldots,M\}^{d}}[\lVert v\cdot w\rVert_{\mathbb{R}/\mathbb{Z}}<1/4]\leq 3/4.

This follows immediately from noting that if ww has an irrational coordinate the probability tends to 1/21/2 by Weyl’s equidistribution criterion while if ww is rational the limiting probability is at most say 2/32/3. Choosing an integer vector vv which kills at least 1/41/4 of the set iteratively then immediately gives the desired lemma.
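The rational case of the probability bound can be illustrated in a one-dimensional model. This sketch assumes that, for rational $w$, the quantity $v\cdot w$ is equidistributed in the limit over the multiples of $1/q$ in $\mathbb{R}/\mathbb{Z}$ for some $q\geq 2$, and computes the proportion of those multiples with $\lVert k/q\rVert_{\mathbb{R}/\mathbb{Z}}<1/4$. The name `small_fraction` is hypothetical.

```python
from fractions import Fraction

def small_fraction(q: int) -> Fraction:
    """Proportion of residues k mod q with ||k/q||_{R/Z} < 1/4,
    using the exact integer test 4 * min(k, q - k) < q."""
    count = sum(1 for k in range(q) if 4 * min(k, q - k) < q)
    return Fraction(count, q)

def worst_case(q_max: int) -> Fraction:
    """Largest proportion over all denominators 2 <= q <= q_max."""
    return max(small_fraction(q) for q in range(2, q_max + 1))
```

In this model the largest proportion over $2\leq q\leq 200$ is $3/5$, attained at $q=5$, which is consistent with the bound of $2/3$ quoted above.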

Let ψ:(/)d(/)T\psi\colon(\mathbb{R}/\mathbb{Z})^{d}\to(\mathbb{R}/\mathbb{Z})^{T} be defined as ψ(ξ)=(t(ξ))tT\psi(\xi)=(t(\xi))_{t\in T}. Now let τ=27\tau=2^{-7}. By averaging there exists a cube Q=x+[0,τ)TQ=\vec{x}+[0,\tau)^{T} such that

Γ~:={(h,f1(h))Γ:ψ(f1(h))Q}\widetilde{\Gamma}:=\{(h,f_{1}(h))\in\Gamma^{\prime}\colon\psi(f_{1}(h))\in Q\}

with |Γ~|τ|T||Γ||\widetilde{\Gamma}|\geq\tau^{|T|}|\Gamma^{\prime}|, so |Γ~|δO(1)N|\widetilde{\Gamma}|\geq\delta^{O(1)}N. Fix such a cube QQ.

We claim that 4Γ~4Γ~4\widetilde{\Gamma}-4\widetilde{\Gamma} is a graph. For the sake of contradiction suppose not. Then there exist h1,,h8h_{1},\ldots,h_{8} and h1,,h8h_{1}^{\prime},\ldots,h_{8}^{\prime} such that

h1++h4h5h8\displaystyle h_{1}+\cdots+h_{4}-h_{5}-\cdots-h_{8} =h1++h4h5h8,\displaystyle=h_{1}^{\prime}+\cdots+h_{4}^{\prime}-h_{5}^{\prime}-\cdots-h_{8}^{\prime},
f1(h1)++f1(h4)f1(h5)f1(h8)\displaystyle f_{1}(h_{1})+\cdots+f_{1}(h_{4})-f_{1}(h_{5})-\cdots-f_{1}(h_{8}) f1(h1)++f1(h4)f1(h5)f1(h8)mod1.\displaystyle\not\equiv f_{1}(h_{1}^{\prime})+\cdots+f_{1}(h_{4}^{\prime})-f_{1}(h_{5}^{\prime})-\cdots-f_{1}(h_{8}^{\prime})~{}\mathrm{mod}~{}1.

However,

ψ((f1(h1)++f1(h4)f1(h5)f1(h8))(f1(h1)++f1(h4)f1(h5)f1(h8)))\displaystyle\bigg{\lVert}\psi\big{(}\big{(}f_{1}(h_{1})+\cdots+f_{1}(h_{4})-f_{1}(h_{5})-\cdots-f_{1}(h_{8})\big{)}-\big{(}f_{1}(h_{1}^{\prime})+\cdots+f_{1}(h_{4}^{\prime})-f_{1}(h_{5}^{\prime})-\cdots-f_{1}(h_{8}^{\prime})\big{)}\big{)}\bigg{\rVert}_{\infty}
16τ<1/4\displaystyle\qquad\qquad\qquad\qquad\leq 16\cdot\tau<1/4

by definition of Γ~\widetilde{\Gamma}. Since AB(T,1/4)={0}A\cap B(T,1/4)=\{0\}, it follows that

(f1(h1)++f1(h4)f1(h5)f1(h8))(f1(h1)++f1(h4)f1(h5)f1(h8))d\big{(}f_{1}(h_{1})+\cdots+f_{1}(h_{4})-f_{1}(h_{5})-\cdots-f_{1}(h_{8})\big{)}-\big{(}f_{1}(h_{1}^{\prime})+\cdots+f_{1}(h_{4}^{\prime})-f_{1}(h_{5}^{\prime})-\cdots-f_{1}(h_{8}^{\prime})\big{)}\in\mathbb{Z}^{d}

as desired.

Let HH^{\ast} denote the projection of Γ~\widetilde{\Gamma} onto the first coordinate. Since f1f_{1} is an 88-Freiman homomorphism on HH^{\ast} (because 4Γ~4Γ~4\widetilde{\Gamma}-4\widetilde{\Gamma} is a graph), we have that f1f_{1} is a Freiman homomorphism on 2H2H2H^{\ast}-2H^{\ast} (where f1f_{1} is extended via linearity). We now view HH^{\ast} (which is a subset of integers) as a subset of /N\mathbb{Z}/N^{\prime}\mathbb{Z} where NN^{\prime} is a prime in [100N,200N][100N,200N]. Note here that H[4N,4N]H^{\ast}\subseteq[-4N,4N] and thus 4Γ~4Γ~4\widetilde{\Gamma}-4\widetilde{\Gamma} when viewed as a subset of (/N)×(/)d(\mathbb{Z}/N^{\prime}\mathbb{Z})\times(\mathbb{R}/\mathbb{Z})^{d} is still a graph. Note that |H|δO(1)N|H^{\ast}|\geq\delta^{O(1)}N.

By Lemma A.4, we have that 2H2H2H^{\ast}-2H^{\ast} contains a Bohr set B(S,ρ)B(S,\rho) with |S|,ρ1(log(1/δ))O(1)|S|,\rho^{-1}\leq(\log(1/\delta))^{O(1)}. Then by applying Lemma A.6 to each coordinate of f1f_{1} on B(S,ρ)2H2HB(S,\rho^{\prime})\subseteq 2H^{\ast}-2H^{\ast} with ρ1exp(log(1/δ)O(1))\rho^{\prime-1}\leq\exp(\log(1/\delta)^{O(1)}), we have that

(A.2) f1(h1)=αiSai{αih1}+γmod1f_{1}(h_{1})=\sum_{\alpha_{i}\in S}a_{i}\{\alpha_{i}h_{1}\}+\gamma~{}\mathrm{mod}~{}1

for all h1B(S,ρ)h_{1}\in B(S,\rho^{\prime}), for appropriate choices of ai,γda_{i},\gamma\in\mathbb{R}^{d}. Here αi(1/N)\alpha_{i}\in(1/N^{\prime})\mathbb{Z}.

We now undo this transformation and we abusively view B(S,ρ)2H2HB(S,\rho^{\prime})\subseteq 2H^{\ast}-2H^{\ast} as a subset of integers in [4N,4N][-4N,4N] instead of /N\mathbb{Z}/N^{\prime}\mathbb{Z}, noting that the fractional part remains identical in both cases. As a slight technical annoyance, B(S,ρ)B(S,\rho^{\prime}) might not intersect HH^{\ast}. But, by Pigeonhole there exists x[5N,5N]x^{\ast}\in[-5N,5N] such that |(x+B(S,ρ/2))H|exp((log(1/δ))O(1))N|(x^{\ast}+B(S,\rho^{\prime}/2))\cap H^{\ast}|\geq\exp(-(\log(1/\delta))^{O(1)})N. (This requires a lower bound on the size of a Bohr set, see [58, Lemma 4.20].)

Fix hB(S,ρ/2)h^{\ast}\in B(S,\rho^{\prime}/2) such that x+hHx^{\ast}+h^{\ast}\in H^{\ast}. For any h1B(S,ρ/2)h_{1}\in B(S,\rho^{\prime}/2) such that h1+xHh_{1}+x^{\ast}\in H^{\ast}, we have that

f1(h1h)+f1(x+h)=f1(h1+x)+f(0)mod1f_{1}(h_{1}-h^{\ast})+f_{1}(x^{\ast}+h^{\ast})=f_{1}(h_{1}+x^{\ast})+f(0)~{}\mathrm{mod}~{}1

since 4Γ~4Γ~4\widetilde{\Gamma}-4\widetilde{\Gamma} is a graph (note that h1hB(S,ρ)2H2Hh_{1}-h^{\ast}\in B(S,\rho^{\prime})\subseteq 2H^{\ast}-2H^{\ast}). Thus we have

f1(h1+x)\displaystyle f_{1}(h_{1}+x^{\ast}) =f1(h1h)+f1(x+h)f(0)mod1\displaystyle=f_{1}(h_{1}-h^{\ast})+f_{1}(x^{\ast}+h^{\ast})-f(0)~{}\mathrm{mod}~{}1
=αiSai{αi((h1+x)(x+h))}+γmod1\displaystyle=\sum_{\alpha_{i}\in S}a_{i}\{\alpha_{i}((h_{1}+x^{\ast})-(x^{\ast}+h^{\ast}))\}+\gamma^{\prime}~{}\mathrm{mod}~{}1

The second line holds since x,hx^{\ast},h^{\ast} are viewed as fixed and h1hB(S,ρ)h_{1}-h^{\ast}\in B(S,\rho^{\prime}) hence we may apply (A.2).

So, letting HH^{\prime} be the set of values h1+xH1h_{1}+x^{\ast}\in H_{1} where h1B(S,ρ/2)h_{1}\in B(S,\rho^{\prime}/2), this nearly gives the desired result. The only issue is that there are shifts inside the brackets. Note that

{z1+z2}\displaystyle\{z_{1}+z_{2}\} ={{z1}+{z2}1 if {z1}+{z2}>1/2,{z1}+{z2}+1 if {z1}+{z2}1/2,{z1}+{z2} otherwise.\displaystyle=\begin{cases}\{z_{1}\}+\{z_{2}\}-1\text{ if }\{z_{1}\}+\{z_{2}\}>1/2,\\ \{z_{1}\}+\{z_{2}\}+1\text{ if }\{z_{1}\}+\{z_{2}\}\leq-1/2,\\ \{z_{1}\}+\{z_{2}\}\text{ otherwise.}\end{cases}

Given this, we may Pigeonhole possible values h1+xh_{1}+x^{\ast} into one of 3|S|3^{|S|} cases based on the corresponding shift for each αiS\alpha_{i}\in S. Applying the above relation with z1=αi(h1+x)z_{1}=\alpha_{i}(h_{1}+x^{\ast}) and z2=αi(x+h)z_{2}=-\alpha_{i}(x^{\ast}+h^{\ast}) and taking the most common case then gives the desired result. ∎
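The three-case relation for the fractional part can be checked mechanically. The following Python sketch is an illustration only; it assumes the convention, stated in the proof of Lemma B.1, that the fractional part takes values in (-1/2, 1/2]. The helper names `frac` and `frac_of_sum` are hypothetical.

```python
from fractions import Fraction
from math import ceil

HALF = Fraction(1, 2)

def frac(z):
    """Representative of z mod 1 in the interval (-1/2, 1/2]."""
    return z - ceil(z - HALF)

def frac_of_sum(z1, z2):
    """Right-hand side of the case analysis for {z1 + z2}."""
    s = frac(z1) + frac(z2)
    if s > HALF:
        return s - 1
    if s <= -HALF:
        return s + 1
    return s

# Exhaustive check on a grid of rationals with denominator 12.
grid = [Fraction(a, 12) for a in range(-36, 37)]
identity_holds = all(
    frac(z1 + z2) == frac_of_sum(z1, z2) for z1 in grid for z2 in grid
)
```

Each of the three branches corresponds to one of the three possible shifts (-1, +1, or 0), which is the source of the 3^{|S|} cases in the pigeonhole step.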

Appendix B Miscellaneous deferred results

We first require the following elementary lemma which will be used in the following deduction.

Lemma B.1.

Fix an integer H2H\geq 2. Consider vectors v1,,vdv_{1},\ldots,v_{\ell}\in\mathbb{Z}^{d} with integer coordinates bounded by HH and wdw\in\mathbb{R}^{d} such that dist(viw,)δ\operatorname{dist}(v_{i}\cdot w,\mathbb{Z})\leq\delta for 1i1\leq i\leq\ell. We may write w=wsmall+wrat+(wwsmallwrat)w=w_{\mathrm{small}}+w_{\mathrm{rat}}+(w-w_{\mathrm{small}}-w_{\mathrm{rat}}) where wratw_{\mathrm{rat}} has coordinates which are rationals with denominators bounded by HO(dO(1))H^{O(d^{O(1)})}, wsmallδHO(dO(1))\lVert w_{\mathrm{small}}\rVert_{\infty}\leq\delta\cdot H^{O(d^{O(1)})}, and vi(wwsmallwrat)=0v_{i}\cdot(w-w_{\mathrm{small}}-w_{\mathrm{rat}})=0 for 1i1\leq i\leq\ell.

Proof.

Note that by passing to a subset we may assume that v1,,vdv_{1},\ldots,v_{\ell}\in\mathbb{Z}^{d} are linearly independent. By Cramer’s rule, there exist w1,,wdw_{1},\ldots,w_{\ell}\in\mathbb{R}^{d} which have coordinates which are height HO(dO(1))H^{O(d^{O(1)})} rationals such that wjvk=𝟙j=kw_{j}\cdot v_{k}=\mathbbm{1}_{j=k}. Taking wrat=j=1(vjw{vjw})wjw_{\mathrm{rat}}=\sum_{j=1}^{\ell}(v_{j}\cdot w-\{v_{j}\cdot w\})\cdot w_{j} and wsmall=j=1{vjw}wjw_{\mathrm{small}}=\sum_{j=1}^{\ell}\{v_{j}\cdot w\}\cdot w_{j} we immediately have the desired result. Recall that we have chosen the fractional part {}\{\cdot\} to live within (1/2,1/2](-1/2,1/2]. ∎
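The construction in this proof can be traced on a small example. The Python sketch below is a hedged illustration: it builds the dual vectors via the inverse Gram matrix (one concrete way to realize the vectors given by Cramer's rule, chosen here for convenience), and all function names and the sample data are assumptions rather than anything from the original text.

```python
from fractions import Fraction
from math import ceil

def frac_centered(z):
    """Fractional part with values in (-1/2, 1/2]."""
    return z - ceil(z - Fraction(1, 2))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def invert(mat):
    """Gauss-Jordan inverse of a square invertible matrix over the rationals."""
    n = len(mat)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def dual_vectors(vs):
    """Rational vectors u_j with u_j . v_k = 1 if j == k and 0 otherwise."""
    inv = invert([[dot(a, b) for b in vs] for a in vs])
    d = len(vs[0])
    return [[sum(inv[k][j] * vs[k][c] for k in range(len(vs))) for c in range(d)]
            for j in range(len(vs))]

def decompose(vs, w):
    """Split w into (w_small, w_rat, rest) with v_i . rest = 0 for every i."""
    d = len(w)
    w_small, w_rat = [Fraction(0)] * d, [Fraction(0)] * d
    for v, u in zip(vs, dual_vectors(vs)):
        t = dot(v, w)
        small = frac_centered(t)
        w_small = [a + small * c for a, c in zip(w_small, u)]
        w_rat = [a + (t - small) * c for a, c in zip(w_rat, u)]
    rest = [a - b - c for a, b, c in zip(w, w_small, w_rat)]
    return w_small, w_rat, rest

# Sample data: two independent integer vectors, and w with v_i . w near an integer.
vs = [[1, 2, 0], [0, 1, 3]]
w = [Fraction(1, 3) + Fraction(1, 1000), Fraction(1, 3), Fraction(2, 9)]
w_small, w_rat, rest = decompose(vs, w)
```

Here the small part records the distance of each v_i · w from the nearest integer, the rational part records the integer itself, and the remainder is orthogonal to every v_i.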

We now prove the following elementary lemma which takes a set of horizontal characters (at potentially different levels) and produces a factorization.

Lemma B.2.

Consider a nilmanifold G/ΓG/\Gamma of degree-rank (s,r)(s,r) of dimension dd and complexity MM. Consider a polynomial sequence gg such that g(0)=idGg(0)=\mathrm{id}_{G} and consider a set of horizontal characters ψi,j\psi_{i,j} for 1ji1\leq j\leq\ell_{i} and where ψi,\psi_{i,\cdot} is an ii-th horizontal character of height at most HH. Furthermore suppose that for all i,ji,j,

dist(ψi,j(Taylori(g)),)HNi.\operatorname{dist}(\psi_{i,j}(\operatorname{Taylor}_{i}(g)),\mathbb{Z})\leq H\cdot N^{-i}.

Then one may factor

g=εgγg=\varepsilon\cdot g^{\prime}\cdot\gamma

where:

  • ε(0)=g(0)=γ(0)=idG\varepsilon(0)=g^{\prime}(0)=\gamma(0)=\mathrm{id}_{G};

  • ψi,j(Taylori(g))=0\psi_{i,j}(\operatorname{Taylor}_{i}(g^{\prime}))=0;

  • γ\gamma is (MH)Os(dOs(1))(MH)^{O_{s}(d^{O_{s}(1)})}-rational;

  • dG(ε(n),ε(n1))(MH)Os(dOs(1))N1d_{G}(\varepsilon(n),\varepsilon(n-1))\leq(MH)^{O_{s}(d^{O_{s}(1)})}\cdot N^{-1} for n[N]n\in[N].

Proof.

By the classification of polynomial sequences in terms of coordinates of the second-kind, we have that

g(n)=exp(k=1s(nk)gk)g(n)=\exp\Big{(}\sum_{k=1}^{s}\binom{n}{k}g_{k}\Big{)}

for some gklog(G(k,0))=log(G(k,1))g_{k}\in\log(G_{(k,0)})=\log(G_{(k,1)}). Note that

\operatorname{Taylor}_{i}(g)=\exp(g_{i})~\mathrm{mod}~G_{(i,2)}

and note that each ψi,j\psi_{i,j} can be descended to a linear map on log(G(i,1))\log(G_{(i,1)}) with the property that ψi,j(log(ΓG(i,1)))\psi_{i,j}(\log(\Gamma\cap G_{(i,1)}))\in\mathbb{Z} and ψi,j(log(G(i,2)))=0\psi_{i,j}(\log(G_{(i,2)}))=0. That ψi,j\psi_{i,j} descends uses the fact that log(x)+log(y)log(xy)modlog(G(i,2))\log(x)+\log(y)\equiv\log(xy)~{}\mathrm{mod}~{}\log(G_{(i,2)}) for x,yG(i,1)x,y\in G_{(i,1)}, which follows from Baker–Campbell–Hausdorff.

We now apply Lemma B.1. As dist(ψi,j(Taylori(g)),)HNi\operatorname{dist}(\psi_{i,j}(\operatorname{Taylor}_{i}(g)),\mathbb{Z})\leq H\cdot N^{-i} by assumption, we may write gi=gi,small+gi,rat+(gigi,smallgi,rat)g_{i}=g_{i,\mathrm{small}}+g_{i,\mathrm{rat}}+(g_{i}-g_{i,\mathrm{small}}-g_{i,\mathrm{rat}}) such that gi,ratg_{i,\mathrm{rat}} is an HOs(dOs(1))H^{O_{s}(d^{O_{s}(1)})}-rational combination of elements in 𝒳log(G(i,1))\mathcal{X}\cap\log(G_{(i,1)}), such that gi,small(MH)Os(dOs(1))Ni\lVert g_{i,\mathrm{small}}\rVert_{\infty}\leq(MH)^{O_{s}(d^{O_{s}(1)})}\cdot N^{-i}, and such that ψi,j(gigi,smallgi,rat)=0\psi_{i,j}(g_{i}-g_{i,\mathrm{small}}-g_{i,\mathrm{rat}})=0. Defining

γ:=exp(k=1s(nk)gk,rat),ε:=exp(k=1s(nk)gk,small),\gamma:=\exp\Big{(}\sum_{k=1}^{s}\binom{n}{k}g_{k,\mathrm{rat}}\Big{)},\quad\varepsilon:=\exp\Big{(}\sum_{k=1}^{s}\binom{n}{k}g_{k,\mathrm{small}}\Big{)},

and g:=ε1gγ1g^{\prime}:=\varepsilon^{-1}g\gamma^{-1}, we immediately have that γΓ\gamma\Gamma is (MH)Os(dOs(1))(MH)^{O_{s}(d^{O_{s}(1)})}-periodic by [42, Lemma B.14]. That ε\varepsilon is sufficiently smooth is an immediate consequence of [42, Lemmas B.1, B.3]. ∎

We next require the following result regarding the existence of a nilmanifold partition of unity. As a remark, a similar statement (e.g. with jτj=1\sum_{j}\tau_{j}=1) appears as [45, Lemma 2.4]. The proof there, strangely, does not adapt in a straightforward manner to here.

Lemma B.3.

Fix ε(0,1/2)\varepsilon\in(0,1/2) and a nilmanifold G/ΓG/\Gamma of degree ss, dimension dd, and complexity MM. There exists an index set II and a collection of nonnegative smooth functions τj:G/Γ0\tau_{j}\colon G/\Gamma\to\mathbb{R}^{\geq 0} for jIj\in I such that:

  • For all gGg\in G, we have jIτj(gΓ)2=1\sum_{j\in I}\tau_{j}(g\Gamma)^{2}=1;

  • |I|(1/ε)Os(dOs(1))|I|\leq(1/\varepsilon)^{O_{s}(d^{O_{s}(1)})};

  • For each jIj\in I, there exists β[2,2]d\beta\in[-2,2]^{d} so that for any gΓsupp(τj)g\Gamma\in\operatorname{supp}(\tau_{j}) there exists ggΓg^{\prime}\in g\Gamma such that ψG(g)i=1d[βiε,βi+ε]\psi_{G}(g^{\prime})\in\prod_{i=1}^{d}[\beta_{i}-\varepsilon,\beta_{i}+\varepsilon];

  • τj\tau_{j} are (M/ε)Os(dOs(1))(M/\varepsilon)^{O_{s}(d^{O_{s}(1)})}-Lipschitz on G/ΓG/\Gamma;

  • For any gGg\in G, gΓg\Gamma is contained in the support of at most 2Os(d)2^{O_{s}(d)} terms.

Proof.

We will prove the statement inductively based on the degree of the nilmanifold. For degree 11 nilmanifolds GG, note that G𝕋dG\simeq\mathbb{T}^{d}. There exists a set of functions ρ1,,ρ2k:𝕋0\rho_{1},\ldots,\rho_{2k}\colon\mathbb{T}\to\mathbb{R}^{\geq 0} such that:

  • supp(ρj)[j/(2k),j/(2k)+1/k]mod1\operatorname{supp}(\rho_{j})\subseteq[j/(2k),j/(2k)+1/k]~{}\mathrm{mod}~{}1;

  • j=12kρj2=1\sum_{j=1}^{2k}\rho_{j}^{2}=1;

  • ρj\rho_{j} are O(k)O(k)-Lipschitz.

Taking k=O(ε1)k=O(\varepsilon^{-1}), we have that

1=(j1,,jd)[2k]d=1dρj((ψG(g)))21=\sum_{(j_{1},\ldots,j_{d})\in[2k]^{d}}\prod_{\ell=1}^{d}\rho_{j_{\ell}}((\psi_{G}(g))_{\ell})^{2}

where (ψG)(\psi_{G})_{\ell} denotes the \ell-th coordinate of ψG\psi_{G}. For j[2k]d\vec{j}\in[2k]^{d} we take

τj(g)==1dρj((ψG(g)))\tau_{\vec{j}}(g)=\prod_{\ell=1}^{d}\rho_{j_{\ell}}((\psi_{G}(g))_{\ell})

and note that this function is Γ\Gamma-invariant since multiplying by an element in Γ\Gamma shifts all coordinates by an integer. Furthermore, by [42, Lemma B.3] we have that the standard \ell^{\infty}-metric on G/ΓG/\Gamma is equivalent to dG/Γd_{G/\Gamma} up to a factor of O(M)O(dO(1))O(M)^{O(d^{O(1)})}. This completes the proof in this case.
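One concrete family of one-dimensional bumps with the sum-of-squares property can be written down explicitly. The Python sketch below uses a sine profile, which is an assumption made for illustration: the resulting functions are Lipschitz with constant πk but not smooth, so a smooth profile would be needed to match the lemma as stated. The names `rho` and `sum_of_squares` are hypothetical.

```python
import math

def rho(j, k, x):
    """Bump supported on [j/(2k), j/(2k) + 1/k] mod 1, with a sine profile."""
    u = (x - j / (2 * k)) % 1.0
    if u <= 1.0 / k:
        return math.sin(math.pi * k * u)
    return 0.0

def sum_of_squares(k, x):
    """Sum of rho_j(x)^2 over j = 1, ..., 2k."""
    return sum(rho(j, k, x) ** 2 for j in range(1, 2 * k + 1))

# Numerical check on a grid of points of the circle.
k = 4
max_error = max(abs(sum_of_squares(k, i / 997) - 1.0) for i in range(997))
```

At each point exactly two consecutive bumps are active, with values sin(πku) and cos(πku), so their squares sum to 1. Taking products over coordinates then gives the corresponding identity on the torus.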

When considering the case of a degree s2s\geq 2 filtration on GG, suppose that G0=G1G2GsIdGG_{0}=G_{1}\geqslant G_{2}\geqslant\cdots\geqslant G_{s}\geqslant\mathrm{Id}_{G} is the given filtration. Note that if 𝒳={X1,,Xd}\mathcal{X}=\{X_{1},\ldots,X_{d}\} is the adapted Mal’cev basis for G/ΓG/\Gamma then

𝒳~:={X1,,Xdim(G)dim(Gs)}modlog(Gs)\widetilde{\mathcal{X}}:=\{X_{1},\ldots,X_{\dim(G)-\dim(G_{s})}\}~{}\mathrm{mod}~{}\log(G_{s})

is a valid Mal’cev basis for G~:=G/Gs\widetilde{G}:=G/G_{s}. Furthermore define Γ~:=Γ/(ΓGs)\widetilde{\Gamma}:=\Gamma/(\Gamma\cap G_{s}). The complexity of 𝒳~\widetilde{\mathcal{X}} is always bounded by MM by definition. The filtration on G~\widetilde{G} is lower degree.

By induction, we have functions (τj~)jI(\widetilde{\tau_{j}})_{j\in I} on G~/Γ~\widetilde{G}/\widetilde{\Gamma} with |I|(M/ε)Os(dOs(1))|I|\leq(M/\varepsilon)^{O_{s}(d^{O_{s}(1)})} such that

1=jIτj~(g~Γ~)21=\sum_{j\in I}\widetilde{\tau_{j}}(\widetilde{g}\widetilde{\Gamma})^{2}

and satisfying various other appropriate properties. We may lift these functions to G/ΓG/\Gamma via

τj(gΓ)=τj~((gmodGs)Γ~).\tau_{j}(g\Gamma)=\widetilde{\tau_{j}}((g~{}\mathrm{mod}~{}G_{s})\widetilde{\Gamma}).

Note that this is well-defined since gΓmodGs=(gmodGs)(ΓmodGs)=(gmodGs)Γ~g\Gamma~{}\mathrm{mod}~{}G_{s}=(g~{}\mathrm{mod}~{}G_{s})\cdot(\Gamma~{}\mathrm{mod}~{}G_{s})=(g~{}\mathrm{mod}~{}G_{s})\widetilde{\Gamma}.

We view each τj\tau_{j} as a function on i=1dim(G~)(βi1/2,βi+1/2]×𝕋dim(Gs)\prod_{i=1}^{\dim(\widetilde{G})}(\beta_{i}-1/2,\beta_{i}+1/2]\times\mathbb{T}^{\dim(G_{s})} which only depends on the first dim(G~)\dim(\widetilde{G}) coordinates and such that the support is only within some i=1dim(G~)[βiε,βi+ε]×𝕋dim(Gs)\prod_{i=1}^{\dim(\widetilde{G})}[\beta_{i}-\varepsilon,\beta_{i}+\varepsilon]\times\mathbb{T}^{\dim(G_{s})}. This is via identifying the fundamental domain of G/ΓG/\Gamma via Mal’cev coordinates of the second-kind (see the proof of [42, Lemma B.6]). We let ψβ:G/Γi=1dim(G~)(βi1/2,βi+1/2]×𝕋dim(Gs)\psi_{\beta}\colon G/\Gamma\to\prod_{i=1}^{\dim(\widetilde{G})}(\beta_{i}-1/2,\beta_{i}+1/2]\times\mathbb{T}^{\dim(G_{s})} denote this identification. (Note that the choice of β\beta depends on jIj\in I, which we will fix through the remainder of the proof.)

We now have

τj(gΓ)2=τj~((gmodGs)Γ~)2(t1,,tdim(Gs))[2k]dim(Gs)=1dim(Gs)ρt((ψβ(gΓ))+dim(G~))2\tau_{j}(g\Gamma)^{2}=\widetilde{\tau_{j}}((g~{}\mathrm{mod}~{}G_{s})\widetilde{\Gamma})^{2}\cdot\sum_{(t_{1},\ldots,t_{\dim(G_{s})})\in[2k]^{\dim(G_{s})}}\prod_{\ell=1}^{\dim(G_{s})}\rho_{t_{\ell}}((\psi_{\beta}(g\Gamma))_{\ell+\dim(\widetilde{G})})^{2}

where k=O(1/ε)k=O(1/\varepsilon) and ρ\rho are defined as above.

The fact that each piece

τj,t(gΓ)2:=τj(gΓ)2=1dim(Gs)ρt((ψβ(gΓ))+dim(G~))2\tau_{j,\vec{t}}(g\Gamma)^{2}:=\tau_{j}(g\Gamma)^{2}\cdot\prod_{\ell=1}^{\dim(G_{s})}\rho_{t_{\ell}}((\psi_{\beta}(g\Gamma))_{\ell+\dim(\widetilde{G})})^{2}

is Γ\Gamma-invariant on the right is trivial by construction, and the sum of squares property is trivial.

Identifying ρj\rho_{j} with a function 0\mathbb{R}\to\mathbb{R}^{\geq 0} where supp(ρj)[j/(2k),j/(2k)+1/k]\operatorname{supp}(\rho_{j})\subseteq[j/(2k),j/(2k)+1/k], we may identify τj,t\tau_{j,\vec{t}} with a function on the fundamental domain (with respect to second-kind coordinates) of the form

i=1dim(G~)(βi1/2,βi+1/2]×=1dim(Gs)((t+1)/(2k)1/2,(t+1)/(2k)+1/2].\prod_{i=1}^{\dim(\widetilde{G})}(\beta_{i}-1/2,\beta_{i}+1/2]\times\prod_{\ell=1}^{\dim(G_{s})}((t_{\ell}+1)/(2k)-1/2,(t_{\ell}+1)/(2k)+1/2].

To check that this function is sufficiently Lipschitz, we note that each element gΓg\Gamma has a unique representative in this domain.

Consider τj,t(xΓ)\tau_{j,\vec{t}}(x\Gamma) and τj,t(yΓ)\tau_{j,\vec{t}}(y\Gamma); by multiplying by the lattice we may assume that ψ(x),ψ(y)\psi(x),\psi(y) are in the specified fundamental domain. Furthermore if dG/Γ(xΓ,yΓ)ε=MOs(dOs(1))d_{G/\Gamma}(x\Gamma,y\Gamma)\geq\varepsilon^{\prime}=M^{-O_{s}(d^{O_{s}(1)})} we immediately win as τj,t\tau_{j,\vec{t}} is 11-bounded. We claim that if dG/Γ(xΓ,yΓ)εd_{G/\Gamma}(x\Gamma,y\Gamma)\leq\varepsilon^{\prime} then dG/Γ(xΓ,yΓ)=dG(x,y)d_{G/\Gamma}(x\Gamma,y\Gamma)=d_{G}(x,y). In particular, note that

dG/Γ(xΓ,yΓ)\displaystyle d_{G/\Gamma}(x\Gamma,y\Gamma) =minγΓdG(xγ,y)\displaystyle=\min_{\gamma\in\Gamma}d_{G}(x\gamma,y)

and that

\min_{\gamma\in\Gamma\setminus\{\mathrm{id}_{G}\}}d_{G}(x\gamma,y)\geq M^{-O_{s}(d^{O_{s}(1)})}\cdot\min_{\gamma\in\Gamma\setminus\{\mathrm{id}_{G}\}}d_{G}(\gamma,x^{-1}y)\geq M^{-O_{s}(d^{O_{s}(1)})}

which gives the desired contradiction assuming that various implicit constants defining ε\varepsilon^{\prime} are chosen appropriately.

Now we may assume that x,yx,y are such that

ψ(x),ψ(y)i=1dim(G~)[βi2ε,βi+2ε)×=1dim(Gs)[t/(2k)ε,t/(2k)+1/k+ε),\psi(x),\psi(y)\in\prod_{i=1}^{\dim(\widetilde{G})}[\beta_{i}-2\varepsilon,\beta_{i}+2\varepsilon)\times\prod_{\ell=1}^{\dim(G_{s})}[t_{\ell}/(2k)-\varepsilon,t_{\ell}/(2k)+1/k+\varepsilon),

else both function values vanish (again supposing ε\varepsilon^{\prime} is sufficiently small). This is because dG(x,y)d_{G}(x,y) is equivalent to ψ(x)ψ(y)\lVert\psi(x)-\psi(y)\rVert_{\infty} (up to a factor of MOs(dOs(1))M^{O_{s}(d^{O_{s}(1)})}) for bounded elements by [42, Lemma B.3], and due to the condition on the support of ρt\rho_{t_{\ell}}.

In particular, ψ(x),ψ(y)\psi(x),\psi(y) are seen to lie in the interior of the domain. The result then follows immediately noting that τj\tau_{j} is appropriately Lipschitz and ρt\rho_{t_{\ell}} is an appropriately Lipschitz function on \mathbb{R}. The claim that gΓg\Gamma is contained in the support of at most 2Os(d)2^{O_{s}(d)} terms follows trivially by construction. ∎

Given this we are now in position to show the existence of nilcharacters on G/ΓG/\Gamma.

Lemma B.4.

Fix ε(0,1/2)\varepsilon\in(0,1/2) and a nilmanifold G/ΓG/\Gamma of degree ss, dimension dd, and complexity MM. Fix η\eta a vertical GsG_{s}-frequency with height bounded by MM. There exists a nilcharacter FF with frequency η\eta such that the output dimension is bounded by 2Os(dOs(1))2^{O_{s}(d^{O_{s}(1)})} and each coordinate is Os(M)Os(dOs(1))O_{s}(M)^{O_{s}(d^{O_{s}(1)})}-Lipschitz.

Proof.

Let G~=G/Gs\widetilde{G}=G/G_{s} and Γ~=Γ/(ΓGs)\widetilde{\Gamma}=\Gamma/(\Gamma\cap G_{s}). Apply Lemma B.3 on G~/Γ~\widetilde{G}/\widetilde{\Gamma} with ε=1/4\varepsilon=1/4 to obtain τj~\widetilde{\tau_{j}} for jIj\in I. For η=0\eta=0, we may take the coordinates of FF to be

\tau_{j}(g\Gamma)=\widetilde{\tau_{j}}((g~\mathrm{mod}~G_{s})\widetilde{\Gamma}).

In general, for appropriate β\beta depending on jj, we have that gΓg\Gamma is naturally identified with a unique point inside i=1dim(G~)(βi1/2,βi+1/2]×𝕋dim(Gs)\prod_{i=1}^{\dim(\widetilde{G})}(\beta_{i}-1/2,\beta_{i}+1/2]\times\mathbb{T}^{\dim(G_{s})} as in the proof of Lemma B.3 and we let ψβ(gΓ)\psi_{\beta}(g\Gamma) denote this map. The key point is to write

τj(gΓ)=τj~((gmodGs)Γ~)exp(ηψβ(gΓ))\tau_{j}(g\Gamma)=\widetilde{\tau_{j}}((g~{}\mathrm{mod}~{}G_{s})\widetilde{\Gamma})\cdot\exp(\eta\cdot\psi_{\beta}(g\Gamma))

and note that jI|τj(gΓ)|2=1\sum_{j\in I}|\tau_{j}(g\Gamma)|^{2}=1 as before. Here we have identified η\eta with an integer vector using the last dim(Gs)\dim(G_{s}) elements of the Mal’cev basis and extending by 0. Note that this is trivially a function on G/ΓG/\Gamma and by construction it has the GsG_{s}-vertical frequency η\eta. The only technical point is verifying that this function is indeed Lipschitz, which we check for each coordinate τj\tau_{j}.

Consider xΓx\Gamma and yΓy\Gamma. If τj(xΓ)=τj(yΓ)=0\tau_{j}(x\Gamma)=\tau_{j}(y\Gamma)=0 the Lipschitz condition is obviously satisfied. Thus at least one value is nonzero, and without loss of generality we may assume τj(xΓ)0\tau_{j}(x\Gamma)\neq 0. Furthermore, noting that τj\tau_{j} is 11-bounded, we may assume that dG/Γ(xΓ,yΓ)MOs(dOs(1))d_{G/\Gamma}(x\Gamma,y\Gamma)\leq M^{-O_{s}(d^{O_{s}(1)})}. As τj(xΓ)0\tau_{j}(x\Gamma)\neq 0, possibly shifting xx on the right by an element in the lattice allows us to assume

ψ(x)i=1dim(G~)(βi1/4,βi+1/4]×(0,1]dim(Gs).\psi(x)\in\prod_{i=1}^{\dim(\widetilde{G})}(\beta_{i}-1/4,\beta_{i}+1/4]\times(0,1]^{\dim(G_{s})}.

Via an argument analogous to that in the proof of Lemma B.3, there exists yy^{\prime} such that yΓ=yΓy^{\prime}\Gamma=y\Gamma,

ψ(y)i=1dim(G~)(βi1/3,βi+1/3]×(1/2,3/2]dim(Gs),\psi(y^{\prime})\in\prod_{i=1}^{\dim(\widetilde{G})}(\beta_{i}-1/3,\beta_{i}+1/3]\times(-1/2,3/2]^{\dim(G_{s})},

and ψ(x)ψ(y)MOs(dOs(1))dG/Γ(xΓ,yΓ)\lVert\psi(x)-\psi(y^{\prime})\rVert_{\infty}\leq M^{O_{s}(d^{O_{s}(1)})}d_{G/\Gamma}(x\Gamma,y\Gamma). Since zexp(ηz)\vec{z}\mapsto\exp(\eta\cdot\vec{z}) is an appropriately Lipschitz function on the torus if ηdim(Gs)\eta\in\mathbb{Z}^{\dim(G_{s})}, the desired result follows immediately. ∎

We will also require the following converse of the Us+1U^{s+1}-inverse theorem; this is verbatim in [32, Appendix G] modulo various complexity details being omitted.

Lemma B.5.

Fix ε(0,1/2)\varepsilon\in(0,1/2) and let G/ΓG/\Gamma be a degree ss nilmanifold of dimension dd and complexity MM, and let g(n)g(n) be a polynomial sequence with respect to this filtration. Furthermore let F:G/Γ𝐂F\colon G/\Gamma\to\mathbf{C} satisfy FLipM\lVert F\rVert_{\mathrm{Lip}}\leq M. If f:[N]f\colon[N]\to\mathbb{C} is a 11-bounded function such that

|𝔼n[N]f(n)F(g(n)Γ)¯|ε,\big{|}\mathbb{E}_{n\in[N]}f(n)\overline{F(g(n)\Gamma)}\big{|}\geq\varepsilon,

then

fUs+1[N](ε/M)Os(dOs(1)).\lVert f\rVert_{U^{s+1}[N]}\geq(\varepsilon/M)^{O_{s}(d^{O_{s}(1)})}.
Proof.

In the degenerate case when s=0s=0, we take a degree ss nilsequence of complexity MM to be a constant function ψ\psi bounded by MM. This implies that

|𝔼n[N]f(n)|ε/M|\mathbb{E}_{n\in[N]}f(n)|\geq\varepsilon/M

and by Cauchy–Schwarz we have

𝔼n,n[N]f(n)f(n)¯(ε/M)2.\mathbb{E}_{n,n^{\prime}\in[N]}f(n)\overline{f(n^{\prime})}\geq(\varepsilon/M)^{2}.

By unwinding definitions this implies the case s=0s=0.

For larger ss, by applying [42, Lemma A.6] we may assume that

|𝔼n[N]f(n)Fξ(g(n)Γ)¯|(ε/M)Os(dOs(1))\big{|}\mathbb{E}_{n\in[N]}f(n)\overline{F_{\xi}(g(n)\Gamma)}\big{|}\geq(\varepsilon/M)^{O_{s}(d^{O_{s}(1)})}

where FξF_{\xi} is a (M/ε)Os(dOs(1))(M/\varepsilon)^{O_{s}(d^{O_{s}(1)})}-Lipschitz function with GsG_{s}-vertical frequency ξ\xi bounded in height by (M/ε)Os(dOs(1))(M/\varepsilon)^{O_{s}(d^{O_{s}(1)})}, after Pigeonhole. Cauchy–Schwarz implies that

𝔼n,n[N]f(n)f(n)¯Fξ(g(n)Γ)Fξ(g(n)Γ)¯(ε/M)Os(dOs(1)).\mathbb{E}_{n,n^{\prime}\in[N]}f(n)\overline{f(n^{\prime})}F_{\xi}(g(n^{\prime})\Gamma)\overline{F_{\xi}(g(n)\Gamma)}\geq(\varepsilon/M)^{O_{s}(d^{O_{s}(1)})}.

Note that we may rewrite this as

𝔼n[N],h[±N]f(n)f(n+h)¯Fξ(g(n+h)Γ)Fξ(g(n)Γ)¯(ε/M)Os(dOs(1)),\mathbb{E}_{n\in[N],h\in[\pm N]}f(n)\overline{f(n+h)}F_{\xi}(g(n+h)\Gamma)\overline{F_{\xi}(g(n)\Gamma)}\geq(\varepsilon/M)^{O_{s}(d^{O_{s}(1)})},

where we extend ff by 0 in the usual manner. We define

G={(g,g):g,gG,g1gG2}G^{\Box}=\{(g,g^{\prime})\colon g,g^{\prime}\in G,g^{-1}g^{\prime}\in G_{2}\}

and note that this has a filtration (G)i={(g,g):g,gGi,g1gGi+1}(G^{\Box})_{i}=\{(g,g^{\prime})\colon g,g^{\prime}\in G_{i},g^{-1}g^{\prime}\in G_{i+1}\} by [42, Lemma A.3] (with G=(G)1G^{\Box}=(G^{\Box})_{1}). Let Γ=(Γ×Γ)G\Gamma^{\Box}=(\Gamma\times\Gamma)\cap G^{\Box} and note that

F~ξ((x,y)(Γ×Γ)):=Fξ(xΓ)Fξ(yΓ)¯\widetilde{F}_{\xi}((x,y)(\Gamma\times\Gamma)):=F_{\xi}(x\Gamma)\overline{F_{\xi}(y\Gamma)}

is invariant under GsG^{\Box}_{s}. Note that F~ξ\widetilde{F}_{\xi} is (M/ε)Os(dOs(1))(M/\varepsilon)^{O_{s}(d^{O_{s}(1)})}-Lipschitz on G×GG\times G and on GG^{\Box}, and G/ΓG^{\Box}/\Gamma^{\Box} is a nilmanifold of appropriate complexity by [42, Lemma A.3].

Let

(g(0),g(h))={(g(0),g(h))}[(g(0),g(h))](g(0),g(h))=\{(g(0),g(h))\}\cdot[(g(0),g(h))]

with dG×G({(g(0),g(h))})MOs(dOs(1))d_{G\times G}(\{(g(0),g(h))\})\leq M^{O_{s}(d^{O_{s}(1)})} and [(g(0),g(h))]Γ×Γ[(g(0),g(h))]\in\Gamma\times\Gamma. Define

gh(n)={(g(0),g(h))}1(g(n),g(n+h))[(g(0),g(h))]1;g_{h}^{\prime}(n)=\{(g(0),g(h))\}^{-1}(g(n),g(n+h))[(g(0),g(h))]^{-1};

this is easily seen to be a polynomial sequence with respect to GG^{\Box}. Thus

𝔼n[N],h[±N]f(n)f(n+h)F~ξ({(g(0),g(h))}gh(n)(Γ×Γ))¯(ε/M)Os(dOs(1)).\mathbb{E}_{n\in[N],h\in[\pm N]}f(n)\overline{f(n+h)\widetilde{F}_{\xi}(\{(g(0),g(h))\}g_{h}^{\prime}(n)(\Gamma\times\Gamma))}\geq(\varepsilon/M)^{O_{s}(d^{O_{s}(1)})}.

Define F~ξ,h(x,y):=F~ξ({(g(0),g(h))}(x,y)(Γ×Γ))\widetilde{F}_{\xi,h}(x,y):=\widetilde{F}_{\xi}(\{(g(0),g(h))\}(x,y)(\Gamma\times\Gamma)) and note that it is (M/ε)Os(dOs(1))(M/\varepsilon)^{O_{s}(d^{O_{s}(1)})}-Lipschitz on G×GG\times G and on GG^{\Box} by [42, Lemma B.4]. Applying the triangle inequality and restricting to GG^{\Box} we have

\mathbb{E}_{h\in[\pm N]}\Big|\mathbb{E}_{n\in[N]}\Delta_{h}f(n)\cdot\overline{\widetilde{F}_{\xi,h}(g_{h}^{\prime}(n)\Gamma^{\Box})}\Big|\geq(\varepsilon/M)^{O_{s}(d^{O_{s}(1)})}.

Since F~ξ\widetilde{F}_{\xi} is invariant under (G)s(G^{\Box})_{s}, passing to G/(G)sG^{\Box}/(G^{\Box})_{s} gives a nilmanifold of degree (s1)(s-1) and complexity MOs(dOs(1))M^{O_{s}(d^{O_{s}(1)})}. Thus we may apply induction and deduce that

𝔼h[±N]ΔhfUs[N](ε/M)Os(dOs(1)).\mathbb{E}_{h\in[\pm N]}\lVert\Delta_{h}f\rVert_{U^{s}[N]}\geq(\varepsilon/M)^{O_{s}(d^{O_{s}(1)})}.

Since

𝔼h[±N]ΔhfUs[N]2ssfUs+1[N]2s+1,\mathbb{E}_{h\in[\pm N]}\lVert\Delta_{h}f\rVert_{U^{s}[N]}^{2^{s}}\lesssim_{s}\lVert f\rVert_{U^{s+1}[N]}^{2^{s+1}},

the desired result follows. ∎
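For intuition, the simplest instance of such a converse statement can be verified numerically. The Python sketch below is an illustration under simplifying assumptions that differ from the setting of Lemma B.5: it works on the cyclic group Z/NZ rather than the interval [N], takes s = 1, and uses linear phases as the correlating functions, in which case the correlation is bounded by the U^2-norm with no loss. The test function `f` and all helper names are hypothetical.

```python
import cmath

def e(t):
    """The character t -> exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)

def u2_norm(f):
    """Gowers U^2 norm on Z/nZ, computed directly from the definition."""
    n = len(f)
    total = 0
    for x in range(n):
        for h1 in range(n):
            for h2 in range(n):
                total += (f[x] * f[(x + h1) % n].conjugate()
                          * f[(x + h2) % n].conjugate()
                          * f[(x + h1 + h2) % n])
    # The average is real and nonnegative up to rounding error.
    return max(total.real / n**3, 0.0) ** 0.25

def correlation(f, xi):
    """Absolute value of the average of f(x) * conj(e(xi * x / n))."""
    n = len(f)
    return abs(sum(f[x] * e(-xi * x / n) for x in range(n)) / n)

N = 24
# A 1-bounded test function: a linear phase with a mild amplitude modulation.
f = [e(5 * x / N) * (1 if x % 3 else 0.5) for x in range(N)]
norm = u2_norm(f)
largest = max(correlation(f, xi) for xi in range(N))
```

On the cyclic group the fourth power of the U^2-norm equals the sum of the fourth powers of the Fourier coefficients, so `largest` can never exceed `norm`. The lemma above is the quantitative analogue for general degree s nilsequences on [N].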

We now check the deferred Lemma 11.3.

Proof of Lemma 11.3.

We first construct a weak basis for GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}. Note that each element (g,g)GQuotGLin(g,g^{\prime})\in G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}} may be written as

(g,g)=(g,idGLin)(idGQuot,g).(g,g^{\prime})=(g,\mathrm{id}_{G_{\mathrm{Lin}}})\cdot(\mathrm{id}_{G_{\mathrm{Quot}}},g^{\prime}).

Consider e~i,j\widetilde{e}_{i,j} and consider (r1)(r-1)-fold commutators of e~i1,j1,,e~ir,jr\widetilde{e}_{i_{1},j_{1}},\ldots,\widetilde{e}_{i_{r},j_{r}} with i1++irs2i_{1}+\cdots+i_{r}\leq s-2 or i1++ir=s1i_{1}+\cdots+i_{r}=s-1, rrr\leq r^{\ast} and at most one generator has i>Dii_{\ell}>D_{i_{\ell}}^{\ast}. We define the type of the commutator to be given by the multiset {e~i1,j1,,e~ir,jr}\{\widetilde{e}_{i_{1},j_{1}},\ldots,\widetilde{e}_{i_{r},j_{r}}\} and we say that said type is linear if i>Dii_{\ell}>D_{i_{\ell}}^{\ast} for exactly one index \ell. We define the degree of a commutator to be i1++iri_{1}+\cdots+i_{r}. As discussed in Lemmas 10.4 and 10.10, commutators of all types span log(GQuot)\log(G_{\mathrm{Quot}}) and commutators of linear type span log(GLin)\log(G_{\mathrm{Lin}}), and all relations between these elements are spanned by relations between commutators of the same type of height Os(1)O_{s}(1).

Given this, for each collection of commutators of a given type choose a subset which “spans the type” (similar to in the proof of Lemma 10.4). Let 𝒳1\mathcal{X}_{1} denote the set of selected commutators and 𝒳2\mathcal{X}_{2} denote the selected commutators which are of linear type. Our weak basis for GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}} will be

𝒳={(X,0):X𝒳1}{(0,X):X𝒳2};\mathcal{X}=\{(X,0)\colon X\in\mathcal{X}_{1}\}\cup\{(0,X)\colon X\in\mathcal{X}_{2}\};

this is seen to be a basis for the Lie algebra of GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}. That it spans is trivial, and if there were a relation note that there could be no elements of the form (X,0)(X,0) in the relation since projecting onto the first coordinate we recover multiplication in GQuotG_{\mathrm{Quot}}. Given that there are no elements of the form (X,0)(X,0), within this relation multiplication then acts exactly as in GLinG_{\mathrm{Lin}} and the claimed independence follows.

We give GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}} a multidegree filtration by taking the multidegree filtration of GMultiG_{\mathrm{Multi}} and intersecting with the subgroup of elements of the form (0,(g,g1))(0,(g,g_{1})). We see that all the subgroups of the filtration are in fact spanned by subsets of 𝒳\mathcal{X}. This is simply by taking the generators in 𝒳\mathcal{X} of the appropriate degree-rank; for instance

(GQuotGLin)(0,d)={(g,g1):g(GQuot)(d,0),g1(GQuot)(d,0)GLin}(G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}})_{(0,d)}=\{(g,g_{1})\colon g\in(G_{\mathrm{Quot}})_{(d,0)},g_{1}\in(G_{\mathrm{Quot}})_{(d,0)}\cap G_{\mathrm{Lin}}\}

and we take the subsets of {(X,0):X𝒳1}\{(X,0)\colon X\in\mathcal{X}_{1}\} and {(0,X):X𝒳2}\{(0,X)\colon X\in\mathcal{X}_{2}\} where XX has degree at least dd. This is similarly true for |i|=k(GQuotGLin)(i1,i2)\bigvee_{|\vec{i}|=k}(G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}})_{(i_{1},i_{2})} which will ultimately form the underlying degree filtration for GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}. Furthermore ordering the basis according to whether they lie in the degree ordering associated to GQuotGLinG_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}} proves that the basis has the degree Os(1)O_{s}(1) nesting property. Thus it suffices to check the complexity of various commutators.

Note the identity

[V,W]\displaystyle[V,W] =ddsddtexp(sV)exp(tW)exp(sV)exp(tW)|s,t=0\displaystyle=\frac{d}{ds}\frac{d}{dt}\exp(sV)\exp(tW)\exp(-sV)\exp(-tW)\bigg{|}_{s,t=0}

which holds for any Lie group and the associated Lie bracket. It is therefore immediate that

[(X,0),(X^{\prime},0)]=([X,X^{\prime}],0)\text{ and }[(0,X),(0,X^{\prime})]=(0,[X,X^{\prime}]),

and we have

[(X,0),(0,X)]\displaystyle[(X,0),(0,X^{\prime})]
=ddsddt(exp(sX),idGLin)(idGQuot,exp(tX))(exp(sX),idGLin)(idGQuot,exp(tX))|s,t=0\displaystyle\qquad=\frac{d}{ds}\frac{d}{dt}(\exp(sX),\mathrm{id}_{G_{\mathrm{Lin}}})\cdot(\mathrm{id}_{G_{\mathrm{Quot}}},\exp(tX^{\prime}))\cdot(\exp(-sX),\mathrm{id}_{G_{\mathrm{Lin}}})\cdot(\mathrm{id}_{G_{\mathrm{Quot}}},\exp(-tX^{\prime}))\bigg{|}_{s,t=0}
=ddsddt(idGQuot,exp(sX)exp(tX)exp(sX)exp(tX))|s,t=0\displaystyle\qquad=\frac{d}{ds}\frac{d}{dt}(\mathrm{id}_{G_{\mathrm{Quot}}},\exp(sX)\exp(tX^{\prime})\exp(-sX)\exp(-tX^{\prime}))\bigg{|}_{s,t=0}
=(0,[X,X]).\displaystyle\qquad=(0,[X,X^{\prime}]).

This immediately implies that the structure constants associated to the weak basis 𝒳\mathcal{X} are of height Os(1)O_{s}(1).
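The derivative identity for the Lie bracket used above can be checked exactly in a simple model. The Python sketch below is an illustration only and assumes the 3×3 Heisenberg group of upper unitriangular matrices as a stand-in 2-step nilpotent group; the matrices `V` and `W` and all helper names are hypothetical. In this model the group commutator of exp(sV) and exp(tW) equals I + st[V,W] exactly, so its mixed derivative at s = t = 0 is [V,W].

```python
from fractions import Fraction

IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def scale(c, a):
    return [[c * a[i][j] for j in range(3)] for i in range(3)]

def exp_nilpotent(a):
    """exp(a) = I + a + a^2/2 for strictly upper triangular 3x3 a (a^3 = 0)."""
    return add(add(IDENTITY, a), scale(Fraction(1, 2), matmul(a, a)))

def bracket(v, w):
    """Matrix Lie bracket [v, w] = vw - wv."""
    return add(matmul(v, w), scale(-1, matmul(w, v)))

def group_commutator(v, w, s, t):
    """exp(sv) exp(tw) exp(-sv) exp(-tw)."""
    out = IDENTITY
    for m in (scale(s, v), scale(t, w), scale(-s, v), scale(-t, w)):
        out = matmul(out, exp_nilpotent(m))
    return out

V = [[0, 1, 4], [0, 0, 2], [0, 0, 0]]
W = [[0, 3, 1], [0, 0, 5], [0, 0, 0]]
s, t = Fraction(2, 3), Fraction(5, 7)
lhs = group_commutator(V, W, s, t)
rhs = add(IDENTITY, scale(s * t, bracket(V, W)))
```

In higher-step groups the commutator has further terms of order at least three in (s, t), which do not affect the mixed second derivative at the origin.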

When including the semi-direct action, we will use the weak basis given by taking elements log((eij,(idGQuot,idGLin)))\log((\vec{e}_{ij},(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))) where ei,j\vec{e}_{i,j} denotes the elementary basis vector in the corresponding direction in RR, placed at the start of 𝒳\mathcal{X}. This is easily seen to preserve the nesting property.

To compute the associated structure constants, first note that

[(eij,(idGQuot,idGLin)),(0,(g,idGLin))]=idGMulti[(\vec{e}_{ij},(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}})),(0,(g,\mathrm{id}_{G_{\mathrm{Lin}}}))]=\mathrm{id}_{G_{\mathrm{Multi}}}

and thus all Lie bracket structure constants of the corresponding form vanish. Furthermore note that

[log((eij,(idGQuot,idGLin))),(0,(0,X))]=ddsddt(0,(exp(tX)sei,j,idGLin))|s,t=0.\displaystyle[\log((\vec{e}_{ij},(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))),(0,(0,X^{\prime}))]=\frac{d}{ds}\frac{d}{dt}(0,(\exp(tX^{\prime})^{s\cdot\vec{e}_{i,j}},\mathrm{id}_{G_{\mathrm{Lin}}}))\bigg{|}_{s,t=0}.

If the type of $X^{\prime}$ does not contain $\widetilde{e}_{i,j}$ then $\exp(tX^{\prime})^{s\cdot\vec{e}_{i,j}}=\mathrm{id}_{G_{\mathrm{Quot}}}$, and otherwise $\exp(tX^{\prime})^{s\cdot\vec{e}_{i,j}}=\exp(stX^{\prime})$ (recall the definition of exponentiation by elements of $R$ given in Section 11.1). In either case the structure constant is appropriately rational. Therefore we may construct a Mal’cev basis adapted to $G_{\mathrm{Multi}}$ with the appropriate complexity by applying [42, Lemma B.11] to $\mathcal{X}$ to construct a Mal’cev basis for $G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}$ and adding the semi-direct Mal’cev basis elements described above to the front of the list. We define this basis to be $\mathcal{X}_{\mathrm{Multi}}$, and define the initial segment corresponding to the semi-direct Mal’cev basis elements to be $\mathcal{X}_{\mathrm{Multi},R}$ and the remaining elements to be $\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}}$.

We finally check that $F_{\mathrm{Multi}}$ is an appropriately Lipschitz function. Let $\delta$ be defined as in Section 11.1. Fix a pair $x,y\in G_{\mathrm{Multi}}$. Note that if

dGMulti/ΓMulti(xΓMulti,yΓMulti)δOs(dOs(1))d_{G_{\mathrm{Multi}}/\Gamma_{\mathrm{Multi}}}(x\Gamma_{\mathrm{Multi}},y\Gamma_{\mathrm{Multi}})\geq\delta^{O_{s}(d^{O_{s}(1)})}

we have that

FMulti(xΓMulti)FMulti(yΓMulti)dGMulti/ΓMulti(xΓMulti,yΓMulti)δOs(dOs(1))2FMulti\frac{F_{\mathrm{Multi}}(x\Gamma_{\mathrm{Multi}})-F_{\mathrm{Multi}}(y\Gamma_{\mathrm{Multi}})}{d_{G_{\mathrm{Multi}}/\Gamma_{\mathrm{Multi}}}(x\Gamma_{\mathrm{Multi}},y\Gamma_{\mathrm{Multi}})}\leq\delta^{-O_{s}(d^{O_{s}(1)})}\cdot 2\lVert F_{\mathrm{Multi}}\rVert_{\infty}

which is sufficiently bounded. Therefore to check the Lipschitz constant it suffices to consider x,yx,y such that dGMulti/ΓMulti(xΓMulti,yΓMulti)δOs(dOs(1))d_{G_{\mathrm{Multi}}/\Gamma_{\mathrm{Multi}}}(x\Gamma_{\mathrm{Multi}},y\Gamma_{\mathrm{Multi}})\leq\delta^{O_{s}(d^{O_{s}(1)})} (where the implicit constants are chosen sufficiently large for the remainder of the argument). By multiplying by elements in the lattice, we may assume that dGMulti/ΓMulti(xΓMulti,yΓMulti)=dGMulti(x,y)d_{G_{\mathrm{Multi}}/\Gamma_{\mathrm{Multi}}}(x\Gamma_{\mathrm{Multi}},y\Gamma_{\mathrm{Multi}})=d_{G_{\mathrm{Multi}}}(x,y), that FMulti(xΓMulti)0F_{\mathrm{Multi}}(x\Gamma_{\mathrm{Multi}})\neq 0, and

ψ𝒳Multi(x)[1/2,1/2)dim(GMulti).\psi_{\mathcal{X}_{\mathrm{Multi}}}(x)\in[-1/2,1/2)^{\dim(G_{\mathrm{Multi}})}.

Note that to assume that FMulti(xΓMulti)0F_{\mathrm{Multi}}(x\Gamma_{\mathrm{Multi}})\neq 0 we may need to swap xx and yy (if both are zero there is nothing to check with respect to the Lipschitz constant).

Since FMulti(xΓMulti)0F_{\mathrm{Multi}}(x\Gamma_{\mathrm{Multi}})\neq 0 we in fact have that the first i=1s1DiLin\sum_{i=1}^{s-1}D_{i}^{\mathrm{Lin}} coordinates of ψ𝒳Multi(x)\psi_{\mathcal{X}_{\mathrm{Multi}}}(x) are in [1/2+δ,1/2δ][-1/2+\delta,1/2-\delta]. This implies, due to the distance bound between xx and yy and by [42, Lemma B.3], that the first i=1s1DiLin\sum_{i=1}^{s-1}D_{i}^{\mathrm{Lin}} coordinates of ψ𝒳Multi(y)\psi_{\mathcal{X}_{\mathrm{Multi}}}(y) are in [1/2+δ/2,1/2δ/2][-1/2+\delta/2,1/2-\delta/2]. Therefore if x=(t1,(g1,g1))x=(t_{1},(g_{1},g_{1}^{\prime})) and y=(t2,(g2,g2))y=(t_{2},(g_{2},g_{2}^{\prime})) then

\begin{align*}
F_{\mathrm{Multi}}(x\Gamma_{\mathrm{Multi}}) &= F^{\ast}(g_{1}^{\prime}\Gamma_{\mathrm{Quot}})\cdot\prod_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}}\phi((t_{1})_{i,j}),\\
F_{\mathrm{Multi}}(y\Gamma_{\mathrm{Multi}}) &= F^{\ast}(g_{2}^{\prime}\Gamma_{\mathrm{Quot}})\cdot\prod_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}}\phi((t_{2})_{i,j}).
\end{align*}

Note that

|FMulti(xΓMulti)FMulti(yΓMulti)|\displaystyle|F_{\mathrm{Multi}}(x\Gamma_{\mathrm{Multi}})-F_{\mathrm{Multi}}(y\Gamma_{\mathrm{Multi}})| F|1is1Di<jDi+DiLinϕ((t1)i,j)1is1Di<jDi+DiLinϕ((t2)i,j)|\displaystyle\leq\lVert F^{\ast}\rVert_{\infty}\cdot\bigg{|}\prod_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}}\phi((t_{1})_{i,j})-\prod_{\begin{subarray}{c}1\leq i\leq s-1\\ D_{i}^{\ast}<j\leq D_{i}^{\ast}+D_{i}^{\mathrm{Lin}}\end{subarray}}\phi((t_{2})_{i,j})\bigg{|}
+|F(g1ΓQuot)F(g2ΓQuot)|,\displaystyle+|F^{\ast}(g_{1}^{\prime}\Gamma_{\mathrm{Quot}})-F^{\ast}(g_{2}^{\prime}\Gamma_{\mathrm{Quot}})|,

where we have used that ϕ\phi is 11-bounded. Next note that distance in ψ𝒳,exp\psi_{\mathcal{X},\mathrm{exp}} controls the distance in ψ𝒳\psi_{\mathcal{X}} for bounded elements by [42, Lemma B.1] and distance in ψ𝒳\psi_{\mathcal{X}} controls distance in dGMultid_{G_{\mathrm{Multi}}} by [42, Lemma B.3]. The first term is therefore sufficiently bounded as ϕ\phi is O(1/δ)O(1/\delta)-Lipschitz.
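The splitting above and the bound on its first term rest on two elementary inequalities, recorded here as a sketch. The symbols $a,a^{\prime},b,b^{\prime}$ and $a_{\ell},b_{\ell}$ are generic complex numbers introduced only for this illustration; in the second line they are assumed to be $1$-bounded.

```latex
\begin{align*}
|ab - a'b'| &= |a(b-b') + b'(a-a')| \le |a|\,|b-b'| + |b'|\,|a-a'|,\\
\Bigl|\prod_{\ell=1}^{m} a_\ell - \prod_{\ell=1}^{m} b_\ell\Bigr|
&= \Bigl|\sum_{\ell=1}^{m} \Bigl(\prod_{r<\ell} b_r\Bigr)(a_\ell-b_\ell)\Bigl(\prod_{r>\ell} a_r\Bigr)\Bigr|
\le \sum_{\ell=1}^{m} |a_\ell-b_\ell| .
\end{align*}
```

Taking $a_{\ell}$ and $b_{\ell}$ to be the values $\phi((t_{1})_{i,j})$ and $\phi((t_{2})_{i,j})$, the second inequality reduces the product term to a sum of at most $\sum_{i}D_{i}^{\mathrm{Lin}}$ differences, each controlled by the Lipschitz constant of $\phi$.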

Finally note that

(g1,g1)\displaystyle(g_{1},g_{1}^{\prime}) =(Xi,Xi)𝒳Multi,GQuotGLinexp((Xi,Xi))xi,\displaystyle=\prod_{(X_{i},X_{i}^{\prime})\in\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}}}\exp((X_{i},X_{i}^{\prime}))^{x_{i}},
(g2,g2)\displaystyle(g_{2},g_{2}^{\prime}) =(Xi,Xi)𝒳Multi,GQuotGLinexp((Xi,Xi))yi,\displaystyle=\prod_{(X_{i},X_{i}^{\prime})\in\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}}}\exp((X_{i},X_{i}^{\prime}))^{y_{i}},

where $x_{i}$ and $y_{i}$ are the coordinates of $x$ and $y$ in $\psi_{\mathcal{X}_{\mathrm{Multi}}}$ in the coordinates corresponding to $\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}}$. This uses that

(t,(idGQuot,idGLin))(0,(g,g))=(t,(g,g)).(t,(\mathrm{id}_{G_{\mathrm{Quot}}},\mathrm{id}_{G_{\mathrm{Lin}}}))\cdot(0,(g,g^{\prime}))=(t,(g,g^{\prime})).

Therefore we have that

g1\displaystyle g_{1} =(Xi,Xi)𝒳Multi,GQuotGLinexp(Xi)xi\displaystyle=\prod_{(X_{i},X_{i}^{\prime})\in\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}}}\exp(X_{i})^{x_{i}}
g2\displaystyle g_{2} =(Xi,Xi)𝒳Multi,GQuotGLinexp(Xi)yi;\displaystyle=\prod_{(X_{i},X_{i}^{\prime})\in\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}}}\exp(X_{i})^{y_{i}};

and note that exp(Xi)\exp(X_{i}) are appropriately bounded elements in GQuotG_{\mathrm{Quot}} since 𝒳Multi,GQuotGLin\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}\ltimes G_{\mathrm{Lin}}} are low height combinations of elements in 𝒳\mathcal{X}. Via telescoping, and using that the metric dGQuotd_{G_{\mathrm{Quot}}} is right-invariant and essentially left-invariant under multiplication by bounded elements (e.g. [42, Lemma B.4]), we have that

dGQuot(g1,g2)\displaystyle d_{G_{\mathrm{Quot}}}(g_{1},g_{2}) δOs(dOs(1))(Xi,Xi)𝒳Multi,GQuotdGQuot(exp(Xi)xiyi,idGQuot)\displaystyle\leq\delta^{-O_{s}(d^{O_{s}(1)})}\cdot\sum_{(X_{i},X_{i}^{\prime})\in\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}}}d_{G_{\mathrm{Quot}}}(\exp(X_{i})^{x_{i}-y_{i}},\mathrm{id}_{G_{\mathrm{Quot}}})
δOs(dOs(1))(Xi,Xi)𝒳Multi,GQuot|xiyi|\displaystyle\leq\delta^{-O_{s}(d^{O_{s}(1)})}\cdot\sum_{(X_{i},X_{i}^{\prime})\in\mathcal{X}_{\mathrm{Multi},G_{\mathrm{Quot}}}}|x_{i}-y_{i}|
δOs(dOs(1))ψ𝒳Multi(x)ψ𝒳Multi(y)δOs(dOs(1))dGMulti(x,y).\displaystyle\leq\delta^{-O_{s}(d^{O_{s}(1)})}\cdot\lVert\psi_{\mathcal{X}_{\mathrm{Multi}}}(x)-\psi_{\mathcal{X}_{\mathrm{Multi}}}(y)\rVert\leq\delta^{-O_{s}(d^{O_{s}(1)})}\cdot d_{G_{\mathrm{Multi}}}(x,y).

This completes the proof upon noting that FF^{\ast} is appropriately Lipschitz on GQuotG_{\mathrm{Quot}}. ∎

Appendix C Nilcharacters

This section is essentially a straightforward quantification of various statements regarding nilcharacters proven in [34, Appendix E].

We first require that equivalence of nilcharacters is a transitive relation; this is a quantified version of [34, Lemma E.7]. Recall the notion of complexity $(M,d)$ that we carry over from Section 12.

Lemma C.1.

Consider three nilcharacters χ1,χ2,χ3\chi_{1},\chi_{2},\chi_{3} each of complexity (M,d)(M,d) and such that the pair χ1\chi_{1} and χ2\chi_{2} and the pair χ2\chi_{2} and χ3\chi_{3} are (M,D,d)(M,D,d)-equivalent for multidegree JJ. Then χ1\chi_{1} and χ3\chi_{3} are ((MD)O|J|(1),(MD)O|J|(1),O(d))((MD)^{O_{|J|}(1)},(MD)^{O_{|J|}(1)},O(d))-equivalent for multidegree JJ.

Proof.

Notice that each coordinate of χ1χ3¯\chi_{1}\otimes\overline{\chi_{3}} may be expressed as the sum of at most DD coordinates of the nilcharacter

χ1(χ2¯χ2)χ3¯;\chi_{1}\otimes(\overline{\chi_{2}}\otimes\chi_{2})\otimes\overline{\chi_{3}};

this follows since the trace of χ2¯χ2\overline{\chi_{2}}\otimes\chi_{2} is 11. The result then follows by rewriting

χ1(χ2¯χ2)χ3¯=(χ1χ2¯)(χ2χ3¯)\chi_{1}\otimes(\overline{\chi_{2}}\otimes\chi_{2})\otimes\overline{\chi_{3}}=(\chi_{1}\otimes\overline{\chi_{2}})\otimes(\chi_{2}\otimes\overline{\chi_{3}})

and applying the assumption. ∎
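The trace identity used in this proof can be unpacked coordinate by coordinate. In the sketch below, $\chi_{2,j}$ denotes the $j$-th coordinate of $\chi_{2}$ and $D^{\prime}$ its output dimension; this notation is introduced only for the illustration, and the sketch assumes (as is standard for nilcharacters) that $\chi_{2}$ takes values of unit norm.

```latex
\begin{align*}
\sum_{j=1}^{D'} \overline{\chi_{2,j}(n)}\,\chi_{2,j}(n) &= \lVert \chi_2(n)\rVert_2^2 = 1,\\
\chi_{1,a}(n)\,\overline{\chi_{3,b}(n)}
&= \sum_{j=1}^{D'} \bigl(\chi_{1,a}(n)\,\overline{\chi_{2,j}(n)}\bigr)\bigl(\chi_{2,j}(n)\,\overline{\chi_{3,b}(n)}\bigr).
\end{align*}
```

Each summand on the right is a product of a coordinate of $\chi_{1}\otimes\overline{\chi_{2}}$ with a coordinate of $\chi_{2}\otimes\overline{\chi_{3}}$, which is the form to which the hypothesis applies.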

We will generally require the following specialization lemmas; these are rather straightforward consequences of the definitions modulo the need to handle slight filtration issues.

Lemma C.2.

We have the following:

  • Consider a nilsequence χ(h1,,hk)\chi(h_{1},\ldots,h_{k}) of multidegree (s1,,sk)(s_{1},\ldots,s_{k}) and complexity (M,d)(M,d). Given hh^{\ast}\in\mathbb{Z}, the function χ(h,h2,,hk)\chi(h^{\ast},h_{2},\ldots,h_{k}), treating hh^{\ast} as fixed, is a multidegree (s2,,sk)(s_{2},\ldots,s_{k}) nilsequence of complexity (MO|s|(dO|s|(1)),d)(M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},d).

  • Consider homomorphisms Li:L_{i}\colon\mathbb{Z}^{\ell}\to\mathbb{Z}. If χ(h1,,hk)\chi(h_{1},\ldots,h_{k}) is a nilsequence of degree ss of complexity (M,d)(M,d) then χ(L1(t1,,t),,Lk(t1,,t))\chi(L_{1}(t_{1},\ldots,t_{\ell}),\ldots,L_{k}(t_{1},\ldots,t_{\ell})) is a degree ss nilsequence in variables t1,,tt_{1},\ldots,t_{\ell} of complexity (M,d)(M,d).

  • If χ(h1,,hk)\chi(h_{1},\ldots,h_{k}) is a nilsequence of multidegree (s1,,sk)(s_{1},\ldots,s_{k}) of complexity (M,d)(M,d) then it is also a nilsequence of degree s1++sks_{1}+\cdots+s_{k} of complexity (M,d)(M,d).

Remark.

This result allows us to interpret expressions such as χ(h1+h1,h2,,hk)\chi(h_{1}+h_{1}^{\prime},h_{2},\ldots,h_{k}) as an appropriate degree nilcharacter in k+1k+1 variables, if χ\chi is a nilcharacter in kk variables with multidegree (s1,,sk)(s_{1},\ldots,s_{k}).

Proof.

We handle these items in reverse order (which is also the order of increasing difficulty). Let

χ(h1,,hk)=F(g(h1,,hk)Γ)\chi(h_{1},\ldots,h_{k})=F(g(h_{1},\ldots,h_{k})\Gamma)

with the underlying nilmanifold being G/ΓG/\Gamma and the specified Mal’cev basis being 𝒳\mathcal{X}.

For the last claim, note that 𝒳\mathcal{X} (by the definition of complexity for multidegree nilmanifolds) is adapted to the degree filtration Gt=|i|=tGiG_{t}=\bigvee_{|\vec{i}|=t}G_{\vec{i}}. Furthermore by the inclusion given on [34, p. 1264] or direct inspection given the Taylor expansion in [34, Lemma B.9], we have that g(h1,,hk)g(h_{1},\ldots,h_{k}) is a polynomial sequence with respect to the degree filtration G0=G1G2Gs1++skIdGG_{0}=G_{1}\geqslant G_{2}\geqslant\cdots\geqslant G_{s_{1}+\cdots+s_{k}}\geqslant\mathrm{Id}_{G}. The desired result follows immediately.

For the second item, notice that if a polynomial P(x1,,xk)P(x_{1},\ldots,x_{k}) has total degree ss, then for any linear maps Li:L_{i}\colon\mathbb{R}^{\ell}\to\mathbb{R} we have that P(L1(y1,,y),,Lk(y1,,y))P(L_{1}(y_{1},\ldots,y_{\ell}),\ldots,L_{k}(y_{1},\ldots,y_{\ell})) has total degree ss. This coupled with Taylor expansion [34, Lemma B.9] and the fact that the set of polynomial sequences with respect to a given II-filtration is a group (by [34, Corollary B.4]) implies the result.
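As a concrete instance of the degree claim in the second item, consider the case $k=2$, $\ell=3$. The polynomial and the linear forms below are chosen purely for illustration.

```latex
\begin{align*}
P(x_1,x_2) &= x_1x_2^2, \qquad L_1(y_1,y_2,y_3) = y_1+y_2, \qquad L_2(y_1,y_2,y_3) = y_1-y_3,\\
P(L_1(y),L_2(y)) &= (y_1+y_2)(y_1-y_3)^2
= y_1^3 + y_1^2y_2 - 2y_1^2y_3 - 2y_1y_2y_3 + y_1y_3^2 + y_2y_3^2 .
\end{align*}
```

Every monomial on the right has total degree $3$. In general the degree can drop under substitution (for instance if some $L_{i}$ vanishes), so the substituted polynomial has total degree at most that of $P$, which is all the argument requires.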

We now handle the first item; this is the only nontrivial part. Write g(h,0,,0)={g(h,0,,0)}[g(h,0,,0)]g(h^{\ast},0,\ldots,0)=\{g(h^{\ast},0,\ldots,0)\}[g(h^{\ast},0,\ldots,0)] with ψG,𝒳({g(h,0,,0)})[0,1)dim(G)\psi_{G,\mathcal{X}}(\{g(h^{\ast},0,\ldots,0)\})\in[0,1)^{\dim(G)} and [g(h,0,,0)]Γ[g(h^{\ast},0,\ldots,0)]\in\Gamma. We replace the polynomial sequence gg by g={g(h,0,,0)}1g[g(h,0,,0)]1g^{\prime}=\{g(h^{\ast},0,\ldots,0)\}^{-1}g[g(h^{\ast},0,\ldots,0)]^{-1} and FF by the function F()=F({g((h,0,,0))})F^{\prime}(\cdot)=F(\{g((h^{\ast},0,\ldots,0))\}\cdot). We may thus assume, at the cost of replacing MM by MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}, that g(h,0,,0)=idGg(h^{\ast},0,\ldots,0)=\mathrm{id}_{G}.

We now apply [34, Lemma B.9] to see

g(h1,,hk)=i1,,ikg(i1,,ik)(h1i1)(hkik)g(h_{1},\ldots,h_{k})=\prod_{i_{1},\ldots,i_{k}}g_{(i_{1},\ldots,i_{k})}^{\binom{h_{1}}{i_{1}}\cdots\binom{h_{k}}{i_{k}}}

where we order (i1,,ik)(i_{1},\ldots,i_{k}) lexicographically with indices considered in reverse order in the product (in particular, the first few terms are (0,,0)(0,\ldots,0), (1,,0)(1,\ldots,0), (2,,0)(2,\ldots,0), and so on) and g(i1,,ik)G(i1,,ik)g_{(i_{1},\ldots,i_{k})}\in G_{(i_{1},\ldots,i_{k})}. As g(h,0,,0)=idGg(h^{\ast},0,\ldots,0)=\mathrm{id}_{G}, we have that

g(h,h2,,hk)=i2,,iki2++ik>0g(i1,,ik)(hi1)(h2i2)(hkik).g(h^{\ast},h_{2},\ldots,h_{k})=\prod_{\begin{subarray}{c}i_{2},\ldots,i_{k}\\ i_{2}+\cdots+i_{k}>0\end{subarray}}g_{(i_{1},\ldots,i_{k})}^{\binom{h^{\ast}}{i_{1}}\cdot\binom{h_{2}}{i_{2}}\cdots\binom{h_{k}}{i_{k}}}.

It then follows that g(h,h2,,hk)g(h^{\ast},h_{2},\ldots,h_{k}) is a polynomial with respect to

G==2kGeG^{\ast}=\bigvee_{\ell=2}^{k}G_{\vec{e}_{\ell}}

which we give a multidegree filtration G(i2,,ik)=G(0,i2,,ik)G^{\ast}_{(i_{2},\ldots,i_{k})}=G_{(0,i_{2},\ldots,i_{k})} for i2++ik>0i_{2}+\cdots+i_{k}>0 and G(0,,0)=GG^{\ast}_{(0,\ldots,0)}=G^{\ast}. Note that all subgroups in this filtration are MM-rational with respect to 𝒳\mathcal{X} and that (G)t=|i|=tGi(G^{\ast})_{t}=\bigvee_{|\vec{i}|=t}G^{\ast}_{\vec{i}} is a degree |s|s1|\vec{s}|-s_{1} filtration. Therefore applying [42, Lemma B.11] guarantees that we may find a Mal’cev basis 𝒳\mathcal{X}^{\ast} for GG^{\ast} (which is MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}-rational with respect to 𝒳\mathcal{X}). Descending FF to GG^{\ast} gives the desired result with the necessary Lipschitz bound following from [42, Lemma B.9]. ∎

We now state a quantified version of [34, Lemma E.8]. Recall the notion of equivalence (Definition 7.3).

Lemma C.3.

Consider a nilcharacter χ\chi with complexity (M,d)(M,d) of multidegree s=(s1,,sk)\vec{s}=(s_{1},\ldots,s_{k}) with |s|=s1++sk|\vec{s}|=s_{1}+\cdots+s_{k}. We have that:

  • The nilcharacters

    χ() and χ()\chi(\cdot)\emph{ and }\chi(\cdot)

    are $(M^{O_{|\vec{s}|}(1)},M^{O_{|\vec{s}|}(1)},O(d))$-equivalent for multidegree $<(s_{1},\ldots,s_{k})$. (Here multidegree $<(s_{1},\ldots,s_{k})$ means that we take the down-set generated by $(s_{1},\ldots,s_{k})$ and then remove $(s_{1},\ldots,s_{k})$.)

  • Fix hh^{\ast}\in\mathbb{Z}. The nilcharacters

    χ(+hej) and χ()\chi(\cdot+h^{\ast}\vec{e}_{j})\emph{ and }\chi(\cdot)

    are (MO|s|(dO|s|(1)),MO|s|(dO|s|(1)),O(d))(M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},O(d))-equivalent for multidegree <(s1,,sk)<(s_{1},\ldots,s_{k}).

  • Fix qq\in\mathbb{Z}. Then

    χq|s|() and χ(q)\chi^{\otimes q^{|\vec{s}|}}(\cdot)\emph{ and }\chi(q\cdot)

    are (MO|s|,q(dO|s|,q(1)),MO|s|,q(dO|s|,q(1)),dOq(1))(M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})},M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})},d^{O_{q}(1)})-equivalent for multidegree <(s1,,sk)<(s_{1},\ldots,s_{k}).

  • Fix q>0q\in\mathbb{Z}^{>0}. There exists a nilcharacter χ~\widetilde{\chi} of complexity (MO|s|,q(dO|s|,q(1)),dOq(1))(M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})},d^{O_{q}(1)}) such that

    χ() and χ~q()\chi(\cdot)\emph{ and }\widetilde{\chi}^{\otimes q}(\cdot)

    are (MO|s|,q(dO|s|,q(1)),MO|s|,q(dO|s|,q(1)),dOq(1))(M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})},M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})},d^{O_{q}(1)})-equivalent for multidegree <(s1,,sk)<(s_{1},\ldots,s_{k}).

Remark.

χq\chi^{-\otimes q} for q>0q\in\mathbb{Z}^{>0} is interpreted as χ¯q\overline{\chi}^{\otimes q}.
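It may help to keep in mind the model case of a single variable and a pure phase. The following sketch assumes $k=1$, $s_{1}=s\geq 1$, and $\chi(n)=e(\alpha n^{s})$ with $e(x)=e^{2\pi ix}$ and $\alpha\in\mathbb{R}$. It is offered as a heuristic illustration of the last three items of the lemma, not as part of the proof, and it ignores all complexity parameters.

```latex
\begin{align*}
\chi(n+h^{\ast})\,\overline{\chi(n)} &= e\Bigl(\alpha\sum_{j=0}^{s-1}\binom{s}{j}(h^{\ast})^{s-j}n^{j}\Bigr)
&&\text{(a polynomial phase of degree at most } s-1\text{)},\\
\chi(qn) &= e(\alpha q^{s}n^{s}) = \chi(n)^{q^{s}},\\
\widetilde{\chi}(n) &:= e(\alpha n^{s}/q) \quad\text{satisfies}\quad \widetilde{\chi}(n)^{q} = \chi(n).
\end{align*}
```

In this model case the second and third relations are exact identities; in the general nilcharacter setting they hold only up to lower-order terms, which is what equivalence for multidegree $<(s_{1},\ldots,s_{k})$ records.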

Proof.

Throughout the proof, we let

χ(n)=F(g(n)Γ)\chi(\vec{n})=F(g(\vec{n})\Gamma)

where the underlying nilmanifold is G/ΓG/\Gamma and the underlying Mal’cev basis is 𝒳\mathcal{X}. When going from item to item, we may reuse variables (e.g., GG^{\prime} will be defined in multiple different manners throughout the proof). Additionally, the following analysis implicitly uses that |s|1|\vec{s}|\geq 1; in the remaining case s=0\vec{s}=0 all nilsequences become fixed constants and the result is obvious.

For the first item, note that coordinates of χχ¯\chi\otimes\overline{\chi} are multidegree (s1,,sk)(s_{1},\ldots,s_{k}) polynomial sequences with respect to group G={(g,g):gG}G^{\prime}=\{(g,g)\colon g\in G\} given the filtration

Gi={(g,g):gGi}.G^{\prime}_{\vec{i}}=\{(g,g)\colon g\in G_{\vec{i}}\}.

As all coordinates of χ\chi have the same vertical frequency, we have that the coordinates of χχ¯\chi\otimes\overline{\chi} are invariant under G(s1,,sk)G^{\prime}_{(s_{1},\ldots,s_{k})}. This immediately gives the desired result upon taking a quotient and using Lemma 3.10.

For the second item, note that G+ej=(Gi+ej)iIG^{+\vec{e}_{j}}=(G_{\vec{i}+\vec{e}_{j}})_{\vec{i}\in I} is a shifted filtration. Note that this is an II-filtration with respect to the multidegree ordering. We define the group

G==1k(Ge×IdG){(g,g):gG},G^{\prime}=\bigvee_{\ell=1}^{k}(G_{\vec{e}_{\ell}}\times\mathrm{Id}_{G})\vee\{(g,g)\colon g\in G\},

let Γ=G(Γ×Γ)\Gamma^{\prime}=G^{\prime}\cap(\Gamma\times\Gamma), and define the following II-filtration with respect to the multidegree ordering:

Gi==1k(Gi+e×IdG){(g,g):gGi}.G^{\prime}_{\vec{i}}=\bigvee_{\ell=1}^{k}(G_{\vec{i}+\vec{e}_{\ell}}\times\mathrm{Id}_{G})\vee\{(g,g)\colon g\in G_{\vec{i}}\}.

By using Lemma 2.2 we may see that this is a valid filtration. We define the cocompact groups similarly. Now the proof of [34, Lemma E.8] shows that

(g(n+hej),g(n))(g(\vec{n}+h\vec{e}_{j}),g(\vec{n}))

is a polynomial sequence with respect to this filtration and that

F~((x,y)(Γ×Γ))=F(xΓ)F(yΓ)¯\widetilde{F}((x,y)(\Gamma\times\Gamma))=F(x\Gamma)\otimes\overline{F(y\Gamma)}

is invariant under the action of Gs={(g,g):gG(s1,,sk)}G_{\vec{s}}^{\prime}=\{(g,g)\colon g\in G_{(s_{1},\ldots,s_{k})}\}.

We first construct a Mal’cev basis 𝒳\mathcal{X}^{\prime} on GG^{\prime}. Define Gt=|i|=tGi+ejG_{t}^{\ast}=\bigvee_{|\vec{i}|=t}G_{\vec{i}+\vec{e}_{j}} and note that G=G0G^{\ast}=G_{0}^{\ast} has a degree filtration

G0=G0G1G2G_{0}^{\ast}=G_{0}^{\ast}\geqslant G_{1}^{\ast}\geqslant G_{2}^{\ast}\geqslant\cdots

and all these subgroups are MO|s|(1)M^{O_{|\vec{s}|}(1)}-rational with respect to 𝒳\mathcal{X}. Therefore GG^{\ast} has a Mal’cev basis 𝒳\mathcal{X}^{\ast} which is adapted to this filtration and all elements are height at most MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})} combinations of elements in 𝒳\mathcal{X} by [42, Lemma B.11]. Note that

{(X,X):X𝒳}{(X,0):X𝒳}\{(X,X)\colon X\in\mathcal{X}\}\cup\{(X^{\ast},0)\colon X^{\ast}\in\mathcal{X}^{\ast}\}

is easily shown to be a weak basis of rationality MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})} for G/ΓG^{\prime}/\Gamma^{\prime} and has the degree O|s|(1)O_{|\vec{s}|}(1) nesting property. Letting Gt=|i|=tGiG_{t}^{\prime}=\bigvee_{|\vec{i}|=t}G_{\vec{i}}^{\prime} we see that

G=GG1G2G^{\prime}=G^{\prime}\geqslant G_{1}^{\prime}\geqslant G_{2}^{\prime}\geqslant\cdots

form a sequence of subgroups such that [G,Gi]Gi+1[G^{\prime},G_{i}^{\prime}]\leqslant G_{i+1}^{\prime} for i0i\geq 0 (with G0=GG_{0}^{\prime}=G^{\prime}). Thus by [42, Lemma B.11] we can find a Mal’cev basis 𝒳\mathcal{X}^{\prime} adapted to this sequence such that each element is a height MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})} linear combination of

{(X,X):X𝒳}{(X,0):X𝒳}.\{(X,X)\colon X\in\mathcal{X}\}\cup\{(X^{\ast},0)\colon X^{\ast}\in\mathcal{X}^{\ast}\}.

At present, however, we see that GG^{\prime} has not been given a multidegree filtration (only an II-filtration with respect to the multidegree ordering; recall Definition 2.4). We replace GG^{\prime} by

G~==1kGe=G1\widetilde{G}=\bigvee_{\ell=1}^{k}G^{\prime}_{\vec{e}_{\ell}}=G_{1}^{\prime}

and note that $\widetilde{G}$ is appropriately rational with respect to $\mathcal{X}^{\prime}$ and $G^{\prime}_{\vec{i}}\leqslant\widetilde{G}$ for $\vec{i}\neq 0$. Note that $\widetilde{G}$ is easily seen to have a multidegree $(s_{1},\ldots,s_{k})$ filtration. Furthermore, removing the initial $\dim(G^{\prime})-\dim(\widetilde{G})$ elements, we see that the truncation of $\mathcal{X}^{\prime}$ is a valid Mal’cev basis for $\widetilde{G}$ of complexity $M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}$ and all subgroups in the multidegree filtration are $M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}$-rational.

We now write (g(hej),g(0))={(g(hej),g(0))}[(g(hej),g(0))](g(h^{\ast}\vec{e}_{j}),g(0))=\{(g(h^{\ast}\vec{e}_{j}),g(0))\}[(g(h^{\ast}\vec{e}_{j}),g(0))] where

ψ𝒳({(g(hej),g(0))})MO|s|(dO|s|(1))\lVert\psi_{\mathcal{X}^{\prime}}(\{(g(h^{\ast}\vec{e}_{j}),g(0))\})\rVert_{\infty}\leq M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}

and $[(g(h^{\ast}\vec{e}_{j}),g(0))]\in\Gamma^{\prime}$. We consider the modified polynomial sequence

g(n)={(g(hej),g(0))}1(g(n+hej),g(n))[(g(hej),g(0))]1;g^{\prime}(\vec{n})=\{(g(h^{\ast}\vec{e}_{j}),g(0))\}^{-1}(g(\vec{n}+h^{\ast}\vec{e}_{j}),g(\vec{n}))[(g(h^{\ast}\vec{e}_{j}),g(0))]^{-1};

evaluating at n=0\vec{n}=0 this is now seen to be a polynomial sequence in G~\widetilde{G}. Defining

F((x,y)(Γ×Γ))=F~({(g(hej),g(0))}(x,y)(Γ×Γ)),F^{\prime}((x,y)(\Gamma\times\Gamma))=\widetilde{F}(\{(g(h^{\ast}\vec{e}_{j}),g(0))\}(x,y)(\Gamma\times\Gamma)),

we have that FF^{\prime} is invariant under (G~)(s1,,sk)(\widetilde{G})_{(s_{1},\ldots,s_{k})} and

F(g(n))=F(g(n+hej))F¯(g(n)).F^{\prime}(g^{\prime}(\vec{n}))=F(g(\vec{n}+h^{\ast}\vec{e}_{j}))\overline{F}(g(\vec{n})).

We may pass to the quotient group G~/G~(s1,,sk)\widetilde{G}/\widetilde{G}_{(s_{1},\ldots,s_{k})} and the desired result is essentially an immediate consequence of Lemma 3.10.

We now come to the third item; we only maintain the notation from the first sentence of the proof. Note that via writing g(0)={g(0)}[g(0)]g(0)=\{g(0)\}[g(0)] with [g(0)]Γ[g(0)]\in\Gamma and ψ𝒳({g(0)})[0,1)dim(G)\psi_{\mathcal{X}}(\{g(0)\})\in[0,1)^{\dim(G)}, replacing g(n)g(\vec{n}) by {g(0)}1g(n)[g(0)]1\{g(0)\}^{-1}g(\vec{n})[g(0)]^{-1} and FF by F({g(0)})F(\{g(0)\}\cdot), up to replacing MM by MO|s|(1)M^{O_{|\vec{s}|}(1)} we may assume that g(0)=idGg(0)=\mathrm{id}_{G}.

Define

Gi=j>i(Gj×Gj){(gq|i|,g):gGi};G^{\prime}_{\vec{i}}=\bigvee_{\vec{j}>\vec{i}}(G_{\vec{j}}\times G_{\vec{j}})\vee\bigvee\{(g^{q^{|\vec{i}|}},g)\colon g\in G_{\vec{i}}\};

here j>i\vec{j}>\vec{i} means j\vec{j} is coordinate-wise at least as large as i\vec{i} and not identical. Furthermore Γ=G(Γ×Γ)\Gamma^{\prime}=G^{\prime}\cap(\Gamma\times\Gamma); note that GG^{\prime} is isomorphic to G×GG\times G however we have given the group an alternate filtration. This is verified to be an II-filtration with respect to the multidegree ordering in [34, p. 1356]. Furthermore the proof of [34, Lemma E.8] shows that

(g(qn),g(n))(g(q\vec{n}),g(\vec{n}))

is a polynomial sequence with respect to this filtration and that

F~((x,y)(Γ×Γ))=F(xΓ)F(yΓ)¯q|s|\widetilde{F}((x,y)(\Gamma\times\Gamma))=F(x\Gamma)\otimes\overline{F(y\Gamma)}^{\otimes q^{|\vec{s}|}}

is invariant under the action of Gs={(gq|s|,g):gG(s1,,sk)}G_{\vec{s}}=\{(g^{q^{|\vec{s}|}},g)\colon g\in G_{(s_{1},\ldots,s_{k})}\}. The primary technical issue, as before, is that while this is an II–filtration with respect to the multidegree ordering this is not a multidegree filtration (Definition 2.4).

We first give GG^{\prime} a Mal’cev basis. Note that 𝒳\mathcal{X} is adapted to the degree filtration on GG given by Gt=|i|=tGiG_{t}=\bigvee_{|\vec{i}|=t}G_{\vec{i}} (Definition 3.8). It is immediate to see that

𝒳={(X,0):X𝒳G1}{(0,X):X𝒳G1}\mathcal{X}^{\ast}=\{(X,0)\colon X\in\mathcal{X}\cap G_{1}\}\cup\{(0,X)\colon X\in\mathcal{X}\cap G_{1}\}

is a Mal’cev basis for the product filtration on GG. Then using [42, Lemma B.11] on

G0=G0G1G_{0}^{\prime}=G_{0}^{\prime}\geqslant G_{1}^{\prime}\geqslant\cdots

where $G_{t}^{\prime}=\bigvee_{|\vec{i}|=t}G_{\vec{i}}^{\prime}$, which is seen to satisfy $[G_{0}^{\prime},G_{t}^{\prime}]\leqslant G_{t+1}^{\prime}$ for $t\geq 0$, we easily construct a Mal’cev basis $\mathcal{X}^{\prime}$ for $G^{\prime}/\Gamma^{\prime}$ coming from combinations of $\mathcal{X}^{\ast}$. (We implicitly use that $G=\bigvee_{j}G_{\vec{e}_{j}}$.) As $G$ has complexity $M$, it is trivial to see that all subgroups in the filtration of $G^{\prime}$ are $M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})}$-rational. The Mal’cev basis $\mathcal{X}^{\prime}$ clearly has the nesting property of order $|\vec{s}|$ since $\mathcal{X}$ does.

We define G~\widetilde{G} as

G~==1kGe\widetilde{G}=\bigvee_{\ell=1}^{k}G^{\prime}_{\vec{e}_{\ell}}

and this group is seen to be appropriately rational with respect to $\mathcal{X}^{\prime}$ and is given the multidegree filtration $\widetilde{G}_{\vec{i}}=G^{\prime}_{\vec{i}}$ for $\vec{i}\neq 0$. Noting that the constant term of the Taylor expansion of $(g(q\vec{n}),g(\vec{n}))$ is $(\mathrm{id}_{G},\mathrm{id}_{G})$, we have that this is in fact a polynomial sequence with respect to the multidegree filtration given to $\widetilde{G}$. (This is where we use that we reduced to $g(0)=\mathrm{id}_{G}$.) Furthermore, letting $\widetilde{G}_{t}=\bigvee_{|\vec{i}|=t}\widetilde{G}_{\vec{i}}$, we see that a truncation of $\mathcal{X}^{\prime}$ is a Mal’cev basis adapted to $\widetilde{G}_{0}=\widetilde{G}_{1}\geqslant\widetilde{G}_{2}\geqslant\cdots$ where each element is an $M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})}$-rational combination of $\mathcal{X}^{\ast}$. As $\widetilde{F}$ is invariant under $\widetilde{G}_{(s_{1},\ldots,s_{k})}$, by passing to the quotient $\widetilde{G}/\widetilde{G}_{(s_{1},\ldots,s_{k})}$ and applying Lemma 3.10 we immediately finish the proof.

We finally deduce the fourth item from the third item. Let g(n)=g(n/q)g^{\prime}(n)=g(n/q); note that gg may be extended to take on rational input via using Mal’cev coordinates and we may treat gg^{\prime} as a valid polynomial sequence. By applying the third item, we have that

F(g(n)) and F(g(n))q|s|F(g(n))\text{ and }F(g^{\prime}(n))^{\otimes q^{|\vec{s}|}}

are (MO|s|,q(dO|s|,q(1)),MO|s|,q(dO|s|,q(1)),dOq(1))(M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})},M^{O_{|\vec{s}|,q}(d^{O_{|\vec{s}|,q}(1)})},d^{O_{q}(1)})-equivalent for multidegree <(s1,,sk)<(s_{1},\ldots,s_{k}). Outputting F(g(n))q|s|1F(g^{\prime}(n))^{\otimes q^{|\vec{s}|-1}} then gives the desired result. ∎
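The bookkeeping in this last step can be summarized in one line. In the sketch below, $\sim$ is a shorthand (used only here) for equivalence for multidegree $<(s_{1},\ldots,s_{k})$ with the parameters stated above, and the regrouping of tensor powers uses $|\vec{s}|\geq 1$.

```latex
\begin{align*}
\chi(n) = F(g(n)\Gamma) = F(g'(qn)\Gamma)
\;\sim\; F(g'(n)\Gamma)^{\otimes q^{|\vec{s}|}}
= \Bigl(F(g'(n)\Gamma)^{\otimes q^{|\vec{s}|-1}}\Bigr)^{\otimes q}
= \widetilde{\chi}(n)^{\otimes q},
\qquad \widetilde{\chi}(n) := F(g'(n)\Gamma)^{\otimes q^{|\vec{s}|-1}}.
\end{align*}
```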

The next lemma is a quantified version of [34, Lemma 13.2]. The proof is once again essentially identical modulo noting slight changes in the filtration notions.

Lemma C.4.

Consider χ:k\chi\colon\mathbb{Z}^{k}\to\mathbb{C} which is a multidegree (1,,1)(1,\ldots,1) nilcharacter of complexity (M,d)(M,d). Then

χ(h1+h1,h2,,hk) and χ(h1,h2,,hk)χ(h1,h2,,hk)\chi(h_{1}+h_{1}^{\prime},h_{2},\ldots,h_{k})\emph{ and }\chi(h_{1},h_{2},\ldots,h_{k})\otimes\chi(h_{1}^{\prime},h_{2},\ldots,h_{k})

are (MOk(dOk(1)),dOk(1))(M^{O_{k}(d^{O_{k}(1)})},d^{O_{k}(1)})-equivalent for degree (k1)(k-1).

Proof.

Let χ(h1,,hk)=F(g(h1,,hk)Γ)\chi(h_{1},\ldots,h_{k})=F(g(h_{1},\ldots,h_{k})\Gamma) where the underlying nilmanifold is G/ΓG/\Gamma and the underlying Mal’cev basis is 𝒳\mathcal{X}. Let g(0,,0)={g(0,,0)}[g(0,,0)]g(0,\ldots,0)=\{g(0,\ldots,0)\}[g(0,\ldots,0)] with [g(0,,0)]Γ[g(0,\ldots,0)]\in\Gamma and ψG,𝒳({g(0,,0)})[0,1)dim(G)\psi_{G,\mathcal{X}}(\{g(0,\ldots,0)\})\in[0,1)^{\dim(G)}. We have that

F(g(h1,,hk)Γ)=F({g(0,,0)}({g(0,,0)}1g(h1,,hk)[g(0,,0)]1Γ)).F(g(h_{1},\ldots,h_{k})\Gamma)=F(\{g(0,\ldots,0)\}\cdot(\{g(0,\ldots,0)\}^{-1}g(h_{1},\ldots,h_{k})[g(0,\ldots,0)]^{-1}\Gamma)).

We let $F^{\prime}(\cdot\Gamma)=F(\{g(0,\ldots,0)\}\cdot\Gamma)$ and $g^{\prime}(h_{1},\ldots,h_{k})=\{g(0,\ldots,0)\}^{-1}g(h_{1},\ldots,h_{k})[g(0,\ldots,0)]^{-1}$. Thus we may replace $F$ by $F^{\prime}$ and $g$ by $g^{\prime}$ (at the cost of replacing $M$ by $M^{O_{k}(d^{O_{k}(1)})}$) and assume that $g(0)=\mathrm{id}_{G}$.

Consider, for t1t\geq 1,

Gt=|i|>t(Gi×Gi×Gi)|i|=t{(g1g2,g1,g2):giGi}G^{\prime}_{t}=\bigvee_{|\vec{i}|>t}(G_{\vec{i}}\times G_{\vec{i}}\times G_{\vec{i}})\vee\bigvee_{|\vec{i}|=t}\{(g_{1}g_{2},g_{1},g_{2}):g_{i}\in G_{\vec{i}}\}

and take G=G1G^{\prime}=G^{\prime}_{1}. Via Baker–Campbell–Hausdorff this gives a valid degree kk filtration

G=G1G2GkIdG.G^{\prime}=G^{\prime}_{1}\geqslant G^{\prime}_{2}\geqslant\cdots\geqslant G^{\prime}_{k}\geqslant\mathrm{Id}_{G^{\prime}}.

We define Γ=G(Γ×Γ×Γ)\Gamma^{\prime}=G^{\prime}\cap(\Gamma\times\Gamma\times\Gamma). We now verify that

(g(h1+h1,h2,,hk),g(h1,h2,,hk),g(h1,h2,,hk))(g(h_{1}+h_{1}^{\prime},h_{2},\ldots,h_{k}),g(h_{1},h_{2},\ldots,h_{k}),g(h_{1}^{\prime},h_{2},\ldots,h_{k}))

is a polynomial sequence with respect to this degree filtration. This is immediate upon noting that, by Taylor expansion [34, Lemma B.9] and the condition at 0, we have

g(h1,,hk)=(i1,,ik){0,1}k{0}gi1,,ik(h1i1)(hkik)g(h_{1},\ldots,h_{k})=\prod_{(i_{1},\ldots,i_{k})\in\{0,1\}^{k}\setminus\{\vec{0}\}}g_{i_{1},\ldots,i_{k}}^{\binom{h_{1}}{i_{1}}\cdots\binom{h_{k}}{i_{k}}}

with gi1,,ikG(i1,,ik)g_{i_{1},\ldots,i_{k}}\in G_{(i_{1},\ldots,i_{k})}. The desired polynomiality of the tripled sequence then follows easily from Baker–Campbell–Hausdorff and the fact that the degree of the exponents in the Taylor expansion for the h1h_{1} term is at most 11.
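The role of the exponent bound can be seen in the following model computation. It assumes the pure-phase case $\chi(h_{1},\ldots,h_{k})=e(\alpha h_{1}\cdots h_{k}+\beta h_{2}\cdots h_{k})$ with $e(x)=e^{2\pi ix}$ and $\alpha,\beta\in\mathbb{R}$. This is an illustrative special case chosen here, not the general situation treated in the proof.

```latex
\begin{align*}
\binom{h_1+h_1'}{1} &= \binom{h_1}{1}+\binom{h_1'}{1},\\
\chi(h_1+h_1',h_2,\ldots,h_k)\,\overline{\chi(h_1,h_2,\ldots,h_k)}\,\overline{\chi(h_1',h_2,\ldots,h_k)}
&= e(-\beta h_2\cdots h_k).
\end{align*}
```

The top-degree term is additive in $h_{1}$ and cancels exactly, while the term not involving $h_{1}$ is counted once on the left and twice on the right; the leftover phase has degree $k-1$, consistent with the conclusion of the lemma.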

We now construct a Mal’cev basis for GG^{\prime}. Let Gt=|i|=tGiG_{t}=\bigvee_{|\vec{i}|=t}G_{\vec{i}} and note that by definition 𝒳\mathcal{X} is adapted to GtG_{t}. We may prove that

𝒳={(X,0,0):X𝒳log(G2)}{(0,X,0):X𝒳log(G2)}{(0,0,X):X𝒳log(G2)}\displaystyle\mathcal{X}^{{}^{\prime}}=\{(X,0,0)\colon X\in\mathcal{X}\cap\log(G_{2})\}\cup\{(0,X,0)\colon X\in\mathcal{X}\cap\log(G_{2})\}\cup\{(0,0,X):X\in\mathcal{X}\cap\log(G_{2})\}
{(X,X,0):X𝒳log(G1)𝒳log(G2)}{(X,0,X):X𝒳log(G1)𝒳log(G2)}\displaystyle\cup\{(X,X,0)\colon X\in\mathcal{X}\cap\log(G_{1})\setminus\mathcal{X}\cap\log(G_{2})\}\cup\{(X,0,X)\colon X\in\mathcal{X}\cap\log(G_{1})\setminus\mathcal{X}\cap\log(G_{2})\}

is a weak basis for GG^{\prime}. Furthermore this basis is easily seen to have the nesting property of order kk and that all subgroups GtG_{t}^{\prime} are MOk(dOk(1))M^{O_{k}(d^{O_{k}(1)})}-rational. Thus applying [42, Lemma B.11] we may find a Mal’cev basis 𝒳~\widetilde{\mathcal{X}} for GG^{\prime} adapted to the given filtration of complexity MOk(dOk(1))M^{O_{k}(d^{O_{k}(1)})} and such that all basis elements are MOk(dOk(1))M^{O_{k}(d^{O_{k}(1)})}-rational combinations of 𝒳\mathcal{X}^{\prime}.

The function F~\widetilde{F} we will consider is

F~((x,y,z)Γ3)=F(xΓ)F(yΓ)¯F(zΓ)¯.\widetilde{F}((x,y,z)\Gamma^{\otimes 3})=F(x\Gamma)\otimes\overline{F(y\Gamma)}\otimes\overline{F(z\Gamma)}.

This is easily seen to be Lipschitz on $G\times G\times G$ when given the Mal’cev basis $\mathcal{X}^{\ast}=\{(X,0,0)\colon X\in\mathcal{X}\}\cup\{(0,X,0)\colon X\in\mathcal{X}\}\cup\{(0,0,X)\colon X\in\mathcal{X}\}$. As the basis elements of $\widetilde{\mathcal{X}}$ are rational combinations of $\mathcal{X}^{\ast}$ of height $M^{O_{k}(d^{O_{k}(1)})}$, we find that $\widetilde{F}$ is $M^{O_{k}(d^{O_{k}(1)})}$-Lipschitz on $G^{\prime}/\Gamma^{\prime}$.

Note that F~\widetilde{F} is invariant under the group Gk3G_{k}^{3} and therefore taking the output quotient group G3/Gk3G^{3}/G_{k}^{3} with lattice Γ3/(Γ3Gk3)\Gamma^{3}/(\Gamma^{3}\cap G_{k}^{3}) and using Lemma 3.10 completes the proof. ∎

We now come to the most technical of the complexity justifications we will need to perform, multilinearization. We will give a rather barebones analysis (citing much from [34, Proposition E.9, E.10]); the reader may find useful the discussion in [34, pp. 1360-1363], where an extended example is discussed. (To ease checking the extra complexity details, we prove a slightly weaker statement, which is all that is used in the analysis.)

Lemma C.5.

Consider a nilcharacter χ(h1,,hk)\chi(h_{1},\ldots,h_{k}) of multidegree (s1,,sk)(s_{1},\ldots,s_{k}) and complexity (M,d)(M,d). There exists a multidegree (1,,1)(1,\ldots,1) nilcharacter

χ(h1,1,,h1,s1,h2,1,,h2,s2,,hk,1,,hk,sk)\chi^{\prime}(h_{1,1},\ldots,h_{1,s_{1}},h_{2,1},\ldots,h_{2,s_{2}},\ldots,h_{k,1},\ldots,h_{k,s_{k}})

of complexity (MO|s|(dO|s|(1)),dO|s|(1))(M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},d^{O_{|\vec{s}|}(1)}) such that

\chi(h_{1},\ldots,h_{k})\text{ and }\chi^{\prime}(h_{1},\ldots,h_{1},h_{2},\ldots,h_{2},\ldots,h_{k},\ldots,h_{k})

are (MO|s|(dO|s|(1)),MO|s|(dO|s|(1)),dO|s|(1))(M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},d^{O_{|\vec{s}|}(1)})-equivalent for degree |s|1|\vec{s}|-1 and furthermore, for each 1ik1\leq i\leq k, χ\chi^{\prime} is symmetric in the variables hi,1,,hi,sih_{i,1},\ldots,h_{i,s_{i}}.

Remark.

We will only require the above lemma for multidegree (1,s1)(1,s-1) nilsequences.

Proof.

By Lemma C.3, there exists χ\chi^{\ast} such that

χ(h1,,hk) and χ(h1,,hk)i=1ksi!\chi(h_{1},\ldots,h_{k})\text{ and }\chi^{\ast}(h_{1},\ldots,h_{k})^{\otimes\prod_{i=1}^{k}s_{i}!}

are (MO|s|(dO|s|(1)),MO|s|(dO|s|(1)),dO|s|(1))(M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},d^{O_{|\vec{s}|}(1)})-equivalent for degree |s|1|\vec{s}|-1. Therefore by Lemma 7.4, it suffices to produce χ\chi^{\prime} such that

χ(h1,,hk)i=1ksi! and χ(h1,,h1,h2,,h2,,hk,,hk)\chi^{\ast}(h_{1},\ldots,h_{k})^{\otimes\prod_{i=1}^{k}s_{i}!}\text{ and }\chi^{\prime}(h_{1},\ldots,h_{1},h_{2},\ldots,h_{2},\ldots,h_{k},\ldots,h_{k})

are (MO|s|(dO|s|(1)),MO|s|(dO|s|(1)),dO|s|(1))(M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})},d^{O_{|\vec{s}|}(1)})-equivalent for degree |s|1|\vec{s}|-1.

Let

\chi^{\ast}(h_{1},\ldots,h_{k})=F(g(h_{1},\ldots,h_{k})\Gamma)

with the underlying nilmanifold being G/ΓG/\Gamma and the associated Mal’cev basis being 𝒳\mathcal{X}. Via a standard manipulation which has been performed several times already, we may assume that g(0)=idGg(0)=\mathrm{id}_{G} (at the cost of an insignificant change in parameters). Furthermore assume that η\eta is the vertical character, so

F(g(s1,,sk)xΓ)=e(η(g(s1,,sk)))F(xΓ).F(g_{(s_{1},\ldots,s_{k})}x\Gamma)=e(\eta(g_{(s_{1},\ldots,s_{k})}))\cdot F(x\Gamma).

Given J[|s|]J\subseteq[|\vec{s}|], we denote

J:=(|J{s1++si1+1,,s1++si1+si}|)1ik.\|J\|:=(|J\cap\{s_{1}+\cdots+s_{i-1}+1,\ldots,s_{1}+\cdots+s_{i-1}+s_{i}\}|)_{1\leq i\leq k}.
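As a quick illustration of this notation, the following minimal sketch computes the block profile of a set. The function name `block_profile` and the example values are hypothetical and chosen only for this illustration; they do not appear in the argument.

```python
from typing import Iterable, Sequence, Tuple


def block_profile(J: Iterable[int], s: Sequence[int]) -> Tuple[int, ...]:
    """Return ||J||: the number of elements of J in each consecutive block.

    Block i (1-indexed) is {s_1+...+s_{i-1}+1, ..., s_1+...+s_i}, and J is a
    subset of {1, ..., |s|} with |s| = s_1 + ... + s_k.
    """
    J = set(J)
    total = sum(s)
    if any(j < 1 or j > total for j in J):
        raise ValueError("J must be a subset of {1, ..., |s|}")
    profile = []
    start = 0  # running value of s_1 + ... + s_{i-1}
    for s_i in s:
        block = range(start + 1, start + s_i + 1)
        profile.append(sum(1 for j in block if j in J))
        start += s_i
    return tuple(profile)


# Example with (s_1, s_2) = (2, 3): the blocks are {1, 2} and {3, 4, 5}.
print(block_profile({1, 3, 4}, (2, 3)))  # (1, 2)
```

In particular the full set has profile (s_1, ..., s_k), which is why the top summand of the direct sum below is a copy of the last group of the original filtration.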

The group G~\widetilde{G} we will ultimately use to construct our nilsequence will be given by constructing the associated nilpotent Lie algebra. We take

log(G~)=J[|s|]log(GJ)\log(\widetilde{G})=\bigoplus_{\emptyset\neq J\subseteq[|\vec{s}|]}\log(G_{\|J\|})

and for each J[|s|]\emptyset\neq J\subseteq[|\vec{s}|] let ιJ:log(GJ)log(G~)\iota_{J}\colon\log(G_{\|J\|})\hookrightarrow\log(\widetilde{G}) denote the embedding into the direct sum. We endow G~\widetilde{G} with a Lie bracket such that if JKJ\cap K\neq\emptyset then

[ιJ(xJ),ιK(yK)]=0[\iota_{J}(x_{J}),\iota_{K}(y_{K})]=0

and if JK=J\cap K=\emptyset then

[ιJ(xJ),ιK(yK)]=ιJK([xJ,yK]),[\iota_{J}(x_{J}),\iota_{K}(y_{K})]=\iota_{J\cup K}([x_{J},y_{K}]),

where the bracket between xJ,yKx_{J},y_{K} is taken in the ambient space log(G)\log(G) and is seen to lie in log(GJK)\log(G_{\|J\cup K\|}) by the commutator property of the original filtration on GG.

To verify that this gives a valid Lie algebra it suffices to verify that this operation is antisymmetric and satisfies the Jacobi identity. Furthermore, by bilinearity it suffices to verify these relations on the generators. For antisymmetry on ιJ(xJ),ιK(yK)\iota_{J}(x_{J}),\iota_{K}(y_{K}), if JKJ\cap K\neq\emptyset it is trivial and otherwise

[\iota_{J}(x_{J}),\iota_{K}(y_{K})]=\iota_{J\cup K}([x_{J},y_{K}])=-\iota_{J\cup K}([y_{K},x_{J}])=-[\iota_{K}(y_{K}),\iota_{J}(x_{J})]

as desired. For the Jacobi identity, when checked on generators ιJ(xJ),ιK(yK),ιL(zL)\iota_{J}(x_{J}),\iota_{K}(y_{K}),\iota_{L}(z_{L}), if (JK)(KL)(LJ)(J\cap K)\cup(K\cap L)\cup(L\cap J)\neq\emptyset the result is trivial. Otherwise we have

[ιJ(xJ),[ιK(yK),ιL(zL)]]+[ιK(yK),[ιL(zL),ιJ(xJ)]]+[ιL(zL),[ιJ(xJ),ιK(yK)]]\displaystyle[\iota_{J}(x_{J}),[\iota_{K}(y_{K}),\iota_{L}(z_{L})]]+[\iota_{K}(y_{K}),[\iota_{L}(z_{L}),\iota_{J}(x_{J})]]+[\iota_{L}(z_{L}),[\iota_{J}(x_{J}),\iota_{K}(y_{K})]]
=ιJKL([xJ,[yK,zL]]+[yK,[zL,xJ]]+[zL,[xJ,yK]])=0\displaystyle=\iota_{J\cup K\cup L}([x_{J},[y_{K},z_{L}]]+[y_{K},[z_{L},x_{J}]]+[z_{L},[x_{J},y_{K}]])=0

as desired.
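The verification above can be sanity-checked numerically on a toy model. The sketch below makes several simplifying assumptions that are not part of the argument: the ambient Lie algebra is taken to be the strictly upper triangular 4 by 4 integer matrices, every summand is a full copy of that algebra (rather than the subalgebra attached to the profile of J), and the index set is {1, 2, 3}. All function names are hypothetical.

```python
import itertools
import random

N = 4              # toy ambient algebra: N x N strictly upper triangular matrices
INDEX = (1, 2, 3)  # plays the role of the index set [|s|]

SUBSETS = [frozenset(c) for r in range(1, len(INDEX) + 1)
           for c in itertools.combinations(INDEX, r)]


def zero():
    return [[0] * N for _ in range(N)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def mat_add(a, b, sign=1):
    return [[a[i][j] + sign * b[i][j] for j in range(N)] for i in range(N)]


def mat_bracket(a, b):
    """Ordinary commutator ab - ba in the ambient algebra."""
    return mat_add(mat_mul(a, b), mat_mul(b, a), sign=-1)


def random_matrix(rng):
    m = zero()
    for i in range(N):
        for j in range(i + 1, N):
            m[i][j] = rng.randint(-5, 5)
    return m


def random_element(rng):
    """An element of the direct sum: one matrix for each nonempty J."""
    return {J: random_matrix(rng) for J in SUBSETS}


def bracket(x, y):
    """Bilinear extension of: [iota_J(a), iota_K(b)] = iota_{J u K}([a, b])
    when J and K are disjoint, and 0 when they intersect."""
    out = {J: zero() for J in SUBSETS}
    for J, a in x.items():
        for K, b in y.items():
            if J & K:
                continue
            out[J | K] = mat_add(out[J | K], mat_bracket(a, b))
    return out


def add(x, y):
    return {J: mat_add(x[J], y[J]) for J in SUBSETS}


def is_zero(x):
    return all(v == 0 for J in SUBSETS for row in x[J] for v in row)


def check(trials=20, seed=0):
    """Test antisymmetry and the Jacobi identity on random integer elements."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (random_element(rng) for _ in range(3))
        antisym = add(bracket(x, y), bracket(y, x))
        jacobi = add(add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x))),
                     bracket(z, bracket(x, y)))
        if not (is_zero(antisym) and is_zero(jacobi)):
            return False
    return True


print(check())
```

Integer entries are used so that the identities hold exactly rather than up to floating point error.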

The associated II-filtration with respect to the multidegree ordering is given as follows. For any (a1,,a|s|)|s|(a_{1},\ldots,a_{|\vec{s}|})\in\mathbb{N}^{|\vec{s}|}, let log(G~(a1,,a|s|))\log(\widetilde{G}_{(a_{1},\ldots,a_{|\vec{s}|})}) be the Lie subalgebra of log(G~)\log(\widetilde{G}) generated by ιJ(xJ)\iota_{J}(x_{J}) for which 1J(j)aj1_{J}(j)\geq a_{j} for each j=1,,|s|j=1,\ldots,|\vec{s}|, and xJGJx_{J}\in G_{\|J\|}. It follows this is an II-filtration with respect to the multidegree ordering because for vectors a,b{0,1}|s|a,b\in\{0,1\}^{|\vec{s}|}, if 1J(j)aj1_{J}(j)\geq a_{j} and 1K(j)bj1_{K}(j)\geq b_{j} we either have JKJ\cap K\neq\emptyset in which case the commutator is trivial or 1KJ(j)aj+bj1_{K\cup J}(j)\geq a_{j}+b_{j} in which case the result also follows easily. Noting that by construction G~==1|s|G~e\widetilde{G}=\bigvee_{\ell=1}^{|\vec{s}|}\widetilde{G}_{\vec{e}_{\ell}}, the above immediately implies that we have a multidegree (1,,1)(1,\ldots,1) filtration on G~\widetilde{G}.

We now construct a weak basis for G~\widetilde{G}. Recall we have a Mal’cev basis 𝒳\mathcal{X} for GG. Given J\|J\|, we define the filtration

GJt=|i|=tGJ+i.G_{\|J\|}^{t}=\bigvee_{|\vec{i}|=t}G_{\|J\|+\vec{i}}.

Note that GJ=GJ0G_{\|J\|}=G_{\|J\|}^{0} and that GJ0=GJ0GJ1GJ2G_{\|J\|}^{0}=G_{\|J\|}^{0}\geqslant G_{\|J\|}^{1}\geqslant G_{\|J\|}^{2}\geqslant\cdots is a valid degree filtration when J0\|J\|\neq\vec{0}. Thus by [34, Lemma B.11], we may find a Mal’cev basis 𝒳J\mathcal{X}^{\|J\|} for each GJG_{\|J\|} which is an MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}-rational combination of 𝒳\mathcal{X}.

Define

𝒳~=J[|s|]ιJ(𝒳J).\widetilde{\mathcal{X}}=\bigcup_{\emptyset\neq J\subseteq[|\vec{s}|]}\iota_{J}(\mathcal{X}^{\|J\|}).

Furthermore define Γ~\widetilde{\Gamma} to be the group generated by exp(L!ιJ(ΓGJ))\exp(L!\cdot\iota_{J}(\Gamma\cap G_{\|J\|})) where LL is a sufficiently large constant depending only on |s||\vec{s}| (and in particular not on MM or dd). Direct computation with Baker–Campbell–Hausdorff implies that Γ~G~(1,,1)\widetilde{\Gamma}\cap\widetilde{G}_{(1,\ldots,1)} is contained in ι[|s|](ΓG(s1,,sk))\iota_{[|\vec{s}|]}(\Gamma\cap G_{(s_{1},\ldots,s_{k})}). Furthermore we see that G~/Γ~\widetilde{G}/\widetilde{\Gamma} is compact, 𝒳~\widetilde{\mathcal{X}} is a weak basis of rationality MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})} for G~\widetilde{G}, and 𝒳~\widetilde{\mathcal{X}} has the degree O|s|(1)O_{|\vec{s}|}(1) nesting property. As all groups within the multidegree filtration are MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}-rational with respect to 𝒳~\widetilde{\mathcal{X}}, by applying [42, Lemma B.11] we may construct a basis with respect to the canonical associated degree filtration of G~\widetilde{G} which certifies that G~\widetilde{G} with the given multidegree filtration has complexity bounded by MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}. Furthermore the adapted Mal’cev basis 𝒳~\widetilde{\mathcal{X}}^{\ast} is an MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}-rational combination of 𝒳~\widetilde{\mathcal{X}} (lifted to log(G~)\log(\widetilde{G}) appropriately).

We define the G~(1,,1)\widetilde{G}_{(1,\ldots,1)}-vertical frequency as

η~(exp(ι(1,,1)(log(g(s1,,sk))))):=η(g(s1,,sk))\widetilde{\eta}(\exp(\iota_{(1,\ldots,1)}(\log(g_{(s_{1},\ldots,s_{k})})))):=\eta(g_{(s_{1},\ldots,s_{k})})

and it is trivial to use the construction of 𝒳~\widetilde{\mathcal{X}}^{\ast} to certify that η~\widetilde{\eta} has height bounded by MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}. We take F~\widetilde{F} to be a nilcharacter with frequency η~\widetilde{\eta} produced by Lemma B.4 (which is applied to the canonical degree filtration of G~\widetilde{G}) and this construction gives output dimension MO|s|(d)M^{O_{|\vec{s}|}(d)} and Lipschitz constant MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})} with respect to 𝒳~\widetilde{\mathcal{X}}^{\ast}.

We now define g~\widetilde{g}. Note that

g(h1,,hk)=0(i1,,ik)(s1,,sk)(g(i1,,ik)Tay)h1i1hkikg(h_{1},\ldots,h_{k})=\prod_{\vec{0}\neq(i_{1},\ldots,i_{k})\leq(s_{1},\ldots,s_{k})}(g_{(i_{1},\ldots,i_{k})}^{\mathrm{Tay}})^{h_{1}^{i_{1}}\cdots h_{k}^{i_{k}}}

via [34, Lemma B.9], where the condition g(0)=idGg(0)=\mathrm{id}_{G} rules out the need for a coefficient with (i1,,ik)=0(i_{1},\ldots,i_{k})=\vec{0}. (We are using monomials instead of binomials, which is a minor and easy alteration.) The product here is taken in increasing lexicographic order. We define

g~(h1,,h|s|):=0(i1,,ik)(s1,,sk)exp(i1!ik!J{1,,|s|}J=(i1,,ik)(iJhi)ιJ(log(g(i1,,ik)Tay))).\widetilde{g}(h_{1},\ldots,h_{|\vec{s}|}):=\prod_{\vec{0}\neq(i_{1},\ldots,i_{k})\leq(s_{1},\ldots,s_{k})}\exp\bigg{(}i_{1}!\cdots i_{k}!\sum_{\begin{subarray}{c}J\subseteq\{1,\ldots,|\vec{s}|\}\\ \|J\|=(i_{1},\ldots,i_{k})\end{subarray}}\big{(}\prod_{i\in J}h_{i}\big{)}\cdot\iota_{J}(\log(g_{(i_{1},\ldots,i_{k})}^{\mathrm{Tay}}))\bigg{)}.
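To see why this definition is compatible with the diagonal substitution used later, one can track the scalar multiplying each embedded Taylor coefficient. The sketch below is an illustration only, with hypothetical helper names (`profile`, `coefficient`, `diagonal`) and arbitrary sample values. It checks that, after repeating each variable s_l times, the scalar attached to a set J with profile (i_1, ..., i_k) becomes the product of the i_l! h_l^{i_l}. For the full set this is s_1! ... s_k! times the top monomial, which is consistent with the tensor power of F appearing below.

```python
from math import factorial, prod


def profile(J, s):
    """Return ||J||: how many elements of J fall in each consecutive block of sizes s."""
    out, start = [], 0
    for s_l in s:
        out.append(sum(1 for j in J if start < j <= start + s_l))
        start += s_l
    return tuple(out)


def coefficient(J, s, hs):
    """Scalar multiplying the iota_J term in the exponent defining g-tilde.

    hs lists the |s| variables (h_1, ..., h_{|s|}); J uses 1-based indices.
    """
    i = profile(J, s)
    return prod(factorial(a) for a in i) * prod(hs[j - 1] for j in J)


def diagonal(h, s):
    """Repeat h_l exactly s_l times: (h_1, ..., h_1, ..., h_k, ..., h_k)."""
    return [h_l for h_l, s_l in zip(h, s) for _ in range(s_l)]


s = (2, 3)   # sample multidegree (assumed for illustration)
h = (5, 7)   # sample values of (h_1, h_2)
hs = diagonal(h, s)
for J in [{1}, {2, 4}, {1, 3, 5}, {1, 2, 3, 4, 5}]:
    i = profile(J, s)
    expected = prod(factorial(a) * h_l ** a for a, h_l in zip(i, h))
    assert coefficient(J, s, hs) == expected
    print(sorted(J), i, coefficient(J, s, hs))
```

Each scalar is of degree at most one in every variable, which is the multilinearity that the construction is designed to produce.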

Let GG^{\ast} denote the subgroup of G×G~G\times\widetilde{G} generated by

G:={(g(s1,,sk),exp(s1!sk!ι(1,,1)(log(g(s1,,sk))))):g(s1,,sk)G(s1,,sk)}.G^{\ast}:=\{(g_{(s_{1},\ldots,s_{k})},\exp(s_{1}!\cdots s_{k}!\iota_{(1,\ldots,1)}(\log(g_{(s_{1},\ldots,s_{k})}))))\colon g_{(s_{1},\ldots,s_{k})}\in G_{(s_{1},\ldots,s_{k})}\}.

Note that the function

(g,g~)F(gΓ)s1!sk!F~(g~Γ~)¯(g,\widetilde{g})\mapsto F(g\Gamma)^{\otimes s_{1}!\cdots s_{k}!}\otimes\overline{\widetilde{F}(\widetilde{g}\widetilde{\Gamma})}

is invariant under the action of GG^{\ast}.

We will construct GG^{\prime} which is a subgroup of G×G~G\times\widetilde{G} with a degree |s||\vec{s}| filtration such that the final group is GG^{\ast} and such that

(g(h1,,hk),g~(h1,,h1,h2,,h2,,hk,,hk))(g(h_{1},\ldots,h_{k}),\widetilde{g}(h_{1},\ldots,h_{1},h_{2},\ldots,h_{2},\ldots,h_{k},\ldots,h_{k}))

is a polynomial sequence with respect to this filtration. Let GjG_{j}^{\prime} (for j1j\geq 1) be generated by elements of the form

(C.1) (g(i1,,ik),exp(i1!ik!J{1,,|s|}J=(i1,,ik)ιJ(log(g(i1,,ik)))))\bigg{(}g_{(i_{1},\ldots,i_{k})},\exp\bigg{(}i_{1}!\cdots i_{k}!\sum_{\begin{subarray}{c}J\subseteq\{1,\ldots,|\vec{s}|\}\\ \|J\|=(i_{1},\ldots,i_{k})\end{subarray}}\iota_{J}(\log(g_{(i_{1},\ldots,i_{k})}))\bigg{)}\bigg{)}

where |i|=j|\vec{i}|=j, as well as

(g(i1,,ik),idG~), and (idG,G~J)\big{(}g_{(i_{1},\ldots,i_{k})},\mathrm{id}_{\widetilde{G}}\big{)},\text{ and }\big{(}\mathrm{id}_{G},\widetilde{G}_{J}\big{)}

where |i|j+1|\vec{i}|\geq j+1 in the first case, and |J|j+1|J|\geq j+1 in the second case. Furthermore set G=G0:=G1G^{\prime}=G_{0}^{\prime}:=G_{1}^{\prime}; we trivially see that G|s|=GG_{|\vec{s}|}^{\prime}=G^{\ast}. That this is a filtration follows from liberal application of Baker–Campbell–Hausdorff; we use crucially that the number of ways to break a set of size (i+j)(i+j) into two disjoint labeled sets of sizes ii and jj is (i+j)!/(i!j!)(i+j)!/(i!\cdot j!), which modifies the factorial prefactors in (C.1) appropriately.
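The role of the counting fact can be made explicit on the leading commutator term. The following worked computation is a sketch under a simplifying assumption not made in the text, namely k = 1, so that the profile of a set is just its size; here x and y denote elements of the Lie algebras of the degree i and degree j groups respectively.

```latex
% Assumption for illustration: k = 1, x \in \log(G_i), y \in \log(G_j).
\begin{align*}
\Big[\, i! \sum_{|J| = i} \iota_J(x),\; j! \sum_{|K| = j} \iota_K(y) \Big]
&= i!\, j! \sum_{\substack{|J| = i,\ |K| = j \\ J \cap K = \emptyset}} \iota_{J \cup K}([x, y]) \\
&= i!\, j! \binom{i+j}{i} \sum_{|L| = i + j} \iota_L([x, y]) \\
&= (i+j)! \sum_{|L| = i+j} \iota_L([x, y]).
\end{align*}
```

The second coordinate of the commutator is therefore again of the shape required in (C.1), now with index i + j and first coordinate governed by [x, y]. For general k the same count is applied block by block, and the higher-order Baker–Campbell–Hausdorff terms are expected to be handled in the same way.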

Furthermore it is trivial to see that the GjG_{j}^{\prime} are MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})}-rational with respect to the Mal’cev basis for G×G~G\times\widetilde{G} given by

{(X,0):X𝒳}{(0,X):X𝒳~},\{(X,0)\colon X\in\mathcal{X}\}\cup\{(0,X)\colon X\in\widetilde{\mathcal{X}}^{\ast}\},

and therefore applying [42, Lemma B.11] we may construct a Mal’cev basis 𝒳\mathcal{X}^{\prime} of complexity MO|s|(dO|s|(1))M^{O_{|\vec{s}|}(d^{O_{|\vec{s}|}(1)})} for G/(G(Γ×Γ~))G^{\prime}/(G^{\prime}\cap(\Gamma\times\widetilde{\Gamma})). Furthermore Fs1!sk!F~¯F^{\otimes s_{1}!\cdots s_{k}!}\otimes\overline{\widetilde{F}} is appropriately Lipschitz with respect to 𝒳\mathcal{X}^{\prime}. Finally, since

\bigg(g_{(i_{1},\ldots,i_{k})},\exp\bigg(i_{1}!\cdots i_{k}!\sum_{\begin{subarray}{c}J\subseteq\{1,\ldots,|\vec{s}|\}\\ \|J\|=(i_{1},\ldots,i_{k})\end{subarray}}\iota_{J}(\log(g_{(i_{1},\ldots,i_{k})}))\bigg)\bigg)

is in Gi1++ikG_{i_{1}+\cdots+i_{k}}^{\prime} by definition, we see that

(g(h1,,hk),g~(h1,,h1,h2,,h2,,hk,,hk))(g(h_{1},\ldots,h_{k}),\widetilde{g}(h_{1},\ldots,h_{1},h_{2},\ldots,h_{2},\ldots,h_{k},\ldots,h_{k}))

is a polynomial sequence with respect to the filtration. Quotienting out by G=G|s|G^{\ast}=G_{|\vec{s}|}^{\prime} (using that Fs1!sk!F~¯F^{\otimes s_{1}!\cdots s_{k}!}\otimes\overline{\widetilde{F}} is invariant under GG^{\ast}) and using Lemma 3.10, we finally complete the proof. ∎

We now reach the final technical lemma of the paper which states that a nilsequence of multidegree JJJ\cup J^{\prime} can be approximated by a sum of products of nilsequences in JJ and JJ^{\prime}. This “splitting” lemma is a quantified version of [34, Lemma E.4]; the proof here is ever so slightly different as we are forced to not use the Stone–Weierstrass theorem.

Lemma C.6.

Let JJ and JJ^{\prime} be finite downsets in k\mathbb{N}^{k} and fix ε(0,1/2)\varepsilon\in(0,1/2). Suppose that β(h1,,hk)\beta(h_{1},\ldots,h_{k}) is a nilsequence of multidegree JJJ\cup J^{\prime} with complexity (M,d)(M,d). Then there exists 1L(M/ε)OJ,J(dOJ,J(1))1\leq L\leq(M/\varepsilon)^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} such that

β(h1,,hk)j=1Lβj(h1,,hk)βj(h1,,hk)L(k)ε\bigg{\lVert}\beta(h_{1},\ldots,h_{k})-\sum_{j=1}^{L}\beta_{j}(h_{1},\ldots,h_{k})\beta_{j}^{\prime}(h_{1},\ldots,h_{k})\bigg{\rVert}_{L^{\infty}(\mathbb{Z}^{k})}\leq\varepsilon

with the βj\beta_{j} being nilsequences of multidegree JJ, the βj\beta_{j}^{\prime} being nilsequences of multidegree JJ^{\prime}, and βj,βj\beta_{j},\beta_{j}^{\prime} having complexity ((M/ε)OJ,J(dOJ,J(1)),dOJ,J(1))((M/\varepsilon)^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})},d^{O_{J,J^{\prime}}(1)}).
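A degenerate instance may help fix ideas. The example below is an illustration only; the choice of J, J', the one-dimensional torus, and the frequencies are assumptions made for this sketch. In this case the splitting is exact, with L = 1 and no error term.

```latex
% Toy instance (assumed for illustration): k = 2,
% J = \{(0,0),(1,0)\}, J' = \{(0,0),(0,1)\}, and G/\Gamma = \mathbb{R}/\mathbb{Z}
% with G_{(1,0)} = G_{(0,1)} = \mathbb{R} and G_{(1,1)} = \{0\}.
\begin{align*}
\beta(h_1, h_2) &= e(\alpha h_1 + \gamma h_2), \qquad \alpha, \gamma \in \mathbb{R}, \\
\beta(h_1, h_2) &= \underbrace{e(\alpha h_1)}_{\beta_1 \text{ of multidegree } J}
  \cdot \underbrace{e(\gamma h_2)}_{\beta_1' \text{ of multidegree } J'} .
\end{align*}
```

The content of the lemma is that a quantitative version of this factorization survives for general Lipschitz functions on nonabelian nilmanifolds, at the cost of a sum of L terms and an error of size at most the prescribed tolerance.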

Proof.

We let

β(h1,,hk)=F(g(h1,,hk)Γ)\beta(h_{1},\ldots,h_{k})=F(g(h_{1},\ldots,h_{k})\Gamma)

where the underlying nilmanifold is G/ΓG/\Gamma. As is standard, we may assume that g(0,,0)=idGg(0,\ldots,0)=\mathrm{id}_{G} up to the insignificant change of adjusting MM to MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}. Furthermore let the adapted Mal’cev basis for GG be 𝒳\mathcal{X}.

We have for each j0\vec{j}\neq\vec{0} that the groups

Gtj=|i|=tGj+iG_{t}^{\vec{j}}=\bigvee_{|\vec{i}|=t}G_{\vec{j}+\vec{i}}

form a degree filtration G0j=G0jG1jG2jG_{0}^{\vec{j}}=G_{0}^{\vec{j}}\geqslant G_{1}^{\vec{j}}\geqslant G_{2}^{\vec{j}}\geqslant\cdots, where the length of the filtration is OJ,J(1)O_{J,J^{\prime}}(1). As these subgroups are all MM-rational with respect to 𝒳\mathcal{X}, there exists a Mal’cev basis 𝒳j\mathcal{X}^{\vec{j}} adapted to this filtration of complexity MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} where each element is an MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}-rational combination of elements in 𝒳\mathcal{X} by [42, Lemma B.11].

Using a variant of Lemma 10.2, adapted to multidegree filtrations, we may write

g(h1,,hk)=j0Xj,i𝒳jexp(Xj,i)αj,i=1k(hj/j!).g(h_{1},\ldots,h_{k})=\prod_{\vec{j}\neq\vec{0}}\prod_{X_{\vec{j},i}\in\mathcal{X}^{\vec{j}}}\exp(X_{\vec{j},i})^{\alpha_{\vec{j},i}\prod_{\ell=1}^{k}(h_{\ell}^{j_{\ell}}/j_{\ell}!)}.

The product here is taken over j\vec{j} in order of increasing |j||\vec{j}| and then in lexicographic order, with the Xj,iX_{\vec{j},i} taken in increasing order of ii. The modified proof of such a representation involves iteratively handling terms in increasing order of |j||\vec{j}| (and handling the terms of a given |j||\vec{j}| in an arbitrary order); we omit a careful proof.

The first key part of the proof is lifting to the universal nilmanifold. We define the universal nilmanifold G~\widetilde{G} to be the group generated by exp(ej,i)tj,i\exp(e_{\vec{j},i})^{t_{\vec{j},i}} for j0\vec{j}\neq\vec{0}, 1idim(Gj)1\leq i\leq\dim(G_{\vec{j}}), and tj,it_{\vec{j},i}\in\mathbb{R}. The only relations these generators satisfy are that any (r1)(r-1)-fold commutator for r1r\geq 1 between exp(ej1,i1),,exp(ejr,ir)\exp(e_{\vec{j_{1}},i_{1}}),\ldots,\exp(e_{\vec{j_{r}},i_{r}}) vanishes if j1++jr\vec{j_{1}}+\cdots+\vec{j_{r}} is not in JJJ\cup J^{\prime}. We give G~\widetilde{G} the structure of a multidegree JJJ\cup J^{\prime} nilmanifold by letting (G~)j(\widetilde{G})_{\vec{j}^{\ast}} be generated by the set of (r1)(r-1)-fold commutators (for any r1r\geq 1) of exp(ej1,i1),,exp(ejr,ir)\exp(e_{\vec{j_{1}},i_{1}}),\ldots,\exp(e_{\vec{j_{r}},i_{r}}) where j1++jrj\vec{j_{1}}+\cdots+\vec{j_{r}}\geq\vec{j}^{\ast} (here \geq means that each coordinate is at least as large). This is easily proven to be an II-filtration with respect to the multidegree ordering; since we have no generators with j=0\vec{j}=\vec{0}, this is in fact a multidegree filtration. Finally we let Γ~\widetilde{\Gamma} be the lattice generated by exp(ej,i)\exp(e_{\vec{j},i}).

The analysis in Lemma 10.4 can easily be extended to prove that G~\widetilde{G} has a filtered Mal’cev basis X~\widetilde{X} of complexity MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} where the basis elements are height MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} linear combinations of (r1)(r-1)-fold commutators of ej1,i1,,ejr,ire_{\vec{j_{1}},i_{1}},\ldots,e_{\vec{j_{r}},i_{r}}. Furthermore note that the dimension of G~\widetilde{G} is dOJ,J(1)d^{O_{J,J^{\prime}}(1)}.

We now lift β\beta to G~/Γ~\widetilde{G}/\widetilde{\Gamma}. Define the homomorphism ϕ:G~G\phi\colon\widetilde{G}\to G via

ϕ(exp(ej,i))=exp(Xj,i);\phi(\exp(e_{\vec{j},i}))=\exp(X_{\vec{j},i});

here we are writing 𝒳j={Xj,1,,Xj,dim(Gj)}\mathcal{X}^{\vec{j}}=\{X_{\vec{j},1},\ldots,X_{\vec{j},\dim(G_{\vec{j}})}\}. That this is a homomorphism follows from noting that all relations in G~\widetilde{G} are present in GG because GG has multidegree JJJ\cup J^{\prime}. We next lift the polynomial sequence gg to

\widetilde{g}(h_{1},\ldots,h_{k})=\prod_{\vec{j}\neq\vec{0}}\prod_{i=1}^{\dim(G_{\vec{j}})}\exp(e_{\vec{j},i})^{\alpha_{\vec{j},i}\prod_{\ell=1}^{k}(h_{\ell}^{j_{\ell}}/j_{\ell}!)}

and FF to F~\widetilde{F} via

F~(g~Γ~)=F(ϕ(g~)Γ).\widetilde{F}(\widetilde{g}\widetilde{\Gamma})=F(\phi(\widetilde{g})\Gamma).

Note that since ϕ(Γ~)Γ\phi(\widetilde{\Gamma})\leqslant\Gamma, this is a well-defined function on G~/Γ~\widetilde{G}/\widetilde{\Gamma}. Furthermore, noting various properties of X~\widetilde{X} and that elements of 𝒳j\mathcal{X}^{\vec{j}} are appropriately bounded and rational linear combinations of 𝒳\mathcal{X}, we have that F~\widetilde{F} is MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}-Lipschitz with respect to the Mal’cev basis specified by 𝒳~\widetilde{\mathcal{X}}. Therefore for the remainder of the proof we operate with the nilsequence

F~(g~(h1,,hk)Γ~).\widetilde{F}(\widetilde{g}(h_{1},\ldots,h_{k})\widetilde{\Gamma}).

For the remainder of the analysis we furthermore assume that there exists gG~g^{\ast}\in\widetilde{G} with dG~,𝒳~(g)MOJ,J(dOJ,J(1))d_{\widetilde{G},\widetilde{\mathcal{X}}}(g^{\ast})\leq M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} such that if ψexp,G~(g)+(1/2,1/2]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-1/2,1/2]^{\dim(\widetilde{G})} is identified with G~/Γ~\widetilde{G}/\widetilde{\Gamma} then supp(F~)\operatorname{supp}(\widetilde{F}) lies in ψexp,G~(g)+(δ,δ]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-\delta,\delta]^{\dim(\widetilde{G})}. We will ultimately take δ=MOJ,J(dOJ,J(1))\delta=M^{-O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} sufficiently small. If we prove the lemma with ε=εδOJ,J(dOJ,J(1))\varepsilon^{\prime}=\varepsilon\cdot\delta^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} for functions with such restricted support then the result in generality follows by Lemma B.3.

The use of the universal nilmanifold comes precisely when defining the following two nilmanifolds for the split terms. Let G~>J\widetilde{G}_{>J} be the group generated by G~i\widetilde{G}_{\vec{i}} with iJJ\vec{i}\in J^{\prime}\setminus J and G~>J\widetilde{G}_{>J^{\prime}} be the group generated by G~i\widetilde{G}_{\vec{i}} with iJJ\vec{i}\in J\setminus J^{\prime}. It is trivial to see that G~>J,G~>J\widetilde{G}_{>J},\widetilde{G}_{>J^{\prime}} are normal and MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}-rational with respect to 𝒳~\widetilde{\mathcal{X}}. Let 𝒳~J\widetilde{\mathcal{X}}^{J} and 𝒳~J\widetilde{\mathcal{X}}^{J^{\prime}} be bases for the Lie algebras of log(G~>J)\log(\widetilde{G}_{>J}) and log(G~>J)\log(\widetilde{G}_{>J^{\prime}}) which are MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}-rational bounded combinations of 𝒳~\widetilde{\mathcal{X}}.

We consider the nilmanifolds G~/(G~>JΓ~)\widetilde{G}/(\widetilde{G}_{>J}\widetilde{\Gamma}) and G~/(G~>JΓ~)\widetilde{G}/(\widetilde{G}_{>J^{\prime}}\widetilde{\Gamma}). The first is clearly a multidegree JJ^{\prime} nilmanifold while the second is a multidegree JJ nilmanifold, each of complexity MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}. Furthermore we can choose underlying Mal’cev bases which are MOJ,J(dOJ,J(1))M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}-rational combinations of 𝒳~mod𝒳~J\widetilde{\mathcal{X}}~{}\mathrm{mod}~{}\widetilde{\mathcal{X}}^{J} and 𝒳~mod𝒳~J\widetilde{\mathcal{X}}~{}\mathrm{mod}~{}\widetilde{\mathcal{X}}^{J^{\prime}}, respectively. (See, e.g., the arguments regarding GQuotG_{\mathrm{Quot}} in Section 10.3.) Note here that Γ~>J=Γ~/(Γ~G~>J)\widetilde{\Gamma}_{>J}=\widetilde{\Gamma}/(\widetilde{\Gamma}\cap\widetilde{G}_{>J}) and analogously for Γ~>J\widetilde{\Gamma}_{>J^{\prime}}.

The key point is that by construction, G~>JG~>J=IdG~\widetilde{G}_{>J}\cap\widetilde{G}_{>J^{\prime}}=\mathrm{Id}_{\widetilde{G}}. This implies that there exist linear maps AA and BB such that

(C.2) Aψexp,G~/G~>J(zmodG~>J)+Bψexp,G~/G~>J(zmodG~>J)=ψexp,G~(z)A\circ\psi_{\exp,\widetilde{G}/\widetilde{G}_{>J}}(z~{}\mathrm{mod}~{}\widetilde{G}_{>J})+B\circ\psi_{\exp,\widetilde{G}/\widetilde{G}_{>J^{\prime}}}(z~{}\mathrm{mod}~{}\widetilde{G}_{>J^{\prime}})=\psi_{\exp,\widetilde{G}}(z)

for all zG~z\in\widetilde{G}. Furthermore one can take AA and BB bounded in the sense that

dG~(Aψexp,G~/G~>J(exp(X~i)modG~>J),idG~)MOJ,J(dOJ,J(1))d_{\widetilde{G}}(A\circ\psi_{\exp,\widetilde{G}/\widetilde{G}_{>J}}(\exp(\widetilde{X}_{i})~{}\mathrm{mod}~{}\widetilde{G}_{>J}),\mathrm{id}_{\widetilde{G}})\leq M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}

for all X~i𝒳~\widetilde{X}_{i}\in\widetilde{\mathcal{X}} and analogously for BB and JJ^{\prime}.
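The existence of A and B is a linear algebra statement once everything is written in first-kind coordinates. The following sketch records the reasoning at the level of Lie algebras; the letters V, U, W, π and Λ are notation introduced here for illustration and are not used elsewhere, and the quantitative bounds on A and B are not addressed.

```latex
% Notation for this sketch only:
% V = \log(\widetilde{G}), \quad U = \log(\widetilde{G}_{>J}), \quad W = \log(\widetilde{G}_{>J'}).
\begin{align*}
&\pi \colon V \to V/U \oplus V/W, \qquad \pi(v) = (v + U,\ v + W), \\
&\ker \pi = U \cap W = \{0\}
  \implies \text{there is a linear } \Lambda \text{ with } \Lambda \circ \pi = \mathrm{id}_V, \\
&\Lambda(p, q) = A p + B q
  \implies A(v + U) + B(v + W) = v \quad \text{for all } v \in V.
\end{align*}
```

Taking v = log(z) and expressing each side in the respective Mal'cev coordinates gives an identity of the shape (C.2), since the quotient maps are linear in first-kind coordinates.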

We now identify G~/Γ~\widetilde{G}/\widetilde{\Gamma} via ψG~\psi_{\widetilde{G}} with the domain ψexp,G~(g)+(1/2,1/2]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-1/2,1/2]^{\dim(\widetilde{G})} and we only have support of F~\widetilde{F} in ψexp,G~(g)+(δ,δ]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-\delta,\delta]^{\dim(\widetilde{G})}. Given xG~x\in\widetilde{G} such that ψexp,G~(x)ψexp,G~(g)+(δ,δ]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(x)\in\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-\delta,\delta]^{\dim(\widetilde{G})}, we have that

ψexp,G~/G~>J(xmodG~>J)ψexp,G~/G~>J(gmodG~>J)+(δ,δ]dim(G~/G~>J)MOJ,J(dOJ,J(1)),\displaystyle\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J}}(x~{}\mathrm{mod}~{}\widetilde{G}_{>J})\in\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J}}(g^{\ast}~{}\mathrm{mod}~{}\widetilde{G}_{>J})+(-\delta,\delta]^{\dim(\widetilde{G}/\widetilde{G}_{>J})}\cdot M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})},
ψexp,G~/G~>J(xmodG~>J)ψexp,G~/G~>J(gmodG~>J)+(δ,δ]dim(G~/G~>J)MOJ,J(dOJ,J(1)).\displaystyle\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J^{\prime}}}(x~{}\mathrm{mod}~{}\widetilde{G}_{>J^{\prime}})\in\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J^{\prime}}}(g^{\ast}~{}\mathrm{mod}~{}\widetilde{G}_{>J^{\prime}})+(-\delta,\delta]^{\dim(\widetilde{G}/\widetilde{G}_{>J^{\prime}})}\cdot M^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}.

Given that δ=MOJ,J(dOJ,J(1))\delta=M^{-O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})} is sufficiently small, these are contained in

ψexp,G~/G~>J(gmodG~>J)+(1/4,1/4]dim(G~/G~>J),\displaystyle\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J}}(g^{\ast}~{}\mathrm{mod}~{}\widetilde{G}_{>J})+(-1/4,1/4]^{\dim(\widetilde{G}/\widetilde{G}_{>J})},
ψexp,G~/G~>J(gmodG~>J)+(1/4,1/4]dim(G~/G~>J),\displaystyle\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J^{\prime}}}(g^{\ast}~{}\mathrm{mod}~{}\widetilde{G}_{>J^{\prime}})+(-1/4,1/4]^{\dim(\widetilde{G}/\widetilde{G}_{>J^{\prime}})},

respectively.

Identify ψexp,G~(g)+(1/2,1/2]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-1/2,1/2]^{\dim(\widetilde{G})} with the torus (note that the boundaries are glued differently than in G~/Γ~\widetilde{G}/\widetilde{\Gamma}, but we are near the center so it is not an issue). We have that F~\widetilde{F} is an (M/δ)OJ,J(dOJ,J(1))(M/\delta)^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}-Lipschitz function with respect to the standard torus metric (see e.g. [45, Lemma 2.3] and [42, Lemma B.3]). Thus for xG~x\in\widetilde{G} such that ψexp,G~(x)ψexp,G~(g)+(1/2,1/2]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(x)\in\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-1/2,1/2]^{\dim(\widetilde{G})}, via standard Fourier approximation (see e.g. [49, Lemma A.8]), there exist coefficients cξc_{\xi} for ξdim(G~)\xi\in\mathbb{Z}^{\dim(\widetilde{G})} with |cξ|(M/(δε))O(dim(G~))|c_{\xi}|\leq(M/(\delta\varepsilon^{\prime}))^{O(\dim(\widetilde{G}))} such that

\bigg\lVert\widetilde{F}(x\widetilde{\Gamma})-\sum_{\lVert\xi\rVert_{\infty}\leq(M/(\delta\varepsilon^{\prime}))^{O(\dim(\widetilde{G}))}}c_{\xi}e(\xi\cdot\psi_{\exp,\widetilde{G}}(x))\bigg\rVert_{\infty}\leq\varepsilon^{\prime}

where the sum is over ξdim(G~)\xi\in\mathbb{Z}^{\dim(\widetilde{G})}. Using (C.2) we may write this equivalently as

\bigg\lVert\widetilde{F}(x\widetilde{\Gamma})-\sum_{\xi}c_{\xi}e(\xi\cdot(A\circ\psi_{\exp,\widetilde{G}/\widetilde{G}_{>J}}(x~\mathrm{mod}~\widetilde{G}_{>J})))e(\xi\cdot(B\circ\psi_{\exp,\widetilde{G}/\widetilde{G}_{>J^{\prime}}}(x~\mathrm{mod}~\widetilde{G}_{>J^{\prime}})))\bigg\rVert_{\infty}\leq\varepsilon^{\prime},

where again the sum is over ξ(M/(δε))O(dim(G~))\lVert\xi\rVert_{\infty}\leq(M/(\delta\varepsilon^{\prime}))^{O(\dim(\widetilde{G}))}.
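The qualitative phenomenon behind this step, that a Lipschitz function on a torus is uniformly close to a trigonometric polynomial with boundedly many frequencies, can be observed numerically. The sketch below is not the cited lemma: it substitutes Fejér means in one dimension, with numerically integrated coefficients, purely as an illustration. The test function `tent` and all helper names are hypothetical.

```python
import cmath
import math


def fourier_coefficient(f, k, grid=2048):
    """Approximate c_k = int_0^1 f(x) e(-kx) dx by a Riemann sum."""
    return sum(f(j / grid) * cmath.exp(-2j * math.pi * k * j / grid)
               for j in range(grid)) / grid


def fejer_approximation(f, cutoff, grid=2048):
    """Return x -> sum over |k| <= cutoff of (1 - |k|/(cutoff+1)) c_k e(kx)."""
    coeffs = {k: (1 - abs(k) / (cutoff + 1)) * fourier_coefficient(f, k, grid)
              for k in range(-cutoff, cutoff + 1)}

    def approx(x):
        return sum(c * cmath.exp(2j * math.pi * k * x) for k, c in coeffs.items())

    return approx


def sup_error(f, approx, samples=400):
    """Largest deviation |f - approx| over an evenly spaced sample of [0, 1)."""
    return max(abs(f(j / samples) - approx(j / samples)) for j in range(samples))


def tent(x):
    """A 1-Lipschitz function on R/Z (assumed test function)."""
    return abs((x % 1.0) - 0.5)


# The sampled sup-norm error shrinks as the frequency cutoff grows.
errors = {cutoff: sup_error(tent, fejer_approximation(tent, cutoff))
          for cutoff in (8, 32)}
print(errors)
```

In the argument itself the approximation is carried out in dimension equal to that of the universal nilmanifold, which is why both the frequency cutoff and the coefficient bounds carry an exponent proportional to that dimension.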

For zz such that ψexp,G~/G~>J(z)ψexp,G~/G~>J(gmodG~>J)(1/2,1/2]dim(G~/G~>J)\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J}}(z)-\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J}}(g^{\ast}~{}\mathrm{mod}~{}\widetilde{G}_{>J})\in(-1/2,1/2]^{\dim(\widetilde{G}/\widetilde{G}_{>J})}, we let

\tau_{>J,\xi}(z)=\rho(\lVert\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J}}(z)-\psi_{\mathrm{exp},\widetilde{G}/\widetilde{G}_{>J}}(g^{\ast}~\mathrm{mod}~\widetilde{G}_{>J})\rVert)\cdot e(\xi\cdot(A\circ\psi_{\exp,\widetilde{G}/\widetilde{G}_{>J}}(z)))

where ρ(x)=1\rho(x)=1 for |x|1/4|x|\leq 1/4, ρ(x)=0\rho(x)=0 for |x|1/3|x|\geq 1/3, and ρ\rho is O(1)O(1)-Lipschitz; we extend τ>J,ξ\tau_{>J,\xi} to G~/(G~>JΓ~)\widetilde{G}/(\widetilde{G}_{>J}\widetilde{\Gamma}) via periodicity. The function τ>J,ξ\tau_{>J,\xi} is seen to be an (M/(δε))OJ,J(dOJ,J(1))(M/(\delta\varepsilon^{\prime}))^{O_{J,J^{\prime}}(d^{O_{J,J^{\prime}}(1)})}-Lipschitz function on G~/(G~>JΓ~)\widetilde{G}/(\widetilde{G}_{>J}\widetilde{\Gamma}). This follows from the bound on the size of ξ\xi and the fact that distance in dG~d_{\widetilde{G}} controls distance in first-kind coordinates (see e.g. [42, Lemmas B.1, B.3]). Define τ>J,ξ\tau_{>J^{\prime},\xi} in the same manner, with BB in place of AA. We have that

\bigg\lVert\widetilde{F}(x\widetilde{\Gamma})-\sum_{\lVert\xi\rVert_{\infty}\leq(M/(\delta\varepsilon^{\prime}))^{O(\dim(\widetilde{G}))}}c_{\xi}\tau_{>J,\xi}((x~\mathrm{mod}~\widetilde{G}_{>J})\widetilde{\Gamma}_{>J})\tau_{>J^{\prime},\xi}((x~\mathrm{mod}~\widetilde{G}_{>J^{\prime}})\widetilde{\Gamma}_{>J^{\prime}})\bigg\rVert_{\infty}\leq\varepsilon^{\prime}.

As this holds for all xx such that ψexp,G~(x)ψexp,G~(g)+(1/2,1/2]dim(G~)\psi_{\mathrm{exp},\widetilde{G}}(x)\in\psi_{\mathrm{exp},\widetilde{G}}(g^{\ast})+(-1/2,1/2]^{\dim(\widetilde{G})} and the approximating function is invariant under Γ~\widetilde{\Gamma}, this holds for all xG~x\in\widetilde{G}. This completes the proof, plugging in x=g~(h1,,hk)x=\widetilde{g}(h_{1},\ldots,h_{k}) and noting that g~modG~>J\widetilde{g}~{}\mathrm{mod}~{}\widetilde{G}_{>J} and g~modG~>J\widetilde{g}~{}\mathrm{mod}~{}\widetilde{G}_{>J^{\prime}} are multidegree JJ^{\prime} and JJ polynomial sequences on G~/(G~>JΓ~)\widetilde{G}/(\widetilde{G}_{>J}\widetilde{\Gamma}) and G~/(G~>JΓ~)\widetilde{G}/(\widetilde{G}_{>J^{\prime}}\widetilde{\Gamma}) respectively. ∎

References

  • [1] D. Altman, A non-flag arithmetic regularity lemma and counting lemma, arXiv:2209.14083.
  • [2] D. Altman, On a conjecture of Gowers and Wolf, Discrete Anal. (2022), Paper No. 10, 13.
  • [3] V. Bergelson, B. Host, and B. Kra, Multiple recurrence and nilsequences, Invent. Math. 160 (2005), 261–303, With an appendix by Imre Ruzsa.
  • [4] V. Bergelson, T. Tao, and T. Ziegler, An inverse theorem for the uniformity seminorms associated with the action of 𝔽p\mathbb{F}^{\infty}_{p}, Geom. Funct. Anal. 19 (2010), 1539–1596.
  • [5] T. F. Bloom and O. Sisask, An improvement to the Kelley-Meka bounds on three-term arithmetic progressions, arXiv:2309.02353.
  • [6] O. A. Camarena and B. Szegedy, Nilspaces, nilmanifolds and their morphisms, arXiv:1009.3825.
  • [7] P. Candela, Notes on compact nilspaces, Discrete Anal. (2017), Paper No. 16, 57.
  • [8] P. Candela, Notes on nilspaces: algebraic aspects, Discrete Anal. (2017), Paper No. 15, 59.
  • [9] P. Candela, D. González-Sánchez, and B. Szegedy, On the inverse theorem for Gowers norms in abelian groups of bounded torsion, arXiv:2311.13899.
  • [10] P. Candela and B. Szegedy, Regularity and inverse theorems for uniformity norms on compact abelian groups and nilmanifolds, J. Reine Angew. Math. 789 (2022), 1–42.
  • [11] J.-P. Conze and E. Lesigne, Théorèmes ergodiques pour des mesures diagonales, Bull. Soc. Math. France 112 (1984), 143–175.
  • [12] L. J. Corwin and F. P. Greenleaf, Representations of nilpotent Lie groups and their applications. Part I, Cambridge Studies in Advanced Mathematics, vol. 18, Cambridge University Press, Cambridge, 1990, Basic theory and examples.
  • [13] P. Erdős and P. Turán, On Some Sequences of Integers, J. London Math. Soc. 11 (1936), 261–264.
  • [14] H. Furstenberg, Ergodic behavior of diagonal measures and a theorem of Szemerédi on arithmetic progressions, J. Analyse Math. 31 (1977), 204–256.
  • [15] H. Furstenberg and B. Weiss, A mean ergodic theorem for (1/N)Nn=1f(Tnx)g(Tn2x)(1/N)\sum^{N}_{n=1}f(T^{n}x)g(T^{n^{2}}x), Convergence in ergodic theory and probability (Columbus, OH, 1993), Ohio State Univ. Math. Res. Inst. Publ., vol. 5, de Gruyter, Berlin, 1996, pp. 193–227.
  • [16] W. T. Gowers, A new proof of Szemerédi’s theorem for arithmetic progressions of length four, Geom. Funct. Anal. 8 (1998), 529–551.
  • [17] W. T. Gowers, Arithmetic progressions in sparse sets, Current developments in mathematics, 2000, Int. Press, Somerville, MA, 2001, pp. 149–196.
  • [18] W. T. Gowers, A new proof of Szemerédi’s theorem, Geom. Funct. Anal. 11 (2001), 465–588.
  • [19] W. T. Gowers and L. Milićević, An inverse theorem for Freiman multi-homomorphisms, arXiv:2002.11667.
  • [20] W. T. Gowers and L. Milićević, A quantitative inverse theorem for the $U^{4}$ norm over finite fields, arXiv:1712.00241.
  • [21] W. T. Gowers and J. Wolf, The true complexity of a system of linear equations, Proc. Lond. Math. Soc. (3) 100 (2010), 155–176.
  • [22] B. Green, 100 open problems, manuscript, available on request.
  • [23] B. Green and T. Tao, An inverse theorem for the Gowers $U^{3}(G)$ norm, Proc. Edinb. Math. Soc. (2) 51 (2008), 73–153.
  • [24] B. Green and T. Tao, The primes contain arbitrarily long arithmetic progressions, Ann. of Math. (2) 167 (2008), 481–547.
  • [25] B. Green and T. Tao, New bounds for Szemerédi’s theorem. II. A new bound for $r_{4}(N)$, Analytic number theory, Cambridge Univ. Press, Cambridge, 2009, pp. 180–204.
  • [26] B. Green and T. Tao, An arithmetic regularity lemma, an associated counting lemma, and applications, An irregular mind, Bolyai Soc. Math. Stud., vol. 21, János Bolyai Math. Soc., Budapest, 2010, pp. 261–334.
  • [27] B. Green and T. Tao, Linear equations in primes, Ann. of Math. (2) 171 (2010), 1753–1850.
  • [28] B. Green and T. Tao, The Möbius function is strongly orthogonal to nilsequences, Ann. of Math. (2) 175 (2012), 541–566.
  • [29] B. Green and T. Tao, The quantitative behaviour of polynomial orbits on nilmanifolds, Ann. of Math. (2) 175 (2012), 465–540.
  • [30] B. Green and T. Tao, New bounds for Szemerédi’s theorem, III: a polylogarithmic bound for $r_{4}(N)$, Mathematika 63 (2017), 944–1040.
  • [31] B. Green, T. Tao, and T. Ziegler, Erratum for “An inverse theorem for the Gowers $U^{s+1}[N]$-norm”, manuscript.
  • [32] B. Green, T. Tao, and T. Ziegler, An inverse theorem for the Gowers $U^{4}$-norm, Glasg. Math. J. 53 (2011), 1–50.
  • [33] B. Green, T. Tao, and T. Ziegler, An inverse theorem for the Gowers $U^{s+1}[N]$-norm, Electron. Res. Announc. Math. Sci. 18 (2011), 69–90.
  • [34] B. Green, T. Tao, and T. Ziegler, An inverse theorem for the Gowers $U^{s+1}[N]$-norm, Ann. of Math. (2) 176 (2012), 1231–1372.
  • [35] Y. Gutman, F. W. R. M. Manners, and P. P. Varjú, The structure theory of nilspaces II: Representation as nilmanifolds, Trans. Amer. Math. Soc. 371 (2019), 4951–4992.
  • [36] Y. Gutman, F. W. R. M. Manners, and P. P. Varjú, The structure theory of nilspaces I, J. Anal. Math. 140 (2020), 299–369.
  • [37] Y. Gutman, F. W. R. M. Manners, and P. P. Varjú, The structure theory of nilspaces III: Inverse limit representations and topological dynamics, Adv. Math. 365 (2020), 107059, 53.
  • [38] B. Host and B. Kra, Nonconventional ergodic averages and nilmanifolds, Ann. of Math. (2) 161 (2005), 397–488.
  • [39] A. Jamneshan, O. Shalom, and T. Tao, The structure of arbitrary Conze–Lesigne systems, Comm. Amer. Math. Soc. 4 (2024), 182–229.
  • [40] A. Jamneshan and T. Tao, The inverse theorem for the $U^{3}$ Gowers uniformity norm on arbitrary finite abelian groups: Fourier-analytic and ergodic approaches, Discrete Anal. (2023), Paper No. 11, 48.
  • [41] Z. Kelley and R. Meka, Strong bounds for 3-progressions, arXiv:2302.05537.
  • [42] J. Leng, Efficient Equidistribution of Nilsequences, arXiv:2312.10772.
  • [43] J. Leng, Efficient Equidistribution of Periodic Nilsequences and Applications, arXiv:2306.13820.
  • [44] J. Leng, Improved Quadratic Gowers Uniformity for the Möbius Function, arXiv:2212.09635.
  • [45] J. Leng, A. Sah, and M. Sawhney, Improved bounds for five-term arithmetic progressions, arXiv:2312.10776.
  • [46] J. Leng, A. Sah, and M. Sawhney, Improved Bounds for Szemerédi’s Theorem, arXiv:2402.17995.
  • [47] F. W. R. M. Manners, Quantitative bounds in the inverse theorem for the Gowers $U^{s+1}$-norms over cyclic groups, arXiv:1811.00718.
  • [48] L. Milićević, Bilinear Bogolyubov Argument in Abelian Groups, arXiv:2109.03093.
  • [49] S. Peluse, A. Sah, and M. Sawhney, Effective bounds for Roth’s theorem with shifted square common difference, arXiv:2309.08359.
  • [50] K. F. Roth, On certain sets of integers. II, J. London Math. Soc. 29 (1954), 20–26.
  • [51] T. Sanders, On certain other sets of integers, J. Anal. Math. 116 (2012), 53–82.
  • [52] T. Sanders, On the Bogolyubov-Ruzsa lemma, Anal. PDE 5 (2012), 627–655.
  • [53] B. Szegedy, On higher order Fourier analysis, arXiv:1203.2260.
  • [54] E. Szemerédi, On sets of integers containing no four elements in arithmetic progression, Number Theory (Colloq., János Bolyai Math. Soc., Debrecen, 1968), Colloq. Math. Soc. János Bolyai, vol. 2, North-Holland, Amsterdam-London, 1970, pp. 197–204.
  • [55] E. Szemerédi, On sets of integers containing no $k$ elements in arithmetic progression, Acta Arith. 27 (1975), 199–245.
  • [56] T. Tao, Goursat and Furstenberg–Weiss type lemmas, 2021, blog post. https://terrytao.wordpress.com/2021/05/07/goursat-and-furstenberg-weiss-type-lemmas/.
  • [57] T. Tao and J. Teräväinen, Quantitative bounds for Gowers uniformity of the Möbius and von Mangoldt functions, J. Eur. Math. Soc. (2023), 1–64.
  • [58] T. Tao and V. H. Vu, Additive combinatorics, Cambridge Studies in Advanced Mathematics, vol. 105, Cambridge University Press, Cambridge, 2010, Paperback edition [of MR2289012].
  • [59] T. Tao and T. Ziegler, The inverse conjecture for the Gowers norm over finite fields via the correspondence principle, Anal. PDE 3 (2010), 1–20.
  • [60] T. Tao and T. Ziegler, The inverse conjecture for the Gowers norm over finite fields in low characteristic, Ann. Comb. 16 (2012), 121–188.
  • [61] T. Tao and T. Ziegler, Polynomial patterns in the primes, Forum Math. Pi 6 (2018), e1, 60.
  • [62] T. Ziegler, Universal characteristic factors and Furstenberg averages, J. Amer. Math. Soc. 20 (2007), 53–97.