
Optimal Testing of Generalized Reed-Muller Codes in Fewer Queries

Dor Minzer Department of Mathematics, Massachusetts Institute of Technology, Cambridge, USA. Supported by a Sloan Research Fellowship, NSF CCF award 2227876 and NSF CAREER award 2239160.    Kai Zheng Department of Mathematics, Massachusetts Institute of Technology, Cambridge, USA. Supported by the NSF GRFP.
Abstract

A local tester for an error correcting code $C\subseteq\Sigma^{n}$ is a tester that makes $Q$ oracle queries to a given word $w\in\Sigma^{n}$ and decides to accept or reject the word $w$. An optimal local tester is a local tester that has the additional properties of completeness and optimal soundness. By completeness, we mean that the tester must accept with probability $1$ if $w\in C$. By optimal soundness, we mean that if the tester accepts with probability at least $1-\varepsilon$ (where $\varepsilon$ is small), then it must be the case that $w$ is $O(\varepsilon/Q)$-close to some codeword $c\in C$ in Hamming distance.

We show that Generalized Reed-Muller codes admit optimal testers with $Q=(3q)^{\lceil\frac{d+1}{q-1}\rceil+O(1)}$ queries. Here, for a prime power $q=p^{k}$, the Generalized Reed-Muller code, $\operatorname{RM}[n,q,d]$, consists of the evaluations of all $n$-variate degree $d$ polynomials over $\mathbb{F}_{q}$. Previously, no tester achieving this query complexity was known, and the best known testers, due to Haramaty, Shpilka and Sudan [21] (which is optimal) and due to Ron-Zewi and Sudan [33] (which was not known to be optimal), both required $q^{\lceil\frac{d+1}{q-q/p}\rceil}$ queries. Our tester achieves a query complexity that is polynomially better than this, by a power of $p/(p-1)$, and which is nearly the best query complexity possible for generalized Reed-Muller codes.

The tester we analyze is due to Ron-Zewi and Sudan, and we show that their basic tester is in fact optimal. Our methods are more general and also allow us to prove that a wide class of testers, which follow the form of the Ron-Zewi and Sudan tester, are optimal. This result applies to testers for all affine-invariant codes (which are not necessarily generalized Reed-Muller codes).

1 Introduction

1.1 Local Testing of Reed Muller Codes

The Reed-Muller Code is a widely used code with many applications in complexity theory, and more broadly in theoretical computer science. One reason for this is that the Reed-Muller code enjoys very good local testability properties which are crucial in many applications (for example, in the construction of probabilistically checkable proofs). The primary goal of this paper is to present local testers for Reed-Muller codes over extension fields with improved query complexity, which additionally satisfy a stronger notion of soundness known as optimal testing.

Throughout this paper, $p\in\mathbb{N}$ is a prime and $q=p^{k}$ is a prime power, where $k$ should be thought of as moderately large (so that $q$ is significantly larger than $p$). For a degree parameter $d\in\mathbb{N}$ and a number-of-variables parameter $n\in\mathbb{N}$, the Reed-Muller code consists of all evaluation vectors of degree $d$ polynomials $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$. That is,

$$\operatorname{RM}[n,q,d]=\left\{(f(\vec{x}))_{\vec{x}\in\mathbb{F}_{q}^{n}}~\middle|~f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}\text{ is a polynomial of degree at most }d\right\}.$$

When $k>1$, $\operatorname{RM}[n,q,d]$ is sometimes called a generalized Reed-Muller code, to distinguish it from the prime field case, and as the title suggests, our results are most relevant to this case. However, henceforth we will refer to them as Reed-Muller codes for simplicity.

Abusing notation, we will not distinguish between functions and their evaluation vectors when referring to codewords. That is, for $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, we will simply say $f\in\operatorname{RM}[n,q,d]$ if $\deg(f)\leqslant d$, and we will view the codewords of $\operatorname{RM}[n,q,d]$ themselves as functions. When talking about the degree of a function $f$, we mean the total degree when $f$ is written as a polynomial.

Given a proximity parameter $\delta>0$, the goal in the problem of local testing of Reed-Muller codes is to design a randomized tester $\mathcal{T}$ that makes $Q$ oracle queries (for $Q$ as small as possible) to a given function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ such that:

  1. Completeness: if $f$ is a polynomial of degree at most $d$, then $\mathcal{T}$ accepts with probability $1$.

  2. Soundness: if $f$ is $\delta$-far from all degree $d$ polynomials, then the tester $\mathcal{T}$ rejects with probability at least $2/3$.

Local Testing.

Reed-Muller codes have a very natural and well-studied local test [1, 24, 22, 21] called the $t$-flat test. This test has its origins in the study of probabilistically checkable proofs [15, 3, 2, 34, 4, 32] (as it is related to the well known plane versus plane, plane versus line and line versus point tests), as well as in the theory of Gowers’ uniformity norms [16]. To check if a given function $f$ is indeed of low degree, the tester samples a random $t$-dimensional affine subspace $U\subseteq\mathbb{F}_{q}^{n}$, queries $f(\vec{x})$ for all $\vec{x}\in U$ and checks whether the resulting $t$-variate function $f|_{U}$ has degree at most $d$. Clearly the number of queries made is $q^{t}$, and it is also clear that the test is complete: if $f$ has degree at most $d$, then the tester always accepts. As for the soundness, it is known that one can take $t=\lceil\frac{d+1}{q-q/p}\rceil$ and get that the tester is somewhat sound [1], meaning that the rejection probability is bounded away from $0$ (as opposed to being at least $2/3$). Indeed, a typical local-testing result shows that if a function $f$ is $\delta$-far from being a degree $d$ function, then the tester rejects it with probability at least $\varepsilon=\varepsilon(q,d,\delta)>0$, a quantity that typically vanishes with the various parameters. To amplify the soundness, one repeats the tester $\Theta(1/\varepsilon)$ times sequentially to get rejection probability $2/3$, thereby giving a tester whose query complexity is $O(q^{t}/\varepsilon)$ and whose soundness is at least $2/3$.

Such testers for the Reed-Muller code have been known for a long time. Indeed, in [1] the $t$-flat tester is analyzed and it is shown that the rejection probability of the basic tester is at least $\varepsilon\geqslant\Omega(\delta/q^{t})$, leading to a tester with query complexity $O(q^{2t}/\delta)$. This soundness analysis turns out not to be optimal, and indeed, as we explain below, follow-up works have given a better analysis of the $t$-flat tester. In particular, it was shown that the $t$-flat tester is an optimal tester.

1.2 Optimal Testing of Reed Muller Codes

In this paper, we focus on a much stronger notion of testing known as optimal testing. Clearly, if a tester makes $Q$ queries and the proximity parameter is $\delta$, then the rejection probability can be at most $\min(1,Q\delta)$; indeed, this is a bound on the probability of distinguishing between a Reed-Muller codeword and a Reed-Muller codeword that has been perturbed on a randomly chosen $\delta$-fraction of the inputs. A tester is called optimal if the query-to-rejection-probability tradeoff it achieves is roughly that. Oftentimes, one settles for a rejection probability which is a bit worse and has the form $c(q)\min(1,Q\delta)$ for some function $c(q)>0$ depending only on the field size $q$. We will refer to such results also as optimal testing results. Thus, one would ideally like a tester which achieves as small a query complexity as possible, while simultaneously being optimal.

Optimal testers are known for Reed-Muller codes. Such results were first proved over $\mathbb{F}_{2}$ by [10]. Later on, optimal testing results were established for Reed-Muller codes over general fields [21], as well as more broadly for the class of affine lifted codes [20]. In all of these results, the $t$-flat test is analyzed (for $t$ chosen as above) and is shown to be an optimal tester. We remark that additionally, the analyses of [10, 21, 20] led to improved query complexity for testing Reed-Muller codes. These results have important applications, most notably to the construction of small-set expander graphs with many eigenvalues close to $1$ [8], which have later been used for improved PCP constructions and constructions of integrality gaps [19, 12, 11, 23].

Quantitatively, these results have two drawbacks. First, due to their application of the density Hales-Jewett theorem, the dependency of $c(q)$ on the field size $q$ is of tower type, making the result effective primarily for small fields. Secondly, while the query complexity achieved by their tester is the best possible for prime fields (as a lower bound on the query complexity needed is given by the distance of the dual code of $\operatorname{RM}[n,q,d]$, which is $q^{t}$ if $q$ is prime), it is not known to be optimal for fields of prime power size. This raises two natural questions: does the flat tester actually perform worse on large fields (in comparison to small fields)? Is there a tester with smaller query complexity over non-prime fields (i.e., extension fields)?

In [33] a new local tester for the Reed-Muller code was designed. The query complexity of the tester is $Q={\sf poly}(q)\cdot(3q)^{\lceil\frac{d+1}{q}\rceil}$, which is polynomially smaller than $q^{t}$ above (by a power of $\frac{p}{p-1}$). This tester, which will be referred to as the sparse $t$-flat tester, plays a crucial role in the current work and will be presented in subsequent sections.

Unfortunately, the rejection probability proved in [33] for the sparse $t$-flat tester is an $\varepsilon$ which is sub-constant, and after repeating the tester $\Theta(1/\varepsilon)$ times its query complexity turns out to be roughly the same as that of the $t$-flat tester above. This leaves the Reed-Muller code over extension fields in a rather precarious situation: a local characterization for the code (namely, a basic tester that rejects words far from Reed-Muller codewords with some non-negligible probability) is known, but amplifying the soundness to be constant sets us back to the same query complexity as that of the $t$-flat tester.

1.3 Main Result: Optimal, Query Efficient Tester for Generalized Reed Muller Codes

The main result of this paper is a new and improved analysis of the tester of [33]. We show that the soundness of the tester is much better than the guarantee given in [33], and that in fact this tester is also an optimal tester:

Theorem 1.1.

For all primes $p\in\mathbb{N}$ and prime powers $q=p^{k}$ there exists a tester $\mathcal{T}$ with query complexity $Q\leqslant 3q^{p+O(1)}(3q)^{\lceil\frac{d+1}{q-1}\rceil}$ such that, given oracle access to a function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$,

  1. Completeness: if $f$ has degree at most $d$, then $\mathcal{T}$ accepts with probability $1$.

  2. Soundness: if $f$ is $\delta$-far from degree $d$, then $\mathcal{T}$ rejects with probability at least $c(q)\min(1,Q\delta)$, where $c(q)=\frac{1}{{\sf poly}(q)}$.

The test uses a parameter $t$ (where $\lceil\frac{d+1}{q-1}\rceil+t$ is analogous to the “dimension” parameter in the flat test), and the $t$ that we use will be $t=\max(p+2,10)$. We remark however that the analysis we give applies to all $t\geqslant\max(p+2,10)$, and we choose this specific $t$ so as to minimize the query complexity. We defer to Section 3.2 for more details on this parameter.

A lower bound on the query complexity needed is $q^{\lceil\frac{d+1}{q-1}\rceil}$, which once again follows by considering the dual code of the generalized Reed-Muller code and arguing that this number is its distance. Hence, Theorem 1.1 is tight up to a factor of ${\sf poly}(q)\cdot 3^{\lceil\frac{d+1}{q-1}\rceil}$; for large $q$, this factor is very small compared to $q^{\lceil\frac{d+1}{q-1}\rceil}$, hence the query complexity achieved by our tester is essentially optimal for large field size $q$.

1.4 Optimal Testing of Other Linear Lifted Affine Invariant Codes

Our techniques are in fact more general, and also apply to testers of a wider family of codes, called (linear) lifted affine invariant codes. These codes were introduced by [17] and shown to be optimally testable in [20, 31]. In words, we show that any tester for such codes is optimal if it can be expressed as the product of polynomials, such that each polynomial is defined on a constant number of variables and the variables of distinct polynomials are disjoint. We describe this result in more detail below, but defer the full discussion to Section 6.

A code $\mathcal{C}\subseteq\{f:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}\}$ is linear if its codewords form a linear subspace, and is affine invariant if for any affine transformation $T:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}^{n}$ and $f\in\mathcal{C}$, we have that $f\circ T\in\mathcal{C}$, where $f\circ T$ is defined as the function $f^{\prime}:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ such that $f^{\prime}(x)=f(T(x))$ for all $x\in\mathbb{F}_{q}^{n}$. Since $\mathcal{C}$ is linear, it has a dual $\mathcal{C}^{\perp}$, also consisting of functions from $\mathbb{F}_{q}^{n}$ to $\mathbb{F}_{q}$, and $f\in\mathcal{C}$ if and only if $\langle f,h\rangle=0$ for all $h\in\mathcal{C}^{\perp}$, where this inner product is the standard dot product of the evaluation vectors of $f$ and $h$ (over $\mathbb{F}_{q}^{n}$). Notice that one can also compose $f$ with affine transformations $T:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}^{n}$ for $k<n$. In this case, $f\circ T$ is a function from $\mathbb{F}_{q}^{k}$ to $\mathbb{F}_{q}$, and we can consider the code consisting of all $f:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ such that $f\circ T$ is in some affine invariant base code $\mathcal{G}\subseteq\{g:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}\}$. This code is called the $n$-dimensional lift of $\mathcal{G}$ and is defined as:

$${\sf Lift}_{n}(\mathcal{G})=\{f:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}\;|\;f\circ T\in\mathcal{G}\text{ for all affine transformations }T:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}^{n}\}.$$

It is not hard to see that ${\sf Lift}_{n}(\mathcal{G})$ is also affine invariant and linear. Suppose we want to design a local tester for $\mathcal{C}$ and we know $\mathcal{C}={\sf Lift}_{n}(\mathcal{G})$ for some affine invariant $\mathcal{G}$ defined as above with $k\geqslant 10$. A natural test is the following:

  1. Take $\mathcal{H}\subseteq\mathcal{G}^{\perp}$.

  2. Choose an affine transformation $T:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}^{n}$ uniformly at random.

  3. Accept if and only if $\langle f\circ T,h\rangle=0$ for all $h\in\mathcal{H}$.

We remark that not every choice of $\mathcal{H}$ results in a local tester, and it is indeed possible to choose $\mathcal{H}$ so that there exist $f\notin\mathcal{C}$ that still pass the above test with probability $1$. Our main result shows that when such a test is a local test and $\mathcal{H}$ consists of functions of a specified form, then the tester is automatically an optimal tester. In order to obtain explicit optimal testers one still has to find such an $\mathcal{H}$ for which the above test is a local tester, but this is not the focus of the current paper.
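To illustrate the mechanics of the generic test above, here is a minimal sketch (not the authors' implementation). The parameters $q=3$, $k=2$, $n=4$ and the singleton set $\mathcal{H}$ below are toy assumptions chosen only for illustration; $q$ is taken prime so that integer arithmetic mod $q$ is field arithmetic, and no claim is made that this particular $\mathcal{H}$ yields a sound local tester.

    # A minimal sketch of the generic inner-product test for a lifted code
    # C = Lift_n(G): pick a uniformly random affine map T : F_q^k -> F_q^n and
    # accept iff <f o T, h> = 0 for every h in the chosen set H of dual functions.
    import itertools
    import random

    q, k, n = 3, 2, 4            # toy parameters (assumption, for illustration only)

    def random_affine_map(k, n):
        """A uniformly random affine T = (M, c) with T(x) = Mx + c over F_q."""
        M = [[random.randrange(q) for _ in range(k)] for _ in range(n)]
        c = [random.randrange(q) for _ in range(n)]
        def T(x):
            return tuple((sum(M[i][j] * x[j] for j in range(k)) + c[i]) % q
                         for i in range(n))
        return T

    def inner_product(g, h):
        """<g, h> = sum over x in F_q^k of g(x) h(x), computed in F_q."""
        return sum(g(x) * h(x) for x in itertools.product(range(q), repeat=k)) % q

    def one_round_of_test(f, H):
        """One round: restrict f by a random affine map and check every h in H."""
        T = random_affine_map(k, n)
        f_restricted = lambda x: f(T(x))
        return all(inner_product(f_restricted, h) == 0 for h in H)

    # Toy instance: f is a degree-1 polynomial on F_3^4, and H contains the single
    # dual function h(x) = x1*x2, which detects the monomial x1 x2 (cf. Lemma 2.2).
    f = lambda y: (y[0] + 2 * y[2] + 1) % q
    H = [lambda x: (x[0] * x[1]) % q]
    print(one_round_of_test(f, H))   # an affine f has no x1 x2 monomial, so this prints True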

The form for \mathcal{H} that we require is as follows. Let

$$H(x_{1},\ldots,x_{k^{\prime}})=\prod_{i=1}^{s}P_{i}(x_{m(i)},\ldots,x_{m(i+1)-1}),$$

where $\operatorname{poly}(q)\geqslant k-k^{\prime}\geqslant 0$, and $m(i+1)-m(i)\leqslant t^{\prime}$ for some constant $t^{\prime}$ and all $1\leqslant i\leqslant s-1$. In words, $H$ is a polynomial in at most $k^{\prime}$ variables, for some $k^{\prime}$ that is within $\operatorname{poly}(q)$ of $k$, that can be factored into a product of polynomials, each on a constant number of variables, such that the variable sets of these polynomials are pairwise disjoint. Next, let $\mathcal{M}\subsetneq\{M:\mathbb{F}_{q}^{k-k^{\prime}}\to\mathbb{F}_{q}\}$ be an affine invariant code and let $\overline{\mathcal{M}}$ be a basis for $\mathcal{M}^{\perp}$. Finally, suppose

$$\mathcal{H}=\{H(x_{1},\ldots,x_{k^{\prime}})\cdot M(x_{k^{\prime}+1},\ldots,x_{k})\;|\;M\in\overline{\mathcal{M}}\}.$$

Our theorem states:

Theorem 1.2.

Suppose $\mathcal{H}$ is of the previously described form and suppose that the previously described test using $\mathcal{H}$ is a local tester for $\mathcal{C}={\sf Lift}_{n}(\mathcal{G})$. Then the local tester is also optimal. That is,

  1. Completeness: if $f\in\mathcal{C}$ then the test accepts with probability $1$.

  2. Soundness: if $f$ is $\delta$-far from $\mathcal{C}$, then the test rejects with probability at least $c(q)\min(1,Q\delta)$, where $c(q)=\frac{1}{{\sf poly}(q)}$.

Although our result is stated for lifted affine-invariant codes, it also applies equally to affine-invariant codes by taking $\mathcal{C}=\mathcal{G}={\sf Lift}_{k}(\mathcal{G})$. In contrast, the optimal testing result for lifted affine-invariant codes in [31] applies only to the $k$-flat test for ${\sf Lift}_{n}(\mathcal{G})$, which is no longer “local” in the case of $\mathcal{C}=\mathcal{G}$ as it looks at the entire domain. On the other hand, for the $\mathcal{C}=\mathcal{G}$ case, one could still obtain locality using our result by designing a set $\mathcal{H}$ of the specified form that has sparse support. Theorem 1.2 gives a general recipe to construct optimal testers, thus making progress on the task of establishing optimal testing results for general affine invariant codes. We would like to highlight two interesting avenues that we leave for future work. First, it would be interesting to investigate which other affine invariant codes can be analyzed via Theorem 1.2. This could lead to new optimal testing results for other codes, or otherwise to a more general class of testers that one can then try to analyze using the tools presented herein (and their extensions). Second, it would be interesting to extend our techniques to remove the requirement on the form of $\mathcal{H}$ (or perhaps weaken it somehow), and show that any local test for affine-invariant codes is optimal.

1.5 Optimal Testing via Global Hypercontractivity

The proof of Theorem 1.1 is very different from the proofs of [10, 21, 20] (which proceed by induction on $n$), as well as from the proof of [33] (which proceeds by presenting a local characterization of the generalized Reed-Muller code and appealing to a general and powerful result due to [25] that converts local characterizations into local testers). Instead, our proof is inspired by a new approach for establishing such results via global hypercontractivity [31].

Global hypercontractivity is a recently introduced tool that is often useful when working with graphs that are not small-set expanders. One corollary of global hypercontractivity (which is morally equivalent to it) is a useful characterization of all small sets in a graph that have edge expansion bounded away from $1$. The study of global hypercontractivity has its roots in the proof of the 2-to-2 Games Theorem [29, 14, 13, 30]; however, by now it is known to be useful in the study of a host of different problems (see for example [26, 27, 28, 31, 5, 7, 6, 18]).

Below, we explain the global hypercontractivity approach to proving local testability results. In Section 1.6 we explain how we extend this approach to the realm of generalized Reed-Muller codes in order to analyze the sparse $t$-flat tester and prove Theorems 1.1 and 1.2.

1.5.1 Optimal Testing of Reed-Muller Codes via Global Hypercontractivity

In [31], the authors relate the analysis of the $t$-flat tester of the Reed-Muller code to expansion properties of the affine Grassmann graph. Here, the affine Grassmann graph is the graph whose vertex set is the set of all $t$-flats in $\mathbb{F}_{q}^{n}$, and two vertices are adjacent if they intersect in a $(t-1)$-dimensional affine subspace. In short, the approach of [31] starts with the assumption that the $t$-flat tester accepts $f$ with probability at least $1-\varepsilon$ (for $\varepsilon$ thought of as small) and proceeds to prove a structural result on the set of $t$-flats on which the tester rejects:

$$S=\left\{T\subseteq\mathbb{F}_{q}^{n}~\middle|~{\sf deg}(f|_{T})>d,\ T\text{ is a $t$-flat}\right\}.$$

In particular, using a lemma from [21] they prove that the set $S$ has poor expansion properties in the affine Grassmann graph, and use it to prove that there exists a point $x^{\star}\in\mathbb{F}_{q}^{n}$ such that the tester almost always rejects when it selects a subspace $T$ containing $x^{\star}$. This suggests that $f$ is erroneous at $x^{\star}$ and should be corrected, and indeed using this a local correction procedure is devised in [31] to show that the value of $f$ at $x^{\star}$ can be changed so as to increase the acceptance probability by an additive factor of $q^{t-n-O(1)}$. Iterating this argument, one eventually changes $f$ on at most a $C(q)\cdot q^{-t}\varepsilon$ fraction of the inputs and gets a function $f^{\prime}$ on which the tester accepts with probability $1$. Such an $f^{\prime}$ must be of degree at most $d$, hence one gets that $f$ is close to a degree $d$ polynomial.

1.5.2 Optimal Testing of Generalized Reed-Muller Codes via Global Hypercontractivity

While the approach of [31] seems to be more robust and thus potentially applicable towards analyzing a richer class of codes, it is not completely obvious how to do so. For the $t$-flat tester one may associate a very natural graph with the tester, but this is much less clear in the context of the sparse $t$-flat tester (which is the tester that we analyze). The pattern of inputs which are queried by the tester is no longer a nice-looking subspace (but this seems inherent in order to improve upon the query complexity of the flat tester).

At a high level, we show that another way of approaching this problem is by utilizing the symmetries of the code and constructing graphs on them. For that, we have to think of the tester as the composition of a “basic tester” and a random element from the group of symmetries of the code. In our case, the group of symmetries is the group of affine linear transformations over $\mathbb{F}_{q}^{n}$, and the graph that turns out to be related to the analysis of the sparse $t$-flat tester is the so-called Affine Bi-linear Scheme Graph.

At first sight, affine linear transformations seem to be morally equivalent to flats (identifying the image of an affine linear transformation with a subspace), and indeed there are many connections between results on the former graph and results on the latter graph. However, the distinction between affine linear transformations and flats will be crucial for us. Indeed, while two affine linear transformations $A_{1}$ and $A_{2}$ may have the same image, the performance of the tester when choosing $A_{1}$ and when choosing $A_{2}$ will not be the same whatsoever, and therefore we must capitalize on this distinction in our soundness analysis.

1.6 Our Techniques

In this section, we give a brief overview of the techniques underlying the proof of Theorem 1.1. We start by presenting the sparse $(s+t)$-flat tester of [33] and then take a somewhat different perspective on it in the form of groups of symmetries. After that, we explain the high level strategy of the proof of Theorem 1.1, and explain some of the unique challenges that arise compared to the analysis of the standard $t$-flat tester. For the sake of presentation, we focus on the case that $p=2$ for the remainder of this section, and switch back to general $p$ in Section 2.

1.6.1 The Sparse Flat Tester

The construction of the sparse flat tester of [33] begins with the following observation. In the $p=2$ case, define the bivariate polynomial $P\colon\mathbb{F}_{q}^{2}\to\mathbb{F}_{q}$ by

$$P(x_{1},x_{2})=\frac{-x_{2}^{q-1}+(x_{1}+x_{2})^{q-1}}{x_{1}}=\sum_{i=0}^{q-2}x_{1}^{i}x_{2}^{q-2-i}.$$

In [33], the authors observe the following two crucial properties of $P$ that make it useful towards getting a local testing result:

  1. Sparse Support: the support of $P$ has size at most $3q$; indeed, if $x_{1}+x_{2}\neq 0$, $x_{1}\neq 0$ and $x_{2}\neq 0$, then by Fermat’s little theorem $(x_{1}+x_{2})^{q-1}=x_{2}^{q-1}=1$ and $x_{1}\neq 0$, so $P(x_{1},x_{2})=0$.

  2. Detects the Monomials $x_{1}^{q-i}x_{2}^{i}$: looking at the inner product of $P$ with a monomial $M$, defined as $\langle P,M\rangle=\sum_{x_{1},x_{2}}P(x_{1},x_{2})M(x_{1},x_{2})$, we get that if $M=x_{1}^{q-i}x_{2}^{i}$ for $i\in\{1,\ldots,q-1\}$ then $\langle P,M\rangle\neq 0$. Indeed,

    $$\langle P,M\rangle=\sum_{j=0}^{q-2}\sum_{x_{1}}x_{1}^{q-i+j}\sum_{x_{2}}x_{2}^{q-2-j+i}=\sum_{j=0}^{q-2}(q-1)1_{q-i+j=q-1}\cdot(q-1)1_{q-2-j+i=q-1},$$

    so the only contribution comes from $j=i-1$ and it is non-zero. For any other monomial $M$, a similar computation shows that $\langle P,M\rangle=0$; hence taking an inner product of a function $f$ with $P$ may be thought of as detecting whether $f$ contains some monomial of the form $x_{1}^{q-i}x_{2}^{i}$. (Both properties are checked numerically in the sketch following this list.)

With this in mind, the sparse tester follows. In [33] it is argued that to design a local tester for the generalized Reed-Muller code it suffices to design a tester that detects whether certain canonical monomials exist. Writing $d+1=s\cdot\frac{q}{2}+r$, where $s$ is even and $r<q$, it is sufficient to detect whether any monomial of the form $\prod_{i=1}^{s/2}x_{2i-1}^{q/2}x_{2i}^{q/2}\cdot\prod_{i=s+1}^{s+t}x_{i}^{e_{i}}$, where $\sum_{i=s+1}^{s+t}e_{i}\geqslant r$ and $t$ is a small constant (say, $t=10$), appears in $f$. Using $P$ from above, a detector for such monomials is given by

$$H_{e^{\prime}}(x_{1},\ldots,x_{s+t})=P(x_{1},x_{2})\cdots P(x_{s-1},x_{s})\cdot M_{e^{\prime}}(x_{s+1},\ldots,x_{s+t}),$$

where $e^{\prime}+e=(q-1,\ldots,q-1)$ and $M_{e^{\prime}}(x_{s+1},\ldots,x_{s+t})=\prod_{i=s+1}^{s+t}x_{i}^{q-1-e_{i}}$. We note that most of the degree of $H_{e^{\prime}}$ comes from the product of the $P$'s, while at most $t(q-1)-r$ of the degree comes from the rest. As the support of $P$ is rather sparse, it follows that the support of $H_{e^{\prime}}$ is also rather sparse. More precisely, the support of $H_{e^{\prime}}$ has size at most $(3q)^{\frac{s}{2}+t}$, and as $t$ should be thought of as small and $s$ is roughly $2d/q$, the support of $H_{e^{\prime}}$ has size roughly $(3q)^{d/q}$. This yields a query complexity of $(3q)^{d/q}$, which is quadratically better than the $q^{\lceil\frac{d}{q-q/p}\rceil}\approx q^{2d/q}$ of the flat tester.

As mentioned earlier, the soundness analysis in [33] relies on a black box result from [25]. They manage to show that the detector they construct yields a tester with the same query complexity that rejects functions that are $\delta$-far from Reed-Muller codewords with probability $\Omega((3q)^{-2(s/2+t)}\delta)$. To get constant rejection probability one has to repeat the tester $(3q)^{2(s/2+t)}$ times, thereby drastically increasing the query complexity; we defer a more detailed discussion to Section 3.

1.6.2 The Group of Symmetries Perspective

Our first task in proving Theorem 1.1 is to design a tester based on the ideas from [33]. The tester is very natural:

  1. Sample a random affine map $T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}$; here, $s$ and $t$ should be thought of as in the previous section. We are going to look at the function $f\circ T$, but not query all of its values (indeed, querying all of its values would amount to the $(s+t)$-flat tester).

  2. For any sequence of degrees $e=(e_{s+1},\ldots,e_{s+t})$ such that $\sum_{i=s+1}^{s+t}(q-1-e_{i})\geqslant r$, take $H_{e}$ and compute $\langle f\circ T,H_{e}\rangle$. Reject if this inner product is non-zero for any such degree sequence. Otherwise, accept.

In words, we first take the restriction of $f$ to a random $(s+t)$-flat, and then apply the detector polynomials of [33]. Although we test for multiple degree sequences (up to $q^{t}$ of them), we will see in Section 3 that the support of $H_{e}$ is contained in the same set of points in each case. Therefore, we can perform the inner products for all of the degree sequences using the same set of queries. Another way to think about this tester is that we have the group of symmetries of the Reed-Muller code (affine linear transformations), and our tester proceeds by first taking a random symmetry from this group, taking a restriction, and then applying the detector of [33].
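The following is a minimal sketch of one round of this tester for $p=2$, meant only to illustrate the mechanics. All parameters are toy assumptions: the field is $\mathrm{GF}(4)$ realized as $\mathbb{F}_{2}[z]/(z^{2}+z+1)$, the dimensions $s=2$, $t=2$, $n=5$ are arbitrary, and the few exponent sequences checked at the end are illustrative rather than the full set $\{e:\sum_{i}(q-1-e_{i})\geqslant r\}$ the real tester enumerates.

    # Sketch of one round of the sparse (s+t)-flat tester for p = 2: restrict f by a
    # random affine map T and take inner products with H_e = P(x1,x2) x3^{e3} x4^{e4},
    # querying f only on the (sparse) common query pattern of the H_e's.
    import itertools, random

    q = 4
    MOD = 0b111                  # GF(4) = F_2[z]/(z^2 + z + 1); elements are ints 0..3

    def gf_mul(a, b):
        res = 0
        while b:
            if b & 1:
                res ^= a
            b >>= 1
            a <<= 1
            if a & 0b100:
                a ^= MOD
        return res

    def gf_pow(a, e):
        res = 1
        for _ in range(e):
            res = gf_mul(res, a)
        return res

    def P(x1, x2):               # the detector polynomial from [33] (p = 2 case)
        acc = 0
        for i in range(q - 1):
            acc ^= gf_mul(gf_pow(x1, i), gf_pow(x2, q - 2 - i))
        return acc

    s, t, n = 2, 2, 5            # toy dimensions (assumption, for illustration only)
    supp_P = [(x1, x2) for x1 in range(q) for x2 in range(q) if P(x1, x2) != 0]
    # Every H_e is supported inside supp(P) x F_q^t, so one query pattern serves all e.
    query_pattern = [(x1, x2, x3, x4)
                     for (x1, x2) in supp_P
                     for x3 in range(q) for x4 in range(q)]

    def random_affine_map():
        M = [[random.randrange(q) for _ in range(s + t)] for _ in range(n)]
        c = [random.randrange(q) for _ in range(n)]
        def T(x):
            out = []
            for i in range(n):
                acc = c[i]
                for j in range(s + t):
                    acc ^= gf_mul(M[i][j], x[j])
                out.append(acc)
            return tuple(out)
        return T

    def sparse_test_round(f, exponent_sequences):
        """Sample T, then for each e check <f o T, P(x1,x2) x3^{e3} x4^{e4}> = 0."""
        T = random_affine_map()
        queried = {x: f(T(x)) for x in query_pattern}   # all checks reuse these queries
        for (e3, e4) in exponent_sequences:
            acc = 0
            for (x1, x2, x3, x4) in query_pattern:
                h = gf_mul(P(x1, x2), gf_mul(gf_pow(x3, e3), gf_pow(x4, e4)))
                acc ^= gf_mul(queried[(x1, x2, x3, x4)], h)
            if acc != 0:
                return False                            # reject
        return True                                     # accept

    # A degree-2 function on F_4^5 passes: every detected monomial has degree >= 4.
    f = lambda y: gf_mul(y[0], y[1]) ^ y[4]
    print(sparse_test_round(f, [(0, 0), (1, 2), (3, 3)]))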

1.6.3 The Graph on Affine Linear Transformations and Its Analysis

With the above tester in mind, the next question is how to analyze it. In the case of the flat tester we had a very natural graph associated with the tester (the affine Grassmann graph). The above tester seems related to flats as well, since the image of an affine transformation is a flat; however, there is a key, crucial distinction between the two. In the above tester, we only look at the value of $f$ at a few locations in $\operatorname{Im}(T)$, hence the tester may perform differently on $T$ and $T^{\prime}$ even if they have identical images. This lack of symmetry is crucial for the sparseness of the test, but makes the task of associating a graph with the tester trickier.

To gain some intuition as to what this graph is supposed to be, recall that in the case of flat testers, one could look at the $t$-flat tester for all $t$ (not necessarily the smallest one). The relations between these testers for different $t$'s play a crucial role in all of the works establishing optimal testing results, and in particular it is known that the rejection probability of the $(t+1)$-flat tester is at most $q$ times the rejection probability of the $t$-flat tester. We will elaborate on this property a bit later (it is referred to as the “shadow lemma” below). Another benefit of looking at the various testers is that it gives a natural way of arriving at the affine Grassmann graph, by doing an up-down walk between these testers. To obtain the edges of the affine Grassmann graph, one can start with a $t$-flat, go up to a $(t+1)$-flat that contains it, and then back down to a $t$-flat contained in the $(t+1)$-flat. What is the right analog of this operation in the context of the sparse flat tester?

Going up.

The above discussion suggests looking for analogs of the tester above in higher dimensions, and there is a clear natural analog of the up step: for any $r\geqslant 0$, we can look at the $(s+t+r)$ sparse flat tester, in which one chooses an affine map $T\colon\mathbb{F}_{q}^{s+t+r}\to\mathbb{F}_{q}^{n}$ randomly, and proceeds with the tester as above for all viable degree sequences (but over more variables). Thus, starting with the $(s+t)$ sparse flat tester and with an affine map $T$ thought of as an $n\times(s+t)$ matrix over $\mathbb{F}_{q}$ together with an affine shift $c\in\mathbb{F}_{q}^{n}$, we can go “up” to the $(s+t+1)$ sparse flat tester by choosing a random vector $w\in\mathbb{F}_{q}^{n}$ and looking at the affine transformation $A$ corresponding to the matrix whose columns are the same as those of $T$, except that we append $w$ as the last column. Just like in the flat tester, it can be observed without much difficulty that if the $(s+t)$ sparse flat tester rejects when choosing $T$, then the $(s+t+1)$ sparse flat tester rejects when choosing $A$.

Going down.

The “going down” step is also simple, but a bit harder to motivate. Taking inspiration from the flat tester, one may want to apply some linear shuffling to the columns of $A$ and then “drop” one of the columns. This does not seem to work, though, as doing so would lead to a $T^{\prime}$ on which the performance of the sparse flat tester seems “completely independent” of its performance on $T$ (in the sense that the set of points it looks at in $T^{\prime}$ would typically be disjoint from the set of points it looks at in $T$).

Thus, when going down we wish to do so in a way that keeps $T^{\prime}$ and $T$ equal on many points. A natural thing to try is to apply to $A$ an affine transformation $R:\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{s+t+1}$ that fixes a co-dimension $1$ space. In this case, $T^{\prime}=A\circ R$ is output, and $T^{\prime}$ is equal to $A$ on the co-dimension $1$ space that $R$ fixes. On the other hand, by construction $T$ is equal to $A$ on a co-dimension $1$ space as well, namely the subspace whose last coordinate equals $0$. Therefore, after the down step we get $T^{\prime}$ which is equal to $T$ on a subspace of dimension $s+t-1$, which is essentially as similar to $T$ as possible while still being distinct from it.

Put a different way, to go down from the $(s+t+1)$ sparse flat tester to the $(s+t)$ sparse flat tester we proceed by choosing $b_{1},\ldots,b_{s+t},b_{s+t+1}\in\mathbb{F}_{q}$ uniformly and independently and taking the affine transformation $T^{\prime}$ corresponding to the matrix whose $i$th column is ${\sf col}_{i}(T)+b_{i}\cdot w$, and whose shift is $c+b_{s+t+1}w$. In words, we drop the final column of $A$ but add a random multiple of it to each one of the other columns and to the shift.

Going up and then down.

Stitching these two operations together, one gets a graph whose set of vertices is the set of affine linear transformations $T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}$ and whose edges are very similar in spirit to those of the affine Grassmann graph; this graph is known as the bi-linear scheme graph. The core of our analysis relies on $3$ components:

  1. Relating the performance of the tester to expansion on this graph (the shadow lemma), and proving that the set of $T$'s on which the tester rejects has edge expansion at most $1-1/q$.

  2. Studying the structure of sets in this graph whose expansion is at most $1-1/q$ and proving that they must have some strong local structure.

  3. Using said local structure towards a correction argument, proving that if the sparse flat tester rejects with small probability, then $f$ is close to a Reed-Muller codeword.

1.6.4 The Shadow Lemma

The shadow lemma is a result asserting a relation between the rejection probability of the $(s+t+1)$-dimensional tester and that of the $(s+t)$-dimensional tester.

The Shadow Lemma for Flat Testers.

In the context of flat testers, the lemma asserts that the fraction of flats of dimension $s+t+1$ on which the tester rejects is at most $q$ times the fraction of $(s+t)$-flats on which the tester rejects. The name of the result stems from the fact that, letting $S$ be the set of $(s+t)$-flats on which the tester rejects, the set of $(s+t+1)$-flats on which the tester rejects is exactly the upper shadow of $S$:

$$S^{\uparrow}=\{A\subseteq\mathbb{F}_{q}^{n}~|~{\sf dim}(A)=s+t+1,\ \exists B\in S,\ B\subseteq A\}.$$

This means that on average, an element $A\in S^{\uparrow}$ has at least a $1/q$ fraction of its subsets $B\subseteq A$ in $S$, and thinking of an edge in the affine Grassmann graph as an up-down step, we get that the probability that a random step from $S$ goes to a vertex outside $S$ is at most $1-1/q$. That is, the edge expansion of $S$, defined as

$$\Phi(S)=\frac{|\{e=(A,A^{\prime})\in E~|~A\in S,\ A^{\prime}\notin S\}|}{|\{e=(A,A^{\prime})\in E~|~A\in S\}|},$$

is at most $1-\frac{1}{q}$.

The Shadow Lemma for Sparse Flat Testers.

In the context of the sparse flat tester, we wish to argue that something along the same lines holds. Towards this end, fixing the function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, we define

$$S=\{T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}~|~\text{the sparse flat tester rejects when choosing }T\}.$$

Due to the asymmetry between the up and down steps, there is no clear analog of the upper shadow of $S$. However, it is still true that if $T\in S$ and we append to $T$ some vector $u\in\mathbb{F}_{q}^{n}$ to form an affine map $R\colon\mathbb{F}_{q}^{s+t+1}\to\mathbb{F}_{q}^{n}$, then the sparse $(s+t+1)$ tester rejects $R$, and it makes sense to define

$$S^{\uparrow}=\{R\colon\mathbb{F}_{q}^{s+t+1}\to\mathbb{F}_{q}^{n}~|~\text{the sparse flat tester rejects when choosing }R\}.$$

We prove that starting from $R\in S^{\uparrow}$ and performing a down step to a $T^{\prime}\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}$, with probability at least $1/q$ the sparse flat tester still rejects, and hence $T^{\prime}\in S$. In particular, we conclude that $\mu(S^{\uparrow})\leqslant q\mu(S)$ (where $\mu(S)$ denotes the ratio between the size of $S$ and the total number of affine maps from $\mathbb{F}_{q}^{s+t}$ to $\mathbb{F}_{q}^{n}$, and $\mu(S^{\uparrow})$ is defined similarly). Using the same logic as before we conclude that $\Phi(S)\leqslant 1-\frac{1}{q}$, where here we measure edge expansion with respect to the bi-linear scheme graph.
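Spelling out this last step (a short calculation in the spirit of the argument above, not a verbatim excerpt from the paper): writing an edge of the bi-linear scheme graph as an up-down step $T\to R\to T^{\prime}$,

$$1-\Phi(S)\;=\;\Pr_{\substack{T\in S\\ T\to R\to T^{\prime}}}\bigl[T^{\prime}\in S\bigr]\;\geqslant\;\min_{R\in S^{\uparrow}}\ \Pr_{R\to T^{\prime}}\bigl[T^{\prime}\in S\bigr]\;\geqslant\;\frac{1}{q},$$

where the first inequality uses the fact that every up-neighbor $R$ of a transformation $T\in S$ lies in $S^{\uparrow}$, and the second inequality is the down-step property above; rearranging gives $\Phi(S)\leqslant 1-\frac{1}{q}$.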

1.6.5 Expansion on the Bi-linear Scheme Graph

Equipped with the understanding that $S$ is a small set (as we assume the sparse flat tester rejects with small probability) and that $\Phi(S)\leqslant 1-\frac{1}{q}$, the next question is what sort of structure this implies. In the bi-linear scheme graph there are natural examples of such sets, which are analogs of the zoom-in/zoom-out sets in the context of subspaces. Roughly speaking, there is one type of example which intuitively should be relevant for us, namely zoom-in sets:

$$\mathcal{C}_{x^{\star},y^{\star}}=\{T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}~|~T(x^{\star})=y^{\star}\}.$$

There are other examples of small sets which have poor expansion in the bi-linear scheme graph, however these seem “irrelevant” in the present context. Indeed, after showing that our small set $S$ cannot be correlated with any of these other examples (a notion which is often referred to as pseudo-randomness with respect to zoom-outs and zoom-ins on the linear part), we use the theory of global hypercontractivity to prove that there must be $x^{\star}$ and $y^{\star}$ such that $\mu(S\cap\mathcal{C}_{x^{\star},y^{\star}})\geqslant(1-o(1))\mu(\mathcal{C}_{x^{\star},y^{\star}})$. In words, the sparse $(s+t)$-flat tester almost always rejects inside $\mathcal{C}_{x^{\star},y^{\star}}$.

We remark that the proof that $S$ has no “correlation” with any other non-expanding set in the bi-linear scheme is rather tricky, and much of the effort in the current paper is devoted to it. To do so, we have to build upon ideas from [31] as well as use a new relation between the sparse $(s+t)$-flat tester applied to a function $f$ and the $t$-flat tester applied to a related function $\tilde{f}$. As the construction of this related function is somewhat technical, we defer a detailed discussion of it to Section 3.5.

1.6.6 Finishing the Proof via Local Correction

Intuitively, the only way that $S$ could be dense inside some $\mathcal{C}_{x^{\star},y^{\star}}$ is if we started with a Reed-Muller codeword $g$ and perturbed it on a small number of inputs, of which $y^{\star}$ is one. Indeed, in this case we would have that $\langle g\circ T,H_{e}\rangle=0$ for all $T\in\mathcal{C}_{x^{\star},y^{\star}}$ and all exponent vectors $e$ checked by the tester prior to the perturbation, and after changing the value of $g$ at $y^{\star}$, we would also change the value of $g\circ T$ at $x^{\star}$, breaking our previous equality. This intuition suggests that $y^{\star}$ is a point at which we should fix the value of $f$ to get closer to a Reed-Muller codeword, and we show that this is indeed the case.

There are several difficulties that arise when inspecting this argument more deeply. If $H_{e}(x^{\star})=0$, then the above reasoning breaks (as the value of $f$ at $y^{\star}$ is multiplied by $0$); however, in this case it does not make sense that $S$ could be very dense inside $\mathcal{C}_{x^{\star},y^{\star}}$ (as essentially the only special input of $f$ included in the inner product is $f(y^{\star})$, and in any case it is multiplied by $H_{e}(x^{\star})=0$). Indeed, in our argument we show that if $S$ is very dense in $\mathcal{C}_{x^{\star},y^{\star}}$, then it must be the case that $H_{e}(x^{\star})\neq 0$. Moreover, we show that the density of $S$ inside $\mathcal{C}_{x^{\star},y^{\star}}$ and inside $\mathcal{C}_{{x^{\star}}^{\prime},y^{\star}}$ is roughly the same for all $x^{\star}$ and ${x^{\star}}^{\prime}$ in the support of $H_{e}$. This last step is crucial for the analysis to go through and requires us to adapt and strengthen techniques from [31]. At the end of this step, roughly speaking, we conclude that the tester rejects with probability close to $1$ whenever it queries the value of $f$ at $y^{\star}$.

The last step in the argument is to show that we can change the value of $f$ at $y^{\star}$ and decrease the rejection probability of the tester by an additive factor of $\Theta(q^{s+t-n})$ (which is proportional to the probability that the tester looks at $y^{\star}$). We do so by a reduction to the same problem for the standard flat tester (which was solved in [31]). The idea is to look at a somewhat larger tester of dimension $s+t+100$, fix the first $s$ coordinates and let the rest vary, so that the tester becomes a local version of the standard flat tester.

Given that, and iterating the argument, we eventually reach a function $f^{\prime}$ that differs from $f$ on at most $\Theta(\varepsilon/q^{s+t-n})$ inputs (where $\varepsilon$ is the original rejection probability) and passes the test with probability $1$. Hence, $f^{\prime}$ is a Reed-Muller codeword, and so $f$ is $O(\varepsilon/Q)$-close to a Reed-Muller codeword.

2 Preliminaries

Notations.

For an integer $n$ we denote $[n]=\{1,\ldots,n\}$. For a prime power $q$ we let $\mathbb{F}_{q}$ be the field of size $q$, and we denote by $\mathbb{F}_{q}^{*}\subseteq\mathbb{F}_{q}$ the set of non-zero elements in it.

2.1 Basic Facts about Fields

Throughout, abusing notation, we define the Reed-Muller code $\operatorname{RM}[n,q,d]$ as the set of functions over $\mathbb{F}_{q}^{n}$ that can be written as polynomials of degree at most $d$. Henceforth fix $d$ and write

$$d+1=\ell\cdot p(q-q/p)+r=s(q-q/p)+r,$$

where we set $s=\ell\cdot p$, and $0<r\leqslant p(q-q/p)$.

We will need a few basic facts. First, it is a well known fact that $\mathbb{F}_{q}^{*}$ has a multiplicative generator, which we often denote by $\gamma$. The next lemma uses the existence of a multiplicative generator to evaluate sums over $\mathbb{F}_{q}$.

Lemma 2.1.

For any prime power $q$ and integer $i\in\{0,\ldots,q-1\}$,

$$\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{i}=\begin{cases}-1,&\text{if }i=q-1,\\ 0,&\text{otherwise}.\end{cases}$$

Proof.

If $i=q-1$, then $\alpha^{i}=1$ for all $\alpha\neq 0$, while $0^{i}=0$. Therefore, the sum equals $1$ added to itself $q-1$ times, which is $-1$ in $\mathbb{F}_{q}$. For $i\in\{1,\ldots,q-2\}$, recall that $\mathbb{F}_{q}^{*}$ has a generator $\gamma$. That is, $\mathbb{F}_{q}^{*}=\{1,\gamma,\ldots,\gamma^{q-2}\}$. Since $\gamma^{i}\neq 1$, we may write

$$\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{i}=\sum_{j=0}^{q-2}(\gamma^{i})^{j}=\frac{(\gamma^{i})^{q-1}-1}{\gamma^{i}-1}=\frac{1-1}{\gamma^{i}-1}=0.$$

On the other hand, if $i=0$ then the sum on the left hand side of the lemma is equal to $\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{0}=\sum_{\alpha\in\mathbb{F}_{q}}1=q=0$. ∎
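As a quick numerical sanity check of Lemma 2.1 (an illustration only; the prime field $q=11$ is chosen here simply so that integer arithmetic mod $q$ gives field arithmetic, while the lemma itself holds for all prime powers):

    # Verify Lemma 2.1 over the prime field F_q: sum over alpha of alpha^i equals -1
    # when i = q-1 and 0 for every other i in {0, ..., q-1} (computed mod q).
    q = 11
    for i in range(q):
        total = sum(pow(a, i, q) for a in range(q)) % q
        expected = (q - 1) if i == q - 1 else 0    # q - 1 represents -1 in F_q
        assert total == expected
    print("Lemma 2.1 verified for q =", q)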

2.2 Functions over Fields

Given two functions $f,g\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, we measure the distance between them in terms of the normalized Hamming distance, that is,

$$\delta(f,g)=\Pr_{x\in\mathbb{F}_{q}^{n}}[f(x)\neq g(x)].$$

The distance of a function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ from the Reed-Muller code $\operatorname{RM}[n,q,d]$, denoted by $\delta_{d}(f)$, is defined as the minimal distance between $f$ and some function $g\in\operatorname{RM}[n,q,d]$. That is,

$$\delta_{d}(f)=\min_{g\in\operatorname{RM}[n,q,d]}\delta(f,g)=\min_{g\in\operatorname{RM}[n,q,d]}\Pr_{x\in\mathbb{F}_{q}^{n}}[f(x)\neq g(x)].$$

For two functions $f,g\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, define their inner product as

$$\langle f,g\rangle=\sum_{v\in\mathbb{F}_{q}^{n}}f(v)g(v).$$

It is clear that this inner product is bi-linear. Monomials are the basic building blocks of all polynomials over $\mathbb{F}_{q}$, and the following notation will be convenient for us to use:

Definition 1.

For an exponent vector $e\in\{0,\ldots,q-1\}^{n}$, we define the monomial $x^{e}=\prod_{i=1}^{n}x_{i}^{e_{i}}$.

The next lemma allows us to compute inner products between monomials:

Lemma 2.2.

For $e,e^{\prime}\in\{0,\ldots,q-1\}^{n}$, we have that

$$\langle x^{e},x^{e^{\prime}}\rangle=\begin{cases}(-1)^{n},&\text{if }e+e^{\prime}=(q-1,\ldots,q-1),\\ 0,&\text{otherwise}.\end{cases}$$

Proof.

Let $e^{\prime\prime}=e+e^{\prime}$. By definition,

$$\langle x^{e},x^{e^{\prime}}\rangle=\sum_{(\alpha_{1},\ldots,\alpha_{n})\in\mathbb{F}_{q}^{n}}\prod_{i=1}^{n}\alpha_{i}^{e^{\prime\prime}_{i}}=\prod_{i=1}^{n}\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{e^{\prime\prime}_{i}}.$$

The result follows from Lemma 2.1. ∎

2.3 Affine Linear Transformations and the Affine Bi-linear Scheme

In this section we present the Affine Bi-linear scheme, which plays a vital role in our arguments.

Definition 2.

We denote by $\mathcal{T}_{n,\ell}$ the set of affine transformations $T:\mathbb{F}_{q}^{\ell}\to\mathbb{F}_{q}^{n}$.

Each affine transformation $T\in\mathcal{T}_{n,\ell}$ consists of a linear part $M\in\mathbb{F}_{q}^{n\times\ell}$ and a translation $c\in\mathbb{F}_{q}^{n}$, and we will use the writing convention $T=(M,c)$ to refer to the fact that $T$ is the affine transformation such that $Tx=Mx+c$ for $x\in\mathbb{F}_{q}^{\ell}$. In words, $M$ is the linear part of $T$ and $c$ is the affine shift. We stress that an affine transformation $T\in\mathcal{T}_{n,\ell}$ is not necessarily of full rank.

Definition 3.

The density of a set $S\subseteq\mathcal{T}_{n,\ell}$ is defined as $\mu_{\ell}(S)=\frac{|S|}{|\mathcal{T}_{n,\ell}|}$.

Oftentimes, we drop the subscript $\ell$ when it is clear from the context. Our analysis will require us to view affine transformations as vertices of a suitable test graph. For example, for the (standard) $t$-flat test, this graph is the affine Grassmann graph, ${\sf AffGras}(n,t)$. The vertices of the graph are all of the $t$-flats in $\mathbb{F}_{q}^{n}$, and two vertices $U_{1}$ and $U_{2}$ are adjacent if they intersect in a $(t-1)$-flat, that is, $\dim(U_{1}\cap U_{2})=t-1$. Below we define an analogous graph structure on affine transformations, which we refer to as the affine bi-linear scheme graph.

Definition 4.

The affine bi-linear scheme graph, ${\sf AffBilin}(n,\ell)$, has vertex set $\mathcal{T}_{n,\ell}$. Two vertices $T_{1},T_{2}\in\mathcal{T}_{n,\ell}$ are adjacent if and only if they are equal on an $(\ell-1)$-dimensional affine subspace. This condition can also be written as $\dim(\ker(T_{1}-T_{2}))\geqslant\ell-1$.

We write $T_{1}\sim T_{2}$ to denote adjacency. We remark that the affine Grassmann graph can be obtained from the affine bi-linear scheme by identifying each $T_{1}\in\mathcal{T}_{n,\ell}$ with its image (that is, closing the set $\mathcal{T}_{n,\ell}$ under a suitable group operation); however, as explained in the introduction, this distinction will be crucial for us.

2.4 Expansion and pseudo-randomness in the Affine Bi-linear Scheme

We will use the standard notion of edge expansion, defined as follows.

Definition 5.

For a regular graph $G=(V,E)$ and a set of vertices $S\subseteq V$, we define the edge expansion of $S$ as

$$\Phi(S)=\Pr_{\substack{u\in S\\ (u,v)\in E}}[v\notin S].$$

In words, the edge expansion of $S$ is the probability of escaping it in a single step of the random walk on $G$.

We will use $\Phi_{n,\ell}$ to denote edge expansion on ${\sf AffBilin}(n,\ell)$. When $n$ and $\ell$ are clear from context we drop the subscripts. Later on in the paper, we will also consider edge expansion over the affine Grassmann graph ${\sf AffGras}(n,\ell)$; abusing notation, we will denote edge expansion there also by $\Phi_{n,\ell}$, and it will be clear from context which graph we are considering.

2.4.1 The Up-Down View of the Random Walk

It will be helpful for us to consider the following equivalent, two-step process for sampling a neighbor of $T_{1}=(M_{1},c_{1})\in\mathcal{T}_{n,\ell}$ in the affine bi-linear scheme graph:

  1. Go Up: Choose a random $w\neq 0$. Let $M^{\prime}$ be the matrix obtained by appending $w$ as a new column to $M_{1}$, and let $T^{\prime}=(M^{\prime},c_{1})\in\mathcal{T}_{n,\ell+1}$.

  2. Go Down: Choose a random $R=(A,b)\in\mathcal{T}_{\ell+1,\ell}$, where the first $\ell$ rows of $A$ form the identity matrix, the first $\ell$ entries of $b$ are $0$, and at least one entry among the last row of $A$ and the last entry of $b$ is nonzero.

  3. Output $T_{2}=T^{\prime}\circ R$.

It is easy to see that for $T_{2}=(M_{2},c_{2})$, each column of $M_{2}$ is equal to the corresponding column of $M_{1}$ plus some multiple of $w$, and likewise $c_{2}$ equals $c_{1}$ plus a multiple of $w$.
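The following small script illustrates this up-down sampling (an illustration only, with toy dimensions chosen here; the field is taken to be a prime field so that plain mod-$q$ arithmetic applies). It also checks the set on which $T_{1}$ and the sampled $T_{2}$ agree.

    # Up-down sampling of a neighbor of T1 = (M1, c1) in AffBilin(n, l), over F_q (q prime).
    import itertools, random

    q, n, l = 5, 4, 2            # toy parameters (assumption, for illustration only)

    def rand_vec(m):
        return [random.randrange(q) for _ in range(m)]

    M1 = [rand_vec(l) for _ in range(n)]      # rows of the linear part
    c1 = rand_vec(n)

    def apply(M, c, x):
        return tuple((sum(M[i][j] * x[j] for j in range(len(x))) + c[i]) % q
                     for i in range(len(c)))

    # Go up: append a random non-zero column w to M1.
    w = rand_vec(n)
    while all(wi == 0 for wi in w):
        w = rand_vec(n)

    # Go down: composing with R = (A, b) as described above amounts to adding b_j * w
    # to column j of M1 and b_{l+1} * w to c1, with (b_1, ..., b_{l+1}) not all zero.
    b = rand_vec(l + 1)
    while all(bi == 0 for bi in b):
        b = rand_vec(l + 1)
    M2 = [[(M1[i][j] + b[j] * w[i]) % q for j in range(l)] for i in range(n)]
    c2 = [(c1[i] + b[l] * w[i]) % q for i in range(n)]

    # Since T2(x) - T1(x) = (b_1 x_1 + ... + b_l x_l + b_{l+1}) * w and w != 0, the two
    # maps agree exactly on {x : b_1 x_1 + ... + b_l x_l + b_{l+1} = 0}; when b_1, ..., b_l
    # are not all zero this is a co-dimension 1 affine subspace, matching Definition 4.
    agree = [x for x in itertools.product(range(q), repeat=l)
             if apply(M1, c1, x) == apply(M2, c2, x)]
    constraint = [x for x in itertools.product(range(q), repeat=l)
                  if (sum(b[j] * x[j] for j in range(l)) + b[l]) % q == 0]
    print(len(agree), len(constraint), sorted(agree) == sorted(constraint))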

2.4.2 Non-expanding Sets in the Affine Bi-linear Scheme

As explained in the introduction, our proof considers the set $\mathcal{S}$ of affine transformations on which the test rejects as a set of vertices in the affine bi-linear scheme graph. We prove that $\mathcal{S}$ has small edge expansion in that graph, and then use global hypercontractivity in order to derive a conclusion about the structure of $\mathcal{S}$, which is in turn helpful towards a local correction argument.

To facilitate this argument, we must first understand the structure of a few canonical examples of non-expanding sets in ${\sf AffBilin}(n,\ell)$. In the affine Grassmann graph one has the following examples:

  • zoom-ins: $\mathcal{D}_{a}=\{U\in{\sf AffGras}(n,\ell)\;|\;a\in U\}$, for $a\in\mathbb{F}_{q}^{n}$.

  • zoom-outs: $\mathcal{D}_{W}=\{U\in{\sf AffGras}(n,\ell)\;|\;U\subseteq W\}$, for a hyperplane $W\subseteq\mathbb{F}_{q}^{n}$.

  • zoom-ins on the linear part: $\mathcal{D}_{a,\operatorname{lin}}=\{U=z+V\in{\sf AffGras}(n,\ell)\;|\;a\in V\}$, for $a\in\mathbb{F}_{q}^{n}$.

  • zoom-outs on the linear part: $\mathcal{D}_{W,\operatorname{lin}}=\{U=z+V\in{\sf AffGras}(n,\ell)\;|\;V\subseteq W\}$, for a hyperplane $W\subseteq\mathbb{F}_{q}^{n}$.

It is not hard to see that each example has expansion at most $1-1/q$; also, it is an easy calculation to show that the density of each one of these sets is small (and vanishes provided that $\ell$ is significantly smaller than $n$ and both go to infinity). Each of these examples also has a counterpart in ${\sf AffBilin}(n,\ell)$:

  • zoom-ins: $\mathcal{C}_{a,b}=\{T\in\mathcal{T}_{n,\ell}\;:\;T(a)=b\}$, for $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$.

  • zoom-outs: $\mathcal{C}_{a^{T},b,\beta}=\{(M,c)\in\mathcal{T}_{n,\ell}\;:\;a^{T}\cdot M=b,\;a^{T}\cdot c=\beta\}$, for $a\in\mathbb{F}_{q}^{n}$, $b\in\mathbb{F}_{q}^{\ell}$, and $\beta\in\mathbb{F}_{q}$.

  • zoom-ins on the linear part: $\mathcal{C}_{a,b,\operatorname{lin}}=\{(M,c)\in\mathcal{T}_{n,\ell}\;:\;M\cdot a=b\}$, for $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$.

  • zoom-outs on the linear part: $\mathcal{C}_{a^{T},b,\operatorname{lin}}=\{(M,c)\in\mathcal{T}_{n,\ell}\;:\;a^{T}\cdot M=b\}$, for $a\in\mathbb{F}_{q}^{n}$ and $b\in\mathbb{F}_{q}^{\ell}$.

Likewise, one can check that each example here also has expansion at most $1-1/q$ in ${\sf AffBilin}(n,\ell)$. Our argument will require us to show that, in a sense, these examples exhaust all small sets of vertices in ${\sf AffBilin}(n,\ell)$ whose edge expansion is at most $1-1/q$. To formalize this, we first define the notion of pseudo-randomness. Intuitively, we say that a set $S$ is pseudo-random with respect to some example, say zoom-ins for concreteness, if $S$ may only have little correlation with the $\mathcal{C}_{a,b}$'s, in the sense that the density of $S$ inside each such set is still small. More precisely:

Definition 6.

Let $\mathcal{S}\subseteq\mathcal{T}_{n,\ell}$ and let $\xi\in[0,1]$.

  1. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-ins if for each $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$ we have that

    $$\mu(\mathcal{S}_{a,b}):=\Pr_{T\in\mathcal{C}_{a,b}}[T\in\mathcal{S}]\leqslant\xi.$$

  2. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-outs if for each $a\in\mathbb{F}_{q}^{n}$, $b\in\mathbb{F}_{q}^{\ell}$, and $\beta\in\mathbb{F}_{q}$ we have that

    $$\mu(\mathcal{S}_{a^{T},b,\beta}):=\Pr_{T\in\mathcal{C}_{a^{T},b,\beta}}[T\in\mathcal{S}]\leqslant\xi.$$

  3. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-ins on the linear part if for each $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$ we have that

    $$\mu(\mathcal{S}_{a,b,\operatorname{lin}}):=\Pr_{T\in\mathcal{C}_{a,b,\operatorname{lin}}}[T\in\mathcal{S}]\leqslant\xi.$$

  4. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-outs on the linear part if for each $a\in\mathbb{F}_{q}^{n}$ and $b\in\mathbb{F}_{q}^{\ell}$ we have that

    $$\mu(\mathcal{S}_{a^{T},b,\operatorname{lin}}):=\Pr_{T\in\mathcal{C}_{a^{T},b,\operatorname{lin}}}[T\in\mathcal{S}]\leqslant\xi.$$

Typically, global hypercontractivity results say that a small set with expansion bounded away from $1$ cannot be pseudo-random (where the notion of pseudo-randomness is in the same spirit but a bit more complicated; see [13, 29] for example). For results in coding theory, however, it seems that a somewhat different (yet very much related) type of result is needed [31]. Intuitively, in this type of result the assumption on the expansion of the set $\mathcal{S}$ is much stronger, and roughly speaking asserts that the expansion of $\mathcal{S}$ is almost “as small as it could be”. In exchange for such a strong assumption, one obtains a stronger conclusion regarding the non-pseudo-randomness of the set $\mathcal{S}$. For instance, in [31] it is shown that a set of vertices $\mathcal{S}$ with expansion at most $1-1/q$ which is highly pseudo-random with respect to $3$ of the above types of examples must be highly non-pseudo-random with respect to the fourth type, in the sense that it almost contains a copy of such a set. For our purposes we require an analogous statement for ${\sf AffBilin}(n,\ell)$, which is the following:

Theorem 2.1.

If 𝒮𝒯n,\mathcal{S}\subseteq\mathcal{T}_{n,\ell} satisfies

  1. 1.

    μ(𝒮)ξ\mu(\mathcal{S})\leqslant\xi,

  2. 2.

    𝒮\mathcal{S} is ξ\xi-pseudorandom with respect to zoom-outs, zoom-outs on the linear part, and zoom-ins on the linear part,

  3. 3.

    1-\Phi(\mathcal{S})\geqslant\frac{1}{q}.

Then there exist a𝔽qa\in\mathbb{F}_{q}^{\ell} and b𝔽qnb\in\mathbb{F}_{q}^{n} such that μ(𝒮a,b)11(q1)2q3q1(4q+867ξ1/4)\mu(\mathcal{S}_{a,b})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{q^{3}}{q-1}\left(4q^{-\ell}+867\xi^{1/4}\right), where μ(𝒮a,b)\mu(\mathcal{S}_{a,b}) is the density of 𝒮\mathcal{S} in 𝒞a,b\mathcal{C}_{a,b}.

Proof.

The proof uses a reduction to an analogous result over the affine Grassmann graph using ideas from [9], and is deferred to Section A. ∎

3 Local Testers for the Reed Muller Code

We now formally describe both the sparse flat test and the full flat test. Although we focus on the sparse test, at times it will be convenient to reduce to the full test to aid our analysis.

3.1 The tt-flat Tester and the Inner Product View

At a high level, both the tt-flat test and the sparse (s+t)(s+t)-flat test can be described in the following way:

  1. 1.

    Restriction: Choose a random T𝒯n,tT\in\mathcal{T}_{n,t} for some suitable dimension tt.

  2. 2.

    Local test on restriction: Check if the tt-variate function fTf\circ T is indeed degree dd. If so, accept, and otherwise reject.

The difference between the two tests lies in how the “check” in step 2 is done. The straightforward way to perform this check is by querying fTf\circ T on all points in 𝔽qt\mathbb{F}_{q}^{t}, interpolating fTf\circ T and checking its degree. Indeed, this is precisely how the tt-flat test is defined. In that case, it is easy to see that the result of the test depends only on the image of TT; this is because fTAf\circ T\circ A and fTf\circ T have the same degree for any full rank A𝒯t,tA\in\mathcal{T}_{t,t}. Therefore, the tt-flat test can be rephrased in its familiar form as follows:

  1. 1.

    Choose a random tt-flat U𝔽qnU\subseteq\mathbb{F}_{q}^{n}.

  2. 2.

    Accept if and only if deg(f|U)d\deg(f|_{U})\leqslant d.

However, there are other ways of trying to test whether fTf\circ T has degree at most dd or not. One way to do so is by taking inner products that check whether certain high-degree monomials exist in ff using Lemma 2.2. A simple way to do this is as follows:

Accept if and only if \langle f\circ T,x^{e}\rangle=0 for all e\in\{0,\ldots,q-1\}^{t} such that \sum_{i=1}^{t}(q-1-e_{i})>d.

By Lemma 2.2, it is not hard to see that this condition is equivalent to \deg(f\circ T)\leqslant d. Furthermore, calculating all of these inner products requires querying f\circ T at every point lying in the support of some x^{e} that is used – which is all of \mathbb{F}_{q}^{t} in this case. Hence, taking inner products with all of these monomials does not lead to any savings in terms of the query complexity.
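To make the inner-product view concrete, the following is a minimal Python sketch of this check, under the simplifying assumption that q is prime (so that field arithmetic is just integer arithmetic modulo q); the parameters q, t, d and the two example functions below are illustrative choices for this sketch and are not taken from the paper. It computes the inner products \langle g,x^{e}\rangle by brute force over all of \mathbb{F}_{q}^{t} and accepts exactly when every exponent vector e with \sum_{i}(q-1-e_{i})>d yields inner product 0.

from itertools import product

q, t, d = 5, 2, 3   # illustrative parameters, assumed for this sketch only

def inner(g, e):
    # <g, x^e> = sum over all points x in F_q^t of g(x) * x^e, computed mod q
    total = 0
    for x in product(range(q), repeat=t):
        m = 1
        for xi, ei in zip(x, e):
            m = (m * pow(xi, ei, q)) % q
        total = (total + g(x) * m) % q
    return total

def accepts(g):
    # accept iff <g, x^e> = 0 for every e with sum_i (q - 1 - e_i) > d
    for e in product(range(q), repeat=t):
        if sum(q - 1 - ei for ei in e) > d and inner(g, e) != 0:
            return False
    return True

g_low = lambda x: (x[0] ** 2 * x[1] + 3 * x[0]) % q    # degree 3 <= d
g_high = lambda x: (x[0] ** 3 * x[1] + x[1]) % q       # degree 4 > d
print(accepts(g_low), accepts(g_high))                 # expected: True False

On these two examples the degree-3 function is accepted and the degree-4 function is rejected, matching the equivalence with \deg\leqslant d.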

It turns out that there are more clever choices of “test polynomials” (which are not just monomials) that allow one to design an inner-product test of the above form which is more query efficient. This idea was used by [33], who introduced the sparse (s+t)-flat test, which we present formally in the next subsection.

3.2 The sparse (s+t)(s+t)-flat test

Our presentation of the sparse (s+t)-flat test is somewhat different from the one in [33], and this view will be necessary for our analysis. Recall that we write d+1=s\cdot(q-q/p)+r, where s is divisible by p and 0<r\leqslant p(q-q/p); for instance, for p=2, q=16 and d=100 this gives s=12 and r=5, since 101=12\cdot 8+5. For a vector x=(x_{1},\ldots,x_{p}) and a set I\subseteq\{1,\ldots,p\} denote x_{I}=\sum\limits_{i\in I}x_{i}. Define the p-variate polynomial

P(x1,,xp)=I[p1](1)|I|+1(xI+xp)q1x1xp1.\displaystyle P(x_{1},\ldots,x_{p})=\frac{\sum_{I\subseteq[p-1]}(-1)^{|I|+1}(x_{I}+x_{p})^{q-1}}{x_{1}\cdots x_{p-1}}.

For any degree sequence e=(e_{1},\ldots,e_{t})\in\{0,\ldots,q-1\}^{t} define

Me(x1,,xt)=i=1txiei,He(x)=(i=1s/pP(xp(i1)+1,,xpi))Me(xs+1,,xs+t).M_{e}(x_{1},\ldots,x_{t})=\prod_{i=1}^{t}x_{i}^{e_{i}},\qquad\qquad H_{e}(x)=\left(\prod_{i=1}^{s/p}P(x_{p(i-1)+1},\ldots,x_{pi})\right)M_{e}(x_{s+1},\ldots,x_{s+t}).

The sparse (s+t)(s+t)-flat tester works as follows:

  1. 1.

    Choose a random affine transformation T:𝔽qn𝔽qs+tT:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q}^{s+t}.

  2. 2.

    Accept if and only if \langle f\circ T,H_{e}\rangle=0 for all e\in\{0,\ldots,q-1\}^{t} such that \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r.

Recall that we set the parameter t to be t=\max(p+2,10). Essentially, t just needs to be large enough so that the degree sequences in step 2 can account for monomials of degree up to r+q-1.

Throughout, we refer to a degree sequence e satisfying the inequality in step 2 as valid. As d is fixed throughout the paper, the notion of a valid degree sequence does not change either. Hence, we say that T rejects f, or that f\circ T is rejected, if \langle f\circ T,H_{e}\rangle\neq 0 for some valid e. Otherwise we say that T accepts f or that f\circ T is accepted. We define \mathcal{S}_{t} to be the set of T’s on which the tester rejects:

𝒮t={T𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,s+t)|fT,He0 for some e{0,,q1}t such that i=1teit(q1)r},\mathcal{S}_{t}=\left\{T\in{\sf AffBilin}(n,s+t)~{}|~{}\langle{f\circ T},{H_{e}}\rangle\neq 0\text{ for some $e\in\{0,\ldots,q-1\}^{t}$ such that $\sum\limits_{i=1}^{t}e_{i}\leqslant t(q-1)-r$}\right\}, (1)

and let \operatorname{rej}_{s+t}(f) be the probability that the test rejects. Clearly, we have that \operatorname{rej}_{s+t}(f)=\mu_{s+t}(\mathcal{S}_{t}).

3.3 Assuming nn is Sufficiently Large

In this section we argue that without loss of generality we may assume that n\geqslant s+t+100. Indeed, we may view an n-variate function f as a function of N variables for some sufficiently large N=O(n). Call this function g:\mathbb{F}_{q}^{N}\xrightarrow[]{}\mathbb{F}_{q} and define it by g(a,b)=f(a) for any a\in\mathbb{F}_{q}^{n}, b\in\mathbb{F}_{q}^{N-n}. We show that \delta(f,\operatorname{RM}[n,q,d])=\delta(g,\operatorname{RM}[N,q,d]), which implies that we can view the test as being performed on g but still have the \delta in Theorem 1.1 be \delta(f,\operatorname{RM}[n,q,d]).

Lemma 3.1.

Let ff and gg be as defined above. Then δ(f,RM[n,q,d])=δ(g,RM[N,q,d])\delta(f,\operatorname{RM}[n,q,d])=\delta(g,\operatorname{RM}[N,q,d]).

Proof.

We first show that \delta(f,\operatorname{RM}[n,q,d])\geqslant\delta(g,\operatorname{RM}[N,q,d]). Suppose h:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q} is the degree d function such that \delta(f,h)=\delta(f,\operatorname{RM}[n,q,d]). Define h^{\prime} as the extension of h to N variables (in the same way as g is defined). Let h^{\prime}(\cdot,b) denote the n-variate function obtained from h^{\prime} by setting the last N-n variables to b\in\mathbb{F}_{q}^{N-n}. Define g(\cdot,b) similarly. Note that for any b, h^{\prime}(\cdot,b)=h and g(\cdot,b)=f. We then have

δ(h,g)=𝔼b𝔽qNn[δ(h(,b),g(,b))]=𝔼b𝔽qNn[δ(h,f)]=δ(f,RM[n,q,d]).\delta(h^{\prime},g)=\mathop{\mathbb{E}}_{b\in\mathbb{F}_{q}^{N-n}}\left[\delta\left(h^{\prime}(\cdot,b),g(\cdot,b)\right)\right]=\mathop{\mathbb{E}}_{b\in\mathbb{F}_{q}^{N-n}}\left[\delta(h,f)\right]=\delta(f,\operatorname{RM}[n,q,d]).

Since h is degree d, so is h^{\prime}, thus the above implies that \delta(f,\operatorname{RM}[n,q,d])\geqslant\delta(g,\operatorname{RM}[N,q,d]).

For the other direction, suppose hh^{\prime} is the degree dd, NN-variate function such that δ(g,RM[N,q,d])=δ(g,h)\delta(g,\operatorname{RM}[N,q,d])=\delta(g,h^{\prime}). Keeping the same notation as above, we have that

𝔼b𝔽qNn[δ(g(,b),h(,b))]=δ(g,h)=δ(g,RM[N,q,d]).\mathop{\mathbb{E}}_{b\in\mathbb{F}_{q}^{N-n}}[\delta\left(g(\cdot,b),h^{\prime}(\cdot,b)\right)]=\delta(g,h^{\prime})=\delta(g,\operatorname{RM}[N,q,d]).

Hence, there is a choice of bb such that

δ(f,h(,b))=δ(g(,b),h(,b))=δ(g,RM[N,q,d]).\delta\left(f,h^{\prime}(\cdot,b)\right)=\delta\left(g(\cdot,b),h^{\prime}(\cdot,b)\right)=\delta(g,\operatorname{RM}[N,q,d]).

Since hh^{\prime} is degree dd, h(,b)h^{\prime}(\cdot,b) must be degree dd as well, so the above inequality implies that δ(f,RM[n,q,d])δ(g,RM[N,q,d])\delta(f,\operatorname{RM}[n,q,d])\leqslant\delta(g,\operatorname{RM}[N,q,d]), completing the proof. ∎

Using this lemma, we can always view the test as being performed on g instead of f, and show that the sparse (s+t)-flat test rejects with probability at least c(q)\min(1,Q\delta(g,\operatorname{RM}[N,q,d])), where Q is the number of queries and c(q) is some \frac{1}{\operatorname{poly}(q)} factor. Since none of these parameters depend on N, the lemma above implies that this rejection probability is the same as c(q)\min(1,Q\delta(f,\operatorname{RM}[n,q,d])).

Henceforth we assume that ns+t+100n\geqslant s+t+100. This assumption is helpful as it allows us to bound the number of non-full rank affine transformations from 𝔽qn𝔽qs+t\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q}^{s+t} as we argue in the following remark.

Remark 3.2.

The fraction of \mathcal{T}_{n,s+t} that is not full rank is at most \frac{1}{q^{100}(q-1)}. The same holds for the fraction of \mathcal{T}_{n,s+t} that is not full rank conditioned on Ta=b, for arbitrary a\in\mathbb{F}_{q}^{s+t}, b\in\mathbb{F}_{q}^{n}. To see this, note that for T=(M,c) we can upper bound the probability that the linear part M is not full rank by

\sum_{i=0}^{s+t-1}\frac{q^{i}}{q^{n}}=\frac{q^{s+t}-1}{(q-1)q^{n}}\leqslant\frac{1}{q^{100}(q-1)}.

For the second part, note that conditioning on Ta=bTa=b does not change this estimate, as we can still choose the linear part uniformly at random and set the affine shift so that Ta=bTa=b.
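As a quick numeric sanity check of this estimate, the following Python sketch (with tiny illustrative parameters, and q taken prime so that \mathbb{F}_{q} is the integers mod q) counts the n\times m matrices over \mathbb{F}_{q} that are not of full column rank, where m plays the role of s+t, and compares the exact fraction with the bound \frac{q^{m}-1}{(q-1)q^{n}}.

from itertools import product

q, n, m = 2, 4, 2   # illustrative sizes; m plays the role of s + t, q is prime

def rank_mod_q(rows):
    # Gaussian elimination over F_q (q prime) on an n x m matrix given by rows
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(m):
        piv = next((i for i in range(rank, n) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], q - 2, q)
        rows[rank] = [(v * inv) % q for v in rows[rank]]
        for i in range(n):
            if i != rank and rows[i][col]:
                c = rows[i][col]
                rows[i] = [(a - c * b) % q for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

not_full = sum(1 for M in product(product(range(q), repeat=m), repeat=n)
               if rank_mod_q(M) < m)
print(not_full / q ** (n * m), (q ** m - 1) / ((q - 1) * q ** n))
# prints 0.1796875 and 0.1875: the exact fraction is indeed below the bound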

3.4 Some Basic Facts of the Sparse Flat Tester

We now collect some basic facts regarding the sparse flat tester that will be necessary for our analysis.

3.4.1 Basic Soundness Properties

We begin by describing why this tester works. Our presentation herein will be partial, focusing on the most essential properties necessary for our analysis, and we refer the reader to Appendix B or [33] for more details.

Consider the monomials xex^{e^{\prime}}, for e{0,,q1}s+te^{\prime}\in\{0,\ldots,q-1\}^{s+t} such that xe,He(x)0\langle x^{e^{\prime}},H_{e}(x)\rangle\neq 0 for some e{0,,q1}te\in\{0,\ldots,q-1\}^{t} such that i=1teit(q1)r\sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r, and tp+2t\geqslant p+2. By Lemma 2.1, these monomials, xex^{e^{\prime}}, must satisfy:

  • ep(i1)+1++epi=p(qq/p)e^{\prime}_{p(i-1)+1}+\cdots+e^{\prime}_{pi}=p\cdot(q-q/p) for 1isp1\leqslant i\leqslant\frac{s}{p}

  • es+1++es+tre^{\prime}_{s+1}+\cdots+e^{\prime}_{s+t}\geqslant r.

More generally, we can explicitly express any inner product with HeH_{e} as follows.

Lemma 3.3.

For f(x)=a{0,,q1}nCaxaf(x)=\sum_{a\in\{0,\ldots,q-1\}^{n}}C_{a}x^{a}, and e{0,,q1}te\in\{0,\ldots,q-1\}^{t} we have that

f,He=eDeCe,\langle f,H_{e}\rangle=\sum_{e^{\prime}}D_{e^{\prime}}C_{e^{\prime}},

where DeD_{e^{\prime}} is a constant depending on the coefficients of HeH_{e}, and the sum is over ee^{\prime} satisfying both the first condition above and es+i+ei=q1e^{\prime}_{s+i}+e_{i}=q-1 for 1it1\leqslant i\leqslant t.

Proof.

We have

\langle x^{e^{\prime}},H_{e}\rangle=\left(\prod_{i=1}^{s/p}\sum_{\alpha_{1},\ldots,\alpha_{p}\in\mathbb{F}_{q}}\alpha_{1}^{e^{\prime}_{p(i-1)+1}}\cdots\alpha_{p}^{e^{\prime}_{pi}}P(\alpha_{1},\ldots,\alpha_{p})\right)\sum_{\beta_{1},\ldots,\beta_{t}\in\mathbb{F}_{q}}\beta_{1}^{e^{\prime}_{s+1}+e_{1}}\cdots\beta_{t}^{e^{\prime}_{s+t}+e_{t}}.

By Lemma 2.2, for each ii we have

α1,,αp𝔽qα1ep(i1)+1αpepiP(α1,,αp)0,\sum_{\alpha_{1},\ldots,\alpha_{p}\in\mathbb{F}_{q}}\alpha_{1}^{e^{\prime}_{p(i-1)+1}}\cdots\alpha_{p}^{e^{\prime}_{pi}}P(\alpha_{1},\ldots,\alpha_{p})\neq 0,

only if the monomial x1q1xpq1x_{1}^{q-1}\cdots x_{p}^{q-1} appears in x1ep(i1)+1xpepiP(x1,,xp)x_{1}^{e^{\prime}_{p(i-1)+1}}\cdots x_{p}^{e^{\prime}_{pi}}P(x_{1},\ldots,x_{p}). Since the degree of PP is qpq-p, this can only be the case if ep(i1)+1++epi=p(qq/p)e^{\prime}_{p(i-1)+1}+\cdots+e^{\prime}_{pi}=p\cdot(q-q/p). Likewise,

\sum_{\beta_{1},\ldots,\beta_{t}\in\mathbb{F}_{q}}\beta_{1}^{e^{\prime}_{s+1}+e_{1}}\cdots\beta_{t}^{e^{\prime}_{s+t}+e_{t}}\neq 0,

only if e^{\prime}_{s+i}+e_{i}=q-1 for each 1\leqslant i\leqslant t. ∎
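Both vanishing conditions in this proof ultimately rest on the standard character-sum fact that \sum_{\alpha\in\mathbb{F}_{q}}\alpha^{j} equals -1 when j>0 is divisible by q-1 and equals 0 otherwise. The following tiny Python check (a sketch assuming q prime, with an illustrative choice of q) verifies this by brute force.

q = 7   # illustrative prime, so F_q is the integers mod q
for j in range(3 * (q - 1) + 1):
    val = sum(pow(a, j, q) for a in range(q)) % q
    # the sum is q - 1 (that is, -1) exactly when j > 0 and (q - 1) divides j
    expected = (q - 1) if (j > 0 and j % (q - 1) == 0) else 0
    assert val == expected, (j, val)
print("character sums verified for all j up to", 3 * (q - 1))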

From Lemma 3.3 it is clear that T rejects f only if \deg(f\circ T)>d: if \deg(f\circ T)\leqslant d, then f\circ T contains no monomials satisfying the above conditions (any such monomial has total degree at least s(q-q/p)+r=d+1), and hence all of the relevant inner products vanish. Therefore, if f is indeed of degree at most d then it is accepted with probability 1. But is it necessarily the case that the sparse flat tester rejects a function f with positive probability if its degree exceeds d? This was shown to be true in [33] using canonical monomials. As we are going to need this fact, and so as to be self-contained, we include a proof in Appendix B.

Theorem 3.4.

A function ff passes the sparse (s+t)(s+t)-flat test with probability 11 if and only if ff has degree at most dd.

Proof.

Deferred to Appendix B. ∎

3.4.2 Sparsity

The next important feature of the sparse flat tester, as the name suggests, is that it has small query complexity. Note that the number of queries the test makes is proportional to the size of the support of the test polynomials H_{e}, and the next lemma shows that for any degree sequence e\in\{0,\ldots,q-1\}^{t} from step 2, H_{e}(x) has a sparse support. Moreover, this support is the same no matter which e is chosen. As a result, calculating \langle f\circ T,H_{e}\rangle does not require querying every one of the q^{s+t} points in the domain of f\circ T.

Lemma 3.5.

The support of PP has size at most (2p1+p1)qp1(2^{p-1}+p-1)q^{p-1} and is contained in the set

\left(\bigcup_{i=1}^{p-1}\{x_{i}=0\}\right)\cup\left(\bigcup_{I\subseteq[p-1]}\{x_{I}+x_{p}=0\}\right),

where \{x_{I}+x_{p}=0\} denotes the hyperplane defined by the equation x_{I}+x_{p}=0, and similarly for \{x_{i}=0\}.

Proof.

Suppose x is not in the set described. Then in the expression for P, the denominator is nonzero and each term (x_{I}+x_{p})^{q-1} equals 1 by Fermat’s Little Theorem, so the numerator becomes \sum_{I\subseteq[p-1]}(-1)^{|I|+1}=-(1-1)^{p-1}=0 and hence P(x)=0. ∎
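The following short Python check illustrates this argument for small assumed parameters (p=3 and a prime q, so that \mathbb{F}_{q} is the integers mod q). It evaluates P only at points where the denominator x_{1}\cdots x_{p-1} is nonzero — at the remaining points P would require polynomial division, but those points already lie in the first part of the claimed union — and verifies that every nonzero value occurs inside the union of hyperplanes from the lemma.

from itertools import product, combinations

p, q = 3, 7   # illustrative; q is taken prime here so that F_q is Z/qZ

def P_value(x):
    # value of P at a point with x_1 * ... * x_{p-1} != 0, by evaluating the
    # numerator and dividing in F_q (at other points P needs polynomial division)
    num = 0
    for k in range(p):
        for I in combinations(range(p - 1), k):
            xI = sum(x[i] for i in I)
            num += (-1) ** (k + 1) * pow((xI + x[-1]) % q, q - 1, q)
    den = 1
    for i in range(p - 1):
        den = (den * x[i]) % q
    return (num * pow(den, q - 2, q)) % q

def in_claimed_union(x):
    # union of the hyperplanes {x_i = 0} and {x_I + x_p = 0}, I a subset of [p-1]
    if any(x[i] == 0 for i in range(p - 1)):
        return True
    for k in range(p):
        for I in combinations(range(p - 1), k):
            if (sum(x[i] for i in I) + x[-1]) % q == 0:
                return True
    return False

for x in product(range(q), repeat=p):
    if all(x[i] != 0 for i in range(p - 1)) and P_value(x) != 0:
        assert in_claimed_union(x), x
print("every nonzero value of P lies inside the claimed union of hyperplanes")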

Lemma 3.6.

For any e{0,,q1}te\in\{0,\ldots,q-1\}^{t}, we have

\operatorname{supp}(H_{e})\subseteq\prod_{i=1}^{s/p}\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}.
Proof.

Since He(x)=(i=1s/pP(xp(i1)+1,,xpi))j=1txs+jejH_{e}(x)=\left(\prod_{i=1}^{s/p}P(x_{p(i-1)+1},\ldots,x_{pi})\right)\prod_{j=1}^{t}x_{s+j}^{e_{j}}, the result is evident. ∎

Henceforth, we let supp(H)=e{0,,q1}tsupp(He)\operatorname{supp}(H)=\bigcup_{e\in\{0,\ldots,q-1\}^{t}}\operatorname{supp}(H_{e}). For any TT, the sparse s+ts+t-flat test can be done by only querying fTf\circ T on points in supp(H)\operatorname{supp}(H), which gives the following upper bound on query complexity.

Lemma 3.7.

The sparse (s+t)-flat test has query complexity at most (3q)^{\frac{d+1}{q}}q^{t}. Since we can choose any t\geqslant p+2, the query complexity can be as low as (3q)^{\frac{d+1}{q}}q^{p+2}.

Proof.

By Lemma 3.6, we can bound the size of i=1s/psupp(P)×𝔽qt\prod_{i=1}^{s/p}\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}. By Lemma 3.5,

|supp(P)|(2p1+p1)qp1,|\operatorname{supp}(P)|\leqslant(2^{p-1}+p-1)q^{p-1},

so

|i=1s/psupp(P)×𝔽qt|((2p1+p1)qp1)s/pqt(3q)s(p1)/pqt,\left|\prod_{i=1}^{s/p}\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}\right|\leqslant((2^{p-1}+p-1)q^{p-1})^{s/p}q^{t}\leqslant(3q)^{s(p-1)/p}q^{t},

where we use the fact that 2^{p-1}+p-1\leqslant 3^{p-1}. Moreover, s\leqslant\frac{d+1}{q-q/p}, and hence s(p-1)/p\leqslant\frac{d+1}{q}, so

(3q)^{s(p-1)/p}q^{t}\leqslant(3q)^{\frac{d+1}{q}}q^{t}.\qed
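For a rough sense of the savings, the following sketch plugs illustrative parameter values (assumptions made for this example only) into the support bounds of Lemmas 3.5 and 3.6 and compares the resulting number of queries with the q^{s+t} queries needed to read an entire (s+t)-flat.

p, q, s, t = 2, 16, 12, 10   # illustrative values with q = p^k and p dividing s
sparse = ((2 ** (p - 1) + p - 1) * q ** (p - 1)) ** (s // p) * q ** t
full = q ** (s + t)
print(sparse, full, sparse / full)

With these bounds the ratio equals \left(\frac{2^{p-1}+p-1}{q}\right)^{s/p}, which is the source of the saving over the full flat test.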

3.5 Relating the Sparse Flat Tester and the Flat Tester

In this section we describe a way to interpret the sparse (s+t)(s+t)-flat test as a tt-flat test. This relation will be useful later on as it allows us to borrow techniques from the analysis of the tt-flat test from [31].

Fix an affine transformation A\in\mathcal{T}_{n,s+\ell} for some \ell\geqslant t and let \operatorname{Res}_{s,\ell,t} denote the set of affine transformations (R,b) of the following form,

R=[Is00R],b=[0b],R=\begin{bmatrix}I_{s}&0\\ 0&R^{\prime}\end{bmatrix},b=\begin{bmatrix}0\\ b^{\prime}\\ \end{bmatrix}, (2)

for R^{\prime}\in\mathbb{F}_{q}^{\ell\times t},b^{\prime}\in\mathbb{F}_{q}^{\ell}. We call the affine transformations in \operatorname{Res}_{s,\ell,t} restrictions. In words, composing A with a random (R,b) in \operatorname{Res}_{s,\ell,t} corresponds to the operation of preserving the first s columns of A and randomizing the rest of them. As we explain below, this allows us to view the sparse (s+t)-flat test as having the t-flat tester embedded within it, applied to some restricted-type function of f.

More precisely, note that we can sample T𝒯n,s+tT\in\mathcal{T}_{n,s+t} by choosing A𝒯n,s+A\in\mathcal{T}_{n,s+\ell} as well as a restriction B=(R,b)Ress,,tB=(R,b)\in\operatorname{Res}_{s,\ell,t} uniformly at random and outputting T=ABT=A\circ B. In Lemmas 3.8 and 3.9 we show that the result of the test is “entirely dependent” on b+Im(R)=Im((R,b))b^{\prime}+\operatorname{Im}(R^{\prime})=\operatorname{Im}((R^{\prime},b^{\prime})). To this end, after fixing AA we define f~:𝔽q𝔽q\tilde{f}:\mathbb{F}_{q}^{\ell}\xrightarrow[]{}\mathbb{F}_{q} by

f~(β1,,β)=α𝔽qsfA(α,β1,,β)i=1s/pP(αp(i1)+1,,αpi).\tilde{f}(\beta_{1},\ldots,\beta_{\ell})=\sum_{\alpha\in\mathbb{F}_{q}^{s}}f\circ A(\alpha,\beta_{1},\ldots,\beta_{\ell})\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi}).

Also for B=(R,b)Ress,,tB=(R,b)\in\operatorname{Res}_{s,\ell,t} let fl(B)𝔽q\operatorname{fl}(B)\subseteq\mathbb{F}_{q}^{\ell} be the tt-flat given by b+Im(R)b^{\prime}+\operatorname{Im}(R^{\prime}). The following lemma gives a relation between the sparse flat tester rejecting fTf\circ T, and the (standard) flat tester rejecting f~\tilde{f} on b+Im(R)b^{\prime}+\operatorname{Im}(R^{\prime}).

Lemma 3.8.

For any BRess,,tB\in\operatorname{Res}_{s,\ell,t}, ABA\circ B rejects ff if and only if deg(f~|fl(B))r\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant r.

Proof.

Write B=(R,b)B=(R,b) in the form of (2) and let B:𝔽qt𝔽qB^{\prime}:\mathbb{F}_{q}^{t}\xrightarrow[]{}\mathbb{F}_{q}^{\ell} be the affine transformation given by (R,b)(R^{\prime},b^{\prime}). Recall that ABA\circ B rejects if and only if

\langle f\circ A\circ B,H_{e}\rangle\neq 0

for some e\in\{0,\ldots,q-1\}^{t} such that \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r. For any such e, we can rewrite this inner product as follows.

fAB,He\displaystyle\langle f\circ A\circ B,H_{e}\rangle =α𝔽qs,β𝔽qt(fAB(α,β))i=1s/pP(αp(i1)+1,,αpi)βe\displaystyle=\sum_{\alpha\in\mathbb{F}_{q}^{s},\beta\in\mathbb{F}_{q}^{t}}\left(f\circ A\circ B(\alpha,\beta)\right)\cdot\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi})\beta^{e}
\displaystyle=\sum_{\beta\in\mathbb{F}_{q}^{t}}\left(\sum_{\alpha\in\mathbb{F}_{q}^{s}}f\circ A(\alpha,B^{\prime}(\beta))\cdot\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi})\right)\cdot\beta^{e}
=β𝔽qtf~B(β)βe\displaystyle=\sum_{\beta\in\mathbb{F}_{q}^{t}}\tilde{f}\circ B^{\prime}(\beta)\cdot\beta^{e}
=f~B,xe.\displaystyle=\langle\tilde{f}\circ B^{\prime},x^{e}\rangle.

However, \langle\tilde{f}\circ B^{\prime},x^{e}\rangle\neq 0 if and only if \tilde{f}\circ B^{\prime} contains the monomial x_{1}^{q-1-e_{1}}\cdots x_{t}^{q-1-e_{t}}. It follows that A\circ B rejects if and only if \deg(\tilde{f}\circ B^{\prime})\geqslant r. Finally, since the degree of a polynomial is invariant under affine transformations, this is equivalent to \deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant r. ∎

We remark that Lemma 3.8 in particular implies that the sparse (s+t)-flat test is invariant under any affine transformation that only affects the last t coordinates. In particular, we get:

Lemma 3.9.

For any full rank B\in\operatorname{Res}_{s,t,t}, T\circ B rejects f if and only if T rejects f.

Proof.

We apply Lemma 3.8 with =t\ell=t and A=TA=T. Then, applying BB does not change the degree of f~\tilde{f} and the result follows. ∎

After fixing A\in\mathcal{T}_{n,s+\ell}, this lemma allows us to think of the remainder of the sparse (s+t)-flat test, i.e. choosing a restriction B\in\operatorname{Res}_{s,\ell,t}, as a t-flat test on the \ell-variate function \tilde{f} defined from f\circ A as above. Setting \ell=t+100, we can view the sparse (s+t)-flat test as follows.

  1. 1.

    Choose a random A𝒯n,s+t+100A\in\mathcal{T}_{n,s+t+100}.

  2. 2.

    Perform the standard t-flat test on the (t+100)-variate function \tilde{f} defined according to the above.

This view of the test will allow us to borrow some concepts and facts from the tt-flat test which we now introduce. First, we formally define the upper shadow for flats.

Definition 7.

For a set of \ell-flats, SS, define the upper shadow as follows

S={V|dim(V)=+1,US,UV}.S^{\uparrow}=\{V\;|\;\dim(V)=\ell+1,\exists U\in S,U\subseteq V\}.

The following shadow lemma is shown in [10] and is used extensively in the analysis of [31]. This lemma will play a role in our analysis as well, so we record it below.

Lemma 3.10.

For a function ff, let SS be the set of \ell-flats, UU, such that deg(f|U)>d\deg(f|_{U})>d. Then,

μ(S)qμ(S),\mu(S^{\uparrow})\leqslant q\mu(S),

where the measures are in the sets of (+1)(\ell+1)-flats and \ell-flats respectively.

We can also define an analogous notion of upper shadow for affine transformations that fits with the view of the sparse flat tester just introduced.

Definition 8.

For a set of affine transformations 𝒮𝒯n,s+\mathcal{S}\subseteq\mathcal{T}_{n,s+\ell} define

\mathcal{S}^{\uparrow}=\{T\in\mathcal{T}_{n,s+\ell+1}\;|\;\exists R\in\operatorname{Res}_{s,\ell+1,\ell},\;T\circ R\in\mathcal{S}\}

We will need the following simple result about the upper shadow of sets of affine transformations.

Lemma 3.11.

For a set of affine transformations 𝒮𝒯n,s+\mathcal{S}\subseteq\mathcal{T}_{n,s+\ell},

μ(𝒮)μ(𝒮).\mu(\mathcal{S}^{\uparrow})\geqslant\mu(\mathcal{S}).
Proof.

We can sample T\in\mathcal{T}_{n,s+\ell} by first sampling T^{\prime}\in\mathcal{T}_{n,s+\ell+1} and then choosing a restriction R\in\operatorname{Res}_{s,\ell+1,\ell} and outputting T=T^{\prime}\circ R. We can only have T\in\mathcal{S} if T^{\prime}\in\mathcal{S}^{\uparrow}, so the result follows. ∎

4 Locating a Potential Error

We now begin our proof of Theorem 1.1. Recall the set \mathcal{S}_{t}=\{T\in\mathcal{T}_{n,s+t}\;|\;T\text{ rejects }f\} defined in (1), and that \operatorname{rej}_{s+t}(f)=\mu_{s+t}(\mathcal{S}_{t}); we denote \varepsilon=\mu_{s+t}(\mathcal{S}_{t}) for simplicity. In this section we show that if \varepsilon\leqslant q^{-M} (where M is a large absolute constant to be determined), then we may find a potentially erroneous input x^{\star} for f, in the sense that the sparse flat tester almost always rejects whenever the chosen transformation maps a point in the support of the test polynomials to x^{\star}.

We begin by checking that the set 𝒮t\mathcal{S}_{t} satisfies the conditions of Theorem 2.1.

4.1 The Edge Expansion of 𝒮t\mathcal{S}_{t} in the Affine Bi-linear Scheme

We start by proving an upper bound on the expansion of \mathcal{S}_{t}. For a matrix M and column vector v, let [M,v] denote the matrix obtained by appending v to M as an additional column. Consider the following procedure for sampling an edge in {\sf AffBilin}(n,s+t).

  1. 1.

    Choose T1=(M,c)𝒯n,s+tT_{1}=(M,c)\in\mathcal{T}_{n,s+t} uniformly at random.

  2. 2.

    Choose v𝔽qnv\in\mathbb{F}_{q}^{n} uniformly at random conditioned on v0v\neq 0 and let T=([M,v],c)T^{\prime}=([M,v],c).

  3. 3.

    Choose a uniformly random matrix R𝔽q(s+t+1)×(s+t)R\in\mathbb{F}_{q}^{(s+t+1)\times(s+t)} of the form R=[Is+t,w]TR=[I_{s+t},w]^{T}, and a uniformly random b𝔽qs+t+1b\in\mathbb{F}_{q}^{s+t+1} such that the first s+ts+t entries in bb are zero.

  4. 4.

    Set T2=T(R,b)T_{2}=T^{\prime}\circ(R,b).
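To make this procedure concrete, the following Python sketch samples one such edge for small assumed parameters (with q taken prime so that arithmetic is just integers mod q, and m standing for s+t). Unwinding steps 3 and 4, the second endpoint has linear part M+vw^{T} and affine shift c+\beta v, where w is the last row of R and \beta is the last entry of b; an edge thus corresponds to a rank-one update of the linear part together with a shift of the affine part along v.

import random

q, n, m = 5, 6, 3   # illustrative sizes, with m standing for s + t and q prime

def rand_vec(k):
    return [random.randrange(q) for _ in range(k)]

M = [rand_vec(m) for _ in range(n)]                 # step 1: T1 = (M, c)
c = rand_vec(n)
v = rand_vec(n)                                     # step 2: a nonzero v
while all(x == 0 for x in v):
    v = rand_vec(n)
w = rand_vec(m)                                     # step 3: last row of R
beta = random.randrange(q)                          #         last entry of b
# step 4: unwinding the composition, T2 = (M + v w^T, c + beta * v)
M2 = [[(M[i][j] + v[i] * w[j]) % q for j in range(m)] for i in range(n)]
c2 = [(c[i] + beta * v[i]) % q for i in range(n)]
print("T1:", (M, c))
print("T2:", (M2, c2))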

The following lemma will allow us to show 𝒮t\mathcal{S}_{t} is poorly expanding. This takes the place of the “shadow” lemma from [31].

Lemma 4.1.

Let G:𝔽qs+t𝔽qG:\mathbb{F}_{q}^{s+t}\xrightarrow[]{}\mathbb{F}_{q} be an arbitrary polynomial and let T=(M,c)T=(M,c) be an s+ts+t-affine transformation such that fT,G0\langle f\circ T,G\rangle\neq 0. Fix v𝔽qnv\in\mathbb{F}_{q}^{n} chosen in step 22, and let T=([M,v],c)T^{\prime}=([M,v],c). Then,

Pr(R,b)[fT(R,b),G0]1q,\Pr_{(R,b)}[\langle f\circ T^{\prime}\circ(R,b),G\rangle\neq 0]\geqslant\frac{1}{q},

where RR is sampled according to step 3.

Proof.

Let w=(w_{1},\ldots,w_{s+t}) be the last row of R and let b_{s+t+1} be the last entry of b. Choosing (R,b) uniformly at random amounts to choosing b_{s+t+1} and each w_{i} uniformly at random in \mathbb{F}_{q}. To obtain the result we will view \langle f\circ T^{\prime}\circ(R,b),G\rangle as a function of these random values and apply the Schwartz-Zippel lemma.

The function fT(R,b)f\circ T^{\prime}\circ(R,b) can be obtained by composing the (s+t+1)(s+t+1)-variable function, f~=fT\tilde{f}=f\circ T^{\prime}, with (R,b)(R,b). Then fT(R,b)f\circ T^{\prime}\circ(R,b) is just the (s+t)(s+t)-variable function,

(fT(R,b))(x1,,xs+t)=f~(x1,,xs+t,i=1s+twixi+bs+t+1).\left(f\circ T^{\prime}\circ(R,b)\right)(x_{1},\ldots,x_{s+t})=\tilde{f}\left(x_{1},\ldots,x_{s+t},\sum_{i=1}^{s+t}w_{i}x_{i}+b_{s+t+1}\right).

By Lemma 3.3, \langle f\circ T^{\prime}\circ(R,b),G\rangle is some linear combination of the coefficients of f\circ T^{\prime}\circ(R,b), and these coefficients are polynomials in w_{1},\ldots,w_{s+t},b_{s+t+1} of total degree at most q-1, because \tilde{f} can be written as a polynomial with individual degrees at most q-1. Thus, \langle f\circ T^{\prime}\circ(R,b),G\rangle is a polynomial in w_{1},\ldots,w_{s+t},b_{s+t+1} of total degree at most q-1. Moreover, this polynomial is not identically zero because when setting w_{1}=\cdots=w_{s+t}=b_{s+t+1}=0, it evaluates to

fT,G0.\langle f\circ T,G\rangle\neq 0.

By the Schwartz-Zippel lemma it follows that \Pr_{(R,b)}[\langle f\circ T^{\prime}\circ(R,b),G\rangle\neq 0]\geqslant\frac{1}{q}. ∎

As an immediate corollary, we get that the set of affine transformations that reject is poorly expanding.

Lemma 4.2.

The set of s+ts+t-affine transformations that reject ff, 𝒮t\mathcal{S}_{t}, satisfies

1Φ(𝒮t)1q.1-\Phi(\mathcal{S}_{t})\geqslant\frac{1}{q}.

where the expansion is in 𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,s+t){\sf AffBilin}(n,s+t).

Proof.

Fix T1𝒮tT_{1}\in\mathcal{S}_{t} and choose T2T_{2} adjacent to T1T_{1} in 𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,s+t){\sf AffBilin}(n,s+t). By assumption there is some valid ee and HeH_{e} such that fT1,He0\langle f\circ T_{1},H_{e}\rangle\neq 0. Since T2T_{2} can be chosen according to the process described at the start of the section, we can apply Lemma 4.1 with G=HeG=H_{e},

PrT2[fT2,He0]1q.\Pr_{T_{2}}[\langle f\circ T_{2},H_{e}\rangle\neq 0]\geqslant\frac{1}{q}.

It follows that 1Φ(𝒮t)1q1-\Phi(\mathcal{S}_{t})\geqslant\frac{1}{q}. ∎

4.2 Pseudorandom with respect to zoom-outs and zoom-outs on the linear part

Next, we show that the set 𝒮t\mathcal{S}_{t} is pseudorandom with respect to zoom-outs and zoom-out on the linear part.

Lemma 4.3.

The set \mathcal{S}_{t} is q\mu(\mathcal{S}_{t})-pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

Proof.

We show the proof for zoom-outs. The argument for zoom-outs on the linear part is very similar.

Fix a zoom-out \mathcal{C}_{a^{T},b,\beta} and denote by \mu_{a^{T},b,\beta}(\mathcal{S}_{t}) the measure of \mathcal{S}_{t} in \mathcal{C}_{a^{T},b,\beta}, that is, \frac{|\mathcal{S}_{t}\cap\mathcal{C}_{a^{T},b,\beta}|}{|\mathcal{C}_{a^{T},b,\beta}|}. Fix an arbitrary v such that a^{T}\cdot v\neq 0. Sample an (s+t)-affine transformation by first choosing T_{1}=(M,c)\in\mathcal{C}_{a^{T},b,\beta} uniformly at random, and then T_{2}=(M^{\prime},c^{\prime}) according to steps 3 and 4 with this v. That is, if the ith column of M is M_{i}, then the ith column of M^{\prime} is M_{i}+\alpha_{i}v, and the affine shift is c^{\prime}=c+\alpha_{0}v, for uniformly random \alpha_{i}'s and \alpha_{0} in \mathbb{F}_{q}. By Lemma 4.1, if T_{1}\in\mathcal{S}_{t}\cap\mathcal{C}_{a^{T},b,\beta}, then T_{2}\in\mathcal{S}_{t} with probability at least 1/q, so

Pr[T2𝒮t]μaT,b,β(𝒮t)q.\Pr[T_{2}\in\mathcal{S}_{t}]\geqslant\frac{\mu_{a^{T},b,\beta}(\mathcal{S}_{t})}{q}.

On the other hand, notice that the distribution of T2T_{2} is uniform over 𝒯n,s+t\mathcal{T}_{n,s+t}. Indeed, fix a T=(M,c)𝒯n,s+tT=(M^{\prime},c^{\prime})\in\mathcal{T}_{n,s+t} and let MiM^{\prime}_{i} denote the iith column of MM^{\prime}. Then the only way to get T2=TT_{2}=T is to choose,

\alpha_{i}=\frac{a^{T}M^{\prime}_{i}-b_{i}}{a^{T}v}\text{ for }1\leqslant i\leqslant s+t,\quad\text{ and }\quad\alpha_{0}=\frac{a^{T}c^{\prime}-\beta}{a^{T}v},

and (M,c)𝒞aT,b,β(M,c)\in\mathcal{C}_{a^{T},b,\beta} such that

M_{i}=M^{\prime}_{i}-\alpha_{i}v\text{ for }1\leqslant i\leqslant s+t,\quad\text{ and }\quad c=c^{\prime}-\alpha_{0}v.

Since T2T_{2} is uniform over 𝒯n,s+t\mathcal{T}_{n,s+t}, we have

μ(𝒮t)Pr[T2𝒮t]μaT,b,β(𝒮t)q,\mu(\mathcal{S}_{t})\geqslant\Pr[T_{2}\in\mathcal{S}_{t}]\geqslant\frac{\mu_{a^{T},b,\beta}(\mathcal{S}_{t})}{q},

and hence \mu_{a^{T},b,\beta}(\mathcal{S}_{t})\leqslant q\mu(\mathcal{S}_{t}). Since this applies to all zoom-outs, and a similar argument works for zoom-outs on the linear part, the result follows. ∎

4.3 Pseudorandomness with respect to zoom-ins on the linear part

We next show pseudorandomness with respect to zoom-ins on the linear part. The argument is more involved. In particular, we use the relation to the tt-flat test to reduce the proof of this statement to analogous statements on affine Grassmann graphs. We then prove this statement using ideas similar to [31]. Formally, in this section we show:

Lemma 4.4.

The set 𝒮t\mathcal{S}_{t} is tq162εtq^{162}\varepsilon pseudorandom with respect to zoom-ins on the linear part Ca,b,linC_{a,b,\operatorname{lin}}.

We start with a few definitions. For a vector aa (of arbitrary length greater than ss), let a[s]a_{[s]} be the vector that is equal to aa in its first ss coordinates and 0 in all of its other coordinates. Let a|[s]a|_{[s]} be the vector aa restricted to its first ss coordinates. For a matrix M𝔽qn×(s+)M\in\mathbb{F}_{q}^{n\times(s+\ell)}, denote

\operatorname{Im}_{a}(M)=\{Ma^{\prime}\;|\;a^{\prime}\in\mathbb{F}_{q}^{s+\ell},\;a^{\prime}|_{[s]}=a|_{[s]},\;a^{\prime}\neq a^{\prime}_{[s]}\}.

In words, Ima(M)\operatorname{Im}_{a}(M) is the image of elements aa^{\prime} that agree with aa on the first ss coordinates and are not identically 0 on the rest of the coordinates.

Due to the asymmetry induced by the test, we have a different argument depending on the 0 pattern of aa. We will first show that 𝒮t\mathcal{S}_{t} is pseudorandom with respect to zoom-ins on the linear part, Ca,b,linC_{a,b,\operatorname{lin}}, where aa[s]a\neq a_{[s]}, by a direct argument. We will then prove that 𝒮t\mathcal{S}_{t} is pseudorandom with respect to any zoom-in on the linear part by a reduction to this case.

Lemma 4.5.

The set 𝒮t\mathcal{S}_{t} is q161εq^{161}\varepsilon pseudorandom with respect to zoom-ins on the linear part Ca,b,linC_{a,b,\operatorname{lin}} with aa[s]a\neq a_{[s]}.

The proof of Lemma 4.5 requires some set-up and auxiliary statements, which we present next. Fix a,ba,b as in the lemma and suppose that 𝒮t\mathcal{S}_{t} has α\alpha fractional size in 𝒞a,b,lin\mathcal{C}_{a,b,\operatorname{lin}}, that is,

μ(𝒮t𝒞a,b,lin)μ(𝒞a,b,lin)=α.\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})}{\mu(\mathcal{C}_{a,b,\operatorname{lin}})}=\alpha.

Consider another a𝔽qs+ta^{\prime}\in\mathbb{F}_{q}^{s+t} such that a[s]=a[s]a^{\prime}_{[s]}=a_{[s]}, and aa[s]a^{\prime}\neq a^{\prime}_{[s]}. In words aa^{\prime} is equal to aa in its first ss coordinates, and nonzero in its last tt coordinates. There is an invertible matrix RR whose first ss rows and columns form the identity matrix, IsI_{s}, such that Ra=aRa^{\prime}=a. Composition with RR gives a bijection between the sets 𝒞a,b,lin\mathcal{C}_{a,b,\operatorname{lin}} and 𝒞a,b,lin\mathcal{C}_{a^{\prime},b,\operatorname{lin}}. Moreover, by Lemma 3.9, (M,c)𝒮t(M,c)\in\mathcal{S}_{t} if and only if (MR,c)𝒮t(M\cdot R,c)\in\mathcal{S}_{t}. Therefore, for any aa^{\prime} satisfying a[s]=a[s]a^{\prime}_{[s]}=a_{[s]}, and aa[s]a^{\prime}\neq a^{\prime}_{[s]} it holds that

\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a^{\prime},b,\operatorname{lin}})}{\mu(\mathcal{C}_{a^{\prime},b,\operatorname{lin}})}=\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})}{\mu(\mathcal{C}_{a,b,\operatorname{lin}})}=\alpha.

Put another way, 𝒮t\mathcal{S}_{t} has fractional size α\alpha in the set of (M,c)(M,c) such that bIma(M)b\in\operatorname{Im}_{a}(M), and thus we define

𝒟a,b,={(M,c)𝒯n,s+|bIma(M)}.\mathcal{D}_{a,b,\ell}=\{(M,c)\in\mathcal{T}_{n,s+\ell}\;|\;b\in\operatorname{Im}_{a}(M)\}.

We can sample an (M^{\prime},c^{\prime})\in\mathcal{D}_{a,b,t} by first choosing (M,c)\in\mathcal{D}_{a,b,t+100}, then choosing (R,v)\in\operatorname{Res}_{s,t+100,t} uniformly at random, and outputting (M^{\prime},c^{\prime})=(M,c)\circ(R,v)=(M\cdot R,c+Mv) conditioned on (M^{\prime},c^{\prime})\in\mathcal{D}_{a,b,t}. Since the probability that (M^{\prime},c^{\prime})\in\mathcal{S}_{t} is \alpha and this can only happen if (M,c)\in\mathcal{S}_{t}^{\uparrow^{100}}, we get that

\frac{\mu(\mathcal{S}_{b}^{\prime})}{\mu(\mathcal{D}_{a,b,t+100})}\geqslant\alpha,

where 𝒮b=𝒮t100𝒟a,b,t+100\mathcal{S}^{\prime}_{b}=\mathcal{S}_{t}^{\uparrow^{100}}\cap\mathcal{D}_{a,b,t+100}. Recall 𝒮t\mathcal{S}_{t}^{\uparrow} is the set of (s+t+1)(s+t+1)-affine transformations, TT, for which there exists a restriction RR such that TR𝒮tT\circ R\in\mathcal{S}_{t}, and 𝒮t100\mathcal{S}_{t}^{\uparrow^{100}} is the same operation applied 100100 times.

The following lemma shows that there is a way to fix the first ss columns of the test so that the rejection probability remains small:

Lemma 4.6.

There exists (M,c)\in\mathcal{S}^{\prime}_{b} such that, choosing B=(R,v)\in\operatorname{Res}_{s,t+100,t} uniformly at random and setting (M^{\prime},c^{\prime})=(M,c)\circ B=(M\cdot R,c+Mv), we have

\Pr_{B=(R,v)\in\operatorname{Res}_{s,t+100,t}}[(M,c)\circ B\in\mathcal{S}_{t}\;|\;b\notin\operatorname{Im}_{a}(M^{\prime})]\leqslant\frac{2}{\alpha}\varepsilon.
Proof.

Choose a full rank (s+t+100)-affine transformation (M,c) uniformly at random, then choose (R,v)\in\operatorname{Res}_{s,t+100,t} uniformly and set (M^{\prime},c^{\prime})=(M,c)\circ(R,v). For the remainder of this proof all probabilities are with respect to this distribution. Define the following events.

  • E1={(M,c)𝒮t}E_{1}=\{(M^{\prime},c^{\prime})\in\mathcal{S}_{t}\}.

  • E2={(M,c)𝒮b}E_{2}=\{(M,c)\in\mathcal{S}^{\prime}_{b}\}.

  • E3={bIma(M)}E_{3}=\{b\notin\operatorname{Im}_{a}(M^{\prime})\}.

  • E4={bIma(M)}E_{4}=\{b\in\operatorname{Im}_{a}(M)\}.

First we note that

Pr[E1|E3E4]=Pr[E1|E3],\Pr[E_{1}\;|E_{3}\land E_{4}]=\Pr[E_{1}\;|\;E_{3}],

and as a result,

Pr[E1E3|E4]=Pr[E1|E3E4]Pr[E3|E4]Pr[E1|E3].\Pr[E_{1}\land E_{3}\;|\;E_{4}]=\Pr[E_{1}\;|\;E_{3}\land E_{4}]\cdot\Pr[E_{3}\;|\;E_{4}]\leqslant\Pr[E_{1}\;|\;E_{3}].

We can then conclude

Pr(M,c)=(M,c)R[E1|E2E3]\displaystyle\Pr_{(M^{\prime},c^{\prime})=(M,c)\circ R}[E_{1}\;|\;E_{2}\land E_{3}] =Pr[E1E3E4]Pr[E2E3]\displaystyle=\frac{\Pr[E_{1}\land E_{3}\land E_{4}]}{\Pr[E_{2}\land E_{3}]}
=Pr[E1E3|E4]Pr[E4]Pr[E2E3]\displaystyle=\frac{\Pr[E_{1}\land E_{3}~{}|~{}E_{4}]\Pr[E_{4}]}{\Pr[E_{2}\land E_{3}]}
Pr[E1|E3]Pr[E4]Pr[E2E3]\displaystyle\leqslant\frac{\Pr[E_{1}\;|\;E_{3}]\Pr[E_{4}]}{\Pr[E_{2}\land E_{3}]}
=Pr(M,c)[E1|E3]Pr[E2|E4]Pr[E3|E2],\displaystyle=\frac{\Pr_{(M^{\prime},c^{\prime})}[E_{1}\;|\;E_{3}]}{\Pr[E_{2}\;|\;E_{4}]\Pr[E_{3}\;|\;E_{2}]},

where we use the fact that E2E4=E2E_{2}\land E_{4}=E_{2}, and so Pr[E2]/Pr[E4]=Pr[E2|E4]\Pr[E_{2}]/\Pr[E_{4}]=\Pr[E_{2}~{}|~{}E_{4}].

Notice that the numerator of the last fraction is at most \varepsilon, while the denominator is at least \alpha/2, so this probability is at most 2\varepsilon/\alpha. Thus, there exists (M,c)\in\mathcal{S}^{\prime}_{b} such that, conditioned on this (M,c), the probability that (M,c)\circ B\in\mathcal{S}_{t} over B\in\operatorname{Res}_{s,t+100,t}, conditioned on b\notin\operatorname{Im}_{a}(M^{\prime}), is at most 2\varepsilon/\alpha. ∎

Fixing this (M,c), the remaining test graph is now over \operatorname{Res}_{s,t+100,t} and is isomorphic to {\sf AffBilin}(t+100,t). Moreover, by Lemma 3.8, we may focus solely on the flat b^{\prime}+\operatorname{Im}(R^{\prime}) determined by the restriction. Writing A=(M,c) for the fixed transformation, this allows us to define \tilde{f}:\mathbb{F}_{q}^{t+100}\xrightarrow[]{}\mathbb{F}_{q} as done in Section 3.5:

f~(β1,,βt+100)=α𝔽qsfA(α,β1,,βt+100)i=1s/pP(αp(i1)+1,,αpi).\tilde{f}(\beta_{1},\ldots,\beta_{t+100})=\sum_{\alpha\in\mathbb{F}_{q}^{s}}f\circ A(\alpha,\beta_{1},\ldots,\beta_{t+100})\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi}).

By Lemma 3.8 we now work with the standard t-flat test for \tilde{f}. The condition that b\notin\operatorname{Im}_{a}(M\cdot R) translates into the condition that the t-flat U\subseteq\mathbb{F}_{q}^{t+100} does not contain the point w\in\mathbb{F}_{q}^{t+100} equal to the last t+100 coordinates of a^{\prime}, where a^{\prime} is the unique point such that Ma^{\prime}=b.

Translating Lemma 4.6 gives

PrB=z+A𝖠𝖿𝖿𝖦𝗋𝖺𝗌(t+100,t)[deg(f~|B)r|wA]2αε.\Pr_{B=z+A\in{\sf AffGras}(t+100,t)}[\deg(\tilde{f}|_{B})\geqslant r\;|\;w\notin A]\leqslant\frac{2}{\alpha}\varepsilon.

This is the same result shown at the start of the proof of [31, Claim 3.4], and the rest of the argument follows as therein. Let

\mathcal{B}=\{B=z+A\in{\sf AffGras}(t+100,t)\;|\;w\notin A,\;\deg(\tilde{f}|_{B})\geqslant r\}.
Lemma 4.7.

The set \mathcal{B} is nonempty.

Proof.

Since we chose (M,c)\in\mathcal{S}^{\prime}_{b}, there is at least one restriction B such that (M,c)\circ B rejects f, and hence there is a t-flat on which \tilde{f} has degree at least r; in particular, \deg(\tilde{f})\geqslant r. It follows that if we sample a random t-flat B^{\prime\prime}=z+A in \mathbb{F}_{q}^{t+100}, then \deg(\tilde{f}|_{B^{\prime\prime}})\geqslant r with probability at least 1/q. If \mathcal{B} were empty, however, we would have \deg(\tilde{f}|_{B^{\prime\prime}})\geqslant r only if w\in A. In this case the probability that w\in A is at most

qt1qt+11<1q.\frac{q^{t}-1}{q^{t+1}-1}<\frac{1}{q}.

Therefore \mathcal{B} is nonempty. ∎

We now proceed to the proof of Lemma 4.5.

Proof of Lemma 4.5.

By Lemma 4.7, \mathcal{B} is non-empty, so there must be a (t+40)-flat W such that

W={B|BW}.\mathcal{B}_{W}=\{B\in\mathcal{B}\;|\;B\subseteq W\}.

is nonempty. Let \mu_{W} denote the uniform measure over {\sf AffGras}(W,t). This graph is isomorphic to {\sf AffGras}(t+40,t), but with the ground space \mathbb{F}_{q}^{t+40} viewed as W. We first claim that since \mu_{W}(\mathcal{B}_{W})>0, it must be at least q^{-100}. If not, then 0<\mu_{W}(\mathcal{B}_{W})<q^{-100}. However, \mathcal{B}_{W} is q^{-60}-pseudo-random with respect to zoom-ins and zoom-ins on the linear part simply due to its size. Indeed, any zoom-in or zoom-in on the linear part already has measure q^{-40}, so \mathcal{B}_{W} can occupy at most a q^{-60} fraction of it. Furthermore, \mathcal{B}_{W} is also q^{-50}-pseudo-random with respect to zoom-outs and zoom-outs on the linear part by a similar argument to Lemma 4.3. Finally, by a similar argument to Lemma 4.2, 1-\Phi_{W}(\mathcal{B}_{W})\geqslant\frac{1}{q}. Altogether, this contradicts Theorem A.1, so we conclude that \mu_{W}(\mathcal{B}_{W})\geqslant q^{-100}.

Thus, there exists W such that \mu_{W}(\mathcal{B}_{W})\geqslant q^{-100}. Fix such a W, sample a uniform (t+99)-flat Y=u+V\subseteq\mathbb{F}_{q}^{t+100} conditioned on w\notin V, then a uniform (t+60)-flat A_{2}\subseteq Y, and consider A_{2}\cap W. Here, for a (t+99)-flat Y we write \mathcal{B}_{Y}=\{B\in\mathcal{B}\;|\;B\subseteq Y\}. We may think of W as being defined by a system of 60 independent linear equations \langle h_{1},x\rangle=c_{1},\ldots,\langle h_{60},x\rangle=c_{60}. That is, W is the set of points of \mathbb{F}_{q}^{t+100} that satisfies these 60 equations. Likewise, A_{2} is given by the restriction of 39 linear equations, \langle h^{\prime}_{1},x\rangle=c^{\prime}_{1},\ldots,\langle h^{\prime}_{39},x\rangle=c^{\prime}_{39}. The probability that all 99 linear equations are linearly independent is at least

j=038q99q60+jq99e2j=1qje4/q.\prod_{j=0}^{38}\frac{q^{99}-q^{60+j}}{q^{99}}\geqslant e^{-2\sum_{j=1}^{\infty}q^{-j}}\geqslant e^{-4/q}.

When all 9999 linear equations are linearly independent, A2WA_{2}\cap W is uniform over 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(W,t){\sf AffGras}(W,t). Thus,

Pr[A2WW]e4/qq100.\Pr[A_{2}\cap W\in\mathcal{B}_{W}]\geqslant e^{-4/q}q^{-100}.

If A2WWA_{2}\cap W\in\mathcal{B}_{W}, then A2Y60A_{2}\in\mathcal{B}_{Y}^{\uparrow^{60}}, where the upper shadow is taken with respect to 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(Y,t){\sf AffGras}(Y,t), so it follows that

EY[μY(Y60)]e4/qq100.E_{Y}[\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})]\geqslant e^{-4/q}q^{-100}.

However, by Lemma 3.10,

μY(Y60)q60μY(Y),\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})\leqslant q^{60}\mu_{Y}(\mathcal{B}_{Y}),

so altogether we get that

EY[μY(Y)]e4/qq160.E_{Y}[\mu_{Y}(\mathcal{B}_{Y})]\geqslant e^{-4/q}q^{-160}.

To conclude, note that the left hand side is at most the probability that \tilde{f}|_{B} has degree at least r over a uniform t-flat B=x+A\subseteq\mathbb{F}_{q}^{t+100} such that w\notin A. By assumption, this probability is at most \frac{2}{\alpha}\varepsilon, so

\alpha\leqslant e^{4/q}q^{160}\cdot 2\varepsilon.\qed

We are now ready to prove Lemma 4.4.

Proof of Lemma 4.4.

We show that the set \mathcal{S}_{t} is tq^{162}\varepsilon-pseudorandom with respect to zoom-ins on the linear part C_{a,b,\operatorname{lin}} for any a\in\mathbb{F}_{q}^{s+t} and b\in\mathbb{F}_{q}^{n}.

If aa[s]a\neq a_{[s]}, then we are done by Lemma 4.5, so suppose that a=a[s]a=a_{[s]}, meaning aa is zero outside of its first ss coordinates. Clearly aa must have at least one nonzero coordinate, as otherwise a=0a=0 and Ca,b,linC_{a,b,\operatorname{lin}} is either all of 𝒯n,s+t\mathcal{T}_{n,s+t} (if b=0)b=0), or empty (if b0b\neq 0). The former case cannot happen by the assumption that μ(𝒮t)qM\mu(\mathcal{S}_{t})\leqslant q^{-M}, while the latter case is trivially true. Without loss of generality suppose that a1=α0a_{1}=\alpha\neq 0.

For each T\in\mathcal{S}_{t}\cap C_{a,b,\operatorname{lin}}, there is an e\in\{0,\ldots,q-1\}^{t} satisfying \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r such that \langle f\circ T,H_{e}\rangle\neq 0. Furthermore, as r>0, there must be some “special index” i such that e_{i}<q-1. Take the most common special index over all T\in\mathcal{S}_{t}\cap C_{a,b,\operatorname{lin}} and without loss of generality suppose it is 1. Thus, for at least a 1/t fraction of the T\in\mathcal{S}_{t}\cap C_{a,b,\operatorname{lin}}, we have \langle f\circ T,H_{e}\rangle\neq 0 for some e such that e_{1}<q-1 and \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r. Let \mathcal{A} denote the set of these transformations.

Consider the map, Fβ:𝔽qs+t𝔽qs+tF_{\beta}:\mathbb{F}_{q}^{s+t}\xrightarrow[]{}\mathbb{F}_{q}^{s+t} that sends xs+1x_{s+1} to xs+1+βx1x_{s+1}+\beta x_{1} and keeps all other coordinates unchanged. That is,

Fβ(x1,,xs+t)=(x1,,xs,xs+1+βx1,xs+2,,xs+t).F_{\beta}(x_{1},\ldots,x_{s+t})=(x_{1},\ldots,x_{s},x_{s+1}+\beta x_{1},x_{s+2},\ldots,x_{s+t}).

For any T𝒜T\in\mathcal{A}, we claim that

Prβ𝔽q[fTFβ,He0]2q,\Pr_{\beta\in\mathbb{F}_{q}}[\langle f\circ T\circ F_{\beta},H_{e}\rangle\neq 0]\geqslant\frac{2}{q},

where ee is the exponent vector such that e1<q1e_{1}<q-1, i=1teit(q1)r\sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r, and fT,He0\langle f\circ T,H_{e}\rangle\neq 0.

Indeed, \langle f\circ T\circ F_{\beta},H_{e}\rangle is a linear combination of the coefficients of f\circ T\circ F_{\beta}, which are polynomials in \beta. In fact, by Lemma 2.2, it is a linear combination of coefficients of monomials in which the degree of x_{s+1} is q-1-e_{1}>0. In every monomial of f\circ T\circ F_{\beta}, however, the degree of \beta and the degree of x_{s+1} add up to at most q-1, so it follows that the coefficients that contribute to \langle f\circ T\circ F_{\beta},H_{e}\rangle have degree at most q-2 in \beta. Therefore \langle f\circ T\circ F_{\beta},H_{e}\rangle is a polynomial in \beta of degree at most q-2. Finally, this polynomial is not identically zero because it evaluates to a nonzero value at \beta=0. The inequality then follows from the Schwartz-Zippel lemma.

In particular, this means that for each T\in\mathcal{A} there is a nonzero \beta such that \langle f\circ T\circ F_{\beta},H_{e}\rangle\neq 0, and we choose \beta^{\star} to be the most common such \beta (we remark that the fact that \beta^{\star}\neq 0 is the reason we needed probability 2/q rather than 1/q). Thus, for at least a 2/q fraction of the transformations T\in\mathcal{A} it holds that T\circ F_{\beta^{\star}}\in\mathcal{S}_{t}. Letting a^{\prime}=F_{\beta^{\star}}^{-1}(a), we get that for at least a 2/q fraction of the T\in\mathcal{A} it holds that T\circ F_{\beta^{\star}}\in\mathcal{C}_{a^{\prime},b,\operatorname{lin}}\cap\mathcal{S}_{t}, and as F_{\beta^{\star}} is a bijection it follows that

1tμ(𝒮t𝒞a,b,lin)μ(𝒜)q2μ(𝒮t𝒞a,b,lin).\frac{1}{t}\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})\leqslant\mu(\mathcal{A})\leqslant\frac{q}{2}\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a^{\prime},b,\operatorname{lin}}).

Finally, note that a^{\prime}\neq a^{\prime}_{[s]}, as its (s+1)-st coordinate is nonzero by design. Applying Lemma 4.5 we get that \mu(\mathcal{S}_{t}\cap\mathcal{C}_{a^{\prime},b,\operatorname{lin}})\leqslant q^{161}\varepsilon\mu(\mathcal{C}_{a^{\prime},b,\operatorname{lin}}), and as \mu(\mathcal{C}_{a^{\prime},b,\operatorname{lin}})=\mu(\mathcal{C}_{a,b,\operatorname{lin}}), combining with the above we get that

μ(𝒮t𝒞a,b,lin)μ(𝒞a,b,lin)tq162ε.\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})}{\mu(\mathcal{C}_{a,b,\operatorname{lin}})}\leqslant tq^{162}\varepsilon.\qed

5 Correcting the error and iterating

The results from the previous subsections, along with the assumption that μ(𝒮t)qM\mu(\mathcal{S}_{t})\leqslant q^{-M}, establish that 𝒮t\mathcal{S}_{t} satisfies the following properties:

  1. 1.

    μ(𝒮t)qM\mu(\mathcal{S}_{t})\leqslant q^{-M}

  2. 2.

    1Φ(𝒮t)1q1-\Phi(\mathcal{S}_{t})\geqslant\frac{1}{q}.

  3. 3.

    𝒮t\mathcal{S}_{t} is q1Mq^{1-M}-pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

  4. 4.

    𝒮t\mathcal{S}_{t} is tq162Mtq^{162-M}-pseudorandom with respect to zoom-ins on the linear part.

Since we take t10t\geqslant 10, it follows from Theorem 2.1 with ξ=tq162M\xi=tq^{162-M} and =s+t\ell=s+t that there exists a pair a,ba,b such that

μa,b(𝒮t)=μ(𝒮t𝒞a,b)μ(𝒞a,b)11(q1)21q62000tq200qM/4,\mu_{a,b}(\mathcal{S}_{t})=\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b})}{\mu(\mathcal{C}_{a,b})}\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}, (3)

for a large enough MM.

How shall we go about using this information? In words, (3) tells us that 𝒮t\mathcal{S}_{t} is dense on transformations sending aa to bb, and this suggests that the point bb is an erroneous point for ff which we should fix and, as a result, improve the acceptance probability of the test. Furthermore, it stands to reason that this change should affect all tests that come from transformations in 𝒞a,b\mathcal{C}_{a,b}.

Upon inspection however, these tests can only be affected in the case that asupp(H)a\in\operatorname{supp}(H); otherwise, in the test we perform, the value f(T(a))=f(b)f(T(a))=f(b) is multiplied by 0, and hence changing the value of ff at bb does not affect the test at all. Thus, for the above strategy to work, we must prove that (3) holds for a pair a,ba,b wherein aa is in 𝗌𝗎𝗉𝗉(H){\sf supp}(H).

Lemma 5.1.

There exists asupp(H)a\in\operatorname{supp}(H) and b𝔽qnb\in\mathbb{F}_{q}^{n} such that μa,b(𝒮t)11(q1)21q62000tq200qM/4\mu_{a,b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}.

Once we have shown this lemma, we will be able to show that correction at bb can reduce the rejection probability of tests in 𝒞a,b\mathcal{C}_{a,b}; however, in order to show that the tester is optimal, we need to fix a larger fraction of tests with a single correction. Specifically, we need to be able to fix tests such that T(a)=bT(a)=b for any asupp(H)a\in\operatorname{supp}(H). In order for such an argument to work, we will have to show that 𝒮t\mathcal{S}_{t} is dense in asupp(H)𝒞a,b\bigcup_{a\in\operatorname{supp}(H)}\mathcal{C}_{a,b}.

Lemma 5.2.

If there exists an a\in\operatorname{supp}(H) such that \mu_{a,b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}, then \mathcal{S}_{t} has density at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-\frac{2}{q^{100}(q-1)}-2000tq^{200}q^{-M/4}-q^{-M/2} in \cup_{a\in\operatorname{supp}(H)}\mathcal{C}_{a,b}.

In words, \cup_{a\in\operatorname{supp}(H)}\mathcal{C}_{a,b} is the set of (s+t)-affine transformations T such that Ta=b for some a\in\operatorname{supp}(H). In the next two subsections we prove Lemmas 5.1 and 5.2. After showing these lemmas, the way to perform corrections will become obvious and we quickly conclude the proof of Theorem 1.1. Both lemmas are proved in a similar manner, and we first present a general lemma that will be used in both proofs. Let g:\mathbb{F}_{q}^{\ell+100}\xrightarrow[]{}\mathbb{F}_{q} be an arbitrary polynomial, let d be a degree parameter, let \nu denote the uniform measure over \ell-flats in \mathbb{F}_{q}^{\ell+100}, and let b\in\mathbb{F}_{q}^{\ell+100} be an arbitrary point. Define the following two sets:

\mathcal{A}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\in U\},
\mathcal{B}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\notin U\}.

Keep \varepsilon and M (the large absolute constant) as we have defined them, so that q^{M/2}O(\varepsilon)<q^{-M/2} is small. We will use the following result, which is an extension of the results in Section 3.2 of [31]:

Lemma 5.3.

Keep g,d,b,g,d,b, and \mathcal{B} as defined above and suppose max(d+1qq/p,4)\ell\geqslant\max(\lceil\frac{d+1}{q-q/p}\rceil,4). If ν()qM/2O(ε)\nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon) then =\mathcal{B}=\emptyset. Moreover there is a value γ\gamma such that after changing g(b)g(b) to γ\gamma, ν(𝒜)=0\nu(\mathcal{A})=0.

As a consequence of these two points, deg(g)d\deg(g)\leqslant d after changing the value of g(b)g(b) to γ\gamma.

Proof.

The proof is deferred to Section C. ∎

5.1 Proof of Lemma 5.1

To establish Lemma 5.1 we show that 𝒮t\mathcal{S}_{t} is pseudorandom with respect to zoom-ins outside of the support, hence the zoom-in found in (3) must be in the support of HH. Specifically, we show:

Lemma 5.4.

For any asupp(H)a\notin\operatorname{supp}(H) and any b𝔽qnb\in\mathbb{F}_{q}^{n}, μa,b(𝒮t)<11(q1)21q62000tq200qM/4\mu_{a,b}(\mathcal{S}_{t})<1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}.

We begin the proof of Lemma 5.4 by assuming for the sake of contradiction that the lemma is false. That is, suppose there is a\notin\operatorname{supp}(H) such that \mu_{a,b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}. Since a\notin\operatorname{supp}(H), there are p consecutive coordinates (a_{pi+1},\ldots,a_{pi+p})\notin\operatorname{supp}(P) for some 0\leqslant i<s/p. It will be convenient to swap the order of the coordinates so that these are the last p coordinates. Thus, the polynomial H_{e}(x) for e\in\{0,\ldots,q-1\}^{t} is now

H_{e}(x_{1},\ldots,x_{s+t})=\left(\prod_{i=1}^{s/p-1}P(x_{p(i-1)+1},\ldots,x_{pi})\right)x_{s-p+1}^{e_{1}}\cdots x_{s-p+t}^{e_{t}}\cdot P(x_{s+t-p+1},\ldots,x_{s+t}).

The test with affine transformation TT checks if fT,He=0\langle f\circ T,H_{e}\rangle=0 for all valid ee with HeH_{e} as defined above. Reordering the variables in this way changes the support, so that our assumption asupp(H)a\notin\operatorname{supp}(H) becomes, without loss of generality, (as+tp+1,,as+t)supp(P)(a_{s+t-p+1},\ldots,a_{s+t})\notin\operatorname{supp}(P), which will make our notation later a bit simpler (this is the only reason for reordering the variables). We will once again sample a larger affine transformation and choose a restriction to obtain a T𝒯n,s+tT\in\mathcal{T}_{n,s+t}, however the restrictions this time will be slightly different. We will require the restrictions to be full rank and we will fix the first s+tps+t-p columns/coordinates of TT as opposed to the first ss as before. Let Ress+tp,p+102,p\operatorname{Res}_{s+t-p,p+102,p} consist of affine transformations (R,b)(R,b) of the form:

R=[Is+tp00R],b=[0b],R=\begin{bmatrix}I_{s+t-p}&0\\ 0&R^{\prime}\end{bmatrix},b=\begin{bmatrix}0\\ b^{\prime}\\ \end{bmatrix}, (4)

where R𝔽q(p+102)×pR^{\prime}\in\mathbb{F}_{q}^{(p+102)\times p} is full rank and b𝔽qp+102b^{\prime}\in\mathbb{F}_{q}^{p+102}. For B=(R,b)Ress+tp,p+102,pB=(R,b)\in\operatorname{Res}_{s+t-p,p+102,p} of this form, let fl(B)=b+Im(R)\operatorname{fl}(B)=b^{\prime}+\operatorname{Im}(R^{\prime}). Also, for an affine transformation TT, define Ima(T)={Tx:x[s+tp]=a[s+tp]}\operatorname{Im}_{a}(T)=\{Tx\;:\;x_{[s+t-p]}=a_{[s+t-p]}\}, where recall a[s+tp]a_{[s+t-p]} is aa with all coordinates outside of the first s+tps+t-p set to zero. Sample a T𝒯n,s+tT\in\mathcal{T}_{n,s+t} as follows:

  1. 1.

    Choose a full rank A𝒯n,s+t+102A\in\mathcal{T}_{n,s+t+102} such that bIma(A)b\in\operatorname{Im}_{a}(A).

  2. 2.

    Choose BRess+tp,p+102,pB\in\operatorname{Res}_{s+t-p,p+102,p}.

  3. 3.

    Output T=ABT=A\circ B.

After A is chosen, the first s+t-p columns/coordinates of A’s linear part/affine shift are fixed, while the remaining parts are composed with some random restriction. Thus, once A is fixed there is a unique x^{\star} such that Ax^{\star}=b, and we can only have Ta=b if Ba=x^{\star}. In particular, this only happens if x^{\star}\in\operatorname{Im}(B), where by design this x^{\star} satisfies x^{\star}_{[s+t-p]}=a_{[s+t-p]}. Setting z^{\star}\in\mathbb{F}_{q}^{p+102} to be the last p+102 coordinates of x^{\star}, it follows that Ta=b only if z^{\star}\in\operatorname{fl}(B). Let \mu_{A} denote the measure in \operatorname{Res}_{s+t-p,p+102,p}. Define

A={B|AB rejects,zfl(B)}.\mathcal{R}_{A}=\{B\;|\;A\circ B\text{ rejects},z^{\star}\notin\operatorname{fl}(B)\}.

If, after choosing A and constructing z^{\star} as above, we then condition on z^{\star}\notin\operatorname{fl}(B), then A\circ B is uniformly random over full-rank transformations in \mathcal{T}_{n,s+t} whose image does not contain b. Therefore, \mathop{\mathbb{E}}_{A}[\mu_{A}(\mathcal{R}_{A})]\leqslant O(\varepsilon), and with probability at least 1/2 we have \mu_{A}(\mathcal{R}_{A})\leqslant O(\varepsilon).

On the other hand, if after choosing AA we choose BB uniformly such that AB𝒞a,bA\circ B\in\mathcal{C}_{a,b}, the distribution over TT is uniform over full rank T𝒞a,bT\in\mathcal{C}_{a,b}. By Remark 3.2, the fraction of affine transformations in 𝒞a,b\mathcal{C}_{a,b} that are not full rank is at most 1q100(q1)\frac{1}{q^{100}(q-1)} and by our assumption the set of rejecting transformations is dense in 𝒞a,b\mathcal{C}_{a,b}. Thus a simple averaging argument shows that the fraction of full rank T𝒞a,bT\in\mathcal{C}_{a,b} that reject is also large, and in particular, strictly greater than 1/21/2. Therefore with probability strictly greater than 1/21/2 over AA, there is at least one BB such that ABA\circ B is full rank and rejects.

It follows that there exists a full rank A𝒯n,s+t+102A\in\mathcal{T}_{n,s+t+102} such that the following two hold:

  • μA(A)O(ε)\mu_{A}(\mathcal{R}_{A})\leqslant O(\varepsilon),

  • There exists a BB^{\star} such that T=AB𝒞a,bT=A\circ B^{\star}\in\mathcal{C}_{a,b} is full rank and rejects.

Fix this A and keep x^{\star}, z^{\star}, and \mathcal{R}_{A} as defined. We now show how this leads to a contradiction. The idea is to apply Lemma 5.3 and argue that there is a point at which we can make a correction and cause A\circ B to be accepted for all possible B, and in particular for B^{\star}. Since we assumed that there was high rejection probability on a zoom-in outside the support, we will show that the correction is made at a point not looked at by the test A\circ B^{\star}. These two facts together form a contradiction because changing f at a point that is not in the set of points (A\circ B^{\star})(\alpha) for \alpha\in\operatorname{supp}(H) cannot change the result of the test A\circ B^{\star}. In order to apply Lemma 5.3, however, we need a statement similar to \mu_{A}(\mathcal{R}_{A})\leqslant O(\varepsilon), but for a set of flats instead of a set of affine transformations. To this end, we argue that the set of \operatorname{fl}(B) for B\in\mathcal{R}_{A} is small as well.

We proceed with the formal argument. The first step is to define an auxiliary polynomial similar to that of Lemma 3.8. Suppose TT rejects because fT,He0\langle f\circ T,H_{e}\rangle\neq 0 for a fixed valid ee. Using this ee, define f~:𝔽qp+102𝔽q\tilde{f}:\mathbb{F}_{q}^{p+102}\xrightarrow[]{}\mathbb{F}_{q} by

f~(β)=α𝔽qsp+tfA(α1,,αsp+t,β)(i=1s/p1P(αp(i1)+1,,αpi))αsp+1e1αsp+tet.\tilde{f}(\beta)=\sum_{\alpha\in\mathbb{F}_{q}^{s-p+t}}f\circ A(\alpha_{1},\ldots,\alpha_{s-p+t},\beta)\left(\prod_{i=1}^{s/p-1}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi})\right)\alpha_{s-p+1}^{e_{1}}\cdots\alpha_{s-p+t}^{e_{t}}.
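Purely as an illustration of this definition (and under the same caveats as the earlier sketch: q prime, the constant 102 replaceable by a small stand-in, names ours), \tilde{f} can be evaluated by brute force once f\circ A and P are represented as Python callables:

import itertools

def aux_value(fA, P, e, q, s, p, t, beta):
    # \tilde f(beta): sum over alpha in F_q^{s-p+t} of
    #   fA(alpha, beta) * prod_{i=1}^{s/p-1} P(alpha on the i-th block of p coordinates)
    #                   * alpha_{s-p+1}^{e_1} * ... * alpha_{s-p+t}^{e_t},
    # where fA stands for f o A and beta is a tuple of length p + 102 (or a smaller stand-in).
    total = 0
    for alpha in itertools.product(range(q), repeat=s - p + t):
        w = 1
        for i in range(s // p - 1):
            w = (w * P(alpha[p * i:p * (i + 1)])) % q
        for j in range(t):
            w = (w * pow(alpha[s - p + j], e[j], q)) % q
        total = (total + fA(alpha + tuple(beta)) * w) % q
    return total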

For any B=(R,b)Ress+tp,p+102,pB=(R,b)\in\operatorname{Res}_{s+t-p,p+102,p} written according to Equation (4) it is easy to check that

fAB,He=f~(R,b),P,\langle f\circ A\circ B,H_{e}\rangle=\langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\rangle,

so after fixing AA we can view the test as being performed on f~\tilde{f} with the transformation (R,b)𝒯p+102,p(R^{\prime},b^{\prime})\in\mathcal{T}_{p+102,p}. By Theorem 3.4, if deg(f~)<q(p1)\deg(\tilde{f})<q(p-1), then f~(R,b),P=0\langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\rangle=0 for all B=(R,b)𝒯p+102,pB^{\prime}=(R^{\prime},b^{\prime})\in\mathcal{T}_{p+102,p}. This leads to the following fact:

B can only reject if deg(f~|fl(B))q(p1).B\text{ can only reject if }\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1).

This fact is similar to Lemma 3.8; however, we only have one direction. Namely, it is not true that B always rejects if \deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1), because the test on \tilde{f} only checks inner products with P and not with all monomials of degree up to q-p.

Using this fact we can now relate μA(A)\mu_{A}(\mathcal{R}_{A}) to the measure of the set of pp-flats fl(B)\operatorname{fl}(B) not containing zz^{\star} such that deg(f~|fl(B))q(p1)\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1). Define the set of pp-flats,

\mathcal{B}_{A}=\{\operatorname{fl}(B)\;|\;\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1),\ z^{\star}\notin\operatorname{fl}(B)\}\subseteq{\sf AffGras}(p+102,p).

Equivalently, \mathcal{B}_{A} is the set of all p-flats U not containing z^{\star} such that \deg(\tilde{f}|_{U})\geqslant q(p-1).

To analyze the fractional size of this set, we can choose (R,b)(R^{\prime},b^{\prime}) by choosing a pp-flat, and then choosing a basis for the flat. More formally, with AA fixed,

  1. 1.

    Choose BResp+102,pB\in\operatorname{Res}_{p+102,p}^{\prime} such that zfl(B)z^{\star}\notin\operatorname{fl}(B), and write BB according to Equation (4).

  2. 2.

    Choose T𝒯p,pT\in\mathcal{T}_{p,p}, and replace (R,b)(R^{\prime},b^{\prime}) with (R,b)T(R^{\prime},b^{\prime})\circ T. Let the resulting restriction be BResp+102,pB^{\prime}\in\operatorname{Res}^{\prime}_{p+102,p}.

  3. 3.

    Output BB^{\prime}.

We claim that if initially deg(f~|fl(B))q(p1)\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1), then with probability at least 1/q1/q, the outputted BB^{\prime} rejects.

Lemma 5.5.

Suppose \deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1). Then with probability at least 1/q over T\in\mathcal{T}_{p,p} in the second step, B^{\prime} rejects. That is, \langle\tilde{f}\circ B^{\prime},P\rangle\neq 0.

Proof.

Write BB in the form of Equation (4). Then the assumption deg(f~|fl(B))q(p1)\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1) is equivalent to deg(f~(R,b))q(p1)\deg(\tilde{f}\circ(R^{\prime},b^{\prime}))\geqslant q(p-1).

Choose a full rank TT uniformly at random and consider

f~(R,b)T,P=f~(R,b),PT1.\langle\tilde{f}\circ(R^{\prime},b^{\prime})\circ T,P\rangle=\langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\circ T^{-1}\rangle.

Since \deg(\tilde{f}\circ(R^{\prime},b^{\prime}))\geqslant q(p-1), Lemma B.5 implies that there is at least one choice of invertible T, and hence of T^{-1}, that makes the above nonzero. Therefore, we can view \langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\circ T^{-1}\rangle as a nonzero polynomial in the entries of T^{-1}. Since the degree of P is at most q-p, this polynomial has total degree at most q-p, so by the Schwartz-Zippel Lemma, \langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\circ T^{-1}\rangle\neq 0 with probability at least p/q over the entries of T^{-1}. Since T^{-1} has to be invertible, we must ignore the choices of entries that do not yield an invertible matrix, but these constitute at most a (p-1)/q fraction. Overall, we still get that with probability at least 1/q over T\in\mathcal{T}_{p,p}, the transformation (R^{\prime},b^{\prime})\circ T rejects \tilde{f}. ∎
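As an illustrative sanity check of the Schwartz-Zippel step above (not of the lemma itself), the following toy computation verifies, for one concrete polynomial of our choosing over a prime field, that the fraction of zeros is at most degree/q:

import itertools

q, D = 7, 3
poly = lambda x, y: (x * x * y + 3 * x + 1) % q        # a nonzero polynomial of total degree D = 3
points = list(itertools.product(range(q), repeat=2))
zero_fraction = sum(poly(x, y) == 0 for x, y in points) / len(points)
assert zero_fraction <= D / q                           # the Schwartz-Zippel guarantee
print(zero_fraction, D / q)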

Letting \nu_{A} denote the uniform measure on {\sf AffGras}(p+102,p), Lemma 5.5 implies that \nu_{A}(\mathcal{B}_{A})\leqslant q\mu_{A}(\mathcal{R}_{A}). We are now close to being able to apply Lemma 5.3, but \mathcal{B}_{A} is a set of p-flats, while Lemma 5.3 only applies to sets of flats of dimension at least 4. Thus, in the cases p=2,3 we cannot use this lemma directly. There is an easy fix, however, which follows by looking at the upper shadow of \mathcal{B}_{A}.

Lemma 5.6.

Let \mathcal{B}^{\prime}_{A}=\{U^{\prime}\;|\;\dim(U^{\prime})=p+2,\ z^{\star}\notin U^{\prime},\ \deg(\tilde{f}|_{U^{\prime}})\geqslant q(p-1)\}. Then,

ν(A)q2νA(A),\nu(\mathcal{B}^{\prime}_{A})\leqslant q^{2}\nu_{A}(\mathcal{B}_{A}),

where ν\nu is uniform measure in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(p+102,p+2){\sf AffGras}(p+102,p+2).

Proof.

Observe that \mathcal{B}^{\prime}_{A}\subseteq\mathcal{B}_{A}^{\uparrow^{2}}. Indeed, for any U^{\prime}\in\mathcal{B}^{\prime}_{A}, there must be a p-flat U\subseteq U^{\prime} such that \deg(\tilde{f}|_{U})\geqslant q(p-1). Since z^{\star}\notin U^{\prime}, it follows that z^{\star}\notin U and thus U\in\mathcal{B}_{A}. The result then follows from applying Lemma 3.10 twice. ∎

We now wrap up the proof by applying Lemma 5.3 and obtaining a contradiction.

Proof of Lemma 5.4.

By Lemma 5.6, \nu(\mathcal{B}^{\prime}_{A})\leqslant q^{2}\nu_{A}(\mathcal{B}_{A})\leqslant q^{3}O(\varepsilon). We can now apply Lemma 5.3, with \ell=p+2, degree parameter q(p-1)-1, and special point z^{\star}. The conditions of Lemma 5.3 are satisfied since p+2\geqslant\max(\lceil\frac{q(p-1)}{q-q/p}\rceil,4) and q^{3}O(\varepsilon)\leqslant q^{M/2}O(\varepsilon). From Lemma 5.3 it follows that \mathcal{B}^{\prime}_{A} is empty and there is a value \gamma such that after changing \tilde{f}(z^{\star}) to \gamma, we have \deg(\tilde{f}|_{U^{\prime}})\leqslant q(p-1)-1 for every (p+2)-flat U^{\prime} containing z^{\star}, and hence \deg(\tilde{f}|_{U})\leqslant q(p-1)-1 for every p-flat U containing z^{\star} as well. In particular, it must be the case that \langle f\circ A\circ B^{\star},H_{e}\rangle=0, or equivalently, \langle\tilde{f}\circ(R^{\star},b^{\star}),P\rangle=0, where (R^{\star},b^{\star}) is the non-identity part of B^{\star} when written according to (4).

Recall how the point z^{\star} was defined: after A is fixed, x^{\star}\in\mathbb{F}_{q}^{s+t+102} is the unique point such that Ax^{\star}=b, and z^{\star}\in\mathbb{F}_{q}^{p+102} is the last p+102 coordinates of x^{\star}. Since B^{\star} satisfies A\circ B^{\star}\in\mathcal{C}_{a,b}, we have (A\circ B^{\star})(a)=b, and hence B^{\star}a=x^{\star}. Letting a^{\prime} be the last p coordinates of a, it follows that (R^{\star},b^{\star})(a^{\prime})=z^{\star}. However, by assumption a^{\prime}\notin\operatorname{supp}(P), and since (R^{\star},b^{\star}) is full rank, (R^{\star},b^{\star})(\alpha)\neq z^{\star} for any \alpha\in\operatorname{supp}(P). We now see where the contradiction lies. After changing the value of \tilde{f}(z^{\star}), we suddenly have

f~(R,b),P=αsupp(P)f~((R,b)(α))P(α)=0.\langle\tilde{f}\circ(R^{\star},b^{\star}),P\rangle=\sum_{\alpha\in\operatorname{supp}(P)}\tilde{f}\left((R^{\star},b^{\star})(\alpha)\right)\cdot P(\alpha)=0.

However, since we only changed the value of \tilde{f}(z^{\star}), no term in the above summation was changed, so this inner product is still nonzero, which is a contradiction. Hence, the set \mathcal{S}_{t} cannot be dense in any \mathcal{C}_{a,b} where a\notin\operatorname{supp}(H). ∎

5.2 Proof of Lemma 5.2

We have established that there exists an asupp(H)a^{\star}\in\operatorname{supp}(H) and b𝔽qnb\in\mathbb{F}_{q}^{n} such that

μa,b(𝒮t)11(q1)21q62000tq200M/4.\mu_{a^{\star},b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}.

Using this, we will deduce Lemma 5.2, which says that the set 𝒮t\mathcal{S}_{t} is dense in vsupp(H)𝒞v,b\cup_{v\in\operatorname{supp}{(H)}}\mathcal{C}_{v,b}.

Let 𝒰s+t\mathcal{U}_{s+t} denote the set of (s+t)(s+t)-flats UU such that deg(f|U)>d\deg(f|_{U})>d and let ν\nu denote the uniform measure on the set of (s+t)(s+t)-flats. We sample T𝒞a,bT\in\mathcal{C}_{a^{\star},b} by first choosing an (s+t)(s+t)-flat UU containing bb, and then choosing TT whose image is UU conditioned on Ta=bTa^{\star}=b. The point of this procedure is that after choosing UU, the resulting TT can only reject if deg(f|U)>d\deg(f|_{U})>d. Formally,

  1. 1.

    Choose a random (s+t)(s+t)-flat UU containing bb, and a random basis, U=span(u1,,us+t)+u0U=\operatorname{span}(u_{1},\ldots,u_{s+t})+u_{0}. Let T=(M,u0)T^{\prime}=(M^{\prime},u_{0}) where the iith column of MM^{\prime} is uiu_{i}. By assumption T(x)=bT^{\prime}(x)=b for some x𝔽qs+tx\in\mathbb{F}_{q}^{s+t}.

  2. 2.

    Choose an arbitrary matrix B𝒯s+t,s+tB\in\mathcal{T}_{s+t,s+t} such that Ba=xBa^{\star}=x and output T=(MB,u0)T=(M^{\prime}\cdot B,u_{0}).

It is easy to check that this procedure samples uniformly from 𝒞a,b\mathcal{C}_{a^{\star},b} and that as noted, TT can only reject if deg(f|U)>d\deg(f|_{U})>d, which leads to the following observation:

Remark 5.7.
11(q1)21q62000tq200M/4μa,b(𝒮t)νb(𝒰s+t),1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}\leqslant\mu_{a^{\star},b}(\mathcal{S}_{t})\leqslant\nu_{b}(\mathcal{U}_{s+t}),

where νb\nu_{b} denotes density in the zoom-in 𝒟b\mathcal{D}_{b} (on the affine Grassmann graph).

Using this information about 𝒰s+t\mathcal{U}_{s+t}, we now try to apply Lemma 5.3. Sample a full rank Tvsupp(H)𝒞v,bT\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} as follows:

  1. 1.

    Choose an (s+t+100)(s+t+100)-flat VV uniformly at random containing bb.

  2. 2.

    Choose UVU\subseteq V uniformly at random such that bUb\in U.

  3. 3.

    Choose a basis representation U=span(u1,,us+t)+u0U=\operatorname{span}(u_{1},\ldots,u_{s+t})+u_{0} and output T=(M,u0)T=(M,u_{0}) where the iith column of MM is uiu_{i}, conditioned on Tv=bTv=b for some vsupp(H)v\in\operatorname{supp}(H).

After VV is chosen, recall the following two sets,

𝒜V={UV|dim(U)=s+t,deg(f|U)>d,bU},\mathcal{A}_{V}=\{U\subseteq V\;|\;\dim(U)=s+t,\deg(f|_{U})>d,b\in U\},
V={UV|dim(U)=s+t,deg(f|U)>d,bU},\mathcal{B}_{V}=\{U\subseteq V\;|\;\dim(U)=s+t,\deg(f|_{U})>d,b\notin U\},

and let νV\nu_{V} denote measure over s+ts+t-flats contained in VV.

Then \mathop{\mathbb{E}}_{V}[\nu_{V}(\mathcal{A}_{V})]\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}, so with probability at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}, we have \nu_{V}(\mathcal{A}_{V})>0. On the other hand, \mathop{\mathbb{E}}_{V}[\nu_{V}(\mathcal{B}_{V})]=O(\varepsilon), so with probability at least 1-q^{-M/2}, we have \nu_{V}(\mathcal{B}_{V})\leqslant q^{M/2}O(\varepsilon). Altogether, this implies that with probability at least

11(q1)21q62000tq200M/4qM/2,1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}-q^{-M/2},

VV satisfies both νV(𝒜V)>0\nu_{V}(\mathcal{A}_{V})>0 and νV(V)qM/2O(ε)\nu_{V}(\mathcal{B}_{V})\leqslant q^{M/2}O(\varepsilon).

Suppose such a VV is chosen and the above holds. We may now apply Lemma 5.3 to show that any TT that can be chosen in step 33 must now reject. This will establish the desired result, that a random Tvsupp(H)𝒞v,bT\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} rejects with probability close to 11.

By Lemma 5.3 there exists a value \gamma such that after changing f(b) to \gamma, we have \deg(f|_{V})\leqslant d. It must be the case that \gamma\neq f(b) because \nu_{V}(\mathcal{A}_{V})>0. Let f^{\prime} be the function obtained after changing the value of f at b. Then for any T as described above, we must have \langle f^{\prime}\circ T,H_{e}\rangle=0 for every valid e. Since T is full rank, there can only be one point mapped to b, and that point is some v\in\operatorname{supp}(H). Choosing a valid e such that H_{e}(v)\neq 0 (such an e exists since v\in\operatorname{supp}(H)), we have

fT,HefT,He=He(v)(f(b)f(b))0.\langle f^{\prime}\circ T,H_{e}\rangle-\langle f\circ T,H_{e}\rangle=H_{e}(v)(f(b)-f^{\prime}(b))\neq 0.

Thus, \langle f\circ T,H_{e}\rangle\neq\langle f^{\prime}\circ T,H_{e}\rangle=0 for this e, and in particular, T rejects f.

Proof of Lemma 5.2.

Sampling a full rank T\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} via the procedure above, the previous argument shows that with probability at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}-q^{-M/2} over (s+t+100)-flats V, the outputted T in step 3 rejects. It follows that \mathcal{S}_{t} has density at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}-q^{-M/2} in the set of full rank transformations T such that Tv=b for some v\in\operatorname{supp}(H). Since non-full rank transformations constitute only a \frac{1}{q^{100}(q-1)}-fraction of the transformations in \cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} by Remark 3.2, we subtract out another \frac{2}{q^{100}(q-1)} to obtain the desired result. ∎

5.3 Iterating the Argument

Recall that \operatorname{rej}_{s+t}(f) denotes the probability that a randomly chosen T\in\mathcal{T}_{n,s+t} rejects f. By Lemma 5.3, for at least 9/10 of T\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}, there is a value \gamma such that after changing the value of f(b) to \gamma, T accepts the resulting function. Since there are at most q possible values of \gamma, by averaging there is a single value of \gamma that works for at least a \frac{9}{10q} fraction of such T. Thus, there exists an f^{\prime} such that f^{\prime} is identical to f at all points except b and

\Pr_{T}[T\text{ rejects }f^{\prime}\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\leqslant 1-\frac{9}{10q}.
Proposition 5.8.

If 0<\operatorname{rej}_{s+t}(f)<q^{-M} and t=O(p), then there exists a point b\in\mathbb{F}_{q}^{n} and a function f^{\prime} that is identical to f at all points except b such that

rejs+t(f)rejs+t(f)|supp(H)|qnC(q),\operatorname{rej}_{s+t}(f^{\prime})\leqslant\operatorname{rej}_{s+t}(f)-\frac{|\operatorname{supp}(H)|}{q^{n}C(q)},

where C(q)=O(q)C(q)=O(q).

Proof.

By Lemma 5.2, there exists bb such that

PrT[T rejects f|vsupp(H),T(v)=b]11(q1)21q62q100(q1)2000tq200M/4qM/2.\Pr_{T}[T\text{ rejects }f\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-\frac{2}{q^{100}(q-1)}-2000tq^{200-M/4}-q^{-M/2}.

By the above discussion, there exists an f^{\prime} such that f^{\prime} is identical to f at all points except b and

\Pr_{T}[T\text{ rejects }f^{\prime}\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\leqslant 1-\frac{9}{10q}.

On the other hand, it is clear that for TT such that T(v)bT(v)\neq b for all vsupp(H)v\in\operatorname{supp}(H), the results of the tests on fTf\circ T and fTf^{\prime}\circ T are the same because ff and ff^{\prime} are identical on the points needed to evaluate fT,He\langle f\circ T,H_{e}\rangle and fT,He\langle f^{\prime}\circ T,H_{e}\rangle for any ee. Since |supp(H)|qn\frac{|\operatorname{supp}(H)|}{q^{n}} is the probability that b{T(v)|vsupp(H)}b\in\{T(v)\;|\;v\in\operatorname{supp}(H)\}, a direct calculation yields

\operatorname{rej}_{s+t}(f^{\prime})\leqslant\operatorname{rej}_{s+t}(f)-\frac{|\operatorname{supp}(H)|}{q^{n}}\left(1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-\frac{2}{q^{100}(q-1)}-2000tq^{200-M/4}-q^{-M/2}-\left(1-\frac{9}{10q}\right)\right)\leqslant\operatorname{rej}_{s+t}(f)-\Omega\left(\frac{|\operatorname{supp}(H)|}{q^{n}q}\right).

Finally, we can conclude the proof of Theorem 1.1.

Proof of Theorem 1.1.

By Proposition 5.8, while \operatorname{rej}_{s+t}(f)>0, we can change the value of f at one point and reduce the rejection probability by \Omega\left(\frac{|\operatorname{supp}(H)|}{q^{n}q}\right). When the rejection probability reaches 0, the resulting function must have degree at most d. Therefore,

δd(f)qnO(rejs+t(f)|supp(H)|qnq)\delta_{d}(f)q^{n}\leqslant O\left(\frac{\operatorname{rej}_{s+t}(f)}{|\operatorname{supp}(H)|}q^{n}q\right)

which implies that

rejs+t(f)Ω(|supp(H)|qδd(f)).\operatorname{rej}_{s+t}(f)\geqslant\Omega\left(\frac{|\operatorname{supp}(H)|}{q}\delta_{d}(f)\right).

6 Optimal Testing from other Local Characterizations

When showing that the sparse flat test is optimal, we relied minimally on the structure of HeH_{e}. Thus it is not hard to extend our methods and show optimal testing results for other polynomials that give local characterizations. We will reuse the variables ss and rr in this section, so they no longer refer to their previous definitions. Let P:𝔽qk𝔽qP:\mathbb{F}_{q}^{k}\xrightarrow[]{}\mathbb{F}_{q} be a polynomial of the form

P(x_{1},\ldots,x_{k})=\prod_{i=1}^{s}P_{i}(x_{m(i)},\ldots,x_{m(i+1)-1}).

where m(1)=1, m(s+1)=k+1, and m(i+1)-m(i)\leqslant t^{\prime} for each 1\leqslant i\leqslant s and some small constant t^{\prime}. In words, P is a k-variate polynomial that is the product of polynomials in few variables, where the variable sets of these polynomials are pairwise disjoint. Finally, let \mathcal{M}\subseteq\{\mathbb{F}_{q}^{t}\xrightarrow[]{}\mathbb{F}_{q}\} be an arbitrary nonempty affine-invariant set of polynomials and suppose t=\operatorname{poly}(q). Define

={e{0,,q1}t|i=1txiq1ei}.\mathcal{E}=\{e\in\{0,\ldots,q-1\}^{t}\;|\;\prod_{i=1}^{t}x_{i}^{q-1-e_{i}}\notin\mathcal{M}\}.

Notice that (q1,,q1)(q-1,\ldots,q-1)\notin\mathcal{E}. It is well known that any affine invariant set of polynomials is given by the span of the monomials that appear in at least one polynomial of the family, c.f. [25], and this fact is elaborated on in Appendix B. In combination with Lemma 3.3, it follows that g:𝔽qt𝔽qg:\mathbb{F}_{q}^{t}\xrightarrow[]{}\mathbb{F}_{q} is in \mathcal{M} if and only if g,i=1txiei=0\langle g,\prod_{i=1}^{t}x_{i}^{e_{i}}\rangle=0 for every (e1,,et)(e_{1},\ldots,e_{t})\in\mathcal{E}. In comparison with the description in Section 1.4, {i=1txiei|e}\{\prod_{i=1}^{t}x_{i}^{e_{i}}\;|\;e\in\mathcal{E}\} is an explicit basis for \mathcal{M}^{\perp}.

Using P and monomials with exponent vectors in \mathcal{E}, we can define a test similar to the sparse flat tester. For e\in\mathcal{E}, define

H_{e}(x_{1},\ldots,x_{k+t})=P(x_{1},\ldots,x_{k})\prod_{i=1}^{t}x_{k+i}^{e_{i}}.

Let \mathcal{F}_{n}(H)=\{f:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q}\;|\;\langle f\circ T,H_{e}\rangle=0\text{ for all }T\in\mathcal{T}_{n,k+t}\text{ and all }e\in\mathcal{E}\}. It is clear that \mathcal{F}_{n}(H) is affine invariant, and it comes with the following natural tester:

  1. 1.

    Choose T\in\mathcal{T}_{n,k+t} uniformly at random.

  2. 2.

    If fT,He0\langle f\circ T,H_{e}\rangle\neq 0 for any ee\in\mathcal{E}, reject. Otherwise, accept.
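As an illustration only (it plays no role in the analysis), the tester above can be sketched as follows for tiny parameters. The sketch assumes q is prime so that arithmetic modulo q models \mathbb{F}_{q}, and it represents f, P and the exponent set \mathcal{E} as plain Python objects; all names are ours.

import itertools
import numpy as np

def inner_product(g, h, q, dim):
    # <g, h> = sum over F_q^dim of g(x) * h(x), computed modulo q.
    return sum(g(x) * h(x) for x in itertools.product(range(q), repeat=dim)) % q

def H_e(P, e, k, q):
    # H_e(x_1, ..., x_{k+t}) = P(x_1, ..., x_k) * prod_i x_{k+i}^{e_i}.
    def h(x):
        val = P(x[:k]) % q
        for i, ei in enumerate(e):
            val = (val * pow(x[k + i], ei, q)) % q
        return val
    return h

def tester_rejects(f, P, E, n, k, t, q, rng):
    # One run of the natural tester: choose T in T_{n,k+t} uniformly at random
    # and reject iff <f o T, H_e> != 0 for some e in E.
    M = rng.integers(0, q, size=(n, k + t))
    c = rng.integers(0, q, size=n)
    f_T = lambda x: f(tuple((M @ np.array(x) + c) % q))
    return any(inner_product(f_T, H_e(P, e, k, q), q, k + t) != 0 for e in E)

# Toy usage (all specific choices are ours): q = 3, one factor P(y) = y^2, one dual exponent.
# rng = np.random.default_rng(0)
# tester_rejects(f=lambda x: sum(x) % 3, P=lambda y: pow(y[0], 2, 3),
#                E=[(2,)], n=4, k=1, t=1, q=3, rng=rng)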

Let supp(H)=esupp(He)=supp(P)×𝔽qt\operatorname{supp}(H)=\cup_{e\in\mathcal{E}}\operatorname{supp}(H_{e})=\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}. It is clear that the number of queries made by the tester is Q=|supp(H)|Q=|\operatorname{supp}(H)|, although this value will not be of interest for us. Instead, the goal for the remainder of this section is to show that this tester is optimal. As a consequence, this allows one to obtain lower query complexity optimal testers for general affine invariant codes by constructing local characterizations using PP with sparse support of the prescribed form above, similar to what is done for Generalized Reed-Muller Codes in [33].

Theorem 6.1.

The above tester is optimal for n(H)\mathcal{F}_{n}(H). That is, for any f:𝔽qn𝔽qf:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q},

rej(f)c(q)min(1,Qδ(f,n(H))),\operatorname{rej}(f)\geqslant c(q)\min(1,Q\delta(f,\mathcal{F}_{n}(H))),

where \operatorname{rej}(f) is the probability that the tester rejects f, c(q)=\frac{1}{\operatorname{poly}(q)}, Q=|\operatorname{supp}(H)| is the number of queries that the tester performs, and \delta(f,\mathcal{F}_{n}(H)) is the minimal relative Hamming distance between f and a member of \mathcal{F}_{n}(H).

We will follow the same strategy, first locating a potential error, and then correcting the error and iterating. Let 𝒮\mathcal{S} denote the set of rejecting tests in 𝒯n,k+t\mathcal{T}_{n,k+t} and assume that μ(𝒮)qM\mu(\mathcal{S})\leqslant q^{-M} for some MM such that M>10tM>10t^{\prime} and qMtq^{M}\geqslant t.

Before going into the proof, we introduce a lemma shown in [31]. Recall the definition of lifted affine-invariant codes from the introduction. By design, \mathcal{F}_{n}(H) is a lifted affine-invariant code. Letting \mathcal{F}=\mathcal{F}_{k+t}(H), it is easy to see that

n(H)=𝖫𝗂𝖿𝗍n().\mathcal{F}_{n}(H)={\sf Lift}_{n}(\mathcal{F}).

Since we can view n(H)\mathcal{F}_{n}(H) as a lifted code, we can then apply the following lemma from [31].

Lemma 6.1.

Let g:𝔽qk+1𝔽qg:\mathbb{F}_{q}^{k^{\prime}+1}\xrightarrow[]{}\mathbb{F}_{q} be a polynomial such that g𝖫𝗂𝖿𝗍k+1()g\notin{\sf Lift}_{k^{\prime}+1}(\mathcal{F}), and suppose that kkk^{\prime}\geqslant k. Then,

PrU[g|U𝖫𝗂𝖿𝗍k()]1q,\Pr_{U}[g|_{U}\notin{\sf Lift}_{k^{\prime}}(\mathcal{F})]\geqslant\frac{1}{q},

where the probability is over hyperplanes U\subset\mathbb{F}_{q}^{k^{\prime}+1}. If this inequality is tight and the set of hyperplanes U such that g|_{U}\notin{\sf Lift}_{k^{\prime}}(\mathcal{F}) is of the form

{U|xU},\{U\;|\;x^{\star}\in U\},

for some point x^{\star}, then there is a function g^{\prime} equal to g at all points except x^{\star} such that g^{\prime}\in{\sf Lift}_{k^{\prime}+1}(\mathcal{F}).

This lemma will play the role of Lemma C.2 in this section’s analysis.

Locating a Potential Error

In order to locate an error, we will once again show that 𝒮\mathcal{S} is dense on some zoom-in. As we already assume 𝒮\mathcal{S} has small measure, this step requires us to show that 𝒮\mathcal{S} is poorly expanding and pseudorandom with respect to zoom-outs, zoom-outs on the linear part, and zoom-ins on the linear part.

Lemma 6.2.

The set 𝒮\mathcal{S} has the following properties:

  • μ(𝒮)qM\mu(\mathcal{S})\leqslant q^{-M}

  • Φ(𝒮)11/q\Phi(\mathcal{S})\leqslant 1-1/q

  • 𝒮\mathcal{S} is qμ(𝒮)q\mu(\mathcal{S})-pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

  • 𝒮\mathcal{S} is tq162εtq^{162}\varepsilon-pseudorandom with respect to zoom-ins on the linear part.

where μ\mu is measure in the set 𝒯n,k+t\mathcal{T}_{n,k+t}.

The first item is true by assumption. The second and third items can be shown by the exact same arguments as in Lemmas 4.1 and 4.3 respectively.

The proof of the fourth item also proceeds similarly to the Reed-Muller case, but since this argument was more involved, we review the steps.

Consider a zoom-in on the linear part \mathcal{C}_{a,b,\operatorname{lin}}. We can assume a\neq 0, as otherwise the set is either empty or all of \mathcal{T}_{n,k+t}. We will again have two cases, one where at least one of the last t coordinates of a is nonzero, and another where all of them are zero. We can prove the former case in exactly the same way as Lemma 4.5. For the latter case, recall that by assumption (q-1,\ldots,q-1)\notin\mathcal{E}. Thus, the reduction of the latter case to the former (with a factor of tq loss) works exactly the same as in the proof of Lemma 4.4.

Once again we can sample a transformation by first choosing a full rank A\in\mathcal{T}_{n,k+t+100} and B\in\operatorname{Res}_{k,t+100,t}, and outputting T=A\circ B, where now \operatorname{Res}_{k,t+100,t} is the set of affine transformations (R,b) with

R=[Ik00R],b=[0b],R=\begin{bmatrix}I_{k}&0\\ 0&R^{\prime}\end{bmatrix},b=\begin{bmatrix}0\\ b^{\prime}\\ \end{bmatrix}, (5)

where R^{\prime}\in\mathbb{F}_{q}^{(t+100)\times t} is full rank and b^{\prime}\in\mathbb{F}_{q}^{t+100}. Call (R^{\prime},b^{\prime}) the non-trivial part of B=(R,b) and let \operatorname{fl}(B)=\operatorname{Im}(R^{\prime},b^{\prime}).

Following the same setup as in Section 4.3, we can find A such that the following hold:

  • bIma[k](A)b\in\operatorname{Im}_{a_{[k]}}(A)

  • There exists BResk,t+100,tB\in\operatorname{Res}_{k,t+100,t} such that AB𝒮A\circ B\in\mathcal{S}

  • \Pr_{B}[A\circ B\in\mathcal{S}\;|\;b\notin\operatorname{Im}_{a_{[k]}}(A\circ B)]\leqslant\frac{2}{\alpha}\varepsilon.

Fixing this AA, define

f~(β1,,βt+100)=αfA(α,β1,,βt+100)P(α).\tilde{f}(\beta_{1},\ldots,\beta_{t+100})=\sum_{\alpha}f\circ A(\alpha,\beta_{1},\ldots,\beta_{t+100})P(\alpha).

Then for any restriction BB with B=(R,b)𝒯t+100,tB^{\prime}=(R^{\prime},b^{\prime})\in\mathcal{T}_{t+100,t} as its non-trivial part, and any tt-variate monomial xex^{e}, we have

fAB,He=f~B,xe.\langle f\circ A\circ B,H_{e}\rangle=\langle\tilde{f}\circ B^{\prime},x^{e}\rangle.

It follows that B=(R,b) rejects f if and only if \tilde{f}|_{b^{\prime}+\operatorname{Im}(R^{\prime})}\notin\mathcal{M}.

By construction, Ax^{\star}=b for some x^{\star}\in\mathbb{F}_{q}^{k+t+100} such that x^{\star} is equal to a in its first k coordinates. Then b\in\operatorname{Im}_{a_{[k]}}(A\circ B) is equivalent to z^{\star}\in b^{\prime}+\operatorname{Im}(R^{\prime}), where z^{\star} is the last t+100 coordinates of x^{\star}.

Letting \mathcal{R}_{A}=\{(R^{\prime},b^{\prime})\in\mathcal{T}_{t+100,t}\;|\;\tilde{f}|_{b^{\prime}+\operatorname{Im}(R^{\prime})}\notin\mathcal{M},\ z^{\star}\notin b^{\prime}+\operatorname{Im}(R^{\prime})\}, we can translate the third item above to

\Pr_{B}[A\circ B\in\mathcal{S}\;|\;b\notin\operatorname{Im}_{a_{[k]}}(A\circ B)]=\mu_{A}(\mathcal{R}_{A})\leqslant\frac{2}{\alpha}\varepsilon,

where μA\mu_{A} is measure in Resk,t+100,t\operatorname{Res}_{k,t+100,t}. Define

A={b+Im(R)|(R,b)A}.\mathcal{B}_{A}=\{b^{\prime}+\operatorname{Im}(R^{\prime})\;|\;(R^{\prime},b^{\prime})\in\mathcal{R}_{A}\}.

By the same argument as in Lemma 4.7, \mathcal{B}_{A} is nonempty. Finally, we can use the same proof as that of Lemma 4.5 (except referring to the first part of Lemma 6.1 instead of Lemma 3.10 to bound the sizes of upper shadows) to show that \alpha\leqslant 2e^{-4/q}q^{160}\varepsilon\leqslant q^{161}\varepsilon. This shows that \mathcal{S} is q^{161}\varepsilon-pseudorandom with respect to \mathcal{C}_{a,b,\operatorname{lin}} when one of a's last t coordinates is nonzero. If this is not the case, then since (q-1,\ldots,q-1)\notin\mathcal{E}, we can perform the same reduction as in Lemma 4.4 to show that \mathcal{S} is tq^{162}\varepsilon-pseudorandom with respect to every zoom-in on the linear part.

Altogether, this establishes Lemma 6.2, which allows us to apply Theorem 2.1 with \xi=tq^{162-M} and \ell=k+t. Since we take t\geqslant 10, Theorem 2.1 implies that \mathcal{S} must have density at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4} on some zoom-in \mathcal{C}_{a,b}. Henceforth, fix this zoom-in and call it \mathcal{C}_{a^{\star},b}.

The point aa^{\star} is in the support of HH

We next show that it must be the case that a^{\star}\in\operatorname{supp}(H). We will need the following lemma, which is the analogue of Lemma 5.3 for the current setting. Just as in the setup of Lemma 5.3, let g:\mathbb{F}_{q}^{\ell+100}\xrightarrow[]{}\mathbb{F}_{q} be an arbitrary polynomial, let \nu denote the uniform measure over \ell-flats in \mathbb{F}_{q}^{\ell+100}, and let b\in\mathbb{F}_{q}^{\ell+100} be an arbitrary point. Also, let \mathcal{G}\subseteq\{\mathbb{F}_{q}^{N}\xrightarrow[]{}\mathbb{F}_{q}\} be an arbitrary affine-invariant code.

Define the following two sets:

𝒜={U|dim(U)=,g|U𝖫𝗂𝖿𝗍(𝒢),bU},\mathcal{A}=\{U\;|\;\dim(U)=\ell,g|_{U}\notin{\sf Lift}_{\ell}(\mathcal{G}),b\in U\},
={U|dim(U)=,g|U𝖫𝗂𝖿𝗍(𝒢),bU}.\mathcal{B}=\{U\;|\;\dim(U)=\ell,g|_{U}\notin{\sf Lift}_{\ell}(\mathcal{G}),b\notin U\}.

Keep ε\varepsilon and MM (the large absolute constant) the same as we have defined, so that qM/2O(ε)<qM/2q^{M/2}O(\varepsilon)<q^{-M/2} is small. We will use the following result, which is an extension of the results in Section 3.2 of [31]:

Lemma 6.3.

Keep the notation defined above and suppose \ell\geqslant\max(N,4). If \nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon), then \mathcal{B}=\emptyset. Moreover, there is a value \gamma such that after changing g(b) to \gamma, we have \nu(\mathcal{A})=0.

Proof.

The proof of this lemma is the same as the proof of Lemma 5.3, however we appeal to Lemma 6.1 instead of Lemma C.2. ∎

Following the same steps as in the proof of Lemma 5.4, this lemma almost immediately shows that it must be the case that a^{\star}\in\operatorname{supp}(H). Suppose for the sake of contradiction that a^{\star}\notin\operatorname{supp}(H), and that the block of coordinates (a^{\star}_{m(i)},\ldots,a^{\star}_{m(i+1)-1}) is not in \operatorname{supp}(P_{i}). By assumption, P_{i} is a t^{\prime\prime}-variate polynomial for some t^{\prime\prime}\leqslant t^{\prime}. Let

t′′(Pi)={g:𝔽qt′′𝔽q|gT,Pi=0,T𝒯t′′,t′′}.\mathcal{F}_{t^{\prime\prime}}(P_{i})=\{g:\mathbb{F}_{q}^{t^{\prime\prime}}\xrightarrow[]{}\mathbb{F}_{q}\;|\;\langle g\circ T,P_{i}\rangle=0,\forall T\in\mathcal{T}_{t^{\prime\prime},t^{\prime\prime}}\}.

We can similarly find an auxiliary polynomial \tilde{f}:\mathbb{F}_{q}^{t^{\prime\prime}+102}\xrightarrow[]{}\mathbb{F}_{q} and a special point z^{\star}\in\mathbb{F}_{q}^{t^{\prime\prime}+102} such that the following hold:

  • \mu(\mathcal{R})\leqslant O(\varepsilon), where \mathcal{R}=\{B\in\mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}}\;|\;\langle\tilde{f}\circ B,P_{i}\rangle\neq 0,\ z^{\star}\notin\operatorname{Im}(B)\} and \mu is the measure in \mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}}.

  • There exists a full rank affine transformation B𝒯t′′+102,t′′B^{\star}\in\mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}} such that f~B,Pi0\langle\tilde{f}\circ B^{\star},P_{i}\rangle\neq 0,

  • z{B(v)|vsupp(Pi)}z^{\star}\notin\{B^{\star}(v)\;|\;v\in\operatorname{supp}(P_{i})\}.

Using the first fact, we can also bound the measure of the following set of t′′t^{\prime\prime}-flats,

\mathcal{B}=\{U\in{\sf AffGras}(t^{\prime\prime}+102,t^{\prime\prime})\;|\;\tilde{f}|_{U}\notin\mathcal{F}_{t^{\prime\prime}}(P_{i}),\ z^{\star}\notin U\}.
Lemma 6.4.

Suppose gg is a t′′t^{\prime\prime}-variate polynomial such that gt′′(Pi)g\notin\mathcal{F}_{t^{\prime\prime}}(P_{i}). Then with probability at least 1qt′′\frac{1}{q^{t^{\prime\prime}}} over T𝒯t′′,t′′T\in\mathcal{T}_{t^{\prime\prime},t^{\prime\prime}} we have gT,Pi0\langle g\circ T,P_{i}\rangle\neq 0.

Proof.

Choosing T randomly, we can view \langle g\circ T,P_{i}\rangle as a polynomial in the entries of T. Since g\notin\mathcal{F}_{t^{\prime\prime}}(P_{i}), there must be some choice of T that makes this polynomial's value nonzero, so \langle g\circ T,P_{i}\rangle is a nonzero polynomial in the entries of T. As the individual degrees of g must be at most q-1, it follows from the Schwartz-Zippel Lemma that \langle g\circ T,P_{i}\rangle\neq 0 with probability at least \frac{1}{q^{t^{\prime\prime}}}. ∎

Let ν\nu denote measure in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(t′′+102,t′′){\sf AffGras}(t^{\prime\prime}+102,t^{\prime\prime}). We can choose B𝒯t′′+102,t′′B\in\mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}} such that zIm(B)z^{\star}\notin\operatorname{Im}(B) by first choosing U𝖠𝖿𝖿𝖦𝗋𝖺𝗌(t′′+102,t′′)U\in{\sf AffGras}(t^{\prime\prime}+102,t^{\prime\prime}) not containing zz^{\star} and then BB with image contained in UU. Lemma 6.4 along with the assumption μ()O(ε)\mu(\mathcal{R})\leqslant O(\varepsilon) implies that

1qt′′ν()μ().\frac{1}{q^{t^{\prime\prime}}}\nu(\mathcal{B})\leqslant\mu(\mathcal{R}).

Therefore, we have that \nu(\mathcal{B})\leqslant q^{t^{\prime\prime}}O(\varepsilon), and by the first part of Lemma 6.1, \nu(\mathcal{B}^{\uparrow^{4}})\leqslant q^{t^{\prime\prime}+4}O(\varepsilon)\leqslant q^{M/2}O(\varepsilon). We can then apply Lemma 6.3 with \mathcal{G}=\mathcal{F}_{t^{\prime\prime}}(P_{i}) and special point z^{\star} to get that there is a value \gamma such that after changing \tilde{f}(z^{\star}) to \gamma, we have

f~|U𝖫𝗂𝖿𝗍t′′+4(t′′(Pi)), for all t′′+4-flats U containing z.\tilde{f}|_{U^{\prime}}\in{\sf Lift}_{t^{\prime\prime}+4}(\mathcal{F}_{t^{\prime\prime}}(P_{i})),\text{\quad for all $t^{\prime\prime}+4$-flats $U^{\prime}$ containing $z^{\star}$}.

This also implies that f~|Ut′′(Pi)\tilde{f}|_{U}\in\mathcal{F}_{t^{\prime\prime}}(P_{i}) for all t′′t^{\prime\prime}-flats UzU\ni z^{\star}, but note that changing the value of f~(z)\tilde{f}(z^{\star}) does not affect the value of the test

f~B,Pi0,\langle\tilde{f}\circ B^{\star},P_{i}\rangle\neq 0,

because of the third item above. Thus, we have a contradiction and it follows that asupp(H)a^{\star}\in\operatorname{supp}(H).

Dense on all zoom-ins inside the support of HH

After establishing that 𝒮\mathcal{S} has density at least 11(q1)22000tq200M/41-\frac{1}{(q-1)^{2}}-2000tq^{200-M/4} inside 𝒞a,b\mathcal{C}_{a^{\star},b} and asupp(H)a^{\star}\in\operatorname{supp}(H), we can use the same arguments as in the proof of Lemma 5.2 to show that 𝒮\mathcal{S} must have density at least

11(q1)22q100(q1)2000q200+tM/4qM/2,1-\frac{1}{(q-1)^{2}}-\frac{2}{q^{100}(q-1)}-2000q^{200+t-M/4}-q^{-M/2},

inside \cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}. We can choose T\in\mathcal{C}_{a^{\star},b} by first choosing a (k+t+100)-flat V containing b, then a (k+t)-flat U\subseteq V containing b, and finally outputting T conditioned on \operatorname{Im}(T)=U and Ta^{\star}=b.

Using the same sampling and averaging argument, we get that with probability at least 11(q1)22000q200+tM/4qM/21-\frac{1}{(q-1)^{2}}-2000q^{200+t-M/4}-q^{-M/2} over k+t+100k+t+100-flats VV, the following hold:

  • \nu_{V}(\mathcal{A}_{V})>0, where \mathcal{A}_{V}=\{U\subseteq V\;|\;\dim(U)=k+t,\ f|_{U}\notin\mathcal{F}_{k+t}(H),\ b\in U\},

  • \nu_{V}(\mathcal{B}_{V})\leqslant q^{M/2}O(\varepsilon), where \mathcal{B}_{V}=\{U\subseteq V\;|\;\dim(U)=k+t,\ f|_{U}\notin\mathcal{F}_{k+t}(H),\ b\notin U\}.

Applying Lemma 6.3 with \mathcal{G}=\mathcal{F}_{k+t}(H) and g=f|_{V}, we get that there is a value \gamma such that after changing the value of f(b) to \gamma, we have \nu_{V}(\mathcal{A}_{V})=0. By the first item above, \gamma\neq f(b). Therefore, prior to changing the value of f(b), we must have had \langle f|_{V}\circ B,H_{e}\rangle\neq 0 for some e\in\mathcal{E}, for every full rank B\in\mathcal{T}_{k+t+100,k+t} such that B(v)=b for some v\in\operatorname{supp}(H). After subtracting out the fraction of non-full rank transformations, we can conclude that \mathcal{S} has density at least

11(q1)22q100(q1)2000q200+tM/4qM/2,1-\frac{1}{(q-1)^{2}}-\frac{2}{q^{100}(q-1)}-2000q^{200+t-M/4}-q^{-M/2},

inside vsupp(H)𝒞v,b\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}.

Correcting the Error and Iterating

Finally, we can correct the error and iterate as done in Section 5.3. For at least 9/10 of T\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}, there is a value \gamma such that after changing the value of f(b) to \gamma, T accepts the resulting function. Thus, there exists an f^{\prime} that is identical to f at all points except b and

\Pr_{T}[T\text{ rejects }f^{\prime}\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\leqslant 1-\frac{9}{10q}.

By the same calculation as in Proposition 5.8, it follows that

rej(f)rej(f)|supp(H)|qnC(q),\operatorname{rej}(f^{\prime})\leqslant\operatorname{rej}(f)-\frac{|\operatorname{supp}(H)|}{q^{n}C(q)},

for some C(q)=O(q)C(q)=O(q), and we can conclude

rej(f)Ω(|supp(H)|qδ(f,n(H))).\operatorname{rej}(f)\geqslant\Omega\left(\frac{|\operatorname{supp}(H)|}{q}\delta(f,\mathcal{F}_{n}(H))\right).

References

  • [1] Noga Alon, Tali Kaufman, Michael Krivelevich, Simon Litsyn, and Dana Ron. Testing Reed-Muller codes. IEEE Trans. Inf. Theory, 51(11):4032–4039, 2005.
  • [2] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, 1998.
  • [3] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: A new characterization of NP. J. ACM, 45(1):70–122, 1998.
  • [4] Sanjeev Arora and Madhu Sudan. Improved low-degree testing and its applications. Comb., 23(3):365–426, 2003.
  • [5] Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer. Playing unique games on certified small-set expanders. In STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1629–1642, 2021.
  • [6] Mitali Bafna, Max Hopkins, Tali Kaufman, and Shachar Lovett. High dimensional expanders: Eigenstripping, pseudorandomness, and unique games. In Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, Virtual Conference / Alexandria, VA, USA, January 9 - 12, 2022, pages 1069–1128, 2022.
  • [7] Mitali Bafna, Max Hopkins, Tali Kaufman, and Shachar Lovett. Hypercontractivity on high dimensional expanders. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 185–194, 2022.
  • [8] Boaz Barak, Parikshit Gopalan, Johan Håstad, Raghu Meka, Prasad Raghavendra, and David Steurer. Making the long code shorter. SIAM J. Comput., 44(5):1287–1324, 2015.
  • [9] Boaz Barak, Pravesh Kothari, and David Steurer. Small-Set Expansion in Shortcode Graph and the 2-to-2 Conjecture. In Proceedings of the 10th Innovations in Theoretical Computer Science conference, ITCS 2019, January 10-12, San Diego, CA, USA, 2019.
  • [10] Arnab Bhattacharyya, Swastik Kopparty, Grant Schoenebeck, Madhu Sudan, and David Zuckerman. Optimal testing of Reed-Muller codes. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, October 23-26, Las Vegas, NV, USA, pages 488–497, 2010.
  • [11] Irit Dinur and Venkatesan Guruswami. Pcps via low-degree long code and hardness for constrained hypergraph coloring. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 340–349, 2013.
  • [12] Irit Dinur, Prahladh Harsha, Srikanth Srinivasan, and Girish Varma. Derandomized graph product results using the low degree long code. In 32nd International Symposium on Theoretical Aspects of Computer Science, STACS 2015, March 4-7, 2015, Garching, Germany, pages 275–287, 2015.
  • [13] Irit Dinur, Subhash Khot, Guy Kindler, Dor Minzer, and Muli Safra. On non-optimally expanding sets in Grassmann graphs. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 940–951, 2018.
  • [14] Irit Dinur, Subhash Khot, Guy Kindler, Dor Minzer, and Muli Safra. Towards a proof of the 2-to-1 games conjecture? In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 376–389, 2018.
  • [15] Uriel Feige, Shafi Goldwasser, László Lovász, Shmuel Safra, and Mario Szegedy. Interactive proofs and the hardness of approximating cliques. J. ACM, 43(2):268–292, 1996.
  • [16] William T Gowers. A new proof of szemerédi’s theorem. Geometric & Functional Analysis GAFA, 11(3):465–588, 2001.
  • [17] Alan Guo, Swastik Kopparty, and Madhu Sudan. New affine-invariant codes from lifting. In Proceedings of the 4th Conference on Innovations in Theoretical Computer Science, ITCS ’13, page 529–540, New York, NY, USA, 2013. Association for Computing Machinery.
  • [18] Tom Gur, Noam Lifshitz, and Siqi Liu. Hypercontractivity on high dimensional expanders. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 176–184, 2022.
  • [19] Venkatesan Guruswami, Prahladh Harsha, Johan Håstad, Srikanth Srinivasan, and Girish Varma. Super-polylogarithmic hypergraph coloring hardness via low-degree long codes. SIAM J. Comput., 46(1):132–159, 2017.
  • [20] Elad Haramaty, Noga Ron-Zewi, and Madhu Sudan. Absolutely sound testing of lifted codes. Theory Comput., 11:299–338, 2015.
  • [21] Elad Haramaty, Amir Shpilka, and Madhu Sudan. Optimal testing of multivariate polynomials over small prime fields. SIAM J. Comput., 42(2):536–562, 2013.
  • [22] Charanjit S. Jutla, Anindya C. Patthak, Atri Rudra, and David Zuckerman. Testing low-degree polynomials over prime fields. Random Struct. Algorithms, 35(2):163–193, 2009.
  • [23] Daniel M. Kane and Raghu Meka. A PRG for lipschitz functions of polynomials with applications to sparsest cut. In Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 1–10, 2013.
  • [24] Tali Kaufman and Dana Ron. Testing polynomials over general fields. SIAM J. Comput., 36(3):779–802, 2006.
  • [25] Tali Kaufman and Madhu Sudan. Algebraic property testing: the role of invariance. In Proceedings of the 40th Annual ACM Symposium on Theory of computing, STOC 2008, May 17-20, Victoria, BC, Canada, pages 403–412, 2008.
  • [26] Peter Keevash, Noam Lifshitz, Eoin Long, and Dor Minzer. Hypercontractivity for global functions and sharp thresholds. arXiv preprint arXiv:1906.05568, 2019.
  • [27] Peter Keevash, Noam Lifshitz, Eoin Long, and Dor Minzer. Forbidden intersections for codes. arXiv preprint arXiv:2103.05050, 2021.
  • [28] Peter Keevash, Noam Lifshitz, and Dor Minzer. On the largest product-free subsets of the alternating groups. arXiv preprint arXiv:2205.15191, 2022.
  • [29] Subhash Khot, Dor Minzer, and Muli Safra. On independent sets, 2-to-2 games, and Grassmann graphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 576–589, 2017.
  • [30] Subhash Khot, Dor Minzer, and Muli Safra. Pseudorandom sets in Grassman graph have near-perfect expansion. In Proceedings of the 59th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 592–601, 2018.
  • [31] Dor Minzer and Tali Kaufman. Improved Optimal Testing Results from Global Hypercontractivity. In Proceedings of the 63rd Annual IEEE Symposium on Foundations of Computer Science, FOCS 2022, October 31 - November 3, Denver, CO, USA, 2022.
  • [32] Ran Raz and Shmuel Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In Proceedings of the 29th Annual ACM SIGACT Symposium on Theory of Computing, STOC 1997, El Paso, TX, USA, May 4-6, 1997, pages 475–484, 1997.
  • [33] Noga Ron-Zewi and Madhu Sudan. A New Upper Bound on the Query Complexity of Testing Generalized Reed-Muller Codes. Theory of Computing, 9(25):783–807, 2013.
  • [34] Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM J. Comput., 25(2):252–271, 1996.

Appendix A Proof of Theorem 2.1

We now prove Theorem 2.1. First we will need to show the following result, which is a slight variation of [31, Theorem 2.4].

Theorem A.1.

Let 𝒜𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,)\mathcal{A}\subseteq{\sf AffGras}(n+\ell,\ell) satisfy

  1. 1.

    μ(𝒜)ξ\mu(\mathcal{A})\leqslant\xi,

  2. 2.

    𝒜\mathcal{A} is ξ\xi-pseudorandom with respect to hyperplanes, hyperplanes on its linear part, and points on its linear part.

  3. 3.

    1Φ(𝒜)1qδ1-\Phi(\mathcal{A})\geqslant\frac{1}{q}-\delta.

Then there exists a point x𝔽qnx\in\mathbb{F}_{q}^{n} such that μ(𝒜x)1q2(4q+867ξ1/4+δq1)\mu(\mathcal{A}_{x})\geqslant 1-q^{2}\left(4q^{-\ell}+867\xi^{1/4}+\frac{\delta}{q-1}\right).

Measure, expansion, and pseudorandomness are all with respect to {\sf AffGras}(n+\ell,\ell), and \mu(\mathcal{A}_{x}) denotes the fractional size of \mathcal{A} inside the set of flats in {\sf AffGras}(n+\ell,\ell) that contain the point x.

To show that an analogous result holds in the affine bi-linear scheme graph, we retrace the steps in [9] to establish a connection to the affine Grassmann graph and then apply Theorem A.1. To this end, define the following injective map \varphi:\mathcal{T}_{n,\ell}\xrightarrow{}\mathcal{V}_{n+\ell,\ell}. Let \{e_{1},\ldots,e_{\ell}\}\subseteq\mathbb{F}_{q}^{\ell} be the canonical basis. For two column vectors v and w of lengths \ell_{1} and \ell_{2} respectively, we denote by (v,w)^{T} the length \ell_{1}+\ell_{2} column vector obtained by vertically concatenating v and w. Likewise, let [v^{T},w^{T}] be the length \ell_{1}+\ell_{2} row vector obtained by horizontally concatenating v^{T} and w^{T}.

For an affine transformation (M,c)\in\mathcal{T}_{n,\ell}, where M has columns v_{1},\ldots,v_{\ell}, we set \varphi((M,c))=(0,c)^{T}+\operatorname{span}((e_{1},v_{1})^{T},\ldots,(e_{\ell},v_{\ell})^{T}). It is clear that this map is injective. Moreover, the image of \varphi is the set of \ell-flats V\subseteq\mathbb{F}_{q}^{n+\ell} such that the projection of V onto the first \ell coordinates is full rank.
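To make the definition of \varphi concrete, here is a small illustrative sketch (names ours, q prime so that numpy arithmetic modulo q models \mathbb{F}_{q}) that returns the flat \varphi((M,c)) as a base point together with a matrix of spanning directions:

import numpy as np

def phi(M, c, q):
    # phi((M, c)) = (0, c)^T + span((e_1, v_1)^T, ..., (e_l, v_l)^T),
    # where v_1, ..., v_l are the columns of M.  The flat is returned as a base
    # point and a matrix whose columns are the spanning directions (length n + l).
    n, l = M.shape
    base = np.concatenate([np.zeros(l, dtype=np.int64), np.asarray(c, dtype=np.int64)]) % q
    directions = np.vstack([np.eye(l, dtype=np.int64), np.asarray(M, dtype=np.int64)]) % q
    return base, directions

# A point of the flat is (base + directions @ x) % q for x in F_q^l.  Projecting
# such a point onto its first l coordinates recovers x, which is why phi is
# injective and why flats in the image of phi project full-rank onto the first
# l coordinates.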

The map \varphi preserves edges, nearly preserves expansion, and maps the canonical non-expanding sets in {\sf AffBilin}(n,\ell) to their counterparts in {\sf AffGras}(n+\ell,\ell).

Lemma A.2.

If T_{1},T_{2}\in\mathcal{T}_{n,\ell} are adjacent in {\sf AffBilin}(n,\ell), then \varphi(T_{1}),\varphi(T_{2}) are adjacent in {\sf AffGras}(n+\ell,\ell). Conversely, if V_{1},V_{2}\in\mathcal{V}_{n+\ell,\ell} are adjacent in {\sf AffGras}(n+\ell,\ell) and both are in the image of \varphi, then \varphi^{-1}(V_{1}),\varphi^{-1}(V_{2}) are adjacent in {\sf AffBilin}(n,\ell).

Proof.

For the first direction, suppose T_{1} and T_{2} are given by (M_{1},c_{1}) and (M_{2},c_{2}) respectively. Let M^{\prime\prime}_{1} be the (n+\ell)\times\ell matrix obtained by vertically concatenating I_{\ell} and M_{1}, and let c^{\prime\prime}_{1} be the length n+\ell vector obtained by vertically concatenating the length \ell zero vector and c_{1}. Define M^{\prime\prime}_{2} and c^{\prime\prime}_{2} similarly. Then notice that \varphi(T_{1}) is given by the column-image of the affine transformation T^{\prime\prime}_{1}=(M^{\prime\prime}_{1},c^{\prime\prime}_{1}), while \varphi(T_{2}) is given by the column-image of the affine transformation T^{\prime\prime}_{2}=(M^{\prime\prime}_{2},c^{\prime\prime}_{2}). Thus, to show that edges are preserved it suffices to show that the column-images of T^{\prime\prime}_{1} and T^{\prime\prime}_{2} have intersection of dimension \ell-1. Indeed, T^{\prime\prime}_{1}-T^{\prime\prime}_{2}=(M^{\prime\prime}_{1}-M^{\prime\prime}_{2},c^{\prime\prime}_{1}-c^{\prime\prime}_{2}), and since both M^{\prime\prime}_{1}-M^{\prime\prime}_{2} and c^{\prime\prime}_{1}-c^{\prime\prime}_{2} are all zeros in the first \ell rows, it follows that \dim(\ker(T^{\prime\prime}_{1}-T^{\prime\prime}_{2}))=\dim(\ker(T_{1}-T_{2}))=\ell-1. Therefore, \dim(\operatorname{Im}(T^{\prime\prime}_{1}-T^{\prime\prime}_{2}))=1 and \dim(\varphi(T_{1})\cap\varphi(T_{2}))=\ell-1.

In the converse direction, suppose that U,V\in\mathcal{V}_{n+\ell,\ell} with \dim(U\cap V)=\ell-1. Write U=u_{0}+\operatorname{span}(u_{1},\ldots,u_{\ell}) and V=v_{0}+\operatorname{span}(v_{1},\ldots,v_{\ell}) where u_{i}=(e_{i},u^{\prime}_{i})^{T}, v_{i}=(e_{i},v^{\prime}_{i})^{T}, u_{0}=(0,u^{\prime}_{0})^{T}, and v_{0}=(0,v^{\prime}_{0})^{T}. Let M_{1} be the matrix with columns u_{1},\ldots,u_{\ell} and M_{2} be the matrix with columns v_{1},\ldots,v_{\ell}. Then by assumption the affine transformations T_{1}=(M_{1},u_{0}) and T_{2}=(M_{2},v_{0}) have images with intersection of dimension \ell-1. We claim that this implies T_{1} and T_{2} are in fact equal on an (\ell-1)-dimensional affine subspace. Indeed, if T_{1}(a)=T_{2}(b) for a,b\in\mathbb{F}_{q}^{\ell}, then examining the first \ell coordinates implies that a=b. Therefore, T_{1},T_{2} are equal on the (\ell-1)-dimensional preimage of \operatorname{Im}(T_{1})\cap\operatorname{Im}(T_{2}). Finally, since \varphi^{-1}(U) and \varphi^{-1}(V) are given by the projections of T_{1} and T_{2} respectively onto the last n coordinates, \dim(\ker(\varphi^{-1}(U)-\varphi^{-1}(V)))=\ell-1 and the conclusion follows. ∎

Lemma A.3.

Let V1𝒱n+,V_{1}\in\mathcal{V}_{n+\ell,\ell} be in the image of φ\varphi. Then at least 11/q1-1/q fraction of its neighbors are also in the image of φ\varphi.

Proof.

Recall that V is in the image of \varphi if the projection of V onto its first \ell coordinates is full rank. Fix V_{1}\in\mathcal{V}_{n+\ell,\ell} in the image of \varphi. Then a neighbor of V_{1} can be sampled as follows: choose a uniformly random point v_{0}\in V_{1} and \ell linearly independent directions v_{1},\ldots,v_{\ell} so that V_{1}=v_{0}+\operatorname{span}(v_{1},\ldots,v_{\ell}), choose a uniformly random v^{\prime}_{\ell} outside of \operatorname{span}(v_{1},\ldots,v_{\ell}), and set V_{2}=v_{0}+\operatorname{span}(v_{1},\ldots,v_{\ell-1},v^{\prime}_{\ell}). Since V_{1} is in the image of \varphi, the projection of v_{1},\ldots,v_{\ell} to the first \ell coordinates is full rank. Likewise, V_{2} is in the image of \varphi if the projection of v^{\prime}_{\ell} onto the first \ell coordinates is linearly independent of the projections of v_{1},\ldots,v_{\ell-1}. This happens with probability exactly 1-1/q, completing the proof. ∎

Lemma A.4.

If S𝒯n,S\subseteq\mathcal{T}_{n,\ell} satisfies 1Φ(S)=η1-\Phi(S)=\eta, then φ(S)\varphi(S) satisfies 1Φ(φ(S))η(11/q)1-\Phi^{\prime}(\varphi(S))\geqslant\eta(1-1/q), where Φ\Phi and Φ\Phi^{\prime} are expansion in 𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,){\sf AffBilin}(n,\ell) and 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,){\sf AffGras}(n+\ell,\ell) respectively.

Proof.

This is a direct consequence of the previous two lemmas. By Lemma A.3, a 1-1/q fraction of the neighbors of \varphi(S) are also in the image of \varphi, and by the assumption and Lemma A.2, an \eta fraction of these neighbors are also in \varphi(S). The result follows. ∎

Lemma A.5.

The map φ\varphi is a bijection between the following pairs of sets:

  • 𝒞a,b\mathcal{C}_{a,b} and 𝒟vIm(φ)\mathcal{D}_{v}\cap\operatorname{Im}(\varphi) for v=[a,b]Tv=[a,b]^{T}.

  • 𝒞a,b,lin\mathcal{C}_{a,b,\operatorname{lin}} and 𝒟v,linIm(φ)\mathcal{D}_{v,\operatorname{lin}}\cap\operatorname{Im}(\varphi) for v=[a,b]Tv=[a,b]^{T}.

  • 𝒞aT,b,β\mathcal{C}_{a^{T},b,\beta} and 𝒟WIm(φ)\mathcal{D}_{W}\cap\operatorname{Im}(\varphi) for WW given by [bT,aT],x=β\langle[-b^{T},a^{T}],x\rangle=\beta.

  • 𝒞aT,b,β,lin\mathcal{C}_{a^{T},b,\beta,\operatorname{lin}} and 𝒟W,linIm(φ)\mathcal{D}_{W,\operatorname{lin}}\cap\operatorname{Im}(\varphi) for WW given by [bT,aT],x=β\langle[-b^{T},a^{T}],x\rangle=\beta.

Proof.

We show the zoom-in and zoom-out cases. The other two are analogous.

zoom-ins: Take an arbitrary T=(M,c)\in\mathcal{C}_{a,b} and let M^{\prime}=[I_{\ell},M]^{T} and c^{\prime}=[0,c]^{T}. It is easy to check that M^{\prime}a+c^{\prime}=[a,b]^{T}. Since T is arbitrary, this shows the first half of the bijection.

To complete the proof of this case, we must show that every flat in \mathcal{D}_{v}\cap\operatorname{Im}(\varphi) is mapped to. Take such a flat U. Since U\in\operatorname{Im}(\varphi), we can write U=c^{\prime}+\operatorname{span}(u_{1},\ldots,u_{\ell}), where the first \ell coordinates of u_{i} are equal to e_{i} and the first \ell coordinates of c^{\prime} are all 0. There is a unique way of writing v as c^{\prime} plus a linear combination of the u_{i}'s, and looking at the first \ell coordinates it must be the case that v=c^{\prime}+a_{1}u_{1}+\cdots+a_{\ell}u_{\ell}, where a_{i} is the ith coordinate of a. Let M be the matrix whose ith column is given by the last n entries of u_{i} and let c be the last n entries of c^{\prime}. It follows that U=\varphi((M,c)) and that Ma+c=b.

zoom-outs: Take an arbitrary T=(M,c)\in\mathcal{C}_{a^{T},b,\beta} and suppose M has columns v_{1},\ldots,v_{\ell}. Then \varphi(T)=c^{\prime}+\operatorname{span}(v^{\prime}_{1},\ldots,v^{\prime}_{\ell}) where c^{\prime}=[0,c]^{T} and v^{\prime}_{i}=(e_{i},v_{i})^{T} for 1\leqslant i\leqslant\ell. Let h=[-b^{T},a^{T}] be the length \ell+n row vector whose first \ell entries are -b^{T} and last n entries are a^{T}. By construction, \langle h,v^{\prime}_{i}\rangle=0 and \langle h,c^{\prime}\rangle=\beta. Therefore, \varphi(T) is contained in the hyperplane given by \langle h,x\rangle=\beta. Since T was arbitrary, it follows that \varphi(\mathcal{C}_{a^{T},b,\beta})\subseteq\mathcal{D}_{W}, where W is the hyperplane given by \langle h,x\rangle=\beta.

For the other direction of the bijection, take any z+V𝒟W,linIm(φ)z+V\in\mathcal{D}_{W,\operatorname{lin}}\cap\operatorname{Im}(\varphi), let (M,c)(M,c) be its preimage under φ\varphi, and suppose WW is given by h,x=0\langle h,x\rangle=0 (note that WW must be a linear subspace since it contains VV which is a linear subspace). Take bb to be the negative of the first \ell coordinates of hh and aa to be the last nn coordinates of hh. It follows that aTM=ba^{T}M=b. ∎
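As a minimal illustration of this correspondence (our example, using the zoom-in notation above): take $n=\ell=1$, so a pair $T=(m,c)\in\mathcal{T}_{1,1}$ is the affine map $x\mapsto mx+c$, and

\varphi(T)=\{(x,mx+c):x\in\mathbb{F}_q\}=[0,c]^T+\operatorname{span}([1,m]^T)\subseteq\mathbb{F}_q^2.

Then $T\in\mathcal{C}_{a,b}$ precisely when $ma+c=b$, that is, precisely when the line $\varphi(T)$ passes through the point $v=[a,b]^T$, so $\varphi(T)\in\mathcal{D}_v$; the only line through $v$ missed by $\varphi$ is the vertical one, whose projection to the first coordinate is not full rank.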

With these lemmas, we can obtain the desired characterization of poorly expanding sets in ${\sf AffBilin}(n,\ell)$.

Proof of Theorem 2.1.

Suppose 𝒮\mathcal{S} satisfies the conditions of Theorem 2.1 and let 𝒜=φ(𝒮)𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,)\mathcal{A}=\varphi(\mathcal{S})\subseteq{\sf AffGras}(n+\ell,\ell). Henceforth use Φ\Phi and μ\mu to denote expansion and measure in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,){\sf AffGras}(n+\ell,\ell). By Lemma A.4 we have that 1Φ(𝒜)1q1q21-\Phi(\mathcal{A})\geqslant\frac{1}{q}-\frac{1}{q^{2}}. By Lemma A.5, we have that 𝒜\mathcal{A} is also ξ\xi-pseudorandom with respect to hyperplanes, hyperplanes on its linear part, and points on its linear part. Finally since φ\varphi is an injection, it is clear that μ(𝒜)ξ\mu(\mathcal{A})\leqslant\xi. By Lemma A.1 it follows that there exists a point vv such that

|𝒜𝒟v||𝒟v|1q2(4q+867ξ1/4+1q100(q1)).\frac{|\mathcal{A}\cap\mathcal{D}_{v}|}{|\mathcal{D}_{v}|}\geqslant 1-q^{2}\left(4q^{-\ell}+867\xi^{1/4}+\frac{1}{q^{100}(q-1)}\right).

By Lemma A.5, there is a zoom-in (in the affine bi-linear scheme graph) $\mathcal{C}_{a,b}$ such that $\varphi^{-1}(\mathcal{A}\cap\mathcal{D}_v)=\mathcal{S}\cap\mathcal{C}_{a,b}$, so $|\mathcal{S}\cap\mathcal{C}_{a,b}|=|\mathcal{A}\cap\mathcal{D}_v|$. Finally, $\frac{|\mathcal{C}_{a,b}|}{|\mathcal{D}_v|}$ is the probability that a randomly chosen flat from $\mathcal{D}_v$ is full rank when restricted to its first $\ell$ coordinates. This probability is at most $1-1/q$. Indeed, a flat from $\mathcal{D}_v$ can be chosen by choosing $\ell$ linearly independent vectors $v_1,\ldots,v_\ell$ and taking the flat to be $v+\operatorname{span}(v_1,\ldots,v_\ell)$. Conditioned on the first $\ell-1$ vectors being chosen so that their restriction to the first $\ell$ coordinates is linearly independent, there is a $1-1/q$ chance that $v_\ell$ satisfies this property as well. It follows that

|𝒮𝒞a,b||𝒞a,b|qq1(1q2(4q+867ξ1/4+1q100(q1)))11(q1)2q3q1(4q+867ξ1/4).\frac{|\mathcal{S}\cap\mathcal{C}_{a,b}|}{|\mathcal{C}_{a,b}|}\geqslant\frac{q}{q-1}\left(1-q^{2}\left(4q^{-\ell}+867\xi^{1/4}+\frac{1}{q^{100}(q-1)}\right)\right)\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{q^{3}}{q-1}\left(4q^{-\ell}+867\xi^{1/4}\right).

Appendix B Canonical Monomials characterizing RM[n,q,d]\operatorname{RM}[n,q,d]

In this section we include the proof of Theorem 3.4, showing that any polynomial passing the test with probability $1$ has degree at most $d$. The proof is implicit in [33], but as our tester is presented differently we give a proof for completeness.

Write d+1=p(qq/p)+rd+1=\ell\cdot p(q-q/p)+r, where 1rp(qq/p)1\leqslant r\leqslant p(q-q/p), and set s=ps=\ell\cdot p. First, note that our tester detects monomials of the form:

(i=1sxiqq/p)j=1p+2xs+jej,\left(\prod_{i=1}^{s}x_{i}^{q-q/p}\right)\prod_{j=1}^{p+2}x_{s+j}^{e_{j}}, (6)

for any (e1,,et)(e_{1},\ldots,e_{t}) such that j=1tejr\sum_{j=1}^{t}e_{j}\geqslant r. Indeed, for (e1,,et)(e_{1},\ldots,e_{t}) such that j=1tejr\sum_{j=1}^{t}e_{j}\geqslant r, let e=(q1e1,,q1et)e^{\prime}=(q-1-e_{1},\ldots,q-1-e_{t}) and consider the expansion of He(x1,,xs+t)H_{e^{\prime}}(x_{1},\ldots,x_{s+t}). Using Lucas’s Theorem, it is not hard to see that the expansion of He(x1,,xs+t)H_{e^{\prime}}(x_{1},\ldots,x_{s+t}) contains the monomial (i=1sxiq/p1)j=1txs+jq1ej\left(\prod_{i=1}^{s}x_{i}^{q/p-1}\right)\prod_{j=1}^{t}x_{s+j}^{q-1-e_{j}}, and hence by Lemma 3.3, any canonical monomial in (6) is rejected.
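As a small numerical illustration (ours, not taken from [33]): for $q=4$, $p=2$ and $d=13$ we have $q-q/p=2$ and $p(q-q/p)=4$, so $d+1=14=3\cdot 4+2$ gives $\ell=3$, $r=2$ and $s=\ell p=6$. A monomial of the form (6) is then, for instance,

\left(\prod_{i=1}^{6}x_i^{2}\right)x_7x_8,

whose total degree is $6\cdot 2+2=14=d+1$ and whose tail degrees sum to $2\geqslant r$.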

Let 𝒢\mathcal{G} be the family of functions that pass the sparse s+ts+t-flat test with probability 11. In this section we will prove:

Lemma B.1.

If $f$ passes the sparse $(s+t)$-flat test with probability $1$, then $f\in\operatorname{RM}[n,q,d]$. Equivalently, $\mathcal{G}=\operatorname{RM}[n,q,d]$.

One direction of this lemma is clear: if $\deg(f)\leqslant d$ then $f$ passes with probability $1$. The other direction amounts to showing that $\mathcal{G}\subseteq\operatorname{RM}[n,q,d]$. To show this, we argue by contradiction and prove that if $\deg(f)\geqslant d+1$ and $f\in\mathcal{G}$, then one of the canonical monomials in (6) must be in $\mathcal{G}$. This is a contradiction, since those monomials are rejected, and it establishes that $\mathcal{G}\subseteq\operatorname{RM}[n,q,d]$.

Before proceeding to the proof, we will need the following two facts from [25] about affine-invariant families of polynomials. For a family of polynomials \mathcal{F}, let supp()\operatorname{supp}({\mathcal{F}}) denote the set of monomials that appear in at least one of these polynomials. The first fact says that these monomials are a basis of \mathcal{F}.

Lemma B.2 (Monomial Extraction Lemma [25]).

If \mathcal{F} is an affine-invariant family of polynomials then =span(supp())\mathcal{F}=\operatorname{span}(\operatorname{supp}(\mathcal{F})).

The second fact says that the monomials appearing in an affine-invariant family are pp-shadow closed. For two integers a,b{0,,q1}a,b\in\{0,\ldots,q-1\}, let a=i=0k1piaia=\sum_{i=0}^{k-1}p^{i}a_{i} and b=i=0k1pibib=\sum_{i=0}^{k-1}p^{i}b_{i} be their base pp representations (recall q=pkq=p^{k}). Then we say aa is in the pp-shadow of bb if aibia_{i}\leqslant b_{i} for i=0,,k1i=0,\ldots,k-1, and denote this by apba\leqslant_{p}b. Then for two exponent vectors e=(e1,,en)e=(e_{1},\ldots,e_{n}) and e=(e1,,en)e^{\prime}=(e^{\prime}_{1},\ldots,e^{\prime}_{n}), we say epee\leqslant_{p}e^{\prime} if eipeie_{i}\leqslant_{p}e^{\prime}_{i} for every ii. Affine-invariant families of polynomials have the following shadow closed property.
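For a concrete illustration of the $p$-shadow relation (our example): with $p=2$ and $k=3$, so $q=8$, we have $5=(101)_2\leqslant_2 7=(111)_2$, while $3=(011)_2$ is not in the $2$-shadow of $5=(101)_2$. This matches Lucas's Theorem, by which $\binom{b}{a}\not\equiv 0\pmod p$ exactly when $a\leqslant_p b$: indeed,

\binom{7}{5}=21\equiv 1\pmod 2,\qquad \binom{5}{3}=10\equiv 0\pmod 2.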

Lemma B.3.

Let =span(supp())\mathcal{F}=\operatorname{span}(\operatorname{supp}(\mathcal{F})) be an affine invariant family of polynomials. If epee\leqslant_{p}e^{\prime} and esupp()e^{\prime}\in\operatorname{supp}(\mathcal{F}), then esupp()e\in\operatorname{supp}(\mathcal{F}) as well.

Finally, we will need the following which will allow us to go from one monomial in \mathcal{F} to another with the same degree, but with the distribution of the individual degrees shifted.

Lemma B.4.

Suppose $\mathcal{F}$ is an affine-invariant family, $x^e\in\mathcal{F}$, and $m\leqslant_p e_2$. Then the monomial $x^{e'}\in\mathcal{F}$, where $e'=(e_1+m,e_2-m,e_3,\ldots,e_n)$.

Proof.

Let TT be the affine transformation T(x)=(x1,x1+x2,x3,,xn)T(x)=(x_{1},x_{1}+x_{2},x_{3},\ldots,x_{n}). Then,

x^{e}\circ T=x_{1}^{e_{1}}(x_{1}+x_{2})^{e_{2}}\prod_{j=3}^{n}x_{j}^{e_{j}}=\left(\sum_{i=0}^{e_{2}}\binom{e_{2}}{i}x_{1}^{e_{1}+i}x_{2}^{e_{2}-i}\right)\prod_{j=3}^{n}x_{j}^{e_{j}}.

By Lucas's Theorem and the assumption $m\leqslant_p e_2$, $\binom{e_2}{m}\neq 0$ in $\mathbb{F}_q$, and so $x^e\circ T$ contains the monomial $x^{e'}$. Since $\mathcal{F}$ is affine-invariant, $x^e\circ T\in\mathcal{F}$, and the result then follows from Lemma B.2. ∎
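A small worked instance of this argument (ours): take $q=4$, $p=2$, $e=(1,3)$ and $m=2\leqslant_2 3$. Over $\mathbb{F}_4$,

x_1(x_1+x_2)^3=x_1^4+x_1^3x_2+x_1^2x_2^2+x_1x_2^3,

since $\binom{3}{1}=\binom{3}{2}=3$ reduces to $1$ in characteristic $2$, so the monomial $x_1^{1+2}x_2^{3-2}=x_1^3x_2$ indeed appears in $x^e\circ T$.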

With these three lemmas, we are ready to proceed to the proof of Lemma B.1. Supposing for the sake of contradiction that deg(f)>d\deg(f)>d, using the above lemmas, we will show that this implies one of the canonical monomials is in \mathcal{F}. This cannot happen however, as all monomials in (6) are rejected by T=IT=I, the identity.

Proof of Lemma B.1..

Let $\mathcal{F}$ be the family of polynomials that pass the sparse $(s+t)$-flat test with probability $1$ (that is, $\mathcal{F}=\mathcal{G}$). Suppose for the sake of contradiction that $f\in\mathcal{F}$ and $f$ contains a monomial of degree $d'>d$. Let $x^e=\prod_{i=1}^{n}x_i^{e_i}$ be such a monomial and let $\ell$ be the smallest index such that $e_1+\cdots+e_\ell>d$. Then $(e_1,\ldots,e_\ell,0,\ldots,0)\leqslant_p e$, so by Lemma B.3, $\prod_{i=1}^{\ell}x_i^{e_i}\in\mathcal{F}$ and

d+1\leqslant\sum_{i=1}^{\ell}e_{i}\leqslant d+q-1\leqslant s(q-q/p)+r+q-1.

We will show that this monomial will lead to one of the canonical monomials being in \mathcal{F}. First, we claim that there is a monomial xex^{e^{\prime}} such that eiqq/pe^{\prime}_{i}\geqslant q-q/p for i=1,,si=1,\ldots,s. Define

c(e)=i=1smax(0,(qq/p)ei).c(e)=\sum_{i=1}^{s}\max(0,(q-q/p)-e_{i}).

If $c(e)=0$, then $e$ is of the desired form. Otherwise, there is $i\leqslant s$ such that $e_i<q-q/p$, and since the total degree of $x^e$ is at least $d+1\geqslant s(q-q/p)+1$, one of the following must be true:

  • There is j>sj>s such that ej>0e_{j}>0, in which case we simply find some pmpejp^{m}\leqslant_{p}e_{j} and apply Lemma B.4 to obtain the monomial xiei+pmxjejpmx_{i}^{e_{i}+p^{m}}x_{j}^{e_{j}-p^{m}} in place of xieixjejx_{i}^{e_{i}}x_{j}^{e_{j}},

  • There is $j\leqslant s$ such that $e_j>q-q/p$. In this case we can find $p^m\leqslant_p e_j$ such that $e_j-p^m\geqslant q-q/p$, and apply Lemma B.4 to obtain the monomial $x_i^{e_i+p^m}x_j^{e_j-p^m}$ in place of $x_i^{e_i}x_j^{e_j}$.

In either case, we can find another monomial xex^{e^{\prime}} such that xex^{e^{\prime}}\in\mathcal{F} and c(e)<c(e)c(e^{\prime})<c(e). Iterating this process, we must eventually find the desired monomial with c(e)=0c(e)=0.

Now, abusing notation, let $x^e$ be this monomial. We have $d+1\leqslant\sum_{i=1}^{n}e_i\leqslant d+q-1\leqslant s(q-q/p)+r+q-1$ and $e_i\geqslant q-q/p$ for $1\leqslant i\leqslant s$. We can now perform essentially the same argument and move any degree above $q-q/p$ to $e_{s+1},\ldots,e_{s+t}$. There is at most

d+q-1-s(q-q/p)\leqslant r+q-1\leqslant p(q-q/p)+q-1,

degree left over after subtracting, so we can perform the above argument to shift degree until each of the degrees $e_{s+1},\ldots,e_{s+t-1}$ is at least $q-q/p$. This will leave at most $q-1$ degree left over, which we can simply shift to $e_{s+t}$. The resulting monomial is of the form (6) and lies in $\mathcal{F}$, which is a contradiction since such monomials are rejected. ∎
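To illustrate the shifting step with small numbers (a toy example of ours): take $q=4$, $p=2$, so $q-q/p=2$, and suppose $s=2$ and the monomial $x_1^2x_2x_4^3$ lies in $\mathcal{F}$. Then $c(e)=\max(0,2-2)+\max(0,2-1)=1$, and since $e_4=3>0$ and $1\leqslant_2 3$, Lemma B.4 applied to the pair $(x_2,x_4)$ yields the monomial

x_1^2x_2^2x_4^2\in\mathcal{F},

which has the same total degree and satisfies $c=0$; any remaining degree on the later coordinates is then rearranged in the same way.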

Lemma B.5.

Suppose g:𝔽qp𝔽qg:\mathbb{F}_{q}^{p}\xrightarrow[]{}\mathbb{F}_{q} satisfies deg(g)q(p1)\deg(g)\geqslant q(p-1). Then there exist full rank affine transformations T1,T2𝒯p,pT_{1},T_{2}\in\mathcal{T}_{p,p} such that:

  1. 1.

    gT1g\circ T_{1} contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/peiq1q-q/p\leqslant e_{i}\leqslant q-1 for all i[p]i\in[p].

  2. 2.

    gT2,P0\langle g\circ T_{2},P\rangle\neq 0.

Proof.

To show the first part of the lemma, we use a similar idea to the proof of Lemma B.4. Suppose gg contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e^{\prime}_{i}} and suppose mpe2m\leqslant_{p}e_{2}^{\prime}. Consider the full rank affine transformation Tα(x)=(x1,αx1+x2,x3,,xp)T_{\alpha}(x)=(x_{1},\alpha x_{1}+x_{2},x_{3},\ldots,x_{p}) for α𝔽q\alpha\in\mathbb{F}_{q} and the coefficient of x1e1+mx2e2mi=3pxieix_{1}^{e_{1}^{\prime}+m}x_{2}^{e_{2}^{\prime}-m}\prod_{i=3}^{p}x_{i}^{e_{i}^{\prime}} in gTαg\circ T_{\alpha}. It is not hard to see that this coefficient is a nonzero polynomial in α\alpha and therefore there exists an α\alpha such that gTαg\circ T_{\alpha} contains the monomial x1e1+mx2e2mi=3pxieix_{1}^{e_{1}^{\prime}+m}x_{2}^{e_{2}^{\prime}-m}\prod_{i=3}^{p}x_{i}^{e_{i}^{\prime}}. As TαT_{\alpha} is a full rank affine transformation, we can repeatedly apply such transformations to obtain a full rank T1𝒯p,pT_{1}\in\mathcal{T}_{p,p} such that gT1g\circ T_{1} contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/p<eiq1q-q/p<e_{i}\leqslant q-1, proving the first part of the lemma.

Let T1T_{1} be the transformation from part 11 and let g=gT1g^{\prime}=g\circ T_{1}, so that gg^{\prime} contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/p<eiq1q-q/p<e_{i}\leqslant q-1. For a𝔽qpa\in\mathbb{F}_{q}^{p}, let TaT_{a} be the full rank affine transformation such that Ta(x)=(x1+a1,,xp+ap)T_{a}(x)=(x_{1}+a_{1},\ldots,x_{p}+a_{p}) and consider the inner product

gTa,P,\langle g^{\prime}\circ T_{a},P\rangle,

as a polynomial in aa. Recall that PP contains the monomial i=1pxiq/p1\prod_{i=1}^{p}x_{i}^{q/p-1}. Since gg^{\prime} contains the monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/p<eiq1q-q/p<e_{i}\leqslant q-1 there is a contribution of i=1paiei(qq/p)\prod_{i=1}^{p}a_{i}^{e_{i}-(q-q/p)} with nonzero coefficient, and this cannot be cancelled out from any other monomial. Therefore, gTa,P\langle g^{\prime}\circ T_{a},P\rangle is a nonzero polynomial in aa and there exists an a𝔽qpa\in\mathbb{F}_{q}^{p} such that gTa,P0\langle g^{\prime}\circ T_{a},P\rangle\neq 0. This establishes the second part of the lemma. ∎

Appendix C Proof of Lemma 5.3

In this section we provide the proof of Lemma 5.3. This is essentially the same as [31, Proposition 3.5] with a slight modification. Recall that $g:\mathbb{F}_q^{\ell+100}\to\mathbb{F}_q$ is an arbitrary polynomial, $d$ is an arbitrary degree parameter, and $b\in\mathbb{F}_q^{\ell+100}$ is an arbitrary point. We work in ${\sf AffGras}(\ell+100,\ell)$. Use $\nu$ to denote the uniform measure over all $\ell$-flats, and $\nu_b$ to denote the uniform measure over the zoom-in on $b$. When referring to zoom-ins, zoom-ins on the linear part, etc., we are referring to the versions in the affine Grassmann graph. We have the following two sets of $\ell$-flats,

𝒜={U|dim(U)=,deg(g|U)>d,bU},\mathcal{A}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\in U\},
={U|dim(U)=,deg(g|U)>d,bU},\mathcal{B}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\notin U\},

and ν()qM/2O(ε)\nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon). We will show the following weaker form of Lemma 5.3, and then explain why this gives the full Lemma 5.3.

Lemma C.1.

If $\nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon)$ then $\mathcal{B}=\emptyset$. Moreover, there is a value $\gamma$ such that after changing $g(b)$ to $\gamma$, $\nu_b(\mathcal{A})\leqslant 1-\frac{1}{q}$.

Before proving this lemma, we explain why it implies Lemma 5.3. Assuming that Lemma C.1 holds, suppose that gg^{\prime} is the resulting function after changing the value of g(b)g(b) to γ\gamma, and consider the set 𝒰\mathcal{U} of \ell-flats UU such that deg(g|U)>d\deg(g^{\prime}|_{U})>d. Assume for the sake of contradiction that 𝒰\mathcal{U} is nonempty. We record some facts about 𝒰\mathcal{U},

  1. 1.

    𝒰𝒟b={U|dim(U)=,bU}\mathcal{U}\subseteq\mathcal{D}_{b}=\{U\;|\;\dim(U)=\ell,b\in U\}.

  2. 2.

    $\nu(\mathcal{U})\leqslant q^{-100}$.

  3. 3.

    Φ(𝒰)11q\Phi(\mathcal{U})\leqslant 1-\frac{1}{q}.

  4. 4.

    $\mathcal{U}$ is $q^{-99}$ pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

  5. 5.

    $\mathcal{U}$ is $q^{-100}$ pseudorandom with respect to zoom-ins on the linear part.

The first item follows from Lemma C.1: since $\mathcal{B}=\emptyset$ and $g'$ agrees with $g$ outside of $b$, every $\ell$-flat not containing $b$ has $\deg(g'|_U)=\deg(g|_U)\leqslant d$. The second item follows from the first item and from bounding the fraction of $\ell$-flats that contain $b$. The third and fourth items follow from the same arguments as Lemmas 4.2 and 4.3. Finally, the fifth item follows again from the first item and the fact that over any zoom-in on the linear part, a random $\ell$-flat from this set contains the point $b$ with probability at most $q^{-100}$.
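For concreteness, the calculation behind the second item is the following (a routine observation of ours): by symmetry, a uniformly random $\ell$-flat $U$ contains the fixed point $b$ with probability

\Pr_U[b\in U]=\frac{|U|}{|\mathbb{F}_q^{\ell+100}|}=\frac{q^{\ell}}{q^{\ell+100}}=q^{-100},

so $\mathcal{U}\subseteq\mathcal{D}_b$ gives $\nu(\mathcal{U})\leqslant q^{-100}$.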

Applying Theorem A.1 with $\xi=q^{-100}$, we get that $\mathcal{U}$ must have density at least $1-q^2(867q^{-25}+q^{-\ell})>1-1/q$ inside some zoom-in, where we use the fact that $\ell\geqslant 4$. The only zoom-in on which $\mathcal{U}$ can have such high density is the zoom-in on $b$, since every flat of $\mathcal{U}$ contains $b$, and for any point $v\neq b$ the flats through both $v$ and $b$ form at most a $q^{-99}$-fraction of the zoom-in on $v$. However, this contradicts the assumption that $\nu_b(\mathcal{A})\leqslant 1-\frac{1}{q}$ after the value of $g(b)$ is changed. In short, we have shown that, in this setting, if a correction brings the rejection probability within the zoom-in on $b$ down to at most $1-1/q$, then the rejection probability must in fact be $0$.

We now give the proof of Lemma C.1. This proof requires the following result used in both [5, 31], which was referred to as the shadow lemma in the introduction.

Lemma C.2.

Let dd\in\mathbb{N} be a degree parameter and kd+1qq/pk\geqslant\lceil\frac{d+1}{q-q/p}\rceil. Then for any f:𝔽qk+1𝔽qf:\mathbb{F}_{q}^{k+1}\xrightarrow[]{}\mathbb{F}_{q}:

  1. 1.

    If deg(f)>d\deg(f)>d, then εk,d(f)1q\varepsilon_{k,d}(f)\geqslant\frac{1}{q}.

  2. 2.

    If d<deg(f)<(k+1)(q1)d<\deg(f)<(k+1)(q-1) then εk,d(f)>1q\varepsilon_{k,d}(f)>\frac{1}{q}.

Here $\varepsilon_{k,d}(f)$ and $\varepsilon_{k+1,d}(f)$ denote the fraction of $k$-flats and $(k+1)$-flats, respectively, on which the restriction of $f$ has degree greater than $d$.

In [31] this lemma is used to show that the set of rejecting flats is non-expanding, similar to Lemma 4.2 in this paper. The lemma has another use though, which comes from its second item. For our purposes, the second item says that if $f|_V$ has degree greater than $d$, but $\deg(f|_U)>d$ for at most a $1/q$-fraction of the hyperplanes $U\subseteq V$, then $f|_V$ must contain the maximum degree monomial. In this section we will use Lemma C.2 with degree parameter $d$ and $k=\ell$. Note that $\ell\geqslant\lceil\frac{d+1}{q-q/p}\rceil$ by assumption.
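For clarity, we record the contrapositive form in which the second item will be applied below (our restatement): for $f:\mathbb{F}_q^{k+1}\to\mathbb{F}_q$,

\text{if }\deg(f)<(k+1)(q-1)\text{ and }\varepsilon_{k,d}(f)\leqslant\tfrac{1}{q},\text{ then }\deg(f)\leqslant d.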

Our first goal is to show $\mathcal{B}=\emptyset$. Start by taking a uniformly random $(\ell+40)$-flat $W$, conditioned on $b\notin W$. Let

W={BW|B},\mathcal{B}_{W}=\{B\subseteq W\;|\;B\in\mathcal{B}\},

and work in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(W,){\sf AffGras}(W,\ell) — the affine Grassmann graph over the \ell-flats contained in WW.

Let $\mu_W$ and $\Phi_W$ denote measure and expansion respectively in ${\sf AffGras}(W,\ell)$. We claim that either $\mu_W(\mathcal{B}_W)=0$ or $\mu_W(\mathcal{B}_W)\geqslant q^{-100}$. Otherwise, $0<\mu_W(\mathcal{B}_W)<q^{-100}$ and $\mathcal{B}_W$ is $q^{-60}$ pseudorandom with respect to zoom-ins and zoom-ins on the linear part. By a similar argument to Lemma 4.3, $\mathcal{B}_W$ is $q^{-98}$ pseudorandom with respect to zoom-outs and zoom-outs on the linear part, and by a similar argument to Lemma 4.2, $1-\Phi_W(\mathcal{B}_W)\geqslant 1/q$. This contradicts Theorem A.1, however, so we conclude that either $\mu_W(\mathcal{B}_W)=0$ or $\mu_W(\mathcal{B}_W)\geqslant q^{-100}$.

Lemma C.3.

=.\mathcal{B}=\emptyset.

Proof.

Otherwise we may find a $W$ such that $\mu_W(\mathcal{B}_W)\geqslant q^{-100}$. Fix such a $W$ and sample a uniform $(\ell+99)$-flat $Y$ conditioned on $b\notin Y$, and a uniform $(\ell+60)$-flat $A_2\subseteq Y$. Now consider $A_2\cap W$. We may think of $W$ as being defined by a system of $60$ independent linear equations $\langle h_1,x\rangle=c_1,\ldots,\langle h_{60},x\rangle=c_{60}$. That is, $W$ is the set of points of $\mathbb{F}_q^{\ell+100}$ satisfying these $60$ equations. Likewise, within $Y$, $A_2$ is given by $39$ linear equations, $\langle h'_1,x\rangle=c'_1,\ldots,\langle h'_{39},x\rangle=c'_{39}$. The probability that all $99$ linear equations are linearly independent is at least

j=038q99q60+jq99e2j=1qje4/q,\prod_{j=0}^{38}\frac{q^{99}-q^{60+j}}{q^{99}}\geqslant e^{-2\sum_{j=1}^{\infty}q^{-j}}\geqslant e^{-4/q},

where we use that $1-x\geqslant e^{-2x}$ for $0\leqslant x\leqslant\frac{1}{2}$. Conditioned on this event, $A_2\cap W$ is uniform over ${\sf AffGras}(W,\ell)$. Thus,

Pr[A2WW]e4/qq100.\Pr[A_{2}\cap W\in\mathcal{B}_{W}]\geqslant e^{-4/q}q^{-100}.

If A2WWA_{2}\cap W\in\mathcal{B}_{W}, then A2Y60A_{2}\in\mathcal{B}_{Y}^{\uparrow^{60}}, where the upper shadow is taken with respect to 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(Y,){\sf AffGras}(Y,\ell), so it follows that

𝔼Y[μY(Y60)]e4/qq100.\mathop{\mathbb{E}}_{Y}[\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})]\geqslant e^{-4/q}q^{-100}.

On the other hand, by Lemma C.2,

\mathop{\mathbb{E}}_{Y}[\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})]\leqslant q^{60}\mathop{\mathbb{E}}_{Y}[\mu_{Y}(\mathcal{B}_{Y})]\leqslant 2q^{60}\nu(\mathcal{B})\leqslant 2q^{60+M/2}O(\varepsilon).

By assumption ε<qM\varepsilon<q^{-M} so altogether we get that,

q^{-M/2}\geqslant\frac{1}{C(q)q^{160}},

for some constant C(q)C(q). For large enough MM this is a contradiction. ∎

We now complete the proof of Lemma C.1, which in turn proves Lemma 5.3.

Proof of Lemma 5.3..

Fix an $\ell$-flat $U$ that contains $b$. We will show that there exists a value $\gamma$ such that changing the value of $g(b)$ to $\gamma$ results in a function of degree at most $d$ on $U$. Establishing this fact and choosing the most common $\gamma$ over all $\ell$-flats proves Lemma C.1 (the most common value is chosen for at least a $1/q$-fraction of the $\ell$-flats through $b$, so after the change $\nu_b(\mathcal{A})\leqslant 1-\frac{1}{q}$), which in turn establishes Lemma 5.3 by our previous discussion.

Suppose $\deg(g|_U)>d$, as otherwise we are done by setting $\gamma=g(b)$. Pick an $(\ell+1)$-flat $U'\supset U$, let $g'=g|_{U'}$, and let $M(x)=1_{x\neq b}$ on $U'$. Note that $M(x)$ contains the monomial of degree $(\ell+1)(q-1)$, so there is some value $\gamma$ such that $\gamma M(x)+g'(x)$ has degree strictly less than $(\ell+1)(q-1)$. Showing that for this $\gamma$, $\deg(\gamma M(x)+g'(x))\leqslant d$ completes the proof.

Since we have already shown that $\deg(g|_{U''})\leqslant d$ for any $\ell$-flat $U''$ that does not contain the point $b$ (this is exactly the statement $\mathcal{B}=\emptyset$), $\gamma M(x)+g'(x)$ also has degree at most $d$ on any $\ell$-flat of $U'$ not containing $b$. This is because when restricted to such a flat, $\gamma M(x)+g'(x)$ is equal to the restriction of $g$ plus a constant, and since the degree of $g$'s restriction is at most $d$, so is the degree of $\gamma M(x)+g'(x)$. It follows that $\gamma M(x)+g'(x)$ can only have degree greater than $d$ when restricted to $\ell$-flats that contain $b$. These make up at most a $1/q$-fraction of the $\ell$-flats contained in $U'$, which, along with the fact that $\deg(\gamma M(x)+g'(x))<(\ell+1)(q-1)$, implies that $\deg(\gamma M(x)+g'(x))\leqslant d$ by Lemma C.2. ∎