
Optimal Testing of Generalized Reed-Muller Codes in Fewer Queries

Dor Minzer Department of Mathematics, Massachusetts Institute of Technology, Cambridge, USA. Supported by a Sloan Research Fellowship, NSF CCF award 2227876 and NSF CAREER award 2239160.    Kai Zheng Department of Mathematics, Massachusetts Institute of Technology, Cambridge, USA. Supported by the NSF GRFP.
Abstract

A local tester for an error correcting code $C\subseteq\Sigma^{n}$ is a tester that makes $Q$ oracle queries to a given word $w\in\Sigma^{n}$ and decides to accept or reject the word $w$. An optimal local tester is a local tester that has the additional properties of completeness and optimal soundness. By completeness, we mean that the tester must accept with probability $1$ if $w\in C$. By optimal soundness, we mean that if the tester accepts with probability at least $1-\varepsilon$ (where $\varepsilon$ is small), then it must be the case that $w$ is $O(\varepsilon/Q)$-close to some codeword $c\in C$ in Hamming distance.

We show that Generalized Reed-Muller codes admit optimal testers with $Q=(3q)^{\lceil\frac{d+1}{q-1}\rceil+O(1)}$ queries. Here, for a prime power $q=p^{k}$, the Generalized Reed-Muller code, $\operatorname{RM}[n,q,d]$, consists of the evaluations of all $n$-variate degree $d$ polynomials over $\mathbb{F}_{q}$. Previously, no tester achieving this query complexity was known, and the best known testers, due to Haramaty, Shpilka and Sudan [21] (which is optimal) and due to Ron-Zewi and Sudan [33] (which was not known to be optimal), both required $q^{\lceil\frac{d+1}{q-q/p}\rceil}$ queries. Our tester achieves a query complexity that is polynomially better than this, by a power of $p/(p-1)$, and which is nearly the best query complexity possible for generalized Reed-Muller codes.

The tester we analyze is due to Ron-Zewi and Sudan, and we show that their basic tester is in fact optimal. Our methods are more general and also allow us to prove that a wide class of testers, which follow the form of the Ron-Zewi and Sudan tester, are optimal. This result applies to testers for all affine-invariant codes (which are not necessarily generalized Reed-Muller codes).

1 Introduction

1.1 Local Testing of Reed Muller Codes

The Reed-Muller Code is a widely used code with many applications in complexity theory, and more broadly in theoretical computer science. One reason for this is that the Reed-Muller code enjoys very good local testability properties which are crucial in many applications (for example, in the construction of probabilistically checkable proofs). The primary goal of this paper is to present local testers for Reed-Muller codes over extension fields with improved query complexity, which additionally satisfy a stronger notion of soundness known as optimal testing.

Throughout this paper, $p\in\mathbb{N}$ is a prime and $q=p^{k}$ is a prime power, where $k$ should be thought of as moderately large (so that $q$ is significantly larger than $p$). For a degree parameter $d\in\mathbb{N}$ and a number-of-variables parameter $n\in\mathbb{N}$, the Reed-Muller code consists of all evaluation vectors of degree $d$ polynomials $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$. That is,

$$\operatorname{RM}[n,q,d]=\left\{(f(\vec{x}))_{\vec{x}\in\mathbb{F}_{q}^{n}}~\middle|~f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}\text{ is a polynomial of degree at most }d\right\}.$$

When $k>1$, $\operatorname{RM}[n,q,d]$ is sometimes called a generalized Reed-Muller code, to distinguish it from the prime field case, and as the title suggests, our results are most relevant to this case. However, henceforth we will refer to them as Reed-Muller codes for simplicity.

Abusing notation, we will not distinguish between functions and their evaluation vectors when referring to codewords. That is, for $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, we will simply say $f\in\operatorname{RM}[n,q,d]$ if $\deg(f)\leqslant d$, and we will view the codewords of $\operatorname{RM}[n,q,d]$ themselves as functions. When talking about the degree of a function $f$, we mean the total degree when $f$ is written as a polynomial.

Given a proximity parameter $\delta>0$, the goal in the problem of local testing of Reed-Muller codes is to design a randomized tester $\mathcal{T}$ that makes $Q$ oracle queries (for $Q$ as small as possible) to a given function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ such that:

  1. Completeness: if $f$ is a polynomial of degree at most $d$, then $\mathcal{T}$ accepts with probability $1$.

  2. Soundness: if $f$ is $\delta$-far from all degree $d$ polynomials, then the tester $\mathcal{T}$ rejects with probability at least $2/3$.

Local Testing.

Reed-Muller codes have a very natural and well-studied local test [1, 24, 22, 21] called the $t$-flat test. This test has its origins in the study of probabilistically checkable proofs [15, 3, 2, 34, 4, 32] (as it is related to the well known plane versus plane, plane versus line and line versus point tests), as well as in the theory of Gowers’ uniformity norms [16]. To check if a given function $f$ is indeed of low degree, the tester samples a random $t$-dimensional affine subspace $U\subseteq\mathbb{F}_{q}^{n}$, queries $f(\vec{x})$ for all $\vec{x}\in U$ and checks whether the resulting $t$-variate function $f|_{U}$ has degree at most $d$. Clearly the number of queries made is $q^{t}$, and it is also clear that the test is complete: if $f$ has degree at most $d$, then the tester always accepts. As for the soundness, it is known that one can take $t=\lceil\frac{d+1}{q-q/p}\rceil$ and get that the tester is somewhat sound [1], meaning that the rejection probability is bounded away from $0$ (as opposed to being at least $2/3$). Indeed, a typical local-testing result shows that if a function $f$ is $\delta$-far from being a degree $d$ function, then the tester rejects it with probability at least $\varepsilon=\varepsilon(q,d,\delta)>0$, a quantity that typically vanishes with the various parameters. To amplify the soundness, one repeats the tester $\Theta(1/\varepsilon)$ times sequentially to get rejection probability $2/3$, thereby giving a tester whose query complexity is $O(q^{t}/\varepsilon)$ and whose soundness is at least $2/3$.

Such testers for the Reed-Muller code have been known for a long time. Indeed, in [1] the $t$-flat tester is analyzed and it is shown that the rejection probability of the basic tester is at least $\varepsilon\geqslant\Omega(\delta/q^{t})$, leading to a tester with query complexity $O(q^{2t}/\delta)$. This soundness analysis turns out not to be optimal, and indeed, as we explain below, follow-up works have given a better analysis of the $t$-flat tester. In particular, it was shown that the $t$-flat tester is an optimal tester.

1.2 Optimal Testing of Reed Muller Codes

In this paper, we focus on a much stronger notion of testing known as optimal testing. Clearly, if a tester makes $Q$ queries and the proximity parameter is $\delta$, then the rejection probability can be at most $\min(1,Q\delta)$; indeed, this is a bound on the probability of distinguishing between a Reed-Muller codeword and a Reed-Muller codeword that has been perturbed on a randomly chosen $\delta$-fraction of the inputs. A tester is called optimal if the query-to-rejection-probability tradeoff it achieves is roughly that. Oftentimes, one settles for a rejection probability which is a bit worse and has the form $c(q)\min(1,Q\delta)$ for some function $c(q)>0$ depending only on the field size $q$. We will refer to such results also as optimal testing results. Thus, one would ideally like a tester which achieves as small a query complexity as possible, while simultaneously being optimal.

Optimal testers are known for Reed-Muller codes. Such results were first proved over $\mathbb{F}_{2}$ by [10]. Later on, optimal testing results were established for Reed-Muller codes over general fields [21], as well as more broadly for the class of affine lifted codes [20]. In all of these results, the $t$-flat test is analyzed (for $t$ chosen as above) and is shown to be an optimal tester. We remark that additionally, the analyses of [10, 21, 20] led to improved query complexity for testing Reed-Muller codes. These results have important applications, most notably to the construction of small-set expander graphs with many eigenvalues close to $1$ [8], which have later been used for improved PCP constructions and constructions of integrality gaps [19, 12, 11, 23].

Quantitatively, these results have two drawbacks. First, due to their application of the density Hales-Jewett theorem, the dependency of $c(q)$ on the field size $q$ is of tower type, making the result effective primarily for small fields. Secondly, while the query complexity achieved by their tester is the best possible for prime fields (as a lower bound on the query complexity needed is given by the distance of the dual code of $\operatorname{RM}[n,q,d]$, which is $q^{t}$ if $q$ is prime), it is not known to be optimal for fields of prime power size. This raises two natural questions: does the flat tester actually perform worse on large fields (in comparison to small fields)? Is there a tester with smaller query complexity over non-prime fields (i.e., extension fields)?

In [33] a new local tester for the Reed-Muller code was designed. The query complexity of the tester is $Q={\sf poly}(q)\cdot(3q)^{\lceil\frac{d+1}{q}\rceil}$, which is polynomially smaller than $q^{t}$ above (by a power of $\frac{p}{p-1}$). This tester, which will be referred to as the sparse $t$-flat tester, plays a crucial role in the current work and will be presented in subsequent sections.

Unfortunately, the rejection probability proved in [33] for the sparse $t$-flat tester is an $\varepsilon$ which is sub-constant, and after repeating the tester $\Theta(1/\varepsilon)$ times its query complexity turns out to be roughly the same as that of the $t$-flat tester above. This leaves the Reed-Muller code over extension fields in a rather precarious situation: a local characterization for the code (namely, a basic tester that rejects words far from Reed-Muller codewords with some non-negligible probability) is known, but amplifying the soundness to be constant sets us back to the same query complexity as that of the $t$-flat tester.

1.3 Main Result: Optimal, Query Efficient Tester for Generalized Reed Muller Codes

The main result of this paper is a new and improved analysis of the tester of [33]. We show that the soundness of the tester is much better than the guarantee given in [33], and that in fact this tester is also an optimal tester:

Theorem 1.1.

For all primes $p\in\mathbb{N}$ and prime powers $q=p^{k}$ there exists a tester $\mathcal{T}$ with query complexity $Q\leqslant 3q^{p+O(1)}(3q)^{\lceil\frac{d+1}{q-1}\rceil}$ such that, given oracle access to a function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$,

  1. Completeness: if $f$ has degree at most $d$, then $\mathcal{T}$ accepts with probability $1$.

  2. Soundness: if $f$ is $\delta$-far from degree $d$, then $\mathcal{T}$ rejects with probability at least $c(q)\min(1,Q\delta)$, where $c(q)=\frac{1}{{\sf poly}(q)}$.

The test uses a parameter $t$ (where $\lceil\frac{d+1}{q-1}\rceil+t$ is analogous to the “dimension” parameter in the flat test), and the $t$ that we use will be $t=\max(p+2,10)$. We remark however that the analysis we give applies to all $t\geqslant\max(p+2,10)$, and we choose this specific $t$ so as to minimize the query complexity. We defer to Section 3.2 for more details on this parameter.

A lower bound on the query complexity needed is $q^{\lceil\frac{d+1}{q-1}\rceil}$, which once again follows by considering the dual code of the generalized Reed-Muller code and arguing that this number is its distance. Hence, Theorem 1.1 is tight up to a factor of ${\sf poly}(q)\cdot 3^{\lceil\frac{d+1}{q-1}\rceil}$; for large $q$, this factor is very small compared to $q^{\lceil\frac{d+1}{q-1}\rceil}$, hence the query complexity achieved by our tester is essentially optimal for large field size $q$.

1.4 Optimal Testing of Other Linear Lifted Affine Invariant Codes

Our techniques are in fact more general, and also apply to testers of a wider family of codes, called (linear) lifted affine invariant codes. These codes were introduced by [17] and shown to be optimally testable in [20, 31]. In words, we show that any tester for such codes is optimal if it can be expressed as the product of polynomials, such that each polynomial is defined on a constant number of variables and the variables of distinct polynomials are disjoint. We describe this result in more detail below, but defer the full discussion to Section 6.

A code $\mathcal{C}\subseteq\{f:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}\}$ is linear if its codewords form a linear subspace, and is affine invariant if for any affine transformation $T:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}^{n}$ and $f\in\mathcal{C}$, we have that $f\circ T\in\mathcal{C}$, where $f\circ T$ is defined as the function $f^{\prime}:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ such that $f^{\prime}(x)=f(T(x))$ for all $x\in\mathbb{F}_{q}^{n}$. Since $\mathcal{C}$ is linear, it has a dual $\mathcal{C}^{\perp}$, also consisting of functions from $\mathbb{F}_{q}^{n}$ to $\mathbb{F}_{q}$, and $f\in\mathcal{C}$ if and only if $\langle f,h\rangle=0$ for all $h\in\mathcal{C}^{\perp}$, where this inner product is the standard dot product of the evaluation vectors of $f$ and $h$ (over $\mathbb{F}_{q}^{n}$). Notice that one can also compose $f$ with affine transformations $T:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}^{n}$ for $k<n$. In this case, $f\circ T$ is a function from $\mathbb{F}_{q}^{k}$ to $\mathbb{F}_{q}$, and we can consider the code consisting of all $f:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ such that $f\circ T$ is in some affine invariant base code $\mathcal{G}\subseteq\{g:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}\}$. This code is called the $n$-dimensional lift of $\mathcal{G}$ and is defined as:

$${\sf Lift}_{n}(\mathcal{G})=\{f:\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}\;|\;f\circ T\in\mathcal{G}\text{ for all affine transformations }T:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}^{n}\}.$$

It is not hard to see that ${\sf Lift}_{n}(\mathcal{G})$ is also affine invariant and linear. Suppose we want to design a local tester for $\mathcal{C}$ and we know $\mathcal{C}={\sf Lift}_{n}(\mathcal{G})$ for some affine invariant $\mathcal{G}$ defined as above with $k\geqslant 10$. A natural test is the following:

  1. Take $\mathcal{H}\subseteq\mathcal{G}^{\perp}$.

  2. Choose an affine transformation $T:\mathbb{F}_{q}^{k}\to\mathbb{F}_{q}^{n}$ uniformly at random.

  3. Accept if and only if $\langle f\circ T,h\rangle=0$ for all $h\in\mathcal{H}$.

We remark that not every choice of $\mathcal{H}$ results in a local tester, and it is indeed possible to choose $\mathcal{H}$ so that there exist $f\notin\mathcal{C}$ that still pass the above test with probability $1$. Our main result shows that when such a test is a local test and $\mathcal{H}$ consists of functions of a specified form, then the tester is automatically an optimal tester. In order to obtain explicit optimal testers one still has to find such an $\mathcal{H}$ for which the above test is a local tester, but this is not the focus of the current paper.
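To illustrate the mechanics of the generic test above, here is a minimal sketch (not the authors' implementation). The parameters $q=3$, $k=2$, $n=4$ and the singleton set $\mathcal{H}$ below are toy assumptions chosen only for illustration; $q$ is taken prime so that integer arithmetic mod $q$ is field arithmetic, and no claim is made that this particular $\mathcal{H}$ yields a sound local tester.

    # A minimal sketch of the generic inner-product test for a lifted code
    # C = Lift_n(G): pick a uniformly random affine map T : F_q^k -> F_q^n and
    # accept iff <f o T, h> = 0 for every h in the chosen set H of dual functions.
    import itertools
    import random

    q, k, n = 3, 2, 4            # toy parameters (assumption, for illustration only)

    def random_affine_map(k, n):
        """A uniformly random affine T = (M, c) with T(x) = Mx + c over F_q."""
        M = [[random.randrange(q) for _ in range(k)] for _ in range(n)]
        c = [random.randrange(q) for _ in range(n)]
        def T(x):
            return tuple((sum(M[i][j] * x[j] for j in range(k)) + c[i]) % q
                         for i in range(n))
        return T

    def inner_product(g, h):
        """<g, h> = sum over x in F_q^k of g(x) h(x), computed in F_q."""
        return sum(g(x) * h(x) for x in itertools.product(range(q), repeat=k)) % q

    def one_round_of_test(f, H):
        """One round: restrict f by a random affine map and check every h in H."""
        T = random_affine_map(k, n)
        f_restricted = lambda x: f(T(x))
        return all(inner_product(f_restricted, h) == 0 for h in H)

    # Toy instance: f is a degree-1 polynomial on F_3^4, and H contains the single
    # dual function h(x) = x1*x2, which detects the monomial x1 x2 (cf. Lemma 2.2).
    f = lambda y: (y[0] + 2 * y[2] + 1) % q
    H = [lambda x: (x[0] * x[1]) % q]
    print(one_round_of_test(f, H))   # an affine f has no x1 x2 monomial, so this prints True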

The form for \mathcal{H} that we require is as follows. Let

$$H(x_{1},\ldots,x_{k^{\prime}})=\prod_{i=1}^{s}P_{i}(x_{m(i)},\ldots,x_{m(i+1)-1}),$$

where $\operatorname{poly}(q)\geqslant k-k^{\prime}\geqslant 0$, and $m(i+1)-m(i)\leqslant t^{\prime}$ for some constant $t^{\prime}$ and all $1\leqslant i\leqslant s-1$. In words, $H$ is a polynomial in at most $k^{\prime}$ variables, for some $k^{\prime}$ that is within $\operatorname{poly}(q)$ of $k$, that can be factored into a product of polynomials, each on a constant number of variables, such that the variable sets of these polynomials are pairwise disjoint. Next, let $\mathcal{M}\subsetneq\{M:\mathbb{F}_{q}^{k-k^{\prime}}\to\mathbb{F}_{q}\}$ be an affine invariant code and let $\overline{\mathcal{M}}$ be a basis for $\mathcal{M}^{\perp}$. Finally, suppose

$$\mathcal{H}=\{H(x_{1},\ldots,x_{k^{\prime}})\cdot M(x_{k^{\prime}+1},\ldots,x_{k})\;|\;M\in\overline{\mathcal{M}}\}.$$

Our theorem states:

Theorem 1.2.

Suppose $\mathcal{H}$ is of the previously described form and suppose that the previously described test using $\mathcal{H}$ is a local tester for $\mathcal{C}={\sf Lift}_{n}(\mathcal{G})$. Then the local tester is also optimal. That is,

  1. Completeness: if $f\in\mathcal{C}$ then the test accepts with probability $1$.

  2. Soundness: if $f$ is $\delta$-far from $\mathcal{C}$, then the test rejects with probability at least $c(q)\min(1,Q\delta)$, where $c(q)=\frac{1}{{\sf poly}(q)}$.

Although our result is stated for lifted affine-invariant codes, it also applies equally to affine-invariant codes by taking $\mathcal{C}=\mathcal{G}={\sf Lift}_{k}(\mathcal{G})$. In contrast, the optimal testing result for lifted affine-invariant codes in [31] applies only to the $k$-flat test for ${\sf Lift}_{n}(\mathcal{G})$, which is no longer “local” in the case of $\mathcal{C}=\mathcal{G}$ as it looks at the entire domain. On the other hand, for the $\mathcal{C}=\mathcal{G}$ case, one could still obtain locality using our result by designing a set $\mathcal{H}$ of the specified form that has sparse support. Theorem 1.2 gives a general recipe to construct optimal testers, thus making progress on the task of establishing optimal testing results for general affine invariant codes. We would like to highlight two interesting avenues that we leave for future work. First, it would be interesting to investigate which other affine invariant codes can be analyzed via Theorem 1.2. This could lead to new optimal testing results for other codes, or otherwise to a more general class of testers that one can then try to analyze using the tools presented herein (and their extensions). Second, it would be interesting to extend our techniques to remove the requirement on the form of $\mathcal{H}$ (or perhaps weaken it somehow), and show that any local test for affine-invariant codes is optimal.

1.5 Optimal Testing via Global Hypercontractivity

The proof of Theorem 1.1 is very different from the proofs of [10, 21, 20] (which proceed by induction on $n$), as well as from the proof of [33] (which proceeds by presenting a local characterization of the generalized Reed-Muller code and appealing to a general and powerful result due to [25] that converts local characterizations into local testers). Instead, our proof is inspired by a new approach for establishing such results via global hypercontractivity [31].

Global hypercontractivity is a recently introduced tool that is often useful when working with graphs that are not small-set expanders. One corollary of global hypercontractivity (which is morally equivalent to it) is a useful characterization of all small sets in a graph that have edge expansion bounded away from $1$. The study of global hypercontractivity has its roots in the proof of the 2-to-2 Games Theorem [29, 14, 13, 30]; however, by now it is known to be useful in the study of a host of different problems (see for example [26, 27, 28, 31, 5, 7, 6, 18]).

Below, we explain the global hypercontractivity approach to proving local testability results. In Section 1.6 we explain how we extend this approach to the realm of generalized Reed-Muller codes in order to analyze the sparse $t$-flat tester and prove Theorems 1.1 and 1.2.

1.5.1 Optimal Testing of Reed-Muller Codes via Global Hypercontractivity

In [31], the authors relate the analysis of the $t$-flat tester of the Reed-Muller code to expansion properties of the affine Grassmann graph. Here, the affine Grassmann graph is the graph whose vertex set is the set of all $t$-flats in $\mathbb{F}_{q}^{n}$, and two vertices are adjacent if they intersect in a $(t-1)$-dimensional affine subspace. In short, the approach of [31] starts with the assumption that the $t$-flat tester accepts $f$ with probability at least $1-\varepsilon$ (for $\varepsilon$ thought of as small) and proceeds to prove a structural result on the set of $t$-flats on which the tester rejects:

$$S=\left\{T\subseteq\mathbb{F}_{q}^{n}~\middle|~{\sf deg}(f|_{T})>d,\ T\text{ is a $t$-flat}\right\}.$$

In particular, using a lemma from [21] they prove that the set $S$ has poor expansion properties in the affine Grassmann graph, and use it to prove that there exists a point $x^{\star}\in\mathbb{F}_{q}^{n}$ such that the tester almost always rejects when it selects a subspace $T$ containing $x^{\star}$. This suggests that $f$ is erroneous at $x^{\star}$ and should be corrected, and indeed using this a local correction procedure is devised in [31] to show that the value of $f$ at $x^{\star}$ can be changed so as to increase the acceptance probability by an additive factor of $q^{t-n-O(1)}$. Iterating this argument, one eventually changes $f$ on at most a $C(q)\cdot q^{-t}\varepsilon$ fraction of the inputs and gets a function $f^{\prime}$ on which the tester accepts with probability $1$. Such an $f^{\prime}$ must be of degree at most $d$, hence one gets that $f$ is close to a degree $d$ polynomial.

1.5.2 Optimal Testing of Generalized Reed-Muller Codes via Global Hypercontractivity

While the approach of [31] seems to be more robust and thus potentially applicable towards analyzing a richer class of codes, it is not completely obvious how to do so. For the $t$-flat tester one may associate a very natural graph with the tester, but this is much less clear in the context of the sparse $t$-flat tester (which is the tester that we analyze). The pattern of inputs which are queried by the tester is no longer a nice-looking subspace (but this seems inherent in order to improve upon the query complexity of the flat tester).

At a high level, we show that another way of approaching this problem is by utilizing the symmetries of the code and constructing graphs on them. For that, we have to think of the tester as the composition of a “basic tester” and a random element from the group of symmetries of the code. In our case, the group of symmetries is the group of affine linear transformations over $\mathbb{F}_{q}^{n}$, and the graph that turns out to be related to the analysis of the sparse $t$-flat tester is the so-called Affine Bi-linear Scheme Graph.

At first sight, affine linear transformations seem to be morally equivalent to flats (identifying the image of an affine linear transformation with a subspace), and indeed there are many connections between results on the former graph and results on the latter graph. However, the distinction between affine linear transformations and flats will be crucial for us. Indeed, while two affine linear transformations $A_{1}$ and $A_{2}$ may have the same image, the performance of the tester when choosing $A_{1}$ and when choosing $A_{2}$ will not be the same whatsoever, and therefore we must capitalize on this distinction in our soundness analysis.

1.6 Our Techniques

In this section, we give a brief overview of the techniques underlying the proof of Theorem 1.1. We start by presenting the sparse $(s+t)$-flat tester of [33] and then take a somewhat different perspective on it in the form of groups of symmetries. After that, we explain the high level strategy of the proof of Theorem 1.1, and explain some of the unique challenges that arise compared to the analysis of the standard $t$-flat tester. For the sake of presentation, we focus on the case that $p=2$ for the remainder of this section, and switch back to general $p$ in Section 2.

1.6.1 The Sparse Flat Tester

The construction of the sparse flat tester of [33] begins with the following observation. In the $p=2$ case, define the bivariate polynomial $P\colon\mathbb{F}_{q}^{2}\to\mathbb{F}_{q}$ by

$$P(x_{1},x_{2})=\frac{-x_{2}^{q-1}+(x_{1}+x_{2})^{q-1}}{x_{1}}=\sum_{i=0}^{q-2}x_{1}^{i}x_{2}^{q-2-i}.$$

In [33], the authors observe the following two crucial properties of $P$ that make it useful towards getting a local testing result:

  1. Sparse Support: the support of $P$ has size at most $3q$; indeed, if $x_{1}+x_{2}\neq 0$, $x_{1}\neq 0$ and $x_{2}\neq 0$, then by Fermat’s little theorem $(x_{1}+x_{2})^{q-1}=x_{2}^{q-1}=1$ and $x_{1}\neq 0$, so $P(x_{1},x_{2})=0$.

  2. Detects the Monomials $x_{1}^{q-i}x_{2}^{i}$: looking at the inner product of $P$ with a monomial $M$, defined as $\langle P,M\rangle=\sum_{x_{1},x_{2}}P(x_{1},x_{2})M(x_{1},x_{2})$, we get that if $M=x_{1}^{q-i}x_{2}^{i}$ for $i\in\{1,\ldots,q-1\}$ then $\langle P,M\rangle\neq 0$. Indeed,

    $$\langle P,M\rangle=\sum_{j=0}^{q-2}\sum_{x_{1}}x_{1}^{q-i+j}\sum_{x_{2}}x_{2}^{q-2-j+i}=\sum_{j=0}^{q-2}(q-1)1_{q-i+j=q-1}\cdot(q-1)1_{q-2-j+i=q-1},$$

    so the only contribution comes from $j=i-1$ and it is non-zero. For any other monomial $M$, a similar computation shows that $\langle P,M\rangle=0$; hence taking an inner product of a function $f$ with $P$ may be thought of as detecting whether $f$ contains some monomial of the form $x_{1}^{q-i}x_{2}^{i}$. (Both properties are checked numerically in the sketch following this list.)

With this in mind, the sparse tester follows. In [33] it is argued that to design a local tester for the generalized Reed-Muller code it suffices to design a tester that detects whether certain canonical monomials exist. Writing $d+1=s\cdot\frac{q}{2}+r$, where $s$ is even and $r<q$, it is sufficient to detect whether any monomial of the form $\prod_{i=1}^{s/2}x_{2i-1}^{q/2}x_{2i}^{q/2}\cdot\prod_{i=s+1}^{s+t}x_{i}^{e_{i}}$, where $\sum_{i=s+1}^{s+t}e_{i}\geqslant r$ and $t$ is a small constant (say, $t=10$), appears in $f$. Using $P$ from above, a detector for such monomials is given by

$$H_{e^{\prime}}(x_{1},\ldots,x_{s+t})=P(x_{1},x_{2})\cdots P(x_{s-1},x_{s})\cdot M_{e^{\prime}}(x_{s+1},\ldots,x_{s+t}),$$

where $e^{\prime}+e=(q-1,\ldots,q-1)$ and $M_{e^{\prime}}(x_{s+1},\ldots,x_{s+t})=\prod_{i=s+1}^{s+t}x_{i}^{q-1-e_{i}}$. We note that most of the degree of $H_{e^{\prime}}$ comes from the product of the $P$'s, while at most $t(q-1)-r$ of the degree comes from the rest. As the support of $P$ is rather sparse, it follows that the support of $H_{e^{\prime}}$ is also rather sparse. More precisely, the support of $H_{e^{\prime}}$ has size at most $(3q)^{\frac{s}{2}+t}$, and as $t$ should be thought of as small and $s$ is roughly $2d/q$, the support of $H_{e^{\prime}}$ has size roughly $(3q)^{d/q}$. This yields a query complexity of $(3q)^{d/q}$, which is quadratically better than the $q^{\lceil\frac{d}{q-q/p}\rceil}\approx q^{2d/q}$ of the flat tester.

As mentioned earlier, the soundness analysis in [33] relies on a black box result from [25]. They manage to show that the detector they construct yields a tester with the same query complexity that rejects functions that are $\delta$-far from Reed-Muller codewords with probability $\Omega((3q)^{-2(s/2+t)}\delta)$. To get constant rejection probability one has to repeat the tester $(3q)^{2(s/2+t)}$ times, thereby drastically increasing the query complexity; we defer a more detailed discussion to Section 3.

1.6.2 The Group of Symmetries Perspective

Our first task in proving Theorem 1.1 is to design a tester based on the ideas from [33]. The tester is very natural:

  1. Sample a random affine map $T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}$; here, $s$ and $t$ should be thought of as in the previous section. We are going to look at the function $f\circ T$, but not query all of its values (indeed, querying all of its values would amount to the $(s+t)$-flat tester).

  2. For any sequence of degrees $e=(e_{s+1},\ldots,e_{s+t})$ such that $\sum_{i=s+1}^{s+t}(q-1-e_{i})\geqslant r$, take $H_{e}$ and compute $\langle f\circ T,H_{e}\rangle$. Reject if this inner product is non-zero for any such degree sequence. Otherwise, accept.

In words, we first take the restriction of $f$ to a random $(s+t)$-flat, and then apply the detector polynomials of [33]. Although we test for multiple degree sequences (up to $q^{t}$ of them), we will see in Section 3 that the support of $H_{e}$ is contained in the same set of points in each case. Therefore, we can perform the inner products for all of the degree sequences using the same set of queries. Another way to think about this tester is that we have the group of symmetries of the Reed-Muller code (affine linear transformations), and our tester proceeds by first taking a random symmetry from this group, taking a restriction, and then applying the detector of [33].
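The following is a minimal sketch of one round of this tester for $p=2$, meant only to illustrate the mechanics. All parameters are toy assumptions: the field is $\mathrm{GF}(4)$ realized as $\mathbb{F}_{2}[z]/(z^{2}+z+1)$, the dimensions $s=2$, $t=2$, $n=5$ are arbitrary, and the few exponent sequences checked at the end are illustrative rather than the full set $\{e:\sum_{i}(q-1-e_{i})\geqslant r\}$ the real tester enumerates.

    # Sketch of one round of the sparse (s+t)-flat tester for p = 2: restrict f by a
    # random affine map T and take inner products with H_e = P(x1,x2) x3^{e3} x4^{e4},
    # querying f only on the (sparse) common query pattern of the H_e's.
    import itertools, random

    q = 4
    MOD = 0b111                  # GF(4) = F_2[z]/(z^2 + z + 1); elements are ints 0..3

    def gf_mul(a, b):
        res = 0
        while b:
            if b & 1:
                res ^= a
            b >>= 1
            a <<= 1
            if a & 0b100:
                a ^= MOD
        return res

    def gf_pow(a, e):
        res = 1
        for _ in range(e):
            res = gf_mul(res, a)
        return res

    def P(x1, x2):               # the detector polynomial from [33] (p = 2 case)
        acc = 0
        for i in range(q - 1):
            acc ^= gf_mul(gf_pow(x1, i), gf_pow(x2, q - 2 - i))
        return acc

    s, t, n = 2, 2, 5            # toy dimensions (assumption, for illustration only)
    supp_P = [(x1, x2) for x1 in range(q) for x2 in range(q) if P(x1, x2) != 0]
    # Every H_e is supported inside supp(P) x F_q^t, so one query pattern serves all e.
    query_pattern = [(x1, x2, x3, x4)
                     for (x1, x2) in supp_P
                     for x3 in range(q) for x4 in range(q)]

    def random_affine_map():
        M = [[random.randrange(q) for _ in range(s + t)] for _ in range(n)]
        c = [random.randrange(q) for _ in range(n)]
        def T(x):
            out = []
            for i in range(n):
                acc = c[i]
                for j in range(s + t):
                    acc ^= gf_mul(M[i][j], x[j])
                out.append(acc)
            return tuple(out)
        return T

    def sparse_test_round(f, exponent_sequences):
        """Sample T, then for each e check <f o T, P(x1,x2) x3^{e3} x4^{e4}> = 0."""
        T = random_affine_map()
        queried = {x: f(T(x)) for x in query_pattern}   # all checks reuse these queries
        for (e3, e4) in exponent_sequences:
            acc = 0
            for (x1, x2, x3, x4) in query_pattern:
                h = gf_mul(P(x1, x2), gf_mul(gf_pow(x3, e3), gf_pow(x4, e4)))
                acc ^= gf_mul(queried[(x1, x2, x3, x4)], h)
            if acc != 0:
                return False                            # reject
        return True                                     # accept

    # A degree-2 function on F_4^5 passes: every detected monomial has degree >= 4.
    f = lambda y: gf_mul(y[0], y[1]) ^ y[4]
    print(sparse_test_round(f, [(0, 0), (1, 2), (3, 3)]))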

1.6.3 The Graph on Affine Linear Transformations and Its Analysis

With the above tester in mind, the next question is how to analyze it. In the case of the flat tester we had a very natural graph associated with the tester (the affine Grassmann graph). The above tester seems related to flats as well, since the image of an affine transformation is a flat; however, there is a key, crucial distinction between the two. In the above tester, we only look at the value of $f$ at a few locations in $\operatorname{Im}(T)$, hence the tester may perform differently on $T$ and $T^{\prime}$ even if they have identical images. This lack of symmetry is crucial for the sparseness of the test, but makes the task of associating a graph with the tester trickier.

To gain some intuition as to what this graph is supposed to be, recall that in the case of flat testers, one could look at the $t$-flat tester for all $t$ (not necessarily the smallest one). The relations between these testers for different $t$'s play a crucial role in all of the works establishing optimal testing results, and in particular it is known that the rejection probability of the $(t+1)$-flat tester is at most $q$ times the rejection probability of the $t$-flat tester. We will elaborate on this property a bit later (it is referred to as the “shadow lemma” below). Another benefit of looking at the various testers is that it gives a natural way of arriving at the affine Grassmann graph, by doing an up-down walk between these testers. To obtain the edges of the affine Grassmann graph, one can start with a $t$-flat, go up to a $(t+1)$-flat that contains it, and then back down to a $t$-flat contained in the $(t+1)$-flat. What is the right analog of this operation in the context of the sparse flat tester?

Going up.

The above discussion suggests looking for analogs of the tester above in higher dimensions, and there is a clear natural analog of the up step: for any $r\geqslant 0$, we can look at the $(s+t+r)$ sparse flat tester, in which one chooses an affine map $T\colon\mathbb{F}_{q}^{s+t+r}\to\mathbb{F}_{q}^{n}$ randomly, and proceeds with the tester as above for all viable degree sequences (but over more variables). Thus, starting with the $(s+t)$ sparse flat tester and with an affine map $T$ thought of as an $n\times(s+t)$ matrix over $\mathbb{F}_{q}$ together with an affine shift $c\in\mathbb{F}_{q}^{n}$, we can go “up” to the $(s+t+1)$ sparse flat tester by choosing a random vector $w\in\mathbb{F}_{q}^{n}$ and looking at the affine transformation $A$ corresponding to the matrix whose columns are the same as those of $T$, except that we append $w$ as the last column. Just like in the flat tester, it can be observed without much difficulty that if the $(s+t)$ sparse flat tester rejects when choosing $T$, then the $(s+t+1)$ sparse flat tester rejects when choosing $A$.

Going down.

The “going down” step is also simple, but a bit harder to motivate. Taking inspiration from the flat tester, one may want to apply some linear shuffling to the columns of $A$ and then “drop” one of the columns. This does not seem to work, though, as doing so would lead to a $T^{\prime}$ on which the performance of the sparse flat tester seems “completely independent” of its performance on $T$ (in the sense that the set of points it looks at in $T^{\prime}$ would typically be disjoint from the set of points it looks at in $T$).

Thus, when going down we wish to do so in a way that keeps $T^{\prime}$ and $T$ equal on many points. A natural thing to try is to apply to $A$ an affine transformation $R:\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{s+t+1}$ that fixes a co-dimension $1$ space. In this case, $T^{\prime}=A\circ R$ is output, and $T^{\prime}$ is equal to $A$ on the co-dimension $1$ space that $R$ fixes. On the other hand, by construction $T$ is equal to $A$ on a co-dimension $1$ space as well, namely the subspace whose last coordinate equals $0$. Therefore, after the down step we get $T^{\prime}$ which is equal to $T$ on a subspace of dimension $s+t-1$, which is essentially as similar to $T$ as possible while still being distinct from it.

Put a different way, to go down from the $(s+t+1)$ sparse flat tester to the $(s+t)$ sparse flat tester we proceed by choosing $b_{1},\ldots,b_{s+t},b_{s+t+1}\in\mathbb{F}_{q}$ uniformly and independently and taking the affine transformation $T^{\prime}$ corresponding to the matrix whose $i$th column is ${\sf col}_{i}(T)+b_{i}\cdot w$, and whose shift is $c+b_{s+t+1}w$. In words, we drop the final column of $A$ but add a random multiple of it to each one of the other columns and to the shift.

Going up and then down.

Stitching these two operations together, one gets a graph whose set of vertices is the set of affine linear transformations $T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}$ and whose edges are very similar in spirit to those of the affine Grassmann graph; this graph is known as the bi-linear scheme graph. The core of our analysis relies on $3$ components:

  1. Relating the performance of the tester to expansion on this graph (the shadow lemma), and proving that the set of $T$'s on which the tester rejects has edge expansion at most $1-1/q$.

  2. Studying the structure of sets in this graph whose expansion is at most $1-1/q$ and proving that they must have some strong local structure.

  3. Using said local structure towards a correction argument, proving that if the sparse flat tester rejects with small probability, then $f$ is close to a Reed-Muller codeword.

1.6.4 The Shadow Lemma

The shadow lemma is a result asserting a relation between the rejection probability of the $(s+t+1)$-dimensional tester and that of the $(s+t)$-dimensional tester.

The Shadow Lemma for Flat Testers.

In the context of flat testers, the lemma asserts that the fraction of flats of dimension $s+t+1$ on which the tester rejects is at most $q$ times the fraction of $(s+t)$-flats on which the tester rejects. The name of the result stems from the fact that, letting $S$ be the set of $(s+t)$-flats on which the tester rejects, the set of $(s+t+1)$-flats on which the tester rejects is exactly the upper shadow of $S$:

$$S^{\uparrow}=\{A\subseteq\mathbb{F}_{q}^{n}~|~{\sf dim}(A)=s+t+1,\ \exists B\in S,\ B\subseteq A\}.$$

This means that on average, an element $A\in S^{\uparrow}$ has at least a $1/q$ fraction of its subsets $B\subseteq A$ in $S$, and thinking of an edge in the affine Grassmann graph as an up-down step, we get that the probability that a random step from $S$ goes to a vertex outside $S$ is at most $1-1/q$. That is, the edge expansion of $S$, defined as

$$\Phi(S)=\frac{|\{e=(A,A^{\prime})\in E~|~A\in S,\ A^{\prime}\notin S\}|}{|\{e=(A,A^{\prime})\in E~|~A\in S\}|},$$

is at most $1-\frac{1}{q}$.

The Shadow Lemma for Sparse Flat Testers.

In the context of the sparse flat tester, we wish to argue that something along the same lines holds. Towards this end, fixing the function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, we define

$$S=\{T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}~|~\text{the sparse flat tester rejects when choosing }T\}.$$

Due to the asymmetry between the up and down steps, there is no clear analog of the upper shadow of $S$. However, it is still true that if $T\in S$ and we append to $T$ some vector $u\in\mathbb{F}_{q}^{n}$ to form an affine map $R\colon\mathbb{F}_{q}^{s+t+1}\to\mathbb{F}_{q}^{n}$, then the sparse $(s+t+1)$ tester rejects $R$, and it makes sense to define

$$S^{\uparrow}=\{R\colon\mathbb{F}_{q}^{s+t+1}\to\mathbb{F}_{q}^{n}~|~\text{the sparse flat tester rejects when choosing }R\}.$$

We prove that starting from $R\in S^{\uparrow}$ and performing a down step to a $T^{\prime}\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}$, with probability at least $1/q$ the sparse flat tester still rejects, and hence $T^{\prime}\in S$. In particular, we conclude that $\mu(S^{\uparrow})\leqslant q\mu(S)$ (where $\mu(S)$ denotes the ratio between the size of $S$ and the total number of affine maps from $\mathbb{F}_{q}^{s+t}$ to $\mathbb{F}_{q}^{n}$, and $\mu(S^{\uparrow})$ is defined similarly). Using the same logic as before we conclude that $\Phi(S)\leqslant 1-\frac{1}{q}$, where here we measure edge expansion with respect to the bi-linear scheme graph.
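Spelling out this last step (a short calculation in the spirit of the argument above, not a verbatim excerpt from the paper): writing an edge of the bi-linear scheme graph as an up-down step $T\to R\to T^{\prime}$,

$$1-\Phi(S)\;=\;\Pr_{\substack{T\in S\\ T\to R\to T^{\prime}}}\bigl[T^{\prime}\in S\bigr]\;\geqslant\;\min_{R\in S^{\uparrow}}\ \Pr_{R\to T^{\prime}}\bigl[T^{\prime}\in S\bigr]\;\geqslant\;\frac{1}{q},$$

where the first inequality uses the fact that every up-neighbor $R$ of a transformation $T\in S$ lies in $S^{\uparrow}$, and the second inequality is the down-step property above; rearranging gives $\Phi(S)\leqslant 1-\frac{1}{q}$.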

1.6.5 Expansion on the Bi-linear Scheme Graph

Equipped with the understanding that $S$ is a small set (as we assume the sparse flat tester rejects with small probability) and that $\Phi(S)\leqslant 1-\frac{1}{q}$, the next question is what sort of structure this implies. In the bi-linear scheme graph there are natural examples of such sets, which are analogs of the zoom-in/zoom-out sets in the context of subspaces. Roughly speaking, there is one type of example which intuitively should be relevant for us, namely zoom-in sets:

$$\mathcal{C}_{x^{\star},y^{\star}}=\{T\colon\mathbb{F}_{q}^{s+t}\to\mathbb{F}_{q}^{n}~|~T(x^{\star})=y^{\star}\}.$$

There are other examples of small sets which have poor expansion in the bi-linear scheme graph, however these seem “irrelevant” in the present context. Indeed, after showing that our small set $S$ cannot be correlated with any of these other examples (a notion which is often referred to as pseudo-randomness with respect to zoom-outs and zoom-ins on the linear part), we use the theory of global hypercontractivity to prove that there must be $x^{\star}$ and $y^{\star}$ such that $\mu(S\cap\mathcal{C}_{x^{\star},y^{\star}})\geqslant(1-o(1))\mu(\mathcal{C}_{x^{\star},y^{\star}})$. In words, the sparse $(s+t)$-flat tester almost always rejects inside $\mathcal{C}_{x^{\star},y^{\star}}$.

We remark that the proof that $S$ has no “correlation” with any other non-expanding set in the bi-linear scheme is rather tricky, and much of the effort in the current paper is devoted to it. To do so, we have to build upon ideas from [31] as well as use a new relation between the sparse $(s+t)$-flat tester applied to a function $f$ and the $t$-flat tester applied to a related function $\tilde{f}$. As the construction of this related function is somewhat technical, we defer a detailed discussion of it to Section 3.5.

1.6.6 Finishing the Proof via Local Correction

Intuitively, the only way that $S$ could be dense inside some $\mathcal{C}_{x^{\star},y^{\star}}$ is if we started with a Reed-Muller codeword $g$ and perturbed it on a small number of inputs, of which $y^{\star}$ is one. Indeed, in this case we would have that $\langle g\circ T,H_{e}\rangle=0$ for all $T\in\mathcal{C}_{x^{\star},y^{\star}}$ and all exponent vectors $e$ checked by the tester prior to the perturbation, and after changing the value of $g$ at $y^{\star}$, we would also change the value of $g\circ T$ at $x^{\star}$, breaking our previous equality. This intuition suggests that $y^{\star}$ is a point at which we should fix the value of $f$ to get closer to a Reed-Muller codeword, and we show that this is indeed the case.

There are several difficulties that arise when inspecting this argument more deeply. If $H_{e}(x^{\star})=0$, then the above reasoning breaks (as the value of $f$ at $y^{\star}$ is multiplied by $0$); however, in this case it does not make sense that $S$ could be very dense inside $\mathcal{C}_{x^{\star},y^{\star}}$ (as essentially the only special input of $f$ included in the inner product is $f(y^{\star})$, and in any case it is multiplied by $H_{e}(x^{\star})=0$). Indeed, in our argument we show that if $S$ is very dense in $\mathcal{C}_{x^{\star},y^{\star}}$, then it must be the case that $H_{e}(x^{\star})\neq 0$. Moreover, we show that the density of $S$ inside $\mathcal{C}_{x^{\star},y^{\star}}$ and inside $\mathcal{C}_{{x^{\star}}^{\prime},y^{\star}}$ is roughly the same for all $x^{\star}$ and ${x^{\star}}^{\prime}$ in the support of $H_{e}$. This last step is crucial for the analysis to go through and requires us to adapt and strengthen techniques from [31]. At the end of this step, roughly speaking, we conclude that the tester rejects with probability close to $1$ whenever it queries the value of $f$ at $y^{\star}$.

The last step in the argument is to show that we can change the value of $f$ at $y^{\star}$ and decrease the rejection probability of the tester by an additive factor of $\Theta(q^{s+t-n})$ (which is proportional to the probability that the tester looks at $y^{\star}$). We do so by a reduction to the same problem for the standard flat tester (which was solved in [31]). The idea is to look at a somewhat larger tester of dimension $s+t+100$, fix the first $s$ coordinates and let the rest vary, so that the tester becomes a local version of the standard flat tester.

Given that, and iterating the argument, we eventually reach a function $f^{\prime}$ that differs from $f$ on at most $\Theta(\varepsilon/q^{s+t-n})$ inputs (where $\varepsilon$ is the original rejection probability) and passes the test with probability $1$. Hence, $f^{\prime}$ is a Reed-Muller codeword, and so $f$ is $O(\varepsilon/Q)$-close to a Reed-Muller codeword.

2 Preliminaries

Notations.

For an integer $n$ we denote $[n]=\{1,\ldots,n\}$. For a prime power $q$ we let $\mathbb{F}_{q}$ be the field of size $q$, and we denote by $\mathbb{F}_{q}^{*}\subseteq\mathbb{F}_{q}$ the set of non-zero elements in it.

2.1 Basic Facts about Fields

Throughout, abusing notation, we define the Reed-Muller code $\operatorname{RM}[n,q,d]$ as the set of functions over $\mathbb{F}_{q}^{n}$ that can be written as polynomials of degree at most $d$. Henceforth fix $d$ and write

$$d+1=\ell\cdot p(q-q/p)+r=s(q-q/p)+r,$$

where we set $s=\ell\cdot p$, and $0<r\leqslant p(q-q/p)$.

We will need a few basic facts. First, it is a well known fact that $\mathbb{F}_{q}^{*}$ has a multiplicative generator, which we often denote by $\gamma$. The next lemma uses the existence of a multiplicative generator to evaluate sums over $\mathbb{F}_{q}$.

Lemma 2.1.

For any prime power $q$ and integer $i\in\{0,\ldots,q-1\}$,

$$\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{i}=\begin{cases}-1,&\text{if }i=q-1,\\ 0,&\text{otherwise}.\end{cases}$$

Proof.

If $i=q-1$, then $\alpha^{i}=1$ for all $\alpha\neq 0$, while $0^{i}=0$. Therefore, the sum equals $1$ added to itself $q-1$ times, which is $-1$ in $\mathbb{F}_{q}$. For $i\in\{1,\ldots,q-2\}$, recall that $\mathbb{F}_{q}^{*}$ has a generator $\gamma$. That is, $\mathbb{F}_{q}^{*}=\{1,\gamma,\ldots,\gamma^{q-2}\}$. Since $\gamma^{i}\neq 1$, we may write

$$\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{i}=\sum_{j=0}^{q-2}(\gamma^{i})^{j}=\frac{(\gamma^{i})^{q-1}-1}{\gamma^{i}-1}=\frac{1-1}{\gamma^{i}-1}=0.$$

On the other hand, if $i=0$ then the sum on the left hand side of the lemma is equal to $\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{0}=\sum_{\alpha\in\mathbb{F}_{q}}1=q=0$. ∎
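As a quick numerical sanity check of Lemma 2.1 (an illustration only; the prime field $q=11$ is chosen here simply so that integer arithmetic mod $q$ gives field arithmetic, while the lemma itself holds for all prime powers):

    # Verify Lemma 2.1 over the prime field F_q: sum over alpha of alpha^i equals -1
    # when i = q-1 and 0 for every other i in {0, ..., q-1} (computed mod q).
    q = 11
    for i in range(q):
        total = sum(pow(a, i, q) for a in range(q)) % q
        expected = (q - 1) if i == q - 1 else 0    # q - 1 represents -1 in F_q
        assert total == expected
    print("Lemma 2.1 verified for q =", q)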

2.2 Functions over Fields

Given two functions $f,g\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, we measure the distance between them in terms of the normalized Hamming distance, that is,

$$\delta(f,g)=\Pr_{x\in\mathbb{F}_{q}^{n}}[f(x)\neq g(x)].$$

The distance of a function $f\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$ from the Reed-Muller code $\operatorname{RM}[n,q,d]$, denoted by $\delta_{d}(f)$, is defined as the minimal distance between $f$ and some function $g\in\operatorname{RM}[n,q,d]$. That is,

$$\delta_{d}(f)=\min_{g\in\operatorname{RM}[n,q,d]}\delta(f,g)=\min_{g\in\operatorname{RM}[n,q,d]}\Pr_{x\in\mathbb{F}_{q}^{n}}[f(x)\neq g(x)].$$

For two functions $f,g\colon\mathbb{F}_{q}^{n}\to\mathbb{F}_{q}$, define their inner product as

$$\langle f,g\rangle=\sum_{v\in\mathbb{F}_{q}^{n}}f(v)g(v).$$

It is clear that this inner product is bi-linear. Monomials are the basic building blocks of all polynomials over $\mathbb{F}_{q}$, and the following notation will be convenient for us to use:

Definition 1.

For an exponent vector $e\in\{0,\ldots,q-1\}^{n}$, we define the monomial $x^{e}=\prod_{i=1}^{n}x_{i}^{e_{i}}$.

The next lemma allows us to compute inner products between monomials:

Lemma 2.2.

For $e,e^{\prime}\in\{0,\ldots,q-1\}^{n}$, we have that

$$\langle x^{e},x^{e^{\prime}}\rangle=\begin{cases}(-1)^{n},&\text{if }e+e^{\prime}=(q-1,\ldots,q-1),\\ 0,&\text{otherwise}.\end{cases}$$

Proof.

Let $e^{\prime\prime}=e+e^{\prime}$. By definition,

$$\langle x^{e},x^{e^{\prime}}\rangle=\sum_{(\alpha_{1},\ldots,\alpha_{n})\in\mathbb{F}_{q}^{n}}\prod_{i=1}^{n}\alpha_{i}^{e^{\prime\prime}_{i}}=\prod_{i=1}^{n}\sum_{\alpha\in\mathbb{F}_{q}}\alpha^{e^{\prime\prime}_{i}}.$$

The result follows from Lemma 2.1. ∎

2.3 Affine Linear Transformations and the Affine Bi-linear Scheme

In this section we present the Affine Bi-linear scheme, which plays a vital role in our arguments.

Definition 2.

We denote by $\mathcal{T}_{n,\ell}$ the set of affine transformations $T:\mathbb{F}_{q}^{\ell}\to\mathbb{F}_{q}^{n}$.

Each affine transformation $T\in\mathcal{T}_{n,\ell}$ consists of a linear part $M\in\mathbb{F}_{q}^{n\times\ell}$ and a translation $c\in\mathbb{F}_{q}^{n}$, and we will use the writing convention $T=(M,c)$ to refer to the fact that $T$ is the affine transformation such that $Tx=Mx+c$ for $x\in\mathbb{F}_{q}^{\ell}$. In words, $M$ is the linear part of $T$ and $c$ is the affine shift. We stress that an affine transformation $T\in\mathcal{T}_{n,\ell}$ is not necessarily of full rank.

Definition 3.

The density of a set $S\subseteq\mathcal{T}_{n,\ell}$ is defined as $\mu_{\ell}(S)=\frac{|S|}{|\mathcal{T}_{n,\ell}|}$.

Oftentimes, we drop the subscript $\ell$ when it is clear from the context. Our analysis will require us to view affine transformations as vertices of a suitable test graph. For example, for the (standard) $t$-flat test, this graph is the affine Grassmann graph, ${\sf AffGras}(n,t)$. The vertices of the graph are all of the $t$-flats in $\mathbb{F}_{q}^{n}$, and two vertices $U_{1}$ and $U_{2}$ are adjacent if they intersect in a $(t-1)$-flat, that is, $\dim(U_{1}\cap U_{2})=t-1$. Below we define an analogous graph structure on affine transformations, which we refer to as the affine bi-linear scheme graph.

Definition 4.

The affine bi-linear scheme graph, ${\sf AffBilin}(n,\ell)$, has vertex set $\mathcal{T}_{n,\ell}$. Two vertices $T_{1},T_{2}\in\mathcal{T}_{n,\ell}$ are adjacent if and only if they are equal on an $(\ell-1)$-dimensional affine subspace. This condition can also be written as $\dim(\ker(T_{1}-T_{2}))\geqslant\ell-1$.

We write $T_{1}\sim T_{2}$ to denote adjacency. We remark that the affine Grassmann graph can be obtained from the affine bi-linear scheme by identifying each $T_{1}\in\mathcal{T}_{n,\ell}$ with its image (that is, closing the set $\mathcal{T}_{n,\ell}$ under a suitable group operation); however, as explained in the introduction, this distinction will be crucial for us.

2.4 Expansion and pseudo-randomness in the Affine Bi-linear Scheme

We will use the standard notion of edge expansion, defined as follows.

Definition 5.

For a regular graph $G=(V,E)$ and a set of vertices $S\subseteq V$, we define the edge expansion of $S$ as

$$\Phi(S)=\Pr_{\substack{u\in S\\ (u,v)\in E}}[v\notin S].$$

In words, the edge expansion of $S$ is the probability of escaping it in a single step of the random walk on $G$.

We will use $\Phi_{n,\ell}$ to denote edge expansion on ${\sf AffBilin}(n,\ell)$. When $n$ and $\ell$ are clear from context we drop the subscripts. Later on in the paper, we will also consider edge expansion over the affine Grassmann graph ${\sf AffGras}(n,\ell)$; abusing notation, we will denote edge expansion there also by $\Phi_{n,\ell}$, and it will be clear from context which graph we are considering.

2.4.1 The Up-Down View of the Random Walk

It will be helpful for us to consider the following equivalent, two-step process for sampling a neighbor of $T_{1}=(M_{1},c_{1})\in\mathcal{T}_{n,\ell}$ in the affine bi-linear scheme graph:

  1. Go Up: Choose a random $w\neq 0$. Let $M^{\prime}$ be the matrix obtained by appending $w$ as a new column to $M_{1}$, and let $T^{\prime}=(M^{\prime},c_{1})\in\mathcal{T}_{n,\ell+1}$.

  2. Go Down: Choose a random $R=(A,b)\in\mathcal{T}_{\ell+1,\ell}$, where the first $\ell$ rows of $A$ form the identity matrix, the first $\ell$ entries of $b$ are $0$, and at least one entry among the last row of $A$ and the last entry of $b$ is nonzero.

  3. Output $T_{2}=T^{\prime}\circ R$.

It is easy to see that for $T_{2}=(M_{2},c_{2})$, each column of $M_{2}$ is equal to the corresponding column of $M_{1}$ plus some multiple of $w$, and likewise $c_{2}$ equals $c_{1}$ plus a multiple of $w$.
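The following small script illustrates this up-down sampling (an illustration only, with toy dimensions chosen here; the field is taken to be a prime field so that plain mod-$q$ arithmetic applies). It also checks the set on which $T_{1}$ and the sampled $T_{2}$ agree.

    # Up-down sampling of a neighbor of T1 = (M1, c1) in AffBilin(n, l), over F_q (q prime).
    import itertools, random

    q, n, l = 5, 4, 2            # toy parameters (assumption, for illustration only)

    def rand_vec(m):
        return [random.randrange(q) for _ in range(m)]

    M1 = [rand_vec(l) for _ in range(n)]      # rows of the linear part
    c1 = rand_vec(n)

    def apply(M, c, x):
        return tuple((sum(M[i][j] * x[j] for j in range(len(x))) + c[i]) % q
                     for i in range(len(c)))

    # Go up: append a random non-zero column w to M1.
    w = rand_vec(n)
    while all(wi == 0 for wi in w):
        w = rand_vec(n)

    # Go down: composing with R = (A, b) as described above amounts to adding b_j * w
    # to column j of M1 and b_{l+1} * w to c1, with (b_1, ..., b_{l+1}) not all zero.
    b = rand_vec(l + 1)
    while all(bi == 0 for bi in b):
        b = rand_vec(l + 1)
    M2 = [[(M1[i][j] + b[j] * w[i]) % q for j in range(l)] for i in range(n)]
    c2 = [(c1[i] + b[l] * w[i]) % q for i in range(n)]

    # Since T2(x) - T1(x) = (b_1 x_1 + ... + b_l x_l + b_{l+1}) * w and w != 0, the two
    # maps agree exactly on {x : b_1 x_1 + ... + b_l x_l + b_{l+1} = 0}; when b_1, ..., b_l
    # are not all zero this is a co-dimension 1 affine subspace, matching Definition 4.
    agree = [x for x in itertools.product(range(q), repeat=l)
             if apply(M1, c1, x) == apply(M2, c2, x)]
    constraint = [x for x in itertools.product(range(q), repeat=l)
                  if (sum(b[j] * x[j] for j in range(l)) + b[l]) % q == 0]
    print(len(agree), len(constraint), sorted(agree) == sorted(constraint))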

2.4.2 Non-expanding Sets in the Affine Bi-linear Scheme

As explained in the introduction, our proof considers the set $\mathcal{S}$ of affine transformations on which the test rejects as a set of vertices in the affine bi-linear scheme graph. We prove that $\mathcal{S}$ has small edge expansion in that graph, and then use global hypercontractivity in order to derive a conclusion about the structure of $\mathcal{S}$, which is in turn helpful towards a local correction argument.

To facilitate this argument, we must first understand the structure of a few canonical examples of non-expanding sets in ${\sf AffBilin}(n,\ell)$. In the affine Grassmann graph one has the following examples:

  • zoom-ins: $\mathcal{D}_{a}=\{U\in{\sf AffGras}(n,\ell)\;|\;a\in U\}$, for $a\in\mathbb{F}_{q}^{n}$.

  • zoom-outs: $\mathcal{D}_{W}=\{U\in{\sf AffGras}(n,\ell)\;|\;U\subseteq W\}$, for a hyperplane $W\subseteq\mathbb{F}_{q}^{n}$.

  • zoom-ins on the linear part: $\mathcal{D}_{a,\operatorname{lin}}=\{U=z+V\in{\sf AffGras}(n,\ell)\;|\;a\in V\}$, for $a\in\mathbb{F}_{q}^{n}$.

  • zoom-outs on the linear part: $\mathcal{D}_{W,\operatorname{lin}}=\{U=z+V\in{\sf AffGras}(n,\ell)\;|\;V\subseteq W\}$, for a hyperplane $W\subseteq\mathbb{F}_{q}^{n}$.

It is not hard to see that each example has expansion at most $1-1/q$; also, it is an easy calculation to show that the density of each one of these sets is small (and vanishes provided that $\ell$ is significantly smaller than $n$ and both go to infinity). Each of these examples also has a counterpart in ${\sf AffBilin}(n,\ell)$:

  • zoom-ins: $\mathcal{C}_{a,b}=\{T\in\mathcal{T}_{n,\ell}\;:\;T(a)=b\}$, for $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$.

  • zoom-outs: $\mathcal{C}_{a^{T},b,\beta}=\{(M,c)\in\mathcal{T}_{n,\ell}\;:\;a^{T}\cdot M=b,\;a^{T}\cdot c=\beta\}$, for $a\in\mathbb{F}_{q}^{n}$, $b\in\mathbb{F}_{q}^{\ell}$, and $\beta\in\mathbb{F}_{q}$.

  • zoom-ins on the linear part: $\mathcal{C}_{a,b,\operatorname{lin}}=\{(M,c)\in\mathcal{T}_{n,\ell}\;:\;M\cdot a=b\}$, for $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$.

  • zoom-outs on the linear part: $\mathcal{C}_{a^{T},b,\operatorname{lin}}=\{(M,c)\in\mathcal{T}_{n,\ell}\;:\;a^{T}\cdot M=b\}$, for $a\in\mathbb{F}_{q}^{n}$ and $b\in\mathbb{F}_{q}^{\ell}$.

Likewise, one can check that each example here also has expansion at most $1-1/q$ in ${\sf AffBilin}(n,\ell)$. Our argument will require us to show that, in a sense, these examples exhaust all small sets of vertices in ${\sf AffBilin}(n,\ell)$ whose edge expansion is at most $1-1/q$. To formalize this, we first define the notion of pseudo-randomness. Intuitively, we say that a set $S$ is pseudo-random with respect to some example, say zoom-ins for concreteness, if $S$ may only have little correlation with the $\mathcal{C}_{a,b}$'s, in the sense that the density of $S$ inside each such set is still small. More precisely:

Definition 6.

Let $\mathcal{S}\subseteq\mathcal{T}_{n,\ell}$ and let $\xi\in[0,1]$.

  1. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-ins if for each $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$ we have that

    $$\mu(\mathcal{S}_{a,b}):=\Pr_{T\in\mathcal{C}_{a,b}}[T\in\mathcal{S}]\leqslant\xi.$$

  2. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-outs if for each $a\in\mathbb{F}_{q}^{n}$, $b\in\mathbb{F}_{q}^{\ell}$, and $\beta\in\mathbb{F}_{q}$ we have that

    $$\mu(\mathcal{S}_{a^{T},b,\beta}):=\Pr_{T\in\mathcal{C}_{a^{T},b,\beta}}[T\in\mathcal{S}]\leqslant\xi.$$

  3. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-ins on the linear part if for each $a\in\mathbb{F}_{q}^{\ell}$ and $b\in\mathbb{F}_{q}^{n}$ we have that

    $$\mu(\mathcal{S}_{a,b,\operatorname{lin}}):=\Pr_{T\in\mathcal{C}_{a,b,\operatorname{lin}}}[T\in\mathcal{S}]\leqslant\xi.$$

  4. We say that $\mathcal{S}$ is $\xi$-pseudo-random with respect to zoom-outs on the linear part if for each $a\in\mathbb{F}_{q}^{n}$ and $b\in\mathbb{F}_{q}^{\ell}$ we have that

    $$\mu(\mathcal{S}_{a^{T},b,\operatorname{lin}}):=\Pr_{T\in\mathcal{C}_{a^{T},b,\operatorname{lin}}}[T\in\mathcal{S}]\leqslant\xi.$$

Typically, global hypercontractivity results say that a small set with expansion bounded away from $1$ cannot be pseudo-random (where the notion of pseudo-randomness is in the same spirit but a bit more complicated; see [13, 29] for example). For results in coding theory, however, it seems that a somewhat different (yet very much related) type of result is needed [31]. Intuitively, in this type of result the assumption on the expansion of the set $\mathcal{S}$ is much stronger, and roughly speaking asserts that the expansion of $\mathcal{S}$ is almost “as small as it could be”. In exchange for such a strong assumption, one obtains a stronger conclusion regarding the non-pseudo-randomness of the set $\mathcal{S}$. For instance, in [31] it is shown that a set of vertices $\mathcal{S}$ with expansion at most $1-1/q$ which is highly pseudo-random with respect to $3$ of the above types of examples must be highly non-pseudo-random with respect to the fourth type, in the sense that it almost contains a copy of such a set. For our purposes we require an analogous statement for ${\sf AffBilin}(n,\ell)$, which is the following:

Theorem 2.1.

If 𝒮𝒯n,\mathcal{S}\subseteq\mathcal{T}_{n,\ell} satisfies

  1. 1.

    μ(𝒮)ξ\mu(\mathcal{S})\leqslant\xi,

  2. 2.

    𝒮\mathcal{S} is ξ\xi-pseudorandom with respect to zoom-outs, zoom-outs on the linear part, and zoom-ins on the linear part,

  3. 3.

    1-\Phi(\mathcal{S})\geqslant\frac{1}{q}.

Then there exist a𝔽qa\in\mathbb{F}_{q}^{\ell} and b𝔽qnb\in\mathbb{F}_{q}^{n} such that μ(𝒮a,b)11(q1)2q3q1(4q+867ξ1/4)\mu(\mathcal{S}_{a,b})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{q^{3}}{q-1}\left(4q^{-\ell}+867\xi^{1/4}\right), where μ(𝒮a,b)\mu(\mathcal{S}_{a,b}) is the density of 𝒮\mathcal{S} in 𝒞a,b\mathcal{C}_{a,b}.

Proof.

The proof uses a reduction to an analogous result over the affine Grassmann graph using ideas from [9], and is deferred to Section A. ∎

3 Local Testers for the Reed Muller Code

We now formally describe both the sparse flat test and the full flat test. Although we focus on the sparse test, at times it will be convenient to reduce to the full test to aid our analysis.

3.1 The tt-flat Tester and the Inner Product View

At a high level, both the tt-flat test and the sparse (s+t)(s+t)-flat test can be described in the following way:

  1. 1.

    Restriction: Choose a random T𝒯n,tT\in\mathcal{T}_{n,t} for some suitable dimension tt.

  2. 2.

    Local test on restriction: Check if the tt-variate function fTf\circ T is indeed degree dd. If so, accept, and otherwise reject.

The difference between the two tests lies in how the “check” in step 2 is done. The straightforward way to perform this check is by querying fTf\circ T on all points in 𝔽qt\mathbb{F}_{q}^{t}, interpolating fTf\circ T and checking its degree. Indeed, this is precisely how the tt-flat test is defined. In that case, it is easy to see that the result of the test depends only on the image of TT; this is because fTAf\circ T\circ A and fTf\circ T have the same degree for any full rank A𝒯t,tA\in\mathcal{T}_{t,t}. Therefore, the tt-flat test can be rephrased in its familiar form as follows:

  1. 1.

    Choose a random tt-flat U𝔽qnU\subseteq\mathbb{F}_{q}^{n}.

  2. 2.

    Accept if and only if deg(f|U)d\deg(f|_{U})\leqslant d.

However, there are other ways of trying to test whether fTf\circ T has degree at most dd or not. One way to do so is by taking inner products that check whether certain high-degree monomials exist in ff using Lemma 2.2. A simple way to do this is as follows:

Accept if and only if \langle f\circ T,x^{e}\rangle=0 for all e\in\{0,\ldots,q-1\}^{t} such that \sum_{i=1}^{t}(q-1-e_{i})>d.

By Lemma 2.2, it is not hard to see that this condition is equivalent to \deg(f\circ T)\leqslant d. Furthermore, calculating all of these inner products requires querying f\circ T at every point lying in the support of some x^{e} that is used – which is all of \mathbb{F}_{q}^{t} in this case. Hence, taking inner products with all of these monomials does not lead to any savings in terms of the query complexity.
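To make the inner-product view concrete, the following is a minimal Python sketch of this check, under the simplifying assumption that q is prime (so that field arithmetic is just integer arithmetic modulo q); the parameters q, t, d and the two example functions below are illustrative choices for this sketch and are not taken from the paper. It computes the inner products \langle g,x^{e}\rangle by brute force over all of \mathbb{F}_{q}^{t} and accepts exactly when every exponent vector e with \sum_{i}(q-1-e_{i})>d yields inner product 0.

from itertools import product

q, t, d = 5, 2, 3   # illustrative parameters, assumed for this sketch only

def inner(g, e):
    # <g, x^e> = sum over all points x in F_q^t of g(x) * x^e, computed mod q
    total = 0
    for x in product(range(q), repeat=t):
        m = 1
        for xi, ei in zip(x, e):
            m = (m * pow(xi, ei, q)) % q
        total = (total + g(x) * m) % q
    return total

def accepts(g):
    # accept iff <g, x^e> = 0 for every e with sum_i (q - 1 - e_i) > d
    for e in product(range(q), repeat=t):
        if sum(q - 1 - ei for ei in e) > d and inner(g, e) != 0:
            return False
    return True

g_low = lambda x: (x[0] ** 2 * x[1] + 3 * x[0]) % q    # degree 3 <= d
g_high = lambda x: (x[0] ** 3 * x[1] + x[1]) % q       # degree 4 > d
print(accepts(g_low), accepts(g_high))                 # expected: True False

On these two examples the degree-3 function is accepted and the degree-4 function is rejected, matching the equivalence with \deg\leqslant d.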

It turns out that there are more clever choices of “test polynomials” (which are not just monomials) that allow one to design an inner-product test of the above form which is more query efficient. This idea was used by [33], who introduced the sparse (s+t)-flat test, which we present formally in the next subsection.

3.2 The sparse (s+t)(s+t)-flat test

Our presentation of the sparse (s+t)-flat test is somewhat different from the one in [33], and this view will be necessary for our analysis. Recall that we write d+1=s\cdot(q-q/p)+r, where s is divisible by p and 0<r\leqslant p(q-q/p); for instance, for p=2, q=16 and d=100 this gives s=12 and r=5, since 101=12\cdot 8+5. For a vector x=(x_{1},\ldots,x_{p}) and a set I\subseteq\{1,\ldots,p\} denote x_{I}=\sum\limits_{i\in I}x_{i}. Define the p-variate polynomial

P(x1,,xp)=I[p1](1)|I|+1(xI+xp)q1x1xp1.\displaystyle P(x_{1},\ldots,x_{p})=\frac{\sum_{I\subseteq[p-1]}(-1)^{|I|+1}(x_{I}+x_{p})^{q-1}}{x_{1}\cdots x_{p-1}}.

For any degree sequence e=(e_{1},\ldots,e_{t})\in\{0,\ldots,q-1\}^{t} define

Me(x1,,xt)=i=1txiei,He(x)=(i=1s/pP(xp(i1)+1,,xpi))Me(xs+1,,xs+t).M_{e}(x_{1},\ldots,x_{t})=\prod_{i=1}^{t}x_{i}^{e_{i}},\qquad\qquad H_{e}(x)=\left(\prod_{i=1}^{s/p}P(x_{p(i-1)+1},\ldots,x_{pi})\right)M_{e}(x_{s+1},\ldots,x_{s+t}).

The sparse (s+t)(s+t)-flat tester works as follows:

  1. 1.

    Choose a random affine transformation T:𝔽qn𝔽qs+tT:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q}^{s+t}.

  2. 2.

    Accept if and only if \langle f\circ T,H_{e}\rangle=0 for all e\in\{0,\ldots,q-1\}^{t} such that \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r.

Recall that we set the parameter t to be t=\max(p+2,10). Essentially, t just needs to be large enough so that the degree sequences in step 2 can account for monomials of degree up to r+q-1.

Throughout, we refer to a degree sequence e satisfying the inequality in step 2 as valid. As d is fixed throughout the paper, the notion of a valid degree sequence does not change either. Hence, we say that T rejects f, or that f\circ T is rejected, if \langle f\circ T,H_{e}\rangle\neq 0 for some valid e. Otherwise we say that T accepts f or that f\circ T is accepted. We define \mathcal{S}_{t} to be the set of T’s on which the tester rejects:

𝒮t={T𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,s+t)|fT,He0 for some e{0,,q1}t such that i=1teit(q1)r},\mathcal{S}_{t}=\left\{T\in{\sf AffBilin}(n,s+t)~{}|~{}\langle{f\circ T},{H_{e}}\rangle\neq 0\text{ for some $e\in\{0,\ldots,q-1\}^{t}$ such that $\sum\limits_{i=1}^{t}e_{i}\leqslant t(q-1)-r$}\right\}, (1)

and let \operatorname{rej}_{s+t}(f) be the probability that the test rejects. Clearly, we have that \operatorname{rej}_{s+t}(f)=\mu_{s+t}(\mathcal{S}_{t}).

3.3 Assuming nn is Sufficiently Large

In this section we argue that without loss of generality we may assume that n\geqslant s+t+100. Indeed, we may view an n-variate function f as a function of N variables for some sufficiently large N=O(n). Call this function g:\mathbb{F}_{q}^{N}\xrightarrow[]{}\mathbb{F}_{q} and define it by g(a,b)=f(a) for any a\in\mathbb{F}_{q}^{n}, b\in\mathbb{F}_{q}^{N-n}. We show that \delta(f,\operatorname{RM}[n,q,d])=\delta(g,\operatorname{RM}[N,q,d]), which implies that we can view the test as being performed on g but still have the \delta in Theorem 1.1 be \delta(f,\operatorname{RM}[n,q,d]).

Lemma 3.1.

Let ff and gg be as defined above. Then δ(f,RM[n,q,d])=δ(g,RM[N,q,d])\delta(f,\operatorname{RM}[n,q,d])=\delta(g,\operatorname{RM}[N,q,d]).

Proof.

We first show that \delta(f,\operatorname{RM}[n,q,d])\geqslant\delta(g,\operatorname{RM}[N,q,d]). Suppose h:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q} is the degree d function such that \delta(f,h)=\delta(f,\operatorname{RM}[n,q,d]). Define h^{\prime} as the extension of h to N variables (in the same way as g is defined). Let h^{\prime}(\cdot,b) denote the n-variate function obtained from h^{\prime} by setting the last N-n variables to b\in\mathbb{F}_{q}^{N-n}. Define g(\cdot,b) similarly. Note that for any b, h^{\prime}(\cdot,b)=h and g(\cdot,b)=f. We then have

δ(h,g)=𝔼b𝔽qNn[δ(h(,b),g(,b))]=𝔼b𝔽qNn[δ(h,f)]=δ(f,RM[n,q,d]).\delta(h^{\prime},g)=\mathop{\mathbb{E}}_{b\in\mathbb{F}_{q}^{N-n}}\left[\delta\left(h^{\prime}(\cdot,b),g(\cdot,b)\right)\right]=\mathop{\mathbb{E}}_{b\in\mathbb{F}_{q}^{N-n}}\left[\delta(h,f)\right]=\delta(f,\operatorname{RM}[n,q,d]).

Since h is degree d, so is h^{\prime}, thus the above implies that \delta(f,\operatorname{RM}[n,q,d])\geqslant\delta(g,\operatorname{RM}[N,q,d]).

For the other direction, suppose hh^{\prime} is the degree dd, NN-variate function such that δ(g,RM[N,q,d])=δ(g,h)\delta(g,\operatorname{RM}[N,q,d])=\delta(g,h^{\prime}). Keeping the same notation as above, we have that

𝔼b𝔽qNn[δ(g(,b),h(,b))]=δ(g,h)=δ(g,RM[N,q,d]).\mathop{\mathbb{E}}_{b\in\mathbb{F}_{q}^{N-n}}[\delta\left(g(\cdot,b),h^{\prime}(\cdot,b)\right)]=\delta(g,h^{\prime})=\delta(g,\operatorname{RM}[N,q,d]).

Hence, there is a choice of bb such that

δ(f,h(,b))=δ(g(,b),h(,b))=δ(g,RM[N,q,d]).\delta\left(f,h^{\prime}(\cdot,b)\right)=\delta\left(g(\cdot,b),h^{\prime}(\cdot,b)\right)=\delta(g,\operatorname{RM}[N,q,d]).

Since hh^{\prime} is degree dd, h(,b)h^{\prime}(\cdot,b) must be degree dd as well, so the above inequality implies that δ(f,RM[n,q,d])δ(g,RM[N,q,d])\delta(f,\operatorname{RM}[n,q,d])\leqslant\delta(g,\operatorname{RM}[N,q,d]), completing the proof. ∎

Using this lemma, we can always view the test as being performed on g instead of f, and show that the sparse (s+t)-flat test rejects with probability at least c(q)\min(1,Q\delta(g,\operatorname{RM}[N,q,d])), where Q is the number of queries and c(q) is some \frac{1}{\operatorname{poly}(q)} factor. Since none of these parameters depend on N, the lemma above implies that this rejection probability is the same as c(q)\min(1,Q\delta(f,\operatorname{RM}[n,q,d])).

Henceforth we assume that ns+t+100n\geqslant s+t+100. This assumption is helpful as it allows us to bound the number of non-full rank affine transformations from 𝔽qn𝔽qs+t\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q}^{s+t} as we argue in the following remark.

Remark 3.2.

The fraction of \mathcal{T}_{n,s+t} that is not full rank is at most \frac{1}{q^{100}(q-1)}. The same holds for the fraction of \mathcal{T}_{n,s+t} that is not full rank conditioned on Ta=b, for arbitrary a\in\mathbb{F}_{q}^{s+t}, b\in\mathbb{F}_{q}^{n}. To see this, note that for T=(M,c) we can upper bound the probability that the linear part M is not full rank by

\sum_{i=0}^{s+t-1}\frac{q^{i}}{q^{n}}=\frac{q^{s+t}-1}{(q-1)q^{n}}\leqslant\frac{1}{q^{100}(q-1)}.

For the second part, note that conditioning on Ta=bTa=b does not change this estimate, as we can still choose the linear part uniformly at random and set the affine shift so that Ta=bTa=b.
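As a quick numeric sanity check of this estimate, the following Python sketch (with tiny illustrative parameters, and q taken prime so that \mathbb{F}_{q} is the integers mod q) counts the n\times m matrices over \mathbb{F}_{q} that are not of full column rank, where m plays the role of s+t, and compares the exact fraction with the bound \frac{q^{m}-1}{(q-1)q^{n}}.

from itertools import product

q, n, m = 2, 4, 2   # illustrative sizes; m plays the role of s + t, q is prime

def rank_mod_q(rows):
    # Gaussian elimination over F_q (q prime) on an n x m matrix given by rows
    rows = [list(r) for r in rows]
    rank = 0
    for col in range(m):
        piv = next((i for i in range(rank, n) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], q - 2, q)
        rows[rank] = [(v * inv) % q for v in rows[rank]]
        for i in range(n):
            if i != rank and rows[i][col]:
                c = rows[i][col]
                rows[i] = [(a - c * b) % q for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

not_full = sum(1 for M in product(product(range(q), repeat=m), repeat=n)
               if rank_mod_q(M) < m)
print(not_full / q ** (n * m), (q ** m - 1) / ((q - 1) * q ** n))
# prints 0.1796875 and 0.1875: the exact fraction is indeed below the bound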

3.4 Some Basic Facts of the Sparse Flat Tester

We now collect some basic facts regarding the sparse flat tester that will be necessary for our analysis.

3.4.1 Basic Soundness Properties

We begin by describing why this tester works. Our presentation herein will be partial, focusing on the most essential properties necessary for our analysis, and we refer the reader to Appendix B or [33] for more details.

Consider the monomials xex^{e^{\prime}}, for e{0,,q1}s+te^{\prime}\in\{0,\ldots,q-1\}^{s+t} such that xe,He(x)0\langle x^{e^{\prime}},H_{e}(x)\rangle\neq 0 for some e{0,,q1}te\in\{0,\ldots,q-1\}^{t} such that i=1teit(q1)r\sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r, and tp+2t\geqslant p+2. By Lemma 2.1, these monomials, xex^{e^{\prime}}, must satisfy:

  • ep(i1)+1++epi=p(qq/p)e^{\prime}_{p(i-1)+1}+\cdots+e^{\prime}_{pi}=p\cdot(q-q/p) for 1isp1\leqslant i\leqslant\frac{s}{p}

  • es+1++es+tre^{\prime}_{s+1}+\cdots+e^{\prime}_{s+t}\geqslant r.

More generally, we can explicitly express any inner product with HeH_{e} as follows.

Lemma 3.3.

For f(x)=a{0,,q1}nCaxaf(x)=\sum_{a\in\{0,\ldots,q-1\}^{n}}C_{a}x^{a}, and e{0,,q1}te\in\{0,\ldots,q-1\}^{t} we have that

f,He=eDeCe,\langle f,H_{e}\rangle=\sum_{e^{\prime}}D_{e^{\prime}}C_{e^{\prime}},

where DeD_{e^{\prime}} is a constant depending on the coefficients of HeH_{e}, and the sum is over ee^{\prime} satisfying both the first condition above and es+i+ei=q1e^{\prime}_{s+i}+e_{i}=q-1 for 1it1\leqslant i\leqslant t.

Proof.

We have

\langle x^{e^{\prime}},H_{e}\rangle=\left(\prod_{i=1}^{s/p}\sum_{\alpha_{1},\ldots,\alpha_{p}\in\mathbb{F}_{q}}\alpha_{1}^{e^{\prime}_{p(i-1)+1}}\cdots\alpha_{p}^{e^{\prime}_{pi}}P(\alpha_{1},\ldots,\alpha_{p})\right)\sum_{\beta_{1},\ldots,\beta_{t}\in\mathbb{F}_{q}}\beta_{1}^{e^{\prime}_{s+1}+e_{1}}\cdots\beta_{t}^{e^{\prime}_{s+t}+e_{t}}.

By Lemma 2.2, for each ii we have

α1,,αp𝔽qα1ep(i1)+1αpepiP(α1,,αp)0,\sum_{\alpha_{1},\ldots,\alpha_{p}\in\mathbb{F}_{q}}\alpha_{1}^{e^{\prime}_{p(i-1)+1}}\cdots\alpha_{p}^{e^{\prime}_{pi}}P(\alpha_{1},\ldots,\alpha_{p})\neq 0,

only if the monomial x1q1xpq1x_{1}^{q-1}\cdots x_{p}^{q-1} appears in x1ep(i1)+1xpepiP(x1,,xp)x_{1}^{e^{\prime}_{p(i-1)+1}}\cdots x_{p}^{e^{\prime}_{pi}}P(x_{1},\ldots,x_{p}). Since the degree of PP is qpq-p, this can only be the case if ep(i1)+1++epi=p(qq/p)e^{\prime}_{p(i-1)+1}+\cdots+e^{\prime}_{pi}=p\cdot(q-q/p). Likewise,

\sum_{\beta_{1},\ldots,\beta_{t}\in\mathbb{F}_{q}}\beta_{1}^{e^{\prime}_{s+1}+e_{1}}\cdots\beta_{t}^{e^{\prime}_{s+t}+e_{t}}\neq 0,

only if e^{\prime}_{s+i}+e_{i}=q-1 for each 1\leqslant i\leqslant t. ∎
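Both vanishing conditions in this proof ultimately rest on the standard character-sum fact that \sum_{\alpha\in\mathbb{F}_{q}}\alpha^{j} equals -1 when j>0 is divisible by q-1 and equals 0 otherwise. The following tiny Python check (a sketch assuming q prime, with an illustrative choice of q) verifies this by brute force.

q = 7   # illustrative prime, so F_q is the integers mod q
for j in range(3 * (q - 1) + 1):
    val = sum(pow(a, j, q) for a in range(q)) % q
    # the sum is q - 1 (that is, -1) exactly when j > 0 and (q - 1) divides j
    expected = (q - 1) if (j > 0 and j % (q - 1) == 0) else 0
    assert val == expected, (j, val)
print("character sums verified for all j up to", 3 * (q - 1))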

From Lemma 3.3 it is clear that T rejects f only if \deg(f\circ T)>d: if \deg(f\circ T)\leqslant d, then f\circ T contains no monomials satisfying the above conditions (any such monomial has total degree at least s(q-q/p)+r=d+1), and hence all of the relevant inner products vanish. Therefore, if f is indeed of degree at most d then it is accepted with probability 1. But is it necessarily the case that the sparse flat tester rejects a function f with positive probability if its degree exceeds d? This was shown to be true in [33] using canonical monomials. As we are going to need this fact, and so as to be self-contained, we include a proof in Appendix B.

Theorem 3.4.

A function ff passes the sparse (s+t)(s+t)-flat test with probability 11 if and only if ff has degree at most dd.

Proof.

Deferred to Appendix B. ∎

3.4.2 Sparsity

The next important feature of the sparse flat tester, as the name suggests, is that it has small query complexity. Note that the number of queries the test makes is proportional to the size of the support of the test polynomials H_{e}, and the next lemma shows that for any degree sequence e\in\{0,\ldots,q-1\}^{t} from step 2, H_{e}(x) has a sparse support. Moreover, this support is the same no matter which e is chosen. As a result, calculating \langle f\circ T,H_{e}\rangle does not require querying every one of the q^{s+t} points in the domain of f\circ T.

Lemma 3.5.

The support of PP has size at most (2p1+p1)qp1(2^{p-1}+p-1)q^{p-1} and is contained in the set

\left(\bigcup_{i=1}^{p-1}\{x_{i}=0\}\right)\cup\left(\bigcup_{I\subseteq[p-1]}\{x_{I}+x_{p}=0\}\right),

where \{x_{I}+x_{p}=0\} denotes the hyperplane defined by the equation x_{I}+x_{p}=0, and similarly for \{x_{i}=0\}.

Proof.

Suppose x is not in the set described. Then in the expression for P, the denominator is nonzero and each term (x_{I}+x_{p})^{q-1} equals 1 by Fermat’s Little Theorem, so the numerator becomes \sum_{I\subseteq[p-1]}(-1)^{|I|+1}=-(1-1)^{p-1}=0 and hence P(x)=0. ∎
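The following short Python check illustrates this argument for small assumed parameters (p=3 and a prime q, so that \mathbb{F}_{q} is the integers mod q). It evaluates P only at points where the denominator x_{1}\cdots x_{p-1} is nonzero — at the remaining points P would require polynomial division, but those points already lie in the first part of the claimed union — and verifies that every nonzero value occurs inside the union of hyperplanes from the lemma.

from itertools import product, combinations

p, q = 3, 7   # illustrative; q is taken prime here so that F_q is Z/qZ

def P_value(x):
    # value of P at a point with x_1 * ... * x_{p-1} != 0, by evaluating the
    # numerator and dividing in F_q (at other points P needs polynomial division)
    num = 0
    for k in range(p):
        for I in combinations(range(p - 1), k):
            xI = sum(x[i] for i in I)
            num += (-1) ** (k + 1) * pow((xI + x[-1]) % q, q - 1, q)
    den = 1
    for i in range(p - 1):
        den = (den * x[i]) % q
    return (num * pow(den, q - 2, q)) % q

def in_claimed_union(x):
    # union of the hyperplanes {x_i = 0} and {x_I + x_p = 0}, I a subset of [p-1]
    if any(x[i] == 0 for i in range(p - 1)):
        return True
    for k in range(p):
        for I in combinations(range(p - 1), k):
            if (sum(x[i] for i in I) + x[-1]) % q == 0:
                return True
    return False

for x in product(range(q), repeat=p):
    if all(x[i] != 0 for i in range(p - 1)) and P_value(x) != 0:
        assert in_claimed_union(x), x
print("every nonzero value of P lies inside the claimed union of hyperplanes")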

Lemma 3.6.

For any e{0,,q1}te\in\{0,\ldots,q-1\}^{t}, we have

\operatorname{supp}(H_{e})\subseteq\prod_{i=1}^{s/p}\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}.
Proof.

Since He(x)=(i=1s/pP(xp(i1)+1,,xpi))j=1txs+jejH_{e}(x)=\left(\prod_{i=1}^{s/p}P(x_{p(i-1)+1},\ldots,x_{pi})\right)\prod_{j=1}^{t}x_{s+j}^{e_{j}}, the result is evident. ∎

Henceforth, we let supp(H)=e{0,,q1}tsupp(He)\operatorname{supp}(H)=\bigcup_{e\in\{0,\ldots,q-1\}^{t}}\operatorname{supp}(H_{e}). For any TT, the sparse s+ts+t-flat test can be done by only querying fTf\circ T on points in supp(H)\operatorname{supp}(H), which gives the following upper bound on query complexity.

Lemma 3.7.

The sparse (s+t)-flat test has query complexity at most (3q)^{\frac{d+1}{q}}q^{t}. Since we can choose any t\geqslant p+2, the query complexity can be as low as (3q)^{\frac{d+1}{q}}q^{p+2}.

Proof.

By Lemma 3.6, we can bound the size of i=1s/psupp(P)×𝔽qt\prod_{i=1}^{s/p}\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}. By Lemma 3.5,

|supp(P)|(2p1+p1)qp1,|\operatorname{supp}(P)|\leqslant(2^{p-1}+p-1)q^{p-1},

so

|i=1s/psupp(P)×𝔽qt|((2p1+p1)qp1)s/pqt(3q)s(p1)/pqt,\left|\prod_{i=1}^{s/p}\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}\right|\leqslant((2^{p-1}+p-1)q^{p-1})^{s/p}q^{t}\leqslant(3q)^{s(p-1)/p}q^{t},

where we use the fact that 2^{p-1}+p-1\leqslant 3^{p-1}. Moreover, s\leqslant\frac{d+1}{q-q/p}, and hence s(p-1)/p\leqslant\frac{d+1}{q}, so

(3q)^{s(p-1)/p}q^{t}\leqslant(3q)^{\frac{d+1}{q}}q^{t}.\qed
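For a rough sense of the savings, the following sketch plugs illustrative parameter values (assumptions made for this example only) into the support bounds of Lemmas 3.5 and 3.6 and compares the resulting number of queries with the q^{s+t} queries needed to read an entire (s+t)-flat.

p, q, s, t = 2, 16, 12, 10   # illustrative values with q = p^k and p dividing s
sparse = ((2 ** (p - 1) + p - 1) * q ** (p - 1)) ** (s // p) * q ** t
full = q ** (s + t)
print(sparse, full, sparse / full)

With these bounds the ratio equals \left(\frac{2^{p-1}+p-1}{q}\right)^{s/p}, which is the source of the saving over the full flat test.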

3.5 Relating the Sparse Flat Tester and the Flat Tester

In this section we describe a way to interpret the sparse (s+t)(s+t)-flat test as a tt-flat test. This relation will be useful later on as it allows us to borrow techniques from the analysis of the tt-flat test from [31].

Fix an affine transformation A\in\mathcal{T}_{n,s+\ell} for some \ell\geqslant t and let \operatorname{Res}_{s,\ell,t} denote the set of affine transformations (R,b) of the following form,

R=[Is00R],b=[0b],R=\begin{bmatrix}I_{s}&0\\ 0&R^{\prime}\end{bmatrix},b=\begin{bmatrix}0\\ b^{\prime}\\ \end{bmatrix}, (2)

for R^{\prime}\in\mathbb{F}_{q}^{\ell\times t},b^{\prime}\in\mathbb{F}_{q}^{\ell}. We call the affine transformations in \operatorname{Res}_{s,\ell,t} restrictions. In words, composing A with a random (R,b) in \operatorname{Res}_{s,\ell,t} corresponds to the operation of preserving the first s columns of A and randomizing the rest of them. As we explain below, this allows us to view the sparse (s+t)-flat test as having the t-flat tester embedded within it, applied to some restricted-type function of f.

More precisely, note that we can sample T𝒯n,s+tT\in\mathcal{T}_{n,s+t} by choosing A𝒯n,s+A\in\mathcal{T}_{n,s+\ell} as well as a restriction B=(R,b)Ress,,tB=(R,b)\in\operatorname{Res}_{s,\ell,t} uniformly at random and outputting T=ABT=A\circ B. In Lemmas 3.8 and 3.9 we show that the result of the test is “entirely dependent” on b+Im(R)=Im((R,b))b^{\prime}+\operatorname{Im}(R^{\prime})=\operatorname{Im}((R^{\prime},b^{\prime})). To this end, after fixing AA we define f~:𝔽q𝔽q\tilde{f}:\mathbb{F}_{q}^{\ell}\xrightarrow[]{}\mathbb{F}_{q} by

f~(β1,,β)=α𝔽qsfA(α,β1,,β)i=1s/pP(αp(i1)+1,,αpi).\tilde{f}(\beta_{1},\ldots,\beta_{\ell})=\sum_{\alpha\in\mathbb{F}_{q}^{s}}f\circ A(\alpha,\beta_{1},\ldots,\beta_{\ell})\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi}).

Also for B=(R,b)Ress,,tB=(R,b)\in\operatorname{Res}_{s,\ell,t} let fl(B)𝔽q\operatorname{fl}(B)\subseteq\mathbb{F}_{q}^{\ell} be the tt-flat given by b+Im(R)b^{\prime}+\operatorname{Im}(R^{\prime}). The following lemma gives a relation between the sparse flat tester rejecting fTf\circ T, and the (standard) flat tester rejecting f~\tilde{f} on b+Im(R)b^{\prime}+\operatorname{Im}(R^{\prime}).

Lemma 3.8.

For any BRess,,tB\in\operatorname{Res}_{s,\ell,t}, ABA\circ B rejects ff if and only if deg(f~|fl(B))r\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant r.

Proof.

Write B=(R,b)B=(R,b) in the form of (2) and let B:𝔽qt𝔽qB^{\prime}:\mathbb{F}_{q}^{t}\xrightarrow[]{}\mathbb{F}_{q}^{\ell} be the affine transformation given by (R,b)(R^{\prime},b^{\prime}). Recall that ABA\circ B rejects if and only if

\langle f\circ A\circ B,H_{e}\rangle\neq 0

for some e\in\{0,\ldots,q-1\}^{t} such that \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r. For any such e, we can rewrite this inner product as follows.

fAB,He\displaystyle\langle f\circ A\circ B,H_{e}\rangle =α𝔽qs,β𝔽qt(fAB(α,β))i=1s/pP(αp(i1)+1,,αpi)βe\displaystyle=\sum_{\alpha\in\mathbb{F}_{q}^{s},\beta\in\mathbb{F}_{q}^{t}}\left(f\circ A\circ B(\alpha,\beta)\right)\cdot\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi})\beta^{e}
\displaystyle=\sum_{\beta\in\mathbb{F}_{q}^{t}}\left(\sum_{\alpha\in\mathbb{F}_{q}^{s}}f\circ A(\alpha,B^{\prime}(\beta))\cdot\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi})\right)\cdot\beta^{e}
=β𝔽qtf~B(β)βe\displaystyle=\sum_{\beta\in\mathbb{F}_{q}^{t}}\tilde{f}\circ B^{\prime}(\beta)\cdot\beta^{e}
=f~B,xe.\displaystyle=\langle\tilde{f}\circ B^{\prime},x^{e}\rangle.

However, \langle\tilde{f}\circ B^{\prime},x^{e}\rangle\neq 0 if and only if \tilde{f}\circ B^{\prime} contains the monomial x_{1}^{q-1-e_{1}}\cdots x_{t}^{q-1-e_{t}}. It follows that A\circ B rejects if and only if \deg(\tilde{f}\circ B^{\prime})\geqslant r. Finally, since the degree of a polynomial is invariant under affine transformations, this is equivalent to \deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant r. ∎

We remark that Lemma 3.8 in particular implies that the sparse (s+t)-flat test is invariant under any affine transformation that only affects the last t coordinates. In particular, we get:

Lemma 3.9.

For any full rank B\in\operatorname{Res}_{s,t,t}, T\circ B rejects f if and only if T rejects f.

Proof.

We apply Lemma 3.8 with =t\ell=t and A=TA=T. Then, applying BB does not change the degree of f~\tilde{f} and the result follows. ∎

After fixing A\in\mathcal{T}_{n,s+\ell}, this lemma allows us to think of the remainder of the sparse (s+t)-flat test, i.e. choosing a restriction B\in\operatorname{Res}_{s,\ell,t}, as a t-flat test on the \ell-variate function \tilde{f} defined from f\circ A as above. Setting \ell=t+100, we can view the sparse (s+t)-flat test as follows.

  1. 1.

    Choose a random A𝒯n,s+t+100A\in\mathcal{T}_{n,s+t+100}.

  2. 2.

    Perform the standard t-flat test on the (t+100)-variate function \tilde{f} defined according to the above.

This view of the test will allow us to borrow some concepts and facts from the tt-flat test which we now introduce. First, we formally define the upper shadow for flats.

Definition 7.

For a set of \ell-flats, SS, define the upper shadow as follows

S={V|dim(V)=+1,US,UV}.S^{\uparrow}=\{V\;|\;\dim(V)=\ell+1,\exists U\in S,U\subseteq V\}.

The following shadow lemma is shown in [10] and is used extensively in the analysis of [31]. This lemma will play a role in our analysis as well, so we record it below.

Lemma 3.10.

For a function ff, let SS be the set of \ell-flats, UU, such that deg(f|U)>d\deg(f|_{U})>d. Then,

μ(S)qμ(S),\mu(S^{\uparrow})\leqslant q\mu(S),

where the measures are in the sets of (+1)(\ell+1)-flats and \ell-flats respectively.

We can also define an analogous notion of upper shadow for affine transformations that fits with the view of the sparse flat tester just introduced.

Definition 8.

For a set of affine transformations 𝒮𝒯n,s+\mathcal{S}\subseteq\mathcal{T}_{n,s+\ell} define

\mathcal{S}^{\uparrow}=\{T\in\mathcal{T}_{n,s+\ell+1}\;|\;\exists R\in\operatorname{Res}_{s,\ell+1,\ell},\;T\circ R\in\mathcal{S}\}

We will need the following simple result about the upper shadow of sets of affine transformations.

Lemma 3.11.

For a set of affine transformations 𝒮𝒯n,s+\mathcal{S}\subseteq\mathcal{T}_{n,s+\ell},

μ(𝒮)μ(𝒮).\mu(\mathcal{S}^{\uparrow})\geqslant\mu(\mathcal{S}).
Proof.

We can sample T\in\mathcal{T}_{n,s+\ell} by first sampling T^{\prime}\in\mathcal{T}_{n,s+\ell+1} and then choosing a restriction R\in\operatorname{Res}_{s,\ell+1,\ell} and outputting T=T^{\prime}\circ R. We can only have T\in\mathcal{S} if T^{\prime}\in\mathcal{S}^{\uparrow}, so the result follows. ∎

4 Locating a Potential Error

We now begin our proof of Theorem 1.1. Recall the set \mathcal{S}_{t}=\{T\in\mathcal{T}_{n,s+t}\;|\;T\text{ rejects }f\} defined in (1), and that \operatorname{rej}_{s+t}(f)=\mu_{s+t}(\mathcal{S}_{t}); we denote \varepsilon=\mu_{s+t}(\mathcal{S}_{t}) for simplicity. In this section we show that if \varepsilon\leqslant q^{-M} (where M is a large absolute constant to be determined), then we may find a potentially erroneous input x^{\star} for f, in the sense that the sparse flat tester almost always rejects whenever the chosen transformation maps a point in the support of the test polynomials to x^{\star}.

We begin by checking that the set 𝒮t\mathcal{S}_{t} satisfies the conditions of Theorem 2.1.

4.1 The Edge Expansion of 𝒮t\mathcal{S}_{t} in the Affine Bi-linear Scheme

We start by proving an upper bound on the expansion of \mathcal{S}_{t}. For a matrix M and column vector v, let [M,v] denote the matrix obtained by appending v to M as an additional column. Consider the following procedure for sampling an edge in {\sf AffBilin}(n,s+t).

  1. 1.

    Choose T1=(M,c)𝒯n,s+tT_{1}=(M,c)\in\mathcal{T}_{n,s+t} uniformly at random.

  2. 2.

    Choose v𝔽qnv\in\mathbb{F}_{q}^{n} uniformly at random conditioned on v0v\neq 0 and let T=([M,v],c)T^{\prime}=([M,v],c).

  3. 3.

    Choose a uniformly random matrix R𝔽q(s+t+1)×(s+t)R\in\mathbb{F}_{q}^{(s+t+1)\times(s+t)} of the form R=[Is+t,w]TR=[I_{s+t},w]^{T}, and a uniformly random b𝔽qs+t+1b\in\mathbb{F}_{q}^{s+t+1} such that the first s+ts+t entries in bb are zero.

  4. 4.

    Set T2=T(R,b)T_{2}=T^{\prime}\circ(R,b).
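To make this procedure concrete, the following Python sketch samples one such edge for small assumed parameters (with q taken prime so that arithmetic is just integers mod q, and m standing for s+t). Unwinding steps 3 and 4, the second endpoint has linear part M+vw^{T} and affine shift c+\beta v, where w is the last row of R and \beta is the last entry of b; an edge thus corresponds to a rank-one update of the linear part together with a shift of the affine part along v.

import random

q, n, m = 5, 6, 3   # illustrative sizes, with m standing for s + t and q prime

def rand_vec(k):
    return [random.randrange(q) for _ in range(k)]

M = [rand_vec(m) for _ in range(n)]                 # step 1: T1 = (M, c)
c = rand_vec(n)
v = rand_vec(n)                                     # step 2: a nonzero v
while all(x == 0 for x in v):
    v = rand_vec(n)
w = rand_vec(m)                                     # step 3: last row of R
beta = random.randrange(q)                          #         last entry of b
# step 4: unwinding the composition, T2 = (M + v w^T, c + beta * v)
M2 = [[(M[i][j] + v[i] * w[j]) % q for j in range(m)] for i in range(n)]
c2 = [(c[i] + beta * v[i]) % q for i in range(n)]
print("T1:", (M, c))
print("T2:", (M2, c2))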

The following lemma will allow us to show 𝒮t\mathcal{S}_{t} is poorly expanding. This takes the place of the “shadow” lemma from [31].

Lemma 4.1.

Let G:𝔽qs+t𝔽qG:\mathbb{F}_{q}^{s+t}\xrightarrow[]{}\mathbb{F}_{q} be an arbitrary polynomial and let T=(M,c)T=(M,c) be an s+ts+t-affine transformation such that fT,G0\langle f\circ T,G\rangle\neq 0. Fix v𝔽qnv\in\mathbb{F}_{q}^{n} chosen in step 22, and let T=([M,v],c)T^{\prime}=([M,v],c). Then,

Pr(R,b)[fT(R,b),G0]1q,\Pr_{(R,b)}[\langle f\circ T^{\prime}\circ(R,b),G\rangle\neq 0]\geqslant\frac{1}{q},

where RR is sampled according to step 3.

Proof.

Let w=(w_{1},\ldots,w_{s+t}) be the last row of R and let b_{s+t+1} be the last entry of b. Choosing (R,b) uniformly at random amounts to choosing b_{s+t+1} and each w_{i} uniformly at random in \mathbb{F}_{q}. To obtain the result we will view \langle f\circ T^{\prime}\circ(R,b),G\rangle as a function of these random values and apply the Schwartz-Zippel lemma.

The function fT(R,b)f\circ T^{\prime}\circ(R,b) can be obtained by composing the (s+t+1)(s+t+1)-variable function, f~=fT\tilde{f}=f\circ T^{\prime}, with (R,b)(R,b). Then fT(R,b)f\circ T^{\prime}\circ(R,b) is just the (s+t)(s+t)-variable function,

(fT(R,b))(x1,,xs+t)=f~(x1,,xs+t,i=1s+twixi+bs+t+1).\left(f\circ T^{\prime}\circ(R,b)\right)(x_{1},\ldots,x_{s+t})=\tilde{f}\left(x_{1},\ldots,x_{s+t},\sum_{i=1}^{s+t}w_{i}x_{i}+b_{s+t+1}\right).

By Lemma 3.3, \langle f\circ T^{\prime}\circ(R,b),G\rangle is some linear combination of the coefficients of f\circ T^{\prime}\circ(R,b), and these coefficients are polynomials in w_{1},\ldots,w_{s+t},b_{s+t+1} of total degree at most q-1, because \tilde{f} can be written as a polynomial with individual degrees at most q-1. Thus, \langle f\circ T^{\prime}\circ(R,b),G\rangle is a polynomial in w_{1},\ldots,w_{s+t},b_{s+t+1} of total degree at most q-1. Moreover, this polynomial is not identically zero because when setting w_{1}=\cdots=w_{s+t}=b_{s+t+1}=0, it evaluates to

fT,G0.\langle f\circ T,G\rangle\neq 0.

By the Schwartz-Zippel lemma it follows that \Pr_{(R,b)}[\langle f\circ T^{\prime}\circ(R,b),G\rangle\neq 0]\geqslant\frac{1}{q}. ∎

As an immediate corollary, we get that the set of affine transformations that reject is poorly expanding.

Lemma 4.2.

The set of s+ts+t-affine transformations that reject ff, 𝒮t\mathcal{S}_{t}, satisfies

1Φ(𝒮t)1q.1-\Phi(\mathcal{S}_{t})\geqslant\frac{1}{q}.

where the expansion is in 𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,s+t){\sf AffBilin}(n,s+t).

Proof.

Fix T1𝒮tT_{1}\in\mathcal{S}_{t} and choose T2T_{2} adjacent to T1T_{1} in 𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,s+t){\sf AffBilin}(n,s+t). By assumption there is some valid ee and HeH_{e} such that fT1,He0\langle f\circ T_{1},H_{e}\rangle\neq 0. Since T2T_{2} can be chosen according to the process described at the start of the section, we can apply Lemma 4.1 with G=HeG=H_{e},

PrT2[fT2,He0]1q.\Pr_{T_{2}}[\langle f\circ T_{2},H_{e}\rangle\neq 0]\geqslant\frac{1}{q}.

It follows that 1Φ(𝒮t)1q1-\Phi(\mathcal{S}_{t})\geqslant\frac{1}{q}. ∎

4.2 Pseudorandom with respect to zoom-outs and zoom-outs on the linear part

Next, we show that the set 𝒮t\mathcal{S}_{t} is pseudorandom with respect to zoom-outs and zoom-out on the linear part.

Lemma 4.3.

The set \mathcal{S}_{t} is q\mu(\mathcal{S}_{t})-pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

Proof.

We show the proof for zoom-outs. The argument for zoom-outs on the linear part is very similar.

Fix a zoom-out \mathcal{C}_{a^{T},b,\beta} and denote by \mu_{a^{T},b,\beta}(\mathcal{S}_{t}) the measure of \mathcal{S}_{t} in \mathcal{C}_{a^{T},b,\beta}, that is, \frac{|\mathcal{S}_{t}\cap\mathcal{C}_{a^{T},b,\beta}|}{|\mathcal{C}_{a^{T},b,\beta}|}. Fix an arbitrary v such that a^{T}\cdot v\neq 0. Sample an (s+t)-affine transformation by first choosing T_{1}=(M,c)\in\mathcal{C}_{a^{T},b,\beta} uniformly at random, and then T_{2}=(M^{\prime},c^{\prime}) according to steps 3 and 4 with this v. That is, if the ith column of M is M_{i}, then the ith column of M^{\prime} is M_{i}+\alpha_{i}v, and the affine shift is c^{\prime}=c+\alpha_{0}v, for uniformly random \alpha_{i}'s and \alpha_{0} in \mathbb{F}_{q}. By Lemma 4.1, if T_{1}\in\mathcal{S}_{t}\cap\mathcal{C}_{a^{T},b,\beta}, then T_{2}\in\mathcal{S}_{t} with probability at least 1/q, so

Pr[T2𝒮t]μaT,b,β(𝒮t)q.\Pr[T_{2}\in\mathcal{S}_{t}]\geqslant\frac{\mu_{a^{T},b,\beta}(\mathcal{S}_{t})}{q}.

On the other hand, notice that the distribution of T2T_{2} is uniform over 𝒯n,s+t\mathcal{T}_{n,s+t}. Indeed, fix a T=(M,c)𝒯n,s+tT=(M^{\prime},c^{\prime})\in\mathcal{T}_{n,s+t} and let MiM^{\prime}_{i} denote the iith column of MM^{\prime}. Then the only way to get T2=TT_{2}=T is to choose,

\alpha_{i}=\frac{a^{T}M^{\prime}_{i}-b_{i}}{a^{T}v}\text{ for }1\leqslant i\leqslant s+t,\quad\text{ and }\quad\alpha_{0}=\frac{a^{T}c^{\prime}-\beta}{a^{T}v},

and (M,c)𝒞aT,b,β(M,c)\in\mathcal{C}_{a^{T},b,\beta} such that

M_{i}=M^{\prime}_{i}-\alpha_{i}v\text{ for }1\leqslant i\leqslant s+t,\quad\text{ and }\quad c=c^{\prime}-\alpha_{0}v.

Since T2T_{2} is uniform over 𝒯n,s+t\mathcal{T}_{n,s+t}, we have

μ(𝒮t)Pr[T2𝒮t]μaT,b,β(𝒮t)q,\mu(\mathcal{S}_{t})\geqslant\Pr[T_{2}\in\mathcal{S}_{t}]\geqslant\frac{\mu_{a^{T},b,\beta}(\mathcal{S}_{t})}{q},

and hence \mu_{a^{T},b,\beta}(\mathcal{S}_{t})\leqslant q\mu(\mathcal{S}_{t}). Since this applies to all zoom-outs, and a similar argument works for zoom-outs on the linear part, the result follows. ∎

4.3 Pseudorandomness with respect to zoom-ins on the linear part

We next show pseudorandomness with respect to zoom-ins on the linear part. The argument is more involved. In particular, we use the relation to the tt-flat test to reduce the proof of this statement to analogous statements on affine Grassmann graphs. We then prove this statement using ideas similar to [31]. Formally, in this section we show:

Lemma 4.4.

The set 𝒮t\mathcal{S}_{t} is tq162εtq^{162}\varepsilon pseudorandom with respect to zoom-ins on the linear part Ca,b,linC_{a,b,\operatorname{lin}}.

We start with a few definitions. For a vector aa (of arbitrary length greater than ss), let a[s]a_{[s]} be the vector that is equal to aa in its first ss coordinates and 0 in all of its other coordinates. Let a|[s]a|_{[s]} be the vector aa restricted to its first ss coordinates. For a matrix M𝔽qn×(s+)M\in\mathbb{F}_{q}^{n\times(s+\ell)}, denote

\operatorname{Im}_{a}(M)=\{Ma^{\prime}\;|\;a^{\prime}\in\mathbb{F}_{q}^{s+\ell},\;a^{\prime}|_{[s]}=a|_{[s]},\;a^{\prime}\neq a^{\prime}_{[s]}\}.

In words, Ima(M)\operatorname{Im}_{a}(M) is the image of elements aa^{\prime} that agree with aa on the first ss coordinates and are not identically 0 on the rest of the coordinates.

Due to the asymmetry induced by the test, we have a different argument depending on the 0 pattern of aa. We will first show that 𝒮t\mathcal{S}_{t} is pseudorandom with respect to zoom-ins on the linear part, Ca,b,linC_{a,b,\operatorname{lin}}, where aa[s]a\neq a_{[s]}, by a direct argument. We will then prove that 𝒮t\mathcal{S}_{t} is pseudorandom with respect to any zoom-in on the linear part by a reduction to this case.

Lemma 4.5.

The set 𝒮t\mathcal{S}_{t} is q161εq^{161}\varepsilon pseudorandom with respect to zoom-ins on the linear part Ca,b,linC_{a,b,\operatorname{lin}} with aa[s]a\neq a_{[s]}.

The proof of Lemma 4.5 requires some set-up and auxiliary statements, which we present next. Fix a,ba,b as in the lemma and suppose that 𝒮t\mathcal{S}_{t} has α\alpha fractional size in 𝒞a,b,lin\mathcal{C}_{a,b,\operatorname{lin}}, that is,

μ(𝒮t𝒞a,b,lin)μ(𝒞a,b,lin)=α.\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})}{\mu(\mathcal{C}_{a,b,\operatorname{lin}})}=\alpha.

Consider another a𝔽qs+ta^{\prime}\in\mathbb{F}_{q}^{s+t} such that a[s]=a[s]a^{\prime}_{[s]}=a_{[s]}, and aa[s]a^{\prime}\neq a^{\prime}_{[s]}. In words aa^{\prime} is equal to aa in its first ss coordinates, and nonzero in its last tt coordinates. There is an invertible matrix RR whose first ss rows and columns form the identity matrix, IsI_{s}, such that Ra=aRa^{\prime}=a. Composition with RR gives a bijection between the sets 𝒞a,b,lin\mathcal{C}_{a,b,\operatorname{lin}} and 𝒞a,b,lin\mathcal{C}_{a^{\prime},b,\operatorname{lin}}. Moreover, by Lemma 3.9, (M,c)𝒮t(M,c)\in\mathcal{S}_{t} if and only if (MR,c)𝒮t(M\cdot R,c)\in\mathcal{S}_{t}. Therefore, for any aa^{\prime} satisfying a[s]=a[s]a^{\prime}_{[s]}=a_{[s]}, and aa[s]a^{\prime}\neq a^{\prime}_{[s]} it holds that

\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a^{\prime},b,\operatorname{lin}})}{\mu(\mathcal{C}_{a^{\prime},b,\operatorname{lin}})}=\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})}{\mu(\mathcal{C}_{a,b,\operatorname{lin}})}=\alpha.

Put another way, 𝒮t\mathcal{S}_{t} has fractional size α\alpha in the set of (M,c)(M,c) such that bIma(M)b\in\operatorname{Im}_{a}(M), and thus we define

𝒟a,b,={(M,c)𝒯n,s+|bIma(M)}.\mathcal{D}_{a,b,\ell}=\{(M,c)\in\mathcal{T}_{n,s+\ell}\;|\;b\in\operatorname{Im}_{a}(M)\}.

We can sample an (M^{\prime},c^{\prime})\in\mathcal{D}_{a,b,t} by first choosing (M,c)\in\mathcal{D}_{a,b,t+100}, then choosing (R,v)\in\operatorname{Res}_{s,t+100,t} uniformly at random, and outputting (M^{\prime},c^{\prime})=(M,c)\circ(R,v)=(M\cdot R,c+Mv) conditioned on (M^{\prime},c^{\prime})\in\mathcal{D}_{a,b,t}. Since the probability that (M^{\prime},c^{\prime})\in\mathcal{S}_{t} is \alpha and this can only happen if (M,c)\in\mathcal{S}_{t}^{\uparrow^{100}}, we get that

\frac{\mu(\mathcal{S}_{b}^{\prime})}{\mu(\mathcal{D}_{a,b,t+100})}\geqslant\alpha,

where 𝒮b=𝒮t100𝒟a,b,t+100\mathcal{S}^{\prime}_{b}=\mathcal{S}_{t}^{\uparrow^{100}}\cap\mathcal{D}_{a,b,t+100}. Recall 𝒮t\mathcal{S}_{t}^{\uparrow} is the set of (s+t+1)(s+t+1)-affine transformations, TT, for which there exists a restriction RR such that TR𝒮tT\circ R\in\mathcal{S}_{t}, and 𝒮t100\mathcal{S}_{t}^{\uparrow^{100}} is the same operation applied 100100 times.

The following lemma shows that there is a way to fix the first ss columns of the test so that the rejection probability remains small:

Lemma 4.6.

There exists (M,c)\in\mathcal{S}^{\prime}_{b} such that, choosing B=(R,v)\in\operatorname{Res}_{s,t+100,t} uniformly at random and setting (M^{\prime},c^{\prime})=(M,c)\circ B=(M\cdot R,c+Mv), we have

\Pr_{B=(R,v)\in\operatorname{Res}_{s,t+100,t}}[(M,c)\circ B\in\mathcal{S}_{t}\;|\;b\notin\operatorname{Im}_{a}(M^{\prime})]\leqslant\frac{2}{\alpha}\varepsilon.
Proof.

Choose a full rank (s+t+100)-affine transformation (M,c) uniformly at random, then choose (R,v)\in\operatorname{Res}_{s,t+100,t} uniformly and set (M^{\prime},c^{\prime})=(M,c)\circ(R,v). For the remainder of this proof all probabilities are with respect to this distribution. Define the following events.

  • E1={(M,c)𝒮t}E_{1}=\{(M^{\prime},c^{\prime})\in\mathcal{S}_{t}\}.

  • E2={(M,c)𝒮b}E_{2}=\{(M,c)\in\mathcal{S}^{\prime}_{b}\}.

  • E3={bIma(M)}E_{3}=\{b\notin\operatorname{Im}_{a}(M^{\prime})\}.

  • E4={bIma(M)}E_{4}=\{b\in\operatorname{Im}_{a}(M)\}.

First we note that

Pr[E1|E3E4]=Pr[E1|E3],\Pr[E_{1}\;|E_{3}\land E_{4}]=\Pr[E_{1}\;|\;E_{3}],

and as a result,

Pr[E1E3|E4]=Pr[E1|E3E4]Pr[E3|E4]Pr[E1|E3].\Pr[E_{1}\land E_{3}\;|\;E_{4}]=\Pr[E_{1}\;|\;E_{3}\land E_{4}]\cdot\Pr[E_{3}\;|\;E_{4}]\leqslant\Pr[E_{1}\;|\;E_{3}].

We can then conclude

Pr(M,c)=(M,c)R[E1|E2E3]\displaystyle\Pr_{(M^{\prime},c^{\prime})=(M,c)\circ R}[E_{1}\;|\;E_{2}\land E_{3}] =Pr[E1E3E4]Pr[E2E3]\displaystyle=\frac{\Pr[E_{1}\land E_{3}\land E_{4}]}{\Pr[E_{2}\land E_{3}]}
=Pr[E1E3|E4]Pr[E4]Pr[E2E3]\displaystyle=\frac{\Pr[E_{1}\land E_{3}~{}|~{}E_{4}]\Pr[E_{4}]}{\Pr[E_{2}\land E_{3}]}
Pr[E1|E3]Pr[E4]Pr[E2E3]\displaystyle\leqslant\frac{\Pr[E_{1}\;|\;E_{3}]\Pr[E_{4}]}{\Pr[E_{2}\land E_{3}]}
=Pr(M,c)[E1|E3]Pr[E2|E4]Pr[E3|E2],\displaystyle=\frac{\Pr_{(M^{\prime},c^{\prime})}[E_{1}\;|\;E_{3}]}{\Pr[E_{2}\;|\;E_{4}]\Pr[E_{3}\;|\;E_{2}]},

where we use the fact that E2E4=E2E_{2}\land E_{4}=E_{2}, and so Pr[E2]/Pr[E4]=Pr[E2|E4]\Pr[E_{2}]/\Pr[E_{4}]=\Pr[E_{2}~{}|~{}E_{4}].

Notice that the numerator of the last fraction is at most \varepsilon, while the denominator is at least \alpha/2, so this probability is at most 2\varepsilon/\alpha. Thus, there exists (M,c)\in\mathcal{S}^{\prime}_{b} such that, conditioned on this (M,c), the probability that (M,c)\circ B\in\mathcal{S}_{t} over B\in\operatorname{Res}_{s,t+100,t}, conditioned on b\notin\operatorname{Im}_{a}(M^{\prime}), is at most 2\varepsilon/\alpha. ∎

Fixing this (M,c), the remaining test graph is now over \operatorname{Res}_{s,t+100,t} and is isomorphic to {\sf AffBilin}(t+100,t). Moreover, by Lemma 3.8, we may focus solely on the flat b^{\prime}+\operatorname{Im}(R^{\prime}) determined by the restriction. Writing A=(M,c) for the fixed transformation, this allows us to define \tilde{f}:\mathbb{F}_{q}^{t+100}\xrightarrow[]{}\mathbb{F}_{q} as done in Section 3.5:

f~(β1,,βt+100)=α𝔽qsfA(α,β1,,βt+100)i=1s/pP(αp(i1)+1,,αpi).\tilde{f}(\beta_{1},\ldots,\beta_{t+100})=\sum_{\alpha\in\mathbb{F}_{q}^{s}}f\circ A(\alpha,\beta_{1},\ldots,\beta_{t+100})\prod_{i=1}^{s/p}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi}).

By Lemma 3.8 we now work with the standard t-flat test for \tilde{f}. The condition that b\notin\operatorname{Im}_{a}(M\cdot R) translates into the condition that the t-flat U\subseteq\mathbb{F}_{q}^{t+100} does not contain the point w\in\mathbb{F}_{q}^{t+100} equal to the last t+100 coordinates of a^{\prime}, where a^{\prime} is the unique point such that Ma^{\prime}=b.

Translating Lemma 4.6 gives

PrB=z+A𝖠𝖿𝖿𝖦𝗋𝖺𝗌(t+100,t)[deg(f~|B)r|wA]2αε.\Pr_{B=z+A\in{\sf AffGras}(t+100,t)}[\deg(\tilde{f}|_{B})\geqslant r\;|\;w\notin A]\leqslant\frac{2}{\alpha}\varepsilon.

This is the same result shown at the start of the proof of [31, Claim 3.4], and the rest of the argument follows as therein. Let

\mathcal{B}=\{B=z+A\in{\sf AffGras}(t+100,t)\;|\;w\notin A,\;\deg(\tilde{f}|_{B})\geqslant r\}.
Lemma 4.7.

The set \mathcal{B} is nonempty.

Proof.

Since we chose (M,c)\in\mathcal{S}^{\prime}_{b}, there is at least one restriction B such that (M,c)\circ B rejects f, and hence there is a t-flat on which \tilde{f} has degree at least r; in particular, \deg(\tilde{f})\geqslant r. It follows that if we sample a random t-flat B^{\prime\prime}=z+A in \mathbb{F}_{q}^{t+100}, then \deg(\tilde{f}|_{B^{\prime\prime}})\geqslant r with probability at least 1/q. If \mathcal{B} were empty, however, we would have \deg(\tilde{f}|_{B^{\prime\prime}})\geqslant r only if w\in A. In this case the probability that w\in A is at most

qt1qt+11<1q.\frac{q^{t}-1}{q^{t+1}-1}<\frac{1}{q}.

Therefore \mathcal{B} is nonempty. ∎

We now proceed to the proof of Lemma 4.5.

Proof of Lemma 4.5.

By Lemma 4.7, \mathcal{B} is non-empty, so there must be a (t+40)-flat W such that

W={B|BW}.\mathcal{B}_{W}=\{B\in\mathcal{B}\;|\;B\subseteq W\}.

is nonempty. Let \mu_{W} denote the uniform measure over {\sf AffGras}(W,t). This graph is isomorphic to {\sf AffGras}(t+40,t), but with the ground space \mathbb{F}_{q}^{t+40} viewed as W. We first claim that since \mu_{W}(\mathcal{B}_{W})>0, it must be at least q^{-100}. If not, then 0<\mu_{W}(\mathcal{B}_{W})<q^{-100}. However, \mathcal{B}_{W} is q^{-60}-pseudo-random with respect to zoom-ins and zoom-ins on the linear part simply due to its size. Indeed, any zoom-in or zoom-in on the linear part already has measure q^{-40}, so \mathcal{B}_{W} can occupy at most a q^{-60} fraction of it. Furthermore, \mathcal{B}_{W} is also q^{-50}-pseudo-random with respect to zoom-outs and zoom-outs on the linear part by a similar argument to Lemma 4.3. Finally, by a similar argument to Lemma 4.2, 1-\Phi_{W}(\mathcal{B}_{W})\geqslant\frac{1}{q}. Altogether, this contradicts Theorem A.1, so we conclude that \mu_{W}(\mathcal{B}_{W})\geqslant q^{-100}.

Thus, there exists W such that \mu_{W}(\mathcal{B}_{W})\geqslant q^{-100}. Fix such a W, sample a uniform (t+99)-flat Y=u+V\subseteq\mathbb{F}_{q}^{t+100} conditioned on w\notin V, then a uniform (t+60)-flat A_{2}\subseteq Y, and consider A_{2}\cap W. Here, for a (t+99)-flat Y we write \mathcal{B}_{Y}=\{B\in\mathcal{B}\;|\;B\subseteq Y\}. We may think of W as being defined by a system of 60 independent linear equations \langle h_{1},x\rangle=c_{1},\ldots,\langle h_{60},x\rangle=c_{60}. That is, W is the set of points of \mathbb{F}_{q}^{t+100} that satisfies these 60 equations. Likewise, A_{2} is given by the restriction of 39 linear equations, \langle h^{\prime}_{1},x\rangle=c^{\prime}_{1},\ldots,\langle h^{\prime}_{39},x\rangle=c^{\prime}_{39}. The probability that all 99 linear equations are linearly independent is at least

j=038q99q60+jq99e2j=1qje4/q.\prod_{j=0}^{38}\frac{q^{99}-q^{60+j}}{q^{99}}\geqslant e^{-2\sum_{j=1}^{\infty}q^{-j}}\geqslant e^{-4/q}.

When all 9999 linear equations are linearly independent, A2WA_{2}\cap W is uniform over 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(W,t){\sf AffGras}(W,t). Thus,

Pr[A2WW]e4/qq100.\Pr[A_{2}\cap W\in\mathcal{B}_{W}]\geqslant e^{-4/q}q^{-100}.

If A2WWA_{2}\cap W\in\mathcal{B}_{W}, then A2Y60A_{2}\in\mathcal{B}_{Y}^{\uparrow^{60}}, where the upper shadow is taken with respect to 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(Y,t){\sf AffGras}(Y,t), so it follows that

EY[μY(Y60)]e4/qq100.E_{Y}[\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})]\geqslant e^{-4/q}q^{-100}.

However, by Lemma 3.10,

μY(Y60)q60μY(Y),\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})\leqslant q^{60}\mu_{Y}(\mathcal{B}_{Y}),

so altogether we get that

EY[μY(Y)]e4/qq160.E_{Y}[\mu_{Y}(\mathcal{B}_{Y})]\geqslant e^{-4/q}q^{-160}.

To conclude, note that the left hand side is at most the probability that \tilde{f}|_{B} has degree at least r over a uniform t-flat B=x+A\subseteq\mathbb{F}_{q}^{t+100} such that w\notin A. By assumption, this probability is at most \frac{2}{\alpha}\varepsilon, so

\alpha\leqslant e^{4/q}q^{160}\cdot 2\varepsilon.\qed

We are now ready to prove Lemma 4.4.

Proof of Lemma 4.4.

We show that the set \mathcal{S}_{t} is tq^{162}\varepsilon-pseudorandom with respect to zoom-ins on the linear part C_{a,b,\operatorname{lin}} for any a\in\mathbb{F}_{q}^{s+t} and b\in\mathbb{F}_{q}^{n}.

If aa[s]a\neq a_{[s]}, then we are done by Lemma 4.5, so suppose that a=a[s]a=a_{[s]}, meaning aa is zero outside of its first ss coordinates. Clearly aa must have at least one nonzero coordinate, as otherwise a=0a=0 and Ca,b,linC_{a,b,\operatorname{lin}} is either all of 𝒯n,s+t\mathcal{T}_{n,s+t} (if b=0)b=0), or empty (if b0b\neq 0). The former case cannot happen by the assumption that μ(𝒮t)qM\mu(\mathcal{S}_{t})\leqslant q^{-M}, while the latter case is trivially true. Without loss of generality suppose that a1=α0a_{1}=\alpha\neq 0.

For each T\in\mathcal{S}_{t}\cap C_{a,b,\operatorname{lin}}, there is an e\in\{0,\ldots,q-1\}^{t} satisfying \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r such that \langle f\circ T,H_{e}\rangle\neq 0. Furthermore, as r>0, there must be some “special index” i such that e_{i}<q-1. Take the most common special index over all T\in\mathcal{S}_{t}\cap C_{a,b,\operatorname{lin}} and without loss of generality suppose it is 1. Thus, for at least a 1/t fraction of the T\in\mathcal{S}_{t}\cap C_{a,b,\operatorname{lin}}, we have \langle f\circ T,H_{e}\rangle\neq 0 for some e such that e_{1}<q-1 and \sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r. Let \mathcal{A} denote the set of these transformations.

Consider the map, Fβ:𝔽qs+t𝔽qs+tF_{\beta}:\mathbb{F}_{q}^{s+t}\xrightarrow[]{}\mathbb{F}_{q}^{s+t} that sends xs+1x_{s+1} to xs+1+βx1x_{s+1}+\beta x_{1} and keeps all other coordinates unchanged. That is,

Fβ(x1,,xs+t)=(x1,,xs,xs+1+βx1,xs+2,,xs+t).F_{\beta}(x_{1},\ldots,x_{s+t})=(x_{1},\ldots,x_{s},x_{s+1}+\beta x_{1},x_{s+2},\ldots,x_{s+t}).

For any T𝒜T\in\mathcal{A}, we claim that

Prβ𝔽q[fTFβ,He0]2q,\Pr_{\beta\in\mathbb{F}_{q}}[\langle f\circ T\circ F_{\beta},H_{e}\rangle\neq 0]\geqslant\frac{2}{q},

where ee is the exponent vector such that e1<q1e_{1}<q-1, i=1teit(q1)r\sum_{i=1}^{t}e_{i}\leqslant t(q-1)-r, and fT,He0\langle f\circ T,H_{e}\rangle\neq 0.

Indeed, \langle f\circ T\circ F_{\beta},H_{e}\rangle is a linear combination of the coefficients of f\circ T\circ F_{\beta}, which are polynomials in \beta. In fact, by Lemma 2.2, it is a linear combination of coefficients of monomials in which the degree of x_{s+1} is q-1-e_{1}>0. In every monomial of f\circ T\circ F_{\beta}, however, the degree of \beta and the degree of x_{s+1} add up to at most q-1, so it follows that the coefficients that contribute to \langle f\circ T\circ F_{\beta},H_{e}\rangle have degree at most q-2 in \beta. Therefore \langle f\circ T\circ F_{\beta},H_{e}\rangle is a polynomial in \beta of degree at most q-2. Finally, this polynomial is not identically zero because it evaluates to a nonzero value at \beta=0. The inequality then follows from the Schwartz-Zippel lemma.

In particular, this means that for each T\in\mathcal{A} there is a nonzero \beta such that \langle f\circ T\circ F_{\beta},H_{e}\rangle\neq 0, and we choose \beta^{\star} to be the most common such \beta (we remark that the fact that \beta^{\star}\neq 0 is the reason we needed probability 2/q rather than 1/q). Thus, for at least a 2/q fraction of the transformations T\in\mathcal{A} it holds that T\circ F_{\beta^{\star}}\in\mathcal{S}_{t}. Letting a^{\prime}=F_{\beta^{\star}}^{-1}(a), we get that for at least a 2/q fraction of the T\in\mathcal{A} it holds that T\circ F_{\beta^{\star}}\in\mathcal{C}_{a^{\prime},b,\operatorname{lin}}\cap\mathcal{S}_{t}, and as F_{\beta^{\star}} is a bijection it follows that

1tμ(𝒮t𝒞a,b,lin)μ(𝒜)q2μ(𝒮t𝒞a,b,lin).\frac{1}{t}\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})\leqslant\mu(\mathcal{A})\leqslant\frac{q}{2}\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a^{\prime},b,\operatorname{lin}}).

Finally, note that a^{\prime}\neq a^{\prime}_{[s]}, as its (s+1)-st coordinate is nonzero by design. Applying Lemma 4.5 we get that \mu(\mathcal{S}_{t}\cap\mathcal{C}_{a^{\prime},b,\operatorname{lin}})\leqslant q^{161}\varepsilon\mu(\mathcal{C}_{a^{\prime},b,\operatorname{lin}}), and as \mu(\mathcal{C}_{a^{\prime},b,\operatorname{lin}})=\mu(\mathcal{C}_{a,b,\operatorname{lin}}), combining with the above we get that

μ(𝒮t𝒞a,b,lin)μ(𝒞a,b,lin)tq162ε.\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b,\operatorname{lin}})}{\mu(\mathcal{C}_{a,b,\operatorname{lin}})}\leqslant tq^{162}\varepsilon.\qed

5 Correcting the error and iterating

The results from the previous subsections, along with the assumption that μ(𝒮t)qM\mu(\mathcal{S}_{t})\leqslant q^{-M}, establish that 𝒮t\mathcal{S}_{t} satisfies the following properties:

  1. 1.

    μ(𝒮t)qM\mu(\mathcal{S}_{t})\leqslant q^{-M}

  2. 2.

    1Φ(𝒮t)1q1-\Phi(\mathcal{S}_{t})\geqslant\frac{1}{q}.

  3. 3.

    𝒮t\mathcal{S}_{t} is q1Mq^{1-M}-pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

  4. 4.

    𝒮t\mathcal{S}_{t} is tq162Mtq^{162-M}-pseudorandom with respect to zoom-ins on the linear part.

Since we take t10t\geqslant 10, it follows from Theorem 2.1 with ξ=tq162M\xi=tq^{162-M} and =s+t\ell=s+t that there exists a pair a,ba,b such that

μa,b(𝒮t)=μ(𝒮t𝒞a,b)μ(𝒞a,b)11(q1)21q62000tq200qM/4,\mu_{a,b}(\mathcal{S}_{t})=\frac{\mu(\mathcal{S}_{t}\cap\mathcal{C}_{a,b})}{\mu(\mathcal{C}_{a,b})}\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}, (3)

for a large enough MM.

How shall we go about using this information? In words, (3) tells us that 𝒮t\mathcal{S}_{t} is dense on transformations sending aa to bb, and this suggests that the point bb is an erroneous point for ff which we should fix and, as a result, improve the acceptance probability of the test. Furthermore, it stands to reason that this change should affect all tests that come from transformations in 𝒞a,b\mathcal{C}_{a,b}.

Upon inspection however, these tests can only be affected in the case that asupp(H)a\in\operatorname{supp}(H); otherwise, in the test we perform, the value f(T(a))=f(b)f(T(a))=f(b) is multiplied by 0, and hence changing the value of ff at bb does not affect the test at all. Thus, for the above strategy to work, we must prove that (3) holds for a pair a,ba,b wherein aa is in 𝗌𝗎𝗉𝗉(H){\sf supp}(H).

Lemma 5.1.

There exists asupp(H)a\in\operatorname{supp}(H) and b𝔽qnb\in\mathbb{F}_{q}^{n} such that μa,b(𝒮t)11(q1)21q62000tq200qM/4\mu_{a,b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}.

Once we have shown this lemma, we will be able to show that correction at bb can reduce the rejection probability of tests in 𝒞a,b\mathcal{C}_{a,b}; however, in order to show that the tester is optimal, we need to fix a larger fraction of tests with a single correction. Specifically, we need to be able to fix tests such that T(a)=bT(a)=b for any asupp(H)a\in\operatorname{supp}(H). In order for such an argument to work, we will have to show that 𝒮t\mathcal{S}_{t} is dense in asupp(H)𝒞a,b\bigcup_{a\in\operatorname{supp}(H)}\mathcal{C}_{a,b}.

Lemma 5.2.

If there exists an a\in\operatorname{supp}(H) such that \mu_{a,b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}, then \mathcal{S}_{t} has density at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-\frac{2}{q^{100}(q-1)}-2000tq^{200}q^{-M/4}-q^{-M/2} in \cup_{a\in\operatorname{supp}(H)}\mathcal{C}_{a,b}.

In words, \cup_{a\in\operatorname{supp}(H)}\mathcal{C}_{a,b} is the set of (s+t)-affine transformations T such that Ta=b for some a\in\operatorname{supp}(H). In the next two subsections we prove Lemmas 5.1 and 5.2. After showing these lemmas, the way to perform corrections will become obvious and we quickly conclude the proof of Theorem 1.1. Both lemmas are proved in a similar manner, and we first present a general lemma that will be used in both proofs. Let g:\mathbb{F}_{q}^{\ell+100}\xrightarrow[]{}\mathbb{F}_{q} be an arbitrary polynomial, let d be a degree parameter, let \nu denote the uniform measure over \ell-flats in \mathbb{F}_{q}^{\ell+100}, and let b\in\mathbb{F}_{q}^{\ell+100} be an arbitrary point. Define the following two sets:

\mathcal{A}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\in U\},
\mathcal{B}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\notin U\}.

Keep \varepsilon and M (the large absolute constant) as we have defined them, so that q^{M/2}O(\varepsilon)<q^{-M/2} is small. We will use the following result, which is an extension of the results in Section 3.2 of [31]:

Lemma 5.3.

Keep g,d,b,g,d,b, and \mathcal{B} as defined above and suppose max(d+1qq/p,4)\ell\geqslant\max(\lceil\frac{d+1}{q-q/p}\rceil,4). If ν()qM/2O(ε)\nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon) then =\mathcal{B}=\emptyset. Moreover there is a value γ\gamma such that after changing g(b)g(b) to γ\gamma, ν(𝒜)=0\nu(\mathcal{A})=0.

As a consequence of these two points, deg(g)d\deg(g)\leqslant d after changing the value of g(b)g(b) to γ\gamma.

Proof.

The proof is deferred to Section C. ∎

5.1 Proof of Lemma 5.1

To establish Lemma 5.1 we show that 𝒮t\mathcal{S}_{t} is pseudorandom with respect to zoom-ins outside of the support, hence the zoom-in found in (3) must be in the support of HH. Specifically, we show:

Lemma 5.4.

For any asupp(H)a\notin\operatorname{supp}(H) and any b𝔽qnb\in\mathbb{F}_{q}^{n}, μa,b(𝒮t)<11(q1)21q62000tq200qM/4\mu_{a,b}(\mathcal{S}_{t})<1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}.

We begin the proof of Lemma 5.4 by assuming for the sake of contradiction that the lemma is false. That is, suppose there is a\notin\operatorname{supp}(H) such that \mu_{a,b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200}q^{-M/4}. Since a\notin\operatorname{supp}(H), there are p consecutive coordinates (a_{pi+1},\ldots,a_{pi+p})\notin\operatorname{supp}(P) for some 0\leqslant i<s/p. It will be convenient to swap the order of the coordinates so that these are the last p coordinates. Thus, the polynomial H_{e}(x) for e\in\{0,\ldots,q-1\}^{t} is now

H_{e}(x_{1},\ldots,x_{s+t})=\left(\prod_{i=1}^{s/p-1}P(x_{p(i-1)+1},\ldots,x_{pi})\right)x_{s-p+1}^{e_{1}}\cdots x_{s-p+t}^{e_{t}}\cdot P(x_{s+t-p+1},\ldots,x_{s+t}).

The test with affine transformation TT checks if fT,He=0\langle f\circ T,H_{e}\rangle=0 for all valid ee with HeH_{e} as defined above. Reordering the variables in this way changes the support, so that our assumption asupp(H)a\notin\operatorname{supp}(H) becomes, without loss of generality, (as+tp+1,,as+t)supp(P)(a_{s+t-p+1},\ldots,a_{s+t})\notin\operatorname{supp}(P), which will make our notation later a bit simpler (this is the only reason for reordering the variables). We will once again sample a larger affine transformation and choose a restriction to obtain a T𝒯n,s+tT\in\mathcal{T}_{n,s+t}, however the restrictions this time will be slightly different. We will require the restrictions to be full rank and we will fix the first s+tps+t-p columns/coordinates of TT as opposed to the first ss as before. Let Ress+tp,p+102,p\operatorname{Res}_{s+t-p,p+102,p} consist of affine transformations (R,b)(R,b) of the form:

R=[Is+tp00R],b=[0b],R=\begin{bmatrix}I_{s+t-p}&0\\ 0&R^{\prime}\end{bmatrix},b=\begin{bmatrix}0\\ b^{\prime}\\ \end{bmatrix}, (4)

where R𝔽q(p+102)×pR^{\prime}\in\mathbb{F}_{q}^{(p+102)\times p} is full rank and b𝔽qp+102b^{\prime}\in\mathbb{F}_{q}^{p+102}. For B=(R,b)Ress+tp,p+102,pB=(R,b)\in\operatorname{Res}_{s+t-p,p+102,p} of this form, let fl(B)=b+Im(R)\operatorname{fl}(B)=b^{\prime}+\operatorname{Im}(R^{\prime}). Also, for an affine transformation TT, define Ima(T)={Tx:x[s+tp]=a[s+tp]}\operatorname{Im}_{a}(T)=\{Tx\;:\;x_{[s+t-p]}=a_{[s+t-p]}\}, where recall a[s+tp]a_{[s+t-p]} is aa with all coordinates outside of the first s+tps+t-p set to zero. Sample a T𝒯n,s+tT\in\mathcal{T}_{n,s+t} as follows:

  1. 1.

    Choose a full rank A𝒯n,s+t+102A\in\mathcal{T}_{n,s+t+102} such that bIma(A)b\in\operatorname{Im}_{a}(A).

  2. 2.

    Choose BRess+tp,p+102,pB\in\operatorname{Res}_{s+t-p,p+102,p}.

  3. 3.

    Output T=ABT=A\circ B.

After A is chosen, the first s+t-p columns/coordinates of A’s linear part/affine shift are fixed, while the remaining parts are composed with some random restriction. Thus, once A is fixed there is a unique x^{\star} such that Ax^{\star}=b, and we can only have Ta=b if Ba=x^{\star}. In particular, this only happens if x^{\star}\in\operatorname{Im}(B), where by design this x^{\star} satisfies x^{\star}_{[s+t-p]}=a_{[s+t-p]}. Setting z^{\star}\in\mathbb{F}_{q}^{p+102} to be the last p+102 coordinates of x^{\star}, it follows that Ta=b only if z^{\star}\in\operatorname{fl}(B). Let \mu_{A} denote the measure in \operatorname{Res}_{s+t-p,p+102,p}. Define

A={B|AB rejects,zfl(B)}.\mathcal{R}_{A}=\{B\;|\;A\circ B\text{ rejects},z^{\star}\notin\operatorname{fl}(B)\}.

If, after choosing A and constructing z^{\star} as above, we then condition on z^{\star}\notin\operatorname{fl}(B), then A\circ B is uniformly random over full-rank transformations in \mathcal{T}_{n,s+t} whose image does not contain b. Therefore, \mathop{\mathbb{E}}_{A}[\mu_{A}(\mathcal{R}_{A})]\leqslant O(\varepsilon), and with probability at least 1/2 we have \mu_{A}(\mathcal{R}_{A})\leqslant O(\varepsilon).

On the other hand, if after choosing AA we choose BB uniformly such that AB𝒞a,bA\circ B\in\mathcal{C}_{a,b}, the distribution over TT is uniform over full rank T𝒞a,bT\in\mathcal{C}_{a,b}. By Remark 3.2, the fraction of affine transformations in 𝒞a,b\mathcal{C}_{a,b} that are not full rank is at most 1q100(q1)\frac{1}{q^{100}(q-1)} and by our assumption the set of rejecting transformations is dense in 𝒞a,b\mathcal{C}_{a,b}. Thus a simple averaging argument shows that the fraction of full rank T𝒞a,bT\in\mathcal{C}_{a,b} that reject is also large, and in particular, strictly greater than 1/21/2. Therefore with probability strictly greater than 1/21/2 over AA, there is at least one BB such that ABA\circ B is full rank and rejects.

It follows that there exists a full rank A𝒯n,s+t+102A\in\mathcal{T}_{n,s+t+102} such that the following two hold:

  • μA(A)O(ε)\mu_{A}(\mathcal{R}_{A})\leqslant O(\varepsilon),

  • There exists a BB^{\star} such that T=AB𝒞a,bT=A\circ B^{\star}\in\mathcal{C}_{a,b} is full rank and rejects.

Fix this A and keep x^{\star}, z^{\star}, and \mathcal{R}_{A} as defined. We now show how this leads to a contradiction. The idea is to apply Lemma 5.3 and argue that there is a point at which we can make a correction and cause A\circ B to be accepted for all possible B, and in particular for B^{\star}. Since we assumed that there was high rejection probability on a zoom-in outside the support, we will show that the correction is made at a point not looked at by the test A\circ B^{\star}. These two facts together form a contradiction because changing f at a point that is not in the set of points (A\circ B^{\star})(\alpha) for \alpha\in\operatorname{supp}(H) cannot change the result of the test A\circ B^{\star}. In order to apply Lemma 5.3, however, we need a statement similar to \mu_{A}(\mathcal{R}_{A})\leqslant O(\varepsilon), but for a set of flats instead of a set of affine transformations. To this end, we argue that the set of \operatorname{fl}(B) for B\in\mathcal{R}_{A} is small as well.

We proceed with the formal argument. The first step is to define an auxiliary polynomial similar to that of Lemma 3.8. Suppose TT rejects because fT,He0\langle f\circ T,H_{e}\rangle\neq 0 for a fixed valid ee. Using this ee, define f~:𝔽qp+102𝔽q\tilde{f}:\mathbb{F}_{q}^{p+102}\xrightarrow[]{}\mathbb{F}_{q} by

f~(β)=α𝔽qsp+tfA(α1,,αsp+t,β)(i=1s/p1P(αp(i1)+1,,αpi))αsp+1e1αsp+tet.\tilde{f}(\beta)=\sum_{\alpha\in\mathbb{F}_{q}^{s-p+t}}f\circ A(\alpha_{1},\ldots,\alpha_{s-p+t},\beta)\left(\prod_{i=1}^{s/p-1}P(\alpha_{p(i-1)+1},\ldots,\alpha_{pi})\right)\alpha_{s-p+1}^{e_{1}}\cdots\alpha_{s-p+t}^{e_{t}}.
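Purely as an illustration of this definition (and under the same caveats as the earlier sketch: q prime, the constant 102 replaceable by a small stand-in, names ours), \tilde{f} can be evaluated by brute force once f\circ A and P are represented as Python callables:

import itertools

def aux_value(fA, P, e, q, s, p, t, beta):
    # \tilde f(beta): sum over alpha in F_q^{s-p+t} of
    #   fA(alpha, beta) * prod_{i=1}^{s/p-1} P(alpha on the i-th block of p coordinates)
    #                   * alpha_{s-p+1}^{e_1} * ... * alpha_{s-p+t}^{e_t},
    # where fA stands for f o A and beta is a tuple of length p + 102 (or a smaller stand-in).
    total = 0
    for alpha in itertools.product(range(q), repeat=s - p + t):
        w = 1
        for i in range(s // p - 1):
            w = (w * P(alpha[p * i:p * (i + 1)])) % q
        for j in range(t):
            w = (w * pow(alpha[s - p + j], e[j], q)) % q
        total = (total + fA(alpha + tuple(beta)) * w) % q
    return total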

For any B=(R,b)Ress+tp,p+102,pB=(R,b)\in\operatorname{Res}_{s+t-p,p+102,p} written according to Equation (4) it is easy to check that

fAB,He=f~(R,b),P,\langle f\circ A\circ B,H_{e}\rangle=\langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\rangle,

so after fixing AA we can view the test as being performed on f~\tilde{f} with the transformation (R,b)𝒯p+102,p(R^{\prime},b^{\prime})\in\mathcal{T}_{p+102,p}. By Theorem 3.4, if deg(f~)<q(p1)\deg(\tilde{f})<q(p-1), then f~(R,b),P=0\langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\rangle=0 for all B=(R,b)𝒯p+102,pB^{\prime}=(R^{\prime},b^{\prime})\in\mathcal{T}_{p+102,p}. This leads to the following fact:

B can only reject if deg(f~|fl(B))q(p1).B\text{ can only reject if }\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1).

This fact is similar to Lemma 3.8; however, we only have one direction. Namely, it is not true that B always rejects if \deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1), because the test on \tilde{f} only checks inner products with P and not with all monomials of degree up to q-p.

Using this fact we can now relate μA(A)\mu_{A}(\mathcal{R}_{A}) to the measure of the set of pp-flats fl(B)\operatorname{fl}(B) not containing zz^{\star} such that deg(f~|fl(B))q(p1)\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1). Define the set of pp-flats,

\mathcal{B}_{A}=\{\operatorname{fl}(B)\;|\;\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1),\ z^{\star}\notin\operatorname{fl}(B)\}\subseteq{\sf AffGras}(p+102,p).

Equivalently, \mathcal{B}_{A} is the set of all p-flats U not containing z^{\star} such that \deg(\tilde{f}|_{U})\geqslant q(p-1).

To analyze the fractional size of this set, we can choose (R,b)(R^{\prime},b^{\prime}) by choosing a pp-flat, and then choosing a basis for the flat. More formally, with AA fixed,

  1. 1.

    Choose BResp+102,pB\in\operatorname{Res}_{p+102,p}^{\prime} such that zfl(B)z^{\star}\notin\operatorname{fl}(B), and write BB according to Equation (4).

  2. 2.

    Choose T𝒯p,pT\in\mathcal{T}_{p,p}, and replace (R,b)(R^{\prime},b^{\prime}) with (R,b)T(R^{\prime},b^{\prime})\circ T. Let the resulting restriction be BResp+102,pB^{\prime}\in\operatorname{Res}^{\prime}_{p+102,p}.

  3. 3.

    Output BB^{\prime}.

We claim that if initially deg(f~|fl(B))q(p1)\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1), then with probability at least 1/q1/q, the outputted BB^{\prime} rejects.

Lemma 5.5.

Suppose \deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1). Then with probability at least 1/q over T\in\mathcal{T}_{p,p} in the second step, B^{\prime} rejects. That is, \langle\tilde{f}\circ B^{\prime},P\rangle\neq 0.

Proof.

Write BB in the form of Equation (4). Then the assumption deg(f~|fl(B))q(p1)\deg(\tilde{f}|_{\operatorname{fl}(B)})\geqslant q(p-1) is equivalent to deg(f~(R,b))q(p1)\deg(\tilde{f}\circ(R^{\prime},b^{\prime}))\geqslant q(p-1).

Choose a full rank TT uniformly at random and consider

f~(R,b)T,P=f~(R,b),PT1.\langle\tilde{f}\circ(R^{\prime},b^{\prime})\circ T,P\rangle=\langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\circ T^{-1}\rangle.

Since \deg(\tilde{f}\circ(R^{\prime},b^{\prime}))\geqslant q(p-1), Lemma B.5 implies that there is at least one choice of invertible T, and hence of T^{-1}, that makes the above nonzero. Therefore, we can view \langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\circ T^{-1}\rangle as a nonzero polynomial in the entries of T^{-1}. Since the degree of P is at most q-p, this polynomial has total degree at most q-p, so by the Schwartz-Zippel Lemma, \langle\tilde{f}\circ(R^{\prime},b^{\prime}),P\circ T^{-1}\rangle\neq 0 with probability at least p/q over the entries of T^{-1}. Since T^{-1} has to be invertible, we must ignore the choices of entries that do not yield an invertible matrix, but these constitute at most a (p-1)/q fraction. Overall, we still get that with probability at least 1/q over T\in\mathcal{T}_{p,p}, the transformation (R^{\prime},b^{\prime})\circ T rejects \tilde{f}. ∎
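As an illustrative sanity check of the Schwartz-Zippel step above (not of the lemma itself), the following toy computation verifies, for one concrete polynomial of our choosing over a prime field, that the fraction of zeros is at most degree/q:

import itertools

q, D = 7, 3
poly = lambda x, y: (x * x * y + 3 * x + 1) % q        # a nonzero polynomial of total degree D = 3
points = list(itertools.product(range(q), repeat=2))
zero_fraction = sum(poly(x, y) == 0 for x, y in points) / len(points)
assert zero_fraction <= D / q                           # the Schwartz-Zippel guarantee
print(zero_fraction, D / q)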

Letting \nu_{A} denote the uniform measure on {\sf AffGras}(p+102,p), Lemma 5.5 implies that \nu_{A}(\mathcal{B}_{A})\leqslant q\mu_{A}(\mathcal{R}_{A}). We are now close to being able to apply Lemma 5.3, but \mathcal{B}_{A} is a set of p-flats, while Lemma 5.3 only applies to sets of flats of dimension at least 4. Thus, in the cases p=2,3 we cannot use this lemma directly. There is an easy fix, however, which follows by looking at the upper shadow of \mathcal{B}_{A}.

Lemma 5.6.

Let \mathcal{B}^{\prime}_{A}=\{U^{\prime}\;|\;\dim(U^{\prime})=p+2,\ z^{\star}\notin U^{\prime},\ \deg(\tilde{f}|_{U^{\prime}})\geqslant q(p-1)\}. Then,

ν(A)q2νA(A),\nu(\mathcal{B}^{\prime}_{A})\leqslant q^{2}\nu_{A}(\mathcal{B}_{A}),

where ν\nu is uniform measure in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(p+102,p+2){\sf AffGras}(p+102,p+2).

Proof.

Observe that \mathcal{B}^{\prime}_{A}\subseteq\mathcal{B}_{A}^{\uparrow^{2}}. Indeed, for any U^{\prime}\in\mathcal{B}^{\prime}_{A}, there must be a p-flat U\subseteq U^{\prime} such that \deg(\tilde{f}|_{U})\geqslant q(p-1). Since z^{\star}\notin U^{\prime}, it follows that z^{\star}\notin U and thus U\in\mathcal{B}_{A}. The result then follows from applying Lemma 3.10 twice. ∎

We now wrap up the proof by applying Lemma 5.3 and obtaining a contradiction.

Proof of Lemma 5.4.

By Lemma 5.6, \nu(\mathcal{B}^{\prime}_{A})\leqslant q^{2}\nu_{A}(\mathcal{B}_{A})\leqslant q^{3}O(\varepsilon). We can now apply Lemma 5.3, with \ell=p+2, degree parameter q(p-1)-1, and special point z^{\star}. The conditions of Lemma 5.3 are satisfied since p+2\geqslant\max(\lceil\frac{q(p-1)}{q-q/p}\rceil,4) and q^{3}O(\varepsilon)\leqslant q^{M/2}O(\varepsilon). From Lemma 5.3 it follows that \mathcal{B}^{\prime}_{A} is empty and there is a value \gamma such that after changing \tilde{f}(z^{\star}) to \gamma, we have \deg(\tilde{f}|_{U^{\prime}})\leqslant q(p-1)-1 for every (p+2)-flat U^{\prime} containing z^{\star}, and hence \deg(\tilde{f}|_{U})\leqslant q(p-1)-1 for every p-flat U containing z^{\star} as well. In particular, it must be the case that \langle f\circ A\circ B^{\star},H_{e}\rangle=0, or equivalently, \langle\tilde{f}\circ(R^{\star},b^{\star}),P\rangle=0, where (R^{\star},b^{\star}) is the non-identity part of B^{\star} when written according to (4).

Recall how the point z^{\star} was defined: after A is fixed, x^{\star}\in\mathbb{F}_{q}^{s+t+102} is the unique point such that Ax^{\star}=b, and z^{\star}\in\mathbb{F}_{q}^{p+102} is the last p+102 coordinates of x^{\star}. Since B^{\star} satisfies A\circ B^{\star}\in\mathcal{C}_{a,b}, we have (A\circ B^{\star})(a)=b, and hence B^{\star}a=x^{\star}. Letting a^{\prime} be the last p coordinates of a, it follows that (R^{\star},b^{\star})(a^{\prime})=z^{\star}. However, by assumption a^{\prime}\notin\operatorname{supp}(P), and since (R^{\star},b^{\star}) is full rank, (R^{\star},b^{\star})(\alpha)\neq z^{\star} for any \alpha\in\operatorname{supp}(P). We now see where the contradiction lies. After changing the value of \tilde{f}(z^{\star}), we suddenly have

f~(R,b),P=αsupp(P)f~((R,b)(α))P(α)=0.\langle\tilde{f}\circ(R^{\star},b^{\star}),P\rangle=\sum_{\alpha\in\operatorname{supp}(P)}\tilde{f}\left((R^{\star},b^{\star})(\alpha)\right)\cdot P(\alpha)=0.

However, since we only changed the value of \tilde{f}(z^{\star}), no term in the above summation was changed, so this inner product is still nonzero, which is a contradiction. Hence, the set \mathcal{S}_{t} cannot be dense in any \mathcal{C}_{a,b} where a\notin\operatorname{supp}(H). ∎

5.2 Proof of Lemma 5.2

We have established that there exists an asupp(H)a^{\star}\in\operatorname{supp}(H) and b𝔽qnb\in\mathbb{F}_{q}^{n} such that

μa,b(𝒮t)11(q1)21q62000tq200M/4.\mu_{a^{\star},b}(\mathcal{S}_{t})\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}.

Using this, we will deduce Lemma 5.2, which says that the set 𝒮t\mathcal{S}_{t} is dense in vsupp(H)𝒞v,b\cup_{v\in\operatorname{supp}{(H)}}\mathcal{C}_{v,b}.

Let 𝒰s+t\mathcal{U}_{s+t} denote the set of (s+t)(s+t)-flats UU such that deg(f|U)>d\deg(f|_{U})>d and let ν\nu denote the uniform measure on the set of (s+t)(s+t)-flats. We sample T𝒞a,bT\in\mathcal{C}_{a^{\star},b} by first choosing an (s+t)(s+t)-flat UU containing bb, and then choosing TT whose image is UU conditioned on Ta=bTa^{\star}=b. The point of this procedure is that after choosing UU, the resulting TT can only reject if deg(f|U)>d\deg(f|_{U})>d. Formally,

  1. 1.

    Choose a random (s+t)(s+t)-flat UU containing bb, and a random basis, U=span(u1,,us+t)+u0U=\operatorname{span}(u_{1},\ldots,u_{s+t})+u_{0}. Let T=(M,u0)T^{\prime}=(M^{\prime},u_{0}) where the iith column of MM^{\prime} is uiu_{i}. By assumption T(x)=bT^{\prime}(x)=b for some x𝔽qs+tx\in\mathbb{F}_{q}^{s+t}.

  2. 2.

    Choose an arbitrary matrix B𝒯s+t,s+tB\in\mathcal{T}_{s+t,s+t} such that Ba=xBa^{\star}=x and output T=(MB,u0)T=(M^{\prime}\cdot B,u_{0}).

It is easy to check that this procedure samples uniformly from 𝒞a,b\mathcal{C}_{a^{\star},b} and that as noted, TT can only reject if deg(f|U)>d\deg(f|_{U})>d, which leads to the following observation:

Remark 5.7.
11(q1)21q62000tq200M/4μa,b(𝒮t)νb(𝒰s+t),1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}\leqslant\mu_{a^{\star},b}(\mathcal{S}_{t})\leqslant\nu_{b}(\mathcal{U}_{s+t}),

where νb\nu_{b} denotes density in the zoom-in 𝒟b\mathcal{D}_{b} (on the affine Grassmann graph).

Using this information about 𝒰s+t\mathcal{U}_{s+t}, we now try to apply Lemma 5.3. Sample a full rank Tvsupp(H)𝒞v,bT\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} as follows:

  1. 1.

    Choose an (s+t+100)(s+t+100)-flat VV uniformly at random containing bb.

  2. 2.

    Choose UVU\subseteq V uniformly at random such that bUb\in U.

  3. 3.

    Choose a basis representation U=span(u1,,us+t)+u0U=\operatorname{span}(u_{1},\ldots,u_{s+t})+u_{0} and output T=(M,u0)T=(M,u_{0}) where the iith column of MM is uiu_{i}, conditioned on Tv=bTv=b for some vsupp(H)v\in\operatorname{supp}(H).

After VV is chosen, recall the following two sets,

𝒜V={UV|dim(U)=s+t,deg(f|U)>d,bU},\mathcal{A}_{V}=\{U\subseteq V\;|\;\dim(U)=s+t,\deg(f|_{U})>d,b\in U\},
V={UV|dim(U)=s+t,deg(f|U)>d,bU},\mathcal{B}_{V}=\{U\subseteq V\;|\;\dim(U)=s+t,\deg(f|_{U})>d,b\notin U\},

and let νV\nu_{V} denote measure over s+ts+t-flats contained in VV.

Then \mathop{\mathbb{E}}_{V}[\nu_{V}(\mathcal{A}_{V})]\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}, so with probability at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}, we have \nu_{V}(\mathcal{A}_{V})>0. On the other hand, \mathop{\mathbb{E}}_{V}[\nu_{V}(\mathcal{B}_{V})]=O(\varepsilon), so with probability at least 1-q^{-M/2}, we have \nu_{V}(\mathcal{B}_{V})\leqslant q^{M/2}O(\varepsilon). Altogether, this implies that with probability at least

11(q1)21q62000tq200M/4qM/2,1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}-q^{-M/2},

VV satisfies both νV(𝒜V)>0\nu_{V}(\mathcal{A}_{V})>0 and νV(V)qM/2O(ε)\nu_{V}(\mathcal{B}_{V})\leqslant q^{M/2}O(\varepsilon).

Suppose such a VV is chosen and the above holds. We may now apply Lemma 5.3 to show that any TT that can be chosen in step 33 must now reject. This will establish the desired result, that a random Tvsupp(H)𝒞v,bT\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} rejects with probability close to 11.

By Lemma 5.3 there exists a value \gamma such that after changing f(b) to \gamma, we have \deg(f|_{V})\leqslant d. It must be the case that \gamma\neq f(b) because \nu_{V}(\mathcal{A}_{V})>0. Let f^{\prime} be the function obtained after changing the value of f at b. Then for any T as described above, we must have \langle f^{\prime}\circ T,H_{e}\rangle=0 for every valid e. Since T is full rank, there can only be one point mapped to b, and that point is some v\in\operatorname{supp}(H). Choosing a valid e such that H_{e}(v)\neq 0 (such an e exists since v\in\operatorname{supp}(H)), we have

fT,HefT,He=He(v)(f(b)f(b))0.\langle f^{\prime}\circ T,H_{e}\rangle-\langle f\circ T,H_{e}\rangle=H_{e}(v)(f(b)-f^{\prime}(b))\neq 0.

Thus, \langle f\circ T,H_{e}\rangle\neq\langle f^{\prime}\circ T,H_{e}\rangle=0 for this e, and in particular, T rejects f.

Proof of Lemma 5.2.

Sampling a full rank T\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} via the procedure above, the previous argument shows that with probability at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}-q^{-M/2} over (s+t+100)-flats V, the outputted T in step 3 rejects. It follows that \mathcal{S}_{t} has density at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4}-q^{-M/2} in the set of full rank transformations T such that Tv=b for some v\in\operatorname{supp}(H). Since non-full rank transformations constitute only a \frac{1}{q^{100}(q-1)}-fraction of the transformations in \cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b} by Remark 3.2, we subtract out another \frac{2}{q^{100}(q-1)} to obtain the desired result. ∎

5.3 Iterating the Argument

Recall that \operatorname{rej}_{s+t}(f) denotes the probability that a randomly chosen T\in\mathcal{T}_{n,s+t} rejects f. By Lemma 5.3, for at least 9/10 of T\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}, there is a value \gamma such that after changing the value of f(b) to \gamma, T accepts the resulting function. Since there are at most q possible values of \gamma, by averaging there is a single value of \gamma that works for at least a \frac{9}{10q} fraction of such T. Thus, there exists an f^{\prime} such that f^{\prime} is identical to f at all points except b and

\Pr_{T}[T\text{ rejects }f^{\prime}\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\leqslant 1-\frac{9}{10q}.
Proposition 5.8.

If 0<\operatorname{rej}_{s+t}(f)<q^{-M} and t=O(p), then there exists a point b\in\mathbb{F}_{q}^{n} and a function f^{\prime} that is identical to f at all points except b such that

rejs+t(f)rejs+t(f)|supp(H)|qnC(q),\operatorname{rej}_{s+t}(f^{\prime})\leqslant\operatorname{rej}_{s+t}(f)-\frac{|\operatorname{supp}(H)|}{q^{n}C(q)},

where C(q)=O(q)C(q)=O(q).

Proof.

By Lemma 5.2, there exists bb such that

PrT[T rejects f|vsupp(H),T(v)=b]11(q1)21q62q100(q1)2000tq200M/4qM/2.\Pr_{T}[T\text{ rejects }f\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-\frac{2}{q^{100}(q-1)}-2000tq^{200-M/4}-q^{-M/2}.

By the above discussion, there exists an f^{\prime} such that f^{\prime} is identical to f at all points except b and

\Pr_{T}[T\text{ rejects }f^{\prime}\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\leqslant 1-\frac{9}{10q}.

On the other hand, it is clear that for TT such that T(v)bT(v)\neq b for all vsupp(H)v\in\operatorname{supp}(H), the results of the tests on fTf\circ T and fTf^{\prime}\circ T are the same because ff and ff^{\prime} are identical on the points needed to evaluate fT,He\langle f\circ T,H_{e}\rangle and fT,He\langle f^{\prime}\circ T,H_{e}\rangle for any ee. Since |supp(H)|qn\frac{|\operatorname{supp}(H)|}{q^{n}} is the probability that b{T(v)|vsupp(H)}b\in\{T(v)\;|\;v\in\operatorname{supp}(H)\}, a direct calculation yields

\operatorname{rej}_{s+t}(f^{\prime})\leqslant\operatorname{rej}_{s+t}(f)-\frac{|\operatorname{supp}(H)|}{q^{n}}\left(1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-\frac{2}{q^{100}(q-1)}-2000tq^{200-M/4}-q^{-M/2}-\left(1-\frac{9}{10q}\right)\right)\leqslant\operatorname{rej}_{s+t}(f)-\Omega\left(\frac{|\operatorname{supp}(H)|}{q^{n}q}\right).

Finally, we can conclude the proof of Theorem 1.1.

Proof of Theorem 1.1.

By Proposition 5.8, while \operatorname{rej}_{s+t}(f)>0, we can change the value of f at one point and reduce the rejection probability by \Omega\left(\frac{|\operatorname{supp}(H)|}{q^{n}q}\right). When the rejection probability reaches 0, the resulting function must have degree at most d. Therefore,

δd(f)qnO(rejs+t(f)|supp(H)|qnq)\delta_{d}(f)q^{n}\leqslant O\left(\frac{\operatorname{rej}_{s+t}(f)}{|\operatorname{supp}(H)|}q^{n}q\right)

which implies that

rejs+t(f)Ω(|supp(H)|qδd(f)).\operatorname{rej}_{s+t}(f)\geqslant\Omega\left(\frac{|\operatorname{supp}(H)|}{q}\delta_{d}(f)\right).

6 Optimal Testing from other Local Characterizations

When showing that the sparse flat test is optimal, we relied minimally on the structure of HeH_{e}. Thus it is not hard to extend our methods and show optimal testing results for other polynomials that give local characterizations. We will reuse the variables ss and rr in this section, so they no longer refer to their previous definitions. Let P:𝔽qk𝔽qP:\mathbb{F}_{q}^{k}\xrightarrow[]{}\mathbb{F}_{q} be a polynomial of the form

P(x_{1},\ldots,x_{k})=\prod_{i=1}^{s}P_{i}(x_{m(i)},\ldots,x_{m(i+1)-1}).

where m(1)=1, m(s+1)=k+1, and m(i+1)-m(i)\leqslant t^{\prime} for each 1\leqslant i\leqslant s and some small constant t^{\prime}. In words, P is a k-variate polynomial that is the product of polynomials in few variables, where the variable sets of these polynomials are pairwise disjoint. Finally, let \mathcal{M}\subseteq\{\mathbb{F}_{q}^{t}\xrightarrow[]{}\mathbb{F}_{q}\} be an arbitrary nonempty affine-invariant set of polynomials and suppose t=\operatorname{poly}(q). Define

={e{0,,q1}t|i=1txiq1ei}.\mathcal{E}=\{e\in\{0,\ldots,q-1\}^{t}\;|\;\prod_{i=1}^{t}x_{i}^{q-1-e_{i}}\notin\mathcal{M}\}.

Notice that (q1,,q1)(q-1,\ldots,q-1)\notin\mathcal{E}. It is well known that any affine invariant set of polynomials is given by the span of the monomials that appear in at least one polynomial of the family, c.f. [25], and this fact is elaborated on in Appendix B. In combination with Lemma 3.3, it follows that g:𝔽qt𝔽qg:\mathbb{F}_{q}^{t}\xrightarrow[]{}\mathbb{F}_{q} is in \mathcal{M} if and only if g,i=1txiei=0\langle g,\prod_{i=1}^{t}x_{i}^{e_{i}}\rangle=0 for every (e1,,et)(e_{1},\ldots,e_{t})\in\mathcal{E}. In comparison with the description in Section 1.4, {i=1txiei|e}\{\prod_{i=1}^{t}x_{i}^{e_{i}}\;|\;e\in\mathcal{E}\} is an explicit basis for \mathcal{M}^{\perp}.

Using P and monomials with exponent vectors in \mathcal{E}, we can define a test similar to the sparse flat tester. For e\in\mathcal{E}, define

H_{e}(x_{1},\ldots,x_{k+t})=P(x_{1},\ldots,x_{k})\prod_{i=1}^{t}x_{k+i}^{e_{i}}.

Let \mathcal{F}_{n}(H)=\{f:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q}\;|\;\langle f\circ T,H_{e}\rangle=0\text{ for all }T\in\mathcal{T}_{n,k+t}\text{ and all }e\in\mathcal{E}\}. It is clear that \mathcal{F}_{n}(H) is affine invariant, and it comes with the following natural tester:

  1. 1.

    Choose T\in\mathcal{T}_{n,k+t} uniformly at random.

  2. 2.

    If fT,He0\langle f\circ T,H_{e}\rangle\neq 0 for any ee\in\mathcal{E}, reject. Otherwise, accept.
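As an illustration only (it plays no role in the analysis), the tester above can be sketched as follows for tiny parameters. The sketch assumes q is prime so that arithmetic modulo q models \mathbb{F}_{q}, and it represents f, P and the exponent set \mathcal{E} as plain Python objects; all names are ours.

import itertools
import numpy as np

def inner_product(g, h, q, dim):
    # <g, h> = sum over F_q^dim of g(x) * h(x), computed modulo q.
    return sum(g(x) * h(x) for x in itertools.product(range(q), repeat=dim)) % q

def H_e(P, e, k, q):
    # H_e(x_1, ..., x_{k+t}) = P(x_1, ..., x_k) * prod_i x_{k+i}^{e_i}.
    def h(x):
        val = P(x[:k]) % q
        for i, ei in enumerate(e):
            val = (val * pow(x[k + i], ei, q)) % q
        return val
    return h

def tester_rejects(f, P, E, n, k, t, q, rng):
    # One run of the natural tester: choose T in T_{n,k+t} uniformly at random
    # and reject iff <f o T, H_e> != 0 for some e in E.
    M = rng.integers(0, q, size=(n, k + t))
    c = rng.integers(0, q, size=n)
    f_T = lambda x: f(tuple((M @ np.array(x) + c) % q))
    return any(inner_product(f_T, H_e(P, e, k, q), q, k + t) != 0 for e in E)

# Toy usage (all specific choices are ours): q = 3, one factor P(y) = y^2, one dual exponent.
# rng = np.random.default_rng(0)
# tester_rejects(f=lambda x: sum(x) % 3, P=lambda y: pow(y[0], 2, 3),
#                E=[(2,)], n=4, k=1, t=1, q=3, rng=rng)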

Let supp(H)=esupp(He)=supp(P)×𝔽qt\operatorname{supp}(H)=\cup_{e\in\mathcal{E}}\operatorname{supp}(H_{e})=\operatorname{supp}(P)\times\mathbb{F}_{q}^{t}. It is clear that the number of queries made by the tester is Q=|supp(H)|Q=|\operatorname{supp}(H)|, although this value will not be of interest for us. Instead, the goal for the remainder of this section is to show that this tester is optimal. As a consequence, this allows one to obtain lower query complexity optimal testers for general affine invariant codes by constructing local characterizations using PP with sparse support of the prescribed form above, similar to what is done for Generalized Reed-Muller Codes in [33].

Theorem 6.1.

The above tester is optimal for n(H)\mathcal{F}_{n}(H). That is, for any f:𝔽qn𝔽qf:\mathbb{F}_{q}^{n}\xrightarrow[]{}\mathbb{F}_{q},

rej(f)c(q)min(1,Qδ(f,n(H))),\operatorname{rej}(f)\geqslant c(q)\min(1,Q\delta(f,\mathcal{F}_{n}(H))),

where \operatorname{rej}(f) is the probability that the tester rejects f, c(q)=\frac{1}{\operatorname{poly}(q)}, Q=|\operatorname{supp}(H)| is the number of queries that the tester performs, and \delta(f,\mathcal{F}_{n}(H)) is the minimal relative Hamming distance between f and a member of \mathcal{F}_{n}(H).

We will follow the same strategy, first locating a potential error, and then correcting the error and iterating. Let 𝒮\mathcal{S} denote the set of rejecting tests in 𝒯n,k+t\mathcal{T}_{n,k+t} and assume that μ(𝒮)qM\mu(\mathcal{S})\leqslant q^{-M} for some MM such that M>10tM>10t^{\prime} and qMtq^{M}\geqslant t.

Before going into the proof, we introduce a lemma shown in [31]. Recall the definition of lifted affine-invariant codes from the introduction. By design, \mathcal{F}_{n}(H) is a lifted affine-invariant code. Letting \mathcal{F}=\mathcal{F}_{k+t}(H), it is easy to see that

n(H)=𝖫𝗂𝖿𝗍n().\mathcal{F}_{n}(H)={\sf Lift}_{n}(\mathcal{F}).

Since we can view n(H)\mathcal{F}_{n}(H) as a lifted code, we can then apply the following lemma from [31].

Lemma 6.1.

Let g:𝔽qk+1𝔽qg:\mathbb{F}_{q}^{k^{\prime}+1}\xrightarrow[]{}\mathbb{F}_{q} be a polynomial such that g𝖫𝗂𝖿𝗍k+1()g\notin{\sf Lift}_{k^{\prime}+1}(\mathcal{F}), and suppose that kkk^{\prime}\geqslant k. Then,

PrU[g|U𝖫𝗂𝖿𝗍k()]1q,\Pr_{U}[g|_{U}\notin{\sf Lift}_{k^{\prime}}(\mathcal{F})]\geqslant\frac{1}{q},

where the probability is over hyperplanes U\subset\mathbb{F}_{q}^{k^{\prime}+1}. If this inequality is tight and the set of hyperplanes U such that g|_{U}\notin{\sf Lift}_{k^{\prime}}(\mathcal{F}) is of the form

{U|xU},\{U\;|\;x^{\star}\in U\},

for some point x^{\star}, then there is a function g^{\prime} equal to g at all points except x^{\star} such that g^{\prime}\in{\sf Lift}_{k^{\prime}+1}(\mathcal{F}).

This lemma will play the role of Lemma C.2 in this section’s analysis.

Locating a Potential Error

In order to locate an error, we will once again show that 𝒮\mathcal{S} is dense on some zoom-in. As we already assume 𝒮\mathcal{S} has small measure, this step requires us to show that 𝒮\mathcal{S} is poorly expanding and pseudorandom with respect to zoom-outs, zoom-outs on the linear part, and zoom-ins on the linear part.

Lemma 6.2.

The set 𝒮\mathcal{S} has the following properties:

  • μ(𝒮)qM\mu(\mathcal{S})\leqslant q^{-M}

  • Φ(𝒮)11/q\Phi(\mathcal{S})\leqslant 1-1/q

  • 𝒮\mathcal{S} is qμ(𝒮)q\mu(\mathcal{S})-pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

  • 𝒮\mathcal{S} is tq162εtq^{162}\varepsilon-pseudorandom with respect to zoom-ins on the linear part.

where μ\mu is measure in the set 𝒯n,k+t\mathcal{T}_{n,k+t}.

The first item is true by assumption. The second and third items can be shown by the exact same arguments as in Lemmas 4.1 and 4.3 respectively.

The proof of the fourth item also proceeds similarly to the Reed-Muller case, but since this argument was more involved, we review the steps.

Consider a zoom-in on the linear part \mathcal{C}_{a,b,\operatorname{lin}}. We can assume a\neq 0, as otherwise the set is either empty or all of \mathcal{T}_{n,k+t}. We will again have two cases, one where at least one of the last t coordinates of a is nonzero, and another where all of them are zero. We can prove the former case in exactly the same way as Lemma 4.5. For the latter case, recall that by assumption (q-1,\ldots,q-1)\notin\mathcal{E}. Thus, the reduction of the latter case to the former (with a factor of tq loss) works exactly the same as in the proof of Lemma 4.4.

Once again we can sample a transformation by first choosing a full rank A\in\mathcal{T}_{n,k+t+100} and B\in\operatorname{Res}_{k,t+100,t}, and outputting T=A\circ B, where now \operatorname{Res}_{k,t+100,t} is the set of affine transformations (R,b) with

R=[Ik00R],b=[0b],R=\begin{bmatrix}I_{k}&0\\ 0&R^{\prime}\end{bmatrix},b=\begin{bmatrix}0\\ b^{\prime}\\ \end{bmatrix}, (5)

where R^{\prime}\in\mathbb{F}_{q}^{(t+100)\times t} is full rank and b^{\prime}\in\mathbb{F}_{q}^{t+100}. Call (R^{\prime},b^{\prime}) the non-trivial part of B=(R,b) and let \operatorname{fl}(B)=\operatorname{Im}(R^{\prime},b^{\prime}).

Following the same setup as in Section 4.3, we can find A such that the following hold:

  • bIma[k](A)b\in\operatorname{Im}_{a_{[k]}}(A)

  • There exists BResk,t+100,tB\in\operatorname{Res}_{k,t+100,t} such that AB𝒮A\circ B\in\mathcal{S}

  • \Pr_{B}[A\circ B\in\mathcal{S}\;|\;b\notin\operatorname{Im}_{a_{[k]}}(A\circ B)]\leqslant\frac{2}{\alpha}\varepsilon.

Fixing this AA, define

f~(β1,,βt+100)=αfA(α,β1,,βt+100)P(α).\tilde{f}(\beta_{1},\ldots,\beta_{t+100})=\sum_{\alpha}f\circ A(\alpha,\beta_{1},\ldots,\beta_{t+100})P(\alpha).

Then for any restriction BB with B=(R,b)𝒯t+100,tB^{\prime}=(R^{\prime},b^{\prime})\in\mathcal{T}_{t+100,t} as its non-trivial part, and any tt-variate monomial xex^{e}, we have

fAB,He=f~B,xe.\langle f\circ A\circ B,H_{e}\rangle=\langle\tilde{f}\circ B^{\prime},x^{e}\rangle.

It follows that B=(R,b) rejects f if and only if \tilde{f}|_{b^{\prime}+\operatorname{Im}(R^{\prime})}\notin\mathcal{M}.

By construction, Ax^{\star}=b for some x^{\star}\in\mathbb{F}_{q}^{k+t+100} such that x^{\star} is equal to a in its first k coordinates. Then b\in\operatorname{Im}_{a_{[k]}}(A\circ B) is equivalent to z^{\star}\in b^{\prime}+\operatorname{Im}(R^{\prime}), where z^{\star} is the last t+100 coordinates of x^{\star}.

Letting \mathcal{R}_{A}=\{(R^{\prime},b^{\prime})\in\mathcal{T}_{t+100,t}\;|\;\tilde{f}|_{b^{\prime}+\operatorname{Im}(R^{\prime})}\notin\mathcal{M},\ z^{\star}\notin b^{\prime}+\operatorname{Im}(R^{\prime})\}, we can translate the third item above to

\Pr_{B}[A\circ B\in\mathcal{S}\;|\;b\notin\operatorname{Im}_{a_{[k]}}(A\circ B)]=\mu_{A}(\mathcal{R}_{A})\leqslant\frac{2}{\alpha}\varepsilon,

where μA\mu_{A} is measure in Resk,t+100,t\operatorname{Res}_{k,t+100,t}. Define

A={b+Im(R)|(R,b)A}.\mathcal{B}_{A}=\{b^{\prime}+\operatorname{Im}(R^{\prime})\;|\;(R^{\prime},b^{\prime})\in\mathcal{R}_{A}\}.

By the same argument as in Lemma 4.7, \mathcal{B}_{A} is nonempty. Finally, we can use the same proof as that of Lemma 4.5 (except referring to the first part of Lemma 6.1 instead of Lemma 3.10 to bound the sizes of upper shadows) to show that \alpha\leqslant 2e^{-4/q}q^{160}\varepsilon\leqslant q^{161}\varepsilon. This shows that \mathcal{S} is q^{161}\varepsilon-pseudorandom with respect to \mathcal{C}_{a,b,\operatorname{lin}} when one of a's last t coordinates is nonzero. If this is not the case, then since (q-1,\ldots,q-1)\notin\mathcal{E}, we can perform the same reduction as in Lemma 4.4 to show that \mathcal{S} is tq^{162}\varepsilon-pseudorandom with respect to every zoom-in on the linear part.

Altogether, this establishes Lemma 6.2, which allows us to apply Theorem 2.1 with \xi=tq^{162-M} and \ell=k+t. Since we take t\geqslant 10, Theorem 2.1 implies that \mathcal{S} must have density at least 1-\frac{1}{(q-1)^{2}}-\frac{1}{q^{6}}-2000tq^{200-M/4} on some zoom-in \mathcal{C}_{a,b}. Henceforth, fix this zoom-in and call it \mathcal{C}_{a^{\star},b}.

The point aa^{\star} is in the support of HH

We next show that it must be the case that a^{\star}\in\operatorname{supp}(H). We will need the following lemma, which is the analogue of Lemma 5.3 for the current setting. Just as in the setup of Lemma 5.3, let g:\mathbb{F}_{q}^{\ell+100}\xrightarrow[]{}\mathbb{F}_{q} be an arbitrary polynomial, let \nu denote the uniform measure over \ell-flats in \mathbb{F}_{q}^{\ell+100}, and let b\in\mathbb{F}_{q}^{\ell+100} be an arbitrary point. Also, let \mathcal{G}\subseteq\{\mathbb{F}_{q}^{N}\xrightarrow[]{}\mathbb{F}_{q}\} be an arbitrary affine-invariant code.

Define the following two sets:

𝒜={U|dim(U)=,g|U𝖫𝗂𝖿𝗍(𝒢),bU},\mathcal{A}=\{U\;|\;\dim(U)=\ell,g|_{U}\notin{\sf Lift}_{\ell}(\mathcal{G}),b\in U\},
={U|dim(U)=,g|U𝖫𝗂𝖿𝗍(𝒢),bU}.\mathcal{B}=\{U\;|\;\dim(U)=\ell,g|_{U}\notin{\sf Lift}_{\ell}(\mathcal{G}),b\notin U\}.

Keep ε\varepsilon and MM (the large absolute constant) the same as we have defined, so that qM/2O(ε)<qM/2q^{M/2}O(\varepsilon)<q^{-M/2} is small. We will use the following result, which is an extension of the results in Section 3.2 of [31]:

Lemma 6.3.

Keep the notation defined above and suppose \ell\geqslant\max(N,4). If \nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon), then \mathcal{B}=\emptyset. Moreover, there is a value \gamma such that after changing g(b) to \gamma, we have \nu(\mathcal{A})=0.

Proof.

The proof of this lemma is the same as the proof of Lemma 5.3, however we appeal to Lemma 6.1 instead of Lemma C.2. ∎

Following the same steps as in the proof of Lemma 5.4, this lemma almost immediately shows that it must be the case that a^{\star}\in\operatorname{supp}(H). Suppose for the sake of contradiction that a^{\star}\notin\operatorname{supp}(H), and that the block of coordinates (a^{\star}_{m(i)},\ldots,a^{\star}_{m(i+1)-1}) is not in \operatorname{supp}(P_{i}). By assumption, P_{i} is a t^{\prime\prime}-variate polynomial for some t^{\prime\prime}\leqslant t^{\prime}. Let

t′′(Pi)={g:𝔽qt′′𝔽q|gT,Pi=0,T𝒯t′′,t′′}.\mathcal{F}_{t^{\prime\prime}}(P_{i})=\{g:\mathbb{F}_{q}^{t^{\prime\prime}}\xrightarrow[]{}\mathbb{F}_{q}\;|\;\langle g\circ T,P_{i}\rangle=0,\forall T\in\mathcal{T}_{t^{\prime\prime},t^{\prime\prime}}\}.

We can similarly find an auxiliary polynomial \tilde{f}:\mathbb{F}_{q}^{t^{\prime\prime}+102}\xrightarrow[]{}\mathbb{F}_{q} and a special point z^{\star}\in\mathbb{F}_{q}^{t^{\prime\prime}+102} such that the following hold:

  • \mu(\mathcal{R})\leqslant O(\varepsilon), where \mathcal{R}=\{B\in\mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}}\;|\;\langle\tilde{f}\circ B,P_{i}\rangle\neq 0,\ z^{\star}\notin\operatorname{Im}(B)\} and \mu is the measure in \mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}}.

  • There exists a full rank affine transformation B𝒯t′′+102,t′′B^{\star}\in\mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}} such that f~B,Pi0\langle\tilde{f}\circ B^{\star},P_{i}\rangle\neq 0,

  • z{B(v)|vsupp(Pi)}z^{\star}\notin\{B^{\star}(v)\;|\;v\in\operatorname{supp}(P_{i})\}.

Using the first fact, we can also bound the measure of the following set of t′′t^{\prime\prime}-flats,

\mathcal{B}=\{U\in{\sf AffGras}(t^{\prime\prime}+102,t^{\prime\prime})\;|\;\tilde{f}|_{U}\notin\mathcal{F}_{t^{\prime\prime}}(P_{i}),\ z^{\star}\notin U\}.
Lemma 6.4.

Suppose gg is a t′′t^{\prime\prime}-variate polynomial such that gt′′(Pi)g\notin\mathcal{F}_{t^{\prime\prime}}(P_{i}). Then with probability at least 1qt′′\frac{1}{q^{t^{\prime\prime}}} over T𝒯t′′,t′′T\in\mathcal{T}_{t^{\prime\prime},t^{\prime\prime}} we have gT,Pi0\langle g\circ T,P_{i}\rangle\neq 0.

Proof.

Choosing T randomly, we can view \langle g\circ T,P_{i}\rangle as a polynomial in the entries of T. Since g\notin\mathcal{F}_{t^{\prime\prime}}(P_{i}), there must be some choice of T that makes this polynomial's value nonzero, so \langle g\circ T,P_{i}\rangle is a nonzero polynomial in the entries of T. As the individual degrees of g must be at most q-1, it follows from the Schwartz-Zippel Lemma that \langle g\circ T,P_{i}\rangle\neq 0 with probability at least \frac{1}{q^{t^{\prime\prime}}}. ∎

Let ν\nu denote measure in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(t′′+102,t′′){\sf AffGras}(t^{\prime\prime}+102,t^{\prime\prime}). We can choose B𝒯t′′+102,t′′B\in\mathcal{T}_{t^{\prime\prime}+102,t^{\prime\prime}} such that zIm(B)z^{\star}\notin\operatorname{Im}(B) by first choosing U𝖠𝖿𝖿𝖦𝗋𝖺𝗌(t′′+102,t′′)U\in{\sf AffGras}(t^{\prime\prime}+102,t^{\prime\prime}) not containing zz^{\star} and then BB with image contained in UU. Lemma 6.4 along with the assumption μ()O(ε)\mu(\mathcal{R})\leqslant O(\varepsilon) implies that

1qt′′ν()μ().\frac{1}{q^{t^{\prime\prime}}}\nu(\mathcal{B})\leqslant\mu(\mathcal{R}).

Therefore, we have that \nu(\mathcal{B})\leqslant q^{t^{\prime\prime}}O(\varepsilon), and by the first part of Lemma 6.1, \nu(\mathcal{B}^{\uparrow^{4}})\leqslant q^{t^{\prime\prime}+4}O(\varepsilon)\leqslant q^{M/2}O(\varepsilon). We can then apply Lemma 6.3 with \mathcal{G}=\mathcal{F}_{t^{\prime\prime}}(P_{i}) and special point z^{\star} to get that there is a value \gamma such that after changing \tilde{f}(z^{\star}) to \gamma, we have

f~|U𝖫𝗂𝖿𝗍t′′+4(t′′(Pi)), for all t′′+4-flats U containing z.\tilde{f}|_{U^{\prime}}\in{\sf Lift}_{t^{\prime\prime}+4}(\mathcal{F}_{t^{\prime\prime}}(P_{i})),\text{\quad for all $t^{\prime\prime}+4$-flats $U^{\prime}$ containing $z^{\star}$}.

This also implies that f~|Ut′′(Pi)\tilde{f}|_{U}\in\mathcal{F}_{t^{\prime\prime}}(P_{i}) for all t′′t^{\prime\prime}-flats UzU\ni z^{\star}, but note that changing the value of f~(z)\tilde{f}(z^{\star}) does not affect the value of the test

f~B,Pi0,\langle\tilde{f}\circ B^{\star},P_{i}\rangle\neq 0,

because of the third item above. Thus, we have a contradiction and it follows that asupp(H)a^{\star}\in\operatorname{supp}(H).

Dense on all zoom-ins inside the support of HH

After establishing that 𝒮\mathcal{S} has density at least 11(q1)22000tq200M/41-\frac{1}{(q-1)^{2}}-2000tq^{200-M/4} inside 𝒞a,b\mathcal{C}_{a^{\star},b} and asupp(H)a^{\star}\in\operatorname{supp}(H), we can use the same arguments as in the proof of Lemma 5.2 to show that 𝒮\mathcal{S} must have density at least

11(q1)22q100(q1)2000q200+tM/4qM/2,1-\frac{1}{(q-1)^{2}}-\frac{2}{q^{100}(q-1)}-2000q^{200+t-M/4}-q^{-M/2},

inside \cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}. We can choose T\in\mathcal{C}_{a^{\star},b} by first choosing a (k+t+100)-flat V containing b, then a (k+t)-flat U\subseteq V containing b, and finally outputting T conditioned on \operatorname{Im}(T)=U and Ta^{\star}=b.

Using the same sampling and averaging argument, we get that with probability at least 11(q1)22000q200+tM/4qM/21-\frac{1}{(q-1)^{2}}-2000q^{200+t-M/4}-q^{-M/2} over k+t+100k+t+100-flats VV, the following hold:

  • \nu_{V}(\mathcal{A}_{V})>0, where \mathcal{A}_{V}=\{U\subseteq V\;|\;\dim(U)=k+t,\ f|_{U}\notin\mathcal{F}_{k+t}(H),\ b\in U\},

  • \nu_{V}(\mathcal{B}_{V})\leqslant q^{M/2}O(\varepsilon), where \mathcal{B}_{V}=\{U\subseteq V\;|\;\dim(U)=k+t,\ f|_{U}\notin\mathcal{F}_{k+t}(H),\ b\notin U\}.

Applying Lemma 6.3 with \mathcal{G}=\mathcal{F}_{k+t}(H) and g=f|_{V}, we get that there is a value \gamma such that after changing the value of f(b) to \gamma, we have \nu_{V}(\mathcal{A}_{V})=0. By the first item above, \gamma\neq f(b). Therefore, prior to changing the value of f(b), we must have had \langle f|_{V}\circ B,H_{e}\rangle\neq 0 for some e\in\mathcal{E}, for every full rank B\in\mathcal{T}_{k+t+100,k+t} such that B(v)=b for some v\in\operatorname{supp}(H). After subtracting out the fraction of non-full rank transformations, we can conclude that \mathcal{S} has density at least

11(q1)22q100(q1)2000q200+tM/4qM/2,1-\frac{1}{(q-1)^{2}}-\frac{2}{q^{100}(q-1)}-2000q^{200+t-M/4}-q^{-M/2},

inside vsupp(H)𝒞v,b\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}.

Correcting the Error and Iterating

Finally, we can correct the error and iterate as done in Section 5.3. For at least 9/10 of T\in\cup_{v\in\operatorname{supp}(H)}\mathcal{C}_{v,b}, there is a value \gamma such that after changing the value of f(b) to \gamma, T accepts the resulting function. Thus, there exists an f^{\prime} that is identical to f at all points except b and

\Pr_{T}[T\text{ rejects }f^{\prime}\;|\;\exists v\in\operatorname{supp}(H),T(v)=b]\leqslant 1-\frac{9}{10q}.

By the same calculation as in Proposition 5.8, it follows that

rej(f)rej(f)|supp(H)|qnC(q),\operatorname{rej}(f^{\prime})\leqslant\operatorname{rej}(f)-\frac{|\operatorname{supp}(H)|}{q^{n}C(q)},

for some C(q)=O(q)C(q)=O(q), and we can conclude

rej(f)Ω(|supp(H)|qδ(f,n(H))).\operatorname{rej}(f)\geqslant\Omega\left(\frac{|\operatorname{supp}(H)|}{q}\delta(f,\mathcal{F}_{n}(H))\right).

References

  • [1] Noga Alon, Tali Kaufman, Michael Krivelevich, Simon Litsyn, and Dana Ron. Testing Reed-Muller codes. IEEE Trans. Inf. Theory, 51(11):4032–4039, 2005.
  • [2] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, 1998.
  • [3] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: A new characterization of NP. J. ACM, 45(1):70–122, 1998.
  • [4] Sanjeev Arora and Madhu Sudan. Improved low-degree testing and its applications. Comb., 23(3):365–426, 2003.
  • [5] Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer. Playing unique games on certified small-set expanders. In STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1629–1642, 2021.
  • [6] Mitali Bafna, Max Hopkins, Tali Kaufman, and Shachar Lovett. High dimensional expanders: Eigenstripping, pseudorandomness, and unique games. In Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, Virtual Conference / Alexandria, VA, USA, January 9 - 12, 2022, pages 1069–1128, 2022.
  • [7] Mitali Bafna, Max Hopkins, Tali Kaufman, and Shachar Lovett. Hypercontractivity on high dimensional expanders. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 185–194, 2022.
  • [8] Boaz Barak, Parikshit Gopalan, Johan Håstad, Raghu Meka, Prasad Raghavendra, and David Steurer. Making the long code shorter. SIAM J. Comput., 44(5):1287–1324, 2015.
  • [9] Boaz Barak, Pravesh Kothari, and David Steurer. Small-Set Expansion in Shortcode Graph and the 2-to-2 Conjecture. In Proceedings of the 10th Innovations in Theoretical Computer Science conference, ITCS 2019, January 10-12, San Diego, CA, USA, 2019.
  • [10] Arnab Bhattacharyya, Swastik Kopparty, Grant Schoenebeck, Madhu Sudan, and David Zuckerman. Optimal testing of Reed-Muller codes. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, October 23-26, Las Vegas, NV, USA, pages 488–497, 2010.
  • [11] Irit Dinur and Venkatesan Guruswami. Pcps via low-degree long code and hardness for constrained hypergraph coloring. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 340–349, 2013.
  • [12] Irit Dinur, Prahladh Harsha, Srikanth Srinivasan, and Girish Varma. Derandomized graph product results using the low degree long code. In 32nd International Symposium on Theoretical Aspects of Computer Science, STACS 2015, March 4-7, 2015, Garching, Germany, pages 275–287, 2015.
  • [13] Irit Dinur, Subhash Khot, Guy Kindler, Dor Minzer, and Muli Safra. On non-optimally expanding sets in Grassmann graphs. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 940–951, 2018.
  • [14] Irit Dinur, Subhash Khot, Guy Kindler, Dor Minzer, and Muli Safra. Towards a proof of the 2-to-1 games conjecture? In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 376–389, 2018.
  • [15] Uriel Feige, Shafi Goldwasser, László Lovász, Shmuel Safra, and Mario Szegedy. Interactive proofs and the hardness of approximating cliques. J. ACM, 43(2):268–292, 1996.
  • [16] William T Gowers. A new proof of szemerédi’s theorem. Geometric & Functional Analysis GAFA, 11(3):465–588, 2001.
  • [17] Alan Guo, Swastik Kopparty, and Madhu Sudan. New affine-invariant codes from lifting. In Proceedings of the 4th Conference on Innovations in Theoretical Computer Science, ITCS ’13, page 529–540, New York, NY, USA, 2013. Association for Computing Machinery.
  • [18] Tom Gur, Noam Lifshitz, and Siqi Liu. Hypercontractivity on high dimensional expanders. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 176–184, 2022.
  • [19] Venkatesan Guruswami, Prahladh Harsha, Johan Håstad, Srikanth Srinivasan, and Girish Varma. Super-polylogarithmic hypergraph coloring hardness via low-degree long codes. SIAM J. Comput., 46(1):132–159, 2017.
  • [20] Elad Haramaty, Noga Ron-Zewi, and Madhu Sudan. Absolutely sound testing of lifted codes. Theory Comput., 11:299–338, 2015.
  • [21] Elad Haramaty, Amir Shpilka, and Madhu Sudan. Optimal testing of multivariate polynomials over small prime fields. SIAM J. Comput., 42(2):536–562, 2013.
  • [22] Charanjit S. Jutla, Anindya C. Patthak, Atri Rudra, and David Zuckerman. Testing low-degree polynomials over prime fields. Random Struct. Algorithms, 35(2):163–193, 2009.
  • [23] Daniel M. Kane and Raghu Meka. A PRG for lipschitz functions of polynomials with applications to sparsest cut. In Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 1–10, 2013.
  • [24] Tali Kaufman and Dana Ron. Testing polynomials over general fields. SIAM J. Comput., 36(3):779–802, 2006.
  • [25] Tali Kaufman and Madhu Sudan. Algebraic property testing: the role of invariance. In Proceedings of the 40th Annual ACM Symposium on Theory of computing, STOC 2008, May 17-20, Victoria, BC, Canada, pages 403–412, 2008.
  • [26] Peter Keevash, Noam Lifshitz, Eoin Long, and Dor Minzer. Hypercontractivity for global functions and sharp thresholds. arXiv preprint arXiv:1906.05568, 2019.
  • [27] Peter Keevash, Noam Lifshitz, Eoin Long, and Dor Minzer. Forbidden intersections for codes. arXiv preprint arXiv:2103.05050, 2021.
  • [28] Peter Keevash, Noam Lifshitz, and Dor Minzer. On the largest product-free subsets of the alternating groups. arXiv preprint arXiv:2205.15191, 2022.
  • [29] Subhash Khot, Dor Minzer, and Muli Safra. On independent sets, 2-to-2 games, and Grassmann graphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 576–589, 2017.
  • [30] Subhash Khot, Dor Minzer, and Muli Safra. Pseudorandom sets in Grassman graph have near-perfect expansion. In Proceedings of the 59th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 592–601, 2018.
  • [31] Dor Minzer and Tali Kaufman. Improved Optimal Testing Results from Global Hypercontractivity. In Proceedings of the 63rd Annual IEEE Symposium on Foundations of Computer Science, FOCS 2022, October 31 - November 3, Denver, CO, USA, 2022.
  • [32] Ran Raz and Shmuel Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In Proceedings of the 29th Annual ACM SIGACT Symposium on Theory of Computing, STOC 1997, El Paso, TX, USA, May 4-6, 1997, pages 475–484, 1997.
  • [33] Noga Ron-Zewi and Madhu Sudan. A New Upper Bound on the Query Complexity of Testing Generalized Reed-Muller Codes. Theory of Computing, 9(25):783–807, 2013.
  • [34] Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM J. Comput., 25(2):252–271, 1996.

Appendix A Proof of Theorem 2.1

We now prove Theorem 2.1. First we will need to show the following result, which is a slight variation of [31, Theorem 2.4].

Theorem A.1.

Let 𝒜𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,)\mathcal{A}\subseteq{\sf AffGras}(n+\ell,\ell) satisfy

  1. 1.

    μ(𝒜)ξ\mu(\mathcal{A})\leqslant\xi,

  2. 2.

    𝒜\mathcal{A} is ξ\xi-pseudorandom with respect to hyperplanes, hyperplanes on its linear part, and points on its linear part.

  3. 3.

    1Φ(𝒜)1qδ1-\Phi(\mathcal{A})\geqslant\frac{1}{q}-\delta.

Then there exists a point x𝔽qnx\in\mathbb{F}_{q}^{n} such that μ(𝒜x)1q2(4q+867ξ1/4+δq1)\mu(\mathcal{A}_{x})\geqslant 1-q^{2}\left(4q^{-\ell}+867\xi^{1/4}+\frac{\delta}{q-1}\right).

Measure, expansion, and pseudorandomness are all with respect to {\sf AffGras}(n+\ell,\ell), and \mu(\mathcal{A}_{x}) denotes the fractional size of \mathcal{A} inside the set of flats in {\sf AffGras}(n+\ell,\ell) that contain the point x.

To show that an analogous result holds in the affine bi-linear scheme graph, we retrace the steps in [9] to establish a connection to the affine Grassmann graph and then apply Theorem A.1. To this end, define the following injective map \varphi:\mathcal{T}_{n,\ell}\xrightarrow{}\mathcal{V}_{n+\ell,\ell}. Let \{e_{1},\ldots,e_{\ell}\}\subseteq\mathbb{F}_{q}^{\ell} be the canonical basis. For two column vectors v and w of lengths \ell_{1} and \ell_{2} respectively, we denote by (v,w)^{T} the length \ell_{1}+\ell_{2} column vector obtained by vertically concatenating v and w. Likewise, let [v^{T},w^{T}] be the length \ell_{1}+\ell_{2} row vector obtained by horizontally concatenating v^{T} and w^{T}.

For an affine transformation (M,c)\in\mathcal{T}_{n,\ell}, where M has columns v_{1},\ldots,v_{\ell}, we set \varphi((M,c))=(0,c)^{T}+\operatorname{span}((e_{1},v_{1})^{T},\ldots,(e_{\ell},v_{\ell})^{T}). It is clear that this map is injective. Moreover, the image of \varphi is the set of \ell-flats V\subseteq\mathbb{F}_{q}^{n+\ell} such that the projection of V onto the first \ell coordinates is full rank.
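To make the definition of \varphi concrete, here is a small illustrative sketch (names ours, q prime so that numpy arithmetic modulo q models \mathbb{F}_{q}) that returns the flat \varphi((M,c)) as a base point together with a matrix of spanning directions:

import numpy as np

def phi(M, c, q):
    # phi((M, c)) = (0, c)^T + span((e_1, v_1)^T, ..., (e_l, v_l)^T),
    # where v_1, ..., v_l are the columns of M.  The flat is returned as a base
    # point and a matrix whose columns are the spanning directions (length n + l).
    n, l = M.shape
    base = np.concatenate([np.zeros(l, dtype=np.int64), np.asarray(c, dtype=np.int64)]) % q
    directions = np.vstack([np.eye(l, dtype=np.int64), np.asarray(M, dtype=np.int64)]) % q
    return base, directions

# A point of the flat is (base + directions @ x) % q for x in F_q^l.  Projecting
# such a point onto its first l coordinates recovers x, which is why phi is
# injective and why flats in the image of phi project full-rank onto the first
# l coordinates.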

The map \varphi preserves edges, nearly preserves expansion, and maps the canonical non-expanding sets in {\sf AffBilin}(n,\ell) to their counterparts in {\sf AffGras}(n+\ell,\ell).

Lemma A.2.

If T_{1},T_{2}\in\mathcal{T}_{n,\ell} are adjacent in {\sf AffBilin}(n,\ell), then \varphi(T_{1}),\varphi(T_{2}) are adjacent in {\sf AffGras}(n+\ell,\ell). Conversely, if V_{1},V_{2}\in\mathcal{V}_{n+\ell,\ell} are adjacent in {\sf AffGras}(n+\ell,\ell) and both are in the image of \varphi, then \varphi^{-1}(V_{1}),\varphi^{-1}(V_{2}) are adjacent in {\sf AffBilin}(n,\ell).

Proof.

For the first direction, suppose T_{1} and T_{2} are given by (M_{1},c_{1}) and (M_{2},c_{2}) respectively. Let M^{\prime\prime}_{1} be the (n+\ell)\times\ell matrix obtained by vertically concatenating I_{\ell} and M_{1}, and let c^{\prime\prime}_{1} be the length n+\ell vector obtained by vertically concatenating the length \ell zero vector and c_{1}. Define M^{\prime\prime}_{2} and c^{\prime\prime}_{2} similarly. Then notice that \varphi(T_{1}) is given by the column-image of the affine transformation T^{\prime\prime}_{1}=(M^{\prime\prime}_{1},c^{\prime\prime}_{1}), while \varphi(T_{2}) is given by the column-image of the affine transformation T^{\prime\prime}_{2}=(M^{\prime\prime}_{2},c^{\prime\prime}_{2}). Thus, to show that edges are preserved it suffices to show that the column-images of T^{\prime\prime}_{1} and T^{\prime\prime}_{2} have intersection of dimension \ell-1. Indeed, T^{\prime\prime}_{1}-T^{\prime\prime}_{2}=(M^{\prime\prime}_{1}-M^{\prime\prime}_{2},c^{\prime\prime}_{1}-c^{\prime\prime}_{2}), and since both M^{\prime\prime}_{1}-M^{\prime\prime}_{2} and c^{\prime\prime}_{1}-c^{\prime\prime}_{2} are all zeros in the first \ell rows, it follows that \dim(\ker(T^{\prime\prime}_{1}-T^{\prime\prime}_{2}))=\dim(\ker(T_{1}-T_{2}))=\ell-1. Therefore, \dim(\operatorname{Im}(T^{\prime\prime}_{1}-T^{\prime\prime}_{2}))=1 and \dim(\varphi(T_{1})\cap\varphi(T_{2}))=\ell-1.

In the converse direction, suppose that U,V\in\mathcal{V}_{n+\ell,\ell} with \dim(U\cap V)=\ell-1. Write U=u_{0}+\operatorname{span}(u_{1},\ldots,u_{\ell}) and V=v_{0}+\operatorname{span}(v_{1},\ldots,v_{\ell}) where u_{i}=(e_{i},u^{\prime}_{i})^{T}, v_{i}=(e_{i},v^{\prime}_{i})^{T}, u_{0}=(0,u^{\prime}_{0})^{T}, and v_{0}=(0,v^{\prime}_{0})^{T}. Let M_{1} be the matrix with columns u_{1},\ldots,u_{\ell} and M_{2} be the matrix with columns v_{1},\ldots,v_{\ell}. Then by assumption the affine transformations T_{1}=(M_{1},u_{0}) and T_{2}=(M_{2},v_{0}) have images with intersection of dimension \ell-1. We claim that this implies T_{1} and T_{2} are in fact equal on an (\ell-1)-dimensional affine subspace. Indeed, if T_{1}(a)=T_{2}(b) for a,b\in\mathbb{F}_{q}^{\ell}, then examining the first \ell coordinates implies that a=b. Therefore, T_{1},T_{2} are equal on the (\ell-1)-dimensional preimage of \operatorname{Im}(T_{1})\cap\operatorname{Im}(T_{2}). Finally, since \varphi^{-1}(U) and \varphi^{-1}(V) are given by the projections of T_{1} and T_{2} respectively onto the last n coordinates, \dim(\ker(\varphi^{-1}(U)-\varphi^{-1}(V)))=\ell-1 and the conclusion follows. ∎

Lemma A.3.

Let V1𝒱n+,V_{1}\in\mathcal{V}_{n+\ell,\ell} be in the image of φ\varphi. Then at least 11/q1-1/q fraction of its neighbors are also in the image of φ\varphi.

Proof.

Recall that V is in the image of \varphi if the projection of V onto its first \ell coordinates is full rank. Fix V_{1}\in\mathcal{V}_{n+\ell,\ell} in the image of \varphi. Then a neighbor of V_{1} can be sampled as follows: choose a uniformly random point v_{0}\in V_{1} and \ell linearly independent directions v_{1},\ldots,v_{\ell} so that V_{1}=v_{0}+\operatorname{span}(v_{1},\ldots,v_{\ell}), choose a uniformly random v^{\prime}_{\ell} outside of \operatorname{span}(v_{1},\ldots,v_{\ell}), and set V_{2}=v_{0}+\operatorname{span}(v_{1},\ldots,v_{\ell-1},v^{\prime}_{\ell}). Since V_{1} is in the image of \varphi, the projection of v_{1},\ldots,v_{\ell} to the first \ell coordinates is full rank. Likewise, V_{2} is in the image of \varphi if the projection of v^{\prime}_{\ell} onto the first \ell coordinates is linearly independent of the projections of v_{1},\ldots,v_{\ell-1}. This happens with probability exactly 1-1/q, completing the proof. ∎

Lemma A.4.

If S𝒯n,S\subseteq\mathcal{T}_{n,\ell} satisfies 1Φ(S)=η1-\Phi(S)=\eta, then φ(S)\varphi(S) satisfies 1Φ(φ(S))η(11/q)1-\Phi^{\prime}(\varphi(S))\geqslant\eta(1-1/q), where Φ\Phi and Φ\Phi^{\prime} are expansion in 𝖠𝖿𝖿𝖡𝗂𝗅𝗂𝗇(n,){\sf AffBilin}(n,\ell) and 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,){\sf AffGras}(n+\ell,\ell) respectively.

Proof.

This is a direct consequence of the previous two lemmas. By Lemma A.3, a 1-1/q fraction of the neighbors of \varphi(S) are also in the image of \varphi, and by the assumption and Lemma A.2, an \eta fraction of these neighbors are also in \varphi(S). The result follows. ∎

Lemma A.5.

The map φ\varphi is a bijection between the following pairs of sets:

  • 𝒞a,b\mathcal{C}_{a,b} and 𝒟vIm(φ)\mathcal{D}_{v}\cap\operatorname{Im}(\varphi) for v=[a,b]Tv=[a,b]^{T}.

  • 𝒞a,b,lin\mathcal{C}_{a,b,\operatorname{lin}} and 𝒟v,linIm(φ)\mathcal{D}_{v,\operatorname{lin}}\cap\operatorname{Im}(\varphi) for v=[a,b]Tv=[a,b]^{T}.

  • 𝒞aT,b,β\mathcal{C}_{a^{T},b,\beta} and 𝒟WIm(φ)\mathcal{D}_{W}\cap\operatorname{Im}(\varphi) for WW given by [bT,aT],x=β\langle[-b^{T},a^{T}],x\rangle=\beta.

  • 𝒞aT,b,β,lin\mathcal{C}_{a^{T},b,\beta,\operatorname{lin}} and 𝒟W,linIm(φ)\mathcal{D}_{W,\operatorname{lin}}\cap\operatorname{Im}(\varphi) for WW given by [bT,aT],x=β\langle[-b^{T},a^{T}],x\rangle=\beta.

Proof.

We show the zoom-in and zoom-out cases. The other two are analogous.

zoom-ins: Take an arbitrary T=(M,c)\in\mathcal{C}_{a,b} and let M^{\prime}=[I_{\ell},M]^{T} and c^{\prime}=[0,c]^{T}. It is easy to check that M^{\prime}a+c^{\prime}=[a,b]^{T}. Since T is arbitrary, this shows the first half of the bijection.

To complete the proof of this case, we must show that every flat in \mathcal{D}_{v}\cap\operatorname{Im}(\varphi) is mapped to. Take such a flat U. Since U\in\operatorname{Im}(\varphi), we can write U=c^{\prime}+\operatorname{span}(u_{1},\ldots,u_{\ell}), where the first \ell coordinates of u_{i} are equal to e_{i} and the first \ell coordinates of c^{\prime} are all 0. There is a unique way of writing v as c^{\prime} plus a linear combination of the u_{i}'s, and looking at the first \ell coordinates it must be the case that v=c^{\prime}+a_{1}u_{1}+\cdots+a_{\ell}u_{\ell}, where a_{i} is the ith coordinate of a. Let M be the matrix whose ith column is given by the last n entries of u_{i} and let c be the last n entries of c^{\prime}. It follows that U=\varphi((M,c)) and that Ma+c=b.

zoom-outs: Take an arbitrary T=(M,c)\in\mathcal{C}_{a^{T},b,\beta} and suppose M has columns v_{1},\ldots,v_{\ell}. Then \varphi(T)=c^{\prime}+\operatorname{span}(v^{\prime}_{1},\ldots,v^{\prime}_{\ell}) where c^{\prime}=[0,c]^{T} and v^{\prime}_{i}=(e_{i},v_{i})^{T} for 1\leqslant i\leqslant\ell. Let h=[-b^{T},a^{T}] be the length \ell+n row vector whose first \ell entries are -b^{T} and last n entries are a^{T}. By construction, \langle h,v^{\prime}_{i}\rangle=0 and \langle h,c^{\prime}\rangle=\beta. Therefore, \varphi(T) is contained in the hyperplane given by \langle h,x\rangle=\beta. Since T was arbitrary, it follows that \varphi(\mathcal{C}_{a^{T},b,\beta})\subseteq\mathcal{D}_{W}, where W is the hyperplane given by \langle h,x\rangle=\beta.

For the other direction of the bijection, take any z+V𝒟W,linIm(φ)z+V\in\mathcal{D}_{W,\operatorname{lin}}\cap\operatorname{Im}(\varphi), let (M,c)(M,c) be its preimage under φ\varphi, and suppose WW is given by h,x=0\langle h,x\rangle=0 (note that WW must be a linear subspace since it contains VV which is a linear subspace). Take bb to be the negative of the first \ell coordinates of hh and aa to be the last nn coordinates of hh. It follows that aTM=ba^{T}M=b. ∎
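As a minimal illustration of this correspondence (our example, using the zoom-in notation above): take $n=\ell=1$, so a pair $T=(m,c)\in\mathcal{T}_{1,1}$ is the affine map $x\mapsto mx+c$, and

\varphi(T)=\{(x,mx+c):x\in\mathbb{F}_q\}=[0,c]^T+\operatorname{span}([1,m]^T)\subseteq\mathbb{F}_q^2.

Then $T\in\mathcal{C}_{a,b}$ precisely when $ma+c=b$, that is, precisely when the line $\varphi(T)$ passes through the point $v=[a,b]^T$, so $\varphi(T)\in\mathcal{D}_v$; the only line through $v$ missed by $\varphi$ is the vertical one, whose projection to the first coordinate is not full rank.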

With these lemmas, we can obtain the desired characterization of poorly expanding sets in ${\sf AffBilin}(n,\ell)$.

Proof of Theorem 2.1.

Suppose 𝒮\mathcal{S} satisfies the conditions of Theorem 2.1 and let 𝒜=φ(𝒮)𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,)\mathcal{A}=\varphi(\mathcal{S})\subseteq{\sf AffGras}(n+\ell,\ell). Henceforth use Φ\Phi and μ\mu to denote expansion and measure in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(n+,){\sf AffGras}(n+\ell,\ell). By Lemma A.4 we have that 1Φ(𝒜)1q1q21-\Phi(\mathcal{A})\geqslant\frac{1}{q}-\frac{1}{q^{2}}. By Lemma A.5, we have that 𝒜\mathcal{A} is also ξ\xi-pseudorandom with respect to hyperplanes, hyperplanes on its linear part, and points on its linear part. Finally since φ\varphi is an injection, it is clear that μ(𝒜)ξ\mu(\mathcal{A})\leqslant\xi. By Lemma A.1 it follows that there exists a point vv such that

|𝒜𝒟v||𝒟v|1q2(4q+867ξ1/4+1q100(q1)).\frac{|\mathcal{A}\cap\mathcal{D}_{v}|}{|\mathcal{D}_{v}|}\geqslant 1-q^{2}\left(4q^{-\ell}+867\xi^{1/4}+\frac{1}{q^{100}(q-1)}\right).

By Lemma A.5, there is a zoom-in (in the affine bi-linear scheme graph) $\mathcal{C}_{a,b}$ such that $\varphi^{-1}(\mathcal{A}\cap\mathcal{D}_v)=\mathcal{S}\cap\mathcal{C}_{a,b}$, so $|\mathcal{S}\cap\mathcal{C}_{a,b}|=|\mathcal{A}\cap\mathcal{D}_v|$. Finally, $\frac{|\mathcal{C}_{a,b}|}{|\mathcal{D}_v|}$ is the probability that a randomly chosen flat from $\mathcal{D}_v$ is full rank when restricted to its first $\ell$ coordinates. This probability is at most $1-1/q$. Indeed, a flat from $\mathcal{D}_v$ can be chosen by choosing $\ell$ linearly independent vectors $v_1,\ldots,v_\ell$ and taking the flat to be $v+\operatorname{span}(v_1,\ldots,v_\ell)$. Conditioned on the first $\ell-1$ vectors being chosen so that their restriction to the first $\ell$ coordinates is linearly independent, there is a $1-1/q$ chance that $v_\ell$ satisfies this property as well. It follows that

|𝒮𝒞a,b||𝒞a,b|qq1(1q2(4q+867ξ1/4+1q100(q1)))11(q1)2q3q1(4q+867ξ1/4).\frac{|\mathcal{S}\cap\mathcal{C}_{a,b}|}{|\mathcal{C}_{a,b}|}\geqslant\frac{q}{q-1}\left(1-q^{2}\left(4q^{-\ell}+867\xi^{1/4}+\frac{1}{q^{100}(q-1)}\right)\right)\geqslant 1-\frac{1}{(q-1)^{2}}-\frac{q^{3}}{q-1}\left(4q^{-\ell}+867\xi^{1/4}\right).

Appendix B Canonical Monomials characterizing RM[n,q,d]\operatorname{RM}[n,q,d]

In this section we include the proof of Theorem 3.4, showing that any polynomial passing the test with probability $1$ has degree at most $d$. The proof is implicit in [33], but as our tester is presented differently we give a proof for completeness.

Write d+1=p(qq/p)+rd+1=\ell\cdot p(q-q/p)+r, where 1rp(qq/p)1\leqslant r\leqslant p(q-q/p), and set s=ps=\ell\cdot p. First, note that our tester detects monomials of the form:

(i=1sxiqq/p)j=1p+2xs+jej,\left(\prod_{i=1}^{s}x_{i}^{q-q/p}\right)\prod_{j=1}^{p+2}x_{s+j}^{e_{j}}, (6)

for any (e1,,et)(e_{1},\ldots,e_{t}) such that j=1tejr\sum_{j=1}^{t}e_{j}\geqslant r. Indeed, for (e1,,et)(e_{1},\ldots,e_{t}) such that j=1tejr\sum_{j=1}^{t}e_{j}\geqslant r, let e=(q1e1,,q1et)e^{\prime}=(q-1-e_{1},\ldots,q-1-e_{t}) and consider the expansion of He(x1,,xs+t)H_{e^{\prime}}(x_{1},\ldots,x_{s+t}). Using Lucas’s Theorem, it is not hard to see that the expansion of He(x1,,xs+t)H_{e^{\prime}}(x_{1},\ldots,x_{s+t}) contains the monomial (i=1sxiq/p1)j=1txs+jq1ej\left(\prod_{i=1}^{s}x_{i}^{q/p-1}\right)\prod_{j=1}^{t}x_{s+j}^{q-1-e_{j}}, and hence by Lemma 3.3, any canonical monomial in (6) is rejected.
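As a small numerical illustration (ours, not taken from [33]): for $q=4$, $p=2$ and $d=13$ we have $q-q/p=2$ and $p(q-q/p)=4$, so $d+1=14=3\cdot 4+2$ gives $\ell=3$, $r=2$ and $s=\ell p=6$. A monomial of the form (6) is then, for instance,

\left(\prod_{i=1}^{6}x_i^{2}\right)x_7x_8,

whose total degree is $6\cdot 2+2=14=d+1$ and whose tail degrees sum to $2\geqslant r$.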

Let 𝒢\mathcal{G} be the family of functions that pass the sparse s+ts+t-flat test with probability 11. In this section we will prove:

Lemma B.1.

If $f$ passes the sparse $(s+t)$-flat test with probability $1$, then $f\in\operatorname{RM}[n,q,d]$. Equivalently, $\mathcal{G}=\operatorname{RM}[n,q,d]$.

One direction of this lemma is clear: if $\deg(f)\leqslant d$ then $f$ passes with probability $1$. The other direction amounts to showing that $\mathcal{G}\subseteq\operatorname{RM}[n,q,d]$. To show this, we argue by contradiction and prove that if $\deg(f)\geqslant d+1$ and $f\in\mathcal{G}$, then one of the canonical monomials in (6) must be in $\mathcal{G}$. This is a contradiction, since those monomials are rejected, and it establishes that $\mathcal{G}\subseteq\operatorname{RM}[n,q,d]$.

Before proceeding to the proof, we will need the following two facts from [25] about affine-invariant families of polynomials. For a family of polynomials \mathcal{F}, let supp()\operatorname{supp}({\mathcal{F}}) denote the set of monomials that appear in at least one of these polynomials. The first fact says that these monomials are a basis of \mathcal{F}.

Lemma B.2 (Monomial Extraction Lemma [25]).

If \mathcal{F} is an affine-invariant family of polynomials then =span(supp())\mathcal{F}=\operatorname{span}(\operatorname{supp}(\mathcal{F})).

The second fact says that the monomials appearing in an affine-invariant family are pp-shadow closed. For two integers a,b{0,,q1}a,b\in\{0,\ldots,q-1\}, let a=i=0k1piaia=\sum_{i=0}^{k-1}p^{i}a_{i} and b=i=0k1pibib=\sum_{i=0}^{k-1}p^{i}b_{i} be their base pp representations (recall q=pkq=p^{k}). Then we say aa is in the pp-shadow of bb if aibia_{i}\leqslant b_{i} for i=0,,k1i=0,\ldots,k-1, and denote this by apba\leqslant_{p}b. Then for two exponent vectors e=(e1,,en)e=(e_{1},\ldots,e_{n}) and e=(e1,,en)e^{\prime}=(e^{\prime}_{1},\ldots,e^{\prime}_{n}), we say epee\leqslant_{p}e^{\prime} if eipeie_{i}\leqslant_{p}e^{\prime}_{i} for every ii. Affine-invariant families of polynomials have the following shadow closed property.
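For a concrete illustration of the $p$-shadow relation (our example): with $p=2$ and $k=3$, so $q=8$, we have $5=(101)_2\leqslant_2 7=(111)_2$, while $3=(011)_2$ is not in the $2$-shadow of $5=(101)_2$. This matches Lucas's Theorem, by which $\binom{b}{a}\not\equiv 0\pmod p$ exactly when $a\leqslant_p b$: indeed,

\binom{7}{5}=21\equiv 1\pmod 2,\qquad \binom{5}{3}=10\equiv 0\pmod 2.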

Lemma B.3.

Let =span(supp())\mathcal{F}=\operatorname{span}(\operatorname{supp}(\mathcal{F})) be an affine invariant family of polynomials. If epee\leqslant_{p}e^{\prime} and esupp()e^{\prime}\in\operatorname{supp}(\mathcal{F}), then esupp()e\in\operatorname{supp}(\mathcal{F}) as well.

Finally, we will need the following which will allow us to go from one monomial in \mathcal{F} to another with the same degree, but with the distribution of the individual degrees shifted.

Lemma B.4.

Suppose $\mathcal{F}$ is an affine-invariant family, $x^e\in\mathcal{F}$, and $m\leqslant_p e_2$. Then the monomial $x^{e'}\in\mathcal{F}$, where $e'=(e_1+m,e_2-m,e_3,\ldots,e_n)$.

Proof.

Let TT be the affine transformation T(x)=(x1,x1+x2,x3,,xn)T(x)=(x_{1},x_{1}+x_{2},x_{3},\ldots,x_{n}). Then,

x^{e}\circ T=x_{1}^{e_{1}}(x_{1}+x_{2})^{e_{2}}\prod_{j=3}^{n}x_{j}^{e_{j}}=\left(\sum_{i=0}^{e_{2}}\binom{e_{2}}{i}x_{1}^{e_{1}+i}x_{2}^{e_{2}-i}\right)\prod_{j=3}^{n}x_{j}^{e_{j}}.

By Lucas's Theorem and the assumption $m\leqslant_p e_2$, $\binom{e_2}{m}\neq 0$ in $\mathbb{F}_q$, and so $x^e\circ T$ contains the monomial $x^{e'}$. Since $\mathcal{F}$ is affine-invariant, $x^e\circ T\in\mathcal{F}$, and the result then follows from Lemma B.2. ∎
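A small worked instance of this argument (ours): take $q=4$, $p=2$, $e=(1,3)$ and $m=2\leqslant_2 3$. Over $\mathbb{F}_4$,

x_1(x_1+x_2)^3=x_1^4+x_1^3x_2+x_1^2x_2^2+x_1x_2^3,

since $\binom{3}{1}=\binom{3}{2}=3$ reduces to $1$ in characteristic $2$, so the monomial $x_1^{1+2}x_2^{3-2}=x_1^3x_2$ indeed appears in $x^e\circ T$.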

With these three lemmas, we are ready to proceed to the proof of Lemma B.1. Supposing for the sake of contradiction that deg(f)>d\deg(f)>d, using the above lemmas, we will show that this implies one of the canonical monomials is in \mathcal{F}. This cannot happen however, as all monomials in (6) are rejected by T=IT=I, the identity.

Proof of Lemma B.1..

Let $\mathcal{F}$ be the family of polynomials that pass the sparse $(s+t)$-flat test with probability $1$ (that is, $\mathcal{F}=\mathcal{G}$). Suppose for the sake of contradiction that $f\in\mathcal{F}$ and $f$ contains a monomial of degree $d'>d$. Let $x^e=\prod_{i=1}^{n}x_i^{e_i}$ be such a monomial and let $\ell$ be the smallest index such that $e_1+\cdots+e_\ell>d$. Then $(e_1,\ldots,e_\ell,0,\ldots,0)\leqslant_p e$, so by Lemma B.3, $\prod_{i=1}^{\ell}x_i^{e_i}\in\mathcal{F}$ and

d+1\leqslant\sum_{i=1}^{\ell}e_{i}\leqslant d+q-1\leqslant s(q-q/p)+r+q-1.

We will show that this monomial will lead to one of the canonical monomials being in \mathcal{F}. First, we claim that there is a monomial xex^{e^{\prime}} such that eiqq/pe^{\prime}_{i}\geqslant q-q/p for i=1,,si=1,\ldots,s. Define

c(e)=i=1smax(0,(qq/p)ei).c(e)=\sum_{i=1}^{s}\max(0,(q-q/p)-e_{i}).

If $c(e)=0$, then $e$ is of the desired form. Otherwise, there is $i\leqslant s$ such that $e_i<q-q/p$, and since the total degree of $x^e$ is at least $d+1\geqslant s(q-q/p)+1$, one of the following must be true:

  • There is j>sj>s such that ej>0e_{j}>0, in which case we simply find some pmpejp^{m}\leqslant_{p}e_{j} and apply Lemma B.4 to obtain the monomial xiei+pmxjejpmx_{i}^{e_{i}+p^{m}}x_{j}^{e_{j}-p^{m}} in place of xieixjejx_{i}^{e_{i}}x_{j}^{e_{j}},

  • There is $j\leqslant s$ such that $e_j>q-q/p$. In this case we can find $p^m\leqslant_p e_j$ such that $e_j-p^m\geqslant q-q/p$, and apply Lemma B.4 to obtain the monomial $x_i^{e_i+p^m}x_j^{e_j-p^m}$ in place of $x_i^{e_i}x_j^{e_j}$.

In either case, we can find another monomial xex^{e^{\prime}} such that xex^{e^{\prime}}\in\mathcal{F} and c(e)<c(e)c(e^{\prime})<c(e). Iterating this process, we must eventually find the desired monomial with c(e)=0c(e)=0.

Now, abusing notation, let $x^e$ be this monomial. We have $d+1\leqslant\sum_{i=1}^{n}e_i\leqslant d+q-1\leqslant s(q-q/p)+r+q-1$ and $e_i\geqslant q-q/p$ for $1\leqslant i\leqslant s$. We can now perform essentially the same argument and move any degree above $q-q/p$ to $e_{s+1},\ldots,e_{s+t}$. There is at most

d+q-1-s(q-q/p)\leqslant r+q-1\leqslant p(q-q/p)+q-1,

degree left over after subtracting, so we can perform the above argument to shift degree until each of the degrees $e_{s+1},\ldots,e_{s+t-1}$ is at least $q-q/p$. This will leave at most $q-1$ degree left over, which we can simply shift to $e_{s+t}$. The resulting monomial is of the form (6) and lies in $\mathcal{F}$, which is a contradiction since such monomials are rejected. ∎
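To illustrate the shifting step with small numbers (a toy example of ours): take $q=4$, $p=2$, so $q-q/p=2$, and suppose $s=2$ and the monomial $x_1^2x_2x_4^3$ lies in $\mathcal{F}$. Then $c(e)=\max(0,2-2)+\max(0,2-1)=1$, and since $e_4=3>0$ and $1\leqslant_2 3$, Lemma B.4 applied to the pair $(x_2,x_4)$ yields the monomial

x_1^2x_2^2x_4^2\in\mathcal{F},

which has the same total degree and satisfies $c=0$; any remaining degree on the later coordinates is then rearranged in the same way.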

Lemma B.5.

Suppose g:𝔽qp𝔽qg:\mathbb{F}_{q}^{p}\xrightarrow[]{}\mathbb{F}_{q} satisfies deg(g)q(p1)\deg(g)\geqslant q(p-1). Then there exist full rank affine transformations T1,T2𝒯p,pT_{1},T_{2}\in\mathcal{T}_{p,p} such that:

  1. 1.

    gT1g\circ T_{1} contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/peiq1q-q/p\leqslant e_{i}\leqslant q-1 for all i[p]i\in[p].

  2. 2.

    gT2,P0\langle g\circ T_{2},P\rangle\neq 0.

Proof.

To show the first part of the lemma, we use a similar idea to the proof of Lemma B.4. Suppose gg contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e^{\prime}_{i}} and suppose mpe2m\leqslant_{p}e_{2}^{\prime}. Consider the full rank affine transformation Tα(x)=(x1,αx1+x2,x3,,xp)T_{\alpha}(x)=(x_{1},\alpha x_{1}+x_{2},x_{3},\ldots,x_{p}) for α𝔽q\alpha\in\mathbb{F}_{q} and the coefficient of x1e1+mx2e2mi=3pxieix_{1}^{e_{1}^{\prime}+m}x_{2}^{e_{2}^{\prime}-m}\prod_{i=3}^{p}x_{i}^{e_{i}^{\prime}} in gTαg\circ T_{\alpha}. It is not hard to see that this coefficient is a nonzero polynomial in α\alpha and therefore there exists an α\alpha such that gTαg\circ T_{\alpha} contains the monomial x1e1+mx2e2mi=3pxieix_{1}^{e_{1}^{\prime}+m}x_{2}^{e_{2}^{\prime}-m}\prod_{i=3}^{p}x_{i}^{e_{i}^{\prime}}. As TαT_{\alpha} is a full rank affine transformation, we can repeatedly apply such transformations to obtain a full rank T1𝒯p,pT_{1}\in\mathcal{T}_{p,p} such that gT1g\circ T_{1} contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/p<eiq1q-q/p<e_{i}\leqslant q-1, proving the first part of the lemma.

Let T1T_{1} be the transformation from part 11 and let g=gT1g^{\prime}=g\circ T_{1}, so that gg^{\prime} contains a monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/p<eiq1q-q/p<e_{i}\leqslant q-1. For a𝔽qpa\in\mathbb{F}_{q}^{p}, let TaT_{a} be the full rank affine transformation such that Ta(x)=(x1+a1,,xp+ap)T_{a}(x)=(x_{1}+a_{1},\ldots,x_{p}+a_{p}) and consider the inner product

gTa,P,\langle g^{\prime}\circ T_{a},P\rangle,

as a polynomial in aa. Recall that PP contains the monomial i=1pxiq/p1\prod_{i=1}^{p}x_{i}^{q/p-1}. Since gg^{\prime} contains the monomial i=1pxiei\prod_{i=1}^{p}x_{i}^{e_{i}} with qq/p<eiq1q-q/p<e_{i}\leqslant q-1 there is a contribution of i=1paiei(qq/p)\prod_{i=1}^{p}a_{i}^{e_{i}-(q-q/p)} with nonzero coefficient, and this cannot be cancelled out from any other monomial. Therefore, gTa,P\langle g^{\prime}\circ T_{a},P\rangle is a nonzero polynomial in aa and there exists an a𝔽qpa\in\mathbb{F}_{q}^{p} such that gTa,P0\langle g^{\prime}\circ T_{a},P\rangle\neq 0. This establishes the second part of the lemma. ∎

Appendix C Proof of Lemma 5.3

In this section we provide the proof of Lemma 5.3. This is essentially the same as [31, Proposition 3.5] with a slight modification. Recall that $g:\mathbb{F}_q^{\ell+100}\to\mathbb{F}_q$ is an arbitrary polynomial, $d$ is an arbitrary degree parameter, and $b\in\mathbb{F}_q^{\ell+100}$ is an arbitrary point. We work in ${\sf AffGras}(\ell+100,\ell)$. Use $\nu$ to denote the uniform measure over all $\ell$-flats, and $\nu_b$ to denote the uniform measure over the zoom-in on $b$. When referring to zoom-ins, zoom-ins on the linear part, etc., we are referring to the versions in the affine Grassmann graph. We have the following two sets of $\ell$-flats,

𝒜={U|dim(U)=,deg(g|U)>d,bU},\mathcal{A}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\in U\},
={U|dim(U)=,deg(g|U)>d,bU},\mathcal{B}=\{U\;|\;\dim(U)=\ell,\deg(g|_{U})>d,b\notin U\},

and ν()qM/2O(ε)\nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon). We will show the following weaker form of Lemma 5.3, and then explain why this gives the full Lemma 5.3.

Lemma C.1.

If $\nu(\mathcal{B})\leqslant q^{M/2}O(\varepsilon)$ then $\mathcal{B}=\emptyset$. Moreover, there is a value $\gamma$ such that after changing $g(b)$ to $\gamma$, $\nu_b(\mathcal{A})\leqslant 1-\frac{1}{q}$.

Before proving this lemma, we explain why it implies Lemma 5.3. Assuming that Lemma C.1 holds, suppose that gg^{\prime} is the resulting function after changing the value of g(b)g(b) to γ\gamma, and consider the set 𝒰\mathcal{U} of \ell-flats UU such that deg(g|U)>d\deg(g^{\prime}|_{U})>d. Assume for the sake of contradiction that 𝒰\mathcal{U} is nonempty. We record some facts about 𝒰\mathcal{U},

  1. 1.

    𝒰𝒟b={U|dim(U)=,bU}\mathcal{U}\subseteq\mathcal{D}_{b}=\{U\;|\;\dim(U)=\ell,b\in U\}.

  2. 2.

    $\nu(\mathcal{U})\leqslant q^{-100}$.

  3. 3.

    Φ(𝒰)11q\Phi(\mathcal{U})\leqslant 1-\frac{1}{q}.

  4. 4.

    $\mathcal{U}$ is $q^{-99}$ pseudorandom with respect to zoom-outs and zoom-outs on the linear part.

  5. 5.

    $\mathcal{U}$ is $q^{-100}$ pseudorandom with respect to zoom-ins on the linear part.

The first item follows from Lemma C.1: since $\mathcal{B}=\emptyset$ and $g'$ agrees with $g$ outside of $b$, every $\ell$-flat not containing $b$ has $\deg(g'|_U)=\deg(g|_U)\leqslant d$. The second item follows from the first item and from bounding the fraction of $\ell$-flats that contain $b$. The third and fourth items follow from the same arguments as Lemmas 4.2 and 4.3. Finally, the fifth item follows again from the first item and the fact that over any zoom-in on the linear part, a random $\ell$-flat from this set contains the point $b$ with probability at most $q^{-100}$.
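For concreteness, the calculation behind the second item is the following (a routine observation of ours): by symmetry, a uniformly random $\ell$-flat $U$ contains the fixed point $b$ with probability

\Pr_U[b\in U]=\frac{|U|}{|\mathbb{F}_q^{\ell+100}|}=\frac{q^{\ell}}{q^{\ell+100}}=q^{-100},

so $\mathcal{U}\subseteq\mathcal{D}_b$ gives $\nu(\mathcal{U})\leqslant q^{-100}$.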

Applying Theorem A.1 with $\xi=q^{-100}$, we get that $\mathcal{U}$ must have density at least $1-q^2(867q^{-25}+q^{-\ell})>1-1/q$ inside some zoom-in, where we use the fact that $\ell\geqslant 4$. The only zoom-in on which $\mathcal{U}$ can have such high density is the zoom-in on $b$, since every flat of $\mathcal{U}$ contains $b$, and for any point $v\neq b$ the flats through both $v$ and $b$ form at most a $q^{-99}$-fraction of the zoom-in on $v$. However, this contradicts the assumption that $\nu_b(\mathcal{A})\leqslant 1-\frac{1}{q}$ after the value of $g(b)$ is changed. In short, we have shown that, in this setting, if a correction brings the rejection probability within the zoom-in on $b$ down to at most $1-1/q$, then the rejection probability must in fact be $0$.

We now give the proof of Lemma C.1. This proof requires the following result used in both [5, 31], which was referred to as the shadow lemma in the introduction.

Lemma C.2.

Let dd\in\mathbb{N} be a degree parameter and kd+1qq/pk\geqslant\lceil\frac{d+1}{q-q/p}\rceil. Then for any f:𝔽qk+1𝔽qf:\mathbb{F}_{q}^{k+1}\xrightarrow[]{}\mathbb{F}_{q}:

  1. 1.

    If deg(f)>d\deg(f)>d, then εk,d(f)1q\varepsilon_{k,d}(f)\geqslant\frac{1}{q}.

  2. 2.

    If d<deg(f)<(k+1)(q1)d<\deg(f)<(k+1)(q-1) then εk,d(f)>1q\varepsilon_{k,d}(f)>\frac{1}{q}.

Here $\varepsilon_{k,d}(f)$ and $\varepsilon_{k+1,d}(f)$ denote the fraction of $k$-flats and $(k+1)$-flats, respectively, on which the restriction of $f$ has degree greater than $d$.

In [31] this lemma is used to show that the set of rejecting flats is non-expanding, similar to Lemma 4.2 in this paper. The lemma has another use though, which comes from its second item. For our purposes, the second item says that if $f|_V$ has degree greater than $d$, but $\deg(f|_U)>d$ for at most a $1/q$-fraction of the hyperplanes $U\subseteq V$, then $f|_V$ must contain the maximum degree monomial. In this section we will use Lemma C.2 with degree parameter $d$ and $k=\ell$. Note that $\ell\geqslant\lceil\frac{d+1}{q-q/p}\rceil$ by assumption.
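For clarity, we record the contrapositive form in which the second item will be applied below (our restatement): for $f:\mathbb{F}_q^{k+1}\to\mathbb{F}_q$,

\text{if }\deg(f)<(k+1)(q-1)\text{ and }\varepsilon_{k,d}(f)\leqslant\tfrac{1}{q},\text{ then }\deg(f)\leqslant d.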

Our first goal is to show $\mathcal{B}=\emptyset$. Start by taking a uniformly random $(\ell+40)$-flat $W$, conditioned on $b\notin W$. Let

W={BW|B},\mathcal{B}_{W}=\{B\subseteq W\;|\;B\in\mathcal{B}\},

and work in 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(W,){\sf AffGras}(W,\ell) — the affine Grassmann graph over the \ell-flats contained in WW.

Let $\mu_W$ and $\Phi_W$ denote measure and expansion respectively in ${\sf AffGras}(W,\ell)$. We claim that either $\mu_W(\mathcal{B}_W)=0$ or $\mu_W(\mathcal{B}_W)\geqslant q^{-100}$. Otherwise, $0<\mu_W(\mathcal{B}_W)<q^{-100}$ and $\mathcal{B}_W$ is $q^{-60}$ pseudorandom with respect to zoom-ins and zoom-ins on the linear part. By a similar argument to Lemma 4.3, $\mathcal{B}_W$ is $q^{-98}$ pseudorandom with respect to zoom-outs and zoom-outs on the linear part, and by a similar argument to Lemma 4.2, $1-\Phi_W(\mathcal{B}_W)\geqslant 1/q$. This contradicts Theorem A.1, however, so we conclude that either $\mu_W(\mathcal{B}_W)=0$ or $\mu_W(\mathcal{B}_W)\geqslant q^{-100}$.

Lemma C.3.

=.\mathcal{B}=\emptyset.

Proof.

Otherwise we may find a $W$ such that $\mu_W(\mathcal{B}_W)\geqslant q^{-100}$. Fix such a $W$ and sample a uniform $(\ell+99)$-flat $Y$ conditioned on $b\notin Y$, and a uniform $(\ell+60)$-flat $A_2\subseteq Y$. Now consider $A_2\cap W$. We may think of $W$ as being defined by a system of $60$ independent linear equations $\langle h_1,x\rangle=c_1,\ldots,\langle h_{60},x\rangle=c_{60}$. That is, $W$ is the set of points of $\mathbb{F}_q^{\ell+100}$ satisfying these $60$ equations. Likewise, within $Y$, $A_2$ is given by $39$ linear equations, $\langle h'_1,x\rangle=c'_1,\ldots,\langle h'_{39},x\rangle=c'_{39}$. The probability that all $99$ linear equations are linearly independent is at least

j=038q99q60+jq99e2j=1qje4/q,\prod_{j=0}^{38}\frac{q^{99}-q^{60+j}}{q^{99}}\geqslant e^{-2\sum_{j=1}^{\infty}q^{-j}}\geqslant e^{-4/q},

where we use that $1-x\geqslant e^{-2x}$ for $0\leqslant x\leqslant\frac{1}{2}$. Conditioned on this event, $A_2\cap W$ is uniform over ${\sf AffGras}(W,\ell)$. Thus,

Pr[A2WW]e4/qq100.\Pr[A_{2}\cap W\in\mathcal{B}_{W}]\geqslant e^{-4/q}q^{-100}.

If A2WWA_{2}\cap W\in\mathcal{B}_{W}, then A2Y60A_{2}\in\mathcal{B}_{Y}^{\uparrow^{60}}, where the upper shadow is taken with respect to 𝖠𝖿𝖿𝖦𝗋𝖺𝗌(Y,){\sf AffGras}(Y,\ell), so it follows that

𝔼Y[μY(Y60)]e4/qq100.\mathop{\mathbb{E}}_{Y}[\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})]\geqslant e^{-4/q}q^{-100}.

On the other hand, by Lemma C.2,

\mathop{\mathbb{E}}_{Y}[\mu_{Y}(\mathcal{B}_{Y}^{\uparrow^{60}})]\leqslant q^{60}\mathop{\mathbb{E}}_{Y}[\mu_{Y}(\mathcal{B}_{Y})]\leqslant 2q^{60}\nu(\mathcal{B})\leqslant 2q^{60+M/2}O(\varepsilon).

By assumption ε<qM\varepsilon<q^{-M} so altogether we get that,

q^{-M/2}\geqslant\frac{1}{C(q)q^{160}},

for some constant C(q)C(q). For large enough MM this is a contradiction. ∎

We now complete the proof of Lemma C.1, which in turn proves Lemma 5.3.

Proof of Lemma 5.3..

Fix an $\ell$-flat $U$ that contains $b$. We will show that there exists a value $\gamma$ such that changing the value of $g(b)$ to $\gamma$ results in a function of degree at most $d$ on $U$. Establishing this fact and choosing the most common $\gamma$ over all $\ell$-flats proves Lemma C.1 (the most common value is chosen for at least a $1/q$-fraction of the $\ell$-flats through $b$, so after the change $\nu_b(\mathcal{A})\leqslant 1-\frac{1}{q}$), which in turn establishes Lemma 5.3 by our previous discussion.

Suppose $\deg(g|_U)>d$, as otherwise we are done by setting $\gamma=g(b)$. Pick an $(\ell+1)$-flat $U'\supset U$, let $g'=g|_{U'}$, and let $M(x)=1_{x\neq b}$ on $U'$. Note that $M(x)$ contains the monomial of degree $(\ell+1)(q-1)$, so there is some value $\gamma$ such that $\gamma M(x)+g'(x)$ has degree strictly less than $(\ell+1)(q-1)$. Showing that for this $\gamma$, $\deg(\gamma M(x)+g'(x))\leqslant d$ completes the proof.

Since we have already shown that $\deg(g|_{U''})\leqslant d$ for any $\ell$-flat $U''$ that does not contain the point $b$ (this is exactly the statement $\mathcal{B}=\emptyset$), $\gamma M(x)+g'(x)$ also has degree at most $d$ on any $\ell$-flat of $U'$ not containing $b$. This is because when restricted to such a flat, $\gamma M(x)+g'(x)$ is equal to the restriction of $g$ plus a constant, and since the degree of $g$'s restriction is at most $d$, so is the degree of $\gamma M(x)+g'(x)$. It follows that $\gamma M(x)+g'(x)$ can only have degree greater than $d$ when restricted to $\ell$-flats that contain $b$. These make up at most a $1/q$-fraction of the $\ell$-flats contained in $U'$, which, along with the fact that $\deg(\gamma M(x)+g'(x))<(\ell+1)(q-1)$, implies that $\deg(\gamma M(x)+g'(x))\leqslant d$ by Lemma C.2. ∎