Optimal Testing of Generalized Reed-Muller Codes in Fewer Queries
Abstract
A local tester for an error correcting code is a tester that makes a few oracle queries to a given word and decides whether to accept or reject it. An optimal local tester is a local tester that has the additional properties of completeness and optimal soundness. By completeness, we mean that the tester must accept with probability $1$ if the word is a codeword. By optimal soundness, we mean that if the tester accepts with probability at least $1 - \varepsilon$ (where $\varepsilon$ is small), then it must be the case that the word is $O(\varepsilon)$-close to some codeword in Hamming distance.
We show that Generalized Reed-Muller codes admit optimal testers with queries. Here, for a prime power $q$, the Generalized Reed-Muller code consists of the evaluations of all $n$-variate degree-$d$ polynomials over $\mathbb{F}_q$. Previously, no tester achieving this query complexity was known; the best known testers, due to Haramaty, Shpilka and Sudan [21] (which is optimal) and due to Ron-Zewi and Sudan [33] (which was not known to be optimal), both required queries. Our tester achieves query complexity which is polynomially better by a power of , which is nearly the best query complexity possible for generalized Reed-Muller codes.
The tester we analyze is due to Ron-Zewi and Sudan, and we show that their basic tester is in fact optimal. Our methods are more general and also allow us to prove that a wide class of testers, which follow the form of the Ron-Zewi and Sudan tester, are optimal. This result applies to testers for all affine-invariant codes (which are not necessarily generalized Reed-Muller codes).
1 Introduction
1.1 Local Testing of Reed Muller Codes
The Reed-Muller Code is a widely used code with many applications in complexity theory, and more broadly in theoretical computer science. One reason for this is that the Reed-Muller code enjoys very good local testability properties which are crucial in many applications (for example, in the construction of probabilistically checkable proofs). The primary goal of this paper is to present local testers for Reed-Muller codes over extension fields with improved query complexity, which additionally satisfy a stronger notion of soundness known as optimal testing.
Throughout this paper, $p$ is a prime and $q$ is a power of $p$, where $q$ should be thought of as moderately large (so that $q$ is significantly larger than $p$). For a degree parameter $d$ and a number of variables parameter $n$, the Reed-Muller code consists of all evaluation vectors of degree-$d$ polynomials $f \colon \mathbb{F}_q^{n} \to \mathbb{F}_q$. That is,
When $q$ is not prime, this code is sometimes called a generalized Reed-Muller code, to distinguish it from the prime field case, and as the title suggests, our results are most relevant to this case. Henceforth, however, we will refer to all of these codes as Reed-Muller codes for simplicity.
Abusing notation, we will not distinguish between functions and their evaluation vectors when referring to codewords. That is, for a function $f \colon \mathbb{F}_q^{n} \to \mathbb{F}_q$, we will simply say that $f$ is in the code if its evaluation vector is a codeword, and we will view the codewords themselves as functions. When talking about the degree of a function $f$, we mean its total degree when written as a polynomial.
Given a proximity parameter $\delta$, the goal in the problem of local testing of Reed-Muller codes is to design a randomized tester that makes as few oracle queries as possible to a given function $f \colon \mathbb{F}_q^{n} \to \mathbb{F}_q$ such that:
1. Completeness: if $f$ is a polynomial of degree at most $d$, then the tester accepts with probability $1$.
2. Soundness: if $f$ is $\delta$-far from all degree-$d$ polynomials, then the tester rejects with constant probability (say, at least $2/3$).
Local Testing.
Reed-Muller codes have a very natural and well studied local test [1, 24, 22, 21] called the -flat test. This test has its origins in the study of probabilistically checkable proofs [15, 3, 2, 34, 4, 32] (as it is related to the well known plane versus plane, plane versus line and line versus point tests), as well as in the theory of Gowers’ uniformity norms [16]. To check whether a given function is of low degree, the tester samples a random -dimensional affine subspace , queries for all and checks whether the resulting -variate function has degree at most . Clearly the number of queries made is , and it is also clear that the test is complete: if has degree at most , then the tester always accepts. As for the soundness, it is known that one can take and get that the tester is somewhat sound [1], meaning that the rejection probability is bounded away from (as opposed to being at least ). Indeed, a typical local-testing result shows that if a function is -far from being a degree function, then the tester rejects it with probability at least , a quantity that typically vanishes with the various parameters. To amplify the soundness, one repeats the tester times sequentially to get rejection probability, thereby giving a tester whose query complexity is and whose soundness is at least .
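For concreteness, the flat-test idea admits a particularly clean runnable sketch over $\mathbb{F}_2$ (a special case only; all function names below are ours): a polynomial of degree at most $d$ sums to $0$ over the $2^{d+1}$ points spanned by any base point and $d+1$ directions (its $(d+1)$-fold iterated derivative vanishes), so the test can sample such a configuration and check the sum.

```python
import itertools, random

def poly_eval(monomials, x):
    # f(x) = sum over monomials (each a tuple of variable indices) of prod x_i, mod 2
    return sum(all(x[i] for i in mono) for mono in monomials) % 2

def flat_test_f2(f, n, d, trials=200):
    """Degree-<=d test over F_2: sum f over the 2^(d+1) points spanned by a
    random base point and d+1 random directions; a degree-<=d polynomial
    always sums to 0 (the iterated derivative vanishes)."""
    for _ in range(trials):
        base = [random.randrange(2) for _ in range(n)]
        dirs = [[random.randrange(2) for _ in range(n)] for _ in range(d + 1)]
        total = 0
        for coeffs in itertools.product([0, 1], repeat=d + 1):
            point = [(base[i] + sum(c * v[i] for c, v in zip(coeffs, dirs))) % 2
                     for i in range(n)]
            total ^= f(point)
        if total != 0:
            return False  # reject: found a violated derivative
    return True

# f(x) = x0*x1 + x2 has degree 2, so the d = 2 test always accepts it.
f = lambda x: poly_eval([(0, 1), (2,)], x)
print(flat_test_f2(f, n=4, d=2))  # True
```

Running the same `f` with `d=1` rejects with constant probability per trial, illustrating why sequential repetition is needed to amplify the soundness.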
Such testers for the Reed-Muller code have been known for a long time. Indeed, in [1] the -flat tester is analyzed and it is shown that the rejection probability of the basic tester is at least , leading to a tester with query complexity . This soundness analysis turns out not to be optimal, and indeed, as we explain below, follow-up works have given a better analysis of the -flat tester. In particular, it was shown that the -flat tester is an optimal tester.
1.2 Optimal Testing of Reed Muller Codes
In this paper, we focus on a much stronger notion of testing known as optimal testing. Clearly, if a tester makes $Q$ queries and the proximity parameter is $\delta$, then the rejection probability can be at most $Q\delta$; indeed, this is a bound on the probability of distinguishing between a Reed-Muller codeword and a Reed-Muller codeword that has been perturbed on a randomly chosen $\delta$-fraction of inputs. A tester is called optimal if the query-to-rejection-probability tradeoff it achieves is roughly this one. Oftentimes, one settles for a rejection probability which is a bit worse and has the form for some function depending only on the field size . We will refer to such results as optimal testing results as well. Thus, one would ideally like a tester which achieves as small a query complexity as possible, while simultaneously being optimal.
Optimal testers are known for Reed-Muller codes. Such results were first proved over $\mathbb{F}_2$ by [10]. Later on, optimal testing results were established for Reed-Muller codes over general fields [21], as well as more broadly for the class of affine lifted codes [20]. In all of these results, the -flat test is analyzed (for chosen as above) and shown to be an optimal tester. We remark that, additionally, the analyses of [10, 21, 20] led to improved query complexity for testing Reed-Muller codes. These results have important applications, most notably to the construction of small-set expander graphs with many eigenvalues close to $1$ [8], which have later been used for improved PCP constructions and constructions of integrality gaps [19, 12, 11, 23].
Quantitatively, these results have two drawbacks. First, due to their application of the density Hales-Jewett theorem, the dependency on the field size is of tower type, making the results effective primarily for small fields. Second, while the query complexity achieved by the tester is the best possible for prime fields (as a lower bound on the query complexity is given by the distance of the dual code of , which is if is prime), it is not known to be optimal for fields of prime power size. This raises two natural questions: does the flat tester actually perform worse over large fields (in comparison to small fields)? And is there a tester with smaller query complexity over non-prime fields (i.e. extension fields)?
In [33] a new local tester for the Reed-Muller code was designed. The query complexity of the tester is , which is polynomially smaller than above (by a power of ). This tester, which will be referred to as the sparse -flat tester, plays a crucial role in the current work and will be presented in subsequent sections.
Unfortunately, the rejection probability proved in [33] for the sparse -flat tester is sub-constant, and after repeating the tester times its query complexity turns out to be roughly the same as that of the -flat tester above. This leaves the Reed-Muller code over extension fields in a rather unsatisfying situation: a local characterization of the code is known, namely a basic tester that rejects functions far from Reed-Muller codewords with some non-negligible probability, but amplifying the soundness to a constant sets us back to the same query complexity as that of the -flat tester.
1.3 Main Result: Optimal, Query Efficient Tester for Generalized Reed Muller Codes
The main result of this paper is a new and improved analysis of the tester of [33]. We show that the soundness of the tester is much better than the guarantee given in [33], and that in fact this tester is also an optimal tester:
Theorem 1.1.
For all primes $p$ and prime powers $q$, there exists a tester with query complexity such that, given oracle access to a function $f \colon \mathbb{F}_q^{n} \to \mathbb{F}_q$,
1. Completeness: if $f$ has degree at most $d$, then the tester accepts with probability $1$.
2. Soundness: if $f$ is $\delta$-far from degree $d$, then the tester rejects with probability at least , where .
The test uses a parameter (analogous to the “dimension” parameter in the flat test), and the value that we use will be . We remark, however, that the analysis we give applies to all choices of this parameter, and we choose this specific value so as to minimize the query complexity. We defer to Section 3.2 for more details on this parameter.
A lower bound on the query complexity needed is , which once again follows by considering the dual code of the generalized Reed-Muller code and arguing that this number is its distance. Hence, Theorem 1.1 is tight up to a factor of ; for large , this factor is very small compared to , hence the query complexity achieved by our tester is essentially optimal for large field size .
1.4 Optimal Testing of Other Linear Lifted Affine Invariant Codes
Our techniques are in fact more general, and also apply to testers for a wider family of codes, called (linear) lifted affine invariant codes. These codes were introduced by [17] and shown to be optimally testable in [20, 31]. In words, we show that a tester for such a code is optimal whenever its test function can be expressed as a product of polynomials, each defined on a constant number of variables, with the variable sets of the polynomials pairwise disjoint. We describe this result in more detail below, but defer the full discussion to Section 6.
A code is linear if its codewords form a linear subspace, and is affine invariant if for any affine transformation $A$ and codeword $f$, we have that $f \circ A$ is also a codeword, where $f \circ A$ is defined as the function such that $(f \circ A)(x) = f(A(x))$ for all $x$. Since the code is linear, it has a dual, also consisting of functions over the same domain, and $f$ is in the code if and only if $\langle f, g \rangle = 0$ for all $g$ in the dual, where this inner product is the standard dot product of the evaluation vectors of $f$ and $g$ (over $\mathbb{F}_q$). Notice that one can also compose $f$ with affine transformations $A \colon \mathbb{F}_q^{k} \to \mathbb{F}_q^{n}$ for $k \leq n$. In this case, $f \circ A$ is a function from $\mathbb{F}_q^{k}$ to $\mathbb{F}_q$, and we can consider the code consisting of all $f$ such that every such $f \circ A$ lies in some affine invariant base code. This code is called the $n$-dimensional lift of the base code and is defined as:
It is not hard to see that is also affine invariant and linear. Suppose we want to design a local tester for and we know for some affine invariant defined as above with . A natural test is the following:
1. Take .
2. Choose an affine transformation uniformly at random.
3. Accept if and only if for all .
We remark that not every choice results in a local tester, and it is indeed possible to make a choice for which there exist non-codewords that still pass the above test with probability $1$. Our main result shows that when such a test is a local test and the test functions are of a specified form, then the tester is automatically an optimal tester. In order to obtain explicit optimal testers one still has to find a choice that is a local tester, but this is not the focus of the current paper.
The form for that we require is as follows. Let
where , and for some constant and all . In words, it is a polynomial in at most variables, for some that is within some constant power of , which can be factored into a product of polynomials, each in a constant number of variables, such that the variable sets of these polynomials are pairwise disjoint. Next let be an affine invariant code and let be a basis for . Finally, suppose
Our theorem states:
Theorem 1.2.
Suppose is of the previously described form and suppose that the previously described test using is a local tester for . Then the local tester is also optimal. That is,
1. Completeness: if $f$ is in the code, then the test accepts with probability $1$.
2. Soundness: if $f$ is $\delta$-far from the code, then the test rejects with probability at least , where .
Although our result is stated for lifted affine-invariant codes, it applies equally to affine-invariant codes by taking . In contrast, the optimal testing result for lifted affine-invariant codes in [31] applies only to the -flat test for , which is no longer “local” in the case of as it looks at the entire domain. On the other hand, in that case one could still obtain locality using our result by designing a set of the specified form that has sparse support. Theorem 1.2 gives a general recipe for constructing optimal testers, thus making progress on the task of establishing optimal testing results for general affine invariant codes. We would like to highlight two interesting avenues that we leave for future work. First, it would be interesting to investigate which other affine invariant codes can be analyzed via Theorem 1.2. This could lead to new optimal testing results for other codes, or otherwise to a more general class of testers that one can then try to analyze using the tools presented herein (and their extensions). Second, it would be interesting to extend our techniques so as to remove (or somehow weaken) the requirements on the form of the test functions, and show that any local test for affine-invariant codes is optimal.
1.5 Optimal Testing via Global Hypercontractivity
The proof of Theorem 1.1 is very different from the proofs of [10, 21, 20] (which proceed by induction on ) as well as from the proof of [33] (which proceeds by presenting a local characterization of the generalized Reed-Muller code and appealing to a general and powerful result due to [25] that converts local characterizations into local testers). Instead, our proof is inspired by a new approach for establishing such results via global hypercontractivity [31].
Global hypercontractivity is a recently introduced tool that is often useful when working with graphs that are not small set expanders. One corollary of global hypercontractivity (to which it is morally equivalent) is a useful characterization of all small sets in a graph that have edge expansion bounded away from . The study of global hypercontractivity has its roots in the proof of the 2-to-2 Games Theorem [29, 14, 13, 30]; however, by now it is known to be useful in the study of a host of different problems (see for example [26, 27, 28, 31, 5, 7, 6, 18]).
Below, we explain the global hypercontractivity approach to proving local testability results. In Section 1.6 we explain how we extend this approach to the realm of generalized Reed-Muller codes in order to analyze the sparse -flat tester and prove Theorems 1.1, 1.2.
1.5.1 Optimal Testing of Reed-Muller Codes via Global Hypercontractivity
In [31], the authors relate the analysis of the -flat tester of the Reed-Muller code to expansion properties of the affine Grassmann graph. Here, the affine Grassmann graph is the graph whose vertex set is the set of all -flats in , and two vertices are adjacent if they intersect in a dimensional affine subspace. In short, the approach of [31] starts with the assumption that the -flat tester accepts with probability at least (for thought of as small) and proceeds to prove a structural result on the set of -flats on which the tester rejects:
In particular, using a lemma from [21] they prove that the set has poor expansion properties in the affine Grassmann graph, and use this to prove that there exists a point such that the tester almost always rejects when it selects a subspace containing it. This suggests that the function is erroneous at that point and should be corrected, and indeed, using this, a local correction procedure is devised in [31] showing that the value of the function at that point can be changed so as to increase the acceptance probability by an additive factor of . Iterating this argument, one eventually changes the function on at most a fraction of the inputs and arrives at a function on which the tester accepts with probability $1$. Such a function must be of degree at most , hence one gets that the original function is close to a degree polynomial.
1.5.2 Optimal Testing of Generalized Reed-Muller Codes via Global Hypercontractivity
While the approach of [31] seems more robust and thus potentially applicable to analyzing a richer class of codes, it is not completely obvious how to do so. For the -flat tester one may associate a very natural graph with the tester, but this is much less clear in the context of the sparse -flat tester (which is the tester that we analyze). The pattern of inputs queried by the tester is no longer a nice-looking subspace (but this seems inherent to improving upon the query complexity of the flat tester).
At a high level, we show that another way of approaching this problem is to utilize the symmetries of the code and to construct graphs on them. For that, we think of the tester as the composition of a “basic tester” with a random element from the group of symmetries of the code. In our case, the group of symmetries is the group of affine linear transformations over , and the graph that turns out to be related to the analysis of the sparse -flat tester is the so-called Affine Bi-linear Scheme Graph.
At first sight, affine linear transformations may seem morally equivalent to flats (identifying an affine linear transformation with its image), and indeed there are many connections between results on the former graph and results on the latter graph. However, the distinction between affine linear transformations and flats will be crucial for us. Indeed, while two affine linear transformations may have the same image, the performance of the tester on the one need not resemble its performance on the other at all, and we must capitalize on this distinction in our soundness analysis.
1.6 Our Techniques
In this section, we give a brief overview of the techniques underlying the proof of Theorem 1.1. We start by presenting the sparse -flat tester of [33] and then take a somewhat different perspective on it in the form of groups of symmetries. After that, we explain the high level strategy of the proof of Theorem 1.1, and explain some of the unique challenges that arise compared to the analysis of the standard -flat tester. For the sake of presentation, we focus on the case that for the remainder of this section, and switch back to general in Section 2.
1.6.1 The Sparse Flat Tester
The construction of the sparse flat tester of [33] begins with the following observation. In the case, define the bivariate polynomial by
In [33], the authors observe the following two crucial properties of that make it useful towards getting a local testing result:
1. Sparse Support: the support of has size at most ; indeed, if , and , then by Fermat’s little theorem and , so .
2. Detects the Monomials : looking at the inner product of with a monomial , defined as , we get that if for then . Indeed,
so we only have contribution from and it is non-zero. For any other monomial , a similar computation shows that , hence taking an inner product of a function with may be thought of as detecting whether contains some monomial of the form .
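Both properties rest on elementary facts about $\mathbb{F}_q$ that are easy to sanity-check numerically over a prime field (here $q = 7$; the code and names are ours, and this is a one-variable illustration rather than the actual bivariate polynomial): by Fermat's little theorem, $1 - a^{q-1}$ is the indicator of $a = 0$, which is the source of sparse support, and the power sums $\sum_{a \in \mathbb{F}_q} a^{t}$ vanish unless $q-1$ divides $t > 0$, which is what lets an inner product single out one monomial degree.

```python
q = 7  # a prime, so F_q is just arithmetic mod q

# Sparse support: 1 - a^(q-1) is the indicator of a == 0 (Fermat's little theorem),
# so polynomials built from such factors are supported on few points.
indicator = [(1 - pow(a, q - 1, q)) % q for a in range(q)]
print(indicator)  # [1, 0, 0, 0, 0, 0, 0]

# Detection: <x^b, x^(q-1-a)> = sum_x x^(b + q-1-a) is nonzero exactly when b = a
# (for 1 <= a, b <= q-2), so x^(q-1-a) "detects" the monomial x^a.
def detect(a, b):
    return sum(pow(x, b + (q - 1 - a), q) for x in range(q)) % q

a = 3
print([detect(a, b) for b in range(1, q - 1)])  # nonzero (= -1 = 6 mod 7) only at b = a
```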
With this in mind, the sparse tester follows. In [33] it is argued that, to design a local tester for the generalized Reed-Muller code, it suffices to design a tester that detects whether certain canonical monomials exist. Writing , where is even and , it is sufficient to detect whether there are any monomials of the form where and is a small constant (say, ). Using from above, a detector for such monomials is given by
where and . We note that most of the degree of comes from the product of the ’s, while at most of the degree comes from the rest. As the support of is rather sparse, it follows that the support of is also rather sparse. More precisely, the support of has size at most , and as should be thought of as small and is roughly , the support of has size roughly . This yields a query complexity of , which is quadratically better than that given by the flat tester.
As mentioned earlier, the soundness analysis in [33] relies on a black-box result from [25]. The authors show that the detector they construct yields a tester with the same query complexity that rejects functions that are -far from Reed-Muller codewords with probability . To get constant rejection probability one has to repeat the tester times, drastically increasing the query complexity; we defer a more detailed discussion to Section 3.
1.6.2 The Group of Symmetries Perspective
Our first task in proving Theorem 1.1 is to design a tester based on the ideas from [33]. The tester is very natural:
1. Sample a random affine map ; here, and should be thought of as in the previous section. We are going to look at the function , but not query all of its values (indeed, querying all of its values would amount to the -flat tester).
2. For any sequence of degrees such that , take and compute . Reject if this inner product is non-zero for any such degree sequence. Otherwise, accept.
In words, we first take the restriction of to a random -flat, and then apply the detector polynomials of [33]. Although we test for multiple degree sequences (up to ), we will see in Section 3 that the support of is the same in each case. Therefore, we can perform the inner products for all of the degree sequences using the same queries. Another way to think about this tester is that we have the group of symmetries of the Reed-Muller code (the affine linear transformations), and our tester proceeds by first taking a random symmetry from this group, taking a restriction, and then applying the detector of [33].
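Schematically, the two-step structure (restrict along a random affine map, then take an inner product with a fixed detector over its small support) can be sketched as follows. The detector below is a toy one-variable stand-in, not the detector of [33]: over $\mathbb{F}_5$, $g(y) = y$ is orthogonal to $1, y, y^{2}$ but not to $y^{3}$, so the sketch accepts every polynomial of degree at most $2$ and usually rejects a cubic, while querying only the $q - 1 = 4$ points in the image of the support.

```python
import random

q = 5  # prime field for simplicity; the actual tester works over extension fields

def random_affine_map(t, n):
    """A map y -> By + c with B a random n x t matrix and c a random shift, over F_q."""
    B = [[random.randrange(q) for _ in range(t)] for _ in range(n)]
    c = [random.randrange(q) for _ in range(n)]
    return lambda y: [(sum(B[i][j] * y[j] for j in range(t)) + c[i]) % q
                      for i in range(n)]

def sparse_test(f, n, t, detector, trials=50):
    """detector: list of (point y in F_q^t, coefficient g(y)) pairs over supp(g).
    Each trial queries f only on the image of supp(g) under a random affine map
    and accepts iff the inner product <f o A, g> vanishes."""
    for _ in range(trials):
        A = random_affine_map(t, n)
        if sum(f(A(y)) * gy for y, gy in detector) % q != 0:
            return False
    return True

# Toy detector for t = 1 (a stand-in, NOT the detector of [33]): g(y) = y.
detector = [([y], y) for y in range(1, q)]
print(sparse_test(lambda x: (2 * x[0] * x[0] + x[1] + 3) % q, n=3, t=1, detector=detector))  # True (degree 2)
print(sparse_test(lambda x: pow(x[0], 3, q), n=3, t=1, detector=detector))  # rejects the cubic
```

The completeness claim follows from the power-sum identity: $\sum_{y \in \mathbb{F}_5} y^{e} = 0$ for $e = 1, 2, 3$, so the inner product of $g(y) = y$ with any quadratic restriction vanishes.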
1.6.3 The Graph on Affine Linear Transformations and Its Analysis
With the above tester in mind, the next question is how to analyze it. In the case of the flat tester we had a very natural graph associated with the tester (the affine Grassmann graph). The above tester seems related to flats as well, since the image of an affine transformation is a flat; however, there is a key distinction between the two. The above tester only looks at the values of at a few locations in , hence the tester may perform differently on two affine maps even if they have identical images. This lack of symmetry is crucial for the sparseness of the test, but it makes the task of associating a graph with the tester trickier.
To gain some intuition as to what this graph should be, recall that in the setting of flat testers, one could look at the -flat tester for all (not necessarily the smallest one). The relations between these testers for different play a crucial role in all of the works establishing optimal testing results, and in particular it is known that the rejection probability of the -flat tester is at most times the rejection probability of the -flat tester. We will elaborate on this property (referred to as the “shadow lemma” below) a bit later. Another benefit of looking at the various testers is that it gives a natural way of arriving at the affine Grassmann graph, by performing an up-down walk between them. To obtain the edges of the affine Grassmann graph, one can start with a -flat, go up to a -flat that contains it, and then go back down to a -flat contained in that -flat. What is the right analog of this operation in the context of the sparse flat tester?
Going up.
The above discussion suggests looking for higher-dimensional analogs of the tester above, and there is a clear natural analog of the up step: for any , we can look at the sparse flat tester, in which one chooses an affine map randomly and proceeds with the tester as above for all viable degree sequences (but over more variables). Thus, starting with the sparse flat tester and with an affine map, thought of as a matrix over together with an affine shift, we can go “up” to the sparse flat tester by choosing a random vector and looking at the affine transformation corresponding to the matrix whose columns are the same as before, except that we append the new vector as the last column. Just as with the flat tester, it can be observed without much difficulty that if the sparse flat tester rejects when choosing the original map, then the sparse flat tester rejects when choosing the new one.
Going down.
The “going down” step is also simple, but a bit harder to motivate. Taking inspiration from the flat tester, one may want to apply some linear shuffling to the columns of the matrix and then “drop” one of the columns. This does not seem to work, though, as doing so would lead to a map on which the performance of the sparse flat tester seems “completely independent” of its performance on the original one (in the sense that the set of points it looks at in the one would typically be disjoint from the set of points it looks at in the other).
Thus, when going down we wish to do so in a way that keeps the two maps equal on many points. A natural thing to try is to apply an affine transformation that fixes a co-dimension 1 subspace. In this case, the output map is equal to the original on the co-dimension 1 subspace that is fixed. On the other hand, by construction, the map after the up step is equal to the original on a co-dimension 1 subspace as well, namely the subspace whose last coordinate is equal to 0. Therefore, after the down step we get a map which is equal to the original on a subspace of co-dimension 1, which is essentially as similar to it as possible while still being distinct from it.
Put differently, to go down from the sparse flat tester to the sparse flat tester we proceed by choosing scalars uniformly and independently and taking the affine transformation corresponding to the matrix whose th column is , and whose shift is . In words, we drop the final column, but first add an independent random multiple of it to each one of the other columns of the matrix.
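The up and down steps can be sketched concretely over a prime field (illustrative code, with matrices stored column-wise; all names are ours). The check verifies the property described above: after appending a column $u$ and then performing the down step, the resulting map agrees with the original exactly on the codimension-$1$ affine subspace on which the random multipliers of $u$ cancel.

```python
import itertools, random

q, n, k = 5, 4, 2  # small illustrative parameters over a prime field

def apply_affine(cols, shift, y):
    """Evaluate the affine map y -> sum_i y_i * cols[i] + shift over F_q."""
    return [(sum(c[i] * yi for c, yi in zip(cols, y)) + shift[i]) % q
            for i in range(n)]

rand_vec = lambda m: [random.randrange(q) for _ in range(m)]

cols, shift = [rand_vec(n) for _ in range(k)], rand_vec(n)

# Up step: append a fresh random (nonzero) column u to the matrix.
u = rand_vec(n)
while all(v == 0 for v in u):
    u = rand_vec(n)

# Down step: add an independent random multiple of u to each remaining column
# (and to the shift), then drop u.
z, z0 = rand_vec(k), random.randrange(q)
down_cols = [[(c[i] + zi * u[i]) % q for i in range(n)] for c, zi in zip(cols, z)]
down_shift = [(shift[i] + z0 * u[i]) % q for i in range(n)]

# The new map agrees with the original exactly on {y : z . y + z0 = 0}.
points = list(itertools.product(range(q), repeat=k))
agree = [y for y in points
         if apply_affine(cols, shift, y) == apply_affine(down_cols, down_shift, y)]
expected = [y for y in points
            if (sum(zi * yi for zi, yi in zip(z, y)) + z0) % q == 0]
print(agree == expected)  # True
```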
Going up and then down.
Stitching these two operations together, one gets a graph whose vertex set is the set of affine linear transformations and whose edges are very similar in spirit to those of the affine Grassmann graph; this graph is known as the bi-linear scheme graph. The core of our analysis relies on three components:
1. Relating the performance of the tester to expansion in this graph (the shadow lemma), and proving that the set of maps on which the tester rejects has edge expansion at most .
2. Studying the structure of sets in this graph whose expansion is at most and proving that they must have some strong local structure.
3. Using said local structure in a correction argument, proving that if the sparse flat tester rejects with small probability, then the function is close to a Reed-Muller codeword.
1.6.4 The Shadow Lemma
The shadow lemma is a result asserting a relation between the rejection probability of a tester and that of its higher-dimensional counterpart.
The Shadow Lemma for Flat Testers.
In the context of flat testers, the lemma asserts that the fraction of flats of dimension on which the tester rejects is at most times the fraction of flats on which the tester rejects. The name of the result stems from the fact that, letting be the set of -flats on which the tester rejects, the set of flats on which the higher-dimensional tester rejects is exactly the upper shadow of :
This means that on average, an element of the upper shadow has a sizable fraction of its subsets in , and thinking of an edge in the affine Grassmann graph as an up-down step, we get that the probability that a random step from goes to a vertex outside is at most . That is, the edge expansion of , defined as
$$\Phi(A) \;=\; \Pr_{F \in A,\; F' \sim F}\left[F' \notin A\right],$$
is at most .
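As a toy numerical illustration of edge expansion (on a graph far simpler than the affine Grassmann graph; the code and names are ours), the quantity is simply the fraction of edges incident to a set that leave it:

```python
import itertools

def edge_expansion(vertices, adjacent, A):
    """Phi(A) = fraction of edges leaving A among all edges incident to A."""
    A = set(A)
    leaving = incident = 0
    for u in A:
        for v in vertices:
            if adjacent(u, v):
                incident += 1
                leaving += v not in A
    return leaving / incident

# Toy example: the 3-dimensional hypercube graph; A = a 2-dimensional subcube.
verts = list(itertools.product([0, 1], repeat=3))
adj = lambda u, v: sum(a != b for a, b in zip(u, v)) == 1
A = [v for v in verts if v[0] == 0]
print(edge_expansion(verts, adj, A))  # 1/3: each vertex of A has 1 of its 3 edges leaving
```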
The Shadow Lemma for Sparse Flat Testers.
In the context of the sparse flat tester, we wish to argue that something along the same lines holds. Towards this end, fixing the function , we define
Due to the asymmetry between the up and down steps, there is no clear analog of the upper shadow of . However, it is still true that if and we append to some vector to form an affine map , then the sparse tester rejects and it makes sense to define
We prove that starting from and performing a down step to a , with probability at least the sparse flat tester still rejects and hence . In particular, we conclude that (where denotes the ratio between the size of and the total number of affine maps from to , and is defined similarly). Using the same logic as before we conclude that , where here we measure edge expansion with respect to the bi-linear scheme graph.
1.6.5 Expansion on the Bi-linear Scheme Graph
Equipped with the understanding that the rejecting set is small (as we assume the sparse flat tester rejects with small probability) and has small edge expansion, the next question is what sort of structure this implies. In the bi-linear scheme graph there are natural examples of such sets, which are analogs of the zoom-in/zoom-out sets from the context of subspaces. Roughly speaking, there is one type of example which intuitively should be relevant for us, namely zoom-in sets:
There are other examples of small sets which have poor expansion in the bi-linear scheme graph; however, these seem “irrelevant” in the present context. Indeed, after showing that our small set cannot be correlated with any of these other examples (a notion which is often referred to as pseudo-randomness with respect to zoom-outs and zoom-ins on the linear part), we use the theory of global hypercontractivity to prove that there must be and such that . In words, the sparse -flat tester almost always rejects inside .
We remark that the proof that has no “correlation” with any other non-expanding set in the bi-linear scheme graph is rather delicate, and much of the effort in the current paper is devoted to it. To do so, we build upon ideas from [31], and also use a new relation between the sparse -flat tester applied to a function and the -flat tester applied to a related function . As the construction of this related function is somewhat technical, we defer a detailed discussion of it to Section 3.5.
1.6.6 Finishing the Proof via Local Correction
Intuitively, the only way that could be dense inside some is if we started with a Reed-Muller codeword , and perturbed it on a small number of inputs, of which is one. Indeed, in this case we would have that for all and exponent vectors checked by the tester prior to perturbing, and after changing the value of at , we would also change the value of at , breaking our previous equality. This intuition suggests that is a point in which we should fix the value of and get closer to a Reed-Muller codeword, and we show that this is indeed the case.
There are several difficulties that arise when inspecting this argument more deeply. If , then the above reasoning breaks down (as the value of at is multiplied by ); however, in this case it does not make sense for to be very dense inside (as essentially the only input of included in the inner product is , but in any case it is multiplied by ). Indeed, in our argument we show that if is very dense in , then it must be the case that . Moreover, we show that the density of inside and inside is roughly the same for all and in the support of . This last step is crucial for the analysis to go through, and requires us to adapt and strengthen techniques from [31]. At the end of this step, roughly speaking, we conclude that the tester rejects with probability close to $1$ whenever it queries the value of at .
The last step in the argument is to show that we can change the value of at and decrease the rejection probability of the tester by an additive factor of (which is proportional to the probability that the tester looks at ). We do so by a reduction to the same problem for the standard flat tester (which was solved in [31]). The idea is to look at a somewhat larger tester of dimension , fix the first coordinates and let the rest vary, so that the tester becomes a local version of the standard flat tester.
Given that, and iterating the argument, we eventually reach a function that differs from on at most of the inputs (where is the original rejection probability) and passes the test with probability . Hence, is a Reed-Muller codeword, and so is close to a Reed-Muller codeword.
2 Preliminaries
Notations.
For an integer we denote . For a prime power we let be the field of size , and we denote by the set of non-zero elements in it.
2.1 Basic Facts about Fields
Throughout, abusing notation, we define the Reed-Muller code as the set of functions over that can be written as polynomials of degree at most . Henceforth fix and write
where we set , and .
We will need a few basic facts. First, it is a well known fact that has a multiplicative generator which we often denote by . The next lemma uses the existence of a multiplicative generator to estimate sums over .
Lemma 2.1.
For any prime power and integer ,
Proof.
If , then for all , while . Therefore, the sum is one summed up times which is in . For , recall that has a generator . That is, . Since , we may write
On the other hand, if then the sum on the left hand side of the lemma is equal to . ∎
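As a sanity check, the power-sum identity underlying Lemma 2.1 (the sum of x^k over F_q equals -1 when k > 0 is divisible by q - 1, and 0 otherwise) can be verified by brute force. This toy sketch covers prime q only; extension fields would require polynomial arithmetic.

```python
# Brute-force check of the power-sum identity behind Lemma 2.1, for prime
# fields only: over F_q, sum_{x in F_q} x^k equals -1 (i.e. q - 1) when k > 0
# is divisible by q - 1, and equals 0 otherwise.
def power_sum(q: int, k: int) -> int:
    """Sum of x^k over all x in the prime field F_q, reduced mod q."""
    return sum(pow(x, k, q) for x in range(q)) % q

for q in [2, 3, 5, 7]:
    for k in range(0, 3 * (q - 1) + 1):
        expected = (q - 1) if (k > 0 and k % (q - 1) == 0) else 0
        assert power_sum(q, k) == expected, (q, k)
```

Note that the k = 0 case also comes out to 0, since the sum is then q copies of 1, which vanishes in characteristic dividing q.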
2.2 Functions over Fields
Given two functions , we measure the distance between them in terms of the normalized Hamming distance, that is,
The distance of a function from the Reed-Muller code , denoted by , is defined as the minimal distance between and some function . That is,
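As a toy illustration of these definitions, the following sketch computes the normalized Hamming distance and the distance from a function to a tiny Reed-Muller code by brute-force enumeration of all codewords. This is our own minimal setup, feasible only for very small parameters and prime q.

```python
from itertools import product

def prod_pow(x, e, q):
    """Evaluate the monomial with exponent vector e at the point x, mod q."""
    r = 1
    for xi, ei in zip(x, e):
        r = r * pow(xi, ei, q) % q
    return r

def rm_codewords(q, n, d):
    """All evaluation tables of n-variate polynomials of total degree <= d
    over the prime field F_q (brute force; tiny parameters only)."""
    points = list(product(range(q), repeat=n))
    monos = [e for e in product(range(q), repeat=n) if sum(e) <= d]
    words = set()
    for coeffs in product(range(q), repeat=len(monos)):
        tbl = tuple(sum(c * prod_pow(x, e, q) for c, e in zip(coeffs, monos)) % q
                    for x in points)
        words.add(tbl)
    return sorted(words)

def dist(f, g):
    """Normalized Hamming distance between two evaluation tables."""
    return sum(a != b for a, b in zip(f, g)) / len(f)

def dist_to_rm(f, q, n, d):
    """Minimum normalized Hamming distance from f to any degree-<= d codeword."""
    return min(dist(f, w) for w in rm_codewords(q, n, d))

# AND = x0*x1 over F_2^2 (tables ordered by (0,0),(0,1),(1,0),(1,1)):
assert dist((0, 0, 0, 1), (0, 0, 0, 0)) == 0.25
assert dist_to_rm((0, 0, 0, 1), 2, 2, 1) == 0.25  # AND is 1/4-far from degree 1
```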
For two functions , define their inner product as
It is clear that this inner product is bi-linear. Monomials are the basic building blocks of all polynomials over , and the following notation will be convenient for us to use:
Definition 1.
For an exponent vector , we define the monomial
The next lemma allows us to compute inner product between monomials:
Lemma 2.2.
For , we have that
Proof.
By bi-linearity, the inner product of two monomials factors coordinate-wise into one-dimensional power sums, each of which is evaluated by Lemma 2.1. ∎
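The orthogonality pattern underlying Lemma 2.2 (our reading of the intended statement: the inner product of two monomials is nonzero precisely when every coordinate-wise exponent sum is a positive multiple of q - 1) can be checked by brute force over a small prime field.

```python
from itertools import product

def inner(a, b, q, n):
    """Inner product of the monomials x^a and x^b, as a sum over F_q^n mod q
    (q prime; an unnormalized toy version of the paper's inner product)."""
    total = 0
    for x in product(range(q), repeat=n):
        v = 1
        for xi, ai, bi in zip(x, a, b):
            v = v * pow(xi, ai + bi, q) % q
        total = (total + v) % q
    return total

q, n = 3, 2
for a in product(range(q), repeat=n):
    for b in product(range(q), repeat=n):
        # Nonzero iff each exponent sum is a positive multiple of q - 1.
        nonzero = all(ai + bi > 0 and (ai + bi) % (q - 1) == 0
                      for ai, bi in zip(a, b))
        assert (inner(a, b, q, n) != 0) == nonzero, (a, b)
```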
2.3 Affine Linear Transformations and the Affine Bi-linear Scheme
In this section we present the Affine Bi-linear scheme, which plays a vital role in our arguments.
Definition 2.
We denote by the set of affine transformations .
Each affine transformation consists of a linear part and a translation , and we will use the writing convention to refer to the fact that is the affine transformation such that for . In words, is the linear part of and is the affine shift. We stress that an affine transformation is not necessarily full rank.
Definition 3.
The density of a set is defined as .
Oftentimes, we drop the subscript if it is clear from the context. Our analysis will require us to view affine transformations as vertices of some suitable test graph. For example, for the (standard) -flat test, this graph is the affine Grassmann graph, . The vertices of the graph are all of the -flats in , and two vertices and are adjacent if they intersect in a -flat, that is, . Below we define an analogous graph structure on affine transformations, which we refer to as the affine bi-linear scheme graph.
Definition 4.
The affine bi-linear scheme graph, , has vertex set . Two vertices are adjacent if and only if they are equal on an -dimensional affine subspace. This condition can also be written as .
Write to denote an adjacency. We remark that the affine Grassmann graph can be obtained from the affine bi-linear scheme by identifying each with its image (that is, closing the set under some group operation); however, as explained in the introduction, this distinction will be crucial for us.
2.4 Expansion and pseudo-randomness in the Affine Bi-linear Scheme
We will use the standard notion of edge expansion, defined as follows.
Definition 5.
For a regular graph and a set of vertices , we define the edge expansion of as
In words, the edge expansion of is the probability to escape it in a single step of the random walk on .
We will use to denote edge expansion on . When and are clear from context we drop the subscripts. Later on in the paper, we will also consider edge expansion over the affine Grassmann graph ; abusing notations, we will denote edge expansion there also using the notation , and it will be clear from context which graph we are considering.
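As a toy illustration of this definition (on a small cycle graph rather than the affine bi-linear scheme), edge expansion can be computed directly as the probability that one step of the random walk started uniformly in the set escapes it.

```python
# Edge expansion of a vertex set S in a d-regular graph G: the probability
# that a single random-walk step from a uniform vertex of S leaves S.
def edge_expansion(adj, S):
    """adj: dict vertex -> list of neighbors (d-regular); S: set of vertices."""
    d = len(next(iter(adj.values())))
    escaping = sum(1 for v in S for u in adj[v] if u not in S)
    return escaping / (d * len(S))

n = 8
cycle = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
# A contiguous arc of the cycle has only 2 escaping edge-endpoints out of 8:
assert edge_expansion(cycle, {0, 1, 2, 3}) == 0.25
```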
2.4.1 The Up-Down View of the Random Walk
It will be helpful for us to consider the following equivalent, two-step process for sampling a neighbor of in the affine bi-linear scheme graph:
1. Go Up: Choose a random . Let be the matrix obtained by appending the as a new column to .
2. Go Down: Choose a random , where the first rows of are the identity matrix and the first entries of are , and at least one entry out of the last row in and the last entry in is nonzero.
3. Output .
It is easy to see that for , each column of is equal to the corresponding column in plus some multiple of and likewise for and .
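The two-step process above can be sketched as follows; the dimension bookkeeping is our reading of the construction, so treat this as an illustrative sketch rather than the paper's exact sampler. Note how each output column is the corresponding old column plus a multiple of the newly appended column, matching the remark above.

```python
import random

# Sketch of the up-down neighbor sampler in the affine bi-linear scheme:
# an affine map is a pair (A, b) with A an n x k matrix over F_q and b in F_q^n.
def updown_neighbor(A, b, q):
    n, k = len(A), len(A[0])
    # Go up: append a uniformly random new column c to A.
    c = [random.randrange(q) for _ in range(n)]
    Aup = [row + [c[i]] for i, row in enumerate(A)]
    # Go down: compose with a (k+1) x k matrix M whose first k rows form the
    # identity, and a shift v in F_q^{k+1} whose first k entries are 0; the
    # last row of M together with the last entry of v must not be all zero.
    while True:
        last = [random.randrange(q) for _ in range(k + 1)]
        if any(last):
            break
    M = [[int(i == j) for j in range(k)] for i in range(k)] + [last[:k]]
    v = [0] * k + [last[k]]
    # Column j of the output equals (old column j) + last[j] * c.
    Anew = [[sum(Aup[i][t] * M[t][j] for t in range(k + 1)) % q
             for j in range(k)] for i in range(n)]
    bnew = [(b[i] + sum(Aup[i][t] * v[t] for t in range(k + 1))) % q
            for i in range(n)]
    return Anew, bnew

# Example: a neighbor of an embedding F_3^2 -> F_3^3.
Anew, bnew = updown_neighbor([[1, 0], [0, 1], [0, 0]], [0, 0, 0], 3)
```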
2.4.2 Non-expanding Sets in the Affine Bi-linear Scheme
As explained in the introduction, our proof considers the set of affine-transformations on which the test rejects as a set of vertices in the affine bi-linear scheme graph. We prove that has small edge expansion in that graph, and then use global hypercontractivity in order to derive some conclusion about the structure of , which is in turn helpful towards a local correction argument.
To facilitate this argument, we must first understand the structure of a few canonical examples of non-expanding sets in . In the affine Grassmann graph one has the following examples:
• zoom-ins: , for .
• zoom-outs: , for a hyperplane .
• zoom-ins on the linear part: , for .
• zoom-outs on the linear part: , for a hyperplane .
It is not hard to see that each example has expansion at most ; also, it is an easy calculation to show that the density of each one of these sets is small (and is vanishing provided that is significantly smaller than and both go to infinity). Each of these examples also has a counterpart in :
• zoom-ins: , for and .
• zoom-outs: , for , , and .
• zoom-ins on the linear part: , for and .
• zoom-outs on the linear part: , for , and .
Likewise, one can check that each example here also has expansion at most in . Our argument will require us to show that, in a sense, these examples exhaust all small sets of vertices in whose edge expansion is at most . To formalize this, we first define the notion of pseudo-randomness. Intuitively, we say that a set is pseudo-random with respect to some example – say zoom-ins for concreteness – if may only have little correlation with ’s, in the sense that the density of inside such sets is still small. More precisely:
Definition 6.
Let and let .
1. We say that is -pseudo-random with respect to zoom-ins if for each and we have that
2. We say that is -pseudo-random with respect to zoom-outs if for each and we have that
3. We say that is -pseudo-random with respect to zoom-ins on the linear part if for each and we have that
4. We say that is -pseudo-random with respect to zoom-outs on the linear part if for each we have that
Typically, global hypercontractivity results say that a small set with expansion bounded away from cannot be pseudo-random (where the notion of pseudo-randomness is in the same spirit but a bit more complicated; see [13, 29] for example). For results in coding theory, however, it seems that a somewhat different (yet very much related) type of result is needed [31]. Intuitively, in this type of result the assumption on the expansion of the set is much stronger, and roughly speaking asserts that the expansion of is almost “as small as it could be”. In exchange for such a strong assumption, one often requires a stronger conclusion regarding the non-pseudo-randomness of the set . For instance, in [31] it is shown that a set of vertices with expansion at most which is highly pseudo-random with respect to three of the above types of examples must be highly non-pseudo-random with respect to the fourth type, in the sense that it almost contains a copy of such a set. For our purposes we require an analogous statement for , which is the following statement:
Theorem 2.1.
If satisfies
1. ,
2. is -pseudorandom with respect to zoom-outs, zoom-outs on the linear part, and zoom-ins on the linear part,
3. .
Then there exist and such that , where is the density of in .
3 Local Testers for the Reed Muller Code
We now formally describe both the sparse flat test and the full flat test. Although we focus on the sparse test, at times it will be convenient to reduce to the full test to aid our analysis.
3.1 The -flat Tester and the Inner Product View
At a high level, both the -flat test and the sparse -flat test can be described in the following way:
1. Restriction: Choose a random for some suitable dimension .
2. Local test on restriction: Check if the -variate function is indeed degree . If so, accept, and otherwise reject.
The difference between the two tests lies in how the “check” in step 2 is done. The straightforward way to perform this check is by querying on all points in , interpolating and checking its degree. Indeed, this is precisely how the -flat test is defined. In that case, it is easy to see that the result of the test depends only on the image of ; this is because and have the same degree for any full rank . Therefore, the -flat test can be rephrased in its familiar form as follows:
1. Choose a random -flat .
2. Accept if and only if .
However, there are other ways of trying to test whether has degree at most or not. One way to do so is by taking inner products that check whether certain high-degree monomials exist in using Lemma 2.2. A simple way to do this is as follows:
Accept if and only if for all such that .
By Lemma 2.2 it is not hard to see that this condition is equivalent to . Furthermore it is clear that calculating all of the inner products requires querying every point in the support of some that is used – which is in this case. Hence, taking inner product with all of these monomials does not lead to any savings in terms of the query complexity.
It turns out that there are more clever choices of “test polynomials” (which are not just monomials) that allow one to design an inner-product test above which is more query efficient. This idea was used by [33] who introduced the sparse -flat test, which we present formally in the next subsection.
3.2 The sparse -flat test
Our presentation of the sparse -flat test is somewhat different from the one in [33], and this view will be necessary for our analysis. Recall that we write where is divisible by and . For a vector and a set denote . Define the -variate polynomial
For any degree sequence define
The sparse -flat tester works as follows:
1. Choose a random affine transformation .
2. Accept if and only if for all such that .
Recall, we set the parameter to be . Essentially, just needs to be large enough so that the degree sequences in step can account for monomials of degree up to .
We refer to a degree sequence satisfying the inequality in step as valid throughout. As is fixed throughout, the notion of valid degree sequences will also not change throughout the paper. Hence, we say that rejects or is rejected, if for some valid . Otherwise we say that accepts or is accepted. We define to be the set of ’s on which the tester rejects:
(1)
and let be the probability that the test rejects. Clearly, we have that
3.3 Assuming is Sufficiently Large
In this section we argue that without loss of generality we may assume that . Indeed, we may view an -variate function as a function in some -variables, for some sufficiently large . Call this function and define it by for any , . We show that , which implies that we can view the test as over but still have the in Theorem 1.1 be .
Lemma 3.1.
Let and be as defined above. Then .
Proof.
We first show that . Suppose is the degree function such that . Define as the extension of to variables (in the same way as is defined). Let denote the -variate function where the last variables are set to . Define similarly. Note that for any , , . We then have
Since is degree , so is , thus the above implies that .
For the other direction, suppose is the degree , -variate function such that . Keeping the same notation as above, we have that
Hence, there is a choice of such that
Since is degree , must be degree as well, so the above inequality implies that , completing the proof. ∎
Using this lemma, we can always view the test as being over instead of , and show that the sparse -flat test rejects with probability at least , where is the number of queries and is some constant. Since none of these parameters depend on , we can use the lemma above and get that this rejection probability is the same as .
Henceforth we assume that . This assumption is helpful as it allows us to bound the number of non-full rank affine transformations from as we argue in the following remark.
Remark 3.2.
The fraction of that is not full rank is at most . The same holds for the fraction of that are not full rank conditioned on for arbitrary , . To see that, note that we can upper bound the probability that the linear part of is not full rank by
For the second part, note that conditioning on does not change this estimate, as we can still choose the linear part uniformly at random and set the affine shift so that .
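The kind of estimate used in this remark can be sanity-checked in a tiny case: the exact probability that a uniformly random n x k matrix over F_q (with k <= n) has full rank is the product of (1 - q^(i-n)) for i = 0, ..., k-1, which we verify for q = 2, n = 3, k = 2 by enumeration.

```python
from itertools import product

def rank_f2(rows):
    """Rank of a matrix over F_2 by Gaussian elimination (rows of 0/1 entries)."""
    rows = [list(r) for r in rows]
    rank, ncols = 0, len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

q, n, k = 2, 3, 2
total = full = 0
for flat in product(range(q), repeat=n * k):
    M = [flat[i * k:(i + 1) * k] for i in range(n)]
    total += 1
    full += rank_f2(M) == k
predicted = 1.0
for i in range(k):
    predicted *= 1 - q ** (i - n)  # prod_{i<k} (1 - q^{i-n})
assert abs(full / total - predicted) < 1e-12  # 42/64 = 21/32 in this case
```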
3.4 Some Basic Facts of the Sparse Flat Tester
We now collect some basic facts regarding the sparse flat tester that will be necessary for our analysis.
3.4.1 Basic Soundness Properties
We begin by describing why this tester works. Our presentation herein will be partial, focusing on the most essential properties necessary for our analysis, and we refer the reader to Appendix B or [33] for more details.
Consider the monomials , for such that for some such that , and . By Lemma 2.1, these monomials, , must satisfy:
• for
• .
More generally, we can explicitly express any inner product with as follows.
Lemma 3.3.
For , and we have that
where is a constant depending on the coefficients of , and the sum is over satisfying both the first condition above and for .
Proof.
We have
By Lemma 2.2, for each we have
only if the monomial appears in . Since the degree of is , this can only be the case if . Likewise,
only if for each . ∎
From Lemma 3.3 it is clear that rejects only if , as in this case does not have any monomials satisfying the above. Therefore, if is indeed of degree then it is accepted with probability . But is it necessarily the case that the sparse flat tester rejects a function with positive probability if its degree exceeds ? This was shown to be true in [33] using canonical monomials. As we need this fact, and in order to be self-contained, we include a proof in Appendix B.
Theorem 3.4.
A function passes the sparse -flat test with probability if and only if has degree at most .
Proof.
Deferred to Appendix B. ∎
3.4.2 Sparsity
The next important feature of the sparse flat tester, as the name suggests, is that it has a small query complexity. Note that the number of queries the test makes is proportional to the size of the support of the test polynomials , and the next lemma shows that for any of the in of step 2, has a sparse support. Moreover, this support is the same no matter which is chosen. As a result, calculating does not require querying all points in the domain of .
Lemma 3.5.
The support of has size at most and is contained in the set
where denotes the hyperplane given by .
Proof.
Suppose is not in the set described. Then in the expression for , each term by Fermat’s Little Theorem. Moreover, the denominator is nonzero, so a direct calculation yields . ∎
Lemma 3.6.
For any , we have
Proof.
Since , the result is evident. ∎
Henceforth, we let . For any , the sparse -flat test can be done by only querying on points in , which gives the following upper bound on query complexity.
Lemma 3.7.
The sparse -flat test has query complexity at most . Since we can choose any , the query complexity can be as low as
3.5 Relating the Sparse Flat Tester and the Flat Tester
In this section we describe a way to interpret the sparse -flat test as a -flat test. This relation will be useful later on as it allows us to borrow techniques from the analysis of the -flat test from [31].
Fix an affine transformation for some and let denote the set of affine transformations of the following form,
(2)
for . We call the affine transformations in restrictions. In words, composing with a random in corresponds to the operation of preserving the first columns, and randomizing the rest of them. As we explain below, this allows us to view the -sparse test as having the -flat tester embedded within it on some restricted-type function of .
More precisely, note that we can sample by choosing as well as a restriction uniformly at random and outputting . In Lemmas 3.8 and 3.9 we show that the result of the test is “entirely dependent” on . To this end, after fixing we define by
Also for let be the -flat given by . The following lemma gives a relation between the sparse flat tester rejecting , and the (standard) flat tester rejecting on .
Lemma 3.8.
For any , rejects if and only if .
Proof.
Write in the form of (2) and let be the affine transformation given by . Recall that rejects if and only if
for some such that . For any , we can rewrite this inner product on the right hand side as follows.
However, if and only if contains the monomial . It follows that rejects if and only if . Finally, since the degree of a polynomial is invariant under affine transformations, it follows that this is equivalent to . ∎
We remark that Lemma 3.8 in particular implies that the sparse -flat test is invariant under any affine transformation that only affects the last -coordinates. In particular, we get:
Lemma 3.9.
For any , rejects if and only if rejects .
Proof.
We apply Lemma 3.8 with and . Then, applying does not change the degree of and the result follows. ∎
After fixing this lemma allows us to think of the remainder of the sparse -flat test, i.e. choosing a restriction , as a -flat test on . Setting , we can view the sparse -flat test as follows.
1. Choose a random .
2. Perform the standard -flat test on the -variate function defined according to the above.
This view of the test will allow us to borrow some concepts and facts from the -flat test which we now introduce. First, we formally define the upper shadow for flats.
Definition 7.
For a set of -flats, , define the upper shadow as follows
The following shadow lemma is shown in [10] and is used extensively in the analysis of [31]. This lemma will play a role in our analysis as well, so we record it below.
Lemma 3.10.
For a function , let be the set of -flats, , such that . Then,
where the measures are in the sets of -flats and -flats respectively.
We can also define an analogous notion of upper shadow for affine transformations that fits with the view of the sparse flat tester just introduced.
Definition 8.
For a set of affine transformations define
We will need the following simple result about the upper shadow of sets of affine transformations.
Lemma 3.11.
For a set of affine transformations ,
Proof.
We can sample by first sampling and then choosing a restriction and outputting . We can only have if , so the result follows. ∎
4 Locating a Potential Error
We now begin our proof of Theorem 1.1. Recall defined in (1) and that ; we denote for simplicity. In this section we show that if (where is a large absolute constant to be determined), then we may find a potential erroneous input for , in the sense that the sparse flat tester almost always rejects if the chosen test polynomial has in its support.
We begin by checking that the set satisfies the conditions of Theorem 2.1.
4.1 The Edge Expansion of in the Affine Bi-linear Scheme
We start by proving an upper bound on the expansion of . For a matrix and column vector , let denote the matrix with appended as an additional column. Consider the following procedure for sampling an edge in .
1. Choose uniformly at random.
2. Choose uniformly at random conditioned on and let .
3. Choose a uniformly random matrix of the form , and a uniformly random such that the first entries in are zero.
4. Set .
The following lemma will allow us to show is poorly expanding. This takes the place of the “shadow” lemma from [31].
Lemma 4.1.
Let be an arbitrary polynomial and let be an -affine transformation such that . Fix chosen in step , and let . Then,
where is sampled according to step 3.
Proof.
Let be the last row of and let be the last entry of . Choosing uniformly at random amounts to choosing and each uniformly at random in . To obtain the result we will view as a function in these random values and apply the Schwartz-Zippel lemma.
The function can be obtained by composing the -variable function, , with . Then is just the -variable function,
By Lemma 3.3, is some linear combination of the coefficients of , but these coefficients are polynomials in of total degree at most , because can be written as a polynomial with individual degrees at most . Thus, is a polynomial in of total degree at most . Moreover, this polynomial is nonzero because when setting , it evaluates to
By the Schwartz-Zippel lemma it follows that . ∎
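The Schwartz-Zippel bound invoked here says that a nonzero n-variate polynomial of total degree d over F_q vanishes on at most a d/q fraction of F_q^n. A brute-force check on one illustrative polynomial of our own choosing:

```python
from itertools import product

# Schwartz-Zippel sanity check: a nonzero polynomial of total degree d over
# F_q has at most a d/q fraction of zeros in F_q^n. Toy example with q = 5.
q, n, d = 5, 2, 2
poly = lambda x: (x[0] * x[1] + 3 * x[0] + 1) % q  # total degree 2, nonzero
zeros = sum(poly(x) == 0 for x in product(range(q), repeat=n))
assert zeros / q ** n <= d / q  # here 4/25 <= 2/5
```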
As an immediate corollary, we get that the set of affine transformations that reject is poorly expanding.
Lemma 4.2.
The set of -affine transformations that reject , , satisfies
where the expansion is in .
Proof.
Fix and choose adjacent to in . By assumption there is some valid and such that . Since can be chosen according to the process described at the start of the section, we can apply Lemma 4.1 with ,
It follows that . ∎
4.2 Pseudorandom with respect to zoom-outs and zoom-outs on the linear part
Next, we show that the set is pseudorandom with respect to zoom-outs and zoom-out on the linear part.
Lemma 4.3.
The set is -pseudorandom with respect to zoom-outs and zoom-outs on the linear part.
Proof.
We show the proof for zoom-outs. The argument for zoom-outs on the linear part is very similar.
Fix a zoom-out and denote by the measure of in , that is, . Fix an arbitrary such that . Sample an -affine transformation by first choosing uniformly at random, and then according to steps 3 and 4 with . That is, if the th column is , then the th column of is , and the affine shift is for uniformly random and in . By Lemma 4.1, if , then with probability at least , so
On the other hand, notice that the distribution of is uniform over . Indeed, fix a and let denote the th column of . Then the only way to get is to choose,
and such that
Since is uniform over , we have
and hence . Since this applies for all zoom-outs, the result follows. A similar argument works for zoom-outs on the linear part.
∎
4.3 Pseudorandomness with respect to zoom-ins on the linear part
We next show pseudorandomness with respect to zoom-ins on the linear part. The argument is more involved. In particular, we use the relation to the -flat test to reduce the proof of this statement to analogous statements on affine Grassmann graphs. We then prove this statement using ideas similar to [31]. Formally, in this section we show:
Lemma 4.4.
The set is pseudorandom with respect to zoom-ins on the linear part .
We start with a few definitions. For a vector (of arbitrary length greater than ), let be the vector that is equal to in its first coordinates and in all of its other coordinates. Let be the vector restricted to its first coordinates. For a matrix , denote
In words, is the image of elements that agree with on the first coordinates and are not identically on the rest of the coordinates.
Due to the asymmetry induced by the test, we have a different argument depending on the pattern of . We will first show that is pseudorandom with respect to zoom-ins on the linear part, , where , by a direct argument. We will then prove that is pseudorandom with respect to any zoom-in on the linear part by a reduction to this case.
Lemma 4.5.
The set is pseudorandom with respect to zoom-ins on the linear part with .
The proof of Lemma 4.5 requires some set-up and auxiliary statements, which we present next. Fix as in the lemma and suppose that has fractional size in , that is,
Consider another such that , and . In words is equal to in its first coordinates, and nonzero in its last coordinates. There is an invertible matrix whose first rows and columns form the identity matrix, , such that . Composition with gives a bijection between the sets and . Moreover, by Lemma 3.9, if and only if . Therefore, for any satisfying , and it holds that
Put another way, has fractional size in the set of such that , and thus we define
We can sample an by first choosing , then choosing uniformly at random, and outputting conditioned on . Since the probability that is and this can only happen if , we get that
where . Recall is the set of -affine transformations, , for which there exists a restriction such that , and is the same operation applied times.
The following lemma shows that there is a way to fix the first columns of the test so that the rejection probability remains small:
Lemma 4.6.
There exists such that choosing uniformly at random and setting , , we have
Proof.
Choose a full rank -affine transformation uniformly conditioned on , then choose an uniformly and set . For the remainder of this proof all probabilities are according to this distribution. Define the following events.
• .
• .
• .
• .
First we note that
and as a result,
We can then conclude
where we use the fact that , and so .
Notice that the numerator of the last fraction is at most , while the denominator is at least , so this probability is at most . Thus, there exists such that conditioned on this , the probability over is at most . ∎
Fixing this , the remaining test graph is now over and is isomorphic to . Moreover, by Lemma 3.8, we may focus solely on the flat given by for . This allows us to define as done in Section 3.5:
By Lemma 3.8 we now work with the standard -flat test for . The condition that translates into the condition that the -flat does not contain the point equal the last coordinates of , where is the unique point such that .
Translating Lemma 4.6 gives
This is the same result shown at the start of the proof of [31, Claim 3.4], and the rest of the argument follows as therein. Let
Lemma 4.7.
The set is nonempty.
Proof.
Since we chose , there is at least one restriction such that rejects , and hence there is a -flat on which has degree greater than . It follows that if we sample a random -flat , then with probability at least . If were empty, however, we would have only if . In this case the probability that is at most
Therefore is nonempty. ∎
We now proceed to the proof of Lemma 4.5.
Proof of Lemma 4.5.
By Lemma 4.7 is non-empty, so there must be a flat such that
is nonempty. Let denote the uniform measure over . This graph is isomorphic to , but the ground space is viewed as . We first claim that since , it must be at least . If not, then . However, is pseudo-random with respect to zoom-ins and zoom-ins on the linear part simply due to its size. Indeed, any zoom-in or zoom-in on the linear part already has measure , so can only contain a fraction. Furthermore, is also pseudo-random with respect to zoom-outs and zoom-outs on the linear part by a similar argument to Lemma 4.3. Finally, by a similar argument to Lemma 4.2, . Altogether, this then contradicts Theorem A.1, so we conclude that .
Thus, there exists such that . Fix such a and sample a uniform -flat conditioned on , a uniform -flat , and consider . We may think of as being defined by a system of independent linear equations . That is, is the subspace of that satisfies these equations. Likewise, is given by the restriction of linear equations, . The probability that all linear equations are linearly independent is at least
When all linear equations are linearly independent, is uniform over . Thus,
If , then , where the upper shadow is taken with respect to , so it follows that
However, by Lemma 3.10,
so altogether we get that
To conclude, note that the left hand side is at most the probability that has degree greater than over uniform such that . By assumption, this probability is at most so
We are now ready to prove Lemma 4.4.
Proof of Lemma 4.4.
We show that the set is pseudorandom with respect to zoom-ins on the linear part for any .
If , then we are done by Lemma 4.5, so suppose that , meaning is zero outside of its first coordinates. Clearly must have at least one nonzero coordinate, as otherwise and is either all of (if ) or empty (if ). The former case cannot happen by the assumption that , while the latter case is trivially true. Without loss of generality suppose that .
For each , there is an satisfying such that . Furthermore, as , there must be some “special index” such that . Take the most common special index over all and without loss of generality suppose it is . Thus, for at least of , we have for some such that , and . Let denote the set of these transformations.
Consider the map, that sends to and keeps all other coordinates unchanged. That is,
For any , we claim that
where is the exponent vector such that , , and .
Indeed, is a linear combination of the coefficients of , which is a polynomial in . In fact, by Lemma 2.2, it is a linear combination of coefficients of monomials where the degree of is . In every monomial of , however, the degree of and the degree of add to at most , so it follows that the coefficients that contribute to have degree in at most . Therefore is a polynomial in of degree at most . Finally, this polynomial is nonzero because it evaluates to a nonzero value at . The inequality then follows from the Schwartz-Zippel lemma.
In particular, this means for each there is a nonzero such that , and we choose to be the most common such (we remark that the fact that is the reason we needed probability of rather than ). Thus, for at least of the transformations in it holds that . Letting , we get that for at least fraction of the it holds that , and as is a bijection it follows that
Finally, note that as its coordinate is non-zero by design. Applying Lemma 4.5 we get that , and as , combining with the above we get that
5 Correcting the error and iterating
The results from the previous subsections, along with the assumption that , establish that satisfies the following properties:
1. .
2. .
3. is -pseudorandom with respect to zoom-outs and zoom-outs on the linear part.
4. is -pseudorandom with respect to zoom-ins on the linear part.
Since we take , it follows from Theorem 2.1 with and that there exists a pair such that
(3)
for a large enough .
How shall we go about using this information? In words, (3) tells us that is dense on transformations sending to , and this suggests that the point is an erroneous point for which we should fix and, as a result, improve the acceptance probability of the test. Furthermore, it stands to reason that this change should affect all tests that come from transformations in .
Upon inspection however, these tests can only be affected in the case that ; otherwise, in the test we perform, the value is multiplied by , and hence changing the value of at does not affect the test at all. Thus, for the above strategy to work, we must prove that (3) holds for a pair wherein is in .
Lemma 5.1.
There exists and such that .
Once we have shown this lemma, we will be able to show that correction at can reduce the rejection probability of tests in ; however, in order to show that the tester is optimal, we need to fix a larger fraction of tests with a single correction. Specifically, we need to be able to fix tests such that for any . In order for such an argument to work, we will have to show that is dense in .
Lemma 5.2.
If there exists an such that , then has density at least in .
In words, is the set of -affine transformations such that for some . In the next two subsections we prove Lemmas 5.1 and 5.2. After showing these lemmas, the way to perform corrections will become obvious and we quickly conclude the proof of Theorem 1.1. Both Lemmas are proved in a similar manner, and we first present a general lemma that will be used in both proofs. Let be an arbitrary polynomial, let be a degree parameter, let denote uniform measure over -flats in , and let be an arbitrary point. Define the following two sets:
Keep and (the large absolute constant) the same as we have defined, so that is small. We will use the following result, which is an extension of the results in Section 3.2 of [31]:
Lemma 5.3.
Keep and as defined above and suppose . If then . Moreover there is a value such that after changing to , .
As a consequence of these two points, after changing the value of to .
Proof.
The proof is deferred to Section C. ∎
5.1 Proof of Lemma 5.1
To establish Lemma 5.1 we show that is pseudorandom with respect to zoom-ins outside of the support, hence the zoom-in found in (3) must be in the support of . Specifically, we show:
Lemma 5.4.
For any and any , .
We begin the proof of Lemma 5.4 by assuming, for the sake of contradiction, that the lemma is false. That is, suppose there is such that . Since , there are consecutive coordinates . It will be convenient to swap the order of the coordinates so that these are the last coordinates. Thus, the polynomial for is now,
The test with affine transformation checks if for all valid with as defined above. Reordering the variables in this way changes the support, so that our assumption becomes, without loss of generality, , which will make our notation later a bit simpler (this is the only reason for reordering the variables). We will once again sample a larger affine transformation and choose a restriction to obtain a , however the restrictions this time will be slightly different. We will require the restrictions to be full rank and we will fix the first columns/coordinates of as opposed to the first as before. Let consist of affine transformations of the form:
(4)
where is full rank and . For of this form, let . Also, for an affine transformation , define , where recall is with all coordinates outside of the first set to zero. Sample a as follows:
1. Choose a full rank such that .
2. Choose .
3. Output .
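Sampling procedures of the kind above repeatedly draw a uniformly random full-rank matrix over a finite field. A minimal sketch of this primitive, assuming a prime field and using rejection sampling (conditioning uniform samples on full rank preserves uniformity on the full-rank set); this is our own illustration, not code from the paper:

```python
import random

def row_reduce_rank(rows, q):
    """Row rank of a matrix over the prime field F_q (q prime),
    computed by Gaussian elimination with Fermat inverses."""
    rows = [list(r) for r in rows]
    if not rows:
        return 0
    rank, ncols = 0, len(rows[0])
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % q != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], q - 2, q)  # inverse exists since q is prime
        rows[rank] = [(x * inv) % q for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % q != 0:
                c = rows[i][col]
                rows[i] = [(a - c * b) % q for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

def random_full_rank_matrix(t, n, q, rng=random):
    """Rejection-sample a uniformly random rank-t matrix in F_q^{t x n}, t <= n."""
    while True:
        M = [[rng.randrange(q) for _ in range(n)] for _ in range(t)]
        if row_reduce_rank(M, q) == t:
            return M
```

Since a uniform t x n matrix is full rank with probability at least 1 - q^{t-n-1} - ... - q^{-1}, the rejection loop terminates quickly for the parameter ranges used here.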
After is chosen, the first columns/coordinates of ’s linear part/affine shift are fixed, while the remaining parts are composed with some random restriction. Thus, once is fixed there is a unique such that and we can only have if . In particular, this only happens if where by design this satisfies . Setting to be the last coordinates of , it follows that only if . Let denote measure in . Define
If, after choosing and constructing as above, we then condition on , then is uniformly distributed over full-rank transformations in conditioned on its image not containing . Therefore, , and with probability at least we have .
On the other hand, if after choosing we choose uniformly such that , the distribution over is uniform over full rank . By Remark 3.2, the fraction of affine transformations in that are not full rank is at most and by our assumption the set of rejecting transformations is dense in . Thus a simple averaging argument shows that the fraction of full rank that reject is also large, and in particular, strictly greater than . Therefore with probability strictly greater than over , there is at least one such that is full rank and rejects.
It follows that there exists a full rank such that the following two hold:
• ,
• There exists a such that is full rank and rejects.
Fix this and keep , , and as defined. We now show how this leads to a contradiction. The idea is to apply Lemma 5.3 and argue that there is a point at which we can make a correction and cause to be accepted for all possible , and in particular for . Since we assumed that there was high rejection probability on a zoom-in outside the support, we will show that the correction is made at a point not looked at by the test . These two facts together form a contradiction because changing at a point that is not in the set of points for cannot change the result of the test . In order to apply Lemma 5.3, however, we need a statement similar to , but for a set of flats instead of a set of affine transformations. To this end, we will argue that the set of for is small as well.
We proceed with the formal argument. The first step is to define an auxiliary polynomial similar to that of Lemma 3.8. Suppose rejects because for a fixed valid . Using this , define by
For any written according to Equation (4) it is easy to check that
so after fixing we can view the test as being performed on with the transformation . By Theorem 3.4, if , then for all . This leads to the following fact:
This fact is similar to Lemma 3.8; however, we only have one direction. Namely, it is not true that always rejects if , because the test on only checks inner products with and not with all monomials of degree up to .
Using this fact we can now relate to the measure of the set of -flats not containing such that . Define the set of -flats,
Equivalently, is the set of all -flats not containing such that .
To analyze the fractional size of this set, we can choose by choosing a -flat, and then choosing a basis for the flat. More formally, with fixed,
1. Choose such that , and write according to Equation (4).
2. Choose , and replace with . Let the resulting restriction be .
3. Output .
We claim that if initially , then with probability at least , the outputted rejects.
Lemma 5.5.
Suppose . Then with probability at least over in the second step, rejects. That is, .
Proof.
Write in the form of Equation (4). Then the assumption is equivalent to .
Choose a full rank uniformly at random and consider
Since , Lemma B.5 implies that there is at least one choice of invertible , and hence an , that makes the above nonzero. Therefore, we can view as a nonzero polynomial in the entries of . Since the degree of is at most , the inner product is a nonzero polynomial in the entries of of total degree at most . By the Schwartz-Zippel Lemma, with probability at least , over the entries of , . Since has to be invertible, we need to ignore the choices of entries that are non-invertible, but this is at most a fraction. Overall, we still get that with probability at least over , rejects . ∎
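The Schwartz-Zippel step in this proof can be sanity-checked by brute force over a small field: a nonzero polynomial of total degree d over F_q vanishes on at most a d/q fraction of points. A toy check (the polynomial and field below are illustrative choices of ours, not the ones in the proof):

```python
import itertools

def root_fraction(poly, nvars, q):
    """Fraction of points of F_q^nvars on which poly vanishes mod q."""
    pts = itertools.product(range(q), repeat=nvars)
    zeros = sum(1 for p in pts if poly(*p) % q == 0)
    return zeros / q ** nvars

q = 7
# x*y + 2*z is a nonzero polynomial of total degree 2, so by
# Schwartz-Zippel its root fraction over F_7 is at most 2/7.
frac = root_fraction(lambda x, y, z: x * y + 2 * z, 3, q)
assert frac <= 2 / q
```

In this example the fraction is exactly 1/7: for every pair (x, y) there is a unique z with x*y + 2*z = 0, since 2 is invertible mod 7.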
Letting denote the uniform measure in , Lemma 5.5 implies that . We are now close to being able to apply Lemma 5.3, but is a set of -flats, while Lemma 5.3 only works for sets of flats of dimension at least . Thus, in these cases, we cannot use this lemma. There is an easy fix, however, which follows by looking at the upper shadow of
Lemma 5.6.
Let . Then,
where is uniform measure in .
Proof.
Observe that . Indeed, for any , there must be a -flat such that . Since , it follows that and thus . The result then follows from applying Lemma 3.10 twice. ∎
We now wrap up the proof by applying Lemma 5.3 and obtaining a contradiction.
Proof of Lemma 5.4.
By Lemma 5.6, . We can now apply Lemma 5.3, with , degree parameter , and special point . The conditions of Lemma 5.3 are satisfied since and . From Lemma 5.3 it follows that is empty and there is a value such that after changing to , we have . In particular, it must be the case that , or equivalently, , where is the non-identity part of when written according to (4).
Recall how the point was defined: after is fixed, is the unique point such that , and is the last coordinates of . Since satisfies , we have , and hence . Letting be the last coordinates of , it follows that . However, by assumption , and since is full rank, for any . We now see where the contradiction lies. After changing the value of , we suddenly have
However, since we only changed the value of , no term in the above summation was changed, and this inner product should still be nonzero. Hence, the set cannot be dense in any where . ∎
5.2 Proof of Lemma 5.2
We have established that there exists an and such that
Using this, we will deduce Lemma 5.2, which says that the set is dense in .
Let denote the set of -flats such that and let denote the uniform measure on the set of -flats. We sample by first choosing an -flat containing , and then choosing whose image is conditioned on . The point of this procedure is that after choosing , the resulting can only reject if . Formally,
1. Choose a random -flat containing , and a random basis, . Let where the th column of is . By assumption for some .
2. Choose an arbitrary matrix such that and output .
It is easy to check that this procedure samples uniformly from and that as noted, can only reject if , which leads to the following observation:
Remark 5.7.
where denotes density in the zoom-in (on the affine Grassmann graph).
Using this information about , we now try to apply Lemma 5.3. Sample a full rank as follows:
1. Choose an -flat uniformly at random containing .
2. Choose uniformly at random such that .
3. Choose a basis representation and output where the th column of is , conditioned on for some .
After is chosen, recall the following two sets,
and let denote measure over -flats contained in .
Then , so with probability at least , we have . On the other hand, , so with probability at least , we have . Altogether, this implies that with probability at least
satisfies both and .
Suppose such a is chosen and the above holds. We may now apply Lemma 5.3 to show that any that can be chosen in step 3 must now reject. This will establish the desired result, that a random rejects with probability close to .
By Lemma 5.3 there exists a value such that after changing to , . It must be the case that because . Let be the function after changing the value of at . Then for any as described above, we must have for every valid . Since is full rank, there can only be one point mapped to and that point is some , so we have
Thus, for every valid , and in particular, rejects .
Proof of Lemma 5.2.
Sampling a full rank via the procedure above, the previous argument shows that with probability at least over -flats , the outputted in step 3 rejects. It follows that has density at least in the set of full rank transformations such that for some . Since non-full rank transformations constitute only a -fraction of transformations in , by Remark 3.2, we subtract out another to obtain the desired result. ∎
5.3 Iterating the Argument
Recall that denotes the probability that a randomly chosen rejects . By Lemma 5.3, for at least of , there is a value such that after changing the value of to , accepts . Thus, there exists a such that is identical to at all points except and
Proposition 5.8.
If , then there exists a point and a function that is identical to at all points except such that
where .
Proof.
By Lemma 5.2, there exists such that
By the above discussion, there exists a such that is identical to at all points except and
On the other hand, it is clear that for such that for all , the results of the tests on and are the same because and are identical on the points needed to evaluate and for any . Since is the probability that , a direct calculation yields
∎
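The correct-and-iterate argument can be illustrated on a toy code (constant words over F_2 with a 2-query equality tester; this toy is our own illustration, not the paper's tester): while the rejection probability is positive, a single-point correction at a minority coordinate strictly decreases it, and iterating reaches a codeword.

```python
from collections import Counter

def reject_prob(word):
    """Rejection probability of a 2-query tester for the toy code of
    constant words: sample two coordinates independently and uniformly,
    reject if the two values differ."""
    n, ones = len(word), sum(word)
    return 2 * (ones / n) * ((n - ones) / n)

def correct_and_iterate(word):
    """Toy analogue of the Section 5.3 loop: locate an erroneous point,
    correct it, and repeat until the tester always accepts."""
    word = list(word)
    while reject_prob(word) > 0:
        majority = Counter(word).most_common(1)[0][0]
        i = next(j for j, b in enumerate(word) if b != majority)
        word[i] = majority  # one correction strictly lowers reject_prob
    return word
```

Each iteration moves the word one step closer to the nearest constant word, mirroring how each correction in the paper's argument reduces the rejection probability by a quantified amount.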
Finally, we can conclude the proof of Theorem 1.1.
6 Optimal Testing from other Local Characterizations
When showing that the sparse flat test is optimal, we relied minimally on the structure of . Thus it is not hard to extend our methods and show optimal testing results for other polynomials that give local characterizations. We will reuse the variables and in this section, so they no longer refer to their previous definitions. Let be a polynomial of the form
where , , for each and some small constant . In words, is a -variate polynomial that is the product of polynomials in few variables, where the variable sets of these polynomials are pairwise disjoint. Finally, let be an arbitrary nonempty affine-invariant set of polynomials and suppose . Define
Notice that . It is well known that any affine-invariant set of polynomials is given by the span of the monomials that appear in at least one polynomial of the family, cf. [25], and this fact is elaborated on in Appendix B. In combination with Lemma 3.3, it follows that is in if and only if for every . In comparison with the description in Section 1.4, is an explicit basis for .
Using and monomials with exponent vectors in , we can define a test similar to the sparse flat tester. For , define
Let . It is clear that is affine invariant with the following natural tester:
1. Choose uniformly at random.
2. If for any , reject. Otherwise, accept.
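A tester of this shape can be sketched generically: restrict the word along a random affine map and reject if the restriction has a nonzero inner product with any characterizing function. Everything concrete below is an illustrative assumption of ours, with the classical F_2 sum-over-a-flat check standing in for the paper's monomials:

```python
import itertools
import random

def flat_test(f, n, t, q, check_fns, trials, rng=random):
    """Restrict f along a random affine map A: F_q^t -> F_q^n and reject if
    the restriction has a nonzero inner product with some characterizing
    function; accept only if every trial passes."""
    domain = list(itertools.product(range(q), repeat=t))
    for _ in range(trials):
        M = [[rng.randrange(q) for _ in range(t)] for _ in range(n)]
        b = [rng.randrange(q) for _ in range(n)]
        restriction = []
        for y in domain:
            x = tuple((sum(M[i][j] * y[j] for j in range(t)) + b[i]) % q
                      for i in range(n))
            restriction.append(f(x))
        for g in check_fns:
            if sum(fv * g(y) for fv, y in zip(restriction, domain)) % q != 0:
                return False  # reject: a characterizing inner product is nonzero
    return True

# Over F_2, summing f over a t-flat equals the coefficient of y_1...y_t in
# the restriction, which is zero whenever deg(f o A) < t.
sum_check = [lambda y: 1]
deg2 = lambda x: (x[0] + x[1] * x[2]) % 2   # degree 2: always accepted at t=3
deg3 = lambda x: (x[0] * x[1] * x[2]) % 2   # degree 3: rejected w.h.p. at t=3
assert flat_test(deg2, 3, 3, 2, sum_check, trials=50)
```

Completeness here is deterministic: a degree-2 polynomial restricted to a 3-flat over F_2 always sums to zero, while the degree-3 monomial is rejected on every trial whose linear part is invertible.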
Let . It is clear that the number of queries made by the tester is , although this value will not be of interest to us. Instead, the goal for the remainder of this section is to show that this tester is optimal. As a consequence, this allows one to obtain optimal testers with lower query complexity for general affine-invariant codes by constructing local characterizations using with sparse support of the prescribed form above, similar to what is done for Generalized Reed-Muller codes in [33].
Theorem 6.1.
The above tester is optimal for . That is, for any ,
where is the probability that the tester rejects , , is the number of queries that the tester performs, and is the minimal relative Hamming distance between and a member of .
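The distance quantity in Theorem 6.1 is the usual relative Hamming distance to the code. For concreteness, a small helper (our own, not code from the paper):

```python
def relative_distance(u, v):
    """delta(u, v): fraction of coordinates on which the two words differ."""
    assert len(u) == len(v) and len(u) > 0
    return sum(a != b for a, b in zip(u, v)) / len(u)

def distance_to_code(word, codewords):
    """delta(word, F): minimum relative Hamming distance to the code,
    here computed by brute force over an explicit list of codewords."""
    return min(relative_distance(word, c) for c in codewords)
```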
We will follow the same strategy, first locating a potential error, and then correcting the error and iterating. Let denote the set of rejecting tests in and assume that for some such that and .
Before going into the proof, we introduce a lemma shown in [31]. Recall the definition of lifted affine-invariant codes from the introduction. By design, is a lifted affine invariant code. Letting , it is easy to see that
Since we can view as a lifted code, we can then apply the following lemma from [31].
Lemma 6.1.
Let be a polynomial such that , and suppose that . Then,
where the probability is over hyperplanes . If this inequality is tight and the set of hyperplanes such that is of the form
for some point , then there is a function equal to at all points except such that .
This lemma will play the role of Lemma C.2 in this section’s analysis.
Locating a Potential Error
In order to locate an error, we will once again show that is dense on some zoom-in. As we already assume has small measure, this step requires us to show that is poorly expanding and pseudorandom with respect to zoom-outs, zoom-outs on the linear part, and zoom-ins on the linear part.
Lemma 6.2.
The set has the following properties:
•
•
• is -pseudorandom with respect to zoom-outs and zoom-outs on the linear part.
• is -pseudorandom with respect to zoom-ins on the linear part.
where is measure in the set .
The first item is true by assumption. The second and third items can be shown by the exact same arguments as in Lemmas 4.1 and 4.3 respectively.
The proof of the fourth item also proceeds similarly to the Reed-Muller case, but since this argument was more involved, we will review the steps.
Consider a zoom-in on the linear part . We can assume as otherwise the set is either empty or . We will again have two cases, one where at least one of the last coordinates of is nonzero, and another where all are zero. We can prove the former case in exactly the same way as Lemma 4.5. For the latter case, recall that by assumption . Thus, the reduction from the former case to the latter case (with a factor of loss) works exactly the same as in the proof of Lemma 4.4.
Once again we can sample a transformation by first choosing a full rank , and outputting , where now is the set of affine transformations with
(5)
where is full rank and . Call the non-trivial part of and let .
Following the same setup as in Section 4.3, we can find such that the following two hold:
•
• There exists such that
• .
Fixing this , define
Then for any restriction with as its non-trivial part, and any -variate monomial , we have
It follows that rejects if and only if .
By construction for some such that is equal to in its first coordinates. Then is equivalent to where is the last coordinates of
Letting , we can translate the third item above to
where is measure in . Define
By the same argument as in Lemma 4.7, is nonempty. Finally, we can use the same proof as that of Lemma 4.5 (except referring to the first part of Lemma 6.1 instead of Lemma 3.10 to bound the sizes of upper shadows) to show that . This shows that is -pseudorandom with respect to where one of ’s last -coordinates is nonzero. If this is not the case, then since , we can perform the same reduction as in Lemma 4.4, to show that is -pseudorandom with respect to every zoom-in on the linear part.
The point is in the support of
We next show that it must be the case that . We will need the following lemma which is the same as Lemma 5.3 but for the current setting. Just as in the setup of Lemma 5.3, let be an arbitrary polynomial, let denote uniform measure over -flats in , and let be an arbitrary point. Also let be an arbitrary affine-invariant code.
Define the following two sets:
Keep and (the large absolute constant) the same as we have defined, so that is small. We will use the following result, which is an extension of the results in Section 3.2 of [31]:
Lemma 6.3.
Keep the notation defined above and suppose . If then . Moreover there is a value such that after changing to ,
Proof.
Following the same steps as in the proof of Lemma 5.4, this lemma almost immediately shows that it must be the case . Suppose for the sake of contradiction that , and that it is the group of coordinates . By assumption is a -variate polynomial for . Let
We can similarly find an auxiliary polynomial and a special point such that the following hold:
• , where and is measure in .
• There exists a full rank affine transformation such that ,
• .
Using the first fact, we can also bound the measure of the following set of -flats,
Lemma 6.4.
Suppose is a -variate polynomial such that . Then with probability at least over we have .
Proof.
Choosing randomly, we can view as a polynomial in the entries of . Since , there must be some choice that makes this polynomial's value nonzero, so is a nonzero polynomial in the entries of . As the individual degrees of must be at most , it follows from the Schwartz-Zippel Lemma that with probability at least , . ∎
Let denote measure in . We can choose such that by first choosing not containing and then with image contained in . Lemma 6.4 along with the assumption implies that
Therefore, we have that and by Lemma 6.3, . We can then apply Lemma 6.3 with and special point to get that there is a value such that after changing , we have,
This also implies that for all -flats , but note that changing the value of does not affect the value of the test
because of the third item above. Thus, we have a contradiction and it follows that .
Dense on all zoom-ins inside the support of
After establishing that has density at least inside and , we can use the same arguments as in the proof of Lemma 5.2 to show that must have density at least
inside . We can choose by first choosing a -flat containing , then a -flat containing , and finally outputting conditioned on and .
Using the same sampling and averaging argument, we get that with probability at least over -flats , the following hold:
• , where ,
• , where .
Applying Lemma 6.3 with and , we get that there is a value such that after changing the value of to , . By the first item above, . Therefore, prior to changing the value of , we must have had for all full rank such that for some . After subtracting out the fraction of non-full rank transformations, we can conclude that has density at least
inside .
Correcting the Error and Iterating
Finally, we can correct the error and iterate as done in Section 5.3. For at least of , there is a value such that after changing the value of to , accepts . Thus, there exists an that is identical to at all points except and
By the same calculation as in Proposition 5.8, it follows that
for some , and we can conclude
References
- [1] Noga Alon, Tali Kaufman, Michael Krivelevich, Simon Litsyn, and Dana Ron. Testing Reed-Muller codes. IEEE Trans. Inf. Theory, 51(11):4032–4039, 2005.
- [2] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, 1998.
- [3] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: A new characterization of NP. J. ACM, 45(1):70–122, 1998.
- [4] Sanjeev Arora and Madhu Sudan. Improved low-degree testing and its applications. Comb., 23(3):365–426, 2003.
- [5] Mitali Bafna, Boaz Barak, Pravesh K. Kothari, Tselil Schramm, and David Steurer. Playing unique games on certified small-set expanders. In STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1629–1642, 2021.
- [6] Mitali Bafna, Max Hopkins, Tali Kaufman, and Shachar Lovett. High dimensional expanders: Eigenstripping, pseudorandomness, and unique games. In Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, Virtual Conference / Alexandria, VA, USA, January 9 - 12, 2022, pages 1069–1128, 2022.
- [7] Mitali Bafna, Max Hopkins, Tali Kaufman, and Shachar Lovett. Hypercontractivity on high dimensional expanders. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 185–194, 2022.
- [8] Boaz Barak, Parikshit Gopalan, Johan Håstad, Raghu Meka, Prasad Raghavendra, and David Steurer. Making the long code shorter. SIAM J. Comput., 44(5):1287–1324, 2015.
- [9] Boaz Barak, Pravesh Kothari, and David Steurer. Small-Set Expansion in Shortcode Graph and the 2-to-2 Conjecture. In Proceedings of the 10th Innovations in Theoretical Computer Science conference, ITCS 2019, January 10-12, San Diego, CA, USA, 2019.
- [10] Arnab Bhattacharyya, Swastik Kopparty, Grant Schoenebeck, Madhu Sudan, and David Zuckerman. Optimal testing of Reed-Muller codes. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, October 23-26, Las Vegas, NV, USA, pages 488–497, 2010.
- [11] Irit Dinur and Venkatesan Guruswami. Pcps via low-degree long code and hardness for constrained hypergraph coloring. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 340–349, 2013.
- [12] Irit Dinur, Prahladh Harsha, Srikanth Srinivasan, and Girish Varma. Derandomized graph product results using the low degree long code. In 32nd International Symposium on Theoretical Aspects of Computer Science, STACS 2015, March 4-7, 2015, Garching, Germany, pages 275–287, 2015.
- [13] Irit Dinur, Subhash Khot, Guy Kindler, Dor Minzer, and Muli Safra. On non-optimally expanding sets in Grassmann graphs. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 940–951, 2018.
- [14] Irit Dinur, Subhash Khot, Guy Kindler, Dor Minzer, and Muli Safra. Towards a proof of the 2-to-1 games conjecture? In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 376–389, 2018.
- [15] Uriel Feige, Shafi Goldwasser, László Lovász, Shmuel Safra, and Mario Szegedy. Interactive proofs and the hardness of approximating cliques. J. ACM, 43(2):268–292, 1996.
- [16] William T. Gowers. A new proof of Szemerédi's theorem. Geometric & Functional Analysis GAFA, 11(3):465–588, 2001.
- [17] Alan Guo, Swastik Kopparty, and Madhu Sudan. New affine-invariant codes from lifting. In Proceedings of the 4th Conference on Innovations in Theoretical Computer Science, ITCS ’13, page 529–540, New York, NY, USA, 2013. Association for Computing Machinery.
- [18] Tom Gur, Noam Lifshitz, and Siqi Liu. Hypercontractivity on high dimensional expanders. In STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 176–184, 2022.
- [19] Venkatesan Guruswami, Prahladh Harsha, Johan Håstad, Srikanth Srinivasan, and Girish Varma. Super-polylogarithmic hypergraph coloring hardness via low-degree long codes. SIAM J. Comput., 46(1):132–159, 2017.
- [20] Elad Haramaty, Noga Ron-Zewi, and Madhu Sudan. Absolutely sound testing of lifted codes. Theory Comput., 11:299–338, 2015.
- [21] Elad Haramaty, Amir Shpilka, and Madhu Sudan. Optimal testing of multivariate polynomials over small prime fields. SIAM J. Comput., 42(2):536–562, 2013.
- [22] Charanjit S. Jutla, Anindya C. Patthak, Atri Rudra, and David Zuckerman. Testing low-degree polynomials over prime fields. Random Struct. Algorithms, 35(2):163–193, 2009.
- [23] Daniel M. Kane and Raghu Meka. A PRG for lipschitz functions of polynomials with applications to sparsest cut. In Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 1–10, 2013.
- [24] Tali Kaufman and Dana Ron. Testing polynomials over general fields. SIAM J. Comput., 36(3):779–802, 2006.
- [25] Tali Kaufman and Madhu Sudan. Algebraic property testing: the role of invariance. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, STOC 2008, May 17-20, Victoria, BC, Canada, pages 403–412, 2008.
- [26] Peter Keevash, Noam Lifshitz, Eoin Long, and Dor Minzer. Hypercontractivity for global functions and sharp thresholds. arXiv preprint arXiv:1906.05568, 2019.
- [27] Peter Keevash, Noam Lifshitz, Eoin Long, and Dor Minzer. Forbidden intersections for codes. arXiv preprint arXiv:2103.05050, 2021.
- [28] Peter Keevash, Noam Lifshitz, and Dor Minzer. On the largest product-free subsets of the alternating groups. arXiv preprint arXiv:2205.15191, 2022.
- [29] Subhash Khot, Dor Minzer, and Muli Safra. On independent sets, 2-to-2 games, and Grassmann graphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 576–589, 2017.
- [30] Subhash Khot, Dor Minzer, and Muli Safra. Pseudorandom sets in Grassmann graph have near-perfect expansion. In Proceedings of the 59th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 592–601, 2018.
- [31] Dor Minzer and Tali Kaufman. Improved Optimal Testing Results from Global Hypercontractivity. In Proceedings of the 63rd Annual IEEE Symposium on Foundations of Computer Science, FOCS 2022, October 31 - November 3, Denver, CO, USA, 2022.
- [32] Ran Raz and Shmuel Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In Proceedings of the 29th Annual ACM SIGACT Symposium on Theory of Computing, STOC 1997, El Paso, TX, USA, May 4-6, 1997, pages 475–484, 1997.
- [33] Noga Ron-Zewi and Madhu Sudan. A New Upper Bound on the Query Complexity of Testing Generalized Reed-Muller Codes. Theory of Computing, 9(25):783–807, 2013.
- [34] Ronitt Rubinfeld and Madhu Sudan. Robust characterizations of polynomials with applications to program testing. SIAM J. Comput., 25(2):252–271, 1996.
Appendix A Proof of Theorem 2.1
We now prove Theorem 2.1. First we will need to show the following result, which is a slight variation of [31, Theorem 2.4].
Theorem A.1.
Let satisfy
1. ,
2. is -pseudorandom with respect to hyperplanes, hyperplanes on its linear part, and points on its linear part.
3. .
Then there exists a point such that .
Measure, expansion, and pseudorandomness are all with respect to , and denotes the fractional size of in the subset of that contains the point .
To show that an analogous result holds in the affine Bi-linear Scheme Graph, we retrace the steps in [9] to establish a connection to the affine Grassmann Graph and then apply Theorem A.1. To this end, define the following injective map . Let be the canonical basis. For two column vectors and with lengths and respectively, we denote by the length column vector obtained by vertically concatenating and . Likewise, let be the length column vector obtained by horizontally concatenating and .
For an affine transformation , where has columns we set . It is clear that this map is injective. Moreover, the image of is the set of -flats such that the projection of onto the first coordinates is full rank.
The map preserves edges, nearly preserves expansion, and maps the canonical non-expanding sets in to their counterparts in .
Lemma A.2.
If are adjacent in , then are adjacent in . Conversely if are adjacent in and both in the image of , then are adjacent in .
Proof.
For the first direction, suppose and are given by and respectively. Let be the matrix obtained by vertically concatenating and , and let be the length vector obtained by vertically concatenating the length zero vector and . Define and similarly. Then notice that is given by the column-image of affine transformation , while is given by the column-image of affine transformation . Thus, to show that edges are preserved it suffices to show that the column-images of and have intersection of dimension . Indeed, , and since both and are all zeros in the first rows, it follows that . Therefore, and .
In the converse direction, suppose that where . Write and where , , , and . Let be the matrix with columns and be the matrix with columns . Then by assumption the affine transformations and have images with intersection of dimension . We claim that this implies and are in fact equal on an -dimensional subspace. Indeed, if for then examining the first coordinates implies that . Therefore, are equal on the -dimensional preimage of . Finally, and are given by the projections of and respectively onto the last coordinates, and the conclusion follows. ∎
Lemma A.3.
Let be in the image of . Then at least fraction of its neighbors are also in the image of .
Proof.
Recall that is in the image of if the projection of onto its first -coordinates is full rank. Fix in the image of . Then a neighbor of can be sampled as follows: choose a uniformly random point and linearly independent directions, , so that , a uniformly random outside of , and set . Since is in the image of , the projection of to the first coordinates is full rank. Likewise, is in the image of if projected onto its first coordinates is linearly independent to ’s projections to the first coordinates. This happens with probability exactly , completing the proof. ∎
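The probability computed in this proof is an instance of the standard count: k uniformly random vectors in F_q^t are linearly independent with probability prod_{i=0}^{k-1}(1 - q^{i-t}), since the i-th vector must avoid the q^i points of the span built so far. A quick exact computation of this quantity (illustrative, not from the paper):

```python
from fractions import Fraction

def independence_probability(k, t, q):
    """P[k i.i.d. uniform vectors in F_q^t are linearly independent]:
    the i-th vector must avoid the q^i points of the current span,
    contributing a factor (1 - q^{i-t})."""
    p = Fraction(1)
    for i in range(k):
        p *= 1 - Fraction(q ** i, q ** t)
    return p
```

For example, two random vectors in F_2^3 are independent with probability (7/8)(6/8) = 21/32, matching a direct count: 7 nonzero choices for the first vector, then 6 choices outside its 2-point span.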
Lemma A.4.
If satisfies , then satisfies , where and are expansion in and respectively.
Proof.
Lemma A.5.
The map is a bijection between the following pairs of sets:
• and for .
• and for .
• and for given by .
• and for given by .
Proof.
We show the zoom-in and zoom-out cases. The other two are analogous.
zoom-ins: Take any arbitrary and let and . It is easy to check that . Since is arbitrary, this shows the first half of the bijection.
To complete the proof of this case, we must show that any flat in is mapped to. Take such a flat . Since , we can write , where the first coordinates of is equal to and the first coordinates of are all . There is a unique linear combination of the ’s and that sums to , and looking at the first coordinates it must be the case that where is the th coordinate of . Let be the matrix whose th column is given by the last entries of and let be the last entries of . It follows that and that .
zoom-outs: Take an arbitrary and suppose it has columns . Then where and for . Let be the length row vector whose first entries are and last entries are . By construction and . Therefore, is contained in the hyperplane given by . Since was arbitrary, it follows that , where is the hyperplane given by .
For the other direction of the bijection, take any , let be its preimage under , and suppose is given by (note that must be a linear subspace since it contains which is a linear subspace). Take to be the negative of the first coordinates of and to be the last coordinates of . It follows that . ∎
With these lemmas, we can obtain the desired characterization of poorly expanding sets in
Proof of Theorem 2.1.
Suppose satisfies the conditions of Theorem 2.1 and let . Henceforth use and to denote expansion and measure in . By Lemma A.4 we have that . By Lemma A.5, we have that is also -pseudorandom with respect to hyperplanes, hyperplanes on its linear part, and points on its linear part. Finally, since is an injection, it is clear that . By Theorem A.1 it follows that there exists a point such that
By Lemma A.5, there is a zoom-in (in the affine bi-linear scheme graph) such that , so . Finally is the probability that a randomly chosen flat from is not full rank when restricted to its first coordinates. This probability is at most . Indeed a flat from can be chosen by choosing linearly independent vectors and taking the flat to be . Conditioned on the first vectors being chosen so that their restriction to the first coordinates is linearly independent, there is a chance that to satisfy this property as well. It follows that
∎
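The full-rank probability bound used in the proof above can be checked concretely. The sketch below is our own illustration (the function names are ours, not from the paper): it compares the exact product formula ∏_{i=0}^{k-1}(1 − q^{i−n}) for the probability that a uniformly random k×n matrix over F_q has full rank against a brute-force enumeration over F_2.

```python
from itertools import product

def rank_gf2(rows):
    """Rank of a list of row vectors (tuples of 0/1) over GF(2),
    via Gaussian elimination with XOR row operations."""
    rows = [list(r) for r in rows]
    rank, n = 0, len(rows[0])
    for col in range(n):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def full_rank_fraction_bruteforce(k, n):
    """Fraction of k x n matrices over GF(2) with rank k, by enumeration."""
    vectors = list(product([0, 1], repeat=n))
    total = full = 0
    for rows in product(vectors, repeat=k):
        total += 1
        if rank_gf2(rows) == k:
            full += 1
    return full / total

def full_rank_fraction_formula(k, n, q=2):
    """Exact probability: prod_{i=0}^{k-1} (1 - q^{i-n})."""
    p = 1.0
    for i in range(k):
        p *= 1 - q ** (i - n)
    return p

print(full_rank_fraction_bruteforce(2, 3))  # 42/64 = 0.65625
print(full_rank_fraction_formula(2, 3))     # 0.65625
```

For k = 2, n = 3 over F_2 the count is (2^3 − 1)(2^3 − 2) = 42 full-rank matrices out of 64, matching the product formula.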
Appendix B Canonical Monomials characterizing
In this section we include the proof of Theorem 3.4, showing that any polynomial passing the test with probability is degree . The proof is implicit in [33], but as our tester is presented differently we give a proof for completeness.
Write , where , and set . First, note that our tester detects monomials of the form:
(6)
for any such that . Indeed, for such that , let and consider the expansion of . Using Lucas’s Theorem, it is not hard to see that the expansion of contains the monomial , and hence by Lemma 3.3, any canonical monomial in (6) is rejected.
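Lucas's theorem, invoked above to locate monomials in the expansion, can be illustrated concretely. The sketch below is our own illustration (the name `binom_mod_p` is ours): it computes a binomial coefficient modulo a prime p from the base-p digits, and checks the answer against direct computation.

```python
from math import comb

def binom_mod_p(m, n, p):
    """C(m, n) mod p via Lucas's theorem: the product of C(m_i, n_i) mod p
    over the base-p digits m_i, n_i of m and n."""
    result = 1
    while m or n:
        m, mi = divmod(m, p)
        n, ni = divmod(n, p)
        result = (result * comb(mi, ni)) % p  # comb(mi, ni) = 0 when ni > mi
        if result == 0:
            return 0
    return result

# Sanity check against direct computation; C(m, n) mod p is nonzero
# exactly when each base-p digit of n is at most that of m.
for m in range(30):
    for n in range(m + 1):
        assert binom_mod_p(m, n, 5) == comb(m, n) % 5
```
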
Let be the family of functions that pass the sparse -flat test with probability . In this section we will prove:
Lemma B.1.
If passes the sparse -flat test with probability , then . Equivalently,
One direction of this lemma is clear. If then passes with probability . The other direction amounts to showing that . To show this, we will actually show the contrapositive and prove that if , and , then one of the canonical monomials in (6) must be in . This is a contradiction and establishes that .
Before proceeding to the proof, we will need the following two facts from [25] about affine-invariant families of polynomials. For a family of polynomials , let denote the set of monomials that appear in at least one of these polynomials. The first fact says that these monomials are a basis of .
Lemma B.2 (Monomial Extraction Lemma [25]).
If is an affine-invariant family of polynomials then .
The second fact says that the monomials appearing in an affine-invariant family are -shadow closed. For two integers , let and be their base representations (recall ). Then we say is in the -shadow of if for , and denote this by . Then for two exponent vectors and , we say if for every . Affine-invariant families of polynomials have the following shadow closed property.
Lemma B.3.
Let be an affine invariant family of polynomials. If and , then as well.
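The p-shadow relation just defined is a digit-wise comparison, which by Lucas's theorem is equivalent to the corresponding binomial coefficient being nonzero mod p. The sketch below is our own illustration (the name `in_p_shadow` is ours): it tests whether n lies in the p-shadow of m by comparing base-p digits.

```python
def in_p_shadow(n, m, p):
    """True iff every base-p digit of n is at most the
    corresponding base-p digit of m."""
    while n or m:
        n, ni = divmod(n, p)
        m, mi = divmod(m, p)
        if ni > mi:
            return False
    return True

# Example over p = 2: 5 = (101)_2 is in the 2-shadow of 7 = (111)_2,
# but 2 = (010)_2 is not in the 2-shadow of 5 = (101)_2.
print(in_p_shadow(5, 7, 2))  # True
print(in_p_shadow(2, 5, 2))  # False
```

The extension to exponent vectors is coordinate-wise, as in the definition above.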
Finally, we will need the following which will allow us to go from one monomial in to another with the same degree, but with the distribution of the individual degrees shifted.
Lemma B.4.
Suppose and suppose . Then the monomial , where .
Proof.
Let be the affine transformation . Then,
By Lucas’s Theorem and the assumption , in , and so contains the monomial . The result then follows from Lemma B.2. ∎
With these three lemmas, we are ready to proceed to the proof of Lemma B.1. Supposing for the sake of contradiction that , using the above lemmas, we will show that this implies one of the canonical monomials is in . This cannot happen however, as all monomials in (6) are rejected by , the identity.
Proof of Lemma B.1.
Let be the family of polynomials that pass the sparse -flat test with probability . Suppose for the sake of contradiction that and contains a monomial of degree . Let be such a monomial and let be the smallest index such that . Then , so and
We will show that this monomial will lead to one of the canonical monomials being in . First, we claim that there is a monomial such that for . Define
If , then is of the desired form. If is not of the desired form, then there is such that . Otherwise one of the following must be true:
• There is such that , in which case we simply find some and apply Lemma B.4 to obtain the monomial in place of ,
• There is such that . In this case we can find such that , and apply Lemma B.4 to obtain the monomial in place of .
In either case, we can find another monomial such that and . Iterating this process, we must eventually find the desired monomial with .
Now, abusing notation, let be this monomial. We have and for . We can now perform essentially the same argument and move any degree above to . There is at most
degree leftover after subtracting so we can perform the above argument to shift degree until each of the degrees is at least . This will leave at most degree leftover, which we can simply shift to . ∎
Lemma B.5.
Suppose satisfies . Then there exist full rank affine transformations such that:
1. contains a monomial with for all .
2. .
Proof.
To show the first part of the lemma, we use a similar idea to the proof of Lemma B.4. Suppose contains a monomial and suppose . Consider the full rank affine transformation for and the coefficient of in . It is not hard to see that this coefficient is a nonzero polynomial in and therefore there exists an such that contains the monomial . As is a full rank affine transformation, we can repeatedly apply such transformations to obtain a full rank such that contains a monomial with , proving the first part of the lemma.
Let be the transformation from part and let , so that contains a monomial with . For , let be the full rank affine transformation such that and consider the inner product
as a polynomial in . Recall that contains the monomial . Since contains the monomial with there is a contribution of with nonzero coefficient, and this cannot be cancelled out from any other monomial. Therefore, is a nonzero polynomial in and there exists an such that . This establishes the second part of the lemma. ∎
Appendix C Proof of Lemma 5.3
In this section we provide the proof of Lemma 5.3. This is essentially the same as [31, Proposition 3.5] with a slight modification. Recall that is an arbitrary polynomial, is an arbitrary degree parameter, and is an arbitrary point. We work in . Use to denote the uniform measure over all -flats, and to denote uniform measure over the zoom-in on . When referring to zoom-ins, zoom-ins on the linear part etc. we are referring to the versions in the affine Grassmann graph. We have the following two sets of -flats,
and . We will show the following weaker form of Lemma 5.3, and then explain why this gives the full Lemma 5.3.
Lemma C.1.
If then . Moreover there is a value such that after changing to , .
Before proving this lemma, we explain why it implies Lemma 5.3. Assuming that Lemma C.1 holds, suppose that is the resulting function after changing the value of to , and consider the set of -flats such that . Assume for the sake of contradiction that is nonempty. We record some facts about ,
1. .
2. .
3. .
4. is pseudorandom with respect to zoom-outs and zoom-outs on the linear part.
5. is pseudorandom with respect to zoom-ins on the linear part.
The first item follows from assuming Lemma C.1. The second item follows from the first item and from bounding the fraction of -flats that contain . The third and fourth items follow from the same arguments as Lemmas 4.2 and 4.3. Finally, the fifth item follows again from the first item and the fact that over any zoom-in on the linear part, a random -flat from this set contains the point with probability at most .
Applying Theorem A.1 with , we get that must have density at least inside some zoom-in, where we use the fact that . There can only be one zoom-in on which has nonzero density, and that is the zoom-in on . However, this contradicts the assumption that after the value of is changed. In short, we have shown that, under this setting, if a correction causes the rejection probability within a zoom-in to drop below , then the rejection probability must actually be .
We now give the proof of Lemma C.1. This proof requires the following result used in both [5, 31], which was referred to as the shadow lemma in the introduction.
Lemma C.2.
Let be a degree parameter and . Then for any :
1. If , then .
2. If , then .
Here, and denote the fraction of -flats and -flats, respectively, on which the restriction of has degree greater than .
In [31] this lemma is used to show that the set of rejecting flats is non-expanding, similar to Lemma 4.2 in this paper. The lemma has another use though, which is stated in its second item. For our purposes, the second item says that if has degree greater than and for greater than of the hyperplanes we have as well, then cannot contain the maximum degree monomial. In this section we will use Lemma C.2 with the parameters and . Note that by assumption.
Our first goal is to show . Start by taking an -flat uniformly conditioned on . Let
and work in — the affine Grassmann graph over the -flats contained in .
Let and denote measure and expansion respectively in . We claim that either or . Otherwise, and is pseudorandom with respect to zoom-ins and zoom-ins on the linear part. By a similar argument to Lemma 4.3 we have that is pseudorandom with respect to zoom-outs and zoom-outs on the linear part and by a similar argument to Lemma 4.1 we have that . This contradicts Theorem A.1, however, so we conclude that either or .
Lemma C.3.
Proof.
Otherwise we may find a such that . Fix such a and sample a uniform -flat conditioned on , and a uniform -flat . Now consider . We may think of as being defined by a system of independent linear equations . That is, is the subspace of that satisfies these equations. Likewise, is given by the restriction of linear equations, . The probability that all linear equations are linearly independent is at least
and is uniform over . Thus,
If , then , where the upper shadow is taken with respect to , so it follows that
On the other hand, by Lemma C.2,
By assumption so altogether we get that,
for some constant . For large enough this is a contradiction. ∎
Proof of Lemma 5.3.
Fix an -flat that contains . We will show that there exists a value such that changing the value of to results in a degree function on . Establishing this fact and choosing the most common over all -flats proves Lemma C.1, which in turn establishes Lemma 5.3 by our previous discussion.
Suppose , as otherwise we are done by setting . Pick an -flat and let , and , over . Note that contains the degree -monomial, so there is some value such that has degree strictly less than . Showing that for this , completes the proof.
We have already shown that for any -flat that does not contain the point . This implies that also has degree at most on any -flat not containing , because when restricted to such flats, is equal to the restriction of plus some constant polynomial. Since the degree of ’s restriction is at most , so is the degree of . It follows that can only have degree greater than when restricted to -flats that contain . This is at most a -fraction of -flats contained in however, which, along with the fact that , implies that by Lemma C.2. ∎