
Randomly punctured Reed–Solomon codes achieve list-decoding capacity over linear-sized fields

Omar Alrabiah Department of EECS, UC Berkeley, Berkeley, CA, 94709, USA. Email: [email protected]. Research supported in part by a Saudi Arabian Cultural Mission (SACM) Scholarship, NSF CCF-2210823 and V. Guruswami’s Simons Investigator Award.    Venkatesan Guruswami Departments of EECS and Mathematics, and the Simons Institute for the Theory of Computing, UC Berkeley, Berkeley, CA, 94709, USA. Email: [email protected]. Research supported by a Simons Investigator Award and NSF grants CCF-2210823 and CCF-2228287.    Ray Li Department of EECS, UC Berkeley, Berkeley, CA, 94709, USA. Email: [email protected]. Research supported by the NSF Mathematical Sciences Postdoctoral Research Fellowships Program under Grant DMS-2203067, and a UC Berkeley Initiative for Computational Transformation award.
(March 2024)
Abstract

Reed–Solomon codes are a classic family of error-correcting codes consisting of evaluations of low-degree polynomials over a finite field at some sequence of distinct field elements. They are widely known for their optimal unique-decoding capabilities, but their list-decoding capabilities are not fully understood. Given the prevalence of Reed–Solomon codes, a fundamental question in coding theory is determining whether Reed–Solomon codes can achieve list-decoding capacity.

A recent breakthrough by Brakensiek, Gopi, and Makam established that Reed–Solomon codes are combinatorially list-decodable all the way to capacity. However, their result holds for randomly punctured Reed–Solomon codes over an exponentially large field size $2^{O(n)}$, where $n$ is the block length of the code. A natural question is whether Reed–Solomon codes can still achieve capacity over smaller fields. Recently, Guo and Zhang showed that Reed–Solomon codes are list-decodable to capacity with field size $O(n^2)$. We show that Reed–Solomon codes are list-decodable to capacity with linear field size $O(n)$, which is optimal up to the constant factor. We also give evidence that the ratio between the alphabet size $q$ and the code length $n$ cannot be bounded by an absolute constant.

Our techniques also show that random linear codes are list-decodable up to (the alphabet-independent) capacity with optimal list size $O(1/\varepsilon)$ and near-optimal alphabet size $2^{O(1/\varepsilon^2)}$, where $\varepsilon$ is the gap to capacity. As far as we are aware, list-decoding up to capacity with optimal list size $O(1/\varepsilon)$ was not known to be achievable with any linear code over a constant alphabet size (even non-constructively), and it was also not known to be achievable for random linear codes over any alphabet size.

Our proofs are based on the ideas of Guo and Zhang, and we additionally exploit symmetries of reduced intersection matrices. With our proof, which maintains a hypergraph perspective of the list-decoding problem, we include an alternate presentation of ideas from Brakensiek, Gopi, and Makam that more directly connects the list-decoding problem to the GM-MDS theorem via a hypergraph orientation theorem.

1 Introduction

An (error-correcting) code is simply a set of strings (codewords). In this paper, all codes are linear, meaning our code $C \subset \mathbb{F}_q^n$ is a space of vectors over a finite field $\mathbb{F}_q$, for some prime power $q$. A Reed–Solomon code [RS60] is a linear code obtained by evaluating low-degree polynomials over $\mathbb{F}_q$. More formally,

$$\mathsf{RS}_{n,k}(\alpha_1,\dots,\alpha_n) \stackrel{\rm def}{=} \{(f(\alpha_1),\dots,f(\alpha_n)) \in \mathbb{F}_q^n : f \in \mathbb{F}_q[X],\ \deg(f) < k\}. \qquad (1)$$

The rate $R$ of a code $C$ is $R \stackrel{\rm def}{=} \log_q|C|/n$, which, for a Reed–Solomon code, is $k/n$. Famously, Reed–Solomon codes are optimal for the unique-decoding problem [RS60]: for any rate $R$ Reed–Solomon code and every received word $y \in \mathbb{F}_q^n$, there is at most one codeword within Hamming distance $pn$ of $y$ for $p = \frac{1-R}{2}$ (the Hamming distance between two vectors is the number of coordinates on which they differ), and this error parameter is optimal by the Singleton bound [Sin64].
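As a concrete illustration (a minimal sketch with toy parameters, not from the paper), the following builds the Reed–Solomon code of definition (1) over a small prime field and verifies that its minimum distance meets the Singleton bound $n-k+1$, i.e. that the code is MDS:

```python
# Sketch: enumerate RS_{n,k}(alpha_1,...,alpha_n) over F_q (q prime)
# and check the minimum distance equals n - k + 1 (the MDS property).
from itertools import product

q, n, k = 7, 5, 2                      # toy parameters, q prime
alphas = [1, 2, 3, 4, 5]               # distinct evaluation points in F_q

def evaluate(coeffs, x):
    # Horner evaluation of the polynomial sum(coeffs[d] * x^d) over F_q
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

codewords = [tuple(evaluate(f, a) for a in alphas)
             for f in product(range(q), repeat=k)]

def hamming(u, v):
    return sum(1 for a, b in zip(u, v) if a != b)

dmin = min(hamming(u, v) for i, u in enumerate(codewords)
           for v in codewords[i + 1:])
print(dmin, n - k + 1)  # minimum distance meets the Singleton bound
```

Any two distinct polynomials of degree below $k$ agree on fewer than $k$ of the $n$ points, which is exactly why the minimum distance is $n-k+1$.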

In this paper, we study Reed–Solomon codes in the context of list-decoding, a generalization of unique-decoding that was introduced by Elias and Wozencraft [Eli57, Woz58]. Formally, a code $C \subset \mathbb{F}_q^n$ is $(p,L)$-list-decodable if, for every received word $y \in \mathbb{F}_q^n$, there are at most $L$ codewords of $C$ within Hamming distance $pn$ of $y$.
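The definition can be checked by brute force for tiny codes. The following sketch (toy example, not from the paper) tests $(p,L)$-list-decodability by enumerating every possible received word:

```python
# Sketch: brute-force check that every received word y has at most L
# codewords within Hamming distance p*n -- i.e. (p, L)-list-decodability.
from itertools import product

q, n = 3, 4
# a tiny linear code: the span of the all-ones generator (1,1,1,1) in F_3^4
code = [tuple((c * g) % q for g in (1, 1, 1, 1)) for c in range(q)]

def hamming(u, v):
    return sum(1 for a, b in zip(u, v) if a != b)

def is_list_decodable(code, p, L):
    radius = p * n
    for y in product(range(q), repeat=n):
        if sum(1 for c in code if hamming(c, y) <= radius) > L:
            return False
    return True

print(is_list_decodable(code, 0.25, 1))  # unique decoding: 1 error is fine
```

This repetition-style code has minimum distance $4$, so radius-$1$ balls around codewords are disjoint and $L=1$ suffices at $p=0.25$; at $p=0.75$ the same check fails.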

It is well known that the list-decoding capacity, namely the largest fraction of errors that can be list-decoded with small lists, is $1-R$ [GRS22, Theorem 7.4.1]. Specifically, for $p = 1-R-\varepsilon$, there are (infinite families of) rate $R$ codes that are $(p,L)$-list-decodable for a list size $L$ as small as $O(1/\varepsilon)$. On the other hand, for $p = 1-R+\varepsilon$, if a rate $R$ code is $(p,L)$-list-decodable, the list size $L$ must be exponential in the code length $n$. Informally, a code that is list-decodable up to radius $p = 1-R-\varepsilon$ with list size $O_\varepsilon(1)$, or even list size $n^{O_\varepsilon(1)}$ where $n$ is the code length, is said to achieve (list-decoding) capacity.

The list-decodability of Reed–Solomon codes is important for several reasons. Reed–Solomon codes are the most fundamental algebraic error-correcting codes. In fact, all prior explicit constructions of codes achieving list-decoding capacity are based on algebraic constructions that generalize Reed–Solomon codes, for example, folded Reed–Solomon codes [GR08, KRZSW18], multiplicity codes [GW13, Kop15, KRZSW18], and algebraic-geometric codes [DL12, GX12, GX13, HRZW19]. Thus, it is natural to wonder whether and when Reed–Solomon codes themselves achieve list-decoding capacity. Additionally, all Reed–Solomon codes are optimally unique-decodable, so (equivalently) they are optimally list-decodable with list size $L=1$, making them a natural candidate for codes achieving list-decoding capacity. Further, capacity-achieving Reed–Solomon codes would potentially offer advantages over existing explicit capacity-achieving codes, such as simplicity and smaller alphabet sizes (which we achieve in this work). Lastly, list-decoding of Reed–Solomon codes has found several applications in complexity theory and pseudorandomness [CPS99, STV01, LP20].

For all these reasons, the list-decodability of Reed–Solomon codes is well studied. As rate $R$ Reed–Solomon codes are uniquely decodable up to the optimal radius $\frac{1-R}{2}$ given by the Singleton bound, the Johnson bound [Joh62] automatically implies that Reed–Solomon codes are $(p,L)$-list-decodable for error parameter $p = 1-\sqrt{R}-\varepsilon$ and list size $L = O(1/\varepsilon)$. Guruswami and Sudan [GS99] showed how to efficiently list-decode Reed–Solomon codes up to the Johnson radius $1-\sqrt{R}$. For a long time, this remained the best list-decodability result (even non-constructively) for Reed–Solomon codes.

Subsequently, several results suggested that Reed–Solomon codes could not be list-decoded up to capacity, and in fact not much beyond the Johnson radius $1-\sqrt{R}$. Guruswami and Rudra [GR06] showed that, for a generalization of list-decoding called list-recovery, Reed–Solomon codes are not list-recoverable beyond the (list-recovery) Johnson bound in some parameter settings. Cheng and Wan [CW07] showed that efficient list-decoding of Reed–Solomon codes beyond the Johnson radius in certain parameter settings implies fast algorithms for the discrete logarithm problem. Ben-Sasson, Kopparty, and Radhakrishnan [BKR10] showed that full-length Reed–Solomon codes ($q=n$) are not list-decodable much beyond the Johnson bound in some parameter settings.

Since then, an exciting line of work [RW14, ST20, GLS+22, FKS22, GST22, BGM23, GZ23] has shown the existence of Reed–Solomon codes that can in fact be list-decoded beyond the Johnson radius. These works all consider combinatorial list-decodability of randomly punctured Reed–Solomon codes. By combinatorial list-decodability, we mean that the code is proved to be list-decodable without providing an algorithm to efficiently recover the list of nearby codewords. By a randomly punctured Reed–Solomon code, we mean a code $\mathsf{RS}_{n,k}(\alpha_1,\dots,\alpha_n)$ where $(\alpha_1,\dots,\alpha_n)$ is chosen uniformly over all $n$-tuples of pairwise distinct elements of $\mathbb{F}_q$. Several of these works [RW14, FKS22, GST22] proved more general list-decoding results about randomly puncturing any code with good unique-decoding properties, not just Reed–Solomon codes.

In this line of work, a recent breakthrough of Brakensiek, Gopi, and Makam [BGM23] showed, using notions of "higher-order MDS codes" [BGM22, Rot22], that Reed–Solomon codes can actually be list-decoded up to capacity. In fact, they show, more strongly, that Reed–Solomon codes can be list-decoded with list size $L$ up to radius $p = \frac{L}{L+1}(1-R)$, exactly meeting the generalized Singleton bound [ST20] and resolving a conjecture of Shangguan and Tamo [ST20]. However, their results require randomly puncturing Reed–Solomon codes over an exponentially large field size $2^{O(n)}$, where $n$ is the block length of the code.

A natural question is how small we can take the field size of a capacity-achieving Reed–Solomon code. Brakensiek, Dhar, and Gopi [BDG22, Corollary 1.7, Theorem 1.8] showed that the exponential-in-$n$ field size of [BGM23] is indeed necessary to exactly achieve the generalized Singleton bound for $L=2$, under the additional assumptions that the code is linear and MDS. These assumptions were removed in follow-up work [AGL24], which also generalized the result to all $L$; however, smaller field sizes remained possible if one allowed a small slack $\varepsilon$ in the parameters. Recently, an exciting work of Guo and Zhang [GZ23] showed that Reed–Solomon codes are list-decodable up to capacity, in fact up to (but not exactly at) the generalized Singleton bound, with alphabet size $O(n^2)$.

1.1 Our results

List-decoding Reed–Solomon codes.

Building on Guo and Zhang's argument, we show that Reed–Solomon codes are list-decodable up to capacity and the generalized Singleton bound with linear alphabet size $O(n)$, which is optimal up to the constant factor since a Reed–Solomon code requires $q \geq n$ distinct evaluation points. Our main result is the following.

Theorem 1.1.

Let $\varepsilon \in (0,1)$, $L \geq 2$, and $q$ be a prime power such that $q \geq n + k \cdot 2^{10L/\varepsilon}$. Then with probability at least $1 - 2^{-Ln}$, a randomly punctured Reed–Solomon code of block length $n$ and rate $R = k/n$ over $\mathbb{F}_q$ is $(\frac{L}{L+1}(1-R-\varepsilon), L)$ average-radius list-decodable.

As in previous works like [BGM23, GZ23], Theorem 1.1 gives average-radius list-decodability, a guarantee stronger than list-decodability: for any distinct $L+1$ codewords $c^{(1)},\dots,c^{(L+1)}$ and any vector $y \in \mathbb{F}_q^n$, the average Hamming distance from $c^{(1)},\dots,c^{(L+1)}$ to $y$ is at least $\frac{L}{L+1}(1-R-\varepsilon)n$. Taking $L = O(1/\varepsilon)$ in Theorem 1.1, it follows that Reed–Solomon codes achieve list-decoding capacity even over linear-sized alphabets.
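The average-radius condition is easy to state operationally; the following sketch (toy numbers, not from the paper) checks it for one candidate list of $L+1$ words:

```python
# Sketch: the average-radius guarantee of Theorem 1.1 -- the *average*
# Hamming distance from any L+1 distinct codewords to any y exceeds
# the radius L/(L+1) * (1 - R - eps) * n.
def hamming(u, v):
    return sum(1 for a, b in zip(u, v) if a != b)

def avg_radius_ok(y, codewords, radius):
    avg = sum(hamming(y, c) for c in codewords) / len(codewords)
    return avg > radius

y = (0, 0, 0, 0, 0, 0)
words = [(0, 0, 0, 1, 1, 1), (1, 1, 0, 0, 0, 1), (0, 1, 1, 1, 0, 0)]
L, n, R, eps = 2, 6, 1 / 3, 0.1
radius = (L / (L + 1)) * (1 - R - eps) * n
print(avg_radius_ok(y, words, radius))
```

Average-radius list-decodability implies list-decodability: if all $L+1$ distances were at most the radius, their average would be too.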

Corollary 1.2.

Let $\varepsilon \in (0,1)$ and $q$ be a prime power such that $q \geq n + k \cdot 2^{O(1/\varepsilon^2)}$. Then with probability at least $1 - 2^{-\Omega(n/\varepsilon)}$, a randomly punctured Reed–Solomon code of block length $n$ and rate $R = k/n$ over $\mathbb{F}_q$ is $(1-R-\varepsilon, O(\frac{1}{\varepsilon}))$ average-radius list-decodable.

The alphabet size in [GZ23] is $2^{O(L^2/\varepsilon)}nk$. Our main contribution is improving their alphabet size from quadratic to linear in $n$. As a secondary improvement, we also bring down the constant factor from $2^{O(L^2/\varepsilon)}$ to $2^{O(L/\varepsilon)}$. We defer the proof overview of Theorem 1.1 to Section 3.1, after setting up the necessary notions in Section 2.

In our proof of Theorem 1.1, we maintain a hypergraph perspective of the list-decoding problem, which was introduced in [GLS+22]. Section 2.2 elaborates on the advantages of this perspective, which include (i) more compact notation, definitions, and lemma statements, (ii) our improved constant factor of $2^{O(L/\varepsilon)}$, (iii) an improved alphabet size in our random linear codes result below (Theorem 1.3), and (iv) an alternate presentation of ideas from Brakensiek, Gopi, and Makam [BGM23] that more directly connects the list-decoding problem to the GM-MDS theorem [DSY14, Lov18, YH19] via a hypergraph orientation theorem (see Appendix A).

List-decoding random linear codes.

A random linear code of rate $R$ and length $n$ over $\mathbb{F}_q$ is a random subspace of $\mathbb{F}_q^n$ of dimension $Rn$. List-decoding random linear codes is well studied [ZP81, Eli91, GHSZ02, GHK11, Woo13, RW14, RW18, LW20, MRRZ+20, GLM+21, GM22, PP23] and is an important question for several reasons. First, finding explicit codes approaching list-decoding capacity is a major challenge, and random linear codes provide a stepping stone towards explicit codes: a classic result says that uniformly random codes achieve list-decoding capacity [Eli57, Woz58], and showing list-decodability of random linear codes can be viewed as a derandomization of the uniformly random construction. Mathematically, the list-decodability of random linear codes concerns a fundamental geometric question: to what extent do random subspaces over $\mathbb{F}_q$ behave like uniformly random sets? In coding theory, list-decodable random linear codes are useful building blocks in other constructions [GI01, HW18]. Lastly, the algorithmic question of decoding random linear codes is closely related to the Learning with Errors (LWE) problem in cryptography [Reg09] and the Learning Parity with Noise (LPN) problem in learning theory [BKW03, FGKP06].

The list-decodability of random linear codes is more difficult to analyze than that of uniformly random codes, because their codewords do not enjoy the same independence. Thus the naive argument showing that random linear codes achieve list-decoding capacity [ZP81] gives a list size of $q^{1/\varepsilon}$, exponentially worse than for random codes (here $\varepsilon$ is the gap to the "$q$-ary capacity" $R = 1 - H_q(p)$, where $H_q(x) \stackrel{\rm def}{=} x\log_q(q-1) - x\log_q(x) - (1-x)\log_q(1-x)$ is the $q$-ary entropy function). Several works have sought to circumvent this difficulty [Eli91, GHSZ02, GHK11, Woo13, RW14, RW18, LW20, GLM+21], improving the list-size bound to $O_q(1/\varepsilon)$, matching the list size of uniformly random codes.
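For reference, here is the $q$-ary entropy function from the definition above as a small sketch (numerical values are illustrative, not from the paper):

```python
# Sketch: H_q(x) = x log_q(q-1) - x log_q(x) - (1-x) log_q(1-x),
# used in the q-ary list-decoding capacity R = 1 - H_q(p).
from math import log

def Hq(x, q):
    if x == 0:
        return 0.0
    if x == 1:
        return log(q - 1, q)
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)

print(Hq(0.5, 2))    # binary entropy at 1/2 is exactly 1
print(Hq(0.9, 256))  # for large q, H_q(p) tends toward p
```

As $q$ grows, $H_q(p) \to p$, which is why the $q$-ary capacity $1 - H_q(p)$ approaches the alphabet-independent capacity $p = 1 - R$ discussed next.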

However, these results are more relevant for smaller alphabet sizes $q$, and approaching the alphabet-independent capacity of $p = 1-R$ is less understood. In this setting, uniformly random codes are, with high probability, list-decodable to capacity with optimal alphabet size $2^{O(1/\varepsilon)}$ and optimal list size $O(1/\varepsilon)$. (The alphabet size follows from the list-decoding capacity theorem [Eli57, Woz58]: over $q$-ary alphabets, the list-decoding capacity is $p = H_q^{-1}(1-R)$, which is larger than $1-R-\varepsilon$ when $q \geq 2^{\Omega(1/\varepsilon)}$. For codes over smaller alphabets, the list size $O(1/\varepsilon)$, where $\varepsilon$ is the gap to capacity, is believed to be optimal, but a proof is only known for large radius [GV10]; for approaching the alphabet-independent capacity, the list size $O(1/\varepsilon)$ is known to be optimal by the generalized Singleton bound [ST20].) However, it was not known whether random linear codes (or, in general, more structured codes) could achieve similar parameters. In particular, both of the following questions were open (as far as we are aware).

  • Are rate $R$ random linear codes $(1-R-\varepsilon, O(1/\varepsilon))$-list-decodable with high probability? Previously, this was not known for any alphabet size $q$, even alphabet size growing with the length of the code. The best prior list size for random linear codes list-decodable to radius $p = 1-R-\varepsilon$ was at least $2^{\Omega(1/\varepsilon)}$ [GHK11, RW18]. ([GHK11] appears to give a list-size bound of $O(q^{O_R(1)}/\varepsilon)$, [RW18] appears to give a list-size bound of at least $q^{\log^2(1/\varepsilon)}$, and we need $q \geq 2^{\Omega(1/\varepsilon)}$.)

  • Do there exist any linear codes (even non-constructively) over constant-sized (independent of $n$) alphabets that are $(1-R-\varepsilon, O(1/\varepsilon))$-list-decodable?

Using the same framework as the proof of Theorem 1.1, we answer both questions affirmatively. We show that, with high probability, random linear codes approach the generalized Singleton bound, and thus capacity, with alphabet size close to optimal.

Theorem 1.3.

For all $L \geq 1$ and $\varepsilon \in (0,1)$, a random linear code over alphabet size $q \geq 2^{10L/\varepsilon}$, for $n$ sufficiently large, is with high probability $(\frac{L}{L+1}(1-R-\varepsilon), L)$-average-radius-list-decodable.

By taking $L = O(1/\varepsilon)$, we see that random linear codes achieve capacity with optimal list size $O(1/\varepsilon)$ and near-optimal alphabet size $2^{O(1/\varepsilon^2)}$.

Corollary 1.4.

For all $\varepsilon > 0$, a random linear code over alphabet size $q \geq 2^{O(1/\varepsilon^2)}$, for $n$ sufficiently large, is with high probability $(1-R-\varepsilon, O(1/\varepsilon))$-average-radius-list-decodable.

The techniques developed in this work for the proof of Theorem 1.1 are important for obtaining the strong alphabet size guarantees of Theorem 1.3. One could also have adapted the proof of Guo and Zhang, but doing so in the same natural way would only yield an alphabet size of $O(n)$ (see Section 4.4 for discussion). Further, our use of the hypergraph machinery, which gives a secondary improvement over [GZ23] in the constant factor of the alphabet size in Corollary 1.2, gives the primary improvement in the alphabet size in Corollary 1.4, from $2^{O(1/\varepsilon^3)}$ to $2^{O(1/\varepsilon^2)}$.

As the proof of Theorem 1.3 is very similar to that of Theorem 1.1, for brevity and clarity of presentation we focus on Theorem 1.1 in Section 2 and Section 3. In Section 4, we show how the definitions and the proof can be modified to work for random linear codes.

Alphabet size lower bounds.

Above, we saw that random linear codes achieve list-decoding capacity with optimal list size and near-optimal alphabet size. A natural question, asked by Guo and Zhang, is how large the alphabet size needs to be for capacity-achieving Reed–Solomon codes. We showed that $q \geq n \cdot 2^{O(1/\varepsilon^2)}$ suffices, and by the list-decoding capacity theorem [Eli57, Woz58], we cannot do better than an exponential-type dependence on $1/\varepsilon$ for subconstant $\varepsilon < O(1/\log n)$.

For approaching capacity with constant $\varepsilon$, Ben-Sasson, Kopparty, and Radhakrishnan [BKR10] showed that, for any $c \geq 1$, there exist full-length Reed–Solomon codes that are not list-decodable much beyond the Johnson bound with list size $O(n^c)$. Thus, in order to achieve list-decoding capacity, one needs $q > n$ in some cases. However, while full-length Reed–Solomon codes cannot achieve capacity, perhaps it was possible that Reed–Solomon codes over field size, say, $q = 2n$ or even $q = (1+\gamma)n$, could achieve capacity in all parameter settings. We observe that, as a corollary of [BKR10], such a strong guarantee is not possible: for any $c > 1$, there exist a constant rate $R = R(c) > 0$ and infinitely many field sizes $q$ such that all Reed–Solomon codes of length $n \geq q/c$ and rate $R$ over $\mathbb{F}_q$ are not list-decodable to capacity $1-R$ with list size $n^c$. The proof is in Appendix B.

Proposition 1.5.

Let $\delta = 2^{-b}$ for some integer $b \geq 3$. There exist infinitely many $q$ such that any Reed–Solomon code of length $n \geq 4\delta^{0.99}q$ and rate $\delta$ is not $(1-2\delta, n^{\Omega(\log(1/\delta))})$-list-decodable.

Follow-up work

The techniques in our paper have already been influential. In follow-up work, Brakensiek, Dhar, Gopi, and Zhang [BDG+23b] used our argument to prove that algebraic geometry (AG) codes achieve list-decoding capacity over constant-sized alphabets. They prove this by combining our techniques with a generalized GM-MDS theorem, proved by Brakensiek, Dhar, and Gopi [BDG23a].

2 Preliminaries

2.1 Basic notation

For positive integers $t$, let $[t]$ denote the set $\{1,2,\dots,t\}$. The Hamming distance $d(x,y)$ between two vectors $x,y \in \mathbb{F}_q^n$ is the number of indices $i$ where $x_i \neq y_i$. For a finite field $\mathbb{F}_q$, we follow the standard notation that $\mathbb{F}_q[X_1,\dots,X_n]$ denotes the ring of multivariate polynomials in variables $X_1,\dots,X_n$ over $\mathbb{F}_q$, and $\mathbb{F}_q(X_1,\dots,X_n)$ denotes the field of fractions of the polynomial ring $\mathbb{F}_q[X_1,\dots,X_n]$. By abuse of notation, we write $X_{\leq i}$ or $X_{[i]}$ for the sequence $X_1,\dots,X_i$, and we write, for example, $X_{\leq i} = \alpha_{\leq i}$ for $X_1 = \alpha_1, X_2 = \alpha_2, \dots, X_i = \alpha_i$. Given a matrix $M$ over the field of fractions $\mathbb{F}_q(X_1,\dots,X_n)$ and field elements $\alpha_1,\dots,\alpha_i \in \mathbb{F}_q$, let $M(X_{\leq i} = \alpha_{\leq i})$ denote the matrix over $\mathbb{F}_q(X_{i+1}, X_{i+2}, \dots, X_n)$ obtained by setting $X_{\leq i} = \alpha_{\leq i}$ in $M$.

2.2 Hypergraphs and connectivity

In this work, we maintain a hypergraph perspective of the list-decoding problem, which was introduced in [GLS+22]. We describe a bad list-decoding instance by a hypergraph in which the $L+1$ bad codewords identify the vertices and the $n$ evaluation points identify the hyperedges (Definition 2.1). While prior works described a bad list-decoding instance by $L+1$ sets indicating the agreements of the codewords with the received word, the hypergraph perspective gives us several advantages:

  1. The constraints imposed by a bad list-decoding configuration yield a hypergraph that is weakly-partition-connected. This is a natural notion of hypergraph connectivity, which is well-studied in combinatorics [FKK03b, FKK03a, Kir03] and optimization [JMS03, FK09, Fra11, CX18], and which generalizes a well-known notion ($k$-partition-connectivity) for graphs [NW61, Tut61]. (The notion of $k$-weakly-partition-connected sits between two other well-studied notions: $k$-partition-connected implies $k$-weakly-partition-connected implies $k$-edge-connected [Kir03]. Each of these three notions generalizes an analogous notion on graphs; on graphs, $k$-partition-connected and $k$-weakly-partition-connected are equivalent.) This connection allows us to use more compact notation, definitions, and lemma statements.

  2. Because we work with weakly-partition-connected hypergraphs, we save a factor of $L$ in Lemma 2.14 compared to the analogous lemma in [GZ23]. This allows us to improve the constant factor in the alphabet size for Reed–Solomon codes from $2^{O(L^2/\varepsilon)}$ in [GZ23] to $2^{O(L/\varepsilon)}$ in Theorem 1.1.

  3. For similar reasons, for random linear codes, the hypergraph perspective saves a factor of $L$ in the exponent of the alphabet size, improving it from $2^{O(L^2/\varepsilon)}$ to $2^{O(L/\varepsilon)}$ in Theorem 1.3.

  4. With the hypergraph perspective, we can give a new presentation of the results in [BGM23] and more directly connect the list-decoding problem to the GM-MDS theorem [DSY14, Lov18, YH19], as the heavy lifting in the combinatorics is done using known results on hypergraph orientations. This is done in Appendix A.

A hypergraph $\mathcal{H} = (V, \mathcal{E})$ is given by a set of vertices $V$ and a set $\mathcal{E}$ of (hyper)edges, which are subsets of the vertices $V$. In this work, all hypergraphs have labeled edges, meaning we enumerate our edges $e_i$ by distinct indices $i$ from some set, typically $[n]$, in which case we may also think of $\mathcal{E}$ as a tuple $(e_1,\dots,e_n)$. Throughout this paper, the vertex set $V$ is typically $[t]$ for some positive integer $t$. The weight of a hyperedge $e$ is $\operatorname{wt}(e) \stackrel{\rm def}{=} \max(0, |e|-1)$, and the weight of a set of hyperedges $\mathcal{E}$ is simply $\operatorname{wt}(\mathcal{E}) \stackrel{\rm def}{=} \sum_{e\in\mathcal{E}} \operatorname{wt}(e)$.

[Figure 1 depicts example hyperedges on vertices $f^{(1)},\dots,f^{(7)}$: $e_{n-2} = \{1,2,4\}$ means $f^{(1)}(\alpha_{n-2}) = f^{(2)}(\alpha_{n-2}) = f^{(4)}(\alpha_{n-2}) = y_{n-2}$; $e_{n-1} = \{5,6\}$ means $f^{(5)}(\alpha_{n-1}) = f^{(6)}(\alpha_{n-1}) = y_{n-1}$; $e_n = \{7\}$ means $f^{(7)}(\alpha_n) = y_n$.]
Figure 1: Example edges from an agreement hypergraph $\mathcal{H} = ([7], (e_1,\dots,e_n))$ (Definition 2.1) arising from a bad list-decoding configuration with polynomials $f^{(1)},\dots,f^{(7)} \in \mathbb{F}_q[X]$, received word $y \in \mathbb{F}_q^n$, and evaluation points $\alpha_1,\dots,\alpha_n$.

All hypergraphs that we will consider in this work are agreement hypergraphs for a bad list-decoding configuration. See Figure 1 for an illustration.

Definition 2.1 (Agreement Hypergraph).

Given vectors $y, c^{(1)},\dots,c^{(t)} \in \mathbb{F}_q^n$, the agreement hypergraph has vertex set $[t]$ and a tuple of $n$ hyperedges $(e_1,\dots,e_n)$, where $e_i \stackrel{\rm def}{=} \{j \in [t] : c^{(j)}_i = y_i\}$.
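Definition 2.1 is directly computable; the following sketch (toy data, not from the paper) builds the agreement hypergraph and its total weight $\operatorname{wt}(\mathcal{E})$:

```python
# Sketch: hyperedge e_i collects the indices of the codewords that agree
# with y in coordinate i; wt(e) = max(0, |e| - 1).
def agreement_hypergraph(y, codewords):
    n, t = len(y), len(codewords)
    return [frozenset(j for j in range(t) if codewords[j][i] == y[i])
            for i in range(n)]

def weight(edges):
    return sum(max(0, len(e) - 1) for e in edges)

y = (1, 2, 3, 4)
cws = [(1, 2, 0, 0), (1, 0, 3, 0), (0, 2, 3, 4)]
edges = agreement_hypergraph(y, cws)
print(edges)           # e_1={0,1}, e_2={0,2}, e_3={1,2}, e_4={2}
print(weight(edges))   # three 2-element edges, each of weight 1
```

(Vertices are 0-indexed here, while the paper indexes codewords by $[t]$.)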

A key property of hypergraphs that we are concerned with is weak-partition-connectivity.

Definition 2.2 (Weak Partition Connectivity).

A hypergraph $\mathcal{H} = ([t], \mathcal{E})$ is $k$-weakly-partition-connected if, for every partition $\mathcal{P}$ of the vertex set $[t]$,

$$\sum_{e\in\mathcal{E}} \max\{|\mathcal{P}(e)|-1,\, 0\} \;\geq\; k(|\mathcal{P}|-1) \qquad (2)$$

where $|\mathcal{P}|$ is the number of parts of the partition, and $|\mathcal{P}(e)|$ is the number of parts of the partition that edge $e$ intersects.
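For small vertex sets, Definition 2.2 can be verified exhaustively. The following sketch (brute force, exponential in $t$, so only for toy sizes; not from the paper) enumerates every partition and checks inequality (2):

```python
# Sketch: check k-weak-partition-connectivity by enumerating all
# partitions of the vertex set {0, ..., t-1}.
def partitions(items):
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        yield part + [{first}]

def is_weakly_partition_connected(t, edges, k):
    for P in partitions(list(range(t))):
        if len(P) == 1:
            continue  # the trivial partition imposes no constraint
        lhs = sum(max(sum(1 for part in P if e & part) - 1, 0)
                  for e in edges)
        if lhs < k * (len(P) - 1):
            return False
    return True

# a triangle (as a 2-uniform hypergraph) is 1-weakly-partition-connected
tri = [{0, 1}, {1, 2}, {0, 2}]
print(is_weakly_partition_connected(3, tri, 1))
```

For the triangle, the all-singletons partition gives left-hand side $3$ against $k \cdot 2$, so the check passes for $k=1$ but fails for $k=2$.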

To give some intuition for weak partition connectivity, we state two of its combinatorial implications. First, if a hypergraph is $k$-weakly-partition-connected, then it is $k$-edge-connected [Kir03] (the converse is not true in general), which, by the hypergraph Menger's (max-flow-min-cut) theorem [Kir03, Theorem 1.11], equivalently means that every pair of vertices has $k$ edge-disjoint (hyper)paths between them. Second, suppose we replace every hyperedge $e$ with an arbitrary spanning tree of its vertices (which we effectively do in Definition 2.6). The resulting (non-hyper)graph is $k$-partition-connected (in (non-hyper)graphs, $k$-partition-connectivity and $k$-weak-partition-connectivity are equivalent), which, by the Nash-Williams–Tutte tree-packing theorem [NW61, Tut61], equivalently means there are $k$ edge-disjoint spanning trees (this connection was used in [GLS+22]).

The key reason we consider weak-partition-connectivity is that a bad list-decoding configuration yields a $k$-weakly-partition-connected agreement hypergraph.

Lemma 2.3 (A bad list gives a $k$-weakly-partition-connected hypergraph; see also Lemma 7.4 of [GLS+22]).

Suppose that vectors $y, c^{(1)},\dots,c^{(L+1)} \in \mathbb{F}_q^n$ are such that the average Hamming distance from $y$ to $c^{(1)},\dots,c^{(L+1)}$ is at most $\frac{L}{L+1}(n-k)$; that is, $\sum_{j=1}^{L+1} d(y, c^{(j)}) \leq L(n-k)$. Then, for some subset $J \subseteq [L+1]$ with $|J| \geq 2$, the agreement hypergraph of $(y, c^{(j)} : j \in J)$ is $k$-weakly-partition-connected.

Lemma 2.3 follows from the following result about weakly-partition-connected hypergraphs.

Lemma 2.4.

Let $\mathcal{H} = (V, \mathcal{E})$ be a hypergraph with at least two vertices and with total edge weight $\sum_{e\in\mathcal{E}} \operatorname{wt}(e) \geq k(|V|-1)$, for some positive integer $k$. Then there exists a subset $V' \subseteq V$ of at least two vertices such that the hypergraph $\mathcal{H}' = (V', \{V' \cap e : e \in \mathcal{E}\})$ is $k$-weakly-partition-connected.

Proof.

Let $V'$ be an inclusion-minimal subset $V' \subseteq V$ with $|V'| \geq 2$ such that

$$\sum_{e\in\mathcal{E}} \operatorname{wt}(e \cap V') \geq (|V'|-1)k. \qquad (3)$$

By assumption, $V' = V$ satisfies (3), so $V'$ exists (note that singleton subsets of $V$ satisfy (3) with equality). Let $\mathcal{H}' = (V', \mathcal{E}')$ be the hypergraph with edge set $\mathcal{E}' = \{V' \cap e : e \in \mathcal{E}\}$. By minimality of $V'$, for all $V'' \subsetneq V'$, we have $\sum_{e\in\mathcal{E}'} \operatorname{wt}(e \cap V'') \leq (|V''|-1)k$. Now, consider a partition $\mathcal{P} = P_1 \sqcup \cdots \sqcup P_p$ of $V'$ with $P_i \neq V'$ for all $i \in [p]$ (otherwise (2) trivially holds). We have

$$\begin{aligned}
\sum_{e\in\mathcal{E}'} \max\{|\mathcal{P}(e)|-1,0\}
&= \sum_{e\in\mathcal{E}',\, e\neq\varnothing} \Big(-1 + \sum_{\ell=1}^{p} \mathbf{1}[|e \cap P_\ell| > 0]\Big) \\
&= \sum_{e\in\mathcal{E}',\, e\neq\varnothing} \Big((|e|-1) - \sum_{\ell=1}^{p} \big(|e \cap P_\ell| - \mathbf{1}[|e \cap P_\ell| > 0]\big)\Big) \\
&= \sum_{e\in\mathcal{E}',\, e\neq\varnothing} \Big(\max(|e|-1, 0) - \sum_{\ell=1}^{p} \max(|e \cap P_\ell| - 1, 0)\Big) \\
&= \sum_{e\in\mathcal{E}'} \operatorname{wt}(e) - \sum_{\ell=1}^{p} \sum_{e\in\mathcal{E}'} \operatorname{wt}(e \cap P_\ell) \\
&\geq (|V'|-1)k - \sum_{\ell=1}^{p} (|P_\ell|-1)k \\
&= (p-1)k = (|\mathcal{P}|-1)k. \qquad (4)
\end{aligned}$$

This holds for all partitions $\mathcal{P}$ of $V'$, so $\mathcal{H}'$ is $k$-weakly-partition-connected. ∎
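The minimal subset $V'$ in the proof above can be found by exhaustive search for small instances. A sketch (exhaustive, toy sizes only; not from the paper):

```python
# Sketch: find a vertex subset V' with |V'| >= 2 satisfying (3) that is
# minimal among subsets of size >= 2, as in the proof of Lemma 2.4.
from itertools import combinations

def wt(e):
    return max(0, len(e) - 1)

def minimal_connected_subset(t, edges, k):
    # scan subsets from smallest size up; the first hit has no proper
    # subset of size >= 2 satisfying (3), since those were all rejected
    for size in range(2, t + 1):
        for sub in combinations(range(t), size):
            V = set(sub)
            if sum(wt(e & V) for e in edges) >= (len(V) - 1) * k:
                return V
    return None

edges = [{0, 1}, {0, 1}, {1, 2}]             # vertices 0 and 1 share two edges
print(minimal_connected_subset(3, edges, 2))  # {0, 1} satisfies (3) for k=2
```

Here $\{0,1\}$ accumulates weight $2 \geq (2-1)\cdot 2$, matching the lemma's guarantee that some small sub-hypergraph is already well connected.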

Proof of Lemma 2.3.

Consider the agreement hypergraph $([L+1], \mathcal{E})$ of $y, c^{(1)},\dots,c^{(L+1)}$. The total edge weight is

$$\sum_{e\in\mathcal{E}} \operatorname{wt}(e) \;\geq\; -n + \sum_{e\in\mathcal{E}} |e| \;=\; -n + \sum_{i=1}^{n} \sum_{j=1}^{L+1} \mathbf{1}[y_i = c^{(j)}_i] \;=\; -n + \sum_{j=1}^{L+1} \big(n - d(y, c^{(j)})\big) \;\geq\; Lk. \qquad (5)$$

By Lemma 2.4, there exists a subset $J \subseteq [L+1]$ of at least two vertices such that $\mathcal{H}' = (J, \{J \cap e : e \in \mathcal{E}\})$ — which is exactly the agreement hypergraph of $(y, c^{(j)} : j \in J)$ — is $k$-weakly-partition-connected. ∎

Remark 2.5.

The condition |J|2|J|\geq 2 is needed later so that the reduced intersection matrix (defined below) is not a 0×00\times 0 matrix, in which case the matrix does not help establish list-decodability.

2.3 Reduced intersection matrices: definition and example

As in [GZ23], we work with the reduced intersection matrix, though our proof should work essentially the same with a different matrix called the (non-reduced) intersection matrix, which was considered in [ST20, GLS+22, BGM23].

Definition 2.6 (Reduced intersection matrix).

The reduced intersection matrix 𝖱𝖨𝖬k,q,\mathsf{RIM}_{k,q,\mathcal{H}} associated with a prime power qq, degree kk, and a hypergraph =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})) is a wt()×(t1)k\operatorname*{wt}(\mathcal{E})\times(t-1)k matrix over the field of fractions 𝔽q(X1,,Xn)\mathbb{F}_{q}(X_{1},\dots,X_{n}). For each hyperedge eie_{i} with vertices j1<j2<<j|ei|j_{1}<j_{2}<\dots<j_{|e_{i}|}, we add wt(ei)=|ei|1\operatorname*{wt}(e_{i})=|e_{i}|-1 rows to 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}. For u=2,,|ei|u=2,\dots,|e_{i}|, we add a row ri,u=(r(1),,r(t1))r_{i,u}=(r^{(1)},\ldots,r^{(t-1)}) of length (t1)k(t-1)k defined as follows:

  • If j=j1j=j_{1}, then r(j)=[1,Xi,Xi2,,Xik1]r^{(j)}=[1,X_{i},X_{i}^{2},\dots,X_{i}^{k-1}]

  • If j=juj=j_{u} and jutj_{u}\neq t, then r(j)=[1,Xi,Xi2,,Xik1]r^{(j)}=-[1,X_{i},X_{i}^{2},\dots,X_{i}^{k-1}]

  • Otherwise, r(j)=0kr^{(j)}=0^{k}.

We typically omit k and q and write \mathsf{RIM}_{\mathcal{H}}, as they are understood from context.
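To make Definition 2.6 concrete, the following Python sketch (our own illustration — `rim_rows` and its arguments are names we introduce, not from the paper) builds the rows of \mathsf{RIM}_{\mathcal{H}} after substituting X_{i}=\alpha_{i}, so that entries are plain numbers rather than indeterminates:

```python
def rim_rows(t, k, edges, alphas):
    """Rows of RIM_H with the substitution X_i = alphas[i] (Definition 2.6).

    edges[i] is the vertex set of hyperedge e_{i+1} inside [t]; each row
    consists of t - 1 blocks of length k, one block per vertex 1..t-1.
    """
    rows = []
    for e, a in zip(edges, alphas):
        vs = sorted(e)
        vand = [a ** p for p in range(k)]    # Vandermonde row [1, a, ..., a^{k-1}]
        for ju in vs[1:]:                    # wt(e) = |e| - 1 rows per hyperedge
            row = [0] * ((t - 1) * k)
            row[(vs[0] - 1) * k : vs[0] * k] = vand
            if ju != t:                      # column block t does not exist
                row[(ju - 1) * k : ju * k] = [-v for v in vand]
            rows.append(row)
    return rows

# Edges consistent with Example 2.7 / matrix (6): e_{n-2} = {1,2,4} and
# e_{n-1} = {5,6} on t = 7 vertices (evaluation values here are arbitrary).
rows = rim_rows(7, 2, [(1, 2, 4), (5, 6)], [3, 5])
assert len(rows) == 3 and all(len(row) == 6 * 2 for row in rows)
```

Singleton edges contribute no rows, matching \operatorname{wt}(e)=|e|-1.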

Example 2.7.

Recall the example edges of the agreement hypergraph =([7],(e1,,en))\mathcal{H}=([7],(e_{1},\dots,e_{n})) in Figure 1.

[Figure 1 reproduced: the example agreement hypergraph on vertices f^{(1)},\dots,f^{(7)}, with hyperedges e_{n-2}, e_{n-1}, e_{n} drawn.]

The edges en2,en1,ene_{n-2},e_{n-1},e_{n} from \mathcal{H} contribute the following length (t1)k(t-1)k rows to its reduced intersection matrix:

\begin{align*}
\begin{bmatrix}
V_{n-2}&-V_{n-2}&0&0&0&0\\
V_{n-2}&0&0&-V_{n-2}&0&0\\
0&0&0&0&V_{n-1}&-V_{n-1}
\end{bmatrix} \tag{6}
\end{align*}

Here V_{i}=[1,X_{i},X_{i}^{2},\dots,X_{i}^{k-1}] is a “Vandermonde row,” and 0 denotes the length-k vector [0,0,\dots,0]. Note that each edge e contributes |e|-1 rows to the reduced intersection matrix, and in particular e_{n} does not contribute any rows.

Reduced intersection matrices arise by encoding all agreements from a bad list-decoding configuration into linear constraints on the message symbols (the polynomial coefficients). These constraints are placed into one matrix that we call the reduced intersection matrix. The following lemma implies that, if every reduced intersection matrix arising from a possible bad list-decoding configuration has full column rank when X1=α1,,Xn=αnX_{1}=\alpha_{1},\dots,X_{n}=\alpha_{n}, the corresponding Reed–Solomon code is list-decodable.

Lemma 2.8 (RIMs of agreement hypergraphs do not have full column rank).

Let \mathcal{H} be an agreement hypergraph for (y,c(1),,c(t))(y,c^{(1)},\dots,c^{(t)}), where c(j)𝔽qnc^{(j)}\in\mathbb{F}_{q}^{n} are codewords of RSn,k(α1,,αn)RS_{n,k}(\alpha_{1},\dots,\alpha_{n}), not all equal to each other. Then the reduced intersection matrix 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) does not have full column rank.

Proof.

By definition, applying a row of \mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) arising from edge e_{i} and vertex j_{u} to the coefficient-difference vector below yields f^{(j_{1})}(\alpha_{i})-f^{(j_{u})}(\alpha_{i})=y_{i}-y_{i}=0, since both codewords agree with y at position i. Hence

\begin{align*}
\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]})\cdot\begin{bmatrix}f^{(1)}-f^{(t)}\\ \vdots\\ f^{(t-1)}-f^{(t)}\end{bmatrix}=0, \tag{7}
\end{align*}

where f^{(1)},\dots,f^{(t)}\in\mathbb{F}_{q}^{k} are the vectors of coefficients of the polynomials that generate the codewords c^{(1)},\dots,c^{(t)}\in\mathbb{F}_{q}^{n}. Since these vectors are not all equal to each other, the vector above is nonzero, and so \mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) does not have full column rank. ∎
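The identity (7) can be verified numerically. The following self-contained Python sketch (our own code, with toy polynomials over the integers, which embed into \mathbb{F}_{q} of large characteristic) builds the agreement hypergraph — placing vertex j in edge e_{i} iff c^{(j)} agrees with y at position i — and checks that every row of the evaluated reduced intersection matrix annihilates the coefficient-difference vector:

```python
def poly_eval(coeffs, x):
    return sum(c * x ** p for p, c in enumerate(coeffs))

t, k = 3, 2
alphas = [0, 1, 2, 3]
polys = [[1, 2], [1, 1], [4, 0]]   # coefficient vectors f^(1), f^(2), f^(3)
codewords = [[poly_eval(f, a) for a in alphas] for f in polys]
y = [1, 3, 4, 4]                   # received word

# Agreement hypergraph: vertex j lies in edge e_i iff c^(j) agrees with y
# at position i.
edges = [[j + 1 for j in range(t) if codewords[j][i] == y[i]]
         for i in range(len(alphas))]

# Rows of RIM_H(X_[n] = alpha_[n]) as in Definition 2.6.
rows = []
for e, a in zip(edges, alphas):
    vand = [a ** p for p in range(k)]
    for ju in e[1:]:                         # singleton edges contribute no rows
        row = [0] * ((t - 1) * k)
        row[(e[0] - 1) * k : e[0] * k] = vand
        if ju != t:
            row[(ju - 1) * k : ju * k] = [-v for v in vand]
        rows.append(row)

# Coefficient-difference vector (f^(1) - f^(t), ..., f^(t-1) - f^(t)).
diff = [c - d for f in polys[:-1] for c, d in zip(f, polys[-1])]
products = [sum(x * v for x, v in zip(row, diff)) for row in rows]
assert products == [0] * len(rows)           # identity (7)
```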

Remark 2.9 (Symmetries of reduced intersection matrices).

From this definition, it should be clear that we can divide the variables X1,,XnX_{1},\dots,X_{n} into at most 2L2^{L} classes such that variables in the same class are exchangeable with respect to the reduced intersection matrix 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}: if eie_{i} and eie_{i^{\prime}} are the same hyperedge, then swapping XiX_{i} and XiX_{i^{\prime}} yields the same reduced intersection matrix (up to row permutations). This observation, which was alluded to in [GZ23], turns out to be crucial in our argument that allows us to improve the alphabet size in [GZ23] from quadratic to linear.
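The exchangeability can be illustrated directly: swapping the evaluation values attached to two edges of the same type permutes the rows of the evaluated matrix, while swapping across different types generally does not. A small sketch with our own toy edges (the row-building helper follows Definition 2.6 with X_{i}=\alpha_{i} substituted):

```python
def rows_for(t, k, edges, alphas):
    """Rows of the evaluated reduced intersection matrix (Definition 2.6)."""
    out = []
    for e, a in zip(edges, alphas):
        vs = sorted(e)
        vand = [a ** p for p in range(k)]
        for ju in vs[1:]:
            row = [0] * ((t - 1) * k)
            row[(vs[0] - 1) * k : vs[0] * k] = vand
            if ju != t:
                row[(ju - 1) * k : ju * k] = [-v for v in vand]
            out.append(row)
    return out

t, k = 3, 2
edges = [(1, 2), (1, 2), (1, 3)]       # e_1 and e_2 have the same type {1,2}
r1 = rows_for(t, k, edges, [5, 9, 7])
r2 = rows_for(t, k, edges, [9, 5, 7])  # swap values of X_1, X_2 (same type)
assert sorted(r1) == sorted(r2)        # same matrix up to row permutation
r3 = rows_for(t, k, edges, [7, 9, 5])  # swap X_1, X_3 (different types)
assert sorted(r1) != sorted(r3)
```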

Remark 2.10.

The pairwise distinctness requirement in the definition of average-radius-list-decodability (see Section 1.1) is nonetheless crucial in the proof of Theorem 1.1, despite the weaker “not all equal” requirement in Lemma 2.8. That is because we eventually apply Lemma 2.8 to the subcollection of codewords given by Lemma 2.3, which can be an arbitrary subcollection of the original list. The guarantee that the codewords in this subcollection are not all equal to each other then follows from pairwise distinctness of the codewords in the original list.

2.4 Reduced intersection matrices: full column rank

The following theorem shows that reduced intersection matrices of k-weakly-partition-connected hypergraphs have full column rank when viewed as matrices over \mathbb{F}_{q}(X_{1},\dots,X_{n}). This was essentially conjectured by Shangguan and Tamo [ST20] and essentially established by Brakensiek, Gopi, and Makam [BGM23], who conjectured and showed, respectively, nonsingularity of the (non-reduced) intersection matrix under similar conditions. By the same union bound argument as in [ST20, Theorem 5.8], Theorem 2.11 already implies list-decodability of Reed–Solomon codes up to the generalized Singleton bound over exponentially large field sizes, which is [BGM23, Theorem 1.5]. For completeness, and to demonstrate how the hypergraph perspective more directly connects the list-decoding problem to the GM-MDS theorem, we include a proof of Theorem 2.11 in Appendix A.

Theorem 2.11 (Full column rank. Implicit from Theorem A.2 of [BGM23]).

Let nn and kk be positive integers and 𝔽q\mathbb{F}_{q} be a finite field. Let \mathcal{H} be a kk-weakly-partition-connected hypergraph with nn hyperedges and at least 22 vertices. Then 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} has full column rank over the field 𝔽q(X1,,Xn)\mathbb{F}_{q}(X_{1},\cdots,X_{n}).

Remark 2.12.

We note that [BGM23] assumes throughout the paper that the alphabet size q is sufficiently large, but Theorem 2.11 follows from the weaker “q sufficiently large” version: for any fixed field size q, take Q to be a sufficiently large power of q. Then, by the “q sufficiently large” version of Theorem 2.11, the matrix \mathsf{RIM}_{Q,\mathcal{H}} has full column rank over the field \mathbb{F}_{Q}(X_{1},\dots,X_{n}). Hence, the determinant of some square full-rank submatrix of \mathsf{RIM}_{Q,\mathcal{H}} is a nonzero polynomial in \mathbb{F}_{Q}[X_{1},\dots,X_{n}]. The entries of \mathsf{RIM}_{Q,\mathcal{H}} can all be viewed as polynomials over \mathbb{F}_{q}, so the corresponding full-rank submatrix of \mathsf{RIM}_{q,\mathcal{H}} has a determinant that is a nonzero polynomial in \mathbb{F}_{q}[X_{1},\dots,X_{n}] — symbolically, the determinants are the same polynomials, as \mathbb{F}_{q} and \mathbb{F}_{Q} have the same characteristic. Hence, the matrix \mathsf{RIM}_{q,\mathcal{H}} has full column rank over the field \mathbb{F}_{q}(X_{1},\dots,X_{n}).

2.5 Reduced intersection matrix: row deletions

As in [GZ23], we consider row deletions from the reduced intersection matrix. The goal of this section is to establish Lemma 2.14, which shows that the full-column-rank-ness of reduced intersection matrices is robust to row deletions.

Definition 2.13 (Row deletion of reduced intersection matrix).

Given a hypergraph =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})) and set B[n]B\subseteq[n], define 𝖱𝖨𝖬B\mathsf{RIM}_{\mathcal{H}}^{B} to be the submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} obtained by deleting all rows containing a variable XiX_{i} with iBi\in B.

The next lemma appears in a weaker form in [GZ23]. It roughly says that, given a reduced intersection matrix 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} with some constant factor “slack” in the combinatorial constraints, we can omit a constant fraction of the rows without compromising the full-column-rank-ness of the matrix. Our version of this lemma saves roughly a factor of tLt\sim L compared to the analogous lemma [GZ23, Lemma 3.11]. The reason is that the kk-weakly-partition-connected condition is more robust to these row deletions (by a factor of roughly tt) than the condition in [GZ23]. As such, our proof is also more direct.

Lemma 2.14 (Robustness to deletions. Similar to Lemma 3.11 of [GZ23]).

Let =([t],)\mathcal{H}=([t],\mathcal{E}) be a (k+εn)(k+\varepsilon n)-weakly-partition-connected hypergraph with t2t\geq 2. For all sets B[n]B\subset[n] with |B|εn|B|\leq\varepsilon n, we have that 𝖱𝖨𝖬B\mathsf{RIM}_{\mathcal{H}}^{B} is nonempty and has full column rank.

Proof.

By definition of the reduced intersection matrix \mathsf{RIM}_{\mathcal{H}}, the matrix with row deletions \mathsf{RIM}_{\mathcal{H}}^{B} is the matrix \mathsf{RIM}_{\mathcal{H}^{\prime}}, where \mathcal{H}^{\prime}=([t],\mathcal{E}^{\prime}) is the hypergraph obtained from \mathcal{H} by deleting e_{i} for i\in B. By Theorem 2.11, it suffices to prove that \mathcal{H}^{\prime} is k-weakly-partition-connected. Indeed, consider any partition \mathcal{P} of [t]. We have

\begin{align*}
\sum_{e\in\mathcal{E}^{\prime}}\max\{|\mathcal{P}(e)|-1,0\} &= \sum_{i\in[n]}\max\{|\mathcal{P}(e_{i})|-1,0\}-\sum_{i\in B}\max\{|\mathcal{P}(e_{i})|-1,0\}\\
&\geq (k+\varepsilon n)\cdot(|\mathcal{P}|-1)-|B|\cdot(|\mathcal{P}|-1)\geq k\cdot(|\mathcal{P}|-1), \tag{8}
\end{align*}

as desired. The first inequality holds because \mathcal{H} is (k+\varepsilon n)-weakly-partition-connected and, trivially, any edge e_{i} touches at most |\mathcal{P}| parts of \mathcal{P}; the last inequality uses |B|\leq\varepsilon n. ∎
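For intuition, k-weak-partition-connectivity and its robustness to edge deletions can be checked by brute force on small hypergraphs. The following Python sketch (our own code; `is_wpc`, `partitions`, and the triangle example are ours) enumerates all set partitions of the vertex set:

```python
def partitions(elems):
    """Yield all set partitions of the list elems (Bell-number many)."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in partitions(rest):
        for i in range(len(part)):           # put `first` into an existing part
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part               # or into a new singleton part

def is_wpc(t, edges, k):
    """Brute-force check that ([t], edges) is k-weakly-partition-connected:
    every partition P of [t] satisfies sum_e max(|P(e)| - 1, 0) >= k(|P| - 1)."""
    for part in partitions(list(range(1, t + 1))):
        total = sum(max(sum(1 for p in part if set(e) & set(p)) - 1, 0)
                    for e in edges)
        if total < k * (len(part) - 1):
            return False
    return True

# Two copies of each edge of the triangle on [3]: 3- but not 4-w.p.c.
# (the all-singletons partition is tight: 6 = 3 * (3 - 1)).
edges = [(1, 2), (1, 3), (2, 3)] * 2
assert is_wpc(3, edges, 3) and not is_wpc(3, edges, 4)
# Robustness to deletions (cf. Lemma 2.14): dropping any one edge
# still leaves a 2-weakly-partition-connected hypergraph.
assert all(is_wpc(3, edges[:i] + edges[i + 1:], 2) for i in range(len(edges)))
```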

3 Proof of list-decodability with linear-sized alphabets

3.1 Overview of the proof

[Figure 2: flowchart of the proof. Lemma 2.3: a bad list-decoding configuration has a (k+\varepsilon n)-w.p.c. agreement hypergraph. Lemma 2.8: RIMs for agreement hypergraphs do not have full column rank. Lemma 3.1: RIMs for (k+\varepsilon n)-w.p.c. hypergraphs have full column rank w.h.p.; a union bound over possible agreement hypergraphs then yields Theorem 1.1 (the RS code is list-decodable w.h.p.). Lemma 3.1 in turn follows from properties of \mathtt{GetCertificate}, which generates certificates for non-full-rank RIMs: Lemma 3.8 (if a RIM is not full column rank, it admits a certificate), Corollary 3.10 (the number of possible certificates is small), Corollary 3.12 (the probability of any one certificate is very small), and a union bound over possible certificates.]
Figure 2: A roadmap of our proof. The orange boxes are preliminaries, and the blue-green boxes are the meat of the proof, addressed in Section 3. All probabilities are over the random choice of evaluation points \alpha_{1},\dots,\alpha_{n} for our Reed–Solomon code.

By Lemma 2.8 and Lemma 2.3, every bad list-decoding configuration admits a weakly-partition-connected agreement hypergraph whose reduced intersection matrix does not have full column rank. Thus, to prove Theorem 1.1, it suffices to show that, with high probability, every such reduced intersection matrix has full column rank. The main technical lemma for this section is the one stated below. Our main result, Theorem 1.1, follows by applying Lemma 2.3 and Lemma 2.8 with Lemma 3.1, and taking a union bound over all t=2L+12tn\sum_{t=2}^{L+1}{2^{tn}} possible agreement hypergraphs.

Lemma 3.1.

Let kk be a positive integer and ε>0\varepsilon>0. For each (k+εn)(k+\varepsilon n)-weakly-partition-connected hypergraph =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})) with t2t\geq 2, we have, for r=εn/2r=\lfloor{\varepsilon n/2}\rfloor,

\begin{align*}
\mathop{\bf Pr}_{\alpha_{1},\dots,\alpha_{n}\sim\mathbb{F}_{q}\text{ distinct}}\left[\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]})\text{ does not have full column rank}\right]\leq\binom{n}{r}2^{tr}\cdot\left(\frac{(t-1)k}{q-n}\right)^{r}. \tag{9}
\end{align*}

At the highest level, the proof of Lemma 3.1 follows the same outline as [GZ23]. For every sequence of evaluation points (\alpha_{1},\dots,\alpha_{n})\in\mathbb{F}_{q}^{n} for which \mathsf{RIM}_{\mathcal{H}} does not have full column rank, we show that there is a certificate (i_{1},\dots,i_{r})\in[n]^{r} consisting of distinct indices in [n] (Lemma 3.8), which intuitively “attests” to the failure of the matrix \mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) to have full column rank. We then show that, for any certificate (i_{1},\dots,i_{r}), the probability that (\alpha_{1},\dots,\alpha_{n}) has certificate (i_{1},\dots,i_{r}) is exponentially small — more precisely, at most (\frac{(t-1)k}{q-n})^{r}; see Corollary 3.12. We then show that there are not too many certificates (Corollary 3.10), and union bound over the number of possible certificates to obtain the desired result (Lemma 3.1).

Our argument differs from [GZ23] in how we choose our certificates. The argument of [GZ23] allowed for up to n^{r} certificates. Our argument instead needs only \binom{n}{r}2^{tr} certificates, which is much smaller when r=\Omega(n) (the parameter regime of interest here) and overall allows us to save a factor of n in the alphabet size. Our savings come from leveraging the fact that there are at most 2^{t} different “types” of hyperedges (see Remark 2.9), and thus at most 2^{t} different types of variables X_{i} in the reduced intersection matrix \mathsf{RIM}_{\mathcal{H}}. This observation was alluded to in [GZ23]. (Guo and Zhang [GZ23] write: “It is possible that achieving an alphabet size linear in n would require establishing and exploiting other properties of intersection matrices or reduced intersection matrices, such as an appropriate notion of exchangeability.” We found this prediction to be insightful and true.) With this observation in mind, we assume, without loss of generality, that the edges of \mathcal{H} are ordered by their respective type (we can relabel the edges of \mathcal{H}, which effectively permutes the rows of \mathsf{RIM}_{\mathcal{H}}).

Our method of generating a certificate (i_{1},\dots,i_{r}) for the evaluation sequence (\alpha_{1},\dots,\alpha_{n}) (Algorithm 2) is similar to that of [GZ23] at a high level. With each certificate i_{1},\dots,i_{r}, we associate a sequence of (t-1)k\times(t-1)k submatrices M_{1},\dots,M_{r} of \mathsf{RIM}_{\mathcal{H}} (Algorithm 1) that are entirely specified by i_{1},\dots,i_{r}, as follows. Since the evaluation X_{[n]}=\alpha_{[n]} makes \mathsf{RIM}_{\mathcal{H}} fail to have full column rank, every (t-1)k\times(t-1)k submatrix becomes singular under this evaluation. Thus, if we sequentially “reveal” X_{1}=\alpha_{1},X_{2}=\alpha_{2},\dots, then at some point M_{j} becomes singular, and this happens exactly when we set X_{i_{j}}=\alpha_{i_{j}} — in fact, i_{j} is defined as such, so that we select M_{1},i_{1},M_{2},i_{2},\dots, in that order; we emphasize that M_{j} can be computed from i_{1},\dots,i_{j-1} without knowing \alpha_{1},\dots,\alpha_{n}. Conditioned on M_{j} being nonsingular after setting X_{1}=\alpha_{1},\dots,X_{i_{j}-1}=\alpha_{i_{j}-1}, the probability that M_{j} becomes singular when setting X_{i_{j}}=\alpha_{i_{j}} is at most \frac{(t-1)k}{q-n}: the value \alpha_{i_{j}} is uniformly random over at least q-n field elements, and the degree of X_{i_{j}} in the determinant of M_{j} is at most (t-1)k (and the determinant is a nonzero polynomial by definition). Running conditional probabilities in the correct order, we conclude that the probability that a particular certificate i_{1},\dots,i_{r} is generated is at most (\frac{(t-1)k}{q-n})^{r}, just as in [GZ23].
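The probability step above rests on the basic fact that a nonzero polynomial of degree at most d in one variable vanishes on at most d field elements, so a uniformly random evaluation avoids its roots with probability at least 1-d/(q-n). A brute-force illustration over a small prime field (the polynomial below is an arbitrary example of ours):

```python
# A nonzero degree-d polynomial over F_p has at most d roots, so
# Pr[random alpha makes it vanish] <= d / p.
p, d = 101, 6
coeffs = [3, 0, 7, 1, 0, 0, 5]            # degree-6 polynomial over F_101
roots = [x for x in range(p)
         if sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p == 0]
assert len(roots) <= d
```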

Whereas [GZ23] picks any matrix M_{j} that is obtained after removing the variables X_{i_{1}},\dots,X_{i_{j-1}}, we make a more deliberate choice of matrices by leveraging the symmetries of \mathsf{RIM}_{\mathcal{H}} (Remark 2.9). First, we ensure that we can keep a “bank” of \Omega_{t}(r) unused variables of each of the O_{t}(1) types. Then, starting with a full column rank submatrix M of \mathsf{RIM}_{\mathcal{H}} devoid of all variables in the “bank,” we sequentially apply the evaluations X_{1}=\alpha_{1},X_{2}=\alpha_{2},\dots. Whenever M(X_{\leq i_{1}}=\alpha_{\leq i_{1}}) turns singular, we find that the evaluation X_{i_{1}}=\alpha_{i_{1}} is what “caused” it to become singular. We then go to the “bank” to find a variable X_{i_{1}^{\prime}} of the same type as X_{i_{1}} and “re-indeterminate” M by replacing all instances of X_{i_{1}} in M with X_{i_{1}^{\prime}}. That way, we ensure that M is, in a sense, “reused.” Furthermore, we ensure i_{1}^{\prime}>i_{1}, so that the matrix M(X_{\leq i_{1}}=\alpha_{\leq i_{1}}) is again nonsingular, and we can keep going. Of course, if we reach the end (i.e., M(X_{[n]}=\alpha_{[n]}) has full column rank), then in fact \mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) has full column rank, and so the evaluations (\alpha_{1},\dots,\alpha_{n}) were “good” after all.

Otherwise, if the evaluations (\alpha_{1},\dots,\alpha_{n}) were “bad,” then the submatrix M could not have reached the end, and that can only happen if some specific type was completely exhausted from the bank. Given the size of our initial bank, this means that M must have been “re-indeterminated” at least \Omega_{t}(r) times. When that happens, we collect the indices i_{1},\dots,i_{\ell} that we gathered from this round, remove them from \mathsf{RIM}_{\mathcal{H}}, and repeat the process with a refreshed bank. Since we only need r indices, we perform at most O_{t}(1) rounds. Because each round yields a strictly increasing sequence of indices of length at least \Omega_{t}(r), we end up with a certificate consisting of at most O_{t}(1) strictly increasing runs of total length r, of which there are at most \binom{n}{r}\cdot O_{t}(1)^{r}.

To be more concrete, when we generate the submatrix M=M_{1}, we ensure that any variable appearing in M_{1} has the same type as \Omega_{t}(r) variables that are not in M_{1} (but still in \mathsf{RIM}_{\mathcal{H}}). This creates a “bank” of variables of each type. Then, if X_{\leq i_{1}}=\alpha_{\leq i_{1}} was the evaluation that made M_{1} singular, we obtain M_{2} by replacing all copies of X_{i_{1}} with some X_{i_{1}^{\prime}} that is of the same type and in the “bank.” Since variables i_{1} and i_{1}^{\prime} are of the same type, they have analogous rows in the reduced intersection matrix \mathsf{RIM}_{\mathcal{H}}, so this new matrix M_{2} is still a submatrix of \mathsf{RIM}_{\mathcal{H}}. Therefore, we can pick up where we left off with M_{1}, but with M_{2} instead. That is, M_{2} will in fact be full rank when we apply the evaluations X_{\leq i_{1}}=\alpha_{\leq i_{1}}. Thus the next index i_{2} at which M_{2} turns singular will be strictly greater than i_{1}. We then repeat the process in M_{2}, replacing X_{i_{2}} with some X_{i_{2}^{\prime}} that is in the “bank” and of the same type, getting M_{3}, and so on. We can continue this process for \Omega_{t}(r) steps because of the size of the bank of each type, so we get an increasing run of length \Omega_{t}(r) in our certificate. After we run out of some type in our bank, we remove the used indices i_{1},\dots,i_{\ell} from \mathsf{RIM}_{\mathcal{H}} and repeat the process with a refreshed bank. This repeats only O_{t}(1) times, as we need only r indices in the end.
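The run-structure counting above can be sanity-checked by brute force: sequences of r pairwise-distinct indices from [n] consisting of at most m strictly increasing runs number at most \binom{n}{r}m^{r}, since one can choose the r-element support and then one of m run labels per element. A small Python check (the parameters are arbitrary; `num_increasing_runs` is our own helper):

```python
from itertools import combinations, permutations
from math import comb

def num_increasing_runs(seq):
    """Number of maximal strictly increasing runs of seq (1 + #descents)."""
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if b < a)

n, r, m = 6, 4, 2
count = sum(1 for support in combinations(range(1, n + 1), r)
              for seq in permutations(support)
              if num_increasing_runs(seq) <= m)
# Choosing the support and then a run label per element over-counts, so:
assert count <= comb(n, r) * m ** r
```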

We now finish the proof of Theorem 1.1, assuming Lemma 3.1. The rest of this section is devoted to proving Lemma 3.1.

Proof of Theorem 1.1, assuming Lemma 3.1.

By Lemma 2.3, if RSn,k(α1,,αn)RS_{n,k}(\alpha_{1},\dots,\alpha_{n}) is not (LL+1(1Rε),L)\left(\frac{L}{L+1}(1-R-\varepsilon),L\right) average-radius list-decodable, then there exists a vector yy and pairwise distinct codewords c(1),,c(t)c^{(1)},\dots,c^{(t)} with t2t\geq 2 such that the agreement hypergraph =([t],)\mathcal{H}=([t],\mathcal{E}) is (R+ε)n=(k+εn)(R+\varepsilon)n=(k+\varepsilon n)-weakly-partition-connected. By Lemma 2.8, the matrix 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) is not full column rank. Now, the number of possible agreement hypergraphs \mathcal{H} is at most t=2L+12tn2(L+2)n\sum_{t=2}^{L+1}2^{tn}\leq 2^{(L+2)n}. Thus by the union bound over possible agreement hypergraphs \mathcal{H} with Lemma 3.1, we have, for r=εn2r=\lfloor{\frac{\varepsilon n}{2}}\rfloor,

\begin{align*}
&\mathop{\bf Pr}_{\alpha_{[n]}}\left[RS_{n,k}(\alpha_{1},\dots,\alpha_{n})\text{ not }\left(\tfrac{L}{L+1}(1-R-\varepsilon),L\right)\text{ list-decodable}\right]\\
&\qquad\leq\mathop{\bf Pr}_{\alpha_{[n]}}\left[\exists\ (k+\varepsilon n)\text{-w.p.c. agreement hypergraph }\mathcal{H}\text{ such that }\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]})\text{ not full column rank}\right]\\
&\qquad\leq 2^{(L+2)n}\max_{(k+\varepsilon n)\text{-w.p.c. }\mathcal{H}}\ \mathop{\bf Pr}_{\alpha_{[n]}}\left[\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]})\text{ not full column rank}\right]\\
&\qquad\leq 2^{(L+2)n}\cdot\binom{n}{r}2^{(L+1)r}\left(\frac{Lk}{q-n}\right)^{r}\leq\left(2^{(L+2)n/r}\cdot\frac{en}{r}\cdot 2^{L+1}\cdot\frac{Lk}{q-n}\right)^{r}\leq 2^{-Ln}, \tag{10}
\end{align*}

as desired. Here, we used that q=n+k210L/εq=n+k\cdot 2^{10L/\varepsilon}. ∎

3.2 Setup for proof of Lemma 3.1

We now devote the rest of this section to proving Lemma 3.1.

Types.

For a hypergraph =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})), the type of an index ii (or, by abuse of notation, the type of the variable XiX_{i}, or the edge eie_{i}) is simply the set ei[t]e_{i}\subset[t]. There are 2t2^{t} types, and by abuse of notation, we identify the types by the numbers 1,2,,2t1,2,\dots,2^{t} in an arbitrary fixed order with a bijection τ:(subsets of [t])[2t]\tau:\text{(subsets of $[t]$)}\to[2^{t}]. We say a hypergraph is type-ordered if the hyperedges e1,,ene_{1},\dots,e_{n} are sorted according to their type: τ(e1)τ(e2)τ(en)\tau(e_{1})\leq\tau(e_{2})\leq\cdots\leq\tau(e_{n}). Since permuting the labels of the edges of \mathcal{H} preserves the rank of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} (it merely permutes the rows of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}), we can without loss of generality assume in Lemma 3.1 that \mathcal{H} is type-ordered.
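This bookkeeping can be sketched in a few lines of Python (the particular bijection `tau` and the toy edges are our own illustration; any fixed bijection works):

```python
t = 3

def tau(edge):
    """A fixed bijection from subsets of [t] to [2^t] (here via bitmasks)."""
    return 1 + sum(1 << (v - 1) for v in edge)

edges = [(1, 3), (2,), (1, 3), (1, 2, 3), (2,)]
ordered = sorted(edges, key=tau)   # type-ordering: relabeling edges only
                                   # permutes the rows of RIM_H
assert [tau(e) for e in ordered] == sorted(tau(e) for e in edges)
```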

Global variables.

Throughout the rest of the section, we fix a positive integer kk, parameter ε>0\varepsilon>0, and =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})), a type-ordered (k+εn)(k+\varepsilon n)-weakly-partition-connected hypergraph with t2t\geq 2. We also fix

\begin{align*}
r\stackrel{\rm def}{=}\left\lfloor\frac{\varepsilon n}{2}\right\rfloor. \tag{11}
\end{align*}
Input: indices i_{1},\dots,i_{j-1}\in[n] for some j\geq 1.
Output: M_{1},\dots,M_{j}, which are (t-1)k\times(t-1)k matrices over \mathbb{F}_{q}(X_{1},X_{2},\dots,X_{n}).
1: B\leftarrow\emptyset, i_{0}\leftarrow\perp, \ell_{0}\leftarrow\perp
2: for \ell=1,\dots,j do   // M_{\ell} depends only on i_{1},\dots,i_{\ell-1}
3:   if \ell>1 then   // fetch new index from bank B
4:     \tau\leftarrow the type of i_{\ell-1}
5:     s\leftarrow the number of indices among i_{\ell_{0}},i_{\ell_{0}+1},\dots,i_{\ell-1} that are of type \tau
6:     i_{\ell-1}^{\prime}\leftarrow the s-th smallest element of B that has type \tau
7:     if i_{\ell-1}^{\prime} is defined then
8:       M_{\ell}\leftarrow the matrix obtained from M_{\ell-1} by replacing all copies of X_{i_{\ell-1}} with X_{i_{\ell-1}^{\prime}}
9:   if M_{\ell} is not yet defined then   // refresh bank B
10:     B\leftarrow\emptyset
11:     for \tau=1,\dots,2^{t} do
12:       B\leftarrow B\cup\{\text{the largest }\lfloor r/2^{t}\rfloor\text{ indices of type }\tau\text{ in }[n]\setminus\{i_{1},\dots,i_{\ell-1}\}\} (if there are fewer than \lfloor r/2^{t}\rfloor indices of type \tau, then B contains all such indices)
13:     M_{\ell}\leftarrow the lexicographically smallest nonsingular (t-1)k\times(t-1)k submatrix of \mathsf{RIM}_{\mathcal{H}}^{B\cup\{i_{1},\dots,i_{\ell-1}\}}
14:     \ell_{0}\leftarrow\ell   // new refresh index
15: return M_{1},\dots,M_{j}
Algorithm 1: \mathtt{GetMatrixSequence}
Input: evaluation points (\alpha_{1},\dots,\alpha_{n})\in\mathbb{F}_{q}^{n}.
Output: a “certificate” (i_{1},\dots,i_{r})\in[n]^{r}, or \perp.
1: for j=1,\dots,r do   // M_{1},\dots,M_{j-1} stay the same; M_{j} is now defined
2:   M_{1},\dots,M_{j}\leftarrow\mathtt{GetMatrixSequence}(i_{1},\dots,i_{j-1})
3:   i_{j}\leftarrow the smallest index i such that M_{j}(X_{\leq i}=\alpha_{\leq i}) is singular
4:   if i_{j} is not defined then
5:     return \perp
6: return (i_{1},\dots,i_{r})
Algorithm 2: \mathtt{GetCertificate}

3.3 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate} and 𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎\mathtt{GetMatrixSequence}: Basic properties

As mentioned at the beginning of this section, we design an algorithm, Algorithm 2, that attempts to generate a certificate (i1,,ir)[n]r(i_{1},\dots,i_{r})\in[n]^{r} for evaluation points α1,,αn\alpha_{1},\dots,\alpha_{n}. It uses Algorithm 1, a helper function that generates the associated square submatrices M1,,MrM_{1},\dots,M_{r} of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}. Below, we establish some basic properties of these algorithms.

First, we establish that the matrices outputted by GetMatrixSequence are well-defined.

Lemma 3.2 (Output is well-defined).

For every sequence of indices i_{1},\dots,i_{j-1}, if M_{1},\dots,M_{j} is the output of the function \mathtt{GetMatrixSequence}(i_{1},\dots,i_{j-1}), then M_{1},\dots,M_{j} are well-defined.

Proof.

If \ell is a refresh index, then we have |B{i1,,i1}|<|B|+r2rεn|B\cup\{i_{1},\dots,i_{\ell-1}\}|<|B|+r\leq 2r\leq\varepsilon n, so by Lemma 2.14, 𝖱𝖨𝖬B{i1,,i1}\mathsf{RIM}_{\mathcal{H}}^{B\cup\{i_{1},\dots,i_{\ell-1}\}} is nonempty and has full column rank. Thus MM_{\ell} exists in Line 1. If \ell is not a refresh index, MM_{\ell} is always well-defined by definition. ∎

Next, we observe that GetMatrixSequence is an “online” algorithm.

Lemma 3.3 (Online).

\mathtt{GetMatrixSequence} is a deterministic function of i_{1},\dots,i_{j-1}, and it computes M_{\ell} “online,” meaning that M_{\ell} depends only on i_{1},\dots,i_{\ell-1} for all \ell=1,\dots,j (in particular, M_{1} is always the same matrix). Consequently, \mathtt{GetMatrixSequence}(i_{1},\dots,i_{j-1}) is a prefix of \mathtt{GetMatrixSequence}(i_{1},\dots,i_{j}).

Proof.

By definition and Lemma 3.2. ∎

Definition 3.4 (Refresh index).

In \mathtt{GetMatrixSequence}, we say that \ell is a refresh index if, in the outer loop iteration over \ell, the bank B is refreshed — that is, M_{\ell} is defined at the refresh step, as a lexicographically smallest nonsingular submatrix, rather than by variable replacement. For example, \ell=1 is a refresh index.

The next lemma shows that the new indices fetched from the bank in \mathtt{GetMatrixSequence} are in fact new.

Lemma 3.5 (New Variable).

In \mathtt{GetMatrixSequence}, in the outer loop iteration over \ell, if the replacement step is reached (so that i_{\ell-1}^{\prime} is defined), then the variable X_{i_{\ell-1}^{\prime}} does not appear in M_{\ell_{0}},M_{\ell_{0}+1},\dots,M_{\ell-1}, where \ell_{0} is the largest refresh index less than \ell.

Proof.

Let B be the bank constructed at the refresh in iteration \ell_{0}. In iterations \ell^{\prime}=\ell_{0},\ell_{0}+1,\dots,\ell, the set B is the same, and i_{\ell-1}^{\prime} is in this set B by definition. Thus, the variable X_{i_{\ell-1}^{\prime}} does not appear in M_{\ell_{0}}, since M_{\ell_{0}} is chosen as a submatrix of \mathsf{RIM}_{\mathcal{H}}^{B\cup\{i_{1},\dots,i_{\ell_{0}-1}\}}. For \ell^{\prime}=\ell_{0},\ell_{0}+1,\dots,\ell, the (\tau,s) pairs generated at the fetch steps are pairwise distinct, so X_{i_{\ell-1}^{\prime}} is not the variable added to M_{\ell^{\prime}} for \ell^{\prime}=\ell_{0}+1,\dots,\ell-1, and thus it does not appear in M_{\ell_{0}},M_{\ell_{0}+1},\dots,M_{\ell-1}. ∎

To show that the probability of a particular certificate (i1,,ir)(i_{1},\dots,i_{r}) is small (Lemma 3.11, Corollary 3.12), we crucially need that i1,,iri_{1},\dots,i_{r} are pairwise distinct. The next lemma proves that this is always the case.

Lemma 3.6 (Distinct indices).

For any sequence of evaluation points (\alpha_{1},\dots,\alpha_{n})\in\mathbb{F}_{q}^{n}, if the output of \mathtt{GetCertificate}(\alpha_{1},\dots,\alpha_{n}) is not \perp, then it is a sequence (i_{1},\dots,i_{r})\in[n]^{r} of pairwise distinct indices.

Proof.

By definition of i_{\ell} in \mathtt{GetCertificate}, the variable X_{i_{\ell}} must appear in M_{\ell}, so it suffices to show that M_{\ell} never contains any variable X_{i} for i\in\{i_{1},\dots,i_{\ell-1}\}. We induct on \ell. If \ell is a refresh index, this is true by definition. If not, let \ell_{0} be the largest refresh index less than \ell. By induction, i_{1},\dots,i_{\ell-2} are not in M_{\ell-1}, so we just need to show that i_{\ell-1}^{\prime} (the new index replacing i_{\ell-1} in M_{\ell} at the replacement step) is not any of i_{1},\dots,i_{\ell-1}. It is not any of i_{1},\dots,i_{\ell_{0}-1}, because none of those indices are in B by definition. It is not any of i_{\ell^{\prime}} for \ell^{\prime}=\ell_{0},\dots,\ell-1, because X_{i_{\ell^{\prime}}} is in M_{\ell^{\prime}} but X_{i_{\ell-1}^{\prime}} is not, by Lemma 3.5. ∎

3.4 Bad evaluation points admit certificates

Here, we establish Lemma 3.8, that if some evaluation points make 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} not full column rank, then 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate} outputs a certificate. To do so, we first justify our matrix constructions, showing that the matrices in 𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎\mathtt{GetMatrixSequence} are in fact submatrices of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}.

Lemma 3.7 (𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎\mathtt{GetMatrixSequence} gives submatrices of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}).

For every sequence of indices i_{1},\dots,i_{j-1}, if M_{1},\dots,M_{j} is the output of \mathtt{GetMatrixSequence}(i_{1},\dots,i_{j-1}), then M_{1},\dots,M_{j} are (t-1)k\times(t-1)k submatrices of \mathsf{RIM}_{\mathcal{H}}.

Proof.

We proceed with induction on =1,,j\ell=1,\dots,j. First, if \ell is a refresh index, then MM_{\ell} is a submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} by definition. In particular, M1M_{1} is a submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}, so the base case holds. Now suppose \ell is not a refresh index and M1M_{\ell-1} is a submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}. Matrix MM_{\ell} is defined by replacing all copies of Xi1X_{i_{\ell-1}} with Xi1X_{i_{\ell-1}^{\prime}}. To check that MM_{\ell} is a submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}, it suffices to show that

  • (i)

    for each row of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} containing Xi1X_{i_{\ell-1}}, replacing all copies of Xi1X_{i_{\ell-1}} with Xi1X_{i_{\ell-1}^{\prime}} gives another row of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}, and

  • (ii)

    the variable Xi1X_{i_{\ell-1}^{\prime}} does not appear in M1M_{\ell-1}.

The first item follows from the fact that indices i1i_{\ell-1} and i1i_{\ell-1}^{\prime} are of the same type, so (i) holds by definition of types and 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} (see also Remark 2.9). The second item is Lemma 3.5. Thus, MM_{\ell} is a submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}, completing the induction. ∎

We now show that any nn-tuple of bad evaluation points admits a certificate.

Lemma 3.8 (Bad evaluation points admit certificates).

If (α1,,αn)𝔽qn(\alpha_{1},\dots,\alpha_{n})\in\mathbb{F}_{q}^{n} are evaluation points such that 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) does not have full column rank, then 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎(α1,,αn)\mathtt{GetCertificate}(\alpha_{1},\dots,\alpha_{n}) returns a certificate (i1,,ir)[n]r(i_{1},\dots,i_{r})\in[n]^{r} (rather than \perp).

Proof.

Suppose for contradiction that 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate} returns \perp at iteration jj of the loop. Then there is no index ii such that Mj(Xi=αi)M_{j}(X_{\leq i}=\alpha_{\leq i}) is singular, so in particular, Mj(X[n]=α[n])M_{j}(X_{[n]}=\alpha_{[n]}) is nonsingular and thus has full column rank. By Lemma 3.7, MjM_{j} is a submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}, so Mj(X[n]=α[n])M_{j}(X_{[n]}=\alpha_{[n]}) is a nonsingular (t1)k×(t1)k(t-1)k\times(t-1)k submatrix of 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}), and we conclude that 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) has full column rank, a contradiction. ∎

3.5 Bounding the number of possible certificates

In this section, we upper bound the number of possible certificates. The key step is to prove the following structural result about certificates.

Lemma 3.9 (Certificate structure).

Given a sequence of evaluation points (α1,,αn)𝔽qn(\alpha_{1},\dots,\alpha_{n})\in\mathbb{F}_{q}^{n} such that 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) is not full column rank, the return value (i1,,ir)=𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎(α1,,αn)(i_{1},\dots,i_{r})=\mathtt{GetCertificate}(\alpha_{1},\dots,\alpha_{n}) satisfies ij1<iji_{j-1}<i_{j} for all but at most 2t2^{t} values j=2,,rj=2,\dots,r.

Proof.

Let (i1,,ir)(i_{1},\dots,i_{r}) be the return value of 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate}, and let M1,,MrM_{1},\dots,M_{r} be the associated matrix sequence. By Lemma 3.3, we have M1,,Mj=𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎(i1,,ij1)M_{1},\dots,M_{j}=\mathtt{GetMatrixSequence}(i_{1},\dots,i_{j-1}) for j=1,,rj=1,\dots,r. Recall that an index [r]\ell\in[r] is a refresh index if MM_{\ell} is defined in the refresh branch of 𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎\mathtt{GetMatrixSequence} rather than by the variable-replacement step. The lemma follows from two claims:

  1. (i)

    If >1\ell>1 is not a refresh index, then i1<ii_{\ell-1}<i_{\ell}.

  2. (ii)

    Any two refresh indices differ by at least r/2tr/2^{t}.

To see claim (i), let 0\ell_{0} be the largest refresh index less than \ell. By definition of a refresh index, the set BB stays constant between when M0M_{\ell_{0}} is defined and when MM_{\ell} is defined. From the definition of the indices at Line 2 of 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate}, we know that

  • For i<i1i<i_{\ell-1} the matrix M1(Xi=αi)M_{\ell-1}(X_{\leq i}=\alpha_{\leq i}) is nonsingular.

  • The matrix M(Xi=αi)M_{\ell}(X_{\leq i_{\ell}}=\alpha_{\leq i_{\ell}}) is singular.

Suppose for contradiction that i<i1i_{\ell}<i_{\ell-1}. (Note that i1ii_{\ell-1}\neq i_{\ell} by Lemma 3.6.) We contradict the first item by showing, using the second item, that M1(Xi=αi)M_{\ell-1}(X_{\leq i_{\ell}}=\alpha_{\leq i_{\ell}}) is also singular. By the definition of 𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎\mathtt{GetMatrixSequence}, since \ell is not a refresh index, MM_{\ell} is defined in Line 1. By construction of BB and i1i_{\ell-1}^{\prime}, we know that i1>i1>ii_{\ell-1}^{\prime}>i_{\ell-1}>i_{\ell}. Thus, not only is MM_{\ell} obtained from M1M_{\ell-1} by replacing all copies of Xi1X_{i_{\ell-1}} with Xi1X_{i_{\ell-1}^{\prime}}, but M(Xi=αi)M_{\ell}(X_{\leq i_{\ell}}=\alpha_{\leq i_{\ell}}) is also obtained by replacing all copies of Xi1X_{i_{\ell-1}} with Xi1X_{i_{\ell-1}^{\prime}} in M1(Xi=αi)M_{\ell-1}(X_{\leq i_{\ell}}=\alpha_{\leq i_{\ell}}). Moreover, the variable Xi1X_{i_{\ell-1}^{\prime}} does not appear in M1M_{\ell-1} by Lemma 3.5. We conclude that, since M(Xi=αi)M_{\ell}(X_{\leq i_{\ell}}=\alpha_{\leq i_{\ell}}) is singular, so is M1(Xi=αi)M_{\ell-1}(X_{\leq i_{\ell}}=\alpha_{\leq i_{\ell}}).

Now we show claim (ii). Suppose 0\ell_{0} and 1\ell_{1} are consecutive refresh indices. If a variable of type τ\tau appears in the matrix M0M_{\ell_{0}}, there must be exactly r/2t\lfloor{r/2^{t}}\rfloor indices of type τ\tau in BB (if there were fewer, then B{i1,,i1}B\cup\{i_{1},\dots,i_{\ell-1}\} would contain all indices of type τ\tau, and the corresponding variables would not appear in 𝖱𝖨𝖬B{i1,,i1}\mathsf{RIM}_{\mathcal{H}}^{B\cup\{i_{1},\dots,i_{\ell-1}\}}). Let τ\tau be the type of index i11i_{\ell_{1}-1}. Since 1\ell_{1} is a refresh index, the number of indices of type τ\tau among i0,i0+1,,i11i_{\ell_{0}},i_{\ell_{0}+1},\dots,i_{\ell_{1}-1} must therefore be r/2t+1\lfloor{r/2^{t}}\rfloor+1. In particular, this means 10r/2t+1r/2t\ell_{1}-\ell_{0}\geq\lfloor{r/2^{t}}\rfloor+1\geq r/2^{t}, as desired. ∎

Corollary 3.10 (Certificate count).

The number of possible outputs of 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate} is at most (nr)2tr\binom{n}{r}2^{tr}.

Proof.

By Lemma 3.6, the certificate consists of rr distinct indices of [n][n]. We can choose the set of indices in (nr)\binom{n}{r} ways. By Lemma 3.9, these indices are distributed among at most 2t2^{t} increasing runs, and within each run the order of the indices is determined, so we can distribute the indices among the 2t2^{t} increasing runs in at most (2t)r(2^{t})^{r} ways. ∎
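The counting in this proof can be sanity-checked by brute force for small parameters. The Python sketch below (with purely illustrative values of nn, rr, and the run budget) counts sequences of distinct indices forming at most a given number of increasing runs, and compares against the bound of "choose the set, then assign each index to a run":

```python
from itertools import combinations, permutations
from math import comb

def num_increasing_runs(seq):
    # number of maximal increasing runs: one more than the number of descents
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if b < a)

def count_sequences(n, r, max_runs):
    # brute-force count of length-r sequences of distinct indices from [n]
    # whose entries form at most max_runs increasing runs
    return sum(
        1
        for subset in combinations(range(1, n + 1), r)
        for seq in permutations(subset)
        if num_increasing_runs(seq) <= max_runs
    )

n, r, max_runs = 6, 4, 2  # illustrative small parameters
actual = count_sequences(n, r, max_runs)
bound = comb(n, r) * max_runs**r  # choose the set, then assign each index to a run
assert actual <= bound
print(actual, bound)  # 180 240
```

The bound overcounts because different run assignments can produce the same sequence, but it is all the proof needs.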

3.6 Bounding the probability of one certificate

The goal of this section is to establish Corollary 3.12, which states that the probability of obtaining a particular certificate is at most ((t1)kqn)r(\frac{(t-1)k}{q-n})^{r}. The argument is implicit in [GZ23], but we include a proof for completeness.

Lemma 3.11 (Implicit in [GZ23]).

Let i1,,ir[n]i_{1},\dots,i_{r}\in[n] be pairwise distinct indices, and let M1,,MrM_{1},\dots,M_{r} be (t1)k×(t1)k(t-1)k\times(t-1)k submatrices of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}. Over randomly chosen pairwise distinct evaluation points α1,,αn𝔽q\alpha_{1},\dots,\alpha_{n}\in\mathbb{F}_{q}, define the following events for j=1,,rj=1,\dots,r:

  • EjE_{j} is the event that Mj(Xi=αi)M_{j}(X_{\leq i}=\alpha_{\leq i}) is non-singular for all i<iji<i_{j}.

  • FjF_{j} is the event that Mj(Xij=αij)M_{j}(X_{\leq i_{j}}=\alpha_{\leq i_{j}}) is singular.

The probability that all the events hold is at most ((t1)kqn)r(\frac{(t-1)k}{q-n})^{r}.

Proof.

Note that each of the events EjE_{j} and FjF_{j} depends only on MjM_{j}, iji_{j}, and the evaluation points. Thus, by relabeling the index jj, we may assume without loss of generality that i1<i2<<iri_{1}<i_{2}<\cdots<i_{r}. We emphasize that we are not assuming that the output of 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate} satisfies i1<<iri_{1}<\cdots<i_{r} (this is not true in general). We are instead choosing the order in which we 'reveal' the events EjE_{j} and FjF_{j}: starting with the smallest index among i1,,iri_{1},\ldots,i_{r} and ending with the largest.

We have

𝐏𝐫α[n][j=1r(EjFj)]\displaystyle\mathop{\bf Pr\/}_{\alpha_{[n]}}\left[\bigwedge_{j=1}^{r}(E_{j}\wedge F_{j})\right] =j=1r𝐏𝐫α[n][EjFj|E1F1Ej1Fj1]\displaystyle=\prod_{j=1}^{r}\mathop{\bf Pr\/}_{\alpha_{[n]}}\left[E_{j}\wedge F_{j}|E_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\right]
j=1r𝐏𝐫α[n][Fj|E1F1Ej1Fj1Ej]\displaystyle\leq\prod_{j=1}^{r}\mathop{\bf Pr\/}_{\alpha_{[n]}}\left[F_{j}|E_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\wedge E_{j}\right] (12)

Note that E1F1Ej1Fj1EjE_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\wedge E_{j} depends only on α1,,αij1\alpha_{1},\dots,\alpha_{i_{j}-1}, and FjF_{j} depends only on α1,,αij\alpha_{1},\dots,\alpha_{i_{j}}. For any α1,,αij1\alpha_{1},\dots,\alpha_{i_{j}-1} for which E1F1Ej1Fj1EjE_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\wedge E_{j} holds, we have that Mj(Xij1=αij1)M_{j}(X_{\leq i_{j}-1}=\alpha_{\leq i_{j}-1}) is a (t1)k×(t1)k(t-1)k\times(t-1)k matrix over 𝔽q(Xij,Xij+1,,Xn)\mathbb{F}_{q}(X_{i_{j}},X_{i_{j}+1},\dots,X_{n}) whose determinant is a nonzero polynomial of degree at most (t1)k(t-1)k in each variable (the matrix contains at most t1t-1 rows in which XijX_{i_{j}} appears, each with degree at most k1k-1). In particular, at most (t1)k(t-1)k values of αij\alpha_{i_{j}} can make the determinant zero: viewing the determinant as a polynomial in the variables Xij+1,,XnX_{i_{j}+1},\dots,X_{n} with coefficients in 𝔽q[Xij]\mathbb{F}_{q}[X_{i_{j}}], any single nonzero coefficient vanishes on at most (t1)k(t-1)k values of αij\alpha_{i_{j}}. Conditioned on α1,,αij1\alpha_{1},\dots,\alpha_{i_{j}-1}, the field element αij\alpha_{i_{j}} is uniformly random over qij+1qnq-i_{j}+1\geq q-n elements. Thus, for all α1,,αij1\alpha_{1},\dots,\alpha_{i_{j}-1} such that E1F1Ej1Fj1EjE_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\wedge E_{j} holds,

𝐏𝐫αij[Fj|α1,,αij1](t1)kqn.\displaystyle\mathop{\bf Pr\/}_{\alpha_{i_{j}}}\left[F_{j}|\alpha_{1},\dots,\alpha_{i_{j}-1}\right]\leq\frac{(t-1)k}{q-n}. (13)

Since E1F1Ej1Fj1EjE_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\wedge E_{j} depends only on αij1\alpha_{\leq i_{j}-1} and FjF_{j} depends only on αij\alpha_{\leq i_{j}}, we have

𝐏𝐫α[n][Fj|E1F1Ej1Fj1Ej](t1)kqn.\displaystyle\mathop{\bf Pr\/}_{\alpha_{[n]}}\left[F_{j}|E_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\wedge E_{j}\right]\leq\frac{(t-1)k}{q-n}. (14)

Combining with (12) gives the desired result. ∎
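The probabilistic fact behind (13) is simply that a nonzero univariate polynomial of degree at most dd over a field has at most dd roots, so a uniformly random evaluation point hits a root with probability at most d/qd/q (or d/(qn)d/(q-n) once nn values are excluded). A quick exhaustive check of the root-count bound over a small prime field, with illustrative parameters:

```python
import random

q, d = 101, 7  # small prime field and degree bound (illustrative values)
random.seed(0)

def num_roots(coeffs, q):
    # count the field elements on which the polynomial vanishes;
    # coeffs[i] is the coefficient of x^i
    return sum(
        1
        for x in range(q)
        if sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q == 0
    )

for _ in range(200):
    coeffs = [random.randrange(q) for _ in range(d + 1)]
    if all(c == 0 for c in coeffs):
        continue  # the zero polynomial is excluded, as in the proof
    assert num_roots(coeffs, q) <= d  # a nonzero degree-<=d polynomial has <=d roots
print("ok")
```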

The key result for this section is a corollary of Lemma 3.11.

Corollary 3.12 (Probability of one certificate).

For any sequence i1,,ir[n]i_{1},\dots,i_{r}\in[n], over randomly chosen pairwise distinct evaluation points α1,,αn\alpha_{1},\dots,\alpha_{n}, we have

𝐏𝐫[𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎(α1,,αn)=(i1,,ir)]((t1)kqn)r.\displaystyle\mathop{\bf Pr\/}\left[\mathtt{GetCertificate}(\alpha_{1},\dots,\alpha_{n})=(i_{1},\dots,i_{r})\right]\leq\left(\frac{(t-1)k}{q-n}\right)^{r}. (15)
Proof.

By Lemma 3.6, we only need to consider pairwise distinct indices i1,,iri_{1},\dots,i_{r}; otherwise the probability is 0. Let M1,,Mr=𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎(i1,,ir1)M_{1},\dots,M_{r}=\mathtt{GetMatrixSequence}(i_{1},\dots,i_{r-1}). By Lemma 3.7, the matrices M1,,MrM_{1},\dots,M_{r} are all submatrices of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}, so Lemma 3.11 applies. Let E1,,Er,F1,,FrE_{1},\dots,E_{r},F_{1},\dots,F_{r} be the events in Lemma 3.11. If 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎(α1,,αn)=(i1,,ir)\mathtt{GetCertificate}(\alpha_{1},\dots,\alpha_{n})=(i_{1},\dots,i_{r}), then the definition of iji_{j} in Line 2 of 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎\mathtt{GetCertificate} implies that events EjE_{j} and FjF_{j} both occur for every jj. By Lemma 3.11, the probability that all EjE_{j} and FjF_{j} hold is at most ((t1)kqn)r(\frac{(t-1)k}{q-n})^{r}, hence the result. ∎

3.7 Finishing the proof of Lemma 3.1

Proof of Lemma 3.1.

Recall (Section 3.2) that we fixed \mathcal{H} to be a type-ordered (k+εn)(k+\varepsilon n)-weakly-partition-connected hypergraph. By Lemma 3.8, if the matrix 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) does not have full column rank, then 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎(α1,,αn)\mathtt{GetCertificate}(\alpha_{1},\dots,\alpha_{n}) is some certificate (i1,,ir)(i_{1},\dots,i_{r}). The probability that 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎(α1,,αn)=(i1,,ir)\mathtt{GetCertificate}(\alpha_{1},\dots,\alpha_{n})=(i_{1},\dots,i_{r}) is at most ((t1)kqn)r(\frac{(t-1)k}{q-n})^{r} by Corollary 3.12. By Corollary 3.10, there are at most (nr)2tr\binom{n}{r}2^{tr} certificates. Taking a union bound over possible certificates gives the lemma. ∎
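To get a concrete feel for this union bound, the short computation below evaluates the logarithm of (nr)2tr((t1)k/(qn))r\binom{n}{r}2^{tr}\left((t-1)k/(q-n)\right)^{r} for sample parameters with qq a (large) constant multiple of nn. All numbers here are illustrative placeholders, not the optimized constants of Theorem 1.1:

```python
from math import comb, log2

# illustrative parameters: rate 0.4, gap to capacity eps, list size L
L, eps = 3, 0.5
t = L + 1          # number of vertices in the agreement hypergraph
n = 1000
k = int(0.4 * n)
r = int(eps * n / 2)
q = 2**160 * n     # q = O(n), with a deliberately generous constant

# log2 of the bound (n choose r) * 2^{t r} * ((t-1) k / (q - n))^r
log_bound = log2(comb(n, r)) + t * r + r * (log2((t - 1) * k) - log2(q - n))
assert log_bound < 0  # the failure probability is exponentially small in n
print(round(log_bound))
```

The dominant negative term is rlog2((t1)k/(qn))r\log_2((t-1)k/(q-n)), which is why a field size linear in nn (with a constant depending on tt and ε\varepsilon) suffices.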

4 Random Linear Codes

In this section, we discuss how to modify the proof of Theorem 1.1 to give Theorem 1.3, list-decoding for random linear codes (RLCs). Our proof follows the roadmap in Figure 2 and is identical up to a few minor modifications, which we describe here. Below, we state the same lemmas as in the proof for Reed–Solomon codes, adjusted for random linear codes, and we highlight the key differences in purple. We expect that our framework could be applied even more generally to show that other families of random codes (beyond randomly punctured Reed–Solomon codes and random linear codes) achieve list-decoding capacity with small alphabet sizes, assuming such codes satisfy an appropriate GM-MDS theorem.

4.1 Preliminaries: Notation and Definitions

The generator matrix G𝔽qn×kG\in\mathbb{F}_{q}^{n\times k} of a random linear code has independent uniformly random entries in 𝔽q\mathbb{F}_{q}. To transfer the proof for list-decoding Reed–Solomon codes to list-decoding random linear codes, a key analogy is to think of the generator matrix as an n×kn\times k matrix of nknk distinct indeterminates (Xi,)i[n],[k](X_{i,\ell})_{i\in[n],\ell\in[k]}, evaluated at nknk independent and uniformly random field elements (αi,)i[n],[k](\alpha_{i,\ell})_{i\in[n],\ell\in[k]}.

𝒢\displaystyle\mathcal{G} =def[X1,1X1,kXn,1Xn,k]𝔽q(X1,1,,Xn,k)n×k,\displaystyle\stackrel{{\scriptstyle\rm def}}{{=}}\begin{bmatrix}X_{1,1}&\cdots&X_{1,k}\\ \vdots&\ddots&\vdots\\ X_{n,1}&\cdots&X_{n,k}\\ \end{bmatrix}\in\mathbb{F}_{q}(X_{1,1},\dots,X_{n,k})^{n\times k},
G\displaystyle G =def𝒢|X[n]×[k]=α[n]×[k]\displaystyle\stackrel{{\scriptstyle\rm def}}{{=}}\mathcal{G}|_{X_{[n]\times[k]}=\alpha_{[n]\times[k]}}
𝒢i\displaystyle\mathcal{G}_{i} =def[Xi,1,,Xi,k] (the ith row of 𝒢).\displaystyle\stackrel{{\scriptstyle\rm def}}{{=}}[X_{i,1},\dots,X_{i,k}]\text{ (the $i$th row of $\mathcal{G}$).} (16)

We note that our randomly punctured Reed–Solomon code can also be viewed as an evaluation of 𝒢\mathcal{G}, where Xi,X_{i,\ell} is assigned αi1\alpha_{i}^{\ell-1} for random distinct field elements α1,,αn𝔽q\alpha_{1},\dots,\alpha_{n}\in\mathbb{F}_{q}. In this light, one might expect that our framework also applies to random linear codes, and indeed it does.
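To illustrate the analogy, here is a small Python sketch (with illustrative parameters) that specializes the generic matrix 𝒢\mathcal{G} via Xi,αi1X_{i,\ell}\mapsto\alpha_{i}^{\ell-1} and checks that the resulting code is exactly a Reed–Solomon encoding:

```python
import random

random.seed(1)
q, n, k = 97, 8, 3  # illustrative: q prime, n distinct evaluation points
alphas = random.sample(range(q), n)  # random distinct field elements

# Specialize the generic generator matrix: entry (i, l) becomes alpha_i^{l-1}
G = [[pow(a, l, q) for l in range(k)] for a in alphas]

# Encoding a message m gives (f(alpha_1), ..., f(alpha_n)) for f of degree < k
m = [5, 1, 2]
codeword = [sum(G[i][l] * m[l] for l in range(k)) % q for i in range(n)]

def f(x):
    # the message polynomial f(x) = 5 + x + 2x^2
    return (m[0] + m[1] * x + m[2] * x * x) % q

assert codeword == [f(a) for a in alphas]
print("codeword matches polynomial evaluations")
```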

Accordingly, we use similar indexing shorthand, where the notation X[a]×[b]X_{[a]\times[b]} represents the aba\cdot b indeterminates X1,1,X1,2,,Xa,bX_{1,1},X_{1,2},\dots,X_{a,b}, and similarly for field elements α[a]×[b]\alpha_{[a]\times[b]}. For field elements α1,1,,αa,b\alpha_{1,1},\dots,\alpha_{a,b}, we write X[a]×[b]=α[a]×[b]X_{[a]\times[b]}=\alpha_{[a]\times[b]} to denote Xi,=αi,X_{i,\ell}=\alpha_{i,\ell} for 1ia1\leq i\leq a and 1b1\leq\ell\leq b.

We again use the notion of an agreement hypergraph in Section 2.2, and Lemma 2.3 still holds. For each agreement hypergraph \mathcal{H}, we consider a more general reduced intersection matrix RIM,𝒢\textsf{RIM}_{\mathcal{H},\mathcal{G}}, in which the XiX_{i}-Vandermonde rows are replaced by the ii-th row of 𝒢\mathcal{G}. More precisely,

Definition 4.1 (Reduced intersection matrix, random linear codes, analogous to Definition 2.6).

The reduced intersection matrix 𝖱𝖨𝖬,𝒢\mathsf{RIM}_{\mathcal{H},\mathcal{G}} associated with a hypergraph =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})) is a wt()×(t1)k\operatorname*{wt}(\mathcal{E})\times(t-1)k matrix over the field of fractions 𝔽q(X1,1,,Xn,k)\mathbb{F}_{q}(X_{1,1},\dots,X_{n,k}). For each hyperedge eie_{i} with vertices j1<j2<<j|ei|j_{1}<j_{2}<\dots<j_{|e_{i}|}, we add wt(ei)=|ei|1\operatorname*{wt}(e_{i})=|e_{i}|-1 rows to 𝖱𝖨𝖬,𝒢\mathsf{RIM}_{\mathcal{H},\mathcal{G}}. For u=2,,|ei|u=2,\dots,|e_{i}|, we add a row ri,u=(r(1),,r(t1))r_{i,u}=(r^{(1)},\ldots,r^{(t-1)}) of length (t1)k(t-1)k defined as follows:

  • If j=j1j=j_{1}, then r(j)=𝒢i=[Xi,1,Xi,2,Xi,3,,Xi,k]r^{(j)}=\mathcal{G}_{i}=[X_{i,1},X_{i,2},X_{i,3},\dots,X_{i,k}]

  • If j=juj=j_{u} and jutj_{u}\neq t, then r(j)=𝒢i=[Xi,1,Xi,2,Xi,3,,Xi,k]r^{(j)}=-\mathcal{G}_{i}=-[X_{i,1},X_{i,2},X_{i,3},\dots,X_{i,k}]

  • Otherwise, r(j)=0kr^{(j)}=0^{k}.

4.2 Preliminaries: Properties of RLC Reduced Intersection Matrices

We have similar preliminaries for reduced intersection matrices of random linear codes.

Lemma 4.2 (RIM of agreement hypergraphs are not full column rank, Analogous to Lemma 2.8).

Let \mathcal{H} be an agreement hypergraph for (y,c(1),,c(t))(y,c^{(1)},\dots,c^{(t)}), where c(j)𝔽qnc^{(j)}\in\mathbb{F}_{q}^{n} are distinct codewords of the code generated by 𝒢|X[n]×[k]=α[n]×[k]\mathcal{G}|_{X_{[n]\times[k]}=\alpha_{[n]\times[k]}}. Then the reduced intersection matrix 𝖱𝖨𝖬,𝒢(X[n]×[k]=α[n]×[k])\mathsf{RIM}_{\mathcal{H},\mathcal{G}}(X_{[n]\times[k]}=\alpha_{[n]\times[k]}) does not have full column rank.

Proof.

Analogous to the proof of Lemma 2.8. ∎

Lemma 4.3 (RIM have full column rank, Analogous to Theorem 2.11).

Let \mathcal{H} be a kk-weakly-partition-connected hypergraph with nn hyperedges and at least 22 vertices. Then 𝖱𝖨𝖬,𝒢\mathsf{RIM}_{\mathcal{H},\mathcal{G}} has full column rank over the field 𝔽q(X1,1,,Xn,k)\mathbb{F}_{q}(X_{1,1},\dots,X_{n,k}).

Proof.

We note that the Reed–Solomon code reduced intersection matrix RIM\textsf{RIM}_{\mathcal{H}} can be obtained from the random linear code reduced intersection matrix RIM,𝒢\textsf{RIM}_{\mathcal{H},\mathcal{G}} by setting the indeterminates Xi,=Xi1X_{i,\ell}=X_{i}^{\ell-1}, so Lemma 4.3 follows immediately from Theorem 2.11. We emphasize that, while Reed–Solomon codes require alphabet size q=Ω(n)q=\Omega(n), Theorem 2.11 still holds for constant alphabet sizes qq (see Remark 2.12), so we can use it here. ∎

We remark that Lemma 4.3 can also be proven directly by following the proof framework of Theorem 2.11 in Appendix A.3, substituting the use of Theorem A.2 with an analogous GM-MDS theorem for random linear codes, given in Lemma 7 of [DSY15]. (Lemma 7 of [DSY15] only implies Lemma 4.3 for sufficiently large qq, but again by Remark 2.12 the large-qq version of Lemma 4.3 implies the lemma for all qq.) That way, the proof of Theorem 1.3 relies only on the proof framework of Theorem 1.1 and not on any of its lemmas.

We again define row deletions for reduced intersection matrices.

Definition 4.4 (Row deletions, Analogous to Definition 2.13).

Given a hypergraph =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})) and set B[n]B\subseteq[n], define 𝖱𝖨𝖬,𝒢B\mathsf{RIM}_{\mathcal{H},\mathcal{G}}^{B} to be the submatrix of 𝖱𝖨𝖬,𝒢\mathsf{RIM}_{\mathcal{H},\mathcal{G}} obtained by deleting all rows containing the row 𝒢i\mathcal{G}_{i} with iBi\in B.

Now we show that, as for Reed–Solomon codes, the full-column-rankness of reduced intersection matrices is robust to deletions.

Lemma 4.5 (Robustness to deletions, Analogous to Lemma 2.14).

Let =([t],)\mathcal{H}=([t],\mathcal{E}) be a (k+εn)(k+\varepsilon n)-weakly-partition-connected hypergraph with t2t\geq 2. For all sets B[n]B\subset[n] with |B|εn|B|\leq\varepsilon n, we have that 𝖱𝖨𝖬,𝒢B\mathsf{RIM}_{\mathcal{H},\mathcal{G}}^{B} is nonempty and has full column rank.

Proof.

The proof is identical to Lemma 2.14, where we instead use the full column rankness of 𝖱𝖨𝖬,𝒢\mathsf{RIM}_{\mathcal{H},\mathcal{G}} for kk-weakly-partition-connected \mathcal{H} (Lemma 4.3) rather than the full column rankness of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} (Theorem 2.11). ∎

4.3 The proof

Input: indices i1,,ij1[n]i_{1},\dots,i_{j-1}\in[n] for some j1j\geq 1.
Output: M1,,MjM_{1},\dots,M_{j}, which are (t1)k×(t1)k(t-1)k\times(t-1)k matrices over 𝔽q(X1,1,,Xn,k)\mathbb{F}_{q}(X_{1,1},\dots,X_{n,k}).
1 BB\leftarrow\emptyset, i0i_{0}\leftarrow\perp, 0\ell_{0}\leftarrow\perp
2 for =1,,j\ell=1,\dots,j do
      
       // MM_{\ell} depends only on i1,,i1i_{1},\dots,i_{\ell-1}
3       if >1\ell>1 then
            
             // Fetch new index from bank BB
4             τ\tau\leftarrow the type of i1i_{\ell-1}
5             ss\leftarrow number of indices among i0,i0+1,,i1i_{\ell_{0}},i_{\ell_{0}+1},\dots,i_{\ell-1} that are type τ\tau
6             i1i_{\ell-1}^{\prime}\leftarrow the ss-th smallest element of BB that has type τ\tau
7             if i1i_{\ell-1}^{\prime} is defined then
8                   MM_{\ell}\leftarrow the matrix obtained from M1M_{\ell-1} by replacing all copies of row 𝒢i1\mathcal{G}_{i_{\ell-1}} with 𝒢i1\mathcal{G}_{i_{\ell-1}^{\prime}}
9            
10      if MM_{\ell} not yet defined then
            
             // Refresh bank BB
11             BB\leftarrow\emptyset
12             for τ=1,,2t\tau=1,\dots,2^{t} do
13                   BB\leftarrow B{largest r/2t indices of type τ in [n]{i1,,i1}}B\cup\{\text{largest $\lfloor{r/2^{t}}\rfloor$ indices of type $\tau$ in $[n]\setminus\{i_{1},\dots,i_{\ell-1}\}$}\} (if there are fewer than r/2t\lfloor{r/2^{t}}\rfloor indices of type τ\tau, then BB contains all such indices)
14                  
15            MM_{\ell}\leftarrow lexicographically smallest nonsingular (t1)k×(t1)k(t-1)k\times(t-1)k submatrix of 𝖱𝖨𝖬,𝒢B{i1,,i1}\mathsf{RIM}_{\mathcal{H},\mathcal{G}}^{B\cup\{i_{1},\dots,i_{\ell-1}\}}
             0\ell_{0}\leftarrow\ell // new refresh index
16            
17            
18      
return M1,,MjM_{1},\dots,M_{j}
Algorithm 3 𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎𝚁𝙻𝙲\mathtt{GetMatrixSequenceRLC}
Input: Generator matrix entries α1,1,,αn,k𝔽q\alpha_{1,1},\dots,\alpha_{n,k}\in\mathbb{F}_{q}.
Output: A “certificate” (i1,,ir)[n]r(i_{1},\dots,i_{r})\in[n]^{r}.
1 for j=1,,rj=1,\dots,r do
      
       // M1,,Mj1M_{1},\dots,M_{j-1} stay the same, MjM_{j} is now defined
2       M1,,Mj=𝙶𝚎𝚝𝙼𝚊𝚝𝚛𝚒𝚡𝚂𝚎𝚚𝚞𝚎𝚗𝚌𝚎𝚁𝙻𝙲(i1,,ij1)M_{1},\dots,M_{j}=\mathtt{GetMatrixSequenceRLC}(i_{1},\dots,i_{j-1})
3       iji_{j}\leftarrow smallest index ii such that Mj(X[i]×[k]=α[i]×[k])M_{j}(X_{[i]\times[k]}=\alpha_{[i]\times[k]}) is singular
4       if iji_{j} not defined then
5             return \perp
6      
return (i1,,ir)(i_{1},\dots,i_{r})
Algorithm 4 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎𝚁𝙻𝙲\mathtt{GetCertificateRLC}

The proof of Theorem 1.3 follows similarly to the proof of Theorem 1.1. Our key lemma, analogous to Lemma 3.1, shows that reduced intersection matrices of weakly-partition-connected hypergraphs have full column rank with high probability.

Lemma 4.6 (Analogous to Lemma 3.1).

Let kk be a positive integer and ε>0\varepsilon>0. For each (k+εn)(k+\varepsilon n)-weakly-partition-connected hypergraph =([t],(e1,,en))\mathcal{H}=([t],(e_{1},\dots,e_{n})) with t2t\geq 2, we have, for r=εn/2r=\lfloor{\varepsilon n/2}\rfloor,

𝐏𝐫α[n]×[k][𝖱𝖨𝖬,𝒢(X[n]×[k]=α[n]×[k]) does not have full column rank](nr)2tr(t1q)r.\displaystyle\mathop{\bf Pr\/}_{{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\alpha_{[n]\times[k]}}}\left[{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\mathsf{RIM}_{\mathcal{H},\mathcal{G}}(X_{[n]\times[k]}=\alpha_{[n]\times[k]})}\text{ does not have full column rank}\right]\leq\binom{n}{r}2^{tr}\cdot\left({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\frac{t-1}{q}}\right)^{r}\ . (17)

We highlight that our probability bound here is better than the one in Lemma 3.1 for Reed–Solomon codes. This is because (i) all indeterminates in our generator matrix (and thus, the reduced intersection matrix) appear with degree 1 (rather than degree up to k1k-1), and (ii) our indeterminates are assigned independently uniformly at random, rather than random distinct values. Thus, the probability of any particular square submatrix being made singular by an assignment is at most t1q\frac{t-1}{q}, rather than (t1)kqn\frac{(t-1)k}{q-n}: item (i) improves the numerator from (t1)k(t-1)k to t1t-1, and item (ii) improves the denominator from qnq-n to qq. This improved probability bound means we can use a smaller alphabet size for random linear codes than for Reed–Solomon codes. Other than this difference, the rest of our proof follows analogously. We include some more details for completeness.

We start with the same setup in Section 3.2, defining types in the same way, and starting with a (k+εn)(k+\varepsilon n)-weakly-partition-connected hypergraph \mathcal{H} that we assume without loss of generality is type-ordered. We again fix

r=defεn2\displaystyle r\stackrel{{\scriptstyle\rm def}}{{=}}\left\lfloor\frac{\varepsilon n}{2}\right\rfloor (18)

To prove Lemma 4.6, we similarly find a certificate (i1,,ir)(i_{1},\dots,i_{r}) for each singular reduced intersection matrix. This certificate is generated by an analogous algorithm, GetCertificateRLC, which uses an analogous helper function GetMatrixSequenceRLC. We show this certificate has the same three properties:

  1. 1.

    A bad generator matrix, namely a generator matrix for which the reduced intersection matrix is not full column rank, must yield a certificate.

  2. 2.

    There are few possible certificates.

  3. 3.

    The probability that a random generator matrix yields a particular certificate is small.

We generate the certificate in a similar way. This time, instead of sequentially revealing the evaluation points, we sequentially reveal the rows of the generator matrix, and iji_{j} indicates the smallest index ii for which revealing the first ii rows makes MjM_{j} singular.

The first item is captured in the following lemma.

Lemma 4.7 (Bad generator matrix admits certificate, Analogous to Lemma 3.8).

If α1,1,,αn,k𝔽q\alpha_{1,1},\dots,\alpha_{n,k}\in\mathbb{F}_{q} are entries for the generator matrix such that 𝖱𝖨𝖬,𝒢(X[n]×[k]=α[n]×[k])\mathsf{RIM}_{\mathcal{H},\mathcal{G}}(X_{[n]\times[k]}=\alpha_{[n]\times[k]}) does not have full column rank, then 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎𝚁𝙻𝙲(α1,1,,αn,k)\mathtt{GetCertificateRLC}(\alpha_{1,1},\dots,\alpha_{n,k}) returns a certificate (i1,,ir)[n]r(i_{1},\dots,i_{r})\in[n]^{r} (rather than \perp).

Proof.

Analogous to the proof of Lemma 3.8. ∎

Just as for Reed–Solomon codes, we obtain the same bound on the number of possible certificates.

Lemma 4.8 (Certificate count, Analogous to Corollary 3.10).

The number of possible outputs of 𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎𝚁𝙻𝙲\mathtt{GetCertificateRLC} is at most (nr)2tr\binom{n}{r}2^{tr}.

Proof.

Analogous to the proof of Corollary 3.10. ∎

Lastly, we obtain an upper bound on the probability of obtaining a particular certificate.

Lemma 4.9 (Probability of one certificate, Analogous to Corollary 3.12).

For any sequence i1,,ir[n]i_{1},\dots,i_{r}\in[n], over independent uniformly random α1,1,,αn,k\alpha_{1,1},\dots,\alpha_{n,k}, we have

𝐏𝐫[𝙶𝚎𝚝𝙲𝚎𝚛𝚝𝚒𝚏𝚒𝚌𝚊𝚝𝚎𝚁𝙻𝙲(α1,1,,αn,k)=(i1,,ir)](t1q)r.\displaystyle\mathop{\bf Pr\/}\left[{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\mathtt{GetCertificateRLC}(\alpha_{1,1},\dots,\alpha_{n,k})}=(i_{1},\dots,i_{r})\right]\leq\left({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\frac{t-1}{q}}\right)^{r}. (19)

Lemma 4.9 is slightly different from the analogous result for Reed–Solomon codes, Corollary 3.12, so we provide a little more justification here. Similar to Corollary 3.12, Lemma 4.9 follows from a lemma analogous to Lemma 3.11.

Lemma 4.10 (Analogous to Lemma 3.11).

Let i1,,ir[n]i_{1},\dots,i_{r}\in[n] be pairwise distinct indices, and let M1,,MrM_{1},\dots,M_{r} be (t1)k×(t1)k(t-1)k\times(t-1)k submatrices of 𝖱𝖨𝖬,𝒢\mathsf{RIM}_{\mathcal{H},\mathcal{G}}. Over random generator matrix entries α1,1,,αn,k𝔽q\alpha_{1,1},\dots,\alpha_{n,k}\in\mathbb{F}_{q}, define the following events for j=1,,rj=1,\dots,r:

  • EjE_{j} is the event that Mj(X[i]×[k]=α[i]×[k])M_{j}({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{[i]\times[k]}=\alpha_{[i]\times[k]}}) is non-singular for all i<iji<i_{j}.

  • FjF_{j} is the event that Mj(X[ij]×[k]=α[ij]×[k])M_{j}({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{[i_{j}]\times[k]}=\alpha_{[i_{j}]\times[k]}}) is singular.

The probability that all the events hold is at most (t1q)r({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\frac{t-1}{q}})^{r}.

Proof of Lemma 4.10.

The proof is similar to the proof of Lemma 3.11, which follows from combining Equation (13) with the appropriate conditional probabilities; this lemma follows the same approach. We again assume without loss of generality that i1<i2<<iri_{1}<i_{2}<\cdots<i_{r}.

Here, we want, analogous to Equation (13), for all α[ij1]×[k]\alpha_{[i_{j}-1]\times[k]} such that E1F1Ej1Fj1EjE_{1}\wedge F_{1}\wedge\cdots\wedge E_{j-1}\wedge F_{j-1}\wedge E_{j},

𝐏𝐫α{ij}×[k][Fj|α[ij1]×[k]]t1q.\displaystyle\mathop{\bf Pr\/}_{{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\alpha_{\{i_{j}\}\times[k]}}}\left[F_{j}|{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\alpha_{[i_{j}-1]\times[k]}}\right]\leq{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\frac{t-1}{q}}. (20)

To see (20), consider the determinant of Mj(X[ij1]×[k]=α[ij1]×[k])M_{j}({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{[i_{j}-1]\times[k]}=\alpha_{[i_{j}-1]\times[k]}}), a (t1)k×(t1)k(t-1)k\times(t-1)k matrix in 𝔽q(X{ij,ij+1,,n}×[k])\mathbb{F}_{q}({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{\{i_{j},i_{j}+1,\dots,n\}\times[k]}}). View the determinant of Mj(X[ij1]×[k]=α[ij1]×[k])M_{j}({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{[i_{j}-1]\times[k]}=\alpha_{[i_{j}-1]\times[k]}}) as a polynomial in the variables X{ij+1,,n}×[k]{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{\{i_{j}+1,\dots,n\}\times[k]}} with coefficients in 𝔽q[Xij,1,,Xij,k]\mathbb{F}_{q}[{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{i_{j},1},\dots,X_{i_{j},k}}]. It is nonzero because we assume EjE_{j} holds, so there is some coefficient of the form f(Xij,1,,Xij,k)f({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{i_{j},1},\dots,X_{i_{j},k}}) that is nonzero. Since the matrix MjM_{j} has at most t1t-1 rows containing any of the variables Xij,1,,Xij,k{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{i_{j},1},\dots,X_{i_{j},k}}, each appearing with total degree 1, the total degree of Xij,1,,Xij,k{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{i_{j},1},\dots,X_{i_{j},k}} in the determinant of MjM_{j} is at most t1t-1. Thus, the total degree of f(Xij,1,,Xij,k)f({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{i_{j},1},\dots,X_{i_{j},k}}) is at most t1t-1. Hence, by the Schwartz–Zippel lemma, ff becomes zero with probability at most t1q\frac{t-1}{q} over random αij,1,,αij,k\alpha_{i_{j},1},\dots,\alpha_{i_{j},k}. Thus, the probability that FjF_{j} holds is at most t1q\frac{t-1}{q}, giving (20).

Combining conditional probabilities as in Lemma 3.11 gives the result. ∎
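The Schwartz–Zippel step can be checked exhaustively for a toy polynomial. The sketch below takes a total-degree-2 polynomial in k=3k=3 variables over a small prime field (all values illustrative) and verifies that it vanishes on at most a 2/q2/q fraction of points:

```python
from itertools import product

q, k, deg = 11, 3, 2  # small prime field, few variables, total degree (illustrative)

def f(x):
    # f(x1, x2, x3) = x1*x2 + 3*x3 + 5 has total degree 2
    return (x[0] * x[1] + 3 * x[2] + 5) % q

zeros = sum(1 for x in product(range(q), repeat=k) if f(x) == 0)
# Schwartz-Zippel: a nonzero polynomial of total degree deg vanishes on
# at most deg * q^{k-1} of the q^k points, i.e. a deg/q fraction
assert zeros <= deg * q ** (k - 1)
print(zeros, q**k)  # 121 1331
```

Here the count is exactly q2=121q^{2}=121 zeros out of q3=1331q^{3}=1331 points, since for each choice of x1,x2x_{1},x_{2} there is a unique solving x3x_{3}; the Schwartz–Zippel bound allows up to 242242.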

Proof of Theorem 1.3.

By Lemma 2.3, if our random linear code generated by GG is not (LL+1(1Rε),L)\left(\frac{L}{L+1}(1-R-\varepsilon),L\right) average-radius list-decodable, then there exists a vector yy and codewords c(1),,c(t)c^{(1)},\dots,c^{(t)} with t2t\geq 2 such that the agreement hypergraph =([t],)\mathcal{H}=([t],\mathcal{E}) is (R+ε)n=(k+εn)(R+\varepsilon)n=(k+\varepsilon n)-weakly-partition-connected. By Lemma 4.2, the matrix 𝖱𝖨𝖬,𝒢(X[n]×[k]=α[n]×[k])\mathsf{RIM}_{\mathcal{H},\mathcal{G}}(X_{[n]\times[k]}=\alpha_{[n]\times[k]}) is not full column rank. Now, the number of possible agreement hypergraphs \mathcal{H} is at most t=2L+12tn2(L+2)n\sum_{t=2}^{L+1}2^{tn}\leq 2^{(L+2)n}. Thus by the union bound over possible agreement hypergraphs \mathcal{H} with Lemma 4.6, we have, for r=εn2r=\lfloor{\frac{\varepsilon n}{2}}\rfloor,

𝐏𝐫α[n]×[k][Code generated by 𝒢|X[n]×[k]=α[n]×[k] not (LL+1(1Rε),L) list-decodable]\displaystyle\mathop{\bf Pr\/}_{\alpha_{[n]\times[k]}}\left[{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\text{Code generated by }\mathcal{G}|_{X_{[n]\times[k]}=\alpha_{[n]\times[k]}}}\text{ not $\left(\frac{L}{L+1}(1-R-\varepsilon),L\right)$ list-decodable}\right]
𝐏𝐫α[n]×[k][ (k+εn)-w.p.c. agreement hypergraph  such that 𝖱𝖨𝖬,𝒢(X[n]×[k]=α[n]×[k]) not full column rank]\displaystyle\leq\mathop{\bf Pr\/}_{\alpha_{[n]\times[k]}}\left[\exists\text{ $(k+\varepsilon n)$-w.p.c. agreement hypergraph }\mathcal{H}\text{ such that }\mathsf{RIM}_{\mathcal{H},\mathcal{G}}({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{[n]\times[k]}=\alpha_{[n]\times[k]}})\text{ not full column rank}\right]
2(L+2)nmax(k+εn)-w.p.c. 𝐏𝐫α[n]×[k][𝖱𝖨𝖬,𝒢(X[n]×[k]=α[n]×[k]) not full column rank]\displaystyle\leq 2^{(L+2)n}\max_{\text{$(k+\varepsilon n)$-w.p.c. }\mathcal{H}}\quad\mathop{\bf Pr\/}_{\alpha_{[n]\times[k]}}\left[\mathsf{RIM}_{\mathcal{H},\mathcal{G}}({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}X_{[n]\times[k]}=\alpha_{[n]\times[k]}})\text{ not full column rank}\right]
2(L+2)n(nr)2(L+1)r(Lq)r(2(L+2)n/renr2L+1Lq)r2Ln,\displaystyle\leq 2^{(L+2)n}\cdot\binom{n}{r}2^{(L+1)r}\left({\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\frac{L}{q}}\right)^{r}\leq\left(2^{(L+2)n/r}\cdot\frac{en}{r}\cdot 2^{L+1}\cdot{\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}\frac{L}{q}}\right)^{r}\leq 2^{-Ln}, (21)

as desired. Here, we used that q=210L/εq={\color[rgb]{.75,0,.25}\definecolor[named]{pgfstrokecolor}{rgb}{.75,0,.25}2^{10L/\varepsilon}}. ∎
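As a sanity check on the chain of inequalities in (21), the following sketch evaluates the union bound in log scale for one illustrative choice of n, L, and ε (these concrete values are our own, chosen only for the demonstration):

```python
import math

# Sanity-check the union bound (21) in log scale for one illustrative choice
# of parameters (n, L, and eps below are our own, not from the paper).
n, L, eps = 1000, 3, 0.1
r = int(eps * n / 2)          # r = floor(eps * n / 2)
log2_q = 10 * L / eps         # q = 2^{10 L / eps}

# log2 of 2^{(L+2)n} * binom(n, r) * 2^{(L+1)r} * (L/q)^r
log2_bound = ((L + 2) * n + math.log2(math.comb(n, r))
              + (L + 1) * r + r * (math.log2(L) - log2_q))
print(f"log2(bound) = {log2_bound:.1f}, target -Ln = {-L * n}")
assert log2_bound <= -L * n   # so the bound is at most 2^{-Ln}
```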

4.4 Technical comparison with [GZ23]

To prove that random linear codes achieve list-decoding capacity (Theorem 1.3), we extended the framework for showing that (randomly punctured) Reed–Solomon codes achieve list-decoding capacity over linear-sized fields (Theorem 1.1). It is possible to instead use the framework of Guo and Zhang [GZ23] to show a similar result. However, using the framework of Guo and Zhang in the same way would have only worked for an alphabet size that is linear in nn, rather than, as in our case, a near-optimal constant. Below, we explain why our new ideas were necessary for obtaining our near-optimal alphabet size.

In (21), our upper bound on the non-list-decodability probability is

2(L+2)n(nr)2(L+1)r(Lq)r,\displaystyle 2^{(L+2)n}\cdot\binom{n}{r}2^{(L+1)r}\cdot\left(\frac{L}{q}\right)^{r}, (22)

where r=εn/2r=\varepsilon n/2 and ε>0\varepsilon>0 is roughly the gap to capacity. Here, the term 2(L+2)n2^{(L+2)n} comes from a union bound over the number of possible hypergraphs, the term (nr)2(L+1)r\binom{n}{r}2^{(L+1)r} comes from a union bound over the number of possible certificates, and the term (Lq)r\left(\frac{L}{q}\right)^{r} bounds the probability of a single certificate. We saw above that this probability is o(1)o(1) as long as q210L/εq\geq 2^{10L/\varepsilon}.

If we applied the framework of [GZ23] to random linear codes, the number of possible certificates would instead be nrn^{r}. Our bound on the non-list-decodability probability would then be

2(L+2)nnr(Lq)r.\displaystyle 2^{(L+2)n}\cdot n^{r}\cdot\left(\frac{L}{q}\right)^{r}. (23)

For this bound to be o(1)o(1), we need to take q2L/εnq\geq 2^{L/\varepsilon}\cdot n, giving an alphabet size of O(n)O(n). This would still have been a new result, as, previously, the Reed–Solomon codes of [GZ23] gave the smallest known alphabet size (O(n2)O(n^{2})) of any linear code achieving list-decoding capacity with optimal list size O(1/ε)O(1/\varepsilon). However, using our framework allows us to achieve a near-optimal constant alphabet size of 2O(L/ε)2^{O(L/\varepsilon)}.

Acknowledgements

We thank Mary Wootters and Francisco Pernice for helpful discussions about [BGM23] and the hypergraph perspective of the list-decoding problem. We thank Karthik Chandrasekaran for helpful discussions about hypergraph connectivity notions and for the reference of Theorem 9 in [Fra11]. We thank Nikhil Shagrithaya and Jonathan Mosheiff for pointing out a mistake in the proof of Lemma 4.3 in an earlier version of the paper. We thank an anonymous reviewer for pointing out a mistake in the proof of Corollary A.4 in an earlier version of the paper.

References

  • [AGL24] Omar Alrabiah, Venkatesan Guruswami, and Ray Li. Ag codes have no list-decoding friends: Approaching the generalized singleton bound requires exponential alphabets. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1367–1378. SIAM, 2024.
  • [BDG22] Joshua Brakensiek, Manik Dhar, and Sivakanth Gopi. Improved field size bounds for higher order mds codes. arXiv preprint arXiv:2212.11262, 2022.
  • [BDG23a] Joshua Brakensiek, Manik Dhar, and Sivakanth Gopi. Generalized gm-mds: Polynomial codes are higher order mds. arXiv preprint arXiv:2310.12888, 2023.
  • [BDG+23b] Joshua Brakensiek, Manik Dhar, Sivakanth Gopi, et al. Ag codes achieve list decoding capacity over constant-sized fields. arXiv preprint arXiv:2310.12898, 2023.
  • [BGM22] Joshua Brakensiek, Sivakanth Gopi, and Visu Makam. Lower bounds for maximally recoverable tensor codes and higher order mds codes. IEEE Transactions on Information Theory, 68(11):7125–7140, 2022.
  • [BGM23] Joshua Brakensiek, Sivakanth Gopi, and Visu Makam. Generic reed-solomon codes achieve list-decoding capacity. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing (STOC 2023), 2023.
  • [BKR10] Eli Ben-Sasson, Swastik Kopparty, and Jaikumar Radhakrishnan. Subspace polynomials and limits to list decoding of Reed-Solomon codes. IEEE Trans. Inform. Theory, 56(1):113–120, Jan 2010.
  • [BKW03] Avrim Blum, Adam Kalai, and Hal Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM (JACM), 50(4):506–519, 2003.
  • [CPS99] Jin-Yi Cai, Aduri Pavan, and D Sivakumar. On the hardness of permanent. In Annual Symposium on Theoretical Aspects of Computer Science, pages 90–99. Springer, 1999.
  • [CW07] Qi Cheng and Daqing Wan. On the list and bounded distance decodability of Reed-Solomon codes. SIAM J. Comput., 37(1):195–209, April 2007.
  • [CX18] Chandra Chekuri and Chao Xu. Minimum cuts and sparsification in hypergraphs. SIAM Journal on Computing, 47(6):2118–2156, 2018.
  • [DL12] Zeev Dvir and Shachar Lovett. Subspace evasive sets. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 351–358, 2012.
  • [DSY14] Son Hoang Dau, Wentu Song, and Chau Yuen. On the existence of mds codes over small fields with constrained generator matrices. In 2014 IEEE International Symposium on Information Theory, pages 1787–1791. IEEE, 2014.
  • [DSY15] Son Hoang Dau, Wentu Song, and Chau Yuen. On simple multiple access networks. IEEE Journal on Selected Areas in Communications, 33(2):236–249, 2015.
  • [Eli57] Peter Elias. List decoding for noisy channels. Wescon Convention Record, Part 2, Institute of Radio Engineers, pages 99–104, 1957.
  • [Eli91] Peter Elias. Error-correcting codes for list decoding. IEEE Transactions on Information Theory, 37(1):5–12, 1991.
  • [FGKP06] Vitaly Feldman, Parikshit Gopalan, Subhash Khot, and Ashok Kumar Ponnuswami. New results for learning noisy parities and halfspaces. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), pages 563–574. IEEE, 2006.
  • [FK09] András Frank and Tamás Király. A survey on covering supermodular functions. Research Trends in Combinatorial Optimization: Bonn 2008, pages 87–126, 2009.
  • [FKK03a] András Frank, Tamás Király, and Zoltán Király. On the orientation of graphs and hypergraphs. Discrete Applied Mathematics, 131(2):385–400, 2003.
  • [FKK03b] András Frank, Tamás Király, and Matthias Kriesell. On decomposing a hypergraph into k connected sub-hypergraphs. Discrete Applied Mathematics, 131(2):373–383, 2003.
  • [FKS22] Asaf Ferber, Matthew Kwan, and Lisa Sauermann. List-decodability with large radius for reed-solomon codes. IEEE Transactions on Information Theory, 68(6):3823–3828, 2022.
  • [Fra11] András Frank. Connections in combinatorial optimization, volume 38. Oxford University Press Oxford, 2011.
  • [GHK11] Venkatesan Guruswami, Johan Håstad, and Swastik Kopparty. On the list-decodability of random linear codes. IEEE Trans. Inform. Theory, 57(2):718–725, Feb 2011.
  • [GHSZ02] Venkatesan Guruswami, Johan Håstad, Madhu Sudan, and David Zuckerman. Combinatorial bounds for list decoding. IEEE Trans. Inform. Theory, 48(5):1021–1034, May 2002.
  • [GI01] Venkatesan Guruswami and Piotr Indyk. Expander-based constructions of efficiently decodable codes. In Proceedings 42nd IEEE Symposium on Foundations of Computer Science, pages 658–667. IEEE, 2001.
  • [GLM+21] Venkatesan Guruswami, Ray Li, Jonathan Mosheiff, Nicolas Resch, Shashwat Silas, and Mary Wootters. Bounds for list-decoding and list-recovery of random linear codes. IEEE Transactions on Information Theory, 68(2):923–939, 2021.
  • [GLS+22] Zeyu Guo, Ray Li, Chong Shangguan, Itzhak Tamo, and Mary Wootters. Improved list-decodability and list-recoverability of reed-solomon codes via tree packings. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 708–719. IEEE, 2022.
  • [GM22] Venkatesan Guruswami and Jonathan Mosheiff. Punctured low-bias codes behave like random linear codes. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 36–45. IEEE, 2022.
  • [GR06] Venkatesan Guruswami and Atri Rudra. Limits to list decoding Reed–Solomon codes. IEEE Trans. Inform. Theory, 52(8):3642–3649, August 2006.
  • [GR08] Venkatesan Guruswami and Atri Rudra. Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy. IEEE Transactions on Information Theory, 54(1):135–150, 2008.
  • [GRS22] Venkatesan Guruswami, Atri Rudra, and Madhu Sudan. Essential coding theory. Draft available at https://cse.buffalo.edu/faculty/atri/courses/coding-theory/book/, 2022.
  • [GS99] Venkatesan Guruswami and Madhu Sudan. Improved decoding of Reed–Solomon and algebraic-geometry codes. IEEE Transactions on Information Theory, 45(6):1757–1767, 1999.
  • [GST22] Eitan Goldberg, Chong Shangguan, and Itzhak Tamo. Singleton-type bounds for list-decoding and list-recovery, and related results. In 2022 IEEE International Symposium on Information Theory (ISIT), pages 2565–2570. IEEE, 2022.
  • [GV10] Venkatesan Guruswami and Salil Vadhan. A lower bound on list size for list decoding. IEEE Transactions on Information Theory, 56(11):5681–5688, 2010.
  • [GW13] Venkatesan Guruswami and Carol Wang. Linear-algebraic list decoding for variants of reed–solomon codes. IEEE Transactions on Information Theory, 59(6):3257–3268, 2013.
  • [GX12] Venkatesan Guruswami and Chaoping Xing. Folded codes from function field towers and improved optimal rate list decoding. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 339–350, 2012.
  • [GX13] Venkatesan Guruswami and Chaoping Xing. List decoding reed-solomon, algebraic-geometric, and gabidulin subcodes up to the singleton bound. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 843–852, 2013.
  • [GZ23] Zeyu Guo and Zihan Zhang. Randomly punctured reed-solomon codes achieve the list decoding capacity over polynomial-size alphabets. arXiv preprint arXiv:2304.01403, 2023.
  • [HRZW19] Brett Hemenway, Noga Ron-Zewi, and Mary Wootters. Local list recovery of high-rate tensor codes and applications. SIAM Journal on Computing, 49(4):FOCS17–157, 2019.
  • [HW18] Brett Hemenway and Mary Wootters. Linear-time list recovery of high-rate expander codes. Information and Computation, 261:202–218, 2018.
  • [JMS03] Kamal Jain, Mohammad Mahdian, and Mohammad R Salavatipour. Packing steiner trees. In SODA, volume 3, pages 266–274, 2003.
  • [Joh62] Selmer Johnson. A new upper bound for error-correcting codes. IRE Transactions on Information Theory, 8(3):203–207, 1962.
  • [Kir03] Tamás Király. Edge-connectivity of undirected and directed hypergraphs. PhD thesis, Eötvös Loránd University, 2003.
  • [Kop15] Swastik Kopparty. List-decoding multiplicity codes. Theory of Computing, 11(1):149–182, 2015.
  • [KRZSW18] Swastik Kopparty, Noga Ron-Zewi, Shubhangi Saraf, and Mary Wootters. Improved decoding of folded Reed-Solomon and multiplicity codes. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 212–223. IEEE, 2018.
  • [Lov18] Shachar Lovett. Mds matrices over small fields: A proof of the gm-mds conjecture. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 194–199. IEEE, 2018.
  • [LP20] Ben Lund and Aditya Potukuchi. On the list recoverability of randomly punctured codes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020), volume 176, pages 30:1–30:11, 2020.
  • [LW20] Ray Li and Mary Wootters. Improved list-decodability of random linear binary codes. IEEE Transactions on Information Theory, 67(3):1522–1536, 2020.
  • [MRRZ+20] Jonathan Mosheiff, Nicolas Resch, Noga Ron-Zewi, Shashwat Silas, and Mary Wootters. Ldpc codes achieve list decoding capacity. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 458–469. IEEE, 2020.
  • [NW61] Crispin St. J. A. Nash-Williams. Edge-disjoint spanning trees of finite graphs. Journal of the London Mathematical Society, 1(1):445–450, 1961.
  • [PP23] Aaron Putterman and Edward Pyne. Pseudorandom linear codes are list decodable to capacity. arXiv preprint arXiv:2303.17554, 2023.
  • [Reg09] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. Journal of the ACM (JACM), 56(6):1–40, 2009.
  • [Rot22] Ron M Roth. Higher-order mds codes. IEEE Transactions on Information Theory, 68(12):7798–7816, 2022.
  • [RS60] Irving S. Reed and Gustave Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300–304, 1960.
  • [RW14] Atri Rudra and Mary Wootters. Every list-decodable code for high noise has abundant near-optimal rate puncturings. In David B. Shmoys, editor, Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 764–773. ACM, 2014.
  • [RW18] Atri Rudra and Mary Wootters. Average-radius list-recoverability of random linear codes. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 644–662. SIAM, 2018.
  • [Sin64] Richard Singleton. Maximum distance qq-nary codes. IEEE Trans. Inform. Theory, 10(2):116–118, April 1964.
  • [ST20] Chong Shangguan and Itzhak Tamo. Combinatorial list-decoding of Reed-Solomon codes beyond the Johnson radius. In Proceedings of the 52nd Annual ACM Symposium on Theory of Computing, STOC 2020, pages 538–551, 2020.
  • [STV01] Madhu Sudan, Luca Trevisan, and Salil Vadhan. Pseudorandom generators without the xor lemma. Journal of Computer and System Sciences, 62(2):236–266, 2001.
  • [Tut61] William T. Tutte. On the problem of decomposing a graph into n connected factors. Journal of the London Mathematical Society, 1(1):221–230, 1961.
  • [Woo13] Mary Wootters. On the list decodability of random linear codes with large error rates. In Proceedings of the Forty-fifth Annual ACM Symposium on Theory of Computing, STOC ’13, pages 853–860, New York, NY, USA, 2013. ACM.
  • [Woz58] John M. Wozencraft. List decoding. Quarterly Progress Report, Research Laboratory of Electronics, MIT, 48:90–95, 1958.
  • [YH19] Hikmet Yildiz and Babak Hassibi. Optimum linear codes with support-constrained generator matrices over small fields. IEEE Transactions on Information Theory, 65(12):7868–7875, 2019.
  • [ZP81] Victor Vasilievich Zyablov and Mark Semenovich Pinsker. List concatenated decoding. Problemy Peredachi Informatsii, 17(4):29–33, 1981.

Appendix A Alternate presentation of [BGM23]

Here, we include alternate presentations of some ideas from [BGM23]. Algebraically, our presentation is the same, but the hypergraph perspective streamlines combinatorial aspects of their ideas.

A.1 Preliminaries

A.1.0.0.1 Dual of Reed–Solomon codes.

It is well known that the dual of a Reed–Solomon code is a generalized Reed–Solomon code: given positive integers knk\leq n and evaluation points α1,,αn𝔽q\alpha_{1},\dots,\alpha_{n}\in\mathbb{F}_{q}, there exist nonzero β1,,βn𝔽q\beta_{1},\dots,\beta_{n}\in\mathbb{F}_{q} such that the following matrix, called the parity-check matrix,

H=[β1β2βnβ1α1β2α2βnαnβ1α1nk1β2α2nk1βnαnnk1]\displaystyle H=\begin{bmatrix}\beta_{1}&\beta_{2}&\cdots&\beta_{n}\\ \beta_{1}\alpha_{1}&\beta_{2}\alpha_{2}&\cdots&\beta_{n}\alpha_{n}\\ \vdots&\vdots&&\vdots\\ \beta_{1}\alpha_{1}^{n-k-1}&\beta_{2}\alpha_{2}^{n-k-1}&\cdots&\beta_{n}\alpha_{n}^{n-k-1}\\ \end{bmatrix} (24)

satisfies Hc=0nkHc=0^{n-k} if and only if c𝖱𝖲n,k(α1,,αn)c\in\mathsf{RS}_{n,k}(\alpha_{1},\dots,\alpha_{n}).
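This duality can be verified concretely. For a plain Reed–Solomon code, a standard choice of multipliers is βi = ∏_{j≠i}(αi − αj)^{-1} (we make this choice explicit here; it is not stated in the text above). The toy check below, over an illustrative small prime field, confirms Hc = 0 for every codeword:

```python
import itertools

# Toy verification of the duality over F_13 (field, length, and dimension are
# illustrative choices).  For a plain Reed-Solomon code, a standard choice is
# beta_i = prod_{j != i} (alpha_i - alpha_j)^{-1}.
q, n, k = 13, 6, 3
alpha = [1, 2, 3, 5, 7, 11]                # distinct evaluation points

beta = []
for i in range(n):
    prod = 1
    for j in range(n):
        if j != i:
            prod = prod * (alpha[i] - alpha[j]) % q
    beta.append(pow(prod, -1, q))          # modular inverse (Python >= 3.8)

# Rows of H are (beta_1 alpha_1^l, ..., beta_n alpha_n^l) for l = 0..n-k-1.
H = [[beta[i] * pow(alpha[i], l, q) % q for i in range(n)] for l in range(n - k)]

# Check H c = 0 for every codeword c = (f(alpha_1), ..., f(alpha_n)), deg f < k.
for coeffs in itertools.product(range(q), repeat=k):
    c = [sum(coeffs[d] * pow(a, d, q) for d in range(k)) % q for a in alpha]
    assert all(sum(H[l][i] * c[i] for i in range(n)) % q == 0
               for l in range(n - k))
print("H c = 0 for all", q**k, "codewords")
```

This choice of βi works because ∑i g(αi)/∏_{j≠i}(αi − αj) is the coefficient of X^{n-1} in the degree-less-than-n interpolant of g, which is zero whenever deg g ≤ n−2; taking g = X^ℓ f covers all rows of H.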

A.1.0.0.2 Generic Zero Patterns.

Following [BGM23], we leverage the GM-MDS theorem to establish list-decodability of Reed–Solomon codes. In this work, we more directly connect the list-decoding problem to the GM-MDS theorem using a hypergraph orientation lemma (introduced in the next section). Here, we review generic zero-patterns and the GM-MDS theorem. To keep the meaning of the variable “kk” consistent throughout the paper, we unconventionally state the definition of zero patterns and the GM-MDS theorem with nkn-k rows instead of kk rows.

Definition A.1.

Given positive integers knk\leq n, an (n,nk)(n,n-k)-generic-zero-pattern (GZP) is a collection of sets S1,,Snk[n]S_{1},\ldots,S_{n-k}\subset[n] such that, for all K[nk]K\subseteq[n-k],

|KS|nk|K|.\displaystyle\left|\bigcap_{\ell\in K}S_{\ell}\right|\leq n-k-|K|. (25)
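Condition (25) can be checked by brute force over all subsets K. The following sketch (the sets and parameters are hand-picked toy examples) is one way to test whether a collection of sets forms a generic zero pattern:

```python
from itertools import combinations

# Brute-force test of Definition A.1 (toy parameters): S_1, ..., S_{n-k} form
# an (n, n-k)-GZP iff |intersection over K| <= n - k - |K| for every nonempty K.
def is_gzp(n, sets):
    m = len(sets)                         # m plays the role of n - k
    for size in range(1, m + 1):
        for K in combinations(range(m), size):
            inter = set(range(1, n + 1))
            for l in K:
                inter &= sets[l]
            if len(inter) > m - size:
                return False
    return True

# Hand-picked examples with n = 5 and n - k = 3.
assert is_gzp(5, [{1, 2}, {2, 3}, {4}])
assert not is_gzp(5, [{1, 2, 3}, {1, 2, 3}, {4}])  # a single set is too large
print("GZP checks pass")
```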

A.1.0.0.3 GM-MDS Theorem.

As in [BGM23], we connect the list-decoding problem to the GM-MDS theorem. Here, we make the connection more directly.

Theorem A.2 (GM-MDS Theorem [DSY14, Lov18, YH19]).

Given q2nk1q\geq 2n-k-1 and any generic zero-pattern S1,,Snk[n]S_{1},\dots,S_{n-k}\subset[n], there exist pairwise distinct evaluation points α1,,αn𝔽q\alpha_{1},\dots,\alpha_{n}\in\mathbb{F}_{q} and an invertible matrix M𝔽q(nk)×(nk)M\in\mathbb{F}_{q}^{(n-k)\times(n-k)} such that, if HH is the parity-check matrix for 𝖱𝖲n,k(α1,,αn)\mathsf{RS}_{n,k}(\alpha_{1},\dots,\alpha_{n}) (as in (24)), then MHMH achieves zero-pattern S1,,SnkS_{1},\dots,S_{n-k}, meaning that (MH),i=0(MH)_{\ell,i}=0 whenever iSi\in S_{\ell}.

We note that the original GM-MDS theorem shows that the generator matrix of a (non-generalized) Reed–Solomon code achieves any generic zero pattern. Here, we state that the generator matrix of a generalized Reed–Solomon code achieves any generic zero pattern, which is an immediate corollary of the former result.

A.2 Hypergraph orientations

Our new perspective of the tools from [BGM23] leverages a well-known theorem about orienting weakly-partition-connected hypergraphs, stated below. This theorem is most explicitly stated in [Fra11], but it is implicit in [Kir03, FKK03a].

A directed hyperedge is a hyperedge with one vertex assigned as the head. All the other vertices in the hyperedge are called tails. A directed hypergraph consists of directed hyperedges. In a directed hypergraph, the in-degree of a vertex vv is the number of edges for which vv is the head. A path in a directed hypergraph is a sequence v1,e1,v2,e2,,vs1,es1,vsv_{1},e_{1},v_{2},e_{2},\dots,v_{s-1},e_{s-1},v_{s} such that for all =1,,s1\ell=1,\dots,s-1, vertex vv_{\ell} is a tail of edge ee_{\ell} and vertex v+1v_{\ell+1} is the head of edge ee_{\ell}. An orientation of an (undirected) hypergraph is obtained by assigning a head to each hyperedge, making every hyperedge directed.

Theorem A.3 (Theorems 9.4.13 and 15.4.4 of [Fra11]).

A hypergraph \mathcal{H} is kk-weakly-partition-connected if and only if it has an orientation such that, for some vertex vv (the “root”), every other vertex uu has kk edge-disjoint paths to vv. (In [Fra11, Theorems 9.4.13 and 15.4.4], the property of having kk edge-disjoint paths to vv is called (0,k)(0,k)-edge-connected.)

We note that Theorem A.3 remains true if “to vv” is replaced with “from vv” and kk-weakly-partition-connected is replaced with another hypergraph notion called kk-partition-connected. The following corollary essentially captures (the hard direction of) [BGM23, Lemma 2.8].

Corollary A.4.

Let =([t],)\mathcal{H}=([t],\mathcal{E}) be a kk-weakly-partition-connected hypergraph with nn hyperedges and t2t\geq 2. Then there exist integers δ1,,δt0\delta_{1},\dots,\delta_{t}\geq 0 summing to nkn-k such that taking δj\delta_{j} copies of Sj=def{i[n]:jei}[n]S_{j}\stackrel{{\scriptstyle\rm def}}{{=}}\{i\in[n]:j\notin e_{i}\}\subset[n] gives an (n,nk)(n,n-k)-GZP.

Proof.

Take the orientation of \mathcal{H} and root vertex v[t]v\in[t] given by Theorem A.3. We now take our δj\delta_{j}’s as follows: for each non-root j[t]j\in[t], let δj=defdegin(j)\delta_{j}\stackrel{{\scriptstyle\rm def}}{{=}}\deg_{in}(j) be the in-degree of vertex jj. For the root vv, let δv=defdegin(v)k\delta_{v}\stackrel{{\scriptstyle\rm def}}{{=}}\deg_{in}(v)-k. Note that any other vertex uu has kk edge-disjoint paths to vv, so vv has in-degree at least kk and δv0\delta_{v}\geq 0. Since there are nn hyperedges, the sum of all δj\delta_{j}’s is thus nkn-k. We now check the generic zero pattern condition (25). Consider any nonempty multiset K[t]K\subset[t] such that each vertex j[t]j\in[t] appears at most δj\delta_{j} times. We claim:

|KS|=(# edges induced by [t]K)j[t]Kδj=nkjK[t]δjnk|K|.\displaystyle\left|\bigcap_{\ell\in K}S_{\ell}\right|=(\text{\# edges induced by $[t]\setminus K$})\leq\sum_{j\in[t]\setminus K}\delta_{j}=n-k-\sum_{j\in K\cap[t]}\delta_{j}\leq n-k-|K|. (26)

The first equality holds by definition of SjS_{j}. The second equality holds because j[t]δj=nk\sum_{j\in[t]}\delta_{j}=n-k. The second inequality holds because |K|jKδj|K|\leq\sum_{j\in K}\delta_{j} by definition of KK. It remains to show the first inequality. We have two cases:

Case 1: the root vv is in KK. The number of hyperedges induced by the vertices [t]K[t]\setminus K is at most the sum of the indegrees of [t]K[t]\setminus K, which is exactly j[t]Kδj\sum_{j\in[t]\setminus K}\delta_{j} by definition of δj\delta_{j}.

Case 2: the root vv is in [t]K[t]\setminus K. Fix an arbitrary vertex uu in KK. By our orientation of \mathcal{H}, vertex uu has kk edge-disjoint paths to vv. Each of these paths has an edge that “enters” [t]K[t]\setminus K, i.e., the head is in [t]K[t]\setminus K but the edge is not induced by [t]K[t]\setminus K. Thus, the number of edges induced by [t]K[t]\setminus K is at most (j[t]Kdegin(j))k(\sum_{j\in[t]\setminus K}\deg_{in}(j))-k, which is exactly j[t]Kδj\sum_{j\in[t]\setminus K}\delta_{j} by definition of δj\delta_{j}. Hence, we have the first inequality. This covers all cases, proving (26) and completing the proof. ∎
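The construction in this proof can be traced on a small example. The sketch below uses the doubled triangle on three vertices, which is 2-weakly-partition-connected, together with a hand-chosen orientation rooted at vertex 3 (the hypergraph and orientation are our own illustrative choices), and verifies that the resulting δj's yield a valid GZP:

```python
from itertools import combinations

# Toy instance of the construction in the proof of Corollary A.4.  The
# hypergraph (a doubled triangle, which is 2-weakly-partition-connected) and
# its orientation rooted at vertex 3 are our own illustrative choices.
t, k, root = 3, 2, 3
edges = [{1, 2}, {1, 2}, {2, 3}, {2, 3}, {1, 3}, {1, 3}]  # n = 6 hyperedges
heads = [1, 2, 3, 3, 3, 3]           # head assigned to each hyperedge
n = len(edges)

indeg = {j: heads.count(j) for j in range(1, t + 1)}
delta = {j: indeg[j] - (k if j == root else 0) for j in range(1, t + 1)}
assert sum(delta.values()) == n - k  # the deltas sum to n - k

# Take delta_j copies of S_j = {i : j not in e_i} and verify condition (25).
S = [{i + 1 for i in range(n) if j not in edges[i]}
     for j in range(1, t + 1) for _ in range(delta[j])]
for size in range(1, len(S) + 1):
    for K in combinations(range(len(S)), size):
        inter = set(range(1, n + 1))
        for l in K:
            inter &= S[l]
        assert len(inter) <= (n - k) - size
print("delta =", delta, "gives a valid (6, 4)-GZP")
```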

A.3 Proof of Theorem 2.11

In this section, we reprove Theorem 2.11, which we need in this work.

Proof of Theorem 2.11.

It suffices to prove that 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} has full column rank for some evaluation of X1=α1,,Xn=αnX_{1}=\alpha_{1},\dots,X_{n}=\alpha_{n} for α1,,αn𝔽q\alpha_{1},\dots,\alpha_{n}\in\mathbb{F}_{q}. Furthermore, by Remark 2.12, it also suffices to prove Theorem 2.11 when q2nk1q\geq 2n-k-1. Indeed, that would then show that there is a square (t1)k×(t1)k(t-1)k\times(t-1)k submatrix of 𝖱𝖨𝖬(X[n]=α[n])\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]}) of full column rank, which means that submatrix has nonzero determinant (in 𝔽q\mathbb{F}_{q}), which means the corresponding square submatrix of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} also has a nonzero determinant (in 𝔽q(X1,,Xn)\mathbb{F}_{q}(X_{1},\dots,X_{n})), so 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}} has full column rank.

Let e1,,ene_{1},\dots,e_{n} be the edges of our kk-weakly-partition-connected hypergraph \mathcal{H}. By Corollary A.4, there is a generic zero pattern S1,,SnkS_{1},\dots,S_{n-k} where, for all =1,,nk\ell=1,\dots,n-k, the set SS_{\ell} is {i:jei}\{i:j\notin e_{i}\} for some j[t]j\in[t]. By Theorem A.2, there exist pairwise distinct α1,,αn𝔽q\alpha_{1},\dots,\alpha_{n}\in\mathbb{F}_{q} and a nonsingular matrix M𝔽q(nk)×(nk)M\in\mathbb{F}_{q}^{(n-k)\times(n-k)} such that, for H𝔽q(nk)×nH\in\mathbb{F}_{q}^{(n-k)\times n} the parity check matrix of 𝖱𝖲n,k(α1,,αn)\mathsf{RS}_{n,k}(\alpha_{1},\dots,\alpha_{n}), the matrix MH𝔽q(nk)×nM\cdot H\in\mathbb{F}_{q}^{(n-k)\times n} achieves the zero pattern S1,,SnkS_{1},\dots,S_{n-k}, meaning that (MH),i=0(MH)_{\ell,i}=0 whenever iSi\in S_{\ell}.

Suppose for the sake of contradiction there is a nonzero vector v𝔽q(t1)kv\in\mathbb{F}_{q}^{(t-1)k} such that 𝖱𝖨𝖬(X[n]=α[n])v=0\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]})\cdot v=0. Let f(1),,f(t)𝔽qkf^{(1)},\dots,f^{(t)}\in\mathbb{F}_{q}^{k} be such that v=[f(1),f(2),,f(t1)]v=[f^{(1)},f^{(2)},\dots,f^{(t-1)}] and f(t)=0f^{(t)}=0. Define c(1),,c(t)𝔽qnc^{(1)},\dots,c^{(t)}\in\mathbb{F}_{q}^{n} to be such that c(i)=Gf(i)c^{(i)}=G\cdot f^{(i)} where

G=def[1α1α1k11α2α2k11αnαnk1]\displaystyle G\stackrel{{\scriptstyle\rm def}}{{=}}\begin{bmatrix}1&\alpha_{1}&\cdots&\alpha_{1}^{k-1}\\ 1&\alpha_{2}&\cdots&\alpha_{2}^{k-1}\\ \vdots&\vdots&&\vdots\\ 1&\alpha_{n}&\cdots&\alpha_{n}^{k-1}\\ \end{bmatrix} (27)

We next show that, for any i=1,,ni=1,\dots,n, ci(j)=ci(j)c^{(j)}_{i}=c^{(j^{\prime})}_{i} for all j,jeij,j^{\prime}\in e_{i}. Let ei=j1,,j|ei|e_{i}=j_{1},\dots,j_{|e_{i}|}. Since 𝖱𝖨𝖬(X[n]=α[n])v=0\mathsf{RIM}_{\mathcal{H}}(X_{[n]}=\alpha_{[n]})\cdot v=0, we have, by definition of 𝖱𝖨𝖬\mathsf{RIM}_{\mathcal{H}}, for u=2,,|ei|u=2,\dots,|e_{i}|,

ci(j1)ci(ju)=[1,αi,,αik1](f(j1)f(ju))T=0.\displaystyle c^{(j_{1})}_{i}-c^{(j_{u})}_{i}=[1,\alpha_{i},\dots,\alpha_{i}^{k-1}]\cdot(f^{(j_{1})}-f^{(j_{u})})^{T}=0. (28)

(note this is true even if ju=tj_{u}=t, since f(t)=0f^{(t)}=0).

Define a vector y𝔽qny\in\mathbb{F}_{q}^{n} such that, for i=1,,ni=1,\dots,n, we have yi=ci(j)y_{i}=c^{(j)}_{i}, where jj is an arbitrary element of hyperedge eie_{i} (by the previous paragraph, the choice of jj does not matter). For each j=1,,tj=1,\dots,t, we must have (MH(yc(j)))=0(MH\cdot(y-c^{(j)}))_{\ell}=0 for all [nk]\ell\in[n-k] such that SS_{\ell} is a copy of {i[n]:jei}\{i\in[n]:j\notin e_{i}\}; the \ell’th row of MHMH is supported only on {i[n]:jei}\{i\in[n]:j\in e_{i}\}, and yc(j)y-c^{(j)} is zero on {i[n]:jei}\{i\in[n]:j\in e_{i}\} by definition of yy. Since MHc(j)=M(Hc(j))=0MHc^{(j)}=M\cdot(Hc^{(j)})=0 for all j=1,,tj=1,\dots,t, we have, for all jj and all \ell such that SS_{\ell} is a copy of {i[n]:jei}\{i\in[n]:j\notin e_{i}\},

(MHy)=(MH(yc(j)))=0.\displaystyle(MHy)_{\ell}=(MH\cdot(y-c^{(j)}))_{\ell}=0. (29)

By construction, all SS_{\ell} are a copy of some set {i:jei}\{i:j\notin e_{i}\}, so we conclude MHy=0MHy=0. Since MM is invertible, we must have Hy=0Hy=0.

This means y=Gfy=G\cdot f for some f𝔽qkf\in\mathbb{F}_{q}^{k}, so yy is the evaluation of a degree-less-than-kk polynomial. Since \mathcal{H} is kk-weakly-partition-connected, by considering the partition {j}([t]{j})\{j\}\sqcup([t]\setminus\{j\}), there are at least kk hyperedges eie_{i} containing vertex jj in \mathcal{H}, so yi=ci(j)y_{i}=c^{(j)}_{i} in at least kk indices ii. Since yy and c(j)c^{(j)} are the evaluation of degree-less-than-kk polynomials, we must have y=c(j)y=c^{(j)}. This holds for all jj, so we have y=c(1)==c(t)=0y=c^{(1)}=\cdots=c^{(t)}=0 (recall f(t)=0f^{(t)}=0), which contradicts our initial assumption that v0v\neq 0. ∎

Appendix B Alphabet size limitations

In this section, we establish Proposition 1.5. For positive integers mm, view 𝔽2m\mathbb{F}_{2^{m}} as a vector space of dimension mm over base field 𝔽2\mathbb{F}_{2}. For a set S𝔽2mS\subset\mathbb{F}_{2^{m}}, let

PS(X)=defαS(Xα).\displaystyle P_{S}(X)\stackrel{{\scriptstyle\rm def}}{{=}}\prod_{\alpha\in S}(X-\alpha). (30)

An affine subspace is a set L+α={α+β:βL}L+\alpha=\{\alpha+\beta:\beta\in L\} for some subspace LL of 𝔽2m\mathbb{F}_{2^{m}}.

Lemma B.1 (Proposition 3.2 of [BKR10]).

Let LL be a subspace of 𝔽2m\mathbb{F}_{2^{m}}. Then PLP_{L} has the form

X2dimL+i=0dimL1αiX2i.\displaystyle X^{2^{\dim L}}+\sum_{i=0}^{\dim L-1}\alpha_{i}X^{2^{i}}. (31)

where αi𝔽2m\alpha_{i}\in\mathbb{F}_{2^{m}}.

As an immediate corollary, we have

Lemma B.2.

Let LL be an affine subspace of 𝔽2m\mathbb{F}_{2^{m}}. Then PLP_{L} has the form

X2dimL+i=0dimL1αiX2i+β\displaystyle X^{2^{\dim L}}+\sum_{i=0}^{\dim L-1}\alpha_{i}X^{2^{i}}+\beta (32)

for αi,β𝔽2m\alpha_{i},\beta\in\mathbb{F}_{2^{m}}.

Proof.

Since LL is an affine subspace, there exists γ\gamma such that Lγ=def{αγ:αL}L-\gamma\stackrel{{\scriptstyle\rm def}}{{=}}\{\alpha-\gamma:\alpha\in L\} is a subspace of 𝔽2m\mathbb{F}_{2^{m}}. By Lemma B.1, we have PLγP_{L-\gamma} is of the form

X2dimL+i=0dimL1αiX2i\displaystyle X^{2^{\dim L}}+\sum_{i=0}^{\dim L-1}\alpha_{i}X^{2^{i}} (33)

for αi𝔽2m\alpha_{i}\in\mathbb{F}_{2^{m}}. In particular, PLγP_{L-\gamma} is 𝔽2\mathbb{F}_{2}-linear, so

PL(X)=PLγ(X+γ)=PLγ(X)+PLγ(γ).\displaystyle P_{L}(X)=P_{L-\gamma}(X+\gamma)=P_{L-\gamma}(X)+P_{L-\gamma}(\gamma). (34)

Setting β=PLγ(γ)\beta=P_{L-\gamma}(\gamma) gives the desired form for PL(X)P_{L}(X). ∎
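Both lemmas can be verified computationally. The sketch below works over 𝔽_16 = 𝔽_2[x]/(x^4 + x + 1) (an illustrative choice of field, modulus, and subspace) and checks that the vanishing polynomial of a dimension-2 subspace has the sparse form of Lemma B.1, and that shifting to an affine subspace changes only the constant term, as in Lemma B.2:

```python
# Check Lemmas B.1 and B.2 over F_16 = F_2[x]/(x^4 + x + 1); the field,
# modulus, and subspace below are illustrative choices.  Field elements are
# ints whose bits are the coefficients of a polynomial in x.
MOD = 0b10011                   # x^4 + x + 1, irreducible over F_2

def gf_mul(a, b):
    # Carry-less multiplication followed by reduction mod x^4 + x + 1.
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:
            a ^= MOD
        b >>= 1
    return r

def poly_mul(p, q):
    # Polynomials over F_16 as coefficient lists, lowest degree first.
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] ^= gf_mul(a, b)
    return r

def vanishing_poly(S):
    # P_S(X) = prod_{a in S} (X - a); in characteristic 2, -a = a.
    p = [1]
    for a in S:
        p = poly_mul(p, [a, 1])
    return p

L = [0, 1, 2, 3]                 # the F_2-span of {1, x}: a dimension-2 subspace
pL = vanishing_poly(L)
assert pL == [0, 6, 7, 0, 1]     # X^4 + 7X^2 + 6X: only degrees 2^0, 2^1, 2^2

affine = [a ^ 4 for a in L]      # the affine subspace L + x^2
pA = vanishing_poly(affine)
assert pA[1:] == pL[1:]          # only the constant term changes (Lemma B.2)
print("P_L =", pL, " P_{L + x^2} =", pA)
```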

Lemma B.3 (Analogous to Lemma 3.5 of [BKR10]).

Let SS be a subset of 𝔽2m\mathbb{F}_{2^{m}} of size nn. Let uu and vv be integers such that 0uvm0\leq u\leq v\leq m. Then there is a family \mathcal{L} of at least 2(u+1)mv22^{(u+1)m-v^{2}} affine subspaces of dimension vv, such that each affine subspace LL\in\mathcal{L} satisfies |LS|n/2mv|L\cap S|\geq n/2^{m-v}, and for any two affine subspaces L,LL,L^{\prime}\in\mathcal{L}, the difference PLPLP_{L}-P_{L^{\prime}} has degree at most 2u2^{u}.

Proof.

For every subspace LL of dimension vv, there exist β1,,β2mv\beta_{1},\dots,\beta_{2^{m-v}} such that the affine subspaces L+βiL+\beta_{i} partition 𝔽2m\mathbb{F}_{2^{m}}. By pigeonhole, there exists some βi\beta_{i} such that |(L+βi)S||S|/2mv=n/2mv|(L+\beta_{i})\cap S|\geq|S|/2^{m-v}=n/2^{m-v}. The number of subspaces of dimension vv is

(2m1)(2m2)(2m2v1)(2v1)(2v2)(2v2v1)2v(mv),\displaystyle\frac{(2^{m}-1)(2^{m}-2)\cdots(2^{m}-2^{v-1})}{(2^{v}-1)(2^{v}-2)\cdots(2^{v}-2^{v-1})}\geq 2^{v(m-v)}, (35)

so there are at least 2v(mv)2^{v(m-v)} affine subspaces LL with |LS|n/2mv|L\cap S|\geq n/2^{m-v}. For all such affine subspaces LL, the polynomial PL(X)P_{L}(X) has the form X2v+i=0v1αiX2i+βX^{2^{v}}+\sum_{i=0}^{v-1}\alpha_{i}X^{2^{i}}+\beta by Lemma B.2. Among these affine subspaces LL, by the pigeonhole principle, for at least a fraction 2m(vu1)2^{-m(v-u-1)} of these subspaces, their polynomials PL(X)P_{L}(X) have the same αi\alpha_{i} for i=u+1,u+2,,v1i=u+1,u+2,\dots,v-1. Let \mathcal{L} be this family of affine subspaces. The number of such affine subspaces is at least 2v(mv)×2m(vu1)=2(u+1)mv22^{v(m-v)}\times 2^{-m(v-u-1)}=2^{(u+1)m-v^{2}}, so \mathcal{L} is the desired family of affine subspaces. ∎
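The count (35) is the Gaussian binomial coefficient, and its lower bound 2^{v(m-v)} can be checked directly for small parameters (the ranges below are illustrative):

```python
# The count (35) is the Gaussian binomial [m choose v]_2; check the lower
# bound 2^{v(m-v)} exhaustively for small parameters.
def gaussian_binomial_2(m, v):
    num = den = 1
    for i in range(v):
        num *= 2**m - 2**i
        den *= 2**v - 2**i
    return num // den            # always an exact integer

for m in range(1, 12):
    for v in range(m + 1):
        assert gaussian_binomial_2(m, v) >= 2 ** (v * (m - v))
print("bound (35) verified for all m < 12")
```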

Proof of Proposition 1.5.

Let δ=2r1\delta=2^{-r-1} as in the statement of Proposition 1.5. Consider a Reed–Solomon code of length nn and rate δ\delta over 𝔽q\mathbb{F}_{q}, where q=2mq=2^{m} with mm sufficiently large. Let S𝔽qS\subset\mathbb{F}_{q} be the set of nn evaluation points. Apply Lemma B.3 with u=m1.99ru=m-\lceil{1.99r}\rceil and v=mrv=m-r. This gives a family \mathcal{L} of 2m(m1.99r)(mr)2=22rm1.99rmr2qΩ(log(1/δ))2^{m(m-\lceil{1.99r}\rceil)-(m-r)^{2}}=2^{2rm-\lceil{1.99r}\rceil m-r^{2}}\geq q^{\Omega(\log(1/\delta))} affine subspaces L𝔽2mL\subseteq\mathbb{F}_{2^{m}} for which |LS|n/2mv=2δn|L\cap S|\geq n/2^{m-v}=2\delta n. Furthermore, for LL\in\mathcal{L}, the subspace polynomials PLP_{L} each have 2v2^{v} roots, and agree on all coefficients of degree larger than 2u2^{u}. Let L0L_{0} be an arbitrary element of \mathcal{L}. Then the polynomials {PL0PL:L}\{P_{L_{0}}-P_{L}:L\in\mathcal{L}\} are each of degree at most 2u=21.99rq4δ1.99qδn2^{u}=2^{-\lceil{1.99r}\rceil}q\leq 4\delta^{1.99}q\leq\delta n, and each agree with PL0(X)P_{L_{0}}(X) on at least |LS|2δn|L\cap S|\geq 2\delta n values of SS. Thus, our Reed–Solomon code is not (12δ,nΩ(1/δ))(1-2\delta,n^{\Omega(1/\delta)})-list-decodable, as desired. ∎