Random Shortening of Linear Codes and Applications
Abstract
Random linear codes (RLCs) are well known to have nice combinatorial properties and near-optimal parameters in many different settings. However, getting explicit constructions matching the parameters of RLCs is challenging, and RLCs are hard to decode efficiently. This motivated several previous works to study the problem of partially derandomizing RLCs, by applying certain operations to an explicit mother code. Among them, one of the most well studied operations is random puncturing, where a series of works culminated in the work of Guruswami and Mosheiff (FOCS’ 22), which showed that a random puncturing of a low-biased code is likely to possess almost all interesting local properties of RLCs.
In this work, we provide an in-depth study of another operation, dual to random puncturing, known as random shortening, which can be viewed equivalently as random puncturing on the dual code. Our main results show that for any small $\varepsilon > 0$, by starting from a mother code with certain weaker conditions (e.g., having a large distance) and performing a random (or even pseudorandom) shortening, the new code is $\varepsilon$-biased with high probability. Our results hold for any field size and yield a shortened code with constant rate. This can be viewed as a complement to random puncturing, and together, we can obtain codes with properties like RLCs from weaker initial conditions.
Our proofs involve several non-trivial methods of estimating the weight distribution of codewords, which may be of independent interest.
1 Introduction
Error correcting codes are fundamental objects in combinatorics and computer science. The study of these objects, together with the bounds and parameters that can be achieved, has also helped shape the field of information theory, starting from the pioneering work of Shannon and Hamming. In the theory of error-correcting codes, linear codes form a fundamental class of codes that are building blocks of many important constructions and applications. Such codes have simple algebraic structures that are often key ingredients in their performance and analysis. For example, any linear code with message length $k$ and codeword length $n$ over the field $\mathbb{F}_q$ can be described by both a generator matrix in $\mathbb{F}_q^{k \times n}$ and a parity check matrix in $\mathbb{F}_q^{(n-k) \times n}$.
It is well known that random linear codes (RLCs, where one samples each entry of the generator matrix uniformly and independently from $\mathbb{F}_q$) enjoy nice combinatorial properties and have near-optimal parameters in many different settings. Specifically, with high probability they achieve Shannon capacity, the Gilbert-Varshamov (GV) bound of rate-distance tradeoff, and are list-decodable up to capacity. However, getting explicit constructions remains a challenging problem in many situations. In addition, random linear codes have little structure, which makes it difficult to design efficient decoding algorithms. Indeed, decoding random linear codes is closely related to the problems of learning parity with noise and learning with errors, whose hardness is the basis of many cryptographic applications (see e.g., [Reg09]). As such, many previous works studied the problem of slightly derandomizing, or equivalently reducing the randomness used in, RLCs, while still maintaining their nice properties.
Among these works, random puncturing is one of the most well-studied operations. Here, one takes an explicit mother code, and then randomly punctures some coordinates from the code (or equivalently, punctures some columns from the generator matrix) to get a new, shorter code. Specifically, a random $P$-puncturing of a mother code randomly chooses a subset $P$ of coordinates of size $p$, and for every codeword of the mother code, deletes all symbols with positions in $P$. Compared to standard RLCs, the number of random bits used is thus greatly reduced: one only needs to sample the set of punctured positions rather than an entire random generator matrix. Furthermore, certain nice structures of the mother code are often inherited by the punctured code, which makes decoding easier.
With sophisticated techniques, previous works have shown that if the mother code satisfies some natural conditions, then after a random puncturing, with high probability the new code has certain properties similar to those of RLCs. For example, motivated by the problem of achieving list-decoding capacity, recent works [Woo13, RW14, FKS22, GST21, BGM22, GZ23, AGL23] studied random puncturing of Reed-Muller (RM) codes and Reed-Solomon (RS) codes. Subsequent works [GM22, PP23] generalized the list-decoding property to all monotone-decreasing local properties. In all these works, the mother code needs to have some special properties, such as being an RS code, an RM code, having a large distance over a large alphabet, or having a low bias over a small alphabet. These properties do not hold for general linear codes, and thus a natural goal is to gradually weaken the requirements on the mother code so that the approach works for a broader class of codes. Indeed, as we shall see later, this is one of the main motivations and themes in previous works.
In this paper we continue this line of work and study the following two natural questions:
1. If the mother code is not that strong, can we still use some operations to get a new code that has properties similar to random linear codes?
2. What other operations, besides random puncturing, are useful in this context?
Towards answering these questions, we consider a different operation to reduce the randomness of RLCs, called random shortening, previously studied in [BGL17, LDT21, YP17, NvZ15]. Specifically, for an integer $s$, a random $s$-shortening of a code $C$ randomly chooses a subset $S$ of $s$ coordinates, and forms a new code by picking all codewords of $C$ which are zero at the positions in $S$, and deleting these zero symbols.
We note that just like random puncturing, the operation of random shortening can in fact be carried out on any code, not just on linear codes. However, for linear codes there is an important, alternative view of random shortening: it is actually the dual version of random puncturing. In particular, one can check that an $s$-shortening of a linear code $C$ is equivalent to a random puncturing of $s$ coordinates on the parity check matrix of $C$, or equivalently on the generator matrix of the dual code $C^\perp$. Thus in this paper, for a linear code, we also call shortening dual puncturing.
This view brings some convenience from the viewpoint of the parity check matrix. For example, any puncturing of the parity check matrix (hence also any shortening) of a low-density parity check (LDPC) code [Gal62] still results in an LDPC code. Another example is expander codes [SS96]. A binary expander code $C$ is based on a bipartite expander graph $G$ with $n$ nodes on the left, $m$ nodes on the right, and left degree $d$. The parity check matrix of $C$ is defined as follows. Each left node corresponds to a codeword bit and each right node corresponds to a parity check which checks if the parity of its neighboring codeword bits is $0$. Such a code has linear time decoding algorithms, and the distance of $C$ can be lower bounded by using the vertex expansion property of $G$. Specifically, assume that for every left set $S$ with $|S| \le \alpha n$, the set of neighbors of $S$, denoted as $\Gamma(S)$, has size at least $(1-\varepsilon) d |S|$; then [CCLO23] showed that the distance of $C$ is at least roughly $\frac{\alpha}{2\varepsilon} n$. Notice that a shortening of $C$ on a set of left nodes actually corresponds to deleting these nodes from the left set together with their adjacent edges, and thus this does not change the vertex expansion property of the remaining graph. Hence the new code still has a distance of at least roughly $\frac{\alpha}{2\varepsilon} n$, which in fact corresponds to a larger relative distance (since the new code has a shorter length). As we will see shortly, this is actually a general property of any shortening of a code. In summary, just like puncturing, the shortening operation also preserves certain nice properties of the mother code, e.g., being an LDPC code or an expander code. In turn, this makes decoding easier.
Before stating our results, we first review some previous works on random puncturing and random shortening in more detail.
1.1 Previous Work
Recently, random puncturing has drawn a lot of attention in the context of list decoding. In [Woo13], Wootters showed that by applying a random puncturing to a Reed-Muller code and setting the desired rate to , with high probability one can list-decode the punctured code up to a relative radius of , with an exponential but non-trivial list size. In [RW14], Rudra and Wootters showed that if the mother code is an RS code, and has a large enough relative distance of , then after puncturing one can get a list-decoding radius of and a rate close to capacity up to a factor, while the list size is . We remark that a rate upper bound for list-decodable linear codes is given by Shangguan and Tamo [ST20], which is a generalized Singleton bound. Specifically, they proved that if is a linear code of rate that is list decodable, i.e., the code has a relative list decoding radius of and list size , then . They conjectured the existence of such codes and proved the case for . Later, towards proving this conjecture, Guo et al. [GLS+22] showed that there are RS codes that are list decodable and the rate can be , though they mainly use intersection matrices instead of random puncturing. Ferber, Kwan, and Sauermann [FKS22] further showed that through random puncturing one can achieve a rate of with list decoding radius and list size . This was further improved by Goldberg et al. [GST21] to achieve a rate of . Most recently, [BGM22] showed that random puncturing of RS codes can go all the way up to the generalized Singleton bound if the field size is , resolving a main conjecture of [ST20]. This was subsequently improved by [GZ23], which reduced the field size to ; and again by [AGL23], which further reduced the field size to , although [GZ23, AGL23] can only get close to the generalized Singleton bound.
We note that all the above works mainly studied RS codes or RM codes, which have strong algebraic structures, and some of them also require a large relative distance (e.g., close to $1$).
On the other hand, Guruswami and Mosheiff [GM22] considered random puncturing of more general codes with weaker properties. Specifically, they considered two cases, where the mother code either has a low bias or has a large distance over a large alphabet (note that the property of a low bias implies a large distance, hence is stronger). For both cases, they showed that the punctured code can achieve list decoding close to capacity. In fact, they showed a stronger result, that all monotone-decreasing local properties of the punctured code are similar to those of random linear codes. Subsequent to [GM22], Putterman and Pyne [PP23] showed that the same results in [GM22] can be achieved by using a pseudorandom puncturing instead, which reduces the number of random bits used in the puncturing to be linear in the block length of the punctured code, even if the mother code has a much larger length.
Unlike puncturing, there are only a handful of previous works on shortening. In [NvZ15], Nelson and Van Zwam proved that all linear codes can be obtained by a sequence of puncturing and/or shortening of a collection of asymptotically good codes. In [YP17], Yardi and Pellikaan showed that all linear codes can be obtained by a sequence of puncturing and/or shortening on some specific cyclic code. In [BGL17], Bioglio et al. presented a low-complexity construction of polar codes with arbitrary length and rate using shortening and puncturing. In [LDT21], Liu et al. provided some general properties of shortened linear codes.
1.2 Notation and Definitions.
Definition 1.1.
A linear code $C$ of length $n$ and dimension $k$ over a finite field $\mathbb{F}_q$ is a $k$-dimensional subspace of the $n$-dimensional vector space $\mathbb{F}_q^n$. The rate of $C$ is the ratio of the dimension to the length of the code: $R = k/n$. The distance (or minimum distance) of $C$ is the minimum Hamming distance between any two distinct codewords in the code: $d = \min_{c_1 \neq c_2 \in C} \operatorname{dist}(c_1, c_2)$. The relative distance of $C$ is the ratio of its distance to its length: $\delta = d/n$.
The dual code $C^\perp$ of a linear code $C$ is the dual linear subspace of $C$. Hence the sum of the rates of $C$ and $C^\perp$ is 1. We call the minimum distance of the dual code $C^\perp$ the dual distance of $C$, denoted by $d^\perp$. The relative dual distance of $C$ is the ratio of its dual distance to its length: $\delta^\perp = d^\perp/n$.
We denote a linear code with these properties as an $[n, k, d]_q$ code or an $[n, k]_q$ code. Moreover, a linear code can be described by a generator matrix $G \in \mathbb{F}_q^{k \times n}$, where each codeword in $C$ is a linear combination of the rows of $G$. The parity check matrix of $C$ is an $(n-k) \times n$ matrix $H$ satisfying the property that for any codeword $c \in C$, $H c^{\top} = 0$. So $H$ is a generator matrix of $C^\perp$.
Definition 1.2.
Let $P$ be a subset of $[n]$ of size $p$. A $P$-puncturing on a code $C$ of length $n$ involves removing the positions indexed by $P$. The resulting punctured code has length $n - p$. If $P$ is a uniformly random subset of $[n]$ of size $p$, we say that the punctured code is obtained from $C$ by a random $p$-puncturing.
Definition 1.3.
Let $S$ be a subset of $[n]$ of size $s$. An $S$-shortening on a code $C$ of length $n$ involves selecting all codewords with zero values on the positions indexed by $S$ and then removing these positions. The resulting shortened code has length $n - s$. If $S$ is a uniformly random subset of $[n]$ of size $s$, we say that the shortened code is obtained from $C$ by a random $s$-shortening.
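As a concrete illustration of the two definitions, the following Python sketch punctures and shortens a small code; the binary $[7,4,3]$ Hamming code and the particular sets of positions are our own toy choices, not examples from the paper.

```python
from itertools import product

# Generator matrix of the binary [7,4,3] Hamming code (our example choice).
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def codewords(G):
    for msg in product([0, 1], repeat=len(G)):
        yield tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))

def puncture(code, P):
    # delete the positions indexed by P from every codeword
    return {tuple(c[i] for i in range(len(c)) if i not in P) for c in code}

def shorten(code, S):
    # keep only codewords vanishing on S, then delete those positions
    return {tuple(c[i] for i in range(len(c)) if i not in S)
            for c in code if all(c[i] == 0 for i in S)}

C = set(codewords(G))
d = min(sum(c) for c in C if any(c))   # minimum distance of the mother code

S = {0, 5}
Cs = shorten(C, S)
ds = min(sum(c) for c in Cs if any(c))
print(len(C), d)    # 16 codewords, distance 3
print(len(Cs), ds)  # 4 codewords (dimension k - |S| = 2), distance still 3
```

Note how shortening keeps the distance while shrinking the length, exactly the behavior formalized later in Propositions 2.7 and 2.8.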
Definition 1.4.
The $q$-ary entropy function is defined as $H_q(x) = x \log_q (q-1) - x \log_q x - (1-x) \log_q (1-x)$.
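The definition above translates directly into a small helper; the numeric spot-checks are our own.

```python
import math

# q-ary entropy: H_q(x) = x*log_q(q-1) - x*log_q(x) - (1-x)*log_q(1-x),
# with H_q(0) = 0 and H_q(1) = log_q(q-1) by continuity.
def H(q, x):
    if x == 0:
        return 0.0
    if x == 1:
        return math.log(q - 1, q)
    lq = lambda y: math.log(y, q)
    return x * lq(q - 1) - x * lq(x) - (1 - x) * lq(1 - x)

print(H(2, 0.5))  # the binary entropy attains its maximum 1 at x = 1/2
```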
Throughout the paper, we use “with high probability” to mean that when the rate, relative distance, relative dual distance of the code, and other given parameters are fixed, the probability of the event is at least $1 - 2^{-cn}$ for some constant $c > 0$, where $n$ is the block length. Essentially, this means that the probability of the event occurring approaches 1 as the block length increases, making it increasingly likely that the desired properties hold.
As in [GM22], in this paper we also consider monotone-decreasing, local properties. Informally, we call a code property $\mathcal{P}$ monotone-decreasing and local if the fact that a code $C$ does not satisfy $\mathcal{P}$ can be witnessed by a small “bad set” of codewords in $C$. For example, some typical properties, such as being list-decodable to capacity and having a small bias, are monotone-decreasing and local properties. More formally, a monotone-decreasing and local property is the opposite of a monotone-increasing and local property, defined below.
Definition 1.5.
A property $\mathcal{P}$ is said to be
• monotone-increasing if, for any code $C$, whenever one of its subcodes (i.e., a subspace of $C$) satisfies $\mathcal{P}$, the code $C$ itself also satisfies $\mathcal{P}$ (monotone-decreasing if the complement of $\mathcal{P}$ is monotone-increasing);
• $b$-local for some $b \in \mathbb{N}$ if there exists a family $\mathcal{B}_{\mathcal{P}}$ of sets of words in $\mathbb{F}_q^n$, each set of size at most $b$, such that $C$ satisfies $\mathcal{P}$ if and only if there exists a set $B \in \mathcal{B}_{\mathcal{P}}$ with $B \subseteq C$;
• row-symmetric if, for any code $C$ that satisfies $\mathcal{P}$, the code obtained by performing any permutation on the positions of $C$ also satisfies $\mathcal{P}$.
1.3 Main Results
Random puncturing vs. random shortening
Before formally stating our results, we first informally compare the two operations of random puncturing and random shortening. A random $p$-puncturing of a code of length $n$ involves selecting $p$ positions uniformly at random from $[n]$, and discarding these positions in the code. One can see that under appropriate conditions, this operation preserves the distinctness of all codewords, and thus can increase the rate of the code. However, it may decrease the distance or relative distance of the code. In contrast, a random $s$-shortening of a code involves picking $s$ positions uniformly at random from $[n]$, forming a subcode that consists of codewords which contain only zeros at these positions, and then deleting these positions in the subcode. It can be seen that this operation preserves the distance of the code, and thus increases the relative distance of the code, but on the other hand the rate of the code can potentially decrease. Hence, these two operations are indeed “dual” in some sense, and therefore one can apply both operations to adjust both the rate and the relative distance of the code.
A linear code $C \subseteq \mathbb{F}_q^n$, where $q = p^r$ for some prime $p$, is called $\varepsilon$-biased if for every codeword $c \in C \setminus \{0\}$ and every $\lambda \in \mathbb{F}_q^*$, $\left|\sum_{i \in [n]} \omega^{\operatorname{tr}(\lambda c_i)}\right| \le \varepsilon n$, where $\omega = e^{2\pi \mathrm{i}/p}$ and $\operatorname{tr} : \mathbb{F}_q \to \mathbb{F}_p$ is the field trace map. In Section 2.2, we will provide a more detailed explanation of $\varepsilon$-biased codes.
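To make the definition concrete, here is a small sketch of the bias of a vector over a prime field $\mathbb{F}_p$ (so the trace map is the identity); the test vectors are our own toy examples.

```python
import cmath

# bias(c) = max over nonzero lambda in F_p of |sum_i omega^{lambda*c_i}| / n,
# where omega = e^{2*pi*i/p}; over a prime field the trace map is the identity.
def bias(c, p):
    omega = cmath.exp(2j * cmath.pi / p)
    return max(abs(sum(omega ** ((lam * ci) % p) for ci in c)) / len(c)
               for lam in range(1, p))

print(bias([0, 1] * 8, 2))     # perfectly balanced binary vector: bias ~ 0
print(bias([1] * 16, 2))       # constant nonzero vector: bias 1
print(bias([0, 1, 2] * 5, 3))  # balanced ternary vector: bias ~ 0
```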
Our main results show that random shortening is an effective way to reduce the bias of a code. Note that this is stronger than increasing the relative distance, since the former implies the latter (see Proposition 2.10). If the mother code satisfies certain conditions, then we show after random shortening the new code can achieve an arbitrarily small bias with high probability. We note that a random linear code has a small bias, and thus in this sense the code after random shortening behaves like random linear codes. Moreover, the condition that the mother code has a low bias is required in several previous works (e.g., [GM22, PP23]), while these works essentially do not care about the rate of the mother code. Thus we can apply a random puncturing to the new code after a random shortening, to get another code where all monotone-decreasing local properties are similar to those of random linear codes. This further weakens the requirements of mother codes in previous works to some extent.
Low-biased codes from codes with large distance.
A low-biased code must have a large distance, as stated in Proposition 2.10. However, the reverse may not hold. The following theorem shows that it is also possible to derive a low-biased code from a code with a large distance by random shortening.
Theorem 1.6.
For any , any code with and any constant , there exists a number such that the following holds. If we perform a random -shortening to , then with high probability, the shortened code is -biased and has rate at least .
We note that the theorem only requires a lower bound on the relative distance of $C$; there are no restrictions on its rate. Hence, this requirement is generally easy to satisfy, for example, via simple constructions using code concatenation. Furthermore, we can select an appropriate shortening size to ensure that the rate of the shortened code is arbitrarily close to that of $C$.
The distance condition of in Theorem 1.6 can also be relaxed, resulting in the following theorem.
Theorem 1.7.
Given any , if an code satisfies the condition that there exists some , such that , then there exists a number such that the following holds. If we perform a random -shortening to , then with high probability, the shortened code is -biased with rate at least .
Indeed, the asymptotic form of the Plotkin bound is given by

(1)  $R \le 1 - \frac{q}{q-1}\,\delta + o(1)$.
Thus Theorem 1.7 implies that as long as the rate-distance trade-off of the original code is close enough to the Plotkin bound, we can obtain a code with an arbitrarily small bias by random shortening. On the other hand, unlike in Theorem 1.6, the rate of the shortened code may not be arbitrarily close to the original rate, but the new rate is still only a constant factor smaller.
Remark 1.
Just as in [GM22], in both our theorems, the condition that the mother code is linear is crucial for ensuring that the shortened code maintains a constant rate with high probability. When the mother code is non-linear, there are certain counterexamples. For instance, take a good linear code and append to every codeword a run of $1$'s whose length is an arbitrarily small constant fraction of the block length. The rate and relative distance of the new code are almost the same as before, but the new code is non-linear, and a random shortening of linear size hits the appended $1$'s with high probability, in which case the shortened code is trivial.
Low-biased codes from codes with small rate and not too small dual distance.
In the next theorem, there is no requirement for the distance of $C$ to be very large. Instead, we impose constraints on its relative dual distance $\delta^\perp$ and its rate $R$. If the dual distance is not too small and the rate can be upper bounded, then we can also apply the shortening technique to obtain a low-biased code.
Theorem 1.8.
Given any , if an code satisfies the condition that there exist , , such that and , then there exists a number such that the following holds. If we perform a random -shortening to , then with high probability the shortened code is -biased with rate at least .
In Theorem 1.8, the rate of the dual code must be sufficiently large. Additionally, if the term is less than 1, the rate-distance trade-off of the dual code surpasses the Gilbert-Varshamov (GV) bound. Consequently, when examining the problem within the context of the GV bound, we need to impose specific constraints on . This leads to the following corollary.
Corollary 1.9.
Given any , , there exists a number , such that for any , for a certain , there exists a number such that the following holds. Let be any code. If we perform a random -shortening to , then with high probability the shortened code is -biased with rate at least .
Theorem 1.8 and Corollary 1.9 show that as long as the mother code and its dual both have a reasonable relative distance, one can use random shortening to get a new code with an arbitrarily small bias, while only losing a constant factor in the rate. We note that linear codes such that both the code and its dual have good relative distance are easily constructible; for example, see [Shp09].
Random-like codes by random shortening and puncturing.
In [GM22], the authors showed that a random puncturing of a low-biased code results in a new code that behaves like random linear codes. Using our theorems, we present a weaker condition that still achieves similar results. This follows from a combination of random shortening and random puncturing, as briefly discussed before.
Theorem 1.10.
For any , , and prime power , there exists some , such that the following holds. Let be a monotone-decreasing, -local, and row-symmetric property over satisfied by a random linear code of length and rate . There exists some such that the following holds. If any one of the following properties is satisfied for :
-
1.
, or
-
2.
and for a certain ,
then there exists such that for any code, if we perform a random -shortening and then a random -puncturing on , the resulting code has length , rate at least , and with high probability satisfies .
Remark 2.
In fact, all our theorems hold under the more restricted pseudorandom shortening, where the set $S$ is sampled from a random walk on a sufficiently good constant-degree expander graph, as in [PP23]. The reason is that in this case, the only thing that changes is the probability that a random shortening does not hit a codeword. A standard hitting set property of random walks on expander graphs ensures that this probability (as in Lemma 2.15) remains close to that of a uniformly random shortening, as long as the second largest normalized eigenvalue in absolute value of the expander is small enough, which can be achieved by having a large enough constant degree. Consequently, all our proofs essentially go through with only minimal changes. Thus, like [PP23], we can also reduce the number of random bits used in the shortening to be linear in the block length of the shortened code, even if the mother code has a much larger length.
Remark 3.
We emphasize that, in all our theorems, the conditions of the mother code we need are weaker than the conditions in previous works which only use random puncturing [GM22, PP23]. For example, when the alphabet size is small, those works require the mother code to have a small bias, which implies a large distance. On the other hand, our theorems either only require a large distance, or only require a reasonable distance in both the mother code and its dual.
Discussions and open questions.
Our work leaves several open questions for future investigation. One such question is whether we can achieve a good rate when performing random shortening. Right now, using our analysis, the rate of the code after random shortening is potentially worse than that of the mother code we start with, and thus we do not get a good rate-distance trade-off by simply applying random shortening. Therefore, it is a natural question to see if one can achieve an $\varepsilon$-biased code through random shortening while at the same time maintaining a favorable rate-distance trade-off, if we start with some good initial conditions on the mother code. Another direction is further weakening the initial conditions required for obtaining low-biased codes, such as removing the rate constraint in Theorem 1.8. In our view, these questions present exciting opportunities for advancing our understanding of random shortening, low-biased codes, and their connections.
Finally, it would also be quite interesting to completely derandomize the random shortening, thus yielding explicit constructions of low-biased codes with special structures (e.g., LDPC codes or expander codes). This will complement existing constructions of low-biased codes, and possibly lead to more efficient decoding algorithms.
1.4 Technique Overview
We investigate the effect of shortening as follows. An $S$-shortening applied to a code $C$ of length $n$ involves selecting all codewords with zeros at the positions indexed by $S$ and removing these positions. Specifically, if the support of a codeword $c$ intersects $S$ (in which case we say $S$ hits $c$), then $c$ will not be included in the shortened code $C'$; if the support of $c$ does not intersect $S$, then there is a codeword $c' \in C'$, which is obtained from $c$ by removing all positions in $S$. In this way, under a random shortening, each non-zero codeword has a certain probability of being dropped and a certain probability of being retained in $C'$. If the relative distance of $C$ is $\delta$, then the probability of each non-zero codeword being hit and dropped is at least $1 - (1-\delta)^{s}$ by Lemma 2.15, where $s$ is the size of $S$.
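The survival probability of a fixed codeword under a random $s$-shortening can be computed exactly; the sketch below (our own illustration, consistent with the counting behind Lemma 2.15) compares it to the $(1-\delta)^s$-type bound.

```python
from math import comb

# A uniformly random s-subset S of [n] misses the support of a fixed codeword
# of weight w with probability C(n - w, s) / C(n, s) <= (1 - w/n)^s.
def miss_prob(n, w, s):
    return comb(n - w, s) / comb(n, s)

n, w, s = 1000, 300, 20
print(miss_prob(n, w, s), (1 - w / n) ** s)  # the exact value sits below the bound
```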
We use $\mathcal{B}_\varepsilon(C)$ to denote the set of all codewords in $C$ which are not $\varepsilon$-biased. If the size of $\mathcal{B}_\varepsilon(C)$ is small, then by a union bound, the probability that not all codewords in $\mathcal{B}_\varepsilon(C)$ are hit by $S$ is exponentially small. Thus, with high probability, all codewords of $C$ that are not hit by $S$ and are inherited by $C'$ are $\varepsilon$-biased. Hence, a critical part of all our proofs is to upper bound the size of $\mathcal{B}_\varepsilon(C)$.
Furthermore, as long as $C$ is a linear code of dimension $k$ and the shortening size $s$ is less than $k$, we know that the shortened code has dimension at least $k - s$. Consequently, $C'$ retains a nonzero constant rate as well.
Change of parameters.
The shortening results in changes to the parameters of the code. Here, we mainly apply shortening for two purposes: adjusting the bias and amplifying the relative distance.
1. Adjusting the bias: Let $C$ be of length $n$. When $\mathcal{B}_\varepsilon(C)$ is hit by $S$, it implies that the codewords in $C$ not hit by $S$ are all $\varepsilon$-biased. However, it does not directly imply that the shortened code $C'$ is also $\varepsilon$-biased, since the shortening operation changes the length of the code. Nevertheless, the new bias of $C'$ is at most $\frac{\varepsilon n + s}{n - s}$, where $s$ is the size of the shortening set $S$. If $s$ is small compared to $n$, this is close to $\varepsilon$. In the proof of Theorem 1.6, we can choose $s$ to be a sufficiently small fraction of $n$. In the proof of Theorem 1.8, we provide an upper bound on the size of $\mathcal{B}_\varepsilon(C)$, which also enables us to choose a small shortening size. In both cases, the shortening size is a small enough fraction of $n$, allowing the final bias to be as small as desired.
2. Amplifying the relative distance: We use another technique in the proof of Theorem 1.7 to first transform a code with a rate-distance trade-off near the Plotkin bound into a code with near-optimal distance. By Proposition 2.8, the distance of the shortened code $C'$ is no less than that of the original code $C$. However, since $C'$ has length $n - s$ instead of $n$, its relative distance becomes at least $\frac{d}{n-s}$. This allows us to increase the relative distance of the code. In turn, Theorem 1.7 follows from Theorem 1.6.
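The two parameter changes above can be summarized in a short sketch; the bias bound reflects that each of the $s$ removed zero positions contributes $\omega^0 = 1$ to the original character sum, and the concrete numbers are our own.

```python
# After an s-shortening, a surviving codeword of an eps-biased length-n code
# has character sum at most eps*n + s in absolute value, over length n - s.
def new_bias(eps, n, s):
    return (eps * n + s) / (n - s)

# Distance is preserved (Proposition 2.8) while the length shrinks to n - s.
def new_rel_distance(d, n, s):
    return d / (n - s)

n, d, eps, s = 10_000, 4_000, 0.05, 200
print(new_bias(eps, n, s))        # only slightly above eps when s << n
print(new_rel_distance(d, n, s))  # strictly larger than d/n = 0.4
```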
Estimation of the size of .
This is the most critical part of all our proofs. For Theorem 1.6 and Theorem 1.8, we have two different ways of estimating an upper bound on the size of $\mathcal{B}_\varepsilon(C)$, the set of codewords that are not $\varepsilon$-biased:
1. Estimating with the relative distance $\delta$: We use $\rho_J(\delta)$ to denote the list decoding radius corresponding to the classical Johnson bound for a code over $\mathbb{F}_q$ with relative distance $\delta$. It is easy to see that when $\delta$ is close to the optimal $1 - 1/q$, so is $\rho_J(\delta)$. To give an upper bound on the size of $\mathcal{B}_\varepsilon(C)$, we construct $q$ balls in $\mathbb{F}_q^n$ with relative radius $\rho_J(\delta)$, centered at the words $\beta \cdot \mathbf{1}$, where $\mathbf{1}$ is the all-one vector and $\beta$ ranges over $\mathbb{F}_q$. By the Johnson bound, the number of codewords covered by these balls is small. We show that, if a codeword is not covered by these balls, its empirical distribution over $\mathbb{F}_q$ is close to the uniform distribution, which implies that it has a small bias. This upper bounds the size of $\mathcal{B}_\varepsilon(C)$ by the total list size of the $q$ balls.
2. Estimating with the relative dual distance $\delta^\perp$ and rate $R$: If $C$ has dual distance $d^\perp$, then any $d^\perp - 1$ columns of the generator matrix of $C$ are linearly independent, which means that if we uniformly randomly choose a codeword from $C$, then any $d^\perp - 1$ symbols of the codeword are independently uniform, i.e., the symbols of a random codeword are $(d^\perp - 1)$-wise independent. We can now use this property to estimate the probability that a codeword randomly chosen from $C$ is not $\varepsilon$-biased. This is a typical application of the concentration phenomenon via the higher moment method, where we use the Hoeffding inequality, the Chernoff bound, and the sub-Gaussian property to bound the $t$-th moment of the sum of certain random variables. Then by Markov's inequality, the probability that a random codeword is not $\varepsilon$-biased can be bounded, which also gives an upper bound on the size of $\mathcal{B}_\varepsilon(C)$.
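The independence step above can be checked directly on a toy example: the binary $[7,4]$ Hamming code (our own choice; its dual, the simplex code, has distance $4$), so any $3$ symbols of a uniformly random codeword are jointly uniform.

```python
from itertools import combinations, product

# Generator matrix of the [7,4] Hamming code; its dual distance is 4, so any
# 3 columns of G are linearly independent and the symbols of a uniformly
# random codeword are 3-wise independent.
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]
code = [tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
        for msg in product([0, 1], repeat=4)]

for pos in combinations(range(7), 3):
    counts = {}
    for c in code:
        pat = tuple(c[i] for i in pos)
        counts[pat] = counts.get(pat, 0) + 1
    # every pattern in {0,1}^3 appears equally often: 16 / 8 = 2 times
    assert sorted(counts.values()) == [2] * 8

print("all 3-subsets of positions are uniformly distributed")
```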
Obtaining random-like codes.
To obtain random-like codes, we combine our results with those in [GM22], which state that a randomly punctured low-biased code is likely to possess any monotone-decreasing local property typically satisfied by a random linear code of a similar rate. By our results, we can start from a code with less stringent conditions and achieve the same results as in [GM22], through the operations of a random shortening followed by a random puncturing.
Organization.
The rest of this paper is organized as follows. In Section 2, we describe some basic definitions, terms, and useful properties. In Section 3, we show how to combine our theorems with the work of [GM22] to obtain a random-like code from weaker initial conditions, and prove Theorem 1.10. In Section 4, we present two methods for estimating the size of $\mathcal{B}_\varepsilon(C)$. In Section 5, we prove Theorem 1.6 and Theorem 1.7. In Section 6, we prove Theorem 1.8 and Corollary 1.9.
2 Preliminary
2.1 Punctured codes and shortened codes
Definition 2.1.
A linear code $C$ of length $n$ and dimension $k$ over a finite field $\mathbb{F}_q$ is a $k$-dimensional subspace of the $n$-dimensional vector space $\mathbb{F}_q^n$. The rate of $C$ is defined as the ratio of the dimension to the length of the code: $R = k/n$. The distance (or minimum distance) of $C$ is the minimum Hamming distance between any two distinct codewords in the code: $d = \min_{c_1 \neq c_2 \in C} \operatorname{dist}(c_1, c_2)$. The relative distance of $C$ is the ratio of its distance to its length: $\delta = d/n$. The dual distance of $C$ is the minimum distance of its dual code $C^\perp$, denoted by $d^\perp$. The relative dual distance of $C$ is the ratio of its dual distance to its length: $\delta^\perp = d^\perp/n$. We denote a linear code with these properties as an $[n, k, d]_q$ code or an $[n, k]_q$ code. A linear code can be described by a generator matrix $G \in \mathbb{F}_q^{k \times n}$, where each codeword in $C$ can be obtained as a linear combination of the rows of $G$. The parity check matrix of $C$ is an $(n-k) \times n$ matrix $H$ satisfying the property that for any codeword $c \in C$, $H c^{\top} = 0$.
Definition 2.2.
Let $P$ be a subset of $[n]$ of size $p$. A $P$-puncturing on a code $C$ of length $n$ involves removing the positions indexed by $P$. The resulting punctured code has length $n - p$. If $P$ is a uniformly random subset of $[n]$ of size $p$, we say that the punctured code is obtained from $C$ by a random $p$-puncturing.
Definition 2.3.
Let $S$ be a subset of $[n]$ of size $s$. An $S$-shortening on a code $C$ of length $n$ involves selecting all codewords with zero values on the positions indexed by $S$ and then removing these positions. The resulting shortened code has length $n - s$. If $S$ is a uniformly random subset of $[n]$ of size $s$, we say that the shortened code is obtained from $C$ by a random $s$-shortening.
Proposition 2.4.
For a linear code $C$ and sets $P, S \subseteq [n]$, the $P$-punctured code and the $S$-shortened code of $C$ are both linear codes. The generator matrix of the punctured code is obtained from a generator matrix of $C$ by deleting the columns indexed by $P$, and the parity check matrix of the shortened code is obtained from a parity check matrix of $C$ by deleting the columns indexed by $S$.
Given that the parity check matrix of a code is the transpose of the generator matrix of its corresponding dual code, we can deduce that shortening a code is equivalent to puncturing its dual code. Consequently, we can observe the following properties.
Proposition 2.5.
For a linear code $C$ and any $S \subseteq [n]$, the dual of the $S$-shortening of $C$ is the $S$-puncturing of the dual code $C^\perp$.
Definition 2.6.
Let $C$ be a code of length $n$, let $c$ be a codeword in $C$, and let $S \subseteq [n]$. Denote by $\operatorname{supp}(c)$ the set of non-zero coordinates of $c$. We say $S$ hits $c$ if $S \cap \operatorname{supp}(c) \neq \emptyset$, and $S$ hits a set of codewords if each codeword in the set is hit by $S$.
In general, when applying the $S$-shortening to a code $C$, we exclude any codeword that is hit by $S$. Specifically, if a codeword $c$ is hit by $S$, by definition we do not include it in the shortened code. On the other hand, if a codeword $c$ is not hit by $S$, there exists a corresponding codeword in the shortened code, which is obtained from $c$ by removing all positions indexed by $S$.
Next, we examine the effects of the shortening operation on code parameters.
Proposition 2.7.
Let $\mathcal{C}$ be an $[n, k, d, d^{\perp}]_q$ code and $\mathcal{S}$ be a subset of $[n]$ of size $s$. The shortened code $\mathcal{C}_{\mathcal{S}}$ has dimension at least $k - s$. Moreover, if $s < d^{\perp}$, then $\dim(\mathcal{C}_{\mathcal{S}}) = k - s$.
Proof.
By Proposition 2.4, the parity check matrix of $\mathcal{C}_{\mathcal{S}}$ is obtained from that of $\mathcal{C}$ by deleting the $s$ rows indexed by $\mathcal{S}$; since it has $n - k$ columns, the dimension of $\mathcal{C}_{\mathcal{S}}$ is at least $(n - s) - (n - k) = k - s$. If $s < d^{\perp}$, by Proposition 2.5, there doesn't exist a non-zero codeword $c \in \mathcal{C}^{\perp}$ with $\mathrm{supp}(c) \subseteq \mathcal{S}$. In this case, there won't be any collisions between two codewords of $\mathcal{C}^{\perp}$ after puncturing on $\mathcal{S}$, so $\dim((\mathcal{C}^{\perp})^{\mathcal{S}}) = n - k$, hence $\dim(\mathcal{C}_{\mathcal{S}}) = (n - s) - (n - k) = k - s$, and the shortened code has rate $\frac{k - s}{n - s}$. ∎
Proposition 2.8.
Let $\mathcal{C}$ be an $[n, k, d]_q$ code and $\mathcal{S}$ be a subset of $[n]$ of size $s$. The distance of the shortened code $\mathcal{C}_{\mathcal{S}}$ is at least $d$.
Proof.
By definition, any non-zero codeword $c' \in \mathcal{C}_{\mathcal{S}}$ comes from some codeword $c \in \mathcal{C}$ by removing all coordinates indexed by $\mathcal{S}$, where $c_i = 0$ for any $i \in \mathcal{S}$, and thus $\mathrm{wt}(c') = \mathrm{wt}(c) \geq d$. So, the distance of $\mathcal{C}_{\mathcal{S}}$ is at least $d$. ∎
Moreover, since the shortened code has length $n - s$ instead of $n$, its relative distance becomes at least $\frac{d}{n - s} > \frac{d}{n}$. This allows us to increase the relative distance of the code.
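A quick brute-force check (a Python sketch; the [6,3] binary example and helper names are arbitrary choices for illustration) shows that shortening does not decrease the distance while the relative distance can strictly improve:

```python
from itertools import product

def codewords(gen, n):
    """Enumerate the GF(2) span of the rows of gen."""
    return {tuple(sum(c * row[i] for c, row in zip(coeffs, gen)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(gen))}

def shorten(code, S):
    """Keep codewords vanishing on S, then delete the positions in S."""
    return {tuple(x for i, x in enumerate(c) if i not in S)
            for c in code if all(c[i] == 0 for i in S)}

def min_dist(code):
    """Minimum weight of a non-zero codeword (= distance of a linear code)."""
    return min(sum(c) for c in code if any(c))

G = [[1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]          # a [6,3] binary code with distance 3
n, S = 6, {0}
C = codewords(G, n)
CS = shorten(C, S)
d, dS = min_dist(C), min_dist(CS)
assert dS >= d                       # distance does not drop
assert dS / (n - len(S)) > d / n     # relative distance strictly improves here
```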
2.2 A characterization of $\varepsilon$-biased codes
In this section, we recall the definition of $\varepsilon$-biased codes.
Definition 2.9.
Let $q = p^r$ for some prime $p$, and let $\omega = e^{2\pi i / p}$. A vector $c \in \mathbb{F}_q^n$ is said to be $\varepsilon$-biased if $\left|\sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)}\right| \leq \varepsilon n$ for all $\alpha \in \mathbb{F}_q^*$. Here, $\mathrm{tr}: \mathbb{F}_q \to \mathbb{F}_p$ is the field trace map: $\mathrm{tr}(x) = \sum_{i=0}^{r-1} x^{p^i}$. The code $\mathcal{C}$ is said to be $\varepsilon$-biased if every non-zero codeword $c \in \mathcal{C}$ is $\varepsilon$-biased.
Note that for a binary vector $c \in \mathbb{F}_2^n$, $\sum_{j=1}^{n} (-1)^{c_j} = n - 2\,\mathrm{wt}(c)$. Then, a binary code $\mathcal{C}$ is $\varepsilon$-biased if and only if every non-zero codeword of $\mathcal{C}$ has weight in $\left[\frac{1-\varepsilon}{2}n, \frac{1+\varepsilon}{2}n\right]$.
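The binary bias is easy to compute directly; a minimal Python sketch (function name and sample vectors are ours):

```python
def bias(c):
    """Bias of a binary vector: |sum_j (-1)^c_j| / n = |n - 2*wt(c)| / n."""
    n = len(c)
    return abs(sum((-1) ** x for x in c)) / n

balanced = [0, 1] * 8            # weight n/2   -> bias 0
skewed = [1] * 12 + [0] * 4      # weight 3n/4  -> bias 1/2
assert bias(balanced) == 0.0
assert bias(skewed) == 0.5
```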
Proposition 2.10.
If $\mathcal{C}$ is $\varepsilon$-biased, then its distance is at least $(1-\varepsilon)\,\frac{q-1}{q}\,n$.
Proof.
Since $\mathcal{C}$ is $\varepsilon$-biased, for any non-zero $c \in \mathcal{C}$,
(2) $\left|\sum_{\alpha \in \mathbb{F}_q^*} \sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)}\right| \leq (q-1)\,\varepsilon n$.
By changing the order of the double summation,
(3) $\sum_{\alpha \in \mathbb{F}_q^*} \sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)} = \sum_{j=1}^{n} \sum_{\alpha \in \mathbb{F}_q} \omega^{\mathrm{tr}(\alpha c_j)} - n$.
Since $\sum_{\alpha \in \mathbb{F}_q} \omega^{\mathrm{tr}(\alpha c_j)} = 0$ for all $c_j \neq 0$ (and equals $q$ when $c_j = 0$), we get
(4) $q\,(n - \mathrm{wt}(c)) - n \leq (q-1)\,\varepsilon n$,
which implies $\mathrm{wt}(c) \geq (1-\varepsilon)\,\frac{q-1}{q}\,n$. ∎
Definition 2.11.
Given a vector $c \in \mathbb{F}_q^n$, we define its empirical distribution $\mathsf{Emp}(c)$ over $\mathbb{F}_q$ by
(5) $\mathsf{Emp}(c)(x) = \frac{\left|\{ j \in [n] : c_j = x \}\right|}{n}$, for $x \in \mathbb{F}_q$.
Lemma 2.12.
Given a vector $c \in \mathbb{F}_q^n$, if for any $x \in \mathbb{F}_q$,
(6) $\left|\mathsf{Emp}(c)(x) - \frac{1}{q}\right| \leq \frac{\varepsilon}{q}$,
then $c$ is $\varepsilon$-biased.
Proof.
We compute the total variation distance between $\mathsf{Emp}(c)$ and the uniform distribution $U_q$ over $\mathbb{F}_q$, which is given by
(7) $d_{TV}(\mathsf{Emp}(c), U_q) = \frac{1}{2} \sum_{x \in \mathbb{F}_q} \left|\mathsf{Emp}(c)(x) - \frac{1}{q}\right|$.
Since $\left|\mathsf{Emp}(c)(x) - \frac{1}{q}\right| \leq \frac{\varepsilon}{q}$ for any $x \in \mathbb{F}_q$,
(8) $d_{TV}(\mathsf{Emp}(c), U_q) \leq \frac{1}{2}\, q \cdot \frac{\varepsilon}{q} = \frac{\varepsilon}{2}$.
We know that for any $\alpha \in \mathbb{F}_q^*$,
(9) $\left|\sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)}\right| = n \left|\sum_{x \in \mathbb{F}_q} \left(\mathsf{Emp}(c)(x) - \frac{1}{q}\right) \omega^{\mathrm{tr}(\alpha x)}\right| \leq n \sum_{x \in \mathbb{F}_q} \left|\mathsf{Emp}(c)(x) - \frac{1}{q}\right| = 2n\, d_{TV}(\mathsf{Emp}(c), U_q) \leq \varepsilon n$,
which implies $c$ is $\varepsilon$-biased. ∎
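The relation driving this argument, namely that the normalized character sum is at most $q$ times the largest deviation of the empirical distribution from uniform, can be spot-checked numerically. The sketch below works over a prime field (where the trace is the identity) and tests random vectors over $\mathbb{F}_3$ (helper names are ours):

```python
import cmath
import random
from collections import Counter

def max_char_sum(c, p):
    """max over nonzero alpha of |sum_j omega^(alpha*c_j)| / n, prime field F_p."""
    omega = cmath.exp(2j * cmath.pi / p)
    n = len(c)
    return max(abs(sum(omega ** ((a * x) % p) for x in c))
               for a in range(1, p)) / n

def max_emp_gap(c, p):
    """max over x in F_p of |Emp(c)(x) - 1/p|."""
    n = len(c)
    cnt = Counter(c)
    return max(abs(cnt.get(x, 0) / n - 1 / p) for x in range(p))

random.seed(0)
p, n = 3, 300
for _ in range(20):
    c = [random.randrange(p) for _ in range(n)]
    # |sum_j omega^(a*c_j)| = n*|sum_x (Emp(x) - 1/p)*omega^(a*x)| <= n*p*max_gap
    assert max_char_sum(c, p) <= p * max_emp_gap(c, p) + 1e-9
```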
Definition 2.13.
Let $\mathcal{C}$ be a code of length $n$. Denote by $D_\varepsilon(\mathcal{C})$ the set of all non-zero codewords of $\mathcal{C}$ which are not $\varepsilon$-biased.
Lemma 2.14.
Let $\mathcal{C}$ be a code of length $n$, and $\mathcal{S}$ be a subset of $[n]$ of size $s$. If $\mathcal{S}$ hits $D_\varepsilon(\mathcal{C})$, then the shortened code $\mathcal{C}_{\mathcal{S}}$ is $\frac{\varepsilon n + s}{n - s}$-biased.
Proof.
Any non-zero codeword $c' \in \mathcal{C}_{\mathcal{S}}$ comes from some codeword $c \in \mathcal{C}$ by removing all coordinates indexed by $\mathcal{S}$, where $c_i = 0$ for any $i \in \mathcal{S}$. Since $\mathcal{S}$ hits $D_\varepsilon(\mathcal{C})$, $c$ is $\varepsilon$-biased, that is, for any $\alpha \in \mathbb{F}_q^*$, $\left|\sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)}\right| \leq \varepsilon n$. Then
(10) $\left|\sum_{j \in [n] \setminus \mathcal{S}} \omega^{\mathrm{tr}(\alpha c_j)}\right| = \left|\sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)} - s\right| \leq \varepsilon n + s$
for any $\alpha \in \mathbb{F}_q^*$, and thus $\mathcal{C}_{\mathcal{S}}$ is $\frac{\varepsilon n + s}{n - s}$-biased. ∎
Lemma 2.15.
Let $\mathcal{C}$ be a code of length $n$ and distance $d$, and let $\mathcal{S}$ be a uniformly random subset of $[n]$ of size $s$. For any non-zero codeword $c \in \mathcal{C}$, the probability that $c$ is not hit by $\mathcal{S}$ is at most $\left(1 - \frac{d}{n}\right)^{s}$.
Proof.
Suppose $\mathrm{wt}(c) = w$, so $w \geq d$. That $c$ is not hit by $\mathcal{S}$ means $\mathcal{S} \cap \mathrm{supp}(c) = \emptyset$, whose probability is at most
(11) $\frac{\binom{n-w}{s}}{\binom{n}{s}} \leq \left(1 - \frac{w}{n}\right)^{s} \leq \left(1 - \frac{d}{n}\right)^{s}$.
∎
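The hypergeometric probability and the bound in the lemma can be compared exactly for small parameters (a Python sketch; the parameter values are arbitrary):

```python
from math import comb

def miss_prob(n, w, s):
    """Exact Pr[a uniform size-s subset of [n] avoids a fixed set of w positions]."""
    return comb(n - w, s) / comb(n, s)

n, s = 30, 6
# the exact probability is dominated by (1 - w/n)^s for every support size w
for w in range(1, n - s + 1):
    assert miss_prob(n, w, s) <= (1 - w / n) ** s
```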
3 Random-like codes by random shortening and puncturing
In this section, we integrate the applications of low-biased codes with our theorems to obtain further results. In [GM22], randomly punctured low-biased codes are studied and shown to be locally similar to random linear codes. Combining this result with ours, we can derive weaker conditions for obtaining a random-like code. We first recall some of the framework for studying properties of codes, which was established in [MRRZ+20, GMR+20].
A property $\mathcal{P}$ of length-$n$ linear codes over $\mathbb{F}_q$ is a collection of linear codes in $\mathbb{F}_q^n$. A linear code $\mathcal{C}$ with $\mathcal{C} \in \mathcal{P}$ is said to satisfy $\mathcal{P}$.
Definition 3.1.
A property $\mathcal{P}$ is said to be
• monotone-increasing if, for any code $\mathcal{C}$, whenever one of its subcodes (i.e., a subspace of $\mathcal{C}$) satisfies $\mathcal{P}$, the code $\mathcal{C}$ itself also satisfies $\mathcal{P}$ (monotone-decreasing if the complement of $\mathcal{P}$ is monotone-increasing);
• $b$-local for some $b \in \mathbb{N}$ if there exists a family of sets of words in $\mathbb{F}_q^n$, with the size of the sets at most $b$, such that $\mathcal{C}$ satisfies $\mathcal{P}$ if and only if $\mathcal{C}$ contains at least one set from the family;
• row-symmetric if, for any code $\mathcal{C}$ that satisfies $\mathcal{P}$, the code obtained by performing a permutation on the $n$ positions of $\mathcal{C}$ also satisfies $\mathcal{P}$.
Definition 3.2.
Let $\mathcal{P}$ be a nonempty monotone-increasing property over $\mathbb{F}_q^n$. Its threshold is defined by
(12) $R_{\mathcal{P}} = \min\left\{ R : \Pr[\mathcal{C}_{\mathrm{RLC}} \text{ satisfies } \mathcal{P}] \geq \tfrac{1}{2} \right\}$,
where $\mathcal{C}_{\mathrm{RLC}}$ is a random linear code of rate $R$ in $\mathbb{F}_q^n$.
Theorem 3.3.
[MRRZ+20] Let $\mathcal{C}$ be a random linear code of rate $R$, and let $\mathcal{P}$ be a monotone-increasing, $b$-local, and row-symmetric property over $\mathbb{F}_q^n$, where . The following holds for every $\varepsilon > 0$.
1. If $R \leq R_{\mathcal{P}} - \varepsilon$, then
(13)
2. If $R \geq R_{\mathcal{P}} + \varepsilon$, then
(14)
Theorem 3.4.
[GM22] Let $q$ be a prime power, and let $\mathcal{P}$ be a monotone-decreasing, row-symmetric, and $b$-local property over $\mathbb{F}_q^n$, where . Let $\mathcal{D}$ be a linear code. Let $\mathcal{C}$ be a random $n$-puncturing of $\mathcal{D}$ of design rate for some . Suppose that $\mathcal{D}$ is -biased. Then, $\mathcal{C}$ satisfies $\mathcal{P}$ with probability exponentially close to $1$.
This theorem offers a technique for constructing random-like codes from low-biased codes. By combining puncturing and shortening, we can transform a code with weaker initial conditions into a random-like code. Since a monotone-increasing property is the negation of a monotone-decreasing property, we can state our theorem using monotone-decreasing properties and the term “with high probability” in place of an explicit exponentially small failure probability.
Theorem 3.5.
Let , , and be a prime power. For any monotone-decreasing, -local, and row-symmetric property over with , there exists some such that the following holds. If any one of the following conditions is satisfied for :
1. , or
2. and for some ,
then there exist positive such that for any such code , upon performing a random -shortening and then a random -puncturing to , the resulting code has length , rate at least , and, with high probability, satisfies .
Proof.
Define . Given that one of the two conditions holds, we can select an such that a random -shortening of results in an -biased code, by Theorem 1.7 and Theorem 1.8. In this scenario, by the Singleton bound, the rate of the resulting code is less than . Utilizing Theorem 3.4, we designate the design rate as and perform a random -puncturing. The resulting code then has rate and, with high probability, satisfies . Furthermore, as the choices of and are independent of , we can finally define . Consequently, the resulting code has length . ∎
4 Estimation on low-biased codewords
For a random vector $c \in \mathbb{F}_q^n$, it is known from the law of large numbers that its empirical distribution is, with high probability, $\varepsilon$-close to the uniform distribution over $\mathbb{F}_q$ for any $\varepsilon > 0$, as $n$ goes to infinity. Therefore, for each $\varepsilon$, if $\mathcal{C}$ is a random code, then $D_\varepsilon(\mathcal{C})$ will, with high probability, constitute only a small fraction of $\mathcal{C}$. In the following, we present several methods for estimating the size of $D_\varepsilon(\mathcal{C})$ under general conditions.
We first give a fact about the field trace map $\mathrm{tr}: \mathbb{F}_q \to \mathbb{F}_p$. The trace map is defined by $\mathrm{tr}(x) = \sum_{i=0}^{r-1} x^{p^i}$, where $q = p^r$. In fact, this map is both $\mathbb{F}_p$-linear and surjective. Consequently, for any $\alpha \in \mathbb{F}_q^*$, if $x$ is selected uniformly at random from $\mathbb{F}_q$, then $\mathrm{tr}(\alpha x)$ will also assume a uniformly random value in $\mathbb{F}_p$.
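For a concrete sanity check, the trace of GF(4) = GF(2)[x]/(x² + x + 1) can be tabulated exhaustively. The sketch below (the 2-bit integer encoding of field elements is our own choice) confirms surjectivity and that x ↦ tr(αx) is balanced for every nonzero α:

```python
def gf4_mul(a, b):
    """Multiply GF(4) elements encoded as 2-bit ints (bit i = coefficient of x^i)."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:            # reduce modulo x^2 + x + 1
        r ^= 0b111
    return r

def tr(x):
    """Field trace GF(4) -> GF(2): tr(x) = x + x^2."""
    return x ^ gf4_mul(x, x)

# tr is surjective onto GF(2) ...
assert {tr(x) for x in range(4)} == {0, 1}
# ... and for every nonzero alpha, x -> tr(alpha*x) takes each value q/p = 2
# times, so tr(alpha*x) is uniform over GF(2) when x is uniform over GF(4).
for alpha in (1, 2, 3):
    assert sorted(tr(gf4_mul(alpha, x)) for x in range(4)) == [0, 0, 1, 1]
```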
We commence by presenting our first estimation method. When the distance $d$ is substantial, it is viable to set $\varepsilon$ to a small value while ensuring that the number of codewords in $D_\varepsilon(\mathcal{C})$ is at most polynomial in $n$. To exemplify, consider a binary code with a considerable distance. We argue that, in this instance, only a polynomial number of codewords can have weight noticeably greater than $n/2$: we can create a Hamming ball of suitable radius centered at the all-ones vector, and the Johnson bound guarantees that only a polynomial number of codewords exist within this ball.
Lemma 4.1.
Let be an code. For any , .
Proof.
For each $x \in \mathbb{F}_q$, denote by $v_x$ the vector in $\mathbb{F}_q^n$ where each entry has value $x$. Denote by $r$ the list-decoding radius associated with the Johnson bound. Let
(15) $B = \{ c \in \mathcal{C} : d(c, v_x) \leq r \text{ for some } x \in \mathbb{F}_q \}$.
By the Johnson bound for a linear code,
(16) $|B| \leq \mathrm{poly}(n)$.
If $c \notin B$, then the Hamming distance between $c$ and $v_x$ is greater than $r$ for each $x \in \mathbb{F}_q$, which means
(17) $\mathsf{Emp}(c)(x) = 1 - \frac{d(c, v_x)}{n} \leq 1 - \frac{r}{n}$ for each $x \in \mathbb{F}_q$.
By Lemma 2.12, such a $c$ is $\varepsilon$-biased, and thus
(18) $|D_\varepsilon(\mathcal{C})| \leq |B| \leq \mathrm{poly}(n)$.
∎
Another approach to estimating $|D_\varepsilon(\mathcal{C})|$ is the probabilistic method. It is essential to observe that when the dual code of $\mathcal{C}$ has distance $d^{\perp}$, every $d^{\perp} - 1$ columns of the generator matrix of $\mathcal{C}$ are linearly independent. This observation implies that when examining the distribution of a randomly selected codeword from $\mathcal{C}$, its coordinates exhibit $(d^{\perp} - 1)$-wise independence. Consequently, $|D_\varepsilon(\mathcal{C})|$ can be bounded using a $t$-th moment inequality.
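This standard fact can be checked exhaustively on a toy code: the even-weight code of length 4 has the repetition code as its dual (dual distance 4), so any 3 coordinates of a uniform random codeword should be uniform and independent (a brute-force Python sketch):

```python
from itertools import combinations, product
from collections import Counter

# Even-weight code of length 4: its dual is the repetition code, so the dual
# distance is 4, and any 4 - 1 = 3 coordinates of a uniform random codeword
# should be jointly uniform over GF(2)^3, i.e., 3-wise independent.
code = [c for c in product([0, 1], repeat=4) if sum(c) % 2 == 0]
assert len(code) == 8
for idx in combinations(range(4), 3):
    counts = Counter(tuple(c[i] for i in idx) for c in code)
    # each of the 2^3 patterns appears equally often (here: exactly once)
    assert all(counts[pat] == 1 for pat in product([0, 1], repeat=3))
```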
Lemma 4.2.
Let $X_1, \dots, X_n$ be independent random variables with $\mathbb{E}[X_i] = 0$ and $|X_i| \leq 1$. Denote $X = \sum_{i=1}^{n} X_i$. Then for any even $t$,
(19) $\mathbb{E}[X^t] \leq t\,\Gamma(t/2)\,(2n)^{t/2}$.
Proof.
By Hoeffding's lemma, for any $\lambda \in \mathbb{R}$ and each $i$,
(20) $\mathbb{E}[e^{\lambda X_i}] \leq e^{\lambda^2 / 2}$.
Then by the Chernoff bound, for any $a > 0$ (taking $\lambda = a/n$),
(21) $\Pr[|X| \geq a] \leq 2 e^{\lambda^2 n / 2 - \lambda a} = 2 e^{-a^2 / (2n)}$.
Then by this sub-Gaussian tail, for any even $t$,
(22) $\mathbb{E}[X^t] = t \int_0^{\infty} a^{t-1} \Pr[|X| \geq a]\, da \leq 2t \int_0^{\infty} a^{t-1} e^{-a^2/(2n)}\, da = t\,\Gamma(t/2)\,(2n)^{t/2}$.
∎
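As a numerical sanity check of a moment bound of this form (we take $\mathbb{E}[X^t] \leq t\,\Gamma(t/2)\,(2n)^{t/2}$ as the concrete instance), one can compare it against exact moments of Rademacher sums (iid ±1 variables), computed from the binomial distribution:

```python
from math import comb, gamma

def rademacher_moment(n, t):
    """Exact E[X^t] for X = sum of n iid uniform +-1 variables: if k of them
    are -1 (probability comb(n, k) / 2^n), then X = n - 2k."""
    return sum(comb(n, k) * (n - 2 * k) ** t for k in range(n + 1)) / 2 ** n

# check E[X^t] <= t * Gamma(t/2) * (2n)^(t/2) over a range of n and even t
for n in (10, 50, 200):
    for t in (2, 4, 6, 8):
        assert rademacher_moment(n, t) <= t * gamma(t / 2) * (2 * n) ** (t / 2)
```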
Corollary 4.3.
Let $X_1, \dots, X_n$ be random variables taking values in $[-1, 1]$ with $\mathbb{E}[X_i] = 0$, which are $t$-wise independent for some even $t$. Let $X = \sum_{i=1}^{n} X_i$. Then for any $a > 0$,
(23) $\Pr[|X| \geq a] \leq \frac{t\,\Gamma(t/2)\,(2n)^{t/2}}{a^t}$.
Proof.
If $X_1, \dots, X_n$ are $t$-wise independent, then the $t$-th moment of $X$ is the same as in the case where these variables are fully independent, since expanding $X^t$ yields only monomials involving at most $t$ of the $X_i$. Therefore, by Markov's inequality and Lemma 4.2,
(24) $\Pr[|X| \geq a] = \Pr[X^t \geq a^t] \leq \frac{\mathbb{E}[X^t]}{a^t} \leq \frac{t\,\Gamma(t/2)\,(2n)^{t/2}}{a^t}$.
∎
Lemma 4.4.
Let $c = (c_1, \dots, c_n)$ be a random vector whose components uniformly take values in $\mathbb{F}_q$ and are $t$-wise independent, for some even $t$. Let $\alpha \in \mathbb{F}_q^*$. Then
(25) $\Pr\left[\left|\sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)}\right| \geq \varepsilon n\right] \leq \frac{2^{t+1}\, t\,\Gamma(t/2)\,(2n)^{t/2}}{(\varepsilon n)^t}$.
Proof.
Let $\mathrm{Re}$ and $\mathrm{Im}$ denote the real part and imaginary part of a complex number, respectively. Since for each $j$, $\mathrm{tr}(\alpha c_j)$ is uniform over $\mathbb{F}_p$ and hence $\mathbb{E}[\omega^{\mathrm{tr}(\alpha c_j)}] = 0$, the variables $\mathrm{Re}(\omega^{\mathrm{tr}(\alpha c_j)})$ and $\mathrm{Im}(\omega^{\mathrm{tr}(\alpha c_j)})$ are real-valued discrete random variables with mean $0$, $t$-wise independent, and taking values in $[-1, 1]$. If $\left|\sum_j \omega^{\mathrm{tr}(\alpha c_j)}\right| \geq \varepsilon n$, then the real or the imaginary part of this sum has absolute value at least $\varepsilon n / 2$. Hence, by Corollary 4.3 and a union bound,
(26) $\Pr\left[\left|\sum_{j=1}^{n} \omega^{\mathrm{tr}(\alpha c_j)}\right| \geq \varepsilon n\right] \leq 2 \cdot \frac{t\,\Gamma(t/2)\,(2n)^{t/2}}{(\varepsilon n / 2)^t} = \frac{2^{t+1}\, t\,\Gamma(t/2)\,(2n)^{t/2}}{(\varepsilon n)^t}$.
∎
Corollary 4.5.
Let $\mathcal{C}$ be a code of length $n$, rate $R$, and dual distance $d^{\perp}$ over the field $\mathbb{F}_q$. Then for each $\varepsilon$, the number of codewords which are not $\varepsilon$-biased is not more than
$2(q-1)\, q^{Rn} \left(\frac{16\, d^{\perp}}{\varepsilon^2 n}\right)^{t/2}$, where $t$ is the largest even integer with $t \leq d^{\perp} - 1$,
for sufficiently large $n$.
Proof.
If $\mathcal{C}$ has dual distance $d^{\perp}$, then every $d^{\perp} - 1$ columns of the generator matrix of $\mathcal{C}$ are linearly independent, which means that if we uniformly randomly choose a codeword from $\mathcal{C}$, its coordinates are uniform and $(d^{\perp} - 1)$-wise independent. Therefore, taking $t$ to be the largest even integer with $t \leq d^{\perp} - 1$ and applying Lemma 4.4 with a union bound over $\alpha \in \mathbb{F}_q^*$, the number of such codewords is less than
(27) $q^{Rn} \cdot (q-1) \cdot \frac{2^{t+1}\, t\,\Gamma(t/2)\,(2n)^{t/2}}{(\varepsilon n)^t}$,
which is less than
$2(q-1)\, q^{Rn} \left(\frac{16\, d^{\perp}}{\varepsilon^2 n}\right)^{t/2}$
for sufficiently large $n$. ∎
5 Proof of Theorem 1.6
Before proving Theorem 1.6, we first give the following theorem.
Theorem 5.1.
Let be an code. If we perform a random -shortening to , where , then with high probability, the shortened code is -biased, where .
Proof.
To get Theorem 1.6, we need to select an appropriate value that satisfies the following conditions:
1. The chosen value does not cause a significant decrease in the rate.
2. It satisfies the requirements defined by Theorem 5.1.
Proof of Theorem 1.6.
Proof of Theorem 1.7.
Since , we can first choose an such that . We then perform an -shortening to . By Proposition 2.8 and Proposition 2.7, the shortened code has distance at least and rate at least . Then, using Theorem 1.6, we can select a sufficiently small as defined in Theorem 1.6 such that , which ultimately yields the desired result. ∎
6 Proof of Theorem 1.8
We first present an inequality concerning the $q$-ary entropy function.
Lemma 6.1.
For any , , when , .
Proof.
When ,
(32) |
for . When ,
(33) |
for , and thus
(34) |
∎
Proof of Theorem 1.8.
We set and get . Let . We get
(35) | ||||||
Therefore, the probability in Equation 35 tends to $0$ as $n$ approaches infinity.
Moreover, one can verify that given and . Hence, the shortened code has rate at least , and
(36) | ||||
By Lemma 2.14, is -biased and since . So, with high probability, is -biased. ∎
References
- [AGL23] Omar Alrabiah, Venkatesan Guruswami, and Ray Li. Randomly punctured Reed-Solomon codes achieve list-decoding capacity over linear-sized fields. arXiv preprint arXiv:2304.09445, 2023.
- [BGL17] Valerio Bioglio, Frederic Gabry, and Ingmar Land. Low-complexity puncturing and shortening of polar codes. In 2017 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), pages 1–6. IEEE, 2017.
- [BGM22] Joshua Brakensiek, Sivakanth Gopi, and Visu Makam. Generic Reed-Solomon codes achieve list-decoding capacity. arXiv preprint arXiv:2206.05256, 2022.
- [CCLO23] Xue Chen, Kuan Cheng, Xin Li, and Minghui Ouyang. Improved decoding of expander codes. IEEE Transactions on Information Theory, 2023.
- [FKS22] Asaf Ferber, Matthew Kwan, and Lisa Sauermann. List-decodability with large radius for Reed-Solomon codes. IEEE Transactions on Information Theory, 68(6):3823–3828, 2022.
- [Gal62] Robert Gallager. Low-density parity-check codes. IRE Transactions on Information Theory, 8(1):21–28, 1962.
- [GLS+22] Zeyu Guo, Ray Li, Chong Shangguan, Itzhak Tamo, and Mary Wootters. Improved list-decodability and list-recoverability of Reed-Solomon codes via tree packings. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 708–719. IEEE, 2022.
- [GM22] Venkatesan Guruswami and Jonathan Mosheiff. Punctured low-bias codes behave like random linear codes. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 36–45. IEEE, 2022.
- [GMR+20] Venkatesan Guruswami, Jonathan Mosheiff, Nicolas Resch, Shashwat Silas, and Mary Wootters. Sharp threshold rates for random codes. arXiv preprint arXiv:2009.04553, 2020.
- [GST21] Eitan Goldberg, Chong Shangguan, and Itzhak Tamo. List-decoding and list-recovery of Reed-Solomon codes beyond the Johnson radius for any rate. arXiv preprint arXiv:2105.14754, 2021.
- [GZ23] Zeyu Guo and Zihan Zhang. Randomly punctured Reed-Solomon codes achieve the list decoding capacity over polynomial-size alphabets. arXiv preprint arXiv:2304.01403, 2023.
- [LDT21] Yang Liu, Cunsheng Ding, and Chunming Tang. Shortened linear codes over finite fields. IEEE Transactions on Information Theory, 67(8):5119–5132, 2021.
- [MRRZ+20] Jonathan Mosheiff, Nicolas Resch, Noga Ron-Zewi, Shashwat Silas, and Mary Wootters. LDPC codes achieve list decoding capacity. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 458–469. IEEE, 2020.
- [NvZ15] Peter Nelson and Stefan HM van Zwam. On the existence of asymptotically good linear codes in minor-closed classes. IEEE Transactions on Information Theory, 61(3):1153–1158, 2015.
- [PP23] Aaron (Louie) Putterman and Edward Pyne. Pseudorandom linear codes are list decodable to capacity. arXiv preprint arXiv:2303.17554, 2023.
- [Reg09] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. J. ACM, 56(6):34:1–34:40, 2009.
- [RW14] Atri Rudra and Mary Wootters. Every list-decodable code for high noise has abundant near-optimal rate puncturings. In Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, 2014.
- [Shp09] Amir Shpilka. Constructions of low-degree and error-correcting $\varepsilon$-biased generators. Comput. Complex., 18(4):495–525, 2009.
- [SS96] Michael Sipser and Daniel A. Spielman. Expander codes. IEEE Transactions on Information Theory, 42(6):1710–1722, 1996.
- [ST20] Chong Shangguan and Itzhak Tamo. Combinatorial list-decoding of Reed-Solomon codes beyond the Johnson radius. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 538–551, 2020.
- [Woo13] Mary Wootters. On the list decodability of random linear codes with large error rates. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, pages 853–860, 2013.
- [YP17] Arti Yardi and Ruud Pellikaan. On shortened and punctured cyclic codes. arXiv preprint arXiv:1705.09859, 2017.