
Random Shortening of Linear Codes and Applications

Xue Chen School of Computer Science, University of Science and Technology of China, Email: [email protected]    Kuan Cheng Center on Frontiers of Computing Studies, Peking University, Email: [email protected]    Xin Li Department of Computer Science, Johns Hopkins University, Email: [email protected]. Supported by NSF CAREER Award CCF-1845349 and NSF Award CCF-2127575.    Songtao Mao Department of Computer Science, Johns Hopkins University, Email: [email protected]. Supported by NSF Award CCF-2127575.
Abstract

Random linear codes (RLCs) are well known to have nice combinatorial properties and near-optimal parameters in many different settings. However, getting explicit constructions matching the parameters of RLCs is challenging, and RLCs are hard to decode efficiently. This motivated several previous works to study the problem of partially derandomizing RLCs, by applying certain operations to an explicit mother code. Among them, one of the most well studied operations is random puncturing, where a series of works culminated in the work of Guruswami and Mosheiff (FOCS’ 22), which showed that a random puncturing of a low-biased code is likely to possess almost all interesting local properties of RLCs.

In this work, we provide an in-depth study of another, dual operation of random puncturing, known as random shortening, which can be viewed equivalently as random puncturing on the dual code. Our main results show that for any small $\varepsilon$, by starting from a mother code with certain weaker conditions (e.g., having a large distance) and performing a random (or even pseudorandom) shortening, the new code is $\varepsilon$-biased with high probability. Our results hold for any field size and yield a shortened code with constant rate. This can be viewed as a complement to random puncturing, and together, we can obtain codes with properties like RLCs from weaker initial conditions.

Our proofs involve several non-trivial methods of estimating the weight distribution of codewords, which may be of independent interest.

1 Introduction

Error correcting codes are fundamental objects in combinatorics and computer science. The study of these objects, together with the bounds and parameters that can be achieved, has also helped shape the field of information theory starting from the pioneering work of Shannon and Hamming. In the theory of error-correcting codes, linear codes form a fundamental class of codes that are building blocks of many important constructions and applications. Such codes have simple algebraic structures that are often key ingredients in their performance and analysis. For example, any linear code with message length $k$ and codeword length $n$ over the field $\mathbb{F}_q$ can be described by both a generator matrix in $\mathbb{F}_q^{k\times n}$ and a parity check matrix in $\mathbb{F}_q^{n\times(n-k)}$.

It is well known that random linear codes (RLCs, where one samples each entry of the generator matrix uniformly and independently from $\mathbb{F}_q$) enjoy nice combinatorial properties and have near-optimal parameters in many different settings. Specifically, with high probability they achieve Shannon capacity, the Gilbert-Varshamov (GV) bound of rate-distance tradeoff, and are list-decodable up to capacity. However, getting explicit constructions remains a challenging problem in many situations. In addition, random linear codes have little structure, which makes it difficult to design efficient decoding algorithms. Indeed, decoding random linear codes is closely related to the problems of learning parity with noise and learning with errors, whose hardness is the basis of many cryptographic applications (see e.g., [Reg09]). As such, many previous works studied the problem of slightly derandomizing, or equivalently reducing the randomness used in RLCs, while still maintaining their nice properties.

Among these works, random puncturing is one of the most well-studied operations. Here, one takes an explicit mother code, and then randomly punctures some coordinates from the code (or equivalently, punctures some columns from the generator matrix) to get a new, shorter code. Specifically, a $\mathcal{P}$-puncturing of a mother code $\mathcal{C}\subseteq\mathbb{F}_q^n$ randomly chooses a subset $\mathcal{P}\subseteq[n]$ of size $p$, and for every codeword of $\mathcal{C}$, deletes all symbols with positions in $\mathcal{P}$. Compared to standard RLCs, the number of random bits used is thus reduced from $O(nk\log q)$ to $O(n)$. Furthermore, certain nice structures of the mother code are often inherited by the punctured code, which makes decoding easier.

With sophisticated techniques, previous works have shown that if the mother code satisfies some natural conditions, then after a random puncturing, with high probability the new code has certain properties similar to those of RLCs. For example, motivated by the problem of achieving list-decoding capacity, recent works [Woo13, RW14, FKS22, GST21, BGM22, GZ23, AGL23] studied random puncturing of Reed-Muller (RM) codes and Reed-Solomon (RS) codes. Subsequent works [GM22, PP23] generalized the list-decoding property to all monotone-decreasing local properties. In all these works, the mother code needs to have some special properties, such as being an RS code, an RM code, having a large distance over a large alphabet, or having a low bias over a small alphabet. These properties are not immediately implied by general linear codes, and thus, one of the natural goals is to gradually weaken the requirements of the mother code so that the approach works for a broader class of codes. Indeed, as we shall see later, this is one of the main motivations and themes in previous works.

In this paper we continue this line of work and study the following two natural questions:

  1.

    If the mother code is not that strong, can we still use some operations to get a new code that has properties similar to random linear codes?

  2.

    What other operations, besides random puncturing, are useful in this context?

Towards answering these questions, we consider a different operation to reduce the randomness of RLCs, called random shortening, previously studied in [BGL17, LDT21, YP17, NvZ15]. Specifically, for an integer $s$, a random $s$-shortening of a code $\mathcal{C}\subseteq\mathbb{F}_q^n$ randomly chooses a subset $\mathcal{S}\subseteq[n]$ of size $s$, and forms a new code by picking all codewords of $\mathcal{C}$ which are zero at the positions in $\mathcal{S}$, and deleting these zero symbols.

We note that just like random puncturing, the operation of random shortening can in fact be carried out on any code, not just on linear codes. However, for linear codes there is an important, alternative view of random shortening: it is actually the dual version of random puncturing. In particular, one can check that it is equivalent to a random puncturing of size $s$ on the parity check matrix of a linear code $\mathcal{C}$, or the generator matrix of the dual code $\mathcal{C}^{\perp}$. Thus in this paper, for a linear code, we also call shortening dual puncturing.
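This duality can be verified by brute force on a small example. The sketch below is illustrative and not from the paper: the $[6,3]$ binary code and the helper functions are our choices. It checks that shortening $\mathcal{C}$ on $\mathcal{S}$ gives exactly the dual of the $\mathcal{S}$-puncturing of $\mathcal{C}^{\perp}$.

```python
from itertools import product

def span(rows, n):
    """All GF(2) linear combinations of the given generator rows."""
    return {tuple(sum(a * r[i] for a, r in zip(m, rows)) % 2 for i in range(n))
            for m in product([0, 1], repeat=len(rows))}

def dual(code, n):
    """Brute-force dual: all length-n vectors orthogonal to every codeword."""
    return {v for v in product([0, 1], repeat=n)
            if all(sum(x * y for x, y in zip(v, c)) % 2 == 0 for c in code)}

def puncture(code, pos):
    """Delete the coordinates indexed by pos from every codeword."""
    return {tuple(x for i, x in enumerate(c) if i not in pos) for c in code}

def shorten(code, pos):
    """Keep codewords that are zero on pos, then delete those coordinates."""
    return {tuple(x for i, x in enumerate(c) if i not in pos)
            for c in code if all(c[i] == 0 for i in pos)}

G = [(1, 0, 0, 1, 1, 0), (0, 1, 0, 1, 0, 1), (0, 0, 1, 0, 1, 1)]
C = span(G, 6)  # a [6,3] binary code
S = {0, 5}
# Shortening C on S coincides with the dual of (C^perp punctured on S).
assert shorten(C, S) == dual(puncture(dual(C, 6), S), 6 - len(S))
```

The identity checked here holds for every linear code and every coordinate set, since a vector lies in the shortened dual exactly when its zero-extension is orthogonal to all of $\mathcal{C}$.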

This view brings some convenience from the viewpoint of the parity check matrix. For example, any puncturing of the parity check matrix (hence also any shortening) of a low-density parity check (LDPC) code [Gal62] still results in an LDPC code. Another example is expander codes [SS96]. A binary expander code $\mathcal{C}$ is based on a bipartite expander graph $\Gamma:[N]\times[D]\rightarrow[M]$ with $N$ nodes on the left, $M$ nodes on the right, and left degree $D$. The parity check matrix of $\mathcal{C}$ is defined as follows. Each left node corresponds to a codeword bit and each right node corresponds to a parity check which checks if the parity of its neighboring codeword bits is $0$. Such a code has linear time decoding algorithms, and the distance of $\mathcal{C}$ can be lower bounded by using the vertex expansion property of $\Gamma$. Specifically, assume that for every left set $A\subseteq[N]$ with $|A|\leq\alpha N$, the neighborhood of $A$, denoted by $\Gamma(A)$, has size at least $(1-\varepsilon)D|A|$; then [CCLO23] showed that the distance of $\mathcal{C}$ is at least roughly $\frac{\alpha N}{2\varepsilon}$. Notice that an $\mathcal{S}$-shortening of $\mathcal{C}$ corresponds to deleting the nodes in $\mathcal{S}$ from the left set $[N]$ together with their adjacent edges, which does not change the vertex expansion property of the remaining graph. Hence the new code still has a distance of at least roughly $\frac{\alpha N}{2\varepsilon}$, which in fact corresponds to a larger relative distance (since the new code has a shorter length). As we will see shortly, this is actually a general property of any shortening of a code. In summary, just like puncturing, the shortening operation also preserves certain nice properties of the mother code, e.g., being an LDPC code or an expander code. In turn, this makes decoding easier.

Before stating our results, we first review some previous works on random puncturing and random shortening in more detail.

1.1 Previous Work

Recently, random puncturing has drawn a lot of attention in the context of list decoding. In [Woo13], Wootters showed that by applying a random puncturing to a Reed-Muller code and setting the desired rate to $O(\varepsilon^2)$, with high probability one can list-decode the punctured code up to a relative radius of $1/2-\varepsilon$, with an exponential but non-trivial list size. In [RW14], Rudra and Wootters showed that if the mother code is an RS code with a large enough relative distance of $1-1/q-\varepsilon^2$, then after puncturing one can get a list-decoding radius of $1-1/q-\varepsilon$ and a rate close to capacity up to a $\mathsf{poly}\log(1/\varepsilon)$ factor, while the list size is $O(1/\varepsilon)$. We remark that a rate upper bound for list-decodable linear codes is given by Shangguan and Tamo [ST20], as a generalized singleton bound. Specifically, they proved that if $\mathcal{C}$ is a linear code of rate $R$ that is $(\rho,L)$ list decodable, i.e., the code has relative list decoding radius $\rho$ and list size $L$, then $\rho\leq(1-R)\frac{L}{L+1}$. They conjectured the existence of such codes and proved the cases $L=2,3$. Later, towards proving this conjecture, Guo et al. [GLS+22] showed that there are RS codes that are $(1-\varepsilon,O(1/\varepsilon))$ list decodable with rate $\Omega(\varepsilon/\log(1/\varepsilon))$, though they mainly use intersection matrices instead of random puncturing. Ferber, Kwan, and Sauermann [FKS22] further showed that through random puncturing one can achieve a rate of $\varepsilon/15$ with list decoding radius $1-\varepsilon$ and list size $O(1/\varepsilon)$. This was further improved by Goldberg et al. [GST21] to achieve a rate of $\frac{\varepsilon}{2-\varepsilon}$.
Most recently, [BGM22] showed that random puncturing of RS codes can go all the way up to the generalized singleton bound if the field size is $2^{O(n)}$, resolving a main conjecture of [ST20]. This was subsequently improved by [GZ23], which reduced the field size to $O(n^2)$; and again by [AGL23], which further reduced the field size to $O(n)$, although [GZ23, AGL23] can only get close to the generalized singleton bound. We note that all the above works mainly studied RS codes or RM codes, which have strong algebraic structures, and some of them also require a large relative distance (e.g., close to $1-1/q$).

On the other hand, Guruswami and Mosheiff [GM22] considered random puncturing of more general codes with weaker properties. Specifically, they considered two cases, where the mother code either has a low bias or has a large distance over a large alphabet (note that the property of a low bias implies a large distance, hence is stronger). For both cases, they showed that the punctured code can achieve list decoding close to capacity. In fact, they showed a stronger result, that all monotone-decreasing local properties of the punctured code are similar to those of random linear codes. Subsequent to [GM22], Putterman and Pyne [PP23] showed that the same results in [GM22] can be achieved by using a pseudorandom puncturing instead, which reduces the number of random bits used in the puncturing to be linear in the block length of the punctured code, even if the mother code has a much larger length.

Unlike puncturing, there are only a handful of previous works on shortening. In [NvZ15], Nelson and Van Zwam proved that all linear codes can be obtained by a sequence of puncturings and/or shortenings of a collection of asymptotically good codes. In [YP17], Yardi and Pellikaan showed that all linear codes can be obtained by a sequence of puncturings and/or shortenings of some specific cyclic code. In [BGL17], Bioglio et al. presented a low-complexity construction of polar codes with arbitrary length and rate using shortening and puncturing. In [LDT21], Liu et al. provided some general properties of shortened linear codes.

1.2 Notation and Definitions

Definition 1.1.

A linear code $\mathcal{C}$ of length $n$ and dimension $k$ over a finite field $\mathbb{F}_q$ is a $k$-dimensional subspace of the $n$-dimensional vector space $\mathbb{F}_q^n$. The rate of $\mathcal{C}$ is the ratio of its dimension to its length: $R(\mathcal{C})=\frac{k}{n}$. The distance (or minimum distance) of $\mathcal{C}$ is the minimum Hamming distance between any two distinct codewords in the code: $d(\mathcal{C})=\min_{c_1,c_2\in\mathcal{C},c_1\neq c_2}d(c_1,c_2)$. The relative distance of $\mathcal{C}$ is the ratio of its distance to its length: $\delta(\mathcal{C})=\frac{d(\mathcal{C})}{n}$.

The dual code $\mathcal{C}^{\perp}$ of a linear code $\mathcal{C}$ is the dual linear subspace of $\mathcal{C}$; hence the rates of $\mathcal{C}$ and $\mathcal{C}^{\perp}$ sum to $1$. The dual distance $d^{\perp}(\mathcal{C})$ of $\mathcal{C}$ is the minimum distance of its dual code $\mathcal{C}^{\perp}$. The relative dual distance of $\mathcal{C}$ is the ratio of its dual distance to its length: $\delta^{\perp}(\mathcal{C})=\frac{d^{\perp}(\mathcal{C})}{n}$.

We denote a linear code with these properties as an $[n,k,d]_q$ code or an $[n,k,d,d^{\perp}]_q$ code. Moreover, a linear code can be described by a $k\times n$ generator matrix $G$, where each codeword in $\mathcal{C}$ is a linear combination of the rows of $G$. The parity check matrix of $\mathcal{C}$ is an $n\times(n-k)$ matrix $H$ satisfying the property that for any codeword $c\in\mathcal{C}$, $cH=0$. So $H^{\top}$ is the generator matrix of $\mathcal{C}^{\perp}$.

Definition 1.2.

Let $\mathcal{P}$ be a subset of $[n]$ of size $p$. A $\mathcal{P}$-puncturing on a code $\mathcal{C}$ of length $n$ removes the $p$ positions indexed by $\mathcal{P}$. The resulting punctured code $\mathcal{C}^{(\mathcal{P})}$ has length $n-p$. If $\mathcal{P}$ is a uniformly random subset of size $p$, we say that $\mathcal{C}^{(\mathcal{P})}$ is obtained from $\mathcal{C}$ by a random $p$-puncturing.

Definition 1.3.

Let $\mathcal{S}$ be a subset of $[n]$ of size $s$. An $\mathcal{S}$-shortening on a code $\mathcal{C}$ of length $n$ selects all codewords with zero values at the positions indexed by $\mathcal{S}$ and removes these positions. The resulting shortened code $\mathcal{C}^{[\mathcal{S}]}$ has length $n-s$. If $\mathcal{S}$ is a uniformly random subset of size $s$, we say that $\mathcal{C}^{[\mathcal{S}]}$ is obtained from $\mathcal{C}$ by a random $s$-shortening.

Definition 1.4.

The $q$-ary entropy function is defined as $\mathrm{H}_q(x)=x\log_q(q-1)-x\log_q x-(1-x)\log_q(1-x)$.
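Definition 1.4 transcribes directly to code. The sketch below is illustrative (the function name is ours), and uses the usual convention $0\log 0=0$:

```python
import math

def q_ary_entropy(x: float, q: int) -> float:
    """H_q(x) = x*log_q(q-1) - x*log_q(x) - (1-x)*log_q(1-x),
    with the convention 0*log(0) = 0."""
    def xlog(t: float) -> float:
        # t * log_q(t), extended continuously to t = 0
        return t * math.log(t, q) if t > 0 else 0.0
    return x * math.log(q - 1, q) - xlog(x) - xlog(1 - x)
```

For $q=2$ the first term vanishes and this is the usual binary entropy; $\mathrm{H}_q$ attains its maximum value $1$ at $x=\frac{q-1}{q}$.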

Throughout the paper, we use “with high probability” to mean that when the rate $R$, relative distance $\delta$, relative dual distance $\delta^{\perp}$ of the code, and other given parameters are fixed, the probability of the event is $1-O(\exp(-tn))$ for some constant $t$. Essentially, this means that the probability of the event occurring approaches $1$ as the block length $n$ increases, making it increasingly likely that the desired properties hold.

As in [GM22], in this paper we also consider monotone-decreasing, local properties. Informally, we call a code property $\mathscr{P}$ monotone-decreasing and local if the fact that a code $\mathcal{C}$ does not satisfy $\mathscr{P}$ can be witnessed by a small “bad set” of codewords in $\mathcal{C}$. For example, some typical properties, such as being list-decodable to capacity and having a small bias, are monotone-decreasing and local properties. More formally, a monotone-decreasing and local property is the complement of a monotone-increasing and local property, defined below.

Definition 1.5.

A property $\mathscr{P}$ is said to be

  • monotone-increasing if, for any code $\mathcal{C}$, whenever one of its subcodes (i.e., a subspace of $\mathcal{C}$) satisfies $\mathscr{P}$, the code $\mathcal{C}$ itself also satisfies $\mathscr{P}$ (monotone-decreasing if the complement of $\mathscr{P}$ is monotone-increasing);

  • $b$-local for some $b\in\mathbb{N}$ if there exists a family $\mathcal{B}_{\mathscr{P}}$ of sets of words in $\mathbb{F}_q^n$, each set of size at most $b$, such that $\mathcal{C}$ satisfies $\mathscr{P}$ if and only if there exists a set $B\in\mathcal{B}_{\mathscr{P}}$ satisfying $B\subseteq\mathcal{C}$;

  • row-symmetric if, for any code $\mathcal{C}\subseteq\mathbb{F}_q^n$ that satisfies $\mathscr{P}$, the code obtained by performing a permutation on the $n$ positions of $\mathcal{C}$ also satisfies $\mathscr{P}$.

1.3 Main Results

Random puncturing vs. random shortening.

Before formally stating our results, we first informally compare the two operations of random puncturing and random shortening. A random $p$-puncturing of a code of length $n$ selects $p$ positions uniformly at random from $[n]$ and discards these positions in the code. One can see that under appropriate conditions, this operation preserves the distinctness of all codewords, and thus can increase the rate of the code. However, it may decrease the distance or relative distance of the code. In contrast, a random $s$-shortening of a code picks $s$ positions uniformly at random from $[n]$, forms the subcode consisting of all codewords that are zero at these positions, and then deletes these positions from the subcode. This operation preserves the distance of the code, and thus increases the relative distance, but on the other hand the rate of the code can potentially decrease. Hence, these two operations are indeed “dual” in some sense, and therefore one can apply both operations to adjust both the rate and the relative distance of the code.
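These complementary effects can be seen concretely on a small example. The following sketch is illustrative, not from the paper: the $[7,4,3]$ binary Hamming code and the helper names are our choices. Puncturing one coordinate keeps all codewords (raising the rate) but can lower the distance; shortening one coordinate keeps the distance (raising the relative distance) but halves the number of codewords.

```python
from itertools import product

# Generator matrix of the [7,4,3] binary Hamming code (standard form).
G = [(1, 0, 0, 0, 0, 1, 1),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1, 0),
     (0, 0, 0, 1, 1, 1, 1)]

def span(rows, n):
    """All GF(2) linear combinations of the generator rows."""
    return {tuple(sum(a * r[i] for a, r in zip(m, rows)) % 2 for i in range(n))
            for m in product([0, 1], repeat=len(rows))}

def puncture(code, pos):
    return {tuple(x for i, x in enumerate(c) if i not in pos) for c in code}

def shorten(code, pos):
    return {tuple(x for i, x in enumerate(c) if i not in pos)
            for c in code if all(c[i] == 0 for i in pos)}

def min_dist(code):
    """Minimum weight of a nonzero codeword (= distance, for linear codes)."""
    return min(sum(c) for c in code if any(c))

C = span(G, 7)
Cp = puncture(C, {0})  # length 6, still 16 codewords: rate 4/7 -> 4/6
Cs = shorten(C, {0})   # length 6, distance still 3: relative distance 3/7 -> 1/2
```

Here puncturing keeps all $16$ codewords distinct (since the distance $3$ exceeds the single deleted coordinate) while the distance drops to $2$; shortening keeps distance $3$ at length $6$ while the dimension drops by one.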

A linear code $\mathcal{C}\subseteq\mathbb{F}_q^n$, where $q=p^r$ for some prime $p$, is called $\varepsilon$-biased if for every nonzero codeword $c\in\mathcal{C}$ and every $a\in\mathbb{F}_q^{\ast}$, $\left|\sum_{i=1}^n\omega^{\mathrm{tr}(a\cdot c_i)}\right|\leq\varepsilon n$, where $\omega=e^{\frac{2\pi i}{p}}$ and $\mathrm{tr}:\mathbb{F}_q\rightarrow\mathbb{F}_p$ is the field trace map. In Section 2.2, we provide a more detailed explanation of $\varepsilon$-biased codes.
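For prime $q$ the trace map is the identity, so the bias of a single codeword can be computed directly. The sketch below is our illustration of the definition under that assumption (the function name and the restriction to prime fields are ours):

```python
import cmath

def bias(c, q):
    """Bias of a nonzero codeword c over F_q for prime q: the maximum over
    a in F_q^* of |sum_i omega^(a*c_i)| / n, where omega = e^(2*pi*i/q).
    (For a prime field the trace map tr: F_q -> F_q is the identity.)"""
    n = len(c)
    omega = cmath.exp(2j * cmath.pi / q)
    return max(abs(sum(omega ** ((a * ci) % q) for ci in c))
               for a in range(1, q)) / n
```

For $q=2$ this reduces to $|n-2\,\mathrm{wt}(c)|/n$, so a perfectly balanced binary codeword has bias $0$.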

Our main results show that random shortening is an effective way to reduce the bias of a code. Note that this is stronger than increasing the relative distance, since the former implies the latter (see Proposition 2.10). If the mother code satisfies certain conditions, then we show that after random shortening the new code can achieve an arbitrarily small bias with high probability. We note that a random linear code has a small bias, and thus in this sense the code after random shortening behaves like a random linear code. Moreover, the condition that the mother code has a low bias is required in several previous works (e.g., [GM22, PP23]), while these works essentially do not care about the rate of the mother code. Thus we can apply a random puncturing to the new code after a random shortening, to get another code in which all monotone-decreasing local properties are similar to those of random linear codes. This further weakens the requirements on the mother codes in previous works to some extent.

Low-biased codes from codes with large distance.

A low-biased code must have a large distance, as stated in Proposition 2.10. However, the reverse may not hold. The following theorem shows that it is also possible to derive a low-biased code from a code with a large distance by random shortening.

Theorem 1.6.

For any $0<\varepsilon<1$, any $[n,Rn,\delta n]_q$ code $\mathcal{C}$ with $\frac{q-1}{q}-\frac{q}{q-1}\left(\frac{\varepsilon}{2(q-1)}\right)^2<\delta<\frac{q-1}{q}$, and any constant $0<\gamma<R$, there exists a number $0<s<R$ such that the following holds. If we perform a random $sn$-shortening $\mathcal{S}$ on $\mathcal{C}$, then with high probability, the shortened code $\mathcal{C}^{[\mathcal{S}]}$ is $\varepsilon$-biased and has rate at least $R-\gamma$.

We note that the theorem only requires a lower bound on the relative distance, but there are no restrictions on the rate $R$ of the original code. Hence, this requirement is generally easy to satisfy, for example, from simple constructions using code concatenation. Furthermore, we can select an appropriate shortening size to ensure that the rate of the shortened code is arbitrarily close to $R$.

The distance condition on $\mathcal{C}$ in Theorem 1.6 can also be relaxed, resulting in the following theorem.

Theorem 1.7.

Given any $0<\varepsilon<1$, if an $[n,Rn,\delta n]_q$ code $\mathcal{C}$ satisfies the condition that there exists some $0<\beta<1$ such that $\frac{\delta}{1-(1-\beta)R}>\frac{q-1}{q}-\frac{q}{q-1}\left(\frac{\varepsilon}{2(q-1)}\right)^2$, then there exists a number $0<s<R$ such that the following holds. If we perform a random $sn$-shortening $\mathcal{S}$ on $\mathcal{C}$, then with high probability, the shortened code $\mathcal{C}^{[\mathcal{S}]}$ is $\varepsilon$-biased with rate at least $\beta R$.

Indeed, the asymptotic form of the Plotkin bound is given by

$$R\leq 1-\left(\frac{q}{q-1}\right)\cdot\delta+o(1). \qquad (1)$$

Thus Theorem 1.7 implies that as long as the rate-distance trade-off of the original code is close enough to the Plotkin bound, we can obtain a code with an arbitrarily small bias by random shortening. On the other hand, unlike in Theorem 1.6, the rate of the shortened code may not be arbitrarily close to $R$, but we can still get a new rate that is only a constant factor smaller.

Remark 1.

Just as in [GM22], in both our theorems the condition that $\mathcal{C}$ is a linear code is crucial for ensuring that the shortened code maintains a constant rate with high probability. When $\mathcal{C}$ is non-linear, there are certain counterexamples. For instance, let $\mathcal{C}^{\prime}$ be an $[n,Rn,\delta n]_q$ code and let $\mathcal{C}$ be obtained from $\mathcal{C}^{\prime}$ by appending $\tau n$ $1$'s at the end of each codeword in $\mathcal{C}^{\prime}$. By picking an arbitrarily small constant $\tau>0$, the rate and relative distance of $\mathcal{C}$ are almost the same as those of $\mathcal{C}^{\prime}$. However, after a random $sn$-shortening, $\mathcal{C}^{[\mathcal{S}]}$ is the zero code with probability $1-2^{-\Omega(n)}$.

Low-biased codes from codes with small rate and not too small dual distance.

In the next theorem, there is no requirement for $\delta$ to be very large. Instead, we impose constraints on the dual distance $\delta^{\perp}$ and the rate $R$. If the dual distance is not too small and the rate can be upper bounded, then we can also apply the shortening technique to obtain a low-biased code.

Theorem 1.8.

Given any $0<\varepsilon<1$, if an $[n,Rn,\delta n,\delta^{\perp}n]_q$ code $\mathcal{C}$ satisfies the condition that there exist $0<\gamma<\frac{1}{4}$ and $0<\delta_0^{\perp}<\min\{\varepsilon^{\frac{1}{\gamma}},\left(\frac{1+\log_q(1-\delta)}{36}\right)^2,(\frac{1}{q})^{\frac{1}{\gamma}}\}$ such that $\delta^{\perp}>\delta_0^{\perp}$ and $0<R<\frac{0.5-2\gamma}{1+0.9\cdot\log_q(1-\delta)}\mathrm{H}_q(\delta_0^{\perp})$, then there exists a number $0<s<R$ such that the following holds. If we perform a random $sn$-shortening $\mathcal{S}$ on $\mathcal{C}$, then with high probability the shortened code $\mathcal{C}^{[\mathcal{S}]}$ is $\varepsilon$-biased with rate at least $0.1R$.

In Theorem 1.8, the rate of the dual code must be sufficiently large. Additionally, if the term $\frac{\frac{1}{2}-2\gamma}{1+0.9\cdot\log_q(1-\delta)}$ is less than $1$, the rate-distance trade-off of the dual code surpasses the Gilbert-Varshamov (GV) bound. Consequently, when examining the problem within the context of the GV bound, we need to impose specific constraints on $\delta$. This leads to the following corollary.

Corollary 1.9.

Given any $0<\varepsilon<1$ and $\delta>1-q^{-0.6}$, there exists a number $\gamma>0$ such that for any $\delta^{\perp}>\delta_0^{\perp}$ and $0<R<(1+\gamma)\mathrm{H}_q(\delta_0^{\perp})$ for a certain $0<\delta_0^{\perp}<\min\{\varepsilon^{\frac{1}{\gamma}},\frac{1}{8100},(\frac{1}{q})^{\frac{1}{\gamma}}\}$, there exists a number $0<s<R$ such that the following holds. Let $\mathcal{C}$ be any $[n,Rn,\delta n,\delta^{\perp}n]_q$ code. If we perform a random $sn$-shortening $\mathcal{S}$ on $\mathcal{C}$, then with high probability the shortened code $\mathcal{C}^{[\mathcal{S}]}$ is $\varepsilon$-biased with rate at least $0.1R$.

Theorem 1.8 and Corollary 1.9 show that as long as the mother code and its dual both have a reasonable relative distance, one can use random shortening to get a new code with an arbitrarily small bias, while only losing a constant factor in the rate. We note that linear codes such that both the code and its dual have good relative distance are also easy to construct; see, for example, [Shp09].

Random-like codes by random shortening and puncturing.

In [GM22], the authors showed that a random puncturing of a low-biased code results in a new code that behaves like random linear codes. Using our theorems, we present a weaker condition that still achieves similar results. This follows from a combination of random shortening and random puncturing, as briefly discussed before.

Theorem 1.10.

For any $0<\varepsilon<1$, $b\in\mathbb{N}$, and prime power $q$, there exists some $\eta>0$ such that the following holds. Let $\mathscr{P}$ be a monotone-decreasing, $b$-local, and row-symmetric property over $\mathbb{F}_q^n$ satisfied by a random linear code of length $n$ and rate $R^{\prime}$. If any one of the following conditions is satisfied for $R,\delta,\delta^{\perp},q,\eta$:

  1.

    $\delta>(\frac{q-1}{q}-\eta)(1-R)$, or

  2.

    $\delta^{\perp}>\delta_0^{\perp}$ and $0<R<\frac{\frac{1}{2}-2\gamma}{1+0.9\cdot\log_q(1-\delta)}\mathrm{H}_q(\delta_0^{\perp})$ for a certain $0<\delta_0^{\perp}<\min\{\varepsilon^{\frac{1}{\gamma}},\left(\frac{1+\log_q(1-\delta)}{36}\right)^2,(\frac{1}{q})^{\frac{1}{\gamma}}\}$,

then there exist $m,p,s>0$ such that for any $[m,Rm,\delta m]_q$ code $\mathcal{C}$, if we perform a random $sm$-shortening and then a random $pm$-puncturing on $\mathcal{C}$, the resulting code $\mathcal{D}$ has length $n$, rate at least $R^{\prime}-\varepsilon$, and with high probability satisfies $\mathscr{P}$.

Remark 2.

In fact, all our theorems hold under the more restricted pseudorandom shortening, where $\mathcal{S}$ is sampled from a random walk on a sufficiently good constant-degree expander graph, as in [PP23]. The reason is that in this case, the only thing that changes is the probability that a random shortening does not hit a codeword. A standard hitting set property of random walks on expander graphs ensures that this probability (as in Lemma 2.15) remains close to $(1-\delta)^s$, as long as the second largest normalized eigenvalue in absolute value of the expander is small enough, which can be achieved by having a large enough constant degree. Consequently, all our proofs essentially go through with only minimal changes. Thus, like [PP23], we can also reduce the number of random bits used in the shortening to be linear in the block length of the shortened code, even if the mother code has a much larger length.

Remark 3.

We emphasize that, in all our theorems, the conditions of the mother code we need are weaker than the conditions in previous works which only use random puncturing [GM22, PP23]. For example, when the alphabet size is small, those works require the mother code to have a small bias, which implies a large distance. On the other hand, our theorems either only require a large distance, or only require a reasonable distance in both the mother code and its dual.

Discussions and open questions.

Our work leaves several open questions for future investigation. One such question is whether we can achieve a good rate when performing random shortening. Right now, using our analysis, the rate of the code after random shortening is potentially worse than that of the mother code we start with, and thus we do not get a good rate-distance trade-off by simply applying random shortening. Therefore, it is natural to ask whether one can achieve an $\varepsilon$-biased code through random shortening while at the same time maintaining a favorable $R$ versus $\varepsilon$ trade-off, if we start with some good initial conditions on the mother code. Another direction is further weakening the initial conditions required for obtaining low-biased codes, such as removing the constraint on $\delta$ in Theorem 1.8. In our view, these questions present exciting opportunities for advancing our understanding of random shortening, low-biased codes, and their connections.

Finally, it would also be quite interesting to completely derandomize the random shortening, thus yielding explicit constructions of low-biased codes with special structures (e.g., LDPC codes or expander codes). This will complement existing constructions of low-biased codes, and possibly lead to more efficient decoding algorithms.

1.4 Technique Overview

We investigate the effect of shortening as follows. An $\mathcal{S}$-shortening applied to a code $\mathcal{C}$ of length $n$ selects all codewords with zeros at the positions indexed by $\mathcal{S}$ and removes these positions. Specifically, if the support of a codeword $c\in\mathcal{C}$ intersects $\mathcal{S}$ (in which case we say $\mathcal{S}$ hits $c$), then $c$ is not included in the shortened code; if the support of $c$ does not intersect $\mathcal{S}$, then there is a codeword $c^{\prime}\in\mathcal{C}^{[\mathcal{S}]}$ obtained from $c$ by removing all positions in $\mathcal{S}$. In this way, under a random shortening, each non-zero codeword has a certain probability of being dropped and a certain probability of being retained in $\mathcal{C}^{[\mathcal{S}]}$. If the distance of $\mathcal{C}$ is $\delta n$, then by Lemma 2.15 the probability that each codeword is hit and dropped is at least $1-(1-\delta)^s$, where $s$ is the size of $\mathcal{S}$.
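The hitting bound quoted here can be sanity-checked exactly: a codeword whose support has size $w$ is missed by a uniformly random $s$-subset with probability $\binom{n-w}{s}/\binom{n}{s}$, which is at most $(1-w/n)^s$ since each factor $\frac{n-w-i}{n-i}$ is at most $\frac{n-w}{n}$. A small sketch (the function name is ours; Lemma 2.15 itself is stated later in the paper):

```python
from math import comb

def miss_prob(n, w, s):
    """Exact probability that a uniformly random s-subset of [n] avoids a
    fixed support of size w, i.e., the codeword is NOT hit by the shortening."""
    if w + s > n:
        return 0.0
    return comb(n - w, s) / comb(n, s)

# Check the bound miss_prob(n, w, s) <= (1 - w/n)^s on a few parameter choices.
for n, w, s in [(10, 5, 2), (20, 8, 5), (30, 3, 10)]:
    assert miss_prob(n, w, s) <= (1 - w / n) ** s
```

With $w\geq\delta n$, the hit probability is therefore at least $1-(1-\delta)^s$, matching the bound used above.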

We use 𝒞ε\mathcal{C}_{\varepsilon} to denote the set of all codewords in 𝒞\mathcal{C} which are not ε\varepsilon-biased. If the size of 𝒞ε\mathcal{C}_{\varepsilon} is small, then by a union bound, the probability that some codeword in 𝒞ε\mathcal{C}_{\varepsilon} is not hit by 𝒮\mathcal{S} is exponentially small. Thus, with high probability, all codewords in 𝒞\mathcal{C} that are not hit by 𝒮\mathcal{S}, and hence inherited by 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]}, are ε\varepsilon-biased. Hence, a critical part of all our proofs is to upper bound the size of 𝒞ε\mathcal{C}_{\varepsilon}.

Furthermore, as long as 𝒞\mathcal{C} is a linear code and ss is less than the dimension kk of 𝒞\mathcal{C}, we know that the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} has dimension at least ksk-s. Consequently, 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} retains a nonzero constant rate as well.

Change of parameters.

The shortening results in changes to the parameters of the code. Here, we mainly apply shortening for two purposes: adjusting the bias and amplifying the relative distance.

  1. 1.

    Adjusting the bias: Let 𝒞\mathcal{C} be of length nn. When 𝒞ε\mathcal{C}_{\varepsilon^{\prime}} is hit by 𝒮\mathcal{S}, it implies that the codewords in 𝒞\mathcal{C} not hit by 𝒮\mathcal{S} are all ε\varepsilon^{\prime}-biased. However, it doesn’t directly imply that the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is also ε\varepsilon^{\prime}-biased, since the shortening operation changes the length of the code. Nevertheless, the new bias ε\varepsilon of 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is given by εεn+sns\varepsilon\leq\frac{\varepsilon^{\prime}n+s}{n-s}, where ss is the size of the shortening 𝒮\mathcal{S}. If ss is small compared to nn, ε\varepsilon is close to ε\varepsilon^{\prime}. In the proof of Theorem 1.6, we can choose ss to be a sufficiently small fraction of nn. In the proof of Theorem 1.8, we provide an upper bound for RR, which also enables us to choose a small shortening size. In both cases, we set the shortening size to be less than 0.05εn0.05\varepsilon^{\prime}n, allowing us to choose ε=0.9ε\varepsilon^{\prime}=0.9\varepsilon.

  2. 2.

    Amplifying the relative distance: We use another technique in the proof of Theorem 1.7 to first transform a code with a rate-distance trade-off near the Plotkin bound into a code with near-optimal distance. By Proposition 2.8, the distance of the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is no less than that of the original code 𝒞\mathcal{C}. However, since 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} has length nsn-s instead of nn, its relative distance becomes δ1sn\frac{\delta}{1-\frac{s}{n}}. This allows us to increase the relative distance of the code. In turn, Theorem 1.7 follows from Theorem 1.6.
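As a quick numeric sanity check of the two parameter changes above (our own illustration, with hypothetical parameter values):

```python
def bias_after_shortening(eps_prime, n, s):
    """Lemma 2.14: if S hits all non-eps'-biased codewords, the shortened code
    of length n - s is (eps'*n + s)/(n - s)-biased."""
    return (eps_prime * n + s) / (n - s)

def relative_distance_after_shortening(delta, n, s):
    """Proposition 2.8: the absolute distance is preserved while the length
    shrinks to n - s, so the relative distance grows to delta / (1 - s/n)."""
    return delta / (1 - s / n)

# With shortening size s <= 0.05 * eps' * n and eps' = 0.9 * eps,
# the final bias stays below eps, as used in the proofs:
n, eps = 10_000, 0.1
eps_prime = 0.9 * eps
s = int(0.05 * eps_prime * n)          # 45 positions out of 10,000
assert bias_after_shortening(eps_prime, n, s) <= eps
```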

Estimation of the size of 𝒞ε\mathcal{C}_{\varepsilon}.

This is the most critical part of all our proofs. For Theorem 1.6 and Theorem 1.8, we have two different ways of estimating the upper bound of |𝒞ε||\mathcal{C}_{\varepsilon}|:

  1. 1.

    Estimating |𝒞ε||\mathcal{C}_{\varepsilon}| with relative distance δ\delta: We use Jq(δ)J_{q}(\delta) to denote the list decoding radius corresponding to the classical Johnson bound for a code over 𝔽q\mathbb{F}_{q} with relative distance δ\delta. It is easy to see that when δ\delta is close to the optimal q1q\frac{q-1}{q}, so is Jq(δ)J_{q}(\delta). To upper bound |𝒞ε||\mathcal{C}_{\varepsilon}|, we construct qq balls in 𝔽qn\mathbb{F}_{q}^{n} with radius Jq(δ)nJ_{q}(\delta)n, centered at t1t\cdot\vec{1} for each t𝔽qt\in\mathbb{F}_{q}, where 1\vec{1} is the all-one vector. By the Johnson bound, the number of codewords covered by these balls is at most poly(n)\mathrm{poly}(n). We show that if a codeword cc is not covered by these balls, its empirical distribution over 𝔽q\mathbb{F}_{q} is close to the uniform distribution, which implies that cc is ε\varepsilon-biased. This upper bounds |𝒞ε||\mathcal{C}_{\varepsilon}| by poly(n)\mathrm{poly}(n).

  2. 2.

    Estimating |𝒞ε||\mathcal{C}_{\varepsilon}| with relative dual distance δ\delta^{\perp} and rate RR: If 𝒞\mathcal{C} has dual distance dd^{\perp}, then any d1d^{\perp}-1 columns of the generator matrix of 𝒞\mathcal{C} are linearly independent, which means that if we uniformly randomly choose a codeword from 𝒞\mathcal{C}, then any d1d^{\perp}-1 symbols of the codeword are independently uniform, i.e., the symbols of a random codeword are (d1)(d^{\perp}-1)-wise independent. We can now use this property to estimate the probability that a codeword randomly chosen from 𝒞\mathcal{C} is not ε\varepsilon-biased. This is a typical application of the concentration phenomenon via the higher moment method: we use Hoeffding's inequality, the Chernoff bound, and the sub-Gaussian moment property to bound the (d1)(d^{\perp}-1)th moment of a sum of random variables. Then, by Markov's inequality, the probability that a random codeword is not ε\varepsilon-biased can be bounded, which in turn gives an upper bound on |𝒞ε||\mathcal{C}_{\varepsilon}|.

Obtaining random-like codes.

To obtain random-like codes, we combine our results with those in [GM22], which state that a randomly punctured low-biased code is likely to possess any monotone-decreasing local property typically satisfied by a random linear code of a similar rate. By our results, we can start from a code with less stringent conditions and achieve the same results as in [GM22], through the operations of a random shortening followed by a random puncturing.

Organization.

The rest of this paper is organized as follows. In Section 2, we describe some basic definitions, terms, and useful properties. In Section 3, we show how to combine our other theorems with the work of [GM22] to obtain a random-like code under weaker initial conditions, and prove Theorem 1.10. In Section 4, we present two methods for estimating |𝒞ε||\mathcal{C}_{\varepsilon}|. In Section 5, we prove Theorem 1.6 and Theorem 1.7. In Section 6, we prove Theorem 1.8 and Corollary 1.9.

2 Preliminaries

2.1 Punctured codes and shortened codes

Definition 2.1.

A linear code 𝒞\mathcal{C} of length nn and dimension kk over a finite field 𝔽q\mathbb{F}_{q} is a kk-dimensional subspace of the nn-dimensional vector space 𝔽qn\mathbb{F}_{q}^{n}. The rate of 𝒞\mathcal{C} is defined as the ratio of the dimension to the length of the code: R(𝒞)=knR(\mathcal{C})=\frac{k}{n}. The distance (or minimum distance) of 𝒞\mathcal{C} is the minimum Hamming distance between any two distinct codewords in the code: d(𝒞)=minc1,c2𝒞,c1c2d(c1,c2)d(\mathcal{C})=\min_{c_{1},c_{2}\in\mathcal{C},c_{1}\neq c_{2}}d(c_{1},c_{2}). The relative distance of 𝒞\mathcal{C} is the ratio of its distance to its length: δ(𝒞)=d(𝒞)n\delta(\mathcal{C})=\frac{d(\mathcal{C})}{n}. The dual distance of 𝒞\mathcal{C} is the minimum distance of its dual code 𝒞\mathcal{C}^{\perp}, denoted by d(𝒞)d^{\perp}(\mathcal{C}). The relative dual distance of 𝒞\mathcal{C} is the ratio of its dual distance to its length: δ(𝒞)=d(𝒞)n\delta^{\perp}(\mathcal{C})=\frac{d^{\perp}(\mathcal{C})}{n}. We denote a linear code with these properties as an [n,k,d]q[n,k,d]_{q} code or an [n,k,d,d]q[n,k,d,d^{\perp}]_{q} code. A linear code can be described by a k×nk\times n generator matrix GG, where each codeword in 𝒞\mathcal{C} can be obtained as a linear combination of the rows of GG. The parity check matrix of 𝒞\mathcal{C} is an n×(nk)n\times(n-k) matrix HH satisfying the property that for any codeword c𝒞c\in\mathcal{C}, cH=0cH=0.

Definition 2.2.

Let 𝒫\mathcal{P} be a subset of [n][n] of size pp. A 𝒫\mathcal{P}-puncturing on a code 𝒞\mathcal{C} of length nn involves removing the pp positions indexed by 𝒫\mathcal{P}. The resulting punctured code 𝒞(𝒫)\mathcal{C}^{(\mathcal{P})} has length npn-p. If 𝒫\mathcal{P} is a uniformly random subset of size pp, we say that 𝒞(𝒫)\mathcal{C}^{(\mathcal{P})} is obtained from 𝒞\mathcal{C} by a random pp-puncturing.

Definition 2.3.

Let 𝒮\mathcal{S} be a subset of [n][n] of size ss. An 𝒮\mathcal{S}-shortening on a code 𝒞\mathcal{C} of length nn involves selecting all codewords with zero values on the positions indexed by 𝒮\mathcal{S} and removing these positions. The resulting shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} has length nsn-s. If 𝒮\mathcal{S} is a uniformly random subset of size ss, we say that 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is obtained from 𝒞\mathcal{C} by a random ss-shortening.

Proposition 2.4.

For a linear code 𝒞\mathcal{C}, and 𝒫,𝒮[n]\mathcal{P},\mathcal{S}\subseteq[n], 𝒞(𝒫)\mathcal{C}^{(\mathcal{P})} and 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} are also both linear codes. The generator matrix of 𝒞(𝒫)\mathcal{C}^{(\mathcal{P})} is obtained from that of 𝒞\mathcal{C} by deleting pp columns indexed by 𝒫\mathcal{P}, and the parity check matrix of 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is obtained from that of 𝒞\mathcal{C} by deleting ss rows indexed by 𝒮\mathcal{S}.

Given that the parity check matrix of a code is the transpose of the generator matrix of its corresponding dual code, we can deduce that shortening a code is equivalent to puncturing its dual code. Consequently, we can observe the following properties.

Proposition 2.5.

For a linear code 𝒞\mathcal{C}, and 𝒫[n]\mathcal{P}\subseteq[n], (𝒞(𝒫))=(𝒞)[𝒫](\mathcal{C}^{(\mathcal{P})})^{\perp}=(\mathcal{C}^{\perp})^{[\mathcal{P}]}.
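Proposition 2.5 can be checked by brute force on tiny codes. The following sketch (our own illustration, not from the paper; the helper names are hypothetical) verifies the duality on a small binary example:

```python
from itertools import product

def dual(code, n, q):
    """Brute-force dual: all length-n vectors orthogonal to every codeword (tiny n only)."""
    return {v for v in product(range(q), repeat=n)
            if all(sum(a * b for a, b in zip(v, c)) % q == 0 for c in code)}

def puncture(code, P):
    """P-puncturing: delete the positions indexed by P from every codeword."""
    return {tuple(x for i, x in enumerate(c) if i not in P) for c in code}

def shorten(code, S):
    """S-shortening: keep codewords vanishing on S, delete the positions in S."""
    return {tuple(x for i, x in enumerate(c) if i not in S)
            for c in code if all(c[i] == 0 for i in S)}

# [3, 2] binary code spanned by (1,0,1) and (0,1,1):
C = {(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)}
P = {0}
# Proposition 2.5: dual of the punctured code equals the shortened dual code.
assert dual(puncture(C, P), 2, 2) == shorten(dual(C, 3, 2), P)
```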

Definition 2.6.

Let 𝒞𝔽qn\mathcal{C}\subseteq\mathbb{F}_{q}^{n}, let cc be a codeword in 𝒞\mathcal{C}, and let 𝒮[n]\mathcal{S}\subseteq[n]. Denote supp(c)\mathrm{supp}(c) to be the set of non-zero coordinates of cc. We say 𝒮\mathcal{S} hits cc if supp(c)𝒮\mathrm{supp}(c)\cap\mathcal{S}\neq\varnothing, and 𝒮\mathcal{S} hits a set of codewords (such as 𝒞ε\mathcal{C}_{\varepsilon}, defined in Definition 2.13) if every codeword in the set is hit by 𝒮\mathcal{S}.

In general, when applying the shortening 𝒮\mathcal{S} to a code 𝒞\mathcal{C}, we exclude any codeword that is hit by 𝒮\mathcal{S}. Specifically, if a codeword c𝒞c\in\mathcal{C} is hit by 𝒮\mathcal{S}, by definition we do not include it in the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]}. On the other hand, if a codeword cc is not hit by 𝒮\mathcal{S}, there exists a corresponding codeword cc^{\prime} in the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]}, which is obtained by removing all positions indexed by 𝒮\mathcal{S}.

Next, we examine the effects of the shortening operation on code parameters.

Proposition 2.7.

Let 𝒞\mathcal{C} be an [n,k,d,d]q[n,k,d,d^{\perp}]_{q} code and 𝒮\mathcal{S} be a subset of [n][n] of size s<ks<k. The shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} has dimension kksk^{\prime}\geq k-s. Moreover, if s<ds<d^{\perp}, then k=ksk^{\prime}=k-s.

Proof.

By Proposition 2.4, the parity check matrix of 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is (ns)×(nk)(n-s)\times(n-k), which implies that the dimension of 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is at least (ns)(nk)=ks(n-s)-(n-k)=k-s. If s<ds<d^{\perp}, by Proposition 2.5, there does not exist a non-zero codeword c𝒞c\in\mathcal{C}^{\perp} such that supp(c)𝒮\mathrm{supp}(c)\subseteq\mathcal{S}. In this case, no two codewords of 𝒞\mathcal{C}^{\perp} collide after puncturing on 𝒮\mathcal{S}, so dim((𝒞[𝒮]))=dim(𝒞)=nk\dim((\mathcal{C}^{[\mathcal{S}]})^{\perp})=\dim(\mathcal{C}^{\perp})=n-k, and the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} has dimension ksk-s. ∎

Proposition 2.8.

Let 𝒞\mathcal{C} be an [n,k,d,d]q[n,k,d,d^{\perp}]_{q} code and 𝒮\mathcal{S} be a subset of [n][n] of size s<ks<k. The distance of the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is at least dd.

Proof.

By definition, any non-zero codeword c𝒞[𝒮]c\in\mathcal{C}^{[\mathcal{S}]} comes from some codeword c𝒞c^{\prime}\in\mathcal{C} by removing all coordinates indexed by 𝒮\mathcal{S}, where ci=0c^{\prime}_{i}=0 for every i𝒮i\in\mathcal{S}; thus wt(c)=wt(c)d\mathrm{wt}(c)=\mathrm{wt}(c^{\prime})\geq d. So, the distance of 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is at least dd. ∎

Moreover, since the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} has length nsn-s instead of nn, its relative distance is at least δ1sn\frac{\delta}{1-\frac{s}{n}}. This allows us to increase the relative distance of the code.

2.2 A characterization of ε\varepsilon-biased code

In this section, we recall the definition of ε\varepsilon-biased codes.

Definition 2.9.

Let 𝒞𝔽qn\mathcal{C}\subseteq\mathbb{F}_{q}^{n}, where q=prq=p^{r} for some prime pp and let ε>0\varepsilon>0. A vector x𝔽qnx\in\mathbb{F}_{q}^{n} is said to be ε\varepsilon-biased if |i=1nωtr(axi)|εn\left|\sum_{i=1}^{n}\omega^{\mathrm{tr}\left(a\cdot x_{i}\right)}\right|\leq\varepsilon n for all a𝔽qa\in\mathbb{F}_{q}^{\ast}. Here, ω=e2πip\omega=e^{\frac{2\pi i}{p}} and tr:𝔽q𝔽p\mathrm{tr}:\mathbb{F}_{q}\rightarrow\mathbb{F}_{p} is the field trace map: tr(x)=i=1rxpi\mathrm{tr}(x)=\sum_{i=1}^{r}x^{p^{i}}. The code 𝒞\mathcal{C} is said to be ε\varepsilon-biased if every c𝒞\{0}c\in\mathcal{C}\backslash\{0\} is ε\varepsilon-biased.

Note that for a binary vector x𝔽2nx\in\mathbb{F}_{2}^{n}, |i=1nωtr(xi)|=|n2wt(x)|\left|\sum_{i=1}^{n}\omega^{\mathrm{tr}\left(x_{i}\right)}\right|=\left|n-2\cdot\mathrm{wt}(x)\right|. Then, a binary code 𝒞\mathcal{C} is ε\varepsilon-biased if and only if every non-zero codeword cc of 𝒞\mathcal{C} has weight 1ε2nwt(c)1+ε2n\frac{1-\varepsilon}{2}n\leq\mathrm{wt}(c)\leq\frac{1+\varepsilon}{2}n.
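This binary characterization is easy to check numerically (our own illustration, not from the paper):

```python
def char_sum_binary(x):
    """|sum_i omega^{tr(x_i)}| over F_2, where omega = -1 and tr is the identity."""
    return abs(sum((-1) ** xi for xi in x))

x = (1, 0, 1, 1)
n, wt = len(x), sum(x)
# The character sum equals |n - 2*wt(x)|:
assert char_sum_binary(x) == abs(n - 2 * wt)

# x is eps-biased iff its weight lies in [(1-eps)/2 * n, (1+eps)/2 * n]:
eps = 0.6
assert (char_sum_binary(x) <= eps * n) == ((1 - eps) / 2 * n <= wt <= (1 + eps) / 2 * n)
```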

Proposition 2.10.

If 𝒞𝔽qn\mathcal{C}\subseteq\mathbb{F}_{q}^{n} is ε\varepsilon-biased, then its distance is at least (q1)(1ε)qn\frac{(q-1)(1-\varepsilon)}{q}n.

Proof.

Since 𝒞𝔽qn\mathcal{C}\subseteq\mathbb{F}_{q}^{n} is ε\varepsilon-biased, for any non-zero c𝒞c\in\mathcal{C},

|a𝔽qi=1nωtr(aci)|=|a𝔽q(i:ci0ωtr(aci)+(nwt(c)))|(q1)εn.\left|\sum_{a\in\mathbb{F}_{q}^{\ast}}\sum_{i=1}^{n}\omega^{\mathrm{tr}\left(a\cdot c_{i}\right)}\right|=\left|\sum_{a\in\mathbb{F}_{q}^{\ast}}\left(\sum_{i:c_{i}\neq 0}\omega^{\mathrm{tr}\left(a\cdot c_{i}\right)}+(n-\mathrm{wt(c)})\right)\right|\leq(q-1)\varepsilon n. (2)

By changing the order of double summation,

|i:ci0a𝔽qωtr(aci)+(q1)(nwt(c))|(q1)εn.\left|\sum_{i:c_{i}\neq 0}\sum_{a\in\mathbb{F}_{q}^{\ast}}\omega^{\mathrm{tr}\left(a\cdot c_{i}\right)}+(q-1)(n-\mathrm{wt(c)})\right|\leq(q-1)\varepsilon n. (3)

Since a𝔽qωtr(aci)=1\sum_{a\in\mathbb{F}_{q}^{\ast}}\omega^{\mathrm{tr}\left(a\cdot c_{i}\right)}=-1 for all ci0c_{i}\neq 0, we get

|(q1)nqwt(c)|(q1)εn,\left|(q-1)n-q\cdot\mathrm{wt}(c)\right|\leq(q-1)\varepsilon n, (4)

which implies wt(c)(q1)(1ε)qn\mathrm{wt}(c)\geq\frac{(q-1)(1-\varepsilon)}{q}n. ∎

Definition 2.11.

Given a vector x𝔽qnx\in\mathbb{F}_{q}^{n}, we define its empirical distribution Empx\mathrm{Emp}_{x} over 𝔽q\mathbb{F}_{q} by

Empx(t)=Pri[n](xi=t).\mathrm{Emp}_{x}(t)=\mathrm{Pr}_{i\in[n]}(x_{i}=t). (5)
Lemma 2.12.

Given a vector x𝔽qnx\in\mathbb{F}_{q}^{n}, if for any t𝔽qt\in\mathbb{F}_{q},

Empx(t)1q+ε2(q1),\mathrm{Emp}_{x}(t)\leq\frac{1}{q}+\frac{\varepsilon}{2(q-1)}, (6)

then xx is ε\varepsilon-biased.

Proof.

We compute the total variation distance between Empx\mathrm{Emp}_{x} and the uniform distribution over 𝔽q\mathbb{F}_{q}, which is given by

12t𝔽q|Empx(t)1q|=t:Empx(t)>1q(Empx(t)1q)ε2.\frac{1}{2}\sum_{t\in\mathbb{F}_{q}}|\mathrm{Emp}_{x}(t)-\frac{1}{q}|=\sum_{t:\mathrm{Emp}_{x}(t)>\frac{1}{q}}(\mathrm{Emp}_{x}(t)-\frac{1}{q})\leq\frac{\varepsilon}{2}. (7)

Since, for any a𝔽qa\in\mathbb{F}_{q}^{\ast},

t𝔽qωtr(at)=0,\sum_{t\in\mathbb{F}_{q}}\omega^{\mathrm{tr}\left(a\cdot t\right)}=0, (8)

we have

|i=1nωtr(axi)|\displaystyle\left|\sum_{i=1}^{n}\omega^{\mathrm{tr}\left(a\cdot x_{i}\right)}\right| (9)
=\displaystyle= |t𝔽qi:xi=tωtr(axi)|\displaystyle\left|\sum_{t\in\mathbb{F}_{q}}\sum_{i:x_{i}=t}\omega^{\mathrm{tr}\left(a\cdot x_{i}\right)}\right|
=\displaystyle= |t𝔽q(i:xi=tωtr(axi)nqωtr(at))|\displaystyle\left|\sum_{t\in\mathbb{F}_{q}}\left(\sum_{i:x_{i}=t}\omega^{\mathrm{tr}\left(a\cdot x_{i}\right)}-\frac{n}{q}\omega^{\mathrm{tr}\left(a\cdot t\right)}\right)\right|
\displaystyle\leq t𝔽qn|Empx(t)1q|εn,\displaystyle\sum_{t\in\mathbb{F}_{q}}n\cdot\left|\mathrm{Emp}_{x}(t)-\frac{1}{q}\right|\leq\varepsilon n,

which implies xx is ε\varepsilon-biased. ∎
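For a prime field (where the trace is the identity), the bound used in this proof, namely that the bias is at most nt|Empx(t)1q|n\cdot\sum_{t}\left|\mathrm{Emp}_{x}(t)-\frac{1}{q}\right|, can be checked numerically (our own sketch, not from the paper):

```python
import cmath

def empirical(x, q):
    """Empirical distribution of x over F_q (Definition 2.11)."""
    n = len(x)
    return [sum(1 for xi in x if xi == t) / n for t in range(q)]

def bias(x, q):
    """max over a in F_q^* of |sum_i omega^(a*x_i)| / n, for prime q only."""
    n = len(x)
    return max(abs(sum(cmath.exp(2j * cmath.pi * a * xi / q) for xi in x)) / n
               for a in range(1, q))

x, q = (0, 0, 1, 2, 1, 0), 3
emp = empirical(x, q)
# The proof's bound: bias is at most the L1 distance of Emp_x from uniform.
assert bias(x, q) <= sum(abs(e - 1 / q) for e in emp) + 1e-12
```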

Definition 2.13.

Let 𝒞\mathcal{C} be a code of length nn. Denote 𝒞ε\mathcal{C}_{\varepsilon} to be the set of all non-zero codewords which are not ε\varepsilon-biased.

Lemma 2.14.

Let 𝒞\mathcal{C} be a code of length nn, and 𝒮\mathcal{S} be a subset of [n][n] of size ss. If 𝒮\mathcal{S} hits 𝒞ε\mathcal{C}_{\varepsilon}, then the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is εn+sns\frac{\varepsilon n+s}{n-s}-biased.

Proof.

Any non-zero codeword c𝒞[𝒮]c\in\mathcal{C}^{[\mathcal{S}]} comes from some codeword c𝒞c^{\prime}\in\mathcal{C} by removing all coordinates indexed by 𝒮\mathcal{S}, where ci=0c^{\prime}_{i}=0 for any i𝒮i\in\mathcal{S}. Since 𝒮\mathcal{S} hits 𝒞ε\mathcal{C}_{\varepsilon} and cc^{\prime} is not hit by 𝒮\mathcal{S}, we have c𝒞εc^{\prime}\notin\mathcal{C}_{\varepsilon}, so cc^{\prime} is ε\varepsilon-biased, that is, for any a𝔽qa\in\mathbb{F}_{q}^{\ast}, |i=1nωtr(aci)|εn\left|\sum_{i=1}^{n}\omega^{\mathrm{tr}\left(a\cdot c^{\prime}_{i}\right)}\right|\leq\varepsilon n. Then

|i=1nsωtr(aci)|\displaystyle\left|\sum_{i=1}^{n-s}\omega^{\mathrm{tr}\left(a\cdot c_{i}\right)}\right| (10)
=\displaystyle= |i𝒮ωtr(aci)|\displaystyle\left|\sum_{i\notin\mathcal{S}}\omega^{\mathrm{tr}\left(a\cdot c^{\prime}_{i}\right)}\right|
\displaystyle\leq |i=1nωtr(aci)|+|i𝒮ωtr(aci)|\displaystyle\left|\sum_{i=1}^{n}\omega^{\mathrm{tr}\left(a\cdot c^{\prime}_{i}\right)}\right|+\left|\sum_{i\in\mathcal{S}}\omega^{\mathrm{tr}\left(a\cdot c^{\prime}_{i}\right)}\right|
\displaystyle\leq εn+s=(εn+sns)(ns)\displaystyle\varepsilon n+s=\left(\frac{\varepsilon n+s}{n-s}\right)\left(n-s\right)

for any a𝔽qa\in\mathbb{F}_{q}^{\ast}, and thus 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is εn+sns\frac{\varepsilon n+s}{n-s}-biased. ∎

Lemma 2.15.

Let 𝒞\mathcal{C} be a code of length nn and distance d=δnd=\delta n, and let 𝒮\mathcal{S} be a uniformly random subset of [n][n] of size ss. For any non-zero codeword c𝒞c\in\mathcal{C}, the probability that cc is not hit by 𝒮\mathcal{S} is at most

(1δ)s.(1-\delta)^{s}.
Proof.

Suppose wt(c)=id\mathrm{wt}(c)=i\geq d. The event that cc is not hit by 𝒮\mathcal{S} means 𝒮supp(c)=\mathcal{S}\cap\mathrm{supp}(c)=\varnothing, which happens with probability

(nis)(ns)(1in)s(1δ)s.\frac{{n-i\choose s}}{{n\choose s}}\leq(1-\frac{i}{n})^{s}\leq(1-\delta)^{s}. (11)
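The hypergeometric bound in Lemma 2.15 can be verified exactly for small parameters (our own sketch, not from the paper):

```python
from math import comb

def prob_not_hit(n, wt, s):
    """Exact probability that a uniformly random size-s set S avoids a fixed
    support of size wt: C(n - wt, s) / C(n, s)."""
    return comb(n - wt, s) / comb(n, s)

n, s, delta = 20, 5, 0.4
# Every non-zero codeword has weight >= delta * n, so its avoidance
# probability is at most (1 - delta)^s:
for wt in range(int(delta * n), n + 1):
    assert prob_not_hit(n, wt, s) <= (1 - delta) ** s
```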

3 Random-like codes by random shortening and puncturing

In this section, we integrate the applications of low-biased codes with our theorems to obtain further results. In [GM22], randomly punctured low-biased codes are studied and shown to be locally similar to random linear codes. By combining this result with ours, we can derive weaker conditions for obtaining a random-like code. We first recall some of the framework for studying properties of codes, which was established in [MRRZ+20, GMR+20].

A property 𝒫\mathscr{P} of length-nn linear codes over 𝔽q\mathbb{F}_{q} is a collection of linear codes in 𝔽qn\mathbb{F}_{q}^{n}. A linear code 𝒞𝔽qn\mathcal{C}\subseteq\mathbb{F}_{q}^{n} such that 𝒞𝒫\mathcal{C}\in\mathscr{P} is said to satisfy 𝒫\mathscr{P}.

Definition 3.1.

A property 𝒫\mathscr{P} is said to be

  • monotone-increasing if, for any code 𝒞\mathcal{C}, whenever one of its subcodes (i.e., a subspace of 𝒞\mathcal{C}) satisfies 𝒫\mathscr{P}, the code 𝒞\mathcal{C} itself also satisfies 𝒫\mathscr{P} (monotone-decreasing if the complement of 𝒫\mathscr{P} is monotone-increasing);

  • bb-local for some bb\in\mathbb{N} if there exists a family 𝒫\mathcal{B}_{\mathscr{P}} of sets of words in 𝔽qn\mathbb{F}_{q}^{n}, each set of size at most bb, such that 𝒞\mathcal{C} satisfies 𝒫\mathscr{P} if and only if there exists a set B𝒫B\in\mathcal{B}_{\mathscr{P}} satisfying B𝒞B\subseteq\mathcal{C};

  • row-symmetric if, for any code 𝒞𝔽qn\mathcal{C}\subseteq\mathbb{F}_{q}^{n} that satisfies 𝒫\mathscr{P}, the resulting code obtained by performing a permutation on the nn positions of 𝒞\mathcal{C} also satisfies 𝒫\mathscr{P}.

Definition 3.2.

Let 𝒫\mathscr{P} be a nonempty monotone-increasing property over 𝔽qn\mathbb{F}_{q}^{n}. Its threshold is defined by

RLC(𝒫)=min{R[0,1]Pr[𝒞RLCn,q(R) satisfies 𝒫]12},\mathrm{RLC}(\mathscr{P})=\min\left\{R\in[0,1]\mid\mathrm{Pr}\left[\mathcal{C}_{\mathrm{RLC}}^{n,q}(R)\text{ satisfies }\mathscr{P}\right]\geq\frac{1}{2}\right\}, (12)

where CRLCn,q(R)C_{\mathrm{RLC}}^{n,q}(R) is a random linear code of rate RR in 𝔽qn\mathbb{F}_{q}^{n}.

Theorem 3.3.

[MRRZ+20] Let 𝒞𝔽qn\mathcal{C}\subseteq\mathbb{F}_{q}^{n} be a random linear code of rate RR, and let 𝒫\mathscr{P} be a monotone-increasing, bb-local and row-symmetric property over 𝔽qn\mathbb{F}_{q}^{n}, where nlogqnωn(q2b)\frac{n}{\log_{q}n}\geq\omega_{n\rightarrow\infty}\left(q^{2b}\right). The following holds for every ε>0\varepsilon>0.

  1. 1.

    If RRLC(𝒫)εR\leq\mathrm{RLC}(\mathscr{P})-\varepsilon then

    Pr[𝒞 satisfies 𝒫]q(εon(1))n.\mathrm{Pr}[\mathcal{C}\text{ satisfies }\mathscr{P}]\leq q^{-\left(\varepsilon-o_{n\rightarrow\infty}(1)\right)n}. (13)
  2. 2.

    If RRLC(𝒫)+εR\geq\mathrm{RLC}(\mathscr{P})+\varepsilon then

    Pr[𝒞 satisfies 𝒫]1q(εon(1))n.\mathrm{Pr}[\mathcal{C}\text{ satisfies }\mathscr{P}]\geq 1-q^{-\left(\varepsilon-o_{n\rightarrow\infty}(1)\right)n}. (14)
Theorem 3.4.

[GM22] Let qq be a prime power, and let 𝒫\mathscr{P} be a monotone-decreasing, row-symmetric and b-local property over 𝔽qn\mathbb{F}_{q}^{n}, where nlognωn(q2b)\frac{n}{\log n}\geq\omega_{n\rightarrow\infty}\left(q^{2b}\right). Let 𝒟𝔽qm\mathcal{D}\subseteq\mathbb{F}_{q}^{m} be a linear code. Let 𝒞\mathcal{C} be a random n-puncturing of 𝒟\mathcal{D} of design rate RRLC(𝒫)εR\leq\mathrm{RLC}(\mathscr{P})-\varepsilon for some ε>0\varepsilon>0. Suppose that 𝒟\mathcal{D} is (εblnqqb)\left(\frac{\varepsilon b\ln q}{q^{b}}\right)-biased. Then,

Pr[𝒞 satisfies 𝒫]q(εon(1))n.\mathrm{Pr}[\mathcal{C}\text{ satisfies }\mathscr{P}]\leq q^{-\left(\varepsilon-o_{n\rightarrow\infty}(1)\right)n}.

This theorem offers a technique for constructing random-like codes from low-biased codes. By combining puncturing and shortening, we can transform a code with weaker initial conditions into a random-like code. Since the complement of a monotone-decreasing property is monotone-increasing, we state our theorem in terms of monotone-decreasing properties, and use the phrase “with high probability” instead of spelling out the exponentially small failure probability.

Theorem 3.5.

Let 0<ε<10<\varepsilon<1, bb\in\mathbb{N}, and qq be a prime power. For any monotone-decreasing, bb-local, and row-symmetric property 𝒫\mathscr{P} over 𝔽qn\mathbb{F}_{q}^{n} with RLC(𝒫)>ε\mathrm{RLC}(\mathscr{P})>\varepsilon, there exists some η>0\eta>0 such that the following holds. If any one of the following conditions is satisfied for R,δ,δ,q,ηR,\delta,\delta^{\perp},q,\eta:

  1. 1.

    δ>(q1qη)(1R)\delta>(\frac{q-1}{q}-\eta)(1-R), or

  2. 2.

    0<δ<min{η1γ,(1+logq(1δ)36)2,(1q)1γ}0<\delta^{\perp}<\min\{\eta^{\frac{1}{\gamma}},\left(\frac{1+\log_{q}(1-\delta)}{36}\right)^{2},(\frac{1}{q})^{\frac{1}{\gamma}}\} and 0<R<122γ1+0.9logq(1δ)Hq(δ)0<R<\frac{\frac{1}{2}-2\gamma}{1+0.9\log_{q}(1-\delta)}\mathrm{H}_{q}(\delta^{\perp}) for some γ>0\gamma>0,

then there exist positive m,p,sm,p,s such that for any [m,Rm,δm]q[m,Rm,\delta m]_{q} code 𝒞\mathcal{C}, upon performing a random smsm-shortening and then a random pmpm-puncturing to 𝒞\mathcal{C}, the resultant code 𝒟\mathcal{D} has length nn, rate of at least RLC(𝒫)ε\mathrm{RLC}(\mathscr{P})-\varepsilon and, with high probability, satisfies 𝒫\mathscr{P}.

Proof.

Define η=min{εblnqqb,RLC(𝒫)ε}\eta=\min\{\frac{\varepsilon b\ln q}{q^{b}},\mathrm{RLC}(\mathscr{P})-\varepsilon\}. Given that one of the two conditions holds, by Theorem 1.7 and Theorem 1.8 we can select an ss such that a random smsm-shortening of 𝒞\mathcal{C} results in an η\eta-biased code. In this scenario, by the Singleton bound, the rate of the shortened code is less than RLC(𝒫)ε\mathrm{RLC}(\mathscr{P})-\varepsilon. Utilizing Theorem 3.4 with design rate RLC(𝒫)ε\mathrm{RLC}(\mathscr{P})-\varepsilon, we then perform a random pmpm-puncturing. The resulting code has rate RLC(𝒫)ε\mathrm{RLC}(\mathscr{P})-\varepsilon and, with high probability, satisfies 𝒫\mathscr{P}. Furthermore, as the choices of ss and pp are independent of mm, we can finally define m=n1spm=\frac{n}{1-s-p}. Consequently, the resulting code has length nn. ∎

4 Estimation on low-biased codewords

For a random vector x𝔽qnx\in\mathbb{F}_{q}^{n}, it is known from the law of large numbers that its empirical distribution Empx\mathrm{Emp}_{x} is, with high probability, ε\varepsilon-close to the uniform distribution over 𝔽q\mathbb{F}_{q} for any fixed ε\varepsilon as nn goes to infinity. Therefore, for each ε\varepsilon, if 𝒞\mathcal{C} is a random code, then 𝒞ε\mathcal{C}_{\varepsilon} will, with high probability, constitute only a small fraction of 𝒞\mathcal{C}. In the following, we present several methods for estimating |𝒞ε||\mathcal{C}_{\varepsilon}| under general conditions.

We first record a fact about the field trace map tr:𝔽q𝔽p\mathrm{tr}:\mathbb{F}_{q}\rightarrow\mathbb{F}_{p}, defined by tr(x)=i=1rxpi\mathrm{tr}(x)=\sum_{i=1}^{r}x^{p^{i}}, where q=prq=p^{r}. This map is both 𝔽p\mathbb{F}_{p}-linear and surjective. Consequently, for any a𝔽qa\in\mathbb{F}_{q}^{\ast}, if xx is selected uniformly at random from 𝔽q\mathbb{F}_{q}, then tr(ax)\mathrm{tr}(a\cdot x) is uniformly distributed over 𝔽p\mathbb{F}_{p}.

We begin with our first estimation method. When the distance δ\delta is large, ε\varepsilon can be set to a small value while ensuring that the number of codewords in 𝒞ε\mathcal{C}_{\varepsilon} is at most polynomial. To illustrate, consider a binary code 𝒞\mathcal{C} over 𝔽2n\mathbb{F}_{2}^{n} with a large relative distance δ\delta. We argue that, in this case, only a polynomial number of codewords have weight greater than (12+O(12δ))n(\frac{1}{2}+O(\sqrt{\frac{1}{2}-\delta}))n. Indeed, consider the Hamming ball centered at [1,1,,1][1,1,\cdots,1] with radius J(δ)n=12(112δ)nJ(\delta)n=\frac{1}{2}(1-\sqrt{1-2\delta})n: by the Johnson bound, only a polynomial number of codewords lie within this ball, and every codeword of weight greater than (1J(δ))n(1-J(\delta))n lies within it.

Lemma 4.1.

Let 𝒞\mathcal{C} be an [n,Rn,δn]q[n,Rn,\delta n]_{q} code. For any ε2(q1)q1q(q1qδ)\varepsilon\geq 2(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}, |𝒞ε|q2δn2|\mathcal{C}_{\varepsilon}|\leq q^{2}\delta n^{2}.

Proof.

For each t𝔽qt\in\mathbb{F}_{q}, denote t𝔽qn\vec{t}\in\mathbb{F}_{q}^{n} to be the vector where each entry has value tt. Denote Jq(δ)=(11q)(11qq1δ)J_{q}(\delta)=(1-\frac{1}{q})\left(1-\sqrt{1-\frac{q}{q-1}\delta}\right), which is the relative list decoding radius associated with the Johnson bound. Let

B(t)={c𝒞d(c,t)Jq(δ)n}.B(t)=\{c\in\mathcal{C}\mid d(c,\vec{t})\leq J_{q}(\delta)n\}. (15)

By Johnson bound for a linear code,

|t𝔽qB(t)|t𝔽q|B(t)|q2δn2.|\bigcup_{t\in\mathbb{F}_{q}}B(t)|\leq\sum_{t\in\mathbb{F}_{q}}|B(t)|\leq q^{2}\delta n^{2}. (16)

If ct𝔽qB(t)c\notin\bigcup_{t\in\mathbb{F}_{q}}B(t), then the Hamming distance between cc and t\vec{t} is greater than Jq(δ)nJ_{q}(\delta)n for each t𝔽qt\in\mathbb{F}_{q}, which means

Empc(t)1Jq(δ)=1q+q1q(q1qδ).\mathrm{Emp}_{c}(t)\leq 1-J_{q}(\delta)=\frac{1}{q}+\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}. (17)

By Lemma 2.12, cc is ε\varepsilon-biased for any ε2(q1)q1q(q1qδ)\varepsilon\geq 2(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}, and thus

|𝒞ε||t𝔽qB(t)|q2δn2.|\mathcal{C}_{\varepsilon}|\leq|\bigcup_{t\in\mathbb{F}_{q}}B(t)|\leq q^{2}\delta n^{2}. (18)
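Numerically, the identity 1Jq(δ)=1q+q1q(q1qδ)1-J_{q}(\delta)=\frac{1}{q}+\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)} used in this proof, and the resulting bias threshold of Lemma 4.1, can be checked as follows (our own sketch, not from the paper):

```python
from math import sqrt, isclose

def johnson_radius(q, delta):
    """Relative Johnson list-decoding radius J_q(delta)."""
    return (1 - 1 / q) * (1 - sqrt(1 - q * delta / (q - 1)))

def bias_threshold(q, delta):
    """Smallest eps covered by Lemma 4.1: 2(q-1)*sqrt((q-1)/q * ((q-1)/q - delta))."""
    return 2 * (q - 1) * sqrt((q - 1) / q * ((q - 1) / q - delta))

for q, delta in [(2, 0.49), (3, 0.6), (5, 0.75)]:
    lhs = 1 - johnson_radius(q, delta)
    rhs = 1 / q + bias_threshold(q, delta) / (2 * (q - 1))
    assert isclose(lhs, rhs)

# As delta approaches the optimal (q-1)/q, the achievable bias goes to 0:
assert bias_threshold(2, 0.5) == 0.0
```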

Another approach to bounding |𝒞ε||\mathcal{C}_{\varepsilon}| is the probabilistic method. It is essential to observe that when the dual code of 𝒞\mathcal{C} has distance d+1d+1, any dd columns of the generator matrix of 𝒞\mathcal{C} are linearly independent. This implies that the symbols of a uniformly random codeword of 𝒞\mathcal{C} are dd-wise independent. Consequently, the bias of a random codeword can be controlled via a dd-th moment inequality.

Lemma 4.2.

Let x1,,xnx_{1},\cdots,x_{n} be independent random variables with mean zero, taking values in [1,1]\left[-1,1\right]. Denote Xn=i=1nxiX_{n}=\sum_{i=1}^{n}x_{i}. Then for any even dd,

𝔼((Xn)d)2(2n)d/2(d2)!.\mathbb{E}((X_{n})^{d})\leq 2\cdot(2n)^{d/2}\cdot(\frac{d}{2})!. (19)
Proof.

By Hoeffding's lemma, for any λ\lambda\in\mathbb{R},

𝔼eλXneλ22n.\mathbb{E}e^{\lambda X_{n}}\leq e^{\frac{\lambda^{2}}{2}n}. (20)

Then by the Chernoff method, for any t>0t>0,

Pr(Xnt)exp(t22n);Pr(Xnt)exp(t22n).\begin{gathered}\mathrm{Pr}\left(X_{n}\geq t\right)\leq\exp\left(-\frac{t^{2}}{2n}\right);\\ \mathrm{Pr}\left(X_{n}\leq-t\right)\leq\exp\left(-\frac{t^{2}}{2n}\right).\end{gathered} (21)

Then by the standard sub-Gaussian moment bound, for any even dd,

𝔼((Xn)d)=𝔼(|Xn|d)2(2n)d/2(d2)!.\mathbb{E}((X_{n})^{d})=\mathbb{E}(|X_{n}|^{d})\leq 2\cdot(2n)^{d/2}\cdot(\frac{d}{2})!. (22)
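For the special case of i.i.d. Rademacher (±1) variables, the dd-th moment can be computed exactly, which gives a quick check of the bound of Lemma 4.2 (our own sketch, not from the paper):

```python
from math import comb, factorial

def rademacher_moment(n, d):
    """Exact E[X_n^d] for X_n a sum of n i.i.d. uniform +/-1 variables:
    X_n = n - 2*B with B ~ Bin(n, 1/2)."""
    return sum(comb(n, k) * (n - 2 * k) ** d for k in range(n + 1)) / 2 ** n

def moment_bound(n, d):
    """The bound 2 * (2n)^(d/2) * (d/2)! of Lemma 4.2, for even d."""
    return 2 * (2 * n) ** (d // 2) * factorial(d // 2)

for n in (5, 10, 20):
    for d in (2, 4, 6):
        assert rademacher_moment(n, d) <= moment_bound(n, d)
```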

Corollary 4.3.

Let x1,x2,,xnx_{1},x_{2},\cdots,x_{n} be dd-wise independent random variables taking values in [1,1][-1,1] with 𝔼(xi)=0\mathbb{E}(x_{i})=0, where dd is even. Let Xn=i=1nxiX_{n}=\sum_{i=1}^{n}x_{i} and δ=d/n\delta=d/n. Then for any ε>0\varepsilon>0,

Pr(|i=1nxi|εn)4πd(δε2e)δn/2.\mathrm{Pr}(|\sum_{i=1}^{n}x_{i}|\geq\varepsilon n)\leq 4\sqrt{\pi d}(\frac{\delta}{\varepsilon^{2}e})^{\delta n/2}. (23)
Proof.

If x1,x2,,xnx_{1},x_{2},\cdots,x_{n} are dd-wise independent, then the dd-th moment of Xn=i=1nxiX_{n}=\sum_{i=1}^{n}x_{i} is the same as in the case where x1,x2,,xnx_{1},x_{2},\cdots,x_{n} are fully independent. Therefore, by Markov's inequality applied to (Xn)d(X_{n})^{d}, together with Lemma 4.2 and Stirling's approximation,

Pr(|i=1nxi|εn)𝔼((Xn)d)(εn)d4πd(δε2e)δn/2.\mathrm{Pr}(|\sum_{i=1}^{n}x_{i}|\geq\varepsilon n)\leq\frac{\mathbb{E}((X_{n})^{d})}{(\varepsilon n)^{d}}\leq 4\sqrt{\pi d}(\frac{\delta}{\varepsilon^{2}e})^{\delta n/2}. (24)

Lemma 4.4.

Let xx be a random vector, whose components uniformly take values in 𝔽q\mathbb{F}_{q} and are dd-wise independent. Let δ=d/n\delta=d/n. Then

Pr(x is not ε-biased)8(q1)πδn(2δε2e)δn/2.\mathrm{Pr}(x\text{\ is not $\varepsilon$-biased})\leq 8(q-1)\sqrt{\pi\delta n}\left(\frac{2\delta}{\varepsilon^{2}e}\right)^{\delta n/2}. (25)
Proof.

Let Re()\mathrm{Re}(\cdot) and Im()\mathrm{Im}(\cdot) denote the real part and imaginary part of a complex number, respectively. For each i[n]i\in[n] and a𝔽qa\in\mathbb{F}_{q}^{\ast}, Re(ωtr(axi))\mathrm{Re}(\omega^{\mathrm{tr}(a\cdot x_{i})}) and Im(ωtr(axi))\mathrm{Im}(\omega^{\mathrm{tr}(a\cdot x_{i})}) are real-valued discrete random variables with mean zero, taking values in [1,1][-1,1]. Therefore, by the union bound and Corollary 4.3,

Pr(x is not ε-biased)\displaystyle\mathrm{Pr}(x\text{\ is not $\varepsilon$-biased}) (26)
\displaystyle\leq a𝔽qPr(|i=1nωtr(axi)|εn)\displaystyle\sum_{a\in\mathbb{F}_{q}^{\ast}}\mathrm{Pr}(|\sum_{i=1}^{n}\omega^{\mathrm{tr}(a\cdot x_{i})}|\geq\varepsilon n)
\displaystyle\leq a𝔽q(Pr(|i=1nRe(ωtr(axi))|22εn)+Pr(|i=1nIm(ωtr(axi))|22εn))\displaystyle\sum_{a\in\mathbb{F}_{q}^{\ast}}\left(\mathrm{Pr}(|\sum_{i=1}^{n}\mathrm{Re}(\omega^{\mathrm{tr}(a\cdot x_{i})})|\geq\frac{\sqrt{2}}{2}\varepsilon n)+\mathrm{Pr}(|\sum_{i=1}^{n}\mathrm{Im}(\omega^{\mathrm{tr}(a\cdot x_{i})})|\geq\frac{\sqrt{2}}{2}\varepsilon n)\right)
\displaystyle\leq 8(q1)πδn(2δε2e)δn/2.\displaystyle 8(q-1)\sqrt{\pi\delta n}\left(\frac{2\delta}{\varepsilon^{2}e}\right)^{\delta n/2}.

Corollary 4.5.

Let 𝒞\mathcal{C} be a code of length nn, rate RR and dual distance d=δnd^{\perp}=\delta^{\perp}n over the field 𝔽q\mathbb{F}_{q}. Then for each ε>0\varepsilon>0, the number of codewords which are not ε\varepsilon-biased is not more than

8qπδn(2δε2e)δn/2qRn8q\sqrt{\pi\delta^{\perp}n}\cdot\left(\frac{2\delta^{\perp}}{\varepsilon^{2}e}\right)^{\delta^{\perp}n/2}\cdot q^{Rn}

for sufficiently large nn.

Proof.

If 𝒞\mathcal{C} has dual distance dd^{\perp}, then any d1d^{\perp}-1 columns of the generator matrix of 𝒞\mathcal{C} are linearly independent, which means that if we uniformly randomly choose a codeword from 𝒞\mathcal{C}, then any d1d^{\perp}-1 symbols of the codeword are independently uniform. Therefore, by Lemma 4.4 with d=d1d=d^{\perp}-1, the number of codewords that are not ε\varepsilon-biased is at most

Pr(c is not ε-biasedc𝒞)|𝒞|\displaystyle\mathrm{Pr}\left(\text{$c$ is not $\varepsilon$-biased}\mid c\in\mathcal{C}\right)\cdot|\mathcal{C}| (27)
\displaystyle\leq 8(q1)π(δn1)(2δ2nε2e)(δn1)/2qRn,\displaystyle 8(q-1)\sqrt{\pi(\delta^{\perp}n-1)}\cdot\left(\frac{2\delta^{\perp}-\frac{2}{n}}{\varepsilon^{2}e}\right)^{(\delta^{\perp}n-1)/2}\cdot q^{Rn},

which is less than

8qπδn(2δε2e)δn/2qRn8q\sqrt{\pi\delta^{\perp}n}\cdot\left(\frac{2\delta^{\perp}}{\varepsilon^{2}e}\right)^{\delta^{\perp}n/2}\cdot q^{Rn}

for sufficiently large nn. ∎
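The independence argument above can be seen concretely on a toy example (an illustration, not part of the proof): the binary $[7,4]$ Hamming code has dual distance $d^{\perp}=4$ (its dual is the $[7,3]$ simplex code, all of whose nonzero codewords have weight 4), so any $d^{\perp}-1=3$ coordinates of a uniformly random codeword are independent and uniform.

```python
from itertools import combinations, product

# Generator matrix of the [7,4] binary Hamming code; its dual (the simplex
# code) has minimum distance 4, so d_perp - 1 = 3 here.
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]

codewords = [tuple(sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7))
             for m in product([0, 1], repeat=4)]

# Any 3 coordinates of a uniformly random codeword are independent and
# uniform: every pattern in F_2^3 appears equally often (16 / 8 = 2 times).
for coords in combinations(range(7), 3):
    counts = {}
    for c in codewords:
        pat = tuple(c[i] for i in coords)
        counts[pat] = counts.get(pat, 0) + 1
    assert len(counts) == 8 and all(v == 2 for v in counts.values())
```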

5 Proof of Theorem 1.6

Before proving Theorem 1.6, we first give the following theorem.

Theorem 5.1.

Let 𝒞\mathcal{C} be an [n,Rn,δn]q[n,Rn,\delta n]_{q} code. If we perform a random snsn-shortening 𝒮\mathcal{S} to 𝒞\mathcal{C}, where s<Rs<R, then with high probability, the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is ε\varepsilon-biased, where ε=2(q1)q1q(q1qδ)+s1s\varepsilon=\frac{2(q-1)\sqrt{\frac{q-1}{q}\left(\frac{q-1}{q}-\delta\right)}+s}{1-s}.

Proof.

Denote $\varepsilon^{\prime}=2(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}$. From Lemma 4.1, we know that $|\mathcal{C}_{\varepsilon^{\prime}}|\leq q^{2}\delta n^{2}$. Hence, by a union bound,

Pr(𝒮 doesn’t hit all codewords in 𝒞ε)\displaystyle\mathrm{Pr}\left(\text{$\mathcal{S}$ doesn't hit all codewords in $\mathcal{C}_{\varepsilon^{\prime}}$}\right) (28)
\displaystyle\leq c𝒞εPr(c is not hit by 𝒮)\displaystyle\sum_{c\in\mathcal{C}_{\varepsilon^{\prime}}}\mathrm{Pr}\left(\text{$c$ is not hit by $\mathcal{S}$}\right)
\displaystyle\leq q2δn2(1δ)sn,\displaystyle q^{2}\delta n^{2}\cdot(1-\delta)^{sn},

which approaches 0 as $n$ tends to infinity. Hence, with high probability, every codeword in $\mathcal{C}_{\varepsilon^{\prime}}$ is hit by $\mathcal{S}$. From Lemma 2.14, the shortened code $\mathcal{C}^{[\mathcal{S}]}$ is $\varepsilon$-biased. ∎
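The hit-probability estimate behind this union bound (Lemma 2.15) can be checked directly: for a codeword of weight $w\geq\delta n$, the exact probability that a uniformly random $sn$-subset misses every nonzero coordinate is $\binom{n-w}{sn}/\binom{n}{sn}\leq(1-\delta)^{sn}$. A small illustrative sketch:

```python
from math import comb

def miss_probability(n, w, k):
    """Exact probability that a uniformly random k-subset S of [n] avoids
    all w nonzero coordinates of a codeword, i.e. S fails to 'hit' it."""
    if k > n - w:
        return 0.0
    return comb(n - w, k) / comb(n, k)

# Each factor (n - w - i) / (n - i) is at most (n - w) / n, which gives the
# bound Pr(c is not hit) <= (1 - delta)^(sn); check it on a few parameters.
for n, delta, s in [(100, 0.3, 0.1), (200, 0.5, 0.2), (500, 0.25, 0.05)]:
    w, k = int(delta * n), int(s * n)
    assert miss_probability(n, w, k) <= (1 - delta) ** k
```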

To get Theorem 1.6, we need to select an appropriate value for $s$ that satisfies the following conditions:

  1. The chosen $s$ does not cause a significant decrease in the rate $R$.

  2. $s$ satisfies the requirement imposed by $\varepsilon$.

Proof of Theorem 1.6.

Let s=min{γ1+γ,R2,ε2(q1)q1q(q1qδ)}s=\min\{\frac{\gamma}{1+\gamma},\frac{R}{2},\frac{\varepsilon}{2}-(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}\}. By Proposition 2.7, since smin{γ1+γ,R2}s\leq\min\{\frac{\gamma}{1+\gamma},\frac{R}{2}\}, the rate of 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} is

Rs1s>Rγ.\frac{R-s}{1-s}>R-\gamma. (29)

Rearranging the inequality q1qqq1(ε2(q1))2<δ\frac{q-1}{q}-\frac{q}{q-1}\left(\frac{\varepsilon}{2(q-1)}\right)^{2}<\delta, we get

(q1)q1q(q1qδ)<ε2,(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}<\frac{\varepsilon}{2}, (30)

Since $s\leq\frac{\varepsilon}{2}-(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}$, we have

2(q1)q1q(q1qδ)+s1s\displaystyle\frac{2(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}+s}{1-s} (31)
<\displaystyle< (q1)q1q(q1qδ)+ε21ε2+(q1)q1q(q1qδ)\displaystyle\frac{(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}+\frac{\varepsilon}{2}}{1-\frac{\varepsilon}{2}+(q-1)\sqrt{\frac{q-1}{q}(\frac{q-1}{q}-\delta)}}
<\displaystyle< ε.\displaystyle\varepsilon.

By Theorem 5.1, $\mathcal{C}^{[\mathcal{S}]}$ is $\varepsilon$-biased. ∎
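The bookkeeping in this choice of $s$ can be traced numerically. The sketch below follows the formulas from the proof with illustrative parameter values (not taken from the paper): $q=2$, $\delta=0.49$, $R=0.3$, $\varepsilon=0.2$, $\gamma=0.05$, which satisfy the distance hypothesis $\frac{q-1}{q}-\frac{q}{q-1}(\frac{\varepsilon}{2(q-1)})^{2}<\delta$.

```python
from math import sqrt

def shortened_bias_params(q, delta, R, eps, gamma):
    """Follow the proof's choice s = min{gamma/(1+gamma), R/2, eps/2 - A},
    where A = (q-1) * sqrt((q-1)/q * ((q-1)/q - delta))."""
    A = (q - 1) * sqrt((q - 1) / q * ((q - 1) / q - delta))
    s = min(gamma / (1 + gamma), R / 2, eps / 2 - A)
    new_bias = (2 * A + s) / (1 - s)   # bias bound from Theorem 5.1
    new_rate = (R - s) / (1 - s)       # rate after sn-shortening
    return s, new_bias, new_rate

s, b, r = shortened_bias_params(2, 0.49, 0.3, 0.2, 0.05)
assert 0 < s < 0.3 / 2      # a valid shortening fraction, s < R
assert b < 0.2              # the shortened code is eps-biased
assert r > 0.3 - 0.05       # the rate drops by less than gamma
```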

Proof of Theorem 1.7.

Since $\frac{\delta}{1-(1-\beta)R}>\frac{q-1}{q}-\frac{q}{q-1}\left(\frac{\varepsilon}{2(q-1)}\right)^{2}$, we can first choose an $s<(1-\beta)R$ such that $\frac{\delta}{1-s}>\frac{q-1}{q}-\frac{q}{q-1}\left(\frac{\varepsilon}{2(q-1)}\right)^{2}$. We then perform an $sn$-shortening $\mathcal{S}$ to $\mathcal{C}$. By Proposition 2.8 and Proposition 2.7, $\mathcal{C}^{[\mathcal{S}]}$ has relative distance at least $\frac{\delta}{1-s}>\frac{q-1}{q}-\frac{q}{q-1}\left(\frac{\varepsilon}{2(q-1)}\right)^{2}$ and rate at least $\frac{R-s}{1-s}>\beta R$. Then, applying Theorem 1.6 with a $\gamma$ chosen small enough that $\frac{R-s}{1-s}-\gamma>\beta R$ yields the desired result. ∎

6 Proof of Theorem 1.8

We first present an inequality concerning the $q$-ary entropy function $\mathrm{H}_{q}(x)$.

Lemma 6.1.

For any q=prq=p^{r}, 0<γ<140<\gamma<\frac{1}{4}, when 0<x<(1q)1γ0<x<(\frac{1}{q})^{\frac{1}{\gamma}}, Hq(x)<(1+2γ)xlogqx\mathrm{H}_{q}(x)<-(1+2\gamma)x\log_{q}x.

Proof.

When $q=2$ and $0<x<(\frac{1}{q})^{\frac{1}{\gamma}}$,

\frac{\mathrm{H}_{q}(x)}{-x\log_{q}x}-1=\frac{-(1-x)\log_{q}(1-x)}{-x\log_{q}(x)}\leq\frac{x\log_{q}e}{-x\log_{q}(x)}=\frac{1}{\ln(1/x)}<\frac{\gamma}{\ln 2}<2\gamma, (32)

where the first inequality uses $-(1-x)\ln(1-x)\leq x$, the strict inequality uses $\ln(1/x)>\frac{\ln 2}{\gamma}$, and the final step uses $\frac{1}{\ln 2}<2$. When $q\geq 3$,

xlogq(q1)xlogq(x)=logx(q1)<logx(q)<γ,(1x)logq(1x)xlogq(x)<logq(1x)xlogq(x)<xxlogq(x)=logx(q)<γ,\begin{gathered}\frac{x\log_{q}(q-1)}{-x\log_{q}(x)}=-\log_{x}(q-1)<-\log_{x}(q)<\gamma,\\ \frac{-(1-x)\log_{q}(1-x)}{-x\log_{q}(x)}<\frac{-\log_{q}(1-x)}{-x\log_{q}(x)}<\frac{x}{-x\log_{q}(x)}=-\log_{x}(q)<\gamma,\end{gathered} (33)

for 0<x<(1q)1γ0<x<(\frac{1}{q})^{\frac{1}{\gamma}}, and thus

\frac{\mathrm{H}_{q}(x)}{-x\log_{q}x}-1=\frac{x\log_{q}(q-1)}{-x\log_{q}(x)}+\frac{-(1-x)\log_{q}(1-x)}{-x\log_{q}(x)}<2\gamma. (34) ∎
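A quick numerical check of the lemma (illustrative only): for several $(q,\gamma)$ pairs in the lemma's range and several $x$ below the cutoff $(\frac{1}{q})^{\frac{1}{\gamma}}$, the claimed inequality $\mathrm{H}_{q}(x)<-(1+2\gamma)x\log_{q}x$ indeed holds.

```python
from math import log

def Hq(x, q):
    """q-ary entropy function H_q(x)."""
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)

# Check H_q(x) < -(1 + 2*gamma) * x * log_q(x) for x < (1/q)^(1/gamma),
# on a few (q, gamma) pairs with 0 < gamma < 1/4.
for q, gamma in [(2, 0.2), (3, 0.1), (4, 0.24)]:
    cutoff = (1 / q) ** (1 / gamma)
    for x in [cutoff / 2, cutoff / 10, cutoff / 100]:
        assert Hq(x, q) < -(1 + 2 * gamma) * x * log(x, q)
```

Note that for $q=2$ the first term $x\log_{q}(q-1)$ vanishes, matching the two cases in the proof.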

Proof of Theorem 1.8.

We set $\varepsilon^{\prime}=0.9\varepsilon$; since $\delta_{0}^{\perp}<\varepsilon^{\frac{1}{\gamma}}$ and $0.9\sqrt{e/2}>1$, this gives $\delta_{0}^{\perp}<(\frac{\sqrt{e}\varepsilon^{\prime}}{\sqrt{2}})^{\frac{1}{\gamma}}$. Let $s=\frac{R-(\frac{1}{2}-2\gamma)\mathrm{H}_{q}(\delta_{0}^{\perp})}{-\log_{q}(1-\delta)}$. We get

Pr(𝒮 doesn’t hit all codewords in 𝒞ε)\displaystyle\mathrm{Pr}\left(\text{$\mathcal{S}$ doesn't hit all codewords in $\mathcal{C}_{\varepsilon^{\prime}}$}\right) (35)
\displaystyle\leq c𝒞εPr(c is not hit by 𝒮)\displaystyle\sum_{c\in\mathcal{C}_{\varepsilon^{\prime}}}\mathrm{Pr}\left(\text{$c$ is not hit by $\mathcal{S}$}\right)
\displaystyle\leq |𝒞ε|(1δ)sn\displaystyle|\mathcal{C}_{\varepsilon^{\prime}}|\cdot(1-\delta)^{sn} (using Lemma 2.15)\displaystyle(\text{using Lemma \ref{lem_hitprob}})
\displaystyle\leq 8qπδ0n(2δ0(ε)2e)δ0n/2qRn(1δ)sn\displaystyle 8q\sqrt{\pi\delta_{0}^{\perp}n}\cdot\left(\frac{2\delta_{0}^{\perp}}{(\varepsilon^{\prime})^{2}e}\right)^{\delta_{0}^{\perp}n/2}\cdot q^{Rn}\cdot(1-\delta)^{sn} (using Corollary 4.5)\displaystyle(\text{using Corollary \ref{cor_upper2}})
=\displaystyle= 8qπδ0n(2δ0(ε)2e)δ0n/2qRnqlogq(1δ)sn\displaystyle 8q\sqrt{\pi\delta_{0}^{\perp}n}\cdot\left(\frac{2\delta_{0}^{\perp}}{(\varepsilon^{\prime})^{2}e}\right)^{\delta_{0}^{\perp}n/2}\cdot q^{Rn}\cdot q^{\log_{q}(1-\delta)sn}
=\displaystyle= 8qπδ0n((δ0)12γ)δ0n/2q(122γ)Hq(δ0)n\displaystyle 8q\sqrt{\pi\delta_{0}^{\perp}n}\cdot\left((\delta_{0}^{\perp})^{1-2\gamma}\right)^{\delta_{0}^{\perp}n/2}\cdot q^{(\frac{1}{2}-2\gamma)\mathrm{H}_{q}(\delta_{0}^{\perp})\cdot n} (using s=R(122γ)Hq(δ0)logq(1δ))\displaystyle(\text{using }s=\frac{R-(\frac{1}{2}-2\gamma)\mathrm{H}_{q}(\delta_{0}^{\perp})}{-\log_{q}(1-\delta)})
\displaystyle\leq 8qπδ0n((δ0)12γ)δ0n/2q(122γ)(1+2γ)δ0logqδ0n\displaystyle 8q\sqrt{\pi\delta_{0}^{\perp}n}\cdot\left((\delta_{0}^{\perp})^{1-2\gamma}\right)^{\delta_{0}^{\perp}n/2}\cdot q^{-(\frac{1}{2}-2\gamma)(1+2\gamma)\delta_{0}^{\perp}\log_{q}\delta_{0}^{\perp}\cdot n} (using Lemma 6.1)\displaystyle(\text{using Lemma \ref{lemma_Hinq}})
=\displaystyle= 8qπδ0n((δ0)12γ(δ0)(122γ)(1+2γ))δ0n\displaystyle 8q\sqrt{\pi\delta_{0}^{\perp}n}\cdot\left((\delta_{0}^{\perp})^{\frac{1}{2}-\gamma}\cdot(\delta_{0}^{\perp})^{-(\frac{1}{2}-2\gamma)(1+2\gamma)}\right)^{\delta_{0}^{\perp}n}
=\displaystyle= 8qπδ0n(δ0)4γ2δ0n.\displaystyle 8q\sqrt{\pi\delta_{0}^{\perp}n}\cdot(\delta_{0}^{\perp})^{4\gamma^{2}\delta_{0}^{\perp}n}.

Therefore, the probability in Equation (35) tends to 0 as $n$ approaches infinity.

Moreover, one can verify that s<0.9Rs<0.9R given s=R(122γ)Hq(δ0)logq(1δ)s=\frac{R-(\frac{1}{2}-2\gamma)\mathrm{H}_{q}(\delta_{0}^{\perp})}{-\log_{q}(1-\delta)} and R122γ1+0.9logq(1δ)Hq(δ0)R\leq\frac{\frac{1}{2}-2\gamma}{1+0.9\cdot\log_{q}(1-\delta)}\mathrm{H}_{q}(\delta_{0}^{\perp}). Hence, the shortened code 𝒞[𝒮]\mathcal{C}^{[\mathcal{S}]} has rate at least Rs1s>0.1R\frac{R-s}{1-s}>0.1R, and

s\displaystyle s <R\displaystyle<R (36)
<0.52γ1+logq(1δ)Hq(δ0)\displaystyle<\frac{0.5-2\gamma}{1+\log_{q}(1-\delta)}\mathrm{H}_{q}(\delta_{0}^{\perp})
<0.751+logq(1δ)δ0logq(δ0)(using Lemma 6.1)\displaystyle<-\frac{0.75}{1+\log_{q}(1-\delta)}\cdot\delta_{0}^{\perp}\cdot\log_{q}(\delta_{0}^{\perp})\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad(\text{using Lemma \ref{lemma_Hinq}})
<0.751+logq(1δ)(δ0)12(δ0)γ(δ0)14(logq(δ0))(using γ<14)\displaystyle<\frac{0.75}{1+\log_{q}(1-\delta)}\cdot(\delta_{0}^{\perp})^{\frac{1}{2}}\cdot(\delta_{0}^{\perp})^{\gamma}\cdot(\delta_{0}^{\perp})^{\frac{1}{4}}\cdot(-\log_{q}(\delta_{0}^{\perp}))\quad\quad(\text{using }\gamma<\frac{1}{4})
<148109ε4eln(2)\displaystyle<\frac{1}{48}\cdot\frac{10}{9}\varepsilon^{\prime}\cdot\frac{4}{e\ln(2)}
(using 0<δ0<min{(2+logq(1δ)36)2,ε1γ} and (δ0)14logq(δ0)4eln2)\displaystyle\quad(\text{using }0<\delta_{0}^{\perp}<\min\{\left(\frac{2+\log_{q}(1-\delta)}{36}\right)^{2},\varepsilon^{\frac{1}{\gamma}}\}\text{ and }-(\delta_{0}^{\perp})^{\frac{1}{4}}\cdot\log_{q}(\delta_{0}^{\perp})\leq\frac{4}{e\ln 2})
<0.05ε.\displaystyle<0.05\varepsilon^{\prime}.

By Lemma 2.14, $\mathcal{C}^{[\mathcal{S}]}$ is $\frac{\varepsilon^{\prime}+s}{1-s}$-biased, and since $s<0.05\varepsilon^{\prime}$, we have $\frac{\varepsilon^{\prime}+s}{1-s}<\frac{10}{9}\varepsilon^{\prime}<\varepsilon$. So, with high probability, $\mathcal{C}^{[\mathcal{S}]}$ is $\varepsilon$-biased. ∎
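The decay of the final bound in (35) can also be seen numerically. The sketch below (with illustrative parameters, not from the paper) evaluates $8q\sqrt{\pi\delta_{0}^{\perp}n}\cdot(\delta_{0}^{\perp})^{4\gamma^{2}\delta_{0}^{\perp}n}$: the first factor grows only like $\sqrt{n}$ while the second decays exponentially in $n$.

```python
from math import pi, sqrt

def failure_bound(n, q, delta0_perp, gamma):
    """Final upper bound from the proof:
    8 * q * sqrt(pi * d0 * n) * d0^(4 * gamma^2 * d0 * n)."""
    d0 = delta0_perp
    return 8 * q * sqrt(pi * d0 * n) * d0 ** (4 * gamma ** 2 * d0 * n)

# The sqrt(n) factor is swamped by the exponential decay, so the failure
# probability bound tends to 0 as n grows.
q, d0, gamma = 2, 0.01, 0.1
vals = [failure_bound(n, q, d0, gamma) for n in [10**4, 10**5, 10**6]]
assert vals[0] > vals[1] > vals[2]
assert vals[2] < 1e-6
```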

Proof of Corollary 1.9.

Given that $\delta>1-q^{-0.6}$, we have $\log_{q}(1-\delta)<-0.6$. We can select a universal constant $\eta$ such that $\frac{\frac{1}{2}-2\eta}{1+0.9\log_{q}(1-\delta)}>1$. Set $\gamma=\min\{\eta,\frac{\frac{1}{2}-2\eta}{1+0.9\log_{q}(1-\delta)}-1\}$. Invoking Theorem 1.8 again with $s=\frac{R-(\frac{1}{2}-2\gamma)\mathrm{H}_{q}(\delta_{0}^{\perp})}{-\log_{q}(1-\delta)}$, we obtain the same guarantee. ∎

References

  • [AGL23] Omar Alrabiah, Venkatesan Guruswami, and Ray Li. Randomly punctured Reed–Solomon codes achieve list-decoding capacity over linear-sized fields. arXiv preprint arXiv:2304.09445, 2023.
  • [BGL17] Valerio Bioglio, Frederic Gabry, and Ingmar Land. Low-complexity puncturing and shortening of polar codes. In 2017 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), pages 1–6. IEEE, 2017.
  • [BGM22] Joshua Brakensiek, Sivakanth Gopi, and Visu Makam. Generic Reed–Solomon codes achieve list-decoding capacity. arXiv preprint arXiv:2206.05256, 2022.
  • [CCLO23] Xue Chen, Kuan Cheng, Xin Li, and Minghui Ouyang. Improved decoding of expander codes. IEEE Transactions on Information Theory, 2023.
  • [FKS22] Asaf Ferber, Matthew Kwan, and Lisa Sauermann. List-decodability with large radius for Reed–Solomon codes. IEEE Transactions on Information Theory, 68(6):3823–3828, 2022.
  • [Gal62] Robert Gallager. Low-density parity-check codes. IRE Transactions on Information Theory, 8(1):21–28, 1962.
  • [GLS+22] Zeyu Guo, Ray Li, Chong Shangguan, Itzhak Tamo, and Mary Wootters. Improved list-decodability and list-recoverability of Reed–Solomon codes via tree packings. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 708–719. IEEE, 2022.
  • [GM22] Venkatesan Guruswami and Jonathan Mosheiff. Punctured low-bias codes behave like random linear codes. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 36–45. IEEE, 2022.
  • [GMR+20] Venkatesan Guruswami, Jonathan Mosheiff, Nicolas Resch, Shashwat Silas, and Mary Wootters. Sharp threshold rates for random codes. arXiv preprint arXiv:2009.04553, 2020.
  • [GST21] Eitan Goldberg, Chong Shangguan, and Itzhak Tamo. List-decoding and list-recovery of Reed–Solomon codes beyond the Johnson radius for any rate. arXiv preprint arXiv:2105.14754, 2021.
  • [GZ23] Zeyu Guo and Zihan Zhang. Randomly punctured Reed–Solomon codes achieve the list decoding capacity over polynomial-size alphabets. arXiv preprint arXiv:2304.01403, 2023.
  • [LDT21] Yang Liu, Cunsheng Ding, and Chunming Tang. Shortened linear codes over finite fields. IEEE Transactions on Information Theory, 67(8):5119–5132, 2021.
  • [MRRZ+20] Jonathan Mosheiff, Nicolas Resch, Noga Ron-Zewi, Shashwat Silas, and Mary Wootters. LDPC codes achieve list decoding capacity. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 458–469. IEEE, 2020.
  • [NvZ15] Peter Nelson and Stefan HM van Zwam. On the existence of asymptotically good linear codes in minor-closed classes. IEEE Transactions on Information Theory, 61(3):1153–1158, 2015.
  • [PP23] Aaron (Louie) Putterman and Edward Pyne. Pseudorandom linear codes are list decodable to capacity. arXiv preprint arXiv:2303.17554, 2023.
  • [Reg09] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography. J. ACM, 56(6):34:1–34:40, 2009.
  • [RW14] Atri Rudra and Mary Wootters. Every list-decodable code for high noise has abundant near-optimal rate puncturings. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, 2014.
  • [Shp09] Amir Shpilka. Constructions of low-degree and error-correcting ε-biased generators. Comput. Complex., 18(4):495–525, December 2009.
  • [SS96] Michael Sipser and Daniel A. Spielman. Expander codes. IEEE Transactions on Information Theory, 42(6):1710–1722, 1996.
  • [ST20] Chong Shangguan and Itzhak Tamo. Combinatorial list-decoding of Reed–Solomon codes beyond the Johnson radius. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 538–551, 2020.
  • [Woo13] Mary Wootters. On the list decodability of random linear codes with large error rates. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 853–860, 2013.
  • [YP17] Arti Yardi and Ruud Pellikaan. On shortened and punctured cyclic codes. arXiv preprint arXiv:1705.09859, 2017.