A Bound on the Minimal Field Size of LRCs,
and Cyclic
MR Codes That Attain It
Abstract
We prove a new lower bound on the field size of locally repairable codes (LRCs). Additionally, we construct maximally recoverable (MR) codes which are cyclic. While a known construction for MR codes has the same parameters, it produces non-cyclic codes. Furthermore, we prove both necessary conditions and sufficient conditions that specify when the known non-cyclic MR codes may be permuted to become cyclic, thus proving our construction produces cyclic MR codes with new parameters. Furthermore, using our new bound on the field size, we show that the new cyclic MR codes have optimal field size in certain cases. Other known LRCs are also shown to have optimal field size in certain cases.
Index Terms:
Distributed storage, locally repairable codes, maximally recoverable codes, cyclic codes.I Introduction
In large-scale cloud storage and distributed file systems, such as Amazon Elastic Block Store (EBS) and Google File System (GoogleFS), disk failures are the norm and not the exception, due to the sheer scale of the system. To protect the data integrity, coding theory is used to recover from data loss due to disk failures. The simplest solution for those systems is a straightforward replication of data packets across different disks. However, this solution is costly especially for large-scale systems since it suffers from a large storage overhead. As an alternative solution, erasure codes such as maximum distance separable (MDS) codes, may be employed as storage codes. These codes encode information symbols to symbols and store them across disks, and they can recover from the loss of any symbols. This scheme achieves a dramatic improvement in redundancy compared with replication. However, for MDS codes, even if one disk fails, the system needs to access surviving disks in order to recover the lost symbol, which makes the repair process costly.
One method to improve the repair efficiently, suggested in [17], is endow the code with a locality property. This property allows a failed symbol to be recovered by accessing only other symbols. Erasure codes with locality are also called locally repairable codes (LRCs). The original concept of locality only works when exactly one erasure occurs (that is, one disk fails). In the past decade, the notion of locality further generalized in several directions. For example, LRCs with -locality [31] allow an erased symbol to be recovered by reading other symbols, even if the repair set suffered more erasures. Other examples include: locality which guarantees disjoint multiple repairable sets (also named as availability) [40, 35, 6, 8], locality which has a hierarchical structure [36, 14], and unequal localities [26, 44, 22].
Other code properties are also desirable. For a given code length and dimension , we would like the Hamming distance to be as large as possible, in order to maximize erasure-correcting capabilities. Additionally, we would like the field size (or alphabet size) to be as small as possible, in order to reduce the computation complexity for coding and decoding. Other desirable properties may include a cyclic structure for the code, since it allows for fast encoding algorithms. Finally, even if the code has optimal distance, we would like to be able to correct some pre-determined erasure patterns beyond the minimum Hamming distance.
In the past a few years, many results have been obtained for LRCs. Upper bounds on the minimum Hamming distance were proved, e.g., Singleton-type bounds [17, 31, 41, 7], and bounds related with the alphabet size [5, 1]. Optimal LRCs (with respect to these bounds), were constructed, e.g., [24, 34, 38, 37, 42, 13, 28]. In [21, 9], lower bounds on the field size of optimal LRCs were derived for [21], and [9]. Among the known optimal LRCs, some of them also achieve order-optimal field size [25, 2, 43, 12] when , and [9] when . Otherwise, constructions of optimal cyclic LRCs were introduced in [39, 13, 32, 14, 33]. When considering pre-determined recoverable erasure patterns beyond the minimum Hamming distance, codes that can recover from all information-theoretically recoverable erasure patterns are called maximally recoverable (MR) codes [17], also known as partial MDS codes [3]. In [19], lower bounds on the field size requirement for MR codes were introduced. For explicit constructions of MR codes, the reader may refer to [3, 16, 15, 28, 20, 10, 18]. Notably, there are MR codes have order-optimal field size (with respect to the bound in [19]): [3] for a single global parity check (), [4, 19] for , [16] for and , and [10, 18] for a constant, and .
The above summary shows how subsets of the mentioned desired properties may be obtained simultaneously. However, to the best of our knowledge, there are no explicit constructions that achieve all them, namely, cyclic MR codes with optimal field size. In this paper, our motivation is to construct cyclic MR codes with optimal field size, or order-optimal field size. To this end, we work both on constructions for cyclic MR codes, and a theoretic bound on the field size of optimal LRCs (containing MR codes as special cases). In the first part of the paper we prove a new general bound for optimal LRCs. We compare our new bound with the known bounds, and show that it is tighter for some parameters. In the second part of the paper, we introduce a new construction for cyclic MR codes. Our construction produces cyclic MR codes that share the same parameters as one of the known non-cyclic constructions in [19]. We also show that under certain conditions, the non-cyclic construction from [19] can be permuted to become a cyclic code, whereas in other cases it cannot, thus proving our construction produces cyclic MR codes with new parameters. As a byproduct of the proof, we characterize the algebraic structure of repair sets for optimal cyclic LRCs, which results in strong new restrictions on the parameters of optimal cyclic LRCs. Finally, we return to review our bound on optimal LRCs, and show that our construction has an optimal field size when . Since the bound is for general LRCs, as a consequence we get that some known constructions have optimal field size when , a result which has not been claimed before.
The remainder of this paper is organized as follows. Section II introduces some preliminaries about LRCs. Section III proves a new bound on the field size of LRCs. Section IV describes a construction of cyclic MR codes, as well as sufficient and necessary conditions under which a known non-cyclic construction from [19] may be permuted to become cyclic. Section V concludes this paper with some remarks.
II Preliminaries
In this section, we present notation and some necessary known results, which are used throughout the paper. For a positive integer , we define . If is a positive integer, we denote
Thus, implicitly depends on , whose value should be understood from the context.
For any prime power , let denote the finite field of size , let denote the set of vectors of length over , and let denote the set of all possible matrices over .
An linear code, , over , is a -dimensional subspace of . Such a code may be specified as the row-space of a generator matrix , where is a column vector of length for all . Specifically, it is called an linear code if the minimum Hamming distance of the code is . For a subset , we define
The code can also be specified by a parity-check matrix , i.e., where . Given a non-empty set of coordinates, , the punctured code is the code obtained from by deleting the code symbols at positions . Thus, is generated by which is obtained from by deleting the columns at . Similarly, the shortened code is the code whose parity matrix is , namely, the matrix obtained from by deleting the columns at .
An linear code, , is said to be a cyclic code if implies that , where is the cyclic shift operator by one place. It is well known (see [27]) that a cyclic code with length over corresponds to a principal ideal of . Thus, let be generated by a monic polynomial , which is called the generator polynomial of . When , assume is a primitive th root of unity of , then the cyclic code can be also be determined by the roots of , i.e., .
We shall encounter many Vandermonde matrices in the following section. Since we use a broader-than-usual definition for such matrices, we give it here explicitly. Let be distinct elements. We say the following matrix is a Vandermonde matrix,
where is a permutation matrix, and where and are invertible diagonal matrices. It is well known that the rank of such a matrix is .
II-A Locally Repairable Codes
In [17], Gopalan et al. introduced a definition for the locality of code symbols. For , the th code symbol, , of an linear code, , is said to have locality if it can be recovered by accessing at most other symbols of . This has been generalized in [31] to the following definition:
Definition 1:
Let be an linear code, and let be a generator matrix for it. For , the th code symbol, , of , is said to have -locality if there exists a subset such that:
-
•
and ; and
-
•
the minimum Hamming distance of the punctured code is at least .
In that case, the set is also called a repair set of . The code is said to have information -locality if there exists with such that for each , has -locality. Furthermore, the code is said to have all-symbol -locality if all the code symbols have -locality.
Thus, the definition of symbol locality from [17] is the special case of in the definition from [31]. In [31] (and for the case , originally [17]), the following upper bound on the minimum Hamming distance of linear codes with information -locality is derived.
Theorem 1 ([31]):
For an linear code with information -locality,
Codes with information -locality are said to be optimal locally repairable codes (optimal LRCs) if their minimum Hamming distance attains the bound of Theorem 1 with equality. It is known that optimal LRCs with all-symbol -locality have a specific structure to their repair sets.
Theorem 2 ([37, 9]):
Let be an optimal LRC with all-symbol -locality. Let be the set of all possible repair sets. Write , for integers and , and . If , , and additionally, or , then there exists a subset , such that:
-
•
All are of cardinality , and is a partition of .
-
•
For any , is an MDS code.
Remark 1:
In [21], Guruswami et al. asked a fundamental interesting question: How long can an optimal LRC with -locality be? They derived the following upper bound on the code length.
Theorem 3 ([21]):
Let be an optimal LRC with all-symbol -locality. If , , , and additionally, or , then
(1) |
where , and .
In [9], this problem is further considered for optimal LRCs with all-symbol -locality, .
Theorem 4 ([9]):
Let , , , , and additionally, or , where all parameters are integers. Assume that there exists an optimal linear code with all-symbol -locality, and define . If , then
where can also be rewritten as .
II-B Maximally Recoverable Codes
Maximally recoverable (MR) codes are an extremal case of LRCs, that maximize the erasure-repair capability.
Definition 2:
Let be an code with all-symbol -locality, and define , where is a repair set for coordinate . The code is said to be a maximally recoverable (MR) code if is a partition of , and for any such that , the punctured code is an MDS code.
In general, for , are not required to be of the same size. However, from an application point of view, equal-sized repair sets simplify the implementation, bringing us to the following definition:
Definition 3:
Let be an MR code, as in Definition 2. If each is of size (implying ), we define
Then is said to be an -MR code.
We first note that it is easy to verify that -MR codes are optimal LRCs with all-symbol -locality. We can regard each codeword of an -MR code, as an array, by placing each repair set in as a row, when forms a partition of . In this way, -MR codes match the definition of partial MDS (PMDS) codes, as defined in [3]. When implemented in a distributed-storage setting, each entry of a codeword array corresponds to a sector, each column of the array corresponds to a disk, and each row to a stripe. Thus, an -MR code can recover from sector erasures in each stripe, and additional erased sectors anywhere. We mention in passing that a more restricted type of codes, called sector-disk (SD) codes, are capable of recovering from disk erasures, and additional erased sectors (see [30, 11]).
Paralleling the general case of optimal LRCs, it is interesting to ask what is the minimum alphabet size required by MR codes.
Lemma 1 ([19, Theorem I.1]):
Let be an -MR code, . If , then
where , and where and are regarded as constants. Additionally,
-
1.
If :
-
2.
If , , and :
-
3.
If , , and :
Remark 2:
For the case , the field size requirement of an -MR code may be as small as . This is attainable since the punctured code over any repair set together with the single global parity check is an MDS code when or , where with (see [9]).
Definition 4:
A family of -MR codes has order-optimal field size if it attains one of the bounds of Lemma 1 asymptotically for , or if it has for .
III A New Bound on Optimal LRCs
In this section we present a new bound on the parameters of optimal LRCs with all-symbol -locality. To that end, we first prove bounds for optimal LRCs with small minimum Hamming distance, distinguishing between the two cases of and . The proof strategy of both bounds is showing that the existence of such codes forces the existence of many subspaces, any two of which intersect only trivially. We then recall a parameter-reduction lemma, which reduces optimal LRCs with a large minimum Hamming distance into optimal LRCs with a smaller one. Combining these together results in the main bound. We note that the new bound is not specific to MR codes or to cyclic codes, but instead applies to optimal LRCs. The bound does, however, require certain divisibility conditions, which are common to several constructions of optimal LRCs, among them, MR codes.
Lemma 2:
Let be an optimal LRC with all-symbol -locality. If , then
Proof:
Denote and . By Theorem 2, the code has a parity-check matrix of the following form,
where , for and , and is parity-check matrix of an MDS code for . This implies that the matrices and , , have columns each when , and otherwise.
Let us consider the following square matrices,
where , , and . Since the minimum Hamming distance of is , any columns from are linearly independent. This implies that the matrices defined above have full rank.
Recall that is a parity-check matrix of an MDS code. Thus, is an invertible matrix. A similar claim follows for . Hence, after simple column and row operations, the full rank of implies that
has full rank, implying also that
(2) |
for , , and . We also mention that if either or , natural adjustments need to be made, that is, zeroing instead of , and instead of .
Next, assume , and , . We pick only columns from , which must therefore be linearly independent, giving us,
(3) |
We explain why the fourth equality holds in more detail. The column operations performed in order to obtain may be written as
where is the identity matrix. It then follows that
Since is a parity-check matrix for an MDS code, we have that the matrix is a generator matrix for that code. Hence, any columns of it are linearly independent. In particular, that means is invertible. We can therefore write,
This completes the detailed explanation for the fourth equality in (3). The main observation is that (3) gives
(4) |
for , and , . Again, if or , a natural adjustment needs to be made.
For the case , we also have a similar lemma.
Lemma 3:
Let be an optimal LRC with all-symbol -locality. If , then
Proof:
By Theorem 2, and after simple row operations, the code has a parity-check matrix of the following form,
where is the identity matrix, , , , , and is a parity-check matrix of an MDS code, for all .
Consider the following square matrices,
where , , and where and are the first columns of and , respectively. Since the minimum Hamming distance of is , any columns from are linearly independent, and in particular,
This implies that
By the size of the matrices, we also must have
and also
We now denote . Obviously, . Let us arbitrarily choose an -dimensional subspace such that , namely, and . For any vector , let denotes its projection onto , that is, is the unique vector such that , with . For any , we then construct from by replacing each column vector with its projection onto . It then follows, that for all , ,
and also
Let us construct the set of subspaces
By the previous discussion, contains subspaces of , each of dimension , any two of which intersect only trivially. Additionally, the sum of any two subspaces from , summed together with the fixed -dimensional subspace , gives . Thus,
completing the proof. ∎
The final component in our main bounding theorem is a parameter-reduction lemma. This lemma was proved in [9].
Lemma 4 ([9] Corollary 2):
Let , , , and additionally, or , where all parameters are integers. If there exists an optimal linear code with and all-symbol -locality, then there exists an optimal linear code with all-symbol -locality and parameters where .
Let us now state and prove our main bound. The next theorem gives a lower bound on the size of the field required for LRCs with all-symbol -locality.
Theorem 5:
Let be an optimal linear code with all-symbol -locality. Assume , , . If and then,
where is the smallest prime power greater or equal to . If and then
Proof:
Remark 3:
Assume that there exists an optimal LRC with all-symbol -locality, and define . Rewriting the bound of Theorem 4 (which we cite from [9]) in a slightly looser yet more convenient way,
Thus, when only and tend to infinity, and the other parameters are constant, if
then the exponent in the bound of Theorem 5 is higher, and it may outperform the known bound of Theorem 4 (where we denote and ).
Having seen that the new bound of Theorem 5 may provide an improvement, we focus on a single case. More specifically, the case of is of particular interest, since we can then use Theorem 5 to prove that some known LRCs have optimal field size.
We first consider some Tamo-Barg codes [38].
Lemma 5 ([38]):
Let be a prime power, , then there exists an optimal LRC with all-symbol -locality and parameters , where and .
Corollary 1:
Let be a code from Lemma 5 with and . If is not a prime power then has optimal field size.
Example 1:
Let , , , then by Lemma 5 there exists an optimal LRC with all-symbol -locality and parameters , which has optimal field size since is not a prime power.
We now examine a construction of cyclic optimal LRCs from [39].
Lemma 6 ([39]):
Let , , and with , where is prime power. Then there exists a cyclic optimal LRC with all-symbol -locality and parameters .
Corollary 2:
Let be a code from Lemma 6 with and . If neither , nor , are prime powers, then has optimal field size.
Example 2:
Let , , and . Then by Lemma 6, there exists a cyclic optimal LRC with all-symbol -locality, and parameters , which has optimal field size since both and are not prime powers.
Yet another construction of cyclic optimal LRCs comes from [13].
Lemma 7 ([13]):
Let , , , and , with an even integer, and where is prime power. Then there exists a cyclic optimal LRC with all-symbol -locality and parameters .
Corollary 3:
Let be a code from Lemma 7 with . Then has optimal field size.
Example 3:
[13] Let , , , , then there exists a cyclic optimal -LRC, which has optimal field size.
Note that an -MR code is also an optimal LRC. The following corollary can be derived directly from Lemma 2.
Corollary 4:
Let be an -MR code. Then .
IV Cyclic Maximally Recoverable (MR) Codes
We divide this section into two parts. In the first part we construct cyclic MR codes, and show that for certain parameters they have the exact optimal field size. The main idea behind our construction is to carefully choose the roots for the cyclic code we construct, in a way that produces an MR code. In the second part we study a known class of MR codes which are non-cyclic, but have the same parameters as the cyclic codes we construct. We then show that these non-cyclic codes can sometimes be permuted to obtain cyclic MR codes. For this part, as tools to prove the main results, we characterize the algebraic structure of the repair sets of cyclic optimal LRCs (Theorem 7) and the structure of punctured codes and shortened codes over repair sets (Corollary 7). On the one hand, we prove that for some parameters the known construction can be permuted to obtain cyclic MR codes by finding suitable permutations (Theorem 8). On the other hand, we show that such a permutation is not always possible (Theorem 9). By combining the two parts, we obtain our main result.
IV-A A New Construction
We immediately present our construction for cyclic MR codes. It is inspired by the construction of [39].
Construction A:
Let be integers, a prime power, , a primitive element, , and such that . Define
The constructed code, , is the cyclic code of length over with root set .
Our goal is now to show that the code from Construction A is indeed a cyclic MR code. However, in order to do so we require a technical lemma.
Lemma 8:
Assume the setting and notation of Construction A. Denote , and . Assume for . Then for any , , the matrix
has full rank.
Proof:
Assume to the contrary that there exists satisfying . Hence, the polynomials and have roots and , respectively. Note that implies that , for otherwise, and , but they each have distinct roots, a contradiction. By Vieta’s formula, we have
Hence, there exists an integer such that , i.e., . It follows that
and then
This contradicts the facts that and , and completes the proof. ∎
We can now prove that the constructed code is indeed a cyclic MR code.
Theorem 6:
Proof:
Denote , and . In the first step of our proof we contend that the following matrix is a parity-check matrix of ,
Define the following polynomial,
Clearly, . We then have
The preceding equation also means that for all ,
(5) |
Assume is a generator matrix for . Recall that the roots of are . Hence, for all and ,
Define, for all ,
Note that is a linear combination of the rows of matrix in (5). Thus, the facts , and hint
and so, . Combining this with the fact that is cyclic (and therefore, also ), for all , where we recall that is the cyclic left-shift operator. Thus, the first rows of contain codewords of . The remaining last two rows of correspond to parity checks for the roots and , both of which are roots of . If we now denote by the code whose parity-check matrix is , we can say . It remains to show that to complete the proof.
We first observe that since has rows,
(6) |
An inspection of reveals that has all-symbol -locality and the repair sets are given by for . Plugging (6) into Theorem 1 we obtain that the minimum distance of satisfies
(7) |
Let us first handle the case of . We contend that in that case, the minimum distance of is at least . Even if we ignore the two bottom rows of , the Vandermonde matrices in the columns corresponding to a repair set show that any columns of are linearly independent. Thus, a linearly dependent set of columns from requires at least columns from each repair set it intersects. If we try to pick linearly dependent columns from a single repair set, then taking into account also the bottom two rows of , the columns of a repair set also form a Vandermonde matrix (recall that ), and so columns of from the same repair set are still linearly independent. If instead we pick columns from more than one repair set, at least columns are required. Combined together, since , the smallest set of linearly dependent columns of contains at least columns, i.e., as claimed. Together with (7),
Again by Theorem 1, necessarily
Finally, we note that
Since , and they are of equal dimension, we have , and is a parity-check matrix for .
We turn to the case of . As in the previous case, a linearly dependent set of columns from requires at least columns from each repair set it intersects. However, this time, since , we cannot choose columns from the same repair set, since each repair set contains exactly columns. Thus, a set of linearly dependent columns of contains at least columns each from two repair sets. However, by Lemma 8, exactly columns each from two repair sets, still forms a set of linearly independent vectors. Thus, at least columns are required for a dependent set, namely, . As in the previous case, by (7) we have
and then
and , as desired.
In summary, we just proved the code is an optimal LRC which is cyclic. The fact that it is a -MR code follows directly from Lemma 8, since any erasure pattern hitting two repair sets with erasures each, corresponds to a full-rank set of columns from , and is therefore correctable. ∎
The cyclic MR codes by Construction A have optimal Hamming distance, and order-optimal field size with respect to the bound in Lemma 1-(1), where we consider as a constant. However, we can do better than that when , according to Corollary 4.
Corollary 5:
When , the cyclic MR codes generated by Construction A have optimal field size by Corollary 4, provided that neither , nor , are prime powers. When , the cyclic MR codes by Construction A have optimal Hamming distance, and order-optimal field size with respect to the bound in Lemma 1-(1), where we consider as a constant.
IV-B Turning Non-cyclic Codes into Cyclic Codes
Previous works that constructed non-cyclic -MR codes, for , did so with in [4], and later, with [19] (see also [23], that obtained for the special case of ). Of particular interest to us are the -MR codes from [19, Theorem IV.2]. These MR codes have the same parameters as Construction A. However, they are not cyclic MR codes directly. In what follows, we shall attempt to determine whether the MR codes generated in [19, Theorem IV.2] can be rearranged to become cyclic codes. Along the way, we shall prove some interesting facts concerning cyclic optimal LRCs.
As a first step, we show the repair sets of cyclic optimal LRCs are severely restricted.
Theorem 7:
Let be a cyclic optimal LRC with parameters and all-symbol -locality. Write with . If , then for any repair set , and any , either or .
The technical proof of Theorem 7 is deferred to the appendix. As an immediate consequence, we now show that the repair sets of cyclic optimal LRCs must be cosets of .
Corollary 6:
Let be a cyclic optimal LRC with parameters and all-symbol -locality (where, to avoid trivial cases, we assume that does not have all-symbol -locality). Let , . If , then , , and the repair sets of are
for all .
Proof:
Let be a repair set such that . By Theorem 7 we have for any . Thus, is a subgroup of the cyclic group . Note that . If , then the fact , for , are also repair sets for , implies that has all-symbol -locality, which contradicts our assumption. Thus, , , and .
Let be any repair set of . the same analysis shows that . Note that for any is still a repair set of . Now it is easy to check that is a -subgroup of , i.e., , which completes the proof. ∎
Remark 4:
Corollary 6 shows that the condition is not a restriction when , but rather a consequence.
Remark 5:
Further building on Corollary 6, we can now show that cyclic optimal LRCs have a parity-check matrix with a nice form.
Corollary 7:
Let be a cyclic optimal LRC with parameters , and all-symbol -locality (where, to avoid trivial cases, we assume that does not have all-symbol -locality). If , then a parity-check matrix of can be given in the following form:
where are column vectors, is a parity-check matrix of a cyclic code with minimum Hamming distance of at least , , , and corresponds to the global parity checks. Moreover, the punctured codes satisfy for all , where . Similarly, the shortened codes satisfy for all , where is the code whose parity check matrix contains only the columns corresponding to from .
Proof:
The parity-check matrix follows directly from Corollary 6 and the fact that for is also a cyclic code. Additionally, since is cyclic, trivially we have and . ∎
Now we recall a construction, which was first introduced in [19].
Construction B ([19]):
Let be a prime power, , , , , and . Let be a primitive element of , , and , . The following parity-check matrix defines an -MR code,
One cannot avoid seeing a similarity between the parity-check matrix of Construction B, and the parity-check matrix found in Theorem 6 for the code from Construction A. However, the code from Construction B is not cyclic, but rather quasi-cyclic. In what follows we study whether permuting it produces a cyclic code.
Let denote the set of permutations over , for any . Each permutation in may be thought of as a bijection in , namely, a bijection from to . Let be a code of length , whose coordinates are indexed by . If is a permutation, we define the permutation of by as
If is a cyclic code, it is natural to ask for what permutations , is also cyclic. Apart from the trivial cyclic shifts of , a natural subset of candidate permutations are multipliers, namely,
Pálfy [29] proved that, in many cases, multipliers are the essential permutations keeping a code cyclic:
Lemma 9 ([29]):
Consider codes of length whose coordinates are indexed by .
-
1.
When or , for all cyclic codes , if , , is also a cyclic code, then there is a multiplier such that .
-
2.
When and , there exists a cyclic code , and such that is cyclic, but for all multipliers .
Here denotes Euler’s totient function.
Drawing inspiration from Lemma 9, we address the (different) question of finding permutations from that turn the non-cyclic code of Construction B into a cyclic code. Recall that in the setting of Construction B, , and . We now define a set of functions from to as follows:
where we assume , , , and . We then define the set,
and where by abuse of notation, denotes the mapping that maps .
We would like to make some easy observations concerning the elements of . Denote . Then is an additive subgroup of , and . Let us denote the cosets of by , for all . We now note that is a bijection from to if and only if . Thus, (i.e., the restriction of to ) is a bijection from to . With the extra requirement that , we have that distinct cosets are mapped to distinct cosets , and hence, , namely, comprises of permutations over .
Theorem 8:
Assume the notation and setting of Construction B, and let be the resulting code when . Then there exists a permutation such that is a cyclic code if and only if .
Proof:
We first observe that if and only if the equation has at least one solution . We now prove both directions of the claim.
In the first direction, assume has a solution . Consider the permutation for which , and , where . Applying to the coordinates of , the parity-check matrix from Construction B becomes
which is a parity-check matrix for . Here we used to get that . Now, by dividing some of the rows with appropriate scalars, another parity-check matrix for is the following:
It is now clear that is cyclic, since implies , i.e., implies .
In the second direction, assume that there exists such that is cyclic. Assume to the contrary that for all . Let us write , with , and . We can now write,
with , and . Let us further define . We can now apply to the order of the columns of from Construction B to obtain a parity-check matrix for the code . By rearranging the order of the rows of the matrix, we may write,
Recall that, by construction, the multiplicative order of is . Since , we also have that , for all . Taking into account that , namely, , we have that are all distinct. Let us look at the columns of that correspond to for some . These columns, after removing all-zero rows, form a (transposed) Vandermonde matrix.
(8) |
where is a permutation matrix that moves the second row from the bottom to the top. Since are all distinct, the rows of (8) are linearly independent. Thus, a linear combination of the rows of that results in zeros in all the positions of must be a trivial combination.
We now use the fact that is a parity-check matrix for a cyclic code. By adding cyclic rotations of existing rows in , we obtain which is also a parity-check matrix for the same code,
where . However, these added rows must be linear combinations of the rows of . Since they contain zeros in all the entries of , , these linear combinations cannot use the last two rows of . It now follows that
After the same treatment as (8), this gives
If , then by the fact that , we would have a contradiction to the rank equality above. It follows that
(9) |
for all .
We now repeat the argument, with an extra step. Take and add to it a cyclic rotation of its last row to obtain the following parity-check matrix for the same code,
where . Again, this added row is linearly dependent on the others, and so, looking at the columns of we obtain the rank equality
Again, using the same steps as in (8), we get
As before, if , then by the fact that , we would have a contradiction to the rank equality above. If follows that
(10) |
for all .
The combination of (9) and (10) implies that for all . We observe that
By (9),
But now, since , we conclude that
for all .
Now that we know that , we can write as
Looking at the columns of that correspond to , , once again we observe that the non-zero rows are equivalent to a (transposed) Vandermonde matrix
Hence, these rows are linearly independent. Let us now add a cyclically shifted version of the last row of , to obtain yet another parity-check matrix for the code,
The added row is a linear combination of the original rows of . Assume , . If we look at the columns of in we see that
Since the non-zero rows in the columns of are linearly independent, this linear combination is unique. Similarly, for we get
which is again unique. However, all these linear combinations must coincide simultaneously when viewing the entire , and so
Multiplying all of them together we get
However, this means that
which completes the proof. ∎
While the last theorem shows us a sufficient condition under which the known code of Construction B may be permuted to a cyclic code, the next theorem shows us that for almost all cases, this condition is in fact necessary. First, we bring a technical proposition.
Proposition 1:
Let , , be positive integers with , and . If
(11) |
and one of the following conditions holds,
-
1.
and
-
2.
and
-
3.
and is odd
then we have .
Proof:
For Case 1, since there exists an such that . By (11), we have
Obviously,
and so,
and since ,
Since and , we must have . It remains to show that . Assume to the contrary that . Similarly, by (11), we have
Again,
which implies that,
and since ,
However, implies that
which is a contradiction. Thus, we have and .
If then we have , hence . However, this is impossible since and . Similarly, if then we have , hence . Again, this is also impossible since and . Thus, we have .
Theorem 9:
Assume the notation and setting of Construction B. Let be the resulting code. Denote with and . Additionally, let or . Furthermore, assume that , and that one of the following holds:
-
1.
and
-
2.
and
-
3.
and is odd
Then there exists a permutation such that is cyclic only if .
Proof:
Assume is a permutation such that is cyclic. By Construction B, we have that , , are exactly the repair sets of . Thus, the image sets are exactly the repair sets of . Note that is an optimal LRC, which means that is also optimal. By Corollary 6, we have
Thus, there exists a sequence of permutations, over , for all , and , such that for all ,
(12) |
which also implies that , and is a permutation of . By assumption, is cyclic. Hence, is also cyclic, for each .
By Construction B, any punctured code, , , is a subcode of the code with the parity-check matrix . Recall that is an optimal LRC. Hence, by Theorem 2, we have that this punctured code, , is an MDS code. This implies that its parity-check matrix is exactly . Since this matrix clearly does not depend on , we have , for all . Additionally, since , all the punctured codes are cyclic.
By Corollary 7, we have that for all , and are all cyclic codes. Thus, a parity-check matrix of may be given by
(13) |
We now have that maps the cyclic code into a cyclic code , where we view these codes as indexed by . Then, by Lemma 9, we can find a multiplier permutation from that also maps to . More concretely, there exists , with which we define a permutation
for all and . For this permutation we have for all . Now, a parity-check matrix for may be given by
and it must be row-equivalent to from (13).
We switch our view from the punctured codes to the shortened codes of . As above, the following shortened codes are all equal, , for all . A parity-check matrix for them may be written as
By the multiplicative order of , the codes are all cyclic. By Corollary 7, , for all , and a parity-check matrix for them may be given by
(14) |
where , , , are the same as those in (13). Once again, are all cyclic. Hence, by Lemma 9 we can find a multiplier permutation from that also maps to . Namely, there exists , with which we define
for all and , such that , for all . A parity-check matrix for may be given by
and it must be row-equivalent to from (14).
By the properties of Vandermonde matrices we have for all ,
where the last equality holds by the row equivalence of and , as well as the row equivalence of and . Since , the above equality implies
By construction, the multiplicative order of is , and so
where the last equality follows from the fact that . Since ,
Thus,
Since , we have
Then, by Proposition 1, we have .
Denote . Thus, . We now know that the following two matrices are row equivalent,
(15) |
for all . Recall that and have the same order, , i.e., the entries of the matrices in (15) belong to the field . Hence, can be represented as a linear combination
(16) |
where for all , . We also highlight the fact that for all , for otherwise we would have that the matrix on the right has rank whereas the one on the left has rank . For convenience, let us define , where , .
After focusing on shortened and punctured codes, let us look again at the entire code. If we permute the columns of the parity-check matrix of using , we arrive at the following parity-check matrix for , to (14),
where , are the same as in (12), and , , , are the same as in (13). By (16) and the equivalence of and , the matrix is row equivalent with
Since is also a parity-check matrix for , which is a cyclic code, adding a dependent row to which is a cyclic shift of another row, does not change the code. Hence, we look at the following parity-check matrix for ,
Let us now denote by the bottom row of , and by the bottom two rows of . We recall that , and hence,
We now observe that the last row of may be shown as a linear combination of the preceding two rows. More precisely, for all ,
where
(17) | ||||
Since is row equivalent with , and (i.e., full rank), the linear combination above is the unique linear combination of the rows of that gives . This linear combination does not use the first rows of .
Looking at the entire matrix (instead of focusing on the projections onto ), once again, since is row equivalent with , must be linear combination of the rows of . Since in each projection onto there is a unique linear combination, all these must simultaneously agree. In particular, this means
We recall that for all , and is primitive. Thus, we may write
(18) |
for some integer . Also, by (17),
(19) |
Now, combining (18) and (19) we get
Thus,
This, in turn, implies that , as we wanted to prove. ∎
To conclude this section, we make use of Theorem 9 in order to show that Construction A may produce cyclic MR codes with new parameters. Namely, in certain case, the construction of [19], which produces codes with the same parameters as our Construction A, results in codes that are neither cyclic, nor can be permuted to become cyclic.
V Conclusion
In this paper, we proved a new lower bound on the field size of optimal LRCs. As a byproduct, when we were able to prove that some known code constructions actually have optimal field size (where we further had to assume that the field size minus or is not a prime power). We then constructed cyclic MR codes. When , these codes also attain the new bound with equality, and therefore have optimal field size (again, assuming the same number-theoretic condition). We concluded by showing a known quasi-cyclic MR code, with the same parameters as our cyclic construction, may sometimes be permuted to become cyclic, and in other cases it may not.
Many open questions remain. First and foremost, the construction for a cyclic MR code in this paper only works for the case of two global parity checks, i.e., . However, in the non-cyclic case, there are a few known constructions of MR codes with . Finding cyclic MR codes with is still an open question.
As a second open question we mention our lower bound on the field size of optimal LRCs. We were able to show it is tight only when . Thus, finding out whether it is tight for cases in which , or improving it, remains widely open. We leave these questions and others for future work.
In this appendix, we shall prove Theorem 7. To this end, we first recall some definitions and lemmas from [9].
Throughout the appendix we shall assume the coordinate of code of length are indexed by , and where operations on coordinates are required, they shall be made modulo . Let with . Denote the set of all the possible repair sets for an LRC with all-symbol -locality as
Lemma 10 ([8], Lemma 7):
Let be an linear code with all-symbol -locality. If for a subset , and for all ,
then we have
For cyclic LRCs we have the following simple fact.
Lemma 11:
Let be a cyclic LRC. If is a repair set of , then is also a repair set of , for all .
Proof:
Since is cyclic, for any . The claim follows immediately by definition. ∎
We are now ready for the main proof.
Proof:
Assume to the contrary that there exists a repair set and such that
(20) |
As an auxiliary claim, we contend that for any there exists a -subset of that satisfies one of the following properties:
-
P1.
There exists a subset and such that
(21) -
P2.
The following inequalities hold:
(22) (23)
We proceed to prove this auxiliary claim by induction on .
For the induction base, consider . In that case, choose . By (20), if additionally, , then P1 holds. Otherwise, by Lemma 10, P2 holds. Thus, the induction base is proved. Now arbitrarily choose .
For the induction hypothesis, assume the claim holds for , and let be a set that satisfies the claim in that case, i.e., . For the induction step, we prove it also holds for , as long as , namely, that there exists a repair set of repair sets, , containing repair sets, that satisfies P1 or P2. We shall make an educated guess as to what might be, which will work in most cases. When it does not, we shall offer a correction to our initial choice of .
Since we have
Hence, there exists an with . As our initial guess, we now define the following:
We observe that since they are cyclic rotations by the same amount of and , respectively, which by (20), are two distinct sets. Additionally, , and since , it follows that . Hence, .
If satisfies P1 then trivially so does and the claim follows. Assume then that only satisfies P2. In particular, by (23),
(24) |
Again, if satisfies P1 then we are done. Otherwise, assume that does not satisfy P1, which means
(25) |
or
(26) |
for .
If (25) holds for , then the fact that
means that
(27) |
Recall that , but note that , which implies that
Thus, we can find a -subset such that and
We therefore have,
(28) |
where the second inequality holds by (24). Note that since satisfies P2, by (22),
(29) |
where the last inequality follows from (27). Recall that , hence
It then follows that there exists a repair set such that
We now correct our initial guess, and for this case only, set . We therefore have,
By the last inequality, there exists a -subset , and then
(30) |
where the second inequality holds by (28). By (29) we have,
(31) |
In total, the combination of (30) and (31) shows that the modified satisfies P2.
We now return to the original . If satisfies (25), then a similar argument shows we can build a modified for which P2 holds.
As a final case, we consider the situation where both and satisfy (26). In that case, there exist -subsets with and , for . Thus, we have
(32) |
where the last inequality holds by (23). Additionally, by (22), and since ,
(33) |
By combining (32) and (33) we learn that satisfies P2, and the auxiliary claim follows.
We turn to prove the main claim. The proof is divided into two cases depending on properties P1 and P2:
Case 1: P1 holds for some . Let be a -subset, , and , such that (21) holds. By that equation, we can choose a subset such that . Of all such subsets, let us choose to be minimal, namely, for any . Thus, by Lemma 10 we have
(34) |
Assume contains repairs sets, . Since each repair set in has rank at most , and the union of repair sets has rank at most , we can extend to a -set such that , such that each added repair set increases the overall rank, i.e.,
(35) |
for all .
Let be a -subset of and . In a similar fashion to the analysis above, we have
where to prove the first inequality we use the fact that . Repeating the processing, at each iteration adding , we can conclude that
(36) |
Recall that the rank of the union of repair sets, and in particular, , satisfies . Thus, we have a set of coordinates , with and . Consider the set . By (36),
i.e.,
(37) |
Recall now that for an code ,
Thus, by (37), for our code
However, since our code is an optimal LRC,
and thus, a have reached a contradiction.
Case 2: P2 holds for all -subsets , where . Assume first that is odd. Denote , and arbitrarily pick , with . By (22) and (23),
where the last inequality holds by the condition , and the fact that is odd. Thus, we can extend to with , such that
and
The fact that means that we can find a set with and . Then,
As in Case 1, we obtain
which contradicts the minimum distance of an optimal LRC being
Assume now that is even. Denote , and arbitrarily pick , with . By (22) and (23),
where the last inequality holds by the condition . Thus, we can extend to with , such that
and
We now continue exactly as in the case of odd to obtain a contradiction.
In all of the above cases, we have reached a contradiction. Hence, our assumption that there exist and such that is incorrect, and the main claim of the theorem follows. ∎
References
- [1] A. Agarwal, A. Barg, S. Hu, A. Mazumdar, and I. Tamo, “Combinatorial alphabet-dependent bounds for locally recoverable codes,” IEEE Transactions on Information Theory, vol. 64, no. 5, pp. 3481–3492, 2018.
- [2] A. Beemer, R. Coatney, V. Guruswami, H. H. López, and F. Piñero, “Explicit optimal-length locally repairable codes of distance 5,” in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2018, pp. 800–804.
- [3] M. Blaum, J. L. Hafner, and S. Hetzler, “Partial-MDS codes and their application to RAID type of architectures,” IEEE Transactions on Information Theory, vol. 59, no. 7, pp. 4510–4519, 2013.
- [4] M. Blaum, J. S. Plank, M. Schwartz, and E. Yaakobi, “Construction of partial MDS and sector-disk codes with two global parity symbols,” IEEE Transactions on Information Theory, vol. 62, no. 5, pp. 2673–2681, 2016.
- [5] V. R. Cadambe and A. Mazumdar, “Bounds on the size of locally recoverable codes,” IEEE transactions on information theory, vol. 61, no. 11, pp. 5787–5794, 2015.
- [6] H. Cai, M. Cheng, C. Fan, and X. Tang, “Optimal locally repairable systematic codes based on packings,” IEEE Transactions on Communications, vol. 67, no. 1, pp. 39–49, 2019.
- [7] H. Cai, C. Fan, Y. Miao, M. Schwartz, and X. Tang, “Optimal locally repairable codes: An improved bound and constructions,” IEEE Transactions on Information Theory, vol. 68, no. 8, pp. 5060–5074, Aug. 2022.
- [8] H. Cai, Y. Miao, M. Schwartz, and X. Tang, “On optimal locally repairable codes with multiple disjoint repair sets,” IEEE Transactions on Information Theory, vol. 66, no. 4, pp. 2402–2416, 2020.
- [9] ——, “On optimal locally repairable codes with super-linear length,” IEEE Transactions on Information Theory, vol. 66, no. 8, pp. 4853–4868, 2020.
- [10] ——, “A construction of maximally recoverable codes with order-optimal field size,” IEEE Transactions on Information Theory, vol. 68, no. 1, pp. 204–212, 2022.
- [11] H. Cai and M. Schwartz, “On optimal locally repairable codes and generalized sector-disk codes,” IEEE Transactions on Information Theory, vol. 67, no. 2, pp. 686–704, 2021.
- [12] B. Chen, W. Fang, S.-T. Xia, J. Hao, and F.-W. Fu, “Improved bounds and Singleton-optimal constructions of locally repairable codes with minimum distance 5 and 6,” IEEE Transactions on Information Theory, vol. 67, no. 1, pp. 217–231, 2021.
- [13] B. Chen, S.-T. Xia, J. Hao, and F.-W. Fu, “Constructions of optimal cyclic locally repairable codes,” IEEE Transactions on Information Theory, vol. 64, no. 4, pp. 2499–2511, 2018.
- [14] Z. Chen and A. Barg, “Cyclic LRC codes with hierarchy and availability,” in 2020 IEEE International Symposium on Information Theory (ISIT). IEEE, 2020, pp. 616–621.
- [15] R. Gabrys, E. Yaakobi, M. Blaum, and P. H. Siegel, “Constructions of partial MDS codes over small fields,” IEEE Transactions on Information Theory, vol. 65, no. 6, pp. 3692–3701, 2019.
- [16] P. Gopalan, C. Huang, B. Jenkins, and S. Yekhanin, “Explicit maximally recoverable codes with locality,” IEEE Transactions on Information Theory, vol. 60, no. 9, pp. 5245–5256, 2014.
- [17] P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, “On the locality of codeword symbols,” IEEE Transactions on Information Theory, vol. 58, no. 11, pp. 6925–6934, 2012.
- [18] S. Gopi and V. Guruswami, “Improved maximally recoverable LRCs using skew polynomials,” arXiv preprint arXiv:2012.07804, 2020.
- [19] S. Gopi, V. Guruswami, and S. Yekhanin, “Maximally recoverable LRCs: A field size lower bound and constructions for few heavy parities,” IEEE Transactions on Information Theory, vol. 66, no. 10, pp. 6066–6083, 2020.
- [20] V. Guruswami, L. Jin, and C. Xing, “Constructions of maximally recoverable local reconstruction codes via function fields,” IEEE Transactions on Information Theory, vol. 66, no. 10, pp. 6133–6143, 2020.
- [21] V. Guruswami, C. Xing, and C. Yuan, “How long can optimal locally repairable codes be?” IEEE Transactions on Information Theory, vol. 65, no. 6, pp. 3662–3670, 2019.
- [22] J. Hao and S.-T. Xia, “Constructions of optimal binary locally repairable codes with multiple repair groups,” IEEE Communications Letters, vol. 20, no. 6, pp. 1060–1063, 2016.
- [23] G. Hu and S. Yekhanin, “New constructions of SD and MR codes over small finite fields,” in 2016 IEEE International Symposium on Information Theory (ISIT). IEEE, 2016, pp. 1591–1595.
- [24] C. Huang, M. Chen, and J. Li, “Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems,” ACM Transactions on Storage (TOS), vol. 9, no. 1, pp. 1–28, 2013.
- [25] L. Jin, “Explicit construction of optimal locally recoverable codes of distance 5 and 6 via binary constant weight codes,” IEEE Transactions on Information Theory, vol. 65, no. 8, pp. 4658–4663, 2019.
- [26] G. Kim and J. Lee, “Locally repairable codes with unequal local erasure correction,” IEEE Transactions on Information Theory, vol. 64, no. 11, pp. 7137–7152, 2018.
- [27] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Elsevier, 1977.
- [28] U. Martínez-Peñas and F. R. Kschischang, “Universal and dynamic locally repairable codes with maximal recoverability via sum-rank codes,” IEEE Transactions on Information Theory, vol. 65, no. 12, pp. 7790–7805, 2019.
- [29] P. P. Pálfy, “Isomorphism problem for relational structures with a cyclic automorphism,” European Journal of Combinatorics, vol. 8, no. 1, pp. 35–43, 1987.
- [30] J. S. Plank and M. Blaum, “Sector-disk (SD) erasure codes for mixed failure modes in RAID systems,” ACM Transactions on Storage (TOS), vol. 10, no. 1, pp. 1–17, 2014.
- [31] N. Prakash, G. M. Kamath, V. Lalitha, and P. V. Kumar, “Optimal linear codes with a local-error-correction property,” in 2012 IEEE International Symposium on Information Theory Proceedings. IEEE, 2012, pp. 2776–2780.
- [32] J. Qian and L. Zhang, “New optimal cyclic locally recoverable codes of length ,” IEEE Transactions on Information Theory, vol. 66, no. 1, pp. 233–239, 2019.
- [33] J. Qiu, D. Zheng, and F.-W. Fu, “New constructions of optimal cyclic locally repairable codes from their zeros,” IEEE Transactions on Information Theory, vol. 67, no. 3, pp. 1596–1608, 2021.
- [34] A. S. Rawat, O. O. Koyluoglu, N. Silberstein, and S. Vishwanath, “Optimal locally repairable and secure codes for distributed storage systems,” IEEE Transactions on Information Theory, vol. 60, no. 1, pp. 212–236, 2014.
- [35] A. S. Rawat, D. S. Papailiopoulos, A. G. Dimakis, and S. Vishwanath, “Locality and availability in distributed storage,” IEEE Transactions on Information Theory, vol. 62, no. 8, pp. 4481–4493, 2016.
- [36] B. Sasidharan, G. K. Agarwal, and P. V. Kumar, “Codes with hierarchical locality,” in 2015 IEEE International Symposium on Information Theory (ISIT). IEEE, 2015, pp. 1257–1261.
- [37] W. Song, S. H. Dau, C. Yuen, and T. J. Li, “Optimal locally repairable linear codes,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 5, pp. 1019–1036, 2014.
- [38] I. Tamo and A. Barg, “A family of optimal locally recoverable codes,” IEEE Transactions on Information Theory, vol. 60, no. 8, pp. 4661–4676, 2014.
- [39] I. Tamo, A. Barg, S. Goparaju, and R. Calderbank, “Cyclic LRC codes, binary LRC codes, and upper bounds on the distance of cyclic codes,” International Journal of Information and Coding Theory, vol. 3, no. 4, pp. 345–364, 2016.
- [40] A. Wang and Z. Zhang, “Repair locality with multiple erasure tolerance,” IEEE Transactions on Information Theory, vol. 60, no. 11, pp. 6979–6987, 2014.
- [41] ——, “An integer programming-based bound for locally repairable codes,” IEEE Transactions on Information Theory, vol. 61, no. 10, pp. 5280–5294, 2015.
- [42] T. Westerbäck, R. Freij-Hollanti, T. Ernvall, and C. Hollanti, “On the combinatorics of locally repairable codes via matroid theory,” IEEE Transactions on Information Theory, vol. 62, no. 10, pp. 5296–5315, 2016.
- [43] C. Xing and C. Yuan, “Construction of optimal locally recoverable codes and connection with hypergraph,” in 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
- [44] A. Zeh and E. Yaakobi, “Bounds and constructions of codes with multiple localities,” in 2016 IEEE International Symposium on Information Theory (ISIT). IEEE, 2016, pp. 640–644.