Density Devolution for Ordering Synthetic Channels
Abstract
Constructing a polar code is all about selecting a subset of rows from a Kronecker power of . It is known that, under successive cancellation decoder, some rows are Pareto-better than the other. For instance, whenever a user sees a substring in the binary expansion of a row index and replaces it with , the user obtains a row index that is always more welcomed. We call this a “rule” and denote it by . In present work, we first enumerate some rules over binary erasure channels such as and and . We then summarize them using a “rule of rules”: if is a rule, where and are arbitrary binary strings, then and are rules. This work’s main contribution is using field theory, Galois theory, and numerical analysis to develop an algorithm that decides if a rule of rules is mathematically sound. We apply the algorithm to enumerate some rules of rules. Each rule of rule is capable of generating an infinite family of rules. For instance, for arbitrary binary string can be generated. We found an application of that is related to integer partition and the dominance order therein.
I Introduction
Ever since polar code was invented [1], it has constantly reminded researchers about its similarity to Reed–Muller codes [2, 3, 4, 5, 6] for that they both are generated by some subsets of rows of a Kronecker power of .
There is, nevertheless, a huge discrepancy. For Reed–Muller code, a user always picks rows with the highest Hamming weights up to the desired rate. Whether this will yield a good code is a very challenging problem [7, 8]. For polar code, the story goes the opposite way. While Arıkan proved the existence of subsets of rows that are capacity-achieving under successive cancellation decoder, it was not clear if there is any simple rule that tells us what to choose other than actually simulating the channels and carrying out the statistics. Should we be not careful and choose the wrong row, the performance will go low.
Mori and Tanaka [9], to address the row-selection issue, suggested using density evolution and the degradation relation as an alternative to simulation. Degradation does not directly tell us which rows are to select, but it tells us that some rows are always better than the other. This means that not all rows need to undergo simulation or density evolution, presumably simplifying the row-selection process.
Later, Schürch [10] and Bardet, Drägoi, Otmani, and Tillich [11] added one more degradation relation . Mondelli and Hassani and Urbanke [12] then showed that the number of rows that need to be seriously evaluated (either by simulation or evolution) is sub-linear in the block length. For the majority of rows, the degradation relation alone is enough to make the decision.
In this work, we follow the track of Wu and Siegel [13], Camps, López, Matthews, and Sarmiento [14], Drägoi and Cristescu [15], Geiger [16], Kahraman [17], and Kim, Oh, Kim, and Ha [18] to study comparison between rows. Throughout the work, we assume that the underlying channel is a binary erasure channel (BEC) and the successive cancellation decoder is in use.
For the first time, we consider density devolution, which stands for the opposite operation of density evolution. In the absence of devolution, comparing two BECs is essentially about proving a certain polynomial nonnegative, which can be done procedurally. Devolution introduces square root into the world of polynomials and the problem become proving certain algebraic function nonnegative. Wielding tools from field theory, Galois theory, and numerical analysis, we can decide, rigorously, if an algebraic function is always nonnegative. Each nonnegative algebraic function imply the nonnegativity of infinitely many polynomials, which means that we can now prove polynomial inequalities en masse.
The paper is organized as follows. Section II reviews the definition of polar code and a related preorder. Section III shows how to determine a polynomial inequality. Section IV extends the alphabet to . Section V lays some field theory bricks. Section VI shows how to determine a inequality involving . Section VII proves infinitely many inequalities.
II Polar Code and Partial Order over BEC
A simplified recipe of polar code reads: Let . Choose a positive integer and construct , the th Kronecker power of . Then, choose a subset of rows of as the generator matrix . This generates a polar code.
In selecting rows, a useful fact is that each row corresponds to a synthetic channel that is defined through density evolution. Let’s take the BEC with capacity as an example. First, define and . Next, define for any binary string . Now, the th row of corresponds to the BEC with capacity .
The design principle of polar code recommends that the best rows are those with the highest capacities. A natural question is: Can there be two rows such that one always has a higher capacity than the other?
Definition 1
For any binary strings and , we say if for all .
The following lemmas are known regarding and .
Lemma 2
and are strictly monotonically increasing and map bijectively onto . In fact, all are.
Proof:
Being strictly monotonically increasing and bijective is closed under function composition. ∎
Lemma 3
forms a preorder on binary strings.
Proof:
Reflexivity: , hence . Transitivity: implies . ∎
Lemma 4
if and for any binary strings , , , and . Juxtaposition stands for string concatenation.
Proof:
is monotonically increasing so . The image is contained in so . ∎
Lemma 5
If , then . Tilde stands for bitwise complement.
Proof:
The plot of is the plot of rotated by degrees w.r.t. as the center. Hence the assumption that ’s plot lies above ’s plot implies that ’s plot lies below ’s plot. ∎
These lemmas together with imply comparisons such as . Readers might argue that this should not be counted as multiple contributions. Rather, it is but one simple rule, , applied multiple times. Likewise, can be summarized by a simple rule, .
The main goal of this paper is to demonstrate how to generate new rules that are not consequences of other rules. We hope readers understand that what qualifies as an independent rule is context-dependent, sometimes even subjective. For instance, probably looks like an independent rule, but it is a consequence111 Take the bitwise complement: implies . Hence . of . For longer comparisons such as , it is never clear if they can be verified without brute force. Therefore, we will declare a new rule whenever it seems to be so.
III Polynomial Inequality Problem
III-A A procedure
Suppose is an polynomial with integer coefficients. There exist algorithms that find the exact number of roots of in a given interval (cf. Sturm’s theorem) or at least a mathematical sound upper bound on the number of roots (cf. Vincent’s theorem) [19]. A root-counting algorithm is usually paired with a bisecting strategy and the Newton–Raphson method to pinpoint the roots. But for now, we only need the counting part.
Assume . Here is how to decide whether
(1) |
Step one: Divide by , where and are the multiplicities of and , respectively, as roots of . Since and are nonnegative over , the sign of stays unchanged after division.
Step two: Remove the perfect square part of by dividing by , where (cf. Yun’s algorithm [20]). To see why this division works, suppose is any irreducible factor of . Write to mean but , i.e, is the multiplicity of in the factorization of . We then have
Note that is just the indicator of . This means that we are removing two copies of from if there are two or more copies. Repeat this step several times until the multiplicities of all irreducible factors become or . Since we always divide by a perfect square, the sign of stays unchanged.
Step three: Granted that is now square-free, we evaluate . If , this violates (1) right away. If , since is a single root, either or will be negative for some small , violating (1). In either case, our procedure terminates early and returns FALSE.
Step four: Count the number of roots of in the interval . If there is any root, say , then by the same reasoning as above, or will be negative, violating (1). If there are no roots, then all share the same sign as , confirming that (1) is TRUE. This concludes our procedure to decide (1).
Theorem 6
The procedure described in this subsection decides inequalities of the form (1).
III-B Application of the procedure
Checking is equivalent to checking . This constitutes a terminating procedure that decides if for any binary strings and . One runs this procedure and finds the following comparisons.
-
•
, where is the empty string.
-
•
.
-
•
.
-
•
.
-
•
.
-
•
.
-
•
.
-
•
.
-
•
.
-
•
.
Among these comparisons, , , , , , , and qualify as new rules. Theoretically, one Their run the procedure on arbitrarily long strings to generate all possible comparisons. As of now, this is the only reliable way to enumerate comparisons that are not consequences of each other.
Executing that, one might find interesting patterns among those comparisons. For instance, seems to be for any binary string . After some reductions, one sees that seems to be a set of rules that are not consequences of each other. Can we prove this for all ?
IV Inverse of Density Evolution
To motivate why we need the inverses of and , consider the following. We know from running the deterministic procedure that holds for . But we want it to hold for all positive integers . Wouldn’t it be convenient if there is a comparison that, when combined with , gives
(2) | ||||
(3) |
What should and be to make this happen?
By reverse-engineering (2) and (3), we see that “equals” and “equals” . The inverses and can be made formal by appointing the inverse function of and the inverse function of . We use and in place of and to ease the notation. These are density devolution; if is a BEC with capacity , then , where and are BECs with capacities and , respectively.
Define for any quaternary string . Note that .
Definition 7
For any quaternary strings and , we say if for all .
We find generalizations of the old lemmas to the new alphabet as well as new lemmas that only make sense with the new alphabet. Some proofs are omitted.
Lemma 8
, , , and are strictly monotonically increasing and map bijectively onto . In fact, all are.
Lemma 9
still forms a preorder on quaternary strings. and still imply and . Here, the complement (denoted by tilde) of and are each other.
Lemma 10
iff iff iff iff iff . Here, represents the empty string.
Proof:
Let to prove the first “only if”. The rest are similar. ∎
Lemma 11
iff .
Proof:
implies . Note that the plot of and the plot of are symmetric with respect to the diagonal that passes and . As implies that the former plot lies entirely above the later plot, the former plot lies entirely above the diagonal. This is saying . ∎
V Field Extensions of Univariate Polynomials
We aim to generalize the procedure in Section III to one that can decides whether for all quaternary strings and . This sections prepares some building blocks for such a procedure.
Let be the set of rational numbers. Let be the set of univariate polynomials in variable whose coefficients are in . Let be the fraction field of . Let be the set of complex numbers. Let be the set of functions from to that take infinity finitely many times.
A complex number is called an algebraic number if there exists a polynomial with rational number coefficients such that . Since our quaternary alphabet involves extracting the square root of a polynomial, we want to extend the concept of algebraic numbers to polynomials.
Definition 12
We call any function an algebraic function if there exists a bivariate polynomial with rational number coefficients such that for all except for the case . We denote the set of algebraic functions by .
Clearly, .
Lemma 13
Algebraic functions are closed under addition and multiplication. That is, given .
Proof:
We use a routine argument from the field theory. Readers are encouraged to skip should they find this boring.
Let , as a subset of , be the collection of functions of the form
where the sum is finite and . forms a vector space over because it is closed under addition and scalar-multiplication by an element of .
Suppose the degree of over is and the degree of over is . That is to say, there are relations
where . Then we can express as a combination of for . Likewise, we can express as a combination of lower power terms. Hence is a finite dimensional vector space over .
This implies that , as a vector subspace of , is of finite dimension over . Hence are not linearly independent over . There must exist a linear equation relating the powers of , which witnesses the fact that is an algebraic function.
For , the same argument applies. ∎
Lemma 14
Algebraic functions are closed under extracting the th roots. That is, if the function is such that for some positive integer .
Proof:
If , then is a bivariate polynomial such that . ∎
Definition 15
A root of an algebraic function is a complex number such that .
Lemma 16
A root of an algebraic function is an algebraic number.
Proof:
If is such that , then is a polynomial in such that . Note that is a polynomial with rational-number coefficients so is an algebraic number. ∎
Theorem 17
Suppose and and are quaternary strings. If , then is an algebraic number.
Proof:
The building blocks of are squaring, subtracting from , and extracting the square root. We have seen that all three operations map an algebraic function to an algebraic function. Hence is an algebraic function. Hence is an algebraic number. ∎
VI Algebraic Function Inequality Problem
Thanks to the previous section, we can verify any inequality of the form
(4) |
where and are quaternary strings. Here is how.
VI-A The “in principle, we can” part
Let . To safely extend the domain of to , we always select the square root in if there is any, and select an arbitrary one otherwise.
By the proof of that any root of is an algebraic number, there is an algorithm that outputs a polynomial that evaluates any root of to zero. That is to say, to check if contains any root of , it suffices to check if any root of in happens to be a root of .
Now, let the roots of that are also in be, in ascending order, . It suffices to check if are all positive. It might seem that this step will cost a lot because we need precise roots of . Luckily, a root-isolating algorithm will give us disjoint intervals that each contains an isolated root. Thus, instead of , checking the endpoints of the intervals is equally valid.
VI-B The “thanks to Galois theory, we’d better” part
But how exactly are we going to find the polynomial given quaternary strings and ? Instead of relying on the lemma that and are finite-dimensional over but the linear equations are implicit. The following is what we actually do to work out more efficiently.
First and foremost, instead of , we will be working on . If contains no square root, then it is a polynomial and we have detailed how to handle polynomials in Section III. Otherwise, is of the form
where and . We see that, if , then
Note that . So we can eliminate one layer of square root by replacing with
(5) |
Repeating the same procedure, we can eliminate all square roots introduced by symbols and in . And the resulting product of ’s will be our .
As an example, suppose we begin with
This means and . So (5) becomes
Proceed to the next step, becomes and becomes . So (5) becomes
(6) |
Now this is the polynomial we want.
Remark: the idea presented in this subsection can be summarized as “multiplying all Galois-conjugates together.” To elaborate, since is algebraic over , there exists Galois actions that sends to other functions that satisfies the very same equation satisfied by . The product of all conjugates must be in the base field: . But one of the factors is ; so implies . In the example above, the conjugates are
Their product is the same we conclude in (6).
Theorem 18
The procedure described in this section decides inequalities of the form (4).
VII The TBM Partial Order
VII-A Prefix TBM
Definition 19
Fix quaternary strings and . We write if, for any quaternary strings and ,
We call a TBM relation. TBM stands for tunnel boring machine, a machine that extends existing tunnels, inspired by the potential usage in (2) and (3). As a starter, we can prove fairly easily: if , then and .
Another example: Suppose holds, then implies . Take the bitwise complement: implies ; then implies . Therefore, implies .
As it gets more complicated, we see the necessity to develop a machine that can automate this away.
Lemma 20
if and only if both and hold.
Proof:
Let’s prove the “if” part. Suppose holds and suppose , then , implying . For , the same argument applies.
Let’s prove the “only if” part. Let , then definitely holds as both sides are empty. But then that implies that , which is . Likewise, is just . ∎
Lemma 21
is a preorder.
Proof:
Reflexivity: Observe that and . Transitivity: Use and . ∎
Lemma 22
iff iff iff .
Theorem 23
implies . Note: the converse is not true as counterexamples are found.
Proof:
This is a sketch of proof. See the appendix for more details.
If is not true, then there exists such that . Pick a binary string such that is very, very close to , so close that . At the same time, make very, very close to , so close that . Then , which contradicts . So must be true. ∎
VII-B Some rules of rules
Since checking and is straightforward using the procedure described in Section VI, we list some TBM relations as follows.
-
•
, where is the empty string.
-
•
.
-
•
.
-
•
.
-
•
.
-
•
.
As commented before, is equivalent to ; we now know that they both are true. Hence, as claimed before, we have for any positive integer .
Another example: and imply for any positive integer .
Yet another example: and , imply for any positive integer .
VII-C Suffix TBM
One might be motivated to make the following definition: if, for any quaternary strings and ,
Nevertheless, this is not a new concept, because is equivalent to , which is equivalent to .
The following are some suffix TBM relations we found.
-
•
. • . • . • .
-
•
. • . • .
An example application for suffix TBM: and , implies for any positive integer .
Another example application for suffix TBM: and and , imply for any positive integer .
To the best of our knowledge, all example inequalities given in this section are independent rules. These are particular examples of family of infinitely many rules. In the next section, we relate families of rules to the dominance order of integer partition.
VIII Integer Partition and Dominance Order
For a binary string , let be the integer partition where is the number of zeros to the right of the th one in . For instance is . We claim the following theorem.
Theorem 24
If and are of the same length and weight, and dominates as in the theory of integer partition, then .
Proof:
It suffices to prove that, if covers in the dominance order, then . By a theorem in integer partition, covers if and contain and , respectively, where . But we already know : they are consequences of and . ∎
For a binary string , let be the resulting string wherein each zero in is replaced by and each by . We claim that there are two more embeddings of the dominance order.
Theorem 25
Same assumptions as Theorem 24. Then , , , , . What’s more, , , , , .
Proof:
It suffices to check and for the first inequality, and and for the second inequality. All of them are checked on a personal computer. ∎
IX Conclusions
This paper addresses the problem of determining if one synthetic channel is always more preferred over another synthetic channel. We use a root-counting algorithm to demonstrate how to determine if a polynomial is nonnegative over . We then extend that to a procedure that determines if an algebraic function is nonnegative over . The latter becomes a tool that can prove infinitely many rules in one go.
References
- [1] E. Arikan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
- [2] ——, “A performance comparison of polar codes and reed-muller codes,” IEEE Communications Letters, vol. 12, no. 6, pp. 447–449, June 2008.
- [3] ——, “A survey of reed-muller codes from polar coding perspective,” in 2010 IEEE Information Theory Workshop on Information Theory (ITW 2010, Cairo), Jan 2010, pp. 1–5.
- [4] M. Mondelli, S. H. Hassani, and R. L. Urbanke, “From polar to reed-muller codes: A technique to improve the finite-length performance,” IEEE Transactions on Communications, vol. 62, no. 9, pp. 3084–3091, Sep. 2014.
- [5] B. Li, H. Shen, and D. Tse, “A rm-polar codes,” 2014. [Online]. Available: https://arxiv.org/abs/1407.5483
- [6] E. Abbe and M. Ye, “Reed-muller codes polarize,” IEEE Transactions on Information Theory, vol. 66, no. 12, pp. 7311–7332, Dec 2020.
- [7] S. Kudekar, S. Kumar, M. Mondelli, H. D. Pfister, E. Şaşoǧlu, and R. L. Urbanke, “Reed–muller codes achieve capacity on erasure channels,” IEEE Transactions on Information Theory, vol. 63, no. 7, pp. 4298–4316, July 2017.
- [8] G. Reeves and H. D. Pfister, “Reed-muller codes achieve capacity on bms channels,” 2021. [Online]. Available: https://arxiv.org/abs/2110.14631
- [9] R. Mori and T. Tanaka, “Performance of polar codes with the construction using density evolution,” IEEE Communications Letters, vol. 13, no. 7, pp. 519–521, July 2009.
- [10] C. Schürch, “A partial order for the synthesized channels of a polar code,” in 2016 IEEE International Symposium on Information Theory (ISIT), July 2016, pp. 220–224.
- [11] M. Bardet, V. Dragoi, A. Otmani, and J.-P. Tillich, “Algebraic properties of polar codes from a new polynomial formalism,” in 2016 IEEE International Symposium on Information Theory (ISIT), July 2016, pp. 230–234.
- [12] M. Mondelli, S. H. Hassani, and R. L. Urbanke, “Construction of polar codes with sublinear complexity,” IEEE Transactions on Information Theory, vol. 65, no. 5, pp. 2782–2791, May 2019.
- [13] W. Wu and P. H. Siegel, “Generalized partial orders for polar code bit-channels,” IEEE Transactions on Information Theory, vol. 65, no. 11, pp. 7114–7130, Nov 2019.
- [14] E. Camps, H. H. López, G. L. Matthews, and E. Sarmiento, “Polar decreasing monomial-cartesian codes,” IEEE Transactions on Information Theory, vol. 67, no. 6, pp. 3664–3674, June 2021.
- [15] V.-F. Drăgoi and G. Cristescu, “Bhattacharyya parameter of monomial codes for the binary erasure channel: From pointwise to average reliability,” Sensors, vol. 21, no. 9, 2021. [Online]. Available: https://www.mdpi.com/1424-8220/21/9/2976
- [16] B. C. Geiger, “The fractality of polar and reed–muller codes,” Entropy, vol. 20, no. 1, 2018. [Online]. Available: https://www.mdpi.com/1099-4300/20/1/70
- [17] S. Kahraman, “Strange attractor for efficient polar code design,” 2017. [Online]. Available: https://arxiv.org/abs/1708.04167
- [18] D. Kim, K. Oh, D. Kim, and J. Ha, “Information set analysis of polar codes,” in 2016 International Conference on Information and Communication Technology Convergence (ICTC), Oct 2016, pp. 813–815.
- [19] M. Sagraloff and K. Mehlhorn, “Computing real roots of real polynomials,” Journal of Symbolic Computation, vol. 73, pp. 46–86, 2016. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0747717115000292
- [20] D. Y. Yun, “On square-free decomposition algorithms,” in Proceedings of the Third ACM Symposium on Symbolic and Algebraic Computation, ser. SYMSAC ’76. New York, NY, USA: Association for Computing Machinery, 1976, pp. 26–35. [Online]. Available: https://doi.org/10.1145/800205.806320
Appendix A Proof of Theorem 23
For any binary string ,
This holds because each is a consequence of . Therefore, we will have for all binary strings .
Next, suppose is not true, i.e., there exists such that . Let and let .
We now attempt to construct capacity-achieving polar codes over BECs with capacities and , respectively. Since , i.e., the sum of the capacities of these two channels exceed one, the bitwise complement of the information set of the former overlaps the information set of the latter provided that the block length is large enough. That is, for sufficiently large block lengths, there is some binary string such that is in the information set of a polar code over BEC with capacity and is in the information set of a polar code over BEC with capacity .
Recall that polar code has exponentially small error probabilities. Therefore, for arbitrarily small , we will see
as long as is long enough and taken from the intersection. Using , the inequalities above are equivalent to
That is, is such that is very close to and is very close to . If we choose to be really, really small, we will see
and
This contradicts the fact that , so it must be the case that is true. This finishes the proof.
Consider the preceding proof similar to [13, Proposition 1].