This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Fast Fourier transform via automorphism groups of rational function fields

Songsong Li School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China [email protected]  and  Chaoping Xing School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China [email protected]
Abstract.

The Fast Fourier Transform (FFT) over a finite field 𝔽q\mathbb{F}_{q} computes evaluations of a given polynomial of degree less than nn at a specifically chosen set of nn distinct evaluation points in 𝔽q\mathbb{F}_{q}. If qq or q1q-1 is a smooth number, then the divide-and-conquer approach leads to the fastest known FFT algorithms. Depending on the type of group that the set of evaluation points forms, these algorithms are classified as multiplicative (Math of Comp. 1965) and additive (FOCS 2014) FFT algorithms. In this work, we provide a unified framework for FFT algorithms that include both multiplicative and additive FFT algorithms as special cases, and beyond: our framework also works when q+1q+1 is smooth, while all known results require qq or q1q-1 to be smooth. For the new case where q+1q+1 is smooth (this new case was not considered before in literature as far as we know), we show that if nn is a divisor of q+1q+1 that is BB-smooth for a real B>0B>0, then our FFT needs O(Bnlogn)O(Bn\log n) arithmetic operations in 𝔽q\mathbb{F}_{q}. Our unified framework is a natural consequence of introducing the algebraic function fields into the study of FFT.

1. Introduction

The discrete Fourier transform (DFT for short) of length nn over a field FF is a transform from nn coefficients of a polynomial f(x)f(x) over FF of degree less than nn to nn evaluations of f(x)f(x) at all nn-th roots of unity in FF. The inverse DFT (iDFT for short) is just the reverse process from nn evaluations to nn coefficients. However, the computation of DFT directly from the definition requires O(n2)O(n^{2}) operations in FF which is usually too slow for practical purposes. Thus, we desire to design a faster DFT to fulfill various applications. This motivates the study of fast Fourier transform (FFT for short). FFT is a classical topic in complexity theory and has found various applications in theoretical computer science such as fast polynomial arithmetic [VZGG13, BCS13, BSCKL23], and coding theory [Jus76, Gao03, GS10, LCH14, LANHC16, LANH16, HCLB21].

1.1. Previous work

The origin of FFT can be traced back to Gauss’s unpublished work in 1805, which was pointed out by Heideman et al. in [HJB84]. After 160 years, Cooley and Tukey [CT65] independently rediscovered this algorithm and popularized it. Now this FFT is known as the Cooley-Tukey algorithm. Cooley-Tukey’s algorithm is a divide-and-conquer algorithm that implements DFT over the complex numbers \mathbb{C}; namely, given a polynomial f(x)=i<naixi[X]f(x)=\sum_{i<n}a_{i}x^{i}\in\mathbb{C}[X], evaluate f(x)f(x) at the nn-th roots of unity in \mathbb{C}. If nn is a BB-smooth number, i.e., all its prime factors are not greater than BB, Cooley-Tukey’s algorithm computes DFT in O(Bnlogn)O(Bn\log n) arithmetic operations of \mathbb{C}. In the following, we also call nn is smooth if B=O(1)B=O(1) is a constant. Using Cooley-Tukey’s algorithm, DFT over 𝔽q\mathbb{F}_{q} can also be done in O(nlogn)O(n\log n) field operations of 𝔽q\mathbb{F}_{q} if 𝔽q\mathbb{F}_{q} contains an nn-th root of unity for smooth integer nn [M.71]. For DFTs of polynomials in 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}, but the evaluation points in an extension field 𝔽qd\mathbb{F}_{q^{d}} which has smooth nn-th roots of unity, van der Hoeven and Larrieu [vDHL17] showed that the Frobenius automorphism ΦGal(𝔽qd/𝔽q)\Phi\in\operatorname{\mathrm{Gal}}(\mathbb{F}_{q^{d}}/\mathbb{F}_{q}) can be used to accelerate the FFT over 𝔽qd\mathbb{F}_{q^{d}}. The above class of FFTs is called multiplicative FFT due to the fact that the evaluation set is a multiplicative subgroup of 𝔽q\mathbb{F}_{q}^{*}.

However, if 𝔽q\mathbb{F}_{q} does not have the desired nn-th roots of unity, for example, 𝔽q=𝔽2r\mathbb{F}_{q}=\mathbb{F}_{2^{r}} and 2r12^{r}-1 is a prime number, then the multiplicative FFT is not efficient. To implement FFT over 𝔽q\mathbb{F}_{q} in this case, a new type of FFT algorithm was discovered by Zhu-Wang [WY88] and Cantor [Can89] independently. To distinguish Cooley-Tukey’s FFT algorithm from the one by Zhu-Wang and Cantor, Mateer, and Gao [GS10] named the latter one as the additive FFT based on the fact that the evaluation set is an additive subgroup of 𝔽q\mathbb{F}_{q}. In their work [GS10], Mateer and Gao improved the previous additive FFT algorithm [WY88, Can89] and showed that their additive FFT algorithm requires O(nlog2n)O(n\log^{2}n) additions and O(nlogn)O(n\log n) multiplications in 𝔽q=𝔽2r\mathbb{F}_{q}=\mathbb{F}_{2^{r}}. Particularly, in case rr is a power of two, their improved FFT needs O(nlogn)O(n\log n) multiplications and only O(nlognloglogn)O(n\log n\cdot\log\log n) additions. Four years later, Lin et al. [LCH14] showed that additive FFT can be run in O(nlogn)O(n\log n) operations (including additions and multiplications) over 𝔽2r\mathbb{F}_{2^{r}} by taking a novel polynomial basis of 𝔽2r[x]<n\mathbb{F}_{2^{r}}[x]_{<n} (the 𝔽2r\mathbb{F}_{2^{r}}-vector space consisting of all polynomials over 𝔽2r\mathbb{F}_{2^{r}} of degree less than nn). Moreover, their new FFT algorithm is applicable to finite field 𝔽2r\mathbb{F}_{2^{r}} with arbitrary rr. The same group of authors later gave a new interpretation of their algorithm and presented some applications of FFT in encoding and decoding of Reed-Solomon codes [LANH16, HCLB21]. Note that for the additive FFT given in [LCH14, LANH16], it requires that every polynomial in 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} is represented under a certain basis consisting of products of linearized polynomials instead of the standard monomial basis {1,x,,xn1}\{1,x,\dots,x^{n-1}\}.

As we have seen, both multiplicative and additive FFTs over 𝔽q\mathbb{F}_{q} have constraints. Namely, for multiplicative FFT, one requires that q1q-1 is smooth; while for additive FFT, one requires that the characteristic pp of 𝔽q\mathbb{F}_{q} is a small constant. Recently, Ben-Sasson et al. [BSCKL23] made a breakthrough in the FFT-like algorithm over an arbitrary finite field. The new algorithm is based on elliptic curves and isogenies of smooth degree, thus it is called elliptic-curve-based FFT (ECFFT for short). More precisely, assume nn is a smooth number. Let f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} be a polynomial and S𝔽qS^{\prime}\subsetneq\mathbb{F}_{q} a carefully selected subset of cradinality nn. Then the multipoint evaluation (MPE for short) of ff at SS^{\prime} can be done in O(nlogn)O(n\log n) operations of 𝔽q\mathbb{F}_{q} under the representation of ff: f=(f(s1),,f(sn))f=(f(s_{1}),\ldots,f(s_{n})), where S𝔽qS\subsetneq\mathbb{F}_{q} is another subset of size nn and SS=S\cap S^{\prime}=\emptyset (This is different from previous FFTs which take as input the coefficients of ff under a certain basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}). They made use of isogenies between elliptic curves to construct the transform from the MPE of ff at SS to the MPE of ff at SS^{\prime}. During the transformation, some precomputations are required. The authors did not give the total storage for the precomputations, but at least Ω(n)\Omega(n) is required. This is an additional overhead compared to the multiplicative/additive FFTs. Besides, a constraint for the ECFFT over 𝔽q\mathbb{F}_{q} is that length nn is upper bounded by O(q)O(\sqrt{q}), especially for odd qq. This restricts many applications such as encoding and decoding of qq-ary Reed-Solomon codes where code lengths nn are usually proportional to qq.

1.2. Sketch of the multiplicative and additive FFT techniques

To compare our results and better understand the FFT algorithm, let us sketch the idea of Cooley-Tukey’s algorithm and Lin-Chung-Han’s algorithm [LCH14], which correspond to the multiplicative and additive cases, respectively. For both cases, let n=2rn=2^{r} and f(x)𝔽q[x]f(x)\in\mathbb{F}_{q}[x] be a polynomial of degree less than nn. Denote the complexity of FFT with respect to the number of additions and multiplications by A(n)A(n) and M(n)M(n), respectively.

For Cooley-Tukey’s algorithm, the evaluation set is selected to be μn𝔽q\mu_{n}\subseteq\mathbb{F}_{q}^{*} which is the set of all nn-th roots of unity, where nn is equal to 2r2^{r} for an integer r1r\geq 1. Then the square map s(x)=x2s(x)=x^{2} maps μn\mu_{n} onto μn/2\mu_{n/2}. Moreover, the polynomial f(x)f(x) can be decomposed as a combination of two lower-degree polynomials meanwhile, i.e., f(x)=f0(x2)+xf1(x2)f(x)=f_{0}(x^{2})+x\cdot f_{1}(x^{2}), where f0,f1𝔽q[x]f_{0},f_{1}\in\mathbb{F}_{q}[x] are polynomials of degree less than n/2n/2. Thus, the DFT of f(x)f(x) at μn\mu_{n} can be reduced to DFTs of f0f_{0} and f1f_{1} at μn/2\mu_{n/2} and then a combination of these two sets of n/2n/2 evaluations by using O(n)O(n) additions/multiplications in 𝔽q\mathbb{F}_{q}. Then the running time A(n)A(n) and M(n)M(n) both satisfy the recursive formula

A(n)=2A(n/2)+O(n),M(n)=2M(n/2)+O(n).A(n)=2A(n/2)+O(n),\ M(n)=2M(n/2)+O(n). (1.2.1)

After rr steps of recursions, we have A(n)=M(n)=O(nlogn)A(n)=M(n)=O(n\log n).

Lin-Chung-Han’s additive FFT algorithm inherited the idea of the FFT algorithm [Fid72] through polynomials modular arithmetic. Assume 𝔽q=𝔽2r={αi}i=02r1\mathbb{F}_{q}=\mathbb{F}_{2^{r}}=\{\alpha_{i}\}_{i=0}^{2^{r}-1}. Then the evaluations of f(x)𝔽2r[x]<2rf(x)\in\mathbb{F}_{2^{r}}[x]_{<2^{r}} at 𝔽2r\mathbb{F}_{2^{r}} are exactly the 2r2^{r} residues

(f(x)mod(xα0),f(x)mod(xα1),,f(x)mod(xα2r1)).\big{(}f(x)\bmod(x-\alpha_{0}),f(x)\bmod(x-\alpha_{1}),\dots,f(x)\bmod(x-\alpha_{2^{r}-1})\big{)}.

Note that i=02r1(xαi)=x2rx\prod_{i=0}^{2^{r}-1}(x-\alpha_{i})=x^{2^{r}}-x and f(x)=f(x)mod(x2rx)f(x)=f(x)\bmod(x^{2^{r}}-x). To efficiently get these 2r2^{r} residues, Lin et al. introduced linearized polynomials to decompose the modulo (x2rx)(x^{2^{r}}-x) into 2r2^{r} modulus {(xαi)}i=02r1\{(x-\alpha_{i})\}_{i=0}^{2^{r}-1} recursively. Assume

{0}=W0W1Wr1Wr=𝔽2r\{0\}=W_{0}\subsetneq W_{1}\subsetneq\cdots\subsetneq W_{r-1}\subsetneq W_{r}=\mathbb{F}_{2^{r}}

is a subspace chain of 𝔽2r\mathbb{F}_{2^{r}} and Wi=Wi1(Wi1+vi)W_{i}=W_{i-1}\cup(W_{i-1}+v_{i}) for some vi𝔽2rv_{i}\in\mathbb{F}_{2^{r}} and each i[1,r]i\in[1,r]. In particular, we have dim𝔽2(Wi)=i\dim_{\mathbb{F}_{2}}(W_{i})=i and {v1,,vr}\{v_{1},\dots,v_{r}\} constitutes a basis of 𝔽2r\mathbb{F}_{2^{r}} as a vector space over 𝔽2\mathbb{F}_{2}. Define the linearized polynomial as i(x)=wWi(xw)\ell_{i}(x)=\prod_{w\in W_{i}}(x-w). Then deg(i)=2i\deg(\ell_{i})=2^{i} and

i(x+β)=i1(x+β)i1(x+β+vi)for anyβ𝔽2r.\ell_{i}(x+\beta)=\ell_{i-1}(x+\beta)\cdot\ell_{i-1}(x+\beta+v_{i})\ \text{for\ any}\ \beta\in\mathbb{F}_{2^{r}}.

For any e[0,2r1]e\in[0,2^{r}-1], assume the 22-adic expansion of ee is e=i=0r1ei2ie=\sum_{i=0}^{r-1}e_{i}2^{i}. Then

deg(0e0(x)1e1(x)r1er1(x)})=e.\deg(\ell_{0}^{e_{0}}(x)\ell_{1}^{e_{1}}(x)\cdots\ell_{r-1}^{e_{r-1}}(x)\})=e.

Hence ={0e0(x)1(x)e1r1er1(x)e[0,2r1]}\mathcal{B}=\{\ell_{0}^{e_{0}}(x)\ell_{1}(x)^{e_{1}}\cdots\ell_{r-1}^{e_{r-1}}(x)\ \mid e\in[0,2^{r}-1]\} is a basis of 𝔽2r[x]<2r\mathbb{F}_{2^{r}}[x]_{<2^{r}}. Under the basis \mathcal{B}, write f(x)=f0(x)+r1(x)f1(x)f(x)=f_{0}(x)+\ell_{r-1}(x)\cdot f_{1}(x), where f0,f1𝔽2r[x]f_{0},f_{1}\in\mathbb{F}_{2^{r}}[x] are polynomials of degree less than 2r12^{r-1}. Since x2rx=r1(x)r1(x+vr)x^{2^{r}}-x=\ell_{r-1}(x)\cdot\ell_{r-1}(x+v_{r}), by the Chinese Reminder theorem,

f(x)mod(x2rx)=(f0(x)modr1(x),(f0(x)+r1(vr)f1(x))mod(r1(x)+r1(vr))).f(x)\bmod(x^{2^{r}}-x)=\Big{(}f_{0}(x)\bmod\ell_{r-1}(x),\big{(}f_{0}(x)+\ell_{r-1}(v_{r})f_{1}(x)\big{)}\bmod\big{(}\ell_{r-1}(x)+\ell_{r-1}(v_{r})\big{)}\Big{)}.

Thus, the computation of 2r2^{r} residues {f(x)mod(xα)}αWr\{f(x)\bmod(x-\alpha)\}_{\alpha\in W_{r}} of f(x)f(x) can be reduced to 2r12^{r-1} residues {f0mod(xα)}αWr1\{f_{0}\bmod(x-\alpha)\}_{\alpha\in W_{r-1}} of f0(x)f_{0}(x) and 2r12^{r-1} residues {(f0+r1(vr)f1)mod(xα)}αWr1+vr\{\big{(}f_{0}+\ell_{r-1}(v_{r})\cdot f_{1}\big{)}\bmod(x-\alpha)\}_{\alpha\in W_{r-1}+v_{r}} of f0(x)+r1(vr)f1(x)f_{0}(x)+\ell_{r-1}(v_{r})f_{1}(x). Since f0,f1f_{0},f_{1} both have degree less than 2r12^{r-1}, the computation of f0+r1(vr)f1f_{0}+\ell_{r-1}(v_{r})f_{1} needs O(n)O(n) additions and O(n)O(n) multiplications in 𝔽2r\mathbb{F}_{2^{r}}, respectively. Thus the running time A(n)A(n) and M(n)M(n) also satisfy equation  (1.2.1) leading to A(n)=M(n)=O(nlogn)A(n)=M(n)=O(n\log n).

So far, we have briefly described the ideas of multiplicative FFT and additive FFT. Although there are similarities between multiplicative and additive FFTs, the methods used are different. This makes great inconvenience for the generalization of these FFT algorithms and their applications.

1.3. Our result and comparisons

In this work, we provide a unified framework for FFT in 𝔽q\mathbb{F}_{q} that includes both multiplicative and additive FFT. More importantly, our framework works well when either q1q-1, qq, or q+1q+1 is smooth. In other words, we give a new FFT algorithm that includes both the multiplicative DFT and the additive DFT as two special cases. Besides, we discuss a new case that has never been considered: if nn is a divisor of q+1q+1 that is BB-smooth for a real number B>0B>0; namely, nn can be factored into the product of i=1rpi\prod_{i=1}^{r}p_{i} satisfying piBp_{i}\leq B for every 1ir1\leq i\leq r, then our FFT algorithm runs in O(Bnlogn)O(B\cdot n\cdot\log n) operations in 𝔽q\mathbb{F}_{q}.

Theorem 1.1.

Let B>0B>0 be a real. Let 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} be the space of polynomials over 𝔽q\mathbb{F}_{q} of degree less than nn. If nn is a BB-smooth divisor of either q1q-1, qq or q+1q+1, then one can run FFT for f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} at a well-chosen multipoint set of size nn in O(Bnlogn)O(B\cdot n\cdot\log n) field operations of 𝔽q\mathbb{F}_{q}.

Remark 1.2.
  • (1)

    In our framework, we need to represent a polynomial f𝔽q[x]<nf\in\mathbb{F}_{q}[x]_{<n} under a certain basis \mathcal{B} of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}, then implement FFT of ff at a well-chosen multipoint set (the roots in 𝔽q\mathbb{F}_{q} of a polynomial of degree nn). As we will see in Section 3, in the case of nq1n\mid q-1, then a basis \mathcal{B} of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} constructed from our framework is actually the standard monomial basis, i.e., ={xi}i=0n1\mathcal{B}=\{x^{i}\}_{i=0}^{n-1}; in the case of nqn\mid q, then \mathcal{B} is constructed by the products of a series of linearized polynomials, which is the same as in [LCH14].

    Moreover, if nqn\mid q is smooth and the polynomial ff is given under the standard basis {xi}i=0n1\{x^{i}\}_{i=0}^{n-1}, then our framework can implement DFT of ff in O(nlog2n)O(n\log^{2}n) operations in 𝔽q\mathbb{F}_{q} (see Corollary 3.6 in Section 3), which is a generalization of Mateer-Gao’s additive FFT over 𝔽2r\mathbb{F}_{2^{r}} to arbitrary characteristics.

    Although the results for multiplicative and additive FFTs are known, we present them under a unified framework for FFT by the theory of algebraic function fields.

  • (2)

    If both qq and q1q-1 are not smooth, then neither additive FFT nor multiplicative FFT in 𝔽q\mathbb{F}_{q} is applicable. In this case, if q+1q+1 has a smooth divisor nn, then our framework ensures that there is an efficient FFT in 𝔽q\mathbb{F}_{q}. Therefore, our work loosens the restriction of FFT on the finite field to some extent.

    For instance, let pp be a Sophie Germain prime, i.e., 2p+12p+1 is also a prime. Then both q:=2p+1q:=2p+1 and q1=2pq-1=2p are not smooth. Thus, if p+1p+1 is smooth, then q+1q+1 is smooth. The other instance is that pp is a Sophie Germain prime and q=22p+11q=2^{2p+1}-1 is a Mersenne prime, then qq is not smooth. Furthermore, q1=22p+12=2(2p1)(2p+1)q-1=2^{2p+1}-2=2\left(2^{p}-1\right)\left(2^{p}+1\right). Thus, with high probability q1q-1 is not smooth as 2p12^{p}-1 is a Mersenne number. On the other hand, we have q+1=22p+1q+1=2^{2p+1} is smooth.

  • (3)

    For the new case: q+1q+1 is smooth, we give a practical FFT algorithm over 𝔽q\mathbb{F}_{q} with multipoint set 𝒫𝔽q\mathcal{P}\subset\mathbb{F}_{q} in Section 4. Although one could also consider multiplicative FFT in the quadratic extension field 𝔽q2\mathbb{F}_{q^{2}} in this case, the corresponding multipoint set 𝒫1\mathcal{P}_{1} is a subgroup of 𝔽q2\mathbb{F}_{q^{2}}^{*}. Note that 𝒫1𝔽q={1,1}\mathcal{P}_{1}\cap\mathbb{F}_{q}=\{1,-1\}. Thus, the multiplicative FFT over 𝔽q2\mathbb{F}_{q^{2}} does not imply the FFT over 𝔽q\mathbb{F}_{q}.

We compare our result with known results in Subsection 1.1, see Table 1.

Table 1. Comparisons of our result with known FFTs over 𝔽q\mathbb{F}_{q} with complexity O(nlogn)O(n\log n)
evaluation set 𝒫\mathcal{P} Representation of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} constraints111in all cases, nn need to be O(1)O(1)-smooth.
Multiplicative 𝒫𝔽q\mathcal{P}\leq\mathbb{F}_{q}^{*} f=i=0n1aixif=\sum_{i=0}^{n-1}a_{i}x^{i} is rep. nq1n\mid q-1
FFT [CT65] nn-th roots of unity under the standard basis
Additive 𝒫𝔽q\mathcal{P}\leq\mathbb{F}_{q} ff is rep. under a nqn\mid q
FFT[LCH14] subspace of 𝔽q\mathbb{F}_{q} non-standard basis \mathcal{B}
Elliptic curve 𝒫𝔽q\mathcal{P}\subset\mathbb{F}_{q} ff is rep. as MPE n=O(q)n=O(\sqrt{q})
FFT [BSCKL23] xx-coord. of an nn-coset222a coset C=E(𝔽q)/GC=E(\mathbb{F}_{q})/G, where GE(𝔽q)G\leq E(\mathbb{F}_{q}) is a subgroup of order nn. f=(f(s1),,f(sn))f=(f(s_{1}),\dots,f(s_{n}))
Our 𝒫𝔽q\mathcal{P}\subset\mathbb{F}_{q} ff is rep. under a nq1,qn\mid q-1,\ q,
results roots of an nn-poly.333a polynomial determined by the automorphism subgroup of Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q}) of order nn. (non-)standard basis \mathcal{B} or q+1q+1

1.4. Our techniques

Let Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q}) be the automorphism group consisting of all automorphisms of 𝔽q(x)\mathbb{F}_{q}(x) keeping elements of 𝔽q\mathbb{F}_{q} invariant. Let GG be a subgroup of Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q}) and let 𝔽q(x)G\mathbb{F}_{q}(x)^{G} be the subfield of 𝔽q(x)\mathbb{F}_{q}(x) fixed by GG. If GG satisfies the following conditions (see Section 2.1 for precise definitions):

  1. (i)

    GG is an abelian group with smooth order n=|G|n=|G|, i.e., n=i=1rpin=\prod_{i=1}^{r}p_{i}, where all prime divisors pip_{i} are small constant.

  2. (ii)

    There is a place 𝒬\mathcal{Q} of 𝔽q(x)G\mathbb{F}_{q}(x)^{G} that is totally ramified in 𝔽q(x)/𝔽q(x)G\mathbb{F}_{q}(x)/\mathbb{F}_{q}(x)^{G}.

  3. (iii)

    𝔽q(x)G\mathbb{F}_{q}(x)^{G} has a rational place 𝔓\mathfrak{P} that splits completely in 𝔽q(x)\mathbb{F}_{q}(x). Let 𝒫\mathcal{P} denote the set of places of 𝔽q(x)\mathbb{F}_{q}(x) lying over 𝔓\mathfrak{P}.

then we can construct an FFT for any polynomial in 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} at 𝒫\mathcal{P} with complexity O(nlogn)O(n\log n) (since every rational place P𝒫P\in\mathcal{P} is the zero of xαx-\alpha for α𝔽q\alpha\in\mathbb{F}_{q} and f(P)=f(α)f(P)=f(\alpha) for any f𝔽q[x]f\in\mathbb{F}_{q}[x], we identify 𝒫\mathcal{P} as a subset of 𝔽q\mathbb{F}_{q}). Our FFT is based on the Galois theory, in order to differentiate it from the classical FFT, we name it G-FFT.

By condition (i) and the structure of finite abelian groups, GG has an ascending chain of subgroups

{1}=G0G1Gr1Gr=G,\{1\}=G_{0}\subsetneq G_{1}\subsetneq\cdots G_{r-1}\subsetneq G_{r}=G,

where |Gi|/|Gi1|=pi|G_{i}|/|G_{i-1}|=p_{i} for i=1,,ri=1,\cdots,r. By the Galois theory [HKTO08], each group GiG_{i} determines a fixed subfield 𝔽q(x)Gi\mathbb{F}_{q}(x)^{G_{i}} of 𝔽q(x)\mathbb{F}_{q}(x). As any subfield of 𝔽q(x)\mathbb{F}_{q}(x) is again a rational function field [Sti09, Proposition 3.5.9], we assume 𝔽q(x)Gi=𝔽q(xi)\mathbb{F}_{q}(x)^{G_{i}}=\mathbb{F}_{q}(x_{i}) for some xi𝔽q(x)x_{i}\in\mathbb{F}_{q}(x). Thus the subgroups chain leads to a tower of fields

𝔽q(x)=𝔽q(x0)𝔽q(x1)𝔽q(xr)=𝔽q(x)G.\mathbb{F}_{q}(x)=\mathbb{F}_{q}(x_{0})\supsetneq\mathbb{F}_{q}(x_{1})\supsetneq\cdots\supsetneq\mathbb{F}_{q}(x_{r})=\mathbb{F}_{q}(x)^{G}.

Furthermore, if the pole place PP_{\infty} of xx totally ramifies in the extension 𝔽q(x)/𝔽q(x)G\mathbb{F}_{q}(x)/\mathbb{F}_{q}(x)^{G}, then we can choose the xix_{i} such that each xi+1x_{i+1} is a polynomial of xix_{i} of degree pi+1p_{i+1} for 0ir10\leq i\leq r-1. As a result, we have νP(xi)=|Gi|\nu_{P_{\infty}}(x_{i})=-|G_{i}| for each 0ir0\leq i\leq r. Based on these facts, we can construct an 𝔽q\mathbb{F}_{q}-basis of the polynomial space 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} as following

={x0e0x1e1xr1er1e=(e0,e1,,er1)p1×p2××pr}.\mathcal{B}=\{x_{0}^{e_{0}}x_{1}^{e_{1}}\cdots x_{r-1}^{e_{r-1}}\mid\operatorname{\textbf{e}}=(e_{0},e_{1},\ldots,e_{r-1})\in\mathbb{Z}_{p_{1}}\times\mathbb{Z}_{p_{2}}\times\cdots\times\mathbb{Z}_{p_{r}}\}. (1.4.1)

Thus, any f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} can be represented as

f(x)=e0=0p11e1=0p21er1=0pr1aex0e0x1e1xr1er1=f0(x1)+xf1(x1)++xp11fp11(x1),\begin{split}f(x)&=\sum_{e_{0}=0}^{p_{1}-1}\sum_{e_{1}=0}^{p_{2}-1}\ldots\sum_{e_{r-1}=0}^{p_{r}-1}a_{\operatorname{\textbf{e}}}\cdot x_{0}^{e_{0}}x_{1}^{e_{1}}\cdots x_{r-1}^{e_{r-1}}\\ &=f_{0}(x_{1})+x\cdot f_{1}(x_{1})+\dots+x^{p_{1}-1}\cdot f_{p_{1}-1}(x_{1}),\end{split} (1.4.2)

where the second “=” is the result of combining all terms with the same xe0x^{e_{0}} for e0=0,1,,p11e_{0}=0,1,\ldots,p_{1}-1. Then f0(x1),,fp11(x1)𝔽q[x1]f_{0}(x_{1}),\ldots,f_{p_{1}-1}(x_{1})\in\mathbb{F}_{q}[x_{1}] are polynomials of degree less than n/p1n/p_{1} with respect to variable x1x_{1}. Thus, the evaluations of f(x)f(x) at 𝒫\mathcal{P} can be reduced to the evaluations of f0,f1,,fp11f_{0},f_{1},\dots,f_{p_{1}-1} at 𝒫\mathcal{P} and then a combination of these p1p_{1} sets of values by using p1O(n)p_{1}O(n) additions/multiplications in 𝔽q\mathbb{F}_{q}. However, for every fj𝔽q(x1)f_{j}\in\mathbb{F}_{q}(x_{1}), its evaluation at every P𝒫P\in\mathcal{P} satisfying fj(P)=f(P𝔽q(x1))f_{j}(P)=f\big{(}P\cap\mathbb{F}_{q}(x_{1})\big{)}. According to (iii), let 𝒫1=𝒫𝔽q(x1)\mathcal{P}_{1}=\mathcal{P}\cap\mathbb{F}_{q}(x_{1}) be the set of places of 𝔽q(x1)\mathbb{F}_{q}(x_{1}) lying over 𝔓\mathfrak{P}. Then |𝒫1|=n/p1|\mathcal{P}_{1}|=n/p_{1}. Denote the complexity of FFT with respect to the number of additions and multiplications by A(n)A(n) and M(n)M(n), respectively. Then A(n)A(n) and M(n)M(n) both satisfy the recursive formula

A(n)=p1A(n/p1)+p1O(n),M(n)=p1M(n/p1)+p1O(n).A(n)=p_{1}\cdot A(n/p_{1})+p_{1}\cdot O(n),\ M(n)=p_{1}\cdot M(n/p_{1})+p_{1}\cdot O(n). (1.4.3)

Similarly, the DFTs of f0,,fp11f_{0},\dots,f_{p_{1}-1} at 𝒫1\mathcal{P}_{1} can be reduced to DFTs of polynomials of lower-degree with respect to the variable x2x_{2} at 𝒫2=𝒫0𝔽q(x2)\mathcal{P}_{2}=\mathcal{P}_{0}\cap\mathbb{F}_{q}(x_{2}). Then, after rr steps of recursions, we obtain A(n)=M(n)=O(Bnlogn)A(n)=M(n)=O(B\cdot n\log n), where B=max1ir{pi}B=\max_{1\leq i\leq r}\{p_{i}\}.

The above idea works well for a subgroup GG of the affine linear group AGL2(q)\mathrm{AGL}_{2}(q) as the pole place PP_{\infty} of xx totally ramifies in 𝔽q(x)/𝔽q(x)G\mathbb{F}_{q}(x)/\mathbb{F}_{q}(x)^{G}. However, if we choose a subgroup that is not contained in AGL2(q)\mathrm{AGL}_{2}(q), then there are no rational places that totally ramify in 𝔽q(x)/𝔽q(x)G\mathbb{F}_{q}(x)/\mathbb{F}_{q}(x)^{G}. In this paper, we consider a cyclic subgroup GG of Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q}) of order q+1q+1. There is a place QQ of degree 22 that totally ramifies in the extension 𝔽q(x)/𝔽q(x)G\mathbb{F}_{q}(x)/\mathbb{F}_{q}(x)^{G}, and the rational place PP_{\infty} is splitting completely. Assume n=q+1=2rn=q+1=2^{r}. By some elementary analysis, we first construct a basis \mathcal{B} of the Riemann-Roch space (nQ)=1Qn𝔽q[x]2n\mathcal{L}\left(nQ\right)=\frac{1}{Q^{n}}\mathbb{F}_{q}[x]_{\leq 2n} (here we identify the place QQ with a quadratic irreducible polynomial QQ). Then we show that a subset ~\tilde{\mathcal{B}} of \mathcal{B} spans the 𝔽q\mathbb{F}_{q} vector space ur,0(x)Qn𝔽q[x]<n\frac{u_{r,0}(x)}{Q^{n}}\mathbb{F}_{q}[x]_{<n}, where ur,0(x)u_{r,0}(x) is a polynomial of degree nn with no roots in 𝔽q\mathbb{F}_{q} (see Lemma 4.5). In other words, we get a basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}, namely, Qnur,0(x)~\frac{Q^{n}}{u_{r,0}(x)}\tilde{\mathcal{B}}.

For a polynomial f𝔽q[x]f\in\mathbb{F}_{q}[x] which is represented under the basis Qnur,0(x)~\frac{Q^{n}}{u_{r,0}(x)}\tilde{\mathcal{B}}, let f~=ur,0(x)Qnf(nQ)\tilde{f}=\frac{u_{r,0}(x)}{Q^{n}}f\in\mathcal{L}(nQ). By the same recursive reduction, we first show that the multipoint evaluation {f~(α)α𝔽q}\{\tilde{f}(\alpha)\mid\alpha\in\mathbb{F}_{q}\} of f~\tilde{f} can be computed via G-FFT in time O(nlogn)O(n\log n). Next, by precomputing Qn(α)ur,0(α)\frac{Q^{n}(\alpha)}{u_{r,0}(\alpha)} for all α𝔽q\alpha\in\mathbb{F}_{q}, we get f(α)=Qn(α)ur,0(α)(~α)f(\alpha)=\frac{Q^{n}(\alpha)}{u_{r,0}(\alpha)}\tilde{(}\alpha) for every α𝔽q\alpha\in\mathbb{F}_{q}.

In the process, we need nn precomputations of {Qn(α)ur,0(α)α𝔽q}\{\frac{Q^{n}(\alpha)}{u_{r,0}(\alpha)}\mid\alpha\in\mathbb{F}_{q}\}. Moreover, in the recursive reduction, we also need some precomputations to get around the pole places of elements in the basis ~\tilde{\mathcal{B}}, which will cost O(logn)O(\log n) storage space. Therefore, the total storage in the G-FFT algorithm is O(n)O(n).

1.5. Organization

The paper is organized as follows. In Section 2, we present some preliminaries on algebraic function fields, in particular rational function fields. In Section 3, we construct the G-FFT via affine linear subgroups GG of Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q}) and instantiate GG by the multiplicative group 𝔽q\mathbb{F}_{q}^{*} and additive group 𝔽q\mathbb{F}_{q} as two examples. Moreover, in the case qq is smooth, we show that G-FFT under the standard basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} can be done in O(nlog2n)O(n\log^{2}n). In Section 4, we construct a new basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} via a non-affine cyclic subgroup of Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q}) of order nn. Then we show that the G-FFT of smooth length nq+1n\mid q+1 can be done in O(nlogn)O(n\log n) and give the G-FFT algorithm for the case q+1=2rq+1=2^{r}. Finally, Section 5 concludes our work.

2. Preliminary

In this section, we will introduce some preliminaries on algebraic function fields, especially some known results about the rational function field 𝔽q(x)\mathbb{F}_{q}(x) and its automorphism group. We also introduce the definition of BB-smooth groups which are similar to the BB-smooth integers.

2.1. Algebraic extensions of function fields

We briefly introduce the theory of rational function fields in this subsection. The reader may refer to [Sti09] for details.

For a prime power qq, let 𝔽q\mathbb{F}_{q} denote the finite field of qq elements. An algebraic function field over 𝔽q\mathbb{F}_{q} in one variable is a field extension F𝔽qF\supset\mathbb{F}_{q} such that FF is an algebraic extension of 𝔽q(x)\mathbb{F}_{q}(x) for some transcendental element xx over 𝔽q\mathbb{F}_{q}. In particular, the rational function field 𝔽q(x)\mathbb{F}_{q}(x) is an algebraic function field of one variable. In the following, we say F/𝔽qF/\mathbb{F}_{q} is an algebraic function field with the assumption that 𝔽q\mathbb{F}_{q} is the full constant field of FF, i.e., every algebraic element of FF over 𝔽q\mathbb{F}_{q} belongs to 𝔽q\mathbb{F}_{q} as well.

Now let FF be the rational function field 𝔽q(x)\mathbb{F}_{q}(x) over 𝔽q\mathbb{F}_{q}. For every irreducible polynomial P(x)𝔽q[x]P(x)\in\mathbb{F}_{q}[x], we define a discrete valuation νP\nu_{P} which is a map from 𝔽q[x]\mathbb{F}_{q}[x] to {}\mathbb{Z}\cup\{\infty\} given by νP(0)=\nu_{P}(0)=\infty and νP(f)=a\nu_{P}(f)=a, where ff is a nonzero polynomial and aa is the unique nonnegative integer satisfying Pa|fP^{a}|f and Pa+1fP^{a+1}\nmid f. This map can be extended to 𝔽q(x)\mathbb{F}_{q}(x) by defining νP(f/g)=νP(f)νP(g)\nu_{P}(f/g)=\nu_{P}(f)-\nu_{P}(g) for any two polynomials f,g𝔽q[x]f,g\in\mathbb{F}_{q}[x] with g0g\neq 0. Apart from the above finite discrete valuation νP\nu_{P}, we have an infinite valuation ν\nu_{\infty} defined by ν(f/g)=deg(g)deg(f)\nu_{\infty}(f/g)=\deg(g)-\deg(f) for any two polynomials f,g𝔽q[x]f,g\in\mathbb{F}_{q}[x] with g0g\neq 0. Note that we define deg(0)=\deg(0)=\infty. The set of places of FF is denoted by F\mathbb{P}_{F}.

For each discrete valuation νP\nu_{P} (PP is either a polynomial or \infty), by abuse of notion we still denote by PP the set {yF:νP(y)>0}\{y\in F:\;\nu_{P}(y)>0\}. Then the set PP is called a place of FF. If P=xαP=x-\alpha, then we denote PP by PαP_{\alpha}. The degree of the place PP is defined to be the degree of the corresponding polynomial P(x)P(x). If PP is the infinite place \infty, then the degree of \infty is defined to be 11. A place of degree 11 is called rational. In fact, there are exactly q+1q+1 rational places for the rational function field FF over 𝔽q\mathbb{F}_{q}.

Let F\mathbb{P}_{F} be the set of all places of F/𝔽qF/{\mathbb{F}_{q}} and let F(1)\mathbb{P}_{F}^{(1)} be the subset of F\mathbb{P}_{F} consisting of all rational places, i.e., places of degree one. Assume F/FF^{\prime}/F is a finite algebraic extension of degree nn with the constant field 𝔽q\mathbb{F}_{q}. A place PFP^{\prime}\in\mathbb{P}_{F^{\prime}} is said to be lying over PP, written as PPP^{\prime}\mid P, if PPP\subseteq P^{\prime}. The ramification index of PP^{\prime} over PP is denoted by e(PP)e(P^{\prime}\mid P). Let P1,,PmP_{1},\dots,P_{m} be all the places of FF^{\prime} lying over PP. If e(PiP)=[F:F]e(P_{i}\mid P)=[F^{\prime}:F] for a place PiPP_{i}\mid P, then m=1m=1 and PP is said to be totally ramified in FF^{\prime}; if e(PiP)=1e(P_{i}\mid P)=1 for every PiP_{i}, i=1,,mi=1,\dots,m, PP is said to be unramified in FF^{\prime}; furthermore, if e(PiP)=1e(P_{i}\mid P)=1 and m=[F:F]m=[F^{\prime}:F], then PP is said to split completely in FF^{\prime}. An important fact that is used in the design of our G-FFT is that for every place PP of FF and a function fFf\in F with νP(f)0\nu_{P}(f)\geq 0, the value f(P)f(P^{\prime}) is constant for all places PP^{\prime} of FF^{\prime} lying over PP.

2.2. The Riemann-Roch Space

Let F=𝔽q(x)F=\mathbb{F}_{q}(x) be a rational function field. For a place QQ of FF and an integer mm, we define the Riemann-Roch space associated with QQ as

(mQ):={fF:νP(f)0if PQ;νQ(f)m}{0}.\mathcal{L}(mQ):=\{f\in F^{*}:\;\nu_{P}(f)\geq 0\;\mbox{if $P\neq Q$};\;\nu_{Q}(f)\geq-m\}\cup\{0\}.

Then (mQ)\mathcal{L}(mQ) is a finite-dimensional vector space over 𝔽q\mathbb{F}_{q}. If QQ is the pole of xx, then (mQ)\mathcal{L}(mQ) is the polynomial space 𝔽q[x]m\mathbb{F}_{q}[x]_{\leq m}. The dimension dim𝔽q(mQ)\dim_{\mathbb{F}_{q}}\mathcal{L}(mQ) is usually denoted by (mQ)\ell(mQ). Then we have

(mQ)=mdeg(Q)+1\ell(mQ)=m\deg(Q)+1

for m0m\geq 0; and (mQ)=0\ell(mQ)=0, otherwise.

2.3. Subfields of the rational function field

Let F=𝔽q(x)F=\mathbb{F}_{q}(x) be the rational function field for some transcendental element xFx\in F over the finite field 𝔽q\mathbb{F}_{q}. We denote by Aut(F/𝔽q)\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}) the automorphism group of FF over 𝔽q\mathbb{F}_{q}, i.e.,

Aut(F/𝔽q)={σ:FF:σ is an 𝔽q-automorphism of F}.\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q})=\{\sigma:F\rightarrow F:\;\sigma\mbox{ is an }\mathbb{F}_{q}\mbox{-automorphism of }F\}. (2.3.1)

It is clear that an automorphism σAut(F/𝔽q)\sigma\in\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}) is uniquely determined by σ(x)\sigma(x). It is well known that every automorphism σAut(F/𝔽q)\sigma\in\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}) is given by

σ(x)=ax+bcx+d\sigma(x)=\frac{ax+b}{cx+d} (2.3.2)

for some constant a,b,c,d𝔽qa,b,c,d\in\mathbb{F}_{q} with adbc0ad-bc\neq 0 (see [HKTO08]). Denote by GL2(q)\mathrm{GL}_{2}(q) the general linear group of 2×22\times 2 invertible matrices over 𝔽q\mathbb{F}_{q}. Thus, every matrix A=(abcd)GL2(q)A=\left(\begin{array}[]{cc}a&b\\ c&d\end{array}\right)\in\mathrm{GL}_{2}(q) induces an automorphism of FF given by (2.3.2). Two matrices of GL2(q)\mathrm{GL}_{2}(q) induce the same automorphism of FF if and only if they belong to the same coset of Z(GL2(q))Z(\mathrm{GL}_{2}(q)), where Z(GL2(q))Z(\mathrm{GL}_{2}(q)) stands for the center {aI2:a𝔽q}\{aI_{2}:\;a\in\mathbb{F}_{q}^{*}\} of GL2(q)\mathrm{GL}_{2}(q). This implies that Aut(F/𝔽q)\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}) is isomorphic to the projective linear group PGL2(q):=GL2(q)/Z(GL2(q))\mathrm{PGL}_{2}(q):=\mathrm{GL}_{2}(q)/Z(\mathrm{GL}_{2}(q)). Thus, we can idenitify Aut(F/𝔽q)\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}) with PGL2(q)\mathrm{PGL}_{2}(q).

Consider the subgroup of PGL2(q)\mathrm{PGL}_{2}(q)

AGL2(q):={(ab01):a𝔽q,b𝔽q}.\mathrm{AGL}_{2}(q):=\left\{\left(\begin{array}[]{cc}a&b\\ 0&1\end{array}\right):\;a\in\mathbb{F}_{q}^{*},b\in\mathbb{F}_{q}\right\}. (2.3.3)

AGL2(q)\mathrm{AGL}_{2}(q) is called the affine linear group. Every element A=(ab01)AGL2(q)A=\left(\begin{array}[]{cc}a&b\\ 0&1\end{array}\right)\in\mathrm{AGL}_{2}(q) defines an affine automorphism σ(x)=ax+b.\sigma(x)=ax+b.

In addition to the subgroup AGL2(q)\mathrm{AGL}_{2}(q), PGL2(q)\mathrm{PGL}_{2}(q) has a cyclic subgroup with order q+1q+1. Let P(x)=x2+ax+bP(x)=x^{2}+ax+b be a primitive polynomial over 𝔽q[x]\mathbb{F}_{q}[x]. Consider the element σ=(01ba)PGL2(q)\sigma=\left(\begin{array}[]{cc}0&1\\ -b&-a\end{array}\right)\in\mathrm{PGL}_{2}(q). Then the order of σ\sigma is q+1q+1 (one can refer to [JMX19, Lemma V.1]). Thus σ\langle\sigma\rangle induces a q+1q+1-cyclic subgroup of Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q}).

For a subgroup GG of Aut(F/𝔽q)\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}), let FGF^{G} be the fixed subfield by GG, i.e.,

FG={uF:σ(u)=u,for allσG}F^{G}=\{u\in F:\;\sigma(u)=u,\ \text{for\ all}\ \sigma\in G\}

By the Galois theory [HKTO08], we know that F/FGF/{F^{G}} is a Galois field extension with Gal(F/FG)=G\operatorname{\mathrm{Gal}}(F/{F^{G}})=G. Moreover, Lüroth’s Theorem [Sti09, Proposition 3.5.9] asserts that if 𝔽qEF\mathbb{F}_{q}\subsetneq E\subseteq F, then E=𝔽q(z)E=\mathbb{F}_{q}(z), where z𝔽q(x)z\in\mathbb{F}_{q}(x) is a rational function of xx and zz is transcendental over 𝔽q\mathbb{F}_{q}.

Lemma 2.1 ([HKTO08]).

Assume 𝔽q(z)\mathbb{F}_{q}(z) is a subfield of 𝔽q(x)\mathbb{F}_{q}(x) and m=[𝔽q(x):𝔽q(z)]m=[\mathbb{F}_{q}(x):\mathbb{F}_{q}(z)]. Then z=f(x)g(x)z=\frac{f(x)}{g(x)} for some polynomial f(x),g(x)𝔽q[x]f(x),g(x)\in\mathbb{F}_{q}[x] with g(x)0g(x)\neq 0 satisfying gcd(f(x),g(x))=1\gcd(f(x),g(x))=1 and m=max{degf(x),degg(x)}m=\max\{\deg f(x),\deg g(x)\}. In particular, 𝔽q(z)=𝔽q(x)\mathbb{F}_{q}(z)=\mathbb{F}_{q}(x) if and only if there exist a,b,c,d𝔽qa,b,c,d\in\mathbb{F}_{q} such that z=ax+bcx+dz=\frac{ax+b}{cx+d} and adbc0ad-bc\neq 0.

2.4. B-smooth groups

Let B>0B>0 be a constant integer. Recall that a positive integer mm is called BB-smooth if every prime factor of mm is upper bounded by BB, i.e., m=i=1rpim=\prod_{i=1}^{r}p_{i} with piBp_{i}\leq B for all 1ir1\leq i\leq r. Note that we do not require these rr prime divisors pip_{i} to be distinct. We give the definition of BB-smooth finite groups as follows.

Definition 2.2.

A finite group GG is called BB-smooth if there exists a chain of subgroups of GG:

{1}=G0G1Gr1Gr=G\{1\}=G_{0}\subsetneq G_{1}\subsetneq\dots\subsetneq G_{r-1}\subsetneq G_{r}=G (2.4.1)

such that |Gi||Gi1|B\frac{|G_{i}|}{|G_{i-1}|}\leq B for i=1,,ri=1,\dots,r.

Lemma 2.3.

Let GG be a finite abelian group. If |G||G| is BB-smooth, then GG is BB-smooth.

Proof.

Assume |G|=i=1rpi|G|=\prod_{i=1}^{r}p_{i}, where p1,,prp_{1},\dots,p_{r} are divisors not larger than BB. Since GG is abelian, by the structure theorem for finite abelian groups [KS04], there exists a subgroup Gr1G_{r-1} of order i=1r1pi\prod_{i=1}^{r-1}p_{i}. As Gr1G_{r-1} is also abelian, there is a subgroup Gr2G_{r-2} of Gr1G_{r-1} of order i=1r2pi\prod_{i=1}^{r-2}p_{i}. Continuing in this fashion until G0={1}G_{0}=\{1\}, we get a series of finite subgroups of GG that satisfy the condition (2.4.1) and pj=|Gj||Gj1|Bp_{j}=\frac{|G_{j}|}{|G_{j-1}|}\leq B. ∎

3. Fast Fourier transform via affine subgroups of Aut(𝔽q(x)/𝔽q)\operatorname{\mathrm{Aut}}(\mathbb{F}_{q}(x)/\mathbb{F}_{q})

Assume the affine linear group AGL2(q)\mathrm{AGL}_{2}(q) has a smooth subgroup GG with order |G|=n|G|=n. We will first construct a polynomial basis \mathcal{B} of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} in terms of GG. Under this basis \mathcal{B}, we will then show that a Galois-group-based FFT (G-FFT for short) over 𝔽q\mathbb{F}_{q} of length nn can be computed in quasi-linear time of nn. We then instantiate our G-FFT by the multiplicative group 𝔽q\mathbb{F}_{q}^{*} and the additive group 𝔽q\mathbb{F}_{q} as two examples.

3.1. G-FFT via affine subgroups

Assume GAGL2(q)G\leq\mathrm{AGL}_{2}(q) is a BB-smooth affine subgroup with order nn, i.e., |G|=n=i=1rpi|G|=n=\prod_{i=1}^{r}p_{i} such that piBp_{i}\leq B for all 1ir1\leq i\leq r. Then there is an ascending subgroup chain

{1}=G0G1Gr1Gr=G\{1\}=G_{0}\subsetneq G_{1}\subsetneq\dots\subsetneq G_{r-1}\subsetneq G_{r}=G

satisfying pi=|Gi||Gi1|Bp_{i}=\frac{|G_{i}|}{|G_{i-1}|}\leq B for i=1,,ri=1,\dots,r. Let Fi=𝔽q(x)GiF_{i}=\mathbb{F}_{q}(x)^{G_{i}} be the fixed subfield of 𝔽q(x)\mathbb{F}_{q}(x) under GiG_{i}; namely,

Fi={u𝔽q(x):σ(u)=ufor allσGi}.F_{i}=\{u\in\mathbb{F}_{q}(x):\;\sigma(u)=u\ \text{for\ all}\ \sigma\in G_{i}\}.

By the Galois theory [HKTO08], F0/FiF_{0}/F_{i} is a Galois extension with F0=𝔽q(x)F_{0}=\mathbb{F}_{q}(x) and [F0:Fi]=|Gi|=j=1ipj[F_{0}:F_{i}]=|G_{i}|=\prod_{j=1}^{i}p_{j} for 0ir0\leq i\leq r. Furthermore, we have FiFi1F_{i}\subsetneq F_{i-1} resulting a tower of fields

𝔽q(x)=F0F1Fr1Fr.\mathbb{F}_{q}(x)=F_{0}\supsetneq F_{1}\supsetneq\dots\supsetneq F_{r-1}\supsetneq F_{r}.
Lemma 3.1.

Assume GG is a B-smooth affine subgroup with order nn and n=i=1rpin=\prod_{i=1}^{r}p_{i}. Let GiG_{i} be a subgroup of GG of order j=1ipj\prod_{j=1}^{i}p_{j} and Fi=FGiF_{i}=F^{G_{i}} be the fixed subfield under GiG_{i} for 0ir0\leq i\leq r. Then we have

  • (1)

    For 0i<r0\leq i<r, Fi=𝔽q(xi)F_{i}=\mathbb{F}_{q}(x_{i}) and xi=σGiσ(x)x_{i}=\prod_{\sigma\in G_{i}}\sigma(x); moreover, each xi+1x_{i+1} can be written as a polynomial of xix_{i} of degree pi+1p_{i+1}.

  • (2)

    The set

    ={x0e0x1e1xr1er1e=(e0,e1,,er1)p1×p2××pr}.\mathcal{B}=\{x_{0}^{e_{0}}x_{1}^{e_{1}}\cdots x_{r-1}^{e_{r-1}}\mid\operatorname{\textbf{e}}=(e_{0},e_{1},\ldots,e_{r-1})\in\mathbb{Z}_{p_{1}}\times\mathbb{Z}_{p_{2}}\times\cdots\times\mathbb{Z}_{p_{r}}\}.

    forms a basis of the 𝔽q\mathbb{F}_{q}-vector space 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}.

Proof.

(1) Let Ni(x)=σGiσ(x)N_{i}(x)=\prod_{\sigma\in G_{i}}\sigma(x). Since GG is affine, σ(x)\sigma(x) is a linear polynomial of xx for each σG\sigma\in G by § 2.3. Thus Ni(x)𝔽q[x]N_{i}(x)\in\mathbb{F}_{q}[x] is a polynomial of degree |Gi||G_{i}|. Let xi=Ni(x)x_{i}=N_{i}(x). Then xiFix_{i}\in F_{i} and xx is a root of φi(T)=Ni(T)xi\varphi_{i}(T)=N_{i}(T)-x_{i}. These lead to

|Gi|=[F0:Fi][F0:𝔽q(xi)]|Gi|.|G_{i}|=[F_{0}:F_{i}]\leq[F_{0}:\mathbb{F}_{q}(x_{i})]\leq|G_{i}|.

Thus Fi=𝔽q(xi)F_{i}=\mathbb{F}_{q}(x_{i}). Moreover, by definition,

xi+1=σGi+1σ(x)=σGi+1/Giσ(xi).x_{i+1}=\prod_{\sigma\in G_{i+1}}\sigma(x)=\prod_{\sigma\in G_{i+1}/G_{i}}\sigma(x_{i}).

By [JMX19, Proposition IV.2], the infinity place PP_{\infty} totally ramifies in 𝔽q(x)/𝔽q(x)G\mathbb{F}_{q}(x)/\mathbb{F}_{q}(x)^{G}. Let Pi,=PFiP_{i,\infty}=P_{\infty}\cap F_{i} be the place of FiF_{i} lying below PP_{\infty}. Then e(PPi,)=|Gi|e(P_{\infty}\mid P_{i,\infty})=|G_{i}| for 0ir0\leq i\leq r. Thus

νPi,(xi)=νP(xi)/e(PPi,)=1;\nu_{P_{i,\infty}}(x_{i})=\nu_{P_{\infty}}(x_{i})/e(P_{\infty}\mid P_{i,\infty})=-1;

namely, Pi,P_{i,\infty} is a pole of xix_{i}. Since Pi,P_{i,\infty} is totally ramified in Fi/Fi+1F_{i}/F_{i+1}, for any σGi+1/Gi=Gal(Fi/Fi+1)\sigma\in G_{i+1}/G_{i}=\operatorname{\mathrm{Gal}}(F_{i}/F_{i+1}), σ(Pi,)=Pi,\sigma(P_{i,\infty})=P_{i,\infty}. Thus

νPi,(σ(xi))=νPi,(xi)=1.\nu_{P_{i,\infty}}\left(\sigma(x_{i})\right)=\nu_{P_{i,\infty}}(x_{i})=-1.

Therefore, each σ(xi)\sigma(x_{i}) is a linear polynomial of xix_{i} and xi+1x_{i+1} can be seen as a polynomial of xix_{i} of degree |Gi+1/Gi|=pi+1|G_{i+1}/G_{i}|=p_{i+1}.

(2) It is easy to see that ||=n|\mathcal{B}|=n. For any Xe=x0e0x1e1xr1er1\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}}=x_{0}^{e_{0}}x_{1}^{e_{1}}\cdots x_{r-1}^{e_{r-1}}\in\mathcal{B}, by computation,

νP(Xe)=i=0r1ei|Gi|>pr|Gr1|=n,\nu_{P_{\infty}}(\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}})=-\sum_{i=0}^{r-1}e_{i}|G_{i}|>-p_{r}\cdot|G_{r-1}|=-n,

thus Xe𝔽q[x]<n\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}}\in\mathbb{F}_{q}[x]_{<n} and 𝔽q[x]<n\mathcal{B}\subsetneq\mathbb{F}_{q}[x]_{<n}. Note that, if ee=(e0,,er1)\operatorname{\textbf{e}}\neq\operatorname{\textbf{e}}^{\prime}=(e_{0}^{\prime},\ldots,e_{r-1}^{\prime}), assume kk is the largest index in [0,r1][0,r-1] such that ekeke_{k}\neq e_{k}^{\prime}. Then

νP(Xe)i=0kei|Gi|mod|Gk+1|νP(Xe)i=0kei|Gi|mod|Gk+1|.\nu_{P_{\infty}}(\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}})\equiv\sum_{i=0}^{k}-e_{i}\cdot|G_{i}|\bmod|G_{k+1}|\neq\nu_{P_{\infty}}(\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}^{\prime}})\equiv\sum_{i=0}^{k}-e_{i}^{\prime}\cdot|G_{i}|\bmod|G_{k+1}|.

Thus νP(Xe)νP(Xe)\nu_{P_{\infty}}(\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}})\neq\nu_{P_{\infty}}(\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}^{\prime}}). As each element in \mathcal{B} has pairwise distinct valuation at the infinity place PP_{\infty}, they must be linearly independent over 𝔽q\mathbb{F}_{q}. Since 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} has dimension nn, then \mathcal{B} must be a basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}. ∎

By the above Lemma, under the basis \mathcal{B}, we can express any f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} as

f(x)=eaeXe=f(x,x1,,xr1),f(x)=\sum_{\operatorname{\textbf{e}}}a_{\operatorname{\textbf{e}}}\operatorname{\textbf{X}}_{\operatorname{\textbf{e}}}=f(x,x_{1},\dots,x_{r-1}),

and xi,xi+1,,xrx_{i},x_{i+1},\dots,x_{r} are polynomials of xi1x_{i-1} for 1ir1\leq i\leq r.

Theorem 3.2.

Let FF be the rational function field 𝔽q(x)\mathbb{F}_{q}(x). Let GG be a BB-smooth subgroup of the affine group AGL2(q)\mathrm{AGL}_{2}(q) with |G|=n|G|=n. Assume 𝔓\mathfrak{P} is a rational place of FGF^{G} that splits completely in the extension F/FGF/F^{G}. Let 𝒫\mathcal{P} be the set of rational places of FF lying over 𝔓\mathfrak{P} (hence |𝒫|=n|\mathcal{P}|=n). Then, under the basis \mathcal{B} as in Lemma 3.1, the evaluations of every function f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} at 𝒫\mathcal{P} can be computed in O(Bnlogn)O(B\cdot n\log n) operations in 𝔽q\mathbb{F}_{q}.

Proof.

First, denote the complexity (i.e., the total number of operations in 𝔽q\mathbb{F}_{q}) of evaluating f(x)f(x) at the set 𝒫\mathcal{P} by C(n)C(n). Under the basis \mathcal{B} defined in Lemma 3.1, assume f=f(x,x1,,xr1)f=f(x,x_{1},\dots,x_{r-1}), where degxi(f)<pi+1\deg_{x_{i}}(f)<p_{i+1}. To perform FFT of f(x)f(x) at 𝒫\mathcal{P}, write f=f(x,x1,,xr1)f=f(x,x_{1},\dots,x_{r-1}) in at most O(n)O(n) steps as follows

f=f0(x1,,xr1)+xf1(x1,,xr1)++xp11fp11(x1,,xr1).f=f_{0}(x_{1},\dots,x_{r-1})+x\cdot f_{1}(x_{1},\dots,x_{r-1})+\dots+x^{p_{1}-1}\cdot f_{p_{1}-1}(x_{1},\dots,x_{r-1}). (3.1.1)

Note that

νP1,fi(x1,,xr1)((p21)+(p31)p2++(pr1)p2pr2)>p2pr=n/p1fori=0,,p11,\begin{split}\nu_{P_{1,\infty}}f_{i}(x_{1},\dots,x_{r-1})&\geq-\big{(}(p_{2}-1)+(p_{3}-1)p_{2}+\cdots+(p_{r}-1)p_{2}\cdots p_{r-2}\big{)}\\ &>-p_{2}\cdots p_{r}=-n/p_{1}\ \text{for}\ i=0,\cdots,p_{1}-1,\end{split}

where P1,=PF1P_{1,\infty}=P_{\infty}\cap F_{1} is the infinity place of F1F_{1}. Thus we have f0,,fp11((n/p1)P1,)=𝔽q[x1]<n/p1f_{0},\dots,f_{p_{1}-1}\in\mathcal{L}\left((n/p_{1})P_{1,\infty}\right)=\mathbb{F}_{q}[x_{1}]_{<n/p_{1}}. By equation (3.1.1), the DFT of ff at 𝒫\mathcal{P} can be reduced to the DFTs of f0,f1,,fp11f_{0},f_{1},\dots,f_{p_{1}-1} at 𝒫1=𝒫F1\mathcal{P}_{1}=\mathcal{P}\cap F_{1} and then a combination of these values by using p1O(n)p_{1}O(n) operations in 𝔽q\mathbb{F}_{q}. Therefore, the running time C(n)C(n) satisfies the following recursive formula

C(n)=p1C(n/p1)+p1O(n).C(n)=p_{1}C(n/{p_{1}})+p_{1}\cdot O(n). (3.1.2)

We continue the reduction in this fashion till to the DFTs of functions in 𝔽q[xr]<1\mathbb{F}_{q}[x_{r}]_{<1}. We get a total of (p1pr)(p_{1}\cdots p_{r}) constant polynomials and need to compute their’s evaluations at 𝒫r=𝒫Fr={𝔓}\mathcal{P}_{r}=\mathcal{P}\cap F_{r}=\{\mathfrak{P}\}, which can be done in (p1pr)O(1)(p_{1}\cdots p_{r})O(1) orperations. Thus, the recursive formula (3.1.2) leads the total running time equal to

C(n)=p1prO(1)+(p1++pr)O(n)=O(Bnlogn).C(n)=p_{1}\cdots p_{r}\cdot O(1)+(p_{1}+\dots+p_{r})O(n)=O(B\cdot n\log n).

This completes the proof.∎

Remark 3.3.

Let Ev𝒫(f)\operatorname{\mathrm{Ev}}_{\mathcal{P}}(f) denote the MPE of f𝔽q[x]<nf\in\mathbb{F}_{q}[x]_{<n} at 𝒫\mathcal{P}. The inverse G-FFT means that given the Ev𝒫(f)\operatorname{\mathrm{Ev}}_{\mathcal{P}}(f), output the coefficients of f(x)f(x) under the basis defined by G-FFT (see Lemma 3.1). In the following, we show that the inverse G-FFT can also be done in O(nlogn)O(n\log n) operations. Let I(n)I(n) denote the complexity of inverse G-FFT of length nn. For any P(1)𝒫F1=𝒫(1)P^{(1)}\in\mathcal{P}\cap F_{1}=\mathcal{P}^{(1)}, let P1,1,,P1,p1P_{1,1},\ldots,P_{1,p_{1}} be the p1p_{1} places in F\mathbb{P}_{F} lying over P(1)P^{(1)}. By the recursive Equation (3.1.1),

(f0(P(1))f1(P(1))fp11(P(1)))=(1x(P1,1)xp11(P1,1)1x(P1,2)xp11(P1,2)1x(P1,p1)xp11(P1,p1))1(f(P1,1)f(P1,2)f(P1,p1))\left(\begin{matrix}f_{0}(P^{(1)})\\ f_{1}(P^{(1)})\\ \vdots\\ f_{p_{1}-1}(P^{(1)})\end{matrix}\right)=\left(\begin{matrix}1&x(P_{1,1})&\cdots&x^{p_{1}-1}(P_{1,1})\\ 1&x(P_{1,2})&\cdots&x^{p_{1}-1}(P_{1,2})\\ \vdots&\vdots&\cdots&\vdots\\ 1&x(P_{1,p_{1}})&\cdots&x^{p_{1}-1}(P_{1,p_{1}})\end{matrix}\right)^{-1}\cdot\left(\begin{matrix}f(P_{1,1})\\ f(P_{1,2})\\ \vdots\\ f(P_{1,p_{1}})\end{matrix}\right) (3.1.3)

Thus, we can first compute the MPEs of f0,f1,,fp11f_{0},f_{1},\ldots,f_{p_{1}-1} at 𝒫(1)\mathcal{P}^{(1)} from the above equation, which will cost p1O(n)p_{1}O(n) operations. Then we can reduce the inverse G-FFT of Ev𝒫(f)\operatorname{\mathrm{Ev}}_{\mathcal{P}}(f) to p1p_{1} inverse G-FFTs of Ev𝒫(1)(f0)\operatorname{\mathrm{Ev}}_{\mathcal{P}^{(1)}}(f_{0}), Ev𝒫(1)(f1)\operatorname{\mathrm{Ev}}_{\mathcal{P}^{(1)}}(f_{1}), \ldots, Ev𝒫(1)(fp11)\operatorname{\mathrm{Ev}}_{\mathcal{P}^{(1)}}(f_{p_{1}-1}). Finally, we get ff by Equation (3.1.1). Therefore, I(n)I(n) satisfies the following recursive formula

I(n)=p1I(n/p1)+p1O(n)=O(nlogn).I(n)=p_{1}I(n/{p_{1}})+p_{1}O(n)=O(n\log n).

3.2. Instantiation

3.2.1. Multiplicative G-FFT

Let us first look at the multiplicative case. Let FF be the rational function field 𝔽q(x)\mathbb{F}_{q}(x). Consider a subgroup TT of 𝔽q\mathbb{F}_{q}^{*} of order nn and an automorphism subgroup

GT={σAut(F/𝔽q):σ(x)=axfor some aT}.G_{T}=\{\sigma\in\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}):\;\sigma(x)=ax\;\mbox{for some $a\in T$}\}.

When T=𝔽qT=\mathbb{F}_{q}^{*}, we denote GTG_{T} by GG_{*}. Then the field extension F/FGF/F^{G_{*}} has extension degree [F:FG]=q1[F:F^{G_{*}}]=q-1. Furthermore, there are a few facts about this extension (refer to [JMX19, Proposition IV.2])

  1. (1)

    The pole of xx is totally ramified in the extension F/FGF/F^{G_{*}}.

  2. (2)

    There is a rational place \wp of FGF^{G_{*}} that splits completely in F/FGF/F^{G_{*}}.

Let 𝔓\mathfrak{P} be a rational place of FGTF^{G_{T}} that lies over \wp. Then 𝔓\mathfrak{P} splits completely in the extension F/FGTF/F^{G_{T}}. Let 𝒫\mathcal{P} be the set of rational places of FF lying over 𝔓\mathfrak{P}. Then |𝒫|=n|\mathcal{P}|=n.

Assume that nn has factorization n=i=1rpin=\prod_{i=1}^{r}p_{i} with piBp_{i}\leq B for a positive integer BB. Consider an ascending chain of subgroups

{1}=T0T1Tr1Tr=T.\begin{split}\{1\}=T_{0}\subsetneq T_{1}\subsetneq\dots\subsetneq T_{r-1}\subsetneq T_{r}=T.\end{split}

with |Ti|=j=1ipi|T_{i}|=\prod_{j=1}^{i}p_{i} for i1i\geq 1. Let Gi=GTiG_{i}=G_{T_{i}} and Fi=FGiF_{i}=F^{G_{i}}. If Char(𝔽q)=2\operatorname{\mathrm{Char}}(\mathbb{F}_{q})=2, let xi:=σGiσ(x)x_{i}:=\prod_{\sigma\in G_{i}}\sigma(x). Then Fi=𝔽q(xi)F_{i}=\mathbb{F}_{q}(x_{i}) and xi=(aTia)xj=1ipj=xj=1ipjx_{i}=(\prod_{a\in T_{i}}a)\cdot x^{\prod_{j=1}^{i}p_{j}}=x^{\prod_{j=1}^{i}p_{j}}. If Char(𝔽q)\operatorname{\mathrm{Char}}(\mathbb{F}_{q}) is odd, then 2q12\mid q-1. We choose T1T_{1} with even order (hence nn is also even). Then 2|Ti|2\mid|T_{i}| and aTia=1\prod_{a\in T_{i}}a=-1 for each TiT_{i}. Let xi=σGiσ(x)x_{i}=-\prod_{\sigma\in G_{i}}\sigma(x). Then Fi=𝔽q(xi)F_{i}=\mathbb{F}_{q}(x_{i}) and xi=xj=1ipjx_{i}=x^{\prod_{j=1}^{i}p_{j}}. Therefore, we always have xi=xi1pix_{i}=x_{i-1}^{p_{i}} for all i1i\geq 1. Furthermore, it is easy to see that xi=0x_{i}=0 gives only one solution x=0x=0, i.e., the zero of xix_{i} totally ramifies in F/FiF/F_{i}. Let α\alpha be a primitive element of 𝔽q\mathbb{F}_{q}. Then for any β𝔽q\beta\in\mathbb{F}_{q}^{*}, the equation xr=βnx_{r}=\beta^{n}, i.e., xn=βnx^{n}=\beta^{n} gives solutions x=βαj(q1)/nx=\beta\alpha^{j(q-1)/n} for j=0,1,,n1j=0,1,\dots,n-1. This implies that xrβnx_{r}-\beta^{n} splits into nn rational places of FF. Let 𝒫\mathcal{P} be the set of rational places of FF lying over xrβnx_{r}-\beta^{n}, i.e., 𝒫={xβαj(q1)/n:j=0,1,,n1}\mathcal{P}=\{x-\beta\alpha^{j(q-1)/n}:\;j=0,1,\dots,n-1\}. Thus, we have 𝒫i=𝒫Fi={xiβj=1ipiαk(q1)j=1ipj/n:k=0,1,,nj=1ipj1}\mathcal{P}_{i}=\mathcal{P}\cap F_{i}=\{x_{i}-\beta^{\prod_{j=1}^{i}p_{i}}\alpha^{k(q-1)\prod_{j=1}^{i}p_{j}/n}:\;k=0,1,\dots,\frac{n}{\prod_{j=1}^{i}p_{j}}-1\}. Partition 𝒫\mathcal{P} into nj=1ipj\frac{n}{\prod_{j=1}^{i}p_{j}} subsets k:={(xβαk(q1)/nα(q1)/j=1ipj):=0,1,,j=1ipj1}\Re_{k}:=\{(x-\beta\cdot\alpha^{k(q-1)/n}\cdot\alpha^{\ell(q-1)/\prod_{j=1}^{i}p_{j}}):\;\ell=0,1,\dots,\prod_{j=1}^{i}p_{j}-1\} for k=0,1,,n/(j=1ipj)1k=0,1,\dots,n/(\prod_{j=1}^{i}p_{j})-1. It is easy to see that evaluation of xix_{i} at every place of k\Re_{k} is a constant that is equal to βj=1ipiαk(q1)j=1ipj/n\beta^{\prod_{j=1}^{i}p_{i}}\alpha^{k(q-1)\prod_{j=1}^{i}p_{j}/n}.

In our G-FFT algorithm, we have to decompose a polynomial in variable xi1x_{i-1} in terms of polynomials in variable xix_{i}. So we need represent f𝔽q[x]<nf\in\mathbb{F}_{q}[x]_{<n} under the basis \mathcal{B} defined in Lemma 3.1. Actually, in this case,

T={x0e0x1e1xr1er1e=(e0,e1,,er1)p1×p2××pr}={xee=i=0r1ei(p1pi+1)[0,n1]}={1,x,,xn1},\begin{split}\mathcal{B}_{T}&=\{x_{0}^{e_{0}}x_{1}^{e_{1}}\cdots x_{r-1}^{e_{r-1}}\mid\operatorname{\textbf{e}}=(e_{0},e_{1},\ldots,e_{r-1})\in\mathbb{Z}_{p_{1}}\times\mathbb{Z}_{p_{2}}\times\cdots\times\mathbb{Z}_{p_{r}}\}\\ &=\{x^{e}\mid e=\sum_{i=0}^{r-1}e_{i}\cdot(p_{1}\cdots p_{i+1})\in[0,n-1]\}=\{1,x,\cdots,x^{n-1}\},\end{split}

namely, T\mathcal{B}_{T} is exactly the standard basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}. To illustrate, let us consider the first step, i.e., express a polynomial f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} as a combination of {xk}k=1p11\{x^{k}\}_{k=1}^{p_{1}-1} with coefficients of polynomials in variable x1=xp1x_{1}=x^{p_{1}}. Then, in at most O(n)O(n) steps, f(x)f(x) can be written as

f(x)=f0(xp1)+xf1(xp1)++xp11fp11(xp1)=f0(x1)+xf1(x1)++xp11fp11(x1).\begin{split}f(x)&=f_{0}(x^{p_{1}})+xf_{1}(x^{p_{1}})+\cdots+x^{p_{1}-1}f_{p_{1}-1}(x^{p_{1}})\\ &=f_{0}(x_{1})+xf_{1}(x_{1})+\cdots+x^{p_{1}-1}f_{p_{1}-1}(x_{1}).\end{split} (3.2.1)

It is clear that each of fi(x1)f_{i}(x_{1}) has degree less than n/p1n/p_{1}. We can then decompose each fi(x1)f_{i}(x_{1}) as a combination of {x1k}k=0p21\{x_{1}^{k}\}_{k=0}^{p_{2}-1} with coefficients of polynomials in variable x2=x1p2x_{2}=x_{1}^{p_{2}}. Then the polynomials in x2x_{2} all have degrees less than n/(p1p2)n/(p_{1}p_{2}). We continue in this fashion until in the last step, all polynomials in variable xrx_{r} have degrees less than n/j=1rpj=1n/\prod_{j=1}^{r}p_{j}=1, i.e., they are all constants.

In conclusion, we have the following result.

Corollary 3.4.

If nn is a divisor of q1q-1 that is BB smooth, then one can run FFT for polynomials f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} in O(Bnlogn)O(B\cdot n\cdot\log n) field operations of 𝔽q\mathbb{F}_{q}.

3.2.2. Additive G-FFT

We now turn to the additive case. Let FF be the rational function field 𝔽q(x)\mathbb{F}_{q}(x). Let pp be the characteristic of 𝔽q\mathbb{F}_{q}. Consider an 𝔽p\mathbb{F}_{p}-subspace WW of 𝔽q\mathbb{F}_{q} of order nn and the associated automorphism group

GW={σAut(F/𝔽q):σ(x)=x+bfor some bW}.G_{W}=\{\sigma\in\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}):\;\sigma(x)=x+b\;\mbox{for some $b\in W$}\}.

When W=𝔽qW=\mathbb{F}_{q}, we denote GWG_{W} by G+G_{+}. Then the field extension F/FG+F/F^{G_{+}} has extension degree [F:FG+]=q[F:F^{G_{+}}]=q. Furthermore, there are a few facts about this extension [JMX19]:

  1. (1)

    The pole of xx is totally ramified in the extension F/FG+F/F^{G_{+}}.

  2. (2)

    There is a rational place \wp of FG+F^{G_{+}} that splits completely in F/FG+F/F^{G_{+}}.

Let 𝔓\mathfrak{P} be a rational place of FGWF^{G_{W}} that lies over \wp. Then 𝔓\mathfrak{P} splits completely in the extension F/FGWF/F^{G_{W}}. Let 𝒫\mathcal{P} be the set of rational places of FF lying over 𝔓\mathfrak{P}. Then |𝒫|=n|\mathcal{P}|=n.

Assume that nn has factorization n=prn=p^{r}. For simplicity, choose a set {α1,α2,,αr}\{\alpha_{1},\alpha_{2},\dots,\alpha_{r}\} of 𝔽p\mathbb{F}_{p}-linearly independent elements in 𝔽q\mathbb{F}_{q}. Put Wi=j=1i𝔽pαjW_{i}=\sum_{j=1}^{i}\mathbb{F}_{p}\alpha_{j}. Then WiW_{i} is an 𝔽p\mathbb{F}_{p}-subspace of 𝔽q\mathbb{F}_{q} of dimension ii. Hence, |Wi|=pi|W_{i}|=p^{i} for 1ir1\leq i\leq r. Put Gi={σAut(F/𝔽q):σ(x)=x+bfor some bWi}G_{i}=\{\sigma\in\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q}):\;\sigma(x)=x+b\;\mbox{for some $b\in W_{i}$}\}, Fi=FGiF_{i}=F^{G_{i}} and xi=σGiσ(x)=aWi(xa)x_{i}=\prod_{\sigma\in G_{i}}\sigma(x)=\prod_{a\in W_{i}}(x-a). Then each xix_{i} is a linearized polynomial of degree pip^{i}, denoted by i(x)\ell_{i}(x). Furthermore, xi=xi1pβixi1x_{i}=x_{i-1}^{p}-\beta_{i}x_{i-1} where βi=i1p1(αi)𝔽q\beta_{i}=\ell_{i-1}^{p-1}(\alpha_{i})\in\mathbb{F}_{q}. It is easy to see that xi=0x_{i}=0 gives solutions x=ax=a for all aWia\in W_{i}. This implies that xrx_{r} splits into nn rational places of FF. Let 𝒫\mathcal{P} be the set of rational places of FF lying over xrx_{r}, i.e., 𝒫={xa:aW}\mathcal{P}=\{x-a:\;a\in W\}. Thus, there exists an 𝔽p\mathbb{F}_{p}-vector space ViV_{i} of dimension rir-i such that 𝒫i=𝒫Fi={xia:aVi}\mathcal{P}_{i}=\mathcal{P}\cap F_{i}=\{x_{i}-a:\;a\in V_{i}\}.

To perform G-FFT, we need to represent f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} under a basis =W\mathcal{B}=\mathcal{B}_{W} constructed in Lemma 3.1. In this case, each xi=i(x)=xi1pβixi1x_{i}=\ell_{i}(x)=x_{i-1}^{p}-\beta_{i}x_{i-1} is a linearized polynomial of degree pip^{i} for 0ir0\leq i\leq r. In particular, 0(x)=x\ell_{0}(x)=x. Thus,

W={0(x)e01(x)e1r1(x)er1:(e0,,er1)pr}.\mathcal{B}_{W}=\{\ell_{0}(x)^{e_{0}}\ell_{1}(x)^{e_{1}}\cdots\ell_{r-1}(x)^{e_{r-1}}:\ (e_{0},\dots,e_{r-1})\in\mathbb{Z}_{p}^{r}\}. (3.2.2)

We note that W\mathcal{B}_{W} is constructed in the same way as in [LCH14]. Under this basis, we can decompose a polynomial in variable xi1x_{i-1} in terms of polynomials in variable xix_{i} step by step. To illustrate, let us consider the first step only. Let f(x)f(x) be a polynomial of degree less than nn. Under the basis W\mathcal{B}_{W}, write f(x)f(x) as

f(x)=f0(x1)+xf1(x1)++xp11fp11(x1)=f0(xpβ1x)+xf1(xpβ1x)++xp1fp11(xpβ1x).\begin{split}f(x)&=f_{0}(x_{1})+xf_{1}(x_{1})+\cdots+x^{p_{1}-1}f_{p_{1}-1}(x_{1})\\ &=f_{0}(x^{p}-\beta_{1}x)+xf_{1}(x^{p}-\beta_{1}x)+\cdots+x^{p-1}f_{p_{1}-1}(x^{p}-\beta_{1}x).\end{split} (3.2.3)

It is clear that each of fi(x1)f_{i}(x_{1}) has degree less than n/p=pr1n/p=p^{r-1}. We can then decompose each fi(x1)f_{i}(x_{1}) as a combination of {x1i}i=1p1\{x_{1}^{i}\}_{i=1}^{p-1} and polynomials in variable x2=x1pβ2x1x_{2}=x_{1}^{p}-\beta_{2}x_{1}. Then the polynomials in x2x_{2} all have degrees less than n/p2=pr2n/p^{2}=p^{r-2}. We continue in this fashion until the last step: all polynomials in variable xrx_{r} have degrees less than n/pr=1n/p^{r}=1, i.e., all constant polynomials. In conclusion, we have the following result.

Corollary 3.5.

Let W\mathcal{B}_{W} be a basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} defined as in equation (3.2.2). Let f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} be a given polynomial that is represented under W\mathcal{B}_{W}. If nn is a divisor of qq, then one can run FFT for f(x)f(x) in O(pnlogn)O(p\cdot n\cdot\log n) field operations of 𝔽q\mathbb{F}_{q}.

The condition “f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} is represented under W\mathcal{B}_{W}” in Corollary 3.5 is necessary to decompose a polynomial in variable xi1x_{i-1} in terms of polynomials in variable xix_{i} step by step in our G-FFT. If we are given a representation of f(x)f(x) under the standard basis, i.e., f(x)=i=0n1aixif(x)=\sum_{i=0}^{n-1}a_{i}x^{i}, we need to compute its (xpβ1x)(x^{p}-\beta_{1}x)-adic expansion (refer to [GS10, VZGG13]) first, i.e.,

f(x)=a0(x)+a1(x)(xpβ1x)++am(x)(xpβ1x)m,deg(ai)<pfor 0im<pr1.f(x)=a_{0}(x)+a_{1}(x)(x^{p}-\beta_{1}x)+\dots+a_{m}(x)(x^{p}-\beta_{1}x)^{m},\ \deg(a_{i})<p\ \text{for}\ 0\leq i\leq m<p^{r-1}.

Then we can write f(x)f(x) as in the form of (3.2.3) in at most O(n)O(n) steps from its (xpβ1x)(x^{p}-\beta_{1}x)-adic expansion. In the second recursion, we need also to compute the (x1pβ2x1)(x_{1}^{p}-\beta_{2}x_{1})-adic representation of each fi(x1)f_{i}(x_{1}) to write it as a combination of {x1i}i=1p1\{x_{1}^{i}\}_{i=1}^{p-1} and polynomials in variable x2=x1pβ2x1x_{2}=x_{1}^{p}-\beta_{2}x_{1}. We continue in this fashion until the last step: all polynomials in variable xrx_{r} are constant. This has been considered in [GS10] for the case that 𝔽q=𝔽2r\mathbb{F}_{q}=\mathbb{F}_{2^{r}}. We notice that if the characteristic pp is a constant (not only for p=2p=2), then the (xpβx)(x^{p}-\beta x)-adic expansion of any f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} can be computed in O(nlogn)O(n\log n) for any β𝔽q\beta\in\mathbb{F}_{q} (see Lemma A.2). Consequently, if f(x)=i=0n1aixif(x)=\sum_{i=0}^{n-1}a_{i}x^{i} is given under the standard basis, we must add the complexity of computing the (xpβx)(x^{p}-\beta x)-adic expansions in the recurse formula of C(n)C(n). Then C(n)C(n) satisfies the following recursive formula

C(n)=pC(n/p)+pO(n)+O(nlogn),C(n)=pC(n/p)+pO(n)+O(n\log n),

After rr steps of recursion, we have

C(n)=prC(1)+O(n(logp++logn))+rpO(n)=O(nlog2n)C(n)=p^{r}C(1)+O\left(n\cdot(\log p+\cdots+\log n)\right)+r\cdot p\cdot O(n)=O(n\log^{2}n)

In conclusion, we have the following result.

Corollary 3.6.

Let 𝔽q\mathbb{F}_{q} be a finite field with constant characteristic pp. Let f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} be a given polynomial under the standard basis. If nn is a divisor of qq, then one can run FFT for ff in O(nlog2n)O(n\log^{2}n) field operations of 𝔽q\mathbb{F}_{q}.

4. Fast Fourier transform via a cyclic subgroup of order q+1q+1

We continue to use the notation FF to denote the rational function field 𝔽q(x)\mathbb{F}_{q}(x). Assume nq+1n\mid q+1 is a O(1)O(1)-smooth divisor. Let m(x)=x2+ax+b𝔽q[x]m(x)=x^{2}+ax+b\in\mathbb{F}_{q}[x] be an irreducible and primitive polynomial over 𝔽q\mathbb{F}_{q} and Q(x)=x2+abx+1bQ(x)=x^{2}+\frac{a}{b}x+\frac{1}{b}. Let QQ denote the quadratic place of FF corresponding to Q(x)Q(x). In this section, we will show that the DFT of any polynomial f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} at a well-chosen set can be done in O(nlogn)O(n\log n) by using our G-FFT algorithm.

Recall that σ=(01ba)\sigma=\left(\begin{matrix}0&1\\ -b&-a\end{matrix}\right) has order q+1q+1 in the group Aut(F/𝔽q)=PGL2(q)\operatorname{\mathrm{Aut}}(F/\mathbb{F}_{q})=\mathrm{PGL}_{2}(q), where a,ba,b are coefficients of m(x)m(x). Let F(1)\mathbb{P}_{F}^{(1)} be the set of q+1q+1 rational places of FF. Then σ\sigma acts as a (q+1)(q+1)-cycle on the set F(1)\mathbb{P}_{F}^{(1)}.

Lemma 4.1 ([JMX19]).

Let Q(x)Q(x) and σ\sigma be defined as above. Let GσG\leq\langle\sigma\rangle be a subgroup of order nn. Then the fixed subfield of FF by GG is FG=𝔽q(z),wherez=τGτ(x).F^{G}=\mathbb{F}_{q}(z),\ \text{where}\ z=\sum_{\tau\in G}\tau(x). Moreover,

  • (1)

    the pole place 𝔓FG\mathfrak{P}\in\mathbb{P}_{F^{G}} of zz splits completely in F/FGF/F^{G};

  • (2)

    the quadratic place QFQ\in\mathbb{P}_{F} is the unique place that is totally ramified in F/FGF/F^{G}.

Assume G=σ(q+1)/nG=\langle\sigma^{(q+1)/n}\rangle is a BB-smooth subgroup of order nn, i.e., |G|=n=i=1rpr|G|=n=\prod_{i=1}^{r}p_{r}, where p1,,prBp_{1},\dots,p_{r}\leq B. Then there is an ascending chain of subgroups:

G0={1},Gi1Giwith index[Gi:Gi1]=pi,i=1,,r.G_{0}=\{1\},\ G_{i-1}\leq G_{i}\ \text{with\ index}\ [G_{i}:G_{i-1}]=p_{i},\ i=1,\dots,r.

Let Fi=FGiF_{i}=F^{G_{i}} be the fixed subfield. By the same analysis as in Subsection 3.1, we have FiFi1F_{i}\subsetneq F_{i-1} resulting a tower of fields

F0=𝔽q(x),Fi1Fiwith extension degree[Fi1:Fi]=pi,i=1,,r.F_{0}=\mathbb{F}_{q}(x),\ F_{i-1}\supsetneq F_{i}\ \text{with\ extension\ degree}\ [F_{i-1}:F_{i}]=p_{i},\ i=1,\dots,r.

Define

xi=τGiτ(x),yi=τGiτ(1Q(x)).x_{i}=\sum_{\tau\in G_{i}}\tau(x),\ y_{i}=\prod_{\tau\in G_{i}}\tau\left(\frac{1}{Q(x)}\right). (4.0.1)

Then Fi=𝔽q(xi)F_{i}=\mathbb{F}_{q}(x_{i}) by taking G=GiG=G_{i} in Lemma 4.1. Let Ei=𝔽q(yi)E_{i}=\mathbb{F}_{q}(y_{i}). It is clear that EiFiE_{i}\subsetneq F_{i} according to the definition of FiF_{i}. Furthermore, since QQ is totally ramified in F/FGF/F^{G}, τ(Q)=Q\tau(Q)=Q for all τG\tau\in G [Sti09, Theorem 3.8.2]. Thus the pole divisor of yiy_{i} is (yi)=τGiτ(Q)=|Gi|Q(y_{i})_{\infty}=\sum_{\tau\in G_{i}}\tau(Q)=|G_{i}|Q. By [Sti09, Theroem 1.4.11],

[F:Ei]=deg(yi)=2|Gi|.[F:E_{i}]=\deg(y_{i})_{\infty}=2|G_{i}|.

Thus [Fi:Ei]=[F:Ei]/[F:Fi]=2[F_{i}:E_{i}]=[F:E_{i}]/[F:F_{i}]=2. We have the following diagram

𝔽q(x){\mathbb{F}_{q}(x)}𝔽q(x1){\mathbb{F}_{q}(x_{1})}𝔽q(y1){\mathbb{F}_{q}(y_{1})}𝔽q(x2){\mathbb{F}_{q}(x_{2})}𝔽q(y2){\mathbb{F}_{q}(y_{2})}{\vdots}{\vdots}𝔽q(xr){\mathbb{F}_{q}(x_{r})}𝔽q(yr).{\mathbb{F}_{q}(y_{r}).}p1\scriptstyle{p_{1}}2p1\scriptstyle{2p_{1}}p2\scriptstyle{p_{2}}2\scriptstyle{2}2p2\scriptstyle{2p_{2}}p3\scriptstyle{p_{3}}2\scriptstyle{2}pr\scriptstyle{p_{r}}2pr\scriptstyle{2p_{r}}2\scriptstyle{2}

Let Pi,FiP_{i,\infty}\in\mathbb{P}_{F_{i}} denote the pole place of xix_{i} for 0ir0\leq i\leq r. According to Lemma 4.1, we know that Pi,P_{i,\infty} splits completely in F/FiF/F_{i}. In particular, let 𝒫F\mathcal{P}\subset\mathbb{P}_{F} be the set of places lying over Pr,P_{r,\infty}. Then |𝒫|=n|\mathcal{P}|=n. Let 𝒫(i)=𝒫Fi\mathcal{P}^{(i)}=\mathcal{P}\cap F_{i} be the set of all rational places in Fi\mathbb{P}_{F_{i}} lying above Pr,P_{r,\infty}. Then |𝒫(i)|=n/|Gi||\mathcal{P}^{(i)}|=n/{|G_{i}|}. By the definition of xix_{i}, P0,Pi,P_{0,\infty}\mid P_{i,\infty} for every ii, thus Pi,𝒫(i)P_{i,\infty}\in\mathcal{P}^{(i)}. Define

𝒫^(i)=𝒫(i){Pi,},i=0,1,,r1.\hat{\mathcal{P}}^{(i)}=\mathcal{P}^{(i)}\setminus\{P_{i,\infty}\},\ i=0,1,\ldots,r-1. (4.0.2)

Since every rational place P𝒫^(i)P\in\hat{\mathcal{P}}^{(i)} is the zero of xiαx_{i}-\alpha for some α𝔽q\alpha\in\mathbb{F}_{q}, we identify PP with α\alpha in this correspondence. Thus, 𝒫^(i)\hat{\mathcal{P}}^{(i)} can be viewed as a subset of 𝔽q\mathbb{F}_{q} for all ii. In particular, 𝒫(0)=𝒫\mathcal{P}^{(0)}=\mathcal{P} and 𝒫^=𝒫{P}\hat{\mathcal{P}}=\mathcal{P}\setminus\{P_{\infty}\}.

The following lemma presents the relationships between xi,yi,xi1,yi1x_{i},y_{i},x_{i-1},y_{i-1}, which is crucial for us to construct a basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n} and perform the G-FFT for any f𝔽q[x]<nf\in\mathbb{F}_{q}[x]_{<n}.

Lemma 4.2.

For 1ir1\leq i\leq r, let xix_{i}, yiy_{i} be defined as in Equation (4.0.1). Let Qi=QFiQ_{i}=Q\cap F_{i} be the quadratic place of FiF_{i} lying below QQ. Then, the followings hold:

  • (1)

    1/yi1/{y_{i}} is a prime element of QiQ_{i}, i.e., 1/yi=Qi(xi)1/y_{i}={Q_{i}(x_{i})}, where Qi(xi)Q_{i}(x_{i}) is a quadratic irreducible polynomial corresponding to QiQ_{i}.

  • (2)

    The pole place Pi,P_{i,\infty} of xix_{i} splits into pip_{i} rational places including Pi1,P_{i-1,\infty} in Fi1F_{i-1}; namely, xi=ui,i1(xi1)(xi1λi,1)(xi1λi,|pi|1)x_{i}=\frac{u_{i,i-1}(x_{i-1})}{(x_{i-1}-\lambda_{i,1})\cdots(x_{i-1}-\lambda_{i,|p_{i}|-1})} for some polynomial ui,i1(T)𝔽q[T]u_{i,i-1}(T)\in\mathbb{F}_{q}[T] of degree pip_{i}, and λi,1,,λi,|pi|1𝔽q\lambda_{i,1},\dots,\lambda_{i,|p_{i}|-1}\in\mathbb{F}_{q} are pairwise distinct.

  • (3)

    yi=ciyi1pi(xi1λi,1)2(xi1λi,|pi|1)2y_{i}=c_{i}y_{i-1}^{p_{i}}\cdot(x_{i-1}-\lambda_{i,1})^{2}\cdots(x_{i-1}-\lambda_{i,|p_{i}|-1})^{2}, where ci𝔽qc_{i}\in\mathbb{F}_{q}^{*} and λi,1,,λi,|pi|1\lambda_{i,1},\dots,\lambda_{i,|p_{i}|-1} are given in (2).

  • (4)

    xr=ur,i(xi)α𝒫^(i)(xiα)x_{r}=\frac{u_{r,i}(x_{i})}{\prod_{\alpha\in\hat{\mathcal{P}}^{(i)}}(x_{i}-\alpha)} for some ur,i(T)𝔽q[T]u_{r,i}(T)\in\mathbb{F}_{q}[T] of degree n/|Gi|n/|G_{i}|, and yr=α𝒫^(i)(xiα)2cr,iQin/|Gi|(xi)y_{r}=\frac{\prod_{\alpha\in\hat{\mathcal{P}}^{(i)}}(x_{i}-\alpha)^{2}}{c_{r,i}Q_{i}^{n/|G_{i}|}(x_{i})}, where cr,i𝔽qc_{r,i}\in\mathbb{F}_{q}^{*} and 𝒫^(i)\hat{\mathcal{P}}^{(i)} is defined by Equation (4.0.2).

Proof.

Let τi=σ(q+1)/|Gi|\tau_{i}=\sigma^{(q+1)/|G_{i}|}. Then τi\tau_{i} is a generator of GiG_{i}, i.e., Gi=τiG_{i}=\langle\tau_{i}\rangle. Since σ\sigma acts as a (q+1)(q+1)-cycle on the set 𝔽q(x)(1)\mathbb{P}_{\mathbb{F}_{q}(x)}^{(1)}, then, as a permutation of 𝔽q(x)(1)\mathbb{P}_{\mathbb{F}_{q}(x)}^{(1)}, τi\tau_{i} can be decomposed as a product of (q+1)/|Gi|(q+1)/|G_{i}| cycles of length |Gi||G_{i}|. Assume the cycle which contains P0,=PP_{0,\infty}=P_{\infty} is as follows

PτiPαi,1τiτiPαi,|Gi|1τiP,P_{\infty}\stackrel{{\scriptstyle\tau_{i}}}{{\longrightarrow}}P_{\alpha_{i,1}}\stackrel{{\scriptstyle\tau_{i}}}{{\longrightarrow}}\dots\stackrel{{\scriptstyle\tau_{i}}}{{\longrightarrow}}P_{\alpha_{i,{|G_{i}|-1}}}\stackrel{{\scriptstyle\tau_{i}}}{{\longrightarrow}}P_{\infty}, (4.0.3)

where αi,1,,αi,|Gi|1𝔽q\alpha_{i,1},\dots,\alpha_{i,|G_{i}|-1}\in\mathbb{F}_{q} are pairwise distinct. Thus

xi=x+τi(x)+τi|Gi|1(x)=ui(x)(xαi,1)(xαi,|Gi|1),x_{i}=x+\tau_{i}(x)+\cdots\tau_{i}^{|G_{i}|-1}(x)=\frac{u_{i}(x)}{(x-\alpha_{i,1})\cdots(x-\alpha_{i,|G_{i}|-1})}, (4.0.4)

for some ui(x)𝔽q[x]u_{i}(x)\in\mathbb{F}_{q}[x]. Consider the principal divisor div(1/Q(x))=2PQ\operatorname{\mathrm{div}}(1/Q(x))=2P_{\infty}-Q. Since QQ is totally ramified, τij(Q)=Q\tau_{i}^{j}(Q)=Q for all τijGi\tau_{i}^{j}\in G_{i} by [Sti09, Theorem 3.8.2]. Then

νQ(τij(1/Q(x)))=ντij(Q)(1/Q(x))=νQ(1/Q(x))=1.\nu_{Q}\left(\tau_{i}^{j}(1/Q(x))\right)=\nu_{\tau_{i}^{-j}(Q)}(1/Q(x))=\nu_{Q}(1/Q(x))=-1.

According to equation (4.0.3),

div(yi)=j=0|Gi|1div((τij(1/Q(x)))=2P+j=1|Gi|12Pαi,j|Gi|Q.\operatorname{\mathrm{div}}(y_{i})=\sum_{j=0}^{|G_{i}|-1}\operatorname{\mathrm{div}}(\left(\tau_{i}^{j}(1/Q(x))\right)=2P_{\infty}+\sum_{j=1}^{|G_{i}|-1}2P_{\alpha_{i,j}}-|G_{i}|Q.

Thus, we have

yi=ci(xαi,1)2(xαi,|Gi|1)2Qp1pi(x),for someci𝔽q.y_{i}=c_{i}^{\prime}\cdot\frac{(x-\alpha_{i,1})^{2}\cdots(x-\alpha_{i,|G_{i}|-1})^{2}}{Q^{p_{1}\cdots p_{i}}(x)},\ \text{for\ some}\ c_{i}^{\prime}\in\mathbb{F}_{q}^{*}. (4.0.5)

Let Pi,FiP_{i,\infty}\in\mathbb{P}_{F_{i}} be the pole of xix_{i} for 0ir0\leq i\leq r. By Equations (4.0.4) and (4.0.5), Pi,P_{i,\infty} is a double zero of yiy_{i} and QiQ_{i} is the unique pole of yiy_{i} in FiF_{i}. Furthermore,

νQi(yi)=νQ(yi)/e(QQi)=1.\nu_{Q_{i}}(y_{i})=\nu_{Q}(y_{i})/e(Q\mid Q_{i})=-1.

Thus 1/yi1/y_{i} must be a prime element of QiQ_{i}, denoted by Qi(xi)Q_{i}(x_{i}). Then yi=1/Qi(xi)y_{i}=1/Q_{i}(x_{i}). This completes the proof of (1).

(2) For the relationship between xix_{i} and xi1x_{i-1}, we note that

xi=τ¯Gi/Gi1τ¯(xi1).x_{i}=\sum_{\bar{\tau}\in G_{i}/G_{i-1}}\bar{\tau}(x_{i-1}). (4.0.6)

By Equation (4.0.4), the pole Pi,P_{i,\infty} of xix_{i} splits completely in the extension F/FiF/F_{i}. Thus Pi,P_{i,\infty} splits into pip_{i} places of Fi1F_{i-1}. It is obvious to see that Pi1,Pi,P_{i-1,\infty}\mid P_{i,\infty} from equation (4.0.6). Assume Pi1,,Pλi,1,,Pλi,pi1P_{i-1,\infty},P_{\lambda_{i,1}},\dots,P_{\lambda_{i,p_{i}-1}} are pip_{i} places of Fi1F_{i-1} lying above Pi,P_{i,\infty}. Then τ¯i\bar{\tau}_{i}, as an automorphism of Fi1F_{i-1}, permutes these pip_{i} places. Without loss of generality, assume

Pi1,τ¯iPλi,1τ¯iτ¯iPλi,pi1τ¯iPi1,,P_{i-1,\infty}\stackrel{{\scriptstyle\bar{\tau}_{i}}}{{\longrightarrow}}P_{\lambda_{i,1}}\stackrel{{\scriptstyle\bar{\tau}_{i}}}{{\longrightarrow}}\dots\stackrel{{\scriptstyle\bar{\tau}_{i}}}{{\longrightarrow}}P_{\lambda_{i,{p_{i}-1}}}\stackrel{{\scriptstyle\bar{\tau}_{i}}}{{\longrightarrow}}P_{i-1,\infty}, (4.0.7)

Then we have

xi=ui,i1(xi1)(xi1λi,1)(xi1λi,|pi|1),x_{i}=\frac{u_{i,i-1}(x_{i-1})}{(x_{i-1}-\lambda_{i,1})\cdots(x_{i-1}-\lambda_{i,|p_{i}|-1})},

for some polynomial ui,i1(T)𝔽q[T]u_{i,i-1}(T)\in\mathbb{F}_{q}[T]. As [𝔽q(xi1):𝔽q(xi)]=pi[\mathbb{F}_{q}(x_{i-1}):\mathbb{F}_{q}(x_{i})]=p_{i}, by Lemma 2.1, we then have degui,i1=pi\deg u_{i,i-1}=p_{i}. This completes the proof of (2).

(3) For the relationship between yiy_{i} and xi1x_{i-1}, we note that

yi=τ¯Gi/Gi1τ¯(yi1)=τ¯Gi/Gi1τ¯(1/Qi1(xi1)).\begin{split}y_{i}=\prod_{\bar{\tau}\in G_{i}/G_{i-1}}\bar{\tau}(y_{i-1})=\prod_{\bar{\tau}\in G_{i}/G_{i-1}}\bar{\tau}(1/Q_{i-1}(x_{i-1}))\end{split}.

The second equality in the above display follows from (1). Consider the principal divisor div(1/(Qi1(xi1)))=2Pi1,Qi1\operatorname{\mathrm{div}}(1/(Q_{i-1}(x_{i-1})))=2P_{{i-1},\infty}-Q_{i-1} in FiF_{i}. By Equation (4.0.7),

div(yi)=j=0pi1div(τ¯ij(1/Qi1(xi1)))=2Pi,+j=0pi1(2Pλi,j)piQi1.\operatorname{\mathrm{div}}(y_{i})=\sum_{j=0}^{p_{i}-1}\operatorname{\mathrm{div}}\left(\bar{\tau}_{i}^{j}(1/Q_{i-1}(x_{i-1}))\right)=2P_{i,\infty}+\sum_{j=0}^{p_{i}-1}(2P_{\lambda_{i,j}})-p_{i}Q_{i-1}.

Thus yi=ci(xi1λi,1)2(xi1λi,|pi|1)2Qi1(xi1)pi=ciyi1pi(xi1λi,1)2(xi1λi,|pi|1)2y_{i}=c_{i}\cdot\frac{(x_{i-1}-\lambda_{i,1})^{2}\cdots(x_{i-1}-\lambda_{i,|p_{i}|-1})^{2}}{Q_{i-1}(x_{i-1})^{p_{i}}}=c_{i}\cdot y_{i-1}^{p_{i}}\cdot(x_{i-1}-\lambda_{i,1})^{2}\cdots(x_{i-1}-\lambda_{i,|p_{i}|-1})^{2} for some ci𝔽qc_{i}\in\mathbb{F}_{q}^{*}.

(4) can be proved similarly by substituting ii with rr in the proof of (2) and (3). We do not repeat it here. ∎

Remark 4.3.

If n=q+1=2rn=q+1=2^{r}, by Equations (4.0.4) and (4.0.5) in the proof of Lemma 4.2, then

xr=ur,0(x)xqxfor someur,0(x)𝔽q[x]withdeg(ur,0(x))=q+1,yr=(xqx)2cr,0Qq+1(x)for somecr,0𝔽q.\begin{split}&x_{r}=\frac{u_{r,0}(x)}{x^{q}-x}\ \text{for\ some}\ u_{r,0}(x)\in\mathbb{F}_{q}[x]\ \text{with}\ \deg(u_{r,0}(x))=q+1,\\ &y_{r}=\frac{(x^{q}-x)^{2}}{c_{r,0}Q^{q+1}(x)}\ \text{for\ some}\ c_{r,0}\in\mathbb{F}_{q}^{*}.\end{split} (4.0.8)

In the following, we will first construct a basis \mathcal{B} of the Riemann-Roch space (nQ)\mathcal{L}(nQ). For simplicity, we define the following notions:

for 0ir1,zi(ei):={1,ifei=0,1(xiλi+1,1)(xiλi+1,2)(xiλi+1,ei),if 1eipi+11,zr(0):=1,zr(1):=xr,andpr+1:=2.\displaystyle\begin{split}&\text{for}\ 0\leq i\leq r-1,\ z_{i}^{(e_{i})}:=\begin{cases}1,\ &\text{if}\ e_{i}=0,\\ \frac{1}{(x_{i}-\lambda_{i+1,1})(x_{i}-\lambda_{i+1,2})\cdots(x_{i}-\lambda_{i+1,e_{i}})},\ &\text{if}\ 1\leq e_{i}\leq p_{i+1}-1,\end{cases}\\ &z_{r}^{(0)}:=1,\ z_{r}^{(1)}:=x_{r},\ \text{and}\ p_{r+1}:=2.\\ \end{split} (4.0.9)

where all (xiλi+1,ei)(x_{i}-\lambda_{i+1,e_{i}}) are given in Lemma 4.2(2) for 0ir10\leq i\leq r-1 and eipi+1e_{i}\in\mathbb{Z}_{p_{i+1}}.

Lemma 4.4.

We continue the notations in Lemma 4.2. Let zi(ei)z_{i}^{(e_{i})} be defined as in Equation (4.0.9) for i=0,1,,ri=0,1,\ldots,r and eipi+1e_{i}\in\mathbb{Z}_{p_{i+1}}. Then

={z0(e0)z1(e1)zr1(er1)zr(er)yre=(e0,e1,,er1,er)i=1r+1pi}{1}\mathcal{B}=\left\{z_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})}z_{r}^{(e_{r})}y_{r}\mid\operatorname{\textbf{e}}=(e_{0},e_{1},\ldots,e_{r-1},e_{r})\in\prod_{i=1}^{r+1}\mathbb{Z}_{p_{i}}\right\}\cup\left\{1\right\}

is a basis of the Riemann-Roch space (nQ)\mathcal{L}\left(nQ\right).

Proof.

For m=1,2,,rm=1,2,\dots,r, define the set m1\mathcal{B}_{m-1} as follows

m1={ym,ym2,,ymnp1pm}{1,xm}m1,m{1,zm1(1),,zm1(pm1)}m1,m1,\mathcal{B}_{m-1}=\underbrace{\{y_{m},y_{m}^{2},\dots,y_{m}^{\frac{n}{p_{1}\cdots p_{m}}}\}\cdot\{1,x_{m}\}}_{\text{$\mathcal{B}_{m-1,m}$}}\cdot\underbrace{\{1,z_{m-1}^{(1)},\ldots,z_{m-1}^{(p_{m}-1)}\}}_{\text{$\mathcal{B}_{m-1,m-1}$}}, (4.0.10)

where the product “\cdot” of sets means a set consisting of all possible products of elements from these three sets. Then |m1|=2n/(p1pm1)|\mathcal{B}_{m-1}|=2n/(p_{1}\cdots p_{m-1}). Let m1,m\mathcal{B}_{m-1,m} and m1,m1\mathcal{B}_{m-1,m-1} be two sets defined in Equation (4.0.10). We claim that

  • (i)

    m1{1}\mathcal{B}_{m-1}\cup\{1\} is a basis of the Riemann-Roch space (np1pm1Qm1)\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m-1}}Q_{m-1}\right).

  • (ii)

    m1,m{1}\mathcal{B}_{m-1,m}\cup\{1\} is a basis of the Riemann-Roch space (np1pmQm)\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m}}Q_{m}\right).

Then m\mathcal{B}_{m} and m1,m\mathcal{B}_{m-1,m} are linearly equivalent and have same cardinality. By substituting the subset m1,m\mathcal{B}_{m-1,m} with m\mathcal{B}_{m} in m1\mathcal{B}_{m-1}, we get a new basis m1{1}\mathcal{B}_{m-1}^{\prime}\cup\{1\} of (np1pm1Qm1)\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m-1}}Q_{m-1}\right), where m1=mm1,m1\mathcal{B}_{m-1}^{\prime}=\mathcal{B}_{m}\cdot\mathcal{B}_{m-1,m-1}. In particular, for m=0m=0, 0{1}=(0,10,0){1}\mathcal{B}_{0}\cup\{1\}=(\mathcal{B}_{0,1}\cdot\mathcal{B}_{0,0})\cup\{1\} is a basis of (nQ)\mathcal{L}\left(nQ\right). By substituting 0,1\mathcal{B}_{0,1} with 1\mathcal{B}_{1}, then we have (10,0){1}(\mathcal{B}_{1}\cdot\mathcal{B}_{0,0})\cup\{1\} is also a basis of (nQ)\mathcal{L}(nQ). Continuing in this fashion till to m=r1m={r-1}, then we get a basis of (nQ)\mathcal{L}(nQ), i.e.,

=rr1,r11,10,0{1}={yr,xryr}{1,zr1(1),,zr1(pr1)}{1,z0(1),,z0(p11)}{1}.\begin{split}\mathcal{B}&=\mathcal{B}_{r}\cdot\mathcal{B}_{r-1,r-1}\cdots\mathcal{B}_{1,1}\cdot\mathcal{B}_{0,0}\cup\{1\}\\ =\{y_{r},&x_{r}y_{r}\}\cdot\{1,z_{r-1}^{(1)},\ldots,z_{r-1}^{(p_{r}-1)}\}\cdots\{1,z_{0}^{(1)},\ldots,z_{0}^{(p_{1}-1)}\}\cup\{1\}.\end{split}

Therefore, it suffices to prove the above claims (i) and (ii).

(i) For every element ymixmjzm1(em1)y_{m}^{i}x_{m}^{j}z_{m-1}^{(e_{m-1})} in m1\mathcal{B}_{m-1}, where 1in/p1pm1\leq i\leq n/{p_{1}\cdots p_{m}}, j=0,1j=0,1 and em1pme_{m-1}\in\mathbb{Z}_{p_{m}}, by Lemma 4.2, we have

νQm1(ymixmjzm1(em1))=νQm1(ymi)=ipmn/(p1pm1),\nu_{Q_{m-1}}(y_{m}^{i}x_{m}^{j}z_{m-1}^{(e_{m-1})})=\nu_{Q_{m-1}}\left(y_{m}^{i}\right)=-i\cdot p_{m}\geq-n/(p_{1}\cdots p_{m-1}),

and it has no other poles except for Qm1Q_{m-1}. Thus m1{1}(np1pm1Qm1)\mathcal{B}_{m-1}\cup\{1\}\subsetneq\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m-1}}Q_{m-1}\right). Assume

a+i=1n/(p1pm)j=01e=0pm1ai,j,eymixmjzm1(e)=0a+\sum_{i=1}^{n/(p_{1}\cdots p_{m})}\sum_{j=0}^{1}\sum_{e=0}^{p_{m}-1}a_{i,j,e}\cdot y_{m}^{i}x_{m}^{j}z_{m-1}^{(e)}=0 (4.0.11)

where all ai,j,e𝔽qa_{i,j,e}\in\mathbb{F}_{q} and a𝔽qa\in\mathbb{F}_{q}. Multiplying by (xm1λm,1)(xm1λm,pm1)(x_{m-1}-\lambda_{m,1})\cdots(x_{m-1}-\lambda_{m,p_{m}-1}) on both sides of Equation (4.0.11), we have

(a+i=1n/(p1pm)j=01ai,j,0ymixmj)(xm1λm,1)(xm1λm,pm1)+i=1n/(p1pm)j=01e>0ai,j,eymixmj(xm1λm,e+1)(xm1λm,pm1)=0\begin{split}&\left(a+\sum_{i=1}^{n/(p_{1}\cdots p_{m})}\sum_{j=0}^{1}a_{i,j,0}\cdot y_{m}^{i}x_{m}^{j}\right)(x_{m-1}-\lambda_{m,1})\cdots(x_{m-1}-\lambda_{m,p_{m}-1})\\ &+\sum_{i=1}^{n/(p_{1}\cdots p_{m})}\sum_{j=0}^{1}\sum_{e>0}a_{i,j,e}y_{m}^{i}x_{m}^{j}(x_{m-1}-\lambda_{m,e+1})\cdots(x_{m-1}-\lambda_{m,p_{m}-1})=0\end{split} (4.0.12)

Since [Fm1:Fm]=pm[F_{m-1}:F_{m}]=p_{m}, we have the coefficients of xm1kx_{m-1}^{k} for 0kpm10\leq k\leq p_{m}-1 are all zero; namely, a+i,jai,j,0ymixmj=0a+\sum_{i,j}a_{i,j,0}\cdot y_{m}^{i}x_{m}^{j}=0 and i,jai,j,eymixmj=0\sum_{i,j}a_{i,j,e}\cdot y_{m}^{i}x_{m}^{j}=0 for every 0<e<pm0<e<p_{m}. As [𝔽q(xm):𝔽q(ym)]=2[\mathbb{F}_{q}(x_{m}):\mathbb{F}_{q}(y_{m})]=2, we then have a=0a=0 and iai,j,eymi=0\sum_{i}a_{i,j,e}\cdot y_{m}^{i}=0 for all 0epm10\leq e\leq p_{m}-1 and j=0,1j=0,1. As ymy_{m} is transcendental over 𝔽q\mathbb{F}_{q}, we have all the coefficients ai,j,e=0a_{i,j,e}=0. Therefore, the elements in m1\mathcal{B}_{m-1} are linearly independent. By the Riemann-Roch theorem [Sti09], (np1pm1Qm1)\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m-1}}Q_{m-1}\right) has dimension:

(np1pm1Qm1)=2n/(p1pm1)+1=|m1{1}|.\ell\left(\frac{n}{p_{1}\cdots p_{m-1}}Q_{m-1}\right)=2n/(p_{1}\cdots p_{m-1})+1=|\mathcal{B}_{m-1}\cup\{1\}|.

Thus, m1{1}\mathcal{B}_{m-1}\cup\{1\} is a basis of (np1pm1Qm1)\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m-1}}Q_{m-1}\right).

(ii) For every element ymixmjy_{m}^{i}x_{m}^{j} in the set m1,m={ym,ym2,,ymnp1pm}×{1,xm}\mathcal{B}_{m-1,m}=\{y_{m},y_{m}^{2},\dots,y_{m}^{\frac{n}{p_{1}\cdots p_{m}}}\}\times\{1,x_{m}\}, by Lemma 4.2, we have

νQm(ymixmj)=νQm(ymi)=in/(p1pm).\nu_{Q_{m}}\left(y_{m}^{i}x_{m}^{j}\right)=\nu_{Q_{m}}(y_{m}^{i})=-i\geq-n/(p_{1}\cdots p_{m}).

Thus m1,m(np1pmQm)\mathcal{B}_{m-1,m}\subsetneq\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m}}Q_{m}\right). We have shown in (i) that m1,m{1}\mathcal{B}_{m-1,m}\cup\{1\} is a linearly independent set. By the Riemann-Roch theorem [Sti09], (np1pmQm)=2n/(p1pm)+1=|m1,m{1}|\ell(\frac{n}{p_{1}\cdots p_{m}}Q_{m})=2n/{(p_{1}\cdots p_{m})}+1=|\mathcal{B}_{m-1,m}\cup\{1\}|. Hence m1,m{1}\mathcal{B}_{m-1,m}\cup\{1\} is also basis of (np1pmQm)\mathcal{L}\left(\frac{n}{p_{1}\cdots p_{m}}Q_{m}\right). ∎

Next, we take a subset of \mathcal{B} to construct a basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}.

Lemma 4.5.

Let zi(ei)z_{i}^{(e_{i})} be defined as in Equation (4.0.9) for 0ir0\leq i\leq r and eipi+1e_{i}\in\mathbb{Z}_{p_{i+1}}. Let xr=ur,0(x)α𝒫^(xα)x_{r}=\frac{u_{r,0}(x)}{\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)} by Lemma 4.2 for ur,0(x)𝔽q[x]u_{r,0}(x)\in\mathbb{F}_{q}[x] of degree nn. Then

~={z0(e0)z1(e1)zr1(er1)xryre=(e0,e1,,er1)p1×p2××pr}\tilde{\mathcal{B}}=\{z_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})}x_{r}y_{r}\mid\operatorname{\textbf{e}}=(e_{0},e_{1},\ldots,e_{r-1})\in\mathbb{Z}_{p_{1}}\times\mathbb{Z}_{p_{2}}\times\ldots\times\mathbb{Z}_{p_{r}}\}

is a basis of ur,0(x)Qn(x)𝔽q[x]<n\frac{u_{r,0}(x)}{Q^{n}(x)}\mathbb{F}_{q}[x]_{<n}. In other words, Qn(x)ur,0(x)~\frac{Q^{n}(x)}{u_{r,0}(x)}\tilde{\mathcal{B}} is a basis of 𝔽q[x]<n\mathbb{F}_{q}[x]_{<n}.

Proof.

Firstly, let Wq(~)W_{q}(\tilde{\mathcal{B}}) be the 𝔽q\mathbb{F}_{q}-subspace of (nQ)\mathcal{L}(nQ) spanned by ~\tilde{\mathcal{B}}. By Lemma 4.4, ~\tilde{\mathcal{B}} is an 𝔽q\mathbb{F}_{q}-linearly independent subset of \mathcal{B}. Then

dim(Wq(~))=p1p2pr=n=dim(ur,0(x)Qn(x)𝔽q[x]<n).\dim(W_{q}(\tilde{\mathcal{B}}))=p_{1}p_{2}\cdots p_{r}=n=\dim\left(\frac{u_{r,0}(x)}{Q^{n}(x)}\cdot\mathbb{F}_{q}[x]_{<n}\right).

It suffices to show that ~ur,0(x)Qn(x)𝔽q[x]<n\tilde{\mathcal{B}}\subset\frac{u_{r,0}(x)}{Q^{n}(x)}\cdot\mathbb{F}_{q}[x]_{<n}.

By Lemma 4.2,

zr(1)yr=xryr=ur,0(x)α𝒫^(xα)α𝒫^(xα)2Qn(x)=ur,0(x)Qn(x)α𝒫^(xα).z_{r}^{(1)}y_{r}=x_{r}y_{r}=\frac{u_{r,0}(x)}{\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)}\cdot\frac{\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)^{2}}{Q^{n}(x)}=\frac{u_{r,0}(x)}{Q^{n}(x)}{\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)}.

Claim: assume z0(e0)z1(e1)zr1(er1)=Ne(x)/De(x)𝔽q(x)z_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})}=N_{\operatorname{\textbf{e}}}(x)/D_{\operatorname{\textbf{e}}}(x)\in\mathbb{F}_{q}(x), where Ne(x),De(x)𝔽q[x]N_{\operatorname{\textbf{e}}}(x),D_{\operatorname{\textbf{e}}}(x)\in\mathbb{F}_{q}[x]. Then the denominator De(x)D_{\operatorname{\textbf{e}}}(x) will be killed by α𝒫^(xα){\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)}.

By the above claim, we have z0(e0)z1(e1)zr1(er1)zr(1)yrur,0(x)Qn(x)𝔽q[x]z_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})}z_{r}^{(1)}y_{r}\in\frac{u_{r,0}(x)}{Q^{n}(x)}\mathbb{F}_{q}[x]. Since Wq({1})=1Qn(x)𝔽q[x]<2nW_{q}\left(\mathcal{B}\setminus\{1\}\right)=\frac{1}{Q^{n}(x)}\mathbb{F}_{q}[x]_{<2n}, then the numerator of z0(e0)z1(e1)zr1(er1)zr(1)yrz_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})}z_{r}^{(1)}y_{r} (as a rational polynomial function) belongs to ur,0(x)𝔽q[x]<nu_{r,0}(x)\mathbb{F}_{q}[x]_{<n}. Thus, the lemma is proved if the claim is true.

The proof of claim: It is equivalent to show that the pole places of z0(e0)z1(e1)zr1(er1)z_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})} is a subset of 𝒫\mathcal{P}. Since zi(ei)=1/(xiλi+1,1)(xiλi+1,ei)z_{i}^{(e_{i})}={1}/{(x_{i}-\lambda_{i+1,1})\cdots(x_{i}-\lambda_{i+1,e_{i}})}, the pole places of zi(ei)z_{i}^{(e_{i})} in F\mathbb{P}_{F} are those lying above Pλi+1,kP_{\lambda_{i+1,k}} for 1kei1\leq k\leq e_{i}, where Pλi+1,kFiP_{\lambda_{i+1,k}}\in\mathbb{P}_{F_{i}} is the zero of xiλi+1,kx_{i}-\lambda_{i+1,k}. Moreover, Pλi+1,kPi+1,P_{\lambda_{i+1,k}}\mid P_{i+1,\infty} and Pi+1,P_{i+1,\infty} splits completely in F/Fi+1F/F_{i+1} by Lemma 4.2, thus each Pλi+1,kP_{\lambda_{i+1,k}} splits completely in F/FiF/F_{i}. Let (zi(ei))Div(F)\left(z_{i}^{(e_{i})}\right)_{\infty}\in\operatorname{\mathrm{Div}}(F) be the pole place of zi(ei)z_{i}^{(e_{i})} and Zi(ei):=Supp((zi(ei)))Z_{i}^{(e_{i})}:=\operatorname{\mathrm{Supp}}\left(\left(z_{i}^{(e_{i})}\right)_{\infty}\right) be its support set. Then |Zi(ei)|=eip1pi|Z_{i}^{(e_{i})}|=e_{i}\cdot p_{1}\cdots p_{i}. In particular, Zi(pi+11)Z_{i}^{(p_{i+1}-1)} contains Zi(ei)Z_{i}^{(e_{i})} which has cardinality (k=1i+1pk)(k=1ipk)\left(\prod_{k=1}^{i+1}p_{k}\right)-\left(\prod_{k=1}^{i}p_{k}\right). Moreover, let 𝒫i:={PF:PPi,}\mathcal{P}_{i}:=\{P\in\mathbb{P}_{F}:\ P\mid P_{i,\infty}\}. Then Zi(pi+11)𝒫i=Z_{i}^{(p_{i+1}-1)}\cap\mathcal{P}_{i}=\emptyset. For any j<ij<i, since Zj(ej)Zj(pj+11)𝒫iZ_{j}^{(e_{j})}\subset Z_{j}^{(p_{j+1}-1)}\subset\mathcal{P}_{i}, Zj(ej)Z_{j}^{(e_{j})} and Zi(ei)Z_{i}^{(e_{i})} are disjoint. This means the disjoint union i=0r1Zi(ei)\bigsqcup_{i=0}^{r-1}Z_{i}^{(e_{i})} is a subset of i=0r1Zi(pi+11)=𝒫^\bigsqcup_{i=0}^{r-1}Z_{i}^{(p_{i+1}-1)}=\hat{\mathcal{P}}. Thus, the denominator DeD_{\operatorname{\textbf{e}}} can be killed by α𝒫^(xα){\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)}.

Assume n=i=1rpin=\prod_{i=1}^{r}p_{i} is BB-smooth, i.e., pi<Bp_{i}<B for all i=1,,ri=1,\ldots,r. By Lemma 4.1, there are q+1n\frac{q+1}{n} rational places in Fr\mathbb{P}_{F_{r}} that are splitting completely in F0/FrF_{0}/F_{r}, namely those places lying above the pole place of i=0qσi(x)\sum_{i=0}^{q}\sigma^{i}(x) in FσF^{\langle\sigma\rangle}. Denote the set consisting of these q+1n\frac{q+1}{n} places by 𝒬\mathcal{Q}. Particularly, 𝒬={Pr,}\mathcal{Q}=\{P_{r,\infty}\} if n=q+1n=q+1, and contains Pr,P_{r,\infty} if n<q+1n<q+1. We make the following choices:

𝔓𝒬{Pr,},ifn<q+1,𝔓=Pr,,ifn=q+1.\begin{split}&\mathfrak{P}\in\mathcal{Q}\setminus\{P_{r,\infty}\},\ \text{if}\ n<q+1,\\ &\mathfrak{P}=P_{r,\infty},\ \text{if}\ n=q+1.\\ \end{split} (4.0.13)

Let 𝒫F\mathcal{P}\subset\mathbb{P}_{F} be the set of all rational places of FF lying over 𝔓\mathfrak{P} (hence |𝒫|=n|\mathcal{P}|=n). If n<q+1n<q+1, by the proof of claim in Lemma 4.5, every base element z0(e0)z1(e1)zr1(er1)xryr~z_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})}x_{r}y_{r}\in\tilde{\mathcal{B}} has no poles in 𝒫\mathcal{P}.

Theorem 4.6.

Let GG be a cyclic subgroup of σ\langle\sigma\rangle of order n=i=1rpin=\prod_{i=1}^{r}p_{i}. Let 𝔓FG\mathfrak{P}\in\mathbb{P}_{F^{G}} and 𝒫\mathcal{P} be defined as above. If nn is BB-smooth, then the DFT of any polynomial f𝔽q[x]<nf\in\mathbb{F}_{q}[x]_{<n} at 𝒫\mathcal{P} can be done in O(Bnlogn)O(Bn\log n) operations of 𝔽q\mathbb{F}_{q}.

Proof.

Let f~=ur,0(x)cr,0Qn(x)f\tilde{f}=\frac{u_{r,0}(x)}{c_{r,0}Q^{n}(x)}f, where ur,0(x)u_{r,0}(x) and cr,0Qn(x)c_{r,0}Q^{n}(x) come from xr=ur,0(x)α𝒫^(xα)x_{r}=\frac{u_{r,0}(x)}{\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)} and yr=α𝒫^(xα)2cr,0Qn(x)y_{r}=\frac{\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)^{2}}{c_{r,0}Q^{n}(x)} by Lemma 4.2(4). Then f~Wq(~)\tilde{f}\in W_{q}(\tilde{\mathcal{B}}) by Lemma 4.5. We will first show that the MPE of f~\tilde{f} at 𝒫\mathcal{P} can be done in O(nlogn)O(n\log n). By pre-computing cr,0Qn(α)ur,0(α)\frac{c_{r,0}Q^{n}(\alpha)}{u_{r,0}(\alpha)} for all α𝒫\alpha\in\mathcal{P}, we can obtain f(α)=cr,0Qn(α)ur,0(α)f~(α)f(\alpha)=\frac{c_{r,0}Q^{n}(\alpha)}{u_{r,0}(\alpha)}\tilde{f}(\alpha) for every α𝒫\alpha\in\mathcal{P}. If n=q+1n=q+1, we define f(P)=f(P_{\infty})=\infty since νP(f)<0\nu_{P_{\infty}}(f)<0. Denote the complexity (i.e., the total number of operations of 𝔽q\mathbb{F}_{q}) to compute the nn evaluations of any f~Wq(~)(nQ)\tilde{f}\in W_{q}(\tilde{\mathcal{B}})\subset\mathcal{L}(nQ) at the set 𝒫\mathcal{P} by C(n)C(n).

Under the basis ~\tilde{\mathcal{B}} in Lemma 4.5, we represent f~\tilde{f} as

f~=e¯i=0r1pi+1ae¯xryrz(e¯)=e0p1z0(e0)e¯i=1r1pi+1aee0¯xryrz(e¯)=z0(0)f0+z0(1)f1++z0(p11)fp11=f0+1xλ1,1f1++1(xλ1,1)(xλ1,p11)fp11.\begin{split}\tilde{f}&=\sum_{\operatorname{\underline{\textbf{e}}}\in\prod_{i=0}^{r-1}\mathbb{Z}_{p_{i+1}}}a_{\operatorname{\underline{\textbf{e}}}}\cdot{x_{r}y_{r}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}})}}\\ &=\sum_{e_{0}\in\mathbb{Z}_{p_{1}}}z_{0}^{(e_{0})}\cdot\sum_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\prod_{i=1}^{r-1}\mathbb{Z}_{p_{i+1}}}a_{\underline{\operatorname{\textbf{e}}^{\prime}e_{0}}}\cdot{x_{r}y_{r}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})}}\\ &=z_{0}^{(0)}\cdot f_{0}+z_{0}^{(1)}\cdot f_{1}+\dots+z_{0}^{(p_{1}-1)}\cdot f_{p_{1}-1}\\ &=f_{0}+\frac{1}{x-\lambda_{1,1}}\cdot f_{1}+\dots+\frac{1}{(x-\lambda_{1,1})\cdots(x-\lambda_{1,p_{1}-1})}\cdot f_{p_{1}-1}.\end{split} (4.0.14)

where fk=e¯i=1r1pi+1aek¯xryrz(e¯)f_{k}=\sum_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\prod_{i=1}^{r-1}\mathbb{Z}_{p_{i+1}}}a_{\underline{\operatorname{\textbf{e}}^{\prime}k}}\cdot{x_{r}y_{r}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})}} for every 0k<p10\leq k<p_{1}. Now, all fkF1f_{k}\in F_{1} and

νQ1(fk)νQ1(yr)=np1,whereQ1=QF1.\nu_{Q_{1}}(f_{k})\geq\nu_{Q_{1}}(y_{r})=-\frac{n}{p_{1}},\ \text{where}\ Q_{1}=Q\cap F_{1}.

Thus, fk(np1Q1)f_{k}\in\mathcal{L}(\frac{n}{p_{1}}Q_{1}) for all 0kp110\leq k\leq p_{1}-1.

For any P𝒫P\in\mathcal{P}, let P1=PF1P_{1}=P\cap F_{1}. Then fk(P)=fk(P1)f_{k}(P)=f_{k}(P_{1}) as fkF1f_{k}\in F_{1}. Let 𝒫(1)=𝒫F1={PF1:P𝒫}\mathcal{P}^{(1)}=\mathcal{P}\cap F_{1}=\{P\cap F_{1}:P\in\mathcal{P}\}. Then |𝒫(1)|=np1|\mathcal{P}^{(1)}|=\frac{n}{p_{1}} by Lemma 4.1. Thus the evaluations of f~\tilde{f} at 𝒫\mathcal{P} can be reduced to evaluations of f0,,fp11f_{0},\cdots,f_{p_{1}-1} at 𝒫(1)\mathcal{P}^{(1)}, and then a combination of these values according to Equation (4.0.14). If n<q+1n<q+1, by the choice of 𝔓\mathfrak{P}, there are no places in 𝒫\mathcal{P} that are poles of 1/(xλ1,1)(xλ1,p11)1/{(x-\lambda_{1,1})\cdots(x-\lambda_{1,p_{1}-1})}. So the reduction works well and we can directly get the recursive formula (4.0.17) which leads to the final result C(n)=O(Bnlogn)C(n)=O(Bn\log n). However, if n=q+1n=q+1, the pole places of 1/(xλ1,1)(xλ1,p11)1/{(x-\lambda_{1,1})\cdots(x-\lambda_{1,p_{1}-1})} are contained in 𝒫\mathcal{P} which need to be dealt with separately.

We are left to consider the case n=q+1n=q+1. According to Lemma 4.2 (2), let Λ1={Pλ1,1,,Pλ1,p11,P0,}\Lambda_{1}=\{P_{\lambda_{1,1}},\dots,P_{\lambda_{1,p_{1}-1}},P_{0,\infty}\} be the set of all rational places of F0F_{0} lying above P1,P_{1,\infty}. For any Pλ1,iΛ1P_{\lambda_{1,i}}\in\Lambda_{1}, by computations,

νPλ1,i(yr)=νPr,(yr)=2,νPλ1,i(xr)=1,νPλ1,i(zj(ej))=νPj,(zj(ej))=ej0,for 1jr1,ejpj+1νPλ1,i(z0(k))=1,forikp11,νPλ1,i(z0(k))=0,for 1k<i,\begin{split}&\nu_{P_{\lambda_{1,i}}}(y_{r})=\nu_{P_{r,\infty}}(y_{r})=2,\ \nu_{P_{\lambda_{1,i}}}(x_{r})=-1,\\ &\nu_{P_{\lambda_{1,i}}}(z_{j}^{(e_{j})})=\nu_{P_{j,\infty}}(z_{j}^{(e_{j})})=e_{j}\geq 0,\ \text{for}\ 1\leq j\leq r-1,\ e_{j}\in\mathbb{Z}_{p_{j+1}}\\ &\nu_{P_{\lambda_{1,i}}}(z_{0}^{(k)})=-1,\ \text{for}\ i\leq k\leq p_{1}-1,\\ &\nu_{P_{\lambda_{1,i}}}(z_{0}^{(k)})=0,\ \text{for}\ 1\leq k<i,\\ \end{split} (4.0.15)

Then,

νPλ1,i(fk)mine¯i=1r1pi+1{νPλ1,i(aek¯xryrz(e¯))}1,\nu_{P_{\lambda_{1,i}}}(f_{k})\geq\min_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\prod_{i=1}^{r-1}\mathbb{Z}_{p_{i+1}}}\{\nu_{P_{\lambda_{1,i}}}(a_{\underline{\operatorname{\textbf{e}}^{\prime}k}}\cdot{x_{r}y_{r}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})}})\}\geq 1,

namely, each Pλ1,iP_{\lambda_{1,i}} is a zero of fk=e¯i=1r1pi+1aek¯xryrz(e¯)f_{k}=\sum_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\prod_{i=1}^{r-1}\mathbb{Z}_{p_{i+1}}}a_{\underline{\operatorname{\textbf{e}}^{\prime}k}}\cdot{x_{r}y_{r}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})}}. However, {Pλ1,1,,Pλ1,k}\{P_{\lambda_{1,1}},\dots,P_{\lambda_{1,k}}\} are simple poles of z0(k)z_{0}^{(k)}. We cannot evaluate fkf_{k} and z0(k)z_{0}^{(k)} separately at Pλ1,iP_{\lambda_{1,i}} for iki\leq k.

Note that, by Equation (4.0.15), νPλ1,i(xryrze¯)0\nu_{P_{\lambda_{1,i}}}({x_{r}y_{r}\operatorname{\textbf{z}}^{\operatorname{\underline{\textbf{e}}}})}\geq 0 and take 0 only at e¯=(00k)\operatorname{\underline{\textbf{e}}}=(0\cdots 0k) for ikp11i\leq k\leq p_{1}-1. Thus

z0(k)fk(Pλ1,i)=(a00kxryrz0(k))(Pλ1,i)=(a00kur,0(x)α𝒫^(xα)Qn(x)1(xλ1,1)(xλ1,k))(Pλ1,i)={a00kur,0(λ1,i)α𝒫^{λ1,1,,λ1,k}(λ1,iα)Q(λ1,i)n,ifikp11;0,ifk<ip11.\begin{split}{z_{0}^{(k)}f_{k}}(P_{\lambda_{1,i}})&=\big{(}a_{0\cdots 0k}x_{r}y_{r}\cdot z_{0}^{(k)}\big{)}(P_{\lambda_{1,i}})\\ &=\left(a_{0\cdots 0k}\frac{u_{r,0}(x)\prod_{\alpha\in\hat{\mathcal{P}}}(x-\alpha)}{Q^{n}(x)}\cdot\frac{1}{(x-\lambda_{1,1})\cdots(x-\lambda_{1,k})}\right)(P_{\lambda_{1,i}})\\ &=\begin{cases}a_{0\cdots 0k}\cdot\frac{u_{r,0}(\lambda_{1,i})\prod_{\alpha\in\hat{\mathcal{P}}\setminus\{\lambda_{1,1},\ldots,\lambda_{1,k}\}}(\lambda_{1,i}-\alpha)}{Q(\lambda_{1,i})^{n}},\ &\text{if}\ i\leq k\leq p_{1}-1;\\ 0,\ &\text{if}\ k<i\leq p_{1}-1.\end{cases}\end{split} (4.0.16)

where the second “=” follows from Lemma 4.2 (4).

Thus, for these p11p_{1}-1 places in Λ1\Lambda_{1}, we can get f~(Pλ1,i)\tilde{f}(P_{\lambda_{1,i}}) according to equations (4.0.16) and (4.0.14) which can be done in p12p_{1}^{2} by pre-computing of

ur,0(λ1,i)α𝒫^{λ1,1,,λ1,k}(λ1,iα)Q(λ1,i)nfor 1ikp11.\frac{u_{r,0}(\lambda_{1,i})\prod_{\alpha\in\hat{\mathcal{P}}\setminus\{\lambda_{1,1},\ldots,\lambda_{1,k}\}}(\lambda_{1,i}-\alpha)}{Q(\lambda_{1,i})^{n}}\ \text{for}\ 1\leq i\leq k\leq p_{1}-1.

It is not hard to see νP0,(z0(k)fk)>0\nu_{P_{0,\infty}}(z_{0}^{(k)}f_{k})>0, thus z0(k)fk(P0,)=0z_{0}^{(k)}f_{k}(P_{0,\infty})=0. As for other np1n-p_{1} rational places in 𝒫Λ1\mathcal{P}\setminus\Lambda_{1}, the functions {z0(k),fk}k=0p11\{z_{0}^{(k)},f_{k}\}_{k=0}^{p_{1}-1} in equation (4.0.14) are well-defined at them. Thus, we can get the corresponding np1n-p_{1} evaluations of f~\tilde{f} according to Equations (4.0.14) by using p1O(np1)p_{1}\cdot O(n-p_{1}) operations in 𝔽q\mathbb{F}_{q}. Since each fk(np1Q1)f_{k}\in\mathcal{L}(\frac{n}{p_{1}}Q_{1}) and |𝒫1|=n/p1|\mathcal{P}_{1}|=n/p_{1}, by the above analysis, we get a recursive formula for C(n)C(n):

C(n)=p1C(n/p1)+p1O(n).C(n)=p_{1}\cdot C\left({n}/{p_{1}}\right)+p_{1}\cdot O(n). (4.0.17)

Then we can continue to reduce the DFT of each fkf_{k} at 𝒫(1)\mathcal{P}^{(1)} to the DFTs of p2p_{2} functions in ((n/p1p2)Q2)\mathcal{L}\big{(}(n/{p_{1}p_{2}})Q_{2}\big{)} at 𝒫(2)=𝒫F2\mathcal{P}^{(2)}=\mathcal{P}\cap F_{2}, till to the last step where we get nn functions in (Qr)\mathcal{L}(Q_{r}) and evaluate them at 𝒫(r)={Pr,}\mathcal{P}^{(r)}=\{P_{r,\infty}\}, which can be done in O(n)O(n) operations. Overall, the recursive formula (4.0.17) lead to a final complexity of C(n)C(n):

C(n)=p1p2prC(1)+(pr++p2+p1)O(n)=O(Bnlogn).\begin{split}C(n)=p_{1}p_{2}\cdots p_{r}\cdot C(1)+(p_{r}+\dots+p_{2}+p_{1})\cdot O(n)=O(B\cdot n\log n).\end{split}

Since we need some precomputations in each step of the reduction, we analyze the total storage for these precomputations. Precisely, in step jj of the reduction, we need to precompute

(yrxrzj(k))(Pλj,i)=ur,j(λj,i)α𝒫^(j){λj,1,,λj,k}(λj,iα)Qj(λj,i)nfor 1ikpj1.\left(y_{r}x_{r}z_{j}^{(k)}\right)(P_{\lambda_{j,i}})=\frac{u_{r,j}(\lambda_{j,i})\prod_{\alpha\in\hat{\mathcal{P}}^{(j)}\setminus\{\lambda_{j,1},\ldots,\lambda_{j,k}\}}(\lambda_{j,i}-\alpha)}{Q_{j}(\lambda_{j,i})^{n}}\ \text{for}\ 1\leq i\leq k\leq p_{j}-1.

Therefore, the storage for the precomputations in the process of reduction is:

j=1rk=1pj1ik1=O(B4logn).\sum_{j=1}^{r}\sum_{k=1}^{p_{j}-1}\sum_{i\leq k}1=O(B^{4}\log n).

By adding the storage for ur,0(α)Qn(α)\frac{u_{r,0}(\alpha)}{Q^{n}}(\alpha) for every α𝒫\alpha\in\mathcal{P}, the total storage for precomputation is n+O(B4logn)=O(n)n+O(B^{4}\log n)=O(n).

Remark 4.7.
  • (1)

    By the same analysis as in Remark 3.3, the inverse G-FFT for the q+1q+1 case can also be performed in O(nlogn)O(n\log n) operations.

  • (2)

    Let f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} be any nonzero polynomial. In the affine G-FFT, we showed that the transformation from the coefficients of f(x)f(x) under the basis \mathcal{B} in Lemma 3.1 to the coefficients of f(x)f(x) under the standard basis can be done in at most O(nlogn)O(n\log n) operations. However, in the q+1q+1 case, this transformation will cost O(M(n)logn)O(M(n)\log n) operations in 𝔽q\mathbb{F}_{q}, where M(n)M(n) denotes the cost of multiplication of two polynomials of degree less than nn. For simplicity, we take q+1=2rq+1=2^{r} as an example to illustrate. In this case, assume xi+1=ui,i+1(xi)xiλix_{i+1}=\frac{u_{i,i+1}(x_{i})}{x_{i}-\lambda_{i}} for i=0,1,,r1i=0,1,\ldots,r-1. Assume f(x)=(xqx)e¯2rae¯z(e¯)f(x)=(x^{q}-x)\cdot\sum_{\operatorname{\underline{\textbf{e}}}\in\mathbb{Z}_{2}^{r}}a_{\operatorname{\underline{\textbf{e}}}}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}})}, where each z(e¯)=z0(e0)z1(e1)zr1(er1)\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}})}=z_{0}^{(e_{0})}z_{1}^{(e_{1})}\cdots z_{r-1}^{(e_{r-1})} is the base element in Lemma 4.5, and the evaluation set 𝒫=𝒫i,0𝒫i,1\mathcal{P}=\mathcal{P}_{i,0}\sqcup\mathcal{P}_{i,1}, where i=1,2,,r1i=1,2,\ldots,r-1 and 𝒫i,0\mathcal{P}_{i,0} (resp. 𝒫i,1\mathcal{P}_{i,1}) F\subsetneq\mathbb{P}_{F} is the set of places lying above Pi,P_{i,\infty} (resp. Pi,λiP_{i,\lambda_{i}}). Since

    f(x)=(xqx)e¯2r1ae0¯z(e¯)+(xqx)zr1(1)e¯2r1ae1¯z(e¯)=f0(x)α𝒫r1,1(xα)+f1(x).\begin{split}f(x)&=(x^{q}-x)\cdot\sum_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\mathbb{Z}_{2}^{r-1}}a_{\underline{\operatorname{\textbf{e}}^{\prime}0}}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})}+(x^{q}-x)z_{r-1}^{(1)}\cdot\sum_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\mathbb{Z}_{2}^{r-1}}a_{\underline{\operatorname{\textbf{e}}^{\prime}1}}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})}\\ &=f_{0}(x)\cdot\prod_{\alpha\in\mathcal{P}_{r-1,1}}(x-\alpha)+f_{1}(x).\end{split}

    where

    f0(x)=α𝒫^r1,0(xα)e¯2r1ae0¯z(e¯),f1(x)=α𝒫^r1,0(xα)e¯2r1ae1¯z(e¯).f_{0}(x)=\prod_{\alpha\in\hat{\mathcal{P}}_{r-1,0}}(x-\alpha)\sum_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\mathbb{Z}_{2}^{r-1}}a_{\underline{\operatorname{\textbf{e}}^{\prime}0}}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})},\ f_{1}(x)=\prod_{\alpha\in\hat{\mathcal{P}}_{r-1,0}}(x-\alpha)\sum_{\operatorname{\underline{\textbf{e}}}^{\prime}\in\mathbb{Z}_{2}^{r-1}}a_{\underline{\operatorname{\textbf{e}}^{\prime}1}}\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})}.

    The pole places of z(e¯)\operatorname{\textbf{z}}^{(\operatorname{\underline{\textbf{e}}}^{\prime})} are in 𝒫^r1,0\hat{\mathcal{P}}_{r-1,0} for all e¯2r1\operatorname{\underline{\textbf{e}}}^{\prime}\in\mathbb{Z}_{2}^{r-1}. Therefore, the computation of coefficients of f(x)f(x) under the standard basis can be reduced to the computations of f0(x)f_{0}(x) and f1(x)f_{1}(x) under the standard basis. Then get ff by multiplying f0f_{0} with α𝒫r2,1(xα)\prod_{\alpha\in\mathcal{P}_{r-2,1}}(x-\alpha) and add f1f_{1}. Thus the transformation complexity, denoted by B(n)B(n), satisfies

    B(n)=2B(n/2)+M(n)=O(M(n)logn).B(n)=2B(n/2)+M(n)=O(M(n)\log n).
Example 4.8.

At last, let us present an example with q+1=2rq+1=2^{r} to illustrate the G-FFT. Since xi=ui,i1(xi1)xi1λi1x_{i}=\frac{u_{i,i-1}(x_{i-1})}{x_{i-1}-\lambda_{i-1}} for i=1,,ri=1,\dots,r by Lemma 4.2, the pole place Pi,FiP_{i,\infty}\in\mathbb{P}_{F_{i}} of xix_{i} splits into two places in Fi1\mathbb{P}_{F_{i-1}}, i.e., Pi1,P_{i-1,\infty} and Pi1,λi1P_{i-1,\lambda_{i-1}}. Let ci=yrxrxiλixi=λic_{i}=\frac{y_{r}x_{r}}{x_{i}-\lambda_{i}}\mid_{x_{i}=\lambda_{i}} for i=0,1,,ri=0,1,\ldots,r. Then the G-FFT in the proof of Theorem 4.6 can be implemented by the following Algorithm 1. Note that it can be also applied to the general case, i.e., q+1=i=1rpiq+1=\prod_{i=1}^{r}p_{i} is BB-smooth, by writing f~\tilde{f} under the basis ~\tilde{\mathcal{B}} in step 4 and adding the discussion of pj1p_{j}-1 pole places {Pj,1,,Pj,pj1}𝒫(j1)\{P_{j,1},\ldots,P_{j,p_{j}-1}\}\subset\mathcal{P}^{(j-1)} of xjx_{j} in step 8. For better understanding, we only present the simplest case q+1=2rq+1=2^{r}.

Algorithm 1 G-FFT(f,𝒫f,\mathcal{P}).
1:f=e¯2rae¯xqxi=0r1(xiλi)ei𝔽q[x]<qf=\sum_{\operatorname{\underline{\textbf{e}}}\in\mathbb{Z}_{2}^{r}}a_{\operatorname{\underline{\textbf{e}}}}\frac{x^{q}-x}{\prod_{i=0}^{r-1}(x_{i}-\lambda_{i})^{e_{i}}}\in\mathbb{F}_{q}[x]_{<q} and 𝒫=𝔽q={P1,P2,,Pq}\mathcal{P}=\mathbb{F}_{q}=\{P_{1},P_{2},\dots,P_{q}\}.
2:(f(P1),,f(Pq))𝔽qq\left(f(P_{1}),\dots,f(P_{q})\right)\in\mathbb{F}_{q}^{q}.
  • Write f~=ur,0(x)cr,0Qq+1(x)f=e¯2rae¯yrxri=0r1(xiλi)ei\tilde{f}=\frac{u_{r,0}(x)}{c_{r,0}Q^{q+1}(x)}f=\sum_{\operatorname{\underline{\textbf{e}}}\in\mathbb{Z}_{2}^{r}}a_{\operatorname{\underline{\textbf{e}}}}\frac{y_{r}x_{r}}{\prod_{i=0}^{r-1}(x_{i}-\lambda_{i})^{e_{i}}}.

  • For j=0,1,,rj=0,1,\dots,r, let 𝒫(j)=𝒫Fj\mathcal{P}^{(j)}=\mathcal{P}\cap F_{j}. In particular, 𝒫(0)=𝒫\mathcal{P}^{(0)}=\mathcal{P}.

3:for j=0j=0 to r1r-1 do
4:     Write f~=f0+f11xjλj\tilde{f}=f_{0}+f_{1}\frac{1}{x_{j}-\lambda_{j}}.
5:     jj+1j\leftarrow j+1
6:     Call G-FFT(fk,𝒫(j))\left(f_{k},\mathcal{P}^{(j)}\right) for k=0,1k=0,1.
7:     Return f~(P)=f0(P)+f1(P)1xj1(P)λj1\tilde{f}(P)=f_{0}(P)+f_{1}(P)\cdot\frac{1}{x_{j-1}(P)-\lambda_{j-1}} for P𝒫(j1){Pλj1}P\in\mathcal{P}^{(j-1)}\setminus\{P_{\lambda_{j-1}}\};
8:     Return f~(P)=a001¯cj1\tilde{f}(P)=a_{\underline{0\cdots 01}}\cdot c_{j-1} for P=Pλj1P=P_{\lambda_{j-1}}, where a001¯a_{\underline{0\cdots 01}} is the coefficient of ff at the term yrxrxj1λj1\frac{y_{r}x_{r}}{x_{j-1}-\lambda_{j-1}}.
9:end for
10:Output f(P)=cr,0Qq+1(P)ur,0(P)f~(P)f(P)=\frac{c_{r,0}Q^{q+1}(P)}{u_{r,0}(P)}\tilde{f}(P) for every P𝒫P\in\mathcal{P}.

More specifically, all data for the case q=127q=127 are as follows. Let F=𝔽127(x)F=\mathbb{F}_{127}(x). Firstly, we choose a primitive polynomial over 𝔽127\mathbb{F}_{127}: m(x)=x2+126x+3m(x)=x^{2}+126x+3. The automorphism σ=(011241)PGL2(q)\sigma=\left(\begin{matrix}0&1\\ 124&1\end{matrix}\right)\in\mathrm{PGL}_{2}(q) has order 128128. Since q+1=27q+1=2^{7} and r=7r=7, we can construct a tower of fields, i.e., 𝔽q(xi)=FGi\mathbb{F}_{q}(x_{i})=F^{G_{i}} is the subfield fixed by Gi=σ2riG_{i}=\langle\sigma^{2^{r-i}}\rangle, i=0,1,,7i=0,1,\ldots,7. Then xi=τGiτ(x)x_{i}=\sum_{\tau\in G_{i}}\tau(x). In addition, let Q(x)=x2+42x+85Q(x)=x^{2}+42x+85 and QQ be the quadratic place corresponding to Q(x)Q(x). Then QQ is totally ramified in F/FGF/F^{G}. Let yi=τGiτ(1/Q(x))y_{i}=\prod_{\tau\in G_{i}}\tau(1/Q(x)). The field Ei=𝔽q(yi)E_{i}=\mathbb{F}_{q}(y_{i}) is a subfield of Fq(xi)F_{q}(x_{i}). By computation, we have

(1) The generator xix_{i} (resp. yi=1/Qi(xi)y_{i}=1/Q_{i}(x_{i})) of the subfield Fi=𝔽q(xi)F_{i}=\mathbb{F}_{q}(x_{i}) (resp. Ei=𝔽q(yi)E_{i}=\mathbb{F}_{q}(y_{i})) for i=0,1,,7i=0,1,\ldots,7. By the definition (4.0.1), we have

x0=x,x1=x2+42x+21,x2=x4+125x2+71x+33x3+63x2+28x+100,x3=x8+33x6+105x5+24x4++122x7+20x6+69x5++20x4=x16+87x14+34x13+116x12+43x15+61x14+91x13+34x12++44,x5=x32+4x30+29x29+119x28++8x31+16x30+22x29+47x28++116,x6=x64+90x62+15x61+89x60++80x63+53x62+67x61+64x60++71,x7=x128+42x+85x127+126x=u3,0(x)xqx.\begin{split}&x_{0}=x,\ x_{1}=\frac{x^{2}+42}{x+21},\ x_{2}=\frac{x^{4}+125x^{2}+71x+33}{x^{3}+63x^{2}+28x+100},\ x_{3}=\frac{x^{8}+33x^{6}+105x^{5}+24x^{4}+\cdots+122}{x^{7}+20x^{6}+69x^{5}+\cdots+20}\\ &x_{4}=\frac{x^{16}+87x^{14}+34x^{13}+116x^{12}\cdots+43}{x^{15}+61x^{14}+91x^{13}+34x^{12}+\cdots+44},\ x_{5}=\frac{x^{32}+4x^{30}+29x^{29}+119x^{28}+\cdots+8}{x^{31}+16x^{30}+22x^{29}+47x^{28}+\cdots+116},\\ &x_{6}=\frac{x^{64}+90x^{62}+15x^{61}+89x^{60}+\cdots+80}{x^{63}+53x^{62}+67x^{61}+64x^{60}+\cdots+71},\ x_{7}=\frac{x^{128}+42x+85}{x^{127}+126x}=\frac{u_{3,0}(x)}{x^{q}-x}.\end{split}

and

y0=1/Q(x),y1=x2+42x+6025Q(x)2,y2=x6+126x5++9416Q(x)4,y3=x14+40x13++1938Q(x)8,y4=x30+122x29++3116Q(x)16,y5=x62+32x61++121100Q(x)32,y6=x126+106x125++884Q(x)64,y7=x254+125x128+x2100Q(x)128.\begin{split}&y_{0}=1/Q(x),\ y_{1}=\frac{x^{2}+42x+60}{25Q(x)^{2}},\ y_{2}=\frac{x^{6}+126x^{5}+\cdots+94}{16Q(x)^{4}},\ y_{3}=\frac{x^{14}+40x^{13}+\cdots+19}{38Q(x)^{8}},\\ &y_{4}=\frac{x^{30}+122x^{29}+\cdots+31}{16Q(x)^{16}},\ y_{5}=\frac{x^{62}+32x^{61}+\cdots+121}{100Q(x)^{32}},\ y_{6}=\frac{x^{126}+106x^{125}+\cdots+88}{4Q(x)^{64}},\\ &y_{7}=\frac{x^{254}+125x^{128}+x^{2}}{100Q(x)^{128}}.\end{split}

(2) The basis of 𝔽q[x]<128\mathbb{F}_{q}[x]_{<128}. The pole place Pi,FiP_{i,\infty}\in\mathbb{P}_{F_{i}} of xix_{i} splits into two places in Fi1\mathbb{P}_{F_{i-1}}, i.e., Pi1,P_{i-1,\infty} and Pi1,λi1P_{i-1,\lambda_{i-1}}, for every ii. By computations, we have λ0=106,λ1=85,λ2=43,λ3=86,λ4=45,λ5=90,λ6=53\lambda_{0}=106,\lambda_{1}=85,\lambda_{2}=43,\lambda_{3}=86,\lambda_{4}=45,\lambda_{5}=90,\lambda_{6}=53. Thus, by Lemma 4.5, a basis of 𝔽q[x]<128\mathbb{F}_{q}[x]_{<128} is

{x127xi=06(xiλi)eie¯=(e0,e1,,e6)27}\left\{\frac{x^{127}-x}{\prod_{i=0}^{6}(x_{i}-\lambda_{i})^{e_{i}}}\mid\operatorname{\underline{\textbf{e}}}=(e_{0},e_{1},\ldots,e_{6})\in\mathbb{Z}_{2}^{7}\right\}

(3) The precomputations. In the proof of Theorem 4.6, we need to store rr values in precomputing process:

c0=y7x7xλ0x=λ0=106,c1=y7x7x1λ1x1=λ1=101,c2=y7x7x2λ2x2=λ2=64,c3=y7x7x3λ3x3=λ3=34,c4=y7x7x4λ4x4=λ4=35,c5=y7x7x5λ5x5=λ5=1,c6=y7x7x6λ6x6=λ6=0.\begin{split}&c_{0}=\frac{y_{7}x_{7}}{x-\lambda_{0}}\mid_{x=\lambda_{0}}=106,\ c_{1}=\frac{y_{7}x_{7}}{x_{1}-\lambda_{1}}\mid_{x_{1}=\lambda_{1}}=101,\ c_{2}=\frac{y_{7}x_{7}}{x_{2}-\lambda_{2}}\mid_{x_{2}=\lambda_{2}}=64,\\ &c_{3}=\frac{y_{7}x_{7}}{x_{3}-\lambda_{3}}\mid_{x_{3}=\lambda_{3}}=34,\ c_{4}=\frac{y_{7}x_{7}}{x_{4}-\lambda_{4}}\mid_{x_{4}=\lambda_{4}}=35,\ c_{5}=\frac{y_{7}x_{7}}{x_{5}-\lambda_{5}}\mid_{x_{5}=\lambda_{5}}=1,\\ &c_{6}=\frac{y_{7}x_{7}}{x_{6}-\lambda_{6}}\mid_{x_{6}=\lambda_{6}}=0.\end{split}

According to Algorithm 1, we can compute the MPE for any f=e¯27ae¯x127xi=06(xiλi)ei𝔽127[x]<127f=\sum_{\operatorname{\underline{\textbf{e}}}\in\mathbb{Z}_{2}^{7}}a_{\operatorname{\underline{\textbf{e}}}}\frac{x^{127}-x}{\prod_{i=0}^{6}(x_{i}-\lambda_{i})^{e_{i}}}\in\mathbb{F}_{127}[x]_{<127} at 𝔽127\mathbb{F}_{127}. In Appendix B, we take a random f(x)f(x) (see its coefficients in Table 2) and set f~=x128+42x+85100Q128(x)f\tilde{f}=\frac{x^{128}+42x+85}{100Q^{128}(x)}f. Then we compute {f~(P):P𝔽128}\{\tilde{f}(P):P\in\mathbb{F}_{128}\} by G-FFT and get f(P)=100Q128(P)u7,0(P)f~(P)f(P)=\frac{100Q^{128}(P)}{u_{7,0}(P)}\tilde{f}(P) for every P𝔽127P\in\mathbb{F}_{127} (see Table 3 for the MPEs of f~\tilde{f} and ff at 𝔽q\mathbb{F}_{q}).

5. Conclusion

In this work, we consider FFT for evaluations of polynomials at a chosen set 𝒫𝔽q\mathcal{P}\subseteq\mathbb{F}_{q}. We conclude that when either q1q-1, qq or q+1q+1 is smooth, then it is possible to run FFT in O(nlogn)O(n\log n) operations of 𝔽q\mathbb{F}_{q}, which loosens the restriction of FFT on the finite field to some extent. Most importantly, if n(q+1)n\mid(q+1) is smooth, we give a practical algorithm to implement FFT over 𝔽q\mathbb{F}_{q} of length nn rather than turn to a quadratic extension field 𝔽q2\mathbb{F}_{q^{2}} to do FFT of length nn. Our framework is based on the Galois theory and properties of the rational function field. The applications of our new algorithm to polynomial arithmetic and encoding/decoding of Reed-Solomon codes are not considered in this work and we leave them as a future research work.

Acknowledgments

We would like to thank Yi Kuang for his help in implementing the G-FFT algorithm. We would also like to thank Liming Ma and Fuchun Lin for their valuable comments on the manuscript of this work. This Research is partially supported by the National Key Research and Development Program under Grant 2022YFA1004900 and the National Natural Science Foundation of China under Grant numbers 123031011 and 12031011.


References

  • [BCS13] Peter Bürgisser, Michael Clausen, and Mohammad A Shokrollahi. Algebraic complexity theory, volume 315. Springer Science & Business Media, 2013.
  • [BSCKL23] Eli Ben-Sasson, Dan Carmon, Swastik Kopparty, and David Levit. Elliptic curve fast Fourier transform (ECFFT) part I: Low-degree extension in time O(nlogn){O(n\log n)} over all finite fields. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 700–737. SIAM, 2023.
  • [Can89] D. G. Cantor. On arithmetical algorithms over finite fields. J. Combin. Theory, ser., 50(2):285–300, 1989.
  • [CT65] James W Cooley and John W Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of computation, 19(90):297–301, 1965.
  • [Fid72] Charles M Fiduccia. Polynomial evaluation via the division algorithm the fast Fourier transform revisited. In Proceedings of the fourth annual ACM symposium on Theory of computing, pages 88–93, 1972.
  • [Gao03] Shuhong Gao. A new algorithm for decoding Reed-Solomon codes. Communications, information and network security, pages 55–68, 2003.
  • [GS10] Mateer T. Gao S. Additive fast Fourier transforms over finite fields. IEEE Transactions on Information Theory, 56(12):6265–6272., 2010.
  • [HCLB21] Yunghsiang S Han, Chao Chen, Sian-Jheng Lin, and Baoming Bai. On fast Fourier transform-based decoding of Reed-Solomon codes. International Journal of Ad Hoc and Ubiquitous Computing, 36(3):180–187, 2021.
  • [HJB84] Michael Heideman, Don Johnson, and Charles Burrus. Gauss and the history of the fast Fourier transform. IEEE Assp Magazine, 1(4):14–21, 1984.
  • [HKTO08] James William Peter Hirschfeld, Gábor Korchmáros, Fernando Torres, and Fernando Eduardo Torres Orihuela. Algebraic curves over a finite field, volume 20. Princeton University Press, 2008.
  • [JMX19] Lingfei Jin, Liming Ma, and Chaoping Xing. Construction of optimal locally repairable codes via automorphism groups of rational function fields. IEEE Transactions on Information Theory, 66(1):210–221, 2019.
  • [Jus76] Jø\orn Justesen. On the complexity of decoding Reed-Solomon codes (corresp.). IEEE transactions on information theory, 22(2):237–238, 1976.
  • [KS04] Hans Kurzweil and Bernd Stellmacher. The theory of finite groups: an introduction, volume 1. Springer, 2004.
  • [LANH16] Sian-Jheng Lin, Tareq Y Al-Naffouri, and Yunghsiang S Han. FFT algorithm for binary extension finite fields and its application to Reed-Solomon codes. IEEE Transactions on Information Theory, 62(10):5343–5358, 2016.
  • [LANHC16] Sian-Jheng Lin, Tareq Y Al-Naffouri, Yunghsiang S Han, and Wei-Ho Chung. Novel polynomial basis with fast Fourier transform and its application to Reed-Solomon erasure codes. IEEE Transactions on Information Theory, 62(11):6284–6299, 2016.
  • [LCH14] Sian-Jheng Lin, Wei-Ho Chung, and Yunghsiang S Han. Novel polynomial basis and its application to Reed-Solomon erasure codes. In 2014 IEEE 55th annual symposium on foundations of computer science, pages 316–325. IEEE, 2014.
  • [M.71] Pollard J M. The fast Fourier transform in a finite field. Mathematics of computation, 25(114):365–374, 1971.
  • [Sti09] Henning Stichtenoth. Algebraic function fields and codes, volume 254. Springer Science & Business Media, 2009.
  • [vDHL17] Joris van Der Hoeven and Robin Larrieu. The frobenius FFT. In Proceedings of the 2017 ACM on International Symposium on Symbolic and Algebraic Computation, pages 437–444, 2017.
  • [VZGG13] Joachim Von Zur Gathen and Jürgen Gerhard. Modern computer algebra. Cambridge University press, 2013.
  • [WY88] Zhu X. Wang Y. A fast algorithm for the Fourier transform over finite fields and its VLSI implementation. IEEE Journal on Selected Areas in Communications, 6(3):572–577, 1988.

Appendix A (xpαx)(x^{p}-\alpha x)-adic expansions of polynomials in 𝔽q[x]\mathbb{F}_{q}[x]

Definition A.1 ([VZGG13, GS10]).

Let 𝔽q\mathbb{F}_{q} be a finite field with characteristic pp. Assume f(x)𝔽[x]<nf(x)\in\mathbb{F}[x]_{<n} is a polynomial of degree less than nn. For any α𝔽q\alpha\in\mathbb{F}_{q}, the (xpαx)(x^{p}-\alpha x)-adic expansion of f(x)f(x) is

f(x)=a0(x)+a1(x)(xpαx)++am(x)(xpαx)m,f(x)=a_{0}(x)+a_{1}(x)\cdot(x^{p}-\alpha x)+\dots+a_{m}(x)\cdot(x^{p}-\alpha x)^{m},

where ai(x)𝔽q[x]<pa_{i}(x)\in\mathbb{F}_{q}[x]_{<p} for 0im0\leq i\leq m and m=n/pm=\lfloor n/p\rfloor.

Mateer-Gao [GS10] showed that when p=2p=2, the (x2x)(x^{2}-x)-adic expansion of f(x)f(x) can be computed in O(nlogn)O(n\log n). In this part, we will generalize this result to any constant characteristic pp.

Lemma A.2.

Assume 𝔽q\mathbb{F}_{q} is a finite field with constant characteristic pp. Let f(x)𝔽q[x]<nf(x)\in\mathbb{F}_{q}[x]_{<n} be a polynomial of degree less than nn. For any α𝔽q\alpha\in\mathbb{F}_{q}, one can compute the (xpαx)(x^{p}-\alpha x)-adic representation of f(x)f(x) in O(nlogn)O(n\log n) operations in 𝔽q\mathbb{F}_{q}.

Proof.

Let cnc_{n} be the leading coefficient of f(x)f(x). If npn\leq p, then the (xpαx)(x^{p}-\alpha x)-adic expansion of ff is

f(x)={f(x),ifdeg(f)<p,cp(xpαx)+f(x)cp(xpαx),ifdeg(f)=p.f(x)=\begin{cases}f(x),&\text{if}\ \deg(f)<p,\\ c_{p}(x^{p}-\alpha x)+f(x)-c_{p}(x^{p}-\alpha x),&\text{if}\ \deg(f)=p.\end{cases}

Next, we assume deg(f)>p\deg(f)>p. Let r2r\geq 2 be a positive integer such that

pr1<npr.p^{r-1}<n\leq p^{r}.

Write f(x)f(x) as

f(x)=k=0p1f1(x)xpr1++fp1(x)x(p1)pr1,=k=0p1=0p1fk,(x)xpr2+kpr1,\begin{split}f(x)&=\sum_{k=0}^{p-1}f_{1}(x)\cdot x^{p^{r-1}}+\dots+f_{p-1}(x)\cdot x^{(p-1)p^{r-1}},\\ &=\sum_{k=0}^{p-1}\sum_{\ell=0}^{p-1}f_{k,\ell}(x)\cdot x^{\ell p^{r-2}+kp^{r-1}},\end{split} (A.0.1)

where deg(fk,)<pr2\deg(f_{k,\ell})<p^{r-2} for 0k,p10\leq k,\ell\leq p-1. For the convenience of writing, we sometimes denote xpαxx^{p}-\alpha x by TT and denote αpm\alpha^{p^{m}} by αm\alpha_{m} for any integer m0m\geq 0. Since the characteristic of 𝔽q\mathbb{F}_{q} is pp, we have

xpr1=(xpαx)pr2+αr2xpr2=Tpr2+αr2xpr2.x^{p^{r-1}}=(x^{p}-\alpha x)^{p^{r-2}}+\alpha_{r-2}x^{p^{r-2}}=T^{{p^{r-2}}}+\alpha_{r-2}x^{p^{r-2}}.

Thus, for each kk in [0,p1][0,p-1], we have

xkpr1=(Tpr2+αr2xpr2)k=j=0k(kj)αr2kjx(kj)pr2Tjpr2.x^{kp^{r-1}}=(T^{p^{r-2}}+\alpha_{r-2}x^{p^{r-2}})^{k}=\sum_{j=0}^{k}\binom{k}{j}\cdot\alpha_{r-2}^{k-j}\cdot x^{(k-j)p^{r-2}}T^{jp^{r-2}}. (A.0.2)

Now, by substituting each xkpr1x^{kp^{r-1}} in equation (A.0.1) with the formula on the right side of equation (A.0.1), we then have

f(x)=k,j=0kfk,(x)(kj)αr2kjx(+kj)pr2Tjpr2=j=0p1kjp1=0p1fk,(x)(kj)αr2kjx(+kj)pr2Tjpr2.\begin{split}f(x)&=\sum_{k,\ell}\sum_{j=0}^{k}f_{k,\ell}(x)\cdot\binom{k}{j}\cdot\alpha_{r-2}^{k-j}\cdot x^{(\ell+k-j)p^{r-2}}T^{jp^{r-2}}\\ &=\sum_{j=0}^{p-1}\sum_{k\geq j}^{p-1}\sum_{\ell=0}^{p-1}f_{k,\ell}(x)\cdot\binom{k}{j}\cdot\alpha_{r-2}^{k-j}\cdot x^{(\ell+k-j)p^{r-2}}T^{jp^{r-2}}.\end{split} (A.0.3)

If +kjp\ell+k-j\geq p, then

x(+kj)pr2=x(+kjp)pr2(Tpr2+αr2).x^{(\ell+k-j)p^{r-2}}=x^{(\ell+k-j-p)p^{r-2}}(T^{p^{r-2}}+\alpha_{r-2}).

We split the sum over \ell in equation (A.0.3) into two parts: a sum over \ell satisfying pk+j\ell\geq p-k+j and a sum over \ell satisfying <pk+j\ell<p-k+j. Then, by the above equation, we have

f(x)=j=0p1(kjp1+kj<pfk,(x)(kj)αr2kjx(+kj)pr2)Tjpr2+j=0p1(kjp1+kjpfk,(x)(kj)αr2kj+1x(+kjp+1)pr2)Tjpr2+j=1p1(kj1p1+kj+1pfk,(x)(kj1)αr2kj+1x(+kjp+1)pr2)Tjpr2.\begin{split}f(x)&=\sum_{j=0}^{p-1}\left(\sum_{k\geq j}^{p-1}\sum_{\ell+k-j<p}f_{k,\ell}(x)\cdot\binom{k}{j}\cdot\alpha_{r-2}^{k-j}\cdot x^{(\ell+k-j)p^{r-2}}\right)T^{jp^{r-2}}\\ &+\sum_{j=0}^{p-1}\left(\sum_{k\geq j}^{p-1}\sum_{\ell+k-j\geq p}f_{k,\ell}(x)\cdot\binom{k}{j}\cdot\alpha_{r-2}^{k-j+1}\cdot x^{(\ell+k-j-p+1)p^{r-2}}\right)T^{jp^{r-2}}\\ &+\sum_{j=1}^{p-1}\left(\sum_{k\geq j-1}^{p-1}\sum_{\ell+k-j+1\geq p}f_{k,\ell}(x)\cdot\binom{k}{j-1}\cdot\alpha_{r-2}^{k-j+1}\cdot x^{(\ell+k-j-p+1)p^{r-2}}\right)T^{jp^{r-2}}.\end{split} (A.0.4)

Note that the three coefficients of Tjpr2T^{jp^{r-2}} in the above equation all are polynomials of degree less than pr1p^{r-1}. Thus their sum, denoted by gj(x)g_{j}(x), is also a polynomial of degree less than pr1p^{r-1}. To explicitly compute the coefficients polynomial gj(x)g_{j}(x) of Tjpr2T^{jp^{r-2}}, we need to add some polynomial terms fk,xspr2f_{k,\ell}x^{sp^{r-2}} and fk,xspr2f_{k^{\prime},\ell^{\prime}}x^{s^{\prime}p^{r-2}} for 0s,sp10\leq s,s^{\prime}\leq p-1 in equation (A.0.4). Since deg(f,k),deg(f,k)<pr2\deg(f_{\ell,k}),\deg(f_{\ell^{\prime},k^{\prime}})<p^{r-2}, the additions are only needed when s=ss=s^{\prime}. In this case, fk,xspr2+fk,xspr2f_{k,\ell}x^{sp^{r-2}}+f_{k^{\prime},\ell^{\prime}}x^{sp^{r-2}} can be computed in at most pr2p^{r-2} additions in 𝔽q\mathbb{F}_{q}. By counting, there are at most p2p^{2} polynomials in the form fk,xspr2f_{k,\ell}x^{sp^{r-2}} for 0<,kp10<\ell,k\leq p-1. Thus we need at most pprp\cdot p^{r} additions in 𝔽q\mathbb{F}_{q} to compute the coefficient gj(x)g_{j}(x) of Tjpr2T^{jp^{r-2}}. Next, we count the total multiplications in 𝔽q\mathbb{F}_{q} to compute gj(x)g_{j}(x). Since there are most p2p^{2} polynomials fk,f_{k,\ell} needed to be multiplied by a scalar (kj)αr2kj\binom{k}{j}\alpha_{r-2}^{k-j} and each scalar multiplication needs at most pr2p^{r-2} multiplications in 𝔽q\mathbb{F}_{q}. Thus the total multiplications are also pr+1p^{r+1}. After the combinations, we get

f(x)=j=0p1gj(x)Tjpr2,deg(gj)<pr1for 0jp1.f(x)=\sum_{j=0}^{p-1}g_{j}(x)T^{jp^{r-2}},\ \deg(g_{j})<p^{r-1}\ \text{for}\ 0\leq j\leq p-1.

Thus, the (xpαx)(x^{p}-\alpha x)-adic expansion of f(x)f(x) of degree less than prp^{r} is reduced to pp (xpαx)(x^{p}-\alpha x)-adic expansions of polynomials of degree <pr1<p^{r-1}. We can continue this procedure recursively to g0,g1,,gp1g_{0},g_{1},\dots,g_{p-1}. Then in at most rr recursions, all the polynomials will have degrees less than pp and we obtain the (xpαx)(x^{p}-\alpha x)-expansion of f(x)f(x). Let Ap(n)A_{p}(n) and Mp(n)M_{p}(n) be the complexity of the (xpαx)(x^{p}-\alpha x)-adic expansion of a polynomial in 𝔽q[x]n\mathbb{F}_{q}[x]_{\leq n}. Through the above analysis, we have

Ap(n)=pAp(n/p)+O(n),Mp(n)=pMp(n/p)+O(n).A_{p}(n)=p\cdot A_{p}(n/p)+O(n),\ M_{p}(n)=p\cdot M_{p}(n/p)+O(n).

The recursive formula finally leads to Ap(n)=O(nlogn)A_{p}(n)=O(n\cdot\log n) and Mp(n)=O(nlogn)M_{p}(n)=O(n\cdot\log n). ∎

Appendix B An example of G-FFT with q+1=128q+1=128

Table 2. The coefficients of f(x)f(x)
ee 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
ae¯a_{\operatorname{\underline{\textbf{e}}}} 15 4 37 109 3 87 116 18 10 90 73 51 92 66 121 86 70
ee 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
ae¯a_{\operatorname{\underline{\textbf{e}}}} 13 21 95 29 122 78 122 78 41 26 49 44 66 19 66 40 121
ee 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
ae¯a_{\operatorname{\underline{\textbf{e}}}} 81 3 116 4 50 40 121 85 25 66 38 55 42 98 37 116 15
ee 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67
ae¯a_{\operatorname{\underline{\textbf{e}}}} 49 33 100 86 120 104 61 114 0 10 17 68 91 81 98 124 44
ee 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
ae¯a_{\operatorname{\underline{\textbf{e}}}} 5 23 119 115 25 73 10 113 17 91 11 86 118 8 31 63 32
ee 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101
ae¯a_{\operatorname{\underline{\textbf{e}}}} 21 62 77 51 90 53 89 48 97 11 15 77 8 64 63 7 62
ee 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
ae¯a_{\operatorname{\underline{\textbf{e}}}} 55 92 116 116 118 53 80 39 47 84 53 100 4 97 40 106 108
ee 119 120 121 122 123 124 125 126 127
ae¯a_{\operatorname{\underline{\textbf{e}}}} 39 107 25 67 51 87 90 111 93
Table 3. The MPEs of f(x)f(x) and f~(x)\tilde{f}(x)
α\alpha f(α)f(\alpha) f~(α)\tilde{f}(\alpha) α\alpha f(α){f}(\alpha) f~(α)\tilde{f}(\alpha) α\alpha f(α){f}(\alpha) f~(α)\tilde{f}(\alpha) α\alpha f(α){f}(\alpha) f~(α)\tilde{f}(\alpha) α\alpha f(α){f}(\alpha) f~(α)\tilde{f}(\alpha)
\infty 0 0 81 74 71 8 55 114 59 85 121 6 54 32
106 123 89 107 126 91 92 75 101 66 11 112 11 3 62
101 94 2 24 91 33 46 108 57 5 92 104 38 61 123
111 123 108 49 3 63 117 99 31 22 85 35 41 10 79
21 42 64 68 105 95 9 96 83 20 68 5 19 119 34
54 102 106 110 42 107 84 22 18 93 74 14 26 117 53
31 9 99 1 40 102 37 68 9 25 98 111 27 25 48
64 98 107 76 93 5 99 77 6 53 96 62 124 68 43
34 114 12 48 109 91 12 31 68 75 124 119 10 40 22
94 38 103 113 10 75 86 36 50 115 41 5 97 40 104
89 89 93 13 16 100 72 30 124 23 68 48 32 116 22
100 120 30 109 21 100 103 68 94 91 48 71 60 72 66
51 76 8 73 113 84 0 86 61 40 83 72 63 20 68
118 37 117 126 48 109 2 79 89 116 96 30 80 3 31
112 0 0 14 29 116 70 23 71 30 23 17 65 54 60
123 24 95 69 36 71 82 120 118 108 69 33 119 40 11
4 0 0 57 77 83 43 89 11 33 75 81 45 111 107
105 111 59 78 106 106 56 105 52 122 73 33 96 48 15
35 109 6 39 35 22 98 15 65 3 104 43 62 126 74
67 51 57 95 65 114 125 8 126 15 110 69 121 58 17
17 62 44 77 73 22 44 58 1 83 46 47 52 93 9
102 105 77 120 53 90 47 10 16 85 16 97 90 36 105
36 69 52 7 101 101 74 110 72 29 76 109 55 77 79
61 40 48 28 77 83 79 69 55 42 122 85 104 34 77
18 89 92 16 41 35 58 77 31 87 6 31
50 123 86 71 84 82 88 4 10 114 43 17