This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Fast Navigation with Icosahedral Golden Gates

Terrence Richard Blackman MEC & IAS [email protected] http://terrenceblackman.com  and  Zachary Stier UC Berkeley [email protected]
Abstract.

An algorithm of Ross and Selinger for the factorization of diagonal elements of PU(2) to within distance ε\varepsilon was adapted by Parzanchevski and Sarnak into an efficient probabilistic algorithm for any element of PU(2) using at most effective 3logp1ε33\log_{p}\frac{1}{\varepsilon^{3}} factors from certain well-chosen sets associated to a number field and a prime pp. The icosahedral super golden gates are one such set associated to (5)\mathbb{Q}(\sqrt{5}). We leverage recent work of Carvalho Pinto, Petit, and Stier to reduce this bound to 73log591ε3\frac{7}{3}\log_{59}\frac{1}{\varepsilon^{3}}, and we implement the algorithm in Python. This represents an improvement by a multiplicative factor of log2595.9\log_{2}59\approx 5.9 over the analogous result for the Clifford+TT gates. This is of interest because the icosahedral gates have shortest factorization lengths among all super golden gates.

Key words and phrases:
Quantum computing, quaternion algebras
1991 Mathematics Subject Classification:
Primary 68Q12, 81P65, 11R52; Secondary 51F25

1. Introduction

Lubotzky, Phillips, and Sarnak, in a series of papers [11, 12, 13, 14], explicitly constructed topological generators with optimal covering properties for the compact Lie group PU(2)\operatorname{PU}(2). Such generators for projective unitary groups find an interesting application in quantum computing where a fundamental design challenge is to determine an optimal, fault-tolerant decomposition of a quantum gate. For classical computing a single bit state is an element of {0,1}\{0,1\}. A classical gate implements functions on binary inputs. The only nontrivial single bit logic operation is NOT, which takes 0 to 1 and 1 to 0 (though it is also possible for the codomain to contain more than one bit). In the quantum setting, single qubit states are points

u=(u1,u2)2u=(u_{1},u_{2})\in\mathbb{C}^{2}

up to a mutual phase e2πiθe^{2\pi i\theta} in each component, such that

|u|2=|u1|2+|u2|2=1.\left\lvert u\right\rvert^{2}=\left\lvert u_{1}\right\rvert^{2}+\left\lvert u_{2}\right\rvert^{2}=1.

A gate here cannot output more than one qubit, and thus must be a 2×22\times 2 projective unitary.

A universal gate set is a finite set of gates, S:={s1,s2,,sk:sPU(2)}S:=\{s_{1},s_{2},\cdots,s_{k}:s_{\ell}\in\operatorname{PU}(2)\}, that can approximate, in the bi-invariant metric on the compact Lie group, any matrix arbitrarily well. That is, the group generated by SS must be topologically dense in PU(2)\operatorname{PU}(2). The Solovay–Kitaev theorem [17] guarantees that universal gate sets can efficiently approximate quantum operations for unitaries on a constant number of qubits. Universal gate sets typically consists of a finite group CPU(2)C\leqslant\operatorname{PU}(2) together with an extra element τ\tau, which we will take to be an involution, so that the subgroup generated by CC and τ\tau covers PU(2)\operatorname{PU}(2) with minimal τ\tau-count, and simultaneously navigates PU(2)\operatorname{PU}(2) efficiently. That is, given some gate in PU(2)\operatorname{PU}(2) and desired precision ε\varepsilon, there is an efficient algorithm (polynomial-time in the input size) that with high probability finds a short word in SS to that precision, typically of length O(log(1/ε))O(\log(\nicefrac{{1}}{{\varepsilon}})). The deep insight of Sarnak [24] is that the construction and optimality of universal single-qubit quantum gate sets can be understood in terms of the arithmetic of quaternion algebras. Specifically, identifying CC with a subgroup of the group of units in a definite quaternion algebra over a totally real number field provides a coherent framework within which one can systematically address the question of optimality of topological generators for PU(2).\operatorname{PU}(2).

The question we address within this context is that of finding the “best” topological generators of PU(2)\operatorname{PU}(2) among those universal gate sets, the super golden gates of Parzanchevski and Sarnak [19], which are known to possess optimal covering properties and efficient navigation. In particular, there are only finitely many super golden gates [19, p.870–871] and the respective finite subgroups CC can be realized as the group of symmetries of of the Platonic solids: for the tetrahedron, A4A_{4}; for the cube and the octahedron, S4S_{4}; and for the dodecahedron and icosahedron, A5A_{5}.

We demonstrate that the icosahedral super golden gates admit a factorization directly analogous to the one obtained by Stier [25], and that this gives the best-known preconstant to the first-power logarithm in the approximation length. The exact factor of the improvement is log2595.9\log_{2}59\approx 5.9. This improvement is due to the fact that the icosahedral super golden gates have a growth rate that is on the order of 59k59^{k}, while gate set studied in [25], the Clifford+TT gates, have growth rate of order 2k2^{k}. We note also that the icosahedral super-golden-gates represent the greatest number of distinct gates with bounded τ\tau-count.

The commonly used Clifford+TT gate set provides a set of elementary gates that is universal and consists only of a small number of gates, all of which are very well compatible with many established error correction schemes and can be physically implemented in all quantum technologies that seem promising for large-scale quantum computations [18]. In this case the finite group CC is the Clifford group C24C_{24} of order 24 in PU(2).\operatorname{PU}(2). At least one non-Clifford gate must be added to the basic gate set in order to achieve universality. A common choice for this additional gate is the TT-gate (or π8\frac{\pi}{8}-gate). The TT-gate is not the only possible extension of the Clifford group but it is considered to be the most practical one. This is due to the availability of fault-tolerant implementations of the TT-gate. For this reason, the Clifford+TT gate set is considered the most promising candidate for practical quantum computing. We recall, for the convenience of the reader, some of the salient details of the single qubit Clifford+TT gate set. Let

H:=i2(1111) and T:=(eiπ/8eiπ/8).H:=\frac{i}{\sqrt{2}}\begin{pmatrix}1&\phantom{-}1\\ 1&-1\end{pmatrix}\text{ and }T:=\begin{pmatrix}e^{\nicefrac{{i\pi}}{{8}}}&\\ &e^{\nicefrac{{-i\pi}}{{8}}}\end{pmatrix}.

We take S:={H,T}S:=\{H,T\} [24, 21, 25]. The Clifford+TT universal gate set is an example of a golden gate set [24, 19]. The remarkable observation of [24] is that the HH and TT gates of the Clifford+TT gate set come from the definite quaternion algebra

𝒜:=(1,1(2)).\mathcal{A}:=\left(\frac{-1,-1}{\mathbb{Q}(\sqrt{2})}\right).

Kliuchnikov, Maslov, and Mosca [6] characterized the group S\left\langle S\right\rangle and demonstrated an efficient algorithm to factor exactly, by carefully studying the powers of 2\sqrt{2} in the denominators of matrix entries. Ross and Selinger [21] then focused on diagonal matrices near to S\left\langle S\right\rangle, with the goal of finding ε\varepsilon-close factorizations of length clog2(1/ε3)c\log_{2}(\nicefrac{{1}}{{\varepsilon^{3}}}); the base of 2 is intrinsic to the structure of the Clifford+TT gates. By studying upright sets in the plane, which function analogously to the simplices of Lenstra’s algorithm [7, 20], they achieved the leading coefficient of 1+o(1)1+o(1) (in that restricted case). Parzenchevski and Sarnak [19] generalized Ross and Selinger’s approach to any golden gate set (with the base of the logarithm correspondingly changing per the gate set), and by considering Euler angles reached the coefficient of 3+o(1)3+o(1) for approximations to generic elements. Working with the (finite) LPS Ramanujan graphs (see §2.1.3 and [14]), Carvalho Pinto and Petit [1] factorize in the equivalent of 7/3+o(1).\nicefrac{{7}}{{3}}+o(1). This approach is adapted by Stier [25] for the same coefficient with Clifford+TT gates. We combine aspects of the techniques of [1, 25], in the proposed icosahedral setting, to reduce this bound to 73log59(1/ε3).\frac{7}{3}\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}}).

2. Super Golden Gates

We recall here essential ideas related to the arithmetic of quaternion algebras [27, 28], SS-arithmetic groups [16, §C], and Ramanujan graphs [10] which lead to the icosahedral super golden gates [19].

2.1. Quaternion Algebras

A quaternion algebra 𝒜\mathcal{A} over a field 𝔽\mathbb{F} is a central simple algebra of dimension four over 𝔽\mathbb{F} (cf. [15, §5.2] or [5, p.15]). It follows from Wedderburn’s structure theorem on simple algebras [3] that every quaternion algebra over any field 𝔽\mathbb{F} is either isomorphic to M2(𝔽)\mathrm{M}_{2}(\mathbb{F}) or a division algebra with center 𝔽\mathbb{F} [8]. If the characteristic of 𝔽\mathbb{F} is not 2 then it is always possible to find a basis {1,i,j,k}\{1,i,j,k\} for 𝒜\mathcal{A} over 𝔽\mathbb{F} such that

i2=α,j2=β,k=ij=jii^{2}=\alpha,j^{2}=\beta,k=ij=-ji

where α,β𝔽×\alpha,\beta\in\mathbb{F}^{\times}. We designate such an algebra by

𝒜=(α,β𝔽).\mathcal{A}=\left(\frac{\alpha,\beta}{\mathbb{F}}\right).

Evidently q𝒜q\in\mathcal{A} is of the form

(2.1) q=x0+x1i+x2j+x3k,q=x_{0}+x_{1}i+x_{2}j+x_{3}k,

where x𝔽x_{\ell}\in\mathbb{F}. In this notation, Hamilton’s quaternions arise as

=(1,1).\mathbb{H}=\left(\frac{-1,-1}{\mathbb{R}}\right).

The conjugate of qq as in (2.1), denoted by q¯\bar{q}, is equal to x0x1ix2jx3kx_{0}-x_{1}i-x_{2}j-x_{3}k. For each q𝒜q\in\mathcal{A} we define the (reduced) norm map N:𝒜𝔽\operatorname{N}:\mathcal{A}\to\mathbb{F} by

N(q):=qq¯=x02αx12βx22+αβx32\operatorname{N}(q):=q\bar{q}=x_{0}^{2}-\alpha x_{1}^{2}-\beta x_{2}^{2}+\alpha\beta x_{3}^{2}

and the (reduced) trace map Tr:𝒜𝔽\operatorname{Tr}:\mathcal{A}\to\mathbb{F} by

Tr(q):=q+q¯=2x0.\operatorname{Tr}(q):=q+\bar{q}=2x_{0}.

We note that every element q𝒜q\in\mathcal{A} satisfies the quadratic equation

q2Tr(q)q+N(q)=0.q^{2}-\operatorname{Tr}(q)q+\operatorname{N}(q)=0.

It is possible to make everything completely explicit by embedding 𝒜\mathcal{A} in M2(F(α))\mathrm{M}_{2}\left(F\left(\sqrt{\alpha}\right)\right) by, for example,

i(αα) and j(β1).i\mapsto\begin{pmatrix}\sqrt{\alpha}\\ &-\sqrt{\alpha}\end{pmatrix}\text{ and }j\mapsto\begin{pmatrix}&\beta\\ 1\end{pmatrix}.

Evidently, if α\alpha is a square in 𝔽\mathbb{F}, then 𝒜M2(𝔽)\mathcal{A}\cong\mathrm{M}_{2}(\mathbb{F}). A necessary and sufficient condition for 𝒜M2(𝔽)\mathcal{A}\cong\mathrm{M}_{2}(\mathbb{F}) is that α\alpha is the norm of an element in 𝔽(β)\mathbb{F}(\sqrt{\beta}) with respect to 𝔽\mathbb{F} [28]. Of course, one may interchange α\alpha and β\beta in this remark.

Specializing to quaternion algebras over the rational field \mathbb{Q}, or over one of the completions p\mathbb{Q}_{p} or \mathbb{R} (the completion \mathbb{Q}_{\infty}), let 𝒜\mathcal{A} be a quaternion algebra over \mathbb{Q} and let pp be a prime (or \infty). We define

𝒜p:=𝒜p.\mathcal{A}_{p}:=\mathcal{A}\underset{\mathbb{Q}}{\otimes}{\mathbb{Q}_{p}}.

Either 𝒜pM2(p)\mathcal{A}_{p}\cong\mathrm{M}_{2}(\mathbb{Q}_{p}) or 𝒜p\mathcal{A}_{p} is a division algebra over p\mathbb{Q}_{p}. In this case we will have that 𝒜pp\mathcal{A}_{p}\cong\mathbb{H}_{p} and say that 𝒜\mathcal{A} is ramified at p.p. When 𝒜pM2(p)\mathcal{A}_{p}\cong\mathrm{M}_{2}(\mathbb{Q}_{p}) we will say that 𝒜\mathcal{A} is unramified or split at p.p. If 𝒜\mathcal{A} is ramified at \infty it is called a definite rational quaternion algebra—that is, if 𝒜\mathcal{A} is definite then

𝒜=𝒜.\mathcal{A}_{\infty}=\mathcal{A}\underset{\mathbb{Q}}{\otimes}{\mathbb{R}}\cong\mathbb{H}.

If 𝒜\mathcal{A} is split at \infty, it is called an indefinite rational quaternion algebra—that is, if 𝒜\mathcal{A} is indefinite then

𝒜=𝒜M2().\mathcal{A}_{\infty}=\mathcal{A}\underset{\mathbb{Q}}{\otimes}{\mathbb{R}}\cong\mathrm{M}_{2}(\mathbb{R}).

2.1.1. Definite quaternion algebras over totally real number fields

We now turn to the quaternion algebras that are the objects of interest in this paper, definite quaternion algebras over totally real number fields. Let 𝒜\mathcal{A} be a quaternion algebra over a number field 𝔽,\mathbb{F}, ν\nu be a place of 𝔽\mathbb{F}, and 𝔽ν\mathbb{F}_{\nu} be the completion of 𝔽\mathbb{F} at ν\nu. Recall that a totally real number field is a finite algebraic extension of \mathbb{Q} all of whose complex embeddings lie entirely in .\mathbb{R}. For example, the field (d)\mathbb{Q}(\sqrt{d}) is totally real for positive, integral d.d. If every infinite place of FF is ramified in 𝒜,\mathcal{A}, we say that 𝒜\mathcal{A} is a totally definite quaternion algebra. Consequently, if 𝒜\mathcal{A} is a totally definite quaternion algebra over a number field 𝔽,\mathbb{F}, then 𝔽\mathbb{F} is necessarily totally real. Moreover, if 𝔽\mathbb{F} is a quadratic field, the number of finite places which are ramified in 𝒜\mathcal{A} is even.

2.1.2. Unit groups in orders in definite quaternion algebras over totally real number fields

Let 𝒜\mathcal{A} be quaternion algebra over a field 𝔽\mathbb{F} and let O𝔽O_{\mathbb{F}} be the ring of integers in 𝔽\mathbb{F}. An element q𝒜q\in\mathcal{A} is integral if N(q),Tr(q)O𝔽\operatorname{N}(q),\operatorname{Tr}(q)\in O_{\mathbb{F}}. An order 𝒪𝒜\mathcal{O}\subseteq\mathcal{A} is a O𝔽O_{\mathbb{F}}-algebra of integral elements such that [26, p.2]

𝒪O𝔽𝔽𝒜.\mathcal{O}\underset{O_{\mathbb{F}}}{\otimes}{\mathbb{F}}\cong\mathcal{A}.

Observe that as an example of an order we always have

(2.2) 𝒪:=O𝔽O𝔽iO𝔽jO𝔽k.\mathcal{O}:=O_{\mathbb{F}}\oplus O_{\mathbb{F}}i\oplus O_{\mathbb{F}}j\oplus O_{\mathbb{F}}k.

If 𝒪\mathcal{O} is an O𝔽O_{\mathbb{F}}-order in a definite quaternion algebra 𝒜\mathcal{A} over the totally real field 𝔽\mathbb{F} then the group of units of reduced norm 1, i.e.

(2.3) 𝒪1:={α𝒪:N(α)=1}\mathcal{O}^{1}:=\{\alpha\in\mathcal{O}:\operatorname{N}(\alpha)=1\}

is a finite group [28, p.289]. These finite groups will correspond to the groups of rotational symmetries of the platonic solids [28, p.172–173].

2.1.3. Ramanujan graphs

Two key ideas are needed for the construction of golden gate and super golden gate sets: a SS-arithmetic unit quaternion group and a Ramanujan graph. We outline the essential ideas of Lubotzky, Phillips, and Sarnak’s [14] “LPS” construction of these objects.

Ramanujan graphs are graphs whose spectrum is bounded optimally. Let XX be a finite connected kk-regular graph and AA its adjacency matrix.

Definition 2.1.

The graph XX is called a Ramanujan graph if every eigenvalue λ\lambda of AA satisfies either |λ|=k\lvert\lambda\rvert=k or |λ|2k1.\lvert\lambda\rvert\leq 2\sqrt{k-1}.

LPS Ramanujan graphs [2] arise as Cayley graphs of PSL(2,𝔽q)\operatorname{PSL}(2,\mathbb{F}_{q}) (for 𝔽q\mathbb{F}_{q} the finite field on qq elements). When considering these graphs we interchangeably refer to their elements by their group-theoretic properties as matrices and their graph-theoretic relations as vertices. [14] establishes that for any prime p1(mod4)p\equiv 1\pmod{4} there are infinitely many (p+1p+1)-regular Ramanujan graphs. We use the notation Xp,qX^{p,q} where pp and qq are distinct primes congruent to 1 modulo 4 to represent such graphs. The construction comes from number theory by way of the generalized Ramanujan conjecture [19, p.873]. The symmetric space PGL(2,p)/PGL(2,p)\operatorname{PGL}(2,\mathbb{Q}_{p})/\operatorname{PGL}(2,\mathbb{Z}_{p}) can be identified with a (p+1p+1)-regular infinite tree. PGL(2,[1/p])\operatorname{PGL}(2,\mathbb{Z}[\nicefrac{{1}}{{p}}]) acts from the left on PGL(2,p)/PGL(2,p)\operatorname{PGL}(2,\mathbb{Q}_{p})/\operatorname{PGL}(2,\mathbb{Z}_{p}). The generalized Ramanujan conjecture, a theorem in this case, implies that the quotient of PGL(2,p)/PGL(2,p)\operatorname{PGL}(2,\mathbb{Q}_{p})/\operatorname{PGL}(2,\mathbb{Z}_{p}) by any congruence subgroup of PGL(2,[1/p])\operatorname{PGL}(2,\mathbb{Z}[\nicefrac{{1}}{{p}}]), a (p+1p+1)-regular graph, is a Ramanujan graph. By considering an appropriate congruence subgroup of PGL(2,[1/p])\operatorname{PGL}(2,\mathbb{Z}[\nicefrac{{1}}{{p}}]) we can identify the quotient of this symmetric space with a Cayley graph associated to PSL(2,𝔽q)\operatorname{PSL}(2,\mathbb{F}_{q}) or PGL(2,𝔽q)\operatorname{PGL}(2,\mathbb{F}_{q}), depending on the value of the Legendre symbol (pq)\left(\frac{p}{q}\right) [22].

2.1.4. pp-arithmetic unit quaternion groups

Golden gate and super golden gate sets for PU(2)\operatorname{PU}(2) require the construction of a pp-arithmetic group (a special case of SS-arithmetic groups, where SS is a collection of places of \mathbb{Z}). Let GGL(n)G\leq\operatorname{GL}(n) be an algebraic group defined over [1/p]\mathbb{Z}[\nicefrac{{1}}{{p}}] with G()G(\mathbb{R}) compact. A pp-arithmetic group Λ\Lambda is a subgroup of G([1/p])G()×G(p)G(\mathbb{Z}[\nicefrac{{1}}{{p}}])\leq G(\mathbb{R})\times G(\mathbb{Q}_{p}), and has congruence subgroup Λ(N):={gΛ:gI(modN)}\Lambda(N):=\{g\in\Lambda:g\equiv I\pmod{N}\}. That is, in our compact Lie group PU(2)\operatorname{PU}(2) we take only rational numbers whose denominators are powers of a fixed prime pp as coefficients in the matrices.

One can also make the same construction over the ring of integers O𝔽O_{\mathbb{F}} of a totally real number field 𝔽\mathbb{F}\supsetneq\mathbb{Q}, at which point the role of pp is played by 𝔭\mathfrak{p} a prime ideal of O𝔽O_{\mathbb{F}}, so that the inverted elements (those allowed in denominators) are now all of 𝔭𝔽×\mathfrak{p}\cap\mathbb{F}^{\times}.

2.2. Golden gates

Golden gates are special unit groups in quaternion algebras over totally real number fields derived from the pp-arithmetic groups [24]. They give variants of optimal generators for PU(2)\operatorname{PU}(2) and connect the arithmetic of quaternion algebras to quantum computation on a single qubit. The “golden” characterization is to be understood by way of an interesting link between pp-arithmetic unit groups coming from unit quaternion groups and the Ramanujan graphs Xp,qX^{p,q}, which we explicate below.

Recall once more that in classical computation, one decomposes any function into basic logical gates such as XOR, AND, and NOR, and that in quantum computation, the classical bits are replaced by qubits, which are vectors in projective Hilbert space Pn\mathbb{C}P^{n}, and that the logical gates are all the elements of the projective unitary group PU(2).\operatorname{PU}(2). Let SS be a subgroup of PU(2)\operatorname{PU}(2) and denote by S()S^{(\ell)} the set of \ell-fold products of elements in SS. If S=0S()\left\langle S\right\rangle=\bigcup\limits_{\ell\geq 0}S^{(\ell)} is dense in PU(2)\operatorname{PU}(2) (with respect to the standard bi-invariant metric d2(A,B):=1|Tr(AB)|2d^{2}(A,B):=1-\frac{|\operatorname{Tr}(A^{*}B)|}{2}) then SS is universal. That is, any gate can be approximated with arbitrary precision as a product of elements of SS.

The notion of of a golden gate set is much stronger, requiring [10]:

  1. (1)

    Optimal covering of PU(2)\operatorname{PU}(2) by S\left\langle S\right\rangle: for every \ell the set S()S^{(\ell)} distributes in PU(2)\operatorname{PU}(2) as a perfect sphere packing (or randomly placed points) would, up to a negligible factor.

  2. (2)

    Approximation: given APU(2)A\in\operatorname{PU}(2) and ε>0\varepsilon>0, there is an efficient algorithm to find some ABε(A)A^{\prime}\in B_{\varepsilon}(A) (the ε\varepsilon-ball around AA) such that AS()A^{\prime}\in S^{(\ell)} with \ell (almost) minimal.

  3. (3)

    Compiling: given ASA\in\left\langle S\right\rangle as a matrix, there is an efficient algorithm to write AA as a word in SS of the smallest possible length.

These requirements ensure that any gate can be approximated and compiled as an efficient circuit using the gates in SS.

2.3. Super golden gates

Each super golden gate set is composed of a finite group CC and an involution τ\tau, which lie in a 𝔭\mathfrak{p}-arithmetic group for 𝔭\mathfrak{p} a prime ideal of the integers O𝔽O_{\mathbb{F}} of a totally definite quaternion algebra 𝒜\mathcal{A}, over a totally real number field 𝔽\mathbb{F}. We require that:

  1. (1)

    CC acts simply transitively on the neighbors of any given vertex in XpX^{p}, the (p+1p+1)-regular tree, for pp the norm of 𝔭\mathfrak{p}.

  2. (2)

    τ\tau is an involution which takes a vertex to one of its neighbors.

𝔭\mathfrak{p}-arithmetic unit quaternion group act transitively on the vertices of the corresponding XpX^{p}. The 𝔭\mathfrak{p}-arithmetic groups which act transitively on the vertices of XpX^{p} are called the golden gates and the 𝔭\mathfrak{p}-arithmetic groups which act transitively on both the vertices and edges of XpX^{p} are called the super golden gates.

3. Icosahedral Super Golden Gates

Recall that there are only finitely many such super golden gate sets and in each case the finite group CC is identified with the group of rotational symmetries of a platonic solid: for the tetrahedron, A4A_{4}; for the cube and the octahedron, S4S_{4}; and for the dodecahedron and icosahedron, A5A_{5}. Each of these finite groups can be precisely identified with a 𝔭\mathfrak{p}-arithmetic unit quaternion group coming from one of the following quaternion algebras [24]:

  • tetrahedral gates: 𝒜=(1,1)\mathcal{A}=\left(\frac{-1,-1}{\mathbb{Q}}\right);

  • octahedral gates: 𝒜=(1,1(2))\mathcal{A}=\left(\frac{-1,-1}{\mathbb{Q}(\sqrt{2})}\right); and

  • icosahedral gates: 𝒜=(1,1(5))\mathcal{A}=\left(\frac{-1,-1}{\mathbb{Q}(\sqrt{5})}\right).

We consider the quaternion algebra 𝒜\mathcal{A} over the golden field (5)\mathbb{Q}(\sqrt{5}) [19, p.895]. A maximal order 𝒪\mathcal{O} in 𝒜:=(1,1(5))\mathcal{A}:=\left(\frac{-1,-1}{\mathbb{Q}(\sqrt{5})}\right) is given by the ring of icosians. The unit group 𝒪1\mathcal{O}^{1} is the platonic icosahedral group, generated by 1+i+j+k2,1+φ1j+φk2\left\langle\frac{1+i+j+k}{2},\frac{1+\varphi^{-1}j+\varphi k}{2}\right\rangle, where φ:=1+52\varphi:=\frac{1+\sqrt{5}}{2} is the golden ratio, and 𝒪1A5\mathcal{O}^{1}\cong A_{5} In PU(2)\operatorname{PU}(2), this corresponds to

C60=(11ii),(1φi/φφ+i/φ1)=:ρ,σ.C_{60}=\left\langle\begin{pmatrix}1&1\\ i&-i\end{pmatrix},\begin{pmatrix}1&\varphi-\nicefrac{{i}}{{\varphi}}\\ \varphi+\nicefrac{{i}}{{\varphi}}&-1\end{pmatrix}\right\rangle=:\left\langle\rho,\sigma\right\rangle.

We take as our involution

τ=τ60:=(2+φ1i1+i2φ).\tau=\tau_{60}:=\begin{pmatrix}2+\varphi&1-i\\ 1+i&-2-\varphi\end{pmatrix}.

For the prime ideal 𝔭=(7+5φ)\mathfrak{p}=\left(7+5\varphi\right) one has 𝔽𝔭59\mathbb{F}_{\mathfrak{p}}\cong\mathbb{Q}_{59}, and the generated group Γ=C60,τ60\Gamma=\left\langle C_{60},\tau_{60}\right\rangle is the full (7+5φ7+5\varphi)-arithmetic group of 𝒪\mathcal{O}. As such, we establish:

Theorem 1.

Subject to standard number-theoretic heuristic conjectures, there exists a factorization of any gPU(2)g\in\operatorname{PU}(2) to precision ε\varepsilon using τ\tau-count at most (7/3+o(1))log59(1/ε3).(\nicefrac{{7}}{{3}}+o(1))\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}}).

In particular, if gg is near to a diagonal matrix then we just apply the approach in §5 (for additive error) to get a path of length at most (1+o(1))log59(1/ε3)(1+o(1))\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}}), and otherwise run the approach in §6.

Notice that no other choice of golden gates is both super and has a greater logarithm base, so that these are the “best” generators given the present state of knowledge.

4. Nearby Elements in PU(2), and Factoring in Γ\Gamma

In this section, we establish a key technical lemma regarding approximations in the matrix group, and the method for exact synthesis for elements of the relevant subgroup.

For α,β\alpha,\beta\in\mathbb{C} satisfying |α|2+|β|2=1\left\lvert\alpha\right\rvert^{2}+\left\lvert\beta\right\rvert^{2}=1 we write

u(α,β):=(αββ¯α¯),u(\alpha,\beta):=\begin{pmatrix}\alpha&\beta\\ -\bar{\beta}&\bar{\alpha}\end{pmatrix},

and for θ\theta\in\mathbb{R} we write

u(θ):=u(eiθ,0)=(eiθeiθ).u(\theta):=u(e^{i\theta},0)=\begin{pmatrix}e^{i\theta}\\ &e^{-i\theta}\end{pmatrix}.

We restate below [25, Lemma 5] which holds independent of our choice of universal gate set.

Lemma 1.

Select absolute constants δ,ε0>0\delta,\varepsilon_{0}>0 and put C=12+12(2+δε0)2C=\sqrt{\frac{1}{2}+\frac{1}{2}\left(\frac{2+\delta}{\varepsilon_{0}}\right)^{2}}. Take γ1\gamma_{1} and γ2\gamma_{2} in either SU(2)\operatorname{SU}(2) or PU(2)\operatorname{PU}(2) and write them as γ1=u(α1,β1)\gamma_{1}=u(\alpha_{1},\beta_{1}) and γ2=u(α2,β2)\gamma_{2}=u(\alpha_{2},\beta_{2}). If ||α1||α2||<ε\left\lvert\left\lvert\alpha_{1}\right\rvert-\left\lvert\alpha_{2}\right\rvert\right\rvert<\varepsilon for some ε<δ\varepsilon<\delta and min{|α1|,|α2|}<1ε02\min\{\left\lvert\alpha_{1}\right\rvert,\left\lvert\alpha_{2}\right\rvert\}<\sqrt{1-\varepsilon_{0}^{2}} then for

θ1\displaystyle\theta_{1} =12(argα1argα2+argβ1argβ2),\displaystyle=\frac{1}{2}(\arg\alpha_{1}-\arg\alpha_{2}+\arg\beta_{1}-\arg\beta_{2}), δ1\displaystyle\delta_{1} =u(θ1)\displaystyle=u(\theta_{1})
θ2\displaystyle\theta_{2} =12(argα1argα2argβ1+argβ2),\displaystyle=\frac{1}{2}(\arg\alpha_{1}-\arg\alpha_{2}-\arg\beta_{1}+\arg\beta_{2}), δ2\displaystyle\delta_{2} =u(θ2)\displaystyle=u(\theta_{2})

we have the approximation δ1γ2δ2\delta_{1}\gamma_{2}\delta_{2} to γ1\gamma_{1}, satisfying

d(γ1,δ1γ2δ2)<Cε.d(\gamma_{1},\delta_{1}\gamma_{2}\delta_{2})<C\varepsilon.

Loosely, this result can be thought of as a sufficient condition to transform one 2×22\times 2 unitary into (approximately) another based only on a weak condition and by “tuning” with diagonal (rotation) matrices on either side. The condition is merely that the “starting” and “target” matrices have top-left entries nearby in absolute value, and that neither is very near to 1.

We now move to factorizing. Put η=7+5φ\eta=7+5\varphi for the sequel. Observe that is is a positive real number, and the generator of 𝔭\mathfrak{p} above. For our purposes, we will encounter elements of Γ=ρ,σ,τPU(2)\Gamma=\left\langle\rho,\sigma,\tau\right\rangle\leq\operatorname{PU}(2) only as [φ]\mathbb{Z}[\varphi]-quaternions with norm a power of η\eta, envisioned as elements of the Cayley graph for ρ,σ,τ(φ)U2([φ])\left\langle\rho,\sigma,\tau\right\rangle\leq\mathbb{Q}(\varphi)\operatorname{U}_{2}(\mathbb{Z}[\varphi]), which [19] shows acts transitively with respect to the distance measure of τ\tau-count, which can be detected by counting the power of η\eta in the quaternion norm, after quotienting out [φ]\mathbb{Z}[\varphi]-scalars. This leads to the following factoring algorithm.

Algorithm 1.

Let CC be a set of representatives of ρ,σPU(2)\left\langle\rho,\sigma\right\rangle\leq\operatorname{PU}(2) lifted to (φ)U2([φ])\mathbb{Q}(\varphi)\operatorname{U}_{2}(\mathbb{Z}[\varphi]). Given γ\gamma a lift of some element of Γ\Gamma with τ\tau-count kk, it can be factored by determining which (unique) cCc\in C gives rise to γcτ\gamma c\tau of τ\tau-count k1k-1, which in turn requires no more than 60 multiplications of three matrices. cc is found if and only if γcτ\gamma c\tau has all coefficients divisble by η\eta in [φ]\mathbb{Z}[\varphi].

This algorithm has the base case of simply comparing γ\gamma against all 60 elements of CC, another trivial computational task. The idea of the general case is that γ\gamma has the representation

c0i[n]τcic_{0}\prod\limits_{i\in[n]}\tau c_{i}

where cic_{i} is not the identity for i0,ni\neq 0,n. Then cc will be projectively equal to cn1c_{n}^{-1}, giving rise to

γcτ=c0i[n1]τciτcncτ=zpc0i[n1]τci\gamma c\tau=c_{0}\prod\limits_{i\in[n-1]}\tau c_{i}\tau c_{n}c\tau=zpc_{0}\prod\limits_{i\in[n-1]}\tau c_{i}

for z=det(cnc)z=\det(c_{n}c) (coprime to η\eta).

5. Algorithm for Short Paths to Diagonal Elements

5.1. Algorithm

We proceed by, for given diagonal element δ=u(θ)\delta=u(\theta) and ε\varepsilon, seeking γΓ\gamma\in\Gamma with d(δ,γ)<εd(\delta,\gamma)<\varepsilon. Knowing that γ\gamma is projectively equal to an element

(x0+x1ix2+x3ix2+x3ix0x1i)\begin{pmatrix}\phantom{-}x_{0}+x_{1}i&x_{2}+x_{3}i\\ -x_{2}+x_{3}i&x_{0}-x_{1}i\end{pmatrix}

for x0,x1,x2,x3[φ]x_{0},x_{1},x_{2},x_{3}\in\mathbb{Z}[\varphi] satisfying

(5.1) x02+x12+x22+x32=ηmx_{0}^{2}+x_{1}^{2}+x_{2}^{2}+x_{3}^{2}=\eta^{m}

for some mm\in\mathbb{N}, we also have that it is sufficient to satisfy

(5.2) x0cosθ+x1sinθηm/2(12ε2)x_{0}\cos\theta+x_{1}\sin\theta\geq\eta^{\nicefrac{{m}}{{2}}}(1-2\varepsilon^{2})

(note that this is precisely [21, (13) and Problem 7.4] (see there for the derivation)), following the readily computable observation that δγ2d(δ,γ)\left\|\delta-\gamma\right\|\geq\sqrt{2}d(\delta,\gamma) and setting the goal δγ2ε\left\|\delta-\gamma\right\|\leq\sqrt{2}\varepsilon, since [21] operates with respect to the operator norm. Observe, unfortunately, that (5.1) is a quadratic constraint and (5.2) is a linear constraint. We now explain how to transform them into a practicable sequence of integer programs.111Fundamentally, this is the same idea as in [21], as their study of upright ellipses accomplishes the same task as Lenstra’s algorithm.

Fix mm. We seek x[φ]x_{\ell}\in\mathbb{Z}[\varphi] satisfying (5.1) and (5.2). Define y=x/ηm/2y_{\ell}=\nicefrac{{x_{\ell}}}{{\eta^{\nicefrac{{m}}{{2}}}}}. Artificially add the constraint

(5.3) 1ε2y1sinθ0.1-\varepsilon^{2}-y_{1}\sin\theta\geq 0.

Observe from (5.1) that y021y12y_{0}^{2}\leq 1-y_{1}^{2}. Then, we have the sequence of implications (mainly by algebraic manipulation)

y0cosθ\displaystyle y_{0}\cos\theta 1ε2y1sinθ0\displaystyle\geq 1-\varepsilon^{2}-y_{1}\sin\theta\geq 0
y02cos2θ\displaystyle y_{0}^{2}\cos^{2}\theta (1ε2)2+y12sin2θ2(1ε2)y1sinθ\displaystyle\geq(1-\varepsilon^{2})^{2}+y_{1}^{2}\sin^{2}\theta-2(1-\varepsilon^{2})y_{1}\sin\theta
cos2θ\displaystyle\cos^{2}\theta (y1(1ε2)sinθ)2(1ε2)2sin2θ+(1ε2)2\displaystyle\geq(y_{1}-(1-\varepsilon^{2})\sin\theta)^{2}-(1-\varepsilon^{2})^{2}\sin^{2}\theta+(1-\varepsilon^{2})^{2}
cos2θ(2ε2ε4)\displaystyle\cos^{2}\theta(2\varepsilon^{2}-\varepsilon^{4}) (y1(1ε2)sinθ)2\displaystyle\geq(y_{1}-(1-\varepsilon^{2})\sin\theta)^{2}
(5.4) |y1(1ε2)sinθ|\displaystyle\left\lvert y_{1}-(1-\varepsilon^{2})\sin\theta\right\rvert |cosθ|2ε2ε\displaystyle\leq\left\lvert\cos\theta\right\rvert\sqrt{2-\varepsilon^{2}}\varepsilon

and so we have reduced our consideration to just one of the four variables. As [(φ):]=2[\mathbb{Q}(\varphi):\mathbb{Q}]=2, we explicitly work with the Galois group elements

σ+: 1\displaystyle\sigma_{+}:\,1 1\displaystyle\longmapsto 1
φ\displaystyle\varphi φ,\displaystyle\longmapsto\varphi,
σ: 1\displaystyle\sigma_{-}:\,1 1\displaystyle\longmapsto 1
φ\displaystyle\varphi φ,\displaystyle\longmapsto\varphi^{\bullet},

where φ\varphi^{\bullet} is φ\varphi’s Galois conjugate 152=1φ\frac{1-\sqrt{5}}{2}=1-\varphi, both of which are real embeddings, yielding the additional constraints

(5.5) |σ±y1|1.\left\lvert\sigma_{\pm}y_{1}\right\rvert\leq 1.

Now, putting x1=c+dφx_{1}=c+d\varphi for some c,dc,d\in\mathbb{Z}, (5.3), (5.4), and (5.5) transform into

Problem 1.

Given mm, find (c,d)2(c,d)\in\mathbb{Z}^{2} satisfying the five linear constraints

(c+dφ)sinθ\displaystyle(c+d\varphi)\sin\theta ηm/2(1ε2),\displaystyle\leq\eta^{\nicefrac{{m}}{{2}}}(1-\varepsilon^{2}),
|c+dσ±φ|\displaystyle\left\lvert c+d\sigma_{\pm}\varphi\right\rvert (σ±η)m/2,\displaystyle\leq(\sigma_{\pm}\eta)^{\nicefrac{{m}}{{2}}},
|c+dφηm/2(1ε2)sinθ|\displaystyle\left\lvert c+d\varphi-\eta^{\nicefrac{{m}}{{2}}}(1-\varepsilon^{2})\sin\theta\right\rvert ηm/2|cosθ|2ε2ε.\displaystyle\leq\eta^{\nicefrac{{m}}{{2}}}\left\lvert\cos\theta\right\rvert\sqrt{2-\varepsilon^{2}}\varepsilon.

The existence of such a pair is determinable in O(polym)O(\operatorname{poly}m) time, by Lenstra’s algorithm [7, 20].

Putting x0=a+bφx_{0}=a+b\varphi for some a,ba,b\in\mathbb{Z}, for a particular solution to Problem 1, (5.1) and (5.2) transform into

Problem 0.

Given mm and x1=c+dφx_{1}=c+d\varphi, find (a,b)2(a,b)\in\mathbb{Z}^{2} satisfying the four linear constraints

|a+bσ±φ|\displaystyle\left\lvert a+b\sigma_{\pm}\varphi\right\rvert (σ±η)m/21(σ±x1)2,\displaystyle\leq(\sigma_{\pm}\eta)^{\nicefrac{{m}}{{2}}}\sqrt{1-(\sigma_{\pm}x_{1})^{2}},
(a+bφ)cosθ\displaystyle(a+b\varphi)\cos\theta ηm/2(1x1sinθ),\displaystyle\leq\eta^{\nicefrac{{m}}{{2}}}(1-x_{1}\sin\theta),
(a+bφ)cosθ\displaystyle(a+b\varphi)\cos\theta ηm/2(1ε2x1sinθ).\displaystyle\geq\eta^{\nicefrac{{m}}{{2}}}(1-\varepsilon^{2}-x_{1}\sin\theta).

Again, the existence of such a pair is determinable in O(polym)O(\operatorname{poly}m) time, by Lenstra’s algorithm.

Finally, if we have solutions to Problem 5.1 and Problem 1, we seek to solve

Problem 23.

Given mm, x0=a+bφx_{0}=a+b\varphi, and x1=c+dφx_{1}=c+d\varphi, find (x2,x3)[φ]2(x_{2},x_{3})\in\mathbb{Z}[\varphi]^{2} satisfying (5.1).

Here we change techniques. The objective is to write ηmx02x12\eta^{m}-x_{0}^{2}-x_{1}^{2} as a sum of squares in [φ]\mathbb{Z}[\varphi]. Assuming efficient factorization in \mathbb{Z} (or [φ]\mathbb{Z}[\varphi], also a PID), Problem 23 is efficiently solvable via Theorem 3 (the general approach being basically identical to the classical algorithms for [i]\mathbb{Z}[i] or [eiπ/8]\mathbb{Z}[e^{\nicefrac{{i\pi}}{{8}}}], cf. [21, §C] for the latter).

We have succeeded in the core of the algorithm. To handle given δ\delta, begin by fixing m=0m=0. For given mm, attempt to solve Problem 1; for each solution, attempt to solve Problem 5.1; for each solution, attempt to solve Problem 23. If this results in a tuple (x0,x1,x2,x3)(x_{0},x_{1},x_{2},x_{3}) satisfying all three problems, halt and return

1ηm/2(x0+x1ix2+x3ix2+x3ix0x1).\frac{1}{\eta^{\nicefrac{{m}}{{2}}}}\begin{pmatrix}\phantom{-}x_{0}+x_{1}i&x_{2}+x_{3}i\\ -x_{2}+x_{3}i&x_{0}-x_{1}\end{pmatrix}.

Otherwise, if all possibilities have been exhausted, increment mm.

5.2. Analysis

We refer the reader to [19, §2.3]’s analysis of timing and correctness for Ross and Selinger’s algorithm generalized to any golden gate set, where the conclusion is that for desired precision ε\varepsilon, the factorization length becomes (1+o(1))log59(1/ϵ3)(1+o(1))\log_{59}(\nicefrac{{1}}{{\epsilon^{3}}}) with required computational time remaining O(polylog(1/ϵ))O(\operatorname{poly}\log(\nicefrac{{1}}{{\epsilon}})).

6. Algorithm For Short Paths

Here we largely adopt the structure and wording from [25, §4 and §5], as the core concepts and heuristics that make the algorithm work are the same in the icosahedral setting.

6.1. Algorithm

Select absolute constants ε~,ε0>0\tilde{\varepsilon},\varepsilon_{0}>0. Take any g=u(α,β)PU(2)g=u(\alpha,\beta)\in\operatorname{PU}(2) where |α|<1ε02\left\lvert\alpha\right\rvert<\sqrt{1-\varepsilon_{0}^{2}}, and pick ε<ε~\varepsilon<\tilde{\varepsilon}. We wish to approximate gg in dd using γΓ\gamma\in\Gamma of the form

(6.1) γ=1ηk/2(x0+x1ix2+x3ix2+x3ix0x1i)\gamma=\frac{1}{\eta^{\nicefrac{{k}}{{2}}}}\begin{pmatrix}\phantom{-}x_{0}+x_{1}i&x_{2}+x_{3}i\\ -x_{2}+x_{3}i&x_{0}-x_{1}i\end{pmatrix}

having kk, the factorization length, minimized, and so we begin with k=0k=0. (We also have x0,x1,x2,x3[φ]x_{0},x_{1},x_{2},x_{3}\in\mathbb{Z}[\varphi].) In particular, the objective is to approximate gg as γ1γγ2\gamma_{1}\gamma\gamma_{2} where γ1,γ2Γ\gamma_{1},\gamma_{2}\in\Gamma approximate well-chosen diagonals computable using §5, and γΓ\gamma\in\Gamma has factorization computable using Algorithm 1. We will see that γ\gamma is designed to have factorization typically shorter than that of γ1\gamma_{1} and γ2\gamma_{2}, giving rise to the desired improvement.

In order to apply Lemma 1 we need to have |x0+x1iηk/2|=x02+x12ηk\left\lvert\frac{x_{0}+x_{1}i}{\eta^{\nicefrac{{k}}{{2}}}}\right\rvert=\sqrt{\frac{x_{0}^{2}+x_{1}^{2}}{\eta^{k}}} near |α|\left\lvert\alpha\right\rvert (that is, within ε\varepsilon). Because |x0+x1iηk/2|+|α||α|\left\lvert\frac{x_{0}+x_{1}i}{\eta^{\nicefrac{{k}}{{2}}}}\right\rvert+\left\lvert\alpha\right\rvert\geq\left\lvert\alpha\right\rvert which is fixed, it suffices to find candidate values for x0,x1[φ]x_{0},x_{1}\in\mathbb{Z}[\varphi] with ||x0+x1iηk/2|2|α|2|<ε|α|\left\lvert\left\lvert\frac{x_{0}+x_{1}i}{\eta^{\nicefrac{{k}}{{2}}}}\right\rvert^{2}-\left\lvert\alpha\right\rvert^{2}\right\rvert<\varepsilon\left\lvert\alpha\right\rvert, rewritten to

(6.2) |x02+x12|α|2ηk|<ε|α|ηk.\left\lvert x_{0}^{2}+x_{1}^{2}-\left\lvert\alpha\right\rvert^{2}\eta^{k}\right\rvert<\varepsilon\left\lvert\alpha\right\rvert\eta^{k}.

Viewing γ\gamma as an element of SU(2)\operatorname{SU}(2), we also have detγ=1\det\gamma=1, i.e. x02+x12+x22+x32=ηkx_{0}^{2}+x_{1}^{2}+x_{2}^{2}+x_{3}^{2}=\eta^{k}. As x[φ](φ)x_{\ell}\in\mathbb{Z}[\varphi]\subset\mathbb{Q}(\varphi)\subset\mathbb{R}, it follows that σ±(x02+x12)+σ±(x22+x32)=σ±(ηk)\sigma_{\pm}(x_{0}^{2}+x_{1}^{2})+\sigma_{\pm}(x_{2}^{2}+x_{3}^{2})=\sigma_{\pm}(\eta^{k}), and so

(6.3) σ±(x02+x12)(σ±η)k.\sigma_{\pm}(x_{0}^{2}+x_{1}^{2})\leq(\sigma_{\pm}\eta)^{k}.

Now, let m=x02+x12[φ]m=x_{0}^{2}+x_{1}^{2}\in\mathbb{Z}[\varphi]. Considering [φ]\mathbb{Z}[\varphi] as an integer lattice, we adapt (6.2) and (6.3) and seek to solve

(6.4) |m|α|2ηk|\displaystyle\left\lvert m-\left\lvert\alpha\right\rvert^{2}\eta^{k}\right\rvert <ε|α|ηk\displaystyle<\varepsilon\left\lvert\alpha\right\rvert\eta^{k}
(6.5) m\displaystyle m ηk\displaystyle\leq\eta^{k}
(6.6) m\displaystyle m^{\bullet} (η)k\displaystyle\leq(\eta^{\bullet})^{k}
(6.7) m,m\displaystyle m,m^{\bullet} 0\displaystyle\geq 0

which are convex constraints on mm’s lattice components. Since this is an integer programming problem in two dimensions, we apply Lenstra’s algorithm [7, 20] to efficiently list all such lattice points mm. For each mm, using Theorem 3, we attempt to write mm as a sum of two squares; if possible, say m=x02+x12m=x_{0}^{2}+x_{1}^{2}, we then attempt to write m~=ηkm\tilde{m}=\eta^{k}-m as a sum of two squares. If possible, say m~=x22+x32\tilde{m}=x_{2}^{2}+x_{3}^{2}, so we simply halt and return γ\gamma corresponding to (6.1). However, if m~\tilde{m} may not be represented as a sum of two squares, we simply move on to the next value of mm and try this process again. If this fails for all mm arising from kk, we increment kk and run Lenstra’s algorithm for the new inequalities.

Supposing we have halted and constructed γ\gamma, we compute δ1\delta_{1} and δ2\delta_{2} guaranteed by Lemma 1. These are efficiently approximable by §5 to γ1\gamma_{1} and γ2\gamma_{2}, respectively. Chaining together the three approximations as γ1γγ2\gamma_{1}\gamma\gamma_{2} gives the final desired approximation to gg.

6.2. Analysis

We begin the analysis by establishing the τ\tau-count and tightness of the approximation. In particular, d(γ1,δ1)<εd(\gamma_{1},\delta_{1})<\varepsilon and d(γ2,δ2)<εd(\gamma_{2},\delta_{2})<\varepsilon with factorization lengths of γ\gamma_{\ell} each up to (1+o(1))log59(1/ε3)(1+o(1))\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}}). By Lemma 1, d(γ,δ1γδ2)<Cεd(\gamma,\delta_{1}\gamma\delta_{2})<C\varepsilon. Therefore, d(g,γ1γγ2)<(C+2)εd(g,\gamma_{1}\gamma\gamma_{2})<(C+2)\varepsilon (by the triangle inequality) and since γ\gamma has a factorization of τ\tau-count up to (13+o(1))log59(1/ε3)\left(\frac{1}{3}+o(1)\right)\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}}), this constitutes a factorization of an element in gg’s neighborhood of τ\tau-count up to (73+o(1))log59(1/ε3)\left(\frac{7}{3}+o(1)\right)\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}}).

The efficiency of this algorithm—that is, that it runs in time O(polylog(1/ε))O(\operatorname{poly}\log(\nicefrac{{1}}{{\varepsilon}}))—is because we expect to halt when 59kεO(1)59^{k}\varepsilon\in O(1) (so only k13log59(1/ε3)k\approx\frac{1}{3}\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}}) calls are expected), and only call polynomially-many polynomial-time subroutines. The dominant subroutines are calls to Lenstra’s algorithm which as shown in [7] runs in time polynomial in the size of the constraints for any fixed dimension nn. Indeed, here we have only n=2n=2 dimensions, m=6m=6 linear constraints (two per absolute value), and the largest value aa in the constraints is ηk\eta^{k}, so the runtime is polynomial in nmlogaΘ(k)nm\log a\in\Theta(k).

The reason we expect to halt when 59kεO(1)59^{k}\varepsilon\in O(1) is that heuristically, we expect to halt when the area enclosed by (6.4)–(6.7) is O(polylog(1/ϵ))O(\operatorname{poly}\log(\nicefrac{{1}}{{\epsilon}})). Conveniently, the region is a rectangle since the vectors 1,φ\langle 1,\varphi\rangle and 1,φ\langle 1,\varphi^{\bullet}\rangle are orthogonal, so assuming in the limit that εmin{|α|,1|α|2|α|}\varepsilon\ll\min\left\{\left\lvert\alpha\right\rvert,\frac{1-\left\lvert\alpha\right\rvert^{2}}{\left\lvert\alpha\right\rvert}\right\} (so that (6.5) and the “bottom” inequality of (6.4) are redundant), we compute length-times-width of

2ε|α|ηk1+φ2(η)k1+(1φ)2=2|α|559kεO(polylog(1/ε))\frac{2\varepsilon\left\lvert\alpha\right\rvert\eta^{k}}{\sqrt{1+\varphi^{2}}}\cdot\frac{(\eta^{\bullet})^{k}}{\sqrt{1+(1-\varphi)^{2}}}=\frac{2\left\lvert\alpha\right\rvert}{\sqrt{5}}\cdot 59^{k}\varepsilon\in O(\operatorname{poly}\log(\nicefrac{{1}}{{\varepsilon}}))

whence we find k(1+o(1))log59(1/ε)k\in(1+o(1))\log_{59}(\nicefrac{{1}}{{\varepsilon}}) as expected.

When attempting to write elements of 𝒪\mathcal{O} as a sum of two squares, we primarily rest on a belief, in the style of Cramér’s conjecture and a conjecture of Sardari [23, (*)] that sums of squares are dense in \mathbb{N}. Seeking to analogize [23, (*)] in particular, we note that the operative aspect is that a dense cluster of lattice points will represent at least one sum of two squares, and that such a point thus will be found quickly through Lenstra’s algorithm.

The significance of this result is to accomplish a factorization in PU(2)\operatorname{PU}(2) in analogue to [25], but using a gate set with additional desirable properties beyond those enjoyed by the Clifford+TT gates.

7. Implementation Details and Examples

Our algorithm has been implemented for proof-of-concept purposes in Python, and the code is available at https://math.berkeley.edu/~zstier/icosahedral. Included in this implementation are:

  • An implementation of Lenstra’s algorithm for special cases (convex.py).

  • The rings [φ]\mathbb{Z}[\varphi], [i,φ]\mathbb{Z}[i,\varphi], and H([φ])H(\mathbb{Z}[\varphi]) (rings.py and quaternions.py).

  • Solutions to sum-of-two-square problems in [φ]\mathbb{Z}[\varphi] (rings.py).

  • Factorization of elements of H([φ])H(\mathbb{Z}[\varphi]) which are of norm a power of η\eta (quaternions.py).

  • Efficient factorization of diagonal elements of PU(2)\operatorname{PU}(2), as outlined in §5 (diagonal.py).

  • Efficient factorization of general elements of PU(2)\operatorname{PU}(2), as outlined in §6 (approx.py).

In the remainder of this section, we demonstrate the efficacy of this factorization technique on the two more “classical” single-qubit quantum gate generators; recall

H=i2(1111) and T=(eiπ/8eiπ/8).H=\frac{i}{\sqrt{2}}\begin{pmatrix}1&\phantom{-}1\\ 1&-1\end{pmatrix}\text{ and }T=\begin{pmatrix}e^{\nicefrac{{i\pi}}{{8}}}&\\ &e^{\nicefrac{{-i\pi}}{{8}}}\end{pmatrix}.

In both cases, pick precision ε=1/1010\varepsilon=\nicefrac{{1}}{{10^{10}}}.

For the first example of TT, we yield

T\displaystyle T ξ0τξ1τξ2τξ3τξ4τξ5τξ6τξ7τξ8τξ9τξ10τξ11τξ12τξ13τξ14τξ15τξ16τξ17τξ18τξ19\displaystyle\approx\xi_{0}\tau\xi_{1}\tau\xi_{2}\tau\xi_{3}\tau\xi_{4}\tau\xi_{5}\tau\xi_{6}\tau\xi_{7}\tau\xi_{8}\tau\xi_{9}\tau\xi_{10}\tau\xi_{11}\tau\xi_{12}\tau\xi_{13}\tau\xi_{14}\tau\xi_{15}\tau\xi_{16}\tau\xi_{17}\tau\xi_{18}\tau\xi_{19}
=(σσρσρ)τ(ρσσρσσ)τ(σρσρσ)τ(ρσρσρσ)τ(ρσσρ\displaystyle=(\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\sigma\rho\sigma\sigma)\tau(\sigma\rho\sigma\rho\sigma)\tau(\rho\sigma\rho\sigma\rho\sigma)\tau(\rho\sigma\sigma\rho
σρσσρσρ)τ(σσρσρσσρσρ)τ(ρ)τ(σρσσρσρσσρ)\displaystyle\quad\,\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\rho)\tau(\sigma\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho)
τ(ρ)τ(ρσσρσρσσρ)τ(ρσρσρσσ)τ(σσρσρσσρ)\displaystyle\quad\,\tau(\rho)\tau(\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho)\tau(\rho\sigma\rho\sigma\rho\sigma\sigma)\tau(\sigma\sigma\rho\sigma\rho\sigma\sigma\rho)
τ(σρσρσσ)τ(σρσσρσρσσρ)τ(σρσρσσρσ)τ(σρ\displaystyle\quad\,\tau(\sigma\rho\sigma\rho\sigma\sigma)\tau(\sigma\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho)\tau(\sigma\rho\sigma\rho\sigma\sigma\rho\sigma)\tau(\sigma\rho
σρσ)τ(ρσσρσρσσρσρ)τ(ρσσρσρσσρσ)τ(σρσ)\displaystyle\quad\,\sigma\rho\sigma)\tau(\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma)\tau(\sigma\rho\sigma)

where ξiσ,ρ\xi_{i}\in\left\langle\sigma,\rho\right\rangle correspond to the parenthesized terms following; this achieves precision in dd of 1.28/1010\nicefrac{{1.28}}{{10^{10}}}. The τ\tau-count is 19; compare this with the predicted log59(1/ε3)16.9\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}})\approx 16.9. We remark that the discrepancy can be attributed to computational limitations: at key steps in the algorithm, we must run the Tonelli–Shanks algorithm for finding quadratic residues modulo some prime qq, which has worst-case behavior Θ(q)\Theta(q). Therefore it is only practical to abandon on instances where some prime factor is at least 10610^{6}, something that occurred more than 328 times before the above-stated approximation was found.

For the second example of HH, the central element is determined to be the following:

γ\displaystyle\gamma =ξ0τξ1τξ2τξ3τξ4τξ5τξ6τξ7τξ8τξ9\displaystyle=\xi_{0}\tau\xi_{1}\tau\xi_{2}\tau\xi_{3}\tau\xi_{4}\tau\xi_{5}\tau\xi_{6}\tau\xi_{7}\tau\xi_{8}\tau\xi_{9}
=(ρσσρσρσσρσρ)τ(ρσρσρσσρσσ)τ(σρσρσσ)τ\displaystyle=(\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\sigma)\tau(\sigma\rho\sigma\rho\sigma\sigma)\tau
(ρσσρσρσσρ)τ(ρσσρ)τ(σρσσρσρ)τ(σρσρσσρ\displaystyle\quad\,(\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho)\tau(\rho\sigma\sigma\rho)\tau(\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\sigma\rho\sigma\rho\sigma\sigma\rho
σσ)τ(ρσρσσρσρσσρσ)τ(ρσσρσρσ)τ(ρσσρσρ)\displaystyle\quad\,\sigma\sigma)\tau(\rho\sigma\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma)\tau(\rho\sigma\sigma\rho\sigma\rho\sigma)\tau(\rho\sigma\sigma\rho\sigma\rho)

(ξi\xi_{i} serves the same function as above). It has τ\tau-count 9; compare this with the predicted 13log59(1/ε3)5.6\frac{1}{3}\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}})\approx 5.6. As before, the discrepancy can be attributed to non-vanishing of o(1)o(1) factors for explicit ε\varepsilon and to having to abandon possibilities with very large prime factors. The outer diagonal elements are determined to be the following:

γ1\displaystyle\gamma_{1}  ξ~ 0τ ξ~ 1τ ξ~ 2τ ξ~ 3τ ξ~ 4τ ξ~ 5τ ξ~ 6τ ξ~ 7τ ξ~ 8τ ξ~ 9τ ξ~ 10τ ξ~ 11τ ξ~ 12τ ξ~ 13τ ξ~ 14τ ξ~ 15τ ξ~ 16τ ξ~ 17τ ξ~ 18\displaystyle\approx\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{0}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{1}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{2}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{3}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{4}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{5}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{6}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{7}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{8}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{9}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{10}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{11}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{12}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{13}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{14}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{15}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{16}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{17}\tau\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{18}
=(σρσρ)τ(ρσρσσρσρσσρσ)τ(ρσσρσσ)τ(ρσσρσ\displaystyle=(\sigma\rho\sigma\rho)\tau(\rho\sigma\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma)\tau(\rho\sigma\sigma\rho\sigma\sigma)\tau(\rho\sigma\sigma\rho\sigma
ρσσρσρ)τ(ρσρσρσ)τ(σρσρ)τ(σσρσ)τ(σσρσρσ)\displaystyle\quad\,\rho\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\rho\sigma\rho\sigma)\tau(\sigma\rho\sigma\rho)\tau(\sigma\sigma\rho\sigma)\tau(\sigma\sigma\rho\sigma\rho\sigma)
τ(σρσρσσρσ)τ(σρσρσσρσρ)τ(ρσσρσρσ)τ(ρσ\displaystyle\quad\,\tau(\sigma\rho\sigma\rho\sigma\sigma\rho\sigma)\tau(\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\sigma\rho\sigma\rho\sigma)\tau(\rho\sigma
ρσσρ)τ(σσρσ)τ(σσρσρσσ)τ(σσ)τ(σρσρσσρσσ)\displaystyle\quad\,\rho\sigma\sigma\rho)\tau(\sigma\sigma\rho\sigma)\tau(\sigma\sigma\rho\sigma\rho\sigma\sigma)\tau(\sigma\sigma)\tau(\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\sigma)
τ(σρσρσσρσ)τ(σσρσρσσρσρ)τ(ρσρσσρσρσ)\displaystyle\quad\,\tau(\sigma\rho\sigma\rho\sigma\sigma\rho\sigma)\tau(\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\rho\sigma\sigma\rho\sigma\rho\sigma)
γ2\displaystyle\gamma_{2} ξ^0τξ^1τξ^2τξ^3τξ^4τξ^5τξ^6τξ^7τξ^8τξ^9τξ^10τξ^11τξ^12τξ^13τξ^14τξ^15τξ^16τξ^17τξ^18\displaystyle\approx\hat{\xi}_{0}\tau\hat{\xi}_{1}\tau\hat{\xi}_{2}\tau\hat{\xi}_{3}\tau\hat{\xi}_{4}\tau\hat{\xi}_{5}\tau\hat{\xi}_{6}\tau\hat{\xi}_{7}\tau\hat{\xi}_{8}\tau\hat{\xi}_{9}\tau\hat{\xi}_{10}\tau\hat{\xi}_{11}\tau\hat{\xi}_{12}\tau\hat{\xi}_{13}\tau\hat{\xi}_{14}\tau\hat{\xi}_{15}\tau\hat{\xi}_{16}\tau\hat{\xi}_{17}\tau\hat{\xi}_{18}
=(σρσρσσρ)τ(σσρσρσσρ)τ(σρσρσσρσσ)τ(σρσ\displaystyle=(\sigma\rho\sigma\rho\sigma\sigma\rho)\tau(\sigma\sigma\rho\sigma\rho\sigma\sigma\rho)\tau(\sigma\rho\sigma\rho\sigma\sigma\rho\sigma\sigma)\tau(\sigma\rho\sigma
σρσρ)τ(σσρσρσ)τ(ρσρσσ)τ(σρσσρσρ)τ(ρσρσ\displaystyle\quad\,\sigma\rho\sigma\rho)\tau(\sigma\sigma\rho\sigma\rho\sigma)\tau(\rho\sigma\rho\sigma\sigma)\tau(\sigma\rho\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\rho\sigma
ρσσρσρ)τ(ρσρσσρσσ)τ(ρσσρσ)τ(ρσρσρσσρ\displaystyle\quad\,\rho\sigma\sigma\rho\sigma\rho)\tau(\rho\sigma\rho\sigma\sigma\rho\sigma\sigma)\tau(\rho\sigma\sigma\rho\sigma)\tau(\rho\sigma\rho\sigma\rho\sigma\sigma\rho
σρ)τ(ρσσρ)τ(ρσσρσρσσρ)τ(ρσσρσ)τ(ρσρσσρ\displaystyle\quad\,\sigma\rho)\tau(\rho\sigma\sigma\rho)\tau(\rho\sigma\sigma\rho\sigma\rho\sigma\sigma\rho)\tau(\rho\sigma\sigma\rho\sigma)\tau(\rho\sigma\rho\sigma\sigma\rho
σρσσ)τ(ρσρσσρσρσσ)τ(σσρσρσσρσ)τ(σρ)τ(ρ)\displaystyle\quad\,\sigma\rho\sigma\sigma)\tau(\rho\sigma\rho\sigma\sigma\rho\sigma\rho\sigma\sigma)\tau(\sigma\sigma\rho\sigma\rho\sigma\sigma\rho\sigma)\tau(\sigma\rho)\tau(\rho)

( ξ~ i\leavevmode\hbox{\set@color\hskip 2.1875pt\leavevmode\hbox{\set@color$\xi$}\raisebox{2.6389pt}{\leavevmode\hbox{\set@color$\mathchar 12414\relax$}}\hskip 2.1875pt}_{i}, ξ^i\hat{\xi}_{i} serves the same function as above). They both have τ\tau-counts of 18, with 75 collective abandoned cases; compare this with the predicted log59(1/ε3)16.9\log_{59}(\nicefrac{{1}}{{\varepsilon^{3}}})\approx 16.9. Multiplying out γ1γγ2\gamma_{1}\gamma\gamma_{2} gives distance in dd of 1.28/1010\nicefrac{{1.28}}{{10^{10}}} (that this difference equals the previous one is a coincidence; they disagree at the third decimal place).

8. Concluding Remarks

This work represents a theoretical, heuristic, and proof-of-concept demonstration of a state-of-the-art methodology to construct single-qubit quantum gates, optimizing for using as few expensive gates as possible, and in particular representing an improvement of log2595.9\log_{2}59\approx 5.9 over the previous best method [25]. However, the area of efficient quantum hardware selection is far from fully explored. While [19] demonstrates that efficiently computing a length-logp(1/ϵ3)\log_{p}(\nicefrac{{1}}{{\epsilon^{3}}}) factorization is NP-complete (for an arbitrary PU(2)\operatorname{PU}(2)-element into golden gates associated to prime pp), it may still be possible to achieve length-clogp(1/ϵ3)c\log_{p}(\nicefrac{{1}}{{\epsilon^{3}}}) factorizations for c(1,7/3)c\in(1,\nicefrac{{7}}{{3}}). Another possibility is in the study of multiple qubits simultaneously, as has been initiated for PU(3)\operatorname{PU}(3) by Evra and Parzanchevski [4].

9. Acknowledgements

TRB was supported, in part, by the Institute for Advanced Study and Medgar Evers College. ZS was supported by a National Science Foundation graduate research fellowship, grant numbers DGE-1752814/2146752. We are deeply indebted to Peter Sarnak for introducing us to this to this problem and for the many conversations that led to production of this paper. We are also grateful to Carlos Esparza and John Voight for their suggestions.

References

  • [1] E. Carvalho Pinto and C. Petit. Better path-finding algorithms in LPS Ramanujan graphs. Journal of Mathematical Cryptology, 12(4), 2018.
  • [2] G. P. Davidoff, P. Sarnak, and A. Valette. Elementary number theory, group theory, and Ramanujan graphs, volume 55. Cambridge university press Cambridge, 2003.
  • [3] R. Dubisch. The wedderburn structure theorems. The American Mathematical Monthly, 54(5):253–259, 1947.
  • [4] S. Evra and O Parzanchevski. Ramanujan complexes and Golden Gates in PU(3). Geometric and Functional Analysis, 2022.
  • [5] S. Johansson. A description of quaternion algebras. preprint, 1997.
  • [6] V. Kliuchnikov, D. Maslov, and M. Mosca. Fast and efficient exact synthesis of single qubit unitaries generated by Clifford and T gates. Quantum Information & Computation, 13(7&8), 2012.
  • [7] H. W. Lenstra Jr. Integer Programming with a Fixed Number of Variables. Mathematics of Operations Research, 8(4), 1983.
  • [8] D. W. Lewis. Quaternion algebras and the algebraic legacy of hamilton’s quaternions. Irish Math. Soc. Bull, 57:41–64, 2006.
  • [9] The LMFDB Collaboration. Number field 4.0.400.1: (i,5)\mathbb{Q}(i,\sqrt{5}).
  • [10] A. Lubotzky and O. Parzanchevski. From Ramanujan graphs to Ramanujan complexes. Philosophical Transactions of the Royal Society A, 378(2163):20180445, 2020.
  • [11] A. Lubotzky, R. Phillips, and P. Sarnak. Explicit expanders and the Ramanujan conjectures. In Proceedings of the eighteenth annual ACM symposium on Theory of computing, pages 240–246, 1986.
  • [12] A. Lubotzky, R. Phillips, and P. Sarnak. Hecke operators and distributing points on the sphere. I. Communications on Pure and Applied Mathematics, 39(S1):S149–S186, 1986.
  • [13] A. Lubotzky, R. Phillips, and P. Sarnak. Hecke operators and distributing points on S2S^{2}. II. Communications on Pure and Applied Mathematics, 40(4):401–420, 1987.
  • [14] A. Lubotzky, R. Phillips, and P. Sarnak. Ramanujan graphs. Combinatorica, 8(3):261–277, 1988.
  • [15] T. Miyake. Modular forms. Springer Science & Business Media, 2006.
  • [16] D. W. Morris. Introduction to arithmetic groups. arXiv preprint math/0106063, 2001.
  • [17] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information: 10th Anniversary Edition. Cambridge University Press, 2010.
  • [18] P. Niemann, R. Wille, and R. Drechsler. Advanced exact synthesis of Clifford+TT circuits. Quantum Information Processing, 19(9):1–23, 2020.
  • [19] O. Parzanchevski and P. Sarnak. Super Golden Gates for PU(2). Advances in Mathematics, 327:869–901, 2018.
  • [20] A. Paz. A simplified version of H.W. Lenstra’s integer programming algorithm and some applications. 1983.
  • [21] N. J. Ross and P. Selinger. Optimal ancilla-free Clifford+T{T} approximation of zz-rotations. Quantum Information and Computation, 2016.
  • [22] N. T. Sardari. Diameter of Ramanujan graphs and random Cayley graphs with numerics. arXiv preprint arXiv:1511.09340, 1, 2015.
  • [23] N. T. Sardari. Complexity of Strong Approximation on the Sphere. International Mathematics Research Notices, 2019.
  • [24] P. Sarnak. Letter to Aaronson and Pollington on the Solovay–Kitaev Theorem and Golden Gates. 2015.
  • [25] Z. Stier. Short paths in PU(2). Quantum Information and Computation, 21(9&10), 2021.
  • [26] A. Strömbergsson. An application of an explicit trace formula to a wellknown spectral correspondence on quaternion groups. Preprint available at http://www.math.uu.se/~astrombe/papers/JLny2.ps, 2000.
  • [27] M.-F. Vignéras. Arithmétique des algèbres de quaternions, volume 800 of Lecture Notes in Mathematics. Springer, Berlin, 1980.
  • [28] J. Voight. Quaternion algebras. Springer Nature, 2021.

Appendix A (i,φ)\mathbb{Q}(i,\varphi) is norm-Euclidean

Put R=[i,φ]R=\mathbb{Z}[i,\varphi] and K=(i,φ)K=\mathbb{Q}(i,\varphi).

Proposition 1.

RR is the ring of integers of KK.

Proof.

Consider the canonical \mathbb{Z}-basis of RR, namely {1,φ,i,iφ}\{1,\varphi,i,i\varphi\}. One readily computes its discriminant to be 400, equal to KK’s, as per [9]. ∎

Consider the norm function N=|NK/|N=\left\lvert N_{K/\mathbb{Q}}\right\rvert, defined on KK and taking integral values on RR. (The absolute value here is customary but superfluous, as one of KK’s \mathbb{Q}-automorphisms is complex conjugation.)

In what follows, balls (r)\mathcal{B}(r) of radius rr are centered at the origin and closed and with respect to the \ell_{\infty}-norm.

We shall treat KK interchangably with its formulation as the \mathbb{Q}-space 4\mathbb{Q}^{4}.

Theorem 2.

RR is norm-Euclidean; that is, RR is a Euclidean domain with respect to the Euclidean function NN.

Proof.

We use the standard reformulation that RR is norm-Euclidean if and only if for all αK\alpha\in K there is βR\beta\in R for which N(αβ)<1N(\alpha-\beta)<1. Therefore we shall attack the latter statement.

Put α=w+xφ+yi+ziφK\alpha=w+x\varphi+yi+zi\varphi\in K. Then, we readily compute

N(α)\displaystyle N(\alpha) =w4+2w3xw2x22wx3+x4+2w2y2+2wxy2+3x2y2+y4+2w2yz\displaystyle=w^{4}+2w^{3}x-w^{2}x^{2}-2wx^{3}+x^{4}+2w^{2}y^{2}+2wxy^{2}+3x^{2}y^{2}+y^{4}+2w^{2}yz
8wxyz2x2yz+2y3z+3w2z22wxz2+2x2z2y2z22yz3+z4.\displaystyle\qquad-8wxyz-2x^{2}yz+2y^{3}z+3w^{2}z^{2}-2wxz^{2}+2x^{2}z^{2}-y^{2}z^{2}-2yz^{3}+z^{4}.

Define α=max{|w|,|x|,|y|,|z|}\left\|\alpha\right\|=\max\{\left\lvert w\right\rvert,\left\lvert x\right\rvert,\left\lvert y\right\rvert,\left\lvert z\right\rvert\}. Then we can also compute, for fixed α\alpha and any other βK\beta\in K with β0\left\|\beta\right\|\to 0,

(A.1) N(α+β)N(α)+O(β),N(\alpha+\beta)\leq N(\alpha)+O(\left\|\beta\right\|),

and we shall presently obtain an effective, non-asymptotic form of this fact.

Viewing K4K\cong\mathbb{Q}^{4} as \mathbb{Q}-spaces and RR as the standard lattice, we can translate any element of KK using RR to one with \ell_{\infty}-norm at most 1/2\nicefrac{{1}}{{2}} (that is, (1/2)\mathcal{B}(\nicefrac{{1}}{{2}})), by subtracting off the “rounded” element—round each component to the nearest integer. So, it remains to verify whether every element α(1/2)\alpha\in\mathcal{B}(\nicefrac{{1}}{{2}}) of the vector space has N(α)<1N(\alpha)<1 (which turns out to not actually hold!). To attempt to accomplish this, we look at a refinement of RR. In particular, consider Λ=1n4\Lambda=\frac{1}{n}\mathbb{Z}^{4} for well-chosen nn. We cover (1/2)\mathcal{B}(\nicefrac{{1}}{{2}}) with Λ\Lambda-translates of (1/2n)\mathcal{B}(\nicefrac{{1}}{{2n}}), so that for each αΛ\alpha\in\Lambda, for all βα+(1/2n)\beta\in\alpha+\mathcal{B}(\nicefrac{{1}}{{2n}}), by (A.1) we have N(β)N(α)+O(1/2n)N(\beta)\leq N(\alpha)+O(\nicefrac{{1}}{{2n}}). The balancing act, then, is to choose nn small enough so that #(Λ(1/2))\#(\Lambda\cap\mathcal{B}(\nicefrac{{1}}{{2}})) is manageable for a computer, but large enough so that 1/2n\nicefrac{{1}}{{2n}} is sufficiently small. There is a catch, where it is not actually the case that we can ensure that N(α)+O(1/2n)<1N(\alpha)+O(\nicefrac{{1}}{{2n}})<1 for all αΛ(1/2)\alpha\in\Lambda\cap\mathcal{B}(\nicefrac{{1}}{{2}}); however, for those α\alpha which violate this condition, we can attempt to circumvent the obstruction by translating to (α+δ)+(1/2n)(\alpha+\delta)+\mathcal{B}(\nicefrac{{1}}{{2n}}) (where δ4\delta\in\mathbb{Z}^{4}) and then taking NN.222As it turns out, picking δ\delta so that it shifts exactly one component towards 0 by exactly 1 is sufficient when any δ\delta is necessary at all.

We now establish (A.1), after which we will be able to select nn. Keeping α=w+xφ+yi+ziφ\alpha=w+x\varphi+yi+zi\varphi, put β=d1+d2φ+d3i+d4φ\beta=d_{1}+d_{2}\varphi+d_{3}i+d_{4}\varphi. Fully written out, N(α+β)N(\alpha+\beta) is (the absolute value of) a degree-four polynomial in eight variables with 170 total terms, so we spare the reader its full statement. Letting βε\left\|\beta\right\|\leq\varepsilon,333So that |d|ε\left\lvert d_{\ell}\right\rvert\leq\varepsilon for all [4]\ell\in[4]. however, we have that

N(α+β)\displaystyle N(\alpha+\beta) N(α)\displaystyle\leq N(\alpha)
+2ε(|2w3+3w2xwx2x3+2wy2+xy2+2wyz+3wz2xz24xyz|\displaystyle\quad+2\varepsilon(\left\lvert 2w^{3}+3w^{2}x-wx^{2}-x^{3}+2wy^{2}+xy^{2}+2wyz+3wz^{2}-xz^{2}-4xyz\right\rvert
+|w3w2x3wx2+2x3+wy2+3xy24wyz2xyzwz2+2xz2|\displaystyle\qquad+\left\lvert w^{3}-w^{2}x-3wx^{2}+2x^{3}+wy^{2}+3xy^{2}-4wyz-2xyz-wz^{2}+2xz^{2}\right\rvert
+|2w2y+2wxy+3x2y+2y3+w2z4wxzx2z+3y2zyz2z3|\displaystyle\qquad+\left\lvert 2w^{2}y+2wxy+3x^{2}y+2y^{3}+w^{2}z-4wxz-x^{2}z+3y^{2}z-yz^{2}-z^{3}\right\rvert
+|w2y4wxyx2y+y3+3w2z2wxz+2x2zy2z3yz2+2z3|)\displaystyle\qquad+\left\lvert w^{2}y-4wxy-x^{2}y+y^{3}+3w^{2}z-2wxz+2x^{2}z-y^{2}z-3yz^{2}+2z^{3}\right\rvert)
+ε2(|6w2+6wxx2+2y2+2yz+3z2|+2|3w22wx3x2+y24yzz2|\displaystyle\quad+\varepsilon^{2}(\left\lvert 6w^{2}+6wx-x^{2}+2y^{2}+2yz+3z^{2}\right\rvert+2\left\lvert 3w^{2}-2wx-3x^{2}+y^{2}-4yz-z^{2}\right\rvert
+|w26wx+6x2+3y22yz+2z2|+|2w2+2wx+3x2+6y2+6yzz2|\displaystyle\qquad+\left\lvert-w^{2}-6wx+6x^{2}+3y^{2}-2yz+2z^{2}\right\rvert+\left\lvert 2w^{2}+2wx+3x^{2}+6y^{2}+6yz-z^{2}\right\rvert
+2|w24wxx2+3y22yz3z2|+|3w22wx+2x26yz+6z2y2|\displaystyle\qquad+2\left\lvert w^{2}-4wx-x^{2}+3y^{2}-2yz-3z^{2}\right\rvert+\left\lvert 3w^{2}-2wx+2x^{2}-6yz+6z^{2}-y^{2}\right\rvert
+4|2wy+xy+wz2xz|+4|wy+3xy2wzxz|+4|wy2xy+3wzxz|\displaystyle\qquad+4\left\lvert 2wy+xy+wz-2xz\right\rvert+4\left\lvert wy+3xy-2wz-xz\right\rvert+4\left\lvert wy-2xy+3wz-xz\right\rvert
+4|2wyxywz+2xz|)\displaystyle\qquad+4\left\lvert-2wy-xy-wz+2xz\right\rvert)
+2ε3(|w+x|+2|3wx|+2|w+3x|+2|2xw|+3|2w+x|+2|w2x|\displaystyle\quad+2\varepsilon^{3}(\left\lvert w+x\right\rvert+2\left\lvert 3w-x\right\rvert+2\left\lvert w+3x\right\rvert+2\left\lvert 2x-w\right\rvert+3\left\lvert 2w+x\right\rvert+2\left\lvert w-2x\right\rvert
+2|2zy|+2|y+3z|+2|y2z|+2|3yz|+4|2y+z|)\displaystyle\qquad+2\left\lvert 2z-y\right\rvert+2\left\lvert y+3z\right\rvert+2\left\lvert y-2z\right\rvert+2\left\lvert 3y-z\right\rvert+4\left\lvert 2y+z\right\rvert)
+40ε4.\displaystyle\quad+40\varepsilon^{4}.

Then, picking n=6n=6 and ε=112\varepsilon=\frac{1}{12}, for each αΛ(1/2)\alpha\in\Lambda\cap\mathcal{B}(\nicefrac{{1}}{{2}}) as well as α(sgnw,0,0,0)\alpha-(\operatorname{sgn}w,0,0,0), α(0,sgnx,0,0)\alpha-(0,\operatorname{sgn}x,0,0), α(0,0,sgny,0)\alpha-(0,0,\operatorname{sgn}y,0), and α(0,0,0,sgnz)\alpha-(0,0,0,\operatorname{sgn}z), we test whether the right-hand side of the above is bounded by 1 for any of those five choices. The code appearing in Figure 1 verifies that this indeed comes to pass. ∎

1import numpy as np
2def KQnorm(w,x,y,z,r):
3 p0 = abs(w**4 + 2 * w**3 * x - w**2 * x**2 - 2 * w * x**3 + x**4 \
4 + 2 * w**2 * y**2 + 2 * w * x * y**2 + 3 * x**2 * y**2 \
5 + y**4 + 2 * w**2 * y * z - 8 * w * x * y * z \
6 - 2 * x**2 * y * z + 2 * y**3 * z + 3 * w**2 * z**2 \
7 - y**2 * z**2 - 2 * y * z**3 + z**4 - 2 * w * x * z**2 \
8 + 2 * x**2 * z**2)
9 p1 = 2 * r * (abs(2 * w**3 + 3 * w**2 * x - w * x**2 - x**3 \
10 + 2 * w * y**2 + x * y**2 + 2 * w * y * z + 3 * w * z**2 \
11 - x * z**2 - 4 * x * y * z) + abs(w**3 - w**2 * x \
12 - 3 * w * x**2 + 2 * x**3 + w * y**2 + 3 * x * y**2 \
13 - 4 * w * y * z - 2 * x * y * z - w * z**2 + 2 * x * z**2) \
14 + abs(2 * w**2 * y + 2 * w * x * y + 3 * x**2 * y + 2 * y**3 \
15 + w**2 * z - 4 * w * x * z - x**2 * z + 3 * y**2 * z \
16 - y * z**2 - z**3) + abs(w**2 * y - 4 * w * x * y - x**2 * y \
17 + y**3 + 3 * w**2 * z - 2 * w * x * z + 2 * x**2 * z \
18 - y**2 * z - 3 * y * z**2 + 2 * z**3))
19 p2 = r**2 * (abs(6 * w**2 + 6 * w * x - x**2 + 2 * y**2 \
20 + 2 * y * z + 3 * z**2) + abs(6 * w**2 - 4 * w * x \
21 - 6 * x**2 + 2 * y**2 - 8 * y * z - 2 * z**2) \
22 + abs( - w**2 - 6 * w * x + 6 * x**2 + 3 * y**2 - 2 * y * z \
23 + 2 * z**2) + abs(2 * w**2 + 2 * w * x + 3 * x**2 + 6 * y**2 \
24 + 6 * y * z - z**2) + abs(2 * w**2 - 8 * w * x - 2 * x**2 \
25 + 6 * y**2 - 4 * y * z - 6 * z**2) + abs(3 * w**2 \
26 - 2 * w * x + 2 * x**2 - 6 * y * z + 6 * z**2 - y**2) \
27 + abs(8 * w * y + 4 * x * y + 4 * w * z - 8 * x * z) \
28 + abs(4 * w * y + 12 * x * y - 8 * w * z - 4 * x * z) \
29 + abs(4 * w * y - 8 * x * y + 12 * w * z - 4 * x * z) \
30 + abs( - 8 * w * y - 4 * x * y - 4 * w * z + 8 * x * z))
31 p3 = 2 * r**3 * (abs(w + x) + 2 * abs(3 * w - x) \
32 + 2 * abs(w + 3 * x) + 2 * abs(2 * x - w) \
33 + 3 * abs(2 * w + x) + 2 * abs(w - 2 * x) \
34 + 2 * abs(2 * z - y) + 2 * abs(y + 3 * z) \
35 + 2 * abs(y - 2 * z) + 2 * abs(3 * y - z) \
36 + 4 * abs(2 * y + z))
37 p4 = 40 * r**4
38 return p0 + p1 + p2 + p3 + p4
39n = 7
40r = 1.0/(2*(n-1))
41l = np.linspace(-0.5,0.5,n)
42for a in l:
43 for b in l:
44 for c in l:
45 for d in l:
46 if KQnorm(a,b,c,d,r) >= 1 \
47 and KQnorm(a-np.sign(a),b,c,d,r) >= 1 \
48 and KQnorm(a,b-np.sign(b),c,d,r) >= 1 \
49 and KQnorm(a,b,c-np.sign(c),d,r) >= 1 \
50 and KQnorm(a,b,c,d-np.sign(d),r) >= 1:
51 print (a,b,c,d)
Figure 1. Code used to prove Theorem 2. It prints 4-tuples corresponding to points whose norms are too large. Its failure to print anything completes the proof.
Remark 1.

It would appear that this method would readily adapt to a computation to determine that additional number fields are norm-Euclidean, so long as one knows a basis for its ring of integers and correspondingly computes an appropriate analogue to the local effective bound KQnorm(w,x,y,z,r). Unfortunately we have been unable to reproduce this success with any other biquadratic number field of the form (i,n)\mathbb{Q}\left(i,\sqrt{n}\right), n>5n>5.

Appendix B Irreducible elements and sums of two squares in [φ]\mathbb{Z}[\varphi]

As in [21, §C], here we shall summarize results about classifying irreducible elements in [φ]\mathbb{Z}[\varphi] and an efficient algorithm for writing certain elements of [φ]\mathbb{Z}[\varphi] as a sum of two squares.

For this section, let N=N(φ)/N=N_{\mathbb{Q}(\varphi)/\mathbb{Q}} (in contrast to N=N(i,φ)/N=N_{\mathbb{Q}(i,\varphi)/\mathbb{Q}} as in §A). We readily compute N(a+bφ)=a2+abb2N(a+b\varphi)=a^{2}+ab-b^{2}.

Recall that [φ]\mathbb{Z}[\varphi] is a Euclidean domain with respect to |N|\left\lvert N\right\rvert, and that [φ]×=±φ\mathbb{Z}[\varphi]^{\times}=\left\langle\pm\varphi\right\rangle (cf. [19, §4.1.4]). Also recall that φ\varphi^{\bullet} is φ\varphi’s Galois conjugate, which happens to equal 1φ1-\varphi. Extend \mathbb{Q}-linearly to all of (φ)\mathbb{Q}(\varphi).

Proposition 2.

Let pp be irreducible in \mathbb{Z}. If p±2(mod5)p\equiv\pm 2\pmod{5} then pp is irreducible in [φ]\mathbb{Z}[\varphi]. If p±1(mod5)p\equiv\pm 1\pmod{5} then there is an algorithm (running in time O(polylogp)O(\operatorname{poly}\log p)) to compute a [φ]\mathbb{Z}[\varphi]-irreducible element dividing pp.

Proof.

If pp is reducible in [φ]\mathbb{Z}[\varphi] then there exist non-units u,v[φ]u,v\in\mathbb{Z}[\varphi] with uv=puv=p and accordingly N(u)N(p)=p2N(u)\mid N(p)=p^{2}; that is, N(u)=pN(u)=p.

  • Observe that a2+abb2(a2b)2(mod5)a^{2}+ab-b^{2}\equiv(a-2b)^{2}\pmod{5} and that neither of ±2\pm 2 are quadratic residues modulo 5. This proves the first part of the proposition, as if u=a+bφu=a+b\varphi then N(u)N(u) can never equal pp.

  • To prove the second part, we first show that if p1(mod5)p\equiv 1\pmod{5} then there exists xx\in\mathbb{Z} satisfying x2x10(modp)x^{2}-x-1\equiv 0\pmod{p}. Indeed, rewrite the equation to (x21)21+41(x-2^{-1})^{2}\equiv 1+4^{-1}. Now,

    (1+41p)=(1+41p)(4p)=(5p)=(p5)(1)p1=(p5)=1\left(\frac{1+4^{-1}}{p}\right)=\left(\frac{1+4^{-1}}{p}\right)\left(\frac{4}{p}\right)=\left(\frac{5}{p}\right)=\left(\frac{p}{5}\right)(-1)^{p-1}=\left(\frac{p}{5}\right)=1

    with the third equality following by quadratic reciprocity. Let yy\in\mathbb{Z} be any modulo-pp square root of 1+411+4^{-1}. Then x=y+21x=y+2^{-1} suffices. Set z=xφ[φ]z=x-\varphi\in\mathbb{Z}[\varphi]. Then px2x1=zzp\mid x^{2}-x-1=zz^{\bullet} but pzp\nmid z as zz’s φ\varphi-part is not divisible by pp, so pp is not irreducible. Thus u=gcd(p,z)u=\gcd(p,z) will be a non-unit divisor of pp, as desired. (uu cannot be a unit because otherwise gcd(p,z)=gcd(p,z)\gcd(p^{\bullet},z^{\bullet})=\gcd(p,z^{\bullet}) will also be a unit, but then there are no prime divisors of zzzz^{\bullet} also dividing pp, a contradiction.)

    The timing of the above method is due to the extended Euclidean algorithm for modular inverses, and the standard Euclidean algorithm for GCDs in a Euclidean domain, which have respective runtimes O(log2p)O(\log^{2}p) and O(logp)O(\log p).

The only case not covered in the above is p=5p=5, which trivially factors as (1+2φ)2(-1+2\varphi)^{2}.

Algorithm 2.

Assume that there is an efficient blackbox algorithm to factor nn\in\mathbb{Z} in time O(polylogn)O(\operatorname{poly}\log n). Then there is an algorithm which factors n[φ]n\in\mathbb{Z}[\varphi] in time O(polylog|N(n)|)O(\operatorname{poly}\log\left\lvert N(n)\right\rvert).

Proof.

Factor N(n)N(n) as =1mpe\prod\limits_{\ell=1}^{m}p_{\ell}^{e_{\ell}}. Each prime pp_{\ell} is factorizable in time O(polylogp)O(polylogN(n))O(\operatorname{poly}\log p_{\ell})\subset O(\operatorname{poly}\log N(n)), by Proposition 2. There are O(logN(n))O(\log N(n)) such primes. ∎

To each irreducible u[φ]u\in\mathbb{Z}[\varphi], let its associated prime be the least (positive) prime factor of N(u)N(u); that is, N(u)\sqrt{N(u)} when uu is a \mathbb{Z}-irreducible times a unit, and N(u)N(u) otherwise.444Let us quickly establish why the associated prime is well-defined. Suppose pqp\neq q are \mathbb{Z}-primes with p,qN(u)=uup,q\mid N(u)=uu^{\bullet}. Let p,q[φ]p^{\prime},q^{\prime}\in\mathbb{Z}[\varphi] be irreducibles such that |N(p)|=p\left\lvert N(p^{\prime})\right\rvert=p if p±1(mod5)p\equiv\pm 1\pmod{5} or p=5p=5, and p=pp^{\prime}=p otherwise; and define qq^{\prime} similarly. Then p,quup^{\prime},q^{\prime}\mid uu^{\bullet}, so as irreducibles we have pp^{\prime} dividing either uu or uu^{\bullet}, and similarly for qq^{\prime}, hence p=qp=q since uu and uu^{\bullet} are themselves irreducible. Thus, N(u)N(u) is divisible by at most one prime at most twice.

Lemma 2.

For any u[φ]\{0}u\in\mathbb{Z}[\varphi]\backslash\{0\}, it is not the case that both uu and uφu\varphi can be written as a sum of two squares.

Proof.

Suppose not, so write u=w2+x2u=w^{2}+x^{2} and uφ=y2+z2u\varphi=y^{2}+z^{2}. Then we may take the product

uuφ=N(u)φ=(wy+xz)2+(wzxy)2=(a+bφ)2+(c+dφ)2u^{\bullet}u\varphi=N(u)\varphi=(w^{\bullet}y+x^{\bullet}z)^{2}+(w^{\bullet}z-x^{\bullet}y)^{2}=(a+b\varphi)^{2}+(c+d\varphi)^{2}

for some a,b,c,da,b,c,d\in\mathbb{Z}. The right-hand side expands out with 1-part555Viewing (φ)1,φ2\mathbb{Q}(\varphi)\cong\left\langle 1,\varphi\right\rangle_{\mathbb{Q}}\cong\mathbb{Q}^{2} so that the “1-part” and “φ\varphi-part” are the corresponding components in the canonical isomorphism. equal to a2+b2+c2+d2a^{2}+b^{2}+c^{2}+d^{2}, but N(u)φN(u)\varphi has 1-part equal to 0, so a=b=c=d=0a=b=c=d=0, and therefore N(u)=0N(u)=0, a contradiction. ∎

Proposition 3.

Let u[φ]u\in\mathbb{Z}[\varphi] be irreducible, with associated prime pp. If p1,3,7,9,13,17(mod20)p\equiv 1,3,7,9,13,17\pmod{20} then there is an efficient algorithm in time O(polylogp)O(\operatorname{poly}\log p) to write either uu or uφu\varphi as a sum of two squares.

Observe that ϕ(20)=8\phi(20)=8 (ϕ\phi here signifying Euler’s totient function), so by Chebotarev’s density theorem the above-asserted algorithm manages to write 3/4\nicefrac{{3}}{{4}} of primes as sums of two squares.

Proof.

Before beginning in earnest, first remark that 2 factors in [i,φ]\mathbb{Z}[i,\varphi] as (1+i)(1i)(1+i)(1-i), and that 1±i1\pm i are irreducible because N(i,φ)/(φ)(1±i)=2N_{\mathbb{Q}(i,\varphi)/\mathbb{Q}(\varphi)}(1\pm i)=2, a [φ]\mathbb{Z}[\varphi]-irreducible.

  • Suppose p1(mod4)p\equiv 1\pmod{4} (i.e. p1,9,13,17(mod20)p\equiv 1,9,13,17\pmod{20}). Then there exists even xx\in\mathbb{Z} satisfying x2+10(modp)x^{2}+1\equiv 0\pmod{p}, that is, upx2+1u\mid p\mid x^{2}+1. Let g=gcd(u,x+i)g=\gcd(u,x+i), computed in [i,φ]\mathbb{Z}[i,\varphi], and set s=Re(g)s=\operatorname{Re}(g) and t=Im(g)t=\operatorname{Im}(g). We claim that either uu or uφu\varphi equals a squared unit times s2+t2s^{2}+t^{2}. Indeed, suppose q[i,φ]q\in\mathbb{Z}[i,\varphi] is an irreducible divisor of uu. We wish to show that qq divides exactly one of x±ix\pm i. (Clearly it divides at least one of them, by qux2+1q\mid u\mid x^{2}+1.)

    • Say 1±iq[i,φ]1\pm i\neq q\in\mathbb{Z}[i,\varphi] is an irreducible dividing uu and both x±ix\pm i. Then q2iq\mid 2i. By our choice of qq this is impossible.

    • Suppose now we let qq be one of 1±i1\pm i. Then in order to have 1±1ix±2i1\pm_{1}i\mid x\pm_{2}i (where ±1\pm_{1} and ±2\pm_{2} are independent signs) we must have (1±1i)(a+bi)=x±2i(1\pm_{1}i)(a+bi)=x\pm_{2}i where a,b[φ]a,b\in\mathbb{Z}[\varphi]. Solving for aa and bb by looking at real and imaginary parts, we have

      a1b\displaystyle a\mp_{1}b =x\displaystyle=\phantom{\pm_{2}}x
      b±1a\displaystyle b\pm_{1}a =±21\displaystyle=\pm_{2}1

      and solving yields 2a=x±1±212a=x\pm_{1}\pm_{2}1, so in order for aa to exist we must have xx is odd. This is a contradiction to our assumption that xx is even, regardless of the choice of ±1\pm_{1} and ±2\pm_{2}, thus in fact we have neither of 1±i1\pm i dividing either of x±ix\pm i.

    Thus, if qq divides uu with multiplicity mqm_{q} (so that uu factors in [i,φ]\mathbb{Z}[i,\varphi] as u=φmquqmqu=\varphi^{m}\prod\limits_{q\mid u}q^{m_{q}}), we must have that qq divides x±ix\pm i with multiplicity at least mm and xix\mp i with multiplicity 0; and so666Depending on choice of GCD algorithm, there may be a unit factor in front, but we assume not without loss of generality. g=qu,x+iqmqg=\prod\limits_{q\mid u,x+i}q^{m_{q}} while g¯=gcd(u,xi)=qu,xiqmq\bar{g}=\gcd(u,x-i)=\prod\limits_{q\mid u,x-i}q^{m_{q}}. Thus u=φmgg¯=φm(Re(g)2+Im(g)2)u=\varphi^{m}g\bar{g}=\varphi^{m}(\operatorname{Re}(g)^{2}+\operatorname{Im}(g)^{2}). If 2m2\mid m then u=(φm/2Re(g))2+(φm/2Im(g))2u=(\varphi^{\nicefrac{{m}}{{2}}}\operatorname{Re}(g))^{2}+(\varphi^{\nicefrac{{m}}{{2}}}\operatorname{Im}(g))^{2}.

  • Suppose p3(mod4),±2(mod5)p\equiv 3\pmod{4},\pm 2\pmod{5} (i.e. p3,7(mod20)p\equiv 3,7\pmod{20}). Then compute

    (5p)=(1p)(5p)=(p5)(1)p1=1\left(\frac{-5}{p}\right)=\left(\frac{-1}{p}\right)\left(\frac{5}{p}\right)=-\left(\frac{p}{5}\right)(-1)^{p-1}=1

    so that there exists even xx\in\mathbb{Z} which is not a multiple of 5 satisfying x2+50(modp)x^{2}+5\equiv 0\pmod{p}, that is, u=px2+5u=p\mid x^{2}+5. Let g=gcd(g,x+i(1+2φ))g=\gcd(g,x+i(-1+2\varphi)), computed in [i,φ]\mathbb{Z}[i,\varphi], and set s=Re(g)s=\operatorname{Re}(g) and t=Im(g)t=\operatorname{Im}(g). We claim that p=s2+t2p=s^{2}+t^{2}. Indeed, suppose q[i,φ]q\in\mathbb{Z}[i,\varphi] is an irreducible divisor of pp (in [i,φ]\mathbb{Z}[i,\varphi]). We wish to show that qq divides exactly one of x±i(1+2φ)x\pm i(-1+2\varphi). (Clearly it divides at least one of them, by qpx2+5q\mid p\mid x^{2}+5.)

    • Say 1+2φ,1±iq[i,φ]-1+2\varphi,1\pm i\neq q\in\mathbb{Z}[i,\varphi] is an irreducible dividing uu and both x±i(1+2φ)x\pm i(-1+2\varphi). Then q2(1+2φ)iq\mid 2(-1+2\varphi)i. By our choice of qq this is impossible.

    • Suppose we let q=1+2φ=5q=-1+2\varphi=\sqrt{5}. Then as 5x5\nmid x, we have

      x±i(1+2φ)q=x5(1+2φ)±i(i,φ)\[i,φ]\frac{x\pm i(-1+2\varphi)}{q}=\frac{x}{5}(-1+2\varphi)\pm i\in\mathbb{Q}(i,\varphi)\backslash\mathbb{Z}[i,\varphi]

      and so in fact qq divides neither of x±i(1+2φ)x\pm i(-1+2\varphi).

    • Suppose now we let qq be one of 1±i1\pm i. Then in order to have 1±1ix±2i(1+2φ)1\pm_{1}i\mid x\pm_{2}i(-1+2\varphi) (where ±1\pm_{1} and ±2\pm_{2} are independent signs) we must have (1±1i)(a+bi)=x±2i(1+2φ)(1\pm_{1}i)(a+bi)=x\pm_{2}i(-1+2\varphi) where a,b[φ]a,b\in\mathbb{Z}[\varphi]. Solving for aa and bb by looking at real and imaginary parts, we have

      a1b\displaystyle a\mp_{1}b =x\displaystyle=\hskip 48.36958ptx
      b±1a\displaystyle b\pm_{1}a =±2(1+2φ)\displaystyle=\pm_{2}(-1+2\varphi)

      and solving yields 2a=x±1±2(1+2φ)2a=x\pm_{1}\pm_{2}(-1+2\varphi), so in order for aa to exist we must have xx is odd. This is a contradiction to our assumption that xx is even, regardless of the choice of ±1\pm_{1} and ±2\pm_{2}, thus in fact we have neither of 1±i1\pm i dividing either of x±ix\pm i.

It is here that we require Theorem 2, that [i,φ]\mathbb{Z}[i,\varphi] is norm-Euclidean, as this ensures that GCDs are efficiently computable, using an Euclidean algorithm with respect to the Euclidean function (the norm). As before, the bottlenecks are in the extended and standard Euclidean algorithms, which are both of runtime O(log2p)O(\log^{2}p). ∎

Observe that we can also include the \mathbb{Z}-irreducibles 2=12+122=1^{2}+1^{2} and 5=(1+2φ)2+025=(-1+2\varphi)^{2}+0^{2}.

Theorem 3.

For x[φ]x\in\mathbb{Z}[\varphi], factor it as

x==1numx=\prod\limits_{\ell=1}^{n}u_{\ell}^{m_{\ell}}

where u[φ]u_{\ell}\in\mathbb{Z}[\varphi] are all irreducible, with associated primes pp_{\ell}, and mm_{\ell}\in\mathbb{N}. Then either xx or xφx\varphi may be written as the sum of two squares if, for all [n]\ell\in[n], one of the following holds:

  • p=2p_{\ell}=2;

  • p1,3,7,9,13,17(mod20)p_{\ell}\equiv 1,3,7,9,13,17\pmod{20};

  • 2m2\mid m_{\ell};

and, for arbitrary xx not a priori satisfying all three criteria, this whole procedure (i.e., either determining a sum of two squares, or rejecting on the basis of some \ell failing all three conditions) runs in time O(polylog|N(x)|)O(\operatorname{poly}\log\left\lvert N(x)\right\rvert).

Proof.

For each [n]\ell\in[n], consider ss_{\ell} and tt_{\ell} defined as follows:

  • If p=2p_{\ell}=2 then s=t=1s_{\ell}=t_{\ell}=1.

  • If p1,3,7,9,13,17(mod20)p_{\ell}\equiv 1,3,7,9,13,17\pmod{20} then ss_{\ell} and tt_{\ell} are as computed using Proposition 3.

  • If 2m2\mid m_{\ell} then s=um/2s_{\ell}=u_{\ell}^{\nicefrac{{m_{\ell}}}{{2}}} and t=0t_{\ell}=0.

Now, we do the standard (inductive) trick where (s2+t2)(s2+t2)=(ss+tt)2+(stts)2(s_{\ell}^{2}+t_{\ell}^{2})(s_{\ell^{\prime}}^{2}+t_{\ell^{\prime}}^{2})=(s_{\ell}s_{\ell^{\prime}}+t_{\ell}t_{\ell^{\prime}})^{2}+(s_{\ell}t_{\ell^{\prime}}-t_{\ell}s_{\ell^{\prime}})^{2}. Since we have that s2+t2s_{\ell}^{2}+t_{\ell}^{2} equals either uu_{\ell} or uφu_{\ell}\varphi, the resulting product from performing the trick nn times gives a sum of two squares s2+t2s^{2}+t^{2} equal to xφnx\varphi^{n^{\prime}} where 0nn0\leq n^{\prime}\leq n. If 2n2\mid n^{\prime} then we return the pair (sφn/2,tφn/2)(s\varphi^{\nicefrac{{-n^{\prime}}}{{2}}},t\varphi^{\nicefrac{{-n^{\prime}}}{{2}}}), and otherwise we return (sφ(1n)/2,tφ(1n)/2)(s\varphi^{\nicefrac{{(1-n^{\prime})}}{{2}}},t\varphi^{\nicefrac{{(1-n^{\prime})}}{{2}}}).

For the runtime analysis, simply factor xx using Algorithm 2 and then for each resulting factor, apply Proposition 3 after checking that one of the three criteria holds (immediately halting and rejecting if any factor fails). ∎