This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

thanks: This document is a preprint uploaded to arXiv on August 15th, 2024.

Minimum Synthesis Cost of CNOT Circuits

Alan Bu [email protected] Harvard University, Cambridge, MA 02138    Evan Fan [email protected] Phillips Exeter Academy, Exeter, NH 03833    Robert Joo [email protected] Phillips Exeter Academy, Exeter, NH 03833
Abstract

Optimizing the size and depth of CNOT circuits is an active area of research in quantum computing and is particularly relevant for circuits synthesized from the Clifford + T universal gate set. Although many techniques exist for finding short syntheses, it is difficult to assess how close to optimal these syntheses are without an exponential brute-force search. We use a novel method of categorizing CNOT gates in a synthesis to obtain a strict lower bound computable in O(nω)O(n^{\omega}) time on the minimum number of gates needed to synthesize a given CNOT circuit, where ω\omega denotes the matrix multiplication constant and nn is the number of qubits involved. Applying our framework, we prove that 3(n1)3(n-1) gate syntheses of the nn-cycle circuit are optimal and provide insight into their structure. We also generalize this result to permutation circuits.

For linear reversible circuits with n=3,4,5n=3,4,5 qubits, our lower bound is optimal for 100%, 67.7%, and 23.1% of circuits and is accurate to within one CNOT gate in 100%, 99.5%, and 83.0% of circuits respectively. We also introduce an algorithm that efficiently determines whether certain circuits can be synthesized with fewer than nn CNOT gates.

preprint: APS/123-QED

1 Introduction

Minimizing the number of CNOT gates necessary to synthesize a given quantum circuit is a well-studied task in quantum computing. The authors of [1, 2] have developed algorithms to optimize the number of CNOT gates in quantum circuits, and [3, 4, 5, 6, 7, 8] give asymptotics on the minimum number of CNOT gates required in circuit synthesis. The approaches presented in these papers have been used to minimize both the size and depth of a circuit, in some cases accounting for the qubit connectivity of the quantum computer and the ancilla count. In general, these methods have approached asymptotic optimality [9, 10, 4, 11].

Minimizing the number of CNOT operations is important because CNOT operations and single qubit gates form a universal set for quantum computing. Both the speed and error rate of quantum algorithms such as Shor’s algorithm and Grover’s search improve with decreases in CNOT gate count [12, 13].

1.1 Main Results

Although a large volume of active research has been dedicated to optimizing CNOT circuits, we know little about how to obtain firm lower bounds on the minimum number of CNOT gates required to synthesize a given circuit besides resorting to brute-force search. Consequently, an exponential search time is required to verify the optimality of a CNOT circuit decomposition. In this paper, we focus on quantum computers without topological constraints or quantum reconfigurability. Our main result utilizes a novel categorization of the CNOT gates in a CNOT decomposition into link, middle, and cut gates to find strict lower bounds in O(nω)O(n^{\omega}) time on the minimum number of CNOT gates necessary to synthesize a given circuit, where ω\omega denotes the matrix multiplication constant [14] and nn is the number of qubits involved. We apply this method to all 3,4,53,4,5-qubit circuits. In these cases, our lower bound is exact in 100%,67.7%,100\%,67.7\%, and 23.1%23.1\% of circuits, and off from the minimum by at most one in 100%,99.5%100\%,99.5\%, and 83.0%83.0\% of circuits respectively.

Notably, [15, 16, 17, 4] provided constructions which use 3(n1)3(n-1) CNOT gates to synthesize an nn-cycle circuit, but did not disprove the existence of syntheses with fewer CNOT gates. By applying our lower bound, we prove that 3(n1)3(n-1) gates is optimal, eliminating the need for further CNOT count optimization of nn-cycle circuits in future research. More generally, we show the minimum number of CNOT gates needed to synthesize permutation circuits is 3(nk)3(n-k), where kk is the number of disjoint cycles in the permutation. We remark that our framework not only yields a certificate of optimality, but also shows that the gates in any 3(n1)3(n-1) gate synthesis of an nn-cycle circuit can have its CNOT gates partitioned into three sets of size n1n-1, each of which forms a spanning tree over the nn qubits, all without needing to resort to brute-force search.

Finally, Jiang et al. [10] proved that determining the minimum CNOT count of an arbitrary CNOT circuit with topological constraints is NP-hard. Finding the computational complexity of the same problem without topological constraints is an ongoing area of research [11, 18, 19]. We introduce the linkability problem, a subclass of this problem, with the goal of efficiently determining whether certain circuits can be synthesized with fewer than nn CNOT gates. Combining our framework with a graph-theoretic approach, we give an O(nω)O(n^{\omega}) algorithm for this problem.

1.2 Organization of the Paper

In Section 2, we introduce conventions and definitions used throughout the paper. In Section 3, we introduce the categorization of CNOT gates into link, middle, and cut gates, as well as derive a lower bound on the number of link and cut gates necessary to synthesize a circuit. In Section 4, we prove results regarding the effects of middle gates on a circuit and derive a lower bound on the number of middle gates necessary to synthesize a circuit. In Section 5, we combine all of these results alongside a few additional observations to yield our lower bound and apply the bound to nn-cycle and permutation circuits to determine their minimum CNOT gate count. In Section 6, we introduce and prove our polynomial-time algorithm for the linkability problem. In Section 7, we assess the performance of our lower bound in the n=3,4,5n=3,4,5 cases. In Section 8, we summarize our main results and raise future directions.

2 Preliminaries and Notations

We use the notation [n]={1,2,,n}[n]=\{1,2,\ldots,n\}. We use Mi,jM_{i,j} to denote the value in the iith row and jjth column of a matrix MM. We use ee to denote the identity permutation and TijT_{ij} to denote the (i,j)(i,j)-transposition. We denote the identity matrix by 𝐈\mathbf{I}. For a given set AA and permutation σ\sigma, we let σ(A)\sigma(A) denote the image of AA under σ\sigma. We let 1iA\textbf{1}_{i\in A} denote the indicator function, where 1iA=1\textbf{1}_{i\in A}=1 if iAi\in A and 1iA=0\textbf{1}_{i\in A}=0 otherwise. We say that an n×nn\times n matrix MM is a reversible binary matrix if MGLn(𝔽2)M\in GL_{n}(\mathbb{F}_{2}). We define EmpM\textrm{Emp}\,M to be the number of all-zero rows in MM and DupM\textrm{Dup}\,M to be the number of disjoint pairs of duplicate, not all-zero, rows in MM.

2.1 CNOT Synthesis of Circuits

In this paper, we lower bound the minimum number of CNOT gates required to synthesize a given linear reversible circuit, which can be represented as an n×nn\times n reversible binary matrix. Rigorously, we define the problem as follows, borrowing notation from [10].

Definition 2.1.

A CNOT gate or CNOT operation takes as input two binary values xix_{i} and xjx_{j} representing the control and target of the CNOT gate, respectively, and replaces the pair (xi,xj)(x_{i},x_{j}) with (xi,xixj)(x_{i},x_{i}\oplus x_{j}). The effect of a CNOT gate with control ii and target jj on a set of nn qubits can be represented as multiplying by matrix MM with Mi,j=1M_{i,j}=1, all diagonal entries equal to 11, and all other entries equal to 0. We denote this CNOT gate, which adds the iith row of a matrix to the jjth row, as CNOT(i,j)\textrm{CNOT}(i,j).

Definition 2.2.

For any n×nn\times n reversible binary matrix MM, we define its size s(M)s(M) to be one less than the length of the shortest sequence M1,M2,,Mk+1M_{1},M_{2},\ldots,M_{k+1} of n×nn\times n reversible binary matrices with M1=IM_{1}=\textbf{I} and Mk+1=MM_{k+1}=M such that Mi+1M_{i+1} can be obtained from MiM_{i} through a single CNOT operation for all i[k]i\in[k].111We note that our usage of the term “size” differs from its conventional definition, where it typically refers to the number of CNOT gates in a given CNOT synthesis. In this paper, when we refer to the “size” of a circuit, we mean the minimum number of CNOT gates required in any synthesis of the circuit. We call any such minimal sequence a CNOT synthesis of MM. Furthermore, this sequence gives a decomposition of MM as a product of CNOT gate matrices, which we call a CNOT decomposition of MM. An example is shown in Figure I.

[100010001]\begin{bmatrix}1&0&0\\ 0&1&0\\ 0&0&1\end{bmatrix}

Initial matrix

\to

\Qcircuit@C=0.8em@R=1.5em&x1\ctrl1\ctrl2\targ\ctrl1\qw\qw\qwx2x2\targ\qw\ctrl1\targ\targ\ctrl1\qwx3x3\qw\targ\qw\qw\ctrl1\targ\qwx1\Qcircuit@C=0.8em@R=1.5em{&x_{1}\quad\quad\ctrl{1}\ctrl{2}\targ\ctrl{1}\qw\qw\qw\;x_{2}\\ x_{2}\quad\quad\targ\qw\ctrl{-1}\targ\targ\ctrl{1}\qw\;x_{3}\\ x_{3}\quad\quad\qw\targ\qw\qw\ctrl{-1}\targ\qw\;x_{1}}
CNOT decomposition

\to

[010001100]\begin{bmatrix}0&1&0\\ 0&0&1\\ 1&0&0\end{bmatrix}

33-cycle matrix
Figure I: CNOT decomposition of the 33-cycle permutation operation.

The authors of [10, 20] also minimized the number of parallel CNOT operations required to synthesize a given circuit, as fully disjoint CNOT operations can be performed simultaneously on a quantum computer to save time.

Definition 2.3.

A parallel CNOT gate or parallel CNOT operation is a composition of arbitrarily many CNOT gates such that no qubit is operated on by more than one such gate. Rigorously, let x1,x2,,xkx_{1},x_{2},\ldots,x_{k} and y1,y2,,yky_{1},y_{2},\ldots,y_{k} be 2k2k distinct numbers in [n][n]. A parallel CNOT gate can be represented by the matrix MM with Mxi,yi=1M_{x_{i},y_{i}}=1 for i[k]i\in[k], all diagonal entries equal to 11, and all other entries equal to 0.

An example of a parallel CNOT decomposition is given in Figure II.

[010000101001001000001100000011011001]\begin{bmatrix}0&1&0&0&0&0\\ 1&0&1&0&0&1\\ 0&0&1&0&0&0\\ 0&0&1&1&0&0\\ 0&0&0&0&1&1\\ 0&1&1&0&0&1\end{bmatrix}

\Qcircuit@C=1.0em@R=1.2em&\lstickx1\qw\ctrl1\targ\ctrl5\qw\qwx2\lstickx2\qw\targ\ctrl1\qw\targ\qwx1x3x6\lstickx3\qw\ctrl1\ctrl3\qw\qw\qwx3\lstickx4\qw\targ\qw\qw\qw\qwx3x4\lstickx5\qw\targ\qw\qw\qw\qw\rstickx5x6\lstickx6\qw\ctrl1\targ\targ\ctrl4\qwx2x3x6\Qcircuit@C=1.0em@R=1.2em{&\lstick{x_{1}}\qw\ctrl{1}\targ\ctrl{5}\qw\qw\qquad\qquad\;x_{2}\\ \lstick{x_{2}}\qw\targ\ctrl{-1}\qw\targ\qw\qquad\qquad\;x_{1}\oplus x_{3}\oplus x_{6}\\ \lstick{x_{3}}\qw\ctrl{1}\ctrl{3}\qw\qw\qw\qquad\qquad\;x_{3}\\ \lstick{x_{4}}\qw\targ\qw\qw\qw\qw\qquad\qquad\;x_{3}\oplus x_{4}\\ \lstick{x_{5}}\qw\targ\qw\qw\qw\qw\rstick{x_{5}\oplus x_{6}}\\ \lstick{x_{6}}\qw\ctrl{-1}\targ\targ\ctrl{-4}\qw\qquad\qquad\;x_{2}\oplus x_{3}\oplus x_{6}}

Figure II: An illustration of a depth 44 parallel CNOT decomposition.
Definition 2.4.

For any n×nn\times n reversible binary matrix MM, we define its depth d(M)d(M) to be one less than the length of the shortest sequence M1,M2,,Mk+1M_{1},M_{2},\ldots,M_{k+1} of n×nn\times n reversible binary matrices with M1=IM_{1}=\textbf{I} and Mk+1=MM_{k+1}=M such that Mi+1M_{i+1} can be obtained from MiM_{i} through a single parallel CNOT operation for all i[k]i\in[k].222See above.

2.2 Circuit-derived Graphs

We further introduce two graphs fundamental to the lower bound introduced in this paper.

Definition 2.5.

Given an n×nn\times n reversible binary matrix MM, we define its vertex-connectivity graph to be Gv(M)=(V,E)G_{v}(M)=(V,E), where V=[n]V=[n] and (i,j)E(i,j)\in E if and only if iji\neq j and at least one of Mi,jM_{i,j} or Mj,iM_{j,i} is equal to one. We define v(M)v(M) to be the number of connected components of this graph.

Definition 2.6.

Given an n×nn\times n reversible binary matrix MM, we define its edge-connectivity graph to be Ge(M)=(V,E)G_{e}(M)=(V,E), where V={R1,R2,,Rn,C1,,Cn}V=\{R_{1},R_{2},\ldots,R_{n},C_{1},\ldots,C_{n}\}. For each row RiR_{i} and column CjC_{j} of MM, we connect RiR_{i} and CjC_{j} with an edge in Ge(M)G_{e}(M) if and only if Mi,j=1M_{i,j}=1. We define e(M)e(M) to be the number of connected components of this graph.

Examples of the vertex and edge-connectivity graphs of a matrix are shown in Figure III.

[0010001010100000000101000]\begin{bmatrix}0&0&1&0&0\\ 0&1&0&1&0\\ 1&0&0&0&0\\ 0&0&0&0&1\\ 0&1&0&0&0\end{bmatrix}

(a) Matrix MM
111112345
(b) Gv(M)G_{v}(M)
R1R_{1}C1C_{1}R2R_{2}C2C_{2}R3R_{3}C3C_{3}R4R_{4}C4C_{4}R5R_{5}C5C_{5}
(c) Ge(M)G_{e}(M)
Figure III: An example of MM with v(M)=2v(M)=2 and e(M)=4e(M)=4.
Definition 2.7.

We call a graph G=(V,E)G=(V,E) on nn vertices a star graph if one vertex of GG has degree n1n-1 and all other vertices have degree 11. We call GG a path graph if we can label its vertices as v1,v2,,vnv_{1},v_{2},\ldots,v_{n} such that (vi,vj)E(v_{i},v_{j})\in E if and only if |ij|=1|i-j|=1. With slight abuse of notation, we say that a set of n1n-1 CNOT gates is a star or path if the graph GG formed by adding an edge between the control and target of each CNOT gate is a star or path graph respectively.

2.3 Influence Graphs

Finally, we introduce two definitions used in our algorithm for the linkability problem.

Definition 2.8.

For a reversible binary n×nn\times n matrix MM, we define its influence graph (M)=(V,A)\mathcal{I}(M)=(V,A) to be the directed graph such that V=[n]V=[n] and (i,j)A(i,j)\in A if and only if Mj,i=1M_{j,i}=1 and iji\neq j. In each such case, we say that ii influences jj. An example is given in Figure IV.

Definition 2.9.

For a directed graph G=(V,A)G=(V,A), we define a transitive reduction GG^{\prime} of GG to be another directed graph with the same vertices and a minimal set of directed edges such there is a directed path from vertex uu to vertex vv in GG^{\prime} if and only if there is a directed path from uu to vv in GG.

[1011111000100011]\begin{bmatrix}1&0&1&1\\ 1&1&1&0\\ 0&0&1&0\\ 0&0&1&1\end{bmatrix}

(a) Matrix MM
3142
(b) Influence graph of MM
3142
(c) Transitive reduction of (M)\mathcal{I}(M)
Figure IV: Example of an influence graph and its transitive reduction.

3 Link, Middle, and Cut CNOT Gates

The key insight of our algorithm is that each gate of a CNOT decomposition can be placed into one of three categories (or none), and finding lower bounds on the number of CNOT gates in each category allows us to obtain a lower bound on the total number of CNOT gates. In this section, we introduce these categories and prove lower bounds on the number of CNOT gates in two of the three categories.

Definition 3.1.

We define a river of an n×nn\times n reversible binary matrix MM to be a permutation σSn\sigma\in S_{n} such that Mi,σ(i)=1M_{i,\sigma(i)}=1 for all ii. We denote the set of all rivers of MM as S(M)S(M). For any river σ\sigma, we denote the unique permutation matrix whose only nonzero entries lie on this river as PσP_{\sigma}.

Definition 3.2.

Let M,NM,N be n×nn\times n reversible binary matrices such that NN can be obtained from MM through one CNOT operation. We call this CNOT operation a

  • Link gate if v(M)>v(N)v(M)>v(N),

  • Middle gate if S(M)S(N)S(M)\neq S(N),

  • Cut gate if e(M)<e(N)e(M)<e(N).

We now prove an important result on the effects of link and cut gates on GvG_{v} and GeG_{e}.

Proposition 3.3.

Let MM be a reversible binary matrix. If Ri,RjR_{i},R_{j} are connected in Ge(M)G_{e}(M), then i,ji,j are connected in Gv(M)G_{v}(M).

Proof.

As Ri,RjR_{i},R_{j} are connected in Ge(M)G_{e}(M), there exists a path Rx1Cy1Rx2Cy2Cxm1RxmR_{x_{1}}C_{y_{1}}R_{x_{2}}C_{y_{2}}\cdots C_{x_{m-1}}R_{x_{m}} of Ge(M)G_{e}(M) such that x1=ix_{1}=i and xm=jx_{m}=j. Thus, xi,xi+1x_{i},x_{i+1} are connected in Gv(M)G_{v}(M) for all i[m1]i\in[m-1] since Mxi,yi=Myi,xi+1=1M_{x_{i},y_{i}}=M_{y_{i},x_{i+1}}=1. Thus ii is connected to jj in Gv(M)G_{v}(M) as desired. ∎

Lemma 3.4.

Let M,NM,N be n×nn\times n reversible binary matrices such that NN can be obtained from MM through one CNOT operation. Then, |v(M)v(N)|1|v(M)-v(N)|\leq 1 and |e(M)e(N)|1|e(M)-e(N)|\leq 1. Furthermore, if v(N)=v(M)1v(N)=v(M)-1, then e(N)=e(M)1e(N)=e(M)-1.

Proof.

Consider a CNOT operation on a matrix MM adding row ii to row jj for which v(M)>v(N)v(M)>v(N). If ii and jj were in the same connected component AA in Gv(M)G_{v}(M), then any new edges in Gv(N)G_{v}(N) will also be between vertices in AA, contradicting v(M)>v(N)v(M)>v(N). If ii and jj were in distinct connected components AA and BB of Gv(M)G_{v}(M), then the only edges that will be different in Gv(N)G_{v}(N) will be those between vertices of AA and BB. This will cause AA and BB to become a single connected component in Gv(N)G_{v}(N), and leave all other connected components unchanged. Thus, if v(M)>v(N)v(M)>v(N) then v(M)=v(N)+1v(M)=v(N)+1.

Similarly, consider the case when e(M)>e(N)e(M)>e(N). If RiR_{i}, RjR_{j} are in the same connected component AA of Ge(M)G_{e}(M) then no edges between connected components are added, while if they are in separate connected components AA and BB, then the only new edges are edges connecting a vertex in AA to a vertex in BB, leaving all other connected components unchanged. Thus, if e(M)>e(N)e(M)>e(N) then e(M)=e(N)+1e(M)=e(N)+1.

Now, notice that applying CNOT(i,j)\textrm{CNOT}(i,j) on NN returns the original matrix MM. In symmetric fashion, v(N)v(M)+1v(N)\leq v(M)+1 and e(N)e(M)+1e(N)\leq e(M)+1.

Finally, if v(N)=v(M)1v(N)=v(M)-1, then there must exist i,ji,j that lie in different connected components of Gv(M)G_{v}(M) such that Ni,j=1N_{i,j}=1. The CNOT gate adds some row kk with Mk,j=1M_{k,j}=1 to row ii. Note that j,kj,k lie in the same connected component of Gv(M)G_{v}(M). After adding row kk to row ii, we know i,j,ki,j,k lie in the same connected component of Gv(N)G_{v}(N) and Ri,CjR_{i},C_{j} is connected in Ge(N)G_{e}(N). Notably, RiR_{i} and CjC_{j} are in different connected components of Ge(M)G_{e}(M) by Proposition 3.3, and thus the number of edge-connected components in NN is one less than in MM, as desired. ∎

We now use the tools developed in the proof of Lemma 3.4 to prove that these three categories of CNOT gates are mutually exclusive, allowing us to sum lower bounds on each category to bound the size of the entire circuit.

Lemma 3.5.

Let M,NM,N be n×nn\times n reversible binary matrices such that NN can be obtained from MM through one CNOT operation. Then the CNOT operation is either exclusively a link, middle, or cut gate, or none of these.

Proof.

By Lemma 3.4, if the CNOT operation is a link gate then e(M)>e(N)e(M)>e(N), so a link gate cannot be a cut gate.

As discussed in the proof of Lemma 3.4, link gates operate on rows that are in separate connected components of Gv(M)G_{v}(M). Let the control row of the CNOT gate lie in connected component A[n]A\subseteq[n] of Gv(M)G_{v}(M). Without loss of generality, let A=[a]A=[a] and define B=[n]AB=[n]\setminus A. Then, M=[A00B]M=\begin{bmatrix}\textbf{A}&\textbf{0}\\ \textbf{0}&\textbf{B}\end{bmatrix} where A is an a×aa\times a matrix and B is an (na)×(na)(n-a)\times(n-a) matrix. Any σS(M)\sigma\in S(M) satisfies σ(A)=A\sigma(A)=A. When we add a row from A to a row in B, the upper-right box of zeros does not change. Thus, for any σS(N)\sigma\in S(N), we have σ(A)A\sigma(A)\subseteq A, so σ(A)=A\sigma(A)=A. Therefore the set of rivers is unchanged as no river passes through the newly created bottom left entries, so a link gate cannot be a middle gate.

Finally, we show that cuts and middles are also disjoint. In this case, we can assume up to permutation of the rows and columns that N=[A00B]N=\begin{bmatrix}\textbf{A}&\textbf{0}\\ \textbf{0}&\textbf{B}\end{bmatrix} where A,B\textbf{A},\textbf{B} are not necessarily square. However, if A,B\textbf{A},\textbf{B} are not square, then NN is not reversible, giving contradiction. Suppose the CNOT gate adds a row of A to a row of B. Let the set of columns of MM in A be denoted as CAC_{A}, and let the set of rows of MM in A be denoted as RAR_{A}. Thus if σS(N)\sigma\in S(N), then σ(RA)=CA\sigma(R_{A})=C_{A}. Furthermore, for any σS(M)\sigma\in S(M), we have σ(RA)CA\sigma(R_{A})\subseteq C_{A} so σ(RA)=CA\sigma(R_{A})=C_{A}. Thus the set of rivers in MM and NN are equal. This finishes the final case. ∎

Finally, we extend Lemma 3.4 to yield a simple but powerful lower bound on the number of link and cut gates in a given circuit.

Lemma 3.6.

Any CNOT decomposition of n×nn\times n matrix MM must contain at least nv(M)n-v(M) link gates and at least e(M)v(M)e(M)-v(M) cut gates.

Proof.

Consider any CNOT synthesis M1,M2,,Mk+1M_{1},M_{2},\ldots,M_{k+1} of MM. Note that v(I)=e(I)=nv(\textbf{I})=e(\textbf{I})=n. The number of vertex-connected components must decrease over the sequence, so Lemma 3.4 implies that at least nv(M)n-v(M) link gates are required.

From Lemma 3.4, link gates decrease the number of edge connected components by exactly one, while cut gates increase them by exactly one. Since there are nn initial components and nv(M)n-v(M) link gates, we need at least e(M)v(M)e(M)-v(M) cut gates in order to end up at e(M)e(M) edge-connected components in the final matrix. ∎

4 Lower Bounds on Middle Gates

In this section, we develop an O(nω)O(n^{\omega}) time algorithm that computes a lower bound on the number of middle gates necessary to synthesize a given circuit. To begin, we prove several lemmas that clarify the effects of middle gates on the rivers of a matrix.

Given a set of middle CNOT gates, each river can only “interact” with other rivers in the same equivalence class. These equivalence classes are determined by the set of middle CNOT gates. When it is clear that the rivers of the MM cannot be obtained from I through these interactions, the given set of middle CNOT gates is insufficient to synthesize MM. In the following subsections, we make this idea rigorous.

4.1 Equivalence Classes of Rivers

We now show that a singular middle CNOT gate only allows “interactions” between certain pairs of rivers, each of which differs by a fixed transposition determined by the control and target of the CNOT gate.

Lemma 4.1.

Let M,NM,N be n×nn\times n reversible binary matrices such that NN can be obtained from MM by adding row ii to row jj. Consider pairing each permutation σ\sigma to TijσT_{ij}\circ\sigma. Then |{σ,Tijσ}S(M)||{σ,Tijσ}S(N)|(mod2)|\{\sigma,T_{ij}\circ\sigma\}\cap S(M)|\equiv|\{\sigma,T_{ij}\circ\sigma\}\cap S(N)|\pmod{2}.

10011011\begin{matrix}1&0\\ 0&1\end{matrix}\quad\xrightarrow{\hskip 20.00003pt}\quad\begin{matrix}1&0\\ 1&1\end{matrix}
11011110\begin{matrix}1&1\\ 0&1\end{matrix}\quad\xrightarrow{\hskip 20.00003pt}\quad\begin{matrix}1&1\\ 1&0\end{matrix}

Figure V: Examples of the effects of CNOT(i,j)\textrm{CNOT}(i,j) for rows i,ji,j and columns σ(i),σ(j)\sigma(i),\sigma(j).
Proof.

We begin with casework on the values of Mi,σ(i)M_{i,\sigma(i)} and Mi,σ(j)M_{i,\sigma(j)}, with examples in Figure V. If they are both 0, then {σ,Tijσ}S(M)={σ,Tijσ}S(N)=\{\sigma,T_{ij}\circ\sigma\}\cap S(M)=\{\sigma,T_{ij}\circ\sigma\}\cap S(N)=\emptyset as desired. If exactly one of them is 11, it can be checked that {σ,Tijσ}S(M)={σ,Tijσ}S(N)\{\sigma,T_{ij}\circ\sigma\}\cap S(M)=\{\sigma,T_{ij}\circ\sigma\}\cap S(N) as desired. We now consider the case when Mi,σ(i)=Mi,σ(j)=1M_{i,\sigma(i)}=M_{i,\sigma(j)}=1. In this case, if it is not true that Mk,σ(k)=1M_{k,\sigma(k)}=1 for all ki,jk\neq i,j then {σ,Tijσ}S(M)={σ,Tijσ}S(N)=\{\sigma,T_{ij}\circ\sigma\}\cap S(M)=\{\sigma,T_{ij}\circ\sigma\}\cap S(N)=\emptyset as desired. Otherwise, observe that σM\sigma\in M if and only if σN\sigma\notin N and TijσMT_{ij}\circ\sigma\in M if and only if TijσNT_{ij}\circ\sigma\notin N. Thus, ({σ,Tijσ}S(M))({σ,Tijσ}S(N))={σ,Tijσ}(\{\sigma,T_{ij}\circ\sigma\}\cap S(M))\sqcup(\{\sigma,T_{ij}\circ\sigma\}\cap S(N))=\{\sigma,T_{ij}\circ\sigma\}, completing the final case. ∎

Lemma 4.1 demonstrates that when there is only one CNOT gate involved, the parity of the number of rivers in sets of the form {σ,Tijσ}\{\sigma,T_{ij}\circ\sigma\} cannot change. We extend this idea to the case of multiple CNOT gates by allowing multiple transpositions.

Definition 4.2.

Let STS\subseteq T be a fixed set of transpositions. We say that σ1Sσ2\sigma_{1}\equiv_{S}\sigma_{2} if and only if σ2\sigma_{2} can be obtained from σ1\sigma_{1} through a sequence of transpositions in SS. It is easy to see that S\equiv_{S} is an equivalence relation on SnS_{n} as the reflexive, transitive, and symmetric properties hold. We let Sn/SS_{n}/S denote the set of equivalence classes under S\equiv_{S}.

Theorem 4.3.

Let M,NM,N be n×nn\times n reversible binary matrices such that NN can be obtained from MM by adding row ii to row jj. Then, for any STS\subseteq T with TijST_{ij}\in S and for any equivalence class KSn/SK\in S_{n}/S, we have |KS(M)||KS(N)|(mod2)|K\cap S(M)|\equiv|K\cap S(N)|\pmod{2}.

Proof.

Since TijST_{ij}\in S, we can partition all the rivers in KK into pairs (σ,Tijσ)(\sigma,T_{ij}\circ\sigma) without including elements outside of KK. By Lemma 4.1,

|KS(M)|\displaystyle|K\cap S(M)| =|{σ,Tijσ}S(M)|\displaystyle=\sum|\{\sigma,T_{ij}\circ\sigma\}\cap S(M)|
|{σ,Tijσ}S(N)|\displaystyle\equiv\sum|\{\sigma,T_{ij}\circ\sigma\}\cap S(N)|
=|KS(N)|(mod2).\displaystyle=|K\cap S(N)|\pmod{2}.

In preparation for a proof in the next subsection, we introduce the labeling of an equivalence class.

Definition 4.4.

Let STS\subseteq T and KSn/SK\in S_{n}/S. Let ai=minσKσ1(i)a_{i}=\min_{\sigma\in K}\sigma^{-1}(i). We define the labeling (K)\mathcal{L}(K) of KK as (a1,a2,,an)[n]n(a_{1},a_{2},\ldots,a_{n})\in[n]^{n}. Additionally, when the context for SS is clear, we let (σ)\mathcal{L}(\sigma) denote the labeling of the equivalence class of σ\sigma.

In order to show that no two equivalence classes share a labeling, we prove the following lemma.

Lemma 4.5.

Suppose that there is a tree TT whose vertices are v1,v2,,vmv_{1},v_{2},\ldots,v_{m}. We begin by choosing a set of initial values p0(vi)p_{0}(v_{i}), and assigning p(vi)=p0(vi)p(v_{i})=p_{0}(v_{i}) for each vertex viv_{i}. We are allowed to perform operations which exchange the values of p(vi)p(v_{i}) and p(vj)p(v_{j}) for adjacent vertices vi,vjv_{i},v_{j}. Then for any permutation σSm\sigma\in S_{m}, there is a sequence of operations resulting in p(vi)=p0(vσ(i))p(v_{i})=p_{0}(v_{\sigma(i)}) for all i[m]i\in[m].

viv_{i}σ(i)\sigma(i)viv_{i}σ(i)\sigma(i)viv_{i}σ(i)\sigma(i)
Figure VI: An illustration of Lemma 4.5.
Proof.

Without loss of generality, we assume that p0(vi)=ip_{0}(v_{i})=i for i[m]i\in[m]. Consider the following algorithm: pick a leaf node viv_{i}. We want to ensure that p(vi)=σ(i)p(v_{i})=\sigma(i). We view TT as a rooted tree with viv_{i} as its root node and perform operations swapping σ(i)\sigma(i) up the tree until p(vi)=σ(i)p(v_{i})=\sigma(i) (see Figure VI). Now, we can remove viv_{i} from TT. We repeat this process until we have assigned σ(i)\sigma(i) to every viv_{i}. ∎

Theorem 4.6.

Let STS\subseteq T. The function :Sn/S[n]n\mathcal{L}:S_{n}/S\to[n]^{n} is injective.

Proof.

Let a labeling be L=(a1,a2,,an)L=(a_{1},a_{2},\ldots,a_{n}), and suppose (K)=L\mathcal{L}(K)=L for some equivalence class KSn/SK\in S_{n}/S. By definition, aia_{i} denotes the minimum jj such that σ(j)=i\sigma(j)=i over all σK\sigma\in K. Let σ0\sigma_{0} be a fixed element of KK. If TuvST_{uv}\in S, then aσ0(u)=aσ0(v)a_{\sigma_{0}(u)}=a_{\sigma_{0}(v)}. Define Ak={i|ai=k}A_{k}=\{i|a_{i}=k\} for each k[n]k\in[n]. Let G=(V,E)G=(V,E) be a graph such that V=[n]V=[n] and (i,j)E(i,j)\in E if and only if TijST_{ij}\in S.

For each transposition TuvST_{uv}\in S, both u,vu,v lie in the same set σ01(Ak)\sigma_{0}^{-1}(A_{k}). Conversely, if aσ0(u)=aσ0(v)a_{\sigma_{0}(u)}=a_{\sigma_{0}(v)} then there exists a composition of transpositions in SS mapping uu to aσ0(u)a_{\sigma_{0}(u)} then to vv. Thus each nonempty set σ01(Ak)\sigma_{0}^{-1}(A_{k}) is a connected component of GG.

As a result, any σK\sigma\in K must satisfy σ(σ01(Ak))=Ak\sigma(\sigma_{0}^{-1}(A_{k}))=A_{k} for all kk. In particular, the subgraph of GG induced by σ01(Ak)\sigma_{0}^{-1}(A_{k}) is connected and has a spanning tree, so by Lemma 4.5, all permutations in SnS_{n} satisfying this set of conditions must also lie in KK. Hence KK is uniquely determined by LL and SS, since the sets σ01(Ak)\sigma_{0}^{-1}(A_{k}) are determined by SS as the connected components of GG and the sets AkA_{k} are determined by LL. ∎

Repeated applications of Theorem 4.3 show that if SS contains the transpositions corresponding to each middle CNOT gate in a CNOT decomposition, then the parity of |KS(M)||K\cap S(M)| is preserved throughout the entire CNOT decomposition. Although this condition is powerful, there are many possible equivalence classes KK based on the choice of CNOT gates, making it impractical to check the condition. In order to make the lower bound on middle gates computable in polynomial time, we should eliminate both the dependence on KK and S(M)S(M).

4.2 The cperfectc_{\textrm{perfect}} Algorithm

We introduce a consequence of Theorem 4.3 which more directly allows us to find a lower bound on the number of middle gates without depending on the equivalence classes KK in Theorem 4.3.

Theorem 4.7.

Let M1,M2,,Mk+1M_{1},M_{2},\ldots,M_{k+1} be a CNOT synthesis of n×nn\times n matrix MM. Let M=I+σS(M)PσM^{\prime}=\textbf{I}+\sum_{\sigma\in S(M)}P_{\sigma}. Let G=(V,E)G=(V,E) a graph such that V=[n]V=[n] and (i,j)E(i,j)\in E if and only if there is some middle gate involved in the CNOT synthesis adding between rows i,ji,j. Then for any connected component AA of GG, let a=(11A,12A,,1nA)\textbf{a}=(\textbf{1}_{1\in A},\textbf{1}_{2\in A},\ldots,\textbf{1}_{n\in A}). Then, aM=0\textbf{a}M^{\prime}=\textbf{0}.

Proof.

First, define S={Tij|(i,j)E}S=\{T_{ij}|(i,j)\in E\}. For every equivalence class KSn/SK\in S_{n}/S, by Theorem 4.3, we have |KS(M1)||KS(Mk+1)|(mod2)|K\cap S(M_{1})|\equiv|K\cap S(M_{k+1})|\pmod{2} since non-middle gates in the sequence do not affect the set of rivers.

Let S=(S(Mk+1)S(M1))(S(M1)S(Mk+1))S^{\prime}=(S(M_{k+1})\setminus S(M_{1}))\cup(S(M_{1})\setminus S(M_{k+1})) be obtained by either removing or adding ee to S(Mk+1)S(M_{k+1}) depending on whether eS(Mk+1)e\in S(M_{k+1}). Let aija_{ij} denote the number of rivers σS\sigma\in S^{\prime} such that the jjth term of (σ)\mathcal{L}(\sigma) is equal to ii. Since each individual labeling must appear an even number of times in SS^{\prime}, aija_{ij} is even for all i,ji,j.

Now, for each permutation σ\sigma, note that the jjth entry of σ1\sigma^{-1}, which we denote uu, represents the position of σ\sigma in which jj appears. Furthermore, the jjth entry of (σ)\mathcal{L}(\sigma), which we denote vv, represents the minimum position of jj over all permutations in the equivalence class of σ\sigma. From the proof of Theorem 4.6, vv is the minimum vertex in the connected component of uu in GG.

Thus each aij0a_{ij}\neq 0 counts exactly the number of rivers σS\sigma\in S^{\prime} such that σ1(j)\sigma^{-1}(j) lies in the connected component of ii in GG. Notably, aij=0a_{ij}=0 whenever ii is not the smallest vertex in its connected component.

Let us further define bijb_{ij} to be the number of rivers σS\sigma\in S^{\prime} such that σ1(j)=i\sigma^{-1}(j)=i. Then

aij={kAbkji=minA,0otherwise,a_{ij}=\begin{cases}\sum\limits_{k\in A}b_{kj}&i=\min A,\\ 0&\textrm{otherwise,}\end{cases}

where AA is the connected component of ii in GG.

Now, since every aija_{ij} is even, for any connected component AA of GG, we must have that kAbkj\sum_{k\in A}b_{kj} is even. Notice that in the matrix MM^{\prime}, the (k,j)(k,j) entry denotes exactly the parity of the number of rivers σS\sigma\in S^{\prime} such that σ1(j)=k\sigma^{-1}(j)=k, so Mk,j=bkjM^{\prime}_{k,j}=b_{kj}. Hence, the sum of the rows of MM^{\prime} whose indices are in AA is exactly 𝟎\mathbf{0}. The result follows.

Although Theorem 4.7 gives a condition for the connected components of GG that can be used to lower bound the number of middle CNOT gates, the time needed to compute MM^{\prime} is O(nn!)O(n\cdot n!) as it would be necessary to loop through every river of MM. We now introduce a shortcut that decreases this time to O(nω)O(n^{\omega}).

Lemma 4.8.

Let MM be an n×nn\times n reversible binary matrix. Then MM+I=I+σS(M)PσM\land M^{-\top}+\textbf{I}=\textbf{I}+\sum_{\sigma\in S(M)}P_{\sigma}.

Proof.

It is well known that M1=1detMCM^{-1}=\frac{1}{\det M}C^{\top} where CC is the cofactor matrix of MM. Thus, the cofactor matrix of MM can be computed as MM^{-\top} when MM is a reversible binary matrix. We say a river σS(M)\sigma\in S(M) passes through the (i,j)(i,j)-entry of MM if σ(i)=j\sigma(i)=j. Notice that rivers can only pass through nonzero entries, and the parity of the number of rivers passing through any such entry is equal to the (i,j)(i,j)-cofactor. Therefore the parity of the number of rivers in S(M)S(M) passing through any given square is given by the matrix MMM\land M^{-\top}. Finally, we add 𝐈\mathbf{I} to balance the equation. ∎

Combining Theorem 4.7 and Lemma 4.8 gives us an O(nω)O(n^{\omega}) algorithm shedding light on the connected components formed by the middle CNOT gates. This yields a straightforward lower bound on the number of middle gates.

Theorem 4.9 (cperfectc_{\textnormal{perfect}} algorithm).

Given n×nn\times n matrix MM and a CNOT synthesis M1,M2,,Mk+1M_{1},M_{2},\ldots,M_{k+1} of MM, let M=MM+𝐈M^{\prime}=M\land M^{-\top}+\mathbf{I}. Let G=(V,E)G=(V,E) a graph such that V=[n]V=[n] and (i,j)E(i,j)\in E if and only if there is some middle gate involved in the CNOT synthesis adding between rows i,ji,j. Then GG contains at most cperfect(M):=n+2EmpM+DupM3c_{\textnormal{perfect}}(M):=\frac{n+2\cdot\textrm{Emp}\,M^{\prime}+\textrm{Dup}\,M^{\prime}}{3} connected components.

Proof.

By Theorem 4.7, the sum of the rows of MM^{\prime} corresponding to each connected component is zero. There are at most EmpM\textrm{Emp}\,M^{\prime} rows contained in connected components of size 11 and at most EmpM+2DupM\textrm{Emp}\,M^{\prime}+2\cdot\textrm{Dup}\,M^{\prime} rows contained in connected components of size 2\leq 2. Let xx be the number of size 11 connected components and let yy be the number of size 22 connected components. Then there must be at most x+y+nx2y3=2x+y+n3x+y+\frac{n-x-2y}{3}=\frac{2x+y+n}{3} connected components in GG. Since x+2yEmpM+2DupMx+2y\leq\textrm{Emp}\,M^{\prime}+2\cdot\textrm{Dup}\,M^{\prime} and xEmpMx\leq\textrm{Emp}\,M^{\prime},

2x+y+n3\displaystyle\frac{2x+y+n}{3} =3x+(x+2y)+2n6\displaystyle=\frac{3x+(x+2y)+2n}{6}
3EmpM+(EmpM+2DupM)+2n6\displaystyle\leq\frac{3\cdot\textrm{Emp}\,M^{\prime}+(\textrm{Emp}\,M^{\prime}+2\cdot\textrm{Dup}\,M^{\prime})+2n}{6}
=n+2EmpM+DupM3.\displaystyle=\frac{n+2\cdot\textrm{Emp}\,M^{\prime}+\textrm{Dup}\,M^{\prime}}{3}.

We remark that we can find EmpM\textrm{Emp}\,M^{\prime} and DupM\textrm{Dup}\,M^{\prime} in O(n2)O(n^{2}) time by sorting the rows of MM^{\prime}.

Corollary 4.10.

The number of middle gates in any CNOT decomposition of n×nn\times n matrix MM is at least ncperfect(M)n-c_{\textnormal{perfect}}(M). Furthermore, ncperfect(M)n-c_{\textnormal{perfect}}(M) can be computed in O(nω)O(n^{\omega}) time.

5 The LMC Bound

We now combine the lower bounds found in Sections 3, 4 alongside other minor optimizations. We begin by proving that certain matrix operations preserve the value of s(M)s(M).

Theorem 5.1.

For any n×nn\times n reversible binary matrix MM and any n×nn\times n permutation matrix PP, we have s(M)=s(M1)=s(M)=s(P1MP)s(M)=s(M^{-1})=s(M^{\top})=s(P^{-1}MP).

Proof.

Consider a CNOT decomposition M=R1R2Rs(M)M=R_{1}R_{2}\cdots R_{s(M)}. Taking the inverse of MM, we obtain a CNOT decomposition M1=Rs(M)R2R1M^{-1}=R_{s(M)}\cdots R_{2}R_{1}. By symmetry, s(M)=s(M1)s(M)=s(M^{-1}). Taking the transpose of MM, we obtain a CNOT decomposition M=Rs(M)R2R1M^{\top}=R_{s(M)}^{\top}\cdots R_{2}^{\top}R_{1}^{\top}. By symmetry, s(M)=s(M)s(M)=s(M^{\top}). Finally, conjugating MM by PP, we obtain a CNOT decomposition P1MP=R1R2Rs(M)P^{-1}MP=R^{\prime}_{1}R^{\prime}_{2}\cdots R^{\prime}_{s(M)} where Ri=P1RiPR^{\prime}_{i}=P^{-1}R_{i}P is a CNOT gate. By symmetry, s(M)=s(P1MP)s(M)=s(P^{-1}MP), completing the proof. ∎

Next, we show the number of zeroes on the main diagonal of MM gives a second lower bound on the total number of non-link gates in any CNOT decomposition of MM.

Theorem 5.2.

Let M1,M2,,Mk+1M_{1},M_{2},\ldots,M_{k+1} be a CNOT synthesis of n×nn\times n matrix MM. If MM has #0(M)\#_{0}(M) zeroes on its main diagonal, then there must be at least #0(M)\#_{0}(M) CNOT operations involved in the synthesis which are not link gates.

Proof.

In the identity, there are no zeroes on the main diagonal. Since any CNOT operation changes only one row, we can add at most one zero at a time to the main diagonal. Thus it suffices to show that link gates don’t affect the main diagonal.

In the proof of Lemma 3.4, we showed that for any link gate adding row ii to row jj, we must have i,ji,j in different connected components of Gv(M)G_{v}(M). This implies that Mi,j=0M_{i,j}=0, so the element Mj,jM_{j,j} on the main diagonal will not change. Since link gates cannot change the diagonal, we need at least #0(M)\#_{0}(M) other CNOT operations in order to synthesize MM, as desired. ∎

Finally, we obtain the following lower bound on the size of CNOT circuits.

Theorem 5.3 (LMC Bound).

For an n×nn\times n reversible binary matrix MM, compute =nv(M)\ell=n-v(M),
m=nmin{cperfect(M),cperfect(M)}m=n-\min\{c_{\textnormal{perfect}}(M),c_{\textnormal{perfect}}(M^{\top})\}, and c=e(M)v(M)c=e(M)-v(M). Then

s(M)+max{m+c,#0(M),#0(M1)}.s(M)\geq\ell+\max\left\{m+c,\#_{0}(M),\#_{0}(M^{-1})\right\}.
Proof.

By Lemma 3.5, the sets of link, middle, and cut gates are disjoint. From Lemma 3.6, the number of link gates is at least nv(M)n-v(M) and the number of cut gates is at least e(M)v(M)e(M)-v(M). From Corollary 4.10, the number of middle gates is at least ncperfect(M)n-c_{\textrm{perfect}}(M). By Theorem 5.2, at least #0(M)\#_{0}(M) non-link gates are required. Combining these results shows that s(M)+max{ncperfect(M)+c,#0(M)}s(M)\geq\ell+\max\{n-c_{\textrm{perfect}}(M)+c,\#_{0}(M)\}. Since s(M)=s(M1)=s(M)=s(M)s(M)=s(M^{-1})=s(M^{\top})=s(M^{-\top}) by Theorem 5.1, the result follows from taking the maximum of these lower bounds. ∎

5.1 Cycle and Permutation Matrices

We now apply Theorem 5.3 to obtain exact values on the size of all nn-cycle and permutation matrices, as well as the exact depth of nn-cycle matrices for n7n\geq 7.

Theorem 5.4.

For any nn-cycle σSn\sigma\in S_{n}, s(Pσ)=3(n1)s(P_{\sigma})=3(n-1). Furthermore, let M1,M2,,M3n2M_{1},M_{2},\ldots,M_{3n-2} be a CNOT synthesis of PσP_{\sigma}. Then the sets of link, middle, and cut gates involved in the CNOT synthesis each have size n1n-1 and form a spanning tree on the set of qubits.

Proof.

Applying Theorem 5.3, we note that v(Pσ)=1v(P_{\sigma})=1 and e(Pσ)=ne(P_{\sigma})=n. Applying Theorem 4.7 and Lemma 4.8, we have Pσ=PσPσ+𝐈=Pσ+𝐈P^{\prime}_{\sigma}=P_{\sigma}\land P_{\sigma}^{-\top}+\mathbf{I}=P_{\sigma}+\mathbf{I}. In particular, the only vector vv such that vPσ=𝟎vP^{\prime}_{\sigma}=\mathbf{0} is 𝟏\mathbf{1}, so the graph G=(V,E)G=(V,E), where V=[n]V=[n] and (i,j)E(i,j)\in E if and only if there is a middle gate adding between rows i,ji,j, must be connected. Thus cperfect(M)=1c_{\textrm{perfect}}(M)=1 and s(Pσ)3(n1)s(P_{\sigma})\geq 3(n-1). Figure VII gives various constructions of s(Pσ)=3(n1)s(P_{\sigma})=3(n-1) alongside the spanning tree structures of the link, middle, and cut gates respectively. We add three original constructions which do not use the swapping method in addition to those given in [17, 15, 21].

Pseudocode L, M, C ordering   Link   Middle   Cut
for i=2i=2 to nn do
     CNOT(11, ii)
     CNOT(ii, 11)
     CNOT(11, ii)
end for
LMCLMCLMC3(n1)\overbrace{\textrm{LMCLMC}\cdots\textrm{LMC}}^{3(n-1)} Star Star Star
for i=1i=1 to n1n-1 do
     CNOT(nin-i, ni+1n-i+1)
     CNOT(ni+1n-i+1, nin-i)
     CNOT(nin-i, ni+1n-i+1)
end for
LMCLMCLMC3(n1)\overbrace{\textrm{LMCLMC}\cdots\textrm{LMC}}^{3(n-1)} Path Path Path
for i=1i=1 to n1n-1 do
     CNOT(ii, i+1i+1)
end for
for i=1i=1 to n1n-1 do
     CNOT(i+1i+1, ii)
end for
for i=1i=1 to n1n-1 do
     CNOT(ii, nn)
end for
LLLn1MMMn1CCCn1\overbrace{\textrm{LL}\cdots\textrm{L}}^{n-1}\overbrace{\textrm{MM}\cdots\textrm{M}}^{n-1}\overbrace{\textrm{CC}\cdots\textrm{C}}^{n-1} Star Path Star
for i=1i=1 to n1n-1 do
     CNOT(nin-i, nn)
     CNOT(nn, nin-i)
end for
CNOT(11, nn)
for i=1i=1 to n2n-2 do
     CNOT(i+1i+1, ii)
end for
LMLMLM2(n1)CCCn1\overbrace{\textrm{LMLM}\cdots\textrm{LM}}^{2(n-1)}\overbrace{\textrm{CC}\cdots\textrm{C}}^{n-1} Star Star Path
for i=2i=2 to nn do
     CNOT(11, ii)
end for
for i=1i=1 to n1n-1 do
     CNOT(i+1i+1, ii)
     CNOT(ii, i+1i+1)
end for
LLLn1MCMCMC2(n1)\overbrace{\textrm{LL}\cdots\textrm{L}}^{n-1}\overbrace{\textrm{MCMC}\cdots\textrm{MC}}^{2(n-1)} Star Path Path
Figure VII: Table of various constructions to synthesize an nn-cycle circuit in 3(n1)3(n-1) CNOT operations, alongside their decomposition into link, middle, and cut gates.

Therefore s(Pσ)=3(n1)s(P_{\sigma})=3(n-1). Now, consider a CNOT synthesis M1,M2,,M3n2M_{1},M_{2},\ldots,M_{3n-2} of PσP_{\sigma}. There must be exactly n1n-1 link, middle, and cut gates respectively. Because GG is connected, the set of middle gates forms a spanning tree on the set of qubits. Now assume for the sake of contradiction that the set of link gates does not form a spanning tree on the set of qubits. Then there is a partition [n]=AB[n]=A\sqcup B such that there is never a CNOT operation between a row in AA and a row in BB, as the first such gate would be a link gate. However, this implies that PσP_{\sigma} is separable, giving contradiction.

Finally, assume for the sake of contradiction that the set of cut gates does not form a spanning tree over the nn qubits. Then there exists a partition [n]=AB[n]=A\sqcup B such that there is no cut gate between any aAa\in A and bBb\in B. There must be a link gate adding a row aAa\in A to a row bBb\in B, as the link gates form a spanning tree. This means that at some point, Ra,RbR_{a},R_{b} are connected by this link gate. However, if there is no cut gate between AA and BB, then there will always exist a row of AA and a row of BB that are connected in GeG_{e} after this point, giving contradiction, as no two rows are connected in Ge(Pσ)G_{e}(P_{\sigma}). ∎

Moore and Nilsson [21] showed that the depth of any permutation circuit was at most 66 by giving a construction using parallel CNOT gates. We demonstrate that this number is optimal for large nn-cycle circuits.

Corollary 5.5.

Let n7n\geq 7. Then maxσSnd(Pσ)=6\max_{\sigma\in S_{n}}d(P_{\sigma})=6, and d(Pσ)=6d(P_{\sigma})=6 when σ\sigma is an nn-cycle.

Proof.

Note that any parallel CNOT operation can be decomposed into at most n2\left\lfloor\frac{n}{2}\right\rfloor CNOT gates. Thus, we have

d(M)s(M)n2d(M)\geq\left\lceil\frac{s(M)}{\left\lfloor\frac{n}{2}\right\rfloor}\right\rceil

for any n×nn\times n reversible binary matrix MM. Hence, for n7n\geq 7, any nn-cycle σ\sigma satisfies d(Pσ)6d(P_{\sigma})\geq 6. Finally, the construction given in [21] shows that d(M)6d(M)\leq 6 for any choice of PσP_{\sigma}. ∎

Theorem 5.6.

For any σSn\sigma\in S_{n}, we have s(Pσ)=3(nk)s(P_{\sigma})=3(n-k), where kk is the number of disjoint cycles in σ\sigma.

Proof.

Using the construction from Theorem 5.4 on each disjoint cycle of σ\sigma, we achieve a synthesis of PσP_{\sigma} in 3(nk)3(n-k) CNOT gates, showing that s(Pσ)3(nk)s(P_{\sigma})\leq 3(n-k).

Now for the sake of contradiction assume that there exists some σSn\sigma\in S_{n} with minimal kk that satisfies s(Pσ)<3(nk)s(P_{\sigma})<3(n-k). Theorem 5.4 implies that k2k\geq 2. Now, choose qubits i,ji,j in disjoint cycles of σ\sigma. Appending CNOT(i,j),CNOT(j,i),CNOT(i,j)\textrm{CNOT}(i,j),\textrm{CNOT}(j,i),\textrm{CNOT}(i,j) in that order to the end of the circuit synthesizing PσP_{\sigma} has the effect of swapping rows i,ji,j. However, as illustrated in Figure VIII, this synthesizes another permutation matrix PσP_{\sigma^{\prime}} such that σ\sigma^{\prime} has one fewer disjoint cycle than σ\sigma and additionally s(Pσ)<3(nk+1)s(P_{\sigma^{\prime}})<3(n-k+1). This either contradicts the minimality of kk or Theorem 5.4, finishing. ∎

C1C_{1}C2C_{2}iijj
Figure VIII: The effect of adding a swap operation (3 CNOT gates) between two disjoint cycles.

6 Linkability

In this section, we introduce the linkability problem, a subclass of the global minimization problem proposed in [10]. Specifically, we determine whether it is possible to synthesize an arbitrary circuit MM with v(M)=1v(M)=1 in at most n1n-1 gates in O(nω)O(n^{\omega}) time. We begin by introducing a few notions used in the algorithm.

Lemma 6.1.

Consider a CNOT decomposition of n×nn\times n matrix MM. If every CNOT gate is a link gate, then the influence graph (M)\mathcal{I}(M) forms a directed acyclic graph.

Proof.

Assume that there is a directed cycle x1x2xmx1x_{1}\to x_{2}\to\cdots\to x_{m}\to x_{1} in (M)\mathcal{I}(M). Let indices be taken modulo mm. The permutation

σ(x)={xi+1x=xi,i[m],xotherwise,\sigma(x)=\begin{cases}x_{i+1}&x=x_{i},i\in[m],\\ x&\textrm{otherwise,}\end{cases}

satisfies σS(M)\sigma\in S(M). However, this means S(M)S(I)S(M)\neq S(\textbf{I}) while link gates do not affect the set of rivers by Lemma 3.5, giving contradiction. ∎

We recall the following result in graph theory, which gives us the time complexity for computing the transitive reduction of an arbitrary graph.

Theorem 6.2 (Aho et al. [22] and Furman [23]).

The transitive reduction of a directed graph GG with nn vertices can be computed in O(nω)O(n^{\omega}) time.

Now, we present the algorithm for the linkability problem in its entirety.

Theorem 6.3.

Given an n×nn\times n reversible binary matrix MM such that v(M)=1v(M)=1, it is possible to determine whether s(M)n1s(M)\leq n-1 in O(nω)O(n^{\omega}) time.

Proof.
  1. Step 1:

    We begin by assuming that MM satisfies s(M)n1s(M)\leq n-1. Then by Theorem 5.3, s(M)=n1s(M)=n-1. By assumption, there exists a CNOT synthesis M1,M2,,MnM_{1},M_{2},\ldots,M_{n} of MM. It is clear that all the CNOT gates must be link gates since v(M)=1v(M)=1. We compute the influence graph (M)=(V,A)\mathcal{I}(M)=(V,A), which takes O(n2)O(n^{2}) time.

  2. Step 2:

    Let =(V,A)\mathcal{I^{\prime}}=(V,A^{\prime}) be the transitive reduction of \mathcal{I}. We claim that for each (i,j)A(i,j)\in A^{\prime}, one of the n1n-1 CNOT operations used in the synthesis of MM must add row ii to row jj. We prove this as follows. If (i,j)A(i,j)\in A^{\prime}, there must not have been a directed path from ii to jj of length 2\geq 2 in \mathcal{I}. Assume that no CNOT operation adds row ii to row jj. Then, there exists a sequence k1,k2,,ksk_{1},k_{2},\ldots,k_{s} with k1=ik_{1}=i and ks=jk_{s}=j such that there is a CNOT gate adding row kik_{i} to ki+1k_{i+1} for all i[s1]i\in[s-1]. However, this implies each (ki,ki+1)A(k_{i},k_{i+1})\in A as link gates cannot remove nonzero entries. This is a directed path of length 2\geq 2 from ii to jj, giving contradiction and completing the proof.

    Hence \mathcal{I^{\prime}} both maintains connectivity and has at most n1n-1 edges, so it must be a spanning tree. The exact set of CNOT gates can thus be read off of AA^{\prime}.

  3. Step 3:

    Now, we determine the order in which these CNOT operations took place. For any 3 vertices (u,v,w)(u,v,w) in \mathcal{I^{\prime}} where (u,v),(v,w)A(u,v),(v,w)\in A^{\prime}, it is easy to see that Mw,u=1M_{w,u}=1 if and only if CNOT(u,v)\textrm{CNOT}(u,v) happens before CNOT(v,w)\textrm{CNOT}(v,w), as (u,w)A(u,w)\notin A^{\prime} and no other directed path from uu to ww exists in \mathcal{I^{\prime}}.

  4. Step 4:

    From the previous step, we know the relative ordering of all adjacent CNOT gates. Any ordering of the n1n-1 CNOT gates respecting these relative orders synthesizes the same matrix. Thus, we can pick any such full ordering and perform the CNOT operations step by step on I to yield a matrix NN. This entire process requires O(n2)O(n^{2}) time. If M=NM=N, we have just found a construction to obtain MM in exactly n1n-1 CNOT gates. Otherwise, if MNM\neq N, this is a contradiction on the original assumption that s(M)n1s(M)\leq n-1, finishing.

The overall time complexity of this algorithm is O(n2)+O(nω)+O(n2)+O(n2)=O(nω)O(n^{2})+O(n^{\omega})+O(n^{2})+O(n^{2})=O(n^{\omega}), as desired. ∎

7 Results

For n5n\leq 5, the size of all linear reversible CNOT circuits can be easily found using a BFS algorithm over the set of all nn-qubit circuits. In Table 1, we provide various metrics on how well the lower bound on size from Theorem 5.3 matches the exact size for nn-qubit circuits. In Figure IX, we provide heatmaps for the confusion matrix between the lower bound and the exact size. The raw data used can be found in Appendix A.

 Metric n3n\leq 3 n=4n=4 n=5n=5
Δ=0\Delta=0 100%100\% 67.7%67.7\% 23.1%23.1\%
Δ1\Delta\leq 1 100%100\% 99.5%99.5\% 83.0%83.0\%
Δ2\Delta\leq 2 100%100\% 100%100\% 99.7%99.7\%
σ\sigma 0 0.581 1.137
MADMAD 0 0.328 0.941
R2R^{2} 1 0.811 0.649
PCCPCC 1 0.901 0.806
Table 1: Various metrics on Theorem 5.3 for nn-qubit circuits (here, Δ\Delta denotes the deviation between lower bound and actual size).
Refer to caption
Refer to caption
Refer to caption
Figure IX: Column-by-column heatmaps for n=3,4,5n=3,4,5 between the lower bound in Theorem 5.3 (on the vertical) and the size of the circuit (on the horizontal).

Although our lower bound performs well at various metrics for n5n\leq 5, Jiang et al. show in [10] that as nn grows larger, the average size of an arbitrary n×nn\times n binary reversible circuit grows to O(n2logn)O(\frac{n^{2}}{\log n}) while our lower bound can be at most 3(n1)3(n-1). This means that although our lower bound still holds for large nn, it cannot be used as a size-approximation algorithm for large nn without further improvements.

8 Conclusion and Future Directions

In this paper, we introduce a categorization of the gates of CNOT decompositions into link, middle, and cut gates. Under this framework, we derive lower bounds on the number of CNOT gates in each category and combine them to obtain a lower bound on the total number of CNOT gates needed to synthesize a given circuit that avoids brute-force search and can be obtained in O(nω)O(n^{\omega}) time. We apply our framework to the problem of synthesizing the nn-cycle matrix to show that known constructions with 3(n1)3(n-1) CNOT gates are optimal, and further show the size of any permutation matrix is 3(nk)3(n-k), where kk is the number of disjoint cycles in the permutation.

We introduce an O(nω)O(n^{\omega}) time algorithm determining whether a given circuit with v(M)=1v(M)=1 can be synthesized in fewer than nn CNOT operations, resolving a subclass of the minimum circuit size problem. We hope the methods and ideas developed in this paper will aid the search for further polynomial-time lower bounds on the size of CNOT circuits, as well as demonstrate the optimality or near-optimality of known CNOT decompositions.

To end, we present several interesting future directions on both the minimum circuit size problem as well as for improving our lower bound in Theorem 5.3.

Problem 8.1.

In [3], an information theoretic bound is used to show that the maximum size of a circuit on nn qubits is at least n22log2n\frac{n^{2}}{2\log_{2}n}. However, in the cases n5n\leq 5, the maximum size of an nn-qubit circuit is always 3(n1)3(n-1), attained by the nn-cycle circuit. This is plausible for all n27n\leq 27, as we have n22log2n3(n1)\frac{n^{2}}{2\log_{2}n}\leq 3(n-1). At what point does the asymptotic bound overtake the 3(n1)3(n-1) pattern?

Problem 8.2.

In the n=5n=5 case, a single circuit, alongside those equivalent to it through Theorem 5.1, lies far off the main diagonal in the table in Appendix A. It turns out that MM+𝐈=𝟎M\land M^{-\top}+\mathbf{I}=\mathbf{0} (see Figure X), rendering the cperfectc_{\textrm{perfect}} bound ineffective. Can we improve the cperfectc_{\textrm{perfect}} algorithm to account for this “cperfectc_{\textrm{perfect}} glitch” and obtain a more powerful lower bound on the number of middle CNOT gates?

[1001101101011101011011001]\begin{bmatrix}1&0&0&1&1\\ 0&1&1&0&1\\ 0&1&1&1&0\\ 1&0&1&1&0\\ 1&1&0&0&1\end{bmatrix}

(a) Circuit M
  • 12345  45231

  • 12435  45312

  • 13245  52341

  • 15342  52431

  • 15432  53241

  • 42315  53412

  • 43215

(b) Rivers S(M)S(M) of MM

[0000000000000000000000000]\begin{bmatrix}0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\end{bmatrix}

(c) An all-zero MM+𝐈M\land M^{-\top}+\mathbf{I}
Figure X: The cperfectc_{\textrm{perfect}} glitch
Problem 8.3.

In addition to permutations, are there other classes of circuits for which the bound in Theorem 5.3 is optimal?

Problem 8.4.

Is it possible to extend our methods to circuits with ancillae, topological constraints, or one-qubit gates such as TT gates and Hadamard gates?

9 Acknowledgements

We thank Yunseo Choi and Kevin Cong for their valuable feedback.

References

  • [1] A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. A. Smolin, and H. Weinfurter, “Elementary gates for quantum computation,” Phys. Rev. A, vol. 52, no. 5, p. 3457, 1995.
  • [2] D. P. DiVincenzo, “Two-bit gates are universal for quantum computation,” Phys. Rev. A, vol. 51, no. 2, p. 1015, 1995.
  • [3] K. N. Patel, I. L. Markov, and J. P. Hayes, “Efficient synthesis of linear reversible circuits,” arXiv preprint quant-ph/0302002, 2003.
  • [4] F. Vatan and C. Williams, “Optimal quantum circuits for general two-qubit gates,” Phys. Rev. A—Atomic, Molecular, and Optical Physics, vol. 69, no. 3, p. 032315, 2004.
  • [5] V. V. Shende, I. L. Markov, and S. S. Bullock, “Smaller two-qubit circuits for quantum communication and computation,” in proceedings design, automation and test in europe conference and exhibition, vol. 2, pp. 980–985, IEEE, 2004.
  • [6] V. V. Shende, S. S. Bullock, and I. L. Markov, “Synthesis of quantum logic circuits,” in proceedings of the 2005 Asia and South Pacific Design Automation Conference, pp. 272–275, 2005.
  • [7] J. J. Vartiainen, M. Möttönen, and M. M. Salomaa, “Efficient decomposition of quantum gates,” Phys. Rev. Letters, vol. 92, no. 17, p. 177902, 2004.
  • [8] E. Knill, “Approximation by quantum circuits,” arXiv preprint quant-ph/9508006, 1995.
  • [9] B. Nash, V. Gheorghiu, and M. Mosca, “Quantum circuit optimizations for nisq architectures,” Quantum Science and Technology, vol. 5, no. 2, p. 025010, 2020.
  • [10] J. Jiang, X. Sun, S.-H. Teng, B. Wu, K. Wu, and J. Zhang, “Optimal space-depth trade-off of cnot circuits in quantum logic synthesis,” in Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 213–229, SIAM, 2020.
  • [11] Y. Kang and H. Ma, “Cnot-optimal circuit synthesis,” 2023.
  • [12] X. Liu, H. Yang, and L. Yang, “Minimizing cnot-count in quantum circuit of the extended shor’s algorithm for ecdlp,” Cybersecurity, vol. 6, no. 1, p. 48, 2023.
  • [13] K.-A. Brickman, P. Haljan, P. Lee, M. Acton, L. Deslauriers, and C. Monroe, “Implementation of grover’s quantum search algorithm in a scalable system,” Phys. Rev. A—Atomic, Molecular, and Optical Physics, vol. 72, no. 5, p. 050306, 2005.
  • [14] R. Duan, H. Wu, and R. Zhou, “Faster matrix multiplication via asymmetric hashing,” in 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pp. 2129–2138, IEEE, 2023.
  • [15] J. Liu, Y. Ren, Y. Cao, H. Sun, and L. Chen, “Realization of permutation groups by quantum circuit,” arXiv preprint arXiv:2406.01350, 2024.
  • [16] M. Planat and R. Ul Haq, “The magic of universal quantum computing with permutations,” Advances in Mathematical Physics, vol. 2017, no. 1, p. 5287862, 2017.
  • [17] M. Bataille, “Quantum circuits of cnot gates: optimization and entanglement,” Quantum Info. Processing, vol. 21, no. 7, p. 269, 2022.
  • [18] C. D. Murray and R. R. Williams, “On the (non) np-hardness of computing circuit complexity,” Theory of Computing, vol. 13, no. 1, pp. 1–22, 2017.
  • [19] M. Amy, P. Azimzadeh, and M. Mosca, “On the controlled-not complexity of controlled-not–phase circuits,” Quantum Science and Technology, vol. 4, no. 1, p. 015002, 2018.
  • [20] D. Maslov and B. Zindorf, “Depth optimization of cz, cnot, and clifford circuits,” IEEE Trans. on Quantum Engineering, vol. 3, pp. 1–8, 2022.
  • [21] C. Moore and M. Nilsson, “Parallel quantum computation and quantum codes,” SIAM Journal on Computing, vol. 31, no. 3, pp. 799–815, 2001.
  • [22] A. V. Aho, M. R. Garey, and J. D. Ullman, “The transitive reduction of a directed graph,” SIAM Journal on Computing, vol. 1, no. 2, pp. 131–137, 1972.
  • [23] M. Furman, “Application of a method of rapid multiplication of matrices to the problem of finding the transitive closure of a graph,” in Doklady Akademii Nauk, vol. 194, pp. 524–524, Russian Academy of Sciences, 1970.

Appendix A Performance of Theorem 5.3 for Circuits with 44 or 55 Qubits

In this appendix, the lower bound given in Theorem 5.3 is plotted against the exact size of the circuit in the 44 and 55-qubit cases, with the entry in row ii and column jj denoting the number of linear reversible circuits with lower bound ii and size jj. For n3n\leq 3, equality holds.

  0   1   2   3   4   5   6   7    8     9
  0 1
1 12
2 96
3 542 138
4 1920 756 12
5 4560 2589 84
6 4929 2464
7 1510 469
8 72
9 6
Figure XI: The distribution of 44-qubit circuits, with the lower bound in Theorem 5.3 plotted on the vertical and the size of the circuit plotted on the horizontal.
0 1 2 3 4 5 6 7 8 9 10 11 12
0 1
1 20
2 260
3 2570 690
4 18990 20540 3640 12
5 97320 176505 37880 1320
6 360325 917960 322750 11580
7 813870 2448985 925340 15840
8 798120 2072000 365540
9 216378 350120 13340
10 5040 1920
11 480
12 24
Figure XII: The distribution of 55-qubit circuits, with the lower bound in Theorem 5.3 plotted on the vertical and the size of the circuit plotted on the horizontal.