^†^†thanks: This document is a preprint uploaded to arXiv on August 15th, 2024.

Minimum Synthesis Cost of CNOT Circuits

Alan Bu [email protected] Harvard University, Cambridge, MA 02138 Evan Fan [email protected] Phillips Exeter Academy, Exeter, NH 03833 Robert Joo [email protected] Phillips Exeter Academy, Exeter, NH 03833

Abstract

Optimizing the size and depth of CNOT circuits is an active area of research in quantum computing and is particularly relevant for circuits synthesized from the Clifford + T universal gate set. Although many techniques exist for finding short syntheses, it is difficult to assess how close to optimal these syntheses are without an exponential brute-force search. We use a novel method of categorizing CNOT gates in a synthesis to obtain a strict lower bound computable in $O(n^{\omega})$ time on the minimum number of gates needed to synthesize a given CNOT circuit, where $\omega$ denotes the matrix multiplication constant and $n$ is the number of qubits involved. Applying our framework, we prove that $3(n-1)$ gate syntheses of the $n$ -cycle circuit are optimal and provide insight into their structure. We also generalize this result to permutation circuits.

For linear reversible circuits with $n=3,4,5$ qubits, our lower bound is optimal for 100%, 67.7%, and 23.1% of circuits and is accurate to within one CNOT gate in 100%, 99.5%, and 83.0% of circuits respectively. We also introduce an algorithm that efficiently determines whether certain circuits can be synthesized with fewer than $n$ CNOT gates.

^†^†preprint: APS/123-QED

1 Introduction

Minimizing the number of CNOT gates necessary to synthesize a given quantum circuit is a well-studied task in quantum computing. The authors of [1, 2] have developed algorithms to optimize the number of CNOT gates in quantum circuits, and [3, 4, 5, 6, 7, 8] give asymptotics on the minimum number of CNOT gates required in circuit synthesis. The approaches presented in these papers have been used to minimize both the size and depth of a circuit, in some cases accounting for the qubit connectivity of the quantum computer and the ancilla count. In general, these methods have approached asymptotic optimality [9, 10, 4, 11].

Minimizing the number of CNOT operations is important because CNOT operations and single qubit gates form a universal set for quantum computing. Both the speed and error rate of quantum algorithms such as Shor’s algorithm and Grover’s search improve with decreases in CNOT gate count [12, 13].

1.1 Main Results

Although a large volume of active research has been dedicated to optimizing CNOT circuits, we know little about how to obtain firm lower bounds on the minimum number of CNOT gates required to synthesize a given circuit besides resorting to brute-force search. Consequently, an exponential search time is required to verify the optimality of a CNOT circuit decomposition. In this paper, we focus on quantum computers without topological constraints or quantum reconfigurability. Our main result utilizes a novel categorization of the CNOT gates in a CNOT decomposition into link, middle, and cut gates to find strict lower bounds in $O(n^{\omega})$ time on the minimum number of CNOT gates necessary to synthesize a given circuit, where $\omega$ denotes the matrix multiplication constant [14] and $n$ is the number of qubits involved. We apply this method to all $3,4,5$ -qubit circuits. In these cases, our lower bound is exact in $100\%,67.7\%,$ and $23.1\%$ of circuits, and off from the minimum by at most one in $100\%,99.5\%$ , and $83.0\%$ of circuits respectively.

Notably, [15, 16, 17, 4] provided constructions which use $3(n-1)$ CNOT gates to synthesize an $n$ -cycle circuit, but did not disprove the existence of syntheses with fewer CNOT gates. By applying our lower bound, we prove that $3(n-1)$ gates is optimal, eliminating the need for further CNOT count optimization of $n$ -cycle circuits in future research. More generally, we show the minimum number of CNOT gates needed to synthesize permutation circuits is $3(n-k)$ , where $k$ is the number of disjoint cycles in the permutation. We remark that our framework not only yields a certificate of optimality, but also shows that the gates in any $3(n-1)$ gate synthesis of an $n$ -cycle circuit can have its CNOT gates partitioned into three sets of size $n-1$ , each of which forms a spanning tree over the $n$ qubits, all without needing to resort to brute-force search.

Finally, Jiang et al. [10] proved that determining the minimum CNOT count of an arbitrary CNOT circuit with topological constraints is NP-hard. Finding the computational complexity of the same problem without topological constraints is an ongoing area of research [11, 18, 19]. We introduce the linkability problem, a subclass of this problem, with the goal of efficiently determining whether certain circuits can be synthesized with fewer than $n$ CNOT gates. Combining our framework with a graph-theoretic approach, we give an $O(n^{\omega})$ algorithm for this problem.

1.2 Organization of the Paper

In Section 2, we introduce conventions and definitions used throughout the paper. In Section 3, we introduce the categorization of CNOT gates into link, middle, and cut gates, as well as derive a lower bound on the number of link and cut gates necessary to synthesize a circuit. In Section 4, we prove results regarding the effects of middle gates on a circuit and derive a lower bound on the number of middle gates necessary to synthesize a circuit. In Section 5, we combine all of these results alongside a few additional observations to yield our lower bound and apply the bound to $n$ -cycle and permutation circuits to determine their minimum CNOT gate count. In Section 6, we introduce and prove our polynomial-time algorithm for the linkability problem. In Section 7, we assess the performance of our lower bound in the $n=3,4,5$ cases. In Section 8, we summarize our main results and raise future directions.

2 Preliminaries and Notations

We use the notation $[n]=\{1,2,\ldots,n\}$ . We use $M_{i,j}$ to denote the value in the $i$ th row and $j$ th column of a matrix $M$ . We use $e$ to denote the identity permutation and $T_{ij}$ to denote the $(i,j)$ -transposition. We denote the identity matrix by $\mathbf{I}$ . For a given set $A$ and permutation $\sigma$ , we let $\sigma(A)$ denote the image of $A$ under $\sigma$ . We let $\textbf{1}_{i\in A}$ denote the indicator function, where $\textbf{1}_{i\in A}=1$ if $i\in A$ and $\textbf{1}_{i\in A}=0$ otherwise. We say that an $n\times n$ matrix $M$ is a reversible binary matrix if $M\in GL_{n}(\mathbb{F}_{2})$ . We define $\textrm{Emp}\,M$ to be the number of all-zero rows in $M$ and $\textrm{Dup}\,M$ to be the number of disjoint pairs of duplicate, not all-zero, rows in $M$ .

2.1 CNOT Synthesis of Circuits

In this paper, we lower bound the minimum number of CNOT gates required to synthesize a given linear reversible circuit, which can be represented as an $n\times n$ reversible binary matrix. Rigorously, we define the problem as follows, borrowing notation from [10].

Definition 2.1.

A CNOT gate or CNOT operation takes as input two binary values $x_{i}$ and $x_{j}$ representing the control and target of the CNOT gate, respectively, and replaces the pair $(x_{i},x_{j})$ with $(x_{i},x_{i}\oplus x_{j})$ . The effect of a CNOT gate with control $i$ and target $j$ on a set of $n$ qubits can be represented as multiplying by matrix $M$ with $M_{i,j}=1$ , all diagonal entries equal to $1$ , and all other entries equal to $0$ . We denote this CNOT gate, which adds the $i$ th row of a matrix to the $j$ th row, as $\textrm{CNOT}(i,j)$ .

Definition 2.2.

For any $n\times n$ reversible binary matrix $M$ , we define its size $s(M)$ to be one less than the length of the shortest sequence $M_{1},M_{2},\ldots,M_{k+1}$ of $n\times n$ reversible binary matrices with $M_{1}=\textbf{I}$ and $M_{k+1}=M$ such that $M_{i+1}$ can be obtained from $M_{i}$ through a single CNOT operation for all $i\in[k]$ .¹¹1We note that our usage of the term “size” differs from its conventional definition, where it typically refers to the number of CNOT gates in a given CNOT synthesis. In this paper, when we refer to the “size” of a circuit, we mean the minimum number of CNOT gates required in any synthesis of the circuit. We call any such minimal sequence a CNOT synthesis of $M$ . Furthermore, this sequence gives a decomposition of $M$ as a product of CNOT gate matrices, which we call a CNOT decomposition of $M$ . An example is shown in Figure I.

$\begin{bmatrix}1&0&0\\ 0&1&0\\ 0&0&1\end{bmatrix}$

Initial matrix

$\to$

\Qcircuit@C=0.8em@R=1.5em{&x_{1}\quad\quad\ctrl{1}\ctrl{2}\targ\ctrl{1}\qw\qw\qw\;x_{2}\\ x_{2}\quad\quad\targ\qw\ctrl{-1}\targ\targ\ctrl{1}\qw\;x_{3}\\ x_{3}\quad\quad\qw\targ\qw\qw\ctrl{-1}\targ\qw\;x_{1}}

CNOT decomposition

$\to$

$\begin{bmatrix}0&1&0\\ 0&0&1\\ 1&0&0\end{bmatrix}$

3

-cycle matrix

Figure I: CNOT decomposition of the

3

-cycle permutation operation.

The authors of [10, 20] also minimized the number of parallel CNOT operations required to synthesize a given circuit, as fully disjoint CNOT operations can be performed simultaneously on a quantum computer to save time.

Definition 2.3.

A parallel CNOT gate or parallel CNOT operation is a composition of arbitrarily many CNOT gates such that no qubit is operated on by more than one such gate. Rigorously, let $x_{1},x_{2},\ldots,x_{k}$ and $y_{1},y_{2},\ldots,y_{k}$ be $2k$ distinct numbers in $[n]$ . A parallel CNOT gate can be represented by the matrix $M$ with $M_{x_{i},y_{i}}=1$ for $i\in[k]$ , all diagonal entries equal to $1$ , and all other entries equal to $0$ .

An example of a parallel CNOT decomposition is given in Figure II.

$\begin{bmatrix}0&1&0&0&0&0\\ 1&0&1&0&0&1\\ 0&0&1&0&0&0\\ 0&0&1&1&0&0\\ 0&0&0&0&1&1\\ 0&1&1&0&0&1\end{bmatrix}$

$\Qcircuit@C=1.0em@R=1.2em{&\lstick{x_{1}}\qw\ctrl{1}\targ\ctrl{5}\qw\qw\qquad\qquad\;x_{2}\\ \lstick{x_{2}}\qw\targ\ctrl{-1}\qw\targ\qw\qquad\qquad\;x_{1}\oplus x_{3}\oplus x_{6}\\ \lstick{x_{3}}\qw\ctrl{1}\ctrl{3}\qw\qw\qw\qquad\qquad\;x_{3}\\ \lstick{x_{4}}\qw\targ\qw\qw\qw\qw\qquad\qquad\;x_{3}\oplus x_{4}\\ \lstick{x_{5}}\qw\targ\qw\qw\qw\qw\rstick{x_{5}\oplus x_{6}}\\ \lstick{x_{6}}\qw\ctrl{-1}\targ\targ\ctrl{-4}\qw\qquad\qquad\;x_{2}\oplus x_{3}\oplus x_{6}}$

Figure II: An illustration of a depth

4

parallel CNOT decomposition.

Definition 2.4.

For any $n\times n$ reversible binary matrix $M$ , we define its depth $d(M)$ to be one less than the length of the shortest sequence $M_{1},M_{2},\ldots,M_{k+1}$ of $n\times n$ reversible binary matrices with $M_{1}=\textbf{I}$ and $M_{k+1}=M$ such that $M_{i+1}$ can be obtained from $M_{i}$ through a single parallel CNOT operation for all $i\in[k]$ .²²2See above.

2.2 Circuit-derived Graphs

We further introduce two graphs fundamental to the lower bound introduced in this paper.

Definition 2.5.

Given an $n\times n$ reversible binary matrix $M$ , we define its vertex-connectivity graph to be $G_{v}(M)=(V,E)$ , where $V=[n]$ and $(i,j)\in E$ if and only if $i\neq j$ and at least one of $M_{i,j}$ or $M_{j,i}$ is equal to one. We define $v(M)$ to be the number of connected components of this graph.

Definition 2.6.

Given an $n\times n$ reversible binary matrix $M$ , we define its edge-connectivity graph to be $G_{e}(M)=(V,E)$ , where $V=\{R_{1},R_{2},\ldots,R_{n},C_{1},\ldots,C_{n}\}$ . For each row $R_{i}$ and column $C_{j}$ of $M$ , we connect $R_{i}$ and $C_{j}$ with an edge in $G_{e}(M)$ if and only if $M_{i,j}=1$ . We define $e(M)$ to be the number of connected components of this graph.

Examples of the vertex and edge-connectivity graphs of a matrix are shown in Figure III.

$\begin{bmatrix}0&0&1&0&0\\ 0&1&0&1&0\\ 1&0&0&0&0\\ 0&0&0&0&1\\ 0&1&0&0&0\end{bmatrix}$

(a) Matrix

M

(b)

G_{v}(M)

(c)

G_{e}(M)

Figure III: An example of

M

with

v(M)=2

and

e(M)=4

Definition 2.7.

We call a graph $G=(V,E)$ on $n$ vertices a star graph if one vertex of $G$ has degree $n-1$ and all other vertices have degree $1$ . We call $G$ a path graph if we can label its vertices as $v_{1},v_{2},\ldots,v_{n}$ such that $(v_{i},v_{j})\in E$ if and only if $|i-j|=1$ . With slight abuse of notation, we say that a set of $n-1$ CNOT gates is a star or path if the graph $G$ formed by adding an edge between the control and target of each CNOT gate is a star or path graph respectively.

2.3 Influence Graphs

Finally, we introduce two definitions used in our algorithm for the linkability problem.

Definition 2.8.

For a reversible binary $n\times n$ matrix $M$ , we define its influence graph $\mathcal{I}(M)=(V,A)$ to be the directed graph such that $V=[n]$ and $(i,j)\in A$ if and only if $M_{j,i}=1$ and $i\neq j$ . In each such case, we say that $i$ influences $j$ . An example is given in Figure IV.

Definition 2.9.

For a directed graph $G=(V,A)$ , we define a transitive reduction $G^{\prime}$ of $G$ to be another directed graph with the same vertices and a minimal set of directed edges such there is a directed path from vertex $u$ to vertex $v$ in $G^{\prime}$ if and only if there is a directed path from $u$ to $v$ in $G$ .

$\begin{bmatrix}1&0&1&1\\ 1&1&1&0\\ 0&0&1&0\\ 0&0&1&1\end{bmatrix}$

(a) Matrix

M

(b) Influence graph of

M

\mathcal{I}(M)

Figure IV: Example of an influence graph and its transitive reduction.

3 Link, Middle, and Cut CNOT Gates

The key insight of our algorithm is that each gate of a CNOT decomposition can be placed into one of three categories (or none), and finding lower bounds on the number of CNOT gates in each category allows us to obtain a lower bound on the total number of CNOT gates. In this section, we introduce these categories and prove lower bounds on the number of CNOT gates in two of the three categories.

Definition 3.1.

We define a river of an $n\times n$ reversible binary matrix $M$ to be a permutation $\sigma\in S_{n}$ such that $M_{i,\sigma(i)}=1$ for all $i$ . We denote the set of all rivers of $M$ as $S(M)$ . For any river $\sigma$ , we denote the unique permutation matrix whose only nonzero entries lie on this river as $P_{\sigma}$ .

Definition 3.2.

Let $M,N$ be $n\times n$ reversible binary matrices such that $N$ can be obtained from $M$ through one CNOT operation. We call this CNOT operation a

•

Link gate if $v(M)>v(N)$ ,
•

Middle gate if $S(M)\neq S(N)$ ,
•

Cut gate if $e(M)<e(N)$ .

We now prove an important result on the effects of link and cut gates on $G_{v}$ and $G_{e}$ .

Proposition 3.3.

Let $M$ be a reversible binary matrix. If $R_{i},R_{j}$ are connected in $G_{e}(M)$ , then $i,j$ are connected in $G_{v}(M)$ .

Proof.

As $R_{i},R_{j}$ are connected in $G_{e}(M)$ , there exists a path $R_{x_{1}}C_{y_{1}}R_{x_{2}}C_{y_{2}}\cdots C_{x_{m-1}}R_{x_{m}}$ of $G_{e}(M)$ such that $x_{1}=i$ and $x_{m}=j$ . Thus, $x_{i},x_{i+1}$ are connected in $G_{v}(M)$ for all $i\in[m-1]$ since $M_{x_{i},y_{i}}=M_{y_{i},x_{i+1}}=1$ . Thus $i$ is connected to $j$ in $G_{v}(M)$ as desired. ∎

Lemma 3.4.

Let $M,N$ be $n\times n$ reversible binary matrices such that $N$ can be obtained from $M$ through one CNOT operation. Then, $|v(M)-v(N)|\leq 1$ and $|e(M)-e(N)|\leq 1$ . Furthermore, if $v(N)=v(M)-1$ , then $e(N)=e(M)-1$ .

Proof.

Consider a CNOT operation on a matrix $M$ adding row $i$ to row $j$ for which $v(M)>v(N)$ . If $i$ and $j$ were in the same connected component $A$ in $G_{v}(M)$ , then any new edges in $G_{v}(N)$ will also be between vertices in $A$ , contradicting $v(M)>v(N)$ . If $i$ and $j$ were in distinct connected components $A$ and $B$ of $G_{v}(M)$ , then the only edges that will be different in $G_{v}(N)$ will be those between vertices of $A$ and $B$ . This will cause $A$ and $B$ to become a single connected component in $G_{v}(N)$ , and leave all other connected components unchanged. Thus, if $v(M)>v(N)$ then $v(M)=v(N)+1$ .

Similarly, consider the case when $e(M)>e(N)$ . If $R_{i}$ , $R_{j}$ are in the same connected component $A$ of $G_{e}(M)$ then no edges between connected components are added, while if they are in separate connected components $A$ and $B$ , then the only new edges are edges connecting a vertex in $A$ to a vertex in $B$ , leaving all other connected components unchanged. Thus, if $e(M)>e(N)$ then $e(M)=e(N)+1$ .

Now, notice that applying $\textrm{CNOT}(i,j)$ on $N$ returns the original matrix $M$ . In symmetric fashion, $v(N)\leq v(M)+1$ and $e(N)\leq e(M)+1$ .

Finally, if $v(N)=v(M)-1$ , then there must exist $i,j$ that lie in different connected components of $G_{v}(M)$ such that $N_{i,j}=1$ . The CNOT gate adds some row $k$ with $M_{k,j}=1$ to row $i$ . Note that $j,k$ lie in the same connected component of $G_{v}(M)$ . After adding row $k$ to row $i$ , we know $i,j,k$ lie in the same connected component of $G_{v}(N)$ and $R_{i},C_{j}$ is connected in $G_{e}(N)$ . Notably, $R_{i}$ and $C_{j}$ are in different connected components of $G_{e}(M)$ by Proposition 3.3, and thus the number of edge-connected components in $N$ is one less than in $M$ , as desired. ∎

We now use the tools developed in the proof of Lemma 3.4 to prove that these three categories of CNOT gates are mutually exclusive, allowing us to sum lower bounds on each category to bound the size of the entire circuit.

Lemma 3.5.

Let $M,N$ be $n\times n$ reversible binary matrices such that $N$ can be obtained from $M$ through one CNOT operation. Then the CNOT operation is either exclusively a link, middle, or cut gate, or none of these.

Proof.

By Lemma 3.4, if the CNOT operation is a link gate then $e(M)>e(N)$ , so a link gate cannot be a cut gate.

As discussed in the proof of Lemma 3.4, link gates operate on rows that are in separate connected components of $G_{v}(M)$ . Let the control row of the CNOT gate lie in connected component $A\subseteq[n]$ of $G_{v}(M)$ . Without loss of generality, let $A=[a]$ and define $B=[n]\setminus A$ . Then, $M=\begin{bmatrix}\textbf{A}&\textbf{0}\\ \textbf{0}&\textbf{B}\end{bmatrix}$ where A is an $a\times a$ matrix and B is an $(n-a)\times(n-a)$ matrix. Any $\sigma\in S(M)$ satisfies $\sigma(A)=A$ . When we add a row from A to a row in B, the upper-right box of zeros does not change. Thus, for any $\sigma\in S(N)$ , we have $\sigma(A)\subseteq A$ , so $\sigma(A)=A$ . Therefore the set of rivers is unchanged as no river passes through the newly created bottom left entries, so a link gate cannot be a middle gate.

Finally, we show that cuts and middles are also disjoint. In this case, we can assume up to permutation of the rows and columns that $N=\begin{bmatrix}\textbf{A}&\textbf{0}\\ \textbf{0}&\textbf{B}\end{bmatrix}$ where $\textbf{A},\textbf{B}$ are not necessarily square. However, if $\textbf{A},\textbf{B}$ are not square, then $N$ is not reversible, giving contradiction. Suppose the CNOT gate adds a row of A to a row of B. Let the set of columns of $M$ in A be denoted as $C_{A}$ , and let the set of rows of $M$ in A be denoted as $R_{A}$ . Thus if $\sigma\in S(N)$ , then $\sigma(R_{A})=C_{A}$ . Furthermore, for any $\sigma\in S(M)$ , we have $\sigma(R_{A})\subseteq C_{A}$ so $\sigma(R_{A})=C_{A}$ . Thus the set of rivers in $M$ and $N$ are equal. This finishes the final case. ∎

Finally, we extend Lemma 3.4 to yield a simple but powerful lower bound on the number of link and cut gates in a given circuit.

Lemma 3.6.

Any CNOT decomposition of $n\times n$ matrix $M$ must contain at least $n-v(M)$ link gates and at least $e(M)-v(M)$ cut gates.

Proof.

Consider any CNOT synthesis $M_{1},M_{2},\ldots,M_{k+1}$ of $M$ . Note that $v(\textbf{I})=e(\textbf{I})=n$ . The number of vertex-connected components must decrease over the sequence, so Lemma 3.4 implies that at least $n-v(M)$ link gates are required.

From Lemma 3.4, link gates decrease the number of edge connected components by exactly one, while cut gates increase them by exactly one. Since there are $n$ initial components and $n-v(M)$ link gates, we need at least $e(M)-v(M)$ cut gates in order to end up at $e(M)$ edge-connected components in the final matrix. ∎

4 Lower Bounds on Middle Gates

In this section, we develop an $O(n^{\omega})$ time algorithm that computes a lower bound on the number of middle gates necessary to synthesize a given circuit. To begin, we prove several lemmas that clarify the effects of middle gates on the rivers of a matrix.

Given a set of middle CNOT gates, each river can only “interact” with other rivers in the same equivalence class. These equivalence classes are determined by the set of middle CNOT gates. When it is clear that the rivers of the $M$ cannot be obtained from I through these interactions, the given set of middle CNOT gates is insufficient to synthesize $M$ . In the following subsections, we make this idea rigorous.

4.1 Equivalence Classes of Rivers

We now show that a singular middle CNOT gate only allows “interactions” between certain pairs of rivers, each of which differs by a fixed transposition determined by the control and target of the CNOT gate.

Lemma 4.1.

Let $M,N$ be $n\times n$ reversible binary matrices such that $N$ can be obtained from $M$ by adding row $i$ to row $j$ . Consider pairing each permutation $\sigma$ to $T_{ij}\circ\sigma$ . Then $|\{\sigma,T_{ij}\circ\sigma\}\cap S(M)|\equiv|\{\sigma,T_{ij}\circ\sigma\}\cap S(N)|\pmod{2}$ .

\begin{matrix}1&0\\ 0&1\end{matrix}\quad\xrightarrow{\hskip 20.00003pt}\quad\begin{matrix}1&0\\ 1&1\end{matrix}

\begin{matrix}1&1\\ 0&1\end{matrix}\quad\xrightarrow{\hskip 20.00003pt}\quad\begin{matrix}1&1\\ 1&0\end{matrix}

Figure V: Examples of the effects of

\textrm{CNOT}(i,j)

for rows

i,j

and columns

\sigma(i),\sigma(j)

Proof.

We begin with casework on the values of $M_{i,\sigma(i)}$ and $M_{i,\sigma(j)}$ , with examples in Figure V. If they are both $0$ , then $\{\sigma,T_{ij}\circ\sigma\}\cap S(M)=\{\sigma,T_{ij}\circ\sigma\}\cap S(N)=\emptyset$ as desired. If exactly one of them is $1$ , it can be checked that $\{\sigma,T_{ij}\circ\sigma\}\cap S(M)=\{\sigma,T_{ij}\circ\sigma\}\cap S(N)$ as desired. We now consider the case when $M_{i,\sigma(i)}=M_{i,\sigma(j)}=1$ . In this case, if it is not true that $M_{k,\sigma(k)}=1$ for all $k\neq i,j$ then $\{\sigma,T_{ij}\circ\sigma\}\cap S(M)=\{\sigma,T_{ij}\circ\sigma\}\cap S(N)=\emptyset$ as desired. Otherwise, observe that $\sigma\in M$ if and only if $\sigma\notin N$ and $T_{ij}\circ\sigma\in M$ if and only if $T_{ij}\circ\sigma\notin N$ . Thus, $(\{\sigma,T_{ij}\circ\sigma\}\cap S(M))\sqcup(\{\sigma,T_{ij}\circ\sigma\}\cap S(N))=\{\sigma,T_{ij}\circ\sigma\}$ , completing the final case. ∎

Lemma 4.1 demonstrates that when there is only one CNOT gate involved, the parity of the number of rivers in sets of the form $\{\sigma,T_{ij}\circ\sigma\}$ cannot change. We extend this idea to the case of multiple CNOT gates by allowing multiple transpositions.

Definition 4.2.

Let $S\subseteq T$ be a fixed set of transpositions. We say that $\sigma_{1}\equiv_{S}\sigma_{2}$ if and only if $\sigma_{2}$ can be obtained from $\sigma_{1}$ through a sequence of transpositions in $S$ . It is easy to see that $\equiv_{S}$ is an equivalence relation on $S_{n}$ as the reflexive, transitive, and symmetric properties hold. We let $S_{n}/S$ denote the set of equivalence classes under $\equiv_{S}$ .

Theorem 4.3.

Let $M,N$ be $n\times n$ reversible binary matrices such that $N$ can be obtained from $M$ by adding row $i$ to row $j$ . Then, for any $S\subseteq T$ with $T_{ij}\in S$ and for any equivalence class $K\in S_{n}/S$ , we have $|K\cap S(M)|\equiv|K\cap S(N)|\pmod{2}$ .

Proof.

Since $T_{ij}\in S$ , we can partition all the rivers in $K$ into pairs $(\sigma,T_{ij}\circ\sigma)$ without including elements outside of $K$ . By Lemma 4.1,

	$\displaystyle\|K\cap S(M)\|$	$\displaystyle=\sum\|\{\sigma,T_{ij}\circ\sigma\}\cap S(M)\|$
		$\displaystyle\equiv\sum\|\{\sigma,T_{ij}\circ\sigma\}\cap S(N)\|$
		$\displaystyle=\|K\cap S(N)\|\pmod{2}.$

∎

In preparation for a proof in the next subsection, we introduce the labeling of an equivalence class.

Definition 4.4.

Let $S\subseteq T$ and $K\in S_{n}/S$ . Let $a_{i}=\min_{\sigma\in K}\sigma^{-1}(i)$ . We define the labeling $\mathcal{L}(K)$ of $K$ as $(a_{1},a_{2},\ldots,a_{n})\in[n]^{n}$ . Additionally, when the context for $S$ is clear, we let $\mathcal{L}(\sigma)$ denote the labeling of the equivalence class of $\sigma$ .

In order to show that no two equivalence classes share a labeling, we prove the following lemma.

Lemma 4.5.

Suppose that there is a tree $T$ whose vertices are $v_{1},v_{2},\ldots,v_{m}$ . We begin by choosing a set of initial values $p_{0}(v_{i})$ , and assigning $p(v_{i})=p_{0}(v_{i})$ for each vertex $v_{i}$ . We are allowed to perform operations which exchange the values of $p(v_{i})$ and $p(v_{j})$ for adjacent vertices $v_{i},v_{j}$ . Then for any permutation $\sigma\in S_{m}$ , there is a sequence of operations resulting in $p(v_{i})=p_{0}(v_{\sigma(i)})$ for all $i\in[m]$ .

Figure VI: An illustration of Lemma 4.5.

Proof.

Without loss of generality, we assume that $p_{0}(v_{i})=i$ for $i\in[m]$ . Consider the following algorithm: pick a leaf node $v_{i}$ . We want to ensure that $p(v_{i})=\sigma(i)$ . We view $T$ as a rooted tree with $v_{i}$ as its root node and perform operations swapping $\sigma(i)$ up the tree until $p(v_{i})=\sigma(i)$ (see Figure VI). Now, we can remove $v_{i}$ from $T$ . We repeat this process until we have assigned $\sigma(i)$ to every $v_{i}$ . ∎

Theorem 4.6.

Let $S\subseteq T$ . The function $\mathcal{L}:S_{n}/S\to[n]^{n}$ is injective.

Proof.

Let a labeling be $L=(a_{1},a_{2},\ldots,a_{n})$ , and suppose $\mathcal{L}(K)=L$ for some equivalence class $K\in S_{n}/S$ . By definition, $a_{i}$ denotes the minimum $j$ such that $\sigma(j)=i$ over all $\sigma\in K$ . Let $\sigma_{0}$ be a fixed element of $K$ . If $T_{uv}\in S$ , then $a_{\sigma_{0}(u)}=a_{\sigma_{0}(v)}$ . Define $A_{k}=\{i|a_{i}=k\}$ for each $k\in[n]$ . Let $G=(V,E)$ be a graph such that $V=[n]$ and $(i,j)\in E$ if and only if $T_{ij}\in S$ .

For each transposition $T_{uv}\in S$ , both $u,v$ lie in the same set $\sigma_{0}^{-1}(A_{k})$ . Conversely, if $a_{\sigma_{0}(u)}=a_{\sigma_{0}(v)}$ then there exists a composition of transpositions in $S$ mapping $u$ to $a_{\sigma_{0}(u)}$ then to $v$ . Thus each nonempty set $\sigma_{0}^{-1}(A_{k})$ is a connected component of $G$ .

As a result, any $\sigma\in K$ must satisfy $\sigma(\sigma_{0}^{-1}(A_{k}))=A_{k}$ for all $k$ . In particular, the subgraph of $G$ induced by $\sigma_{0}^{-1}(A_{k})$ is connected and has a spanning tree, so by Lemma 4.5, all permutations in $S_{n}$ satisfying this set of conditions must also lie in $K$ . Hence $K$ is uniquely determined by $L$ and $S$ , since the sets $\sigma_{0}^{-1}(A_{k})$ are determined by $S$ as the connected components of $G$ and the sets $A_{k}$ are determined by $L$ . ∎

Repeated applications of Theorem 4.3 show that if $S$ contains the transpositions corresponding to each middle CNOT gate in a CNOT decomposition, then the parity of $|K\cap S(M)|$ is preserved throughout the entire CNOT decomposition. Although this condition is powerful, there are many possible equivalence classes $K$ based on the choice of CNOT gates, making it impractical to check the condition. In order to make the lower bound on middle gates computable in polynomial time, we should eliminate both the dependence on $K$ and $S(M)$ .

4.2 The $c_{\textrm{perfect}}$ Algorithm

We introduce a consequence of Theorem 4.3 which more directly allows us to find a lower bound on the number of middle gates without depending on the equivalence classes $K$ in Theorem 4.3.

Theorem 4.7.

Let $M_{1},M_{2},\ldots,M_{k+1}$ be a CNOT synthesis of $n\times n$ matrix $M$ . Let $M^{\prime}=\textbf{I}+\sum_{\sigma\in S(M)}P_{\sigma}$ . Let $G=(V,E)$ a graph such that $V=[n]$ and $(i,j)\in E$ if and only if there is some middle gate involved in the CNOT synthesis adding between rows $i,j$ . Then for any connected component $A$ of $G$ , let $\textbf{a}=(\textbf{1}_{1\in A},\textbf{1}_{2\in A},\ldots,\textbf{1}_{n\in A})$ . Then, $\textbf{a}M^{\prime}=\textbf{0}$ .

Proof.

First, define $S=\{T_{ij}|(i,j)\in E\}$ . For every equivalence class $K\in S_{n}/S$ , by Theorem 4.3, we have $|K\cap S(M_{1})|\equiv|K\cap S(M_{k+1})|\pmod{2}$ since non-middle gates in the sequence do not affect the set of rivers.

Let $S^{\prime}=(S(M_{k+1})\setminus S(M_{1}))\cup(S(M_{1})\setminus S(M_{k+1}))$ be obtained by either removing or adding $e$ to $S(M_{k+1})$ depending on whether $e\in S(M_{k+1})$ . Let $a_{ij}$ denote the number of rivers $\sigma\in S^{\prime}$ such that the $j$ th term of $\mathcal{L}(\sigma)$ is equal to $i$ . Since each individual labeling must appear an even number of times in $S^{\prime}$ , $a_{ij}$ is even for all $i,j$ .

Now, for each permutation $\sigma$ , note that the $j$ th entry of $\sigma^{-1}$ , which we denote $u$ , represents the position of $\sigma$ in which $j$ appears. Furthermore, the $j$ th entry of $\mathcal{L}(\sigma)$ , which we denote $v$ , represents the minimum position of $j$ over all permutations in the equivalence class of $\sigma$ . From the proof of Theorem 4.6, $v$ is the minimum vertex in the connected component of $u$ in $G$ .

Thus each $a_{ij}\neq 0$ counts exactly the number of rivers $\sigma\in S^{\prime}$ such that $\sigma^{-1}(j)$ lies in the connected component of $i$ in $G$ . Notably, $a_{ij}=0$ whenever $i$ is not the smallest vertex in its connected component.

Let us further define $b_{ij}$ to be the number of rivers $\sigma\in S^{\prime}$ such that $\sigma^{-1}(j)=i$ . Then

a_{ij}=\begin{cases}\sum\limits_{k\in A}b_{kj}&i=\min A,\\ 0&\textrm{otherwise,}\end{cases}

where $A$ is the connected component of $i$ in $G$ .

Now, since every $a_{ij}$ is even, for any connected component $A$ of $G$ , we must have that $\sum_{k\in A}b_{kj}$ is even. Notice that in the matrix $M^{\prime}$ , the $(k,j)$ entry denotes exactly the parity of the number of rivers $\sigma\in S^{\prime}$ such that $\sigma^{-1}(j)=k$ , so $M^{\prime}_{k,j}=b_{kj}$ . Hence, the sum of the rows of $M^{\prime}$ whose indices are in $A$ is exactly $\mathbf{0}$ . The result follows.

∎

Although Theorem 4.7 gives a condition for the connected components of $G$ that can be used to lower bound the number of middle CNOT gates, the time needed to compute $M^{\prime}$ is $O(n\cdot n!)$ as it would be necessary to loop through every river of $M$ . We now introduce a shortcut that decreases this time to $O(n^{\omega})$ .

Lemma 4.8.

Let $M$ be an $n\times n$ reversible binary matrix. Then $M\land M^{-\top}+\textbf{I}=\textbf{I}+\sum_{\sigma\in S(M)}P_{\sigma}$ .

Proof.

It is well known that $M^{-1}=\frac{1}{\det M}C^{\top}$ where $C$ is the cofactor matrix of $M$ . Thus, the cofactor matrix of $M$ can be computed as $M^{-\top}$ when $M$ is a reversible binary matrix. We say a river $\sigma\in S(M)$ passes through the $(i,j)$ -entry of $M$ if $\sigma(i)=j$ . Notice that rivers can only pass through nonzero entries, and the parity of the number of rivers passing through any such entry is equal to the $(i,j)$ -cofactor. Therefore the parity of the number of rivers in $S(M)$ passing through any given square is given by the matrix $M\land M^{-\top}$ . Finally, we add $\mathbf{I}$ to balance the equation. ∎

Combining Theorem 4.7 and Lemma 4.8 gives us an $O(n^{\omega})$ algorithm shedding light on the connected components formed by the middle CNOT gates. This yields a straightforward lower bound on the number of middle gates.

Theorem 4.9 ( $c_{\textnormal{perfect}}$ algorithm).

Given $n\times n$ matrix $M$ and a CNOT synthesis $M_{1},M_{2},\ldots,M_{k+1}$ of $M$ , let $M^{\prime}=M\land M^{-\top}+\mathbf{I}$ . Let $G=(V,E)$ a graph such that $V=[n]$ and $(i,j)\in E$ if and only if there is some middle gate involved in the CNOT synthesis adding between rows $i,j$ . Then $G$ contains at most $c_{\textnormal{perfect}}(M):=\frac{n+2\cdot\textrm{Emp}\,M^{\prime}+\textrm{Dup}\,M^{\prime}}{3}$ connected components.

Proof.

By Theorem 4.7, the sum of the rows of $M^{\prime}$ corresponding to each connected component is zero. There are at most $\textrm{Emp}\,M^{\prime}$ rows contained in connected components of size $1$ and at most $\textrm{Emp}\,M^{\prime}+2\cdot\textrm{Dup}\,M^{\prime}$ rows contained in connected components of size $\leq 2$ . Let $x$ be the number of size $1$ connected components and let $y$ be the number of size $2$ connected components. Then there must be at most $x+y+\frac{n-x-2y}{3}=\frac{2x+y+n}{3}$ connected components in $G$ . Since $x+2y\leq\textrm{Emp}\,M^{\prime}+2\cdot\textrm{Dup}\,M^{\prime}$ and $x\leq\textrm{Emp}\,M^{\prime}$ ,

	$\displaystyle\frac{2x+y+n}{3}$	$\displaystyle=\frac{3x+(x+2y)+2n}{6}$
		$\displaystyle\leq\frac{3\cdot\textrm{Emp}\,M^{\prime}+(\textrm{Emp}\,M^{\prime}+2\cdot\textrm{Dup}\,M^{\prime})+2n}{6}$
		$\displaystyle=\frac{n+2\cdot\textrm{Emp}\,M^{\prime}+\textrm{Dup}\,M^{\prime}}{3}.$

∎

We remark that we can find $\textrm{Emp}\,M^{\prime}$ and $\textrm{Dup}\,M^{\prime}$ in $O(n^{2})$ time by sorting the rows of $M^{\prime}$ .

Corollary 4.10.

The number of middle gates in any CNOT decomposition of $n\times n$ matrix $M$ is at least $n-c_{\textnormal{perfect}}(M)$ . Furthermore, $n-c_{\textnormal{perfect}}(M)$ can be computed in $O(n^{\omega})$ time.

5 The LMC Bound

We now combine the lower bounds found in Sections 3, 4 alongside other minor optimizations. We begin by proving that certain matrix operations preserve the value of $s(M)$ .

Theorem 5.1.

For any $n\times n$ reversible binary matrix $M$ and any $n\times n$ permutation matrix $P$ , we have $s(M)=s(M^{-1})=s(M^{\top})=s(P^{-1}MP)$ .

Proof.

Consider a CNOT decomposition $M=R_{1}R_{2}\cdots R_{s(M)}$ . Taking the inverse of $M$ , we obtain a CNOT decomposition $M^{-1}=R_{s(M)}\cdots R_{2}R_{1}$ . By symmetry, $s(M)=s(M^{-1})$ . Taking the transpose of $M$ , we obtain a CNOT decomposition $M^{\top}=R_{s(M)}^{\top}\cdots R_{2}^{\top}R_{1}^{\top}$ . By symmetry, $s(M)=s(M^{\top})$ . Finally, conjugating $M$ by $P$ , we obtain a CNOT decomposition $P^{-1}MP=R^{\prime}_{1}R^{\prime}_{2}\cdots R^{\prime}_{s(M)}$ where $R^{\prime}_{i}=P^{-1}R_{i}P$ is a CNOT gate. By symmetry, $s(M)=s(P^{-1}MP)$ , completing the proof. ∎

Next, we show the number of zeroes on the main diagonal of $M$ gives a second lower bound on the total number of non-link gates in any CNOT decomposition of $M$ .

Theorem 5.2.

Let $M_{1},M_{2},\ldots,M_{k+1}$ be a CNOT synthesis of $n\times n$ matrix $M$ . If $M$ has $\#_{0}(M)$ zeroes on its main diagonal, then there must be at least $\#_{0}(M)$ CNOT operations involved in the synthesis which are not link gates.

Proof.

In the identity, there are no zeroes on the main diagonal. Since any CNOT operation changes only one row, we can add at most one zero at a time to the main diagonal. Thus it suffices to show that link gates don’t affect the main diagonal.

In the proof of Lemma 3.4, we showed that for any link gate adding row $i$ to row $j$ , we must have $i,j$ in different connected components of $G_{v}(M)$ . This implies that $M_{i,j}=0$ , so the element $M_{j,j}$ on the main diagonal will not change. Since link gates cannot change the diagonal, we need at least $\#_{0}(M)$ other CNOT operations in order to synthesize $M$ , as desired. ∎

Finally, we obtain the following lower bound on the size of CNOT circuits.

Theorem 5.3 (LMC Bound).

For an $n\times n$ reversible binary matrix $M$ , compute $\ell=n-v(M)$ ,
$m=n-\min\{c_{\textnormal{perfect}}(M),c_{\textnormal{perfect}}(M^{\top})\}$ , and $c=e(M)-v(M)$ . Then

s(M)\geq\ell+\max\left\{m+c,\#_{0}(M),\#_{0}(M^{-1})\right\}.

Proof.

By Lemma 3.5, the sets of link, middle, and cut gates are disjoint. From Lemma 3.6, the number of link gates is at least $n-v(M)$ and the number of cut gates is at least $e(M)-v(M)$ . From Corollary 4.10, the number of middle gates is at least $n-c_{\textrm{perfect}}(M)$ . By Theorem 5.2, at least $\#_{0}(M)$ non-link gates are required. Combining these results shows that $s(M)\geq\ell+\max\{n-c_{\textrm{perfect}}(M)+c,\#_{0}(M)\}$ . Since $s(M)=s(M^{-1})=s(M^{\top})=s(M^{-\top})$ by Theorem 5.1, the result follows from taking the maximum of these lower bounds. ∎

5.1 Cycle and Permutation Matrices

We now apply Theorem 5.3 to obtain exact values on the size of all $n$ -cycle and permutation matrices, as well as the exact depth of $n$ -cycle matrices for $n\geq 7$ .

Theorem 5.4.

For any $n$ -cycle $\sigma\in S_{n}$ , $s(P_{\sigma})=3(n-1)$ . Furthermore, let $M_{1},M_{2},\ldots,M_{3n-2}$ be a CNOT synthesis of $P_{\sigma}$ . Then the sets of link, middle, and cut gates involved in the CNOT synthesis each have size $n-1$ and form a spanning tree on the set of qubits.

Proof.

Applying Theorem 5.3, we note that $v(P_{\sigma})=1$ and $e(P_{\sigma})=n$ . Applying Theorem 4.7 and Lemma 4.8, we have $P^{\prime}_{\sigma}=P_{\sigma}\land P_{\sigma}^{-\top}+\mathbf{I}=P_{\sigma}+\mathbf{I}$ . In particular, the only vector $v$ such that $vP^{\prime}_{\sigma}=\mathbf{0}$ is $\mathbf{1}$ , so the graph $G=(V,E)$ , where $V=[n]$ and $(i,j)\in E$ if and only if there is a middle gate adding between rows $i,j$ , must be connected. Thus $c_{\textrm{perfect}}(M)=1$ and $s(P_{\sigma})\geq 3(n-1)$ . Figure VII gives various constructions of $s(P_{\sigma})=3(n-1)$ alongside the spanning tree structures of the link, middle, and cut gates respectively. We add three original constructions which do not use the swapping method in addition to those given in [17, 15, 21].

Pseudocode	L, M, C ordering	Link	Middle	Cut
for $i=2$ to $n$ do CNOT( $1$ , $i$ ) CNOT( $i$ , $1$ ) CNOT( $1$ , $i$ ) end for	$\overbrace{\textrm{LMCLMC}\cdots\textrm{LMC}}^{3(n-1)}$	Star	Star	Star
for $i=1$ to $n-1$ do CNOT( $n-i$ , $n-i+1$ ) CNOT( $n-i+1$ , $n-i$ ) CNOT( $n-i$ , $n-i+1$ ) end for	$\overbrace{\textrm{LMCLMC}\cdots\textrm{LMC}}^{3(n-1)}$	Path	Path	Path
for $i=1$ to $n-1$ do CNOT( $i$ , $i+1$ ) end for for $i=1$ to $n-1$ do CNOT( $i+1$ , $i$ ) end for for $i=1$ to $n-1$ do CNOT( $i$ , $n$ ) end for	$\overbrace{\textrm{LL}\cdots\textrm{L}}^{n-1}\overbrace{\textrm{MM}\cdots\textrm{M}}^{n-1}\overbrace{\textrm{CC}\cdots\textrm{C}}^{n-1}$	Star	Path	Star
for $i=1$ to $n-1$ do CNOT( $n-i$ , $n$ ) CNOT( $n$ , $n-i$ ) end for CNOT( $1$ , $n$ ) for $i=1$ to $n-2$ do CNOT( $i+1$ , $i$ ) end for	$\overbrace{\textrm{LMLM}\cdots\textrm{LM}}^{2(n-1)}\overbrace{\textrm{CC}\cdots\textrm{C}}^{n-1}$	Star	Star	Path
for $i=2$ to $n$ do CNOT( $1$ , $i$ ) end for for $i=1$ to $n-1$ do CNOT( $i+1$ , $i$ ) CNOT( $i$ , $i+1$ ) end for	$\overbrace{\textrm{LL}\cdots\textrm{L}}^{n-1}\overbrace{\textrm{MCMC}\cdots\textrm{MC}}^{2(n-1)}$	Star	Path	Path

Figure VII: Table of various constructions to synthesize an

n

-cycle circuit in

3(n-1)

CNOT operations, alongside their decomposition into link, middle, and cut gates.

Therefore $s(P_{\sigma})=3(n-1)$ . Now, consider a CNOT synthesis $M_{1},M_{2},\ldots,M_{3n-2}$ of $P_{\sigma}$ . There must be exactly $n-1$ link, middle, and cut gates respectively. Because $G$ is connected, the set of middle gates forms a spanning tree on the set of qubits. Now assume for the sake of contradiction that the set of link gates does not form a spanning tree on the set of qubits. Then there is a partition $[n]=A\sqcup B$ such that there is never a CNOT operation between a row in $A$ and a row in $B$ , as the first such gate would be a link gate. However, this implies that $P_{\sigma}$ is separable, giving contradiction.

Finally, assume for the sake of contradiction that the set of cut gates does not form a spanning tree over the $n$ qubits. Then there exists a partition $[n]=A\sqcup B$ such that there is no cut gate between any $a\in A$ and $b\in B$ . There must be a link gate adding a row $a\in A$ to a row $b\in B$ , as the link gates form a spanning tree. This means that at some point, $R_{a},R_{b}$ are connected by this link gate. However, if there is no cut gate between $A$ and $B$ , then there will always exist a row of $A$ and a row of $B$ that are connected in $G_{e}$ after this point, giving contradiction, as no two rows are connected in $G_{e}(P_{\sigma})$ . ∎

Moore and Nilsson [21] showed that the depth of any permutation circuit was at most $6$ by giving a construction using parallel CNOT gates. We demonstrate that this number is optimal for large $n$ -cycle circuits.

Corollary 5.5.

Let $n\geq 7$ . Then $\max_{\sigma\in S_{n}}d(P_{\sigma})=6$ , and $d(P_{\sigma})=6$ when $\sigma$ is an $n$ -cycle.

Proof.

Note that any parallel CNOT operation can be decomposed into at most $\left\lfloor\frac{n}{2}\right\rfloor$ CNOT gates. Thus, we have

d(M)\geq\left\lceil\frac{s(M)}{\left\lfloor\frac{n}{2}\right\rfloor}\right\rceil

for any $n\times n$ reversible binary matrix $M$ . Hence, for $n\geq 7$ , any $n$ -cycle $\sigma$ satisfies $d(P_{\sigma})\geq 6$ . Finally, the construction given in [21] shows that $d(M)\leq 6$ for any choice of $P_{\sigma}$ . ∎

Theorem 5.6.

For any $\sigma\in S_{n}$ , we have $s(P_{\sigma})=3(n-k)$ , where $k$ is the number of disjoint cycles in $\sigma$ .

Proof.

Using the construction from Theorem 5.4 on each disjoint cycle of $\sigma$ , we achieve a synthesis of $P_{\sigma}$ in $3(n-k)$ CNOT gates, showing that $s(P_{\sigma})\leq 3(n-k)$ .

Now for the sake of contradiction assume that there exists some $\sigma\in S_{n}$ with minimal $k$ that satisfies $s(P_{\sigma})<3(n-k)$ . Theorem 5.4 implies that $k\geq 2$ . Now, choose qubits $i,j$ in disjoint cycles of $\sigma$ . Appending $\textrm{CNOT}(i,j),\textrm{CNOT}(j,i),\textrm{CNOT}(i,j)$ in that order to the end of the circuit synthesizing $P_{\sigma}$ has the effect of swapping rows $i,j$ . However, as illustrated in Figure VIII, this synthesizes another permutation matrix $P_{\sigma^{\prime}}$ such that $\sigma^{\prime}$ has one fewer disjoint cycle than $\sigma$ and additionally $s(P_{\sigma^{\prime}})<3(n-k+1)$ . This either contradicts the minimality of $k$ or Theorem 5.4, finishing. ∎

Figure VIII: The effect of adding a swap operation (3 CNOT gates) between two disjoint cycles.

6 Linkability

In this section, we introduce the linkability problem, a subclass of the global minimization problem proposed in [10]. Specifically, we determine whether it is possible to synthesize an arbitrary circuit $M$ with $v(M)=1$ in at most $n-1$ gates in $O(n^{\omega})$ time. We begin by introducing a few notions used in the algorithm.

Lemma 6.1.

Consider a CNOT decomposition of $n\times n$ matrix $M$ . If every CNOT gate is a link gate, then the influence graph $\mathcal{I}(M)$ forms a directed acyclic graph.

Proof.

Assume that there is a directed cycle $x_{1}\to x_{2}\to\cdots\to x_{m}\to x_{1}$ in $\mathcal{I}(M)$ . Let indices be taken modulo $m$ . The permutation

\sigma(x)=\begin{cases}x_{i+1}&x=x_{i},i\in[m],\\ x&\textrm{otherwise,}\end{cases}

satisfies $\sigma\in S(M)$ . However, this means $S(M)\neq S(\textbf{I})$ while link gates do not affect the set of rivers by Lemma 3.5, giving contradiction. ∎

We recall the following result in graph theory, which gives us the time complexity for computing the transitive reduction of an arbitrary graph.

Theorem 6.2 (Aho et al. [22] and Furman [23]).

The transitive reduction of a directed graph $G$ with $n$ vertices can be computed in $O(n^{\omega})$ time.

Now, we present the algorithm for the linkability problem in its entirety.

Theorem 6.3.

Given an $n\times n$ reversible binary matrix $M$ such that $v(M)=1$ , it is possible to determine whether $s(M)\leq n-1$ in $O(n^{\omega})$ time.

Proof.

Step 1:

We begin by assuming that $M$ satisfies $s(M)\leq n-1$ . Then by Theorem 5.3, $s(M)=n-1$ . By assumption, there exists a CNOT synthesis $M_{1},M_{2},\ldots,M_{n}$ of $M$ . It is clear that all the CNOT gates must be link gates since $v(M)=1$ . We compute the influence graph $\mathcal{I}(M)=(V,A)$ , which takes $O(n^{2})$ time.
Step 2:

Let $\mathcal{I^{\prime}}=(V,A^{\prime})$ be the transitive reduction of $\mathcal{I}$ . We claim that for each $(i,j)\in A^{\prime}$ , one of the $n-1$ CNOT operations used in the synthesis of $M$ must add row $i$ to row $j$ . We prove this as follows. If $(i,j)\in A^{\prime}$ , there must not have been a directed path from $i$ to $j$ of length $\geq 2$ in $\mathcal{I}$ . Assume that no CNOT operation adds row $i$ to row $j$ . Then, there exists a sequence $k_{1},k_{2},\ldots,k_{s}$ with $k_{1}=i$ and $k_{s}=j$ such that there is a CNOT gate adding row $k_{i}$ to $k_{i+1}$ for all $i\in[s-1]$ . However, this implies each $(k_{i},k_{i+1})\in A$ as link gates cannot remove nonzero entries. This is a directed path of length $\geq 2$ from $i$ to $j$ , giving contradiction and completing the proof.

Hence $\mathcal{I^{\prime}}$ both maintains connectivity and has at most $n-1$ edges, so it must be a spanning tree. The exact set of CNOT gates can thus be read off of $A^{\prime}$ .
Step 3:

Now, we determine the order in which these CNOT operations took place. For any 3 vertices $(u,v,w)$ in $\mathcal{I^{\prime}}$ where $(u,v),(v,w)\in A^{\prime}$ , it is easy to see that $M_{w,u}=1$ if and only if $\textrm{CNOT}(u,v)$ happens before $\textrm{CNOT}(v,w)$ , as $(u,w)\notin A^{\prime}$ and no other directed path from $u$ to $w$ exists in $\mathcal{I^{\prime}}$ .
Step 4:

From the previous step, we know the relative ordering of all adjacent CNOT gates. Any ordering of the $n-1$ CNOT gates respecting these relative orders synthesizes the same matrix. Thus, we can pick any such full ordering and perform the CNOT operations step by step on I to yield a matrix $N$ . This entire process requires $O(n^{2})$ time. If $M=N$ , we have just found a construction to obtain $M$ in exactly $n-1$ CNOT gates. Otherwise, if $M\neq N$ , this is a contradiction on the original assumption that $s(M)\leq n-1$ , finishing.

The overall time complexity of this algorithm is $O(n^{2})+O(n^{\omega})+O(n^{2})+O(n^{2})=O(n^{\omega})$ , as desired. ∎

7 Results

For $n\leq 5$ , the size of all linear reversible CNOT circuits can be easily found using a BFS algorithm over the set of all $n$ -qubit circuits. In Table 1, we provide various metrics on how well the lower bound on size from Theorem 5.3 matches the exact size for $n$ -qubit circuits. In Figure IX, we provide heatmaps for the confusion matrix between the lower bound and the exact size. The raw data used can be found in Appendix A.

Metric	$n\leq 3$	$n=4$	$n=5$
$\Delta=0$	$100\%$	$67.7\%$	$23.1\%$
$\Delta\leq 1$	$100\%$	$99.5\%$	$83.0\%$
$\Delta\leq 2$	$100\%$	$100\%$	$99.7\%$
$\sigma$	0	0.581	1.137
$MAD$	0	0.328	0.941
$R^{2}$	1	0.811	0.649
$PCC$	1	0.901	0.806

Table 1: Various metrics on Theorem 5.3 for

n

-qubit circuits (here,

\Delta

denotes the deviation between lower bound and actual size).

Refer to caption — Figure IX: Column-by-column heatmaps for $n=3,4,5$ between the lower bound in Theorem 5.3 (on the vertical) and the size of the circuit (on the horizontal).

Although our lower bound performs well at various metrics for $n\leq 5$ , Jiang et al. show in [10] that as $n$ grows larger, the average size of an arbitrary $n\times n$ binary reversible circuit grows to $O(\frac{n^{2}}{\log n})$ while our lower bound can be at most $3(n-1)$ . This means that although our lower bound still holds for large $n$ , it cannot be used as a size-approximation algorithm for large $n$ without further improvements.

8 Conclusion and Future Directions

In this paper, we introduce a categorization of the gates of CNOT decompositions into link, middle, and cut gates. Under this framework, we derive lower bounds on the number of CNOT gates in each category and combine them to obtain a lower bound on the total number of CNOT gates needed to synthesize a given circuit that avoids brute-force search and can be obtained in $O(n^{\omega})$ time. We apply our framework to the problem of synthesizing the $n$ -cycle matrix to show that known constructions with $3(n-1)$ CNOT gates are optimal, and further show the size of any permutation matrix is $3(n-k)$ , where $k$ is the number of disjoint cycles in the permutation.

We introduce an $O(n^{\omega})$ time algorithm determining whether a given circuit with $v(M)=1$ can be synthesized in fewer than $n$ CNOT operations, resolving a subclass of the minimum circuit size problem. We hope the methods and ideas developed in this paper will aid the search for further polynomial-time lower bounds on the size of CNOT circuits, as well as demonstrate the optimality or near-optimality of known CNOT decompositions.

To end, we present several interesting future directions on both the minimum circuit size problem as well as for improving our lower bound in Theorem 5.3.

Problem 8.1.

In [3], an information theoretic bound is used to show that the maximum size of a circuit on $n$ qubits is at least $\frac{n^{2}}{2\log_{2}n}$ . However, in the cases $n\leq 5$ , the maximum size of an $n$ -qubit circuit is always $3(n-1)$ , attained by the $n$ -cycle circuit. This is plausible for all $n\leq 27$ , as we have $\frac{n^{2}}{2\log_{2}n}\leq 3(n-1)$ . At what point does the asymptotic bound overtake the $3(n-1)$ pattern?

Problem 8.2.

In the $n=5$ case, a single circuit, alongside those equivalent to it through Theorem 5.1, lies far off the main diagonal in the table in Appendix A. It turns out that $M\land M^{-\top}+\mathbf{I}=\mathbf{0}$ (see Figure X), rendering the $c_{\textrm{perfect}}$ bound ineffective. Can we improve the $c_{\textrm{perfect}}$ algorithm to account for this “ $c_{\textrm{perfect}}$ glitch” and obtain a more powerful lower bound on the number of middle CNOT gates?

$\begin{bmatrix}1&0&0&1&1\\ 0&1&1&0&1\\ 0&1&1&1&0\\ 1&0&1&1&0\\ 1&1&0&0&1\end{bmatrix}$

(a) Circuit M

12345 45231
12435 45312
13245 52341
15342 52431
15432 53241
42315 53412
43215

(b) Rivers

S(M)

M

$\begin{bmatrix}0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\end{bmatrix}$

M\land M^{-\top}+\mathbf{I}

Figure X: The

c_{\textrm{perfect}}

glitch

Problem 8.3.

In addition to permutations, are there other classes of circuits for which the bound in Theorem 5.3 is optimal?

Problem 8.4.

Is it possible to extend our methods to circuits with ancillae, topological constraints, or one-qubit gates such as $T$ gates and Hadamard gates?

9 Acknowledgements

We thank Yunseo Choi and Kevin Cong for their valuable feedback.

References

[1] A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. A. Smolin, and H. Weinfurter, “Elementary gates for quantum computation,” Phys. Rev. A, vol. 52, no. 5, p. 3457, 1995.
[2] D. P. DiVincenzo, “Two-bit gates are universal for quantum computation,” Phys. Rev. A, vol. 51, no. 2, p. 1015, 1995.
[3] K. N. Patel, I. L. Markov, and J. P. Hayes, “Efficient synthesis of linear reversible circuits,” arXiv preprint quant-ph/0302002, 2003.
[4] F. Vatan and C. Williams, “Optimal quantum circuits for general two-qubit gates,” Phys. Rev. A—Atomic, Molecular, and Optical Physics, vol. 69, no. 3, p. 032315, 2004.
[5] V. V. Shende, I. L. Markov, and S. S. Bullock, “Smaller two-qubit circuits for quantum communication and computation,” in proceedings design, automation and test in europe conference and exhibition, vol. 2, pp. 980–985, IEEE, 2004.
[6] V. V. Shende, S. S. Bullock, and I. L. Markov, “Synthesis of quantum logic circuits,” in proceedings of the 2005 Asia and South Pacific Design Automation Conference, pp. 272–275, 2005.
[7] J. J. Vartiainen, M. Möttönen, and M. M. Salomaa, “Efficient decomposition of quantum gates,” Phys. Rev. Letters, vol. 92, no. 17, p. 177902, 2004.
[8] E. Knill, “Approximation by quantum circuits,” arXiv preprint quant-ph/9508006, 1995.
[9] B. Nash, V. Gheorghiu, and M. Mosca, “Quantum circuit optimizations for nisq architectures,” Quantum Science and Technology, vol. 5, no. 2, p. 025010, 2020.
[10] J. Jiang, X. Sun, S.-H. Teng, B. Wu, K. Wu, and J. Zhang, “Optimal space-depth trade-off of cnot circuits in quantum logic synthesis,” in Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 213–229, SIAM, 2020.
[11] Y. Kang and H. Ma, “Cnot-optimal circuit synthesis,” 2023.
[12] X. Liu, H. Yang, and L. Yang, “Minimizing cnot-count in quantum circuit of the extended shor’s algorithm for ecdlp,” Cybersecurity, vol. 6, no. 1, p. 48, 2023.
[13] K.-A. Brickman, P. Haljan, P. Lee, M. Acton, L. Deslauriers, and C. Monroe, “Implementation of grover’s quantum search algorithm in a scalable system,” Phys. Rev. A—Atomic, Molecular, and Optical Physics, vol. 72, no. 5, p. 050306, 2005.
[14] R. Duan, H. Wu, and R. Zhou, “Faster matrix multiplication via asymmetric hashing,” in 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pp. 2129–2138, IEEE, 2023.
[15] J. Liu, Y. Ren, Y. Cao, H. Sun, and L. Chen, “Realization of permutation groups by quantum circuit,” arXiv preprint arXiv:2406.01350, 2024.
[16] M. Planat and R. Ul Haq, “The magic of universal quantum computing with permutations,” Advances in Mathematical Physics, vol. 2017, no. 1, p. 5287862, 2017.
[17] M. Bataille, “Quantum circuits of cnot gates: optimization and entanglement,” Quantum Info. Processing, vol. 21, no. 7, p. 269, 2022.
[18] C. D. Murray and R. R. Williams, “On the (non) np-hardness of computing circuit complexity,” Theory of Computing, vol. 13, no. 1, pp. 1–22, 2017.
[19] M. Amy, P. Azimzadeh, and M. Mosca, “On the controlled-not complexity of controlled-not–phase circuits,” Quantum Science and Technology, vol. 4, no. 1, p. 015002, 2018.
[20] D. Maslov and B. Zindorf, “Depth optimization of cz, cnot, and clifford circuits,” IEEE Trans. on Quantum Engineering, vol. 3, pp. 1–8, 2022.
[21] C. Moore and M. Nilsson, “Parallel quantum computation and quantum codes,” SIAM Journal on Computing, vol. 31, no. 3, pp. 799–815, 2001.
[22] A. V. Aho, M. R. Garey, and J. D. Ullman, “The transitive reduction of a directed graph,” SIAM Journal on Computing, vol. 1, no. 2, pp. 131–137, 1972.
[23] M. Furman, “Application of a method of rapid multiplication of matrices to the problem of finding the transitive closure of a graph,” in Doklady Akademii Nauk, vol. 194, pp. 524–524, Russian Academy of Sciences, 1970.

Appendix A Performance of Theorem 5.3 for Circuits with $4$ or $5$ Qubits

In this appendix, the lower bound given in Theorem 5.3 is plotted against the exact size of the circuit in the $4$ and $5$ -qubit cases, with the entry in row $i$ and column $j$ denoting the number of linear reversible circuits with lower bound $i$ and size $j$ . For $n\leq 3$ , equality holds.

	0	1	2	3	4	5	6	7	8	9
0	1
1		12
2			96
3				542	138
4					1920	756	12
5						4560	2589	84
6							4929	2464
7								1510	469
8									72
9										6

Figure XI: The distribution of

4

-qubit circuits, with the lower bound in Theorem 5.3 plotted on the vertical and the size of the circuit plotted on the horizontal.

	0	1	2	3	4	5	6	7	8	9	10	11	12

0	1
1		20
2			260
3				2570	690
4					18990	20540	3640			12
5						97320	176505	37880	1320
6							360325	917960	322750	11580
7								813870	2448985	925340	15840
8									798120	2072000	365540
9										216378	350120	13340
10											5040	1920
11												480
12													24

Figure XII: The distribution of

5

-qubit circuits, with the lower bound in Theorem 5.3 plotted on the vertical and the size of the circuit plotted on the horizontal.

	$\displaystyle\|K\cap S(M)\|$	$\displaystyle=\sum\|\{\sigma,T_{ij}\circ\sigma\}\cap S(M)\|$
		$\displaystyle\equiv\sum\|\{\sigma,T_{ij}\circ\sigma\}\cap S(N)\|$
		$\displaystyle=\|K\cap S(N)\|\pmod{2}.$

Minimum Synthesis Cost of CNOT Circuits

Abstract

1 Introduction

1.1 Main Results

1.2 Organization of the Paper

2 Preliminaries and Notations

2.1 CNOT Synthesis of Circuits

Definition 2.1.

Definition 2.2.

Definition 2.3.

Definition 2.4.

2.2 Circuit-derived Graphs

Definition 2.5.

Definition 2.6.

Definition 2.7.

2.3 Influence Graphs

Definition 2.8.

Definition 2.9.

3 Link, Middle, and Cut CNOT Gates

Definition 3.1.

Definition 3.2.

Proposition 3.3.

Proof.

Lemma 3.4.

Proof.

Lemma 3.5.

Proof.

Lemma 3.6.

Proof.

4 Lower Bounds on Middle Gates

4.1 Equivalence Classes of Rivers

Lemma 4.1.

Proof.

Definition 4.2.

Theorem 4.3.

Proof.

Definition 4.4.

Lemma 4.5.

Proof.

Theorem 4.6.

Proof.

4.2 The cperfectc_{\textrm{perfect}} Algorithm

Theorem 4.7.

Proof.

Lemma 4.8.

Proof.

Theorem 4.9 (cperfectc_{\textnormal{perfect}} algorithm).

Proof.

Corollary 4.10.

5 The LMC Bound

Theorem 5.1.

Proof.

Theorem 5.2.

Proof.

Theorem 5.3 (LMC Bound).

Proof.

5.1 Cycle and Permutation Matrices

Theorem 5.4.

Proof.

Corollary 5.5.

Proof.

Theorem 5.6.

Proof.

6 Linkability

Lemma 6.1.

Proof.

Theorem 6.2 (Aho et al. [22] and Furman [23]).

Theorem 6.3.

Proof.

7 Results

8 Conclusion and Future Directions

Problem 8.1.

Problem 8.2.

Problem 8.3.

Problem 8.4.

9 Acknowledgements

References

Appendix A Performance of Theorem 5.3 for Circuits with 44 or 55 Qubits

4.2 The $c_{\textrm{perfect}}$ Algorithm

Theorem 4.9 ( $c_{\textnormal{perfect}}$ algorithm).

Appendix A Performance of Theorem 5.3 for Circuits with $4$ or $5$ Qubits