Partial Equivalence Checking of Quantum Circuits

Tian-Fu Chen^1,2,3, Jie-Hong R. Jiang^1,2,3,4, and Min-Hsiu Hsieh⁵ ¹Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan ²Graduate School of Advanced Technology, National Taiwan University, Taipei, Taiwan ³Center for Quantum Science and Engineering, National Taiwan University, Taipei, Taiwan ⁴Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan ⁵Quantum Computing Research Center, Hon Hai Research Institute, Taipei, Taiwan [email protected], [email protected], [email protected]

Abstract

Equivalence checking of quantum circuits is an essential element in quantum program compilation, in which a quantum program can be synthesized into different quantum circuits that may vary in the number of qubits, initialization requirements, and output states. Verifying the equivalences among the implementation variants requires proper generality. Although different notions of quantum circuit equivalence have been defined, prior methods cannot check observational equivalence between two quantum circuits whose qubits are partially initialized, which is referred to as partial equivalence. In this work, we prove a necessary and sufficient condition for two circuits to be partially equivalent. Based on the condition, we devise algorithms for checking quantum circuits whose partial equivalence cannot be verified by prior approaches. Experiment results confirm the generality and demonstrate the efficiency and effectiveness of our method. Our result may unleash the optimization power of quantum program compilation to take more aggressive steps.

Index Terms:

Quantum circuit, quantum computing, quantum measurement, equivalence checking

I Introduction

Equivalence checking plays an important role in the design flow of modern integrated circuits. It ensures the transformations done in the corresponding synthesis steps do not introduce errors. Many scalable techniques, such as random simulation, satisfiability solving, decision diagrams, and structural similarity detection, have been developed and widely applied in classical circuit verification [1, 2, 3, 4, 5, 6, 7]. Due to the rapid progress of hardware and software developments in quantum computation, quantum circuit verification, especially equivalence checking, arises as an important issue in quantum program compilation. Although practical quantum circuits remain relatively small until now, it remains challenging to their equivalence checking due to extraordinary quantum properties. Moreover, as quantum computing technology evolves, the size of quantum circuits will continue growing. Efficient equivalence checking tools for quantum circuits are essential.

Many approaches to equivalence checking of quantum circuits have been proposed [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]. Among them, decision diagrams are a widely adopted data structure for quantum state and operator representation and manipulation [8, 9, 10, 13, 19]. Also, techniques, such as miter construction [8, 10, 12, 13, 18] or tensor network computation [11], have been applied. Some of them focus on the equivalence of reversible circuits, which realize permutation functions, whereas others address the equivalence of more general quantum circuits. For the latter, most of them require total equivalence, namely, the identity of the output states (modulo a global phase difference) between the two circuits under verification. However, since measurement is the only way to extract information from a quantum system, total quantum state equivalence may be unnecessary and only observational equivalence matters. That is, two circuits exhibit the same probability for every possible measurement outcome under a fixed measurement basis. Furthermore, not all qubits of a circuit need to be measured and there can be a set of initial states to be verified not just a particular single initial state. To accommodate such different design constraints, we relax the notion of quantum circuit equivalence and define a generalization, called partial equivalence. That is, two circuits are partially equivalent if, given any valid initial input state, they exhibit the same probability for each measurement outcome. Therefore, two partially equivalent circuits do not exhibit statistical differences by observed measurements. Essentially, the generality of partial equivalence allows more flexibility for quantum circuit synthesis. Fig. 1 shows an example where $C_{1}$ and $C_{2}$ are partially equivalent, but not totally equivalent.

Refer to caption — Figure 1: A motivating example of partial equivalence where circuits $C_{1}$ and $C_{2}$ measured on $q_{0}$ have the same probability distribution over the outcomes for any given initial input state.

Apart from its application in quantum circuit synthesis, partial equivalence checking has its natural application in quantum program compilation. As the quantum computing technology evolves, many quantum programming languages have been developed, e.g., [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]. Although nowadays most quantum programming languages are abstracted at the gate level, high-level languages will get their increasing presence in the future. Their compilation may require making trade-offs between ancilla bits and quantum gates. To verify the correctness of different compilation choices may require the defined partial equivalence.

Equivalence based on measurement results is not new, nor is the relaxation requiring only some of the input qubits to be data qubits and some of the output qubits to be measured qubits. In fact, partial equivalence can be seen as the case that all principal output qubits in Definition 3 of [32] are to be measured. However, there are no corresponding checking methods so far, and none of the prior work [18, 19, 32, 33, 34] can fully resolve the partial equivalence checking problem. In particular, equivalence up to a relative phase discussed in [18] can only cover the case that all output qubits are measured; the method considering measurement-based equivalence in [19] is used for reversible circuits only; the initial state is given and fixed in [32] and [33]; sequential circuit verification in [34] is limited to the assumption that the number of measured output qubits cannot be smaller than the number of data input qubits. In this work, we are not constrained by these limitations. Our methods can be applied to general cases of checking partial equivalence relations. Moreover, they are compatible with the state-of-the-art SliQEC [8] system, which uses Boolean functions represented by decision diagrams to store and manipulate matrices.

The main results of this work include:

1.

We characterize partial equivalence (in Section III), which subsumes prior notions of quantum circuit equivalence.
2.

We prove a necessary and sufficient condition for two circuits to be partially equivalent (in Section IV).
3.

We develop algorithms for partial equivalence checking under general and special settings (in Section V).
4.

We conduct experiments to demonstrate the feasibility and effectiveness of our algorithms and the practical applicability of partial equivalence (in Section VI).

II Preliminaries

We briefly provide some background and define notations.

II-A Boolean Function and Decision Diagram

For Boolean connectives, we denote conjunction by “ $\wedge$ ” (sometimes omitted in a Boolean expression for brevity), disjunction by “ $\vee$ ,” exclusive-or by “ $\oplus$ ,” and negation by an overline. Given a Boolean function $f(x_{1},x_{2},...,x_{n})$ , the positive cofactor of $f$ on $x_{i}$ , denoted $f|_{x_{i}}$ , is

f|_{x_{i}}=f(x_{1},x_{2},...,x_{i-1},1,x_{i+1},...,x_{n}).

(1)

Similarly, the negative cofactor of $f$ on $x_{i}$ , denoted $f|_{\overline{x_{i}}}$ , is

f|_{\overline{x_{i}}}=f(x_{1},x_{2},...,x_{i-1},0,x_{i+1},...,x_{n}).

(2)

The reduced ordered binary decision diagram (ROBDD), referred to as BDD in the sequel, is a canonical form for Boolean function representation [35]. There are efficient BDD packages for Boolean function manipulation. A BDD is a directed acyclic graph consisting of non-terminal nodes and terminal-0 and terminal-1 nodes. Each non-terminal node $v$ is associated with a decision variable and has two children $v_{0}$ and $v_{1}$ pointed to by the 0-branch and 1-branch of $v$ , respectively. Each node in a BDD corresponds to a Boolean function. Let $f_{v},f_{v_{0}},f_{v_{1}}$ be the functions of node $v$ with decision variable $x$ and its two children $v_{0},v_{1}$ , respectively. Then $f_{v_{0}}=f|_{\overline{x}}$ and $f_{v_{1}}=f|_{x}$ , and the Shannon expansion $f_{v}=xf_{v_{1}}\vee\overline{x}f_{v_{0}}$ holds. Fig. 2 shows a BDD of function $f=ac\vee\overline{a}bc\vee\overline{a}\overline{b}\overline{c}$ .

II-B Quantum States and Quantum Circuits

For a quantum system, at the input end, a qubit is called a data qubit if it is associated with the input data and a non-data qubit otherwise. At the output end, a qubit is called a measured qubit if it is to be measured and a non-measured qubit otherwise. In an $n$ -qubit quantum system, a quantum state can be represented by a $2^{n}\times 1$ vector $\phi=[a_{0},a_{1},...,a_{2^{n}-1}]^{\rm T}$ , where each $a_{i}$ is a complex number and the norm of $\phi$ satisfies $\left\lVert\phi\right\rVert=1$ . We use $|0\rangle$ to represent the vector $[1,0,...,0]^{\rm T}$ , $|1\rangle$ to represent the vector $[0,1,0,...,0]^{\rm T}$ , and so on. For state $|i\rangle$ , we let the binary number of integer $i$ be expressed by qubits $q_{0},\ldots,q_{n-1}$ with $q_{0}$ being the most significant bit. In the sequel, we assume that non-data qubits are fixed to the initial state $|0\rangle$ and refer to them as ancilla qubits. This assumption is general in that any other initial state can be transformed from $|0\rangle$ .

If we measure all the qubits of $\phi$ under the computational basis, there will be a probability ${|a_{i}|}^{2}$ for the system to get into state $|i\rangle$ . Also, if we measure only the first (more significant) $m$ qubits, for $m\leq n$ , of $\phi$ , the probability of obtaining a state $|j\rangle$ , for $j\in\{0,1,...,2^{m}-1\}$ , of the measured qubits equals

\sum_{k=g\cdot j}^{g\cdot(j+1)-1}{|a_{k}|}^{2},

(3)

where $g=2^{n-m}$ .

The evolution of the state of a closed quantum system can be described by quantum gates. For an $n$ -qubit quantum system, a quantum gate can be defined by a $2^{n}\times 2^{n}$ unitary matrix $U$ of complex numbers. A unitary matrix $U$ is a matrix satisfying $U^{\dagger}U=I$ , where $U^{\dagger}$ is the conjugate transpose of $U$ and $I$ is the identity matrix. Thus, $U^{\dagger}$ is also the inverse of $U$ . A quantum circuit is made up of a series of quantum gates $G_{1},G_{2},...G_{d}$ . Let their corresponding unitary matrices be $U_{1},U_{2},...,U_{d}$ and let $\phi_{0}$ be the initial state. Then the final output state $\phi_{out}$ of the circuit is derived by the product

\phi_{out}=U_{d}\cdot U_{d-1}\cdot\ldots\cdot U_{1}\cdot\phi_{0}.

(4)

Therefore, we can view the whole circuit as a quantum gate with the unitary matrix

U=U_{d}\cdot U_{d-1}\cdot...\cdot U_{1}.

(5)

Particularly, when an $n$ -qubit quantum circuit with unitary matrix $U$ yields a permutation of the computational basis $\{|0\rangle,...,|2^{n}-1\rangle\}$ , it is referred to as a reversible circuit [36, 37, 38, 39].

Because only the global unitary operator of a quantum circuit matters in our discussion, for simplicity we abstract away gate implementation details and denote an $n$ -qubit quantum circuit $C$ as a 4-tuple $C=(d,m,k,U)$ for $n=d+k$ , where

•

$d$ is the number of data qubits, specifically, $q_{0},\ldots,q_{d-1}$ ,
•

$m$ is the number of measured qubits, specifically, $q_{0},\ldots,q_{m-1}$ ,
•

$k$ is the number of non-data qubits, and
•

$U$ is the global unitary operator of $C$ of size $2^{n}\times 2^{n}$ .

Note that above we make the data qubits and measured qubits follow the same order. This arrangement is without loss of generality as the qubits can be reordered by swap gates.

Given a quantum circuit $C=(d,m,k,U)$ with the data qubits prepared at initial state $\psi$ , we denote the probability of $C$ collapsing to state $|t\rangle$ upon a measurement on the measured qubits under the computational basis as $P(t|\psi,C)$ , which can be calculated by

P(t|\psi,C)=\sum_{i=g\cdot t}^{g\cdot t+g-1}\left|\sum_{j=0}^{2^{n}-1}u_{i,2^{k}\cdot j}\cdot a_{j}\right|^{2},

(6)

where $g=2^{(d+k-m)}$ and $u_{x,y}$ denotes the $(x,y)^{\mathrm{th}}$ entry of $U$ .

We remark that a matrix can be represented numerically or algebraically, and explicitly or implicitly for different computation choices. In this work, we adopt an algebraic and implicit approach to matrix representation and manipulation based on [40, 8].

II-C Algebraic and Implicit Representation of Matrices

Following [41], we represent a complex number $\alpha\in\mathbb{C}$ algebraically by

\alpha=\frac{1}{\sqrt{2}^{k}}(a\omega^{3}+b\omega^{2}+c\omega+d),

(7)

where $\omega=e^{i\pi/4}$ and $a,b,c,d\in\mathbb{Z}$ . In [8], the bit-slicing technique [40] is applied to represent a $2^{n}\times 2^{n}$ complex matrix by one integer-type variable for the $k$ -coefficient and $4r$ $2n$ -variable BDDs for the $a$ -, $b$ -, $c$ -, $d$ -coefficients, assuming an integer is encoded with a bit-vector of size $r$ . Let the BDDs of the $i^{\rm th}$ bit of the $a,b,c,d$ -coefficient Boolean matrices be $F^{ai},F^{bi},F^{ci},F^{di}$ , respectively, for $i=0,\ldots,r-1$ . In the sequel, we do not distinguish among $F^{ai},F^{bi},F^{ci},F^{di}$ and simply use $F$ to refer to every individual of these BDDs. Essentially, such a function $F$ of $2n$ variables implicitly represents a $2^{n}\times 2^{n}$ Boolean matrix, whose rows and columns are indexed by the output qubits and input qubits, respectively, in our context. For qubit $q_{i}$ , two variables $x_{i}$ and $y_{i}$ are introduced to encode its corresponding output qubit and input qubit, respectively, for the Boolean matrix. The variables $X=\{x_{0},\ldots,x_{n-1}\}$ and $Y=\{y_{0},\ldots,y_{n-1}\}$ are referred to as the row-variables and column-variables, respectively. Effectively, $F$ is a function over variables $X\cup Y$ . That is, the $(i,j)^{\mathrm{th}}$ entry of the Boolean matrix corresponds to the Boolean value of $F$ for $X$ and $Y$ being substituted with the $n$ -bit binary numbers of $i$ and $j$ , respectively. In addition to the implicit matrix representation using Boolean functions, the multiplication of matrices can be done effectively through Boolean function manipulations as shown in [8].

III Partial Equivalence Problem

In this section, we formally define partial equivalence between two quantum circuits, and compare it to other equivalence notions in the literature.

III-A Problem Statement

The partial equivalence relation is formally defined as follows.

Definition 1 (Partial Equivalence).

Two circuits $C_{1}=(d,m,k,U_{1})$ and $C_{2}=(d,m,k,U_{2})$ are partially equivalent, denoted by $C_{1}\sim C_{2}$ , if

P(t|\psi,C_{1})=P(t|\psi,C_{2})

(8)

holds for any state vector $\psi$ of size $2^{d}$ and any $t=0,1,...,2^{m}-1$ .

Note that, without loss of generality, we assume two circuits have the same number of ancilla qubits because we can always add dummy ancilla qubits to the one with fewer ancilla qubits. The partial equivalence checking problem can be stated as follows.

Problem 1 (Partial Equivalence Checking).

Given two circuits $C_{1}=(d,m,k,U_{1})$ and $C_{2}=(d,m,k,U_{2})$ , we are asked to check whether $C_{1}$ and $C_{2}$ are partially equivalent.

Partial equivalence can be applied to check the conformance between two quantum circuits that compute the same function but may produce different output states.

We remark that, in prior work [32], the notion of m-equivalence and q-equivalence are defined. For m-equivalent circuits, it requires that all qubits have fixed initial states, and it allows that we only measure on part of the qubits. Two circuits are m-equivalent if they have the same probability distribution of measurement outcomes. That is, m-equivalence is a special case of partial equivalence when we set $d=0$ for the circuits under verification. On the other hand, for q-equivalence, ancilla qubits are allowed. Besides, the output qubits are divided into two groups, the main output qubits and the garbage qubits. It is required that if we measure the garbage qubits, the quantum states of the main output qubits cannot be affected by the measurement outcomes of the garbage qubits. Two circuits are q-equivalent if the output states of the main output qubits between the two circuits are the same for any input state.

To demonstrate the usefulness of partial equivalence, we take the period-finding quantum algorithm [42] as an example. In Fig. 3 (a), the circuit computes the function

U_{1}|x\rangle=\begin{cases}|3x\;\rm{mod}\;5\rangle,&\text{if $x<5$,}\\ |x\rangle,&\text{otherwise.}\end{cases}\qquad\qquad\;

(9)

In Fig. 3 (b), the circuit computes the function

U_{2}|x\rangle=\begin{cases}|3(x+1)-1\;\rm{mod}\;5\rangle,&\text{if $x<5$,}\\ |x\rangle,&\text{otherwise.}\end{cases}

(10)

It can be seen that the $U_{1}$ and $U_{2}$ have the same period 4 (for $U_{1}$ mapping $|1\rangle\mapsto|3\rangle\mapsto|4\rangle\mapsto|2\rangle\mapsto|1\rangle$ and $U_{2}$ mapping $|1\rangle\mapsto|0\rangle\mapsto|2\rangle\mapsto|3\rangle\mapsto|1\rangle$ ). For the circuit in Fig. 3 (c), let the first register as the data qubits and the second register as the ancilla qubits. Let $C_{1}$ (resp. $C_{2}$ ) be the circuit of Fig. 3 (c) with the oracle blocks $U$ being substituted with function $U_{1}$ (resp. $U_{2}$ ). Then $C_{1}$ and $C_{2}$ are partially equivalent under $d=m=3$ despite the fact that neither of total equivalence and q-equivalence holds and m-equivalence is not applicable them. The fact that $C_{1}$ and $C_{2}$ are not totally equivalent is immediate. Also, we verify that the q-equivalence [32] does not hold because the output state of the $1^{\rm st}$ register is affected by the $2^{\rm nd}$ register; the m-equivalence is not applicable because only the $2^{\rm nd}$ register is taken as the ancilla qubits. To the best of our knowledge, no previous tools can check this equivalence.

We observe that when the two circuits under verification have no ancilla qubits, the computation of their partial equivalence checking can be simplified. It motivates the consideration of the following special case.

Problem 2 (Zero-Ancilla Partial Equivalence Checking).

Given two circuits $C_{1}=(d,m,0,U_{1})$ and $C_{2}=(d,m,0,U_{2})$ , we are asked to check whether $C_{1}$ and $C_{2}$ are partially equivalent.

For instance, consider the aforementioned $C_{1}$ and $C_{2}$ circuits. If we properly add some quantum gates to the beginning of the second register of $C_{2}$ (e.g., achieving the mapping $|x\rangle\mapsto|x-1\mod 5\rangle$ for $x<5$ and $|x\rangle\mapsto|x\rangle$ for $x\geq 5$ ), then $C_{1}$ and $C_{2}$ can remain partially equivalent even if all qubits are taken as data qubits. In this case both circuits have no ancilla qubits (i.e., $k=0$ ).

III-B Comparison of Different Equivalences

There have been several quantum circuit equivalences being studied. We compare them with the partial equivalence.

Besides the previously mentioned m-equivalence and q-equivalence, the most common definition of quantum circuit equivalence is total equivalence, which requires that two circuits produce the same state vector (up to a global-phase difference) for any input state [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]. That is, the output states of the two circuits should be numerically identical except for a scalar multiplication factor.

Another well-known equivalence is functional equivalence of reversible circuits. In reversible circuits, they may also include ancilla qubits, and some qubits may be discarded in the end (i.e., we do not care about their measurement results). Some may even include don’t-care conditions, which means that we do not care about all of or part of the measurement results under some input states [36].

Though partial equivalence looks similar to functional equivalence of reversible circuits and m-equivalence, we note that partial equivalence is much more complex, because measurement is a non-linear operation and does not have superposition property. That is, even if we know

P(t|\psi_{i},C_{1})=P(t|\psi_{i},C_{2})

(11)

for $i=1,2$ , we cannot infer

P\left(t|(\alpha\psi_{1}+\beta\psi_{2}),C_{1}\right)=P\left(t|(\alpha\psi_{1}+\beta\psi_{2}),C_{2}\right).

(12)

Therefore, even with a brute-force search to enumerate the probability distribution under $\psi=e_{i,2^{d}}$ for all $i$ , the partial equivalence relation is still not guaranteed. Some other approaches to checking partial equivalence are essential.

We briefly summarize the main differences between the equivalence types mentioned above in Table I. We further illustrate the relations between partial equivalence and other equivalences in Fig. 4. Note that partial equivalence subsumes total equivalence, functional equivalence of reversible circuits, and q-equivalence as these equivalences are special cases of partial equivalence. Specifically, it is evident that totally equivalent circuits must be partially equivalent. For reversible circuits, it can be derived from definition that $P(t|\psi,C)$ is either 0 or 1 for all $\psi=|0\rangle,|1\rangle,...,|2^{d}-1\rangle$ , so Eq. (12) holds. Therefore, functionally equivalent reversible circuits without don’t care conditions must be partially equivalent. As for m-equivalence, it can be clearly seen that we can apply partial equivalence to m-equivalence by setting $d=0$ . For q-equivalence, as the states of main output qubits between two circuits are identical, we must get the same probability distribution of the two circuits if we measure on the main output qubits. Thus, q-equivalent circuits must be partially equivalent.

TABLE I: Comparison of Different Notions of Quantum Circuit Equivalence

equivalence type

equivalence form

inputs

outputs

special restrictions

partial equivalence

measurement probability

allow ancilla

allow garbage

total equivalence

numerical quantum state

usually no ancilla

usually no garbage

functionally equivalent

reversible circuits

Boolean values

allow ancilla

allow garbage

restricted on reversible circuits

m-equivalence

measurement probability

no ancilla

allow garbage

q-equivalence

numerical quantum state

allow ancilla

allow garbage

main output independent of garbage output

IV Necessary and Sufficient Conditions of Partial Equivalence

In this section, we derive a necessary and sufficient condition for partial equivalence, which can be directly used to solve Problem 1 if the matrices of both circuits are explicitly given. For the special case with no ancilla qubits, we show that a simpler necessary and sufficient condition exists.

We define the following notation to locate some sub-matrices.

Definition 2.

Given a circuit $C=(d,m,k,U)$ , let $u_{p,q}$ represents the $(p,q)^{\mathrm{th}}$ entry of $U$ and $g=2^{(d+k-m)}$ , then $v_{i,j}(U)$ of size $g$ is defined to be

v_{i,j}(U)=[u_{gi,2^{k}j},u_{gi+1,2^{k}j},...,u_{gi+g-1,2^{k}j}]^{\rm{T}},

(13)

where $i=0,1,...,2^{m-1}$ , $j=0,1,...,2^{d-1}$ . An example for $d=m=k=1$ is illustrated as follows. In addition, we use $e_{i,l}$ to represent the unit vector $[0,0,...,0,1,0,...,0]$ of length $l$ and the $i^{th}$ position ( $i$ starts from 0) being 1.

We remark that the definition of $\{v_{i,j}(U)\}$ is critical in deriving the necessary and sufficient condition for the partial equivalence. We will show this point in the following section.

IV-A Property of Partial Equivalence

To solve Problem 1, we first derive the condition when $C_{1}\sim C_{2}$ holds.

Theorem 1 (Partial Equivalence).

Given two circuits $C_{1}=(d,m,k,U_{1})$ and $C_{2}=(d,m,k,U_{2})$ , then $C_{1}\sim C_{2}$ if and only if

v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})=v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2})

(14)

holds for all $t=0,1,...,2^{m-1}$ and $p,q=0,1,...,2^{d-1}$ .

Proof.

Let the input state vector $\psi=[a_{0},a_{1},...,a_{(2^{d}-1)}]^{\rm{T}}$ . The matrix $U_{1}$ and $U_{2}$ will respectively be applied on $\psi\otimes e_{0,2^{k}}$ . Let’s consider the output $|t\rangle=t_{0}t_{1}...t_{m-1}$ . Let $g=2^{(d+k-m)}$ and $u_{1,x,y}$ denotes the $(x,y)$ entry of $U_{1}$ , then

	$\displaystyle\qquad\quad P(t\|\psi,C_{1})=\sum_{i=gt}^{gt+g-1}\left\|\sum_{j=0}^{2^{d}-1}u_{1,i,2^{k}\cdot j}\cdot a_{j}\right\|^{2}$
	$\displaystyle=\sum_{i=gt}^{gt+g-1}\left(\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{(u_{1,i,2^{k}\cdot p}\cdot a_{p})^{*}(u_{1,i,2^{k}\cdot q}\cdot a_{q})\right\}\right)$
	$\displaystyle=\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{a_{p}^{}a_{q}\sum_{i=gt}^{gt+g-1}u_{1,i,2^{k}\cdot p}^{}\cdot u_{1,i,2^{k}\cdot q}\right\}$
	$\displaystyle=\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{a_{p}^{*}a_{q}\cdot v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})\right\},$		(15)

where $Re\left\{x\right\}$ denotes the real part of the complex number $x$ . Similarly, we get

P(t|\psi,C_{2})=\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{a_{p}^{*}a_{q}\cdot v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2})\right\}.

(16)

If $C_{1}\sim C_{2}$ , then $P(t|\psi,C_{1})=P(t|\psi,C_{2})$ must hold for all $\psi$ , so $v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})=v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2})$ must hold for all $p$ and $q$ . We also need the above equality hold for any $t=0,1,...,2^{m-1}$ . Thus,

	$\displaystyle C_{1}\sim C_{2}$	$\displaystyle\Leftrightarrow P(t\|\psi,C_{1})=P(t\|\psi,C_{2})\;\forall t\forall\psi$
		$\displaystyle\Leftrightarrow v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})=v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2})\;\forall t\forall p\forall q.$

∎

As mentioned before, brute-force search does not work for partial equivalence. Therefore, Theorem 1 provides the first way to formally verify the partial equivalence relation.

From Theorem 1, we can see that whether $C_{1}\sim C_{2}$ only depends on the $v_{i,j}(U)$ ’s of two matrices, and is not related to other entries.

IV-B Property of Zero-Ancilla Partial Equivalence

For the case that both circuits have no ancilla qubits, we provide another necessary and sufficient condition for partial equivalence as follows.

Theorem 2 (Zero-Ancilla Partial Equivalence).

Given $C_{1}=(d,m,0,U_{1})$ and $C_{2}=(d,m,0,U_{2})$ , then $C_{1}\sim C_{2}$ if and only if $U_{1}U_{2}^{-1}$ is in the following form.

U_{1}U_{2}^{-1}=\begin{bmatrix}U_{0,0}&&&\\ &U_{1,1}&&\\ &&\ddots&\\ &&&U_{2^{m}-1,2^{m}-1}\end{bmatrix},

(17)

where each $U_{i,i}$ is a $2^{d-m}\times 2^{d-m}$ square matrix, and the other entries in $U_{1}U_{2}^{-1}$ not labeled are 0.

Proof.

Firstly we show a transformed problem. By definition, $C_{1}\sim C_{2}$ if and only if $P(t|\psi,C_{1})=P(t|\psi,C_{2})$ for any $\psi$ and $t$ . Let $U_{1}\psi=\phi_{1}$ , $U_{2}\psi=\phi_{2}$ , and then we have $U_{1}U_{2}^{-1}\phi_{2}=\phi_{1}$ . Because $U_{1}$ and $U_{2}$ are both reversible matrices, $\phi_{1}$ and $\phi_{2}$ can be any vector of size $2^{d}$ (and satisfying $\left\lVert\phi_{1}\right\rVert=\left\lVert\phi_{2}\right\rVert=1$ ). Thus, $C_{1}\sim C_{2}$ if and only if $P(t|\phi_{2},C_{1}C_{2}^{-1})=P(t|\phi_{2},I)$ for any $\phi_{2}$ and any $t$ , where $C_{1}C_{2}^{-1}$ is the circuit with unitary matrix $U_{1}U_{2}^{-1}$ , and $I$ is the identity circuit that directly sends the input to the output.

We will use the transformed problem in the following proof.

(i) the “only if ” part:

If $C_{1}\sim C_{2}$ , $P(t|\phi_{2},C_{1}C_{2}^{-1})=P(t|\phi_{2},I)$ must hold for any $\phi_{2}$ and any $t$ . We firstly consider $\phi_{2}=e_{0,2^{d}}$ . We can easily find that

P(t|e_{0,2^{d}},I)=\left\{\begin{aligned} &1,\;\mathit{if}\;t=00...0\\ &0,\;\mathit{else}\end{aligned}\right..

(18)

Let $U=U_{1}U_{2}^{-1}$ and $u_{x,y}$ denotes the $(x,y)$ entry of $U$ . To make $P(00...0|e_{0,2^{d}},C_{1}C_{2}^{-1})=1$ , $u_{i,0}$ should be 0 for all $i\geq 2^{d-m}$ .

Following the same method, now we generally consider $\phi_{2}=e_{j,2^{d}}$ , where $j=0,1,...,2^{d-1}$ . Let $g=\lfloor j/2^{d-m}\rfloor$ , then

		$\displaystyle\;\;\;\;\;\;P(t\|e_{j,2^{d}},I)=\left\{\begin{aligned} &1,\;\mathit{if}\;t=g\\ &0,\;\mathit{else}\end{aligned}\right.\;\;\;\;\;\;\;\;$
		$\displaystyle\Rightarrow P(t\|e_{j,2^{d}},C_{1}C_{2}^{-1})=\left\{\begin{aligned} &1,\;\mathit{if}\;t=g\\ &0,\;\mathit{else}\end{aligned}\right.\;\;\;\;\;\;\;\;$
		$\displaystyle\Rightarrow u_{i,j}=0\;\;\forall i\notin\left[2^{d-m}g,\,2^{d-m}(g+1)\right).$

That is, $U_{1}U_{2}^{-1}$ must be in form Eq. (17).

(ii) the “if ” part:

If $U=U_{1}U_{2}^{-1}$ is in form Eq. (17), we firstly note that:

U^{\dagger}U=\begin{bmatrix}U_{0,0}^{\dagger}U_{0,0}&&&\\ &U_{1,1}^{\dagger}U_{1,1}&&\\ &&\quad\ddots\quad&\\ &&&\quad\ddots\quad\end{bmatrix}.

(19)

Because $U=U_{1}U_{2}^{-1}$ is a unitary matrix, $U_{i,i}^{\dagger}U_{i,i}$ must be identity for all $i$ . Thus, each $U_{i,i}$ must be a unitary matrix.

Now for any $\phi_{2}$ , let $\phi_{2}=\left[a_{0},a_{1},...,a_{2^{d}-1}\right]^{\rm T}=\left[A_{0},A_{1},...,A_{2^{m}-1}\right]^{\rm T}$ , where each $A_{i}$ is of length $2^{d-m}$ . That is, we equally divide $\phi_{2}$ into $2^{m}$ groups. Then

P(t|\phi_{2},C_{1}C_{2}^{-1})=\left\lVert U_{t,t}A_{t}\right\rVert.

(20)

Because $U_{t,t}$ is unitary, it is true that

\left\lVert U_{t,t}A_{t}\right\rVert=A_{t}^{\dagger}U_{t,t}^{\dagger}U_{t,t}A_{t}=\left\lVert A_{t}\right\rVert.

(21)

Because $\left\lVert A_{t}\right\rVert=P(t|\phi_{2},I)$ , $P(t|\phi_{2},C_{1}C_{2}^{-1})=P(t|\phi_{2},I)$ holds for all $\phi_{2}$ and $t$ . Therefore, $C_{1}\sim C_{2}$ . ∎

V Algorithms for Partial Equivalence Checking

In this section, we present algorithms to solve Problems 1 and 2 based on the theorems established in Section IV. We emphasize that these algorithms can be realized by using the bit-sliced algebraic representation of matrices [8].

V-A Partial Equivalence Checking Algorithm

If $U_{1}$ and $U_{2}$ are given in their explicit numerical form, we can easily extract $v_{i,j}(U)$ ’s and enumerate all $(t,p,q)$ triples to check whether $C_{1}\sim C_{2}$ by Theorem 1, with $O(2^{3d+k})$ steps of complex number multiplications. However, as $U_{1}$ and $U_{2}$ may be in an implicit form, we give another approach by using matrix entry manipulation and matrix multiplication, as shown in Algorithm 1.

Algorithm 1 PEC( $C_{1}$ , $C_{2}$ )

C_{1}

C_{1}=(d,m,k,U_{1})

C_{2}

C_{2}=(d,m,k,U_{2})

0: Whether

C_{1}\sim C_{2}

or not

extra:={\rm max}\{m-k,0\}

k:=k+extra

U_{1}:=U_{1}\otimes I_{2^{extra}}

U_{1}^{\prime}:=U_{1}

5: Equally divide the columns of

U_{1}^{\prime}

into

2^{d}

parts. For each part, keep the leftmost column unchanged, and set the remains to 0.

6: Equally divide the rows of

U_{1}^{\prime}

into

2^{m}

parts. For the

i^{th}

part (

i

starts from 0), right shift the contents for

i

columns.

U_{1}^{\prime\prime}:=U_{1}^{\dagger}U_{1}^{\prime}

8: Equally divide the rows of

U_{1}^{\prime\prime}

into

2^{d}

parts. For each part, keep the top row and set the other entries to 0.

9: Repeat step 3

\sim

8 for

U_{2}

and get

U_{2}^{\prime\prime}

10: return whether

U_{1}^{\prime\prime}=U_{2}^{\prime\prime}

or not

To better illustrate steps 5 and 6, we note that after step 6, each $v_{i,j}(U_{1})$ for different $i$ and $j$ should be separated into different columns, and all entries not storing values of $v_{i,j}(U_{1})$ are 0. An example of $U_{1}^{\prime}$ with $d=m=k=1$ after this step is shown as follows.

To see the correctness of Algorithm 1, we first consider the case $m\geq k$ . It can be easily checked that after step 8, the $(2^{k}p,\>2^{k}q+t)^{\mathrm{th}}$ entry of $U_{1}^{\prime\prime}\>$ will store the value of $v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})$ , while the other entries are 0. Note that in step 2 we have added some ancilla qubits and now $k=m$ . Therefore, we can easily check whether $U_{1}^{\prime\prime}=U_{2}^{\prime\prime}$ holds to determine whether $C_{1}\sim C_{2}$ .

For the case $m<k$ , the $(2^{k}p,\>2^{k}q+t)^{\mathrm{th}}$ entry of $U_{1}^{\prime\prime}\>$ still stores the value of $v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})$ , but only for those with $t=0,1,...,2^{m-1}$ . For $t=2^{m},...,2^{k-1}$ , the value of $v_{t,q}(U_{1})^{\dagger}v_{t,q}(U_{1})$ does not exist, but we can see that the $(2^{k}p,\>2^{k}q+t)^{\mathrm{th}}$ entry of $U_{1}^{\prime\prime}\>$ will always be 0 because the entries in the ${(2^{k}q+t)}^{\rm{th}}$ column of $U_{1}^{\prime}$ are all 0’s after step 6. That is, though we do not clear these entries in step 8, they are automatically set to 0. Therefore, we can still determine whether $C_{1}\sim C_{2}$ by checking whether $U_{1}^{\prime\prime}=U_{2}^{\prime\prime}$ .

We further point out that this can be done quite easily using bit-sliced algebraic representation of a matrix [8], as shown in Algorithm 2. As introduced before, a matrix is represented by multiple BDDs, but here we use single $F_{1}$ and $F_{2}$ as representatives for all BDDs in $C_{1}$ and $C_{2}$ respectively for convenience. In steps 5, 7, 10, 13, we mean to do the same operation on each BDD, and in step 15, we need all corresponding BDDs of the two circuits to be the same.

Algorithm 2 PEC_BDD( $C_{1}$ , $C_{2}$ )

C_{1}

C_{1}=(d,m,k,U_{1})

C_{2}

C_{2}=(d,m,k,U_{2})

0: Whether

C_{1}\sim C_{2}

or not

extra:={\rm max}\{m-k,0\}

k:=k+extra

3: Add

extra

row-variables and column-variables to

C_{1}

4: for

i=d,d+1,...,d+k-1

F_{1}:=F_{1}|_{\overline{y_{i}}}

6: for

i=0,1,...,m-1

F_{1}:=F_{1}\land\overline{x_{i}\oplus y_{d+k-m+i}}

8: if

m<k

9: for

i=d,d+1,...,d+k-m-1

10:

F_{1}:=F_{1}\,\land\,\overline{y_{i}}

11: Apply the inverse circuit

C_{1}^{-1}

F_{1}

using the method in [8]

12: for

i=d,d+1,...,d+k-1

13:

F_{1}:=F_{1}\land\overline{x_{i}}

14: Repeat step 3 to 13 for

C_{2}

and get

F_{2}

15: return whether

F_{1}=F_{2}

or not

We use steps 4 to 10 in Algorithm 2 to replace steps 5 and 6 in Algorithm 1. In steps 4 to 5, we divide the columns of $U_{1}$ into $2^{d}$ parts. For each part, we use cofactor operation to copy the leftmost columns to the other columns. In steps 6 to 9, we remove unwanted entries to ensure that each column only contains different $v_{i,j}(U_{1})$ ’s. The remaining steps have similar functions as in Algorithm 1 but are rewritten in Boolean form.

We can see that with the help of BDD, the operations can be simplified and are more efficient. To be precise, two applications of inverse circuit operations $U^{-1}$ are required, and only $O(m+k)$ Boolean AND, OR, and cofactor operations are needed, which is linear to $m$ and $k$ . As BDDs often give a good data compression, there may be a significant speedup for Algorithm 2.

V-B Zero-Ancilla Partial Equivalence Checking Algorithm

Similarly, we can directly use Theorem 2 to solve Problem 2 if $U_{1}$ and $U_{2}$ are explicitly given. Here we emphasize that it is also easy to check this property using bit-sliced algebraic representation of the matrix. The checking procedure is shown in Algorithm 3. Again, we mean to do the same operation on each BDD in step 5, and we require all BDDs to be constant $false$ in step 6.

Algorithm 3 PEC_BDD_noAncilla( $C_{1}$ , $C_{2}$ )

C_{1}

C_{1}=(d,m,0,U_{1})

C_{2}

C_{2}=(d,m,0,U_{2})

0: Whether

C_{1}\sim C_{2}

or not

1: Build the BDDs

F

representing

C_{1}C_{2}^{-1}

using the method in [8]

M:=false

3: for

i=0,1,...,m-1

M:=M\lor\left(x_{i}\oplus y_{i}\right)

F:=F\land M

6: return whether

F=false

or not

We can see that in Algorithm 3, we do not have to build BDDs for $C_{1}$ and $C_{2}$ respectively, but rather merge them into one quantum circuit $C_{1}C_{2}^{-1}$ . As there may be some gates canceled out in $C_{1}C_{2}^{-1}$ , Algorithm 3 may be more efficient than Algorithm 2.

VI Experimental Evaluation

We implemented Algorithms 2 and 3 in C/C++ under the framework of SliQEC [8], and adopted CUDD [43] package for BDD manipulation. All experiments were conducted on a server with Intel Xeon Silver 4210 CPU at 2.20 GHz and 128 GB RAM. The reported runtime refers to CPU time in seconds. The time-out (TO) limit is set to 600 seconds.

We conducted experiments of partial equivalence checking on benchmarks of period-finding instances, randomly generated instances, Grover search instances, and reversible circuit instances. We also experimented total equivalence checking on randomly generated instances. The results are discussed as follows.

VI-A Checking Period-Finding Algorithm

We used the tool proposed in [39] to implement the reversible circuits realizing $U|x\rangle=|ax\mod N\rangle$ for $x<N$ and $U|x\rangle=|a(x+1)-1\mod N\rangle$ for $x<N$ , which have the same period with input state $|1\rangle$ . We then substituted the reversible circuits into the oracle used in the period finding algorithm. We let the first register consist of 3 qubits, and used one extra ancilla qubit to implement quantum Fourier transform. We then used Algorithm 2 to test the equivalence relation. As mentioned above, this type of equivalence cannot be verified by previous tools, and Algorithm 2 is the only way to check their equivalence relation. We have tested the $(a,N)$ pairs for all $a\in\{3,5,7,11\},N\in\{5,7,11,13\}$ and $a<N$ . The parameters are set to $d=m=3$ and $k=\lceil\log_{2}N\rceil+1$ .

The results are shown in Table II. All test cases yield correct answers. As these test cases are relatively small, we can see that the runtime is roughly proportional to the gate count. On the other hand, the memory usages are roughly the same because all test cases have similar numbers of qubits.

TABLE II: Experimental Results on Period-Finding Benchmarks

a

N

#Gates

C_{1}

#Gates

C_{2}

time (s)

memory

(MB)

115

0.180

12.874

108

136

0.354

13.914

227

325

0.794

14.496

353

248

0.796

13.959

143

164

0.441

13.922

381

388

1.083

14.512

353

507

0.806

13.173

395

437

1.263

14.520

479

241

1.136

14.787

430

416

1.302

14.787

VI-B Checking Random Benchmarks

To further test the scalability of our algorithms, we generated some benchmarks with Clifford+T gates (H, S, T, CNOT gates) and 2-control Toffoli gates. The numbers of data qubits of the benchmarks range from $d=5,10,15,20,25,30,35,40,$ $50,60,70,80,90,100$ . The number $m$ of measured qubits was fixed to $\lfloor 0.5d\rfloor$ , and the gate number was set to about $6.5d$ . For each $d$ , 20 groups of circuits were generated, and each group includes two conditions: ancilla qubits existing or non-existing. We used the latter condition to test Algorithm 3, and both conditions to test Algorithm 2.

The structure of the benchmarks is shown in Fig. 5. Let the partially equivalent circuit pair be $C_{1}$ and $C_{2}$ , both with $d$ data qubits and $m$ measured qubits. The circuits are divided into five parts: H (H gates), T (totally equivalent), P (partially equivalent), A (arbitrary) and C (CNOT gates). If $C_{1}$ and $C_{2}$ are in the condition without ancilla qubits, only the first four parts are contained. Otherwise, they both contain all five parts.

In part H, an H gate is firstly applied to each data qubit to impose superposition. In part T, we generate a random sub-circuit with $d$ qubits and $3d$ gates to apply on both circuits, but all the Toffoli gates in $C_{2}$ are decomposed in the way proposed in [44]. In part P, we divide the data qubits into several groups, where each group may contain one or two adjacent qubits. For each group, we apply $X_{1}$ and $X_{2}$ on $C_{1}$ and $C_{2}$ , respectively, where $X_{1}$ and $X_{2}$ are pre-generated subcircuits satisfying $X_{1}\sim X_{2}$ when all qubits are set to data qubits and measured qubits. $X_{1}$ and $X_{2}$ are exhaustively searched with up to five gates in total. In part A, different random sub-circuits with $d-m$ qubits and $d-m$ gates are applied on $C_{1}$ and $C_{2}$ . These gates trivially do not affect the measurement results of the measured qubits. In part C, we use the ancilla qubits as the control bits to control the upper $d$ qubits. As ancilla qubits are initially set to 0, these CNOT gates do not affect the circuit.

Although the circuit generation is simple, there are no known circuit simplification methods exploiting partial equivalence. The generated circuits are used to test our algorithms. Furthermore, to validate the correctness of the benchmarks, we use the same generating method to generate some test cases with $3\leq d\leq 9$ . For each test case, we use Qiskit [27] to randomly set an initial state and get the final state vector numerically. We calculate for all possible outputs and make sure that $C_{1}$ and $C_{2}$ have the same probability distribution under this random initial state.

TABLE III: Experimental Results on Random Partially Equivalent Circuits

d

m

Without ancilla qubits

With ancilla qubits

#Gates in

C_{1}

#Gates in

C_{2}

Algorithm 3

Algorithm 2

#Gates in

C_{1}

#Gates in

C_{2}

Algorithm 2

time(s)

memory

(MB)

#TO

time(s)

memory

(MB)

#TO

time(s)

memory

(MB)

#TO

26.05

78.95

0.017

12.942

0.064

12.954

27.05

79.95

0.067

12.954

51.05

157.00

0.050

12.942

1.880

25.077

52.65

158.60

1.951

24.766

76.00

240.70

0.135

13.855

72.416

165.957

77.95

243.05

72.576

168.987

102.65

308.75

0.364

14.180

117.528

170.217

105.85

312.00

129.006

167.451

128.90

372.70

0.723

14.577

132.55

376.15

155.45

439.05

1.276

15.798

160.00

443.40

180.65

543.15

2.491

17.633

185.50

548.25

204.80

587.75

3.554

18.930

210.85

594.00

257.00

710.25

8.827

35.281

264.65

717.75

309.65

862.90

18.673

45.444

318.25

872.50

361.45

1037.45

55.950

78.525

372.10

1048.00

414.55

1157.00

105.611

122.198

426.00

1169.25

461.65

1328.75

229.473

144.761

475.80

1342.00

100

517.35

1464.15

297.401

160.311

532.70

1480.05

TABLE IV: Experimental Results on Functionally Equivalent Reversible Circuits

Without ancilla qubits (Algorithm 3)
$C_{1}$	$C_{2}$	$d$	$m$	#Gates in $C_{1}$	#Gates in $C_{2}$	time (s)	memory (MB)
hwb4_52	hwb4_49	4	4	11	17	0.0128	12.6730
hwb6_58	hwb6_56	6	6	42	126	0.0214	12.9434
graycode6_47	graycode6_48	6	6	5	5	0.0132	12.9434
ham3_103	ham3_102	3	3	4	5	0.0128	12.6730
urf1_150	urf1_149	9	9	1517	11554	5.5048	15.4092
urf2_153	urf2_152	8	8	638	5030	1.1813	13.9059
urf3_156	urf3_155	10	10	2732	26468	25.6478	17.7562
urf5_159	urf5_158	9	9	499	10276	3.5319	15.4010
urf6_281	urf6_160	15	15	5102	10740	188.8090	112.1649
alu-v1_28	alu-v1_29	5	1	8	8	0.0101	12.9393
alu-v2_33	alu-v2_30	5	1	7	18	0.0130	13.0048
alu-v3_34	alu-v3_35	5	1	8	8	0.0110	12.9393
alu-v4_36	alu-v4_37	5	1	8	8	0.0127	12.9393
With ancilla qubits (Algorithm 2)
$C_{1}$	$C_{2}$	$d$	$m$	#Gates in $C_{1}$	#Gates in $C_{2}$	time (s)	memory (MB)
hwb5_53	hwb5_300	5	5	55	101	3.8765	31.1869
alu-v0_26	alu-bdd_288	5	1	7	10	0.0146	12.9434
ham7_104	ham7_299	7	7	23	76	0.7321	16.2570
ham15_107	ham15_298	15	15	132	184	89.9754	203.5876
nestedif2_16_396	nestedif2_16_468	34	32	256	266	3.6910	37.0237

The experimental results are shown in Table IV. The runtime and memory usage shown are averaged over the 20 test cases excluding those timed out. As seen, Algorithm 3 can easily scale up to tens of qubits. However, for $d>20$ , Algorithm 2 can hardly finish the computation within the time limit. It may be because, for Algorithm 3, $F$ keeps representing a unitary matrix of a quantum circuit almost during the whole process, which may be more “regular” for BDDs to represent. Moreover, due to the structure of $C_{1}C_{2}^{-1}$ , some gates may be canceled out and the BDDs are more likely to be maintained as an identity matrix, as mentioned in [8]. In contrast, the entries of the final matrix in Algorithm 2 contain the values of $v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})$ , which can be arbitrary real numbers. Besides, we fix $m$ to $\lfloor 0.5d\rfloor$ in our benchmarks, and the second step in Algorithm 2 may further increase the number of qubits. Thus, the runtime and memory usage of Algorithm 2 grow more rapidly than those of Algorithm 3.

VI-C Checking Reversible Circuits

As mentioned, functionally equivalent reversible circuits without don’t-care conditions must be partially equivalent. For those with ancilla inputs, Algorithm 2 is needed. On the other hand, Algorithm 3 is sufficient if no ancilla inputs exist. We used some test cases from RevLib [45] and used Algorithms 2 and 3 to test their partial equivalence to justify this point. The experimental results are shown in Table IV. It is justified that all functionally equivalent reversible circuits are indeed partially equivalent. The runtime and memory usage mainly grow proportional to the gate-count increase.

VI-D Checking Grover’s Algorithm

If reversible circuits are used as part of a quantum circuit, the entire quantum circuit may not be a reversible circuit anymore, and its verification becomes more complex. We took Grover’s algorithm as an example for case study. In the experiment, we created a reversible circuit as the oracle function in Grover’s algorithm. We also used the way proposed in [38] to implement the same function with one ancilla qubit. Then, Algorithm 2 was applied to check the equivalence of the two circuit. The data qubit number $d$ and measured qubit number $m$ range from $d=m=6,10,14,18,22,26$ .

The experimental results are shown in Table V. The runtime and memory usage grow as $d$ increases. It is reasonable because the circuit size is larger. We note that quantum circuits of Grover’s algorithm with the same oracle function but implemented by different reversible circuits are q-equivalent. In this experiment, we show that they are also partially equivalent. Apart from equivalent test cases, we also try to remove one gate from the second circuit to make them not partially equivalent. The runtime and memory usage for non-equivalent test cases are still close to equivalent test cases.

TABLE V: Experimental Results on Grover Search Benchmarks

d

m

#Gates

C_{1}

#Gates

C_{2}

time (s)

memory

(MB)

0.257

13.201

1.537

38.687

9.613

153.371

107

109

22.831

179.745

131

133

49.149

192.492

155

157

135.116

232.116

TABLE VI: Experimental Results on Random Totally Equivalent Circuits

$d$	$m$	#Gates in $C_{1}$	#Gates in $C_{2}$	Algorithm 3			SliQEC
$d$	$m$	#Gates in $C_{1}$	#Gates in $C_{2}$	time (s)	memory (MB)	#TO	time (s)	memory (MB)	#TO
10	10	40.00	118.40	0.038	12.955	0	0.028	13.000	0
20	20	80.00	237.50	0.182	14.245	0	0.194	14.237	0
30	30	120.00	377.60	0.685	15.699	0	0.640	15.285	0
40	40	160.00	496.00	1.802	19.232	0	1.749	17.815	0
50	50	200.00	649.40	3.612	28.114	0	4.722	28.233	0
60	60	240.00	751.00	7.246	38.732	0	9.889	42.336	0
70	70	280.00	887.60	12.165	45.731	0	10.086	38.610	0
80	80	320.00	993.40	20.001	58.224	0	22.079	56.422	0
90	90	360.00	1071.20	24.703	69.052	0	25.826	68.746	0
100	100	400.00	1230.20	48.089	85.605	0	58.483	79.486	0

VI-E Checking Total Equivalent Circuits

Because totally equivalent circuits must be partially equivalent, under these circumstances we use totally equivalent circuits as the benchmarks to test Algorithm 3 and compare it against SliQEC [8]. Because Algorithm 3 is extended from SliQEC, we expect that they would perform similarly. We generated some benchmarks with qubit number $d=10,20,$ $30,40,50,60,70,80,90,100$ , and set all qubits to be measured qubits for partial equivalence checking. The generating method is similar to how we generate partially equivalent pairs, except that only part H and part T are included to make them totally equivalent. The results shown in Table VI confirm the expectation that Algorithm 3 and SliQEC behave similarly in both runtime and memory usage on these instances.

VII Conclusions

In this work, we defined partial equivalence, which is concerned with observational equivalence with respect to partial measurement and a set of initial input states. We obtained the necessary and sufficient conditions of partial equivalence, and exploited them to develop partial equivalence checking algorithms. Experimental results demonstrate the usefulness and strengths of our methods. For future work, it may be interesting to explore partial equivalence in applications such as quantum circuit synthesis and quantum program compilation.

Acknowledgments

This work was supported in part by the Ministry of Science and Technology of Taiwan under grants MOST 110-2224-E-002-011 and MOST 111-2119-M-002-012, and MediaTek Research. TFC was supported by the TSMC Scholarship.

References

[1] A. Kuehlmann and F. Krohm, “Equivalence checking using cuts and heaps,” in Proc. Design Automation Conference, 1997, pp. 263–268.
[2] A. Mishchenko, S. Chatterjee, R. Brayton, and N. Een, “Improvements to combinational equivalence checking,” in Proc. Int. Conf. on Computer-Aided Design, 2006, pp. 836–843.
[3] S.-Y. Huang and K.-T. T. Cheng, Formal Equivalence Checking and Design Debugging. Springer, 2012.
[4] C. Van Eijk, “Sequential equivalence checking based on structural similarities,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 19, no. 7, pp. 814–819, 2000.
[5] K. L. McMillan, “Interpolation and SAT-based model checking,” in Proc. Int. Conf. on Computer Aided Verification, 2003, pp. 1–13.
[6] A. R. Bradley, “SAT-based model checking without unrolling,” in Proc. Int. Conf. on Verification, Model Checking, and Abstract Interpretation, 2011, pp. 70–87.
[7] N. Een, A. Mishchenko, and R. Brayton, “Efficient implementation of property directed reachability,” in Proc. Int. Conf. on Formal Methods in Computer-Aided Design, 2011, pp. 125–134.
[8] C.-Y. Wei, Y.-H. Tsai, C.-S. Jhang, and J.-H. R. Jiang, “Accurate BDD-based unitary operator manipulation for scalable and robust quantum circuit verification,” in Proc. Design Automation Conference, 2022.
[9] P. Niemann, R. Wille, and R. Drechsler, “Equivalence checking in multi-level quantum systems,” in Proc. Int’l Conf. of Reversible Comput., 2014, pp. 201–215.
[10] L. Burgholzer and R. Wille, “Advanced equivalence checking for quantum circuits,” IEEE Trans. on Comput.-Aided Design Integr. Circuits Syst, vol. 40, no. 9, pp. 1810–1824, 2021.
[11] X. Hong, M. Ying, X. Z. Y. Feng, and S. Li, “Approximate equivalence checking of noisy quantum circuits,” in Proc. Design Automation Conf., 2021, pp. 637–642.
[12] S. Yamashita and I. L. Markov, “Fast equivalence-checking for quantum circuits,” in Proc. Int. Symp. on Nanoscale Architectures, 2010, pp. 23–28.
[13] L. Burgholzer and R. Wille, “Improved DD-based equivalence checking of quantum circuits,” in Proc. Asia and South Pacific Design Automation Conf., 2020, pp. 127–132.
[14] ——, “The power of simulation for equivalence checking in quantum computing,” in Proc. Design Automation Conf., 2020, pp. 1–6.
[15] ——, “Handling non-unitaries in quantum circuit equivalence checking,” in Proc. Design Automation Conference, 2022.
[16] E. Ardeshir-Larijani, S. J. Gay, and R. Nagarajan, “Verification of concurrent quantum protocols by equivalence checking,” in Proc. Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, 2014, pp. 500–514.
[17] M. Amy, “Towards large-scale functional verification of universal quantum circuits,” in Electronic Proc. in Theoretical Computer Science, vol. 287, 2019, pp. 1–21.
[18] G. F. Viamontes, I. L. Markov, and J. P. Hayes, “Checking equivalence of quantum circuits and states,” in Proc. Int. Conf. on Computer-Aided Design, 2007, pp. 69–74.
[19] S. A. Wang, C. Y. Lu, I. M. Tsai, and S. Y. Kuo, “An XQDD-based verification method for quantum circuits,” IEICE Trans. on Fundamentals of Electron., Commun. and Computer Sci., vol. E91-A, no. 2, pp. 584–594, 2008.
[20] E. Knill, “Conventions for quantum pseudocode,” Los Alamos National Lab, Tech. Rep., 1996, [Online]. Available: https://www.osti.gov/biblio/366453.
[21] B. Ömer, “A procedural formalism for quantum computing,” Master’s thesis, Dep. Theoretical Phys., Technical Univ. of Vienna, 1998.
[22] A. van Tonder, “A lambda calculus for quantum computation,” SIAM Journal on Computing, vol. 33, pp. 1109–1135, 2004.
[23] P. Selinger, “Towards a quantum programming language,” Mathematical Structures in Computer Science, vol. 14, no. 4, pp. 527–586, 2004.
[24] S. Bettelli, L. Serafini, , and T. Calarco, “Toward an architecture for quantum programming,” Eur. Phys. J. D, vol. 25, pp. 181–200, 2003.
[25] A. W. Cross, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, “Open quantum assembly language,” 2017, [Online]. Available: https://arxiv.org/abs/1707.03429.
[26] A. J. Abhari et al., “Scaffold: Quantum programming language,” Dept. of Computer Science, Princeton Univ., Tech. Rep. TR934-12, 2012.
[27] H. Abraham et al., “Qiskit: An open-source framework for quantum computing,” 2019, doi: 10.5281/zenodo.2562111.
[28] B. Bichsel, M. Baader, T. Gehr, and M. Vechev, “Silq: A high-level quantum language with safe uncomputation and intuitive semantics,” in Proc. Int. Conf. on Programming Language Design and Implementation, London, UK, 2020, pp. 286–300.
[29] K. M. Svore et al., “Q#: Enabling scalable quantum computing and development with a high-level dsl,” in Proc. Real World Domain Specific Languages Workshop, 2018, pp. 1–10.
[30] A. S. Green, P. L. Lumsdaine, N. J. Ross, P. Selinger, and B. Valiron, “Quipper: a scalable quantum programming language,” in Proc. Int. Conf. on Programming Language Design and Implementation, 2013, pp. 333–342.
[31] S. Liu et al., “ $Q|Si\rangle$ : A quantum programming environment,” in Proc. Symp. on Real-Time and Hybrid Systems, 2018, pp. 133–164.
[32] X. Hong, Y. Feng, S. Li, and M. Ying, “Equivalence checking of dynamic quantum circuits,” 2021, [Online]. Available: arXiv:2106.01658v1.
[33] Q. S. Wang, J. Y. Liu, and M. S. Ying, “Equivalence checking of quantum finite-state machines,” J. Comput. Syst. Sci., vol. 116, pp. 1–21, 2021.
[34] Q. Wang, R. Li, and M. Ying, “Equivalence checking of sequential quantum circuits,” IEEE Trans. on Comput.-Aided Design of Integr. Circuits and Syst., 2021.
[35] R. E. Bryant, “Graph-based algorithms for Boolean function manipulation,” IEEE Trans. on Computers, vol. 35, no. 8, pp. 677–691, 1986.
[36] R. Wille, D. Große, D. M. Miller, and R. Drechsler, “Equivalence checking of reversible circuits,” in Proc. Int. Symp. on Multiple-Valued Logic, 2009, pp. 324–330.
[37] L. G. Amaru, “Improvements to the equivalence checking of reversible circuits,” in New Data Structures and Algorithms for Logic Synthesis and Verification. Springer, 2017, ch. 6.
[38] D. M. Miller, R. Wille, and R. Drechsler, “Reducing reversible circuit cost by adding lines,” in Proc. Int. Symp. on Multi-Valued Logic, 2010, pp. 217–222.
[39] H. Lee, K. C. Jung, D. Han, and P. Kim, “An algorithm for reversible logic circuit synthesis based on tensor decomposition,” 2021, [Online]. Available: arxiv.org/abs/2107.04298.
[40] Y.-H. Tsai, J.-H. R. Jiang, and C.-S. Jhang, “Bit-slicing the Hilbert space: Scaling up accurate quantum circuit simulation,” in Proc. Design Automation Conference, 2021, pp. 439–444.
[41] Z. Alwin, N. Philipp, D. Rolf, and W. Robert, “Accuracy and compactness in decision diagrams for quantum computation,” in Proc. Design, Automation Test in Europe Conference and Exhibition, 2019, pp. 280–283.
[42] P. W. Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” in Proc. IEEE Symposium on Foundations of Computer Science, 1994, pp. 124–134.
[43] F. Somenzi, “CUDD: CU decision diagram package (release 2.4.2),” 2012.
[44] D. Maslov, “Advantages of using relative-phase Toffoli gates with an application to multiple control Toffoli optimization,” Phys. Rev. A, vol. 93, no. 022311, 2016.
[45] R. Wille, D. Große, L. Teuber, G. W. Dueck, and R. Drechsler, “Revlib: An online resource for reversible functions and reversible circuits,” in Proc. Int. Symp. on Multi-Valued Logic, 2008, pp. 220–225.