This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Partial Equivalence Checking of Quantum Circuits

Tian-Fu Chen1,2,3, Jie-Hong R. Jiang1,2,3,4, and Min-Hsiu Hsieh5 1Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan 2Graduate School of Advanced Technology, National Taiwan University, Taipei, Taiwan 3Center for Quantum Science and Engineering, National Taiwan University, Taipei, Taiwan 4Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan 5Quantum Computing Research Center, Hon Hai Research Institute, Taipei, Taiwan [email protected], [email protected], [email protected]
Abstract

Equivalence checking of quantum circuits is an essential element in quantum program compilation, in which a quantum program can be synthesized into different quantum circuits that may vary in the number of qubits, initialization requirements, and output states. Verifying the equivalences among the implementation variants requires proper generality. Although different notions of quantum circuit equivalence have been defined, prior methods cannot check observational equivalence between two quantum circuits whose qubits are partially initialized, which is referred to as partial equivalence. In this work, we prove a necessary and sufficient condition for two circuits to be partially equivalent. Based on the condition, we devise algorithms for checking quantum circuits whose partial equivalence cannot be verified by prior approaches. Experiment results confirm the generality and demonstrate the efficiency and effectiveness of our method. Our result may unleash the optimization power of quantum program compilation to take more aggressive steps.

Index Terms:
Quantum circuit, quantum computing, quantum measurement, equivalence checking

I Introduction

Equivalence checking plays an important role in the design flow of modern integrated circuits. It ensures the transformations done in the corresponding synthesis steps do not introduce errors. Many scalable techniques, such as random simulation, satisfiability solving, decision diagrams, and structural similarity detection, have been developed and widely applied in classical circuit verification [1, 2, 3, 4, 5, 6, 7]. Due to the rapid progress of hardware and software developments in quantum computation, quantum circuit verification, especially equivalence checking, arises as an important issue in quantum program compilation. Although practical quantum circuits remain relatively small until now, it remains challenging to their equivalence checking due to extraordinary quantum properties. Moreover, as quantum computing technology evolves, the size of quantum circuits will continue growing. Efficient equivalence checking tools for quantum circuits are essential.

Many approaches to equivalence checking of quantum circuits have been proposed [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]. Among them, decision diagrams are a widely adopted data structure for quantum state and operator representation and manipulation [8, 9, 10, 13, 19]. Also, techniques, such as miter construction [8, 10, 12, 13, 18] or tensor network computation [11], have been applied. Some of them focus on the equivalence of reversible circuits, which realize permutation functions, whereas others address the equivalence of more general quantum circuits. For the latter, most of them require total equivalence, namely, the identity of the output states (modulo a global phase difference) between the two circuits under verification. However, since measurement is the only way to extract information from a quantum system, total quantum state equivalence may be unnecessary and only observational equivalence matters. That is, two circuits exhibit the same probability for every possible measurement outcome under a fixed measurement basis. Furthermore, not all qubits of a circuit need to be measured and there can be a set of initial states to be verified not just a particular single initial state. To accommodate such different design constraints, we relax the notion of quantum circuit equivalence and define a generalization, called partial equivalence. That is, two circuits are partially equivalent if, given any valid initial input state, they exhibit the same probability for each measurement outcome. Therefore, two partially equivalent circuits do not exhibit statistical differences by observed measurements. Essentially, the generality of partial equivalence allows more flexibility for quantum circuit synthesis. Fig. 1 shows an example where C1C_{1} and C2C_{2} are partially equivalent, but not totally equivalent.

Refer to caption

Figure 1: A motivating example of partial equivalence where circuits C1C_{1} and C2C_{2} measured on q0q_{0} have the same probability distribution over the outcomes for any given initial input state.

Apart from its application in quantum circuit synthesis, partial equivalence checking has its natural application in quantum program compilation. As the quantum computing technology evolves, many quantum programming languages have been developed, e.g., [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]. Although nowadays most quantum programming languages are abstracted at the gate level, high-level languages will get their increasing presence in the future. Their compilation may require making trade-offs between ancilla bits and quantum gates. To verify the correctness of different compilation choices may require the defined partial equivalence.

Equivalence based on measurement results is not new, nor is the relaxation requiring only some of the input qubits to be data qubits and some of the output qubits to be measured qubits. In fact, partial equivalence can be seen as the case that all principal output qubits in Definition 3 of [32] are to be measured. However, there are no corresponding checking methods so far, and none of the prior work [18, 19, 32, 33, 34] can fully resolve the partial equivalence checking problem. In particular, equivalence up to a relative phase discussed in [18] can only cover the case that all output qubits are measured; the method considering measurement-based equivalence in [19] is used for reversible circuits only; the initial state is given and fixed in [32] and [33]; sequential circuit verification in [34] is limited to the assumption that the number of measured output qubits cannot be smaller than the number of data input qubits. In this work, we are not constrained by these limitations. Our methods can be applied to general cases of checking partial equivalence relations. Moreover, they are compatible with the state-of-the-art SliQEC [8] system, which uses Boolean functions represented by decision diagrams to store and manipulate matrices.

The main results of this work include:

  1. 1.

    We characterize partial equivalence (in Section III), which subsumes prior notions of quantum circuit equivalence.

  2. 2.

    We prove a necessary and sufficient condition for two circuits to be partially equivalent (in Section IV).

  3. 3.

    We develop algorithms for partial equivalence checking under general and special settings (in Section V).

  4. 4.

    We conduct experiments to demonstrate the feasibility and effectiveness of our algorithms and the practical applicability of partial equivalence (in Section VI).

II Preliminaries

We briefly provide some background and define notations.

II-A Boolean Function and Decision Diagram

For Boolean connectives, we denote conjunction by “\wedge” (sometimes omitted in a Boolean expression for brevity), disjunction by “\vee,” exclusive-or by “\oplus,” and negation by an overline. Given a Boolean function f(x1,x2,,xn)f(x_{1},x_{2},...,x_{n}), the positive cofactor of ff on xix_{i}, denoted f|xif|_{x_{i}}, is

f|xi=f(x1,x2,,xi1,1,xi+1,,xn).f|_{x_{i}}=f(x_{1},x_{2},...,x_{i-1},1,x_{i+1},...,x_{n}). (1)

Similarly, the negative cofactor of ff on xix_{i}, denoted f|xi¯f|_{\overline{x_{i}}}, is

f|xi¯=f(x1,x2,,xi1,0,xi+1,,xn).f|_{\overline{x_{i}}}=f(x_{1},x_{2},...,x_{i-1},0,x_{i+1},...,x_{n}). (2)

The reduced ordered binary decision diagram (ROBDD), referred to as BDD in the sequel, is a canonical form for Boolean function representation [35]. There are efficient BDD packages for Boolean function manipulation. A BDD is a directed acyclic graph consisting of non-terminal nodes and terminal-0 and terminal-1 nodes. Each non-terminal node vv is associated with a decision variable and has two children v0v_{0} and v1v_{1} pointed to by the 0-branch and 1-branch of vv, respectively. Each node in a BDD corresponds to a Boolean function. Let fv,fv0,fv1f_{v},f_{v_{0}},f_{v_{1}} be the functions of node vv with decision variable xx and its two children v0,v1v_{0},v_{1}, respectively. Then fv0=f|x¯f_{v_{0}}=f|_{\overline{x}} and fv1=f|xf_{v_{1}}=f|_{x}, and the Shannon expansion fv=xfv1x¯fv0f_{v}=xf_{v_{1}}\vee\overline{x}f_{v_{0}} holds. Fig. 2 shows a BDD of function f=aca¯bca¯b¯c¯f=ac\vee\overline{a}bc\vee\overline{a}\overline{b}\overline{c}.

Refer to caption

Figure 2: A BDD of function f=aca¯bca¯b¯c¯f=ac\lor\overline{a}bc\lor\overline{a}\overline{b}\overline{c}.

II-B Quantum States and Quantum Circuits

For a quantum system, at the input end, a qubit is called a data qubit if it is associated with the input data and a non-data qubit otherwise. At the output end, a qubit is called a measured qubit if it is to be measured and a non-measured qubit otherwise. In an nn-qubit quantum system, a quantum state can be represented by a 2n×12^{n}\times 1 vector ϕ=[a0,a1,,a2n1]T\phi=[a_{0},a_{1},...,a_{2^{n}-1}]^{\rm T}, where each aia_{i} is a complex number and the norm of ϕ\phi satisfies ϕ=1\left\lVert\phi\right\rVert=1. We use |0|0\rangle to represent the vector [1,0,,0]T[1,0,...,0]^{\rm T}, |1|1\rangle to represent the vector [0,1,0,,0]T[0,1,0,...,0]^{\rm T}, and so on. For state |i|i\rangle, we let the binary number of integer ii be expressed by qubits q0,,qn1q_{0},\ldots,q_{n-1} with q0q_{0} being the most significant bit. In the sequel, we assume that non-data qubits are fixed to the initial state |0|0\rangle and refer to them as ancilla qubits. This assumption is general in that any other initial state can be transformed from |0|0\rangle.

If we measure all the qubits of ϕ\phi under the computational basis, there will be a probability |ai|2{|a_{i}|}^{2} for the system to get into state |i|i\rangle. Also, if we measure only the first (more significant) mm qubits, for mnm\leq n, of ϕ\phi, the probability of obtaining a state |j|j\rangle, for j{0,1,,2m1}j\in\{0,1,...,2^{m}-1\}, of the measured qubits equals

k=gjg(j+1)1|ak|2,\sum_{k=g\cdot j}^{g\cdot(j+1)-1}{|a_{k}|}^{2}, (3)

where g=2nmg=2^{n-m}.

The evolution of the state of a closed quantum system can be described by quantum gates. For an nn-qubit quantum system, a quantum gate can be defined by a 2n×2n2^{n}\times 2^{n} unitary matrix UU of complex numbers. A unitary matrix UU is a matrix satisfying UU=IU^{\dagger}U=I, where UU^{\dagger} is the conjugate transpose of UU and II is the identity matrix. Thus, UU^{\dagger} is also the inverse of UU. A quantum circuit is made up of a series of quantum gates G1,G2,GdG_{1},G_{2},...G_{d}. Let their corresponding unitary matrices be U1,U2,,UdU_{1},U_{2},...,U_{d} and let ϕ0\phi_{0} be the initial state. Then the final output state ϕout\phi_{out} of the circuit is derived by the product

ϕout=UdUd1U1ϕ0.\phi_{out}=U_{d}\cdot U_{d-1}\cdot\ldots\cdot U_{1}\cdot\phi_{0}. (4)

Therefore, we can view the whole circuit as a quantum gate with the unitary matrix

U=UdUd1U1.U=U_{d}\cdot U_{d-1}\cdot...\cdot U_{1}. (5)

Particularly, when an nn-qubit quantum circuit with unitary matrix UU yields a permutation of the computational basis {|0,,|2n1}\{|0\rangle,...,|2^{n}-1\rangle\}, it is referred to as a reversible circuit [36, 37, 38, 39].

Because only the global unitary operator of a quantum circuit matters in our discussion, for simplicity we abstract away gate implementation details and denote an nn-qubit quantum circuit CC as a 4-tuple C=(d,m,k,U)C=(d,m,k,U) for n=d+kn=d+k, where

  • dd is the number of data qubits, specifically, q0,,qd1q_{0},\ldots,q_{d-1},

  • mm is the number of measured qubits, specifically, q0,,qm1q_{0},\ldots,q_{m-1},

  • kk is the number of non-data qubits, and

  • UU is the global unitary operator of CC of size 2n×2n2^{n}\times 2^{n}.

Note that above we make the data qubits and measured qubits follow the same order. This arrangement is without loss of generality as the qubits can be reordered by swap gates.

Given a quantum circuit C=(d,m,k,U)C=(d,m,k,U) with the data qubits prepared at initial state ψ\psi, we denote the probability of CC collapsing to state |t|t\rangle upon a measurement on the measured qubits under the computational basis as P(t|ψ,C)P(t|\psi,C), which can be calculated by

P(t|ψ,C)=i=gtgt+g1|j=02n1ui,2kjaj|2,P(t|\psi,C)=\sum_{i=g\cdot t}^{g\cdot t+g-1}\left|\sum_{j=0}^{2^{n}-1}u_{i,2^{k}\cdot j}\cdot a_{j}\right|^{2}, (6)

where g=2(d+km)g=2^{(d+k-m)} and ux,yu_{x,y} denotes the (x,y)th(x,y)^{\mathrm{th}} entry of UU.

We remark that a matrix can be represented numerically or algebraically, and explicitly or implicitly for different computation choices. In this work, we adopt an algebraic and implicit approach to matrix representation and manipulation based on [40, 8].

II-C Algebraic and Implicit Representation of Matrices

Following [41], we represent a complex number α\alpha\in\mathbb{C} algebraically by

α=12k(aω3+bω2+cω+d),\alpha=\frac{1}{\sqrt{2}^{k}}(a\omega^{3}+b\omega^{2}+c\omega+d), (7)

where ω=eiπ/4\omega=e^{i\pi/4} and a,b,c,da,b,c,d\in\mathbb{Z}. In [8], the bit-slicing technique [40] is applied to represent a 2n×2n2^{n}\times 2^{n} complex matrix by one integer-type variable for the kk-coefficient and 4r4r 2n2n-variable BDDs for the aa-, bb-, cc-, dd-coefficients, assuming an integer is encoded with a bit-vector of size rr. Let the BDDs of the ithi^{\rm th} bit of the a,b,c,da,b,c,d-coefficient Boolean matrices be Fai,Fbi,Fci,FdiF^{ai},F^{bi},F^{ci},F^{di}, respectively, for i=0,,r1i=0,\ldots,r-1. In the sequel, we do not distinguish among Fai,Fbi,Fci,FdiF^{ai},F^{bi},F^{ci},F^{di} and simply use FF to refer to every individual of these BDDs. Essentially, such a function FF of 2n2n variables implicitly represents a 2n×2n2^{n}\times 2^{n} Boolean matrix, whose rows and columns are indexed by the output qubits and input qubits, respectively, in our context. For qubit qiq_{i}, two variables xix_{i} and yiy_{i} are introduced to encode its corresponding output qubit and input qubit, respectively, for the Boolean matrix. The variables X={x0,,xn1}X=\{x_{0},\ldots,x_{n-1}\} and Y={y0,,yn1}Y=\{y_{0},\ldots,y_{n-1}\} are referred to as the row-variables and column-variables, respectively. Effectively, FF is a function over variables XYX\cup Y. That is, the (i,j)th(i,j)^{\mathrm{th}} entry of the Boolean matrix corresponds to the Boolean value of FF for XX and YY being substituted with the nn-bit binary numbers of ii and jj, respectively. In addition to the implicit matrix representation using Boolean functions, the multiplication of matrices can be done effectively through Boolean function manipulations as shown in [8].

III Partial Equivalence Problem

In this section, we formally define partial equivalence between two quantum circuits, and compare it to other equivalence notions in the literature.

III-A Problem Statement

The partial equivalence relation is formally defined as follows.

Definition 1 (Partial Equivalence).

Two circuits C1=(d,m,k,U1)C_{1}=(d,m,k,U_{1}) and C2=(d,m,k,U2)C_{2}=(d,m,k,U_{2}) are partially equivalent, denoted by C1C2C_{1}\sim C_{2}, if

P(t|ψ,C1)=P(t|ψ,C2)P(t|\psi,C_{1})=P(t|\psi,C_{2}) (8)

holds for any state vector ψ\psi of size 2d2^{d} and any t=0,1,,2m1t=0,1,...,2^{m}-1.

Note that, without loss of generality, we assume two circuits have the same number of ancilla qubits because we can always add dummy ancilla qubits to the one with fewer ancilla qubits. The partial equivalence checking problem can be stated as follows.

Problem 1 (Partial Equivalence Checking).

Given two circuits C1=(d,m,k,U1)C_{1}=(d,m,k,U_{1}) and C2=(d,m,k,U2)C_{2}=(d,m,k,U_{2}), we are asked to check whether C1C_{1} and C2C_{2} are partially equivalent.

Partial equivalence can be applied to check the conformance between two quantum circuits that compute the same function but may produce different output states.

We remark that, in prior work [32], the notion of m-equivalence and q-equivalence are defined. For m-equivalent circuits, it requires that all qubits have fixed initial states, and it allows that we only measure on part of the qubits. Two circuits are m-equivalent if they have the same probability distribution of measurement outcomes. That is, m-equivalence is a special case of partial equivalence when we set d=0d=0 for the circuits under verification. On the other hand, for q-equivalence, ancilla qubits are allowed. Besides, the output qubits are divided into two groups, the main output qubits and the garbage qubits. It is required that if we measure the garbage qubits, the quantum states of the main output qubits cannot be affected by the measurement outcomes of the garbage qubits. Two circuits are q-equivalent if the output states of the main output qubits between the two circuits are the same for any input state.

To demonstrate the usefulness of partial equivalence, we take the period-finding quantum algorithm [42] as an example. In Fig. 3 (a), the circuit computes the function

U1|x={|3xmod 5,if x<5,|x,otherwise.U_{1}|x\rangle=\begin{cases}|3x\;\rm{mod}\;5\rangle,&\text{if $x<5$,}\\ |x\rangle,&\text{otherwise.}\end{cases}\qquad\qquad\; (9)

In Fig. 3 (b), the circuit computes the function

U2|x={|3(x+1)1mod 5,if x<5,|x,otherwise.U_{2}|x\rangle=\begin{cases}|3(x+1)-1\;\rm{mod}\;5\rangle,&\text{if $x<5$,}\\ |x\rangle,&\text{otherwise.}\end{cases} (10)

It can be seen that the U1U_{1} and U2U_{2} have the same period 4 (for U1U_{1} mapping |1|3|4|2|1|1\rangle\mapsto|3\rangle\mapsto|4\rangle\mapsto|2\rangle\mapsto|1\rangle and U2U_{2} mapping |1|0|2|3|1|1\rangle\mapsto|0\rangle\mapsto|2\rangle\mapsto|3\rangle\mapsto|1\rangle). For the circuit in Fig. 3 (c), let the first register as the data qubits and the second register as the ancilla qubits. Let C1C_{1} (resp. C2C_{2}) be the circuit of Fig. 3 (c) with the oracle blocks UU being substituted with function U1U_{1} (resp. U2U_{2}). Then C1C_{1} and C2C_{2} are partially equivalent under d=m=3d=m=3 despite the fact that neither of total equivalence and q-equivalence holds and m-equivalence is not applicable them. The fact that C1C_{1} and C2C_{2} are not totally equivalent is immediate. Also, we verify that the q-equivalence [32] does not hold because the output state of the 1st1^{\rm st} register is affected by the 2nd2^{\rm nd} register; the m-equivalence is not applicable because only the 2nd2^{\rm nd} register is taken as the ancilla qubits. To the best of our knowledge, no previous tools can check this equivalence.

Refer to caption

Figure 3: (a) A reversible circuit of period 4 and with input state |1|1\rangle. (b) Another reversible circuit of period 4 and with input state |1|1\rangle. (c) The block diagram of the period-finding (PF) quantum algorithm. For two PF instances, one with UU’s in (c) being substituted with the circuit in (a) and the other with UU’s in (c) being substituted with that in (b), they are partially equivalent under d=m=3d=m=3, i.e., the upper (resp. lower) 3 qubits in (c) are data qubits (resp. ancilla qubits fixed to initial state |0|0\rangle).

We observe that when the two circuits under verification have no ancilla qubits, the computation of their partial equivalence checking can be simplified. It motivates the consideration of the following special case.

Problem 2 (Zero-Ancilla Partial Equivalence Checking).

Given two circuits C1=(d,m,0,U1)C_{1}=(d,m,0,U_{1}) and C2=(d,m,0,U2)C_{2}=(d,m,0,U_{2}), we are asked to check whether C1C_{1} and C2C_{2} are partially equivalent.

For instance, consider the aforementioned C1C_{1} and C2C_{2} circuits. If we properly add some quantum gates to the beginning of the second register of C2C_{2} (e.g., achieving the mapping |x|x1mod5|x\rangle\mapsto|x-1\mod 5\rangle for x<5x<5 and |x|x|x\rangle\mapsto|x\rangle for x5x\geq 5), then C1C_{1} and C2C_{2} can remain partially equivalent even if all qubits are taken as data qubits. In this case both circuits have no ancilla qubits (i.e., k=0k=0).

III-B Comparison of Different Equivalences

There have been several quantum circuit equivalences being studied. We compare them with the partial equivalence.

Besides the previously mentioned m-equivalence and q-equivalence, the most common definition of quantum circuit equivalence is total equivalence, which requires that two circuits produce the same state vector (up to a global-phase difference) for any input state [8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]. That is, the output states of the two circuits should be numerically identical except for a scalar multiplication factor.

Another well-known equivalence is functional equivalence of reversible circuits. In reversible circuits, they may also include ancilla qubits, and some qubits may be discarded in the end (i.e., we do not care about their measurement results). Some may even include don’t-care conditions, which means that we do not care about all of or part of the measurement results under some input states [36].

Though partial equivalence looks similar to functional equivalence of reversible circuits and m-equivalence, we note that partial equivalence is much more complex, because measurement is a non-linear operation and does not have superposition property. That is, even if we know

P(t|ψi,C1)=P(t|ψi,C2)P(t|\psi_{i},C_{1})=P(t|\psi_{i},C_{2}) (11)

for i=1,2i=1,2, we cannot infer

P(t|(αψ1+βψ2),C1)=P(t|(αψ1+βψ2),C2).P\left(t|(\alpha\psi_{1}+\beta\psi_{2}),C_{1}\right)=P\left(t|(\alpha\psi_{1}+\beta\psi_{2}),C_{2}\right). (12)

Therefore, even with a brute-force search to enumerate the probability distribution under ψ=ei,2d\psi=e_{i,2^{d}} for all ii, the partial equivalence relation is still not guaranteed. Some other approaches to checking partial equivalence are essential.

We briefly summarize the main differences between the equivalence types mentioned above in Table I. We further illustrate the relations between partial equivalence and other equivalences in Fig. 4. Note that partial equivalence subsumes total equivalence, functional equivalence of reversible circuits, and q-equivalence as these equivalences are special cases of partial equivalence. Specifically, it is evident that totally equivalent circuits must be partially equivalent. For reversible circuits, it can be derived from definition that P(t|ψ,C)P(t|\psi,C) is either 0 or 1 for all ψ=|0,|1,,|2d1\psi=|0\rangle,|1\rangle,...,|2^{d}-1\rangle, so Eq. (12) holds. Therefore, functionally equivalent reversible circuits without don’t care conditions must be partially equivalent. As for m-equivalence, it can be clearly seen that we can apply partial equivalence to m-equivalence by setting d=0d=0. For q-equivalence, as the states of main output qubits between two circuits are identical, we must get the same probability distribution of the two circuits if we measure on the main output qubits. Thus, q-equivalent circuits must be partially equivalent.

Refer to caption

Figure 4: Relation between partial equivalence and other known equivalences. For m-equivalence, it is just the case that setting d=0d=0 of partial equivalence and thus not displayed in the diagram.
TABLE I: Comparison of Different Notions of Quantum Circuit Equivalence
equivalence type equivalence form inputs outputs special restrictions
partial equivalence measurement probability allow ancilla allow garbage -
total equivalence numerical quantum state usually no ancilla usually no garbage -
functionally equivalent
reversible circuits
Boolean values allow ancilla allow garbage restricted on reversible circuits
m-equivalence measurement probability no ancilla allow garbage -
q-equivalence numerical quantum state allow ancilla allow garbage main output independent of garbage output

IV Necessary and Sufficient Conditions of Partial Equivalence

In this section, we derive a necessary and sufficient condition for partial equivalence, which can be directly used to solve Problem 1 if the matrices of both circuits are explicitly given. For the special case with no ancilla qubits, we show that a simpler necessary and sufficient condition exists.

We define the following notation to locate some sub-matrices.

Definition 2.

Given a circuit C=(d,m,k,U)C=(d,m,k,U), let up,qu_{p,q} represents the (p,q)th(p,q)^{\mathrm{th}} entry of UU and g=2(d+km)g=2^{(d+k-m)}, then vi,j(U)v_{i,j}(U) of size gg is defined to be

vi,j(U)=[ugi,2kj,ugi+1,2kj,,ugi+g1,2kj]T,v_{i,j}(U)=[u_{gi,2^{k}j},u_{gi+1,2^{k}j},...,u_{gi+g-1,2^{k}j}]^{\rm{T}}, (13)

where i=0,1,,2m1i=0,1,...,2^{m-1}, j=0,1,,2d1j=0,1,...,2^{d-1}. An example for d=m=k=1d=m=k=1 is illustrated as follows. In addition, we use ei,le_{i,l} to represent the unit vector [0,0,,0,1,0,,0][0,0,...,0,1,0,...,0] of length ll and the ithi^{th} position (ii starts from 0) being 1.

[Uncaptioned image]

We remark that the definition of {vi,j(U)}\{v_{i,j}(U)\} is critical in deriving the necessary and sufficient condition for the partial equivalence. We will show this point in the following section.

IV-A Property of Partial Equivalence

To solve Problem 1, we first derive the condition when C1C2C_{1}\sim C_{2} holds.

Theorem 1 (Partial Equivalence).

Given two circuits C1=(d,m,k,U1)C_{1}=(d,m,k,U_{1}) and C2=(d,m,k,U2)C_{2}=(d,m,k,U_{2}), then C1C2C_{1}\sim C_{2} if and only if

vt,p(U1)vt,q(U1)=vt,p(U2)vt,q(U2)v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})=v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2}) (14)

holds for all t=0,1,,2m1t=0,1,...,2^{m-1} and p,q=0,1,,2d1p,q=0,1,...,2^{d-1}.

Proof.

Let the input state vector ψ=[a0,a1,,a(2d1)]T\psi=[a_{0},a_{1},...,a_{(2^{d}-1)}]^{\rm{T}}. The matrix U1U_{1} and U2U_{2} will respectively be applied on ψe0,2k\psi\otimes e_{0,2^{k}}. Let’s consider the output |t=t0t1tm1|t\rangle=t_{0}t_{1}...t_{m-1}. Let g=2(d+km)g=2^{(d+k-m)} and u1,x,yu_{1,x,y} denotes the (x,y)(x,y) entry of U1U_{1}, then

P(t|ψ,C1)=i=gtgt+g1|j=02d1u1,i,2kjaj|2\displaystyle\qquad\quad P(t|\psi,C_{1})=\sum_{i=gt}^{gt+g-1}\left|\sum_{j=0}^{2^{d}-1}u_{1,i,2^{k}\cdot j}\cdot a_{j}\right|^{2}
=i=gtgt+g1(p=02d1q=02d1Re{(u1,i,2kpap)(u1,i,2kqaq)})\displaystyle=\sum_{i=gt}^{gt+g-1}\left(\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{(u_{1,i,2^{k}\cdot p}\cdot a_{p})^{*}(u_{1,i,2^{k}\cdot q}\cdot a_{q})\right\}\right)
=p=02d1q=02d1Re{apaqi=gtgt+g1u1,i,2kpu1,i,2kq}\displaystyle=\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{a_{p}^{*}a_{q}\sum_{i=gt}^{gt+g-1}u_{1,i,2^{k}\cdot p}^{*}\cdot u_{1,i,2^{k}\cdot q}\right\}
=p=02d1q=02d1Re{apaqvt,p(U1)vt,q(U1)},\displaystyle=\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{a_{p}^{*}a_{q}\cdot v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})\right\}, (15)

where Re{x}Re\left\{x\right\} denotes the real part of the complex number xx. Similarly, we get

P(t|ψ,C2)=p=02d1q=02d1Re{apaqvt,p(U2)vt,q(U2)}.P(t|\psi,C_{2})=\sum_{p=0}^{2^{d}-1}\sum_{q=0}^{2^{d}-1}Re\left\{a_{p}^{*}a_{q}\cdot v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2})\right\}. (16)

If C1C2C_{1}\sim C_{2}, then P(t|ψ,C1)=P(t|ψ,C2)P(t|\psi,C_{1})=P(t|\psi,C_{2}) must hold for all ψ\psi, so vt,p(U1)vt,q(U1)=vt,p(U2)vt,q(U2)v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})=v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2}) must hold for all pp and qq. We also need the above equality hold for any t=0,1,,2m1t=0,1,...,2^{m-1}. Thus,

C1C2\displaystyle C_{1}\sim C_{2} P(t|ψ,C1)=P(t|ψ,C2)tψ\displaystyle\Leftrightarrow P(t|\psi,C_{1})=P(t|\psi,C_{2})\;\forall t\forall\psi
vt,p(U1)vt,q(U1)=vt,p(U2)vt,q(U2)tpq.\displaystyle\Leftrightarrow v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1})=v_{t,p}(U_{2})^{\dagger}v_{t,q}(U_{2})\;\forall t\forall p\forall q.

As mentioned before, brute-force search does not work for partial equivalence. Therefore, Theorem 1 provides the first way to formally verify the partial equivalence relation.

From Theorem 1, we can see that whether C1C2C_{1}\sim C_{2} only depends on the vi,j(U)v_{i,j}(U)’s of two matrices, and is not related to other entries.

IV-B Property of Zero-Ancilla Partial Equivalence

For the case that both circuits have no ancilla qubits, we provide another necessary and sufficient condition for partial equivalence as follows.

Theorem 2 (Zero-Ancilla Partial Equivalence).

Given C1=(d,m,0,U1)C_{1}=(d,m,0,U_{1}) and C2=(d,m,0,U2)C_{2}=(d,m,0,U_{2}), then C1C2C_{1}\sim C_{2} if and only if U1U21U_{1}U_{2}^{-1} is in the following form.

U1U21=[U0,0U1,1U2m1,2m1],U_{1}U_{2}^{-1}=\begin{bmatrix}U_{0,0}&&&\\ &U_{1,1}&&\\ &&\ddots&\\ &&&U_{2^{m}-1,2^{m}-1}\end{bmatrix}, (17)

where each Ui,iU_{i,i} is a 2dm×2dm2^{d-m}\times 2^{d-m} square matrix, and the other entries in U1U21U_{1}U_{2}^{-1} not labeled are 0.

Proof.

Firstly we show a transformed problem. By definition, C1C2C_{1}\sim C_{2} if and only if P(t|ψ,C1)=P(t|ψ,C2)P(t|\psi,C_{1})=P(t|\psi,C_{2}) for any ψ\psi and tt. Let U1ψ=ϕ1U_{1}\psi=\phi_{1}, U2ψ=ϕ2U_{2}\psi=\phi_{2}, and then we have U1U21ϕ2=ϕ1U_{1}U_{2}^{-1}\phi_{2}=\phi_{1}. Because U1U_{1} and U2U_{2} are both reversible matrices, ϕ1\phi_{1} and ϕ2\phi_{2} can be any vector of size 2d2^{d} (and satisfying ϕ1=ϕ2=1\left\lVert\phi_{1}\right\rVert=\left\lVert\phi_{2}\right\rVert=1). Thus, C1C2C_{1}\sim C_{2} if and only if P(t|ϕ2,C1C21)=P(t|ϕ2,I)P(t|\phi_{2},C_{1}C_{2}^{-1})=P(t|\phi_{2},I) for any ϕ2\phi_{2} and any tt, where C1C21C_{1}C_{2}^{-1} is the circuit with unitary matrix U1U21U_{1}U_{2}^{-1}, and II is the identity circuit that directly sends the input to the output.

We will use the transformed problem in the following proof.

(i) the “only if ” part:

If C1C2C_{1}\sim C_{2}, P(t|ϕ2,C1C21)=P(t|ϕ2,I)P(t|\phi_{2},C_{1}C_{2}^{-1})=P(t|\phi_{2},I) must hold for any ϕ2\phi_{2} and any tt. We firstly consider ϕ2=e0,2d\phi_{2}=e_{0,2^{d}}. We can easily find that

P(t|e0,2d,I)={1,𝑖𝑓t=0000,𝑒𝑙𝑠𝑒.P(t|e_{0,2^{d}},I)=\left\{\begin{aligned} &1,\;\mathit{if}\;t=00...0\\ &0,\;\mathit{else}\end{aligned}\right.. (18)

Let U=U1U21U=U_{1}U_{2}^{-1} and ux,yu_{x,y} denotes the (x,y)(x,y) entry of UU. To make P(000|e0,2d,C1C21)=1P(00...0|e_{0,2^{d}},C_{1}C_{2}^{-1})=1, ui,0u_{i,0} should be 0 for all i2dmi\geq 2^{d-m}.

Following the same method, now we generally consider ϕ2=ej,2d\phi_{2}=e_{j,2^{d}}, where j=0,1,,2d1j=0,1,...,2^{d-1}. Let g=j/2dmg=\lfloor j/2^{d-m}\rfloor, then

P(t|ej,2d,I)={1,𝑖𝑓t=g0,𝑒𝑙𝑠𝑒\displaystyle\;\;\;\;\;\;P(t|e_{j,2^{d}},I)=\left\{\begin{aligned} &1,\;\mathit{if}\;t=g\\ &0,\;\mathit{else}\end{aligned}\right.\;\;\;\;\;\;\;\;
P(t|ej,2d,C1C21)={1,𝑖𝑓t=g0,𝑒𝑙𝑠𝑒\displaystyle\Rightarrow P(t|e_{j,2^{d}},C_{1}C_{2}^{-1})=\left\{\begin{aligned} &1,\;\mathit{if}\;t=g\\ &0,\;\mathit{else}\end{aligned}\right.\;\;\;\;\;\;\;\;
ui,j=0i[2dmg, 2dm(g+1)).\displaystyle\Rightarrow u_{i,j}=0\;\;\forall i\notin\left[2^{d-m}g,\,2^{d-m}(g+1)\right).

That is, U1U21U_{1}U_{2}^{-1} must be in form Eq. (17).

(ii) the “if ” part:

If U=U1U21U=U_{1}U_{2}^{-1} is in form Eq. (17), we firstly note that:

UU=[U0,0U0,0U1,1U1,1].U^{\dagger}U=\begin{bmatrix}U_{0,0}^{\dagger}U_{0,0}&&&\\ &U_{1,1}^{\dagger}U_{1,1}&&\\ &&\quad\ddots\quad&\\ &&&\quad\ddots\quad\end{bmatrix}. (19)

Because U=U1U21U=U_{1}U_{2}^{-1} is a unitary matrix, Ui,iUi,iU_{i,i}^{\dagger}U_{i,i} must be identity for all ii. Thus, each Ui,iU_{i,i} must be a unitary matrix.

Now for any ϕ2\phi_{2}, let ϕ2=[a0,a1,,a2d1]T=[A0,A1,,A2m1]T\phi_{2}=\left[a_{0},a_{1},...,a_{2^{d}-1}\right]^{\rm T}=\left[A_{0},A_{1},...,A_{2^{m}-1}\right]^{\rm T}, where each AiA_{i} is of length 2dm2^{d-m}. That is, we equally divide ϕ2\phi_{2} into 2m2^{m} groups. Then

P(t|ϕ2,C1C21)=Ut,tAt.P(t|\phi_{2},C_{1}C_{2}^{-1})=\left\lVert U_{t,t}A_{t}\right\rVert. (20)

Because Ut,tU_{t,t} is unitary, it is true that

Ut,tAt=AtUt,tUt,tAt=At.\left\lVert U_{t,t}A_{t}\right\rVert=A_{t}^{\dagger}U_{t,t}^{\dagger}U_{t,t}A_{t}=\left\lVert A_{t}\right\rVert. (21)

Because At=P(t|ϕ2,I)\left\lVert A_{t}\right\rVert=P(t|\phi_{2},I), P(t|ϕ2,C1C21)=P(t|ϕ2,I)P(t|\phi_{2},C_{1}C_{2}^{-1})=P(t|\phi_{2},I) holds for all ϕ2\phi_{2} and tt. Therefore, C1C2C_{1}\sim C_{2}. ∎

V Algorithms for Partial Equivalence Checking

In this section, we present algorithms to solve Problems 1 and 2 based on the theorems established in Section IV. We emphasize that these algorithms can be realized by using the bit-sliced algebraic representation of matrices [8].

V-A Partial Equivalence Checking Algorithm

If U1U_{1} and U2U_{2} are given in their explicit numerical form, we can easily extract vi,j(U)v_{i,j}(U)’s and enumerate all (t,p,q)(t,p,q) triples to check whether C1C2C_{1}\sim C_{2} by Theorem 1, with O(23d+k)O(2^{3d+k}) steps of complex number multiplications. However, as U1U_{1} and U2U_{2} may be in an implicit form, we give another approach by using matrix entry manipulation and matrix multiplication, as shown in Algorithm 1.

Algorithm 1 PEC(C1C_{1},C2C_{2})
0:   C1C_{1}: C1=(d,m,k,U1)C_{1}=(d,m,k,U_{1}),     C2C_{2}: C2=(d,m,k,U2)C_{2}=(d,m,k,U_{2})
0:   Whether C1C2C_{1}\sim C_{2} or not
1:  extra:=max{mk,0}extra:={\rm max}\{m-k,0\}
2:  k:=k+extrak:=k+extra
3:  U1:=U1I2extraU_{1}:=U_{1}\otimes I_{2^{extra}}
4:  U1:=U1U_{1}^{\prime}:=U_{1}
5:  Equally divide the columns of U1U_{1}^{\prime} into 2d2^{d} parts. For each part, keep the leftmost column unchanged, and set the remains to 0.
6:  Equally divide the rows of U1U_{1}^{\prime} into 2m2^{m} parts. For the ithi^{th} part (ii starts from 0), right shift the contents for ii columns.
7:  U1′′:=U1U1U_{1}^{\prime\prime}:=U_{1}^{\dagger}U_{1}^{\prime}
8:  Equally divide the rows of U1′′U_{1}^{\prime\prime} into 2d2^{d} parts. For each part, keep the top row and set the other entries to 0.
9:  Repeat step 3\sim8 for U2U_{2} and get U2′′U_{2}^{\prime\prime}.
10:  return  whether U1′′=U2′′U_{1}^{\prime\prime}=U_{2}^{\prime\prime} or not

To better illustrate steps 5 and 6, we note that after step 6, each vi,j(U1)v_{i,j}(U_{1}) for different ii and jj should be separated into different columns, and all entries not storing values of vi,j(U1)v_{i,j}(U_{1}) are 0. An example of U1U_{1}^{\prime} with d=m=k=1d=m=k=1 after this step is shown as follows.

[Uncaptioned image]

To see the correctness of Algorithm 1, we first consider the case mkm\geq k. It can be easily checked that after step 8, the (2kp, 2kq+t)th(2^{k}p,\>2^{k}q+t)^{\mathrm{th}} entry of U1′′U_{1}^{\prime\prime}\>will store the value of vt,p(U1)vt,q(U1)v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1}), while the other entries are 0. Note that in step 2 we have added some ancilla qubits and now k=mk=m. Therefore, we can easily check whether U1′′=U2′′U_{1}^{\prime\prime}=U_{2}^{\prime\prime} holds to determine whether C1C2C_{1}\sim C_{2}.

For the case m<km<k, the (2kp, 2kq+t)th(2^{k}p,\>2^{k}q+t)^{\mathrm{th}} entry of U1′′U_{1}^{\prime\prime}\>still stores the value of vt,p(U1)vt,q(U1)v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1}), but only for those with t=0,1,,2m1t=0,1,...,2^{m-1}. For t=2m,,2k1t=2^{m},...,2^{k-1}, the value of vt,q(U1)vt,q(U1)v_{t,q}(U_{1})^{\dagger}v_{t,q}(U_{1}) does not exist, but we can see that the (2kp, 2kq+t)th(2^{k}p,\>2^{k}q+t)^{\mathrm{th}} entry of U1′′U_{1}^{\prime\prime}\>will always be 0 because the entries in the (2kq+t)th{(2^{k}q+t)}^{\rm{th}} column of U1U_{1}^{\prime} are all 0’s after step 6. That is, though we do not clear these entries in step 8, they are automatically set to 0. Therefore, we can still determine whether C1C2C_{1}\sim C_{2} by checking whether U1′′=U2′′U_{1}^{\prime\prime}=U_{2}^{\prime\prime}.

We further point out that this can be done quite easily using bit-sliced algebraic representation of a matrix [8], as shown in Algorithm 2. As introduced before, a matrix is represented by multiple BDDs, but here we use single F1F_{1} and F2F_{2} as representatives for all BDDs in C1C_{1} and C2C_{2} respectively for convenience. In steps 5, 7, 10, 13, we mean to do the same operation on each BDD, and in step 15, we need all corresponding BDDs of the two circuits to be the same.

Algorithm 2 PEC_BDD(C1C_{1},C2C_{2})
0:   C1C_{1}: C1=(d,m,k,U1)C_{1}=(d,m,k,U_{1}),     C2C_{2}: C2=(d,m,k,U2)C_{2}=(d,m,k,U_{2})
0:   Whether C1C2C_{1}\sim C_{2} or not
1:  extra:=max{mk,0}extra:={\rm max}\{m-k,0\}
2:  k:=k+extrak:=k+extra
3:  Add extraextra row-variables and column-variables to C1C_{1}.
4:  for i=d,d+1,,d+k1i=d,d+1,...,d+k-1 
5:     F1:=F1|yi¯F_{1}:=F_{1}|_{\overline{y_{i}}}
6:  for  i=0,1,,m1i=0,1,...,m-1  
7:     F1:=F1xiyd+km+i¯F_{1}:=F_{1}\land\overline{x_{i}\oplus y_{d+k-m+i}}
8:  if m<km<k 
9:     for i=d,d+1,,d+km1i=d,d+1,...,d+k-m-1 
10:        F1:=F1yi¯F_{1}:=F_{1}\,\land\,\overline{y_{i}}
11:  Apply the inverse circuit C11C_{1}^{-1} on F1F_{1} using the method in [8]
12:  for i=d,d+1,,d+k1i=d,d+1,...,d+k-1 
13:     F1:=F1xi¯F_{1}:=F_{1}\land\overline{x_{i}}
14:  Repeat step 3 to 13 for C2C_{2} and get F2F_{2}
15:  return  whether F1=F2F_{1}=F_{2} or not

We use steps 4 to 10 in Algorithm 2 to replace steps 5 and 6 in Algorithm 1. In steps 4 to 5, we divide the columns of U1U_{1} into 2d2^{d} parts. For each part, we use cofactor operation to copy the leftmost columns to the other columns. In steps 6 to 9, we remove unwanted entries to ensure that each column only contains different vi,j(U1)v_{i,j}(U_{1})’s. The remaining steps have similar functions as in Algorithm 1 but are rewritten in Boolean form.

We can see that with the help of BDD, the operations can be simplified and are more efficient. To be precise, two applications of inverse circuit operations U1U^{-1} are required, and only O(m+k)O(m+k) Boolean AND, OR, and cofactor operations are needed, which is linear to mm and kk. As BDDs often give a good data compression, there may be a significant speedup for Algorithm 2.

V-B Zero-Ancilla Partial Equivalence Checking Algorithm

Similarly, we can directly use Theorem 2 to solve Problem 2 if U1U_{1} and U2U_{2} are explicitly given. Here we emphasize that it is also easy to check this property using bit-sliced algebraic representation of the matrix. The checking procedure is shown in Algorithm 3. Again, we mean to do the same operation on each BDD in step 5, and we require all BDDs to be constant falsefalse in step 6.

Algorithm 3 PEC_BDD_noAncilla(C1C_{1},C2C_{2})
0:   C1C_{1}: C1=(d,m,0,U1)C_{1}=(d,m,0,U_{1}),     C2C_{2}: C2=(d,m,0,U2)C_{2}=(d,m,0,U_{2})
0:   Whether C1C2C_{1}\sim C_{2} or not
1:  Build the BDDs FF representing C1C21C_{1}C_{2}^{-1} using the method in [8]
2:  M:=falseM:=false
3:  for i=0,1,,m1i=0,1,...,m-1 
4:     M:=M(xiyi)M:=M\lor\left(x_{i}\oplus y_{i}\right)
5:  F:=FMF:=F\land M
6:  return  whether F=falseF=false or not

We can see that in Algorithm 3, we do not have to build BDDs for C1C_{1} and C2C_{2} respectively, but rather merge them into one quantum circuit C1C21C_{1}C_{2}^{-1}. As there may be some gates canceled out in C1C21C_{1}C_{2}^{-1}, Algorithm 3 may be more efficient than Algorithm 2.

VI Experimental Evaluation

We implemented Algorithms 2 and 3 in C/C++ under the framework of SliQEC [8], and adopted CUDD [43] package for BDD manipulation. All experiments were conducted on a server with Intel Xeon Silver 4210 CPU at 2.20 GHz and 128 GB RAM. The reported runtime refers to CPU time in seconds. The time-out (TO) limit is set to 600 seconds.

We conducted experiments of partial equivalence checking on benchmarks of period-finding instances, randomly generated instances, Grover search instances, and reversible circuit instances. We also experimented total equivalence checking on randomly generated instances. The results are discussed as follows.

VI-A Checking Period-Finding Algorithm

We used the tool proposed in [39] to implement the reversible circuits realizing U|x=|axmodNU|x\rangle=|ax\mod N\rangle for x<Nx<N and U|x=|a(x+1)1modNU|x\rangle=|a(x+1)-1\mod N\rangle for x<Nx<N, which have the same period with input state |1|1\rangle. We then substituted the reversible circuits into the oracle used in the period finding algorithm. We let the first register consist of 3 qubits, and used one extra ancilla qubit to implement quantum Fourier transform. We then used Algorithm 2 to test the equivalence relation. As mentioned above, this type of equivalence cannot be verified by previous tools, and Algorithm 2 is the only way to check their equivalence relation. We have tested the (a,N)(a,N) pairs for all a{3,5,7,11},N{5,7,11,13}a\in\{3,5,7,11\},N\in\{5,7,11,13\} and a<Na<N. The parameters are set to d=m=3d=m=3 and k=log2N+1k=\lceil\log_{2}N\rceil+1.

The results are shown in Table II. All test cases yield correct answers. As these test cases are relatively small, we can see that the runtime is roughly proportional to the gate count. On the other hand, the memory usages are roughly the same because all test cases have similar numbers of qubits.

TABLE II: Experimental Results on Period-Finding Benchmarks
aa NN
#Gates
in C1C_{1}
#Gates
in C2C_{2}
time (s)
memory
(MB)
3 5 115 87 0.180 12.874
3 7 108 136 0.354 13.914
3 11 227 325 0.794 14.496
3 13 353 248 0.796 13.959
5 7 143 164 0.441 13.922
5 11 381 388 1.083 14.512
5 13 353 507 0.806 13.173
7 11 395 437 1.263 14.520
7 13 479 241 1.136 14.787
11 13 430 416 1.302 14.787

VI-B Checking Random Benchmarks

To further test the scalability of our algorithms, we generated some benchmarks with Clifford+T gates (H, S, T, CNOT gates) and 2-control Toffoli gates. The numbers of data qubits of the benchmarks range from d=5,10,15,20,25,30,35,40,d=5,10,15,20,25,30,35,40, 50,60,70,80,90,10050,60,70,80,90,100. The number mm of measured qubits was fixed to 0.5d\lfloor 0.5d\rfloor, and the gate number was set to about 6.5d6.5d. For each dd, 20 groups of circuits were generated, and each group includes two conditions: ancilla qubits existing or non-existing. We used the latter condition to test Algorithm 3, and both conditions to test Algorithm 2.

The structure of the benchmarks is shown in Fig. 5. Let the partially equivalent circuit pair be C1C_{1} and C2C_{2}, both with dd data qubits and mm measured qubits. The circuits are divided into five parts: H (H gates), T (totally equivalent), P (partially equivalent), A (arbitrary) and C (CNOT gates). If C1C_{1} and C2C_{2} are in the condition without ancilla qubits, only the first four parts are contained. Otherwise, they both contain all five parts.

Refer to caption

Figure 5: Structure of random benchmarks.

In part H, an H gate is firstly applied to each data qubit to impose superposition. In part T, we generate a random sub-circuit with dd qubits and 3d3d gates to apply on both circuits, but all the Toffoli gates in C2C_{2} are decomposed in the way proposed in [44]. In part P, we divide the data qubits into several groups, where each group may contain one or two adjacent qubits. For each group, we apply X1X_{1} and X2X_{2} on C1C_{1} and C2C_{2}, respectively, where X1X_{1} and X2X_{2} are pre-generated subcircuits satisfying X1X2X_{1}\sim X_{2} when all qubits are set to data qubits and measured qubits. X1X_{1} and X2X_{2} are exhaustively searched with up to five gates in total. In part A, different random sub-circuits with dmd-m qubits and dmd-m gates are applied on C1C_{1} and C2C_{2}. These gates trivially do not affect the measurement results of the measured qubits. In part C, we use the ancilla qubits as the control bits to control the upper dd qubits. As ancilla qubits are initially set to 0, these CNOT gates do not affect the circuit.

Although the circuit generation is simple, there are no known circuit simplification methods exploiting partial equivalence. The generated circuits are used to test our algorithms. Furthermore, to validate the correctness of the benchmarks, we use the same generating method to generate some test cases with 3d93\leq d\leq 9. For each test case, we use Qiskit [27] to randomly set an initial state and get the final state vector numerically. We calculate for all possible outputs and make sure that C1C_{1} and C2C_{2} have the same probability distribution under this random initial state.

TABLE III: Experimental Results on Random Partially Equivalent Circuits
dd mm Without ancilla qubits With ancilla qubits
#Gates in C1C_{1} #Gates in C2C_{2} Algorithm 3 Algorithm 2 #Gates in C1C_{1} #Gates in C2C_{2} Algorithm 2
time(s)
memory
(MB)
#TO time(s)
memory
(MB)
#TO time(s)
memory
(MB)
#TO
5 2 26.05 78.95 0.017 12.942 0 0.064 12.954 0 27.05 79.95 0.067 12.954 0
10 5 51.05 157.00 0.050 12.942 0 1.880 25.077 0 52.65 158.60 1.951 24.766 0
15 7 76.00 240.70 0.135 13.855 0 72.416 165.957 1 77.95 243.05 72.576 168.987 1
20 10 102.65 308.75 0.364 14.180 0 117.528 170.217 14 105.85 312.00 129.006 167.451 13
25 12 128.90 372.70 0.723 14.577 0 - - - 132.55 376.15 - - -
30 15 155.45 439.05 1.276 15.798 0 - - - 160.00 443.40 - - -
35 17 180.65 543.15 2.491 17.633 0 - - - 185.50 548.25 - - -
40 20 204.80 587.75 3.554 18.930 0 - - - 210.85 594.00 - - -
50 25 257.00 710.25 8.827 35.281 0 - - - 264.65 717.75 - - -
60 30 309.65 862.90 18.673 45.444 0 - - - 318.25 872.50 - - -
70 35 361.45 1037.45 55.950 78.525 0 - - - 372.10 1048.00 - - -
80 40 414.55 1157.00 105.611 122.198 3 - - - 426.00 1169.25 - - -
90 45 461.65 1328.75 229.473 144.761 6 - - - 475.80 1342.00 - - -
100 50 517.35 1464.15 297.401 160.311 11 - - - 532.70 1480.05 - - -
TABLE IV: Experimental Results on Functionally Equivalent Reversible Circuits
Without ancilla qubits (Algorithm 3)
C1C_{1} C2C_{2} dd mm #Gates in C1C_{1} #Gates in C2C_{2} time (s) memory (MB)
hwb4_52 hwb4_49 4 4 11 17 0.0128 12.6730
hwb6_58 hwb6_56 6 6 42 126 0.0214 12.9434
graycode6_47 graycode6_48 6 6 5 5 0.0132 12.9434
ham3_103 ham3_102 3 3 4 5 0.0128 12.6730
urf1_150 urf1_149 9 9 1517 11554 5.5048 15.4092
urf2_153 urf2_152 8 8 638 5030 1.1813 13.9059
urf3_156 urf3_155 10 10 2732 26468 25.6478 17.7562
urf5_159 urf5_158 9 9 499 10276 3.5319 15.4010
urf6_281 urf6_160 15 15 5102 10740 188.8090 112.1649
alu-v1_28 alu-v1_29 5 1 8 8 0.0101 12.9393
alu-v2_33 alu-v2_30 5 1 7 18 0.0130 13.0048
alu-v3_34 alu-v3_35 5 1 8 8 0.0110 12.9393
alu-v4_36 alu-v4_37 5 1 8 8 0.0127 12.9393
With ancilla qubits (Algorithm 2)
C1C_{1} C2C_{2} dd mm #Gates in C1C_{1} #Gates in C2C_{2} time (s) memory (MB)
hwb5_53 hwb5_300 5 5 55 101 3.8765 31.1869
alu-v0_26 alu-bdd_288 5 1 7 10 0.0146 12.9434
ham7_104 ham7_299 7 7 23 76 0.7321 16.2570
ham15_107 ham15_298 15 15 132 184 89.9754 203.5876
nestedif2_16_396 nestedif2_16_468 34 32 256 266 3.6910 37.0237

The experimental results are shown in Table IV. The runtime and memory usage shown are averaged over the 20 test cases excluding those timed out. As seen, Algorithm 3 can easily scale up to tens of qubits. However, for d>20d>20, Algorithm 2 can hardly finish the computation within the time limit. It may be because, for Algorithm 3, FF keeps representing a unitary matrix of a quantum circuit almost during the whole process, which may be more “regular” for BDDs to represent. Moreover, due to the structure of C1C21C_{1}C_{2}^{-1}, some gates may be canceled out and the BDDs are more likely to be maintained as an identity matrix, as mentioned in [8]. In contrast, the entries of the final matrix in Algorithm 2 contain the values of vt,p(U1)vt,q(U1)v_{t,p}(U_{1})^{\dagger}v_{t,q}(U_{1}), which can be arbitrary real numbers. Besides, we fix mm to 0.5d\lfloor 0.5d\rfloor in our benchmarks, and the second step in Algorithm 2 may further increase the number of qubits. Thus, the runtime and memory usage of Algorithm 2 grow more rapidly than those of Algorithm 3.

VI-C Checking Reversible Circuits

As mentioned, functionally equivalent reversible circuits without don’t-care conditions must be partially equivalent. For those with ancilla inputs, Algorithm 2 is needed. On the other hand, Algorithm 3 is sufficient if no ancilla inputs exist. We used some test cases from RevLib [45] and used Algorithms 2 and 3 to test their partial equivalence to justify this point. The experimental results are shown in Table IV. It is justified that all functionally equivalent reversible circuits are indeed partially equivalent. The runtime and memory usage mainly grow proportional to the gate-count increase.

VI-D Checking Grover’s Algorithm

If reversible circuits are used as part of a quantum circuit, the entire quantum circuit may not be a reversible circuit anymore, and its verification becomes more complex. We took Grover’s algorithm as an example for case study. In the experiment, we created a reversible circuit as the oracle function in Grover’s algorithm. We also used the way proposed in [38] to implement the same function with one ancilla qubit. Then, Algorithm 2 was applied to check the equivalence of the two circuit. The data qubit number dd and measured qubit number mm range from d=m=6,10,14,18,22,26d=m=6,10,14,18,22,26.

The experimental results are shown in Table V. The runtime and memory usage grow as dd increases. It is reasonable because the circuit size is larger. We note that quantum circuits of Grover’s algorithm with the same oracle function but implemented by different reversible circuits are q-equivalent. In this experiment, we show that they are also partially equivalent. Apart from equivalent test cases, we also try to remove one gate from the second circuit to make them not partially equivalent. The runtime and memory usage for non-equivalent test cases are still close to equivalent test cases.

TABLE V: Experimental Results on Grover Search Benchmarks
dd mm
#Gates
in C1C_{1}
#Gates
in C2C_{2}
time (s)
memory
(MB)
6 6 35 37 0.257 13.201
10 10 59 61 1.537 38.687
14 14 83 85 9.613 153.371
18 18 107 109 22.831 179.745
22 22 131 133 49.149 192.492
26 26 155 157 135.116 232.116
TABLE VI: Experimental Results on Random Totally Equivalent Circuits
dd mm #Gates in C1C_{1} #Gates in C2C_{2} Algorithm 3 SliQEC
time (s) memory (MB) #TO time (s) memory (MB) #TO
10 10 40.00 118.40 0.038 12.955 0 0.028 13.000 0
20 20 80.00 237.50 0.182 14.245 0 0.194 14.237 0
30 30 120.00 377.60 0.685 15.699 0 0.640 15.285 0
40 40 160.00 496.00 1.802 19.232 0 1.749 17.815 0
50 50 200.00 649.40 3.612 28.114 0 4.722 28.233 0
60 60 240.00 751.00 7.246 38.732 0 9.889 42.336 0
70 70 280.00 887.60 12.165 45.731 0 10.086 38.610 0
80 80 320.00 993.40 20.001 58.224 0 22.079 56.422 0
90 90 360.00 1071.20 24.703 69.052 0 25.826 68.746 0
100 100 400.00 1230.20 48.089 85.605 0 58.483 79.486 0

VI-E Checking Total Equivalent Circuits

Because totally equivalent circuits must be partially equivalent, under these circumstances we use totally equivalent circuits as the benchmarks to test Algorithm 3 and compare it against SliQEC [8]. Because Algorithm 3 is extended from SliQEC, we expect that they would perform similarly. We generated some benchmarks with qubit number d=10,20,d=10,20, 30,40,50,60,70,80,90,10030,40,50,60,70,80,90,100, and set all qubits to be measured qubits for partial equivalence checking. The generating method is similar to how we generate partially equivalent pairs, except that only part H and part T are included to make them totally equivalent. The results shown in Table VI confirm the expectation that Algorithm 3 and SliQEC behave similarly in both runtime and memory usage on these instances.

VII Conclusions

In this work, we defined partial equivalence, which is concerned with observational equivalence with respect to partial measurement and a set of initial input states. We obtained the necessary and sufficient conditions of partial equivalence, and exploited them to develop partial equivalence checking algorithms. Experimental results demonstrate the usefulness and strengths of our methods. For future work, it may be interesting to explore partial equivalence in applications such as quantum circuit synthesis and quantum program compilation.

Acknowledgments

This work was supported in part by the Ministry of Science and Technology of Taiwan under grants MOST 110-2224-E-002-011 and MOST 111-2119-M-002-012, and MediaTek Research. TFC was supported by the TSMC Scholarship.

References

  • [1] A. Kuehlmann and F. Krohm, “Equivalence checking using cuts and heaps,” in Proc. Design Automation Conference, 1997, pp. 263–268.
  • [2] A. Mishchenko, S. Chatterjee, R. Brayton, and N. Een, “Improvements to combinational equivalence checking,” in Proc. Int. Conf. on Computer-Aided Design, 2006, pp. 836–843.
  • [3] S.-Y. Huang and K.-T. T. Cheng, Formal Equivalence Checking and Design Debugging.   Springer, 2012.
  • [4] C. Van Eijk, “Sequential equivalence checking based on structural similarities,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 19, no. 7, pp. 814–819, 2000.
  • [5] K. L. McMillan, “Interpolation and SAT-based model checking,” in Proc. Int. Conf. on Computer Aided Verification, 2003, pp. 1–13.
  • [6] A. R. Bradley, “SAT-based model checking without unrolling,” in Proc. Int. Conf. on Verification, Model Checking, and Abstract Interpretation, 2011, pp. 70–87.
  • [7] N. Een, A. Mishchenko, and R. Brayton, “Efficient implementation of property directed reachability,” in Proc. Int. Conf. on Formal Methods in Computer-Aided Design, 2011, pp. 125–134.
  • [8] C.-Y. Wei, Y.-H. Tsai, C.-S. Jhang, and J.-H. R. Jiang, “Accurate BDD-based unitary operator manipulation for scalable and robust quantum circuit verification,” in Proc. Design Automation Conference, 2022.
  • [9] P. Niemann, R. Wille, and R. Drechsler, “Equivalence checking in multi-level quantum systems,” in Proc. Int’l Conf. of Reversible Comput., 2014, pp. 201–215.
  • [10] L. Burgholzer and R. Wille, “Advanced equivalence checking for quantum circuits,” IEEE Trans. on Comput.-Aided Design Integr. Circuits Syst, vol. 40, no. 9, pp. 1810–1824, 2021.
  • [11] X. Hong, M. Ying, X. Z. Y. Feng, and S. Li, “Approximate equivalence checking of noisy quantum circuits,” in Proc. Design Automation Conf., 2021, pp. 637–642.
  • [12] S. Yamashita and I. L. Markov, “Fast equivalence-checking for quantum circuits,” in Proc. Int. Symp. on Nanoscale Architectures, 2010, pp. 23–28.
  • [13] L. Burgholzer and R. Wille, “Improved DD-based equivalence checking of quantum circuits,” in Proc. Asia and South Pacific Design Automation Conf., 2020, pp. 127–132.
  • [14] ——, “The power of simulation for equivalence checking in quantum computing,” in Proc. Design Automation Conf., 2020, pp. 1–6.
  • [15] ——, “Handling non-unitaries in quantum circuit equivalence checking,” in Proc. Design Automation Conference, 2022.
  • [16] E. Ardeshir-Larijani, S. J. Gay, and R. Nagarajan, “Verification of concurrent quantum protocols by equivalence checking,” in Proc. Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, 2014, pp. 500–514.
  • [17] M. Amy, “Towards large-scale functional verification of universal quantum circuits,” in Electronic Proc. in Theoretical Computer Science, vol. 287, 2019, pp. 1–21.
  • [18] G. F. Viamontes, I. L. Markov, and J. P. Hayes, “Checking equivalence of quantum circuits and states,” in Proc. Int. Conf. on Computer-Aided Design, 2007, pp. 69–74.
  • [19] S. A. Wang, C. Y. Lu, I. M. Tsai, and S. Y. Kuo, “An XQDD-based verification method for quantum circuits,” IEICE Trans. on Fundamentals of Electron., Commun. and Computer Sci., vol. E91-A, no. 2, pp. 584–594, 2008.
  • [20] E. Knill, “Conventions for quantum pseudocode,” Los Alamos National Lab, Tech. Rep., 1996, [Online]. Available: https://www.osti.gov/biblio/366453.
  • [21] B. Ömer, “A procedural formalism for quantum computing,” Master’s thesis, Dep. Theoretical Phys., Technical Univ. of Vienna, 1998.
  • [22] A. van Tonder, “A lambda calculus for quantum computation,” SIAM Journal on Computing, vol. 33, pp. 1109–1135, 2004.
  • [23] P. Selinger, “Towards a quantum programming language,” Mathematical Structures in Computer Science, vol. 14, no. 4, pp. 527–586, 2004.
  • [24] S. Bettelli, L. Serafini, , and T. Calarco, “Toward an architecture for quantum programming,” Eur. Phys. J. D, vol. 25, pp. 181–200, 2003.
  • [25] A. W. Cross, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, “Open quantum assembly language,” 2017, [Online]. Available: https://arxiv.org/abs/1707.03429.
  • [26] A. J. Abhari et al., “Scaffold: Quantum programming language,” Dept. of Computer Science, Princeton Univ., Tech. Rep. TR934-12, 2012.
  • [27] H. Abraham et al., “Qiskit: An open-source framework for quantum computing,” 2019, doi: 10.5281/zenodo.2562111.
  • [28] B. Bichsel, M. Baader, T. Gehr, and M. Vechev, “Silq: A high-level quantum language with safe uncomputation and intuitive semantics,” in Proc. Int. Conf. on Programming Language Design and Implementation, London, UK, 2020, pp. 286–300.
  • [29] K. M. Svore et al., “Q#: Enabling scalable quantum computing and development with a high-level dsl,” in Proc. Real World Domain Specific Languages Workshop, 2018, pp. 1–10.
  • [30] A. S. Green, P. L. Lumsdaine, N. J. Ross, P. Selinger, and B. Valiron, “Quipper: a scalable quantum programming language,” in Proc. Int. Conf. on Programming Language Design and Implementation, 2013, pp. 333–342.
  • [31] S. Liu et al., “Q|SiQ|Si\rangle: A quantum programming environment,” in Proc. Symp. on Real-Time and Hybrid Systems, 2018, pp. 133–164.
  • [32] X. Hong, Y. Feng, S. Li, and M. Ying, “Equivalence checking of dynamic quantum circuits,” 2021, [Online]. Available: arXiv:2106.01658v1.
  • [33] Q. S. Wang, J. Y. Liu, and M. S. Ying, “Equivalence checking of quantum finite-state machines,” J. Comput. Syst. Sci., vol. 116, pp. 1–21, 2021.
  • [34] Q. Wang, R. Li, and M. Ying, “Equivalence checking of sequential quantum circuits,” IEEE Trans. on Comput.-Aided Design of Integr. Circuits and Syst., 2021.
  • [35] R. E. Bryant, “Graph-based algorithms for Boolean function manipulation,” IEEE Trans. on Computers, vol. 35, no. 8, pp. 677–691, 1986.
  • [36] R. Wille, D. Große, D. M. Miller, and R. Drechsler, “Equivalence checking of reversible circuits,” in Proc. Int. Symp. on Multiple-Valued Logic, 2009, pp. 324–330.
  • [37] L. G. Amaru, “Improvements to the equivalence checking of reversible circuits,” in New Data Structures and Algorithms for Logic Synthesis and Verification.   Springer, 2017, ch. 6.
  • [38] D. M. Miller, R. Wille, and R. Drechsler, “Reducing reversible circuit cost by adding lines,” in Proc. Int. Symp. on Multi-Valued Logic, 2010, pp. 217–222.
  • [39] H. Lee, K. C. Jung, D. Han, and P. Kim, “An algorithm for reversible logic circuit synthesis based on tensor decomposition,” 2021, [Online]. Available: arxiv.org/abs/2107.04298.
  • [40] Y.-H. Tsai, J.-H. R. Jiang, and C.-S. Jhang, “Bit-slicing the Hilbert space: Scaling up accurate quantum circuit simulation,” in Proc. Design Automation Conference, 2021, pp. 439–444.
  • [41] Z. Alwin, N. Philipp, D. Rolf, and W. Robert, “Accuracy and compactness in decision diagrams for quantum computation,” in Proc. Design, Automation Test in Europe Conference and Exhibition, 2019, pp. 280–283.
  • [42] P. W. Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” in Proc. IEEE Symposium on Foundations of Computer Science, 1994, pp. 124–134.
  • [43] F. Somenzi, “CUDD: CU decision diagram package (release 2.4.2),” 2012.
  • [44] D. Maslov, “Advantages of using relative-phase Toffoli gates with an application to multiple control Toffoli optimization,” Phys. Rev. A, vol. 93, no. 022311, 2016.
  • [45] R. Wille, D. Große, L. Teuber, G. W. Dueck, and R. Drechsler, “Revlib: An online resource for reversible functions and reversible circuits,” in Proc. Int. Symp. on Multi-Valued Logic, 2008, pp. 220–225.