Quantum implementation of circulant matrices and its use in quantum string processing

Abstract

Many string problems can be solved faster with special data structures built from suffixes, which are typically organized as trees or arrays. In this paper, we show that the suffixes used in those data structures can be obtained by using circulant matrices as a quantum operator, which can be implemented in polylogarithmic time. Hence, if the strings are given as quantum states, the presented circuit implementation allows efficient string processing on quantum computers.

Index Terms:

Quantum algorithms, circulant matrix, suffix trees, Burrows-Wheeler Transform, string processing

I Introduction

For any given vector $\displaystyle\mathbf{c}=[c_{0},c_{1},\dots,c_{n-1}]^{T}$ , the circulant matrix $\displaystyle C$ with the first column $\displaystyle\mathbf{c}$ is defined as follows ([1, 2, 3]):

C=\left(\begin{matrix}c_{0}&c_{n-1}&\cdots&c_{2}&c_{1}\\ c_{1}&c_{0}&c_{n-1}&&c_{2}\\ \vdots&c_{1}&c_{0}&\ddots&\vdots\\ c_{n-2}&&\ddots&\ddots&c_{n-1}\\ c_{n-1}&c_{n-2}&\cdots&c_{1}&c_{0}\\ \end{matrix}\right)

(1)

The eigenvectors of circulant matrices are the columns of the Fourier matrix; therefore, the eigenvectors and the eigenvalues of $\displaystyle C$ can be written analytically as the following pairs. The $\displaystyle j$ th ( $\displaystyle j=0,\ldots,n-1$ ) eigenvector is given by

v_{j}=\frac{1}{\sqrt{n}}\left(1,\omega^{j},\omega^{2j},\ldots,\omega^{(n-1)j}\right),

(2)

where $\displaystyle\omega=\exp\left(\tfrac{2\pi i}{n}\right)$ . The corresponding eigenvalue is

\lambda_{j}=c_{0}+c_{n-1}\omega^{j}+c_{n-2}\omega^{2j}+\dots+c_{1}\omega^{(n-1)j}.

(3)

As a result, the eigendecomposition of any circulant matrix can be written as $\displaystyle C=F\Lambda F^{-1}$ , where $\displaystyle F$ is the normalized discrete Fourier transform matrix whose columns are the eigenvectors $\displaystyle v_{j}$ , and $\displaystyle\Lambda=diag(\lambda_{0},\dots,\lambda_{n-1})$ is the diagonal matrix of the eigenvalues, which up to normalization is the discrete Fourier transform of $\displaystyle\mathbf{c}$ .
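The eigenpairs of Eqs.(2) and (3) can be checked numerically on a small example. The following is a minimal classical sketch in Python, not part of the original derivation; the helper names `circulant`, `eigenpair`, and `max_residual` are illustrative assumptions.

```python
import cmath

def circulant(c):
    """Circulant matrix of Eq.(1): entry (k, m) is c[(k - m) mod n]."""
    n = len(c)
    return [[c[(k - m) % n] for m in range(n)] for k in range(n)]

def eigenpair(c, j):
    """Eigenvector v_j of Eq.(2) and eigenvalue lambda_j of Eq.(3)."""
    n = len(c)
    omega = cmath.exp(2 * cmath.pi * 1j / n)
    v = [omega ** (j * k) / n ** 0.5 for k in range(n)]
    # Eq.(3): c_0 + c_{n-1} omega^j + ... + c_1 omega^{(n-1) j}
    lam = sum(c[(n - m) % n] * omega ** (m * j) for m in range(n))
    return v, lam

def max_residual(c):
    """Largest entry of |C v_j - lambda_j v_j| over all j (illustrative helper)."""
    C = circulant(c)
    worst = 0.0
    for j in range(len(c)):
        v, lam = eigenpair(c, j)
        Cv = [sum(a * x for a, x in zip(row, v)) for row in C]
        worst = max(worst, max(abs(y - lam * x) for x, y in zip(v, Cv)))
    return worst

c = [1.0, 2.0, 3.0, 4.0, 5.0]
print(circulant(c)[0])   # first row: [1.0, 5.0, 4.0, 3.0, 2.0]
print(max_residual(c))   # close to zero, up to floating-point rounding
```

The residual stays at rounding level for every $\displaystyle j$ , which is what the analytic form of the eigenpairs predicts.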

I-A Direct Implementation as a quantum circuit

One can implement $\displaystyle C$ as a quantum circuit by using its eigendecomposition $\displaystyle C=F\Lambda F^{-1}$ : It is well known that $\displaystyle F$ can be implemented with $\displaystyle O((\log n)^{2})$ gates as the quantum Fourier transform [4]. In addition, since any vector of dimension $\displaystyle n$ can be prepared in $\displaystyle O(n)$ time, we can also implement $\displaystyle\Lambda$ in linear time. Therefore, $\displaystyle C$ can be implemented as a quantum circuit by using $\displaystyle O(n)$ single- and two-qubit quantum gates, which is fewer than the $\displaystyle O(n^{2})$ entries of the classical matrix.

I-B Circuit implementation through permutations

$\displaystyle C$ can be represented as a function of the following permutation matrix, which is also circulant:

P=\left(\begin{matrix}0&0&\cdots&0&1\\ 1&0&\cdots&0&0\\ 0&\ddots&&\vdots&\vdots\\ \vdots&\ddots&\ddots&0&0\\ 0&\cdots&0&1&0\end{matrix}\right)

(4)

Then, we can redefine $\displaystyle C$ as a polynomial function of $\displaystyle P$ with the coefficients given by $\displaystyle\mathbf{c}$ :

C=c_{0}I+c_{1}P+c_{2}P^{2}+\dots+c_{n-1}P^{n-1}=\sum_{j=0}^{n-1}c_{j}P^{j}.

(5)
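Eq.(5) can be verified directly by accumulating the powers of $\displaystyle P$ . The sketch below is a classical illustration only; `shift_matrix`, `matmul`, and `circulant_from_powers` are hypothetical helper names introduced here.

```python
def shift_matrix(n):
    """Permutation matrix P of Eq.(4): P[k][m] = 1 when k = m + 1 (mod n)."""
    return [[1 if k == (m + 1) % n else 0 for m in range(n)] for k in range(n)]

def matmul(A, B):
    """Plain square matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def circulant_from_powers(c):
    """Evaluate C = sum_j c_j P^j term by term, as in Eq.(5)."""
    n = len(c)
    P = shift_matrix(n)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # P^0 = I
    C = [[0] * n for _ in range(n)]
    for coeff in c:
        for i in range(n):
            for j in range(n):
                C[i][j] += coeff * power[i][j]
        power = matmul(power, P)  # next power of P
    return C

for row in circulant_from_powers([10, 20, 30, 40]):
    print(row)
# [10, 40, 30, 20]
# [20, 10, 40, 30]
# [30, 20, 10, 40]
# [40, 30, 20, 10]
```

The first column of the result reproduces $\displaystyle\mathbf{c}$ , and each further column is a cyclic shift of it, in agreement with Eq.(1).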

Since $\displaystyle P$ is a permutation matrix and hence orthogonal, we can construct its exact quantum circuit decomposition. Moreover, it is known that a sum of unitary matrices can be implemented as a subpart of a larger quantum system with the help of ancilla registers (e.g., [5, 6, 7]). Here, we shall show that $\displaystyle C$ can be constructed in vector form as $\displaystyle V_{P}\left|\psi\right\rangle$ , where $\displaystyle\left|\psi\right\rangle$ is built from the vector $\displaystyle\mathbf{c}$ and $\displaystyle V_{P}$ contains the powers of $\displaystyle P$ used in the above polynomial. Later in the paper, we shall show that this implementation makes some string problems easier to solve.

I-B1 Circuit for $\displaystyle P$

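Since $\displaystyle P$ is a circulant matrix with $\displaystyle c_{1}=1$ and all other $\displaystyle c_{j}=0$ , its $\displaystyle j$ th eigenvalue follows from Eq.(3) as $\displaystyle\omega^{(n-1)j}=\omega^{-j}$ . Writing $\displaystyle w=\omega^{-1}$ , this reads: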

\lambda_{j}=w^{j}.

(6)

Therefore, its eigendecomposition can be written as:

P=F\left(\begin{matrix}1&&&&\\ &w^{1}&&&\\ &&w^{2}&&\\ &&&\ddots&\\ &&&&w^{n-1}\end{matrix}\right)F^{-1}=F\Lambda_{P}F^{-1}.

(7)

As a quantum circuit, we can implement $\displaystyle\Lambda_{P}$ as follows: First, observe that the power $\displaystyle j$ in $\displaystyle w^{j}$ is equal to the row index, which has the binary expansion $\displaystyle j=(b_{m-1}\dots b_{0})_{2}$ with $\displaystyle m=\log n$ , so that $\displaystyle w^{j}=\prod_{k}w^{b_{k}2^{k}}$ . Therefore, if the qubit representing $\displaystyle b_{k}$ is in $\displaystyle\left|1\right\rangle$ , we apply a controlled phase gate to the first qubit. The phase of each gate is determined by the position $\displaystyle k$ of the qubit in the binary expansion. We define the phase gate as follows:

R_{w}(k)=\left(\begin{matrix}1&0\\ 0&w^{2^{k}}\end{matrix}\right),

(8)

where $\displaystyle k$ is the position of the control qubit in the binary expansion $\displaystyle(b_{m-1}\dots b_{0})_{2}$ . The resulting circuit is drawn in Fig.1 for four qubits. The number of quantum gates for this implementation is equal to the number of qubits, which is $\displaystyle\log n$ .

\Qcircuit @C=0.7em @R=1em {
\lstick{} & \qw & \qw & \qw & \ctrl{3} & \qw \\
\lstick{} & \qw & \qw & \ctrl{2} & \qw & \qw \\
\lstick{} & \qw & \ctrl{1} & \qw & \qw & \qw \\
\lstick{} & \gate{R_w(0)} & \gate{R_w(1)} & \gate{R_w(2)} & \gate{R_w(3)} & \qw
}

Figure 1: Quantum circuit for $\displaystyle\Lambda_{P}$ .
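The bitwise factorization of the phases $\displaystyle w^{j}$ behind this construction can be checked classically. The sketch below assumes, as a simplification that need not match the exact gate layout of Fig.1, one phase gate $\displaystyle R_{w}(k)$ of Eq.(8) per qubit; the function names are illustrative, and the argument holds for any value of $\displaystyle w$ .

```python
import cmath

def phase_gate_diag(w, k):
    """Diagonal of R_w(k) in Eq.(8): (1, w^(2^k))."""
    return [1, w ** (2 ** k)]

def kron_diag(a, b):
    """Diagonal of the Kronecker product of two diagonal matrices."""
    return [x * y for x in a for y in b]

def lambda_p_from_gates(m, w):
    """Diagonal of R_w(m-1) (x) ... (x) R_w(0), most significant qubit first."""
    diag = [1]
    for k in reversed(range(m)):
        diag = kron_diag(diag, phase_gate_diag(w, k))
    return diag

def max_gate_error(m):
    """Compare the gate product with diag(w^0, ..., w^(n-1)) for n = 2^m."""
    n = 2 ** m
    w = cmath.exp(2 * cmath.pi * 1j / n)  # assumed root of unity; any w would do
    gates = lambda_p_from_gates(m, w)
    return max(abs(g - w ** j) for j, g in enumerate(gates))

print(lambda_p_from_gates(3, 2))  # [1, 2, 4, 8, 16, 32, 64, 128]
print(max_gate_error(4))          # close to zero, up to rounding
```

Using the integer $\displaystyle w=2$ in the first call makes the exponents visible: entry $\displaystyle j$ of the diagonal is exactly $\displaystyle 2^{j}$ .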

I-B2 Circuit for $\displaystyle V_{P}$

Consider the following matrix:

V_{P}=\left(\begin{matrix}I&&&&\\ &P^{1}&&\\ &&P^{2}&\\ &&&\ddots&\\ &&&&P^{n-1}\end{matrix}\right),

(9)

which closely resembles $\displaystyle\Lambda_{P}$ given in Eq.(7). Therefore, we can implement it in the same way as $\displaystyle\Lambda_{P}$ : In this case, we only need to replace each $\displaystyle R_{w}({k})$ with the corresponding power of $\displaystyle P$ :

P^{2^{k}}=F\Lambda_{P}^{2^{k}}F^{-1}.

(10)

Since all the powers of $\displaystyle P$ have the same eigenvectors, we apply $\displaystyle F^{-1}$ and $\displaystyle F$ only once, at the beginning and at the end of the circuit. We use the following quantum operation in the circuit:

U_{P}(k)=\left(\begin{matrix}I&0\\ 0&\Lambda_{P}^{2^{k}}\end{matrix}\right).

(11)

Here, when we look at the circuit for $\displaystyle\Lambda_{P}$ in Fig.1, we can see that the power can be distributed over the quantum gates inside the circuit, since the gates are diagonal and therefore commute. That is, we can obtain any power of $\displaystyle\Lambda_{P}$ by simply adjusting the phase angles of the quantum gates. Therefore, for the power $\displaystyle\Lambda_{P}^{2^{k}}$ , each phase gate $\displaystyle R_{w}(l)$ in Fig.1 is raised to the $\displaystyle 2^{k}$ th power, which by Eq.(8) equals $\displaystyle R_{w}(l+k)$ ; these adjusted gates are labeled $\displaystyle R_{w}(0\times k),R_{w}(1\times k),\dots$ in Fig.3. The final circuit for $\displaystyle V_{P}$ is composed of the controlled versions of these quantum gates, as shown in Figs. 2 and 3.

\Qcircuit @C=0.7em @R=1em {
\lstick{} & \qw & \qw & \qw & \qw & \ctrl{3} & \qw & \qw \\
\lstick{} & \qw & \qw & \qw & \ctrl{2} & \qw & \qw & \qw \\
\lstick{} & \qw & \qw & \ctrl{1} & \qw & \qw & \qw & \qw \\
\lstick{} & \gate{F^{-1}} & \gate{U_P(0)} & \gate{U_P(1)} & \gate{U_P(2)} & \gate{U_P(3)} & \gate{F} & \qw
}

Figure 2: The final quantum circuit of $\displaystyle V_{P}$ illustrated for $\displaystyle n=4$ . The explicit circuit for $\displaystyle U_{P}(k)$ is given in Fig.3 below.

\Qcircuit @C=0.7em @R=1em {
\lstick{} & \ctrl{4} & \ctrl{4} & \ctrl{4} & \ctrl{4} & \qw \\
\lstick{} & \qw & \qw & \qw & \ctrl{3} & \qw \\
\lstick{} & \qw & \qw & \ctrl{2} & \qw & \qw \\
\lstick{} & \qw & \ctrl{1} & \qw & \qw & \qw \\
\lstick{} & \gate{R_w(0\times k)} & \gate{R_w(1\times k)} & \gate{R_w(2\times k)} & \gate{R_w(3\times k)} & \qw
}

Figure 3: The explicit circuit implementation of $\displaystyle U_{P}(k)$ , which implements the $\displaystyle 2^{k}$ th power of $\displaystyle\Lambda_{P}$ given in Eq.(7).
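The structure of Fig.2 (one inverse Fourier transform, controlled powers $\displaystyle\Lambda_{P}^{2^{k}}$ , one Fourier transform) can be emulated classically to confirm that block $\displaystyle a$ of $\displaystyle V_{P}$ acts as $\displaystyle P^{a}$ . This is a sketch under stated assumptions: $\displaystyle F$ is taken to have the eigenvectors of Eq.(2) as its columns, which fixes the eigenvalues of $\displaystyle P$ as powers of $\displaystyle\exp(-2\pi i/n)$ , and the function names are hypothetical.

```python
import cmath

def dft_matrix(n, sign=1):
    """Normalized Fourier matrix with entries omega^(sign*j*k)/sqrt(n)."""
    omega = cmath.exp(2 * cmath.pi * 1j / n)
    return [[omega ** (sign * j * k) / n ** 0.5 for j in range(n)] for k in range(n)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def apply_vp_block(x, a, m):
    """Block a of V_P applied to x: F^{-1}, then U_P(k) for each set bit of a, then F."""
    n = 2 ** m
    w = cmath.exp(-2 * cmath.pi * 1j / n)    # assumption: columns of F are v_j of Eq.(2)
    y = matvec(dft_matrix(n, sign=-1), x)    # F^{-1} is the conjugate of the symmetric F
    for k in range(m):
        if (a >> k) & 1:                     # ancilla qubit k acts as the control
            y = [w ** (j * 2 ** k) * y[j] for j in range(n)]  # Lambda_P^(2^k)
    return matvec(dft_matrix(n, sign=1), y)

def cyclic_shift(x, a):
    """P^a x computed directly: entry k of the result is x[(k - a) mod n]."""
    n = len(x)
    return [x[(k - a) % n] for k in range(n)]

def max_block_error(x, m):
    """Largest deviation between the Fourier route and the direct shift over all blocks."""
    worst = 0.0
    for a in range(2 ** m):
        got, want = apply_vp_block(x, a, m), cyclic_shift(x, a)
        worst = max(worst, max(abs(g - t) for g, t in zip(got, want)))
    return worst

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(max_block_error(x, 3))  # close to zero, up to rounding
```

Because the diagonal factors commute, the product over the set bits of $\displaystyle a$ equals $\displaystyle\Lambda_{P}^{a}$ , so only $\displaystyle\log n$ distinct controlled blocks are needed to reach all $\displaystyle n$ powers.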

I-C The overall complexity

As shown in Figs. 2 and 3, the number of quantum gates is determined by the number of qubits used in the main system. The size of the permutation matrix $\displaystyle P$ is $\displaystyle n$ by $\displaystyle n$ ; therefore, the main system is described by $\displaystyle\log n$ qubits. An additional $\displaystyle\log n$ qubits are used to construct $\displaystyle V_{P}$ . Therefore, in total the circuit requires $\displaystyle(2\times\log n)$ qubits. The quantum gates are controlled by at most two qubits; therefore, the total number of required single- and two-qubit gates is bounded by $\displaystyle O(poly(\log n))$ in general. More specifically, if we assume that the general implementation of the quantum Fourier transform requires $\displaystyle O((\log n)^{2})$ gates (with an optimization, it may require $\displaystyle O((\log n)\log\log n)$ ) and that the decomposition of each Toffoli gate requires fewer than $\displaystyle\log n$ gates [4], then the total complexity is bounded by $\displaystyle O((\log n)^{2})$ . This shows that we can form $\displaystyle V_{P}$ on quantum computers very efficiently, since the required number of quantum gates is polylogarithmic in the dimension $\displaystyle n$ .
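To give a feeling for the scale of these bounds, the following rough counting sketch tabulates the resources for a given $\displaystyle n$ . The counts are illustrative assumptions only: one block $\displaystyle U_{P}(k)$ per ancilla qubit, one controlled phase gate per main-register qubit inside each block, and the textbook quantum Fourier transform count without the final swaps. Decomposition of multiply-controlled gates is not included, and `resource_estimate` is a hypothetical helper.

```python
def resource_estimate(n):
    """Rough resource counts for the V_P circuit; n is assumed to be a power of two."""
    m = n.bit_length() - 1
    if 2 ** m != n:
        raise ValueError("n is assumed to be a power of two")
    qft_gates = m * (m + 1) // 2  # Hadamards plus controlled rotations of one QFT
    return {
        "qubits": 2 * m,                   # main register plus ancilla register
        "qft_gates": 2 * qft_gates,        # F^{-1} at the start and F at the end
        "controlled_phase_gates": m * m,   # m blocks U_P(k) with m phase gates each
    }

print(resource_estimate(1024))
# {'qubits': 20, 'qft_gates': 110, 'controlled_phase_gates': 100}
```

For a string of length 1024, the classical matrix has about a million entries, whereas the counts above stay in the low hundreds.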

II Applications

II-A Suffix trees and arrays

Many string problems can be solved more easily if the strings are stored in well-known data structures such as suffix tries, trees, and arrays [8]. Below, we follow the lecture notes in Ref.[9] to give a brief introduction to these data structures and explain how to implement them on quantum computers with the help of the circuits introduced in the previous section.

From a given string, a suffix can be chosen by determining a starting position. For instance, we can choose the following suffixes from string “banana”:

\begin{bmatrix}b&a&n&a&n&a&\$\\ a&n&a&n&a&\$\\ n&a&n&a&\$\\ a&n&a&\$\\ n&a&\$\\ a&\$\\ \$\end{bmatrix}

(12)

Here, “$” is used to indicate the end of the string and is considered smaller than any letter of the alphabet in lexicographical order.

In suffix arrays, these suffixes are placed in buckets based on their order and sorted. In tries, strings are stored in a rooted tree in which every node may have a child for each letter of the alphabet used in the string; in suffix tries, the stored strings are the suffixes. Therefore, by following the paths from the root of the tree, it is possible to determine whether a letter follows another and whether a given prefix exists. In suffix trees, the tries built for the suffixes are optimized by compressing the nodes on straight paths (i.e., paths that do not branch at any node). Each path in the suffix tree ends with the letter “$”.
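These structures can be illustrated classically on the running example. The sketch below uses a naive construction chosen for readability rather than the efficient algorithms of the cited references; the function names are illustrative, and the trie is stored as nested dictionaries without path compression.

```python
def suffix_array(text):
    """Start positions of the suffixes of text in lexicographic order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def build_suffix_trie(text):
    """Nested-dict trie that contains every suffix of text."""
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node.setdefault(ch, {})
    return root

def contains(trie, pattern):
    """True if pattern labels a path from the root, i.e. it occurs in the text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True

text = "banana$"
sa = suffix_array(text)
print(sa)                        # [6, 5, 3, 1, 0, 4, 2]
print([text[i:] for i in sa])    # ['$', 'a$', 'ana$', 'anana$', 'banana$', 'na$', 'nana$']
trie = build_suffix_trie(text)
print(contains(trie, "nan"), contains(trie, "nab"))  # True False
```

Python compares strings by code point, so “$” sorts before every lowercase letter, as required by the convention above.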

Suffix arrays and trees can be constructed in $\displaystyle O(n\log n)$ time [10], and they can then be used to solve many string problems more efficiently, such as string searching and matching and sequence alignment [11].

II-B Burrows Wheeler Transform[12]

The main idea of the Burrows-Wheeler transform (BWT) is to first generate a circulant matrix from the given string and then sort the rows of this matrix lexicographically, comparing them column by column. In the sorted matrix, the last column is taken as the transformed string. As an illustration, consider the example string “banana”, which is used in most textbooks:

\begin{bmatrix}b&a&n&a&n&a&\$\\ a&n&a&n&a&\$&b\\ n&a&n&a&\$&b&a\\ a&n&a&\$&b&a&n\\ n&a&\$&b&a&n&a\\ a&\$&b&a&n&a&n\\ \$&b&a&n&a&n&a\end{bmatrix}\xrightarrow{sort}\begin{bmatrix}\$&b&a&n&a&n&\mathbf{a}\\ a&\$&b&a&n&a&\mathbf{n}\\ a&n&a&\$&b&a&\mathbf{n}\\ a&n&a&n&a&\$&\mathbf{b}\\ b&a&n&a&n&a&\mathbf{\$}\\ n&a&\$&b&a&n&\mathbf{a}\\ n&a&n&a&\$&b&\mathbf{a}\\ \end{bmatrix}

(13)

From the above, BWT(“banana”) = “annb$aa”. In the above matrix, if one considers “$” as 1s and the rest of the letters as 0s, then BWT of any string can be represented as a permutation matrix.
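The transform can be reproduced with a few lines of classical code. The following is a naive sketch that sorts all rotations explicitly, which is convenient for illustration but not how efficient implementations proceed; `bwt` and `inverse_bwt` are illustrative names.

```python
def bwt(text):
    """Burrows-Wheeler transform: last column of the sorted rotation matrix."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(row[-1] for row in rotations)

def inverse_bwt(last, end="$"):
    """Naive inversion: repeatedly prepend the last column and re-sort the rows."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(ch + row for ch, row in zip(last, table))
    # The original text is the row that ends with the end-of-string marker.
    return next(row for row in table if row.endswith(end))

print(bwt("banana$"))          # annb$aa
print(inverse_bwt("annb$aa"))  # banana$
```

The inversion shows that no information is lost: the last column alone, together with the end marker, determines the original string.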

II-C Quantum implementation of suffixes

In the previous section, we showed how to construct $\displaystyle V_{P}$ . Here, suppose that we are given the text $\displaystyle\left|c\right\rangle$ , including the “$” sign. We prepare $\displaystyle\left|c\right\rangle$ in the main register and $\displaystyle H^{\otimes\log n}\left|0\right\rangle$ in the ancilla register. This generates the following vector:

(H^{\otimes\log n}\left|0\right\rangle)\left|c\right\rangle=\frac{1}{\sqrt{n}}\left(\begin{matrix}\begin{bmatrix}c_{0}\\ c_{1}\\ \vdots\\ c_{n-1}\\ \end{bmatrix}_{0\ \ \ }\\ \vdots\ \ \ \ \\ \begin{bmatrix}c_{0}\\ c_{1}\\ \vdots\\ c_{n-1}\\ \end{bmatrix}_{n-1}\\ \end{matrix}\right).

(14)

If we multiply this vector by $\displaystyle V_{P}$ , we get $\displaystyle C$ in Eq.(1) in vector form:

V_{P}(H^{\otimes\log n}\left|0\right\rangle)\left|c\right\rangle=\frac{1}{\sqrt{n}}\left(\begin{matrix}\begin{bmatrix}c_{0}\\ c_{1}\\ \vdots\\ c_{n-1}\\ \end{bmatrix}_{0\ \ \ }\\ \vdots\ \ \ \ \\ \begin{bmatrix}c_{1}\\ c_{2}\\ \vdots\\ c_{0}\\ \end{bmatrix}_{n-1}\\ \end{matrix}\right)=\frac{1}{\sqrt{n}}\left(\begin{matrix}S_{0}\\ \vdots\\ S_{n-1}\\ \end{matrix}\right),

(15)

where each $\displaystyle S_{j}$ represents a circularly permuted string. The above vector provides the unsorted suffixes, whose ends are indicated by the end-of-file character “$”. Since the position of this character provides information about the suffixes, its amplitude may be increased so that the order can be obtained efficiently from the measurements.
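The content of the blocks $\displaystyle S_{a}=P^{a}\mathbf{c}$ can be emulated classically on characters to see how the position of “$” identifies a suffix. This sketch ignores amplitudes and the encoding of characters into a quantum state, and it uses a string length that is not a power of two; the helper names are hypothetical.

```python
def rotation_blocks(text):
    """Blocks S_0..S_{n-1}: entry k of block a is text[(k - a) mod n], i.e. P^a c."""
    n = len(text)
    return ["".join(text[(k - a) % n] for k in range(n)) for a in range(n)]

def suffix_of_block(block, end="$"):
    """Everything up to and including the end marker is a suffix of the text."""
    return block[: block.index(end) + 1]

text = "banana$"
blocks = rotation_blocks(text)
for a, block in enumerate(blocks):
    # block index, rotated string, position of "$", recovered suffix
    print(a, block, block.index("$"), suffix_of_block(block))
```

In block $\displaystyle a$ the marker sits at position $\displaystyle(n-1+a)\bmod n$ , so reading off the marker position tells which suffix the block carries.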

By using the indices of the end-of-file characters, we can collapse the quantum state onto any desired direction. In addition, the measurement statistics on the collapsed state can be used to draw conclusions, e.g., about the most common prefixes of the suffixes in the collapsed state.

Since sorted structures allow more efficient algorithms in most string problems, the same sorting can also be done on this vector.

III Sorting

The sorting problem can be described simply as finding an ordered list of items given an unordered list. It is known that comparison-based sorting algorithms such as merge sort and quicksort require $\displaystyle\Omega(n\log n)$ running time for a list of $\displaystyle n$ items. This lower bound can be broken by using bucket sorting (counting sort), where each item is considered as a direct pointer to its bucket; therefore, the sorting is done in $\displaystyle O(n)$ time. This simple approach has been further improved to give $\displaystyle O(nd)$ time, where $\displaystyle d$ is the number of bits used to represent each item, and $\displaystyle O(n\log\log n)$ time. Because of memory constraints and requirements, comparison sorts are used more often in practice than these kinds of sorting algorithms [13].

On quantum computers, the sorting problem is considered on registers $\displaystyle\left|x_{1}\right\rangle\dots\left|x_{n}\right\rangle$ , where each register represents an item in the unsorted list. The algorithms therefore try to prepare the output registers so that they encode the ordered list of the given items: i.e., $\displaystyle\left|q_{1}\right\rangle\geq\dots\geq\left|q_{n}\right\rangle$ . Based on this representation, it is shown that quantum algorithms in the comparison model have complexity bounds similar to the classical ones for the sorting problem: i.e., $\displaystyle\Omega(n\log n)$ [14, 15].

As done in the classical sorting algorithms, bucket sort with an order preserving hash function can be used to sort the items stored in memory as:

\sum_{i=1}^{n}\left|\mathbf{0}\right\rangle\left|i\right\rangle\left|x_{i}\right\rangle,

(16)

where $\displaystyle i$ is the index of the item $\displaystyle x_{i}$ in the given unordered list. The sorting problem becomes constructing the following quantum state:

\sum_{i=1}^{n}\left|o_{i}\right\rangle\left|i\right\rangle\left|x_{i}\right\rangle

(17)

where $\displaystyle\left|o_{i}\right\rangle$ represents the index of the item in the sorted list. We can rewrite this in terms of a hash function $\displaystyle h(x_{i})$ that maps an item $\displaystyle x_{i}$ to the index $\displaystyle o_{i}$ :

\sum_{i=1}^{n}\left|h(x_{i})\right\rangle\left|i\right\rangle\left|x_{i}\right\rangle.

(18)

$\displaystyle h$ can be as simple as a direct map or a more general hash function. In particular, consider $\displaystyle h$ as a partial order-preserving hash function: i.e., if the sorted positions of $\displaystyle x_{i}$ and $\displaystyle x_{j}$ are more than some distance $\displaystyle d$ apart, so that $\displaystyle o_{i}+d<o_{j}$ , then $\displaystyle h(x_{i})<h(x_{j})$ . Then, since we can apply the operator $\displaystyle h$ simultaneously to all items, we can generate their orders in $\displaystyle O(1)$ time. Sometimes knowing the rough order of the elements in the array may be considered enough. In those cases, the hash function need not be perfect; therefore, one can use ideas similar to classical sorting algorithms such as bucket sort or Shell sort to generate a k-sorted array in which the buckets are sorted but the items within the same bucket are not. (Shell sort is a generalization of the insertion sort algorithm in which items at certain distances are compared and, if necessary, swapped to obtain a k-sorted array: i.e., an array where the numbers are grouped into regions based on their orders.)

As mentioned above, if we use comparison-based sorting algorithms, then the sorting is almost the same as in classical sorting algorithms. Let us consider the following vector, whose construction is given in the previous section:
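A classical analogue of such an imperfect, order-preserving hash is the bucket index used in bucket sort. The sketch below is an illustration for integer items only; the particular hash `bucket_index` is an assumption chosen for simplicity, not a function specified in the text.

```python
def bucket_index(x, lo, hi, buckets):
    """Hypothetical order-preserving hash: x <= y implies index(x) <= index(y)."""
    return (x - lo) * buckets // (hi - lo + 1)

def k_sorted(items, buckets):
    """Group integer items into ordered buckets without sorting inside a bucket."""
    lo, hi = min(items), max(items)
    groups = [[] for _ in range(buckets)]
    for x in items:
        groups[bucket_index(x, lo, hi, buckets)].append(x)
    return groups

print(k_sorted([7, 1, 9, 3, 8, 2, 5], 3))  # [[1, 3, 2], [5], [7, 9, 8]]
```

Every item in an earlier bucket is smaller than every item in a later one, while the order inside a bucket is left as it was in the input, which is the rough ordering described above.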

\frac{1}{\sqrt{n}}\left(\begin{matrix}S_{0}\\ \vdots\\ S_{n-1}\\ \end{matrix}\right),

(19)

In the Burrows-Wheeler transform, the rows are sorted by comparing them column by column. We can do the same sorting on this vector. A particular direct sorting may be as follows:

• First, we compare each element with its left neighbor and swap them if necessary.
  – This step can be done in parallel. If the swap and comparison can be implemented with $\displaystyle O(poly(\log n))$ quantum gates, then it takes $\displaystyle O(poly(\log n))$ time. (Since the comparison and swap operations depend on the number of qubits, they may require controlled gates whose decomposition requires a number of gates polynomial in the system size $\displaystyle O(\log n)$ . If this step can be done in $\displaystyle O(1)$ , then sorting can be done more efficiently.)
• In the second step, we do the same for the right neighbors (we group two elements together and compare them).
  – This step also requires $\displaystyle O(poly(\log n))$ quantum gates.
• If we repeat the above steps $\displaystyle O(n)$ times, we obtain a simple $\displaystyle O(n\log n)$ time sorting algorithm.
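The alternating neighbor comparisons described above correspond classically to odd-even transposition sort. The following sketch runs it sequentially on the rotations of the example string; in each round the compared pairs are disjoint, which is what would allow them to be processed in parallel. The function name is illustrative and the code is a classical emulation, not a quantum circuit.

```python
def odd_even_transposition_sort(items):
    """Sort by n rounds of neighbor compare-and-swap on alternating pairings."""
    items = list(items)
    n = len(items)
    for round_no in range(n):
        # Even rounds pair (0,1), (2,3), ...; odd rounds pair (1,2), (3,4), ...
        for i in range(round_no % 2, n - 1, 2):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items

text = "banana$"
rows = [text[i:] + text[:i] for i in range(len(text))]
sorted_rows = odd_even_transposition_sort(rows)
print("".join(row[-1] for row in sorted_rows))  # annb$aa
```

The last column of the sorted rows reproduces the transformed string of Eq.(13).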

IV Conclusion

In this paper, we describe how to generate suffix structures efficiently as a vector by using quantum circuits for circulant matrices. We discuss how the generated vector can be used in string algorithms and sorted if necessary. As a future direction, we will apply this circuit to sequence alignment and pattern matching problems. Since circulant matrices are used in convolutions, the circuit can also be applied to problems in different areas such as convolutional neural networks and time series analysis [16, 17].

References

[1] G. H. Golub and C. F. Van Loan, Matrix computations. JHU press, 2013.
[2] R. M. Gray et al., “Toeplitz and circulant matrices: A review,” Foundations and Trends® in Communications and Information Theory, vol. 2, no. 3, pp. 155–239, 2006.
[3] H. Karner, J. Schneid, and C. W. Ueberhuber, “Spectral decomposition of real circulant matrices,” Linear Algebra and Its Applications, vol. 367, pp. 301–311, 2003.
[4] M. A. Nielsen and I. Chuang, “Quantum computation and quantum information,” 2002.
[5] A. M. Childs and N. Wiebe, “Hamiltonian simulation using linear combinations of unitary operations,” arXiv preprint arXiv:1202.5822, 2012.
[6] A. Daskin, A. Grama, G. Kollias, and S. Kais, “Universal programmable quantum circuit schemes to emulate an operator,” The Journal of chemical physics, vol. 137, no. 23, p. 234112, 2012.
[7] G. H. Low and I. L. Chuang, “Hamiltonian simulation by qubitization,” Quantum, vol. 3, p. 163, 2019.
[8] U. Manber and G. Myers, “Suffix arrays: a new method for on-line string searches,” SIAM Journal on Computing, vol. 22, no. 5, pp. 935–948, 1993.
[9] B. Langmead, “Burrows-Wheeler transform and FM index,” Johns Hopkins University, accessed in 2022. [Online]. Available: https://www.cs.jhu.edu/~langmea/resources/lecture_notes/10_bwt_and_fm_index_v2.pdf
[10] E. Ukkonen, “On-line construction of suffix trees,” Algorithmica, vol. 14, no. 3, pp. 249–260, 1995.
[11] M. Mielczarek and J. Szyda, “Review of alignment and SNP calling algorithms for next-generation sequencing data,” Journal of Applied Genetics, vol. 57, no. 1, pp. 71–79, 2016.
[12] M. Burrows and D. Wheeler, “A block-sorting lossless data compression algorithm,” in Digital SRC Research Report. Citeseer, 1994.
[13] T. Hagerup, “Sorting and searching on the word RAM,” in Annual Symposium on Theoretical Aspects of Computer Science. Springer, 1998, pp. 366–398.
[14] P. Høyer, J. Neerbek, and Y. Shi, “Quantum complexities of ordered searching, sorting, and element distinctness,” Algorithmica, vol. 34, no. 4, pp. 429–448, 2002.
[15] A. Ambainis, “Quantum lower bounds by quantum arguments,” Journal of Computer and System Sciences, vol. 64, no. 4, pp. 750–767, 2002.
[16] D. S. G. Pollock, “Circulant matrices and time-series analysis,” International Journal of Mathematical Education in Science and Technology, vol. 33, no. 2, pp. 213–230, 2002.
[17] A. Daskin, “A walk through of time series analysis on quantum computers,” arXiv preprint arXiv:2205.00986, 2022.
