\history

Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000. 10.1109/ACCESS.2022.0122113

\tfootnote

Part of this work was supported by the National Science Foundation under Grant 2046220.

\corresp

Corresponding author: Hiu Yung Wong (email: [email protected]). A. Zaman and H. J. Morrell have equal contributions.

A Step-by-Step HHL Algorithm Walkthrough to Enhance Understanding of Critical Quantum Computing Concepts

ANIKA ZAMAN1, HECTOR JOSE MORRELL2, and HIU YUNG WONG3

1Department of Electrical Engineering, San Jose State University, San Jose, CA 95192 USA (email: [email protected])

2Department of Electrical Engineering, San Jose State University, San Jose, CA 95192 USA (email: [email protected])

3Department of Electrical Engineering, San Jose State University, San Jose, CA 95192 USA (email: [email protected])

Abstract

After learning basic quantum computing concepts, it is desirable to reinforce the learning using an important and relatively complex algorithm through which students can observe and appreciate how qubits evolve and interact with each other. The Harrow-Hassidim-Lloyd (HHL) quantum algorithm, which can solve linear system problems with exponential speed-up over the classical method and is the basis of many important quantum computing algorithms, is used to serve this purpose. The HHL algorithm is explained analytically, followed by a 4-qubit numerical example in bra-ket notation. Matlab code corresponding to the numerical example is available for students to gain a deeper understanding of the HHL algorithm from a pure matrix point of view. A quantum circuit programmed using qiskit is also provided for real hardware execution on IBM quantum computers. After going through the material, students are expected to have a better appreciation of concepts such as basis transformation, bra-ket and matrix representations, superposition, entanglement, controlled operations, measurement, quantum Fourier transformation, quantum phase estimation, and quantum programming. To help readers review these basic concepts, brief explanations augmented by the HHL numerical examples in the main text are provided in the Appendix.

Index Terms:

Harrow-Hassidim-Lloyd (HHL) quantum algorithm, quantum Fourier transform (QFT), inverse quantum Fourier transform (IQFT), quantum phase estimation (QPE), linear system problem (LSP), quantum education


I Introduction

Quantum computing is promising for solving challenging engineering [1], biomedical [2], and finance [3] problems. It has advanced tremendously in the last two decades and, recently, a quantum breakthrough has been demonstrated using a 53-qubit system [4]. However, according to [5], because hardware implementations to date are still inefficient, superconducting quantum supremacy has yet to be achieved. The training of a quantum technology workforce is therefore a pressing goal for many countries (e.g. [23]) to support this fast-growing industry. However, quantum technology is based on concepts very different from our daily and classical experiences. In the early stage of learning quantum computing, linking to daily and classical experience may enhance the understanding of certain quantum concepts, and such an approach should not be de-emphasized. Nevertheless, we believe a fast and robust way of training a quantum workforce is to train students to emulate a quantum processor and trace the evolution of the qubits. This is particularly useful for learning quantum algorithms without a quantum mechanics background. Such an approach spares students cognitive conflicts, which can be resolved later, if possible, after they understand how quantum computing works. It also embraces the “Shut up and calculate!” approach proposed by Mermin for dealing with the uncomfortable feeling toward the interpretation of quantum mechanics [6].

Besides analytical equations, matrix representation and computer simulations are important tools to enhance the understanding of qubit evolution. However, available examples that include computer simulations are usually of simple algorithms and, very often, lack a matrix representation. There is a shortage of examples of important and relatively complex algorithms that combine some of the most important quantum computing concepts and basic algorithms. Such examples are desirable because they allow students to appreciate the roles and the interplay of various basic concepts in a more realistic quantum algorithm. The Harrow-Hassidim-Lloyd (HHL) quantum algorithm [7][8], which can be used to solve linear system problems (LSP) and can provide exponential speedup over the classical conjugate gradient method [9], is chosen for this purpose. HHL is the basis of many more advanced algorithms and is important in various applications such as machine learning [10] and modeling of quantum systems [2][11]. HHL solves systems of linear equations, such as those that arise from discretization [12][13]. In this paper, we detail the qubit evolution in the HHL algorithm analytically, with a 4-qubit circuit as a numerical example. Although HHL examples are available elsewhere (e.g. [14][15]), this paper has certain characteristics which are not all found in those examples. Firstly, the HHL algorithm is discussed analytically step-by-step and the discussion is self-contained. Secondly, a numerical example is given in bra-ket notation mirroring the analytical equations. Thirdly, Matlab code corresponding exactly to the numerical example is available to enhance the understanding from a matrix point of view. The Matlab code allows students to trace how the wavefunction evolves instead of just seeing the magnitudes of the coefficients as in IBM-Q. Fourthly, qiskit code written in Python [16] corresponding to the numerical example is available and can be run on IBM simulators and hardware machines [17]. Finally, in the example, all 4 qubits are traced throughout the process without simplification.

The readers are assumed to have the following background concepts which are further enhanced through the step-by-step walkthrough of the HHL algorithm: basis transformation, bra-ket and matrix representations, superposition, entanglement, encoding, controlled operations, measurement, quantum Fourier transformation, and quantum programming. To make this paper self-contained and to help the readers better appreciate the roles of these basic concepts in the HHL, an Appendix is devoted to briefly explaining these concepts using the examples from the main text. A more detailed explanation of these concepts using a similar approach as in this paper can be found in [19].

I-A How to use this paper

Readers who have a fresh memory of the basic concepts can start reading from Section II, in which the HHL algorithm is discussed step-by-step analytically, followed by a numerical example in Section III. The basic concepts covered in the Appendix are referred to in the main text and readers are encouraged to review them when needed.

Readers who need to review the basic concepts are encouraged to go over the Appendix before reading the main text.

Readers who have devoted substantial time to learning HHL elsewhere and just need a numerical example to reinforce their understanding might start with the numerical example in Section III.

Equations in the Appendix begin with ’V’. If the equations are examples from the main text, the same equation number is used in the Appendix.

II HHL Algorithm

II-A Definitions and Overview

We will first give an overview of the problem and the HHL algorithm. Details will be discussed in the following subsections with reference to the Appendix for reviewing basic concepts. A linear system problem (LSP) can be represented as the following

A\vec{x}=\vec{b}

(1)

where $A$ is an $N_{b}\times N_{b}$ Hermitian matrix and $\vec{x}$ and $\vec{b}$ are $N_{b}$-dimensional vectors. For simplicity, it is assumed that $N_{b}=2^{n_{b}}$, where $n_{b}$ is the number of qubits used to encode the vectors and $N_{b}$ is the number of basis states (bit combinations) of those $n_{b}$ qubits. In matrix representation, an operation on $n_{b}$ qubits is an $N_{b}\times N_{b}$ matrix; equivalently, $n_{b}$ qubits are needed to solve for $N_{b}$ unknowns. If the number of unknowns is not a power of 2, dummy equations can be added so that the system satisfies this assumption. $A$ and $\vec{b}$ are known and $\vec{x}$ is the unknown to be solved, i.e.

\vec{x}=A^{-1}\vec{b}

(2)

Figure 1: Schematic of the HHL quantum circuit flowing from left to right. The circuit is decomposed into top and bottom portions for clarity. Note that the lowest qubit in the diagram is the most significant bit (MSB) while the top one is the least significant bit (LSB).

As an example, $A=\begin{pmatrix}1&-\frac{1}{3}\\ -\frac{1}{3}&1\\ \end{pmatrix}$ , $\vec{b}=\begin{pmatrix}0\\ 1\\ \end{pmatrix}$ , and $\vec{x}=\begin{pmatrix}\frac{3}{8}\\ \frac{9}{8}\\ \end{pmatrix}$ with $n_{b}=1$ and $N_{b}=2^{1}=2$ . Readers may refer to Appendix V-13 and Appendix V-14 to review how LSP is solved classically using Gaussian Elimination and Conjugate Gradient Method, respectively.
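The example values above can be checked classically with a few lines of code. The following is a minimal sketch that uses exact fractions and Cramer's rule for a $2\times 2$ system, which is a simpler substitute for the Gaussian Elimination and Conjugate Gradient methods reviewed in the Appendix; the function name `solve_2x2` is chosen here for illustration only.

```python
from fractions import Fraction


def solve_2x2(a, b):
    """Solve the 2x2 system A x = b exactly using Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("A is singular")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]


# Matrix A and vector b of the example in the text.
A = [[Fraction(1), Fraction(-1, 3)],
     [Fraction(-1, 3), Fraction(1)]]
b = [Fraction(0), Fraction(1)]

x = solve_2x2(A, b)
print(x)  # the two components are 3/8 and 9/8
```

Exact rational arithmetic is used so that the result can be compared directly with the fractions quoted in the text.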

$A$ is assumed to be Hermitian (see Appendix V-1). If it is not Hermitian, $A$ can be embedded in the larger matrix $\begin{pmatrix}0&A\\ A^{\dagger}&0\\ \end{pmatrix}$, which is Hermitian. Readers may refer to [20] for a more advanced treatment of the case when $A$ is not Hermitian.
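The embedding of a non-Hermitian matrix can be illustrated with a short sketch. The matrix `A_example` below is a hypothetical non-Hermitian matrix chosen only for illustration, and the helper names are not taken from the paper's Matlab or qiskit code.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def hermitian_embedding(a):
    """Return the block matrix [[0, A], [A^dagger, 0]] for a square matrix A."""
    n = len(a)
    a_dag = dagger(a)
    top = [[0j] * n + [complex(v) for v in a[r]] for r in range(n)]
    bottom = [a_dag[r] + [0j] * n for r in range(n)]
    return top + bottom


def is_hermitian(m, tol=1e-12):
    """Check whether a square matrix equals its conjugate transpose."""
    m_dag = dagger(m)
    size = len(m)
    return all(abs(m[r][c] - m_dag[r][c]) < tol
               for r in range(size) for c in range(size))


A_example = [[1, 2j], [3, 4]]  # hypothetical non-Hermitian matrix
H = hermitian_embedding(A_example)
print(is_hermitian(A_example), is_hermitian(H))
```

The price of the embedding is a matrix of twice the dimension, i.e. one more qubit in the register that encodes the vectors.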

Figure 1 shows the schematic of the HHL algorithm and the corresponding circuit to solve an LSP. In the HHL quantum algorithm, the $N_{b}$ components of $\vec{b}$ and $\vec{x}$ are encoded as the amplitudes/coefficients (amplitude encoding) of the basis states of $n_{b}$ qubits, $\ket{\;}_{b}$, which form a $\mathbb{C}^{N_{b}}$ Hilbert space. These $n_{b}$ qubits are called the b-register. The number of qubits, $n_{b}$, is chosen to be large enough to encode $\vec{b}$, i.e. $2^{n_{b}}$ needs to be the same as the length of the vectors $\vec{b}$ and $\vec{x}$. The matrix $A$ is encoded through Hamiltonian encoding, i.e. it serves as the Hamiltonian of a unitary gate. Appendix V-9 reviews examples of various encoding schemes.

The HHL algorithm has 5 main components, namely state preparation, quantum phase estimation (QPE), ancilla bit rotation, inverse quantum phase estimation (IQPE), and measurement. In this paper, the little-endian convention is used. In the little-endian convention, the rightmost (ending) qubit represents the least significant bit (LSB). For example, in a 4-qubit system, $\ket{0001}$ in binary is $\ket{1}$ in decimal because the $1$ in the basis state $\ket{0001}$ is the LSB, representing $2^{0}$ instead of $2^{3}$ (which it would represent if it were the most significant bit, MSB). Moreover, in the circuit diagrams, the lowest qubit represents the MSB and the topmost qubit represents the LSB, which is the convention used in qiskit [16] and the IBM-Q platform [17].
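The labeling convention can be made concrete with a small sketch that converts between a basis-state bit string and its decimal index. The function names `basis_index` and `basis_label` are illustrative assumptions and are not part of qiskit.

```python
def basis_index(bits):
    """Decimal index of a basis state written as a bit string.

    The rightmost character is the least significant bit, following the
    convention used in the text, so "0001" maps to 1.
    """
    return int(bits, 2)


def basis_label(index, n_qubits):
    """Bit-string label of the basis state with the given decimal index."""
    return format(index, "0{}b".format(n_qubits))


print(basis_index("0001"), basis_index("1000"), basis_label(6, 4))
```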

As shown in Figure 1, besides the b-register, which belongs to the more significant bits, there are two more sets of inputs to the algorithm. The first set is sometimes called the c-register because it is related to the time (clock) in the controlled rotation in the QPE part. Its qubits are therefore also called the clock qubits. The c-register stores the values of the phase of the eigenvalues of the $A$ matrix after the QPE. There are $n$ qubits in the c-register. Since basis encoding is used (i.e. the phase value is encoded as the basis number; see Appendix V-9), the value of $n$ determines how accurately the phase can be stored. A larger $n$ results in higher accuracy when the encoding is not exact. We set $N=2^{n}$.

The last set of qubits is the ancilla qubit $\ket{\;}_{a}$ which is the LSB. The ancilla qubit, as its name implies, is important to help achieve the goal although it will be discarded at the end, as will be detailed later.

The matrix $A$, which is a Hamiltonian, may be written as a linear combination of the outer products of its eigenvectors, $\ket{u_{i}}\bra{u_{i}}$, weighted by its eigenvalues, $\lambda_{i}$, as in Eq. (3) (see Appendix V-8).

A=\sum_{i=0}^{2^{n_{b}}-1}\lambda_{i}\ket{u_{i}}\bra{u_{i}}

(3)

Since $A$ is diagonal in its eigenvector basis, its inverse is simply $A^{-1}=\sum_{i=0}^{2^{n_{b}}-1}\lambda_{i}^{-1}\ket{u_{i}}\bra{u_{i}}$. $\vec{b}$ can also be expressed in the basis formed by the eigenvectors of $A$, such that

\ket{b}=\sum_{j=0}^{2^{n_{b}}-1}b_{j}\ket{u_{j}}

(4)

Therefore, Eq. (2) can be encoded as,

\ket{x}=A^{-1}\ket{b}=\sum_{i=0}^{2^{n_{b}}-1}\lambda_{i}^{-1}b_{i}\ket{u_{i}}

(5)

by using the fact that $\braket{u_{i}}{u_{j}}=\delta_{ij}$. The goal of the HHL algorithm is to find the solution in this form, with $\ket{x}$ stored in the b-register. Storing the right-hand side of (5) in the b-register is equivalent to storing $\ket{x}$ in the b-register. Note, however, that the solution $\vec{x}$ is encoded as the amplitudes of the basis vectors $\ket{0}/\ket{1}$ (the measurement basis). The components of the solution are therefore not $\lambda_{i}^{-1}b_{i}$, which are the amplitudes in the eigenvector basis. Nevertheless, one naturally obtains the correct amplitudes when the b-register is measured in the $\ket{0}/\ket{1}$ basis, provided that its qubits are not entangled with other qubits. This can be checked mathematically by replacing each $\ket{u_{i}}$ with its expansion in $\ket{0}/\ket{1}$. If the $\ket{u_{i}}$ are entangled with other qubits, one cannot obtain the desired solution. This will become clear after the ancilla bit rotation, to be detailed later.

The equation needs also to be prepared so that the eigenvectors, $\ket{u_{i}}$ , and $\ket{b}$ are normalized so they can be properly represented as a unit vector in quantum computing. Therefore, (4) and Eq. (5) require

\sum_{j=0}^{2^{n_{b}}-1}\lvert b_{j}\rvert^{2}=1

(6)

\sum_{i=0}^{2^{n_{b}}-1}\lvert\lambda_{i}^{-1}b_{i}\rvert^{2}=1

(7)

II-B State Preparation

There are a total of $n_{b}+n+1$ qubits and they are initialized as

\ket{\Psi_{0}}=\ket{0\cdots 0}_{b}\ket{0\cdots 0}_{c}\ket{0}_{a}=\ket{0}^{\otimes n_{b}}\ket{0}^{\otimes n}\ket{0}

(8)

In the state preparation, $\ket{0\cdots 0}_{b}$ in the b-register needs to be rotated to have the amplitudes correspond to the coefficients of $\vec{b}$ . That is

\vec{b}=\begin{pmatrix}\beta_{0}\\ \beta_{1}\\ \vdots\\ \beta_{N_{b}-1}\\ \end{pmatrix}\Leftrightarrow\beta_{0}\ket{0}+\beta_{1}\ket{1}+\cdots+\beta_{N_{b}-1}\ket{N_{b}-1}=\ket{b}

(9)

The vector $\vec{b}$ is represented in column form on the left with coefficients $\beta_{i}$. This is also a valid representation of $\ket{b}$. On the right, the corresponding basis of the Hilbert space formed by the $n_{b}$ qubits is written explicitly. Therefore,

\ket{\Psi_{1}}=\ket{b}_{b}\ket{0\cdots 0}_{c}\ket{0}_{a}

(10)

From now on, some of the subscripts of the kets will be omitted when there is no ambiguity. Since the state preparation depends on the actual value of $\vec{b}$ , it will be discussed in more detail in the numerical example.

II-C Quantum Phase Estimation

Quantum phase estimation (QPE) is also an eigenvalue estimation algorithm. QPE has three components, namely the superposition of the clock qubits through Hadamard gates, controlled rotation, and inverse quantum Fourier transform (IQFT). The goal of QPE is to estimate the phase of the eigenvalues of the unitary rotation matrix, $U=e^{iAt}$, in the controlled gate, $C-U$ (Fig. 1), used in the QPE. Again, this gate encodes the matrix $A$ as its Hamiltonian. It is also instructive to note that the eigenvalues of $U$ must have unit modulus (i.e. be of the form $e^{i\theta}$) because $U$ is unitary. Since $U=e^{iAt}$, the phase of each eigenvalue of the gate is proportional to the corresponding eigenvalue of $A$. As a result, by using QPE in the HHL algorithm, it is expected that the eigenvalues of $A$ will be encoded in the c-register after the QPE, i.e. at $\ket{\Psi_{4}}$. As will become clear later, the eigenvalues are only encoded through basis encoding. The c-register does not store the exact eigenvalues.

Here we assume the readers are already familiar with IQFT and it will not be explained in detail. Readers may review the basic concepts in Appendices V-11 and V-12.

In the first step of QPE, Hadamard gates are applied to the clock qubits to create a superposition of the clock qubits,

\ket{\Psi_{2}}=I^{\otimes n_{b}}\otimes H^{\otimes n}\otimes I\ket{\Psi_{1}}

(11)

\ket{\Psi_{2}}=\ket{b}\frac{1}{2^{\frac{n}{2}}}(\ket{0}+\ket{1})^{\otimes n}\ket{0}

(12)

In the controlled rotation part, controlled gates are applied to $\ket{b}$ with the clock qubits as the controlling qubits (Figure 2). The number of qubits, $n$, in the c-register determines the number of controlled gates. The gates are of the form $U^{2^{r}}$, where $r$ is the index of the clock qubit and $U=e^{iAt}$. The most significant bit in the c-register, $\ket{c_{n-1}}$, controls the gate $U^{2^{n-1}}$ on the b-register, while the least significant one, $\ket{c_{0}}$, controls the gate $U^{2^{0}}=U$ on the b-register.

To understand how it works, we begin by assuming that $\ket{b}$ is an eigenvector of $U$ with eigenvalue $e^{2\pi i\phi}$. The eigenvalue is written in this form so that the phase, $\phi$, will be encoded as the basis state in the c-register (see Eq. (16)). Therefore, based on the definition of eigenvalues and eigenvectors (see Appendix V-8),

U\ket{b}=e^{2\pi i\phi}\ket{b}

(13)

When the controlling clock qubit is $\ket{0}$, $\ket{b}$ is not affected. If the $j$th clock qubit, $\ket{c_{j}}$, is $\ket{1}$, $U^{2^{j}}$ is applied to $\ket{b}$, which multiplies it by $e^{2\pi i\phi 2^{j}}$. This is equivalent to placing the factor $e^{2\pi i\phi 2^{j}}$ in front of the $\ket{1}$ of $\ket{c_{j}}$, as one can assign the prefactor to the controlling qubit. Therefore, after the controlled-$U$ operations, we have

\begin{aligned}\ket{\Psi_{3}}&=\ket{b}\otimes\Big(\frac{1}{2^{\frac{n}{2}}}(\ket{0}+e^{2\pi i\phi 2^{n-1}}\ket{1})\otimes(\ket{0}+e^{2\pi i\phi 2^{n-2}}\ket{1})\otimes\cdots\otimes(\ket{0}+e^{2\pi i\phi 2^{0}}\ket{1})\Big)\otimes\ket{0}_{a}\\
&=\ket{b}\frac{1}{2^{\frac{n}{2}}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\phi k}\ket{k}\ket{0}_{a}\end{aligned}

(14)

In the IQFT part, Eq. (15), only the clock qubits are affected. Note that in certain literature, this operation is called the Quantum Fourier Transform (QFT) (Appendix V-11).

\begin{aligned}\ket{\Psi_{4}}&=\ket{b}\,\textrm{IQFT}\Big(\frac{1}{2^{\frac{n}{2}}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\phi k}\ket{k}\Big)\ket{0}_{a}\\
&=\ket{b}\frac{1}{2^{\frac{n}{2}}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\phi k}(\textrm{IQFT}\ket{k})\ket{0}_{a}\\
&=\ket{b}\frac{1}{2^{n}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\phi k}\Big(\sum_{y=0}^{2^{n}-1}e^{-2\pi iyk/N}\ket{y}\Big)\ket{0}_{a}\\
&=\frac{1}{2^{n}}\ket{b}\sum_{y=0}^{2^{n}-1}\sum_{k=0}^{2^{n}-1}e^{2\pi ik(\phi-y/N)}\ket{y}\ket{0}_{a}\end{aligned}

(15)

Due to interference, only the $\ket{y}$ satisfying the condition $\phi-y/N=0$ has a non-zero amplitude, for which the inner sum is $\sum_{k=0}^{2^{n}-1}e^{0}=2^{n}$. For every other $\ket{y}$, assuming that $N\phi$ is an integer, the inner sum is $\sum_{k=0}^{2^{n}-1}e^{2\pi ik(\phi-y/N)}=0$ due to destructive interference, because it adds up all the powers of an $N$th root of unity different from $1$. By ignoring the states with zero amplitude, we may rewrite $\ket{\Psi_{4}}$ as

\begin{aligned}\ket{\Psi_{4}}&=\frac{1}{2^{n}}\ket{b}\sum_{k=0}^{2^{n}-1}e^{2\pi ik\cdot 0}\ket{N\phi}\ket{0}_{a}\\
&=\ket{b}\ket{N\phi}\ket{0}_{a}\end{aligned}

(16)

Therefore, in QPE, the clock qubits are used to represent the phase information of $U$ , which is $\phi$ , and the accuracy depends on the number of qubits, $n$ .
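The interference argument can be checked numerically on the clock register alone. The sketch below is only an illustration under stated assumptions: it takes $\ket{b}$ to be an eigenvector of $U$, uses an assumed example phase $\phi=1/4$ with $n=2$ (so that $N\phi=1$ is an integer), and represents the c-register as a plain list of amplitudes. The function names are hypothetical.

```python
import cmath
import math


def state_after_controlled_u(phi, n):
    """Clock amplitudes (1/sqrt(N)) * exp(2*pi*i*phi*k), k = 0..N-1, as in Eq. (14)."""
    size = 2 ** n
    return [cmath.exp(2j * math.pi * phi * k) / math.sqrt(size)
            for k in range(size)]


def iqft(amplitudes):
    """Inverse QFT: |k> -> (1/sqrt(N)) * sum_y exp(-2*pi*i*y*k/N) |y>."""
    size = len(amplitudes)
    return [sum(amplitudes[k] * cmath.exp(-2j * math.pi * y * k / size)
                for k in range(size)) / math.sqrt(size)
            for y in range(size)]


n = 2
phi = 0.25  # assumed example phase; N * phi = 1 is an integer
clock = iqft(state_after_controlled_u(phi, n))

# Probability of finding each basis state |y> in the c-register.
probabilities = [round(abs(amp) ** 2, 12) for amp in clock]
print(probabilities)
```

With these values, all the probability ends up on the basis state $\ket{N\phi}=\ket{01}$. If $N\phi$ is not an integer, the same code shows the probability spreading over several basis states, which is why a larger $n$ improves the accuracy.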

In Hamiltonian encoding, $U$ is related to $A$ through

U=e^{iAt}

(17)

where $t$ is the evolution time for that Hamiltonian. $U$ is also diagonal in the eigenvector basis of $A$, $\ket{u_{i}}$. If $\ket{b}=\ket{u_{j}}$,

U\ket{b}=e^{i\lambda_{j}t}\ket{u_{j}}

(18)

By equating $i\lambda_{j}t$ to $2\pi i\phi$ in Eq. (13), we get $\phi=\lambda_{j}t/2\pi$ and Eq. (16) becomes

\ket{\Psi_{4}}=\ket{u_{j}}\ket{N\lambda_{j}t/2\pi}\ket{0}_{a}

(19)

Thus the eigenvalues of $A$ have been encoded in the clock qubits (basis encoding). In general, however, $\ket{b}$ is a superposition of the eigenvectors as given in (4), so by linearity,

\ket{\Psi_{4}}=\sum_{j=0}^{2^{n_{b}}-1}b_{j}\ket{u_{j}}\ket{N\lambda_{j}t/2\pi}\ket{0}_{a}

(20)

The $\lambda_{j}$ are usually not integers. We will choose $t$ so that $\tilde{\lambda_{j}}=N\lambda_{j}t/2\pi$ are integers. Therefore, the $\tilde{\lambda_{j}}$ are scaled versions of the $\lambda_{j}$.

$\ket{\Psi_{4}}$ can be rewritten as

\ket{\Psi_{4}}=\sum_{j=0}^{2^{n_{b}}-1}b_{j}\ket{u_{j}}\ket{\tilde{\lambda_{j}}}\ket{0}_{a}

(21)

II-D Controlled Rotation and Measurement of the Ancilla Qubit

The next step is to rotate the ancilla qubit, $\ket{0}_{a}$ , based on the encoded eigenvalues in the c-register, such that,

\ket{\Psi_{5}}=\sum_{j=0}^{2^{n_{b}}-1}b_{j}\ket{u_{j}}\ket{\tilde{\lambda_{j}}}(\sqrt{1-\frac{C^{2}}{\tilde{\lambda_{j}}^{2}}}\ket{0}_{a}+\frac{C}{\tilde{\lambda_{j}}}\ket{1}_{a})

(22)

where $C$ is a constant. We now show why this rotation is useful.

When the ancilla qubit is measured, the ancilla qubit wavefunction will collapse to either $\ket{0}$ or $\ket{1}$ . If it is $\ket{0}$ , the result will be discarded and the computation will be repeated until the measurement is $\ket{1}$ . Therefore, the final wavefunction of interest is

\ket{\Psi_{6}}=\frac{1}{\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\tilde{\lambda_{j}}}\rvert^{2}}}\sum_{j=0}^{2^{n_{b}}-1}b_{j}\ket{u_{j}}\ket{\tilde{\lambda_{j}}}\frac{C}{\tilde{\lambda_{j}}}\ket{1}_{a}

(23)

where the prefactor is due to normalization after measurement. Since $\lvert\frac{C}{\tilde{\lambda_{j}}}\rvert^{2}$ is the probability of obtaining $\ket{1}$ when the ancilla bit is measured for the component with eigenvalue $\tilde{\lambda_{j}}$, $C$ should be chosen to be as large as possible. Compared to (5), the result resembles the answer $\ket{x}$ that we are looking for. However, at this point the correct result could only be obtained if the b-register were measured in the eigenvector basis (i.e. $\ket{u_{j}}$ instead of $\ket{0}/\ket{1}$), because the b-register is entangled with the clock qubits, $\ket{\tilde{\lambda_{j}}}$. This means that we cannot factorize the result into a tensor product of the c-register and the b-register (see the discussion after (5) and Appendix V-6). As a result, we cannot convert the b-register into the $\ket{0}/\ket{1}$ measurement basis with the desired amplitudes. We will need to uncompute the state, during which the b-register and the c-register become unentangled, so that it gives the right results in the $\ket{0}/\ket{1}$ measurement.

The measurement of the ancilla qubit can be and is usually performed after uncomputation. However, since the ancilla bit is not involved in any operations after the controlled rotation, measuring the ancilla bit before the uncomputation gives the same result. For simplicity in the derivation, it is thus performed before the uncomputation.

II-E Uncomputation - Inverse QPE

The uncomputation is done by using inverse QPE. Firstly, QFT is applied to the clock qubits as,

\begin{aligned}\ket{\Psi_{7}}&=\frac{1}{\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\tilde{\lambda_{j}}}\rvert^{2}}}\sum_{j=0}^{2^{n_{b}}-1}\frac{b_{j}C}{\tilde{\lambda_{j}}}\ket{u_{j}}\,\textrm{QFT}\ket{\tilde{\lambda_{j}}}\ket{1}_{a}\\
&=\frac{1}{\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\tilde{\lambda_{j}}}\rvert^{2}}}\sum_{j=0}^{2^{n_{b}}-1}\frac{b_{j}C}{\tilde{\lambda_{j}}}\ket{u_{j}}\cdot\Big(\frac{1}{2^{n/2}}\sum_{y=0}^{2^{n}-1}e^{2\pi iy\tilde{\lambda_{j}}/N}\ket{y}\Big)\ket{1}_{a}\end{aligned}

(24)

Then inverse controlled-rotations of the b-register by the clock qubits are applied with $U^{-1}=e^{-iAt}$. Similar to the forward process, when the controlling $r$th clock qubit is $\ket{0}$, $\ket{u_{j}}$ is not affected. If the $r$th clock qubit is $\ket{1}$, $(U^{-1})^{2^{r}}$ is applied to $\ket{u_{j}}$. This is equivalent to multiplying by $e^{-i\lambda_{j}ty}$ when the c-register is $\ket{y}$. This follows from an argument similar to that of Eq. (14) and the fact that $2\pi i\phi=i\lambda_{j}t$. Therefore,

\ket{\Psi_{8}}=\frac{1}{2^{n/2}\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\tilde{\lambda_{j}}}\rvert^{2}}}\sum_{j=0}^{2^{n_{b}}-1}\frac{b_{j}C}{\tilde{\lambda_{j}}}\ket{u_{j}}\cdot\Big(\sum_{y=0}^{2^{n}-1}e^{-i\lambda_{j}ty}e^{2\pi iy\tilde{\lambda_{j}}/N}\ket{y}\Big)\ket{1}_{a}

(25)

Since we earlier chose to set $\tilde{\lambda_{j}}=N\lambda_{j}t/2\pi$, the two exponential terms cancel each other and

\begin{aligned}\ket{\Psi_{8}}&=\frac{1}{2^{n/2}\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\tilde{\lambda_{j}}}\rvert^{2}}}\sum_{j=0}^{2^{n_{b}}-1}\frac{b_{j}C}{\tilde{\lambda_{j}}}\ket{u_{j}}\sum_{y=0}^{2^{n}-1}\ket{y}\ket{1}_{a}\\
&=\frac{C}{2^{n/2}\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\lambda_{j}}\rvert^{2}}}\ket{x}\sum_{y=0}^{2^{n}-1}\ket{y}\ket{1}_{a}\end{aligned}

(26)

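In the last line of (26), $\tilde{\lambda_{j}}$ has been replaced by $\lambda_{j}$ because the constant factor $Nt/2\pi$ relating them cancels between the sum and the normalization prefactor. The clock qubits and the b-register are now unentangled and the b-register stores $\ket{x}$. By applying the Hadamard gates on the clock qubits, finally, we have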

\begin{aligned}\ket{\Psi_{9}}&=\frac{1}{\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\lambda_{j}}\rvert^{2}}}\sum_{j=0}^{2^{n_{b}}-1}\frac{b_{j}C}{\lambda_{j}}\ket{u_{j}}\ket{0}^{\otimes n}\ket{1}_{a}\\
&=\frac{C}{\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}C}{\lambda_{j}}\rvert^{2}}}\ket{x}_{b}\ket{0}_{c}^{\otimes n}\ket{1}_{a}\end{aligned}

(27)

If $C$ is real and positive, by using (7),

\begin{aligned}\ket{\Psi_{9}}&=\frac{1}{\sqrt{\sum_{j=0}^{2^{n_{b}}-1}\lvert\frac{b_{j}}{\lambda_{j}}\rvert^{2}}}\ket{x}_{b}\ket{0}_{c}^{\otimes n}\ket{1}_{a}\\
&=\ket{x}_{b}\ket{0}_{c}^{\otimes n}\ket{1}_{a}\end{aligned}

(28)

Here, the solution $\ket{x}$ (Eq. 5) is stored in the b-register successfully.
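The eigenbasis form of the solution, Eq. (5), can be checked numerically for the example matrix of Section II-A. The sketch below is an illustration only: it uses the eigenvalues and eigenvectors quoted in Section III-A, real arithmetic, and helper names chosen here. It computes the unnormalized $\vec{x}$, whereas the quantum state $\ket{x}$ is normalized according to (7).

```python
import math

s = 1 / math.sqrt(2)

# (eigenvalue, eigenvector) pairs of the example matrix A.
eigenpairs = [(2 / 3, [-s, -s]),  # lambda_0, u_0
              (4 / 3, [-s, s])]   # lambda_1, u_1
b = [0.0, 1.0]


def inner(u, v):
    """Inner product of two real vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Coefficients b_j = <u_j|b> of Eq. (4).
coeffs = [inner(u, b) for _, u in eigenpairs]

# x = sum_j (b_j / lambda_j) u_j, Eq. (5), written in the 0/1 basis.
x = [sum(c / lam * u[i] for c, (lam, u) in zip(coeffs, eigenpairs))
     for i in range(2)]
print(coeffs, x)
```

The components of `x` agree with the classical solution $(3/8, 9/8)$ up to rounding, so the ratio of the squared amplitudes that a measurement of the b-register would reveal is $1:9$.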

III Numerical Example

We will present a numerical example and apply HHL to it step-by-step. The implementation is shown in Figure 3. Firstly, we will discuss how to implement the controlled- $U$ and ancilla qubit rotations.

III-A Encoding Scheme

In this example, the matrix $A$ and vector $\vec{b}$ are set to be

A=\begin{pmatrix}1&-\frac{1}{3}\\ -\frac{1}{3}&1\\ \end{pmatrix}

(29)

\vec{b}=\begin{pmatrix}0\\ 1\\ \end{pmatrix}

(30)

The eigenvectors of $A$ are $\vec{u_{0}}=\begin{pmatrix}\frac{-1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}\end{pmatrix}$, $\vec{u_{1}}=\begin{pmatrix}\frac{-1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{pmatrix}$ with eigenvalues $\lambda_{0}=\frac{2}{3}$ and $\lambda_{1}=\frac{4}{3}$, respectively. We need to use basis encoding to encode the eigenvalues in the basis formed by the clock qubits. Two qubits are needed: $\lambda_{0}$ is encoded as $\ket{01}$ and $\lambda_{1}$ as $\ket{10}$, which maintains the ratio $\lambda_{1}/\lambda_{0}=2$. This means $\tilde{\lambda_{0}}=1$ and $\tilde{\lambda_{1}}=2$ or, in other words, $\ket{\tilde{\lambda_{0}}}=\ket{01}$ and $\ket{\tilde{\lambda_{1}}}=\ket{10}$. This gives a perfect encoding with $n=2$ (i.e. $N=4$). Therefore, $t$ is chosen to be $\frac{3\pi}{4}$ to achieve this encoding scheme, since $\tilde{\lambda_{j}}=N\lambda_{j}t/2\pi$.
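The choice of $t$ can be verified with a short calculation. The following sketch evaluates $\tilde{\lambda_{j}}=N\lambda_{j}t/2\pi$ for the two eigenvalues and prints the corresponding clock-register labels; the function name `scaled_eigenvalue` is an illustrative assumption.

```python
import math


def scaled_eigenvalue(lam, t, n):
    """Return lambda_tilde = N * lambda * t / (2*pi) with N = 2**n."""
    return (2 ** n) * lam * t / (2 * math.pi)


n = 2                # number of clock qubits
t = 3 * math.pi / 4  # evolution time chosen in the text

for lam in (2 / 3, 4 / 3):
    lam_tilde = scaled_eigenvalue(lam, t, n)
    # Basis-encoded label of the clock register for this eigenvalue.
    label = format(round(lam_tilde), "0{}b".format(n))
    print(lam, lam_tilde, label)
```

Up to floating-point rounding, the scaled eigenvalues are $1$ and $2$, i.e. the clock states $\ket{01}$ and $\ket{10}$.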

Since $\vec{b}$ is a 2-dimensional complex vector, it can be encoded using 1 qubit and, thus, $n_{b}=1$ .

The solution to the LSP is found to be

\vec{x}=\begin{pmatrix}\frac{3}{8}\\ \frac{9}{8}\\ \end{pmatrix}

(31)

whereby, the ratio of $\lvert x_{0}\rvert^{2}$ to $\lvert x_{1}\rvert^{2}$ is $1:9$ .

III-B Controlled-U Implementation

In reality, we expect the controlled-$U$ operation to be implemented by Hamiltonian simulation [18]. However, to understand the algorithm, we will derive the matrix for $U$ and then map it directly to the controlled-$U$ gate used in IBM-Q. Since $n=2$, two operations are needed, namely $U^{2^{1}}=U^{2}$ and $U^{2^{0}}=U$, controlled by $c_{1}$ and $c_{0}$, respectively.

In order to find the corresponding matrices for $U^{2}=e^{i2At}$ and $U=e^{iAt}$, we need to perform a similarity transformation on $i2At$ and $iAt$, exponentiate them, and transform back to the original basis.

The transformation matrix from the original basis to the eigenvector basis is

\begin{aligned}V&=\begin{pmatrix}\vec{u_{0}}&\vec{u_{1}}\end{pmatrix}\\
&=\begin{pmatrix}\frac{-1}{\sqrt{2}}&\frac{-1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\end{pmatrix}\end{aligned}

(32)

Since $V$ is real and symmetric, its conjugate transpose, $V^{\dagger}$, equals itself.

The diagonalized $A$, i.e. expressed in the basis formed by $\vec{u_{0}}$ and $\vec{u_{1}}$, is

\begin{aligned}A_{diag}&=V^{\dagger}AV\\
&=\begin{pmatrix}\frac{2}{3}&0\\ 0&\frac{4}{3}\end{pmatrix}\end{aligned}

(33)

As it is diagonal, $U$ can be obtained by exponentiation of the elements accordingly.

\begin{aligned}U_{diag}&=\begin{pmatrix}e^{i\lambda_{0}t}&0\\ 0&e^{i\lambda_{1}t}\end{pmatrix}\\
&=\begin{pmatrix}e^{i\pi/2}&0\\ 0&e^{i\pi}\end{pmatrix}\\
&=\begin{pmatrix}i&0\\ 0&-1\end{pmatrix}\end{aligned}

(34)

where $t=3\pi/4$ as mentioned earlier is used. Also,

\displaystyle U_{diag}^{2}=U_{diag}U_{diag}=\begin{pmatrix}-1&0\\ 0&1\end{pmatrix}

(35)

It is worth noting that both are naturally unitary which is a requirement for a quantum operation.

To obtain $U$ and $U^{2}$ in the original basis, we need to apply similarity transformation again in the reverse direction,

\begin{aligned}U&=VU_{diag}V^{\dagger}\\
&=\begin{pmatrix}\frac{-1}{\sqrt{2}}&\frac{-1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\end{pmatrix}\begin{pmatrix}i&0\\ 0&-1\end{pmatrix}\begin{pmatrix}\frac{-1}{\sqrt{2}}&\frac{-1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\end{pmatrix}\\
&=\frac{1}{2}\begin{pmatrix}-1+i&1+i\\ 1+i&-1+i\end{pmatrix}\end{aligned}

(36)

\begin{aligned}U^{2}&=VU_{diag}^{2}V^{\dagger}\\
&=\begin{pmatrix}\frac{-1}{\sqrt{2}}&\frac{-1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\end{pmatrix}\begin{pmatrix}-1&0\\ 0&1\end{pmatrix}\begin{pmatrix}\frac{-1}{\sqrt{2}}&\frac{-1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\end{pmatrix}\\
&=\begin{pmatrix}0&-1\\ -1&0\end{pmatrix}\end{aligned}

(37)
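The similarity transformation can be reproduced numerically. The sketch below is a plain-Python illustration and not the Matlab code accompanying the paper; `matmul` and `u_power` are helper names introduced here. It builds $U^{p}=V\,\mathrm{diag}(e^{ip\lambda_{j}t})\,V^{\dagger}$ for $p=1$ and $p=2$, using the fact that $V^{\dagger}=V$ in this example.

```python
import cmath
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


s = 1 / math.sqrt(2)
V = [[-s, -s], [-s, s]]  # columns are u_0 and u_1; V equals its conjugate transpose
t = 3 * math.pi / 4
eigenvalues = [2 / 3, 4 / 3]


def u_power(p):
    """Return U**p = V * diag(exp(i*p*lambda_j*t)) * V^dagger."""
    diag = [[cmath.exp(1j * p * eigenvalues[0] * t), 0],
            [0, cmath.exp(1j * p * eigenvalues[1] * t)]]
    return matmul(matmul(V, diag), V)


U = u_power(1)
U2 = u_power(2)
for row in U + U2:
    print([complex(round(z.real, 6), round(z.imag, 6)) for z in row])
```

Up to rounding, the printed rows agree with the matrices in (36) and (37).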

To implement $U$ and $U^{2}$, a 4-parameter arbitrary unitary gate with global phase [19] is used,

U=\begin{pmatrix}e^{i\gamma}\cos(\theta/2)&-e^{i(\gamma+\lambda)}\sin(\theta/2)\\ e^{i(\gamma+\phi)}\sin(\theta/2)&e^{i(\gamma+\phi+\lambda)}\cos(\theta/2)\end{pmatrix}

(38)

Here $\theta$, $\phi$, $\lambda$, and $\gamma$ are gate parameters and should not be confused with the phase $\phi$ and the eigenvalues $\lambda_{j}$ used earlier.

By choosing $\theta=\pi,\phi=\pi,\lambda=0,\gamma=0$, $U^{2}$ is implemented.

By choosing $\theta=\pi/2,\phi=-\pi/2,\lambda=\pi/2,\gamma=3\pi/4$ , $U$ is implemented.

For the IQPE part, we also need to implement $U^{-1}$ and $U^{-2}$ . Since in this example, ${(U^{2})}^{-1}=U^{2}$ , one can use the same set of parameters to implement $(U^{2})^{-1}$ .

However,

U^{-1}=\frac{1}{2}\begin{pmatrix}-1-i&1-i\\ 1-i&-1-i\end{pmatrix}

(39)

We need to choose $\theta=\pi/2,\phi=\pi/2,\lambda=-\pi/2,\gamma=-3\pi/4$ to implement $U^{-1}$ .
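The three parameter sets can be checked by evaluating the matrix in (38) directly. The sketch below is an illustration only; `u_gate` and `max_abs_diff` are helper names introduced here and are not qiskit functions.

```python
import cmath
import math


def u_gate(theta, phi, lam, gamma):
    """Matrix of the 4-parameter gate in Eq. (38)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(1j * gamma) * c,
             -cmath.exp(1j * (gamma + lam)) * s],
            [cmath.exp(1j * (gamma + phi)) * s,
             cmath.exp(1j * (gamma + phi + lam)) * c]]


def max_abs_diff(a, b):
    """Largest element-wise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


pi = math.pi
# Target matrices from Eqs. (36), (37) and (39).
target_U = [[(-1 + 1j) / 2, (1 + 1j) / 2], [(1 + 1j) / 2, (-1 + 1j) / 2]]
target_U2 = [[0, -1], [-1, 0]]
target_U_inv = [[(-1 - 1j) / 2, (1 - 1j) / 2], [(1 - 1j) / 2, (-1 - 1j) / 2]]

diff_U = max_abs_diff(u_gate(pi / 2, -pi / 2, pi / 2, 3 * pi / 4), target_U)
diff_U2 = max_abs_diff(u_gate(pi, pi, 0, 0), target_U2)
diff_U_inv = max_abs_diff(u_gate(pi / 2, pi / 2, -pi / 2, -3 * pi / 4), target_U_inv)
print(diff_U, diff_U2, diff_U_inv)
```

All three deviations are at the level of floating-point rounding, which confirms that the quoted parameters reproduce $U$, $U^{2}$, and $U^{-1}$.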

The controlled version of matrix $U^{\prime}$ can then be constructed using

\displaystyle C-U^{\prime}=I\otimes\ket{0}\bra{0}+U^{\prime}\otimes\ket{1}\bra{1}

(40)

Note that in this equation, only the controlling clock bit and the b-register are included for simplicity. For example,

$\displaystyle C-U^{-1}$	$\displaystyle=\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\otimes\begin{pmatrix}1&0\\ 0&0\end{pmatrix}+\frac{1}{2}\begin{pmatrix}-1-i&1-i\\ 1-i&-1-i\end{pmatrix}$
	$\displaystyle\otimes\begin{pmatrix}0&0\\ 0&1\end{pmatrix}$
	$\displaystyle=\begin{pmatrix}1&0&0&0\\ 0&0&0&0\\ 0&0&1&0\\ 0&0&0&0\end{pmatrix}+\frac{1}{2}\begin{pmatrix}0&0&0&0\\ 0&-1-i&0&1-i\\ 0&0&0&0\\ 0&1-i&0&-1-i\end{pmatrix}$
	$\displaystyle=\frac{1}{2}\begin{pmatrix}2&0&0&0\\ 0&-1-i&0&1-i\\ 0&0&2&0\\ 0&1-i&0&-1-i\end{pmatrix}$	(41)
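Equation (41) can be reproduced with an explicit Kronecker product. This is a small sketch in plain Python; `kron`, `add`, `matmul`, and `dagger` are hypothetical helper names, and the qubit ordering (target as the left factor, control as the right factor) follows Eq. (40).

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[c][r]).conjugate() for c in range(n)] for r in range(n)]


I2 = [[1, 0], [0, 1]]
P0 = [[1, 0], [0, 0]]  # |0><0| on the controlling qubit
P1 = [[0, 0], [0, 1]]  # |1><1| on the controlling qubit
U_inv = [[(-1 - 1j) / 2, (1 - 1j) / 2], [(1 - 1j) / 2, (-1 - 1j) / 2]]  # Eq. (39)

# Eq. (40): identity when the control is 0, U^-1 when the control is 1.
CU_inv = add(kron(I2, P0), kron(U_inv, P1))

for row in CU_inv:
    print(row)

# A controlled version of a unitary gate must itself be unitary.
product = matmul(CU_inv, dagger(CU_inv))
is_unitary = all(
    abs(product[r][c] - (1 if r == c else 0)) < 1e-12 for r in range(4) for c in range(4)
)
print(is_unitary)
```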

III-C Implementation of the Controlled-Rotation of Ancilla Qubit

The coefficients of $\ket{0}$ and $\ket{1}$ of the ancilla qubit after rotation in Eq. (22) are $\sqrt{1-\frac{C^{2}}{\tilde{\lambda_{j}}^{2}}}$ and $\frac{C}{\tilde{\lambda_{j}}}$ , respectively. The sum of the squares of their magnitudes is $1$ as required. For the first coefficient to be real, we also need $C\leq\tilde{\lambda_{j}}$ . Since the minimal $\tilde{\lambda_{j}}$ is $1$ , we set $C=1$ to maximize the probability of measuring $\ket{1}$ during the ancilla qubit measurement.

The transformation of $\ket{0}_{a}$ to $\sqrt{1-\frac{1}{\tilde{\lambda_{j}}^{2}}}\ket{0}_{a}+\frac{1}{\tilde{\lambda_{j}}}\ket{1}_{a}$ is known to be equivalent to $RY(\theta)$ rotation,

$\displaystyle RY(\theta)=\begin{pmatrix}\cos(\frac{\theta}{2})&-\sin(\frac{\theta}{2})\\ \sin(\frac{\theta}{2})&\cos(\frac{\theta}{2})\end{pmatrix}$	(42)

with $\theta=2\arcsin(\frac{1}{\tilde{\lambda_{j}}})$ . One can check this by multiplying $RY(\theta)$ to $\ket{0}_{a}$ .

	$\displaystyle RY(\theta)\ket{0}_{a}=\begin{pmatrix}\cos(\frac{\theta}{2})&-\sin(\frac{\theta}{2})\\ \sin(\frac{\theta}{2})&\cos(\frac{\theta}{2})\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix}$
	$\displaystyle=\cos(\frac{\theta}{2})\ket{0}_{a}+\sin(\frac{\theta}{2})\ket{1}_{a}$		(43)

Therefore, we will establish a function to implement this rotation. This function only needs to be valid when the inputs are the encoded eigenvalues, because only the encoded eigenvalues have non-zero amplitudes in the c-register as shown in (21). The function is defined as

$\displaystyle\theta(c)=\theta(c_{1}c_{0})=2\arcsin(\frac{1}{c})$	(44)

where $c$ is the value of the clock qubits and $c_{1}c_{0}$ is its binary form.

Since only $\ket{\tilde{\lambda_{j}}}$ has non-zero amplitude in (21), we only need to set up (44) such that it is correct for $\ket{c}$ = $\ket{01}$ and $\ket{10}$ , namely

	$\displaystyle\theta(1)=\theta(01)=2\arcsin(\frac{1}{1})=\pi$		(45)
	$\displaystyle\theta(2)=\theta(10)=2\arcsin(\frac{1}{2})=\frac{\pi}{3}$		(46)

The following function can achieve the goal,

$\displaystyle\theta(c)=\theta(c_{1}c_{0})=\frac{\pi}{3}c_{1}+\pi c_{0}$	(47)

Therefore, the controlled rotation can be implemented as the product of two controlled- $RY$ gates, one controlled by each clock qubit,

	$\displaystyle(\ket{1}\bra{1}\otimes I\otimes RY(\frac{\pi}{3})+\ket{0}\bra{0}\otimes I\otimes I)\cdot$
	$\displaystyle(I\otimes\ket{1}\bra{1}\otimes RY(\pi)+I\otimes\ket{0}\bra{0}\otimes I)$		(48)

where the operators operate on qubits $\ket{c_{1}}$ , $\ket{c_{0}}$ , and $\ket{a}$ from left to right, respectively.
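As a quick numerical check of Eqs. (43) to (47), the sketch below compares the exact angle of Eq. (44) with the bit-linear form of Eq. (47) for the two encoded eigenvalues and evaluates the resulting ancilla amplitudes. The function names are hypothetical and chosen only for illustration.

```python
import math


def theta_exact(c):
    """Eq. (44): rotation angle for clock value c (defined for c >= 1)."""
    return 2 * math.asin(1 / c)


def theta_bits(c1, c0):
    """Eq. (47): the same angle written as a linear function of the clock bits."""
    return math.pi / 3 * c1 + math.pi * c0


def ry_on_zero(theta):
    """Amplitudes of |0> and |1> in RY(theta)|0>, Eq. (43)."""
    return math.cos(theta / 2), math.sin(theta / 2)


# Only |01> (c = 1) and |10> (c = 2) carry non-zero amplitude in the c-register.
for c1, c0 in [(0, 1), (1, 0)]:
    c = 2 * c1 + c0
    amp0, amp1 = ry_on_zero(theta_bits(c1, c0))
    print(c, theta_exact(c), theta_bits(c1, c0), amp0, amp1)
```

For $\ket{11}$ , Eq. (47) gives $4\pi/3$ , which differs from $2\arcsin(1/3)$ . This does not matter here because $\ket{11}$ has zero amplitude in the c-register.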

III-D Quantum Circuit

An HHL circuit for the numerical example is then built and shown in Figure 3. We will then walk through the circuit using numerical substitution.

III-E Numerical Substitution

The algorithm begins with

$\displaystyle\ket{\Psi_{0}}=\ket{0}_{b}\otimes\ket{00}_{c}\otimes\ket{0}_{a}=\ket{0000}$	(49)

An X-gate is then applied to convert $\ket{0}_{b}$ to $\ket{1}_{b}$ with

$\displaystyle\ket{\Psi_{1}}=X\otimes I\otimes I\ket{\Psi_{0}}=\ket{1000}$	(50)

After applying the Hadamard gates to create a superposition among the clock qubits,

$\displaystyle\ket{\Psi_{2}}$	$\displaystyle=I\otimes H^{\otimes n}\otimes I\ket{\Psi_{1}}$
	$\displaystyle=\ket{1}\frac{1}{2^{\frac{2}{2}}}(\ket{0}+\ket{1})^{\otimes 2}\ket{0}$
	$\displaystyle=\frac{1}{2}(\ket{1000}+\ket{1010}+\ket{1100}+\ket{1110})$	(51)

Before applying the controlled- $U$ gates of the QPE in the bra-ket notation, it is convenient to perform a basis change of the b-register to the eigenvector basis of $A$ . Since $\ket{1}=\frac{1}{\sqrt{2}}(-\ket{u_{0}}+\ket{u_{1}})$ , we have $b_{0}=\frac{-1}{\sqrt{2}}$ and $b_{1}=\frac{1}{\sqrt{2}}$ . Therefore,

$\displaystyle\ket{\Psi_{2}}$	$\displaystyle=\ket{1}\frac{1}{2}(\ket{000}+\ket{010}+\ket{100}+\ket{110})$
	$\displaystyle=\frac{1}{\sqrt{2}}(-\ket{u_{0}}+\ket{u_{1}})\frac{1}{2}(\ket{000}+\ket{010}+\ket{100}$
	$\displaystyle+\ket{110})$
	$\displaystyle=\frac{1}{2\sqrt{2}}(-\ket{u_{0}}\ket{000}-\ket{u_{0}}\ket{010}-\ket{u_{0}}\ket{100}$
	$\displaystyle-\ket{u_{0}}\ket{110}+\ket{u_{1}}\ket{000}+\ket{u_{1}}\ket{010}$
	$\displaystyle+\ket{u_{1}}\ket{100}+\ket{u_{1}}\ket{110})$	(52)

In the controlled rotation operations, when the c-register is $\ket{k}_{c}$ , a phase of $2\pi k\phi_{j}$ with $\phi_{j}=\lambda_{j}t/2\pi$ is added to $\ket{u_{j}}$ (i.e. it is multiplied by $e^{2\pi ik\phi_{j}}=e^{ik\lambda_{j}t}$ ). Since $t=\frac{3\pi}{4}$ , $\lambda_{0}=\frac{2}{3}$ and $\lambda_{1}=\frac{4}{3}$ , we have

$\displaystyle\ket{\Psi_{3}}$	$\displaystyle=\frac{1}{2\sqrt{2}}(-\ket{u_{0}}\ket{000}-e^{2\pi i\phi_{0}}\ket{u_{0}}\ket{010}$
	$\displaystyle-e^{2\pi i2\phi_{0}}\ket{u_{0}}\ket{100}-e^{2\pi i3\phi_{0}}\ket{u_{0}}\ket{110}+$
	$\displaystyle\ket{u_{1}}\ket{000}+e^{2\pi i\phi_{1}}\ket{u_{1}}\ket{010}$
	$\displaystyle+e^{2\pi i2\phi_{1}}\ket{u_{1}}\ket{100}+e^{2\pi i3\phi_{1}}\ket{u_{1}}\ket{110})$
	$\displaystyle=\frac{1}{2\sqrt{2}}(-\ket{u_{0}}\ket{000}-e^{i\lambda_{0}t}\ket{u_{0}}\ket{010}$
	$\displaystyle-e^{i2\lambda_{0}t}\ket{u_{0}}\ket{100}-e^{i3\lambda_{0}t}\ket{u_{0}}\ket{110}+$
	$\displaystyle\ket{u_{1}}\ket{000}+e^{i\lambda_{1}t}\ket{u_{1}}\ket{010}$
	$\displaystyle+e^{i2\lambda_{1}t}\ket{u_{1}}\ket{100}+e^{i3\lambda_{1}t}\ket{u_{1}}\ket{110})$
	$\displaystyle=\frac{1}{2\sqrt{2}}(-\ket{u_{0}}\ket{000}-e^{i\pi/2}\ket{u_{0}}\ket{010}-e^{i\pi}\ket{u_{0}}\ket{100}$
	$\displaystyle-e^{i3\pi/2}\ket{u_{0}}\ket{110}+\ket{u_{1}}\ket{000}+e^{i\pi}\ket{u_{1}}\ket{010}$
	$\displaystyle+e^{i2\pi}\ket{u_{1}}\ket{100}+e^{i3\pi}\ket{u_{1}}\ket{110})$
	$\displaystyle=\frac{1}{2\sqrt{2}}(-\ket{u_{0}}\ket{000}-i\ket{u_{0}}\ket{010}+\ket{u_{0}}\ket{100}$
	$\displaystyle+i\ket{u_{0}}\ket{110}+\ket{u_{1}}\ket{000}-\ket{u_{1}}\ket{010}+$
	$\displaystyle\ket{u_{1}}\ket{100}-\ket{u_{1}}\ket{110})$	(53)

Before applying $IQFT$ , the terms are regrouped for simplicity.

	$\displaystyle\ket{\Psi_{3}}$	$\displaystyle=\frac{1}{2\sqrt{2}}((-\ket{u_{0}}+\ket{u_{1}})\ket{00}+(-i\ket{u_{0}}-\ket{u_{1}})\ket{01}$
		$\displaystyle+(\ket{u_{0}}+\ket{u_{1}})\ket{10}+(i\ket{u_{0}}-\ket{u_{1}})\ket{11})\ket{0}$		(54)

Now apply $IQFT$ to the clock qubits, e.g.

$\displaystyle\textrm{IQFT}\ket{10}$	$\displaystyle=\textrm{IQFT}\ket{2}$
	$\displaystyle=\frac{1}{2^{2/2}}\sum_{y=0}^{2^{2}-1}e^{-2\pi i2y/4}\ket{y}$
	$\displaystyle=\frac{1}{2}(\ket{0}-\ket{1}+\ket{2}-\ket{3})$
	$\displaystyle=\frac{1}{2}(\ket{00}-\ket{01}+\ket{10}-\ket{11})$	(55)

Similarly,

$\displaystyle\textrm{IQFT}\ket{00}$	$\displaystyle=\frac{1}{2}(\ket{00}+\ket{01}+\ket{10}+\ket{11})$	(56)

$\displaystyle\textrm{IQFT}\ket{01}$	$\displaystyle=\frac{1}{2}(\ket{00}-i\ket{01}-\ket{10}+i\ket{11})$	(57)

$\displaystyle\textrm{IQFT}\ket{11}$	$\displaystyle=\frac{1}{2}(\ket{00}+i\ket{01}-\ket{10}-i\ket{11})$	(58)

Therefore, applying $IQFT$ to $\ket{\Psi_{3}}$ and substituting Eq. (55) to (58),

$\displaystyle\ket{\Psi_{4}}$	$\displaystyle=\textrm{IQFT}\ket{\Psi_{3}}$
	$\displaystyle=\frac{1}{4\sqrt{2}}$
	$\displaystyle((-\ket{u_{0}}+\ket{u_{1}})(\ket{00}+\ket{01}+\ket{10}+\ket{11})+$
	$\displaystyle(-i\ket{u_{0}}-\ket{u_{1}})(\ket{00}-i\ket{01}-\ket{10}+i\ket{11})+$
	$\displaystyle(\ket{u_{0}}+\ket{u_{1}})(\ket{00}-\ket{01}+\ket{10}-\ket{11})+$
	$\displaystyle(i\ket{u_{0}}-\ket{u_{1}})(\ket{00}+i\ket{01}-\ket{10}-i\ket{11}))\ket{0}$
	$\displaystyle=\frac{1}{\sqrt{2}}(-\ket{u_{0}}\ket{01}+\ket{u_{1}}\ket{10})\ket{0}$	(59)

It can be seen that after $IQFT$ , the eigenvalues are encoded in the clock qubits as $\ket{01}$ and $\ket{10}$ , which are the only states with non-zero amplitudes due to constructive interference. The corresponding coefficients are $b_{0}=\frac{-1}{\sqrt{2}}$ and $b_{1}=\frac{1}{\sqrt{2}}$ . We clearly see the entanglement between the b-register and the c-register: $\ket{u_{0}}$ goes with $\ket{01}$ and $\ket{u_{1}}$ goes with $\ket{10}$ .
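The interference leading to Eq. (59) can be reproduced numerically by working in the eigenvector basis. The following is a minimal sketch, not the Matlab code of [21]; `psi4_amplitude` is a hypothetical helper that combines the phases of Eq. (53) with the IQFT of Eqs. (55) to (58).

```python
import cmath
import math

t = 3 * math.pi / 4                          # evolution time used in the example
lams = [2 / 3, 4 / 3]                        # eigenvalues lambda_0, lambda_1
b = [-1 / math.sqrt(2), 1 / math.sqrt(2)]    # b_0, b_1 in the eigenvector basis
N = 4                                        # dimension of the 2-qubit clock register


def psi4_amplitude(j, y):
    """Amplitude of |u_j>|y>_c after QPE: phases exp(i k lambda_j t), then IQFT."""
    total = sum(
        cmath.exp(1j * k * lams[j] * t) * cmath.exp(-2j * math.pi * y * k / N)
        for k in range(N)
    )
    # 1/sqrt(N) from the Hadamard gates times 1/sqrt(N) from the IQFT.
    return b[j] * total / N


for j in range(2):
    for y in range(N):
        amp = psi4_amplitude(j, y)
        if abs(amp) > 1e-9:
            print(f"u_{j}", format(y, "02b"), round(amp.real, 4))
```

Only two terms survive: $\ket{u_{0}}\ket{01}$ with amplitude $-1/\sqrt{2}$ and $\ket{u_{1}}\ket{10}$ with amplitude $1/\sqrt{2}$ , in agreement with Eq. (59).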

After performing the ancilla qubit rotation,

$\displaystyle\ket{\Psi_{5}}$	$\displaystyle=\sum_{j=0}^{2^{1}-1}b_{j}\ket{u_{j}}\ket{\tilde{\lambda_{j}}}(\sqrt{1-\frac{C^{2}}{\tilde{\lambda_{j}}^{2}}}\ket{0}+\frac{C}{\tilde{\lambda_{j}}}\ket{1})$
	$\displaystyle=-\frac{1}{\sqrt{2}}\ket{u_{0}}\ket{01}(\sqrt{1-\frac{1}{1^{2}}}\ket{0}+\frac{1}{1}\ket{1})+$
	$\displaystyle\frac{1}{\sqrt{2}}\ket{u_{1}}\ket{10}(\sqrt{1-\frac{1}{2^{2}}}\ket{0}+\frac{1}{2}\ket{1})$	(60)

If the ancilla qubit is measured to be $\ket{1}$ , the state, after renormalization, becomes

	$\displaystyle\ket{\Psi_{6}}$	$\displaystyle=\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}\ket{u_{0}}\ket{01}\ket{1}$
		$\displaystyle+\frac{1}{2\sqrt{2}}\ket{u_{1}}\ket{10}\ket{1})$		(61)

Applying QFT to the encoded eigenvalues, we have

$\displaystyle\textrm{QFT}\ket{10}$	$\displaystyle=\textrm{QFT}\ket{2}$
	$\displaystyle=\frac{1}{2^{2/2}}\sum_{y=0}^{2^{2}-1}e^{2\pi i2y/4}\ket{y}$
	$\displaystyle=\frac{1}{2}(\ket{00}-\ket{01}+\ket{10}-\ket{11})$	(62)

	$\displaystyle\textrm{QFT}\ket{01}$	$\displaystyle=\textrm{QFT}\ket{1}$
		$\displaystyle=\frac{1}{2}(\ket{00}+i\ket{01}-\ket{10}-i\ket{11})$		(63)

Therefore, applying QFT to $\ket{\Psi_{6}}$ and substituting Eq. (62) and (63), we obtain

$\displaystyle\ket{\Psi_{7}}$	$\displaystyle=\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}\ket{u_{0}}\frac{1}{2}(\ket{00}+i\ket{01}-\ket{10}$
	$\displaystyle-i\ket{11})+\frac{1}{2\sqrt{2}}\ket{u_{1}}\frac{1}{2}(\ket{00}-\ket{01}$
	$\displaystyle+\ket{10}-\ket{11}))\ket{1}$	(64)

For the controlled rotation, the state is multiplied by $e^{-i\lambda_{j}t}$ and $e^{-i2\lambda_{j}t}$ if $c_{0}=1$ and $c_{1}=1$ , respectively. Since $e^{-i\lambda_{0}t}=-i$ , $e^{-i2\lambda_{0}t}=-1$ , $e^{-i\lambda_{1}t}=-1$ , $e^{-i2\lambda_{1}t}=1$ , and $Nt/2\pi=3/2$ , we obtain

$\displaystyle\ket{\Psi_{8}}$	$\displaystyle=\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}\ket{u_{0}}\frac{1}{2}(\ket{00}+\ket{01}+\ket{10}+\ket{11})$
	$\displaystyle+\frac{1}{2\sqrt{2}}\ket{u_{1}}\frac{1}{2}(\ket{00}+\ket{01}+\ket{10}+\ket{11}))\ket{1}$
	$\displaystyle=\frac{1}{2}\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}\ket{u_{0}}+\frac{1}{2\sqrt{2}}\ket{u_{1}})(\ket{00}+\ket{01}$
	$\displaystyle+\ket{10}+\ket{11})\ket{1}$
	$\displaystyle=\frac{1}{2}(\frac{2}{3})\sqrt{\frac{8}{5}}(-\frac{1}{\frac{2}{3}\sqrt{2}}\ket{u_{0}}+\frac{1}{\frac{4}{3}\sqrt{2}}\ket{u_{1}})(\ket{00}+\ket{01}$
	$\displaystyle+\ket{10}+\ket{11})\ket{1}$	(65)

Finally, by applying Hadamard gates to the clock qubits,

	$\displaystyle\ket{\Psi_{9}}$	$\displaystyle=\frac{2}{3}\sqrt{\frac{8}{5}}(-\frac{1}{\frac{2}{3}\sqrt{2}}\ket{u_{0}}$
		$\displaystyle+\frac{1}{\frac{4}{3}\sqrt{2}}\ket{u_{1}})\ket{00}\ket{1}$		(66)

It can be verified that $\ket{\Psi_{9}}$ is a normalized vector as it should be because every operation in the HHL circuit is unitary and preserves the norm.

Equation (66) can be simplified by substituting $\ket{u_{0}}=\frac{-1}{\sqrt{2}}\ket{0}+\frac{-1}{\sqrt{2}}\ket{1}$ and $\ket{u_{1}}=\frac{-1}{\sqrt{2}}\ket{0}+\frac{1}{\sqrt{2}}\ket{1}$ . We obtain,

$\displaystyle\ket{\Psi_{9}}$	$\displaystyle=\frac{1}{2}\sqrt{\frac{2}{5}}(\ket{0}+3\ket{1})\ket{00}\ket{1}$	(67)

The probability ratio of obtaining $\ket{0}$ and $\ket{1}$ when b-register is measured is thus $1:9$ as expected.
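The $1:9$ ratio can be recomputed directly from the ancilla amplitudes of Eq. (60). The sketch below is only an illustration in plain Python; the variable names are hypothetical.

```python
import math

r = 1 / math.sqrt(2)
b = [-r, r]                  # b_0, b_1 from the basis change before Eq. (52)
lam_enc = [1, 2]             # encoded eigenvalues |01> and |10>
u = [[-r, -r], [-r, r]]      # components of |u_0> and |u_1> in the |0>, |1> basis

# Amplitudes of |u_j>|1>_a after the ancilla rotation with C = 1, Eq. (60).
raw = [b[j] / lam_enc[j] for j in range(2)]

# Probability of measuring the ancilla as |1>, and renormalization as in Eq. (61).
p_ancilla_one = sum(a * a for a in raw)
coef = [a / math.sqrt(p_ancilla_one) for a in raw]

# Back to the computational basis of the b-register, as in Eq. (67).
x = [sum(coef[j] * u[j][i] for j in range(2)) for i in range(2)]
probs = [xi ** 2 for xi in x]

print(p_ancilla_one)         # chance of the ancilla being |1>
print(x)                     # b-register amplitudes of |0> and |1>
print(probs[1] / probs[0])   # probability ratio of |1> to |0>
```

The ancilla is found in $\ket{1}$ with probability $5/8$ , which is the origin of the factor $\sqrt{8/5}$ in Eq. (61).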

III-F Simulation Results

Matlab code implementing the numerical example using the matrix approach is available at [21]. In the Matlab code, measurement is not performed (i.e. there is no partial tracing of the matrix). $\ket{\Psi_{9}}$ is found to be,

$\displaystyle\ket{\Psi_{9}}=\begin{pmatrix}-0.4330\\ 0.2500\\ 0.0000\\ -0.0000\\ 0.0000\\ -0.0000\\ 0.0000\\ 0.0000\\ 0.4330\\ 0.7500\\ -0.0000\\ 0.0000\\ -0.0000\\ 0.0000\\ -0.0000\\ 0.0000\end{pmatrix}$	(68)

Since the states with $\ket{0}_{a}$ are discarded during the measurement step, only $\ket{0001}$ and $\ket{1001}$ are left. Their probability ratio is $0.25^{2}:0.75^{2}=1:9$ as expected.

The circuit in Fig. 3 is also simulated in the IBM-Q system (code available at [21]). Since only the b-register and the ancilla qubit are measured, there are only four possible outputs as shown in Figure 4. Again, only $\ket{1}_{a}$ should be considered. The ratio of the measurement probability of $\ket{0}_{b}\ket{1}_{a}$ to $\ket{1}_{b}\ket{1}_{a}$ is $0.063:0.564=1:8.95$ , which is close to the expected value.

On the other hand, due to the imperfection and noise in a real quantum computer, the hardware execution of the same circuit does not give a satisfactory result (Figure 5). The ratio of the measurement probability of $\ket{0}_{b}\ket{1}_{a}$ to $\ket{1}_{b}\ket{1}_{a}$ is only $0.142:0.361=1:2.54$ .

IV Conclusion

In this paper, we presented the HHL algorithm through a step-by-step walkthrough of the derivation. A numerical example is also presented in the bra-ket notation. The numerical example echoes the analytical derivation to help students understand how qubits evolve in this important and relatively complex algorithm. Matlab code corresponding to the numerical example is provided to help understand the algorithm from the matrix point of view. A Qiskit implementation of the corresponding circuit, which can be simulated in IBM-Q and run on IBM quantum computing hardware, is also available. Through this self-contained and step-by-step walkthrough, the basic concepts in quantum computing are reinforced.

V APPENDIX

V-1 Hermitian matrix

A Hermitian matrix is a matrix that equals its adjoint matrix (transpose followed by complex conjugation). That is, if $A$ is a Hermitian matrix, then

$\displaystyle A=A^{\dagger}=(A^{T})^{*}$	(V.1)

where $A^{T}$ is the transpose of $A$ .

In this paper, the matrix, $A$ , in the LSP to be solved is assumed to be Hermitian.

Another example is in (III-B), where $V$ is Hermitian.

	$\displaystyle V=\begin{pmatrix}\vec{u_{0}}&\vec{u_{1}}\end{pmatrix}$
	$\displaystyle=\begin{pmatrix}\frac{-1}{\sqrt{2}}&\frac{-1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\end{pmatrix}$		(III-B)

V-2 Bra-ket Notation

Bra-ket notation is commonly used in quantum mechanics. A vector $\vec{v}$ is represented as $\ket{v}$ in its ket form. The bra form of the vectors forms a dual space to the space of the kets. The bra form of $\vec{v}$ is $\bra{v}$ .

In matrix representation, ket is the complex conjugate transpose of bra and vice versa. For example, if $\ket{v}=\begin{pmatrix}1\\ -i\end{pmatrix}$ , then $\bra{v}=\begin{pmatrix}1&i\end{pmatrix}$ .

V-3 Superposition

A superposition (or quantum superposition) is a quantum state that is a linear combination of two or more basis states. For example, a superposition state can be $\ket{v}=c_{1}\ket{1}+c_{2}\ket{0}$ , where $c_{1}$ and $c_{2}$ are complex numbers and $\ket{1}$ and $\ket{0}$ are basis states. A Hadamard gate is a gate commonly used to create a superposition state (Appendix V-5).

V-4 Basis Transformation and Quantum Gate

In quantum computing, we only care about the basis transformation due to rotation in the hyperspace. The transformation is equivalent to the multiplication of the basis vectors by a unitary matrix, $U$ , which is the transformation matrix. All quantum gates can be defined as how the basis vectors are transformed from the initial basis vector to the final basis vectors. Usually, a quantum gate rotates a basis state into another basis state (e.g. the NOT gate) and has its classical counterpart. But there are some gates that rotate a basis state to a superposition of two or more basis states. Such gates have no classical counterparts. For example, a Hadamard gate defines how an initial basis vector is rotated to an equal superposition of two basis vectors (Appendix V-5).

V-5 Hadamard Gate

The Hadamard gate is a quantum gate that does not have a classical counterpart. It rotates the basis state to create an equal superposition of the basis states. For a 1-qubit case, this means it has equal probability (i.e. $\frac{1}{2}$ ) of measuring $\ket{0}$ and $\ket{1}$ .

The matrix form of the Hadamard gate is,

$\displaystyle H=\frac{1}{\sqrt{2}}\begin{pmatrix}1&1\\ 1&-1\end{pmatrix}$	(V.2)

When it is applied on the basis state $\ket{1}$ , which is $\begin{pmatrix}0\\ 1\end{pmatrix}$ in matrix form, we have,

$\displaystyle H\ket{1}$	$\displaystyle=\frac{1}{\sqrt{2}}\begin{pmatrix}1&1\\ 1&-1\end{pmatrix}\begin{pmatrix}0\\ 1\end{pmatrix}$		(V.3)
	$\displaystyle=\frac{1}{\sqrt{2}}\begin{pmatrix}1\\ -1\end{pmatrix}$		(V.4)

which can also be represented in bra-ket form as,

$\displaystyle\frac{\ket{0}-\ket{1}}{\sqrt{2}}$	(V.5)

In this paper, Hadamard gates are applied to the clock qubits to create a superposition when going from $\ket{\Psi_{1}}$ to $\ket{\Psi_{2}}$ . For example, in (11),

$\displaystyle\ket{\Psi_{2}}=I^{\otimes n_{b}}\otimes H^{\otimes n}\otimes I\ket{\Psi_{1}}$	(11)

where $\ket{\Psi_{2}}$ is obtained by applying the tensor product of identity gates and an $n$ -qubit Hadamard gate to $\ket{\Psi_{1}}$ . The identity gates are applied to the b-register and the ancilla qubit while the Hadamard gate is applied to the clock qubits. In this equation, the $n$ -qubit Hadamard gate is represented as $H^{\otimes n}$ , i.e. the tensor product of $n$ 1-qubit Hadamard gates.

V-6 Entanglement

Entanglement refers to a quantum state of a system of two or more qubits that cannot be expressed as a tensor product of the individual qubit states. This is an important feature that quantum computing uses often. As an example,

$\displaystyle\ket{\Psi}=\frac{1}{\sqrt{2}}(\ket{00}+\ket{11})$	(V.6)

is an entangled state. It cannot be expressed as a tensor product of two individual qubit states.

In this paper, after the ancilla qubit rotation and measurement, we have

$\displaystyle\ket{\Psi_{6}}=\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}\ket{u_{0}}\ket{01}\ket{1}+\frac{1}{2\sqrt{2}}\ket{u_{1}}\ket{10}\ket{1})$	(61)

where the b-register and the c-register are entangled and $\ket{u_{0}}$ ( $\ket{u_{1}}$ ) always appears with $\ket{01}$ ( $\ket{10}$ ) after the measurement.

If the b-register were not entangled with the c-register, we would have

$\displaystyle\ket{\Psi_{6}}=\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}\ket{u_{0}}+\frac{1}{2\sqrt{2}}\ket{u_{1}})$	(V.7)

By substituting $\ket{u_{0}}=\frac{-1}{\sqrt{2}}\ket{0}+\frac{-1}{\sqrt{2}}\ket{1}$ and $\ket{u_{1}}=\frac{-1}{\sqrt{2}}\ket{0}+\frac{1}{\sqrt{2}}\ket{1}$ and after simplification, we have

$\displaystyle\ket{\Psi_{6}}$	$\displaystyle=\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}(\frac{-1}{\sqrt{2}}\ket{0}+\frac{-1}{\sqrt{2}}\ket{1})+\frac{1}{2\sqrt{2}}(\frac{-1}{\sqrt{2}}\ket{0}+$
	$\displaystyle\frac{1}{\sqrt{2}}\ket{1}))$
	$\displaystyle=\sqrt{\frac{8}{5}}(\frac{1}{4}\ket{0}+\frac{3}{4}\ket{1})$	(V.8)

This is the same as (67). The probability of measuring $\ket{0}$ and $\ket{1}$ has the ratio of 1:9 as expected.

However, when there is entanglement, the probability of measuring $\ket{0}$ and $\ket{1}$ would not be 1:9 because the previous grouping is impossible: the amplitudes attached to different clock states cannot be added together.

	$\displaystyle\ket{\Psi_{6}}$	$\displaystyle=\sqrt{\frac{8}{5}}(-\frac{1}{\sqrt{2}}(\frac{-1}{\sqrt{2}}\ket{0}+\frac{-1}{\sqrt{2}}\ket{1})\ket{01}+\frac{1}{2\sqrt{2}}(\frac{-1}{\sqrt{2}}\ket{0}$
		$\displaystyle+\frac{1}{\sqrt{2}}\ket{1})\ket{10})$		(V.9)
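The difference between the two cases can be made concrete by computing the b-register probabilities from the amplitudes in Eq. (V.8) and Eq. (V.9). The following sketch uses plain Python dictionaries; the names `entangled`, `disentangled`, and `b_probabilities` are hypothetical.

```python
import math

s = math.sqrt(8 / 5)
r = 1 / math.sqrt(2)

# Joint amplitudes of |b>|clock> read off from Eq. (V.9) (ancilla omitted).
entangled = {
    ("0", "01"): s * (-r) * (-r),
    ("1", "01"): s * (-r) * (-r),
    ("0", "10"): s * (r / 2) * (-r),
    ("1", "10"): s * (r / 2) * r,
}

# Without entanglement (Eq. (V.8)) both branches share one clock state,
# so their amplitudes add before being squared.
disentangled = {
    ("0", "00"): entangled[("0", "01")] + entangled[("0", "10")],
    ("1", "00"): entangled[("1", "01")] + entangled[("1", "10")],
}


def b_probabilities(amplitudes):
    """Marginal probabilities of the b-register: sum |amplitude|^2 over clock states."""
    probs = {"0": 0.0, "1": 0.0}
    for (b_bit, _clock), amp in amplitudes.items():
        probs[b_bit] += abs(amp) ** 2
    return probs


print(b_probabilities(disentangled))  # ratio 1:9
print(b_probabilities(entangled))     # ratio 1:1
```

This illustrates why the clock qubits need to be uncomputed back to $\ket{00}$ before the b-register is measured.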

V-7 Controlled Operation

A controlled operation requires more than one qubit. For a 2-qubit controlled gate, an operation is applied to a qubit (the target qubit) if the value of the controlling qubit is 1 in the basis vector.

For example, in Figure 2, $\ket{b}$ is the target qubit and $\ket{c_{n-1}}$ is the controlling qubit. The operation of $U^{2^{n-1}}$ is applied to $\ket{b}$ only if $\ket{c_{n-1}}$ is 1 in the basis state (e.g. $\ket{bc_{n-1}\cdots}=\ket{01\cdots}$ ).

In general, the controlled version of a unitary gate, $U^{\prime}$ , can be implemented using the following equation if the LSB is the controlling qubit,

$\displaystyle C-U^{\prime}=I\otimes\ket{0}\bra{0}+U^{\prime}\otimes\ket{1}\bra{1}$	(40)

which literally means that if the controlling qubit is 0, the identity gate is applied to the target qubit (MSB). Otherwise, $U^{\prime}$ is applied to the target qubit.

V-8 Eigenvalue and Eigenvector

When a non-zero $n\times n$ matrix $A$ is applied to an $n$ -dimensional vector $\vec{V}$ and has the following relationship,

$\displaystyle A\vec{V}=\lambda\vec{V}$	(V.10)

where $\lambda$ is a scalar, then, by definition, $\vec{V}$ and $\lambda$ are the eigenvector and eigenvalue of $A$ , respectively. This is similar to (3), where the matrix $A$ is expressed as a linear combination of the outer products of its eigenvectors, $\ket{u_{i}}\bra{u_{i}}$ .

$\displaystyle A=\sum_{i=0}^{2^{n_{b}}-1}\lambda_{i}\ket{u_{i}}\bra{u_{i}}$	(3)

This can be checked by applying $A$ to its eigenvector $\ket{u_{j}}$ ,

$\displaystyle A\ket{u_{j}}$	$\displaystyle=\sum_{i=0}^{2^{n_{b}}-1}\lambda_{i}\ket{u_{i}}\bra{u_{i}}\ket{u_{j}}$	(V.11)
	$\displaystyle=\sum_{i=0}^{2^{n_{b}}-1}\lambda_{i}\ket{u_{i}}\delta_{ij}$
	$\displaystyle=\lambda_{j}\ket{u_{j}}$

which meets the definition in Eq. (V.10).
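For the numerical example of Section III, the decomposition in (3) can be verified with a few lines of code. This is a sketch in plain Python with hypothetical helper names (`outer`, `matvec`); it uses the eigenvalues $\lambda_{0}=2/3$ , $\lambda_{1}=4/3$ and the eigenvectors $\ket{u_{0}}$ , $\ket{u_{1}}$ quoted in the main text.

```python
import math

r = 1 / math.sqrt(2)
eigenvalues = [2 / 3, 4 / 3]
eigenvectors = [[-r, -r], [-r, r]]  # |u_0> and |u_1> (real-valued)


def outer(v):
    """Outer product |v><v| for a real vector v."""
    return [[a * c for c in v] for a in v]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


# Eq. (3): A as a weighted sum of outer products of its eigenvectors.
A = [
    [sum(lam * outer(u)[i][j] for lam, u in zip(eigenvalues, eigenvectors)) for j in range(2)]
    for i in range(2)
]
print(A)

# Eq. (V.10): A|u_j> should equal lambda_j |u_j>.
residuals = []
for lam, u in zip(eigenvalues, eigenvectors):
    Au = matvec(A, u)
    residuals.append(max(abs(Au[i] - lam * u[i]) for i in range(2)))
print(residuals)
```

The reconstructed matrix agrees with the matrix $A$ of Eq. (V.21) up to rounding.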

V-9 Different Types of Encoding

The three common types of encodings are explained here.

•

Basis Encoding- Basis encoding converts classical information such as numbers or matrix to quantum information in the form of basis states. For example,

$\displaystyle x=2\ \underrightarrow{\textrm{binary}}\ 10\ \underrightarrow{\textrm{quantum state}}\ \ket{10}$

$\displaystyle x=\begin{pmatrix}2\\ 3\end{pmatrix}\underrightarrow{\textrm{binary}}\begin{pmatrix}10\\ 11\end{pmatrix}\underrightarrow{\textrm{quantum state}}\ket{1011}$	(V.12)

•

Amplitude Encoding- Amplitude encoding encodes the information as the coefficients of the basis vectors. For example, for

$\displaystyle\vec{v}=\begin{pmatrix}v_{0}\\ v_{1}\end{pmatrix}$ (V.13)

which is assumed to be normalized ( $|v|=1$ ), it can be encoded as in the following quantum state,

$\displaystyle\ket{v}=v_{0}\ket{0}+v_{1}\ket{1}$ (V.14)

where $v_{0}$ and $v_{1}$ become the coefficients of the basis states, $\ket{0}$ and $\ket{1}$ , respectively. In the main text, Eq. (9) shows how the values of the components of vector $\ket{b}$ are encoded using amplitude encoding.
•

Hamiltonian Encoding- One type of the Hamiltonian encoding is to encode the matrix as the Hamiltonian in a unitary gate. For example, in this paper, Eq. (17) shows that

$\displaystyle U=e^{iAt}$ (17)

where it encodes matrix $A$ as the Hamiltonian of the unitary gate $U$ . Matrix $A$ needs to be Hermitian as it is used to represent the Hamiltonian (the energy) of the system. However, $A$ does not need to be unitary and $U$ will be unitary due to its definition in (17).

V-10 Discrete Fourier Transform (DFT)

The discrete Fourier Transform (DFT) transforms an $N$ -dimensional vector $\vec{X}$ to another $N$ -dimensional vector $\vec{Y}$ . The transformation matrix $\Omega$ contains the powers of the $N$ -th root of unity, $\omega=e^{i2\pi/N}$ . The transformation is represented as,

$\displaystyle\vec{Y}$	$\displaystyle=$	$\displaystyle\Omega\vec{X}$
$\displaystyle\begin{pmatrix}y_{0}\\ y_{1}\\ \vdots\\ y_{N-1}\end{pmatrix}$	$\displaystyle=\frac{1}{\sqrt{N}}\begin{pmatrix}\omega^{-0\cdot 0}&\cdots&\omega^{-0\cdot(N-1)}\\ \omega^{-1\cdot 0}&\cdots&\omega^{-1\cdot(N-1)}\\ \vdots&\ddots&\vdots\\ \omega^{-(N-1)\cdot 0}&\cdots&\omega^{-(N-1)\cdot(N-1)}\end{pmatrix}$
	$\displaystyle\cdot\begin{pmatrix}x_{0}\\ x_{1}\\ \vdots\\ x_{N-1}\end{pmatrix}$		(V.15)

V-11 Inverse Quantum Fourier Transform (IQFT) and Quantum Fourier Transform (QFT)

Mathematically, IQFT is similar to DFT (See Appendix V-10). The transformation matrix, $U_{I}$ , is $N\times N$ for an $N$ -dimensional Hilbert space. Therefore, $N=2^{n}$ for an $n$ -qubit system. Eq. (V.15) becomes

$\displaystyle\ket{Y}=U_{I}\ket{X}$	(V.16)

and $U_{I}$ has the same expression as $\Omega$ in DFT.

$\displaystyle U_{I}=\frac{1}{\sqrt{N}}\begin{pmatrix}\omega^{-0\cdot 0}&\cdots&\omega^{-0\cdot(N-1)}\\ \omega^{-1\cdot 0}&\cdots&\omega^{-1\cdot(N-1)}\\ \vdots&\ddots&\vdots\\ \omega^{-(N-1)\cdot 0}&\cdots&\omega^{-(N-1)\cdot(N-1)}\end{pmatrix}$	(V.17)

Note that in some literature, e.g. [19], this form of IQFT is called QFT. $\ket{X}$ and $\ket{Y}$ are the quantum states/vectors in the $N$ -dimensional Hilbert space. IQFT can be treated as the rotation of $\ket{X}$ to $\ket{Y}$ .

If $\ket{X}$ is a basis vector $\ket{k}$ , applying IQFT to $\ket{k}$ using Eq. (V.17), we have

$\displaystyle U_{I}\ket{k}=\frac{1}{\sqrt{N}}\sum_{j=0}^{N-1}\omega^{-jk}\ket{j}$	(V.18)

This is the equation used often in this paper. It tells us that by applying IQFT to a basis vector, the basis vector is rotated to a superposition of all basis vectors weighted by the powers of the $N$ -th root of unity.

For example, in (55) in the main text,

$\displaystyle\textrm{IQFT}\ket{10}$	$\displaystyle=\textrm{IQFT}\ket{2}$
	$\displaystyle=\frac{1}{2^{2/2}}\sum_{y=0}^{2^{2}-1}e^{-2\pi i2y/4}\ket{y}$
	$\displaystyle=\frac{1}{2}(\ket{0}-\ket{1}+\ket{2}-\ket{3})$
	$\displaystyle=\frac{1}{2}(\ket{00}-\ket{01}+\ket{10}-\ket{11})$	(55)

where $N=2^{2}=4$ , $k=2$ , $j=y$ in (V.18) is used. The basis state $\ket{10}$ becomes a superposition of all basis states after $IQFT$ .

A more complex example is the general equation, (II-C), for the $IQFT$ in Figure 1.

$\displaystyle\ket{\Psi_{4}}$	$\displaystyle=\ket{b}\textrm{IQFT}(\frac{1}{2^{\frac{n}{2}}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\phi k}\ket{k})\ket{0}_{a}$
	$\displaystyle=\ket{b}\frac{1}{2^{\frac{n}{2}}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\phi k}(\textrm{IQFT}\ket{k})\ket{0}_{a}$
	$\displaystyle=\ket{b}\frac{1}{2^{n}}\sum_{k=0}^{2^{n}-1}e^{2\pi i\phi k}(\sum_{y=0}^{2^{n}-1}e^{-2\pi iyk/N}\ket{y})\ket{0}_{a}$
	$\displaystyle=\frac{1}{2^{n}}\ket{b}\sum_{y=0}^{2^{n}-1}\sum_{k=0}^{2^{n}-1}e^{2\pi ik(\phi-y/N)}\ket{y}\ket{0}_{a}$	(II-C)

Here, $IQFT$ is applied to the c-register, which is a superposition of basis states, $\ket{k}$ . Using the distributive law of matrix operations (linearity), $IQFT$ is applied to each individual $\ket{k}$ and (V.18) is used with $y=j$ and $N=2^{n}$ .

QFT is the inverse of $IQFT$ and can be treated as the rotation of the basis. The rotation matrix is given by

$\displaystyle U_{Q}=\frac{1}{\sqrt{N}}\begin{pmatrix}1&1&\cdots&1\\ 1&\omega^{1}&\cdots&\omega^{(N-1)}\\ 1&\omega^{2}&\cdots&\omega^{2(N-1)}\\ \vdots&\vdots&\ddots&\vdots\\ 1&\omega^{(N-1)}&\cdots&\omega^{(N-1)(N-1)}\end{pmatrix}$	(V.19)

Equivalently,

$\displaystyle U_{Q}\ket{k}=\frac{1}{\sqrt{N}}\sum_{j=0}^{N-1}\omega^{jk}\ket{j}$	(V.20)

It can be shown that $U_{I}=U_{Q}^{-1}$ or $U_{I}U_{Q}=I$ .
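For the 2-qubit case used in this paper, this relation can be confirmed numerically. The sketch below builds $U_{I}$ and $U_{Q}$ from Eqs. (V.17) and (V.19) in plain Python; `fourier_matrix` and `matmul` are hypothetical helper names.

```python
import cmath
import math


def fourier_matrix(n_qubits, sign):
    """sign = -1 gives U_I of Eq. (V.17); sign = +1 gives U_Q of Eq. (V.19)."""
    N = 2 ** n_qubits
    omega = cmath.exp(2j * math.pi / N)  # N-th root of unity
    return [[omega ** (sign * j * k) / math.sqrt(N) for k in range(N)] for j in range(N)]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]


U_I = fourier_matrix(2, -1)
U_Q = fourier_matrix(2, +1)

# U_I U_Q should be the 4x4 identity matrix.
product = matmul(U_I, U_Q)
is_identity = all(
    abs(product[r][c] - (1 if r == c else 0)) < 1e-12 for r in range(4) for c in range(4)
)
print(is_identity)

# Column k = 2 of U_I holds the amplitudes of IQFT|10>, as in Eq. (55).
iqft_of_2 = [U_I[j][2] for j in range(4)]
print([round(a.real, 4) for a in iqft_of_2])
```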

V-12 Implementation of $QFT$ and $IQFT$

$QFT$ and $IQFT$ are constructed using Hadamard gates, controlled phase shift gates, and $SWAP$ gates. Readers may refer to other sources for the details (e.g. [19]). Here, we show the circuit of a 2-qubit $IQFT$ gate (Fig. 6).

In general, the phase shift angle is $\phi=\frac{-2\pi}{2^{r}}$ , where $r-1$ is the distance between the controlling qubit and the target qubit. For the 2-qubit $IQFT$ case, there is only one controlled phase shift gate with $r=2$ , which results in the phase $\phi=\frac{-\pi}{2}$ .

For QFT, the circuit is the same as that of the $IQFT$ , but the phase shift is negated, i.e. $\phi=\frac{2\pi}{2^{r}}$ . This can be appreciated by the fact that the exponents of the elements in the $IQFT$ and $QFT$ matrices have opposite signs in (V.17) and (V.19), respectively.

V-13 Gaussian elimination method

Here, Gaussian elimination is demonstrated by solving (1) using the numerical example in Section III.

$\displaystyle A\vec{x}=\vec{b}$	(1)

$\displaystyle\begin{pmatrix}1&\frac{-1}{3}\\ \frac{-1}{3}&1\end{pmatrix}\begin{pmatrix}x_{0}\\ x_{1}\end{pmatrix}=\begin{pmatrix}0\\ 1\end{pmatrix}$	(V.21)

which is rewritten as an augmented matrix and solved for $x_{0}$ and $x_{1}$ by Gaussian elimination,

$\displaystyle\begin{bmatrix}1&\frac{-1}{3}&0\\ \frac{-1}{3}&1&1\end{bmatrix}\underrightarrow{\textrm{Row2}=3\times\textrm{Row2}+\textrm{Row1}}\begin{bmatrix}1&\frac{-1}{3}&0\\ 0&\frac{8}{3}&3\end{bmatrix}$

$\displaystyle\underrightarrow{\textrm{Row1}=3\times\textrm{Row1},\ \textrm{Row2}=\frac{3}{8}\times\textrm{Row2}}\begin{bmatrix}3&-1&0\\ 0&1&\frac{9}{8}\end{bmatrix}$

$\displaystyle\begin{bmatrix}3&-1&0\\ 0&1&\frac{9}{8}\end{bmatrix}\underrightarrow{\textrm{Row1}=\frac{1}{3}\times(\textrm{Row1}+\textrm{Row2})}\begin{bmatrix}1&0&\frac{3}{8}\\ 0&1&\frac{9}{8}\end{bmatrix}$

Thus, the solution $\vec{x}$ is

$\displaystyle\begin{pmatrix}x_{0}\\ x_{1}\end{pmatrix}=\begin{pmatrix}\frac{3}{8}\\ \frac{9}{8}\end{pmatrix}$

The complexity of Gaussian Elimination is $O(N^{3})$ . This is much slower than the classical conjugate gradient method (Appendix V-14), to which HHL is compared.
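The elimination above can be reproduced with exact rational arithmetic. The following is a minimal Gauss-Jordan sketch using Python's `fractions` module; the function name `gaussian_solve` is hypothetical, and the routine assumes non-zero pivots (no row swapping), which holds for this example. Its row operations are not ordered exactly as in the hand calculation, but the result is the same.

```python
from fractions import Fraction as F


def gaussian_solve(aug):
    """Gauss-Jordan elimination on an augmented matrix [A | b].

    Assumes every pivot is non-zero, so no pivoting is performed.
    """
    m = [row[:] for row in aug]
    n = len(m)
    for i in range(n):
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]          # scale the pivot row to get a leading 1
        for r in range(n):
            if r != i:
                factor = m[r][i]
                # subtract a multiple of the pivot row to clear column i
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return [row[-1] for row in m]


# Augmented matrix of Eq. (V.21).
aug = [[F(1), F(-1, 3), F(0)],
       [F(-1, 3), F(1), F(1)]]
x = gaussian_solve(aug)
print(x)  # [Fraction(3, 8), Fraction(9, 8)]
```

The amplitude ratio $x_{0}:x_{1}=1:3$ gives the $1:9$ probability ratio obtained from the HHL circuit.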

V-14 Conjugate Gradient Method

The Conjugate Gradient Method (CGM) solves the LSP with a complexity of $O(N)$ and is the fastest known classical solver. Therefore, the speed of HHL, which has a complexity of $O(log(N))$ , is often compared to the speed of CGM [22]. Thus, HHL provides an exponential speedup over the fastest known classical method.

Consider solving a system of linear equations, (1), where $A$ is a matrix, $\vec{b}$ is a vector, and $\vec{x}$ is to be solved. If $A$ is a $2\times 2$ matrix and $\vec{b}$ is $2\times 1$ , then $\vec{x}$ can be solved easily. But if $A$ is a large matrix, for example $1{,}000{,}000{,}000\times 1{,}000{,}000{,}000$ , and $\vec{b}$ is a $1{,}000{,}000{,}000\times 1$ vector, then $N$ is $1{,}000{,}000{,}000$ . Solving for $\vec{x}$ with the classical Gaussian elimination method takes $O(N^{3})$ steps, where $O$ denotes the big-O notation. With the classical conjugate gradient method and a sparse matrix (one that contains mostly zeros), it takes $O(N)$ steps. With the HHL quantum algorithm and a sparse matrix, it takes $O(\log(N))$ steps.

However, according to [24], the inner product in the HHL quantum algorithm takes only $\log(mn)/\epsilon$ steps when certain amplitudes can be obtained and distinguished from the other amplitudes; otherwise, the algorithm is only quadratically faster than the classical algorithm.

To solve (1) using the CGM,

$\displaystyle A\vec{x}=\vec{b}$	(1)

an initial guess of $\vec{x}$ is used as the starting point. The residual is then found and the search direction is determined by finding the steepest descent. This is repeated until a stable condition is met.

The residual in the $k$ -th search is given as,

$\displaystyle R_{k}=\vec{b}-A\vec{x}_{k}$	(V.22)

The readers do not need to understand CGM to understand HHL. Interested readers may refer to the literature (e.g. [9]) for more details.
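For readers who want to experiment, the iteration can be sketched as follows for the $2\times 2$ example of Eq. (V.21). This is a textbook conjugate gradient loop written in plain Python under the assumption that $A$ is symmetric and positive definite; it is not taken from [9] or [21], and all function names are hypothetical.

```python
import math


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(a, b):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(A, b, x0, tol=1e-12, max_iter=50):
    """Textbook CG for a symmetric positive-definite A; returns (x, iterations)."""
    x = x0[:]
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]  # residual, Eq. (V.22)
    p = r[:]                                            # first search direction
    rs_old = dot(r, r)
    iterations = 0
    while iterations < max_iter and math.sqrt(rs_old) > tol:
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                     # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]  # next direction
        rs_old = rs_new
        iterations += 1
    return x, iterations


A = [[1, -1 / 3], [-1 / 3, 1]]
b = [0, 1]
x, iterations = conjugate_gradient(A, b, [0.0, 0.0])
print(x, iterations)
```

Starting from $\vec{x}_{0}=\vec{0}$ , the loop reaches $(3/8,\ 9/8)$ up to rounding, the same solution as the Gaussian elimination in Appendix V-13.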

References

[1] P. W. Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” Proceedings 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124–134, doi: 10.1109/SFCS.1994.365700.
[2] C. Outeiral, M. Strahm, J. Shi, G. M. Morris, S. C. Benjamin, and C. M. Deane, “The prospects of quantum computing in computational molecular biology,” WIREs Comput. Mol. Sci., 11, e1481 (2021).
[3] D. J. Egger et al., “Quantum Computing for Finance: State-of-the-Art and Future Prospects,” in IEEE Transactions on Quantum Engineering, 1, pp. 1–24, 2020, Art no. 3101724, doi: 10.1109/TQE.2020.3030314.
[4] Arute, F., Arya, K., Babbush, R. et al. Quantum supremacy using a programmable superconducting processor. Nature 574, 505–510 (2019). doi.org/10.1038/s41586-019-1666-5
[5] Madsen, L.S., Laudenbach, F., Askarani, M.F. et al. Quantum computational advantage with a programmable photonic processor. Nature 606, 75–81 (2022). doi.org/10.1038/s41586-022-04725-x
[6] N. David Mermin, “Could Feynman Have Said This?,” Physics Today 57 (5), 10 (2004); doi: 10.1063/1.1768652
[7] A. Harrow, A. Hassidim, and S. Lloyd, “Quantum algorithm for linear systems of equations,” Phys. Rev. Lett. 103, 150502 (2009).
[8] Yudong Cao, Anmer Daskin, Steven Frankel, and Sabre Kais, “Quantum circuit design for solving linear systems of equations,” Molecular Physics 110, 15–16 (2011).
[9] R. Chandra, S.C. Eisenstat, and M.H. Schultz, “Conjugate Gradient Methods for Partial Differential Equations,” in the Proceedings of the AICA International Symposium on Computer Methods for Partial Differential Equations, Bethlehem, Pennsylvania, June 1975.
[10] S. Dmitry et al., “The Potential of Quantum Computing and Machine Learning to Advance Clinical Research and Change the Practice of Medicine.” Missouri medicine 115 (5), 463–467 (2018).
[11] Bojia Duan, Jiabin Yuan, Chao-Hua Yu, Jianbang Huang, and Chang-Yu Hsieh, “A survey on HHL algorithm: From theory to application in quantum machine learning,” Physics Letters A 384, 126595 (2020).
[12] Shengbin Wang, Zhimin Wang, Wendong Li, Lixin Fan, Zhiqiang Wei, and Yongjian Gu, “Quantum fast Poisson solver: the algorithm and complete and modular circuit design,” Quantum Information Processing 19, Article number: 170 (2020).
[13] H. J. Morrell and H. Y. Wong, “Study of using Quantum Computer to Solve Poisson Equation in Gate Insulators,” 2021 International Conference on Simulation of Semiconductor Processes and Devices (SISPAD), 2021, pp. 69–72, doi: 10.1109/SISPAD54002.2021.9592604.
[14] Schleich, P., 2019. How to solve a linear system of equations using a quantum computer. [Online]. Available: http://www.acom.rwth-aachen.de/_media/3teaching/00projects/schleich.pdf.
[15] HHL Example using Qiskit. [Online]. Available: https://qiskit.org/textbook/ch-applications/hhl_tutorial.html.
[16] Gadi Aleksandrowicz, et al., (2019). Qiskit: An Open-source Framework for Quantum Computing (0.7.2). Zenodo. doi.org/10.5281/zenodo.2562111.
[17] IBM Quantum Site. [Online]. Available: https://quantumcomputing.ibm.com/.
[18] Dominic W. Berry, Graeme Ahokas, Richard Cleve, and Barry C. Sanders, “Efficient quantum algorithms for simulating sparse Hamiltonians,” arXiv:quant-ph/0508139, 2007.
[19] Hiu Yung Wong, Introduction to Quantum Computing: From a Layperson to a Programmer in 30 Steps. Switzerland: Springer Nature, 2022, pp. 170. doi.org/10.1007/978-3-030-98339-0. ISBN-10: 3030983382.
[20] Danial Dervovic, Mark Herbster, Peter Mountney, Simone Severini, Naïri Usher, and Leonard Wossnig, “Quantum linear systems algorithms: a primer,” arXiv:1802.08227v1.
[21] Matlab code and Jupyter Notebook, [Online]. Available: https://github.com/hywong2/HHL_Example.
[22] Vandenbrocque, Adrien. “On Quantum Algorithms for Solving Linear Systems of Equations,” Master’s semester Project I, 2019. [Online]. Available: https://adrienvdb.com/projects-and-reports/.
[23] National Strategic Overview for Quantum Information Science Report (National Science and Technology Council, 2018). [Online]. Available: https://www.quantum.gov/wp-content/uploads/2020/10/2018_NSTC_National_Strategic_Overview_QIS.pdf.
[24] Aaronson, S. Read the fine print. Nature Phys 11, 291–293 (2015). [Online]. Available: https://doi.org/10.1038/nphys3272

\EOD