Development of a Novel Quantum Pre-processing Filter to Improve Image Classification Accuracy of Neural Network Models
Farina Riaz, Shahab Abdulla, Hajime Suzuki, Srinjoy Ganguly, Ravinesh C. Deo, and Susan Hopkins
Abstract
This paper proposes a novel quantum pre-processing filter (QPF) to improve the image classification accuracy of neural network (NN) models. A simple four-qubit quantum circuit that uses Y rotation gates for encoding and two controlled NOT gates for creating correlation among the qubits is applied as a feature extraction filter prior to passing data into a fully connected NN architecture. By applying the QPF approach, the image classification accuracy on the MNIST (handwritten 10-digit) and EMNIST (handwritten 47-class digits and letters) datasets improves from 92.5% to 95.4% and from 68.9% to 75.9%, respectively. These improvements were obtained without introducing extra model parameters or optimizations into the machine learning process. However, tests of the developed QPF approach on the relatively complex GTSRB dataset, with 43 distinct classes of real-life traffic sign images, showed a degradation in classification accuracy. Considering this result, further research into the understanding and design of quantum circuits better suited to image classification neural networks could build on the baseline method proposed in this paper.
1 Introduction
The application of quantum computing to machine learning tasks, herein referred to as quantum machine learning, has attracted much research attention in recent years. Literature surveys on quantum machine learning can be found in [1, 10, 4]. Among the many proposals to combine classical machine learning methods with quantum computing, the quanvolutional neural network (QNN) proposed by Henderson et al. [6] has the advantage of being implementable on quantum circuits with a small number of qubits and shallow gate depths, while still being applicable to practical problems. QNN utilizes quantum circuits as transformation layers, called quanvolutional layers, to extract features for image classification using convolutional neural networks (CNNs). In [6], the MNIST handwritten 10-digit dataset [8] was applied to a QNN using 9 qubits. The results showed a classification accuracy improvement of QNN over CNN. However, when the quanvolutional layer of the QNN was replaced by a conventional convolutional layer, no improvement in classification accuracy was observed. Henderson et al. later updated the QNN and implemented it on Rigetti Computing's Aspen-7-25Q-B quantum processing unit, which has 25 qubits with 24 programmable two-qubit gate interactions [5]. The proposed method was applied to a 4-class low-resolution satellite image dataset. However, no improvement in classification accuracy of QNN over CNN was observed in [5].
An implementation of QNN on a software quantum computing simulator, PennyLane [2], was provided by Mari [9]. Mari's implementation differs from Henderson's in that the output of the quantum circuit, a set of expectation values, is fed directly into the following neural network layer, whereas Henderson's circuit output was reduced to a single scalar value by a classical method. The method was applied to the MNIST dataset using 50 training and 30 test images. No clear improvement in classification accuracy of QNN over NN was observed in [9].
Inspired by Henderson's and Mari's QNNs, we propose a new method, herein called the quantum pre-processing filter (QPF), that shows clear improvements in image classification accuracy over conventional neural networks. QPF uses a quantum circuit with four qubits, four Y rotations, two controlled NOTs (CNOTs), and four measurements. When QPF is applied as a pre-processing unit of an image classification neural network, i.e. as a feature extraction filter, the image classification accuracy of a fully connected neural network on MNIST and on EMNIST (handwritten 47-class digits and letters, [3]) improves from 92.5% to 95.4% and from 68.9% to 75.9%, respectively. These improvements were obtained without introducing any extra parameters to optimize in the machine learning process. Unlike other quantum machine learning methods, QPF does not require optimization of parameters inside the quantum circuits and hence requires only limited use of the quantum circuit. Given the small number of qubits and the relatively shallow depth of the quantum circuit, QPF is well suited to implementation on noisy intermediate-scale quantum computers. While the proposed method is promising, a test on a more complicated dataset, GTSRB (43-class real-life traffic sign images, [11]), showed a degradation in classification accuracy under QPF. This prompts further research into the understanding and design of suitable quantum circuits for image classification neural networks. To support the validation of our claims and further research, we have made our source code available at https://github.com/hajimesuzuki999/qpf.
This paper is organized as follows: The new QPF unit combined with the classical image classification neural network is proposed in Section 2. Section 3 describes the experiment conducted using software simulation. The results and discussions are presented in Section 4, followed by conclusions in Section 5.
2 Methodology
Figure 1 shows the architecture of the proposed model. The method assumes that the input image is a two-dimensional matrix of size n-by-n whose pixel values x satisfy 0 ≤ x ≤ 1. An extension to multi-channel pixel images is considered straightforward. As in QNN models, a section of size m-by-m is extracted from the input image. While m = 3 is used in the case of QNN, the proposed QPF uses m = 2. This section of the input image is referred to as the QPF window. An extension of QPF to m > 2 is left for further study.

Given m = 2, we use a four-qubit quantum circuit as shown in Figure 1. The four qubits are initialized in the ground state. The four pixel values x_i are then encoded using Y rotations with angles θ_i according to (1).

θ_i = π x_i,  i = 1, 2, 3, 4.  (1)
The outputs of the Y rotation gates are fed to the quantum circuit block shown in Figure 1, and measurements are performed on its output. Three different quantum circuits are examined in this paper. The first, referred to as Encoding only and shown in Figure 2, performs the measurements straight after the encoding. The second, referred to as One CNOT and shown in Figure 3, performs a controlled NOT operation with the first qubit as the control and the fourth qubit as the target. The third, referred to as Two CNOTs, performs two controlled NOT operations as shown in Figure 4.
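As an illustration, the Two CNOTs circuit is small enough to simulate classically with a state-vector computation. The sketch below is a minimal NumPy implementation written for this paper's four-qubit case; the function and variable names are ours, not from the released code, and it assumes the encoding angle θ_i = π x_i and the CNOT(0, 3), CNOT(1, 2) arrangement of Figure 4.

```python
import numpy as np

def ry(theta):
    """Single-qubit Y-rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# Controlled NOT in the (control, target) basis ordering
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def apply(state, gate, wires):
    """Apply a k-qubit gate to the given wires of a 4-qubit state tensor."""
    k = len(wires)
    g = gate.reshape([2] * (2 * k))
    out = np.tensordot(g, state, axes=(list(range(k, 2 * k)), list(wires)))
    return np.moveaxis(out, list(range(k)), list(wires))

def qpf_two_cnots(pixels):
    """Map four pixel values in [0, 1] to four Pauli-Z expectation values."""
    theta = np.pi * np.asarray(pixels, dtype=float)  # encoding of eq. (1)
    state = np.zeros([2] * 4)
    state[0, 0, 0, 0] = 1.0  # ground state |0000>
    for q in range(4):
        state = apply(state, ry(theta[q]), [q])
    state = apply(state, CNOT, [0, 3])  # control qubit 0, target qubit 3
    state = apply(state, CNOT, [1, 2])  # control qubit 1, target qubit 2
    probs = np.abs(state) ** 2
    return [float(probs.take(0, axis=q).sum() - probs.take(1, axis=q).sum())
            for q in range(4)]
```

For a product input state, each CNOT leaves its control qubit's expectation at cos(πx) and multiplies its target qubit's expectation by the control's cos(πx), which is how correlation between window pixels enters the extracted features.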



The outputs of the measurement operations are expectation values between -1 and 1, and form the output features. We note that the total number of elements in the input image (n × n) is the same as the total number of elements in the output features. The output features are made into a one-dimensional vector by the flatten layer, so the flatten layer output has n × n nodes. These nodes are fully connected by fully connected layer 1. The output of fully connected layer 2 has a number of nodes equal to the number of classes.
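Concretely, the mapping from input image to output features can be sketched as follows. This is our own illustrative code: it assumes the 2-by-2 QPF windows are extracted without overlap (stride 2), consistent with the equal input and output element counts, and the pixel ordering within a window is also an assumption. The Encoding only circuit is used here, for which the expectation value of each qubit reduces to cos(πx).

```python
import numpy as np

def qpf_features(img, circuit):
    """Apply a four-pixel QPF circuit to each non-overlapping 2x2 window.

    img: n-by-n array of pixel values in [0, 1] (n even).
    circuit: maps four pixel values to four expectation values.
    Returns an (n/2, n/2, 4) array with as many elements as the input.
    """
    n = img.shape[0]
    out = np.empty((n // 2, n // 2, 4))
    for i in range(0, n, 2):
        for j in range(0, n, 2):
            window = [img[i, j], img[i, j + 1],
                      img[i + 1, j], img[i + 1, j + 1]]
            out[i // 2, j // 2] = circuit(window)
    return out

# 'Encoding only' circuit: Ry(pi * x) applied to |0> gives <Z> = cos(pi * x)
def encoding_only(window):
    return np.cos(np.pi * np.asarray(window))
```

The resulting (n/2) × (n/2) × 4 feature array is then flattened and fed to the fully connected layers.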
3 Experiment
As has been done in many prior studies, we first apply the proposed method to the MNIST dataset [8] to obtain benchmark results. The MNIST dataset consists of 60,000 training and 10,000 test images of handwritten digits 0 to 9. Each image is 28 by 28 pixels. The original images are greyscale with values between 0 and 255, which are scaled by dividing by 255. We then chose the EMNIST dataset [3] to extend the number of image classes. The EMNIST dataset (Balanced and By_Merge [3]) contains 112,800 training and 18,800 test images of handwritten digits and letters making up 47 classes. Note that some upper- and lower-case letters are merged in this dataset due to their similarity (e.g. C and c). The original EMNIST pixel values are likewise divided by 255 to create a dataset with values between 0 and 1.
The GTSRB dataset [11] consists of 34,799 training and 12,630 test images of 43 classes of traffic signs, captured from actual in-use traffic signs under various conditions. The original images vary in size and were rescaled to a common fixed size. The original images are in RGB color and were converted to greyscale values between 0 and 255. Unlike for the MNIST and EMNIST datasets, in order to normalize the dynamic range of each image, normalization is applied to each image according to the following formula:
x̃_{i,j} = (x_{i,j} - x_min) / (x_max - x_min)  (2)

where x_{i,j} and x̃_{i,j} represent the original and normalized pixel values at position (i, j), and x_max and x_min denote the maximum and minimum elements of the two-dimensional image matrix X, respectively.
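Equation (2) is a standard per-image min-max normalization; a minimal sketch (our own code):

```python
import numpy as np

def normalize(img):
    """Per-image min-max normalization as in equation (2)."""
    x_min, x_max = img.min(), img.max()
    return (img - x_min) / (x_max - x_min)  # maps each image onto [0, 1]
```

A constant image would make the denominator zero; handling that edge case is omitted here.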
The experiment was performed mainly in MATLAB. PennyLane was used to build the quantum circuits, which were exported to MATLAB as unitary matrices. The Adam optimizer [7] was used for training the network, and the default mini-batch size of 128 was used for all three datasets.
Figure 5 shows example input images from MNIST, EMNIST, and GTSRB, together with the corresponding output features using encoding only and using two CNOTs (CNOT(0, 3) and CNOT(1, 2)).

4 Results and Discussion
Figure 6 shows the variation of test set accuracy as a function of training iterations on the MNIST dataset. As can be seen in Figure 6, the test set accuracy using QPF Encoding only converges faster than that of the NN. This faster convergence was also observed in Henderson's QNN [6]. However, the converged test set accuracy of the QPF Encoding only model does not improve on that of the NN. Applying QPF One CNOT improves the test set accuracy from 92.5% to 93.7%, and QPF Two CNOTs further improves it to 95.4%.

Similarly, a small but clearly faster convergence is observed with QPF Encoding only on the EMNIST dataset, as shown in Figure 7. In this case, however, the converged test set accuracy of QPF Encoding only is lower than that of the NN. Nonetheless, the test set accuracy improves from 68.9% to 72.0% with QPF One CNOT, and to 75.9% with QPF Two CNOTs.

Figure 8 shows the variation of test set accuracy on the GTSRB dataset. Unlike for the MNIST and EMNIST datasets, the converged test set accuracy with QPF applied is lower than that of the NN. The exact causes of this degradation are currently unknown to the authors and remain for further research. The test set accuracy results are summarized in Table 1.

Table 1: Summary of test set accuracy.

| | MNIST | EMNIST | GTSRB |
|---|---|---|---|
| NN | 92.5% | 68.9% | 81.4% |
| Encoding only | 92.4% | 68.1% | 77.6% |
| One CNOT | 93.7% | 72.0% | 77.6% |
| Two CNOTs | 95.4% | 75.9% | 77.0% |
Referring back to Figure 3, there are 12 different ways to place a single CNOT on four qubits. To investigate whether a different CNOT arrangement makes a difference in classification accuracy, the network training and image classification were performed on the MNIST dataset using all 12 CNOT arrangements. Figure 9 shows the results. The horizontal axis of Figure 9 denotes the CNOT arrangement, where the lower number refers to the control qubit and the upper number refers to the target qubit. For example, "0 3" refers to the arrangement shown in Figure 3. As can be seen from Figure 9, the variation of test set accuracy across the different CNOT arrangements is small, less than 0.6%.

Similarly, there are 24 different ways to arrange the two CNOTs on four qubits. The test set accuracy was derived for all 24 two-CNOT arrangements using the MNIST dataset, and the results are shown in Figure 10. The axis labels in Figure 10 give the arrangement as the control and target qubits of the first CNOT followed by the control and target qubits of the second CNOT. For example, "0 3 1 2" refers to the arrangement shown in Figure 4. As evident in Figure 10, the arrangements pairing the 1st and 4th qubits and pairing the 2nd and 3rd qubits seem to achieve higher test accuracy, irrespective of which qubit in each pair is assigned as the control and which as the target. Referring back to Figure 1, these pairings correspond to pairing the diagonal elements of the 2 x 2 QPF window. The exact reason why correlating the diagonal elements of the QPF window improves classification accuracy on the MNIST and EMNIST datasets is currently not known to the authors and is left for further research.
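The arrangement counts quoted above can be verified combinatorially. The sketch below is our own, and assumes, as the labels "0 3" and "0 3 1 2" suggest, that a CNOT is an ordered (control, target) pair and that the second CNOT acts on the two qubits not used by the first.

```python
from itertools import permutations

# 12 single-CNOT arrangements: ordered (control, target) pairs of four qubits
single = list(permutations(range(4), 2))

# 24 two-CNOT arrangements: the second CNOT uses the two remaining qubits
double = [(c1, t1, c2, t2)
          for (c1, t1) in single
          for (c2, t2) in permutations(sorted(set(range(4)) - {c1, t1}), 2)]
```

Here len(single) is 12 and len(double) is 24, matching the counts in the text, and the "0 3 1 2" arrangement of Figure 4 appears among the 24.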

5 Conclusion
A novel QPF that improves the image classification accuracy of NNs on the MNIST and EMNIST datasets has been proposed. While clear improvements are observed, their exact causes are currently unknown and remain for future research. Further investigation is also needed to improve the classification accuracy of NNs on complex images such as those in the GTSRB dataset. Proposed future research directions include increasing the number of qubits beyond four, as well as applying QPFs to other image classification methods such as convolutional NNs.
6 Acknowledgment
This research has been supported by the Australian Government Research Training Program and the Commonwealth Scientific and Industrial Research Organisation (CSIRO).
References
- [1] M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini. Parameterized quantum circuits as machine learning models. Quantum Science and Technology, 4(4), November 2019.
- [2] V. Bergholm, J. Izaac, M. Schuld, C. Gogolin, M. S. Alam, S. Ahmed, J. M. Arrazola, C. Blank, A. Delgado, S. Jahangiri, K. McKiernan, J. J. Meyer, Z. Niu, A. Szava, and N. Killoran. Pennylane: Automatic differentiation of hybrid quantum-classical computations. arXiv:1811.04968v4, July 2022.
- [3] G. Cohen, S. Afshar, J. Tapson, and A. van Schaik. EMNIST: an extension of MNIST to handwritten letters. In International Joint Conference on Neural Networks, pages 2921–2926, May 2017.
- [4] D. P. Garcia, J. Cruz-Benito, and F. J. Garcia-Penalvo. Systematic literature review: Quantum machine learning and its applications. arXiv:2201.04093, January 2022.
- [5] M. Henderson, J. Gallina, and M. Brett. Methods for accelerating geospatial data processing using quantum computers. Quantum Machine Intelligence, 4(4), January 2021.
- [6] M. Henderson, S. Shakya, S. Pradhan, and T. Cook. Quanvolutional neural networks: powering image recognition with quantum circuits. Quantum Machine Intelligence, 2(2), February 2020.
- [7] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv:1412.6980v9, January 2017.
- [8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, November 1998.
- [9] A. Mari. Quanvolutional neural network. Available online: https://pennylane.ai/qml/demos/tutorial_quanvolution.html, January 2021.
- [10] W. O’Quinn and S. Mao. Quantum machine learning: Recent advances and outlook. IEEE Wireless Communications, 27(3):126–131, June 2020.
- [11] J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel. The German traffic sign recognition benchmark: A multi-class classification competition. In International Joint Conference on Neural Networks, pages 1453–1460, July 2011.