QUANTUM MACHINE LEARNING APPLIED TO THE CLASSIFICATION OF DIABETES

Hancco-Quispe Juan Kenyhy
Faculty of Statistic and Computer Engineering,
Universidad Nacional del Altiplano de Puno, P.O. Box 291
Puno - Peru.
Email: [email protected]

Borda-Colque Jordan Piero
Faculty of Statistic and Computer Engineering,
Universidad Nacional del Altiplano de Puno, P.O. Box 291
Puno - Peru.
Email: [email protected]

Torres-Cruz Fred
Faculty of Statistic and Computer Engineering,
Universidad Nacional del Altiplano de Puno, P.O. Box 291
Puno - Peru.
Email: [email protected]

Abstract

Quantum Machine Learning (QML) offers certain significant advantages over classical machine learning methods. Hybrid quantum-classical methods have great scope for deployment and optimisation, and hold promise for future industries. As a weakness, current quantum computers do not have enough qubits to realise this potential. This line of study gives encouraging results for the improvement of quantum encoding. Since data preprocessing is an important point in this research, we employ two dimensionality reduction techniques, LDA and PCA, and apply them in a hybrid way with the Quantum Support Vector Classifier (QSVC) and the Variational Quantum Classifier (VQC) for the classification of diabetes.

Index Terms:

Quantum machine learning; Quantum Support Vector Classifier; Classical encoding; Dimensionality reduction; Variational Quantum Classifier

I Introduction

Machine learning is well suited to problems with well-defined outcomes. Image recognition, finding patterns in incomplete data and understanding clear, unambiguous language are all tasks that AI can perform[1]. It is also commonly used to detect anomalies in financial transactions, make predictions based on patterns in past data (such as the stock market) and identify spam messages and mark them as such[2].

Under this premise, and with the advance of quantum computers, a new field of study and research called Quantum Machine Learning (QML) is beginning to emerge. The goal of quantum technologies is to demonstrate their potential in comparison to classical machine learning, but they also show weaknesses, such as the limited number of qubits and the limited number of consecutive logic gate operations [3].

While some companies have explored the use of quantum machine learning in their research projects, it is still quite rare to find companies that use quantum machine learning widely in their business operations. Companies that have mentioned the use of quantum machine learning or have conducted research in this field include:

Google: The technology company has conducted research into the use of quantum machine learning in tasks such as route optimisation and financial data analysis.

IBM: The technology company has developed a quantum machine learning platform called IBM Q, which is used to investigate how quantum machine learning can improve the performance of tasks such as image classification and natural language processing.

Microsoft: The technology company has conducted research into using quantum machine learning to improve energy efficiency in data centres and to optimise power distribution in power grids.

D-Wave: This quantum technology company has developed a quantum machine learning platform called D-Wave Leap, which is used to investigate how quantum machine learning can improve decision-making in various industries.

In this paper, we apply dimensionality reduction to the data set and show that Linear Discriminant Analysis (LDA)[4], as a supervised preprocessing step, has a significant advantage over Principal Component Analysis (PCA)[5].

We aim to compare classical classification methods with quantum ones: the Quantum Support Vector Classifier (QSVC)[6] and the Variational Quantum Classifier (VQC). Qiskit Machine Learning was used to construct the fundamental computational building blocks, such as Quantum Kernels and Quantum Neural Networks, which we use for the classification of diabetes[7].

II DATASETS

The purpose of the data selection in this study is to categorize patients with diabetes. We utilize a CSV file to store the retrieved Kaggle dataset.

Var	Type	Scale	Description
1	Input	Quantitative, discrete	Pregnancies
2	Input	Quantitative, continuous	Glucose
3	Input	Quantitative, continuous	Blood Pressure
4	Input	Quantitative, continuous	Skin Thickness
5	Input	Quantitative, continuous	Insulin
6	Input	Quantitative, continuous	BMI
7	Input	Quantitative, continuous	Diabetes Pedigree Function
8	Input	Quantitative, discrete	Age
9	Output	Dichotomous	Outcome ("Yes" or "No")

Table 1. Description of the variables of the Pima data set.

Pima Indians Diabetes Database

https://www.kaggle.com/datasets/nancyalaswad90/review

The aim of the data set is to identify the presence or absence of diabetes in a patient based on the diagnostic parameters it contains. Several constraints were applied when selecting these instances from a larger database; as a consequence, all patients in the data set are Pima women of at least 21 years of age.

III METHODS APPLIED

III-A Dimensionality reduction

Dimensionality reduction is a data processing technique used to reduce the number of features in a data set. Principal component analysis (PCA) is one such technique; it is based on the idea of finding a set of principal components that explain most of the variability in the data[8].

The goal is to transform a large dataset (one with a large number of features) into a compact representation that retains the important information about the data (an orthogonal projection). Reducing the dimension implies some loss of accuracy. PCA performs an eigenvalue decomposition of the covariance matrix of the data. The components are linear combinations of the original features that form new, uncorrelated features. The first component contains the most information, the second contains most of what remains, and so on. Geometrically, the components represent the directions of maximum variance (a rotation of the axes)[9].

Linear discriminant analysis (LDA) is similar to PCA: both are linear transformations that reduce the dimensionality of the data set through an eigenvalue decomposition. Where PCA finds the axes that maximise the variance, LDA finds the axes that maximise the class separation. LDA creates a k-dimensional subspace from the n-dimensional space of the original data, where $k\leq n-1$. The subspace is calculated taking the labels into account to maximise class separation.
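As an illustration of the eigenvalue-based procedure described above, the following minimal sketch computes the first principal component of a small data set. It is not the implementation used in this study: the toy data are hypothetical, and power iteration is swapped in as a simple stand-in for a full eigenvalue decomposition of the covariance matrix.

```python
import math

def center(rows):
    """Subtract the column means so every feature has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance_matrix(centered):
    """Sample covariance matrix of already centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]

def first_component(cov, iterations=200):
    """Power iteration: approaches the eigenvector with the largest eigenvalue,
    provided the starting vector is not orthogonal to it."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
    return v, eigenvalue

# Hypothetical toy data: two strongly correlated features.
data = [[2.0, 1.9], [0.0, 0.1], [1.0, 1.1], [3.0, 2.9], [4.0, 4.0]]
centered = center(data)
cov = covariance_matrix(centered)
pc1, variance_pc1 = first_component(cov)

# Project every sample onto the first component (one feature instead of two).
scores = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]
explained_ratio = variance_pc1 / sum(cov[j][j] for j in range(len(cov)))
print(pc1, explained_ratio)
```

Because the two toy features are almost perfectly correlated, nearly all of the variance lies along the first component, which is the situation in which discarding the remaining components costs little information.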
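The label-aware projection can be sketched for the simplest case of two classes and two features, where the discriminant direction has the closed form $w=S_{W}^{-1}(\mu_{1}-\mu_{0})$, with $S_{W}$ the within-class scatter matrix. The restriction to two features, the helper names and the toy data are assumptions made only for this illustration.

```python
def mean_vector(rows):
    """Mean of a list of 2-feature samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mu):
    """2x2 scatter matrix of one class around its mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def fisher_direction(class0, class1):
    """Direction w = Sw^-1 (mu1 - mu0) that maximises class separation."""
    mu0, mu1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

# Hypothetical toy data for two classes.
class0 = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.5], [2.5, 2.0]]
class1 = [[4.0, 1.0], [5.0, 2.5], [6.0, 2.0], [5.5, 0.5]]

w = fisher_direction(class0, class1)
proj0 = [r[0] * w[0] + r[1] * w[1] for r in class0]
proj1 = [r[0] * w[0] + r[1] * w[1] for r in class1]
print(w, proj0, proj1)
```

With two classes, the projection yields a single discriminant component, so each sample is reduced to one number on which the two classes are separated.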

III-B METRICS

The metrics used to evaluate the classifiers, and their corresponding expressions, are listed below, where FN, TN, FP and TP are the numbers of false negatives, true negatives, false positives and true positives, respectively.

Metrics	Equation
Precision	$\frac{TP}{TP+FP}$
Recall	$\frac{TP}{TP+FN}$
F1-Score	$\frac{2\times(Precision\times Recall)}{Precision+Recall}$
Balanced Accuracy	$\left(\frac{TP}{TP+FN}+\frac{TN}{TN+FP}\right)/2$

Table 2. Evaluation metrics.
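The four metrics can be computed directly from the confusion counts. The sketch below is a minimal illustration rather than the evaluation code used in this study; the function name and the example labels are hypothetical, and label 1 is assumed to be the positive class.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, F1-score and balanced accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against empty denominators by returning 0.0 in those cases.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    balanced_accuracy = (recall + specificity) / 2
    return {"precision": precision, "recall": recall,
            "f1": f1, "balanced_accuracy": balanced_accuracy}

# Hypothetical labels: 3 TP, 1 FN, 1 FP, 3 TN.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Balanced accuracy averages the recall of the positive class and the recall of the negative class, which makes it less sensitive to class imbalance than plain accuracy.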

III-C Backends

Choosing between real quantum backends and quantum computer simulators is difficult when quick iteration over huge data sets is required. Before entering production, machine learning models typically need a number of modifications and iterations (fine-tuning)[10]. The difficulties are the same for QML, but the hardware ecosystem is distinct. Quantum algorithms can be run on both real devices and simulators (including noisy simulations of quantum computers)[11].

III-C1 Simulators

In this research, only the Qiskit Aer simulator and the default PennyLane qubit simulator were used.

Qiskit Aer is a quantum machine simulator developed by IBM that is used to simulate the behaviour of a quantum machine on a classical computer[12]. Qiskit Aer is an open source tool and can be used to research and develop quantum algorithms and programs without the need to access a real quantum machine.[13]

PennyLane is an open source framework for quantum computation and quantum machine learning. PennyLane can be used to simulate and run quantum algorithms and programs on different quantum backends[14], including quantum machine simulators such as Qiskit Aer and real quantum machines[15].

IV ALGORITHMS

IV-A Machine Learning Models

Since classification is a supervised learning problem, we use several classical methods from this field of machine learning as baselines against which to compare the hybrid quantum-classical approach.

IV-A1 Logistic Regression

This is among the simplest approaches to the binary classification problem. The model is trained to learn the parameters of the following linear equation:

\hat{k}^{i}=\beta_{0}+\beta_{1}x_{1}^{i}+\beta_{2}x_{2}^{i}+\dots+\beta_{n}x_{n}^{i}

(1)

where $\hat{k}^{i}$ is the linear output for sample $i$, $\beta_{n}$ are the regression coefficients and $x_{n}^{i}$ are the features of the sample. Since linear regression alone could not achieve the classification objective, the linear expression of Equation (1) is passed through the logistic function to determine the probability

P(y^{i}=1)=\frac{1}{1+e^{-(\beta_{0}+\beta_{1}x_{1}^{i}+\beta_{2}x_{2}^{i}+\dots+\beta_{n}x_{n}^{i})}}

(2)

$P$ is the probability that the label $y$ for sample $i$ is 1. After the probability of each sample has been calculated, a threshold of 0.5 is applied to obtain the binary output. If $P(y^{i}=1)<0.5$, the assigned label is 0; if $P(y^{i}=1)\geq 0.5$, the label is 1. A sufficient number of samples is required for the logistic regression method to estimate the parameters $\beta_{n}$ successfully.
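The prediction step of Equations (1) and (2), followed by the 0.5 threshold, can be sketched as follows. The coefficient values are hypothetical and chosen only for illustration; in practice they are estimated from the training data.

```python
import math

def predict_proba(beta, x):
    """Equation (2): logistic function applied to the linear output of Equation (1).

    beta[0] is the intercept; beta[1:] are the coefficients of the features x.
    """
    linear_output = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-linear_output))

def predict_label(beta, x, threshold=0.5):
    """Assign label 1 when the probability reaches the threshold, otherwise 0."""
    return 1 if predict_proba(beta, x) >= threshold else 0

# Hypothetical coefficients for a model with two features.
beta = [-1.0, 0.8, 0.5]

for sample in ([2.0, 1.0], [0.0, 0.0]):
    print(sample, predict_proba(beta, sample), predict_label(beta, sample))
```

A linear output of exactly zero corresponds to a probability of 0.5, so the threshold is equivalent to checking the sign of the linear expression in Equation (1).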

IV-A2 Classification and Regression Trees

A decision tree, also known as a classification and regression tree (CART), is a binary tree model in which the decision made at each node determines which child node is visited next. The root of the tree serves as the starting point from which two branches emerge, divided into "yes" and "no" categories. The tree structure is built up through a series of decisions until a final choice, called a leaf, is reached. Decision trees are powerful algorithms that can be successfully fitted to complex data sets. Although the approach is simple, it is prone to overfitting. The learning process uses entropy (Equation (3)) or the Gini criterion.

H_{i}=-\sum_{k=1}^{K}P_{i,k}\log_{2}P_{i,k}

(3)

where $i$ denotes the $i^{th}$ node, $P_{i,k}$ is the probability of category $k$ at that node and $K$ is the number of categories.
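Both splitting criteria can be computed from the labels that reach a node. The sketch below is a minimal illustration: the entropy follows Equation (3), while the Gini impurity uses its standard definition $1-\sum_{k}P_{i,k}^{2}$, which is not written out in the text and is stated here as an assumption. The function names are hypothetical.

```python
import math
from collections import Counter

def class_probabilities(labels):
    """Fraction of samples at the node that belong to each category."""
    counts = Counter(labels)
    n = len(labels)
    return [count / n for count in counts.values()]

def entropy(labels):
    """Entropy of a node in bits; categories with zero probability never appear."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))

def gini(labels):
    """Gini impurity of a node (standard definition, assumed here)."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))

mixed_node = [0, 0, 1, 1]   # evenly split between two categories
pure_node = [1, 1, 1]       # a single category only

print(entropy(mixed_node), gini(mixed_node))
print(entropy(pure_node), gini(pure_node))
```

Both measures are zero for a pure node and largest for an evenly mixed node, so the tree prefers the splits that reduce them the most.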

IV-A3 Naive Bayes

The Naive Bayes algorithm, sometimes known as NB, is a simple classifier based on Bayes' theorem (Eq. 4):

P(A|B)=\frac{P(B|A)\,P(A)}{P(B)}

(4)
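As a numeric illustration of Bayes' theorem, the sketch below computes a posterior probability for a binary event. All probability values are hypothetical, and $P(B)$ is obtained with the law of total probability, an extra step that the text does not spell out.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """P(A|B) from Bayes' theorem, with P(B) expanded by total probability."""
    evidence_b = (likelihood_b_given_a * prior_a
                  + likelihood_b_given_not_a * (1.0 - prior_a))
    return likelihood_b_given_a * prior_a / evidence_b

# Hypothetical values: A = "patient has diabetes", B = "test feature is high".
p_a = 0.35              # P(A)
p_b_given_a = 0.80      # P(B|A)
p_b_given_not_a = 0.20  # P(B|not A)

p_a_given_b = posterior(p_a, p_b_given_a, p_b_given_not_a)
print(p_a_given_b)
```

Observing B raises the probability of A from the prior of 0.35 because B is four times more likely when A is true than when it is not.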

In Equation (4), A and B are events, $P(A|B)$ represents the probability that A occurs given that B is true, $P(B|A)$ represents the probability that B occurs given that A is true, and $P(A)$ and $P(B)$ are the marginal probabilities of events A and B, respectively. The Naive Bayes classifier assumes that the features are conditionally independent given the class. This assumption greatly simplifies the mathematics and makes the problem easy to solve.

IV-A4 k-Nearest Neighbors

The simplest non-parametric, distance-based algorithm is called k-nearest neighbour (k-NN). Each sample is treated as a point in n-dimensional space, and a new point is located in that space using a distance measure such as the Euclidean distance. The algorithm then determines the class of the new point by looking at the k closest points and assigning the majority class among them[16].
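The k-NN procedure of Section IV-A4 can be sketched in a few lines. This is a minimal illustration with hypothetical training points, using the Euclidean distance and a majority vote among the k closest samples.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest neighbours."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data: two well-separated groups.
train_x = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [6.0, 6.0], [7.0, 6.0], [6.0, 7.0]]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_x, train_y, [1.5, 1.5]))
print(knn_predict(train_x, train_y, [6.5, 6.5]))
```

An odd value of k avoids tied votes in a two-class problem such as the diabetes classification considered here.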

IV-B Quantum Machine Learning Models

Quantum machine learning is a subfield of machine learning that uses quantum computers to perform machine learning tasks. Quantum machine learning algorithms may help with problems that are difficult to solve using classical machine learning algorithms. These problems include finding patterns in large datasets, optimizing complex functions, and classifying data.

There are several approaches to quantum machine learning, including:

Quantum support vector machines: These are quantum versions of support vector machines, which are a type of maximum-margin classifier.

Quantum principal component analysis: This is a quantum version of the principal component analysis algorithm, which is used to reduce the dimensionality of data.

IV-B1 Quantum Kernel

A quantum kernel is a function that is used to define the similarity between two quantum states in a quantum machine learning algorithm. The quantum kernel is a generalization of the classical kernel function, which is commonly used in kernel methods, a type of machine learning algorithm that operates by mapping data into a high-dimensional feature space.

In quantum machine learning, the quantum kernel is used to define the similarity between two quantum states in the feature space. This allows the quantum machine learning algorithm to compare and classify quantum states based on their similarity.
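A common way to define such a similarity is the fidelity $k(x,x')=|\langle\phi(x')|\phi(x)\rangle|^{2}$ between the states that encode two data points. The sketch below evaluates this kernel classically under a simplifying assumption: each feature is encoded by a single-qubit $R_{y}$ rotation, for which the overlap factorises into $\prod_{i}\cos^{2}((x_{i}-x'_{i})/2)$. This is not the ZZFeatureMap kernel used with QSVC in this study, and the function names and sample values are hypothetical.

```python
import math

def angle_kernel(x, x_prime):
    """Fidelity between two product states built from single-qubit Ry rotations.

    For one qubit, <0|Ry(a)^dagger Ry(b)|0> = cos((a - b) / 2), so the squared
    overlap of the full product state is a product of squared cosines.
    """
    value = 1.0
    for a, b in zip(x, x_prime):
        value *= math.cos((a - b) / 2.0) ** 2
    return value

def gram_matrix(samples):
    """Kernel matrix that a kernel-based classifier would be trained on."""
    return [[angle_kernel(u, v) for v in samples] for u in samples]

# Hypothetical samples with two features each (one qubit per feature).
samples = [[0.1, 0.4], [0.2, 0.5], [2.9, 1.7]]
gram = gram_matrix(samples)
for row in gram:
    print(row)
```

The diagonal entries equal 1 because every state has full overlap with itself, and nearby samples produce values close to 1 while distant samples produce values close to 0.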

IV-B2 Quantum Encoding

The process of mapping classical data to a quantum representation is known as quantum encoding. There are several ways to process classical data and provide useful representations. In this study, we employ a quantum feature map (Qiskit ZZFeatureMap) and angle encoding (PennyLane) for QSVC and VQC, respectively.

IV-B3 Angle Encoding

The process of encoding classical information through rotations is called angle encoding. Classical information is represented by the rotation angles of the corresponding gates and can be written as:

|x\rangle=\bigotimes_{i=1}^{n}R(x_{i})|0\rangle

where R is a rotation gate such as Rx, Ry, or Rz. Angle encoding is used when the length of the feature vector x is equal to the number of qubits.
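To make angle encoding concrete, the sketch below builds the state vector obtained when each feature is used as the angle of an $R_{y}$ rotation on its own qubit, starting from $|0\rangle$. It is a classical simulation for illustration only, not the PennyLane circuit used in this study; the choice of $R_{y}$, the qubit ordering (first feature on the leftmost qubit) and the function names are assumptions.

```python
import math

def ry_on_zero(theta):
    """Amplitudes of Ry(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>."""
    return [math.cos(theta / 2.0), math.sin(theta / 2.0)]

def angle_encode(features):
    """Tensor product of one rotated qubit per feature (2**n amplitudes)."""
    state = [1.0]
    for theta in features:
        qubit = ry_on_zero(theta)
        state = [a * b for a in state for b in qubit]
    return state

# Hypothetical feature vector with two entries, so two qubits are needed.
state = angle_encode([0.3, 1.2])
probabilities = [amplitude ** 2 for amplitude in state]
print(state)
print(probabilities)
```

The number of amplitudes doubles with each feature while the number of qubits grows only linearly, which is why the feature vector is first reduced to a few dimensions before it is encoded.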

IV-B4 WorkFlow

We present here the workflow used in this study to compare the selected algorithms (classical and quantum). The algorithms are applied to a representation of the data created using dimensionality reduction techniques. The workflow consists of four steps:

1 Apply an Exploratory Data Analysis after gathering the data. To prepare the data for the dimensionality reduction method, the data must be cleaned, normalized and put into a suitable format.

2 Dimensionality reduction: LDA and PCA are used to reduce the number of features. The PCA method uses two components. On a separate copy of the data set, LDA is used, and all features are reduced to the LDA components.

3 Quantum encoding: Using quantum feature maps, the classical data are encoded into a quantum representation. Only the quantum algorithms use this step.

4 Models used: The selected algorithms (ML and QML) are applied to the encoded data (quantum or classical) and evaluated using the same metrics (see Table 2 for details).
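The normalization mentioned in step 1 can be sketched as a column-wise standardization to zero mean and unit variance. The text does not state which normalization was applied, so the z-score used here is an assumption, and the sample values are illustrative rather than results from this study.

```python
import math

def standardize(rows):
    """Rescale every column to zero mean and unit (population) standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(d)]
    # Constant columns carry no information and are mapped to 0.0.
    return [[(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0
             for j in range(d)] for r in rows]

# Illustrative rows with two features, e.g. glucose and BMI.
raw = [[148.0, 33.6], [85.0, 26.6], [183.0, 23.3], [89.0, 28.1]]
scaled = standardize(raw)
for row in scaled:
    print(row)
```

Standardizing first keeps features with large numeric ranges from dominating the variance that PCA maximises in step 2.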

V RESULTS

The results of using the classical classification models, such as LR, CART, KNN, and NB, are shown in this section along with a comparison to the results obtained using the quantum machine learning models, VQC and QSVC, which were used to classify diabetes.

Tab. 3: Classical models (LR, KNN, CART, NB) and quantum models (QSVC, VQC) applied to the entire diabetes data set using PCA dimensionality reduction.

Tab. 4: Classical models (LR, KNN, CART, NB) and quantum models (QSVC, VQC) applied to the entire diabetes data set using LDA dimensionality reduction.

Figure 1. Comparison of QSVC and VQC metrics for diabetes classification using PCA and LDA.

Figure 2. Comparison of the metrics of QSVC and VQC with those of CART and KNN (the best classical algorithms) when LDA is used to classify diabetes.

VI DISCUSSION AND CONCLUSIONS

We show how a quantum computer can use classification results obtained with only a subset of measurements to extract more useful information from classical data. A competitive advantage is not necessarily measured by the ability to outperform standard machine learning models; it can also lie in being a better method of information extraction. Some of the potential benefits of quantum machine learning include:

Faster speed: Quantum machine learning may perform some computations faster than classical machine learning. For example, one study reports that a quantum machine learning algorithm can classify data faster than a classical machine learning algorithm[17].

Higher accuracy: Quantum machine learning may produce more accurate results than classical machine learning. For example, one study reports that a quantum machine learning algorithm can classify data more accurately than a classical machine learning algorithm[18].

Higher learning capacity: Quantum machine learning may learn faster and more accurately than classical machine learning. For example, one study reports that a quantum machine learning algorithm can learn faster and more accurately than a classical machine learning algorithm[18].

Increased processing power: Quantum machine learning may process a larger amount of data at the same time. For example, one study reports that a quantum machine learning algorithm can process large amounts of data faster than a classical machine learning algorithm [19].

For supervised quantum machine learning tasks, LDA shows the most promising results. Why LDA provides a better data representation than PCA for qubit encoding has not been explored in this paper, but will be considered in future work.

Comparisons between LDA and PCA:

Approach: LDA focuses on maximising the separation between different classes of data while PCA focuses on maximising the variance of the data. According to one study, "LDA is a supervised method that is used to reduce the dimension of the data while maintaining as much class information as possible. On the other hand, PCA is an unsupervised method that is used to reduce the dimension of the data while maintaining as much variance as possible"[20].

Number of components: LDA produces a number of components equal to the number of classes minus one, while PCA produces a number of components equal to the number of variables minus one. According to one study, "LDA always produces the number of components equal to the number of classes minus one, while PCA produces the number of components equal to the number of variables minus one. Therefore, LDA is more suitable for the analysis of data with a small number of classes, while PCA is more suitable for the analysis of data with a large number of variables"[20].

Data requirements: LDA requires data to be normally distributed and classes to have equal covariance matrices, whereas PCA does not have these requirements. According to one study, "LDA requires data to be normally distributed and classes to have equal covariance matrices. On the other hand, PCA does not have these requirements. Therefore, LDA is more sensitive to the violation of these assumptions than PCA" [20].

References

[1] Andreas Maier, Christopher Syben, Tobias Lasser, and Christian Riess. A gentle introduction to deep learning in medical image processing. Zeitschrift für Medizinische Physik, 29(2):86–101, 2019.
[2] Andrey Rudenko, Luigi Palmieri, Michael Herman, Kris M Kitani, Dariu M Gavrila, and Kai O Arras. Human motion trajectory prediction: A survey. The International Journal of Robotics Research, 39(8):895–935, 2020.
[3] Yexiong Zeng, Zheng-Yang Zhou, Enrico Rinaldi, Clemens Gneiting, and Franco Nori. Approximate autonomous quantum error correction with reinforcement learning, 2022.
[4] Jocelyn T. Chi and Deanna Needell. Sketched gaussian model linear discriminant analysis via the randomized kaczmarz method, 2022.
[5] Ishtiaque Ahmed Showmik, Tahsina Farah Sanam, and Hafiz Imtiaz. Human activity recognition from wi-fi csi data using principal component based wavelet cnn, 2022.
[6] Aaron Baughman, Kavitha Yogaraj, Raja Hebbar, Sudeep Ghosh, Rukhsan Ul Haq, and Yoshika Chhabra. Study of feature importance for quantum machine learning models, 2022.
[7] Andrew Blance and Michael Spannowsky. Quantum machine learning for particle physics using a variational quantum classifier. Journal of High Energy Physics, 2021(2):1–20, 2021.
[8] Etienne Becht, Leland McInnes, John Healy, Charles-Antoine Dutertre, Immanuel WH Kwok, Lai Guan Ng, Florent Ginhoux, and Evan W Newell. Dimensionality reduction for visualizing single-cell data using umap. Nature biotechnology, 37(1):38–44, 2019.
[9] Yiduo Guo, Wenpeng Hu, Dongyan Zhao, and Bing Liu. Adaptive orthogonal projection for batch and online continual learning. Proceedings of AAAI-2022, 2, 2022.
[10] Marcelo S. Zanetti, Douglas F. Pinto, Marcos L. W. Basso, and Jonas Maziero. Simulating noisy quantum channels via quantum state preparation algorithms, 2022.
[11] Ieva Cepaite. Simulation of networked quantum computing on encrypted data, 2022.
[12] Dimitrios Giannakis, Abbas Ourmazd, Philipp Pfeffer, Joerg Schumacher, and Joanna Slawinska. Embedding classical dynamics in a quantum computer, 2020.
[13] Sami Khairy, Ruslan Shaydulin, Lukasz Cincio, Yuri Alexeev, and Prasanna Balaprakash. Reinforcement-learning-based variational quantum circuits optimization for combinatorial problems. 2019.
[14] Francesco Di Marcantonio, Massimiliano Incudini, Davide Tezza, and Michele Grossi. Quask – quantum advantage seeker with kernels, 2022.
[15] Olivia Di Matteo, Josh Izaac, Tom Bromley, Anthony Hayes, Christina Lee, Maria Schuld, Antal Száva, Chase Roberts, and Nathan Killoran.
[16] Utkarsh Azad and Helena Zhang. Machine learning based discrimination for excited state promoted readout, 2022.
[17] Maria Schuld, Ilya Sinayskiy, and Francesco Petruccione. The quest for a quantum neural network. Quantum Information Processing, 13(11):2567–2586, 2014.
[18] Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd. Quantum machine learning. Nature, 549(7671):195–202, 2017.
[19] Nathan Wiebe, Ashish Kapoor, and Krysta Svore. Quantum algorithms for nearest-neighbor methods for supervised and unsupervised learning. 2014.
[20] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002.