W.-J. Liu · P.-P. Gao · W.-B. Yu · Z.-G. Qu: School of Computer and Software, Nanjing University of Information Science & Technology, Nanjing 210044, P.R. China. E-mail: [email protected]
C.-N. Yang: Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien 974, Taiwan
Quantum Relief Algorithm
Abstract
The Relief algorithm is a feature selection algorithm for binary classification proposed by Kira and Rendell, and its computational complexity increases remarkably with both the number of samples and the number of features. In order to reduce this complexity, a quantum feature selection algorithm based on the Relief algorithm, also called the quantum Relief algorithm, is proposed. In the algorithm, all features of each sample are superposed in a certain quantum state through the CMP and rotation operations, then the swap test and measurement are applied to this state to get the similarity between two samples. After that, Near-hit and Near-miss are obtained by finding the maximal similarity, and are further applied to update the feature weight vector $W$ to get $W_T$, which determines the relevant features with the threshold $\tau$. In order to verify our algorithm, a simulation experiment based on IBM Q with a simple example is performed. Efficiency analysis shows that the computational complexity of our proposed algorithm is $O(M)$, while the complexity of the original Relief algorithm is $O(NM)$, where $M$ is the number of features for each sample, and $N$ is the size of the sample set. Obviously, our quantum Relief algorithm achieves a significant speedup over the classical one.
Keywords:
quantum Relief algorithm · feature selection · CMP operation · rotation operation · swap test · IBM Q

1 Introduction
Machine learning refers to an area of computer science in which patterns are derived (“learned”) from data with the goal to make sense of previously unknown inputs. As part of both artificial intelligence and statistics, machine learning algorithms process large amounts of information for tasks that come naturally to the human brain, such as image recognition, pattern identification and strategy optimization.
Machine learning tasks are typically classified into two broad categories [1], supervised and unsupervised machine learning, depending on whether a learning "signal" or "feedback" is available to the learning system. In supervised machine learning, the learner is provided a set of training examples with features presented in the form of high-dimensional vectors, together with corresponding labels that mark their categories. In unsupervised machine learning, on the contrary, no labels are given, and the learning algorithm is left on its own to find structure in the data. The core mathematical task for both supervised and unsupervised machine learning algorithms is to evaluate distances and inner products between high-dimensional vectors, which requires time proportional to the size of the vectors on classical computers. As we know, computing distances in high-dimensional settings suffers from the "curse of dimensionality" [2]. One possible solution is dimension reduction [3], and another is feature selection [4, 5].
The Relief algorithm is one of the most representative feature selection algorithms, first proposed by Kira and Rendell [6]. The algorithm devises a "relevant statistic" to weigh the importance of each feature, and it has been widely applied in many fields, such as hand gesture recognition [7], electricity price forecasting [8], and power system transient stability assessment [9]. However, the Relief algorithm occupies an increasingly large amount of computational resources as the numbers of samples and features grow, which restricts its application.
Since the concept of the quantum computer was proposed by the famous physicist Feynman [10], a number of remarkable results have been obtained. For example, Deutsch's algorithms [11, 12] embody the superiority of quantum parallel computation, Shor's algorithm [13] solves the problem of integer factorization in polynomial time, and Grover's algorithm [14] provides a quadratic speedup for searching an unstructured space. With the properties of superposition and entanglement, quantum computation has a potential advantage in dealing with high-dimensional vectors, which attracts researchers to apply quantum mechanics to classical machine learning tasks, such as quantum pattern matching [15], quantum probably approximately correct learning [16], feedback learning for quantum measurement [17], quantum binary classifiers [18, 19], and quantum support vector machines [20].
Even in quantum machine learning, we still face the "curse of dimensionality", so dimension reduction or feature selection is a necessary preliminary step before training on high-dimensional samples. Inspired by the idea of computing the inner product between two vectors [21, 22], we propose a Relief-based quantum parallel algorithm, named the quantum Relief algorithm, to perform feature selection effectively.
The outline of this paper is as follows. The classical Relief algorithm is briefly reviewed in Sect. 2, the proposed quantum Relief algorithm is described in detail in Sect. 3, and a simulation experiment based on IBM Q with a simple example is given in Sect. 4. Subsequently, the efficiency of the algorithm is analyzed in Sect. 5, and a brief conclusion and the outlook of our work are given in the last section.
2 Review of Relief algorithm
The Relief algorithm is a feature selection algorithm used in binary classification (generalizable to polynomial classification by decomposition into a number of binary problems), proposed by Kira and Rendell [6] in 1992. It is efficient in estimating features according to how well their values distinguish among samples that are near each other.
We can divide an $N$-sample set into two vector sets $U = \{u_1, u_2, \ldots, u_p\}$ and $V = \{v_1, v_2, \ldots, v_q\}$ ($p + q = N$), where each $u_j$, $v_k$ is an $M$-feature sample, i.e., $u_j = (u_j^{(1)}, u_j^{(2)}, \ldots, u_j^{(M)})$ and $v_k = (v_k^{(1)}, v_k^{(2)}, \ldots, v_k^{(M)})$, and the weight vector $W = (W_1, W_2, \ldots, W_M)$ of features is initialized to all zeros. Suppose the upper limit of iterations is $T$, and the relevance threshold (which differentiates the relevant and irrelevant features) is $\tau$. The process of the Relief algorithm is described in Algorithm 1.
At each iteration, pick a random sample $x$, and then select the sample closest to $x$ (by $M$-dimensional Euclidean distance) from each class. The closest same-class sample is called Near-hit, and the closest different-class sample is called Near-miss. Update the weight vector $W$ such that
$W_i = W_i - \operatorname{diff}\big(x^{(i)}, \text{Near-hit}^{(i)}\big)^2 + \operatorname{diff}\big(x^{(i)}, \text{Near-miss}^{(i)}\big)^2, \quad i = 1, 2, \ldots, M$   (1)
where the function $\operatorname{diff}(a, b)$ is defined as below,
$\operatorname{diff}(a, b) = \begin{cases} |a - b|, & \text{if the feature is continuous} \\ 0, & \text{if the feature is discrete and } a = b \\ 1, & \text{if the feature is discrete and } a \neq b \end{cases}$   (2)
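To make the loop concrete, here is a minimal Python sketch of the procedure of Algorithm 1 as described above; the helper names and the uniform random choice of class are our assumptions, not the authors' original implementation:

```python
import numpy as np

def diff(a, b):
    """Eq. (2) for discrete (binary) features: 0 where equal, 1 where different."""
    return (np.asarray(a) != np.asarray(b)).astype(float)

def relief(U, V, T, rng=None):
    """Classical Relief (Algorithm 1) for two classes U, V of M-feature samples."""
    rng = rng if rng is not None else np.random.default_rng()
    U, V = np.asarray(U, float), np.asarray(V, float)
    W = np.zeros(U.shape[1])                     # weight vector, initialized to zeros
    for _ in range(T):
        # pick a random sample x; "same" is its own class, "other" the opposite one
        same, other = (U, V) if rng.random() < 0.5 else (V, U)
        r = rng.integers(len(same))
        x, rest = same[r], np.delete(same, r, axis=0)
        # closest samples by M-dimensional Euclidean distance
        near_hit = rest[np.argmin(np.linalg.norm(rest - x, axis=1))]
        near_miss = other[np.argmin(np.linalg.norm(other - x, axis=1))]
        # Eq. (1): penalize features differing from Near-hit,
        # reward features differing from Near-miss
        W += -diff(x, near_hit) ** 2 + diff(x, near_miss) ** 2
    return W
```

For the four samples of Sect. 4, `relief([[1,0,1,0],[1,0,0,0]], [[0,1,1,0],[0,1,0,0]], T=4)` accumulates weights of the same form as those in Tab. 4.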
As noted above, Relief generalizes to polynomial classification by decomposition into a number of binary problems. However, as the scale of the samples and the number of features increase, its efficiency drastically declines.
3 Quantum Relief Algorithm
In order to handle large numbers of samples and features, we propose a quantum Relief algorithm. Suppose the sample sets $U$, $V$, the weight vector $W$, the upper limit of iterations $T$, and the threshold $\tau$ are the same as in the classical Relief algorithm defined in Sect. 2. Different from the classical case, all the features of each sample are represented in a quantum superposition state, and the similarity between two samples can be calculated in parallel. Algorithm 2 shows the detailed procedure of the quantum Relief algorithm.
3.1 State preparation
At the beginning of the algorithm, the classical information is converted into quantum superposition states. The quantum state sets $\{|u_j\rangle\}_{j=1}^{p}$ and $\{|v_k\rangle\}_{k=1}^{q}$, which store all the feature values of $U$ and $V$, are prepared as below,
$|u_j\rangle = \frac{1}{\sqrt{M}}\sum_{i=1}^{M}|i\rangle\Big(\cos\frac{\pi u_j^{(i)}}{2}|0\rangle + \sin\frac{\pi u_j^{(i)}}{2}|1\rangle\Big), \qquad |v_k\rangle = \frac{1}{\sqrt{M}}\sum_{i=1}^{M}|i\rangle\Big(\cos\frac{\pi v_k^{(i)}}{2}|0\rangle + \sin\frac{\pi v_k^{(i)}}{2}|1\rangle\Big)$   (3)
where $u_j^{(i)}$ and $v_k^{(i)}$ represent the $i$-th feature value of the vectors $u_j$ and $v_k$, respectively. Suppose the initial state is $|0\rangle^{\otimes n}|0\rangle$ (with $n = \lceil \log_2 M \rceil$) and we want to prepare the state $|u_j\rangle$; its construction process consists of the following three steps.
First of all, the H (i.e., Hadamard gate) and CMP operations are performed on $|0\rangle^{\otimes n}|0\rangle$ to obtain the state $|\phi_1\rangle$,
$|\phi_1\rangle = \frac{1}{\sqrt{M}}\sum_{i=1}^{M}|i\rangle|0\rangle$   (4)
Fig. 1 depicts the detailed circuit of these operations, where the CMP operation is the key component used to tailor the uniform superposition into the target state $|\phi_1\rangle$, and its definition is
$\mathrm{CMP}\,|a\rangle|b\rangle|0\rangle = |a\rangle|b\rangle|c\rangle, \qquad c = \begin{cases} 0, & a \le b \\ 1, & a > b \end{cases}$   (5)
After the CMP operation, the components whose index is greater than $M$ (i.e., whose last qubit is $|1\rangle$) are clipped off.
[Fig. 1: quantum circuit of the H and CMP operations]
[Fig. 2]
And then, a unitary rotation operation
$R(\theta_i) = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}, \qquad \theta_i = \frac{\pi u_j^{(i)}}{2}$   (6)
controlled by the index register $|i\rangle$, is performed on the last qubit to obtain $|u_j\rangle$,
$|u_j\rangle = \frac{1}{\sqrt{M}}\sum_{i=1}^{M}|i\rangle\Big(\cos\frac{\pi u_j^{(i)}}{2}|0\rangle + \sin\frac{\pi u_j^{(i)}}{2}|1\rangle\Big)$   (7)
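For concreteness, the statevector of Eq. (7) can be assembled directly in a few lines of Python; this is a minimal numerical sketch assuming the cos/sin amplitude convention of Eq. (3), and `encode()` is our illustrative helper, not part of the algorithm's circuit:

```python
import numpy as np

def encode(sample):
    """Statevector of Eq. (7): (1/sqrt(M)) sum_i |i>(cos(pi x_i/2)|0> + sin(pi x_i/2)|1>)."""
    M = len(sample)
    n = int(np.ceil(np.log2(M)))               # index qubits; indices >= M stay clipped (CMP)
    amp = np.zeros(2 ** (n + 1))
    for i, x in enumerate(sample):
        amp[2 * i] = np.cos(np.pi * x / 2) / np.sqrt(M)       # |i>|0> component
        amp[2 * i + 1] = np.sin(np.pi * x / 2) / np.sqrt(M)   # |i>|1> component
    return amp

# e.g. encode([1, 0, 1, 0]) has amplitude 1/2 on |00>|1>, |01>|0>, |10>|1>, |11>|0>
```

For binary features this reduces to basis encoding of each feature value on the last qubit.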
3.2 Similarity calculation
The similarity is a coefficient that describes how close two samples are, and it is inversely related to the Euclidean distance. After the state preparation, the main work of this phase is to calculate the similarity between $|u_r\rangle$ and the other states in the state sets $\{|u_j\rangle\}$ and $\{|v_k\rangle\}$, where $|u_r\rangle$ is a state selected randomly from $U$ or $V$. For simplicity, suppose $|u_r\rangle$ is the $r$-th state from $U$,
$|u_r\rangle = \frac{1}{\sqrt{M}}\sum_{i=1}^{M}|i\rangle\Big(\cos\frac{\pi u_r^{(i)}}{2}|0\rangle + \sin\frac{\pi u_r^{(i)}}{2}|1\rangle\Big)$   (8)
which corresponds to the chosen sample $u_r$ from the sample set $U$ in the classical scenario. The detailed process is as follows.
First, a swap operation is performed on the last two qubits of $|v_l\rangle$ to obtain a new state,
$|\tilde{v}_l\rangle = \big(I^{\otimes (n-1)} \otimes \mathrm{SWAP}\big)\,|v_l\rangle$   (9)
[Fig. 3: quantum circuit of the swap test]

Then, the swap test is applied: an ancilla qubit prepared in $|0\rangle$ is acted on by an H gate, a controlled swap between $|u_r\rangle$ and $|\tilde{v}_l\rangle$ is performed, and a second H gate is applied to the ancilla, yielding

$\frac{1}{2}|0\rangle\big(|u_r\rangle|\tilde{v}_l\rangle + |\tilde{v}_l\rangle|u_r\rangle\big) + \frac{1}{2}|1\rangle\big(|u_r\rangle|\tilde{v}_l\rangle - |\tilde{v}_l\rangle|u_r\rangle\big)$   (10)

If we measure the ancilla (first) qubit in the computational basis, the probability of the measurement result being $|0\rangle$ is
$P(|0\rangle) = \frac{1}{2} + \frac{1}{2}\big|\langle u_r|\tilde{v}_l\rangle\big|^2$   (11)
As we know, the inner product between $|u_r\rangle$ and the prepared state $|\tilde{v}_l\rangle$ can therefore be calculated as below,
$\big|\langle u_r|\tilde{v}_l\rangle\big|^2 = 2P(|0\rangle) - 1$   (12)
Based on Eqs. (11) and (12), the similarity between the two samples is defined as

$\mathrm{Sim}(u_r, v_l) = M^2\big(1 - 2P(|0\rangle)\big)$   (13)

Finally, we can find the max-similarity sample through a classical maximum-traversal search over the set $\{\mathrm{Sim}(u_r, u_j) \mid j \neq r\}$.
Through the same method, we can also find the max-similarity sample over the set $\{\mathrm{Sim}(u_r, v_k)\}$. When all these steps are finished, the algorithm turns to the next phase.
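Putting Eqs. (9)-(13) together, the ideal (noise-free) similarity can be computed classically as a check. The sketch below reuses the `encode()` helper from Sect. 3.1 and assumes our interpretation that the SWAP of Eq. (9) exchanges the register's last two qubits (the value qubit and the lowest index qubit):

```python
import numpy as np

def swap_last_two_qubits(amp):
    """Eq. (9): exchange the last two qubits of a statevector."""
    n_qubits = int(np.log2(len(amp)))
    psi = amp.reshape([2] * n_qubits)          # one axis per qubit
    return np.swapaxes(psi, n_qubits - 2, n_qubits - 1).reshape(-1)

def similarity(amp_u, amp_v):
    """Eqs. (11)-(13): swap-test probability and Sim = M^2 (1 - 2 P(|0>))."""
    M = len(amp_u) // 2                        # features per sample (one value qubit)
    v_tilde = swap_last_two_qubits(amp_v)
    p0 = 0.5 + 0.5 * abs(np.dot(amp_u, v_tilde)) ** 2   # Eq. (11), real amplitudes
    return M ** 2 * (1 - 2 * p0)               # Eq. (13)
```

For example, `similarity(encode([1, 0, 1, 0]), encode([1, 0, 0, 0]))` gives the ideal value of $\mathrm{Sim}(u_1, u_2)$; on hardware, $P(|0\rangle)$ is instead estimated from noisy measurement statistics, which is why the values in Tab. 2 deviate from the ideal ones.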
3.3 Weight vector update
The first step is to determine the closest same-class sample, Near-hit, and the closest different-class sample, Near-miss, which obey the following rule,
$\text{Near-hit} = \arg\max_{u_j \in U,\, j \neq r} \mathrm{Sim}(u_r, u_j), \qquad \text{Near-miss} = \arg\max_{v_k \in V} \mathrm{Sim}(u_r, v_k)$   (14)
After determining Near-hit and Near-miss, we update every element of the weight vector $W$ with them,
$W_i = W_i - \operatorname{diff}\big(u_r^{(i)}, \text{Near-hit}^{(i)}\big)^2 + \operatorname{diff}\big(u_r^{(i)}, \text{Near-miss}^{(i)}\big)^2$   (15)
where $i = 1, 2, \ldots, M$.
3.4 Feature selection
After the similarity calculation and weight vector update have iterated $T$ times, the algorithm jumps out of the loop with the final vector $W_T$ as the output. The remainder of the algorithm selects the "real" relevant features.
We first divide $W_T$ by $T$ to obtain the mean weight vector $\overline{W}$,
$\overline{W} = \frac{W_T}{T} = \Big(\frac{W_1}{T}, \frac{W_2}{T}, \ldots, \frac{W_M}{T}\Big)$   (16)
Then, we select the relevant features according to the preset threshold $\tau$. To be specific, a feature is selected if its corresponding value in $\overline{W}$ is greater than $\tau$, and discarded in the opposite case,
$f_i \text{ is } \begin{cases} \text{selected}, & \overline{W}_i > \tau \\ \text{discarded}, & \overline{W}_i \le \tau \end{cases}, \qquad i = 1, 2, \ldots, M$   (17)
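The final selection step is straightforward; a minimal sketch implementing Eqs. (16)-(17) on top of the accumulated weight vector (the function name is ours, and the threshold value in the usage line is an illustrative assumption, since the example in Sect. 4 does not state $\tau$ explicitly):

```python
import numpy as np

def select_features(W_T, T, tau):
    """Eqs. (16)-(17): average the accumulated weights and keep features above tau."""
    W_mean = np.asarray(W_T, float) / T                    # Eq. (16)
    return [i for i, w in enumerate(W_mean) if w > tau]    # Eq. (17), 0-based indices

# e.g. select_features([4, 4, -2, 0], T=4, tau=0.5) -> [0, 1]
# (tau = 0.5 is an illustrative choice; any tau in (0, 1) selects the same features here)
```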
4 Example and its experiment
Suppose there are four samples (see Tab. 1), $u_1 = (1, 0, 1, 0)$, $u_2 = (1, 0, 0, 0)$, $v_1 = (0, 1, 1, 0)$, $v_2 = (0, 1, 0, 0)$; thus the $n$ in Eq. (4) is 2. They belong to two classes, $U = \{u_1, u_2\}$ and $V = \{v_1, v_2\}$, as illustrated in Fig. 4.
Sample | Feature 1 | Feature 2 | Feature 3 | Feature 4
---|---|---|---|---
$u_1$ | 1 | 0 | 1 | 0
$u_2$ | 1 | 0 | 0 | 0
$v_1$ | 0 | 1 | 1 | 0
$v_2$ | 0 | 1 | 0 | 0
[Fig. 4: the four samples of the two classes $U$ and $V$]
First, the four initial quantum states are prepared as follows,
$|\phi_1\rangle = H^{\otimes 2}|00\rangle \otimes |0\rangle|0\rangle = \frac{1}{2}\big(|00\rangle + |01\rangle + |10\rangle + |11\rangle\big)|0\rangle|0\rangle$   (18)
Taking $u_1$ as an example, the CMP operation is applied on the third and fourth qubits; since $M = 2^n = 4$, no component is clipped off (the CMP ancilla remains $|0\rangle$ and is omitted hereafter),
$|\phi_2\rangle = \frac{1}{2}\big(|00\rangle + |01\rangle + |10\rangle + |11\rangle\big)|0\rangle$   (19)
Then we perform the rotation (see Eq. (6)) on the last qubit, where
$\theta_1 = \frac{\pi}{2}, \quad \theta_2 = 0, \quad \theta_3 = \frac{\pi}{2}, \quad \theta_4 = 0$   (20)
and we can get
$|u_1\rangle = \frac{1}{2}\big(|00\rangle|1\rangle + |01\rangle|0\rangle + |10\rangle|1\rangle + |11\rangle|0\rangle\big)$   (21)
The other quantum states are prepared in the same way and are listed below,
$|u_2\rangle = \frac{1}{2}\big(|00\rangle|1\rangle + |01\rangle|0\rangle + |10\rangle|0\rangle + |11\rangle|0\rangle\big)$
$|v_1\rangle = \frac{1}{2}\big(|00\rangle|0\rangle + |01\rangle|1\rangle + |10\rangle|1\rangle + |11\rangle|0\rangle\big)$
$|v_2\rangle = \frac{1}{2}\big(|00\rangle|0\rangle + |01\rangle|1\rangle + |10\rangle|0\rangle + |11\rangle|0\rangle\big)$   (22)
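These four states can be cross-checked numerically. The following sketch uses the binary-feature shortcut of Eq. (3) (helper names are ours) and prints exactly the four components of Eqs. (21) and (22) for each sample:

```python
import numpy as np

def encode(sample):                            # binary-feature case of Eq. (3)
    M = len(sample)
    amp = np.zeros(2 * M)
    for i, x in enumerate(sample):
        amp[2 * i + int(x)] = 1 / np.sqrt(M)  # |i>|x_i> with amplitude 1/sqrt(M)
    return amp

samples = {"u1": [1, 0, 1, 0], "u2": [1, 0, 0, 0],
           "v1": [0, 1, 1, 0], "v2": [0, 1, 0, 0]}
for name, features in samples.items():
    amp = encode(features)
    terms = [f"|{k >> 1:02b}>|{k & 1}>" for k, a in enumerate(amp) if a > 1e-9]
    print(name, "=", " + ".join(terms))        # four terms, amplitude 1/2 each
```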
Second, we randomly select a sample (assume $u_1$ is the one), and perform the similarity calculation with the other samples (i.e., $u_2$, $v_1$, $v_2$). Taking $u_1$ and $u_2$ as an example, we perform a swap operation between the last two qubits of $|u_2\rangle$,
$|\tilde{u}_2\rangle = \frac{1}{2}\big(|01\rangle|0\rangle + |00\rangle|1\rangle + |10\rangle|0\rangle + |10\rangle|1\rangle\big)$   (23)
After that, the swap test operation is applied to $(|u_1\rangle, |\tilde{u}_2\rangle)$,
$\frac{1}{2}|0\rangle\big(|u_1\rangle|\tilde{u}_2\rangle + |\tilde{u}_2\rangle|u_1\rangle\big) + \frac{1}{2}|1\rangle\big(|u_1\rangle|\tilde{u}_2\rangle - |\tilde{u}_2\rangle|u_1\rangle\big)$   (24)
Then, we measure the state shown in Eq. (24), and the probability of the first qubit being $|0\rangle$ is
$P(|0\rangle) = \frac{1}{2} + \frac{1}{2}\big|\langle u_1|\tilde{u}_2\rangle\big|^2$   (25)
In terms of Eq. (12), the inner product between $|u_1\rangle$ and $|\tilde{u}_2\rangle$ can be represented as
$\big|\langle u_1|\tilde{u}_2\rangle\big|^2 = 2P(|0\rangle) - 1$   (26)
From Eqs. (25) and (26), the similarity between $u_1$ and $u_2$ is
$\mathrm{Sim}(u_1, u_2) = M^2\big(1 - 2P(|0\rangle)\big) = 16\big(1 - 2P(|0\rangle)\big)$   (27)
where the value of $P(|0\rangle)$ is determined by measurement.
In order to obtain the measurement results and also verify our algorithm, we choose IBM Q [24] to perform the quantum processing (Fig. 5 gives the schematic diagram of our experimental circuit). In the experiment, we program our algorithm with the QISKit toolkit [25] and the Python language, and remotely connect to the online IBM QX5 device to execute the quantum processing. After the experiment, we get the values of $P(|0\rangle)$ shown in Tab. 2.
[Fig. 5: schematic diagram of the experimental circuit on IBM QX5]
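As a rough guide to reproducing the measurement, below is a swap-test circuit sketch in present-day Qiskit, run on a local Aer simulator (the IBM QX5 device has since been retired). The register layout, the `initialize`-based state preparation standing in for the H/CMP/rotation steps of Sect. 3.1, and the helper names are our assumptions:

```python
import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator

def encode(sample):                            # binary-feature case of Eq. (3)
    M = len(sample)
    amp = np.zeros(2 * M)
    for i, x in enumerate(sample):
        amp[2 * i + int(x)] = 1 / np.sqrt(M)
    return amp

def swap_test_p0(amp_a, amp_b, shots=8192):
    """Estimate P(|0>) of Eq. (25) for two 3-qubit registers via the swap test."""
    m = int(np.log2(len(amp_a)))               # qubits per register (3 for M = 4)
    qc = QuantumCircuit(2 * m + 1, 1)
    qc.initialize(amp_a, range(1, m + 1))          # register A: |u_1>
    qc.initialize(amp_b, range(m + 1, 2 * m + 1))  # register B: |u_2>
    qc.swap(m + 1, m + 2)                      # Eq. (9): swap the last two qubits of B
    qc.h(0)                                    # swap-test ancilla
    for k in range(m):
        qc.cswap(0, 1 + k, m + 1 + k)          # controlled swap of the two registers
    qc.h(0)
    qc.measure(0, 0)
    backend = AerSimulator()
    counts = backend.run(transpile(qc, backend), shots=shots).result().get_counts()
    return counts.get("0", 0) / shots

p0 = swap_test_p0(encode([1, 0, 1, 0]), encode([1, 0, 0, 0]))
print("Sim(u1, u2) =", 16 * (1 - 2 * p0))      # Eq. (27)
```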
Iteration ($t$) | $P(|0\rangle)$ for $(u_1, u_2)$ | $P(|0\rangle)$ for $(u_1, v_1)$ | $P(|0\rangle)$ for $(u_1, v_2)$
---|---|---|---
1 | 0.49023438 | 0.49902344 | 0.49121094
2 | 0.50097656 | 0.52246094 | 0.53417969
3 | 0.50683594 | 0.50878906 | 0.49218750
4 | 0.49804688 | 0.49218750 | 0.50195312
According to $P(|0\rangle)$ (from Tab. 2) and Eq. (27), we can calculate the similarity between $u_1$ and $u_2$; for the first iteration,
$\mathrm{Sim}(u_1, u_2) = 16 \times (1 - 2 \times 0.49023438) = 0.3125$   (28)
In the same way, the other two similarities $\mathrm{Sim}(u_1, v_1)$ and $\mathrm{Sim}(u_1, v_2)$ can also be obtained (all values are listed in Tab. 3).
Iteration ($t$) | $\mathrm{Sim}(u_1, u_2)$ | $\mathrm{Sim}(u_1, v_1)$ | $\mathrm{Sim}(u_1, v_2)$
---|---|---|---
1 | 0.3125 | 0.03125 | 0.28125
2 | -0.03125 | -0.71875 | -1.09375
3 | -0.21875 | -0.28125 | 0.25
4 | 0.0625 | 0.25 | -0.0625
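As a sanity check, every entry of Tab. 3 follows from Tab. 2 by applying Eq. (27) to each measured probability; a few lines of Python (the data layout is ours) reproduce the table:

```python
# measured P(|0>) from Tab. 2, columns: (u1,u2), (u1,v1), (u1,v2)
p0 = [
    [0.49023438, 0.49902344, 0.49121094],
    [0.50097656, 0.52246094, 0.53417969],
    [0.50683594, 0.50878906, 0.49218750],
    [0.49804688, 0.49218750, 0.50195312],
]
for t, row in enumerate(p0, start=1):
    sims = [16 * (1 - 2 * p) for p in row]     # Eq. (27) with M = 4
    print(f"t={t}:", [round(s, 5) for s in sims])
# t=1: [0.3125, 0.03125, 0.28125]  ->  Near-hit is u2, Near-miss is v2
```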
Third, from Tab. 3 it is easy to find that, in the first iteration, Near-hit is $u_2$ and Near-miss is $v_2$ (as shown in Fig. 6). Then, the weight vector $W$ is updated by applying Eq. (15), and the result for the first iteration is listed in the $t = 1$ row of Tab. 4.
[Fig. 6: Near-hit and Near-miss of the sample $u_1$]
Iteration ($t$) | $W$
---|---
1 | [1 1 0 0]
2 | [2 2 -1 0]
3 | [3 3 -1 0]
4 | [4 4 -2 0]
The algorithm iterates the above steps $T$ times (in our example, $T = 4$) and obtains all the results shown in Tab. 4. After the $T$-th iteration, $W_T = (4, 4, -2, 0)$, and then $\overline{W} = W_T / T = (1, 1, -0.5, 0)$. Since only $\overline{W}_1 = \overline{W}_2 = 1$ exceed the threshold $\tau$ (while $-0.5$ and $0$ do not), the selected features are the first and the second ones.
5 Efficiency analysis
In the classical Relief algorithm, it takes $O(N)$ comparisons to calculate the Euclidean distances between the chosen sample $x$ and each of the other samples, each $M$-dimensional distance costs $O(M)$, so finding Near-hit and Near-miss per iteration costs $O(MN)$; the loop iterates $T$ times, so the computational complexity of the classical Relief algorithm is $O(TMN)$. Since $T$ is a constant that affects the accuracy of the relevance levels but is chosen independently of $M$ and $N$, the complexity can be simplified to $O(MN)$. In addition, an $M$-dimensional vector is represented by $O(M)$ bits in a classical computer, and there are $N$ samples (i.e., $M$-dimensional vectors) to be stored in the algorithm, so the classical Relief algorithm consumes $O(MN)$ bits of storage.
In our quantum Relief algorithm, all the features of each sample are superposed in a quantum state $|u_j\rangle$ or $|v_k\rangle$, so the similarity calculation between two states, shown in Eq. (13), requires only a constant number of swap-test evaluations. As in the classical Relief algorithm, the similarity between the selected state $|u_r\rangle$ and each state in $\{|u_j\rangle\}$ and $\{|v_k\rangle\}$ is calculated, taking $O(M)$ time per iteration, to obtain Near-hit and Near-miss, and the loop iterates $T$ times, so the proposed algorithm takes $O(TM)$ time in total. Since $T$ is a constant, the computational complexity can be rewritten as $O(M)$. From the viewpoint of resource consumption, each quantum state in the state sets $\{|u_j\rangle\}$ and $\{|v_k\rangle\}$ has the form of Eq. (3) and consists of $\lceil \log_2 M \rceil + 1$ qubits. Since $|U| = p$, $|V| = q$, and $p + q = N$, there are $N$ such quantum states to be stored in our algorithm, so it consumes $O(N \log M)$ qubits of storage.
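For instance, for the toy example in Sect. 4 ($M = N = 4$), the classical representation occupies $4 \times 4 = 16$ bits, whereas the quantum representation occupies $4 \times (\log_2 4 + 1) = 12$ qubits.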
Tab. 5 illustrates the efficiency comparison, including the computational complexity and the resource consumption, between the classical Relief and our quantum Relief algorithms. As shown in the table, the computational complexity of our algorithm is $O(M)$, which is obviously superior to the $O(MN)$ of the classical Relief algorithm. On the other hand, the resource our algorithm needs to consume is $O(N \log M)$ qubits, while the classical Relief algorithm consumes $O(MN)$ bits.
Algorithm | Complexity | Resource consumption
---|---|---
Relief algorithm | $O(MN)$ | $O(MN)$ bits
Quantum Relief algorithm | $O(M)$ | $O(N \log M)$ qubits
6 Conclusion and Discussion
With quantum computing technologies nearing the era of commercialization and quantum supremacy, machine learning has recently been considered one of their "killer" applications. In this study, we utilize quantum computing technologies to solve a simple feature selection problem (used in binary classification), and propose the quantum Relief algorithm, which consists of four phases: state preparation, similarity calculation, weight vector update, and feature selection. Furthermore, we verify our algorithm by performing quantum experiments on the IBM Q platform. Compared with the classical Relief algorithm, our algorithm has lower computational complexity and less resource consumption (in terms of storage units). To be specific, the complexity is reduced from $O(MN)$ to $O(M)$, and the consumed resource is reduced from $O(MN)$ bits to $O(N \log M)$ qubits.
Although this work focuses on the feature selection problem (indeed, just simple binary classification), the method can be generalized to implement other Relief-like algorithms, such as ReliefF [26] and RReliefF [27]. Besides, we are interested in utilizing quantum technologies to deal with classical high-dimensional massive data processing tasks, such as text classification, video stream processing, data mining, computer vision, information retrieval, and bioinformatics.
Acknowledgements.
The authors would like to thank the anonymous reviewers and the editor for their comments, which improved the quality of this paper. This work is supported by the National Natural Science Foundation of China (Grant Nos. 61502101, 61501247 and 61672290), the Natural Science Foundation of Jiangsu Province, China (Grant No. BK20171458), the Six Talent Peaks Project of Jiangsu Province, China (Grant No. 2015-XXRJ-013), the Natural Science Foundation for Colleges and Universities of Jiangsu Province, China (Grant No. 16KJB520030), the Practice Innovation Training Program Projects for Jiangsu College Students (Grant No. 201810300016Z), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).

References
- (1) Mohri M., Rostamizadeh A., Talwalkar A.: Foundations of machine learning. MIT Press, Cambridge, Massachusetts (2012)
- (2) Bellman, R. E.: Dynamic programming. Princeton University Press, Princeton, NJ (1957)
- (3) Roweis, S. T., Saul, L. K.: Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323-2326 (2000)
- (4) Liu, H., Motoda, H.: Feature selection for knowledge discovery and data mining. Springer US (2012)
- (5) Liu, H., Motoda, H.: Computational methods of feature selection. CRC Press (2007)
- (6) Kira, K., Rendell, L. A.: A practical approach to feature selection. In: Proc. of 9th International Workshop on Machine Learning, pp. 249-256, Morgan Kaufmann Publishers, San Francisco, CA, USA (1992)
- (7) Zhang, J., Lin, H., Zhao, M.: A fast algorithm for hand gesture recognition using Relief. In: Proc. of 6th International Conference on Fuzzy Systems and Knowledge Discovery, vol. 1, pp. 8-12, IEEE Press, Piscataway, NJ, USA (2009)
- (8) Amjady, N., Daraeepour, A.: Day-ahead electricity price forecasting using the relief algorithm and neural networks. In: Proc. of 5th International Conference on the European Electricity, pp. 1-7, IEEE Press, Market, Lisboa, Portugal (2008)
- (9) Dai, Y., Chen, L., Zhang, W., Min, Y.: Multi-support vector machine power system transient stability assessment based on relief algorithm. In: Proc. of 2015 IEEE PES Asia-Pacific Power and Energy Engineering, pp. 1-5. IEEE Press, Brisbane, QLD, Australia (2015)
- (10) Feynman, R. P.: Quantum mechanical computers. Found. Phys. 16(6), 507-531 (1986)
- (11) Deutsch, D., Jozsa, R.: Rapid solution of problems by quantum computation. In: Proc. of Royal Society of London, Series A: Mathematical and Physical Sciences. 439(1907), pp. 553-558, the Royal Society (1992)
- (12) Deutsch, D.: Quantum theory, the church-turing principle and the universal quantum computer. In: Proc. of Royal Society of London, Series A: Mathematical and Physical Sciences. 400(1818), pp. 97-117, the Royal Society (1985)
- (13) Shor, P. W.: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Rev. 41(2), 303-332 (1999)
- (14) Grover, Lov K.: A fast quantum mechanical algorithm for database search. In: Proc. of 28th Annual ACM symposium on Theory of Computing, pp. 212-219, ACM Press, New York, NY, USA (1996)
- (15) Sasaki, M., Carlini, A.: Quantum learning and universal quantum matching machine. Phys. Rev. A. 66(2), 022303 (2002)
- (16) Servedio, R. A., Gortler, S. J.: Equivalences and separations between quantum and classical learnability. SIAM J. Comput. 33(5), 1067-1092 (2004)
- (17) Hentschel, A., Sanders, B. C.: Machine learning for precise quantum measurement. Phys. Rev. Lett. 104(6), 063603 (2010)
- (18) Neven, H., Denchev, V. S., Rose, G., Macready, W. G.: Training a large scale classifier with the quantum adiabatic algorithm. arXiv:quant-ph/0912.0779v1 (2009)
- (19) Pudenz, K. L., Lidar, D. A.: Quantum adiabatic machine learning. Quantum Inf. Process. 12(5), 2027-2070 (2013)
- (20) Anguita, D., Ridella, S., Rivieccio, F., Zunino, R.: Quantum optimization for training support vector machines. Neural Netw. 16(5-6), 763-770 (2003)
- (21) Wiebe, N., Kapoor, A., Svore, K.: Quantum algorithms for nearest-neighbor methods for supervised and unsupervised learning. Quantum Info. Comput. 15(3-4), 316-356 (2015)
- (22) Lloyd, S., Mohseni, M., Rebentrost, P.: Quantum algorithms for supervised and unsupervised machine learning. arXiv:quant-ph/1307.0411 (2013)
- (23) Buhrman, H., Cleve, R., Watrous, J., de Wolf, R.: Quantum fingerprinting. Phys. Rev. Lett. 87(16), 167902 (2001)
- (24) IBM quantum computing platform. https://www.research.ibm.com/ibm-q/. Online accessed 12-April-2018
- (25) QISKit: Open Source Quantum Information Science Kit. https://qiskit.org/. Online accessed 12-April-2018
- (26) Kononenko, I., Simec, E., Robniksikonja, M.: Overcoming the myopia of inductive learning algorithms with RELIEFF. Appl. Intell. 7(1), 39-55 (1997)
- (27) Robnik-Sikonja, M., Kononenko, I.: An adaptation of Relief for attribute estimation in regression. In: Proc. of 14th International conference on Machine Learning, pp. 296-304, Morgan Kaufmann Publishers, Nashville, TN, USA (1997)