ML-Aided Collision Recovery for UHF-RFID Systems

Talha Akyıldız1, Raymond Ku1, Nicholas Harder1, Najme Ebrahimi2, Hessam Mahdavifar1 Email: {akyildiz, rayku, nharder}@umich.edu, [email protected], [email protected] 1 Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA 2 Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32603, USA

Abstract

We propose a collision recovery algorithm with the aid of machine learning (ML-aided) for passive Ultra High Frequency (UHF) Radio Frequency Identification (RFID) systems. The proposed method aims at recovering the tags under collision to improve the system performance. We first estimate the number of tags from the collided signal by utilizing machine learning tools and show that the number of colliding tags can be estimated with high accuracy. Second, we employ a simple yet effective deep learning model to find the experienced channel coefficients. The proposed method allows the reader to separate each tag’s signal from the received one by applying maximum likelihood decoding. We perform simulations to illustrate that the use of deep learning is highly beneficial and demonstrate that the proposed approach boosts the throughput performance of the standard framed slotted ALOHA (FSA) protocol from $0.368$ to $1.837$ , where the receiver is equipped with a single antenna and capable of decoding up to $4$ tags.

I Introduction

Passive Ultra High Frequency (UHF) Radio Frequency Identification (RFID) is a wireless communication system with a wide range of application areas, e.g., logistics and supply chain, item inventory tracking, and materials management, to mention a few. In UHF-RFID systems, communication takes place between the reader and an arbitrary number of passive tags via backscatter modulation. In this work, we focus on passive UHF-RFID systems, where the tags operate only by absorbing energy from the reader through radio waves and therefore have limited operation capability.

In UHF-RFID systems, the reader can communicate with at most one tag at a time, and a collision occurs whenever two or more tags try to communicate with the reader simultaneously. To avoid that, the EPC Gen2 standard employs the framed slotted ALOHA (FSA) protocol to randomize the transmission procedure of passive tags [1]. In FSA, frames are formed by slots, and each tag chooses its slot uniformly over the frame, whose size is determined by the reader and also known by the tags [2]. Even though FSA is employed as an anti-collision protocol, collisions still occur. This motivates collision recovery algorithms, which aim to extract or possibly recover information from the tags under collision.

In [3], the authors study joint decoding of multiple tags by investigating the signal constellation points and parameter estimation methods for low-frequency RFID systems. Another work [4] shows that the number of tags, up to four, can be estimated from the collided signal, and that this information can later be used to recover the signals of colliding tags. The use of fixed beam-forming over different sub-spaces to detect collisions, which can be combined with other anti-collision algorithms, is presented in [5]. In [6], it is shown that an antenna array coupled with blind source separation techniques can reduce the collision probability and also remove the interference signals. Another related work [7] formulates a maximum likelihood estimator for the number of tags in collision in each slot and compares its performance against other well-known estimators.

The authors in [8] first propose different types of single-antenna collision recovery receivers, i.e., zero-forcing and ordered successive cancellation, to improve the performance of FSA by estimating channel coefficients for at most two colliding tags, and then generalize their approach to the multiple-antenna setting in [9]. Another similar work [10] shows that it is possible to recover more than two colliding tags by exploiting additional diversity (increasing the number of receiver antennas) under the assumption of perfect channel knowledge at the reader. Later, they extend their work by also proposing a channel estimation technique using the post-preamble symbols in [11]. In [12], estimation of the number of tags is coupled with a channel estimation technique for collision recovery, and [13] studies code designs to recover tag collisions given the number of colliding tags.

In this work, we propose a new framework that applies deep learning tools to collision recovery in UHF-RFID systems. In our framework, we first estimate the number of tags in collision using a Gaussian mixture model (GMM), which clusters the received signal points. We then replace the GMM with two different neural network architectures, i.e., feed-forward and convolutional. We observe that the deep learning architectures can estimate the number of tags with high accuracy, outperforming the GMM. The estimated number of tags is then used to identify the channel coefficients of the tags with the aid of additional symbols via deep learning. As a last step, we decode the backscattered modulated signals using minimum-distance decoding. The results demonstrate that the proposed framework achieves significantly higher throughput than the conventional schemes, and hence that the use of deep learning tools is highly beneficial for collision recovery in UHF-RFID systems.

The remainder of this paper is organized as follows. In Section II, we present the system model including FSA and the communication channel. Section III describes the machine learning based proposed model for collision recovery. Section IV illustrates our numerical results in comparison with the other approaches. Finally, Section V concludes the paper.

II System Model

In this section, we describe a mathematical framework for UHF-RFID systems. First, we briefly review the properties of FSA and compute its analytical throughput with collision recovery. We then present the communication model between the reader and the passive tags. Finally, we depict the constellation points of the received signal at the reader in the in-phase (I) and quadrature (Q) plane (I/Q plane).

II-A Framed Slotted ALOHA

The EPC Gen2 standard employs the FSA protocol, where frames are formed by $K$ slots and each of the $N$ tags in the population selects its slot uniformly over the frame. In FSA, a slot can be in one of three states: 1) Empty slot: none of the tags transmits, 2) Singleton slot: exactly one tag transmits, 3) Collision slot: two or more tags try to communicate. In conventional UHF-RFID systems, the reader can only identify singleton slots, and collision slots are discarded. Let $\mathcal{X}_{R}$ be a random variable that denotes the number of slots with $R$ colliding tags. Its expected value is $\mathbb{E}\{\mathcal{X}_{R}\}=K\binom{N}{R}\left(\frac{1}{K}\right)^{R}\left(1-\frac{1}{K}\right)^{N-R}$ .

The throughput of FSA is defined as the number of decoded users per slot which corresponds to $\frac{\mathbb{E}\{\mathcal{X}_{1}\}}{K}$ . This value is maximized when $K=N$ with a value of $0.368$ .

We now consider the scenario where the receiver is capable of decoding tags when a collision occurs and derive the throughput under this scenario. Let $J$ denote the number of tags that can be decoded and $M$ denote the number of colliding tags that can be resolved by the receiver, with the condition $R\leq M$ . If $J=1$ , the receiver can decode one of the tags out of at most $M$ colliding ones. In this case, the throughput can be computed as $\frac{1}{K}\sum_{R=1}^{M}\mathbb{E}\{\mathcal{X}_{R}\}=\sum_{R=1}^{M}\binom{N}{R}\left(\frac{1}{K}\right)^{R}\left(1-\frac{1}{K}\right)^{N-R}$ .

We illustrate the throughput of FSA over different values of the frame size to tag ratio ( $K/N$ ) for $M$ up to $4$ in Fig. 1. It can be seen that the throughput improves as $M$ increases, and that the maximum throughput is achieved at smaller $K/N$ values because the collisions are now recoverable. The throughput can be increased to $0.818$ ( $M=4$ ), which is a substantial gain compared to the throughput of $0.368$ for conventional FSA.

Figure 1: Expected throughput comparison for different values of $M$ over different values of $K/N$ when $J=1$ .

We also consider the scenario where the receiver can decode up to $J$ tags ( $J\geq 1$ ) out of up to $M$ colliding tags. In this case, the throughput can be written as $\frac{1}{K}\left(\sum_{R=1}^{J}R\,\mathbb{E}\{\mathcal{X}_{R}\}+\sum_{R=J+1}^{M}J\,\mathbb{E}\{\mathcal{X}_{R}\}\right)$ .
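The throughput expressions of this subsection can be evaluated numerically. The following is a minimal sketch, not the code used for the figures: the function names, the tag population $N=1000$, and the search range for $K$ are illustrative assumptions. Setting $M=J=1$ reduces the general expression to conventional FSA.

```python
from math import comb


def expected_slots(N, K, R):
    """E{X_R}: expected number of slots chosen by exactly R of the N tags."""
    p = 1.0 / K
    return K * comb(N, R) * p**R * (1.0 - p) ** (N - R)


def throughput(N, K, M, J):
    """Decoded tags per slot when up to M colliding tags are resolvable
    and at most J of them are decoded (each slot contributes min(R, J))."""
    total = sum(min(R, J) * expected_slots(N, K, R) for R in range(1, M + 1))
    return total / K


def best_frame(N, M, J):
    """Grid search over K = 1..2N (the search range is an assumption)."""
    return max((throughput(N, K, M, J), K) for K in range(1, 2 * N + 1))


N = 1000  # illustrative tag population, not a value from the paper
print("conventional FSA, K = N:", round(throughput(N, N, 1, 1), 3))
for M, J in [(4, 1), (4, 4)]:
    t, K = best_frame(N, M, J)
    print(f"M={M}, J={J}: max throughput {t:.3f} at K/N = {K / N:.3f}")
```

Under these assumptions, the search places the maximum at a frame size well below $N$ for $M=4$ , which is consistent with the observation that recoverable collisions favor shorter frames.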

We depict the maximum throughput of FSA for different values of $M$ and $J$ in Fig. 2, obtained at the optimal $K/N$ values. It can be seen that recovering more than one tag is significantly beneficial and boosts the system performance further. The throughput can reach nearly $2$ , compared with $0.818$ when $J=1$ . This throughput analysis shows the potential of collision recovery to improve the performance of conventional FSA.

II-B The Communication Model

We now present the communication model between the reader and the passive tags, following the models typically considered in the system-level analysis of UHF-RFID systems [9, 14], as these models also comply with measurement data. The communication model is depicted in Fig. 3. In our model, we only consider a single receiver antenna at the reader. The communication can be divided into two parts: 1) Forward channel: the reader supplies energy and data to the passive tags through continuous carrier wave transmission, 2) Backward channel: the passive tags absorb the energy from the reader and reflect it back to the reader. Each tag modulates its 16-bit random number (RN16) with on-off keying: 0 (OFF) and 1 (ON) correspond to the absorbing and reflecting states, respectively. Let $a_{i}(t)$ denote the modulated RN16 signal for tag $i$ , i.e.,

$$a_{i}(t)=\sum_{k}a_{i}[k]\,p(t-kT_{i}-\tau_{i}), \tag{1}$$

where $a_{i}[k]$ denotes the transmitted symbols ( $\pm 1$ ) and $p(t)$ is a rectangular pulse of the modulated signal. $T_{i}$ and $\tau_{i}$ correspond to the symbol period and modulation delay, respectively.

The backscattered modulated signal for tag $i$ is transmitted via carrier frequency $f_{c}$ and denoted by $s_{tag,i}$ , i.e.,

$$s_{tag,i}=|h_{i}^{f}|\sqrt{|\Delta\sigma_{i}|}\,a_{i}(t)\sin(2\pi f_{c}t+\varphi_{i}^{f}+\varphi_{i}^{\Delta\sigma}), \tag{2}$$

where $|h_{i}^{f}|$ is the magnitude of the forward channel coefficient and $|\Delta\sigma_{i}|$ is the normalized differential radar cross section (RCS) coefficient of tag $i$ [15]. $\varphi_{i}^{f}$ and $\varphi_{i}^{\Delta\sigma}$ are the phase shifts of the forward channel and the modulation, respectively.

During the transmission from the reader to tags, a leakage occurs between the transmitter and receiver antenna, i.e.,

$$s_{leak}(t)=|L|\sin(2\pi f_{c}t+\varphi^{leak}), \tag{3}$$

where $|L|$ is the leakage magnitude and $\varphi^{leak}$ is the corresponding phase shift.

After deriving all necessary components, we finally compute the received signal at the reader denoted as $s(t)$ and given by

$$s(t)=\sum_{i=1}^{R}|h_{i}^{b}||h_{i}^{f}|\sqrt{|\Delta\sigma_{i}|}\,a_{i}(t)\sin(2\pi f_{c}t+\varphi_{i}^{f}+\varphi_{i}^{\Delta\sigma}+\varphi_{i}^{b})+s_{leak}(t)+n(t),$$

where $R$ is the number of colliding tags in a certain slot. The backward channel is denoted by $h_{i}^{b}$ and $\varphi_{i}^{b}$ is the corresponding phase shift for tag $i$ . $n(t)$ is the additive white Gaussian noise (AWGN) with power spectral density $N_{0}$ . We note that the parameters, i.e., the channel coefficients, phase shifts, and differential RCS, are constant during the transmission of the modulated backscattered tag signals.

It is possible to down-convert the received signal into the baseband using I/Q demodulators since all signal components have the same carrier frequency. Utilizing that, we can write the complex-valued baseband signal at the reader as

$$s^{b}(t)=\sum_{i=1}^{R}h_{i}^{b}h_{i}^{f}\sqrt{\Delta\sigma_{i}}\,a_{i}(t)+L+n^{b}(t), \tag{4}$$

where $h_{i}^{f}=|h_{i}^{f}|e^{j\varphi_{i}^{f}}$ and $h_{i}^{b}=|h_{i}^{b}|e^{j\varphi_{i}^{b}}$ are complex-valued channel coefficients. The normalized RCS, leakage, and noise can be written as $\sqrt{\Delta\sigma_{i}}=\sqrt{|\Delta\sigma_{i}|}e^{j\varphi_{i}^{\Delta\sigma}}$ , $L=|L|e^{j\varphi^{leak}}$ , and $n^{b}(t)=n(t)e^{j2\pi f_{c}t}$ , respectively.

We further simplify equation (4) by denoting $h_{i}=h_{i}^{b}h_{i}^{f}\sqrt{\Delta\sigma_{i}}$ as the overall coefficient, which includes both the forward and backward channels as well as the differential RCS coefficient. By introducing the row vector $\mathbf{h}$ with elements $h_{i}$ and the column vector $\mathbf{a}(t)$ with elements $a_{i}(t)$ , we can rewrite the received signal as

$$s^{b}(t)=\mathbf{h}\mathbf{a}(t)+L+n^{b}(t), \tag{5}$$

where signal to noise ratio (SNR) is defined as $\mathbb{E}\{|h_{i}|^{2}a_{i}^{2}\}/N_{0}$ .

We note that our communication model assumes coherent detection, i.e., precise tag synchronization and identical symbol rates among tags. In UHF-RFID systems, the tags might be asynchronous due to the delay term ( $\tau_{i}$ ) during the modulation, and the modulated RN16 signals $a_{i}(t)$ generally have different modulation frequencies (symbol periods $T_{i}$ ), which can differ by up to $\pm 22\%$ [1]. However, we neglect the impact of asynchronism and different symbol periods among tags in order to obtain numerical results comparable with the literature, where simulations are performed under ideal conditions. Non-coherent detection is left as future work.

II-C Received Signal Constellation Points

The received baseband signal is complex-valued and contains both in-phase and quadrature components. The received signal $s^{b}(t)$ and its constellation points in the I/Q plane for up to $4$ tags are illustrated in Fig. 4. The plots in the first column show the amplitudes of the I and Q components of the received signal, whereas the plots in the second column depict the constellation points in the I/Q plane. One major observation is that each number of tags yields different amplitude levels, which results in a higher number of clusters as the number of tags increases. Since we only consider on-off keying, the transmitted symbols can only take values $\pm 1$ . As a result, the cluster centers and signal levels correspond to the distinct signed combinations of the channel gains $h_{i}$ , and therefore we have $2^{R}$ different levels and clusters when the number of tags in collision is $R$ .

The I/Q diagram of the received signal provides important information about the number of tags under collision. Hence, we utilize the I/Q diagram of the constellation points to estimate the number of tags in the proposed recovery algorithm.

III ML-Aided Collision Recovery Algorithm

In this section, we present our proposed ML-aided collision recovery algorithm for up to $4$ tags. The diagram of the algorithm is shown in Fig. 5. As a first step, we estimate the number of tags in collision by using GMM and neural network architectures, i.e., feed-forward (FNN) and convolutional (CNN), applied to the received signal constellation points in the I/Q plane. Then, four different FNN models are trained to estimate the channel coefficients for a given number of tags with the aid of $4$ additional symbols. After finding the number of tags and the channel gains, we apply a minimum distance decoder to separate the transmitted signals of the passive tags.

III-A Number of Tag Estimation

We first consider a well-known clustering method, the Gaussian mixture model (GMM), for estimating the number of tags. GMM is a probabilistic model-based clustering technique in which the samples are drawn from a mixture of normal distributions, i.e., $p(\mathbf{x_{n}})=\sum_{l=1}^{L}{\pi_{l}}N(\mathbf{x_{n}}\mid\mu_{l},\Sigma_{l})$ , where $\mathbf{x_{n}}$ denotes the received signal constellation points, $\pi_{l}$ are the mixture probabilities, i.e., $0\leq\pi_{l}\leq 1$ for all $l$ and $\sum_{l=1}^{L}\pi_{l}=1$ , $N(\mathbf{x_{n}}\mid\mu_{l},\Sigma_{l})$ is a normal distribution with mean $\mu_{l}$ and covariance $\Sigma_{l}$ , and $L$ is the number of components in the mixture, equal to $2^{R}$ . GMM suits our problem well since the received signal is a mixture of normally distributed constellation points that are centered at the combinations of the channel coefficients and deviate around these centers due to AWGN. For GMM, we pick the number of tags from the estimated number of clusters: we select the number of tags as $x$ when the number of clusters lies between $2^{x-1}$ and $2^{x}$ . We use the expectation-maximization (EM) algorithm as an iterative solution to find well-performing parameters ( $\pi_{l},\mu_{l},{\Sigma_{l}}$ ) of the GMM [16].
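The rule that maps an estimated number of clusters to a number of tags can be sketched as follows. The treatment of the interval boundaries (lower bound exclusive, upper bound inclusive), the cap at $4$ tags, and the function name are our assumptions for illustration; the text only states that the cluster count lies between $2^{x-1}$ and $2^{x}$ .

```python
def tags_from_clusters(num_clusters, max_tags=4):
    """Map an estimated cluster count to a tag count x such that
    2**(x-1) < num_clusters <= 2**x (boundary handling is an assumption)."""
    if num_clusters < 1:
        raise ValueError("need at least one cluster")
    # (n - 1).bit_length() is the smallest x with 2**x >= n, in exact integers.
    x = (num_clusters - 1).bit_length()
    return min(x, max_tags)


for c in (2, 3, 4, 5, 8, 9, 16):
    print(c, "clusters ->", tags_from_clusters(c), "tag(s)")
```

With this convention a single cluster maps to zero tags, i.e., an empty slot, which is also an assumption of the sketch.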

We extend our approach by using an FNN. The architecture comprises $7$ hidden layers and an output layer. The number of units starts at $1024$ and is halved at each subsequent hidden layer, and the output layer has $4$ units corresponding to the number of tags to be estimated. The received baseband signal $s^{b}(t)$ is fed into the model by concatenating its real and imaginary parts. Each linear layer applies a matrix multiplication with weight parameters and adds bias terms. After the linear operations, a ReLU (rectified linear unit) activation function is applied at the hidden layers. At the output layer, a soft-max activation function is used to obtain probabilities for the number of tags.
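The layer structure of the FNN described above can be sketched with an untrained forward pass. This is an illustration only: the input dimension of $320$ (16 symbols, 10 samples per symbol, real and imaginary parts) and the random weight initialization are assumptions, and the names are hypothetical.

```python
import math
import random


def fnn_layer_widths(first=1024, hidden_layers=7, outputs=4):
    """Hidden widths start at `first` and halve per layer; then the output layer."""
    return [first // (2**k) for k in range(hidden_layers)] + [outputs]


def init_params(in_dim, widths, rng):
    """Random weights and zero biases (the initialization scheme is an assumption)."""
    params, fan_in = [], in_dim
    for width in widths:
        scale = math.sqrt(2.0 / fan_in)
        W = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(width)]
        params.append((W, [0.0] * width))
        fan_in = width
    return params


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def forward(params, x):
    """Linear layer + ReLU on hidden layers, soft-max on the output layer."""
    for idx, (W, b) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]
        if idx < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return softmax(x)


rng = random.Random(0)
widths = fnn_layer_widths()
params = init_params(320, widths, rng)  # 320 input features is an assumption
probs = forward(params, [rng.gauss(0.0, 1.0) for _ in range(320)])
print(widths)
print(probs)  # untrained probabilities over 1..4 tags
```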

We also consider a CNN with the aim of improving on the performance of the FNN, since it might be able to extract features from the constellation points through its convolutional layers. The CNN architecture is similar to that of the FNN. In contrast to the FNN, we feed the received signal to $3$ convolutional layers with $16$ , $64$ , and $32$ channels, respectively, each with a kernel size of $5$ and a ReLU activation. The convolutional layers are then followed by $4$ hidden layers and an output layer with $512$ , $256$ , $64$ , $32$ , and $4$ units, respectively. The hidden layers use ReLU activations and the output layer uses a soft-max activation.

We now describe the dataset used for training the FNN and CNN. We generate the received signal samples in batches for each number of tags. The symbol period $T_{i}$ and delay $\tau_{i}$ are set to $10$ and $0$ , respectively. The training SNR is set to $5$ dB and the leakage is assumed to be $0$ . Each received signal sample is formed as follows: 1) each tag modulates a 16-bit random number (RN16), 2) the Rayleigh channel coefficients are generated as independent and identically distributed (i.i.d.) zero-mean, unit-variance complex Gaussian random variables, i.e., $\mathcal{CN}(0,1)$ , 3) the noise realizations are also i.i.d. complex Gaussian random variables with zero mean and variance $N_{0}$ .
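The three steps above can be sketched as a generator for a single training sample. This is a minimal illustration under stated assumptions: the symbol period of $10$ is interpreted as $10$ samples per symbol, the noise variance per complex sample is taken as $N_{0}=10^{-\mathrm{SNR_{dB}}/10}$ (unit-power channel and $\pm 1$ symbols), and the function name and return format are hypothetical.

```python
import math
import random


def make_sample(R, snr_db, rng, n_bits=16, samples_per_symbol=10):
    """Return (features, h, symbols) for one slot with R colliding tags."""
    n0 = 10.0 ** (-snr_db / 10.0)  # noise variance, assuming E{|h|^2 a^2} = 1
    sigma_h = math.sqrt(0.5)       # per-dimension std of CN(0, 1)
    sigma_n = math.sqrt(n0 / 2.0)  # per-dimension std of CN(0, N0)

    # 1) RN16 of each tag, mapped to +/-1 symbols.
    symbols = [[rng.choice((-1, 1)) for _ in range(n_bits)] for _ in range(R)]
    # 2) i.i.d. Rayleigh channel coefficients, constant over the slot.
    h = [complex(rng.gauss(0, sigma_h), rng.gauss(0, sigma_h)) for _ in range(R)]

    # 3) Baseband signal with zero leakage and zero delay, plus complex noise.
    s = []
    for k in range(n_bits):
        level = sum(h[i] * symbols[i][k] for i in range(R))
        for _ in range(samples_per_symbol):
            s.append(level + complex(rng.gauss(0, sigma_n), rng.gauss(0, sigma_n)))

    # Network input: real parts followed by imaginary parts.
    features = [z.real for z in s] + [z.imag for z in s]
    return features, h, symbols


features, h, symbols = make_sample(R=3, snr_db=5.0, rng=random.Random(1))
print(len(features), len(h))
```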

For GMM, we apply the Bayesian information criterion (BIC) score to avoid over-estimating the number of clusters by penalizing the likelihood function [17]. For the FNN and CNN, the Adam optimizer [18] is used with a learning rate of $\eta=0.0001$ . We select a batch size of $256$ for each number of tags and employ the cross-entropy loss function.

III-B Channel Estimation

We now explain our proposed approach for channel estimation, which uses the estimated number of tags. After the number of tags is found, four different FNN models are trained with the aid of $4$ fixed symbols in addition to the RN16. We use fixed additional symbols for this problem because, if the symbols were formed randomly like the RN16, the neural network could not capture the relationship between the received signal and the channel coefficients. We select the number of symbols as $4$ since we are interested in estimating the channel coefficients of at most $4$ tags. We note that the additional symbols should be different for each tag and orthogonal to each other. The main idea of using the FNN model is to extract the channel coefficients from the received signal, which is corrupted by AWGN.
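To illustrate why fixed, mutually orthogonal symbols make the channel coefficients identifiable, the sketch below uses a classical correlation (least-squares) estimator in place of the FNN. It is not the proposed method: the rows of a $4\times 4$ Hadamard matrix are assumed as the additional symbols, one sample per symbol is assumed, and all names are hypothetical.

```python
# Assumed additional symbols: rows of a 4x4 Hadamard matrix (mutually orthogonal).
PILOTS = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def received_pilots(h, noise=None):
    """Superposition of the tags' pilot symbols, one sample per symbol."""
    y = [sum(h[i] * PILOTS[i][k] for i in range(len(h))) for k in range(4)]
    if noise is not None:
        y = [yk + nk for yk, nk in zip(y, noise)]
    return y


def correlate_estimate(y, R):
    """Correlate with each tag's pilot; orthogonality removes the other tags."""
    return [sum(y[k] * PILOTS[i][k] for k in range(4)) / 4 for i in range(R)]


h = [0.3 + 0.4j, -1.1 + 0.2j, 0.5 - 0.7j]  # illustrative channel gains
estimate = correlate_estimate(received_pilots(h), R=len(h))
print(estimate)
```

In the noise-free case the correlation recovers each coefficient exactly; with noise, the estimate is perturbed by the averaged noise term, which is the regime where a learned estimator is applied in our framework.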

The FNN model for channel estimation contains $5$ hidden layers and an output layer. Similar to the previous model, the received signal is fed into the model, but here it contains only the $4$ additional symbols. Each hidden layer has $320$ units and is activated by the ReLU function. The output layer has no activation function and consists of $2R$ units, where the factor $2$ accounts for the real and imaginary parts. For the training procedure, we produce the dataset in a similar fashion to the number of tag estimation case. We use the Adam optimizer with learning rate $\eta=0.1$ . We select a batch size of $256$ with a training SNR of $20$ dB and employ minimum mean squared error (MMSE) as the loss function.

III-C Minimum Distance Decoder

As a final step in our framework, we perform minimum distance decoding after finding the channel coefficients of each tag. As described earlier in Section II, the received signal levels, or cluster centers, are combinations of the channel coefficients with different signs. As an example, consider two tags with channel gains $h_{1}$ and $h_{2}$ . The received signal then has $4$ different levels, denoted by $l_{1}$ , $l_{2}$ , $l_{3}$ , and $l_{4}$ , i.e.,

$$l_{1}=h_{1}+h_{2},\quad l_{2}=h_{1}-h_{2},\quad l_{3}=-h_{1}+h_{2},\quad l_{4}=-h_{1}-h_{2}.$$

The signal levels can be computed in a similar manner for a higher number of tags. Given the signal levels, the maximum likelihood decision rule is equivalent to minimum distance decoding for the given observation (received signal $s^{b}(t)$ ), i.e.,

$$\hat{i}=\underset{i\in\{1,2,\cdots,2^{R}\}}{\mathrm{argmin}}\,\lVert l_{i}-s^{b}(t)\rVert. \tag{6}$$

After finding the closest signal level for each RN16 symbol, we can derive the transmitted bits/symbols of the passive tags from that signal level. Specifically, if $\hat{i}=1$ , both tags transmit $1$ (reflecting state), and if $\hat{i}=2$ , the first and second tags transmit $1$ and $-1$ , respectively.
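The decoding rule in (6) can be sketched for an arbitrary number of tags as follows. The sketch assumes one observation per symbol and known (or previously estimated) channel gains; the function names and the numerical values are illustrative only. The sign patterns are enumerated in the same order as $l_{1},\dots,l_{4}$ in the two-tag example.

```python
from itertools import product


def signal_levels(h):
    """All 2**R sign patterns and the corresponding levels sum_i s_i * h_i."""
    patterns = list(product((1, -1), repeat=len(h)))
    levels = [sum(s * hi for s, hi in zip(p, h)) for p in patterns]
    return patterns, levels


def min_distance_decode(h, observations):
    """For each observation, return the sign pattern of the closest level."""
    patterns, levels = signal_levels(h)
    decoded = []
    for y in observations:
        best = min(range(len(levels)), key=lambda i: abs(levels[i] - y))
        decoded.append(patterns[best])
    return decoded


h = [0.9 + 0.2j, -0.3 + 0.6j]                # illustrative channel gains
sent = [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # symbols of (tag 1, tag 2)
noise = 0.05 - 0.03j                         # small fixed perturbation
received = [sum(s * hi for s, hi in zip(p, h)) + noise for p in sent]
print(min_distance_decode(h, received))
```

The exhaustive search over $2^{R}$ levels is inexpensive here because $R\leq 4$ .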

IV Numerical Results

In this section, we illustrate the performance of the proposed framework via simulations. We first present the accuracy values of tag estimation and MMSE losses for channel estimation. We then demonstrate the throughput performance of GMM, FNN and CNN in comparison with the existing approaches.

We consider two frequency-flat channel models, i.e., Rayleigh and Rician fading. In the former, the channel coefficients are i.i.d. zero-mean, unit-variance complex Gaussian random variables. In the latter, the channel coefficients are calculated as $h_{\text{ric}}=\sqrt{P_{LOS}}+\sqrt{P_{NLOS}}\,h_{\text{ray}}$ , where $h_{\text{ric}}$ and $h_{\text{ray}}$ are the Rician and Rayleigh channel coefficients, respectively. $P_{LOS}$ and $P_{NLOS}$ measure the powers of the line of sight (LOS) and non line of sight (NLOS) components in the environment and can be computed as $P_{LOS}=\frac{K}{K+1}$ and $P_{NLOS}=\frac{1}{K+1}$ , where $K$ here denotes the Rician factor, selected as $2.8$ dB according to the measurements in [19].
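The Rician coefficients can be generated from Rayleigh ones as in the following minimal sketch. It assumes that the Rician factor given in dB is converted to linear scale before computing $P_{LOS}$ and $P_{NLOS}$ ; the function name is hypothetical.

```python
import math
import random


def rician_coefficient(rng, k_db=2.8):
    """h_ric = sqrt(P_LOS) + sqrt(P_NLOS) * h_ray, with h_ray ~ CN(0, 1)."""
    k = 10.0 ** (k_db / 10.0)  # Rician factor in linear scale (assumed conversion)
    p_los = k / (k + 1.0)
    p_nlos = 1.0 / (k + 1.0)
    sigma = math.sqrt(0.5)     # per-dimension std of CN(0, 1)
    h_ray = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
    return math.sqrt(p_los) + math.sqrt(p_nlos) * h_ray


rng = random.Random(0)
samples = [rician_coefficient(rng) for _ in range(20000)]
mean_power = sum(abs(z) ** 2 for z in samples) / len(samples)
print(mean_power)  # close to 1, since P_LOS + P_NLOS = 1
```

Because $P_{LOS}+P_{NLOS}=1$ , the Rician coefficients have the same average power as the Rayleigh ones, so the SNR definition is unchanged between the two models.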

IV-A Number of Tag and Channel Estimation

The accuracy values of GMM, FNN and CNN are given in Table I for up to $4$ tags at an SNR of $20$ dB under both channel models. While GMM performs well at an SNR of $20$ dB, its performance degrades severely at lower SNR values. FNN has the lowest overall accuracy at this SNR, and CNN outperforms both GMM and FNN with an overall accuracy of $0.971$ for the Rayleigh fading channel. However, it should be noted that FNN and CNN are more tolerant to noise than GMM, as illustrated in the throughput simulations. FNN and CNN also achieve almost the same performance under the Rician fading model. This shows that even though the neural networks are trained with a specific channel model, they remain robust to another channel model.

Table I: Comparison of tag accuracy values for GMM, FNN and CNN at SNR value of 20 dB under both channel models.

Number of tags	GMM (Rayleigh)	FNN (Rayleigh)	CNN (Rayleigh)	FNN (Rician)	CNN (Rician)
1 tag	0.9988	1.0	1.0	1.0	1.0
2 tags	0.9855	0.993	1.0	0.988	1.0
3 tags	0.9688	0.889	0.992	0.813	0.991
4 tags	0.863	0.863	0.894	0.831	0.89
Overall	0.954	0.936	0.971	0.908	0.97

The MMSE loss values of the channel parameters are presented in Table II for up to $4$ tags at an SNR of $20$ dB under both channel models. The loss values are on the order of $10^{-4}$ , which shows that the network can extract the channel coefficients successfully with the help of the preamble symbols. The two channel models again yield similar results, which shows the generalization capability of the network.

We also note that the neural networks for the number of tags and channel estimation are trained off-line using an extensive number of simulated samples. Applying the trained models, however, only requires simple calculations (matrix multiplications and non-linear activations). Hence, the use of deep learning tools is feasible for real-time implementation, while this is not the case for GMM. GMM needs to compute well-performing parameters for each sample individually, which makes it computationally demanding and disadvantageous in terms of real-time applicability.

Table II: Comparison of MMSE values for channel parameters at SNR value of 20 dB under both channel models.

Number of tags	Rayleigh	Rician
1 tag	$6.71\times 10^{-5}$	$6.96\times 10^{-5}$
2 tags	$1.09\times 10^{-4}$	$1.09\times 10^{-4}$
3 tags	$1.45\times 10^{-4}$	$1.51\times 10^{-4}$
4 tags	$1.65\times 10^{-4}$	$1.68\times 10^{-4}$

IV-B Throughput Performance

We now depict the throughput performance of the proposed collision recovery algorithm. We first consider the scenario where the receiver can only decode one tag from the collision ( $J=1$ ), and then extend our examples to the scenario where the receiver is capable of decoding up to $4$ tags ( $J=4$ ). For both scenarios, we assume that the receiver is equipped with a single antenna and the channel is modeled as Rayleigh fading. We also perform simulations for Rician fading. However, since the trained models perform well in that case too, the results are quite similar to those for Rayleigh fading and are therefore omitted. We perform our simulations over different SNR values under the optimal frame to tag ratio ( $K/N$ ).

In the first simulation setup, the receiver can only decode one tag from a collision of up to $4$ tags ( $M=4,J=1$ ). Fig. 6 illustrates the throughput of GMM, FNN and CNN over various SNR values. For comparison purposes, we also include the throughput of the conventional FSA algorithm and of the ideal scenario, where the number of tags and channel coefficients are known perfectly by the receiver. The dashed lines indicate the theoretical throughput of conventional FSA and of FSA with collision recovery. We observe that in the low SNR regime, the performance of GMM suffers significantly, whereas both FNN and CNN perform well. However, this gap vanishes as the SNR increases, and all models provide similar performance.

In the second simulation setup, we consider the case where the receiver decodes up to $4$ tags out of $4$ colliding tags ( $M=4,J=4$ ). We depict the throughput of GMM, FNN and CNN in Fig. 7 along with the ideal scenario and conventional FSA. The throughput gain compared to the $J=1$ scenario is easy to observe, with the throughput approaching $2$ . The observations are similar to the previous setup, and CNN still outperforms the other alternatives.

Finally, we compare our results with existing works in the literature that use a single-antenna receiver and present the maximum throughput of each work in Table III. The work [9] considers only $2$ colliding tags and recovery of a single tag ( $M=2,J=1$ ) with a zero-forcing receiver; thus, its throughput is lower than that of the other works. The authors in [11] employ an MMSE receiver which is capable of recovering $2$ colliding tags ( $M=2,J=2$ ) and obtain a throughput of $0.841$ . In [12], voltage clustering is applied to the constellation points, yielding a throughput of $0.85$ under the simulation setup $M=3,J=2$ . With our proposed approach, we show that it is possible to decode $4$ tags out of $4$ colliding ones ( $M=4,J=4$ ), improving the throughput to $1.837$ .

Table III: Comparison of our work with the existing ones.

Work	Maximum Throughput	Method & Setup
[9]	0.587	Zero-Forcing Receiver ( $M=2,J=1$ )
[11]	0.841	MMSE Receiver ( $M=2,J=2$ )
[12]	0.85	Voltage Clustering ( $M=3,J=2$ )
Our work	0.797	ML-based recovery ( $M=4,J=1$ )
Our work	1.837	ML-based recovery ( $M=4,J=4$ )

V Conclusions and Future Work

We have considered the problem of recovering tags in collision for UHF-RFID systems. Different from the existing recovery methods, we propose a machine learning based algorithm. We show that the proposed models can estimate the number of tags and the channel coefficients successfully given a suitable design and training. The trained models enable the receiver to recover the tag signals from the collided signal by using minimum distance decoding. We also perform simulations to demonstrate the performance of the proposed approach, which provides a significant improvement in throughput.

As future work, one direction is to confirm the validity of the proposed method on measurement data obtained from real-time implementations. In our work, we consider a simplified communication model with coherent detection to show the applicability of deep learning tools to UHF-RFID systems. It is possible to extend our approach to more realistic setups that take tag asynchronism and different symbol periods into account. In addition, we study a system model where the receiver is equipped with only a single antenna, which can be generalized to the multiple-antenna setup. Another line of work is to examine the complexity of the trained models and how they can be coupled with an experimental configuration.

References

[1] EPCGlobal, “EPC Radio-Frequency Identity protocols Class-1 Generation-2 UHF RFID protocol for communications at 860 MHz–960 MHz,” Version, vol. 1, p. 23, 2008.
[2] F. Schoute, “Dynamic frame length ALOHA,” IEEE Trans. Commun., vol. 31, no. 4, pp. 565–568, Apr. 1983.
[3] D. Shen, G. Woo, D. P. Reed, A. B. Lippman, and J. Wang, “Separation of multiple passive RFID signals using software defined radio,” in Proc. IEEE Int. Conf. RFID, Orlando, FL, USA, Apr. 2009, pp. 139–146.
[4] R. S. Khasgiwale, R. U. Adyanthaya, and D. W. Engels, “Extracting information from tag collisions,” in Proc. IEEE Int. Conf. RFID, Orlando, FL, Apr. 2009, pp. 131–138.
[5] J. Yu, K. H. Liu, X. Huang, and G. Yan, “An anti-collision algorithm based on smart antenna in RFID system,” in Proc. Int. Conf. Microwave and Millimeter Wave Technol. (ICMMT), Nanjing, China, Apr. 2008, pp. 1149–1152.
[6] A. F. Mindikoglu and A.-J. van der Veen, “Separation of overlapping RFID signals by antenna arrays,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, NV, Apr. 2008, pp. 2737–2740.
[7] B. Knerr, M. Holzer, C. Angerer, and M. Rupp, “Slot-wise maximum likelihood estimation of the tag population size in FSA protocols,” IEEE Trans. Commun., vol. 58, no. 2, pp. 578–585, Feb. 2010.
[8] C. Angerer, G. Maier, M. V. Delgado, M. Rupp, and J. V. Alonso, “Single antenna physical layer collision recovery receivers for RFID readers,” in Proc. IEEE Int. Conf. Indust. Technol., Viña del Mar, Chile, Mar. 2010, pp. 1406–1411.
[9] C. Angerer, R. Langwieser, and M. Rupp, “RFID reader receivers for physical layer collision recovery,” IEEE Trans. Commun., vol. 58, no. 12, pp. 3526–3537, Dec. 2010.
[10] J. Kaitovic, R. Langwieser, and M. Rupp, “RFID reader with multi antenna physical layer collision recovery receivers,” in Proc. IEEE Int. Conf. RFID-Technol. Appl., Sitges, Spain, Sep. 2011, pp. 286–291.
[11] J. Kaitovic, M. Šimko, R. Langwieser, and M. Rupp, “Channel estimation in tag collision scenarios,” in Proc. IEEE Int. Conf. RFID, Orlando, FL, Apr. 2012, pp. 74–80.
[12] X. Tan, H. Wang, L. Fu, J. Wang, H. Min, and D. W. Engels, “Collision detection and signal recovery for UHF RFID systems,” IEEE Trans. Autom. Sci. Eng., vol. 15, no. 1, pp. 239–250, Jan. 2018.
[13] H. Mahdavifar and A. Vardy, “Coding for tag collision recovery,” in Proc. IEEE Int. Conf. RFID, San Diego, CA, Apr. 2015, pp. 9–16.
[14] A. Bletsas, J. Kimionis, A. G. Dimitriou, and G. N. Karystinos, “Single-antenna coherent detection of collided FM0 RFID signals,” IEEE Trans. Commun., vol. 60, no. 3, pp. 756–766, Mar. 2012.
[15] P. V. Nikitin, K. Rao, and R. D. Martinez, “Differential RCS of RFID tag,” Electron. Lett., vol. 43, no. 8, pp. 431–432, Apr. 2007.
[16] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Royal Stat. Soc., vol. 39, no. 1, pp. 1–38, 1977.
[17] G. Schwarz, “Estimating the dimension of a model,” Ann. Stat., vol. 6, pp. 461–464, 1978.
[18] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Rep. (ICLR), San Diego, USA, May 2015.
[19] D. Kim, M. A. Ingram, and W. W. Smith, “Measurements of small-scale fading and path loss for long range RF tags,” IEEE Trans. Antennas Propag., vol. 51, no. 8, pp. 1740–1749, Aug. 2003.