
Rogue Emitter Detection Using Hybrid Network of Denoising Autoencoder and Deep Metric Learning

Zeyang Yang, Xue Fu, Guan Gui, Yun Lin, Haris Gacanin, Hikmet Sari, and Fumiyuki Adachi††
 
College of Telecommunications and Information Engineering, NJUPT, Nanjing, China
College of Information and Communication Engineering, Harbin Engineering University, Harbin, China
Institute for Communication Technologies and Embedded Systems, RWTH Aachen University, Aachen, Germany
††International Research Institute of Disaster Science (IRIDeS), Tohoku University, Sendai, Japan
Abstract

Rogue emitter detection (RED) is a crucial technique for maintaining secure internet of things (IoT) applications. Existing deep learning-based RED methods have been proposed for friendly environments; however, they perform unstably in low signal-to-noise ratio (SNR) scenarios. To address this problem, we propose a robust RED method based on a hybrid network of a denoising autoencoder and deep metric learning (DML). Specifically, the denoising autoencoder mitigates noise interference and thus improves robustness at low SNR, while DML improves feature discrimination. Several typical experiments are conducted to evaluate the proposed RED method on an automatic dependent surveillance-broadcast (ADS-B) dataset and an IEEE 802.11 dataset, and to compare it with existing RED methods. Simulation results show that the proposed method achieves better RED performance and higher noise robustness, with more discriminative semantic vectors, than existing methods.

Index Terms:
Rogue emitter detection, deep learning, deep metric learning, denoising autoencoder, feature discrimination.

I Introduction

With the development of wireless communications, various internet of things (IoT) applications are growing rapidly and play an indispensable role in our daily life [1, 2, 3]. However, the convergence of sensors, actuators, information, and communication technologies in IoT produces massive amounts of data that need to be sifted to facilitate reasonably accurate decision-making and control [4]. The openness of IoT makes it vulnerable to cybersecurity threats [5, 6], in particular identity spoofing attacks, where an adversary passively listens to existing radio communications and then mimics the identity of legitimate devices to conduct malicious activities.

Radio frequency fingerprinting identification (RFFI) is a technique for identifying RF devices by extracting inherent features caused by hardware defects in their analog circuits [7]. These hardware imperfections arise during the manufacturing process. The most important merit of using a physical imperfection as an identifying signature is that the signature is hard to spoof with other wireless devices. RFFI has therefore been used as an additional security layer for wireless devices to prevent spoofing or analog attacks [8, 9].

In recent years, deep learning-based specific emitter identification (SEI) methods have been proposed [10, 11, 12, 13, 14]. These methods have their own advantages, such as high identification accuracy, strong model generalization ability, and low model complexity. However, they are hard to apply to rogue emitter detection (RED), an important technique for countering the threats posed by the openness of IoT applications. To date, several related works on RED have been investigated. Breunig et al. [15] and Liu et al. [16] introduced LOF and IsolationForest, respectively, which are traditional machine learning-based RED methods. Bendale et al. [17] used meta-recognition and replaced softmax with OpenMax, which manages the open space risk of deep networks while rejecting spoofed images. Akcay et al. [18] introduced an encoder-decoder-encoder architecture for RED and used the idea of generative adversarial training, which generalizes well to various RED tasks. Dong et al. [19] presented SR2CNN, which improves feature discrimination by updating the semantic center vectors. The details of these related works are shown in Table I. The above works mainly focused on improving RED accuracy at a specific high signal-to-noise ratio (SNR); however, existing RED methods struggle in low SNR scenarios.

TABLE I: Related works.
References | Method | Data type | RED | SNR/dB | Detection performance
Y. Pan et al. [13] | DRN | Signals of radio emitters | No | [10, 24] | Acc: 56%–95%
N. Yang et al. [14] | MAML | ZigBee devices and 5 UAVs | No | [0, 30] | Acc: 30%–100%
Bendale et al. [17] | OpenMax | ImageNet (ILSVRC 2012 dataset) | Yes | / | F-measure: 0.595
Akcay et al. [18] | GANomaly | MNIST, CIFAR and X-ray | Yes | / | AUC: 0.666–0.882
Y. Dong et al. [19] | SR2CNN | RML2016.10a | Yes | ≥16 | True known rate: 0.959; true unknown rate: 0.998

To solve this problem, we propose a robust RED method using a hybrid network of a denoising autoencoder and deep metric learning (DML). The denoising autoencoder mitigates noise interference at low SNR, while DML improves feature discrimination, so the proposed method further improves RED performance in low SNR scenarios. Furthermore, we introduce an objective function that combines the cross-entropy (CE) loss, the mean squared error (MSE) loss, and the center-based metric learning (ML) loss, which allows the autoencoder to extract highly discriminative semantic features while saving feature space, so that the proposed method has the potential to detect more rogue emitters.

II Signal Model and Problem Formulation

II-A Signal Model

In this paper, two typical datasets, i.e., an ADS-B dataset and an IEEE 802.11 dataset, are used to evaluate the performance of the proposed RED method. The receiver captures ADS-B signals in a given airspace or IEEE 802.11 signals from specific USRP transmitters. Assume that there are K aircraft or K USRP transmitters transmitting signals, and that the ADS-B signals from each aircraft or the IEEE 802.11 signals from each transmitter are received individually by the receiver. The received signal can be represented as:

r_k(t) = s_k(t) * h_k(t) + n_k(t),  k = 1, 2, ⋯, K    (1)

where r_k(t) is the received signal, s_k(t) is the ADS-B signal transmitted by the aircraft or the IEEE 802.11 signal transmitted by the transmitter, h_k(t) is the channel impulse response between the emitter and the receiver, n_k(t) denotes additive white Gaussian noise, and * denotes the convolution operation.
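As a concrete illustration, the received-signal model in (1) can be simulated with NumPy; the toy waveform, the single-tap channel, and the SNR values below are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise n_k(t) at the given SNR (in dB) to a real signal."""
    rng = np.random.default_rng(0) if rng is None else rng
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10.0))
    return signal + rng.normal(0.0, np.sqrt(p_noise), signal.shape)

def received_signal(s_k, h_k, snr_db, rng=None):
    """Eq. (1): r_k(t) = s_k(t) * h_k(t) + n_k(t), convolution plus AWGN."""
    faded = np.convolve(s_k, h_k, mode="same")   # s_k * h_k
    return awgn(faded, snr_db, rng)

s = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 1000))  # toy emitter waveform
r = received_signal(s, h_k=np.array([1.0]), snr_db=30)
```

With an identity channel and a very high SNR, the received signal approaches the transmitted one, which is a quick sanity check of the model.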

II-B Problem Formulation

Let 𝒳 be the sample space and 𝒴 be the category space. One goal of the proposed RED method is to learn a mapping function f ∈ ℱ: 𝒳 → 𝒴 that accurately predicts the category of signals, where x ∈ 𝒳 represents a signal and y ∈ 𝒴 represents its true label. The mapping function f should minimize the empirical error ε_em, i.e.,

II-B1 Goal 1

min_{f∈ℱ} ε_em = min_{f∈ℱ} 𝔼_{(x,y)∼𝒟_t} ℒ_CE(f(x), y) + 𝔼_{(x,y)∼𝒟_t} ℒ_red(·)    (2)

where 𝒟_t represents the training dataset, ℒ_CE represents the classification loss, and ℒ_red represents the regularization term, which prevents overfitting, improves noise robustness, and extracts more discriminable features for RED. The MSE loss and the ML loss are used as ℒ_red(·) in this paper.

According to the optimized mapping function, the semantic features of the radio signals of legal emitters are obtained, and the semantic center features s_k can be calculated as:

s_k = (1/N) Σ_{i=1}^{N} f_en(x_i),  if x_i ∈ class k,    (3)

where f_en denotes the mapping function of the encoder, k denotes the k-th category, and N represents the number of samples of the k-th category. Our second goal is to detect whether unknown emitters are legitimate by comparing the distance between the semantic features of their radio signals and the known semantic center features, i.e.,

II-B2 Goal 2

min d(f_en(x), S) ≤ θ ⇒ y ∈ 𝒴_in
min d(f_en(x), S) > θ ⇒ y ∈ 𝒴_out    (6)

where S is the set of known semantic center features {s_k | k = 1, 2, ⋯, K}, min d(f_en(x), S) represents the minimum distance between the semantic features of radio signals of unknown emitters and the known semantic center features, θ represents the detection threshold, 𝒴_in represents the known category space, and 𝒴_out represents the rogue category space.

III The Proposed RED Method

III-A The Framework of Proposed RED Method

The proposed framework consists of an encoder, a decoder and a classifier, as shown in Fig. 1. The structure of each part of the network is shown in Table II. Both the encoder and the decoder contain seven convolutional layers. MaxPool is a pooling operation that reduces the feature dimension of the convolutional layers' output. BatchNorm is a batch normalization operation that adjusts the distribution of each layer's inputs toward a standard normal distribution, which speeds up the training and convergence of the network. LazyLinear is a fully connected (FC) layer.

Figure 1: The framework of the proposed RED method.
TABLE II: The network structure of proposed RED method.
Net | Layer | Number of layers
Encoder | Input | ×1
Encoder | Conv (64, (10,1)) + ReLU + BatchNorm + MaxPool ((4,1)) | ×7
Decoder | ConvTranspose (64, (3,1)) + ReLU + BatchNorm | ×7
Decoder | Conv (1, (3,1)) + Sigmoid | ×1
Classifier | Flatten | ×1
Classifier | LazyLinear (1024) | ×1
Classifier | LazyLinear (n classes) | ×1
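The encoder-plus-classifier pipeline of Table II can be sketched in PyTorch as follows. This is a reduced sketch for illustration: it uses four Conv + ReLU + BatchNorm + MaxPool blocks instead of seven, adds explicit padding, and assumes an IQ input of shape (N, 1, length, 2); these choices are not the paper's exact settings.

```python
import torch
from torch import nn

def conv_block(in_ch):
    # One Table II-style encoder block: Conv (64, (10,1)) + ReLU
    # + BatchNorm + MaxPool ((4,1)); padding is an added assumption.
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, kernel_size=(10, 1), padding=(5, 0)),
        nn.ReLU(),
        nn.BatchNorm2d(64),
        nn.MaxPool2d(kernel_size=(4, 1)),
    )

encoder = nn.Sequential(conv_block(1), conv_block(64),
                        conv_block(64), conv_block(64))
classifier = nn.Sequential(nn.Flatten(),
                           nn.LazyLinear(1024), nn.ReLU(),
                           nn.LazyLinear(10))      # 10 known classes (assumed)

x = torch.randn(2, 1, 4096, 2)   # batch of 2 toy IQ samples
z = encoder(x)                   # semantic feature map
logits = classifier(z)           # class scores for the known emitters
```

LazyLinear infers its input dimension on the first forward pass, which is convenient here because the flattened feature size depends on the input length.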

We introduce an objective loss function combining the CE loss, the MSE loss and the ML loss. Specifically, the CE loss, the key component of the objective, evaluates the classification error and can be expressed as:

ℒ_CE = −𝔼(y log(ŷ)),    (7)

where y denotes the true label of a training sample and ŷ denotes its predicted label. To further obtain a model with strong robustness and the capability to extract discriminative features, the decoder is connected to the encoder and two terms are used to regularize the objective function,

min ℒ = min{λ_CE ℒ_CE + λ_ML ℒ_ML + λ_MSE ℒ_MSE},    (8)

where ℒ_ML is the ML loss, ℒ_MSE is the MSE loss, and λ_CE, λ_MSE, and λ_ML are the weighting coefficients. With the regularization of the ML loss, the model obtains network parameters suitable for mining discriminative semantic features. With the regularization of the MSE loss, the model obtains network parameters that improve noise robustness. The two regularization terms are introduced in detail in the following two subsections. The full training procedure of the proposed RED method is described in Algorithm 1. After training, the decoder and classifier are discarded and the encoder is used for RED.
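The three-term objective in (8) can be sketched numerically as below; the loss weights match those reported in Section IV-A, while the toy feature, label, and signal arrays in the test are illustrative.

```python
import numpy as np

def ce_loss(probs, y):
    """Cross-entropy over predicted class probabilities (Eq. 7)."""
    return -np.mean(np.log(probs[np.arange(len(y)), y]))

def mse_loss(x, x_hat):
    """Reconstruction error between original and reconstructed signals (Eq. 9)."""
    return np.mean((x - x_hat) ** 2)

def ml_loss(z, centers, y):
    """Center-based ML loss: half the mean squared distance to class centers (Eq. 10)."""
    return 0.5 * np.mean(np.sum((z - centers[y]) ** 2, axis=1))

def objective(probs, y, x, x_hat, z, centers,
              lam_ce=1.0, lam_mse=0.5, lam_ml=0.005):  # weights from Sec. IV-A
    """Eq. (8): weighted sum of the CE, MSE, and ML losses."""
    return (lam_ce * ce_loss(probs, y)
            + lam_mse * mse_loss(x, x_hat)
            + lam_ml * ml_loss(z, centers, y))
```

Each term is zero exactly when its sub-goal is met: correct confident classification, perfect reconstruction, and features lying on their class centers.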

III-B Denoising Reconstruction for Strong Noise Robustness

To improve the noise robustness, this paper adopts a denoising autoencoder architecture [21], as shown in Fig. 2(b). Different from the traditional autoencoder shown in Fig. 2(a), we add additive white Gaussian noise to the original signals. The encoder extracts semantic features from the noise-added signals, and the decoder reconstructs the original signals from these semantic features.

Figure 2: The structure of traditional autoencoder and denoising autoencoder of the proposed RED method.

The MSE loss is used as the criterion to evaluate the reconstruction result and thus the network has the noise robustness by minimizing this loss. The MSE loss can be expressed as

ℒ_MSE = 𝔼(x_i − x̂_i^n)²,    (9)

where x_i denotes the original signal and x̂_i^n denotes the reconstructed signal.

Algorithm 1 Training procedure of the proposed RED method.

Require:

  • D: training dataset;

  • T: number of training iterations;

  • B: number of batches in a training iteration;

  • θ: parameters of the encoder, classifier and decoder, respectively;

  • θ_ML: the parameters of the ML loss;

  • lr: learning rate of the encoder, classifier and decoder, respectively;

  • lr_ML: learning rate of the ML loss;

  • λ_CE, λ_MSE, λ_ML: scalars for balancing the loss functions;

Dataset preprocessing:
D ← (D − min(D)) / (max(D) − min(D));
for t = 0 to T − 1 do:
    for b = 0 to B − 1 do:
         Sample a batch of training data (x_i, y_i) from D.
    Forward propagation:
    Add artificial noise perturbation to x_i.
         x_i^n = awgn(x_i; snr);
    Get the outputs of the encoder, classifier and decoder.
         {z_i, ŷ_i, x̂_i^n} = f(θ^{t,b}, θ_ML^{t,b}; x_i^n);
    Calculate the loss.
         ℒ_CE = ℒ_CE(ŷ_i, y_i);
         ℒ_MSE = ℒ_MSE(x_i, x̂_i^n);
         ℒ_ML = ℒ_ML(z_i, c_{y_k});
         ℒ = λ_CE ℒ_CE + λ_MSE ℒ_MSE + λ_ML ℒ_ML;
    Backward propagation:
         θ^{t,b+1} ← Adam(∇_θ, ℒ, lr, θ)
         θ_ML^{t,b+1} ← Adam(∇_{θ_ML}, ℒ_ML, lr_ML, θ_ML)
    end for
end for
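One iteration of Algorithm 1 can be sketched in PyTorch with toy linear stand-ins for the encoder, decoder, and classifier (the sizes, the class count, and the single shared Adam step for the centers are illustrative assumptions; they are not the Table II architecture, and the center update here receives the λ_ML-scaled gradient rather than a separate ∇ℒ_ML pass).

```python
import torch
from torch import nn

torch.manual_seed(0)
enc = nn.Sequential(nn.Flatten(), nn.Linear(32, 8))   # toy encoder
dec = nn.Linear(8, 32)                                # toy decoder
clf = nn.Linear(8, 3)                                 # toy classifier, 3 classes
centers = nn.Parameter(torch.zeros(3, 8))             # theta_ML: class centers
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters())
                       + list(clf.parameters()), lr=1e-3)
opt_ml = torch.optim.Adam([centers], lr=0.5)
lam_ce, lam_mse, lam_ml = 1.0, 0.5, 0.005             # weights from Sec. IV-A

def train_step(x, y, snr_db=0.0):
    # x_n = awgn(x; snr): noise power derived from the batch signal power
    p = x.pow(2).mean()
    x_n = x + torch.randn_like(x) * torch.sqrt(p / 10 ** (snr_db / 10))
    z = enc(x_n)                                  # semantic features
    y_hat, x_hat = clf(z), dec(z)                 # logits and reconstruction
    loss_ce = nn.functional.cross_entropy(y_hat, y)
    loss_mse = nn.functional.mse_loss(x_hat, x)   # reconstruct the *clean* x
    loss_ml = 0.5 * (z - centers[y]).pow(2).sum(1).mean()
    loss = lam_ce * loss_ce + lam_mse * loss_mse + lam_ml * loss_ml
    opt.zero_grad(); opt_ml.zero_grad()
    loss.backward()
    opt.step(); opt_ml.step()
    return loss.item()
```

Note that the MSE target is the clean signal while the encoder sees the noisy one, which is what makes the autoencoder denoising.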

III-C Metric Regularization for High Feature Discrimination

Feature discrimination is critical for RED. However, the discrimination brought by CE loss is not sufficient, so the ML loss is used as another regularization term. The ML loss [20] can be formulated as:

ℒ_ML = (1/2) 𝔼‖z_i − c_{y_k}‖₂²,    (10)

where z_i and c_{y_k} denote the semantic feature vectors and the semantic center feature vector of training samples in the k-th category, respectively. The combination of the CE loss and the ML loss allows the neural network to extract semantic features with small intra-class distance. The role of the ML loss is illustrated in Fig. 3.

III-D Rogue Emitter Detection

As mentioned in subsection A, when training is over, only the encoder is retained and used for RED. Fig. 3 shows a more detailed RED process which can be divided into the following three steps.

III-D1 Obtaining semantic center features

The training dataset, regarded as radio signals emitted by the legal emitters, is fed into the encoder to extract semantic features, and the semantic features of each category are averaged to obtain the semantic center features of each legal emitter, as expressed in (3).
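This per-class averaging of encoder outputs, i.e. Eq. (3), is a one-liner in NumPy; the toy features and labels in the test are illustrative.

```python
import numpy as np

def semantic_centers(features, labels, num_classes):
    """Eq. (3): s_k is the mean encoder feature over training samples of class k."""
    return np.stack([features[labels == k].mean(axis=0)
                     for k in range(num_classes)])
```

The resulting (num_classes, feature_dim) matrix is the set S used by the detection rule that follows.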

III-D2 Calculating the distance between semantic features of radio signals of unknown emitters and the known semantic center features

Radio signals from unknown emitters are fed into the encoder of the proposed RED method to mine semantic features. The distance between the semantic features of radio signals from unknown emitters and the known semantic center features is calculated, and the formula[19] is expressed as follows:

d(f_en(x_m), S_k) = √((f_en(x_m) − S_k)^T A_k^{−1} (f_en(x_m) − S_k))    (13)

where f_en denotes the mapping function of the encoder of the proposed RED method, f_en(x_m) denotes the semantic features of the input signal, and d reduces to the Euclidean distance when A_k^{−1} is the identity matrix.
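Eq. (13) can be sketched directly in NumPy; the vectors and the scaling matrix in the test are illustrative.

```python
import numpy as np

def feature_distance(z, s_k, A_k=None):
    """Eq. (13): sqrt((z - s_k)^T A_k^{-1} (z - s_k)).

    With A_k = None the identity matrix is assumed, so the distance
    reduces to the Euclidean distance, as noted in the text.
    """
    d = z - s_k
    if A_k is None:
        return np.sqrt(d @ d)
    return np.sqrt(d @ np.linalg.inv(A_k) @ d)
```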

III-D3 Comparing the distance with threshold

x_m ∈ known,   if min_{S_k ∈ S} d(f_en(x_m), S_k) ≤ λ√(3t)
x_m ∈ unknown, if min_{S_k ∈ S} d(f_en(x_m), S_k) > λ√(3t)    (16)

where λ is a hyperparameter and t is the dimensionality of the semantic center features. If min_{S_k ∈ S} d(f_en(x_m), S_k) is less than or equal to the threshold, the input signal is judged to belong to a known class; otherwise, it is judged to belong to the rogue class. The threshold is inspired by the three-sigma rule [25].
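The decision rule in (16) amounts to a nearest-center threshold test; the sketch below uses the Euclidean distance (identity A_k), and the centers, λ, and query points in the test are illustrative.

```python
import numpy as np

def detect_rogue(z, centers, lam):
    """Eq. (16): known if min_k d(z, s_k) <= lam * sqrt(3t), else rogue.

    t is taken as the dimensionality of the center features, following
    the three-sigma-rule-inspired threshold in the text.
    """
    t = centers.shape[1]
    d_min = np.min(np.linalg.norm(centers - z, axis=1))
    return "known" if d_min <= lam * np.sqrt(3 * t) else "rogue"
```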

Figure 3: The details of the proposed RED method.

IV Experimental Results

IV-A Simulation Parameters

Our simulations are performed on an NVIDIA GeForce GTX 1080 Ti platform with PyTorch 1.8.1. The maximum number of epochs T is 150 and the batch size B is 16. We use a dynamic learning rate: the initial learning rate is 0.001 and is reduced to one-tenth whenever the validation loss does not drop for 10 epochs. Adam is selected as the optimizer. The weighting coefficients λ_CE, λ_MSE, and λ_ML are set to 1, 0.5 and 0.005, respectively. The threshold hyperparameter λ ranges from 0.2 to 0.5 with a step size of 0.05.
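The learning-rate schedule described above corresponds to PyTorch's ReduceLROnPlateau with factor 0.1 and patience 10; a minimal sketch (the single dummy parameter and the constant validation loss are illustrative):

```python
import torch

param = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.Adam([param], lr=1e-3)          # initial learning rate 0.001
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, mode="min",
                                                   factor=0.1, patience=10)
for epoch in range(15):
    val_loss = 1.0        # stand-in: validation loss never improves
    sched.step(val_loss)  # after >10 bad epochs, lr is cut to one-tenth
lr_now = opt.param_groups[0]["lr"]
```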

IV-B Dataset Description

The datasets proposed in [22] and [23] are used to evaluate the proposed RED method. The dataset in [22] is a real radio signal dataset based on the airborne monitoring system ADS-B. We randomly choose a subset containing 10 classes of aircraft, where the ratio of known classes to rogue classes is 9:1 and the number of samples is 3,736. The dataset in [23] is collected from 16 bit-similar USRP X310 transmitters emitting IEEE 802.11a standards-compliant frames generated via the MATLAB WLAN System Toolbox, where the ratio of known classes to rogue classes is 15:1 and the number of samples is 53,344. Each sample of both datasets has 6,000 sampling points in In-phase/Quadrature (IQ) format.

IV-C Evaluation Criteria

The receiver operating characteristic (ROC) curve is used to evaluate RED performance. The simulation results are plotted in Fig. 4, Fig. 5, and Fig. 6 and analyzed later. The horizontal axis is the false positive rate (FPR), the probability that the model determines a rogue device to be a known device. The vertical axis is the true positive rate (TPR), the probability that the model determines a known device to be a known device. The mathematical expressions are as follows:

TPR = TP / (TP + FN),  FPR = FP / (FP + TN),    (17)

where the meaning of each variable is shown in Table III. The area under the curve (AUC) is used to quantify the ROC, where a larger AUC means better RED performance. The mathematical expression of the AUC is given as

AUC = ∫ TPR d(FPR).    (18)

Silhouette coefficient (SC) is used to characterize the discrimination of the semantic features.
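Given per-sample detection scores, the AUC in (18) can be computed with the trapezoidal rule as sketched below; the label convention and the tiny score arrays in the test are illustrative, and ties between scores are not handled.

```python
import numpy as np

def roc_auc(scores, labels):
    """Eq. (18): AUC = integral of TPR d(FPR), via the trapezoidal rule.

    labels: 1 = known emitter (positive), 0 = rogue; a higher score means
    the detector is more confident the sample is 'known'.
    """
    order = np.argsort(-scores)              # sweep threshold from high to low
    lab = labels[order]
    tpr = np.concatenate([[0.0], np.cumsum(lab) / lab.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(1 - lab) / (1 - lab).sum()])
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```

A perfect detector scores every known sample above every rogue sample and yields AUC = 1, while a fully inverted detector yields AUC = 0.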

TABLE III: The meaning of TPR and FPR variables.
        | Prediction: 1      | Prediction: 0
Real: 1 | True Positive (TP) | False Negative (FN)
Real: 0 | False Positive (FP) | True Negative (TN)

IV-D Comparative RED Methods

We compare the proposed RED method with several RED methods, including SR2CNN[19], IsolationForest[16], and LocalOutlierFactor [15].

Figure 4: RED performances of the proposed RED method at different SNR.
Figure 5: The proposed RED method vs. comparative RED methods when SNR = 0 dB.
Figure 6: Ablation experiments of proposed method on ADS-B dataset when SNR = 0 dB.

IV-E Rogue Emitter Detection Performance

IV-E1 RED Performance of the proposed RED method at Different SNR

To demonstrate the noise robustness of the proposed method, we analyze its detection performance at SNR = {0, 20, 30} dB. As shown in Fig. 4 and Table IV, the ROC curve of the proposed method at SNR = 0 dB is similar to that at SNR = 30 dB, which demonstrates that the proposed method is robust across different noise levels.

TABLE IV: The AUC of the proposed method under different SNRs.
SNR/dB | 0 | 20 | 30
AUC (ADS-B) | 0.929 | 0.906 | 0.926
AUC (IEEE 802.11) | 0.920 | 0.987 | 0.956

IV-E2 Proposed RED method vs. Comparative RED Methods

To further demonstrate the noise robustness of the proposed RED method, the detection performances of the proposed and comparative methods at SNR = 0 dB are shown in Fig. 5 and Table V. The ROC curve of the proposed method is higher than those of the comparative methods, and its AUC is improved by 0.106, 0.359, and 0.431 on the ADS-B dataset and by 0.113, 0.207, and 0.417 on the IEEE 802.11 dataset, respectively, compared with the comparative methods, which shows that the proposed method has better noise robustness.

To analyze the discrimination of the semantic features, t-distributed stochastic neighbor embedding (t-SNE) [24] is used to reduce the extracted semantic features to two dimensions, and the semantic features extracted by the proposed RED method and by SR2CNN on the ADS-B dataset are compared. As shown in Fig. 7, the semantic features extracted by the proposed RED method have higher feature discrimination than those extracted by SR2CNN at low SNR.

TABLE V: The AUC of the proposed method and other methods.
Method | Ours | SR2CNN | IsolationForest | LOF
AUC (ADS-B) | 0.929 | 0.823 | 0.57 | 0.498
AUC (IEEE 802.11) | 0.920 | 0.807 | 0.713 | 0.503
(a) Our proposed
(b) SR2CNN.
Figure 7: The feature visualization of the proposed RED method and SR2CNN testing on ADS-B dataset when SNR =0=0 dB, and the red box indicates the features of rogue category. Note that the SCs of proposed RED method and SR2CNN are 0.43210.4321 and 0.29550.2955, respectively.

IV-E3 Ablation Experiment

The objective function of the proposed RED method contains three loss functions, i.e., the CE loss, the MSE loss, and the ML loss. To verify the necessity of each loss, we perform ablation experiments on the ADS-B dataset at an SNR of 0 dB. As shown in Fig. 6 and Table VI, without the regularization of the MSE loss, the detection performance of the proposed method decreases, which indicates that the MSE loss gives the model better noise robustness. As shown in Fig. 6 and Figs. 8(a) and 8(c), without the ML loss, the detection performance decreases, which indicates that the ML loss gives the model better feature discrimination. In addition, the ablation experiments show that the MSE loss improves not only the noise robustness of the model but also the feature discrimination, and the ML loss likewise improves noise robustness as well as feature discrimination. The proposed method achieves the best rogue detection performance when all three losses are used simultaneously.

TABLE VI: The AUC of the ablation experiments.
Method | Proposed | Proposed − CE | Proposed − ML | Proposed − MSE
AUC (ADS-B) | 0.929 | 0.570 | 0.146 | 0.828
(a) Our proposed
(b) Our proposed-CE.
(c) Our proposed-ML.
(d) Our proposed-MSE.
Figure 8: The feature visualization of ablation experiments on ADS-B dataset, and the red box indicates the features of the rogue category. Note that the SCs of ablation experiments are 0.43210.4321, 0.1979-0.1979, 0.0308-0.0308, and 0.42900.4290, respectively.

V Conclusion

In this paper, we proposed a robust RED method with strong noise robustness and high feature discrimination. Specifically, a denoising autoencoder is used to reconstruct the original signal from the noisy signal, and an objective function consisting of the CE loss, the MSE loss, and the ML loss is designed to extract highly discriminative semantic features while saving feature space. The proposed RED method was evaluated on an open-source real-world ADS-B dataset and an IEEE 802.11 dataset and compared with three RED methods. The simulation results showed that the proposed method has better noise robustness and feature discrimination.

References

  • [1] D. C. Nguyen, M. Ding, et al., “6G internet of things: A comprehensive survey,” IEEE Internet Things J., vol. 9, no. 1, pp. 359–383, Jan. 2022.
  • [2] M. Vaezi, A. Azari, et al., “Cellular, wide-area, and non-terrestrial IoT: A survey on 5G advances and the road toward 6G,” IEEE Commun. Surv. Tutorials, vol. 24, no. 2, pp. 1117–1174, Secondquarter 2022.
  • [3] Z. Na, Y. Liu, J. Shi, C. Liu, and Z. Gao, “UAV-supported clustered NOMA for 6G-enabled internet of things: Trajectory planning and resource allocation,” IEEE Internet Things J., vol. 8, no. 20, pp. 15041–15048, Oct. 2021.
  • [4] L. Chettri and R. Bera, “A comprehensive survey on internet of things (IoT) toward 5G wireless systems,” IEEE Internet Things J., vol. 7, no. 1, pp. 16–32, Jan. 2020.
  • [5] N. Wang, et al., “Physical-layer security of 5G wireless networks for IoT: Challenges and opportunities,” IEEE Internet Things J., vol. 6, no. 5, pp. 8169–8181, Oct. 2019.
  • [6] Q. Chen, W. Meng, S. Han, C. Li, H.-H. Chen, “Robust task scheduling for delay-aware IoT applications in civil aircraft-augmented SAGIN,” IEEE Trans. Commun., vol. 70, no. 8, pp. 5368–5385, Aug. 2022.
  • [7] N. Soltanieh, Y. Norouzi, Y. Yang, and N. C. Karmakar, “A review of radio frequency fingerprinting techniques,” IEEE J. Radio Freq. Identif., vol. 4, no. 3, pp. 222–233, Sep. 2020.
  • [8] F. Meneghello, et al., “IoT: Internet of threats? A survey of practical security vulnerabilities in real IoT devices,” IEEE Internet Things J., vol. 6, no. 5, pp. 8182–8201, Oct. 2019.
  • [9] Y. Xing, et al., “Design of a robust radio-frequency fingerprint identification scheme for multimode LFM radar,” IEEE Internet Things J., vol. 7, no. 10, pp. 10581–10593, Oct. 2020.
  • [10] X. Fu, et al., “Lightweight automatic modulation classification based on decentralized learning,” IEEE Trans. Cogn. Commun. Netw., vol. 8, no. 1, pp. 57–70, Mar. 2022.
  • [11] Y. Wang, et al., “An efficient specific emitter identification method based on complex-valued neural networks and network compression,” IEEE J. Sel. Areas Commun., vol. 39, no. 8, pp. 2305–2317, Aug. 2021.
  • [12] S. Chang, R. Zhang, K. Ji, S. Huang and Z. Feng, “A hierarchical classification head based convolutional gated deep neural network for automatic modulation classification,” IEEE Trans. Wireless Commun., vol. 21, no. 10, pp. 8713–8728, Oct. 2022.
  • [13] X. Zha, H. Chen, T. Li, Z. Qiu, and Y. Feng, “Specific emitter identification based on complex Fourier neural network,” IEEE Commun. Lett., vol. 26, no. 3, pp. 592–596, Mar. 2022.
  • [14] N. Yang, B. Zhang, et al., “Specific emitter identification with limited samples: A model-agnostic meta-learning approach,” IEEE Commun. Lett., vol. 26, no. 2, pp. 345–349, Feb. 2022.
  • [15] M. M. Breunig, H. Kriegel, R. T. Ng, and J. Sander, “LOF: Identifying density-based local outliers,” in International Conference on Management of Data (SIGMOD), 2000, pp. 93–104.
  • [16] F. T. Liu, K. M. Ting and Z. Zhou, “Isolation forest,” in IEEE International Conference on Data Mining (ICDM), 2008, pp. 413–422.
  • [17] A. Bendale and T. E. Boult, “Towards open set deep networks,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 1563–1572.
  • [18] S. Akcay, A. Atapour-Abarghouei, P. Breckon, “GANomaly: Semi-supervised anomaly detection via adversarial training,” in Computer Vision (CV), 2019, pp. 622–637.
  • [19] Y. Dong, et al., “SR2CNN: Zero-shot learning for signal recognition,” IEEE Trans. Signal Process., vol. 69, pp. 2316–2329, Mar. 2021.
  • [20] Y. Wen, et al., “A discriminative feature learning approach for deep face recognition,” in Computer Vision (CV), 2016, pp. 499-515.
  • [21] J. Yu, et al., “Radio frequency fingerprint identification based on denoising autoencoders,” in International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), 2019, pp. 1–6.
  • [22] Y. Tu, et al., “Large-scale real-world radio signal recognition with deep learning,” Chin. J. Aeronaut., vol. 35, no. 9. pp. 35–48, Sep. 2022.
  • [23] K. Sankhe, et al., “ORACLE: Optimized radio classification through convolutional neural networks,” in IEEE Conference on Computer Communications (INFOCOM), 2019, pp. 370–378.
  • [24] L. Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. 86, pp. 2579–2605, 2008.
  • [25] F. Pukelsheim, “The three sigma rule,” Amer. Statistician, vol. 48, no. 2, pp. 88–91, 1994.