Transfer-based Adversarial Poisoning Attacks
for Online (MIMO-)Deep Receviers

Kunze Wu, Weiheng Jiang, , Dusit Niyato, , Yinghuan Li, and Chuang Luo

Abstract

Recently, the design of wireless receivers using deep neural networks (DNNs), known as deep receivers, has attracted extensive attention for ensuring reliable communication in complex channel environments. To adapt quickly to dynamic channels, online learning has been adopted to update the weights of deep receivers with over-the-air data (e.g., pilots). However, the fragility of neural models and the openness of wireless channels expose these systems to malicious attacks. To this end, understanding these attack methods is essential for robust receiver design. In this paper, we propose a transfer-based adversarial poisoning attack method for online receivers. Without knowledge of the attack target, adversarial perturbations are injected to the pilots, poisoning the online deep receiver and impairing its ability to adapt to dynamic channels and nonlinear effects. In particular, our attack method targets Deep Soft Interference Cancellation (DeepSIC)[1] using online meta-learning. As a classical model-driven deep receiver, DeepSIC incorporates wireless domain knowledge into its architecture. This integration allows it to adapt efficiently to time-varying channels with only a small number of pilots, achieving optimal performance in a multi-input and multi-output (MIMO) scenario. The deep receiver in this scenario has a number of applications in the field of wireless communication, which motivates our study of the attack methods targeting it. Specifically, we demonstrate the effectiveness of our attack in simulations on synthetic linear, synthetic nonlinear, static, and COST 2100 channels. Simulation results indicate that the proposed poisoning attack significantly reduces the performance of online receivers in rapidly changing scenarios.

Index Terms:

Wireless securitys, poisoning attacks, adversarial attacks, model-based deep learning, deep receivers, online learning, meta-learning.

I Introduction

In recent years, the application of deep learning (DL) in designing wireless communication systems has garnered significant interest. Researchers have concentrated on employing DL in wireless receivers to enhance communication performance in complex channels and to bolster adaptability in dynamic environments [3, 4]. However, DL-based wireless applications face vulnerabilities to evasion and data poisoning attacks owing to the inherent openness of wireless channels and the fragility of neural models [5]. Investigating attack methodologies on deep receivers serves to elucidate their response under such threats, thereby facilitating the development of secure wireless DL systems, which forms the primary focus of this paper.

I-A DL-based Wireless Receivers and Related Applications

Till now, numerous studies have explored DL-based designs for wireless communication systems. Most of them utilize an independent DNN to map the input-output relationships of functional modules in communication links. Alternatively, some approaches jointly optimize modules at both the transmitter and receiver by multiple DNNs. Typical applications include DL-based adaptive modulation[6], channel estimation[7], channel coding and decoding[8], and modulation recognition[10]. Beyond replacing functional modules in the physical layer, various constraints can be integrated into the training process to optimize additional system metrics, such as the adjacent channel leakage ratio (ACLR) and peak-to-average power ratio (PAPR) [6]. In terms of receiver design, lots of work has contributed to enhancing its adaptability to the dynamic channels, including training multiple models as a deep ensemble [12] and joint learning [13, 15].

However, DL-based wireless designs mentioned above are data-driven methods that heavily depend on a substantial amount of training data to improve their generalization capacity. Given that deep receivers typically have access to only limited pilots for adaptation, this characteristic poses a significant challenge. In addition, data-driven designs may suffer from performance degradation when faced with data distribution drift caused by dynamic channels[16]. To handle these issues, an online learning based approach has been proposed, involving dataset, training algorithm and deep receivers architecture. In particular, data augmentation[17] and self supervision methods [18],[19] were proposed to to expand the training data for online adaptation. In [16] and [20], meta-learning was employed to improve the generalization capability. Furthermore, Model-based deep learning provide a solution for receivers architecture design to satisfy the adaptability and data efficiency[21]. Specifically, the deep receivers were explicitly modelled by incorporating wireless domain knowledge, thereby reducing the dependence on data, such as DNN-aided inference[15],[21] and deep unfolding [21, 15, 22, 1]. In these studies, [1] proposed a classic model-based deep receiver, i.e., the DeepSIC, in MIMO scenarios, derived from the iterative soft interference cancellation (SIC)[23] MIMO detection algorithm. It employed DNN instead of each round of interference cancellation and soft detection, requiring only a few iterations to achieve extremely low data dependence and optimal performance. [20] utilized meta-learning to improve the training performance of online DeepSIC, and the evaluation results indicated that its performance was improved compared with traditional data-driven receiver, and it exhibited commendable adaptability to dynamic channels.

I-B Security of DL in Wireless Communications

As mentioned earlier, while DL-based transceiver designs can enhance performance, they remain vulnerable to attacks by malicious users. In particular, attacks on DL-based transceivers are divided into two main categories, i.e., the evasion attacks and data poisoning attacks. Evasion attacks, also known as adversarial attacks, manipulate test data to mislead the model[5],[24]. On the other hand, data poisoning attacks corrupt the training data, affecting the model’s performance during testing[5], [24, 25, 26].

According to extensive literature review, numerous studies on DL-based wireless communication primarily concentrate on evasion attacks. For instance, [28] proposed generative adversarial network (GAN)-based method to generate adversarial perturbations for channel received data, which can unnoticeably mislead wireless end-to-end autoencoders, modulation pattern recognition, and the DL-based symbol detection in orthogonal frequency division multiplexing (OFDM) systems. [29] reported adversarial perturbations can interfere with gradient-based iterative optimization algorithms in the physical layer. [30] proposed semantic attacks against semantic communication. Furthermore, adversarial perturbations can also play a role in interpretable (e.g., deep unfolding-based) architectures. To illustrate, [31] employed transfer-based methods to attack deep sparse coding networks and demonstrated that these attacks exert deleterious effects on the various components of deep unfolded-based sparse coding. Regarding data poisoning attacks, current research primarily focuses on cognitive radio spectrum-aware poisoning [5],[32] and disrupting distributed wireless federated learning[33, 34].

I-C Contribution of This Paper

Unlike previous studies, this paper addresses security threats to online deep receivers. Furthermore, we propose a transfer-based adversarial poisoning attack method, which can significantly corrupt various online deep receivers even without prior knowledge of the target system. Specifically, we focus on online receivers based on model-based deep learning, such as DeepSIC[1] and Meta-DeepSIC[20], as well as general DL-based detectors, including the black-box DNN detector[15],[17] and the ResNet detector designed based on DeepRX[35].

As previously stated, DeepSIC is a classical model-based deep receiver that can be combined with meta-learning for efficient online adaptation. This design effectively tackles the challenge of limited pilot data in wireless communication scenarios, thereby improving the generalization of deep receivers under dynamic channel conditions. Moreover, studies on the attack methods for DeepSIC can provide comprehensive insights into deep receiver characteristics and contribute to robust designs. Ultimately, this research aids in creating secure and efficient DL-enabled wireless communication systems.

Specifically, the mainly contributions of this paper are summarised as below.

$\bullet$

We highlight a communication system susceptible to malicious user poisoning attacks. We then analyze the vulnerability of the deep receiver based on online learning in the authorization system. From the perspective of malicious user, we further develop an attack utility model and an optimal attack utility decision problem.
$\bullet$

We effectively design a poisoning attack framework and attack perturbation generation method for online learning deep receivers. The fundamental concept is to introduce a poisoned sample into the online training and updating phase of the deep receiver, thereby compromising its performance over time. The poisoning attack framework has two stages. Firstly, malicious user employ joint learning to create a surrogate model, which can be selected from a generic DNN architecture, e.g., feedforward DNNs. Secondly, they generate poisoning perturbation samples based on the surrogate model. The transferability of the poisoning attack makes it work on different types of deep receivers.
$\bullet$

We numerically evaluate the effect of the proposed poisoning attack method on four channel models: Linear synthetic channel, nonlinear synthetic channel, static channel, and COST 2100 channel. Simulation results demonstrate that the proposed poisoning attack method impairs the deep receiver’s ability to adapt to rapid changes in dynamic channels and to learn from nonlinear effects. Furthermore, deep receivers adapted using meta-learning more severely damaged after poisoning.

The rest of the paper is organized as follows. Section II introduces the system and scenario models and the attack models of the malicious user. Section III presents the basic theory of adversarial machine learning, focusing on evasion attacks, data poisoning attacks, and the conceptual approaches to attack transferability. Section IV details the proposed poisoning attack framework and the method for generating poisoning attack samples for online deep receivers. Section V evaluates and analyzes the effectiveness of the proposed poisoning attack method. Section VI concludes the paper.

II System and scenario modelling

In this section, we first present the communication system and scenario model under the presence of a malicious poisoning attack user in Section II-A. Subsequently, we introduce the operational model of the legitimate receiver based on deep learning in Section II-B. Finally, we discuss the detail of malicious user poisoning attack, focusing on pilot poisoning attacks in Section II-C.

Refer to caption — Figure 1: Scenario modelling of communication systems with malicious user.

II-A Communication System Scenario Model

In this paper, we investigate a poisoning attack scenario model for a communication system, as illustrated in Fig. 1. This system consists of a pair of legitimate transceivers and a malicious poisoning attack user. Both the legitimate transmitter and receiver equip with multiple antennas, denoted as $N_{tx}$ and $N_{rx}$ respectively. We focus on a single-antenna malicious user, as this represents a cost-effective and straightforward approach to conducting attacks. The data transmission from transmitter to receiver is block-based, as illustrated in Fig. 2. The length of one block is $L$ , including $L_{\text{pilot}}$ pilot symbols and $L_{\text{info}}$ information symbols, herein, $L_{\text{info}}\gg L_{\text{pilot}}$ . As shown in Fig. 3, the legitimate receiver utilizes a DL-based architecture for signal receiving and processing. It is trained using pilot data and employs the trained deep receiver to decode information data. For the malicious user, based on the previously collected pilot data, it launches an attack by poisoning or disturbing the transmission process of the pilot used by the legitimate user. Its objective is to corrupt with the online training and updating of the deep receiver, thereby disrupting its information data reception and decoding.

Based on the above illustrated scenario, defining the transmit symbols by the legitimate transmitter as $\mathbf{s}\in\mathbb{R}^{N_{\mathrm{tx}}}$ , and the corresponding modulated symbols as $\mathbf{x}\in\mathbb{C}^{N_{\mathrm{tx}}}$ . These modulation symbols are then upconverted, amplified, transmitted through multiple antennas, and finally arrived at the receiver. Let $\mathbf{H}\in\mathbb{R}^{N_{\mathrm{rx}}\times N_{\mathrm{tx}}}$ denote the baseband equivalent channel matrix, and $\mathbf{w}\sim\mathcal{CN}\left(\mathbf{0},\sigma^{2}\ \mathbf{I}\right)$ represent the additive gaussian white noise experienced by the legitimate receiver. The equivalent baseband signal received by the legitimate receiver, in the absence of a malicious user poisoning attack, is $\mathbf{y}\in\mathbb{C}^{N_{\mathrm{rx}}}$ , which can be expressed as

\mathbf{y}=\mathbf{Hx}+\mathbf{w}\text{. }

(1)

Following that, defining a block of data symbols received by the authorised receiver as $\mathbf{Y}$ . The received data symbols can then be divided into two parts, denoted as $\mathbf{Y}_{\text{pilot}}=\left\{\mathbf{y}_{i}\right\}_{i=1}^{L_{\text{pilot}}}$ and $\mathbf{Y}_{\text{info}}=\left\{\mathbf{y}_{i}\right\}_{i=L_{\text{pilot}}+1}^{L}$ .

II-B Deep Receiver with Online Training

In this paper, the online learning based deep receiver is adopted by the legitimate receiver, as illustrated in Fig. 3. Here, define the deep receiver as a classifier $f_{\theta}$ with model parameter $\theta$ . $f_{\theta}$ is trained using a supervised learning approach. The data used for training is the pilot dataset, which is defined as $\mathcal{T}=\left\{\mathbf{Y}_{\text{pilot}},\mathbf{S}_{\text{pilot}}\right\}=\left\{\mathbf{y}_{i},\mathbf{s}_{i}\right\}_{i=1}^{L_{\text{pilot}}}$ . Model testing is done with information dataset, which is represented as $\mathcal{V}=\left\{\mathbf{Y}_{\text{info}},\mathbf{S}_{\text{info}}\right\}=\left\{\mathbf{y}_{i},\mathbf{s}_{i}\right\}_{i=L_{\text{pilot}}+1}^{L}$ . The supervised training loss function is the cross-entropy loss, which is represented as $\mathcal{L}(\mathcal{T};\theta)$ . $\hat{P}_{\theta}(\cdot|\cdot)$ denotes the likelihood probability of symbol estimation for deep receivers. The deep receiver training objective can be described by

\arg\underset{\theta}{\min}\left\{\mathcal{L}(\mathcal{T};\theta)=-\sum_{\left(\mathbf{y}_{i},\mathbf{s}_{i}\right)\subset\mathcal{T}}\log\hat{P}_{\theta}\left(\mathbf{s}_{i}\mid\mathbf{y}_{i}\right)\right\}\text{.}

(2)

The deep receiver, trained using $\mathcal{T}$ , is used to decode the symbols in $\mathcal{V}$ . For the $i$ -th received symbol $\mathbf{y}_{i}$ , the decoded result is expressed as $\hat{\mathbf{s}}_{i}\left(\mathbf{y}_{i};\theta\right)$ . The performance metric of the deep receiver is the symbol error rate (SER), which is defined as

SER(\theta)=\frac{1}{L_{\text{info}}}\sum_{i=L_{\text{pilot}}+1}^{L}\operatorname{Pr}\left(\hat{\mathbf{s}}_{i}\left(\mathbf{y}_{i};\theta\right)\neq\mathbf{s}_{i}\right)\text{.}

(3)

II-C Modus Operandi of the Malicious User

For the considered system, as mentioned earlier, there is a malicious user, which aims to corrupt the information receiving and decoding of the legitimate receiver, by poisoning on the pilot data transmission from the transmitter to the receiver. This is similar to [32, 33]. In particular, the attack process of the malicious user is shown in Fig. 4 and summarised as below.

Step 1:

Since the pilot pattern is fixed during transmission, the malicious user can collect pilot data during the communication process between the legitimate transceiver.
Step 2:

The accumulated pilot data is employed to train a surrogate model that is analogous to the attack target, specifically the authorised deep receiver.
Step 3:

The malicious user generates the optimal perturbation based on the surrogate model and the transferability of the attack.
Step 4:

The malicious user injects the channel perturbation. The deep receiver will gradually be poisoned until the model fails when it receives the perturbed pilot data used to train the model.

In principle, a poisoning attack perpetrated by a malicious user can be conceptualized as a perturbation injection process. In particular, the perturbation is defined as a vector in the complex space of receiver inputs, denoted as $\delta\in\mathbb{C}^{N_{rx}}$ , with a poisoning process represented by $\mathcal{P}(\cdot)$ . For the $i$ -th received symbol $\mathbf{y}_{i}$ , the corresponding poisoning perturbation is given by $\delta_{i}$ , and the corresponding poisoned received symbol is given by $\mathcal{P}\left(\mathbf{y}_{i}\right)=\mathbf{y}_{i}+\delta_{i}$ . Therefore, within the context of poisoning attacks on deep receivers, the primary challenge for malicious users is to design an optimal perturbation signal structure that maximizes the deep receiver’s loss on subsequent information symbols or validation sets, which will be addressed in the following sections.

III Adversarial Machine Learning Theory

Before introducing the proposed poisoning attack method for online deep receivers, we briefly discuss the theory of adversarial machine learning. Specifically, we provide a overview of the threat model presented in Section III-A, including the attacker’s goal, knowledge, and capability. These concepts and definitions facilitate a more comprehensive understanding of the proposed attack method for deep receivers. Subsequently, in Section III-B and Section III-C, we briefly elucidate the fundamental optimization issues associated with two distinct attack paradigms, namely evasion attacks and data poisoning attacks. Furthermore, we explain the differences and connections between these two attack paradigms. Finally, in Section III-D, we analyze the transferability of attack samples and discuss the methods to enhance their transferability.

III-A Threat Model

III-A1 Attacker’s Goal

As discussed in [24],[36],[37], the objective of the attacker in adversarial scenarios can be categorised according to the form of security threats, including integrity attacks and availability attacks.

$\bullet$

Integrity attacks: The attacker’s goal is to tamper the integrity of the target. Specifically, this implies that the attack samples generated by the attacker are only effective in certain parts of the target system, while the remainder of the target system retains its original functionality.
$\bullet$

Availability attacks: In contrast, availability attacks aim to disrupt the entire system, making it unavailable to legitimate users.

The main difference between integrity attacks and availability attacks lies in the focus of the optimization objectives of the attack models. In this paper, the proposed attack method is an availability attack, namely the destruction of the usability of all functions of a deep receiver.

III-A2 Attacker’s Knowledge

The attacker’s knowledge indicates the extent to which they are aware of the attack target. This knowledge encompasses several dimensions, as outlined in [24],[36]. These dimensions include: (i) The data utilized for training purposes. (ii) The architectural design of the target model, the learning algorithms employed during training, and their associated parameters and hyperparameters. (iii) The data comprising the test set. Based on these dimensions, two main attack scenarios can be defined:

$\bullet$

White-box attacks: The attacker has complete knowledge about the attack target. In this context, the attacker will adapt the nature of the attack to align with the specific characteristics of the target to achieve the most effective and impactful outcome.
$\bullet$

Black-box attacks: Black-box attacks can be further categorized into two main types: Transfer-based attacks and query attacks. In transfer-based attacks, the attacker has limited or no knowledge of the target model. The knowledge about the target encompasses the aforementioned dimensions (i), (ii), and (iii). In this setting, the attacker is limited to relying on the data they have collected to construct a surrogate model that approximates the target model. This attack is then transferred to the target model by launching a white-box attack on the surrogate model. In a black-box query attacks, the attacker can query the target’s output or confidence level to optimize the attack. Currently, the majority of black-box attacks exploit the transferability for attack purposes [24],[38, 39, 40, 41].

The discussion regarding the attacker’s knowledge aims to define the scenarios in which attacks are deployed, particularly in more practical black-box attacks, which are the focus of this paper. Moreover, within the framework of black-box attacks, the transferability of attack samples holds particular significance. This will be addressed in greater detail in Section III-D of this paper.

III-A3 Attacker’s Capability

The attacker’s capabilities determine the methods used to influence the attack target and the constraints for data manipulation. To avoid potential defense filtering mechanisms, the attacker typically imposes an upper bound of $\epsilon$ to the perturbation $\delta$ under the $p$ -norm space. From the perspective of attack methods[24], if the attacker can manipulate data from both the training and testing phases, which considered as causal attacks and is called data poisoning attacks. If the attacker can only manipulate the data during the testing phase, this attack is considered exploratory and is called evasion attacks. The difference lies in the optimization objectives and implementation methods. This paper focuses on adversarial poisoning attacks in black-box scenarios, which can be seen as a synthesis of evasion attacks and data poisoning attacks. The specific optimization goals and implementation forms of evasion attacks and data poisoning attacks are described in the following Section III-B and Section III-C, respectively.

III-B Evasion Attacks

The evasion attacks executed by generating adversarial samples aimed at causing misclassification during the testing phase of the DNN. This is achieved by identifying the network’s vulnerabilities and applying small, strategically crafted perturbations to the input, as illustrated in Fig. 5(a). Gradient-based optimizers [42],[43] are effective in determining these influential perturbations. Given a target model and input data, the gradient of the objective function is used to guide the application of minor perturbations, which maximizes the loss induced on the input. This can be formulated as a single-layer optimization problem. Specifically, let $\mathbf{y}$ represent the model input, $\mathbf{s}$ the corresponding labels, and $\mathbf{y}^{\prime}$ the adversarial samples generated after adding perturbation. The adversarial perturbation, denoted as $\delta=\left\|\mathbf{y}^{\prime}-\mathbf{y}\right\|_{p}$ , is constrained by an upper bound $\epsilon>0$ , with the optimal adversarial sample $\mathbf{y}^{*}$ . The classifier is parameterized by $\theta$ , and for classification tasks, the cross-entropy loss function $\mathcal{L}\left(\mathbf{y}^{\prime},\mathbf{s};f\theta\right)$ is employed. Thus, the generation of optimal adversarial samples can be framed as the following optimization problem

\mathbf{y}^{*}\in\arg\underset{\mathbf{y}^{\prime}}{\max}\mathcal{L}\left(\mathbf{y}^{\prime},\mathbf{x};\theta\right)\text{, such that }\delta=\left\|\mathbf{y}^{\prime}-\mathbf{y}\right\|_{p}\leq\epsilon\text{. }

(4)

III-C Data Poisoning Attacks

The optimization goal of the data poisoning attacks is to poison the target with poisoned training data to degrade its test performance, which is shown in Fig. 5(b). Similar to the generation of adversarial samples, in data poisoning attacks, a perturbation $\delta$ is applied to each sample in the training set under the $p$ -norm constraint, ensuring that the perturbation magnitude does not exceed an upper bound $\epsilon$ . Specifically, the test set $\mathcal{V}$ and the training set $\mathcal{T}$ are defined, as well as the poisoned training set $\mathcal{P}(\mathcal{T})$ after applying the perturbation to the data set. Furthermore, $\theta^{*}$ denotes the optimal poisoning parameter of the target classifier with respect to the parameter $\theta$ . Thus, the poisoning attack can be modelled as a dual optimization problem as follows:

\max_{\mathcal{P}}\mathcal{L}\left(\mathcal{V};\theta^{*}\right)\text{, and }\theta^{*}\in\arg\min\mathcal{L}(\mathcal{P}(\mathcal{T});\theta){.}

(5)

Herein, firstly, the inner layer optimization involves the standard model training process. In this process, the attacker uses the poisoned data to train the target model by minimizing the empirical loss $\mathcal{L}(\mathcal{P}(\mathcal{T});\theta)$ to obtain the optimal poisoned parameter $\theta^{*}$ . Secondly, based on the obtained $\theta^{*}$ , the attacker maximizes the loss $\mathcal{L}(\mathcal{V};\theta^{*})$ on the test set. Note that solving this optimization problem directly is often very difficult, and it is more common practice to approximate this maximization process through gradient optimization of $\delta$ .

III-C1 Adversarial Samples as Poisoning Attacks

As previously stated in Section III-B and Section III-C, although both construction of evasion attacks and poisoning attacks can be framed as gradient-based optimization problems, the goals achieved by constructing perturbations for these two types of attacks are different. Recently, however, researchers have discovered that adversarial samples are also highly effective for poisoning DNNs, a phenomenon known as Adversarial Poisoning [44],[45]. In this case, the poisoning attack optimization problem (5) can also be uniformly expressed in the form of the adversarial sample attack optimization problem (4) in Section III-B. [44] provides a method for creating adversarial poisoning samples to obtain optimal poisoning results. Additionally, [46] shows that adversarial poisoning attacks can also cause serious harm to the meta-learner in a white-box attack setting. Compared with the dual optimization process of the poisoning attack method in (5), using adversarial samples as poisoning attack samples is more convenient and practically feasible. This is also the basis for the poisoning attack method proposed in this paper.

III-D Transferability of Attacks

III-D1 Why Can Attacks Be Transferred?

Attack transferability means that the attack generated for a model may be equally effective for other models that have not seen that attack before. This phenomenon has been observed and demonstrated in [47]. The proposed attack method is a black-box attack. Therefore, it is crucial to analyze the source of the attack portability.

[24] presented an upper bound on the loss when black-box transfer occurs. Define $f_{\varphi}$ as the surrogate model with parameter $\varphi$ , $f_{\theta}$ as the target model with parameter $\theta$ , and $\mathcal{L}(\mathbf{y},\mathbf{s},\varphi)$ as the loss of the input $\mathbf{y}$ in $f_{\varphi}$ against the label $\mathbf{s}$ . In consideration of the transferability of evasion attacks (poisoning attacks also take the same form), the optimal adversarial sample, denoted as $\mathbf{y}^{*}$ is obtained by solving (4) on $f_{\varphi}$ , with the corresponding optimal perturbation, denoted as $\delta^{*}$ . To illustrate, consider the sphere space with $p=2$ and radius denoted as $\epsilon$ . The optimal adversarial perturbation obtained on $f_{\varphi}$ can be expressed in (6) as follows:

\delta^{*}=\epsilon\frac{\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\varphi)}{\left\|\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\varphi)\right\|_{2}}.

(6)

As a result, the loss of $\mathbf{y}$ on the target model, denoted as $\mathcal{L}(\mathbf{y},\mathbf{s},\theta)$ . Define $\Delta\mathcal{L}$ as the increase in loss of the input $\mathbf{y}^{*}$ compared to the input $\mathbf{y}$ on the target model. The upper bound of $\Delta\mathcal{L}$ on the target model can be described by

\Delta\mathcal{L}=\epsilon\frac{\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\varphi)^{\top}}{\left\|\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\varphi)\right\|_{2}}\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\theta)\leq\epsilon\left\|\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\theta)\right\|_{2}.

(7)

The left-hand side of the inequality in (7) represents the loss in the black-box attack scenario, while the right-hand side represents the loss in the white-box attack scenario. In the white-box attack scenario, i.e., when $f_{\varphi}=f_{\theta}$ , the inequalities in (7) becomes an equality. At this point, the attack achieves its upper bound and has optimal effectiveness.

Thus, the effectiveness of transferring an attack sample from the surrogate model to the target model is influenced by two factors: The intrinsic adversarial vulnerability of the target model (right-hand side of the inequality in (7)) and the complexity of the surrogate model used to optimize the attack (left-hand side of the inequality in (7)). The right-hand side of the inequality in (7) shows that a more vulnerable target model has a larger upper bound on the loss, represented by $\epsilon\left|\left|\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\theta)\right|\right|_{2}$ . Here, the intrinsic complexity of the model measures the learning algorithm’s ability to fit the training data. More complex models, like those without regularization or those prone to overfitting, have more intricate parameter spaces and rugged loss landscapes, making them sensitive to input perturbations. For robust models with smaller upper loss bounds, a successful attack requires a higher perturbation limit, reducing the likelihood of bypassing the system’s monitoring. This demonstrates the impact of the intrinsic adversarial vulnerability of the target model on transferability of attacks.

The complexity of the surrogate model used to optimize the attack depends on two main factors: The gradient alignment between the surrogate and the target, and the variance magnitude of the surrogate model’s loss function. These factors are particularly relevant for the left-hand side of the inequality in (7). When the surrogate has better gradient alignment with the target, such as higher gradient cosine similarity [24], the attack samples from the surrogate model exhibit better transferability. This is reflected in $\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\varphi)^{\top}\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\theta)$ , as shown on the left-hand side of the inequality in (7). Additionally, a surrogate model with low variance leads to a more stable optimization process, producing attack samples effective across different target models. In contrast, a large variance leads to an unstable optimization process, leading to attack samples may not match the target model, resulting in failure. Intuitively, on the left-hand side of the inequality in (7), high variance of the loss function increases the corresponding denominator term $\left|\left|\nabla_{\mathbf{y}}\mathcal{L}(\mathbf{y},\mathbf{s},\varphi)\right|\right|_{2}$ , which results in smaller upper bounds on the achievability of the transfer.

III-D2 Related Work of Transfer-based Attacks

In summary, transfer-based attacks can be approached from two perspectives: The inherent adversarial vulnerability of the target model and the complexity of optimizing the surrogate model.

$\bullet$

For the former, certain assumptions about the target model are typically necessary to model the attack objective. For example, [41] assumed unstable common weakness in the model ensemble. In this scenario, stronger constraints can be applied to the optimization target, such as sharpness-aware minimization[41] or momentum methods [43], to make the transfer attack more effective. However, the intrinsic adversarial vulnerability of unknown target is often difficult to identified directly.
$\bullet$

For the latter, approaches include using ensembled surrogate model [47], self-ensembled method [39], and data augmentation [39, 38]. [40] suggested alternating different training paradigms (e.g., unsupervised and self-supervised) to enhance transferability in poisoning attacks. These methods aim to build more generalized and robust surrogate models, which are then optimized to obtain more effective and transferable attack samples.

IV A poisoning attack framework
for online deep receivers

As mentioned earlier, in the scenarios discussed in this paper, legitimate deep receivers adapt to fast-varying wireless channels by utilizing pilot and updating model parameters through online learning. However, this online update mechanism, designed to adapt to local channel variations, is vulnerable to sample input poisoning.

In the attack framework of the malicious user, the legitimate online deep receiver becomes the target model, which is updated online to adapt to local channel variations for better performance. However, this process is inherently not robust and can result in overfitting [48]. From this point onwards, the attack target of this paper is similar to [49], which causes catastrophic forgetting of the target by constructing poisoned samples. According to the analysis in Section III-D, this overfitting is the source of the target model’s intrinsic adversarial vulnerability. This implies that a malicious user can effectively attack the target model by optimizing a surrogate model and generating corresponding adversarial perturbations.

Therefore, this section proposes a poisoning attack framework targeting online deep receivers. The core idea is to poison the model training and updating phases, resulting in a degraded model after a certain period. Specifically, based on the behavioral patterns of malicious user described in Section II-C, the poisoning attack framework and the method for generating the poisoning attack samples can be summarized in below 3 steps, as illustrated in Fig. 7.

Step 1:

The malicious user collects communication data (e.g., pilot) from the wireless channel and produces a joint learning dataset. This dataset is used to train a surrogate model for attacking the target (i.e., the legitimate deep receiver).
Step 2:

The malicious user optimization solution (4) generates an adversarial perturbation based on the surrogate model.
Step 3:

The malicious user injects the adversarial perturbation onto the channel, causing the deep receiver to receive the poisoned pilot. Consequently, the deep receiver undergoes online learning with the perturbed pilot, resulting in the model being poisoned and deactivated.

The specific details of Steps 1 and 2 are detailed in the following Sections IV-A to IV-C.

IV-A Surrogate Model Selection

As discussed in Section III-D, black-box attacks rely on the gradient alignment between the surrogate and target models to ensure generalization. To achieve this, it is crucial to avoid using surrogate models that are too specialized for specific tasks. This strategy improves compatibility with different deep receivers, enhancing the attack’s effectiveness. Therefore, this paper employs a generic DNN architecture, such as feedforward neural networks, as the surrogate model, as shown in Step 1 of Fig. 6.

IV-B Joint Learning for Training of Surrogate Models

From the perspective of a malicious user, it is essential to select a suitable surrogate model architecture similar to the target model and to effectively train and optimize the surrogate model. In the proposed attack framework, joint learning is used to train the surrogate model. This approach utilizes data collected under various channel conditions to train the DNN, enabling it to adapt to dynamic channels [13, 15]. Unlike legitimate communicating parties that have to use online learning methods to adapt to time-varying wireless channels, the malicious user can pre-collect large datasets to train a robust model. Furthermore, the use of joint learning, instead of using channel state information or online learning [15], addresses the issue of attack generality at the data level, avoiding over-specialized surrogate models in black-box attack transfers, as discussed in Section IV-A. Joint learning also satisfies the data volume requirements of deep learning, making the surrogate model more robust than the target model, which uses online learning. This results in attack patterns with more stable and effective attack results. Conversely, the robust learning process of the target model mitigates the impact of poisoning, as demonstrated by pilot size adjustments in Section V-E.

Specifically, as shown in Step 2 of Fig. 6, the malicious user employs joint learning to train a DNN (i.e., the surrogate model) utilizing a large amount of communication data (e.g., pilot) amassed from legitimate transceivers. The DNN learns a mapping applicable to most channel states, reflecting the input-output relationship of the target deep receivers. The training data includes two types: communication data from different channel distributions and data from varying signal-to-noise ratio (SNR) conditions within the same channel. Finally, to generate effective adversarial poisoning samples, suitable attack generation methods must be considered, as detailed in Section IV-C.

IV-C Adversarial Poisoning Attack Samples Generation

Algorithm 1 Sample Generation Method for Poisoning Attack Based on PGD Algorithm.

Input: Pilot dataset

\mathcal{T}_{\text{pilot}}=\{\mathbf{y},\mathbf{s}\}

, model parameter

\varphi

, PGD iteration count

Q

, perturbation step size

\gamma

, loss function

\mathcal{L}(\mathcal{T};\varphi)

, input data perturbation bounds

I_{max}/I_{min}

, perturbation limit

\epsilon

Output: Poisoned dataset

\mathcal{P}(\mathcal{T})\equiv\left\{\mathbf{y}^{*},\mathbf{s}\right\}

Initialize poisoned dataset:

\mathcal{P}(\mathcal{T})=\emptyset

for each symbol

\mathbf{y}

in the pilot dataset

T_{\text{pilot}}=\{\mathbf{y},\mathbf{s}\}

\delta\sim U(-\epsilon,\epsilon)

\mathbf{y}^{*}=\text{clip}(\mathbf{y}+\delta,I_{min},I_{max})

for

i\in\{1,\ldots,Q\}

\delta\leftarrow\text{sgn}\left(\nabla_{\mathbf{y}}\mathcal{L}\left(\mathbf{y^{*}};s;f_{\varphi}\right)\right)

\mathbf{y}^{*}\leftarrow\text{clip}(\mathbf{y}+\gamma\delta,I_{min},I_{max})

\mathbf{y}^{*}\leftarrow\mathbf{y}+\text{clip}(\mathbf{y}^{*}-\mathbf{y},-\epsilon,\epsilon)

end for

\mathcal{P}(\mathcal{T})=\mathcal{P}(\mathcal{T})\cup\{\mathbf{y}^{*},\mathbf{s}\}

end for

Once the optimized surrogate model has been obtained, the malicious user generates the necessary poisoning attack samples in order to execute the attack. As illustrated in Step 2 of Fig. 6, we employ the projected gradient descent (PGD) algorithm [42], to generate the adversarial poisoning attack perturbations discussed in Section III-C. The whole algorithmic flow of generating perturbations is shown in Algorithm 1, and summarised as below.

Step 1:

Obtain the pilot dataset $\mathcal{T}_{\text{pilot}}\equiv\{\mathbf{y},\mathbf{s}\}$ . Then, within an interval $[-\epsilon,+\epsilon]$ defined by the upper bound of the $p$ -norm constraints, use uniformly distributed sampling to generate a randomly initialized perturbation vector $\delta$ .
Step 2:

Superimpose the $\delta$ on the pilot data. The perturbed data should then be fed into the surrogate model $f_{\varphi}$ to calculate the attack loss.
Step 3:

$\delta$ updates in the gradient direction of loss with step size $\gamma$ , and $\mathbf{y}+\gamma\delta$ remains within the specified upper bound $I_{max}$ , and lower bound $I_{min}$ .
Step 4:

The Step 2 and Step 3 iteratively repeat the $Q$ rounds to obtain poisoned pilot data with the optimal attack perturbation.

Once the iteration is complete, the optimal perturbation will be applied to the current block’s pilot to generate the poisoned pilot dataset, denoted as $\mathcal{P}(\mathcal{T})$ . This dataset will then be received by the deep receiver and poisoned during the training and updating of the model, as illustrated in Step 3 of Fig. 6.

V NUMERICAL EVALUATIONS

In this section, we numerically evaluate the proposed poisoning attack aimed at disrupting the deep receiver’s online adaptation process. First, we outline the parameter settings used in the simulations, including channel models, deep receivers, online training methods, and the poisoning attack method (Sections V-A to V-D). Next, simulation results are presented for deep receivers under the following conditions: linear and nonlinear time-varying synthetic channels, a linear static synthetic channel, and the time-varying COST 2100 channel (Section V-E). Lastly, the experimental results across all four channel settings are summarized and discussed in Section V-F.

V-A Evaluated Channel Models

The deep receiver operates on a discrete memoryless MIMO channel. The number of transmitting and receiving antennas is set to $N_{tx}$ $=$ $N_{rx}$ $=$ 4. The experimental evaluation channel model comprises synthetic channels [20] and COST 2100 channels [50]. Fig. 7 illustrates the four channel tapping coefficients for a randomly selected user in a multiuser system over 100 blocks of data transmission. In the context of linear channels, the input-output relationship is expressed by the (1) given in Section II-A. In the context of a nonlinear channel model [20], the input-output relationship is represented by

\mathbf{y}=\tanh(k(\mathbf{Hx}+\mathbf{w})),

(8)

where the $\tanh(\boldsymbol{\cdot})$ function is used to simulate the non-linear variations in the transceiver process due to non-ideal hardware, with the parameter $k=0.5$ .

V-B Evaluated Deep Receivers

We consider three deep receiver architectures in the evaluated network architecture: Namely, the model-based deep receiver DeepSIC [1], the black-box DNN detector [15],[17], and the ResNet detector designed based on DeepRX [35].The relevant details are as follows:

$\bullet$

DeepSIC: DeepSIC unfolds the traditional SIC iterative process and replaces each iteration with a sub-neural network to improve performance. Each sub-neural network enhances the reliability of the current estimation using received symbols and the output confidence from the previous iteration. This design enables DeepSIC to achieve high reliability even with limited training data[1, 20]. In this paper, DeepSIC includes 3 iterations, resulting in a total of $3\times N_{tx}$ sub-networks. Each sub-network is a two-layer fully connected layer network. The first layer has a dimension of $(N_{rx}+N_{tx}-1)\times 64$ , and the second layer has a dimension of $64\times|S|$ , where $|S|$ is the size of the set of symbols to be transmitted. For example, $|S|=4$ when using QPSK transmission. The activation function employed in the initial layer of each subnetwork is ReLU, while the second layer utilizes softmax classification.
$\bullet$

Black-box DNN Detector: The black-box DNN architecture consists of 4 fully-connected layers and a softmax classification header. The dimensions of the layers are $N_{rx}\times 60,60\times 60,60\times 60$ and $60\times|S|^{N_{tx}}$ , respectively.
$\bullet$

ResNet Detector: The ResNet detector employed in this paper comprises 10 layers of residual blocks, each containing 2 convolutional layers with 3 $\times$ 3 kernels, one-pixel padding on both sides, and no bias terms. A ReLU activation function is used between the layers, and each convolutional layer is followed by 2D batch normalization.

V-C Online Training Methods

TABLE I: The parameter configuration of the Online Training Methods

Parameters	Values
$L_{\text{pilot}}$	200
$L_{\text{info}}$	50000
$\eta$	$5\times 10^{-3}$
$\eta_{\text{meta}}$	$0.01$
Epochs	300
Optimizer	adam

The objective of this paper is to present an attack strategy, which attacks online adaptation of the deep receivers. Consequently, the focus is on the deep receiver’s online training, with the parameter configurations illustrated in Table I. The parameter settings for online training come from [15, 17]. Upon receipt of a data block, the deep receiver is only able to utilize a subset of the data block for training, like pilot symbols, specifically $L_{\text{pilot}}=200$ . Then the deep reciver predicts the subsequent $L_{\text{info}}=50000$ symbols. In the experiment, a total of 100 data blocks were transmitted, with the transmitted data being QPSK modulated, i.e., the user-transmitted symbols $\mathbf{s}$ were mapped to the set $C=\left\{\left(\pm\frac{1}{\sqrt{2}},\pm\frac{1}{\sqrt{2}}\right)\right\}^{4}$ . Furthermore, the deep receivers are trained using the adam optimizer. The training epochs are 300. The initial learning rate $\eta$ is set to 5 $\times 10^{-3}$ for ResNet detector, black-box DNN detector and DeepSIC. The meta learning rate $\eta_{\text{meta}}$ is set to 0.01 for Meta-DeepSIC. Additionally, online training is performed for different architectural receivers, which were implemented in the following two cases:

$\bullet$

Online learning: Based on the adaptation from the previous data block, the deep receiver trains and updates the current model using the limited pilot symbols received in the current data block.
$\bullet$

Online meta-learning: According to [20], [52], the meta-learning is employed to facilitate adaptation to the channel. Specifically, the pilot data from 5 data blocks is accumulated, and then meta-learning is performed on the this data to obtain the meta-learning weights for the deep receiver. Subsequently, the weights are then employed in an online learning process involving the pilot data of the current block.

Unless otherwise stated, the deep receivers are trained using online learning, including black-box DNN detector, ResNet detector, and DeepSIC. Only Meta-DeepSIC is trained using online meta-learning methods.

V-D Attacker’s Configuration

TABLE II: The parameter configuration of the Attack Methods

Parameters	Values
$p$	2
$\epsilon$	0.3
$\gamma$	0.01
$Q$	250
$L_{\text{sur}}$	5000
$SNR_{\text{sur}}$	{2,4,6,8,10,12,14,16}

The attack samples are designed based on the gradient of the surrogate model, with joint learning performed using the collected pilot dataset. The black-box DNN detector, presented in Section V-B, is employed as the surrogate model and contains three times the number of parameters compared to a single sub-network of DeepSIC. The joint learning parameters are based on [15]. In particular, the linear time-varying synthetic channel model is employed to generate the channel data for joint learning. Training data is generated under the condition of an SNR of 2 dB to 16 dB with an interval of 2 dB, denoted as $SNR_{\text{sur}}$ . In this manner, $L_{\text{sur}}=5000$ training pilot symbols are generated at each SNR value, and the surrogate model is trained in accordance with this procedure.

Moreover, the poisoning samples are iteratively optimized using the PGD algorithm, following the settings in [44], where the adversarial poisoning samples are most effective under this specific parameter configuration. The iteration step size is set to $\gamma=0.01$ , the iterations of PGD is $Q=250$ , and the perturbation upper bound under the $p=2$ norm is $\epsilon=0.3$ . The maximum and minimum values of the received symbol magnitude are denoted as $I_{max}$ and $I_{min}$ , respectively, and $I_{max}=max\{\|\mathbf{y}\|\}=-I_{min}$ . Fig. 8 shows the real and imaginary parts of the original and poisoned received symbols, while Fig. 9 presents these symbols in a constellation diagram.

V-E Numerical Results under Four Channel Models

This section presents the experimental results evaluating the proposed poisoning attack across various channel models. The method’s effectiveness is tested on linear and nonlinear time-varying synthetic channels, linear static synthetic channels, and time-varying COST 2100 channels. For the black-box DNN detector, a white-box poisoning attack is launched against this architecture receiver since its architecture is the same as the surrogate model. In the case of the ResNet detector, transfer poisoning attacks are implemented in black-box scenarios. Since DeepSIC contains multiple sub-networks, direct gradient information cannot be used for designing poisoning samples. Therefore, a transfer-based poisoning attack is employed, anticipating that perturbations designed on the surrogate model will transfer to DeepSIC.

V-E1 Linear Time-varying Synthesis Channel Results

Fig. 10(a) shows the results of 100 data transmission blocks, with the average cumulative SER calculated block-wise over five repetitions at an SNR of 14 dB. Meta-DeepSIC, combining model-based method with meta-learning, better captures the time-varying wireless channel characteristics in the absence of a poisoning attack, achieveing superior SER performance compared to the black-box DNN detector, ResNet detector and DeepSIC. However, Meta-DeepSIC is more susceptible to poisoning attacks than the DeepSIC. This aligns with the findings in [46]. Since this method targets the black-box DNN detector through a white-box attack approach, the black-box DNN detector performs the worst after poisoning. Meanwhile, the poisoning attack also significantly impacts the ResNet detector. Fig. 10(b) illustrates the results of the 100-block data transmission, with SER averaged over five repetitions under varying SNR conditions. The poisoning attack effectively degrades the performance of all four types of deep receivers at different SNRs. Specifically, the attack degrades SER by 0.68 dB for the ResNet detector, 0.5 dB for the black-box DNN detector, 0.67 dB for DeepSIC, and 0.91 dB for Meta-DeepSIC at SNR = 10 dB. As the SNR increases, the attack effect becomes more pronounced, with the SER deteriorations of the four receiver architectures reaching 0.68 dB, 0.43 dB, 1.40 dB, and 2.0 dB, respectively, at SNR = 15 dB.

V-E2 Non-linear Time-varying Synthetic Channels Results

Fig. 11 evaluates the effectiveness of the proposed poisoning attack under a nonlinear time-varying synthetic channel, using the same experimental setup as in Fig. 10. The results indicate a stronger poisoning effect in the nonlinear channel compared to the linear one. Specifically, the attack degrades SER by 0.93 dB for the ResNet detector, 0.53 dB for the black-box DNN detector, 1.35 dB for DeepSIC, and 1.42 dB for Meta-DeepSIC at SNR = 10 dB. As the SNR increases, the attack effect becomes more pronounced, with the SER deteriorations of the four receiver architectures reaching 1.13 dB, 0.71 dB, 2.74 dB, and 3.09 dB, respectively, at SNR = 15 dB.

V-E3 Linear Static Synthetic Channels Results

Fig. 12 evaluates the effectiveness of the proposed poisoning attack in a linear static synthetic channel, with the same setup as Fig. 10. The performance of DeepSIC and Meta-DeepSIC is nearly identical in this case. Furthermore, the lack of diverse data makes the data-driven detector unsuitable for adapting to the channel environment. For example, the black-box DNN detector struggles to adapt to the static channel, while the ResNet detector, which performed well in previous channels, performs the worst here, making the poisoning effect negligible. In particular, at SNR = 10 dB, the poisoning attack degrades the SER by 0.08 dB for the black-box detector, 0.71 dB for DeepSIC, and 0.72 dB for Meta-DeepSIC. As the SNR increases, the impact intensifies, with SER degradations of 0.06 dB, 1.05 dB, and 1.07 dB at SNR = 15 dB, respectively.

V-E4 Time-varying COST 2100 Channels Results

The effectiveness of the proposed poisoning attack is evaluated under the time-varying COST 2100 channel, as shown in Fig. 13. The experimental configuration is identical to that of the linear time-varying synthetic channel. Simultaneously, the surrogate model is trained on the joint channel data based on the time-varying linear channel model. As Fig. 13 illustrates, the poisoning attack proves effective across receivers with varying architectural and online training scenarios, while also adapting to different channel environments. At SNR = 10 dB, the attack degrades SER performance by 0.12 dB for the ResNet detector, 0.25 dB for the black-box DNN detector, 0.61 dB for DeepSIC, and 0.78 dB for Meta-DeepSIC. At SNR = 15 dB, the attack degrades SER performance by 0.04 dB for the ResNet detector, 0.23 dB for the black-box DNN detector, 2.11 dB for DeepSIC, and 1.36 dB for Meta-DeepSIC.

V-E5 The Impact of Pilot Size on Poisoning

The online learning of deep receivers is influenced by the training data, i.e., the pilot size $L_{\text{pilot}}$ . Larger pilot sizes generally enhance performance, leading to more stable learning and reduced overfitting. To explore the impact of overfitting on the poisoning effect, Fig. 14 examines the influence of pilot size in a linear time-varying synthetic channel at SNR = 14 dB. The results show that as $L_{\text{pilot}}$ increases, the effectiveness of poisoning attacks diminishes. Specifically, when comparing $L_{\text{pilot}}=200$ with $L_{\text{pilot}}=1000$ , the poisoning effects on the ResNet detector, black-box DNN detector, DeepSIC, and Meta-DeepSIC are reduced by 0.34 dB, 0.74 dB, 0.14 dB, and 0.34 dB, respectively. However, Larger pilot sizes cannot alleviate the poisoning effects on meta-learning methods, and the poisoned Meta-DeepSIC still performs worse than poisoned DeepSIC.

V-F Discussion of Results

The experimental results demonstrate that the proposed poisoning attack is effective across four different channel environments. Nevertheless, the precise impact of these attacks varies depending on the specific channel environment. Firstly, as illustrated in Fig.11, the poisoning effect observed in the nonlinear time-varying synthetic channel is markedly higher than in the other environments. The performance of the deep receivers subjected to a poisoning attack is severely degraded and approaches failure in this channel environment. This suggests that the poisoning attack is capable of impeding the deep receiver’s ability to adapt to the rapid changes in the channel’s effects and to learn the nonlinear effects.

Secondly, the results for the linear time-varying synthetic channel and the COST 2100 channel support this conclusion from different perspectives. In comparison to the synthetic channel, the tap coefficients of the COST 2100 channel demonstrate greater long-term variance, while exhibiting a relatively flat profile in the short term (e.g., between two blocks). As illustrated in Fig. 13(a), the impact of poisoning attacks gradually reduces receiver performance in the COST 2100 channel, indicating a sustained but limited exacerbation. In contrast, in the linear time-varying synthetic channel undergoing significant changes over time, the poisoning attack has a more pronounced effect on the receiver’s performance. This indicates that the attack has a impact on the deep receiver’s capacity to track long-term channel alterations and a more pronounced disruptive effect on short-term rapid adaptation. Notably, as shown in Fig. 14, a larger pilot size can mitigate the adverse effects of poisoning attacks, but this means more spectrum resources are consumed.

Finally, Meta-DeepSIC, which incorporates a meta-learning approach, demonstrates optimal performance with limited pilot data, particularly in fast-varying channels. Additionally, this learning capability is effective in nonlinear environments, but its increased sensitivity to poisoned samples leads to more significant performance deterioration compared to DeepSIC in fast-varying channels. In slow-varying or static channels, the performance of both Meta-DeepSIC and DeepSIC, with or without poisoning attacks, is more stable than in fast-varying channels, as shown in Fig. 12 and Fig. 13.

In conclusion, the proposed attack primarily impedes the receiver’s ability to learn rapid channel changes and non-linear effects in the short term, leading to performance degradation. Meanwhile, when deep receivers are combined with meta-learning, the poisoning effect is particularly pronounced (e.g., Meta-DeepSIC). This conclusion highlights the security risks associated with designing wireless receivers using online and online meta-learning methods, particularly in environments characterised by rapid channel changes and non-linear effects, where the system is particularly susceptible to poisoning attacks.

VI Conclusion

This paper proposes a transfer-based adversarial poisoning attack method for online deep receivers without the knowledge of the target. The fundamental concept is to corrupt the online training and updating phases of deep receivers in such a way that the model becomes compromised after a designated period of training, resulting in a decline in performance. The poisoning attack framework and the generation of poisoning attack samples comprise two steps. Initially, the malicious user acquires the surrogate model through the joint learning method. Subsequently, the poisoning attack perturbations are generated based on the surrogate model to poisoning the pilot. Simulation experiments on the proposed poisoning attack method under varying channel models demonstrate that it disrupts the adaptation of dynamic channls and learning of nonlinear effects. Meanwhile, the proposed attack can be effective against both model-based deep learning architectures and typical DNN-based receiver architectures. Meta-DeepSIC demonstrates optimal performance in fast-varying channels. However, it is particularly susceptible to poisoning attack samples, resulting in a notable decline in performance. It is therefore recommended that future research should concentrate on the development of efficient, robust and secure deep receiver architectures that are capable of defending against potential attacks, such as poisoning purification before learning or reducing the impact after poisoning[53], with a view to furthering the application of deep learning in wireless transceiver design and deep receiver deployment.

References

[1] N. Shlezinger, R. Fu, and Y. C. Eldar, “Deepsic: Deep soft interference cancellation for multiuser mimo detection,” IEEE Transactions on Wireless Communications, vol. 20, no. 2, pp. 1349–1362, 2020.
[2] N. Shlezinger, G. C. Alexandropoulos, M. F. Imani, Y. C. Eldar, and D. R. Smith, “Dynamic metasurface antennas for 6g extreme massive mimo communications,” IEEE Wireless Communications, vol. 28, no. 2, pp. 106–113, 2021.
[3] G. C. Alexandropoulos, N. Shlezinger, I. Alamzadeh, M. F. Imani, H. Zhang, and Y. C. Eldar, “Hybrid reconfigurable intelligent metasurfaces: Enabling simultaneous tunable reflections and sensing for 6g wireless communications,” IEEE Vehicular Technology Magazine, 2023.
[4] N. Shlezinger, Y. C. Eldar, and M. R. Rodrigues, “Asymptotic task-based quantization with application to massive mimo,” IEEE Transactions on Signal Processing, vol. 67, no. 15, pp. 3995–4012, 2019.
[5] D. Adesina, C.-C. Hsieh, Y. E. Sagduyu, and L. Qian, “Adversarial machine learning in wireless communications using rf data: A review,” IEEE Communications Surveys & Tutorials, vol. 25, no. 1, pp. 77–100, 2022.
[6] F. A. Aoudia and J. Hoydis, “Waveform learning for next-generation wireless communication systems,” IEEE Transactions on Communications, vol. 70, no. 6, pp. 3804–3817, 2022.
[7] H. Fu, W. Si, and I.-M. Kim, “Deep learning-based joint pilot design and channel estimation for ofdm systems,” IEEE Transactions on Communications, vol. 71, no. 8, pp. 4577–4590, 2023.
[8] N. Farsad, M. Rao, and A. Goldsmith, “Deep learning for joint source-channel coding of text,” in 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, 2018, pp. 2326–2330.
[9] T. Raviv, A. Goldman, O. Vayner, Y. Be’ery, and N. Shlezinger, “Crc-aided learned ensembles of belief-propagation polar decoders,” in ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024, pp. 8856–8860.
[10] Y. Liu, Y. Liu, and C. Yang, “Modulation recognition with graph convolutional network,” IEEE Wireless Communications Letters, vol. 9, no. 5, pp. 624–627, 2020.
[11] D. Korpi, M. Honkala, J. M. Huttunen, F. A. Aoudia, and J. Hoydis, “Waveform learning for reduced out-of-band emissions under a nonlinear power amplifier,” arXiv preprint arXiv:2201.05524, 2022.
[12] T. Raviv, N. Raviv, and Y. Be’ery, “Data-driven ensembles for deep and hard-decision hybrid decoding,” in 2020 IEEE International Symposium on Information Theory (ISIT). IEEE, 2020, pp. 321–326.
[13] T. O’shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563–575, 2017.
[14] M.-S. Baek, S. Kwak, J.-Y. Jung, H. M. Kim, and D.-J. Choi, “Implementation methodologies of deep learning-based signal detection for conventional mimo transmitters,” IEEE Transactions on Broadcasting, vol. 65, no. 3, pp. 636–642, 2019.
[15] T. Raviv, S. Park, O. Simeone, Y. C. Eldar, and N. Shlezinger, “Adaptive and flexible model-based ai for deep receivers in dynamic channels,” IEEE Wireless Communications, 2024.
[16] S. Park, H. Jang, O. Simeone, and J. Kang, “Learning to demodulate from few pilots via offline and online meta-learning,” IEEE Transactions on Signal Processing, vol. 69, pp. 226–239, 2020.
[17] T. Raviv and N. Shlezinger, “Data augmentation for deep receivers,” IEEE Transactions on Wireless Communications, vol. 22, no. 11, pp. 8259–8274, 2023.
[18] R. A. Finish, Y. Cohen, T. Raviv, and N. Shlezinger, “Symbol-level online channel tracking for deep receivers,” in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2022. [Online]. Available: http://dx.doi.org/10.1109/icassp43922.2022.9747026
[19] N. Shlezinger, N. Farsad, Y. C. Eldar, and A. J. Goldsmith, “Viterbinet: A deep learning based viterbi algorithm for symbol detection,” IEEE Transactions on Wireless Communications, p. 3319–3331, May 2020. [Online]. Available: http://dx.doi.org/10.1109/twc.2020.2972352
[20] T. Raviv, S. Park, O. Simeone, Y. C. Eldar, and N. Shlezinger, “Online meta-learning for hybrid model-based deep receivers,” IEEE Transactions on Wireless Communications, vol. 22, no. 10, pp. 6415–6431, 2023.
[21] N. Shlezinger, J. Whang, Y. C. Eldar, and A. G. Dimakis, “Model-based deep learning,” Proceedings of the IEEE, vol. 111, no. 5, pp. 465–499, 2023.
[22] A. Balatsoukas-Stimming and C. Studer, “Deep unfolding for communications systems: A survey and some new directions,” in 2019 IEEE International Workshop on Signal Processing Systems (SiPS). IEEE, 2019, pp. 266–271.
[23] W.-J. Choi, K.-W. Cheong, and J. M. Cioffi, “Iterative soft interference cancellation for multiple antenna systems,” in 2000 IEEE Wireless Communications and Networking Conference. Conference Record (Cat. No. 00TH8540), vol. 1. IEEE, 2000, pp. 304–309.
[24] A. Demontis, M. Melis, M. Pintor, M. Jagielski, B. Biggio, A. Oprea, C. Nita-Rotaru, and F. Roli, “Why do adversarial attacks transfer? explaining transferability of evasion and poisoning attacks,” in 28th USENIX security symposium (USENIX security 19), 2019, pp. 321–338.
[25] L. Muñoz-González, B. Biggio, A. Demontis, A. Paudice, V. Wongrassamee, E. C. Lupu, and F. Roli, “Towards poisoning of deep learning algorithms with back-gradient optimization,” in Proceedings of the 10th ACM workshop on artificial intelligence and security, 2017, pp. 27–38.
[26] Y. Wang and K. Chaudhuri, “Data poisoning attacks against online learning. arxiv 2018,” arXiv preprint arXiv:1808.08994.
[27] K. Zheng and X. Ma, “Designing learning-based adversarial attacks to (mimo-) ofdm systems with adaptive modulation,” IEEE Transactions on Wireless Communications, vol. 22, no. 9, pp. 6241–6251, 2023.
[28] A. Bahramali, M. Nasr, A. Houmansadr, D. Goeckel, and D. Towsley, “Robust adversarial attacks against dnn-based wireless communication systems,” in Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2021, pp. 126–140.
[29] E. Sofer and N. Shlezinger, “On the interpretable adversarial sensitivity of iterative optimizers,” in 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2023, pp. 1–6.
[30] G. Nan, Z. Li, J. Zhai, Q. Cui, G. Chen, X. Du, X. Zhang, X. Tao, Z. Han, and T. Q. Quek, “Physical-layer adversarial robustness for deep learning-based semantic communications,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 8, pp. 2592–2608, 2023.
[31] Y. Wang, K. Wu, and C. Zhang, “Adversarial attacks on deep unfolded networks for sparse coding,” in ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 5974–5978.
[32] Y. E. Sagduyu, Y. Shi, and T. Erpek, “Adversarial deep learning for over-the-air spectrum poisoning attacks,” IEEE Transactions on Mobile Computing, vol. 20, no. 2, pp. 306–319, 2019.
[33] T. Pang, X. Yang, Y. Dong, H. Su, and J. Zhu, “Accumulative poisoning attacks on real-time data,” Advances in Neural Information Processing Systems, vol. 34, pp. 2899–2912, 2021.
[34] G. Xia, J. Chen, C. Yu, and J. Ma, “Poisoning attacks in federated learning: A survey,” IEEE Access, vol. 11, pp. 10 708–10 722, 2023.
[35] M. Honkala, D. Korpi, and J. M. Huttunen, “Deeprx: Fully convolutional deep learning receiver,” IEEE Transactions on Wireless Communications, vol. 20, no. 6, pp. 3925–3940, 2021.
[36] A. E. Cinà, K. Grosse, A. Demontis, S. Vascon, W. Zellinger, B. A. Moser, A. Oprea, B. Biggio, M. Pelillo, and F. Roli, “Wild patterns reloaded: A survey of machine learning security against training data poisoning,” ACM Computing Surveys, vol. 55, no. 13s, pp. 1–39, 2023.
[37] J. Lin, L. Dang, M. Rahouti, and K. Xiong, “Ml attack models: adversarial attacks and data poisoning attacks,” arXiv preprint arXiv:2112.02797, 2021.
[38] C. Xie, Z. Zhang, Y. Zhou, S. Bai, J. Wang, Z. Ren, and A. L. Yuille, “Improving transferability of adversarial examples with input diversity,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 2730–2739.
[39] H. Huang, Z. Chen, H. Chen, Y. Wang, and K. Zhang, “T-sea: Transfer-based self-ensemble attack on object detection,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 20 514–20 523.
[40] Y. Liu, M. Backes, and X. Zhang, “Transferable availability poisoning attacks,” arXiv preprint arXiv:2310.05141, 2023.
[41] H. Chen, Y. Zhang, Y. Dong, X. Yang, H. Su, and J. Zhu, “Rethinking model ensemble in transfer-based adversarial attacks,” arXiv preprint arXiv:2303.09105, 2023.
[42] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” arXiv preprint arXiv:1706.06083, 2017.
[43] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li, “Boosting adversarial attacks with momentum,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 9185–9193.
[44] L. Fowl, M. Goldblum, P.-y. Chiang, J. Geiping, W. Czaja, and T. Goldstein, “Adversarial examples make strong poisons,” Advances in Neural Information Processing Systems, vol. 34, pp. 30 339–30 351, 2021.
[45] P. Sandoval-Segura, V. Singla, J. Geiping, M. Goldblum, and T. Goldstein, “What can we learn from unlearnable datasets?” Advances in Neural Information Processing Systems, vol. 36, 2024.
[46] E. T. Oldewage, J. Bronskill, and R. E. Turner, “Adversarial attacks are a surprisingly strong baseline for poisoning few-shot meta-learners,” in Proceedings on. PMLR, 2023, pp. 27–40.
[47] Y. Liu, X. Chen, C. Liu, and D. Song, “Delving into transferable adversarial examples and black-box attacks,” arXiv preprint arXiv:1611.02770, 2016.
[48] G. Shi, J. Chen, W. Zhang, L.-M. Zhan, and X.-M. Wu, “Overcoming catastrophic forgetting in incremental few-shot learning by finding flat minima,” Advances in neural information processing systems, vol. 34, pp. 6747–6761, 2021.
[49] E. Perry, “Lethean attack: an online data poisoning technique,” arXiv preprint arXiv:2011.12355, 2020.
[50] L. Liu, C. Oestges, J. Poutanen, K. Haneda, P. Vainikainen, F. Quitin, F. Tufvesson, and P. De Doncker, “The cost 2100 mimo channel model,” IEEE Wireless Communications, vol. 19, no. 6, pp. 92–99, 2012.
[51] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
[52] C. Finn, P. Abbeel, and S. Levine, “Model-agnostic meta-learning for fast adaptation of deep networks,” in International conference on machine learning. PMLR, 2017, pp. 1126–1135.
[53] W. Wang and S. Feizi, “Temporal robustness against data poisoning,” Advances in Neural Information Processing Systems, vol. 36, 2024.

Transfer-based Adversarial Poisoning Attacks for Online (MIMO-)Deep Receviers