
Improving the Shortest Plank: Vulnerability-Aware Adversarial Training for Robust Recommender System

Kaike Zhang (CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China) [email protected]; Qi Cao (CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China) [email protected]; Yunfan Wu (CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China) [email protected]; Fei Sun (CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China) [email protected]; Huawei Shen (CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China) [email protected]; and Xueqi Cheng (CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China) [email protected]
(2024)
Abstract.

Recommender systems play a pivotal role in mitigating information overload in various fields. Nonetheless, the inherent openness of these systems introduces vulnerabilities, allowing attackers to insert fake users into the system’s training data to skew the exposure of certain items, known as poisoning attacks. Adversarial training has emerged as a notable defense mechanism against such poisoning attacks within recommender systems. Existing adversarial training methods apply perturbations of the same magnitude across all users to enhance system robustness against attacks. Yet, in reality, we find that attacks often affect only a subset of users who are vulnerable. These perturbations of indiscriminate magnitude make it difficult to provide effective protection for vulnerable users without degrading recommendation quality for those who are unaffected. To address this issue, our research delves into understanding user vulnerability. Considering that poisoning attacks pollute the training data, we note that the higher the degree to which a recommender system fits users’ training data, the more likely those users are to incorporate attack information, indicating their vulnerability. Leveraging these insights, we introduce Vulnerability-aware Adversarial Training (VAT), designed to defend against poisoning attacks in recommender systems. VAT employs a novel vulnerability-aware function to estimate users’ vulnerability based on the degree to which the system fits them. Guided by this estimation, VAT applies perturbations of adaptive magnitude to each user, not only reducing the success ratio of attacks but also preserving, and potentially enhancing, the quality of recommendations. Comprehensive experiments confirm VAT’s superior defensive capabilities across different recommendation models and against various types of attacks.

Robust Recommender System, Adversarial Training, Poisoning Attack
journalyear: 2024; copyright: rightsretained; conference: 18th ACM Conference on Recommender Systems, October 14–18, 2024, Bari, Italy; booktitle: 18th ACM Conference on Recommender Systems (RecSys ’24), October 14–18, 2024, Bari, Italy; doi: 10.1145/3640457.3688120; isbn: 979-8-4007-0505-2/24/10; ccs: Information systems → Recommender systems; ccs: Security and privacy → Social network security and privacy

1. INTRODUCTION

Figure 1. (a) Number of unaffected versus affected users across different attacks, illustrating that only a minority of users are affected by a given attack. (b) Adversarial training as a defense mechanism against attacks, demonstrating that applying the same magnitude of perturbations can lead to damaged performance for users not vulnerable to attacks or fail to effectively protect those who are vulnerable.

Recommender systems have become essential tools for managing the exponential growth of information available online. Collaborative Filtering (CF) is particularly notable among the various techniques employed in these systems (Koren et al., 2009; He et al., 2020). CF is deployed across diverse domains, from e-commerce platforms (Smith and Linden, 2017) to content streaming services (Gomez-Uribe and Hunt, 2015), significantly enhancing user experience by offering personalized suggestions. However, the openness of recommender systems also makes them vulnerable to attacks where attackers inject fake users into the system’s training data, known as poisoning attacks. These attacks aim to manipulate the exposure of targeted items (Huang et al., 2021; Tang et al., 2020), which may not only degrade the user experience but also pose a threat to the long-term sustainability of recommender systems (Zhang et al., 2023, 2024b).

Existing methods for combating poisoning attacks within recommender systems can generally be divided into two main approaches (Zhang et al., 2023): (1) detecting and removing malicious users from the dataset (Chung et al., 2013; Yang et al., 2016; Zhang et al., 2020, 2024b), and (2) developing robust models through adversarial training (He et al., 2018b; Li et al., 2020; Wu et al., 2021b; Ye et al., 2023). Detection-based approaches inspect entire datasets to eliminate malicious users, often relying on supervised data (Chung et al., 2013; Yang et al., 2016) or specific assumptions regarding the characteristics of malicious users (Zhang et al., 2020). These approaches may fail when characteristics of practical attacks deviate from predefined criteria. On the other hand, adversarial training improves model robustness by introducing perturbations into the embeddings of users and items during the training phase, utilizing a “min-max” strategy to minimize risks under the worst-case attacks (He et al., 2018b; Li et al., 2020; Chen et al., 2023a), providing a more general and effective defense without the need for prior knowledge.

Considering the above advantages, this paper focuses on the paradigm of adversarial training. Existing adversarial training methods typically apply the same magnitude of perturbations to all users. However, in practical scenarios, only a minority of users might be affected by attacks, as illustrated in Figure 1(a) (Huang et al., 2021; Tang et al., 2020; LI et al., 2022). To protect users against attacks, introducing the same large-magnitude perturbations across all users may inevitably impair the experience for those not vulnerable to attacks. Conversely, applying the same small-magnitude perturbations results in insufficient protection, as shown in the top part of Figure 1(b). Thus, applying the same magnitude of perturbations across all users may not be the most effective approach, resulting in a trade-off between recommendation performance and effective protection against poisoning attacks.

To address this issue, we propose user-adaptive-magnitude perturbations in adversarial training. On the one hand, we prioritize identifying users vulnerable to attacks to ensure their performance remains robust through sufficiently large perturbations. On the other hand, for users deemed less vulnerable, we propose reducing the magnitude of adversarial perturbations, as depicted in the bottom part of Figure 1(b). Unfortunately, from a defensive standpoint, without details about the attackers’ targets, it is challenging to assess which users are affected by attacks, making precise identification of user vulnerability difficult.

Given these challenges, we explore alternative indicators to estimate user vulnerability. Through extensive experiments, we find that a user’s vulnerability to attacks changes as the recommendation system undergoes training, indicating that this vulnerability is fluctuant. This insight leads us to further examine the link between a user’s vulnerability and the model’s training process. Considering the nature of poisoning attacks—where training data is polluted to mislead the recommender system from fitting the user’s real preferences—we pose the question: Is there a correlation between a user’s vulnerability and their degree of fit within the recommender system? Our empirical analysis yields a noteworthy discovery: users with a higher degree of fit within the recommender system face a higher risk of being affected by attacks. This finding is intuitive in the context of poisoning attacks, as users with a higher degree of fit are also more likely to capture the malicious patterns in poisoned data, placing them at a greater risk.

Based on this observation, we propose a Vulnerability-Aware Adversarial Training (VAT) method to enhance the robustness of recommender systems. VAT follows the established adversarial training paradigm in recommender systems (He et al., 2018b), which introduces adversarial perturbations to user and item embeddings during the training phase. To protect users who are vulnerable to attacks while preserving the performance of those who are not, we implement a vulnerability-aware function. This function estimates users’ vulnerabilities based on the degree to which the recommender system fits them. Following this assessment, VAT applies user-adaptive magnitudes of perturbations to the embeddings. In this way, VAT can both diminish the success ratio of attacks and maintain recommendation quality, thereby avoiding the trade-off suffered by traditional adversarial training methods.

Our extensive experiments across multiple recommendation models and various attacks consistently show that VAT significantly enhances the robustness of recommender systems (reducing the average success ratio of attacks by 21.53%) while avoiding a decline in recommendation performance (even improving the average recommendation performance of the backbone model by 12.36%). The pivotal contributions of our work are as follows:

  • Through extensive empirical analysis, we interestingly find that “users with a higher degree of fit within the recommender system are at a higher risk of being affected by attacks”.

  • Building on these insights, we introduce a novel vulnerability-aware adversarial training method, i.e., VAT, applying user-adaptive magnitudes of perturbations based on users’ vulnerabilities.

  • Our comprehensive experiments confirm the effectiveness of VAT in resisting various attacks while maintaining recommendation quality, as well as demonstrating its adaptability across various recommendation models.

2. RELATED WORK

This section briefly reviews the research on collaborative filtering, poisoning attacks, and robust recommender systems.

2.1. Collaborative Filtering

Collaborative Filtering (CF) is a foundational technique in modern recommender systems, widely recognized and applied across the field (Covington et al., 2016; Ying et al., 2018; He et al., 2020). Its core assumption is that users with similar preferences tend to share comparable opinions and behaviors (Koren et al., 2021), which can be leveraged to predict future recommendations. Among CF methods, Matrix Factorization (MF) is particularly prominent, as it models latent user and item embeddings by decomposing the observed interaction matrix (Koren et al., 2009). The integration of deep learning technologies has led to the emergence of neural CF models that aim to uncover more complex patterns in user preferences (Wang et al., 2015; He et al., 2017; Liang et al., 2018; He et al., 2018a). More recently, the advent of Graph Neural Networks (Wu et al., 2020a) has facilitated the development of graph-based CF models such as NGCF (Wang et al., 2019) and LightGCN (He et al., 2020), achieving notable success in enhancing recommendation tasks. Despite these advancements, these systems remain susceptible to poisoning attacks, posing significant challenges to their robustness (Zhang et al., 2023).

2.2. Poisoning Attacks in Recommender System

Poisoning attacks in recommender systems involve injecting fake users into the training data to manipulate the exposure of certain items. Early works focus on rule-based heuristic attacks. These methods typically construct profiles for fake users through heuristic rules (Lam and Riedl, 2004; Burke et al., 2005; Mobasher et al., 2007; Seminario and Wilson, 2014). The Random Attack (Lam and Riedl, 2004) involves fake users engaging with targeted items along with a random selection of other items. Conversely, the Bandwagon Attack (Burke et al., 2005) crafts fake users’ interactions to include both targeted items and those chosen for their popularity. As attacks have advanced, more recent contributions have adopted optimization-based approaches to generate fake user profiles (Li et al., 2016; Huang et al., 2021; Wu et al., 2021a; Tang et al., 2020; LI et al., 2022; Chen et al., 2022; Qian et al., 2023; Wang et al., 2023; Chen et al., 2023b; Huang and Li, 2023). For instance, the Rev Attack (Tang et al., 2020) frames the attack as a bi-level optimization challenge, tackled using gradient-based techniques. The DP Attack (Huang et al., 2021) targets deep-learning-based recommender systems.
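For illustration, a minimal sketch of how such heuristic fake-user profiles can be constructed is shown below; the profile size, the implicit-feedback setting, and the function names are our assumptions rather than the original attack implementations.

```python
import numpy as np

def random_attack_profile(target_items, all_items, profile_size=30, seed=0):
    """Random Attack: a fake user interacts with the target items plus randomly chosen filler items."""
    rng = np.random.default_rng(seed)
    fillers = rng.choice(list(set(all_items) - set(target_items)),
                         size=profile_size - len(target_items), replace=False)
    return list(target_items) + fillers.tolist()

def bandwagon_attack_profile(target_items, popular_items, profile_size=30, seed=0):
    """Bandwagon Attack: filler items are drawn from popular items instead of at random."""
    rng = np.random.default_rng(seed)
    fillers = rng.choice(list(set(popular_items) - set(target_items)),
                         size=profile_size - len(target_items), replace=False)
    return list(target_items) + fillers.tolist()

# Toy usage: 1,000 items, 5 target items, and the 50 most popular items as the bandwagon pool.
items = list(range(1000))
print(random_attack_profile([0, 1, 2, 3, 4], items)[:10])
print(bandwagon_attack_profile([0, 1, 2, 3, 4], items[:50])[:10])
```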

2.3. Robust Recommender System

Mainstream strategies for enhancing the robustness of recommender systems against poisoning attacks typically fall into two groups (Zhang et al., 2023): (1) detecting and excluding malicious users (Chirita et al., 2005; Chung et al., 2013; Yang et al., 2016; Zhang et al., 2020, 2024b; Liu, 2020; Zhang and Zhou, 2014); (2) developing robust models through adversarial training (He et al., 2018b; Chen and Li, 2019; Li et al., 2020; Wu et al., 2021b; Ye et al., 2023; Chen et al., 2023a).

Detection-based methods aim to either identify and remove potential fake users from the dataset (Chung et al., 2013; Zhang and Zhou, 2014; Yang et al., 2016; Liu, 2020) or mitigate the impact of malicious activities during training (Zhang et al., 2020, 2024b). These approaches often depend on specific assumptions about attack patterns (Chung et al., 2013; Zhang et al., 2020) or supervised data regarding attacks (Zhang and Zhou, 2014; Yang et al., 2016; Zhang et al., 2020, 2024b). Among these, LoRec (Zhang et al., 2024a) leverages the expansive knowledge of large language models to enhance sequential recommendations, overcoming the limitations associated with specific knowledge in detection-based strategies. However, it applies only to sequential recommender systems and is hard to extend to CF.

In contrast, mainstream adversarial training methods for recommender systems, such as Adversarial Personalized Ranking (APR) (He et al., 2018b), introduce adversarial perturbations at the parameter level during training (He et al., 2018b; Li et al., 2020; Ye et al., 2023; Chen et al., 2023a). This methodology adopts a “min-max” optimization strategy, aiming to minimize recommendation errors while maximizing the impact of adversarial perturbations (Zhang et al., 2024b). This approach requires the model to maintain recommendation accuracy under the worst attacks, within a predefined perturbation magnitude. In practice, however, only a minority of users may be affected by attacks (Huang et al., 2021; Tang et al., 2020). Adversarial training that imposes the same large-magnitude perturbations prepares every user for the worst attacks, potentially degrading the experience for users unaffected by attacks. Conversely, the same small-magnitude perturbations offer insufficient protection for vulnerable users. Thus, there is a critical need for a technique that offers targeted protection to vulnerable users while preserving the recommendation performance of those who are not vulnerable.

3. PRELIMINARY

This section mathematically formulates the task of collaborative filtering and adversarial training for recommender systems.

Collaborative Filtering. Collaborative filtering (CF) methods are extensively used in recommender systems. Following (Su and Khoshgoftaar, 2009; He et al., 2020), we define a set of users $\mathcal{U}=\{u\}$ and a set of items $\mathcal{I}=\{i\}$. We aim to learn latent embeddings $\bm{P}=[\bm{p}_{u}\in\mathbb{R}^{d}]_{u\in\mathcal{U}}$ for users and $\bm{Q}=[\bm{q}_{i}\in\mathbb{R}^{d}]_{i\in\mathcal{I}}$ for items. Subsequently, we employ a preference function $f:\mathbb{R}^{d}\times\mathbb{R}^{d}\rightarrow\mathbb{R}$, which predicts user-item preference scores, denoted as $\hat{r}_{u,i}=f(\bm{p}_{u},\bm{q}_{i})$.
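As a minimal illustration of this formulation, the sketch below uses randomly initialized embeddings and a dot-product preference function; the dot product is an assumption, since the formulation allows any $f$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, d = 1000, 500, 64  # toy sizes, not the paper's datasets

# Latent embeddings P (users) and Q (items); in practice these are learned, not random.
P = rng.normal(scale=0.1, size=(n_users, d))
Q = rng.normal(scale=0.1, size=(n_items, d))

def preference(u: int, i: int) -> float:
    """Predicted preference score r_hat_{u,i} = f(p_u, q_i); here f is an inner product."""
    return float(P[u] @ Q[i])

print(preference(0, 42))
```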

Adversarial Training for Recommender Systems. Adversarial training approaches for recommender systems, particularly within the Adversarial Personalized Ranking (APR) framework (He et al., 2018b), incorporate adversarial perturbations at the parameter level throughout the training process. The original loss function of the recommender system is represented as $\mathcal{L}(\Theta)$, with $\Theta=(\bm{P},\bm{Q})$ indicating the system’s parameters. These adversarial training methods introduce perturbations $\Delta$ directly to the parameters as follows:

(1) $\mathcal{L}_{\mathrm{AT}}(\Theta)=\mathcal{L}(\Theta)+\lambda\mathcal{L}(\Theta+\Delta^{\mathrm{AT}}),\quad\text{where}\quad\Delta^{\mathrm{AT}}=\arg\max_{\Delta,\,\|\Delta\|\leq\epsilon}\mathcal{L}(\Theta+\Delta),$

where $\epsilon>0$ controls the magnitude of perturbations, and $\lambda$ denotes the adversarial training weight. In practice (He et al., 2018b), for an interaction $(u,i)$, the specific adversarial perturbation is given by

(2) $\Delta^{\mathrm{AT}}_{u,i}=\epsilon\frac{\Gamma_{u,i}}{\|\Gamma_{u,i}\|},\quad\text{where}\quad\Gamma_{u,i}=\frac{\partial\mathcal{L}((u,i)|\Theta+\Delta)}{\partial\Delta_{u,i}}.$
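The sketch below illustrates Equations 1 and 2 for a single interaction in PyTorch; the BPR-style loss and the sampled negative item are assumptions used only to make the example self-contained, not the exact APR implementation.

```python
import torch
import torch.nn.functional as F

def apr_perturbation(p_u, q_i, q_j, eps=0.5):
    """Fast-gradient adversarial perturbation (Eq. 2) for one (u, i) pair,
    with a sampled negative item j and an assumed BPR-style pairwise loss."""
    delta_u = torch.zeros_like(p_u, requires_grad=True)
    delta_i = torch.zeros_like(q_i, requires_grad=True)
    # Loss evaluated at the perturbed parameters Theta + Delta.
    score_pos = ((p_u + delta_u) * (q_i + delta_i)).sum()
    score_neg = ((p_u + delta_u) * q_j).sum()
    loss = -F.logsigmoid(score_pos - score_neg)
    loss.backward()
    # Delta^{AT} = eps * Gamma / ||Gamma||, where Gamma is the gradient w.r.t. Delta.
    with torch.no_grad():
        pert_u = eps * delta_u.grad / (delta_u.grad.norm() + 1e-12)
        pert_i = eps * delta_i.grad / (delta_i.grad.norm() + 1e-12)
    return pert_u, pert_i

# Toy usage: the returned perturbation norms are approximately eps.
d = 8
p_u, q_i, q_j = torch.randn(d), torch.randn(d), torch.randn(d)
du, di = apr_perturbation(p_u, q_i, q_j, eps=0.5)
print(du.norm().item(), di.norm().item())
```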

4. Method

Figure 2. User’s vulnerability is fluctuant. (a) The number of times users are affected by attacks: few users consistently demonstrate vulnerability. (b) The number of changes in users’ attack statuses for the users who have been affected: most users who have been successfully attacked have multiple status changes.

To address the limitations of existing adversarial training methods, we propose user-adaptive magnitudes of perturbations, integrating large-magnitude perturbations for users vulnerable to attacks, thus offering effective protection. Simultaneously, we reduce the magnitude of adversarial perturbations for users deemed invulnerable, aiming to preserve the quality of their recommendations. This section delves into identifying these vulnerable users. Subsequently, we introduce the Vulnerability-Aware Adversarial Training (VAT) method, which tailors adversarial training to the specific vulnerabilities of users by applying perturbations of user-adaptive magnitudes. VAT aims to both provide effective protection and maintain recommendation performance by adapting to the nuanced needs of individual users.

Figure 3. (a) Proportion and (b) number of users affected by attacks across different loss bins: users with lower losses are more likely to be affected by attacks than those with higher losses.

4.1. User’s Vulnerability Is Fluctuant

To seek indicators to estimate user vulnerability, we initially examine whether this vulnerability is static, derived from user statistics, or fluctuant, evolving with the recommender system’s training. Accordingly, we assess the frequency with which users are affected by attacks during the training process.

Using the Gowalla dataset (Liang et al., 2016; He et al., 2020) as an example, we implement both the Random Attack (Lam and Riedl, 2004) and the Bandwagon Attack (Burke et al., 2005), and evaluate their impact on Matrix Factorization (MF) (Koren et al., 2009) and LightGCN (He et al., 2020) as victim models. For a detailed discussion of the experimental settings, please refer to Section 5.1. The recommender system is trained under these conditions for 300 epochs, during which we evaluate every 30 epochs whether users are affected by attacks (a user is considered affected if any target item appears in the user’s top-50 recommendation list (Tang et al., 2020; Huang et al., 2021)), resulting in a total of 10 evaluations.
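A sketch of the per-user check used in this evaluation is shown below; the array shapes and names are assumptions.

```python
import numpy as np

def affected_by_attack(scores, train_items, target_items, k=50):
    """A user counts as affected if any target item enters their top-k list.
    `scores`: predicted scores over all items; items seen in training are masked out."""
    scores = scores.copy()
    scores[list(train_items)] = -np.inf          # never recommend already-seen items
    top_k = np.argpartition(-scores, k)[:k]      # unordered top-k is enough for a hit test
    return bool(set(top_k.tolist()) & set(target_items))

# Toy usage: 1,000 items, a user trained on items {1, 2, 3}, attacker targets item 7.
rng = np.random.default_rng(0)
print(affected_by_attack(rng.normal(size=1000), {1, 2, 3}, {7}))
```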

Figure 2(a) presents the distribution of users’ attack statuses (whether affected) over these 10 evaluations, illustrating that a predominant portion of users has never been affected (denoted by a horizontal coordinate of 0), while almost no users consistently demonstrate vulnerability (denoted by a horizontal coordinate of 10). Additionally, users affected by the attack have varying frequencies, with horizontal coordinates ranging from 1 to 9.

Moreover, Figure 2(b), which shows the changes in users’ attack statuses over time (a change is counted whenever there is a discrepancy between two successive evaluations, with the initial state being unaffected), indicates that a majority of those who have been affected undergo several status changes, with horizontal coordinates ranging from 2 to 8. This emphasizes the fluctuating nature of user vulnerability. These analyses, supported by Figure 2(a) and Figure 2(b), confirm that user vulnerability is indeed fluctuant during the training of the recommender system.

4.2. Well-Fitted Users Are More Likely to Be Vulnerable

Considering the fluctuant nature of user vulnerability during recommender system training, this section explores the relationship between a user’s vulnerability to attacks and the training process of the recommender system.

Hypothesis on User Vulnerability. Poisoning attacks manipulate the training of recommender systems by polluting the training data. These attacks establish deceptive correlations between users’ historical interactions and the target items chosen by attackers. If the recommender system captures these deceptive correlations during the process of fitting user behavior, the user may be affected by attacks. This insight leads us to pose a critical question: Are well-fitted users in the current recommender system more likely to be affected by attacks?

Observation. To validate our hypothesis, we use user-specific loss as a measure of fit within the recommender system, considering that users with lower loss values are better fitted by the system. We record each user’s training loss alongside their attack status. Due to space constraints, we present the results of LightGCN on Gowalla in Figure 3. Figure 3(a) shows the percentage of users affected by attacks relative to the total user count within each loss bin (only bins containing more than 0.5% of the total number of users are shown to ensure visibility; bins below this threshold are excluded due to potential data instability). Meanwhile, Figure 3(b) displays the number of users affected by attacks across different loss bins. Our observations reveal that users with smaller losses have a higher probability of being affected compared to those with larger losses, indicating a general downward trend in both the proportion and the number of users affected by attacks as loss increases. These observations empirically substantiate our hypothesis that well-fitted users in the current recommender system are more likely to be vulnerable, i.e., affected by attacks.
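The loss-bin analysis behind Figure 3 can be reproduced along the following lines (a sketch under assumed array names and bin counts):

```python
import numpy as np

def affected_ratio_by_loss_bin(user_losses, user_affected, n_bins=20, min_frac=0.005):
    """Proportion of attacked users per loss bin; tiny bins (<0.5% of users) are dropped."""
    edges = np.linspace(user_losses.min(), user_losses.max(), n_bins + 1)
    bin_idx = np.clip(np.digitize(user_losses, edges) - 1, 0, n_bins - 1)
    ratios = {}
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.sum() < min_frac * len(user_losses):
            continue  # skip sparsely populated, unstable bins
        ratios[(edges[b], edges[b + 1])] = user_affected[mask].mean()
    return ratios

# Toy usage: lower-loss users are made more likely to be "affected".
rng = np.random.default_rng(0)
losses = rng.gamma(2.0, 1.0, size=5000)
affected = rng.random(5000) < 0.05 / (1.0 + losses)
print(affected_ratio_by_loss_bin(losses, affected))
```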

Figure 4. Embeddings of a well-fitted user and an under-fitted user. The cosine similarities in the original space between the two users and the target item are 0.7570 (well-fitted user) and 0.5309 (under-fitted user).

Analysis. Poisoning attacks exploit the system’s fitting capabilities by creating deceptive correlations between users’ historical interactions and the attacker’s chosen targets. These correlations may involve complex patterns, such as high-order connectivity with other items/users or similarities in consumption behavior, which the system is designed to capture and utilize. Figure 4 demonstrates a real example involving a well-fitted user (characterized by a small loss), an under-fitted user (characterized by a large loss), a fake user with a target item, and other normal items. By employing T-SNE (Van der Maaten and Hinton, 2008) to project their embeddings into two dimensions, we observe that the well-fitted user precisely models the third-order link, thereby showing high similarity with the target item. Conversely, the under-fitted user fails to discern this pattern, remaining unaffected by the attack. In other words, users whose interactions are better fitted by the system more readily identify and utilize these deceptive correlations, increasing the likelihood of the attacker’s target items being recommended. Thus, increasing the magnitude of perturbations for these users is necessary to enhance their protection. This process intuitively explains why users who are well-fitted are more susceptible to the influence of poisoned data, making them more likely to be vulnerable to poisoning attacks.

4.3. Vulnerability-Aware Adversarial Training

To enhance the robustness of users identified as vulnerable, we increase the magnitude of adversarial perturbations for these users to improve their ability to resist poisoning attacks, thus providing more effective protection. Recognizing that users with smaller losses are more likely to be vulnerable, we propose a vulnerability-aware function $g(\cdot)$ that quantifies users’ vulnerability based on this indicator, i.e., whether a user’s loss is relatively small or large. To prevent excessively large perturbations over the training duration, we constrain $g:\mathbb{R}\rightarrow(0,1)$. Formally, we define $g(\cdot)$ as follows:

(3) $g\left(\mathcal{L}(u|\Theta)\right)=\sigma\left(\left(\frac{\mathcal{L}(u|\Theta)-\overline{\mathcal{L}(u|\Theta)}}{\overline{\mathcal{L}(u|\Theta)}}\right)^{-1}\right),$

where $\mathcal{L}(u|\Theta)$ denotes the loss associated with user $u$, $\overline{\mathcal{L}(u|\Theta)}$ is the mean loss across all users, and $\sigma(\cdot)$ is the Sigmoid function.
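A minimal PyTorch sketch of Equation 3 is shown below; the vectorized layout and the small numerical guard are our additions.

```python
import torch

def vulnerability_weight(user_loss: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Eq. 3: g(L(u|Theta)) = sigmoid( ((L(u|Theta) - mean_loss) / mean_loss)^{-1} ),
    producing a per-user weight in (0, 1) that later rescales the perturbation budget."""
    mean_loss = user_loss.mean()
    rel_dev = (user_loss - mean_loss) / (mean_loss + eps)  # relative deviation from the mean loss
    return torch.sigmoid(1.0 / (rel_dev + eps))            # eps guards against division by zero
```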

Given the recommender system’s original loss function, $\mathcal{L}(\Theta)$, and referring to Equation 1, we integrate the vulnerability-aware function $g(\cdot)$ into $\Delta^{\mathrm{AT}}$. The loss function for VAT is expressed as follows:

(4) $\mathcal{L}_{\mathrm{VAT}}(\Theta)=\mathcal{L}(\Theta)+\lambda\mathcal{L}(\Theta+\Delta^{\mathrm{VAT}}),\quad\text{where}\quad\Delta^{\mathrm{VAT}}=\arg\max_{\Delta,\,\|\Delta_{u,*}\|\leq\rho g(\mathcal{L}(u|\Theta))}\mathcal{L}(\Theta+\Delta),$

where $\lambda$ is the weight used in adversarial training, and $\rho$ determines the initial magnitude of perturbations. Specifically, for an interaction $(u,i)$, the perturbation of user-adaptive magnitude, $\Delta^{\mathrm{VAT}}_{u,i}$, is calculated as:

(5) $\Delta^{\mathrm{VAT}}_{u,i}=\rho\, g\left(\mathcal{L}(u|\Theta)\right)\frac{\Gamma_{u,i}}{\|\Gamma_{u,i}\|},\quad\text{where}\quad\Gamma_{u,i}=\frac{\partial\mathcal{L}((u,i)|\Theta+\Delta)}{\partial\Delta_{u,i}}.$

According to Equation 5, we apply such user-adaptive magnitudes of perturbations based on the user vulnerability, thereby providing an effective defense against poisoning attacks while maintaining the performance of the recommender system.
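To make the full procedure concrete, the following toy sketch combines Equations 3–5 for one training step; the BPR-style loss, the batch construction, and the use of per-interaction losses as a stand-in for the per-user loss $\mathcal{L}(u|\Theta)$ are simplifying assumptions rather than the released implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_users, n_items, d = 100, 200, 16                     # toy sizes
P = torch.nn.Parameter(torch.randn(n_users, d) * 0.1)  # user embeddings
Q = torch.nn.Parameter(torch.randn(n_items, d) * 0.1)  # item embeddings
rho, lam = 0.6, 1.0                                    # rho and lambda reported in Section 5.3

# One toy batch of (user, positive item, sampled negative item) interactions.
u = torch.randint(0, n_users, (512,))
i = torch.randint(0, n_items, (512,))
j = torch.randint(0, n_items, (512,))

def bpr_loss(pu, qi, qj):
    # Assumed BPR-style pairwise loss; the VAT framework itself is loss-agnostic.
    return -F.logsigmoid((pu * qi).sum(-1) - (pu * qj).sum(-1))

# Per-interaction loss; in this toy batch it stands in for the per-user loss L(u|Theta) of Eq. 3.
loss_per_inter = bpr_loss(P[u], Q[i], Q[j])
rel_dev = (loss_per_inter - loss_per_inter.mean()) / (loss_per_inter.mean() + 1e-12)
g = torch.sigmoid(1.0 / (rel_dev + 1e-12)).detach()    # vulnerability weights g in (0, 1)

# Eq. 5: normalized-gradient perturbation scaled by the user-adaptive magnitude rho * g.
delta_u = torch.zeros_like(P[u], requires_grad=True)
delta_i = torch.zeros_like(Q[i], requires_grad=True)
bpr_loss(P[u].detach() + delta_u, Q[i].detach() + delta_i, Q[j].detach()).sum().backward()
pert_u = (rho * g)[:, None] * delta_u.grad / (delta_u.grad.norm(dim=-1, keepdim=True) + 1e-12)
pert_i = (rho * g)[:, None] * delta_i.grad / (delta_i.grad.norm(dim=-1, keepdim=True) + 1e-12)

# Eq. 4: L_VAT = L(Theta) + lambda * L(Theta + Delta^VAT); backward() feeds the optimizer step.
loss_vat = loss_per_inter.sum() + lam * bpr_loss(P[u] + pert_u, Q[i] + pert_i, Q[j]).sum()
loss_vat.backward()
print(float(loss_vat))
```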

4.4. Further Discussion

It is important to note that although users with lower loss values are more vulnerable to attacks, the overall success ratio of these attacks remains low, leaving a part of low-loss users unaffected. Nonetheless, adversarial training at the parameter level also proves effective in cases where model parameters are overfitted to the data, as demonstrated in (He et al., 2018b). For users with small losses, the introduction of large-magnitude perturbations can help correct the overfitting of parameters, thereby improving the quality of recommendations. With these dual benefits, VAT is capable of not only enhancing defenses for users vulnerable to attacks but also improving the generalization capabilities of the recommender system for users whose parameters are overfitted, as evident in Section 5.4.2.

5. EXPERIMENTS

In this section, we conduct extensive experiments to answer the following research questions (RQs).

  • RQ1: Can VAT defend against poisoning attacks?

  • RQ2: How do hyper-parameters affect VAT?

  • RQ3: Why does VAT outperform traditional adversarial training methods?

5.1. Experimental Setup

Table 1. Dataset statistics
DATASET #Users #Items #Ratings Avg.Inter. Sparsity
Gowalla 29,858 40,981 1,027,370 34.4 99.92%
Yelp2018 31,668 38,048 1,561,406 49.3 99.88%
MIND 141,920 36,214 20,693,122 145.8 99.60%

5.1.1. Datasets

We employ three widely recognized datasets: the Gowalla check-in dataset (Liang et al., 2016), the Yelp2018 business dataset, and the MIND news recommendation dataset (Wu et al., 2020b). The Gowalla and Yelp2018 datasets include all users, whereas for the MIND dataset, we sample a subset of users in alignment with (Zhang et al., 2024b). Consistent with (He et al., 2020; Wang et al., 2019), users and items with fewer than 10 interactions are excluded from our analysis. We allocate 80% of each user’s historical interactions to the training set and reserve the remainder for testing. Additionally, within the training set, 10% of the interactions are randomly selected to form a validation set for hyperparameter tuning. Detailed statistics of the datasets are summarized in Table 1.
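A sketch of this per-user split (80% train, 20% test, with 10% of the training interactions held out for validation) under assumed data structures:

```python
import numpy as np

def split_user_interactions(user_items, seed=0):
    """Per-user split: 80% of interactions for training, 20% for testing,
    and 10% of the training portion held out as a validation set."""
    rng = np.random.default_rng(seed)
    train, val, test = {}, {}, {}
    for user, items in user_items.items():
        items = rng.permutation(items)
        n_train = int(0.8 * len(items))
        train_items, test[user] = items[:n_train], items[n_train:]
        n_val = int(0.1 * len(train_items))
        val[user], train[user] = train_items[:n_val], train_items[n_val:]
    return train, val, test

# Toy usage with 3 users.
toy = {u: list(range(20 + u)) for u in range(3)}
tr, va, te = split_user_interactions(toy)
print({u: (len(tr[u]), len(va[u]), len(te[u])) for u in toy})
```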

5.1.2. Baselines for Defense

We incorporate a variety of defense methods, including detection-based methods, adversarial training methods, and a denoise-based method. Specifically, we examine GraphRfi (Zhang et al., 2020) and LLM4Dec (Zhang et al., 2024b) for detection-based methods; APR (He et al., 2018b) and SharpCF (Chen et al., 2023a) for adversarial training methods; and StDenoise (Tian et al., 2022; Ye et al., 2023) for the denoise-based approach.

  • GraphRfi (Zhang et al., 2020): Employs a combination of Graph Convolutional Networks and Neural Random Forests for identifying fraudsters.

  • LLM4Dec (Zhang et al., 2024b): Utilizes an LLM-based framework for fraudster detection.

  • APR (He et al., 2018b): Generates parameter perturbations and integrates these perturbations into training.

  • SharpCF (Chen et al., 2023a): Adopts a sharpness-aware minimization approach to refine the adversarial training process proposed by APR.

  • StDenoise (Tian et al., 2022; Ye et al., 2023): Applies a structural denoising technique that leverages the similarity between $\bm{p}_{u}$ and $\bm{q}_{i}$ for each $(u,i)$ pair to remove noisy interactions.

Note that LLM4Dec, which relies on item-side information, is exclusively evaluated on the MIND dataset. Additionally, we observe that SharpCF, initially proposed for the MF model, exhibits unstable training performance when applied to the LightGCN model or the MIND dataset. Consequently, we present SharpCF results solely for the MF model on the Gowalla and Yelp2018 datasets.

Table 2. Robustness against target items promotion
Dataset Model Random Attack(%) Bandwagon Attack(%) DP Attack(%) Rev Attack(%)
T-HR@50¹ T-NDCG@50 T-HR@50 T-NDCG@50 T-HR@50 T-NDCG@50 T-HR@50 T-NDCG@50
Gowalla MF 0.148 ±\pm 0.030 0.036 ±\pm 0.008 0.120 ±\pm 0.027 0.029 ±\pm 0.007 0.201 ±\pm 0.020 0.051 ±\pm 0.005 0.246 ±\pm 0.097 0.061 ±\pm 0.027
+StDenoise 0.200 ±\pm 0.049 0.050 ±\pm 0.012 0.165 ±\pm 0.034 0.038 ±\pm 0.008 0.292 ±\pm 0.034 0.074 ±\pm 0.010 0.355 ±\pm 0.126 0.084 ±\pm 0.030
+GraphRfi 0.159 ±\pm 0.061 0.042 ±\pm 0.015 0.154 ±\pm 0.038 0.036 ±\pm 0.009 0.174 ±\pm 0.038 0.043 ±\pm 0.009 0.206 ±\pm 0.042 0.050 ±\pm 0.010
+APR 0.201 ±\pm 0.091 0.054 ±\pm 0.026 0.184 ±\pm 0.067 0.047 ±\pm 0.015 0.034 ±\pm 0.021 0.006 ±\pm 0.004 0.261 ±\pm 0.063 0.067 ±\pm 0.018
+SharpCF 0.204 ±\pm 0.037 0.049 ±\pm 0.010 0.169 ±\pm 0.031 0.041 ±\pm 0.008 0.303 ±\pm 0.024 0.077 ±\pm 0.006 0.350 ±\pm 0.111 0.087 ±\pm 0.031
+VAT 0.121 ±\pm 0.028 0.031 ±\pm 0.009 0.101 ±\pm 0.038 0.024 ±\pm 0.008 0.028 ±\pm 0.007 0.006 ±\pm 0.001 0.103 ±\pm 0.048 0.024 ±\pm 0.011
Gain2 +18.49% \uparrow +15.63% \uparrow +15.86% \uparrow +16.36% \uparrow +16.77% \uparrow +9.03% \uparrow +49.87% \uparrow +52.39% \uparrow
LightGCN 0.234 ±\pm 0.116 0.056 ±\pm 0.031 0.639 ±\pm 0.090 0.153 ±\pm 0.024 0.231 ±\pm 0.048 0.048 ±\pm 0.010 0.718 ±\pm 0.134 0.149 ±\pm 0.026
+StDenoise 0.118 ±\pm 0.068 0.029 ±\pm 0.019 0.334 ±\pm 0.092 0.079 ±\pm 0.020 0.585 ±\pm 0.092 0.120 ±\pm 0.019 1.304 ±\pm 0.184 0.259 ±\pm 0.037
+GraphRfi 0.099 ±\pm 0.023 0.023 ±\pm 0.006 0.710 ±\pm 0.250 0.161 ±\pm 0.052 0.228 ±\pm 0.048 0.046 ±\pm 0.010 0.564 ±\pm 0.067 0.115 ±\pm 0.013
+APR 0.090 ±\pm 0.053 0.022 ±\pm 0.015 0.332 ±\pm 0.050 0.079 ±\pm 0.012 0.190 ±\pm 0.037 0.039 ±\pm 0.008 0.655 ±\pm 0.141 0.132 ±\pm 0.027
+VAT 0.089 ±\pm 0.054 0.021 ±\pm 0.014 0.259 ±\pm 0.047 0.063 ±\pm 0.012 0.141 ±\pm 0.034 0.028 ±\pm 0.007 0.456 ±\pm 0.093 0.094 ±\pm 0.018
Gain +0.22% \uparrow +0.55% \uparrow +22.01% \uparrow +20.77% \uparrow +25.86% \uparrow +28.32% \uparrow +19.17% \uparrow +18.29% \uparrow
Yelp2018 MF 0.035 ±\pm 0.007 0.010 ±\pm 0.002 0.073 ±\pm 0.032 0.020 ±\pm 0.009 0.223 ±\pm 0.040 0.049 ±\pm 0.009 0.153 ±\pm 0.025 0.040 ±\pm 0.006
+StDenoise 0.015 ±\pm 0.038 0.007 ±\pm 0.010 0.181 ±\pm 0.046 0.043 ±\pm 0.011 0.376 ±\pm 0.198 0.077 ±\pm 0.039 0.331 ±\pm 0.145 0.075 ±\pm 0.031
+GraphRfi 0.032 ±\pm 0.009 0.009 ±\pm 0.003 0.058 ±\pm 0.014 0.015 ±\pm 0.003 0.200 ±\pm 0.041 0.043 ±\pm 0.010 0.129 ±\pm 0.027 0.031 ±\pm 0.007
+APR 0.012 ±\pm 0.007 0.004 ±\pm 0.002 0.057 ±\pm 0.047 0.013 ±\pm 0.011 0.185 ±\pm 0.038 0.040 ±\pm 0.009 0.098 ±\pm 0.048 0.022 ±\pm 0.011
+SharpCF 0.034 ±\pm 0.007 0.010 ±\pm 0.002 0.072 ±\pm 0.029 0.019 ±\pm 0.008 0.226 ±\pm 0.041 0.050 ±\pm 0.010 0.152 ±\pm 0.025 0.040 ±\pm 0.006
+VAT 0.010 ±\pm 0.006 0.003 ±\pm 0.002 0.040 ±\pm 0.031 0.010 ±\pm 0.007 0.142 ±\pm 0.038 0.028 ±\pm 0.007 0.090 ±\pm 0.049 0.020 ±\pm 0.010
Gain +14.11% \uparrow +19.16% \uparrow +30.70% \uparrow +25.30% \uparrow +23.32% \uparrow +28.80% \uparrow +8.43% \uparrow +8.50% \uparrow
LightGCN 0.381 ±\pm 0.064 0.116 ±\pm 0.022 1.286 ±\pm 0.351 0.299 ±\pm 0.083 0.451 ±\pm 0.040 0.098 ±\pm 0.008 1.761 ±\pm 0.368 0.402 ±\pm 0.091
+StDenoise 0.058 ±\pm 0.017 0.018 ±\pm 0.008 1.609 ±\pm 0.381 0.346 ±\pm 0.091 3.939 ±\pm 0.417 0.814 ±\pm 0.094 5.965 ±\pm 0.375 1.472 ±\pm 0.125
+GraphRfi 0.434 ±\pm 0.074 0.127 ±\pm 0.023 0.958 ±\pm 0.199 0.200 ±\pm 0.042 0.581 ±\pm 0.049 0.119 ±\pm 0.011 1.597 ±\pm 0.087 0.344 ±\pm 0.016
+APR 0.291 ±\pm 0.050 0.090 ±\pm 0.018 1.052 ±\pm 0.278 0.242 ±\pm 0.065 0.370 ±\pm 0.034 0.078 ±\pm 0.007 1.139 ±\pm 0.179 0.249 ±\pm 0.041
+VAT 0.082 ±\pm 0.020 0.024 ±\pm 0.006 0.694 ±\pm 0.181 0.156 ±\pm 0.041 0.365 ±\pm 0.037 0.076 ±\pm 0.008 0.927 ±\pm 0.135 0.196 ±\pm 0.029
Gain - - +27.56% \uparrow +22.13% \uparrow +1.50% \uparrow +1.88% \uparrow +18.61% \uparrow +21.25% \uparrow
MIND MF 0.032 ±\pm 0.007 0.010 ±\pm 0.002 0.169 ±\pm 0.017 0.055 ±\pm 0.005 0.023 ±\pm 0.013 0.005 ±\pm 0.003 OOM3 OOM
+StDenoise 0.036 ±\pm 0.006 0.013 ±\pm 0.004 0.040 ±\pm 0.006 0.020 ±\pm 0.004 0.010 ±\pm 0.003 0.002 ±\pm 0.001 OOM OOM
+GraphRfi 0.031 ±\pm 0.006 0.010 ±\pm 0.002 0.189 ±\pm 0.015 0.059 ±\pm 0.005 0.020 ±\pm 0.009 0.004 ±\pm 0.002 OOM OOM
+LLM4Dec 0.020 ±\pm 0.001 0.004 ±\pm 0.000 0.083 ±\pm 0.009 0.025 ±\pm 0.003 0.019 ±\pm 0.010 0.004 ±\pm 0.002 OOM OOM
+APR 0.083 ±\pm 0.013 0.035 ±\pm 0.006 0.068 ±\pm 0.005 0.023 ±\pm 0.002 0.008 ±\pm 0.007 0.002 ±\pm 0.001 OOM OOM
+VAT 0.026 ±\pm 0.006 0.011 ±\pm 0.003 0.032 ±\pm 0.004 0.011 ±\pm 0.001 0.002 ±\pm 0.002 0.000 ±\pm 0.000 OOM OOM
Gain - - +20.15% \uparrow +45.40% \uparrow +75.36% \uparrow +77.27% \uparrow - -
LightGCN 0.056 ±\pm 0.008 0.015 ±\pm 0.002 0.149 ±\pm 0.016 0.038 ±\pm 0.004 0.006 ±\pm 0.002 0.001 ±\pm 0.000 OOM OOM
+StDenoise 0.052 ±\pm 0.026 0.014 ±\pm 0.020 0.164 ±\pm 0.017 0.040 ±\pm 0.004 0.007 ±\pm 0.002 0.001 ±\pm 0.001 OOM OOM
+GraphRfi 0.045 ±\pm 0.004 0.012 ±\pm 0.001 0.093 ±\pm 0.007 0.022 ±\pm 0.001 0.008 ±\pm 0.001 0.002 ±\pm 0.000 OOM OOM
+LLM4Dec 0.039 ±\pm 0.017 0.013 ±\pm 0.006 0.104 ±\pm 0.009 0.027 ±\pm 0.003 0.006 ±\pm 0.002 0.001 ±\pm 0.001 OOM OOM
+APR 0.053 ±\pm 0.007 0.016 ±\pm 0.002 0.091 ±\pm 0.007 0.022 ±\pm 0.002 0.006 ±\pm 0.001 0.001 ±\pm 0.000 OOM OOM
+VAT 0.032 ±\pm 0.005 0.010 ±\pm 0.001 0.065 ±\pm 0.005 0.016 ±\pm 0.001 0.003 ±\pm 0.001 0.001 ±\pm 0.000 OOM OOM
Gain +17.20% \uparrow +18.64% \uparrow +28.50% \uparrow +26.56% \uparrow +40.00% \uparrow +39.66% \uparrow - -

1. Target Item Hit Ratio (Equation 6); T-HR@50 and T-NDCG@50 of all target items on clean datasets are 0.000.
  2. The relative percentage increase of VAT’s metrics over the best value among the other baselines’ metrics, i.e., $\left(\min\left(\mathrm{T}\text{-}\mathrm{HR}_{\mathrm{Baselines}}\right)-\mathrm{T}\text{-}\mathrm{HR}_{\mathrm{VAT}}\right)/\min\left(\mathrm{T}\text{-}\mathrm{HR}_{\mathrm{Baselines}}\right)$. Notably, only three decimal places are presented due to space limitations, though the actual ranking and calculations utilize the full precision of the data.
  3. The Rev attack method could not be executed on the dataset due to memory constraints, resulting in an out-of-memory error.

Figure 5. Robustness against popular items promotion.
Table 3. Recommendation performance
Model Clean (%) Random Attack (%) Bandwagon Attack (%) DP Attack (%) Rev Attack (%)
(Dataset) HR@20 NDCG@20 HR@20 NDCG@20 HR@20 NDCG@20 HR@20 NDCG@20 HR@20 NDCG@20
MF  (Gowalla) 11.352 ±\pm 0.091 7.158 ±\pm 0.035 11.306 ±\pm 0.077 7.196 ±\pm 0.061 11.238 ±\pm 0.077 7.106 ±\pm 0.042 10.722 ±\pm 0.109 8.170 ±\pm 0.076 10.698 ±\pm 0.090 8.188 ±\pm 0.044
+StDenoise 10.484 ±\pm 0.096 8.074 ±\pm 0.103 10.456 ±\pm 0.089 8.074 ±\pm 0.067 10.412 ±\pm 0.058 8.038 ±\pm 0.023 10.532 ±\pm 0.130 8.120 ±\pm 0.089 10.568 ±\pm 0.047 8.186 ±\pm 0.038
+GraphRfi 10.434 ±\pm 0.065 7.968 ±\pm 0.026 10.344 ±\pm 0.080 7.886 ±\pm 0.057 10.304 ±\pm 0.059 7.846 ±\pm 0.061 10.400 ±\pm 0.115 7.942 ±\pm 0.079 10.496 ±\pm 0.093 8.010 ±\pm 0.069
+APR 13.058 ±\pm 0.063 10.646 ±\pm 0.058 12.934 ±\pm 0.044 10.520 ±\pm 0.013 12.902 ±\pm 0.065 10.500 ±\pm 0.030 12.946 ±\pm 0.056 10.586 ±\pm 0.060 13.128 ±\pm 0.052 10.720 ±\pm 0.065
+SharpCF 13.203 ±\pm 0.074 10.020 ±\pm 0.090 13.188 ±\pm 0.077 10.028 ±\pm 0.069 13.025 ±\pm 0.060 9.890 ±\pm 0.050 13.270 ±\pm 0.138 10.082 ±\pm 0.098 13.215 ±\pm 0.087 10.095 ±\pm 0.044
+VAT 13.424 ±\pm 0.041 10.864 ±\pm 0.047 13.292 ±\pm 0.016 10.764 ±\pm 0.012 13.286 ±\pm 0.029 10.740 ±\pm 0.018 13.396 ±\pm 0.045 10.860 ±\pm 0.036 13.540 ±\pm 0.087 10.980 ±\pm 0.059
Gain +1.68% \uparrow +2.05% \uparrow +0.79% \uparrow +2.32% \uparrow +2.00% \uparrow +2.29% \uparrow +0.95% \uparrow +2.59% \uparrow +2.46% \uparrow +2.43% \uparrow
Gain w.r.t. MF +18.25% \uparrow +51.77% \uparrow +17.57% \uparrow +49.58% \uparrow +18.22% \uparrow +51.14% \uparrow +24.94% \uparrow +32.93% \uparrow +26.57% \uparrow +34.10% \uparrow
MF  (Yelp2018) 3.762 ±\pm 0.034 2.974 ±\pm 0.039 3.730 ±\pm 0.017 2.934 ±\pm 0.010 3.744 ±\pm 0.040 2.948 ±\pm 0.029 3.866 ±\pm 0.038 3.028 ±\pm 0.033 3.812 ±\pm 0.044 3.028 ±\pm 0.041
+StDenoise 3.410 ±\pm 0.085 2.612 ±\pm 0.092 3.288 ±\pm 0.040 2.504 ±\pm 0.026 3.322 ±\pm 0.057 2.522 ±\pm 0.047 3.384 ±\pm 0.062 2.578 ±\pm 0.063 3.380 ±\pm 0.104 2.586 ±\pm 0.102
+GraphRfi 3.726 ±\pm 0.051 2.942 ±\pm 0.034 3.664 ±\pm 0.038 2.902 ±\pm 0.033 3.640 ±\pm 0.054 2.882 ±\pm 0.029 3.762 ±\pm 0.056 2.932 ±\pm 0.049 3.718 ±\pm 0.053 2.950 ±\pm 0.042
+APR 4.094 ±\pm 0.022 3.202 ±\pm 0.017 4.036 ±\pm 0.019 3.160 ±\pm 0.018 4.080 ±\pm 0.028 3.194 ±\pm 0.026 4.012 ±\pm 0.059 3.152 ±\pm 0.043 4.061 ±\pm 0.029 3.205 ±\pm 0.024
+SharpCF 3.933 ±\pm 0.038 3.108 ±\pm 0.045 3.883 ±\pm 0.015 3.058 ±\pm 0.016 3.910 ±\pm 0.051 3.079 ±\pm 0.027 4.034 ±\pm 0.034 3.161 ±\pm 0.037 3.971 ±\pm 0.052 3.156 ±\pm 0.047
+VAT 4.112 ±\pm 0.023 3.234 ±\pm 0.022 4.074 ±\pm 0.016 3.206 ±\pm 0.014 4.130 ±\pm 0.035 3.246 ±\pm 0.030 4.096 ±\pm 0.044 3.202 ±\pm 0.041 4.218 ±\pm 0.027 3.326 ±\pm 0.024
Gain +0.44% \uparrow +1.00% \uparrow +0.94% \uparrow +1.46% \uparrow +1.23% \uparrow +1.63% \uparrow +1.53% \uparrow +1.31% \uparrow +3.86% \uparrow +3.79% \uparrow
Gain w.r.t. MF +9.30% \uparrow +8.74% \uparrow +9.22% \uparrow +9.27% \uparrow +10.31% \uparrow +10.11% \uparrow +5.95% \uparrow +5.75% \uparrow +10.65% \uparrow +9.84% \uparrow
MF  (MIND) 1.204 ±\pm 0.014 0.676 ±\pm 0.005 1.190 ±\pm 0.011 0.670 ±\pm 0.006 1.192 ±\pm 0.016 0.676 ±\pm 0.005 1.204 ±\pm 0.005 0.688 ±\pm 0.007 OOM OOM
+StDenoise 1.126 ±\pm 0.014 0.630 ±\pm 0.006 1.120 ±\pm 0.006 0.626 ±\pm 0.005 1.116 ±\pm 0.008 0.632 ±\pm 0.004 1.130 ±\pm 0.006 0.642 ±\pm 0.007 OOM OOM
+GraphRfi 1.198 ±\pm 0.015 0.666 ±\pm 0.005 1.188 ±\pm 0.010 0.666 ±\pm 0.005 1.194 ±\pm 0.010 0.668 ±\pm 0.007 1.204 ±\pm 0.019 0.674 ±\pm 0.008 OOM OOM
+LLM4Dec 1.200 ±\pm 0.011 0.676 ±\pm 0.005 1.190 ±\pm 0.011 0.670 ±\pm 0.006 1.194 ±\pm 0.015 0.676 ±\pm 0.005 1.194 ±\pm 0.005 0.682 ±\pm 0.004 OOM OOM
+APR 1.218 ±\pm 0.010 0.682 ±\pm 0.007 1.262 ±\pm 0.016 0.712 ±\pm 0.007 1.212 ±\pm 0.008 0.686 ±\pm 0.004 1.214 ±\pm 0.010 0.696 ±\pm 0.008 OOM OOM
+VAT 1.264 ±\pm 0.012 0.710 ±\pm 0.000 1.264 ±\pm 0.014 0.714 ±\pm 0.005 1.266 ±\pm 0.008 0.714 ±\pm 0.005 1.260 ±\pm 0.013 0.718 ±\pm 0.010 OOM OOM
Gain +3.78% \uparrow +4.11% \uparrow +0.16% \uparrow +0.28% \uparrow +4.44% \uparrow +4.10% \uparrow +3.79% \uparrow +3.16% \uparrow - -
Gain w.r.t. MF +4.98% \uparrow +5.03% \uparrow +6.22% \uparrow +6.57% \uparrow +6.21% \uparrow +5.62% \uparrow +4.65% \uparrow +4.36% \uparrow - -

5.1.3. Attack Methods

In our analysis, we explore both heuristic (Random Attack (Lam and Riedl, 2004), Bandwagon Attack (Burke et al., 2005)) and optimization-based (Rev Attack (Tang et al., 2020), DP Attack (Huang et al., 2021)) attack methods within a black-box context, where the attacker does not have access to the internal architecture or parameters of the target model.

5.1.4. Evaluation Metrics

We adopt standard metrics widely accepted in the field. The primary metrics for assessing recommendation performance are the top-$k$ metrics: Hit Ratio at $k$ ($\mathrm{HR}@k$) and Normalized Discounted Cumulative Gain at $k$ ($\mathrm{NDCG}@k$), as documented in (Zhang et al., 2023; He et al., 2020; Wang et al., 2019). To quantify the success ratio of attacks, we utilize metrics tailored to measuring the performance of target item promotion within the top-$k$ recommendations, denoted as $\mathrm{T}\text{-}\mathrm{HR}@k$ and $\mathrm{T}\text{-}\mathrm{NDCG}@k$ (Tang et al., 2020; Huang et al., 2021; Zhang et al., 2024b):

(6) $\mathrm{T}\text{-}\mathrm{HR}@k=\frac{1}{|\mathcal{T}|}\sum_{\mathit{tar}\in\mathcal{T}}\frac{\sum_{u\in\mathcal{U}-\mathcal{U}_{\mathit{tar}}}\mathbb{I}\left(\mathit{tar}\in L_{u,1:k}\right)}{|\mathcal{U}-\mathcal{U}_{\mathit{tar}}|},$

where $\mathcal{T}$ is the set of target items, $\mathcal{U}_{\mathit{tar}}$ denotes the set of genuine users who interacted with the target item $\mathit{tar}$, $L_{u,1:k}$ represents the top-$k$ list of recommendations for user $u$, and $\mathbb{I}(\cdot)$ is the indicator function that returns 1 if the condition is true. $\mathrm{T}\text{-}\mathrm{NDCG}@k$ mirrors $\mathrm{T}\text{-}\mathrm{HR}@k$, serving as the target item-specific version of $\mathrm{NDCG}@k$.
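Equation 6 can be computed along these lines; the recommendation-list and interaction data structures are assumptions.

```python
import numpy as np

def target_hit_ratio(rec_lists, target_items, interacted, k=50):
    """Eq. 6: T-HR@k averaged over target items, counting only genuine users
    who have not interacted with the target item.

    rec_lists:    {user: ranked list of recommended item ids}
    target_items: iterable of promoted target item ids
    interacted:   {item: set of users who genuinely interacted with it}
    """
    ratios = []
    for tar in target_items:
        eligible = [u for u in rec_lists if u not in interacted.get(tar, set())]
        hits = sum(tar in rec_lists[u][:k] for u in eligible)
        ratios.append(hits / max(len(eligible), 1))
    return float(np.mean(ratios))

# Toy usage: 3 users, 2 target items, user 2 already interacted with item 9.
recs = {0: list(range(100)), 1: list(range(50, 150)), 2: list(range(100))}
print(target_hit_ratio(recs, target_items=[9, 120], interacted={9: {2}}, k=50))
```

$\mathrm{T}\text{-}\mathrm{NDCG}@k$ would additionally discount each hit by the logarithm of its rank position, as in the standard NDCG formulation.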

5.1.5. Implementation Details

In our study, we employ two common backbone recommendation models, MF (Koren et al., 2009) and LightGCN (He et al., 2020). To quantify the success ratio of attacks, we select $k=50$ for the evaluation metrics following (Huang et al., 2021; Tang et al., 2020; Wu et al., 2021b), while for assessing recommendation performance, we utilize $k=20$ following (He et al., 2020; Wang et al., 2019). The configuration of both the defense methods and the recommendation models involves selecting a learning rate from $\{0.1, 0.01, \dots, 1\times 10^{-5}\}$ and a weight decay from $\{0, 0.1, \dots, 1\times 10^{-5}\}$. The implementation of GraphRfi follows its original paper. For the detection-based methods, we employ the Random Attack to generate supervised fraudster data. The magnitude parameter of adversarial perturbations in both APR and VAT is determined from the range $\{0.1, 0.2, \dots, 1.0\}$. In terms of attack methods, we set the attack budget to $1\%$ and target five items. The attack hyperparameters align with those detailed in their original publications. Our implementation code is accessible at https://github.com/Kaike-Zhang/VAT.

5.2. Performance Comparison (RQ1)

In this section, we answer RQ1. We focus on two key aspects: the robustness against poisoning attacks and the recommendation performance of our proposed VAT.

5.2.1. Robustness Against Poisoning Attacks

We evaluate the effectiveness of VAT in defending against poisoning attacks by analyzing the attack success ratio. Our experiments focus on items with extremely low popularity, indicated by $\mathrm{T}\text{-}\mathrm{HR}@50$ and $\mathrm{T}\text{-}\mathrm{NDCG}@50$ scores of 0.0 in the absence of any attack. Note: lower $\mathrm{T}\text{-}\mathrm{HR}@50$ and $\mathrm{T}\text{-}\mathrm{NDCG}@50$ scores signify stronger defense capabilities.

Item Promotion Attack - Unpopular Items: The results in Table 2 reveal that the purely denoise-based defense strategy is mostly effective against random attacks, attributable to the random selection of items and the consequently simpler task of filtering out these fake users’ interactions. However, when faced with other types of attacks, denoise-based defenses might even increase the attack’s success ratio. Detection-based methods, such as GraphRfi and LLM4Dec, demonstrate robust defense capabilities against attacks that align with their training data (notably, random attacks). However, the effectiveness of GraphRfi diminishes significantly against other types of attacks. In contrast, adversarial training methods, which do not rely on prior knowledge, consistently show stable defense against various attacks. Among them, VAT significantly outperforms traditional adversarial training methods such as APR and SharpCF. VAT reduces the success ratio of attacks, decreasing the average $\mathrm{T}\text{-}\mathrm{HR}@50$ and $\mathrm{T}\text{-}\mathrm{NDCG}@50$ by 21.53% and 22.54%, respectively, compared to the top baseline results. These findings underscore VAT’s superior defense mechanism.

Item Promotion Attack - Popular Items: Furthermore, we assess VAT’s defense capabilities against attacks targeting popular items on Gowalla. According to Figure 5, VAT exhibits strong defensibility, outperforming the best baseline even when attacks specifically promote popular items.

Table 4. Robustness and Performance against Adaptive Attack
Model T-HR@50 (%) T-NDCG@50 (%) HR@20 (%) NDCG@20 (%)
MF 0.201 ±\pm 0.020 0.051 ±\pm 0.005 10.722 ±\pm 0.109 8.170 ±\pm 0.076
+TopBaseline 0.049 ±\pm 0.024 0.012 ±\pm 0.005 12.952 ±\pm 0.082 10.630 ±\pm 0.066
+VAT 0.033 ±\pm 0.004 0.008 ±\pm 0.001 13.461 ±\pm 0.045 10.973 ±\pm 0.040
Gain +32.72% \uparrow +34.45% \uparrow +3.9% \uparrow +3.2% \uparrow

Adaptive Item Promotion Attack: Additionally, we evaluate the effectiveness of defenses against adaptive DP Attacks (note that the Rev Attack cannot be made adaptive due to its close dependency on the loss function), as shown in Table 4. Our results indicate that VAT performs even better against adaptive DP Attacks than against non-adaptive ones, highlighting VAT’s superior defense capability.

5.2.2. Recommendation Performance

In our assessment of the efficacy of various defense methods on recommendation performance, as depicted in Table 3, we observe a notable improvement in recommendation quality with the use of adversarial training methods. This observation aligns with findings from previous studies (He et al., 2018b; Chen et al., 2023a), which indicate that adversarial training can significantly enhance the performance of recommender systems. Among the evaluated methods, VAT stands out by achieving the most impressive outcomes in enhancing recommendation performance, surpassing other baseline approaches. This indicates that the user-adaptive magnitude of perturbations, while resisting attacks, can also positively impact recommendation performance.

5.3. Hyper-Parameters Analysis (RQ2)

Figure 6. Top section: analysis of hyper-parameter $\rho$; bottom section: analysis of hyper-parameter $\lambda$ (on Gowalla).

In this section, we answer RQ2. We explore the effects of the hyperparameters, i.e., the perturbation magnitude $\rho$ and the adversarial training weight $\lambda$ defined in Equation 4. The results are shown in Figure 6.

5.3.1. Analysis of Hyper-Parameter $\rho$

With $\lambda$ set to 1.0 (the optimal setting for $\lambda$), we vary $\rho$ from 0.1 to 1.0 in increments of 0.1. Our findings reveal a progressive improvement in defensive performance as $\rho$ increases, as shown in the top part of Figure 6. Notably, defensive efficacy stabilizes once $\rho$ exceeds 0.6. Furthermore, the range of $\rho$ from 0.5 to 0.8 results in the most significant enhancement in recommendation performance, as also depicted in the top part of Figure 6. Even when $\rho$ is set to a small value of 0.2, there is a noticeable improvement in recommendation performance.

5.3.2. Analysis of Hyper-Parameter $\lambda$

Setting $\rho$ at 0.6 (the optimal setting for $\rho$), we vary $\lambda$ from 0.2 to 2.0 in increments of 0.2. This analysis shows that both defense capabilities and the ability to enhance recommendation quality become stable when $\lambda$ exceeds 1.0. Notably, excessively high $\lambda$ can increase performance variance and complicate the convergence of adversarial training. This suggests an optimal setting of $\lambda=1.0$ to balance performance and stability.

5.4. Case Study (RQ3)

Figure 7. Cases of an invulnerable user and a vulnerable user.
Figure 8. Case of an over-fitted user.

In this section, we answer RQ3 by presenting user cases that further support the effectiveness of VAT.

5.4.1. Invulnerable User and Vulnerable User

We illustrate cases for both an invulnerable user and a vulnerable user, showcasing their top-10 recommendation lists obtained through normal training on clean data, as well as, on poisoned data, through normal training, traditional adversarial training with the same small-magnitude perturbations ($\epsilon=0.2$), traditional adversarial training with the same large-magnitude perturbations ($\epsilon=0.7$), and our VAT method, as depicted in Figure 7.

We find that small-magnitude perturbations in traditional adversarial training preserve the recommendation performance for the invulnerable user (characterized by a large loss), but offer insufficient protection for the vulnerable user (characterized by a small loss). Conversely, large-magnitude perturbations in traditional adversarial training render the attack ineffective for the vulnerable user but impair the recommendation performance for the invulnerable user. With VAT, user-adaptive magnitudes of perturbations not only enhance recommendation performance for the invulnerable user but also provide adequate protection for the vulnerable user.

5.4.2. Over-fitted User

Additionally, we discuss users who are over-fitted (characterized by small losses but not affected by attacks), as mentioned in Section 4.4. Although these users are not affected by attacks, applying large-magnitude perturbations through VAT can mitigate over-fitting, thus improving their performance, as demonstrated in Figure 8.

6. CONCLUSION

In this study, we innovatively explore user vulnerability in recommender systems subjected to poisoning attacks. Our findings indicate that well-fitted users in the current recommender system are more likely to be vulnerable, i.e., affected by attacks. This exploration has led to the development of a Vulnerability-Aware Adversarial Training (VAT) method. VAT distinctively tailors the magnitude of adversarial perturbations according to users’ vulnerabilities, thereby avoiding the typical trade-offs between robustness and performance suffered by traditional adversarial training methods in recommender systems. Through comprehensive experimentation, we have confirmed the effectiveness of VAT. VAT not only reduces the success ratio of poisoning attacks but also improves the overall recommendation performance.

Acknowledgements.
This work is funded by the National Key R&D Program of China (2022YFB3103700, 2022YFB3103701), the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDB0680101, and the National Natural Science Foundation of China under Grant Nos. 62102402, 62272125, U21B2046. Huawei Shen is also supported by Beijing Academy of Artificial Intelligence (BAAI).

References

  • Burke et al. (2005) Robin Burke, Bamshad Mobasher, and Runa Bhaumik. 2005. Limited Knowledge Shilling Attacks in Collaborative Filtering Systems. In Proceedings of 3rd International Workshop on Intelligent Techniques for Web Personalization, 19th International Joint Conference on Artificial Intelligence. 17–24.
  • Chen and Li (2019) Huiyuan Chen and Jing Li. 2019. Adversarial Tensor Factorization for Context-aware Recommendation. In Proceedings of the 13th ACM Conference on Recommender Systems. 363–367.
  • Chen et al. (2023a) Huiyuan Chen, Xiaoting Li, Vivian Lai, Chin-Chia Michael Yeh, Yujie Fan, Yan Zheng, Mahashweta Das, and Hao Yang. 2023a. Adversarial Collaborative Filtering for Free. In Proceedings of the 17th ACM Conference on Recommender Systems. ACM, 245–255.
  • Chen et al. (2022) Jingfan Chen, Wenqi Fan, Guanghui Zhu, Xiangyu Zhao, Chunfeng Yuan, Qing Li, and Yihua Huang. 2022. Knowledge-enhanced Black-box Attacks for Recommendations. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 108–117.
  • Chen et al. (2023b) Ziheng Chen, Fabrizio Silvestri, Jia Wang, Yongfeng Zhang, and Gabriele Tolomei. 2023b. The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples. arXiv preprint arXiv:2305.00574 (2023).
  • Chirita et al. (2005) Paul-Alexandru Chirita, Wolfgang Nejdl, and Cristian Zamfir. 2005. Preventing Shilling Attacks in Online Recommender Systems. In Proceedings of the 7th Annual ACM International Workshop on Web Information and Data Management. 67–74.
  • Chung et al. (2013) Chen-Yao Chung, Ping-Yu Hsu, and Shih-Hsiang Huang. 2013. βP: A Novel Approach to Filter Out Malicious Rating Profiles from Recommender Systems. Decision Support Systems 55, 1 (2013), 314–325.
  • Covington et al. (2016) Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep Neural Networks for Youtube Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems. 191–198.
  • Gomez-Uribe and Hunt (2015) Carlos A Gomez-Uribe and Neil Hunt. 2015. The Netflix Recommender System: Algorithms, Business Value, and Innovation. ACM Transactions on Management Information Systems 6, 4 (2015), 1–19.
  • He et al. (2020) Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, YongDong Zhang, and Meng Wang. 2020. LightGCN - Simplifying and Powering Graph Convolution Network for Recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 639–648.
  • He et al. (2018a) Xiangnan He, Xiaoyu Du, Xiang Wang, Feng Tian, Jinhui Tang, and Tat-Seng Chua. 2018a. Outer Product-based Neural Collaborative Filtering. In Proceedings of the 27th International Joint Conference on Artificial Intelligence. 2227–2233.
  • He et al. (2018b) Xiangnan He, Zhankui He, Xiaoyu Du, and Tat-Seng Chua. 2018b. Adversarial Personalized Ranking for Recommendation. In The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval. 355–364.
  • He et al. (2017) Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web. 173–182.
  • Huang and Li (2023) Chengzhi Huang and Hui Li. 2023. Single-User Injection for Invisible Shilling Attack against Recommender Systems. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management. 864–873.
  • Huang et al. (2021) Hai Huang, Jiaming Mu, Neil Zhenqiang Gong, Qi Li, Bin Liu, and Mingwei Xu. 2021. Data Poisoning Attacks to Deep Learning Based Recommender Systems. In Proceedings 2021 Network and Distributed System Security Symposium.
  • Koren et al. (2009) Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix Factorization Techniques for Recommender Systems. Computer 42, 8 (2009), 30–37.
  • Koren et al. (2021) Yehuda Koren, Steffen Rendle, and Robert Bell. 2021. Advances in collaborative filtering. Recommender systems handbook (2021), 91–142.
  • Lam and Riedl (2004) Shyong K Lam and John Riedl. 2004. Shilling Recommender Systems for Fun and Profit. In Proceedings of the 13th International Conference on World Wide Web. 393–402.
  • Li et al. (2016) Bo Li, Yining Wang, Aarti Singh, and Yevgeniy Vorobeychik. 2016. Data Poisoning Attacks on Factorization-based Collaborative Filtering. Advances in Neural Information Processing Systems 29 (2016).
  • LI et al. (2022) Haoyang LI, Shimin DI, and Lei Chen. 2022. Revisiting Injective Attacks on Recommender Systems. In Conference on Neural Information Processing Systems (NeurIPS), Vol. 35. 29989–30002.
  • Li et al. (2020) Ruirui Li, Xian Wu, and Wei Wang. 2020. Adversarial Learning to Compare: Self-attentive Prospective Customer Recommendation in Location Based Social Networks. In Proceedings of the 13th International Conference on Web Search and Data Mining. 349–357.
  • Liang et al. (2016) Dawen Liang, Laurent Charlin, James McInerney, and David M Blei. 2016. Modeling User Exposure in Recommendation. In Proceedings of the 25th International Conference on World Wide Web. 951–961.
  • Liang et al. (2018) Dawen Liang, Rahul G Krishnan, Matthew D Hoffman, and Tony Jebara. 2018. Variational Autoencoders for Collaborative Filtering. In Proceedings of the 2018 World Wide Web Conference. 689–698.
  • Liu (2020) Yuli Liu. 2020. Recommending Inferior Results: A General and Feature-Free Model for Spam Detection. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management. 955–974.
  • Mobasher et al. (2007) Bamshad Mobasher, Robin Burke, Runa Bhaumik, and Chad Williams. 2007. Toward Trustworthy Recommender Systems: An Analysis of Attack Models and Algorithm Robustness. ACM Transactions on Internet Technology 7, 4 (2007), 23–es.
  • Qian et al. (2023) Fulan Qian, Bei Yuan, Hai Chen, Jie Chen, Defu Lian, and Shu Zhao. 2023. Enhancing the Transferability of Adversarial Examples Based on Nesterov Momentum for Recommendation Systems. IEEE Transactions on Big Data (2023).
  • Seminario and Wilson (2014) Carlos E Seminario and David C Wilson. 2014. Attacking Item-based Recommender Systems with Power Items. In Proceedings of the 8th ACM Conference on Recommender Systems. 57–64.
  • Smith and Linden (2017) Brent Smith and Greg Linden. 2017. Two Decades of Recommender Systems at Amazon.com. IEEE Internet Computing 21, 3 (2017), 12–18.
  • Su and Khoshgoftaar (2009) Xiaoyuan Su and Taghi M Khoshgoftaar. 2009. A Survey of Collaborative Filtering Techniques. Advances in Artificial Intelligence 2009 (2009).
  • Tang et al. (2020) Jiaxi Tang, Hongyi Wen, and Ke Wang. 2020. Revisiting adversarially learned injection attacks against recommender systems. In Proceedings of the 14th ACM Conference on Recommender Systems. 318–327.
  • Tian et al. (2022) Changxin Tian, Yuexiang Xie, Yaliang Li, Nan Yang, and Wayne Xin Zhao. 2022. Learning to Denoise Unreliable Interactions for Graph Collaborative Filtering. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 122–132.
  • Van der Maaten and Hinton (2008) Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing Data Using T-SNE. Journal of Machine Learning Research 9, 11 (2008).
  • Wang et al. (2015) Hao Wang, Naiyan Wang, and Dit-Yan Yeung. 2015. Collaborative Deep Learning for Recommender Systems. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1235–1244.
  • Wang et al. (2019) Xiang Wang, Xiangnan He, Meng Wang, Fuli Feng, and Tat-Seng Chua. 2019. Neural Graph Collaborative Filtering. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 165–174.
  • Wang et al. (2023) Yanling Wang, Yuchen Liu, Qian Wang, Cong Wang, and Chenliang Li. 2023. Poisoning Self-supervised Learning Based Sequential Recommendations. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 300–310.
  • Wu et al. (2021a) Chenwang Wu, Defu Lian, Yong Ge, Zhihao Zhu, and Enhong Chen. 2021a. Triple Adversarial Learning for Influence-based Poisoning Attack in Recommender Systems. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1830–1840.
  • Wu et al. (2021b) Chenwang Wu, Defu Lian, Yong Ge, Zhihao Zhu, Enhong Chen, and Senchao Yuan. 2021b. Fight Fire with Fire: Towards Robust Recommender Systems via Adversarial Poisoning Training. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1074–1083.
  • Wu et al. (2020b) Fangzhao Wu, Ying Qiao, Jiun-Hung Chen, Chuhan Wu, Tao Qi, Jianxun Lian, Danyang Liu, Xing Xie, Jianfeng Gao, Winnie Wu, et al. 2020b. Mind: A Large-scale Dataset for News Recommendation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 3597–3606.
  • Wu et al. (2020a) Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and S Yu Philip. 2020a. A Comprehensive Survey on Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems 32, 1 (2020), 4–24.
  • Yang et al. (2016) Zhihai Yang, Lin Xu, Zhongmin Cai, and Zongben Xu. 2016. Re-scale AdaBoost for Attack Detection in Collaborative Filtering Recommender Systems. Knowledge-Based Systems 100 (2016), 74–88.
  • Ye et al. (2023) Haibo Ye, Xinjie Li, Yuan Yao, and Hanghang Tong. 2023. Towards Robust Neural Graph Collaborative Filtering via Structure Denoising and Embedding Perturbation. ACM Transactions on Information Systems 41, 3 (2023), 1–28.
  • Ying et al. (2018) Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L Hamilton, and Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-scale Recommender Systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 974–983.
  • Zhang and Zhou (2014) Fuzhi Zhang and Quanqiang Zhou. 2014. HHT–SVM: An Online Method for Detecting Profile Injection Attacks in Collaborative Recommender Systems. Knowledge-Based Systems 65 (2014), 96–105.
  • Zhang et al. (2023) Kaike Zhang, Qi Cao, Fei Sun, Yunfan Wu, Shuchang Tao, Huawei Shen, and Xueqi Cheng. 2023. Robust Recommender System: A Survey and Future Directions. arXiv preprint arXiv:2309.02057 (2023).
  • Zhang et al. (2024a) Kaike Zhang, Qi Cao, Yunfan Wu, Fei Sun, Huawei Shen, and Xueqi Cheng. 2024a. LoRec: Combating Poisons with Large Language Model for Robust Sequential Recommendation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1733–1742.
  • Zhang et al. (2024b) Kaike Zhang, Qi Cao, Yunfan Wu, Fei Sun, Huawei Shen, and Xueqi Cheng. 2024b. LoRec: Large Language Model for Robust Sequential Recommendation against Poisoning Attacks. arXiv abs/2401.17723 (2024).
  • Zhang et al. (2020) Shijie Zhang, Hongzhi Yin, Tong Chen, Quoc Viet Nguyen Hung, Zi Huang, and Lizhen Cui. 2020. GCN-Based User Representation Learning for Unifying Robust Recommendation and Fraudster Detection. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 689–698.