
Towards More Practical Adversarial Attacks on Graph Neural Networks

Jiaqi Ma
[email protected]
Shuangrui Ding
[email protected]
Qiaozhu Mei
[email protected]
School of Information, University of Michigan, Ann Arbor, Michigan, USA. Equal contribution. Department of EECS, University of Michigan, Ann Arbor, Michigan, USA.
Abstract

We study black-box attacks on graph neural networks (GNNs) under a novel and realistic constraint: attackers have access to only a subset of nodes in the network, and they can only attack a small number of them. A node selection step is essential under this setup. We demonstrate that the structural inductive biases of GNN models can be an effective source for this type of attack. Specifically, by exploiting the connection between the backward propagation of GNNs and random walks, we show that the common gradient-based white-box attacks can be generalized to the black-box setting via the connection between the gradient and an importance score similar to PageRank. In practice, we find attacks based on this importance score indeed increase the classification loss by a large margin, but they fail to significantly increase the mis-classification rate. Our theoretical and empirical analyses suggest that there is a discrepancy between the loss and the mis-classification rate, as the latter presents a diminishing-return pattern when the number of attacked nodes increases. Therefore, we propose a greedy procedure to correct the importance score by taking the diminishing-return pattern into account. Experimental results show that the proposed procedure can significantly increase the mis-classification rate of common GNNs on real-world data without access to model parameters or predictions. (The code for the experiments in this paper is available at https://github.com/Mark12Ding/GNN-Practical-Attack. A follow-up project with an improved experiment setup is available at https://github.com/TheaperDeng/GNN-Attack-InfMax.)

1 Introduction

Graph neural networks (GNNs) [23], the family of deep learning models on graphs, have shown promising empirical performance on various applications of machine learning to graph data, such as recommender systems [28], social network analysis [12], and drug discovery [17]. Like other deep learning models, GNNs have been shown to be vulnerable to adversarial attacks [31], which has recently attracted increasing research interest [9]. Indeed, adversarial attacks have been an efficient tool to analyze both the theoretical properties and the practical accountability of graph neural networks. As graph data have more complex structures than image or text data, researchers have come up with diverse adversarial attack setups. For example, there are different tasks (node classification and graph classification), assumptions about the attacker's knowledge (white-box, grey-box, and black-box), strategies (node feature modification and graph structure modification), and corresponding budgets or other constraints (norm of feature changes or number of edge changes).

Despite these research efforts, there is still a considerable gap between the existing attack setups and reality. It is unreasonable to assume that an attacker can alter the input of a large proportion of nodes, and even with a budget limit, it is unreasonable to assume that they can attack any node they wish. For example, in a real-world social network, the attackers usually only have access to a few bot accounts, which are unlikely to be among the top nodes in the network, and it is difficult for the attackers to hack and alter the properties of celebrity accounts. Moreover, an attacker usually has limited knowledge about the underlying machine learning model used by the platform (e.g., they may roughly know what types of models are used but have no access to the model parameters or training labels). Motivated by this real-world scenario of attacks, in this paper we study a new type of black-box adversarial attack for node classification tasks, which is more restricted and more practical, assuming that the attacker has no access to the model parameters or predictions. Our setup differs from existing work with a novel constraint on node access: attackers only have access to a subset of nodes in the graph, and they can only manipulate a small number of them.

The proposed black-box adversarial attack requires a two-step procedure: 1) selecting a small subset of nodes to attack under the limits of node access; 2) altering the node attributes or edges under a per-node budget. In this paper, we focus on the first step and study the node selection strategy. The key insight of the proposed strategy lies in the observation that, with no access to the GNN parameters or predictions, the strong structural inductive biases of the GNN models can be exploited as an effective information source of attacks. The structural inductive biases encoded by various neural architectures (e.g., the convolution kernel in convolutional neural networks) play important roles in the success of deep learning models. GNNs have even more explicit structural inductive biases due to the graph structure and their heavy weight sharing design. Theoretical analyses have shown that the understanding of structural inductive biases could lead to better designs of GNN models [26, 11]. From a new perspective, our work demonstrates that such structural inductive biases can turn into security concerns in a black-box attack, as the graph structure is usually exposed to the attackers.

Following this insight, we derive a node selection strategy with a formal analysis of the proposed black-box attack setup. By exploiting the connection between the backward propagation of GNNs and random walks, we first generalize the gradient norm in a white-box attack into a model-independent importance score similar to PageRank. In practice, attacking the nodes with high importance scores increases the classification loss significantly but does not generate the same effect on the mis-classification rate. Our theoretical and empirical analyses suggest that such discrepancy is due to the diminishing-return effect of the mis-classification rate. We further propose a greedy correction procedure for calculating the importance scores. Experiments on three real-world benchmark datasets and popular GNN models show that the proposed attack strategy significantly outperforms baseline methods. We summarize our main contributions as follows:

  1. We propose a novel setup of black-box attacks for GNNs with a constraint of limited node access, which is by far the most restricted and practical compared to existing work.

  2. We demonstrate that the structural inductive biases of GNNs can be exploited as an effective information source of black-box adversarial attacks.

  3. We analyze the discrepancy between classification loss and mis-classification rate and propose a practical greedy method of adversarial attacks for node classification tasks.

  4. We empirically verify the effectiveness of the proposed method on three benchmark datasets with popular GNN models.

2 Related Work

2.1 Adversarial Attack on GNNs

The study of adversarial attacks on graph neural networks has surged recently. A taxonomy of existing work has been summarized by Jin et al. [9], and we give a brief introduction here. First, there are two types of machine learning tasks on graphs that are commonly studied, node-level classification and graph-level classification. We focus on node-level classification in this paper. Next, there are a couple of choices of the attack form. For example, the attack can happen either during model training (poisoning) or during model testing (evasion); the attacker may aim to mislead the prediction on specific nodes (targeted attack) [31] or damage the overall task performance (untargeted attack) [30]; the adversarial perturbation can be done by modifying node features, adding or deleting edges, or injecting new nodes [18]. Our work belongs to untargeted evasion attacks. For the adversarial perturbation, most existing works on untargeted attacks apply global constraints on the proportion of node features or the number of edges to be altered. Our work sets a novel local constraint on node access, which is more realistic in practice: perturbation of top (e.g., celebrity) nodes is prohibited and only a small number of nodes can be perturbed. Finally, depending on the attacker's knowledge about the GNN model, existing work can be split into three categories: white-box attacks [24, 5, 22] have access to full information about the model, including model parameters, input data, and labels; grey-box attacks [30, 31, 18] have partial information about the model, and the exact setups vary in a range; in the most challenging setting, black-box attacks [6, 2, 4] can only access the input data and sometimes the black-box predictions of the model. In this work, we consider an even stricter black-box attack setup, where model predictions are invisible to the attackers. As far as we know, the only existing works that conduct untargeted black-box attacks without access to model predictions are those by Bojchevski and Günnemann [2] and Chang et al. [4]. However, both of them require access to node embeddings, which is also prohibited in our setup.

2.2 Structural Inductive Bias of GNNs

Despite the extremely restricted black-box setup, we demonstrate that effective adversarial attacks are still possible due to the strong and explicit structural inductive biases of GNNs.

Structural inductive biases refer to the structures encoded by various neural architectures, such as the weight sharing mechanisms in convolution kernels of convolutional neural networks, or the gating mechanisms in recurrent neural networks. Such neural architectures have been recognized as a key factor for the success of deep learning models [29], which (partially) motivates some recent developments in neural architecture search [29], Bayesian deep learning [21], the Lottery Ticket Hypothesis [7], etc. The natural graph structure and the heavy weight sharing mechanism grant GNN models even more explicit structural inductive biases. Indeed, GNN models have been theoretically shown to share similar behaviors with Weisfeiler-Lehman tests [14, 25] or random walks [26]. On the positive side, such theoretical analyses have led to better GNN model designs [26, 11].

Our work instead studies the negative impact of the structural inductive biases in the context of adversarial attacks: when the graph structure is exposed to the attacker, such structural information can turn into the knowledge source for an attack. While most existing attack strategies more or less utilize some structural properties of GNNs, they do so in a data-driven manner that requires querying the GNN model, e.g., learning to edit the graph via trial-and-error interaction with the GNN model [6]. We formally establish connections between the structural properties and attack strategies without any queries to the GNN model.

3 Principled Black-Box Attack Strategies with Limited Node Access

In this section, we derive principled strategies to attack GNNs under the novel black-box setup with limited node access. We first analyze the corresponding white-box attack problem in Section 3.2, and then adapt the theoretical insights from the white-box setup to the black-box setup and propose a black-box attack strategy in Section 3.3. Finally, in Section 3.4, we correct the proposed strategy by taking into account the diminishing-return effect of the mis-classification rate.

3.1 Preliminary Notations

We first introduce the necessary notation. We denote a graph as $G=(V,E)$, where $V=\{1,2,\ldots,N\}$ is the set of $N$ nodes and $E\subseteq V\times V$ is the set of edges. For a node classification problem, the nodes of the graph are collectively associated with node features $X\in\mathbb{R}^{N\times D}$ and labels $y\in\{1,2,\ldots,K\}^{N}$, where $D$ is the dimensionality of the feature vectors and $K$ is the number of classes. Each node $i$'s local neighborhood including itself is denoted as $\mathcal{N}_{i}=\{j\in V\mid(i,j)\in E\}\cup\{i\}$, and its degree as $d_{i}=|\mathcal{N}_{i}|$. To ease the notation, for any matrix $A\in\mathbb{R}^{D_{1}\times D_{2}}$ in this paper, we use $A_{j}\in\mathbb{R}^{D_{2}}$ to denote the transpose of the $j$-th row of the matrix.

GNN models. Given the graph $G$, a GNN model is a function $f_{G}:\mathbb{R}^{N\times D}\rightarrow\mathbb{R}^{N\times K}$ that maps the node features $X$ to output logits of each node. We denote the output logits of all nodes as a matrix $H\in\mathbb{R}^{N\times K}$, with $H=f_{G}(X)$. A GNN $f_{G}$ is usually built by stacking a certain number ($L$) of layers, with the $l$-th layer, $1\leq l\leq L$, taking the following form:

$H_{i}^{(l)}=\sigma\left(\sum_{j\in\mathcal{N}_{i}}\alpha_{ij}\bm{W}_{l}H_{j}^{(l-1)}\right),$ (1)

where $H^{(l)}\in\mathbb{R}^{N\times D_{l}}$ is the hidden representation of nodes with $D_{l}$ dimensions, output by the $l$-th layer; $\bm{W}_{l}$ is a learnable linear transformation matrix; $\sigma$ is an element-wise nonlinear activation function; and different GNNs have different normalization terms $\alpha_{ij}$. For instance, $\alpha_{ij}=1/\sqrt{d_{i}d_{j}}$ or $\alpha_{ij}=1/d_{i}$ in Graph Convolutional Networks (GCN) [10]. In addition, $H^{(0)}=X$ and $H=H^{(L)}$.
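To make Eq. (1) concrete, below is a minimal NumPy sketch of one such layer with the averaging normalization $\alpha_{ij}=1/d_{i}$; the function name, the dense-matrix representation, and the row-vector convention (so the code multiplies by the transpose of $\bm{W}_{l}$) are illustrative choices, not part of the original paper.

```python
import numpy as np

def gnn_layer(A, H_prev, W, activation=np.tanh):
    """One layer of Eq. (1) with averaging normalization alpha_ij = 1/d_i.

    A: (N, N) binary adjacency matrix (self-loops added below).
    H_prev: (N, D_prev) hidden representation H^{(l-1)} (rows are nodes).
    W: (D_prev, D_l) linear map, i.e., the transpose of W_l acting on row vectors.
    """
    A_hat = A + np.eye(A.shape[0])          # include each node itself in its neighborhood N_i
    d = A_hat.sum(axis=1, keepdims=True)    # degrees d_i = |N_i|
    H_agg = (A_hat / d) @ H_prev            # sum_{j in N_i} (1/d_i) H_j^{(l-1)}
    return activation(H_agg @ W)            # sigma( sum_j alpha_ij W_l H_j^{(l-1)} )
```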

Random walks. A random walk [13] on $G$ is specified by the matrix of transition probabilities, $M\in\mathbb{R}^{N\times N}$, where

$M_{ij}=\begin{cases}1/d_{i}, & \text{ if }(i,j)\in E\text{ or }j=i,\\ 0, & \text{ otherwise.}\end{cases}$

Each $M_{ij}$ represents the probability of transiting from $i$ to $j$ at any given step of the random walk, and powering the transition matrix by $t$ gives us the $t$-step transition matrix $M^{t}$.
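A minimal sketch of building $M$ and its $L$-step power from an adjacency matrix follows; the helper name is ours, and dense NumPy matrices are assumed for brevity (sparse matrices would be preferable on large graphs).

```python
import numpy as np

def transition_matrix(A, L):
    """Random-walk transition matrix M (with self-loops) and its L-step power M^L."""
    A_hat = A + np.eye(A.shape[0])                    # (i, j) in E or j = i
    M = A_hat / A_hat.sum(axis=1, keepdims=True)      # M_ij = 1/d_i over the neighborhood
    return np.linalg.matrix_power(M, L)               # t-step transitions via matrix power
```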

3.2 White-Box Adversarial Attacks with Limited Node Access

Problem formulation. Given a classification loss $\mathcal{L}:\mathbb{R}^{N\times K}\times\{1,\ldots,K\}^{N}\rightarrow\mathbb{R}$, the problem of white-box attack with limited node access can be formulated as the following optimization problem:

$\max_{S\subseteq V}\quad\mathcal{L}(H,y)$ (2)
subject to $|S|\leq r$, $d_{i}\leq m\ \forall i\in S$, and $H=f(\tau(X,S))$,

where $r,m\in\mathbb{Z}^{+}$ respectively specify the maximum number of nodes and the maximum degree of the nodes that can be attacked. Intuitively, we treat high-degree nodes as a proxy for celebrity accounts in a social network. For simplicity, we have omitted the subscript $G$ of the learned GNN classifier $f_{G}$. The function $\tau:\mathbb{R}^{N\times D}\times 2^{V}\rightarrow\mathbb{R}^{N\times D}$ perturbs the feature matrix $X$ based on the selected node set $S$ (i.e., the attack set). Under the white-box setup, $\tau$ could in theory also be optimized to maximize the loss. However, as our goal is to study the node selection strategy under the black-box setup, we set $\tau$ as a pre-determined function. In particular, we define the $j$-th row of the output of $\tau$ as $\tau(X,S)_{j}=X_{j}+\mathbbm{1}[j\in S]\epsilon$, where $\epsilon\in\mathbb{R}^{D}$ is a small constant noise vector constructed from the attackers' domain knowledge about the features. In other words, the same small noise vector is added to the features of every attacked node.

We use the Carlini-Wagner loss for our analysis, which is a close approximation of the cross-entropy loss and has been used in the analysis of adversarial attacks on image classifiers [3]:

$\mathcal{L}(H,y)\triangleq\sum_{j=1}^{N}\mathcal{L}_{j}(H_{j},y_{j})\triangleq\sum_{j=1}^{N}\left(\max_{k\in\{1,\ldots,K\}}H_{jk}-H_{jy_{j}}\right).$ (3)
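For reference, a small sketch of this loss (Eq. (3)) applied to a logit matrix is given below; the function name is ours.

```python
import numpy as np

def cw_loss(H, y):
    """Loss of Eq. (3): sum_j ( max_k H_jk - H_{j, y_j} )."""
    return float((H.max(axis=1) - H[np.arange(H.shape[0]), y]).sum())
```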

The change of loss under perturbation. Next we investigate how the overall loss changes when we select and perturb different nodes. We define the change of loss when perturbing node $i$ as a function of the perturbed feature vector $x$:

$\Delta_{i}(x)=\mathcal{L}(f(X^{\prime}),y)-\mathcal{L}(f(X),y),\text{ where }X^{\prime}_{i}=x\text{ and }X^{\prime}_{j}=X_{j},\ \forall j\neq i.$

To concretize the analysis, we consider the GCN model with $\alpha_{ij}=\frac{1}{d_{i}}$ in our following derivations. Suppose $f$ is an $L$-layer GCN. With the connection between GCN and random walks [26] and Assumption 1 on the label distribution, we can show that, in expectation, the first-order Taylor approximation $\widetilde{\Delta}_{i}(x)\triangleq\Delta_{i}(X_{i})+(\nabla_{x}\Delta_{i}(X_{i}))^{T}(x-X_{i})$ is related to the sum of the $i$-th column of the $L$-step random walk transition matrix $M^{L}$. We formally summarize this finding in Proposition 1.

Assumption 1 (Label Distribution).

Assume the distribution of the labels of all nodes follows the same constant categorical distribution, i.e.,

$\Pr[y_{j}=k]=q_{k},\quad\forall j=1,2,\ldots,N,$

where $0<q_{k}<1$ for $k=1,2,\ldots,K$ and $\sum_{k=1}^{K}q_{k}=1$. Moreover, since the classifier $f$ has been well-trained and fixed, the prediction of $f$ should capture certain relationships among the $K$ classes. Specifically, we assume the chance of $f$ predicting any node $j$ as any class $k\in\{1,\ldots,K\}$, conditioned on the node label $y_{j}=l\in\{1,\ldots,K\}$, follows a certain distribution $p(k\mid l)$, i.e.,

$\Pr\left[\left(\operatorname*{argmax}_{c\in\{1,\ldots,K\}}H_{jc}\right)=k\ \middle|\ y_{j}=l\right]=p(k\mid l).$
Proposition 1.

For an $L$-layer GCN model, if Assumption 1 and a technical assumption about the GCN (made by Xu et al. [26] and listed as Assumption 5 in Appendix A.1) hold, then

$\delta_{i}\triangleq\operatorname*{\mathbb{E}}\left[\widetilde{\Delta}_{i}\left(x\right)\big|_{x=\tau(X,\{i\})_{i}}\right]=C\sum_{j=1}^{N}[M^{L}]_{ji},$

where $C$ is a constant independent of $i$.

3.3 Adaptation from the White-Box Setup to the Black-Box Setup

Now we turn to the black-box setup, where we have no access to the model parameters or predictions. This means we are no longer able to evaluate the objective function $\mathcal{L}(H,y)$ of the optimization problem (2). Proposition 1 shows that the relative ratio $\delta_{i}/\delta_{j}$ between different nodes $i\neq j$ only depends on the random walk transition matrix, which we can easily calculate based on the graph $G$. This implies that we can still approximately optimize problem (2) in the black-box setup.

Node selection with importance scores. Consider the change of loss under the perturbation of a set of nodes $S$. If we write the change of loss as a function of the perturbed features and take the first-order Taylor expansion, which we denote as $\delta$, we have $\delta=\sum_{i\in S}\delta_{i}$. Therefore $\delta$ is maximized by the set of $r$ nodes with degrees less than $m$ and the largest possible $\delta_{i}$, where $m,r$ are the limits of node access defined in problem (2). We can thus define an importance score for each node $i$ as the sum of the $i$-th column of $M^{L}$, i.e., $I_{i}=\sum_{j=1}^{N}[M^{L}]_{ji}$, and simply select the nodes with the highest importance scores to attack. We denote this strategy as RWCS (Random Walk Column Sum). We note that RWCS is similar to PageRank; the difference is that PageRank uses the stationary transition matrix $M^{\infty}$ of a random walk with restart.
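A minimal sketch of the RWCS selection step under the degree constraint is shown below, reusing the hypothetical transition_matrix helper sketched in Section 3.1; all names are illustrative.

```python
import numpy as np

def rwcs_select(A, L, r, m):
    """Pick r nodes with the largest RWCS scores I_i = sum_j [M^L]_{ji}, subject to d_i <= m."""
    ML = transition_matrix(A, L)                     # L-step random walk transition matrix
    scores = ML.sum(axis=0)                          # column sums: importance scores I_i
    degrees = (A + np.eye(A.shape[0])).sum(axis=1)   # d_i = |N_i| (neighborhood includes i)
    candidates = np.where(degrees <= m)[0]           # respect the degree limit m
    ranked = candidates[np.argsort(-scores[candidates])]
    return ranked[:r].tolist()
```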

Empirically, RWCS indeed significantly increases the classification loss (as shown in Section 4.2). The nonlinear loss actually increases linearly w.r.t. the perturbation strength (the norm of the perturbation noise $\epsilon$) over a wide range, which indicates that $\widetilde{\Delta}_{i}$ is a good approximation of $\Delta_{i}$. Surprisingly, RWCS fails to continue to increase the mis-classification rate (which matters more in real applications) when the perturbation strength becomes larger. Details of this empirical finding are shown in Figure 1 in Section 4.2. We conduct additional formal analyses of the mis-classification rate in the following section and find a diminishing-return effect of adding more nodes to the attack set when the perturbation strength is adequate.

3.4 Diminishing-Return of Mis-classification Rate and its Correction

Analysis of the diminishing-return effect. Our analysis is based on investigating whether each target node $i\in V$ becomes mis-classified as we enlarge the attack set.

To assist the analysis, we first define the concepts of vulnerable function and vulnerable set below.

Definition 1 (Vulnerable Function).

We define the vulnerable function $g_{i}:2^{V}\rightarrow\{0,1\}$ of a target node $i\in V$ as, for a given attack set $S\subseteq V$,

$g_{i}(S)=\begin{cases}1, & \text{ if }i\text{ is mis-classified when attacking }S,\\ 0, & \text{ if }i\text{ is correctly classified when attacking }S.\end{cases}$
Definition 2 (Vulnerable Set).

We define the vulnerable set of a target node $i\in V$ as the set of all attack sets that could lead $i$ to be mis-classified:

$A_{i}\triangleq\{S\subseteq V\mid g_{i}(S)=1\}.$

We also make the following assumption about the vulnerable function.

Assumption 2.

$g_{i}$ is non-decreasing for all $i\in V$, i.e., if $T\subseteq S\subseteq V$, then $g_{i}(T)\leq g_{i}(S)$.

With the definitions above, the mis-classification rate can be written as the average of the vulnerable functions: $h(S)=\frac{1}{N}\sum_{i=1}^{N}g_{i}(S)$. By Assumption 2, $h$ is also clearly non-decreasing.

We further define the basic vulnerable set to characterize the minimal attack sets that can lead a target node to being mis-classified.

Definition 3 (Basic Vulnerable Set).

$\forall i\in V$, we call $B_{i}\subseteq A_{i}$ a basic vulnerable set of $i$ if:

  1) $\emptyset\notin B_{i}$; if $\emptyset\in A_{i}$, then $B_{i}=\emptyset$;

  2) if $\emptyset\notin A_{i}$, then for any nonempty $S\in A_{i}$, there exists a $T\in B_{i}$ s.t. $T\subseteq S$;

  3) for any distinct $S,T\in B_{i}$, $|S\cap T|<\min(|S|,|T|)$.

The existence and uniqueness of such a basic vulnerable set are guaranteed by Lemma 1.

Lemma 1.

For any $i\in V$, there exists a unique $B_{i}$.

The distribution of the sizes of the sets in $B_{i}$ is closely related to the perturbation strength on the features. When the perturbation is small, we may have to perturb multiple nodes before the target node is mis-classified, and thus the sets in $B_{i}$ will be large. When the perturbation is relatively large, we may be able to mis-classify a target node by perturbing a single node, if chosen wisely; in this case $B_{i}$ will contain many singleton sets.

Our following analysis (Proposition 2) shows that $h$ has a diminishing-return effect if the vulnerable sets of nodes on the graph present homophily (Assumption 3), which is common in real-world networks, and the perturbation on features becomes considerably large (Assumption 4).

Assumption 3 (Homophily).

$\forall S\in\cup_{i=1}^{N}A_{i}$ with $|S|>1$, there are $b(S)\geq 1$ nodes s.t., for any node $j$ among these nodes, $S\in A_{j}$.

Intuitively, the vulnerable sets present strong homophily if the $b(S)$'s are large.

Assumption 4 (Considerable Perturbation).

$\forall S\in\cup_{i=1}^{N}A_{i}$ with $|S|>1$, there are $\lceil p(S)\cdot b(S)\rceil$ nodes s.t., for any node $j$ among these nodes, there exists a set $T\subseteq S$ with $|T|=1$ and $T\in A_{j}$. Moreover, $\frac{r}{r+1}<p(S)\leq 1$.

Proposition 2.

If Assumptions 3 and 4 hold, $h$ is $\gamma$-approximately submodular for some $0<\gamma<\frac{1}{r}$, i.e., there exists a non-decreasing submodular function $\tilde{h}:2^{V}\rightarrow\mathbb{R}_{+}$ s.t. $\forall S\subseteq V$,

$(1-\gamma)\tilde{h}(S)\leq h(S)\leq(1+\gamma)\tilde{h}(S).$

As greedy methods are guaranteed to enjoy a constant approximation ratio for such approximately submodular functions [8], Proposition 2 motivates us to develop a greedy correction procedure to compensate for the diminishing-return effect when calculating the importance scores.

The greedy correction procedure. We propose an iterative node selection procedure and apply two greedy correction steps on top of the RWCS strategy, motivated by Assumptions 3 and 4.

To accommodate Assumption 3, after each node is selected into the attack set, we exclude its $k$-hop neighborhood from the next iteration, for a given constant integer $k$. The intuition is that nodes in a local neighborhood may contribute to similar target nodes due to homophily. To accommodate Assumption 4, we adopt an adaptive version of the RWCS scores. First, we binarize the $L$-step random walk transition matrix $M^{L}$ into $\widetilde{M}$, i.e.,

$\left[\widetilde{M}\right]_{ij}=\begin{cases}1, & \text{ if }[M^{L}]_{ij}\text{ is among the top-}l\text{ entries of }[M^{L}]_{i}\text{ and }[M^{L}]_{ij}\neq 0,\\ 0, & \text{ otherwise,}\end{cases}$ (4)

where $l$ is a given constant integer. Next, we define a new adaptive influence score as a function of a matrix $Q$: $\widetilde{I}_{i}(Q)=\sum_{j=1}^{N}[Q]_{ji}$. In the iterative node selection procedure, we initialize $Q$ as $\widetilde{M}$ and, at each iteration, select the node with the highest score $\widetilde{I}_{i}(Q)$. After each iteration, supposing node $i$ was selected, we update $Q$ by setting to zero all the rows whose entries in the $i$-th column are $1$. The underlying assumption of this operation is that adding $i$ to the selected set is likely to mis-classify all the target nodes corresponding to those rows, which complies with Assumption 4. We name this iterative procedure the GC-RWCS (Greedily Corrected RWCS) strategy and summarize it in Algorithm 1.

Input: number of nodes limit $r$; maximum degree limit $m$; neighbor hops $k$; binarized transition matrix $\widetilde{M}$; the adaptive influence score functions $\widetilde{I}_{i}$, $\forall i\in V$.
Output: the set $S$ to be attacked.
1: Initialize the candidate set $P=\{i\in V\mid d_{i}\leq m\}$ and the score matrix $Q=\widetilde{M}$;
2: Initialize $S=\emptyset$;
3: for $t=1,2,\ldots,r$ do
4:     $z\leftarrow\operatorname*{argmax}_{i\in P}\widetilde{I}_{i}(Q)$;
5:     $S\leftarrow S\cup\{z\}$;
6:     $P\leftarrow P\setminus\{i\in P\mid\text{shortest-path}(i,z)\leq k\}$;
7:     $q\leftarrow Q_{\cdot,z}$;
8:     for $i\in V$ do
9:         if $q_{i}=1$ then $Q_{i}\leftarrow\mathbf{0}$;
10: return $S$;
Algorithm 1: The GC-RWCS Strategy for Node Selection.
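A compact Python sketch of Algorithm 1 is given below. It assumes dense NumPy matrices and uses scipy's shortest_path for the $k$-hop exclusion; the function and variable names are ours and are not taken from the authors' released code.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def gc_rwcs_select(A, ML, r, m, k, l):
    """GC-RWCS node selection (Algorithm 1) given adjacency A and the L-step matrix M^L."""
    N = A.shape[0]
    # Binarize M^L as in Eq. (4): keep the top-l nonzero entries of each row.
    M_tilde = np.zeros_like(ML)
    for i in range(N):
        top = np.argsort(-ML[i])[:l]
        M_tilde[i, top] = (ML[i, top] > 0).astype(ML.dtype)
    degrees = (A + np.eye(N)).sum(axis=1)
    hops = shortest_path(A, unweighted=True)          # pairwise hop distances
    P = set(np.where(degrees <= m)[0])                # candidate set under the degree limit
    Q, S = M_tilde.copy(), []
    for _ in range(r):
        if not P:
            break
        cand = np.array(sorted(P))
        z = cand[np.argmax(Q[:, cand].sum(axis=0))]   # argmax_i of I_i(Q), the column sum of Q
        S.append(int(z))
        P -= set(np.where(hops[z] <= k)[0])           # exclude the k-hop neighborhood of z
        Q[Q[:, z] == 1, :] = 0                        # zero rows already "covered" by z
    return S
```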

Finally, we want to mention that, while the derivation of RWCS and GC-RWCS requires knowledge of the number of layers $L$ of the GCN, we find that the empirical performance of the proposed attack strategies is not sensitive to the choice of $L$. Therefore, the proposed methods are applicable to the black-box setup where we do not know the exact $L$ of the model.

4 Experiments

4.1 Experiment Setup

GNN models. We evaluate the proposed attack strategies on two common GNN models, GCN [10] and JK-Net [26]. For JK-Net, we test its two variants, JKNetConcat and JKNetMaxpool, which apply concatenation and element-wise max at the last layer, respectively. We set the number of layers to 2 for GCN and to 7 for both JKNetConcat and JKNetMaxpool. The hidden size of each layer is 32. For training, we closely follow the hyper-parameter setup in Xu et al. [26].

Datasets. We adopt three citation networks, Citeseer, Cora, and Pubmed, which are standard node classification benchmark datasets [16, 27]. Following the setup of JK-Net [26], we randomly split each dataset into 60%, 20%, and 20% for training, validation, and testing, and we draw 40 random splits.

Baseline methods for comparison. As summarized in Section 2.1, our proposed black-box adversarial attack setup is by far the most restricted, and none of the existing attack strategies for GNNs can be applied. We therefore compare the proposed attack strategies with baseline strategies that select nodes with the top centrality metrics. We consider three well-known network metrics capturing different aspects of node centrality, Degree, Betweenness, and PageRank, and name the attack strategies accordingly. In the classical network analysis literature [15], real-world networks are shown to be fragile under attacks on high-centrality nodes, so we believe these centrality metrics serve as reasonable baselines under our restricted black-box setup. For the purpose of a sanity check, we also include a trivial baseline, Random, which randomly selects the nodes to be attacked.
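For reproducibility, the baseline node selections can be computed with standard graph libraries; a sketch using networkx is shown below (the helper name and the degree-limit handling are ours).

```python
import networkx as nx

def centrality_baselines(G, r, m):
    """Top-r nodes for each centrality baseline, restricted to nodes with degree <= m."""
    candidates = [v for v in G.nodes if G.degree(v) <= m]
    metrics = {
        "Degree": dict(G.degree()),
        "Betweenness": nx.betweenness_centrality(G),
        "PageRank": nx.pagerank(G),
    }
    return {name: sorted(candidates, key=lambda v: -scores[v])[:r]
            for name, scores in metrics.items()}
```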

Hyper-parameters for GC-RWCS. For the proposed GC-RWCS strategy, we fix the number of steps $L=4$, the neighbor-hop parameter $k=1$, and the parameter $l=30$ for the binarized $\widetilde{M}$ in Eq. (4), for all models on all datasets. Note that $L=4$ differs from the number of layers of both GCN and the JK-Nets in our experiments, yet we still achieve effective attack performance. We also conduct a sensitivity analysis in Appendix B.2 and demonstrate that the proposed method is not sensitive to $L$.

Nuisance parameters of the attack procedure. For each dataset, we fix the limit on the number of nodes to attack, $r$, as 1% of the graph size. After the node selection step, we also need to specify how to perturb the node features, i.e., the design of $\epsilon$ in the $\tau$ function in the optimization problem (2). In a real-world scenario, $\epsilon$ should be designed with domain knowledge about the classification task, without access to the GNN models. In our experiments, we have to simulate the domain knowledge due to the lack of semantic meaning of the individual features in the benchmark datasets. Formally, we construct the constant perturbation $\epsilon\in\mathbb{R}^{D}$ as follows: for $j=1,2,\ldots,D$,

$\epsilon_{j}=\begin{cases}\lambda\cdot\operatorname{sign}\left(\sum_{i=1}^{N}\frac{\partial\mathcal{L}(H,y)}{\partial X_{ij}}\right), & \text{if }j\in\text{arg top-}J\left(\left[\left|\sum_{i=1}^{N}\frac{\partial\mathcal{L}(H,y)}{\partial X_{il}}\right|\right]_{l=1,2,\ldots,D}\right),\\ 0, & \text{otherwise,}\end{cases}$ (5)

where $\lambda$ is the magnitude of the modification. We fix $J=\lfloor 0.02D\rfloor$ for all datasets. While gradients of the model are involved, we emphasize that we only use extremely limited information from the gradients: determining a small number of important features and the binary direction in which to perturb each selected feature, only at the global level, by averaging gradients over all nodes. We believe such coarse information is usually available from domain knowledge about the classification task. The perturbation magnitude for each feature is fixed as a constant $\lambda$ and is irrelevant to the model. In addition, the same perturbation vector is added to the features of all the selected nodes, so the construction of the perturbation is entirely independent of the selected nodes.
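A sketch of this construction (Eq. (5)) is shown below; the averaged gradient grad_sum stands in for the coarse domain knowledge described above, and the function name is ours.

```python
import numpy as np

def build_perturbation(grad_sum, lam, frac=0.02):
    """Constant perturbation epsilon of Eq. (5) from the column-summed gradients (length D)."""
    D = grad_sum.shape[0]
    J = int(np.floor(frac * D))                  # number of features to perturb
    top = np.argsort(-np.abs(grad_sum))[:J]      # arg top-J of |sum_i dL/dX_ij|
    eps = np.zeros(D)
    eps[top] = lam * np.sign(grad_sum[top])      # +/- lambda on the selected features
    return eps
```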

4.2 Experiment Results

Verifying the discrepancy between the loss and the mis-classification rate. We first provide empirical evidence for the discrepancy between the classification loss (cross-entropy) and the mis-classification rate. We compare the RWCS strategy to baseline strategies with varying perturbation strength, as measured by $\lambda$ in Eq. (5). The results shown in Figure 1 are obtained by attacking GCN on Citeseer. First, we observe that RWCS increases the classification loss almost linearly as $\lambda$ increases, indicating that our first-order Taylor approximation of the loss works well in practice. Not surprisingly, RWCS performs very similarly to PageRank, and it performs much better than the other centrality metrics in increasing the classification loss, showing the effectiveness of Proposition 1. However, the decrease of classification accuracy under RWCS (and PageRank) quickly saturates as $\lambda$ increases. The GC-RWCS strategy, which is proposed to correct the importance scores, decreases the classification accuracy the most as $\lambda$ becomes larger, even though it increases the classification loss the least.

Figure 1: Experiments of attacking GCN on Citeseer with increasing perturbation strength $\lambda$. (a) Loss on the test set; (b) Accuracy on the test set. Results are averaged over 40 random trials and error bars indicate the standard error of the mean.

Full experiment results. We then provide the full experiment results of attacking GCN, JKNetConcat, and JKNetMaxpool on all three datasets in Table 1. The perturbation strength is set as $\lambda=1$. The thresholds 10% and 30% indicate that we set the limit on the maximum degree $m$ as the lowest degree of the top 10% and top 30% nodes, respectively.

The results clearly demonstrate the effectiveness of the proposed GC-RWCS strategy. GC-RWCS achieves the best attack performance in almost all experiment settings, and the difference from the second-best strategy is significant in almost all cases. It is also worth noting that the proposed GC-RWCS strategy is able to decrease the node classification accuracy by up to 33.5%, and GC-RWCS achieves a 70% larger decrease of the accuracy than the Random baseline in most cases (see Table 3 in Appendix B.2). This is achieved by merely adding the same constant perturbation vector to the features of 1% of the nodes in the graph, which verifies that the explicit structural inductive biases of GNN models make them vulnerable even in the extremely restricted black-box attack setup.

Table 1: Summary of the attack performance. The lower the accuracy (in %) the better the attack. The bold marker denotes the best performance. The asterisk (*) means the difference between the best strategy and the second-best strategy is statistically significant by a t-test at significance level 0.05. The error bar (±) denotes the standard error of the mean over 40 independent trials.
Cora Citeseer Pubmed
Method GCN JKNetConcat JKNetMaxpool GCN JKNetConcat JKNetMaxpool GCN JKNetConcat JKNetMaxpool
None 85.6 ± 0.3 86.2 ± 0.2 85.8 ± 0.3 75.1 ± 0.2 72.9 ± 0.3 73.2 ± 0.3 85.7 ± 0.1 85.8 ± 0.1 85.7 ± 0.1
Threshold 10%
Random 81.3 ± 0.3 68.8 ± 0.8 68.8 ± 1.3 71.3 ± 0.3 60.8 ± 0.8 61.7 ± 0.9 82.0 ± 0.3 75.9 ± 0.7 75.4 ± 0.7
Degree 78.2 ± 0.4 60.7 ± 1.0 59.9 ± 1.5 67.5 ± 0.4 52.5 ± 0.8 53.7 ± 1.0 78.9 ± 0.5 63.4 ± 1.0 63.3 ± 1.2
PageRank 79.4 ± 0.4 71.6 ± 0.6 70.0 ± 1.0 70.1 ± 0.3 61.5 ± 0.5 62.6 ± 0.6 80.3 ± 0.3 71.3 ± 0.8 71.2 ± 0.8
Betweenness 79.7 ± 0.4 60.5 ± 0.9 60.3 ± 1.6 68.9 ± 0.3 53.5 ± 0.8 55.1 ± 1.0 78.5 ± 0.6 67.1 ± 1.1 66.2 ± 1.1
RWCS 79.5 ± 0.3 71.2 ± 0.5 69.9 ± 1.0 69.9 ± 0.3 60.8 ± 0.6 62.2 ± 0.7 79.8 ± 0.3 70.7 ± 0.8 70.7 ± 0.8
GC-RWCS 78.5 ± 0.5 52.7 ± 1.0* 53.3 ± 1.9* 65.1 ± 0.5* 46.6 ± 0.8* 48.2 ± 1.1* 77.3 ± 0.7 62.1 ± 1.2 60.6 ± 1.4*
Threshold 30%
Random 82.6 ± 0.4 70.7 ± 1.1 71.8 ± 1.1 72.6 ± 0.3 62.7 ± 0.8 63.9 ± 0.8 82.6 ± 0.2 77.3 ± 0.4 77.4 ± 0.5
Degree 80.7 ± 0.4 64.9 ± 1.4 67.0 ± 1.5 70.4 ± 0.4 56.9 ± 0.8 58.7 ± 0.9 81.5 ± 0.4 72.4 ± 0.7 72.3 ± 0.7
PageRank 82.6 ± 0.3 79.6 ± 0.4 79.7 ± 0.4 72.9 ± 0.2 70.2 ± 0.3 70.3 ± 0.3 83.0 ± 0.2 79.3 ± 0.3 79.6 ± 0.3
Betweenness 81.8 ± 0.4 64.1 ± 1.3 65.9 ± 1.4 70.7 ± 0.3 56.3 ± 0.8 58.3 ± 0.9 81.3 ± 0.3 74.1 ± 0.5 74.6 ± 0.5
RWCS 82.8 ± 0.3 79.3 ± 0.5 79.5 ± 0.4 72.9 ± 0.2 69.8 ± 0.3 70.1 ± 0.3 82.1 ± 0.2 77.8 ± 0.3 78.4 ± 0.3
GC-RWCS 80.7 ± 0.5 59.1 ± 1.6* 61.1 ± 1.6* 67.8 ± 0.5* 49.0 ± 0.9* 50.7 ± 1.1* 80.3 ± 0.5* 69.2 ± 0.7* 70.0 ± 0.7*

5 Conclusion

In this paper, we propose a novel black-box adversarial attack setup for GNN models with a constraint of limited node access, which we believe is by far the most restricted and realistic black-box attack setup. Nonetheless, through both theoretical analyses and empirical experiments, we demonstrate that the strong and explicit structural inductive biases of GNN models make them still vulnerable to this type of adversarial attack. We also propose a principled attack strategy, GC-RWCS, based on our theoretical analyses of the connection between the GCN model and random walks, which corrects for the diminishing-return effect of the mis-classification rate. Our experimental results show that the proposed strategy significantly outperforms competing attack strategies under the same setup.

Acknowledgments

This work was in part supported by the National Science Foundation under grant numbers 1633370 and 1620319.

Broader Impact

For the potential positive impacts, we anticipate that this work may raise public awareness of the security and accountability issues of graph-based machine learning techniques, especially when they are applied to real-world social networks. Even without access to any information about the model training, the graph structure alone can be exploited to damage a deep learning system with a readily executable strategy.

On the potential negative side, as our work demonstrates that existing GNN models can be attacked effectively with no knowledge beyond the graph structure, this may raise a serious alert for technology companies that maintain such platforms and operate various applications based on graphs. However, we believe that making this security concern transparent can help practitioners detect potential attacks of this form and better defend machine-learning-driven applications.

References

  • Barabási and Albert [1999] Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
  • Bojchevski and Günnemann [2018] Aleksandar Bojchevski and Stephan Günnemann. Adversarial attacks on node embeddings via graph poisoning. arXiv preprint arXiv:1809.01093, 2018.
  • Carlini and Wagner [2017] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.
  • Chang et al. [2020] Heng Chang, Yu Rong, Tingyang Xu, Wenbing Huang, Honglei Zhang, Peng Cui, Wenwu Zhu, and Junzhou Huang. A restricted black-box adversarial framework towards attacking graph embedding models. In AAAI Conference on Artificial Intelligence, 2020.
  • Chen et al. [2018] Jinyin Chen, Yangyang Wu, Xuanheng Xu, Yixian Chen, Haibin Zheng, and Qi Xuan. Fast gradient attack on network embedding. arXiv preprint arXiv:1809.02797, 2018.
  • Dai et al. [2018] Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. Adversarial attack on graph structured data. arXiv preprint arXiv:1806.02371, 2018.
  • Frankle and Carbin [2018] Jonathan Frankle and Michael Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635, 2018.
  • Horel and Singer [2016] Thibaut Horel and Yaron Singer. Maximization of approximately submodular functions. In Advances in Neural Information Processing Systems, pages 3045–3053, 2016.
  • Jin et al. [2020] Wei Jin, Yaxin Li, Han Xu, Yiqi Wang, and Jiliang Tang. Adversarial attacks and defenses on graphs: A review and empirical study. arXiv preprint arXiv:2003.00653, 2020.
  • Kipf and Welling [2016] Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
  • Klicpera et al. [2018] Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. Predict then propagate: Graph neural networks meet personalized pagerank. arXiv preprint arXiv:1810.05997, 2018.
  • Li et al. [2017] Cheng Li, Jiaqi Ma, Xiaoxiao Guo, and Qiaozhu Mei. Deepcas: An end-to-end predictor of information cascades. In Proceedings of the 26th international conference on World Wide Web, pages 577–586, 2017.
  • Lovász et al. [1993] László Lovász et al. Random walks on graphs: A survey. Combinatorics, Paul Erdős is Eighty, 2(1):1–46, 1993.
  • Morris et al. [2019] Christopher Morris, Martin Ritzert, Matthias Fey, William L Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. Weisfeiler and leman go neural: Higher-order graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4602–4609, 2019.
  • Newman [2018] Mark Newman. Networks. Oxford University Press, 2018.
  • Sen et al. [2008] Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. Collective classification in network data. AI Magazine, 29(3):93–93, 2008.
  • Shi et al. [2020] Chence Shi, Minkai Xu, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, and Jian Tang. Graphaf: a flow-based autoregressive model for molecular graph generation. arXiv preprint arXiv:2001.09382, 2020.
  • Sun et al. [2019] Yiwei Sun, Suhang Wang, Xianfeng Tang, Tsung-Yu Hsieh, and Vasant Honavar. Node injection attacks on graphs via reinforcement learning. arXiv preprint arXiv:1909.06543, 2019.
  • Veličković et al. [2017] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903, 2017.
  • Wang et al. [2019] Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, et al. Deep graph library: Towards efficient and scalable deep learning on graphs. arXiv preprint arXiv:1909.01315, 2019.
  • Wilson [2020] Andrew Gordon Wilson. The case for bayesian deep learning. arXiv preprint arXiv:2001.10995, 2020.
  • Wu et al. [2019] Huijun Wu, Chen Wang, Yuriy Tyshetskiy, Andrew Docherty, Kai Lu, and Liming Zhu. Adversarial examples for graph data: Deep insights into attack and defense. In IJCAI, 2019.
  • Wu et al. [2020] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and S Yu Philip. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2020.
  • Xu et al. [2019] Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. Topology attack and defense for graph neural networks: An optimization perspective. arXiv preprint arXiv:1906.04214, 2019.
  • Xu et al. [2018a] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826, 2018a.
  • Xu et al. [2018b] Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, and Stefanie Jegelka. Representation learning on graphs with jumping knowledge networks. arXiv preprint arXiv:1806.03536, 2018b.
  • Yang et al. [2016] Zhilin Yang, William W Cohen, and Ruslan Salakhutdinov. Revisiting semi-supervised learning with graph embeddings. arXiv preprint arXiv:1603.08861, 2016.
  • Ying et al. [2018] Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L Hamilton, and Jure Leskovec. Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 974–983, 2018.
  • Zoph and Le [2016] Barret Zoph and Quoc V Le. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016.
  • Zügner and Günnemann [2019] Daniel Zügner and Stephan Günnemann. Adversarial attacks on graph neural networks via meta learning. In International Conference on Learning Representations (ICLR), 2019.
  • Zügner et al. [2018] Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. Adversarial attacks on neural networks for graph data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2847–2856, 2018.

Appendix A Proofs

A.1 Proof of Proposition 1

We first remind the reader of some notation: a GCN model is denoted as a function $f$, the feature matrix is $X\in\mathbb{R}^{N\times D}$, and the output logits are $H=f(X)\in\mathbb{R}^{N\times K}$. The $L$-step random walk transition matrix is $M^{L}$. More details can be found in Section 3.1.

We give in Lemma 2 the connection between GCN models and random walks. Lemma 2 relies on a technical assumption about the GCN model (Assumption 5) and the proof can be found in Xu et al. [26].

Assumption 5 (Xu et al. [26]).

All paths in the computation graph of the given GCN model are independently activated with the same probability of success $\rho$.

Lemma 2.

(Xu et al. [26].) Given an $L$-layer GCN with averaging normalization $\alpha_{i,j}=1/d_{i}$ in Eq. (1), assume that all paths in the computation graph of the model are activated with the same probability of success $\rho$ (Assumption 5). Then, for any nodes $i,j\in V$,

$\mathbb{E}\left[\frac{\partial H_{j}}{\partial X_{i}}\right]=\rho\cdot\prod_{l=L}^{1}\bm{W}_{l}\,[M^{L}]_{ji},$ (6)

where $\bm{W}_{l}$ is the learnable parameter matrix at the $l$-th layer.

Then we are able to prove Proposition 1 below.

Proof.

First, we derive the gradient of the loss $\mathcal{L}(H,y)$ w.r.t. the feature $X_{i}$ of node $i$:

$\nabla_{X_{i}}\mathcal{L}(H,y)=\nabla_{X_{i}}\left(\sum_{j=1}^{N}\mathcal{L}_{j}(H_{j},y_{j})\right)=\sum_{j=1}^{N}\nabla_{X_{i}}\mathcal{L}_{j}(H_{j},y_{j})=\sum_{j=1}^{N}\left(\frac{\partial H_{j}}{\partial X_{i}}\right)^{T}\frac{\partial\mathcal{L}_{j}(H_{j},y_{j})}{\partial H_{j}},$ (7)

where $H_{j}$ is the $j$-th row of $H$ transposed into a column vector and $y_{j}$ is the true label of node $j$. Note that $\frac{\partial\mathcal{L}_{j}(H_{j},y_{j})}{\partial H_{j}}\in\mathbb{R}^{K}$ and $\frac{\partial H_{j}}{\partial X_{i}}\in\mathbb{R}^{K\times D}$.

Next, we plug Eq. (7) into $\widetilde{\Delta}_{i}(x)|_{x=\tau(X,\{i\})_{i}}$, which for simplicity we write as $\widetilde{\Delta}_{i}$ in the rest of the proof:

$\widetilde{\Delta}_{i}=\left(\nabla_{X_{i}}\mathcal{L}(H,y)\right)^{T}\epsilon=\sum_{j=1}^{N}\left(\frac{\partial\mathcal{L}_{j}(H_{j},y_{j})}{\partial H_{j}}\right)^{T}\frac{\partial H_{j}}{\partial X_{i}}\epsilon.$ (8)

Denote $a^{j}\triangleq\frac{\partial\mathcal{L}_{j}(H_{j},y_{j})}{\partial H_{j}}\in\mathbb{R}^{K}$. From the definition of the loss,

$\mathcal{L}_{j}(H_{j},y_{j})=\max_{k\in\{1,\ldots,K\}}H_{jk}-H_{jy_{j}},$

we have

$a^{j}_{k}=\begin{cases}-1, & \text{ if }k=y_{j}\text{ and }y_{j}\neq\operatorname*{argmax}_{c\in\{1,\ldots,K\}}H_{jc},\\ 1, & \text{ if }k\neq y_{j}\text{ and }k=\operatorname*{argmax}_{c\in\{1,\ldots,K\}}H_{jc},\\ 0, & \text{otherwise,}\end{cases}$

for $k=1,2,\ldots,K$. Under Assumption 1, the expectation of each element of $a^{j}$ is

$\operatorname*{\mathbb{E}}[a^{j}_{k}]=-q_{k}(1-p(k\mid k))+\sum_{w=1,w\neq k}^{K}p(k\mid w)q_{w},\quad k=1,2,\ldots,K,$

which is a constant independent of $H_{j}$ and $y_{j}$. Therefore, we can write

$\operatorname*{\mathbb{E}}[a^{j}]=c,\quad\forall j=1,2,\ldots,N,$

where $c\in\mathbb{R}^{K}$ is a constant vector independent of $j$.

Taking the expectation of Eq. (8) and plugging in the result of Lemma 2,

$\operatorname*{\mathbb{E}}\left[\widetilde{\Delta}_{i}\right]\approx\mathbb{E}\left[\sum_{j=1}^{N}\left(\frac{\partial\mathcal{L}_{j}(H_{j},y_{j})}{\partial H_{j}}\right)^{T}\frac{\partial H_{j}}{\partial X_{i}}\epsilon\right]=\sum_{j=1}^{N}\operatorname*{\mathbb{E}}[a^{j}]^{T}\left(\rho\prod_{l=L}^{1}W_{l}[M^{L}]_{ji}\right)\epsilon=\left(\rho c^{T}\prod_{l=L}^{1}W_{l}\epsilon\right)\sum_{j=1}^{N}[M^{L}]_{ji}=C\sum_{j=1}^{N}[M^{L}]_{ji},$

where $C=\rho c^{T}\prod_{l=L}^{1}W_{l}\epsilon$ is a constant scalar independent of $i$. ∎

A.2 Proofs for the Results in Section 3.4

Proof of Lemma 1.

Proof.

If $A_{i}=\emptyset$, then $B_{i}\subseteq A_{i}$ and so $B_{i}=\emptyset$. The three conditions of Definition 3 are also trivially true. Below we investigate the case $A_{i}\neq\emptyset$.

The existence is given by a constructive proof. We check the nonempty elements of $A_{i}$ one by one in an arbitrary order. If an element is a superset of any other element in $A_{i}$, we skip it; otherwise, we put it into $B_{i}$. We then verify that the resulting $B_{i}$ is a basic vulnerable set for $i$. Clearly $B_{i}\subseteq A_{i}$. For condition 1), clearly $\emptyset\notin B_{i}$, and if $\emptyset\in A_{i}$, all nonempty elements in $A_{i}$ are skipped, so $B_{i}=\emptyset$. For condition 2), given $\emptyset\notin A_{i}$, for any nonempty $S\in A_{i}$: if $S\in B_{i}$, the condition holds; if $S\notin B_{i}$, by construction there exists a nonempty strict subset $S_{1}\subset S$ with $S_{1}\in A_{i}$. If $S_{1}\in B_{i}$, the condition holds; if $S_{1}\notin B_{i}$, we can similarly find a nonempty strict subset $S_{2}\subset S_{1}$ with $S_{2}\in A_{i}$. Recursively, we obtain a chain $S\supset S_{1}\supset S_{2}\supset\cdots$. As $S$ is finite, we will reach a set $S_{k}$ that no longer has a strict subset in $A_{i}$, so $S_{k}\in B_{i}$, and the condition holds. Condition 3) means that no set in $B_{i}$ is a subset of another set in $B_{i}$, which holds by construction.

Now we prove the uniqueness. Suppose there are two distinct basic vulnerable sets $B_{i}\neq C_{i}$. Without loss of generality, assume $S\in B_{i}$ but $S\notin C_{i}$. Since $B_{i}\neq\emptyset$, we have $\emptyset\notin A_{i}$. Further, $S\in A_{i}$, hence $C_{i}\neq\emptyset$. As $S\in B_{i}\subseteq A_{i}$, $S\neq\emptyset$, and $C_{i}$ satisfies condition 2), there is a nonempty $T\in C_{i}$ s.t. $T\subset S$. If $T\in B_{i}$, then condition 3) is violated for $B_{i}$. If $T\notin B_{i}$, there is a nonempty $T^{\prime}\in B_{i}$ s.t. $T^{\prime}\subset T$; but then $T^{\prime}\subset S$ also violates condition 3). By contradiction, we prove the uniqueness. ∎

In order to prove Proposition 2, we first would like to construct a submodular function that is close to $h$, with the help of Lemma 3 below.

Lemma 3.

If, $\forall i\in V$, $B_{i}$ is either empty or only contains singleton sets, then $h$ is submodular.

Proof.

We first prove the case when $\forall i\in V$, $A_{i}\neq\emptyset$.

First, we show that $\forall i\in V$ with $A_{i}\neq\emptyset$, for any nonempty $S\subseteq V$, $g_{i}(S)=1$ if and only if $B_{i}=\emptyset$ or $\exists T\in B_{i},T\subseteq S$. On one hand, if $g_{i}(S)=1$, then $S\in A_{i}$. If $\emptyset\in A_{i}$, then $B_{i}=\emptyset$. If $\emptyset\notin A_{i}$, by condition 2) of the basic vulnerable set, $\exists T\in B_{i},T\subseteq S$. On the other hand, if $\exists T\in B_{i},T\subseteq S$, then $g_{i}(T)=1$, and by Assumption 2, $g_{i}(S)\geq g_{i}(T)$, so $g_{i}(S)=1$. If $B_{i}=\emptyset$: as $A_{i}\neq\emptyset$, if $\emptyset\notin A_{i}$, condition 2) of Definition 3 would be violated; therefore $\emptyset\in A_{i}$ and $g_{i}(\emptyset)=1$. Still by Assumption 2, $g_{i}(S)\geq g_{i}(\emptyset)$, so $g_{i}(S)=1$.

Define a function $e:V\rightarrow 2^{V}$ s.t. for any node $i\in V$,

$e(i)=\{j\in V\mid\{i\}\in B_{j}\}.$

Given that $B_{i}$ is either empty or only contains singleton sets for any $i\in V$, for any nonempty $S\subseteq V$,

$h(S)=\frac{1}{N}\sum_{i=1}^{N}g_{i}(S)=\frac{1}{N}\left|\{j\in V\mid B_{j}=\emptyset\text{ or }\exists T\in B_{j},T\subseteq S\}\right|=\frac{1}{N}\left|\{j\in V\mid B_{j}=\emptyset\text{ or }\exists\{i\}\in B_{j},i\in S\}\right|=\frac{1}{N}\left|\{j\in V\mid B_{j}=\emptyset\text{ or }\exists i\in S,\{i\}\in B_{j}\}\right|=\frac{1}{N}\left(\left|\cup_{i\in S}e(i)\right|+\left|\{j\in V\mid B_{j}=\emptyset\}\right|\right).$ (9)

$\left|\{j\in V\mid B_{j}=\emptyset\}\right|$ is a constant independent of $S$. Therefore, maximizing $h(S)$ over $S$ with $|S|\leq r$ is equivalent to maximizing $\left|\cup_{i\in S}e(i)\right|$ over $S$ with $|S|\leq r$, which is a maximum coverage problem. Therefore $h$ is submodular.

The case allowing some nodes to have empty vulnerable sets can be easily proved by removing such nodes in Eq. (9), as their corresponding vulnerable functions always equal zero. ∎

Proof of Proposition 2. For simplicity, we assume $A_{i}\neq\emptyset$ for any $i\in V$. The proof below can be easily adapted to the general case without this assumption, by removing the nodes with empty vulnerable sets, similarly to the proof of Lemma 3.

Proof.

$\forall i\in V$, define $\widetilde{B}_{i}\triangleq\{S\in B_{i}\mid|S|=1\}$. We can then define a new group of vulnerable sets $\widetilde{A}_{i}$ on $V$ for $i\in V$. Let

$\widetilde{A}_{i}=\begin{cases}2^{V}, & \text{ if }B_{i}=\emptyset,\\ \emptyset, & \text{ if }B_{i}\neq\emptyset\text{ but }\widetilde{B}_{i}=\emptyset,\\ \{S\subseteq V\mid\exists T\in\widetilde{B}_{i},T\subseteq S\}, & \text{otherwise.}\end{cases}$

Then it is clear that $\widetilde{B}_{i}$ is a valid basic vulnerable set corresponding to $\widetilde{A}_{i}$, for $i\in V$. If we define $\tilde{g}_{i}:2^{V}\rightarrow\{0,1\}$ as

$\tilde{g}_{i}(S)=\begin{cases}1, & \text{ if }B_{i}=\emptyset\text{ or }\exists T\in\widetilde{B}_{i},T\subseteq S,\\ 0, & \text{otherwise,}\end{cases}$

we can easily verify that $\tilde{g}_{i}$ is a valid vulnerable function corresponding to $\widetilde{A}_{i}$, for $i\in V$. Further, let $\tilde{h}:2^{V}\rightarrow\mathbb{R}_{+}$ be

$\tilde{h}(S)=\frac{1}{N}\sum_{i=1}^{N}\tilde{g}_{i}(S).$

By Lemma 3, as $\forall i\in V$, $\widetilde{B}_{i}$ is either empty or only contains singleton sets, we know $\tilde{h}$ is submodular.

Next we investigate the difference between $h$ and $\tilde{h}$. First, for any $S\subseteq V$, if $S\notin\cup_{i=1}^{N}A_{i}$, clearly $h(S)=\tilde{h}(S)=0$; if $|S|\leq 1$, it is easy to show $h(S)=\tilde{h}(S)$. Second, for any $S\in\cup_{i=1}^{N}A_{i}$ with $|S|>1$, by Assumption 3 there are exactly $b$ (omitting the $S$ in $b(S)$) nodes whose vulnerable set contains $S$. Without loss of generality, assume the indexes of these $b$ nodes are $1,2,\ldots,b$. Then, for any node $i>b$, $g_{i}(S)=0$ and $\tilde{g}_{i}(S)=0$. For nodes $i=1,2,\ldots,b$, $g_{i}(S)=1$, and

$\tilde{g}_{i}(S)=\begin{cases}1, & \text{ if }B_{i}=\emptyset\text{ or }\exists T\subseteq S,\ |T|=1\text{ and }T\in\widetilde{B}_{i},\\ 0, & \text{otherwise.}\end{cases}$

By Assumption 4, there are at least $\lceil pb\rceil$ (omitting the $S$ in $p(S)$) nodes $j$ s.t. $\tilde{g}_{j}(S)=1$. Therefore, $h(S)=\frac{b}{N}$ and $\frac{\lceil pb\rceil}{N}\leq\tilde{h}(S)\leq\frac{b}{N}$. Hence $1-\frac{1}{r}<1\leq\frac{h(S)}{\tilde{h}(S)}\leq\frac{1}{p}<1+\frac{1}{r}$. ∎

Appendix B Additional Experiment Details

B.1 Additional Dataset Details

Datasets. We adopt the Deep Graph Library [20] versions of Cora, Citeseer, and Pubmed in our experiments. Summary statistics of the datasets are provided in Table 2; the reported number of edges does not include self-loops.

Table 2: Summary statistics of datasets.
Dataset Nodes Edges Classes Features
Citeseer 3,327 4,552 6 3,703
Cora 2,708 5,278 7 1,433
Pubmed 19,717 44,324 3 500
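For reference, the three benchmarks can be loaded directly from DGL. The following minimal sketch assumes a recent DGL release (older releases expose a slightly different loading API); note that its raw edge counts are directed and therefore will not match the undirected, self-loop-free counts in Table 2.

```python
import dgl

# Load the three citation benchmarks (recent-release DGL API assumed).
for cls in (dgl.data.CoraGraphDataset,
            dgl.data.CiteseerGraphDataset,
            dgl.data.PubmedGraphDataset):
    g = cls()[0]
    # num_edges() counts directed edges (each undirected edge appears twice),
    # which differs from the undirected counts without self-loops in Table 2.
    print(cls.__name__, g.num_nodes(), g.num_edges(),
          g.ndata['feat'].shape[1], int(g.ndata['label'].max()) + 1)
```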

B.2 Additional Experiment Results

In this section, we provide results for additional experiment setups.

Comparison to the Random baseline. We first highlight, in Table 3, the relative decrease in accuracy achieved by the proposed GC-RWCS strategy ($L=4$) compared to the Random strategy. GC-RWCS decreases the node classification accuracy by up to 33.5%, and in most cases its accuracy decrease is more than 70% larger than that of the Random baseline. Since GC-RWCS and Random use exactly the same feature perturbation, and the node selection step of Random uses no information about the graph structure, this relative comparison can be roughly viewed as an indicator of the attack effectiveness attributable to the structural inductive biases of the GNN models.

Table 3: Accuracy decrease (in %) compared with the clean dataset.
Cora Citeseer Pubmed
Method GCN JKNetConcat JKNetMaxpool GCN JKNetConcat JKNetMaxpool GCN JKNetConcat JKNetMaxpool
Threshold 10%
Random 4.3 17.4 17.0 3.8 12.1 11.5 3.7 9.9 10.3
GC-RWCS 7.1 33.5 32.5 10.0 26.3 25.0 8.4 23.7 25.1
GC-RWCS/Random 165.12% 192.53% 191.18% 263.16% 217.36% 217.39% 227.03% 239.39% 243.69%
Threshold 30%
Random 3.0 15.5 14.0 2.5 10.2 9.3 3.1 8.5 8.3
GC-RWCS 4.9 27.1 24.7 7.3 23.9 22.5 5.4 16.6 15.7
GC-RWCS/Random 163.33% 174.84% 176.43% 292.00% 234.31% 241.94% 174.19% 195.29% 189.16%
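Each GC-RWCS/Random entry is the ratio of the two accuracy decreases above it; for example, for GCN on Cora at the 10% threshold, $\frac{7.1}{4.3}\approx 165.1\%$, matching the first entry of the ratio row.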

More thresholds and sensitivity analysis with respect to $L$. We provide a setup with a 20% threshold, in addition to the 10% and 30% thresholds shown in Section 4.2, to give a better resolution of the results; the results at the 20% threshold are consistent with the other setups. We also conduct a sensitivity analysis of the hyper-parameter $L$ in GC-RWCS in Table 4, showing the results of GC-RWCS with $L=3,4,5,6,7$. Note that GCN has 2 layers and the JK-Nets have 7 layers. The variation of the GC-RWCS results over this range of $L$ is typically within 2%, indicating that the proposed GC-RWCS strategy does not rely on exact knowledge of the number of layers in the attacked GNN models to be effective.

Table 4: Summary of the accuracy (in %) for $L\in\{3,4,5,6,7\}$. Bold numbers and the asterisk (*) carry the same meanings as in Table 1. The underline marker denotes values for which GC-RWCS outperforms all the baselines.
Cora Citeseer Pubmed
Method GCN JKNetConcat JKNetMaxpool GCN JKNetConcat JKNetMaxpool GCN JKNetConcat JKNetMaxpool
None 85.6 ± 0.3 86.2 ± 0.2 85.8 ± 0.3 75.1 ± 0.2 72.9 ± 0.3 73.2 ± 0.3 85.7 ± 0.1 85.8 ± 0.1 85.7 ± 0.1
Threshold 10%
Random 81.3 ± 0.3 68.8 ± 0.8 68.8 ± 1.3 71.3 ± 0.3 60.8 ± 0.8 61.7 ± 0.9 82.0 ± 0.3 75.9 ± 0.7 75.2 ± 0.7
Degree 78.2 ± 0.4 60.7 ± 1.0 59.9 ± 1.5 67.5 ± 0.4 52.5 ± 0.8 53.7 ± 1.0 78.9 ± 0.5 63.4 ± 1.0 63.2 ± 1.2
Pagerank 79.4 ± 0.4 71.6 ± 0.6 70.0 ± 1.0 70.1 ± 0.3 61.5 ± 0.5 62.6 ± 0.6 80.3 ± 0.3 71.3 ± 0.8 71.2 ± 0.8
Betweenness 79.7 ± 0.4 60.5 ± 0.9 60.3 ± 1.6 68.9 ± 0.3 53.5 ± 0.8 55.1 ± 1.0 78.5 ± 0.6 67.1 ± 1.1 66.1 ± 1.1
RWCS 79.4 ± 0.4 71.7 ± 0.5 70.3 ± 0.9 69.9 ± 0.3 62.4 ± 0.4 63.1 ± 0.6 79.8 ± 0.3 70.7 ± 0.8 70.7 ± 0.8
GC-RWCS-3 78.6 ± 0.5 52.1 ± 1.1* 53.0 ± 1.9* 64.8 ± 0.5* 46.4 ± 0.8* 48.2 ± 1.0* 78.1 ± 0.6 62.3 ± 1.2 61.6 ± 1.5
GC-RWCS-4 78.5 ± 0.5 52.7 ± 1.0* 53.3 ± 1.9* 65.1 ± 0.5* 46.6 ± 0.8* 48.2 ± 1.1* 77.3 ± 0.7 62.1 ± 1.2 60.6 ± 1.4*
GC-RWCS-5 78.9 ± 0.5 53.5 ± 1.1* 54.2 ± 1.9* 65.3 ± 0.5* 46.6 ± 0.8* 48.4 ± 1.0* 78.4 ± 0.5 64.2 ± 1.2 62.5 ± 1.4
GC-RWCS-6 78.5 ± 0.5 54.3 ± 1.1* 54.9 ± 1.9* 65.5 ± 0.5* 47.1 ± 0.8 48.9 ± 1.1* 78.0 ± 0.6 63.7 ± 1.1 62.6 ± 1.4
GC-RWCS-7 78.1 ± 0.5 54.2 ± 1.1* 54.8 ± 1.9* 66.1 ± 0.4* 47.5 ± 0.8 49.3 ± 1.1* 78.7 ± 0.5 64.9 ± 1.2 63.3 ± 1.3
Threshold 20%
Random 82.3 ± 0.3 71.7 ± 1.1 69.8 ± 1.1 72.1 ± 0.3 62.1 ± 0.7 62.6 ± 0.9 82.6 ± 0.2 77.9 ± 0.5 77.5 ± 0.5
Degree 79.3 ± 0.4 64.2 ± 1.2 61.6 ± 1.3 69.2 ± 0.4 56.0 ± 0.8 56.4 ± 1.0 80.6 ± 0.4 69.5 ± 0.8 69.4 ± 1.0
Pagerank 80.8 ± 0.3 74.5 ± 0.8 73.0 ± 0.8 72.1 ± 0.3 68.3 ± 0.3 68.2 ± 0.4 82.2 ± 0.2 77.7 ± 0.4 77.8 ± 0.4
Betweenness 80.7 ± 0.4 62.2 ± 1.4 60.1 ± 1.4 70.1 ± 0.4 54.8 ± 0.8 55.8 ± 1.1 80.2 ± 0.4 72.4 ± 0.8 72.0 ± 0.7
RWCS 81.4 ± 0.3 76.8 ± 0.6 76.0 ± 0.6 72.4 ± 0.3 68.9 ± 0.3 69.0 ± 0.4 81.3 ± 0.2 76.0 ± 0.4 76.5 ± 0.4
GC-RWCS-3 79.4 ± 0.5 57.5 ± 1.6* 53.1 ± 1.5* 67.1 ± 0.4* 48.4 ± 0.9* 49.3 ± 1.2* 79.0 ± 0.5* 67.4 ± 0.9* 66.3 ± 1.0*
GC-RWCS-4 79.4 ± 0.5 57.5 ± 1.7* 53.2 ± 1.4* 67.3 ± 0.5* 47.9 ± 0.9* 48.8 ± 1.3* 79.0 ± 0.5* 67.4 ± 1.0* 66.3 ± 1.0*
GC-RWCS-5 79.4 ± 0.5 59.0 ± 1.7* 54.5 ± 1.4* 67.3 ± 0.4* 48.4 ± 0.9* 49.4 ± 1.3* 79.2 ± 0.5* 68.5 ± 0.9 68.1 ± 0.9
GC-RWCS-6 79.5 ± 0.5 59.3 ± 1.7 54.9 ± 1.5* 68.1 ± 0.4* 49.2 ± 0.9* 50.2 ± 1.3* 79.1 ± 0.5* 68.4 ± 0.9 68.5 ± 1.0
GC-RWCS-7 79.4 ± 0.5 59.3 ± 1.6 55.3 ± 1.5* 68.1 ± 0.4* 50.0 ± 0.9* 50.8 ± 1.3* 79.2 ± 0.5* 68.7 ± 0.9 68.2 ± 0.8
Threshold 30%
Random 82.6 ± 0.4 70.7 ± 1.1 71.8 ± 1.1 72.6 ± 0.3 62.7 ± 0.8 63.9 ± 0.8 82.6 ± 0.2 77.3 ± 0.4 77.3 ± 0.5
Degree 80.7 ± 0.4 64.9 ± 1.4 67.0 ± 1.5 70.4 ± 0.4 56.9 ± 0.8 58.7 ± 0.9 81.5 ± 0.4 72.4 ± 0.7 72.1 ± 0.8
Pagerank 82.6 ± 0.3 79.6 ± 0.4 79.7 ± 0.4 72.9 ± 0.2 70.2 ± 0.3 70.3 ± 0.3 83.0 ± 0.2 79.3 ± 0.3 79.5 ± 0.3
Betweenness 81.8 ± 0.4 64.1 ± 1.3 65.9 ± 1.4 70.7 ± 0.3 56.3 ± 0.8 58.3 ± 0.9 81.3 ± 0.3 74.1 ± 0.5 74.5 ± 0.5
RWCS 82.9 ± 0.3 79.7 ± 0.4 80.0 ± 0.4 72.9 ± 0.2 70.2 ± 0.3 70.4 ± 0.3 82.1 ± 0.2 77.8 ± 0.3 78.4 ± 0.3
GC-RWCS-3 80.2 ± 0.6 57.3 ± 1.7* 59.0 ± 1.6* 67.9 ± 0.5* 49.1 ± 0.9* 50.8 ± 1.1* 80.3 ± 0.5* 69.0 ± 0.7* 69.8 ± 0.7*
GC-RWCS-4 80.7 ± 0.5 59.1 ± 1.6* 61.1 ± 1.6* 67.8 ± 0.5* 49.0 ± 0.9* 50.7 ± 1.1* 80.3 ± 0.5* 69.2 ± 0.7* 70.0 ± 0.7*
GC-RWCS-5 80.8 ± 0.5 59.8 ± 1.6* 61.5 ± 1.6* 68.4 ± 0.5* 49.2 ± 0.9* 51.2 ± 1.1* 80.2 ± 0.5* 70.4 ± 0.6* 71.5 ± 0.6
GC-RWCS-6 80.7 ± 0.5 59.8 ± 1.5* 61.4 ± 1.5* 68.5 ± 0.5* 50.5 ± 0.9* 52.2 ± 1.1* 80.2 ± 0.5* 70.5 ± 0.5* 71.6 ± 0.6
GC-RWCS-7 80.7 ± 0.5 60.2 ± 1.5* 61.9 ± 1.5* 68.7 ± 0.5* 50.7 ± 0.9* 52.6 ± 1.1* 80.3 ± 0.4* 70.9 ± 0.5* 71.9 ± 0.6

Extra experiments on Graph Attention Networks (GAT). We further show that the proposed attack strategy GC-RWCS, while derived from the GCN model, also generalizes well to GAT [19]. The results on GAT are shown in Table 5.

Table 5: Accuracy (in %) on the GAT model.
Dataset Cora Citeseer Pubmed
Threshold 10% 20% 30% 10% 20% 30% 10% 20% 30%
None 87.8 ± 0.2 76.9 ± 0.3 85.2 ± 0.1
Random 72.9 ± 0.5 73.8 ± 0.6 73.9 ± 0.6 70.0 ± 0.5 71.2 ± 0.4 71.7 ± 0.4 73.9 ± 0.4 75.4 ± 0.3 76.2 ± 0.3
Degree 66.5 ± 0.7 67.3 ± 0.7 69.8 ± 0.7 63.3 ± 0.5 65.9 ± 0.4 67.9 ± 0.3 66.7 ± 0.7 69.0 ± 0.5 71.2 ± 0.4
Pagerank 74.3 ± 0.5 74.8 ± 0.3 82.4 ± 0.2 69.5 ± 0.3 72.9 ± 0.3 74.2 ± 0.3 71.6 ± 0.4 78.1 ± 0.2 79.1 ± 0.2
Betweenness 64.8 ± 0.5 66.0 ± 0.5 67.3 ± 0.6 65.2 ± 0.5 66.5 ± 0.4 67.6 ± 0.3 63.4 ± 0.7 68.4 ± 0.6 72.0 ± 0.4
RWCS 71.1 ± 0.5 74.6 ± 0.3 82.5 ± 0.2 69.2 ± 0.3 72.9 ± 0.3 73.9 ± 0.3 69.4 ± 0.5 74.9 ± 0.3 77.9 ± 0.2
GC-RWCS 58.1 ± 0.6* 57.9 ± 0.6* 63.0 ± 0.5* 58.3 ± 0.6* 61.9 ± 0.6* 61.9 ± 0.4* 58.9 ± 0.9* 63.8 ± 0.7* 68.9 ± 0.5*

Extra experiments on a synthetic dataset. We also run experiments on a synthetic dataset to show that, when sufficient domain knowledge regarding the node features is available, the proposed attack strategy can be made effective in a purely black-box fashion. We generate the synthetic dataset as follows. First, we generate a graph using the Barabási-Albert random graph model [1] with $N$ nodes, and denote its adjacency matrix as $A$. Then we sample $D$-dimensional node features, $X\in\mathbb{R}^{N\times D}$, from a multivariate normal distribution and take the element-wise absolute value. Finally, we generate the node labels as follows. We first calculate $\tilde{Y}=\mathrm{sigmoid}((A+I)XW)$ for some $W\in\mathbb{R}^{D}$. Considering a binary classification problem, we set the label $Y_{i}=1$ if $\tilde{Y}_{i}>0.5$ and $Y_{i}=0$ otherwise. When attacking a GNN model trained on such a synthetic dataset, we assume the attackers know a few “important features” with large weights in $W$ but have no access to any information about the trained model. We randomly generate two datasets with $N=3000$ and $D=10$; the experiment results are shown in Table 6. The proposed GC-RWCS strategy performs well on both synthetic datasets.
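A minimal sketch of this generation process is shown below; it is for illustration only (not the exact script used), and the Barabási-Albert attachment parameter `m`, the distribution of `W`, and the random seed are not specified above and are assumptions here.

```python
import networkx as nx
import numpy as np

def make_synthetic(n=3000, d=10, m=2, seed=0):
    """Generate one synthetic dataset following the description above."""
    rng = np.random.default_rng(seed)
    g = nx.barabasi_albert_graph(n, m, seed=seed)   # BA graph; m is an assumption
    A = nx.to_numpy_array(g)                        # adjacency matrix
    X = np.abs(rng.standard_normal((n, d)))         # |N(0, I)| node features
    W = rng.standard_normal(d)                      # weight vector (distribution assumed)
    logits = (A + np.eye(n)) @ X @ W                # (A + I) X W
    y_tilde = 1.0 / (1.0 + np.exp(-logits))         # sigmoid
    y = (y_tilde > 0.5).astype(int)                 # binary labels
    return A, X, y, W

A, X, y, W = make_synthetic()
```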

Table 6: Accuracy (in %) on GCN model with synthetic data
Dataset synthetic_0 synthetic_1
Threshold 10% 20% 30% 10% 20% 30%
None 83.6 ± 0.3 85.2 ± 0.4
Degree 77.9 ± 0.4 77.5 ± 0.4 79.2 ± 0.4 77.4 ± 0.4 79.7 ± 0.4 79.9 ± 0.2
Pagerank 76.7 ± 0.4 78.3 ± 0.3 79.2 ± 0.4 78.5 ± 0.5 79.2 ± 0.3 80.1 ± 0.3
Betweenness 76.3 ± 0.4 80.4 ± 0.3 79.1 ± 0.4 78.7 ± 0.3 80.6 ± 0.3 80.3 ± 0.3
Random 79.3 ± 0.4 80.3 ± 0.4 81.2 ± 0.4 82.4 ± 0.4 79.6 ± 0.3 82.2 ± 0.2
RWCS 74.9 ± 0.5 78.8 ± 0.4 78.5 ± 0.3 79.0 ± 0.4 79.9 ± 0.4 80.0 ± 0.3
GC-RWCS 74.0 ± 0.3* 77.9 ± 0.5 77.4 ± 0.4* 77.5 ± 0.4 78.9 ± 0.3* 78.4 ± 0.3*

Diminishing-return effect with respect to $J$. In Section 4.2, we empirically showed the diminishing-return effect on the mis-classification rate when strengthening the adversarial perturbation by increasing the perturbation strength $\lambda$. Here we further validate this observation by showing that a similar effect appears when we strengthen the adversarial perturbation by increasing the number of features to be perturbed, $J$. The results are shown in Figure 2.
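To make explicit how $J$ and $\lambda$ parameterize the perturbation strength, the sketch below illustrates the kind of constant feature perturbation applied to the selected nodes; the particular features and signs used in the experiments follow the setup described in Section 4 and are passed in here as hypothetical inputs.

```python
import numpy as np

def perturb_features(X, target_nodes, feat_idx, signs, lam):
    """Add a constant perturbation of magnitude lam to the J = len(feat_idx)
    chosen features (with per-feature signs) of the attacked nodes."""
    X_adv = X.copy()
    X_adv[np.ix_(target_nodes, feat_idx)] += lam * np.asarray(signs)
    return X_adv

# Hypothetical usage: perturb J = 2 features of nodes 5 and 7 with strength lam = 10.
X = np.random.rand(10, 4)
X_adv = perturb_features(X, target_nodes=[5, 7], feat_idx=[0, 3],
                         signs=[+1.0, -1.0], lam=10.0)
```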

Figure 2: Experiments of attacking GCN on Citeseer with an increasing number of features to be perturbed, $J$. (a) Loss on the test set. (b) Accuracy on the test set. Results are averaged over 40 random trials and error bars indicate the standard error of the mean.