
Reinforcement Learning-based Black-Box Evasion Attacks to Link Prediction in Dynamic Graphs

Houxiang Fan∗,1, Binghui Wang∗,2, Pan Zhou3, Ang Li2, Meng Pang4,
Zichuan Xu5, Cai Fu3, Hai Li2, Yiran Chen2
1School of Electronic Information and Communications, Huazhong University of Science and Technology
2Department of Electrical and Computer Engineering, Duke University
3School of Cyber Science and Engineering, Huazhong University of Science and Technology
4School of Electrical and Electronic Engineering, Nanyang Technological University
5School of Software, Dalian University of Technology
∗Equal contribution
Abstract

Link prediction in dynamic graphs (LPDG) is an important research problem that has diverse applications such as online recommendations, studies on disease contagion, organizational studies, etc. Various LPDG methods based on graph embedding and graph neural networks have been recently proposed and have achieved state-of-the-art performance. In this paper, we study the vulnerability of LPDG methods and propose the first practical black-box evasion attack. Specifically, given a trained LPDG model, our attack aims to perturb the graph structure, without knowing the model parameters, model architecture, etc., such that the LPDG model makes as many wrong predicted links as possible. We design our attack based on a stochastic policy-based RL algorithm. Moreover, we evaluate our attack on three real-world graph datasets from different application domains. Experimental results show that our attack is both effective and efficient.

1 Introduction

Graphs are often used to describe complex systems such as social networks, biological systems, social and economic organizations, communication systems, power grids, etc. These real-world systems often evolve with time and can be modeled as dynamic graphs, where nodes/entities or links/edges are dynamically added or deleted. Links, which represent the interactions between nodes, are of great importance in the analysis of dynamic graphs, and one particularly important research problem is link prediction in dynamic graphs (LPDG). Specifically, given historical graph data of a real-world system, LPDG aims to predict its future graph structure so as to better understand the evolution process. Such information about future graphs is valuable in various applications such as online recommendations, studies on disease contagion, organizational studies, etc.

Various LPDG methods have been proposed in the past decade. Conventional methods include feature-based methods [15, 39, 7, 20], generative methods [25, 37, 44], and deep neural networks [19, 30, 5]. Recently, graph embedding (GE) methods [24, 28, 12] and graph neural networks (GNNs) [17, 31, 35] have achieved great success in many graph-related tasks (e.g., node classification, link prediction, graph classification, etc.) on static graphs. Inspired by this success, many GE/GNN-based LPDG methods [18, 36, 22, 11, 41, 23, 10, 21] have been developed, which extend the original GE/GNN methods to dynamic graphs by introducing a recurrent mechanism that captures the temporal information of dynamic graphs. GE/GNN-based LPDG methods have achieved great success in various applications such as forecasting bitcoin user trading [23], traffic prediction [40, 36], predicting loan repayment [41], forecasting message exchange in communication networks [23], etc. For instance, DyGCN [21] combines LSTM [14] and GCN [17] and achieves state-of-the-art performance.

In this paper, in contrast to designing better LPDG methods, we take the first step to study the vulnerability of LPDG methods. In particular, we consider a practical black-box evasion attack to LPDG: given a trained LPDG model, we assume an attacker only knows the predictions obtained by querying the LPDG model, while not knowing the model parameters, model architecture, etc. Regarding the attacker's capability, we consider that the attacker can perturb a certain number of links/edges in the historical graphs, i.e., by adding new edges to or/and deleting existing edges from these graphs. The attacker's goal is then to learn to perturb the historical graphs such that the LPDG model makes as many wrong predicted links as possible in the future graph. One possible way to perform such an attack is to formulate it as an optimization problem. However, we emphasize that there are two key challenges. First, an optimization-based attack requires the attacker to know accurate gradient information with respect to the model parameters, which is difficult to obtain in the black-box setting. Second, an optimization-based attack on the graph structure is a binary optimization problem, which is NP-hard.

To address the above challenges, we propose a reinforcement learning (RL)-based attack to LPDG. Specifically, we model perturbing (optimal) edges in a graph as executing (optimal) actions, which are obtained by solving a policy function in RL. Note that solving the policy function only requires LPDG model predictions, and thus our attack does not need to know the model parameters. Moreover, our attack involves learning a nonlinear mapping from the high-dimensional graph structure space to a low-dimensional representation space. Such a nonlinear mapping can be easily parameterized by a neural network, and naturally fits the RL framework. We implement our RL-based attack, specifically to DyGCN, by adopting the soft actor-critic algorithm [13], a stochastic policy-based RL method that has demonstrated stronger exploration ability and more stable training. We evaluate our attack on three real-world graph datasets from different application domains. Our attack is effective. For instance, on the Trapping dataset, our attack reduces the link prediction performance (measured by $F_1$ score) by 37.9% when only 1% of the total edges in a graph are perturbed. Our attack is also efficient. For instance, the running time of our attack is linear in the number of perturbed edges.

2 Background and Problem Definition

Figure 1: Link prediction in dynamic graphs via DyGCN. We use a single training example $(S_{[t-n:t-1]}, A_t)$ to illustrate DyGCN.

Link Prediction in Dynamic Graphs

Given a finite sequence of undirected graphs $\mathcal{G}=\{G_1, G_2, \ldots, G_T\}$, each $G_t=(V_t, E_t)$, $t\in[1:T]$, denotes a snapshot graph at time step $t$, with $V_t$ the set of nodes and $E_t$ the set of edges/links. $|V_t|$ and $|E_t|$ represent the number of nodes and the number of edges in graph $G_t$, respectively. W.l.o.g., we assume all graphs in $\mathcal{G}$ have the same node set $V$ in this paper. We use $A_t\in\{0,1\}^{|V|\times|V|}$ to represent the adjacency matrix of $G_t$, i.e., $a_t^{(u,v)}=1$ if there is a link between node $u$ and node $v$ in $G_t$, and $a_t^{(u,v)}=0$ otherwise. For description convenience, we will use the adjacency matrix and the graph interchangeably. For each $v\in V$, let $x_t^v\in\mathbb{R}^D$ be $v$'s feature vector and $X_t=[x_t^1; x_t^2; \cdots; x_t^{|V|}]\in\mathbb{R}^{|V|\times D}$ be the feature matrix of all nodes in graph $G_t$.

Now, suppose we are given a set of $M_{tr}$ training examples $\{(S^{tr}_{[t-n:t-1]}, A^{tr}_t)\}_{t=n+1}^{n+1+M_{tr}}\subseteq\mathcal{S}\times\mathcal{A}$, where $S^{tr}_{[t-n:t-1]}=\{A^{tr}_{t-n},\cdots,A^{tr}_{t-1}\}\in\mathcal{S}$ consists of a sequence of $n$ historical graphs and $A^{tr}_t\in\mathcal{A}$ is the future graph to be predicted. The goal of link prediction in dynamic graphs is then to learn a function $\mathcal{F}_{\Theta}:\mathcal{S}\rightarrow\mathcal{A}$, parameterized by $\Theta$, that maps the historical graphs in $\mathcal{S}$ to the predicted links of the future graphs in $\mathcal{A}$. In this paper, we focus on DyGCN, as it achieves state-of-the-art dynamic link prediction performance.

Dynamic Graph Convolutional Network (DyGCN)

DyGCN combines GCN [17] with LSTM [14] to perform link prediction in dynamic graphs. GCN was originally developed for static graphs, so we first briefly review it. Suppose we are given a graph $G_j$ with adjacency matrix $A_j$ and node feature matrix $X_j$. GCN stacks multiple (e.g., $L$) graph convolutional layers, and each layer $l$ learns a hidden node feature matrix $H_j^{(l)}$. Formally, the node feature matrix at the $(l+1)$-th layer is updated as follows:

$$H_j^{(l+1)}=\sigma\big(\mathring{A}_j H_j^{(l)} W^{(l)}\big),$$

where $\mathring{A}_j$ is a normalized version of $A_j$, defined as $\mathring{A}_j=\widetilde{D}_j^{-\frac{1}{2}}\widetilde{A}_j\widetilde{D}_j^{-\frac{1}{2}}$ with $\widetilde{A}_j=A_j+I$ and $\widetilde{D}_j=\mathrm{diag}(\sum_k\widetilde{A}_j^{(i,k)})$; $H_j^{(0)}=X_j$; $W^{(l)}$ is the parameter matrix connecting the $l$-th layer and the $(l+1)$-th layer; and $\sigma(\cdot)$ is a nonlinear activation function, e.g., ReLU.
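To make the layer update above concrete, the following is a minimal NumPy sketch of one graph convolutional layer; the function and variable names (gcn_layer, A, H, W) are illustrative and not taken from the DyGCN implementation.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(normalize(A + I) @ H @ W)."""
    A_tilde = A + np.eye(A.shape[0])               # add self-loops
    d = A_tilde.sum(axis=1)                        # degree vector
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))         # D^{-1/2}
    A_norm = D_inv_sqrt @ A_tilde @ D_inv_sqrt     # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)         # ReLU activation

# toy example: 4-node path graph, identity features (as in Section 4), 8 hidden units
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0 = np.eye(4)
W0 = np.random.randn(4, 8) * 0.1
H1 = gcn_layer(A, H0, W0)                          # shape (4, 8)
```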

In order to perform LPDG, DyGCN introduces one GCN per time step and adds an LSTM after each GCN. Specifically, at time step $j$, the GCN takes $A_j$ as input, and the output of the GCN (i.e., $H_j^{(L)}$) is treated as the input of the LSTM. The output of the LSTM is denoted as $Y_j$.

Now, suppose we have a set of $M_{tr}$ training examples. For each training example $(S^{tr}_{[t-n:t-1]}, A^{tr}_t)$, each GCN takes a graph in $S^{tr}_{[t-n:t-1]}$ as input, e.g., the GCN at time step $j$ takes $A^{tr}_j$ as input. To predict links in the graph $A^{tr}_t$, DyGCN adds a fully connected layer after the final output of the LSTM at time step $t-1$ (i.e., $Y_{t-1}$) and maps this output to a matrix $Z_t\in\mathbb{R}^{|V|\times|V|}$ that has the same size as $A^{tr}_t$. Formally,

$$Z_t=\sigma\big(fc(Y_{t-1}, W_Z)\big),$$

where $fc(\cdot)$ is a fully connected layer with parameters $W_Z$. As the entries in $Z_t$ are usually not binary, DyGCN further performs the following transformation:

$$\widehat{a}_t^{(i,j)}=\begin{cases}1, & z_t^{(i,j)}>0.5;\\ 0, & \text{otherwise},\end{cases}$$

where $\widehat{A}_t=\{\widehat{a}_t^{(i,j)}\}_{i,j=1}^{|V|}$ is the predicted graph.
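As a rough illustration of this prediction head (not the authors' code), the sketch below maps the last LSTM output to a score matrix $Z_t$ and thresholds it at 0.5; the choice of sigmoid for $\sigma$ and all names are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class PredictionHead(nn.Module):
    """Maps the last LSTM output Y_{t-1} to a binary adjacency prediction."""
    def __init__(self, hidden_dim, num_nodes):
        super().__init__()
        self.fc = nn.Linear(hidden_dim, num_nodes * num_nodes)
        self.num_nodes = num_nodes

    def forward(self, y_last):
        # y_last: (hidden_dim,) summary of the historical graphs
        z = torch.sigmoid(self.fc(y_last)).view(self.num_nodes, self.num_nodes)
        return (z > 0.5).float(), z   # predicted adjacency A_hat and soft scores Z_t

head = PredictionHead(hidden_dim=64, num_nodes=5)
A_hat, Z = head(torch.randn(64))
```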

We use the function $\mathcal{F}_{\Theta}$ to encompass all functions involved in DyGCN, where $\Theta$ contains all parameters. Thus, $\widehat{A}_t=\mathcal{F}_{\Theta}(S^{tr}_{[t-n:t-1]})$. To train DyGCN, we minimize the loss function $\mathcal{L}$, defined as the reconstruction loss between the predicted graph and the ground-truth graph over all training examples. Specifically,

$$\mathcal{L}\big(\{\mathcal{F}_{\Theta}(S^{tr}_{[t-n:t-1]}), A^{tr}_t\}\big)=\sum_{t=n+1}^{n+1+M_{tr}}\big\|(A^{tr}_t-\widehat{A}_t)\odot B_t\big\|_F^2,$$

where the weight matrix $B_t\in\mathbb{R}^{|V|\times|V|}$ is used to mitigate the issue caused by graph sparsity. We set $b_t^{(i,j)}=1$ if $a_t^{(i,j)}=0$, and $b_t^{(i,j)}=\beta>1$ otherwise. In our experiments, we set $\beta=10$. Adam [16] is used to train DyGCN, and the learnt model parameters are denoted as $\Theta^*$. Figure 1 illustrates DyGCN using a single training example.
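A hedged sketch of this weighted reconstruction loss, assuming the elementwise weighting described above (all names are illustrative):

```python
import torch

def weighted_recon_loss(A_true, Z_pred, beta=10.0):
    """Reconstruction loss ||(A - Z) * B||_F^2 with B up-weighting existing edges."""
    B = torch.where(A_true > 0, torch.full_like(A_true, beta), torch.ones_like(A_true))
    return (((A_true - Z_pred) * B) ** 2).sum()

A_true = torch.bernoulli(torch.full((5, 5), 0.2))   # toy ground-truth adjacency
Z_pred = torch.rand(5, 5)                           # toy predicted scores
loss = weighted_recon_loss(A_true, Z_pred)
```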

Problem Definition

We consider black-box evasion attacks to LPDG and focus on attacking DyGCN in this paper. However, we note that our attack is applicable to any LPDG algorithm. Specifically, given a trained DyGCN model $\mathcal{F}_{\Theta^*}$, we assume that the attacker only knows the predictions (i.e., predicted graphs) obtained by querying $\mathcal{F}_{\Theta^*}$, while not knowing the model parameters $\Theta^*$ or model architecture. Moreover, given a testing example $(\mathcal{S}^{te}_{[t-n:t-1]}, A^{te}_t)$, where $\mathcal{S}^{te}_{[t-n:t-1]}=\{A^{te}_{t-n},\cdots,A^{te}_{t-1}\}$ consists of a sequence of $n$ historical graphs, we assume the attacker can perturb a fraction $\mu$ of the $n$ graphs in $\mathcal{S}^{te}_{[t-n:t-1]}$, and in each perturbed graph the attacker is allowed to perturb (e.g., add new edges or remove existing edges) a certain fraction $\delta$ of the total links/edges. We denote the perturbed testing example as $\widetilde{\mathcal{S}}^{te}_{[t-n:t-1]}=\{\tilde{A}^{te}_{t-n},\cdots,\tilde{A}^{te}_{t-1}\}$, where $\tilde{A}^{te}_j=A^{te}_j+\Delta A^{te}_j$ if $A^{te}_j$ is perturbed by a binary matrix $\Delta A^{te}_j$, and $\tilde{A}^{te}_j=A^{te}_j$ if not. For simplicity, we consider that the most recent historical graphs, i.e., $[A^{te}_{t-\mu n}: A^{te}_{t-1}]$, are perturbed. The attacker's goal is then to learn to perturb the graphs such that the link prediction performance of $\mathcal{F}_{\Theta^*}$ on $\widetilde{\mathcal{S}}^{te}_{[t-n:t-1]}$ is maximally decreased. Formally, the attacker's objective function is defined as follows:

$$\max_{\{\Delta A^{te}_j\}}\ \textrm{err}=\mathcal{I}\big(A^{te}_t\neq\mathcal{F}_{\Theta^*}(\widetilde{\mathcal{S}}^{te}_{[t-n:t-1]})\big),\quad\textrm{s.t. }|\Delta A^{te}_j|\leq\delta\cdot|E_j|,\ \forall j\in[t-\mu n:t-1],\quad(1)$$

where $\mathcal{I}(A\neq B)$ counts the total number of entries that differ between $A$ and $B$ at the same positions.
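For instance, the attack objective err can be evaluated with a simple element-wise comparison between the ground-truth adjacency matrix and the model's prediction (a hedged sketch; the names are illustrative, not the authors' code):

```python
import numpy as np

def attack_error(A_true, A_pred):
    """err in Eq. (1): number of entries where the prediction disagrees with the ground truth."""
    return int(np.sum(A_true != A_pred))

A_true = np.random.randint(0, 2, (5, 5))   # toy ground-truth future graph
A_pred = np.random.randint(0, 2, (5, 5))   # toy prediction returned by the black-box model
print(attack_error(A_true, A_pred))
```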

Directly solving Equation 1 to achieve the attacker’s goal is challenging, as it is a binary optimization problem that is NP-hard. In the next section, we will introduce our RL-based attack against DyGCN.

3 Our Proposed RL-based Attack

Figure 2: Overview of our RL-based black-box attack to DyGCN. We take attacking a snapshot graph $G_{t-1}$ as an example.

In this section, we propose to use RL to perform black-box evasion attacks to DyGCN. Our RL-based attack is based on soft actor-critic (SAC) [13], a stochastic policy-based RL method that achieves state-of-the-art performance. Compared with optimization-based attacks, RL-based attacks have two major advantages. First, perturbing (optimal) edges in a graph (i.e., adding new edges or deleting existing edges) can be naturally modeled as executing (optimal) actions obtained by solving a policy function in RL, whereas solving for the optimal perturbation directly with optimization methods is NP-hard. Second, our attack involves learning a nonlinear mapping from the high-dimensional graph structure space to a low-dimensional representation space. Such a nonlinear mapping can be easily parameterized by a neural network and fits naturally into the RL framework.

The core idea of our attack is as follows. First, the attacker observes the current state associated with a graph. Based on the current state, the attacker executes an optimal action (i.e., adds new edges and deletes existing edges) by solving a policy function parameterized by a policy network, and obtains a reward, which relates to our objective function in Equation 1 and is evaluated via a soft $Q$-function parameterized by a $Q$-network. Then, the attacker obtains the next state and saves the current and next states, actions, and rewards as a trajectory in a replay buffer. Finally, the attacker samples trajectories from the buffer and adopts the SAC algorithm to train our attack. Figure 2 shows our proposed RL-based attack to DyGCN. The notations used and their descriptions are listed in Table 1.

The Attack Environment

An RL method consists of states, actions, rewards, and a terminal condition. We first define the attack environment from the attacker's perspective, which involves how to represent states, execute actions, define rewards, and determine the terminal condition. We take attacking a single graph $G_{t-1}$ ($A_{t-1}$) in an example $S_{[t-n:t-1]}=\{A_{t-n},\cdots,A_{t-1}\}$ as an instance to introduce our attack.

States.

We use $\textbf{s}_k$ to denote the state of the intermediate perturbed graph $\widetilde{A}^k_{t-1}$ at attack step $k$. Thus $\textbf{s}_0$ is the state of the clean graph $A_{t-1}$ and $\widetilde{A}^0_{t-1}=A_{t-1}$. Moreover, we also use $\textbf{s}_k$ to denote a low-dimensional embedding of the high-dimensional perturbed graph $\widetilde{A}^k_{t-1}$. $\textbf{s}_k$ aims to capture the latent structural information in the perturbed graph $\widetilde{A}^k_{t-1}$, and it is learnt via the following three steps. First, we observe all clean graphs in the example other than $A_{t-1}$ (i.e., from $A_{t-n}$ to $A_{t-2}$ in this case), and select a fraction $\rho$ ($<0.5$) of nodes $V_p\subseteq V$ with the largest averaged degrees (called popular nodes) as well as the same fraction of nodes $V_n\subseteq V$ with the smallest averaged degrees (called neglected nodes) among those observed graphs (i.e., $|V_p|=|V_n|=\rho|V|$). Our intuition is that nodes with the largest/smallest averaged degrees maintain the primary information in a graph. Second, we encode the two node sets $V_p$ and $V_n$ using two one-hot vectors $\textbf{o}_p, \textbf{o}_n\in\mathbb{R}^{|V|}$. Specifically, $o_{p,u}=1$ if $u\in V_p$ and $o_{p,u}=0$ if $u\notin V_p$. Similarly, $o_{n,u}=1$ if $u\in V_n$ and $o_{n,u}=0$ if $u\notin V_n$. Third, we define the representation for $\textbf{s}_k$ as follows:

$$\textbf{s}_k=\big[\textbf{s}_k^{(1)},\textbf{s}_k^{(2)}\big],\quad(2)$$
$$\textbf{s}_k^{(1)}=g\big(\widetilde{A}^k_{t-1}\cdot((1-\gamma)\textbf{o}_p+\gamma/|V|)\big),$$
$$\textbf{s}_k^{(2)}=g\big(\widetilde{A}^k_{t-1}\cdot((1-\gamma)\textbf{o}_n+\gamma/|V|)\big),$$

where $g:\mathbb{R}^{|V|}\rightarrow\mathbb{R}^{\rho|V|}$ is a function that first selects the $\rho|V|$ entries of the $|V|$-dimensional input vector whose indexes correspond to $V_p$ (or $V_n$), forming a new $\rho|V|$-dimensional vector, and then applies a nonlinear activation function (e.g., ReLU in our paper) to the new vector to obtain the $\rho|V|$-dimensional embedding of the perturbed graph $\widetilde{A}^k_{t-1}$. Note that we introduce a smooth regularization term $\gamma/|V|$ in order to stabilize the training and reduce overfitting. By default, we set $\gamma$ to 0.2. With Equation 2, each entry value in $\textbf{s}_k$ indicates the importance of the corresponding node in $V_p$ or $V_n$. For implementation purposes, we also store the mapping between the vector index $i\in[1:|V_p|]$ or $i\in[1:|V_n|]$ and the real node index $j\in V$ in the graph.
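A minimal sketch of this state construction, assuming ReLU as the activation inside $g$ (all names are illustrative; in the paper $V_p$ and $V_n$ are chosen by averaged degree over the other historical graphs, while this toy example uses the current graph's degrees for brevity):

```python
import numpy as np

def state_embedding(A_pert, idx_p, idx_n, gamma=0.2):
    """Build s_k = [s_k^(1), s_k^(2)] from the perturbed adjacency matrix (Eq. 2)."""
    n = A_pert.shape[0]
    o_p = np.zeros(n); o_p[idx_p] = 1.0            # one-hot vector of popular nodes
    o_n = np.zeros(n); o_n[idx_n] = 1.0            # one-hot vector of neglected nodes
    v1 = A_pert @ ((1 - gamma) * o_p + gamma / n)  # smoothed weighting toward V_p
    v2 = A_pert @ ((1 - gamma) * o_n + gamma / n)  # smoothed weighting toward V_n
    s1 = np.maximum(v1[idx_p], 0.0)                # g(.): select V_p entries, then ReLU
    s2 = np.maximum(v2[idx_n], 0.0)                # g(.): select V_n entries, then ReLU
    return np.concatenate([s1, s2])

A = np.random.randint(0, 2, (10, 10))
A = np.triu(A, 1); A = A + A.T                     # symmetric toy graph, no self-loops
deg = A.sum(axis=1)
idx_p = np.argsort(-deg)[:3]                       # 3 highest-degree nodes (popular)
idx_n = np.argsort(deg)[:3]                        # 3 lowest-degree nodes (neglected)
s = state_embedding(A, idx_p, idx_n)               # state vector of length 6
```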

Table 1: Notations and descriptions.
Notation Description
$\mathcal{S}^{tr}/\mathcal{S}^{val}/\mathcal{S}^{te}$ Training/validation/testing set
$M_{tr}/M_{val}/M_{te}$ No. of training/validation/testing examples
$\mathcal{F}_{\Theta^*}$ Learnt DyGCN model
$G_j/A_j$ Ground-truth graph at time step $j$
$\widehat{G}_j/\widehat{A}_j$ Estimated graph at time step $j$
$\widetilde{G}_j/\widetilde{A}_j$ Perturbed graph at time step $j$
$V_p/V_n$ Popular/neglected node set
$n$ No. of graphs in a sequence
$N$ No. of nodes in each graph
$L$ No. of layers in GCN
$\mathcal{P}_{\Phi}$ Policy network
$\mathcal{Q}_{\Lambda}$ Soft $\mathcal{Q}$-network
$\mathcal{D}$ Replay buffer
$\alpha$ Temperature hyperparameter
$\mu$ Fraction of perturbed graphs in an example
$\rho$ Fraction of selected popular/neglected nodes
$\delta$ Fraction of total edges perturbed in a graph
$T$ No. of training episodes

Action.

We consider that the attacker only changes the link status among nodes in $V_p$ and $V_n$. Our motivation is based on the assumption that deleting edges among popular nodes or/and adding edges among neglected nodes changes the graph structure substantially, and thus the DyGCN model makes as many wrong predicted links as possible when evaluated on the perturbed graph. Note that this consideration also reduces the computational complexity, as the attacker does not need to search the entire graph structure. Moreover, to ensure that certain graph properties do not change significantly after the perturbation, we require the total number of edges in the graph to stay the same at each attack step. In particular, the attacker executes an action by adding a new edge between one pair of nodes and deleting an existing edge between another pair of nodes. The purpose of the action is thus to search for two pairs of nodes such that, when the action is executed, DyGCN's performance decreases as much as possible given the current state. Specifically, we denote $\textbf{a}_k$ as the action at attack step $k$, and we design a policy to generate actions based on $\textbf{s}_k$. Formally, we define

$$\textbf{a}_k\sim\mathcal{P}_{\Phi}(\textbf{a}_k|\textbf{s}_k),\quad(3)$$

where $\mathcal{P}_{\Phi}$ is a policy network parameterized by $\Phi$. Equation 3 attempts to enforce that, when nodes in $V_p$ (or $V_n$) are relatively more important at state $\textbf{s}_k$, they are more likely to be selected in the action $\textbf{a}_k$. In this paper, we define our policy network as a composition of a 3-layer MLP and a truncated normal distribution (TruncNorm) over the interval $[1,\rho|V|]$. Specifically, the output of the MLP is the mean and log-deviation of a TruncNorm; we then perform the action $\textbf{a}_k$ by sampling two pairs of nodes from the TruncNorm (with rounding). Moreover, the sampling process can be naturally divided into two separate steps: we first perform an action $\textbf{a}^{(1)}_k$ to sample a pair of nodes from $V_p$, and then perform another action $\textbf{a}^{(2)}_k$ to sample a pair of nodes from $V_n$. Formally,

$$\textbf{a}_k=[\textbf{a}^{(1)}_k,\textbf{a}^{(2)}_k]\in\mathbb{R}^{2\times 2},\quad \Phi=[\Phi_1,\Phi_2]\in\mathbb{R}^{\rho|V|\times 2},$$
$$\textbf{a}^{(1)}_k\sim\textrm{TruncNorm}\big(\textbf{a}^{(1)}_k|\mu_1,\textrm{diag}(\sigma_1^2),1,\rho|V|\big).\textrm{round()},$$
$$\mu_1=\textrm{MLP}_{\Phi_1}(\textbf{s}^{(1)}_k)_{1:2}\in\mathbb{R}^2,\quad \log\sigma_1=\textrm{MLP}_{\Phi_1}(\textbf{s}^{(1)}_k)_{3:4}\in\mathbb{R}^2,$$
$$\textbf{a}^{(2)}_k\sim\textrm{TruncNorm}\big(\textbf{a}^{(2)}_k|\mu_2,\textrm{diag}(\sigma_2^2),1,\rho|V|\big).\textrm{round()},$$
$$\mu_2=\textrm{MLP}_{\Phi_2}(\textbf{s}^{(2)}_k)_{1:2}\in\mathbb{R}^2,\quad \log\sigma_2=\textrm{MLP}_{\Phi_2}(\textbf{s}^{(2)}_k)_{3:4}\in\mathbb{R}^2.$$

With $\textbf{a}_k$, we map each of its values back to the real node index via our stored mapping. For simplicity, we still use $\textbf{a}_k$ to denote the real node indexes. Moreover, when the attacker's action would add an already existing edge or delete a nonexistent edge, we simply skip it and move to the next step. Details of training the policy network $\mathcal{P}_{\Phi}$ are given in the next subsection.
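A hedged sketch of the action sampling step, using SciPy's truncated normal for illustration; the MLP outputs are faked with fixed numbers, and all names and shapes are assumptions rather than the authors' code.

```python
import numpy as np
from scipy.stats import truncnorm

def sample_node_pair(mu, log_sigma, upper):
    """Sample a pair of node indices in [1, upper] from a truncated normal, then round."""
    sigma = np.exp(log_sigma)
    a = (1 - mu) / sigma            # lower bound in standard-deviation units
    b = (upper - mu) / sigma        # upper bound in standard-deviation units
    pair = truncnorm.rvs(a, b, loc=mu, scale=sigma)
    return np.rint(pair).astype(int)

rho_V = 30                          # size of V_p (and of V_n)
# pretend the policy MLP produced these means and log-deviations from s_k^(1), s_k^(2)
mu1, log_sigma1 = np.array([5.0, 12.0]), np.array([0.5, 0.5])
mu2, log_sigma2 = np.array([20.0, 8.0]), np.array([0.3, 0.3])
a1 = sample_node_pair(mu1, log_sigma1, rho_V)   # pair in V_p (edge to delete)
a2 = sample_node_pair(mu2, log_sigma2, rho_V)   # pair in V_n (edge to add)
action = np.stack([a1, a2])                     # a_k = [a_k^(1), a_k^(2)]
```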

Rewards.

Given the state $\textbf{s}_k$ and action $\textbf{a}_k$, the attacker obtains a new state $\textbf{s}_{k+1}$. Naturally, if the link prediction performance of $\mathcal{F}_{\Theta^*}$ at state $\textbf{s}_{k+1}$ is worse than that at state $\textbf{s}_k$, a positive reward is given; otherwise, a negative reward is given. Based on this motivation, we design the reward function as follows:

$$\textbf{r}_k(\textbf{s}_k,\textbf{a}_k)=\begin{cases}f^k-f^{k+1} & \textrm{if err}^{k+1}>\textrm{err}^k;\\ -\mu\cdot n & \textrm{otherwise},\end{cases}\quad(4)$$

where $\textrm{err}^k=\mathcal{I}\big(A_t\neq\mathcal{F}_{\Theta^*}(\widetilde{\mathcal{S}}^k_{[t-n:t-1]})\big)$ corresponds to our objective function in Equation 1 at the $k$-th attack step, and $f^k$ measures the effectiveness of $\mathcal{F}_{\Theta^*}$ for link prediction on the perturbed graphs at state $\textbf{s}_k$. If $\textrm{err}^{k+1}>\textrm{err}^k$, which means our attack generates more wrong predicted links at the $(k+1)$-th attack step, then the gap between $f^k$ and $f^{k+1}$ is positive. Furthermore, the larger this positive gap, the more effective our attack. Thus, we use this positive gap as a positive reward $\textbf{r}_k$. Otherwise, if our attack performs worse at attack step $k+1$, we set a large negative reward to penalize this step.
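A hedged sketch of the reward in Eq. (4), where f stands for the link prediction quality measure ($F_1$ score in Section 4); all names and the toy numbers are illustrative.

```python
def reward(f_prev, f_next, err_prev, err_next, mu=1.0, n=10):
    """Reward r_k: prediction-quality drop if the step caused more wrong links, else a penalty."""
    if err_next > err_prev:
        return f_prev - f_next      # positive when prediction quality degrades
    return -mu * n                  # penalize a step that did not hurt the model

print(reward(f_prev=0.90, f_next=0.82, err_prev=120, err_next=150))   # 0.08
print(reward(f_prev=0.90, f_next=0.91, err_prev=120, err_next=110))   # -10.0
```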

To obtain an optimal expected reward and guide the policy network $\mathcal{P}_{\Phi}$, we also need to solve a $Q$-function, which is parameterized by a soft $\mathcal{Q}$-network $\mathcal{Q}_{\Lambda}$. In our paper, the soft $\mathcal{Q}$-network uses the same MLP architecture as the policy network. Specifically, the $\mathcal{Q}$-network takes a (state, action) pair as input and outputs a score that indicates the effectiveness of the action. Its associated Bellman equation is defined as:

$$\mathcal{Q}_{\Lambda}(\textbf{s}_k,\textbf{a}_k):=\textbf{r}_k(\textbf{s}_k,\textbf{a}_k)+\lambda\,\mathbb{E}_{\textbf{s}_{k+1}}[\mathcal{V}(\textbf{s}_{k+1})],\quad(5)$$

where

$$\mathcal{V}(\textbf{s}_k)=\mathbb{E}_{\textbf{a}_k\sim\mathcal{P}_{\Phi}}\big[\mathcal{Q}_{\Lambda}(\textbf{s}_k,\textbf{a}_k)-\alpha\log\mathcal{P}_{\Phi}(\textbf{a}_k|\textbf{s}_k)\big]\quad(6)$$

is the state value function, which denotes the expected reward obtained by the attacker at state $\textbf{s}_k$. The discount factor $\lambda=0.99$ limits the sum of the expected rewards. Details of training the soft $Q$-network are given in the next subsection.
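As an illustration, the soft state value in Eq. (6) can be estimated by Monte Carlo sampling from the policy. The sketch below uses a plain Gaussian stand-in for the policy and a tiny stand-in Q-network; these stand-ins and all names are assumptions, not the method's actual networks.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

state_dim, action_dim = 6, 4
q_net = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.ReLU(), nn.Linear(32, 1))
policy_mu = nn.Linear(state_dim, action_dim)          # stand-in policy: Gaussian, fixed std

def soft_state_value(s, alpha=0.2, num_samples=16):
    """Monte Carlo estimate of V(s) = E_{a~P}[Q(s,a) - alpha * log P(a|s)] (Eq. 6)."""
    dist = Normal(policy_mu(s), 1.0)
    values = []
    for _ in range(num_samples):
        a = dist.sample()
        log_prob = dist.log_prob(a).sum()
        q = q_net(torch.cat([s, a])).squeeze()
        values.append(q - alpha * log_prob)
    return torch.stack(values).mean()

v = soft_state_value(torch.randn(state_dim))
```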

Terminal.

As required of the attacker, the number of modified edges in each perturbed graph of the sequence is limited to a fraction $\delta$ of its total edges. During the training process, the number of steps in each training episode is finite and fixed. In each attack step, the attacker adds one new edge and deletes one existing edge; therefore there are at most $\delta|E|/2$ steps per episode.

Maximum Entropy Reinforcement Learning

We use the soft actor-critic (SAC) algorithm [13], a maximum entropy reinforcement learning (MERL) method, to train our attack model. This algorithm maximizes the entropy of the policy as well as the expected reward. Compared with other RL algorithms, SAC has stronger exploration ability and more stable training. We first define the loss functions of the policy network $\mathcal{P}_{\Phi}$ and the soft $\mathcal{Q}$-network $\mathcal{Q}_{\Lambda}$, and then show how to train them.

Policy Network.

Its loss function is defined as

$$\mathcal{L}_{\mathcal{P}_{\Phi}}=\mathbb{E}_{\textbf{s}_k\sim\mathcal{D}}\Big[\mathbb{E}_{\textbf{a}_k\sim\mathcal{P}_{\Phi}}\big[\alpha\log(\mathcal{P}_{\Phi}(\textbf{a}_k|\textbf{s}_k))-\mathcal{Q}_{\Lambda}(\textbf{s}_k,\textbf{a}_k)\big]\Big],\quad(7)$$

where $\mathcal{D}$ is a replay buffer that collects previous states, actions, and rewards, and $\alpha$ is a temperature parameter that controls the stochasticity of the optimal policy. In practice, $\alpha$ is an important parameter that is learnt by minimizing the following loss:

$$\mathcal{L}_{\alpha}=\mathbb{E}_{\textbf{a}_k\sim\mathcal{P}_{\Phi}}\big[-\alpha\log\mathcal{P}_{\Phi}(\textbf{a}_k|\textbf{s}_k)-\alpha\mathcal{H}_0\big],\quad(8)$$

where $\mathcal{H}_0$ is a predefined entropy threshold (e.g., $\mathcal{H}_0=-2$ in our paper). Minimizing the loss function in Equation 7 makes the policy $\mathcal{P}_{\Phi}$ select multiple attractive actions with similar probabilities, instead of focusing on a single deterministic action.

Furthermore, to estimate the gradient of the $\mathcal{Q}$-function with a lower-variance estimator, we apply the reparameterization trick and reparameterize the policy using a neural network transformation $f_{\Phi}(\epsilon_k;\textbf{s}_k)$, where $\epsilon_k$ is an input noise vector sampled from, e.g., a standard normal distribution. Then, we can rewrite the loss in Equation 7 as

$$\mathcal{L}_{\mathcal{P}_{\Phi}}=\mathbb{E}_{\textbf{s}_k\sim\mathcal{D},\,\epsilon_k\sim\mathcal{N}}\big[\alpha\log\mathcal{P}_{\Phi}(f_{\Phi}(\epsilon_k;\textbf{s}_k)|\textbf{s}_k)-\mathcal{Q}_{\Lambda}(\textbf{s}_k,f_{\Phi}(\epsilon_k;\textbf{s}_k))\big].\quad(9)$$
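A hedged sketch of the policy and temperature updates in Eqs. (7)-(9), using a reparameterized Gaussian policy for illustration (the actual policy is a TruncNorm over node indices; all module names and shapes are assumptions of this sketch):

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

state_dim, action_dim = 6, 4
policy = nn.Linear(state_dim, 2 * action_dim)      # outputs mean and log-std
q_net = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.ReLU(), nn.Linear(32, 1))
log_alpha = torch.zeros(1, requires_grad=True)     # temperature, learned via Eq. (8)
target_entropy = -2.0                              # H_0

def policy_and_alpha_loss(s):
    mu, log_std = policy(s).chunk(2, dim=-1)
    dist = Normal(mu, log_std.exp())
    a = dist.rsample()                             # reparameterization trick (Eq. 9)
    log_prob = dist.log_prob(a).sum(-1)
    alpha = log_alpha.exp()
    q_val = q_net(torch.cat([s, a], dim=-1)).squeeze(-1)
    policy_loss = (alpha.detach() * log_prob - q_val).mean()              # Eq. (7)/(9)
    alpha_loss = (-alpha * (log_prob.detach() + target_entropy)).mean()   # Eq. (8)
    return policy_loss, alpha_loss

s_batch = torch.randn(8, state_dim)                # batch of states from the replay buffer
p_loss, a_loss = policy_and_alpha_loss(s_batch)
```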
Algorithm 1: Train our attack model on a validation set.
Input: A set of $M_{val}$ validation examples $(\mathcal{S}^{val},\mathcal{A}^{val})=\{(S^{val}_{[t-n:t-1]},A^{val}_t)\}_{t=n+1}^{n+1+M_{val}}$; trained DyGCN model $\mathcal{F}_{\Theta^*}$; hyperparameters $T,\mu,\rho,\delta,n$.
Output: Our trained attack model $\Theta_{att}=\{\Phi^*,\alpha^*,\Lambda^*\}$.
1: Initialize the model parameters $\Phi$, $\alpha$, $\Lambda$;
2: Initialize the replay buffer $\mathcal{D}$;
3: for each $S^{val}_{[t-n:t-1]}\in\mathcal{S}^{val}$ do
4:   Select a fraction $\mu$ of the $n$ graphs (denoted as $S^{val}_{\mu}$) from $S^{val}_{[t-n:t-1]}$ to be attacked;
5:   Select $V_p$ and $V_n$ using the other $(1-\mu)n$ graphs;
6:   for each graph $A_j\in S^{val}_{\mu}$ do
7:     Initialize the perturbed graph $\widetilde{A}^0_j\leftarrow A_j$;
8:     while episode $\text{epi}\leq T$ do
9:       while attack step $k\leq\delta|E_j|/2$ do
10:        Learn $\textbf{s}_k$ based on $\widetilde{A}^k_j$ via Eq. (2);
11:        Sample an action $\textbf{a}_k$ at state $\textbf{s}_k$ via Eq. (3);
12:        Obtain reward $\textbf{r}_k$ via Eq. (4) and $\mathcal{F}_{\Theta^*}$;
13:        Update state $\textbf{s}_{k+1}=\{\textbf{s}_k,\textbf{a}_k\}$;
14:        Update the perturbed graph $\widetilde{A}^{k+1}_j\leftarrow\widetilde{A}^k_j\cup\textbf{a}_k$;
15:        Update buffer $\mathcal{D}\leftarrow\mathcal{D}\cup\{\textbf{s}_k,\textbf{a}_k,\textbf{r}_k,\textbf{s}_{k+1}\}$;
16:        Sample a batch of trajectories from $\mathcal{D}$;
17:        Update $\Phi$, $\alpha$, $\Lambda$ according to Eq. (7)-Eq. (10).
18:      end while
19:    end while
20:  end for
21: end for
22: return Attack model parameters $\{\Phi^*,\alpha^*,\Lambda^*\}$.

Soft $\mathcal{Q}$-Network.

Its loss function is defined as:

$$\mathcal{L}_{\mathcal{Q}_{\Lambda}}=\mathbb{E}_{(\textbf{s}_k,\textbf{a}_k)\sim\mathcal{D}}\Big[\frac{1}{2}\Big(\mathcal{Q}_{\Lambda}(\textbf{s}_k,\textbf{a}_k)-\textbf{r}_k(\textbf{s}_k,\textbf{a}_k)-\lambda\big(\mathcal{Q}_{\Lambda}(\textbf{s}_{k+1},\textbf{a}_{k+1})-\alpha\log(\mathcal{P}_{\Phi}(\textbf{a}_{k+1}|\textbf{s}_{k+1}))\big)\Big)^2\Big],\quad(10)$$

where $\textbf{s}_k$ and $\textbf{a}_k$ are sampled from the replay buffer $\mathcal{D}$, and $\textbf{a}_{k+1}$ is sampled from the current policy during the training process.
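A hedged sketch of the soft $\mathcal{Q}$-network update in Eq. (10), again with illustrative stand-in networks and names (not the authors' implementation):

```python
import torch
import torch.nn as nn

state_dim, action_dim, lam, alpha = 6, 4, 0.99, 0.2
q_net = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.ReLU(), nn.Linear(32, 1))

def q_loss(s, a, r, s_next, a_next, log_prob_next):
    """Soft Bellman residual: 1/2 (Q(s,a) - r - lambda*(Q(s',a') - alpha*log P(a'|s')))^2."""
    q_sa = q_net(torch.cat([s, a], dim=-1)).squeeze(-1)
    with torch.no_grad():
        q_next = q_net(torch.cat([s_next, a_next], dim=-1)).squeeze(-1)
        target = r + lam * (q_next - alpha * log_prob_next)
    return 0.5 * ((q_sa - target) ** 2).mean()

batch = 8   # toy batch sampled from the replay buffer
loss = q_loss(torch.randn(batch, state_dim), torch.randn(batch, action_dim),
              torch.randn(batch), torch.randn(batch, state_dim),
              torch.randn(batch, action_dim), torch.randn(batch))
```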

Attack model training.

After defining the loss functions of the parameterized policy network, the parameterized soft $\mathcal{Q}$-network, and the temperature parameter, we can now train our RL-based attack model. Suppose we are given a trained DyGCN model $\mathcal{F}_{\Theta^*}$, a validation set, and a testing set. We use the validation set and the Adam algorithm to train our attack against the DyGCN model. Specifically, we first update the parameters $\Phi$ of the policy network $\mathcal{P}_{\Phi}$, which guides the policy improvement; then we update the temperature parameter $\alpha$ based on the updated $\Phi$; and third, we update the parameters $\Lambda$ of the soft $\mathcal{Q}$-network, which evaluates the effectiveness of the policy. We iteratively and alternately update these parameters until reaching the terminal condition. Finally, we evaluate our trained attack model on the testing set. The training and evaluation procedures of our attack are detailed in Algorithm 1 and Algorithm 2, respectively.

Algorithm 2: Evaluate our attack model on a testing set.
Input: A set of $M_{te}$ testing examples $(\mathcal{S}^{te},\mathcal{A}^{te})=\{(S^{te}_{[t-n:t-1]},A^{te}_t)\}_{t=n+1}^{n+1+M_{te}}$; trained DyGCN model $\mathcal{F}_{\Theta^*}$; trained attack model $\Theta_{att}$; parameters $\mu$, $\rho$, $\delta$, $n$.
Output: Perturbed testing examples $\widetilde{\mathcal{S}}^{te}$ and score $f(\widetilde{\mathcal{S}}^{te})$.
1: for each $\mathcal{S}^{te}_{[t-n:t-1]}\in\mathcal{S}^{te}$ do
2:   Select a fraction $\mu$ of the $n$ graphs (denoted as $\mathcal{S}^{te}_{\mu}$) from $\mathcal{S}^{te}_{[t-n:t-1]}$ to be attacked;
3:   Select $V_p$ and $V_n$ using the other $(1-\mu)n$ graphs;
4:   for each $A_j\in\mathcal{S}^{te}_{\mu}$ do
5:     Initialize the perturbed graph $\widetilde{A}^0_j\leftarrow A_j$;
6:     while attack step $k\leq\delta|E_j|/2$ do
7:       Obtain state $\textbf{s}_k$ via Eq. (2) and action $\textbf{a}_k$ using the attack model $\Theta_{att}$ and $\mathcal{F}_{\Theta^*}$;
8:       Update the perturbed graph $\widetilde{A}^{k+1}_j\leftarrow\widetilde{A}^k_j\cup\textbf{a}_k$;
9:     end while
10:  end for
11: end for
12: return $\widetilde{\mathcal{S}}^{te}$, $f(\widetilde{\mathcal{S}}^{te})$

4 Evaluation

Table 2: Dataset statistics.
Dataset Nodes Edges Graphs
Haggle 274 28.2k 30
Hypertext 113 20.8k 30
Trapping 1.5k 4.6k 30

Experimental Setup

Dataset description.

We evaluate our RL-based black-box evasion attack to DyGCN, a state-of-the-art LPDG method, on three graph datasets, i.e., Haggle, Hypertext, and Trapping, from three different domains. We split each dataset into 30 graphs in chronological order.

  • Haggle. This is a social network available at KONECT (http://konect.uni-koblenz.de/networks/contact), representing connections between users measured by wireless devices. A node represents a user, and a link between two users indicates a contact between them. The dataset was collected over 5 days.

  • Hypertext. This is a face-to-face contact network of about 100 attendees of the ACM Hypertext 2009 conference, held in Turin, Italy, over three days from June 29 to July 1, 2009. The network is provided by SocioPatterns (http://www.sociopatterns.org/datasets). In the network, a node represents a conference attendee, and an edge represents a face-to-face contact between two attendees that was active for at least 20 seconds.

  • Trapping. This is a real-world animal interaction network from the Network Data Repository (http://networkrepository.com/mammalia-voles-rob-trapping.php). The animal interaction data come from published studies of wild, captive, and domesticated animals. Each node represents an animal (a vole), and an edge is added to the network whenever two voles were caught in at least one common trap during the primary trapping sessions. Each graph in the dataset spans 12 hours.

Figure 3: Impact of $\delta$ ($\rho=0.30$, $\mu=1.0$). Panels: (a) Haggle, (b) Hypertext, (c) Trapping.
Figure 4: Impact of $\rho$ ($\delta=0.02$, $\mu=1.0$). Panels: (a) Haggle, (b) Hypertext, (c) Trapping.
Figure 5: Impact of $\mu$ ($\delta=0.02$, $\rho=0.30$). Panels: (a) Haggle, (b) Hypertext, (c) Trapping.

Configuration.

Each dataset is divided into a training set, a validation set, and a testing set, where each training/validation/testing example consists of a sequence of $n=10$ graphs. As each dataset has 30 graphs, we have 20 examples in total. We use $M_{tr}=10$ training examples to train DyGCN, $M_{val}=5$ validation examples to train our attack model, and the remaining $M_{te}=5$ testing examples to evaluate our attack. All our experiments are performed on a Linux machine with 128GB memory and 10 cores.

In DyGCN, the number of GCN layers is $L=3$, with each layer having 64 units, and the LSTM has 2 layers. As our graph datasets do not have node features, we use the identity matrix as the node feature matrix, similar to [35]. In our attack, several key parameters could affect the attack performance: the fraction $\mu$ of total graphs to be perturbed in an example, the fraction $\rho$ of total nodes to be selected as the popular/neglected nodes, and the fraction $\delta$ of total edges to be perturbed in each graph. By default, we set $\mu=1.0$, $\rho=0.30$, and $\delta=0.02$. We also explore the impact of these parameters, fixing the other parameters at their default values when studying a specific one. Specifically, we consider $\mu\in\{0.6,0.8,1.0\}$, $\rho\in\{0.15,0.30,0.45\}$, and $\delta\in\{0.01,0.02,0.05\}$.

Compared attacks.

There are no existing works on attacking dynamic link prediction in the black-box setting. Here, we propose two baseline attacks and compare them with our attack under the same setting (e.g., the same number of graphs to be attacked, the same attack budget).

  • Random-whole. In this attack, the attacker randomly deletes an edge from and adds an edge to the whole graph in each attack step.

  • Random-partial. In this attack, in each attack step, the attacker randomly selects a pair of nodes from $V_p$ and deletes the edge if one exists between them; and selects another pair of nodes from $V_n$ and adds an edge if there is none between them.

Evaluation metric.

Similar to existing works [5, 23, 21], we use the $F_1$ score to evaluate the performance of a link prediction algorithm. Thus, the function $f$ in Equation 4 is the $F_1$ score. The higher/lower the $F_1$ score, the better/worse the link prediction method performs. All results are reported as the averaged $F_1$ score over the testing examples.
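For reference, a small sketch of how an $F_1$ score over predicted links can be computed from adjacency matrices (a plain implementation, not the authors' evaluation script):

```python
import numpy as np

def link_f1(A_true, A_pred):
    """F1 score treating each (i, j) entry of the adjacency matrix as a binary label."""
    tp = np.sum((A_pred == 1) & (A_true == 1))
    fp = np.sum((A_pred == 1) & (A_true == 0))
    fn = np.sum((A_pred == 0) & (A_true == 1))
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0

A_true = np.random.randint(0, 2, (20, 20))
A_pred = np.random.randint(0, 2, (20, 20))
print(link_f1(A_true, A_pred))
```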

Table 3: Link prediction results ($F_1$ score) under no attack and the compared attacks.
Dataset Haggle Hypertext Trapping
No attack (NO) 0.9502 0.5576 0.6501
Random-whole (R-W) 0.9467 0.5532 0.5512
Random-partial (R-P) 0.9353 0.5361 0.5024
Our attack (Ours) 0.8322 0.4645 0.3542

Attack Results on DyGCN

Effectiveness of our attack.

Table 3 shows DyGCN's performance on the clean graphs, i.e., under no attack, and on the perturbed graphs, i.e., under our attack and the two random attacks. We observe that our attack is effective. For instance, compared with no attack, our attack causes a 12.4%, 16.7%, and 45.5% $F_1$ score degradation on the three datasets, respectively. Moreover, our attack is much more effective than the random attacks. For instance, on Trapping, our attack has a 30.3% and 22.8% performance gain over the Random-whole attack and the Random-partial attack, respectively.

Impact of $\delta$.

Figure 3 shows DyGCN's $F_1$ score under all compared attacks vs. the fraction $\delta$ of total edges perturbed. We observe that as $\delta$ increases, all attacks achieve better attack performance. This is because a larger $\delta$ means the attacker is capable of perturbing a larger number of links/edges. Moreover, our attack is much more effective than the random attacks at any given $\delta$. For instance, when only 1% of edges can be perturbed, our attack reduces the $F_1$ score by 8.8%, 9.0%, and 37.9%, while the random attacks reduce it by around 1.3%, 1.4%, and 16.6%, on the three datasets, respectively.

Impact of $\rho$.

Figure 4 shows DyGCN's $F_1$ score under all compared attacks vs. the fraction $\rho$ of total nodes selected as the popular/neglected node sets. Similarly, we observe that as $\rho$ increases, all attacks' performance increases. However, our attack requires a much smaller $\rho$ than the random attacks to achieve a promising attack performance.

Impact of $\mu$.

Figure 5 shows DyGCN's $F_1$ score under all compared attacks vs. the fraction $\mu$ of total graphs perturbed in each example. Similarly, we observe that as $\mu$ increases, the $F_1$ score under all attacks decreases. However, our attack already obtains a good attack performance when perturbing a relatively small number of graphs (e.g., $\mu=0.6$).

Figure 6: Running time of training our attack vs. $\rho$, $\mu$, and $\delta$.

Efficiency of our attack.

We use running time as the metric to evaluate the efficiency of training our attack. Figure 6 shows the running time vs. $\rho$, $\mu$, and $\delta$ on Haggle. When we vary one fraction, the other two are set to their default values. Note that we have similar observations on the other two datasets, and thus show results on Haggle only for simplicity. We observe that the running time is linear in each of the three parameters. Thus, we conclude that our attack is also efficient.

5 Conclusion

We propose the first black-box evasion attack against graph neural network-based link prediction in dynamic graphs (LPDG). Our attack aims to perturb the graph structure so as to fool the LPDG model, without knowing the model parameters, model architecture, etc. We formulate our attack as an optimization problem, which is NP-hard, and then develop a reinforcement learning-based method to realize our attack. Experimental results on three real-world graph datasets show that our attack is both effective and efficient.

6 Related Work

We review existing works that perform attacks against graph data. These methods can be summarized as attacking graph-based clustering [6], graph-based collective classification [29, 32], graph embedding [9, 26, 3, 1, 2], and graph neural networks [42, 8, 43, 32, 33, 34, 27, 38]. For instance, [6] designed a practical attack against spectral clustering, a type of graph-based clustering method. [32] proposed an attack against the collective classification method called linearized belief propagation by manipulating the graph structure. Zügner et al. [42] developed Nettack to attack the graph convolutional network (GCN) [17] by perturbing both the node features and the graph structure. However, we note that all these attacks focus on attacking a single static graph.

To the best of our knowledge, only one work [4] studies adversarial attacks to link prediction in dynamic graphs. However, this attack is white-box (i.e., the attacker knows all information about the trained model), and is specially designed for a specific LPDG method called deep dynamic graph embedding. In contrast, our attack is black-box and is applicable to arbitrary LPDG methods. In our work, for simplicity, we focus on attacking the state-of-the-art DyGCN method [21].

References

  • [1] Aleksandar Bojchevski and Stephan Günnemann. Adversarial attacks on node embeddings via graph poisoning. In ICML, 2019.
  • [2] Heng Chang, Yu Rong, Tingyang Xu, Wenbing Huang, Honglei Zhang, Peng Cui, Wenwu Zhu, and Junzhou Huang. A restricted black-box adversarial framework towards attacking graph embedding models. In AAAI, 2020.
  • [3] Jinyin Chen, Yangyang Wu, Xuanheng Xu, Yixian Chen, Haibin Zheng, and Qi Xuan. Fast gradient attack on network embedding. arXiv, 2018.
  • [4] Jinyin Chen, Jian Zhang, Zhi Chen, Min Du, Feifei Li, and Qi Xuan. Time-aware gradient attack on dynamic network link prediction. arXiv, 2019.
  • [5] Jinyin Chen, Jian Zhang, Xuanheng Xu, Chenbo Fu, Dan Zhang, Qingpeng Zhang, and Qi Xuan. E-lstm-d: A deep learning framework for dynamic network link prediction. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019.
  • [6] Yizheng Chen, Yacin Nadji, Athanasios Kountouras, Fabian Monrose, Roberto Perdisci, Manos Antonakakis, and Nikolaos Vasiloglou. Practical attacks against graph-based clustering. In CCS, 2017.
  • [7] Carter Chiu and Justin Zhan. Deep learning for link prediction in dynamic networks using weak estimators. IEEE Access, 2018.
  • [8] Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. Adversarial attack on graph structured data. In ICML, 2018.
  • [9] Quanyu Dai, Qiang Li, Jian Tang, and Dan Wang. Adversarial network embedding. In AAAI, 2018.
  • [10] Palash Goyal, Sujit Rokka Chhetri, and Arquimedes Canedo. dyngraph2vec: Capturing network dynamics using dynamic graph representation learning. Knowledge-Based Systems, 2020.
  • [11] Palash Goyal, Nitin Kamra, Xinran He, and Yan Liu. Dyngem: Deep embedding method for dynamic graphs. arXiv, 2018.
  • [12] Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In KDD, 2016.
  • [13] Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, et al. Soft actor-critic algorithms and applications. arXiv, 2018.
  • [14] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 1997.
  • [15] Nahla Mohamed Ahmed Ibrahim and Ling Chen. Link prediction in dynamic social networks by integrating different types of information. Applied Intelligence, 2015.
  • [16] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
  • [17] Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. In ICLR, 2017.
  • [18] Jundong Li, Harsh Dani, Xia Hu, Jiliang Tang, Yi Chang, and Huan Liu. Attributed network embedding for learning in a dynamic environment. In CIKM, 2017.
  • [19] Xiaoyi Li, Nan Du, Hui Li, Kang Li, Jing Gao, and Aidong Zhang. A deep learning approach to link prediction in dynamic networks. In SDM, 2014.
  • [20] Xiaoke Ma, Penggang Sun, and Yu Wang. Graph regularized nonnegative matrix factorization for temporal link prediction in dynamic networks. Physica A: Statistical mechanics and its applications, 2018.
  • [21] Franco Manessi, Alessandro Rozza, and Mario Manzo. Dynamic graph convolutional networks. Pattern Recognition, 2020.
  • [22] Giang Hoang Nguyen, John Boaz Lee, Ryan A Rossi, Nesreen K Ahmed, Eunyee Koh, and Sungchul Kim. Continuous-time dynamic network embeddings. In WWW, 2018.
  • [23] Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao B Schardl, and Charles E Leiserson. Evolvegcn: Evolving graph convolutional networks for dynamic graphs. In AAAI, 2020.
  • [24] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. Deepwalk: Online learning of social representations. In KDD, 2014.
  • [25] Purnamrita Sarkar, Deepayan Chakrabarti, and Michael Jordan. Nonparametric link prediction in dynamic networks. In ICML, 2012.
  • [26] Mingjie Sun, Jian Tang, Huichen Li, Bo Li, Chaowei Xiao, Yao Chen, and Dawn Song. Data poisoning attack against unsupervised node embedding methods. arXiv preprint arXiv:1810.12881, 2018.
  • [27] Yiwei Sun, Suhang Wang, Xianfeng Tang, Tsung-Yu Hsieh, and Vasant Honavar. Adversarial attacks on graph neural networks via node injections: A hierarchical reinforcement learning approach. In The Web Conference, 2020.
  • [28] Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. Line: Large-scale information network embedding. In WWW, 2015.
  • [29] MohamadAli Torkamani and Daniel Lowd. Convex adversarial collective classification. In ICML, 2013.
  • [30] Rakshit Trivedi, Hanjun Dai, Yichen Wang, and Le Song. Know-evolve: Deep temporal reasoning for dynamic knowledge graphs. In ICML, 2017.
  • [31] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. In ICLR, 2018.
  • [32] Binghui Wang and Neil Zhenqiang Gong. Attacking graph-based classification via manipulating the graph structure. In CCS, 2019.
  • [33] Huijun Wu, Chen Wang, Yuriy Tyshetskiy, Andrew Docherty, Kai Lu, and Liming Zhu. Adversarial examples on graph data: Deep insights into attack and defense. In IJCAI, 2019.
  • [34] Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. Topology attack and defense for graph neural networks: An optimization perspective. In IJCAI, 2019.
  • [35] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In ICLR, 2019.
  • [36] Bing Yu, Haoteng Yin, and Zhanxing Zhu. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In IJCAI, 2018.
  • [37] Wenchao Yu, Wei Cheng, Charu C Aggarwal, Kai Zhang, Haifeng Chen, and Wei Wang. Netwalk: A flexible deep embedding approach for anomaly detection in dynamic networks. In KDD, 2018.
  • [38] Zaixi Zhang, Jinyuan Jia, Binghui Wang, and Neil Zhenqiang Gong. Backdoor attacks to graph neural networks. arXiv, 2020.
  • [39] Zhongbao Zhang, Jian Wen, Li Sun, Qiaoyu Deng, Sen Su, and Pengyan Yao. Efficient incremental dynamic link prediction algorithms in social network. Knowledge-Based Systems, 2017.
  • [40] Ling Zhao, Yujiao Song, Chao Zhang, Yu Liu, Pu Wang, Tao Lin, Min Deng, and Haifeng Li. T-gcn: A temporal graph convolutional network for traffic prediction. IEEE Transactions on Intelligent Transportation Systems, 2019.
  • [41] Le-kui Zhou, Yang Yang, Xiang Ren, Fei Wu, and Yueting Zhuang. Dynamic network embedding by modeling triadic closure process. In AAAI, 2018.
  • [42] Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. Adversarial attacks on neural networks for graph data. In KDD, 2018.
  • [43] Daniel Zügner and Stephan Günnemann. Adversarial attacks on graph neural networks via meta learning. In ICLR, 2019.
  • [44] Yuan Zuo, Guannan Liu, Hao Lin, Jia Guo, Xiaoqian Hu, and Junjie Wu. Embedding temporal network via neighborhood formation. In KDD, 2018.