
Hypergraph-Based Dynamic Graph Node Classification

Xiaoxu Ma1, Chen Zhao2, Minglai Shao1*, Yujie Lin1 (*Corresponding author)
1School of New Media and Communication, Tianjin University, Tianjin, China
2Department of Computer Science, Baylor University, Waco, Texas, USA
{maxiaoxu, shaoml, linyujie_22}@tju.edu.cn, [email protected]
Abstract

Node classification on static graphs has achieved significant success, but accurate node classification on dynamic graphs, where node topology, attributes, and labels change over time, has not been well addressed. Existing methods based on RNNs and self-attention only aggregate features of the same node across different time slices, which cannot adequately capture the diverse dynamic changes in dynamic graphs. Therefore, we propose a novel model named Hypergraph-Based Multi-granularity Dynamic Graph Node Classification (HYDG). After obtaining basic node representations for each slice through a GNN backbone, HYDG models the representations of each node in the dynamic graph through two modules: an individual-level hypergraph that captures spatio-temporal representations between individual nodes, and a group-level hypergraph that captures multi-granularity group temporal representations among nodes of the same class. Each hyperedge connects multiple nodes within a specific time range and thereby captures temporal dependencies of varying lengths. More accurate representations are then obtained through weighted information propagation and aggregation in a hypergraph neural network. Extensive experiments on five real-world dynamic graph datasets with two GNN backbones demonstrate the superiority of the proposed framework.

Index Terms:
Dynamic Graph, Hypergraph, Node Classification

I Introduction

As a widely employed data structure, graphs excel at representing intricate data and possess robust capabilities for storing and articulating node attributes and entity relationships. The current landscape of graph neural networks [1, 2, 3, 4, 5, 6, 7] predominantly operates on static graph structures [8]. However, many real-world scenarios involve dynamic graph structures, such as recommender systems, social media analytics [9], transportation systems [10], citation networks [11], and epidemic modeling. Dynamic graphs can be broadly classified into discrete-time and continuous-time dynamic graphs [12]. In both forms, node features, node labels, and graph structures may evolve over time [13], posing a challenge for traditional static graph models in capturing the dynamic relationships among node features. Consequently, the effective capture and utilization of node relationships and features in dynamic networks have emerged as pivotal research questions.

Figure 1: An illustrative example of interest prediction in dynamic social graphs: predicting users' interest tags at a specific time based on historical data. In this network, both user connections and interest tags evolve over time, and the goal is to forecast users' interest tags using information from the interest network.

In this paper, we focus on the task of node classification in discrete dynamic graphs: predicting the node labels in unknown future graph snapshots given the graph information at previous time steps. As illustrated in Fig. 1, in social interest networks a user's social connections and personal information may change significantly over time, leading to shifts in their interest labels. Current methods primarily fall into two categories [12, 14]: one uses RNNs [15, 16, 17, 18] to capture temporal relationships between nodes, while the other employs attention mechanisms [19, 13] to capture dynamic relationships between nodes. However, in diverse real-world dynamic networks, simply using RNNs or self-attention to link the representations of the same node across different time slices may fail to adequately capture the features of various node categories. In scenarios involving changes in node features, label shifts, node disappearance, or the addition of new nodes, traditional dynamic graph learning methods struggle to perform inductive learning effectively, and thus fail to cope with the complex node dynamics within dynamic networks.

To address these challenges, we propose the Hypergraph-Based Multi-granularity Dynamic Graph Node Classification (HYDG) algorithm, which captures both individual- and group-level node features using hypergraphs [20, 21, 22, 23]. By connecting multiple nodes within specific time intervals through hyperedges, HYDG constructs multi-granularity features, enabling more accurate classification via hypergraph neural networks. Compared to traditional methods, HYDG more effectively models high-order dependencies and enhances temporal information capture at both the individual and group levels through weighted information propagation. Our contributions include:

  • We introduce a novel hypergraph-based algorithm for dynamic graph node classification that effectively captures diverse and robust spatio-temporal dependencies through hypergraph construction, weighted node propagation, and feature aggregation.

  • We develop a multi-scale hypergraph construction approach that concurrently captures both individual- and group-level node features, enhancing the diversity of node representations.

  • Our model, built upon two GNN backbones, demonstrates superior performance on five real-world datasets, consistently outperforming baseline models.

II Methodology

The primary focus of this paper is node classification on discrete dynamic graphs. Given the dynamic graph snapshots $\mathbf{G}_{P}=\{\mathbf{G}^{1},\mathbf{G}^{2},\ldots,\mathbf{G}^{t}\}$ up to time step $t$ and the corresponding node labels $\mathbf{Y}_{P}=\{\mathbf{Y}^{1},\mathbf{Y}^{2},\ldots,\mathbf{Y}^{t}\}$, our goal is to classify all nodes in future graph snapshots $\mathbf{G}_{F}=\{\mathbf{G}^{t+1},\mathbf{G}^{t+2},\ldots,\mathbf{G}^{T}\}$, whose labels $\mathbf{Y}_{F}=\{\mathbf{Y}^{t+1},\mathbf{Y}^{t+2},\ldots,\mathbf{Y}^{T}\}$ are unknown.
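For concreteness, the following minimal sketch (our illustration, not the authors' released code; `Snapshot` and `split_snapshots` are hypothetical names) shows one way to represent discrete dynamic graph snapshots for this task:

```python
# A minimal sketch of discrete dynamic-graph snapshots G^1..G^T.
from dataclasses import dataclass
import torch

@dataclass
class Snapshot:
    adj: torch.Tensor     # A^t: (n x n) adjacency matrix of G^t
    feat: torch.Tensor    # X^t: (n x d) node feature matrix
    labels: torch.Tensor  # Y^t: (n,) node labels (unknown for future slices)

def split_snapshots(snapshots, t):
    """G_P: snapshots with observed labels; G_F: future snapshots to classify."""
    return snapshots[:t], snapshots[t:]
```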

Figure 2: The detailed architecture of HYDG consists of three components: graph embedding (blue and orange represent different labels), hypergraph construction (for better visualization, we only illustrate the binary clustering approach), and hypergraph propagation.

II-A Dynamic Graph Feature Extraction

Static graph neural networks have been widely applied in graph representation learning tasks. We therefore first use a backbone GNN to extract features for each node of every temporal slice, as shown in Fig. 2(a), yielding a feature representation for each node at each time step. For the $t$-th temporal slice $\mathbf{G}^{t}$ of the dynamic graph, the feature representations are obtained as

$\mathbf{Z}^{t}=\text{GNN}(\mathbf{A}^{t},\mathbf{X}^{t}), \quad (1)$

where $\mathbf{Z}^{t}=[\mathbf{z}_{1}^{t},\mathbf{z}_{2}^{t},\ldots,\mathbf{z}_{n}^{t}]$ is the matrix of feature embeddings for the nodes in the graph snapshot $\mathbf{G}^{t}$ produced by the GNN, $n$ is the number of nodes, $\mathbf{A}^{t}$ denotes the adjacency matrix of $\mathbf{G}^{t}$, and $\mathbf{X}^{t}$ is the initial node feature matrix of $\mathbf{G}^{t}$.
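As a hedged illustration of Eq. (1), the sketch below applies a shared dense two-layer GCN to each snapshot. `DenseGCN` and its dimensions are our own assumptions for self-containment; the paper uses GCN and GraphSAGE backbones.

```python
# Sketch of Eq. (1): a shared GNN backbone applied per snapshot.
import torch
import torch.nn as nn

class DenseGCN(nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, out_dim)

    @staticmethod
    def normalize(adj):
        # \hat{A} = D^{-1/2} (A + I) D^{-1/2}
        a = adj + torch.eye(adj.size(0))
        d = a.sum(1).pow(-0.5)
        return d.unsqueeze(1) * a * d.unsqueeze(0)

    def forward(self, adj, x):
        a = self.normalize(adj)
        h = torch.relu(self.w1(a @ x))
        return a @ self.w2(h)  # Z^t

# dims are placeholders:
gnn = DenseGCN(in_dim=100, hid_dim=64, out_dim=32)
# per-slice embeddings: Z = [gnn(s.adj, s.feat) for s in snapshots]
```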

II-B Hypergraph Construction

In dynamic graphs, nodes possess both individual features and common features shared within their subgroups. To better capture the higher-order correlations of nodes within each time slice, we construct multi-scale hypergraphs by capturing individual-level and group-level dependencies, as shown in Fig. 2(b).

Individual-level hypergraph construction.  We capture the individual high-order dependencies of each node in the dynamic graph by constructing a series of individual hyperedges, denoted as $\mathcal{G}_{in}=(\mathcal{V}_{in},\mathcal{E}_{in})$, where $\mathcal{V}_{in}$ is the set of individual nodes and $\mathcal{E}_{in}$ the set of hyperedges. Specifically, for a given node $v_{it}$, we identify the $K$ nodes from other time slices within a specified time range whose feature vectors obtained from (1) are closest to it, excluding the current time slice. These $K+1$ nodes are then connected via a hyperedge $\mathbf{e}_{it}$, as shown in (2).

$\mathbf{e}_{it}=\{v_{it},\ \forall v_{jt^{\prime}}\in\mathcal{N}_{K}(v_{it})\},\ \text{s.t.}\ |t-t^{\prime}|\leq\tau, \quad (2)$

where $\mathcal{N}_{K}$ denotes the set of $K$ nearest neighbors, which can be computed with various distance metrics such as Euclidean, Chebyshev, or cosine distance, $|t-t^{\prime}|$ denotes the temporal distance between nodes, and $\tau$ is the threshold of the time range. We model multi-scale temporal dependencies in the dynamic graph by setting three thresholds, corresponding to short-term, mid-term, and long-term connections. We construct the individual hypergraph $\mathcal{G}_{in}$ from the $v_{it}$ and $\mathbf{e}_{it}$ obtained for all nodes in each time slice.
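A possible implementation of this construction, assuming cosine similarity as the metric and a single window $\tau$ (the helper `individual_hyperedges` is our own name, not the paper's), might look like:

```python
# Sketch of Eq. (2): K nearest neighbours from other slices within |t-t'| <= tau.
import torch
import torch.nn.functional as F

def individual_hyperedges(Z, K=5, tau=2):
    """Z: list of (n x d) per-slice embedding matrices from Eq. (1)."""
    T, n = len(Z), Z[0].size(0)
    edges = []
    for t in range(T):
        # candidates: nodes from other slices within the time window tau
        cand = [(tp, j)
                for tp in range(max(0, t - tau), min(T, t + tau + 1))
                if tp != t
                for j in range(n)]
        if len(cand) < K:
            continue
        C = F.normalize(torch.stack([Z[tp][j] for tp, j in cand]), dim=1)
        Q = F.normalize(Z[t], dim=1)
        top = (Q @ C.T).topk(K, dim=1).indices   # K most similar per node
        for i in range(n):
            edges.append([(t, i)] + [cand[k] for k in top[i].tolist()])
    return edges   # each hyperedge: K+1 (time, node) pairs, cf. Eq. (2)

# Running this with three tau values (short/mid/long) gives the multi-scale
# hyperedge sets described above.
```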

Group-level hypergraph construction. To better capture the multi-granularity group features of each category in dynamic graphs, we construct a series of group-level hypergraphs $\mathcal{G}_{group}=(\mathcal{V}_{group},\mathcal{E}_{group})$, comprising group nodes $\mathcal{V}_{group}$ and group hyperedge sets $\mathcal{E}_{group}$. Concretely, we first group the feature vectors $\mathbf{Z}^{t}$ according to the true node labels $\mathbf{Y}^{t}$, and then hierarchically cluster and aggregate these groups within each time slice. This yields spatio-temporal group representations, which are then used to construct hypergraphs for each node category, capturing both the features and their spatio-temporal dependencies in the dynamic network. Mathematically,

$\mathbf{Z}^{c,t}_{group}=\{Agg\left(Cluster_{M}(\{\mathbf{z}_{v}^{t}\mid y_{v}^{t}=c\})\right)\}, \quad (3)$

where $y_{v}^{t}$ denotes the label of node $v$ at time $t$. By grouping node embeddings with the same label, we obtain $\mathbf{O}^{c,t}=\{\mathbf{z}_{v}^{t}\mid y_{v}^{t}=c\}$, where $C$ is the number of node categories. The function $Cluster_{M}(\cdot)$ clusters the grouped embeddings: applying $Cluster_{M}(\mathbf{O}^{c,t})$ yields the clusters $\{\mathbf{O}^{c,t}_{1},\mathbf{O}^{c,t}_{2},\ldots,\mathbf{O}^{c,t}_{M}\}$, where $M$ denotes the number of clusters. The aggregation function $Agg(\cdot)$, which may be an operation such as $\max(\cdot)$, $\min(\cdot)$, or $avg(\cdot)$, aggregates the vectors within each cluster. Finally, for each class we obtain $\mathbf{Z}^{c,T}_{group}\in\mathbb{R}^{M\times T}$. This process captures the diverse temporal characteristics of nodes within the same category across different clusters at various time steps.

We merge the per-slice representations $\mathbf{Z}^{c,t}_{group}$ from (3) to obtain the grouped $\mathbf{Z}^{c,T}_{group}$. Following the individual-level construction in (2), we build hyperedges over $\mathbf{Z}^{c,T}_{group}$ for each class to capture the spatio-temporal evolution of category-level group characteristics, yielding $\mathcal{V}_{group}^{c}$ and $\mathcal{E}_{group}^{c}$. Hence, we obtain $\mathcal{G}_{group}=\{\mathcal{G}_{group}^{c}\}_{c\in C}$.
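Assuming the hierarchical clustering is, for instance, scikit-learn's agglomerative clustering and $Agg$ is the average (both our assumptions; `group_nodes` is a hypothetical helper), Eq. (3) could be sketched as:

```python
# Sketch of Eq. (3): label-wise grouping, clustering, and aggregation.
import torch
from sklearn.cluster import AgglomerativeClustering

def group_nodes(Z, Y, num_classes, M=3):
    """Z: list of (n x d) embedding matrices; Y: list of (n,) label tensors."""
    out = {c: [] for c in range(num_classes)}
    for t, (z, y) in enumerate(zip(Z, Y)):
        for c in range(num_classes):
            O = z[y == c]                    # O^{c,t}: class-c embeddings
            if O.size(0) < M:                # too few nodes to cluster
                continue
            labels = AgglomerativeClustering(n_clusters=M).fit_predict(
                O.detach().numpy())
            for m in range(M):
                members = O[torch.as_tensor(labels) == m]
                out[c].append((t, members.mean(0)))   # Agg = avg
    # group nodes per class; later linked by Eq. (2)-style hyperedges
    return out
```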

II-C Hypergraph Propagation

After constructing the individual-level hypergraph $\mathcal{G}_{in}$ and the group-level hypergraph $\mathcal{G}_{group}$, we employ Hypergraph Neural Networks (HGNN) [24] for node-weighted information propagation and feature updating to obtain spatio-temporal fusion features within hyperedges, as illustrated in Fig. 2(c).

For a given hypergraph $\mathcal{G}$, the incidence matrix $\mathbf{H}$ is defined on the node set $\mathcal{V}$ and hyperedge set $\mathcal{E}$ as:

$H(v,e)=1 \text{ if } v\in e, \text{ else } 0, \quad \forall v\in\mathcal{V},\ \forall e\in\mathcal{E}. \quad (4)$

Based on $\mathcal{G}_{in}$ and $\mathcal{G}_{group}$, we derive the individual- and group-level incidence matrices $\mathbf{H}_{in}$ and $\mathbf{H}_{group}$ via (4). Following the approach of frequency-domain GCNs [1], we use the Fourier transform to shift spatial signals to the frequency domain for convolution and then apply the inverse Fourier transform [24], formalized as:

$\mathbf{Z}^{(l+1)}=\sigma\left(\mathbf{D}_{v}^{-\frac{1}{2}}\mathbf{H}\mathbf{W}\mathbf{D}_{e}^{-1}\mathbf{H}^{T}\mathbf{D}_{v}^{-\frac{1}{2}}\mathbf{Z}^{(l)}\Theta^{(l)}\right), \quad (5)$

where $\mathbf{Z}$ is the feature matrix, $\mathbf{H}$ the incidence matrix, $\mathbf{D}_{v}$ and $\mathbf{D}_{e}$ the diagonal node-degree and hyperedge-degree matrices, $\Theta$ the learnable hypergraph convolution kernel, $\mathbf{W}$ the learnable hyperedge weight matrix, and $\sigma$ a non-linear activation function.
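Eqs. (4)-(5) could be realized as in the following sketch; `incidence` and `HGNNLayer` are illustrative names, and the dense layer follows the HGNN formulation of [24]:

```python
# Sketch of Eqs. (4)-(5): incidence matrix and one dense HGNN layer.
import torch
import torch.nn as nn

def incidence(edges, num_nodes):
    H = torch.zeros(num_nodes, len(edges))
    for e, nodes in enumerate(edges):   # nodes: flat indices of v in edge e
        H[nodes, e] = 1.0               # H(v, e) = 1 if v in e
    return H

class HGNNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim)   # Theta^{(l)}

    def forward(self, H, Z, W=None):
        if W is None:
            W = torch.ones(H.size(1))             # hyperedge weights
        Dv = (H * W).sum(1).clamp(min=1).pow(-0.5)  # D_v^{-1/2} (diagonal)
        De = H.sum(0).clamp(min=1).pow(-1.0)        # D_e^{-1} (diagonal)
        # D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}, via row/column scaling
        G = (Dv.unsqueeze(1) * H * W * De) @ H.T * Dv.unsqueeze(0)
        return torch.relu(G @ self.theta(Z))        # Eq. (5)
```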

More specifically, for a given node $v_{it}$, $H(v_{it})$ denotes the set of hyperedges connected to $v_{it}$, which contain the nodes most relevant to $v_{it}$. Within each such hyperedge, we aggregate the features of the nodes from the other time slices (all nodes except $v_{it}$ itself) using the corresponding weights $\mathbf{w}_{it}^{jt^{\prime}}$ to capture the temporal dependencies within the same hyperedge:

$\mathbf{w}_{it}^{jt^{\prime}}=\exp\left(-\frac{d(\mathbf{z}_{it},\mathbf{z}_{jt^{\prime}})^{2}}{\sigma^{2}}\right), \quad \forall v_{jt^{\prime}}\in\mathbf{e}_{it}, \quad (6)$
$\mathbf{p}^{k,l}_{it}=\sum_{j\neq i,\ v_{jt^{\prime}}\in\mathbf{e}_{k}}\mathbf{z}^{l-1}_{jt^{\prime}}\mathbf{w}_{it}^{jt^{\prime}}, \quad \forall\mathbf{e}_{k}\in H(v_{it}), \quad (7)$

where $v_{it}$ and $v_{jt^{\prime}}$ are a pair of nodes from different time slices within a hyperedge, and $d(\mathbf{z}_{it},\mathbf{z}_{jt^{\prime}})$ denotes the distance between them, computed with any suitable metric such as Euclidean or cosine distance. The bandwidth $\sigma$ can be set to the median distance between all pairs of vertices, and $\mathbf{z}^{l-1}_{jt^{\prime}}$ is the output of node $v_{jt^{\prime}}$ at layer $l-1$ of the HGNN.
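A small sketch of Eqs. (6)-(7), assuming squared Euclidean distance for $d(\cdot,\cdot)$ (`hyperedge_message` is our own helper name):

```python
# Sketch of Eqs. (6)-(7): Gaussian-kernel weights, then a weighted sum of
# the other nodes' features inside one hyperedge.
import torch

def hyperedge_message(z_center, z_others, sigma):
    """z_center: (d,) embedding of v_it; z_others: (K x d) neighbours in e_k."""
    d2 = ((z_others - z_center) ** 2).sum(1)    # squared Euclidean distance
    w = torch.exp(-d2 / sigma ** 2)             # w_it^{jt'}, Eq. (6)
    return (w.unsqueeze(1) * z_others).sum(0)   # p_it^{k,l}, Eq. (7)

# sigma as the median pairwise distance over embeddings Z (n x d), as in the text:
# sigma = torch.median(torch.cdist(Z, Z))
```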

After obtaining the different hyperedge features for the dynamic graph node $v_{it}$, we compute a weight for each hyperedge by measuring the similarity between its hyperedge features and the node features, and normalize the weights using the softmax function. Then, we perform edge-node level spatio-temporal dependency aggregation:

$\mathbf{z}^{l}_{it}=\sum_{k}\frac{\exp(\mathrm{sim}(\mathbf{z}^{k,l-1}_{it},\mathbf{p}^{k,l}_{it}))}{\sum_{k^{\prime}}\exp(\mathrm{sim}(\mathbf{z}^{k^{\prime},l-1}_{it},\mathbf{p}^{k^{\prime},l}_{it}))}\mathbf{p}^{k,l}_{it}. \quad (8)$

The node representations $\mathbf{z}^{l}_{it}$ obtained from the above process are passed through a non-linear function, yielding the integrated representation of node dependencies across time and space.
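Eq. (8) can be read as softmax attention of a node over its hyperedge messages; below is a sketch assuming cosine similarity for $\mathrm{sim}(\cdot,\cdot)$ (`aggregate_hyperedges` is a hypothetical name):

```python
# Sketch of Eq. (8): softmax-weighted aggregation over hyperedge messages.
import torch

def aggregate_hyperedges(z_node, messages):
    """z_node: (d,) embedding of v_it; messages: (num_edges x d) stacked p_it^{k,l}."""
    sim = torch.nn.functional.cosine_similarity(
        z_node.unsqueeze(0), messages, dim=1)    # sim(z, p) per hyperedge
    attn = torch.softmax(sim, dim=0)             # normalised edge weights
    return (attn.unsqueeze(1) * messages).sum(0) # z_it^l
```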

II-D Model Learning

After the hypergraph information propagation, we obtain the individual and group hypergraph representations, $\mathbf{Z}_{in}$ and $\mathbf{Z}_{group}$. By applying the softmax function for each time slice, we compute the predicted labels $\hat{\mathbf{Y}}_{in}$ and $\hat{\mathbf{Y}}_{group}$ for the nodes in $\mathcal{G}_{in}$ and $\mathcal{G}_{group}$, respectively. The true label of every group node in $\mathbf{Z}_{group}^{c}$ is the class $c$ it was constructed from. The losses are as follows:

$\mathcal{L}_{in}(\mathbf{Y},\hat{\mathbf{Y}})=-\sum_{i=1}^{N}y_{i}\log\left(\frac{e^{z_{in}^{i}}}{\sum_{j=1}^{N}e^{z_{in}^{j}}}\right), \quad (9)$
$\mathcal{L}_{group}(\mathbf{Y},\hat{\mathbf{Y}})=-\sum_{c=1}^{C}\sum_{i=1}^{M}\log\left(\frac{e^{z_{group}^{ic}}}{\sum_{j=1}^{C}e^{z_{group}^{ij}}}\right). \quad (10)$

For $\mathcal{L}_{in}$ and $\mathcal{L}_{group}$, we assign different weights, denoted as $\alpha$ and $\beta$ respectively, resulting in the overall loss function $\mathcal{L}_{all}$, which consists of the following two terms:

$\mathcal{L}_{all}=\alpha\cdot\mathcal{L}_{in}+\beta\cdot\mathcal{L}_{group}. \quad (11)$

During training, we construct both individual and group-level hypergraphs to capture diverse node representations using hypergraph neural networks. During testing, without labels, we use only individual-level hypergraphs and the trained network for predictions. The group-level hypergraphs, used for data augmentation, capture evolving relationships within the same class, addressing variations in unseen test sets.
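The objective in Eqs. (9)-(11) amounts to weighted cross-entropy terms over the two hypergraphs; a minimal sketch follows (the $\alpha$, $\beta$ defaults are placeholders, not the paper's settings):

```python
# Sketch of the training objective, Eqs. (9)-(11), read as cross-entropy.
import torch.nn.functional as F

def hydg_loss(logits_in, y_in, logits_group, y_group, alpha=1.0, beta=0.5):
    """logits_*: (num_nodes x C) class scores; y_*: (num_nodes,) integer labels.
    Group nodes inherit the label c of the class they were built from."""
    l_in = F.cross_entropy(logits_in, y_in)            # Eq. (9)
    l_group = F.cross_entropy(logits_group, y_group)   # Eq. (10)
    return alpha * l_in + beta * l_group               # Eq. (11)
```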

III Experiments

TABLE I: Quantitative results on the dynamic node classification task. The results are averages over five runs, with the best result in each column highlighted in bold (\textbf) and the second-best result underlined (\underline).
| Backbone | Method | DBLP5 Acc | DBLP5 AUC | DBLP3 Acc | DBLP3 AUC | Reddit Acc | Reddit AUC | Wiki Acc | Wiki AUC | ML-RATING Acc | ML-RATING AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GCN | GCN [1] | 63.01±0.62 | 0.65±0.02 | 74.23±0.41 | 0.55±0.02 | 31.02±0.33 | \underline{0.52±0.01} | 79.22±0.81 | 0.65±0.01 | 85.53±0.22 | 0.67±0.01 |
| | GCNLSTM [17] | 61.83±1.84 | 0.65±0.01 | 73.37±0.92 | 0.56±0.01 | 30.26±0.42 | 0.50±0.01 | 79.43±1.22 | \textbf{0.68±0.01} | 84.77±0.84 | \underline{0.68±0.02} |
| | RNNGCN [18] | 65.92±0.43 | \textbf{0.70±0.01} | 76.55±0.41 | 0.56±0.01 | \underline{32.68±0.28} | 0.51±0.01 | 78.74±1.18 | 0.61±0.01 | \underline{85.75±0.52} | 0.66±0.01 |
| | GCNSE [19] | 66.58±0.31 | 0.64±0.02 | \underline{76.88±0.52} | 0.53±0.01 | 31.09±0.47 | 0.51±0.01 | 79.49±0.62 | 0.54±0.01 | 85.63±0.34 | 0.57±0.01 |
| | ROLAND [25] | \underline{66.63±0.38} | 0.67±0.00 | 76.15±0.42 | 0.56±0.01 | 32.31±0.39 | \underline{0.52±0.01} | \underline{79.63±0.34} | 0.59±0.04 | 85.69±0.28 | \textbf{0.70±0.01} |
| | EvolveGCN [16] | 63.21±1.24 | 0.57±0.02 | 75.18±0.62 | 0.54±0.02 | 26.32±0.65 | 0.50±0.01 | 76.25±1.01 | 0.57±0.03 | 81.46±0.36 | 0.56±0.06 |
| | DySAT [13] | 65.75±0.81 | 0.62±0.02 | 75.36±0.44 | \underline{0.57±0.01} | 29.39±0.73 | 0.51±0.01 | 77.36±0.92 | 0.60±0.01 | 85.19±0.61 | 0.65±0.02 |
| | HYDG-GCN (ours) | \textbf{68.26±0.20} | \underline{0.69±0.00} | \textbf{77.12±0.13} | \textbf{0.58±0.01} | \textbf{34.12±0.42} | \textbf{0.53±0.01} | \textbf{80.67±0.26} | 0.67±0.02 | \textbf{86.96±0.17} | \textbf{0.70±0.01} |
| GraphSAGE | GraphSAGE [8] | 67.83±0.69 | 0.75±0.01 | 74.06±0.31 | 0.61±0.00 | 31.85±0.85 | 0.51±0.00 | 77.83±1.32 | 0.63±0.05 | 84.15±1.45 | 0.68±0.02 |
| | GCNLSTM [17] | 67.32±1.25 | 0.64±0.01 | 75.75±0.58 | 0.58±0.01 | 32.03±0.53 | \textbf{0.54±0.00} | \underline{78.36±1.19} | 0.62±0.03 | 83.76±2.24 | 0.69±0.02 |
| | RNNGCN [18] | 65.31±0.83 | 0.74±0.01 | 76.66±0.72 | 0.56±0.01 | 31.53±0.98 | 0.52±0.01 | 77.30±1.04 | \textbf{0.67±0.01} | \underline{85.65±1.17} | 0.69±0.02 |
| | GCNSE [19] | 67.36±0.97 | 0.52±0.02 | \textbf{76.94±0.57} | 0.53±0.01 | 31.02±0.69 | 0.50±0.01 | 78.06±1.16 | 0.60±0.01 | 85.29±1.32 | 0.62±0.03 |
| | ROLAND [25] | 68.42±0.52 | 0.74±0.01 | 75.68±0.83 | 0.60±0.01 | 32.61±0.43 | 0.53±0.00 | 78.22±0.59 | 0.64±0.00 | 85.23±0.76 | 0.70±0.01 |
| | EvolveGCN [16] | 63.21±1.24 | 0.57±0.02 | 75.18±0.62 | 0.54±0.02 | 26.32±0.65 | 0.50±0.01 | 76.25±1.01 | 0.57±0.03 | 82.46±0.36 | 0.57±0.06 |
| | DySAT [13] | 65.75±0.81 | 0.65±0.02 | 75.36±0.44 | 0.57±0.01 | 29.39±0.73 | 0.51±0.01 | 77.36±0.92 | 0.60±0.01 | 85.19±0.61 | 0.67±0.01 |
| | HYDG-SAGE (ours) | \textbf{70.40±0.32} | \textbf{0.77±0.00} | 76.51±0.13 | \textbf{0.63±0.01} | \textbf{33.98±0.24} | \textbf{0.54±0.00} | \textbf{80.53±0.47} | 0.66±0.01 | \textbf{86.95±0.35} | \textbf{0.72±0.01} |

III-A Experimental Settings

We conducted experiments on five real-world dynamic graph datasets, with detailed statistics provided in Table II. We selected seven baselines for comparison, including both static [1, 8] and dynamic graph neural networks [13, 18, 19, 16, 25]. The primary task is to predict the node labels in the test set at various time slices using the node features and label information from a limited number of time slices in the training set. To better evaluate model performance, we used accuracy and Macro-AUC as evaluation metrics. Each method employs a two-layer graph neural network, using both GCN and GraphSAGE as GNN layers.

TABLE II: Real-world datasets used to evaluate our method.

| Dataset | Nodes | Edges | Time Steps | Classes | Attributes |
|---|---|---|---|---|---|
| DBLP3 [18] | 4257 | 23540 | 10 | 3 | 100 |
| DBLP5 [18] | 6606 | 42815 | 10 | 5 | 100 |
| Reddit [19] | 8291 | 264050 | 10 | 4 | 20 |
| Wiki [26] | 9227 | 31910 | 11 | 4 | - |
| ML-Rating [27] | 9746 | 2000438 | 11 | 7 | - |

III-B Overall Performance

As shown in Table I, we conduct experiments using both GCN and GraphSAGE backbones, with EvolveGCN and DySAT sharing experimental results across both sets. The results demonstrate that HYDG consistently outperforms the baseline models in node classification across the five datasets. By constructing individual hypergraphs and multi-scale group-level hypergraphs while capturing spatio-temporal dependencies, HYDG achieves the best accuracy on the DBLP5, Reddit, Wiki, and ML-RATING datasets with both backbones, and records the highest AUC scores on the DBLP3, Reddit, and ML-RATING datasets. Notably, HYDG improves accuracy by 2%-3% on the DBLP5 and Reddit datasets over the best-performing baselines, despite slightly trailing RNNGCN in AUC on DBLP5. These results suggest that HYDG's ability to capture spatio-temporal dependencies enables better learning of node representations and relationships in dynamic graphs, outperforming traditional methods based on self-attention mechanisms and RNNs.

III-C Ablation Study

To better explore the roles of individual-level hypergraphs and group-level hypergraphs in capturing spatio-temporal dependencies in dynamic graphs, we conduct ablation experiments on these two modules separately, and the results are shown in Fig. 3. (i) Removal of individual-level hypergraphs. Utilizing only group-level hypergraphs yields poor accuracy and AUC results. This is because group-level hypergraphs capture features only within each slice, resulting in insufficient usable nodes and inadequate capture of features and dependencies across samples. (ii) Removal of group-level hypergraphs. Utilizing only individual-level hypergraphs achieves relatively high accuracy but results in lower AUC. This is because the individual hypergraph construction, which links hyperedges by identifying each node’s nearest neighbors, may capture only single dynamic dependency patterns. Additionally, connections between nodes with different labels can lead to incomplete dynamic feature representations.

Figure 3: Ablation study results of HYDG on node classification tasks across five datasets.

IV Conclusion

This paper presents a dynamic graph node classification framework based on multi-granularity hypergraphs. By constructing individual-level hypergraphs across various time ranges and multi-granularity group-level hypergraphs, the framework effectively captures higher-order spatio-temporal dependencies among dynamic graph nodes. Hypergraph neural networks are employed for weighted information propagation, leading to more accurate and robust node representations. Extensive experiments on five real-world datasets using two backbones demonstrate that the proposed framework outperforms the best baseline models.

References

  • [1] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” arXiv preprint arXiv:1609.02907, 2016.
  • [2] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, “Graph attention networks,” arXiv preprint arXiv:1710.10903, 2017.
  • [3] Q. Tian, C. Zhao, M. Shao, W. Wang, Y. Lin, and D. Li, “Mldgg: Meta-learning for domain generalization on graphs,” arXiv preprint arXiv:2411.12913, 2024.
  • [4] H. Wang, C. Zhao, and F. Chen, “Madod: Generalizing ood detection to unseen domains via g-invariance meta-learning,” arXiv preprint arXiv:2411.02444, 2024.
  • [5] Z. He, C. Zhao, M. Shao, Y. Lin, D. Li, and Q. Tian, “Gdda: Semantic ood detection on graphs under covariate shift via score-based diffusion models,” arXiv preprint arXiv:2410.17526, 2024.
  • [6] M. Shao, D. Li, C. Zhao, X. Wu, Y. Lin, and Q. Tian, “Supervised algorithmic fairness in distribution shifts: A survey,” arXiv preprint arXiv:2402.01327, 2024.
  • [7] Q. Tian, W. Wang, C. Zhao, M. Shao, W. Zhang, and D. Li, “Graphs generalization under distribution shifts,” arXiv preprint arXiv:2403.16334, 2024.
  • [8] W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” Advances in neural information processing systems, vol. 30, 2017.
  • [9] Y. Liu, X. Shi, L. Pierce, and X. Ren, “Characterizing and forecasting user engagement with in-app action graph: A case study of snapchat,” in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, 2019, pp. 2023–2031.
  • [10] F. Li, J. Feng, H. Yan, G. Jin, F. Yang, F. Sun, D. Jin, and Y. Li, “Dynamic graph convolutional recurrent network for traffic prediction: Benchmark and solution,” ACM Transactions on Knowledge Discovery from Data, vol. 17, no. 1, pp. 1–21, 2023.
  • [11] Y. Zhang, G. Song, L. Du, S. Yang, and Y. Jin, “Dane: Domain adaptive network embedding,” arXiv preprint arXiv:1906.00684, 2019.
  • [12] J. Skarding, B. Gabrys, and K. Musial, “Foundations and modeling of dynamic networks using dynamic graph neural networks: A survey,” IEEE Access, vol. 9, pp. 79 143–79 168, 2021.
  • [13] A. Sankar, Y. Wu, L. Gou, W. Zhang, and H. Yang, “Dysat: Deep neural representation learning on dynamic graphs via self-attention networks,” in Proceedings of the 13th international conference on web search and data mining, 2020, pp. 519–527.
  • [14] C. D. Barros, M. R. Mendonça, A. B. Vieira, and A. Ziviani, “A survey on embedding dynamic graphs,” ACM Computing Surveys (CSUR), vol. 55, no. 1, pp. 1–37, 2021.
  • [15] F. Manessi, A. Rozza, and M. Manzo, “Dynamic graph convolutional networks,” Pattern Recognition, vol. 97, p. 107000, 2020.
  • [16] A. Pareja, G. Domeniconi, J. Chen, T. Ma, T. Suzumura, H. Kanezashi, T. Kaler, T. Schardl, and C. Leiserson, “Evolvegcn: Evolving graph convolutional networks for dynamic graphs,” in Proceedings of the AAAI conference on artificial intelligence, vol. 34, no. 04, 2020, pp. 5363–5370.
  • [17] J. Chen, X. Wang, and X. Xu, “Gc-lstm: Graph convolution embedded lstm for dynamic network link prediction,” Applied Intelligence, pp. 1–16, 2022.
  • [18] Y. Yao and C. Joe-Wong, “Interpretable clustering on dynamic graphs with recurrent graph neural networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 5, 2021, pp. 4608–4616.
  • [19] Y. Fan, Y. Yao, and C. Joe-Wong, “Gcn-se: Attention as explainability for node classification in dynamic graphs,” in 2021 IEEE International Conference on Data Mining (ICDM).   IEEE, 2021, pp. 1060–1065.
  • [20] D. Zhou, J. Huang, and B. Schölkopf, “Learning with hypergraphs: Clustering, classification, and embedding,” Advances in neural information processing systems, vol. 19, 2006.
  • [21] Y. Gao, Y. Feng, S. Ji, and R. Ji, “Hgnn+: General hypergraph neural networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp. 3181–3199, 2022.
  • [22] Y. Gao, Z. Zhang, H. Lin, X. Zhao, S. Du, and C. Zou, “Hypergraph learning: Methods and practices,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 5, pp. 2548–2566, 2020.
  • [23] Y. Yan, J. Qin, J. Chen, L. Liu, F. Zhu, Y. Tai, and L. Shao, “Learning multi-granular hypergraphs for video-based person re-identification,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 2899–2908.
  • [24] Y. Feng, H. You, Z. Zhang, R. Ji, and Y. Gao, “Hypergraph neural networks,” in Proceedings of the AAAI conference on artificial intelligence, vol. 33, no. 01, 2019, pp. 3558–3565.
  • [25] J. You, T. Du, and J. Leskovec, “Roland: graph learning framework for dynamic graphs,” in Proceedings of the 28th ACM SIGKDD conference on knowledge discovery and data mining, 2022, pp. 2358–2366.
  • [26] “Wikimedia edit history dump,” 2021, https://meta.wikimedia.org/wiki/Data_dumps.
  • [27] F. M. Harper and J. A. Konstan, “The movielens datasets: History and context,” ACM Transactions on Interactive Intelligent Systems (TiiS), vol. 5, no. 4, pp. 1–19, 2015.