
Hypergraph-Based Dynamic Graph Node Classification

Xiaoxu Ma1, Chen Zhao2, Minglai Shao1*, Yujie Lin1 (*Corresponding author)
1School of New Media and Communication, Tianjin University, Tianjin, China
2Department of Computer Science, Baylor University, Waco, Texas, USA
{maxiaoxu, shaoml, linyujie_22}@tju.edu.cn, [email protected]
Abstract

Node classification on static graphs has achieved significant success, but accurate node classification on dynamic graphs, where node topology, attributes, and labels change over time, has not been well addressed. Existing methods based on RNNs and self-attention only aggregate features of the same node across different time slices, which cannot adequately capture the diverse dynamic changes in dynamic graphs. Therefore, we propose a novel model named Hypergraph-Based Multi-granularity Dynamic Graph Node Classification (HYDG). After obtaining basic node representations for each slice through a GNN backbone, HYDG models the representations of each node in the dynamic graph through two modules: an individual-level hypergraph that captures spatio-temporal representations between individual nodes, and a group-level hypergraph that captures multi-granularity group temporal representations among nodes of the same class. Each hyperedge connects multiple nodes within a specific time range and thereby captures temporal dependencies of varying lengths. More accurate representations are then obtained through weighted information propagation and aggregation in a hypergraph neural network. Extensive experiments on five real-world dynamic graph datasets with two GNN backbones demonstrate the superiority of the proposed framework.

Index Terms:
Dynamic Graph, Hypergraph, Node Classification

I Introduction

As a widely employed data structure, graphs excel at representing intricate data and possess robust capabilities for storing and articulating node attributes and entity relationships. The current landscape of graph neural networks [1, 2, 3, 4, 5, 6, 7] predominantly operates on static graph structures [8]. However, many real-world scenarios involve dynamic graph structures, such as recommender systems, social media analytics [9], transportation systems [10], citation networks [11], and epidemic modeling. Dynamic graphs can be broadly classified into discrete-time and continuous-time dynamic graphs [12]. In both forms, node features, node labels, and graph structures may evolve over time [13], posing a challenge for traditional static graph models in capturing the dynamic relationships among node features. Consequently, the effective capture and utilization of node relationships and features in dynamic networks have emerged as pivotal research questions.

Figure 1: An illustrative example of interest prediction in dynamic social graphs: predicting users' interest tags at a specific time based on historical data. In this network, both user connections and interest tags evolve over time, and the goal is to forecast users' interest tags using information from the interest network.

In this paper, we focus on the task of node classification in discrete dynamic graphs: predicting the node labels in unknown future graph snapshots given the graph information at previous time steps. As illustrated in Fig. 1, in social interest networks a user's social connections and personal information may change significantly over time, leading to shifts in their interest labels. Current methods primarily fall into two categories [12, 14]: one uses RNNs [15, 16, 17, 18] to capture temporal relationships between nodes, while the other employs attention mechanisms [19, 13] to capture dynamic relationships between nodes. However, in diverse real-world dynamic networks, simply using RNNs or self-attention to link the representations of the same node across different time slices may fail to adequately capture the features of various node categories. In scenarios involving changes in node features, label shifts, node disappearance, or the addition of new nodes, traditional dynamic graph learning methods struggle to perform inductive learning effectively, and thus fail to cope with the complex node dynamics within dynamic networks.

To address these challenges, we propose the Hypergraph-Based Multi-granularity Dynamic Graph Node Classification (HYDG) algorithm, which captures both individual- and group-level node features using hypergraphs [20, 21, 22, 23]. By connecting multiple nodes within specific time intervals through hyperedges, HYDG constructs multi-granularity features, enabling more accurate classification via hypergraph neural networks. Compared to traditional methods, HYDG more effectively models high-order dependencies and enhances temporal information capture at both the individual and group levels through weighted information propagation. Our contributions include:

  • We introduce a novel hypergraph-based algorithm for dynamic graph node classification that effectively captures diverse and robust spatio-temporal dependencies through hypergraph construction, weighted node propagation, and feature aggregation.

  • We develop a multi-scale hypergraph construction approach that concurrently captures both individual- and group-level node features, enhancing the diversity of node representations.

  • Our model, built upon two GNN backbones, demonstrates superior performance on five real-world datasets, consistently outperforming baseline models.

II Methodology

The primary focus of this paper is node classification on discrete dynamic graphs. Given the dynamic graph snapshots $\mathbf{G}_{P}=\{\mathbf{G}^{1},\mathbf{G}^{2},\ldots,\mathbf{G}^{t}\}$ up to time step $t$ and the corresponding node labels $\mathbf{Y}_{P}=\{\mathbf{Y}^{1},\mathbf{Y}^{2},\ldots,\mathbf{Y}^{t}\}$, our goal is to classify all nodes in future graph snapshots $\mathbf{G}_{F}=\{\mathbf{G}^{t+1},\mathbf{G}^{t+2},\ldots,\mathbf{G}^{T}\}$, whose labels $\mathbf{Y}_{F}=\{\mathbf{Y}^{t+1},\mathbf{Y}^{t+2},\ldots,\mathbf{Y}^{T}\}$ are unknown.
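For concreteness, the following minimal sketch (our illustration, not the authors' released code; `Snapshot` and `split_snapshots` are hypothetical names) shows one way to represent discrete dynamic graph snapshots for this task:

```python
# A minimal sketch of discrete dynamic-graph snapshots G^1..G^T.
from dataclasses import dataclass
import torch

@dataclass
class Snapshot:
    adj: torch.Tensor     # A^t: (n x n) adjacency matrix of G^t
    feat: torch.Tensor    # X^t: (n x d) node feature matrix
    labels: torch.Tensor  # Y^t: (n,) node labels (unknown for future slices)

def split_snapshots(snapshots, t):
    """G_P: snapshots with observed labels; G_F: future snapshots to classify."""
    return snapshots[:t], snapshots[t:]
```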

Figure 2: The detailed architecture of HYDG consists of three components: graph embedding (blue and orange represent different labels), hypergraph construction (for better visualization, we only illustrate the binary clustering approach), and hypergraph propagation.

II-A Dynamic Graph Feature Extraction

Static graph neural networks have been widely applied in graph representation learning tasks. We therefore first use a backbone GNN to extract features for each node of every temporal slice, as shown in Fig. 2(a), yielding a feature representation for each node at each time step. For the $t$-th temporal slice $\mathbf{G}^{t}$ of the dynamic graph, the feature representations are obtained as

$\mathbf{Z}^{t}=\text{GNN}(\mathbf{A}^{t},\mathbf{X}^{t}), \quad (1)$

where $\mathbf{Z}^{t}=[\mathbf{z}_{1}^{t},\mathbf{z}_{2}^{t},\ldots,\mathbf{z}_{n}^{t}]$ is the matrix of feature embeddings for the nodes in the graph snapshot $\mathbf{G}^{t}$ produced by the GNN, $n$ is the number of nodes, $\mathbf{A}^{t}$ denotes the adjacency matrix of $\mathbf{G}^{t}$, and $\mathbf{X}^{t}$ is the initial node feature matrix of $\mathbf{G}^{t}$.
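As a hedged illustration of Eq. (1), the sketch below applies a shared dense two-layer GCN to each snapshot. `DenseGCN` and its dimensions are our own assumptions for self-containment; the paper uses GCN and GraphSAGE backbones.

```python
# Sketch of Eq. (1): a shared GNN backbone applied per snapshot.
import torch
import torch.nn as nn

class DenseGCN(nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, out_dim)

    @staticmethod
    def normalize(adj):
        # \hat{A} = D^{-1/2} (A + I) D^{-1/2}
        a = adj + torch.eye(adj.size(0))
        d = a.sum(1).pow(-0.5)
        return d.unsqueeze(1) * a * d.unsqueeze(0)

    def forward(self, adj, x):
        a = self.normalize(adj)
        h = torch.relu(self.w1(a @ x))
        return a @ self.w2(h)  # Z^t

# dims are placeholders:
gnn = DenseGCN(in_dim=100, hid_dim=64, out_dim=32)
# per-slice embeddings: Z = [gnn(s.adj, s.feat) for s in snapshots]
```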

II-B Hypergraph Construction

In dynamic graphs, nodes possess both individual features and common features shared within their subgroups. To better capture the higher-order correlations of nodes within each time slice, we construct multi-scale hypergraphs by capturing individual-level and group-level dependencies, as shown in Fig. 2(b).

Individual-level hypergraph construction.  We capture the individual high-order dependencies of each node in the dynamic graph by constructing a series of individual hyperedges, denoted as $\mathcal{G}_{in}=(\mathcal{V}_{in},\mathcal{E}_{in})$, where $\mathcal{V}_{in}$ is the set of individual nodes and $\mathcal{E}_{in}$ the set of hyperedges. Specifically, for a given node $v_{it}$, we identify the $K$ nodes from other time slices within a specified time range whose feature vectors obtained from (1) are closest to it, excluding the current time slice. These $K+1$ nodes are then connected via a hyperedge $\mathbf{e}_{it}$, as shown in (2).

$\mathbf{e}_{it}=\{v_{it},\ \forall v_{jt^{\prime}}\in\mathcal{N}_{K}(v_{it})\},\ \text{s.t.}\ |t-t^{\prime}|\leq\tau, \quad (2)$

where $\mathcal{N}_{K}$ denotes the set of $K$ nearest neighbors, which can be computed with various distance metrics such as Euclidean, Chebyshev, or cosine distance, $|t-t^{\prime}|$ denotes the temporal distance between nodes, and $\tau$ is the threshold of the time range. We model multi-scale temporal dependencies in the dynamic graph by setting three thresholds, corresponding to short-term, mid-term, and long-term connections. We construct the individual hypergraph $\mathcal{G}_{in}$ from the $v_{it}$ and $\mathbf{e}_{it}$ obtained for all nodes in each time slice.
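A possible implementation of this construction, assuming cosine similarity as the metric and a single window $\tau$ (the helper `individual_hyperedges` is our own name, not the paper's), might look like:

```python
# Sketch of Eq. (2): K nearest neighbours from other slices within |t-t'| <= tau.
import torch
import torch.nn.functional as F

def individual_hyperedges(Z, K=5, tau=2):
    """Z: list of (n x d) per-slice embedding matrices from Eq. (1)."""
    T, n = len(Z), Z[0].size(0)
    edges = []
    for t in range(T):
        # candidates: nodes from other slices within the time window tau
        cand = [(tp, j)
                for tp in range(max(0, t - tau), min(T, t + tau + 1))
                if tp != t
                for j in range(n)]
        if len(cand) < K:
            continue
        C = F.normalize(torch.stack([Z[tp][j] for tp, j in cand]), dim=1)
        Q = F.normalize(Z[t], dim=1)
        top = (Q @ C.T).topk(K, dim=1).indices   # K most similar per node
        for i in range(n):
            edges.append([(t, i)] + [cand[k] for k in top[i].tolist()])
    return edges   # each hyperedge: K+1 (time, node) pairs, cf. Eq. (2)

# Running this with three tau values (short/mid/long) gives the multi-scale
# hyperedge sets described above.
```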

Group-level hypergraph construction. To better capture the multi-granularity group features of each category in dynamic graphs, we construct a series of group-level hypergraphs $\mathcal{G}_{group}=(\mathcal{V}_{group},\mathcal{E}_{group})$, comprising group nodes $\mathcal{V}_{group}$ and group hyperedge sets $\mathcal{E}_{group}$. Concretely, we first group the feature vectors $\mathbf{Z}^{t}$ according to the true node labels $\mathbf{Y}^{t}$, and then hierarchically cluster and aggregate these groups within each time slice. This yields spatio-temporal group representations, which are then used to construct hypergraphs for each node category, capturing both the features and their spatio-temporal dependencies in the dynamic network. Mathematically,

$\mathbf{Z}^{c,t}_{group}=\{Agg\left(Cluster_{M}(\{\mathbf{z}_{v}^{t}\mid y_{v}^{t}=c\})\right)\}, \quad (3)$

where $y_{v}^{t}$ denotes the label of node $v$ at time $t$. By grouping node embeddings with the same label, we obtain $\mathbf{O}^{c,t}=\{\mathbf{z}_{v}^{t}\mid y_{v}^{t}=c\}$, where $C$ is the number of node categories. The function $Cluster_{M}(\cdot)$ clusters the grouped embeddings: applying $Cluster_{M}(\mathbf{O}^{c,t})$ yields the clusters $\{\mathbf{O}^{c,t}_{1},\mathbf{O}^{c,t}_{2},\ldots,\mathbf{O}^{c,t}_{M}\}$, where $M$ denotes the number of clusters. The aggregation function $Agg(\cdot)$, which may be an operation such as $\max(\cdot)$, $\min(\cdot)$, or $avg(\cdot)$, aggregates the vectors within each cluster. Finally, for each class we obtain $\mathbf{Z}^{c,T}_{group}\in\mathbb{R}^{M\times T}$. This process captures the diverse temporal characteristics of nodes within the same category across different clusters at various time steps.

We merge the per-slice representations $\mathbf{Z}^{c,t}_{group}$ from (3) to obtain the grouped $\mathbf{Z}^{c,T}_{group}$. Following the individual-level construction in (2), we build hyperedges over $\mathbf{Z}^{c,T}_{group}$ for each class to capture the spatio-temporal evolution of category-level group characteristics, yielding $\mathcal{V}_{group}^{c}$ and $\mathcal{E}_{group}^{c}$. Hence, we obtain $\mathcal{G}_{group}=\{\mathcal{G}_{group}^{c}\}_{c\in C}$.
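Assuming the hierarchical clustering is, for instance, scikit-learn's agglomerative clustering and $Agg$ is the average (both our assumptions; `group_nodes` is a hypothetical helper), Eq. (3) could be sketched as:

```python
# Sketch of Eq. (3): label-wise grouping, clustering, and aggregation.
import torch
from sklearn.cluster import AgglomerativeClustering

def group_nodes(Z, Y, num_classes, M=3):
    """Z: list of (n x d) embedding matrices; Y: list of (n,) label tensors."""
    out = {c: [] for c in range(num_classes)}
    for t, (z, y) in enumerate(zip(Z, Y)):
        for c in range(num_classes):
            O = z[y == c]                    # O^{c,t}: class-c embeddings
            if O.size(0) < M:                # too few nodes to cluster
                continue
            labels = AgglomerativeClustering(n_clusters=M).fit_predict(
                O.detach().numpy())
            for m in range(M):
                members = O[torch.as_tensor(labels) == m]
                out[c].append((t, members.mean(0)))   # Agg = avg
    # group nodes per class; later linked by Eq. (2)-style hyperedges
    return out
```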

II-C Hypergraph Propagation

After constructing the individual-level hypergraph $\mathcal{G}_{in}$ and the group-level hypergraph $\mathcal{G}_{group}$, we employ Hypergraph Neural Networks (HGNN) [24] for node-weighted information propagation and feature updating to obtain spatio-temporal fusion features within hyperedges, as illustrated in Fig. 2(c).

For a given hypergraph $\mathcal{G}$, the incidence matrix $\mathbf{H}$ is defined on the node set $\mathcal{V}$ and hyperedge set $\mathcal{E}$ as:

$H(v,e)=1 \text{ if } v\in e, \text{ else } 0, \quad \forall v\in\mathcal{V},\ \forall e\in\mathcal{E}. \quad (4)$

Based on $\mathcal{G}_{in}$ and $\mathcal{G}_{group}$, we derive the individual- and group-level incidence matrices $\mathbf{H}_{in}$ and $\mathbf{H}_{group}$ via (4). Following the approach of frequency-domain GCNs [1], we use the Fourier transform to shift spatial signals to the frequency domain for convolution and then apply the inverse Fourier transform [24], formalized as:

$\mathbf{Z}^{(l+1)}=\sigma\left(\mathbf{D}_{v}^{-\frac{1}{2}}\mathbf{H}\mathbf{W}\mathbf{D}_{e}^{-1}\mathbf{H}^{T}\mathbf{D}_{v}^{-\frac{1}{2}}\mathbf{Z}^{(l)}\Theta^{(l)}\right), \quad (5)$

where $\mathbf{Z}$ is the feature matrix, $\mathbf{H}$ the incidence matrix, $\mathbf{D}_{v}$ and $\mathbf{D}_{e}$ the diagonal node-degree and hyperedge-degree matrices, $\Theta$ the learnable hypergraph convolution kernel, $\mathbf{W}$ the learnable hyperedge weight matrix, and $\sigma$ a non-linear activation function.
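Eqs. (4)-(5) could be realized as in the following sketch; `incidence` and `HGNNLayer` are illustrative names, and the dense layer follows the HGNN formulation of [24]:

```python
# Sketch of Eqs. (4)-(5): incidence matrix and one dense HGNN layer.
import torch
import torch.nn as nn

def incidence(edges, num_nodes):
    H = torch.zeros(num_nodes, len(edges))
    for e, nodes in enumerate(edges):   # nodes: flat indices of v in edge e
        H[nodes, e] = 1.0               # H(v, e) = 1 if v in e
    return H

class HGNNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim)   # Theta^{(l)}

    def forward(self, H, Z, W=None):
        if W is None:
            W = torch.ones(H.size(1))             # hyperedge weights
        Dv = (H * W).sum(1).clamp(min=1).pow(-0.5)  # D_v^{-1/2} (diagonal)
        De = H.sum(0).clamp(min=1).pow(-1.0)        # D_e^{-1} (diagonal)
        # D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}, via row/column scaling
        G = (Dv.unsqueeze(1) * H * W * De) @ H.T * Dv.unsqueeze(0)
        return torch.relu(G @ self.theta(Z))        # Eq. (5)
```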

More specifically, for a given node $v_{it}$, $H(v_{it})$ denotes the set of hyperedges connected to $v_{it}$, which contain the nodes most relevant to $v_{it}$. Within each such hyperedge, we aggregate the features of the nodes from the other time slices (all nodes except $v_{it}$ itself) using the corresponding weights $\mathbf{w}_{it}^{jt^{\prime}}$ to capture the temporal dependencies within the same hyperedge:

$\mathbf{w}_{it}^{jt^{\prime}}=\exp\left(-\frac{d(\mathbf{z}_{it},\mathbf{z}_{jt^{\prime}})^{2}}{\sigma^{2}}\right), \quad \forall v_{jt^{\prime}}\in\mathbf{e}_{it}, \quad (6)$
$\mathbf{p}^{k,l}_{it}=\sum_{j\neq i,\ v_{jt^{\prime}}\in\mathbf{e}_{k}}\mathbf{z}^{l-1}_{jt^{\prime}}\mathbf{w}_{it}^{jt^{\prime}}, \quad \forall\mathbf{e}_{k}\in H(v_{it}), \quad (7)$

where $v_{it}$ and $v_{jt^{\prime}}$ are a pair of nodes from different time slices within a hyperedge, and $d(\mathbf{z}_{it},\mathbf{z}_{jt^{\prime}})$ denotes the distance between them, computed with any suitable metric such as Euclidean or cosine distance. The bandwidth $\sigma$ can be set to the median distance between all pairs of vertices, and $\mathbf{z}^{l-1}_{jt^{\prime}}$ is the output of node $v_{jt^{\prime}}$ at layer $l-1$ of the HGNN.
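A small sketch of Eqs. (6)-(7), assuming squared Euclidean distance for $d(\cdot,\cdot)$ (`hyperedge_message` is our own helper name):

```python
# Sketch of Eqs. (6)-(7): Gaussian-kernel weights, then a weighted sum of
# the other nodes' features inside one hyperedge.
import torch

def hyperedge_message(z_center, z_others, sigma):
    """z_center: (d,) embedding of v_it; z_others: (K x d) neighbours in e_k."""
    d2 = ((z_others - z_center) ** 2).sum(1)    # squared Euclidean distance
    w = torch.exp(-d2 / sigma ** 2)             # w_it^{jt'}, Eq. (6)
    return (w.unsqueeze(1) * z_others).sum(0)   # p_it^{k,l}, Eq. (7)

# sigma as the median pairwise distance over embeddings Z (n x d), as in the text:
# sigma = torch.median(torch.cdist(Z, Z))
```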

After obtaining the different hyperedge features for the dynamic graph node $v_{it}$, we compute a weight for each hyperedge by measuring the similarity between its hyperedge features and the node features, and normalize the weights using the softmax function. Then, we perform edge-node level spatio-temporal dependency aggregation:

$\mathbf{z}^{l}_{it}=\sum_{k}\frac{\exp(\mathrm{sim}(\mathbf{z}^{k,l-1}_{it},\mathbf{p}^{k,l}_{it}))}{\sum_{k^{\prime}}\exp(\mathrm{sim}(\mathbf{z}^{k^{\prime},l-1}_{it},\mathbf{p}^{k^{\prime},l}_{it}))}\mathbf{p}^{k,l}_{it}. \quad (8)$

The node representations $\mathbf{z}^{l}_{it}$ obtained from the above process are passed through a non-linear function, yielding the integrated representation of node dependencies across time and space.
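Eq. (8) can be read as softmax attention of a node over its hyperedge messages; below is a sketch assuming cosine similarity for $\mathrm{sim}(\cdot,\cdot)$ (`aggregate_hyperedges` is a hypothetical name):

```python
# Sketch of Eq. (8): softmax-weighted aggregation over hyperedge messages.
import torch

def aggregate_hyperedges(z_node, messages):
    """z_node: (d,) embedding of v_it; messages: (num_edges x d) stacked p_it^{k,l}."""
    sim = torch.nn.functional.cosine_similarity(
        z_node.unsqueeze(0), messages, dim=1)    # sim(z, p) per hyperedge
    attn = torch.softmax(sim, dim=0)             # normalised edge weights
    return (attn.unsqueeze(1) * messages).sum(0) # z_it^l
```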

II-D Model Learning

After the hypergraph information propagation, we obtain the individual and group hypergraph representations, $\mathbf{Z}_{in}$ and $\mathbf{Z}_{group}$. By applying the softmax function for each time slice, we compute the predicted labels $\hat{\mathbf{Y}}_{in}$ and $\hat{\mathbf{Y}}_{group}$ for the nodes in $\mathcal{G}_{in}$ and $\mathcal{G}_{group}$, respectively. The true label of every group node in $\mathbf{Z}_{group}^{c}$ is the class $c$ it was constructed from. The losses are as follows:

$\mathcal{L}_{in}(\mathbf{Y},\hat{\mathbf{Y}})=-\sum_{i=1}^{N}y_{i}\log\left(\frac{e^{z_{in}^{i}}}{\sum_{j=1}^{N}e^{z_{in}^{j}}}\right), \quad (9)$
$\mathcal{L}_{group}(\mathbf{Y},\hat{\mathbf{Y}})=-\sum_{c=1}^{C}\sum_{i=1}^{M}\log\left(\frac{e^{z_{group}^{ic}}}{\sum_{j=1}^{C}e^{z_{group}^{ij}}}\right). \quad (10)$

For $\mathcal{L}_{in}$ and $\mathcal{L}_{group}$, we assign different weights, denoted as $\alpha$ and $\beta$ respectively, resulting in the overall loss function $\mathcal{L}_{all}$, which consists of the following two terms:

$\mathcal{L}_{all}=\alpha\cdot\mathcal{L}_{in}+\beta\cdot\mathcal{L}_{group}. \quad (11)$

During training, we construct both individual and group-level hypergraphs to capture diverse node representations using hypergraph neural networks. During testing, without labels, we use only individual-level hypergraphs and the trained network for predictions. The group-level hypergraphs, used for data augmentation, capture evolving relationships within the same class, addressing variations in unseen test sets.
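The objective in Eqs. (9)-(11) amounts to weighted cross-entropy terms over the two hypergraphs; a minimal sketch follows (the $\alpha$, $\beta$ defaults are placeholders, not the paper's settings):

```python
# Sketch of the training objective, Eqs. (9)-(11), read as cross-entropy.
import torch.nn.functional as F

def hydg_loss(logits_in, y_in, logits_group, y_group, alpha=1.0, beta=0.5):
    """logits_*: (num_nodes x C) class scores; y_*: (num_nodes,) integer labels.
    Group nodes inherit the label c of the class they were built from."""
    l_in = F.cross_entropy(logits_in, y_in)            # Eq. (9)
    l_group = F.cross_entropy(logits_group, y_group)   # Eq. (10)
    return alpha * l_in + beta * l_group               # Eq. (11)
```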

III Experiments

TABLE I: Quantitative results on the dynamic node classification task. The results are averages over five runs, with the best result in each column highlighted in bold (\textbf) and the second-best result underlined (\underline).
| Backbone | Method | DBLP5 Acc | DBLP5 AUC | DBLP3 Acc | DBLP3 AUC | Reddit Acc | Reddit AUC | Wiki Acc | Wiki AUC | ML-RATING Acc | ML-RATING AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| GCN | GCN [1] | 63.01±0.62 | 0.65±0.02 | 74.23±0.41 | 0.55±0.02 | 31.02±0.33 | \underline{0.52±0.01} | 79.22±0.81 | 0.65±0.01 | 85.53±0.22 | 0.67±0.01 |
| | GCNLSTM [17] | 61.83±1.84 | 0.65±0.01 | 73.37±0.92 | 0.56±0.01 | 30.26±0.42 | 0.50±0.01 | 79.43±1.22 | \textbf{0.68±0.01} | 84.77±0.84 | \underline{0.68±0.02} |
| | RNNGCN [18] | 65.92±0.43 | \textbf{0.70±0.01} | 76.55±0.41 | 0.56±0.01 | \underline{32.68±0.28} | 0.51±0.01 | 78.74±1.18 | 0.61±0.01 | \underline{85.75±0.52} | 0.66±0.01 |
| | GCNSE [19] | 66.58±0.31 | 0.64±0.02 | \underline{76.88±0.52} | 0.53±0.01 | 31.09±0.47 | 0.51±0.01 | 79.49±0.62 | 0.54±0.01 | 85.63±0.34 | 0.57±0.01 |
| | ROLAND [25] | \underline{66.63±0.38} | 0.67±0.00 | 76.15±0.42 | 0.56±0.01 | 32.31±0.39 | \underline{0.52±0.01} | \underline{79.63±0.34} | 0.59±0.04 | 85.69±0.28 | \textbf{0.70±0.01} |
| | EvolveGCN [16] | 63.21±1.24 | 0.57±0.02 | 75.18±0.62 | 0.54±0.02 | 26.32±0.65 | 0.50±0.01 | 76.25±1.01 | 0.57±0.03 | 81.46±0.36 | 0.56±0.06 |
| | DySAT [13] | 65.75±0.81 | 0.62±0.02 | 75.36±0.44 | \underline{0.57±0.01} | 29.39±0.73 | 0.51±0.01 | 77.36±0.92 | 0.60±0.01 | 85.19±0.61 | 0.65±0.02 |
| | HYDG-GCN (ours) | \textbf{68.26±0.20} | \underline{0.69±0.00} | \textbf{77.12±0.13} | \textbf{0.58±0.01} | \textbf{34.12±0.42} | \textbf{0.53±0.01} | \textbf{80.67±0.26} | 0.67±0.02 | \textbf{86.96±0.17} | \textbf{0.70±0.01} |
| GraphSAGE | GraphSAGE [8] | 67.83±0.69 | 0.75±0.01 | 74.06±0.31 | 0.61±0.00 | 31.85±0.85 | 0.51±0.00 | 77.83±1.32 | 0.63±0.05 | 84.15±1.45 | 0.68±0.02 |
| | GCNLSTM [17] | 67.32±1.25 | 0.64±0.01 | 75.75±0.58 | 0.58±0.01 | 32.03±0.53 | \textbf{0.54±0.00} | \underline{78.36±1.19} | 0.62±0.03 | 83.76±2.24 | 0.69±0.02 |
| | RNNGCN [18] | 65.31±0.83 | 0.74±0.01 | 76.66±0.72 | 0.56±0.01 | 31.53±0.98 | 0.52±0.01 | 77.30±1.04 | \textbf{0.67±0.01} | \underline{85.65±1.17} | 0.69±0.02 |
| | GCNSE [19] | 67.36±0.97 | 0.52±0.02 | \textbf{76.94±0.57} | 0.53±0.01 | 31.02±0.69 | 0.50±0.01 | 78.06±1.16 | 0.60±0.01 | 85.29±1.32 | 0.62±0.03 |
| | ROLAND [25] | 68.42±0.52 | 0.74±0.01 | 75.68±0.83 | 0.60±0.01 | 32.61±0.43 | 0.53±0.00 | 78.22±0.59 | 0.64±0.00 | 85.23±0.76 | 0.70±0.01 |
| | EvolveGCN [16] | 63.21±1.24 | 0.57±0.02 | 75.18±0.62 | 0.54±0.02 | 26.32±0.65 | 0.50±0.01 | 76.25±1.01 | 0.57±0.03 | 82.46±0.36 | 0.57±0.06 |
| | DySAT [13] | 65.75±0.81 | 0.65±0.02 | 75.36±0.44 | 0.57±0.01 | 29.39±0.73 | 0.51±0.01 | 77.36±0.92 | 0.60±0.01 | 85.19±0.61 | 0.67±0.01 |
| | HYDG-SAGE (ours) | \textbf{70.40±0.32} | \textbf{0.77±0.00} | 76.51±0.13 | \textbf{0.63±0.01} | \textbf{33.98±0.24} | \textbf{0.54±0.00} | \textbf{80.53±0.47} | 0.66±0.01 | \textbf{86.95±0.35} | \textbf{0.72±0.01} |

III-A Experimental Settings

We conducted experiments on five real-world dynamic graph datasets, with detailed statistics provided in Table II. We selected seven baselines for comparison, including both static [1, 8] and dynamic graph neural networks [13, 18, 19, 16, 25]. The primary task is to predict the node labels in the test set at various time slices using the node features and label information from a limited number of time slices in the training set. To better evaluate model performance, we used accuracy and Macro-AUC as evaluation metrics. Each method employs a two-layer graph neural network, using both GCN and GraphSAGE as GNN layers.

TABLE II: Real-world datasets used to evaluate our method.

| Dataset | Nodes | Edges | Time Steps | Classes | Attributes |
|---|---|---|---|---|---|
| DBLP3 [18] | 4257 | 23540 | 10 | 3 | 100 |
| DBLP5 [18] | 6606 | 42815 | 10 | 5 | 100 |
| Reddit [19] | 8291 | 264050 | 10 | 4 | 20 |
| Wiki [26] | 9227 | 31910 | 11 | 4 | - |
| ML-Rating [27] | 9746 | 2000438 | 11 | 7 | - |

III-B Overall Performance

As shown in Table I, we conduct experiments using both GCN and GraphSAGE backbones, with EvolveGCN and DySAT sharing experimental results across both sets. The results demonstrate that HYDG consistently outperforms the baseline models in node classification across the five datasets. By constructing individual hypergraphs and multi-scale group-level hypergraphs while capturing spatio-temporal dependencies, HYDG achieves the best accuracy on the DBLP5, Reddit, Wiki, and ML-RATING datasets with both backbones, and records the highest AUC scores on the DBLP3, Reddit, and ML-RATING datasets. Notably, HYDG improves accuracy by 2%-3% on the DBLP5 and Reddit datasets over the best-performing baselines, despite slightly trailing RNNGCN in AUC on DBLP5. These results suggest that HYDG's ability to capture spatio-temporal dependencies enables better learning of node representations and relationships in dynamic graphs, outperforming traditional methods based on self-attention mechanisms and RNNs.

III-C Ablation Study

To better explore the roles of individual-level hypergraphs and group-level hypergraphs in capturing spatio-temporal dependencies in dynamic graphs, we conduct ablation experiments on these two modules separately, and the results are shown in Fig. 3. (i) Removal of individual-level hypergraphs. Utilizing only group-level hypergraphs yields poor accuracy and AUC results. This is because group-level hypergraphs capture features only within each slice, resulting in insufficient usable nodes and inadequate capture of features and dependencies across samples. (ii) Removal of group-level hypergraphs. Utilizing only individual-level hypergraphs achieves relatively high accuracy but results in lower AUC. This is because the individual hypergraph construction, which links hyperedges by identifying each node’s nearest neighbors, may capture only single dynamic dependency patterns. Additionally, connections between nodes with different labels can lead to incomplete dynamic feature representations.

Figure 3: Ablation study results of HYDG on node classification tasks across five datasets.

IV Conclusion

This paper presents a dynamic graph node classification framework based on multi-granularity hypergraphs. By constructing individual-level hypergraphs across various time ranges and multi-granularity group-level hypergraphs, the framework effectively captures higher-order spatio-temporal dependencies among dynamic graph nodes. Hypergraph neural networks are employed for weighted information propagation, leading to more accurate and robust node representations. Extensive experiments on five real-world datasets using two backbones demonstrate that the proposed framework outperforms the best baseline models.

References

  • [1] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” arXiv preprint arXiv:1609.02907, 2016.
  • [2] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, “Graph attention networks,” arXiv preprint arXiv:1710.10903, 2017.
  • [3] Q. Tian, C. Zhao, M. Shao, W. Wang, Y. Lin, and D. Li, “Mldgg: Meta-learning for domain generalization on graphs,” arXiv preprint arXiv:2411.12913, 2024.
  • [4] H. Wang, C. Zhao, and F. Chen, “Madod: Generalizing ood detection to unseen domains via g-invariance meta-learning,” arXiv preprint arXiv:2411.02444, 2024.
  • [5] Z. He, C. Zhao, M. Shao, Y. Lin, D. Li, and Q. Tian, “Gdda: Semantic ood detection on graphs under covariate shift via score-based diffusion models,” arXiv preprint arXiv:2410.17526, 2024.
  • [6] M. Shao, D. Li, C. Zhao, X. Wu, Y. Lin, and Q. Tian, “Supervised algorithmic fairness in distribution shifts: A survey,” arXiv preprint arXiv:2402.01327, 2024.
  • [7] Q. Tian, W. Wang, C. Zhao, M. Shao, W. Zhang, and D. Li, “Graphs generalization under distribution shifts,” arXiv preprint arXiv:2403.16334, 2024.
  • [8] W. Hamilton, Z. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” Advances in neural information processing systems, vol. 30, 2017.
  • [9] Y. Liu, X. Shi, L. Pierce, and X. Ren, “Characterizing and forecasting user engagement with in-app action graph: A case study of snapchat,” in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, 2019, pp. 2023–2031.
  • [10] F. Li, J. Feng, H. Yan, G. Jin, F. Yang, F. Sun, D. Jin, and Y. Li, “Dynamic graph convolutional recurrent network for traffic prediction: Benchmark and solution,” ACM Transactions on Knowledge Discovery from Data, vol. 17, no. 1, pp. 1–21, 2023.
  • [11] Y. Zhang, G. Song, L. Du, S. Yang, and Y. Jin, “Dane: Domain adaptive network embedding,” arXiv preprint arXiv:1906.00684, 2019.
  • [12] J. Skarding, B. Gabrys, and K. Musial, “Foundations and modeling of dynamic networks using dynamic graph neural networks: A survey,” IEEE Access, vol. 9, pp. 79 143–79 168, 2021.
  • [13] A. Sankar, Y. Wu, L. Gou, W. Zhang, and H. Yang, “Dysat: Deep neural representation learning on dynamic graphs via self-attention networks,” in Proceedings of the 13th international conference on web search and data mining, 2020, pp. 519–527.
  • [14] C. D. Barros, M. R. Mendonça, A. B. Vieira, and A. Ziviani, “A survey on embedding dynamic graphs,” ACM Computing Surveys (CSUR), vol. 55, no. 1, pp. 1–37, 2021.
  • [15] F. Manessi, A. Rozza, and M. Manzo, “Dynamic graph convolutional networks,” Pattern Recognition, vol. 97, p. 107000, 2020.
  • [16] A. Pareja, G. Domeniconi, J. Chen, T. Ma, T. Suzumura, H. Kanezashi, T. Kaler, T. Schardl, and C. Leiserson, “Evolvegcn: Evolving graph convolutional networks for dynamic graphs,” in Proceedings of the AAAI conference on artificial intelligence, vol. 34, no. 04, 2020, pp. 5363–5370.
  • [17] J. Chen, X. Wang, and X. Xu, “Gc-lstm: Graph convolution embedded lstm for dynamic network link prediction,” Applied Intelligence, pp. 1–16, 2022.
  • [18] Y. Yao and C. Joe-Wong, “Interpretable clustering on dynamic graphs with recurrent graph neural networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 5, 2021, pp. 4608–4616.
  • [19] Y. Fan, Y. Yao, and C. Joe-Wong, “Gcn-se: Attention as explainability for node classification in dynamic graphs,” in 2021 IEEE International Conference on Data Mining (ICDM).   IEEE, 2021, pp. 1060–1065.
  • [20] D. Zhou, J. Huang, and B. Schölkopf, “Learning with hypergraphs: Clustering, classification, and embedding,” Advances in neural information processing systems, vol. 19, 2006.
  • [21] Y. Gao, Y. Feng, S. Ji, and R. Ji, “Hgnn+: General hypergraph neural networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp. 3181–3199, 2022.
  • [22] Y. Gao, Z. Zhang, H. Lin, X. Zhao, S. Du, and C. Zou, “Hypergraph learning: Methods and practices,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 5, pp. 2548–2566, 2020.
  • [23] Y. Yan, J. Qin, J. Chen, L. Liu, F. Zhu, Y. Tai, and L. Shao, “Learning multi-granular hypergraphs for video-based person re-identification,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 2899–2908.
  • [24] Y. Feng, H. You, Z. Zhang, R. Ji, and Y. Gao, “Hypergraph neural networks,” in Proceedings of the AAAI conference on artificial intelligence, vol. 33, no. 01, 2019, pp. 3558–3565.
  • [25] J. You, T. Du, and J. Leskovec, “Roland: graph learning framework for dynamic graphs,” in Proceedings of the 28th ACM SIGKDD conference on knowledge discovery and data mining, 2022, pp. 2358–2366.
  • [26] “Wikimedia edit history dump,” 2021, https://meta.wikimedia.org/wiki/Data_dumps.
  • [27] F. M. Harper and J. A. Konstan, “The movielens datasets: History and context,” ACM Transactions on Interactive Intelligent Systems (TiiS), vol. 5, no. 4, pp. 1–19, 2015.