Community-Centric Graph Unlearning
Abstract
Graph unlearning technology has become increasingly important since the advent of the ‘right to be forgotten’ and the growing concerns about the privacy and security of artificial intelligence. Graph unlearning aims to quickly eliminate the effects of specific data on graph neural networks (GNNs). However, most existing deterministic graph unlearning frameworks follow a balanced partition-submodel training-aggregation paradigm, resulting in a lack of structural information between subgraph neighborhoods and redundant unlearning parameter calculations. To address this issue, we propose a novel Graph Structure Mapping Unlearning paradigm (GSMU) and a novel method based on it named Community-centric Graph Eraser (CGE). CGE maps community subgraphs to nodes, thereby enabling the reconstruction of a node-level unlearning operation within a reduced mapped graph. CGE makes the exponential reduction of both the amount of training data and the number of unlearning parameters. Extensive experiments conducted on five real-world datasets and three widely used GNN backbones have verified the high performance and efficiency of our CGE method, highlighting its potential in the field of graph unlearning. https://github.com/liiiyi/CCGU
Introduction
As Artificial Intelligence (AI) permeates various sectors, protecting individual privacy and data security has become an increasingly pressing concern (Wei et al. 2024). Legal frameworks, such as the “right to be forgotten” (Pardau 2018; Mantelero 2013; Cofone 2020), alongside the rise of techniques like membership inference attacks (He et al. 2021) have underscored the need for model providers to swiftly remove specific data and mitigate its impacts. In parallel, Graph Neural Networks (GNNs) (Li et al. 2022; Xu et al. 2018; Li et al. 2024a, b) have gained prominence as a powerful tool, renowned for their ability to model complex data relationships. However, their widespread application in areas involving highly sensitive data (Zhang et al. 2025), such as social networks(Qiu et al. 2018) and recommendation systems (Fan et al. 2019), has intensified the demand for effective privacy-preserving techniques. Among these, Graph Unlearning (Chen et al. 2022b) has gained particular prominence.
Graph unlearning seeks to precisely quantify and remove the influence of specific data within graph-structured networks while maintaining the integrity of the overall model, thereby addressing critical privacy concerns in an era of expanding AI applications. A straightforward approach to unlearning involves retraining the entire model from scratch. However, this method is highly impractical due to the substantial time costs and associated model downtime. Consequently, current research in unlearning has focused on developing techniques that reduce this time expenditure while maintaining the model’s effectiveness (Said et al. 2023).
To achieve efficient and effective unlearning, researchers have developed various strategies that circumvent the costly process of full retraining (Guo et al. 2019; Izzo et al. 2021; Bourtoule et al. 2021; Chen et al. 2022b). These unlearning frameworks are typically divided into approximate and deterministic methods (Nguyen et al. 2022). Approximate methods (Guo et al. 2019; Izzo et al. 2021) involve fine-tuning models to implement unlearning; however, they only provide abstract statistical guarantees of privacy. In contrast, deterministic methods, the focus of this paper, involve retrospective retraining to remove data effects entirely (Yan et al. 2022). For example, the SISA (Bourtoule et al. 2021) approach partitions data for selective retraining. GraphEraser (Chen et al. 2022b), building on SISA, uses balanced partitioning to ensure equal node distribution and aggregates submodels using a learning-based method. Recently introduced, GUIDE (Wang, Huai, and Wang 2023) enhances predictive performance by repairing edges before aggregating partitioned subgraphs.
In summary, the prevailing deterministic graph unlearning frameworks (Chen et al. 2022b; Wang, Huai, and Wang 2023; Chen et al. 2022a) are primarily based on the Balance Partition-Submodel Training-Aggregation paradigm (BP-SM-TA). The principal limitation of this paradigm lies in its assumption that the unlearning operation can be decomposed into a set of discrete sub-problems, each with its own global representativeness. This decomposition, however, sacrifices the connectivity between sub-problems and increases the complexity of managing them. Graph unlearning with BP-SM-TA, two specific challenges arise:
Challenge 1 (Low Structural Utilization): In the BP-SM-TA paradigm, submodels are treated as independent, leading to the disassembly of the original graph during partitioning and resulting in the loss of critical structural information between shards that were originally connected. This, in turn, diminishes the overall comprehension of the graph’s structure. Additionally, the balanced partitioning method, which aims to achieve uniform training times for submodels by selecting nodes through clustering, further compromises the graph’s structural integrity.
Challenge 2 (High Model Complexity): In the BP-SM-TA paradigm, submodels must independently represent the original graph, meaning each submodel needs to capture the essential features of the original structure. This requirement introduces significant challenges in terms of partitioning and structuring the submodels. During the unlearning process, this paradigm necessitates retraining all affected submodels, which leads to extensive parameter computations. For large graphs, simultaneously loading all submodels for aggregation incurs substantial time costs, which contradicts the primary objective of efficient graph unlearning.
To tackle the aforementioned challenges, we propose a novel Graph Structure Mapping Unlearning paradigm (GSMU). Specifically, GSMU generates a mapped graph comprising the non-redundant features and structural information of the original graph. The nodes and edges of the mapped graph are derived from the subgraphs and inter-subgraph node relationships in the original graph. The construction of edges between mapped nodes addresses the limitation of submodel independence identified in Challenge 1. Meanwhile, unlearning operations only require updating the affected mapped nodes and edges, thereby avoiding extensive parameter retraining and addressing the limitation of independent submodel representativeness identified in Challenge 2. Moreover, we propose a novel community-centric graph unlearning method, Community-centric Graph Eraser (CGE), which serves as a practical implementation of GSMU. CGE comprises two components: parameter-free community-centric graph mapping and a node-level unlearning strategy. Specifically, CGE divides the original graph into multiple communities to accomplish the subgraph-node mapping required by the GSMU, and subsequently the unlearning strategy only needs to be performed on the nodes after the mapping, thus controlling the part affected by the unlearning requirement at the node level. Our contributions are summarized as follows:
-
•
We propose a novel graph structure mapping unlearning paradigm, GSMU, to overcome the space and time constraints of the traditional BP-SM-TA paradigm in addressing unlearning concerns. To the best of our knowledge, this is the first application of a mapped graph approach in graph unlearning.
-
•
Based on GSMU, we introduce a new efficient graph unlearning framework, CGE, which supports deterministic data unlearning while maximizing the retention of the original graph’s structural information and semantic feature. By mapping communities to nodes, unlearning operations can be executed at the node level.
-
•
Extensive experiments on five real-world datasets demonstrate that our proposed CGE achieves superior performance and remarkable unlearning efficiency across datasets of varying scales and characteristics.
Preliminaries
Notations.
In this paper, we use specific notations to represent various components of GNNs and related operations. The tilde notation , is used to indicate entities produced by mapping operations. Calligraphic fonts are employed to denote sets. We summarize the notations in Table 1.
Notation | Description |
---|---|
Original Graph | |
Mapped Graph | |
The Mapping from to | |
Node Subgraphs Set | |
Unlearning Requests Set | |
Unlearning Influence Set | |
The process of updating |
Deterministic Graph Unlearning.
Given a graph , where represents the set of nodes and represents the set of edges, and a GNN backbone model denoted as , with being the model parameters. Suppose the unlearning request is a sequence of nodes . The goal of deterministic graph unlearning is to eliminate the influence of these nodes from the model, formalized as:
(1) |
where and denotes the updated model parameters after unlearning. The advanced objectives of graph unlearning should ensure portability, comparable model utility to training from Scratch (Chen et al. 2022b), the maintenance of model performance before and after unlearning, high graph structure utilization, and minimal unlearning window periods. Graph unlearning can be broadly divided into node-level and edge-level approaches. In our GSMU paradigm, edge unlearning is considered a subproblem of node unlearning. Consequently, our work focuses on the node classification task, treating node unlearning as the fundamental unit.
Community Detection.
Community detection aims to partition a graph into a community set , optimizing intra-community connectivity while minimizing inter-community connectivity (Jin et al. 2021; Khan and Niazi 2017). Hierarchical community detection (Lancichinetti et al. 2011) further refines this process by creating multi-level community structures, which clarify relationships at each level. In this paper, we first perform coarse-grained segmentation followed by fine-grained segmentation, as described in the next section.

Community-Centric Unlearning
In this section, we detail the GSMU paradigm and the specific implementation of CGE.
GSMU.
In contrast to the BP-SM-TA paradigm, GSMU employs a subgraph-based graph structure mapping approach to derive a representative mapped graph, , from the original graph, . Each mapped node in corresponds to a subgraph in . After partitioning, and are isolated to ensure privacy. When it is necessary to unlearn a node, the unlearning operation only affects its corresponding mapped nodes, i.e., the subgraphs in which the node resides.
Definition 1.
Graph Structure Mapping Given a graph with a set of predefined subgraphs , GSMU maps it to a graph . Each subgraph constructs a mapped node, i.e., . An edge is created for each edge that connects nodes in different subgraphs and , i.e., . Node features fusion is performed on each subgraph, i.e., , where is a fusion function. Node labels are determined by voting within each subgraph, i.e., , where is a labeling function. The mapping relationship between and is defined as: .
Under the developed GSMU framework, we propose a community-centric graph unlearning method, i.e., Community-Centric Graph Eraser (CGE). The subsequent parts will provide an overview of the operational process of CGE and explain how CGE constructs mapped graph and utilises if for node prediction and unlearning.
Overview of CGE
The CGE framework, illustrated in Figure 1, consists of two stages: the generation of a community-centered mapped graph and a node-level unlearning/training stage. In particular, CGE employs hierarchical community detection to construct a subgraph set , establish the graph structure mapping in Definition 1, and generate a mapped graph for subsequent operations.
-
•
Unlearning: Upon receiving a sequence of unlearning requests , CGE employs the node-level unlearning strategy to determine the specific influence of on the mapped graph identifying the specific affected mapped edges and nodes. Subsequently, CGE recalculates the nodes and edges within according to the mapping, thereby updating the graph .
-
•
Prediction: When the node requires prediction, CGE determines destination community based on the community of the adjacent nodes of . The node features of are then fused with the mapped nodes corresponding to , which in turn updates the graph acting on the GNN. The node is reanalyzed to map the attributes of the node .
Community-Centric Mapping
CGE generates a community-centered mapped graph and establishes a graph structure mapping , as described in Definition 1. In particular, is utilized solely to retain the representative information of , thereby enabling its replacement for representation learning. Each community within is mapped into a node, and the robustness score between the mapped node pairs is employed to construct the edge relationships.
Community Detection
CGE introduces the Louvain (Blondel et al. 2008) to initialize communities and uses the OSLOM method (Lancichinetti et al. 2011) to optimize community structure. Louvain is a community detection algorithm based on modularity optimization. It attempts to move each node into the community of its neighboring nodes to maximize the modularity increment. The modularity is calculated as follows:
(2) |
where represents the edge weight between node and node . Additionally, and denote the degrees of and , respectively. The term denotes the total weight of all edges in . is an indicator variable that equals 1 when nodes and belong to the same community, and 0 otherwise.
OSLOM calculates the statistical significance of the initialized communities using the following formula:
(3) |
where represents the total number of possible connections in community , represents the actual number of connections in , and represents the probability that nodes in are connected to other nodes. OSLOM then returns an optimized community set .
Mapping
As delineated in Definition 1, the mapping of comprises four distinct mappings. Among these, we categorize the node, feature, and label mapping as node mapping, which align with edge mapping.
Node Mapping.
The initial step is to construct a node mapping based on the community set :
(4) |
To ensure that supervised learning goes well, it is essential to assign the correct labels and features to the mapped nodes. Despite the strict division of communities, node features within a community may vary and have inconsistent labels, making it impractical to average the node labels and features simply (Zhang et al. 2017). To address this, we calculated the core feature components of the community and designed a feature-based label mapping method. We construct a feature matrix , where is the feature space dimensionality and rows correspond to node feature vectors in the community . Principal Component Analysis (PCA) is then applied to to reduce its dimensionality. Let be the transformed feature matrix by PCA, where is the number of principal components based on the ratio components. The community feature is calculated as the mean of the transformed feature vectors:
(5) |
where is the transformed feature vector of node in community . The feature mapping is then constructed using the fusion feature:
(6) |
where represents the fusion feature of community .
Next, we identify a subset of nodes within the current community that have the highest feature similarity to the mapped node , and then perform majority voting on these nodes. These nodes are then subjected to a majority voting process. Specially, we consider a community with node labels and their corresponding feature vectors . We calculate the Euclidean distance between each node feature vector in the community and .
(7) |
We sort the distances in ascending order and calculate the gradient between consecutive distances. The distance corresponding to the maximum gradient is given by:
(8) |
Finally, we select nodes with distances and aggregate their labels using majority voting:
(9) |
The label mapping is then constructed as:
(10) |
Edge Mapping.
Using the adjacency matrix to represent the graph , for each edge where node belongs to community and node belongs to community , we define the edge count between the community pair as . The robustness score of the connection between communities and is evaluated using the following formula:
(11) |
where and denote the out-degree of and the in-degree of , respectively.
A smoothing function, is then employed to process the robustness score, representing the edge weight of the original community relationship. To store these weights, we introduce a set , where each element is a key-value pair: the key is the node pair , and the value is . Subsequently, a threshold is utilized to filter and construct the edge mapping as follows:
(12) |
Finally, the total mapping relationship is established as follows:
(13) |
Node-Level Unlearning Strategy
Upon receipt of an unlearning request as a sequence, denoted as , CGE initially identifies the communities of the nodes in for this batch and subsequently returns a set of affected communities, denoted as . Based on the graph structure mapping , the set of all affected mapped nodes and edges, denoted as , is obtained.
It is important to note that the application of node-level PCA computation acts as a filter to exclude non-principal components and unlearning requests that are not affected. If a node is not utilized in the calculation of the features and labels associated with the mapped node, it can be inferred that the information contained within the node is not retained within the mapped graph. Consequently, the influence of node-level unlearning of is effectively negated, thereby avoiding unnecessary expenditure of time and resources.
Based on , CGE updates the relevant feature and structure information of the graph and generates an updated graph :
(14) |
The updated graph and the mapped graph must satisfy the following constraints to completely eliminate specific data and its derived information within the model.
(15) |
Experiments
To evaluate the efficacy of CGE, we conducted a series of comprehensive experiments using four real-world datasets and three prevalent GNN backbones. The evaluation addresses the following questions: (1) Can CGE provide excellent and comparable model utility? (2) How efficient is CGE in practical applications? (3) Can CGE achieve deterministic graph unlearning? Additionally, a series of ablation studies were performed to examine CGE’s superiority at each stage of the graph unlearning process. We also assessed the applicability of different community detection methods to CGE, as detailed in Appendix D.
Dataset | #Nodes | #Edges | #Classes | #Features | Density |
---|---|---|---|---|---|
Cora | 2,708 | 5,429 | 7 | 14,33 | 0.148% |
Citeseer | 3,327 | 4,732 | 6 | 3,703 | 0.085% |
CS | 18,333 | 163,788 | 15 | 6,805 | 0.097% |
232,965 | 41 | 602 | 0.040% |
Experimental Setup
Datasets.
We evaluated the CGE on four real-world datasets of various sizes, including Cora (Yang, Cohen, and Salakhudinov 2016), Citeseer (Yang, Cohen, and Salakhudinov 2016), CS (Shchur et al. 2018a) and Reddit 111https://docs.dgl.ai/generated/dgl.data.RedditDataset.html, all of which are commonly used in GNN evaluations. The large-scale Reddit dataset was specifically included to assess the unlearning framework’s performance in a real-world context. The statistics for these datasets are summarized in Table 2.
GNN Backbones.
Baselines.
To demonstrate the efficacy of CGE, we compare it with the state-of-the-art unlearning algorithms, including SISA (Bourtoule et al. 2021), GraphEraser (Chen et al. 2022b), and GUIDE (Wang, Huai, and Wang 2023), all of which are deterministic unlearning methods based on the BP-SM-TA paradigm. We also introduced Scratch, a scheme involving retraining from beginning to end, to evaluate comparable model utility. For each unlearning batch, 0.5% of the original dataset’s nodes were randomly selected. More detailed experimental settings are provided in Appendix B.
Metrics.
We consider two key aspects to measure the performance of CGE: model utility and unlearning efficiency. Model utility is assessed using the Macro F1 score, while unlearning efficiency is evaluated based on the time cost required for model deployment and unlearning.
Implementation.
CGE is implemented using Python 3.8.19 and DGL222https://www.dgl.ai. All experiments were conducted on an NVIDIA Tesla A800 GPU server running the Ubuntu 23.04 LTS operating system.
Dataset | Model | Method | ||||
---|---|---|---|---|---|---|
Scratch | SISA | GraphEraser | GUIDE | CGE | ||
Cora | SAGE | |||||
GAT | ||||||
GCN | ||||||
Citeseer | SAGE | |||||
GAT | ||||||
GCN | ||||||
CS | SAGE | |||||
GAT | ||||||
GCN | ||||||
SAGE | ||||||
GAT | ||||||
GCN |
Method | Model Deployment | Unlearning | ||||||
---|---|---|---|---|---|---|---|---|
Cora | Citeseer | CS | Cora | Citeseer | CS | |||
Scratch | 131.52 | 109.62 | 702.11 | 13048.67 | 130.23 | 110.05 | 700.12 | 13056.33 |
GraphEraser | 179.53 | 167.16 | 1042.07 | 61220.05 | 10.78 | 12.13 | 62.56 | 3128.22 |
GUIDE | 124.58 | 127.76 | 745.10 | 39064.71 | 10.11 | 11.07 | 71.60 | 3411.89 |
CGE | 93.12 | 108.35 | 747.82 | 18143.95 | 8.12 | 9.58 | 9.33 | 49.52 |
Model Utility of CGE
To answer question (1), we conducted a comparative analysis of CGE against the best results from three baselines across three GNN backbones and four datasets. This analysis aims to verify CGE’s model utility in comparison to Scratch, as referenced in the Preliminaries.
Results.
The experimental results presented in Table 3 reveal the following findings: (1) CGE outperforms all established baselines in terms of model utility. For instance, on Reddit, CGE shows an average increase of 11.56% compared to GUIDE and an even more significant average increase of 33.24% compared to GraphEraser. This superiority is attributed to two factors: (a) CGE employs a graph-optimized community detection method, resulting in a more accurate community distribution than the balanced partitioning used in the BP-SM-TA paradigm. (b) CGE accounts for the inherent relationships between subgraphs, reducing information loss, a consideration overlooked by the BP-SM-TA paradigm. (2) CGE demonstrates comparable model utility to Scratch, which serves as the foundational baseline due to its broad and outstanding performance achieved through the entire retraining of the dataset. Notably, CGE has even outperformed Scratch in certain cases, primarily due to its pre-integration of similar data within the same community in the original graph, effectively reducing noise. This results in a newly mapped graph with more robust and accurate information. CGE achieves average performance closer to Scratch, fulfilling the objectives of unlearning.
Efficiency of CGE
To answer the question (2), we evaluate CGE’s efficiency in two tasks: model deployment and unlearning.
In the model deployment stage, the unlearning framework generates a complete prediction model, including graph partitioning, data pre-processing, and initial model training. In the unlearning stage, the model performs deterministic unlearning on a demand sequence and produces an updated model based on the deployed one. Since the training time across three backbones on the same dataset is linear, we show the average efficiency of different unlearning frameworks across all backbones, as presented in Table 4.
During model deployment stage, CGE’s efficiency surpasses most baselines by retaining only the most representative information in the mapped graph. Compared to unlearning methods trained on the original graph, the mapped graph offers smaller-scale training data with higher representativeness. For example, as shown in Table 5, the CS dataset requires fewer training data and parameters for the mapped graph compared to the representative method GraphEraser under the BP-SM-TA paradigm.
During unlearning stage, CGE’s efficiency advantage becomes even more pronounced, significantly surpassing Scratch and all baselines. The graph structure mapping minimizes the impact of data size differences, allowing CGE to reduce unlearning time to seconds without sacrificing model utility. This efficiency results from both the exponential reduction in data and parameters and the node-level unlearning strategy. Unlike traditional methods that require retraining and aggregation of submodels, CGE only needs to recalculate the relevant mapped node-level data, significantly reducing time complexity.
Index | GraphEraser | CGE |
---|---|---|
#Nodes (T) | 18,333 | 1,756 |
#Parameters (T) | 43,498,280 | 217,875 |
#Parameters (U) | 4,890,365 | 217,875 |
Unlearning Ability
To determine the effectiveness of CGE in unlearning, we utilize a node-level GNN member inference attack methodology (MIA) (He et al. 2021). This approach involves calculating the discrepancy between the original model and the model after CGE unlearning, to determine whether a node has been successfully removed from the model. After randomly removing 0.5% of all nodes according to the experimental parameters, the MIA attack was executed. We consider two scenarios: and , representing the MIA attack results on the original GNN backbones and CGE, respectively. As shown in Table 6, the AUC of MIA on CGE is significantly lower than that of the original GNN backbones, approaching 0.5, which is analogous to random guessing. This indicates that CGE does not produce additional information leakage and that its unlearning of nodes and associated effects is effective.
Dataset | GCN | GAT | SAGE | |||
---|---|---|---|---|---|---|
Cora | 73.3 | 50.1 | 75.2 | 49.8 | 74.1 | 52.1 |
Citeseer | 78.3 | 52.1 | 79.1 | 50.2 | 79.7 | 50.7 |
CS | 73.1 | 50.1 | 72.3 | 49.3 | 70.8 | 48.8 |
63.1 | 50.0 | 65.5 | 49.6 | 65.1 | 49.8 |
Ablation Study
We aim to evaluate the contribution of different modules in CGE to the system by addressing the following questions:
Compared to previous schemes, is overlapping community detection more advantageous?
Most previous schemes use similar node feature clustering methods for subgraph-balanced partitioning, as BP-SM-TA requires each subgraph to represent the original graph. In contrast, CGE calculates a highly representative mapped graph without cutting the original graph, thereby limiting information loss. To evaluate the schemes for CGE, we use a custom metric, Information Retention, which measures cosine similarity between the low-dimensional embeddings of the original graph and the subgraph—higher values indicate less information loss. Additional details and validation experiments are provided in Appendix C. Figure 2 shows the evaluation results of different graph partitioning schemes Balanced Kmeans (BEKM) and Louvain-OSLOM (OSLOM) across four datasets. The balanced partitioning scheme is suboptimal, especially in the Reddit dataset, whose original graph representativeness is below zero, explaining why GraphEraser performs poorly on Reddit. In contrast, overlapping community detection better suits CGE. Figure 2 shows that BEKM leads to a significant decrease in CGE’s performance, further validating the above explanation.



Why can CGE overcome the limitations of balanced partitioning?
Previous experiments have shown that balanced partitioning is a suboptimal graph partitioning scheme, as it causes significant information loss due to graph cutting. This scheme is adopted by the BP-SM-TA paradigm to prevent imbalanced submodels from causing unmanageable learning time consumption, but it sacrifices model utility in the process. In contrast, CGE, based on the GSMU paradigm, offers a node-level unlearning solution with low parameter count and complexity. Figure 3 shows the unlearning efficiency of CGE on two large-scale datasets CS and Reddit under different unlearning node ratios. Notably, CGE avoids exponential increases in unlearning time as the number of forgotten nodes grows, thus eliminating the need for balanced partitioning as a protective measure.
Related Work
Machine unlearning aims to erase specific data and its derived effects. Following Cao’s work (Cao and Yang 2015), this concept has evolved toward the development of both approximate and deterministic unlearning methods.
Approximate unlearning (Wu et al. 2023; Guo et al. 2019) achieves statistical unlearning by fine-tuning model parameters, but it cannot guarantee precise data erasure, posing challenges for non-convex models like deep learning. Conversely, deterministic unlearning ensures the complete removal of data points and their influence, often through retraining. SISA (Bourtoule et al. 2021) proposed a method involving random balanced partitioning into multiple sub-models, where only the affected sub-models are retrained based on unlearning requests, and the final prediction is aggregated from these sub-models. This approach is referred to as the BP-SM-TA paradigm in this paper. Building on this, Chen et al. extended the concept to graph-structured data with GraphEraser (Chen et al. 2022b), and the recently proposed GUIDE (Wang, Huai, and Wang 2023) further optimized it for inductive graphs.
In contrast to previous work, the proposed CGE framework employs a novel graph structure mapping paradigm, which maps the original graph to a highly representative low-order graph. It introduces a low-parameter node-level unlearning strategy to enhance the utilization and efficiency of the graph structure within the unlearning framework.
Conclusions
In this work, we introduce GSMU, a novel graph structure mapping unlearning paradigm that offers an intuitive and efficient approach to graph unlearning. We propose the Community-Centric Graph Eraser (CGE) as a practical implementation of GSMU. CGE first detects communities and maps them to nodes, creating a mapped graph. This graph is then used to reconstruct node-level unlearning operations and representation learning tasks, resulting in high graph structure utilization and significantly reduced unlearning computational costs. Comprehensive evaluations demonstrate that CGE effectively leverages graph structures, offering excellent model utility and significantly improved unlearning efficiency. This work establishes a groundbreaking paradigm for graph unlearning with its streamlined architecture and superior performance.
CGE shows strong potential but faces challenges with traditional community detection techniques on large-scale graphs. Future work will leverage deep learning-based methods to enhance data insights while addressing privacy concerns, aiming for a balanced and effective approach.
Acknowledgments
This work was partially supported by the Project of Guangxi Science and Technology (GuiKeAB23026040), Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (MIMS24-13), Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (24-A-01-02), and Innovation Project of Guangxi Graduate Education (XYCSR2024102).
References
- Blondel et al. (2008) Blondel, V. D.; Guillaume, J.-L.; Lambiotte, R.; and Lefebvre, E. 2008. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10): P10008.
- Bourtoule et al. (2021) Bourtoule, L.; Chandrasekaran, V.; Choquette-Choo, C. A.; Jia, H.; Travers, A.; Zhang, B.; Lie, D.; and Papernot, N. 2021. Machine unlearning. In 2021 IEEE Symposium on Security and Privacy (SP), 141–159. IEEE.
- Cao and Yang (2015) Cao, Y.; and Yang, J. 2015. Towards making systems forget with machine unlearning. In 2015 IEEE symposium on security and privacy, 463–480. IEEE.
- Chen et al. (2022a) Chen, C.; Sun, F.; Zhang, M.; and Ding, B. 2022a. Recommendation unlearning. In Proceedings of the ACM Web Conference 2022, 2768–2777.
- Chen et al. (2022b) Chen, M.; Zhang, Z.; Wang, T.; Backes, M.; Humbert, M.; and Zhang, Y. 2022b. Graph unlearning. In Proceedings of the 2022 ACM SIGSAC conference on computer and communications security, 499–513.
- Cofone (2020) Cofone, I. 2020. The right to be forgotten: A Canadian and comparative perspective. Routledge.
- Fan et al. (2019) Fan, W.; Ma, Y.; Li, Q.; He, Y.; Zhao, E.; Tang, J.; and Yin, D. 2019. Graph neural networks for social recommendation. In The world wide web conference, 417–426.
- Guo et al. (2019) Guo, C.; Goldstein, T.; Hannun, A.; and Van Der Maaten, L. 2019. Certified data removal from machine learning models. arXiv preprint arXiv:1911.03030.
- Hamilton, Ying, and Leskovec (2017) Hamilton, W.; Ying, Z.; and Leskovec, J. 2017. Inductive representation learning on large graphs. Advances in neural information processing systems, 30.
- He et al. (2021) He, X.; Wen, R.; Wu, Y.; Backes, M.; Shen, Y.; and Zhang, Y. 2021. Node-level membership inference attacks against graph neural networks. arXiv preprint arXiv:2102.05429.
- Izzo et al. (2021) Izzo, Z.; Smart, M. A.; Chaudhuri, K.; and Zou, J. 2021. Approximate data deletion from machine learning models. In International Conference on Artificial Intelligence and Statistics, 2008–2016. PMLR.
- Jin et al. (2021) Jin, D.; Yu, Z.; Jiao, P.; Pan, S.; He, D.; Wu, J.; Philip, S. Y.; and Zhang, W. 2021. A survey of community detection approaches: From statistical modeling to deep learning. IEEE Transactions on Knowledge and Data Engineering, 35(2): 1149–1170.
- Khan and Niazi (2017) Khan, B. S.; and Niazi, M. A. 2017. Network community detection: A review and visual survey. arXiv preprint arXiv:1708.00977.
- Kipf and Welling (2016a) Kipf, T. N.; and Welling, M. 2016a. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
- Kipf and Welling (2016b) Kipf, T. N.; and Welling, M. 2016b. Variational graph auto-encoders. arXiv preprint arXiv:1611.07308.
- Lancichinetti et al. (2011) Lancichinetti, A.; Radicchi, F.; Ramasco, J. J.; and Fortunato, S. 2011. Finding statistically significant communities in networks. PloS one, 6(4): e18961.
- Li et al. (2024a) Li, C.; Cheng, D.; Zhang, G.; and Zhang, S. 2024a. Contrastive learning for fair graph representations via counterfactual graph augmentation. Knowledge-Based Systems, 305: 112635.
- Li et al. (2024b) Li, Z.; Chen, B.; Xi, P.; Huang, Z.; Zhang, W.; and Zhang, S. 2024b. Spatio-Temporal Dynamically Fused Graph Convolutional Network. In 2024 International Joint Conference on Neural Networks (IJCNN), 1–8. IEEE.
- Li et al. (2022) Li, Z.; Jian, X.; Wang, Y.; and Chen, L. 2022. Cc-gnn: A community and contraction-based graph neural network. In 2022 IEEE International Conference on Data Mining (ICDM), 231–240. IEEE.
- Mantelero (2013) Mantelero, A. 2013. The EU Proposal for a General Data Protection Regulation and the roots of the ‘right to be forgotten’. Computer Law & Security Review, 29(3): 229–235.
- Nguyen et al. (2022) Nguyen, T. T.; Huynh, T. T.; Nguyen, P. L.; Liew, A. W.-C.; Yin, H.; and Nguyen, Q. V. H. 2022. A survey of machine unlearning. arXiv preprint arXiv:2209.02299.
- Pardau (2018) Pardau, S. L. 2018. The california consumer privacy act: Towards a european-style privacy regime in the united states. J. Tech. L. & Pol’y, 23: 68.
- Qiu et al. (2018) Qiu, J.; Tang, J.; Ma, H.; Dong, Y.; Wang, K.; and Tang, J. 2018. Deepinf: Social influence prediction with deep learning. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, 2110–2119.
- Rosvall and Bergstrom (2008) Rosvall, M.; and Bergstrom, C. T. 2008. Maps of random walks on complex networks reveal community structure. Proceedings of the national academy of sciences, 105(4): 1118–1123.
- Said et al. (2023) Said, A.; Derr, T.; Shabbir, M.; Abbas, W.; and Koutsoukos, X. 2023. A survey of graph unlearning. arXiv preprint arXiv:2310.02164.
- Shchur et al. (2018a) Shchur, O.; Mumme, M.; Bojchevski, A.; and Günnemann, S. 2018a. Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868.
- Shchur et al. (2018b) Shchur, O.; Mumme, M.; Bojchevski, A.; and Günnemann, S. 2018b. Pitfalls of graph neural network evaluation. arXiv preprint arXiv:1811.05868.
- Veličković et al. (2017) Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; and Bengio, Y. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903.
- Wang, Huai, and Wang (2023) Wang, C.-L.; Huai, M.; and Wang, D. 2023. Inductive graph unlearning. In 32nd USENIX Security Symposium (USENIX Security 23), 3205–3222.
- Wei et al. (2024) Wei, Y.; Yuan, H.; Fu, X.; Sun, Q.; Peng, H.; Li, X.; and Hu, C. 2024. Poincaré Differential Privacy for Hierarchy-aware Graph Embedding. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 9160–9168.
- Wu et al. (2023) Wu, J.; Yang, Y.; Qian, Y.; Sui, Y.; Wang, X.; and He, X. 2023. Gif: A general graph unlearning strategy via influence function. In Proceedings of the ACM Web Conference 2023, 651–661.
- Xie, Szymanski, and Liu (2011) Xie, J.; Szymanski, B. K.; and Liu, X. 2011. Slpa: Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. In 2011 ieee 11th international conference on data mining workshops, 344–349. IEEE.
- Xu et al. (2018) Xu, K.; Hu, W.; Leskovec, J.; and Jegelka, S. 2018. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826.
- Yan et al. (2022) Yan, H.; Li, X.; Guo, Z.; Li, H.; Li, F.; and Lin, X. 2022. ARCANE: An Efficient Architecture for Exact Machine Unlearning. In IJCAI, volume 6, 19.
- Yang, Cohen, and Salakhudinov (2016) Yang, Z.; Cohen, W.; and Salakhudinov, R. 2016. Revisiting semi-supervised learning with graph embeddings. In International conference on machine learning, 40–48. PMLR.
- Zhang et al. (2025) Zhang, G.; Yuan, G.; Cheng, D.; Liu, L.; Li, J.; and Zhang, S. 2025. Disentangled contrastive learning for fair graph representations. Neural Networks, 181: 106781.
- Zhang et al. (2017) Zhang, S.; Li, X.; Zong, M.; Zhu, X.; and Cheng, D. 2017. Learning k for knn classification. ACM Transactions on Intelligent Systems and Technology (TIST), 8(3): 1–19.
Appendix A A. GNN Backbones
The experimental verification of this work is based on three GNN backbones, which are representative mainstream graph neural network models, to verify the applicability of CGE.
Graph Convolutional Network (GCN):
GCN (Kipf and Welling 2016a) uses spectral graph convolutions to aggregate features from neighbors. The update rule is:
(16) |
where and is its degree matrix.
Graph Attention Network (GAT):
GAT (Veličković et al. 2017) introduces attention by weighting neighbor features based on their importance. The attention coefficient is:
(17) |
Graph SAmple and aggreGatE (SAGE):
SAGE (Hamilton, Ying, and Leskovec 2017) generates node embeddings by sampling and aggregating neighbor features. The update rule is:
(18) |
where AGG can be mean, LSTM, or pooling.
Appendix B B. Specific Experimental Settings
Parameter Settings
Since the three baselines are based on the BP-SM-TA paradigm, we set the same number of shards for each. We used the Adam optimizer with a default learning rate of 0.01 and a weight decay of 0.001. As a result, Cora, Citeseer, CS, and Reddit were divided into 20, 20, 80, and 300 shards, respectively. Each GNN backbone was trained for 200 epochs. We ran all experiments related to model utility and efficiency 10 times and reported the average results.
Evaluation Metrics
We evaluate the graph model’s utility with the Macro F1 score and assess the unlearning capability using node-level GNN membership inference attacks, measured by the Area Under the Curve (AUC).
Macro F1 Score.
The Macro F1 score averages the F1 scores for each class, providing an overall measure of the model’s classification performance across all classes (Shchur et al. 2018b). It is particularly useful when dealing with imbalanced data, ensuring that each class is equally represented in the final score.
AUC (Area Under the Curve).
AUC evaluates the effectiveness of the unlearning process by measuring the model’s susceptibility to membership inference attacks. It quantifies how well the model can distinguish between nodes that were part of the training set and those that were not, with higher AUC values indicating weaker unlearning ability.
Appendix C C. Further Ablation Study
The BP-SM-TA paradigm requires that each balanced subgraph must maintain a significant degree of representativeness of the original graph. We believe this enforced constraint severely compromises the performance of the graph structure. To validate our hypothesis, we evaluate the subgraph (community) partitioning results from two perspectives: Information Retention and Conductance. The evaluation of Information Retention has been presented in the main text, and we will briefly explain its underlying principles here.
Information Retention
In the ablation study, we assess the Information Retention across different graph partitioning schemes to measure how well the subgraphs represent the original graph. Specifically, We compute the embeddings and for the original graph and the subgraph, respectively, using a Graph Autoencoder (Kipf and Welling 2016b):
(19) |
(20) |
The autoencoder reconstructs the features from these embeddings:
(21) |
(22) |
Information retention is measured by the cosine similarity between the reconstructed features of the original graph and the mean of the subgraph reconstruction:
(23) |
where represents the set of nodes in the subgraph.
The final information retention score is the mean similarity:
(24) |
where is the number of nodes or comparisons. The value of information retention is defined between and , with a higher value indicating that the information characteristics of the subgraph are more similar to those of the original graph, and the information loss is lower.
Conductance
Conductance is a key metric used to evaluate the quality of community structures in a graph. It measures the ratio of the number of edges that cut through the boundary of a community to the total number of edges within that community. Lower conductance values indicate that the community is more tightly knit, with fewer edges crossing the boundary, thus suggesting a more compact and well-defined community structure.
The conductance of a community is given by:
(25) |
where:
-
•
is the number of edges connecting nodes in to nodes outside of ,
-
•
is the sum of the degrees of nodes within ,
-
•
is the volume of the complement of .
Lower values of indicate a more cohesive community with fewer external connections relative to its internal structure.

The conductance evaluation of the balanced community partitioning scheme BEKM and the Louvain-OSLOM (referred to as OSLOM) method used in this work is shown in Figure 4. OSLOM achieves a lower conductance score, demonstrating that the communities it partitions are more compact internally and sparser externally. This indicates that OSLOM better preserves the intrinsic structural features of communities while reducing external connections, thereby improving community purity and structural tightness. In contrast, BEKM shows higher conductance scores, indicating more boundary connections and looser internal structures in its partitioned communities. This difference highlights the superiority of OSLOM for complex network analysis, making it a more suitable approach for the CGE graph unlearning tasks based on GSMU.
Appendix D D. Adaptability of Cmmunity Detection Methods
We introduce two widely used community detection methods: Infomap (Rosvall and Bergstrom 2008) and SLPA (Xie, Szymanski, and Liu 2011).
-
•
Infomap. Infomap utilizes a method based on random walks and information theory to detect communities. It minimizes the description length of a network by finding an optimal partition that compresses the network’s structure. The objective function for Infomap can be expressed as:
(26) where is the adjacency matrix, and are the degrees of nodes and , is the total number of edges, and is 1 if nodes and are in the same community and 0 otherwise.
-
•
SLPA. SLPA (Speaker-Listener Label Propagation Algorithm) is another effective method that uses a label propagation approach to identify communities. Each node in the network iteratively updates its label based on the labels of its neighbors. The update rule for SLPA is given by:
(27) where is the label of node at iteration , denotes the set of neighbors of node , and mode denotes the most frequent label among the neighbors.
Dataset | Method | ||
---|---|---|---|
SLPA | OSLOM | Infomap | |
Cora | 16 | 83 | 5 |
Citeseer | 18 | 99 | 6 |
CS | 172 | 747 | 77 |
- | 18,077 | 8,763 |

For the combination of Louvain and OSLOM coarse and fine granularity proposed in the main text, we give a total of two structures. For small datasets, Louvain’s dataset delineation is used as the initialized community and OSLOM is used to optimize the community structure. For large datasets, we evaluate the conductivity of the initialized communities divided by Louvain, select the communities with poor conductance for fine-grained division, and optimize the community structure using OSLOM.
Cora(M) | Cora(O) | Citeseer(M) | Citeseer(O) | CS(M) | CS(O) | Reddit(M) | Reddit(O) | |
---|---|---|---|---|---|---|---|---|
Entropy | 1.8197 | 1.8311 | 1.7434 | 1.7533 | 2.3961 | 2.4087 | 3.3967 | 3.4013 |
IR | 0.2024 | 0.2200 | 0.3068 | 0.3766 | 0.0291 | 0.0285 | 0.0141 | 0.0116 |
Gini | 0.2706 | 0.2587 | 0.1400 | 0.1338 | 0.4328 | 0.4225 | 0.4314 | 0.4222 |
Table 7 presents the partitioning durations of different community detection methods across four datasets, while Figure 5 illustrates their Macro F1 scores on various GNN backbones for the same datasets. When considering both results, SLPA and OSLOM demonstrate high model utility; however, SLPA’s time consumption for large datasets is prohibitively high, rendering it impractical. Although Infomap is efficient in partitioning, its predictive performance is insufficient. Despite the relatively lower partitioning efficiency of OSLOM, we elected to utilize this method because CGE only necessitates a single community detection process throughout the workflow. We prefer the method with superior performance rather than optimizing time. Moreover, the efficiency experiments presented in the main text confirm that, despite OSLOM’s lower partitioning efficiency compared to other methods, it still offers significant advantages in model deployment compared to the baselines.
We carefully analyzed the characteristics of different community detection methods. The conventional OSLOM method is compatible with a wide range of graph structures, including directed graphs. We mitigated its high computational cost in large-scale networks by integrating coarse and fine-grained methods. In comparison, SLPA relies more on the randomness of initialization, while Infomap places an undue emphasis on random walks. To ensure the robustness of CGE, the more stable OSLOM method was selected.
It is worth mentioning that the GSMU paradigm and its CGE framework proposed in this paper are a novel universal and efficient graph unlearning strategy, and its great potential supports the integration of any more effective community detection method or graph mapping algorithm.
Appendix E E. Evaluation of Class Imbalance in CGE
In the framework design, it was considered that some nodes within a single community may represent noise at the feature or label level relative to the entire collective. To mitigate the complete dominance of the majority class, an intuitive approach was proposed based on majority voting and Euclidean distance-based local proximity. This allows label selection to better align with the community’s local structure, thereby minimizing the negative impact on supervised learning.
To validate the approach, we conducted experiments involving the class distribution of nodes in the generated mapped graph and the original graph. The class imbalance was quantified using three mathematical measures: class distribution vector entropy, class imbalance ratio (IR), and the Gini coefficient. The following outlines the methods used for calculating these measures:
-
•
Class Distribution Vector Entropy: The entropy of the class distribution vector was calculated using the formula:
(28) where is the proportion of nodes belonging to class , and is the number of distinct classes. This measure quantifies the uncertainty or disorder within the class distribution.
-
•
Class Imbalance Ratio (IR): The class imbalance ratio is defined as the ratio between the size of the largest class and the size of the smallest class:
(29) A higher ratio indicates greater imbalance in the class distribution.
-
•
Gini Coefficient: The Gini coefficient is calculated as:
(30) where is the proportion of nodes in class . This measure provides a summary of the inequality of the class distribution, with higher values indicating greater imbalance.
The results of the experiment are summarized in Table 8, where (M) denotes the mapped graph generated by CGE and (O) denotes the original graph.
The results indicate that the mapped graph generated by CGE did not introduce additional data bias when compared to the original graph.