11email: [email protected]
11email: {jianlong.zhou, sunny.verma, Fang.Chen}@uts.edu.au
A Survey of Explainable Graph Neural Networks: Taxonomy and Evaluation Metrics
Abstract
Graph neural networks (GNNs) have delivered a significant boost in prediction performance on graph data. At the same time, the predictions made by these models are often hard to interpret. In that regard, many efforts have been made to explain the prediction mechanisms of these models, for example through methods such as GNNExplainer, XGNN and PGExplainer. Although such works present systematic frameworks for interpreting GNNs, a holistic review of explainable GNNs is unavailable. In this survey, we present a comprehensive review of the explainability techniques developed for GNNs, and categorize them according to the type of explainable method they employ. We further provide the common performance metrics for GNN explanations and point out several future research directions.
Keywords:
Deep learning · Explainable artificial intelligence · Graph neural networks · Evaluation metrics
1 Introduction
A graph can be viewed as a representation of relationships formed by a set of nodes ($V$) and edges ($E$). It is an ideal data structure for modeling a variety of real-world datasets (e.g., molecules). With the resurgence of deep learning, graph neural networks (GNNs) have become a powerful tool for modeling graph data and have achieved impressive performance in a great number of domains and applications, such as recommendation, chemistry, and medicine [31, 11, 27]. However, incorporating graph structure and feature information together leads to complex non-linear models, increasing the difficulty of understanding their working mechanisms as well as their predictions. On the other hand, an explainable model is favored and even necessary, especially in practical scenarios (e.g., medical diagnosis), as explanations benefit users in multiple ways, such as improving the model’s fairness and security, and enhancing trust in the model’s recommendations. As a result, eXplainable GNNs (XGNNs) have attracted considerable research attention in recent years and can be grouped into two categories: 1) adapting existing eXplainable-AI (XAI) methods and using them directly to explain GNNs; 2) developing dedicated strategies based on intrinsic graph structures and features, without involving XAI approaches.
Although an increasing number of works have focused on the explainability of GNNs in recent years, there is little systematic discussion of them. We believe that analyzing these recent works on XGNNs in a comprehensive way would facilitate a better understanding of these methods, stimulate new ideas, and provide insight into developing new explainable methods. Therefore, we analyze and summarize the current explainable methods for GNNs. In particular, we categorize them into two groups—XAI-based XGNNs in Section 2 and non-XAI-based XGNNs in Section 3. We then present the metrics that are used to measure the explainability of XGNNs in Section 4. We discuss recurrent issues with XGNNs in Section 5, and finally point out several future research directions in Section 6.
Our contributions can be summarized as:
• We systematically analyze state-of-the-art XGNN methods and categorize them into two groups: XAI-based XGNNs, which leverage existing XAI approaches to explain GNNs, and non-XAI-based XGNNs, which move away from current XAI methods and instead attempt to explain GNNs by taking advantage of the inherent structures and features of graphs.
• We present the evaluation metrics for XGNNs, which can be used to measure the performance of XGNN methods, as knowledge of evaluation metrics is necessary to educate the end-users and practitioners of XGNNs.
• We discuss the recurrent problems in the field of XGNNs along with possible solutions, and finally point out several potential research directions to further improve the explainability of GNNs.

2 XAI-based Explainable Graph Neural Networks
By analyzing the XGNN literature, we classify explainable GNN methods into two categories: XAI-based and non-XAI-based. The taxonomy of XGNNs is shown in Figure 1. We begin with a brief introduction to XAI, as it will aid understanding of the XAI-based explainable techniques for GNNs, and then present the XGNN methods themselves.
2.1 Explainable Artificial Intelligence
Over the past years, XAI has become a hot research topic, with an increasing number of studies in the field. Several surveys have summarized its history, taxonomy, evaluation, challenges and opportunities, mainly focusing on the explanation of deep neural networks (DNNs) [1, 8, 10, 24, 12].
XAI techniques can be classified along three dimensions, as discussed in [10]: (i) the scope of interpretability, (ii) the methodology, and (iii) the usage of the underlying ML model (see Figure 2).

We can also divide XAI into model-specific and model-agnostic XAI, based on the usage of the underlying ML model. Model-specific XAI refers to any method that focuses on the explainability of a single specific AI model or a group of such models, whereas model-agnostic XAI places no emphasis on the underlying AI model.
Model-agnostic XAI can be used to assess most AI models and is often applied after training; thus, it is usually treated as a post-hoc method. Model-agnostic XAI relies on analyzing pairs of input and output features and has no access to the specific inner workings of AI models (e.g., weights or structural information), otherwise it would be impossible to decouple it from the black-box model [25]. Comparing the two families shows that model-specific XAI methods rely heavily on a model’s specific parameters, so any change in the model architecture may require significant changes to the interpretation method or the corresponding explainable algorithm. Thus, model-specific XAI methods cannot be easily extended to explain GNNs, whereas some model-agnostic XAI methods can.
2.2 Explaining Graph Neural Networks through XAI Methods
Convolutional neural networks (CNNs) can be applied to graph-structured data by extending the convolution operation onto graphs and, more generally, onto non-Euclidean spaces. This extension of CNNs to non-Euclidean spaces is referred to as graph convolutional neural networks (GCNNs). We can therefore adapt common explainability methods originally designed for CNNs and extend them to GCNNs. Indeed, a variety of XAI methods can be easily extended to GNNs, such as LRP [4], LIME [21], and Grad-CAM [23]. These extensions are summarized in Table 1.
Table 1: XAI methods and their extensions to GNNs.
XAI Method | XGNN Method
LRP | GNN-LRP [26, 22, 9]
Grad-CAM | Grad-CAM [20]
LIME | GraphLIME [13]
Layer-wise Relevance Propagation (LRP) assumes that the classifier can be decomposed into several computational layers and propagates the DNN’s output from the top layer back to the input layer, applying a propagation rule at each layer [4]. The contributions to the target output node are back-propagated to the input features to form a map of the features that contribute to that node. Therefore, LRP is useful for visualizing the contributions of input features to model predictions, especially for kernel-based classifiers and multi-layer neural networks.
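To make the propagation rule concrete, the following minimal sketch applies the widely used LRP-ε rule to a single linear layer; the function name, toy dimensions and ε value are illustrative assumptions rather than the exact formulation of [4].

```python
import numpy as np

def lrp_epsilon_linear(a, W, b, R_upper, eps=1e-6):
    """One LRP-epsilon backward step through a linear layer z = a @ W + b.

    a       : (d_in,)  activations entering the layer
    W       : (d_in, d_out) layer weights
    b       : (d_out,) layer bias
    R_upper : (d_out,) relevance already assigned to the layer's outputs
    returns : (d_in,)  relevance redistributed onto the layer's inputs
    """
    z = a @ W + b                    # forward pre-activations
    z = z + eps * np.sign(z)         # epsilon stabiliser keeps the division well-behaved
    s = R_upper / z                  # relevance per unit of output
    c = W @ s                        # each input's contribution to the outputs
    return a * c                     # redistribute relevance onto the inputs

# toy usage: 4 input features, 3 output neurons, all relevance on the first output
rng = np.random.default_rng(0)
a, W, b = rng.normal(size=4), rng.normal(size=(4, 3)), np.zeros(3)
R_in = lrp_epsilon_linear(a, W, b, R_upper=np.array([1.0, 0.0, 0.0]))
print(R_in, R_in.sum())              # with zero bias, relevance is approximately conserved
```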
Motivated by this, Sun et al. [26] used LRP on GNNs to obtain insights into the black box of GNN models. Schnake et al. [22] proposed GNN-LRP based on higher-order Taylor expansions; GNN-LRP produces detailed explanations that subsume the complex nested interactions between the GNN model and the input graph. Furthermore, Cho et al. [9] conducted post-hoc explanation of individual predictions with the use of LRP; here LRP calculates the relevance of every neuron by propagating backwards through the network from the predicted output to the input level, where the relevance represents the quantitative contribution of a given neuron to the prediction. Moreover, Baldassarre et al. [6] also applied LRP to graph models, computing saliency maps by decomposing the output prediction into a combination of its inputs.
Local Interpretable Model-Agnostic Explanations (LIME) is another popular XAI method. LIME extracts individual prediction instances from the black-box model and fits a simpler, explainable model (such as a linear model) to approximate the decision surface around them. This simple model can then be interpreted and used to explain the original black-box predictions [21]. Many papers have improved and extended LIME: Zhao et al. [40] introduced BayLIME, which combines LIME with Bayesian reasoning, and Zafar et al. [38] used the Jaccard similarity among multiple generated explanations to propose a deterministic version of LIME. LIME has also been widely applied to explain GNN models. Huang et al. [13] proposed GraphLIME, a local interpretable model explanation for graphs using the Hilbert-Schmidt Independence Criterion (HSIC) Lasso, a nonlinear feature selection method, to achieve local explainability. Their framework is a generic GNN explanation framework that learns a nonlinear interpretable model locally in the subgraph of the node being explained.
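As a rough sketch of this idea, the snippet below fits a sparse linear surrogate on a node’s neighbourhood; for simplicity it substitutes an ordinary Lasso for the HSIC Lasso used by GraphLIME, and the function and argument names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

def graphlime_like(X, preds, neighbors, alpha=0.01):
    """Fit a sparse linear surrogate on one node's k-hop neighbourhood.

    X         : (n_nodes, n_feats) node feature matrix.
    preds     : (n_nodes,) black-box GNN outputs (e.g. predicted probability per node).
    neighbors : indices of the node's k-hop neighbourhood, used as the local dataset.
    Returns per-feature coefficients; non-zero entries mark locally important features.
    """
    Z = X[neighbors]                 # local samples: the neighbourhood's features
    y = preds[neighbors]             # local targets: the GNN's outputs on those nodes
    surrogate = Lasso(alpha=alpha).fit(Z, y)
    return surrogate.coef_
```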
Gradient-weighted Class Activation Mapping (Grad-CAM) improves CAM by relaxing the architectural restriction that the penultimate layer must be convolutional [23]. It generates a coarse localization map that highlights the important regions of the input image by using the gradients of the target concept flowing into the final convolutional layer. Grad-CAM has been applied to a wide variety of convolutional neural network model families [23]. Pasa et al. [19] used it directly as a visualization tool to explain convolutional neural networks, and Vinogradova et al. [28] extended Grad-CAM to semantic segmentation, applying it locally to produce heatmaps showing the relevance of individual pixels. Grad-CAM can also be extended to GNNs: Pope et al. [20] described the extension of CNN explainability methods to GCNNs and introduced a Grad-CAM-based explanation method for decisions made by GCNNs, which generates heat-maps with respect to different layers of the network.
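A minimal sketch of this extension is given below: channel weights are obtained by averaging the class-score gradients over nodes at the last graph-convolution layer, and the weighted channel sum yields a node-level heat-map. It assumes the layer activations H were retained during the forward pass; names and the final normalisation are illustrative.

```python
import torch

def grad_cam_nodes(H, class_score):
    """Grad-CAM style node heat-map for a graph convolutional network.

    H           : (n_nodes, n_channels) activations of the last graph-conv layer,
                  kept in the autograd graph during the forward pass.
    class_score : scalar tensor, the logit of the class being explained.
    Returns a (n_nodes,) tensor of non-negative node importances in [0, 1].
    """
    grads = torch.autograd.grad(class_score, H, retain_graph=True)[0]  # d score / d H
    alpha = grads.mean(dim=0)                   # channel weights, averaged over nodes
    heat = torch.relu((H * alpha).sum(dim=1))   # weighted channel sum, then ReLU
    return heat / (heat.max() + 1e-12)          # normalise for visualisation
```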
3 Non-XAI-based Explainable Graph Neural Networks
Most XAI-based methods for XGNNs do not require knowledge of the internal parameters of the GNN model, and the XAI methods used to yield explanations are not specifically designed for GNN models. It is therefore not surprising that these methods may fail to give a satisfying explanation when one needs to explore the structure of a GNN model further, which is especially challenging for large and complicated models. To mitigate this issue, researchers have recently started to develop explainable methods tailored to GNN models by taking the characteristics of graph structures into account. There are three different ways to achieve this goal: (1) interpreting GNN models by finding important subgraphs; (2) interpreting GNN models by generating new graphs, where the generated graph is supposed to retain the most informative features (e.g., nodes, node features, and edges); (3) interpreting GNN models by injecting intermediate levels.
3.1 Interpretable GNN through Subgraphs
Interpreting GNNs through subgraphs covers a family of methods that use subgraphs to make GNN models interpretable; these methods typically focus on local features and output only the most important subgraph.
Ying et al. [33] proposed GNNExplainer, which interprets GNNs through subgraphs and is a model-agnostic approach that provides interpretable explanations for the predictions of any GNN-based model. GNNExplainer identifies a compact subgraph structure and a small subset of node features that play a crucial role in the GNN’s prediction. To explain a given node’s predicted label, GNNExplainer provides a local interpretation by highlighting relevant features as well as an important subgraph structure, identifying the edges that are most relevant to the prediction. This pioneering method can explain any GNN on any task for which mutual information is an appropriate objective, by finding both important subgraphs and important feature subsets.
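The core optimisation can be sketched as learning a soft mask over edges that preserves the model’s prediction while staying small and near-binary. The snippet below is a simplified illustration: it assumes a GNN whose forward pass accepts per-edge weights and a (2, E) edge_index, and it approximates the mutual-information objective with a cross-entropy term plus size and entropy regularisers, in the spirit of the original paper.

```python
import torch

def explain_edges(gnn, x, edge_index, target, epochs=200, lam_size=0.005, lam_ent=1.0):
    """GNNExplainer-style soft edge mask (a sketch).

    gnn(x, edge_index, edge_weight) -> logits of shape (1, n_classes) for the instance
    being explained; target is a LongTensor of shape (1,) with the label to preserve.
    """
    mask = torch.nn.Parameter(torch.randn(edge_index.size(1)))   # one logit per edge
    opt = torch.optim.Adam([mask], lr=0.01)
    for _ in range(epochs):
        opt.zero_grad()
        m = torch.sigmoid(mask)                                  # soft mask in (0, 1)
        logits = gnn(x, edge_index, edge_weight=m)
        pred_loss = torch.nn.functional.cross_entropy(logits, target)
        size_loss = lam_size * m.sum()                           # prefer small subgraphs
        ent_loss = lam_ent * (-m * (m + 1e-12).log()
                              - (1 - m) * (1 - m + 1e-12).log()).mean()  # push mask to 0/1
        (pred_loss + size_loss + ent_loss).backward()
        opt.step()
    return torch.sigmoid(mask).detach()                          # edge importance scores
```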
Subsequently, Zhang et al. [39] proposed a model-agnostic relational model explainer called RelEx, which treats the underlying model as a black box and learns relational explanations. RelEx constructs explanations in two steps: learning a local differentiable approximation of the black-box model, and then learning an interpretable mask over the local approximation with the use of subgraphs. It provides flexible explanations for end users, and its local approximator of the GNN model is locally faithful and differentiable with respect to the input adjacency matrix.
In addition, Lin et al. [15] presented a model-agnostic framework, called Graph neural networks Including SparSe inTerpretability (GISST), for interpreting important graph structure and node features; it discards unimportant nodes and features by inducing sparsity. GISST obtains the important subgraph and feature subset by learning importance probabilities over the adjacency matrix and the node feature matrix. Vu et al. [29] proposed a model-agnostic explainer called Probabilistic Graphical Model for GNNs (PGM-Explainer), which identifies crucial graph components to generate an explanation. PGM-Explainer produces a simpler, interpretable Bayesian network that can illustrate the dependencies among explained features and provide deeper explanations for a GNN’s predictions. Yuan et al. [37] proposed SubgraphX to explain GNNs by identifying important subgraphs. The information aggregation procedures in GNNs can be interpreted as interactions among different graph structures, so the authors used Shapley values to measure the importance of subgraphs by capturing such interactions within the information aggregation range, and employed a Monte Carlo tree search algorithm to efficiently explore different subgraphs for a given input graph. SubgraphX thus explains GNNs by identifying subgraphs explicitly.
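The Shapley scoring step of such approaches can be sketched as a Monte Carlo estimate that treats the candidate subgraph as a single player. The helper below is hypothetical (score_fn stands for the GNN’s class probability when only a given node set is kept) and omits SubgraphX’s Monte Carlo tree search as well as its restriction of coalitions to the aggregation neighbourhood.

```python
import random

def shapley_of_subgraph(score_fn, candidate, others, n_samples=100, seed=0):
    """Monte Carlo Shapley estimate for one candidate subgraph (a sketch).

    score_fn(nodes) : GNN's predicted probability for the target class when only
                      the node set `nodes` is kept in the input graph.
    candidate       : set of nodes forming the subgraph whose importance we estimate.
    others          : list of the remaining "player" nodes.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # uniform coalition size + uniform subset == sampling a random permutation
        # and taking the players that precede the candidate
        k = rng.randint(0, len(others))
        coalition = set(rng.sample(others, k))
        total += score_fn(coalition | candidate) - score_fn(coalition)  # marginal contribution
    return total / n_samples
```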
3.2 Interpretable GNN through Graphs Generation
Instead of focusing on subgraphs, interpreting GNNs through graph generation takes the whole (global) graph structure into consideration: a new graph is generated that contains only the structure necessary for the decision made by the GNN.
Similar to PGM-Explainer, which analyses the explained features via conditional probabilities, Luo et al. [16] proposed a model-agnostic explainable GNN method called PGExplainer. PGExplainer provides explanations for GNNs by generating a probabilistic graph; it naturally provides model-level explanations for each instance with a global view of the GNN model and has better generalization ability. On the other hand, Yuan et al. [35] proposed XGNN, which provides model-level explanations without preserving local fidelity. XGNN applies reinforcement learning to generate important graphs that explain the predictions made by GNN models; it generates graph patterns that maximize a certain prediction of the model and can thus provide high-level insights and a generic understanding of how GNNs work.
3.3 Interpretable GNN through Intermediate Levels Injection
Interpreting GNNs through intermediate-level injection directly encodes knowledge or information, such as a factor graph, into the model architecture. For example, the Factor Graph Neural Network (FGNN) model of Ma et al. [17] directly encodes biological knowledge such as the Gene Ontology into the model architecture. Each node in the FGNN model corresponds to a biological entity such as a gene or a Gene Ontology term, making the model transparent and interpretable.
In addition, Yang et al. [32] proposed Graph-based neural networks for Image Privacy (GIP) to infer the privacy risk of images. GIP mainly focuses on the objects in an image, and its knowledge graph is extracted from the objects in the dataset without relying on extra knowledge. The results showed that introducing the knowledge graph not only makes the deep model more explainable but also makes better use of the object information provided by the images. Furthermore, Yu et al. [34] proposed Scene-Graph Autonomous Driving Systems (SGADS), which use scene graphs as intermediate representations to address limitations of Autonomous Driving Systems (ADS); the spatial and temporal attention components used in their approach improve both its performance and its explainability. Noutahi et al. [18] proposed LaPool, an interpretable hierarchical graph pooling method that uses Laplacian pooling as an intermediate step to capture the relative importance of interactions between molecular substructures. LaPool takes both node features and graph structure into account to improve molecular representations.
4 Evaluation Metrics for GNN Explainers
Since explainers are used to explain why a certain decision has been made rather than to depict the whole black box, there is uncertainty about the fidelity of the explainer itself. It is therefore crucial to use the right metrics to evaluate the correctness and completeness of interpretability techniques. Recently, GraphFramEx [3] and GRAPHXAI [2] have focused on defining explainability metrics to evaluate the fidelity of GNN explanations. In addition, some evaluation metrics for XAI [41] can also be applied to GNN explainers. This section provides a short review of the prevalent evaluation metrics for GNN explanations. Generally, a GNN explainer is evaluated from two aspects: performance and explanatory capability. Specifically, explanatory capability can be evaluated through qualitative analyses and quantitative analyses, the latter including accuracy evaluation and explainability evaluation. The taxonomy of metrics can be found in Figure 3.

4.1 Performance Evaluation
Efficiency.
An efficient graph explanation algorithm should be able to provide explanations for a large number of decisions made by a machine learning model quickly and with minimal computational resources. This is particularly important in scenarios where real-time decision-making is required or where the volume of data is extremely large. In addition to being time- and resource-efficient, an efficient graph explanation algorithm should also produce explanations that are accurate, interpretable, and fair. Achieving a balance between efficiency and accuracy/fairness is an active research area in the field of graph explanation. In [5], the authors evaluate efficiency by comparing the average computation time taken for inference on unseen graph samples.
Sanity Check (SC).
Explainers should provide explanations for GNNs by finding both the important subgraphs and the important features that play a crucial role in the GNN’s prediction. Good GNN explainers should therefore be sensitive to changes in the target GNN model. A sanity check [30] is one way to evaluate this sensitivity. Specifically, SC compares the attribution scores obtained on the trained GNN with those obtained on an untrained GNN with randomly initialized parameters. Similar attributions imply that the explainer is insensitive to the properties of the model and thus fails the check; hence, a lower SC is desired. The definition of SC is shown in Eq. 1.
\[
\mathrm{SC} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{sim}\big(A(f_{\theta}, G_i),\, A(f_{\tilde{\theta}}, G_i)\big) \tag{1}
\]
where $A(\cdot)$ denotes the attribution scores produced by the explainer, $f_{\theta}$ is the trained GNN, $f_{\tilde{\theta}}$ is the randomly initialized GNN, $\mathrm{sim}(\cdot,\cdot)$ is a similarity measure (e.g., cosine similarity), and $N$ is the total number of samples.
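A minimal sketch of such a check, assuming the attribution vectors for the trained and the randomly initialised model have already been computed, is:

```python
import numpy as np

def sanity_check(attr_trained, attr_random):
    """Cosine similarity between attributions from the trained model and from an
    identically structured but randomly initialised one (both 1-D importance vectors
    over the same features/edges). High similarity means the explainer ignores the
    learned parameters, i.e. it fails the sanity check."""
    a, b = np.asarray(attr_trained, float), np.asarray(attr_random, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```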
Robustness.
Robustness means that the explanations produced by an interpretation method resist attacks such as input corruption or perturbation, adversarial attacks, and model manipulation. A robust interpretation method provides similar explanations despite the presence of such attacks [16, 39]. The authors of [5] compute robustness by quantifying how much an explanation changes after adding noise to the input graph.
4.2 Explanatory Evaluation: Qualitative analyses
Qualitative analyses.
Qualitative analyses are an important aspect of explainability in Graph Neural Networks (GNNs). These analyses involve examining the internal workings of a GNN to gain insight into how it makes decisions. This can be accomplished through techniques such as visualization, feature importance analysis, and interpretation of node and edge embeddings. By conducting qualitative analyses, researchers and practitioners can better understand the factors that contribute to GNN decision-making, identify potential biases or errors, and improve the overall transparency and interpretability of the model. Ultimately, qualitative analyses are crucial for ensuring that GNNs are trustworthy and can be used effectively in real-world applications. Qualitative analyses have been widely used in recent research, such as GNNExplainer [33], PGExplainer [16], GAN-GNNExplainer [14], etc.
4.3 Explanatory Evaluation: Quantitative analyses
4.3.1 Accuracy Evaluation
Accuracy evaluation refers to the process of assessing the correctness and fidelity of the explanations generated by an algorithm or model. Accurate explanations are essential for building trust in the machine learning model’s decision-making process and for ensuring fairness and transparency. Therefore, accuracy evaluation is a crucial step in developing and evaluating graph explanation algorithms.
Accuracy (ACC).
ACC is the proportion of explanations that are “correct”. There are two ways to measure the accuracy of explainable methods. First, one can use the fraction of the ground-truth important features (e.g., nodes, node features, or edges) that are identified by the explainer [16, 33, 39] (see Eq. 2):
\[
\mathrm{ACC} = \frac{1}{N}\sum_{i=1}^{N} \frac{|\mathcal{S}_i \cap \mathcal{T}_i|}{|\mathcal{T}_i|} \tag{2}
\]
where $\mathcal{T}_i$ denotes the ground-truth important features of sample $i$, $\mathcal{S}_i$ denotes the important features identified by the explainable method, and $N$ is the total number of samples. While this approach is simple and intuitive, it requires ground-truth explanations for the datasets, which are often hard to obtain in the real world. The other definition is explanation accuracy.
Explanation Accuracy.
This metric is derived from the perspective of model predictions and measures prediction accuracy [14]: the predictions of the target GNN on the explanations are used to calculate the accuracy of the explanation, as defined in Eq. 3:
\[
\mathrm{ACC}_{\mathrm{exp}} = \frac{N_{\mathrm{correct}}}{N}, \qquad N_{\mathrm{correct}} = \big|\{\, i \mid f(G_s^{i}) = f(G^{i}) \,\}\big| \tag{3}
\]
where $f$ is the pre-trained target GNN, $G^{i}$ is the original graph we want to explain, $G_s^{i}$ is its corresponding explanation (e.g., the important subgraph), $N_{\mathrm{correct}}$ is the number of correctly classified explanations, i.e., those for which $f(G_s^{i}) = f(G^{i})$, and $N$ is the size of the test set.
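Both accuracy notions can be computed with a few lines of code; the sketch below assumes boolean importance masks and predicted labels are already available, and all names are illustrative.

```python
import numpy as np

def explanation_accuracy_metrics(gt_masks, pred_masks, f_orig, f_expl):
    """Sketch of the two accuracy notions in Eqs. (2)-(3).

    gt_masks, pred_masks : lists of boolean arrays marking the ground-truth and the
                           identified important features of each sample (Eq. 2).
    f_orig, f_expl       : arrays of the target GNN's predicted labels on the original
                           graphs and on their explanation subgraphs (Eq. 3).
    """
    # Eq. (2): average fraction of ground-truth important features that were recovered
    acc = float(np.mean([(gt & pr).sum() / max(gt.sum(), 1)
                         for gt, pr in zip(gt_masks, pred_masks)]))
    # Eq. (3): how often the explanation alone yields the same prediction as the full graph
    acc_exp = float(np.mean(np.asarray(f_orig) == np.asarray(f_expl)))
    return acc, acc_exp
```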
4.3.2 Explainability Evaluation
Fidelity.
Fidelity measures whether the explanations are faithfully important to the model’s predictions. The Fidelity+ metric [36, 37] captures the difference in predicted probability between the original prediction and the new prediction obtained after removing the important input features. In contrast, the Fidelity− metric [36] captures the prediction change when only the important input features are kept and the unimportant structures are removed.
\[
\mathrm{Fidelity}{+} = \frac{1}{N}\sum_{i=1}^{N}\Big( f(G_i)_{y_i} - f(G_i^{1-s})_{y_i} \Big) \tag{4}
\]
\[
\mathrm{Fidelity}{-} = \frac{1}{N}\sum_{i=1}^{N}\Big( f(G_i)_{y_i} - f(G_i^{s})_{y_i} \Big) \tag{5}
\]
where $N$ is the total number of samples and $y_i$ is the class label of sample $i$. $f(G_i)_{y_i}$ and $f(G_i^{1-s})_{y_i}$ are the prediction probabilities for class $y_i$ when using the original graph $G_i$ and the occluded graph $G_i^{1-s}$, which is obtained by removing the important features found by the explainer from the original graph; thus, a higher Fidelity+ is desired. $f(G_i^{s})_{y_i}$ is the prediction probability for class $y_i$ when using the explanation graph $G_i^{s}$, which consists of the important structures found by the explainable method; thus, a lower Fidelity− is desired. Specifically, Fidelity+ and Fidelity− quantify the necessity and sufficiency of the explanations, respectively: the higher Fidelity+, the more necessary the explanation; conversely, the lower Fidelity−, the more sufficient the explanation.
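Given the predicted probabilities for the true class on the three graph variants, both fidelity scores reduce to simple averages, as sketched below (array names are illustrative).

```python
import numpy as np

def fidelity_scores(p_orig, p_occluded, p_explained):
    """Fidelity+ / Fidelity- as in Eqs. (4)-(5), sketched for N samples.

    Each argument is an (N,) array of the model's predicted probability for the true
    class y_i, computed on: the original graph, the graph with the important features
    removed, and the explanation subgraph alone, respectively."""
    fid_plus = float(np.mean(p_orig - p_occluded))    # drop when important parts are removed
    fid_minus = float(np.mean(p_orig - p_explained))  # drop when only important parts are kept
    return fid_plus, fid_minus
```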
Characterization Score.
The characterization score [3, 7] is a global evaluation metric that balances the sufficiency and necessity requirements. This approach is analogous to combining precision and recall in the Micro-F1 metric. The characterization score is the weighted harmonic mean of Fidelity+ and the complement of Fidelity−, as defined below:
\[
\mathrm{charact} = \frac{w_{+} + w_{-}}{\frac{w_{+}}{\mathrm{Fidelity}{+}} + \frac{w_{-}}{1 - \mathrm{Fidelity}{-}}} \tag{6}
\]
where $w_{+}$ and $w_{-}$ are the weights assigned to the two fidelity terms.
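A direct implementation of Eq. 6 is shown below; it assumes Fidelity+ > 0 and Fidelity− < 1 so that the harmonic mean is defined, and equal weights by default.

```python
def characterization_score(fid_plus, fid_minus, w_plus=0.5, w_minus=0.5):
    """Weighted harmonic mean combining both fidelity aspects, as in Eq. (6);
    Fidelity- enters through its complement (1 - fid_minus)."""
    return (w_plus + w_minus) / (w_plus / fid_plus + w_minus / (1.0 - fid_minus))
```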
Sparsity.
Sparsity measures how small a fraction of the input an explanation method selects as important: sparser explanations select fewer features [20, 37]. It is defined in Eq. 7:
\[
\mathrm{Sparsity} = \frac{1}{N}\sum_{i=1}^{N}\Big(1 - \frac{|m_i|}{|M_i|}\Big) \tag{7}
\]
where $|M_i|$ is the total number of features (e.g., nodes, node features, or edges) in the original graph, $|m_i|$ is the number of important features found by the explainable method ($m_i \subseteq M_i$), and $N$ is the total number of samples. Higher sparsity values indicate that explanations are sparser and likely to capture only the most essential input information; hence, a higher Sparsity is desired.
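Sparsity can likewise be computed directly from the importance masks, as in the following sketch (argument names are illustrative).

```python
import numpy as np

def sparsity(pred_masks, totals):
    """Sparsity as in Eq. (7): average fraction of the input NOT marked important.

    pred_masks : list of boolean arrays (important features selected per sample)
    totals     : list of total feature counts |M_i| per sample"""
    return float(np.mean([1.0 - m.sum() / t for m, t in zip(pred_masks, totals)]))
```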
Contrastivity (CST).
CST is the ratio-based Hamming distance between the binarized heat-maps produced for the positive and the negative class [20]. The underlying idea behind contrastivity is that the features highlighted by an explanation method should vary across classes. Pope et al. [20] and Wang et al. [30] defined and used CST to evaluate the explainability of their methods. Contrastivity can be defined as shown in Eq. 8; since class-discriminative explanations should differ across classes, a higher CST is desired.
\[
\mathrm{CST} = \frac{d_{H}\!\big(\hat{m}^{+}, \hat{m}^{-}\big)}{\lVert \hat{m}^{+} \vee \hat{m}^{-} \rVert_{1}} \tag{8}
\]
where $\hat{m}^{+}$ and $\hat{m}^{-}$ are the binarized heat-maps for the positive and negative class, respectively, and $d_{H}$ denotes the Hamming distance.
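Following the reconstruction above (Hamming distance normalised by the union of the two binarised maps, one common formulation), contrastivity can be sketched as:

```python
import numpy as np

def contrastivity(mask_pos, mask_neg):
    """Contrastivity in the spirit of Eq. (8): normalised Hamming distance between
    the binarised heat-maps produced for the positive and the negative class."""
    mask_pos = np.asarray(mask_pos, bool)
    mask_neg = np.asarray(mask_neg, bool)
    union = np.logical_or(mask_pos, mask_neg).sum()
    return float(np.logical_xor(mask_pos, mask_neg).sum() / max(union, 1))
```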
Stability.
Graph explanation stability refers to the ability of an explanation method to produce consistent explanations for a Graph Neural Network (GNN) even when the input graph is slightly altered or perturbed. This is important for ensuring the reliability and interpretability of the model’s decisions. In [2], the authors measure graph explanation stability by computing an instability degree, calculated as in Eq. 9.
\[
\mathrm{Instability} = \max_{u' \in \mathcal{B}_{\delta}(u)} D\big(M_u, M_{u'}\big) \tag{9}
\]
Here, $\mathcal{S}_u$ is the subgraph of node $u$ and $\mathcal{S}_{u'}$ is the subgraph of a perturbed node $u'$; $D$ denotes the cosine distance metric; $M_u$ and $M_{u'}$ are the predicted explanation masks for $\mathcal{S}_u$ and $\mathcal{S}_{u'}$; and $\mathcal{B}_{\delta}(u)$ represents a $\delta$-radius ball around $u$ within which the model behavior is the same.
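In code, the instability degree amounts to the worst-case cosine distance between the original explanation mask and the masks obtained for perturbed counterparts, as sketched below (argument names are illustrative).

```python
import numpy as np

def instability(mask_original, masks_perturbed):
    """Instability in the spirit of Eq. (9): worst-case cosine distance between the
    original node's explanation mask and the masks of its perturbed counterparts."""
    a = np.asarray(mask_original, float)
    dists = []
    for m in masks_perturbed:
        b = np.asarray(m, float)
        cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        dists.append(1.0 - cos)        # cosine distance
    return max(dists)
```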
Fairness.
Fairness is the notion that explanations provided by machine learning models should be accurate and fair, and should not perpetuate or amplify existing biases; it promotes transparency, accountability, and fairness in decision-making processes. In [2], the authors propose the Graph Explanation Counterfactual Fairness mismatch (GECF) and the Graph Explanation Group Fairness mismatch (GEGF) to evaluate explanations on the respective datasets. To measure counterfactual fairness, they verify whether the explanations for a node $u$ and its counterfactual counterpart $u'$ are similar when the underlying model predictions are similar. The counterfactual fairness mismatch is calculated as:
\[
\mathrm{GECF} = D\big(M_u, M_{u'}\big) \tag{10}
\]
where $M_u$ and $M_{u'}$ are the predicted explanation masks for $\mathcal{S}_u$ and for $\mathcal{S}_{u'}$, the counterfactual counterpart of $u$, and $D$ is a distance between masks (e.g., cosine distance). It should be noted that a lower GECF score is anticipated for graphs whose ground-truth explanations exhibit weak forms of unfairness, because the explanations for the original and counterfactual graphs are likely to be similar. In contrast, for graphs whose ground-truth explanations exhibit strong forms of unfairness, a higher GECF score is expected, because modifying the protected attribute is likely to change the explanations provided by the model.
They measure group fairness mismatch as follows:
\[
\mathrm{GEGF} = \Big| \mathrm{SP}\big(\{\hat{y}_k\}_{k=1}^{K}\big) - \mathrm{SP}\big(\{\hat{y}'_k\}_{k=1}^{K}\big) \Big| \tag{11}
\]
where $\hat{y}_k$ and $\hat{y}'_k$ are the predictions for a set of $K$ graphs using the original features and the essential features identified by an explanation, respectively, and $\mathrm{SP}$ denotes statistical parity. Higher values of GEGF indicate that the explanation does not preserve group fairness.
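The two fairness mismatches can be sketched as follows, assuming binary predictions, a binary protected attribute with both groups present, and cosine distance between explanation masks (these choices are illustrative assumptions).

```python
import numpy as np

def gecf(mask_u, mask_u_cf):
    """Counterfactual-fairness mismatch in the spirit of Eq. (10): distance between
    the explanation masks of a node and of its counterfactual counterpart."""
    a, b = np.asarray(mask_u, float), np.asarray(mask_u_cf, float)
    return 1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def statistical_parity(y_hat, protected):
    """|P(y_hat = 1 | protected = 0) - P(y_hat = 1 | protected = 1)|.
    Assumes binary predictions and a binary protected attribute, both groups non-empty."""
    y_hat, protected = np.asarray(y_hat), np.asarray(protected)
    return abs(y_hat[protected == 0].mean() - y_hat[protected == 1].mean())

def gegf(y_orig, y_expl, protected):
    """Group-fairness mismatch in the spirit of Eq. (11): change in statistical parity
    when predictions are made from the explanation's essential features only."""
    return abs(statistical_parity(y_orig, protected) - statistical_parity(y_expl, protected))
```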
There are various evaluation metrics, each with its respective emphasis, reflecting different aspects of an explainable model. One should therefore use a combination of multiple metrics to attain reasonable and practical explainable systems. As mentioned above, it is also important to take the characteristics of the datasets and explainable methods into account when choosing suitable evaluation metrics.
5 Discussion
This survey focuses on providing a clear taxonomy of explainable GNNs. After analyzing the literature on explainable GNNs, we summarize the open problems as follows.
• How to explain graph neural networks? There are two major perspectives:
– Treat the GNN as a black box and find an independent way to explain the links between input and output, as in GraphLIME or RelEx.
– Explain the details of the GNN by leveraging the information of its own nodes and edges.
• How to extend XAI methods to graph neural networks? There are studies that use XAI methods to explain GNNs (see Section 2.2). XAI methods such as Saliency Maps, LRP, LIME, Guided BP, and Grad-CAM achieve competitive performance for XAI and can be extended to explain GNNs. However, these methods are not specifically designed for GNNs and require knowledge of the internal parameters of the model.
• How to find the most important subgraph structure that influences the predictions of graph neural networks? As mentioned in Section 3.1, several methods explain GNNs by focusing on subgraph structures. For example, GNNExplainer identifies a compact subgraph structure and a small subset of node features that may play a crucial role in the GNN’s prediction, while PGM-Explainer and GISST generate explanations by yielding an important subgraph and node feature subset for any graph-based task. However, these methods focus only on subgraph structures, which carry local information, and fail to consider global features.
• How to explain graph neural networks from a global perspective? Instead of the fragmented information obtained from local graph structure, the global structure can often provide more interesting and complete information. For instance, PGExplainer focuses on explaining complete graph structures and provides a global understanding of the predictions made by GNNs; it can explain the predictions of GNNs on a set of instances collectively and easily generalize the learned explainer model to other instances.
6 Conclusion and Future Work
By analyzing recent studies of XGNNs in detail, we observe that the interpretability of GNNs can be increased not only through the direct use of XAI methods but also by creating novel XGNN methods that do not involve current XAI approaches to improve the interpretability or transparency of GNN models. We also find that studies on explainable GNNs are still at an early stage. We therefore present the following future research directions.
The ultimate goal of XGNNs is to give humans clues about how a GNN model makes its decisions, so that they can decide whether to trust its predictions. Humans hence play a vital role in this explanation loop. It is therefore an interesting research direction to incorporate human experience (knowledge and feedback) into this procedure to achieve better explainable algorithms or models.
Although several works leverage existing XAI methods to achieve explainability of GNNs, there are no unified rules to guide this procedure. It is worth developing such rules so that more existing XAI methods can contribute to the field of XGNNs. Furthermore, each XAI method has its pros and cons; it is therefore interesting to explore combinations of different XAI methods (e.g., in an ensemble manner) to obtain better explainability.
References
- [1] Adadi, A., Berrada, M.: Peeking inside the black-box: a survey on explainable artificial intelligence (xai). IEEE access 6, 52138–52160 (2018)
- [2] Agarwal, C., Queen, O., Lakkaraju, H., Zitnik, M.: Evaluating explainability for graph neural networks. CoRR abs/2208.09339 (2022). https://doi.org/10.48550/arXiv.2208.09339, https://doi.org/10.48550/arXiv.2208.09339
- [3] Amara, K., Ying, R., Zhang, Z., Han, Z., Shan, Y., Brandes, U., Schemm, S., Zhang, C.: Graphframex: Towards systematic evaluation of explainability methods for graph neural networks. CoRR abs/2206.09677 (2022). https://doi.org/10.48550/arXiv.2206.09677, https://doi.org/10.48550/arXiv.2206.09677
- [4] Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one 10(7), e0130140 (2015)
- [5] Bajaj, M., Chu, L., Xue, Z.Y., Pei, J., Wang, L., Lam, P.C., Zhang, Y.: Robust counterfactual explanations on graph neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y.N., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual. pp. 5644–5655 (2021), https://proceedings.neurips.cc/paper/2021/hash/2c8c3a57383c63caef6724343eb62257-Abstract.html
- [6] Baldassarre, F., Azizpour, H.: Explainability techniques for graph convolutional networks. arXiv preprint arXiv:1905.13686 (2019)
- [7] Cai, R., Zhu, Y., Chen, X., Fang, Y., Wu, M., Qiao, J., Hao, Z.: On the probability of necessity and sufficiency of explaining graph neural networks: A lower bound optimization approach. CoRR abs/2212.07056 (2022). https://doi.org/10.48550/arXiv.2212.07056, https://doi.org/10.48550/arXiv.2212.07056
- [8] Chakraborti, T., Sreedharan, S., Kambhampati, S.: The emerging landscape of explainable automated planning & decision making. In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20. pp. 4803–4811 (2020)
- [9] Cho, H., Lee, E.K., Choi, I.S.: Interactionnet: Modeling and explaining of noncovalent protein-ligand interactions with noncovalent graph neural network and layer-wise relevance propagation. arXiv preprint arXiv:2005.13438 (2020)
- [10] Das, A., Rad, P.: Opportunities and challenges in explainable artificial intelligence (xai): A survey. arXiv preprint arXiv:2006.11371 (2020)
- [11] Goh, G.B., Hodas, N.O., Siegel, C., Vishnu, A.: Smiles2vec: An interpretable general-purpose deep neural network for predicting chemical properties. arXiv preprint arXiv:1712.02034 (2017)
- [12] Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.: A survey of methods for explaining black box models. ACM computing surveys (CSUR) 51(5), 1–42 (2018)
- [13] Huang, Q., Yamada, M., Tian, Y., Singh, D., Yin, D., Chang, Y.: Graphlime: Local interpretable model explanations for graph neural networks. arXiv preprint arXiv:2001.06216 (2020)
- [14] Li, Y., Zhou, J., Zheng, B., Chen, F.: Ganexplainer: Gan-based graph neural networks explainer. CoRR abs/2301.00012 (2023). https://doi.org/10.48550/arXiv.2301.00012, https://doi.org/10.48550/arXiv.2301.00012
- [15] Lin, C., Sun, G.J., Bulusu, K.C., Dry, J.R., Hernandez, M.: Graph neural networks including sparse interpretability. arXiv preprint arXiv:2007.00119 (2020)
- [16] Luo, D., Cheng, W., Xu, D., Yu, W., Zong, B., Chen, H., Zhang, X.: Parameterized explainer for graph neural network. arXiv preprint arXiv:2011.04573 (2020)
- [17] Ma, T., Zhang, A.: Incorporating biological knowledge with factor graph neural network for interpretable deep learning. arXiv preprint arXiv:1906.00537 (2019)
- [18] Noutahi, E., Beaini, D., Horwood, J., Giguère, S., Tossou, P.: Towards interpretable sparse graph representation learning with laplacian pooling. arXiv preprint arXiv:1905.11577 (2019)
- [19] Pasa, F., Golkov, V., Pfeiffer, F., Cremers, D., Pfeiffer, D.: Efficient deep network architectures for fast chest x-ray tuberculosis screening and visualization. Scientific reports 9(1), 1–9 (2019)
- [20] Pope, P.E., Kolouri, S., Rostami, M., Martin, C.E., Hoffmann, H.: Explainability methods for graph convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 10772–10781 (2019)
- [21] Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. pp. 1135–1144 (2016)
- [22] Schnake, T., Eberle, O., Lederer, J., Nakajima, S., Schütt, K.T., Müller, K.R., Montavon, G.: Xai for graphs: Explaining graph neural network predictions by identifying relevant walks. arXiv preprint arXiv:2006.03589 (2020)
- [23] Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-cam: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision. pp. 618–626 (2017)
- [24] Srinivasan, R., Chander, A.: Explanation perspectives from the cognitive sciences—a survey. In: 29th International Joint Conference on Artificial Intelligence. pp. 4812–4818
- [25] Stiglic, G., Kocbek, P., Fijacko, N., Zitnik, M., Verbert, K., Cilar, L.: Interpretability of machine learning-based prediction models in healthcare. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10(5), e1379 (2020)
- [26] Sun, J., Lapuschkin, S., Samek, W., Zhao, Y., Cheung, N.M., Binder, A.: Explanation-guided training for cross-domain few-shot classification. arXiv preprint arXiv:2007.08790 (2020)
- [27] Tjoa, E., Guan, C.: A survey on explainable artificial intelligence (xai): Toward medical xai. IEEE Transactions on Neural Networks and Learning Systems (2020)
- [28] Vinogradova, K., Dibrov, A., Myers, G.: Towards interpretable semantic segmentation via gradient-weighted class activation mapping. arXiv preprint arXiv:2002.11434 (2020)
- [29] Vu, M.N., Thai, M.T.: Pgm-explainer: Probabilistic graphical model explanations for graph neural networks. arXiv preprint arXiv:2010.05788 (2020)
- [30] Wang, X., Wu, Y., Zhang, A., Feng, F., He, X., Chua, T.: Reinforced causal explainer for graph neural networks. CoRR abs/2204.11028 (2022). https://doi.org/10.48550/arXiv.2204.11028, https://doi.org/10.48550/arXiv.2204.11028
- [31] Wu, S., Zhang, W., Sun, F., Cui, B.: Graph neural networks in recommender systems: A survey. arXiv preprint arXiv:2011.02260 (2020)
- [32] Yang, G., Cao, J., Chen, Z., Guo, J., Li, J.: Graph-based neural networks for explainable image privacy inference. Pattern Recognition 105, 107360 (2020)
- [33] Ying, R., Bourgeois, D., You, J., Zitnik, M., Leskovec, J.: Gnn explainer: A tool for post-hoc explanation of graph neural networks. arXiv preprint arXiv:1903.03894 (2019)
- [34] Yu, S.Y., Malawade, A.V., Muthirayan, D., Khargonekar, P.P., Al Faruque, M.A.: Scene-graph augmented data-driven risk assessment of autonomous vehicle decisions. IEEE Transactions on Intelligent Transportation Systems (2021)
- [35] Yuan, H., Tang, J., Hu, X., Ji, S.: Xgnn: Towards model-level explanations of graph neural networks. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. pp. 430–438 (2020)
- [36] Yuan, H., Yu, H., Gui, S., Ji, S.: Explainability in graph neural networks: A taxonomic survey. CoRR abs/2012.15445 (2020), https://arxiv.org/abs/2012.15445
- [37] Yuan, H., Yu, H., Wang, J., Li, K., Ji, S.: On explainability of graph neural networks via subgraph explorations (2021)
- [38] Zafar, M.R., Khan, N.M.: Dlime: a deterministic local interpretable model-agnostic explanations approach for computer-aided diagnosis systems. arXiv preprint arXiv:1906.10263 (2019)
- [39] Zhang, Y., Defazio, D., Ramesh, A.: Relex: A model-agnostic relational model explainer. arXiv preprint arXiv:2006.00305 (2020)
- [40] Zhao, X., Huang, X., Robu, V., Flynn, D.: Baylime: Bayesian local interpretable model-agnostic explanations. arXiv preprint arXiv:2012.03058 (2020)
- [41] Zhou, J., Gandomi, A.H., Chen, F., Holzinger, A.: Evaluating the quality of machine learning explanations: A survey on methods and metrics. Electronics 10(5), 593 (2021)