A Survey on Spectral Graph Neural Networks
Abstract
Graph neural networks (GNNs) have attracted considerable attention from the research community. It is well established that GNNs are usually divided into spatial and spectral methods. Although spectral GNNs play an important role in both graph signal processing and graph representation learning, existing studies are biased toward spatial approaches, and there is no comprehensive review of spectral GNNs so far. In this paper, we summarize the recent development of spectral GNNs, including models, theory, and applications. Specifically, we first discuss the connection between spatial and spectral GNNs, which shows that spectral GNNs can capture global information and offer better expressiveness and interpretability. Next, we categorize existing spectral GNNs according to the spectral information they use, i.e., eigenvalues or eigenvectors. In addition, we review major theoretical results and applications of spectral GNNs, followed by a quantitative experiment that benchmarks popular spectral GNNs. Finally, we conclude the paper with some future directions.
1 Introduction
Graph neural networks (GNNs) have shown strong performance in graph-related tasks, including node classification (Kipf and Welling, 2017; Velickovic et al., 2018), link prediction (Zhang and Chen, 2018), and graph classification (Xu et al., 2019b). In general, GNNs can be divided into spatial and spectral methods according to different definitions of graph convolution. Spatial methods, also known as message-passing neural networks (MPNNs) (Gilmer et al., 2017), perform graph convolution in the vertex domain by aggregating neighborhood information along graph structures. In contrast, spectral methods use spectral graph theory (Spielman, 2007) to transform convolutions into products in the frequency domain. Over the past few years, research on GNNs has focused on spatial methods because of their flexibility and scalability (Hamilton et al., 2017), while research on spectral methods remains relatively under-explored.
Spectral GNNs have related but different views from spatial methods and play an important role in graph representation learning. First, spatial and spectral GNNs capture different information. Spatial GNNs aggregate node features layer by layer. Nodes can only capture information within a fixed distance, emphasizing local information. In contrast, spectral GNNs transform all node features into weighted sums of different eigenvectors via graph Fourier transform, which naturally captures global information. Second, spatial and spectral GNNs use different design principles. Spatial GNNs involve node-based updating, where the gradients flow between the connected nodes only and therefore have lower complexity, but their expressive power is bounded from above by the 1-Weisfeiler-Lehman (WL) test (Xu et al., 2019b). Spectral GNNs involve feature-based updating, where each feature is filtered by the fully-connected eigenspaces, and messages are passed among all nodes simultaneously. Although more complex, feature-based updating breaks the localized property and is thus more powerful than node-based updating (Balcilar et al., 2021). Third, spatial and spectral GNNs have different interpretability. Spatial GNNs require a post-hoc interpretation strategy, which aims to find the most important graph structures related to a prediction, e.g., nodes, edges, or subgraphs (Ying et al., 2019). Spectral GNNs are interpretable models in which the learned graph filters can directly state the most important frequency information associated with the labels, e.g., low-, medium-, and high-frequencies.
The above three points show that spectral GNNs are important in designing and explaining GNNs. However, little effort has been made to summarize the development of spectral methods. On the one hand, existing surveys of GNNs (Wu et al., 2021; Zhou et al., 2020) focus on spatial methods. On the other hand, spectral GNNs are close to graph signal processing (GSP). Ortega et al. (2018) and Dong et al. (2020) are two surveys on traditional GSP that formally define some fundamental concepts, such as filtering and sampling, but they do not explain the connection between spectral filtering and graph representation learning in detail. Besides, while both eigenvalues and eigenvectors play a role in the spectral domain, the latter have been largely ignored in previous literature.
In this paper, we provide a comprehensive and systematic review of spectral GNNs. Specifically, we first analyze the differences between spatial and spectral GNNs and introduce the unique challenges of designing spectral methods. Then, we divide existing methods into two categories, eigenvalue-based and eigenvector-based, which correspond to the design of filters and basis functions in signal processing. Within each category, there are three subcategories, and we further analyze their strengths and weaknesses in terms of effectiveness and efficiency. Next, we introduce major theoretical results and important applications of spectral GNNs. We also conduct a quantitative experiment to test the performance and complexity of different graph filters on both homophilic and heterophilic benchmarks. Finally, we discuss a few promising future directions in spectral GNNs.
2 Challenges and Concepts
Spectral GNNs utilize neural networks to learn representations from spectral information. The design of spectral GNNs is not trivial, which poses the following three challenges:
• Diversity of spectral information. Spectral GNNs aim to learn node or graph representations from spectral information, which is informative but hard to explore. For example, eigenvalues and eigenvectors indicate the global and local structures (Rampásek et al., 2022), and the Fourier coefficients reflect the energy of signals on different harmonics. These spectral features, with different meanings and tensor sizes, pose the first challenge.
• Complexity of graph data. There are various types of graph data in real-world applications, such as dynamic graphs, heterogeneous graphs, hypergraphs, etc. Most of them cannot be represented by a basic adjacency matrix. Therefore, how to extend the idea of spectral convolution to graph data other than undirected graphs is indispensable for spectral GNNs.
• Scalability of methodology. The complexity of eigenvalue decomposition is cubic in the number of nodes. Besides, explicitly constructing the fully-connected eigenspaces brings quadratic time and space complexity. Both operations are important but expensive for spectral GNNs, so making spectral GNNs scalable to large graphs remains a great challenge.
Section 3 reviews spectral GNNs, where each approach addresses at least one of these three challenges. Before that, we first introduce some important concepts to give a general understanding of spectral GNNs.

Assume that $\mathcal{G}=(\mathcal{V},\mathcal{E})$ is a graph, where $\mathcal{V}$ is the node set with $|\mathcal{V}|=n$ and $\mathcal{E}$ is the edge set. Let $\mathbf{A}$ be the adjacency matrix of $\mathcal{G}$. The normalized graph Laplacian matrix is defined as $\mathbf{L}=\mathbf{I}_{n}-\mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$, where $\mathbf{D}$ is a diagonal degree matrix with $\mathbf{D}_{ii}=\sum_{j}\mathbf{A}_{ij}$ and $\mathbf{I}_{n}$ denotes the identity matrix.
Eigenvalue Decomposition (EVD).
The graph Laplacian matrix can be decomposed as $\mathbf{L}=\mathbf{U}\mathbf{\Lambda}\mathbf{U}^{\top}$, where $\mathbf{\Lambda}=\operatorname{diag}(\lambda_{1},\dots,\lambda_{n})$ is a diagonal matrix of eigenvalues and $\mathbf{U}=[\mathbf{u}_{1},\dots,\mathbf{u}_{n}]$ contains the corresponding eigenvectors. If $\mathbf{L}$ is symmetric, the eigenvalues are real and satisfy $0=\lambda_{1}\le\lambda_{2}\le\dots\le\lambda_{n}\le 2$. Otherwise, both $\mathbf{\Lambda}$ and $\mathbf{U}$ will be complex. Unless stated otherwise, graphs are assumed to be undirected.
Graph Fourier Transform (GFT).
Discrete Fourier transform (DFT) aims to decompose an input signal or function into a weighted sum of orthogonal bases. Through EVD, we have $\mathbf{L}\mathbf{u}_{i}=\lambda_{i}\mathbf{u}_{i}$ and $\mathbf{u}_{i}^{\top}\mathbf{L}\mathbf{u}_{i}=\lambda_{i}$. Note that the eigenvalues reflect the frequency/smoothness of the corresponding eigenvectors. Since the eigenvectors are normalized and orthogonal, i.e., $\mathbf{u}_{i}^{\top}\mathbf{u}_{i}=1$ and $\mathbf{u}_{i}^{\top}\mathbf{u}_{j}=0$ for $i\neq j$, they can be used as the orthogonal bases. Given a graph signal $\mathbf{x}\in\mathbb{R}^{n}$, its GFT and inverse GFT can be defined as:
$$\hat{\mathbf{x}}=\mathbf{U}^{\top}\mathbf{x},\qquad\mathbf{x}=\mathbf{U}\hat{\mathbf{x}},\tag{1}$$
where $\hat{\mathbf{x}}$ denotes the Fourier coefficients of $\mathbf{x}$.
Based on the eigenvalues and eigenvectors of the graph Laplacian, we can define the convolution in the Fourier domain.
Spectral Graph Convolution.
The convolution between a filter $\mathbf{g}$ and a signal $\mathbf{x}$ in the spatial domain is equal to their product in the spectral domain. Through GFT, the spectral graph convolution is defined as
$$\mathbf{g}*_{\mathcal{G}}\mathbf{x}=\mathbf{U}\left(\left(\mathbf{U}^{\top}\mathbf{g}\right)\odot\left(\mathbf{U}^{\top}\mathbf{x}\right)\right),\tag{2}$$
where $*_{\mathcal{G}}$ denotes the convolution and $\odot$ the element-wise product. Generally, spectral GNNs parameterize $\mathbf{U}^{\top}\mathbf{g}$ with a learnable diagonal matrix $\mathbf{g}_{\theta}(\mathbf{\Lambda})=\operatorname{diag}\left(g_{\theta}(\lambda_{1}),\dots,g_{\theta}(\lambda_{n})\right)$ as the spectral filter, which can be optimized by gradient descent. We can rewrite the spectral graph convolution as:
$$\mathbf{g}_{\theta}*_{\mathcal{G}}\mathbf{x}=\mathbf{U}\mathbf{g}_{\theta}(\mathbf{\Lambda})\mathbf{U}^{\top}\mathbf{x}=\sum_{i=1}^{n}g_{\theta}(\lambda_{i})\,\mathbf{u}_{i}\mathbf{u}_{i}^{\top}\mathbf{x},\tag{3}$$
where $\mathbf{u}_{i}\mathbf{u}_{i}^{\top}$ is the $i$-th fully-connected eigenspace.
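To make these definitions concrete, the following NumPy sketch builds the normalized Laplacian of a toy path graph, computes the GFT of Eq. (1), and applies an illustrative low-pass filter as in Eq. (3). The graph and the filter shape are illustrative choices, not taken from any cited method.

```python
# Minimal sketch of Eqs. (1)-(3) on a toy undirected graph.
import numpy as np

# 4-node path graph
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt     # L = I - D^{-1/2} A D^{-1/2}

lam, U = np.linalg.eigh(L)                      # EVD: L = U diag(lam) U^T, lam in [0, 2]

x = np.array([1.0, 0.0, 2.0, 1.0])              # a graph signal
x_hat = U.T @ x                                 # GFT: Fourier coefficients (Eq. 1)
assert np.allclose(U @ x_hat, x)                # inverse GFT recovers the signal

g = np.exp(-2.0 * lam)                          # an example low-pass filter g(lambda)
x_filtered = U @ np.diag(g) @ U.T @ x           # spectral convolution (Eq. 3)
print(x_filtered)
```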
3 Spectral Graph Neural Networks
In this section, we classify existing spectral GNNs into two categories, eigenvalue-based and eigenvector-based, according to the spectral information they use. Within each category, we introduce three subcategories of methods and explain their advantages, disadvantages, and connections in detail. Additionally, we present advances in theory and application, and benchmark existing methods on six datasets. The overall framework can be seen in Figure 1.
3.1 Eigenvalue-based Spectral GNNs
Eigenvalue-based spectral GNNs correspond to the design of filters in signal processing. Approaches in this category aim to learn a good filter that captures the most important frequency information of input graph signals. In this section, we introduce three types of graph filters: advanced filters, polynomial filters, and linear filters, from which we can observe how spectral GNNs evolve to reduce complexity and scale to large graphs, at the cost of declining expressive power.
3.1.1 Advanced Filters
We refer to the spectral GNNs that explicitly take eigenvalues as the input of neural networks as advanced filters. Since the decomposition of the graph Laplacian is time-consuming, advanced filters usually choose to truncate the graph spectrum, i.e., using only the smallest-$k$ eigenvalues and eigenvectors, to reduce the complexity. Although computationally intensive, such methods can capture the geometric information hidden in the eigenvalues. For example, the second-smallest eigenvalue indicates the algebraic connectivity of a graph.
SpectralCNN (Bruna et al., 2014) is the first advanced graph filter, which treats the graph filter as the parameters of neural networks and directly optimizes it by gradient descent:
$$\mathbf{Z}_{:,j}^{(l+1)}=\sigma\left(\sum_{i=1}^{d_{l}}\mathbf{U}\operatorname{diag}\left(\theta_{i,j}^{(l)}\right)\mathbf{U}^{\top}\mathbf{Z}_{:,i}^{(l)}\right),\tag{4}$$
where $\sigma$ is the activation function and $\mathbf{Z}_{:,i}^{(l)}$ indicates the $i$-th dimension of the representations in the $l$-th layer. SpectralCNN generalizes the design of convolutional neural networks (CNNs) to graphs and assigns each pair of input and output features an independent graph filter, i.e., $\theta_{i,j}^{(l)}\in\mathbb{R}^{n}$. Therefore, the total number of parameters of one convolution layer is $n\times d_{l}\times d_{l+1}$. More parameters enhance the fitting ability of SpectralCNN but also bring high time and space complexity. Additionally, because the filter values are free parameters rather than functions of the eigenvalues, the model is brittle: when there are repeated eigenvalues, SpectralCNN cannot handle the ambiguity of eigenvectors (Wang et al., 2022).
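As a rough illustration of this parameterization, the PyTorch sketch below implements a SpectralCNN-style layer in which every pair of input and output features owns an independent diagonal filter over the smallest-$k$ eigenvalues. The truncation to $k$ eigenvectors and the initialization are simplifying assumptions rather than the exact design of Bruna et al. (2014).

```python
# Didactic SpectralCNN-style layer, not the exact original model.
import torch
import torch.nn as nn

class SpectralCNNLayer(nn.Module):
    def __init__(self, k, d_in, d_out):
        super().__init__()
        # theta[i, j] is the k-dimensional diagonal filter from input feature i
        # to output feature j, giving k * d_in * d_out parameters per layer.
        self.theta = nn.Parameter(torch.randn(d_in, d_out, k) * 0.01)

    def forward(self, U_k, X):
        # U_k: (n, k) smallest-k eigenvectors, X: (n, d_in) node features
        X_hat = U_k.T @ X                              # GFT of every feature column
        # filter each (input, output) pair in the spectral domain, sum over inputs
        Z_hat = torch.einsum('ki,ijk->kj', X_hat, self.theta)
        return torch.relu(U_k @ Z_hat)                 # inverse GFT + nonlinearity
```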
LanczosNet (Liao et al., 2019) uses spectral convolution to define a multi-scale graph representation learning framework. Spatial GNNs construct multi-scale representations by stacking multiple message-passing layers, i.e., calculating $\mathbf{L}^{k}\mathbf{X}$ recurrently, which is infeasible for large graphs. LanczosNet leverages the fact that $\mathbf{L}^{k}=\mathbf{U}\mathbf{\Lambda}^{k}\mathbf{U}^{\top}$ to reduce complexity and capture long-range information in the spectral domain, which is more efficient. The long-range spectral filter is defined as:
$$\mathbf{L}^{k}\approx\mathbf{U}f\!\left(\mathbf{\Lambda}^{k}\right)\mathbf{U}^{\top},\tag{5}$$
where $f(\cdot)$ is an element-wise function, such as a multilayer perceptron (MLP), and $k$ is the diffusion scale. For short-range information, LanczosNet directly uses spatial convolution. The overall framework is defined as:
$$\mathbf{Z}=\left[\mathbf{L}\mathbf{X},\dots,\mathbf{L}^{S}\mathbf{X},\ \mathbf{U}f_{1}(\mathbf{\Lambda})\mathbf{U}^{\top}\mathbf{X},\dots,\mathbf{U}f_{m}(\mathbf{\Lambda})\mathbf{U}^{\top}\mathbf{X}\right]\mathbf{W},\tag{6}$$
where $S$ is the number of short-range scales and $m$ the number of long-range filters. It is worth noting that LanczosNet takes the eigenvalues as input and outputs the long-range graph filter. By leveraging the correspondence between eigenvalues and eigenvectors, LanczosNet learns representations that are invariant to the permutation of nodes and eigenvectors. The design of parameterizing eigenvalues as graph filters was first proposed by Defferrard et al. (2016) and continued in subsequent work.
Specformer (Bo et al., 2023) is a recent work on advanced graph filters that considers the set relationships between eigenvalues, where rich geometric information is embedded. For example, the algebraic multiplicity of eigenvalue 0 reflects the number of connected components in the graph, but element-wise graph filters are not aware of such statistics. Therefore, Specformer uses a Transformer (Vaswani et al., 2017) as a set-to-set graph filter to learn the set relationships between eigenvalues:
$$\hat{\lambda}_{1},\dots,\hat{\lambda}_{n}=\operatorname{Transformer}\left(\rho(\lambda_{1}),\dots,\rho(\lambda_{n})\right),\tag{7}$$
where $\rho(\cdot)$ is a positional encoding function to vectorize the scalar eigenvalues and $\hat{\lambda}_{i}$ is the filtered eigenvalue. Theoretically, the set-to-set graph filter is a generalization of element-wise filters and, therefore, more expressive. However, the quadratic complexity of the Transformer further increases the complexity of Specformer and makes it difficult to scale to large graphs.
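The following PyTorch sketch shows the set-to-set idea in simplified form: eigenvalues are embedded, attended over with a Transformer encoder so that every filtered response can depend on the whole spectrum, and decoded back to scalar responses. The layer sizes and the linear eigenvalue embedding are illustrative assumptions, not Specformer's exact architecture.

```python
# Simplified set-to-set spectral filter in the spirit of Specformer.
import torch
import torch.nn as nn

class SetToSetFilter(nn.Module):
    def __init__(self, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)               # rho(.): scalar -> vector
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.decode = nn.Linear(d_model, 1)              # back to a scalar response

    def forward(self, lam):
        # lam: (n,) eigenvalues of the graph Laplacian
        h = self.embed(lam.view(1, -1, 1))               # (1, n, d_model)
        h = self.encoder(h)                              # attention over the whole spectrum
        return self.decode(h).view(-1)                   # (n,) learned filter values
```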
Summary. The advanced filters are the first kind of spectral GNNs and have great potential in graph representation learning. However, the explicit matrix factorization is time-consuming and prevents advanced filters from scaling to large graphs. Therefore, it is important to find suitable domains for advanced filters, such as molecular representation learning (Guo et al., 2022).
3.1.2 Polynomial Filters
The polynomial filters are special cases of advanced filters that simplify the neural networks into polynomial functions. Their greatest advantage is that they avoid the explicit EVD while still preserving an arbitrary filtering ability. In addition to reducing the time complexity, a $K$-order polynomial filter is also localized, i.e., only nodes within $K$ hops are connected. The localization property ensures that graph filters remain sparsely connected in space and reduces the memory cost. Therefore, polynomial filters have attracted considerable attention in recent years.
The basic polynomial filter is a weighted sum of graph Laplacians of different orders, which can be defined as:
$$\mathbf{g}_{\theta}(\mathbf{\Lambda})=\sum_{k=0}^{K}\theta_{k}\mathbf{\Lambda}^{k}\;\Longrightarrow\;\mathbf{U}\mathbf{g}_{\theta}(\mathbf{\Lambda})\mathbf{U}^{\top}=\sum_{k=0}^{K}\theta_{k}\mathbf{L}^{k},\tag{8}$$
where $\theta_{k}$ are the coefficients of the polynomial. However, there are some disadvantages. For example, the monomial terms are non-orthogonal, which may affect the convergence of models. Many works try to improve the design of polynomial filters, most of which are based on approximation theory, e.g., Chebyshev polynomials, Bernstein polynomials, etc.
ChebNet (Defferrard et al., 2016) uses the Chebyshev expansion to construct a polynomial filter. Due to its orthogonality, ChebNet can be trained to converge quickly with a small number of polynomial terms. The Chebyshev polynomial filter is defined as:
$$\mathbf{g}_{\theta}(\mathbf{\Lambda})=\sum_{k=0}^{K}\theta_{k}T_{k}(\hat{\mathbf{\Lambda}}),\tag{9}$$
where $\hat{\mathbf{\Lambda}}=2\mathbf{\Lambda}/\lambda_{max}-\mathbf{I}_{n}$ scales the eigenvalues into the range $[-1,1]$, and $T_{k}(x)=2xT_{k-1}(x)-T_{k-2}(x)$ with $T_{0}(x)=1$ and $T_{1}(x)=x$ constructs the orthogonal space. Although ChebNet has a good approximation ability, it may fail under certain boundary conditions, known as the Runge phenomenon (Epperson, 1987). ChebNetII (He et al., 2022) then uses Chebyshev interpolation to enhance the approximation of ChebNet and alleviate the over-fitting problem.
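The sketch below shows how the Chebyshev filter of Eq. (9) can be applied with the standard recurrence using only matrix-vector products, i.e., without any EVD; it assumes $\lambda_{max}=2$ for the normalized Laplacian, placeholder coefficients, and at least two polynomial terms.

```python
# Chebyshev filtering without eigendecomposition.
import numpy as np

def chebyshev_filter(L, x, theta, lam_max=2.0):
    """Apply sum_k theta[k] * T_k(L_hat) to a signal x (assumes len(theta) >= 2)."""
    n = L.shape[0]
    L_hat = (2.0 / lam_max) * L - np.eye(n)        # rescale spectrum from [0, lam_max] to [-1, 1]
    t_prev, t_curr = x, L_hat @ x                  # T_0(L_hat) x and T_1(L_hat) x
    out = theta[0] * t_prev + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_next = 2.0 * (L_hat @ t_curr) - t_prev   # Chebyshev recurrence
        out = out + theta[k] * t_next
        t_prev, t_curr = t_curr, t_next
    return out
```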
There are also other polynomial filters. For example, GPR-GNN (Chien et al., 2021) considers the PageRank information, BernNet (He et al., 2021) uses Bernstein polynomial to keep the graph filter positive semidefinite, JacobiConv (Wang and Zhang, 2022) provides a more general orthogonal base through Jacobi polynomial, and DSGC (Li et al., 2021b) designs polynomial filters on both node and attribute affinity graphs. Here we do not describe them in detail.
Another type of polynomial filter is the rational graph filter, which has a good approximation to narrow bands, where there is a sharp change in the frequency response. The aforementioned polynomial filters need to increase their order to fit such narrow bands, which may result in numerical instability. Rational graph filters solve this problem by learning the ratio between two polynomials as the filter, which can be formulated as:
$$\mathbf{g}_{\theta,\phi}(\mathbf{\Lambda})=\left(\sum_{q=0}^{Q}\theta_{q}\mathbf{\Lambda}^{q}\right)\left(\mathbf{I}_{n}+\sum_{p=1}^{P}\phi_{p}\mathbf{\Lambda}^{p}\right)^{-1},\tag{10}$$
where $Q$ and $P$ are the orders of the two polynomials. CayleyNets (Levie et al., 2019) and GraphARMA (Bianchi et al., 2022) are two representative rational graph filters in the complex and real domains, respectively. Rational graph filters can fit narrow bands with fewer orders. However, calculating the inverse of a matrix is time-consuming and unstable, which requires carefully designed constraints to keep the denominator invertible.
Summary. Polynomial filters can approximate arbitrary filters, e.g., low-, medium-, and high-pass, and easily scale to large graphs, so they are widely used in various applications, e.g., multivariate time-series forecasting (Liu et al., 2022c). However, the numerical instability caused by computing the $K$-th power of the Laplacian matrix needs to be carefully considered when designing polynomial filters. Spec-GN (Yang et al., 2022) provides a spectrum-shrinking method to increase the order of polynomial filters. One can also consider trigonometric polynomials to avoid the potential numerical problem.
3.1.3 Linear Filters
Linear filters further simplify the polynomial filters to yield high scalability and inductive learning ability, but they lose the capability to approximate arbitrary graph filters. In general, linear filters can only scale the graph spectrum, not change its response pattern.
GCN (Kipf and Welling, 2017) is one of the most important works in the field of GNNs; it is the first-order approximation of ChebNet and can be seen as a linear filter. GCN takes the first two terms of ChebNet, i.e., $\theta_{0}$ and $\theta_{1}$, and assumes that $\theta=\theta_{0}=-\theta_{1}$. Then it has a linear filter:
$$\mathbf{Z}=\tilde{\mathbf{D}}^{-1/2}\tilde{\mathbf{A}}\tilde{\mathbf{D}}^{-1/2}\mathbf{X}\mathbf{W},\tag{11}$$
where $\tilde{\mathbf{A}}=\mathbf{A}+\mathbf{I}_{n}$ is the renormalization trick of the adjacency matrix and $\tilde{\mathbf{D}}$ is the degree matrix of $\tilde{\mathbf{A}}$. Here we can see that the graph filter is linear in the input graph structure $\tilde{\mathbf{A}}$. This explains why GCN has the over-smoothing phenomenon (Li et al., 2018): the parameters of the neural network can only scale the graph spectrum of $\tilde{\mathbf{D}}^{-1/2}\tilde{\mathbf{A}}\tilde{\mathbf{D}}^{-1/2}$ but do not change its inherent low-pass pattern (Wu et al., 2019). By repeatedly multiplying the graph filter, only the first trivial eigenspace preserves an amplitude of 1, while all others decay to 0.
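A minimal sketch of this propagation rule, using dense NumPy arrays for readability; the renormalized operator is fixed and low-pass, while the learnable weight only rescales features.

```python
# GCN propagation of Eq. (11): add self-loops, renormalize, propagate.
import numpy as np

def gcn_layer(A, X, W):
    A_tilde = A + np.eye(A.shape[0])              # renormalization trick: add self-loops
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt     # fixed low-pass graph filter
    return np.maximum(A_hat @ X @ W, 0.0)         # ReLU(A_hat X W)
```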
Although the linear property restricts the expressive power of GCN, it still inspires a series of subsequent works. PPNP (Klicpera et al., 2019) combines GCN with PageRank and adds a residual connection to the graph convolutional layers:
$$\mathbf{Z}^{(l+1)}=(1-\alpha)\,\tilde{\mathbf{D}}^{-1/2}\tilde{\mathbf{A}}\tilde{\mathbf{D}}^{-1/2}\mathbf{Z}^{(l)}+\alpha\mathbf{Z}^{(0)},\tag{12}$$
where $\alpha$ is a hyperparameter that balances the low-pass filter $\tilde{\mathbf{D}}^{-1/2}\tilde{\mathbf{A}}\tilde{\mathbf{D}}^{-1/2}$ and the all-pass filter $\mathbf{I}_{n}$. Compared with GCN, PPNP has a hyperparameter to control the response function of the graph filter, but it still cannot approximate arbitrary filters. GNN-LF/HF (Zhu et al., 2021) further unifies all linear filters into an optimization framework. It decomposes the design of graph convolution into two objectives: fitting and regularization. By setting different hyperparameters, the fitting term can make the model approximate different filters, e.g., low- and high-pass. However, such methods can only approximate predefined filters and cannot learn them from data, which requires a lot of expert knowledge.
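A short sketch of the iterative form of this propagation (Eq. 12), with the teleport probability alpha mixing the low-pass term and the all-pass residual term; the number of propagation steps is an arbitrary example.

```python
# Power-iteration form of PPNP-style propagation.
import numpy as np

def ppnp_propagate(A_hat, Z0, alpha=0.1, n_steps=10):
    # A_hat: renormalized adjacency, Z0: initial (e.g., MLP-transformed) features
    Z = Z0
    for _ in range(n_steps):
        Z = (1.0 - alpha) * (A_hat @ Z) + alpha * Z0   # residual connection to Z0
    return Z
```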
FAGCN (Bo et al., 2021) provides a solution to adaptively learn low- or high-pass filters from data while still preserving the linear property. The key idea is to use neural networks to combine some predefined graph filters:
$$\mathbf{Z}^{(l+1)}=\epsilon\,\mathbf{Z}^{(0)}+\mathbf{D}^{-1/2}\left(\boldsymbol{\alpha}\odot\mathbf{A}\right)\mathbf{D}^{-1/2}\mathbf{Z}^{(l)},\tag{13}$$
where $\boldsymbol{\alpha}\in(-1,1)^{n\times n}$ is a sparse edge weight matrix learned by neural networks. When the elements in $\boldsymbol{\alpha}$ are larger than 0, FAGCN acts as a low-pass filter and otherwise as a high-pass filter. There are also some similar methods, e.g., AdaGNN (Dong et al., 2021), AKGNN (Ju et al., 2022), and ADA-UGNN (Ma et al., 2021). By parameterizing the hyperparameters with neural networks, linear filters can adaptively approximate different filters, but their expressive power is still not as good as that of the aforementioned two methods.
Summary. Generally, the linear filters are equivalent to basic spatial GNNs. Taking GCN as an example, if we assign each feature an independent filter, the spectral filter becomes:
$$g_{\theta_{i}}(\lambda)=\theta_{i}\,(1-\lambda),\tag{14}$$
where $\theta_{i}$ is a parameter of the neural network for the $i$-th feature. Therefore, linear filters have good scalability, but their fitting ability is heavily restricted. Linear filters cannot directly capture global or high-order information because the predefined filters are usually designed for first-order information. It is a promising direction to combine linear filters with more complicated filters to learn representations both effectively and efficiently.
3.2 Eigenvector-based Spectral GNNs
A graph signal can be decomposed into the weighted sum of bases, where the weights indicate the energy in different frequencies. Therefore, it is important to choose a set of bases that well reflect the distribution of graph signals. Eigenvector-based spectral GNNs aim to design bases to effectively represent the graph signals in the spectral domain. In this section, we introduce two important bases: Graph Wavelet and Hermitian Laplacian. In addition, we also mention the basis encoder, which takes the eigenvectors as input and generates positional encodings for nodes.
3.2.1 Wavelet Basis
The Fourier bases, i.e., eigenvectors, are the most commonly used basis functions. However, they have two shortcomings. First, eigenvectors are dense in space: if graph signals only exist in a subgraph, e.g., diffusion signals, GFT needs a large number of bases to approximate such signals. Second, eigenvectors are non-localized and cannot reflect the spatial location of the graph signals. To represent graph signals more efficiently, graph wavelets (Hammond et al., 2011) are proposed, which are sparse and localized in the vertex domain. The graph wavelets are defined as:
$$\boldsymbol{\Psi}_{s}=\mathbf{U}\operatorname{diag}\left(g(s\lambda_{1}),\dots,g(s\lambda_{n})\right)\mathbf{U}^{\top},\tag{15}$$
where the $i$-th column $\boldsymbol{\psi}_{s,i}$ is a basis that represents a diffusion signal centered on node $i$ with the scaling parameter $s$. Generally, a smaller $s$ indicates a shorter diffusion, i.e., covering fewer neighbors. By using different scaling parameters, graph wavelets can construct a multi-scale representation of the graph signals. Graph wavelets not only reflect the energy distribution in the spectral domain but also indicate its location distribution in the vertex domain, which makes them a powerful spatial-spectral analysis tool. There are many spectral GNNs based on graph wavelets.
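The sketch below constructs such a wavelet basis with a heat kernel $g(s\lambda)=e^{-s\lambda}$; the exponential kernel is only an illustrative choice, since Hammond et al. (2011) use band-pass kernels.

```python
# Graph wavelet basis (Eq. 15) with an illustrative heat kernel.
import numpy as np

def wavelet_basis(L, s):
    lam, U = np.linalg.eigh(L)
    # Psi[:, i] is a diffusion signal centered on node i at scale s
    Psi = U @ np.diag(np.exp(-s * lam)) @ U.T
    return Psi

# A larger s spreads the diffusion over more neighbors; a smaller s keeps it local.
```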
GWNN (Xu et al., 2019a) proposes a non-parametric spectral GNN based on the graph wavelet transform (GWT):
$$\mathbf{Z}=\sigma\left(\boldsymbol{\Psi}_{s}\,\mathbf{g}_{\theta}\,\boldsymbol{\Psi}_{s}^{-1}\mathbf{X}\mathbf{W}\right),\tag{16}$$
where $\boldsymbol{\Psi}_{s}^{-1}$ is the inverse wavelet transform and $\mathbf{g}_{\theta}$ is a diagonal filter. Since the calculation of $\boldsymbol{\Psi}_{s}$ and $\boldsymbol{\Psi}_{s}^{-1}$ is time-consuming, Hammond et al. (2011) use Chebyshev polynomials to efficiently approximate the wavelet bases. Zheng et al. (2020) design orthogonal graph wavelets to provide a multi-resolution framework for graph signal analysis.
3.2.2 Conjugate Basis
A basic assumption for spectral GNNs is that the underlying graph structures are symmetric. The EVD of asymmetric graphs will generate complex eigenvalues and eigenvectors, but existing graph filters do not support complex operations. Therefore, it is important to find a basis that can encode the direction information of edges and has real eigenvalues for graph filtering. The Hermitian Laplacian is a generalization of the graph Laplacian, which contains a real part and an imaginary part. The real part is a symmetric matrix, which represents the graph structure. The imaginary part is anti-symmetric, so the matrix equals its conjugate transpose, and it can be used to encode side information, e.g., edge direction. The EVD of the Hermitian Laplacian has real eigenvalues and complex eigenvectors, which we call the conjugate basis. Therefore, it can be directly incorporated with existing graph filters.
MagNet (Zhang et al., 2021) uses the Hermitian Laplacian to design a spectral GNN for directed graphs. The normalized Hermitian Laplacian of a directed graph is defined as:
$$\mathbf{L}_{q}=\mathbf{I}_{n}-\mathbf{D}_{s}^{-1/2}\mathbf{A}_{s}\mathbf{D}_{s}^{-1/2}\odot\exp\left(i\,2\pi q\,\boldsymbol{\Theta}\right),\tag{17}$$
where $\mathbf{A}_{s}=\frac{1}{2}(\mathbf{A}+\mathbf{A}^{\top})$ is the symmetrized adjacency matrix, $\mathbf{D}_{s}$ is the degree matrix of $\mathbf{A}_{s}$, and $\boldsymbol{\Theta}=\mathbf{A}-\mathbf{A}^{\top}$ is the phase matrix, where incoming and outgoing edges are set to 1 and -1, respectively. Therefore, they have opposite values in the complex plane. The hyperparameter $q$ controls the trade-off between structural information and direction information; a smaller value of $q$ indicates that the structural information is more important. Based on the Hermitian Laplacian, MagNet defines a linear filter similar to GCN on the directed graph:
$$\mathbf{Z}=\left(\tilde{\mathbf{D}}_{s}^{-1/2}\tilde{\mathbf{A}}_{s}\tilde{\mathbf{D}}_{s}^{-1/2}\odot\exp\left(i\,2\pi q\,\boldsymbol{\Theta}\right)\right)\mathbf{X}\mathbf{W},\tag{18}$$
where $\tilde{\mathbf{A}}_{s}=\mathbf{A}_{s}+\mathbf{I}_{n}$ is the renormalization of $\mathbf{A}_{s}$ and $\tilde{\mathbf{D}}_{s}$ is its degree matrix. MagNet sheds light on how to combine graph filters with the side information of graphs. It is promising to extend graph filters to other types of graph data by using the Hermitian Laplacian, e.g., heterogeneous graphs that use different relations as the side information.
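A small NumPy sketch of this construction, which also checks that the resulting matrix is Hermitian and therefore has real eigenvalues; the value of $q$ here is an arbitrary example.

```python
# Magnetic (Hermitian) Laplacian of a directed graph, in the spirit of Eq. (17).
import numpy as np

def magnetic_laplacian(A, q=0.25):
    A_s = 0.5 * (A + A.T)                                    # symmetrized structure
    d = A_s.sum(axis=1)
    d[d == 0] = 1.0                                          # guard isolated nodes
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    Theta = 2.0 * np.pi * q * (A - A.T)                      # phase: +/- for edge direction
    H = (D_inv_sqrt @ A_s @ D_inv_sqrt) * np.exp(1j * Theta) # Hermitian "adjacency"
    L_q = np.eye(A.shape[0]) - H
    assert np.allclose(L_q, L_q.conj().T)                    # Hermitian check
    return L_q

# np.linalg.eigvalsh(L_q) returns real eigenvalues despite the complex entries.
```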
3.2.3 Basis Encoder
In addition to representing the energy of graph signals, the eigenvectors also reflect the global positions of nodes, which can be used as positional encodings to improve the expressive power of spatial GNNs and break the limitation of the 1-WL test (Dwivedi et al., 2022). There are many attempts to learn data representations from the eigenvectors. For example, spectral clustering (Ng et al., 2001) uses the top-$k$ eigenvectors to represent the manifold of data. Besides, graph embedding, which aims to embed nodes into a low-dimensional space, can be seen as factorizing different graph matrices (Qiu et al., 2018), where the eigenvectors are used as the node embeddings. However, such methods are two-stage approaches that cannot update representations through back-propagation.
Basis encoders aim to learn the positional representations of nodes from the eigenvectors in an end-to-end manner. In addition, they also consider the sign and basis ambiguity of eigenvectors (Lim et al., 2022; Wang et al., 2022) and design models that learn invariant representations. Although basis encoders are not necessarily GNNs, there is a close relationship between them: spectral GNNs aim to find the important eigenspaces, i.e., $\mathbf{u}_{i}\mathbf{u}_{i}^{\top}$, while basis encoders are designed to find the important eigenvectors that can well describe the node positions.
SAN (Kreuzer et al., 2021) proposes to use Transformer to construct a permutation-equivariant basis encoder. Through the self-attention mechanism, SAN can learn the importance of different eigenvectors, which has a similar idea to spectral GNNs. SignNet (Lim et al., 2022) finds that the eigenspaces are invariant to the sign flips and basis symmetries of eigenvectors and uses invariant and equivariant GNNs to learn positional representations from the eigenspaces, which can be seen as a generalization of spectral GNNs. PEG (Wang et al., 2022) uses the Euclidean distance between eigenvectors to reweight the adjacency matrix, thus avoiding the ambiguity of eigenvectors. PEG encodes the positional information in edges and performs well on the link prediction task.
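As a simple baseline for this family, the sketch below extracts the smallest-$k$ nontrivial eigenvectors as node positional encodings and applies the random sign flips commonly used to account for sign ambiguity during training; it is a surrogate for, not an implementation of, the invariant architectures above.

```python
# Spectral positional encodings with random sign flips (a simple surrogate for
# sign/basis-invariant encoders such as SignNet or PEG).
import numpy as np

def spectral_positional_encoding(L, k, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    lam, U = np.linalg.eigh(L)
    pe = U[:, 1:k + 1]                          # skip the eigenvector of eigenvalue 0
    signs = rng.choice([-1.0, 1.0], size=k)     # each eigenvector's sign is arbitrary
    return pe * signs                           # (n, k) positional features per node
```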
Summary. Compared with eigenvalue-based spectral GNNs, eigenvector-based methods are under-explored, possibly due to theoretical and computational limitations. A future direction is to combine the basis information with other tasks. For example, GCN-SVD (Entezari et al., 2020) shows that the low-frequency bases are robust to graph perturbation.
3.3 Theory
In this section, we mainly introduce the theoretical progress on spectral GNNs, including expressive power, robustness, and transferability, which may give a deeper understanding of spectral GNNs.
3.3.1 Expressive Power
The expressive power of MPNNs is proven to be restricted by the 1-WL test (Xu et al., 2019b), but the expressive power of spectral GNNs remains less well understood. Balcilar et al. (2021) prove that spectral graph convolution with nonlinear filtering can break the limit of the 1-WL test and is as powerful as the 3-WL test. Wang and Zhang (2022) make a further attempt: they show that spectral filters are universal approximators when two conditions are satisfied, namely no repeated eigenvalues and no missing frequency components. Since existing graph filters map repeated eigenvalues to the same scalar, this conclusion gives a future direction for designing spectral GNNs.
3.3.2 Robustness
It has been proved that GCNs will have different predictions under structural perturbation (Zügner et al., 2018), implying that spatial convolutions are vulnerable to attack. GCN-LFR (Chang et al., 2021) is the first to verify the vulnerability of graph filters through matrix perturbation theory. Their theoretical analysis shows that graph filters are more robust against structural perturbations when the eigenvalues fall into a specific interval. Lei et al. (2022) find that even-order graph filters are more robust than full-order methods and propose EvenNet, which is a graph filter with only even-order terms and performs well on both homophilic and heterophilic datasets.
3.3.3 Transferability
Transferability reflects the generalization ability of graph filters across different graphs. Intuitively, if two graphs have similar spectra, a graph filter should produce similar representations on both graphs. Levie et al. (2021) first prove that spectral filters are transferable across graphs and learn similar representations when there is only a small perturbation between two graphs, which establishes the connection between transferability and robustness. Kenlay et al. (2021) further propose a stability bound to measure the stability and transferability of graph filters.
Table 1: Node classification accuracy (%, mean ± std) of spectral GNNs on six benchmark datasets.

| Category | Model | Cora | Citeseer | Pubmed | Chameleon | Squirrel | Actor |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Advanced | SpectralCNN | 83.17±0.87 | 71.06±1.93 | 83.76±0.51 | 55.73±2.66 | 44.79±2.58 | 28.87±3.77 |
| Advanced | LanczosNet | 88.03±1.70 | 80.55±1.19 | 90.13±0.49 | 60.96±3.32 | 45.14±1.44 | 36.07±1.38 |
| Advanced | Specformer | 88.57±1.01 | 81.49±0.94 | - | 74.72±1.29 | 64.64±0.81 | 41.93±1.04 |
| Polynomial | ChebNet | 86.37±1.71 | 78.65±1.55 | 88.00±0.74 | 66.94±1.59 | 53.15±1.97 | 37.06±2.41 |
| Polynomial | ChebNetII | 88.08±1.17 | 78.46±1.56 | 88.10±0.63 | 71.26±1.28 | 61.92±1.37 | 41.68±1.04 |
| Polynomial | GPRGNN | 89.61±1.80 | 80.85±1.29 | 90.09±1.15 | 65.27±2.73 | 46.46±2.01 | 39.16±1.39 |
| Polynomial | BernNet | 87.21±1.13 | 79.33±1.59 | 89.00±0.63 | 68.53±3.05 | 51.74±2.37 | 40.72±1.17 |
| Polynomial | JacobiConv | 89.05±1.17 | 80.44±0.95 | 89.51±0.80 | 74.07±1.63 | 57.42±1.94 | 40.82±1.72 |
| Polynomial | Spec-GN | 88.18±0.85 | 80.07±1.48 | 88.51±0.35 | 65.34±3.20 | 50.96±2.19 | 40.49±1.00 |
| Polynomial | DSGC | 88.24±1.53 | 79.75±1.98 | 87.72±0.69 | 51.73±1.95 | 35.68±1.22 | 42.63±1.32 |
| Polynomial | ARMA | 87.70±1.25 | 79.02±1.31 | 88.96±0.51 | 60.96±5.66 | 51.10±0.92 | 38.10±2.19 |
| Polynomial | CayleyNet | 86.55±1.61 | 78.40±1.32 | - | 68.21±1.85 | 51.65±1.98 | 42.26±1.42 |
| Linear | GCN | 87.23±0.82 | 79.82±0.74 | 86.43±0.57 | 59.73±1.75 | 46.55±1.02 | 34.01±1.57 |
| Linear | FAGCN | 88.03±1.30 | 80.04±0.78 | 89.22±0.59 | 66.52±2.17 | 50.40±1.73 | 39.69±1.57 |
| Linear | AdaGNN | 87.32±1.10 | 78.69±1.50 | 88.72±0.47 | 60.85±2.44 | 49.88±1.85 | 37.16±0.93 |
| Linear | AKGNN | 88.28±1.52 | 81.06±0.84 | 88.96±0.50 | 68.53±0.69 | 46.82±0.75 | 35.70±1.20 |
3.4 Application
In this section, we review the latest applications and explain why they are suitable for spectral GNNs, including multivariate time-series forecasting, neuroscience, and computer vision.
3.4.1 Multivariate Time-series Forecasting
The multivariate time-series forecasting task aims to predict multiple correlated time series simultaneously, which requires modeling both intra-series and inter-series patterns. Learning a latent graph between different time series can capture the inter-series relations and significantly improve forecasting performance. StemGNN (Cao et al., 2020) uses the Discrete Fourier Transform (DFT) and GFT to capture the intra- and inter-series patterns in the spectral domain, respectively. TPGNN (Liu et al., 2022c) further proposes a temporal polynomial filter to capture the dynamics between different time series.
3.4.2 Neuroscience
Neuroscience studies the nervous system of the human brain. Brain connectivity can be seen as a special graph, where nodes are different brain regions and edges are their connections. Bessadok et al. (2021) introduce how to apply GNNs to network neuroscience, and Xu et al. (2019c) utilize graph wavelets to construct a multi-scale analysis of brain networks. In general, neuroscience is a suitable application for spectral GNNs, as it does not require scalability but has a strong need for interpretability.
3.4.3 Computer Vision
Many computer vision tasks can improve performance by introducing graph structures, such as detection, classification, recognition, etc. Chen et al. (2022) provides a comprehensive summary of vision tasks involving GNNs, and we present two of them that make use of spectral GNNs.
Point clouds can be seen as special 3D graphs, where the nodes are points and edges are their nearest neighbors. It is natural to generalize the idea of graph filters to 3D graphs. Hu et al. (2020) introduces the recent development of GSP in geometric data, and Liu et al. (2022a) reviews the attacks on point clouds from the perspective of the graph spectral domain. Point cloud tasks, such as segmentation, often need to learn the global structures of nodes. Compared to the layer-by-layer aggregation mechanism of spatial methods, spectral GNNs are better at learning global information through GFT.
Skeleton-based motion prediction is another task aimed at predicting human motions from sensor data, where sensors can be considered as the nodes in a graph, and edges are their spatial connections. Li et al. (2021a) proposes a graph scattering network to project the sensor signals into different graph spectrum bands to predict human motions. Through decomposing the sensor signals, spectral GNNs can capture the potential relationship between human motions and basis functions and use it for classification.
3.5 Benchmark
We benchmark existing spectral GNNs on six graph datasets, three of which (Cora, Citeseer, Pubmed) are homophilic graphs, while the other three (Chameleon, Squirrel, Actor) are heterophilic graphs. We implement these spectral GNNs based on our GAMMA-Spec framework (https://github.com/liuyang-tian/Spectral-GNN-Library), which provides the implementation and visualization of different graph filters. We follow the setting of He et al. (2021) and randomly select 60% of the data for training, 20% for validation, and 20% for testing. The results are shown in Table 1.
We can see that the advanced filters outperform other filters on four datasets, confirming that they have a better approximation ability. However, due to their high complexity, there is less research on this type of filter. Polynomial filters take the runner-up spot but are more widely studied than advanced filters, probably because they strike a good balance between effectiveness and efficiency. The performance of linear filters is worse than that of the other filters, and due to the lack of nonlinear filtering ability, there is also less research on them. Therefore, empowering linear filters with adaptive filtering capability is an important direction.
4 Conclusion and Future Directions
In this paper, we introduced the development of spectral GNNs from the perspective of model, theory, and application. Although some progress has been made, spectral GNNs still lag behind spatial GNNs. To facilitate the development of spectral GNNs, we briefly discuss some promising future directions.
4.0.1 Information intersection
As mentioned in the challenges, the spectral information of graphs is informative, such as eigenvalues, eigenvectors, Fourier coefficients, etc. Therefore, it is important to exploit the connections between different spectral information and fuse them to learn better representations. Besides, spectral GNNs should not be limited to supervised or semi-supervised learning. Applying the rich spectral information to unsupervised or self-supervised learning is also a promising direction. For example, Liu et al. (2022b) connects graph spectrum and graph data augmentation to improve the performance of graph contrastive learning.
4.0.2 Learning from signal processing
Signal processing on Euclidean data, e.g., time series and images, has a long history, while non-Euclidean signal processing is still an emerging field. Therefore, spectral GNNs can borrow ideas from traditional signal processing methods, e.g., Kalman filtering and Bessel filtering. In addition, the spectrum of Euclidean data can be divided into an amplitude spectrum and a phase spectrum, representing shape and location information, respectively. However, in spectral GNNs, most graph spectra are amplitude spectra, and phase spectra are largely ignored, which motivates the discovery of phase information in spectral GNNs.
4.0.3 Broader data and applications
One of the most important advantages of spatial GNNs is their flexibility to adapt to different graph data. The message-passing mechanism can be easily extended to heterogeneous graphs and hypergraphs. However, the idea of signal processing is hard to generalize to other graph data, which hinders the development of spectral GNNs to some extent. Therefore, an important future direction is to extend spectral GNNs to broader graph data and find suitable applications, e.g., biomedicine and neuroscience.
References
- Balcilar et al. [2021] Muhammet Balcilar, Pierre Héroux, Benoit Gaüzère, Pascal Vasseur, Sébastien Adam, and Paul Honeine. Breaking the limits of message passing graph neural networks. In ICML, volume 139, pages 599–608, 2021.
- Bessadok et al. [2021] Alaa Bessadok, Mohamed Ali Mahjoub, and Islem Rekik. Graph neural networks in network neuroscience. CoRR, abs/2106.03535, 2021.
- Bianchi et al. [2022] Filippo Maria Bianchi, Daniele Grattarola, Lorenzo Livi, and Cesare Alippi. Graph neural networks with convolutional ARMA filters. IEEE TPAMI, 44(7):3496–3507, 2022.
- Bo et al. [2021] Deyu Bo, Xiao Wang, Chuan Shi, and Huawei Shen. Beyond low-frequency information in graph convolutional networks. In AAAI, pages 3950–3957, 2021.
- Bo et al. [2023] Deyu Bo, Chuan Shi, Lele Wang, and Renjie Liao. Specformer: Spectral graph neural networks meet transformers. In ICLR, 2023.
- Bruna et al. [2014] Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. Spectral networks and locally connected networks on graphs. In ICLR, 2014.
- Cao et al. [2020] Defu Cao, Yujing Wang, Juanyong Duan, Ce Zhang, Xia Zhu, Congrui Huang, Yunhai Tong, Bixiong Xu, Jing Bai, Jie Tong, and Qi Zhang. Spectral temporal graph neural network for multivariate time-series forecasting. In NeurIPS, 2020.
- Chang et al. [2021] Heng Chang, Yu Rong, Tingyang Xu, Yatao Bian, Shiji Zhou, Xin Wang, Junzhou Huang, and Wenwu Zhu. Not all low-pass filters are robust in graph convolutional networks. In NeurIPS, 2021.
- Chen et al. [2022] Chaoqi Chen, Yushuang Wu, Qiyuan Dai, Hong-Yu Zhou, Mutian Xu, Sibei Yang, Xiaoguang Han, and Yizhou Yu. A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective. CoRR, abs/2209.13232, 2022.
- Chien et al. [2021] Eli Chien, Jianhao Peng, Pan Li, and Olgica Milenkovic. Adaptive universal generalized pagerank graph neural network. In ICLR, 2021.
- Defferrard et al. [2016] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In NIPS, 2016.
- Dong et al. [2020] Xiaowen Dong, Dorina Thanou, Laura Toni, Michael M. Bronstein, and Pascal Frossard. Graph signal processing for machine learning: A review and new perspectives. IEEE Signal Process. Mag., 37(6):117–127, 2020.
- Dong et al. [2021] Yushun Dong, Kaize Ding, Brian Jalaian, Shuiwang Ji, and Jundong Li. Adagnn: Graph neural networks with adaptive frequency response filter. In CIKM, pages 392–401, 2021.
- Dwivedi et al. [2022] Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, and Xavier Bresson. Graph neural networks with learnable structural and positional representations. In ICLR, 2022.
- Entezari et al. [2020] Negin Entezari, Saba A. Al-Sayouri, Amirali Darvishzadeh, and Evangelos E. Papalexakis. All you need is low (rank): Defending against adversarial attacks on graphs. In WSDM, pages 169–177, 2020.
- Epperson [1987] James F Epperson. On the runge example. The American Mathematical Monthly, 94(4):329–341, 1987.
- Gilmer et al. [2017] Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. Neural message passing for quantum chemistry. In ICML, volume 70, pages 1263–1272, 2017.
- Guo et al. [2022] Zhichun Guo, Bozhao Nan, Yijun Tian, Olaf Wiest, Chuxu Zhang, and Nitesh V. Chawla. Graph-based molecular representation learning. CoRR, abs/2207.04869, 2022.
- Hamilton et al. [2017] William L. Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs. In NIPS, 2017.
- Hammond et al. [2011] David K Hammond, Pierre Vandergheynst, and Rémi Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, 2011.
- He et al. [2021] Mingguo He, Zhewei Wei, Zengfeng Huang, and Hongteng Xu. Bernnet: Learning arbitrary graph spectral filters via bernstein approximation. In NeurIPS, 2021.
- He et al. [2022] Mingguo He, Zhewei Wei, and Ji-Rong Wen. Convolutional neural networks on graphs with chebyshev approximation, revisited. In NeurIPS, 2022.
- Hu et al. [2020] Wei Hu, Jiahao Pang, Xianming Liu, Dong Tian, Chia-Wen Lin, and Anthony Vetro. Graph signal processing for geometric data and beyond: Theory and applications. CoRR, abs/2008.01918, 2020.
- Ju et al. [2022] Mingxuan Ju, Shifu Hou, Yujie Fan, Jianan Zhao, Yanfang Ye, and Liang Zhao. Adaptive kernel graph neural network. In AAAI, pages 7051–7058, 2022.
- Kenlay et al. [2021] Henry Kenlay, Dorina Thanou, and Xiaowen Dong. Interpretable stability bounds for spectral graph filters. In ICML, volume 139, pages 5388–5397, 2021.
- Kipf and Welling [2017] Thomas N. Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. In ICLR, 2017.
- Klicpera et al. [2019] Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. Predict then propagate: Graph neural networks meet personalized pagerank. In ICLR, 2019.
- Kreuzer et al. [2021] Devin Kreuzer, Dominique Beaini, William L. Hamilton, Vincent Létourneau, and Prudencio Tossou. Rethinking graph transformers with spectral attention. In NeurIPS, 2021.
- Lei et al. [2022] Runlin Lei, Zhen Wang, Yaliang Li, Bolin Ding, and Zhewei Wei. Evennet: Ignoring odd-hop neighbors improves robustness of graph neural networks. CoRR, abs/2205.13892, 2022.
- Levie et al. [2019] Ron Levie, Federico Monti, Xavier Bresson, and Michael M. Bronstein. Cayleynets: Graph convolutional neural networks with complex rational spectral filters. IEEE TSP, 67(1):97–109, 2019.
- Levie et al. [2021] Ron Levie, Wei Huang, Lorenzo Bucci, Michael M. Bronstein, and Gitta Kutyniok. Transferability of spectral graph convolutional neural networks. J. Mach. Learn. Res., 22:272:1–272:59, 2021.
- Li et al. [2018] Qimai Li, Zhichao Han, and Xiao-Ming Wu. Deeper insights into graph convolutional networks for semi-supervised learning. In AAAI, pages 3538–3545, 2018.
- Li et al. [2021a] Maosen Li, Siheng Chen, Zihui Liu, Zijing Zhang, Lingxi Xie, Qi Tian, and Ya Zhang. Skeleton graph scattering networks for 3d skeleton-based human motion prediction. In ICCV, pages 854–864, 2021.
- Li et al. [2021b] Qimai Li, Xiaotong Zhang, Han Liu, Quanyu Dai, and Xiao-Ming Wu. Dimensionwise separable 2-d graph convolution for unsupervised and semi-supervised learning on graphs. In KDD, pages 953–963, 2021.
- Liao et al. [2019] Renjie Liao, Zhizhen Zhao, Raquel Urtasun, and Richard S. Zemel. Lanczosnet: Multi-scale deep graph convolutional networks. In ICLR, 2019.
- Lim et al. [2022] Derek Lim, Joshua Robinson, Lingxiao Zhao, Tess E. Smidt, Suvrit Sra, Haggai Maron, and Stefanie Jegelka. Sign and basis invariant networks for spectral graph representation learning. CoRR, abs/2202.13013, 2022.
- Liu et al. [2022a] Daizong Liu, Wei Hu, and Xin Li. Point cloud attacks in graph spectral domain: When 3d geometry meets graph signal processing. CoRR, abs/2207.13326, 2022.
- Liu et al. [2022b] Nian Liu, Xiao Wang, Deyu Bo, Chuan Shi, and Jian Pei. Revisiting graph contrastive learning from the perspective of graph spectrum. In NeurIPS, 2022.
- Liu et al. [2022c] Yijing Liu, Qinxian Liu, Jian-Wei Zhang, Haozhe Feng, Zhongwei Wang, Zihan Zhou, and Wei Chen. Multivariate time-series forecasting with temporal polynomial graph neural networks. In NeurIPS, 2022.
- Ma et al. [2021] Yao Ma, Xiaorui Liu, Tong Zhao, Yozen Liu, Jiliang Tang, and Neil Shah. A unified view on graph neural networks as graph signal denoising. In CIKM, pages 1202–1211, 2021.
- Ng et al. [2001] Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, 2001.
- Ortega et al. [2018] Antonio Ortega, Pascal Frossard, Jelena Kovacevic, José M. F. Moura, and Pierre Vandergheynst. Graph signal processing: Overview, challenges, and applications. Proc. IEEE, 106(5):808–828, 2018.
- Qiu et al. [2018] Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Kuansan Wang, and Jie Tang. Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In WSDM, pages 459–467, 2018.
- Rampásek et al. [2022] Ladislav Rampásek, Mikhail Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, and Dominique Beaini. Recipe for a general, powerful, scalable graph transformer. CoRR, abs/2205.12454, 2022.
- Spielman [2007] Daniel A. Spielman. Spectral graph theory and its applications. In FOCS, pages 29–38, 2007.
- Vaswani et al. [2017] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, 2017.
- Velickovic et al. [2018] Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In ICLR, 2018.
- Wang and Zhang [2022] Xiyuan Wang and Muhan Zhang. How powerful are spectral graph neural networks. In ICML, volume 162, pages 23341–23362, 2022.
- Wang et al. [2022] Haorui Wang, Haoteng Yin, Muhan Zhang, and Pan Li. Equivariant and stable positional encoding for more powerful graph neural networks. In ICLR, 2022.
- Wu et al. [2019] Felix Wu, Amauri H. Souza Jr., Tianyi Zhang, Christopher Fifty, Tao Yu, and Kilian Q. Weinberger. Simplifying graph convolutional networks. In ICML, volume 97, pages 6861–6871, 2019.
- Wu et al. [2021] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. A comprehensive survey on graph neural networks. IEEE TNNLS, 32(1):4–24, 2021.
- Xu et al. [2019a] Bingbing Xu, Huawei Shen, Qi Cao, Yunqi Qiu, and Xueqi Cheng. Graph wavelet neural network. In ICLR, 2019.
- Xu et al. [2019b] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In ICLR, 2019.
- Xu et al. [2019c] Wenyan Xu, Qing Li, Zhiyuan Zhu, and Xia Wu. A novel graph wavelet model for brain multi-scale activational-connectional feature fusion. In MICCAI, volume 11766, pages 763–771, 2019.
- Yang et al. [2022] Mingqi Yang, Yanming Shen, Rui Li, Heng Qi, Qiang Zhang, and Baocai Yin. A new perspective on the effects of spectrum in graph neural networks. In ICML, volume 162, pages 25261–25279, 2022.
- Ying et al. [2019] Zhitao Ying, Dylan Bourgeois, Jiaxuan You, Marinka Zitnik, and Jure Leskovec. Gnnexplainer: Generating explanations for graph neural networks. In NeurIPS, 2019.
- Zhang and Chen [2018] Muhan Zhang and Yixin Chen. Link prediction based on graph neural networks. In NeurIPS, 2018.
- Zhang et al. [2021] Xitong Zhang, Yixuan He, Nathan Brugnone, Michael Perlmutter, and Matthew J. Hirn. Magnet: A neural network for directed graphs. In NeurIPS, 2021.
- Zheng et al. [2020] Xuebin Zheng, Bingxin Zhou, Ming Li, Yu Guang Wang, and Junbin Gao. Mathnet: Haar-like wavelet multiresolution-analysis for graph representation and learning. CoRR, abs/2007.11202, 2020.
- Zhou et al. [2020] Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, and Maosong Sun. Graph neural networks: A review of methods and applications. AI Open, 1:57–81, 2020.
- Zhu et al. [2021] Meiqi Zhu, Xiao Wang, Chuan Shi, Houye Ji, and Peng Cui. Interpreting and unifying graph neural networks with an optimization framework. In WWW, pages 1215–1226, 2021.
- Zügner et al. [2018] Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. Adversarial attacks on neural networks for graph data. In KDD, pages 2847–2856, 2018.