
Chebyshev Spectral Neural Networks for Solving Partial Differential Equations

Pengsong Yin, School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, P.R. China
Shuo Ling, School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, P.R. China
Wenjun Ying (corresponding author; email: [email protected]), School of Mathematical Sciences, MOE-LSC and Institute of Natural Sciences, Shanghai Jiao Tong University, Minhang, Shanghai 200240, P.R. China
Abstract

The purpose of this study is to utilize the Chebyshev spectral method neural network (CSNN) model to solve differential equations. This approach employs a single-layer neural network in which Chebyshev spectral methods are used to construct neurons that satisfy the boundary conditions. The study uses a feedforward neural network model and error backpropagation principles, with automatic differentiation (AD) to compute the loss function. The method avoids solving non-sparse linear systems, which makes the algorithm easy to implement and suitable for high-dimensional problems. The unique sampling method and neuron architecture significantly enhance the training efficiency and accuracy of the neural network. Furthermore, the use of multiple networks enables the Chebyshev spectral method to handle equations on more complex domains. The numerical efficiency and accuracy of the CSNN model are investigated on elliptic partial differential equations and compared with the well-known Physics-Informed Neural Network (PINN) method.

Keywords: Chebyshev spectral method neural network, differential equations, single-layer neural network, automatic differentiation, complex domains

1 Introduction

Partial differential equations (PDEs) play a crucial role in various fields of engineering and science. Many problems in mathematics, physics, engineering, economics, and other domains can be modeled by differential equations [19]. In most cases, obtaining analytical solutions of partial differential equations is challenging. Therefore, various numerical methods, such as the finite difference method (FDM) [22], the finite element method (FEM) [18], and the finite volume method (FVM) [14], are employed to solve these equations. While these numerical methods provide good approximate solutions, they can struggle with high-dimensional problems and complex domains.

Computational methods based on machine learning and neural networks have been widely researched in recent years. The increasing computational power and abundance of data have made it possible to apply neural networks to increasingly complex problems. Furthermore, the versatility and simplicity of neural networks have established them as a powerful tool in many domains. At the same time, the structure of neural networks is often reformulated to meet the diverse requirements posed by different problems.

1.1 Artificial Neural Networks for PDEs

Recently, a class of Artificial Neural Networks (ANNs) has been introduced, known as Physics-Informed Neural Networks (PINNs) [17, 20]. The training of a PINN is achieved by minimizing a loss function which, in addition to observational data, incorporates the governing equations, initial conditions, and boundary conditions. As is well known, neural networks possess universal approximation capabilities [6, 26]. Therefore, networks trained in this way can provide effective approximate solutions to initial/boundary value problems.

Compared to traditional numerical methods, ANNs offer several advantages in providing approximate solutions. For example, such methods generally exhibit strong versatility, capable of solving both linear and nonlinear ordinary differential equations as well as partial differential equations. Additionally, many neural network approaches demonstrate clear advantages in solving high-dimensional problems and problems on general domains.

While traditional neural networks have achieved remarkable results in solving differential equations, there are some drawbacks associated with using conventional neural network structures. Issues such as low solution accuracy and poor generality need to be overcome, and training may be challenging in certain situations, especially when the solution of the equation exhibits high-frequency characteristics [24]. Therefore, many scientists are exploring approaches to address these challenges by incorporating mathematical prior knowledge, modifying network structures, training methods, and other techniques.

1.2 Single Layer Artificial Neural Network Model

Susmita Mall and S. Chakraverty proposed a fast-learning single-layer neural network called the Functional Link Artificial Neural Network (FLANN) to solve differential equations [8, 9]. In FLANN, the hidden layer is replaced by a functional expansion block that enhances the input patterns with Chebyshev polynomials, which is computationally more efficient than a multi-layer perceptron network [16, 26]. While this single-layer Chebyshev network exhibits extremely fast optimization and, to some extent, enjoys a mature approximation theory [10], using such a network structure alone for arbitrary PDEs is not ideal. An important reason is that it does not take into account the boundary or initial conditions that the solution of the equation must satisfy.

To align the network structure more closely with the inherent properties of the equation, this paper proposes a new neural network solution for PDEs based on the Chebyshev spectral method [4]. In the subsequent sections, we refer to it as the Chebyshev spectral neural network (CSNN). Its advantage lies in naturally satisfying the boundary conditions of the equation, so that sampling at the boundaries is unnecessary and errors near the boundaries are reduced. Additionally, owing to the excellent properties of spectral methods, such as exponential convergence for smooth functions, high accuracy with few network parameters, and the absence of phase errors [21], this method exhibits significant improvements in solution accuracy. In some cases, it even surpasses Physics-Informed Neural Networks (PINNs) [17].

1.3 Spectral Methods for Complex Domain Problems

For traditional spectral methods, addressing complex domain problems is crucial to expanding their application scope. Current mainstream solutions can be broadly categorized into two types: coordinate transformation and partitioning methods. Orszag introduced the concept of linear transformations [12]. Wang introduced general coordinates to spectral methods [25], making it possible to compute on arbitrarily shaped domains. Partitioning methods distribute the global nature of spectral methods over different regions to adapt to more complex geometries. Morchoisne first proposed spectral multi-domain methods [11]. Patera introduced spectral element methods [15], in which spectral functions replace polynomials on each element, utilizing variational principles for finite element approximations. Although various methods are available for complex domain problems, solving non-sparse matrices or performing intricate domain partitions remains computationally expensive and challenging. CSNN instead employs a variable substitution that transforms the computational domain, described in polar coordinates, into a rectangular region for sampling and training. It uses an additional neural network to learn the transformed boundary values, which somewhat simplifies the computation. While the form of the equation changes under the transformation, the strong learning and generalization capabilities of neural networks still enable effective learning with low error, even when the transformed equation has nonlinear coefficients. The subsequent sections explain this method in detail.

The outline of this paper is as follows. Section 2 introduces some preliminary knowledge about the Chebyshev spectral method. Sections 3 and 4 present the specific structure of the proposed CSNN and the algorithm for solving PDEs with it. Section 5 provides an error analysis. Section 6 contains the numerical experiments, validating the feasibility and efficiency of our method. The conclusion and discussion are given in Section 7.

2 Preliminaries

This section introduces some knowledge about the traditional Chebyshev spectral method, serving as the foundation for CSNN.

2.1 Chebyshev Polynomials

The Chebyshev differential equation yields a set of polynomials that are orthogonal under the integral inner product on $[-1,1]$ with the weight function $w(x)=(1-x^{2})^{-1/2}$; these are known as the Chebyshev polynomials. The first two Chebyshev polynomials are

$$T_{0}(x)=1,\qquad T_{1}(x)=x.$$

The well-known recursive formula can be employed to generate higher-order Chebyshev polynomials [1]

$$T_{n+1}(x)=2xT_{n}(x)-T_{n-1}(x),\qquad n\geq 1, \qquad (2.1)$$

where $T_{n}(x)$ denotes the $n$-th order Chebyshev polynomial and $-1<x<1$ is the argument of the polynomials.

An important feature of the Chebyshev polynomials is that the Chebyshev-Gauss-Lobatto quadrature points and weights can be expressed as follows:

$$x_{0}=1,\quad x_{N}=-1,\quad x_{j}=\cos\frac{\pi j}{N},\quad 1\leq j\leq N-1, \qquad (2.2)$$
$$\omega_{0}=\omega_{N}=\frac{\pi}{2N},\qquad \omega_{j}=\frac{\pi}{N},\quad 1\leq j\leq N-1. \qquad (2.3)$$

For the Gauss quadrature defined by $\{x_{j},\omega_{j}\}_{j=0}^{N}$, we define a discrete inner product on $C[a,b]$ and its associated norm:

$$(u,v)_{N,\omega}=\sum_{j=0}^{N}u(x_{j})v(x_{j})\,\omega_{j},\qquad \|u\|_{N,\omega}=(u,u)_{N,\omega}^{1/2}. \qquad (2.4)$$

2.2 Basis Functions for Spectral Method

To introduce the construction method of basis functions, we exemplify with a (one-dimensional) two-point boundary-value problem. Without loss of generality, we consider the equation with homogeneous boundary conditions

$$a_{-}u(-1)+b_{-}u'(-1)=0,\qquad a_{+}u(1)+b_{+}u'(1)=0. \qquad (2.5)$$

If the right-hand side is not zero, the problem can always be homogenized (see Appendix A). Our objective is to identify basis functions of the following form that satisfy (2.5):

$$\phi_{k}(x)=T_{k}(x)+a_{k}T_{k+1}(x)+b_{k}T_{k+2}(x). \qquad (2.6)$$

Since $T_{k}(\pm 1)=(\pm 1)^{k}$ and $T_{k}'(\pm 1)=(\pm 1)^{k-1}k^{2}$, substituting (2.6) into (2.5) gives

$$\begin{cases}\left(a_{+}+b_{+}(k+1)^{2}\right)a_{k}+\left(a_{+}+b_{+}(k+2)^{2}\right)b_{k}=-a_{+}-b_{+}k^{2},\\ -\left(a_{-}-b_{-}(k+1)^{2}\right)a_{k}+\left(a_{-}-b_{-}(k+2)^{2}\right)b_{k}=-a_{-}+b_{-}k^{2}.\end{cases}$$

Thus, we obtain the expressions for $a_{k}$ and $b_{k}$:

$$a_{k}=-\Big\{\left(a_{+}+b_{+}(k+2)^{2}\right)\left(-a_{-}+b_{-}k^{2}\right)-\left(a_{-}-b_{-}(k+2)^{2}\right)\left(-a_{+}-b_{+}k^{2}\right)\Big\}\big/\mathrm{DET}_{k},$$
$$b_{k}=\Big\{\left(a_{+}+b_{+}(k+1)^{2}\right)\left(-a_{-}+b_{-}k^{2}\right)+\left(a_{-}-b_{-}(k+1)^{2}\right)\left(-a_{+}-b_{+}k^{2}\right)\Big\}\big/\mathrm{DET}_{k}, \qquad (2.7)$$

with

$$\mathrm{DET}_{k}=2a_{+}a_{-}-2b_{+}b_{-}(k+1)^{2}(k+2)^{2}+(a_{-}b_{+}-a_{+}b_{-})\left[(k+1)^{2}+(k+2)^{2}\right].$$

Specifically, for homogeneous Dirichlet conditions ($a_{\pm}=1$, $b_{\pm}=0$), (2.7) yields $a_{k}=0$ and $b_{k}=-1$, so the basis functions are simply

$$\phi_{i}(x)=T_{i}(x)-T_{i+2}(x). \qquad (2.8)$$

For one-dimensional problems, we can directly take the $\phi_{k}(x)$ as the basis of the approximation function space

$$\mathcal{X}_{N}=\mathrm{span}\{\phi_{i}(x): i=0,1,\dots,N-2\}. \qquad (2.9)$$

For a higher-dimensional cube (or hypercube) domain, we can compute, for each coordinate, the polynomial basis functions corresponding to the homogeneous boundary conditions required in that dimension, and then take products of these one-dimensional basis functions to form high-dimensional basis functions. For instance, an approximation function space for a two-dimensional rectangular problem with homogeneous boundary conditions could be

$$\mathcal{X}_{N}=\mathrm{span}\{\phi_{i}(x)\phi_{j}(y): i,j=0,1,\dots,N-2\}. \qquad (2.10)$$
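A minimal sketch of the Dirichlet basis (2.8) and the tensor-product construction (2.10); the helper names are ours, and the Chebyshev evaluation relies on NumPy's `numpy.polynomial.chebyshev` module.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def phi(i, x):
    """Dirichlet basis phi_i(x) = T_i(x) - T_{i+2}(x), see (2.8)."""
    ci = np.zeros(i + 3)
    ci[i], ci[i + 2] = 1.0, -1.0
    return C.chebval(x, ci)

def phi2d(i, j, x, y):
    """Tensor-product basis phi_i(x) * phi_j(y), see (2.10)."""
    return phi(i, x) * phi(j, y)

# Each basis function vanishes on the boundary of [-1, 1]^2:
print(phi(3, np.array([-1.0, 1.0])))                    # ~ [0, 0]
print(phi2d(2, 4, np.array([1.0]), np.array([0.3])))    # ~ [0]
```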

2.3 PDEs in Complex Geometries

In this subsection, let us discuss how to solve equations on non-rectangular domains. Taking a 2D problem as an example, assume that we already have a polar-coordinate representation of a complex domain in $\mathbb{R}^{2}$:

$$f_{1}(\theta)\leq r\leq f_{2}(\theta),\qquad 0\leq\theta<2\pi, \qquad (2.11)$$

where $(r,\theta)$ are polar coordinates. This region can be transformed into the rectangle

$$-1\leq z\leq 1,\qquad 0\leq\theta<2\pi, \qquad (2.12)$$

by the simple stretching transformation

$$z=2\,\frac{r-f_{1}(\theta)}{f_{2}(\theta)-f_{1}(\theta)}-1. \qquad (2.13)$$

The complexity of employing a coordinate transformation arises in the coefficients of the differential equation on the transformed domain [12]. Specifically, derivatives transform according to

$$\left.\frac{\partial u}{\partial r}\right|_{\theta}=\frac{2}{f_{2}(\theta)-f_{1}(\theta)}\left.\frac{\partial u}{\partial z}\right|_{\theta},\qquad
\left.\frac{\partial u}{\partial\theta}\right|_{r}=\left.\frac{\partial u}{\partial\theta}\right|_{z}-\frac{z\left(f_{2}'-f_{1}'\right)+\left(f_{2}'+f_{1}'\right)}{f_{2}-f_{1}}\left.\frac{\partial u}{\partial z}\right|_{\theta}.$$
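A small sketch of the stretching transformation (2.13); the function name is ours, and the constant-radius annulus in the example is only an illustration (it corresponds to the setting used later in Section 6.3).

```python
import numpy as np

def stretch(r, theta, f1, f2):
    """Map (r, theta) with f1(theta) <= r <= f2(theta) to z in [-1, 1], see (2.13)."""
    return 2.0 * (r - f1(theta)) / (f2(theta) - f1(theta)) - 1.0

# Example: the annulus 0.5 <= r <= 1 with constant radii, so f1' = f2' = 0 and the
# derivative rules above reduce to du/dr = 4 du/dz with du/dtheta unchanged.
f1 = lambda t: 0.5 + 0.0 * t
f2 = lambda t: 1.0 + 0.0 * t
print(stretch(0.5, 0.0, f1, f2), stretch(1.0, 0.0, f1, f2))   # -1.0 1.0
```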

It is worth noting that nonlinear variable substitutions can sometimes introduce unexpected nonlinearities and singularities into the resulting differential equations. However, thanks to the powerful generalization and learning capabilities of neural networks, small errors can still be achieved when solving these transformed equations. This is discussed in detail in the numerical experiments section.

3 Chebyshev Spectral Method Neural Network (CSNN)

Figure 1 illustrates the structure of the CSNN, which consists of an input layer with $d$ inputs (input $\mathbf{x}\in\mathbb{R}^{d}$; in this figure and the later discussion we illustrate the case $d=2$), followed by neurons formed from basis functions based on Chebyshev polynomials, and a single output.

Figure 1: The structure of the CSNN model. The inputs $x$ and $y$ are expanded into basis neurons $\phi_{0}(x),\dots,\phi_{N-2}(x)$ and $\psi_{0}(y),\dots,\psi_{N-2}(y)$; their pairwise products $\phi_{i}(x)\cdot\psi_{j}(y)$ are combined through the weights $\omega_{i,j}$ to produce the single output. Only these weighted connections contain the neural network parameters that need to be trained.

The CSNN model comprises three parts: the first part transforms the non-homogeneous problem into the corresponding homogeneous problem; the second part computes, as neurons, the Chebyshev basis functions that satisfy the boundary conditions; the third part concerns the learning and optimization algorithms of the neural network.

The first two parts have been elaborated in detail in Section 2.2. It should be noted that if an input coordinate satisfies two-point boundary conditions (e.g., $z$ in (2.12)), the basis functions can directly take the form of (2.6). If an input coordinate is a periodic variable introduced by the substitution (e.g., $\theta$ in (2.12)), a trigonometric expansion is more suitable because of the inherent periodicity. We can choose the basis $\phi_{k}(\theta)=\cos k\theta+\sin k\theta$, namely

$$\Theta_{N}=\mathrm{span}\{\cos(i\theta): i=0,1,\dots,N-2\}\cup\mathrm{span}\{\sin(i\theta): i=0,1,\dots,N-2\}. \qquad (3.1)$$

Experiments show that this choice indeed reduces the error more effectively than using Chebyshev basis functions directly in every direction.

The learning algorithm is employed to update the network parameters and minimize the loss function. In this case, the error backpropagation algorithm is utilized to update the weights of the CSNN. To accomplish this, the gradient of the loss function with respect to the network parameters $\mathbf{p}$ is computed.

The network output for input data $\mathbf{x}$ and parameters (weights) $\mathbf{p}$ can be computed as

$$\mathcal{N}(\mathbf{x};\mathbf{p})=\sum_{j=0}^{N-2}\sum_{i=0}^{N-2}w_{ij}\,\phi_{i}(x)\psi_{j}(y),$$

where $\mathbf{x}=(x,y)$ represents the input data, and $\phi_{i}(x)\psi_{j}(y)$ and $w_{ij}$ denote the basis functions in (2.10) and the weights of the CSNN, respectively.
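Since the output is linear in the weights $w_{ij}$, a single-layer implementation is straightforward. Below is a minimal PyTorch sketch with the homogeneous Dirichlet basis $\phi_{k}=T_{k}-T_{k+2}$ of (2.8) hard-coded in both coordinates; the class name and layout are our own illustration, not the authors' code.

```python
import torch

def cheb_T(x, K):
    """Return [T_0(x), ..., T_{K-1}(x)] via the recurrence (2.1); fully differentiable."""
    T = [torch.ones_like(x), x]
    for _ in range(2, K):
        T.append(2 * x * T[-1] - T[-2])
    return torch.stack(T[:K], dim=-1)

class CSNN(torch.nn.Module):
    """Single-layer network N(x, y) = sum_{i,j} w_ij phi_i(x) phi_j(y),
    with phi_k = T_k - T_{k+2} (homogeneous Dirichlet basis, eq. (2.8))."""
    def __init__(self, N):
        super().__init__()
        self.N = N
        self.w = torch.nn.Parameter(torch.zeros(N - 1, N - 1))   # i, j = 0, ..., N-2

    def basis(self, t):
        T = cheb_T(t, self.N + 1)                       # T_0, ..., T_N
        return T[..., : self.N - 1] - T[..., 2 : self.N + 1]

    def forward(self, x, y):
        return torch.einsum("...i,ij,...j->...", self.basis(x), self.w, self.basis(y))
```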

Upon observation, it is evident that the function approximation space obtained using a single-layer CSNN neural network resembles that of spectral methods. In the subsequent numerical experiments, we also employ a single-layer network for experimentation. In fact, numerical results indicate that the single-layer CSNN exhibits high accuracy.

4 Solving PDEs by CSNN

The general form of an ordinary differential equation (ODE) or partial differential equation (PDE) for the solution $u(\mathbf{x})$ defined on a domain $\Omega\subset\mathbb{R}^{d}$ is

$$f(\mathbf{x},u(\mathbf{x}),\nabla u(\mathbf{x}),\nabla^{2}u(\mathbf{x}),\dots,\nabla^{n}u(\mathbf{x}))=0,\qquad \mathbf{x}\in\Omega, \qquad (4.1)$$

where $f$ is a function that defines the structure of the differential equation, and $u(\mathbf{x})$ and $\nabla$ denote the solution and the differential operator, respectively. The equation is supplemented with suitable boundary conditions

$$\mathcal{B}(u,\mathbf{x})=0\qquad \text{on }\partial\Omega, \qquad (4.2)$$

where $\mathcal{B}(u,\mathbf{x})$ may represent Dirichlet, Neumann, or Robin boundary conditions.

In the following discussion in Section 4, we take $d=2$ as an example; the case of higher dimensions is similar.

4.1 PDEs in Rectangular Domains

We begin by considering a simple rectangular domain. We first create a CSNN $\hat{u}(\mathbf{x};\theta)$ as an approximation to the solution $u(\mathbf{x})$. Then, we can take the derivatives of $\hat{u}(\mathbf{x};\theta)$ with respect to the input $\mathbf{x}$ via automatic differentiation (AD) [13].

In fact, due to the specific structure of the CSNN, the network’s output automatically satisfies the boundary conditions of the equation. Therefore, we only need to calculate the residual of the equation within the domain, without further considering the boundary conditions. This not only reduces the computational load of the loss function but also avoids the need to adjust the weights of the boundary and internal residuals as in traditional PINN methods.

As for sampling, if the basis functions are Chebyshev polynomials (2.6), then $N-2$ Chebyshev-Gauss-Lobatto quadrature points (2.2) are taken in each dimension. If the basis functions are trigonometric, uniform sampling is used directly. This is a significant difference between CSNN and PINN, and this sampling strategy greatly reduces the training workload.

We denote the sampling set as $\mathcal{T}=\{\mathbf{x}_{1},\mathbf{x}_{2},\dots,\mathbf{x}_{|\mathcal{T}|}\}$. To quantify how well the network satisfies the equation, we define the loss function as the mean of the squared $L^{2}$ norms of the equation residuals:

$$\mathcal{L}(\theta;\mathcal{T})=\frac{1}{|\mathcal{T}|}\sum_{\mathbf{x}\in\mathcal{T}}\left\|f\left(\mathbf{x},\hat{u}(\mathbf{x};\theta),\nabla\hat{u}(\mathbf{x};\theta),\nabla^{2}\hat{u}(\mathbf{x};\theta),\dots,\nabla^{n}\hat{u}(\mathbf{x};\theta)\right)\right\|_{2}^{2}. \qquad (4.3)$$

Finally, by minimizing the loss function, we find the most suitable parameters $\theta$. Since the loss is highly nonlinear and non-convex in $\theta$ [2], we typically use gradient-based optimizers such as gradient descent, Adam [7], and L-BFGS [5].

In conclusion, the process of solving PDEs on rectangular domains using the CSNN method is illustrated in Algorithm 1.

Algorithm 1 CSNN on rectangular domains
1: For each dimension whose boundary condition is not periodic, homogenize the boundary conditions into the form (2.5);
2: For each dimension whose boundary condition is not periodic, compute the basis coefficients $a_{k}$ and $b_{k}$ using (2.7);
3: Construct the Chebyshev spectral method neural network using the computed basis functions $\phi_{i}$ in (2.6);
4: Compute the residual points $\mathcal{T}$ in each dimension;
5: Train the neural network for a limited number of iterations with the loss function $\mathcal{L}(\theta;\mathcal{T})$ given by (4.3).
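A hedged end-to-end sketch of Algorithm 1 for the model problem $-\Delta u=f$ on $[-1,1]^{2}$ with homogeneous Dirichlet data, reusing the `CSNN` class sketched in Section 3. The manufactured right-hand side, the node count, and the training hyperparameters are our own choices for illustration only.

```python
import numpy as np
import torch

# Interior Chebyshev-Gauss-Lobatto nodes; boundary nodes are excluded because the
# basis already satisfies the homogeneous boundary conditions.
N = 10
nodes = np.cos(np.pi * np.arange(1, N) / N)
X, Y = np.meshgrid(nodes, nodes)
x = torch.tensor(X.ravel(), dtype=torch.float32, requires_grad=True)
y = torch.tensor(Y.ravel(), dtype=torch.float32, requires_grad=True)

# Manufactured right-hand side for -Laplace(u) = f with u = sin(pi x) sin(pi y).
f = lambda x, y: 2 * np.pi**2 * torch.sin(np.pi * x) * torch.sin(np.pi * y)

model = CSNN(N)                                  # single-layer model from Section 3
opt = torch.optim.Adam(model.parameters(), lr=0.1)

def laplacian(u, x, y):
    ux, uy = torch.autograd.grad(u.sum(), (x, y), create_graph=True)
    uxx = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    uyy = torch.autograd.grad(uy.sum(), y, create_graph=True)[0]
    return uxx + uyy

for it in range(2000):
    opt.zero_grad()
    u = model(x, y)
    residual = -laplacian(u, x, y) - f(x, y)     # PDE residual at the interior nodes
    loss = (residual ** 2).mean()                # loss of the form (4.3)
    loss.backward()
    opt.step()
```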

4.2 PDEs in Complex Domains

In this section, we will introduce how to solve PDEs on general parameterized domains, which is one of the major advantages of the CSNN method over traditional spectral methods and also one of our highlights.

4.2.1 PDEs After Coordinate Transformation on Variables

For more complex polar-coordinate domains, when the inner boundary of the original region has prescribed values (as in the case of an annular region (2.11)), the transformation poses no extra issues, because the boundary values on the transformed rectangular region are determined by the inner and outer boundary conditions of the original region. Therefore, apart from the change in the form of the original equation due to the variable substitution and a new periodic boundary condition arising in the $\theta$ direction, all other aspects are identical to solving PDEs on a rectangular domain (e.g., (2.12)).

However, when the original domain only has an outer boundary, such as

$$r\leq f_{2}(\theta),\qquad 0\leq\theta<2\pi, \qquad (4.4)$$

the values on one side of the transformed rectangular region are determined by the original boundary conditions, while the values on the other side become undetermined unknowns. (Figure 2 illustrates the original domain of the equation; Figure 3 illustrates the parameter domain in polar coordinates. The orange border of the rectangle corresponds to the boundary values of the original equation, while the green border corresponds to the unknown solution at the center of the original domain.)

Figure 2: The original domain.
Figure 3: The parameter domain (with $r$ on the horizontal axis and $\theta$ on the vertical axis).

Traditional spectral methods may find such cases challenging. In CSNN, we can treat the unknown boundary value $u_{0}$ as an additional trainable parameter $\hat{u}_{0}$, which is trained together with $\hat{u}(\mathbf{x};\theta)$. In addition, in the next subsection we introduce a correction technique for $u_{0}$ that helps adjust its value during training.

4.2.2 Correction for the $\hat{u}_{0}$ Network

Through multiple numerical experiments, we have found that after several iterations, the primary error in CSNN's solution of the differential equation arises from the discrepancy between the output value of the $\hat{u}_{0}$ network and the true solution at the origin. This error stems from never sampling the boundary points during training. We therefore need to correct the $\hat{u}_{0}$ network so that the equation is satisfied as closely as possible at the origin.

Predefine an iteration threshold $M$. When the number of iterations exceeds $M$, we perform a correction of $u_{0}$ every 100 iterations. Taking the 2D Poisson equation as an example, we employ a five-point finite difference scheme for the correction:

$$-\frac{u(h,0)-2u(0,0)+u(-h,0)}{h^{2}}-\frac{u(0,h)-2u(0,0)+u(0,-h)}{h^{2}}\approx f(0,0), \qquad (4.5)$$

where $h$ is a small positive number. Thus, we obtain the corrected value of the unknown $u_{0}$ at the origin as

$$\bar{u}_{0}=\frac{\hat{u}(h,0)+\hat{u}(-h,0)+\hat{u}(0,h)+\hat{u}(0,-h)}{4}+\frac{h^{2}}{4}f(0,0). \qquad (4.6)$$

Here, $\hat{u}$ is the CSNN at the current stage, and $\bar{u}_{0}$ is the corrected prediction. In fact, we only need to assign the value $\bar{u}_{0}$ to the parameter of the single-parameter $\hat{u}_{0}$ network.
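A minimal sketch of the correction step (4.6). In the actual method the approximation lives on the parameter rectangle, so evaluating it near the origin requires mapping back through the coordinate transformation; here, for illustration, `u_hat` and `f` are assumed to be plain callables on the physical $(x,y)$ plane.

```python
def corrected_u0(u_hat, f, h=1e-3):
    """Five-point correction (4.6): estimate u(0, 0) from nearby values of the current
    approximation u_hat(x, y) and the Poisson right-hand side f(x, y)."""
    neighbors = u_hat(h, 0.0) + u_hat(-h, 0.0) + u_hat(0.0, h) + u_hat(0.0, -h)
    return neighbors / 4.0 + h**2 * f(0.0, 0.0) / 4.0
```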

In conclusion, the process of solving PDEs on general parametric domains using the CSNN method is illustrated in Algorithm 2.

Algorithm 2 CSNN on general parametric domains
1: By variable substitution, transform the PDE on the general domain into a new PDE on a rectangular domain;
2: Determine the boundary conditions for the new PDE; if part of the transformed boundary carries unknown values (as in Section 4.2.1), create the additional trainable parameter $\hat{u}_{0}$;
3: Run Algorithm 1 on the new PDE defined on the rectangular domain, and correct $\hat{u}_{0}$ every certain number of training steps.

5 Approximation Theory and Error Analysis for CSNN

Let $\mathcal{F}$ denote the family of all functions that can be represented by the CSNN architecture, and let $u$ represent the true solution of the equation. We define $u_{\mathcal{T}}=\arg\min_{f\in\mathcal{F}}\mathcal{L}(f;\mathcal{T})$ as the neural network whose loss attains its global minimum on the training set $\mathcal{T}$. Since finding $u_{\mathcal{T}}$ by minimizing the loss is typically computationally challenging, we denote by $\tilde{u}_{\mathcal{T}}$ the approximate solution returned by our optimizer. Then the total error $\mathcal{E}$ can be bounded as [3]

$$\mathcal{E}:=\|\tilde{u}_{\mathcal{T}}-u\|\leq\|\tilde{u}_{\mathcal{T}}-u_{\mathcal{T}}\|+\|u_{\mathcal{T}}-u\|. \qquad (5.1)$$

To simplify the problem, we assume that when the training set is large enough, the global minimum can be reached by the optimization algorithm of the neural network, i.e.,

$$\|\tilde{u}_{\mathcal{T}}-u_{\mathcal{T}}\|\leq\epsilon \qquad (5.2)$$

holds for arbitrarily small $\epsilon$.

Here, for the approximation of $\|u_{\mathcal{T}}-u\|$ we only consider the one-dimensional case. Suppose we need to solve the following model problem:

$$-u''+\alpha u=f\quad\text{in } I=(-1,1),\qquad u(1)=u(-1)=0. \qquad (5.3)$$

In this case, $\mathcal{F}=\{v\in P_{N}: v(-1)=v(1)=0\}$. According to the assumptions, we have

$$\alpha u_{\mathcal{T}}(x_{j})-u_{\mathcal{T}}''(x_{j})=f(x_{j}),\quad j=1,2,\dots,N-1,\qquad u_{\mathcal{T}}(x_{0})=u_{\mathcal{T}}(x_{N})=0, \qquad (5.4)$$

where $x_{j}=\cos(j\pi/N)$. We can rewrite (5.4) in a variational formulation with the discrete inner product (2.4) [23]:

$$\alpha(u_{\mathcal{T}},v_{\mathcal{T}})_{N,\omega}-(u_{\mathcal{T}}'',v_{\mathcal{T}})_{N,\omega}=(I_{N}f,v_{\mathcal{T}})_{N,\omega}\qquad \forall v_{\mathcal{T}}\in\mathcal{F}, \qquad (5.5)$$

where $I_{N}: C(-1,1)\to P_{N}$ is the interpolation operator associated with the Gauss-Lobatto points.

According to [23], for $u\in H^{m}_{\omega^{-3/2,-3/2},*}(I)$ and $f\in H^{k}_{\omega^{-1/2,-1/2},*}(I)$, we have

$$\|u-u_{\mathcal{T}}\|_{1,\omega}\lesssim N^{1-m}\|\partial_{x}^{m}u\|_{\omega^{m-3/2,m-3/2}}+N^{-k}\|\partial_{x}^{k}f\|_{\omega^{k-1/2,k-1/2}},\quad m,k\geq 1,$$
$$\|u-u_{\mathcal{T}}\|_{\omega}\lesssim N^{-m}\|\partial_{x}^{m}u\|_{\omega^{m-1/2,m-1/2}}+N^{-k}\|\partial_{x}^{k}f\|_{\omega^{k-1/2,k-1/2}},\quad m,k\geq 1, \qquad (5.6)$$

where $\omega$ denotes the Chebyshev weight $(1-x^{2})^{-1/2}$, and the definition of the Jacobi-weighted Sobolev spaces can be found in Appendix B.

It indicates that if we ignore the error introduced by the neural network optimization algorithm, as long as the solution of the equation and the right-hand side f𝑓f are sufficiently smooth and the width of the neural network is wide enough, the approximate solution can converge to the true solution to any order.

6 Numerical Experiments

In this section, based on the aforementioned CSNN method, we validate the effectiveness and generality of the method by solving differential equations from one to four dimensions. All numerical experiments are conducted using the NVIDIA GeForce RTX 3060(Laptop) GPU, and the Adam optimizer is used to train the neural network. We begin with several simple examples.

We evaluate the accuracy by using the absolute Lsuperscript𝐿L^{\infty} error, absolute L2superscript𝐿2L^{2} error and relative L2superscript𝐿2L^{2} error defined as follows:

$$\|e\|_{\text{absolute-}L^{\infty}}=\max_{i=1,\dots,K}|\hat{u}(x_{i})-u^{*}(x_{i})|,$$
$$\|e\|_{\text{absolute-}L^{2}}=\sqrt{\frac{\sum_{i=1}^{K}|\hat{u}(x_{i})-u^{*}(x_{i})|^{2}}{K}},$$
$$\|e\|_{\text{relative-}L^{2}}=\frac{\sqrt{\sum_{i=1}^{K}|\hat{u}(x_{i})-u^{*}(x_{i})|^{2}}}{\sqrt{\sum_{i=1}^{K}|u^{*}(x_{i})|^{2}}},$$

where $\hat{u}$ is the neural network prediction, $u^{*}$ is the true solution, and $K$ is the number of test points.
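These three metrics translate directly into code; a small sketch (the function name is ours):

```python
import numpy as np

def errors(u_pred, u_true):
    """Absolute L-infinity, absolute L2 and relative L2 errors over the test points."""
    diff = np.abs(u_pred - u_true)
    abs_linf = diff.max()
    abs_l2 = np.sqrt(np.mean(diff**2))
    rel_l2 = np.sqrt(np.sum(diff**2)) / np.sqrt(np.sum(u_true**2))
    return abs_linf, abs_l2, rel_l2
```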

6.1 The 1D Robin boundary condition differential equation

Consider the following problem:

$$u''(x)+xu'(x)-u(x)=(24+5x)e^{5x}+(2+2x^{2})\cos(x^{2})-(4x^{2}+1)\sin(x^{2}) \qquad (6.1)$$

with the Robin boundary conditions:

$$u(-1)-u'(-1)=-4e^{-5}+\sin(1)+2\cos(1),\qquad u(1)+u'(1)=6e^{5}+\sin(1)+2\cos(1). \qquad (6.2)$$

The exact solution is $u(x)=e^{5x}+\sin(x^{2})$.

We take $N=6,8,10,12,14$ to construct CSNN networks for computation. The number of training epochs for each network is set to $300\times N$. The initial learning rate is set to 0.1, and a learning rate scheduler with a reduction factor of 0.7 is utilized; this scheduler reduces the learning rate if the loss fails to decrease for 600 consecutive iterations.
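This kind of schedule (reduce the learning rate by a fixed factor when the loss stalls) can be realized with PyTorch's `ReduceLROnPlateau`; a hedged sketch, where `model`, `compute_loss`, and `N` are placeholders for the objects described above:

```python
import torch

opt = torch.optim.Adam(model.parameters(), lr=0.1)
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.7, patience=600)

for it in range(300 * N):           # epoch budget of 300 * N used in this example
    opt.zero_grad()
    loss = compute_loss(model)      # residual loss of the form (4.3); placeholder name
    loss.backward()
    opt.step()
    sched.step(loss.item())         # reduce the learning rate if the loss has stalled
```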

The test points are uniformly sampled with 501 points in the interval from -1 to 1. The results are shown in Table 1.

N                             6        8        10       12       14
Training Time (seconds)       10.44    17.38    23.51    31.30    41.21
Absolute $L^{\infty}$ Error   7.38E-2  5.46E-3  2.85E-4  1.10E-5  1.01E-6

Table 1: Results for the 1D differential equation.

It is observed that as the number of CSNN sampling points increases, the error of the solution decreases at a very high rate. In contrast, the slight increase in sampling points for PINNs has minimal impact on the equation’s solution.

6.2 The 2D Poisson equation on a rectangular domain

In this part, we solve the two-dimensional Poisson equation with Dirichlet boundary conditions:

$$-\Delta u(x,y)=f(x,y),\qquad (x,y)\in\Omega=[-1,1]^{2}. \qquad (6.3)$$

The right-hand side $f$ of the equation is determined by the true solution $u=e^{-x}\sin(\pi y)$. Let the boundary conditions of the equation be

$$\alpha_{0}(y)=u(-1,y),\quad \alpha_{1}(y)=u(1,y),\quad \beta_{0}(x)=u(x,-1),\quad \beta_{1}(x)=u(x,1), \qquad (6.4)$$
Figure 4: The problem on the boundary.

which satisfy the continuity property

$$\alpha_{0}(-1)=u(-1,-1)=\beta_{0}(-1),\quad \alpha_{0}(1)=u(-1,1)=\beta_{1}(-1),\quad \alpha_{1}(-1)=u(1,-1)=\beta_{0}(1),\quad \alpha_{1}(1)=u(1,1)=\beta_{1}(1). \qquad (6.5)$$

The trial solution is given by the expression,

$$\psi(x,y)=A(x,y)+\tilde{u}(x,y;\theta). \qquad (6.6)$$

Here, $\tilde{u}(x,y;\theta)$ is the output of the CSNN, and $A(x,y)$ is a constructed function that satisfies the non-homogeneous boundary conditions, namely

$$A(x,y)=\frac{1-x}{2}\alpha_{0}(y)+\frac{1+x}{2}\alpha_{1}(y)
+\frac{1-y}{2}\left(\beta_{0}(x)-\frac{1-x}{2}\beta_{0}(-1)-\frac{1+x}{2}\beta_{0}(1)\right)
+\frac{1+y}{2}\left(\beta_{1}(x)-\frac{1-x}{2}\beta_{1}(-1)-\frac{1+x}{2}\beta_{1}(1)\right). \qquad (6.7)$$
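A short sketch of the boundary lift (6.7), together with a check that it reproduces the exact boundary data; the variable names are ours.

```python
import numpy as np

def A(x, y, a0, a1, b0, b1):
    """Boundary lift (6.7): a0, a1 give u(-1, y), u(1, y); b0, b1 give u(x, -1), u(x, 1)."""
    lift_x = (1 - x) / 2 * a0(y) + (1 + x) / 2 * a1(y)
    corr0 = b0(x) - ((1 - x) / 2 * b0(-1.0) + (1 + x) / 2 * b0(1.0))
    corr1 = b1(x) - ((1 - x) / 2 * b1(-1.0) + (1 + x) / 2 * b1(1.0))
    return lift_x + (1 - y) / 2 * corr0 + (1 + y) / 2 * corr1

# For the exact solution u = exp(-x) * sin(pi * y), A reproduces u on the boundary:
u = lambda x, y: np.exp(-x) * np.sin(np.pi * y)
a0 = lambda y: u(-1.0, y); a1 = lambda y: u(1.0, y)
b0 = lambda x: u(x, -1.0); b1 = lambda x: u(x, 1.0)
print(A(1.0, 0.25, a0, a1, b0, b1), u(1.0, 0.25))    # the two values should agree
```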

The optimization algorithm chosen is Adam, with a learning rate of 0.1, and the number of iterations is set to 2000. The test points lie on a uniformly spaced grid with 300 cells in each dimension of the $(x,y)$ space. The $L^{\infty}$ error and training time for different network sizes $N$ are shown in Table 2.

N                             6         8         10
Training Time (seconds)       19.16     26.18     38.67
Absolute $L^{\infty}$ Error   4.466E-4  1.019E-5  8.345E-7

Table 2: Results for the 2D Poisson equation.

Figure 5: Prediction for the 2D Poisson equation (the left image shows the predicted values of the CSNN, the middle image displays the exact solution of the equation, and the right image shows the computed error).

A major advantage of the CSNN is that a large learning rate can be chosen and convergence is rapid. The computational results for $N=10$ are shown in Figure 5. The error distribution is relatively uniform across the entire domain, and the error gradually decreases toward zero near the boundary.

6.3 The 2D Poisson equation on an annular domain

In this part, we consider solving the Poisson equation on the annular region $\Omega=\{(r,\theta)\,|\,0.5\leq r\leq 1,\ 0\leq\theta\leq 2\pi\}$. The original equation is given as follows:

$$-\Delta u(x,y)=f(x,y),\qquad (x,y)\in\Omega. \qquad (6.8)$$

The right-hand side $f$ and the Dirichlet boundary conditions on the inner and outer boundaries are both determined by the true solution $u=e^{x}\cos(\pi y)$.

Taking $x=\frac{s+3}{4}\cos((t+1)\pi)$ and $y=\frac{s+3}{4}\sin((t+1)\pi)$, the above equation (6.8) can be transformed into

$$-\frac{16}{s+3}u_{s}-16u_{ss}-\frac{16}{\left(\pi(s+3)\right)^{2}}u_{tt}=f\!\left(\frac{3+s}{4}\cos((t+1)\pi),\ \frac{3+s}{4}\sin((t+1)\pi)\right), \qquad (6.9)$$

where $t,s\in[-1,1]$.

In the $s$-direction, we employ Chebyshev basis functions (as in (2.9)), and the sampling points are chosen as the Chebyshev-Gauss-Lobatto points (2.2) (with $N=10$). For the $t$-direction, considering its periodicity, we use trigonometric basis functions (as in (3.1)), and the sampling points are uniformly spaced from -1 to 1 (with 500 sampling points).

We denote the sampling set in the parameter space as $\mathcal{T}=\{\mathbf{x}_{1},\mathbf{x}_{2},\dots,\mathbf{x}_{|\mathcal{T}|}\}$ and construct the loss function

$$\mathcal{L}(\theta;\mathcal{T})=\frac{1}{|\mathcal{T}|}\sqrt{\sum_{\mathbf{x}\in\mathcal{T}}\left|\frac{16}{s+3}u_{s}+16u_{ss}+\frac{16}{\left(\pi(s+3)\right)^{2}}u_{tt}+f\right|^{2}}. \qquad (6.10)$$
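A hedged sketch of the residual loss (6.10) using automatic differentiation; `model(s, t)` and `f_st(s, t)` are assumed callables (the CSNN approximation on the parameter rectangle and the transformed right-hand side of (6.9)).

```python
import math
import torch

def annulus_loss(model, s, t, f_st):
    """Residual loss (6.10) for the transformed equation (6.9); s and t require grad."""
    u = model(s, t)
    us, ut = torch.autograd.grad(u.sum(), (s, t), create_graph=True)
    uss = torch.autograd.grad(us.sum(), s, create_graph=True)[0]
    utt = torch.autograd.grad(ut.sum(), t, create_graph=True)[0]
    res = 16 / (s + 3) * us + 16 * uss + 16 / (math.pi * (s + 3)) ** 2 * utt + f_st(s, t)
    return torch.sqrt((res ** 2).sum()) / len(s)
```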

The initial learning rate is set to 0.01, and a learning rate scheduler with a reduction factor of 0.6 is utilized; this scheduler reduces the learning rate if the loss fails to decrease for 500 consecutive iterations. As training progresses, the loss function steadily decreases (see Figure 6). It can be observed that even with a relatively small initial learning rate, the CSNN converges quickly.

Figure 6: Loss for the 2D Poisson equation on the annular domain.

The test points lie on a uniformly spaced grid with 300 cells in each dimension of the $(s,t)$ space. The solution of the equation is depicted in Figure 7, and the absolute $L^{\infty}$ error reaches $9.07\times 10^{-4}$ after 2000 training iterations.

Figure 7: The 2D Poisson equation on an annular domain (the left image shows the predicted values of the CSNN, the middle image displays the exact solution of the equation, and the right image shows the computed error).

6.4 The 2D Helmholtz equation on an annular domain

We consider the 2D Helmholtz equation on $\Omega=\{(r,\theta)\,|\,0.5\leq r\leq 1,\ 0\leq\theta\leq 2\pi\}$:

$$\Delta u(x,y)+u(x,y)=f(x,y). \qquad (6.11)$$

We test our method with $f$ and the Dirichlet boundary condition given by the exact solution $u(x,y)=e^{x}\sin(\pi y)$. Traditional methods for solving the 2D Helmholtz equation may encounter some difficulties; in finite difference and finite element methods, higher wave numbers may lead to numerical dissipation or dispersion issues.

In this example, we take $N=13$, and the training parameters and test samples are the same as in the previous subsection. The final numerical results are shown in Figure 8; the absolute $L^{\infty}$ error is $1.0192\times 10^{-5}$.

Figure 8: The 2D Helmholtz equation on an annular domain (the left image shows the predicted values of the CSNN, the middle image displays the exact solution of the equation, and the right image shows the computed error).

To compare the efficiency and accuracy of CSNN and PINNs, we applied the PINN method to the example in this section. The PINN network consists of 4 hidden layers, each with 128 neurons, using the tanh activation function. After 2000 iterations with the same learning rate $lr=0.001$, the $L^{\infty}$ error of the PINN solution is 0.014, which is over 1000 times larger than that of CSNN. It can be observed that the CSNN method converges faster and also achieves higher accuracy. Moreover, during training, the CSNN solution error does not exhibit the irregular oscillations seen with PINNs; instead, the error decreases steadily (see Figure 9).

Figure 9: Comparison with PINNs.

6.5 The 2D Poisson equation on a circular domain

In this example, we consider solving the Poisson equation (6.8) on the circular region $\Omega=\{(r,\theta)\,|\,r\leq 1,\ 0\leq\theta\leq 2\pi\}$ with exact solution $u(x,y)=e^{x}\cos(\pi y)$.

The initial learning rate is set to $lr=8\times 10^{-3}$, and a learning rate scheduler with a reduction factor of 0.7 is utilized; this scheduler reduces the learning rate if the loss fails to decrease for 800 consecutive iterations. Additionally, we require an extra single-parameter neural network $\hat{u}_{0}$ to learn the value $u_{0}$ at the center of the circle. We set the initial learning rate for $\hat{u}_{0}$ to $6\times 10^{-2}$ and multiply its step size by 0.8 if the loss fails to decrease for 600 consecutive iterations.

The test points lie on a uniformly spaced grid with 300 cells in each dimension of the $(s,t)$ space. After 10000 training iterations, the $L^{2}$ error of the solution reaches $9\times 10^{-4}$ (see Figure 10).

Figure 10: The 2D Poisson equation on a circular domain (the left image shows the predicted values of the CSNN, the middle image displays the exact solution of the equation, and the right image shows the computed error).

However, it must be acknowledged that the performance of the CSNN method is strongly tied to the learning rates of both networks. Poorly chosen step sizes can lead to slow convergence or even failure to reach the global minimum. To address this issue, we apply the correction of the $\hat{u}_{0}$ network (discussed in Section 4.2.2), adjusting its parameter toward satisfying the equation at the origin every certain number of training steps. This measure significantly improves the robustness of the CSNN method with respect to the learning rate settings.

6.6 The 3D variable coefficients elliptic PDE

For the three-dimensional case, we are interested in the numerical solution of the elliptic problem

$$\nabla\cdot(\sigma(\mathbf{x})\nabla u)-\kappa(\mathbf{x})u=f(\mathbf{x})\quad\text{in }\Omega,\qquad u=g^{D}\quad\text{on }\partial\Omega, \qquad (6.12)$$

with $\sigma=\cos(\pi(x+y+z))+2$ and $\kappa=\sin(\pi(x+y+z))+2$. The Dirichlet boundary data $g^{D}(x,y,z)$ and source data $f(x,y,z)$ are chosen such that the exact solution is $u(x,y,z)=e^{\frac{2}{3}x+\frac{2}{3}y+\frac{1}{3}z}\sin(\pi(x+y+z))$. The computational domain $\Omega$ is a cylindrical shell with an inner radius of 0.5, an outer radius of 1, and a height ranging from -1 to 1 (see Figure 11).

Figure 11: The computational domain of Example 6.6.

If we take $x=r\cos\theta$, $y=r\sin\theta$, $z=z$, the aforementioned domain is transformed into the standard rectangular region $\Omega'=\{(r,\theta,z)\in[0.5,1]\times[0,2\pi)\times[-1,1]\}$, and the differential operator becomes

$$\nabla\cdot\left(\sigma\nabla u\right)=\frac{1}{r}\frac{\partial}{\partial r}\!\left(r\sigma\frac{\partial u}{\partial r}\right)+\frac{1}{r^{2}}\frac{\partial}{\partial\theta}\!\left(\sigma\frac{\partial u}{\partial\theta}\right)+\frac{\partial}{\partial z}\!\left(\sigma\frac{\partial u}{\partial z}\right). \qquad (6.13)$$

The initial learning rate is set to 0.01, and a learning rate scheduler with a reduction factor of 0.7 is utilized; this scheduler reduces the learning rate if the loss fails to decrease for 100 consecutive iterations. The $r$ and $z$ directions are sampled at $N$ CGL points each, and the $\theta$ direction is uniformly sampled with 100 points. The test points lie on a uniformly spaced grid with 128 cells in each dimension of the $(r,\theta,z)$ space. In this experiment, $N=8$; after training for 78 seconds over 500 iterations, the $L^{\infty}$ and $L^{2}$ errors are $9.5367\times 10^{-7}$ and $1.9601\times 10^{-7}$, respectively. This demonstrates that even when solving variable-coefficient PDEs in irregular domains, CSNN maintains a high level of accuracy.

As an example, for the cross-section at $r=0.7$, the numerical results and $L^{\infty}$ errors on the cylindrical surface are shown in Figure 12.

Figure 12: The 3D variable-coefficient elliptic equation (the left image shows the predicted values of the CSNN, the middle image displays the exact solution of the equation, and the right image shows the computed error).

6.7 The 4D Poisson Equation

To illustrate the applicability of CSNN in higher dimensions, we solve the Poisson equation on a four-dimensional hypercube, testing the speed and accuracy of the method and comparing it with the traditional PINN approach. The Poisson equation is given as follows:

$$\begin{cases}-\Delta u(\mathbf{x})=4\pi^{2}\sin(\pi x_{1})\sin(\pi x_{2})\sin(\pi x_{3})\sin(\pi x_{4}), & \mathbf{x}\in\Omega=[-1,1]^{4},\\ u(\mathbf{x})=0, & \mathbf{x}\in\partial\Omega.\end{cases} \qquad (6.14)$$

The exact solution is $u(\mathbf{x})=\sin(\pi x_{1})\sin(\pi x_{2})\sin(\pi x_{3})\sin(\pi x_{4})$, with $\mathbf{x}=(x_{1},x_{2},x_{3},x_{4})$, which can be verified easily.
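That verification can also be done symbolically; a small sketch using SymPy (a tooling choice of ours, not used in the paper):

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols("x1 x2 x3 x4")
u = sp.sin(sp.pi * x1) * sp.sin(sp.pi * x2) * sp.sin(sp.pi * x3) * sp.sin(sp.pi * x4)
lhs = -sum(sp.diff(u, v, 2) for v in (x1, x2, x3, x4))   # -Laplace(u)
rhs = 4 * sp.pi**2 * u                                   # right-hand side of (6.14)
print(sp.simplify(lhs - rhs))                            # prints 0
```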

The results of our method for different values of $N$ after training for 1000 epochs are given in Table 3. The test points lie on a uniformly spaced grid with 11 cells in each dimension of the 4D hypercube.

N                         6         8         10
Training Time (seconds)   79.20     90.71     142.3
Absolute $L^{2}$ Error    3.324E-2  7.419E-4  3.284E-5
Relative $L^{2}$ Error    4.063E-2  9.069E-4  4.014E-5

Table 3: Results for the 4D Poisson equation.

From the table, it can be observed that our method performs excellently on this PDE, especially for $N=8$ and $10$. It demonstrates impressive speed and accuracy for numerical solutions of PDEs in high-dimensional spaces.

In addition, it is worth mentioning that the performance of the PINN method in this case is catastrophic. Specifically, we employ a fully connected neural network with 3 hidden layers, each containing 128 neurons. The hyper-parameter $\beta$ balancing the PDE residual and the boundary residual is set to 1, and the optimization uses the widely adopted Adam optimizer. We randomly sample 65,536 points within $\Omega$ every 100 training steps to compute the corresponding $L^{2}$ norm inside the domain; additionally, 1,024 points are randomly sampled along the boundary to compute the corresponding $L^{2}$ norm on the boundary. Despite this substantial computational expense, there is no corresponding improvement: even after training for more than 10,000 iterations over 3,000 seconds, the loss function remains above 0.8. As a result, the neural network fails to adequately fit the desired solution, with an absolute $L^{2}$ error of 0.88. This highlights the superiority of our method.

7 Conclusion and Discussion

We propose a novel neural network framework for solving PDEs. Our experience with this method suggests that it offers the following advantages:

  • 1) High convergence rates and the capability to fit high-frequency solutions.
  • 2) Reduced sampling point requirements, eliminating the need for boundary sampling points.
  • 3) Training using only a single-layer neural network greatly reduces the number of parameters, making convergence analysis feasible.
  • 4) Effective computation of mixed boundary conditions.
  • 5) Capability to solve equations in non-rectangular domains.

There are several disadvantages that need to be addressed in the future work:

  • 1) In solving problems in complex domains, it is currently challenging to effectively determine the learning rates for each neural network.
  • 2) Inability to solve problems in non-parametrizable complex domains.

In future research, we will focus on solving more complex equations, such as the Stokes equations and the elasticity equations. We will also attempt to apply our methods to interface problems, hoping to achieve good numerical results. Additionally, more flexible and complex boundaries are also of interest to us. We hope to apply our method to any form of complex boundaries, not just parametric boundaries.

Acknowledgements

This work is financially supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA25010405). It is also partially supported by the National Key R&D Program of China (Project Number 2020YFA0712000), the National Natural Science Foundation of China (Grant No. DMS-11771290), and the Science Challenge Project of China (Grant No. TZ2016002).

Appendix A

Assume that U is the solution to a two-point boundary-value problem satisfying the following boundary conditions

$$a_{-}U(-1)+b_{-}U'(-1)=c_{-},\qquad a_{+}U(1)+b_{+}U'(1)=c_{+}. \qquad (1)$$

This includes in particular the Dirichlet ($a_{\pm}=1$ and $b_{\pm}=0$), the Neumann ($a_{\pm}=0$ and $b_{\pm}=1$), and the mixed ($a_{-}=b_{+}=0$ or $a_{+}=b_{-}=0$) boundary conditions. The following explains how to homogenize them.

  • Case 1: $a_{\pm}=0$ and $b_{\pm}\neq 0$. We set $\tilde{u}=\beta x^{2}+\gamma x$, where $\beta$ and $\gamma$ are uniquely determined by requiring $\tilde{u}$ to satisfy (1):

    $$-2b_{-}\beta+b_{-}\gamma=c_{-},\qquad 2b_{+}\beta+b_{+}\gamma=c_{+}. \qquad (2)$$

  • Case 2: $a_{-}^{2}+a_{+}^{2}\neq 0$. We set $\tilde{u}=\beta x+\gamma$, where $\beta$ and $\gamma$ can again be uniquely determined by requiring $\tilde{u}$ to satisfy (1):

    $$(-a_{-}+b_{-})\beta+a_{-}\gamma=c_{-},\qquad (a_{+}+b_{+})\beta+a_{+}\gamma=c_{+}. \qquad (3)$$

We now set $u=U-\tilde{u}$. Then $u$ satisfies the new equation with the homogeneous boundary conditions

$$a_{-}u(-1)+b_{-}u'(-1)=0,\qquad a_{+}u(1)+b_{+}u'(1)=0. \qquad (4)$$

Appendix B

Let $I=(-1,1)$ and denote the Jacobi weight function of index $(\alpha,\beta)$ by $\omega^{\alpha,\beta}(x)=(1-x)^{\alpha}(1+x)^{\beta}$. We define the Jacobi-weighted Sobolev spaces:

$$L^{2}_{\omega^{\alpha,\beta}}(I)=\left\{u:\int_{I}u^{2}\omega^{\alpha,\beta}\,\mathrm{d}x<+\infty\right\},$$
$$H^{l}_{\omega^{\alpha,\beta}}(I)=\left\{u\in L^{2}_{\omega^{\alpha,\beta}}(I):\partial_{x}u,\dots,\partial_{x}^{l}u\in L^{2}_{\omega^{\alpha,\beta}}(I)\right\},$$
$$H^{m}_{\omega^{\alpha,\beta},*}(I):=\left\{u:\partial_{x}^{k}u\in L^{2}_{\omega^{\alpha+k,\beta+k}}(I),\ 0\leq k\leq m\right\}.$$

The norms in $L^{2}_{\omega^{\alpha,\beta}}(I)$ and $H^{l}_{\omega^{\alpha,\beta}}(I)$ are denoted by $\|\cdot\|_{\omega^{\alpha,\beta}}$ and $\|\cdot\|_{l,\omega^{\alpha,\beta}}$, respectively. The space $H^{m}_{\omega^{\alpha,\beta},*}(I)$ is equipped with the inner product and norm

$$(u,v)_{m,\omega^{\alpha,\beta},*}=\sum_{k=0}^{m}\left(\partial_{x}^{k}u,\partial_{x}^{k}v\right)_{\omega^{\alpha+k,\beta+k}},\qquad \|u\|_{m,\omega^{\alpha,\beta},*}=(u,u)_{m,\omega^{\alpha,\beta},*}^{1/2}.$$

References

  • [1] RB Bhat and S Chakraverty. Numerical analysis in engineering. Alpha Science Int Ltd, 2004.
  • [2] Avrim Blum and Ronald Rivest. Training a 3-node neural network is NP-complete. Advances in neural information processing systems, 1, 1988.
  • [3] Léon Bottou and Olivier Bousquet. The tradeoffs of large scale learning. Advances in neural information processing systems, 20, 2007.
  • [4] John P Boyd. Chebyshev and Fourier spectral methods. Courier Corporation, 2001.
  • [5] Richard H Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. A limited memory algorithm for bound constrained optimization. SIAM Journal on scientific computing, 16(5):1190–1208, 1995.
  • [6] Simon Haykin. Neural networks: a comprehensive foundation. Prentice Hall PTR, 1998.
  • [7] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [8] Susmita Mall and Snehashish Chakraverty. Chebyshev neural network based model for solving Lane–Emden type equations. Applied Mathematics and Computation, 247:100–114, 2014.
  • [9] Susmita Mall and Snehashish Chakraverty. Single layer Chebyshev neural network model for solving elliptic partial differential equations. Neural Processing Letters, 45:825–840, 2017.
  • [10] John C Mason and David C Handscomb. Chebyshev polynomials. Chapman and Hall/CRC, 2002.
  • [11] Y Morchoisne. Inhomogeneous flow calculations by spectral methods-mono-domain and multi-domain techniques. NASA STI/Recon Technical Report A, 1982:14527, 1982.
  • [12] Steven A Orszag. Spectral methods for problems in complex geometries. In Numerical methods for partial differential equations, pages 273–305. Elsevier, 1979.
  • [13] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in pytorch. 2017.
  • [14] Suhas Patankar. Numerical heat transfer and fluid flow. CRC press, 2018.
  • [15] Anthony T Patera. A spectral element method for fluid dynamics: laminar flow in a channel expansion. Journal of computational Physics, 54(3):468–488, 1984.
  • [16] Jagdish Chandra Patra. Chebyshev neural network-based model for dual-junction solar cells. IEEE Transactions on Energy Conversion, 26(1):132–139, 2010.
  • [17] Maziar Raissi, Paris Perdikaris, and George E Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
  • [18] Junuthula Narasimha Reddy. An introduction to the finite element method. New York, 27:14, 1993.
  • [19] Henry J Ricardo. A modern introduction to differential equations. Academic Press, 2020.
  • [20] Samuel Rudy, Alessandro Alla, Steven L Brunton, and J Nathan Kutz. Data-driven identification of parametric partial differential equations. SIAM Journal on Applied Dynamical Systems, 18(2):643–660, 2019.
  • [21] Jie Shen, Tao Tang, and Li-Lian Wang. Spectral methods: algorithms, analysis and applications, volume 41. Springer Science & Business Media, 2011.
  • [22] Gordon D Smith. Numerical solution of partial differential equations: finite difference methods. Oxford University Press, 1985.
  • [23] Tao Tang. Spectral and high-order methods with applications. Science Press Beijing, 2006.
  • [24] Sifan Wang, Xinling Yu, and Paris Perdikaris. When and why pinns fail to train: A neural tangent kernel perspective. Journal of Computational Physics, 449:110768, 2022.
  • [25] Michiru Yasuhara, Yoshiaki Nakamura, and Jian-Ping Wang. Computations of the hypersonic flow by the spectral method. Journal of the Japan Society for Aeronautical and Space Sciences, 36:542–549, 01 1988.
  • [26] Jacek M Zurada. Introduction to artificial neural network. West Publishing Company, United States of America, 1992.