
A hybrid physics-informed neural network for nonlinear partial differential equation

Chunyue Lv, Lei Wang ([email protected]; [email protected]), Chenming Xie
School of Mathematics and Physics, China University of Geosciences, Wuhan 430074, China
Center for Mathematical Sciences, China University of Geosciences, Wuhan 430074, China
Kindo Medical Data Technology Company Limited, Wuhan 430073, China
Abstract

The recently developed physics-informed machine learning has made great progress in solving nonlinear partial differential equations (PDEs); however, it may fail to provide reasonable approximations to PDEs with discontinuous solutions. In this paper, we focus on the discrete time physics-informed neural network (PINN), and propose a hybrid PINN scheme for nonlinear PDEs. In this approach, the local solution structures are classified as smooth and non-smooth scales by introducing a discontinuity indicator, and then the automatic differentiation technique is employed for resolving smooth scales, while an improved weighted essentially non-oscillatory (WENO) scheme is adopted to capture discontinuities. We then test the present approach on the viscous and inviscid Burgers equations, and it is shown that, compared with the original discrete time PINN, the present hybrid approach performs better in approximating the discontinuous solution, even at a relatively large time step.

keywords:
Physics-informed neural network, WENO scheme, Smoothness indicator, Runge-Kutta method

1 Introduction

Nonlinear partial differential equations (PDEs) have been the focus of many studies due to their frequent appearance in many applications in the physical, biological, and social sciences [1]. Because most nonlinear PDEs do not have analytical solutions except in some extremely simple situations, various numerical methods have been developed during the past several decades, such as the finite-difference [2], finite-volume [2], finite-element [3], and lattice Boltzmann methods [4], to name but a few.

Apart from the traditional numerical approaches, the rise of deep learning in recent years offers an opportunity to develop surrogate models for solving nonlinear PDEs [5, 6, 7, 8]. Among various neural-network-based PDE solvers, the physics-informed neural network (PINN) proposed by Raissi et al. [8] has drawn considerable attention due to its effectiveness in solving both forward and inverse nonlinear PDE problems as well as its straightforward implementation [8, 9]. The core idea of the PINN algorithm is to encode the underlying physical laws into the loss function of the neural network such that the governing equations are enforced by minimizing the residual loss function with automatic differentiation [8, 9]. More recently, the PINN and its variants have been successfully applied to solve various PDEs, including the Navier-Stokes equations, the KdV equation [8], heat transfer equations [10], the Fokker-Planck equation [11], and so on. Despite the great progress of PINNs in solving nonlinear PDEs, the governing equations appearing in previous works are usually the so-called diffusion-dominated equations, and it is known that the solutions in these cases are usually smooth enough that traditional PINNs are capable of achieving good predictive accuracy according to the classical universal approximation theorem [12]. However, as pointed out by some authors, it is a challenge for current PINNs to predict the solutions of PDEs with an advection-dominant character, in which the solutions often develop discontinuities in finite time even if the initial condition is smooth [13]. To address this issue, some enhanced PINNs have been proposed recently. By using more scattered points around the discontinuous regions, Mao et al. [14] investigated the possibility of the continuous time PINN in solving high-speed aerodynamic flows, and found that, compared with random or uniform training points, clustered training points yield more accurate predicted solutions in some cases. In addition, another work reported by Fuks et al. [15] shows that the performance of the PINN in solving hyperbolic PDEs can be enhanced by adding a small diffusion term to the governing PDEs. More recently, by combining the traditional finite volume methodology with PINNs, Patel et al. [17] proposed thermodynamically consistent PINNs for hyperbolic systems, and the solutions of various PDEs, such as the Euler and Buckley-Leverett equations, are well predicted with this method. Although there are some works on utilizing PINNs to solve PDEs with discontinuities, the neural networks in some previous works either need a priori knowledge of the solution [14], or their performance largely depends on the network architecture [15, 16].

In this work, we propose a hybrid PINN (hPINN) for nonlinear PDEs. The basic idea of this approach is to introduce a discontinuity indicator into the neural network such that the local solution structures can be classified as smooth and non-smooth scales. Then the automatic differentiation technique (one of the most important techniques in the deep learning field [18]) is employed for resolving smooth scales, while the improved weighted essentially non-oscillatory (WENO) scheme [19], a well-known shock-capturing scheme, is adopted to capture discontinuities. The remainder of the present paper is organized as follows. In Section 2, the details of the proposed approach are presented. Then, some results for the viscous and inviscid Burgers equations are presented in Section 3, and a brief conclusion is drawn in Section 4.

2 Methodology

In this work, we are interested in designing a hybrid PINN model for the following nonlinear PDE,

u_t + f(u)_x = ν g(u)_xx + h(x, t),   (1)

in which u is the scalar variable and usually a function of both time t and space x, h(x, t) is the source term, and ν is the diffusion coefficient. In addition, the first derivatives in Eq. (1) are associated with convection while the second derivatives are responsible for diffusion.
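For instance, the viscous Burgers equation considered in Section 3 is recovered from Eq. (1) by choosing

f(u) = u^2/2,  g(u) = u,  h(x, t) = 0,

so that Eq. (1) reduces to u_t + u u_x = ν u_xx.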

2.1 Hybrid physics-informed neural network

Figure 1: Schematic of the hPINN for solving the nonlinear PDEs, in which the spatial vector x is the input, σ represents the activation function, which is set to be the hyperbolic tangent activation function in this work [8], and all the hyperparameters in the neural network are initialized by the Xavier initialization method [21].

As presented by Raissi et al. [8], the generally used PINNs can be classified into the continuous time model and the discrete time model. Compared with the latter, the continuous time model has drawn tremendous interest in the PINN community [14, 15, 17, 16]. However, as pointed out by Raissi et al. [8], in order to enforce the physical conservation laws, the continuous time model needs a large number of collocation points in the entire spatio-temporal domain, and thus the training of the original continuous time model is usually prohibitively expensive. Fortunately, this limitation can be well addressed by employing the discrete time model [8]. Accordingly, the present work solves forward PDE problems by using the discrete time model, and the model proposed here can be viewed as a variant of the original discrete time model.

To gain a better understanding of the present approach, the original discrete time PINN is first reviewed. The main idea of this model is to discretize the time operator in Eq. (1) by using the classical Runge-Kutta method with q stages [8]; then we can obtain the following equations

u^{n+c_i}(x) = u^n(x) - Δt Σ_{j=1}^{q} a_{ij} 𝒩[u^{n+c_j}(x)],  i = 1, …, q,
u^{n+1}(x) = u^n(x) - Δt Σ_{j=1}^{q} b_j 𝒩[u^{n+c_j}(x)],   (2)

in which 𝒩[u] = f(u)_x - ν g(u)_xx - h(x), u^{n+c_i}(x) = u(t^n + c_i Δt, x), and the above time discretization scheme may be implicit or explicit, depending on the values of {a_{ij}, b_j, c_j}. In such a case, the PINN model usually consists of two sub-networks, in which the first neural network NN(w, b) takes x as input and outputs the values of u at the different time stages, i.e., [u^{n+c_1}, …, u^{n+c_q}, u^{n+1}], and these are further fed into the second network to encode the governing equations as well as the corresponding boundary conditions. For the forward problems considered here, since the initial/boundary conditions as well as the governing equations are always known, the loss function in the original discrete time PINN model is given by [8],

L = L_PDE + L_BC,
L_PDE = Σ_{i=1}^{N_n} [ Σ_{j=1}^{q} |u_j^n(x^{n,i}) - u^{n,i}|^2 + |u_{q+1}^n(x^{n,i}) - u^{n,i}|^2 ],
L_BC = Σ_{j=1}^{q} |u_j^n(x^b) - u^b|^2 + |u_{q+1}^n(x^b) - u^b|^2,   (3)

where N_n is the number of collocation points, which can be randomly or uniformly sampled inside the computational domain, {x^{n,i}, u^{n,i}}_{i=1}^{N_n} is the numerical data obtained at time step t^n, x^b represents the collocation points on the boundary, and u_i^n is defined as

u_i^n = u^{n+c_i} + Δt Σ_{j=1}^{q} a_{ij} 𝒩[u^{n+c_j}(x)],  i = 1, …, q,
u_{q+1}^n = u^{n+1} + Δt Σ_{j=1}^{q} b_j 𝒩[u^{n+c_j}(x)].   (4)
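To make the role of Eq. (4) concrete, the following minimal NumPy sketch (an illustration with hypothetical array inputs, not the authors' implementation) maps the q stage outputs of the network back to candidates for the known snapshot u^n; in the PINN, the squared mismatch between these candidates and the data u^{n,i} forms L_PDE in Eq. (3).

```python
import numpy as np

def rk_targets(u_stages, u_next, N_stages, a, b, dt):
    """Eq. (4): map each Runge-Kutta stage value back to a candidate
    for u^n. If the stages satisfy Eq. (2) exactly, all q+1 candidates
    coincide with the known snapshot u^n."""
    q = len(b)
    targets = [u_stages[i] + dt * sum(a[i][j] * N_stages[j] for j in range(q))
               for i in range(q)]
    targets.append(u_next + dt * sum(b[j] * N_stages[j] for j in range(q)))
    return targets

# Illustration with u_t = -u (so N[u] = u) and the backward Euler tableau
# (q = 1, a = [[1]], b = [1]): the exact step is u^{n+1} = u^n / (1 + dt).
dt, u_n = 0.1, 1.0
u_next = u_n / (1.0 + dt)
targets = rk_targets([u_next], u_next, [u_next], [[1.0]], [1.0], dt)
# both candidates recover u^n = 1.0
```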

As presented in some previous works [15, 16, 17], although the original PINN works well for various nonlinear PDEs, it may fail to provide reasonable approximations to PDEs with dominant convection. To address this issue, an intuitive way is to employ some discontinuity-capturing methods in the neural network. In this work, inspired by the weighted essentially non-oscillatory (WENO) schemes [19, 20], which are a popular class of numerical methods for capturing discontinuities, an improved WENO scheme [19], dubbed WENO-Z, is incorporated into the PINN. On the basis of this idea, a simple way is to discretize the differential operator in the whole computational domain by using the WENO-Z approach. However, this treatment leads to two other issues. For one thing, when applying the WENO-Z scheme in the whole domain, the training cost of the neural network increases significantly due to the local characteristic decomposition and the computation of the nonlinear weights [19, 20]. For another, the advantage of the automatic differentiation technique used in the original PINN is weakened. In this setting, an alternative solution is to introduce the hybrid concept, for which the efficient automatic differentiation technique is adopted to compute all the differential operators of the PDEs in smooth regions while the expensive WENO-Z scheme is applied only in the vicinity of discontinuities; this idea is widely employed in the scientific computing community [22].

Fig. 1 shows a schematic of the present hybrid PINN (hPINN). Since the evolution of the discontinuities depends little on the viscous term (i.e., g(u)_xx), the automatic differentiation technique is adopted to compute the second derivative in both the smooth and non-smooth regions for its efficiency. For the approximation of the first-order derivative (i.e., f(u)_x) at the non-smooth scale, the fifth-order WENO-Z scheme is used. For brevity in the presentation, we use a one-dimensional scalar nonlinear PDE as an example, and assume the grid points x_j are uniform. The cell size and the cells are denoted by Δx = x_{j+1} - x_j and I_j = [x_{j-1/2}, x_{j+1/2}] with x_{j+1/2} = x_j + Δx/2, respectively. Applying a conservative finite difference scheme to the first-order derivative term in Eq. (1), we obtain

∂f/∂x |_{x=x_i} = (f̂_{i+1/2} - f̂_{i-1/2}) / Δx,   (5)

in which f̂_{i+1/2} and f̂_{i-1/2} are the numerical fluxes at the cell boundaries. To ensure numerical stability, the global Lax-Friedrichs flux splitting is used to split the flux f(u) into its positive and negative parts,

f(u) = f^+(u) + f^-(u),   (6)

where f^±(u) = (1/2)(f(u) ± λu) and λ = max|f'(u)|, and with this, we have f̂_{i+1/2} = f̂^+_{i+1/2} + f̂^-_{i+1/2}. Since the negative part of the flux is reconstructed symmetrically to the positive part with respect to x_{i+1/2}, here we will only describe how f̂^+_{i+1/2} is constructed, and for convenience, we will drop the "+" sign in the superscript. As presented in [19], the fifth-order WENO flux f̂_{i+1/2} is defined as
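The splitting in Eq. (6) can be sketched in a few lines of NumPy; this is a minimal illustration (the flux and wave-speed functions passed in are assumptions for the example), shown here for the Burgers flux used later in Section 3.

```python
import numpy as np

def lax_friedrichs_split(u, f, df):
    """Global Lax-Friedrichs splitting, Eq. (6): f(u) = f+(u) + f-(u),
    with f±(u) = (f(u) ± λu)/2 and λ = max|f'(u)| over the whole domain."""
    lam = np.max(np.abs(df(u)))
    fu = f(u)
    return 0.5 * (fu + lam * u), 0.5 * (fu - lam * u)

# Burgers flux f(u) = u^2/2, so f'(u) = u
u = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
f_plus, f_minus = lax_friedrichs_split(u, lambda v: 0.5 * v**2, lambda v: v)
# the two parts always sum back to the original flux f(u)
```

By construction, f^+ has a non-negative and f^- a non-positive characteristic speed, which is what allows the upwind-biased stencils below to be applied to each part separately.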

f̂_{i+1/2} = ω_0 f̂^(0)_{i+1/2} + ω_1 f̂^(1)_{i+1/2} + ω_2 f̂^(2)_{i+1/2},   (7)

in which f̂^(k)_{i+1/2} (k = 0, 1, 2) are the flux values built on the three stencils {x_{i-2}, x_{i-1}, x_i}, {x_{i-1}, x_i, x_{i+1}} and {x_i, x_{i+1}, x_{i+2}}, and they are given by

f̂^(0)_{i+1/2} = (1/6)(2f_{i-2} - 7f_{i-1} + 11f_i),
f̂^(1)_{i+1/2} = (1/6)(-f_{i-1} + 5f_i + 2f_{i+1}),
f̂^(2)_{i+1/2} = (1/6)(2f_i + 5f_{i+1} - f_{i+2}).   (8)

In addition, the nonlinear weights ω_0, ω_1 and ω_2 are calculated as follows [19]

ω_k = α_k / (α_0 + α_1 + α_2),  α_k = d_k [1 + (τ_5 / (β_k + ε))^2],  k = 0, 1, 2,   (9)

where ε is a small positive sensitivity parameter, which is used to avoid division by zero in the weights formulation, and is chosen to be 10^{-40} in this work as suggested in [19]. The coefficients d_k are the optimal weights, given by

d_0 = 1/10,  d_1 = 6/10,  d_2 = 3/10.   (10)

τ_5 is the global smoothness indicator used in the WENO-Z scheme [19], which is characterized by

τ_5 = |β_0 - β_2|.   (11)

In addition, β_k are the smoothness indicators, which measure the smoothness of the solution over a particular stencil, and are defined as

β_0 = (13/12)(f_{i-2} - 2f_{i-1} + f_i)^2 + (1/4)(f_{i-2} - 4f_{i-1} + 3f_i)^2,
β_1 = (13/12)(f_{i-1} - 2f_i + f_{i+1})^2 + (1/4)(f_{i+1} - f_{i-1})^2,
β_2 = (13/12)(f_i - 2f_{i+1} + f_{i+2})^2 + (1/4)(3f_i - 4f_{i+1} + f_{i+2})^2.   (12)
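Eqs. (7)-(12) can be collected into a single reconstruction routine. The NumPy sketch below is an illustration of the scheme at one interface (for the positive part of the split flux), not the authors' implementation.

```python
import numpy as np

def wenoz_flux(f, eps=1e-40):
    """Fifth-order WENO-Z reconstruction of f_hat_{i+1/2}, Eqs. (7)-(12).
    f holds the five point values f_{i-2}, ..., f_{i+2} of the split flux."""
    # smoothness indicators, Eq. (12)
    b0 = 13/12*(f[0] - 2*f[1] + f[2])**2 + 0.25*(f[0] - 4*f[1] + 3*f[2])**2
    b1 = 13/12*(f[1] - 2*f[2] + f[3])**2 + 0.25*(f[3] - f[1])**2
    b2 = 13/12*(f[2] - 2*f[3] + f[4])**2 + 0.25*(3*f[2] - 4*f[3] + f[4])**2
    beta = np.array([b0, b1, b2])
    tau5 = abs(b0 - b2)                           # Eq. (11)
    d = np.array([0.1, 0.6, 0.3])                 # Eq. (10)
    alpha = d * (1.0 + (tau5 / (beta + eps))**2)  # Eq. (9)
    w = alpha / alpha.sum()
    # candidate fluxes on the three stencils, Eq. (8)
    q = np.array([(2*f[0] - 7*f[1] + 11*f[2]) / 6,
                  (-f[1] + 5*f[2] + 2*f[3]) / 6,
                  (2*f[2] + 5*f[3] - f[4]) / 6])
    return np.dot(w, q)                           # Eq. (7)

# On smooth (here linear) data, tau5 vanishes, the weights reduce to the
# optimal d_k, and the reconstruction returns the exact interface value.
val = wenoz_flux(np.array([1.0, 2.0, 3.0, 4.0, 5.0]))   # 3.5
```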

Finally, in order to distinguish the non-smooth scales from the smooth regions in the neural network, an adaptive discontinuity indicator is needed. Following the work of Fu [22], the separation between discontinuities and smooth regions can be achieved by

γ_k = 1 / (β_k + δ)^p,  k = 0, 1, 2, 3,   (13)

where δ is equal to 10^{-4} and 10^{-3} for one-dimensional and multi-dimensional problems, respectively, p is a scale separation parameter, whose value is set to 6 as suggested in [22], and β_3 is an additional smoothness indicator over the stencil {x_{i+1}, x_{i+2}, x_{i+3}}, given by

β_3 = (1/3)[f_{i+1}(22f_{i+1} - 73f_{i+2} + 29f_{i+3}) + f_{i+2}(61f_{i+2} - 49f_{i+3}) + 10 f_{i+3}^2].   (14)

Then the discontinuity indicator is defined as

Flag = 0 if χ_k > C_T for all k ∈ {0, 1, 2, 3}, and Flag = 1 otherwise,   (15)

in which χ_k is the normalized smoothness indicator given by χ_k = γ_k / Σ_{k=0}^{3} γ_k, and C_T is a built-in parameter set to C_T = 5 × 10^{-4}.
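The indicator of Eqs. (13)-(15) can be sketched as follows; this NumPy fragment is a minimal illustration with the default parameters stated above, and the stencil layout (six point values around x_i) is an assumption made for the example.

```python
import numpy as np

def discontinuity_flag(f, delta=1e-4, p=6, CT=5e-4):
    """Eqs. (13)-(15): Flag = 1 marks a non-smooth region around x_i.
    f holds the six point values f_{i-2}, ..., f_{i+3}."""
    # smoothness indicators beta_0..beta_2, Eq. (12)
    b0 = 13/12*(f[0] - 2*f[1] + f[2])**2 + 0.25*(f[0] - 4*f[1] + 3*f[2])**2
    b1 = 13/12*(f[1] - 2*f[2] + f[3])**2 + 0.25*(f[3] - f[1])**2
    b2 = 13/12*(f[2] - 2*f[3] + f[4])**2 + 0.25*(3*f[2] - 4*f[3] + f[4])**2
    # extra indicator beta_3 on {x_{i+1}, x_{i+2}, x_{i+3}}, Eq. (14)
    b3 = (f[3]*(22*f[3] - 73*f[4] + 29*f[5])
          + f[4]*(61*f[4] - 49*f[5]) + 10*f[5]**2) / 3
    gamma = 1.0 / (np.array([b0, b1, b2, b3]) + delta)**p   # Eq. (13)
    chi = gamma / gamma.sum()                               # normalization
    return 0 if np.all(chi > CT) else 1                     # Eq. (15)
```

On smooth data all χ_k are comparable and exceed C_T, so the cell is flagged smooth; a step in the data drives the χ_k of the stencils crossing it far below C_T, triggering the WENO-Z branch.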

3 Computational results and discussion

Figure 2: Comparison of the viscous Burgers equation at t = 0.2 (a) and t = 1.0 (b), in which ν = 10^{-4}/π, and the reference solution is computed with the WENO-Z scheme using 1000 mesh points.

In this section, the one-dimensional viscous and inviscid Burgers equations with Dirichlet boundary conditions are chosen to validate the present hPINN,

∂u/∂t + ∂(u^2/2)/∂x = ν ∂^2u/∂x^2,  x ∈ [-1, 1],  t ∈ [0, 1],
u(0, x) = -sin(πx),
u(t, -1) = u(t, 1) = 0,   (16)

and the predicted results are compared with those of the original PINN. As for the neural network architecture, we employ 5 hidden layers with 20 neurons in each layer, the number of residual points is 300 in the training process, and the learning rate is set to 10^{-4}. In both cases, the network is trained until the loss function is smaller than 10^{-5}.

Figure 3: Comparison of the inviscid Burgers equation (i.e., ν = 0.0) at t = 0.2 (a) and t = 1.0 (b), in which the reference solution is computed with the WENO-Z scheme using 1000 mesh points.

Fig. 2 and Fig. 3 illustrate the distributions of the predicted solutions for both the viscous and inviscid Burgers equations, in which the reference solutions are obtained by using the fifth-order WENO-Z scheme with 1000 mesh points. As depicted in these figures, it can be clearly seen that in regions where the solution is smooth, both the PINN and the hPINN work well. However, when the shock wave appears in the solution, the original PINN fails to provide reasonable approximations at the shock position, and this phenomenon is more distinct for the inviscid Burgers equation (see Fig. 2 and Fig. 3). Different from the original PINN, the results obtained with the present hPINN converge to the reference solution with a non-oscillatory profile even in the inviscid case, which further suggests that the performance of the present hPINN is better than that of the previous PINN.

To demonstrate the robustness of the present hPINN, the global relative errors obtained at various Runge-Kutta stages q and time steps Δt are also investigated, and the corresponding results are shown in Table 1, in which the results obtained by using the original PINN are not included since it fails to predict reasonable results. As shown in this table, one can find that for a fixed Runge-Kutta stage q, the global relative error generally increases with increasing time step Δt, while for a fixed time step Δt, it generally decreases with increasing stage number q. Moreover, due to the distinct feature of the neural network, the Runge-Kutta stage number used here can be arbitrarily large, making it possible to solve the nonlinear PDEs with a very large time step, which is largely different from the traditional numerical methods.

Table 1: Global relative errors obtained with hPINN for different Runge-Kutta stages q and time steps Δt at t = 0.6.

              ν = 10^{-4}/π                                       ν = 0.0
q     Δt = 0.1        Δt = 0.3        Δt = 0.6        Δt = 0.1        Δt = 0.3        Δt = 0.6
1     2.3438×10^{-3}  3.4727×10^{-1}  4.3328×10^{-1}  4.4754×10^{-3}  3.8972×10^{-2}  5.9224×10^{-1}
4     2.8555×10^{-3}  9.2065×10^{-3}  3.3564×10^{-1}  2.0912×10^{-3}  5.3934×10^{-3}  1.2390×10^{-1}
10    1.8478×10^{-3}  2.6818×10^{-2}  4.2223×10^{-2}  3.0013×10^{-3}  5.1584×10^{-3}  9.8035×10^{-3}
50    2.2558×10^{-3}  3.5475×10^{-3}  9.6065×10^{-2}  2.3808×10^{-3}  3.7173×10^{-3}  1.1018×10^{-2}

4 Conclusion

In this paper, by incorporating a discontinuity indicator into the neural network, a hybrid physics-informed neural network (hPINN) is proposed, in which the automatic differentiation technique is employed to compute all the differential operators of the PDEs in smooth regions, while the classical WENO-Z scheme is adopted to compute the convection term in the vicinity of discontinuities. Then the viscous and inviscid Burgers equations are selected to validate the present model. Based on the results, we observe that the present hPINN is able to provide reasonable approximations even at the shock position, and the overall performance of the present hPINN is better than that of the original PINN. Further, due to the distinct feature of the neural network, the Runge-Kutta stage number used here can be arbitrarily large, making it possible to solve the nonlinear PDEs with a very large time step, which is very hard or even impossible for traditional numerical methods. Finally, we would like to point out that although the present work focuses on one-dimensional problems, it is easy to extend the present model to multi-dimensional problems, which will be considered in ongoing work.

CRediT authorship contribution statement

Chunyue Lv: Software, Methodology, Investigation. Lei Wang: Conceptualization, Methodology, Writing - review & editing.

Acknowledgements

This work is financially supported by the National Natural Science Foundation of China (Grant No. 12002320) and the Fundamental Research Funds for the Central Universities (Grant No. CUGGC05).

References

  • [1] L. Debnath, Nonlinear partial differential equations for scientists and engineers, Birkhauser, Boston, MA, 1997.
  • [2] S. Mazumder, Numerical methods for partial differential equations: finite difference and finite volume methods, first ed., Academic Press, New York, 2015.
  • [3] H.G. Roos, M. Stynes, L. Tobiska, Robust numerical methods for singularly perturbed differential equations: convection-diffusion-reaction and flow problems, Springer, Berlin, 2008.
  • [4] L. Wang, X.G. Yang, H.L. Wang, Z.H. Chai, Z.C. Wei, A modified regularized lattice Boltzmann model for convection-diffusion equation with a source term, Appl. Math. Lett. 112 (2021) 106766.
  • [5] M. Dissanayake, N. Phan-Thien, Neural-network-based approximations for solving partial differential equations, Commun. Numer. Methods Eng. 10 (3) (1994) 195-201.
  • [6] J. Sirignano, K. Spiliopoulos, DGM: a deep learning algorithm for solving partial differential equations, J. Comput. Phys. 375 (2018) 1339-1364.
  • [7] K. Wu, D. Xiu, Data-driven deep learning of partial differential equations in modal space, J. Comput. Phys. 408 (2020) 109307.
  • [8] M. Raissi, P. Perdikaris, G.E. Karniadakis, Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686-707.
  • [9] G.E. Karniadakis, I.G. Kevrekidis, L. Lu, Physics-informed machine learning, Nat. Rev. Phys. 3 (6) (2021) 422-440.
  • [10] N. Zobeiry, K.D. Humfeld, A physics-informed machine learning approach for solving heat transfer equation in advanced manufacturing and engineering applications, Eng. Appl. Artif. Intell. 101 (2021) 104232.
  • [11] Y. Xu, H. Zhang, Y. Li, Solving Fokker-Planck equation using deep learning. Chaos 30 (1) (2020) 013133.
  • [12] G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control. Signals Syst. 2 (4) (1989) 303-314.
  • [13] C.M. Dafermos, Hyperbolic conservation laws in continuum physics, Springer, Berlin, 2005.
  • [14] Z. Mao, A.D. Jagtap, G.E. Karniadakis, Physics-informed neural networks for high-speed flows, Comput. Meth. Appl. Mech. Eng. 360 (2020) 112789.
  • [15] O. Fuks, H.A. Tchelepi, Limitations of physics informed machine learning for nonlinear two-phase transport in porous media, J. Mach. Learn. Model. Comput. 1 (1) (2020).
  • [16] M.M. Almajid, M.O. Abu-Al-Saud, Prediction of porous media fluid flow using physics informed neural networks, J. Pet. Sci. Eng. 208 (2022) 109205.
  • [17] R.G. Patel, I. Manickam, N.A. Trask, Thermodynamically consistent physics-informed neural networks for hyperbolic systems, J. Comput. Phys. (2021) 110754.
  • [18] A.G. Baydin, B.A. Pearlmutter, A.A. Radul, Automatic differentiation in machine learning: a survey, J. Mach. Learn. Res. 18 (2018).
  • [19] R. Borges, M. Carmona, B. Costa, An improved weighted essentially non-oscillatory scheme for hyperbolic conservation laws, J. Comput. Phys. 227 (6) (2008) 3191-3211.
  • [20] G.S. Jiang, C.W. Shu, Efficient implementation of weighted ENO schemes, J. Comput. Phys. 126 (1995) 202-228.
  • [21] X. Meng, Z. Li, D. Zhang, PPINN: Parareal physics-informed neural network for time-dependent PDEs, Comput. Meth. Appl. Mech. Eng. 370 (2020) 113250.
  • [22] L. Fu, A hybrid method with TENO based discontinuity indicator for hyperbolic conservation laws, Commun. Comput. Phys. 26 (4) (2019) 973-1007.