
Singular layer PINN methods for Burgers’ equation

Teng-Yuan Chang1, Gung-Min Gie2, Youngjoon Hong1, and Chang-Yeol Jung3 1 Department of Mathematical Sciences, KAIST, Korea 2 Department of Mathematics, University of Louisville, Louisville, KY 40292 3 Department of Mathematical Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Korea [email protected] [email protected] [email protected] [email protected]
Abstract.

In this article, we present a new learning method called sl-PINN to tackle the one-dimensional viscous Burgers problem at a small viscosity, which results in a singular interior layer. To address this issue, we first determine the corrector that characterizes the unique behavior of the viscous flow within the interior layers by means of asymptotic analysis. We then use these correctors to construct sl-PINN predictions for both stationary and moving interior layer cases of the viscous Burgers problem. Our numerical experiments demonstrate that sl-PINNs accurately predict the solution for low viscosity, notably reducing errors near the interior layer compared to traditional PINN methods. Our proposed method offers a comprehensive understanding of the behavior of the solution near the interior layer, aiding in capturing the robust part of the training solution.

Key words and phrases:
Singular perturbations, Physics-informed neural networks, Burgers’ equation, Interior layers
2000 Mathematics Subject Classification:
76D05, 68T99, 65M99, 68U99

1. Introduction

We consider the 1D Burgers’ equation, particularly focusing on scenarios where the viscosity is small, resulting in the formation of interior layers,

{tuεεx2uε+uεxuε=0,x,t>0,uε|t=0=u0,x,\left\{\begin{array}[]{rl}\vspace{1.5mm}\partial_{t}u^{\varepsilon}-\varepsilon\partial_{x}^{2}u^{\varepsilon}+u^{\varepsilon}\,\partial_{x}u^{\varepsilon}=0,&x\in\mathbb{R},\,\,t>0,\\ u^{\varepsilon}|_{t=0}=u_{0},&x\in\mathbb{R},\end{array}\right. (1.1)

where u_0 = u_0(x) is the given initial data and ε > 0 is the small viscosity parameter. The Burgers’ equation (1.1) is supplemented with one of the following boundary conditions:

  • (i)

    u^ε is periodic in -1 < x < 1, provided that the initial data u_0 is periodic in (-1, 1) as well;

  • (ii)

    u^ε → g_L(t) as x → -∞ and u^ε → g_R(t) as x → ∞, where the boundary data g_L and g_R are smooth functions of time.

In the examples presented in Section 4, we consider the case where the data u_0, g_L, and g_R are sufficiently regular. Hence the Burgers’ equation at a fixed viscosity ε > 0 is well-posed up to any finite time T > 0 and satisfies

|uε(x,t)|C0,(x,t)×(0,T),|u^{\varepsilon}(x,t)|\leq C_{0},\qquad\forall(x,t)\in\mathbb{R}\times(0,T), (1.2)

for a constant C_0 > 0 depending on the data but independent of the viscosity parameter ε > 0; see, e.g., [4] for the proof when the boundary condition (ii) above is imposed.

The corresponding limit u0u^{0} of uεu^{\varepsilon} at the vanishing viscosity ε=0\varepsilon=0 satisfies the inviscid Burgers’ equation,

{tu0+u0xu0=0,x,t>0,u0(x,0)=u0(x),x,\left\{\begin{array}[]{rl}\vspace{1.5mm}\partial_{t}u^{0}+u^{0}\,\partial_{x}u^{0}=0,&x\in\mathbb{R},\,\,t>0,\\ u^{0}(x,0)=u_{0}(x),&x\in\mathbb{R},\end{array}\right. (1.3)

supplemented with (i) or (ii) above, as for the viscous problem (1.1). In all the examples of Section 4, the limit solution u^0 exhibits a discontinuous interior shock located at x = α(t), t ≥ 0 (either stationary, dα/dt = 0, or moving, dα/dt ≠ 0), and u^0 is smooth on each side of x = α(t). Hence, by identifying the two states of the shock, we introduce the smooth one-sided solutions,

u=u(t)=limxα(t)u0(x,t),u+=u+(t)=limxα(t)+u0(x,t).\begin{array}[]{l}\vspace{1.5mm}u^{-}=u^{-}(t)=\lim_{x\rightarrow\alpha(t)^{-}}u^{0}(x,t),\\ u^{+}=u^{+}(t)=\lim_{x\rightarrow\alpha(t)^{+}}u^{0}(x,t).\end{array} (1.4)

Setting α˙(t)=dα/dt\dot{\alpha}(t)=d\alpha/dt, the solution u0u^{0} satisfies that

u>α˙(t)>u+,t0,u^{-}>\dot{\alpha}(t)>u^{+},\quad t\geq 0, (1.5)

and, in addition, we have

α˙(t)=u+u+2,t0.\dot{\alpha}(t)=\dfrac{u^{-}+u^{+}}{2},\quad t\geq 0. (1.6)

Note that, in this case, the viscous solution uεu^{\varepsilon} is located between the left and right solutions of u0u^{0} at the shock x=α(t)x=\alpha(t), i.e.,

u>u+uε2>α˙(t)=u+u+2>uε+u+2>u+,at x=α(t).u^{-}>\dfrac{u^{-}+u^{\varepsilon}}{2}>\dot{\alpha}(t)=\dfrac{u^{-}+u^{+}}{2}>\dfrac{u^{\varepsilon}+u^{+}}{2}>u^{+},\quad\text{at }x=\alpha(t). (1.7)

The primary goal of this article is to analyze and approximate the solution u^ε of (1.1) as the viscosity ε tends to zero when the (at least piecewise) smooth initial data creates an interior shock of u^0. Specifically, we introduce a new machine learning technique, the singular-layer Physics-Informed Neural Network (sl-PINN), to accurately predict solutions to (1.1) when the viscosity is small. Following the methodology outlined in [11, 10, 9, 8], we start in Section 2 by revisiting the asymptotic analysis of (1.1) presented in [4]; see also, e.g., [3, 12, 13, 23]. Using this interior layer analysis, we then build our new sl-PINNs for (1.1) in Section 3. The numerical computations for our sl-PINNs, along with a comparison to those of conventional PINNs, are detailed in Section 4. Finally, in the conclusion in Section 5, we summarize that the solutions predicted by sl-PINNs approximate the stiff interior layer of the Burgers problem quite accurately.

2. Interior layer analysis

We briefly recall from [4] the interior layer analysis for (1.1), which will be used in Section 3 below where we construct our singular-layer Physics-Informed Neural Networks (sl-PINNs) for Burgers’ equation. To analyze the interior layer of the viscous Burgers’ equation (1.1), we first introduce a moving spatial coordinate centered at the shock location α(t),

x~=x~(t)=xα(t),t0,\tilde{x}=\tilde{x}(t)=x-\alpha(t),\quad t\geq 0, (2.1)

and write the solutions of (1.1) and (1.3) with respect to the new system (x~,t)(\tilde{x},t) in the form,

uε(x,t)=u~ε(x~,t),u0(x,t)=u~0(x~,t).u^{\varepsilon}(x,t)=\tilde{u}^{\varepsilon}(\tilde{x},t),\qquad u^{0}(x,t)=\tilde{u}^{0}(\tilde{x},t). (2.2)

The initial condition is written in terms of the new coordinate as

u0(x,t)=u~0(x~,t).u_{0}(x,t)=\tilde{u}_{0}(\tilde{x},t). (2.3)

Now we set

uε(x,t)=u~ε(x~,t)=uLε(x~,t)+uRε(x~,t),u^{\varepsilon}(x,t)=\tilde{u}^{\varepsilon}(\tilde{x},t)=u^{\varepsilon}_{L}(\tilde{x},t)+u^{\varepsilon}_{R}(\tilde{x},t), (2.4)

where uLεu^{\varepsilon}_{L} and uRεu^{\varepsilon}_{R} are the disjoint and smooth portions of uεu^{\varepsilon} located on the left-hand side and the right-hand side of the shock at x=α(t)x=\alpha(t), and they are defined by

{uLε(x~,t)=u~ε(x~,t),for <x~<0,t0,uRε(x~,t)=u~ε(x~,t),for 0<x~<,t0.\left\{\begin{array}[]{l}\vspace{1.5mm}u^{\varepsilon}_{L}(\tilde{x},t)=\tilde{u}^{\varepsilon}(\tilde{x},t),\quad\text{for }-\infty<\tilde{x}<0,\,t\geq 0,\\ u^{\varepsilon}_{R}(\tilde{x},t)=\tilde{u}^{\varepsilon}(\tilde{x},t),\quad\text{for }0<\tilde{x}<\infty,\,t\geq 0.\end{array}\right. (2.5)

Using (1.1) and (2.1) and applying the chain rule, one can verify that u^ε_L and u^ε_R satisfy the equations below:

{tuLεεx~2uLε+x~((uLε2α˙(t))uLε)=0,<x~<0,t>0,uLε(0,t)=u~ε(0,t)=uε(α(t),t),x~=0,uLεgL(t),as x~,uLε(x~,0)=u~0(x~(0))=u0(x~+α(0)),<x~<0,\left\{\begin{array}[]{rl}\vspace{1.5mm}\partial_{t}u^{\varepsilon}_{L}-\varepsilon\partial_{\tilde{x}}^{2}u^{\varepsilon}_{L}+\partial_{\tilde{x}}\left(\left(\displaystyle\frac{u^{\varepsilon}_{L}}{2}-\dot{\alpha}(t)\right)u^{\varepsilon}_{L}\right)=0,&-\infty<\tilde{x}<0,\,\,t>0,\\ \vspace{1.5mm}u^{\varepsilon}_{L}(0,t)=\tilde{u}^{\varepsilon}(0,t)=u^{\varepsilon}(\alpha(t),t),&\tilde{x}=0,\\ \vspace{1.5mm}u^{\varepsilon}_{L}\to g_{L}(t),&\text{as }\tilde{x}\to-\infty,\\ u^{\varepsilon}_{L}(\tilde{x},0)=\tilde{u}_{0}(\tilde{x}(0))=u_{0}(\tilde{x}+\alpha(0)),&-\infty<\tilde{x}<0,\end{array}\right. (2.6)

and

{tuRεεx~2uRε+x~((uRε2α˙(t))uRε)=0,0<x~<,t>0,uRε(0,t)=u~ε(0,t),=uε(α(t),t),x~=0,uRεgR(t),as x~,uRε(x~,0)=u~0(x~(0))=u0(x~+α(0)),0<x~<.\left\{\begin{array}[]{rl}\vspace{1.5mm}\partial_{t}u^{\varepsilon}_{R}-\varepsilon\partial_{\tilde{x}}^{2}u^{\varepsilon}_{R}+\partial_{\tilde{x}}\left(\left(\displaystyle\frac{u^{\varepsilon}_{R}}{2}-\dot{\alpha}(t)\right)u^{\varepsilon}_{R}\right)=0,&0<\tilde{x}<\infty,\,\,t>0,\\ \vspace{1.5mm}u^{\varepsilon}_{R}(0,t)=\tilde{u}^{\varepsilon}(0,t),=u^{\varepsilon}(\alpha(t),t),&\tilde{x}=0,\\ \vspace{1.5mm}u^{\varepsilon}_{R}\to g_{R}(t),&\text{as }\tilde{x}\to\infty,\\ u^{\varepsilon}_{R}(\tilde{x},0)=\tilde{u}_{0}(\tilde{x}(0))=u_{0}(\tilde{x}+\alpha(0)),&0<\tilde{x}<\infty.\end{array}\right. (2.7)

Thanks to the moving coordinate system (x~,t)(\tilde{x},t), now we are ready to analyze the interior shock layer near x=α(t)x=\alpha(t) by investigating the two disjoint boundary layers of uLεu^{\varepsilon}_{L} and uRεu^{\varepsilon}_{R}, located on the left-hand and right-hand sides of x=α(t)x=\alpha(t). To this end, we introduce asymptotic expansions in the form,

{uLε(x~,t)=uL0(x~,t)+φL(x~,t),<x~<0,t>0,uRε(x~,t)=uR0(x~,t)+φR(x~,t),0<x~<,t>0.\left\{\begin{array}[]{l}\vspace{1.5mm}u^{\varepsilon}_{L}(\tilde{x},t)=u^{0}_{L}(\tilde{x},t)+\varphi_{L}(\tilde{x},t),\quad-\infty<\tilde{x}<0,\,\,t>0,\\ u^{\varepsilon}_{R}(\tilde{x},t)=u^{0}_{R}(\tilde{x},t)+\varphi_{R}(\tilde{x},t),\quad 0<\tilde{x}<\infty,\,\,t>0.\end{array}\right. (2.8)

Here uL0u^{0}_{L} and uR0u^{0}_{R} are the disjoint and smooth portions of u0u^{0} located on the left-hand side and the right-hand side of x=α(t)x=\alpha(t), given by

{uL0(x~,t)=u~0(x~,t),for <x~<0,t0,uR0(x~,t)=u~0(x~,t),for 0<x~<,t0,\left\{\begin{array}[]{l}\vspace{1.5mm}u^{0}_{L}(\tilde{x},t)=\tilde{u}^{0}(\tilde{x},t),\quad\text{for }-\infty<\tilde{x}<0,\,t\geq 0,\\ u^{0}_{R}(\tilde{x},t)=\tilde{u}^{0}(\tilde{x},t),\quad\text{for }0<\tilde{x}<\infty,\,t\geq 0,\end{array}\right. (2.9)

i.e., u~0=uL0+uR0\tilde{u}^{0}=u^{0}_{L}+u^{0}_{R} for each (x~,t)×(0,)(\tilde{x},t)\in\mathbb{R}\times(0,\infty).

The functions φ_L and φ_R in the asymptotic expansion (2.8) are artificial functions, called correctors, that approximate the stiff behavior of u^ε_* - u^0_*, * = L, R, near the shock of u^0 at x̃ = 0 (equivalently, x = α(t)). Following the asymptotic analysis in [4], we define the correctors φ_L and φ_R as the solutions of

{εx~2φL+12x~(2uL0(0,t)φL2α˙(t)φL+(φL)2)=0,<x~<0,t>0,φL(0,t)=b(t)uL0(0,t),x~=0,φL0,as x~,\left\{\begin{array}[]{rl}\vspace{1.5mm}\displaystyle-\varepsilon\partial_{\tilde{x}}^{2}{\varphi}_{L}+\frac{1}{2}\partial_{\tilde{x}}\left(2u^{0}_{L}(0,t){\varphi}_{L}-2\dot{\alpha}(t){\varphi}_{L}+({\varphi}_{L})^{2}\right)=0,&-\infty<\tilde{x}<0,\,\,t>0,\\ \vspace{1.5mm}{\varphi}_{L}(0,t)=b(t)-u^{0}_{L}(0,t),&\tilde{x}=0,\\ {\varphi}_{L}\to 0,&\text{as }\tilde{x}\to-\infty,\end{array}\right. (2.10)

and

{εx~2φR+12x~(2uR0(0,t)φR2α˙(t)φR+(φR)2)=0,0<x~<,t>0,φR(0,t)=b(t)uR0(0,t),x~=0,φR0,as x~,\left\{\begin{array}[]{rl}\vspace{1.5mm}\displaystyle-\varepsilon\partial_{\tilde{x}}^{2}{\varphi}_{R}+\frac{1}{2}\partial_{\tilde{x}}\left(2u^{0}_{R}(0,t){\varphi}_{R}-2\dot{\alpha}(t){\varphi}_{R}+({\varphi}_{R})^{2}\right)=0,&0<\tilde{x}<\infty,\,\,t>0,\\ \vspace{1.5mm}{\varphi}_{R}(0,t)=b(t)-u^{0}_{R}(0,t),&\tilde{x}=0,\\ {\varphi}_{R}\to 0,&\text{as }\tilde{x}\to\infty,\end{array}\right. (2.11)

where

b=b(t)=uε(α(t),t).b=b(t)=u^{\varepsilon}(\alpha(t),t). (2.12)

Thanks to the property (1.2) of u^ε, we notice that the value of b is bounded by a constant independent of the small viscosity parameter ε.

We recall from Lemma 3.1 in [4] that the explicit expression of φL\varphi_{L} (and hence of φR\varphi_{R} by the symmetry) is given by

φ(x~,t)\displaystyle{\varphi}_{*}\left({\tilde{x}},t\right) =(α˙(t)u0(0,t))(1+tanh(M2(x~m))),=L,R,\displaystyle=\left(\dot{\alpha}(t)-u^{0}_{*}(0,t)\right)\left(1+\tanh\left(\frac{M_{*}}{2}(\tilde{x}-m_{*})\right)\right),\quad*=L,R, (2.13)

where

M=u0(0,t)α˙(t)ε,m=1Mlog(u0(0,t)2α˙(t)+b(t)u0(0,t)b(t)).\displaystyle M_{*}=\dfrac{u^{0}_{*}(0,t)-\dot{\alpha}(t)}{\varepsilon},\qquad m_{*}=\dfrac{1}{M_{*}}\log\left(\frac{u^{0}_{*}(0,t)-2\dot{\alpha}(t)+b(t)}{u^{0}_{*}(0,t)-b(t)}\right).
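To make the shape of the corrector concrete, the short sketch below evaluates (2.13) for given values of u^0_*(0, t), α̇(t), b(t), and ε; the numerical values used are hypothetical and serve only to illustrate the tanh transition of width O(ε) and the boundary value φ_*(0, t) = b(t) - u^0_*(0, t).

```python
import numpy as np

def corrector(x_tilde, u0_star, alpha_dot, b, eps):
    """Evaluate the corrector (2.13) at points x_tilde = x - alpha(t).

    u0_star   : one-sided limit u^0_*(0, t) of the inviscid solution at the shock
    alpha_dot : shock speed, (u^- + u^+)/2 by (1.6)
    b         : value b(t) = u^eps(alpha(t), t) as in (2.12)
    eps       : viscosity parameter
    """
    M = (u0_star - alpha_dot) / eps
    m = np.log((u0_star - 2.0 * alpha_dot + b) / (u0_star - b)) / M
    return (alpha_dot - u0_star) * (1.0 + np.tanh(0.5 * M * (x_tilde - m)))

# hypothetical left-side values for a stationary shock (alpha_dot = 0, b = 0)
x = np.linspace(-0.2, 0.0, 201)
phi_L = corrector(x, u0_star=1.0, alpha_dot=0.0, b=0.0, eps=1e-3)
print(phi_L[[0, 100, 200]])  # ~0 away from the shock; b - u^0_L = -1 at x_tilde = 0
```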

The validity of our asymptotic expansion of u^ε in (2.4) and (2.8), as well as the vanishing viscosity limit of u^ε to u^0, is rigorously verified in [4]; here we briefly recall the result without the detailed and technical proof (see Theorem 3.5 in [4]):

Under a certain regularity condition on the limit solutions u^0_L and u^0_R, the asymptotic expansion in (2.4) and (2.8) is valid in the sense that

{uLε(uL0+φL)L(0,T;L2())+ε12uLε(uL0+φL)L2(0,T;H1())κε,uRε(uR0+φR)L(0,T;L2(+))+ε12uRε(uR0+φR)L2(0,T;H1(+))κε,\left\{\begin{array}[]{l}\vspace{1.5mm}\|u^{\varepsilon}_{L}-(u^{0}_{L}+\varphi_{L})\|_{L^{\infty}(0,T;L^{2}(\mathbb{R}^{-}))}+\varepsilon^{\frac{1}{2}}\|u^{\varepsilon}_{L}-(u^{0}_{L}+\varphi_{L})\|_{L^{2}(0,T;H^{1}(\mathbb{R}^{-}))}\leq\kappa\varepsilon,\\ \|u^{\varepsilon}_{R}-(u^{0}_{R}+\varphi_{R})\|_{L^{\infty}(0,T;L^{2}(\mathbb{R}^{+}))}+\varepsilon^{\frac{1}{2}}\|u^{\varepsilon}_{R}-(u^{0}_{R}+\varphi_{R})\|_{L^{2}(0,T;H^{1}(\mathbb{R}^{+}))}\leq\kappa\varepsilon,\end{array}\right. (2.14)

for a generic constant κ>0\kappa>0 depending on the data, but independent of the viscosity parameter ε\varepsilon. Moreover, the viscous solution uεu^{\varepsilon} converges to the corresponding limit solution u0u^{0} as the viscosity vanishes in the sense that

{uLεuL0L(0,T;L2())+uRεuR0L(0,T;L2(+))κε12,uLεuL0L(0,T;L1(0,K))+uRεuR0L(0,T;L1(K,0)κε,K>0,\left\{\begin{array}[]{l}\vspace{1.5mm}\|u^{\varepsilon}_{L}-u^{0}_{L}\|_{L^{\infty}(0,T;L^{2}(\mathbb{R}^{-}))}+\|u^{\varepsilon}_{R}-u^{0}_{R}\|_{L^{\infty}(0,T;L^{2}(\mathbb{R}^{+}))}\leq\kappa\varepsilon^{\frac{1}{2}},\\ \|u^{\varepsilon}_{L}-u^{0}_{L}\|_{L^{\infty}(0,T;L^{1}(0,K))}+\|u^{\varepsilon}_{R}-u^{0}_{R}\|_{L^{\infty}(0,T;L^{1}(-K,0)}\leq\kappa\varepsilon,\quad\forall K>0,\end{array}\right. (2.15)

for a constant κ>0\kappa>0 depending on the data, but independent of ε\varepsilon.

3. PINNs and sl-PINNs for Burgers’ equation

In this section, we develop a new machine learning process called the singular-layer Physics-Informed Neural Network (sl-PINN) for the Burgers’ equation (1.1) and compare its performance with that of the usual PINNs, specifically when the viscosity ε > 0 is small and hence the viscous Burgers’ solution shows rapid changes in a small region known as the interior layer. To start, we provide a brief overview of Physics-Informed Neural Networks (PINNs) for solving the Burgers equation. We then introduce our new sl-PINNs and discuss their capabilities in comparison with traditional PINNs. We see in Section 4 below that our new sl-PINNs yield more accurate predictions, particularly when the viscosity parameter is small.

3.1. PINN structure

We introduce an LL-layer feed-forward neural network (NN) defined by the recursive expression: 𝒩l\mathcal{N}^{l}, 0lL0\leq l\leq L such that

input layer: 𝒩0(𝒙)=𝒙N0,\displaystyle\mathcal{N}^{0}(\boldsymbol{x})=\boldsymbol{x}\in\mathbb{R}^{N_{0}},
hidden layers: 𝒩l(𝒙)=σ(𝑾l𝒩l1(𝒙)+𝒃l)Nl,1lL1,\displaystyle\mathcal{N}^{l}(\boldsymbol{x})=\sigma(\boldsymbol{W}^{l}\mathcal{N}^{l-1}(\boldsymbol{x})+\boldsymbol{b}^{l})\in\mathbb{R}^{N_{l}},\quad 1\leq l\leq L-1,
output layer: 𝒩L(𝒙)=𝑾L𝒩L1(𝒙)+𝒃LNL,\displaystyle\mathcal{N}^{L}(\boldsymbol{x})=\boldsymbol{W}^{L}\mathcal{N}^{L-1}(\boldsymbol{x})+\boldsymbol{b}^{L}\in\mathbb{R}^{N_{L}},

where 𝒙 ∈ ℝ^{N_0} is the input vector, and 𝑾^l ∈ ℝ^{N_l × N_{l-1}} and 𝒃^l ∈ ℝ^{N_l}, l = 1, …, L, are the (unknown) weights and biases, respectively. Here N_l, l = 0, …, L, is the number of neurons in the l-th layer, and σ is the activation function. We set 𝜽 = {(𝑾^l, 𝒃^l)}_{l=1}^{L} as the collection of all the parameters, i.e., the weights and biases.
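As a concrete illustration, a minimal PyTorch sketch of this feed-forward architecture is given below; the framework, the default layer sizes (matching the 4×20 networks used in Section 4), and the tanh activation are our assumptions for illustration, not a statement of the authors' implementation.

```python
import torch
import torch.nn as nn

class FeedForwardNN(nn.Module):
    """L-layer network N^l: input (x, t) in R^2, scalar output u_hat(x, t; theta)."""

    def __init__(self, layer_sizes=(2, 20, 20, 20, 1), activation=torch.tanh):
        super().__init__()
        self.linears = nn.ModuleList(
            [nn.Linear(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
        )
        self.activation = activation

    def forward(self, x):
        # hidden layers: N^l = sigma(W^l N^{l-1} + b^l); the output layer is affine
        for linear in self.linears[:-1]:
            x = self.activation(linear(x))
        return self.linears[-1](x)

# example: evaluate u_hat on a batch of random (x, t) points
net = FeedForwardNN()
u_hat = net(torch.rand(16, 2))   # shape (16, 1)
```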

To solve the Burgers equation (1.1), we use the NN structure above with output 𝒩^L = û(x, t; 𝜽) and input 𝒙 = (x, t) ∈ (-1, 1) × (0, ∞), where a proper treatment is implemented for the boundary condition (i) or (ii) in (1.1); see (4.5) and (4.11) below. To determine the predicted solution û of (1.1), we use the loss function defined by

(𝜽;𝒯,𝒯b,𝒯0)=1|𝒯|(x,t)𝒯|tu^εx2u^+u^xu^|2+1|𝒯b|t𝒯bl(u^)+1|𝒯0|x𝒯0|u^|t=0u0|2,\begin{split}\mathcal{L}(\boldsymbol{\theta};\mathcal{T},\mathcal{T}_{b},\mathcal{T}_{0})&=\frac{1}{|\mathcal{T}|}\sum_{(x,t)\in\mathcal{T}}\left|\partial_{t}\hat{u}-\varepsilon\partial_{x}^{2}\hat{u}+\hat{u}\,\partial_{x}\hat{u}\right|^{2}\\ &+\frac{1}{|\mathcal{T}_{b}|}{\sum_{t\in\mathcal{T}_{b}}l(\hat{u})}+\frac{1}{|\mathcal{T}_{0}|}\sum_{x\in\mathcal{T}_{0}}\left|\hat{u}|_{t=0}-u_{0}\right|^{2},\end{split} (3.1)

where 𝒯 ⊂ Ω_x × Ω_t is a set of training data in the space-time domain, 𝒯_b ⊂ ∂Ω_x × Ω_t on the boundary, and 𝒯_0 ⊂ Ω_x on the initial boundary at t = 0. The function l(û) in (3.1) is defined differently for each example below to enforce the boundary condition (i) or (ii) in (1.1); see (4.5) and (4.11) below.
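The residual and initial-condition terms of (3.1) can be assembled with automatic differentiation as in the hedged sketch below; the function names and the way the training sets are passed are illustrative assumptions, and the boundary term l(û), which depends on (i) or (ii), is added separately as in (4.5) or (4.11).

```python
import torch

def pde_residual(net, x, t, eps):
    """Residual of (1.1): u_t - eps * u_xx + u * u_x at the points (x, t)."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.stack([x, t], dim=1)).squeeze(-1)
    u_x = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    u_t = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u_t - eps * u_xx + u * u_x

def pinn_loss(net, xt_interior, x_initial, u0_fn, eps):
    """Interior and initial-condition terms of (3.1); the boundary term l(u_hat)
    depends on the boundary condition (i) or (ii) and is added separately."""
    res = pde_residual(net, xt_interior[:, 0], xt_interior[:, 1], eps)
    u_init = net(torch.stack([x_initial, torch.zeros_like(x_initial)], dim=1)).squeeze(-1)
    return (res ** 2).mean() + ((u_init - u0_fn(x_initial)) ** 2).mean()
```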

3.2. sl-PINN structure

To construct our predicted solution for the Burgers equation (1.1), we revisit the interior layer analysis in Section 2 and, using this analytic result, develop two training solutions in distinct regions. More precisely, we partition the domain Ω_x into disjoint subdomains Ω_{x,L} and Ω_{x,R}, separated by the shock curve x = α(t). We then introduce the output functions û_* = û_*(x, t; 𝜽_*) of the neural networks on the subdomains Ω_{x,*}, * = L, R. The training solution for our sl-PINN method is then defined as:

u~(x,t;𝜽)=u^(x,t;𝜽)+φ~(x,t;𝜽),=L,R.\tilde{u}_{*}(x,t;\boldsymbol{\theta}_{*})=\hat{u}_{*}(x,t;\boldsymbol{\theta}_{*})+\tilde{\varphi}_{*}(x,t;\boldsymbol{\theta}_{*}),\quad*=L,R. (3.2)

Here φ̃_* is the training form of the corrector φ_* in (2.13), and it is given by

φ~(x,t;𝜽)=(α˙(t)u^(α(t),t;𝜽))(1+tanh(M~2(xα(t)m~))),\tilde{\varphi}_{*}(x,t;\boldsymbol{\theta}_{*})=(\dot{\alpha}(t)-\hat{u}_{*}(\alpha(t),t;\boldsymbol{\theta}_{*}))\left(1+\tanh\left(\frac{\tilde{M_{*}}}{2}(x-\alpha(t)-\tilde{m})\right)\right), (3.3)

with the trainable parameters,

M~=u^(α(t),t;𝜽)α˙(t)ε,m~=1M~log(u^(α(t),t;𝜽)2α˙(t)+b~u^(α(t),t;𝜽)b~).\tilde{M}_{*}=\frac{\hat{u}_{*}(\alpha(t),t;\boldsymbol{\theta}_{*})-\dot{\alpha}(t)}{\varepsilon},\quad\tilde{m}_{*}=\frac{1}{\tilde{M}_{*}}\log\left(\frac{\hat{u}_{*}(\alpha(t),t;\boldsymbol{\theta}_{*})-2\dot{\alpha}(t)+\tilde{b}}{\hat{u}_{*}(\alpha(t),t;\boldsymbol{\theta}_{*})-\tilde{b}}\right). (3.4)

Note that the b̃ appearing in (3.4) is an approximation of b(t) = u^ε(α(t), t), and it is updated iteratively during the network training; see Algorithm 1 below.

As noted in Section 2, the correctors φ̃_*, * = L, R, capture well the singular behavior of the viscous Burgers’ solution u^ε near the shock of u^0 at x = α(t). Hence, by embedding these correctors in the training solution, we encode the stiff behavior of u^ε directly in (3.2).
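The sketch below shows one way to assemble the training solution (3.2)–(3.4) on a subdomain, reusing a network like the one sketched in Section 3.1; the callables alpha and alpha_dot, the scalar treatment of b̃, and the function names are illustrative assumptions rather than the authors' implementation.

```python
import torch

def sl_pinn_solution(net_star, x, t, alpha, alpha_dot, b_tilde, eps):
    """Training solution u_tilde_* = u_hat_* + phi_tilde_* of (3.2)-(3.4).

    net_star         : network u_hat_* on the subdomain Omega_{x,*} (* = L or R)
    alpha, alpha_dot : callables t -> alpha(t) and t -> d(alpha)/dt
    b_tilde          : current approximation of b(t) = u^eps(alpha(t), t)
    """
    a, a_dot = alpha(t), alpha_dot(t)
    u_hat = net_star(torch.stack([x, t], dim=1)).squeeze(-1)
    # value of u_hat_* on the shock curve x = alpha(t); by (1.5) it stays away
    # from alpha_dot, so M below does not vanish
    u_hat_shock = net_star(torch.stack([a, t], dim=1)).squeeze(-1)
    M = (u_hat_shock - a_dot) / eps                                   # (3.4)
    m = torch.log((u_hat_shock - 2.0 * a_dot + b_tilde)
                  / (u_hat_shock - b_tilde)) / M                      # (3.4)
    phi = (a_dot - u_hat_shock) * (1.0 + torch.tanh(0.5 * M * (x - a - m)))  # (3.3)
    return u_hat + phi
```

In the stationary, odd-symmetric example of Section 4.1 one may take alpha(t) ≡ 0, alpha_dot(t) ≡ 0, and b̃ = 0, in which case m̃ = 0 and the corrector reduces to the simpler expression given there.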

To define the loss function for our sl-PINN method, we adapt the idea of the eXtended Physics-Informed Neural Networks (XPINNs); see, e.g., [17]. To control both the mismatch of the two training solutions across the shock curve and the mismatch of their equation residuals along the interface, we define the loss function as

(θ;𝒯,𝒯,b,𝒯,0,𝒯Γ)=1|𝒯|(x,t)𝒯|tu~εx2u~+u~xu~|2+1|𝒯,b|t𝒯,bl(u~)+1|𝒯,0|x𝒯,0|u~|t=0u0|2+1|𝒯Γ|(x,t)𝒯Γ|u~Lu~R|2+1|𝒯Γ|(x,t)𝒯Γ|(tu~Lεx2u~L+u~Lxu~L)(tu~Rεx2u~R+u~Rxu~R)|2,=L,R.\begin{split}&\mathcal{L}_{*}(\theta_{*};\mathcal{T}_{*},\mathcal{T}_{*,b},\mathcal{T}_{*,0},\mathcal{T}_{\Gamma})\\ &=\frac{1}{|\mathcal{T}_{*}|}\sum_{(x,t)\in\mathcal{T}_{*}}\left|\partial_{t}\tilde{u}_{*}-\varepsilon\partial_{x}^{2}\tilde{u}_{*}+\tilde{u}_{*}\partial_{x}\tilde{u}_{*}\right|^{2}\\ &+{\frac{1}{|\mathcal{T}_{*,b}|}\sum_{t\in\mathcal{T}_{*,b}}l(\tilde{u}_{*})}+\frac{1}{|\mathcal{T}_{*,0}|}\sum_{x\in\mathcal{T}_{*,0}}\left|\tilde{u}_{*}|_{t=0}-u_{0}\right|^{2}\\ &+\frac{1}{|\mathcal{T}_{\Gamma}|}\sum_{(x,t)\in\mathcal{T}_{\Gamma}}{|\tilde{u}_{L}-\tilde{u}_{R}|^{2}}\\ &+\frac{1}{|\mathcal{T}_{\Gamma}|}\sum_{(x,t)\in\mathcal{T}_{\Gamma}}\left|{\left(\partial_{t}\tilde{u}_{L}-\varepsilon\partial_{x}^{2}\tilde{u}_{L}+\tilde{u}_{L}\partial_{x}\tilde{u}_{L}\right)-\left(\partial_{t}\tilde{u}_{R}-\varepsilon\partial_{x}^{2}\tilde{u}_{R}+\tilde{u}_{R}\partial_{x}\tilde{u}_{R}\right)}\right|^{2},\quad*=L,R.\end{split} (3.5)

Here the interface between Ω_L = (-∞, α(t)) and Ω_R = (α(t), ∞) is given by Γ := {(α(t), t) ∈ Ω_x × Ω_t : t ∈ Ω_t}, and the training data sets are chosen such that 𝒯_* ⊂ Ω_{x,*} × Ω_t, 𝒯_{*,b} ⊂ ∂Ω_{x,*} × Ω_t, 𝒯_{*,0} ⊂ Ω_{x,*}, and 𝒯_Γ ⊂ Γ. The function l(ũ_*), * = L, R, in (3.5) is defined differently for each example below to enforce the boundary condition (i) or (ii) in (1.1); see (4.6) and (4.12) below.
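For one subdomain, the five terms of the loss (3.5) can be combined as in the minimal sketch below; the residuals, interface values, and boundary term are assumed to be precomputed (for instance with the helpers sketched above), and all names are illustrative.

```python
import torch

def sl_pinn_subdomain_loss(res_interior, u_init, u0_vals, bc_term,
                           u_L_gamma, u_R_gamma, res_L_gamma, res_R_gamma):
    """Loss L_* of (3.5) for one subdomain (* = L or R).

    res_interior    : Burgers residual of u_tilde_* on the interior points T_*
    u_init, u0_vals : u_tilde_*|_{t=0} and u_0 on the initial points T_{*,0}
    bc_term         : boundary term l(u_tilde_*), e.g. (4.6) or (4.12)
    *_gamma         : values and residuals of both training solutions on T_Gamma
    """
    interior = (res_interior ** 2).mean()
    initial = ((u_init - u0_vals) ** 2).mean()
    continuity = ((u_L_gamma - u_R_gamma) ** 2).mean()          # |u_L - u_R|^2 on Gamma
    residual_match = ((res_L_gamma - res_R_gamma) ** 2).mean()  # residual mismatch on Gamma
    return interior + bc_term + initial + continuity + residual_match
```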

The solutions ũ_L and ũ_R defined in (3.2) are trained simultaneously at each iteration step using the corresponding loss functions ℒ_L and ℒ_R defined in (3.5). Note that we define two separate networks for ũ_L and ũ_R, but they interact through the interface terms of the loss (3.5) at each iteration, so the trainings of ũ_L and ũ_R proceed simultaneously and depend on each other. The training procedure is listed in Algorithm 1 below.

Algorithm 1 (sl-PINN training algorithm)
  • Construct two NNs on the subdomains Ωx,×Ωt\Omega_{x,*}\times\Omega_{t}, u^=u^(x,t;𝜽(0))\hat{u}_{*}=\hat{u}_{*}(x,t;\boldsymbol{\theta}^{(0)}_{*}) for =L,R*=L,R, where 𝜽(0)\boldsymbol{\theta}^{(0)}_{*} denotes the initial parameters of the NN.

  • Initialize the b~\tilde{b} by b~=21(limxα(0)u0(x)+limxα(0)+u0(x))\tilde{b}=2^{-1}\big{(}\lim_{x\rightarrow\alpha(0)^{-}}u_{0}(x)+\lim_{x\rightarrow\alpha(0)^{+}}u_{0}(x)\big{)}.

  • Define the training solutions u~=u~(x,t;𝜽(0))\tilde{u}_{*}=\tilde{u}_{*}(x,t;\boldsymbol{\theta}^{(0)}_{*}), =L,R*=L,R, as in (3.2).

  • Choose the training sets of random data points as 𝒯\mathcal{T}_{*}, 𝒯,b\mathcal{T}_{*,b}, 𝒯,0\mathcal{T}_{*,0}, and 𝒯Γ\mathcal{T}_{\Gamma}, =L,R*=L,R.

  • Define the loss functions \mathcal{L}_{*}, =L,R*=L,R, as in (3.5).

  • for k=1,,k=1,\ldots,\ell do

    • train u~=u~(k)=u~(x,t;𝜽(k))\tilde{u}_{*}=\tilde{u}_{*}^{(k)}=\tilde{u}_{*}(x,t;\boldsymbol{\theta}^{(k)}_{*}) and obtain the updated parameters 𝜽(k+1)\boldsymbol{\theta}_{*}^{(k+1)} for =L,R*=L,R

    • update b~\tilde{b} by b~=12(u~L(α(t),t;𝜽L(k+1))+u~R(α(t),t;𝜽R(k+1)))\tilde{b}=\frac{1}{2}\left(\tilde{u}_{L}\left(\alpha(t),t;\boldsymbol{\theta}_{L}^{(k+1)}\right)+\tilde{u}_{R}\left(\alpha(t),t;\boldsymbol{\theta}_{R}^{(k+1)}\right)\right)

  • end for

  • Obtain the predicted solution as

    u~=u~L()+u~R().\tilde{u}=\tilde{u}^{(\ell)}_{L}+\tilde{u}^{(\ell)}_{R}.

Here the superscript (k) denotes the k-th iteration step of the optimizer, and ℓ is the maximum number of iterations, or the iteration at which early stopping within a certain tolerance is achieved. A minimal sketch of this training loop is given below.
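The sketch assumes the PyTorch setting used above; it employs one Adam optimizer per network, collapses b̃ to a single scalar (it may instead be kept pointwise in t along the interface), and omits the Adam/L-BFGS switch and early stopping used in Section 4. All names are illustrative.

```python
import torch

def train_sl_pinn(net_L, net_R, loss_fns, shock_eval, u0_left, u0_right,
                  n_iters=1000, lr=1e-3):
    """Sketch of Algorithm 1: simultaneous training of u_tilde_L and u_tilde_R.

    loss_fns   : dict {"L": fn, "R": fn}; each fn(net_L, net_R, b_tilde) returns
                 the scalar loss (3.5) for that subdomain
    shock_eval : callable (net, b_tilde) -> mean value of the training solution
                 u_tilde_* on the interface points (alpha(t), t)
    u0_left, u0_right : one-sided limits of u_0 at the initial shock alpha(0)
    """
    b_tilde = 0.5 * (u0_left + u0_right)      # initialization of b_tilde
    opts = {"L": torch.optim.Adam(net_L.parameters(), lr=lr),
            "R": torch.optim.Adam(net_R.parameters(), lr=lr)}
    for _ in range(n_iters):
        for side in ("L", "R"):
            opts[side].zero_grad()
            loss = loss_fns[side](net_L, net_R, b_tilde)
            loss.backward()
            opts[side].step()
        with torch.no_grad():
            # update b_tilde as the average of the two training solutions on the shock
            b_tilde = 0.5 * (shock_eval(net_L, b_tilde) + shock_eval(net_R, b_tilde))
    return net_L, net_R, b_tilde
```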

4. Numerical Experiments

We evaluate the performance of our new sl-PINNs in solving the viscous Burgers problem (1.1) when the viscosity is small, resulting in a stiff interior layer. We investigate this problem with both smooth and non-smooth initial data. In the smooth case, we use a sine function, a well-known benchmark for the slightly viscous Burgers’ equation [1, 15, 6]. In addition, we consider initial data composed of Heaviside functions, which lead to steady or moving shocks (refer to Chapters 2.3 and 8.4 in [20] or to [7, 2]).

We define the error between the exact solution and the predicted solution as

E(x,t)=uref(x,t)upred(x,t),\displaystyle E(x,t)=u_{ref}(x,t)-u_{pred}(x,t),

where u_ref is the high-resolution reference solution obtained by a finite difference method, and u_pred is the predicted solution obtained by either PINNs or sl-PINNs. A forward-time, central-space finite difference method is employed to generate the reference solutions in all tests, where the mesh size Δx = 10⁻⁵ and the time step Δt = 10⁻⁷ are chosen to be sufficiently small.
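For illustration, a coarse forward-time, central-space discretization of (1.1) of this kind is sketched below; the grid parameters are chosen only so that the example runs quickly (the reference solutions in the paper use the much finer Δx = 10⁻⁵ and Δt = 10⁻⁷), and simply holding the boundary values fixed is a simplification that is adequate for the examples considered here over the computed times.

```python
import numpy as np

def burgers_fdm(u0, x, dt, n_steps, eps):
    """Forward-time, central-space scheme for u_t - eps*u_xx + u*u_x = 0.

    u0 : initial values on the uniform grid x; the end values are held fixed,
         which is adequate here since u^eps is essentially constant at x = +-1.
    """
    u = u0.copy()
    dx = x[1] - x[0]
    for _ in range(n_steps):
        u_x = (u[2:] - u[:-2]) / (2.0 * dx)
        u_xx = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
        u[1:-1] = u[1:-1] + dt * (eps * u_xx - u[1:-1] * u_x)
        # boundary values u[0], u[-1] are left unchanged
    return u

# coarse illustrative run (the paper uses the much finer dx = 1e-5, dt = 1e-7)
x = np.linspace(-1.0, 1.0, 401)
u_ref = burgers_fdm(-np.sin(np.pi * x), x, dt=1e-5, n_steps=1000, eps=1e-2 / np.pi)
```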

When computing the error for each example below, we use the high-resolution reference solution instead of the exact solution. This is because the explicit expressions (4.4) and (4.9) of the solutions are not suitable for direct use, as they involve integrals that are difficult to evaluate accurately when the viscosity is small.

We measure below the error of each computation in the various norms,

ELx2(Ω)\displaystyle\|E\|_{L^{2}_{x}(\Omega)} =(Ω|E(x,t)|2𝑑x)12,\displaystyle=\left(\int_{\Omega}\left|E(x,t)\right|^{2}dx\right)^{\frac{1}{2}}, (4.1)
ELt2(Lx2(Ω))\displaystyle\|E\|_{L^{2}_{t}(L^{2}_{x}(\Omega))} =(0TELx2(Ω)2𝑑t)12,\displaystyle=\left(\int_{0}^{T}\|E\|^{2}_{L^{2}_{x}(\Omega)}dt\right)^{\frac{1}{2}}, (4.2)

and

ELx(Ω)=maxxΩ|E(x,t)|.\displaystyle\|E\|_{L^{\infty}_{x}(\Omega)}=\max_{x\in\Omega}\left|E(x,t)\right|. (4.3)
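The discrete counterparts of these norms can be computed as in the sketch below; the simple Riemann-sum quadrature on a uniform space-time grid is our assumption about how the integrals are approximated.

```python
import numpy as np

def l2_in_x(err, dx):
    """Discrete L^2_x norm (4.1) on a uniform grid with spacing dx."""
    return np.sqrt(np.sum(err**2) * dx)

def l2_in_t_l2_in_x(err, dx, dt):
    """Discrete L^2_t(L^2_x) norm (4.2); err has shape (n_times, n_points)."""
    return np.sqrt(np.sum(err**2) * dx * dt)

def linf_in_x(err):
    """Discrete L^infty_x norm (4.3) at a fixed time."""
    return np.max(np.abs(err))
```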

4.1. Smooth initial data

We consider the Burgers’ equation (1.1) supplemented with the periodic boundary condition (i), where the initial data is given by

u0(x)=sin(πx).\displaystyle u_{0}(x)=-\sin(\pi x).

For this example, the explicit expression of solution is given by

uε(x,t)=sin(π(xη))F(xη)G(η,t)𝑑ηF(xη)G(η,t)𝑑η,\displaystyle u^{\varepsilon}(x,t)=\dfrac{-\int_{\mathbb{R}}\sin(\pi(x-\eta))\ F(x-\eta)\ G(\eta,t)\ d\eta}{\int_{\mathbb{R}}\ F(x-\eta)\ G(\eta,t)\ d\eta}, (4.4)

where

F(y)=exp(cos(πy)2πε),G(y,t)=exp(y24εt).\displaystyle F(y)=\exp\bigg{(}-\dfrac{\cos(\pi y)}{2\pi\varepsilon}\bigg{)},\quad G(y,t)=\exp\bigg{(}-\frac{y^{2}}{4\varepsilon t}\bigg{)}.

Even though the analytic solution above has been well studied (see, for example, [1]), any numerical approximation of (4.4) requires a dedicated quadrature for the integrals, and the integrands become increasingly singular as the viscosity ε gets small. To avoid these technical difficulties in our simulations, we use a high-resolution finite difference method to produce our reference solution.

Now, to build our sl-PINN predicted solution, we first note that the interior layer for this example is stationary, i.e., α(t) ≡ 0, and that b̃ = 0 by the odd symmetry of the solution; hence the training corrector (3.3) reduces to

φ~(x,t;𝜽)=u^(0,t;𝜽)(1+tanh(u^(0,t;𝜽)2εx)).\displaystyle\tilde{\varphi}_{*}(x,t;\boldsymbol{\theta}_{*})=-\hat{u}_{*}(0,t;\boldsymbol{\theta}_{*})\left(1+\tanh\left(\frac{\hat{u}_{*}(0,t;\boldsymbol{\theta}_{*})}{2\varepsilon}\ x\right)\right).

For both the classical PINNs and the sl-PINNs, we use neural networks of size 4×20 (3 hidden layers with 20 neurons each) and randomly choose N = 5,000 training points inside the space-time domain Ω_x × Ω_t, N_b = 80 on the boundary ∂Ω_x × Ω_t, N_0 = 80 on the initial boundary Ω_x × {t = 0}, and N_i = 80 on the interface {x = 0} × Ω_t. The training data distribution is illustrated in Figure 4.1. For the optimization, we use the Adam optimizer with learning rate 10⁻³ for 20,000 iterations, followed by the L-BFGS optimizer with learning rate 1 for a further 10,000 iterations. The main reason for combining Adam and L-BFGS is to employ a fast-converging (second-order) optimizer (L-BFGS) after obtaining good initial parameters with the first-order optimizer (Adam).
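The Adam-then-L-BFGS strategy described above can be realized along the following lines; the closure-based call follows PyTorch's L-BFGS interface, the iteration counts mirror those stated in the text, and the stopping tolerances, which are not specified here, are left at their defaults.

```python
import torch

def train_two_stage(params, loss_fn, adam_iters=20000, lbfgs_iters=10000):
    """Adam warm-up followed by L-BFGS refinement (Section 4.1 settings).

    params  : list of trainable parameters, e.g. list(net.parameters())
    loss_fn : callable () -> scalar loss tensor built from the current parameters
    """
    adam = torch.optim.Adam(params, lr=1e-3)
    for _ in range(adam_iters):
        adam.zero_grad()
        loss = loss_fn()
        loss.backward()
        adam.step()

    # a single .step() call runs up to max_iter L-BFGS iterations via the closure
    lbfgs = torch.optim.LBFGS(params, lr=1.0, max_iter=lbfgs_iters)

    def closure():
        lbfgs.zero_grad()
        loss = loss_fn()
        loss.backward()
        return loss

    lbfgs.step(closure)
    return loss_fn().item()
```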

Experiments are conducted for both PINNs and sl-PINNs where l(u^)l(\hat{u}) in (3.1) and l(u~)l(\tilde{u}_{*}), =L,R*=L,R, in (3.5) are defined by

l(u^)=|u^(1,t)u^(1,t)|2l(\hat{u})=\big{|}\hat{u}(1,t)-\hat{u}(-1,t)\big{|}^{2} (4.5)

and

l(u~)=|u~R(1,t)u~L(1,t)|2,for both =L,R.l(\tilde{u}_{*})=\big{|}\tilde{u}_{R}(1,t)-\tilde{u}_{L}(-1,t)\big{|}^{2},\quad\text{for both }*=L,R. (4.6)

We train the PINNs and sl-PINNs for the Burgers’ equation using the viscosity parameter values ε = 10⁻¹/π, 10⁻²/π, and 10⁻³/π, with the same computational settings as above. The training losses for the cases ε = 10⁻²/π and ε = 10⁻³/π are presented in Figures 4.2 and 4.3.

To evaluate the performance of sl-PINNs, we present the L²-error (4.2) and the L^∞-error (4.3) at selected times in Tables 1 and 2. We observe that sl-PINN provides accurate predictions in the L²-norm for all cases ε = 10⁻¹/π, 10⁻²/π, and 10⁻³/π, while PINN struggles to perform well when ε = 10⁻³/π. Additionally, the L^∞-errors at selected times show that sl-PINN retains better accuracy than PINN once the shock forms as time progresses.

The figures below show the profiles of the PINN and sl-PINN predictions: Figure 4.4 illustrates the case ε = 10⁻²/π, and Figure 4.5 the case ε = 10⁻³/π. For ε = 10⁻²/π, the sl-PINN method removes the significant errors near the shock that appear in the PINN prediction. When ε = 10⁻³/π, the sl-PINN method successfully predicts the solution with a stiffer interior layer, which the PINN fails to capture. Furthermore, Figure 4.6 shows temporal snapshots of the errors of PINN and sl-PINN for ε = 10⁻²/π, while Figure 4.7 shows the same for ε = 10⁻³/π.

In a study on solving the Burgers equation with Physics-Informed Neural Networks [22], it was found that increasing the number of layers in the neural network improves predictive accuracy. To test this further, we employ a deeper neural network of size 9×20 (8 hidden layers with 20 neurons each), along with a doubling of the training points to N = 10,000, N_b = 160, and N_0 = 160. As indicated in Table 2, when ε = 10⁻²/π the accuracy of PINN increases. However, there is no improvement in accuracy for the stiffer case ε = 10⁻³/π. Interestingly, despite the larger neural network, the sl-PINNs perform similarly to the smaller networks. Consequently, we conclude that small neural networks of size 4×20 are sufficient for sl-PINNs to achieve a good approximation, which increases computational efficiency.

Additionally, a two-scale neural network learning method for the Burgers equation with small ε > 0 was proposed in the recent study [25]. That approach improves the accuracy of the predicted solution by incorporating stretched variables as additional inputs to the neural network. However, it requires appropriate initialization from pre-trained models: sequential training for ε = 10⁻¹/π and ε = 10⁻²/π is needed to obtain a good initialization of the neural network for training the most extreme case, ε = 10⁻³/π. In contrast, our sl-PINN method employs only two neural networks for the case ε = 10⁻³/π and still achieves better accuracy.

Figure 4.1. Smooth initial data case: sl-PINN training data distribution.

Figure 4.2. Smooth initial data: sl-PINN’s training loss during the training process. (a) ε = 10⁻²/π; (b) ε = 10⁻³/π.

Figure 4.3. Smooth initial data: PINN’s training loss during the training process. (a) ε = 10⁻²/π; (b) ε = 10⁻³/π.
ELt2(Lx2(Ω))\|E\|_{L^{2}_{t}(L^{2}_{x}(\Omega))} PINNs sl-PINNs
ε=101/π\varepsilon=10^{-1}/\pi 4.30E-04 4.28E-04
ε=102/π\varepsilon=10^{-2}/\pi 6.02E-02 2.13E-03
ε=103/π\varepsilon=10^{-3}/\pi 5.50E-01 5.57E-03

PINNs sl-PINNs
ELx(Ω)\|E\|_{L^{\infty}_{x}(\Omega)} ε=102/π\varepsilon=10^{-2}/\pi ε=103/π\varepsilon=10^{-3}/\pi ε=102/π\varepsilon=10^{-2}/\pi ε=103/π\varepsilon=10^{-3}/\pi
t=0.25t=0.25 7.56E-02 6.45E-01 8.31E-03 1.10E-02
t=0.5t=0.5 2.79E-01 1.05E+00 7.89E-04 1.57E-02
t=0.75t=0.75 2.52E-01 1.45E+00 3.99E-04 1.21E-02
t=1.0t=1.0 1.99E-01 1.41E+00 3.88E-04 8.51E-03
Table 1. Error comparison between PINN and sl-PINN: L2L^{2}-errors (top) and LL^{\infty}-errors (bottom) with small NN size: 4×204\times 20.
Figure 4.4. Smooth initial data case when ε = 10⁻²/π: PINN vs sl-PINN solution plots with small NN size 4×20. (a) reference solution; (b) PINN prediction (left) and absolute pointwise error (right); (c) sl-PINN prediction (left) and absolute pointwise error (right).

Figure 4.5. Smooth initial data case when ε = 10⁻³/π: PINN vs sl-PINN solution plots with small NN size 4×20. (a) reference solution; (b) PINN prediction (left) and absolute pointwise error (right); (c) sl-PINN prediction (left) and absolute pointwise error (right).

Figure 4.6. Smooth initial data case when ε = 10⁻²/π: PINN vs sl-PINN solution errors at specific times (small NN size 4×20). (a) errors of the PINN prediction; (b) errors of the sl-PINN prediction.

Figure 4.7. Smooth initial data case when ε = 10⁻³/π: PINN vs sl-PINN solution errors at specific times (small NN size 4×20). (a) errors of the PINN prediction; (b) errors of the sl-PINN prediction.
ELt2(Lx2(Ω))\|E\|_{L^{2}_{t}(L^{2}_{x}(\Omega))} PINNs sl-PINNs
ε=101/π\varepsilon=10^{-1}/\pi 4.23E-04 2.15E-04
ε=102/π\varepsilon=10^{-2}/\pi 9.43E-04 7.10E-04
ε=103/π\varepsilon=10^{-3}/\pi 4.83E-01 4.25E-03

PINNs sl-PINNs
ELx(Ω)\|E\|_{L^{\infty}_{x}(\Omega)} ε=102/π\varepsilon=10^{-2}/\pi ε=103/π\varepsilon=10^{-3}/\pi ε=102/π\varepsilon=10^{-2}/\pi ε=103/π\varepsilon=10^{-3}/\pi
t=0.25t=0.25 1.51E-03 5.91E-01 1.38E-03 1.02E-02
t=0.5t=0.5 2.70E-03 1.04E+00 3.62E-04 1.57E-02
t=0.75t=0.75 3.26E-03 1.52E+00 3.33E-04 1.21E-02
t=1.0t=1.0 2.03E-03 7.65E-01 2.14E-04 8.51E-03
Table 2. Error comparison between PINN and sl-PINN: L2L^{2}-errors (top) and LL^{\infty}-errors (bottom) with large NN size: 9×209\times 20.

4.2. Non-smooth initial data

We consider the Burgers’ equation (1.1) supplemented with the boundary condition (ii)(ii) in the form,

limxuε=uL,limxuε=uR,uL,uR>0,\lim_{x\rightarrow-\infty}u^{\varepsilon}=u_{L},\quad\lim_{x\rightarrow\infty}u^{\varepsilon}=-u_{R},\quad u_{L},\,u_{R}>0, (4.7)

and the initial condition,

u0(x)={uL,x<0,uR,x>0.u_{0}(x)=\begin{cases}u_{L},&x<0,\\ -u_{R},&x>0.\end{cases} (4.8)

Using the Cole–Hopf transformation (see, e.g., [5, 14, 20]), we find the explicit expression of the solution to the viscous Burgers’ equation (1.1) with (4.7) and (4.8) as

uε(x,t)=uLuL+uR1+exp(uL+uR2ε(xct))erfc(xuLt2εt)/erfc(xuRt2εt).\displaystyle u^{\varepsilon}(x,t)=u_{L}-\frac{u_{L}+u_{R}}{1+\exp\bigg{(}-\dfrac{u_{L}+u_{R}}{2\varepsilon}(x-ct)\bigg{)}\text{erfc}\left(\dfrac{x-u_{L}t}{2\sqrt{\varepsilon t}}\right)\big{/}\text{erfc}\left(\dfrac{-x-u_{R}t}{2\sqrt{\varepsilon t}}\right)}. (4.9)

Here c=(uLuR)/2c=(u_{L}-u_{R})/2 and erfc()\text{erfc}(\cdot) is the complementary error function on (,)(-\infty,\infty) defined by

erfc(z)=1erf(z)=12π0zet2𝑑t.\text{erfc}(z)=1-\text{erf}(z)=1-\dfrac{2}{\sqrt{\pi}}\int_{0}^{z}e^{-t^{2}}\,dt. (4.10)

Note that accurately evaluating the complementary error function, and in particular the ratio of exponentially small erfc values appearing in (4.9) when ε is small, is a significantly challenging task. Therefore, to obtain the reference solution to (1.1) with (4.7) and (4.8) in the numerical computations below, we use the high-resolution finite difference method described above.
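For completeness, a direct evaluation of the closed-form solution (4.9) with SciPy's complementary error function is sketched below; it is reliable only for moderate ε, since for small ε the exponential factor and the erfc ratio degenerate in floating point, which is precisely why the finite difference reference solution is preferred.

```python
import numpy as np
from scipy.special import erfc

def burgers_riemann_exact(x, t, u_L, u_R, eps):
    """Closed-form viscous solution (4.9) of the Riemann problem (4.7)-(4.8).

    Numerically reliable only for moderate eps; for small eps the erfc ratio
    degenerates, which motivates the finite difference reference solution.
    """
    c = 0.5 * (u_L - u_R)
    ratio = erfc((x - u_L * t) / (2.0 * np.sqrt(eps * t))) \
            / erfc((-x - u_R * t) / (2.0 * np.sqrt(eps * t)))
    denom = 1.0 + np.exp(-(u_L + u_R) / (2.0 * eps) * (x - c * t)) * ratio
    return u_L - (u_L + u_R) / denom

# moderate viscosity so that the formula is well conditioned
x = np.linspace(-1.0, 1.0, 11)
print(burgers_riemann_exact(x, t=0.5, u_L=1.0, u_R=0.5, eps=1e-2))
```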

The inviscid limit problem (1.3) with the boundary and initial data (4.7) and (4.8), known as the Riemann problem, has a unique weak solution (see, e.g., [16, 20]) given by

u0(x,t)={uL,x<st,uR,x>st,u^{0}(x,t)=\begin{cases}u_{L},&x<st,\\ -u_{R},&x>st,\end{cases}

where ss is the speed of the shock wave, determined by the Rankine-Hugoniot jump condition, written as

s(t)=α(t)=12uR2uL2uR+uL=12(uLuR).\displaystyle s(t)=\alpha^{\prime}(t)=-\dfrac{1}{2}\dfrac{u_{R}^{2}-u_{L}^{2}}{u_{R}+u_{L}}=\dfrac{1}{2}(u_{L}-u_{R}).

Integrating the equation above, we find the shock curve

α(t)=12(uLuR)t,t0.\displaystyle\alpha(t)=\frac{1}{2}(u_{L}-u_{R})t,\quad t\geq 0.

In the following experiments, we consider two examples: first, u_L = u_R = 1, which gives a steady shock located at x = α(t) ≡ 0; second, u_L = 1 and u_R = 1/2, which gives a moving shock located at x = α(t) = t/4 > 0. Experiments for both examples are conducted with both PINNs and sl-PINNs. Since the viscosity is small, the value of u^ε at x = -1 (respectively x = 1) is exponentially close to the boundary value u_L (respectively -u_R), at least over the time interval of the computations. Thanks to this property, for computational convenience we set l(û) in (3.1) and l(ũ_*), * = L, R, in (3.5) as

l(u^)=|u^(1,t)uL|2+|u^(1,t)+uR|2,l(\hat{u})=\big{|}\hat{u}(-1,t)-u_{L}\big{|}^{2}+\big{|}\hat{u}(1,t)+u_{R}\big{|}^{2}, (4.11)

and

l(u~L)=|u~L(1,t)uL|2,l(u~R)=|u~R(1,t)+uR|2.l(\tilde{u}_{L})=\big{|}\tilde{u}_{L}(-1,t)-u_{L}\big{|}^{2},\qquad l(\tilde{u}_{R})=\big{|}\tilde{u}_{R}(1,t)+u_{R}\big{|}^{2}. (4.12)

We train the PINNs and sl-PINNs for both examples using the viscosity parameter values ε = 1/500, 1/1000, 1/5000, and 1/10000.

As in the smooth-data case of Section 4.1, we use neural networks of size 4×20 (3 hidden layers with 20 neurons each). We randomly select N = 5,000 training points within the space-time domain Ω_x × Ω_t, N_b = 80 on the boundary ∂Ω_x × Ω_t, N_0 = 80 on the initial boundary Ω_x × {t = 0}, and N_i = 80 on the interface {x = α(t)} × Ω_t. The distributions of training data for the steady and moving shock cases are shown in Figure 4.8.

We use the same optimization strategy as in the previous example: the Adam optimizer with learning rate 10⁻³ for the first 20,000 iterations, followed by the L-BFGS optimizer with learning rate 1 for the next 10,000 iterations. All of these cases are trained with the same settings as described above. The training losses of sl-PINN and PINN for the moving shock case with ε = 1/500 and ε = 1/10000 are presented in Figures 4.9 and 4.10, respectively.

After completing the training process, we investigate the accuracy in the L²-norm (4.2) and the L^∞-norm (4.3) at selected times. The results for the steady shock case are shown in Table 3 and those for the moving shock case in Table 4. We observe that in both cases sl-PINN achieves accurate predictions for all the small viscosities ε = 1/500, 1/1000, 1/5000, and 1/10000. In contrast, the usual PINN has limited accuracy even for the relatively large value ε = 1/500. The solution plots of the PINN and sl-PINN predictions for the moving shock case with ε = 1/500 and ε = 1/10000 appear in Figures 4.11 and 4.12, their temporal snapshots in Figures 4.13 and 4.15, and their error plots in Figures 4.14 and 4.16. We find that sl-PINN captures the stiff behavior of the Burgers solution near the shock from the larger ε = 1/500 down to the smaller ε = 1/10000.

We note that the initial data in these experiments are discontinuous. It is well-known that neural networks (or any traditional numerical method) have limited performance in fitting a target function with discontinuity [18, 21]. Therefore, as seen in the predictions of the sl-PINN in Figure 4.11 or 4.12, a large error near the singular layer is somewhat expected.

Figure 4.8. Nonsmooth initial data: sl-PINN training data distribution. (a) steady shock case; (b) moving shock case.

Figure 4.9. Riemann problem, moving shock case: sl-PINN’s training loss during the training process. (a) ε = 1/500; (b) ε = 1/10000.

Figure 4.10. Riemann problem, moving shock case: PINN’s training loss during the training process. (a) ε = 1/500; (b) ε = 1/10000.
ELt2(Lx2(Ω))\|E\|_{L^{2}_{t}(L^{2}_{x}(\Omega))} PINNs sl-PINNs
ε=1/500\varepsilon=1/500 6.77E-01 4.34E-03
ε=1/1000\varepsilon=1/1000 1.08E+00 4.28E-03
ε=1/5000\varepsilon=1/5000 6.84E-01 2.70E-03
ε=1/10000\varepsilon=1/10000 9.29E-01 3.74E-03

PINNs sl-PINNs
ELx(Ω)\|E\|_{L^{\infty}_{x}(\Omega)} ε=1/500\varepsilon=1/500 ε=1/10000\varepsilon=1/10000 ε=1/500\varepsilon=1/500 ε=1/10000\varepsilon=1/10000
t=0.25t=0.25 2.00E+00 2.00E+00 3.82E-04 4.80E-02
t=0.5t=0.5 2.00E+00 1.99E+00 4.64E-04 4.79E-02
t=0.75t=0.75 2.00E+00 1.99E+00 4.12E-04 4.80E-02
t=1.0t=1.0 2.00E+00 1.90E+00 6.11E-04 4.80E-02
Table 3. Riemann problem, steady shock case (u_L = 1, u_R = 1): L²-error comparison (top) and L^∞-error comparison (bottom).
ELt2(Lx2(Ω))\|E\|_{L^{2}_{t}(L^{2}_{x}(\Omega))} PINNs sl-PINNs
ε=1/500\varepsilon=1/500 5.84E-01 5.46E-03
ε=1/1000\varepsilon=1/1000 6.23E-01 4.19E-03
ε=1/5000\varepsilon=1/5000 4.24E-01 3.01E-03
ε=1/10000\varepsilon=1/10000 4.04E-01 3.84E-03

PINNs sl-PINNs
ELx(Ω)\|E\|_{L^{\infty}_{x}(\Omega)} ε=1/500\varepsilon=1/500 ε=1/10000\varepsilon=1/10000 ε=1/500\varepsilon=1/500 ε=1/10000\varepsilon=1/10000
t=0.25t=0.25 1.49E+00 1.14E+00 1.79E-03 2.43E-02
t=0.5t=0.5 1.49E+00 1.14E+00 2.63E-03 3.19E-02
t=0.75t=0.75 1.49E+00 1.53E+00 2.72E-03 4.37E-02
t=1.0t=1.0 1.49E+00 1.81E+00 3.47E-03 6.07E-02
Table 4. Riemann problem, moving shock case (u_L = 1, u_R = 0.5): L²-error comparison (top) and L^∞-error comparison (bottom).
Figure 4.11. Riemann problem, moving shock case when ε = 1/500: PINN vs sl-PINN solution plots. (a) reference solution; (b) PINN prediction (left) and absolute pointwise error (right); (c) sl-PINN prediction (left) and absolute pointwise error (right).

Figure 4.12. Riemann problem, moving shock case when ε = 1/10000: PINN vs sl-PINN solution plots. (a) reference solution; (b) PINN prediction (left) and absolute pointwise error (right); (c) sl-PINN prediction (left) and absolute pointwise error (right).

Figure 4.13. Riemann problem, moving shock case when ε = 1/500: reference solution vs PINN vs sl-PINN at specific times.

Figure 4.14. Riemann problem, moving shock case when ε = 1/500: PINN vs sl-PINN solution errors at specific times. (a) errors of the PINN prediction; (b) errors of the sl-PINN prediction.

Figure 4.15. Riemann problem, moving shock case when ε = 1/10000: reference solution vs PINN vs sl-PINN at specific times.

Figure 4.16. Riemann problem, moving shock case when ε = 1/10000: PINN vs sl-PINN solution errors at specific times. (a) errors of the PINN prediction; (b) errors of the sl-PINN prediction.

5. Conclusion

In this article, we introduced a new learning method, sl-PINN, to address the one-dimensional viscous Burgers problem at small viscosity, which produces a singular interior layer. The method incorporates into the structure of PINNs corrector functions that describe the stiff behavior of the solution near the interior layer, improving learning performance in that region. This is achieved through the interior layer analysis, which treats the problem as two boundary layer sub-problems; the profile of the corrector functions is obtained by asymptotic analysis. Both stationary and moving shock cases have been explored. Numerical experiments show that our sl-PINNs accurately predict the solution for all the small viscosities considered, and in particular reduce the errors near the interior layer compared to the original PINNs. Our proposed method provides a comprehensive understanding of the solution's behavior near the interior layer, aiding in capturing the stiff part of the training solution. This approach offers a perspective distinct from other existing works [24, 25, 19], achieving better accuracy, especially when the shock is sharper.

Acknowledgment

T.-Y. Chang was gratefully supported by the Graduate Students Study Abroad Program grant funded by the National Science and Technology Council (NSTC) in Taiwan. G.-M. Gie was supported by a Simons Foundation Collaboration Grant for Mathematicians. Hong was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1A2C1093579) and by the Korea government (MSIT) (RS-2023-00219980). Jung was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2023R1A2C1003120).

References

  • [1] Basdevant, C., Deville, M., Haldenwang, P., Lacroix, J.M., Ouazzani, J., Peyret, R., Orlandi, P. and Patera, A. Spectral and finite difference solutions of the Burgers equation. Computers and fluids, 14(1), (1986), pp.23-41.
  • [2] Buchanan, J. Robert, and Zhoude Shao. A first course in partial differential equations. 2018.
  • [3] Junho Choi, Chang-Yeol Jung, and Hoyeon Lee. On boundary layers for the Burgers equations in a bounded domain. Commun. Nonlinear Sci. Numer. Simul., 67:637–657, 2019.
  • [4] Choi, Junho and Hong, Youngjoon and Jung, Chang-Yeol and Lee, Hoyeon. Viscosity approximation of the solution to Burgers’ equations with shock layers Appl. Anal., Vol. 102, no. 1, 288–314, 2023.
  • [5] Cole, Julian D. On a quasi-linear parabolic equation occurring in aerodynamics. Quarterly of Applied Mathematics. 9 (3): 225–236. (1951).
  • [6] Caldwell, J., and P. Smith. Solution of Burgers’ equation with a large Reynolds number. Applied Mathematical Modelling 6, no. 5 (1982): 381-385.
  • [7] Debnath, Lokenath. Nonlinear partial differential equations for scientists and engineers. Boston: Birkhäuser, 2005.
  • [8] T.-Y. Chang, G.-M. Gie, Y. Hong, and C.-Y. Jung. Singular layer Physics Informed Neural Network method for Plane Parallel Flows Computers and Mathematics with Applications, Volume 166, 2024, 91–105, ISSN 0898–1221
  • [9] G.-M. Gie, Y. Hong, C.-Y. Jung, and D. Lee. Semi-analytic physics informed neural network for convection-dominated boundary layer problems in 2D Submitted
  • [10] G.-M. Gie, Y. Hong, C.-Y. Jung, and T. Munkhjin. Semi-analytic PINN methods for boundary layer problems in a rectangular domain Journal of Computational and Applied Mathematics, Volume 450, 2024, 115989, ISSN 0377-0427
  • [11] G.-M. Gie, Y. Hong, and C.-Y. Jung. Semi-analytic PINN methods for singularly perturbed boundary value problems Applicable Analysis, Published online: 08 Jan 2024
  • [12] G.-M. Gie, C.-Y. Jung, and H. Lee. Semi-analytic shooting methods for Burgers’ equation. Journal of Computational and Applied Mathematics, Vol. 418, 2023, 114694, ISSN 0377-0427
  • [13] G.-M. Gie, M. Hamouda, C.-Y. Jung, and R. Temam, Singular perturbations and boundary layers, volume 200 of Applied Mathematical Sciences. Springer Nature Switzerland AG, 2018. https://doi.org/10.1007/978-3-030-00638-9
  • [14] Hopf, Eberhard. The partial differential equation ut+uux=μuxxu_{t}+uu_{x}=\mu u_{xx}. Communications on Pure and Applied Mathematics, 3 (3): 201–230. (1950).
  • [15] Hon, Yiu-Chung, and X. Z. Mao. An efficient numerical scheme for Burgers’ equation. Applied Mathematics and Computation 95, no. 1 (1998): 37-50.
  • [16] Howison, Sam. Practical applied mathematics: modelling, analysis, approximation. No. 38. Cambridge university press, 2005.
  • [17] Jagtap, Ameya D., and George Em Karniadakis. Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. Communications in Computational Physics 28, no. 5 (2020).
  • [18] Jagtap, Ameya D., Kenji Kawaguchi, and George Em Karniadakis. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics 404 (2020): 109136.
  • [19] L. Lu, X. Meng, Z. Mao, G. E. Karniadakis. Deepxde: A deep learning library for solving differential equations. SIAM Review, 63 (1) (2021) 208–228.
  • [20] Olver, Peter J. Introduction to partial differential equations. Vol. 1. Berlin: Springer, 2014.
  • [21] Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred Hamprecht, Yoshua Bengio, and Aaron Courville. On the Spectral Bias of Neural Networks. Proceedings of the 36th International Conference on Machine Learning, 97 (2019).
  • [22] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys., 378 (2019), pp. 686–707.
  • [23] S. Shih and R. B. Kellogg, Asymptotic analysis of a singular perturbation problem. SIAM J. Math. Anal. 18 (1987), pp . 1467-1511.
  • [24] Song, Y., Wang, H., Yang, H., Taccari, M.L. and Chen, X. Loss-attentional physics-informed neural networks. Journal of Computational Physics, 501, p.112781. , 2024
  • [25] Zhuang, Qiao, Chris Ziyi Yao, Zhongqiang Zhang, and George Em Karniadakis. Two-scale Neural Networks for Partial Differential Equations with Small Parameters. arXiv preprint arXiv:2402.17232 (2024).