
Deep FBSDE Neural Networks for Solving Incompressible Navier-Stokes Equation and Cahn-Hilliard Equation

Yangtao Deng [email protected] Qiaolin He [email protected] School of Mathematics, Sichuan University, Chengdu, China
Abstract

Designing efficient algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the curse of dimensionality. We extend the forward-backward stochastic neural networks (FBSNNs), which depend on forward-backward stochastic differential equations (FBSDEs), to solve the incompressible Navier-Stokes equation. For the Cahn-Hilliard equation, we derive a modified Cahn-Hilliard equation from a widely used stabilized scheme for the original Cahn-Hilliard equation. This equation can be written as a continuous parabolic system, to which FBSDEs can be applied, and the unknown solution is approximated by a neural network. Our method is also successfully extended to the Cahn-Hilliard-Navier-Stokes (CHNS) equation. The accuracy and stability of our methods are demonstrated in many numerical experiments, especially in high dimensions.

keywords:
forward-backward stochastic differential equation, neural network, Navier-Stokes, Cahn-Hilliard, high dimension

1 Introduction

High-dimensional nonlinear partial differential equations (PDEs) are widely used in many areas of the social and natural sciences. Due to the significant nonlinearity of nonlinear PDEs, particularly in high-dimensional cases, analytical solutions are typically difficult to obtain, so numerical solutions of these nonlinear PDEs are very important. However, due to their exponential increase in complexity, traditional approaches such as the finite difference method and the finite element method fail in high-dimensional instances. Advances in numerical algorithms for solving high-dimensional PDEs therefore receive close attention in many fields. Several numerical methods exist for solving nonlinear high-dimensional PDEs, such as the Monte Carlo method [1, 2], lattice rules [3] and sparse grid methods [4, 5]. These methods are relatively adaptable to high-dimensional problems, but they typically require substantial computational resources: the Monte Carlo method often demands a large number of sample points, while lattice rules and sparse grid methods may require finer grids or adaptive strategies. Moreover, their convergence rates are usually relatively slow, particularly in high-dimensional situations, so achieving the desired accuracy may entail a significant amount of computation.

Recently, deep neural networks (DNNs) have been used to create numerical algorithms that work well at overcoming the curse of dimensionality and successfully solve high-dimensional PDEs [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]. Inspired by the Ritz method, the deep Ritz method (DRM) [8] was proposed to solve variational problems arising from PDEs. The deep Galerkin method (DGM) was proposed in [18] to solve high-dimensional PDEs by approximating the solution with a deep neural network trained to satisfy the differential operator, initial condition and boundary conditions. The physics-informed neural networks (PINN) are presented in [16], where the PDE is embedded into the neural network by utilizing automatic differentiation (AD). Three adaptive techniques to improve the computational performance of DNN methods for high-dimensional PDEs are presented in [21]. The authors in [12] proposed an approach for scattering problems connected with linear PDEs of the Helmholtz type that relies on DNNs to describe the forward and inverse maps. In [9], the deep backward stochastic differential equation (BSDE) method based on the nonlinear Feynman-Kac formula (see e.g. [23]) is proposed, which is used in [10] to estimate the solution of eigenvalue problems for semilinear second order differential operators.

The PINN and the deep BSDE method are two different numerical frameworks for solving general PDEs. In PINN, AD is used to avoid truncation errors and the numerical quadrature errors of variational forms, and gradient-based optimization methods update the neural network so that the loss composed of the differential equation and boundary condition residuals is reduced. The deep BSDE method treats the BSDE as a stochastic control problem with the gradient of the solution being the policy function and parameterizes the control process of the solution by DNNs. It then trains the resulting sequence of networks in a global optimization given by the prescribed terminal condition. These methods do not rely on training data provided by external algorithms, so they can be considered unsupervised learning methods. One drawback of PINN is the high computational complexity of its loss function, which includes the differential operator of the PDE to be solved. The deep BSDE method, on the other hand, does not require the computation of high order derivatives. Moreover, the loss function used by the deep BSDE method involves only simple additive calculations, so the deep BSDE method iterates faster. The deep BSDE method has made high-dimensional problems tractable, which allows us to solve high-dimensional semilinear PDEs in a reasonable amount of time. Recently, there have been several works related to the deep BSDE method, see [6, 7, 10, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]. Based on the deep BSDE method, an improved method called FBSNNs is proposed in [6]. This method approximates the unknown solution using a fully connected feedforward neural network (FNN) and obtains the required gradient vector by applying AD. However, because the nonlinear Feynman-Kac formula is involved in the reformulation procedure, FBSNNs can only handle some specific Cauchy problems without boundary conditions. It is therefore desirable to extend the FBSNNs to other kinds of PDEs and to deal with problems with boundary conditions.

The Navier-Stokes equation is an important equation in fluid dynamics and the Cahn-Hilliard equation is widely used in multi-phase problems. These equations are difficult to solve due to their complexity. Many deep learning methods have been applied to solve these equations in one or two dimensions (see e.g. [37, 38, 39, 40, 41, 42]). However, these methods fail due to excessive complexity when the dimension is more than three. We choose to introduce the FBSNNs presented in [6] to solve these equations. We convert the incompressible Navier-Stokes equations into FBSDEs and then employ FBSNNs to solve them in two or three dimensions. We develop a suitable numerical method based on the reflection principle to deal with the Neumann boundary condition and handle the Dirichlet boundary condition using the method mentioned in [43]. We rewrite the Cahn-Hilliard equation into a system of parabolic equations by adding suitable stabilization terms, and the numerical solution of the new system is then obtained using the FBSNNs. However, when dealing with mixed boundary conditions, the above method needs to be improved: we utilize an approach similar to the one for the Dirichlet boundary case, and meanwhile add an extra error term to the final loss function for the Neumann boundary condition. The equation can also be solved with periodic boundary conditions using techniques that exploit the periodicity. Therefore, we can naturally solve the CHNS equation, which is a coupled system of the Navier-Stokes and Cahn-Hilliard equations.

The rest of this article is organized as follows. In Section 2, we introduce FBSDEs, deep BSDE method and FBSNNs method briefly. In Section 3, we present the approach to solve incompressible Navier-Stokes equations with different boundary conditions. A methodology is proposed in Section 4 to solve Cahn-Hilliard equation with different boundary conditions. In Section 5, the method to solve CHNS system is developed. Numerical experiments are given in Section 6 to verify the effectiveness of our methods. Finally, conclusions and remarks are given in Section 7.

2 FBSDEs, deep BSDE method and FBSNNs

2.1 A brief introduction of FBSDEs

The FBSDEs, in which the randomness in the BSDE is driven by a forward stochastic differential equation (SDE), are written in the general form

\left\{\begin{aligned} &dX_{s}=b(s,X_{s})ds+\sigma(s,X_{s})dW_{s},\quad s\in[0,T],\\ &X_{0}=x_{0},\\ &-dY_{s}=f(s,X_{s},Y_{s},Z_{s})ds-Z_{s}^{T}\sigma(s,X_{s})dW_{s},\quad s\in[0,T],\\ &Y_{T}=g(X_{T}),\end{aligned}\right. (1)

where \{W_{s}\}_{0\leq s\leq T} is a d-dimensional Brownian motion, b:[0,T]\times\mathbb{R}^{d}\rightarrow\mathbb{R}^{d}, \sigma:[0,T]\times\mathbb{R}^{d}\rightarrow\mathbb{R}^{d\times d}, f:[0,T]\times\mathbb{R}^{d}\times\mathbb{R}^{m}\times\mathbb{R}^{d\times m}\rightarrow\mathbb{R}^{m} and g:[0,T]\times\mathbb{R}^{d}\rightarrow\mathbb{R}^{m} are all deterministic mappings of time and space, with fixed T>0. We refer to Z as the control process according to the stochastic control terminology. In order to guarantee the existence of a unique solution pair \{(Y_{s},Z_{s})\}_{0\leq s\leq T} adapted to the augmented natural filtration, the standard well-posedness assumptions of [23] are required. Indeed, considering the quasi-linear, parabolic terminal problem

\displaystyle u_{t}(t,x)+\mathcal{L}u(t,x)+f(t,x,u(t,x),\nabla u(t,x))=0,\ (t,x)\in[0,T]\times\mathbb{R}^{d}, (2)

with u(T,x)=g(x), where \mathcal{L} is the second-order differential operator

\displaystyle\mathcal{L}u(t,x)=\frac{1}{2}\sum_{i,j=1}^{d}a_{i,j}(t,x)\frac{\partial^{2}u(t,x)}{\partial x_{i}\partial x_{j}}+\sum_{i=1}^{d}b_{i}(t,x)\frac{\partial u(t,x)}{\partial x_{i}},\quad a_{i,j}=[\sigma\sigma^{T}]_{ij}, (3)

the nonlinear Feynman-Kac formula indicates that the solution of (1) is given, almost surely, in terms of the solution of (2) by (cf., e.g., [23])

\displaystyle Y_{s}=u(s,X_{s}),\ Z_{s}=\nabla u\left(s,X_{s}\right),\ s\in[0,T]. (4)

As a result, the BSDE formulation offers a stochastic representation of the solution of a parabolic problem and of its gradient simultaneously, which is a distinct advantage for numerous applications in stochastic control.

2.2 Deep BSDE method

In order to review the deep BSDE method proposed in [9], we consider the following FBSDEs

\left\{\begin{aligned} &X_{t}=x_{0}+\int_{0}^{t}b(s,X_{s})ds+\int_{0}^{t}\sigma(s,X_{s})dW_{s},\\ &Y_{t}=g(X_{T})+\int_{t}^{T}f(s,X_{s},Y_{s},Z_{s})ds-\int_{t}^{T}Z_{s}^{T}\sigma(s,X_{s})dW_{s},\end{aligned}\right. (5)

which is the integral form of (1). Given a partition of the time interval [0,T]: 0=t_{0}<t_{1}<...<t_{N}=T, the Euler-Maruyama scheme is used to discretize X_{t} and Y_{t}, and we have

\left\{\begin{aligned} &X_{t_{n+1}}=X_{t_{n}}+b(t_{n},X_{t_{n}})\Delta t_{n}+\sigma(t_{n},X_{t_{n}})\Delta W_{t_{n}},\\ &Y_{t_{n+1}}(X_{t_{n+1}})=Y_{t_{n}}(X_{t_{n}})-f_{t_{n}}(X_{t_{n}},Y_{t_{n}}(X_{t_{n}}),Z_{t_{n}}(X_{t_{n}}))\Delta t_{n}\\ &+Z_{t_{n}}^{T}(X_{t_{n}})\sigma(t_{n},X_{t_{n}})\Delta W_{t_{n}},\end{aligned}\right. (6)

where \Delta t_{n}=t_{n+1}-t_{n}=\frac{T}{N} and \Delta W_{t_{n}}=W_{t_{n+1}}-W_{t_{n}}. The control Z_{t_{n}}(X_{t_{n}}) is approximated by an FNN with parameters \theta_{n} for n=1,\cdots,N-1. The initial values Y_{t_{0}}(X_{t_{0}}) and Z_{t_{0}}(X_{t_{0}}) are treated as trainable parameters in the model. To make Y_{t_{0}}(X_{t_{0}}) approximate u(t_{0},X_{t_{0}}), the mismatch with the given terminal condition is used to define the expected loss function

\displaystyle l\left(Y_{t_{0}}(X_{t_{0}}),Z_{t_{0}}(X_{t_{0}}),\theta_{1},...,\theta_{N-1}\right)=\frac{1}{M}\sum_{m=1}^{M}\left|(g-Y_{t_{N}})(X_{t_{N},m})\right|^{2}, (7)

where the loss is averaged over M different realizations of the underlying Brownian motion and the subscript m corresponds to the m-th realization. This procedure is called the deep BSDE method.
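To make the structure of the method concrete, the following minimal PyTorch sketch assembles the recursion (6) and the terminal loss (7). The coefficients b, sigma, f, g, the network sizes and the training loop below are illustrative placeholders (an assumption for this sketch), not the settings used later in this paper.

```python
# Minimal deep BSDE sketch: Y0, Z0 are trainable and each Z_{t_n}, n >= 1, has its own small net.
import torch

d, N, M, T = 2, 10, 64, 1.0                       # dimension, time steps, batch size, horizon
dt = T / N

b     = lambda t, x: torch.zeros_like(x)          # drift (placeholder)
sigma = lambda t, x: torch.ones_like(x)           # diagonal diffusion (placeholder)
f     = lambda t, x, y, z: -y                     # generator (placeholder)
g     = lambda x: torch.sin(x).sum(dim=1, keepdim=True)   # terminal condition (placeholder)

Y0 = torch.zeros(1, 1, requires_grad=True)        # trainable approximation of u(t_0, x_0)
Z0 = torch.zeros(1, d, requires_grad=True)        # trainable approximation of grad u(t_0, x_0)
Znets = torch.nn.ModuleList(
    [torch.nn.Sequential(torch.nn.Linear(d, 30), torch.nn.Tanh(), torch.nn.Linear(30, d))
     for _ in range(N - 1)])
opt = torch.optim.Adam([Y0, Z0, *Znets.parameters()], lr=5e-3)

for it in range(100):
    X = torch.zeros(M, d)                         # X_{t_0} = x_0 = 0
    Y = Y0.expand(M, 1)
    Z = Z0.expand(M, d)
    for n in range(N):
        t = n * dt
        dW = torch.randn(M, d) * dt ** 0.5
        # backward equation marched forward: Y_{t_{n+1}} = Y_{t_n} - f dt + Z^T sigma dW
        Y = Y - f(t, X, Y, Z) * dt + (Z * sigma(t, X) * dW).sum(dim=1, keepdim=True)
        X = X + b(t, X) * dt + sigma(t, X) * dW
        if n < N - 1:
            Z = Znets[n](X)                       # Z_{t_{n+1}}(X_{t_{n+1}})
    loss = ((g(X) - Y) ** 2).mean()               # terminal mismatch, eq. (7)
    opt.zero_grad(); loss.backward(); opt.step()
```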

2.3 FBSNNs

Raissi [6] introduced the FBSNNs to solve FBSDEs. The unknown solution u(t,x) is approximated by an FNN with input (t,x), and the required gradient vector \nabla u(t,x) is obtained by applying AD. The parameters \theta of the FNN are learned by minimizing the loss function given explicitly in (9), obtained by discretizing the FBSDEs (5) with the Euler-Maruyama scheme

\left\{\begin{aligned} &X_{t_{n+1}}=X_{t_{n}}+b(t_{n},X_{t_{n}})\Delta t_{n}+\sigma(t_{n},X_{t_{n}})\Delta W_{t_{n}},\\ &\widetilde{Y}_{t_{n+1}}(X_{t_{n+1}})=Y_{t_{n}}(X_{t_{n}})-f_{t_{n}}(X_{t_{n}},Y_{t_{n}}(X_{t_{n}}),Z_{t_{n}}(X_{t_{n}}))\Delta t_{n}\\ &+Z_{t_{n}}^{T}(X_{t_{n}})\sigma(t_{n},X_{t_{n}})\Delta W_{t_{n}},\end{aligned}\right. (8)

where Y_{t_{n}}(X_{t_{n}}) represents the estimated value of u(t_{n},X_{t_{n}}) given by the FNN and \widetilde{Y}_{t_{n+1}}(X_{t_{n+1}}) is the reference value corresponding to Y_{t_{n+1}}(X_{t_{n+1}}), which is obtained from the calculation in (8). The loss function is then given by

\displaystyle l(\theta)=\sum_{m=1}^{M}\left[\sum_{n=0}^{N-1}|(\widetilde{Y}_{t_{n+1}}-Y_{t_{n+1}})(X_{t_{n+1},m})|^{2}+|(g-Y_{t_{N}})(X_{t_{N},m})|^{2}\right], (9)

where the subscript m has the same meaning as in (7).

The deep BSDE method only computes the value of u(t_{0},X_{t_{0}}). This means that in order to obtain an approximation to Y_{t_{n}}(X_{t_{n}})=u(t_{n},X_{t_{n}}) at a later time t_{n}>t_{0}, the algorithm has to be retrained. Furthermore, the number of FNNs grows with the number of time steps N, which makes training difficult. In this article, we use the FBSNNs: the FNN is expected to approximate u(t,x) over the entire computational domain instead of at only one point. That is, we use multiple initial points to train the FNN. To improve efficiency, the number of Brownian motion trajectories for each initial point is set to M=1.
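The following minimal PyTorch sketch illustrates the FBSNN loss (9) with one trajectory per initial point: a single FNN takes (t,x) as input, its spatial gradient is obtained with AD, and the network values are matched along each simulated path and against the terminal data. The generator, terminal condition and constant scalar diffusion used here are placeholders for illustration, not the equations solved later in the paper.

```python
# Minimal FBSNN sketch: one network for u(t, x); Z = grad_x u via automatic differentiation.
import torch

d, N, K, T = 2, 10, 32, 1.0
dt = T / N
f = lambda t, x, y, z: -y                                   # generator (placeholder)
g = lambda x: torch.sin(x).sum(dim=1, keepdim=True)         # terminal condition (placeholder)
sigma = 1.0                                                 # constant scalar diffusion (placeholder)

net = torch.nn.Sequential(torch.nn.Linear(d + 1, 30), torch.nn.Tanh(),
                          torch.nn.Linear(30, 30), torch.nn.Tanh(),
                          torch.nn.Linear(30, 1))

def u_and_grad(t, x):
    x = x.detach().requires_grad_(True)                     # x becomes a leaf for AD
    y = net(torch.cat([t, x], dim=1))
    z = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
    return y, z

opt = torch.optim.Adam(net.parameters(), lr=5e-3)
for it in range(100):
    X = torch.rand(K, d)                                    # K initial sampling points (illustrative domain)
    t = torch.zeros(K, 1)
    Y, Z = u_and_grad(t, X)
    loss = 0.0
    for n in range(N):
        dW = torch.randn(K, d) * dt ** 0.5
        Y_ref = Y - f(t, X, Y, Z) * dt + sigma * (Z * dW).sum(dim=1, keepdim=True)
        X = X + sigma * dW
        t = t + dt
        Y, Z = u_and_grad(t, X)                             # network value at (t_{n+1}, X_{t_{n+1}})
        loss = loss + ((Y_ref - Y) ** 2).mean()             # trajectory-matching term of (9)
    loss = loss + ((g(X) - Y) ** 2).mean()                  # terminal term of (9)
    opt.zero_grad(); loss.backward(); opt.step()
```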

3 Deep neural network for solving the incompressible Navier-Stokes equation

3.1 A class of FBSDEs associated to the incompressible Navier-Stokes equation

The Cauchy problem for the deterministic backward Navier–Stokes equation for the velocity field of an incompressible viscous fluid is

\left\{\begin{aligned} &u_{t}+\nu\Delta u+(u\cdot\nabla)u+\nabla p+f=0,\ 0\leq t\leq T,\\ &\nabla\cdot u=0,\ u(T)=g,\\ \end{aligned}\right. (10)

which is obtained from the classical Navier–Stokes equation via the time-reversing transformation

\displaystyle(u,p,f)(t,x)\rightarrow(-u,p,f)(T-t,x),\quad 0\leq t\leq T. (11)

Here u=u(t,x) represents the d-dimensional velocity field of a fluid, p=p(t,x) is the pressure, \nu>0 is the viscosity coefficient, and f=f(t,x) stands for the external force. We now study the backward Navier–Stokes equation (10) in \mathbb{R}^{d} with different boundary conditions.

Then, the PDE (10) is associated through the nonlinear Feynman-Kac formula to the following FBSDEs

\left\{\begin{aligned} &dX_{s}=\sqrt{2\nu}dW_{s},\quad s\in[0,T],\\ &X_{0}=x_{0},\\ &-dY_{s}=\left(f(s,X_{s})+\nabla p\left(s,X_{s}\right)+(Y_{s}\cdot\nabla)Y_{s}\right)ds-\sqrt{2\nu}Z_{s}^{T}dW_{s},\quad s\in[0,T],\\ &Y_{T}=g(X_{T}),\\ \end{aligned}\right. (12)

where

Y_{s}=u\left(s,X_{s}\right),\quad Z_{s}=\nabla u\left(s,X_{s}\right).
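For readability, we note how (12) fits the general framework (1)-(2): the forward process has zero drift and a constant diffusion, and the whole nonlinearity of (10), including the pressure gradient, is absorbed into the generator,

b(s,x)\equiv 0,\quad\sigma(s,x)=\sqrt{2\nu}I_{d},\quad f(s,X_{s},Y_{s},Z_{s})=f(s,X_{s})+\nabla p(s,X_{s})+(Y_{s}\cdot\nabla)Y_{s},

so that \frac{1}{2}\mathrm{tr}\left(\sigma\sigma^{T}\nabla^{2}u\right)=\nu\Delta u reproduces the viscous term and the terminal condition Y_{T}=g(X_{T}) matches u(T)=g.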

3.2 The algorithm for solving the incompressible Navier-Stokes equation

Given a partition of [0,T]: 0=t_{0}<t_{1}<...<t_{N}=T, we consider the Euler-Maruyama scheme with n=0,...,N-1 for the FBSDEs (12)

\left\{\begin{aligned} &X_{t_{n+1}}=X_{t_{n}}+\sqrt{2\nu}\Delta W_{t_{n}},\\ &\widetilde{Y}_{t_{n+1}}(X_{t_{n+1}})=Y_{t_{n}}(X_{t_{n}})-(f_{t_{n}}+\nabla P_{t_{n}}+(Y_{t_{n}}\cdot\nabla)Y_{t_{n}})(X_{t_{n}})\Delta t_{n}\\ &+\sqrt{2\nu}Z_{t_{n}}^{T}(X_{t_{n}})\Delta W_{t_{n}},\end{aligned}\right. (13)

where \Delta t_{n}=t_{n+1}-t_{n}=\frac{T}{N} and \Delta W_{t_{n}}=W_{t_{n+1}}-W_{t_{n}}. Here (Y_{t_{n}},P_{t_{n}})^{T} represents the estimated value of (u,p)^{T} at time t_{n} given by the FNN, and \widetilde{Y}_{t_{n+1}}(X_{t_{n+1}}) is the reference value of Y_{t_{n+1}}(X_{t_{n+1}}), which is obtained from the second equation in (13). We utilize K different initial sampling points for training the FNN. The proposed scheme is summarized in Algorithm 1 and illustrated in Figure 1.

Algorithm 1 Algorithm for the incompressible Navier-Stokes equation
1:Number of initial sampling points KK, terminal time TT, number of time intervals NN, viscosity coefficient ν\nu, maximum number of training iterations MiterM_{iter}.
2:The optimal FNN 𝒰θ\mathcal{U}_{\theta}.
3:Initialize the FNN 𝒰θ\mathcal{U}_{\theta};
4:Select initial sampling points x0x_{0} by uniform distribution;
5:Generate independent dd-dimensional standard Brownian motions Wtn(n=0,,N)W_{t_{n}}(n=0,...,N);
6:Compute Xtn+1X_{t_{n+1}} according to (13) for n=0,,N1n=0,...,N-1;
7:According to (13), use the FNN 𝒰θ\mathcal{U}_{\theta} with AD to calculate Y~tn+1(Xtn+1)\widetilde{Y}_{t_{n+1}}(X_{t_{n+1}}) for n=0,,N1n=0,\ldots,N-1;
8:Minimize the loss function by the Adam algorithm
\displaystyle l(\theta)=\frac{1}{K}\sum_{k=1}^{K}\left[\frac{1}{N}\sum_{n=0}^{N-1}|(\widetilde{Y}_{t_{n+1}}-Y_{t_{n+1}})(X_{t_{n+1},k})|^{2}+\alpha_{1}|(g-Y_{t_{N}})(X_{t_{N},k})|^{2}\right. (14)
\left.+\frac{\alpha_{2}}{N+1}\sum_{n=0}^{N}|\nabla\cdot Y_{t_{n}}(X_{t_{n},k})|^{2}\right],
where αi,i=1,2\alpha_{i},i=1,2 are the weights of the components of the loss function. The subscript kk corresponds to the kk-th initial sampling point;
9:Repeat procedures 2 to 6 until the maximum number of training iterations MiterM_{iter} is reached.
Figure 1: Illustration of Algorithm 1 for solving the incompressible Navier-Stokes equation
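The drift in (13) and the divergence penalty in (14) only require first-order derivatives of the network outputs. The following PyTorch sketch shows one way to assemble \nabla p, (u\cdot\nabla)u and \nabla\cdot u with AD, assuming a placeholder network `net` that maps (t,x) to (u_{1},\ldots,u_{d},p); it is an illustration of the computation, not the authors' implementation.

```python
# Sketch: first-order Navier-Stokes terms via automatic differentiation.
import torch

d = 2
net = torch.nn.Sequential(torch.nn.Linear(d + 1, 30), torch.nn.Tanh(),
                          torch.nn.Linear(30, d + 1))       # outputs (u_1, ..., u_d, p)

def ns_terms(t, x):
    """Return u, grad p, (u . grad) u and div u at the points (t, x)."""
    x = x.detach().requires_grad_(True)
    out = net(torch.cat([t, x], dim=1))
    u, p = out[:, :d], out[:, d:]
    grad_p = torch.autograd.grad(p.sum(), x, create_graph=True)[0]
    grads = [torch.autograd.grad(u[:, i].sum(), x, create_graph=True)[0] for i in range(d)]
    conv = torch.stack([(u * gi).sum(dim=1) for gi in grads], dim=1)   # (u . grad) u
    div = sum(grads[i][:, i] for i in range(d)).unsqueeze(1)           # divergence, for the penalty in (14)
    return u, grad_p, conv, div

t, x = torch.zeros(5, 1), torch.rand(5, d)
u, grad_p, conv, div = ns_terms(t, x)
```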

3.3 The algorithm for solving the incompressible Navier-Stokes equation with Dirichlet boundary condition

For the backward Navier–Stokes equation (10) in Ωd\Omega\subset\mathbb{R}^{d} with the Dirichlet boundary condition

\displaystyle u(t,x)=h(t,x),\quad(t,x)\in[0,T]\times\partial\Omega, (15)

the corresponding FBSDEs can be rewritten as the following form according to [43] by the nonlinear Feynman-Kac formula

\left\{\begin{aligned} &X_{t}=x_{0}+\int_{0}^{t}\sqrt{2\nu}dW_{s},\\ &Y_{t}=\Phi(T\Lambda\tau,X_{T\Lambda\tau})+\int_{t\Lambda\tau}^{T\Lambda\tau}(f(s,X_{s})+\nabla p\left(s,X_{s}\right)\\ &+(Y_{s}\cdot\nabla)Y_{s})ds-\int_{t\Lambda\tau}^{T\Lambda\tau}\sqrt{2\nu}Z_{s}^{T}dW_{s},\end{aligned}\right. (16)

where a\Lambda b=\min\{a,b\}, the stopping time \tau=\inf\{t>0:X_{t}\notin\Omega\} is the first time that the process X_{t} exits \Omega, and

\displaystyle\Phi(T\Lambda\tau,X_{T\Lambda\tau})=\left\{\begin{array}[]{ll}g(X_{T}),\quad\tau>T,\ X_{T}\in\Omega,\\ h(\tau,X_{\tau}),\quad\tau\leq T,\ X_{\tau}\in\partial\Omega.\\ \end{array}\right. (17)

Through the Euler scheme, the discrete formulation of the FBSDEs (16) can be obtained accordingly

\left\{\begin{aligned} &X_{t_{n+1}}=X_{t_{n}}+\sqrt{2\nu}\Delta W_{t_{n}},\\ &\widetilde{Y}_{t_{n+1}\Lambda\tau}(X_{t_{n+1}\Lambda\tau})=Y_{t_{n}\Lambda\tau}(X_{t_{n}\Lambda\tau})-\mathbbm{1}_{(0,\tau)}(t_{n})[(f_{t_{n}}+\nabla P_{t_{n}}\\ &+(Y_{t_{n}}\cdot\nabla)Y_{t_{n}})(X_{t_{n}})\Delta t_{n}-\sqrt{2\nu}Z_{t_{n}}^{T}(X_{t_{n}})\Delta W_{t_{n}}],\end{aligned}\right. (18)

where \mathbbm{1}_{(0,\tau)}(t_{n})=1 for t_{n}\in[0,\tau). It should be noted that the stopping time \tau is determined after the iteration of X_{t_{n+1}} is completed. Setting \tau=t_{n+1} when \tau\leq T, we place X_{\tau}=X_{t_{n+1}} on \partial\Omega and update \Delta W_{t_{n}}=(X_{t_{n+1}}-X_{t_{n}})/\sqrt{2\nu} so that (18) is satisfied. The algorithm of the proposed scheme is similar to Algorithm 1.
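A small NumPy sketch of this exit-time treatment is given below for the special case of a box domain (an assumption made only for illustration): when a path first leaves \Omega, the exterior point is projected onto \partial\Omega, the Brownian increment is recomputed so that the first equation of (18) still holds, and the indicator freezes the path afterwards.

```python
# Sketch of the stopping-time handling for a box domain Omega = (0, 2*pi)^2.
import numpy as np

d, N, K, T, nu = 2, 20, 8, 0.1, 1e-2
dt, lo, hi = T / N, 0.0, 2 * np.pi
rng = np.random.default_rng(0)

X = rng.uniform(lo, hi, size=(K, d))      # initial sampling points in Omega
alive = np.ones(K, dtype=bool)            # indicator 1_{(0, tau)}(t_n) for each path
for n in range(N):
    dW = rng.normal(scale=np.sqrt(dt), size=(K, d))
    X_new = X + np.sqrt(2 * nu) * dW
    exited = alive & np.any((X_new < lo) | (X_new > hi), axis=1)
    # project the first exterior position onto the boundary (for a box, clipping suffices;
    # the intersection of the segment with the boundary could be used instead) ...
    X_new[exited] = np.clip(X_new[exited], lo, hi)
    # ... and recompute the increment so that X_{t_{n+1}} = X_{t_n} + sqrt(2 nu) dW still holds
    dW[exited] = (X_new[exited] - X[exited]) / np.sqrt(2 * nu)
    X_new[~alive] = X[~alive]             # stopped paths no longer move
    X = X_new
    alive = alive & ~exited               # afterwards the indicator in (18) vanishes
```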

3.4 The algorithm for solving the incompressible Navier-Stokes equation with Neumann boundary condition

We consider the backward Navier–Stokes equation (10) with the Neumann boundary condition

\displaystyle\frac{\partial u(t,x)}{\partial\mathbf{n}}=q(t,x),\quad(t,x)\in[0,T]\times\partial\Omega, (19)

where \mathbf{n} is the unit normal vector at \partial\Omega pointing outward of \Omega. Suppose x\in\partial\Omega, x+\Delta x\notin\Omega and x-\Delta x\in\Omega, where x+\Delta x and x-\Delta x are symmetric with respect to the boundary \partial\Omega. Then we have

\displaystyle u(t,x-\Delta x)-u(t,x+\Delta x)\approx-2q(t,x)|\Delta x|,\quad(t,x)\in[0,T]\times\partial\Omega. (20)

If X_{t_{n+1}}\in\Omega, let X^{\prime}_{t_{n+1}}=X_{t_{n+1}}; if X_{t_{n+1}}\notin\Omega, let X^{\prime}_{t_{n+1}}\in\Omega be the point symmetric to X_{t_{n+1}} with respect to the boundary \partial\Omega. We use X^{\prime\prime}_{t_{n+1}} to denote the intersection of the line segment \overline{X_{t_{n+1}}X^{\prime}_{t_{n+1}}} and \partial\Omega. Therefore, the discretization can be rewritten similarly as

\left\{\begin{aligned} &X_{t_{n+1}}=X_{t_{n}}+\sqrt{2\nu}\Delta W_{t_{n}},\\ &\Delta Y_{t_{n+1}}=q(t_{n+1},X^{\prime\prime}_{t_{n+1}})|\Delta X_{t_{n+1}}|,\\ &X_{t_{n+1}}=X^{\prime}_{t_{n+1}},\\ &\widetilde{Y}_{t_{n+1}}(X_{t_{n+1}})=Y_{t_{n}}(X_{t_{n}})-(f_{t_{n}}+\nabla P_{t_{n}}+(Y_{t_{n}}\cdot\nabla)Y_{t_{n}})(X_{t_{n}})\Delta t_{n}\\ &+\sqrt{2\nu}Z_{t_{n}}^{T}(X_{t_{n}})\Delta W_{t_{n}}-\Delta Y_{t_{n+1}},\end{aligned}\right. (21)

where \Delta X_{t_{n+1}}=X_{t_{n+1}}-X^{\prime}_{t_{n+1}}. The algorithm of the proposed scheme is similar to Algorithm 1.
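The following NumPy sketch illustrates the reflection step in (21) for a box domain with a placeholder boundary flux q (both are assumptions made for illustration only): an exterior point is reflected back into \Omega, the midpoint of the segment is taken as the boundary point X'', and the correction \Delta Y=q(t,X'')|\Delta X| is recorded.

```python
# Sketch of one reflection step of (21) on a box domain.
import numpy as np

d, K, nu, dt = 2, 8, 1e-2, 0.02
lo, hi = 0.0, 2 * np.pi
q = lambda t, x: np.zeros((x.shape[0], 1))            # Neumann boundary data (placeholder)
rng = np.random.default_rng(0)

X = rng.uniform(lo, hi, size=(K, d))
dW = rng.normal(scale=np.sqrt(dt), size=(K, d))
X_next = X + np.sqrt(2 * nu) * dW                     # may leave the box

# reflect across the violated faces: X' is the mirror image of X_{t_{n+1}} in the boundary
X_refl = np.where(X_next < lo, 2 * lo - X_next, X_next)
X_refl = np.where(X_refl > hi, 2 * hi - X_refl, X_refl)
X_mid = 0.5 * (X_next + X_refl)                       # X'': boundary point (single-face case)
dX = X_next - X_refl
dY = q(dt, X_mid) * np.linalg.norm(dX, axis=1, keepdims=True)   # correction Delta Y in (21)
X_next = X_refl                                       # continue the path from the reflected point
```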

Remark 1.

There are some similar works in [25, 29] dealing with Neumann boundary conditions. If X_{t_{n+1}}\notin\Omega during the iterative process, the authors of [25, 29] choose to reflect X_{t_{n+1}} at the boundary \partial\Omega, which allows them to deal with homogeneous Neumann conditions. In contrast, our method can deal with non-homogeneous Neumann boundary conditions.

4 Deep neural network for solving the Cahn-Hilliard equation

4.1 Rewrite Cahn-Hilliard equation into a parabolic PDE system

We consider the following Cahn-Hilliard equation, which has fourth order derivatives,

\left\{\begin{aligned} &\phi_{t}-L_{d}\Delta\mu+f=0,\quad t\geq 0,\\ &\mu+\gamma^{2}\Delta\phi+\phi-\phi^{3}=0,\quad t\geq 0,\\ &\phi(0)=-g,\end{aligned}\right. (22)

where \phi=\phi(t,x) is the unknown, e.g., the concentration of the fluid, \mu=\mu(t,x) is a function of \phi, e.g., the chemical potential, L_{d}>0 is the diffusion coefficient and \gamma>0 is the model parameter. A first order stabilized scheme [44] for the Cahn-Hilliard equation (22) reads as

\left\{\begin{aligned} &\phi^{n+1}-\phi^{n}-\Delta tL_{d}\Delta\mu^{n}+\Delta tf^{n}=0,\\ &\mu^{n+1}+\gamma^{2}\Delta\phi^{n}+\phi^{n}-(\phi^{n})^{3}-\frac{S}{L_{d}}(\phi^{n+1}-\phi^{n})=0,\\ \end{aligned}\right. (23)

where S is a suitable stabilization parameter. It is easy to derive the following equation

\displaystyle\frac{\mu^{n+1}-\mu^{n}}{\Delta t}+\frac{\gamma^{2}\Delta\phi^{n}}{\Delta t}+\frac{\phi^{n}-(\phi^{n})^{3}-\frac{S}{L_{d}}(\phi^{n+1}-\phi^{n})}{\Delta t}=\frac{-\mu^{n}}{\Delta t}. (24)

The first equation of (23) and (24) can be regarded as the discretization of the following modified Cahn-Hilliard equation in [0,T]

\left\{\begin{aligned} &\phi_{t}-L_{d}\Delta\mu+f=0,\\ &\mu_{t}+\frac{\gamma^{2}}{\delta}\Delta\phi-S\Delta\mu+\frac{S}{L_{d}}f+\frac{1}{\delta}\left(\mu+\phi-\phi^{3}\right)=0,\\ &\phi(0)=-g,\end{aligned}\right. (25)

where \delta=O(\Delta t). By reversing time and defining

(\phi,\mu,f)(t,x)\rightarrow(-\phi,-\mu,f)(T-t,x),\quad 0\leq t\leq T,

the pair (\phi,\mu) satisfies the following backward Cahn-Hilliard equation in [0,T]

\left\{\begin{aligned} &\phi_{t}+L_{d}\Delta\mu+f=0,\\ &\mu_{t}-\frac{\gamma^{2}}{\delta}\Delta\phi+S\Delta\mu+\frac{S}{L_{d}}f-\frac{1}{\delta}(\mu+\phi-\phi^{3})=0,\\ &\phi(T)=g.\end{aligned}\right. (26)

In order to satisfy the nonlinear Feynman-Kac formula and utilize the FBSNNs, we treat the backward Cahn-Hilliard equation (26) as a semilinear parabolic differential equation

\displaystyle\psi_{t}+A\Delta\psi+F=0, (27)

with \psi=(\phi,\mu)^{T}, A=\left(\begin{array}[]{ll}0&L_{d}\\ -\frac{\gamma^{2}}{\delta}&S\end{array}\right) and F=\left(f,\frac{S}{L_{d}}f-\frac{1}{\delta}(\mu+\phi-\phi^{3})\right)^{T}. Supposing \lambda_{1} and \lambda_{2} are two different eigenvalues of the coefficient matrix A, then A can be diagonalized by R and R^{-1} so that D=\mbox{diag}(\lambda_{1},\lambda_{2})=R^{-1}AR, where R is a matrix of eigenvectors. The system (27) becomes

\displaystyle\hat{\psi}_{t}+D\Delta\hat{\psi}+\hat{F}=0, (28)

where \hat{\psi}=R^{-1}\psi=(\hat{\phi},\hat{\mu})^{T}, \hat{F}=R^{-1}F=(\hat{F}^{\hat{\phi}},\hat{F}^{\hat{\mu}})^{T} and S is chosen so that S>2\gamma\sqrt{\frac{L_{d}}{\delta}}, which ensures \lambda_{1,2}>0, where \lambda_{1,2}=\frac{S\pm\sqrt{S^{2}-\frac{4\gamma^{2}L_{d}}{\delta}}}{2}.
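The diagonalization can be checked numerically; the short NumPy sketch below uses illustrative parameter values (not those of the experiments in Section 6) and verifies that, with S>2\gamma\sqrt{L_{d}/\delta}, the matrix A has two distinct positive eigenvalues \lambda_{1,2} and that R^{-1}AR is diagonal.

```python
# Numerical check of the diagonalization used in (27)-(28).
import numpy as np

Ld, gamma, delta = 1.0, 0.1, 0.02                     # illustrative values
S = 2.5 * gamma * np.sqrt(Ld / delta)                 # satisfies S > 2 gamma sqrt(Ld / delta)

A = np.array([[0.0, Ld],
              [-gamma**2 / delta, S]])
disc = np.sqrt(S**2 - 4 * gamma**2 * Ld / delta)
lam1, lam2 = (S + disc) / 2, (S - disc) / 2           # both positive under the condition on S

eigvals, R = np.linalg.eig(A)                         # columns of R are eigenvectors
R_inv = np.linalg.inv(R)
D = R_inv @ A @ R                                     # diagonal, entries are lam1 and lam2 (in some order)
print(np.round(D, 10), lam1, lam2)
```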

Therefore, the system (28) is decomposed into two independent PDEs and the corresponding FBSDEs can be obtained as follows

\left\{\begin{aligned} &dX_{s}^{\hat{\phi}}=\sqrt{2\lambda_{1}}dW_{s}^{\hat{\phi}},\quad s\in[0,T],\\ &dX_{s}^{\hat{\mu}}=\sqrt{2\lambda_{2}}dW_{s}^{\hat{\mu}},\quad s\in[0,T],\\ &X_{0}^{\hat{\phi}}=x_{0},\ X_{0}^{\hat{\mu}}=x_{0},\\ &-dY_{s}^{\hat{\phi}}=\hat{F}_{s}^{\hat{\phi}}ds-\sqrt{2\lambda_{1}}(Z_{s}^{\hat{\phi}})^{T}dW_{s}^{\hat{\phi}},\quad s\in[0,T],\\ &-dY_{s}^{\hat{\mu}}=\hat{F}_{s}^{\hat{\mu}}ds-\sqrt{2\lambda_{2}}(Z_{s}^{\hat{\mu}})^{T}dW_{s}^{\hat{\mu}},\quad s\in[0,T],\\ &\phi(T)=g,\end{aligned}\right. (29)

where

Y_{s}^{\hat{\phi}}=\hat{\phi}\left(s,X_{s}^{\hat{\phi}}\right),\quad Z_{s}^{\hat{\phi}}=\nabla\hat{\phi}\left(s,X_{s}^{\hat{\phi}}\right),\quad\hat{F}_{s}^{\hat{\phi}}=\hat{F}^{\hat{\phi}}\left(s,X_{s}^{\hat{\phi}},\phi(s,X_{s}^{\hat{\phi}}),\mu(s,X_{s}^{\hat{\phi}})\right),

and

Y_{s}^{\hat{\mu}}=\hat{\mu}\left(s,X_{s}^{\hat{\mu}}\right),\quad Z_{s}^{\hat{\mu}}=\nabla\hat{\mu}\left(s,X_{s}^{\hat{\mu}}\right),\quad\hat{F}_{s}^{\hat{\mu}}=\hat{F}^{\hat{\mu}}\left(s,X_{s}^{\hat{\mu}},\phi(s,X_{s}^{\hat{\mu}}),\mu(s,X_{s}^{\hat{\mu}})\right).

Here \{X_{s}^{\hat{\phi}}\}_{0\leq s\leq T} and \{X_{s}^{\hat{\mu}}\}_{0\leq s\leq T} are the forward stochastic processes corresponding to \hat{\phi} and \hat{\mu} respectively, both of which start from x_{0} at the initial time.

4.2 The algorithm for solving the Cahn-Hilliard equation

Given a partition of [0,T]: 0=t_{0}<t_{1}<...<t_{N}=T, we consider the simple Euler scheme for the FBSDEs (29) with n=0,...,N-1

{Xtn+1ϕ^=Xtnϕ^+2λ1ΔWtnϕ^,Xtn+1μ^=Xtnμ^+2λ2ΔWtnμ^.Y~tn+1ϕ^(Xtn+1ϕ^)=Ytnϕ^(Xtnϕ^)F^tnϕ^(Xtnϕ^,Ytnϕ(Xtnϕ^),Ytnμ(Xtnϕ^))Δtn+2λ1(Ztnϕ^(Xtnϕ^))TΔWtnϕ^,Y~tn+1μ^(Xtn+1μ^)=Ytnμ^(Xtnμ^)F^tnμ^(Xtnμ^,Ytnϕ(Xtnμ^),Ytnμ(Xtnμ^))Δtn+2λ2(Ztnμ^(Xtnμ^))TΔWtnμ^,\left\{\begin{aligned} &X_{t_{n+1}}^{\hat{\phi}}=X_{t_{n}}^{\hat{\phi}}+\sqrt{2\lambda_{1}}\Delta W_{t_{n}}^{\hat{\phi}},\\ &X_{t_{n+1}}^{\hat{\mu}}=X_{t_{n}}^{\hat{\mu}}+\sqrt{2\lambda_{2}}\Delta W_{t_{n}}^{\hat{\mu}}.\\ &\widetilde{Y}_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}})=Y_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}})-\hat{F}_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}},Y_{t_{n}}^{\phi}(X_{t_{n}}^{\hat{\phi}}),Y_{t_{n}}^{\mu}(X_{t_{n}}^{\hat{\phi}}))\Delta t_{n}\\ &+\sqrt{2\lambda_{1}}(Z_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}}))^{T}\Delta W_{t_{n}}^{\hat{\phi}},\\ &\widetilde{Y}_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}})=Y_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}})-\hat{F}_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}},Y_{t_{n}}^{\phi}(X_{t_{n}}^{\hat{\mu}}),Y_{t_{n}}^{\mu}(X_{t_{n}}^{\hat{\mu}}))\Delta t_{n}\\ &+\sqrt{2\lambda_{2}}(Z_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}}))^{T}\Delta W_{t_{n}}^{\hat{\mu}},\end{aligned}\right. (30)

where \Delta t_{n}=t_{n+1}-t_{n}=\frac{T}{N}=\delta, \Delta W_{t_{n}}^{\hat{\phi}}=W_{t_{n+1}}^{\hat{\phi}}-W_{t_{n}}^{\hat{\phi}} and \Delta W_{t_{n}}^{\hat{\mu}}=W_{t_{n+1}}^{\hat{\mu}}-W_{t_{n}}^{\hat{\mu}}. Here (Y_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}}),Y_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}}))^{T} represents the estimated value of (\hat{\phi}(t_{n},X_{t_{n}}^{\hat{\phi}}),\hat{\mu}(t_{n},X_{t_{n}}^{\hat{\mu}}))^{T}, and (\widetilde{Y}_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}}),\widetilde{Y}_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}}))^{T} is the reference value of (Y_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}}),Y_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}}))^{T}, which is obtained from the last two equations in (30).

Here (Y_{t_{n}}^{\phi},Y_{t_{n}}^{\mu})^{T} represents the estimated value of (\phi,\mu)^{T} at time t_{n} given by the FNN. Due to the diagonalization, we have

\left\{\begin{aligned} &Y_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}})=\left[R^{-1}(Y_{t_{n}}^{\phi},Y_{t_{n}}^{\mu})^{T}(X_{t_{n}}^{\hat{\phi}})\right]_{1},\\ &Y_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}})=\left[R^{-1}(Y_{t_{n}}^{\phi},Y_{t_{n}}^{\mu})^{T}(X_{t_{n}}^{\hat{\mu}})\right]_{2},\end{aligned}\right. (31)

where the subscript 1 or 2 denotes the first or the second component of the vector. We utilize K different initial sampling points for training the FNN. The proposed scheme is summarized in Algorithm 2 and illustrated in Figure 2.

Algorithm 2 Algorithm for the Cahn-Hilliard equation
1:Number of initial sampling points KK, terminal time TT, number of time intervals NN, diffusion coefficient LdL_{d}, parameters γ\gamma, δ\delta and SS, matrix R1R^{-1}, eigenvalues λ1,2\lambda_{1,2}, maximum number of training iterations MiterM_{iter}.
2:The optimal FNN 𝒰θ\mathcal{U}_{\theta}.
3:Initialize the FNN 𝒰θ\mathcal{U}_{\theta};
4:Select initial sampling points x0x_{0} by uniform distribution;
5:Generate independent dd-dimensional standard Brownian motions Wtnϕ^W_{t_{n}}^{\hat{\phi}} and Wtnμ^(n=0,,N)W_{t_{n}}^{\hat{\mu}}(n=0,...,N);
6:Compute Xtn+1ϕ^X_{t_{n+1}}^{\hat{\phi}} and Xtn+1μ^X_{t_{n+1}}^{\hat{\mu}} according to (30) for n=0,,N1n=0,...,N-1;
7:According to (30) and (31), use the FNN 𝒰θ\mathcal{U}_{\theta} with AD to calculate Y~tn+1ϕ^(Xtn+1ϕ^)\widetilde{Y}_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}}) and Y~tn+1μ^(Xtn+1μ^)\widetilde{Y}_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}}) for n=0,,N1n=0,\ldots,N-1;
8:Minimize the loss function by the Adam algorithm
\displaystyle l(\theta)=\frac{1}{K}\sum_{k=1}^{K}\left[\frac{1}{N}\sum_{n=0}^{N-1}\left(|(\widetilde{Y}_{t_{n+1}}^{\hat{\phi}}-Y_{t_{n+1}}^{\hat{\phi}})(X_{t_{n+1},k}^{\hat{\phi}})|^{2}\right.\right. (32)
\left.\left.+|(\widetilde{Y}_{t_{n+1}}^{\hat{\mu}}-Y_{t_{n+1}}^{\hat{\mu}})(X_{t_{n+1},k}^{\hat{\mu}})|^{2}\right)+\alpha_{1}|(g-Y_{t_{N}}^{\phi})(X_{t_{N},k}^{\hat{\phi}\cup\hat{\mu}})|^{2}\right],
where X_{t_{N},k}^{\hat{\phi}\cup\hat{\mu}}=X_{t_{N},k}^{\hat{\phi}}\cup X_{t_{N},k}^{\hat{\mu}} and \alpha_{1} is the weight of the terminal condition. The subscript k corresponds to the k-th initial sampling point;
9:Repeat procedures 2 to 6 until the maximum number of training iterations MiterM_{iter} is reached.
Figure 2: Illustration of Algorithm 2 for solving the Cahn-Hilliard equation

Algorithm 2 shows that only first-order derivatives need to be computed during training, so the training time increases only linearly with the dimension d. This makes our method capable of efficiently solving high-dimensional Cahn-Hilliard equations, as observed in the numerical experiments of Section 6.2.

4.3 The algorithm for solving the Cahn-Hilliard equation with mixed boundary condition

We consider the Cahn-Hilliard equation (22) in Ωd\Omega\subset\mathbb{R}^{d} with the mixed condition

\displaystyle\left\{\begin{aligned} &\phi(t,x)=h(t,x),\\ &\frac{\partial\mu(t,x)}{\partial\mathbf{n}}=q(t,x),\end{aligned}\right.\quad(t,x)\in[0,T]\times\partial\Omega, (33)

where 𝐧\mathbf{n} is the unit normal vector at Ω\partial\Omega pointing outward of Ω\Omega. The method described in Section 3.3 is used to deal with the Dirichlet boundary condition. For the Neumann boundary condition, it is noted that

\displaystyle\frac{\partial\hat{\mu}}{\partial\mathbf{n}}(t,x)=\left[R^{-1}\left(\frac{\partial\phi}{\partial\mathbf{n}},\frac{\partial\mu}{\partial\mathbf{n}}\right)^{T}(t,x)\right]_{2}, (34)

where \frac{\partial\phi}{\partial\mathbf{n}}(t,x) is given by the FNN with AD and the subscript 2 denotes the second component. Therefore, the method described in Section 3.4 can be used to deal with the Neumann boundary condition. It is shown in Section 6 that our method performs well numerically. The algorithm of the proposed scheme is similar to Algorithm 2.

4.4 The algorithm for solving the Cahn-Hilliard equation with periodic boundary condition

We consider the Cahn-Hilliard equation (22) with the periodic boundary condition,

\displaystyle\left\{\begin{aligned} &\phi(t,x_{1},\cdots,x_{i}+I_{i},\cdots,x_{d})=\phi(t,x_{1},\cdots,x_{i},\cdots,x_{d}),\\ &\mu(t,x_{1},\cdots,x_{i}+I_{i},\cdots,x_{d})=\mu(t,x_{1},\cdots,x_{i},\cdots,x_{d}),\end{aligned}\right.\quad i=1,\cdots,d, (35)

where I_{i} is the period along the i-th direction. To satisfy the condition (35), as in [10], we transform the input vector x=(x_{1},\cdots,x_{d}) into a fixed trigonometric basis before applying the FNN. The component x_{i} of x is mapped as follows

\displaystyle x_{i}\rightarrow\left\{\sin\left(j\cdot 2\pi\frac{x_{i}}{I_{i}}\right),\cos\left(j\cdot 2\pi\frac{x_{i}}{I_{i}}\right)\right\}_{j=1}^{J},\ i=1,\cdots,d, (36)

where J is the order of the trigonometric basis. The network structure for the periodic boundary condition (35) is shown in Figure 3. The algorithm of the proposed scheme is similar to Algorithm 2.
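A minimal sketch of the input mapping (36) is given below; the function name and its arguments are illustrative, and the resulting features (together with t) are what is fed into the FNN so that the network output is automatically I_{i}-periodic in each x_{i}.

```python
# Sketch of the trigonometric feature mapping (36) enforcing periodicity.
import math
import torch

def periodic_features(x, periods, J=2):
    """x: (batch, d); periods: length-d list of I_i; returns (batch, 2*J*d) features."""
    feats = []
    for i, I in enumerate(periods):
        for j in range(1, J + 1):
            arg = 2 * math.pi * j * x[:, i:i + 1] / I
            feats.append(torch.sin(arg))
            feats.append(torch.cos(arg))
    return torch.cat(feats, dim=1)

x = torch.rand(4, 2)
z = periodic_features(x, periods=[2.0, 2.0], J=2)   # feed z (with t) into the FNN
```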

Figure 3: Network structure for periodic boundary condition

5 Deep neural network for solving Cahn-Hilliard-Navier-Stokes system

We now solve the coupled Cahn-Hilliard-Navier-Stokes equation in the domain \mathbb{R}^{d}. According to Section 3.1 and Section 4.1, after time reversal, the modified CHNS system is

\left\{\begin{aligned} &u_{t}+\nu\Delta u+(u\cdot\nabla)u+\nabla p+C\phi\nabla\mu+f_{1}=0,\\ &\phi_{t}+u\cdot\nabla\phi+L_{d}\Delta\mu+f_{2}=0,\\ &\mu_{t}-\frac{\gamma^{2}}{\delta}\Delta\phi+S\Delta\mu+\frac{S}{L_{d}}(u\cdot\nabla\phi+f_{2})-\frac{1}{\delta}(\mu+\phi-\phi^{3})=0,\\ &\nabla\cdot u=0,\\ &u(T)=g_{u},\quad\phi(T)=g_{\phi},\end{aligned}\right. (37)

where C denotes a parameter, e.g., the strength of the capillary force compared with the Newtonian fluid stress.

Similarly, we have the corresponding FBSDEs of (37) by diagonalizing and using the nonlinear Feynman-Kac formula

{dXsu=2νdWsu,s[0,T],dXsϕ^=2λ1dWsϕ^,s[0,T],dXsμ^=2λ2dWsμ^,s[0,T],(X0u,X0ϕ^,X0μ^)T=(x0,x0,x0)T,dYsu=Fsuds2ν(Zsu)TdWsu,s[0,T],dYsϕ^=F^sϕ^ds2λ1(Zsϕ^)TdWsϕ^,s[0,T],dYsμ^=F^sμ^ds2λ2(Zsμ^)TdWsμ^,s[0,T],u(T)=gu,ϕ(T)=gϕ,\left\{\begin{aligned} &dX_{s}^{u}=\sqrt{2\nu}dW_{s}^{u},\quad s\in[0,T],\\ &dX_{s}^{\hat{\phi}}=\sqrt{2\lambda_{1}}dW_{s}^{\hat{\phi}},\quad s\in[0,T],\\ &dX_{s}^{\hat{\mu}}=\sqrt{2\lambda_{2}}dW_{s}^{\hat{\mu}},\quad s\in[0,T],\\ &(X_{0}^{u},X_{0}^{\hat{\phi}},X_{0}^{\hat{\mu}})^{T}=(x_{0},x_{0},x_{0})^{T},\\ &-dY_{s}^{u}=F_{s}^{u}ds-\sqrt{2\nu}(Z_{s}^{u})^{T}dW_{s}^{u},\quad s\in[0,T],\\ &-dY_{s}^{\hat{\phi}}=\hat{F}_{s}^{\hat{\phi}}ds-\sqrt{2\lambda_{1}}(Z_{s}^{\hat{\phi}})^{T}dW_{s}^{\hat{\phi}},\quad s\in[0,T],\\ &-dY_{s}^{\hat{\mu}}=\hat{F}_{s}^{\hat{\mu}}ds-\sqrt{2\lambda_{2}}(Z_{s}^{\hat{\mu}})^{T}dW_{s}^{\hat{\mu}},\quad s\in[0,T],\\ &u(T)=g_{u},\quad\phi(T)=g_{\phi},\end{aligned}\right. (38)

with F^{u}=(u\cdot\nabla)u+\nabla p+C\phi\nabla\mu+f_{1} and

(\hat{F}^{\hat{\phi}},\hat{F}^{\hat{\mu}})^{T}=R^{-1}\left(f_{2}+u\cdot\nabla\phi,\frac{S}{L_{d}}\left(u\cdot\nabla\phi+f_{2}\right)-\frac{1}{\delta}\left(\mu+\phi-\phi^{3}\right)\right)^{T}.

The Euler scheme of (38) for n=0,,N1n=0,\cdots,N-1 is

{Xtn+1u=Xtnu+2νΔWtnu,Xtn+1ϕ^=Xtnϕ^+2λ1ΔWtnϕ^,Xtn+1μ^=Xtnμ^+2λ2ΔWtnμ^,Y~tn+1u(Xtn+1u)=Ytnu(Xtnu)+2ν(Ztnu(Xtnu))TΔWtnuFtnu(Xtnu,Ptn(Xtnu),Ytnu(Xtnu),Ytnϕ(Xtnu),Ztnμ(Xtnu))Δtn,Y~tn+1ϕ^(Xtn+1ϕ^)=Ytnϕ^(Xtnϕ^)+2λ1(Ztnϕ^(Xtnϕ^))TΔWtnϕ^F^tnϕ^(Xtnϕ^,Ytnu(Xtnϕ^),Ytnϕ(Xtnϕ^),Ytnμ(Xtnϕ^),Ztnϕ(Xtnϕ^))Δtn,Y~tn+1μ^(Xtn+1μ^)=Ytnμ^(Xtnμ^)+2λ2(Ztnμ^(Xtnμ^))TΔWtnμ^F^tnμ^(Xtnμ^,Ytnu(Xtnμ^),Ytnϕ(Xtnμ^),Ytnμ(Xtnμ^),Ztnϕ(Xtnμ^))Δtn,\left\{\begin{aligned} &X_{t_{n+1}}^{u}=X_{t_{n}}^{u}+\sqrt{2\nu}\Delta W_{t_{n}}^{u},\\ &X_{t_{n+1}}^{\hat{\phi}}=X_{t_{n}}^{\hat{\phi}}+\sqrt{2\lambda_{1}}\Delta W_{t_{n}}^{\hat{\phi}},\\ &X_{t_{n+1}}^{\hat{\mu}}=X_{t_{n}}^{\hat{\mu}}+\sqrt{2\lambda_{2}}\Delta W_{t_{n}}^{\hat{\mu}},\\ &\widetilde{Y}_{t_{n+1}}^{u}(X_{t_{n+1}}^{u})=Y_{t_{n}}^{u}(X_{t_{n}}^{u})+\sqrt{2\nu}(Z_{t_{n}}^{u}(X_{t_{n}}^{u}))^{T}\Delta W_{t_{n}}^{u}\\ &-F_{t_{n}}^{u}(X_{t_{n}}^{u},\nabla P_{t_{n}}(X_{t_{n}}^{u}),Y_{t_{n}}^{u}(X_{t_{n}}^{u}),Y_{t_{n}}^{\phi}(X_{t_{n}}^{u}),Z_{t_{n}}^{\mu}(X_{t_{n}}^{u}))\Delta t_{n},\\ &\widetilde{Y}_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}})=Y_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}})+\sqrt{2\lambda_{1}}(Z_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}}))^{T}\Delta W_{t_{n}}^{\hat{\phi}}\\ &-\hat{F}_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}},Y_{t_{n}}^{u}(X_{t_{n}}^{\hat{\phi}}),Y_{t_{n}}^{\phi}(X_{t_{n}}^{\hat{\phi}}),Y_{t_{n}}^{\mu}(X_{t_{n}}^{\hat{\phi}}),Z_{t_{n}}^{\phi}(X_{t_{n}}^{\hat{\phi}}))\Delta t_{n},\\ &\widetilde{Y}_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}})=Y_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}})+\sqrt{2\lambda_{2}}(Z_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}}))^{T}\Delta W_{t_{n}}^{\hat{\mu}}\\ &-\hat{F}_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}},Y_{t_{n}}^{u}(X_{t_{n}}^{\hat{\mu}}),Y_{t_{n}}^{\phi}(X_{t_{n}}^{\hat{\mu}}),Y_{t_{n}}^{\mu}(X_{t_{n}}^{\hat{\mu}}),Z_{t_{n}}^{\phi}(X_{t_{n}}^{\hat{\mu}}))\Delta t_{n},\end{aligned}\right. (39)

where \Delta t_{n}=t_{n+1}-t_{n}=\frac{T}{N}=\delta, \Delta W_{t_{n}}^{u}=W_{t_{n+1}}^{u}-W_{t_{n}}^{u}, \Delta W_{t_{n}}^{\hat{\phi}}=W_{t_{n+1}}^{\hat{\phi}}-W_{t_{n}}^{\hat{\phi}} and \Delta W_{t_{n}}^{\hat{\mu}}=W_{t_{n+1}}^{\hat{\mu}}-W_{t_{n}}^{\hat{\mu}}. Here Y_{t_{n}}^{u}(X_{t_{n}}^{u}), Y_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}}) and Y_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}}) represent the estimated values of u(t_{n},X_{t_{n}}^{u}), \hat{\phi}(t_{n},X_{t_{n}}^{\hat{\phi}}) and \hat{\mu}(t_{n},X_{t_{n}}^{\hat{\mu}}), respectively. The \widetilde{Y}_{t_{n+1}}^{u}(X_{t_{n+1}}^{u}), \widetilde{Y}_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}}) and \widetilde{Y}_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}}) are the reference values of Y_{t_{n+1}}^{u}(X_{t_{n+1}}^{u}), Y_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}}) and Y_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}}), respectively, which are obtained from the last three equations in (39).

Here (Y_{t_{n}}^{u},P_{t_{n}})^{T} represents the estimated value of (u,p)^{T} at time t_{n} given by the FNN \mathcal{U}_{\theta_{1}}, and (Y_{t_{n}}^{\phi},Y_{t_{n}}^{\mu})^{T} represents the estimated value of (\phi,\mu)^{T} at time t_{n} given by the FNN \mathcal{U}_{\theta_{2}}. The calculations of Y_{t_{n}}^{\hat{\phi}}(X_{t_{n}}^{\hat{\phi}}) and Y_{t_{n}}^{\hat{\mu}}(X_{t_{n}}^{\hat{\mu}}) are based on (31). We choose K different initial sampling points for training. The proposed scheme is summarized in Algorithm 3.

Algorithm 3 Algorithm for the Cahn-Hilliard-Navier-Stokes equation
1:Number of initial sampling points KK, terminal time TT, number of time intervals NN, viscosity coefficient ν\nu, diffusion coefficient LdL_{d}, parameters γ\gamma, δ\delta, CC and SS, matrix R1R^{-1}, eigenvalues λ1,2\lambda_{1,2}, maximum number of training iterations MiterM_{iter}.
2:The optimal FNNs 𝒰θ1\mathcal{U}_{\theta_{1}} and 𝒰θ2\mathcal{U}_{\theta_{2}}.
3:Initialize the FNNs 𝒰θ1\mathcal{U}_{\theta_{1}} and 𝒰θ2\mathcal{U}_{\theta_{2}};
4:Select initial sampling points x0x_{0} by uniform distribution;
5:Generate independent dd-dimensional standard Brownian motions WtnuW_{t_{n}}^{u}, Wtnϕ^W_{t_{n}}^{\hat{\phi}} and Wtnμ^(n=0,,N)W_{t_{n}}^{\hat{\mu}}(n=0,...,N);
6:Compute Xtn+1uX_{t_{n+1}}^{u}, Xtn+1ϕ^X_{t_{n+1}}^{\hat{\phi}} and Xtn+1μ^X_{t_{n+1}}^{\hat{\mu}} according to (39) for n=0,,N1n=0,...,N-1;
7:According to (31) and (39), use the FNNs 𝒰θ1\mathcal{U}_{\theta_{1}} and 𝒰θ2\mathcal{U}_{\theta_{2}} with AD to calculate Y~tn+1u(Xtn+1u)\widetilde{Y}_{t_{n+1}}^{u}(X_{t_{n+1}}^{u}), Y~tn+1ϕ^(Xtn+1ϕ^)\widetilde{Y}_{t_{n+1}}^{\hat{\phi}}(X_{t_{n+1}}^{\hat{\phi}}) and Y~tn+1μ^(Xtn+1μ^)\widetilde{Y}_{t_{n+1}}^{\hat{\mu}}(X_{t_{n+1}}^{\hat{\mu}});
8:Minimize the loss function by the Adam algorithm
l(θ1,θ2)=1Kk=1K[1Nn=0N1(|(Y~tn+1uYtn+1u)(Xtn+1,ku)|2\displaystyle l(\theta_{1},\theta_{2})=\frac{1}{K}\sum_{k=1}^{K}\left[\frac{1}{N}\sum_{n=0}^{N-1}\left(|(\widetilde{Y}_{t_{n+1}}^{u}-Y_{t_{n+1}}^{u})(X_{t_{n+1},k}^{u})|^{2}\right.\right. (40)
+|(Y~tn+1ϕ^Ytn+1ϕ^)(Xtn+1,kϕ^)|2+|(Y~tn+1μ^Ytn+1μ^)(Xtn+1,kμ^)|2)\displaystyle\left.+|(\widetilde{Y}_{t_{n+1}}^{\hat{\phi}}-Y_{t_{n+1}}^{\hat{\phi}})(X_{t_{n+1},k}^{\hat{\phi}})|^{2}+|(\widetilde{Y}_{t_{n+1}}^{\hat{\mu}}-Y_{t_{n+1}}^{\hat{\mu}})(X_{t_{n+1},k}^{\hat{\mu}})|^{2}\right)
+α1|(guYtNu)(XtN,kuϕ^μ^)|2+α2|(gϕYtNϕ)(XtN,kuϕ^μ^)|2\displaystyle+\alpha_{1}|(g_{u}-Y_{t_{N}}^{u})(X_{t_{N},k}^{u\cup\hat{\phi}\cup\hat{\mu}})|^{2}+\alpha_{2}|(g_{\phi}-Y_{t_{N}}^{\phi})(X_{t_{N},k}^{u\cup\hat{\phi}\cup\hat{\mu}})|^{2}
+α3N+1n=0N|Ytnu(Xtn,kuϕ^μ^)|2],\displaystyle\left.+\frac{\alpha_{3}}{N+1}\sum_{n=0}^{N}|\nabla\cdot Y_{t_{n}}^{u}(X_{t_{n},k}^{u\cup\hat{\phi}\cup\hat{\mu}})|^{2}\right],
where X_{t_{n},k}^{u\cup\hat{\phi}\cup\hat{\mu}}=X_{t_{n},k}^{u}\cup X_{t_{n},k}^{\hat{\phi}}\cup X_{t_{n},k}^{\hat{\mu}} and \alpha_{i},i=1,2,3 are the weights of the components of the loss function. The subscript k corresponds to the k-th initial sampling point;
9:Repeat procedures 2 to 6 until the maximum number of training iterations MiterM_{iter} is reached.

For the CHNS equation with the Dirichlet, Neumann and periodic boundary conditions, similar methods to those in Section 3 and Section 4 can be used.

6 Numerical experiments

In this section, we present a series of numerical results to validate our methods. For quantitative comparison, we calculate the errors between the numerical solution Y_{t_{0}} and the exact solution u_{t_{0}} in the relative L^{\infty} norm and the relative L^{2} norm, which are defined as

||e||_{L^{\infty}}=\frac{||Y_{t_{0}}-u_{t_{0}}||_{L^{\infty}}}{||u_{t_{0}}||_{L^{\infty}}},\ \ \ ||e||_{L^{2}}=\frac{||Y_{t_{0}}-u_{t_{0}}||_{L^{2}}}{||u_{t_{0}}||_{L^{2}}}.

The total number of training iterations is 1E+5, divided into 2E+4, 3E+4, 3E+4 and 2E+4 iterations with learning rates of 5E-3, 5E-4, 5E-5 and 5E-6, respectively, following [6]. We employ the Adam optimizer to train the FNNs. For each training step, we train the FNNs using 100 points randomly selected by the Latin hypercube sampling technique (LHS) in the domain. After the training process, we randomly pick 10000 points by the same method in the domain to test the FNNs. The FNNs have 4 hidden layers, each consisting of 30 neurons. The cosine function is taken as the activation function of the FNNs unless we specify otherwise. In each numerical example, we use a fixed set of the time interval \Delta t, weights \alpha_{i} and the stabilization parameter S; how to select or adjust these hyperparameters will be our future work. In our simulations, we use an AMD Ryzen 7 3700X CPU and an NVIDIA GTX 1660 SUPER GPU to train the FNNs. The parameters and settings of the numerical experiments are summarized in Table 1.

total number of training iterations 1E+5
number of iterations per segment [2E+4, 3E+4, 3E+4, 2E+4]
learning rate per segment [5E-3, 5E-4, 5E-5, 5E-6]
optimization algorithm Adam
structure of neural networks [30,30,30,30]
number of training points 100
number of test points 10000
point selection method LHS
activation function cos
CPU AMD Ryzen 7 3700X
GPU NVIDIA GTX 1660 SUPER
Table 1: Parameters and settings for numerical experiments
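The staged training schedule in Table 1 can be realized, for instance, by resetting the Adam learning rate at the start of each segment; the following sketch is only an illustration with a placeholder network and loss, not the loss functions defined in Algorithms 1-3.

```python
# Sketch of the staged Adam schedule of Table 1 (iteration counts and learning rates per segment).
import torch

net = torch.nn.Linear(3, 1)
loss_fn = lambda: (net(torch.rand(100, 3)) ** 2).mean()     # placeholder loss

opt = torch.optim.Adam(net.parameters(), lr=5e-3)
segments = [(20000, 5e-3), (30000, 5e-4), (30000, 5e-5), (20000, 5e-6)]
for n_iter, lr in segments:
    for group in opt.param_groups:
        group["lr"] = lr                                     # reset the learning rate for this segment
    for _ in range(n_iter):
        opt.zero_grad()
        loss_fn().backward()
        opt.step()
```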

6.1 Navier-Stokes equation

In this section, we numerically simulate the Taylor-Green vortex flow, which is a classical model to test numerical schemes for the incompressible Navier-Stokes equation. First, we consider the explicit 2D Taylor-Green vortex flow

\left\{\begin{aligned} &u_{1}(t,x)=-\cos(x_{1})\sin(x_{2})\exp(-2\nu t),\\ &u_{2}(t,x)=\sin(x_{1})\cos(x_{2})\exp(-2\nu t),\\ &p(t,x)=-\frac{1}{4}\left(\cos(2x_{1})+\cos(2x_{2})\right)\exp(-4\nu t)+c,\end{aligned}\right. (41)

for (t,x)\in(0,T]\times[0,2\pi]^{2} with constant c\in\mathbb{R} and initial condition

\left\{\begin{aligned} &u_{1}(0,x)=-\cos(x_{1})\sin(x_{2}),\\ &u_{2}(0,x)=\sin(x_{1})\cos(x_{2}).\end{aligned}\right. (42)

Algorithm 1 is employed to estimate \mathbf{u}(T,x)=(u_{1},u_{2})^{T} with T=0.1, f=0, N=5, \Delta t=\frac{T}{N}=0.02 and \alpha_{1}=\alpha_{2}=0.1. The numerical results of the errors for \mathbf{u} and \nabla p with different viscosities \nu are shown in Table 2. The relative L^{2} errors and the training losses at different training steps are shown in Figure 4. It is observed that these values decrease as the parameter \nu decreases; similar phenomena occur in the later experiments. Our method is not sensitive to the parameter \nu. The training time is 500s for each case, which is an acceptable cost.

ν\nu 1E-1 1E-2 1E-3 1E-4
e1L||e_{1}||_{L^{\infty}} 1.60E-2 8.58E-3 7.41E-3 6.92E-3
e2L||e_{2}||_{L^{\infty}} 1.66E-2 5.72E-3 6.30E-3 6.45E-3
e1L2||e_{1}||_{L^{2}} 7.23E-3 2.00E-3 9.02E-4 8.52E-4
e2L2||e_{2}||_{L^{2}} 8.61E-3 1.39E-3 8.59E-4 8.36E-4
epL2||e_{\nabla p}||_{L^{2}} 1.12E-1 2.41E-2 9.45E-3 8.40E-3
time 500s
Table 2: Relative LL^{\infty} and L2L^{2} errors for (41) performed by the Algorithm 1 with ν=\nu= 1E-1, 1E-2, 1E-3 and 1E-4.
(a) Relative L2L^{2} error
(b) Training loss
Figure 4: Relative L2L^{2} errors of 𝐮\mathbf{u} and training losses of the Algorithm 1 for (41) with different ν\nu.

The 3D Arnold-Beltrami-Childress (ABC) flow is as follows

\left\{\begin{aligned} &u_{1}(t,x)=(A\sin(x_{3})+C\cos(x_{2}))e^{-\nu t},\\ &u_{2}(t,x)=(B\sin(x_{1})+A\cos(x_{3}))e^{-\nu t},\\ &u_{3}(t,x)=(C\sin(x_{2})+B\cos(x_{1}))e^{-\nu t},\\ &p(t,x)=-(BC\cos(x_{1})\sin(x_{2})+AB\sin(x_{1})\cos(x_{3})\\ &\quad\quad\quad\quad+AC\sin(x_{3})\cos(x_{2}))e^{-2\nu t}+c,\end{aligned}\right. (43)

for (t,x)\in(0,T]\times[0,2\pi]^{3} with parameters A,B,C\in\mathbb{R}, constant c\in\mathbb{R} and initial condition

\left\{\begin{aligned} &u_{1}(0,x)=(A\sin(x_{3})+C\cos(x_{2})),\\ &u_{2}(0,x)=(B\sin(x_{1})+A\cos(x_{3})),\\ &u_{3}(0,x)=(C\sin(x_{2})+B\cos(x_{1})).\end{aligned}\right. (44)

We estimate \mathbf{u}(T,x)=(u_{1},u_{2},u_{3})^{T} by applying Algorithm 1 with parameters A=B=C=0.5, T=0.1, f=0, N=5, \Delta t=\frac{T}{N}=0.02 and \alpha_{1}=\alpha_{2}=0.1. The numerical results of the errors for \mathbf{u} and \nabla p with different viscosities \nu are shown in Table 3. The relative L^{2} errors and the training losses at different training steps are shown in Figure 5. The training time is 700s for each case, which is not much longer than that of the 2D simulations.

ν\nu 1E-1 1E-2 1E-3 1E-4
e1L||e_{1}||_{L^{\infty}} 1.46E-2 8.77E-3 9.19E-3 9.16E-3
e2L||e_{2}||_{L^{\infty}} 1.41E-2 8.12E-3 9.34E-3 9.60E-3
e3L||e_{3}||_{L^{\infty}} 1.24E-2 9.32E-3 8.83E-3 8.81E-3
e1L2||e_{1}||_{L^{2}} 6.20E-3 2.01E-3 1.81E-3 1.83E-3
e2L2||e_{2}||_{L^{2}} 4.60E-3 2.36E-3 2.08E-3 2.05E-3
e3L2||e_{3}||_{L^{2}} 7.05E-3 3.12E-3 2.38E-3 2.26E-3
epL2||e_{\nabla p}||_{L^{2}} 1.70E-1 6.92E-2 5.72E-2 5.58E-2
time 700s
Table 3: Relative LL^{\infty} and L2L^{2} errors for (43) performed by the Algorithm 1 with ν=\nu=1E-1, 1E-2, 1E-3 and 1E-4.
(a) Relative L2L^{2} error
(b) Training loss
Figure 5: Relative L2L^{2} errors and training losses of the Algorithm 1 for (43) with different ν\nu.

Next, we consider the 2D Taylor-Green vortex flow (41)–(42) with the Dirichlet boundary condition (15) for (t,x)\in(0,T]\times(0,2\pi)^{2}. The other parameters remain the same as in the first example. We use the algorithm in Section 3.3; the numerical results of the errors for \mathbf{u} and \nabla p with different viscosities \nu are shown in Table 4, and Figure 6 depicts the training processes. We also consider the 2D Taylor-Green vortex flow (41)–(42) with the Neumann boundary condition (19) for (t,x)\in(0,T]\times(0,2\pi)^{2}. We use the algorithm in Section 3.4; the numerical results of the errors for \mathbf{u} and \nabla p with different viscosities \nu are shown in Table 5 and the training processes are shown in Figure 7.

ν\nu 1E-1 1E-2 1E-3 1E-4
e1L||e_{1}||_{L^{\infty}} 4.96E-3 8.19E-3 7.13E-3 6.99E-3
e2L||e_{2}||_{L^{\infty}} 5.44E-3 5.70E-3 6.09E-3 6.41E-3
e1L2||e_{1}||_{L^{2}} 1.89E-3 1.57E-3 8.90E-4 8.70E-4
e2L2||e_{2}||_{L^{2}} 1.92E-3 1.59E-3 8.62E-4 8.57E-4
epL2||e_{\nabla p}||_{L^{2}} 3.26E-2 2.20E-2 9.28E-3 8.83E-3
time 700s
Table 4: Relative LL^{\infty} and L2L^{2} errors for (41) with the Dirichlet boundary condition performed by the algorithm in Section 3.3 with ν=\nu= 1E-1, 1E-2, 1E-3 and 1E-4.
(a) Relative L2L^{2} error
(b) Training loss
Figure 6: Relative L2L^{2} errors and training losses of the algorithm in Section 3.3 for (41) with the Dirichlet boundary condition and different ν\nu.
ν\nu 1E-1 1E-2 1E-3 1E-4
e1L||e_{1}||_{L^{\infty}} 1.50E-2 9.45E-3 7.65E-3 7.35E-3
e2L||e_{2}||_{L^{\infty}} 1.55E-2 6.10E-3 6.50E-3 6.64E-3
e1L2||e_{1}||_{L^{2}} 8.05E-3 1.98E-3 9.27E-4 8.83E-4
e2L2||e_{2}||_{L^{2}} 7.59E-3 1.52E-3 8.60E-4 8.53E-4
epL2||e_{\nabla p}||_{L^{2}} 1.11E-1 2.43E-2 9.36E-3 9.00E-3
time 900s
Table 5: Relative LL^{\infty} and L2L^{2} errors for (41) with the Neumann boundary condition performed by the algorithm in Section 3.4 with ν=\nu= 1E-1, 1E-2, 1E-3 and 1E-4.
(a) Relative L2L^{2} error
(b) Training loss
Figure 7: Relative L2L^{2} errors and training losses of the algorithm in Section 3.4 for (41) with the Neumann boundary condition and different ν\nu.

Now, we consider the 2D lid driven cavity flow for (t,x)\in(0,T]\times(0,1)^{2} with the boundary and initial conditions

\left\{\begin{aligned} &\mathbf{u}(t,x)=(1,0),\quad x\in\partial\Omega_{u},\\ &\mathbf{u}(t,x)=(0,0),\quad x\in\partial\Omega\backslash\partial\Omega_{u},\\ &\mathbf{u}(0,x)=(0,0),\quad x\in\Omega,\end{aligned}\right. (45)

where \partial\Omega_{u} represents the upper boundary. We utilize the algorithm in Section 3.3 to simulate \mathbf{u}. We impose the boundary conditions u_{i}=0, i=1,2 on the network during training and let

\left\{\begin{aligned} &Y_{t_{n}}^{1}(x_{1},x_{2})=8x_{1}(x_{1}-1)x_{2}Y_{t_{n}}^{1}(x_{1},x_{2}),\\ &Y_{t_{n}}^{2}(x_{1},x_{2})=8x_{1}(x_{1}-1)x_{2}(x_{2}-1)Y_{t_{n}}^{2}(x_{1},x_{2}),\end{aligned}\right. (46)

where Y_{t_{n}}^{i}(x_{1},x_{2}) represents the estimate of u_{i}(t_{n},x_{1},x_{2}) output by the FNN. It is then easily verified that Y_{t_{n}} satisfies the boundary conditions u_{i}=0, i=1,2. For the condition u_{1}=1 on \partial\Omega_{u}, we add the following additional term to the loss function in Algorithm 1

\displaystyle\frac{\alpha_{3}}{K_{u}}\sum_{k=1}^{K_{u}}\left[\frac{1}{N+1}\sum_{n=0}^{N}|Y_{t_{n}}^{1}(X^{u}_{t_{n},k})+1|^{2}\right], (47)

where X^{u}_{t_{n},k} represents the k-th of the K_{u} points selected on \partial\Omega_{u} at time t_{n}. The parameters are chosen as T=0.5, f=0, \nu=0.1, N=25, K_{u}=100, \Delta t=0.02 and \alpha_{1}=\alpha_{2}=\alpha_{3}=0.01. To improve accuracy and save training time, we use the time adaptive approach II mentioned in [41]. At T=0.5, the stream function and the pressure p with \nu=0.1 are shown in Figure 8. These results are consistent with benchmark results.
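As an illustration of the hard-constraint ansatz (46) and the lid term (47), the following PyTorch sketch wraps a placeholder network output with the polynomial factors and evaluates the penalty on sampled lid points. The sign Y_{1}+1 used here is consistent with the time reversal (11), under which the lid condition u_{1}=1 corresponds to -1 for the transformed velocity; the network and sampling below are illustrative only.

```python
# Sketch of the hard-constraint ansatz (46) and the lid penalty (47).
import torch

net = torch.nn.Sequential(torch.nn.Linear(3, 30), torch.nn.Tanh(), torch.nn.Linear(30, 3))

def velocity(t, x):
    out = net(torch.cat([t, x], dim=1))                      # raw FNN output (Y1, Y2, P)
    x1, x2 = x[:, 0:1], x[:, 1:2]
    Y1 = 8 * x1 * (x1 - 1) * x2 * out[:, 0:1]                # vanishes on the three no-slip walls
    Y2 = 8 * x1 * (x1 - 1) * x2 * (x2 - 1) * out[:, 1:2]     # vanishes on the whole boundary
    return Y1, Y2, out[:, 2:3]

# lid penalty of (47): enforce Y1 + 1 = 0 on the upper boundary x2 = 1
t = torch.zeros(100, 1)
x_lid = torch.cat([torch.rand(100, 1), torch.ones(100, 1)], dim=1)
Y1, _, _ = velocity(t, x_lid)
lid_term = ((Y1 + 1) ** 2).mean()
```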

(a) stream function
(b) pressure pp
Figure 8: The stream function and pressure pp for 2D lid driven cavity flow at T=0.5T=0.5 with ν=0.1\nu=0.1.

Finally, we consider the flow past a circular obstacle for (t,x)\in(0,T]\times(-2,10)\times(-2,2). The obstacle is centered at (0,0) and has diameter D=1. The boundary and initial conditions are given by

\left\{\begin{aligned}&u_{2}(t,x)=0,\quad x\in\partial\Omega_{u}\cup\partial\Omega_{d},\\ &\mathbf{u}(t,x)=(0,0),\quad x\in\partial\Omega_{c},\\ &\mathbf{u}(t,x)=(u_{in},0),\quad x\in\partial\Omega_{l},\\ &p\mathbf{n}-\nu\nabla\mathbf{u}\cdot\mathbf{n}=0,\quad x\in\partial\Omega_{r},\\ &\mathbf{u}(0,x)=(u_{in},0),\quad x\in\Omega,\end{aligned}\right. (48)

where \partial\Omega_{u},\partial\Omega_{d},\partial\Omega_{l},\partial\Omega_{r},\partial\Omega_{c} denote the upper, lower, left and right boundaries and the surface of the obstacle, respectively, u_{in} is the inlet velocity, and \mathbf{n} is the unit outward normal vector on \partial\Omega. We utilize the algorithm in Section 3.3 to handle the Dirichlet boundary conditions on \partial\Omega_{u}\cup\partial\Omega_{d}\cup\partial\Omega_{l}\cup\partial\Omega_{c}, and the algorithm in Section 3.4 to handle the condition on \partial\Omega_{r}. We let

\left\{\begin{aligned}&Y_{t_{n}}^{1}(x_{1},x_{2})=\frac{8\hat{Y}_{t_{n}}^{1}(x_{1},x_{2})(x_{1}^{2}+x_{2}^{2}-0.25)}{(x_{1}^{2}+x_{2}^{2})},\\&Y_{t_{n}}^{2}(x_{1},x_{2})=\frac{8\hat{Y}_{t_{n}}^{2}(x_{1},x_{2})(x_{1}^{2}+x_{2}^{2}-0.25)(x_{1}+2)(x_{2}-2)(x_{2}+2)}{(x_{1}^{2}+x_{2}^{2})},\\ \end{aligned}\right. (49)

where \hat{Y}_{t_{n}}^{i}(x_{1},x_{2}) again denotes the raw estimate of u_{i}(t_{n},x_{1},x_{2}) output by the FNN. The factor (x_{1}^{2}+x_{2}^{2}-0.25) vanishes on the obstacle surface and (x_{1}+2)(x_{2}-2)(x_{2}+2) vanishes on the left, upper and lower boundaries, so all the homogeneous Dirichlet conditions in (48) are satisfied exactly. For the conditions u_{1}=u_{in} on \partial\Omega_{l} and p\mathbf{n}-\nu\nabla\mathbf{u}\cdot\mathbf{n}=0 on \partial\Omega_{r}, we add the following additional term to the loss function in Algorithm 1

\displaystyle\frac{\alpha_{3}}{N+1}\sum_{n=0}^{N}\left[\frac{1}{K_{l}}\sum_{k=1}^{K_{l}}|Y_{t_{n}}^{1}(X^{l}_{t_{n},k})+u_{in}|^{2}+\frac{1}{K_{r}}\sum_{k=1}^{K_{r}}|(P_{t_{n}}\mathbf{n}+\nu Z_{t_{n}}\cdot\mathbf{n})(X^{r}_{t_{n},k})|^{2}\right], (50)

where X^{l}_{t_{n},k} and X^{r}_{t_{n},k} denote the k-th of the K_{l} points sampled on \partial\Omega_{l} and the k-th of the K_{r} points sampled on \partial\Omega_{r} at time t_{n}, respectively. The parameters are chosen as T=1.0, f=0, \nu=0.025, N=50, K_{l}=K_{r}=100, \Delta t=0.02, \alpha_{1}=\alpha_{2}=\alpha_{3}=0.01 and u_{in}=3. As before, we use the time adaptive approach II of [41] to improve accuracy and save training time. The streamlines at T=1.0 with \nu=0.025 are shown in Figure 9 and are consistent with results obtained by traditional numerical methods.
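As with the cavity example, the boundary treatment can be sketched in a few lines. The following PyTorch-style fragment is illustrative only; it assumes the scheme provides callables net_Y, net_Z and net_P returning the velocity estimate, its spatial gradient (stored as a 2x2 matrix per sample) and the pressure at a space-time point, which is our own reading of the quantities Y, Z and P appearing in (50). It implements the output transform (49) and one time slice of the penalty (50).

    import torch

    def masked_velocity(net_Y, t, x):
        # Output transform (49): (x1^2 + x2^2 - 0.25) vanishes on the obstacle of radius 0.5,
        # and (x1 + 2)(x2 - 2)(x2 + 2) vanishes on the left, upper and lower boundaries.
        y_hat = net_Y(torch.cat([t, x], dim=-1))
        x1, x2 = x[:, 0:1], x[:, 1:2]
        r2 = x1 ** 2 + x2 ** 2
        y1 = 8.0 * y_hat[:, 0:1] * (r2 - 0.25) / r2
        y2 = 8.0 * y_hat[:, 1:2] * (r2 - 0.25) * (x1 + 2.0) * (x2 - 2.0) * (x2 + 2.0) / r2
        return torch.cat([y1, y2], dim=-1)

    def inlet_outflow_penalty(net_Y, net_Z, net_P, tn, K=100, u_in=3.0, nu=0.025, alpha3=0.01):
        # One time slice of the additional loss (50): an inlet penalty on x1 = -2 and an
        # outflow ("do nothing") penalty on x1 = 10, where the outward normal is n = (1, 0).
        t = torch.full((K, 1), float(tn))
        x_l = torch.stack([-2.0 * torch.ones(K), 4.0 * torch.rand(K) - 2.0], dim=1)
        y_l = masked_velocity(net_Y, t, x_l)
        loss_in = torch.mean((y_l[:, 0] + u_in) ** 2)              # written as in (50)
        x_r = torch.stack([10.0 * torch.ones(K), 4.0 * torch.rand(K) - 2.0], dim=1)
        p = net_P(torch.cat([t, x_r], dim=-1))                     # assumed shape (K, 1)
        z = net_Z(torch.cat([t, x_r], dim=-1)).reshape(K, 2, 2)    # assumed velocity gradient
        pn = torch.cat([p, torch.zeros_like(p)], dim=1)            # P n with n = (1, 0)
        res = pn + nu * z[:, :, 0]                                 # P n + nu Z . n
        loss_out = torch.mean(torch.sum(res ** 2, dim=1))
        return alpha3 * (loss_in + loss_out)

Both terms are summed over the time levels and added to the loss of Algorithm 1 together with the interior FBSDE loss at every training step.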

Figure 9: The streamlines for the flow past a circular obstacle at T=1.0 with \nu=0.025.

6.2 Cahn-Hilliard equation

In this section, we consider the Cahn-Hilliard equation (22) for (t,x)\in(0,T]\times[-1,1]^{d} with initial condition

\displaystyle\phi(0,x)=\cos\left(\frac{\pi}{\sqrt{d}}\sum_{i=1}^{d}x_{i}\right). (51)

The exact solution is given by

\displaystyle\phi(t,x)=e^{-t}\cos\left(\frac{\pi}{\sqrt{d}}\sum_{i=1}^{d}x_{i}\right). (52)

The parameters are taken as L_{d}= 5E-4, T=0.1, N=10, \delta=\Delta t=0.01 and \alpha_{1}=0.01. We estimate \phi using Algorithm 2 with different values of the parameter \gamma in different dimensions. The errors for \phi with different \gamma and S are recorded in Table 6, and the training processes in different dimensions are shown in Figure 10. For a fixed dimension, the relative L^{2} error and the training loss decrease as \gamma decreases. The method is not sensitive to the parameters \gamma and S, and its training time grows only linearly with the dimension d while the accuracy is maintained. In particular, the high-order spatial derivatives of the Cahn-Hilliard equation do not make the training difficult, even in high dimensions.

d \gamma S ||e||_{L^{\infty}} ||e||_{L^{2}} time
2 0.5 0.5 3.25E-2 5.02E-3 1200s
  0.1 0.1 2.56E-3 1.67E-3
  0.05 0.05 2.08E-3 1.13E-3
  0.01 0.01 1.99E-3 6.67E-4
50 0.5 0.5 1.08E-2 4.06E-3 3200s
  0.1 0.1 3.78E-3 1.65E-3
  0.05 0.05 3.48E-3 1.04E-3
  0.01 0.01 3.63E-3 6.49E-4
100 0.5 0.5 1.62E-2 4.25E-3 5200s
  0.1 0.1 6.46E-3 1.76E-3
  0.05 0.05 2.38E-3 1.15E-3
  0.01 0.01 4.06E-3 6.79E-4
Table 6: Relative L^{\infty} and L^{2} errors for (52) performed by Algorithm 2 with \gamma=0.5,\ 0.1,\ 0.05,\ 0.01 for dimensions d=2,\ 50,\ 100.
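For reference, the relative errors reported in Table 6 (and in the later tables) can be computed as in the following NumPy sketch. It is illustrative only: the normalization by the exact solution and the use of uniformly sampled test points are our assumptions, and phi_pred stands for the trained network's prediction.

    import numpy as np

    def exact_phi(t, x):
        # Exact solution (52); x has shape (K, d).
        d = x.shape[1]
        return np.exp(-t) * np.cos(np.pi / np.sqrt(d) * x.sum(axis=1))

    def relative_errors(phi_pred, t, x):
        # Relative L-infinity and L^2 errors over the test points x.
        phi_ex = exact_phi(t, x)
        err = phi_pred - phi_ex
        e_inf = np.max(np.abs(err)) / np.max(np.abs(phi_ex))
        e_l2 = np.linalg.norm(err) / np.linalg.norm(phi_ex)
        return e_inf, e_l2

    # usage: test points in [-1, 1]^100 at the final time T = 0.1
    x_test = np.random.uniform(-1.0, 1.0, size=(10000, 100))
    phi_pred = exact_phi(0.1, x_test)      # stand-in for the network prediction
    print(relative_errors(phi_pred, 0.1, x_test))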
Figure 10: Relative L^{2} errors and training losses of Algorithm 2 for (52) with different values of \gamma in dimensions d=2, 50 and 100. Panels: (a)-(c) relative L^{2} errors for d=2, 50, 100; (d)-(f) training losses for d=2, 50, 100.

We next consider the Cahn-Hilliard equation (22) with exact solution (52) defined in \Omega=\{x:|x|<1\} and subject to the mixed boundary conditions (33) on \partial\Omega=\{x:|x|=1\}, where h(t,x) and q(t,x) are determined by the exact solution. In this case, we choose \alpha_{1}=1. We utilize the algorithm in Section 4.3 to solve for \phi with different values of \gamma in different dimensions. The errors for \phi with different \gamma and S are recorded in Table 7 and the training processes are shown in Figure 11. The results show that our method also works for boundary value problems in high dimensions.

d \gamma S ||e||_{L^{\infty}} ||e||_{L^{2}} time
2 0.5 0.5 1.69E-2 6.12E-3 1800s
  0.1 0.1 4.52E-3 1.94E-3
  0.05 0.05 3.64E-3 1.69E-3
  0.01 0.01 3.72E-3 1.92E-3
50 0.5 0.5 2.40E-2 4.46E-3 3800s
  0.1 0.1 1.13E-2 2.01E-3
  0.05 0.05 5.77E-3 1.16E-3
  0.01 0.01 7.33E-3 8.55E-4
100 0.5 0.5 1.14E-2 2.58E-3 6000s
  0.1 0.1 9.52E-3 2.10E-3
  0.05 0.05 1.10E-2 1.44E-3
  0.01 0.01 4.98E-2 9.56E-4
Table 7: Relative L^{\infty} and L^{2} errors for (52) with the mixed boundary conditions performed by the algorithm in Section 4.3, with \gamma=0.5,\ 0.1,\ 0.05,\ 0.01 for dimensions d=2,\ 50,\ 100.
Figure 11: Relative L^{2} errors and training losses of the algorithm in Section 4.3 for (52) with different values of \gamma in dimensions d=2, 50 and 100. Panels: (a)-(c) relative L^{2} errors for d=2, 50, 100; (d)-(f) training losses for d=2, 50, 100.

Next, we consider the Cahn-Hilliard equation (22) with exact solution (52) defined in \Omega=(-\sqrt{2},\sqrt{2})^{2}, subject to the periodic boundary condition (35) with periods I_{i}=2\sqrt{2}, i=1,2, and the mixed boundary condition (33) on \partial\Omega, where h(t,x) and q(t,x) are determined by the exact solution. We choose J=1, and the other parameters remain the same as in the previous example. We utilize the algorithms in Sections 4.3 and 4.4 to solve for \phi with different values of \gamma. The errors for \phi are recorded in Table 8, and the corresponding training processes are shown in Figure 12.

\gamma S ||e||_{L^{\infty}} ||e||_{L^{2}} time
0.5 0.5 9.98E-3 5.48E-3 3400s
0.1 0.1 4.86E-3 2.42E-3
0.05 0.05 7.86E-3 2.84E-3
0.01 0.01 7.39E-3 2.88E-3
Table 8: Relative L^{\infty} and L^{2} errors for (52) with the mixed and periodic boundary conditions performed by the algorithms in Sections 4.3 and 4.4 with \gamma=0.5,\ 0.1,\ 0.05,\ 0.01.
Figure 12: Relative L^{2} errors and training losses of the algorithms in Sections 4.3 and 4.4 for (52) with different values of \gamma. Panels: (a) relative L^{2} errors; (b) training loss.

6.3 Cahn-Hilliard-Navier-Stokes equation

In this section, we consider the coupled CHNS system with the exact solution

\left\{\begin{aligned}&u_{1}(t,x)=-\cos(x_{1})\sin(x_{2})e^{-t},\\&u_{2}(t,x)=\sin(x_{1})\cos(x_{2})e^{-t},\\&p(t,x)=-\frac{1}{4}(\cos(2x_{1})+\cos(2x_{2}))e^{-2t}+c,\\&\phi(t,x)=\sin(x_{1})\sin(x_{2})e^{-t},\end{aligned}\right. (53)

for (t,x)\in(0,T]\times[0,2\pi]^{2}, where c is an arbitrary constant, with initial condition

\left\{\begin{aligned}&u_{1}(0,x)=-\cos(x_{1})\sin(x_{2}),\\&u_{2}(0,x)=\sin(x_{1})\cos(x_{2}),\\&\phi(0,x)=\sin(x_{1})\sin(x_{2}).\end{aligned}\right. (54)

The parameters are taken as T=0.1, N=5, \delta=\Delta t=0.02, \nu= 1E-3, C=1, L_{d}= 5E-4, \gamma=0.01, S=0.0032 and \alpha_{1}=\alpha_{2}=\alpha_{3}=0.01. The errors for \mathbf{u}(T,x)=(u_{1},u_{2})^{T}, \phi and \nabla p obtained with Algorithm 3 of Section 5 are recorded in Table 9, and Figure 13 shows the training process. These results show that our method also works for the coupled system.
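Since the pressure in (53) is determined only up to the additive constant c, the error is measured in \nabla p rather than in p itself; differentiating (53) gives

\nabla p(t,x)=\left(\tfrac{1}{2}\sin(2x_{1})e^{-2t},\ \tfrac{1}{2}\sin(2x_{2})e^{-2t}\right)^{T},

which does not depend on c.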

||e_{u_{1}}||_{L^{\infty}} ||e_{u_{2}}||_{L^{\infty}} ||e_{\phi}||_{L^{\infty}} ||e_{u_{1}}||_{L^{2}} ||e_{u_{2}}||_{L^{2}} ||e_{\phi}||_{L^{2}} ||e_{\nabla p}||_{L^{2}} time
2.41E-2 2.30E-2 1.25E-2 1.31E-2 1.16E-2 4.78E-3 2.07E-1 1800s
Table 9: Relative L^{\infty} and L^{2} errors for (53) performed by Algorithm 3.
Figure 13: Relative L^{2} errors and training losses of Algorithm 3 for (53). Panels: (a) relative L^{2} errors; (b) training loss.

Finally, we study the interface problem modeled by the CHNS system. In this example, we choose the square domain \Omega=[-1,1]^{2} and the parameters L_{d}=1, \nu=1, C=10, T=3, N=300, \delta=\Delta t=0.01, \gamma=0.03, S=3.3 and \alpha_{1}=\alpha_{2}=\alpha_{3}=0.01. The initial conditions for \phi and \mathbf{u}=(u_{1},u_{2})^{T} are given by

\left\{\begin{aligned}&\phi(0,x_{1},x_{2})=\max\left(\tanh\frac{r-R_{1}}{2\gamma},\tanh\frac{r-R_{2}}{2\gamma}\right),\\&\mathbf{u}(0,x_{1},x_{2})=0,\end{aligned}\right. (55)

where r=0.4, R_{1}=\sqrt{(x_{1}-0.7r)^{2}+x_{2}^{2}} and R_{2}=\sqrt{(x_{1}+0.7r)^{2}+x_{2}^{2}}. The time adaptive approach II of [41] is employed to reduce the training time. To enforce the conservation of mass, we add the following loss term to the final loss function defined in Algorithm 3

\displaystyle\frac{1}{N+1}\sum_{n=0}^{N}\left|\int_{\Omega}Y_{t_{n}}^{\phi}(X)dX-\int_{\Omega}g(X)dX\right|^{2}. (56)

The cosine and tanh functions are chosen as the activation functions of the FNNs \mathcal{U}_{\theta_{1}} and \mathcal{U}_{\theta_{2}}, respectively. The evolution of the merging bubbles is shown in Figure 14 and coincides with results reported in the literature.
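A Monte Carlo version of the penalty (56) can be sketched as follows. The sketch is illustrative only: the uniform sampling of \Omega, the sample size and the callable Y_phi standing for the network's phase-field estimate are our assumptions; here g is the two-bubble profile (55) with r=0.4 and \gamma=0.03.

    import numpy as np

    def initial_phi(x, r=0.4, gamma=0.03):
        # Two-bubble initial condition (55) on Omega = [-1, 1]^2; x has shape (M, 2).
        R1 = np.sqrt((x[:, 0] - 0.7 * r) ** 2 + x[:, 1] ** 2)
        R2 = np.sqrt((x[:, 0] + 0.7 * r) ** 2 + x[:, 1] ** 2)
        return np.maximum(np.tanh((r - R1) / (2.0 * gamma)),
                          np.tanh((r - R2) / (2.0 * gamma)))

    def mass_penalty(Y_phi, t_grid, area=4.0, M=20000):
        # Monte Carlo approximation of (56): both integrals are estimated by
        # area * mean over M uniform samples, then averaged over the time levels.
        x = np.random.uniform(-1.0, 1.0, size=(M, 2))
        mass0 = area * np.mean(initial_phi(x))            # integral of g over Omega
        loss = 0.0
        for tn in t_grid:
            mass_n = area * np.mean(Y_phi(tn, x))         # integral of Y_phi at time t_n
            loss += (mass_n - mass0) ** 2
        return loss / len(t_grid)

    # usage with a stand-in that conserves mass exactly, so the penalty is zero
    print(mass_penalty(lambda t, x: initial_phi(x), t_grid=np.linspace(0.0, 3.0, 7)))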

Figure 14: Phase evolution at t = 0, 0.2, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0.

7 Conclusions and remarks

In this article, we have presented methods based on forward-backward stochastic neural networks for computing numerical solutions of the incompressible Navier-Stokes equation, the Cahn-Hilliard equation and the CHNS system with different boundary conditions. In particular, we utilize a modified Cahn-Hilliard equation derived from a widely used stabilized scheme for the original Cahn-Hilliard equation, which can be diagonalized into a parabolic system; the FBSNNs are then applied to this system and work well for high-dimensional problems. We demonstrate the performance of our algorithms on a variety of numerical experiments, in all of which our methods are shown to be both stable and accurate. In future work, we will study how to make the training more efficient and provide theoretical analysis of our methods under suitable assumptions.

Declarations

  • Ethical Approval: Not Applicable.

  • Availability of supporting data: The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

  • Competing interests: The authors have no relevant financial or non-financial interests to disclose.

  • Funding: This research is partially supported by the National Key R & D Program of China (No. 2022YFE03040002) and the National Natural Science Foundation of China (No. 12371434).

  • Authors’ contributions: All authors contributed to the study conception and design. Numerical simulations were performed by Deng Yangtao. All authors read and approved the final manuscript.

  • Acknowledgments: This research is partially supported by the National Key R & D Program of China (No. 2022YFE03040002) and the National Natural Science Foundation of China (No. 12371434).

References

  • [1] N. Kantas, A. Beskos, A. Jasra, Sequential Monte Carlo methods for high-dimensional inverse problems: A case study for the Navier–Stokes equations, SIAM/ASA Journal on Uncertainty Quantification 2 (1) (2014) 464–489.
  • [2] X. Warin, Nesting Monte Carlo for high-dimensional non-linear PDEs, Monte Carlo Methods and Applications 24 (4) (2018) 225–247.
  • [3] J. Dick, F. Y. Kuo, I. H. Sloan, High-dimensional integration: the quasi-Monte Carlo way, Acta Numerica 22 (2013) 133–288.
  • [4] J. Shen, H. Yu, Efficient spectral sparse grid methods and applications to high-dimensional elliptic problems, SIAM Journal on Scientific Computing 32 (6) (2010) 3228–3250.
  • [5] Z. Wang, Q. Tang, W. Guo, Y. Cheng, Sparse grid discontinuous Galerkin methods for high-dimensional elliptic equations, Journal of Computational Physics 314 (2016) 244–263.
  • [6] M. Raissi, Forward–backward stochastic neural networks: deep learning of high-dimensional partial differential equations, in: Peter Carr Gedenkschrift: Research Advances in Mathematical Finance, World Scientific, 2024, pp. 637–655.
  • [7] C. Beck, S. Becker, P. Cheridito, A. Jentzen, A. Neufeld, Deep splitting method for parabolic PDEs, SIAM Journal on Scientific Computing 43 (5) (2021) A3135–A3154.
  • [8] W. E, B. Yu, The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems, Communications in Mathematics and Statistics 6 (1) (2018) 1–12.
  • [9] J. Han, A. Jentzen, W. E, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences 115 (34) (2018) 8505–8510.
  • [10] J. Han, J. Lu, M. Zhou, Solving high-dimensional eigenvalue problems using deep neural networks: A diffusion Monte Carlo like approach, Journal of Computational Physics 423 (2020) 109792.
  • [11] Y. Khoo, J. Lu, L. Ying, Solving parametric PDE problems with artificial neural networks, European Journal of Applied Mathematics 32 (3) (2021) 421–435.
  • [12] Y. Khoo, L. Ying, SwitchNet: a neural network model for forward and inverse scattering problems, SIAM Journal on Scientific Computing 41 (5) (2019) A3182–A3201.
  • [13] L. Lu, X. Meng, Z. Mao, G. E. Karniadakis, DeepXDE: A deep learning library for solving differential equations, SIAM Review 63 (1) (2021) 208–228.
  • [14] K. O. Lye, S. Mishra, D. Ray, P. Chandrashekar, Iterative surrogate model optimization (ISMO): an active learning algorithm for PDE constrained optimization with deep neural networks, Computer Methods in Applied Mechanics and Engineering 374 (2021) 113575.
  • [15] L. Lyu, Z. Zhang, M. Chen, J. Chen, MIM: A deep mixed residual method for solving high-order partial differential equations, Journal of Computational Physics 452 (2022) 110930.
  • [16] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational physics 378 (2019) 686–707.
  • [17] E. Samaniego, C. Anitescu, S. Goswami, V. M. Nguyen-Thanh, H. Guo, K. Hamdia, X. Zhuang, T. Rabczuk, An energy approach to the solution of partial differential equations in computational mechanics via machine learning: Concepts, implementation and applications, Computer Methods in Applied Mechanics and Engineering 362 (2020) 112790.
  • [18] J. Sirignano, K. Spiliopoulos, DGM: A deep learning algorithm for solving partial differential equations, Journal of computational physics 375 (2018) 1339–1364.
  • [19] Z. Wang, Z. Zhang, A mesh-free method for interface problems using the deep learning approach, Journal of Computational Physics 400 (2020) 108963.
  • [20] Y. Zang, G. Bao, X. Ye, H. Zhou, Weak adversarial networks for high-dimensional partial differential equations, Journal of Computational Physics 411 (2020) 109409.
  • [21] S. Zeng, Z. Zhang, Q. Zou, Adaptive deep neural networks methods for high-dimensional partial differential equations, Journal of Computational Physics 463 (2022) 111232.
  • [22] Y. Zhu, N. Zabaras, P.-S. Koutsourelakis, P. Perdikaris, Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data, Journal of Computational Physics 394 (2019) 56–81.
  • [23] E. Pardoux, S. Peng, Backward stochastic differential equations and quasilinear parabolic partial differential equations, in: B. L. Rozovskii, R. B. Sowers (Eds.), Stochastic partial differential equations and their applications, Springer, Berlin, Heidelberg, 1992, pp. 200–217.
  • [24] C. Beck, S. Becker, P. Grohs, N. Jaafari, A. Jentzen, Solving the Kolmogorov pde by means of deep learning, Journal of Scientific Computing 88 (2021) 1–28.
  • [25] V. Boussange, S. Becker, A. Jentzen, B. Kuckuck, L. Pellissier, Deep learning approximations for non-local nonlinear PDEs with Neumann boundary conditions, Partial Differential Equations and Applications 4 (6) (2023) 51.
  • [26] Q. Chan-Wai-Nam, J. Mikael, X. Warin, Machine learning for semi linear PDEs, Journal of scientific computing 79 (3) (2019) 1667–1712.
  • [27] Q. Feng, M. Luo, Z. Zhang, Deep signature FBSDE algorithm, Numerical Algebra, Control and Optimization 13 (3&4) (2023) 500–522.
  • [28] M. Fujii, A. Takahashi, M. Takahashi, Asymptotic expansion as prior knowledge in deep learning method for high dimensional BSDEs, Asia-Pacific Financial Markets 26 (3) (2019) 391–408.
  • [29] J. Han, M. Nica, A. R. Stinchcombe, A derivative-free method for solving elliptic partial differential equations with deep neural networks, Journal of Computational Physics 419 (2020) 109672.
  • [30] J. Y. Nguwi, G. Penent, N. Privault, A deep branching solver for fully nonlinear partial differential equations, Journal of Computational Physics 499 (2024) 112712.
  • [31] N. Nüsken, L. Richter, Interpolating between BSDEs and PINNs: deep learning for elliptic and parabolic boundary value problems, Journal of Machine Learning 2 (1) (2023) 31–64.
  • [32] N. Nüsken, L. Richter, Solving high-dimensional Hamilton–Jacobi–Bellman PDEs using neural networks: perspectives from the theory of controlled diffusions and measures on path space, Partial Differential Equations and Applications 2 (4) (2021) 1–48.
  • [33] H. Pham, X. Warin, M. Germain, Neural networks-based backward scheme for fully nonlinear PDEs, SN Partial Differential Equations and Applications 2 (1) (2021) 1–24.
  • [34] A. Takahashi, Y. Tsuchida, T. Yamada, A new efficient approximation scheme for solving high-dimensional semilinear PDEs: control variate method for Deep BSDE solver, Journal of Computational Physics 454 (2022) 110956.
  • [35] M. Sabate Vidales, D. Šiška, L. Szpruch, Unbiased deep solvers for linear parametric PDEs, Applied Mathematical Finance 28 (4) (2021) 299–329.
  • [36] Y. Wang, Y.-H. Ni, Deep BSDE-ML learning and its application to model-free optimal control, arXiv preprint arXiv:2201.01318.
  • [37] R. Mattey, S. Ghosh, A novel sequential method to train physics informed neural networks for Allen Cahn and Cahn Hilliard equations, Computer Methods in Applied Mechanics and Engineering 390 (2022) 114474.
  • [38] T. P. Miyanawala, R. K. Jaiman, An efficient deep learning technique for the Navier-Stokes equations: Application to unsteady wake flow dynamics, arXiv preprint arXiv:1710.09099.
  • [39] A. T. Mohan, D. V. Gaitonde, A deep learning based approach to reduced order modeling for turbulent flow control using LSTM neural networks, arXiv preprint arXiv:1804.09269.
  • [40] M. Raissi, A. Yazdani, G. E. Karniadakis, Hidden fluid mechanics: A Navier-Stokes informed deep learning framework for assimilating flow visualization data, arXiv preprint arXiv:1808.04327.
  • [41] C. L. Zhao, Solving Allen-Cahn and Cahn-Hilliard Equations using the Adaptive Physics Informed Neural Networks, Communications in Computational Physics 29 (3).
  • [42] J. Yang, Q. Zhu, A Local Deep Learning Method for Solving High Order Partial Differential Equations., Numerical Mathematics: Theory, Methods & Applications 15 (1).
  • [43] É. Pardoux, Backward stochastic differential equations and viscosity solutions of systems of semilinear parabolic and elliptic PDEs of second order, in: L. Decreusefond, B. Øksendal, J. Gjerde, A. S. Üstünel (Eds.), Stochastic Analysis and Related Topics VI, Springer, Boston, MA, 1998, pp. 79–127.
  • [44] J. Shen, X. Yang, Numerical approximations of Allen-Cahn and Cahn-Hilliard equations, Discrete & Continuous Dynamical Systems 28 (4) (2010) 1669.