A deep learning method for multi-material diffusion problems based on physics-informed neural networks

Yanzhong Yao Jiawei Guo [email protected] Tongxiang Gu Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Beijing 100088, China Graduate School of China Academy of Engineering Physics, Beijing 100088, China

Abstract

Given the facts of the extensiveness of multi-material diffusion problems and the inability of the standard PINN(Physics-Informed Neural Networks) method for such problems, in this paper we present a novel PINN method that can accurately solve the multi-material diffusion equation. The new method applies continuity conditions at the material interface derived from the property of the diffusion equation, and combines the distinctive spatial separation strategy and the loss term normalization strategy to solve the problem that the residual points cannot be arranged at the material interface, the problem that it is difficult to express non-smooth functions with a single neural network, and the problem that the neural network is difficult to optimize the loss function with different magnitudes of loss terms, which finally provides the available prediction function for a class of multi-material diffusion problems. Numerical experiments verify the robustness and effectiveness of the new method.

keywords:

multi-material diffusion equation , deep learning method , physics-informed neural networks , flux continuity condition , domain separation strategy.

^†^†journal: Journal

1 Introduction

Diffusion equations are an important class of partial differential equations that need to be studied in many applications such as groundwater seepage [1], oil reservoir simulation [2], nuclear reactions [3], etc. They can formulate both the diffusion process of material concentration in space, and the energy transfer process in radiative heat conduction problems [4, 5]. Their numerical solution methods have been a hot research topic in the field of scientific and engineering computation.

In recent years, the technology of deep learning for solving partial differential equations has developed rapidly. Researchers have developed a new method for solving partial differential equations by introducing Physical Information into Neural Networks, which is known as the PINN method [6, 7, 8, 9]. The (continuous) PINN method described in Ref. [7] will be referred to as the standard PINN method in the remainder of this paper. The PINN method takes the definite solution condition as the supervised learning part, and combines the automatic differential technology to take the approximation degree of the governing equation as the residual part. Two parts together form the loss function as the training objective of the network prediction. This technique makes the network output obtained by the optimization algorithm not only satisfy the definite solution condition, but also satisfy the governing equation.

Compared with the traditional numerical methods, including the finite element method (FEM) and the finite volume method, the PINN method (FVM) has advantages in some aspects, such as unnecessary grid generation, adaptation to high-dimensional problems, and so on. However, the PINN method still faces many challenges in practical applications, one of which is how to efficiently solve heterogeneous diffusion equations on the computational domain involving multiple materials.

For the multi-material diffusion problem, there will be material interfaces, and because the materials on both sides of the interface have different physical properties, their diffusion coefficients or heat conduction coefficients will have large differences, and they will have jumps at the interface, and such jumps will cause the derivatives of the solution function to necessarily have jumps at the interface, i.e., the solution function is not continuously differentiable at the interface, and the second-order partial derivatives of the solution function will not exist at all. This problem poses two major difficulties for the standard PINN method: (1) the PINN method generally produces a smooth prediction function, so it is difficult for the standard PINN method to obtain a prediction function that is not continuously differentiable [10]; (2) since the solution function does not have second-order partial derivatives at the interface, the standard PINN method based on the automatic differentiation technique cannot incorporate the sampled points at the interface as residual points in the training of a single neural network [11]. This inevitably leads to an inaccurate solution near the interface. However, it is well known that the structure of the solution near the interface is generally very complex, and the computational accuracy for this position can seriously affect the overall computational accuracy of the computational model. Obtaining high accuracy numerical solutions near the interface is a very challenging but necessary task for any numerical method.

To solve the equation with a non-smooth solution at interfaces using the PINN method, the following two strategies are most commonly used. One is to accept the fact that the neural network will make an incorrect prediction near the interface. To get useful predictions in the part away from the interface, the points at the interface are not sampled, or their residuals are weighted so that the loss near the interface tends to 0. In Ref. [12], Xie et al. designed a weighted function to handle the possible jump of the diffusion coefficient across the material interface. The other idea is to use different neural networks for each subdomain, so that the outputs of the multiple networks can express the functions with non-smoothness at the interface. In Ref. [11], He et al. pointed out that it is inefficient to use only one neural network structure to capture the interface, and they proposed a piece-wise neural network structure and an adaptive resampling strategy to solve the elliptic interface problems. As a result, the PINN method in combination with domain decomposition techniques has received increasing attention. For solving nonlinear PDEs in complex geometry domains, extended PINNs (XPINNs) based on domain decomposition techniques have been proposed in Ref. [13]. In Ref. [14], a Distributor-PINN method (DPINN) was proposed, which decomposes the computational domain into several disjoint subdomains and trains several sub-networks simultaneously by minimizing the loss function, and each sub-network is used to approximate the solution on a subdomain. Deep DDM in [15] is another PINN method based on domain decomposition techniques to solve two-dimensional elliptic interface problems. In Ref.[16], the author showed that the treatments in the above domain decomposition-based PINN methods could also suffer from convergence issues or have the drawback of low accuracy by the experiments, and they developed a dynamic weighting strategy that adaptively assigns appropriate weights to different terms in the loss function based on the multi-gradient descent algorithm [17].

In view of the facts of the extensiveness of multi-material diffusion problems and the inability of the standard PINN for such problems, this paper attempts to propose effective strategies to overcome the above two difficulties, including that a single neural network is difficult to express the function with different derivatives on two sides of the interface, and the standard PINN method cannot arrange effective sampling points on the interface, so as to construct a novel PINN method named as DS-PINN, which can accurately solve the multi-material diffusion equation using a single neural network. In addition, we develop a normalization strategy for the loss terms and present the nDS-PINN method, which further improves the prediction accuracy.

The rest of this paper is organized as follows: in section 2, we do some preliminary work by giving the governing equation of the multi-material diffusion problem and its standard PINN form; in section 3, we discuss in detail how to improve the standard PINN to obtain an available prediction function for the multi-material diffusion problem; in section 4, we give several numerical examples to verify the effectiveness of our method; the conclusion about the new PINN method is drawn in the last section.

2 Preliminaries

In this section, we first describe a class of multi-material diffusion problems and then give the standard PINN method for solving them.

2.1 Physical model and governing equation

A class of linear diffusion problems on the domain containing different materials can be formulated as follows:

-\nabla\cdot\kappa(X)\nabla u=Q\left(X\right),X\in\Omega,

(2.1)

with the Dirichlet boundary condition

u(X)=g(X),X\in\partial\Omega,

(2.2)

where $u=u(X)$ is the function to be solved, $\Omega$ is an open domain in $\mathbb{R}^{d}$ with the boundary $\partial\Omega$ . For multi-material problems, $\Omega$ is composed of several subdomains containing different materials. The source term $Q(X)$ and the boundary condition $g(X)$ are bounded in their domains of definition. The boundary condition can also be of Robin or mixed type. Note that, the diffusion coefficient

\kappa(X)=\kappa_{i}(X),~{}\text{for}~{}X\in\Omega_{i},~{}i=1,2,\cdots,N,

(2.3)

where $\Omega_{i}$ denotes a subdomain containing a certain material, $\kappa_{i}(X)$ is the diffusion coefficient on that subdomain, and $N$ denotes the number of material type. $\kappa_{i}(X),i=1,\dots,N$ are smooth functions on their respective subdomains, but they may not be equal at the material interface.

To simplify the description, this paper mainly discusses two-dimensional problems, and the same idea can also be applied to three-dimensional problems.

Refer to caption — Figure 2.1: A 2D domain with two materials.

Figure 2.1 shows a 2D domain containing two materials, and $\Gamma$ is the material interface. If $\kappa_{1}(X)\neq\kappa_{2}(X),X\in\Gamma$ , then according to Eq. (2.1), one can get

\nabla u(X)|_{\Gamma^{-}}\neq\nabla u(X)|_{\Gamma^{+}}.

(2.4)

Thus, the solution $u$ is not continuously differentiable at the material interface $\Gamma$ , and its second-order partial derivatives are absent.

2.2 Standard PINN method for diffusion equations

The PINN method produces the prediction function $u_{\theta}$ as an approximation of the solution of Eqs. (2.1)-(2.3), where $\theta$ denotes the neural network parameters, including the weights and biases of the neural networks. Later in this paper, we refer to $u_{\theta}$ as the prediction.

For the standard PINN method, $\theta$ are obtained by optimising the loss function, and the loss function consists of two parts as follows:

\mathcal{L}(\theta;\Sigma)=w_{b}\mathcal{L}_{b}(\theta;\tau_{b})+w_{r}\mathcal{L}_{r}(\theta;\tau_{r}),

(2.5)

where $w_{b}$ and $w_{r}$ are the weights for the two parts of the loss function, respectively, and the supervised loss term and the residual loss term are

	$\displaystyle\mathcal{L}_{b}(\theta;\tau_{b})=\frac{1}{N_{b}}\sum_{i=1}^{N_{b}}\left\|u_{\theta}(X_{i})-g(X_{i})\right\|^{2},$		(2.6)
	$\displaystyle\mathcal{L}_{r}(\theta;\tau_{r})=\frac{1}{N_{r}}\sum_{i=1}^{N_{r}}\left\|-\nabla\cdot\kappa(X_{i})\nabla u(X_{i})-Q\left(X_{i}\right)\right\|^{2}.$		(2.7)

$\Sigma=\{\tau_{b},\tau_{r}\}$ denotes the training data set, where $\tau_{b}=\{\left(X_{i},g(X_{i})\right)|X_{i}\in\partial\Omega\}_{i=1}^{N_{b}}$ is the labeled data set, and $\tau_{r}=\{X_{i}\in\Omega\}_{i=1}^{N_{r}}$ is the residual data set. $N_{b}$ and $N_{r}$ denote the number of boundary sampling points on $\partial\Omega$ and the number of inner sampling points in $\Omega$ , respectively.

Suppose

\bar{\theta}=\arg\min_{\theta}\mathcal{L}(\theta;\Sigma),

(2.8)

which can be obtained by some optimization methods, and then $u_{\bar{\theta}}$ is the approximation of the unknown function $u$ .

The partial differential operators, such as $u_{x}$ and $u_{xx}$ , can be implemented using automatic differentiation (AD). This can be easily realized in the deep learning framework like the PyTorch [18] or Tensorflow [19].

Remark 2.1.

For unsteady diffusion problems

\begin{cases}u_{t}-\nabla\cdot\kappa(t,X)\nabla u=Q\left(t,X\right),t\in(0,T],&~{}X\in\Omega,\\ u(t,X)=g(t,X),t\in(0,T],&~{}X\in\partial\Omega,\\ u(0,X)=\phi(X),&~{}X\in\Omega,\end{cases}

(2.9)

the method to be discussed in this paper is also applicable. At this case, the loss function requires adding a loss term to reflect the degree of approximation of the initial condition.

3 An improved PINN method for solving multi-material diffusion problems

In this section, we investigate a deep learning method for solving the multi-material diffusion equations (2.1)-(2.3), and present an improved PINN using interface connection conditions and the domain separation strategy, which is called DS-PINN. At the end of this section, we further improve the performance of the DS-PINN method by introducing the normalization strategy, which is denoted as nD-PINN.

It is well known that the standard PINN method, under the assumption that the solution of the equation is sufficiently smooth, uses automatic differentiation techniques to compute the residual loss term. Since the diffusion coefficients of Eq. (2.1) are discontinuous at the material interface, $u$ is not continuously differentiable and its second-order derivatives do not exist at the interface, which means that one cannot sample residual points at the interface, and thus the equation information at the interface is lost. If this issue is not properly addressed, the prediction obtained from the standard PINN will have a large uncertainty at the material interface, resulting in an unreliable result. The following subsections provide several strategies for dealing with this problem.

3.1 Introducing material interface continuity conditions into the standard PINN

Since the second derivative of the function $u$ does not exist at the material interface, the residual error of the sampling point at the interface cannot be calculated according to Eq. (2.7). To compensate for this deficiency, we can add new loss terms to the loss function according to the properties that Eq. (2.1) satisfies at the interface, so that the prediction function of the neural network can reasonably reflect the behavior of the solution at the interface.

According to the property of the diffusion equation, the following two conditions should be satisfied at the interface $\Gamma$ :

	$\displaystyle\left.u(X)\right\|_{\Gamma^{-}}$	$\displaystyle=\left.u(X)\right\|_{\Gamma^{+}},$		(3.1)
	$\displaystyle\left.-\kappa_{1}(X)\nabla u(X)\right\|_{\Gamma^{-}}\cdot\bm{n}_{1}$	$\displaystyle=-\left(\left.-\kappa_{2}(X)\nabla u(X)\right\|_{\Gamma^{+}}\cdot\bm{n}_{2}\right),$		(3.2)

where $\left.\cdot\right|_{\Gamma^{-}}$ and $\left.\cdot\right|_{\Gamma^{+}}$ represent the corresponding function values of approaching any point $X$ on the interface $\Gamma$ from $\Omega_{1}$ and $\Omega_{2}$ , respectively. $\bm{n}_{1}$ and $\bm{n}_{2}$ are the outer normal directions of the corresponding subdomain, as shown in the Figure 2.1.

Eqs. (3.1)-(3.2) are called the continuity conditions, which include the solution continuity and the flux continuity.

Define $[\![\mathcal{F}(X)]\!]_{\Gamma}:=\mathcal{F}(X)|_{\Gamma^{+}}-\mathcal{F}(X)|_{\Gamma^{-}}$ to denote the jump of $\mathcal{F}(X)$ across the material interface. Then the continuity conditions (3.1)-(3.2) can be rewritten as follows:

	$\displaystyle[\![u(X)]\!]_{\Gamma}$	$\displaystyle=0,$		(3.3)
	$\displaystyle[\![-\kappa(X)\nabla u(X)\cdot\bm{n}]\!]_{\Gamma}$	$\displaystyle=0.$		(3.4)

In fact, the continuity conditions of the material interface are the hypothetical conditions for the derivation of Eq. (2.1). However, these two conditions can also be obtained from Eq. (2.1). The solution continuity condition (3.1) is the assumed condition, which is a necessary condition for the governing equation (2.1) to hold. Next, we give the derivation of the flux continuity condition (3.2).

Suppose that $V$ is an arbitrary control volume containing part of the interface $\Gamma$ which divides it into two parts $V_{1}$ and $V_{2}$ , as shown in Figure 3.1.

Integrating Eq. (2.1) over the volume $V_{1}$ , we obtain

\int_{V_{1}}\left(-\nabla\cdot\kappa_{1}(X)\nabla u\right)\mathrm{d}V=\int_{V_{1}}Q(X)\mathrm{d}V.

(3.5)

According to the divergence theorem, Eq. (3.5) can be rewritten as follows:

	$\displaystyle\int_{V_{1}}\left(-\nabla\cdot\kappa_{1}(X)\nabla u\right)\mathrm{d}V=\oint_{\partial V_{1}}$	$\displaystyle\left(-\kappa_{1}(X)\nabla u\right)\cdot\bm{n}_{1}\mathrm{d}S$
	$\displaystyle=\int_{\overrightarrow{ABC}+\overrightarrow{CA}}$	$\displaystyle\left(-\kappa_{1}(X)\nabla u\right)\cdot\bm{n}_{1}\mathrm{d}S=\int_{V_{1}}Q(X)\mathrm{d}V.$		(3.6)

Analogously, we obtain the similar result for $V_{2}$

\displaystyle\int_{\overrightarrow{CDA}+\overrightarrow{AC}}\left(-\kappa_{2}(X)\nabla u\right)\cdot\bm{n}_{2}\mathrm{d}S=\int_{V_{2}}Q(X)\mathrm{d}V.

(3.7)

Integrating Eq. (2.1) over the entire control volume $V$ and using the divergence theorem, we have

\displaystyle\int_{\overrightarrow{ABC}+\overrightarrow{CDA}}\left(-\kappa(X)\nabla u\right)\cdot\bm{n}\mathrm{d}S=\int_{V}Q(X)\mathrm{d}V.

(3.8)

By combining Eqs. (3.6), (3.7) and (3.8), we get the flux continuity formula

\int_{\overrightarrow{CA}}\left(-\kappa_{1}(X)\nabla u\right)\cdot\bm{n}_{1}\mathrm{d}S+\int_{\overrightarrow{AC}}\left(-\kappa_{2}(X)\nabla u\right)\cdot\bm{n}_{2}\mathrm{d}S=0.

(3.9)

Given the arbitrariness of $V$ , we can obtain the flux continuity condition (3.2).

Eq.3.9 is used in many papers constructing discrete schemes for heterogeneous diffusion equations, such as [20, 21, 22].

Remark 3.1.

The equations should satisfy the continuity condition at the interface, so we want to reflect this property of the predicted solution near the interface by adding new loss terms to the loss function. However, these two conditions cannot be directly applied to the training of the PINN. The reason is that the prediction function obtained from a single PINN training is continuously differentiable, i.e., there is only one unique derivative at each position, so the solution continuity condition is naturally satisfied, while the flow continuity condition cannot be achieved.

3.2 Applying domain separation strategy to compute derivatives on both sides of the material interface

To characterize the derivative discontinuity property at the interface, a natural idea is to train this model using two sets of neural networks linked by the interface connection conditions. However, this strategy faces some difficulties: (1) it requires multiple sets of networks for the multi-material model; (2) it is more difficult to design and implement optimization algorithms; and (3) it generally requires iteration between neural networks and the convergence is difficult to guarantee.

Under the premise of using only one neural network to obtain different derivative values on both sides of the interface, an intuitive idea is to separate the two domains divided by the interface by a certain distance, and the material interface becomes the boundaries of two sub-domains with a certain interval, so that each point on the interface becomes two points belonging to different locations in space, so it is logical that they can have different derivative values.

Figure 3.2 shows the respective domain separation strategies for the two types of material interfaces. It is easy to generalize this strategy to the case of multiple materials.

With regard to the domain separation strategy, the following points should be noted:

•

The subdomains cannot overlap because only one neural network is used.
•

There are no strict limitations on the distance and direction of the separation. If the difference of diffusion coefficients between the two sides of the interface is large, then we should choose a larger distance. In Sect. 4, the performance is tested with different separation distances ${d}$ .
•

After implementing the separation strategy, the material interface changes from $\Gamma$ to two spatially separated boundaries $\Gamma_{1}$ and $\Gamma_{2}$ , on which the interface continuity conditions (3.1),(3.2) are imposed.
•

The training points are sampled based on all subdomains after separation, and the sampling points on the material interface should be consistent between the matching subdomains to impose the interface continuity condition.

Next, we give a brief analysis of the effectiveness of the separation strategy.

Taking a 2D model as an example, assume that $u_{1}$ and $u_{2}$ are functions on the subdomains $\Omega_{1}$ and $\Omega_{2}$ , respectively, satisfying Eqs. (2.1) and (2.2), and that the diffusion coefficient $\kappa(x,y)$ , the source term $Q(x,y)$ and the boundary condition $g(x,y)$ are equal to the values of the corresponding position after moving the computational domain from $\Omega_{1}$ to $\Omega_{2}$ , as shown in Figure 3.3, i.e.,

$\displaystyle\kappa(x,y)$	$\displaystyle=\kappa(x+\Delta x,y+\Delta y),\quad(x,y)\in\Omega_{1},(x+\Delta x,y+\Delta y)\in\Omega_{2},$
$\displaystyle Q(x,y)$	$\displaystyle=Q(x+\Delta x,y+\Delta y),\quad(x,y)\in\Omega_{1},(x+\Delta x,y+\Delta y)\in\Omega_{2},$	(3.10)
$\displaystyle g(x,y)$	$\displaystyle=g(x+\Delta x,y+\Delta y),\quad(x,y)\in\partial\Omega_{1},(x+\Delta x,y+\Delta y)\in\partial\Omega_{2}.$

Then we have

u_{1}(x,y)=u_{2}(x+\Delta x,y+\Delta y),\quad(x,y)\in\Omega_{1}.

(3.11)

This is an obvious result, and we can also give a simple proof.
Let

w(x,y)=u_{1}(x,y)-u_{2}(x+\Delta x,y+\Delta y),

(3.12)

and then $w(x,y)$ satisfies

\begin{split}\begin{cases}-\nabla\cdot\kappa(x,y)\nabla w(x,y)=0,\quad(x,y)\in\Omega_{1},\\ \left.w(x,y)\right|_{(x,y)\in\partial\Omega_{1}}=0.\end{cases}\end{split}

(3.13)

According the extremum principle, we get $w(x,y)\equiv 0$ .

The above analysis shows that after translating the computational domain $\Omega_{1}$ to a new location $\Omega_{2}$ , if the condition (3.2) is satisfied, the solutions of both domains have the same structure for Eqs. (2.1)- (2.2).

Remark 3.2.

Domain separation is a constructive strategy that makes it possible to express a class of non-smooth functions with a single neural network, fully exploiting the mesh-free advantage of the PINN method. Compared to the conventional method of using multiple neural networks to solve multi-material diffusion problems, the DS-PINN method using this strategy is not only easy to implement, but also does not require iteration between networks, resulting in relatively high computational efficiency.

3.3 Adding the special term representing the interface connection condition to the loss function.

For multi-material diffusion problems, the solution at the interface is critical, and obtaining a highly accurate numerical solution near the interface is a very challenging task for any kind of numerical method. Since $u$ has no second-order partial derivatives at the interface, the standard PINN method based on the automatic differentiation technique cannot incorporate the sampled points at the interface as residuals in the training of the neural network, which inevitably leads to inaccurate solutions near the interface. Sect. 3.1 gives continuity conditions that the solution at the interface should satisfy. Introducing this connection condition into PINN can fill the gap of missing information at the interface. Furthermore, by introducing a domain separation strategy, Sect. 3.2 overcomes the problem of two derivative values at one location caused by $u$ not being continuously differentiable at the interface, which is another hurdle for the standard PINN method to solve the heterogeneous diffusion problem.

Based on the work in the previous two sections, we can easily improve the standard PINN by adding the interface continuity condition to the loss function, so that the interface information is introduced into the neural network. In return, the prediction function we obtain can give very accurate predictions near the interface. For simplicity, we will only discuss the case of a single interface for two materials. However, the case of multiple interfaces for multiple materials can be treated in a similar manner.

By adding the interface loss term in the loss function, the loss function of the DS-PINN is given by

\mathcal{L}(\theta;\Sigma)={w_{b}}\mathcal{L}_{b}(\theta;\tau_{b})+{w_{r}}\mathcal{L}_{r}(\theta;\tau_{r})+{w_{\Gamma}}\mathcal{L}_{\Gamma}(\theta;\tau_{\Gamma}),

(3.14)

where

\displaystyle\mathcal{L}_{b}(\theta;\tau_{b})=

\displaystyle\frac{1}{N_{b}}\sum_{i=1}^{N_{b}}\left|u_{\theta}(X^{\prime}_{i})-g(X_{i})\right|^{2},

(3.15)

\displaystyle\mathcal{L}_{r}(\theta;\tau_{r})=

\displaystyle\frac{1}{N_{r}}\sum_{i=1}^{N_{r}}\left|-\nabla\cdot\kappa(X_{i})\nabla u(X^{\prime}_{i})-Q\left(X_{i}\right)\right|^{2},

(3.16)

	$\displaystyle\mathcal{L}_{\Gamma}(\theta;\tau_{\Gamma})=$	$\displaystyle\frac{1}{N_{\Gamma}}\sum_{i=1}^{N_{\Gamma}}\left(\left\|u_{\theta}(X_{i}^{(1)})-u_{\theta}(X_{i}^{(2)})\right\|^{2}+\right.$
		$\displaystyle~{}\left.\left\|-\kappa_{1}(X_{i}^{(1)})\nabla u(X_{i}^{(1)})\cdot\bm{n}_{1}-\kappa_{2}(X_{i}^{(2)})\nabla u(X_{i}^{(2)})\cdot\bm{n}_{2}\right\|^{2}\right).$		(3.17)

The training data set $\Sigma\left(\tau_{b},\tau_{r},\tau_{\Gamma}\right)$ is as follows:

	$\displaystyle\tau_{b}=\{\left(X_{i},g(X_{i})\right)\|X_{i}\in\partial\Omega\}_{i=1}^{N_{b}},$		(3.18)
	$\displaystyle\tau_{r}=\{X_{i}\in\Omega\}_{i=1}^{N_{r}},$		(3.19)
	$\displaystyle\tau_{\Gamma}=\{X_{i}\|X_{i}\in\Gamma\}_{i=1}^{N_{\Gamma}}.$		(3.20)

In Eqs. (3.15) and (3.16), $X_{i}$ represents a sampling point on the boundary of the original domain $\partial\Omega$ or in the original domain $\Omega$ , and $X^{\prime}_{i}$ is the matching point of $X_{i}$ . If the subdomain containing $X_{i}$ has not moved, then $X^{\prime}_{i}=X_{i}$ ; if the subdomain containing $X_{i}$ has moved by a distance of $\Delta L$ , then $X^{\prime}_{i}=X_{i}+\Delta L$ .

In Eqs. (3.17) and (3.20), $X_{i}$ represents a sampling point on the material interface $\Gamma$ in the original domain $\Omega$ , and $\Gamma$ are referred to as $\Gamma_{1}$ and $\Gamma_{2}$ in two neighboring subdomains $\Omega_{1}$ and $\Omega_{2}$ , respectively. $X^{(1)}_{i}$ and $X^{(2)}_{i}$ are two matching points of $X_{i}$ belonging to $\Gamma_{1}$ and $\Gamma_{2}$ . If the subdomain containing $X^{(k)}_{i}$ has not moved, then $X^{(k)}_{i}=X_{i}$ , k=1 or 2; if the subdomain containing $X^{(k)}_{i}$ has moved by a distance of $\Delta L$ , then $X^{(k)}_{i}=X_{i}+\Delta L$ , k=1 or 2.

Note that, according to Eq. (3.2), we use $g(X_{i})$ , $\kappa(X_{i})$ and $Q(X_{i})$ to replace with $g(X^{\prime}_{i})$ , $\kappa(X^{\prime}_{i})$ and $Q(X^{\prime}_{i})$ in Eqs. (3.15) and (3.16). In addition, if higher accuracy is needed at the material interface, one can increase $\omega_{\Gamma}$ .

So far, we have obtained the loss function for the multi-material diffusion equation, and by optimizing this loss function, we can get the prediction function. The last task is to map the prediction function of the subdomain shifted by a certain distance back to its original position using the following formula

u_{\theta}(X)=u_{\theta}(X^{\prime}),\quad X\in\bar{\Omega}.

(3.21)

Similarly, $X^{\prime}$ is the matching point of $X$ . Depending on whether the subdomain containing $X$ is moved, $X^{\prime}=X$ or $X^{\prime}=X+\Delta L$ .

3.4 Normalizing loss terms to improve the training performance

The loss function of the standard PINN method contains two main terms, the supervised loss term and the residual loss term, whose variables usually have different physical meanings and are of different magnitudes, so that combining them for optimization will generally result in the numerically smaller term not being reasonably optimized, leading to the final prediction deviating from the reference solution. The question of how to balance the different terms in the loss function plays a key role in the PINN method, and some researchers have made important progress [11, 12, 16, 23].

It should be noted that normalizing each loss term in the loss function according to the characteristics of the equation not only facilitates the implementation of the optimization algorithm, but also helps to balance the importance of each loss term, eliminates poor training results due to different orders of magnitude, and improves the computational accuracy of the prediction function. Based on the governing equation (2.1) and its boundary condition (2.2), a strategy for normalizing the supervised term (3.15) and the residual term (3.16) is given below.

Considering that $g(X)$ and $Q(X)$ may reflect the magnitude of $u$ and $u_{xx}+u_{yy}$ , respectively, let the normalization factors $\zeta_{b}$ and $\zeta_{r}$ are as follows

	$\displaystyle\zeta_{b}$	$\displaystyle=\frac{1}{N_{b}}\sum_{i=1}^{N_{b}}\left\|g(X_{i})\right\|^{2},$		(3.22)
	$\displaystyle\zeta_{r}$	$\displaystyle=\frac{1}{N_{r}}\sum_{i=1}^{N_{r}}\left\|Q\left(X_{i}\right)\right\|^{2}.$		(3.23)

Next, we define the new loss terms

\mathcal{\tilde{L}}_{b}(\theta;\tau_{b})=\begin{cases}\frac{1}{\zeta_{b}}\mathcal{L}_{b}(\theta;\tau_{b}),&~{}\text{if}~{}\zeta_{b}\neq 0,\\ \mathcal{L}_{b}(\theta;\tau_{b}),&~{}\text{if}~{}\zeta_{b}=0.\end{cases}

(3.24)

\mathcal{\tilde{L}}_{r}(\theta;\tau_{r})=\begin{cases}\frac{1}{\zeta_{r}}\mathcal{L}_{r}(\theta;\tau_{r}),&~{}\text{if}~{}\zeta_{r}\neq 0,\\ \mathcal{L}_{r}(\theta;\tau_{r}),&~{}\text{if}~{}\zeta_{r}=0.\end{cases}

(3.25)

Note that, although the normalization factors cannot normalize the value of each loss term to $[0,1]$ , it is able to largely eliminate the effect of the magnitude. Now, the loss function can be rewritten as follow

\mathcal{L}(\theta;\Sigma)={w_{b}}\mathcal{\tilde{L}}_{b}(\theta;\tau_{b})+{w_{r}}\mathcal{\tilde{L}}_{r}(\theta;\tau_{r})+{w_{\Gamma}}\mathcal{L}_{\Gamma}(\theta;\tau_{\Gamma}),

(3.26)

and we call the PINN method using this loss function the normalized DS-PINN, denoted by nDS-PINN.

Remark 3.3.

The interface loss term $\mathcal{L}_{\Gamma}(\theta;\tau_{\Gamma})$ isn’t normalized because it reflects the continuity conditions (3.3)-(3.4). This is similar to the case where $\zeta_{b}=0$ or $\zeta_{r}=0$ in Eq. (3.24) or Eq. (3.25). The interface connection condition can be generalized to the general form

	$\displaystyle[\![u(X)]\!]_{\Gamma}$	$\displaystyle=\Phi(X),$		(3.27)
	$\displaystyle[\![-\kappa(X)\nabla u(X)\cdot\bm{n}]\!]_{\Gamma}$	$\displaystyle=\Psi(X).$		(3.28)

The continuity conditions (3.3) and (3.4) are considered as a special case where $\Phi(X)=\Psi(X)=0$ . For a class of interface problems with jump conditions, i.e., for the case where $\Phi(X)\neq 0$ and $\Psi(X)\neq 0$ , we can use $\Phi(X)$ and $\Psi(X)$ to normalize $\mathcal{L}_{\Gamma}(\theta;\tau_{\Gamma})$ in the same way as Eqs. (3.22)-(3.25).

4 Numerical experiments

In this section, we give several numerical experiments to demonstrate the performance of the proposed method in this work. In Sect. 4.1, we test the performance of the new methods by solving a typical two-material diffusion equation, and we show the results based on different separation distances. In Sect. 4.2, we solve a multi-material diffusion equation with 4 subdomains. In Sect. 4.3, we present a diffusion model with a special computational domain. The computational domain contains different materials within a circular subdomain at its center. In Sect. 4.4, we test the ability of the new method for diffusion problems with jump conditions at the interface.

We use the deep learning framework TensorFlow(version 1.5) to develop the code. The datatype of all variables is float32. For all the numerical experiments, we use the Adam optimizer to run 2000 iterations and then switch to the L-BFGS optimizer until convergence. All parameters and termination criteria of the L-BFGS optimizer are considered as proposed in Ref. [24]. Before training, the parameters of the neural network are randomly initialized by using the Xavier scheme [25]. All numerical experiments use the deep neural network with 5 hidden layers of 50 neurons each.

The accuracy of the trained model is evaluated using the relative $\mathbb{L}_{2}$ error, which is defined as follows:

\left\|e\right\|_{\mathbb{L}_{2}}=\frac{\sqrt{\sum_{i=1}^{N}\left|u_{\theta}(X_{i})-u^{*}(X_{i})\right|^{2}}}{\sqrt{\sum_{i=1}^{N}\left|u^{*}(X_{i})\right|^{2}}},

(4.1)

where we suppose $\sum_{i=1}^{N}\left|u^{*}(X_{i})\right|^{2}\neq 0$ , $u^{*}(X_{i})$ is the exact solution or the reference solution, and $u_{\theta}(X_{i})$ is the neural network prediction for $N=10000$ test points which are uniformly distributed over the computational domain. For the number of training points in this section, we set $N_{f}=5000$ for each subdomain, $N_{b}=500$ for each boundary, and $N_{\Gamma}=2000$ for each interface.

Remark 4.1.

Different hyperparameters, such as the model architecture, the size of the training data, the weights, the optimizer, etc., can cause the PINN method to produce different computational results. To test the robustness and effectiveness of our method, we use the same set of parameters for all numerical examples in this paper. We believe that finer tuning of the parameters could yield even better results.

4.1 Typical two-material diffusion problems

Consider the following diffusion problem with two materials in the computational domain:

\begin{cases}-\nabla\cdot\left(\kappa(x,y)\nabla u\right)=Q(x,y),&~{}(x,y)\in\Omega=(0,1)\times(0,1),\\ u(x,y)=0,&~{}(x,y)\in\partial\Omega,\end{cases}

(4.2)

where

\kappa(x,y)=\begin{cases}4,~{}(x,y)\in\left(0,\frac{2}{3}\right]\times(0,1),\\ 1,~{}(x,y)\in\left(\frac{2}{3},1\right)\times(0,1),\end{cases}

(4.3)

and

Q(x,y)=\begin{cases}20\pi^{2}\sin\pi x\sin 2\pi y,~{}(x,y)\in\left(0,\frac{2}{3}\right]\times(0,1),\\ 20\pi^{2}\sin 4\pi x\sin 2\pi y,~{}(x,y)\in\left(\frac{2}{3},1\right)\times(0,1).\end{cases}

(4.4)

The exact solution of this equation is

u(x,y)=\begin{cases}\sin\pi x\sin 2\pi y,~{}(x,y)\in\left(0,\frac{2}{3}\right]\times(0,1),\\ \sin 4\pi x\sin 2\pi y,~{}(x,y)\in\left(\frac{2}{3},1\right]\times(0,1).\end{cases}

(4.5)

Figure 4.1 shows the exact solution of this example, expressed in z-coordinate value and color, respectively. The solution $u(x,t)$ is continuous at the interface $x=\frac{2}{3}$ , but its partial derivative $u_{x}$ is discontinuous at the interface.

For this model, we set the domain separation distance ${d}=0.1$ . Table 4.1 shows the relative $\mathbb{L}_{2}$ error of different methods. Figure 4.2 shows the the prediction and the point-wise error of different PINN methods. It can be seen that the standard PINN method gives a smooth prediction, and due to the lack of interface information, its prediction is of low accuracy. However, the DS-PINN method provides a highly accurate prediction, and by normalizing the residual terms, the nDS-PINN method further improves the prediction accuracy by more than an order of magnitude.

Table 4.1:

\left\|e\right\|_{\mathbb{L}_{2}}

errors of three PINN methods for Sect. 4.1.

Method	$\left\\|e\right\\|_{\mathbb{L}_{2}}$
Standard PINN	$6.0\pm 0.4\times 10^{-1}$
DS-PINN (this work)	$4.1\pm 1.3\times 10^{-2}$
nDS-PINN (this work)	$7.3\pm 1.6\times 10^{-4}$

How does the separation distance between subdomains affect the solution accuracy? To investigate this question, we use different separation distances to test this computational model, and the figure 4.3 shows the effect of different ${d}$ on the prediction accuracy.

As can be seen from the Figure 4.3, if ${d}$ is too small ( ${d}<10^{-2}$ ), the error is very large. The reason is that, for a small ${d}$ , when the diffusion coefficients on both sides of the material interface differ greatly, it is equivalent to solving a problem with a large variation of the derivative over a narrow interval, which is computationally very difficult and therefore yields inaccurate results without special tricks. Conversely, if ${d}$ is too large ( ${d}>10$ ), then the range expressed by the neural network increases significantly, and moreover, the gaps between the subdomains do not assign the sampling points and do not participate in the training of the neural network, and so they belong to the invalid computational domain. Obviously, if the proportion of invalid regions is too large, the accuracy of the prediction function will inevitably be low.

Remark 4.2.

Regarding the standard PINN method, although one can get a higher accuracy by changing some hyper-parameters, its prediction accuracy near the material interface does not change significantly. This remark is also adapted to the discussion of the standard PINN method in the later examples.

Remark 4.3.

Choosing an appropriate separation distance is beneficial for improving the computational accuracy of the model. We believe that ${d}$ should be chosen based on the fact that it should be proportional to $\left|\kappa_{1}-\kappa_{2}\right|$ , while keeping the percentage of added invalid computational area as small as possible. Here $\kappa_{1}$ and $\kappa_{2}$ represent the diffusion coefficients on both sides of the interface.

4.2 Multi-material diffusion problems

In this subsection, we examine a multi-material diffusion example and the separation distance $d=0.1$ . The governing equation is the same as Eq. (4.2). Suppose that the computational domain $\Omega$ consists of 4 subdomains with different diffusion coefficients,

\kappa(x,y)=\begin{cases}4,&~{}(x,y)\in\left(-1,0\right]\times(-1,0],\\ 1,&~{}(x,y)\in\left(0,1\right)\times(-1,0],\\ 2,&~{}(x,y)\in\left(0,1\right)\times(0,1),\\ 1,&~{}(x,y)\in\left(-1,0\right]\times(0,1).\end{cases}

(4.6)

We also assume that this problem has the exact solution as follows:

u(x,y)=\begin{cases}\sin\pi x\sin\pi y,&~{}(x,y)\in\left[-1,0\right]\times[-1,0],\\ 4\sin\pi x\sin\pi y,&~{}(x,y)\in\left(0,1\right]\times[-1,0],\\ 2\sin\pi x\sin\pi y,&~{}(x,y)\in\left(0,1\right]\times(0,1],\\ 4\sin\pi x\sin\pi y,&~{}(x,y)\in\left[-1,0\right]\times(0,1].\end{cases}

(4.7)

To test the performance of the new methods, we solve this model using DS-PINN and nDS-PINN. The source term $Q(x,y)$ and the boundary conditions are derived from the exact solution (4.7).

Figure 4.4 shows a schematic diagram of the sampling of training points in different subdomains, and the training points consist of three types, including residual points located inside the subdomains, supervised points located at the boundaries, and interface points located at the material interfaces.

This example is somewhat complicated. If multiple neural networks were conventionally used to handle the material interface, this problem would require four neural networks, which would not only be difficult to implement, but also computationally inefficient. With our methods, this problem can be easily solved with only a single neural network.

Table 4.2:

\left\|e\right\|_{\mathbb{L}_{2}}

errors of three PINN methods for Sect. 4.2.

Method	$\left\\|e\right\\|_{\mathbb{L}_{2}}$
Standard PINN	$8.96\pm 0.03\times 10^{-1}$
DS-PINN (this work)	$1.5\pm 0.2\times 10^{-3}$
nDS-PINN (this work)	$5.2\pm 1.6\times 10^{-4}$

Table 4.2 shows the relative $\mathbb{L}_{2}$ error of different methods, and Figure 4.5 shows the the prediction and the point-wise error of different PINN methods. Similar to the results of the previous example, the standard PINN gives a poor prediction for this model, while the DS-PINN gives a satisfactory prediction and the nDS-PINN gives an accurate prediction.

4.3 The diffusion problem with the heterogeneous material located inside the computational domain

In practical applications such as heat transfer and oil reservoir simulation, it is common for a material to be completely enveloped by another material. The purpose of this section is to test the ability of our method to handle this case.

Consider the problem as follows:

\begin{cases}-\nabla\cdot\left(\kappa(x,y)\nabla u\right)=Q(x,y),&~{}(x,y)\in\Omega=(-2,2)\times(-2,2),\\ u(x,\pm 2)=\sin(\frac{\pi}{4}(x^{2}+3)),&~{}x\in[-2,2],\\ u(\pm 2,y)=\sin(\frac{\pi}{4}(y^{2}+3)),&~{}y\in[-2,2],\end{cases}

(4.8)

where

\kappa(x,y)=\begin{cases}1,~{}&(x,y)\in\Omega_{1}=\{(x,y)|x^{2}+y^{2}<1\},\\ 4,~{}&(x,y)\in\Omega\backslash\Omega_{1},\end{cases}

and the source term

Q(x,y)=\begin{cases}-4\pi\cos\left(\pi(x^{2}+y^{2}-1)\right)+4\pi^{2}(x^{2}+y^{2})\sin\left(\pi(x^{2}+y^{2}-1)\right),\\ \hfill(x,y)\in\Omega_{1},\\ -4\pi\cos\left(\frac{\pi}{4}(x^{2}+y^{2}-1)\right)+\pi^{2}(x^{2}+y^{2})\sin\left(\frac{\pi}{4}(x^{2}+y^{2}-1)\right),\\ \hfill(x,y)\in\Omega\backslash\Omega_{1}.\end{cases}

The exact solution of this problem is

u(x,y)=\begin{cases}\sin(\pi(x^{2}+y^{2}-1)),~{}&(x,y)\in\Omega_{1},\\ \sin(\frac{\pi}{4}(x^{2}+y^{2}-1)),~{}&(x,y)\in\Omega\backslash\Omega_{1}.\end{cases}

(4.9)

In this case, the separation distance $d=3.5$ . Table 4.3 shows the results of three PINN method, and we can see that DS-PINN method and nDS-PINN method achieves satisfactory accuracy. The exact solution and the prediction using two PINN methods (the standard PINN method and the nDS-PINN method) are shown in Figure 4.6, and the point-wise errors are shown in Figure 4.7.

Table 4.3:

\left\|e\right\|_{\mathbb{L}_{2}}

errors of three PINN methods for Sect. 4.3.

Method	$\left\\|e\right\\|_{\mathbb{L}_{2}}$
Standard PINN	$2.96\pm 0.01\times 10^{0}$
DS-PINN (this work)	$3.3\pm 1.3\times 10^{-3}$
nDS-PINN (this work)	$3.3\pm 1.4\times 10^{-4}$

It can be seen that the results of the standard PINN method are far from the exact solution. The main reason is that, due to the inability to arrange training points at the material interface, the exchange of information between $\Omega_{1}$ and $\Omega\backslash\Omega_{1}$ is hindered. At the same time, due to the fact that $\Omega_{1}$ , which contains the heterogeneous material, is completely inside $\Omega$ , its training is completely free from the constraints of the boundary conditions (the supervised term), which leads to a complete loss of control over the prediction of the central subdomain $\Omega_{1}$ , and further leads to a complete deviation from the exact solution of the entire prediction.

On the other hand, The DS-PINN and nDS-PINN methods provide predictions that are highly consistent with the exact solution. In particular, the nDS-PINN method with the normalization strategy shows a very high computational accuracy, both in terms of the relative $\mathbb{L}_{2}$ error and the point-wise error, which is one order of magnitude smaller than that of the DS-PINN method, showing excellent performance. Unlike the previous two examples, the boundary condition of this example is not zero, so not only the residual term is normalized, but also the supervised term is normalized.

In Figure 4.8, the left image shows the schematic diagram of domain separation and it also gives the distribution of training points; the right image shows the DS-PINN prediction for the whole extended domain after implementing the domain separation strategy. This is a very interesting picture, and we can see that the prediction on $\Omega_{1}$ and $\Omega\backslash\Omega_{1}$ match well with the exact solution at the corresponding locations.

4.4 The diffusion problems with jump conditions at the interface

In this paper we are mainly concerned with a class of multi-material diffusion problems formulated by Eqs. (2.1)-(2.3), for which the continuity conditions (3.3) and (3.4) should be satisfied at the material interface.

However, there is a special class of heterogeneous diffusion problems, such as the heat conduction problem with a thin insulating layer or with a phase change at the material interface, and the percolation problem with a filter membrane, which also receive much attention. The solutions and fluxes of these problems are discontinuous at the material interface and they satisfy some jump conditions. Such problems, which are also studied in Refs. [16, 26], can be formulated by the following equations:

\begin{cases}-\nabla\cdot\left(\kappa(x,y)\nabla u(x,y)\right)=Q(x,y),&~{}(x,y)\in~{}\Omega=(-1,1)\times(-1,1),\\ [\![u(x,y)]\!]_{\Gamma}=\Phi(x,y),&~{}(x,y)\in~{}\Gamma,\\ [\![\kappa(x,y)\nabla u(x,y)\cdot\bm{n}]\!]_{\Gamma}=\Psi(x,y),&~{}(x,y)\in~{}\Gamma,\\ u(x,\pm 1)=\ln(1+x^{2}),&~{}x\in[-1,1],\\ u(\pm 1,y)=\ln(1+y^{2}),&~{}y\in[-1,1],\end{cases}

(4.10)

where the coefficient $\kappa(x,y)$ and the source term $f(x,y)$ are as follows:

	$\displaystyle\kappa(x,y)=\begin{cases}\cos(x+y)+2,~{}&(x,y)\in\Omega_{1}=\{(x,y)\|x^{2}+y^{2}<0.5^{2}\},\\ \sin(x+y)+2,~{}&(x,y)\in\Omega\backslash\Omega_{1},\end{cases}$		(4.11)
	$\displaystyle Q(x,y)=\begin{cases}4(\cos(x+y)+1)\sin(x+y),~{}&(x,y)\in\Omega_{1},\\ -2\cos(x+y)\frac{x+y}{x^{2}+y^{2}},~{}&(x,y)\in\Omega\backslash\Omega_{1}.\end{cases}$		(4.12)

In Eq. (4.10), the interface $\Gamma$ is a circle with a radius of 0.5 and centered at $(0,0)$ . Note that $[\![\mu]\!]_{\Gamma}:=\mu|_{\Gamma^{+}}-\mu|_{\Gamma^{-}}$ denotes the jump of $\mu$ across the interface. $\Phi(x,y)$ and $\Psi(x,y)$ can be derived from the exact solution below.

The exact solution of this case is

u(x,y)=\begin{cases}\sin(x+y),~{}&(x,y)\in\Omega_{1},\\ \ln(x^{2}+y^{2}),~{}&(x,y)\in\Omega\backslash\Omega_{1}.\end{cases}

(4.13)

For this model, our methods are fully applicable, with only a slight modification of the loss term $\mathcal{L}_{\Gamma}(\theta;\tau_{\Gamma})$ by replacing the continuity conditions with the jump conditions. The computational results using different PINN methods are shown in Figure 4.9 and Table 4.4. It can be seen that the standard PINN method is powerless for such a model, while our methods DS-PINN and nDS-PINN (with separation distance ${d}=2$ ) can solve this model exactly, especially the nDS-PINN method gives extremely accurate computational results.

It should be emphasized that for the nDS-PINN method, since $\Phi(x,y)$ and $\Psi(x,y)$ in the jump condition are known functions, we used them to normalize the interface loss term $\mathcal{L}_{\Gamma}(\theta;\tau_{\Gamma})$ . The result of our nDS-PINN method is consistent with that of the INN method, which is taken from Ref. [16].

Table 4.4:

\left\|e\right\|_{\mathbb{L}_{2}}

errors of four PINN methods for Sect. 4.4.

Method	$\left\\|e\right\\|_{\mathbb{L}_{2}}$
Standard PINN	$1.33\pm 0.02\times 10^{0}$
INN [16]	$5.6\times 10^{-4}$
DS-PINN (this work)	$1.7\pm 0.8\times 10^{-3}$
nDS-PINN (this work)	$3.8\pm 1.2\times 10^{-4}$

5 Conclusions

For a class of multi-material diffusion problems, this paper first analyzed the reasons why the standard PINN cannot be applied; then derived two continuity conditions that should be satisfied at the material interface, the use of which can effectively fill in the missing information at the interface; further, we designed a domain separation strategy to overcome the problem that the solution function cannot be expressed by a single neural network due to the discontinuity of its derivatives at the interface. Finally, by combining the above two works, we improved the standard PINN by adding special terms to the loss function so that the interface conditions are accurately represented in a single neural network, which makes the obtained prediction function fully reflect the characteristics of the solution at the interface, giving very accurate predictions near the interface. In addition, we design a problem-adapted normalization method for the loss term, which can further significantly improve the accuracy of the prediction. Various numerical experiments verify the effectiveness of our method. The new method perfectly solves the problem that the standard PINN cannot be adapted to the multi-material diffusion model. We believe that this work provides a novel idea for PINN to solve partial differential equations with non-smooth solutions, and it is a useful development of the standard PINN.

Note that the methods in this paper are only for linear multi-material diffusion equations. It is our future work to study PINN methods for solving nonlinear multi-material diffusion problems.

Code availability

The code of this work is publicly available online via https://doi.org/10.5281/zenodo.7927544.

Acknowledgements

The work is supported by the National Science Foundation of China under Grant No.12271055, the Foundation of CAEP (CX20210044), the Natural Science Foundation of Shandong Province No.ZR2021MA092, and the Foundation of Computational Physics Laboratory.

References

[1] G. Hongwei, X. Zhuang, P. Chen, N. Alajlan, T. Rabczuk, Stochastic deep collocation method based on neural architecture search and transfer learning for heterogeneous porous media, Engineering with Computers (2022) 1–26doi:10.1007/s00366-021-01586-2.
[2] E. Illarionov, P. Temirchev, D. Voloskov, R. Kostoev, M. Simonov, D. Pissarenko, D. Orlov, D. Koroteev, End-to-end neural network approach to 3D reservoir simulation and adaptation, Journal of Petroleum Science and Engineering 208 (2022) 109332. doi:https://doi.org/10.1016/j.petrol.2021.109332.
[3] J. Lindl, Development of the indirect‐drive approach to inertial confinement fusion and the target physics basis for ignition and gain, Physics of Plasmas 2 (11) (1995) 3933–4024. doi:10.1063/1.871025.
[4] Y. Yao, S. Miao, G. Lv, An efficient iterative method for radiation heat conduction problems, International Journal for Numerical Methods in Fluids 93 (2021) 2362–2379. doi:10.1002/fld.4977.
[5] S. Mishra, R. Molinaro, Physics informed neural networks for simulating radiative transfer, Journal of Quantitative Spectroscopy and Radiative Transfer 270 (2021) 107705. doi:10.1016/j.jqsrt.2021.107705.
[6] G. Karniadakis, Y. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Physics-informed machine learning, Nature Reviews Physics 3 (2021) 1–19. doi:10.1038/s42254-021-00314-5.
[7] M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707. doi:10.1016/j.jcp.2018.10.045.
[8] M. Raissi, Deep hidden physics models: Deep learning of nonlinear partial differential equations, Journal Of Machine Learning Research 19 (1) (2018) 932–955. doi:10.1016/j.jcp.2017.11.039.
[9] M. Raissi, A. Yazdani, G. E. Karniadakis, Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations, Science 367 (6481) (2020) 1026–1030. doi:10.1126/science.aaw4741.
[10] L. Lu, X. Meng, Z. Mao, G. E. Karniadakis, DeepXDE: A deep learning library for solving differential equations, SIAM Review 63 (1) (2021) 208–228. doi:10.1137/19M1274067.
[11] C. He, X. Hu, L. Mu, A mesh-free method using piecewise deep neural network for elliptic interface problems, Journal of Computational and Applied Mathematics 412 (2022) 114358. doi:https://doi.org/10.1016/j.cam.2022.114358.
[12] H. Xie, C. Zhai, L. Liu, H. Yong, A weighted first-order formulation for solving anisotropic diffusion equations with deep neural networks (2022). arXiv:arXiv:2205.06658.
[13] A. D. Jagtap, G. E. Karniadakis, Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations, Communications in Computational Physics 28 (5) (2020) 2002–2041. doi:10.4208/cicp.OA-2020-0164.
[14] V. Dwivedi, N. Parashar, B. Srinivasan, Distributed physics informed neural network for data-efficient solution to partial differential equations (2019). arXiv:arXiv:1907.08967.
[15] W. Li, X. Xiang, Y. Xu, Deep domain decomposition method: Elliptic problems, in: J. Lu, R. Ward (Eds.), Proceedings of The First Mathematical and Scientific Machine Learning Conference, Vol. 107 of Proceedings of Machine Learning Research, PMLR, 2020, pp. 269–286.
[16] S. Wu, B. Lu, INN: Interfaced neural networks as an accessible meshless approach for solving interface PDE problems, Journal of Computational Physics 470 (2022) 111588. doi:https://doi.org/10.1016/j.jcp.2022.111588.
[17] J.-A. Désidéri, Multiple-gradient descent algorithm (MGDA) for multiobjective optimization, Comptes Rendus Mathematique 350 (5-6) (2012) 313–318. doi:10.1016/j.crma.2012.03.014.
[18] A. Paszke, S. Gross, F. Massa, A. Lerer, et al., Pytorch: An imperative style, high-performance deep learning library, in: Proceedings of the 33rd International Conference on Neural Information Processing Systems, Curran Associates Inc., Red Hook, NY, USA, 2019, pp. 8026–8037. doi:10.5555/3454287.3455008.
[19] M. Abadi, P. Barham, J. Chen, et al., TensorFlow: A system for Large-Scale machine learning, in: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), USENIX Association, Savannah, GA, 2016, pp. 265–283. doi:10.5555/3026877.3026899.
[20] I. Aavatsmark, T. Barkve, O. Bøe, T. Mannseth, Discretization on unstructured grids for inhomogeneous, anisotropic Media. Part I: Derivation of the methods, SIAM J. Sci. Comput. 19 (1998) 1700–1716. doi:10.1137/S1064827595293582.
[21] S. Miao, J. Wu, A nonlinear correction scheme for the heterogeneous and anisotropic diffusion problems on polygonal meshes, Journal of Computational Physics 448 (2022) 110729. doi:10.1016/j.jcp.2021.110729.
[22] Brezzi Franco and Lipnikov Konstantin and Shashkov Mikhail, Convergence of the mimetic finite difference method for diffusion problems on polyhedral meshes, SIAM Journal on Numerical Analysis 43 (5) (2005) 1872–1896. doi:10.1137/040613950.
[23] Colby, L. Wight, J. Zhao, Solving Allen-Cahn and Cahn-Hilliard Equations Using the Adaptive Physics Informed Neural Networks, Communications in Computational Physics 29 (3) (2021) 930–954. doi:https://doi.org/10.4208/cicp.OA-2020-0086.
[24] R. H. Byrd, P. Lu, J. Nocedal, C. Zhu, A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing 16 (5) (1995) 1190–1208. doi:10.1137/0916069.
[25] C. Xu, B. T. Cao, Y. Yuan, G. Meschke, Transfer learning based physics-informed neural networks for solving inverse problems in tunneling (2022). arXiv:arXiv:2205.07731.
[26] S. Hou, X.-D. Liu, A numerical method for solving variable coefficient elliptic equation with interfaces, Journal of Computational Physics 202 (2) (2005) 411–445. doi:https://doi.org/10.1016/j.jcp.2004.07.016.

	$\displaystyle\left.u(X)\right\|_{\Gamma^{-}}$	$\displaystyle=\left.u(X)\right\|_{\Gamma^{+}},$		(3.1)
	$\displaystyle\left.-\kappa_{1}(X)\nabla u(X)\right\|_{\Gamma^{-}}\cdot\bm{n}_{1}$	$\displaystyle=-\left(\left.-\kappa_{2}(X)\nabla u(X)\right\|_{\Gamma^{+}}\cdot\bm{n}_{2}\right),$		(3.2)