A hybrid iterative method based on MIONet for PDEs: Theory and numerical examples

(The research is supported by NSFC project 12288101.)
Abstract
We propose a hybrid iterative method based on MIONet for PDEs, which combines a traditional numerical iterative solver with the recent powerful machine learning method of neural operator, and we systematically analyze its theoretical properties, including the convergence condition, the spectral behavior, and the convergence rate, in terms of the errors of the discretization and the model inference. We show the theoretical results for the frequently-used smoothers, i.e. Richardson (damped Jacobi) and Gauss-Seidel. We give an upper bound of the convergence rate of the hybrid method w.r.t. the model correction period, which indicates a minimum point at which the hybrid iteration converges fastest. Several numerical examples, including the hybrid Richardson (Gauss-Seidel) iteration for the 1-d (2-d) Poisson equation, are presented to verify our theoretical results and also show an excellent acceleration effect. As a meshless acceleration method, it holds enormous potential for practical applications.
1 Introduction
Partial differential equations (PDEs) play a crucial role in describing physical reality and modeling engineering problems. As the vast majority of PDEs have no computable analytical solutions, seeking numerical solutions is an important issue. Many classical numerical methods have been developed for solving PDEs, such as the finite element method (FEM) [3], the finite difference method (FDM) [32], the finite volume method [6], and the spectral method [29]. With these methods, linear PDEs are usually discretized as linear algebraic systems, which are subsequently solved by direct methods (e.g. Gaussian elimination) or iterative methods such as Richardson, (damped) Jacobi, Gauss-Seidel, SOR, and SSOR [33]. When the scale of the problem is very large, iterative methods show significant advantages such as low memory requirements. However, a large-scale system results in a long solving time, which may be unacceptable. To accelerate the iteration, multilevel iterative methods have been developed [8, 34, 38]; a multigrid solver greatly reduces the number of iteration steps and leads to fast convergence. The shortcoming of the multigrid method is that it is mesh-dependent and requires users to generate several levels of meshes, which may be a very heavy task. In order to overcome this drawback, algebraic multigrid (AMG) methods [2, 22, 28, 39] were proposed, which deal with the algebraic system directly without geometric meshes. To further promote the development of fast and low-cost iterative methods, in this work we propose a hybrid iterative method by introducing neural network-based tools.
The rise of neural networks (NNs) has had a huge impact on the search for numerical solutions of PDEs. It has been shown that NNs have powerful expressivity [4, 30], thus NNs are considered as universal approximators for the solutions of PDEs. Mainstream neural network-based methods for solving PDEs optimize the parameters of such NNs to minimize the corresponding loss functions constructed from strong or variational forms of PDEs, e.g. PINNs [14, 21, 26], the deep Ritz method [5], and the deep Galerkin method [31]. These methods show the possibility of solving high-dimensional problems, which is indeed a challenge for traditional methods due to the curse of dimensionality. As meshless methods, they are easy to implement and coder-friendly. The drawbacks of such methods are also obvious. On the one hand, these methods usually struggle to obtain solutions as accurate as those of traditional methods like FEM; on the other hand, one needs to restart the training process once some parameters of the given PDE problem (e.g. the right-hand side of the Poisson equation) change, which makes parametric PDE problems costly to handle. In recent years, a new research field called neural operator has emerged to achieve fast solving of PDEs based on neural networks. The pioneering works are DeepONet [19, 20] and FNO [17, 18], which directly learn the solution operators of PDEs, for example the mapping from the right-hand side of the Poisson equation to its solution. These methods present a new landscape for understanding physical reality; one may study causality in a data-driven manner besides the PDE model. Recently, numerous neural operators have been proposed to achieve higher performance [7, 9, 12, 13, 25, 27, 37], as this field is developing rapidly. There are also theoretical works for the above neural operators, including error analysis [15, 16, 23].
The most relevant neural operator for this work is the multiple-input operator network (MIONet) [13]. MIONet is the first to address problems with multiple input functions, and it generalizes the architecture and theory of DeepONet via tensor decomposition. This architecture also induces the tensor-based models for solving eigenvalue problems [10, 36]. An important characteristic of the DeepONet/MIONet series of architectures is that they encode the output functions as neural networks rather than discrete vectors. This feature leads to efficient interpolation of the predicted solutions and also allows computing derivatives of the output functions during training, which prompted the work of physics-informed DeepONets [35] that learn the solution operator without any input-output data pairs by training a PINN-style loss. Another interesting work based on DeepONet is HINTS [41], which in fact inspired this work. Almost all works on neural operators solve PDE problems in a pure machine learning style, breaking away from traditional numerical methods for PDEs. Neural operators are developed for fast solving but usually have limited prediction accuracy, while traditional numerical methods can provide solutions as accurate as needed at the cost of a long solving time. Surprisingly, HINTS combines the two approaches, i.e. the traditional numerical solver and the neural operator. A spectral bias/frequency principle of the training behavior of NNs has been shown in [24, 40], i.e. NNs often fit functions from low to high frequency during training. Based on this feature, [11] employs deep learning solutions (e.g. PINNs) to initialize iterative methods for PDEs. HINTS further utilizes the spectral bias: a pre-trained DeepONet is periodically used to eliminate the low-frequency components of the error during the iteration process, thus achieving acceleration, since the smoothers (e.g. Gauss-Seidel) have difficulty eliminating low-frequency errors. This idea is similar to the multigrid method. A significant difference is that HINTS does not require the users to generate meshes, and the cost of this method is transferred to the developers who are responsible for the pre-trained DeepONets.
Contributions. HINTS studies the hybrid method empirically, and the lack of theoretical guidance leaves this method imperfect. In this work, we propose a hybrid iterative method based on MIONet. We show that MIONet is a more suitable framework for constructing the hybrid iterative method. An ideal operator regression model for the hybrid method is expected to satisfy the following three key points:
1. Supporting multiple inputs. Taking the Poisson equation

   $$-\nabla\cdot(k\nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega \qquad (1)$$

   as an example, its solution operator maps the two inputs $k$ and $f$ to the solution $u$, hence the chosen operator regression model should efficiently deal with such a multiple-input case.

2. Separating linearity and nonlinearity. Since the solution operator of (1) is linear w.r.t. $f$ but nonlinear w.r.t. $k$, the operator regression model is also expected to be linear w.r.t. $f$ and nonlinear w.r.t. $k$. The linearity in $f$ will guarantee the convergence of the hybrid iterative method, which will be discussed later.

3. Supporting efficient interpolation. After obtaining the predicted solution by the operator regression model, we have to compute its values at grid points which are independent of the model, so that the model should support efficient interpolation.
Fortunately, we find that MIONet satisfies all three points perfectly, while other models do not (note that MIONet is a high-level framework, and one may design a targeted architecture for each branch net on a specific problem, rather than directly using FNNs). With the proposed MIONet-based hybrid iterative method, we systematically analyze its theoretical properties, including the convergence condition, the spectral behavior, as well as the convergence rate, in terms of the errors of the discretization and the model inference. We show the theoretical results for the frequently-used smoothers, i.e. Richardson (damped Jacobi) and Gauss-Seidel. We give an upper bound of the convergence rate of the hybrid method w.r.t. the model correction period, which indicates a minimum point that makes the hybrid iteration converge fastest. Several numerical examples including the Richardson (Gauss-Seidel) iteration for the 1-d (2-d) Poisson equation are shown to verify our theoretical results.
This paper is organized as follows. We first briefly introduce the operator regression model MIONet as the necessary tool in Section 2. In Section 3, we present the algorithm of the proposed hybrid iterative method in detail. We then comprehensively analyze the theoretical properties of this method in Section 4. The numerical experiments are shown in Section 5. Finally, Section 6 summarizes this work.
2 Operator regression with MIONet
Let us simply introduce the operator regression model MIONet, which is a necessary tool for the later discussion of the hybrid iterative method. All the content of this section can be found in the MIONet paper [13], and the readers may refer to that paper for more details if interested. The readers familiar with MIONet may skip this section.
Let $X_1, \dots, X_n$ and $Y$ be Banach spaces, and $K_i \subset X_i$ ($i = 1, \dots, n$) is a compact set. Now we aim to learn a continuous operator
$$G : K_1 \times \cdots \times K_n \to Y,$$
where $v_i \in K_i$ and $G(v_1, \dots, v_n) \in Y$. Such operators form the space which is studied below. Note that $Y$ could be any Banach space. We usually consider $Y$ as a function space, such as $C(\Omega)$ for a compact domain $\Omega$.
To deal with the inputs from infinite-dimensional Banach spaces, it is necessary to firstly project them onto finite-dimensional spaces. Here we utilize the tools of Schauder basis and canonical projections.
Definition 1 (Schauder basis).
Let $X$ be an infinite-dimensional normed linear space. A sequence $\{e_i\}_{i=1}^{\infty}$ in $X$ is called a Schauder basis of $X$, if for every $x \in X$ there is a unique sequence of scalars $\{\alpha_i\}_{i=1}^{\infty}$, called the coordinates of $x$, such that
$$x = \sum_{i=1}^{\infty} \alpha_i e_i.$$
We show two useful examples of Schauder basis as follows.
Example 1.
Faber-Schauder basis of $C[0,1]$. Given distinct points $\{x_i\}_{i=1}^{\infty}$ which form a dense subset of $[0,1]$ with $x_1 = 0$, $x_2 = 1$. Let $e_1 = 1$, $e_2 = x$, and for $k \geq 3$, $e_k$ is chosen as an element such that $\{e_1, \dots, e_k\}$ is a basis of the $k$-dimensional space which consists of all the piecewise linear functions with grid points $x_1, \dots, x_k$.
Example 2.
Fourier basis of $L^2([0,1])$. Any orthogonal basis in a separable Hilbert space is a Schauder basis.
We denote the coordinate functional of $e_i$ by $e_i^*$, and thus
$$x = \sum_{i=1}^{\infty} e_i^*(x) e_i \quad \text{for all } x \in X.$$
Then for a positive integer $q$, the canonical projection $P_q$ is defined as
$$P_q(x) = \sum_{i=1}^{q} e_i^*(x) e_i.$$
We have the following property for $P_q$, according to which we can represent points in an infinite-dimensional Banach space by finite coordinates within a sufficiently small projection error.
Property 1 (Canonical projection).
Assume that $K$ is a compact set in a Banach space $X$ equipped with a Schauder basis and corresponding canonical projections $P_q$, then we have
$$\lim_{q \to \infty} \sup_{x \in K} \| P_q(x) - x \| = 0.$$
For convenience, we decompose the canonical projection $P_q$ as
$$P_q = E_q \circ \tau_q,$$
where $\tau_q : X \to \mathbb{R}^q$ and $E_q : \mathbb{R}^q \to X$ are defined as
$$\tau_q(x) = (e_1^*(x), \dots, e_q^*(x)), \qquad E_q(\alpha_1, \dots, \alpha_q) = \sum_{i=1}^{q} \alpha_i e_i.$$
The $\tau_q(x)$ are essentially the truncated coordinates of $x$. Moreover, sometimes we can further replace $\tau_q$ with equivalent coordinates for the decomposition of $P_q$, i.e.,
$$P_q = (E_q \circ A^{-1}) \circ (A \circ \tau_q),$$
with a nonsingular matrix $A$. For example, when applying the Faber-Schauder basis (Example 1), instead of using the coordinates based on the sequence $\{e_i\}$, we adopt the function values evaluated at certain grid points as the coordinates, which is the same as the linear element basis in the finite element method.
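To make this choice of coordinates concrete, here is a minimal sketch (our own illustration, not code from the MIONet implementation): a function $v \in C[0,1]$ is represented by its values at $q$ uniform grid points, and the canonical projection $P_q v$ is recovered as the piecewise-linear interpolant of these values.

```python
# A minimal sketch: function values at q uniform grid points serve as the truncated
# coordinates, and P_q v is the corresponding piecewise-linear interpolant.
import numpy as np

def truncated_coords(v, q):
    """Coordinates of v: its values at q uniform grid points in [0, 1]."""
    x = np.linspace(0.0, 1.0, q)
    return x, v(x)

def canonical_projection(v, q):
    """Piecewise-linear interpolant of v on the q-point grid."""
    x, c = truncated_coords(v, q)
    return lambda t: np.interp(t, x, c)

v = lambda t: np.sin(3 * np.pi * t) + t**2
Pv = canonical_projection(v, 64)
t = np.linspace(0.0, 1.0, 1000)
print("sup-norm projection error:", np.abs(v(t) - Pv(t)).max())  # shrinks as q grows
```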
By considering the canonical projection property and the injective tensor product decomposition
(2) |
we can easily obtain the following approximation result.
Theorem 1.
Suppose that $X_1, \dots, X_n$ are Banach spaces, $K_i \subset X_i$ are compact sets, and each $X_i$ has a Schauder basis with canonical projections $P_{q_i}$ (truncating the first $q_i$ coordinates). Assume that $G : K_1 \times \cdots \times K_n \to Y$ is a continuous operator, then for any $\varepsilon > 0$, there exist positive integers $p$ and $q_1, \dots, q_n$, together with continuous functions $g_1, \dots, g_n$ and $f$ acting on the truncated coordinates, such that
(3) |
In addition, if $G$ is linear with respect to some input $v_j$, then a linear $g_j$ is sufficient.
The MIONet is immediately constructed based on this theorem, where we just replace the functions $g_1, \dots, g_n$ and $f$ with neural networks. Assume now that $Y = C(\Omega)$ ($H^1(\Omega)$, $L^2(\Omega)$, etc. are also available) for a compact domain $\Omega$.
MIONet. We construct the MIONet according to Eq. (3). Specifically, $g_1, \dots, g_n$ are approximated by neural networks (called branch nets) and $f$ by another neural network (called the trunk net). Then the network (Figure 1) is
(4) |
where $\tau_{q_j}$ truncates the first $q_j$ coordinates w.r.t. the corresponding Schauder basis as defined in Eq. (3), $\odot$ is the Hadamard product (element-wise product), e.g.
(5) |
and $\mathrm{sum}(\cdot)$ is the summation of all the components of a vector, and $b$ is a trainable bias.
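For readers who prefer code, the following is a minimal PyTorch sketch of Eq. (4) (an illustrative reimplementation under assumed layer sizes, not the exact architecture used in our experiments): two branch nets process the truncated coordinates of the inputs, a trunk net processes the query point, and their outputs are combined via a Hadamard product, a summation, and a bias.

```python
# A minimal MIONet sketch following Eq. (4); layer sizes are placeholders.
import torch
import torch.nn as nn

def mlp(sizes):
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.Tanh())
    return nn.Sequential(*layers)

class MIONet(nn.Module):
    def __init__(self, q_k, q_f, dim_x, width=100, p=100, linear_in_f=True):
        super().__init__()
        self.branch_k = mlp([q_k, width, width, p])
        # a one-layer linear branch (no activation, no bias) keeps the output linear in f
        self.branch_f = nn.Linear(q_f, p, bias=False) if linear_in_f else mlp([q_f, width, p])
        self.trunk = mlp([dim_x, width, width, p])
        self.b = nn.Parameter(torch.zeros(1))  # drop this bias too if strict linearity in f is required

    def forward(self, k, f, x):
        # k: (B, q_k), f: (B, q_f), x: (B, dim_x)  ->  (B,)
        out = self.branch_k(k) * self.branch_f(f) * self.trunk(x)  # Hadamard product
        return out.sum(dim=-1) + self.b

model = MIONet(q_k=128, q_f=128, dim_x=1)
u_pred = model(torch.randn(8, 128), torch.randn(8, 128), torch.rand(8, 1))
print(u_pred.shape)  # torch.Size([8])
```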
The universal approximation theorem for MIONet can be easily obtained via Theorem 1 and the universal approximation theorems for the applied neural networks (i.e. the branch nets and the trunk net) such as FNNs.
Theorem 2 (Universal approximation theorem for MIONet).
Under the setting of Theorem 1 with $Y = C(\Omega)$ etc., for any $\varepsilon > 0$, there exists a MIONet $\mathcal{M}$, such that
$$\sup_{(v_1, \dots, v_n) \in K_1 \times \cdots \times K_n} \| G(v_1, \dots, v_n) - \mathcal{M}(v_1, \dots, v_n) \| < \varepsilon.$$
In addition, the branch nets can be chosen as linear maps (i.e. one-layer FNNs without activation and bias) if $G$ is linear with respect to the corresponding inputs.
Up to now, we have already shown the architecture and properties of MIONet, which will be applied to the iterative methods later.

3 A hybrid iterative method
We present a new MIONet-based hybrid iterative method for general linear equations:
$$\mathcal{L}_\eta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \qquad (6)$$
where $\mathcal{L}_\eta$ is a linear differential operator depending on $\eta$, and $\eta$ is a finite set consisting of several parameters which can be scalars, vectors, functions, etc. Other boundary conditions are also available.
Without loss of generality we take the well-known Poisson equation as an example. Consider
$$-\nabla\cdot(k\nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \qquad (7)$$
where $\Omega \subset \mathbb{R}^d$ is a bounded open set. Here $f \in L^2(\Omega)$, $k \in L^\infty(\Omega)$ and $k \geq c_0$ a.e. for a constant $c_0 > 0$, so that it is a classical second-order elliptic equation. The variational formulation is: find $u \in H_0^1(\Omega)$ such that
$$\int_\Omega k \nabla u \cdot \nabla v \, \mathrm{d}x = \int_\Omega f v \, \mathrm{d}x \quad \text{for all } v \in H_0^1(\Omega). \qquad (8)$$
Now assume that $\Omega$ is a bounded polyhedral domain, and $\mathcal{T}_h$ is a quasi-uniform and shape-regular triangulation of $\Omega$. We apply the simple linear element space
$$V_h = \left\{ v \in H_0^1(\Omega) : v|_T \in P_1(T) \ \text{for all } T \in \mathcal{T}_h \right\}, \qquad (9)$$
where $P_1(T)$ denotes the function space consisting of all the linear polynomials over $T$. The finite element method is: find $u_h \in V_h$ such that
$$\int_\Omega k \nabla u_h \cdot \nabla v_h \, \mathrm{d}x = \int_\Omega f v_h \, \mathrm{d}x \quad \text{for all } v_h \in V_h. \qquad (10)$$
Let $\{\phi_i\}_{i=1}^{n}$ be the nodal basis functions corresponding to all the interior nodes of $\mathcal{T}_h$. Then we obtain a discrete linear system for Eq. (10) as
$$A_h u_h = b_h, \qquad (11)$$
where $(A_h)_{ij} = \int_\Omega k \nabla\phi_j \cdot \nabla\phi_i \, \mathrm{d}x$ and $(b_h)_i = \int_\Omega f \phi_i \, \mathrm{d}x$. Denote the solution of (11) by $u_h = ((u_h)_1, \dots, (u_h)_n)^T$, then $\sum_{i=1}^{n} (u_h)_i \phi_i$ is the numerical solution obtained by the above system. What we need to do next is to solve Eq. (11) via a suitable iterative method.
For convenience, we temporarily omit the subscript "$h$" in the above notations, but we keep in mind that they indeed depend on $h$. For the discrete linear system
$$A u = b, \qquad (12)$$
we consider the iterative form
$$u^{(j+1)} = u^{(j)} + B\left(b - A u^{(j)}\right), \qquad (13)$$
where $B$ is an approximation of $A^{-1}$. The frequently-used choices of $B$ are
$$B = \omega I \quad \text{(Richardson)}, \qquad (14)$$
$$B = \omega D^{-1} \quad \text{(damped Jacobi)}, \qquad (15)$$
$$B = \left(\tfrac{1}{\omega} D + L\right)^{-1} \quad \text{(Gauss-Seidel / SOR)}, \qquad (16)$$
where $\omega > 0$, $D$ is the diagonal of $A$, and $L$ and $U$ are the strictly lower and upper triangular parts of $A$, respectively. The $\omega$ is usually considered as the relaxation parameter to be chosen. By performing a suitable iterative method, we will get a numerical solution for system (12).
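As a concrete illustration, the following minimal sketch (our own, assuming the standard forms of $B$ above; it is not the C++/Eigen implementation used in Section 5) applies one step of these smoothers to a sparse system.

```python
# A sketch of one smoothing step u <- u + B(b - A u) for the standard choices of B.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def smoother_step(A, b, u, method="gauss_seidel", omega=1.0):
    r = b - A @ u
    if method == "richardson":                       # B = omega * I
        return u + omega * r
    d = A.diagonal()
    if method == "jacobi":                           # B = omega * D^{-1}
        return u + omega * (r / d)
    if method == "gauss_seidel":                     # B = (D/omega + L)^{-1}; omega = 1 gives GS
        M = (sp.diags(d) / omega + sp.tril(A, k=-1)).tocsr()
        return u + spla.spsolve_triangular(M, r, lower=True)
    raise ValueError(method)

# tiny test on the 1-d model problem A = tridiag(-1, 2, -1)/h
n = 63
h = 1.0 / (n + 1)
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h
b = np.ones(n)
u = np.zeros(n)
for _ in range(200):
    u = smoother_step(A, b, u, "gauss_seidel")
print("residual norm:", np.linalg.norm(b - A @ u))   # decreases monotonically with more sweeps
```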
Briefly, our hybrid iterative method just inserts some inference steps via MIONet into the iterative process. Now we return to the operator regression model MIONet. For the problem (7), what we are concerned with is in fact the solution operator
$$\mathcal{G} : (k, f) \mapsto u. \qquad (17)$$
Now assume that we have a dataset $\{(k_i, f_i, u_i)\}_{i=1}^{N}$ satisfying $\mathcal{G}(k_i, f_i) = u_i$. We pre-train a MIONet, denoted by $\mathcal{M}$, on this dataset with loss function
$$\mathrm{Loss} = \frac{1}{N} \sum_{i=1}^{N} \left\| \mathcal{M}(k_i, f_i) - u_i \right\|^2, \qquad (18)$$
where the norm $\|\cdot\|$ could be the $L^2$, $H^1$, $l^2$ norm, etc. Note that here we set $\mathcal{M}$ linear with respect to $f$ due to Theorem 1. An illustration is presented in Figure 2.

Denote $r_j = b - A u^{(j)}$, which is the residual vector of the $j$-th step. We expect to find a residual function related to the residual vector $r_j$. Define the residual function $\tilde{r}_j = \sum_{i=1}^{n} c_i \phi_i$, where the coefficient vector $c = (c_1, \dots, c_n)^T$ satisfies
(19) |
Specially, we let . Note that is linear with respect to , we denote their relationship by a linear operator as
(20) |
We further define the interpolation operator $I_h$ as
$$(I_h v)_i = v(x_i), \quad i = 1, \dots, n, \qquad (21)$$
where $x_i$ is the corresponding nodal coordinate of $\phi_i$, i.e. the $i$-th component is the pairing of $v$ with the Dirac delta function $\delta_{x_i}$.
Then the hybrid iterative method can be written as
$$u^{(j+1)} = u^{(j)} + I_h\, \mathcal{M}\!\left(k, \tilde{r}_j\right), \qquad \text{if } m \mid j, \qquad (22)$$
$$u^{(j+1)} = u^{(j)} + B\left(b - A u^{(j)}\right), \qquad \text{otherwise}, \qquad (23)$$
where $m$ is a positive integer to be chosen; we regard it as the correction period. The algorithm is summarized in Algorithm 1.
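A compact sketch of this loop is given below (illustration only; `smoother_step` is any basic iteration such as the one sketched above, and `mionet_correction` is a hypothetical wrapper around the pre-trained MIONet that builds the residual function from the residual vector, pads it at the boundary, and evaluates the predicted correction at the interior nodes via the interpolation operator $I_h$).

```python
# A sketch of the hybrid iteration (22)-(23); helper names are hypothetical.
import numpy as np

def hybrid_solve(A, b, k_input, mionet_correction, smoother_step,
                 m=100, tol=1e-12, max_iter=10**6):
    u = np.zeros_like(b)                      # u^(0) = 0
    for j in range(max_iter):
        r = b - A @ u
        if np.linalg.norm(r) < tol:
            break
        if j % m == 0:
            # network correction step: u <- u + I_h M(k, residual function)
            u = u + mionet_correction(k_input, r)
        else:
            # basic smoothing step: u <- u + B(b - A u)
            u = smoother_step(A, b, u)
    return u, j
```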
Approximation and padding for residual function. There is a new linear system (19) involved. We may choose an approximation for , for example:
(24) |
With the solved coefficient vector, we obtain the residual function expressed as a continuous piecewise linear function, which does not contain the boundary nodes. Hence the current residual function equals 0 on the boundary, which is regarded as a "zero padding". Here we may use other, better paddings, for example adding boundary nodes whose values equal the values of their corresponding nearest interior nodes (we can also apply other padding strategies like reflection padding, etc.). For convenience, we still use the previous notation, i.e.
(25) |
Note that the above approximation brings an acceptable error for this hybrid method. Since $u^{(0)} = 0$, $u^{(1)}$ is exactly the prediction by MIONet. We may also think of $u^{(1)}$ as an initial guess provided by MIONet for the iterative method. When feeding the residual function to the MIONet, we need to evaluate this linear element function at some sampling points (usually of a smaller scale compared to the number of nodes) for the branch net input.
Efficient interpolation. The interpolation is easy to compute, since the prediction
(26) |
is expressed by a neural network (encoded as a trunk net) rather than a discrete vector, and we only need to directly feed the interpolation points into the trunk net. This is indeed an important advantage of the ONet-series architectures.
Hybrid iteration with multigrid method. Since there is no requirement for the basic iterative method in this algorithm, we can also utilize other advanced iterative methods such as the multigrid method here. We will show more details in the experiment part.
Inhomogeneous boundary condition. Here we further investigate the inhomogeneous boundary condition. Consider
(27) |
where is the trace operator. In such a case, we aim to learn an operator mapping from to the solution , i.e.,
(28) |
Of course we can train an MIONet to learn this operator. However, this strategy loses the linearity w.r.t. the residual vector/function during the iteration, and we strive to preserve this linearity to guarantee the convergence of the hybrid method. Denote by the trace extension operator for (27), then for . We derive that
(29) |
where . Therefore the model can be modified to
(30) |
where is a standard MIONet which is linear w.r.t. the second input as before. Although the trace extension operator is not uniquely defined, we may choose a suitably constructed extension as needed, for example the continuous piecewise linear function related to a grid, which takes 0 at the interior nodes and matches $g$ at the boundary nodes. Note that here we regard it as an element of the corresponding function space.
There is in fact another way which is much easier to achieve. It is not difficult to find that is linear w.r.t. , i.e.,
(31) |
then we naturally apply the model
(32) |
where $(f, g)$ is a simple Cartesian product of $f$ and $g$, and it will be fed into the linear branch net of the MIONet. In such a case, the discrete algebraic system of (27) is augmented with several rows and columns taking into account the boundary nodes, where the boundary block is the identity. The unknown and residual vectors contain not only the coefficients at the interior nodes but also those at the boundary nodes. We apply this strategy in our experiment.
Necessary convergence condition for operator regression models. There is a necessary condition for the convergence of the hybrid iterative method. Assume that the hybrid method converges with a model , then
(33) |
and we have
(34) |
As the interpolation operator is related to the mesh , and the above equation holds for arbitrary , we know
(35) |
Obviously, Eq. (35) holds as long as the model $\mathcal{M}$ is linear w.r.t. the second input.
4 Theoretical analysis
Consider the one-dimensional two-point boundary value problem of system (7) where $\Omega = (0, 1)$ and $k \equiv 1$, i.e.,
$$-u'' = f \ \text{in } (0, 1), \qquad u(0) = u(1) = 0. \qquad (36)$$
We choose a uniform grid with size $h = \frac{1}{n+1}$ and apply the linear element basis functions $\{\phi_i\}_{i=1}^{n}$ whose nodal points are interior, then the above system leads to a discrete system as
$$A u = b, \qquad (37)$$
where
$$A = \frac{1}{h} \begin{pmatrix} 2 & -1 & & \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix} \in \mathbb{R}^{n \times n}, \qquad (38)$$
and $b_i = \int_0^1 f \phi_i \, \mathrm{d}x$. We can easily write out the eigenvalues and the eigenvectors of $A$ as
$$\lambda_i = \frac{4}{h} \sin^2\frac{i\pi h}{2}, \qquad \xi_i = \big(\sin(i\pi h), \sin(2i\pi h), \dots, \sin(ni\pi h)\big)^T, \quad i = 1, \dots, n. \qquad (39)$$
Denote $S = \sqrt{2h}\,(\xi_1, \dots, \xi_n)$, then $S$ is symmetric and orthogonal, i.e. $S = S^T$ and $S^T S = I$.
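The following short numerical check (our own sketch, assuming the matrix and eigenpairs stated above) verifies these formulas.

```python
# Numerical check of the eigenpairs of A = tridiag(-1, 2, -1)/h and of S = sqrt(2h)*(xi_1, ..., xi_n).
import numpy as np

n = 31
h = 1.0 / (n + 1)
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

i = np.arange(1, n + 1)
lam = 4.0 / h * np.sin(i * np.pi * h / 2) ** 2            # eigenvalues
S = np.sqrt(2 * h) * np.sin(np.outer(i, i) * np.pi * h)    # columns are sqrt(2h) * xi_i

print(np.allclose(S, S.T), np.allclose(S @ S, np.eye(n)))  # symmetric and orthogonal
print(np.allclose(A @ S, S * lam))                         # A xi_i = lambda_i xi_i
```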
4.1 Convergence condition
Recall that we apply the hybrid iterative method (22-23) to solve (37). For convenience, here we simply set , so that we have for all . We further denote that
(40) |
thus is a linear map. Denote that
(41) |
then we find that
(42) |
and for ,
(43) |
thus
(44) |
Consequently we have
(45) |
where
(46) |
Up to now, we have derived the convergence results of the hybrid iterative method.
Theorem 3.
Corollary 1.
The corollary is also obvious since as if .
Corollary 2.
Proof.
It is noteworthy that there is no requirement on the loss of the MIONet to guarantee the convergence. However, a model with a large loss cannot accelerate the iteration, which will be discussed later.
4.2 Error of model inference
Since the MIONet usually learns the low-frequency data, we assume that the dataset includes the first several eigenfunctions as
(50) |
where are the eigenvalue and the eigenfunction of the original system, i.e.,
(51) |
For convenience, we consider the loss
(52) |
We further impose a generalization assumption on the MIONet for this case.
Assumption 1 (generalization assumption).
Assume that the MIONet keeps a consistent Lipschitz constant w.r.t. the second input during training, i.e.
(53) |
holds for and . The set is regarded as the sequence of parameters during the training process. depends only on .
Let be defined as in Eq. (25) and extended with replication padding. Then
(54) |
where and are the two boundary nodal functions, hence
(55) |
Consequently,
(56) |
where is the Lipschitz constant defined as in Assumption 1. By Eq. (52),
(57) |
Now denote
(58) |
and we get the estimate
(59) |
With the results of (57-59) and the orthogonality of , we further derive that
(60) |
It is elementary to see
(61) |
so that
(62) |
It is summarized as
(63) |
The error includes two parts, and the terms are caused by the errors of the discretization and the loss, respectively. The first part of this error is inevitable even though the loss has been trained to zero, since there exists a gap between the eigenvector (eigenvalue) of the discrete system (37) and the eigenfunction (eigenvalue) of the original system (36), which becomes larger as increases.
The estimate (63) can be equivalently written as
(64) |
Eq. (64) tells that is necessary to ensure that is small enough. Therefore , the size of the dataset defined in (50), need not be too large compared to . We reasonably assume that . For a fixed , we have derived that
(65) |
Note that here we consider the ground truth, i.e. the analytical eigenvectors and eigenvalues, as data. If we utilize numerical solutions as data, the error will decrease, since the gap between the data and the discrete system becomes smaller. As a matter of fact, if we replace the analytical eigenvalues in the dataset (50) by the corresponding numerical eigenvalues, then the error term in the last equality of (62) will disappear.
4.3 Spectral analysis
With the above discussion, we are able to analyze the spectral behavior and the convergence speed of the proposed hybrid iterative method. Now we express (see Section 4.1) by the eigenvectors of as
(66) |
where , . We further set
(67) |
then by direct computing we have
(68) |
so that
(69) |
Similarly, we can derive that
(70) |
Consequently, we obtain the final error as
(71) |
where
(72) |
Next we proceed discussions on . Note that the defined in (67) depends on the trained operator regression model MIONet, i.e., . Assume that perfectly learns the eigenvectors of Eq. (37), so that
(73) |
which leads to
(74) |
Then
(75) |
thus
(76) |
In such a situation, the first iterative step by MIONet in fact gives the right solution for the discrete system (37). However, we have to take into account the error of MIONet. Keep in mind that is in fact an approximation of , therefore we expect
(77) |
at least for the first few columns. As discussed in Section 4.2, we recall that
(78) |
and
(79) |
The errors of MIONet on the first several eigenvectors can be controlled by the discretization error and the loss. Hence we assume that they are small enough with sufficient training of the MIONet, while the remaining ones are uncontrolled. Therefore,
(80) |
whose first columns are small enough. It indicates that a well-trained MIONet can eliminate the low-frequency components of the error.
Return to the final error , we have
(81) |
where
(82) |
We subsequently study how the acts on a vector . Now assume that we choose the Richardson iteration and set which satisfies the convergence condition, then and
(83) |
therefore
(84) |
As
(85) |
we obtain the estimate
(86) |
which shows the change of the error within a period of $m$ steps. Subsequently, we write down the total error as
(87) |
Moreover, the last term in above inequality is
(88) |
where is a constant independent of . We hence have the simple relationship that
(89) |
For comparison, we write down the error of the original Richardson method as
(90) |
where is a constant under the assumption that .
Now we show the errors of the Richardson iterative method and the corresponding hybrid method as
Richardson | (91) | ||||
Hybrid | (92) |
where
(93) |
Note that , , and is usually small enough while is uncontrolled. Up to now, we are able to compare the convergence speeds of these two methods.
Theorem 4.
Assume that . Then there exists a , such that for any integer , it holds
(94) |
for large enough. Briefly, the hybrid iterative method converges faster than the original Richardson iterative method for large enough as long as .
Proof.
If , we have
(95) |
∎
Next we investigate how the correction period $m$ influences the convergence rate of the hybrid iterative method. According to Eqs. (91)-(92), the average convergence rates of the above methods are derived as follows.
Theorem 5 (convergence rate).
The convergence rate of the Richardson iterative method and an upper bound of the average convergence rate of the corresponding hybrid iterative method are
Richardson | (96) | ||||
Hybrid | (97) |
This theorem clearly points out how the network correction step accelerates the iteration.
Denote
(98) |
To visually observe the tendency of Rate($m$), we set example values and plot them in Figure 3. It is shown that the rate of the hybrid method decreases first and then increases, and there exists a minimum point of $m$. For $m$ that is too small, the rate will be larger than 1, which means the hybrid iterative method will diverge. The rate increases and tends to the convergence rate of Richardson as $m \to \infty$.

4.4 Discussion for complicated cases
At this point, we have systematically analyzed the hybrid iterative method for the one-dimensional case with Richardson iteration (or equivalently damped Jacobi iteration). For more complicated cases, such as high-dimensional problems with Gauss-Seidel iteration, which is a better smoother, we have to apply other effective tools [1, 8]. The smoothing property of GS is conveniently analyzed using the local mode analysis introduced by Brandt [1]. Here let us show the idea and carry out the discussion for the hybrid Gauss-Seidel iteration.
Recall that the 2-d Poisson equation with the uniform grid and the lexicographical order can be discretized as
$$4 u_{j,l} - u_{j-1,l} - u_{j+1,l} - u_{j,l-1} - u_{j,l+1} = h^2 f_{j,l}. \qquad (99)$$
To begin with, we add some idealized assumptions and simplifications as in the local mode analysis:

1. Assume that boundary conditions are neglected and the problem is defined on an infinite grid with a period of , i.e.
(100)

2. Assume that the Gauss-Seidel iteration is implicitly defined as
(101)
where is updated from . Note that Eq. (101) leads to an algebraic system for , which is uniquely solvable since it is strictly diagonally dominant.
Denote and , then
(102) |
We expand these errors by the discrete Fourier transformation, as
(103) |
where
(104) |
and
(105) |
Substituting (103) into (102), we obtain
(106) |
The smoothing factor is defined as
(107) |
It can be shown that the smoothing factor equals $0.5$, which means the GS iteration can efficiently eliminate the high-frequency errors.
It is worth noting that the amplification factor equals 1 at the zero frequency, which indicates that the lowest-frequency component will never be eliminated. In fact, the problem (99) does not determine a unique solution, since adding a constant to the solution does not affect the relationship. For convenience, we only consider the error excluding the lowest-frequency component in the following.
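These two facts can be checked directly with the classical amplification formula for lexicographic GS on the 5-point stencil (a sketch under that assumed formula, which we take to coincide with (106)-(107)).

```python
# Local mode analysis sketch: amplification of the Fourier mode with frequency (t1, t2)
# under lexicographic Gauss-Seidel for the 5-point Laplacian.
import numpy as np

def gs_amplification(t1, t2):
    num = np.abs(np.exp(1j * t1) + np.exp(1j * t2))
    den = np.abs(4.0 - np.exp(-1j * t1) - np.exp(-1j * t2))
    return num / den

t = np.linspace(-np.pi, np.pi, 801)
T1, T2 = np.meshgrid(t, t)
high = np.maximum(np.abs(T1), np.abs(T2)) >= np.pi / 2               # high-frequency modes
print("smoothing factor ~", gs_amplification(T1, T2)[high].max())    # ~0.5
print("lowest-frequency mode:", gs_amplification(1e-9, 1e-9))        # ~1, never damped
```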
Based on this framework, we proceed with the discussion on the network correction step. Assume that the MIONet-based iterative step sufficiently learns the low-frequency modes, i.e.,
(108) |
where is a constant that determines the (low-)frequency components learned. Denote
(109) |
we substitute (109) into (108) and similarly derive that
(110) |
which leads to
(111) |
Denote
(112) |
then we suppose that is as small as needed for .
Now we consider the MIONet iteration step, i.e.,
(113) |
or equivalently
(114) |
where , , and
(115) |
We expand as before, i.e.,
(116) |
then we have
(117) |
Consequently,
(118) |
Returning to the algorithm of the hybrid iterative method, the network correction step is performed following $m$ GS iteration steps, so that the error over one period can be written as
(119) |
then we have the estimate
(120) |
and the low-frequency component () has a smoothing factor greater than , i.e.
(121) |
Now denote
(122) |
then
(123) |
by (118). With (120) and (121), we further derive that
(124) |
where
(125) |
and is supposed to be small as discussed before. The relationship
(126) |
immediately leads to the final upper bound of the average convergence rate
(127) |
According to the above deduction, we have studied the convergence rate of this hybrid iterative method for the 2-d problem with GS iteration, which is consistent with Eq. (98). It is expected that the discussion can be extended to obtain further theoretical results for more complicated cases. We leave this for future research due to space limitations.
5 Numerical experiments
The experiments are implemented in C++
with the Eigen library for sparse matrices and basic iterations, and the LibTorch library for model inference. The models used are pre-trained via PyTorch. The running devices are an Intel(R) Core(TM) i7-10750H CPU and an NVIDIA GeForce RTX 2070 GPU.
5.1 One-dimensional Poisson equation
Consider the 1-d Poisson equation
(128) |
To construct the dataset, we first generate $k$ and $f$ by a Gaussian process (GP) with RBF kernel. $k$ is generated by a GP with mean 1, std 0.2, length scale 0.1, while $f$ is generated with mean 0, std 1, length scale 0.1. We generate 5000 data points in total as the training set. After training an MIONet on this dataset, we obtain an MIONet-based iteration step (linear in the residual), where the branch input dimension equals the number of interior nodal points of the uniform grid on the interval. Here the size of MIONet is for $k$'s branch net, for $f$'s branch net, and for the trunk net. Note that we remove the bias in $f$'s branch net and also the bias in Eq. (4) to guarantee the linearity w.r.t. $f$. Figure 4 shows an example of a couple of inputs $k$ and $f$ as well as the corresponding solution and the MIONet prediction.
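For reproducibility of the data-generation step, a small sketch is given below (our own illustration; the grid resolution, jitter term, and sampling routine are assumptions, not details taken from our C++/PyTorch code).

```python
# Sketch: sampling k and f from Gaussian processes with an RBF kernel.
import numpy as np

def sample_gp_rbf(x, mean, std, length_scale, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    d = x[:, None] - x[None, :]
    K = std**2 * np.exp(-0.5 * (d / length_scale) ** 2)   # RBF covariance
    L = np.linalg.cholesky(K + 1e-8 * np.eye(len(x)))      # jitter for numerical stability
    return mean + rng.standard_normal((n_samples, len(x))) @ L.T

x = np.linspace(0.0, 1.0, 128)
k_samples = sample_gp_rbf(x, mean=1.0, std=0.2, length_scale=0.1, n_samples=5000)
f_samples = sample_gp_rbf(x, mean=0.0, std=1.0, length_scale=0.1, n_samples=5000, seed=1)
print(k_samples.shape, f_samples.shape)  # (5000, 128) (5000, 128)
```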

We then apply the linear element together with the Richardson iteration to solve this 1-d Poisson equation. The MIONet iteration is performed every $m$ steps to correct low-frequency errors in the Richardson-MIONet method (we also show the convergence rate versus $m$ in Figure 5, which is consistent with the theoretical results of Eq. (98) and Figure 3). The error threshold for stopping the iteration is . We list the numbers of iterations of the Richardson and the Richardson-MIONet methods in Table 1. The number of iterations of the original Richardson method is 157 times that of the Richardson-MIONet method. Since the 1-d case converges too fast, we leave the study of the convergence time to the later 2-d experiment.
| Method | Iterations | Speed up |
|---|---|---|
| Richardson | 28396 | / |
| Richardson-MIONet | 181 | 157 |

We further observe the spectral error of the iterative methods, and the results are shown in Figure 6. The high-frequency errors are quickly eliminated by the Richardson iteration in both methods. However, the low-frequency errors are difficult to eliminate via the original Richardson method. In each correction step, MIONet efficiently reduces the low-frequency errors. Although MIONet brings new high-frequency errors, they are quickly cleared up. After about 180 steps, the Richardson-MIONet method has already achieved machine accuracy, while the original Richardson method still retains a significant error.

5.2 Two-dimensional Poisson equation
Consider the 2-d Poisson equation
(129) |
where . We also generate $k$ and $f$ with the same settings as in the 1-d case. The difference is that we set the length scale to 0.2. We generate 5000 data points in total as the training set. After training an MIONet on this dataset, we obtain an approximate solution operator. Here the size of MIONet is for $k$'s branch net, for $f$'s branch net, and for the trunk net. We also remove the corresponding biases as in the 1-d case. An example of a couple of inputs $k$ and $f$ as well as the corresponding solution and the MIONet prediction is shown in Figure 7.

We then apply the linear element together with the Gauss-Seidel (GS) iteration to solve this 2-d Poisson equation on a uniform grid. The MIONet iteration is performed every $m$ steps to correct low-frequency errors in the GS-MIONet method. The error threshold for stopping the iteration is . We list the numbers of convergence iterations and the convergence times of GS-MIONet for different $m$ in Tables 2 and 3. We also show the trends of the convergence steps and the convergence time versus $m$ in Figure 8. The best speed of the GS-MIONet method is nearly 30 times that of the original GS method at $m = 8000$, while the number of convergence iterations reaches its minimum at $m = 4000$. This gap exists since the inference time of MIONet accounts for a significant portion of the hybrid iteration. When $m$ exceeds the number of convergence iterations of the GS method, MIONet only gives an initial guess during the hybrid iteration, which still provides an acceleration.
|  | GS | 500 | 1000 | 2000 | 4000 | 8000 | 16000 | 32000 | 64000 |
|---|---|---|---|---|---|---|---|---|---|
| Iterations | 2556893 | div. | 167001 | 94001 | 78471 | 82484 | 96001 | 160001 | 256001 |
| Time (s) | 27103 | div. | 2641 | 1222 | 923 | 919 | 1049 | 1722 | 2750 |
| Speed up | / | div. | 10.3 | 22.2 | 29.4 | 29.5 | 25.8 | 15.7 | 9.9 |
|  | 128000 | 256000 | 512000 | 1024000 | 2048000 | 4096000 | 8192000 |
|---|---|---|---|---|---|---|---|
| Iterations | 384640 | 768001 | 1024001 | 1439557 | 2048001 | 2071460 | 2071460 |
| Time (s) | 4079 | 8173 | 10804 | 15467 | 22058 | 22181 | 22181 |
| Speed up | 6.6 | 3.3 | 2.5 | 1.75 | 1.23 | 1.22 | 1.22 |

Next we compare the hybrid iterative method to the multigrid method. We first solve this equation by multigrid methods of different levels, then for each case we again test the corresponding multigrid-MIONet hybrid iterative method. In this hybrid method, we regard a V-cycle of the multigrid method as one step, and insert one network correction step every several V-cycles. The results are recorded in Table 4 and visually presented in Figure 9. We observe that the proposed hybrid iterative method can flexibly utilize the advanced multigrid method. The speed of GS-MIONet is between the multigrid methods of levels 3 and 4. There is an acceleration effect all the way up to level 6. A high-level (level $\geq 7$) multigrid method converges so fast that the inference time of MIONet is larger than the multigrid's convergence time. Unsurprisingly, in such a case the hybrid method converges more slowly compared to the original multigrid method. However, in practice it is costly to construct a high-level multigrid, while the use of the hybrid iterative method imposes no extra burden on users as long as the pre-trained model is provided.
| Method | GS | Mg2 | Mg3 | Mg4 | Mg5 | Mg6 | Mg7 | Mg8 | Mg9 | Mg10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Basic time (s) | 27103 | 8880 | 2210 | 567 | 139 | 35 | 8.7 | 2.4 | 1.7 | 1.7 |
| Hybrid time (s) | 919 | 337 | 118 | 56 | 34 | 22 | 12.2 | 7.0 | 6.4 | 6.4 |
| Speed up | 29.5 | 26.4 | 18.7 | 10.1 | 4.1 | 1.6 | / | / | / | / |

5.3 Inhomogeneous boundary condition
Finally we consider the 2-d Poisson equation with inhomogeneous boundary condition:
(130) |
where , and the boundary condition $g$ is also changing. We generate $k$ and $f$ with the same settings as in the previous 2-d case. The boundary function $g$ is regarded as a function with a period of 4 (anticlockwise along the boundary), which is generated by a GP with the exponential sine squared kernel, where we set the mean, std, length scale and period to 0, 0.05, 1 and 4, respectively. We generate 5000 data points in total as the training set. After training an MIONet on this dataset, we obtain an approximate solution operator. Here the size of MIONet is for $k$'s branch net, for the $(f, g)$ branch net, and for the trunk net. We remove the corresponding biases as before. An example of inputs $k$, $f$ and $g$ as well as the corresponding solution and the MIONet prediction is shown in Figure 10.
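A short sketch of this boundary-data generation is given below (the kernel form is the standard exponential sine squared kernel; the arc-length parametrization of the boundary and the jitter term are our assumptions).

```python
# Sketch: sampling the periodic boundary function g along the boundary (period 4).
import numpy as np

def sample_periodic_gp(s, std=0.05, length_scale=1.0, period=4.0, n_samples=5000, seed=0):
    rng = np.random.default_rng(seed)
    d = np.abs(s[:, None] - s[None, :])
    K = std**2 * np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length_scale**2)
    L = np.linalg.cholesky(K + 1e-8 * np.eye(len(s)))
    return rng.standard_normal((n_samples, len(s))) @ L.T    # zero mean

s = np.linspace(0.0, 4.0, 256, endpoint=False)   # arc-length parameter along the boundary
g_samples = sample_periodic_gp(s)
print(g_samples.shape)  # (5000, 256)
```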

We then apply the linear element together with the GS iteration to solve this equation on the uniform grid. The MIONet iteration is performed every $m$ steps to correct low-frequency errors in the GS-MIONet method. The error threshold for stopping the iteration is . We show the number of convergence iterations and the convergence time of the GS(-MIONet) method in Table 5. The speed of the GS-MIONet method is nearly 24 times that of the original GS method.
| Method | Iterations | Time (s) | Speed up |
|---|---|---|---|
| GS | 672230 | 1652 | / |
| GS-MIONet | 20963 | 69 | 23.9 |
6 Conclusions
We have proposed a hybrid iterative method based on MIONet for PDEs, which combines the traditional numerical iterative solver and the neural operator, and we have systematically analyzed its theoretical properties, including the convergence condition, the spectral behavior, as well as the convergence rate, in terms of the errors of the discretization and the model inference. We have shown the theoretical results for Richardson (damped Jacobi) and Gauss-Seidel. An upper bound of the convergence rate of the hybrid method w.r.t. the correction period is given, which indicates an optimal period that makes the hybrid iteration converge fastest. Several numerical examples including the hybrid Richardson (Gauss-Seidel) iteration for the 1-d (2-d) Poisson equation are presented to verify our theoretical results. In addition, we tested the hybrid iteration utilizing the multigrid method; it still shows an excellent acceleration effect, unless the level of the multigrid is very high. Furthermore, we have achieved the hybrid iteration for the Poisson equation with inhomogeneous boundary condition, which has not been studied in the work of HINTS. So far, we are able to deal with the Poisson equation in which all the parameters are changing. As a meshless acceleration method, it holds enormous potential for practical applications.
References
- [1] A. Brandt. Multi-level adaptive solutions to boundary-value problems. Mathematics of computation, 31(138):333–390, 1977.
- [2] A. Brandt. Algebraic multigrid theory: The symmetric case. Applied mathematics and computation, 19(1-4):23–56, 1986.
- [3] S. C. Brenner. The mathematical theory of finite element methods. Springer, 2008.
- [4] W. E, C. Ma, and L. Wu. Barron spaces and the compositional function spaces for neural network models. arXiv preprint arXiv:1906.08039, 2019.
- [5] W. E and B. Yu. The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics, 6(1):1–12, 2018.
- [6] R. Eymard, T. Gallouët, and R. Herbin. Finite volume methods. Handbook of numerical analysis, 7:713–1018, 2000.
- [7] G. Gupta, X. Xiao, and P. Bogdan. Multiwavelet-based operator learning for differential equations. Advances in neural information processing systems, 34:24048–24062, 2021.
- [8] W. Hackbusch. Multi-grid methods and applications, volume 4. Springer Science & Business Media, 2013.
- [9] J. He, X. Liu, and J. Xu. MgNO: Efficient parameterization of linear operators via multigrid. arXiv preprint arXiv:2310.19809, 2023.
- [10] J. Hu and P. Jin. Experimental observation on a low-rank tensor model for eigenvalue problems. arXiv preprint arXiv:2302.00538, 2023.
- [11] J. Huang, H. Wang, and H. Yang. Int-deep: A deep learning initialized iterative method for nonlinear problems. Journal of computational physics, 419:109675, 2020.
- [12] Z. Jiang, M. Zhu, D. Li, Q. Li, Y. O. Yuan, and L. Lu. Fourier-MIONet: Fourier-enhanced multiple-input neural operators for multiphase modeling of geological carbon sequestration. arXiv preprint arXiv:2303.04778, 2023.
- [13] P. Jin, S. Meng, and L. Lu. MIONet: Learning multiple-input operators via tensor product. SIAM Journal on Scientific Computing, 44(6):A3490–A3514, 2022.
- [14] G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, and L. Yang. Physics-informed machine learning. Nature Reviews Physics, 3(6):422–440, 2021.
- [15] N. Kovachki, S. Lanthaler, and S. Mishra. On universal approximation and error bounds for Fourier neural operators. The Journal of Machine Learning Research, 22(1):13237–13312, 2021.
- [16] S. Lanthaler, S. Mishra, and G. E. Karniadakis. Error estimates for DeepONets: A deep learning framework in infinite dimensions. Transactions of Mathematics and Its Applications, 6(1):tnac001, 2022.
- [17] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.
- [18] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar. Neural operator: Graph kernel network for partial differential equations. arXiv preprint arXiv:2003.03485, 2020.
- [19] L. Lu, P. Jin, and G. E. Karniadakis. DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193, 2019.
- [20] L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature machine intelligence, 3(3):218–229, 2021.
- [21] L. Lu, R. Pestourie, W. Yao, Z. Wang, F. Verdugo, and S. G. Johnson. Physics-informed neural networks with hard constraints for inverse design. SIAM Journal on Scientific Computing, 43(6):B1105–B1132, 2021.
- [22] J. Mandel. Algebraic study of multigrid methods for symmetric, definite problems. Applied mathematics and computation, 25(1):39–56, 1988.
- [23] C. Marcati and C. Schwab. Exponential convergence of deep operator networks for elliptic partial differential equations. SIAM Journal on Numerical Analysis, 61(3):1513–1545, 2023.
- [24] N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, and A. Courville. On the spectral bias of neural networks. In International Conference on Machine Learning, pages 5301–5310. PMLR, 2019.
- [25] M. A. Rahman, Z. E. Ross, and K. Azizzadenesheli. U-no: U-shaped neural operators. arXiv preprint arXiv:2204.11127, 2022.
- [26] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics, 378:686–707, 2019.
- [27] B. Raonić, R. Molinaro, T. Rohner, S. Mishra, and E. de Bezenac. Convolutional Neural Operators. arXiv preprint arXiv:2302.01178, 2023.
- [28] J. W. Ruge and K. Stüben. Algebraic multigrid. In Multigrid methods, pages 73–130. SIAM, 1987.
- [29] J. Shen, T. Tang, and L.-L. Wang. Spectral methods: algorithms, analysis and applications, volume 41. Springer Science & Business Media, 2011.
- [30] J. W. Siegel and J. Xu. High-order approximation rates for shallow neural networks with cosine and activation functions. Applied and Computational Harmonic Analysis, 58:1–26, 2022.
- [31] J. Sirignano and K. Spiliopoulos. DGM: A deep learning algorithm for solving partial differential equations. Journal of computational physics, 375:1339–1364, 2018.
- [32] G. D. Smith. Numerical solution of partial differential equations: finite difference methods. Oxford university press, 1985.
- [33] L. N. Trefethen and D. Bau. Numerical linear algebra, volume 181. Siam, 2022.
- [34] U. Trottenberg, C. W. Oosterlee, and A. Schuller. Multigrid. Elsevier, 2000.
- [35] S. Wang, H. Wang, and P. Perdikaris. Learning the solution operator of parametric partial differential equations with physics-informed DeepONets. Science advances, 7(40):eabi8605, 2021.
- [36] Y. Wang, P. Jin, and H. Xie. Tensor neural network and its numerical integration. arXiv preprint arXiv:2207.02754, 2022.
- [37] H. Wu, T. Hu, H. Luo, J. Wang, and M. Long. Solving High-Dimensional PDEs with Latent Spectral Models. arXiv preprint arXiv:2301.12664, 2023.
- [38] J. Xu. Theory of multilevel methods, volume 8924558. Cornell University Ithaca, NY, 1989.
- [39] J. Xu and L. Zikatanov. Algebraic multigrid methods. Acta Numerica, 26:591–721, 2017.
- [40] Z.-Q. J. Xu, Y. Zhang, T. Luo, Y. Xiao, and Z. Ma. Frequency principle: Fourier analysis sheds light on deep neural networks. arXiv preprint arXiv:1901.06523, 2019.
- [41] E. Zhang, A. Kahana, E. Turkel, R. Ranade, J. Pathak, and G. E. Karniadakis. A hybrid iterative numerical transferable solver (HINTS) for PDEs based on deep operator network and relaxation methods. arXiv preprint arXiv:2208.13273, 2022.