Automatic stabilization of finite-element simulations using neural networks and hierarchical matrices
Abstract
Petrov-Galerkin formulations with optimal test functions allow for the stabilization of finite element simulations. In particular, given a discrete trial space, the optimal test space induces a numerical scheme delivering the best approximation in terms of a problem-dependent energy norm. This ideal approach has two shortcomings: first, we need to explicitly know the set of optimal test functions; and second, the optimal test functions may have large supports inducing expensive dense linear systems.
Nevertheless, parametric families of PDEs are an example where it is worth investing some (offline) computational effort to obtain stabilized linear systems that can be solved efficiently, for a given set of parameters, in an online stage. Therefore, as a remedy for the first shortcoming, we explicitly compute (offline) a function mapping any PDE-parameter to the matrix of coefficients of optimal test functions (in a basis expansion) associated with that PDE-parameter. Next, as a remedy for the second shortcoming, we use low-rank approximations to hierarchically compress the (non-square) matrix of coefficients of optimal test functions. In order to accelerate this process, we train a neural network to learn a critical bottleneck of the compression algorithm (for a given set of PDE-parameters). When solving online the resulting (compressed) Petrov-Galerkin formulation, we employ a GMRES iterative solver with inexpensive matrix-vector multiplications thanks to the low-rank features of the compressed matrix. We perform experiments showing that the full online procedure is as fast as the original (unstable) Galerkin approach. We illustrate our findings by means of 2D Eriksson-Johnson and Helmholtz model problems.
keywords:
Petrov-Galerkin method, optimal test functions, parametric PDEs, automatic stabilization, neural networks, hierarchical matrices
1 Introduction
Difficult finite-element simulations solved with the Galerkin method (where we employ the same trial and test space) often generate incorrect numerical results with oscillations or spurious behavior. Examples of such problems are the advection-dominated diffusion equation [7] or the Helmholtz equation [10].
Petrov-Galerkin formulations (i.e., formulations where trial and test spaces are not equal, although they share the same dimension) [18] with optimal test functions allow for automatic stabilization of difficult finite-element simulations. This particular Petrov-Galerkin approach is equivalent to the residual minimization (RM) method [7], whose applications include advection-diffusion [16, 6], Navier-Stokes [15], and space-time formulations [20].
In general, for a fixed trial space, RM allows for stabilization by enriching the discrete test space where the optimal test functions of the associated Petrov-Galerkin method live (contrary to standard stabilization methods like SUPG [5, 13], there are no special stabilization terms modifying the weak formulation). If we knew explicitly the formulas for the optimal test functions expanded in the enriched discrete test space, then we could return to the Petrov-Galerkin formulation and solve problems in the best possible way for a given trial space.
This scenario has two shortcomings. The first problem is that the computation of optimal test functions is expensive. It requires solving a large system of linear equations (as large as the dimension of the enriched discrete test space) with multiple right-hand sides (one right-hand side per basis function of the trial space). The second problem is that the optimal test functions can have global supports, and thus the Petrov-Galerkin method with optimal test functions can generate a dense matrix, which is expensive to solve.
Nevertheless, parametric families of partial differential equations (PDEs) are an example where it is worth investing some (offline) computational effort to obtain stabilized linear systems that can be solved efficiently, for a given set of parameters, in an online stage. Therefore, in the context of a parametric family of PDEs, and as a remedy for the first shortcoming, we explicitly compute (offline) a function mapping any PDE-parameter to the matrix of coefficients of optimal test functions (expanded in the basis of the enriched discrete test space) associated with that PDE-parameter. We emphasize that this last procedure is independent of particular right-hand side sources and/or prescribed boundary data of the PDE family.
The obtained matrix of optimal test function coefficients is dense. The Petrov-Galerkin method induces a linear system of the form $W^{T}B\,\mathbf{u} = W^{T}\mathbf{l}$, where $\mathbf{l}$ is a right-hand side vector, $B$ is the matrix associated with the bilinear form of the underlying PDE, and $W$ is the matrix of coefficients of the optimal test functions. To avoid dense matrix computations, we compress $W$ using the approach of hierarchical matrices [11]. Having the matrix compressed into a hierarchical matrix $\mathcal{H}(W)$, we employ the GMRES method [19], which involves computations of the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$, and the hierarchical format enables the matrix-vector multiplications with $\mathcal{H}(W)^{T}$ and $B$ in a quasi-linear computational cost.
However, compressing the matrix $W$ for each PDE-parameter is an expensive procedure. Thus, with the help of an artificial neural network, we learn (offline) the critical bottleneck of the compression algorithm. We obtain a stabilized method with an additional quasi-linear cost resulting from matrix-vector multiplications within the GMRES method. From our numerical results with the Eriksson-Johnson and Helmholtz model problems, this cost of stabilization (the cost of compression of the hierarchical matrix and the cost of GMRES with hierarchical matrix-vector multiplications) is of the same order as the cost of the solution of the original Galerkin problem without the stabilization. Thus, we claim we can obtain stabilization with neural networks and hierarchical matrices practically for free.
1.1 One-dimensional illustration of stabilization
Let us illustrate how we stabilize difficult computational problems by means of a one-dimensional example for the advection-dominated diffusion model.
Given a diffusion coefficient $\epsilon > 0$, consider the following differential equation:
$-\epsilon\, u'' + u' = f \quad \text{in } (0, 1), \qquad u(0) = u(1) = 0. \qquad (1)$
In weak form, equation (1) translates into: find $u \in H_0^1(0, 1)$ such that
$b(u, v) := \epsilon\, (u', v') + (u', v) = (f, v) =: l(v), \qquad \forall v \in H_0^1(0, 1). \qquad (2)$
We define discrete spaces $U_h$ and $V_m$ as depicted in Figure 1. Given a regular mesh, $U_h$ will be the space of piecewise-linear continuous functions, while $V_m$ will be the space of piecewise-quadratic continuous functions. On one side, we discretize formulation (2) using a standard Galerkin method where trial and test spaces are both equal to $U_h$. On the other side, we discretize (2) by means of a residual minimization (RM) technique that uses $U_h$ as the trial space and $V_m$ as the test space. Figure 2 compares the discrete solutions delivered by these two methods for different values of $\epsilon$ and different meshes parametrized by the number of elements. We observe a superior performance of the residual minimization method, even though the approximation (trial) space used is poorer than that of the Galerkin method.
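The following NumPy sketch reproduces this comparison on a small scale. The source term $f = 1$ and the homogeneous Dirichlet boundary conditions are illustrative assumptions (not necessarily the data used to produce the figures), and the test space is equipped with the $H^1$ inner product.

```python
# Minimal sketch: Galerkin on piecewise-linear U_h vs. residual minimization with
# piecewise-quadratic test space V_m, for -eps*u'' + u' = 1, u(0) = u(1) = 0 (assumed data).
import numpy as np

def solve_1d(eps, n):
    h = 1.0 / n
    # 3-point Gauss rule mapped to the reference element [0, 1]
    gq = np.array([0.5 - np.sqrt(0.6) / 2, 0.5, 0.5 + np.sqrt(0.6) / 2])
    gw = np.array([5.0, 8.0, 5.0]) / 18.0
    # reference shape functions (values / derivatives at the quadrature points)
    p1, dp1 = np.array([1 - gq, gq]), np.array([[-1.0] * 3, [1.0] * 3])
    p2 = np.array([2*gq**2 - 3*gq + 1, 4*gq - 4*gq**2, 2*gq**2 - gq])
    dp2 = np.array([4*gq - 3, 4 - 8*gq, 4*gq - 1])

    nU, nV = n + 1, 2*n + 1
    B = np.zeros((nV, nU)); G = np.zeros((nV, nV))      # mixed form: b(u,v), (v,w)_V
    Bg = np.zeros((nU, nU))                              # Galerkin: trial = test = P1
    l = np.zeros(nV); lg = np.zeros(nU)
    for e in range(n):
        iu, iv = [e, e + 1], [2*e, 2*e + 1, 2*e + 2]
        for q in range(3):
            w = gw[q] * h
            for a in range(3):
                l[iv[a]] += w * p2[a, q]                                     # f = 1 (assumed)
                for b in range(2):                                           # eps u'v' + u'v
                    B[iv[a], iu[b]] += w * (eps*dp1[b, q]*dp2[a, q]/h**2 + dp1[b, q]*p2[a, q]/h)
                for c in range(3):                                           # H^1 Gram: vw + v'w'
                    G[iv[a], iv[c]] += w * (p2[a, q]*p2[c, q] + dp2[a, q]*dp2[c, q]/h**2)
            for a in range(2):
                lg[iu[a]] += w * p1[a, q]
                for b in range(2):
                    Bg[iu[a], iu[b]] += w * (eps*dp1[b, q]*dp1[a, q]/h**2 + dp1[b, q]*p1[a, q]/h)
    # homogeneous Dirichlet BCs: drop first/last dof of each space
    U, V = slice(1, nU - 1), slice(1, nV - 1)
    u_gal, u_rm = np.zeros(nU), np.zeros(nU)
    u_gal[U] = np.linalg.solve(Bg[U, U], lg[U])                              # standard Galerkin
    GinvB = np.linalg.solve(G[V, V], B[V, U])                                # residual minimization:
    u_rm[U] = np.linalg.solve(B[V, U].T @ GinvB,                             # B^T G^{-1} B u = B^T G^{-1} l
                              B[V, U].T @ np.linalg.solve(G[V, V], l[V]))
    return u_gal, u_rm

u_gal, u_rm = solve_1d(eps=1e-2, n=10)   # u_gal typically oscillates here, u_rm stays much smoother
```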
1.2 Outline of the paper
The structure of the paper is the following. Section 2 contains all the theoretical ingredients needed to understand our approach. Namely, Petrov-Galerkin formulations with optimal test functions (section 2.1); optimal test functions for an affine family of PDEs (section 2.2); the hierarchical compression of the optimal test functions matrix of coefficients (section 2.3); the fast hierarchical matrix-vector multiplication and related fast implementation of the GMRES solver (section 2.4); and the neural-network acceleration of the hierarchical matrix compression (section 2.5). Next, in section 3 we apply our automatic stabilization procedure to well-known unstable parametric model problems. First, for the two-dimensional Eriksson-Johnson model problem (section 3.1); and next, for a two-dimensional Helmholtz equation (section 3.2). We write our conclusions in section 4. All the pseudo-code algorithms needed to follow our methodology are collected in the Appendix.
2 Theoretical ingredients
2.1 Petrov-Galerkin formulations with optimal test functions
Let $U$ and $V$ be Hilbert spaces. We consider a general variational formulation of a PDE, which is to find $u \in U$ such that
$b(u, v) = l(v), \qquad \forall v \in V, \qquad (3)$
where $b: U \times V \to \mathbb{R}$ is a continuous bilinear form, and $l$ is a continuous linear functional (i.e., $l \in V^{*}$, the dual space of $V$).
The dual space $V^{*}$ has a norm inherited from the norm of $V$. Indeed, if $(\cdot,\cdot)_V$ denotes the inner product of the Hilbert space $V$, then these norms are given by the following expressions:
$\|v\|_{V} := \sqrt{(v, v)_V} \qquad \text{and} \qquad \|f\|_{V^{*}} := \sup_{0 \neq v \in V} \frac{|f(v)|}{\|v\|_{V}}.$
We will assume well-posedness of problem (3), which translates into the well-known inf-sup conditions (see, e.g., [9, Theorem 2.6]):
$\exists\, \gamma > 0: \quad \inf_{0 \neq w \in U}\ \sup_{0 \neq v \in V}\ \frac{b(w, v)}{\|w\|_{U}\, \|v\|_{V}} \geq \gamma, \qquad (4a)$
$\{ v \in V:\ b(w, v) = 0,\ \forall w \in U \} = \{0\}. \qquad (4b)$
Notice that the continuity assumption on the bilinear form $b(\cdot,\cdot)$ also implies the existence of a constant $M > 0$ such that:
$b(w, v) \leq M\, \|w\|_{U}\, \|v\|_{V}, \qquad \forall w \in U,\ \forall v \in V. \qquad (5)$
Given a discrete finite-element space $U_h \subset U$, a natural candidate to approximate the solution to problem (3) is the residual minimizer
$u_h := \operatorname*{arg\,min}_{w_h \in U_h}\ \| l - b(w_h, \cdot) \|_{V^{*}}. \qquad (6)$
Indeed, combining (4a), (6), and (5), the residual minimizer automatically satisfies the quasi-optimality property:
$\| u - u_h \|_{U} \leq \frac{M}{\gamma}\, \inf_{w_h \in U_h} \| u - w_h \|_{U}.$
Thus, the residual minimizer inherits stability properties from the continuous problem.
It is well-known (see, e.g., [8]) that the saddle-point formulation of residual minimization (6) becomes the mixed problem that aims to find $u_h \in U_h$, and a residual representative $r \in V$, such that:
$(r, v)_V + b(u_h, v) = l(v), \qquad \forall v \in V, \qquad (7a)$
$b(w_h, r) = 0, \qquad \forall w_h \in U_h. \qquad (7b)$
Although it seems harmless, the mixed problem (7) is still infinite-dimensional in the test space $V$. To obtain a computable version, we introduce a discrete test space $V_m \subset V$ that turns (7) into the fully discrete problem to find $u_h \in U_h$ (notice the abuse of notation: this new solution of (8) does not equal the exact residual minimizer solving (6), or equivalently (7)) and a residual representative $r_m \in V_m$, such that:
$(r_m, v_m)_V + b(u_h, v_m) = l(v_m), \qquad \forall v_m \in V_m, \qquad (8a)$
$b(w_h, r_m) = 0, \qquad \forall w_h \in U_h. \qquad (8b)$
Problem (8) corresponds to the saddle-point formulation of a discrete-dual residual minimization, in which the dual norm in (6) is replaced by the discrete-dual norm $\|\cdot\|_{V_m^{*}}$. Well-posedness and stability of (8) have been extensively studied in [17] and depend on a Fortin compatibility condition between the discrete spaces $U_h$ and $V_m$. Moreover, once this condition is fulfilled (and in order to gain stability), it is possible to enrich the test space $V_m$ without changing the trial space $U_h$. Obviously, this process will enlarge the linear system (8). Nevertheless, we know that there is an equivalent linear system of the same size as the dimension of the trial space, delivering the same solution as (8). This is known as the Petrov-Galerkin method with optimal test functions, which we describe below. Our goal will be to make this generally impractical method practical.
Let us introduce now the concept of optimal test functions. For each $w_h \in U_h$, the optimal test function $t_{w_h} \in V_m$ is defined as the (discrete) Riesz representative of the functional $b(w_h, \cdot)$, i.e.,
$(t_{w_h}, v_m)_V = b(w_h, v_m), \qquad \forall v_m \in V_m. \qquad (9)$
Testing equation (8a) with optimal test functions $v_m = t_{w_h}$, and using (9) together with (8b), we arrive at the following Petrov-Galerkin system with optimal test functions: find $u_h \in U_h$ such that
$b(u_h, t_{w_h}) = l(t_{w_h}), \qquad \forall w_h \in U_h. \qquad (10)$
In order to make explicit a matrix expression for (10), let us set $U_h = \operatorname{span}\{u_1, \dots, u_n\}$ and $V_m = \operatorname{span}\{v_1, \dots, v_m\}$. Consider the matrix $B \in \mathbb{R}^{m \times n}$ linked to the bilinear form such that its $(i,j)$-entry is $B_{ij} = b(u_j, v_i)$. Analogously, we consider the Gram matrix $G \in \mathbb{R}^{m \times m}$ linked to the inner product such that $G_{ij} = (v_j, v_i)_V$. The optimal test space is defined as $V_{\mathrm{opt}} := \operatorname{span}\{t_{u_1}, \dots, t_{u_n}\}$. Thus, using (9), we observe that the matrix containing the coefficients of the optimal test functions when expanded in the basis of $V_m$ is $W = G^{-1}B$. Moreover, the Petrov-Galerkin system (10) becomes
$W^{T} B\, \mathbf{u} = W^{T} \mathbf{l}, \qquad (11)$
where the vector $\mathbf{u}$ contains the coefficients of the expansion of $u_h$ in the basis of $U_h$, $\mathbf{l}_i = l(v_i)$, and we have used the fact that $G^{T} = G$. Therefore, if we aim to solve the Petrov-Galerkin linear system (11), an optimized matrix-vector multiplication to perform $B\,\mathbf{x}$ and $W^{T}\mathbf{y}$ becomes critical. Section 2.3 is devoted to studying the hierarchical compression of $W$, which allows for fast matrix-vector multiplications.
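For illustration, a minimal NumPy sketch of (9)-(11): given assembled $B$, $G$, and $\mathbf{l}$, it forms the coefficients of the optimal test functions and solves the Petrov-Galerkin system. The function name and the toy data below are ours, not the paper's.

```python
# Sketch of the Petrov-Galerkin system with optimal test functions (11),
# assuming B (bilinear form), G (test-space Gram matrix) and l are already assembled.
import numpy as np

def petrov_galerkin_solve(B, G, l):
    """B: (m, n) bilinear-form matrix, G: (m, m) Gram matrix, l: (m,) load vector."""
    W = np.linalg.solve(G, B)          # coefficients of the optimal test functions, W = G^{-1} B
    lhs = W.T @ B                      # n x n Petrov-Galerkin matrix  W^T B
    rhs = W.T @ l                      # n-vector                      W^T l
    return np.linalg.solve(lhs, rhs), W

# toy usage with a random SPD Gram matrix (illustration only)
rng = np.random.default_rng(0)
m, n = 12, 6
A = rng.standard_normal((m, m)); G = A @ A.T + m * np.eye(m)
B = rng.standard_normal((m, n)); l = rng.standard_normal(m)
u, W = petrov_galerkin_solve(B, G, l)
```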
2.2 Optimal test functions for an affine family of parametric PDEs
Assume we want to solve parametric PDEs in variational form, i.e., find $u_{\mu} \in U$ such that
$b_{\mu}(u_{\mu}, v) = l(v), \qquad \forall v \in V,$
where for each set of parameters $\mu$, the bilinear form $b_{\mu}(\cdot,\cdot)$ is continuous and inf-sup stable, with constants that may depend on $\mu$. Moreover, we assume that $b_{\mu}$ has the affine decomposition:
$b_{\mu}(u, v) = \sum_{q=1}^{Q} \theta_{q}(\mu)\, b_{q}(u, v),$
where the scalar coefficients $\theta_{q}(\mu)$ and the parameter-independent bilinear forms $b_{q}(\cdot,\cdot)$ are accessible and easy to compute. When the trial and test spaces are discretized, the bilinear form induces a matrix of the form:
$B(\mu) = \sum_{q=1}^{Q} \theta_{q}(\mu)\, B_{q}.$
Thus, the matrix of coefficients of optimal test functions becomes in this case:
$W(\mu) = G^{-1}B(\mu) = \sum_{q=1}^{Q} \theta_{q}(\mu)\, G^{-1}B_{q}. \qquad (12)$
Equation (18) shows the particular form that expression (12) takes for the Eriksson-Johnson model problem, where knowing the precomputed matrices $G^{-1}B_{q}$ implies the knowledge of $W(\varepsilon)$ for any diffusion coefficient $\varepsilon$.
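The offline/online split implied by (12) can be sketched as follows; the helper names and the generic list of parameter-independent terms are illustrative assumptions, not the paper's notation.

```python
# Sketch of the offline/online split for an affine parametric family (Section 2.2).
import numpy as np

def offline(G, B_list):
    # precompute G^{-1} B_q once per parameter-independent term
    return [np.linalg.solve(G, Bq) for Bq in B_list]

def online_W(theta, GinvB_list):
    # W(mu) = sum_q theta_q(mu) * G^{-1} B_q  -- cheap to evaluate for any new parameter value
    return sum(t * M for t, M in zip(theta, GinvB_list))
```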
2.3 Hierarchical compression of the optimal test functions matrix of coefficients
Hierarchical matrices were introduced by Hackbusch [11]. The main idea of the hierarchical compression of a matrix is to store the matrix in a tree-like structure, where:
1. the root node corresponds to the whole matrix;
2. the root node has some number of sons (in our approach, four) corresponding to submatrices of the main matrix;
3. each node can have sons (in our approach, four) corresponding to submatrices (blocks), or can be a leaf representing the corresponding matrix block;
4. each leaf stores its associated matrix block in the SVD-compressed form or as a zero matrix;
5. at each node, the decision about storing the block in SVD form, or dividing the block further into submatrices, depends on an admissibility condition of the block.
An exemplary hierarchical compression of the matrix in the form of a tree is presented in Figure 3, while the algorithm for compressing the matrix into the hierarchical-matrix format is presented in Algorithm 1.
The admissibility condition controls the process of creating the tree: it decides whether a given block should be divided further into submatrices. In our case, the admissibility condition is defined by the following two criteria (a minimal sketch of such a check is given after the list):
1. the size of the block: if the block is larger than a pre-defined maximal admissible size, then it should be divided into submatrices;
2. the leading singular values: if the singular value following the maximal admissible rank is greater than a pre-defined threshold, then the block should be divided into submatrices.
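A minimal sketch of this two-step admissibility check; the size limit, rank bound, and threshold names are illustrative assumptions.

```python
# Sketch of the admissibility check described above.
import numpy as np

def is_admissible(block, max_size=32, max_rank=4, threshold=1e-6):
    """Return True if the block may be stored as a truncated-SVD leaf."""
    if max(block.shape) > max_size:
        return False                                   # too large: subdivide
    s = np.linalg.svd(block, compute_uv=False)         # singular values, descending
    return len(s) <= max_rank or s[max_rank] <= threshold
```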
In the leaves of the tree, we perform a reduced Singular Value Decomposition (rSVD). A reduced singular value decomposition of an $m \times n$ matrix $A$ of rank $k$ is a factorization of the form
$A = U \Sigma V^{T},$
with matrices $U \in \mathbb{R}^{m \times k}$ and $V \in \mathbb{R}^{n \times k}$ having orthonormal columns, and a diagonal matrix $\Sigma \in \mathbb{R}^{k \times k}$ whose diagonal entries are $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_k > 0$ (see Figure 4).
The diagonal entries of $\Sigma$ are called the singular values of $A$. The computational complexity of the reduced SVD of an $m \times n$ block of rank $k$ is $\mathcal{O}(m\, n\, k)$.
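Putting the admissibility check and the rSVD leaves together, a minimal sketch of the recursive compression (in the spirit of Algorithm 1 in the appendix) could look as follows; the `HNode` class and all parameter names are illustrative.

```python
# Sketch of recursive hierarchical compression with truncated-SVD leaves.
import numpy as np

class HNode:
    def __init__(self, U=None, s=None, Vt=None, children=None, shape=None):
        self.U, self.s, self.Vt = U, s, Vt      # rSVD factors (leaf) ...
        self.children = children                # ... or 2x2 list of sub-blocks (internal node)
        self.shape = shape

def compress(A, max_size=32, max_rank=4, threshold=1e-6):
    m, n = A.shape
    s = np.linalg.svd(A, compute_uv=False) if max(m, n) <= max_size else None
    admissible = s is not None and (len(s) <= max_rank or s[max_rank] <= threshold)
    if admissible or min(m, n) <= 1:
        U, sv, Vt = np.linalg.svd(A, full_matrices=False)
        k = max(1, int(np.sum(sv > threshold)))
        return HNode(U[:, :k], sv[:k], Vt[:k, :], shape=(m, n))   # truncated-SVD leaf
    i, j = m // 2, n // 2                       # split into four sub-blocks
    kids = [[compress(A[:i, :j], max_size, max_rank, threshold),
             compress(A[:i, j:], max_size, max_rank, threshold)],
            [compress(A[i:, :j], max_size, max_rank, threshold),
             compress(A[i:, j:], max_size, max_rank, threshold)]]
    return HNode(children=kids, shape=(m, n))
```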
2.4 Matrix-vector multiplication with $\mathcal{H}$-matrices and GMRES solver speedup
The computational cost of matrix-vector multiplication using a single SVD-compressed $m \times n$ block of rank $k$ is $\mathcal{O}\big((m + n)\, k\big)$; with several right-hand-side vectors, the cost scales linearly in their number. This is illustrated in Figure 5(a).
The multiplication of a matrix compressed into a hierarchy of SVD blocks is performed recursively, as illustrated in Figure 5(b). The resulting computational cost of the multiplication is quasi-linear in the matrix size $N$, of order $\mathcal{O}(k\, N \log N)$, where $k$ is the maximal rank of the compressed blocks.
The GMRES algorithm employed for computing the solution includes multiplications of the problem matrix by vectors (see line 1, line 4, and line 5 in Algorithm 9). The application of the stabilization requires replacing the Galerkin matrix by the Petrov-Galerkin product $W^{T}B$, where $W$ is applied in its compressed hierarchical form. If we can apply these matrix-vector multiplications in quasi-linear cost, we can say that our stabilization comes practically for free.
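Building on the compression sketch above, the following sketch applies the compressed coefficient matrix recursively and plugs it into SciPy's GMRES through a `LinearOperator`; compressing $W^{T}$ directly lets the same recursive product be reused in every iteration. All names are illustrative and rely on the hypothetical `HNode`/`compress` helpers from the previous sketch.

```python
# Sketch of the recursive H-matrix/vector product and its use inside GMRES.
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def hmatvec(node, x):
    if node.children is None:                      # leaf: U diag(s) V^T x
        return node.U @ (node.s * (node.Vt @ x))
    (a, b), (c, d) = node.children
    n1 = a.shape[1]                                # column split of the block
    top = hmatvec(a, x[:n1]) + hmatvec(b, x[n1:])
    bot = hmatvec(c, x[:n1]) + hmatvec(d, x[n1:])
    return np.concatenate([top, bot])

def solve_with_hmatrix(B, W, l):
    """Solve W^T B u = W^T l, applying W^T through its compressed hierarchical form."""
    Wt_tree = compress(W.T)                        # compress the transposed coefficient matrix
    n = B.shape[1]
    op = LinearOperator((n, n), matvec=lambda x: hmatvec(Wt_tree, B @ x))
    rhs = hmatvec(Wt_tree, l)
    u, info = gmres(op, rhs)
    return u
```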
2.5 Neural network learning the hierarchical matrices
The hierarchical matrix is obtained by constructing a tree with SVD decompositions of different blocks of the full matrix. The root level corresponds to the entire matrix, and the children correspond to sub-blocks. Only the leaf nodes have blocks stored in the SVD decomposition format. The most expensive part of the compression algorithm is checking the admissibility condition; in particular, checking whether a given block has singular values smaller than the prescribed threshold, which decides whether we partition the block or run the SVD decomposition. The SVD data for the blocks of different sizes can be precomputed and stored in a list, see Figure 6.
From the set of PDE-parameters, we would like to construct a neural network
$\mu \mapsto \mathrm{DNN}(\mu) = \big\{ U_{i}(\mu), \Sigma_{i}(\mu), V_{i}(\mu) \big\}_{i}, \qquad (13)$
whose output is the list of SVD decompositions for all blocks of different dimensions.
Unfortunately, training neural networks for the factors $U_{i}(\mu)$ and $V_{i}(\mu)$ of the different blocks, as functions of $\mu$, does not work. However, it is possible to train the singular values of the different blocks as functions of $\mu$, i.e.,
$\mu \mapsto \mathrm{DNN}(\mu) = \big\{ \Sigma_{i}(\mu) \big\}_{i}. \qquad (14)$
Knowing the singular values $\Sigma_{i}(\mu)$ a priori for a given $\mu$ allows us to construct the structure of the hierarchical matrix and to call the truncated SVD only for the leaves of the matrix, which considerably speeds up the compression algorithm. An illustration of the architecture of the neural network used to learn the singular values is depicted in Figure 7.
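A minimal sketch of this offline training step, using scikit-learn's MLP regressor with ReLU activations and the Adam optimizer (cf. [1, 14]) as an illustrative stand-in for the architecture of Figure 7; the function `block_of` and the sample set are placeholders.

```python
# Sketch: train a network mapping the PDE parameter to the leading singular values of one block.
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_sigma_net(param_samples, block_of, n_sigmas=8):
    """param_samples: 1D array of PDE parameters; block_of(mu): returns one block of W(mu)."""
    X = np.asarray(param_samples).reshape(-1, 1)
    rows = []
    for mu in param_samples:
        s = np.linalg.svd(block_of(mu), compute_uv=False)
        rows.append(np.pad(s[:n_sigmas], (0, max(0, n_sigmas - len(s)))))   # fixed-length target
    net = MLPRegressor(hidden_layer_sizes=(64, 64), activation='relu',
                       solver='adam', max_iter=5000)
    net.fit(X, np.array(rows))     # offline stage
    return net                     # online: net.predict([[mu]]) ~ leading singular values
```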
3 Application
3.1 Two-dimensional Eriksson-Johnson problem
Given a diffusion coefficient $\varepsilon > 0$ and the domain $\Omega = (0,1)^{2}$, we seek the solution of the advection-diffusion problem
$-\varepsilon\, \Delta u + \frac{\partial u}{\partial x} = 0 \ \text{ in } \Omega, \qquad u = g\, \chi_{\Gamma_{\mathrm{in}}} \ \text{ on } \partial\Omega, \qquad (15)$
where $\chi_{\Gamma_{\mathrm{in}}}$ denotes the characteristic function over the inflow boundary $\Gamma_{\mathrm{in}} := \{0\} \times (0,1)$.
The problem is driven by the inflow Dirichlet boundary condition and develops a boundary layer of width $\mathcal{O}(\varepsilon)$ near the outflow boundary $\{1\} \times (0,1)$, as shown in Figure 8.
The weak form with general Dirichlet boundary data will be to find $u$, satisfying the Dirichlet boundary condition, such that
$\varepsilon\, (\nabla u, \nabla v)_{\Omega} + \Big(\frac{\partial u}{\partial x}, v\Big)_{\Omega} = 0, \qquad \forall v \in H^{1}_{0}(\Omega). \qquad (16)$
To simplify the discussion, we approximate the solution using tensor products of one-dimensional B-spline basis functions of uniform order $p$ in all directions. This discrete trial space will be split as $U_h = U_h^{0} \oplus U_h^{\partial}$, where $U_h^{0}$ contains all the basis functions vanishing at $\partial\Omega$, and $U_h^{\partial}$ is the complementary subspace containing the basis functions associated with boundary nodes. Our discrete solution will be $u_h = u_h^{0} + u_h^{\partial}$, where $u_h^{0} \in U_h^{0}$ is unknown and $u_h^{\partial} \in U_h^{\partial}$ is directly obtained using the boundary data.
We build the test space using a larger polynomial order $q > p$. That is, we use the tensor product of one-dimensional B-spline basis functions of order $q$ vanishing over $\partial\Omega$. This discrete test space will be denoted by $V_m$. The discrete residual minimization problem will be to find $u_h^{0} \in U_h^{0}$ and $r_m \in V_m$ such that
$(r_m, v_m)_V + b(u_h^{0}, v_m) = l(v_m) - b(u_h^{\partial}, v_m), \qquad \forall v_m \in V_m, \qquad (17a)$
$b(w_h, r_m) = 0, \qquad \forall w_h \in U_h^{0}. \qquad (17b)$
The reduced matrix system associated with (17) takes the form:
$W(\varepsilon)^{T} B(\varepsilon)\, \mathbf{u} = W(\varepsilon)^{T}\, \mathbf{l}, \qquad W(\varepsilon) = G^{-1}B(\varepsilon) = \varepsilon\, G^{-1}B_{1} + G^{-1}B_{0}, \qquad (18)$
where $B_{1}$ and $B_{0}$ are the matrices associated with the diffusion and advection parts of the bilinear form, respectively, and $\mathbf{l}$ collects the right-hand-side and Dirichlet-lifting contributions of (17a).
We train a neural network for the diagonal matrix $\Sigma$ of the SVD decomposition of the entire matrix $W(\varepsilon)$ (see Figure 9), as well as for the SVD decompositions of the sub-matrices obtained by recursive partitions of $W(\varepsilon)$, for a range of values of $\varepsilon$. The white parts correspond to the boundary nodes, where we have enforced the boundary conditions. The convergence of the training procedures is presented in Figure 10.
In an online stage, for a given diffusion coefficient $\varepsilon$, we perform the compression of the matrix $W(\varepsilon)$ into the hierarchical matrix $\mathcal{H}(W(\varepsilon))$ using Algorithm 6, where the admissibility condition is now provided by the trained neural network. The compressed hierarchical matrices are illustrated in Figure 11.
Having the compressed matrix $\mathcal{H}(W(\varepsilon))$, we employ the GMRES algorithm [19] for solving the linear system (18). To avoid computations with a dense matrix, we note that the GMRES method involves computations of the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$, and the hierarchical format enables the matrix-vector multiplications with $B$ and $W^{T}$ in a quasi-linear computational cost.
| $\varepsilon$ | Compress flops with DNN | Compress flops without DNN | $\mathcal{H}$-matrix $\times$ vector flops | $B \times$ vector flops | # iter GMRES ($\mathcal{H}$-matrix) | Total flops |
|---|---|---|---|---|---|---|
| 0.1 | 14,334 | 158,946 | 41,163 | 9,152 | 90 | 3,728,166 |
|  | 31,880 | 117,922 | 33,100 | 9,002 | 76 | 3,232,392 |
| $\varepsilon$ | # iter GMRES Galerkin | Flops per iteration | Total flops |
|---|---|---|---|
| 0.1 | 89 | 17,536 | 1,560,704 |
|  | 65 | 17,536 | 1,139,840 |
In Table 1 we summarize the computational costs of our solver for two values of $\varepsilon$. The computational mesh was a tensor product of quadratic B-splines with 26 elements along the $x$-axis and quadratic B-splines with 10 elements along the $y$-axis. As we can read from the second and third columns, the DNN speeds up the compression of the matrix of optimal test functions' coefficients roughly ten times. We employ the GMRES solver, which computes the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$. The cost of multiplication by the hierarchical matrix and the cost of multiplication by $B$ are included in the fourth and fifth columns of Table 1. The total cost of GMRES with hierarchical matrices augmented by DNN compression is equal to the compression cost with DNN plus the number of iterations times the sum of the multiplication costs of the hierarchical matrix and of $B$. The total cost is presented in the last column of Table 1.
For comparison, we run the GMRES algorithm on the Galerkin method. The comparison is summarized in Table 2. The number of iterations, the cost per iteration, and the total cost are presented there. We can see that the cost of the stabilized solution is of the same order as the cost of the Galerkin solution. We compare here the costs of the correct solution obtained from the stable Petrov-Galerkin method with the cost of the incorrect solution from the unstable Galerkin method.
The numerical results are compared with the exact solutions in Figures 12 and 13 for two different choices of the inflow Dirichlet data.
3.2 Helmholtz problem
Given a wavenumber $\omega > 0$ and the domain $\Omega = (0,1)^{2}$, we seek the solution of the Helmholtz problem
$-\Delta u - \omega^{2} u = f \quad \text{in } \Omega, \qquad (19)$
with the right-hand side and boundary data chosen such that the exact solution is known in closed form.
We employ a tensor-product finite-element mesh. The trial space is constructed from quadratic B-splines. The test space is constructed from quadratic B-splines with separators (equivalent to Lagrange polynomials). The dependence of the coefficients of the optimal test functions on $\omega$ for the Helmholtz problem has the affine structure described in Section 2.2; thus, we can construct offline the function
$\omega \mapsto W(\omega) = G^{-1} B(\omega), \qquad (20)$
where $W(\omega)$ is the matrix of the coefficients of the optimal test functions. We fix the trial and test spaces used for approximation of the solution and stabilization of the Petrov-Galerkin formulation. Next, we consider blocks of different sizes of the matrix $W(\omega)$, and we train the singular values of these different blocks as functions of $\omega$, i.e.,
$\omega \mapsto \mathrm{DNN}(\omega) = \big\{ \Sigma_{i}(\omega) \big\}_{i}. \qquad (21)$
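As a concrete instance of the affine structure of Section 2.2, the online evaluation of $W(\omega)$ can be sketched as follows, assuming the standard weak-form splitting $B(\omega) = B_{\mathrm{stiff}} - \omega^{2} B_{\mathrm{mass}}$; the matrix names are illustrative.

```python
# Sketch: online evaluation of W(omega) for the Helmholtz family, reusing two
# precomputed, parameter-independent solves (cf. the offline/online sketch in Section 2.2).
import numpy as np

def W_of_omega(omega, G, B_stiff, B_mass):
    GinvK = np.linalg.solve(G, B_stiff)     # offline-cacheable
    GinvM = np.linalg.solve(G, B_mass)      # offline-cacheable
    return GinvK - omega**2 * GinvM         # W(omega) = G^{-1} B(omega)
```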
The convergence of the training procedure is presented in Figure 14. Knowing the singular values a priori for a given $\omega$ allows us to construct the structure of the hierarchical matrix, with the block structure and the $\Sigma_{i}(\omega)$ obtained from the neural networks.
Figure 15 depicts exemplary resulting hierarchical matrices.
Table 3 summarizes the computational costs of our solver for two values of $\omega$. The computational mesh was a tensor product of quadratic B-splines with 10 elements along the $x$- and $y$-axes. The DNN allows obtaining the compression of the matrix of optimal test functions' coefficients practically for free. We employ the GMRES solver, which computes the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$. The cost of multiplication by the hierarchical matrix and the cost of multiplication by $B$ are included in the fourth and fifth columns of Table 3. The total cost of GMRES with hierarchical matrices augmented by DNN compression is equal to the number of iterations times the sum of the multiplication costs of the hierarchical matrix and of $B$. The total cost is presented in the last column of Table 3.
We present in Table 4 the cost of the GMRES algorithm executed on the Galerkin method: the number of iterations, the cost per iteration, and the total cost. We can see that the cost of obtaining the stabilized solution is of the same order as the cost of the Galerkin method. We compare here the cost of the correct solution obtained from the Petrov-Galerkin method with the cost of the incorrect solution from the unstable Galerkin method.
The comparison between the solution obtained with the Petrov-Galerkin formulation, with the optimal test functions generated by the DNN, and the exact solution is presented in Figure 16.
| $\omega$ | Compress flops with DNN | Compress flops without DNN | $\mathcal{H}$-matrix $\times$ vector flops | $B \times$ vector flops | # iter GMRES ($\mathcal{H}$-matrix) | Total flops |
|---|---|---|---|---|---|---|
| 1 | 0 | 129,788 | 47,649 | 8,901 | 10 | 448,284 |
| 10 | 0 | 129,788 | 47,649 | 8,877 | 31 | 1,752,306 |
| $\omega$ | # iter GMRES Galerkin | Flops per iteration | Total flops |
|---|---|---|---|
| 1 | 10 | 14,444 | 144,440 |
| 10 | 31 | 14,444 | 447,764 |
4 Conclusions
We have employed the Petrov-Galerkin formulation with optimal test functions for the stabilization of difficult problems. We have focused on advection-dominated diffusion and Helmholtz problems. During the offline phase, we explicitly compute the matrix of coefficients of optimal test functions for any PDE-parameter. We have also trained neural networks to compute, for each PDE-parameter, the bottleneck of the hierarchical matrix compression. During the online phase, we rapidly compute the matrix compression using the neural networks, and we run the GMRES iterative solver on the reduced Petrov-Galerkin linear system, where matrix-vector multiplications are performed in a quasi-linear computational cost due to the hierarchical structure of the low-rank decomposition used. Thus, we obtain the online stabilization practically for free.
Acknowledgments
This work was supported by the European Union’s Horizon 2020 Research and Innovation Program under the Marie Skłodowska-Curie grant agreement No. 777778 (MATHROCKS). The research project was partly supported by the program “Excellence initiative – research university” for the University of Science and Technology.
Appendix A Algorithms
A.1 Recursive hierarchical compression of the matrix
where $N$ is the size of the matrix to be compressed
A.2 Checking of the admissibility condition
A.3 Matrix vector multiplication
A.4 rSVD compression of a block
A.5 Pseudo-code of the GMRES algorithm
A.6 Recursive hierarchical compression of the matrix augmented by neural network
where $N$ is the size of the matrix to be compressed
References
- [1] A. F. Agarap, Deep learning using rectified linear units (ReLU), arXiv preprint arXiv:1803.08375 (2018).
- [2] P. R. Amestoy and I. S. Duff, Multifrontal parallel distributed symmetric and unsymmetric solvers, Computer Methods in Applied Mechanics and Engineering, 184 (2000) 501-520.
- [3] P. R. Amestoy, I. S. Duff, J. Koster and J-Y L’Excellent, A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM Journal on Matrix Analysis and Applications, 1(23) (2001) 15-41.
- [4] P. R. Amestoy, A. Guermouche, J-Y L’Excellent and S. Pralet, Hybrid scheduling for the parallel solution of linear systems, Computer Methods in Applied Mechanics and Engineering 2(32) (2001), 136–156.
- [5] V. M. Calo, Residual-based multiscale turbulence modeling: Finite volume simulations of bypass transition, Stanford University, Ph.D. Thesis (2005)
- [6] V. M. Calo, M. Łoś, Q. Deng, I. Muga, M. Paszyński, Isogeometric residual minimization method (iGRM) with direction splitting preconditioner for stationary advection-dominated diffusion problems, Computer Methods in Applied Mechanics and Engineering 373 (2021) 113214.
- [7] J. Chan, J. A. Evans, A minimal-residual finite element method for the convection-diffusion equations, ICES-REPORT 13-12 (2013)
- [8] J. Chan, J. A. Evans, W. Qiu, A dual Petrov–Galerkin finite element method for the convection–diffusion equation, Computers & Mathematics with Applications 68(11) (2014) 1513–1529
- [9] A. Ern, J.-L. Guermond. Theory and practice of finite elements. Vol. 159. Springer, 2013
- [10] O.G. Ernst, M. J. Gander, Why it is Difficult to Solve Helmholtz Problems with Classical Iterative Methods. In: Graham, I., Hou, T., Lakkis, O., Scheichl, R. (eds.) Numerical Analysis of Multiscale Problems. Lecture Notes in Computational Science and Engineering, vol 83. Springer, Berlin, Heidelberg (2012)
- [11] W. Hackbusch, Hierarchical Matrices: Algorithms and Analysis, Springer (2009)
- [12] R. A. Horn, C. R. Johnson, Matrix Analysis, Cambridge University Press (1990)
- [13] T.J.R. Hughes, L.P. Franca, M. Mallet, A new finite element formulation for fluid dynamics: VI. Convergence analysis of the generalized SUPG formulation for linear time-dependent multidimensional advective–diffusive systems, Computer Methods in Applied Mechanics and Engineering, 6 (1987) 97–112.
- [14] D. P. Kingma, J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
- [15] M. Łoś, I. Muga, J. Muñoz-Matute, M. Paszyński, Isogeometric Residual Minimization Method (iGRM) with direction splitting for non-stationary advection–diffusion problems, Computers & Mathematics with Applications, 79(2) (2020) 213-229.
- [16] M. Łoś, I. Muga, J. Muñoz-Matute, M. Paszyński, Isogeometric residual minimization (iGRM) for non-stationary Stokes and Navier–Stokes problems, Computers & Mathematics with Applications, 95(1) (2021) 200-214.
- [17] I. Muga, K. G. van der Zee, Discretization of Linear Problems in Banach Spaces: Residual Minimization, Nonlinear Petrov–Galerkin, and Monotone Mixed Methods, SIAM Journal on Numerical Analysis 58(6) (2020) 3406–3426.
- [18] J. N. Reddy, An Introduction to the Finite Element Method, McGraw-Hill (2006)
- [19] Y. Saad, Iterative Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics; 2nd edition (2003)
- [20] R. Stevenson, J. Westerdiep, Minimal residual space-time discretizations of parabolic equations: Asymmetric spatial operators, Computers & Mathematics with Applications 101 (2021) 107-118