Automatic stabilization of finite-element simulations using neural networks and hierarchical matrices
Abstract
Petrov-Galerkin formulations with optimal test functions allow for the stabilization of finite element simulations. In particular, given a discrete trial space, the optimal test space induces a numerical scheme delivering the best approximation in terms of a problem-dependent energy norm. This ideal approach has two shortcomings: first, we need to explicitly know the set of optimal test functions; and second, the optimal test functions may have large supports inducing expensive dense linear systems.
Nevertheless, parametric families of PDEs are an example where it is worth investing some (offline) computational effort to obtain stabilized linear systems that can be solved efficiently, for a given set of parameters, in an online stage. Therefore, as a remedy for the first shortcoming, we explicitly compute (offline) a function mapping any PDE-parameter to the matrix of coefficients of optimal test functions (in a basis expansion) associated with that PDE-parameter. Next, as a remedy for the second shortcoming, we use low-rank approximations to hierarchically compress the (non-square) matrix of coefficients of optimal test functions. In order to accelerate this process, we train a neural network to learn a critical bottleneck of the compression algorithm (for a given set of PDE-parameters). When solving online the resulting (compressed) Petrov-Galerkin formulation, we employ a GMRES iterative solver with inexpensive matrix-vector multiplications thanks to the low-rank features of the compressed matrix. We perform experiments showing that the full online procedure is as fast as the original (unstable) Galerkin approach. We illustrate our findings by means of 2D Eriksson-Johnson and Helmholtz model problems.
keywords:
Petrov-Galerkin method, optimal test functions, parametric PDEs, automatic stabilization, neural networks, hierarchical matrices
1 Introduction
Difficult finite-element simulations solved with the Galerkin method (where we employ the same trial and test space) often generate incorrect numerical results with oscillations or spurious behavior. Examples of such problems are the advection-dominated diffusion equation [7] or the Helmholtz equation [10].
Petrov-Galerkin formulations (i.e., formulations where trial and test spaces are not equal, although they share the same dimension) [18] with optimal test functions allow for automatic stabilization of difficult finite-element simulations. This particular Petrov-Galerkin approach is equivalent to the residual minimization (RM) method [7], whose applications include advection-diffusion [16, 6], Navier-Stokes [15], and space-time formulations [20].
In general, for a fixed trial space, RM allows for stabilization by enriching the discrete test space where the optimal test functions of the associated Petrov-Galerkin method live (contrary to standard stabilization methods like SUPG [5, 13], there are no special stabilization terms modifying the weak formulation). If we knew explicitly the formulas for the optimal test functions expanded in the enriched discrete test space, then we could return to the Petrov-Galerkin formulation and solve problems in the best possible way for a given trial space.
This scenario has two shortcomings. The first problem is that the computation of optimal test functions is expensive. It requires solving a large system of linear equations (as large as the dimension of the enriched discrete test space) with multiple right-hand sides (one right-hand side per basis function of the trial space). The second problem is that the optimal test functions can have global supports, and thus the Petrov-Galerkin method with optimal test functions can generate a dense matrix, which is expensive to solve.
Nevertheless, parametric families of partial differential equations (PDEs) are an example where it is worth investing some (offline) computational effort to obtain stabilized linear systems that can be solved efficiently, for a given set of parameters, in an online stage. Therefore, in the context of a parametric family of PDEs, and as a remedy for the first shortcoming, we explicitly compute (offline) a function mapping any PDE-parameter to the matrix of coefficients of optimal test functions (expanded in the basis of the enriched discrete test space) associated with that PDE-parameter. We emphasize that this last procedure is independent of particular right-hand side sources and/or prescribed boundary data of the PDE family.
The obtained matrix of optimal test function coefficients is dense. The Petrov-Galerkin method induces a linear system of the form $W^{T}B\,\mathbf{u} = W^{T}\mathbf{l}$, where $\mathbf{l}$ is a right-hand side vector, $B$ is the matrix associated with the bilinear form of the underlying PDE, and $W$ is the matrix of coefficients of the optimal test functions. To avoid dense matrix computations, we compress $W$ using the approach of hierarchical matrices [11]. Having the matrix compressed into a hierarchical matrix $\mathcal{H}(W)$, we employ the GMRES method [19], which involves computations of the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$, and the hierarchical format enables the matrix-vector multiplications with $\mathcal{H}(W)^{T}$ and $B$ in a quasi-linear computational cost.
However, compressing the matrix $W$ for each PDE-parameter is an expensive procedure. Thus, with the help of an artificial neural network, we learn (offline) the critical bottleneck of the compression algorithm. We obtain a stabilized method with an additional quasi-linear cost resulting from matrix-vector multiplications within the GMRES method. From our numerical results with the Eriksson-Johnson and Helmholtz model problems, this cost of stabilization (the cost of compression of the hierarchical matrix and the cost of GMRES with hierarchical matrix-vector multiplications) is of the same order as the cost of the solution of the original Galerkin problem without the stabilization. Thus, we claim we can obtain stabilization with neural networks and hierarchical matrices practically for free.
1.1 One-dimensional illustration of stabilization
Let us illustrate how we stabilize difficult computational problems by means of a one-dimensional example for the advection-dominated diffusion model.
Given a diffusion coefficient $\epsilon > 0$, consider the following differential equation:
$-\epsilon\, u'' + u' = f \quad \text{in } (0, 1), \qquad u(0) = u(1) = 0. \qquad (1)$
In weak form, equation (1) translates into: find $u \in H_0^1(0, 1)$ such that
$b(u, v) := \epsilon\, (u', v') + (u', v) = (f, v) =: l(v), \qquad \forall v \in H_0^1(0, 1). \qquad (2)$
We define discrete spaces $U_h$ and $V_m$ as depicted in Figure 1. Given a regular mesh, $U_h$ will be the space of piecewise-linear continuous functions, while $V_m$ will be the space of piecewise-quadratic continuous functions. On one side, we discretize formulation (2) using a standard Galerkin method where trial and test spaces are both equal to $U_h$. On the other side, we discretize (2) by means of a residual minimization (RM) technique that uses $U_h$ as the trial space and $V_m$ as the test space. Figure 2 compares the discrete solutions delivered by these two methods for different values of $\epsilon$ and different meshes parametrized by the number of elements. We observe a superior performance of the residual minimization method, even though the approximation (trial) space used is poorer than that of the Galerkin method.
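The following NumPy sketch reproduces this comparison on a small scale. The source term $f = 1$ and the homogeneous Dirichlet boundary conditions are illustrative assumptions (not necessarily the data used to produce the figures), and the test space is equipped with the $H^1$ inner product.

```python
# Minimal sketch: Galerkin on piecewise-linear U_h vs. residual minimization with
# piecewise-quadratic test space V_m, for -eps*u'' + u' = 1, u(0) = u(1) = 0 (assumed data).
import numpy as np

def solve_1d(eps, n):
    h = 1.0 / n
    # 3-point Gauss rule mapped to the reference element [0, 1]
    gq = np.array([0.5 - np.sqrt(0.6) / 2, 0.5, 0.5 + np.sqrt(0.6) / 2])
    gw = np.array([5.0, 8.0, 5.0]) / 18.0
    # reference shape functions (values / derivatives at the quadrature points)
    p1, dp1 = np.array([1 - gq, gq]), np.array([[-1.0] * 3, [1.0] * 3])
    p2 = np.array([2*gq**2 - 3*gq + 1, 4*gq - 4*gq**2, 2*gq**2 - gq])
    dp2 = np.array([4*gq - 3, 4 - 8*gq, 4*gq - 1])

    nU, nV = n + 1, 2*n + 1
    B = np.zeros((nV, nU)); G = np.zeros((nV, nV))      # mixed form: b(u,v), (v,w)_V
    Bg = np.zeros((nU, nU))                              # Galerkin: trial = test = P1
    l = np.zeros(nV); lg = np.zeros(nU)
    for e in range(n):
        iu, iv = [e, e + 1], [2*e, 2*e + 1, 2*e + 2]
        for q in range(3):
            w = gw[q] * h
            for a in range(3):
                l[iv[a]] += w * p2[a, q]                                     # f = 1 (assumed)
                for b in range(2):                                           # eps u'v' + u'v
                    B[iv[a], iu[b]] += w * (eps*dp1[b, q]*dp2[a, q]/h**2 + dp1[b, q]*p2[a, q]/h)
                for c in range(3):                                           # H^1 Gram: vw + v'w'
                    G[iv[a], iv[c]] += w * (p2[a, q]*p2[c, q] + dp2[a, q]*dp2[c, q]/h**2)
            for a in range(2):
                lg[iu[a]] += w * p1[a, q]
                for b in range(2):
                    Bg[iu[a], iu[b]] += w * (eps*dp1[b, q]*dp1[a, q]/h**2 + dp1[b, q]*p1[a, q]/h)
    # homogeneous Dirichlet BCs: drop first/last dof of each space
    U, V = slice(1, nU - 1), slice(1, nV - 1)
    u_gal, u_rm = np.zeros(nU), np.zeros(nU)
    u_gal[U] = np.linalg.solve(Bg[U, U], lg[U])                              # standard Galerkin
    GinvB = np.linalg.solve(G[V, V], B[V, U])                                # residual minimization:
    u_rm[U] = np.linalg.solve(B[V, U].T @ GinvB,                             # B^T G^{-1} B u = B^T G^{-1} l
                              B[V, U].T @ np.linalg.solve(G[V, V], l[V]))
    return u_gal, u_rm

u_gal, u_rm = solve_1d(eps=1e-2, n=10)   # u_gal typically oscillates here, u_rm stays much smoother
```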
1.2 Outline of the paper
The structure of the paper is the following. Section 2 contains all the theoretical ingredients needed to understand our approach. Namely, Petrov-Galerkin formulations with optimal test functions (section 2.1); optimal test functions for an affine family of PDEs (section 2.2); the hierarchical compression of the optimal test functions matrix of coefficients (section 2.3); the fast hierarchical matrix-vector multiplication and related fast implementation of the GMRES solver (section 2.4); and the neural-network acceleration of the hierarchical matrix compression (section 2.5). Next, in section 3 we apply our automatic stabilization procedure to well-known unstable parametric model problems. First, for the two-dimensional Eriksson-Johnson model problem (section 3.1); and next, for a two-dimensional Helmholtz equation (section 3.2). We write our conclusions in section 4. All the pseudo-code algorithms needed to follow our methodology are collected in the Appendix.
2 Theoretical ingredients
2.1 Petrov-Galerkin formulations with optimal test functions
Let $U$ and $V$ be Hilbert spaces. We consider a general variational formulation of a PDE, which is to find $u \in U$ such that
$b(u, v) = l(v), \qquad \forall v \in V, \qquad (3)$
where $b: U \times V \to \mathbb{R}$ is a continuous bilinear form, and $l$ is a continuous linear functional (i.e., $l \in V^{*}$, the dual space of $V$).
The dual space $V^{*}$ has a norm inherited from the norm of $V$. Indeed, if $(\cdot,\cdot)_V$ denotes the inner product of the Hilbert space $V$, then these norms are given by the following expressions:
$\|v\|_{V} := \sqrt{(v, v)_V} \qquad \text{and} \qquad \|f\|_{V^{*}} := \sup_{0 \neq v \in V} \frac{|f(v)|}{\|v\|_{V}}.$
We will assume well-posedness of problem (3), which translates into the well-known inf-sup conditions (see, e.g., [9, Theorem 2.6]):
$\exists\, \gamma > 0: \quad \inf_{0 \neq w \in U}\ \sup_{0 \neq v \in V}\ \frac{b(w, v)}{\|w\|_{U}\, \|v\|_{V}} \geq \gamma, \qquad (4a)$
$\{ v \in V:\ b(w, v) = 0,\ \forall w \in U \} = \{0\}. \qquad (4b)$
Notice that the continuity assumption on the bilinear form $b(\cdot,\cdot)$ also implies the existence of a constant $M > 0$ such that:
$b(w, v) \leq M\, \|w\|_{U}\, \|v\|_{V}, \qquad \forall w \in U,\ \forall v \in V. \qquad (5)$
Given a discrete finite-element space $U_h \subset U$, a natural candidate to approximate the solution to problem (3) is the residual minimizer
$u_h := \operatorname*{arg\,min}_{w_h \in U_h}\ \| l - b(w_h, \cdot) \|_{V^{*}}. \qquad (6)$
Indeed, combining (4a), (6), and (5), the residual minimizer automatically satisfies the quasi-optimality property:
$\| u - u_h \|_{U} \leq \frac{M}{\gamma}\, \inf_{w_h \in U_h} \| u - w_h \|_{U}.$
Thus, the residual minimizer inherits stability properties from the continuous problem.
It is well-known (see, e.g., [8]) that the saddle-point formulation of residual minimization (6) becomes the mixed problem that aims to find $u_h \in U_h$, and a residual representative $r \in V$, such that:
$(r, v)_V + b(u_h, v) = l(v), \qquad \forall v \in V, \qquad (7a)$
$b(w_h, r) = 0, \qquad \forall w_h \in U_h. \qquad (7b)$
Although it seems harmless, the mixed problem (7) is still infinite-dimensional in the test space $V$. To obtain a computable version, we introduce a discrete test space $V_m \subset V$ that turns (7) into the fully discrete problem to find $u_h \in U_h$ (notice the abuse of notation: this new solution of (8) does not equal the exact residual minimizer solving (6), or equivalently (7)) and a residual representative $r_m \in V_m$, such that:
$(r_m, v_m)_V + b(u_h, v_m) = l(v_m), \qquad \forall v_m \in V_m, \qquad (8a)$
$b(w_h, r_m) = 0, \qquad \forall w_h \in U_h. \qquad (8b)$
Problem (8) corresponds to the saddle-point formulation of a discrete-dual residual minimization, in which the dual norm in (6) is replaced by the discrete-dual norm $\|\cdot\|_{V_m^{*}}$. Well-posedness and stability of (8) have been extensively studied in [17] and depend on a Fortin compatibility condition between the discrete spaces $U_h$ and $V_m$. Moreover, once this condition is fulfilled (and in order to gain stability), it is possible to enrich the test space $V_m$ without changing the trial space $U_h$. Obviously, this process will enlarge the linear system (8). Nevertheless, we know that there is an equivalent linear system of the same size as the dimension of the trial space, delivering the same solution as (8). This is known as the Petrov-Galerkin method with optimal test functions, which we describe below. Our goal will be to make this generally impractical method practical.
Let us introduce now the concept of optimal test functions. For each $w_h \in U_h$, the optimal test function $t_{w_h} \in V_m$ is defined as the (discrete) Riesz representative of the functional $b(w_h, \cdot)$, i.e.,
$(t_{w_h}, v_m)_V = b(w_h, v_m), \qquad \forall v_m \in V_m. \qquad (9)$
Testing equation (8a) with optimal test functions $v_m = t_{w_h}$, and using (9) together with (8b), we arrive at the following Petrov-Galerkin system with optimal test functions: find $u_h \in U_h$ such that
$b(u_h, t_{w_h}) = l(t_{w_h}), \qquad \forall w_h \in U_h. \qquad (10)$
In order to make explicit a matrix expression for (10), let us set $U_h = \operatorname{span}\{u_1, \dots, u_n\}$ and $V_m = \operatorname{span}\{v_1, \dots, v_m\}$. Consider the matrix $B \in \mathbb{R}^{m \times n}$ linked to the bilinear form such that its $(i,j)$-entry is $B_{ij} = b(u_j, v_i)$. Analogously, we consider the Gram matrix $G \in \mathbb{R}^{m \times m}$ linked to the inner product such that $G_{ij} = (v_j, v_i)_V$. The optimal test space is defined as $V_{\mathrm{opt}} := \operatorname{span}\{t_{u_1}, \dots, t_{u_n}\}$. Thus, using (9), we observe that the matrix containing the coefficients of the optimal test functions when expanded in the basis of $V_m$ is $W = G^{-1}B$. Moreover, the Petrov-Galerkin system (10) becomes
$W^{T} B\, \mathbf{u} = W^{T} \mathbf{l}, \qquad (11)$
where the vector $\mathbf{u}$ contains the coefficients of the expansion of $u_h$ in the basis of $U_h$, $\mathbf{l}_i = l(v_i)$, and we have used the fact that $G^{T} = G$. Therefore, if we aim to solve the Petrov-Galerkin linear system (11), an optimized matrix-vector multiplication to perform $B\,\mathbf{x}$ and $W^{T}\mathbf{y}$ becomes critical. Section 2.3 is devoted to studying the hierarchical compression of $W$, which allows for fast matrix-vector multiplications.
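For illustration, a minimal NumPy sketch of (9)-(11): given assembled $B$, $G$, and $\mathbf{l}$, it forms the coefficients of the optimal test functions and solves the Petrov-Galerkin system. The function name and the toy data below are ours, not the paper's.

```python
# Sketch of the Petrov-Galerkin system with optimal test functions (11),
# assuming B (bilinear form), G (test-space Gram matrix) and l are already assembled.
import numpy as np

def petrov_galerkin_solve(B, G, l):
    """B: (m, n) bilinear-form matrix, G: (m, m) Gram matrix, l: (m,) load vector."""
    W = np.linalg.solve(G, B)          # coefficients of the optimal test functions, W = G^{-1} B
    lhs = W.T @ B                      # n x n Petrov-Galerkin matrix  W^T B
    rhs = W.T @ l                      # n-vector                      W^T l
    return np.linalg.solve(lhs, rhs), W

# toy usage with a random SPD Gram matrix (illustration only)
rng = np.random.default_rng(0)
m, n = 12, 6
A = rng.standard_normal((m, m)); G = A @ A.T + m * np.eye(m)
B = rng.standard_normal((m, n)); l = rng.standard_normal(m)
u, W = petrov_galerkin_solve(B, G, l)
```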
2.2 Optimal test functions for an affine family of parametric PDEs
Assume we want to solve parametric PDEs in variational form, i.e., find $u_{\mu} \in U$ such that
$b_{\mu}(u_{\mu}, v) = l(v), \qquad \forall v \in V,$
where for each set of parameters $\mu$, the bilinear form $b_{\mu}(\cdot,\cdot)$ is continuous and inf-sup stable, with constants that may depend on $\mu$. Moreover, we assume that $b_{\mu}$ has the affine decomposition:
$b_{\mu}(u, v) = \sum_{q=1}^{Q} \theta_{q}(\mu)\, b_{q}(u, v),$
where the scalar coefficients $\theta_{q}(\mu)$ and the parameter-independent bilinear forms $b_{q}(\cdot,\cdot)$ are accessible and easy to compute. When the trial and test spaces are discretized, the bilinear form induces a matrix of the form:
$B(\mu) = \sum_{q=1}^{Q} \theta_{q}(\mu)\, B_{q}.$
Thus, the matrix of coefficients of optimal test functions becomes in this case:
$W(\mu) = G^{-1}B(\mu) = \sum_{q=1}^{Q} \theta_{q}(\mu)\, G^{-1}B_{q}. \qquad (12)$
Equation (18) shows the particular form that expression (12) takes for the Eriksson-Johnson model problem, where knowing the precomputed matrices $G^{-1}B_{q}$ implies the knowledge of $W(\varepsilon)$ for any diffusion coefficient $\varepsilon$.
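The offline/online split implied by (12) can be sketched as follows; the helper names and the generic list of parameter-independent terms are illustrative assumptions, not the paper's notation.

```python
# Sketch of the offline/online split for an affine parametric family (Section 2.2).
import numpy as np

def offline(G, B_list):
    # precompute G^{-1} B_q once per parameter-independent term
    return [np.linalg.solve(G, Bq) for Bq in B_list]

def online_W(theta, GinvB_list):
    # W(mu) = sum_q theta_q(mu) * G^{-1} B_q  -- cheap to evaluate for any new parameter value
    return sum(t * M for t, M in zip(theta, GinvB_list))
```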
2.3 Hierarchical compression of the optimal test functions matrix of coefficients
Hierarchical matrices were introduced by Hackbusch [11]. The main idea of the hierarchical compression of a matrix is to store the matrix in a tree-like structure, where:
1. the root node corresponds to the whole matrix;
2. the root node has some number of sons (in our approach, four) corresponding to submatrices of the main matrix;
3. each node can have sons (in our approach, four) corresponding to submatrices (blocks), or can be a leaf representing the corresponding matrix block;
4. each leaf stores its associated matrix block in the SVD-compressed form or as a zero matrix;
5. at each node, the decision about storing the block in SVD form, or dividing the block further into submatrices, depends on an admissibility condition of the block.
An exemplary hierarchical compression of the matrix in the form of a tree is presented in Figure 3, while the algorithm for compressing the matrix into the hierarchical-matrix format is presented in Algorithm 1.
The admissibility condition controls the process of creating the tree: it decides whether a given block should be divided further into submatrices. In our case, the admissibility condition is defined by the following two criteria (a minimal sketch of such a check is given after the list):
1. the size of the block: if the block is larger than a pre-defined maximal admissible size, then it should be divided into submatrices;
2. the leading singular values: if the singular value following the maximal admissible rank is greater than a pre-defined threshold, then the block should be divided into submatrices.
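A minimal sketch of this two-step admissibility check; the size limit, rank bound, and threshold names are illustrative assumptions.

```python
# Sketch of the admissibility check described above.
import numpy as np

def is_admissible(block, max_size=32, max_rank=4, threshold=1e-6):
    """Return True if the block may be stored as a truncated-SVD leaf."""
    if max(block.shape) > max_size:
        return False                                   # too large: subdivide
    s = np.linalg.svd(block, compute_uv=False)         # singular values, descending
    return len(s) <= max_rank or s[max_rank] <= threshold
```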
In the leaves of the tree, we perform a reduced Singular Value Decomposition (rSVD). A reduced singular value decomposition of an $m \times n$ matrix $A$ of rank $k$ is a factorization of the form
$A = U \Sigma V^{T},$
with matrices $U \in \mathbb{R}^{m \times k}$ and $V \in \mathbb{R}^{n \times k}$ having orthonormal columns, and a diagonal matrix $\Sigma \in \mathbb{R}^{k \times k}$ whose diagonal entries are $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_k > 0$ (see Figure 4).
The diagonal entries of $\Sigma$ are called the singular values of $A$. The computational complexity of the reduced SVD of an $m \times n$ block of rank $k$ is $\mathcal{O}(m\, n\, k)$.
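Putting the admissibility check and the rSVD leaves together, a minimal sketch of the recursive compression (in the spirit of Algorithm 1 in the appendix) could look as follows; the `HNode` class and all parameter names are illustrative.

```python
# Sketch of recursive hierarchical compression with truncated-SVD leaves.
import numpy as np

class HNode:
    def __init__(self, U=None, s=None, Vt=None, children=None, shape=None):
        self.U, self.s, self.Vt = U, s, Vt      # rSVD factors (leaf) ...
        self.children = children                # ... or 2x2 list of sub-blocks (internal node)
        self.shape = shape

def compress(A, max_size=32, max_rank=4, threshold=1e-6):
    m, n = A.shape
    s = np.linalg.svd(A, compute_uv=False) if max(m, n) <= max_size else None
    admissible = s is not None and (len(s) <= max_rank or s[max_rank] <= threshold)
    if admissible or min(m, n) <= 1:
        U, sv, Vt = np.linalg.svd(A, full_matrices=False)
        k = max(1, int(np.sum(sv > threshold)))
        return HNode(U[:, :k], sv[:k], Vt[:k, :], shape=(m, n))   # truncated-SVD leaf
    i, j = m // 2, n // 2                       # split into four sub-blocks
    kids = [[compress(A[:i, :j], max_size, max_rank, threshold),
             compress(A[:i, j:], max_size, max_rank, threshold)],
            [compress(A[i:, :j], max_size, max_rank, threshold),
             compress(A[i:, j:], max_size, max_rank, threshold)]]
    return HNode(children=kids, shape=(m, n))
```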
2.4 Matrix-vector multiplication with $\mathcal{H}$-matrices and GMRES solver speedup
The computational cost of matrix-vector multiplication using a single SVD-compressed $m \times n$ block of rank $k$ is $\mathcal{O}\big((m + n)\, k\big)$; with several right-hand-side vectors, the cost scales linearly in their number. This is illustrated in Figure 5(a).
The multiplication of a matrix compressed into a hierarchy of SVD blocks is performed recursively, as illustrated in Figure 5(b). The resulting computational cost of the multiplication is quasi-linear in the matrix size $N$, of order $\mathcal{O}(k\, N \log N)$, where $k$ is the maximal rank of the compressed blocks.
The GMRES algorithm employed for computing the solution includes multiplications of the problem matrix by vectors (see line 1, line 4, and line 5 in Algorithm 9). The application of the stabilization requires replacing the Galerkin matrix by the Petrov-Galerkin product $W^{T}B$, where $W$ is applied in its compressed hierarchical form. If we can apply these matrix-vector multiplications in quasi-linear cost, we can say that our stabilization comes practically for free.
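Building on the compression sketch above, the following sketch applies the compressed coefficient matrix recursively and plugs it into SciPy's GMRES through a `LinearOperator`; compressing $W^{T}$ directly lets the same recursive product be reused in every iteration. All names are illustrative and rely on the hypothetical `HNode`/`compress` helpers from the previous sketch.

```python
# Sketch of the recursive H-matrix/vector product and its use inside GMRES.
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def hmatvec(node, x):
    if node.children is None:                      # leaf: U diag(s) V^T x
        return node.U @ (node.s * (node.Vt @ x))
    (a, b), (c, d) = node.children
    n1 = a.shape[1]                                # column split of the block
    top = hmatvec(a, x[:n1]) + hmatvec(b, x[n1:])
    bot = hmatvec(c, x[:n1]) + hmatvec(d, x[n1:])
    return np.concatenate([top, bot])

def solve_with_hmatrix(B, W, l):
    """Solve W^T B u = W^T l, applying W^T through its compressed hierarchical form."""
    Wt_tree = compress(W.T)                        # compress the transposed coefficient matrix
    n = B.shape[1]
    op = LinearOperator((n, n), matvec=lambda x: hmatvec(Wt_tree, B @ x))
    rhs = hmatvec(Wt_tree, l)
    u, info = gmres(op, rhs)
    return u
```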
2.5 Neural network learning the hierarchical matrices
The hierarchical matrix is obtained by constructing a tree with SVD decompositions of different blocks of the full matrix. The root level corresponds to the entire matrix, and the children correspond to sub-blocks. Only the leaf nodes have blocks stored in the SVD decomposition format. The most expensive part of the compression algorithm is checking the admissibility condition; in particular, checking whether a given block has singular values smaller than the prescribed threshold, which decides whether we partition the block or run the SVD decomposition. The SVD data for the blocks of different sizes can be precomputed and stored in a list, see Figure 6.
From the set of PDE-parameters, we would like to construct a neural network
$\mu \mapsto \mathrm{DNN}(\mu) = \big\{ U_{i}(\mu), \Sigma_{i}(\mu), V_{i}(\mu) \big\}_{i}, \qquad (13)$
whose output is the list of SVD decompositions for all blocks of different dimensions.
Unfortunately, training neural networks for the factors $U_{i}(\mu)$ and $V_{i}(\mu)$ of the different blocks, as functions of $\mu$, does not work. However, it is possible to train the singular values of the different blocks as functions of $\mu$, i.e.,
$\mu \mapsto \mathrm{DNN}(\mu) = \big\{ \Sigma_{i}(\mu) \big\}_{i}. \qquad (14)$
Knowing the singular values $\Sigma_{i}(\mu)$ a priori for a given $\mu$ allows us to construct the structure of the hierarchical matrix and to call the truncated SVD only for the leaves of the matrix, which considerably speeds up the compression algorithm. An illustration of the architecture of the neural network used to learn the singular values is depicted in Figure 7.
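A minimal sketch of this offline training step, using scikit-learn's MLP regressor with ReLU activations and the Adam optimizer (cf. [1, 14]) as an illustrative stand-in for the architecture of Figure 7; the function `block_of` and the sample set are placeholders.

```python
# Sketch: train a network mapping the PDE parameter to the leading singular values of one block.
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_sigma_net(param_samples, block_of, n_sigmas=8):
    """param_samples: 1D array of PDE parameters; block_of(mu): returns one block of W(mu)."""
    X = np.asarray(param_samples).reshape(-1, 1)
    rows = []
    for mu in param_samples:
        s = np.linalg.svd(block_of(mu), compute_uv=False)
        rows.append(np.pad(s[:n_sigmas], (0, max(0, n_sigmas - len(s)))))   # fixed-length target
    net = MLPRegressor(hidden_layer_sizes=(64, 64), activation='relu',
                       solver='adam', max_iter=5000)
    net.fit(X, np.array(rows))     # offline stage
    return net                     # online: net.predict([[mu]]) ~ leading singular values
```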
3 Application
3.1 Two-dimensional Eriksson-Johnson problem
Given a diffusion coefficient $\varepsilon > 0$ and the domain $\Omega = (0,1)^{2}$, we seek the solution of the advection-diffusion problem
$-\varepsilon\, \Delta u + \frac{\partial u}{\partial x} = 0 \ \text{ in } \Omega, \qquad u = g\, \chi_{\Gamma_{\mathrm{in}}} \ \text{ on } \partial\Omega, \qquad (15)$
where $\chi_{\Gamma_{\mathrm{in}}}$ denotes the characteristic function over the inflow boundary $\Gamma_{\mathrm{in}} := \{0\} \times (0,1)$.
The problem is driven by the inflow Dirichlet boundary condition and develops a boundary layer of width $\mathcal{O}(\varepsilon)$ near the outflow boundary $\{1\} \times (0,1)$, as shown in Figure 8.
The weak form with general Dirichlet boundary data will be to find $u$, satisfying the Dirichlet boundary condition, such that
$\varepsilon\, (\nabla u, \nabla v)_{\Omega} + \Big(\frac{\partial u}{\partial x}, v\Big)_{\Omega} = 0, \qquad \forall v \in H^{1}_{0}(\Omega). \qquad (16)$
To simplify the discussion, we approximate the solution using tensor products of one-dimensional B-spline basis functions of uniform order $p$ in all directions. This discrete trial space will be split as $U_h = U_h^{0} \oplus U_h^{\partial}$, where $U_h^{0}$ contains all the basis functions vanishing at $\partial\Omega$, and $U_h^{\partial}$ is the complementary subspace containing the basis functions associated with boundary nodes. Our discrete solution will be $u_h = u_h^{0} + u_h^{\partial}$, where $u_h^{0} \in U_h^{0}$ is unknown and $u_h^{\partial} \in U_h^{\partial}$ is directly obtained using the boundary data.
We build the test space using a larger polynomial order $q > p$. That is, we use the tensor product of one-dimensional B-spline basis functions of order $q$ vanishing over $\partial\Omega$. This discrete test space will be denoted by $V_m$. The discrete residual minimization problem will be to find $u_h^{0} \in U_h^{0}$ and $r_m \in V_m$ such that
$(r_m, v_m)_V + b(u_h^{0}, v_m) = l(v_m) - b(u_h^{\partial}, v_m), \qquad \forall v_m \in V_m, \qquad (17a)$
$b(w_h, r_m) = 0, \qquad \forall w_h \in U_h^{0}. \qquad (17b)$
The reduced matrix system associated with (17) takes the form:
$W(\varepsilon)^{T} B(\varepsilon)\, \mathbf{u} = W(\varepsilon)^{T}\, \mathbf{l}, \qquad W(\varepsilon) = G^{-1}B(\varepsilon) = \varepsilon\, G^{-1}B_{1} + G^{-1}B_{0}, \qquad (18)$
where $B_{1}$ and $B_{0}$ are the matrices associated with the diffusion and advection parts of the bilinear form, respectively, and $\mathbf{l}$ collects the right-hand-side and Dirichlet-lifting contributions of (17a).
We train a neural network for the diagonal matrix $\Sigma$ of the SVD decomposition of the entire matrix $W(\varepsilon)$ (see Figure 9), as well as for the SVD decompositions of the sub-matrices obtained by recursive partitions of $W(\varepsilon)$, for a range of values of $\varepsilon$. The white parts correspond to the boundary nodes, where we have enforced the boundary conditions. The convergence of the training procedures is presented in Figure 10.
In an online stage, for a given diffusion coefficient $\varepsilon$, we perform the compression of the matrix $W(\varepsilon)$ into the hierarchical matrix $\mathcal{H}(W(\varepsilon))$ using Algorithm 6, where the admissibility condition is now provided by the trained neural network. The compressed hierarchical matrices are illustrated in Figure 11.
Having the compressed matrix $\mathcal{H}(W(\varepsilon))$, we employ the GMRES algorithm [19] for solving the linear system (18). To avoid computations with a dense matrix, we note that the GMRES method involves computations of the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$, and the hierarchical format enables the matrix-vector multiplications with $B$ and $W^{T}$ in a quasi-linear computational cost.
| $\varepsilon$ | Compress flops with DNN | Compress flops without DNN | $\mathcal{H}$-matrix $\times$ vector flops | $B \times$ vector flops | # iter GMRES ($\mathcal{H}$-matrix) | Total flops |
|---|---|---|---|---|---|---|
| 0.1 | 14,334 | 158,946 | 41,163 | 9,152 | 90 | 3,728,166 |
|  | 31,880 | 117,922 | 33,100 | 9,002 | 76 | 3,232,392 |
| $\varepsilon$ | # iter GMRES Galerkin | Flops per iteration | Total flops |
|---|---|---|---|
| 0.1 | 89 | 17,536 | 1,560,704 |
|  | 65 | 17,536 | 1,139,840 |
In Table 1 we summarize the computational costs of our solver for two values of $\varepsilon$. The computational mesh was a tensor product of quadratic B-splines with 26 elements along the $x$-axis and quadratic B-splines with 10 elements along the $y$-axis. As we can read from the second and third columns, the DNN speeds up the compression of the matrix of optimal test functions' coefficients roughly ten times. We employ the GMRES solver, which computes the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$. The cost of multiplication by the hierarchical matrix and the cost of multiplication by $B$ are included in the fourth and fifth columns of Table 1. The total cost of GMRES with hierarchical matrices augmented by DNN compression is equal to the compression cost with DNN plus the number of iterations times the sum of the multiplication costs of the hierarchical matrix and of $B$. The total cost is presented in the last column of Table 1.
For comparison, we run the GMRES algorithm on the Galerkin method. The comparison is summarized in Table 2. The number of iterations, the cost per iteration, and the total cost are presented there. We can see that the cost of the stabilized solution is of the same order as the cost of the Galerkin solution. We compare here the costs of the correct solution obtained from the stable Petrov-Galerkin method with the cost of the incorrect solution from the unstable Galerkin method.
The numerical results are compared with the exact solutions in Figures 12 and 13 for two different choices of the inflow Dirichlet data.
3.2 Helmholtz problem
Given a wavenumber $\omega > 0$ and the domain $\Omega = (0,1)^{2}$, we seek the solution of the Helmholtz problem
$-\Delta u - \omega^{2} u = f \quad \text{in } \Omega, \qquad (19)$
with the right-hand side and boundary data chosen such that the exact solution is known in closed form.
We employ a tensor-product finite-element mesh. The trial space is constructed from quadratic B-splines. The test space is constructed from quadratic B-splines with separators (equivalent to Lagrange polynomials). The dependence of the coefficients of the optimal test functions on $\omega$ for the Helmholtz problem has the affine structure described in Section 2.2; thus, we can construct offline the function
$\omega \mapsto W(\omega) = G^{-1} B(\omega), \qquad (20)$
where $W(\omega)$ is the matrix of the coefficients of the optimal test functions. We fix the trial and test spaces used for approximation of the solution and stabilization of the Petrov-Galerkin formulation. Next, we consider blocks of different sizes of the matrix $W(\omega)$, and we train the singular values of these different blocks as functions of $\omega$, i.e.,
$\omega \mapsto \mathrm{DNN}(\omega) = \big\{ \Sigma_{i}(\omega) \big\}_{i}. \qquad (21)$
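As a concrete instance of the affine structure of Section 2.2, the online evaluation of $W(\omega)$ can be sketched as follows, assuming the standard weak-form splitting $B(\omega) = B_{\mathrm{stiff}} - \omega^{2} B_{\mathrm{mass}}$; the matrix names are illustrative.

```python
# Sketch: online evaluation of W(omega) for the Helmholtz family, reusing two
# precomputed, parameter-independent solves (cf. the offline/online sketch in Section 2.2).
import numpy as np

def W_of_omega(omega, G, B_stiff, B_mass):
    GinvK = np.linalg.solve(G, B_stiff)     # offline-cacheable
    GinvM = np.linalg.solve(G, B_mass)      # offline-cacheable
    return GinvK - omega**2 * GinvM         # W(omega) = G^{-1} B(omega)
```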
The convergence of the training procedure is presented in Figure 14. Knowing the singular values a priori for a given $\omega$ allows us to construct the structure of the hierarchical matrix, with the block structure and the $\Sigma_{i}(\omega)$ obtained from the neural networks.
Figure 15 depicts exemplary resulting hierarchical matrices.
Table 3 summarizes the computational costs of our solver for two values of $\omega$. The computational mesh was a tensor product of quadratic B-splines with 10 elements along the $x$- and $y$-axes. The DNN allows obtaining the compression of the matrix of optimal test functions' coefficients practically for free. We employ the GMRES solver, which computes the residual $W^{T}\mathbf{l} - W^{T}B\,\mathbf{u}$. The cost of multiplication by the hierarchical matrix and the cost of multiplication by $B$ are included in the fourth and fifth columns of Table 3. The total cost of GMRES with hierarchical matrices augmented by DNN compression is equal to the number of iterations times the sum of the multiplication costs of the hierarchical matrix and of $B$. The total cost is presented in the last column of Table 3.
We present in Table 4 the cost of the GMRES algorithm executed on the Galerkin method: the number of iterations, the cost per iteration, and the total cost. We can see that the cost of obtaining the stabilized solution is of the same order as the cost of the Galerkin method. We compare here the cost of the correct solution obtained from the Petrov-Galerkin method with the cost of the incorrect solution from the unstable Galerkin method.
The comparison between the solution obtained with the Petrov-Galerkin formulation, with the optimal test functions generated by the DNN, and the exact solution is presented in Figure 16.
| $\omega$ | Compress flops with DNN | Compress flops without DNN | $\mathcal{H}$-matrix $\times$ vector flops | $B \times$ vector flops | # iter GMRES ($\mathcal{H}$-matrix) | Total flops |
|---|---|---|---|---|---|---|
| 1 | 0 | 129,788 | 47,649 | 8,901 | 10 | 448,284 |
| 10 | 0 | 129,788 | 47,649 | 8,877 | 31 | 1,752,306 |
| $\omega$ | # iter GMRES Galerkin | Flops per iteration | Total flops |
|---|---|---|---|
| 1 | 10 | 14,444 | 144,440 |
| 10 | 31 | 14,444 | 447,764 |
4 Conclusions
We have employed the Petrov-Galerkin formulation with optimal test functions for the stabilization of difficult problems. We have focused on advection-dominated diffusion and Helmholtz problems. During the offline phase, we explicitly compute the matrix of coefficients of optimal test functions for any PDE-parameter. We have also trained neural networks to compute, for each PDE-parameter, the bottleneck of the hierarchical matrix compression. During the online phase, we rapidly compute the matrix compression using the neural networks, and we run the GMRES iterative solver on the reduced Petrov-Galerkin linear system, where matrix-vector multiplications are performed in a quasi-linear computational cost due to the hierarchical structure of the low-rank decomposition used. Thus, we obtain the online stabilization practically for free.
Acknowledgments
This work was supported by the European Union’s Horizon 2020 Research and Innovation Program under the Marie Skłodowska-Curie grant agreement No. 777778 (MATHROCKS). The research project was partly supported by the program “Excellence initiative – research university” for the University of Science and Technology.
Appendix A Algorithms
A.1 Recursive hierarchical compression of the matrix
where $N$ is the size of the matrix to be compressed
A.2 Checking of the admissibility condition
A.3 Matrix vector multiplication
A.4 rSVD compression of a block
A.5 Pseudo-code of the GMRES algorithm
A.6 Recursive hierarchical compression of the matrix augmented by neural network
where $N$ is the size of the matrix to be compressed
References
- [1] A. F. Agarap, Deep learning using rectified linear units (ReLU), arXiv preprint arXiv:1803.08375 (2018).
- [2] P. R. Amestoy and I. S. Duff, Multifrontal parallel distributed symmetric and unsymmetric solvers, Computer Methods in Applied Mechanics and Engineering, 184 (2000) 501-520.
- [3] P. R. Amestoy, I. S. Duff, J. Koster and J-Y L’Excellent, A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM Journal on Matrix Analysis and Applications, 1(23) (2001) 15-41.
- [4] P. R. Amestoy, A. Guermouche, J-Y L’Excellent and S. Pralet, Hybrid scheduling for the parallel solution of linear systems, Computer Methods in Applied Mechanics and Engineering 2(32) (2001), 136–156.
- [5] V. M. Calo, Residual-based multiscale turbulence modeling: Finite volume simulations of bypass transition, Stanford University, Ph.D. Thesis (2005)
- [6] V. M. Calo, M. Łoś, Q. Deng, I. Muga, M. Paszyński, Isogeometric residual minimization method (iGRM) with direction splitting preconditioner for stationary advection-dominated diffusion problems, Computer Methods in Applied Mechanics and Engineering 373 (2021) 113214.
- [7] J. Chan, J. A. Evans, A minimal-residual finite element method for the convection-diffusion equations, ICES-REPORT 13-12 (2013)
- [8] J. Chan, J. A. Evans, W. Qiu, A dual Petrov–Galerkin finite element method for the convection–diffusion equation, Computers & Mathematics with Applications 68(11) (2014) 1513–1529
- [9] A. Ern, J.-L. Guermond. Theory and practice of finite elements. Vol. 159. Springer, 2013
- [10] O.G. Ernst, M. J. Gander, Why it is Difficult to Solve Helmholtz Problems with Classical Iterative Methods. In: Graham, I., Hou, T., Lakkis, O., Scheichl, R. (eds.) Numerical Analysis of Multiscale Problems. Lecture Notes in Computational Science and Engineering, vol 83. Springer, Berlin, Heidelberg (2012)
- [11] W. Hackbusch, Hierarchical Matrices: Algorithms and Analysis, Springer (2009)
- [12] R. A. Horn, C. R. Johnson, Matrix Analysis, Cambridge University Press (1990)
- [13] T.J.R. Hughes, L.P. Franca, M. Mallet, A new finite element formulation for fluid dynamics: VI. Convergence analysis of the generalized SUPG formulation for linear time-dependent multidimensional advective–diffusive systems, Computer Methods in Applied Mechanics and Engineering, 6 (1987) 97–112.
- [14] D. P. Kingma, J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
- [15] M. Łoś, I. Muga, J. Muñoz-Matute, M. Paszyński, Isogeometric Residual Minimization Method (iGRM) with direction splitting for non-stationary advection–diffusion problems, Computers & Mathematics with Applications, 79(2) (2020) 213-229.
- [16] M. Łoś, I. Muga, J. Muñoz-Matute, M. Paszyński, Isogeometric residual minimization (iGRM) for non-stationary Stokes and Navier–Stokes problems, Computers & Mathematics with Applications, 95(1) (2021) 200-214.
- [17] I. Muga, K. G. van der Zee, Discretization of Linear Problems in Banach Spaces: Residual Minimization, Nonlinear Petrov–Galerkin, and Monotone Mixed Methods, SIAM Journal on Numerical Analysis 58(6) (2020) 3406–3426.
- [18] J. N. Reddy, An Introduction to the Finite Element Method, McGraw-Hill (2006)
- [19] Y. Saad, Iterative Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics; 2nd edition (2003)
- [20] R. Stevenson, J. Westerdiep, Minimal residual space-time discretizations of parabolic equations: Asymmetric spatial operators, Computers & Mathematics with Applications 101 (2021) 107-118