2021
[1]\fnmSean \surHon
[1]\orgdivDepartment of Mathematics, \orgnameHong Kong Baptist University, \orgaddress\streetKowloon Tong, \cityHong Kong SAR, \countryChina
2]\orgdivDepartment of Science and High Technology, \orgnameUniversity of Insubria, \orgaddress \cityComo, \postcode6152, \stateLombardy, \countryItaly
3]\orgdivDepartment of Information Technology, \orgnameUppsala University, \orgaddress \cityUppsala, \postcode75008, \stateUppland, \countrySweden
A sine transform based preconditioned MINRES method for all-at-once systems from constant and variable-coefficient evolutionary PDEs
Abstract
In this work, we propose a simple yet generic preconditioned Krylov subspace method for a large class of nonsymmetric block Toeplitz all-at-once systems arising from discretizing evolutionary partial differential equations. Namely, our main result is to propose two novel symmetric positive definite preconditioners, which can be efficiently diagonalized by the discrete sine transform matrix. More specifically, our approach is to first permute the original linear system to obtain a symmetric one, and subsequently develop desired preconditioners based on the spectral symbol of the modified matrix. Then, we show that the eigenvalues of the preconditioned matrix sequences are clustered around , which entails rapid convergence when the minimal residual method is devised. Alternatively, when the conjugate gradient method on the normal equations is used, we show that our preconditioner is effective in the sense that the eigenvalues of the preconditioned matrix sequence are clustered around unity. An extension of our proposed preconditioned method is given for high-order backward difference time discretization schemes, which can be applied on a wide range of time-dependent equations. Numerical examples are given, also in the variable-coefficient setting, to demonstrate the effectiveness of our proposed preconditioners, which consistently outperforms an existing block circulant preconditioner discussed in the relevant literature.
keywords:
sine transform based preconditioners; parallel-in-time; Krylov subspace methods; MINRES; all-at-once discretization; block Toeplitz systemspacs:
[MSC Classification]15B05, 65F08, 65F10, 65M22
1 Introduction
Over the past few years, there has been a growing interest on developing effective preconditioners for solving the all-at-once linear systems stemming from evolutionary partial differential equations (PDEs); see for instance Bertaccini2001 ; BertacciniNg2003 ; doi:10.1137/16M1062016 ; McDonald2017 ; HON2021113965 ; 0x003ab34e ; https://doi.org/10.1002/nla.2386 ; doi:10.1137/20M1316354 ; LIN2021110221 ; doi:10.1137/19M1309869 ; WU2021110076 and references therein. Instead of solving PDEs in a sequential manner, this class of parallel-in-time (PinT) methods solves for all unknowns simultaneously by constructing a large linear system which is composed of smaller linear systems at each time level. Our proposed method in this work belongs to the diagonalization-based all-at-once methods 10.1007/978-3-319-18827-0_50 , along with doi:10.1137/16M1062016 ; doi:10.1137/20M1316354 . Related PinT methods include space-time multigrid doi:10.1137/15M1046605 ; doi:10.1137/0916050 , multigrid reduction in time doi:10.1137/130944230 ; doi:10.1137/16M1074096 , and parareal method LIONS2001661 ; doi:10.1137/05064607X . For a survey on the development of these PinT methods, we refer the readers to 10.1007/978-3-319-23321-5_3 and the references therein.
As a model problem, we consider the following initial-boundary (value) problem for the heat equation
(1) |
where is open, denotes the boundary of , and are all given functions.
After discretizing the spatial domain with a mesh and using the simple two-level -method for time discretization with time-steps of size , we get
where are the mass matrix and the stiffness matrix approximating , respectively, in the standard finite element terminology, and . The forward Euler method corresponds to the choice , the backward Euler method to , while the Crank-Nicolson method is associated with .
The discrete equations to be solved are described below
Instead of solving the above equations for sequentially, for , we solve the following equivalent all-at-once system
where
(2) |
is a by nonsymmetric block Toeplitz matrix, which is generated by the matrix-valued function
(3) |
According to the notion of generating function reported in Section 2 (see also book-GLT-I and the references there reported), it is noted that if and commute then and commute. If one uses a finite element formulation to discretize in space for (1), then and are simultaneously diagonalizable, provided that a uniform grid is used. Alternatively, if finite difference methods are used for (1), then , being exactly the identity matrix in this case, always commutes with (see (doi:10.1137/16M1062016, , Section 3.1.1) for a discussion).
Therefore, throughout this work, the following assumptions on both and are made, which are compatible with (doi:10.1137/20M1316354, , Section 2):
-
1.
Both and are real symmetric positive definite (SPD);
-
2.
Both and are sparse, i.e., they have only nonzero entries.
-
3.
and commute.
As already observed the commutation between and is obvious if is the identity matrix or in the constant coefficient setting. When is not constant or there is also a dependency on time, that is , or the discretization method is different (finite elements, finite volumes, IgA) or graded meshes are used, the commutation is no longer true algebraically, but it holds asymptotically for large enough and it can be proved via multilevel GLT tools (see book-GLT-II ). More in detail, by assuming with going to infinity as tend to infinity and so that and
(4) |
in the case of a variable coefficient , we obtain that the matrix sequence is spectrally distributed as on , in the sense of Definition 1, since a real symmetric GLT matrix sequence with GLT symbol . The latter statement holds since we exploit a canonical decomposition of as the product of a diagonal sampling sequence and of Toeplitz matrix sequence plus a zero-distributed matrix sequence (see again (book-GLT-II, , Ch. 6, Sect. 7.3)). Hence, under the assumption that finite differences are used and by exploiting the tensor structure of in (2), the global matrix sequence will be again a GLT matrix sequence with GLT symbol
Still the same distribution is present if condition (4) is violated and , but with a different scaling factor and it applies to the matrix sequence .
However, in the present setting is considered fixed that is and hence also the condition is not satisfied. Here a matrix-valued distribution is deduced. When assuming that both and are allowed to tend to infinity, in the way indicated in the previous lines, the situation changes and the related type of analysis represents a research line with various cases to be considered in the future.
Instead of directly solving (2), we consider the permuted linear system , where
(5) |
Note that with being the anti-identity matrix or the flip matrix that is if and only if and otherwise. Clearly, is symmetric since is assumed symmetric. Thus, we can then deploy a symmetric Krylov subspace method such as the minimal residual (MINRES) method and design an effective preconditioner for , whose convergence depends only on the eigenvalues.
In doi:10.1137/16M1062016 , a Strang block circulant preconditioner was proposed for and proved effective when . The main advantage of this iterative approach is that the inverse of the preconditioner can be computed using parallel fast Fourier transforms over a number of processors. However, the performance of such a preconditioner can be unsatisfactory, as will be shown in the numerical examples in Section 5. We remark that such an unsatisfactory preconditioning effect of circulant matrices Strang1986 ; doi:10.1137/0909051 for certain ill-conditioned Toeplitz systems was discussed in Hon_SC_Wathen .
Thus, instead of using block circulant preconditioners, we propose in this work the following SPD preconditioner for , that can be diagonalized by the discrete sine transform matrix
(6) |
which is constructed based on approximating the asymptotic eigenvalue distribution of . Similar to the block circulant preconditioner, the computation of can also be parallelizable over processors. We refer readers to Bini1990 for a detailed discussion on such a parallel implementation.
We remind that in a pure Toeplitz setting the algebra of matrices diagonalized by the (real) orthogonal sine transform (i.e., the so-called algebra BC83 ) was extensively used in the 90s (see D1 ; D2 ; DS ; Slinear99 and references therein), while recently the approach based on preconditioning gained further attention in the context of approximated fractional differential equations (see BEV21 ; DMS16 and references therein). Here, we propose its use in a different setting where the Toeplitz structure is nonsymmetric. We recall that the preconditioning is superior when compared to the circulant preconditioning, when real symmetric Toeplitz-like structures are considered, as discussed and theoretically motivated in DS ; Slinear99 .
In particular, we will show that the eigenvalues of the preconditioned matrix sequence are clustered around , which leads to fast convergence when MINRES is employed.
Yet, requires fast diagonalizability of and in order to be efficiently implemented. When such diagonalizability is not available, we further propose the following preconditioner as a modification of ,
(7) |
where
and
Its derivation and implementation will be discussed in Section 3.1.4.
In general, as mentioned in (ANU:9672992, , Chapter 6), the convergence study of preconditioning strategies for nonsymmetric problems is heuristic, since descriptive convergence bounds are usually not available for the generalized minimal residual (GMRES) method or any of the other applicable nonsymmetric Krylov subspace iterative methods.
The paper is organized as follows. In Section 2, we review some preliminary results on block Toeplitz matrices and matrix-valued functions. We present our main results in Section 3 where our proposed preconditioner and its effectiveness are provided. An extension of our preconditioning method is given in Section 4 in the setting of high-order in time discretization schemes. Numerical tests are given in Section 5 to support the analysis of our proposed preconditioner. Finally in Section 6 we draw conclusions and list a few open problems, including the challenging case of variable-coefficients when both the sizes and tend to infinity.
2 Preliminaries
In the following subsections, we provide some useful background knowledge about block Toeplitz matrices and matrix functions.
2.1 Block Toeplitz matrices
We let be the Banach space of all matrix-valued functions that are Lebesgue integrable over . The -norm induced by the trace norm over is
where denotes the trace norm of . The block Toeplitz matrix generated by is denoted by , namely
(8) |
where the Fourier coefficients of are
The function is called the generating function or symbol of . For thorough discussions on the related properties of (multilevel) block Toeplitz matrices, we refer the readers to MR2108963 ; Chan:1996:CGM:240441.240445 ; book-GLT-I and references therein.
Before discussing the asymptotic singular value or spectral distribution of associated with , which is crucial to develop our preconditioning theory, we introduce the following definition.
Let (or ) be the space of complex-valued continuous functions defined on (or ) with bounded support.
Definition 1.
Ferrari2019 Let be a sequence of -by- matrices with fixed positive integer.
-
[1.]
-
1.
We say that has an asymptotic singular value distribution described by an -by- matrix-valued if
and we write .
-
2.
We say that has an asymptotic spectral distribution described by an -by- matrix-valued if
and we write .
Before presenting our main preconditioning results, we introduce the following notation. Given with , we define as , where and , with the constraint that and have non-intersecting interior part, i.e., . In this way, we have . Given any defined over , we define over in the following way
(9) |
Theorem 1.
(Ferrari2019, , Theorem 3.4) Suppose has Hermitian Fourier coefficients. Let be the block Toeplitz matrix generated by and let . Then
over the domain with and , where is defined in (9). That is
where , , are the eigenvalue functions of .
As for the symmetrized matrix given in (5), we can see by MazzaPestana2018 ; Ferrari2019 that their eigenvalues are distributed as . By Theorem 1, is always symmetric indefinite when is sufficiently large, which gives grounds for the use of MINRES in this work.
2.2 Matrix functions
In this subsection, several useful results on functions of matrices are presented.
If is analytic on a simply connected open region of the complex plane containing the interval , there exist ellipses with foci in and such that is analytic in their interiors. Let and be the half axes of such an ellipse with . Such an ellipse, denoted by , is completely specified once the number is known.
Theorem 2 (Bernstein’s theorem).
(Benzi1999, , Theorem 2.1) Let the function be analytic in the interior of the ellipse with and continuous on . In addition, suppose is real for real . Then, the best approximation error
where denotes the degree of the polynomial and
Let be an symmetric matrix and let be the smallest interval containing . If we introduce the linear affine function
then and therefore the spectrum of the symmetric matrix
is contained in . Given a function analytic on a simply connected region containing and such that is real when is real, the function satisfies the assumptions of Bernstein’s theorem.
In the special case where is SPD and that are of our interest in this work, we apply Bernstein’s result to the function
(10) |
where , and with the spectral condition number of being
3 Main results
In this section, we provide our results on the proposed preconditioner defined by (6), whose design is based on the following spectral symbol
(11) | |||||
Due to the fact that , it can be regarded as an optimal preconditioner for , in the sense of the spectral distribution theory developed in the work Ferrari2019 . We discuss mainly MINRES combined with our proposed preconditioner, and related issues such as implementations and convergence analysis are given as well.
3.1 Preconditioning for heat equations
In this section, we provide a MINRES approach for .
3.1.1 Implementations
We first discuss the fast computation of for any given vector . Since both and are sparse matrices, is sparse. Hence, computing the matrix-vector product only requires linear complexity of . Finally, computing needs the same complexity since simply imposes a reordering on the vector . Alternatively, due to the fact that itself is a block Toeplitz matrix, it is well-known that the product can be computed in operations using FFTs and the related storage is of , where can be regarded as a fixed constant and is the effective dimensional parameter.
We first indicate that has the following decomposition
where
(12) |
is a tridiagonal Toeplitz matrix which has the eigendecomposition with being the symmetric discrete sine transform matrix. Hence,
The product can be computed via the following three-step procedures:
-
[1.]
-
1.
-
2.
-
3.
.
When both and are diagonalizable by , which holds true when the spatial discretization is a finite difference/element method with a uniform square grid and a constant diffusion coefficient function , the matrix-vector product can be computed efficiently in operations using fast sine transforms with a storage is of order .
When the diffusion coefficients change corresponding to space, the matrix is no longer a Toeplitz matrix and is not diagonalizable by . Considering the one-dimensional space case, let the spatial grid size , we have the following coefficient matrix
(13) |
where for all . In this case, we can construct a preconditioner for by averaging is entries along the three diagonals to obtain a tridiagonal symmetric Toeplitz matrix. Clearly, such a new preconditioner can be fast diagonalized by . Thus, the overall preconditioner can be diagonalized as well.
3.1.2 Eigenvalue analysis of
The following proposition guarantees the effectiveness of as a preconditioner for , showing that the eigenvalues of are clustered around which ensures fast convergence of MINRES.
Proposition 3.
Proof: Since the eigendecomposition of the symmetric matrix is , we can have
where is a diagonal matrix whose entries are either or . Therefore, is symmetric and orthogonal, and hence its eigenvalues are only and .
Now, we turn our focus to the following lemma and proposition, which will be helpful to show our main result.
Lemma 4.
Proof: First, we observe that
Second, exploiting the simple structures of both and , we have by direct computations
(14) |
for integer values and , where represents a nonzero entry. Namely, is a block matrix with one block in its Southeast corner, and the block is of size . Thus,
is also a block matrix with one block in its Southeast corner, which is of size provided that . Hence, we have .
Remark 1.
The computational lemma used in the proof of Lemma 4 was first given in (doi:10.1137/080720280, , Lemma 3.11).
Proposition 5.
Proof: Let and be defined in equation (10). By Theorem 2, there exists a polynomial with degree less than or equal to such that
where
and and are the condition numbers of and , respectively. Thus, for any there exists an integer such that
Also, we have
By Lemma 4, we know that has the same sparsity structure as that of . Consequently, we deduce .
We then obtain
where and . Therefore, the proof is concluded.
Theorem 6.
Proof: By Proposition 3, we can write
By Proposition 5, we then have
where
and
Note that
by using the general inequalities in (Serra_Tilli_2002, , Corollary 4.2), noticing that the Schatten norm in that paper is exactly the spectral norm . Furthermore, we have strict inequality, that is , whenever the infimum of is strictly bounded by , as it often occurs, including in our case.
Hence, is uniformly bounded with respect to and the proof is concluded.
By (BRANDTS20103100, , Corollary 3) and Theorem 1, we know from Theorem 6 that the preconditioned matrix sequence has clustered eigenvalues around , with a number of outliers independent of . Hence, the convergence is independent of the time steps, and we can expect fast MINRES converges in exact arithmetic with as a preconditioner.
3.1.3 Eigenvalue analysis of
For a complete eigenvalue analysis, we examine the preconditioned normal equation matrix , even though we do not use the conjugate gradient method on normal equations (CGNE) in this work.
As for implementations, the product for any vector can be computed effectively using the similar arguments in Section 3.1.1.
In the following result, we show that the eigenvalues of are clustered around with outliers independent of the time step , which implies fast convergence of CGNE.
Proof: Direct computations show that
where . The matrix equation further leads to
(15) |
where .
We now consider the normal equation of , namely
where .
Notice that every rank-one term in can perturb at the same time one eigenvalue of . Therefore, by Theorem 7, there are at least eigenvalues of the normal equation matrix are equal to , which implies rapid convergence of CGNE.
3.1.4 Modified Preconditioner
Following our spectral symbol matching strategy, the construction of the modified preconditioner is motivated by the following approximation without requiring the fast diagonalizability of and .
Since the original preconditioner satisfy are clustered around , where defined by (3), we can approximate the spectral symbol
where
via the following matrix-valued function
Hence, we can regard as satisfying are clustered around .
In matrix form, the corresponding preconditioner generated by is precisely our proposed modified preconditioner defined by (7).
The product for any vector can be implemented by the following three-step procedures, which does not require the fast diagonalizability of and . Notice that both and from (7) are Tau matrices. Hence,
where and are respectively the diagonal matrices containing the eigenvalues of and .
Hence, the computations of can be implemented via the following three steps:
-
[1.]
-
1.
-
2.
-
3.
.
Both Steps 1 & 3 can be computed efficiently via fast sine transforms in operations. As for Step 2, the shifted Laplacian systems can be efficiently solved for example using the multigrid methods doi:10.1137/15M102085X . For more details regarding such efficient implementation, we refer to (doi:10.1137/16M1062016, , Section 3.1).
4 Extension to multistep methods
Our proposed preconditioning method can be easily extended for high-order in time discretization schemes. Consider again (1), using an -order backward difference scheme for time, we have the following all-at-once system to solve for , where
(16) |
is a by multiple block Toeplitz matrix, which is generated by the matrix-valued function
(17) |
and is an appropriate vector resulted from the adopted discretization schemes. Note that the matrices are composed of the mass matrix and the stiffness matrix .
Our approach is to first permute (16) and obtain a symmetric (indefinite) matrix
(18) |
and then construct a corresponding sine transform based preconditioner for it. We first compute the following matrix-valued function from (17)
where are all matrix polynomials composed of . Now, by expressing in terms of , we can further rewrite as
where are certain matrix polynomials composed of
We propose the following SPD preconditioner, which approximates the symbol in the eigenvalue sense, for :
(20) |
where is the same matrix given by (12).
Note that
Hence, the product for any vector can be computed effectively using the similar three-step procedures in Section 3.1.1.
4.1 Eigenvalue analysis of
As before, we first introduce matrix as an auxiliary for showing the eigenvalues of are clustered. The following proposition guarantees the effectiveness of as a preconditioner for , showing that the eigenvalues of are clustered around , which leads to fast convergence of MINRES.
Proposition 8.
Proof: Since the eigendecomposition of the symmetric matrix is , we can have
where is a diagonal matrix whose entries are either or . Therefore, is symmetric and orthogonal, and hence its eigenvalues are only and .
In addition, we require the following lemma and proposition for showing our main result.
Lemma 9.
Proof: First, we observe from (4) that
Note that we also have
where are low rank matrices in the sense that .
Recalling from (20) that and also note that , we have
One can further show by direct computations that has the following structure
where represents a nonzero entry, since such a block structure is determined by the last term . Namely, it is a block matrix with in the Southeast and Northwest corners and the block is of size .
We can now examine the structure of the matrix . By (16), we have
where the first matrix is a block matrix with a block in the corner. Thus, is a block matrix with a block in the two corners. We have by direct computations
Similar to (14), by exploiting the simple structure of , we can show that for given positive integers and the matrix is a block matrix with a block in its Southeast and Northwest corners and the block is of size . Thus,
is also a block matrix with a block in the two corners and the size is , provided that . Hence, we have .
The following results, i.e., Proposition 10 and Theorem 11, follow using similar arguments for showing Proposition 5 and Theorem 6. Hence, the proofs are left to the readers.
Proposition 10.
Theorem 11.
Since both and in Theorem 11 are symmetric, by (BRANDTS20103100, , Corollary 3) and Theorem 1, we know that the preconditioned matrix sequence has clustered eigenvalues around with number of outliers independent of . Hence, the convergence is independent of the time steps, and we can then expect that MINRES converges rapidly in exact arithmetic with as a preconditioner.
5 Numerical examples
In this section, we demonstrate the effectiveness of our proposed preconditioner. All numerical experiments are carried out using MATLAB on a Dell R640 server equipped with dual Xeon Gold 6246R 16-Cores 3.4GHz CPUs, 512GB RAM running Ubuntu 20.04 LTS. The CPU time, which is denoted by “CPU” in the tables below, is estimated in seconds using the MATLAB built-in functions tic/toc. All Krylov subspace solvers are implemented using the build-in functions on MATLAB, and “Iter” refers to the iteration number required for convergence. Furthermore, we choose a zero initial guess and a stopping tolerance of based on the reduction in relative residual norms. Throughout all examples, we consider finite difference methods with uniform spatial grids, which results in being the identity matrix and being diagonalized by the discrete sine transform. Note that in the examples “DoF” denotes the degree of freedom, and is the existing absolute value block circulant preconditioner proposed in doi:10.1137/16M1062016 .
Example 1.
doi:10.1137/16M1062016 The first example is a two-dimensional problem of solving (1) with , , , , , and .
For this example, we adopt the backward Euler method and the Crank-Nicolson method in time for solving the equation and report the convergence results in Tables 1 and 2, respectively. While increases quickly in iteration number with the parameters for this ill-conditioned example, our proposed preconditioners and significantly outperform in terms of both iteration numbers and CPU times. In addition, we observe that is only slightly inferior in precondition effect when compared with . In fact, in most cases, their induced iteration numbers are identical. The use of as a preconditioner appears to be excellent, considering that it is an approximation to .
DoF | Iter | CPU | Iter | CPU | Iter | CPU | ||
30752 | 34 | 0.24 | 11 | 0.096 | 11 | 0.071 | ||
127008 | 48 | 0.65 | 11 | 0.17 | 11 | 0.18 | ||
516128 | 59 | 2.18 | 11 | 0.57 | 11 | 0.56 | ||
2080800 | 82 | 19.56 | 11 | 3.27 | 11 | 3.27 | ||
61504 | 34 | 0.30 | 11 | 0.12 | 11 | 0.11 | ||
254016 | 48 | 1.04 | 11 | 0.26 | 11 | 0.27 | ||
1032256 | 72 | 6.81 | 11 | 1.35 | 13 | 1.48 | ||
4161600 | 82 | 40.82 | 11 | 7.28 | 13 | 8.37 | ||
123008 | 34 | 0.49 | 13 | 0.21 | 13 | 0.22 | ||
508032 | 48 | 1.92 | 13 | 0.62 | 13 | 0.61 | ||
2064512 | 72 | 15.04 | 13 | 3.68 | 13 | 3.73 | ||
8323200 | 79 | 81.81 | 13 | 17.19 | 13 | 17.32 | ||
246016 | 34 | 0.78 | 13 | 0.35 | 15 | 0.38 | ||
1016064 | 48 | 4.62 | 13 | 1.61 | 15 | 1.78 | ||
4129024 | 71 | 34.91 | 13 | 8.28 | 15 | 9.34 | ||
16646400 | 79 | 157.74 | 14 | 35.95 | 15 | 38.32 | ||
\botrule |
DoF | Iter | CPU | Iter | CPU | Iter | CPU | ||
30752 | 33 | 0.45 | 11 | 0.098 | 11 | 0.082 | ||
127008 | 48 | 0.67 | 11 | 0.18 | 11 | 0.20 | ||
516128 | 59 | 2.42 | 11 | 0.63 | 11 | 0.63 | ||
2080800 | 82 | 18.83 | 11 | 3.44 | 11 | 3.49 | ||
61504 | 34 | 0.38 | 11 | 0.15 | 11 | 0.14 | ||
254016 | 48 | 1.10 | 11 | 0.30 | 13 | 0.36 | ||
1032256 | 73 | 7.28 | 11 | 1.44 | 13 | 1.51 | ||
4161600 | 83 | 44.01 | 11 | 7.64 | 13 | 8.74 | ||
123008 | 34 | 0.68 | 13 | 0.23 | 13 | 0.24 | ||
508032 | 48 | 2.20 | 13 | 0.71 | 13 | 0.67 | ||
2064512 | 72 | 16.06 | 13 | 3.98 | 13 | 3.97 | ||
8323200 | 80 | 87.92 | 13 | 18.48 | 13 | 19.00 | ||
246016 | 34 | 0.83 | 13 | 0.39 | 15 | 0.42 | ||
1016064 | 48 | 5.02 | 13 | 1.73 | 15 | 1.97 | ||
4129024 | 72 | 37.34 | 13 | 9.14 | 15 | 10.34 | ||
16646400 | 79 | 165.53 | 14 | 37.91 | 15 | 40.49 | ||
\botrule |
Example 2.
This example is a two-dimensional problem of solving (1) with , , , and
This example has the closed-form analytical solution as follows
Since the exact solution of this example is known, we can define the error estimate as follows
where denotes the iterative solution of the linear system in (2), and denotes the exact solution of the heat equation on the mesh.
Tables 3 and 4 show the convergence results of MINRES using the backward Euler method and the Crank-Nicolson method in time, respectively. Table 5 shows the corresponding error results of MINRES using different preconditioners. Again, similar to the last example, we observed that our proposed preconditioners and considerably surpass in this variable-coefficient case.
DoF | Iter | CPU | Iter | CPU | Iter | CPU | ||
30752 | 107 | 0.63 | 11 | 0.084 | 11 | 0.084 | ||
127008 | 141 | 1.95 | 11 | 0.17 | 12 | 0.16 | ||
516128 | 218 | 7.91 | 11 | 0.57 | 13 | 0.64 | ||
2080800 | 315 | 62.44 | 12 | 3.82 | 16 | 4.87 | ||
61504 | 106 | 1.00 | 11 | 0.13 | 13 | 0.15 | ||
254016 | 154 | 3.20 | 11 | 0.25 | 13 | 0.31 | ||
1032256 | 219 | 21.27 | 13 | 1.44 | 14 | 1.49 | ||
4161600 | 307 | 155.67 | 13 | 8.18 | 17 | 10.69 | ||
123008 | 107 | 1.49 | 13 | 0.22 | 13 | 0.21 | ||
508032 | 160 | 6.30 | 13 | 0.58 | 14 | 0.63 | ||
2064512 | 218 | 47.22 | 13 | 3.44 | 15 | 4.55 | ||
8323200 | 303 | 315.43 | 13 | 17.64 | 18 | 23.49 | ||
246016 | 118 | 2.50 | 14 | 0.35 | 15 | 0.36 | ||
1016064 | 177 | 14.86 | 14 | 1.58 | 15 | 1.45 | ||
4129024 | 220 | 111.71 | 14 | 10.00 | 17 | 11.22 | ||
16646400 | 299 | 601.57 | 15 | 39.08 | 19 | 48.44 | ||
\botrule |
DoF | Iter | CPU | Iter | CPU | Iter | CPU | ||
30752 | 106 | 0.61 | 11 | 0.085 | 11 | 0.076 | ||
127008 | 141 | 1.94 | 11 | 0.19 | 12 | 0.18 | ||
516128 | 216 | 8.75 | 11 | 0.70 | 13 | 0.70 | ||
2080800 | 309 | 67.75 | 12 | 4.22 | 17 | 5.63 | ||
61504 | 106 | 0.96 | 11 | 0.12 | 13 | 0.14 | ||
254016 | 154 | 3.38 | 11 | 0.27 | 13 | 0.32 | ||
1032256 | 218 | 18.64 | 11 | 1.16 | 14 | 1.42 | ||
4161600 | 304 | 164.44 | 13 | 9.68 | 17 | 11.38 | ||
123008 | 107 | 1.53 | 13 | 0.21 | 13 | 0.21 | ||
508032 | 160 | 6.78 | 13 | 0.63 | 13 | 0.66 | ||
2064512 | 218 | 48.20 | 13 | 4.12 | 15 | 4.63 | ||
8323200 | 301 | 337.99 | 13 | 18.98 | 19 | 26.58 | ||
246016 | 117 | 2.63 | 14 | 0.36 | 15 | 0.38 | ||
1016064 | 177 | 15.35 | 14 | 1.47 | 15 | 1.56 | ||
4129024 | 221 | 114.99 | 14 | 9.39 | 17 | 11.16 | ||
16646400 | 299 | 648.21 | 15 | 42.38 | 19 | 52.54 | ||
\botrule |
Backward Euler | Crank-Nicolson | |||||||
DoF | ||||||||
30752 | 6.14e-4 | 6.14e-4 | 6.14e-4 | 3.12e-6 | 3.12e-6 | 3.12e-6 | ||
127008 | 6.14e-4 | 6.14e-4 | 6.14e-4 | 3.12e-6 | 3.12e-6 | 3.12e-6 | ||
516128 | 6.14e-4 | 6.14e-4 | 6.14e-4 | 3.12e-6 | 3.12e-6 | 3.12e-6 | ||
2080800 | 6.14e-4 | 6.14e-4 | 6.14e-4 | 3.12e-6 | 3.12e-6 | 3.12e-6 | ||
61504 | 3.08e-4 | 3.08e-4 | 3.08e-4 | 8.12e-7 | 7.99e-7 | 8.02e-7 | ||
254016 | 3.08e-4 | 3.08e-4 | 3.08e-4 | 8.05e-7 | 8.01e-7 | 8.03e-7 | ||
1032256 | 3.08e-4 | 3.08e-4 | 3.08e-4 | 8.15e-7 | 8.07e-7 | 8.01e-7 | ||
4161600 | 3.08e-4 | 3.08e-4 | 3.08e-4 | 8.15e-7 | 8.02e-7 | 8.02e-7 | ||
123008 | 1.54e-4 | 1.54e-4 | 1.54e-4 | 2.38e-7 | 1.99e-7 | 2.04e-7 | ||
508032 | 1.54e-4 | 1.54e-4 | 1.54e-4 | 4.08e-7 | 2.00e-7 | 2.00e-7 | ||
2064512 | 1.54e-4 | 1.54e-4 | 1.54e-4 | 4.78e-7 | 2.03e-7 | 2.56e-7 | ||
8323200 | 1.54e-4 | 1.54e-4 | 1.54e-4 | 4.10e-7 | 7.08e-7 | 5.87e-7 | ||
246016 | 7.71e-5 | 7.71e-5 | 7.71e-5 | 1.79e-7 | 5.79e-8 | 4.98e-8 | ||
1016064 | 7.71e-5 | 7.71e-5 | 7.71e-5 | 6.37e-7 | 5.91e-8 | 6.93e-8 | ||
4129024 | 7.71e-5 | 7.71e-5 | 7.71e-5 | 6.50e-7 | 1.53e-7 | 2.58e-7 | ||
16646400 | 7.71e-5 | 7.71e-5 | 7.71e-5 | 3.76e-7 | 6.42e-7 | 7.60e-7 | ||
\botrule |
Example 3.
The last example is a three-dimensional problem of solving (1) with , , , , , and .
Tables 6 and 7 respectively show the convergence results of MINRES when the backward Euler method and the Crank-Nicolson method are used for time. Overall, the preconditioners and still perform better than with regard to iteration numbers. In this relatively well-conditioned case, all preconditioned solvers appear to have comparable CPU times.
DoF | Iter | CPU | Iter | CPU | Iter | CPU | ||
2744 | 11 | 0.034 | 10 | 0.027 | 13 | 0.029 | ||
27000 | 18 | 0.12 | 12 | 0.093 | 14 | 0.10 | ||
238328 | 21 | 0.54 | 13 | 0.39 | 16 | 0.49 | ||
2000376 | 21 | 5.91 | 13 | 5.06 | 16 | 6.07 | ||
5488 | 14 | 0.032 | 12 | 0.037 | 14 | 0.040 | ||
54000 | 18 | 0.17 | 15 | 0.15 | 17 | 0.18 | ||
476656 | 21 | 1.08 | 15 | 0.85 | 18 | 0.96 | ||
4000752 | 24 | 15.77 | 17 | 13.32 | 18 | 14.01 | ||
10976 | 14 | 0.059 | 14 | 0.070 | 15 | 0.075 | ||
108000 | 18 | 0.29 | 17 | 0.27 | 19 | 0.31 | ||
953312 | 22 | 2.37 | 18 | 2.00 | 21 | 2.30 | ||
8001504 | 24 | 32.41 | 19 | 30.75 | 22 | 36.00 | ||
21952 | 14 | 0.084 | 15 | 0.094 | 17 | 0.13 | ||
216000 | 18 | 0.49 | 18 | 0.50 | 21 | 0.55 | ||
1906624 | 22 | 6.13 | 21 | 7.05 | 24 | 7.94 | ||
16003008 | 24 | 65.54 | 21 | 66.41 | 24 | 75.87 | ||
\botrule |
DoF | Iter | CPU | Iter | CPU | Iter | CPU | ||
2744 | 14 | 0.040 | 10 | 0.034 | 13 | 0.030 | ||
27000 | 18 | 0.10 | 13 | 0.095 | 15 | 0.11 | ||
238328 | 21 | 0.61 | 13 | 0.44 | 17 | 0.55 | ||
2000376 | 21 | 6.23 | 13 | 5.11 | 17 | 6.71 | ||
5488 | 14 | 0.034 | 12 | 0.037 | 15 | 0.048 | ||
54000 | 18 | 0.20 | 15 | 0.19 | 17 | 0.18 | ||
476656 | 21 | 1.10 | 15 | 0.90 | 20 | 1.16 | ||
4000752 | 25 | 17.49 | 17 | 13.98 | 20 | 16.36 | ||
10976 | 14 | 0.066 | 14 | 0.064 | 16 | 0.081 | ||
108000 | 18 | 0.30 | 17 | 0.30 | 19 | 0.35 | ||
953312 | 23 | 3.03 | 18 | 2.86 | 22 | 3.46 | ||
8001504 | 24 | 34.45 | 19 | 32.64 | 22 | 37.47 | ||
21952 | 14 | 0.10 | 15 | 0.095 | 17 | 0.11 | ||
216000 | 18 | 0.51 | 18 | 0.50 | 22 | 0.61 | ||
1906624 | 23 | 7.56 | 21 | 7.23 | 25 | 8.49 | ||
16003008 | 24 | 69.35 | 21 | 69.73 | 26 | 85.77 | ||
\botrule |
6 Conclusion
We have developed in this work a sine transform based preconditioning method which can be used for solving a wide range of time-dependent PDEs. Namely, our proposed preconditioners can be applied for a large class of symmetrized all-at-once system resulted from discretizing the underlying equation. The method is generic, and can be used with high-order discretization schemes for time.
We have adopted the Krylov subspace solver of MINRES by exploiting the symmetry of the permuted matrix. In particular, we emphasize that our sine transform based preconditioners can be seen as optimal or quasi-optimal preconditioners, which are both symmetric positive definite and efficiently implemented, according to the previous work on the spectral distribution of symmetrized Toeplitz matrix sequences MazzaPestana2018 ; Ferrari2019 . We observe that a quasi optimal behavior is obtained in the case of multilevel coefficient matrices, which is a good result given the theoretical barriers proved in SeTy2 for matrix algebra preconditioners with equimodular transforms like multilevel circulants (see also nega-gen for more discussion and more general results).
Also, specifically designed for MINRES, our preconditioners consistently outperform an existing block circulant preconditioner in both iteration counts and CPU times as shown in each numerical example. It is observed that they can achieve rapid convergence, especially in the ill-conditioned case. We stress that the severity of ill-conditioning of the original linear system mainly comes from both the spatial and temporal discretization schemes for the underlying equation. In addition to effective preconditioning, one remedy is to develop better discretization schemes in the first place for both space and time, which would be a potential research direction in the next stage. As long as the resulting linear systems from such discretization schemes are of lower triangular banded block Toeplitz structure, our proposed preconditioners can be incorporated. The required structure, mainly as a result of time discretization through an all-at-once process, is not difficult to obtain.
Our preconditioning technique (i.e., ) utilizes the fast diagonalizability of the Laplacian matrix, which relies mainly on the uniform spatial grid setting. We however emphasize that such a uniform grid assumption is a common starting point in the all-at-once preconditioning context, as with many existing works such as LIN2021110221 ; WU2021110076 . Along this line, it would be a direction for future research to develop efficient preconditioning methods when the physical domain is irregular - in this setting the theory of generalized locally Toeplitz sequences could be exploited for the spectral and convergence analysis GLT-LAA2 ; Ba . Also, our proposed preconditioner , which does not require such fast diagonalizability and has shown excellent preconditioning effect in the numerical tests, deserves further investigation.
For another future work, it would be interesting to combine the symmetrization MINRES solver with the block -circulant preconditioning technique doi:10.1137/20M1316354 ; doi:10.1137/19M1309869 ; WU2021110076 which has only been shown applicable for GMRES. Such a combination would further advance the pioneering block circulant preconditioning theory (cooperated with MINRES) originally proposed in doi:10.1137/16M1062016 , which is still at an early stage of development.
As already observed, a further interesting case is that of variable coefficients. When is not constant or there is also a dependency on time, that is , the spectral analysis can be performed and the related solvers can be designed via multilevel GLT tools (see book-GLT-II ), by assuming that both and are allowed to tend to infinity. In this setting, instead of dealing with a matrix valued symbol (whose size is the fixed parameter ), we have further levels in the structure, where is the spatial dimensionality of the problem, but with scalar-valued symbols. Of course, this setting is a challenge (see SeTy2 ; nega-gen and references therein) to be considered in future work, even though the numerical tests considered in this work seem promising in this direction.
Declarations
Ethical Approval and Consent to participate
Not applicable.
Consent for publication
The authors consent to publication of this work in Numerical Algorithms of Springer Nature, when accepted.
Human and Animal Ethics
Not applicable.
Availability of supporting data
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
Competing interests
The authors declare no competing interests.
Funding
The work of S. Hon was supported in part by the Hong Kong RGC under grant 22300921, a start-up allowance from the Croucher Foundation, and a Tier 2 Start-up Grant from Hong Kong Baptist University. The work of S. Serra-Capizzano was supported in part by INDAM-GNCS. Furthermore, the work of Stefano Serra-Capizzano wss funded from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 955701. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and Belgium, France, Germany, Switzerland. Stefano Serra-Capizzano is also grateful for the support of the Laboratory of Theory, Economics and Systems – Department of Computer Science at Athens University of Economics and Business.
Authors’ contributions
S. Hon developed the methodology, wrote the main manuscript text, and conducted the main numerical experiments. P. Y. Fung improved the numerical experiments and modified the manuscript text. J. Dong modified the manuscript text. S. Serra-Capizzano validated the methodology and modified the manuscript text. All authors reviewed the manuscript.
References
- (1) Barakitis, N., Ekstrom, S. E., Vassalos, P.: Preconditioners for fractional diffusion equations based on the spectral symbol, Numerical Linear Algebra with Applications, e2441 (2022) https://doi.org/10.1002/nla.2441
- (2) Barbarino, G.: A systematic approach to reduced GLT. BIT Numerical Mathematics. 1–63 (2021) https://doi.org/10.1007/s10543-021-00896-7
- (3) Benzi, M., Golub, G. H.: Bounds for the entries of matrix functions with applications to preconditioning. BIT Numerical Mathematics. 39(3), 417–438 (1999) https://link.springer.com/article/10.1023/A:1022362401426
- (4) Bertaccini, D.: Reliable preconditioned iterative linear solvers for some integrators. Numerical Linear Algebra with Applications. 8(2), 111–125 (2001) https://doi.org/10.1002/1099-1506(200103)8:2<111::AID-NLA234>3.0.CO;2-Q
- (5) Bertaccini, D., Ng, M. K.: Block -circulant preconditioners for the systems of differential equations. Calcolo. 40(2), 71–90 (2003) https://link.springer.com/article/10.1007/s100920300004
- (6) Bini, D., Benedetto, F.: A new preconditioner for the parallel solution of positive definite Toeplitz systems. in Proc. Second ACM Symp. on Parallel Algorithms and Architectures, pp. 220–223 (1990). https://doi.org/10.1145/97444.97688
- (7) Bini, D., Capovani, M.: Spectral and computational properties of band symmetric Toeplitz matrices. Linear Algebra and Its Applications. 52, 99–126 (1983) https://doi.org/10.1016/0024-3795(83)90009-5
- (8) Brandts, J. H., da Silva, R. R.: Computable eigenvalue bounds for rank- perturbations. Linear Algebra and Its Applications. 432(12), 3100–3116 (2010) https://doi.org/10.1016/j.laa.2010.02.010
- (9) Chan, R. H., Ng, M. K.: Conjugate gradient methods for Toeplitz systems. SIAM Review. 38(3), 427–482 (1996) https://doi.org/10.1137/s0036144594276474
- (10) Chan, T. F.: An optimal circulant preconditioner for Toeplitz systems. SIAM Journal on Scientific and Statistical Computing. 9(4), 766–771 (1988) https://doi.org/10.1137/0909051
- (11) Cocquet, P. H., Gander, M. J.: How large a shift is needed in the shifted Helmholtz preconditioner for its effective inversion by multigrid?. SIAM Journal on Scientific and Statistical Computing, 39(2), A438–A478 (2017) https://doi.org/10.1137/15M102085X
- (12) Danieli, F., Wathen, A. J.: All-at-once solution of linear wave equations. Numerical Linear Algebra with Applications. e2386, (2021) https://doi.org/10.1002/nla.2386
- (13) Di Benedetto, F.: Analysis of preconditioning techniques for ill-conditioned Toeplitz matrices. SIAM Journal on Scientific Computing. 16(3), 682–697 (1995) https://doi.org/10.1137/0916041
- (14) Di Benedetto, F.: Solution of Toeplitz normal equations by sine transform based preconditioning. Linear Algebra and Its Applications. 285(1-3), 229–255 (1998) https://doi.org/10.1016/s0024-3795(98)10115-5
- (15) Di Benedetto, F., Serra-Capizzano, S.: A unifying approach to abstract matrix algebra preconditioning. Numerische Mathematik. 82, 57–90 (1999) https://doi.org/10.1007/s002110050411
- (16) Dobrev, V. A., Kolev, Tz., Petersson, N.A., Schroder J.B.: Two-level convergence theory for multigrid reduction in time (MGRIT). SIAM Journal on Scientific Computing. 39(5), S501–S527 (2017) https://doi.org/10.1137/16m1074096
- (17) Donatelli, M., Mazza, M., Serra-Capizzano, S.: Spectral analysis and structure preserving preconditioners for fractional diffusion equations. Journal of Computational Physics. 307, 262–279 (2016) https://doi.org/10.1016/j.jcp.2015.11.061
- (18) Falgout, R. D., Friedhoff, S., Kolev, T. V., MacLachlan, S. P., Schroder J. B.: Parallel time integration with multigrid. SIAM Journal on Scientific Computing. 36(6), C635–C661 (2014) https://doi.org/10.1002/pamm.201410456
- (19) Ferrari, P., Furci, I., Hon, S., Mursaleen, M. A., Serra-Capizzano, S.: The eigenvalue distribution of special 2-by-2 block matrix-sequences with applications to the case of symmetrized Toeplitz structures. SIAM Journal on Matrix Analysis and Applications. 40(3), 1066–1086 (2019) https://doi.org/10.1137/18m1207399
- (20) Gander, M.J.: 50 years of time parallel time integration. Multiple Shooting and Time Domain Decomposition Methods. 69–113. Springer, Cham (2015). Series Editors: T. Carraro, M. Geiger, S. Körkel, R. Rannacher https://doi.org/10.1007/978-3-319-23321-5_3
- (21) Gander, M. J., Halpern, L., Ryan, J., Tran, T. T. B.: A direct solver for time parallelization. Domain Decomposition Methods in Science and Engineering XXII. 491–499. Springer, Cham (2016). Series Editors: T, Dickopf, M, J. Gander, L. Halpern, R. Krause, L.F. Pavarino https://doi.org/10.1007/978-3-319-18827-0_50
- (22) Gander, M. J., Neumüller, M.: Analysis of a new space-time parallel multigrid algorithm for parabolic problems. SIAM Journal on Scientific Computing. 38(4) A2173–A2208 (2016) https://doi.org/10.1137/15m1046605
- (23) Gander, M. J., Vandewalle, S.: Analysis of the parareal time-parallel time-integration method. SIAM Journal on Scientific Computing. 29(2) 556–578 (2007) https://doi.org/10.1137/05064607x
- (24) Garoni, C., Serra-Capizzano, S.: Generalized locally Toeplitz sequences: Theory and applications. Volume 1, Springer, Cham (2017) https://doi.org/10.1007/978-3-319-53679-8_8
- (25) Garoni, C., Serra-Capizzano, S.: Generalized locally Toeplitz sequences: Theory and applications. Volume 2, Springer, Cham (2018) https://doi.org/10.1007/978-3-030-02233-4
- (26) Goddard, A., Wathen, A.: A note on parallel preconditioning for all-at-once evolutionary PDEs. Electronic Transactions on Numerical Analysis. 51, 135–150 (2019) https://doi.org/10.1553/etna_vol51s135
- (27) Hon, S.: Optimal block circulant preconditioners for block Toeplitz systems with application to evolutionary PDEs. Journal of Computational and Applied Mathematics. 407, 113965 (2021). https://doi.org/10.1016/j.cam.2021.113965
- (28) Hon, S., Serra-Capizzano, S., Wathen, A.: Band-Toeplitz preconditioners for ill-conditioned Toeplitz systems. BIT Numerical Mathematics. (2021) https://doi.org/10.1007/s10543-021-00889-6
- (29) Horton, G., Vandewalle, S.: A space-time multigrid method for parabolic partial differential Equations. SIAM Journal on Scientific Computing. 16(4), 848–864 (1995). https://doi.org/10.1137/0916050
- (30) Lin, X. L., Ng, M. K.: An all-at-once preconditioner for evolutionary partial differential equations. SIAM Journal on Scientific Computing. 43(4), A2766–A2784 (2021) https://doi.org/10.1137/20m1316354
- (31) Lin, X. L., Ng, M. K., Zhi, Y.: A parallel-in-time two-sided preconditioning for all-at-once system from a non-local evolutionary equation with weakly singular kernel. Journal of Computational Physics. 434, 110221 (2021) https://doi.org/10.1016/j.jcp.2021.110221
- (32) Lions, J. L., Maday, Y., Turinici, G.: A "parareal" in time discretization of PDE’s. Comptes rendus de l’Académie des sciences. Série 1, Mathématique. 332(7), 661–668 (2001)
- (33) Liu, J., Wu, S. L.: A fast block -circulant preconditioner for all-at-once systems from wave equations. SIAM Journal on Matrix Analysis and Applications. 41(4), 1912–1943 (2021) https://doi.org/10.1137/19m1309869
- (34) Mazza, M., Pestana, J.: Spectral properties of flipped Toeplitz matrices and related preconditioning. BIT Numerical Mathematics. 59, 463–482 (2018) https://doi.org/10.1007/s10543-018-0740-y
- (35) McDonald, E., Hon, S., Pestana, J., Wathen, A.: Preconditioning for nonsymmetry and time-dependence. In Lecture Notes in Computational Science and Engineering. 116, 81–91 (2017). Springer International Publishing. https://doi.org/10.1007/978-3-319-52389-7_7
- (36) McDonald, E., Pestana, J., Wathen, A.: Preconditioning and iterative solution of all-at-once systems for evolutionary partial differential equations, SIAM Journal on Scientific Computing, 40(2) A1012–A1033 (2018) https://doi.org/10.1137/16m1062016
- (37) Ng, M. K.: Iterative Methods for Toeplitz Systems. Numerical Mathematics and Scientific Computation. Oxford University Press, New York (2004)
- (38) Ng, M. K., Pan, J.: Approximate inverse circulant-plus-diagonal preconditioners for Toeplitz-plus-diagonal matrices. SIAM Journal on Scientific Computing, 32(3), 1442–1464 (2010) https://doi.org/10.1137/080720280
- (39) Serra-Capizzano, S.: The GLT class as a generalized Fourier analysis and applications. Linear Algebra and Its Applications. 419(1), 180–233 (2006) https://doi.org/10.1016/j.laa.2006.04.012
- (40) Serra-Capizzano, S.: Toeplitz preconditioners constructed from linear approximation processes. SIAM Journal on Matrix Analysis and Applications. 20(2), 446–465 (1999) https://doi.org/10.1137/s0895479897316904
- (41) Serra-Capizzano, S.: Matrix algebra preconditioners for multilevel Toeplitz matrices are not superlinear. Special issue on structured and infinite systems of linear equations. Linear Algebra and Its Applications. 343/344, 303–319 (2002) https://dx.doi.org/10.1016/S0024-3795(01)00361-5
- (42) Serra-Capizzano, S., Tilli, P.: On unitarily invariant norms of matrix-valued linear positive operators. Journal of Inequalities and Applications. 7(3), 309–330 (2002) https://doi.org/10.1155/s1025583402000152
- (43) Serra-Capizzano, S., Tyrtyshnikov., E.: How to prove that a preconditioner cannot be superlinear, Mathematics of Computation, 72(243), 1305–1316 (2003) https://doi.org/10.1090/s0025-5718-03-01506-0
- (44) Strang, G.: A Proposal for Toeplitz Matrix Calculations, Studies in Applied Mathematics, 74(2), 171–176 (1986) https://doi.org/10.1002/sapm1986742171
- (45) Wathen, A.: Preconditioning. Acta Numerica. 24, 329–376 (2015) https://doi.org/10.1017/s0962492915000021
- (46) Wu, S. L., Zhou, T.: Parallel implementation for the two-stage SDIRK methods via diagonalization. Journal of Computational Physics. 428, 110076 (2021) https://doi.org/10.1016/j.jcp.2020.110076