Shifted Lanczos method for
quadratic forms with Hermitian matrix resolvents
Abstract
Quadratic forms of Hermitian matrix resolvents involve the solutions of shifted linear systems. Efficient iterative solutions use the shift-invariance property of Krylov subspaces. The Hermitian Lanczos method reduces a given vector and matrix to a Jacobi matrix (real symmetric tridiagonal matrix with positive super and sub-diagonal entries) and approximates the quadratic form using the Jacobi matrix. This study develops a shifted Lanczos method that deals directly with the Hermitian matrix resolvent. We derive a matrix representation of a linear operator that approximates the resolvent by solving a Vorobyev moment problem associated with the shifted Lanczos method. We show that an entry of the Jacobi matrix resolvent can approximate the quadratic form, matching the moments. We give a sufficient condition such that the method does not break down, an error bound, and error estimates. Numerical experiments on matrices drawn from real-world applications compare the proposed method with previous methods and show that the proposed method outperforms well-established methods in solving some problems.
1 Introduction
Consider the computation of quadratic forms
(1.1) $u^{H}(A - z_j I)^{-1} u, \quad j = 1, 2, \dots, m,$
where $u^{H}$ denotes the complex conjugate transpose of a vector $u \in \mathbb{C}^{n}$, $A \in \mathbb{C}^{n \times n}$ is a Hermitian matrix that may be indefinite, and $z_j \in \mathbb{C}$. Here, $A - z_j I$ is assumed to be invertible. If $z_j$ is not real, then $A - z_j I$ is not Hermitian.
A straightforward approach to the quadratic form (1.1) is to solve the shifted linear systems
(1.2) $(A - z_j I)\, x_j = u, \quad j = 1, 2, \dots, m,$
and compute $u^{H} x_j$ for $j = 1, 2, \dots, m$. The development of efficient solutions for shifted linear systems (1.2) with real symmetric $A$ has been ongoing for two decades; these solutions include shifted Krylov subspace methods such as variants of the conjugate orthogonal conjugate gradient (COCG) method [45], a version of the quasi-minimal residual (QMR_SYM) method [5], and the conjugate orthogonal conjugate residual (COCR) method [38] proposed in [43], [37], and [39], respectively. These methods use the shift-invariance property of the Krylov subspace and the complex symmetry of $A - z_j I$ for efficient formulations. A shifted Krylov subspace method forms basis vectors of the Krylov subspace for a shifted matrix with a particular shift by using a short-term recurrence and simultaneously determines an iterate for that linear system and associated iterates for the other shifted linear systems, for the different values $z_1, z_2, \dots, z_m$, without additional matrix-vector products. Here, we call such a representative linear system a seed linear system. Typically, shifted Krylov subspace methods take the iterative residual vectors of the shifted linear systems to be collinear with that of the seed linear system. These methods may suffer from breakdown, i.e., division by zero, although this rarely occurs in practice. Previous studies on shifted CG and MINRES methods [7, 26] focused on real shifts. The technique in [26] improves the convergence of the methods with preconditioning under the assumption that the factorization or the use of a direct solver for the shifted symmetric matrix can be performed efficiently. Previous studies on a shifted Lanczos method [6] for Hermitian positive definite $A$ with a complex shift derived its error bound and estimates. Extensions of the MINRES method [30] to the shifted linear system (1.2) work with Hermitian $A$ and a complex shift [11, 36]. See [19] for more shifted Krylov subspace methods.
The Padé approximation via the Lanczos process (PVL) [4] has attracted interest for the case $|z| > \rho(A)$, where $\rho(A)$ is the spectral radius of a square matrix $A$, whereas our interest includes the contrasting case $|z| \leq \rho(A)$.
This study focuses on exploiting the advantage of a Krylov subspace method in efficiently approximating the Hermitian matrix resolvent and directly computing the quadratic form (1.1) without solving the shifted linear systems (1.2). The key feature of our approach is the development of a shifted Lanczos method. The Hermitian Lanczos method projects an original model represented by $A$ and $u$ to a lower-order model and matches the lowest-order moments of the original model with those of the reduced model. We show that the shifted Lanczos method retains this property. The Vorobyev moment problem [44, 2, 40, 24] enables a concise derivation of the method. This problem gives a matrix representation of a linear operator that represents the reduced model to approximate the resolvent $(A - z_j I)^{-1}$. We show that an entry of the Jacobi matrix resolvent can approximate the quadratic form (1.1) with eight additional operations on top of the Lanczos method for each shift by using a recursive formula. Moreover, we give a sufficient condition such that the proposed method does not break down. Breakdown may occur in the COCG, COCR, and QMR_SYM methods, although this is rarely seen. Furthermore, we derive an error bound and develop practical error estimates of the shifted Lanczos method.
Practical applications that can benefit from this development include chemistry and physics [24, Section 3.9], eigensolvers using complex moments [35], the computation of Green’s function for a many-particle Hamiltonian [46], the stochastic estimation of the number of eigenvalues [25], samplers for determinantal point processes [23] (see also references therein), the approximation of Markov chains in Bayesian sampling [20], and computational quantum physics [19]. An extension of the single-vector case in (1.1) for a real vector $u$ to the multiple-vector case $U = [u_1, u_2, \dots, u_s] \in \mathbb{R}^{n \times s}$, namely,
(1.3) $U^{\mathsf T} (A - z_j I)^{-1} U, \quad j = 1, 2, \dots, m,$
can be reduced to the solutions of bilinear forms $u_p^{\mathsf T} (A - z_j I)^{-1} u_q$, $p, q = 1, 2, \dots, s$, where $u_q$ is the $q$th column of $U$. The bilinear form can be further reduced to the quadratic form
(1.4) $u_p^{\mathsf T} (A - z_j I)^{-1} u_q = \frac{1}{4} \left[ v^{\mathsf T} (A - z_j I)^{-1} v - w^{\mathsf T} (A - z_j I)^{-1} w \right],$
where $v = u_p + u_q$ and $w = u_p - u_q$ (cf. [10, p. 114]). This type of problem arises in the analysis of dynamical systems [1].
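The reduction of a bilinear form to two quadratic forms via (1.4) is the standard polarization identity; it relies on the resolvent being symmetric, which holds when $A$ is real symmetric. A quick numerical check (illustrative; the matrix, vectors, and shift are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, z = 6, 0.3 + 0.9j

B = rng.standard_normal((n, n))
A = (B + B.T) / 2                       # real symmetric matrix
u, w = rng.standard_normal(n), rng.standard_normal(n)

# Dense resolvent, for checking only; A - zI is complex symmetric,
# so its inverse R is symmetric and u^T R w = w^T R u.
R = np.linalg.inv(A - z * np.eye(n))

bilinear = u @ R @ w
# Polarization: 4 u^T R w = (u+w)^T R (u+w) - (u-w)^T R (u-w).
quad = ((u + w) @ R @ (u + w) - (u - w) @ R @ (u - w)) / 4

print(np.isclose(bilinear, quad))  # True
```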
The rest of this paper is organized as follows. In Section 2, we review the Lanczos method and its moment-matching property. In Section 3, we describe a shifted Lanczos method for computing the quadratic forms (1.1), its moment-matching property, and implementation; discuss related methods; and give a sufficient breakdown-free condition, error bound, and error estimates. In Section 4, we present the results of numerical experiments in which the shifted Lanczos method is compared with previous methods and illustrate the developed error estimates. In Section 5, we conclude the paper.
2 Lanczos method
We review the Lanczos method [22] for Hermitian $A \in \mathbb{C}^{n \times n}$ and $u \in \mathbb{C}^{n}$. Algorithm 2.1 gives the procedures of the Lanczos method.
Here, $\|\cdot\|$ denotes the Euclidean norm. Denote the Lanczos decomposition of $A$ by
(2.1) $A V_k = V_k T_k + \beta_{k+1} v_{k+1} e_k^{\mathsf T},$
where the columns of $V_k = [v_1, v_2, \dots, v_k] \in \mathbb{C}^{n \times k}$ form an orthonormal basis of the Krylov subspace $\mathcal{K}_k(A, u) = \operatorname{span} \{ u, A u, \dots, A^{k-1} u \}$ and
(2.2) $v_1 = u / \|u\|,$
(2.3) $T_k = \begin{bmatrix} \alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_k \\ & & \beta_k & \alpha_k \end{bmatrix} \in \mathbb{R}^{k \times k}.$
Here, $e_k$ is the $k$th Euclidean basis vector and $T_k$ is the Jacobi matrix (real symmetric tridiagonal matrix with positive super- and subdiagonal entries). Then, $T_k = V_k^{H} A V_k$ holds.
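The recurrence in Algorithm 2.1 can be sketched as follows (hedged: variable names are ours, no reorthogonalization is performed, and the check uses small dimensions so that rounding does not obscure the identities):

```python
import numpy as np

def lanczos(A, u, k):
    """Plain Hermitian Lanczos: returns V_k (n x k) and the Jacobi matrix T_k."""
    n = len(u)
    V = np.zeros((n, k), dtype=complex)
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    v_prev, b = np.zeros(n, dtype=complex), 0.0
    v = u / np.linalg.norm(u)
    for j in range(k):
        V[:, j] = v
        w = A @ v - b * v_prev
        alpha[j] = (v.conj() @ w).real   # diagonal entry, real in exact arithmetic
        w -= alpha[j] * v
        b = np.linalg.norm(w)
        if j < k - 1:
            beta[j] = b
        v_prev, v = v, w / b
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T

rng = np.random.default_rng(2)
n, k = 20, 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)

V, T = lanczos(A, u, k)
print(np.allclose(V.conj().T @ V, np.eye(k)))  # orthonormal basis
print(np.allclose(V.conj().T @ A @ V, T))      # T_k = V_k^H A V_k
```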
2.1 Hamburger moment problem
The Lanczos method projects the original model given by a Hermitian matrix $A$ and an initial vector $u$ to a lower-order model given by $T_k$ and $e_1$, matching the lowest-order moments of the original model and those of the reduced model, i.e., it approximates a given matrix via moment matching. A thorough derivation of the well-known moment-matching property (2.12) presented below is given in [24, Chapter 3]. We review this for completeness. Given a sequence of scalars $m_0, m_1, m_2, \dots$, the problem of finding a nondecreasing real distribution function $\omega(\lambda)$, $\lambda \in \mathbb{R}$, with $n$ points of increase such that the Riemann–Stieltjes integral is equal to the given sequence of scalars,
(2.4) $\int_{-\infty}^{\infty} \lambda^{i} \, \mathrm{d}\omega(\lambda) = m_i, \quad i = 0, 1, 2, \dots,$
is called the Hamburger moment problem [12, 13, 14, 15]. The motivation for viewing the Lanczos method via the Hamburger moment problem instead of the Stieltjes moment problem is to enable working with indefinite Hermitian matrices with a shift. Note that the Stieltjes moment problem is posed on the positive real axis, and its relation with the Lanczos method for Hermitian positive definite matrices has been well established [24].
The left-hand side of (2.4) is called the $i$th moment with respect to the distribution function $\omega$. Set another moment
(2.5) $m_i = \frac{u^{H} A^{i} u}{\|u\|^{2}}, \quad i = 0, 1, 2, \dots,$
and the distribution function $\omega(\lambda)$ with $n$ points of increase to
(2.6) $\omega(\lambda) = \sum_{\ell : \lambda_\ell \leq \lambda} \omega_\ell,$
associated with weights $\omega_\ell = |q_\ell^{H} u|^{2} / \|u\|^{2}$, $\ell = 1, 2, \dots, n$, where $\lambda_1 < \lambda_2 < \cdots < \lambda_n$ are the eigenvalues of $A$ and $q_1, q_2, \dots, q_n$ are the corresponding unit eigenvectors. Here, for clarity, we assume that the eigenvalues of $A$ are distinct without loss of generality. Thus, the distribution function is connected with the eigenpairs of $A$. Then, we can express the moment (2.5) as the Gauss–Christoffel quadrature and the quadratic form
(2.7) $m_i = \sum_{\ell=1}^{n} \omega_\ell \lambda_\ell^{i}$
(2.8) $\phantom{m_i} = \frac{u^{H} A^{i} u}{\|u\|^{2}}.$
Thus, the solution of (2.4) is given by
(2.9) $\omega^{(k)}(\lambda) = \sum_{\ell : \theta_\ell^{(k)} \leq \lambda} \omega_\ell^{(k)},$
associated with weights $\omega_\ell^{(k)} = |e_1^{\mathsf T} y_\ell^{(k)}|^{2}$, $\ell = 1, 2, \dots, k$, where $\theta_\ell^{(k)}$ are the eigenvalues of $T_k$ and $y_\ell^{(k)}$, $\ell = 1, 2, \dots, k$, are the corresponding unit eigenvectors. Here, the distribution function $\omega^{(k)}$ is connected with the eigenpairs of $T_k$. Because the Gauss–Christoffel quadrature is exact for polynomials up to degree $2k - 1$,
(2.10) $\int_{-\infty}^{\infty} \lambda^{i} \, \mathrm{d}\omega(\lambda) = \int_{-\infty}^{\infty} \lambda^{i} \, \mathrm{d}\omega^{(k)}(\lambda)$
(2.11) $\phantom{\int_{-\infty}^{\infty} \lambda^{i} \, \mathrm{d}\omega(\lambda)} = \sum_{\ell=1}^{k} \omega_\ell^{(k)} \big(\theta_\ell^{(k)}\big)^{i}, \quad i = 0, 1, \dots, 2k - 1,$
the first $2k$ moments match:
(2.12) $m_i = m_i^{(k)}, \quad i = 0, 1, \dots, 2k - 1.$
2.2 Model reduction via Vorobyev moment matching
We can state the problem of moment matching in the language of matrices via the Vorobyev moment problem. To derive a linear operator that reduces the model order of $A$, we follow [44] for the derivation (see also [2, 40]). Let
(2.13) $\mathcal{K}_k(A, u) = \operatorname{span} \{ u, A u, \dots, A^{k-1} u \}$
for $k = 1, 2, \dots$, where $u, A u, \dots, A^{k-1} u$ are assumed to be linearly independent. Then, the Vorobyev moment problem involves determining a sequence of linear operators $A_k$ such that
(2.14) $A_k u = A u, \quad A_k^{2} u = A^{2} u, \quad \dots, \quad A_k^{k-1} u = A^{k-1} u, \quad A_k^{k} u = Q_k A^{k} u$
for $k = 1, 2, \dots$, where $Q_k$ is the orthogonal projector onto $\mathcal{K}_k(A, u)$. A linear operator reducing the model order of $A$ is given by
(2.15) $A_k = Q_k A Q_k$
for $k = 1, 2, \dots$, where the sequence $(A_k)$ is strongly convergent to $A$ [44, Theorem II] (see [2, Section 4.2] for the derivation of (2.15)). Therefore, the first $2k$ moments of the reduced model match those of the original model:
(2.16) $u^{H} A^{i} u = u^{H} A_k^{i} u, \quad i = 0, 1, \dots, 2k - 1.$
We will use this property to derive a shifted Lanczos method in the next section.
3 Shifted Lanczos method
Next, we formulate a shifted Lanczos method to approximate the resolvent in (1.1). For convenience, we omit the subscript $j$ of $z_j$ if no confusion can arise. The application of Vorobyev’s method of moments to the shifted matrix $A - z I$ and vector $u$ gives a matrix representation of a linear operator that represents the reduced model to approximate the resolvent $(A - z I)^{-1}$. Let
(3.1) $\mathcal{K}_k(A - z I, u) = \operatorname{span} \{ u, (A - z I) u, \dots, (A - z I)^{k-1} u \}$
for $k = 1, 2, \dots$, where $u, (A - z I) u, \dots, (A - z I)^{k-1} u$ are assumed to be linearly independent. Then, the Vorobyev moment problem involves determining a sequence of linear operators $(A - z I)_k$ such that
(3.2) $(A - z I)_k u = (A - z I) u, \quad \dots, \quad (A - z I)_k^{k-1} u = (A - z I)^{k-1} u, \quad (A - z I)_k^{k} u = Q_k (A - z I)^{k} u$
for $k = 1, 2, \dots$, where $Q_k$ is the orthogonal projector onto $\mathcal{K}_k(A - z I, u)$. We first solve the problem for the linear operator $(A - z I)_k$. Equations (3.2) can be written as
(3.3) |
for , , . An arbitrary vector is expanded as
(3.4) |
where . Multiplying both sides by gives
(3.5) |
Projecting this onto $\mathcal{K}_k(A, u)$ (using the shift-invariance property) gives
(3.6) | ||||
(3.7) | ||||
(3.8) | ||||
(3.9) |
Here, the first equality is due to equations (3.1), (3.2), and
(3.10) | ||||
(3.11) |
Hence, (3.9) shows that the reduced operator, denoted $(A - z I)_k$, coincides with $Q_k (A - z I) Q_k$ on $\mathcal{K}_k(A, u)$. Because $Q_k x \in \mathcal{K}_k(A, u)$ for any vector $x \in \mathbb{C}^{n}$, we can obtain the expression
(3.12) $(A - z I)_k = Q_k (A - z I) Q_k$
by extending the domain to the whole space $\mathbb{C}^{n}$. Note that the sequence $\big((A - z I)_k\big)$ is strongly convergent to $A - z I$ [44, Theorem II]. The expression (3.12) can be obtained from the shifted Lanczos decomposition [6, Lemma 2.1 (i)]
(3.13) $(A - z I) V_k = V_k (T_k - z I) + \beta_{k+1} v_{k+1} e_k^{\mathsf T}.$
Multiplying this by $V_k^{H}$ from the left gives
(3.14) $V_k^{H} (A - z I) V_k = T_k - z I.$
This gives the orthogonally projected restriction
(3.15) $Q_k (A - z I) Q_k = V_k V_k^{H} (A - z I) V_k V_k^{H}$
(3.16) $\phantom{Q_k (A - z I) Q_k} = V_k (T_k - z I) V_k^{H}.$
However, this derivation does not offer insight into the strong convergence.
By using the expression $Q_k (A - z I) Q_k$ for the reduced operator, we show that the moments of the original model with $A - z I$ and $u$ match those of the reduced model. By using the moment-matching property (2.16) and the binomial formula, we have
(3.17) $u^{H} (A - z I)^{i} u = \sum_{p=0}^{i} \binom{i}{p} (-z)^{i-p} u^{H} A^{p} u, \quad i = 0, 1, \dots, 2k - 1.$
Therefore, the reduced model matches the first $2k$ moments of the original model:
(3.18) $u^{H} (A - z I)^{i} u = u^{H} \big( Q_k (A - z I) Q_k \big)^{i} u,$
(3.19) $i = 0, 1, \dots, 2k - 1.$
Note that the left-hand side is equal to
(3.20) $\|u\|^{2} \int_{-\infty}^{\infty} (\lambda - z)^{i} \, \mathrm{d}\omega(\lambda)$
with distribution function (2.6) and the right-hand side is equal to
(3.21) $\|u\|^{2} \int_{-\infty}^{\infty} (\lambda - z)^{i} \, \mathrm{d}\omega^{(k)}(\lambda)$
with distribution function (2.9) (cf. [42, Chapter 15]). Thus, the Hamburger moment problem (Section 2.1) gives these connections, which carry over the expressions of the moments (2.7) and the exactness of the quadrature (2.10) to the shifted case, whereas Vorobyev’s method of moments translates these connections into the language of matrices.
Consider the approximation of $(A - z I)^{-1}$. Because the matrix representation of the inverse of the reduced-order operator restricted onto $\mathcal{K}_k(A, u)$ [16, p. 79] is given by
(3.22) $V_k (T_k - z I)^{-1} V_k^{H},$
an approximation of $(A - z I)^{-1}$ is given by $V_k (T_k - z I)^{-1} V_k^{H}$. Therefore, we obtain
(3.23) $u^{H} (A - z I)^{-1} u \approx u^{H} V_k (T_k - z I)^{-1} V_k^{H} u = \|u\|^{2} \, e_1^{\mathsf T} (T_k - z I)^{-1} e_1.$
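The resulting approximation is $\|u\|^2$ times the $(1,1)$ entry of $(T_k - zI)^{-1}$, and one run of the Lanczos method serves every shift. A sketch (hedged: names are ours, the matrix is a small diagonal test case, and the resolvent of $T_k$ is formed densely for clarity rather than by the recursive formula of Section 3.1):

```python
import numpy as np

def lanczos_T(A, u, k):
    """k steps of the Hermitian Lanczos method; returns the Jacobi matrix T_k."""
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    v_prev, b = np.zeros_like(u), 0.0
    v = u / np.linalg.norm(u)
    for j in range(k):
        w = A @ v - b * v_prev
        alpha[j] = v @ w
        w -= alpha[j] * v
        b = np.linalg.norm(w)
        if j < k - 1:
            beta[j] = b
        v_prev, v = v, w / b
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

n, k = 40, 20
A = np.diag(np.linspace(1.0, 10.0, n))   # Hermitian test matrix (diagonal)
rng = np.random.default_rng(3)
u = rng.standard_normal(n)
T = lanczos_T(A, u, k)                   # one Jacobi matrix serves every shift

ok = []
for z in [0.5 + 1.0j, -2.0 + 0.1j]:
    approx = np.linalg.norm(u)**2 * np.linalg.inv(T - z * np.eye(k))[0, 0]
    exact = u @ np.linalg.solve(A - z * np.eye(n), u)
    ok.append(abs(approx - exact) <= 1e-6 * abs(exact))
print(all(ok))  # True
```

Note that the expensive work (matrix-vector products with $A$) is done once; each additional shift only touches the small matrix $T_k$.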
3.1 Implementation
The quantity $e_1^{\mathsf T} (T_k - z I)^{-1} e_1$ in (3.23) is the $(1,1)$ entry of the resolvent of a successively enlarging Jacobi matrix. An efficient recursive formula for computing such an entry was developed in [10, Section 3.4]. The formula is given by
(3.24) |
starting with , , and , where
(3.25) | ||||
(3.26) | ||||
(3.27) | ||||
(3.28) |
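One standard way to update this entry as $k$ grows follows Golub and Meurant's determinant recurrences (hedged: this is a sketch of that development, not necessarily the exact formula (3.24)-(3.28)). By Cramer's rule, $e_1^{\mathsf T}(T_k - zI)^{-1}e_1 = \delta_k / d_k$, where $d_k = \det(T_k - zI)$ and $\delta_k$ is the determinant of the trailing $(k-1) \times (k-1)$ block; both satisfy the three-term recurrence for tridiagonal determinants. (The ratios can over- or underflow for large $k$, which is why a ratio-based update is preferred in practice.)

```python
import numpy as np

# Illustrative Jacobi matrix data: diagonal alpha, positive off-diagonal beta.
alpha = np.array([2.0, -1.0, 0.5, 3.0, 1.5])
beta = np.array([1.0, 0.8, 1.2, 0.6])
z = 0.4 + 1.1j
k = len(alpha)

# d_j = det(T_j - zI); delta_j = det of the trailing block of T_j - zI.
d_prev2, d_prev1 = 1.0, alpha[0] - z        # d_0, d_1
delta_prev2, delta_prev1 = 0.0, 1.0         # pads the recurrence; delta_1 = 1
f = [1.0 / d_prev1]                         # f_1 = delta_1 / d_1
for j in range(1, k):
    d = (alpha[j] - z) * d_prev1 - beta[j - 1]**2 * d_prev2
    delta = (alpha[j] - z) * delta_prev1 - beta[j - 1]**2 * delta_prev2
    f.append(delta / d)                     # (1,1) entry of (T_{j+1} - zI)^{-1}
    d_prev2, d_prev1 = d_prev1, d
    delta_prev2, delta_prev1 = delta_prev1, delta

# Check against a direct inverse of the full Jacobi matrix.
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
direct = np.linalg.inv(T - z * np.eye(k))[0, 0]
print(np.isclose(f[-1], direct))  # True
```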
We summarize the procedures for approximating quadratic forms (1.1) in Algorithm 3.1. Quantities denoted with the superscript $(j)$ correspond to the shift $z_j$. The difference from Algorithm 2.1 is the addition of Lines 2 and 7. In particular, when $A$ is a real symmetric matrix and $u$ is a real vector in Algorithm 3.1, only real arithmetic is needed to compute Lines 1, 4, and 6, whereas Lines 2 and 7 require complex arithmetic in general. Note that $\alpha_k$ is a real number in theory. However, when $A$ is not real but complex Hermitian, due to rounding errors in finite precision arithmetic, the imaginary part of the computed $\alpha_k$ may grow and affect the accuracy. Therefore, it is recommended to explicitly replace the numerically computed $\alpha_k$ with its real part.
We compare the shifted Lanczos method with related methods in terms of computational cost. Algorithms 3.2, 3.3, and 3.4 give simple modifications of the shifted COCG, COCR, and MINRES methods, respectively, for computing quadratic forms (1.1). The modifications are given by applying $u^{H}$ to the $k$th iterate and associated vectors. They produce approximations to the quadratic forms (1.1). In the shifted COCG and COCR methods, if a satisfactory seed iterate is obtained, one can use the seed switching technique [46] to choose a different shift as the seed shift. Similarly to the shifted MINRES method [36, Section 2.3], the shifted Lanczos method does not need such a seed switching technique.
Remark.
The modifications applied to the shifted COCG, COCR, and MINRES methods for computing quadratic forms to derive Algorithms 3.2, 3.3, and 3.4, respectively, can also be applied to the shifted CG method to compute quadratic forms. However, such a shifted modification of the CG method [19, Section 2.2] is mathematically equivalent to the shifted Lanczos method for Hermitian positive definite $A - z I$, whereas it is not mathematically equivalent to the shifted Lanczos method in general. Such a shifted CG method may not work in the case where $z_j$ is not real, because the shifted coefficient matrix $A - z_j I$ is not Hermitian for $z_j \in \mathbb{C} \setminus \mathbb{R}$. With the connection between the tridiagonal entries of the Jacobi matrix and the scalar coefficients of vectors in the CG method [33, Section 6.7], we may formulate the shifted CG method for quadratic forms, which is mathematically equivalent to the shifted Lanczos method for quadratic forms. See [27, Section 9.6] for different aspects of the shifted CG method.
For simplicity of comparison, we count basic scalar, vector, and matrix operations. The Lanczos method (Algorithm 2.1) needs one vector scale (scale), one dot product (dot), one vector norm (norm), two scalar–vector additions (axpy), and one matrix–vector product (matvec) per iteration. Table 3.1 gives the number of basic vector and matrix operations of the shifted methods. Table 3.2 gives the number of scalar operations for each shift per iteration. These tables show that in terms of the cost per iteration, the shifted Lanczos method is the cheapest of the methods compared.
Method | scale | dot | norm | axpy | matvec |
---|---|---|---|---|---|
Shifted Lanczos | 1 | 1 | 1 | 2 | 1 |
Shifted COCG | 1 | 1 | 1 | 2 | 1 |
Shifted COCR | 0 | 3 | 0 | 2 | 1 |
Shifted MINRES | 1 | 2 | 1 | 2 | 1 |
Method: name of the method, scale: vector scale, dot: dot product, norm: vector norm, axpy: scalar–vector addition, matvec: matrix–vector product.
Method | add | mult | div | sqrt | Total
---|---|---|---|---|---
Shifted Lanczos | 3 | 4 | 1 | 0 | 8 |
Shifted COCG | 6 | 9 | 3 | 0 | 18 |
Shifted COCR | 6 | 8 | 3 | 0 | 17 |
Shifted MINRES | 11 | 20 | 3 | 1 | 35 |
Method: name of the method, add: addition, mult: multiplication, div: division, sqrt: square root, Total: total number of operations.
3.2 Breakdown-free condition
A breakdown resulting from division by zero may occur in Line 7 of Algorithm 3.1 before a solution is obtained. Therefore, we give a sufficient condition such that the shifted Lanczos method does not break down. Let $d_k = \det(T_k - z I)$ for convenience, where $\det(\cdot)$ denotes the determinant of a matrix. For convenience, we omit the superscript $(j)$ from the quantities given in Algorithm 3.1 and prepare lemmas on shifted versions of well-known properties of the Jacobi matrix.
Lemma 3.1.
Let and be defined as above. Assume that holds for , , , , . If for , then we have .
Proof.
Then, the breakdown-free condition is described as follows.
Theorem 3.2.
Let $d_k$ be defined as above. Let $\lambda_{\min}$ and $\lambda_{\max}$ be the smallest and largest eigenvalues of $A$, respectively. If $z$ satisfies $z \notin [\lambda_{\min}, \lambda_{\max}]$, then $d_k \neq 0$ holds for $k = 1, 2, \dots$.
Proof.
Theorem 3.2 shows that the shifted Lanczos method does not break down whenever each shift $z_j$ satisfies $z_j \notin [\lambda_{\min}, \lambda_{\max}]$. The condition in Theorem 3.2 is not necessarily equivalent to assuming that the shifted matrix is positive or negative definite because $z$ may have a nonzero imaginary part. The shifted MINRES methods [11, 36] do not break down for nonsingular $A - z_j I$, whereas the shifted COCG and COCR methods may break down.
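The mechanism behind the condition is easy to verify numerically: $T_k = V_k^{H} A V_k$ with orthonormal $V_k$, so the eigenvalues of $T_k$ lie in $[\lambda_{\min}, \lambda_{\max}]$, and any shift outside that interval (in particular, any shift with nonzero imaginary part) keeps every $T_k - zI$ nonsingular. A small check (illustrative values; an arbitrary orthonormal $V$ suffices for the eigenvalue-inclusion argument, even though the Lanczos $V_k$ additionally makes $T_k$ tridiagonal):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 15, 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                          # real symmetric, generally indefinite
lam = np.linalg.eigvalsh(A)
lam_min, lam_max = lam[0], lam[-1]

# Projected matrix T = V^T A V for an orthonormal V (Rayleigh-quotient bound).
V, _ = np.linalg.qr(rng.standard_normal((n, k)))
T = V.T @ A @ V
theta = np.linalg.eigvalsh(T)
print(theta[0] >= lam_min - 1e-12 and theta[-1] <= lam_max + 1e-12)  # True

# For a shift with nonzero imaginary part, T - zI is normal, so its smallest
# singular value is min_i |theta_i - z| >= |Im z| > 0: no breakdown.
z = 0.2 + 0.5j
smin = np.linalg.svd(T - z * np.eye(k), compute_uv=False)[-1]
print(smin >= abs(z.imag))  # True
```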
Projection methods for symmetric or Hermitian eigenproblems [34, 31] typically take shift points for quadrature points in a circle and can circumvent taking quadrature points on the real line. Therefore, the shifted Lanczos method can avoid breakdown when applied to quadratic forms in these methods with particular choices of quadrature points.
3.3 Convergence bound
The approximation of the quadratic form by using the shifted Lanczos method can be viewed as a way of solving the shifted linear system $(A - z I) x = u$ by using the same method and computing $u^{H} x_k$, where $x_k = \|u\| V_k (T_k - z I)^{-1} e_1$. Concretely, to determine the $k$th iterate $x_k \in \mathcal{K}_k(A, u)$ of the Lanczos method for the linear system $(A - z I) x = u$, the method imposes the Galerkin condition
(3.34) $u - (A - z I) x_k \perp \mathcal{K}_k(A, u).$
The iterate $x_k$ is the same as the CG iterate if $A - z I$ is Hermitian positive definite; otherwise, they may be different. The absolute error of the shifted Lanczos method for quadratic forms for the $k$th iteration is
(3.35) $\big| u^{H} (A - z I)^{-1} u - \|u\|^{2} \, e_1^{\mathsf T} (T_k - z I)^{-1} e_1 \big| = \big| u^{H} (x - x_k) \big|, \qquad x = (A - z I)^{-1} u,$
and its upper bound is given by using the Cauchy–Schwarz inequality:
(3.36) $\big| u^{H} (x - x_k) \big| \leq \|u\| \, \|x - x_k\|$
(3.37) $\phantom{\big| u^{H} (x - x_k) \big|} = \|u\| \, \big\| (A - z I)^{-1} u - x_k \big\|.$
For Hermitian positive definite linear systems, a well-known upper bound [21] of the Lanczos method is related to the $A$-norm of the error, and a recent upper bound depends on the distribution of the eigenvalues [29, Theorem B.1].
The following assertions give an error bound of the shifted Lanczos method for quadratic forms.
Theorem 3.3.
Let $A$ be a Hermitian matrix, $z \in \mathbb{C}$, $u \in \mathbb{C}^{n}$, and let $\Pi_{k-1}$ be the set of all polynomials with degree less than $k$. Then, the absolute error of the shifted Lanczos method for the quadratic form for the $k$th iteration satisfies
(3.38) |
where
(3.39) |
Proof.
If $x$ and $x_k$ are defined as above, then the error norm has the expression
(3.40) | ||||
(3.41) |
For any polynomial $p \in \Pi_{k-1}$, the first factor of the last quantity is bounded as
(3.42) | ||||
(3.43) | ||||
(3.44) |
with and . Here, we used the shifted Lanczos decomposition (3.13) and the identity
(3.45) |
cf. [29, Lemma 4.1]. Because of the interlacing property of eigenvalues [17, Theorem 4.3.17]
(3.46) |
and
(3.47) |
Therefore, the assertion (3.38) holds. ∎
The shifted Lanczos method uses a short-term recurrence and may suffer from rounding errors. Rounding errors in the Lanczos method may quickly lead to loss of orthogonality of the computed basis vectors of the Krylov subspace and thus could cause a delay in convergence. We leave an open problem of giving the bound in finite precision, cf. [29, Theorem 6.2].
3.4 Estimation of error
The bound presented in the previous section gives insights into the convergence of the shifted Lanczos method for quadratic forms but is not practical as a stopping criterion in the iteration of Algorithm 3.1 because it needs to know the condition number of $A - z I$. To check whether a satisfactory solution has been obtained at each iteration of Algorithm 3.1 in practice, we formulate two estimates of the error.
First, the shifted Lanczos decomposition (3.13) gives
(3.48) |
Multiplying this by from the left and from the right gives
(3.49) |
Together with , we have
(3.50) |
With the representation (3.22) and a positive integer $d$, we approximate $u^{H} (A - z I)^{-1} u$ by
(3.51) |
Therefore, we obtain the following estimate:
(3.52) |
This estimate requires $d$ additional iterations and two entries of resolvents of Jacobi matrices, which can be computed recursively [10, Sections 3.2, 3.4] as follows:
(3.53) | ||||
(3.54) |
where is defined in (3.26) and
(3.55) |
Therefore, we may update by
(3.56) |
To update for each shift per iteration, one multiplication by and one division by are required. To compute for , , , for each shift per iteration, we require one minus in , minuses and divisions in if one stores and for , , , computed in Line 7 of Algorithm 3.1. To update for each shift per iteration, multiplications by , one multiplication by , and one division by are required.
Second, the triangular equation gives
(3.57) | ||||
(3.58) |
Under the assumption that the approximation error for the $(k+1)$th iteration is significantly smaller than that for the $k$th iteration, we obtain an estimate for the $k$th iteration
(3.59) |
This estimate requires one additional iteration and one subtraction in $\mathbb{C}$. An analogous estimate can be found in [41, Section 4] for the $A$-norm error in the CG method under a similar assumption. Note that the shifted coefficient matrix $A - z I$ may not be Hermitian and thus may not induce a norm in the usual sense.
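The second estimate is simply the difference of two successive quadratic-form iterates. A sketch on a small diagonal example (hedged: names are ours; plain Lanczos without reorthogonalization; the "converged" reference value is computed by a dense solve):

```python
import numpy as np

def lanczos_T(A, u, k):
    """k steps of the Hermitian Lanczos method; returns the Jacobi matrix T_k."""
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    v_prev, b = np.zeros_like(u), 0.0
    v = u / np.linalg.norm(u)
    for j in range(k):
        w = A @ v - b * v_prev
        alpha[j] = v @ w
        w -= alpha[j] * v
        b = np.linalg.norm(w)
        if j < k - 1:
            beta[j] = b
        v_prev, v = v, w / b
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

n, kmax = 40, 25
A = np.diag(np.linspace(1.0, 10.0, n))
u = np.ones(n)
z = 0.5 + 1.0j
T = lanczos_T(A, u, kmax)   # leading sections T[:k,:k] are the iterates' T_k

def qf(k):
    """Shifted Lanczos iterate for the quadratic form after k steps."""
    return np.linalg.norm(u)**2 * np.linalg.inv(T[:k, :k] - z * np.eye(k))[0, 0]

exact = u @ np.linalg.solve(A - z * np.eye(n), u)
k = 15
error = abs(qf(k) - exact)
estimate = abs(qf(k + 1) - qf(k))   # successive-difference estimate of Section 3.4
print(error < 1e-6, estimate < 1e-6)
```

When the error decreases quickly, the difference of successive iterates tracks the true error well; when convergence stagnates or oscillates, it can underestimate, as the experiments in Section 4.3 illustrate.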
4 Numerical experiments
Numerical experiments were performed to compare the proposed shifted Lanczos method (Algorithm 3.1) with the previous methods—the shifted COCG method (Algorithm 3.2), the shifted COCR method (Algorithm 3.3), the shifted MINRES method (Algorithm 3.4), and a direct solver using the MATLAB function mldivide for solving (1.2)—in terms of the number of iterations and CPU time.
All computations were performed on a computer with an Intel Xeon E5-2670 v2 2.50 GHz CPU, 256 GB of random-access memory (RAM), and CentOS 6.10. All programs were coded and run in MATLAB R2019a in double-precision floating-point arithmetic with unit round-off approximately $1.1 \times 10^{-16}$.
Table 4.1 gives information about the test matrices, including the size of each matrix, the density of nonzero entries [%], the (estimated) condition number, and the application from which the matrix arose. The condition number was estimated using the MATLAB function condest. Matrices mhd1280b, conf5.4-00l8x8-2000, apache2, CurlCurl_3, and thermal2 are from [3]; the other matrix, VCNT1000000std, is from [18]. Matrix mhd1280b is complex symmetric and matrix conf5.4-00l8x8-2000 is complex Hermitian and indefinite, whereas the others are real symmetric. The Cholesky factorization [32] proved that matrices mhd1280b and apache2 are positive definite. According to the information in [3], matrix CurlCurl_3 is not positive definite and matrix thermal2 is positive definite.
The condition number for CurlCurl_3 could not be computed because of insufficient computer memory. Vector $u$ was simply set to the all-ones vector $e = [1, 1, \dots, 1]^{\mathsf T}$ for reproducibility. The shifts were set to
(4.1) |
where $\mathrm{i}$ is the imaginary unit. Because each shift has a nonzero imaginary part, the conditions in Theorem 3.2 are satisfied and the shifted Lanczos method is breakdown-free. These shifts demonstrate typical choices of quadrature points in projection methods for eigenproblems [35].
Matrix | Density [%] | condest | Application | |
---|---|---|---|---|
mhd1280b | 1,280 | Magnetohydrodynamics | ||
conf5.4-00l8x8-2000 | 49,152 | Quantum chromodynamics | ||
apache2 | 715,176 | Structural analysis | ||
VCNT1000000std | 1,000,000 | Quantum mechanics | ||
CurlCurl_3 | 1,219,574 | — | Electromagnetic analysis | |
thermal2 | 1,228,045 | Steady-state thermal analysis |
Matrix: name of the matrix, : size of the matrix, density: density of nonzero entries, condest: condition number estimated by using the MATLAB function condest, application: application from which the matrix arises.
4.1 Comparisons in terms of CPU time
We compare the methods in terms of CPU time. Figure 4.1 shows the relative error versus the number of iterations for the compared methods on the test matrices. The shifted methods terminated when the largest relative error among the shifts became less than or equal to a prescribed tolerance. Here, we adopted the numerical solution computed by MATLAB’s sparse direct solver as the exact solution. Table 4.2 gives the largest relative residual norm for MATLAB’s sparse direct solver for the test matrices, where the residual is that of the linear system (1.2) solved by the sparse direct solver. The sparse direct solver was accurate on mhd1280b, VCNT1000000std, conf5.4-00l8x8-2000, and thermal2 and not accurate on apache2 and CurlCurl_3. The convergence curve plotted corresponds to the case where the required number of iterations is the largest among the shifts $z_1, z_2, \dots, z_m$, because this case determines the total CPU time; the shift requiring the most iterations differed among the test matrices. The shifted Lanczos method converged faster than the other methods on apache2, CurlCurl_3, and thermal2 and was competitive with the shifted MINRES method on VCNT1000000std. The shifted COCR and MINRES methods were almost identical on CurlCurl_3. The convergence curves of the four methods on VCNT1000000std and CurlCurl_3 decreased monotonically. Although the shifted MINRES method is monotonically non-increasing in terms of the residual norm, Figure 4.1 shows that it does not necessarily converge monotonically in terms of the error. The convergence curves of the four methods are oscillatory on apache2 and thermal2. The shifted Lanczos method did not reach the stopping criterion on CurlCurl_3, whereas the shifted COCR and MINRES methods did.
The curves of the shifted MINRES and Lanczos methods for conf5.4-00l8x8-2000, those of the shifted COCG and Lanczos methods for VCNT1000000std, those of the shifted COCG and Lanczos methods for CurlCurl_3, and those of the shifted COCR and MINRES methods for CurlCurl_3 seem to overlap.
Table 4.3 gives the CPU time in seconds taken by the four methods. The symbol * stands for the least CPU time for each matrix. The shifted Lanczos method was competitive with or faster than the shifted MINRES method except on CurlCurl_3, whereas the shifted COCR method took more CPU time. The shifted COCG method required fewer iterations than the shifted Lanczos method but took more CPU time on CurlCurl_3 and thermal2. This is because the shifted COCG (and COCR) method used a complex shift as the seed, and its complex symmetric Lanczos iterations in Line 5 of Algorithm 3.2 were performed in complex arithmetic. When setting the seed shift to zero, the complex symmetric Lanczos iterations were performed with real operations; however, the method then required more CPU time and iterations than the case with a complex seed shift.
Matrix | |
---|---|
mhd1280b | |
conf5.4-00l8x8-2000 | |
apache2 | |
VCNT1000000std | |
CurlCurl_3 | |
thermal2 |
Matrix: name of the matrix, : maximum relative error norm.
Method | mhd1280b | conf5.4-00l8x8-2000 | apache2 | |||
---|---|---|---|---|---|---|
iter | time | iter | time | iter | time | |
mldivide | 0.08 | 6,505 | 1,978 | |||
Shifted COCG | 446 | 0.05 | — | 6,613 | 175 | |
Shifted COCR | 437 | 0.04 | — | 6,461 | 163 | |
Shifted MINRES | 281 | 0.03 | 266 | 1.67 | 7,283 | 95.4 |
Shifted Lanczos | 219 | *0.02 | 266 | *1.52 | 5,662 | *68.4 |
Method | VCNT1000000std | CurlCurl_3 | thermal2 | |||
---|---|---|---|---|---|---|
iter | time | iter | time | iter | time | |
mldivide | 555 | 749,696 | 563 | |||
Shifted COCG | 16 | 1.18 | 4,974 | 246 | 214 | 11.40 |
Shifted COCR | 16 | 1.16 | 5,698 | 382 | 279 | 14.57 |
Shifted MINRES | 14 | *0.73 | 5,779 | 156 | 216 | 6.53 |
Shifted Lanczos | 16 | 0.81 | 5,139 | *135 | 214 | *6.17 |
4.2 Spectrum and shift
We illustrate the effect of the spectrum of $A$ and the shift $z$ on the convergence. Because $A - z I$ is a normal matrix, its condition number is given by
(4.2) $\kappa(A - z I) = \dfrac{\max_{1 \leq i \leq n} |\lambda_i - z|}{\min_{1 \leq i \leq n} |\lambda_i - z|},$
where $\lambda_1, \lambda_2, \dots, \lambda_n$ are the eigenvalues of $A$.
Hence, the condition number links the spectrum and the shift. For convenience of the computation of a spectrum, we take the small matrix mhd1280b as an example. We chose shifts so as to control the condition number (4.2). The corresponding condition numbers of $A - z I$ are as given in Table 4.4. Table 4.4 also gives the required number of iterations and CPU time for the compared methods for each shift value. The stopping criterion was that the relative error became less than or equal to a prescribed tolerance. The table shows that the shifted Lanczos method was competitive with or outperformed the other methods in terms of CPU time, and that as the condition number increases, the required number of iterations tends to increase.
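Because $A - zI$ is normal for Hermitian $A$, its 2-norm condition number equals the ratio of the largest to the smallest distance from the shift to the eigenvalues. A quick numerical check of this relation (illustrative random matrix, not one of the paper's test matrices):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 10
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                     # Hermitian (real symmetric) example
lam = np.linalg.eigvalsh(A)
z = 0.1 + 0.05j                       # a shift close to the spectrum

# kappa(A - zI) = max_i |lam_i - z| / min_i |lam_i - z| for Hermitian A,
# since the singular values of the normal matrix A - zI are |lam_i - z|.
kappa_formula = np.max(np.abs(lam - z)) / np.min(np.abs(lam - z))
kappa_direct = np.linalg.cond(A - z * np.eye(n))   # 2-norm condition number
print(np.isclose(kappa_formula, kappa_direct))  # True
```

Moving the shift toward the spectrum shrinks the denominator and inflates the condition number, which is exactly how the shifts in Table 4.4 steer the difficulty of the problem.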
condition number | ||||||||
---|---|---|---|---|---|---|---|---|
iter | time | iter | time | iter | time | iter | time | |
Shifted COCG | 118 | *0.01 | 357 | 0.03 | 1,087 | 0.10 | 3,013 | 0.27 |
Shifted COCR | 117 | *0.01 | 331 | 0.03 | 1,018 | 0.10 | 3,129 | 0.29 |
Shifted MINRES | 84 | *0.01 | 231 | 0.03 | 690 | 0.07 | 1,988 | 0.19 |
Shifted Lanczos | 76 | *0.01 | 226 | *0.02 | 680 | *0.06 | 1,894 | *0.17 |
4.3 Estimation of error
We tested the two estimates developed in Section 3.4 on the matrices used in Section 4.1. Figure 4.3 shows the relative error and its estimates versus the number of iterations. Here, the plotted curve corresponds to the case where the required number of iterations was the largest among the shifts. For presentation, these estimates are normalized. Both estimates were accurate on conf5.4-00l8x8-2000, VCNT1000000std, and thermal2. The first estimate was accurate on mhd1280b. Both estimates underestimated the error on apache2 and CurlCurl_3 because the assumption made in Section 3.4 might not hold and the shifted Lanczos method lacks the monotonicity of an error norm such as the one in the CG method (cf. [41]). The oscillations of the estimates are similar to those observed for the errors on apache2 and thermal2. However, we have no theoretical justification of these estimates at this time, and when they capture the convergence well, in particular in finite-precision arithmetic, remains open.
5 Conclusions
We explored the computation of quadratic forms of Hermitian matrix resolvents. In contrast to previous shifted Krylov subspace methods, our method approximates the matrix resolvent directly. The underlying concept used in approximating the resolvent is to exploit the moment-matching property of a shifted Lanczos method by solving the Vorobyev moment problem. We showed that the shifted Lanczos method matches the first moments of the original model and those of the reduced model and extended the scope of the problems that the standard Lanczos method can solve. We derived the inverse of a linear operator representing the reduced model and related it to an entry of a Jacobi matrix resolvent. The entry can be efficiently computed by using a recursive formula. Previous shifted Krylov subspace methods work on a real symmetric matrix with a complex shift, whereas the proposed method works on a Hermitian matrix with a complex shift and does not break down, provided that the shift is not in the interior of the extremal eigenvalues of the Hermitian matrix. We gave an error bound and estimates for the shifted Lanczos method for quadratic forms. Numerical experiments on matrices drawn from real-world applications showed that the shifted Lanczos method is competitive with the shifted MINRES method and outperforms it when solving some problems. We illustrated the effect of the spectrum and shift on the convergence and showed that the error estimate is reasonable.
We intend to perform preconditioning for the shifted Lanczos method for quadratic forms in future work. The shifted COCG and COCR methods can use preconditioning if the preconditioned matrix is complex symmetric; however, for the shifted Lanczos method, it is not trivial to incorporate preconditioning. How to monitor the convergence of the proposed method is also not trivial, and we leave the development of a more rigorous and/or sophisticated estimate for future work. With regard to the error estimate, there will be an interesting connection with the Gauss quadratures (cf. [8, 9, 28]).
Acknowledgments
The author would like to thank Editage for English language editing, Professor Zdeněk Strakoš for giving the author the chance to attend his course, which motivated this work, and Professor Ken Hayami for helpful discussions.
References
- [1] Bai, Z., Golub, G.: Computation of large-scale quadratic forms and transfer functions using the theory of moments, quadrature and Padé approximation. In: Modern Methods in Scientific Computing and Applications, pp. 1–30. Springer Netherlands, Dordrecht (2002). DOI 10.1007/978-94-010-0510-4_1
- [2] Brezinski, C.: The methods of Vorobyev and Lanczos. Linear Algebra Appl. 234, 21–41 (1996). DOI 10.1016/0024-3795(94)00081-6
- [3] Davis, T., Hu, Y.: The University of Florida sparse matrix collection. ACM Trans. Math. Software 38(1), 1:1–1:25 (2011). DOI 10.1145/2049662.2049663
- [4] Feldmann, P., Freund, R.W.: Efficient linear circuit analysis by Padé approximation via the Lanczos process. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 14(5), 639–649 (1995). DOI 10.1109/43.384428
- [5] Freund, R.W.: Conjugate gradient-type methods for linear systems with complex symmetric coefficient matrices. SIAM J. Sci. Statist. Comput. 13(1), 425–448 (1992). DOI 10.1137/0913023
- [6] Frommer, A., Kahl, K., Lippert, T., Rittich, H.: 2-norm error bounds and estimates for Lanczos approximations to linear systems and rational matrix functions. SIAM J. Matrix Anal. Appl. 34(3), 1046–1065 (2013). DOI 10.1137/110859749
- [7] Frommer, A., Maass, P.: Fast CG-based methods for Tikhonov–Phillips regularization. SIAM J. Sci. Comput. 20(5), 1831–1850 (1999). DOI 10.1137/s1064827596313310
- [8] Golub, G.H., Meurant, G.: Matrices, moments and quadrature, vol. 303, pp. 105–156. Longman Sci. Tech., Harlow (1993)
- [9] Golub, G.H., Meurant, G.: Matrices, moments and quadrature II; How to compute the norm of the error in iterative methods. BIT 37(3), 687–705 (1997). DOI 10.1007/bf02510247
- [10] Golub, G.H., Meurant, G.: Matrices, Moments and Quadrature with Applications. Princeton University Press, Princeton, NJ (2010)
- [11] Gu, G.D., Liu, G.: On convergence property of MINRES method for solving a complex shifted Hermitian linear system. Int. J. Math. Comput. Sci. 7, 290–294 (2013). DOI 10.5281/ZENODO.1087887
- [12] Hamburger, H.: Beiträge zur Konvergenztheorie der Stieltjesschen Kettenbrüche. Math. Z. 4, 186–222 (1919). DOI 10.1007/BF01203012
- [13] Hamburger, H.: Über eine Erweiterung des Stieltjesschen Momentenproblems. Math. Ann. 81(2-4), 235–319 (1920). DOI 10.1007/bf01564869
- [14] Hamburger, H.: Über eine Erweiterung des Stieltjesschen Momentenproblems. Math. Ann. 82(1-2), 120–164 (1920). DOI 10.1007/bf01457982
- [15] Hamburger, H.: Über eine Erweiterung des Stieltjesschen Momentenproblems. Math. Ann. 82(3-4), 168–187 (1921). DOI 10.1007/bf01498663
- [16] Hoffman, K., Kunze, R.: Linear Algebra, 2nd edn. Prentice-Hall Inc., Englewood, Cliffs, NJ (1971)
- [17] Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, New York, NY (2013). DOI 10.1017/9781139020411
- [18] Hoshi, T., Imachi, H., Kuwata, A., Kakuda, K., Fujita, T., Matsui, H.: Numerical aspect of large-scale electronic state calculation for flexible device material. Jpn. J. Ind. Appl. Math. 36(2), 685–698 (2019). DOI 10.1007/s13160-019-00358-2
- [19] Hoshi, T., Kawamura, M., Yoshimi, K., Motoyama, Y., Misawa, T., Yamaji, Y., Todo, S., Kawashima, N., Sogabe, T.: Kω — Open-source library for the shifted Krylov subspace method of the form (zI − H)x = b. Comput. Phys. Commun. p. 107536 (2020). DOI 10.1016/j.cpc.2020.107536
- [20] Johndrow, J.E., Mattingly, J.C., Mukherjee, S., Dunson, D.B.: Optimal approximating Markov chains for Bayesian inference. arXiv 1508.03387 [stat.CO], 1–31 (2017)
- [21] Kaniel, S.: Estimates for some computational techniques in linear algebra. Math. Comp. 20, 369–378 (1966). DOI 10.1090/S0025-5718-1966-0234618-4
- [22] Lanczos, C.: Solution of systems of linear equations by minimized iterations. J. Res. Nat. Bur. Stand. 49(1), 33–53 (1952). DOI 10.6028/jres.049.006
- [23] Li, C., Sra, S., Jegelka, S.: Gaussian quadrature for matrix inverse forms with applications. In: M.F. Balcan, K.Q. Weinberger (eds.) Proceedings of The 33rd International Conference on Machine Learning, pp. 1766–1775. PMLR, New York, USA (2016)
- [24] Liesen, J., Strakoš, Z.: Krylov Subspace Methods, Principles and Analysis. Oxford University Press, Oxford (2013)
- [25] Maeda, Y., Futamura, Y., Imakura, A., Sakurai, T.: Filter analysis for the stochastic estimation of eigenvalue counts. JSIAM Lett. 7, 53–56 (2015). DOI 10.14495/jsiaml.7.53
- [26] Meerbergen, K.: The solution of parametrized symmetric linear systems. SIAM J. Matrix Anal. Appl. 24(4), 1038–1059 (2003). DOI 10.1137/s0895479800380386
- [27] Meurant, G.: The Lanczos and Conjugate Gradient Algorithms. SIAM, Philadelphia, PA (2006). DOI 10.1137/1.9780898718140
- [28] Meurant, G., Tichý, P.: On computing quadrature-based bounds for the A-norm of the error in conjugate gradients. Numer. Algorithms 62(2), 163–191 (2013). DOI 10.1007/s11075-012-9591-9
- [29] Musco, C., Musco, C., Sidford, A.: Stability of the Lanczos method for matrix function approximation. In: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1605–1624. Society for Industrial and Applied Mathematics (2018). DOI 10.1137/1.9781611975031.105
- [30] Paige, C.C., Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 12, 617–629 (1975). DOI 10.1137/0712047
- [31] Polizzi, E.: Density-matrix-based algorithm for solving eigenvalue problems. Phys. Rev. B 79(11), 115112–1–6 (2009). DOI 10.1103/physrevb.79.115112
- [32] Rump, S.M.: Verification of positive definiteness. BIT 46(2), 433–452 (2006). DOI 10.1007/s10543-006-0056-1
- [33] Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM, Philadelphia, PA (2003). DOI 10.1137/1.9780898718003
- [34] Sakurai, T., Sugiura, H.: A projection method for generalized eigenvalue problems using numerical integration. J. Comput. Appl. Math. 159(1), 119–128 (2003). DOI 10.1016/S0377-0427(03)00565-X
- [35] Sakurai, T., Tadano, H.: CIRR: a Rayleigh–Ritz type method with contour integral for generalized eigenvalue problems. Hokkaido Math. J. 36, 745–757 (2007). DOI 10.14492/hokmj/1272848031
- [36] Seito, H., Hoshi, T., Yamamoto, Y.: On using the shifted minimal residual method for quantum-mechanical wave packet simulation. JSIAM Lett. 11, 13–16 (2019). DOI 10.14495/jsiaml.11.13
- [37] Sogabe, T., Hoshi, T., Zhang, S., Fujiwara, T.: On a weighted quasi-residual minimization strategy for solving complex symmetric shifted linear systems. Electron. Trans. Numer. Anal. 31, 126–140 (2008)
- [38] Sogabe, T., Zhang, S.L.: A COCR method for solving complex symmetric linear systems. J. Comput. Appl. Math. 199(2), 297–303 (2007). DOI 10.1016/j.cam.2005.07.032
- [39] Sogabe, T., Zhang, S.L.: An extension of the COCR method to solving shifted linear systems with complex symmetric matrices. East Asian J. Appl. Math. 1(2), 97–107 (2011). DOI 10.4208/eajam.260410.240510a
- [40] Strakoš, Z.: Model reduction using the Vorobyev moment problem. Numer. Algorithms 51(3), 363–379 (2009). DOI 10.1007/s11075-008-9237-0
- [41] Strakoš, Z., Tichý, P.: On error estimation in the conjugate gradient method and why it works in finite precision computations. Electron. Trans. Numer. Anal. 13, 56–80 (2002)
- [42] Szegö, G.: Orthogonal Polynomials, vol. XXIII. American Mathematical Society, New York (1959). DOI 10.1090/coll/023
- [43] Takayama, R., Hoshi, T., Sogabe, T., Zhang, S.L., Fujiwara, T.: Linear algebraic calculation of the Green’s function for large-scale electronic structure theory. Phys. Rev. B 73(165108), 1–9 (2006). DOI 10.1103/physrevb.73.165108
- [44] Vorobyev, Y.V.: Method of Moments in Applied Mathematics. Gordon and Breach Science Publishers, New York (1965)
- [45] van der Vorst, H.A., Melissen, J.: A Petrov–Galerkin type method for solving Ax = b, where A is symmetric complex. IEEE Trans. Magn. 26(2), 706–708 (1990). DOI 10.1109/20.106415
- [46] Yamamoto, S., Sogabe, T., Hoshi, T., Zhang, S.L., Fujiwara, T.: Shifted conjugate-orthogonal–conjugate-gradient method and its application to double orbital extended Hubbard model. J. Phys. Soc. Jpn. 77(11), 114713 (2008). DOI 10.1143/jpsj.77.114713