
A skew-symmetric Lanczos bidiagonalization method for computing several largest eigenpairs of a large skew-symmetric matrix

Jinzhi Huang School of Mathematical Sciences, Soochow University, 215006 Suzhou, China ([email protected]). The work of this author was supported in part by the Youth Program of the Natural Science Foundation of Jiangsu Province (No. BK20220482).    Zhongxiao Jia Corresponding author. Department of Mathematical Sciences, Tsinghua University, 100084 Beijing, China ([email protected]). The work of this author was supported in part by the National Natural Science Foundation of China (No. 12171273).
Abstract

The spectral decomposition of a real skew-symmetric matrix A can be mathematically transformed into a specific structured singular value decomposition (SVD) of A. Based on such equivalence, a skew-symmetric Lanczos bidiagonalization (SSLBD) method is proposed for the specific SVD problem that computes extreme singular values and the corresponding singular vectors of A, from which the eigenpairs of A corresponding to the extreme conjugate eigenvalues in magnitude are recovered pairwise in real arithmetic. A number of convergence results on the method are established, and accuracy estimates for approximate singular triplets are given. In finite precision arithmetic, it is proven that the semi-orthogonality of each set of basis vectors and the semi-biorthogonality of the two sets of basis vectors suffice to compute the singular values accurately. A commonly used efficient partial reorthogonalization strategy is adapted to maintain the needed semi-orthogonality and semi-biorthogonality. For a practical purpose, an implicitly restarted SSLBD algorithm is developed with partial reorthogonalization. Numerical experiments illustrate the effectiveness and overall efficiency of the algorithm.

keywords:
Skew-symmetric matrix, spectral decomposition, eigenvalue, eigenvector, singular value decomposition, singular value, singular vector, Lanczos bidiagonalization, partial reorthogonalization
AMS:
65F15, 15A18, 65F10, 65F25

1 Introduction

Let A\in\mathbb{R}^{n\times n} be a large scale and possibly sparse skew-symmetric matrix, i.e., A^{T}=-A, where the superscript T denotes the transpose of a matrix or vector. We consider the eigenvalue problem

(1.1) Ax=\lambda x,

where \lambda\in\mathbb{C} is an eigenvalue of A and x\in\mathbb{C}^{n} with the 2-norm \|x\|=1 is a corresponding eigenvector. It is well known that the nonzero eigenvalues \lambda of A are purely imaginary and come in conjugate pairs. The real skew-symmetric eigenvalue problem arises in a variety of applications, such as matrix function computations [4, 7], the solution of linear quadratic optimal control problems [19, 39], model reductions [22, 37], wave propagation solution for repetitive structures [12, 38], crack following in anisotropic materials [1, 21], and some others [20, 26, 36].

For small to medium sized skew-symmetric matrices, several numerical algorithms have been developed to compute their spectral decompositions in real arithmetic [8, 23, 26, 35]. For a large scale skew-symmetric A, however, a high-performance numerical algorithm that can make use of its skew-symmetric structure to compute several eigenvalues and/or the corresponding eigenvectors in real arithmetic is still lacking. In this paper, we close this gap by proposing a Krylov subspace type method that computes several extreme conjugate eigenvalues in magnitude and the corresponding eigenvectors of A using only real arithmetic.

Our work originates from a basic fact that, as will be shown in Section 2, the spectral decomposition of the skew-symmetric A has a close relationship with its specifically structured singular value decomposition (SVD), where each conjugate pair of purely imaginary eigenvalues of A corresponds to a double singular value of A and the mutually orthogonal real and imaginary parts of the associated eigenvectors are the corresponding left and right singular vectors of A. Therefore, the solution of the eigenvalue problem of the skew-symmetric A is equivalent to the computation of its structured SVD, and the desired eigenpairs can be recovered from the converged singular triplets by exploiting this equivalence.

It has been well realized in [35] that the spectral decomposition of the skew-symmetric A can be reduced to the SVD of a certain bidiagonal matrix B whose size is n/2 for n even. Exploiting this property, Ward and Gray [35] have designed an eigensolver for dense skew-symmetric matrices. For a large scale A, however, the computation of B using orthogonal similarity transformations is prohibitive due to the requirement of excessive storage and computations. Nevertheless, with the help of Lanczos bidiagonalization (LBD) [24], whose complete bidiagonal reduction is originally due to [9], we are able to compute a sequence of leading principal submatrices of B, based on which approximations to the extreme singular values of A and/or the corresponding singular vectors can be computed. Along this line, a number of LBD based methods have been proposed and intensively studied, and several practical explicitly and implicitly restarted LBD type algorithms have been well developed [2, 3, 5, 11, 14, 15, 17, 18, 34].

For the skew-symmetric matrix A, LBD has some unique properties in both mathematics and numerics. However, direct applications of the aforementioned algorithms do not make use of these special properties, and will thus encounter some severe numerical difficulties. We will fully exploit these properties to propose a skew-symmetric LBD (SSLBD) method for computing several extreme singular triplets of A, from which its extreme conjugate eigenvalues in magnitude and the associated eigenvectors are recovered. Attractively, our algorithm works in real arithmetic when computing complex eigenpairs of A. Particularly, one conjugate pair of approximate eigenpairs can be recovered from each converged approximate singular triplet. We estimate the distance between the subspace generated by the real and imaginary parts of a pair of desired conjugate eigenvectors and the Krylov subspaces generated by SSLBD, and prove how it tends to zero as the subspace dimension increases. With these estimates, we establish a priori error bounds for the approximate eigenvalues and approximate eigenspaces obtained by the SSLBD method. One of them is the error bound for the two dimensional approximate eigenspace, which extends [29, pp.103, Theorem 4.6], a classic error bound for the Ritz vector in the real symmetric (complex Hermitian) case. These results show how fast our method converges when computing several extreme singular triplets.

As the subspace dimension increases, the SSLBD method will eventually become impractical due to the excessive memory and computational cost. For a practical purpose, it is generally necessary to restart the method by selecting an increasingly better starting vector when the maximum number of iterations allowed is attained, until the restarted algorithm ultimately converges. We will develop an implicitly restarted SSLBD algorithm. Implicit restarting was initially proposed by Sorensen [31] for large eigenvalue problems, and was then developed by Larsen [18] and Jia and Niu [14, 15] for large SVD computations. Our implicit restart is based on these works but is specialized to the skew-symmetric A. We will show that the two sets of left and right Lanczos vectors generated by SSLBD are also mutually biorthogonal for the skew-symmetric A. However, in finite precision arithmetic, this numerical biorthogonality is gradually lost even if the two sets themselves are kept numerically orthonormal. As a matter of fact, we have found that the numerical orthogonality of the computed left and right Lanczos vectors in an available state-of-the-art Lanczos bidiagonalization type algorithm does not suffice for the SVD problem of the skew-symmetric A, and unfortunate ghosts appear when the numerical biorthogonality is severely lost; that is, a singular value of A may be recomputed a few times. Therefore, a certain level of numerical biorthogonality is crucial to make the method work regularly.

As has been proved by Simon [30] for the symmetric Lanczos method on the eigenvalue problem and extended by Larsen [17] to the LBD method for the SVD problem, as far as the accurate computation of singular values is concerned, numerical semi-orthogonality suffices for a general A, and numerical orthogonality to the level of machine precision is not necessary. Here semi-orthogonality means that two vectors are numerically orthogonal to the level of the square root of machine precision. To maintain numerical stability and make the SSLBD method behave as it does in exact arithmetic, we shall prove that the semi-orthogonality of each set of Lanczos vectors and the semi-biorthogonality of the left and right Lanczos vectors suffice for the accurate computation of the singular values of the skew-symmetric A. We will seek an effective and efficient reorthogonalization strategy for this purpose. Specifically, besides the commonly used partial reorthogonalization to maintain the desired semi-orthogonality of each set of Lanczos vectors, we will propose a partial reorthogonalization strategy to maintain the numerical semi-biorthogonality of the two sets of Lanczos vectors, in order to avoid the ghost phenomena and make the algorithm converge regularly. We will introduce such a reorthogonalization strategy into the implicitly restarted SSLBD algorithm and design a simple scheme to recover the desired conjugate eigenpairs of A pairwise from the converged singular triplets.

The rest of this paper is organized as follows. In Section 2, we describe some properties of the spectral decomposition of A, and show its mathematical equivalence with the SVD of A. Then we establish a close relationship between the SVD of A and that of an upper bidiagonal matrix whose order is half of that of A when the order of A is even. In Section 3, we present some properties of the SSLBD process, and propose the SSLBD method for computing several extreme singular values and the corresponding singular vectors of A, from which conjugate approximate eigenpairs of A are pairwise reconstructed. In Section 4, we establish convergence results on the SSLBD method. In Section 5, we design a partial reorthogonalization scheme for the SSLBD process, and develop an implicitly restarted SSLBD algorithm to compute several extreme singular triplets of A. Numerical experiments are reported in Section 6. We conclude the paper in Section 7.

Throughout this paper, we denote by \mathrm{i} the imaginary unit, by X^{H} and X^{\dagger} the conjugate transpose and pseudoinverse of X, by \mathcal{R}(X) the range space of X, and by \|X\| and \|X\|_{F} the 2- and Frobenius norms of X, respectively. We denote by \mathbb{R}^{k} and \mathbb{C}^{k} the real and complex spaces of dimension k, by I_{k} the identity matrix of order k, and by \bm{0}_{k} and \bm{0}_{k,\ell} the zero matrices of orders k\times k and k\times\ell, respectively. The subscripts of identity and zero matrices are omitted when they are clear from the context.

2 Preliminaries

For the skew-symmetric matrix A\in\mathbb{R}^{n\times n}, the following five properties are well known and easily justified: Property (i): A is diagonalizable by a unitary similarity transformation; Property (ii): the eigenvalues \lambda of A are either purely imaginary or zero; Property (iii): for an eigenpair (\lambda,z) of A, the conjugate (\bar{\lambda},\bar{z}) is also an eigenpair of A; Property (iv): the real and imaginary parts of the eigenvector z corresponding to a purely imaginary eigenvalue of A have the same length, and are mutually orthogonal; Property (v): A of odd order must be singular, and has at least one zero eigenvalue. It is easily deduced that a nonsingular A must have even order, i.e., n=2\ell for some integer \ell, and its eigenvalues \lambda are purely imaginary and come in conjugate, i.e., plus-minus, pairs.
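These properties are easy to check numerically. The following is a small NumPy illustration of ours (not part of the paper) of Properties (ii) and (iv) on a random skew-symmetric matrix:

import numpy as np

# Our own check of Properties (ii) and (iv) on a random skew-symmetric A.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M - M.T                                   # A^T = -A
lam, Z = np.linalg.eig(A)
print(np.max(np.abs(lam.real)))               # ~0: purely imaginary spectrum
u, v = Z[:, 0].real, Z[:, 0].imag             # eigenvector of a nonzero eigenvalue
print(abs(u @ v))                             # ~0: real and imaginary parts orthogonal
print(np.linalg.norm(u) - np.linalg.norm(v))  # ~0: equal lengths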

We formulate the following basic results as a theorem, which establishes close structural relationships between the spectral decomposition of A and its SVD and forms the basis of our method and algorithm to be proposed. For completeness and our frequent use in this paper, we will give a rigorous proof by exploiting the above five properties.

Theorem 2.1.

The spectral decomposition of the nonsingular skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell is

(2.1) A=X\Lambda X^{H}

with

(2.2) X=\begin{bmatrix}\frac{1}{\sqrt{2}}(U+\mathrm{i}V)&\frac{1}{\sqrt{2}}(U-\mathrm{i}V)\end{bmatrix}\qquad\mbox{and}\qquad\Lambda=\operatorname{diag}\{\mathrm{i}\Sigma,-\mathrm{i}\Sigma\},

where U\in\mathbb{R}^{n\times\ell} and V\in\mathbb{R}^{n\times\ell} are orthonormal and biorthogonal:

(2.3) U^{T}U=I,\qquad V^{T}V=I,\qquad U^{T}V=\bm{0},

and \Sigma=\operatorname{diag}\{\sigma_{1},\dots,\sigma_{\ell}\}\in\mathbb{R}^{\ell\times\ell} with \sigma_{1},\dots,\sigma_{\ell}>0. The SVD of A is

(2.4) A=\begin{bmatrix}U&V\end{bmatrix}\begin{bmatrix}\Sigma&\\ &\Sigma\end{bmatrix}\begin{bmatrix}V&-U\end{bmatrix}^{T}.
Proof.

By Properties (i) and (ii) and the assumption that A is nonsingular, we denote by \Lambda_{+}\in\mathbb{C}^{\ell\times\ell} the diagonal matrix consisting of all the eigenvalues of A with positive imaginary parts, and by X_{+}\in\mathbb{C}^{n\times\ell} the corresponding orthonormal eigenvector matrix. By Property (iv), we can write (\Lambda_{+},X_{+}) as

(2.5) (\Lambda_{+},X_{+})=\left(\mathrm{i}\Sigma,\tfrac{1}{\sqrt{2}}(U+\mathrm{i}V)\right),

where the columns of U and V have unit length, and AX_{+}=X_{+}\Lambda_{+} and X_{+}^{H}X_{+}=I. By these and Property (iii), the conjugate of (\Lambda_{+},X_{+}) is

(2.6) (\Lambda_{-},X_{-})=\left(-\mathrm{i}\Sigma,\tfrac{1}{\sqrt{2}}(U-\mathrm{i}V)\right),

and AX_{-}=X_{-}\Lambda_{-} and X_{-}^{H}X_{-}=I. As a result, X and \Lambda defined by (2.2) are the eigenvector and eigenvalue matrices of A, respectively, and

(2.7) AX=X\Lambda.

Premultiplying the two sides of AX_{+}=X_{+}\Lambda_{+} by X_{+}^{T} delivers X_{+}^{T}AX_{+}=X_{+}^{T}X_{+}\Lambda_{+}, which, by the skew-symmetry of A, shows that

X_{+}^{T}X_{+}\Lambda_{+}+\Lambda_{+}X_{+}^{T}X_{+}=\bm{0}.

Since \Lambda_{+}=\mathrm{i}\Sigma and \Sigma is a positive diagonal matrix, it follows from the above equation that

X_{+}^{T}X_{+}=\bm{0}.

By definition (2.2) of X and the fact that X_{+} and X_{-} are column orthonormal, X is unitary. Postmultiplying (2.7) by X^{H} yields (2.1), from which it follows that (2.4) holds.

Inserting X_{+}=\frac{1}{\sqrt{2}}(U+\mathrm{i}V) into X_{+}^{H}X_{+}=I and X_{+}^{T}X_{+}=\bm{0} and solving the resulting equations for U^{T}U, V^{T}V and U^{T}V, we obtain (2.3). This shows that both [U,V] and [V,-U] are orthogonal. Therefore, (2.4) is the SVD of A.

Remark 1.

For a singular skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell+\ell_{0}, where \ell_{0} is the multiplicity of the zero eigenvalue of A, we can still write the spectral decomposition of A as (2.1) with some slight rearrangements of the eigenvalue and eigenvector matrices \Lambda and X:

(2.8) X=\begin{bmatrix}\frac{1}{\sqrt{2}}(U+\mathrm{i}V)&\frac{1}{\sqrt{2}}(U-\mathrm{i}V)&X_{0}\end{bmatrix}\qquad\mbox{and}\qquad\Lambda=\operatorname{diag}\{\mathrm{i}\Sigma,-\mathrm{i}\Sigma,\bm{0}_{\ell_{0}}\},

where U, V and \Sigma are as in Theorem 2.1 and the columns of X_{0} form an orthonormal basis of the null space of A. The proof is analogous to that of Theorem 2.1 and thus omitted. In this case, relation (2.4) gives the thin SVD of A.

For U and V in (2.4), write

U=[u_{1},u_{2},\ldots,u_{\ell}]\qquad\mbox{and}\qquad V=[v_{1},v_{2},\ldots,v_{\ell}].

On the basis of Theorem 2.1, in the sequel, we denote by

(2.9) (\lambda_{\pm j},x_{\pm j})=\left(\pm\mathrm{i}\sigma_{j},\tfrac{1}{\sqrt{2}}(u_{j}\pm\mathrm{i}v_{j})\right),\qquad j=1,\dots,\ell.

The SVD (2.4) indicates that (\sigma_{j},u_{j},v_{j}) and (\sigma_{j},v_{j},-u_{j}) are two singular triplets of A corresponding to the multiple singular value \sigma_{j}. Therefore, in order to obtain the conjugate eigenpairs (\lambda_{\pm j},x_{\pm j}) of A, one can compute only the singular triplet (\sigma_{j},u_{j},v_{j}) or (\sigma_{j},v_{j},-u_{j}), and then recover the desired eigenpairs from that singular triplet by (2.9).
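In computations, the recovery step (2.9) is just a complex combination of the two singular vectors. A minimal NumPy sketch of ours (the function name is not from the paper):

import numpy as np

def eigpairs_from_triplet(sigma, u, v):
    # Recover the conjugate eigenpairs (2.9) of a skew-symmetric A from one
    # singular triplet (sigma, u, v) with Av = sigma*u and Au = -sigma*v.
    x_plus = (u + 1j * v) / np.sqrt(2)    # eigenvector for +i*sigma
    x_minus = (u - 1j * v) / np.sqrt(2)   # eigenvector for -i*sigma
    return (1j * sigma, x_plus), (-1j * sigma, x_minus)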

Next we present a bidiagonal decomposition of the nonsingular skew-symmetric A with order n=2\ell, which essentially appears in [35]. For completeness, we include a proof.

Theorem 2.2.

Assume that the skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell is nonsingular. Then there exist two orthonormal matrices P\in\mathbb{R}^{n\times\ell} and Q\in\mathbb{R}^{n\times\ell} satisfying P^{T}Q=\bm{0} and an upper bidiagonal matrix B\in\mathbb{R}^{\ell\times\ell} such that

(2.10) A=\begin{bmatrix}P&Q\end{bmatrix}\begin{bmatrix}B&\\ &B^{T}\end{bmatrix}\begin{bmatrix}Q&-P\end{bmatrix}^{T}.
Proof.

It is known from, e.g., [35] that A is orthogonally similar to a skew-symmetric tridiagonal matrix, that is, there exists an orthogonal matrix F\in\mathbb{R}^{n\times n} such that

(2.11) A^{\prime}=F^{T}AF=\begin{bmatrix}0&-\alpha_{1}&&\\ \alpha_{1}&\ddots&\ddots&\\ &\ddots&\ddots&-\alpha_{n-1}\\ &&\alpha_{n-1}&0\end{bmatrix}.

Denote \Pi_{1}=[e_{2,n},e_{4,n},\dots,e_{2\ell,n}] and \Pi_{2}=[e_{1,n},e_{3,n},\dots,e_{2\ell-1,n}] with e_{j,n} being the jth column of I_{n}, j=1,\dots,n. Then it is easy to verify that

A^{\prime\prime}=\begin{bmatrix}\Pi_{1}&\Pi_{2}\end{bmatrix}^{T}A^{\prime}\begin{bmatrix}\Pi_{2}&\Pi_{1}\end{bmatrix}=\begin{bmatrix}B&\\ &-B^{T}\end{bmatrix},
(2.12) B=\Pi_{1}^{T}A^{\prime}\Pi_{2}=\begin{bmatrix}\alpha_{1}&-\alpha_{2}&&&\\ &\alpha_{3}&-\alpha_{4}&&\\ &&\ddots&\ddots&\\ &&&\ddots&-\alpha_{2\ell-2}\\ &&&&\alpha_{2\ell-1}\end{bmatrix}.

Combining these two relations with (2.11), we obtain

(2.13) \begin{aligned}A&=FA^{\prime}F^{T}=F\begin{bmatrix}\Pi_{1}&\Pi_{2}\end{bmatrix}A^{\prime\prime}\begin{bmatrix}\Pi_{2}&\Pi_{1}\end{bmatrix}^{T}F^{T}\\
&=\begin{bmatrix}F\Pi_{1}&F\Pi_{2}\end{bmatrix}A^{\prime\prime}\begin{bmatrix}F\Pi_{2}&F\Pi_{1}\end{bmatrix}^{T}\\
&=\begin{bmatrix}P&Q\end{bmatrix}\begin{bmatrix}B&\\ &-B^{T}\end{bmatrix}\begin{bmatrix}Q&P\end{bmatrix}^{T},\end{aligned}

which proves (2.10), where we have denoted P=F\Pi_{1} and Q=F\Pi_{2}. Note that F is orthogonal and \Pi_{1}, \Pi_{2} are not only orthonormal but also biorthogonal: \Pi_{1}^{T}\Pi_{1}=\Pi_{2}^{T}\Pi_{2}=I and \Pi_{1}^{T}\Pi_{2}=\bm{0}. Therefore, P^{T}P=Q^{T}Q=I and P^{T}Q=\bm{0}.
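The permutation step (2.12) is easy to verify numerically. A small NumPy check of ours (not from the paper):

import numpy as np

# Check that the permutations Pi_1, Pi_2 turn the skew-symmetric
# tridiagonal A' of (2.11) into diag{B, -B^T} as in (2.12).
n = 8; ell = n // 2
alpha = np.random.rand(n - 1) + 0.1          # entries alpha_1, ..., alpha_{n-1}
Ap = np.diag(alpha, -1) - np.diag(alpha, 1)  # A' in (2.11)
I = np.eye(n)
Pi1, Pi2 = I[:, 1::2], I[:, 0::2]            # columns e_2,e_4,... and e_1,e_3,...
App = np.hstack([Pi1, Pi2]).T @ Ap @ np.hstack([Pi2, Pi1])
B = Pi1.T @ Ap @ Pi2                         # upper bidiagonal B in (2.12)
assert np.allclose(App, np.block([[B, np.zeros((ell, ell))],
                                  [np.zeros((ell, ell)), -B.T]]))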

Remark 2.

For a singular skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell+\ell_{0}, where \ell_{0} is the multiplicity of the zero eigenvalue of A, the subdiagonal elements \alpha_{i}, i=2\ell,2\ell+1,\dots,n-1 of A^{\prime} in (2.11) are zeros. Taking \Pi_{1} and \Pi_{2} as in (2.12) and \Pi_{3}=[e_{2\ell+1,n},e_{2\ell+2,n},\dots,e_{n,n}], we obtain

(2.14) A^{\prime\prime}=\begin{bmatrix}\Pi_{1}&\Pi_{2}&\Pi_{3}\end{bmatrix}^{T}A^{\prime}\begin{bmatrix}\Pi_{2}&\Pi_{1}&\Pi_{3}\end{bmatrix}=\operatorname{diag}\{B,-B^{T},\bm{0}_{\ell_{0}}\}

with B defined by (2.12). Using the same derivations as in the proof of Theorem 2.2, we have

A=\begin{bmatrix}P&Q&X_{0}\end{bmatrix}\begin{bmatrix}B&&\\ &B^{T}&\\ &&\bm{0}_{\ell_{0}}\end{bmatrix}\begin{bmatrix}Q&-P&X_{0}\end{bmatrix}^{T},

where P and Q are as in (2.13) and X_{0}=F\Pi_{3} is the orthonormal basis matrix of the null space of A. This decomposition is a precise extension of (2.10) to A with any order.

Combining Remark 2 with Remark 1 after Theorem 2.1, for ease of presentation, in the sequel we will always assume that A has even order and is nonsingular. However, our method and algorithm to be proposed apply to a singular skew-symmetric A with only some slight modifications.

Decomposition (2.10) is a bidiagonal decomposition of A that reduces A to \operatorname{diag}\{B,B^{T}\} using the structured orthogonal matrices \begin{bmatrix}P&Q\end{bmatrix}^{T} and \begin{bmatrix}Q&-P\end{bmatrix} from the left and right, respectively. The singular values of A are the union of those of B and B^{T}, but the SVD of B alone delivers the whole SVD of A: let B=U_{B}\Sigma_{B}V_{B}^{T} be the SVD of B. Then Theorem 2.2 indicates that (\Sigma,U,V)=(\Sigma_{B},PU_{B},QV_{B}) is an \ell-dimensional partial SVD of A with U and V being orthonormal and U^{T}V=\bm{0}. By Theorem 2.1, the spectral decomposition (2.1) of A can be restored directly from (\Sigma,U,V). This is exactly the working mechanism of the eigensolver proposed in [35] for small to medium sized skew-symmetric matrices.

For A large, using orthogonal transformations to construct P, Q and B in (2.10) and computing the SVD of B are unaffordable. Fortunately, based on decomposition (2.10), we can perform the LBD process on the skew-symmetric A, i.e., the SSLBD process, to successively generate the columns of P and Q and the leading principal submatrices of B, so that the Rayleigh–Ritz projection can be used to compute a partial SVD of A.

3 The SSLBD method

For ease of presentation, from now on, we always assume that the eigenvalues \lambda_{\pm j}=\pm\mathrm{i}\sigma_{j} of A in (2.9) are simple, and that \sigma_{1},\dots,\sigma_{\ell} are labeled in decreasing order:

(3.1) \sigma_{1}>\sigma_{2}>\dots>\sigma_{\ell}>0.

Our goal is to compute the k pairs of extreme, e.g., largest conjugate eigenvalues \lambda_{\pm 1},\dots,\lambda_{\pm k} and/or the corresponding eigenvectors x_{\pm 1},\dots,x_{\pm k} defined by (2.9). By Theorem 2.1, this amounts to computing the following partial SVD of A:

(\Sigma_{k},U_{k},V_{k})=\left(\operatorname{diag}\{\sigma_{1},\dots,\sigma_{k}\},[u_{1},\dots,u_{k}],[v_{1},\dots,v_{k}]\right).

3.1 The m-step SSLBD process

Algorithm 1 sketches the m-step SSLBD process on the skew-symmetric A, which is a variant of the lower LBD process proposed by Paige and Saunders [24].

Algorithm 1 The m-step SSLBD process.
1:  Initialization: Set \gamma_{0}=0 and p_{0}=\bm{0}, and choose q_{1}\in\mathbb{R}^{n} with \|q_{1}\|=1.
2:  for j=1,\dots,m<\ell do
3:     Compute s_{j}=Aq_{j}-\gamma_{j-1}p_{j-1} and \beta_{j}=\|s_{j}\|.
4:     if \beta_{j}=0 then break; else calculate p_{j}=\frac{1}{\beta_{j}}s_{j}.
5:     Compute t_{j}=-Ap_{j}-\beta_{j}q_{j} and \gamma_{j}=\|t_{j}\|.
6:     if \gamma_{j}=0 then break; else calculate q_{j+1}=\frac{1}{\gamma_{j}}t_{j}.
7:  end for

Assume that the m-step SSLBD process does not break down for m<\ell, and note that A^{T}A=-A^{2} and AA^{T}=-A^{2}. Then the process computes the orthonormal bases \{p_{j}\}_{j=1}^{m} and \{q_{j}\}_{j=1}^{m+1} of the Krylov subspaces

(3.2) \mathcal{U}_{m}=\mathcal{K}(A^{2},m,p_{1})\qquad\mbox{and}\qquad\mathcal{V}_{m+1}=\mathcal{K}(A^{2},m+1,q_{1})

generated by A^{2} and the starting vectors p_{1} and q_{1} with p_{1}=Aq_{1}/\|Aq_{1}\|, respectively. Denote P_{j}=[p_{1},\dots,p_{j}] for j=1,\dots,m and Q_{j}=[q_{1},\dots,q_{j}] for j=1,\dots,m+1, whose columns are called left and right Lanczos vectors, respectively. Then the m-step SSLBD process can be written in the matrix form:

(3.3) \left\{\begin{aligned}&AQ_{m}=P_{m}B_{m},\\ &AP_{m}=-Q_{m}B_{m}^{T}-\gamma_{m}q_{m+1}e_{m,m}^{T},\end{aligned}\right.\qquad\mbox{with}\qquad B_{m}=\begin{bmatrix}\beta_{1}&\gamma_{1}&&\\ &\ddots&\ddots&\\ &&\ddots&\gamma_{m-1}\\ &&&\beta_{m}\end{bmatrix},

where e_{m,m} is the mth column of I_{m}. It is well known that the singular values of B_{m} are simple whenever the m-step SSLBD process does not break down. For later use, denote \widehat{B}_{m}=[B_{m},\gamma_{m}e_{m,m}]. Then we can write the second relation in (3.3) as AP_{m}=-Q_{m+1}\widehat{B}_{m}^{T}.
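A direct NumPy transcription of Algorithm 1 may be sketched as follows (our own sketch; reorthogonalization, treated in Section 5, is omitted, and the function name is ours):

import numpy as np

def sslbd(A, q1, m):
    # m-step SSLBD (Algorithm 1): returns P (n x m), Q (n x (m+1)) and the
    # bidiagonal entries beta, gamma so that the relations (3.3) hold.
    n = A.shape[0]
    P, Q = np.zeros((n, m)), np.zeros((n, m + 1))
    beta, gamma = np.zeros(m), np.zeros(m)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(m):
        s = A @ Q[:, j] - (gamma[j - 1] * P[:, j - 1] if j > 0 else 0.0)
        beta[j] = np.linalg.norm(s)
        if beta[j] == 0.0:
            break
        P[:, j] = s / beta[j]
        t = -A @ P[:, j] - beta[j] * Q[:, j]
        gamma[j] = np.linalg.norm(t)
        if gamma[j] == 0.0:
            break
        Q[:, j + 1] = t / gamma[j]
    return P, Q, beta, gamma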

Theorem 3.1.

Let P_{m} and Q_{m+1} be generated by Algorithm 1 without breakdown for m<\ell. Then P_{m} and Q_{m+1} are biorthogonal:

(3.4) Q_{m+1}^{T}P_{m}=\bm{0}.
Proof.

We use induction on m. Since \beta_{1}p_{1}=Aq_{1} and \gamma_{1}q_{2}=-Ap_{1}-\beta_{1}q_{1}, by the skew-symmetry of A and \beta_{1}>0, \gamma_{1}>0, we have

\beta_{1}q_{1}^{T}p_{1}=q_{1}^{T}Aq_{1}=0\qquad\mbox{and}\qquad\gamma_{1}q_{2}^{T}p_{1}=p_{1}^{T}Ap_{1}-\beta_{1}q_{1}^{T}p_{1}=0,

which means that Q_{2}^{T}P_{1}=\bm{0} and thus proves (3.4) for m=1.

Suppose that (3.4) holds for m=1,2,\dots,j-1, i.e., Q_{j}^{T}P_{j-1}=\bm{0}. For m=j, partition

(3.5) Q_{j+1}^{T}P_{j}=\begin{bmatrix}Q_{j}^{T}\\ q_{j+1}^{T}\end{bmatrix}P_{j}=\begin{bmatrix}Q_{j}^{T}P_{j}\\ q_{j+1}^{T}P_{j}\end{bmatrix}.

It follows from (3.3) and the inductive hypothesis Q_{j}^{T}P_{j-1}=\bm{0} that

Q_{j}^{T}AQ_{j-1}=Q_{j}^{T}P_{j-1}B_{j-1}=\bm{0},

which means that Q_{j-1}^{T}AQ_{j-1}=\bm{0} and q_{j}^{T}AQ_{j-1}=\bm{0}. Since \beta_{i}>0, i=1,\dots,j, the matrix B_{j} is nonsingular. From (3.3) and the above relations we have

(3.6) Q_{j}^{T}P_{j}=Q_{j}^{T}AQ_{j}B_{j}^{-1}=\begin{bmatrix}Q_{j-1}^{T}AQ_{j-1}&Q_{j-1}^{T}Aq_{j}\\ q_{j}^{T}AQ_{j-1}&q_{j}^{T}Aq_{j}\end{bmatrix}B_{j}^{-1}=\bm{0}

by noting that q_{j}^{T}Aq_{j}=0. Particularly, relation (3.6) shows that q_{j}^{T}P_{j}=\bm{0} and p_{j}^{T}Q_{j}=\bm{0}. Making use of them and AP_{j-1}=-Q_{j}\widehat{B}_{j-1}^{T}, and noticing that \gamma_{j}>0, we obtain

(3.7) q_{j+1}^{T}P_{j}=\tfrac{1}{\gamma_{j}}(A^{T}p_{j}-\beta_{j}q_{j})^{T}P_{j}=\tfrac{1}{\gamma_{j}}p_{j}^{T}AP_{j}=\tfrac{1}{\gamma_{j}}\begin{bmatrix}p_{j}^{T}AP_{j-1}&p_{j}^{T}Ap_{j}\end{bmatrix}=\tfrac{1}{\gamma_{j}}\begin{bmatrix}-p_{j}^{T}Q_{j}\widehat{B}_{j-1}^{T}&\bm{0}\end{bmatrix}=\bm{0}.

Applying (3.6) and (3.7) to (3.5) yields (3.4) for m=j. By induction, we have proved that relation (3.4) holds for all m<\ell.

Theorem 3.1 indicates that any vectors from \mathcal{U}_{m}=\mathcal{R}(P_{m}) and \mathcal{V}_{m}=\mathcal{R}(Q_{m}), which are called left and right subspaces, are biorthogonal. Therefore, any left and right approximate singular vectors extracted from \mathcal{U}_{m} and \mathcal{V}_{m} are biorthogonal, a desired property, since the left and right singular vectors of A are biorthogonal as well.

The SSLBD process must break down in at most \ell steps since A has \ell=n/2 distinct singular values; see [13] for a detailed analysis of the multiple singular value case. Once breakdown occurs for some m<\ell, all the singular values of B_{m} are exact ones of A, and m exact singular triplets of A have been found, as we shall see in Section 4. Furthermore, by the assumption that A is nonsingular, breakdown can occur only for \gamma_{m}=0; for if \beta_{m}=0, then B_{m} and thus A would have a zero singular value.

3.2 The SSLBD method

With the left and right searching subspaces \mathcal{U}_{m} and \mathcal{V}_{m}, we use the standard Rayleigh–Ritz projection, i.e., the standard extraction approach [2, 3, 5, 14, 17, 18], to compute the approximate singular triplets (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}) of A with the unit-length vectors \tilde{u}_{j}\in\mathcal{U}_{m} and \tilde{v}_{j}\in\mathcal{V}_{m} that satisfy the requirement

(3.8) \left\{\begin{aligned}&A\tilde{v}_{j}-\theta_{j}\tilde{u}_{j}\perp\mathcal{U}_{m},\\ &A\tilde{u}_{j}+\theta_{j}\tilde{v}_{j}\perp\mathcal{V}_{m},\end{aligned}\right.\qquad\qquad j=1,\dots,m.

Set \tilde{u}_{j}=P_{m}c_{j} and \tilde{v}_{j}=Q_{m}d_{j} for j=1,\dots,m. Then (3.3) indicates that (3.8) becomes

(3.9) B_{m}d_{j}=\theta_{j}c_{j},\qquad B_{m}^{T}c_{j}=\theta_{j}d_{j},\qquad j=1,\dots,m;

that is, (\theta_{j},c_{j},d_{j}), j=1,\dots,m are the singular triplets of B_{m}. Therefore, the standard extraction amounts to computing the SVD of B_{m} with the singular values ordered as \theta_{1}>\theta_{2}>\dots>\theta_{m}, and takes (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}), j=1,\dots,m as approximations to some singular triplets of A. The resulting method is called the SSLBD method, and the (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}) are called the Ritz approximations of A with respect to the left and right searching subspaces \mathcal{U}_{m} and \mathcal{V}_{m}; particularly, the \theta_{j} are the Ritz values, and the \tilde{u}_{j} and \tilde{v}_{j} are the left and right Ritz vectors, respectively.

Since Q_{m}^{T}A^{T}AQ_{m}=B_{m}^{T}B_{m}, by the Cauchy interlace theorem of eigenvalues (cf. [25, Theorem 10.1.1, pp.203]), for j fixed, as m increases, \theta_{j} and \theta_{m-j+1} monotonically converge to \sigma_{j} and \sigma_{\ell-j+1} from below and above, respectively. Therefore, we can take (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}), j=1,\dots,k as approximations to the k largest singular triplets (\sigma_{j},u_{j},v_{j}), j=1,\dots,k of A with k\ll m. Likewise, we may use (\theta_{m-j+1},\tilde{u}_{m-j+1},\tilde{v}_{m-j+1}), j=1,\dots,k to approximate the k smallest singular triplets (\sigma_{\ell-j+1},u_{\ell-j+1},v_{\ell-j+1}), j=1,\dots,k of A.
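Putting the pieces of this section together, the extraction step is a small dense SVD of B_{m} followed by a back-transformation of the singular vectors. A sketch of ours, reusing the hypothetical sslbd helper above:

import numpy as np

def ritz_triplets(A, q1, m, k):
    # Rayleigh-Ritz extraction (3.8)-(3.9): SVD of B_m, then back-transform.
    P, Q, beta, gamma = sslbd(A, q1, m)
    Bm = np.diag(beta) + np.diag(gamma[:m - 1], 1)  # upper bidiagonal B_m in (3.3)
    C, theta, Dt = np.linalg.svd(Bm)                # theta sorted decreasingly
    U_tilde = P @ C[:, :k]                          # left Ritz vectors
    V_tilde = Q[:, :m] @ Dt.T[:, :k]                # right Ritz vectors
    return theta[:k], U_tilde, V_tilde

The k returned triplets approximate (\sigma_{j},u_{j},v_{j}), j=1,\dots,k, from which conjugate approximate eigenpairs of A can then be recovered pairwise as in the sketch eigpairs_from_triplet of Section 2.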

4 A convergence analysis

For later use, for each j=1,\dots,\ell, we denote

(4.1) \Lambda_{j}=\begin{bmatrix}&\sigma_{j}\\ -\sigma_{j}&\end{bmatrix}\qquad\mbox{and}\qquad X_{j}=\begin{bmatrix}u_{j}&v_{j}\end{bmatrix}.

Then AX_{j}=X_{j}\Lambda_{j}, and \mathcal{X}_{j}:=\mathcal{R}(X_{j})=\mathrm{span}\{x_{\pm j}\} is the two dimensional eigenspace of A associated with the conjugate eigenvalues \lambda_{\pm j}, j=1,\dots,\ell. We call the pair (\Lambda_{j},X_{j}) a block eigenpair of A, j=1,\dots,\ell. From (2.3), \widehat{X}=[X_{1},\dots,X_{\ell}] is orthogonal, and (2.1) can be written as

(4.2) A=\widehat{X}\widehat{\Lambda}\widehat{X}^{T}\qquad\mbox{with}\qquad\widehat{\Lambda}=\operatorname{diag}\{\Lambda_{1},\dots,\Lambda_{\ell}\}.

To make a convergence analysis of the SSLBD method, we review some preliminaries. Let the columns of Z, W and W_{\perp} form orthonormal bases of \mathcal{Z}, \mathcal{W} and the orthogonal complement of \mathcal{W}, respectively. Denote by \angle(\mathcal{W},\mathcal{Z}) the canonical angles between \mathcal{W} and \mathcal{Z} [10, pp.329–330]. The 2-norm distance between \mathcal{Z} and \mathcal{W} is defined by

\|\sin\angle(\mathcal{W},\mathcal{Z})\|=\|W_{\perp}^{H}Z\|=\|(I-WW^{H})Z\|,

which is the sine of the largest canonical angle between \mathcal{W} and \mathcal{Z} (cf. Jia and Stewart [16]). It is worthwhile to point out that if the dimensions of \mathcal{W} and \mathcal{Z} are not equal, then this measure is not symmetric in its arguments. In fact, if the dimension of \mathcal{W} is greater than that of \mathcal{Z}, then \|\sin\angle(\mathcal{Z},\mathcal{W})\|=1, although generally \|\sin\angle(\mathcal{W},\mathcal{Z})\|<1. In this paper, we will use the F-norm distance

(4.3) \|\sin\angle(\mathcal{W},\mathcal{Z})\|_{F}=\|W_{\perp}^{H}Z\|_{F},

which equals the square root of the sum of squares of the sines of all the canonical angles between the two subspaces. Correspondingly, we define \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F} to be the square root of the sum of squares of the tangents of all the canonical angles between the subspaces. These tangents are the generalized singular values of the matrix pair \{W_{\perp}^{H}Z,W^{H}Z\}.

Notice that for \mathcal{W} and \mathcal{Z} with equal dimensions, if W^{H}Z is nonsingular, then the tangents of the canonical angles are the singular values of W_{\perp}^{H}Z(W^{H}Z)^{-1}, so that

(4.4) \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F}=\|W_{\perp}^{H}Z(W^{H}Z)^{-1}\|_{F}=\|(I-WW^{H})Z(W^{H}Z)^{-1}\|_{F},

which also holds when the F-norm is replaced by the 2-norm. We remark that the inverse in the above cannot be replaced by the pseudoinverse {\dagger} when W^{H}Z is singular. In fact, by definition, \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F} is infinite when W^{H}Z is singular, and the generalized singular values of \{W_{\perp}^{H}Z,W^{H}Z\} are not the singular values of W_{\perp}^{H}Z(W^{H}Z)^{\dagger} in this case.

Note that the real and imaginary parts of the eigenvectors x_{\pm j} of A can be interchanged, since \mathrm{i}x_{\pm j} are also eigenvectors of A, and its left and right singular vectors u_{j} and v_{j} are the right and left ones, too (cf. (2.4)). As we have pointed out, any pair of approximate left and right singular vectors extracted from the biorthogonal \mathcal{U}_{m} and \mathcal{V}_{m} are mutually orthogonal, from which we construct the real and imaginary parts of an approximate eigenvector of A. As a consequence, in the SVD context of the skew-symmetric A, because of these properties, when analyzing the convergence of the SSLBD method, we consider the orthogonal direct sum \mathcal{U}_{m}\oplus\mathcal{V}_{m} as a whole, and estimate the distance between a desired two dimensional eigenspace \mathcal{X}_{j} of A and the 2m-dimensional subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m} for j small. We remind the reader that their dimensions are unequal for m>1. Using these distances and their estimates, we can establish a priori error bounds for the approximate eigenvalues and approximate eigenspaces obtained by the SSLBD method, showing how fast they converge when computing the largest eigenvalues in magnitude and the associated eigenvectors.

In terms of the definition of \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F}, we present the following estimate for \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}.

Theorem 4.1.

Let \mathcal{U}_{m} and \mathcal{V}_{m} be defined by (3.2), and suppose that X_{j}^{H}Y with the initial Y=[p_{1},q_{1}] is nonsingular. Then the following estimate holds for any integer 1\leq j<m:

(4.5) \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}\leq\frac{\eta_{j}}{\chi_{m-j}(\xi_{j})}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F},

where \chi_{i}(\cdot) is the degree i Chebyshev polynomial of the first kind and

(4.6) \xi_{j}=1+2\cdot\frac{\sigma_{j}^{2}-\sigma_{j+1}^{2}}{\sigma_{j+1}^{2}-\sigma_{\ell}^{2}}\qquad\mbox{and}\qquad\eta_{j}=\begin{cases}1,&\mbox{if }j=1,\\ \prod_{i=1}^{j-1}\frac{\sigma_{i}^{2}-\sigma_{\ell}^{2}}{\sigma_{i}^{2}-\sigma_{j}^{2}},&\mbox{if }j>1.\end{cases}
Proof.

For a fixed 1\leq j<m, denote by \widehat{\Lambda}_{j}=\operatorname{diag}\{\Lambda_{1},\dots,\Lambda_{j-1},\Lambda_{j+1},\dots,\Lambda_{\ell}\} and \widehat{X}_{j}=[X_{1},\dots,X_{j-1},X_{j+1},\dots,X_{\ell}] the matrices obtained by deleting \Lambda_{j} and X_{j} from \widehat{\Lambda} and \widehat{X} in (4.2), respectively. Then relation (4.2) can be written as

(4.7) A=\widehat{X}_{j}\widehat{\Lambda}_{j}\widehat{X}_{j}^{T}+X_{j}\Lambda_{j}X_{j}^{T}.

Notice that Y=[p_{1},q_{1}] is orthonormal by Theorem 3.1, and \mathcal{Y}=\mathcal{R}(Y)=\mathcal{U}_{1}\oplus\mathcal{V}_{1}. Since \widehat{X} in (4.2) is orthogonal, there exists an orthonormal matrix G=[G_{1}^{T},\dots,G_{\ell}^{T}]^{T} with G_{1},\dots,G_{\ell}\in\mathbb{R}^{2\times 2} such that

(4.8) Y=\widehat{X}G=\widehat{X}_{j}\widehat{G}_{j}+X_{j}G_{j}\qquad\mbox{with}\qquad\widehat{G}_{j}=[G_{1},\dots,G_{j-1},G_{j+1},\dots,G_{\ell}].

For an arbitrary polynomial \rho(\cdot) in \mathcal{P}^{m-1}, the set of polynomials of degree no more than m-1, that satisfies \rho(-\sigma_{j}^{2})\neq 0, write Y_{\rho}=\rho(A^{2})Y and \mathcal{Y}_{\rho}=\mathcal{R}(Y_{\rho}). Then

(4.9) Y_{\rho}=\widehat{X}_{j}\rho(\widehat{\Lambda}_{j}^{2})\widehat{G}_{j}+X_{j}\rho(\Lambda_{j}^{2})G_{j}.

From (4.1), it is known that \Lambda_{i}^{2}=-\sigma_{i}^{2}I_{2}, i=1,\dots,\ell.

Since G_{j}=X_{j}^{H}Y is nonsingular, \rho(-\sigma_{j}^{2})\neq 0, and the columns of Y_{\rho}(Y_{\rho}^{T}Y_{\rho})^{-1/2} form an orthonormal basis of \mathcal{Y}_{\rho}, by definition (4.4) and relation (4.9), we obtain

(4.10) \begin{aligned}\|\tan\angle(\mathcal{Y}_{\rho},\mathcal{X}_{j})\|_{F}&=\|(I-X_{j}X_{j}^{T})Y_{\rho}(Y_{\rho}^{T}Y_{\rho})^{-1/2}(X_{j}^{T}Y_{\rho}(Y_{\rho}^{T}Y_{\rho})^{-1/2})^{-1}\|_{F}\\
&=\|(I-X_{j}X_{j}^{T})Y_{\rho}(X_{j}^{T}Y_{\rho})^{-1}\|_{F}=\|\rho(\widehat{\Lambda}_{j}^{2})\widehat{G}_{j}(\rho(\Lambda_{j}^{2})G_{j})^{-1}\|_{F}\\
&\leq\frac{\|\rho(\widehat{\Lambda}_{j}^{2})\|}{|\rho(-\sigma_{j}^{2})|}\|\widehat{G}_{j}G_{j}^{-1}\|_{F}=\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|\widehat{G}_{j}G_{j}^{-1}\|_{F}\\
&=\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|(I-X_{j}X_{j}^{T})Y(X_{j}^{T}Y)^{-1}\|_{F}\qquad\mbox{by (4.8)}\\
&=\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}.\end{aligned}

Note that \mathcal{Y}_{\rho}\subset\mathcal{U}_{m}\oplus\mathcal{V}_{m} for any \rho\in\mathcal{P}^{m-1}. By definition and (4.10), it holds that

\begin{aligned}\|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}&\leq\min_{\rho\in\mathcal{P}^{m-1}}\|\tan\angle(\mathcal{Y}_{\rho},\mathcal{X}_{j})\|_{F}\\
&\leq\min_{\rho\in\mathcal{P}^{m-1}}\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}\\
&=\min_{\rho\in\mathcal{P}^{m-1},\,\rho(-\sigma_{j}^{2})=1}\max_{i\neq j}|\rho(-\sigma_{i}^{2})|\,\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}\\
&\leq\frac{\eta_{j}}{\chi_{m-j}(\xi_{j})}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F},\end{aligned}

where the last inequality comes from the proof of Theorem 1 in [27], \chi_{m-j}(\cdot) is the degree m-j Chebyshev polynomial of the first kind, and \xi_{j} and \eta_{j} are defined by (4.6).

Theorem 4.1 establishes estimates for how well the searching subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m} approximates \mathcal{X}_{j}. But one must be aware that they are only significant for j\ll m. The scalars \eta_{j}\geq 1 and \xi_{j}>1 defined by (4.6) are constants depending only on the eigenvalue or singular value distribution of A. For a fixed integer j\ll m, as long as the initial searching subspace \mathcal{Y} contains some information on \mathcal{X}_{j}, that is, \|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F} is finite in (4.5), then the larger m is, the smaller \frac{\eta_{j}}{\chi_{m-j}(\xi_{j})} is and the closer \mathcal{X}_{j} is to \mathcal{U}_{m}\oplus\mathcal{V}_{m}. Moreover, the better the singular value \sigma_{j} is separated from the other singular values \sigma_{i}\neq\sigma_{j} of A, the smaller \eta_{j} and the larger \xi_{j} are, meaning that \mathcal{X}_{j} approaches \mathcal{U}_{m}\oplus\mathcal{V}_{m} more quickly as m increases. Generally, \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F} decays faster for j smaller. Two extreme cases are \|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}=0 and \|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}=+\infty. In the first case, Algorithm 1 breaks down at step m=1, and we already have the exact eigenspace \mathcal{X}_{j}=\mathcal{Y}. The second case indicates that the initial \mathcal{Y} is deficient in \mathcal{X}_{j}, so that \mathcal{U}_{m}\oplus\mathcal{V}_{m} does not contain any information on \mathcal{X}_{j}; consequently, one cannot find any meaningful approximations to u_{j} and v_{j} from \mathcal{U}_{m} and \mathcal{V}_{m} for any m<\ell.
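To get a feel for the decay factor \frac{\eta_{j}}{\chi_{m-j}(\xi_{j})} in (4.5), one can evaluate it directly, using \chi_{k}(x)=\cosh(k\,\mathrm{arccosh}(x)) for x\geq 1. A small illustration of ours with made-up singular values:

import numpy as np

# Decay factor eta_1 / chi_{m-1}(xi_1) in (4.5) for made-up singular values.
sigma = np.array([10.0, 8.0, 6.0, 4.0, 2.0, 1.0])   # sigma_1 > ... > sigma_l
xi1 = 1 + 2 * (sigma[0]**2 - sigma[1]**2) / (sigma[1]**2 - sigma[-1]**2)
for m in (2, 4, 8, 16):
    chi = np.cosh((m - 1) * np.arccosh(xi1))        # Chebyshev value; eta_1 = 1
    print(m, 1.0 / chi)                             # shrinks geometrically with m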

Theorem 4.1 shows that the bounds tend to zero for j small as m increases. Using a similar proof, we can establish the following analogue of (4.5) for j small:

(4.11) \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{\ell-j+1})\|_{F}\leq\frac{\hat{\eta}_{j}}{\chi_{m-j}(\hat{\xi}_{j})}\|\tan\angle(\mathcal{Y},\mathcal{X}_{\ell-j+1})\|_{F},

where

\hat{\xi}_{j}=1+2\cdot\frac{\sigma_{\ell-j+1}^{2}-\sigma_{\ell-j}^{2}}{\sigma_{\ell-j}^{2}-\sigma_{1}^{2}}\qquad\mbox{and}\qquad\hat{\eta}_{j}=\begin{cases}1,&\mbox{if }j=1,\\ \prod_{i=1}^{j-1}\frac{\sigma_{\ell-i}^{2}-\sigma_{1}^{2}}{\sigma_{\ell-i}^{2}-\sigma_{\ell-j}^{2}},&\mbox{if }j>1.\end{cases}

Similar arguments show that \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{\ell-j+1})\|_{F} generally tends to zero faster for j smaller as m increases. This indicates that \mathcal{U}_{m}\oplus\mathcal{V}_{m} also favors the eigenvectors of A corresponding to several smallest eigenvalues in magnitude if the smallest singular values \sigma_{\ell-j+1} are not clustered. In the sequel, for brevity, we only discuss the computation of the largest conjugate eigenvalues in magnitude and the associated eigenvectors of A.

In terms of \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}, we next present a priori accuracy estimates for the Ritz approximations computed by the SSLBD method. To this end, we first establish the following two lemmas.

Lemma 4.2.

Let \mathcal{Q}_{m} be the orthogonal projector onto the subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m}. Then

(4.12) (\Theta_{j},Z_{j})=\left(\begin{bmatrix}&\theta_{j}\\ -\theta_{j}&\end{bmatrix},\begin{bmatrix}\tilde{u}_{j}&\tilde{v}_{j}\end{bmatrix}\right),\qquad j=1,2,\ldots,m

are the block eigenpairs of the linear operator \mathcal{Q}_{m}A\mathcal{Q}_{m} restricted to \mathcal{U}_{m}\oplus\mathcal{V}_{m}:

(4.13) \mathcal{Q}_{m}A\mathcal{Q}_{m}Z_{j}=Z_{j}\Theta_{j}.
Proof.

Since the columns of [P_{m},Q_{m}] form an orthonormal basis of \mathcal{U}_{m}\oplus\mathcal{V}_{m}, the orthogonal projector onto \mathcal{U}_{m}\oplus\mathcal{V}_{m} is \mathcal{Q}_{m}=[P_{m},Q_{m}][P_{m},Q_{m}]^{T}. By \tilde{u}_{j}=P_{m}c_{j} and \tilde{v}_{j}=Q_{m}d_{j}, we have

Z_{j}=\begin{bmatrix}\tilde{u}_{j}&\tilde{v}_{j}\end{bmatrix}=\begin{bmatrix}P_{m}&Q_{m}\end{bmatrix}\begin{bmatrix}c_{j}&\\ &d_{j}\end{bmatrix}.

It is known from the proof of Theorem 3.1 that P_{m}^{T}AP_{m}=Q_{m}^{T}AQ_{m}=\bm{0}. Making use of that and P_{m}^{T}AQ_{m}=B_{m}, Q_{m}^{T}AP_{m}=-B_{m}^{T} as well as (3.9), by \mathcal{Q}_{m}Z_{j}=Z_{j}, we obtain

\mathcal{Q}_{m}A\mathcal{Q}_{m}Z_{j}=\begin{bmatrix}P_{m}&Q_{m}\end{bmatrix}\begin{bmatrix}&B_{m}\\ -B_{m}^{T}&\end{bmatrix}\begin{bmatrix}c_{j}&\\ &d_{j}\end{bmatrix}=\begin{bmatrix}P_{m}&Q_{m}\end{bmatrix}\begin{bmatrix}c_{j}&\\ &d_{j}\end{bmatrix}\begin{bmatrix}&\theta_{j}\\ -\theta_{j}&\end{bmatrix}=Z_{j}\Theta_{j}.
Lemma 4.3.

With the notations of Lemma 4.2, for an arbitrary E\in\mathbb{R}^{2\times 2}, it holds that

(4.14) \|\Theta_{j}E-E\Lambda_{i}\|_{F}\geq|\theta_{j}-\sigma_{i}|\|E\|_{F},

where \Theta_{j}, j=1,\dots,m and \Lambda_{i}, i=1,\dots,\ell are defined by (4.12) and (4.1), respectively.

Proof.

Since \Theta_{j}=\theta_{j}\widehat{I}_{2} and \Lambda_{i}=\sigma_{i}\widehat{I}_{2} with \widehat{I}_{2}=\begin{bmatrix}&1\\ -1&\end{bmatrix}, by the fact that \|\widehat{I}_{2}E\|_{F}=\|E\|_{F} and \|E\widehat{I}_{2}\|_{F}=\|E\|_{F}, we have

(4.15) \begin{aligned}\|\Theta_{j}E-E\Lambda_{i}\|_{F}&=\sqrt{\|\Theta_{j}E\|_{F}^{2}+\|E\Lambda_{i}\|_{F}^{2}-2\,\mathrm{tr}(\Lambda_{i}^{T}E^{T}\Theta_{j}E)}\\
&=\sqrt{\theta_{j}^{2}\|E\|_{F}^{2}+\sigma_{i}^{2}\|E\|_{F}^{2}-2\theta_{j}\sigma_{i}\,\mathrm{tr}(\widehat{I}_{2}^{T}E^{T}\widehat{I}_{2}E)}.\end{aligned}

Denote E=\begin{bmatrix}e_{11}&e_{12}\\ e_{21}&e_{22}\end{bmatrix}. Then the trace

\mathrm{tr}(\widehat{I}_{2}^{T}E^{T}\widehat{I}_{2}E)=2e_{11}e_{22}-2e_{12}e_{21}\leq e_{11}^{2}+e_{22}^{2}+e_{12}^{2}+e_{21}^{2}=\|E\|_{F}^{2},

applying which to (4.15) gives

(4.16) \|\Theta_{j}E-E\Lambda_{i}\|_{F}\geq\sqrt{\theta_{j}^{2}\|E\|_{F}^{2}+\sigma_{i}^{2}\|E\|_{F}^{2}-2\theta_{j}\sigma_{i}\|E\|_{F}^{2}}=|\theta_{j}-\sigma_{i}|\|E\|_{F}.

Making use of Lemmas 4.2 and 4.3, we are now in a position to establish a priori accuracy estimates for the Ritz approximations.

Theorem 4.4.

For (\sigma_{j},u_{j},v_{j}), j=1,\dots,\ell, assume that (\theta_{j^{\prime}},\tilde{u}_{j^{\prime}},\tilde{v}_{j^{\prime}}) is a Ritz approximation to the desired (\sigma_{j},u_{j},v_{j}), where \theta_{j^{\prime}} is the closest to \sigma_{j} among the Ritz values. Denote \mathcal{X}_{j}=\mathrm{span}\{u_{j},v_{j}\} and \mathcal{Z}_{j^{\prime}}=\mathrm{span}\{\tilde{u}_{j^{\prime}},\tilde{v}_{j^{\prime}}\}. Then

(4.17) \|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}\leq\sqrt{1+\frac{\|\mathcal{Q}_{m}A(I-\mathcal{Q}_{m})\|^{2}}{\min_{i\neq j^{\prime}}|\sigma_{j}-\theta_{i}|^{2}}}\,\|\sin\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F},

where \mathcal{Q}_{m} is the orthogonal projector onto the subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m}.

Proof.

Notice that the two dimensional subspace \mathcal{W}_{j}^{\prime}=\mathcal{Q}_{m}\mathcal{X}_{j}\subset\mathcal{U}_{m}\oplus\mathcal{V}_{m} is the orthogonal projection of \mathcal{X}_{j} onto \mathcal{U}_{m}\oplus\mathcal{V}_{m} with respect to the F-norm. Therefore, by the definition of \|\sin\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}, we have

(4.18) \|\sin\angle(\mathcal{W}_{j}^{\prime},\mathcal{X}_{j})\|_{F}=\|\sin\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}.

Let W_{1}\in\mathbb{R}^{n\times 2} and W_{2}\in\mathbb{R}^{n\times(n-2m)} be the orthonormal basis matrices of \mathcal{W}_{j}^{\prime} and the orthogonal complement of \mathcal{U}_{m}\oplus\mathcal{V}_{m} with respect to \mathbb{R}^{n}, respectively. The orthonormal basis matrix X_{j}=[u_{j},v_{j}] of \mathcal{X}_{j} can be expressed as

(4.19) X_{j}=W_{1}H_{1}+W_{2}H_{2},

where H_{1}\in\mathbb{R}^{2\times 2} and H_{2}\in\mathbb{R}^{(n-2m)\times 2}. Since AX_{j}=X_{j}\Lambda_{j}, from the above relation we have

AW_{1}H_{1}-W_{1}H_{1}\Lambda_{j}=W_{2}H_{2}\Lambda_{j}-AW_{2}H_{2}.

Premultiplying this relation by \mathcal{Q}_{m} and taking F-norms on both sides, from \mathcal{Q}_{m}W_{1}=W_{1}, \mathcal{Q}_{m}W_{2}=\bm{0} and I-\mathcal{Q}_{m}=W_{2}W_{2}^{T}, we obtain

(4.20) \|\mathcal{Q}_{m}AW_{1}H_{1}-W_{1}H_{1}\Lambda_{j}\|_{F}=\|\mathcal{Q}_{m}AW_{2}H_{2}\|_{F}\leq\|\mathcal{Q}_{m}A(I-\mathcal{Q}_{m})\|\|H_{2}\|_{F}.

Recall the definition of Z_{j^{\prime}} in (4.12), and denote \widehat{Z}_{j^{\prime}}=[Z_{1},\dots,Z_{j^{\prime}-1},Z_{j^{\prime}+1},\dots,Z_{m}]. Note that the columns of [Z_{j^{\prime}},\widehat{Z}_{j^{\prime}}] form an orthonormal basis of \mathcal{U}_{m}\oplus\mathcal{V}_{m}. Since \mathcal{W}_{j}^{\prime}\subset\mathcal{U}_{m}\oplus\mathcal{V}_{m}, we can decompose its orthonormal basis matrix W_{1} into

(4.21) W_{1}=Z_{j^{\prime}}F_{1}+\widehat{Z}_{j^{\prime}}F_{2},

where F_{1}\in\mathbb{R}^{2\times 2} and F_{2}\in\mathbb{R}^{(2m-2)\times 2}. Lemma 4.2 shows that \mathcal{Q}_{m}AZ_{j^{\prime}}=Z_{j^{\prime}}\Theta_{j^{\prime}} and \mathcal{Q}_{m}A\widehat{Z}_{j^{\prime}}=\widehat{Z}_{j^{\prime}}\widehat{\Theta}_{j^{\prime}} with \widehat{\Theta}_{j^{\prime}}=\operatorname{diag}\{\Theta_{1},\dots,\Theta_{j^{\prime}-1},\Theta_{j^{\prime}+1},\dots,\Theta_{m}\}. Therefore, by (4.21), we obtain

(4.22) \begin{aligned}\|\mathcal{Q}_{m}AW_{1}H_{1}-W_{1}H_{1}\Lambda_{j}\|_{F}&=\|\mathcal{Q}_{m}AZ_{j^{\prime}}F_{1}H_{1}-Z_{j^{\prime}}F_{1}H_{1}\Lambda_{j}+\mathcal{Q}_{m}A\widehat{Z}_{j^{\prime}}F_{2}H_{1}-\widehat{Z}_{j^{\prime}}F_{2}H_{1}\Lambda_{j}\|_{F}\\
&=\|Z_{j^{\prime}}(\Theta_{j^{\prime}}F_{1}H_{1}-F_{1}H_{1}\Lambda_{j})+\widehat{Z}_{j^{\prime}}(\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j})\|_{F}\\
&\geq\|\widehat{Z}_{j^{\prime}}(\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j})\|_{F}\\
&=\|\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j}\|_{F},\end{aligned}

where the last two relations hold since [Z_{j^{\prime}},\widehat{Z}_{j^{\prime}}] is orthonormal. Write

E=F_{2}H_{1}=[E_{1}^{T},\dots,E_{j^{\prime}-1}^{T},E_{j^{\prime}+1}^{T},\dots,E_{m}^{T}]^{T}

with each E_{i}\in\mathbb{R}^{2\times 2}. Then by Lemma 4.3 we obtain

(4.23) \begin{aligned}\|\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j}\|_{F}^{2}&=\sum_{i=1,i\neq j^{\prime}}^{m}\|\Theta_{i}E_{i}-E_{i}\Lambda_{j}\|_{F}^{2}\geq\sum_{i=1,i\neq j^{\prime}}^{m}|\theta_{i}-\sigma_{j}|^{2}\|E_{i}\|_{F}^{2}\\
&\geq\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|^{2}\sum_{i=1,i\neq j^{\prime}}^{m}\|E_{i}\|_{F}^{2}=\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|^{2}\|E\|_{F}^{2}\\
&=\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|^{2}\|F_{2}H_{1}\|_{F}^{2}.\end{aligned}

Therefore, combining (4.20), (4.22) and (4.23), we obtain

(4.24) \|F_{2}H_{1}\|_{F}\leq\frac{\|\mathcal{Q}_{m}A(I-\mathcal{Q}_{m})\|}{\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|}\|H_{2}\|_{F}.

Since both X_{j} and W_{1} are column orthonormal, decomposition (4.19) shows that

(4.25) \|\sin\angle(\mathcal{W}_{j}^{\prime},\mathcal{X}_{j})\|_{F}=\|(I-W_{1}W_{1}^{T})X_{j}\|_{F}=\|H_{2}\|_{F}.

Substituting (4.21) into (4.19) yields

(4.26) X_{j}=Z_{j^{\prime}}F_{1}H_{1}+\widehat{Z}_{j^{\prime}}F_{2}H_{1}+W_{2}H_{2},

which, by the fact that the columns of [Z_{j^{\prime}},\widehat{Z}_{j^{\prime}},W_{2}] are orthonormal, means that

\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}=\|(I-Z_{j^{\prime}}Z_{j^{\prime}}^{T})X_{j}\|_{F}^{2}=\|\widehat{Z}_{j^{\prime}}F_{2}H_{1}+W_{2}H_{2}\|_{F}^{2}=\|F_{2}H_{1}\|_{F}^{2}+\|H_{2}\|_{F}^{2}.

Then (4.17) follows by applying (4.24), (4.25) and (4.18) in turn to the above relation.

Remark 3.

This theorem extends a classic result (cf. [29, pp.103, Theorem 4.6]) on the a priori error bound for the Ritz vector in the real symmetric (complex Hermitian) case to the skew-symmetric case for the Ritz block \mathcal{Z}_{j^{\prime}}=\mathrm{span}\{\tilde{u}_{j^{\prime}},\tilde{v}_{j^{\prime}}\}.

Theorem 4.5.

With the notations of Theorem 4.4, assume that the angles

(4.27) \angle(\tilde{u}_{j^{\prime}},u_{j})\leq\frac{\pi}{4}\qquad\mbox{and}\qquad\angle(\tilde{v}_{j^{\prime}},v_{j})\leq\frac{\pi}{4}.

Then

(4.28) |\theta_{j^{\prime}}-\sigma_{j}|\leq\sigma_{j}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}+\frac{\sigma_{1}}{\sqrt{2}}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2},
(4.29) |\theta_{j^{\prime}}-\sigma_{j}|\leq(\sigma_{1}+\sigma_{j})\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.
Proof.

Decompose the orthonormal matrix Z_{j^{\prime}} defined by (4.12) into the orthogonal direct sum

(4.30) Z_{j^{\prime}}=X_{j}H+\widehat{X}_{j}\widehat{H},

where H\in\mathbb{R}^{2\times 2} and \widehat{H}\in\mathbb{R}^{(n-2)\times 2}. Then

(4.31) H^{T}H+\widehat{H}^{T}\widehat{H}=I_{2}.

By (4.13), (4.30) and AX_{j}=X_{j}\Lambda_{j}, A\widehat{X}_{j}=\widehat{X}_{j}\widehat{\Lambda}_{j}, we obtain

(4.32) \Theta_{j^{\prime}}-\Lambda_{j}=Z_{j^{\prime}}^{T}AZ_{j^{\prime}}-\Lambda_{j}=H^{T}\Lambda_{j}H+\widehat{H}^{T}\widehat{\Lambda}_{j}\widehat{H}-\Lambda_{j}=\sigma_{j}\left(H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\right)+\widehat{H}^{T}\widehat{\Lambda}_{j}\widehat{H},

where the last equality holds since $\Lambda_{j}=\sigma_{j}\widehat{I}_{2}$ with $\widehat{I}_{2}=\begin{bmatrix}&1\\ -1&\end{bmatrix}$. Notice that $\Theta_{j^{\prime}}=\theta_{j^{\prime}}\widehat{I}_{2}$ and $\|\widehat{I}_{2}\|_{F}=\sqrt{2}$. Taking the $F$-norms of both sides of the above relation and exploiting $\|\widehat{\Lambda}_{j}\|\leq\sigma_{1}$ and $\|\widehat{H}\|_{F}=\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}$ yields

(4.33) 2|θjσj|\displaystyle\sqrt{2}|\theta_{j^{\prime}}-\sigma_{j}| \displaystyle\leq σjHTI^2HI^2F+Λ^jH^F2\displaystyle\sigma_{j}\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|_{F}+\|\widehat{\Lambda}_{j}\|\|\widehat{H}\|_{F}^{2}
\displaystyle\leq σjHTI^2HI^2F+σ1sin(𝒵j,𝒳j)F2.\displaystyle\sigma_{j}\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|_{F}+\sigma_{1}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}.

Write H=[hij]H=[h_{ij}], and notice that h11=cos(u~j,uj)h_{11}=\cos\angle(\tilde{u}_{j^{\prime}},u_{j}) and h22=cos(v~j,vj)h_{22}=\cos\angle(\tilde{v}_{j^{\prime}},v_{j}). Then by (4.27) we have

(4.34) h1112,h2212and|h12|12,|h21|12h_{11}\geq\frac{1}{\sqrt{2}},\qquad h_{22}\geq\frac{1}{\sqrt{2}}\qquad\mbox{and}\qquad|h_{12}|\leq\frac{1}{\sqrt{2}},\qquad|h_{21}|\leq\frac{1}{\sqrt{2}}

since, from (4.31), the 2-norms of columns [h11,h21]T[h_{11},h_{21}]^{T} and [h12,h22]T[h_{12},h_{22}]^{T} of HH are no more than one. As a consequence, we obtain

det(H)\displaystyle{\rm det}(H) =\displaystyle= h11h22h12h21h11h22|h12h21|\displaystyle h_{11}h_{22}-h_{12}h_{21}\geq h_{11}h_{22}-|h_{12}h_{21}|
\displaystyle\geq 12121212=0.\displaystyle\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}-\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}=0.

Let $\sigma_{1}(H)\geq\sigma_{2}(H)$ be the singular values of $H$. By definition we have $\sigma_{2}(H)=\|\cos\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|$; since $\det(H)\geq 0$, it follows that

det(H)=σ1(H)σ2(H)σ22(H)=cos(𝒵j,𝒳j)2.{\rm det}(H)=\sigma_{1}(H)\sigma_{2}(H)\geq\sigma_{2}^{2}(H)=\|\cos\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.

Then it is straightforward to verify that

HTI^2HI^2F\displaystyle\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|_{F} =\displaystyle= [h11h22h12h211h12h21h11h22+1]F\displaystyle\left\|\begin{bmatrix}&h_{11}h_{22}-h_{12}h_{21}-1\\ h_{12}h_{21}-h_{11}h_{22}+1&\end{bmatrix}\right\|_{F}
=\displaystyle= 2(1det(H))2(1cos(𝒵j,𝒳j)2)\displaystyle\sqrt{2}(1-{\rm det}(H))\leq\sqrt{2}(1-\|\cos\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2})
\displaystyle\leq 2sin(𝒵j,𝒳j)2.\displaystyle\sqrt{2}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.

Applying the above inequality to the right hand side of (4.33), we obtain

|θjσj|σjsin(𝒵j,𝒳j)2+σ12sin(𝒵j,𝒳j)F2.|\theta_{j^{\prime}}-\sigma_{j}|\leq\sigma_{j}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}+\frac{\sigma_{1}}{\sqrt{2}}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}.

Similarly, we get bound (4.29) from (4.32) by using H^=sin(𝒵j,𝒳j)\|\widehat{H}\|=\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\| and

HTI^2HI^2=1det(H)sin(𝒵j,𝒳j)2.\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|=1-{\det}(H)\leq\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.

Bound (4.28) can be relaxed to

|θjσj|(1+12)σ1sin(𝒵j,𝒳j)F2.|\theta_{j^{\prime}}-\sigma_{j}|\leq\left(1+\frac{1}{\sqrt{2}}\right)\sigma_{1}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}.

Applying (4.17) to $\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}$, we can estimate how fast $|\theta_{j^{\prime}}-\sigma_{j}|$ tends to zero as $m$ increases. Since $\mathcal{X}_{j}$ approaches $\mathcal{U}_{m}\oplus\mathcal{V}_{m}$ for a fixed small $j\ll m$ as $m$ increases, if $\sigma_{j}$ is well separated from the Ritz values of $A$ other than the approximate singular value $\theta_{j^{\prime}}$, that is, if $\min_{i\neq j^{\prime}}|\sigma_{j}-\theta_{i}|$ is not small, then $\mathcal{Z}_{j^{\prime}}$ converges to $\mathcal{X}_{j}$ as fast as $\mathcal{X}_{j}$ tends to $\mathcal{U}_{m}\oplus\mathcal{V}_{m}$. In this case, Theorem 4.5 shows that the convergence of $\theta_{j^{\prime}}$ to $\sigma_{j}$ is quadratic relative to that of $\mathcal{Z}_{j^{\prime}}$. We point out that assumption (4.27) is weak and is satisfied quickly as $m$ increases.

For brevity, we now omit the subscript jj and denote by (θ,u~,v~)\left(\theta,\tilde{u},\tilde{v}\right) a Ritz approximation at the mmth step. We can then recover two conjugate approximate eigenpairs from (θ,u~,v~)(\theta,\tilde{u},\tilde{v}) in the form of (λ~±,x~±)=(±iθ,12(u~±iv~))(\tilde{\lambda}_{\pm},\tilde{x}_{\pm})=(\pm\mathrm{i}\theta,\frac{1}{\sqrt{2}}(\tilde{u}\pm\mathrm{i}\tilde{v})). Note that their residuals are

(4.35) r±=Ax~±λ~±x~±=rR±irIwith{rR=12(Au~+θv~),rI=12(Av~θu~).r_{\pm}=A\tilde{x}_{\pm}-\tilde{\lambda}_{\pm}\tilde{x}_{\pm}=r_{\mathrm{R}}\pm\mathrm{i}r_{\mathrm{I}}\qquad\mbox{with}\qquad\left\{\begin{aligned} r_{\mathrm{R}}&=\tfrac{1}{\sqrt{2}}(A\tilde{u}+\theta\tilde{v}),\\ r_{\mathrm{I}}&=\tfrac{1}{\sqrt{2}}(A\tilde{v}-\theta\tilde{u}).\end{aligned}\right.

Obviously, r±=𝟎r_{\pm}=\bm{0} if and only if (θ,u~,v~)(\theta,\tilde{u},\tilde{v}) is an exact singular triplet of AA. We claim that (λ~±,x~±)(\tilde{\lambda}_{\pm},\tilde{x}_{\pm}) has converged if its residual norm satisfies

(4.36) r±=rR2+rI2Atol,\|r_{\pm}\|=\sqrt{\|r_{\mathrm{R}}\|^{2}+\|r_{\mathrm{I}}\|^{2}}\leq\|A\|\cdot tol,

where tol>0tol>0 is a prescribed tolerance. In practical computations, one can replace A\|A\| by the largest Ritz value θ1\theta_{1}. Notice that

2r±=Au~+θv~2+Av~θu~2\sqrt{2}\|r_{\pm}\|=\sqrt{\|A\tilde{u}+\theta\tilde{v}\|^{2}+\|A\tilde{v}-\theta\tilde{u}\|^{2}}

is nothing but the residual norm of the Ritz approximation (θ,u~,v~)(\theta,\tilde{u},\tilde{v}) of AA. Therefore, the eigenvalue problem and SVD problem of AA essentially share the same general-purpose stopping criterion.
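To make the recovery and the criterion (4.36) concrete, the following minimal NumPy sketch forms the conjugate approximate eigenpairs and evaluates the residual norm from a Ritz approximation $(\theta,\tilde{u},\tilde{v})$; the function name and the dense representation of $A$ are our own illustrative assumptions, not part of the algorithm.

```python
import numpy as np

# A minimal sketch (not the paper's code): recover the two conjugate
# approximate eigenpairs (4.35) from a Ritz approximation (theta, u, v)
# with unit-length u and v, and evaluate ||r_pm|| used in (4.36).
def recover_conjugate_eigenpairs(A, theta, u, v):
    x_plus = (u + 1j * v) / np.sqrt(2.0)     # eigenvector for +i*theta
    x_minus = (u - 1j * v) / np.sqrt(2.0)    # eigenvector for -i*theta
    rR = (A @ u + theta * v) / np.sqrt(2.0)  # real part of r_plus
    rI = (A @ v - theta * u) / np.sqrt(2.0)  # imaginary part of r_plus
    res = np.hypot(np.linalg.norm(rR), np.linalg.norm(rI))  # ||r_pm||
    return (1j * theta, x_plus), (-1j * theta, x_minus), res
```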

By inserting u~=Pmc\tilde{u}=P_{m}c and v~=Qmd\tilde{v}=Q_{m}d into (4.35) and making use of (3.3) and (3.9), it is easily justified that

(4.37) r±=12BmTcθd2+γm2|em,mTc|2+Bmdθc2=12γm|em,mTc|.\|r_{\pm}\|=\tfrac{1}{\sqrt{2}}\sqrt{\|B_{m}^{T}c-\theta d\|^{2}+\gamma_{m}^{2}|e_{m,m}^{T}c|^{2}+\|B_{m}d-\theta c\|^{2}}=\tfrac{1}{\sqrt{2}}\gamma_{m}|e_{m,m}^{T}c|.

This indicates that we can calculate the residual norms of the Ritz approximations cheaply without explicitly forming the approximate singular vectors. We compute the approximate singular vectors of $A$ only after the corresponding residual norms defined by (4.37) have dropped below a prescribed tolerance. Moreover, (4.37) shows that once the $m$-step SSLBD process breaks down, i.e., $\gamma_{m}=0$, all the $m$ computed approximate singular triplets $(\theta,\tilde{u},\tilde{v})$ are exact singular triplets of $A$, as we have mentioned at the end of Section 3.1.
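In a sketch, the cheap evaluation (4.37) reads as follows; $B_m$ is stored as a dense array for simplicity, and the function name is illustrative.

```python
import numpy as np

# Residual norms of all Ritz approximations from the SVD of B_m alone,
# as in (4.37); gamma_m is the next off-diagonal coefficient of the
# SSLBD process. No approximate singular vectors of A are formed.
def ritz_residual_norms(B_m, gamma_m):
    C, thetas, DT = np.linalg.svd(B_m)  # B_m = C @ diag(thetas) @ DT
    # |e_m^T c_j| is the last entry of the j-th left singular vector.
    return thetas, gamma_m * np.abs(C[-1, :]) / np.sqrt(2.0)
```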

5 An implicitly restarted SSLBD algorithm with partial reorthogonalization

This section is devoted to efficient partial reorthogonalization and the development of an implicitly restarted SSLBD algorithm. Both are crucial for practical reliability: they avoid ghost phenomena among the computed Ritz approximations and make the algorithm behave as if it were run in exact arithmetic.

5.1 Partial reorthogonalization

We adopt a commonly used two-sided partial reorthogonalization strategy, as done in [14, 15, 18], to make the columns of $P_m$ and $Q_{m+1}$ numerically orthogonal to the level $\mathcal{O}(\sqrt{\varepsilon})$, where $\varepsilon$ is the machine precision. In finite precision arithmetic, however, such semi-orthogonality alone cannot guarantee the numerical semi-biorthogonality of $P_m$ and $Q_m$, which causes the computed Ritz values to contain duplicates or ghosts and delays convergence. To avoid this severe deficiency and make the method behave as if in exact arithmetic, one of our particular concerns is to design an effective and efficient partial reorthogonalization strategy so that the entries of $Q_m^TP_m$ are $\mathcal{O}(\sqrt{\varepsilon})$ in size; that is, the columns of $P_m$ and $Q_m$ are numerically semi-biorthogonal. As will turn out, achieving this goal efficiently is involved. That the semi-orthogonality and semi-biorthogonality suffice is based on the following fundamental result.

Theorem 5.1.

Let the bidiagonal BmB_{m} and Pm=[p1,,pm]P_{m}=[p_{1},\ldots,p_{m}] and Qm=[q1,,qm]Q_{m}=[q_{1},\ldots,q_{m}] be produced by the mm-step SSLBD process. Assume that βi,i=1,2,,m\beta_{i},\ i=1,2,\ldots,m and γi,i=1,2,,m1\gamma_{i},\ i=1,2,\ldots,m-1 defined in (3.3) are not small, and

(5.1) max{max1i<jm|piTpj|,max1i<jm|qiTqj|,max1i,jm|piTqj|}ϵm.\max\left\{\max_{1\leq i<j\leq m}\left|p_{i}^{T}p_{j}\right|,\max_{1\leq i<j\leq m}\left|q_{i}^{T}q_{j}\right|,\max_{1\leq i,j\leq m}\left|p_{i}^{T}q_{j}\right|\right\}\leq\sqrt{\frac{\epsilon}{m}}.

Let [qˇ1,pˇ1,,qˇm,pˇm][\check{q}_{1},\check{p}_{1},\dots,\check{q}_{m},\check{p}_{m}] be the Q-factor in the QR decomposition of Mm=[q1,p1,,qm,pm]M_{m}=[q_{1},p_{1},\dots,q_{m},p_{m}] with the diagonals of the upper triangular matrix being positive, and define the orthonormal matrices Pˇm=[pˇ1,,pˇm]\check{P}_{m}=[\check{p}_{1},\dots,\check{p}_{m}] and Qˇm=[qˇ1,,qˇm]\check{Q}_{m}=[\check{q}_{1},\dots,\check{q}_{m}]. Then

(5.2) PˇmTAQˇm=Bm+Δm,\check{P}_{m}^{T}A\check{Q}_{m}=B_{m}+\Delta_{m},

where the entries of Δm\Delta_{m} are 𝒪(Aϵ)\mathcal{O}(\|A\|\epsilon) in size.

Proof.

Recall the equivalence of decompositions (2.10) and (2.11), and notice that decomposition (2.11) can be realized by the skew-symmetric Lanczos process, and (2.10) can be computed by our SSLBD process. The result then follows from a proof analogous to that of Theorem 4 in [30], and we thus omit details.

Theorem 5.1 shows that if the computed Lanczos vectors are semi-orthogonal and semi-biorthogonal, then the upper bidiagonal matrix $B_m$ is, up to roundoff error, equal to the projection matrix $\check{P}_m^TA\check{Q}_m$ of $A$ with respect to the left and right searching subspaces $\mathcal{U}^{\prime}_m=\mathcal{R}(\check{P}_m)$ and $\mathcal{V}^{\prime}_m=\mathcal{R}(\check{Q}_m)$. This means that the singular values of $B_m$, i.e., the computed Ritz values, are as accurate as those of $\check{P}_m^TA\check{Q}_m$, i.e., the exact Ritz values, within the level $\mathcal{O}(\|A\|\epsilon)$. In other words, the semi-orthogonality and semi-biorthogonality suffice to guarantee that the computed and true Ritz values are equally accurate in finite precision arithmetic. Therefore, full numerical orthogonality and biorthogonality at the level $\mathcal{O}(\epsilon)$ do not help; they merely incur unnecessary cost for the accurate computation of singular values.

To achieve the desired numerical semi-orthogonality and semi-biorthogonality, we first need to estimate efficiently the levels of orthogonality and biorthogonality among all the Lanczos vectors so as to monitor whether (5.1) holds. Whenever (5.1) is violated, we use an efficient partial reorthogonalization to restore it. To this end, we adapt the recurrences in Larsen's PhD thesis [17] to estimate the desired semi-orthogonality and semi-biorthogonality effectively, as shown below.

Denote by $\Phi\in\mathbb{R}^{m\times m}$ and $\Psi\in\mathbb{R}^{(m+1)\times(m+1)}$ the matrices that estimate the levels of orthogonality among the columns of $P_m$ and among those of $Q_{m+1}$, respectively, and by $\Omega\in\mathbb{R}^{m\times(m+1)}$ the matrix that estimates the level of biorthogonality of $P_m$ and $Q_{m+1}$; that is, their $(i,j)$-elements satisfy $\varphi_{ij}\approx p_i^Tp_j$, $\psi_{ij}\approx q_i^Tq_j$ and $\omega_{ij}\approx p_i^Tq_j$, respectively. Adopting the recurrences in [17], we set $\varphi_{ii}=\psi_{ii}=1$ for $i=1,\dots,m$, and at the $j$th step use the following recurrences to compute

(5.3) {φij=βiψij+γiψi+1,jγj1φi,j1,φij=(φij+sign(φij)ϵ1)/βj,fori=1,,j1,\displaystyle\left\{\begin{aligned} &\varphi_{ij}^{\prime}=\beta_{i}\psi_{ij}+\gamma_{i}\psi_{i+1,j}-\gamma_{j-1}\varphi_{i,j-1},\\ &\varphi_{ij}=(\varphi_{ij}^{\prime}+sign(\varphi_{ij}^{\prime})\epsilon_{1})/\beta_{j},\end{aligned}\right.\quad\qquad{\ \!}\mbox{for}\quad i=1,\dots,j-1,
(5.4) {ψi,j+1=γi1φi1,j+βiφijβjψij,ψi,j+1=(ψi,j+1+sign(ψi,j+1)ϵ1)/γj,fori=1,,j,\displaystyle\left\{\begin{aligned} &\psi_{i,j+1}^{\prime}=\gamma_{i-1}\varphi_{i-1,j}+\beta_{i}\varphi_{ij}-\beta_{j}\psi_{ij},\\ &\psi_{i,j+1}=(\psi_{i,j+1}^{\prime}+sign(\psi_{i,j+1}^{\prime})\epsilon_{1})/\gamma_{j},\end{aligned}\right.\qquad\mbox{for}\quad i=1,\dots,j,

where $\mathrm{sign}(\cdot)$ is the sign function, $\epsilon_{1}=\frac{\epsilon\sqrt{n}\|A\|_{e}}{2}$, and $\|A\|_{e}$ is an estimate of $\|A\|$. The terms $\mathrm{sign}(\cdot)\epsilon_{1}$ in (5.3) and (5.4) account for the rounding errors in LBD; they make the computed quantities as large as possible in magnitude, so that it is safe to take them as estimates of the levels of orthogonality among the left and right Lanczos vectors. Based on the sizes of $\varphi_{ij}$ and $\psi_{i,j+1}$, one can decide against which previous Lanczos vectors the newly computed one should be reorthogonalized.
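A direct transcription of (5.3)–(5.4) into NumPy might look as follows; we use 1-based arrays with zero padding at index 0 so that the code mirrors the formulas, and all names are our own.

```python
import numpy as np

# Sketch of the recurrences (5.3)-(5.4); arrays are 1-based with index 0
# as zero padding, phi[i, i] = psi[i, i] = 1 are set at initialization,
# beta[i] and gamma[i] are the coefficients in (3.3), and
# eps1 = eps * sqrt(n) * normA_est / 2 as in the text.
def update_phi_psi(phi, psi, beta, gamma, j, eps1):
    for i in range(1, j):        # (5.3): phi_{ij} for i = 1, ..., j-1
        t = (beta[i] * psi[i, j] + gamma[i] * psi[i + 1, j]
             - gamma[j - 1] * phi[i, j - 1])
        phi[i, j] = phi[j, i] = (t + np.sign(t) * eps1) / beta[j]
    for i in range(1, j + 1):    # (5.4): psi_{i,j+1} for i = 1, ..., j
        t = (gamma[i - 1] * phi[i - 1, j] + beta[i] * phi[i, j]
             - beta[j] * psi[i, j])
        psi[i, j + 1] = psi[j + 1, i] = (t + np.sign(t) * eps1) / gamma[j]
```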

Making use of the skew-symmetry of $A$, we can derive recurrences analogous to those in [17] for the elements $\omega_{ij}$ of $\Omega$. The derivations are rigorous but tedious, so we omit the details and present the recurrences directly. Initially, set $\omega_{i,0}=0$ and $\omega_{0,j}=0$ for $i=1,\dots,m$ and $j=1,\dots,m+1$. At the $j$th step, we compute the elements of $\Omega$ by

(5.5)\qquad\left\{\begin{aligned}&\omega_{ji}^{\prime}=\beta_{i}\omega_{ij}+\gamma_{i-1}\omega_{i-1,j}+\gamma_{j-1}\omega_{j-1,i},\\ &\omega_{ji}=-(\omega_{ji}^{\prime}+\mathrm{sign}(\omega_{ji}^{\prime})\epsilon_{1})/\beta_{j},\end{aligned}\right.\qquad\mbox{for}\quad i=1,\dots,j,

(5.6)\qquad\left\{\begin{aligned}&\omega_{i,j+1}^{\prime}=\gamma_{i}\omega_{j,i+1}+\beta_{i}\omega_{ji}+\beta_{j}\omega_{ij},\\ &\omega_{i,j+1}=-(\omega_{i,j+1}^{\prime}+\mathrm{sign}(\omega_{i,j+1}^{\prime})\epsilon_{1})/\gamma_{j},\end{aligned}\right.\qquad\mbox{for}\quad i=1,\dots,j.
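Continuing the sketch above with the same 1-based zero-padded conventions, (5.5)–(5.6) can be transcribed as:

```python
import numpy as np

# Sketch of (5.5)-(5.6); omega[i, l] estimates p_i^T q_l, and entries
# that have not yet been computed are zero, as are the padded ones.
def update_omega(omega, beta, gamma, j, eps1):
    for i in range(1, j + 1):    # (5.5): omega_{ji} for i = 1, ..., j
        t = (beta[i] * omega[i, j] + gamma[i - 1] * omega[i - 1, j]
             + gamma[j - 1] * omega[j - 1, i])
        omega[j, i] = -(t + np.sign(t) * eps1) / beta[j]
    for i in range(1, j + 1):    # (5.6): omega_{i,j+1} for i = 1, ..., j
        t = (gamma[i] * omega[j, i + 1] + beta[i] * omega[j, i]
             + beta[j] * omega[i, j])
        omega[i, j + 1] = -(t + np.sign(t) * eps1) / gamma[j]
```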

Having computed Φ\Phi, Ψ\Psi and Ω\Omega, at the jjth step of the SSLBD process, we determine the index sets:

(5.7) P(j)={i|1i<j,|φij|ϵm},P,Q(j)={i|1ij,|ωji|ϵm},\displaystyle\mathcal{I}_{P}^{(j)}=\left\{i|1\leq i<j,\ |\varphi_{ij}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\},\hskip 26.00009pt\mathcal{I}_{P,Q}^{(j)}=\left\{i|1\leq i\leq j,\ |\omega_{ji}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\},
(5.8) Q(j)={i|1ij,|ψi,j+1|ϵm},Q,P(j)={i|1ij,|ωi,j+1|ϵm}.\displaystyle\mathcal{I}_{Q}^{(j)}=\left\{i|1\leq i\leq j,\ |\psi_{i,j+1}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\},\quad\ \mathcal{I}_{Q,P}^{(j)}=\left\{i|1\leq i\leq j,\ |\omega_{i,j+1}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\}.

These four sets consist of the indices of the left and right Lanczos vectors $p_i$ and $q_i$ that have lost semi-orthogonality or semi-biorthogonality with $p_j$ and $q_{j+1}$, respectively. We then use the modified Gram–Schmidt (MGS) orthogonalization procedure [10, 28, 32] to reorthogonalize the newly computed left Lanczos vector $p_j$, first against the left Lanczos vector(s) $p_i$ with $i\in\mathcal{I}_{P}^{(j)}$ and then against the right one(s) $q_i$ with $i\in\mathcal{I}_{P,Q}^{(j)}$. Having done this, we reorthogonalize the right Lanczos vector $q_{j+1}$, which has been newly computed using the updated $p_j$, first against the right Lanczos vector(s) $q_i$ for $i\in\mathcal{I}_{Q}^{(j)}$ and then against the left one(s) $p_i$ for $i\in\mathcal{I}_{Q,P}^{(j)}$.
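A generic MGS pass against a selected index set, as used in steps (3.2)–(3.3) and (5.2)–(5.3) of Algorithm 2 below, can be sketched as follows; the function name is ours.

```python
import numpy as np

def mgs_against(V, idx, w):
    # One modified Gram-Schmidt pass: orthogonalize the vector w in place
    # against the columns V[:, i] with i in idx, returning w and the
    # projection coefficients tau used to update Phi, Psi and Omega.
    taus = []
    for i in idx:
        tau = float(V[:, i] @ w)
        w -= tau * V[:, i]
        taus.append(tau)
    return w, taus
```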

Algorithm 2 Partial reorthogonalization in the SSLBD process.

(3.1): Calculate $\varphi_{1j}^{\prime},\varphi_{2j}^{\prime},\dots,\varphi_{j-1,j}^{\prime}$ and $\omega_{j1}^{\prime},\omega_{j2}^{\prime},\dots,\omega_{jj}^{\prime}$ by the first relations in (5.3) and (5.5), respectively, and determine the index sets $\mathcal{I}_{P}^{(j)}$ and $\mathcal{I}_{P,Q}^{(j)}$ according to (5.7).
(3.2): for $i\in\mathcal{I}_{P}^{(j)}$ do        % partially reorthogonalize $p_{j}$ against $P_{j-1}$
           Calculate $\tau=p_{i}^{T}s_{j}$, and update $s_{j}=s_{j}-\tau p_{i}$.
           Overwrite $\varphi_{lj}^{\prime}=\varphi_{lj}^{\prime}-\tau\varphi_{li}$ for $l=1,\dots,j-1$ and $\omega_{jl}^{\prime}=\omega_{jl}^{\prime}-\tau\omega_{il}$ for $l=1,\dots,j$.
       end for
(3.3): for $i\in\mathcal{I}_{P,Q}^{(j)}$ do        % partially reorthogonalize $p_{j}$ against $Q_{j}$
           Compute $\tau=q_{i}^{T}s_{j}$, and update $s_{j}=s_{j}-\tau q_{i}$.
           Overwrite $\omega_{jl}^{\prime}=\omega_{jl}^{\prime}-\tau\psi_{il}$ for $l=1,\dots,j$ and $\varphi_{lj}^{\prime}=\varphi_{lj}^{\prime}-\tau\omega_{li}$ for $l=1,\dots,j-1$.
       end for
(3.4): Compute $\beta_{j}=\|s_{j}\|$, and update $\varphi_{1j},\varphi_{2j},\dots,\varphi_{j-1,j}$ and $\omega_{j1},\omega_{j2},\dots,\omega_{jj}$ by the second relations in (5.3) and (5.5), respectively.
(5.1): Compute $\psi_{1,j+1}^{\prime},\psi_{2,j+1}^{\prime},\dots,\psi_{j,j+1}^{\prime}$ and $\omega_{1,j+1}^{\prime},\omega_{2,j+1}^{\prime},\dots,\omega_{j,j+1}^{\prime}$ by the first relations in (5.4) and (5.6), respectively, and determine the index sets $\mathcal{I}_{Q}^{(j)}$ and $\mathcal{I}_{Q,P}^{(j)}$ according to (5.8).
(5.2): for $i\in\mathcal{I}_{Q}^{(j)}$ do        % partially reorthogonalize $q_{j+1}$ against $Q_{j}$
           Calculate $\tau=q_{i}^{T}t_{j}$, and update $t_{j}=t_{j}-\tau q_{i}$.
           Overwrite $\psi_{l,j+1}^{\prime}=\psi_{l,j+1}^{\prime}-\tau\psi_{li}$ and $\omega_{l,j+1}^{\prime}=\omega_{l,j+1}^{\prime}-\tau\omega_{li}$ for $l=1,\dots,j$.
       end for
(5.3): for $i\in\mathcal{I}_{Q,P}^{(j)}$ do        % partially reorthogonalize $q_{j+1}$ against $P_{j}$
           Compute $\tau=p_{i}^{T}t_{j}$, and update $t_{j}=t_{j}-\tau p_{i}$.
           Overwrite $\omega_{l,j+1}^{\prime}=\omega_{l,j+1}^{\prime}-\tau\varphi_{li}$ and $\psi_{l,j+1}^{\prime}=\psi_{l,j+1}^{\prime}-\tau\omega_{il}$ for $l=1,\dots,j$.
       end for
(5.4): Calculate $\gamma_{j}=\|t_{j}\|$, and update $\omega_{1,j+1},\omega_{2,j+1},\dots,\omega_{j,j+1}$ and $\psi_{1,j+1},\psi_{2,j+1},\dots,\psi_{j,j+1}$ by the second relations in (5.6) and (5.4), respectively.

Remarkably, once a Lanczos vector is reorthogonalized against a previous one, the relevant elements of $\Phi$, $\Psi$ and $\Omega$ change accordingly and may no longer be reliable estimates for the levels of orthogonality and biorthogonality between the updated Lanczos vector and all the previous ones. Therefore, we update the changed values of $\Phi$, $\Psi$ and $\Omega$ during the MGS procedure. (In his PhD thesis [17], Larsen resets those quantities to $\mathcal{O}(\epsilon)$ and includes the $\mathcal{O}(\epsilon)$ parts in the $\epsilon_1$ of (5.3)–(5.6), but he does not explain why those quantities can automatically remain $\mathcal{O}(\epsilon)$ as the procedure proceeds. In our procedure, we guarantee that the vectors involved are numerically orthogonal and biorthogonal by explicitly performing the MGS procedure.) Since $\Phi$ and $\Psi$ are symmetric matrices with unit diagonals, it suffices to compute their strictly upper triangular parts. Algorithm 2 describes the whole partial reorthogonalization process in SSLBD, where steps (3.1)–(3.4) and (5.1)–(5.4) are inserted between steps 3 and 4 and between steps 5 and 6 of Algorithm 1, respectively. In this way, we obtain an $m$-step SSLBD process with partial reorthogonalization.

We see from Algorithm 2 and relations (5.3)–(5.6) that the $j$th step takes $\mathcal{O}(j)$ flops to compute the new elements of $\Phi$, $\Psi$ and $\Omega$ for the first time, which is negligible relative to the cost of the two matrix-vector multiplications with $A$ at that step. It also takes $\mathcal{O}(j)$ extra flops to update the elements of $\Phi$, $\Psi$ and $\Omega$ after one step of reorthogonalization. Therefore, we can estimate the levels of orthogonality and biorthogonality very efficiently.

5.2 An implicitly restarted algorithm

When the dimension of the searching subspaces reaches the maximum number $m$ allowed, the implicit restart selects certain $m-k$ shifts. In particular, the shifts can be taken as the unwanted Ritz values $\mu_1=\theta_{k+1},\dots,\mu_{m-k}=\theta_m$, called the exact shifts [18, 31]. So-called bad shifts, i.e., shifts close to some of the desired singular values, are replaced with more appropriate values. As suggested in [17] and adopted in [14, 15] with necessary modifications when computing the smallest singular triplets, we regard a shift $\mu$ as a bad one and reset it to zero if

(5.9) |(θkr±k)μ|θk103,\left|(\theta_{k}-\|r_{\pm k}\|)-\mu\right|\leq{\theta_{k}}\cdot 10^{-3},

where r±k=rR,k2+rI,k2\|r_{\pm k}\|=\sqrt{\|r_{\mathrm{R},k}\|^{2}+\|r_{\mathrm{I},k}\|^{2}} is the residual norm of the approximate singular triplet (θk,u~k,v~k)(\theta_{k},\tilde{u}_{k},\tilde{v}_{k}) with rR,kr_{\mathrm{R},k} and rI,kr_{\mathrm{I},k} defined by (4.35).
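In a sketch, the bad-shift test (5.9) amounts to the following; the function name is illustrative.

```python
def filter_bad_shifts(mus, theta_k, res_k):
    # Reset a shift mu to zero whenever it lies within theta_k * 1e-3 of
    # theta_k - ||r_{pm,k}||, as in (5.9).
    return [0.0 if abs((theta_k - res_k) - mu) <= theta_k * 1e-3 else mu
            for mu in mus]
```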

Implicit restarting formally performs $m-k$ implicit QR iterations [29, 33] on $B_m^TB_m$ with the shifts $\mu_1^2,\dots,\mu_{m-k}^2$ by equivalently applying Givens rotations directly to $B_m$ and accumulating the orthogonal matrices $\widetilde{C}_m$ and $\widetilde{D}_m$ of order $m$ such that

{C~mT(BmTBmμmk2I)(BmTBmμ12I)isupper triangular,B~m=C~mTBmD~misupper bidiagonal.\left\{\begin{aligned} \ \ &\widetilde{C}_{m}^{T}(B_{m}^{T}B_{m}-\mu_{m-k}^{2}I)\cdots(B_{m}^{T}B_{m}-\mu_{1}^{2}I)\qquad\mbox{is}\qquad\mbox{upper triangular},\\[5.0pt] &\widetilde{B}_{m}\ =\ \widetilde{C}_{m}^{T}B_{m}\widetilde{D}_{m}\hskip 103.00018pt\mbox{is}\qquad\mbox{upper bidiagonal}.\end{aligned}\right.

Using these matrices, we first compute

(5.10) Bk,new=B~k,Pk,new=PmC~k,Qk,new=QmD~k,B_{k,\mathrm{new}}=\widetilde{B}_{k},\qquad P_{k,\mathrm{new}}=P_{m}\widetilde{C}_{k},\qquad Q_{k,\mathrm{new}}=Q_{m}\widetilde{D}_{k},

where B~k\widetilde{B}_{k} is the kkth leading principal submatrix of B~m\widetilde{B}_{m}, and C~k\widetilde{C}_{k} and D~k\widetilde{D}_{k} consist of the first kk columns of C~m\widetilde{C}_{m} and D~m\widetilde{D}_{m}, respectively. We then compute

(5.11) qk+1=βmc~mkqm+1+γ~kQmd~k+1,βk,new=qk+1,qk+1,new=qk+1βk,newq_{k+1}^{\prime}=\beta_{m}\tilde{c}_{mk}\cdot q_{m+1}+\tilde{\gamma}_{k}Q_{m}\tilde{d}_{k+1},\qquad\beta_{k,\mathrm{new}}=\|q^{\prime}_{k+1}\|,\qquad q_{k+1,\mathrm{new}}=\frac{q_{k+1}^{\prime}}{\beta_{k,\mathrm{new}}}

with $\tilde{d}_{k+1}$ the $(k+1)$st column of $\widetilde{D}_m$ and $\tilde{c}_{mk}$, $\tilde{\gamma}_k$ the $(m,k)$- and $(k,k+1)$-elements of $\widetilde{C}_m$ and $\widetilde{B}_m$, respectively. As a result, after the above implicit restart we have already obtained a $k$-step SSLBD process rather than starting from scratch. We then run the SSLBD process for another $m-k$ steps, from step $k+1$ onwards, to obtain a new $m$-step SSLBD process.

Once the SSLBD process is restarted, the matrices $\Phi$, $\Psi$ and $\Omega$ must be updated, which can be done efficiently by making use of (5.10) and (5.11). Note that $\Phi$ and $\Psi$ are symmetric matrices of orders $m$ and $m+1$, respectively, and $\Omega$ is an $m\times(m+1)$ flat matrix. The corresponding updated matrices are of orders $k$ and $k+1$ and of size $k\times(k+1)$, respectively; they are computed by

\Phi_{k}:=\widetilde{C}_{k}^{T}\Phi_{m}\widetilde{C}_{k},\qquad\Psi_{k}:=\widetilde{D}_{k}^{T}\Psi_{m}\widetilde{D}_{k},\qquad\Omega_{k}:=\widetilde{C}_{k}^{T}\Omega_{m}\widetilde{D}_{k},

and

Ψ1:k,k+1\displaystyle\Psi_{1:k,k+1} :=\displaystyle:= D~kT(βmc~mkΨ1:m,m+1+γ~kΨmd~k+1)/βk,new,\displaystyle\widetilde{D}_{k}^{T}(\beta_{m}\tilde{c}_{mk}\Psi_{1:m,m+1}+\tilde{\gamma}_{k}\Psi_{m}\tilde{d}_{k+1})/\beta_{k,\mathrm{new}},
Ω1:k,k+1\displaystyle\Omega_{1:k,k+1} :=\displaystyle:= C~kT(βmc~mkΩ1:m,m+1+γ~kΩmd~k+1)/βk,new.\displaystyle\widetilde{C}_{k}^{T}(\beta_{m}\tilde{c}_{mk}\Omega_{1:m,m+1}+\tilde{\gamma}_{k}\Omega_{m}\tilde{d}_{k+1})/\beta_{k,\mathrm{new}}.

Here we denote by $(\cdot)_{j}$ and $(\cdot)_{1:j,j+1}$ the $j\times j$ leading principal submatrix and the first $j$ entries of the $(j+1)$st column of the estimate matrices of orthogonality and biorthogonality, respectively. The updating process needs $\mathcal{O}(km^{2})$ extra flops, which is negligible relative to an $m$-step implicit restart for $m\ll\ell=n/2$.
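The following NumPy sketch implements these updates with 0-based slicing; here Ck, Dk and dk1 stand for $\widetilde{C}_k$, $\widetilde{D}_k$ and $\tilde{d}_{k+1}$, bc for $\beta_m\tilde{c}_{mk}$, gk for $\tilde{\gamma}_k$ and beta_new for $\beta_{k,\mathrm{new}}$; the function name is ours.

```python
import numpy as np

# Sketch of the O(k m^2) update of the orthogonality and biorthogonality
# estimates after an implicit restart; Phi is m x m, Psi is (m+1) x (m+1)
# and Omega is m x (m+1), as in the text. The returned columns extend the
# k x k parts of Psi and Omega by their new (k+1)st columns.
def restart_estimates(Phi, Psi, Omega, Ck, Dk, dk1, bc, gk, beta_new):
    m = Phi.shape[0]
    Phi_k = Ck.T @ Phi @ Ck                           # k x k
    Psi_k = Dk.T @ Psi[:m, :m] @ Dk                   # k x k part of Psi
    Omega_k = Ck.T @ Omega[:, :m] @ Dk                # k x k part of Omega
    psi_col = Dk.T @ (bc * Psi[:m, m] + gk * Psi[:m, :m] @ dk1) / beta_new
    omega_col = Ck.T @ (bc * Omega[:, m] + gk * Omega[:, :m] @ dk1) / beta_new
    return Phi_k, Psi_k, Omega_k, psi_col, omega_col
```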

As we have seen previously, the convergence criterion (4.36) and the efficient partial reorthogonalization scheme (cf. (5.3)–(5.6)) depend on a rough estimate of $\|A\|$, which is approximated by the largest approximate singular value $\theta_1$ in our algorithm. Therefore, we simply set $\|A\|_e=\theta_1$ to replace $\|A\|$ in (4.36). Notice that no approximate singular value is available before the end of the first cycle of the implicitly restarted algorithm. We adopt a strategy analogous to that in [17] to estimate $\|A\|$: set $\|A\|_e=\sqrt{\beta_1^2+\gamma_1^2}$ for $j=1$, and update it for $j\geq 2$ by

Ae=max{Ae,βj12+γj12+γj1βj+γj2βj1,βj2+γj2+γj1βj},\|A\|_{e}=\max\left\{\|A\|_{e},\sqrt{\beta_{j-1}^{2}+\gamma_{j-1}^{2}+\gamma_{j-1}\beta_{j}+\gamma_{j-2}\beta_{j-1}},\sqrt{\beta_{j}^{2}+\gamma_{j}^{2}+\gamma_{j-1}\beta_{j}}\right\},

where γ0=0\gamma_{0}=0.
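In a sketch (1-based padded arrays with $\gamma_{0}=0$, so that the term $\gamma_{j-2}\beta_{j-1}$ vanishes for $j=2$):

```python
import numpy as np

# Running estimate of ||A|| from the bidiagonal coefficients for j >= 2,
# following the update formula above; est is the current ||A||_e.
def update_normA_est(est, beta, gamma, j):
    a = np.sqrt(beta[j - 1]**2 + gamma[j - 1]**2
                + gamma[j - 1] * beta[j] + gamma[j - 2] * beta[j - 1])
    b = np.sqrt(beta[j]**2 + gamma[j]**2 + gamma[j - 1] * beta[j])
    return max(est, a, b)
```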

Algorithm 3 Implicitly restarted SSLBD algorithm with partial reorthogonalization
1:  Initialization: Set i=0i=0, P0=[]P_{0}=[\ \ ], Q1=[q1]Q_{1}=[q_{1}] and B0=[]B_{0}=[\ \ ].
2:  while not converged and $i<i_{\max}$ do
3:     For $i=0$ or $i>0$, perform the $m$-step or $(m-k)$-step SSLBD process on $A$ with partial reorthogonalization, respectively, to obtain $P_{m}$, $Q_{m+1}$ and $\widehat{B}_{m}=[B_{m},\gamma_{m}e_{m}]$.
4:     Compute the SVD (3.9) of BmB_{m} to obtain its singular triplets (θj,cj,dj)\left(\theta_{j},c_{j},d_{j}\right), and calculate the corresponding residual norms r±j\|r_{\pm j}\| by (4.37), j=1,,mj=1,\dots,m.
5:     if r±jAetol\|r_{\pm j}\|\leq\|A\|_{e}\cdot tol for all j=1,,kj=1,\dots,k, then compute u~j=Pmcj\tilde{u}_{j}=P_{m}c_{j} and v~j=Qmdj\tilde{v}_{j}=Q_{m}d_{j} for j=1,,kj=1,\dots,k, and break the loop.
6:     Set μj=θj+k,j=1,,mk\mu_{j}=\theta_{j+k},j=1,\dots,m-k, and replace those satisfying (5.9) with zeros. Perform the implicit restart scheme with the shifts μ12,,μmk2\mu_{1}^{2},\dots,\mu_{m-k}^{2} to obtain the updated PkP_{k}, Qk+1Q_{k+1} and B^k\widehat{B}_{k}. Set i=i+1i=i+1, and goto step 3.
7:  end while

Algorithm 3 sketches the implicitly restarted SSLBD algorithm with partial reorthogonalization for computing the $k$ largest singular triplets of $A$. It requires a routine that computes $Ax$ for an arbitrary $n$-dimensional vector $x$, and the number $k$ of desired conjugate eigenpairs of $A$. The number $m$ is the maximum subspace dimension, $i_{\max}$ is the maximum number of restarts allowed, $tol$ is the stopping tolerance in (4.36), and the unit-length vector $q_1$ is the initial right Lanczos vector. The defaults of these parameters are $30$, $2000$, $10^{-8}$ and $[n^{-\frac{1}{2}},\dots,n^{-\frac{1}{2}}]^T$, respectively.

6 Numerical examples

In this section, we report numerical experiments on several matrices to illustrate the performance of our implicitly restarted SSLBD algorithm with partial reorthogonalization for the eigenvalue problem of a large skew-symmetric $A$. The algorithm, abbreviated as IRSSLBD, has been coded in the Matlab language. All the experiments were performed on an Intel(R) Core(TM) i9-10885H CPU at 2.40 GHz with 64 GB main memory and 16 cores, using Matlab R2021a with the machine precision $\epsilon=2.22\times 10^{-16}$ under the Microsoft Windows 10 64-bit system.

Table 1: Properties of the test matrices A=AoAoT2A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2} and A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix} with AoA_{\mathrm{o}} being square and rectangular, respectively, where “M80PI”, “visco1”, “flowm”, “e40r0” and “Marag”, “Kemel”, “storm”, “dano3” are abbreviations of “M80PI_n1”, “viscoplastic1”, “flowmeter5”, “e40r0100” and “Maragal_6”, “Kemelmacher”, “stormg2-27”, “dano3mip”, respectively.
AoA_{\mathrm{o}} nn nnz(A)nnz(A) σmax(A)\sigma_{\max}(A) σmin(A)\sigma_{\min}(A) gap(1)\mathrm{gap}(1) gap(5)\mathrm{gap}(5) gap(10)\mathrm{gap}(10)
plsk1919 1919 9662 1.71 3.79e-17 1.34e-1 5.17e-3 5.17e-3
tols4000 4000 9154 1.17e+7 9.83e-56 4.97e-3 4.97e-3 4.97e-3
M80PI 4028 8066 2.84 6.60e-18 7.03e-3 3.68e-3 3.68e-3
visco1 4326 71572 3.70e+1 5.42e-21 1.59e-2 3.50e-4 3.50e-4
utm5940 5940 114590 1.21 1.83e-7 5.14e-2 9.57e-4 9.57e-4
coater2 9540 264448 3.11e+2 1.35e-13 9.89e-3 7.38e-3 4.51e-4
flowm 9669 54130 3.32e-2 7.56e-36 2.87e-3 8.27e-4 6.88e-4
inlet 11730 440562 3.35 6.17e-7 1.61e-2 5.63e-4 5.63e-4
e40r0 17281 906340 1.04e+1 1.54e-16 9.64e-2 3.99e-3 2.00e-3
ns3Da 20414 1660392 5.12e-1 2.35e-6 2.59e-2 2.38e-3 8.37e-5
epb2 25228 199152 1.09 2.85e-12 1.15e-2 8.89e-4 1.86e-4
barth 6691 39496 2.23 5.46e-18 5.98e-3 1.85e-5 1.85e-5
Hamrle2 5952 44282 5.55e+1 6.41e-5 5.27e-6 5.27e-6 2.56e-6
delf 9824 30794 3.92e+3 0 3.12e-3 3.12e-3 3.12e-3
large 12899 41270 4.04e+3 0 2.21e-3 2.21e-3 2.21e-3
Marag 31407 1075388 1.33e+1 0 2.78e-1 1.66e-2 7.89e-3
r05 14880 208290 1.82e+1 0 2.07e-3 2.07e-3 5.73e-4
nl 22364 94070 2.87e+2 0 2.27e-2 1.38e-3 7.94e-4
deter7 24528 74262 9.67 0 3.42e-1 5.56e-4 5.56e-4
Kemel 38145 201750 2.41e+2 0 2.17e-2 6.96e-3 5.50e-4
p05 14680 118090 1.28e+1 0 5.08e-3 5.08e-3 1.93e-4
storm 51926 188548 5.45e+2 0 1.64e-2 2.28e-5 2.28e-5
south31 54746 224796 1.29e+4 0 6.86e+1 3.96e-1 2.02e-2
dano3 19053 163266 1.82e+3 0 1.17e-3 2.31e-6 2.31e-6

Table 1 lists some basic properties of the test skew-symmetric matrices A=AoAoT2A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2} and A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix} with AoA_{\mathrm{o}} being the square and rectangular real matrices selected from the University of Florida Sparse Matrix Collection [6], respectively, where nnz(A)nnz(A) denotes the number of nonzero elements of AA, σmax(A)\sigma_{\max}(A) and σmin(A)\sigma_{\min}(A) are the largest and smallest singular values of AA, respectively, and gap(k):=min1jk|λj2λj+12|/|λj+12λ2|=min1jk|σj2σj+12|/|σj+12σmin2(A)|\mathrm{gap}(k):=\min\limits_{1\leq j\leq k}|\lambda_{j}^{2}-\lambda_{j+1}^{2}|/|\lambda_{j+1}^{2}-\lambda_{\ell}^{2}|=\min\limits_{1\leq j\leq k}|\sigma_{j}^{2}-\sigma_{j+1}^{2}|/|\sigma_{j+1}^{2}-\sigma_{\min}^{2}(A)|. All the singular values of AA are computed, for the experimental purpose, by using the Matlab built-in function svd. As Theorem 4.1 shows, the size of gap(k)\mathrm{gap}(k) critically affects the performance of IRSSLBD for computing the kk largest conjugate eigenpairs of AA; the bigger gap(k)\mathrm{gap}(k) is, the faster IRSSLBD converges. Otherwise, IRSSLBD may converge slowly.
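For reference, $\mathrm{gap}(k)$ can be computed from the sorted singular values as follows; this is a small illustrative helper, not part of IRSSLBD.

```python
import numpy as np

def gap(sigma, k):
    # sigma: singular values sorted so that sigma[0] >= sigma[1] >= ...;
    # gap(k) = min_{1<=j<=k} |sigma_j^2 - sigma_{j+1}^2| / |sigma_{j+1}^2 - sigma_min^2|.
    s2 = np.asarray(sigma, dtype=float) ** 2
    return min(abs(s2[j] - s2[j + 1]) / abs(s2[j + 1] - s2[-1])
               for j in range(k))
```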

Unless stated otherwise, we always run the IRSSLBD algorithm with the default parameters described at the end of Section 5. We also test the Matlab built-in functions eigs and svds in order to illustrate the efficiency and reliability of IRSSLBD. Concretely, with the same parameters as those in IRSSLBD, we use eigs to compute the $2k$ eigenpairs corresponding to the $2k$ largest conjugate eigenvalues in magnitude of $A$ and the corresponding eigenvectors, and use svds to compute the $k$ or $2k$ largest singular triplets of $A$, from which the $2k$ largest eigenpairs of $A$ are obtained. We record the number of matrix-vector multiplications, abbreviated as $\#\mathrm{Mv}$, and the CPU time in seconds, denoted by $T_{\rm cpu}$, that the three algorithms use to achieve the same stopping tolerance. We mention that, for most of the test matrices, the CPU time used by each of the three algorithms is too short, e.g., less than 0.1s, to make a convincing and reliable comparison in terms of CPU time; therefore, we do not report the CPU time in most of the experiments. We also remind the reader that a comparison of the CPU time used by our algorithm and by eigs and svds may be misleading, because our IRSSLBD code is written in pure Matlab, whereas eigs and svds are implemented in the much more efficient compiled C/C++ language. Nevertheless, we aim to illustrate that IRSSLBD is indeed fast and not slower than eigs and svds even in terms of CPU time.

Experiment 6.1.

We compute k=1,5,10k=1,5,10 pairs of the largest conjugate eigenvalues in magnitude of A=plsk1919A=\mathrm{plsk1919} and the corresponding eigenvectors.

Table 2: Results on A=plsk1919A=\mathrm{plsk1919}.
  Algorithm k=1k=1 k=5k=5 k=10k=10
   #\#Mv    TcpuT_{\mathrm{cpu}}    #\#Mv    TcpuT_{\mathrm{cpu}}    #\#Mv    TcpuT_{\mathrm{cpu}}
eigs 58 4.52e-3 122 1.02e-2 186 3.06e-2
SVDS(kk) 118 4.11e-3 258 1.22e-2 416 1.61e-2
SVDS(2k2k) 266 8.35e-3 416 1.61e-2 448 2.04e-2
IRSSLBD 50 8.26e-3 106 4.03e-2 134 5.20e-2

Table 2 displays the #\#Mv and TcpuT_{\mathrm{cpu}} used by the three algorithms, where SVDS(kk) and SVDS(2k2k) indicate that we use svds to compute the kk and 2k2k largest singular triplets of AA, respectively. We have found that IRSSLBD and eigs solved the SVD problems successfully and computed the desired eigenpairs of AA correctly. In terms of #\#Mv, we can see from the table that IRSSLBD is slightly more efficient than eigs for k=1,5k=1,5 but outperforms eigs substantially for k=10k=10. IRSSLBD is thus superior to eigs for this matrix. We also see from Table 2 that the CPU time used by IRSSLBD is very short and competitive with that used by eigs. The same phenomena have been observed for most of the test matrices.

Remarkably, we have found that, without explicit partial reorthogonalization to maintain the semi-biorthogonality of the left and right Lanczos vectors, IRSSLBD and svds encounter severe troubles, and they behave similarly. Indeed, for $k=1$, SVDS($k$) succeeded in computing the largest singular triplet of $A$, from which the desired largest conjugate eigenpairs are recovered as described in Section 3.2. For $k=5$ and $10$, however, SVDS($k$) computes duplicate approximations to the singular triplets associated with each conjugate eigenvalue pair of $A$, and ghosts appear, so that SVDS($k$) finds only half of the desired eigenpairs, i.e., the first to third and the first to fifth conjugate eigenpairs, respectively. The ghost phenomena not only delay the convergence of SVDS($k$) considerably but also make it converge irregularly and fail to find all the desired singular triplets. SVDS($k$) also uses many more $\#\mathrm{Mv}$ than IRSSLBD does for $k=10$. Likewise, for each $k=1,5,10$, SVDS($2k$) succeeds in computing only $k$ approximations to the desired largest singular triplets of $A$, from which the desired approximate eigenpairs can be recovered, but consumes $3\sim 5$ times as many $\#\mathrm{Mv}$ as IRSSLBD. This shows that a direct application of an LBD algorithm to skew-symmetric eigenvalue problems does not work well in finite precision arithmetic. In contrast, IRSSLBD works well when the semi-orthogonality and semi-biorthogonality are maintained. Therefore, the semi-biorthogonality of the left and right Lanczos vectors is crucial.

Experiment 6.2.

We compute $k=1,5,10$ pairs of the largest conjugate eigenvalues in magnitude and the corresponding eigenvectors of the twelve square matrices $A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2}$ with $A_{\mathrm{o}}=$ tols4000, M80PI, visco1, utm5940, coater2, flowm, inlet, e40r0, ns3Da, epb2, barth and Hamrle2, respectively. We report the results for IRSSLBD and eigs only, since svds behaves irregularly and computes only some of the desired singular triplets because of the severe loss of numerical biorthogonality of the left and right Lanczos vectors.

Table 3: The #Mv used by IRSSLBD and eigs to compute k=1,5,10k=1,5,10 pairs of largest conjugate eigenvalues in magnitude and the corresponding eigenvectors of A=AoAoT2A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2}.
AoA_{\mathrm{o}} Algorithm: eigs Algorithm: IRSSLBD
  k=1k=1   k=5k=5   k=10k=10  k=1k=1  k=5k=5  k=10k=10
tols4000 338 278 460 174 (51.48) 232 (83.45) 292 (63.48)
M80PI 58 110 254 44 (75.86) 100 (90.91) 158 (62.20)
visco1 198 302 912 132 (66.67) 282 (93.38) 370 (40.57)
utm5940 86 336 788 76 (88.37) 278 (82.74) 306 (38.83)
coater2 142 162 282 84 (59.15) 130 (80.25) 152 (53.90)
flowm 478 384 702 228 (47.70) 346 (90.10) 394 (56.13)
inlet 170 412 572 132 (77.65) 316 (76.70) 310 (54.20)
e40r0 86 168 576 70 (81.40) 148 (88.10) 284 (49.31)
ns3Da 114 226 960 92 (80.70) 204 (90.27) 376 (39.17)
epb2 198 328 828 138 (69.70) 278 (84.76) 384 (46.38)
barth 310 392 510 172 (55.48) 318 (81.12) 322 (63.14)
Hamrle2 3558 300 16178 136 (  3.82) 190 (63.33) 354 (  2.19)

IRSSLBD and eigs successfully compute the desired eigenpairs of all the test matrices. Table 3 displays the $\#\mathrm{Mv}$ used by them, where each quantity in parentheses is the percentage of the $\#\mathrm{Mv}$ used by IRSSLBD relative to that used by eigs; we drop the $\%$ sign to save space.

For $k=1$, we see from Table 3 that IRSSLBD is always more efficient than eigs, and it often outperforms the latter considerably. For instance, IRSSLBD uses fewer than $60\%$ of the $\#\mathrm{Mv}$ of eigs for $A_{\mathrm{o}}=$ tols4000, coater2, flowm and barth. Strikingly, IRSSLBD consumes only $3.82\%$ of the $\#\mathrm{Mv}$ used by eigs for $A_{\mathrm{o}}=$ Hamrle2. For $k=5$, IRSSLBD is considerably more efficient than eigs for most of the test matrices, especially $A_{\mathrm{o}}=$ Hamrle2, for which IRSSLBD uses $36\%$ fewer $\#\mathrm{Mv}$. For $k=10$, IRSSLBD uses $62\%$–$64\%$ of the $\#\mathrm{Mv}$ consumed by eigs for $A_{\mathrm{o}}=$ tols4000, M80PI and barth, and $54\%$–$57\%$ for $A_{\mathrm{o}}=$ coater2, flowm and inlet. Therefore, for these six matrices, the advantage of IRSSLBD over eigs is considerable. For the five matrices $A_{\mathrm{o}}=$ visco1, utm5940, e40r0, ns3Da and epb2, IRSSLBD is even more efficient, as it consumes no more than half of the $\#\mathrm{Mv}$ used by eigs. For $A_{\mathrm{o}}=$ Hamrle2, IRSSLBD uses only $2.19\%$ of the $\#\mathrm{Mv}$ cost by eigs, a huge saving.

In summary, for these twelve test matrices, our IRSSLBD algorithm outperforms eigs and is often considerably more efficient than the latter for the given three kk.

Experiment 6.3.

We compute k=1,5,10k=1,5,10 pairs of the largest conjugate eigenvalues of A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix} and the corresponding eigenvectors with the eleven matrices Ao=A_{\mathrm{o}}= delf, large, Marag, r05, nl, deter7, Kemel, p05, storm, south31 and dano3.

Table 4: The #Mv used by IRSSLBD and eigs to compute k=1,5,10k=1,5,10 pairs of the largest conjugate eigenpairs of A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix}.
AoA_{\mathrm{o}} Algorithm: eigs Algorithm: IRSSLBD
 k=1k=1  k=5k=5  k=10k=10  k=1k=1   k=5k=5   k=10k=10
delf 478 214 192 142 (29.71) 142 (66.36) 138 (71.88)
large 590 284 268 188 (31.86) 172 (60.56) 172 (64.18)
Marag 58 84 178 38 (65.52) 82 (97.62) 130 (73.03)
r05 86 70 668 54 (62.79) 56 (80.00) 222 (33.23)
nl 114 242 222 76 (66.67) 172 (71.07) 126 (56.76)
deter7 58 346 144 36 (62.07) 148 (42.77) 122 (84.72)
Kemel 142 220 636 118 (83.10) 198 (90.00) 306 (48.11)
p05 114 90 692 58 (50.88) 62 (68.89) 234 (32.82)
storm 58 84 78 46 (79.31) 64 (76.19) 66 (84.62)
south31 30 42 146 8 (26.67) 26 (61.90) 92 (63.01)
dano3 114 70 5922 58 (50.88) 70 (100.0) 196 (  3.31)

These test matrices are all singular with large dimensional null spaces. In order to purge the null space from the searching subspaces, we take the initial vector q1q_{1} in IRSSLBD and eigs to be the unit-length vector normalized from A[1,,1]TA\cdot[1,\dots,1]^{T}. All the other parameters are the same for the two algorithms with the default values as described previously. Both IRSSLBD and eigs succeed in computing all the desired eigenpairs of AA for the three kk’s. Table 4 displays the #Mv, where the quantities in the parentheses are as in Table 3.

We have also observed an interesting phenomenon from this table and the previous experiments. As is seen from Table 4, for some test matrices, e.g., delf, large, nl, deter7, p05 and dano3, when more conjugate eigenpairs of $A$ are desired, IRSSLBD or eigs may use even fewer $\#\mathrm{Mv}$. This occurs mainly because of the implicit restart technique. Specifically, when more conjugate eigenpairs are desired, new initial searching subspaces $\mathcal{U}_{k}$ and $\mathcal{V}_{k}$ of larger dimension are retained at each restart, so that the restarted searching subspaces may contain more information on the desired eigenvectors and the largest approximate eigenpairs may converge faster.

As for an overall efficiency comparison of IRSSLBD and eigs, we have also observed the advantage of the former over the latter, similar to that in Experiment 6.2. Therefore, we now make some comments on them together. Before doing so, let us regard each test matrix with each selected kk as one independent problem, so in total we have 33 problems, which will be distinguished by the name-value pairs (Ao,k)(A_{\mathrm{o}},k).

As is seen from Table 4, for $(A_{\mathrm{o}},k)=(\mathrm{Marag},5)$ and $(\mathrm{dano3},5)$, IRSSLBD and eigs use almost the same $\#\mathrm{Mv}$. For $(A_{\mathrm{o}},k)=(\mathrm{Kemel},1)$, $(\mathrm{Kemel},5)$, $(\mathrm{deter7},10)$ and $(\mathrm{storm},10)$, IRSSLBD is slightly better than eigs and uses $82\%$–$90\%$ of the $\#\mathrm{Mv}$. For $(A_{\mathrm{o}},k)=(\mathrm{Marag},1)$, $(\mathrm{r05},1)$, $(\mathrm{nl},1)$, $(\mathrm{deter7},1)$, $(\mathrm{storm},1)$, $(\mathrm{delf},5)$, $(\mathrm{large},5)$, $(\mathrm{r05},5)$, $(\mathrm{nl},5)$, $(\mathrm{p05},5)$, $(\mathrm{storm},5)$, $(\mathrm{south31},5)$, $(\mathrm{delf},10)$, $(\mathrm{large},10)$, $(\mathrm{Marag},10)$, $(\mathrm{nl},10)$ and $(\mathrm{south31},10)$, IRSSLBD outperforms eigs considerably, as it costs $56\%$–$80\%$ of the $\#\mathrm{Mv}$ used by eigs. For $(A_{\mathrm{o}},k)=(\mathrm{delf},1)$, $(\mathrm{large},1)$, $(\mathrm{p05},1)$, $(\mathrm{south31},1)$, $(\mathrm{dano3},1)$, $(\mathrm{deter7},5)$, $(\mathrm{r05},10)$, $(\mathrm{Kemel},10)$, $(\mathrm{p05},10)$ and $(\mathrm{dano3},10)$, IRSSLBD is substantially more efficient than eigs, as it uses only approximately half of the $\#\mathrm{Mv}$ or even fewer. Particularly, for $(A_{\mathrm{o}},k)=(\mathrm{dano3},10)$, the advantage of IRSSLBD over eigs is very substantial, since the $\#\mathrm{Mv}$ used by the former is only about $3.3\%$ of that used by the latter.

In summary, with an appropriate starting vector, IRSSLBD is well suited to computing several largest conjugate eigenpairs of a singular skew-symmetric matrix, and it is more efficient, and can be much more efficient, than eigs.

7 Conclusions

We have shown that the eigendecomposition of a real skew-symmetric matrix $A$ has a close relationship with a specific structured SVD of $A$, by which the computation of several of its extreme conjugate eigenpairs can be equivalently transformed into the computation of the largest singular triplets in real arithmetic. For a large $A$, as a key step toward the efficient computation of the desired eigenpairs, by means of the equivalence between the tridiagonal decomposition of $A$ and a half-sized bidiagonal decomposition, the SSLBD process is exploited to successively compute a sequence of partial bidiagonal decompositions. The process provides us with a sequence of left and right searching subspaces and bidiagonal matrices, making the computation of a partial SVD of $A$ possible, from which the desired eigenpairs of $A$ are obtained without involving complex arithmetic.

We have made a detailed theoretical analysis of the SSLBD process. Based on the results obtained, we have proposed an SSLBD method for computing several extreme singular triplets of $A$, from which the desired extreme conjugate eigenvalues in magnitude and the corresponding eigenvectors can be recovered. We have established estimates for the distance between a desired eigenspace and the underlying subspaces that the SSLBD process generates, showing how it converges to zero. In terms of the distance between the desired eigenspace and the searching subspaces, we have derived a priori error bounds for the recovered conjugate eigenvalues and the associated eigenspaces. The results indicate that the SSLBD method generally favors extreme conjugate eigenpairs of $A$.

We have made a numerical analysis of the SSLBD process and proposed an efficient and reliable approach to track the orthogonality and biorthogonality among the computed left and right Lanczos vectors. Unlike in the standard LBD process, for its skew-symmetric variant SSLBD, we have proved that the semi-orthogonality of the left and right Lanczos vectors alone does not suffice and that the semi-biorthogonality of the two sets of vectors is absolutely necessary, which has been confirmed numerically. Based on these results, we have designed an effective partial reorthogonalization strategy for the SSLBD process to maintain the desired semi-orthogonality and semi-biorthogonality. Combining the implicit restart with the proposed partial reorthogonalization, we have developed an implicitly restarted SSLBD algorithm to compute several largest singular triplets of a large real skew-symmetric matrix.

A lot of numerical experiments have confirmed the effectiveness and high efficiency of the implicitly restarted SSLBD algorithm. They have indicated that it is at least competitive with eigs and often outperforms the latter considerably.

References

  • [1] T. Apel, V. Mehrmann, and D. Watkins, Structured eigenvalue methods for the computation of corner singularities in 3d anisotropic elastic structures, Comput. Methods Appl. Mech. Eng., 191 (2002), pp. 4459–4473.
  • [2] J. Baglama and L. Reichel, Augmented implicitly restarted Lanczos bidiagonalization methods, SIAM J. Sci. Comput., 27 (2005), pp. 19–42.
  • [3] M. W. Berry, Large-scale sparse singular value decomposition, Int. J. Supercomput. Appl., 6 (1992), pp. 13–49.
  • [4] J. R. Cardoso and F. S. Leite, Exponentials of skew-symmetric matrices and logarithms of orthogonal matrices, J. Comput. Appl. Math., 233 (2010), pp. 2867–2875.
  • [5] J. Cullum, R. A. Willoughby, and M. Lake, A Lanczos algorithm for computing singular values and vectors of large matrices, SIAM J. Sci. Statist. Comput., 4 (1983), pp. 197–215.
  • [6] T. A. Davis and Y. Hu, The University of Florida sparse matrix collection, ACM Trans. Math. Software, 38 (2011), pp. 1–25. Data available online at http://www.cise.ufl.edu/research/sparse/matrices/.
  • [7] N. Del Buono, L. Lopez, and R. Peluso, Computation of the exponential of large sparse skew-symmetric matrices, SIAM J. Sci. Comput., 27 (2005), pp. 278–293.
  • [8] K. V. Fernando, Accurately counting singular values of bidiagonal matrices and eigenvalues of skew-symmetric tridiagonal matrices, SIAM J. Matrix Anal. Appl., 20 (1998), pp. 373–399.
  • [9] G. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, J. Soc. Indust. Appl. Math. Ser. B Numer. Anal., 2 (1965), pp. 205–224.
  • [10] G. Golub and C. F. van Loan, Matrix Computations, 4th ed., The John Hopkins University Press, Baltimore, 2012.
  • [11] M. E. Hochstenbach, Harmonic and refined extraction methods for the singular value problem, with applications in least squares problems, BIT Numer. Math., 44 (2004), pp. 721–754.
  • [12] M. Hǎrǎguş and T. Kapitula, On the spectra of periodic waves for infinite-dimensional Hamiltonian systems, Physica D, 237 (2008), pp. 2649–2671.
  • [13] Z. Jia, Regularization properties of LSQR for linear discrete ill-posed problems in the multiple singular value case and best, near best and general low rank approximations, Inverse Probl., 36 (2020), 085009 (38pp).
  • [14] Z. Jia and D. Niu, An implicitly restarted refined bidiagonalization Lanczos method for computing a partial singular value decomposition, SIAM J. Matrix Anal. Appl., 25 (2003), pp. 246–265.
  • [15] Z. Jia and D. Niu, A refined harmonic Lanczos bidiagonalization method and an implicitly restarted algorithm for computing the smallest singular triplets of large matrices, SIAM J. Sci. Comput., 32 (2010), pp. 714–744.
  • [16] Z. Jia and G. W. Stewart, An analysis of the Rayleigh–Ritz method for approximating eigenspaces, Math. Comput., 70 (2001), pp. 637–647.
  • [17] R. M. Larsen, Lanczos bidiagonalization with partial reorthogonalization, PhD thesis, University of Aarhus, 1998.
  • [18] R. M. Larsen, Combining implicit restarts and partial reorthogonalization in Lanczos bidiagonalization, Technical Report, SCCM, Stanford University, 2001.
  • [19] V. Mehrmann, The Autonomous Linear Quadratic Control Problem: Theory and Numerical Solution, vol. 163, Springer, Heidelberg, 1991.
  • [20] V. Mehrmann, C. Schröder, and V. Simoncini, An implicitly-restarted Krylov subspace method for real symmetric/skew-symmetric eigenproblems, Linear Algebra Appl., 436 (2012), pp. 4070–4087.
  • [21] V. Mehrmann and D. Watkins, Polynomial eigenvalue problems with Hamiltonian structure, Electron. Trans. Numer. Anal., 13 (2002), pp. 106–118.
  • [22] J.-M. Mencik and D. Duhamel, A wave-based model reduction technique for the description of the dynamic behavior of periodic structures involving arbitrary-shaped substructures and large-sized finite element models, Finite Elem. Anal. Des., 101 (2015), pp. 1–14.
  • [23] M. Paardekooper, An eigenvalue algorithm for skew-symmetric matrices, Numer. Math., 17 (1971), pp. 189–202.
  • [24] C. C. Paige and M. A. Saunders, LSQR: an algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Soft., 8 (1982), pp. 43–71.
  • [25] B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, Philadelphia, 1998.
  • [26] C. Penke, A. Marek, C. Vorwerk, C. Draxl, and P. Benner, High performance solution of skew-symmetric eigenvalue problems with applications in solving the Bethe-Salpeter eigenvalue problem, Parallel Comput., 96 (2020), p. 102639.
  • [27] Y. Saad, On the rates of convergence of the Lanczos and the block-Lanczos methods, SIAM J. Numer. Anal., 17 (1980), pp. 687–706.
  • [28] Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, Philadelphia, 2003.
  • [29] Y. Saad, Numerical Methods for Large Eigenvalue Problems: revised edition, SIAM, Philadelphia, 2011.
  • [30] H. D. Simon, Analysis of the symmetric Lanczos algorithm with reorthogonalization methods, Linear Algebra Appl., 61 (1984), pp. 101–131.
  • [31] D. C. Sorensen, Implicit application of polynomial filters in a k-step Arnoldi method, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 357–385.
  • [32] G. W. Stewart, Matrix Algorithms I: Basic Decompositions, SIAM, Philadelphia, 1998.
  • [33] G. W. Stewart, Matrix Algorithms II: Eigensystems, SIAM, Philadelphia, 2001.
  • [34] M. Stoll, A Krylov–Schur approach to the truncated SVD, Linear Algebra Appl., 436 (2012), pp. 2795–2806.
  • [35] R. C. Ward and L. J. Gray, Eigensystem computation for skew-symmetric and a class of symmetric matrices, ACM Trans. Math. Software, 4 (1978), pp. 278–285.
  • [36] M. Wimmer, Algorithm 923: Efficient numerical computation of the pfaffian for dense and banded skew-symmetric matrices, ACM Trans. Math. Software, 38 (2012), pp. 1–17.
  • [37] W.-Y. Yan and J. Lam, An approximate approach to H2H^{2} optimal model reduction, IEEE Trans. Automat. Control, 44 (1999), pp. 1341–1358.
  • [38] W. X. Zhong and F. W. Williams, On the direct solution of wave propagation for repetitive structures, J. Sound Vibr., 181 (1995), pp. 485–501.
  • [39] K. Zhou, J. C. Doyle, K. Glover, et al., Robust and Optimal Control, Prentice Hall, New Jersey, 1996.