
A skew-symmetric Lanczos bidiagonalization method for computing several largest eigenpairs of a large skew-symmetric matrix

Jinzhi Huang School of Mathematical Sciences, Soochow University, 215006 Suzhou, China ([email protected]). The work of this author was supported in part by the Youth Program of the Natural Science Foundation of Jiangsu Province (No. BK20220482).    Zhongxiao Jia Corresponding author. Department of Mathematical Sciences, Tsinghua University, 100084 Beijing, China ([email protected]). The work of this author was supported in part by the National Natural Science Foundation of China (No. 12171273).
Abstract

The spectral decomposition of a real skew-symmetric matrix A can be mathematically transformed into a specific structured singular value decomposition (SVD) of A. Based on such equivalence, a skew-symmetric Lanczos bidiagonalization (SSLBD) method is proposed for the specific SVD problem that computes extreme singular values and the corresponding singular vectors of A, from which the eigenpairs of A corresponding to the extreme conjugate eigenvalues in magnitude are recovered pairwise in real arithmetic. A number of convergence results on the method are established, and accuracy estimates for approximate singular triplets are given. In finite precision arithmetic, it is proven that the semi-orthogonality of each set of basis vectors and the semi-biorthogonality of the two sets of basis vectors suffice to compute the singular values accurately. A commonly used efficient partial reorthogonalization strategy is adapted to maintain the needed semi-orthogonality and semi-biorthogonality. For a practical purpose, an implicitly restarted SSLBD algorithm is developed with partial reorthogonalization. Numerical experiments illustrate the effectiveness and overall efficiency of the algorithm.

keywords:
Skew-symmetric matrix, spectral decomposition, eigenvalue, eigenvector, singular value decomposition, singular value, singular vector, Lanczos bidiagonalization, partial reorthogonalization
AMS:
65F15, 15A18, 65F10, 65F25

1 Introduction

Let A\in\mathbb{R}^{n\times n} be a large scale and possibly sparse skew-symmetric matrix, i.e., A^{T}=-A, where the superscript T denotes the transpose of a matrix or vector. We consider the eigenvalue problem

(1.1) Ax=\lambda x,

where \lambda\in\mathbb{C} is an eigenvalue of A and x\in\mathbb{C}^{n} with the 2-norm \|x\|=1 is a corresponding eigenvector. It is well known that the nonzero eigenvalues \lambda of A are purely imaginary and come in conjugate pairs. The real skew-symmetric eigenvalue problem arises in a variety of applications, such as matrix function computations [4, 7], the solution of linear quadratic optimal control problems [19, 39], model reductions [22, 37], wave propagation solution for repetitive structures [12, 38], crack following in anisotropic materials [1, 21], and some others [20, 26, 36].

For small to medium sized skew-symmetric matrices, several numerical algorithms have been developed to compute their spectral decompositions in real arithmetic [8, 23, 26, 35]. For a large scale skew-symmetric A, however, a high-performance numerical algorithm that can make use of its skew-symmetric structure to compute several eigenvalues and/or the corresponding eigenvectors in real arithmetic is still lacking. In this paper, we close this gap by proposing a Krylov subspace type method that computes several extreme conjugate eigenvalues in magnitude and the corresponding eigenvectors of A using only real arithmetic.

Our work originates from a basic fact that, as will be shown in Section 2, the spectral decomposition of the skew-symmetric A has a close relationship with its specifically structured singular value decomposition (SVD), where each conjugate pair of purely imaginary eigenvalues of A corresponds to a double singular value of A and the mutually orthogonal real and imaginary parts of the associated eigenvectors are the corresponding left and right singular vectors of A. Therefore, the solution of the eigenvalue problem of the skew-symmetric A is equivalent to the computation of its structured SVD, and the desired eigenpairs can be recovered from the converged singular triplets by exploiting this equivalence.

It has been well realized in [35] that the spectral decomposition of the skew-symmetric A can be reduced to the SVD of a certain bidiagonal matrix B whose size is n/2 for n even. Exploiting this property, Ward and Gray [35] have designed an eigensolver for dense skew-symmetric matrices. For a large scale A, however, the computation of B using orthogonal similarity transformations is prohibitive due to the requirement of excessive storage and computations. Nevertheless, with the help of Lanczos bidiagonalization (LBD) [24], whose complete bidiagonal reduction is originally due to [9], we are able to compute a sequence of leading principal submatrices of B, based on which approximations to the extreme singular values of A and/or the corresponding singular vectors can be computed. Along this line, a number of LBD based methods have been proposed and intensively studied, and several practical explicitly and implicitly restarted LBD type algorithms have been well developed [2, 3, 5, 11, 14, 15, 17, 18, 34].

For the skew-symmetric matrix A, LBD has some unique properties in both mathematics and numerics. However, direct applications of the aforementioned algorithms do not make use of these special properties, and will thus encounter some severe numerical difficulties. We will fully exploit these properties to propose a skew-symmetric LBD (SSLBD) method for computing several extreme singular triplets of A, from which its extreme conjugate eigenvalues in magnitude and the associated eigenvectors are recovered. Attractively, our algorithm works in real arithmetic when computing complex eigenpairs of A. Particularly, one conjugate pair of approximate eigenpairs can be recovered from each converged approximate singular triplet. We estimate the distance between the subspace generated by the real and imaginary parts of a pair of desired conjugate eigenvectors and the Krylov subspaces generated by SSLBD, and prove how it tends to zero as the subspace dimension increases. With these estimates, we establish a priori error bounds for the approximate eigenvalues and approximate eigenspaces obtained by the SSLBD method. One of them is the error bound for the two dimensional approximate eigenspace, which extends [29, pp.103, Theorem 4.6], a classic error bound for the Ritz vector in the real symmetric (complex Hermitian) case. These results show how fast our method converges when computing several extreme singular triplets.

As the subspace dimension increases, the SSLBD method will eventually become impractical due to the excessive memory and computational cost. For a practical purpose, it is generally necessary to restart the method by selecting an increasingly better starting vector when the maximum number of iterations allowed is attained, until the restarted algorithm ultimately converges. We will develop an implicitly restarted SSLBD algorithm. Implicit restarting was initially proposed by Sorensen [31] for large eigenvalue problems, and was then developed by Larsen [18] and Jia and Niu [14, 15] for large SVD computations. Our implicit restart is based on these works but is specialized to the skew-symmetric A. We will show that the two sets of left and right Lanczos vectors generated by SSLBD are also mutually biorthogonal for the skew-symmetric A. However, in finite precision arithmetic, this numerical biorthogonality is gradually lost even if the two sets themselves are kept numerically orthonormal. As a matter of fact, we have found that the numerical orthogonality of the computed left and right Lanczos vectors in an available state-of-the-art Lanczos bidiagonalization type algorithm does not suffice for the SVD problem of the skew-symmetric A, and unfortunate ghosts appear when the numerical biorthogonality is severely lost; that is, a singular value of A may be recomputed a few times. Therefore, a certain level of numerical biorthogonality is crucial to make the method work regularly.

As has been proved by Simon [30] for the symmetric Lanczos method on the eigenvalue problem and extended by Larsen [17] to the LBD method for the SVD problem, as far as the accurate computation of singular values is concerned, numerical semi-orthogonality suffices for a general A, and numerical orthogonality to the level of machine precision is not necessary. Here semi-orthogonality means that two vectors are numerically orthogonal to the level of the square root of machine precision. To maintain numerical stability and make the SSLBD method behave as it does in exact arithmetic, we shall prove that the semi-orthogonality of each set of Lanczos vectors and the semi-biorthogonality of the left and right Lanczos vectors suffice for the accurate computation of the singular values of the skew-symmetric A. We will seek an effective and efficient reorthogonalization strategy for this purpose. Specifically, besides the commonly used partial reorthogonalization to maintain the desired semi-orthogonality of each set of Lanczos vectors, we will propose a partial reorthogonalization strategy to maintain the numerical semi-biorthogonality of the two sets of Lanczos vectors, in order to avoid the ghost phenomena and make the algorithm converge regularly. We will introduce such a reorthogonalization strategy into the implicitly restarted SSLBD algorithm and design a simple scheme to recover the desired conjugate eigenpairs of A pairwise from the converged singular triplets.

The rest of this paper is organized as follows. In Section 2, we describe some properties of the spectral decomposition of A, and show its mathematical equivalence with the SVD of A. Then we establish a close relationship between the SVD of A and that of an upper bidiagonal matrix whose order is half of that of A when the order of A is even. In Section 3, we present some properties of the SSLBD process, and propose the SSLBD method for computing several extreme singular values and the corresponding singular vectors of A, from which conjugate approximate eigenpairs of A are pairwise reconstructed. In Section 4, we establish convergence results on the SSLBD method. In Section 5, we design a partial reorthogonalization scheme for the SSLBD process, and develop an implicitly restarted SSLBD algorithm to compute several extreme singular triplets of A. Numerical experiments are reported in Section 6. We conclude the paper in Section 7.

Throughout this paper, we denote by \mathrm{i} the imaginary unit, by X^{H} and X^{\dagger} the conjugate transpose and pseudoinverse of X, by \mathcal{R}(X) the range space of X, and by \|X\| and \|X\|_{F} the 2- and Frobenius norms of X, respectively. We denote by \mathbb{R}^{k} and \mathbb{C}^{k} the real and complex spaces of dimension k, by I_{k} the identity matrix of order k, and by \bm{0}_{k} and \bm{0}_{k,\ell} the zero matrices of orders k\times k and k\times\ell, respectively. The subscripts of identity and zero matrices are omitted when they are clear from the context.

2 Preliminaries

For the skew-symmetric matrix A\in\mathbb{R}^{n\times n}, the following five properties are well known and easily justified: Property (i): A is diagonalizable by a unitary similarity transformation; Property (ii): the eigenvalues \lambda of A are either purely imaginary or zero; Property (iii): for an eigenpair (\lambda,z) of A, the conjugate (\bar{\lambda},\bar{z}) is also an eigenpair of A; Property (iv): the real and imaginary parts of the eigenvector z corresponding to a purely imaginary eigenvalue of A have the same length, and are mutually orthogonal; Property (v): A of odd order must be singular, and has at least one zero eigenvalue. It is easily deduced that a nonsingular A must have even order, i.e., n=2\ell for some integer \ell, and its eigenvalues \lambda are purely imaginary and come in conjugate, i.e., plus-minus, pairs.
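These properties are easy to check numerically. The following is a small NumPy illustration of ours (not part of the paper) of Properties (ii) and (iv) on a random skew-symmetric matrix:

import numpy as np

# Our own check of Properties (ii) and (iv) on a random skew-symmetric A.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M - M.T                                   # A^T = -A
lam, Z = np.linalg.eig(A)
print(np.max(np.abs(lam.real)))               # ~0: purely imaginary spectrum
u, v = Z[:, 0].real, Z[:, 0].imag             # eigenvector of a nonzero eigenvalue
print(abs(u @ v))                             # ~0: real and imaginary parts orthogonal
print(np.linalg.norm(u) - np.linalg.norm(v))  # ~0: equal lengths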

We formulate the following basic results as a theorem, which establishes close structural relationships between the spectral decomposition of A and its SVD and forms the basis of our method and algorithm to be proposed. For completeness and our frequent use in this paper, we will give a rigorous proof by exploiting the above five properties.

Theorem 2.1.

The spectral decomposition of the nonsingular skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell is

(2.1) A=X\Lambda X^{H}

with

(2.2) X=\begin{bmatrix}\frac{1}{\sqrt{2}}(U+\mathrm{i}V)&\frac{1}{\sqrt{2}}(U-\mathrm{i}V)\end{bmatrix}\qquad\mbox{and}\qquad\Lambda=\operatorname{diag}\{\mathrm{i}\Sigma,-\mathrm{i}\Sigma\},

where U\in\mathbb{R}^{n\times\ell} and V\in\mathbb{R}^{n\times\ell} are orthonormal and biorthogonal:

(2.3) U^{T}U=I,\qquad V^{T}V=I,\qquad U^{T}V=\bm{0},

and \Sigma=\operatorname{diag}\{\sigma_{1},\dots,\sigma_{\ell}\}\in\mathbb{R}^{\ell\times\ell} with \sigma_{1},\dots,\sigma_{\ell}>0. The SVD of A is

(2.4) A=\begin{bmatrix}U&V\end{bmatrix}\begin{bmatrix}\Sigma&\\ &\Sigma\end{bmatrix}\begin{bmatrix}V&-U\end{bmatrix}^{T}.
Proof.

By Properties (i) and (ii) and the assumption that A is nonsingular, we denote by \Lambda_{+}\in\mathbb{C}^{\ell\times\ell} the diagonal matrix consisting of all the eigenvalues of A with positive imaginary parts, and by X_{+}\in\mathbb{C}^{n\times\ell} the corresponding orthonormal eigenvector matrix. By Property (iv), we can write (\Lambda_{+},X_{+}) as

(2.5) (\Lambda_{+},X_{+})=\left(\mathrm{i}\Sigma,\tfrac{1}{\sqrt{2}}(U+\mathrm{i}V)\right),

where the columns of U and V have unit length, and AX_{+}=X_{+}\Lambda_{+} and X_{+}^{H}X_{+}=I. By these and Property (iii), the conjugate of (\Lambda_{+},X_{+}) is

(2.6) (\Lambda_{-},X_{-})=\left(-\mathrm{i}\Sigma,\tfrac{1}{\sqrt{2}}(U-\mathrm{i}V)\right),

and AX_{-}=X_{-}\Lambda_{-} and X_{-}^{H}X_{-}=I. As a result, X and \Lambda defined by (2.2) are the eigenvector and eigenvalue matrices of A, respectively, and

(2.7) AX=X\Lambda.

Premultiplying the two sides of AX_{+}=X_{+}\Lambda_{+} by X_{+}^{T} delivers X_{+}^{T}AX_{+}=X_{+}^{T}X_{+}\Lambda_{+}, which, by the skew-symmetry of A, shows that

X_{+}^{T}X_{+}\Lambda_{+}+\Lambda_{+}X_{+}^{T}X_{+}=\bm{0}.

Since \Lambda_{+}=\mathrm{i}\Sigma and \Sigma is a positive diagonal matrix, it follows from the above equation that

X_{+}^{T}X_{+}=\bm{0}.

By definition (2.2) of X and the fact that X_{+} and X_{-} are column orthonormal, X is unitary. Postmultiplying (2.7) by X^{H} yields (2.1), from which it follows that (2.4) holds.

Inserting X_{+}=\frac{1}{\sqrt{2}}(U+\mathrm{i}V) into X_{+}^{H}X_{+}=I and X_{+}^{T}X_{+}=\bm{0} and solving the resulting equations for U^{T}U, V^{T}V and U^{T}V, we obtain (2.3). This shows that both [U,V] and [V,-U] are orthogonal. Therefore, (2.4) is the SVD of A.

Remark 1.

For a singular skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell+\ell_{0}, where \ell_{0} is the multiplicity of the zero eigenvalue of A, we can still write the spectral decomposition of A as (2.1) with some slight rearrangements of the eigenvalue and eigenvector matrices \Lambda and X:

(2.8) X=\begin{bmatrix}\frac{1}{\sqrt{2}}(U+\mathrm{i}V)&\frac{1}{\sqrt{2}}(U-\mathrm{i}V)&X_{0}\end{bmatrix}\qquad\mbox{and}\qquad\Lambda=\operatorname{diag}\{\mathrm{i}\Sigma,-\mathrm{i}\Sigma,\bm{0}_{\ell_{0}}\},

where U, V and \Sigma are as in Theorem 2.1 and the columns of X_{0} form an orthonormal basis of the null space of A. The proof is analogous to that of Theorem 2.1 and thus omitted. In this case, relation (2.4) gives the thin SVD of A.

For U and V in (2.4), write

U=[u_{1},u_{2},\ldots,u_{\ell}]\qquad\mbox{and}\qquad V=[v_{1},v_{2},\ldots,v_{\ell}].

On the basis of Theorem 2.1, in the sequel, we denote by

(2.9) (\lambda_{\pm j},x_{\pm j})=\left(\pm\mathrm{i}\sigma_{j},\tfrac{1}{\sqrt{2}}(u_{j}\pm\mathrm{i}v_{j})\right),\qquad j=1,\dots,\ell.

The SVD (2.4) indicates that (\sigma_{j},u_{j},v_{j}) and (\sigma_{j},v_{j},-u_{j}) are two singular triplets of A corresponding to the multiple singular value \sigma_{j}. Therefore, in order to obtain the conjugate eigenpairs (\lambda_{\pm j},x_{\pm j}) of A, one can compute only the singular triplet (\sigma_{j},u_{j},v_{j}) or (\sigma_{j},v_{j},-u_{j}), and then recover the desired eigenpairs from that singular triplet by (2.9).
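In computations, the recovery step (2.9) is just a complex combination of the two singular vectors. A minimal NumPy sketch of ours (the function name is not from the paper):

import numpy as np

def eigpairs_from_triplet(sigma, u, v):
    # Recover the conjugate eigenpairs (2.9) of a skew-symmetric A from one
    # singular triplet (sigma, u, v) with Av = sigma*u and Au = -sigma*v.
    x_plus = (u + 1j * v) / np.sqrt(2)    # eigenvector for +i*sigma
    x_minus = (u - 1j * v) / np.sqrt(2)   # eigenvector for -i*sigma
    return (1j * sigma, x_plus), (-1j * sigma, x_minus)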

Next we present a bidiagonal decomposition of the nonsingular skew-symmetric A with order n=2\ell, which essentially appears in [35]. For completeness, we include a proof.

Theorem 2.2.

Assume that the skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell is nonsingular. Then there exist two orthonormal matrices P\in\mathbb{R}^{n\times\ell} and Q\in\mathbb{R}^{n\times\ell} satisfying P^{T}Q=\bm{0} and an upper bidiagonal matrix B\in\mathbb{R}^{\ell\times\ell} such that

(2.10) A=\begin{bmatrix}P&Q\end{bmatrix}\begin{bmatrix}B&\\ &B^{T}\end{bmatrix}\begin{bmatrix}Q&-P\end{bmatrix}^{T}.
Proof.

It is known from, e.g., [35] that A is orthogonally similar to a skew-symmetric tridiagonal matrix, that is, there exists an orthogonal matrix F\in\mathbb{R}^{n\times n} such that

(2.11) A^{\prime}=F^{T}AF=\begin{bmatrix}0&-\alpha_{1}&&\\ \alpha_{1}&\ddots&\ddots&\\ &\ddots&\ddots&-\alpha_{n-1}\\ &&\alpha_{n-1}&0\end{bmatrix}.

Denote \Pi_{1}=[e_{2,n},e_{4,n},\dots,e_{2\ell,n}] and \Pi_{2}=[e_{1,n},e_{3,n},\dots,e_{2\ell-1,n}] with e_{j,n} being the jth column of I_{n}, j=1,\dots,n. Then it is easy to verify that

A^{\prime\prime}=\begin{bmatrix}\Pi_{1}&\Pi_{2}\end{bmatrix}^{T}A^{\prime}\begin{bmatrix}\Pi_{2}&\Pi_{1}\end{bmatrix}=\begin{bmatrix}B&\\ &-B^{T}\end{bmatrix},
(2.12) B=\Pi_{1}^{T}A^{\prime}\Pi_{2}=\begin{bmatrix}\alpha_{1}&-\alpha_{2}&&&\\ &\alpha_{3}&-\alpha_{4}&&\\ &&\ddots&\ddots&\\ &&&\ddots&-\alpha_{2\ell-2}\\ &&&&\alpha_{2\ell-1}\end{bmatrix}.

Combining these two relations with (2.11), we obtain

(2.13) \begin{aligned}A&=FA^{\prime}F^{T}=F\begin{bmatrix}\Pi_{1}&\Pi_{2}\end{bmatrix}A^{\prime\prime}\begin{bmatrix}\Pi_{2}&\Pi_{1}\end{bmatrix}^{T}F^{T}\\
&=\begin{bmatrix}F\Pi_{1}&F\Pi_{2}\end{bmatrix}A^{\prime\prime}\begin{bmatrix}F\Pi_{2}&F\Pi_{1}\end{bmatrix}^{T}\\
&=\begin{bmatrix}P&Q\end{bmatrix}\begin{bmatrix}B&\\ &-B^{T}\end{bmatrix}\begin{bmatrix}Q&P\end{bmatrix}^{T},\end{aligned}

which proves (2.10), where we have denoted P=F\Pi_{1} and Q=F\Pi_{2}. Note that F is orthogonal and \Pi_{1}, \Pi_{2} are not only orthonormal but also biorthogonal: \Pi_{1}^{T}\Pi_{1}=\Pi_{2}^{T}\Pi_{2}=I and \Pi_{1}^{T}\Pi_{2}=\bm{0}. Therefore, P^{T}P=Q^{T}Q=I and P^{T}Q=\bm{0}.
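The permutation step (2.12) is easy to verify numerically. A small NumPy check of ours (not from the paper):

import numpy as np

# Check that the permutations Pi_1, Pi_2 turn the skew-symmetric
# tridiagonal A' of (2.11) into diag{B, -B^T} as in (2.12).
n = 8; ell = n // 2
alpha = np.random.rand(n - 1) + 0.1          # entries alpha_1, ..., alpha_{n-1}
Ap = np.diag(alpha, -1) - np.diag(alpha, 1)  # A' in (2.11)
I = np.eye(n)
Pi1, Pi2 = I[:, 1::2], I[:, 0::2]            # columns e_2,e_4,... and e_1,e_3,...
App = np.hstack([Pi1, Pi2]).T @ Ap @ np.hstack([Pi2, Pi1])
B = Pi1.T @ Ap @ Pi2                         # upper bidiagonal B in (2.12)
assert np.allclose(App, np.block([[B, np.zeros((ell, ell))],
                                  [np.zeros((ell, ell)), -B.T]]))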

Remark 2.

For a singular skew-symmetric A\in\mathbb{R}^{n\times n} with n=2\ell+\ell_{0}, where \ell_{0} is the multiplicity of the zero eigenvalue of A, the subdiagonal elements \alpha_{i}, i=2\ell,2\ell+1,\dots,n-1 of A^{\prime} in (2.11) are zeros. Taking \Pi_{1} and \Pi_{2} as in (2.12) and \Pi_{3}=[e_{2\ell+1,n},e_{2\ell+2,n},\dots,e_{n,n}], we obtain

(2.14) A^{\prime\prime}=\begin{bmatrix}\Pi_{1}&\Pi_{2}&\Pi_{3}\end{bmatrix}^{T}A^{\prime}\begin{bmatrix}\Pi_{2}&\Pi_{1}&\Pi_{3}\end{bmatrix}=\operatorname{diag}\{B,-B^{T},\bm{0}_{\ell_{0}}\}

with B defined by (2.12). Using the same derivations as in the proof of Theorem 2.2, we have

A=\begin{bmatrix}P&Q&X_{0}\end{bmatrix}\begin{bmatrix}B&&\\ &B^{T}&\\ &&\bm{0}_{\ell_{0}}\end{bmatrix}\begin{bmatrix}Q&-P&X_{0}\end{bmatrix}^{T},

where P and Q are as in (2.13) and X_{0}=F\Pi_{3} is the orthonormal basis matrix of the null space of A. This decomposition is a precise extension of (2.10) to A with any order.

Combining Remark 2 with Remark 1 after Theorem 2.1, for ease of presentation, in the sequel we will always assume that A has even order and is nonsingular. However, our method and algorithm to be proposed apply to a singular skew-symmetric A with only some slight modifications.

Decomposition (2.10) is a bidiagonal decomposition of A that reduces A to \operatorname{diag}\{B,B^{T}\} using the structured orthogonal matrices \begin{bmatrix}P&Q\end{bmatrix}^{T} and \begin{bmatrix}Q&-P\end{bmatrix} from the left and right, respectively. The singular values of A are the union of those of B and B^{T}, but the SVD of B alone delivers the whole SVD of A: let B=U_{B}\Sigma_{B}V_{B}^{T} be the SVD of B. Then Theorem 2.2 indicates that (\Sigma,U,V)=(\Sigma_{B},PU_{B},QV_{B}) is an \ell-dimensional partial SVD of A with U and V being orthonormal and U^{T}V=\bm{0}. By Theorem 2.1, the spectral decomposition (2.1) of A can be restored directly from (\Sigma,U,V). This is exactly the working mechanism of the eigensolver proposed in [35] for small to medium sized skew-symmetric matrices.

For A large, using orthogonal transformations to construct P, Q and B in (2.10) and computing the SVD of B are unaffordable. Fortunately, based on decomposition (2.10), we can perform the LBD process on the skew-symmetric A, i.e., the SSLBD process, to successively generate the columns of P and Q and the leading principal submatrices of B, so that the Rayleigh–Ritz projection can be used to compute a partial SVD of A.

3 The SSLBD method

For ease of presentation, from now on, we always assume that the eigenvalues \lambda_{\pm j}=\pm\mathrm{i}\sigma_{j} of A in (2.9) are simple, and that \sigma_{1},\dots,\sigma_{\ell} are labeled in decreasing order:

(3.1) \sigma_{1}>\sigma_{2}>\dots>\sigma_{\ell}>0.

Our goal is to compute the k pairs of extreme, e.g., largest conjugate eigenvalues \lambda_{\pm 1},\dots,\lambda_{\pm k} and/or the corresponding eigenvectors x_{\pm 1},\dots,x_{\pm k} defined by (2.9). By Theorem 2.1, this amounts to computing the following partial SVD of A:

(\Sigma_{k},U_{k},V_{k})=\left(\operatorname{diag}\{\sigma_{1},\dots,\sigma_{k}\},[u_{1},\dots,u_{k}],[v_{1},\dots,v_{k}]\right).

3.1 The m-step SSLBD process

Algorithm 1 sketches the m-step SSLBD process on the skew-symmetric A, which is a variant of the lower LBD process proposed by Paige and Saunders [24].

Algorithm 1 The m-step SSLBD process.
1:  Initialization: Set \gamma_{0}=0 and p_{0}=\bm{0}, and choose q_{1}\in\mathbb{R}^{n} with \|q_{1}\|=1.
2:  for j=1,\dots,m<\ell do
3:     Compute s_{j}=Aq_{j}-\gamma_{j-1}p_{j-1} and \beta_{j}=\|s_{j}\|.
4:     if \beta_{j}=0 then break; else calculate p_{j}=\frac{1}{\beta_{j}}s_{j}.
5:     Compute t_{j}=-Ap_{j}-\beta_{j}q_{j} and \gamma_{j}=\|t_{j}\|.
6:     if \gamma_{j}=0 then break; else calculate q_{j+1}=\frac{1}{\gamma_{j}}t_{j}.
7:  end for

Assume that the m-step SSLBD process does not break down for m<\ell, and note that A^{T}A=-A^{2} and AA^{T}=-A^{2}. Then the process computes the orthonormal bases \{p_{j}\}_{j=1}^{m} and \{q_{j}\}_{j=1}^{m+1} of the Krylov subspaces

(3.2) \mathcal{U}_{m}=\mathcal{K}(A^{2},m,p_{1})\qquad\mbox{and}\qquad\mathcal{V}_{m+1}=\mathcal{K}(A^{2},m+1,q_{1})

generated by A^{2} and the starting vectors p_{1} and q_{1} with p_{1}=Aq_{1}/\|Aq_{1}\|, respectively. Denote P_{j}=[p_{1},\dots,p_{j}] for j=1,\dots,m and Q_{j}=[q_{1},\dots,q_{j}] for j=1,\dots,m+1, whose columns are called left and right Lanczos vectors, respectively. Then the m-step SSLBD process can be written in the matrix form:

(3.3) \left\{\begin{aligned}&AQ_{m}=P_{m}B_{m},\\ &AP_{m}=-Q_{m}B_{m}^{T}-\gamma_{m}q_{m+1}e_{m,m}^{T},\end{aligned}\right.\qquad\mbox{with}\qquad B_{m}=\begin{bmatrix}\beta_{1}&\gamma_{1}&&\\ &\ddots&\ddots&\\ &&\ddots&\gamma_{m-1}\\ &&&\beta_{m}\end{bmatrix},

where e_{m,m} is the mth column of I_{m}. It is well known that the singular values of B_{m} are simple whenever the m-step SSLBD process does not break down. For later use, denote \widehat{B}_{m}=[B_{m},\gamma_{m}e_{m,m}]. Then we can write the second relation in (3.3) as AP_{m}=-Q_{m+1}\widehat{B}_{m}^{T}.
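A direct NumPy transcription of Algorithm 1 may be sketched as follows (our own sketch; reorthogonalization, treated in Section 5, is omitted, and the function name is ours):

import numpy as np

def sslbd(A, q1, m):
    # m-step SSLBD (Algorithm 1): returns P (n x m), Q (n x (m+1)) and the
    # bidiagonal entries beta, gamma so that the relations (3.3) hold.
    n = A.shape[0]
    P, Q = np.zeros((n, m)), np.zeros((n, m + 1))
    beta, gamma = np.zeros(m), np.zeros(m)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(m):
        s = A @ Q[:, j] - (gamma[j - 1] * P[:, j - 1] if j > 0 else 0.0)
        beta[j] = np.linalg.norm(s)
        if beta[j] == 0.0:
            break
        P[:, j] = s / beta[j]
        t = -A @ P[:, j] - beta[j] * Q[:, j]
        gamma[j] = np.linalg.norm(t)
        if gamma[j] == 0.0:
            break
        Q[:, j + 1] = t / gamma[j]
    return P, Q, beta, gamma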

Theorem 3.1.

Let P_{m} and Q_{m+1} be generated by Algorithm 1 without breakdown for m<\ell. Then P_{m} and Q_{m+1} are biorthogonal:

(3.4) Q_{m+1}^{T}P_{m}=\bm{0}.
Proof.

We use induction on m. Since \beta_{1}p_{1}=Aq_{1} and \gamma_{1}q_{2}=-Ap_{1}-\beta_{1}q_{1}, by the skew-symmetry of A and \beta_{1}>0, \gamma_{1}>0, we have

\beta_{1}q_{1}^{T}p_{1}=q_{1}^{T}Aq_{1}=0\qquad\mbox{and}\qquad\gamma_{1}q_{2}^{T}p_{1}=p_{1}^{T}Ap_{1}-\beta_{1}q_{1}^{T}p_{1}=0,

which means that Q_{2}^{T}P_{1}=\bm{0} and thus proves (3.4) for m=1.

Suppose that (3.4) holds for m=1,2,\dots,j-1, i.e., Q_{j}^{T}P_{j-1}=\bm{0}. For m=j, partition

(3.5) Q_{j+1}^{T}P_{j}=\begin{bmatrix}Q_{j}^{T}\\ q_{j+1}^{T}\end{bmatrix}P_{j}=\begin{bmatrix}Q_{j}^{T}P_{j}\\ q_{j+1}^{T}P_{j}\end{bmatrix}.

It follows from (3.3) and the inductive hypothesis Q_{j}^{T}P_{j-1}=\bm{0} that

Q_{j}^{T}AQ_{j-1}=Q_{j}^{T}P_{j-1}B_{j-1}=\bm{0},

which means that Q_{j-1}^{T}AQ_{j-1}=\bm{0} and q_{j}^{T}AQ_{j-1}=\bm{0}. Since \beta_{i}>0, i=1,\dots,j, the matrix B_{j} is nonsingular. From (3.3) and the above relations we have

(3.6) Q_{j}^{T}P_{j}=Q_{j}^{T}AQ_{j}B_{j}^{-1}=\begin{bmatrix}Q_{j-1}^{T}AQ_{j-1}&Q_{j-1}^{T}Aq_{j}\\ q_{j}^{T}AQ_{j-1}&q_{j}^{T}Aq_{j}\end{bmatrix}B_{j}^{-1}=\bm{0}

by noting that q_{j}^{T}Aq_{j}=0. Particularly, relation (3.6) shows that q_{j}^{T}P_{j}=\bm{0} and p_{j}^{T}Q_{j}=\bm{0}. Making use of them and AP_{j-1}=-Q_{j}\widehat{B}_{j-1}^{T}, and noticing that \gamma_{j}>0, we obtain

(3.7) q_{j+1}^{T}P_{j}=\tfrac{1}{\gamma_{j}}(A^{T}p_{j}-\beta_{j}q_{j})^{T}P_{j}=\tfrac{1}{\gamma_{j}}p_{j}^{T}AP_{j}=\tfrac{1}{\gamma_{j}}\begin{bmatrix}p_{j}^{T}AP_{j-1}&p_{j}^{T}Ap_{j}\end{bmatrix}=\tfrac{1}{\gamma_{j}}\begin{bmatrix}-p_{j}^{T}Q_{j}\widehat{B}_{j-1}^{T}&\bm{0}\end{bmatrix}=\bm{0}.

Applying (3.6) and (3.7) to (3.5) yields (3.4) for m=j. By induction, we have proved that relation (3.4) holds for all m<\ell.

Theorem 3.1 indicates that any vectors from \mathcal{U}_{m}=\mathcal{R}(P_{m}) and \mathcal{V}_{m}=\mathcal{R}(Q_{m}), which are called left and right subspaces, are biorthogonal. Therefore, any left and right approximate singular vectors extracted from \mathcal{U}_{m} and \mathcal{V}_{m} are biorthogonal, a desired property, since the left and right singular vectors of A are biorthogonal as well.

The SSLBD process must break down in at most \ell steps since A has \ell=n/2 distinct singular values; see [13] for a detailed analysis of the multiple singular value case. Once breakdown occurs for some m<\ell, all the singular values of B_{m} are exact ones of A, and m exact singular triplets of A have been found, as we shall see in Section 4. Furthermore, by the assumption that A is nonsingular, breakdown can occur only for \gamma_{m}=0; for if \beta_{m}=0, then B_{m} and thus A would have a zero singular value.

3.2 The SSLBD method

With the left and right searching subspaces \mathcal{U}_{m} and \mathcal{V}_{m}, we use the standard Rayleigh–Ritz projection, i.e., the standard extraction approach [2, 3, 5, 14, 17, 18], to compute the approximate singular triplets (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}) of A with the unit-length vectors \tilde{u}_{j}\in\mathcal{U}_{m} and \tilde{v}_{j}\in\mathcal{V}_{m} that satisfy the requirement

(3.8) \left\{\begin{aligned}&A\tilde{v}_{j}-\theta_{j}\tilde{u}_{j}\perp\mathcal{U}_{m},\\ &A\tilde{u}_{j}+\theta_{j}\tilde{v}_{j}\perp\mathcal{V}_{m},\end{aligned}\right.\qquad\qquad j=1,\dots,m.

Set \tilde{u}_{j}=P_{m}c_{j} and \tilde{v}_{j}=Q_{m}d_{j} for j=1,\dots,m. Then (3.3) indicates that (3.8) becomes

(3.9) B_{m}d_{j}=\theta_{j}c_{j},\qquad B_{m}^{T}c_{j}=\theta_{j}d_{j},\qquad j=1,\dots,m;

that is, (\theta_{j},c_{j},d_{j}), j=1,\dots,m are the singular triplets of B_{m}. Therefore, the standard extraction amounts to computing the SVD of B_{m} with the singular values ordered as \theta_{1}>\theta_{2}>\dots>\theta_{m}, and takes (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}), j=1,\dots,m as approximations to some singular triplets of A. The resulting method is called the SSLBD method, and the (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}) are called the Ritz approximations of A with respect to the left and right searching subspaces \mathcal{U}_{m} and \mathcal{V}_{m}; particularly, the \theta_{j} are the Ritz values, and the \tilde{u}_{j} and \tilde{v}_{j} are the left and right Ritz vectors, respectively.

Since Q_{m}^{T}A^{T}AQ_{m}=B_{m}^{T}B_{m}, by the Cauchy interlace theorem of eigenvalues (cf. [25, Theorem 10.1.1, pp.203]), for j fixed, as m increases, \theta_{j} and \theta_{m-j+1} monotonically converge to \sigma_{j} and \sigma_{\ell-j+1} from below and above, respectively. Therefore, we can take (\theta_{j},\tilde{u}_{j},\tilde{v}_{j}), j=1,\dots,k as approximations to the k largest singular triplets (\sigma_{j},u_{j},v_{j}), j=1,\dots,k of A with k\ll m. Likewise, we may use (\theta_{m-j+1},\tilde{u}_{m-j+1},\tilde{v}_{m-j+1}), j=1,\dots,k to approximate the k smallest singular triplets (\sigma_{\ell-j+1},u_{\ell-j+1},v_{\ell-j+1}), j=1,\dots,k of A.
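Putting the pieces of this section together, the extraction step is a small dense SVD of B_{m} followed by a back-transformation of the singular vectors. A sketch of ours, reusing the hypothetical sslbd helper above:

import numpy as np

def ritz_triplets(A, q1, m, k):
    # Rayleigh-Ritz extraction (3.8)-(3.9): SVD of B_m, then back-transform.
    P, Q, beta, gamma = sslbd(A, q1, m)
    Bm = np.diag(beta) + np.diag(gamma[:m - 1], 1)  # upper bidiagonal B_m in (3.3)
    C, theta, Dt = np.linalg.svd(Bm)                # theta sorted decreasingly
    U_tilde = P @ C[:, :k]                          # left Ritz vectors
    V_tilde = Q[:, :m] @ Dt.T[:, :k]                # right Ritz vectors
    return theta[:k], U_tilde, V_tilde

The k returned triplets approximate (\sigma_{j},u_{j},v_{j}), j=1,\dots,k, from which conjugate approximate eigenpairs of A can then be recovered pairwise as in the sketch eigpairs_from_triplet of Section 2.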

4 A convergence analysis

For later use, for each j=1,\dots,\ell, we denote

(4.1) \Lambda_{j}=\begin{bmatrix}&\sigma_{j}\\ -\sigma_{j}&\end{bmatrix}\qquad\mbox{and}\qquad X_{j}=\begin{bmatrix}u_{j}&v_{j}\end{bmatrix}.

Then AX_{j}=X_{j}\Lambda_{j}, and \mathcal{X}_{j}:=\mathcal{R}(X_{j})=\mathrm{span}\{x_{\pm j}\} is the two dimensional eigenspace of A associated with the conjugate eigenvalues \lambda_{\pm j}, j=1,\dots,\ell. We call the pair (\Lambda_{j},X_{j}) a block eigenpair of A, j=1,\dots,\ell. From (2.3), \widehat{X}=[X_{1},\dots,X_{\ell}] is orthogonal, and (2.1) can be written as

(4.2) A=\widehat{X}\widehat{\Lambda}\widehat{X}^{T}\qquad\mbox{with}\qquad\widehat{\Lambda}=\operatorname{diag}\{\Lambda_{1},\dots,\Lambda_{\ell}\}.

To make a convergence analysis of the SSLBD method, we review some preliminaries. Let the columns of Z, W and W_{\perp} form orthonormal bases of \mathcal{Z}, \mathcal{W} and the orthogonal complement of \mathcal{W}, respectively. Denote by \angle(\mathcal{W},\mathcal{Z}) the canonical angles between \mathcal{W} and \mathcal{Z} [10, pp.329–330]. The 2-norm distance between \mathcal{Z} and \mathcal{W} is defined by

\|\sin\angle(\mathcal{W},\mathcal{Z})\|=\|W_{\perp}^{H}Z\|=\|(I-WW^{H})Z\|,

which is the sine of the largest canonical angle between \mathcal{W} and \mathcal{Z} (cf. Jia and Stewart [16]). It is worthwhile to point out that if the dimensions of \mathcal{W} and \mathcal{Z} are not equal, then this measure is not symmetric in its arguments. In fact, if the dimension of \mathcal{W} is greater than that of \mathcal{Z}, then \|\sin\angle(\mathcal{Z},\mathcal{W})\|=1, although generally \|\sin\angle(\mathcal{W},\mathcal{Z})\|<1. In this paper, we will use the F-norm distance

(4.3) \|\sin\angle(\mathcal{W},\mathcal{Z})\|_{F}=\|W_{\perp}^{H}Z\|_{F},

which equals the square root of the sum of squares of the sines of all the canonical angles between the two subspaces. Correspondingly, we define \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F} to be the square root of the sum of squares of the tangents of all the canonical angles between the subspaces. These tangents are the generalized singular values of the matrix pair \{W_{\perp}^{H}Z,W^{H}Z\}.

Notice that for \mathcal{W} and \mathcal{Z} with equal dimensions, if W^{H}Z is nonsingular, then the tangents of the canonical angles are the singular values of W_{\perp}^{H}Z(W^{H}Z)^{-1}, so that

(4.4) \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F}=\|W_{\perp}^{H}Z(W^{H}Z)^{-1}\|_{F}=\|(I-WW^{H})Z(W^{H}Z)^{-1}\|_{F},

which also holds when the F-norm is replaced by the 2-norm. We remark that the inverse in the above cannot be replaced by the pseudoinverse {\dagger} when W^{H}Z is singular. In fact, by definition, \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F} is infinite when W^{H}Z is singular, and the generalized singular values of \{W_{\perp}^{H}Z,W^{H}Z\} are not the singular values of W_{\perp}^{H}Z(W^{H}Z)^{\dagger} in this case.

Note that the real and imaginary parts of the eigenvectors x_{\pm j} of A can be interchanged, since \mathrm{i}x_{\pm j} are also eigenvectors of A, and its left and right singular vectors u_{j} and v_{j} are the right and left ones, too (cf. (2.4)). As we have pointed out, any pair of approximate left and right singular vectors extracted from the biorthogonal \mathcal{U}_{m} and \mathcal{V}_{m} are mutually orthogonal, from which we construct the real and imaginary parts of an approximate eigenvector of A. As a consequence, in the SVD context of the skew-symmetric A, because of these properties, when analyzing the convergence of the SSLBD method, we consider the orthogonal direct sum \mathcal{U}_{m}\oplus\mathcal{V}_{m} as a whole, and estimate the distance between a desired two dimensional eigenspace \mathcal{X}_{j} of A and the 2m-dimensional subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m} for j small. We remind the reader that their dimensions are unequal for m>1. Using these distances and their estimates, we can establish a priori error bounds for the approximate eigenvalues and approximate eigenspaces obtained by the SSLBD method, showing how fast they converge when computing the largest eigenvalues in magnitude and the associated eigenvectors.

In terms of the definition of \|\tan\angle(\mathcal{W},\mathcal{Z})\|_{F}, we present the following estimate for \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}.

Theorem 4.1.

Let \mathcal{U}_{m} and \mathcal{V}_{m} be defined by (3.2), and suppose that X_{j}^{H}Y with the initial Y=[p_{1},q_{1}] is nonsingular. Then the following estimate holds for any integer 1\leq j<m:

(4.5) \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}\leq\frac{\eta_{j}}{\chi_{m-j}(\xi_{j})}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F},

where \chi_{i}(\cdot) is the degree i Chebyshev polynomial of the first kind and

(4.6) \xi_{j}=1+2\cdot\frac{\sigma_{j}^{2}-\sigma_{j+1}^{2}}{\sigma_{j+1}^{2}-\sigma_{\ell}^{2}}\qquad\mbox{and}\qquad\eta_{j}=\begin{cases}1,&\mbox{if }j=1,\\ \prod_{i=1}^{j-1}\frac{\sigma_{i}^{2}-\sigma_{\ell}^{2}}{\sigma_{i}^{2}-\sigma_{j}^{2}},&\mbox{if }j>1.\end{cases}
Proof.

For a fixed 1\leq j<m, denote by \widehat{\Lambda}_{j}=\operatorname{diag}\{\Lambda_{1},\dots,\Lambda_{j-1},\Lambda_{j+1},\dots,\Lambda_{\ell}\} and \widehat{X}_{j}=[X_{1},\dots,X_{j-1},X_{j+1},\dots,X_{\ell}] the matrices obtained by deleting \Lambda_{j} and X_{j} from \widehat{\Lambda} and \widehat{X} in (4.2), respectively. Then relation (4.2) can be written as

(4.7) A=\widehat{X}_{j}\widehat{\Lambda}_{j}\widehat{X}_{j}^{T}+X_{j}\Lambda_{j}X_{j}^{T}.

Notice that Y=[p_{1},q_{1}] is orthonormal by Theorem 3.1, and \mathcal{Y}=\mathcal{R}(Y)=\mathcal{U}_{1}\oplus\mathcal{V}_{1}. Since \widehat{X} in (4.2) is orthogonal, there exists an orthonormal matrix G=[G_{1}^{T},\dots,G_{\ell}^{T}]^{T} with G_{1},\dots,G_{\ell}\in\mathbb{R}^{2\times 2} such that

(4.8) Y=\widehat{X}G=\widehat{X}_{j}\widehat{G}_{j}+X_{j}G_{j}\qquad\mbox{with}\qquad\widehat{G}_{j}=[G_{1},\dots,G_{j-1},G_{j+1},\dots,G_{\ell}].

For an arbitrary polynomial \rho(\cdot) in \mathcal{P}^{m-1}, the set of polynomials of degree no more than m-1, that satisfies \rho(-\sigma_{j}^{2})\neq 0, write Y_{\rho}=\rho(A^{2})Y and \mathcal{Y}_{\rho}=\mathcal{R}(Y_{\rho}). Then

(4.9) Y_{\rho}=\widehat{X}_{j}\rho(\widehat{\Lambda}_{j}^{2})\widehat{G}_{j}+X_{j}\rho(\Lambda_{j}^{2})G_{j}.

From (4.1), it is known that \Lambda_{i}^{2}=-\sigma_{i}^{2}I_{2}, i=1,\dots,\ell.

Since G_{j}=X_{j}^{H}Y is nonsingular, \rho(-\sigma_{j}^{2})\neq 0, and the columns of Y_{\rho}(Y_{\rho}^{T}Y_{\rho})^{-1/2} form an orthonormal basis of \mathcal{Y}_{\rho}, by definition (4.4) and relation (4.9), we obtain

(4.10) \begin{aligned}\|\tan\angle(\mathcal{Y}_{\rho},\mathcal{X}_{j})\|_{F}&=\|(I-X_{j}X_{j}^{T})Y_{\rho}(Y_{\rho}^{T}Y_{\rho})^{-1/2}(X_{j}^{T}Y_{\rho}(Y_{\rho}^{T}Y_{\rho})^{-1/2})^{-1}\|_{F}\\
&=\|(I-X_{j}X_{j}^{T})Y_{\rho}(X_{j}^{T}Y_{\rho})^{-1}\|_{F}=\|\rho(\widehat{\Lambda}_{j}^{2})\widehat{G}_{j}(\rho(\Lambda_{j}^{2})G_{j})^{-1}\|_{F}\\
&\leq\frac{\|\rho(\widehat{\Lambda}_{j}^{2})\|}{|\rho(-\sigma_{j}^{2})|}\|\widehat{G}_{j}G_{j}^{-1}\|_{F}=\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|\widehat{G}_{j}G_{j}^{-1}\|_{F}\\
&=\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|(I-X_{j}X_{j}^{T})Y(X_{j}^{T}Y)^{-1}\|_{F}\qquad\mbox{by (4.8)}\\
&=\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}.\end{aligned}

Note that \mathcal{Y}_{\rho}\subset\mathcal{U}_{m}\oplus\mathcal{V}_{m} for any \rho\in\mathcal{P}^{m-1}. By definition and (4.10), it holds that

\begin{aligned}\|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}&\leq\min_{\rho\in\mathcal{P}^{m-1}}\|\tan\angle(\mathcal{Y}_{\rho},\mathcal{X}_{j})\|_{F}\\
&\leq\min_{\rho\in\mathcal{P}^{m-1}}\frac{\max_{i\neq j}|\rho(-\sigma_{i}^{2})|}{|\rho(-\sigma_{j}^{2})|}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}\\
&=\min_{\rho\in\mathcal{P}^{m-1},\,\rho(-\sigma_{j}^{2})=1}\max_{i\neq j}|\rho(-\sigma_{i}^{2})|\,\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}\\
&\leq\frac{\eta_{j}}{\chi_{m-j}(\xi_{j})}\|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F},\end{aligned}

where the last inequality comes from the proof of Theorem 1 in [27], \chi_{m-j}(\cdot) is the degree m-j Chebyshev polynomial of the first kind, and \xi_{j} and \eta_{j} are defined by (4.6).

Theorem 4.1 establishes estimates for how well the searching subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m} approximates \mathcal{X}_{j}. But one must be aware that they are only significant for j\ll m. The scalars \eta_{j}\geq 1 and \xi_{j}>1 defined by (4.6) are constants depending only on the eigenvalue or singular value distribution of A. For a fixed integer j\ll m, as long as the initial searching subspace \mathcal{Y} contains some information on \mathcal{X}_{j}, that is, \|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F} is finite in (4.5), then the larger m is, the smaller \frac{\eta_{j}}{\chi_{m-j}(\xi_{j})} is and the closer \mathcal{X}_{j} is to \mathcal{U}_{m}\oplus\mathcal{V}_{m}. Moreover, the better the singular value \sigma_{j} is separated from the other singular values \sigma_{i}\neq\sigma_{j} of A, the smaller \eta_{j} and the larger \xi_{j} are, meaning that \mathcal{X}_{j} approaches \mathcal{U}_{m}\oplus\mathcal{V}_{m} more quickly as m increases. Generally, \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F} decays faster for j smaller. Two extreme cases are \|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}=0 and \|\tan\angle(\mathcal{Y},\mathcal{X}_{j})\|_{F}=+\infty. In the first case, Algorithm 1 breaks down at step m=1, and we already have the exact eigenspace \mathcal{X}_{j}=\mathcal{Y}. The second case indicates that the initial \mathcal{Y} is deficient in \mathcal{X}_{j}, so that \mathcal{U}_{m}\oplus\mathcal{V}_{m} does not contain any information on \mathcal{X}_{j}; consequently, one cannot find any meaningful approximations to u_{j} and v_{j} from \mathcal{U}_{m} and \mathcal{V}_{m} for any m<\ell.
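To get a feel for the decay factor \frac{\eta_{j}}{\chi_{m-j}(\xi_{j})} in (4.5), one can evaluate it directly, using \chi_{k}(x)=\cosh(k\,\mathrm{arccosh}(x)) for x\geq 1. A small illustration of ours with made-up singular values:

import numpy as np

# Decay factor eta_1 / chi_{m-1}(xi_1) in (4.5) for made-up singular values.
sigma = np.array([10.0, 8.0, 6.0, 4.0, 2.0, 1.0])   # sigma_1 > ... > sigma_l
xi1 = 1 + 2 * (sigma[0]**2 - sigma[1]**2) / (sigma[1]**2 - sigma[-1]**2)
for m in (2, 4, 8, 16):
    chi = np.cosh((m - 1) * np.arccosh(xi1))        # Chebyshev value; eta_1 = 1
    print(m, 1.0 / chi)                             # shrinks geometrically with m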

Theorem 4.1 shows that the bounds tend to zero for j small as m increases. Using a similar proof, we can establish the following analogue of (4.5) for j small:

(4.11) \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{\ell-j+1})\|_{F}\leq\frac{\hat{\eta}_{j}}{\chi_{m-j}(\hat{\xi}_{j})}\|\tan\angle(\mathcal{Y},\mathcal{X}_{\ell-j+1})\|_{F},

where

\hat{\xi}_{j}=1+2\cdot\frac{\sigma_{\ell-j+1}^{2}-\sigma_{\ell-j}^{2}}{\sigma_{\ell-j}^{2}-\sigma_{1}^{2}}\qquad\mbox{and}\qquad\hat{\eta}_{j}=\begin{cases}1,&\mbox{if }j=1,\\ \prod_{i=1}^{j-1}\frac{\sigma_{\ell-i}^{2}-\sigma_{1}^{2}}{\sigma_{\ell-i}^{2}-\sigma_{\ell-j}^{2}},&\mbox{if }j>1.\end{cases}

Similar arguments show that \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{\ell-j+1})\|_{F} generally tends to zero faster for j smaller as m increases. This indicates that \mathcal{U}_{m}\oplus\mathcal{V}_{m} also favors the eigenvectors of A corresponding to several smallest eigenvalues in magnitude if the smallest singular values \sigma_{\ell-j+1} are not clustered. In the sequel, for brevity, we only discuss the computation of the largest conjugate eigenvalues in magnitude and the associated eigenvectors of A.

In terms of \|\tan\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}, we next present a priori accuracy estimates for the Ritz approximations computed by the SSLBD method. To this end, we first establish the following two lemmas.

Lemma 4.2.

Let \mathcal{Q}_{m} be the orthogonal projector onto the subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m}. Then

(4.12) (\Theta_{j},Z_{j})=\left(\begin{bmatrix}&\theta_{j}\\ -\theta_{j}&\end{bmatrix},\begin{bmatrix}\tilde{u}_{j}&\tilde{v}_{j}\end{bmatrix}\right),\qquad j=1,2,\ldots,m

are the block eigenpairs of the linear operator \mathcal{Q}_{m}A\mathcal{Q}_{m} restricted to \mathcal{U}_{m}\oplus\mathcal{V}_{m}:

(4.13) \mathcal{Q}_{m}A\mathcal{Q}_{m}Z_{j}=Z_{j}\Theta_{j}.
Proof.

Since the columns of [P_{m},Q_{m}] form an orthonormal basis of \mathcal{U}_{m}\oplus\mathcal{V}_{m}, the orthogonal projector onto \mathcal{U}_{m}\oplus\mathcal{V}_{m} is \mathcal{Q}_{m}=[P_{m},Q_{m}][P_{m},Q_{m}]^{T}. By \tilde{u}_{j}=P_{m}c_{j} and \tilde{v}_{j}=Q_{m}d_{j}, we have

Z_{j}=\begin{bmatrix}\tilde{u}_{j}&\tilde{v}_{j}\end{bmatrix}=\begin{bmatrix}P_{m}&Q_{m}\end{bmatrix}\begin{bmatrix}c_{j}&\\ &d_{j}\end{bmatrix}.

It is known from the proof of Theorem 3.1 that P_{m}^{T}AP_{m}=Q_{m}^{T}AQ_{m}=\bm{0}. Making use of that and P_{m}^{T}AQ_{m}=B_{m}, Q_{m}^{T}AP_{m}=-B_{m}^{T} as well as (3.9), by \mathcal{Q}_{m}Z_{j}=Z_{j}, we obtain

\mathcal{Q}_{m}A\mathcal{Q}_{m}Z_{j}=\begin{bmatrix}P_{m}&Q_{m}\end{bmatrix}\begin{bmatrix}&B_{m}\\ -B_{m}^{T}&\end{bmatrix}\begin{bmatrix}c_{j}&\\ &d_{j}\end{bmatrix}=\begin{bmatrix}P_{m}&Q_{m}\end{bmatrix}\begin{bmatrix}c_{j}&\\ &d_{j}\end{bmatrix}\begin{bmatrix}&\theta_{j}\\ -\theta_{j}&\end{bmatrix}=Z_{j}\Theta_{j}.
Lemma 4.3.

With the notations of Lemma 4.2, for an arbitrary E\in\mathbb{R}^{2\times 2}, it holds that

(4.14) \|\Theta_{j}E-E\Lambda_{i}\|_{F}\geq|\theta_{j}-\sigma_{i}|\|E\|_{F},

where \Theta_{j}, j=1,\dots,m and \Lambda_{i}, i=1,\dots,\ell are defined by (4.12) and (4.1), respectively.

Proof.

Since \Theta_{j}=\theta_{j}\widehat{I}_{2} and \Lambda_{i}=\sigma_{i}\widehat{I}_{2} with \widehat{I}_{2}=\begin{bmatrix}&1\\ -1&\end{bmatrix}, by the fact that \|\widehat{I}_{2}E\|_{F}=\|E\|_{F} and \|E\widehat{I}_{2}\|_{F}=\|E\|_{F}, we have

(4.15) \begin{aligned}\|\Theta_{j}E-E\Lambda_{i}\|_{F}&=\sqrt{\|\Theta_{j}E\|_{F}^{2}+\|E\Lambda_{i}\|_{F}^{2}-2\,\mathrm{tr}(\Lambda_{i}^{T}E^{T}\Theta_{j}E)}\\
&=\sqrt{\theta_{j}^{2}\|E\|_{F}^{2}+\sigma_{i}^{2}\|E\|_{F}^{2}-2\theta_{j}\sigma_{i}\,\mathrm{tr}(\widehat{I}_{2}^{T}E^{T}\widehat{I}_{2}E)}.\end{aligned}

Denote E=\begin{bmatrix}e_{11}&e_{12}\\ e_{21}&e_{22}\end{bmatrix}. Then the trace

\mathrm{tr}(\widehat{I}_{2}^{T}E^{T}\widehat{I}_{2}E)=2e_{11}e_{22}-2e_{12}e_{21}\leq e_{11}^{2}+e_{22}^{2}+e_{12}^{2}+e_{21}^{2}=\|E\|_{F}^{2},

applying which to (4.15) gives

(4.16) \|\Theta_{j}E-E\Lambda_{i}\|_{F}\geq\sqrt{\theta_{j}^{2}\|E\|_{F}^{2}+\sigma_{i}^{2}\|E\|_{F}^{2}-2\theta_{j}\sigma_{i}\|E\|_{F}^{2}}=|\theta_{j}-\sigma_{i}|\|E\|_{F}.

Making use of Lemmas 4.2 and 4.3, we are now in a position to establish a priori accuracy estimates for the Ritz approximations.

Theorem 4.4.

For (\sigma_{j},u_{j},v_{j}), j=1,\dots,\ell, assume that (\theta_{j^{\prime}},\tilde{u}_{j^{\prime}},\tilde{v}_{j^{\prime}}) is a Ritz approximation to the desired (\sigma_{j},u_{j},v_{j}), where \theta_{j^{\prime}} is the closest to \sigma_{j} among the Ritz values. Denote \mathcal{X}_{j}=\mathrm{span}\{u_{j},v_{j}\} and \mathcal{Z}_{j^{\prime}}=\mathrm{span}\{\tilde{u}_{j^{\prime}},\tilde{v}_{j^{\prime}}\}. Then

(4.17) \|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}\leq\sqrt{1+\frac{\|\mathcal{Q}_{m}A(I-\mathcal{Q}_{m})\|^{2}}{\min_{i\neq j^{\prime}}|\sigma_{j}-\theta_{i}|^{2}}}\,\|\sin\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F},

where \mathcal{Q}_{m} is the orthogonal projector onto the subspace \mathcal{U}_{m}\oplus\mathcal{V}_{m}.

Proof.

Notice that the two dimensional subspace \mathcal{W}_{j}^{\prime}=\mathcal{Q}_{m}\mathcal{X}_{j}\subset\mathcal{U}_{m}\oplus\mathcal{V}_{m} is the orthogonal projection of \mathcal{X}_{j} onto \mathcal{U}_{m}\oplus\mathcal{V}_{m} with respect to the F-norm. Therefore, by the definition of \|\sin\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}, we have

(4.18) \|\sin\angle(\mathcal{W}_{j}^{\prime},\mathcal{X}_{j})\|_{F}=\|\sin\angle(\mathcal{U}_{m}\oplus\mathcal{V}_{m},\mathcal{X}_{j})\|_{F}.

Let W_{1}\in\mathbb{R}^{n\times 2} and W_{2}\in\mathbb{R}^{n\times(n-2m)} be the orthonormal basis matrices of \mathcal{W}_{j}^{\prime} and the orthogonal complement of \mathcal{U}_{m}\oplus\mathcal{V}_{m} with respect to \mathbb{R}^{n}, respectively. The orthonormal basis matrix X_{j}=[u_{j},v_{j}] of \mathcal{X}_{j} can be expressed as

(4.19) X_{j}=W_{1}H_{1}+W_{2}H_{2},

where H_{1}\in\mathbb{R}^{2\times 2} and H_{2}\in\mathbb{R}^{(n-2m)\times 2}. Since AX_{j}=X_{j}\Lambda_{j}, from the above relation we have

AW_{1}H_{1}-W_{1}H_{1}\Lambda_{j}=W_{2}H_{2}\Lambda_{j}-AW_{2}H_{2}.

Premultiplying this relation by \mathcal{Q}_{m} and taking F-norms on both sides, from \mathcal{Q}_{m}W_{1}=W_{1}, \mathcal{Q}_{m}W_{2}=\bm{0} and I-\mathcal{Q}_{m}=W_{2}W_{2}^{T}, we obtain

(4.20) \|\mathcal{Q}_{m}AW_{1}H_{1}-W_{1}H_{1}\Lambda_{j}\|_{F}=\|\mathcal{Q}_{m}AW_{2}H_{2}\|_{F}\leq\|\mathcal{Q}_{m}A(I-\mathcal{Q}_{m})\|\|H_{2}\|_{F}.

Recall the definition of Z_{j^{\prime}} in (4.12), and denote \widehat{Z}_{j^{\prime}}=[Z_{1},\dots,Z_{j^{\prime}-1},Z_{j^{\prime}+1},\dots,Z_{m}]. Note that the columns of [Z_{j^{\prime}},\widehat{Z}_{j^{\prime}}] form an orthonormal basis of \mathcal{U}_{m}\oplus\mathcal{V}_{m}. Since \mathcal{W}_{j}^{\prime}\subset\mathcal{U}_{m}\oplus\mathcal{V}_{m}, we can decompose its orthonormal basis matrix W_{1} into

(4.21) W_{1}=Z_{j^{\prime}}F_{1}+\widehat{Z}_{j^{\prime}}F_{2},

where F_{1}\in\mathbb{R}^{2\times 2} and F_{2}\in\mathbb{R}^{(2m-2)\times 2}. Lemma 4.2 shows that \mathcal{Q}_{m}AZ_{j^{\prime}}=Z_{j^{\prime}}\Theta_{j^{\prime}} and \mathcal{Q}_{m}A\widehat{Z}_{j^{\prime}}=\widehat{Z}_{j^{\prime}}\widehat{\Theta}_{j^{\prime}} with \widehat{\Theta}_{j^{\prime}}=\operatorname{diag}\{\Theta_{1},\dots,\Theta_{j^{\prime}-1},\Theta_{j^{\prime}+1},\dots,\Theta_{m}\}. Therefore, by (4.21), we obtain

(4.22) \begin{aligned}\|\mathcal{Q}_{m}AW_{1}H_{1}-W_{1}H_{1}\Lambda_{j}\|_{F}&=\|\mathcal{Q}_{m}AZ_{j^{\prime}}F_{1}H_{1}-Z_{j^{\prime}}F_{1}H_{1}\Lambda_{j}+\mathcal{Q}_{m}A\widehat{Z}_{j^{\prime}}F_{2}H_{1}-\widehat{Z}_{j^{\prime}}F_{2}H_{1}\Lambda_{j}\|_{F}\\
&=\|Z_{j^{\prime}}(\Theta_{j^{\prime}}F_{1}H_{1}-F_{1}H_{1}\Lambda_{j})+\widehat{Z}_{j^{\prime}}(\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j})\|_{F}\\
&\geq\|\widehat{Z}_{j^{\prime}}(\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j})\|_{F}\\
&=\|\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j}\|_{F},\end{aligned}

where the last two relations hold since [Z_{j^{\prime}},\widehat{Z}_{j^{\prime}}] is orthonormal. Write

E=F_{2}H_{1}=[E_{1}^{T},\dots,E_{j^{\prime}-1}^{T},E_{j^{\prime}+1}^{T},\dots,E_{m}^{T}]^{T}

with each E_{i}\in\mathbb{R}^{2\times 2}. Then by Lemma 4.3 we obtain

(4.23) \begin{aligned}\|\widehat{\Theta}_{j^{\prime}}F_{2}H_{1}-F_{2}H_{1}\Lambda_{j}\|_{F}^{2}&=\sum_{i=1,i\neq j^{\prime}}^{m}\|\Theta_{i}E_{i}-E_{i}\Lambda_{j}\|_{F}^{2}\geq\sum_{i=1,i\neq j^{\prime}}^{m}|\theta_{i}-\sigma_{j}|^{2}\|E_{i}\|_{F}^{2}\\
&\geq\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|^{2}\sum_{i=1,i\neq j^{\prime}}^{m}\|E_{i}\|_{F}^{2}=\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|^{2}\|E\|_{F}^{2}\\
&=\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|^{2}\|F_{2}H_{1}\|_{F}^{2}.\end{aligned}

Therefore, combining (4.20), (4.22) and (4.23), we obtain

(4.24) \|F_{2}H_{1}\|_{F}\leq\frac{\|\mathcal{Q}_{m}A(I-\mathcal{Q}_{m})\|}{\min_{i\neq j^{\prime}}|\theta_{i}-\sigma_{j}|}\|H_{2}\|_{F}.

Since both X_{j} and W_{1} are column orthonormal, decomposition (4.19) shows that

(4.25) \|\sin\angle(\mathcal{W}_{j}^{\prime},\mathcal{X}_{j})\|_{F}=\|(I-W_{1}W_{1}^{T})X_{j}\|_{F}=\|H_{2}\|_{F}.

Substituting (4.21) into (4.19) yields

(4.26) X_{j}=Z_{j^{\prime}}F_{1}H_{1}+\widehat{Z}_{j^{\prime}}F_{2}H_{1}+W_{2}H_{2},

which, by the fact that the columns of [Z_{j^{\prime}},\widehat{Z}_{j^{\prime}},W_{2}] are orthonormal, means that

\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}=\|(I-Z_{j^{\prime}}Z_{j^{\prime}}^{T})X_{j}\|_{F}^{2}=\|\widehat{Z}_{j^{\prime}}F_{2}H_{1}+W_{2}H_{2}\|_{F}^{2}=\|F_{2}H_{1}\|_{F}^{2}+\|H_{2}\|_{F}^{2}.

Then (4.17) follows by applying (4.24), (4.25) and (4.18) in turn to the above relation.

Remark 3.

This theorem extends a classic result (cf. [29, pp.103, Theorem 4.6]) on the a priori error bound for the Ritz vector in the real symmetric (complex Hermitian) case to the skew-symmetric case for the Ritz block \mathcal{Z}_{j^{\prime}}=\mathrm{span}\{\tilde{u}_{j^{\prime}},\tilde{v}_{j^{\prime}}\}.

Theorem 4.5.

With the notations of Theorem 4.4, assume that the angles

(4.27) \angle(\tilde{u}_{j^{\prime}},u_{j})\leq\frac{\pi}{4}\qquad\mbox{and}\qquad\angle(\tilde{v}_{j^{\prime}},v_{j})\leq\frac{\pi}{4}.

Then

(4.28) |\theta_{j^{\prime}}-\sigma_{j}|\leq\sigma_{j}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}+\frac{\sigma_{1}}{\sqrt{2}}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2},
(4.29) |\theta_{j^{\prime}}-\sigma_{j}|\leq(\sigma_{1}+\sigma_{j})\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.
Proof.

Decompose the orthonormal matrix Z_{j^{\prime}} defined by (4.12) into the orthogonal direct sum

(4.30) Z_{j^{\prime}}=X_{j}H+\widehat{X}_{j}\widehat{H},

where H\in\mathbb{R}^{2\times 2} and \widehat{H}\in\mathbb{R}^{(n-2)\times 2}. Then

(4.31) H^{T}H+\widehat{H}^{T}\widehat{H}=I_{2}.

By (4.13), (4.30) and AX_{j}=X_{j}\Lambda_{j}, A\widehat{X}_{j}=\widehat{X}_{j}\widehat{\Lambda}_{j}, we obtain

(4.32) \Theta_{j^{\prime}}-\Lambda_{j}=Z_{j^{\prime}}^{T}AZ_{j^{\prime}}-\Lambda_{j}=H^{T}\Lambda_{j}H+\widehat{H}^{T}\widehat{\Lambda}_{j}\widehat{H}-\Lambda_{j}=\sigma_{j}\left(H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\right)+\widehat{H}^{T}\widehat{\Lambda}_{j}\widehat{H},

where the last equality holds since $\Lambda_{j}=\sigma_{j}\widehat{I}_{2}$ with $\widehat{I}_{2}=\begin{bmatrix}&1\\ -1&\end{bmatrix}$. Notice that $\Theta_{j^{\prime}}=\theta_{j^{\prime}}\widehat{I}_{2}$ and $\|\widehat{I}_{2}\|_{F}=\sqrt{2}$. Taking the $F$-norms of both sides of the above relation and exploiting $\|\widehat{\Lambda}_{j}\|\leq\sigma_{1}$ and $\|\widehat{H}\|_{F}=\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}$ yields

(4.33) 2|θjσj|\displaystyle\sqrt{2}|\theta_{j^{\prime}}-\sigma_{j}| \displaystyle\leq σjHTI^2HI^2F+Λ^jH^F2\displaystyle\sigma_{j}\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|_{F}+\|\widehat{\Lambda}_{j}\|\|\widehat{H}\|_{F}^{2}
\displaystyle\leq σjHTI^2HI^2F+σ1sin(𝒵j,𝒳j)F2.\displaystyle\sigma_{j}\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|_{F}+\sigma_{1}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}.

Write H=[hij]H=[h_{ij}], and notice that h11=cos(u~j,uj)h_{11}=\cos\angle(\tilde{u}_{j^{\prime}},u_{j}) and h22=cos(v~j,vj)h_{22}=\cos\angle(\tilde{v}_{j^{\prime}},v_{j}). Then by (4.27) we have

(4.34) h1112,h2212and|h12|12,|h21|12h_{11}\geq\frac{1}{\sqrt{2}},\qquad h_{22}\geq\frac{1}{\sqrt{2}}\qquad\mbox{and}\qquad|h_{12}|\leq\frac{1}{\sqrt{2}},\qquad|h_{21}|\leq\frac{1}{\sqrt{2}}

since, from (4.31), the 2-norms of columns [h11,h21]T[h_{11},h_{21}]^{T} and [h12,h22]T[h_{12},h_{22}]^{T} of HH are no more than one. As a consequence, we obtain

det(H)\displaystyle{\rm det}(H) =\displaystyle= h11h22h12h21h11h22|h12h21|\displaystyle h_{11}h_{22}-h_{12}h_{21}\geq h_{11}h_{22}-|h_{12}h_{21}|
\displaystyle\geq 12121212=0.\displaystyle\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}-\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}=0.

Let $\sigma_{1}(H)\geq\sigma_{2}(H)$ be the singular values of $H$. By definition we have $\sigma_{2}(H)=\|\cos\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|$; since $\det(H)\geq 0$, it follows that

det(H)=σ1(H)σ2(H)σ22(H)=cos(𝒵j,𝒳j)2.{\rm det}(H)=\sigma_{1}(H)\sigma_{2}(H)\geq\sigma_{2}^{2}(H)=\|\cos\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.

Then it is straightforward to verify that

HTI^2HI^2F\displaystyle\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|_{F} =\displaystyle= [h11h22h12h211h12h21h11h22+1]F\displaystyle\left\|\begin{bmatrix}&h_{11}h_{22}-h_{12}h_{21}-1\\ h_{12}h_{21}-h_{11}h_{22}+1&\end{bmatrix}\right\|_{F}
=\displaystyle= 2(1det(H))2(1cos(𝒵j,𝒳j)2)\displaystyle\sqrt{2}(1-{\rm det}(H))\leq\sqrt{2}(1-\|\cos\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2})
\displaystyle\leq 2sin(𝒵j,𝒳j)2.\displaystyle\sqrt{2}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.

Applying the above inequality to the right hand side of (4.33), we obtain

|θjσj|σjsin(𝒵j,𝒳j)2+σ12sin(𝒵j,𝒳j)F2.|\theta_{j^{\prime}}-\sigma_{j}|\leq\sigma_{j}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}+\frac{\sigma_{1}}{\sqrt{2}}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}.

Similarly, we get bound (4.29) from (4.32) by using H^=sin(𝒵j,𝒳j)\|\widehat{H}\|=\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\| and

HTI^2HI^2=1det(H)sin(𝒵j,𝒳j)2.\|H^{T}\widehat{I}_{2}H-\widehat{I}_{2}\|=1-{\det}(H)\leq\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|^{2}.

Bound (4.28) can be relaxed to

|θjσj|(1+12)σ1sin(𝒵j,𝒳j)F2.|\theta_{j^{\prime}}-\sigma_{j}|\leq\left(1+\frac{1}{\sqrt{2}}\right)\sigma_{1}\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}^{2}.

Applying (4.17) to $\|\sin\angle(\mathcal{Z}_{j^{\prime}},\mathcal{X}_{j})\|_{F}$, we can estimate how fast $|\theta_{j^{\prime}}-\sigma_{j}|$ tends to zero as $m$ increases. Since $\mathcal{X}_{j}$ approaches $\mathcal{U}_{m}\oplus\mathcal{V}_{m}$ for a fixed small $j\ll m$ as $m$ increases, if $\sigma_{j}$ is well separated from the Ritz values of $A$ other than the approximate singular value $\theta_{j^{\prime}}$, that is, if $\min_{i\neq j^{\prime}}|\sigma_{j}-\theta_{i}|$ is not small, then $\mathcal{Z}_{j^{\prime}}$ converges to $\mathcal{X}_{j}$ as fast as $\mathcal{X}_{j}$ tends to $\mathcal{U}_{m}\oplus\mathcal{V}_{m}$. In this case, Theorem 4.5 shows that the convergence of $\theta_{j^{\prime}}$ to $\sigma_{j}$ is quadratic relative to that of $\mathcal{Z}_{j^{\prime}}$. We point out that assumption (4.27) is weak and is satisfied quickly as $m$ increases.

For brevity, we now omit the subscript jj and denote by (θ,u~,v~)\left(\theta,\tilde{u},\tilde{v}\right) a Ritz approximation at the mmth step. We can then recover two conjugate approximate eigenpairs from (θ,u~,v~)(\theta,\tilde{u},\tilde{v}) in the form of (λ~±,x~±)=(±iθ,12(u~±iv~))(\tilde{\lambda}_{\pm},\tilde{x}_{\pm})=(\pm\mathrm{i}\theta,\frac{1}{\sqrt{2}}(\tilde{u}\pm\mathrm{i}\tilde{v})). Note that their residuals are

(4.35) r±=Ax~±λ~±x~±=rR±irIwith{rR=12(Au~+θv~),rI=12(Av~θu~).r_{\pm}=A\tilde{x}_{\pm}-\tilde{\lambda}_{\pm}\tilde{x}_{\pm}=r_{\mathrm{R}}\pm\mathrm{i}r_{\mathrm{I}}\qquad\mbox{with}\qquad\left\{\begin{aligned} r_{\mathrm{R}}&=\tfrac{1}{\sqrt{2}}(A\tilde{u}+\theta\tilde{v}),\\ r_{\mathrm{I}}&=\tfrac{1}{\sqrt{2}}(A\tilde{v}-\theta\tilde{u}).\end{aligned}\right.

Obviously, r±=𝟎r_{\pm}=\bm{0} if and only if (θ,u~,v~)(\theta,\tilde{u},\tilde{v}) is an exact singular triplet of AA. We claim that (λ~±,x~±)(\tilde{\lambda}_{\pm},\tilde{x}_{\pm}) has converged if its residual norm satisfies

(4.36) r±=rR2+rI2Atol,\|r_{\pm}\|=\sqrt{\|r_{\mathrm{R}}\|^{2}+\|r_{\mathrm{I}}\|^{2}}\leq\|A\|\cdot tol,

where tol>0tol>0 is a prescribed tolerance. In practical computations, one can replace A\|A\| by the largest Ritz value θ1\theta_{1}. Notice that

2r±=Au~+θv~2+Av~θu~2\sqrt{2}\|r_{\pm}\|=\sqrt{\|A\tilde{u}+\theta\tilde{v}\|^{2}+\|A\tilde{v}-\theta\tilde{u}\|^{2}}

is nothing but the residual norm of the Ritz approximation (θ,u~,v~)(\theta,\tilde{u},\tilde{v}) of AA. Therefore, the eigenvalue problem and SVD problem of AA essentially share the same general-purpose stopping criterion.
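To make the recovery and the criterion (4.36) concrete, the following minimal NumPy sketch forms the conjugate approximate eigenpairs and evaluates the residual norm from a Ritz approximation $(\theta,\tilde{u},\tilde{v})$; the function name and the dense representation of $A$ are our own illustrative assumptions, not part of the algorithm.

```python
import numpy as np

# A minimal sketch (not the paper's code): recover the two conjugate
# approximate eigenpairs (4.35) from a Ritz approximation (theta, u, v)
# with unit-length u and v, and evaluate ||r_pm|| used in (4.36).
def recover_conjugate_eigenpairs(A, theta, u, v):
    x_plus = (u + 1j * v) / np.sqrt(2.0)     # eigenvector for +i*theta
    x_minus = (u - 1j * v) / np.sqrt(2.0)    # eigenvector for -i*theta
    rR = (A @ u + theta * v) / np.sqrt(2.0)  # real part of r_plus
    rI = (A @ v - theta * u) / np.sqrt(2.0)  # imaginary part of r_plus
    res = np.hypot(np.linalg.norm(rR), np.linalg.norm(rI))  # ||r_pm||
    return (1j * theta, x_plus), (-1j * theta, x_minus), res
```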

By inserting u~=Pmc\tilde{u}=P_{m}c and v~=Qmd\tilde{v}=Q_{m}d into (4.35) and making use of (3.3) and (3.9), it is easily justified that

(4.37) r±=12BmTcθd2+γm2|em,mTc|2+Bmdθc2=12γm|em,mTc|.\|r_{\pm}\|=\tfrac{1}{\sqrt{2}}\sqrt{\|B_{m}^{T}c-\theta d\|^{2}+\gamma_{m}^{2}|e_{m,m}^{T}c|^{2}+\|B_{m}d-\theta c\|^{2}}=\tfrac{1}{\sqrt{2}}\gamma_{m}|e_{m,m}^{T}c|.

This indicates that we can calculate the residual norms of the Ritz approximations cheaply without explicitly forming the approximate singular vectors. We compute the approximate singular vectors of $A$ only after the corresponding residual norms defined by (4.37) have dropped below a prescribed tolerance. Moreover, (4.37) shows that once the $m$-step SSLBD process breaks down, i.e., $\gamma_{m}=0$, all the $m$ computed approximate singular triplets $(\theta,\tilde{u},\tilde{v})$ are exact singular triplets of $A$, as we have mentioned at the end of Section 3.1.
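In a sketch, the cheap evaluation (4.37) reads as follows; $B_m$ is stored as a dense array for simplicity, and the function name is illustrative.

```python
import numpy as np

# Residual norms of all Ritz approximations from the SVD of B_m alone,
# as in (4.37); gamma_m is the next off-diagonal coefficient of the
# SSLBD process. No approximate singular vectors of A are formed.
def ritz_residual_norms(B_m, gamma_m):
    C, thetas, DT = np.linalg.svd(B_m)  # B_m = C @ diag(thetas) @ DT
    # |e_m^T c_j| is the last entry of the j-th left singular vector.
    return thetas, gamma_m * np.abs(C[-1, :]) / np.sqrt(2.0)
```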

5 An implicitly restarted SSLBD algorithm with partial reorthogonalization

This section is devoted to efficient partial reorthogonalization and the development of an implicitly restarted SSLBD algorithm. Both are crucial for practical reliability: they avoid ghost phenomena among the computed Ritz approximations and make the algorithm behave as if it were run in exact arithmetic.

5.1 Partial reorthogonalization

We adopt a commonly used two-sided partial reorthogonalization strategy, as done in [14, 15, 18], to make the columns of $P_m$ and $Q_{m+1}$ numerically orthogonal to the level $\mathcal{O}(\sqrt{\varepsilon})$, where $\varepsilon$ is the machine precision. In finite precision arithmetic, however, such semi-orthogonality alone cannot guarantee the numerical semi-biorthogonality of $P_m$ and $Q_m$, which causes the computed Ritz values to contain duplicates or ghosts and delays convergence. To avoid this severe deficiency and make the method behave as if in exact arithmetic, one of our particular concerns is to design an effective and efficient partial reorthogonalization strategy so that the entries of $Q_m^TP_m$ are $\mathcal{O}(\sqrt{\varepsilon})$ in size; that is, the columns of $P_m$ and $Q_m$ are numerically semi-biorthogonal. As will turn out, achieving this goal efficiently is involved. That the semi-orthogonality and semi-biorthogonality suffice is based on the following fundamental result.

Theorem 5.1.

Let the bidiagonal BmB_{m} and Pm=[p1,,pm]P_{m}=[p_{1},\ldots,p_{m}] and Qm=[q1,,qm]Q_{m}=[q_{1},\ldots,q_{m}] be produced by the mm-step SSLBD process. Assume that βi,i=1,2,,m\beta_{i},\ i=1,2,\ldots,m and γi,i=1,2,,m1\gamma_{i},\ i=1,2,\ldots,m-1 defined in (3.3) are not small, and

(5.1) max{max1i<jm|piTpj|,max1i<jm|qiTqj|,max1i,jm|piTqj|}ϵm.\max\left\{\max_{1\leq i<j\leq m}\left|p_{i}^{T}p_{j}\right|,\max_{1\leq i<j\leq m}\left|q_{i}^{T}q_{j}\right|,\max_{1\leq i,j\leq m}\left|p_{i}^{T}q_{j}\right|\right\}\leq\sqrt{\frac{\epsilon}{m}}.

Let [qˇ1,pˇ1,,qˇm,pˇm][\check{q}_{1},\check{p}_{1},\dots,\check{q}_{m},\check{p}_{m}] be the Q-factor in the QR decomposition of Mm=[q1,p1,,qm,pm]M_{m}=[q_{1},p_{1},\dots,q_{m},p_{m}] with the diagonals of the upper triangular matrix being positive, and define the orthonormal matrices Pˇm=[pˇ1,,pˇm]\check{P}_{m}=[\check{p}_{1},\dots,\check{p}_{m}] and Qˇm=[qˇ1,,qˇm]\check{Q}_{m}=[\check{q}_{1},\dots,\check{q}_{m}]. Then

(5.2) PˇmTAQˇm=Bm+Δm,\check{P}_{m}^{T}A\check{Q}_{m}=B_{m}+\Delta_{m},

where the entries of Δm\Delta_{m} are 𝒪(Aϵ)\mathcal{O}(\|A\|\epsilon) in size.

Proof.

Recall the equivalence of decompositions (2.10) and (2.11), and notice that decomposition (2.11) can be realized by the skew-symmetric Lanczos process, and (2.10) can be computed by our SSLBD process. The result then follows from a proof analogous to that of Theorem 4 in [30], and we thus omit details.

Theorem 5.1 shows that if the computed Lanczos vectors are semi-orthogonal and semi-biorthogonal, then the upper bidiagonal matrix $B_m$ is, up to roundoff error, equal to the projection matrix $\check{P}_m^TA\check{Q}_m$ of $A$ with respect to the left and right searching subspaces $\mathcal{U}^{\prime}_m=\mathcal{R}(\check{P}_m)$ and $\mathcal{V}^{\prime}_m=\mathcal{R}(\check{Q}_m)$. This means that the singular values of $B_m$, i.e., the computed Ritz values, are as accurate as those of $\check{P}_m^TA\check{Q}_m$, i.e., the exact Ritz values, within the level $\mathcal{O}(\|A\|\epsilon)$. In other words, the semi-orthogonality and semi-biorthogonality suffice to guarantee that the computed and true Ritz values are equally accurate in finite precision arithmetic. Therefore, full numerical orthogonality and biorthogonality at the level $\mathcal{O}(\epsilon)$ do not help; they merely incur unnecessary cost for the accurate computation of singular values.

To achieve the desired numerical semi-orthogonality and semi-biorthogonality, we first need to estimate efficiently the levels of orthogonality and biorthogonality among all the Lanczos vectors so as to monitor whether (5.1) holds. Whenever (5.1) is violated, we use an efficient partial reorthogonalization to restore it. To this end, we adapt the recurrences in Larsen's PhD thesis [17] to estimate the desired semi-orthogonality and semi-biorthogonality effectively, as shown below.

Denote by $\Phi\in\mathbb{R}^{m\times m}$ and $\Psi\in\mathbb{R}^{(m+1)\times(m+1)}$ the matrices that estimate the levels of orthogonality among the columns of $P_m$ and among those of $Q_{m+1}$, respectively, and by $\Omega\in\mathbb{R}^{m\times(m+1)}$ the matrix that estimates the level of biorthogonality of $P_m$ and $Q_{m+1}$; that is, their $(i,j)$-elements satisfy $\varphi_{ij}\approx p_i^Tp_j$, $\psi_{ij}\approx q_i^Tq_j$ and $\omega_{ij}\approx p_i^Tq_j$, respectively. Adopting the recurrences in [17], we set $\varphi_{ii}=\psi_{ii}=1$ for $i=1,\dots,m$, and at the $j$th step use the following recurrences to compute

(5.3) {φij=βiψij+γiψi+1,jγj1φi,j1,φij=(φij+sign(φij)ϵ1)/βj,fori=1,,j1,\displaystyle\left\{\begin{aligned} &\varphi_{ij}^{\prime}=\beta_{i}\psi_{ij}+\gamma_{i}\psi_{i+1,j}-\gamma_{j-1}\varphi_{i,j-1},\\ &\varphi_{ij}=(\varphi_{ij}^{\prime}+sign(\varphi_{ij}^{\prime})\epsilon_{1})/\beta_{j},\end{aligned}\right.\quad\qquad{\ \!}\mbox{for}\quad i=1,\dots,j-1,
(5.4) {ψi,j+1=γi1φi1,j+βiφijβjψij,ψi,j+1=(ψi,j+1+sign(ψi,j+1)ϵ1)/γj,fori=1,,j,\displaystyle\left\{\begin{aligned} &\psi_{i,j+1}^{\prime}=\gamma_{i-1}\varphi_{i-1,j}+\beta_{i}\varphi_{ij}-\beta_{j}\psi_{ij},\\ &\psi_{i,j+1}=(\psi_{i,j+1}^{\prime}+sign(\psi_{i,j+1}^{\prime})\epsilon_{1})/\gamma_{j},\end{aligned}\right.\qquad\mbox{for}\quad i=1,\dots,j,

where $\mathrm{sign}(\cdot)$ is the sign function, $\epsilon_{1}=\frac{\epsilon\sqrt{n}\|A\|_{e}}{2}$, and $\|A\|_{e}$ is an estimate of $\|A\|$. The terms $\mathrm{sign}(\cdot)\epsilon_{1}$ in (5.3) and (5.4) account for the rounding errors in LBD; they make the computed quantities as large as possible in magnitude, so that it is safe to take them as estimates of the levels of orthogonality among the left and right Lanczos vectors. Based on the sizes of $\varphi_{ij}$ and $\psi_{i,j+1}$, one can decide against which previous Lanczos vectors the newly computed one should be reorthogonalized.
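A direct transcription of (5.3)–(5.4) into NumPy might look as follows; we use 1-based arrays with zero padding at index 0 so that the code mirrors the formulas, and all names are our own.

```python
import numpy as np

# Sketch of the recurrences (5.3)-(5.4); arrays are 1-based with index 0
# as zero padding, phi[i, i] = psi[i, i] = 1 are set at initialization,
# beta[i] and gamma[i] are the coefficients in (3.3), and
# eps1 = eps * sqrt(n) * normA_est / 2 as in the text.
def update_phi_psi(phi, psi, beta, gamma, j, eps1):
    for i in range(1, j):        # (5.3): phi_{ij} for i = 1, ..., j-1
        t = (beta[i] * psi[i, j] + gamma[i] * psi[i + 1, j]
             - gamma[j - 1] * phi[i, j - 1])
        phi[i, j] = phi[j, i] = (t + np.sign(t) * eps1) / beta[j]
    for i in range(1, j + 1):    # (5.4): psi_{i,j+1} for i = 1, ..., j
        t = (gamma[i - 1] * phi[i - 1, j] + beta[i] * phi[i, j]
             - beta[j] * psi[i, j])
        psi[i, j + 1] = psi[j + 1, i] = (t + np.sign(t) * eps1) / gamma[j]
```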

Making use of the skew-symmetry of $A$, we can derive recurrences analogous to those in [17] for the elements $\omega_{ij}$ of $\Omega$. The derivations are rigorous but tedious, so we omit the details and present the recurrences directly. Initially, set $\omega_{i,0}=0$ and $\omega_{0,j}=0$ for $i=1,\dots,m$ and $j=1,\dots,m+1$. At the $j$th step, we compute the elements of $\Omega$ by

(5.5)\qquad\left\{\begin{aligned}&\omega_{ji}^{\prime}=\beta_{i}\omega_{ij}+\gamma_{i-1}\omega_{i-1,j}+\gamma_{j-1}\omega_{j-1,i},\\ &\omega_{ji}=-(\omega_{ji}^{\prime}+\mathrm{sign}(\omega_{ji}^{\prime})\epsilon_{1})/\beta_{j},\end{aligned}\right.\qquad\mbox{for}\quad i=1,\dots,j,

(5.6)\qquad\left\{\begin{aligned}&\omega_{i,j+1}^{\prime}=\gamma_{i}\omega_{j,i+1}+\beta_{i}\omega_{ji}+\beta_{j}\omega_{ij},\\ &\omega_{i,j+1}=-(\omega_{i,j+1}^{\prime}+\mathrm{sign}(\omega_{i,j+1}^{\prime})\epsilon_{1})/\gamma_{j},\end{aligned}\right.\qquad\mbox{for}\quad i=1,\dots,j.
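Continuing the sketch above with the same 1-based zero-padded conventions, (5.5)–(5.6) can be transcribed as:

```python
import numpy as np

# Sketch of (5.5)-(5.6); omega[i, l] estimates p_i^T q_l, and entries
# that have not yet been computed are zero, as are the padded ones.
def update_omega(omega, beta, gamma, j, eps1):
    for i in range(1, j + 1):    # (5.5): omega_{ji} for i = 1, ..., j
        t = (beta[i] * omega[i, j] + gamma[i - 1] * omega[i - 1, j]
             + gamma[j - 1] * omega[j - 1, i])
        omega[j, i] = -(t + np.sign(t) * eps1) / beta[j]
    for i in range(1, j + 1):    # (5.6): omega_{i,j+1} for i = 1, ..., j
        t = (gamma[i] * omega[j, i + 1] + beta[i] * omega[j, i]
             + beta[j] * omega[i, j])
        omega[i, j + 1] = -(t + np.sign(t) * eps1) / gamma[j]
```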

Having computed Φ\Phi, Ψ\Psi and Ω\Omega, at the jjth step of the SSLBD process, we determine the index sets:

(5.7) P(j)={i|1i<j,|φij|ϵm},P,Q(j)={i|1ij,|ωji|ϵm},\displaystyle\mathcal{I}_{P}^{(j)}=\left\{i|1\leq i<j,\ |\varphi_{ij}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\},\hskip 26.00009pt\mathcal{I}_{P,Q}^{(j)}=\left\{i|1\leq i\leq j,\ |\omega_{ji}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\},
(5.8) Q(j)={i|1ij,|ψi,j+1|ϵm},Q,P(j)={i|1ij,|ωi,j+1|ϵm}.\displaystyle\mathcal{I}_{Q}^{(j)}=\left\{i|1\leq i\leq j,\ |\psi_{i,j+1}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\},\quad\ \mathcal{I}_{Q,P}^{(j)}=\left\{i|1\leq i\leq j,\ |\omega_{i,j+1}|\geq\sqrt{\tfrac{\epsilon}{m}}\right\}.

These four sets consist of the indices of the left and right Lanczos vectors $p_i$ and $q_i$ that have lost semi-orthogonality or semi-biorthogonality with $p_j$ and $q_{j+1}$, respectively. We then use the modified Gram–Schmidt (MGS) orthogonalization procedure [10, 28, 32] to reorthogonalize the newly computed left Lanczos vector $p_j$, first against the left Lanczos vector(s) $p_i$ with $i\in\mathcal{I}_{P}^{(j)}$ and then against the right one(s) $q_i$ with $i\in\mathcal{I}_{P,Q}^{(j)}$. Having done this, we reorthogonalize the right Lanczos vector $q_{j+1}$, which has been newly computed using the updated $p_j$, first against the right Lanczos vector(s) $q_i$ for $i\in\mathcal{I}_{Q}^{(j)}$ and then against the left one(s) $p_i$ for $i\in\mathcal{I}_{Q,P}^{(j)}$.
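A generic MGS pass against a selected index set, as used in steps (3.2)–(3.3) and (5.2)–(5.3) of Algorithm 2 below, can be sketched as follows; the function name is ours.

```python
import numpy as np

def mgs_against(V, idx, w):
    # One modified Gram-Schmidt pass: orthogonalize the vector w in place
    # against the columns V[:, i] with i in idx, returning w and the
    # projection coefficients tau used to update Phi, Psi and Omega.
    taus = []
    for i in idx:
        tau = float(V[:, i] @ w)
        w -= tau * V[:, i]
        taus.append(tau)
    return w, taus
```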

Algorithm 2 Partial reorthogonalization in the SSLBD process.

(3.1): Calculate $\varphi_{1j}^{\prime},\varphi_{2j}^{\prime},\dots,\varphi_{j-1,j}^{\prime}$ and $\omega_{j1}^{\prime},\omega_{j2}^{\prime},\dots,\omega_{jj}^{\prime}$ by the first relations in (5.3) and (5.5), respectively, and determine the index sets $\mathcal{I}_{P}^{(j)}$ and $\mathcal{I}_{P,Q}^{(j)}$ according to (5.7).
(3.2): for $i\in\mathcal{I}_{P}^{(j)}$ do        % partially reorthogonalize $p_{j}$ against $P_{j-1}$
           Calculate $\tau=p_{i}^{T}s_{j}$, and update $s_{j}=s_{j}-\tau p_{i}$.
           Overwrite $\varphi_{lj}^{\prime}=\varphi_{lj}^{\prime}-\tau\varphi_{li}$ for $l=1,\dots,j-1$ and $\omega_{jl}^{\prime}=\omega_{jl}^{\prime}-\tau\omega_{il}$ for $l=1,\dots,j$.
       end for
(3.3): for $i\in\mathcal{I}_{P,Q}^{(j)}$ do        % partially reorthogonalize $p_{j}$ against $Q_{j}$
           Compute $\tau=q_{i}^{T}s_{j}$, and update $s_{j}=s_{j}-\tau q_{i}$.
           Overwrite $\omega_{jl}^{\prime}=\omega_{jl}^{\prime}-\tau\psi_{il}$ for $l=1,\dots,j$ and $\varphi_{lj}^{\prime}=\varphi_{lj}^{\prime}-\tau\omega_{li}$ for $l=1,\dots,j-1$.
       end for
(3.4): Compute $\beta_{j}=\|s_{j}\|$, and update $\varphi_{1j},\varphi_{2j},\dots,\varphi_{j-1,j}$ and $\omega_{j1},\omega_{j2},\dots,\omega_{jj}$ by the second relations in (5.3) and (5.5), respectively.
(5.1): Compute $\psi_{1,j+1}^{\prime},\psi_{2,j+1}^{\prime},\dots,\psi_{j,j+1}^{\prime}$ and $\omega_{1,j+1}^{\prime},\omega_{2,j+1}^{\prime},\dots,\omega_{j,j+1}^{\prime}$ by the first relations in (5.4) and (5.6), respectively, and determine the index sets $\mathcal{I}_{Q}^{(j)}$ and $\mathcal{I}_{Q,P}^{(j)}$ according to (5.8).
(5.2): for $i\in\mathcal{I}_{Q}^{(j)}$ do        % partially reorthogonalize $q_{j+1}$ against $Q_{j}$
           Calculate $\tau=q_{i}^{T}t_{j}$, and update $t_{j}=t_{j}-\tau q_{i}$.
           Overwrite $\psi_{l,j+1}^{\prime}=\psi_{l,j+1}^{\prime}-\tau\psi_{li}$ and $\omega_{l,j+1}^{\prime}=\omega_{l,j+1}^{\prime}-\tau\omega_{li}$ for $l=1,\dots,j$.
       end for
(5.3): for $i\in\mathcal{I}_{Q,P}^{(j)}$ do        % partially reorthogonalize $q_{j+1}$ against $P_{j}$
           Compute $\tau=p_{i}^{T}t_{j}$, and update $t_{j}=t_{j}-\tau p_{i}$.
           Overwrite $\omega_{l,j+1}^{\prime}=\omega_{l,j+1}^{\prime}-\tau\varphi_{li}$ and $\psi_{l,j+1}^{\prime}=\psi_{l,j+1}^{\prime}-\tau\omega_{il}$ for $l=1,\dots,j$.
       end for
(5.4): Calculate $\gamma_{j}=\|t_{j}\|$, and update $\omega_{1,j+1},\omega_{2,j+1},\dots,\omega_{j,j+1}$ and $\psi_{1,j+1},\psi_{2,j+1},\dots,\psi_{j,j+1}$ by the second relations in (5.6) and (5.4), respectively.

Remarkably, once a Lanczos vector is reorthogonalized against a previous one, the relevant elements of $\Phi$, $\Psi$ and $\Omega$ change accordingly and may no longer be reliable estimates for the levels of orthogonality and biorthogonality between the updated Lanczos vector and all the previous ones. Therefore, we update the changed values of $\Phi$, $\Psi$ and $\Omega$ during the MGS procedure. (In his PhD thesis [17], Larsen resets those quantities to $\mathcal{O}(\epsilon)$ and includes the $\mathcal{O}(\epsilon)$ parts in the $\epsilon_1$ of (5.3)–(5.6), but he does not explain why those quantities can automatically remain $\mathcal{O}(\epsilon)$ as the procedure proceeds. In our procedure, we guarantee that the vectors involved are numerically orthogonal and biorthogonal by explicitly performing the MGS procedure.) Since $\Phi$ and $\Psi$ are symmetric matrices with unit diagonals, it suffices to compute their strictly upper triangular parts. Algorithm 2 describes the whole partial reorthogonalization process in SSLBD, where steps (3.1)–(3.4) and (5.1)–(5.4) are inserted between steps 3 and 4 and between steps 5 and 6 of Algorithm 1, respectively. In this way, we obtain an $m$-step SSLBD process with partial reorthogonalization.

We see from Algorithm 2 and relations (5.3)–(5.6) that the $j$th step takes $\mathcal{O}(j)$ flops to compute the new elements of $\Phi$, $\Psi$ and $\Omega$ for the first time, which is negligible relative to the cost of the two matrix-vector multiplications with $A$ at that step. It also takes $\mathcal{O}(j)$ extra flops to update the elements of $\Phi$, $\Psi$ and $\Omega$ after one step of reorthogonalization. Therefore, we can estimate the levels of orthogonality and biorthogonality very efficiently.

5.2 An implicitly restarted algorithm

When the dimension of the searching subspaces reaches the maximum number $m$ allowed, the implicit restart selects certain $m-k$ shifts. In particular, the shifts can be taken as the unwanted Ritz values $\mu_1=\theta_{k+1},\dots,\mu_{m-k}=\theta_m$, called the exact shifts [18, 31]. So-called bad shifts, i.e., shifts close to some of the desired singular values, are replaced with more appropriate values. As suggested in [17] and adopted in [14, 15] with necessary modifications when computing the smallest singular triplets, we regard a shift $\mu$ as a bad one and reset it to zero if

(5.9) |(θkr±k)μ|θk103,\left|(\theta_{k}-\|r_{\pm k}\|)-\mu\right|\leq{\theta_{k}}\cdot 10^{-3},

where r±k=rR,k2+rI,k2\|r_{\pm k}\|=\sqrt{\|r_{\mathrm{R},k}\|^{2}+\|r_{\mathrm{I},k}\|^{2}} is the residual norm of the approximate singular triplet (θk,u~k,v~k)(\theta_{k},\tilde{u}_{k},\tilde{v}_{k}) with rR,kr_{\mathrm{R},k} and rI,kr_{\mathrm{I},k} defined by (4.35).
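In a sketch, the bad-shift test (5.9) amounts to the following; the function name is illustrative.

```python
def filter_bad_shifts(mus, theta_k, res_k):
    # Reset a shift mu to zero whenever it lies within theta_k * 1e-3 of
    # theta_k - ||r_{pm,k}||, as in (5.9).
    return [0.0 if abs((theta_k - res_k) - mu) <= theta_k * 1e-3 else mu
            for mu in mus]
```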

Implicit restarting formally performs $m-k$ implicit QR iterations [29, 33] on $B_m^TB_m$ with the shifts $\mu_1^2,\dots,\mu_{m-k}^2$ by equivalently applying Givens rotations directly to $B_m$ and accumulating the orthogonal matrices $\widetilde{C}_m$ and $\widetilde{D}_m$ of order $m$ such that

{C~mT(BmTBmμmk2I)(BmTBmμ12I)isupper triangular,B~m=C~mTBmD~misupper bidiagonal.\left\{\begin{aligned} \ \ &\widetilde{C}_{m}^{T}(B_{m}^{T}B_{m}-\mu_{m-k}^{2}I)\cdots(B_{m}^{T}B_{m}-\mu_{1}^{2}I)\qquad\mbox{is}\qquad\mbox{upper triangular},\\[5.0pt] &\widetilde{B}_{m}\ =\ \widetilde{C}_{m}^{T}B_{m}\widetilde{D}_{m}\hskip 103.00018pt\mbox{is}\qquad\mbox{upper bidiagonal}.\end{aligned}\right.

Using these matrices, we first compute

(5.10) Bk,new=B~k,Pk,new=PmC~k,Qk,new=QmD~k,B_{k,\mathrm{new}}=\widetilde{B}_{k},\qquad P_{k,\mathrm{new}}=P_{m}\widetilde{C}_{k},\qquad Q_{k,\mathrm{new}}=Q_{m}\widetilde{D}_{k},

where B~k\widetilde{B}_{k} is the kkth leading principal submatrix of B~m\widetilde{B}_{m}, and C~k\widetilde{C}_{k} and D~k\widetilde{D}_{k} consist of the first kk columns of C~m\widetilde{C}_{m} and D~m\widetilde{D}_{m}, respectively. We then compute

(5.11) qk+1=βmc~mkqm+1+γ~kQmd~k+1,βk,new=qk+1,qk+1,new=qk+1βk,newq_{k+1}^{\prime}=\beta_{m}\tilde{c}_{mk}\cdot q_{m+1}+\tilde{\gamma}_{k}Q_{m}\tilde{d}_{k+1},\qquad\beta_{k,\mathrm{new}}=\|q^{\prime}_{k+1}\|,\qquad q_{k+1,\mathrm{new}}=\frac{q_{k+1}^{\prime}}{\beta_{k,\mathrm{new}}}

with $\tilde{d}_{k+1}$ the $(k+1)$st column of $\widetilde{D}_m$ and $\tilde{c}_{mk}$, $\tilde{\gamma}_k$ the $(m,k)$- and $(k,k+1)$-elements of $\widetilde{C}_m$ and $\widetilde{B}_m$, respectively. As a result, after the above implicit restart we have already obtained a $k$-step SSLBD process rather than starting from scratch. We then run the SSLBD process for another $m-k$ steps, from step $k+1$ onwards, to obtain a new $m$-step SSLBD process.

Once the SSLBD process is restarted, the matrices $\Phi$, $\Psi$ and $\Omega$ must be updated, which can be done efficiently by making use of (5.10) and (5.11). Note that $\Phi$ and $\Psi$ are symmetric matrices of orders $m$ and $m+1$, respectively, and $\Omega$ is an $m\times(m+1)$ flat matrix. The corresponding updated matrices are of orders $k$ and $k+1$ and of size $k\times(k+1)$, respectively; they are computed by

\Phi_{k}:=\widetilde{C}_{k}^{T}\Phi_{m}\widetilde{C}_{k},\qquad\Psi_{k}:=\widetilde{D}_{k}^{T}\Psi_{m}\widetilde{D}_{k},\qquad\Omega_{k}:=\widetilde{C}_{k}^{T}\Omega_{m}\widetilde{D}_{k},

and

Ψ1:k,k+1\displaystyle\Psi_{1:k,k+1} :=\displaystyle:= D~kT(βmc~mkΨ1:m,m+1+γ~kΨmd~k+1)/βk,new,\displaystyle\widetilde{D}_{k}^{T}(\beta_{m}\tilde{c}_{mk}\Psi_{1:m,m+1}+\tilde{\gamma}_{k}\Psi_{m}\tilde{d}_{k+1})/\beta_{k,\mathrm{new}},
Ω1:k,k+1\displaystyle\Omega_{1:k,k+1} :=\displaystyle:= C~kT(βmc~mkΩ1:m,m+1+γ~kΩmd~k+1)/βk,new.\displaystyle\widetilde{C}_{k}^{T}(\beta_{m}\tilde{c}_{mk}\Omega_{1:m,m+1}+\tilde{\gamma}_{k}\Omega_{m}\tilde{d}_{k+1})/\beta_{k,\mathrm{new}}.

Here we denote by $(\cdot)_{j}$ and $(\cdot)_{1:j,j+1}$ the $j\times j$ leading principal submatrix and the first $j$ entries of the $(j+1)$st column of the estimate matrices of orthogonality and biorthogonality, respectively. The updating process needs $\mathcal{O}(km^{2})$ extra flops, which is negligible relative to an $m$-step implicit restart for $m\ll\ell=n/2$.
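The following NumPy sketch implements these updates with 0-based slicing; here Ck, Dk and dk1 stand for $\widetilde{C}_k$, $\widetilde{D}_k$ and $\tilde{d}_{k+1}$, bc for $\beta_m\tilde{c}_{mk}$, gk for $\tilde{\gamma}_k$ and beta_new for $\beta_{k,\mathrm{new}}$; the function name is ours.

```python
import numpy as np

# Sketch of the O(k m^2) update of the orthogonality and biorthogonality
# estimates after an implicit restart; Phi is m x m, Psi is (m+1) x (m+1)
# and Omega is m x (m+1), as in the text. The returned columns extend the
# k x k parts of Psi and Omega by their new (k+1)st columns.
def restart_estimates(Phi, Psi, Omega, Ck, Dk, dk1, bc, gk, beta_new):
    m = Phi.shape[0]
    Phi_k = Ck.T @ Phi @ Ck                           # k x k
    Psi_k = Dk.T @ Psi[:m, :m] @ Dk                   # k x k part of Psi
    Omega_k = Ck.T @ Omega[:, :m] @ Dk                # k x k part of Omega
    psi_col = Dk.T @ (bc * Psi[:m, m] + gk * Psi[:m, :m] @ dk1) / beta_new
    omega_col = Ck.T @ (bc * Omega[:, m] + gk * Omega[:, :m] @ dk1) / beta_new
    return Phi_k, Psi_k, Omega_k, psi_col, omega_col
```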

As we have seen previously, the convergence criterion (4.36) and the efficient partial reorthogonalization scheme (cf. (5.3)–(5.6)) depend on a rough estimate of $\|A\|$, which is approximated by the largest approximate singular value $\theta_1$ in our algorithm. Therefore, we simply set $\|A\|_e=\theta_1$ to replace $\|A\|$ in (4.36). Notice that no approximate singular value is available before the end of the first cycle of the implicitly restarted algorithm. We adopt a strategy analogous to that in [17] to estimate $\|A\|$: set $\|A\|_e=\sqrt{\beta_1^2+\gamma_1^2}$ for $j=1$, and update it for $j\geq 2$ by

Ae=max{Ae,βj12+γj12+γj1βj+γj2βj1,βj2+γj2+γj1βj},\|A\|_{e}=\max\left\{\|A\|_{e},\sqrt{\beta_{j-1}^{2}+\gamma_{j-1}^{2}+\gamma_{j-1}\beta_{j}+\gamma_{j-2}\beta_{j-1}},\sqrt{\beta_{j}^{2}+\gamma_{j}^{2}+\gamma_{j-1}\beta_{j}}\right\},

where γ0=0\gamma_{0}=0.
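In a sketch (1-based padded arrays with $\gamma_{0}=0$, so that the term $\gamma_{j-2}\beta_{j-1}$ vanishes for $j=2$):

```python
import numpy as np

# Running estimate of ||A|| from the bidiagonal coefficients for j >= 2,
# following the update formula above; est is the current ||A||_e.
def update_normA_est(est, beta, gamma, j):
    a = np.sqrt(beta[j - 1]**2 + gamma[j - 1]**2
                + gamma[j - 1] * beta[j] + gamma[j - 2] * beta[j - 1])
    b = np.sqrt(beta[j]**2 + gamma[j]**2 + gamma[j - 1] * beta[j])
    return max(est, a, b)
```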

Algorithm 3 Implicitly restarted SSLBD algorithm with partial reorthogonalization
1:  Initialization: Set i=0i=0, P0=[]P_{0}=[\ \ ], Q1=[q1]Q_{1}=[q_{1}] and B0=[]B_{0}=[\ \ ].
2:  while not converged and $i<i_{\max}$ do
3:     For $i=0$ or $i>0$, perform the $m$-step or $(m-k)$-step SSLBD process on $A$ with partial reorthogonalization, respectively, to obtain $P_{m}$, $Q_{m+1}$ and $\widehat{B}_{m}=[B_{m},\gamma_{m}e_{m}]$.
4:     Compute the SVD (3.9) of BmB_{m} to obtain its singular triplets (θj,cj,dj)\left(\theta_{j},c_{j},d_{j}\right), and calculate the corresponding residual norms r±j\|r_{\pm j}\| by (4.37), j=1,,mj=1,\dots,m.
5:     if r±jAetol\|r_{\pm j}\|\leq\|A\|_{e}\cdot tol for all j=1,,kj=1,\dots,k, then compute u~j=Pmcj\tilde{u}_{j}=P_{m}c_{j} and v~j=Qmdj\tilde{v}_{j}=Q_{m}d_{j} for j=1,,kj=1,\dots,k, and break the loop.
6:     Set μj=θj+k,j=1,,mk\mu_{j}=\theta_{j+k},j=1,\dots,m-k, and replace those satisfying (5.9) with zeros. Perform the implicit restart scheme with the shifts μ12,,μmk2\mu_{1}^{2},\dots,\mu_{m-k}^{2} to obtain the updated PkP_{k}, Qk+1Q_{k+1} and B^k\widehat{B}_{k}. Set i=i+1i=i+1, and goto step 3.
7:  end while

Algorithm 3 sketches the implicitly restarted SSLBD algorithm with partial reorthogonalization for computing the $k$ largest singular triplets of $A$. It requires a routine that computes $Ax$ for an arbitrary $n$-dimensional vector $x$, and the number $k$ of desired conjugate eigenpairs of $A$. The number $m$ is the maximum subspace dimension, $i_{\max}$ is the maximum number of restarts allowed, $tol$ is the stopping tolerance in (4.36), and the unit-length vector $q_1$ is the initial right Lanczos vector. The defaults of these parameters are $30$, $2000$, $10^{-8}$ and $[n^{-\frac{1}{2}},\dots,n^{-\frac{1}{2}}]^T$, respectively.

6 Numerical examples

In this section, we report numerical experiments on several matrices to illustrate the performance of our implicitly restarted SSLBD algorithm with partial reorthogonalization for the eigenvalue problem of a large skew-symmetric $A$. The algorithm, abbreviated as IRSSLBD, has been coded in the Matlab language. All the experiments were performed on an Intel(R) Core(TM) i9-10885H CPU at 2.40 GHz with 64 GB main memory and 16 cores, using Matlab R2021a with the machine precision $\epsilon=2.22\times 10^{-16}$ under the Microsoft Windows 10 64-bit system.

Table 1: Properties of the test matrices A=AoAoT2A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2} and A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix} with AoA_{\mathrm{o}} being square and rectangular, respectively, where “M80PI”, “visco1”, “flowm”, “e40r0” and “Marag”, “Kemel”, “storm”, “dano3” are abbreviations of “M80PI_n1”, “viscoplastic1”, “flowmeter5”, “e40r0100” and “Maragal_6”, “Kemelmacher”, “stormg2-27”, “dano3mip”, respectively.
AoA_{\mathrm{o}} nn nnz(A)nnz(A) σmax(A)\sigma_{\max}(A) σmin(A)\sigma_{\min}(A) gap(1)\mathrm{gap}(1) gap(5)\mathrm{gap}(5) gap(10)\mathrm{gap}(10)
plsk1919 1919 9662 1.71 3.79e-17 1.34e-1 5.17e-3 5.17e-3
tols4000 4000 9154 1.17e+7 9.83e-56 4.97e-3 4.97e-3 4.97e-3
M80PI 4028 8066 2.84 6.60e-18 7.03e-3 3.68e-3 3.68e-3
visco1 4326 71572 3.70e+1 5.42e-21 1.59e-2 3.50e-4 3.50e-4
utm5940 5940 114590 1.21 1.83e-7 5.14e-2 9.57e-4 9.57e-4
coater2 9540 264448 3.11e+2 1.35e-13 9.89e-3 7.38e-3 4.51e-4
flowm 9669 54130 3.32e-2 7.56e-36 2.87e-3 8.27e-4 6.88e-4
inlet 11730 440562 3.35 6.17e-7 1.61e-2 5.63e-4 5.63e-4
e40r0 17281 906340 1.04e+1 1.54e-16 9.64e-2 3.99e-3 2.00e-3
ns3Da 20414 1660392 5.12e-1 2.35e-6 2.59e-2 2.38e-3 8.37e-5
epb2 25228 199152 1.09 2.85e-12 1.15e-2 8.89e-4 1.86e-4
barth 6691 39496 2.23 5.46e-18 5.98e-3 1.85e-5 1.85e-5
Hamrle2 5952 44282 5.55e+1 6.41e-5 5.27e-6 5.27e-6 2.56e-6
delf 9824 30794 3.92e+3 0 3.12e-3 3.12e-3 3.12e-3
large 12899 41270 4.04e+3 0 2.21e-3 2.21e-3 2.21e-3
Marag 31407 1075388 1.33e+1 0 2.78e-1 1.66e-2 7.89e-3
r05 14880 208290 1.82e+1 0 2.07e-3 2.07e-3 5.73e-4
nl 22364 94070 2.87e+2 0 2.27e-2 1.38e-3 7.94e-4
deter7 24528 74262 9.67 0 3.42e-1 5.56e-4 5.56e-4
Kemel 38145 201750 2.41e+2 0 2.17e-2 6.96e-3 5.50e-4
p05 14680 118090 1.28e+1 0 5.08e-3 5.08e-3 1.93e-4
storm 51926 188548 5.45e+2 0 1.64e-2 2.28e-5 2.28e-5
south31 54746 224796 1.29e+4 0 6.86e+1 3.96e-1 2.02e-2
dano3 19053 163266 1.82e+3 0 1.17e-3 2.31e-6 2.31e-6

Table 1 lists some basic properties of the test skew-symmetric matrices A=AoAoT2A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2} and A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix} with AoA_{\mathrm{o}} being the square and rectangular real matrices selected from the University of Florida Sparse Matrix Collection [6], respectively, where nnz(A)nnz(A) denotes the number of nonzero elements of AA, σmax(A)\sigma_{\max}(A) and σmin(A)\sigma_{\min}(A) are the largest and smallest singular values of AA, respectively, and gap(k):=min1jk|λj2λj+12|/|λj+12λ2|=min1jk|σj2σj+12|/|σj+12σmin2(A)|\mathrm{gap}(k):=\min\limits_{1\leq j\leq k}|\lambda_{j}^{2}-\lambda_{j+1}^{2}|/|\lambda_{j+1}^{2}-\lambda_{\ell}^{2}|=\min\limits_{1\leq j\leq k}|\sigma_{j}^{2}-\sigma_{j+1}^{2}|/|\sigma_{j+1}^{2}-\sigma_{\min}^{2}(A)|. All the singular values of AA are computed, for the experimental purpose, by using the Matlab built-in function svd. As Theorem 4.1 shows, the size of gap(k)\mathrm{gap}(k) critically affects the performance of IRSSLBD for computing the kk largest conjugate eigenpairs of AA; the bigger gap(k)\mathrm{gap}(k) is, the faster IRSSLBD converges. Otherwise, IRSSLBD may converge slowly.
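For reference, $\mathrm{gap}(k)$ can be computed from the sorted singular values as follows; this is a small illustrative helper, not part of IRSSLBD.

```python
import numpy as np

def gap(sigma, k):
    # sigma: singular values sorted so that sigma[0] >= sigma[1] >= ...;
    # gap(k) = min_{1<=j<=k} |sigma_j^2 - sigma_{j+1}^2| / |sigma_{j+1}^2 - sigma_min^2|.
    s2 = np.asarray(sigma, dtype=float) ** 2
    return min(abs(s2[j] - s2[j + 1]) / abs(s2[j + 1] - s2[-1])
               for j in range(k))
```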

Unless stated otherwise, we always run the IRSSLBD algorithm with the default parameters described at the end of Section 5. We also test the Matlab built-in functions eigs and svds in order to illustrate the efficiency and reliability of IRSSLBD. Concretely, with the same parameters as those in IRSSLBD, we use eigs to compute the $2k$ eigenpairs corresponding to the $2k$ largest conjugate eigenvalues in magnitude of $A$ and the corresponding eigenvectors, and use svds to compute the $k$ or $2k$ largest singular triplets of $A$, from which the $2k$ largest eigenpairs of $A$ are obtained. We record the number of matrix-vector multiplications, abbreviated as $\#\mathrm{Mv}$, and the CPU time in seconds, denoted by $T_{\rm cpu}$, that the three algorithms use to achieve the same stopping tolerance. We mention that, for most of the test matrices, the CPU time used by each of the three algorithms is too short, e.g., less than 0.1s, to make a convincing and reliable comparison in terms of CPU time; therefore, we do not report the CPU time in most of the experiments. We also remind the reader that a comparison of the CPU time used by our algorithm and by eigs and svds may be misleading, because our IRSSLBD code is written in pure Matlab, whereas eigs and svds are implemented in the much more efficient compiled C/C++ language. Nevertheless, we aim to illustrate that IRSSLBD is indeed fast and not slower than eigs and svds even in terms of CPU time.

Experiment 6.1.

We compute k=1,5,10k=1,5,10 pairs of the largest conjugate eigenvalues in magnitude of A=plsk1919A=\mathrm{plsk1919} and the corresponding eigenvectors.

Table 2: Results on A=plsk1919A=\mathrm{plsk1919}.
  Algorithm k=1k=1 k=5k=5 k=10k=10
   #\#Mv    TcpuT_{\mathrm{cpu}}    #\#Mv    TcpuT_{\mathrm{cpu}}    #\#Mv    TcpuT_{\mathrm{cpu}}
eigs 58 4.52e-3 122 1.02e-2 186 3.06e-2
SVDS(kk) 118 4.11e-3 258 1.22e-2 416 1.61e-2
SVDS(2k2k) 266 8.35e-3 416 1.61e-2 448 2.04e-2
IRSSLBD 50 8.26e-3 106 4.03e-2 134 5.20e-2

Table 2 displays the #\#Mv and TcpuT_{\mathrm{cpu}} used by the three algorithms, where SVDS(kk) and SVDS(2k2k) indicate that we use svds to compute the kk and 2k2k largest singular triplets of AA, respectively. We have found that IRSSLBD and eigs solved the SVD problems successfully and computed the desired eigenpairs of AA correctly. In terms of #\#Mv, we can see from the table that IRSSLBD is slightly more efficient than eigs for k=1,5k=1,5 but outperforms eigs substantially for k=10k=10. IRSSLBD is thus superior to eigs for this matrix. We also see from Table 2 that the CPU time used by IRSSLBD is very short and competitive with that used by eigs. The same phenomena have been observed for most of the test matrices.

Remarkably, we have found that, without explicit partial reorthogonalization to maintain the semi-biorthogonality of the left and right Lanczos vectors, IRSSLBD and svds encounter severe troubles, and they behave similarly. Indeed, for $k=1$, SVDS($k$) succeeded in computing the largest singular triplet of $A$, from which the desired largest conjugate eigenpairs are recovered as described in Section 3.2. For $k=5$ and $10$, however, SVDS($k$) computes duplicate approximations to the singular triplets associated with each conjugate eigenvalue pair of $A$, and ghosts appear, so that SVDS($k$) finds only half of the desired eigenpairs, i.e., the first to third and the first to fifth conjugate eigenpairs, respectively. The ghost phenomena not only delay the convergence of SVDS($k$) considerably but also make it converge irregularly and fail to find all the desired singular triplets. SVDS($k$) also uses many more $\#\mathrm{Mv}$ than IRSSLBD does for $k=10$. Likewise, for each $k=1,5,10$, SVDS($2k$) succeeds in computing only $k$ approximations to the desired largest singular triplets of $A$, from which the desired approximate eigenpairs can be recovered, but consumes $3\sim 5$ times as many $\#\mathrm{Mv}$ as IRSSLBD. This shows that a direct application of an LBD algorithm to skew-symmetric eigenvalue problems does not work well in finite precision arithmetic. In contrast, IRSSLBD works well when the semi-orthogonality and semi-biorthogonality are maintained. Therefore, the semi-biorthogonality of the left and right Lanczos vectors is crucial.

Experiment 6.2.

We compute $k=1,5,10$ pairs of the largest conjugate eigenvalues in magnitude and the corresponding eigenvectors of the twelve square matrices $A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2}$ with $A_{\mathrm{o}}=$ tols4000, M80PI, visco1, utm5940, coater2, flowm, inlet, e40r0, ns3Da, epb2, barth and Hamrle2, respectively. We report the results for IRSSLBD and eigs only, since svds behaves irregularly and computes only some of the desired singular triplets because of the severe loss of numerical biorthogonality of the left and right Lanczos vectors.

Table 3: The #Mv used by IRSSLBD and eigs to compute k=1,5,10k=1,5,10 pairs of largest conjugate eigenvalues in magnitude and the corresponding eigenvectors of A=AoAoT2A=\frac{A_{\mathrm{o}}-A_{\mathrm{o}}^{T}}{2}.
AoA_{\mathrm{o}} Algorithm: eigs Algorithm: IRSSLBD
  k=1k=1   k=5k=5   k=10k=10  k=1k=1  k=5k=5  k=10k=10
tols4000 338 278 460 174 (51.48) 232 (83.45) 292 (63.48)
M80PI 58 110 254 44 (75.86) 100 (90.91) 158 (62.20)
visco1 198 302 912 132 (66.67) 282 (93.38) 370 (40.57)
utm5940 86 336 788 76 (88.37) 278 (82.74) 306 (38.83)
coater2 142 162 282 84 (59.15) 130 (80.25) 152 (53.90)
flowm 478 384 702 228 (47.70) 346 (90.10) 394 (56.13)
inlet 170 412 572 132 (77.65) 316 (76.70) 310 (54.20)
e40r0 86 168 576 70 (81.40) 148 (88.10) 284 (49.31)
ns3Da 114 226 960 92 (80.70) 204 (90.27) 376 (39.17)
epb2 198 328 828 138 (69.70) 278 (84.76) 384 (46.38)
barth 310 392 510 172 (55.48) 318 (81.12) 322 (63.14)
Hamrle2 3558 300 16178 136 (  3.82) 190 (63.33) 354 (  2.19)

IRSSLBD and eigs successfully compute the desired eigenpairs of all the test matrices. Table 3 displays the $\#\mathrm{Mv}$ used by them, where each quantity in parentheses is the percentage of the $\#\mathrm{Mv}$ used by IRSSLBD relative to that used by eigs; we drop the $\%$ sign to save space.

For $k=1$, we see from Table 3 that IRSSLBD is always more efficient than eigs, and it often outperforms the latter considerably. For instance, IRSSLBD uses fewer than $60\%$ of the $\#\mathrm{Mv}$ of eigs for $A_{\mathrm{o}}=$ tols4000, coater2, flowm and barth. Strikingly, IRSSLBD consumes only $3.82\%$ of the $\#\mathrm{Mv}$ used by eigs for $A_{\mathrm{o}}=$ Hamrle2. For $k=5$, IRSSLBD is considerably more efficient than eigs for most of the test matrices, especially $A_{\mathrm{o}}=$ Hamrle2, for which IRSSLBD uses $36\%$ fewer $\#\mathrm{Mv}$. For $k=10$, IRSSLBD uses $62\%$–$64\%$ of the $\#\mathrm{Mv}$ consumed by eigs for $A_{\mathrm{o}}=$ tols4000, M80PI and barth, and $54\%$–$57\%$ for $A_{\mathrm{o}}=$ coater2, flowm and inlet. Therefore, for these six matrices, the advantage of IRSSLBD over eigs is considerable. For the five matrices $A_{\mathrm{o}}=$ visco1, utm5940, e40r0, ns3Da and epb2, IRSSLBD is even more efficient, as it consumes no more than half of the $\#\mathrm{Mv}$ used by eigs. For $A_{\mathrm{o}}=$ Hamrle2, IRSSLBD uses only $2.19\%$ of the $\#\mathrm{Mv}$ cost by eigs, a huge saving.

In summary, for these twelve test matrices, our IRSSLBD algorithm outperforms eigs and is often considerably more efficient than the latter for the given three kk.

Experiment 6.3.

We compute k=1,5,10k=1,5,10 pairs of the largest conjugate eigenvalues of A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix} and the corresponding eigenvectors with the eleven matrices Ao=A_{\mathrm{o}}= delf, large, Marag, r05, nl, deter7, Kemel, p05, storm, south31 and dano3.

Table 4: The #Mv used by IRSSLBD and eigs to compute k=1,5,10k=1,5,10 pairs of the largest conjugate eigenpairs of A=[AoAoT]A=\begin{bmatrix}\begin{smallmatrix}&A_{\mathrm{o}}\\ -A_{\mathrm{o}}^{T}&\end{smallmatrix}\end{bmatrix}.
AoA_{\mathrm{o}} Algorithm: eigs Algorithm: IRSSLBD
 k=1k=1  k=5k=5  k=10k=10  k=1k=1   k=5k=5   k=10k=10
delf 478 214 192 142 (29.71) 142 (66.36) 138 (71.88)
large 590 284 268 188 (31.86) 172 (60.56) 172 (64.18)
Marag 58 84 178 38 (65.52) 82 (97.62) 130 (73.03)
r05 86 70 668 54 (62.79) 56 (80.00) 222 (33.23)
nl 114 242 222 76 (66.67) 172 (71.07) 126 (56.76)
deter7 58 346 144 36 (62.07) 148 (42.77) 122 (84.72)
Kemel 142 220 636 118 (83.10) 198 (90.00) 306 (48.11)
p05 114 90 692 58 (50.88) 62 (68.89) 234 (32.82)
storm 58 84 78 46 (79.31) 64 (76.19) 66 (84.62)
south31 30 42 146 8 (26.67) 26 (61.90) 92 (63.01)
dano3 114 70 5922 58 (50.88) 70 (100.0) 196 (  3.31)

These test matrices are all singular with large dimensional null spaces. In order to purge the null space from the searching subspaces, we take the initial vector q1q_{1} in IRSSLBD and eigs to be the unit-length vector normalized from A[1,,1]TA\cdot[1,\dots,1]^{T}. All the other parameters are the same for the two algorithms with the default values as described previously. Both IRSSLBD and eigs succeed in computing all the desired eigenpairs of AA for the three kk’s. Table 4 displays the #Mv, where the quantities in the parentheses are as in Table 3.

We have also observed an interesting phenomenon from this table and the previous experiments. As is seen from Table 4, for some test matrices, e.g., delf, large, nl, deter7, p05 and dano3, when more conjugate eigenpairs of $A$ are desired, IRSSLBD or eigs may use even fewer $\#\mathrm{Mv}$. This occurs mainly because of the implicit restart technique. Specifically, when more conjugate eigenpairs are desired, new initial searching subspaces $\mathcal{U}_{k}$ and $\mathcal{V}_{k}$ of larger dimension are retained at each restart, so that the restarted searching subspaces may contain more information on the desired eigenvectors and the largest approximate eigenpairs may converge faster.

As for an overall efficiency comparison of IRSSLBD and eigs, we have also observed the advantage of the former over the latter, similar to that in Experiment 6.2. Therefore, we now make some comments on them together. Before doing so, let us regard each test matrix with each selected kk as one independent problem, so in total we have 33 problems, which will be distinguished by the name-value pairs (Ao,k)(A_{\mathrm{o}},k).

As is seen from Table 4, for $(A_{\mathrm{o}},k)=(\mathrm{Marag},5)$ and $(\mathrm{dano3},5)$, IRSSLBD and eigs use almost the same $\#\mathrm{Mv}$. For $(A_{\mathrm{o}},k)=(\mathrm{Kemel},1)$, $(\mathrm{Kemel},5)$, $(\mathrm{deter7},10)$ and $(\mathrm{storm},10)$, IRSSLBD is slightly better than eigs and uses $82\%$–$90\%$ of the $\#\mathrm{Mv}$. For $(A_{\mathrm{o}},k)=(\mathrm{Marag},1)$, $(\mathrm{r05},1)$, $(\mathrm{nl},1)$, $(\mathrm{deter7},1)$, $(\mathrm{storm},1)$, $(\mathrm{delf},5)$, $(\mathrm{large},5)$, $(\mathrm{r05},5)$, $(\mathrm{nl},5)$, $(\mathrm{p05},5)$, $(\mathrm{storm},5)$, $(\mathrm{south31},5)$, $(\mathrm{delf},10)$, $(\mathrm{large},10)$, $(\mathrm{Marag},10)$, $(\mathrm{nl},10)$ and $(\mathrm{south31},10)$, IRSSLBD outperforms eigs considerably, as it costs $56\%$–$80\%$ of the $\#\mathrm{Mv}$ used by eigs. For $(A_{\mathrm{o}},k)=(\mathrm{delf},1)$, $(\mathrm{large},1)$, $(\mathrm{p05},1)$, $(\mathrm{south31},1)$, $(\mathrm{dano3},1)$, $(\mathrm{deter7},5)$, $(\mathrm{r05},10)$, $(\mathrm{Kemel},10)$, $(\mathrm{p05},10)$ and $(\mathrm{dano3},10)$, IRSSLBD is substantially more efficient than eigs, as it uses only approximately half of the $\#\mathrm{Mv}$ or even fewer. Particularly, for $(A_{\mathrm{o}},k)=(\mathrm{dano3},10)$, the advantage of IRSSLBD over eigs is very substantial, since the $\#\mathrm{Mv}$ used by the former is only about $3.3\%$ of that used by the latter.

In summary, with an appropriate starting vector, IRSSLBD is well suited to computing several largest conjugate eigenpairs of a singular skew-symmetric matrix, and it is more efficient, and can be much more efficient, than eigs.

7 Conclusions

We have shown that the eigendecomposition of a real skew-symmetric matrix $A$ has a close relationship with a specific structured SVD of $A$, by which the computation of several of its extreme conjugate eigenpairs can be equivalently transformed into the computation of the largest singular triplets in real arithmetic. For a large $A$, as a key step toward the efficient computation of the desired eigenpairs, by means of the equivalence between the tridiagonal decomposition of $A$ and a half-sized bidiagonal decomposition, the SSLBD process is exploited to successively compute a sequence of partial bidiagonal decompositions. The process provides us with a sequence of left and right searching subspaces and bidiagonal matrices, making the computation of a partial SVD of $A$ possible, from which the desired eigenpairs of $A$ are obtained without involving complex arithmetic.

We have made a detailed theoretical analysis of the SSLBD process. Based on the results obtained, we have proposed an SSLBD method for computing several extreme singular triplets of $A$, from which the desired extreme conjugate eigenvalues in magnitude and the corresponding eigenvectors can be recovered. We have established estimates for the distance between a desired eigenspace and the underlying subspaces that the SSLBD process generates, showing how it converges to zero. In terms of the distance between the desired eigenspace and the searching subspaces, we have derived a priori error bounds for the recovered conjugate eigenvalues and the associated eigenspaces. The results indicate that the SSLBD method generally favors extreme conjugate eigenpairs of $A$.

We have made a numerical analysis of the SSLBD process and proposed an efficient and reliable approach to track the orthogonality and biorthogonality among the computed left and right Lanczos vectors. Unlike in the standard LBD process, for its skew-symmetric variant SSLBD, we have proved that the semi-orthogonality of the left and right Lanczos vectors alone does not suffice and that the semi-biorthogonality of the two sets of vectors is absolutely necessary, which has been confirmed numerically. Based on these results, we have designed an effective partial reorthogonalization strategy for the SSLBD process to maintain the desired semi-orthogonality and semi-biorthogonality. Combining the implicit restart with the proposed partial reorthogonalization, we have developed an implicitly restarted SSLBD algorithm to compute several largest singular triplets of a large real skew-symmetric matrix.

A lot of numerical experiments have confirmed the effectiveness and high efficiency of the implicitly restarted SSLBD algorithm. They have indicated that it is at least competitive with eigs and often outperforms the latter considerably.

References

  • [1] T. Apel, V. Mehrmann, and D. Watkins, Structured eigenvalue methods for the computation of corner singularities in 3d anisotropic elastic structures, Comput. Methods Appl. Mech. Eng., 191 (2002), pp. 4459–4473.
  • [2] J. Baglama and L. Reichel, Augmented implicitly restarted Lanczos bidiagonalization methods, SIAM J. Sci. Comput., 27 (2005), pp. 19–42.
  • [3] M. W. Berry, Large-scale sparse singular value decomposition, Int. J. Supercomput. Appl., 6 (1992), pp. 13–49.
  • [4] J. R. Cardoso and F. S. Leite, Exponentials of skew-symmetric matrices and logarithms of orthogonal matrices, J. Comput. Appl. Math., 233 (2010), pp. 2867–2875.
  • [5] J. Cullum, R. A. Willoughby, and M. Lake, A Lanczos algorithm for computing singular values and vectors of large matrices, SIAM J. Sci. Statist. Comput., 4 (1983), pp. 197–215.
  • [6] T. A. Davis and Y. Hu, The University of Florida sparse matrix collection, ACM Trans. Math. Software, 38 (2011), pp. 1–25. Data available online at http://www.cise.ufl.edu/research/sparse/matrices/.
  • [7] N. Del Buono, L. Lopez, and R. Peluso, Computation of the exponential of large sparse skew-symmetric matrices, SIAM J. Sci. Comput., 27 (2005), pp. 278–293.
  • [8] K. V. Fernando, Accurately counting singular values of bidiagonal matrices and eigenvalues of skew-symmetric tridiagonal matrices, SIAM J. Matrix Anal. Appl., 20 (1998), pp. 373–399.
  • [9] G. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, J. Soc. Indust. Appl. Math. Ser. B Numer. Anal., 2 (1965), pp. 205–224.
  • [10] G. Golub and C. F. van Loan, Matrix Computations, 4th ed., The John Hopkins University Press, Baltimore, 2012.
  • [11] M. E. Hochstenbach, Harmonic and refined extraction methods for the singular value problem, with applications in least squares problems, BIT Numer. Math., 44 (2004), pp. 721–754.
  • [12] M. Hǎrǎguş and T. Kapitula, On the spectra of periodic waves for infinite-dimensional Hamiltonian systems, Physica D, 237 (2008), pp. 2649–2671.
  • [13] Z. Jia, Regularization properties of LSQR for linear discrete ill-posed problems in the multiple singular value case and best, near best and general low rank approximations, Inverse Probl., 36 (2020), 085009 (38pp).
  • [14] Z. Jia and D. Niu, An implicitly restarted refined bidiagonalization Lanczos method for computing a partial singular value decomposition, SIAM J. Matrix Anal. Appl., 25 (2003), pp. 246–265.
  • [15] Z. Jia and D. Niu, A refined harmonic Lanczos bidiagonalization method and an implicitly restarted algorithm for computing the smallest singular triplets of large matrices, SIAM J. Sci. Comput., 32 (2010), pp. 714–744.
  • [16] Z. Jia and G. W. Stewart, An analysis of the Rayleigh–Ritz method for approximating eigenspaces, Math. Comput., 70 (2001), pp. 637–647.
  • [17] R. M. Larsen, Lanczos bidiagonalization with partial reorthogonalization, PhD thesis, University of Aarhus, 1998.
  • [18] R. M. Larsen, Combining implicit restarts and partial reorthogonalization in Lanczos bidiagonalization, Technical Report, SCCM, Stanford University, 2001.
  • [19] V. Mehrmann, The Autonomous Linear Quadratic Control Problem: Theory and Numerical Solution, vol. 163, Springer, Heidelberg, 1991.
  • [20] V. Mehrmann, C. Schröder, and V. Simoncini, An implicitly-restarted Krylov subspace method for real symmetric/skew-symmetric eigenproblems, Linear Algebra Appl., 436 (2012), pp. 4070–4087.
  • [21] V. Mehrmann and D. Watkins, Polynomial eigenvalue problems with Hamiltonian structure, Electron. Trans. Numer. Anal., 13 (2002), pp. 106–118.
  • [22] J.-M. Mencik and D. Duhamel, A wave-based model reduction technique for the description of the dynamic behavior of periodic structures involving arbitrary-shaped substructures and large-sized finite element models, Finite Elem. Anal. Des., 101 (2015), pp. 1–14.
  • [23] M. Paardekooper, An eigenvalue algorithm for skew-symmetric matrices, Numer. Math., 17 (1971), pp. 189–202.
  • [24] C. C. Paige and M. A. Saunders, LSQR: an algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Soft., 8 (1982), pp. 43–71.
  • [25] B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, Philadelphia, 1998.
  • [26] C. Penke, A. Marek, C. Vorwerk, C. Draxl, and P. Benner, High performance solution of skew-symmetric eigenvalue problems with applications in solving the Bethe-Salpeter eigenvalue problem, Parallel Comput., 96 (2020), p. 102639.
  • [27] Y. Saad, On the rates of convergence of the Lanczos and the block-Lanczos methods, SIAM J. Numer. Anal., 17 (1980), pp. 687–706.
  • [28] Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, Philadelphia, 2003.
  • [29] Y. Saad, Numerical Methods for Large Eigenvalue Problems: revised edition, SIAM, Philadelphia, 2011.
  • [30] H. D. Simon, Analysis of the symmetric Lanczos algorithm with reorthogonalization methods, Linear Algebra Appl., 61 (1984), pp. 101–131.
  • [31] D. C. Sorensen, Implicit application of polynomial filters in a k-step Arnoldi method, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 357–385.
  • [32] G. W. Stewart, Matrix Algorithms I: Basic Decompositions, SIAM, Philadelphia, 1998.
  • [33] G. W. Stewart, Matrix Algorithms II: Eigensystems, SIAM, Philadelphia, 2001.
  • [34] M. Stoll, A Krylov–Schur approach to the truncated SVD, Linear Algebra Appl., 436 (2012), pp. 2795–2806.
  • [35] R. C. Ward and L. J. Gray, Eigensystem computation for skew-symmetric and a class of symmetric matrices, ACM Trans. Math. Software, 4 (1978), pp. 278–285.
  • [36] M. Wimmer, Algorithm 923: Efficient numerical computation of the pfaffian for dense and banded skew-symmetric matrices, ACM Trans. Math. Software, 38 (2012), pp. 1–17.
  • [37] W.-Y. Yan and J. Lam, An approximate approach to H2H^{2} optimal model reduction, IEEE Trans. Automat. Control, 44 (1999), pp. 1341–1358.
  • [38] W. X. Zhong and F. W. Williams, On the direct solution of wave propagation for repetitive structures, J. Sound Vibr., 181 (1995), pp. 485–501.
  • [39] K. Zhou, J. C. Doyle, K. Glover, et al., Robust and Optimal Control, Prentice Hall, New Jersey, 1996.