Abstract
This paper establishes a variant of Stewart’s theorem [Theorem 6.4 of Stewart, SIAM Rev., 15:727–764, 1973]
for the singular subspaces associated with the SVD of a matrix
subject to perturbations. Stewart’s original version uses both the Frobenius and spectral norms,
whereas the new variant uses the spectral norm and any unitarily invariant norm that offer choices
per convenience of particular applications and lead to sharper bounds
than that straightforwardly derived from Stewart’s original theorem with the help of the well-known equivalence inequalities
between matrix norms. Of interest in their own right, bounds on
the solution to two couple Sylvester equations are established for a few different circumstances.
Keywords: SVD, perturbation, singular subspaces, coupled Sylvester equations
Mathematics Subject Classification 65F15, 15A18
1 Introduction
In [15], Stewart established a perturbation theory
for the singular subspaces associated with the SVD of
a matrix slightly perturbed. In the same paper he also investigated
the eigenspace of a matrix and the deflating subspace of a regular matrix pencil subject to perturbations.
But in this paper, we limit our scope
to one of his theorems, namely, [15, Theorem 6.4] on SVD, where both the Frobenius and spectral norms are used to measure perturbations.
Our goal is to establish a more general and yet sharper version of [15, Theorem 6.4] in the spectral norm and any unitarily invariant norm.
Our new version
offers flexibility in its applications, for example, the version with the unitarily invariant norm also set to the spectral norm
is more convenient to use in our recent work [18] for
analyzing the quality of a reduced order model by approximate balanced truncation [1, 2].
Even in the case of using both the Frobenius and spectral norms, our results are slightly better than
Stewart’s [15, Theorem 6.4] in terms of a less stringent condition and yet a sharper bound.
Given ,
let
|
|
|
(1.1a) |
be two unitary matrices such that admits the decomposition |
|
|
|
(1.1b) |
where .
For example, this can be the SVD of , for which
is diagonal with nonnegative diagonal entries as some of the singular values of and
is leading diagonal by which we mean that only its entries along the main diagonal line
starting at the top-left corner may be nonzero and nonnegative and they are some of the singular values of , too.
By convention, has singular values arranged in decreasing order:
|
|
|
(1.2) |
Denote
the singular value set and its extended set of by
|
|
|
(1.3) |
where the union is meant to be the multiset union.
Consider now that is perturbed to , and partition
|
|
|
(1.4) |
Naturally, one would ask whether admits a decomposition that is “close” to the one for in (1.1).
Stewart [15, Theorem 6.4] provided an answer to that by
seeking orthogonal matrices [15, p.760]
|
|
|
|
|
|
(1.5c) |
such that
|
|
|
(1.6) |
Theorem 1.1 ([15, Theorem 6.4]).
Given , let be decomposed as in (1.1), and partition
according to (1.4). Let
|
|
|
|
|
(1.7a) |
|
|
|
|
(1.7b) |
|
|
|
|
(1.7c) |
where and denote the spectral and Frobenius norms, respectively.
If
|
|
|
(1.8) |
then there exist
and satisfying
|
|
|
(1.9) |
such that (1.6) with (1.5) holds.
A number of results can be deduced from this informative theorem, e.g.,
bounds on and , the explicit expressions of
and in terms of and , , , and , and also the singular values of
is the multiset union of the singular values of and .
Although the spectral norm is used in defining ,
the Frobenius norm is in full display in defining and in bounding and .
A straightforward version of it, using only the spectral norm can be easily derived in light of the equivalency inequalities
|
|
|
For example, we can conclude that
if and
|
|
|
(1.10) |
then there exist
and satisfying
|
|
|
(1.11) |
such that (1.6) with (1.5) holds.
There are two clear drawbacks of such a straightforward version:
1) a much stronger condition in (1.10) than something like
|
|
|
(1.12) |
as one might possibly expect, and 2) a much weaker
conclusion in (1.11) than
something like
|
|
|
(1.13) |
also as one might possibly expect. The appearances of and in (1.10)
are particularly unsatisfactory for large (huge) and usually modest .
Our goal in this paper is to create a version of Theorem 1.1 that uses
the spectral norm and any unitarily invariant norm. Our eventual version, Theorem 3.1,
when restricted exclusively to the spectral norm, does not exactly coincide with the possible expectations in
(1.12) and (1.13), but is very much close to it, namely
(1.12) and (1.13) with
and replaced by
and , respectively.
In particular, and disappear altogether.
The rest of this paper is organized as follows. Section 2 states two preliminary results for later use in our proofs.
Our main result, a variant of Stewart’s, is stated and proved in Section 3.
In Section 4, we compare our main result realized for the spectral norm only and for both the spectral
and Frobenius norm with Stewart’s result in Theorem 1.1 and its potential consequences
in (1.10) and (1.11).
Finally, we draw our conclusions in Section 5.
Notation.
is the set
of all complex matrices, ,
and .
(or simply if its dimension is
clear from the context) is the identity matrix.
The superscript is the complex conjugate transpose of a matrix or vector .
We shall also adopt MATLAB-like convention to
access the entries of vectors and matrices.
Let be the set of integers from to inclusive.
, , and are submatrices of ,
consisting of intersections of row to row and column to column ,
row to row , and column to column , respectively.
is the column space of , i.e., the subspace spanned by the columns of .
Finally is the spectrum of a square matrix .
We will continue to adopt the notation introduced so far in this section such as and .
In particular, has singular values denoted by in decreasing order as in
(1.2), and accordingly two singular value sets and in (1.3), and
and .
2 Preliminaries
In this section, we will state four lemmas that we will need later.
The first lemma, Lemma 2.1, is due to Stewart [14, 15], and
the second one, Lemma 2.2, summarizes known bounds on the solution to the Sylvester equation with Hermitian coefficient matrices
of Davis and Kahan [6] and of Bhatia, Davis, and McIntosh [4].
The third one, Lemma 2.3, establishes
bounds on the solution pair to a set of the coupled Sylvester equations and the results are new, except the one for
the Frobenius norm. Finally, the fourth lemma, Lemma 2.4, is likely known.
Lemma 2.1 ([14, Theorem 3.5], [15, Theorem 3.1]).
Let be a bounded linear operator on the Banach space that
has a bounded inverse. Let be a continuous function on
that satisfies, for ,
|
(i) |
|
|
|
(ii) |
|
|
for some . Let
Given , if
|
|
|
then equation
has a solution that satisfies
|
|
|
We need the notation of unitarily invariant norm to go forward. A matrix norm is called a unitarily invariant norm on if it is a matrix norm
and has the following two additional properties [3, 16]:
-
1.
for all unitary matrices and
and ;
-
2.
, the spectral norm of , if .
Two commonly used unitarily invariant norms are
where are the singular values of .
In what follows, denotes a general unitarily invariant norm.
In this article, for convenience, any
we use is generic to matrix sizes in the sense that it applies to matrices of all sizes.
Examples include the matrix spectral norm , the Frobenius norm , and the trace norm. One important property
of unitarily invariant norms is
|
|
|
for any matrices , , and of compatible sizes.
The next lemma summarizes known bounds on the solution of the Sylvester equation . These bounds have played important roles
in eigenspace variations as demonstrated by Davis and Kahan [6, 1970], and Bhatia, Davis, and McIntosh [4, 1983].
For a brief review, see [10].
Lemma 2.2.
Consider matrix equation , where and are Hermitian, and .
If , then the equation has a unique solution . Furthermore, the following statements
hold.
-
(a)
[6] we have
|
|
|
(2.1) |
-
(b)
[6] if there exist and such that
|
|
|
(2.2) |
then for any unitarily invariant norm ,
|
|
|
(2.3) |
-
(c)
[4, 5] we have
|
|
|
(2.4) |
where is as in (2.1).
The results of Lemma 2.2 have an alternative interpretation through a linear operator
|
|
|
(2.5) |
endowed with certain unitarily invariant norm on , including
the spectral and Frobenius norms as special ones, where and are Hermitian.
With any given , there is an induced operator norm on the linear operator from to itself.
Translating the results of Lemma 2.2 yields the following corollary.
Corollary 2.1.
Let and be Hermitian, and define linear operator
on as in (2.5).
If , then is invertible, and, furthermore, the following statements
hold.
-
(i)
With and as in (2.1), we have
;
-
(ii)
With general and assuming (2.2),
then ,
where is the largest for some and
subject to (2.2);
-
(iii)
With general and as in (2.1),
we have
.
Proof.
To see these, we note that . Hence,
immediately it follows that for items (i) and (ii) and
for item (iii). The equality sign in items (i) and (ii)
are achieved by letting , where are unit
eigenvectors of and , respectively, such that
and , assuming , and hence
|
|
|
Also because consists of one nonzero singular value and some copies of zeros,
|
|
|
implying for item (i) or item (ii).
∎
For subspace variation associated with SVD, a set of two coupled Sylvester equations come into play:
|
|
|
(2.6) |
where , , , and .
Note that is possibly a nonsquare matrix, in which case we pad some blocks of zeros to
, , and or to yield an equivalent set of two coupled Sylvester equations with being square.
Specifically, if , let
|
|
|
whereas if , let
|
|
|
Then (2.6) is equivalent to
|
|
|
(2.7) |
which can be merged into one Sylvester equation
|
|
|
(2.8) |
It can be seen that
|
|
|
(2.9) |
where negating a set means negating each element of the set, and
|
|
|
(2.10) |
for any unitary invariant norm. Apply Lemma 2.2 to get
Lemma 2.3.
Consider a set of two coupled Sylvester equations (2.6), where , , , and .
If , then the set of equations has a unique solution pair
. Furthermore, the following statements
hold:
-
(a)
we have [15]
|
|
|
(2.11) |
-
(b)
if ,
then for any unitarily invariant norm ,
|
|
|
|
|
(2.12a) |
|
|
|
|
(2.12b) |
and in particular for the spectral norm
|
|
|
(2.13) |
-
(c)
we have
|
|
|
|
|
(2.14a) |
|
|
|
|
(2.14b) |
where is as in (2.11),
and in particular for the spectral norm
|
|
|
(2.15) |
Proof.
Recall (2.9) and (2.10).
The inequality in (2.11) is essentially Stewart’s [15, Theorem 6.2], but here as a corollary
of Lemma 2.2 applied to (2.8).
Inequalities (2.12a) and (2.14a)
are also corollaries
of Lemma 2.2 applied to (2.8).
Inequalities (2.13) and (2.15) follow
from (2.12) and (2.14a), respectively, due to
|
|
|
(2.16) |
It remains to show (2.12b) and (2.14b).
Inequality (2.12b) is essentially implied in [17, section 3], but we will present
a quick proof anyway.
Note that for the case
.
We have
|
|
|
|
|
|
|
|
|
|
|
|
|
(2.17a) |
Similarly, we can get |
|
|
|
(2.17b) |
There are two cases to consider. If , then by (2.17a) we get
|
|
|
(2.18a) |
if, on the other hand, , then by (2.17b) we get |
|
|
|
(2.18b) |
Together, (2.18a) and (2.18b) yield
|
|
|
as expected. Finally, noticing that
|
|
|
|
|
|
|
|
|
|
|
|
we see that (2.14a) implies (2.14b).
∎
Lemma 2.3 has more than what we need later. In fact, we will only use (2.13) and (2.15) in our later development.
The results of Lemma 2.3 have an alternative interpretation through a linear operator
|
|
|
(2.19) |
endowed with certain norm on to make it a Banach space, including (cf. those used in Lemma 2.3)
|
|
|
(2.20a) |
for any , where is any given unitarily invariant norm. Two particular ones are |
|
|
|
(2.20b) |
upon realizing in (2.20a) as the spectral norm or the Frobenius norm.
Another possible endowed norm on is |
|
|
|
(2.20c) |
With each endowed norm, there is an induced operator norm on the linear operator from to itself.
Translating the results of Lemma 2.3 yields the following corollary.
Corollary 2.2.
Let , , , and , and define
linear operator
on as in (2.5) where
is given by one of those in (2.20).
If , then is invertible, and, furthermore, the following statements
hold.
-
(i)
With and as in (2.11), we have
[15];
-
(ii)
With either (2.20a) or (2.20c),
if ,
then ;
-
(iii)
With (2.20a) and as in (2.11), we have
;
-
(iv)
With (2.20c) and as in (2.11), we have
.
Proof.
To see these, we note that . Hence,
immediately it follows that for items (i) and (ii),
for item (iii), and for item (iv).
The equality sign in items (i) and (ii)
are achieved by letting and , where are unit
singular vectors of and , respectively such that
and , and hence
|
|
|
Also because , , , and all consist of one nonzero singular value and some copies of zeros,
|
|
|
|
|
|
|
|
|
implying with the respective norm on as specified in item (i) or item (ii).
∎
The next lemma is likely known. We state it here with a proof for self-containedness.
Lemma 2.4.
Given and an integer , we have
|
|
|
|
|
|
|
|
Proof.
Partition as
|
|
|
and then we have
|
|
|
It follows from Fischer’s minimax principle for the symmetric eigenvalue problem
(see, e.g., [12, eq. (1.3)], [13, p.206], [16, p.201])
that
|
|
|
|
|
|
|
|
|
|
|
|
(2.21) |
where denotes a subspace of , is the th largest eigenvalue of , and the first inequality is due to limiting
the subspace to the one composed of vectors with their last entries being to .
This proves .
Next, using , in the same way as in (2.21), we can get
. This completes the proof of the inequalities
for .
Analogously, again by Fischer’s minimax principle, we have
|
|
|
|
|
|
|
|
|
|
|
|
(2.22) |
where is the st largest eigenvalue of , and the first inequality is due to limiting
the subspace to the one composed of vectors with their first entries being to .
Next, using , in the same way as in (2.22), we can get
.
∎
3 Main Result
In order to achieve (1.6), we need
some and to satisfy
|
|
|
|
|
(3.1a) |
|
|
|
|
(3.1b) |
obtained from setting off-diagonal blocks of , partitioned accordingly, to .
In what follows, we will use Lemma 2.1 to prove the existence of a solution pair
to (3.1)
with an upper bound under certain conditions.
Keeping in mind that our goal is to create a variant
of Theorem 1.1 in a unitarily invariant norm and the spectral norm, and hopefully the variant for the special case of
the Frobenius norm is better than Theorem 1.1 in terms of both weaker conditions and stronger results.
In the setting of Lemma 2.1, we will use the Banach space
|
|
|
endowed with one of the two norms: for ,
|
|
|
|
|
(3.2a) |
|
|
|
|
(3.2b) |
The two endowed norms become one for the case , the spectral norm, but otherwise are different.
The linear operator is given by
|
|
|
(3.3) |
and the continuous function is
|
|
|
(3.4) |
It is not hard to verify that defined in (3.2) is indeed a norm on .
Compactly, the two equations in (3.1) can be merged into one to take the form
|
|
|
(3.5) |
It remains to verify the conditions of Lemma 2.1 for and we just defined.
This is done in the next two lemmas.
Let
|
|
|
|
|
(3.6a) |
|
|
|
|
(3.6b) |
|
|
|
|
(3.6c) |
Lemma 3.1.
Suppose . Dependent of different cases, we have for any
|
|
|
(3.7) |
which implies , where is as in (3.6b), and
-
(i)
with the endowed norm in (3.2a),
|
|
|
(3.8) |
-
(ii)
with the endowed norm in (3.2b),
|
|
|
(3.9) |
Proof.
Notice that consists of a set of two coupled Sylvester equations in (2.6)
with
|
|
|
and and correspond to and there, respectively. We claim that
|
|
|
(3.10) |
where achieve the minimum.
By Mirsky’s theorem [16, p.204], there are such that
and . Hence
|
|
|
|
|
|
|
|
|
|
|
|
yielding (3.10).
If , then
, and hence
|
|
|
|
|
|
|
|
Keep in mind that
|
|
|
by Mirsky’s theorem [16, p.204], and hence
|
|
|
This lemma is a consequence of Lemma 2.3.
∎
In what follows, our representation will assume that one of the endowed norm
in (3.2) is selected and fixed, and, along with it, the part of Lemma 3.1, unless explicitly stated otherwise.
Lemma 3.2.
For the continuous function defined in (3.4),
we have
|
(i) |
|
|
|
(ii) |
|
|
where is as in (3.6c).
Proof.
With (3.2a), we have
|
|
|
|
|
|
|
|
|
|
|
|
(3.11) |
with (3.2b), we have
|
|
|
|
|
|
|
|
|
|
|
|
(3.12) |
This proves the inequality in item (i). For item (ii), we note
|
|
|
For each of the two components, we have
|
|
|
|
|
(3.13a) |
|
|
|
|
(3.13b) |
With (3.2a), we get, similar to the derivation in (3.11),
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
with (3.2b), we get, similar to the derivation in (3.12),
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
completing the proof of item (ii).
∎
Next we apply Lemma 2.1 to ensure a particular solution to
(3.5).
Lemma 3.3.
Let , , and be defined as in (3.6), and suppose
the conditions in Lemma 3.1 that ensure (3.7) with constant as
specified there.
If
|
|
|
(3.14) |
then (3.5) has a solution that satisfies
|
|
|
(3.15) |
Here and are understood as the same one of the endowed norms in (3.2) under consideration.
Proof.
In light of Lemmas 3.1 and 3.2, we find the conclusion is a straightforward consequence of Lemma 2.1
with , , and .
∎
Finally, we state the main result of this section.
Theorem 3.1.
Given , let be decomposed as in (1.1), and partition
according to (1.4). Let ,
, and be defined as in (3.6). If (3.14) is satisfied, then the following statements hold:
-
(a)
there exists a solution
to (3.1),
satisfying (3.15);
-
(b)
admits the decomposition (1.6),
and the singular values of is the multiset union of those of
|
|
|
|
|
|
|
|
|
(3.16a) |
|
|
|
|
(3.16b) |
and
|
|
|
|
|
|
|
|
|
(3.17a) |
|
|
|
|
(3.17b) |
-
(c)
We have
|
|
|
|
(3.18a) |
|
|
|
(3.18b) |
where and
are the smallest singular value of and the largest singular value of ,
respectively;
-
(d)
The left and right singular subspaces of associated with the part of its singular values
are spanned by the columns of
|
|
|
|
|
(3.19a) |
|
|
|
|
(3.19b) |
respectively. In particular,
|
|
|
|
|
(3.20a) |
|
|
|
|
(3.20b) |
Proof.
Only item (c) and the inequalities in (3.20) of item (d) need proofs. For item (c),
using (3.16a) for and (3.16b) for in
below,
we get
|
|
|
Hence
|
|
|
where is the smallest eigenvalues of a Hermitian matrix.
Therefore, with the help of (3.15), we have
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
(3.21) |
yielding the first inequality in (3.18). Similarly, we get the second inequality there by
using (3.17).
Now we prove the inequalities in (3.20). We have
|
|
|
|
|
|
|
|
Let be the SVD of . We find
|
|
|
where for the middle matrix on the right, is diagonal and
is leading diagonal. Hence the singular values of the middle matrix are given by: for each singular value of ,
|
|
|
|
(3.22) |
|
|
|
|
(3.23) |
|
|
|
|
Therefore, we
get
|
|
|
yielding (3.20a) in light of (3.15).
Similarly, we have (3.20b).
∎
The lower and upper bound on and , respectively, in
(3.18), although always true, do not provide useful information, unless also
, in which case it can be used to establish a sufficient condition
to ensure .
Corollary 3.1.
Given , let be decomposed as in (1.1), and partition
according to (1.4). Let ,
, and be defined as in (3.6), and suppose
. If (3.14) with is satisfied, then
the top singular values of
are exactly the singular values of , and
|
|
|
(3.24) |
Proof.
We have all conclusions of Theorem 3.1.
By (3.18), we have
|
|
|
which says that the top singular values of
are exactly the singular values of . As a result, applying Lemma 2.4 to (1.4),
we have
|
|
|
(3.25) |
where the last inequality in (3.25) is a consequence of Mirsky’s theorem [16, p.204].
This proves the first inequality in (3.24).
It can be seen that
|
|
|
and hence by [7, Theorem 3] combining with (3.25), we get
|
|
|
(3.26) |
yielding the second inequality in (3.24).
∎
The sharper lower bound on in (3.24) than the one by
the first inequality in (3.18) is only made possible by first establishing that
the top singular values of are exactly the singular values of , which
relies on the first inequality in (3.18) for a proof, however.
More than (3.26) which is just for the smallest singular value only, [7, Theorem 3] yields
|
|
|
under the conditions of Corollary 3.1.
4 Discussions
Theorem 3.1 serves the same purpose as Stewart’s [15, Theorem 6.4], but the former
provides a variety of choices of norms for the Banach space and, as a consequence, a number of results dependent of
circumstances to use.
Let us begin by realizing Theorem 3.1 and Corollary 3.1 with unitarily invariant norm
being set to the Frobenius norm, in order to compare with Theorem 1.1 [15, Theorem 6.4].
We have two endowed norms from (3.2) to choose:
|
|
|
We have the following corollary to Theorem 3.1, where we state the conditions and the bounds on
but omit the others that can be deduced from the bounds on .
Corollary 4.1.
Given , let be decomposed as in (1.1), and partition
according to (1.4). Let ,
, and be defined as in (3.6).
-
(i)
If
|
|
|
(4.1) |
then (3.5) has a solution that satisfies
|
|
|
|
|
|
|
|
(4.2) |
-
(ii)
If and if
|
|
|
(4.3) |
then (3.5) has a solution that satisfies
|
|
|
|
|
|
|
|
(4.4) |
In comparing Corollary 4.1 with Theorem 1.1 [15, Theorem 6.4], we note
Corollary 4.1(i) provides better results than Theorem 1.1 in their conditions:
(4.1) vs. (1.8), and bounds: (4.2) vs. (1.9),
because
|
|
|
|
|
|
|
|
Such an improvement of Corollary 4.1(i) over Theorem 1.1 could be considered marginal because it can be easily recovered
by just refining Stewart’s relevant estimates [15]. However, the improvement by Corollary 4.1(ii)
over Theorem 1.1, under the condition , unlikely can be achieved
by simply refining Stewart’s arguments there.
Our original motivation to revisit this classical result of Stewart’s is the need for a version of Theorem 1.1
in the spectral norm while we are working on an error analysis for model order reduction by balanced truncation
[18]. Let us look at what Theorem 3.1 leads to for the spectral norm, for which the
two endowed norms from (3.2) collapse to the same one
|
|
|
We have the following corollary to Theorem 3.1, which yields far sharper results than those such as
(1.11) that would otherwise have to be derived from Theorem 1.1.
Corollary 4.2.
Given , let be decomposed as in (1.1), and partition
according to (1.4). Let ,
, and be defined as in (3.6).
-
(i)
If
|
|
|
(4.5) |
then (3.5) has a solution that satisfies
|
|
|
(4.6) |
-
(ii)
If and if
|
|
|
(4.7) |
then (3.5) has a solution that satisfies
|
|
|
(4.8) |
During the proof of Lemma 2.3, we commented that inequality (2.12b) had been really implied
by Wedin [17, section 3], and inequality (2.11) may also be essentially implied in [17] but with
his
defined as
|
|
|
(4.9) |
which is the same as the one in (2.11) for the case because then ,
but can be different if for which case that may or may not contain , however.
Both (2.12b) and (2.11) are used to develop the generalized theorems for SVD there
(see, also, [11, p.21-7]). In the same spirit, we can use (2.12a)
and (2.14) to establish more generalized theorems for SVD. In fact, we have the following theorem.
Theorem 4.1.
Let be decomposed as in (1.1), and let admit
a decomposition in the same form as in (1.1), except with tildes on all symbols. Let
|
|
|
If
|
|
|
then
|
|
|
where
in general, but if also or for the Frobenius norm.
Here is the diagonal matrix of the canonical angles between the subspaces
and [11, p.21-2], [16].
The case for is not new and Stewart and Sun [16, p.260] credited it to Wedin [17]
with a slightly
different (similar to (4.9) we commented moments ago). However, Wedin [17] did not
explicitly mention it for the case. Evidently, Wedin could easily had it because of the machinery he had already built in the paper.
It can be seen that
|
|
|
Or, equivalently,
|
|
|
which takes the form of the coupled Sylvester equations (2.6) with and .
Noticing that the singular values of and those of are the same as the sines of the canonical angles
between and and between and , respectively, and hence [16, 8, 9]
|
|
|
and noticing
|
|
|
the conclusion in the theorem is a simple consequence of Lemma 2.3.
∎