Quality of Approximate Balanced Truncation
Abstract
Model reduction is a powerful tool in dealing with numerical simulation of large scale dynamic systems for studying complex physical systems. Two major types of model reduction methods for linear time-invariant dynamic systems are Krylov subspace-based methods and balanced truncation-based methods. The methods of the second type are much more theoretically sound than the first type in that there is a fairly tight global error bound on the approximation error between the original system and the reduced one. It is noted that the error bound is established based upon the availability of the exact controllability and observability Gramians. However, numerically, the Gramians are not available and have to be numerically calculated, and for a large scale system, a viable option is to compute low-rank approximations of the Gramians from which an approximate balanced truncation is then performed. Hence, rigorously speaking, the existing global error bound is not applicable to any reduced system obtained via approximate Gramians. The goal of this paper is to address this issue by establishing global error bounds for reduced systems via approximate balanced truncation.
Keywords: model reduction, balanced truncation, transfer function, controllability Gramian, observability Gramian, Hankel singular value, low-rank approximation, error bound
Mathematics Subject Classification 78M34, 93A15, 93B40
1 Introduction
Model reduction is a powerful tool in dealing with numerical simulation of large scale dynamic systems for studying complex physical systems [2, 11, 17]. In this paper, we are interested in the following continuous linear time-invariant dynamic system
(1.1a) $\dot{x}(t)=A\,x(t)+B\,u(t),$
(1.1b) $y(t)=C\,x(t),$
where $x(t)\in\mathbb{R}^n$ is the state vector, $u(t)\in\mathbb{R}^m$ is the input, $y(t)\in\mathbb{R}^p$ is the output, and $A\in\mathbb{R}^{n\times n}$, $B\in\mathbb{R}^{n\times m}$, and $C\in\mathbb{R}^{p\times n}$ are constant matrices that define the dynamic system. In today’s applications of interest, such as very large scale integration (VLSI) circuit designs and structural dynamics, $n$ can be up to millions [3, 5, 11], but usually the dimensions of the input and output vectors are much smaller, i.e., $m,\,p\ll n$. Large $n$ can be an obstacle in practice both computationally and in memory usage. Model reduction is then called for to overcome the obstacle.
In a nutshell, model reduction for dynamic system (1.1) seeks two matrices $V,\,W\in\mathbb{R}^{n\times r}$ with $r\ll n$ such that $W^{\mathrm T}V=I_r$ to reduce the system (1.1) to
(1.2a) $\dot{\tilde x}(t)=\tilde A\,\tilde x(t)+\tilde B\,u(t),$
(1.2b) $\tilde y(t)=\tilde C\,\tilde x(t),$
where $\tilde A\in\mathbb{R}^{r\times r}$, $\tilde B\in\mathbb{R}^{r\times m}$, and $\tilde C\in\mathbb{R}^{p\times r}$ are given by
(1.3) $\tilde A=W^{\mathrm T}AV,\qquad \tilde B=W^{\mathrm T}B,\qquad \tilde C=CV.$
Intuitively, this reduced system (1.2) may be thought of as obtained from (1.1) by letting $x(t)\approx V\tilde x(t)$ and performing a Petrov-Galerkin projection with $W$. The new state vector $\tilde x(t)$ is now in $\mathbb{R}^r$, a space of much smaller dimension than $\mathbb{R}^n$. In practice, for the reduced system to be of any use, the two systems must be “close” in some sense.
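As an illustration, here is a minimal sketch of this projection step in Python with numpy; the language and the function name are our choices, not the paper's.

```python
import numpy as np

def project_reduce(A, B, C, W, V):
    """Petrov-Galerkin reduction (1.3), a minimal sketch.

    Assumes the bi-orthogonality W.T @ V = I_r, with W and V of size n-by-r.
    """
    At = W.T @ A @ V   # r-by-r reduced state matrix
    Bt = W.T @ B       # r-by-m reduced input matrix
    Ct = C @ V         # p-by-r reduced output matrix
    return At, Bt, Ct
```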
Different model reduction methods differ in their choice of the projection matrices $V$ and $W$. There are two major types: Krylov subspace-based methods [5, 11, 17] and balanced truncation-based methods [2, 12]. The methods of the first type are computationally more efficient for large scale systems, and the reduced models are accurate around the points where the Krylov subspaces are built, while those of the second type are theoretically sound in that fairly tight global error bounds are known, but numerically much more expensive in that the controllability and observability Gramians, which are provably positive definite, have to be computed at a cost of $O(n^3)$ complexity.
Modern balanced truncation-based methods have improved, thanks to the discovery that the Gramians are usually numerically low-rank [4, 20, 21] and to methods that compute their low-rank approximations in factor form [15, 9]. The low-rank factors are then naturally used to compute an approximate balanced truncation. We just pointed out the advantage of balanced truncation-based methods in their sound global approximations guaranteed by tight global error bounds, but these bounds are established based on exact Gramians, and hence the existing global error bounds, though suggestive, are no longer valid. To the best of our knowledge, there is no study as to the quality of reduced models by modern balanced truncation-based methods that use the low-rank approximate Gramians. Our aim in this paper is to fill the void.
The rest of this paper is organized as follows. Section 2 reviews the basics of balanced truncation methods. Section 3 explains approximate balanced truncation, when some low-rank approximations of the controllability and observability Gramians, not the exact Gramians themselves, are available. In Section 4, we establish our main results to quantify the accuracy of the reduced model by approximate balanced reduction. Section 5 draws our conclusions and makes some remarks. Some preliminary material on subspaces of $\mathbb{R}^n$ and on perturbation of the Lyapunov equation is discussed in the appendices.
Notation. $\mathbb{R}^{m\times n}$ is the set of $m\times n$ real matrices, $\mathbb{R}^m=\mathbb{R}^{m\times 1}$, and $\mathbb{R}=\mathbb{R}^1$. $I_n$ is the $n\times n$ identity matrix, or simply $I$ if its size is clear from the context, and $e_j$ is the $j$th column of $I$ of apt size. $X^{\mathrm T}$ stands for the transpose of a matrix/vector $X$. $\mathcal{R}(X)$ is the column subspace of $X$, spanned by its columns. For $X\in\mathbb{R}^{m\times n}$, its singular values are
$\sigma_1(X)\ge\sigma_2(X)\ge\cdots\ge\sigma_{\min(m,n)}(X)\ge 0,$
where $\sigma_{\max}(X)=\sigma_1(X)$ and $\sigma_{\min}(X)=\sigma_{\min(m,n)}(X)$. $\|X\|_2$ and $\|X\|_{\mathrm F}$ are its spectral and Frobenius norms:
$\|X\|_2=\sigma_1(X),\qquad \|X\|_{\mathrm F}=\Big(\sum\nolimits_i\sigma_i(X)^2\Big)^{1/2},$
respectively. $\|X\|$ is some unitarily invariant norm of $X$ [18, 23]. For a matrix $X$ that is known to have real eigenvalues only, $\lambda_j(X)$ denotes the set of its eigenvalues (counted by multiplicities) arranged in decreasing order, $\lambda_1(X)\ge\cdots\ge\lambda_n(X)$, and $\lambda_{\max}(X)=\lambda_1(X)$ and $\lambda_{\min}(X)=\lambda_n(X)$. $X\succ 0$ ($X\succeq 0$) means that it is symmetric and positive definite (semi-definite), and accordingly $X\succ Y$ ($X\succeq Y$) if $X-Y\succ 0$ ($X-Y\succeq 0$). MATLAB-like notation is used to access the entries of a matrix: $X_{(i:j,k:\ell)}$ denotes the submatrix of a matrix $X$, consisting of the intersections of rows $i$ to $j$ and columns $k$ to $\ell$, and when $i:j$ is replaced by $:$, it means all rows; similarly for columns.
2 Review of balanced truncation
In this section, we review balanced truncation, minimally, to the point of serving our purpose in this paper. The reader is referred to [2] for a more detailed exposition.
Consider continuous linear time-invariant dynamic system (1.1) and suppose that it is stable, observable and controllable [1, 25].
2.1 Quality of a reduced order model
Suppose initially $x(0)=0$. Applying the Laplace transform to (1.1) yields
$Y(s)=C(sI-A)^{-1}B\,U(s)=:G(s)\,U(s),$
where $U(s)$ and $Y(s)$ are the Laplace transforms of $u(t)$ and $y(t)$, respectively, and $G(s)=C(sI-A)^{-1}B$ is the so-called transfer function of system (1.1). Conveniently, we will adopt the notation to denote the system (1.1) symbolically by
$G=\left(\begin{array}{c|c}A&B\\ \hline C& \end{array}\right),$
with the round bracket to distinguish it from the square bracket for matrices. The $\mathcal H_\infty$- and $\mathcal H_2$-norms of the system $G$, also known as the $\mathcal H_\infty$- and $\mathcal H_2$-norms of $G(s)$, are defined as
(2.1) $\|G\|_{\mathcal H_\infty}=\sup_{\omega\in\mathbb R}\big\|G(\mathrm i\omega)\big\|_2,\qquad \|G\|_{\mathcal H_2}=\Big(\frac{1}{2\pi}\int_{-\infty}^{+\infty}\big\|G(\mathrm i\omega)\big\|_{\mathrm F}^2\,\mathrm d\omega\Big)^{1/2},$
where $\|\cdot\|_2$ is the spectral norm of a matrix, and $\mathrm i=\sqrt{-1}$ is the imaginary unit.
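For concreteness, the following sketch evaluates $G(\mathrm i\omega)$ and estimates $\|G\|_{\mathcal H_\infty}$ by sampling the imaginary axis. The sampling grid is our crude choice; production codes use dedicated $\mathcal H_\infty$-norm algorithms instead of sampling.

```python
import numpy as np

def transfer_function(A, B, C, s):
    """Evaluate G(s) = C (sI - A)^{-1} B at a complex point s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B)

def hinf_norm_sampled(A, B, C, omegas):
    """Lower estimate of ||G||_Hinf from (2.1) by sampling s = i*omega."""
    return max(np.linalg.norm(transfer_function(A, B, C, 1j * w), 2)
               for w in omegas)
```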
In Section 1, we introduced the framework of model reduction with two matrices $V,\,W\in\mathbb{R}^{n\times r}$ such that $W^{\mathrm T}V=I_r$. For the ease of our presentation going forward, we shall rename them as $X_1=V$ and $Y_1=W$. Next we look for $X_2,\,Y_2\in\mathbb{R}^{n\times(n-r)}$ such that
(2.2) $Y_1^{\mathrm T}X_2=0,\qquad Y_2^{\mathrm T}X_1=0,\qquad Y_2^{\mathrm T}X_2=I_{n-r}.$
Such $X_2$ and $Y_2$ always exist by Lemmas A.1 and A.2 if $Y_1^{\mathrm T}X_1=I_r$. In any practical model reduction method, only $X_1$ and $Y_1$ need to be produced. That $X_2$ and $Y_2$ are introduced here is only for our analysis later. Denote by
(2.3) $X=[X_1,\,X_2],\qquad Y=[Y_1,\,Y_2],\qquad Y^{\mathrm T}X=I_n,$
which are consistent because of (2.2). To the original system (1.1), perform the transformation $x=X\hat x$, i.e., $\hat x=Y^{\mathrm T}x$, to get
(2.4a) $\dot{\hat x}(t)=\hat A\,\hat x(t)+\hat B\,u(t),$
(2.4b) $\hat y(t)=\hat C\,\hat x(t),$
where
(2.4c) $\hat A=Y^{\mathrm T}AX,\qquad \hat B=Y^{\mathrm T}B,\qquad \hat C=CX,$
naturally partitioned as
$\hat A=\begin{bmatrix}\hat A_{11}&\hat A_{12}\\ \hat A_{21}&\hat A_{22}\end{bmatrix},\qquad \hat B=\begin{bmatrix}\hat B_1\\ \hat B_2\end{bmatrix},\qquad \hat C=[\hat C_1,\,\hat C_2],$
with $\hat A_{11}\in\mathbb{R}^{r\times r}$, $\hat B_1\in\mathbb{R}^{r\times m}$, and $\hat C_1\in\mathbb{R}^{p\times r}$. In particular,
(2.5) $\hat A_{11}=Y_1^{\mathrm T}AX_1,\qquad \hat B_1=Y_1^{\mathrm T}B,\qquad \hat C_1=CX_1.$
One can verify that the transfer functions of (1.1) and (2.4) are exactly the same.
In the current notation, the reduced system (1.2) in Section 1 takes the form
(2.6a) $\dot{\tilde x}(t)=\hat A_{11}\tilde x(t)+\hat B_1u(t),$
(2.6b) $\tilde y(t)=\hat C_1\tilde x(t),$
which will be denoted in short by
$\tilde G=\left(\begin{array}{c|c}\hat A_{11}&\hat B_1\\ \hline \hat C_1& \end{array}\right).$
Its transfer function is given by
(2.7) $\tilde G(s)=\hat C_1(sI_r-\hat A_{11})^{-1}\hat B_1.$
Naturally, we would like the full system (2.4) and its reduced one (2.6) to be “close”. One way to measure the closeness is the $\mathcal H_\infty$-norm of the difference between the two transfer functions [2, 25]:
$\|G-\tilde G\|_{\mathcal H_\infty}=\sup_{\omega\in\mathbb R}\big\|G(\mathrm i\omega)-\tilde G(\mathrm i\omega)\big\|_2,$
assuming both systems are stable, observable and controllable [25]. Another way is by the $\mathcal H_2$-norm, which we will get to later. It turns out that $G(s)-\tilde G(s)$ is the transfer function of an expanded dynamic system:
(2.8a) $\begin{bmatrix}\dot x(t)\\ \dot{\tilde x}(t)\end{bmatrix}=\begin{bmatrix}A& \\ &\hat A_{11}\end{bmatrix}\begin{bmatrix}x(t)\\ \tilde x(t)\end{bmatrix}+\begin{bmatrix}B\\ \hat B_1\end{bmatrix}u(t),$
(2.8b) $y(t)-\tilde y(t)=[C,\,-\hat C_1]\begin{bmatrix}x(t)\\ \tilde x(t)\end{bmatrix},$
or in the short notation
$G-\tilde G=\left(\begin{array}{cc|c}A& &B\\ &\hat A_{11}&\hat B_1\\ \hline C&-\hat C_1& \end{array}\right).$
As far as the transfer function (2.7) is concerned, the key that really determines the quality of a reduced system is the pair of subspaces $\mathcal R(X_1)$ and $\mathcal R(Y_1)$, as guaranteed by the next theorem.
Theorem 2.1.
The transfer function $\tilde G(s)$ in (2.7) of the reduced system (2.6) depends only on the subspaces $\mathcal R(X_1)$ and $\mathcal R(Y_1)$, and not on the particular basis matrices $X_1$ and $Y_1$ satisfying $Y_1^{\mathrm T}X_1=I_r$.
Proof.
Fix a pair of basis matrices $X_1$ and $Y_1$ of $\mathcal R(X_1)$ and $\mathcal R(Y_1)$, respectively, such that $Y_1^{\mathrm T}X_1=I_r$. Consider any other two basis matrices $X_1'$ and $Y_1'$ of the same two subspaces, respectively, such that $(Y_1')^{\mathrm T}X_1'=I_r$. Then $X_1'=X_1M$ and $Y_1'=Y_1N$ for some nonsingular $M,\,N\in\mathbb{R}^{r\times r}$. We have
$I_r=(Y_1')^{\mathrm T}X_1'=N^{\mathrm T}Y_1^{\mathrm T}X_1M=N^{\mathrm T}M,$
implying $N^{\mathrm T}=M^{-1}$, and
$(Y_1')^{\mathrm T}AX_1'=M^{-1}\hat A_{11}M,\qquad (Y_1')^{\mathrm T}B=M^{-1}\hat B_1,\qquad CX_1'=\hat C_1M.$
The transfer function associated with $(X_1',Y_1')$ is
$\hat C_1M\big(sI_r-M^{-1}\hat A_{11}M\big)^{-1}M^{-1}\hat B_1=\hat C_1(sI_r-\hat A_{11})^{-1}\hat B_1=\tilde G(s),$
having nothing to do with $M$ and $N$, as was to be shown. ∎
2.2 Balanced truncation
Balanced truncation fits into the general framework of model reduction, and thus it suffices for us to define $X_1$ and $Y_1$ for balanced truncation accordingly.
The controllability and observability Gramians $P$ and $Q$ are defined as the solutions to the Lyapunov equations:
(2.9a) $AP+PA^{\mathrm T}+BB^{\mathrm T}=0,$
(2.9b) $A^{\mathrm T}Q+QA+C^{\mathrm T}C=0,$
respectively. Under the assumption that dynamic system (1.1) is stable, observable and controllable, the Lyapunov equations have unique solutions that are positive definite, i.e., $P\succ0$ and $Q\succ0$. The model order reduction based on balanced truncation [2, 10] starts with a balancing transformation of dynamic system (1.1) such that both Gramians become the same and diagonal, with diagonal entries being the system’s invariants, known as the Hankel singular values of the system.
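In Python, the Gramians of (2.9) can be computed with SciPy's dense Lyapunov solver; a sketch on our own random stable test matrices follows. Note that scipy.linalg.solve_continuous_lyapunov(A, Q) solves $AX+XA^{\mathrm H}=Q$, hence the negated right-hand sides.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(1)
n, m, p = 6, 2, 2
A = rng.standard_normal((n, n)) - 4 * np.eye(n)   # crude shift to make A stable
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

P = solve_continuous_lyapunov(A, -B @ B.T)    # (2.9a): A P + P A^T + B B^T = 0
Q = solve_continuous_lyapunov(A.T, -C.T @ C)  # (2.9b): A^T Q + Q A + C^T C = 0
print(np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0),
      np.all(np.linalg.eigvalsh((Q + Q.T) / 2) > 0))  # True True
```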
Balanced truncation is classically introduced in the literature through some full-rank decompositions of $P$ and $Q$:
(2.10) $P=L_PL_P^{\mathrm T},\qquad Q=L_QL_Q^{\mathrm T},$
where $L_P$ and $L_Q$ are nonsingular because $P\succ0$ and $Q\succ0$. But that is not necessary in theory, namely $L_P$ and $L_Q$ do not have to be square, in which case both will have no fewer than $n$ columns because the equalities in (2.10) ensure $\operatorname{rank}(L_P)=\operatorname{rank}(P)=n$ and $\operatorname{rank}(L_Q)=\operatorname{rank}(Q)=n$. Later in Theorem 2.3, we will show that balanced truncation is invariant with respect to how the decompositions in (2.10) are done, including non-square $L_P$ and $L_Q$. Such an invariance property is critical to our analysis.
Suppose that we have (2.10) with
(2.11) $L_P\in\mathbb{R}^{n\times s_P},\qquad L_Q\in\mathbb{R}^{n\times s_Q},\qquad s_P,\,s_Q\ge n.$
Without loss of generality, we may assume $s_P\le s_Q$. Let the SVD of $L_Q^{\mathrm T}L_P$ be
(2.12a) $L_Q^{\mathrm T}L_P=U\Sigma V^{\mathrm T},$
where
(2.12b) $U=[U_1,\,U_2]\in\mathbb{R}^{s_Q\times s_Q}$ and $V=[V_1,\,V_2]\in\mathbb{R}^{s_P\times s_P}$ are orthogonal, with $U_1$ and $V_1$ having $r$ columns each,
(2.12c) $\Sigma=\operatorname{diag}(\sigma_1,\sigma_2,\ldots,\sigma_{s_P})\in\mathbb{R}^{s_Q\times s_P},\qquad \sigma_1\ge\sigma_2\ge\cdots\ge\sigma_{s_P}\ge0.$
Only $\sigma_i$ for $1\le i\le n$ are positive and the rest are $0$. Those $\sigma_i$ for $1\le i\le n$ are the so-called Hankel singular values of the system; they are invariant with respect to different ways of decomposing $P$ and $Q$ in (2.10) with (2.11), and, in fact, they are the square roots of the eigenvalues of $PQ$, which are real and positive. To see this, we note $\sigma_i^2$ are the eigenvalues of $L_P^{\mathrm T}L_QL_Q^{\mathrm T}L_P=L_P^{\mathrm T}QL_P$, whose nonzero eigenvalues are the same as those of $QL_PL_P^{\mathrm T}=QP$.
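A quick sanity check of this fact, reusing the SciPy-based Gramian computation above (the random test system is ours):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n)) - 4 * np.eye(n)
B = rng.standard_normal((n, 2))
C = rng.standard_normal((2, n))
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)
Lp = np.linalg.cholesky((P + P.T) / 2)   # full-rank factors as in (2.10)
Lq = np.linalg.cholesky((Q + Q.T) / 2)
hsv_svd = np.linalg.svd(Lq.T @ Lp, compute_uv=False)
hsv_eig = np.sqrt(np.sort(np.linalg.eigvals(P @ Q).real)[::-1])
print(np.allclose(hsv_svd, hsv_eig))     # True: sigma_i = sqrt(lambda_i(PQ))
```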
Define
(2.13) $X=L_PV_{(:,1:n)}\Sigma_{(1:n,1:n)}^{-1/2},\qquad Y=L_QU_{(:,1:n)}\Sigma_{(1:n,1:n)}^{-1/2}.$
It can be verified that $Y^{\mathrm T}X=I_n$ because
$Y^{\mathrm T}X=\Sigma_{(1:n,1:n)}^{-1/2}U_{(:,1:n)}^{\mathrm T}\big(L_Q^{\mathrm T}L_P\big)V_{(:,1:n)}\Sigma_{(1:n,1:n)}^{-1/2}=\Sigma_{(1:n,1:n)}^{-1/2}\Sigma_{(1:n,1:n)}\Sigma_{(1:n,1:n)}^{-1/2}=I_n.$
With $X$ and $Y$, we define $\hat A$, $\hat B$, and $\hat C$ according to (2.4c), and, as a result, the transformed system (2.4). In turn, we have $A=X\hat AY^{\mathrm T}$, $B=X\hat B$, and $C=\hat CY^{\mathrm T}$. Plug these relations into (2.9) to get, after simple re-arrangements,
(2.14a) $\hat A(Y^{\mathrm T}PY)+(Y^{\mathrm T}PY)\hat A^{\mathrm T}+\hat B\hat B^{\mathrm T}=0,$
(2.14b) $\hat A^{\mathrm T}(X^{\mathrm T}QX)+(X^{\mathrm T}QX)\hat A+\hat C^{\mathrm T}\hat C=0,$
which are precisely the Lyapunov equations for the Gramians
(2.15) $\hat P=Y^{\mathrm T}PY,\qquad \hat Q=X^{\mathrm T}QX$
of the transformed system (2.4). With the help of (2.10), (2.12) and (2.13), it is not hard to verify that
$\hat P=\hat Q=\Sigma_{(1:n,1:n)},$
balancing out the Gramians.
Given integer $r$ (usually $r\ll n$), according to the partitions of $U$, $\Sigma$, and $V$ in (2.12), we write
(2.16a) $X_1=L_PV_1\Sigma_{(1:r,1:r)}^{-1/2}=X_{(:,1:r)},$
(2.16b) $Y_1=L_QU_1\Sigma_{(1:r,1:r)}^{-1/2}=Y_{(:,1:r)},$
leading to the reduced system (2.6) in form but with $X_1$ and $Y_1$ newly defined by (2.16). In the rest of this section, we will adopt the notations in Section 2.1 but with $X_1$ and $Y_1$ given by (2.16).
Balanced truncation as stated is a very expensive procedure for generating (2.6) computationally. The computations of $P$ and $Q$ in full cost $O(n^3)$ flops each, by, e.g., the Bartels-Stewart algorithm [7]; the decompositions $P=L_PL_P^{\mathrm T}$ and $Q=L_QL_Q^{\mathrm T}$ cost $O(n^3)$ flops each, and so does computing the SVD of $L_Q^{\mathrm T}L_P$, not to mention the storage requirements. However, it is a well-understood method in that the associated reduced system (1.2) inherits the most important system properties of the original system: being stable, observable and controllable, and also there is a global error bound that guarantees the overall quality of the reduced system.
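The whole dense procedure (2.9)-(2.16) fits in a few lines. The following is a minimal sketch, assuming a stable, controllable and observable system so that the Cholesky factorizations exist; the function name is ours.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def balanced_truncation(A, B, C, r):
    """Square-root balanced truncation, a minimal dense O(n^3) sketch."""
    P = solve_continuous_lyapunov(A, -B @ B.T)      # controllability Gramian (2.9a)
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)    # observability Gramian (2.9b)
    Lp = np.linalg.cholesky((P + P.T) / 2)          # full-rank factors (2.10)
    Lq = np.linalg.cholesky((Q + Q.T) / 2)
    U, s, Vt = np.linalg.svd(Lq.T @ Lp)             # (2.12); s holds the HSVs
    S1 = np.diag(s[:r] ** -0.5)
    X1 = Lp @ Vt[:r].T @ S1                         # (2.16a)
    Y1 = Lq @ U[:, :r] @ S1                         # (2.16b)
    return Y1.T @ A @ X1, Y1.T @ B, C @ X1, s       # reduced (A, B, C) and HSVs
```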
In terms of the Gramians, the $\mathcal H_2$- and $\mathcal H_\infty$-norms of $G$ previously defined in (2.1) are given by (e.g., [2, Section 5.4.2])
$\|G\|_{\mathcal H_2}^2=\operatorname{trace}(CPC^{\mathrm T})=\operatorname{trace}(B^{\mathrm T}QB),\qquad \sigma_1\le\|G\|_{\mathcal H_\infty}\le2(\sigma_1+\sigma_2+\cdots+\sigma_n),$
where $\sigma_1$ is the largest Hankel singular value in (2.12c). We remark that the transformations on $P$ and $Q$ as in (2.15) for any nonsingular $X$ and $Y=X^{-\mathrm T}$, not necessarily the ones in (2.13), preserve the eigenvalues of $PQ$ because
$\hat P\hat Q=(Y^{\mathrm T}PY)(X^{\mathrm T}QX)=X^{-1}PQX.$
For the ease of future reference, we will denote by $G_{\mathrm{bt}}$:
(2.17) $G_{\mathrm{bt}}(s)=\hat C_1(sI_r-\hat A_{11})^{-1}\hat B_1,$
the transfer function of the reduced system (2.6) with $X_1$ and $Y_1$ as in (2.16) by the balanced truncation.
Theorem 2.2 ([2, 25]).
For $X_1$ and $Y_1$ from the balanced truncation as in (2.16), we have
(2.18) $\|G-G_{\mathrm{bt}}\|_{\mathcal H_\infty}\le2(\sigma_{r+1}+\sigma_{r+2}+\cdots+\sigma_n),$
where $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_n$ are the Hankel singular values of the system, i.e., the first $n$ singular values of $L_Q^{\mathrm T}L_P$.
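A numerical check of (2.18), reusing the balanced_truncation sketch above; the random stable test system and the frequency grid are our choices, and the sampled quantity is only a lower estimate of the true $\mathcal H_\infty$ error, so it must respect the bound as well.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, p, r = 12, 2, 2, 4
A = rng.standard_normal((n, n)) - 6 * np.eye(n)   # crude shift to make A stable
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
Ar, Br, Cr, s = balanced_truncation(A, B, C, r)   # sketch from Section 2.2

def G(A_, B_, C_, w):
    return C_ @ np.linalg.solve(1j * w * np.eye(A_.shape[0]) - A_, B_)

err = max(np.linalg.norm(G(A, B, C, w) - G(Ar, Br, Cr, w), 2)
          for w in np.logspace(-3, 3, 400))
print(err <= 2 * s[r:].sum())                     # (2.18) holds: True
```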
Remark 2.1.
One thing that is not yet clear and has not drawn much attention in the literature is whether the reduced system by the balanced truncation of order $r$ varies with the decompositions $P=L_PL_P^{\mathrm T}$ and $Q=L_QL_Q^{\mathrm T}$, which are not unique, including $L_P$ and $L_Q$ that may not necessarily be square. This turns out to be an easy question to answer.
Theorem 2.3.
Suppose $\sigma_r>\sigma_{r+1}$. Then the reduced system (2.6) of order $r$ by balanced truncation, and hence its transfer function (2.17), is invariant with respect to how the decompositions of $P$ and $Q$ in (2.10) are done, including non-square $L_P$ and $L_Q$.
Proof.
We will show that the projection matrices $X_1$ and $Y_1$ defined in (2.16) lead to the same reduced system for any choices of decompositions of $P$ and $Q$ of the said kind. Suppose we have two different decompositions for each one of $P$ and $Q$:
$P=L_PL_P^{\mathrm T}=\check L_P\check L_P^{\mathrm T},\qquad Q=L_QL_Q^{\mathrm T}=\check L_Q\check L_Q^{\mathrm T}.$
The idea is to show that after fixing one pair of decompositions $P=L_PL_P^{\mathrm T}$ and $Q=L_QL_Q^{\mathrm T}$ with square $L_P$ and $L_Q$, the projection matrices constructed from any other decompositions $P=\check L_P\check L_P^{\mathrm T}$ and $Q=\check L_Q\check L_Q^{\mathrm T}$, including nonsquare $\check L_P$ and $\check L_Q$, remain the same. Evidently $\operatorname{rank}(\check L_P)=\operatorname{rank}(\check L_Q)=n$.
Without loss of generality, we may assume that $\check L_P$ and $\check L_Q$ have the same number of columns, $s$ say; otherwise we can append some zero columns to the one with fewer columns from the right.
Since $\check L_P\check L_P^{\mathrm T}=L_PL_P^{\mathrm T}$ and $\check L_Q\check L_Q^{\mathrm T}=L_QL_Q^{\mathrm T}$, there exist $W_P,\,W_Q\in\mathbb{R}^{n\times s}$ such that
$\check L_P=L_PW_P,\qquad \check L_Q=L_QW_Q.$
It can be verified that $W_PW_P^{\mathrm T}=I_n$ and $W_QW_Q^{\mathrm T}=I_n$, i.e., both $W_P$ and $W_Q$ have orthonormal rows. Suppose we already have the SVD of $L_Q^{\mathrm T}L_P$ as in (2.12) with $s_P=s_Q=n$. Both $W_P^{\mathrm T}$ and $W_Q^{\mathrm T}$ have orthonormal columns. There exist $W_P'$ and $W_Q'$ such that
$[W_P^{\mathrm T},\,W_P']\qquad\text{and}\qquad[W_Q^{\mathrm T},\,W_Q']$
are orthogonal matrices. We have
$\check L_Q^{\mathrm T}\check L_P=W_Q^{\mathrm T}\big(L_Q^{\mathrm T}L_P\big)W_P=[W_Q^{\mathrm T}U,\,W_Q']\begin{bmatrix}\Sigma&0\\0&0\end{bmatrix}[W_P^{\mathrm T}V,\,W_P']^{\mathrm T},$
yielding an SVD of $\check L_Q^{\mathrm T}\check L_P$, for which the corresponding projection matrices from $\check L_P$ and $\check L_Q$ are given by
$\check X_1=\check L_P\big(W_P^{\mathrm T}V\big)_{(:,1:r)}\Sigma_{(1:r,1:r)}^{-1/2}=L_PW_PW_P^{\mathrm T}V_{(:,1:r)}\Sigma_{(1:r,1:r)}^{-1/2}=X_1,$
and, similarly, $\check Y_1=Y_1$, yielding the same projection matrices as $X_1$ and $Y_1$ in (2.16) from $L_P$ and $L_Q$, which in turn leads to the same reduced system (2.6) and hence the same transfer function. Now let
$\check L_Q^{\mathrm T}\check L_P=\check U\check\Sigma\check V^{\mathrm T}$
be another SVD of $\check L_Q^{\mathrm T}\check L_P$, subject to the inherent freedom in SVD. Since $\sigma_r>\sigma_{r+1}$, by the uniqueness of singular subspaces, we know $\mathcal R\big(\check U_{(:,1:r)}\big)=\mathcal R\big(W_Q^{\mathrm T}U_{(:,1:r)}\big)$ and $\mathcal R\big(\check V_{(:,1:r)}\big)=\mathcal R\big(W_P^{\mathrm T}V_{(:,1:r)}\big)$. Therefore
$\mathcal R\big(\check L_P\check V_{(:,1:r)}\check\Sigma_{(1:r,1:r)}^{-1/2}\big)=\mathcal R(X_1),$
and similarly, $\mathcal R\big(\check L_Q\check U_{(:,1:r)}\check\Sigma_{(1:r,1:r)}^{-1/2}\big)=\mathcal R(Y_1)$, implying the same transfer function regardless of whether the reduced system is obtained by the projection matrix pair $(X_1,Y_1)$ or by the other pair, by Theorem 2.1. ∎
2.3 A variant of balanced truncation
A distinguished feature of the transformation behind (2.16) is that it makes the transformed system (2.4) balanced, i.e., both controllability and observability Gramians are the same and diagonal, and so the reduced system (2.6) is balanced, too. But as far as just the transfer function of the reduced system is concerned, there is no need to have $X_1$ and $Y_1$ precisely the same as the ones in (2.16), because of Theorem 2.1. In fact, all that we need is to make sure $\mathcal R(X_1)=\mathcal R(L_PV_1)$ and $\mathcal R(Y_1)=\mathcal R(L_QU_1)$, besides $Y_1^{\mathrm T}X_1=I_r$. Specifically, we have by Theorem 2.1
Corollary 2.1.
Any pair $(X_1,Y_1)$ with $Y_1^{\mathrm T}X_1=I_r$, $\mathcal R(X_1)=\mathcal R(L_PV_1)$, and $\mathcal R(Y_1)=\mathcal R(L_QU_1)$ yields a reduced system (2.6) with the same transfer function (2.17) as the one by balanced truncation.
$X_1$ and $Y_1$ defined by (2.16) for balanced truncation are difficult to work with in analyzing the quality of balanced truncation. Luckily, the use of the transfer function for analysis allows us to focus on the subspaces $\mathcal R(X_1)$ and $\mathcal R(Y_1)$. Later, instead of the concrete forms of $X_1$ and $Y_1$ in (2.16), we will work with the reduced system (2.6) with
(2.20) $X_1=L_PV_1,\qquad Y_1=L_QU_1\Sigma_{(1:r,1:r)}^{-1}.$
It is not hard to verify that $Y_1^{\mathrm T}X_1=I_r$, $\mathcal R(X_1)=\mathcal R(L_PV_1)$, and $\mathcal R(Y_1)=\mathcal R(L_QU_1)$. Effectively, in the notations of Section 2.2 up to SVD (2.12), this amounts to transforming the original system (1.1) to (2.4) with
(2.21) $X=L_PV_{(:,1:n)},\qquad Y=L_QU_{(:,1:n)}\Sigma_{(1:n,1:n)}^{-1}.$
Accordingly, the Gramians for the reduced system, by (2.15), are
(2.22) $\tilde P=I_r,\qquad \tilde Q=\Sigma_{(1:r,1:r)}^2,$
which are not balanced, but the reduced system has the same transfer function as the one by the balanced truncation with (2.16) nonetheless.
3 Approximate balanced truncation
When $n$ is large, balanced truncation as stated is a very expensive procedure both computationally and in storage usage. Fortunately, $P$ and $Q$ are usually numerically low-rank [8, 20, 6, 21, 4], which means that $P$ and $Q$ can be very well approximated by $Z_PZ_P^{\mathrm T}$ and $Z_QZ_Q^{\mathrm T}$, respectively, where $Z_P\in\mathbb{R}^{n\times k_P}$ and $Z_Q\in\mathbb{R}^{n\times k_Q}$ with $k_P,\,k_Q\ll n$. Naturally, we will use $Z_P$ and $Z_Q$ to play the roles of $L_P$ and $L_Q$ in Section 2.2. Specifically, a model order reduction by approximate balanced truncation goes as follows.
- 1. compute some low-rank approximations to $P$ and $Q$ in the product form
(3.1) $P\approx Z_PZ_P^{\mathrm T},\qquad Q\approx Z_QZ_Q^{\mathrm T},$
where $Z_P\in\mathbb{R}^{n\times k_P}$ and $Z_Q\in\mathbb{R}^{n\times k_Q}$. Without loss of generality, assume $k_P\le k_Q$, for our presentation.
- 2. compute the SVD of $Z_Q^{\mathrm T}Z_P$, analogously to (2.12):
$Z_Q^{\mathrm T}Z_P=\widetilde U\widetilde\Sigma\widetilde V^{\mathrm T};$
- 3. finally, $A$, $B$, and $C$ are reduced to
(3.2) $\widetilde A_{11}=\widetilde Y_1^{\mathrm T}A\widetilde X_1,\qquad \widetilde B_1=\widetilde Y_1^{\mathrm T}B,\qquad \widetilde C_1=C\widetilde X_1,$
where
(3.3) $\widetilde X_1=Z_P\widetilde V_{(:,1:r)}\widetilde\Sigma_{(1:r,1:r)}^{-1/2},\qquad \widetilde Y_1=Z_Q\widetilde U_{(:,1:r)}\widetilde\Sigma_{(1:r,1:r)}^{-1/2}.$
It can be verified that $\widetilde Y_1^{\mathrm T}\widetilde X_1=I_r$. Accordingly, we will have a reduced system
(3.4a) $\dot{\tilde x}(t)=\widetilde A_{11}\tilde x(t)+\widetilde B_1u(t),$
(3.4b) $\tilde y(t)=\widetilde C_1\tilde x(t),$
which will not be quite the same as (1.2) with $X_1$ and $Y_1$ in (2.16) from the (exact) balanced truncation. The transfer function of (3.4) is
(3.5) $G_{\mathrm{abt}}(s)=\widetilde C_1(sI_r-\widetilde A_{11})^{-1}\widetilde B_1.$
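In code, the only change from the exact square-root procedure is that the low-rank factors $Z_P$ and $Z_Q$ replace $L_P$ and $L_Q$. A minimal sketch follows; the function name is ours, and the factors are assumed to come from a low-rank Lyapunov solver such as low-rank ADI [15, 9].

```python
import numpy as np

def approx_balanced_truncation(A, B, C, Zp, Zq, r):
    """Approximate balanced truncation (3.1)-(3.3), a minimal sketch.

    Zp (n-by-k_P) and Zq (n-by-k_Q) are low-rank Gramian factors with
    P ~ Zp Zp^T and Q ~ Zq Zq^T; requires r <= min(k_P, k_Q).
    """
    U, s, Vt = np.linalg.svd(Zq.T @ Zp)   # SVD of Zq^T Zp, analogous to (2.12)
    S1 = np.diag(s[:r] ** -0.5)
    X1 = Zp @ Vt[:r].T @ S1               # (3.3)
    Y1 = Zq @ U[:, :r] @ S1
    return Y1.T @ A @ X1, Y1.T @ B, C @ X1
```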
One lingering question that has not been addressed in the literature is how good the reduced system (3.4) is, compared to the true reduced system of balanced truncation. The seemingly convincing argument that if $Z_PZ_P^{\mathrm T}$ and $Z_QZ_Q^{\mathrm T}$ are sufficiently accurate then $G_{\mathrm{abt}}$ should approximate $G_{\mathrm{bt}}$ well could be doubtful because usually $k_P,\,k_Q\ll n$. A different argument may say otherwise. In order for $Z_PZ_P^{\mathrm T}$ and $Z_QZ_Q^{\mathrm T}$ to approximate $P$ and $Q$ well, respectively, both $Z_P$ and $Z_Q$ must approximate the dominant components of the factors $L_P$ and $L_Q$ of $P$ and $Q$ well. The problem lies here: the dominant components of $Z_P$ and $Z_Q$ could mismatch in forming $Z_Q^{\mathrm T}Z_P$, i.e., in an unlucky scenario, the dominant components of $Z_P$ match the least dominant components of $Z_Q$ in forming $Z_Q^{\mathrm T}Z_P$, and simply extracting the dominant components of $Z_P$ and $Z_Q$ is not enough. Hence it becomes critically important to provide a theoretical analysis that shows the quality of the approximate balanced truncation derived from $Z_P$ and $Z_Q$, assuming the errors $\|P-Z_PZ_P^{\mathrm T}\|_2$ and $\|Q-Z_QZ_Q^{\mathrm T}\|_2$ are tiny.
By the same reasoning as in Subsection 2.2, the transfer function stays the same for any $\widetilde X_1,\,\widetilde Y_1$ that satisfy
(3.6) $\widetilde Y_1^{\mathrm T}\widetilde X_1=I_r,\qquad \mathcal R(\widetilde X_1)=\mathcal R\big(Z_P\widetilde V_{(:,1:r)}\big),\qquad \mathcal R(\widetilde Y_1)=\mathcal R\big(Z_Q\widetilde U_{(:,1:r)}\big),$
and the pair in (3.3) is just one of many concrete pairs that satisfy (3.6). Again, $\widetilde X_1$ and $\widetilde Y_1$ in (3.3) for approximate balanced truncation are difficult to work with in our later analysis. Luckily, we can again focus on the subspaces $\mathcal R(\widetilde X_1)$ and $\mathcal R(\widetilde Y_1)$ because of Theorem 2.1. Precisely what $\widetilde X_1$ and $\widetilde Y_1$ to use will be specified later in Section 4 so that they will be close to $X_1$ and $Y_1$ in (2.20), respectively.
We reiterate our notations for the reduced models going forward.
- • $G_{\mathrm{bt}}$ in (2.17): the transfer function of the reduced system (2.6) by the true balanced truncation;
- • $G_{\mathrm{abt}}$ in (3.5): the transfer function of the reduced system (3.4) by approximate balanced truncation.
In the rest of this paper, assuming (4.1) below, we will
- (i) bound the differences between the coefficient matrices of the two reduced systems (2.6) and (3.4);
- (ii) bound $\|G-G_{\mathrm{abt}}\|$ and $\|G_{\mathrm{bt}}-G_{\mathrm{abt}}\|$ in terms of $\|G-G_{\mathrm{bt}}\|$ for both the $\mathcal H_\infty$- and $\mathcal H_2$-norms.
4 Quality of the approximate balanced reduction
The true balanced truncation requires computing the controllability and observability Gramians $P$ and $Q$ to the working precision, performing their full-rank decompositions (such as the Cholesky decomposition) and an SVD, each of which costs $O(n^3)$ flops. It is infeasible for large scale dynamic systems. Luckily, the numbers of columns in $B$ and $C^{\mathrm T}$ are usually $O(1)$, and $P$ and $Q$ numerically have extremely low ranks. In practice, due to the fast decay of the Hankel singular values [4, 6, 10, 8, 20], and the fact that solving the Lyapunov equations in (2.9) for the full Gramians is too expensive and storing the full Gramians takes too much space, we can only afford to compute low-rank approximations to $P$ and $Q$ in the product form as in (3.1) [14, 21, 22]. More than that, $Z_PZ_P^{\mathrm T}$ and $Z_QZ_Q^{\mathrm T}$ approach $P$ and $Q$ from below, i.e.,
(4.1a) $0\preceq Z_PZ_P^{\mathrm T}\preceq P,\qquad 0\preceq Z_QZ_Q^{\mathrm T}\preceq Q,$
where $Z_P\in\mathbb{R}^{n\times k_P}$ and $Z_Q\in\mathbb{R}^{n\times k_Q}$. This is what we will assume about $Z_P$ and $Z_Q$ in the rest of this paper, besides
(4.1b) $\|P-Z_PZ_P^{\mathrm T}\|_2\le\epsilon_P,\qquad \|Q-Z_QZ_Q^{\mathrm T}\|_2\le\epsilon_Q,$
for some sufficiently tiny $\epsilon_P$ and $\epsilon_Q$. Except for their existence, exactly what the remainders $P-Z_PZ_P^{\mathrm T}$ and $Q-Z_QZ_Q^{\mathrm T}$ and their full-rank factors $E_P$ and $E_Q$ are is not needed in our analysis. Because of (4.1a), we may write
(4.2a) $P=Z_PZ_P^{\mathrm T}+E_PE_P^{\mathrm T}=[Z_P,\,E_P][Z_P,\,E_P]^{\mathrm T},$
(4.2b) $Q=Z_QZ_Q^{\mathrm T}+E_QE_Q^{\mathrm T}=[Z_Q,\,E_Q][Z_Q,\,E_Q]^{\mathrm T},$
where $E_P$ and $E_Q$ are unknown, and neither are
(4.2c) $L_P=[Z_P,\,E_P],\qquad L_Q=[Z_Q,\,E_Q],$
which are full-rank decompositions of $P$ and $Q$ as in (2.10). Without loss of generality, we may assume that $L_P$ and $L_Q$ have the same number of columns;
otherwise, we simply append a few zero columns to the one with fewer columns. Let
(4.3d) $M=L_Q^{\mathrm T}L_P=\begin{bmatrix}Z_Q^{\mathrm T}Z_P&Z_Q^{\mathrm T}E_P\\ E_Q^{\mathrm T}Z_P&E_Q^{\mathrm T}E_P\end{bmatrix}.$
It is reasonable to require $r\le\min\{k_P,\,k_Q\}$
because we are looking for balanced truncation of order $r$. Lemma 4.1 provides some basic inequalities we need in the rest of this paper.
Lemma 4.1.
Proof.
Remark 4.1.
Besides the spectral norm $\|\cdot\|_2$, the Frobenius norm $\|\cdot\|_{\mathrm F}$ is another commonly used matrix norm. Naturally, we wonder if we could have Frobenius-norm versions of (4.1b) and Lemma 4.1. Theoretically, it can be done, but there is one potential problem: the matrix dimension will show up. Here is why:
$\|X\|_2\le\|X\|_{\mathrm F}\le\sqrt{\operatorname{rank}(X)}\,\|X\|_2,$
and the second inequality becomes an equality if all nonzero singular values of $X$ are the same. Although $\operatorname{rank}(X)\le\min\{m,n\}$ always, potentially $\operatorname{rank}(X)=O(n)$, bringing a factor $\sqrt{n}$ into the estimates here and going forward. That can be an unfavorable thing to have for huge $n$.
4.1 Associated SVDs
Let the SVD of $M$ in (4.3d) be
(4.7a) $M=U\Sigma V^{\mathrm T},$
where
(4.7b) $U=[U_1,\,U_2]$ and $V=[V_1,\,V_2]$ are orthogonal, with $U_1$ and $V_1$ having $r$ columns each,
(4.7c) $\Sigma=\operatorname{diag}(\sigma_1,\sigma_2,\ldots),\qquad \sigma_1\ge\sigma_2\ge\cdots\ge0.$
Despite its large size, $M$ still has only $n$ nonzero singular values, namely $\sigma_1\ge\cdots\ge\sigma_n>0$, which are the Hankel singular values of the system, and the rest of its singular values are $\sigma_i=0$ for $i>n$.
Lemma 4.2.
The nonzero singular values of $M$ in (4.3d) are precisely the Hankel singular values $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_n$ of the system.
Proof.
It suffices to show $\sigma_i(M)^2=\lambda_i(PQ)$ for $1\le i\le n$. Note $\sigma_i(M)^2$ for $1\le i\le n$ are the eigenvalues of
$M^{\mathrm T}M=L_P^{\mathrm T}L_QL_Q^{\mathrm T}L_P=L_P^{\mathrm T}QL_P,$
whose nonzero eigenvalues are the same as those of
$QL_PL_P^{\mathrm T}=QP,$
whose nonzero eigenvalues are the same as those of $PQ$, i.e., $\lambda_i(PQ)=\sigma_i^2$ for $1\le i\le n$. ∎
Partition
$M=\begin{bmatrix}M_{11}&M_{12}\\ M_{21}&M_{22}\end{bmatrix},\qquad M_{11}=Z_Q^{\mathrm T}Z_P\in\mathbb{R}^{k_Q\times k_P}.$
By Lemma 4.1, we find
(4.8)
where $\epsilon$ is defined in (4.6). Now we will apply [19, Theorem 3.1] to $M$ to yield an almost-SVD decomposition of $M$:
(4.9)
where
(4.10c)
are two orthogonal matrices.
Theorem 4.1.
Let $\epsilon$ be as in (4.6), and let
If
then the following statements hold:
- (a)
-
(b)
the singular values of $M$ are the multiset union of
(4.12a) (4.12b) and
(4.13a) (4.13b) -
(c)
we have
(4.14) where and are the smallest singular value of and the largest singular value of , respectively;
-
(d)
if also , then the top singular values of are exactly the singular values of ,
(4.15) and the dominant left and right singular subspaces are spanned by the columns of
(4.16a) (4.16b) respectively. In particular,
(4.17a) (4.17b) and111 It is tempting to wonder if , considering the standard perturbation result of singular values [23, p.204], [16, p.21-7]. Unfortunately, is unlikely diagonal. Another set of two inequalities for the same purpose as (4.18) can be obtained as outlined in Remark 4.2.
(4.18a) (4.18b)
Proof.
Recall (4.8). Apply [19, Theorem 3.1] to $M$ to yield all conclusions of the theorem, except (4.17) and (4.18), which we will now prove.
To prove (4.17a), we have
Let be the SVD of . We find
where for the middle matrix on the right, is diagonal and is leading diagonal. Hence the singular values of the middle matrix are given by: for each singular value of ,
Therefore, we get
yielding (4.17a) in light of (4.11). Similarly, we have (4.17b).
Remark 4.2.
Another upper bound can be obtained as follows. Alternatively to (4.19), we have
and therefore
(4.20)
Comparing (4.18a) with (4.20), we find that both contain a term that depends only on the approximation error, for which the edge goes to (4.18a), and both contain a term proportional to the perturbation, for which the edge goes to (4.20). In the same way as (4.18b) is created, we can create a companion upper bound using (4.20) instead. Details are omitted.
As we commented on [19, Theorem 3.1], (4.15) improves the first inequality in (4.14), but it relies on the latter to first establish the fact that the top singular values of are exactly the singular values of .
The decomposition (4.9) we built for $M$ has an SVD look, but it is not an SVD because its diagonal blocks are not diagonal. One idea is to perform an SVD on them and update the orthogonal factors accordingly to get the dominant left and right singular vector matrices, but it is hard, if not impossible, to relate the resulting factors to $U_1$ and $V_1$, and in turn, difficult to relate $\widetilde X_1$ and $\widetilde Y_1$ defined in (3.3) to $X_1$ and $Y_1$ defined in (2.16). This is precisely the reason behind our previous comment at the end of Sections 2 and 3 that $X_1,\,Y_1$ defined in (2.16) and $\widetilde X_1,\,\widetilde Y_1$ in (3.3) are difficult to use. Fortunately these concrete forms are not essential as far as the transfer functions are concerned, because of Theorem 2.1. On the other hand, it is rather easy to relate the factors in (4.9) to $U_1$, $\Sigma$, and $V_1$ from the SVD of $M$.
In the rest of this paper, we will assume the following setup without explicitly mentioning it:
Setup. The approximate Gramians $Z_PZ_P^{\mathrm T}$ and $Z_QZ_Q^{\mathrm T}$ satisfy (4.1) such that the conditions of Theorem 4.1 hold. The true balanced truncation is carried out with $X_1,\,Y_1$ in (2.20), while $\widetilde X_1,\,\widetilde Y_1$ are used for approximate balanced truncation. Accordingly, $\hat A_{11}$, $\hat B_1$, and $\hat C_1$ in the reduced system (2.6) from the true balanced truncation are defined by (2.5), and $\widetilde A_{11}$, $\widetilde B_1$, and $\widetilde C_1$ in the reduced system (3.4) from approximate balanced truncation by (3.2). (4.21)
$X_1,\,Y_1$ in (2.20) and $\widetilde X_1,\,\widetilde Y_1$ just introduced produce reduced models different from the usual reduced models by balanced truncation in the literature, but keep the associated transfer functions intact, nonetheless. In particular, they are introduced for our analysis only. In practice, they cannot be computed because, given $Z_P$ and $Z_Q$, knowledge of what $E_P$ and $E_Q$ are is not available a priori.
4.2 Bounds on differences between reduced systems
In this subsection we will bound the differences of the coefficient matrices and transfer functions between the reduced system (2.6) from the true balanced truncation and (3.4) from an approximate balanced truncation.
First we will establish bounds on the differences between the projection matrix pairs for the true and approximate balanced truncations.
Lemma 4.3.
We have
(4.22a)
(4.22b)
Proof.
The differences between the coefficient matrices of the two reduced systems are bounded in Theorem 4.2 below, where the use of any unitarily invariant norm does not require additional care in the proofs, and may be of independent interest.
Theorem 4.2.
For any unitarily invariant norm $\|\cdot\|$, we have
(4.23a)
(4.23b)
(4.23c)
and
(4.24a)
(4.24b)
Proof.
Remark 4.3.
Previously, we introduced $G_{\mathrm{bt}}$ in (2.17) and $G_{\mathrm{abt}}$ in (3.5) for the transfer functions of the reduced systems by the true and approximate balanced truncation, respectively. Let
$G_{\mathrm d}(s)=G_{\mathrm{bt}}(s)-G_{\mathrm{abt}}(s),$
the difference between the transfer functions, where the subscript ‘d’ is used here and in what follows to stand for ‘difference’ between the related quantities from the true balanced truncation and its approximation.
We are interested in establishing bounds for $\|G_{\mathrm d}\|_{\mathcal H_\infty}$ and $\|G_{\mathrm d}\|_{\mathcal H_2}$. To this end, we introduce
(4.27)
the solutions to the corresponding Lyapunov equations, respectively, and let
(4.28)
Both are well-defined because $\hat A_{11}$ is from the exact balanced truncation and hence inherits its stability from the original state matrix $A$.
$G_{\mathrm d}(s)$ is the transfer function of the system
(4.29) $\begin{aligned}\dot z(t)&=\begin{bmatrix}\hat A_{11}& \\ &\widetilde A_{11}\end{bmatrix}z(t)+\begin{bmatrix}\hat B_1\\ \widetilde B_1\end{bmatrix}u(t),\\ y_{\mathrm d}(t)&=[\hat C_1,\,-\widetilde C_1]\,z(t),\end{aligned}$
or in short,
$G_{\mathrm d}=\left(\begin{array}{cc|c}\hat A_{11}& &\hat B_1\\ &\widetilde A_{11}&\widetilde B_1\\ \hline \hat C_1&-\widetilde C_1& \end{array}\right)=:\left(\begin{array}{c|c}A_{\mathrm d}&B_{\mathrm d}\\ \hline C_{\mathrm d}& \end{array}\right).$
Denote by $P_{\mathrm d}$ and $Q_{\mathrm d}$ the controllability and observability Gramians of (4.29), respectively. They are the solutions to
(4.30a) $A_{\mathrm d}P_{\mathrm d}+P_{\mathrm d}A_{\mathrm d}^{\mathrm T}+B_{\mathrm d}B_{\mathrm d}^{\mathrm T}=0,$
(4.30b) $A_{\mathrm d}^{\mathrm T}Q_{\mathrm d}+Q_{\mathrm d}A_{\mathrm d}+C_{\mathrm d}^{\mathrm T}C_{\mathrm d}=0,$
respectively. It is well-known that
(4.31) $\|G_{\mathrm d}\|_{\mathcal H_\infty}\le2\sum_i\big[\lambda_i(P_{\mathrm d}Q_{\mathrm d})\big]^{1/2},\qquad \|G_{\mathrm d}\|_{\mathcal H_2}^2=\operatorname{trace}(C_{\mathrm d}P_{\mathrm d}C_{\mathrm d}^{\mathrm T})=\operatorname{trace}(B_{\mathrm d}^{\mathrm T}Q_{\mathrm d}B_{\mathrm d}).$
The $\mathcal H_2$-norm of continuous system (1.1) is the energy of the associated impulse response in the time domain [2]. Our goals then turn into estimating the largest eigenvalues of $P_{\mathrm d}Q_{\mathrm d}$ and the traces in (4.31).
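As a sanity check of the trace formulas in (4.31), stated here for a generic stable system (the random test data are ours):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(4)
n, m, p = 7, 2, 3
A = rng.standard_normal((n, n)) - 5 * np.eye(n)   # crude shift to make A stable
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)
h2_via_P = np.trace(C @ P @ C.T)   # ||G||_H2^2 = trace(C P C^T)
h2_via_Q = np.trace(B.T @ Q @ B)   # ||G||_H2^2 = trace(B^T Q B)
print(np.isclose(h2_via_P, h2_via_Q))  # True
```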
Lemma 4.4.
If the conditions of Theorem 4.1 hold, then
(4.32)
where the blocks of $P_{\mathrm d}$ and $Q_{\mathrm d}$ satisfy
(4.33a)
(4.33b)
and
(4.34a)
(4.34b)
Proof.
Partition both $P_{\mathrm d}$ and $Q_{\mathrm d}$ as
$P_{\mathrm d}=\begin{bmatrix}P_{11}&P_{12}\\ P_{12}^{\mathrm T}&P_{22}\end{bmatrix},\qquad Q_{\mathrm d}=\begin{bmatrix}Q_{11}&Q_{12}\\ Q_{12}^{\mathrm T}&Q_{22}\end{bmatrix}.$
We start by investigating $P_{\mathrm d}$ first. Blockwise, (4.30a) is equivalent to the following three equations:
(4.35a)
(4.35b)
(4.35c)
It follows from Section 2.3 that , and from Lemma B.1 that both and are near , and therefore the form of as in (4.32). Specifically, by (4.23) and Lemma B.1, we have
with
They, together with Theorem 4.2, yield (4.33).
We now turn our attention to $Q_{\mathrm d}$. Blockwise, (4.30b) is equivalent to the following three equations:
(4.36a)
(4.36b)
(4.36c)
It follows from Section 2.3 that , and from Lemma B.1 that both and are near , and therefore the form of as in (4.32). Specifically, by (4.23) and Lemma B.1, we have
with
They, together with Theorem 4.2, yield (4.34). ∎
Remark 4.4.
Frobenius-norm versions of these bounds can also be established, only a little more complicated than (4.33) and (4.34), upon using Lemma B.1 with the Frobenius norm and noticing
$\|X\|_{\mathrm F}\le\sqrt{\operatorname{rank}(X)}\,\|X\|_2.$
In fact, we will have analogous estimates. But these bounds are not materially better than those straightforwardly obtained from (4.33) and (4.34), together with $\|X\|_{\mathrm F}\le\sqrt{\operatorname{rank}(X)}\,\|X\|_2$ for any $X$.
Theorem 4.3.
If the conditions of Theorem 4.1 hold, then
(4.37)
(4.38)
where the quantities involved are defined in Lemma 4.4, and $k_P$ and $k_Q$ are the numbers of columns of $Z_P$ and $Z_Q$, respectively.
Proof.
Remark 4.5.
Alternatively, based on the second expression in (4.31) for $\|G_{\mathrm d}\|_{\mathcal H_2}^2$, we can derive a different bound. Similarly to (4.39) and (4.40), we claim that
(4.41)
(4.42)
They can be proven analogously, along the lines of the proofs of (4.39) and (4.40), as follows:
Finally,
yielding a bound different from the one in (4.38). It is not clear which one is smaller.
Norms of the coefficient matrices for the reduced systems appear in the bounds in Theorem 4.3. They can be replaced by the norms of the corresponding coefficient matrices for the original system with the help of the next lemma.
Lemma 4.5.
Proof.
We have by (2.20)
Therefore
and similarly for the other coefficient matrices. The proofs of the other two inequalities are similar. ∎
4.3 Transfer function for approximate balanced truncation
In this subsection, we establish bounds to measure the quality of the reduced system (3.4) from approximate balanced truncation as an approximation to the original system (1.1). Even though the projection matrices we used for approximate balanced truncation are different from the ones in practice, the resulting transfer function remains the same, nonetheless. Therefore our bounds are applicable in real applications. These bounds are immediate consequences of Theorem 4.3 upon using
$\|G-G_{\mathrm{abt}}\|\le\|G-G_{\mathrm{bt}}\|+\|G_{\mathrm d}\|$
for both the $\mathcal H_\infty$- and $\mathcal H_2$-norms.
Theorem 4.4.
An immediate reading of both inequalities (4.44) and (4.45) is that the reduced system (3.4) from the approximate balanced reduction, as an approximation to the original system (1.1), is worse than the one from the true balanced reduction by no more than the two additive terms, in terms of the $\mathcal H_\infty$- and $\mathcal H_2$-norms, respectively. Both terms can be traced back to the initial approximation errors $\epsilon_P$ and $\epsilon_Q$ in the computed Gramians as specified in (4.1), albeit complicatedly. To better understand what they are in terms of $\epsilon_P$ and $\epsilon_Q$, we summarize all quantities that lead to them, up to the first order.
Then $\epsilon$ in (4.6) follows. We have
Alternatively, also by Remark 4.5,
It can be seen that both terms are of $O(\sqrt{\epsilon}\,)$, pretty disappointing.
5 Concluding Remarks
For a continuous linear time-invariant dynamic system, the existing global error bound between a reduced model via balanced truncation and the original dynamic system assumes that the reduced model is constructed from the exact controllability and observability Gramians. But in practice, the Gramians are usually approximated by computed low-rank approximations, especially when the original dynamic system is of large scale. Thus, rigorously speaking, the existing global error bound, although indicative of the accuracy of the reduced system, is not really applicable. In this paper, we perform an error analysis, assuming the reduced model is constructed from two low-rank approximations of the Gramians, making up for the deficiency in the current theory for measuring the quality of the reduced model obtained by approximate balanced truncation. Error bounds have been obtained for the purpose.
So far, we have focused on continuous linear time-invariant dynamic systems, but our techniques should be extendable to discrete time-invariant dynamic systems without much difficulty.
Throughout this paper, our presentation is restricted to the real number field $\mathbb{R}$. This restriction is more for simplicity and clarity than a limitation of our techniques. In fact, our approach can be straightforwardly modified to cover the complex number case: replace all transposes of vectors/matrices with complex conjugate transposes.
Appendix
Appendix A Some results on subspaces
Consider two subspaces $\mathcal X$ and $\mathcal Y$ of $\mathbb{R}^n$ with $\dim(\mathcal X)=\dim(\mathcal Y)=k$, and let $X$ and $Y$ be orthonormal basis matrices of $\mathcal X$ and $\mathcal Y$, respectively, i.e.,
$\mathcal R(X)=\mathcal X,\quad X^{\mathrm T}X=I_k\qquad\text{and}\qquad \mathcal R(Y)=\mathcal Y,\quad Y^{\mathrm T}Y=I_k,$
and denote by $\sigma_j$ for $1\le j\le k$, in descending order, i.e., $\sigma_1\ge\cdots\ge\sigma_k$, the singular values of $Y^{\mathrm T}X$. The canonical angles $\theta_j(\mathcal X,\mathcal Y)$ between $\mathcal X$ and $\mathcal Y$ are defined by
$0\le\theta_j(\mathcal X,\mathcal Y):=\arccos\sigma_j\le\frac{\pi}{2}\qquad\text{for }1\le j\le k.$
They are in ascending order, i.e., $\theta_1(\mathcal X,\mathcal Y)\le\cdots\le\theta_k(\mathcal X,\mathcal Y)$. Set
$\Theta(\mathcal X,\mathcal Y)=\operatorname{diag}\big(\theta_1(\mathcal X,\mathcal Y),\ldots,\theta_k(\mathcal X,\mathcal Y)\big).$
It can be seen that these angles are independent of the orthonormal basis matrices $X$ and $Y$, which are not unique.
We sometimes place a matrix in one or both arguments of $\theta_j(\cdot,\cdot)$ and $\Theta(\cdot,\cdot)$ with the understanding that it refers to the subspace spanned by the columns of the matrix argument.
It is known that $\|\sin\Theta(\mathcal X,\mathcal Y)\|$ defines a distance metric between $\mathcal X$ and $\mathcal Y$ [24, p.95].
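The canonical angles are easy to compute from the SVD of $Y^{\mathrm T}X$; a small sketch, cross-checked against SciPy's subspace_angles (which returns angles in descending order):

```python
import numpy as np
from scipy.linalg import orth, subspace_angles

rng = np.random.default_rng(5)
n, k = 9, 3
X = orth(rng.standard_normal((n, k)))   # orthonormal basis of a subspace
Y = orth(rng.standard_normal((n, k)))   # orthonormal basis of another subspace
s = np.clip(np.linalg.svd(Y.T @ X, compute_uv=False), -1.0, 1.0)
theta = np.arccos(s)                    # ascending, since the s are descending
print(np.allclose(np.sort(theta), np.sort(subspace_angles(X, Y))))  # True
```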
The next two lemmas and their proofs are about how to pick two bi-orthogonal basis matrices of two subspaces whose canonical angles are acute. The results provide a foundation for some of our arguments in the paper.
Lemma A.1.
Let $\mathcal X$ and $\mathcal Y$ be two subspaces of $\mathbb{R}^n$ with $\dim(\mathcal X)=\dim(\mathcal Y)=k$. Then
$\theta_k(\mathcal X,\mathcal Y)<\frac{\pi}{2}$
if and only if $Y^{\mathrm T}X$ is nonsingular for any two basis matrices $X$ and $Y$ of $\mathcal X$ and $\mathcal Y$, respectively.
Proof.
Suppose that $\theta_k(\mathcal X,\mathcal Y)<\pi/2$, and let $X$ and $Y$ be basis matrices of $\mathcal X$ and $\mathcal Y$, respectively. Then
(A.1) $\hat X=X(X^{\mathrm T}X)^{-1/2},\qquad \hat Y=Y(Y^{\mathrm T}Y)^{-1/2}$
are two orthonormal basis matrices of $\mathcal X$ and $\mathcal Y$, respectively. The singular values of $\hat Y^{\mathrm T}\hat X$ are $\cos\theta_j(\mathcal X,\mathcal Y)$ for $1\le j\le k$, which are positive because $\theta_k(\mathcal X,\mathcal Y)<\pi/2$, and hence $\hat Y^{\mathrm T}\hat X$ is nonsingular, and since
(A.2) $Y^{\mathrm T}X=(Y^{\mathrm T}Y)^{1/2}\,\hat Y^{\mathrm T}\hat X\,(X^{\mathrm T}X)^{1/2},$
$Y^{\mathrm T}X$ is nonsingular. Reversing the argument proves the converse.
Lemma A.2.
Let $\mathcal X$ and $\mathcal Y$ be two subspaces of $\mathbb{R}^n$ with $\dim(\mathcal X)=\dim(\mathcal Y)=k$, and suppose that $\theta_k(\mathcal X,\mathcal Y)<\pi/2$, i.e., the canonical angles between the two subspaces are acute.
- (a) There exist basis matrices $X$ and $Y$ of $\mathcal X$ and $\mathcal Y$, respectively, such that $Y^{\mathrm T}X=I_k$;
- (b) Given a basis matrix $X$ of $\mathcal X$, there exists a basis matrix $Y$ of $\mathcal Y$ such that $Y^{\mathrm T}X=I_k$;
- (c) Given basis matrices $X$ and $Y$ of $\mathcal X$ and $\mathcal Y$, respectively, such that $Y^{\mathrm T}X=I_k$, there exist matrices $X_{\mathrm c},\,Y_{\mathrm c}\in\mathbb{R}^{n\times(n-k)}$ such that
$[Y,\,Y_{\mathrm c}]^{\mathrm T}[X,\,X_{\mathrm c}]=I_n.$
Proof.
For item (a), first we pick two orthonormal basis matrices $\hat X$ and $\hat Y$ of $\mathcal X$ and $\mathcal Y$, respectively. The assumption implies that the singular values of $\hat Y^{\mathrm T}\hat X$, which are $\cos\theta_j(\mathcal X,\mathcal Y)$ for $1\le j\le k$, are positive, and hence $\hat Y^{\mathrm T}\hat X$ is nonsingular. Now take $X=\hat X$ and $Y=\hat Y(\hat X^{\mathrm T}\hat Y)^{-1}$.
For item (b), we note $\hat X=X(X^{\mathrm T}X)^{-1/2}$ is an orthonormal basis matrix of $\mathcal X$. Let $\hat Y$ be an orthonormal basis matrix of $\mathcal Y$. As we just argued, $\hat Y^{\mathrm T}\hat X$ is nonsingular, implying $\hat Y^{\mathrm T}X$ is nonsingular. Now take $Y=\hat Y(X^{\mathrm T}\hat Y)^{-1}$.
Finally, for item (c), let $X_\perp$ and $Y_\perp$ be any orthonormal basis matrices of $\mathcal X^\perp$ and $\mathcal Y^\perp$, the orthogonal complements of $\mathcal X$ and $\mathcal Y$, respectively, i.e.,
$\mathcal R(X_\perp)=\mathcal X^\perp,\quad X_\perp^{\mathrm T}X_\perp=I_{n-k}\qquad\text{and}\qquad \mathcal R(Y_\perp)=\mathcal Y^\perp,\quad Y_\perp^{\mathrm T}Y_\perp=I_{n-k}.$
We claim that $X_\perp^{\mathrm T}Y_\perp$ is nonsingular; otherwise there exists $0\ne z\in\mathbb{R}^{n-k}$ such that $X_\perp^{\mathrm T}Y_\perp z=0$, i.e., $Y_\perp z\in(\mathcal X^\perp)^\perp=\mathcal X$, so $Y_\perp z=Xw$ for some $w$; pre-multiplying this by $Y^{\mathrm T}$ leads to $0=Y^{\mathrm T}Y_\perp z=Y^{\mathrm T}Xw=w$, which implies $Y_\perp z=0$, which implies $z=0$ because $Y_\perp$ is an orthonormal basis matrix of $\mathcal Y^\perp$, a contradiction. Similarly, we know $Y_\perp^{\mathrm T}X_\perp$ is nonsingular. Now take $X_{\mathrm c}=Y_\perp(X_\perp^{\mathrm T}Y_\perp)^{-1}$ and $Y_{\mathrm c}=X_\perp$; then $[Y,\,Y_{\mathrm c}]^{\mathrm T}[X,\,X_{\mathrm c}]=I_n$. ∎
Appendix B Perturbation for Lyapunov equation
In this section, we will establish a lemma on the change of the solution to
(B.1) $A^{\mathrm H}X+XA+W=0,$
subject to perturbations to $A$ and $W$, along the technical line of [13], where $W$ may not necessarily be Hermitian. Equation (B.1) is known as the Lyapunov equation if $W$ is Hermitian, but here it may not be. The result in the lemma below is used during our intermediate estimates of transfer functions. In conforming to [13], we will state the result for complex matrices: $\mathbb{C}^{m\times n}$ is the set of all $m$-by-$n$ complex matrices and $X^{\mathrm H}$ denotes the complex conjugate transpose of $X$.
Lemma B.1.
Suppose that $A\in\mathbb{C}^{n\times n}$ is stable, i.e., all of its eigenvalues are located in the open left half of the complex plane, and let
$H=\int_0^{\infty}\mathrm e^{A^{\mathrm H}t}\,\mathrm e^{At}\,\mathrm dt,$
which is the unique solution to the Lyapunov equation $A^{\mathrm H}H+HA+I=0$. Let $W\in\mathbb{C}^{n\times n}$ (not necessarily Hermitian) and let $X$ be the solution to the matrix equation (B.1). Perturb $A$ and $W$ to $A_i=A+\Delta A_i$ ($i=1,2$) and $\widetilde W=W+\Delta W$, respectively, and suppose that the perturbed equation
(B.2) $A_1^{\mathrm H}\widetilde X+\widetilde XA_2+\widetilde W=0$
has a solution $\widetilde X$, where the trivial case of either $X=0$ or $\widetilde X=0$ is excluded. If
(B.3)
then for any unitarily invariant norm
(B.4)
Equation (B.1) is not necessarily a Lyapunov equation because $W$ is allowed to be non-Hermitian, not to mention (B.2), for which two different perturbations are allowed to $A$ at its two occurrences. Equation (B.1) has a unique solution because $A$ is assumed stable, but a solution to the perturbed equation (B.2) is only assumed to exist. It is not clear if the assumption (B.3) ensures that both $A_i$ for $i=1,2$ are stable and thereby guarantees that (B.2) has a unique solution, too, something worthy of further investigation.
Proof of Lemma B.1.
In [13], for the case where the $\Delta A_i$ for $i=1,2$ are the same and denoted by $\Delta A$, under the conditions of Lemma B.1 but without assuming (B.3), it is proved that
(B.5)
which is elegantly formulated in such a way that all changes are measured relatively. We can achieve the same thing, too. In fact, under the conditions of Lemma B.1 but without assuming (B.3), it can be shown that
(B.6)
But, as we argued at the beginning, (B.4) is more convenient for us to use in our intermediate estimations.
References
- [1] B. D. O. Anderson and J. B. Moore. Linear Optimal Control. Prentice-Hall, Englewood Cliffs, 1971.
- [2] A. C. Antoulas. Approximation of Large-Scale Dynamical Systems. Advances in Design and Control. SIAM, Philadelphia, PA, 2005.
- [3] A. C. Antoulas, D. C. Sorensen, and S. Gugercin. A survey of model reduction methods for large-scale systems. In Vadim Olshevsky, editor, Structured Matrices in Mathematics, Computer Science, and Engineering I: Proceedings of an AMS-IMS-SIAM joint summer research conference, University of Colorado, Boulder, June 27-July 1, 1999, volume 280 of Contemporary Mathematics, pages 193–219. American Mathematical Society, Providence, Rhode Island, 2001.
- [4] A. C. Antoulas, D. C. Sorensen, and Y. Zhou. On the decay rate of Hankel singular values and related issues. Sys. Contr. Lett., 46(5):323–342, August 2002.
- [5] Zhaojun Bai. Krylov subspace techniques for reduced-order modeling of large-scale dynamical systems. Appl. Numer. Math., 43:9–44, 2002.
- [6] Jonathan Baker, Mark Embree, and John Sabino. Fast singular value decay for Lyapunov solutions with nonnormal coefficients. SIAM J. Matrix Anal. Appl., 36(2):656–668, 2015.
- [7] R. H. Bartels and G. W. Stewart. Algorithm 432: The solution of the matrix equation $AX+XB=C$. Commun. ACM, 15(9):820–826, 1972.
- [8] Bernhard Beckermann and Alex Townsend. Bounds on the singular values of matrices with displacement structure. SIAM Rev., 61(2):319–344, 2019.
- [9] Peter Benner, R.-C. Li, and Ninoslav Truhar. On ADI method for Sylvester equations. J. Comput. Appl. Math., 233(4):1035–1045, 2009.
- [10] Peter Benner, Mario Ohlberger, Albert Cohen, and Karen Willcox, editors. Model Reduction and Approximation: Theory and Algorithms. Computational Science & Engineering. SIAM, Philadelphia, 2017.
- [11] Roland W. Freund. Model reduction methods based on Krylov subspaces. Acta Numer., 12:267–319, 2003.
- [12] Serkan Gugercin and Athanasios C. Antoulas. A survey of model reduction by balanced truncation and some new results. Int. J. Control, 77(8):748–766, 2004.
- [13] G. Hewer and C. Kenney. The sensitivity of the stable Lyapunov equation. SIAM J. Control Optim., 26(2):321–344, 1988.
- [14] Jing-Rebecca Li and Jacob White. Low-rank solution of Lyapunov equations. SIAM J. Matrix Anal. Appl., 24(1):260–280, 2002.
- [15] Jing-Rebecca Li and Jacob White. Low-rank solution of Lyapunov equations. SIAM Rev., 46(4):693–713, 2004.
- [16] R.-C. Li. Matrix perturbation theory. In L. Hogben, R. Brualdi, and G. W. Stewart, editors, Handbook of Linear Algebra, Chapter 21. CRC Press, Boca Raton, FL, 2nd edition, 2014.
- [17] R.-C. Li and Z. Bai. Structure-preserving model reduction using a Krylov subspace projection formulation. Comm. Math. Sci., 3(2):179–199, 2005.
- [18] R.-C. Li and L.-H. Zhang. Convergence of block Lanczos method for eigenvalue clusters. Numer. Math., 131:83–113, 2015.
- [19] Ren-Cang Li, Ninoslav Truhar, and Lei-Hong Zhang. On Stewart’s perturbation theorem for SVD. Ann. Math. Sci. Appl., 2024. To appear.
- [20] Thilo Penzl. Eigenvalue decay bounds for solutions of Lyapunov equations: the symmetric case. Sys. Contr. Lett., 40(2):139–144, June 2000.
- [21] J. Sabino. Solution of Large-Scale Lyapunov Equations via the Block Modified Smith Method. PhD thesis, Rice University, Houston, Texas, 2006.
- [22] D.C. Sorensen and A.C. Antoulas. The Sylvester equation and approximate balanced reduction. Linear Algebra Appl., 351-352:671–700, 2002.
- [23] G. W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, Boston, 1990.
- [24] Ji-Guang Sun. Matrix Perturbation Analysis. Academic Press, Beijing, 1987. In Chinese.
- [25] Kemin Zhou, John C. Doyle, and Keith Glover. Robust and Optimal Control. Prentice Hall, Upper Saddle River, New Jersey, 1995.