This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Time-limited 2\mathcal{H}_{2}-optimal Model Order Reduction of Linear Systems with Quadratic Outputs

Umair Zulfiqar [email protected] Zhi-Hua Xiao Qiu-Yan Song Mohammad Monir Uddin Victor Sreeram School of Electronic Information and Electrical Engineering, Yangtze University, Jingzhou, Hubei, 434023, China School of Information and Mathematics, Yangtze University, Jingzhou, Hubei, 434023, China School of Mechatronic Engineering and Automation, Shanghai University, Shanghai, 200444, China Department of Mathematics and Physics, North South University, Dhaka, 1229, Bangladesh Department of Electrical, Electronic, and Computer Engineering, The University of Western Australia, Perth, 6009, Australia
Abstract

An important class of dynamical systems with several practical applications is linear systems with quadratic outputs. These models have the same state equation as standard linear time-invariant systems but differ in their output equations, which are nonlinear quadratic functions of the system states. When dealing with models of exceptionally high order, the computational demands for simulation and analysis can become overwhelming. In such cases, model order reduction proves to be a useful technique, as it allows for constructing a reduced-order model that accurately represents the essential characteristics of the original high-order system while significantly simplifying its complexity.

In time-limited model order reduction, the main goal is to maintain the output response of the original system within a specific time range in the reduced-order model. To assess the error within this time interval, a mathematical expression for the time-limited 2\mathcal{H}_{2}-norm is derived in this paper. This norm acts as a measure of the accuracy of the reduced-order model within the specified time range. Subsequently, the necessary conditions for achieving a local optimum of the time-limited 2\mathcal{H}_{2} norm error are derived. The inherent inability to satisfy these optimality conditions within the Petrov-Galerkin projection framework is also discussed. After that, a stationary point iteration algorithm based on the optimality conditions and Petrov-Galerkin projection is proposed. Upon convergence, this algorithm fulfills three of the four optimality conditions. To demonstrate the effectiveness of the proposed algorithm, a numerical example is provided that showcases its ability to effectively approximate the original high-order model within the desired time interval.

keywords:
2\mathcal{H}_{2}-optimal, time-limited, model order reduction, projection, reduced-order model, quadratic output
journal: ArXiv.org

1 Introduction

This research work studies a special class of nonlinear dynamical systems characterized by weak nonlinearity. These systems retain linear time-invariant (LTI) state equations but incorporate quadratic nonlinear terms in their output equations, thus named linear quadratic output (LQO) systems [1]. They naturally arise in scenarios requiring the observation of quantities that involve the product of state components, either in the time or frequency domain. These systems are valuable in quantifying energy or power of dynamical systems, for example, in assessing a system’s internal energy [2] or the cost function in optimal quadratic control problems [3]. Moreover, they are used to measure deviations of state coordinates from a reference point, such as for calculating the root mean squared displacement of spatial coordinates around an excitation point or for estimating the variance of a random variable in stochastic modeling [4].

Consider an LQO system defined by the following state and output equations:

H:={x˙(t)=Ax(t)+Bu(t),y(t)=Cx(t)+[x(t)TM1x(t)x(t)TMpx(t)],\displaystyle H:=\begin{cases}\dot{x}(t)=Ax(t)+Bu(t),\\ y(t)=Cx(t)+\begin{bmatrix}x(t)^{T}M_{1}x(t)\\ \vdots\\ x(t)^{T}M_{p}x(t)\end{bmatrix},\end{cases} (1)

where x(t)N×1x(t)\in\mathbb{R}^{N\times 1} denotes the state vector, u(t)m×1u(t)\in\mathbb{R}^{m\times 1} represents inputs, y(t)p×1y(t)\in\mathbb{R}^{p\times 1} is the output vector. The matrices AN×NA\in\mathbb{R}^{N\times N}, BN×mB\in\mathbb{R}^{N\times m}, Cp×NC\in\mathbb{R}^{p\times N}, and MiN×NM_{i}\in\mathbb{R}^{N\times N} consititute the state-space realization (A,B,C,M1,,Mp)(A,B,C,M_{1},\cdots,M_{p}). The state equation is identical in form to that of a standard LTI system. However, the output equation introduces a nonlinearity through the quadratic terms x(t)TMix(t)x(t)^{T}M_{i}x(t), which distinguishes the LQO system from a standard LTI system.

Accurately modeling complex physical phenomena frequently necessitates dynamical systems of exceptionally high order, often in the range of several thousand or more. Due to this very high order NN, simulating and analyzing the model (1) becomes computationally intensive and impractical. Consequently, it becomes necessary to approximate (1) with a reduced-order model (ROM) of significantly lower order nn (where nN)n\ll N), which simplifies simulation and analysis [5]. Model order reduction (MOR) refers to the process of constructing a ROM while ensuring that the essential features and characteristics of the original model are preserved [6].

We define the nthn^{th}-order ROM of the system HH as H^\hat{H}, which is described by the state and output equations shown in (2)

H^:={xn˙(t)=A^xn(t)+B^u(t),yn(t)=C^xn(t)+[xn(t)TM1^xn(t)xn(t)TMp^xn(t)].\displaystyle\hat{H}:=\begin{cases}\dot{x_{n}}(t)=\hat{A}x_{n}(t)+\hat{B}u(t),\\ y_{n}(t)=\hat{C}x_{n}(t)+\begin{bmatrix}x_{n}(t)^{T}\hat{M_{1}}x_{n}(t)\\ \vdots\\ x_{n}(t)^{T}\hat{M_{p}}x_{n}(t)\end{bmatrix}.\end{cases} (2)

Here, A^\hat{A}, B^\hat{B}, C^\hat{C}, and M^i\hat{M}_{i} are matrices derived from the original system matrices AA, BB, CC, and MiM_{i}, respectively, through the Petrov-Galerkin projection condition W^TV^=I\hat{W}^{T}\hat{V}=I. Specifically, A^=W^TAV^\hat{A}=\hat{W}^{T}A\hat{V}, B^=W^TB\hat{B}=\hat{W}^{T}B, C^=CV^\hat{C}=C\hat{V}, and Mi^=V^TMiV^\hat{M_{i}}=\hat{V}^{T}M_{i}\hat{V}. The projection matrices V^N×n\hat{V}\in\mathbb{R}^{N\times n} and W^N×n\hat{W}\in\mathbb{R}^{N\times n} are used to project the original system HH onto a reduced subspace, resulting in the ROM H^\hat{H}. Different MOR techniques vary in their approach to constructing V^\hat{V} and W^\hat{W}, which is dependent on the specific characteristics of HH that need to be preserved in H^\hat{H} [7, 8, 9]. Throughout this paper, it is assumed that both AA and A^\hat{A} are Hurwitz matrices.

The Balanced Truncation (BT) method, developed in 1981, is a widely utilized MOR technique [10]. This method selectively retains states that significantly contribute to the energy transfer between inputs and outputs, while discarding those with minimal influence, as determined by their Hankel singular values. One key advantage of BT is its capability to estimate errors a priori before constructing the ROM [11]. Furthermore, BT ensures that the stability of the original model is maintained. Initially developed for standard LTI systems, BT has significantly expanded its applicability to encompass various system types, including descriptor systems [12], second-order systems [13], linear time-varying systems [14], parametric systems [15], nonlinear systems [16], and bilinear systems [17], among others. Additionally, BT has been adapted to preserve specific system properties, such as positive realness [18], bounded realness [19], passivity [20], and special structural characteristics [21]. For a comprehensive understanding of the diverse BT algorithm family, refer to the survey [22]. BT has been extended to LQO systems in [23, 24, 25]. Among these algorithms, only the one presented in [25] maintains the LQO structure in the ROM.

A locally optimal solution in the 2\mathcal{H}_{2} norm is achievable using several efficient algorithms. The necessary conditions for this local optimum, known as Wilson’s conditions [26], involve interpolation of the original system at selected points. The iterative rational Krylov algorithm (IRKA) is a well-known algorithm that can achieve this local optimum [27]. A more general algorithm, which does not assume the original system having simple poles, is also available [28]. This algorithm is based on Sylvester equations and its numerical properties have been improved in [29]. Recently, the 2\mathcal{H}_{2} MOR problem for LQO systems has been addressed, and an algorithm based on Sylvester equations has been proposed as a solution [30].

BT aims to approximate the original model dynamics over the entire time horizon. However, practical systems and simulations often operate within finite time intervals due to real-world constraints. For example, in interconnected power systems, low-frequency oscillations typically last around 15 seconds and are mitigated by power system stabilizers and damping controllers [31]. The first 1515 seconds are crucial for small-signal stability assessments [32]. Similarly, in finite-time optimal control problems, the system’s behavior within a specific time frame is of utmost importance [33]. This need has led to the development of time-limited MOR, which focuses on achieving good accuracy within a specified time interval rather than the entire time domain [34].

The time-limited MOR problem focuses on developing a ROM that guarantees a small deviation between the outputs of the original model y(t)y(t) and the ROM’s output yn(t)y_{n}(t) for a given input signal u(t)u(t), but only within a specified limited time interval [0,τ][0,\tau] seconds. In other words, the goal is to ensure that the approximation error y(t)yn(t)\|y(t)-y_{n}(t)\| remains minimal within this specified time frame [35].

To address the time-limited MOR problem, BT has been adapted into the time-limited BT (TLBT) algorithm [34]. Although TLBT does not fully retain all characteristics of traditional BT, such as stability guarantees and a priori error bounds, it effectively addresses the time-limited MOR scenario. The computational aspects of TLBT and strategies for handling large-scale systems have been explored in [36]. Furthermore, TLBT has been extended to handle descriptor systems [36], second-order systems [37], and bilinear systems [38], enhancing its applicability. Recently, TLBT has also been extended to LQO systems in [39].

In [40], a new norm called the time-limited 2\mathcal{H}_{2} norm is defined, and the conditions needed to find a local optimum for this norm are derived. An algorithm called time-limited IRKA (TLIRKA) is then proposed based on IRKA to construct a local optimum, but it does not meet any necessary conditions. In [41], new optimality conditions based on interpolation are derived, and a nonlinear optimization algorithm is proposed to construct the local optimum, whose applicability is limited to medium-scale systems. This paper studies the time-limited 2\mathcal{H}_{2}-optimal MOR problem for LQO systems.

The key contributions of this research work are manifold. First, it defines the time-limited 2\mathcal{H}_{2} norm (2,τ\mathcal{H}_{2,\tau} norm) for LQO systems and demonstrates its computation using time-limited system Gramians defined in [39]. Second, it derives the necessary condition for achieving a local optimum of HH^2,τ2||H-\hat{H}||_{\mathcal{H}_{2,\tau}}^{2}. Third, these conditions are then contrasted with those related to the standard 2\mathcal{H}_{2}-optimal MOR [30]. Notably, it is shown that Petrov-Galerkin projection cannot, in general, attain a local optimum in the time-limited setting. Fourth, a stationary point algorithm, rooted in the Petrov-Galerkin projection, is proposed. Upon convergence, this algorithm satisfies three of the four necessary conditions for optimality. An illustrative numerical example is presented to demonstrate the accuracy of the proposed algorithm within the specified time interval.

2 Literature Review

In this section, we will briefly discuss two of the most relevant MOR algorithms for LQO systems in the context of the problem under consideration. The first algorithm is the 2\mathcal{H}_{2}-optimal MOR method [30], and the second is the TLBT [39].

2.1 2\mathcal{H}_{2}-optimal MOR Algorithm (HOMORA) [30]

Let us define the matrices P~\tilde{P}, P^\hat{P}, Y~\tilde{Y}, Y^\hat{Y}, Z~\tilde{Z}, Z^\hat{Z}, Q~\tilde{Q}, and Q^\hat{Q} that solve the following set of linear matrix equations:

AP~+P~A^T+BB^T=0,\displaystyle A\tilde{P}+\tilde{P}\hat{A}^{T}+B\hat{B}^{T}=0,
A^P^+P^A^T+B^B^T=0,\displaystyle\hat{A}\hat{P}+\hat{P}\hat{A}^{T}+\hat{B}\hat{B}^{T}=0,
ATY~+Y~A^+CTC^=0,\displaystyle A^{T}\tilde{Y}+\tilde{Y}\hat{A}+C^{T}\hat{C}=0,
A^TY^+Y^A^+C^TC^=0,\displaystyle\hat{A}^{T}\hat{Y}+\hat{Y}\hat{A}+\hat{C}^{T}\hat{C}=0,
ATZ~+Z~A^+i=1pMiP~Mi^=0,\displaystyle A^{T}\tilde{Z}+\tilde{Z}\hat{A}+\sum_{i=1}^{p}M_{i}\tilde{P}\hat{M_{i}}=0,
A^TZ^+Z^A^+i=1pMi^P^Mi^=0,\displaystyle\hat{A}^{T}\hat{Z}+\hat{Z}\hat{A}+\sum_{i=1}^{p}\hat{M_{i}}\hat{P}\hat{M_{i}}=0,
ATQ~+Q~A^+CTC^+i=1pMiP~Mi^=0,\displaystyle A^{T}\tilde{Q}+\tilde{Q}\hat{A}+C^{T}\hat{C}+\sum_{i=1}^{p}M_{i}\tilde{P}\hat{M_{i}}=0,
A^TQ^+Q^A^+C^TC^+i=1pMi^P^Mi^=0.\displaystyle\hat{A}^{T}\hat{Q}+\hat{Q}\hat{A}+\hat{C}^{T}\hat{C}+\sum_{i=1}^{p}\hat{M_{i}}\hat{P}\hat{M_{i}}=0.

According to [30], the necessary conditions for the local optimum of the (squared) 2\mathcal{H}_{2}-norm of the error denoted as HH^22||H-\hat{H}||_{\mathcal{H}_{2}}^{2}, are given by the following set of equations:

(Y~+2Z~)TP~+(Y^+2Z^)P^\displaystyle-(\tilde{Y}+2\tilde{Z})^{T}\tilde{P}+(\hat{Y}+2\hat{Z})\hat{P} =0,\displaystyle=0, (3)
P~TMiP~+P^Mi^P^\displaystyle-\tilde{P}^{T}M_{i}\tilde{P}+\hat{P}\hat{M_{i}}\hat{P} =0,\displaystyle=0, (4)
(Y~+2Z~)TB+(Y^+2Z^)B^\displaystyle-(\tilde{Y}+2\tilde{Z})^{T}B+(\hat{Y}+2\hat{Z})\hat{B} =0,\displaystyle=0, (5)
CP~+C^P^\displaystyle-C\tilde{P}+\hat{C}\hat{P} =0.\displaystyle=0. (6)

Furthermore, it is shown that these optimality conditions can be achieved by setting the projection matrices as V^=P~\hat{V}=\tilde{P} and W^=(Y~+2Z~)(P~T(Y~+2Z~))1\hat{W}=(\tilde{Y}+2\tilde{Z})\big{(}\tilde{P}^{T}(\tilde{Y}+2\tilde{Z})\big{)}^{-1}. Starting with an initial guess of the ROM, the projection matrices are iteratively updated until convergence is reached, at which point the optimality conditions (3)-(6) are satisfied.

2.2 Time-limited Balanced Truncation (TLBT) [39]

TLBT constructs the ROM by identifying and truncating the states that have minimal contribution to the input-output energy transfer within the desired time interval [0,τ][0,\tau] seconds. This is achieved by first constructing a time-limited balanced realization using time-limited Gramians, and then truncating the states that correspond to the smallest time-limited Hankel singular values.

The time-limited controllability Gramian PτP_{\tau} within the desired time interval [0,τ][0,\tau] seconds is defined as

Pτ=0τeAtBBTeATt𝑑t.\displaystyle P_{\tau}=\int_{0}^{\tau}e^{At}BB^{T}e^{A^{T}t}dt.

By defining SτS_{\tau} as Sτ=eAτS_{\tau}=e^{A\tau}, PτP_{\tau} can be found by solving the following Lyapunov equation:

APτ+PτAT+BBTSτBBTSτT\displaystyle AP_{\tau}+P_{\tau}A^{T}+BB^{T}-S_{\tau}BB^{T}S_{\tau}^{T} =0.\displaystyle=0.

The time-limited observability Gramian QτQ_{\tau} within the time interval [0,τ][0,\tau] sec is defined as the sum of YτY_{\tau} and ZτZ_{\tau} where YτY_{\tau} and ZτZ_{\tau} are given by

Yτ\displaystyle Y_{\tau} =0τeATtCTCeAt𝑑t,\displaystyle=\int_{0}^{\tau}e^{A^{T}t}C^{T}Ce^{At}dt,
Zτ\displaystyle Z_{\tau} =0τeATt1(i=1pMi(0τeAt2BBTeATt2𝑑t2)Mi)eAt1𝑑t1.\displaystyle=\int_{0}^{\tau}e^{A^{T}t_{1}}\Bigg{(}\sum_{i=1}^{p}M_{i}\Big{(}\int_{0}^{\tau}e^{At_{2}}BB^{T}e^{A^{T}t_{2}}dt_{2}\Big{)}M_{i}\Bigg{)}e^{At_{1}}dt_{1}.

YτY_{\tau}, ZτZ_{\tau}, and QτQ_{\tau} can be computed through the solution of following Lyapunov equations:

ATYτ+YτA+CTCSτTCTCSτ\displaystyle A^{T}Y_{\tau}+Y_{\tau}A+C^{T}C-S_{\tau}^{T}C^{T}CS_{\tau} =0,\displaystyle=0,
ATZτ+ZτA+i=1p(MiPτMiSτTMiPτMiSτ)\displaystyle A^{T}Z_{\tau}+Z_{\tau}A+\sum_{i=1}^{p}\big{(}M_{i}P_{\tau}M_{i}-S_{\tau}^{T}M_{i}P_{\tau}M_{i}S_{\tau}\big{)} =0,\displaystyle=0,
ATQτ+QτA+CTCSτTCTCSτ+i=1p(MiPτMiSτTMiPτMiSτ)\displaystyle A^{T}Q_{\tau}+Q_{\tau}A+C^{T}C-S_{\tau}^{T}C^{T}CS_{\tau}+\sum_{i=1}^{p}\big{(}M_{i}P_{\tau}M_{i}-S_{\tau}^{T}M_{i}P_{\tau}M_{i}S_{\tau}\big{)} =0.\displaystyle=0.

The time-limited Hankel singular values, denoted by σi\sigma_{i}, are computed as the square root of the eigenvalues of the product of matrices PτP_{\tau} and QτQ_{\tau}:

σi=λi(PτQτ)fori=1,,N,\displaystyle\sigma_{i}=\sqrt{\lambda_{i}(P_{\tau}Q_{\tau})}\hskip 14.22636pt\textnormal{for}\hskip 14.22636pti=1,\cdots,N,

wherein λi()\lambda_{i}(\cdot) represents the eigenvalues. The projection matrices in TLBT are determined such that the transformed matrices W^TPτW^\hat{W}^{T}P_{\tau}\hat{W} and V^TQτV^\hat{V}^{T}Q_{\tau}\hat{V} become diagonal matrices with the nn largest time-limited Hankel singular values of HH, i.e., W^TPτW^=V^TQτV^=diag(σ1,,σn)\hat{W}^{T}P_{\tau}\hat{W}=\hat{V}^{T}Q_{\tau}\hat{V}=diag(\sigma_{1},\cdots,\sigma_{n}).

3 Main Work

This section introduces the time-limited 2\mathcal{H}_{2} norm and its connection to the time-limited observability Gramian. Subsequently, necessary conditions for the local optimality of the squared time-limited 2\mathcal{H}_{2} norm of the error are derived. Based on these optimality conditions, a Petrov-Galerkin projection-based iterative algorithm is proposed that satisfies three of the four optimality conditions. The challenges associated with fulfilling the remaining optimality condition within the Petrov-Galerkin projection framework are also discussed. Finally, the computational efficiency of the proposed algorithm is briefly analyzed.

3.1 2,τ\mathcal{H}_{2,\tau} norm Definition

The classical 2\mathcal{H}_{2} norm for LQO systems is defined in the time domain as:

H2\displaystyle||H||_{\mathcal{H}_{2}} =[trace(0h1T(t)h1(t)dt\displaystyle=\Bigg{[}trace\Big{(}\int_{0}^{\infty}h_{1}^{T}(t)h_{1}(t)dt
+00i=1ph2,iT(t1,t2)h2,i(t1,t2)dt1dt2)]12,\displaystyle\hskip 42.67912pt+\int_{0}^{\infty}\int_{0}^{\infty}\sum_{i=1}^{p}h_{2,i}^{T}(t_{1},t_{2})h_{2,i}(t_{1},t_{2})dt_{1}dt_{2}\Big{)}\Bigg{]}^{-\frac{1}{2}},

where h1(t)=CeAtBh_{1}(t)=Ce^{At}B and h2,i(t1,t2)=BTeATt1MieAt2Bh_{2,i}(t_{1},t_{2})=B^{T}e^{A^{T}t_{1}}M_{i}e^{At_{2}}B. The 2\mathcal{H}_{2} norm measures the output power in response to unit white noise over the entire time horizon. However, for the problem at hand, we focus on the output power within a specific time interval, leading to the concept of the time-limited 2\mathcal{H}_{2} norm.

Definition 3.1.

The time-limited 2\mathcal{H}_{2} norm of the LQO system within the time interval [0,τ][0,\tau] sec is defined as:

H2,τ\displaystyle||H||_{\mathcal{H}_{2,\tau}} =[trace(0τh1T(t)h1(t)dt\displaystyle=\Big{[}trace\Big{(}\int_{0}^{\tau}h_{1}^{T}(t)h_{1}(t)dt
+0τ0τi=1ph2,iT(t1,t2)h2,i(t1,t2)dt1dt2)]12.\displaystyle\hskip 42.67912pt+\int_{0}^{\tau}\int_{0}^{\tau}\sum_{i=1}^{p}h_{2,i}^{T}(t_{1},t_{2})h_{2,i}(t_{1},t_{2})dt_{1}dt_{2}\Big{)}\Big{]}^{-\frac{1}{2}}.
Proposition 3.2.

The 2,τ\mathcal{H}_{2,\tau} norm is related to the time-limited observability Gramian QτQ_{\tau} as follows:

H2,τ=trace(BTQτB).\displaystyle||H||_{\mathcal{H}_{2,\tau}}=\sqrt{trace(B^{T}Q_{\tau}B)}.
Proof.

Observe that

trace(0τh1T(t)h1(t)𝑑t)\displaystyle trace\Big{(}\int_{0}^{\tau}h_{1}^{T}(t)h_{1}(t)dt\Big{)}
=trace(BT[0τeATtCTCeAt]Bdt)\displaystyle=trace\Big{(}B^{T}\Big{[}\int_{0}^{\tau}e^{A^{T}t}C^{T}Ce^{At}\Big{]}Bdt\Big{)}
=trace(BTYτB).\displaystyle=trace(B^{T}Y_{\tau}B).

Furthermore, notice that

trace(0τ0τi=1ph2,iT(t1,t2)h2,i(t1,t2)dt1dt2)\displaystyle trace\Big{(}\int_{0}^{\tau}\int_{0}^{\tau}\sum_{i=1}^{p}h_{2,i}^{T}(t_{1},t_{2})h_{2,i}(t_{1},t_{2})dt_{1}dt_{2}\Big{)}
=trace(BT[0τeATt2(i=1pMi(0τeAt1BBTeATt1𝑑t1)Mi)eAt2𝑑t2]B)\displaystyle=trace\Bigg{(}B^{T}\Big{[}\int_{0}^{\tau}e^{A^{T}t_{2}}\Big{(}\sum_{i=1}^{p}M_{i}\Big{(}\int_{0}^{\tau}e^{At_{1}}BB^{T}e^{A^{T}t_{1}}dt_{1}\Big{)}M_{i}\Big{)}e^{At_{2}}dt_{2}\Big{]}B\Bigg{)}
=trace(BTZτB).\displaystyle=trace(B^{T}Z_{\tau}B).

Therefore, H2,τ=trace(BT(Yτ+Zτ)B)=trace(BT(Qτ)B)||H||_{\mathcal{H}_{2,\tau}}=\sqrt{trace\big{(}B^{T}(Y_{\tau}+Z_{\tau})B\big{)}}=\sqrt{trace\big{(}B^{T}(Q_{\tau})B\big{)}}. ∎

3.2 2,τ\mathcal{H}_{2,\tau} Norm of the Error

Let us define the error system E=HH^E=H-\hat{H} with the following state-space representation:

E:={xe˙(t)=[x(t)xn(t)]=Aexe(t)+Beu(t),ye(t)=y(t)yn(t)=Cexe(t)+[xe(t)TMe,1xe(t)xe(t)TMe,pxe(t)],\displaystyle E:=\begin{cases}\dot{x_{e}}(t)=\begin{bmatrix}x(t)\\ x_{n}(t)\end{bmatrix}=A_{e}x_{e}(t)+B_{e}u(t),\\ y_{e}(t)=y(t)-y_{n}(t)=C_{e}x_{e}(t)+\begin{bmatrix}x_{e}(t)^{T}M_{e,1}x_{e}(t)\\ \vdots\\ x_{e}(t)^{T}M_{e,p}x_{e}(t)\end{bmatrix},\end{cases}

where

Ae\displaystyle A_{e} =[A00A^],\displaystyle=\begin{bmatrix}A&0\\ 0&\hat{A}\end{bmatrix}, Be\displaystyle B_{e} =[BB^],\displaystyle=\begin{bmatrix}B\\ \hat{B}\end{bmatrix}, Me,i\displaystyle M_{e,i} =[Mi00Mi^],\displaystyle=\begin{bmatrix}M_{i}&0\\ 0&-\hat{M_{i}}\end{bmatrix}, Ce\displaystyle C_{e} =[CC^].\displaystyle=\begin{bmatrix}C&-\hat{C}\end{bmatrix}. (7)

Let Se,τS_{e,\tau} be defined as Se,τ=eAeτS_{e,\tau}=e^{A_{e}\tau}. The time-limited controllability Gramian Pe,τP_{e,\tau} and the time-limited observability Gramian Qe,τ=Ye,τ+Ze,τQ_{e,\tau}=Y_{e,\tau}+Z_{e,\tau} of the realization (Ae,Be,Ce,Me,1,,Me,p)(A_{e},B_{e},C_{e},M_{e,1},\cdots,M_{e,p}) can be obtained by solving the following Lyapunov equations:

AePe,τ+Pe,τAeT\displaystyle\hskip 28.45274ptA_{e}P_{e,\tau}+P_{e,\tau}A_{e}^{T} +BeBeTSe,τBeBeTSe,τT=0,\displaystyle+B_{e}B_{e}^{T}-S_{e,\tau}B_{e}B_{e}^{T}S_{e,\tau}^{T}=0,
AeTYe,τ+Ye,τAe\displaystyle A_{e}^{T}Y_{e,\tau}+Y_{e,\tau}A_{e} +CeTCeSe,τTCeTCeSe,τ=0,\displaystyle+C_{e}^{T}C_{e}-S_{e,\tau}^{T}C_{e}^{T}C_{e}S_{e,\tau}=0,
AeTZe,τ+Ze,τAe\displaystyle A_{e}^{T}Z_{e,\tau}+Z_{e,\tau}A_{e} +i=1p(Me,iPe,τMe,i\displaystyle+\sum_{i=1}^{p}\big{(}M_{e,i}P_{e,\tau}M_{e,i}
Se,τTMe,iPe,τMe,iSe,τ)=0,\displaystyle\hskip 14.22636pt-S_{e,\tau}^{T}M_{e,i}P_{e,\tau}M_{e,i}S_{e,\tau}\big{)}=0,
AeTQe,τ+Qe,τAe\displaystyle A_{e}^{T}Q_{e,\tau}+Q_{e,\tau}A_{e} +CeTCeSe,τTCeTCeSe,τ\displaystyle+C_{e}^{T}C_{e}-S_{e,\tau}^{T}C_{e}^{T}C_{e}S_{e,\tau}
+i=1p(Me,iPe,τMe,iSe,τTMe,iPe,τMe,iSe,τ)=0.\displaystyle\hskip 14.22636pt+\sum_{i=1}^{p}\big{(}M_{e,i}P_{e,\tau}M_{e,i}-S_{e,\tau}^{T}M_{e,i}P_{e,\tau}M_{e,i}S_{e,\tau}\big{)}=0.

Partition Pe,τP_{e,\tau}, Ye,τY_{e,\tau}, Ze,τZ_{e,\tau}, and Qe,τQ_{e,\tau} according to (7) as follows:

Pe,τ\displaystyle P_{e,\tau} =[PτP~τP~τTP^τ],\displaystyle=\begin{bmatrix}P_{\tau}&\tilde{P}_{\tau}\\ \tilde{P}_{\tau}^{T}&\hat{P}_{\tau}\end{bmatrix}, Ye,τ\displaystyle Y_{e,\tau} =[YτY~τY~τTY^τ],\displaystyle=\begin{bmatrix}Y_{\tau}&-\tilde{Y}_{\tau}\\ -\tilde{Y}_{\tau}^{T}&\hat{Y}_{\tau}\end{bmatrix},
Ze,τ\displaystyle Z_{e,\tau} =[ZτZ~τZ~τTZ^τ],\displaystyle=\begin{bmatrix}Z_{\tau}&-\tilde{Z}_{\tau}\\ -\tilde{Z}_{\tau}^{T}&\hat{Z}_{\tau}\end{bmatrix}, Qe,τ\displaystyle Q_{e,\tau} =[QτQ~τQ~τTQ^τ].\displaystyle=\begin{bmatrix}Q_{\tau}&-\tilde{Q}_{\tau}\\ -\tilde{Q}_{\tau}^{T}&\hat{Q}_{\tau}\end{bmatrix}.

By defining S^τ=eA^τ\hat{S}_{\tau}=e^{\hat{A}\tau}, it can be verified that the following linear matrix equations hold:

AP~τ+P~τA^T\displaystyle A\tilde{P}_{\tau}+\tilde{P}_{\tau}\hat{A}^{T} +BB^TSτBB^TS^τT=0,\displaystyle+B\hat{B}^{T}-S_{\tau}B\hat{B}^{T}\hat{S}_{\tau}^{T}=0, (8)
A^P^τ+P^τA^T\displaystyle\hat{A}\hat{P}_{\tau}+\hat{P}_{\tau}\hat{A}^{T} +B^B^TS^τB^B^TS^τT=0,\displaystyle+\hat{B}\hat{B}^{T}-\hat{S}_{\tau}\hat{B}\hat{B}^{T}\hat{S}_{\tau}^{T}=0, (9)
ATY~τ+Y~τA^\displaystyle A^{T}\tilde{Y}_{\tau}+\tilde{Y}_{\tau}\hat{A} +CTC^SτTCTC^S^τ=0,\displaystyle+C^{T}\hat{C}-S_{\tau}^{T}C^{T}\hat{C}\hat{S}_{\tau}=0, (10)
A^TY^τ+Y^τA^\displaystyle\hat{A}^{T}\hat{Y}_{\tau}+\hat{Y}_{\tau}\hat{A} +C^TC^S^τTC^TC^S^τ=0,\displaystyle+\hat{C}^{T}\hat{C}-\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\hat{S}_{\tau}=0, (11)
ATZ~τ+Z~τA^\displaystyle A^{T}\tilde{Z}_{\tau}+\tilde{Z}_{\tau}\hat{A} +i=1p(MiP~τMi^\displaystyle+\sum_{i=1}^{p}\big{(}M_{i}\tilde{P}_{\tau}\hat{M_{i}}
SτTMiP~τMi^S^τ)=0,\displaystyle-S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{M_{i}}\hat{S}_{\tau}\big{)}=0, (12)
A^TZ^τ+Z^τA^\displaystyle\hat{A}^{T}\hat{Z}_{\tau}+\hat{Z}_{\tau}\hat{A} +i=1p(Mi^P^τMi^\displaystyle+\sum_{i=1}^{p}\big{(}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}
S^τTMi^P^τMi^S^τ)=0,\displaystyle-\hat{S}_{\tau}^{T}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\hat{S}_{\tau}\big{)}=0, (13)
ATQ~τ+Q~τA^\displaystyle A^{T}\tilde{Q}_{\tau}+\tilde{Q}_{\tau}\hat{A} +CTC^SτTCTC^S^τ\displaystyle+C^{T}\hat{C}-S_{\tau}^{T}C^{T}\hat{C}\hat{S}_{\tau}
+i=1p(MiP~τMi^SτTMiP~τMi^S^τ)=0,\displaystyle+\sum_{i=1}^{p}\big{(}M_{i}\tilde{P}_{\tau}\hat{M_{i}}-S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{M_{i}}\hat{S}_{\tau}\big{)}=0, (14)
A^TQ^τ+Q^τA^\displaystyle\hat{A}^{T}\hat{Q}_{\tau}+\hat{Q}_{\tau}\hat{A} +C^TC^S^τTC^TC^S^τ\displaystyle+\hat{C}^{T}\hat{C}-\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\hat{S}_{\tau}
+i=1p(Mi^P^τMi^S^τTMi^P^τMi^S^τ)=0.\displaystyle+\sum_{i=1}^{p}\big{(}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}-\hat{S}_{\tau}^{T}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\hat{S}_{\tau}\big{)}=0. (15)

Consequently, the 2,τ\mathcal{H}_{2,\tau} norm of EE can be expressed as:

E2,τ\displaystyle||E||_{\mathcal{H}_{2,\tau}} =trace(BeTQe,τBe)\displaystyle=\sqrt{trace(B_{e}^{T}Q_{e,\tau}B_{e})}
=trace(BTQτB2BTQ~τB^+B^TQ^τB^).\displaystyle=\sqrt{trace(B^{T}Q_{\tau}B-2B^{T}\tilde{Q}_{\tau}\hat{B}+\hat{B}^{T}\hat{Q}_{\tau}\hat{B})}.
Corollary 3.3.

E2,τ2=H2,τ22H,H^2,τ+H^2,τ2,||E||_{\mathcal{H}_{2,\tau}}^{2}=||H||_{\mathcal{H}_{2,\tau}}^{2}-2\langle H,\hat{H}\rangle_{\mathcal{H}_{2,\tau}}+||\hat{H}||_{\mathcal{H}_{2,\tau}}^{2}, wherein H,H^2,τ\langle H,\hat{H}\rangle_{\mathcal{H}_{2,\tau}} represents the 2,τ\mathcal{H}_{2,\tau} inner product of HH and H^\hat{H}.

Proof.

The first and last terms in the expression for E2,τ2||E||_{\mathcal{H}_{2,\tau}}^{2} are straightforward. Our focus lies in showing that the middle term corresponds to the 2,τ\mathcal{H}_{2,\tau} inner product of HH and H^\hat{H}. By expanding the inner product definition, we can express it as:

H,H^2,τ\displaystyle\langle H,\hat{H}\rangle_{\mathcal{H}_{2,\tau}} =0τtrace(h1T(t)h^1(t)dt)\displaystyle=\int_{0}^{\tau}trace\big{(}h_{1}^{T}(t)\hat{h}_{1}(t)dt\big{)}
+trace(0τ0τi=1p(h2,iT(t1,t2)h^2,i(t1,t2))dt1dt2),\displaystyle\hskip 28.45274pt+trace\Big{(}\int_{0}^{\tau}\int_{0}^{\tau}\sum_{i=1}^{p}\big{(}h_{2,i}^{T}(t_{1},t_{2})\hat{h}_{2,i}(t_{1},t_{2})\big{)}dt_{1}dt_{2}\Big{)},

wherein h^1(t)=C^eA^tB^\hat{h}_{1}(t)=\hat{C}e^{\hat{A}t}\hat{B} and h^2,i(t1,t2)=B^TeA^Tt1M^ieA^t2B^\hat{h}_{2,i}(t_{1},t_{2})=\hat{B}^{T}e^{\hat{A}^{T}t_{1}}\hat{M}_{i}e^{\hat{A}t_{2}}\hat{B}. Furthermore,

trace(0τh1T(t)h^1(t)𝑑t)\displaystyle trace\Big{(}\int_{0}^{\tau}h_{1}^{T}(t)\hat{h}_{1}(t)dt\Big{)} =\displaystyle=
trace(BT(0τ\displaystyle trace\Big{(}B^{T}\big{(}\int_{0}^{\tau} eATtCTC^eA^dt)B^),\displaystyle e^{A^{T}t}C^{T}\hat{C}e^{\hat{A}}dt\big{)}\hat{B}\Big{)},
trace(0τ0τi=1ph2,iT(t1,t2)h^2,i(t1,t2)dt1dt2)\displaystyle trace\Big{(}\int_{0}^{\tau}\int_{0}^{\tau}\sum_{i=1}^{p}h_{2,i}^{T}(t_{1},t_{2})\hat{h}_{2,i}(t_{1},t_{2})dt_{1}dt_{2}\Big{)} =\displaystyle=
trace(BT[0τeATt1(i=1pMi(0τ\displaystyle trace\Bigg{(}B^{T}\Big{[}\int_{0}^{\tau}e^{A^{T}t_{1}}\Big{(}\sum_{i=1}^{p}M_{i}\big{(}\int_{0}^{\tau} eAt2BB^TeA^Tt2dt2)M^i)eA^t1]B^),\displaystyle e^{At_{2}}B\hat{B}^{T}e^{\hat{A}^{T}t_{2}}dt_{2}\big{)}\hat{M}_{i}\Big{)}e^{\hat{A}t_{1}}\Big{]}\hat{B}\Bigg{)},

The Sylvester equations (10) and (12) can be solved by computing the following integrals:

Y~τ\displaystyle\tilde{Y}_{\tau} =0τeATtCTC^eA^t𝑑t,\displaystyle=\int_{0}^{\tau}e^{A^{T}t}C^{T}\hat{C}e^{\hat{A}t}dt,
Z~τ\displaystyle\tilde{Z}_{\tau} =0τeATt1(i=1pMi(0τeAt2BB^TeA^Tt2𝑑t2)M^i)eA^t1𝑑t1;\displaystyle=\int_{0}^{\tau}e^{A^{T}t_{1}}\Bigg{(}\sum_{i=1}^{p}M_{i}\Big{(}\int_{0}^{\tau}e^{At_{2}}B\hat{B}^{T}e^{\hat{A}^{T}t_{2}}dt_{2}\Big{)}\hat{M}_{i}\Bigg{)}e^{\hat{A}t_{1}}dt_{1};

cf. [25, 39] Consequently, the 2,τ\mathcal{H}_{2,\tau} inner product between HH and H^\hat{H} can be expressed as

H,H^2,τ=trace(BT(Y~τ+Z~τ)B^)=trace(BTQ~τB^).\displaystyle\langle H,\hat{H}\rangle_{\mathcal{H}_{2,\tau}}=trace\big{(}B^{T}(\tilde{Y}_{\tau}+\tilde{Z}_{\tau})\hat{B}\big{)}=trace(B^{T}\tilde{Q}_{\tau}\hat{B}).

3.3 Optimality Conditions

This subsection establishes the necessary conditions for minimizing the squared 2,τ\mathcal{H}_{2,\tau}-norm of the error. To achieve this, several auxiliary variables are introduced. Specifically, Z¯τ\bar{Z}_{\tau} and Z¯n,τ\bar{Z}_{n,\tau} are defined as the solutions to the following linear matrix equations:

ATZ¯τ+Z¯τA^+i=1pMiP~τMi^=0,\displaystyle A^{T}\bar{Z}_{\tau}+\bar{Z}_{\tau}\hat{A}+\sum_{i=1}^{p}M_{i}\tilde{P}_{\tau}\hat{M_{i}}=0,
A^TZ¯n,τ+Z¯n,τA^+i=1pMi^P^τMi^=0.\displaystyle\hat{A}^{T}\bar{Z}_{n,\tau}+\bar{Z}_{n,\tau}\hat{A}+\sum_{i=1}^{p}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}=0.

Note that P~τ\tilde{P}_{\tau}, P^τ\hat{P}_{\tau}, Z~τ\tilde{Z}_{\tau}, and Z^τ\hat{Z}_{\tau} are truncated versions of P~\tilde{P}, P^\hat{P}, Z¯τ\bar{Z}_{\tau}, and Z¯n,τ\bar{Z}_{n,\tau}, respectively, with integration limits restricted to [0,τ][0,\tau]. Additionally, we define P~12\tilde{P}_{12}, P~n\tilde{P}_{n}, Z~12\tilde{Z}_{12}, and Z~n\tilde{Z}_{n} as follows:

P~12\displaystyle\tilde{P}_{12} =P~|τ=P~|0P~|0τ=P~P~τ,\displaystyle=\tilde{P}\Big{|}_{\tau}^{\infty}=\tilde{P}\Big{|}_{0}^{\infty}-\tilde{P}\Big{|}_{0}^{\tau}=\tilde{P}-\tilde{P}_{\tau}, (16)
P~n\displaystyle\tilde{P}_{n} =P^|τ=P^|0P^|0τ=P^P^τ,\displaystyle=\hat{P}\Big{|}_{\tau}^{\infty}=\hat{P}\Big{|}_{0}^{\infty}-\hat{P}\Big{|}_{0}^{\tau}=\hat{P}-\hat{P}_{\tau}, (17)
Z~12\displaystyle\tilde{Z}_{12} =Z¯τ|τ=Z¯τ|0Z¯τ|0τ=Z¯τZ~τ,\displaystyle=\bar{Z}_{\tau}\Big{|}_{\tau}^{\infty}=\bar{Z}_{\tau}\Big{|}_{0}^{\infty}-\bar{Z}_{\tau}\Big{|}_{0}^{\tau}=\bar{Z}_{\tau}-\tilde{Z}_{\tau}, (18)
Z~n\displaystyle\tilde{Z}_{n} =Z¯n,τ|τ=Z¯n,τ|0Z¯n,τ|0τ=Z¯n,τZ^τ.\displaystyle=\bar{Z}_{n,\tau}\Big{|}_{\tau}^{\infty}=\bar{Z}_{n,\tau}\Big{|}_{0}^{\infty}-\bar{Z}_{n,\tau}\Big{|}_{0}^{\tau}=\bar{Z}_{n,\tau}-\hat{Z}_{\tau}. (19)

Furthermore, let us define VV, WW, and LτL_{\tau} as

V\displaystyle V =B^BTSτTZ¯τB^B^TS^τTZ¯n,τ+P~TSτTCTC^P^S^τTC^C^T\displaystyle=\hat{B}B^{T}S_{\tau}^{T}\bar{Z}_{\tau}-\hat{B}\hat{B}^{T}\hat{S}_{\tau}^{T}\bar{Z}_{n,\tau}+\tilde{P}^{T}S_{\tau}^{T}C^{T}\hat{C}-\hat{P}\hat{S}_{\tau}^{T}\hat{C}\hat{C}^{T}
+i=1p(P~TSτTMiP~τMi^P^S^τTMi^P^τMi^)\displaystyle\hskip 42.67912pt+\sum_{i=1}^{p}\big{(}\tilde{P}^{T}S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{M_{i}}-\hat{P}\hat{S}_{\tau}^{T}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\big{)}
W\displaystyle W =(A^,V)=0τeA^(τt)Ve(A^+V)t𝑑t,\displaystyle=\mathcal{L}(\hat{A},V)=\int_{0}^{\tau}e^{\hat{A}(\tau-t)}Ve^{(\hat{A}+V)t}dt,
Lτ\displaystyle L_{\tau} =Q~τTP~12+Q^τP~nZ~12TP~τ+Z~nP~τ+WT,\displaystyle=-\tilde{Q}_{\tau}^{T}\tilde{P}_{12}+\hat{Q}_{\tau}\tilde{P}_{n}-\tilde{Z}_{12}^{T}\tilde{P}_{\tau}+\tilde{Z}_{n}\tilde{P}_{\tau}+W^{T},

where (A^,V)\mathcal{L}(\hat{A},V) represents the Fréchet derivative of the matrix exponential eA^τe^{\hat{A}\tau} in the direction of matrix VV.

We now present the necessary conditions for achieving a local minimum of the squared 2,τ\mathcal{H}_{2,\tau}-norm of the error.

Theorem 3.4.

A local minimum of E2,τ2||E||_{\mathcal{H}_{2,\tau}}^{2} must satisfy the following necessary conditions:

(Y~τ+2Z~τ)TP~τ+(Y^τ+2Z^τ)P^τ+Lτ\displaystyle-(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}\tilde{P}_{\tau}+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{P}_{\tau}+L_{\tau} =0,\displaystyle=0, (20)
P~τTMiP~τ+P^τMi^P^τ\displaystyle-\tilde{P}_{\tau}^{T}M_{i}\tilde{P}_{\tau}+\hat{P}_{\tau}\hat{M_{i}}\hat{P}_{\tau} =0,\displaystyle=0, (21)
(Y~τ+2Z~τ)TB+(Y^τ+2Z^τ)B^\displaystyle-(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}B+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{B} =0,\displaystyle=0, (22)
CP~τ+C^P^τ\displaystyle-C\tilde{P}_{\tau}+\hat{C}\hat{P}_{\tau} =0.\displaystyle=0. (23)
Proof.

The proof of this theorem is tedious and lengthy and is therefore deferred to Appendix A. ∎

3.4 Comparison with Local Optimum of E22||E||_{\mathcal{H}_{2}}^{2}

This subsection draws a comparison between the necessary conditions for optimizing the standard 2\mathcal{H}_{2}-norm and the time-limited 2,τ\mathcal{H}_{2,\tau}-norm of the error. To facilitate this comparison, we begin by mathematically expressing the standard 2\mathcal{H}_{2}-norm as presented in [25, 30]. The controllability Gramian, denoted as PeP_{e}, and the observability Gramian, denoted as Qe=Ye+ZeQ_{e}=Y_{e}+Z_{e}, associated with the system realization (Ae,Be,Ce,Me,1,,Me,p)(A_{e},B_{e},C_{e},M_{e,1},\dots,M_{e,p}) can be determined by solving the following Lyapunov equations:

AePe+PeAeT+BeBeT\displaystyle A_{e}P_{e}+P_{e}A_{e}^{T}+B_{e}B_{e}^{T} =0,\displaystyle=0,
AeTYe+YeAe+CeTCe\displaystyle A_{e}^{T}Y_{e}+Y_{e}A_{e}+C_{e}^{T}C_{e} =0,\displaystyle=0,
AeTZe+ZeAe+i=1pMe,iPeMe,i\displaystyle A_{e}^{T}Z_{e}+Z_{e}A_{e}+\sum_{i=1}^{p}M_{e,i}P_{e}M_{e,i} =0,\displaystyle=0,
AeTQe+QeAe+CeTCe+i=1pMe,iPeMe,i\displaystyle A_{e}^{T}Q_{e}+Q_{e}A_{e}+C_{e}^{T}C_{e}+\sum_{i=1}^{p}M_{e,i}P_{e}M_{e,i} =0.\displaystyle=0.

Partitioning PeP_{e}, YeY_{e}, ZeZ_{e}, and QeQ_{e} according to the structure outlined in (7) yields:

Pe\displaystyle P_{e} =[PP~P~TP^],\displaystyle=\begin{bmatrix}P&\tilde{P}\\ \tilde{P}^{T}&\hat{P}\end{bmatrix}, Ye\displaystyle Y_{e} =[YY~Y~TY^],\displaystyle=\begin{bmatrix}Y&-\tilde{Y}\\ -\tilde{Y}^{T}&\hat{Y}\end{bmatrix},
Ze\displaystyle Z_{e} =[ZZ~Z~TZ^],\displaystyle=\begin{bmatrix}Z&-\tilde{Z}\\ -\tilde{Z}^{T}&\hat{Z}\end{bmatrix}, Qe\displaystyle Q_{e} =[QQ~Q~TQ^].\displaystyle=\begin{bmatrix}Q&-\tilde{Q}\\ -\tilde{Q}^{T}&\hat{Q}\end{bmatrix}.

Consequently, the 2\mathcal{H}_{2}-norm of EE can be expressed as:

E2\displaystyle||E||_{\mathcal{H}_{2}} =trace(BeTQeBe)\displaystyle=\sqrt{trace(B_{e}^{T}Q_{e}B_{e})}
=trace(BTQB2BTQ~B^+B^TQ^B^).\displaystyle=\sqrt{trace(B^{T}QB-2B^{T}\tilde{Q}\hat{B}+\hat{B}^{T}\hat{Q}\hat{B})}.

A comparison of the optimality conditions (20)-(23) and (3)-(6) reveals both similarities and distinct differences. By restricting the integration limits of PeP_{e} and QeQ_{e} to the [0,τ][0,\tau] second, the optimality conditions (21)-(23) can be derived from their counterparts (4)-(6). However, the optimality condition (3) does not reduce to condition (20) through this integration limit constraint.

The optimality conditions (4)-(6) directly yield optimal expressions for Mi^\hat{M_{i}}, B^\hat{B}, and C^\hat{C} as follows:

Mi^\displaystyle\hat{M_{i}} =P^1P~TMiP~P^1,\displaystyle=\hat{P}^{-1}\tilde{P}^{T}M_{i}\tilde{P}\hat{P}^{-1}, (24)
B^\displaystyle\hat{B} =(Y^+2Z^)1(Y~+2Z~)TB,\displaystyle=(\hat{Y}+2\hat{Z})^{-1}(\tilde{Y}+2\tilde{Z})^{T}B, (25)
C^\displaystyle\hat{C} =CP~P^1.\displaystyle=C\tilde{P}\hat{P}^{-1}. (26)

By imposing the time constraint through integration limits on PeP_{e} and QeQ_{e}, these optimal expressions can be adapted for the time-limited case, resulting in Equations (27)-(29), consistent with the optimality conditions (21)-(23).

Mi^\displaystyle\hat{M_{i}} =P^τ1P~τTMiP~τP^τ1,\displaystyle=\hat{P}_{\tau}^{-1}\tilde{P}_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{P}_{\tau}^{-1}, (27)
B^\displaystyle\hat{B} =(Y^τ+2Z^τ)1(Y~τ+2Z~τ)TB,\displaystyle=(\hat{Y}_{\tau}+2\hat{Z}_{\tau})^{-1}(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}B, (28)
C^\displaystyle\hat{C} =CP~τP^τ1.\displaystyle=C\tilde{P}_{\tau}\hat{P}_{\tau}^{-1}. (29)

The optimal projection matrices V^\hat{V} and W^\hat{W} for the standard 2\mathcal{H}_{2} optimal model order reduction problem can be defined as:

V^\displaystyle\hat{V} =P~P^1,\displaystyle=\tilde{P}\hat{P}^{-1}, W^\displaystyle\hat{W} =(Y~+2Z~)(Y^+2Z^)1.\displaystyle=(\tilde{Y}+2\tilde{Z})(\hat{Y}+2\hat{Z})^{-1}.

Analogous definitions can be used for the time-limited case by replacing the original matrices with their time-limited counterparts, i.e.,

V^\displaystyle\hat{V} =P~τP^τ1,\displaystyle=\tilde{P}_{\tau}\hat{P}_{\tau}^{-1}, W^\displaystyle\hat{W} =(Y~τ+2Z~τ)(Y^τ+2Z^τ)1.\displaystyle=(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})(\hat{Y}_{\tau}+2\hat{Z}_{\tau})^{-1}.

While this approach enables optimal selections for Mi^\hat{M_{i}}, B^\hat{B}, and C^\hat{C} as per Equations (27)-(29), it falls short in determining an optimal A^\hat{A}. Enforcing the Petrov-Galerkin condition, W^TV^=I\hat{W}^{T}\hat{V}=I, ensures that the term (Y~τ+2Z~τ)TP~τ+(Y^τ+2Z^τ)P^τ-(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}\tilde{P}_{\tau}+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{P}_{\tau} becomes zero, but the residual term LτL_{\tau} generally remains nonzero, indicating a deviation from the optimality condition (20). Consequently, achieving a local optimum for the time-limited case within the Petrov-Galerkin projection framework remains generally impossible. Nevertheless, this choice of projection matrices yields exact satisfaction of the optimality conditions (21)-(23) and an approximate satisfaction the optimality condition (20).

Thus far, the optimal projection matrices V^\hat{V} and W^\hat{W} have been determined as V^=P~τP^τ1\hat{V}=\tilde{P}_{\tau}\hat{P}_{\tau}^{-1} and W^=(Y~τ+2Z~τ)(Y^τ+2Z^τ)1\hat{W}=(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})(\hat{Y}_{\tau}+2\hat{Z}_{\tau})^{-1}, respectively. However, it is important to note that these matrices depend on the ROM, (A^,B^,C^,M^1,,M^p)(\hat{A},\hat{B},\hat{C},\hat{M}_{1},...,\hat{M}_{p}), which is yet unknown. Consequently, Equations (2) and (8)-(13) form a coupled system of equations that can be represented as:

(A^,B^,C^,M1^,,Mp^)\displaystyle(\hat{A},\hat{B},\hat{C},\hat{M_{1}},\cdots,\hat{M_{p}}) =f(P~τ,P^τ,Y~τ,Y^τ,Z~τ,Z^τ),\displaystyle=f(\tilde{P}_{\tau},\hat{P}_{\tau},\tilde{Y}_{\tau},\hat{Y}_{\tau},\tilde{Z}_{\tau},\hat{Z}_{\tau}),
(P~τ,P^τ,Y~τ,Y^τ,Z~τ,Z^τ)\displaystyle(\tilde{P}_{\tau},\hat{P}_{\tau},\tilde{Y}_{\tau},\hat{Y}_{\tau},\tilde{Z}_{\tau},\hat{Z}_{\tau}) =g(A^,B^,C^,M1^,,Mp^).\displaystyle=g(\hat{A},\hat{B},\hat{C},\hat{M_{1}},\cdots,\hat{M_{p}}).

The stationary points of the composite function

(A^,B^,C^,M1^,,Mp^)=f(g(A^,B^,C^,M1^,,Mp^))\displaystyle(\hat{A},\hat{B},\hat{C},\hat{M_{1}},\cdots,\hat{M_{p}})=f\big{(}g(\hat{A},\hat{B},\hat{C},\hat{M_{1}},\cdots,\hat{M_{p}})\big{)}

satisfy the optimality conditions (21)-(23). Moreover, by imposing the Petrov-Galerkin condition, W^TV^=I\hat{W}^{T}\hat{V}=I, the optimality condition (20) is approximately fulfilled with a deviation quantified by LτL_{\tau}. In contrast, the classical 2\mathcal{H}_{2}-optimal MOR problem achieves exact satisfaction of all optimality conditions (3)-(6) through the enforcement of the Petrov-Galerkin condition on the stationary points.

In standard 2\mathcal{H}_{2}-optimal MOR, it is established that selecting projection matrices as V^=P~\hat{V}=\tilde{P} and W^=Y~+2Z~\hat{W}=\tilde{Y}+2\tilde{Z}, rather than the previously considered forms, leads to P^=I\hat{P}=I and Y^+2Z^=I\hat{Y}+2\hat{Z}=I at stationary points. Consequently, the combination of these projection matrices with the Petrov-Galerkin condition, W^TV^=I\hat{W}^{T}\hat{V}=I, ensures satisfaction of all optimality conditions (3)-(6). However, when applying the analogous approach to the time-limited case, with V^=P~τ\hat{V}=\tilde{P}_{\tau} and W^=Y~τ+2Z~τ\hat{W}=\tilde{Y}_{\tau}+2\tilde{Z}_{\tau}, and imposing the Petrov-Galerkin condition, the resulting ROM does not satisfy any optimality condition.

Theorem 3.5.

If the conditions W^TSτB=S^τB^\hat{W}^{T}S_{\tau}B=\hat{S}_{\tau}\hat{B}, CSτV^=C^S^τCS_{\tau}\hat{V}=\hat{C}\hat{S}_{\tau}, and V^TMiSτV^=Mi^S^τ\hat{V}^{T}M_{i}S_{\tau}\hat{V}=\hat{M_{i}}\hat{S}_{\tau} hold, then selecting V^=P~τ\hat{V}=\tilde{P}_{\tau} and W^=Y~τ+2Z~τ\hat{W}=\tilde{Y}_{\tau}+2\tilde{Z}_{\tau} with the Petrov-Galerkin condition W^TV^=I\hat{W}^{T}\hat{V}=I leads to P^τ=I\hat{P}_{\tau}=I and Y^τ+2Z^τ=I\hat{Y}_{\tau}+2\hat{Z}_{\tau}=I, consequently satisfying the optimality conditions (20)-(23).

Proof.

The proof is detailed in Appendix B. ∎

In general, the chosen projection matrices, V^=P~τ\hat{V}=\tilde{P}_{\tau} and W^=Y~τ+2Z~τ\hat{W}=\tilde{Y}_{\tau}+2\tilde{Z}_{\tau}, along with the Petrov-Galerkin condition W^TV^=I\hat{W}^{T}\hat{V}=I, do not satisfy the conditions W^TSτB=S^τB^\hat{W}^{T}S_{\tau}B=\hat{S}_{\tau}\hat{B}, CSτV^=C^S^τCS_{\tau}\hat{V}=\hat{C}\hat{S}_{\tau}, and V^TMiSτV^=Mi^S^τ\hat{V}^{T}M_{i}S_{\tau}\hat{V}=\hat{M_{i}}\hat{S}_{\tau}. As a result, P^τI\hat{P}_{\tau}\neq I and Y^τ+2Z^τI\hat{Y}_{\tau}+2\hat{Z}_{\tau}\neq I at the stationary points, and consequently, the optimality conditions (21)-(23) are not fulfilled.

3.5 Algorithm

For simplicity, we have thus far assumed the desired time interval begins at 0 seconds. However, for any general time interval [τ1,τ2][\tau_{1},\tau_{2}] seconds, P~τ\tilde{P}_{\tau}, P^τ\hat{P}_{\tau}, Y~τ+2Z~τ\tilde{Y}_{\tau}+2\tilde{Z}_{\tau}, and Y^τ+2Z^τ\hat{Y}_{\tau}+2\hat{Z}_{\tau} can be calculated by solving the following equations:

AP~τ+P~τA^T+eAτ1BB^TeA^Tτ1eAτ2BB^TeA^Tτ2=0,\displaystyle A\tilde{P}_{\tau}+\tilde{P}_{\tau}\hat{A}^{T}+e^{A\tau_{1}}B\hat{B}^{T}e^{\hat{A}^{T}\tau_{1}}-e^{A\tau_{2}}B\hat{B}^{T}e^{\hat{A}^{T}\tau_{2}}=0, (30)
A^P^τ+P^τA^T+eA^τ1B^B^TeA^Tτ1eA^τ2B^B^TeA^Tτ2=0,\displaystyle\hat{A}\hat{P}_{\tau}+\hat{P}_{\tau}\hat{A}^{T}+e^{\hat{A}\tau_{1}}\hat{B}\hat{B}^{T}e^{\hat{A}^{T}\tau_{1}}-e^{\hat{A}\tau_{2}}\hat{B}\hat{B}^{T}e^{\hat{A}^{T}\tau_{2}}=0, (31)
AT(Y~τ+2Z~τ)+(Y~τ+2Z~τ)A^+eATτ1CTC^eA^τ1eATτ2CTC^eA^τ2\displaystyle A^{T}(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})+(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})\hat{A}+e^{A^{T}\tau_{1}}C^{T}\hat{C}e^{\hat{A}\tau_{1}}-e^{A^{T}\tau_{2}}C^{T}\hat{C}e^{\hat{A}\tau_{2}}
+2i=1p(eATτ1MiP~τMi^eA^τ1eATτ2MiP~τMi^eA^τ2)=0,\displaystyle+2\sum_{i=1}^{p}\big{(}e^{A^{T}\tau_{1}}M_{i}\tilde{P}_{\tau}\hat{M_{i}}e^{\hat{A}\tau_{1}}-e^{A^{T}\tau_{2}}M_{i}\tilde{P}_{\tau}\hat{M_{i}}e^{\hat{A}\tau_{2}}\big{)}=0, (32)
A^T(Y^τ+2Z^τ)+(Y^τ+2Z^τ)A^+eA^Tτ1C^TC^eA^τ1eA^Tτ2C^TC^eA^τ2\displaystyle\hat{A}^{T}(\hat{Y}_{\tau}+2\hat{Z}_{\tau})+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{A}+e^{\hat{A}^{T}\tau_{1}}\hat{C}^{T}\hat{C}e^{\hat{A}\tau_{1}}-e^{\hat{A}^{T}\tau_{2}}\hat{C}^{T}\hat{C}e^{\hat{A}\tau_{2}}
+2i=1p(eA^Tτ1Mi^P^τMi^eA^τ1eA^Tτ2Mi^P^τMi^eA^τ2)=0.\displaystyle+2\sum_{i=1}^{p}\big{(}e^{\hat{A}^{T}\tau_{1}}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}e^{\hat{A}\tau_{1}}-e^{\hat{A}^{T}\tau_{2}}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}e^{\hat{A}\tau_{2}}\big{)}=0. (33)

We now present our proposed algorithm named as the “Time-limited 2\mathcal{H}_{2} Near-optimal Iterative Algorithm (TLHNOIA)”. This algorithm begins with an arbitrary initial guess of the ROM and iteratively refines it until convergence, which is indicated by minimal change in the ROM’s state-space matrices. Steps (2) and (3) calculate the projection matrices in each iteration, while steps (4)-(8) employ bi-orthogonal Gram–Schmidt method for ensuring Petrov-Galerkin condition W^TV^=I\hat{W}^{T}\hat{V}=I.

Algorithm 1 FLHNOIA

Input: Full order system: (A,B,C,M1,,Mp)(A,B,C,M_{1},\cdots,M_{p}); Desired time interval: [τ1,τ2][\tau_{1},\tau_{2}] sec; Initial guess of ROM: (A^,B^,C^,M1^,,Mp^)(\hat{A},\hat{B},\hat{C},\hat{M_{1}},\cdots,\hat{M_{p}}); Tolerance: toltol.

Output: ROM: (A^,B^,C^,M1^,,Mp^)(\hat{A},\hat{B},\hat{C},\hat{M_{1}},\cdots,\hat{M_{p}}).

1:  while(relative change in (A^,B^,C^,M1^,,Mp^)(\hat{A},\hat{B},\hat{C},\hat{M_{1}},\cdots,\hat{M_{p}}) >> toltol)
2:  Solve equations (30)-(33) to compute P~τ\tilde{P}_{\tau}, P^τ\hat{P}_{\tau}, Y~τ+2Z~τ\tilde{Y}_{\tau}+2\tilde{Z}_{\tau}, and Y^τ+2Z^τ\hat{Y}_{\tau}+2\hat{Z}_{\tau}.
3:  Set V^=P~τP^τ1\hat{V}=\tilde{P}_{\tau}\hat{P}_{\tau}^{-1} and W^=(Y~τ+2Z~τ)(Y^τ+2Z^τ)1\hat{W}=(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})(\hat{Y}_{\tau}+2\hat{Z}_{\tau})^{-1}.
4:  for l=1,,kl=1,\ldots,k do
5:  v=V^(:,l)v=\hat{V}(:,l), v=j=1l(IV^(:,j)W^(:,j)T)vv=\prod_{j=1}^{l}\big{(}I-\hat{V}(:,j)\hat{W}(:,j)^{T}\big{)}v.
6:  w=W^(:,l)w=\hat{W}(:,l), w=j=1l(IW^(:,j)V^(:,j)T)ww=\prod_{j=1}^{l}\big{(}I-\hat{W}(:,j)\hat{V}(:,j)^{T}\big{)}w.
7:  v=vv2v=\frac{v}{||v||_{2}}, w=ww2w=\frac{w}{||w||_{2}}, v=vwTvv=\frac{v}{w^{T}v}, V^(:,l)=v\hat{V}(:,l)=v, W^(:,l)=w\hat{W}(:,l)=w.
8:  end for
9:  Update A^=W^TAV^\hat{A}=\hat{W}^{T}A\hat{V}, B^=W^TB\hat{B}=\hat{W}^{T}B, C^=CV^\hat{C}=C\hat{V}, Mi^=V^TMiV^\hat{M_{i}}=\hat{V}^{T}M_{i}\hat{V}.
10:  end while
Remark 1.

To assess convergence, monitoring the stagnation of the ROM poles is a more reliable indicator than examining state-space realizations. This is because 2\mathcal{H}_{2}-optimal MOR techniques frequently generate ROMs with different state-space representations but identical transfer functions. The stagnation of ROM poles has been widely adopted as a convergence criterion in 2\mathcal{H}_{2}-optimal MOR algorithms due to its effectiveness [40].

4 Computational Aspects

This section discusses computational efficiency in implementing TLHNOIA. A key step in TLHNOIA, Step (2), involves calculating the matrix exponential eAτe^{A\tau}, which can become computationally intensive for high-order models. To mitigate this, we can utilize Krylov subspace-based methods in [36], to approximate eAτBe^{A\tau}B, CeAτCe^{A\tau}, and MieAτM_{i}e^{A\tau}. The most computationally demanding task within each iteration is solving Sylvester equations 30) and (32). Due to the sparsity of state-space matrices in high-order dynamical systems, these Sylvester equations exhibit a “sparse-dense” structure, a common feature in 2\mathcal{H}_{2}-optimal MOR algorithms. Specifically, these equations involve large sparse matrices and smaller dense matrices, i.e.,

𝒦++𝒩𝒪\displaystyle\mathcal{K}\mathcal{L}+\mathcal{L}\mathcal{M}+\mathcal{N}\mathcal{O} =0,\displaystyle=0,

wherein the large matrices 𝒦N×N\mathcal{K}\in\mathbb{R}^{N\times N} and 𝒩N×d\mathcal{N}\in\mathbb{R}^{N\times d} (dNd\ll N) are sparse, and the small matrices n×n\mathcal{M}\in\mathbb{R}^{n\times n} and 𝒪d×n\mathcal{O}\in\mathbb{R}^{d\times n} are dense. To efficiently solve such equations, the efficient algorithm proposed in [29] can be used. The remaining steps of TLHNOIA involve relatively simple matrix computations and small Lyapunov equation solutions, which are computationally inexpensive.

5 Illustrative Example

This section presents an illustrative example to verify key properties of TLHNOIA. Consider a sixth-order LQO system defined by the following state-space representation:

A\displaystyle A =[0001000000100000015.45454.545500.05450.045501021110.10.210.1105.56.500.0550.065],\displaystyle=\begin{bmatrix}0&0&0&1&0&0\\ 0&0&0&0&1&0\\ 0&0&0&0&0&1\\ -5.4545&4.5455&0&-0.0545&0.0455&0\\ 10&-21&11&0.1&-0.21&0.11\\ 0&5.5&-6.5&0&0.055&-0.065\end{bmatrix},
B\displaystyle B =[0000.09090.40.5]T,\displaystyle=\begin{bmatrix}0&0&0&0.0909&0.4&-0.5\end{bmatrix}^{T},
C\displaystyle C =[223000],\displaystyle=\begin{bmatrix}2&-2&3&0&0&0\end{bmatrix},
M1\displaystyle M_{1} =diag(0.5,0.3,0,0,0,0).\displaystyle=diag(0.5,0.3,0,0,0,0).

The desired time interval for this example is [0,0.5][0,0.5] sec. To initialize TLHNOIA, the following initial guess is used:

A^\displaystyle\hat{A} =[0.00380.87370.00460.87370.00380.00530.00540.00600.0353],\displaystyle=\begin{bmatrix}-0.0038&-0.8737&0.0046\\ 0.8737&-0.0038&0.0053\\ 0.0054&-0.0060&-0.0353\end{bmatrix}, B^\displaystyle\hat{B} =[0.35180.34720.2617]T,\displaystyle=\begin{bmatrix}0.3518&-0.3472&-0.2617\end{bmatrix}^{T},
C^\displaystyle\hat{C} =[0.34540.34050.2479],\displaystyle=\begin{bmatrix}-0.3454&-0.3405&0.2479\end{bmatrix}, M^1\displaystyle\hat{M}_{1} =[0.01130.01140.01300.01140.01160.01320.01300.01320.0271].\displaystyle=\begin{bmatrix}0.0113&0.0114&0.0130\\ 0.0114&0.0116&0.0132\\ 0.0130&0.0132&0.0271\end{bmatrix}.

TLHNOIA was stopped when the change in eigenvalues of A^\hat{A} stagnates, as change in the ROM’s state-space realization did not cease. The resulting final ROM is:

A^\displaystyle\hat{A} =[6.478018.55895.50993.16365.96762.15610.42783.17260.8857],\displaystyle=\begin{bmatrix}-6.4780&18.5589&5.5099\\ -3.1636&5.9676&2.1561\\ 0.4278&-3.1726&0.8857\end{bmatrix}, B^\displaystyle\hat{B} =[0.35330.13000.0892]T,\displaystyle=\begin{bmatrix}0.3533&0.1300&-0.0892\end{bmatrix}^{T},
C^\displaystyle\hat{C} =[0.63333.08221.9898],\displaystyle=\begin{bmatrix}-0.6333&3.0822&1.9898\end{bmatrix}, M^1\displaystyle\hat{M}_{1} =[0.07880.19230.03850.19230.48990.06380.03850.06380.0637].\displaystyle=\begin{bmatrix}0.0788&-0.1923&0.0385\\ -0.1923&0.4899&-0.0638\\ 0.0385&-0.0638&0.0637\end{bmatrix}.

The numerical results below confirm that this ROM, for all practical purposes, satisfies the optimality conditions (21)-(23).

P~τTMiP~τ+P^τMi^P^τ2\displaystyle||-\tilde{P}_{\tau}^{T}M_{i}\tilde{P}_{\tau}+\hat{P}_{\tau}\hat{M_{i}}\hat{P}_{\tau}||_{2} =4.3029×109,\displaystyle=4.3029\times 10^{-9},
(Y~τ+2Z~τ)TB+(Y^τ+2Z^τ)B^2\displaystyle||-(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}B+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{B}||_{2} =1.8855×104,\displaystyle=1.8855\times 10^{-4},
CP~τ+C^P^τ2\displaystyle||-C\tilde{P}_{\tau}+\hat{C}\hat{P}_{\tau}||_{2} =1.5081×105.\displaystyle=1.5081\times 10^{-5}.

Subsequently, a third-order ROM is generated using BT, TLBT, and HOMORA. The same initial ROM is employed to start HOMORA. Figure 1 presents the relative output error on a logarithmic scale for the input u(t)=0.01cos(2t)u(t)=0.01cos(2t) within the specified 0 to 0.50.5 sec interval. As shown, TLBT and TLHNOIA exhibit superior accuracy.

Refer to caption
Figure 1: Relative Error in the Output Response within [0,0.5][0,0.5] sec

6 Conclusion

This research investigates the 2\mathcal{H}_{2}-optimal MOR problem within a specified finite time horizon. To quantify output strength within this interval, we introduce the time-limited 2\mathcal{H}_{2} norm for LQO systems. We derive necessary conditions for achieving local optima of the squared time-limited 2\mathcal{H}_{2} norm of the error. These conditions are compared to those of the standard, unconstrained 2\mathcal{H}_{2}-optimal MOR problem. We analyze the limitations of the Petrov-Galerkin projection method in satisfying all optimality conditions for the time-limited scenario. Consequently, we propose a Petrov-Galerkin projection algorithm that satisfies three of the four optimality conditions. Numerical experiments validate our theoretical findings and demonstrate the algorithm’s effectiveness in achieving high accuracy within the desired time frame.

Appendix A

This appendix provides a proof of Theorem 3.4. Throughout the proof, the following trace properties are utilized: the invariance of the trace under matrix transposition and cyclic permutations, and the linearity of the trace operation, i.e.,

  1. 1.

    Trace of transpose: trace(STU)=trace(UTTTST)trace(STU)=trace(U^{T}T^{T}S^{T}).

  2. 2.

    Circular permutation in trace: trace(STU)=trace(UST)=trace(TUS)trace(STU)=trace(UST)=trace(TUS).

  3. 3.

    Trace of addition: trace(S+T+U)=trace(S)+trace(T)+trace(U)trace(S+T+U)=trace(S)+trace(T)+trace(U).

We define a cost function, JJ, as the part of EH2,τ2||E||_{H_{2,\tau}}^{2} that depends on the ROM:

J=trace(2BTQ~τB^+B^TQ^τB^).\displaystyle J=trace(-2B^{T}\tilde{Q}_{\tau}\hat{B}+\hat{B}^{T}\hat{Q}_{\tau}\hat{B}).

Introducing a small first-order perturbation, ΔA^\Delta_{\hat{A}} to A^\hat{A}, induces corresponding perturbations, ΔJA^\Delta_{J}^{\hat{A}}, ΔQ~τA^\Delta_{\tilde{Q}_{\tau}}^{\hat{A}}, and ΔQ^τA^\Delta_{\hat{Q}_{\tau}}^{\hat{A}}, in the cost function JJ and matrices Q~τ\tilde{Q}_{\tau} and Q^τ\hat{Q}_{\tau}, respectively. The resulting first-order change in JJ is:

ΔJA^=trace(2BTΔQ~τA^B^+B^TΔQ^τA^B^).\displaystyle\Delta_{J}^{\hat{A}}=trace(2B^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{A}}\hat{B}+\hat{B}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{A}}\hat{B}).

Furthermore, based on equations (14) and (15), we find that ΔQ~τA^\Delta_{\tilde{Q}_{\tau}}^{\hat{A}} and ΔQ^τA^\Delta_{\hat{Q}_{\tau}}^{\hat{A}} are solutions to the following Lyapunov equations:

ATΔQ~τA^+ΔQ~τA^A^+Q~τΔA^SτTCTC^ΔS^τA^\displaystyle A^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{A}}+\Delta_{\tilde{Q}_{\tau}}^{\hat{A}}\hat{A}+\tilde{Q}_{\tau}\Delta_{\hat{A}}-S_{\tau}^{T}C^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+i=1p(MiΔP~τA^Mi^SτTMiΔP~τA^Mi^S^τSτTMiP~τMi^ΔS^τA^)=0,\displaystyle\hskip 34.14322pt+\sum_{i=1}^{p}\big{(}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{A}}\hat{M_{i}}-S_{\tau}^{T}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{A}}\hat{M_{i}}\hat{S}_{\tau}-S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{M_{i}}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}=0,
A^TΔQ^τA^+ΔQ^τA^A^+(ΔA^)TQ^τ+Q^τΔA^(ΔS^τA^)TC^TC^S^τS^τTC^TC^ΔS^τA^\displaystyle\hat{A}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{A}}+\Delta_{\hat{Q}_{\tau}}^{\hat{A}}\hat{A}+(\Delta_{\hat{A}})^{T}\hat{Q}_{\tau}+\hat{Q}_{\tau}\Delta_{\hat{A}}-(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}\hat{C}^{T}\hat{C}\hat{S}_{\tau}-\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+i=1p(M^iΔP^τA^M^i(ΔS^τA^)TM^iP^τM^iS^τ\displaystyle\hskip 34.14322pt+\sum_{i=1}^{p}\big{(}\hat{M}_{i}\Delta_{\hat{P}_{\tau}}^{\hat{A}}\hat{M}_{i}-(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}\hat{M}_{i}\hat{P}_{\tau}\hat{M}_{i}\hat{S}_{\tau}
S^τTM^iΔP^τA^M^iS^τS^τTM^iP^τM^iΔS^τA^)=0,\displaystyle\hskip 34.14322pt-\hat{S}_{\tau}^{T}\hat{M}_{i}\Delta_{\hat{P}_{\tau}}^{\hat{A}}\hat{M}_{i}\hat{S}_{\tau}-\hat{S}_{\tau}^{T}\hat{M}_{i}\hat{P}_{\tau}\hat{M}_{i}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}=0,

where

ΔS^τA^=(A^,ΔA^)+o(ΔA^);\displaystyle\Delta_{\hat{S}_{\tau}}^{\hat{A}}=\mathcal{L}(\hat{A},\Delta_{\hat{A}})+o(||\Delta_{\hat{A}}||);

cf. [42]. As we focus solely on first-order perturbations, the higher-order term o(ΔA^)o(||\Delta_{\hat{A}}||) is disregarded in subsequent analysis. Now,

trace(BB^T(ΔQ~τA^)T)\displaystyle trace\Big{(}B\hat{B}^{T}(\Delta_{\tilde{Q}_{\tau}}^{\hat{A}})^{T}\Big{)} =trace((AP~P~A^T)(ΔQ~τA^)T)\displaystyle=trace\Big{(}(-A\tilde{P}-\tilde{P}\hat{A}^{T})(\Delta_{\tilde{Q}_{\tau}}^{\hat{A}})^{T}\Big{)}
=trace(AP~(ΔQ~τA^)TP~A^T(ΔQ~τA^)T)\displaystyle=trace\Big{(}-A\tilde{P}(\Delta_{\tilde{Q}_{\tau}}^{\hat{A}})^{T}-\tilde{P}\hat{A}^{T}(\Delta_{\tilde{Q}_{\tau}}^{\hat{A}})^{T}\Big{)}
=trace(P~T(ATΔQ~τA^A^ΔQ~τA^))\displaystyle=trace\Big{(}\tilde{P}^{T}(-A^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{A}}-\hat{A}\Delta_{\tilde{Q}_{\tau}}^{\hat{A}})\Big{)}
=trace(P~T(Q~τΔA^SτTCTC^ΔS^τA^\displaystyle=trace\Bigg{(}\tilde{P}^{T}\Big{(}\tilde{Q}_{\tau}\Delta_{\hat{A}}-S_{\tau}^{T}C^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+i=1p(MiΔP~τA^Mi^SτTMiΔP~τA^Mi^S^τ\displaystyle\hskip 28.45274pt+\sum_{i=1}^{p}\big{(}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{A}}\hat{M_{i}}-S_{\tau}^{T}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{A}}\hat{M_{i}}\hat{S}_{\tau}
SτTMiP~τMi^ΔS^τA^)))\displaystyle\hskip 28.45274pt-S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{M_{i}}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}\Big{)}\Bigg{)}
=trace(Q~τP~(ΔA^)TC^TCSτP~(ΔS^τA^)T\displaystyle=trace\Big{(}\tilde{Q}_{\tau}\tilde{P}(\Delta_{\hat{A}})^{T}-\hat{C}^{T}CS_{\tau}\tilde{P}(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}
+i=1p(MiP~M^i(ΔP~τA^)TMiSτP~S^τTM^i(ΔS^τA^)T\displaystyle\hskip 28.45274pt+\sum_{i=1}^{p}\big{(}M_{i}\tilde{P}\hat{M}_{i}(\Delta_{\tilde{P}_{\tau}}^{\hat{A}})^{T}-M_{i}S_{\tau}\tilde{P}\hat{S}_{\tau}^{T}\hat{M}_{i}(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}
M^iP~τTMiSτP~(ΔS^τA^)T)).\displaystyle\hskip 28.45274pt-\hat{M}_{i}\tilde{P}_{\tau}^{T}M_{i}S_{\tau}\tilde{P}(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}\big{)}\Big{)}.

Similarly, we find that

trace(B^B^TΔQ^τA^)\displaystyle trace(\hat{B}\hat{B}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{A}}) =trace((A^P^P^A^T)ΔQ^τA^)\displaystyle=trace\Big{(}\big{(}-\hat{A}\hat{P}-\hat{P}\hat{A}^{T}\big{)}\Delta_{\hat{Q}_{\tau}}^{\hat{A}}\Big{)}
=trace(P^(A^TΔQ^τA^ΔQ^τA^A^))\displaystyle=trace\Big{(}\hat{P}\big{(}-\hat{A}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{A}}-\Delta_{\hat{Q}_{\tau}}^{\hat{A}}\hat{A}\big{)}\Big{)}
=trace(P^((ΔA^)TQ^τ+Q^τΔA^\displaystyle=trace\Bigg{(}\hat{P}\Big{(}(\Delta_{\hat{A}})^{T}\hat{Q}_{\tau}+\hat{Q}_{\tau}\Delta_{\hat{A}}
(ΔS^τA^)TC^TC^S^τS^τTC^TC^ΔS^τA^\displaystyle-(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}\hat{C}^{T}\hat{C}\hat{S}_{\tau}-\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+i=1p(M^iΔP^τA^M^i(ΔS^τA^)TM^iP^τM^iS^τ\displaystyle+\sum_{i=1}^{p}\big{(}\hat{M}_{i}\Delta_{\hat{P}_{\tau}}^{\hat{A}}\hat{M}_{i}-(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}\hat{M}_{i}\hat{P}_{\tau}\hat{M}_{i}\hat{S}_{\tau}
S^τTM^iΔP^τA^M^iS^τS^τTM^iP^τM^iΔS^τA^)))\displaystyle-\hat{S}_{\tau}^{T}\hat{M}_{i}\Delta_{\hat{P}_{\tau}}^{\hat{A}}\hat{M}_{i}\hat{S}_{\tau}-\hat{S}_{\tau}^{T}\hat{M}_{i}\hat{P}_{\tau}\hat{M}_{i}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}\Big{)}\Bigg{)}
=trace(2Q^τP^(ΔA^)T2P^S^τTC^TC^ΔS^τA^\displaystyle=trace\Big{(}2\hat{Q}_{\tau}\hat{P}(\Delta_{\hat{A}})^{T}-2\hat{P}\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+i=1p(M^iP^M^iΔP^τA^M^iS^τP^S^τTM^iΔP^τA^\displaystyle+\sum_{i=1}^{p}\big{(}\hat{M}_{i}\hat{P}\hat{M}_{i}\Delta_{\hat{P}_{\tau}}^{\hat{A}}-\hat{M}_{i}\hat{S}_{\tau}\hat{P}\hat{S}_{\tau}^{T}\hat{M}_{i}\Delta_{\hat{P}_{\tau}}^{\hat{A}}
2P^S^τTM^iP^τM^iΔS^τA^)).\displaystyle-2\hat{P}\hat{S}_{\tau}^{T}\hat{M}_{i}\hat{P}_{\tau}\hat{M}_{i}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}\Big{)}.

Further, since

P~τ=P~SτP~S^τT,\displaystyle\tilde{P}_{\tau}=\tilde{P}-S_{\tau}\tilde{P}\hat{S}_{\tau}^{T},
P^τ=P^S^τP^S^τT,\displaystyle\hat{P}_{\tau}=\hat{P}-\hat{S}_{\tau}\hat{P}\hat{S}_{\tau}^{T},

ΔJA^\Delta_{J}^{\hat{A}} becomes

ΔJA^\displaystyle\Delta_{J}^{\hat{A}} =trace(2(Q~τ)TP~(ΔA^)T+2Q^τP^(ΔA^)T\displaystyle=trace\Big{(}-2(\tilde{Q}_{\tau})^{T}\tilde{P}(\Delta_{\hat{A}})^{T}+2\hat{Q}_{\tau}\hat{P}(\Delta_{\hat{A}})^{T}
+2P~TSτTCTC^ΔS^τA^2P^S^τTC^TC^ΔS^τA^\displaystyle\hskip 28.45274pt+2\tilde{P}^{T}S_{\tau}^{T}C^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}-2\hat{P}\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+i=1p(2MiP~τM^i(ΔP~τA^)T+2P~TSτTMiP~τM^iΔS^τA^\displaystyle\hskip 28.45274pt+\sum_{i=1}^{p}\big{(}-2M_{i}\tilde{P}_{\tau}\hat{M}_{i}(\Delta_{\tilde{P}_{\tau}}^{\hat{A}})^{T}+2\tilde{P}^{T}S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{M}_{i}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+M^iP^τM^iΔP^τA^2P^S^τTM^iP^τM^iΔS^τA^));\displaystyle\hskip 28.45274pt+\hat{M}_{i}\hat{P}_{\tau}\hat{M}_{i}\Delta_{\hat{P}_{\tau}}^{\hat{A}}-2\hat{P}\hat{S}_{\tau}^{T}\hat{M}_{i}\hat{P}_{\tau}\hat{M}_{i}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}\Big{)};

cf. [43].

Introducing a small first-order perturbation, ΔA^\Delta_{\hat{A}} to A^\hat{A}, induces corresponding perturbations, ΔP~τA^\Delta_{\tilde{P}_{\tau}}^{\hat{A}} and ΔP^τA^\Delta_{\hat{P}_{\tau}}^{\hat{A}}. Based on equations (8) and (9), we find that ΔP~τA^\Delta_{\tilde{P}_{\tau}}^{\hat{A}} and ΔP^τA^\Delta_{\hat{P}_{\tau}}^{\hat{A}} are solutions to the following Lyapunov equations:

AΔP~τA^+ΔP~τA^A^T+P~τ(ΔA^)T+SτBB^T(ΔS^τA^)T=0,\displaystyle A\Delta_{\tilde{P}_{\tau}}^{\hat{A}}+\Delta_{\tilde{P}_{\tau}}^{\hat{A}}\hat{A}^{T}+\tilde{P}_{\tau}(\Delta_{\hat{A}})^{T}+S_{\tau}B\hat{B}^{T}(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}=0,
A^ΔP^τA^+ΔP^τA^A^T+ΔA^P^τ+P^τ(ΔA^)T\displaystyle\hat{A}\Delta_{\hat{P}_{\tau}}^{\hat{A}}+\Delta_{\hat{P}_{\tau}}^{\hat{A}}\hat{A}^{T}+\Delta_{\hat{A}}\hat{P}_{\tau}+\hat{P}_{\tau}(\Delta_{\hat{A}})^{T}
ΔS^τA^B^B^TS^τTS^τB^B^T(ΔS^τA^)T=0.\displaystyle\hskip 28.45274pt-\Delta_{\hat{S}_{\tau}}^{\hat{A}}\hat{B}\hat{B}^{T}\hat{S}_{\tau}^{T}-\hat{S}_{\tau}\hat{B}\hat{B}^{T}(\Delta_{\hat{S}_{\tau}}^{\hat{A}})^{T}=0.

Observe that

trace(i=1pMiP~τMi^(ΔP~τA^)T)\displaystyle trace\Big{(}\sum_{i=1}^{p}M_{i}\tilde{P}_{\tau}\hat{M_{i}}(\Delta_{\tilde{P}_{\tau}}^{\hat{A}})^{T}\Big{)} =trace((ATZ¯τZ¯τA^)(ΔP~τA^)T)\displaystyle=trace\Big{(}\big{(}-A^{T}\bar{Z}_{\tau}-\bar{Z}_{\tau}\hat{A}\big{)}(\Delta_{\tilde{P}_{\tau}}^{\hat{A}})^{T}\Big{)}
=trace((AΔP~τA^ΔP~τA^A^T)Z¯τT)\displaystyle=trace\Big{(}\big{(}-A\Delta_{\tilde{P}_{\tau}}^{\hat{A}}-\Delta_{\tilde{P}_{\tau}}^{\hat{A}}\hat{A}^{T}\big{)}\bar{Z}_{\tau}^{T}\Big{)}
=trace(Z¯τTP~τ(ΔA^)TB^BTSτTZ¯τΔS^τA^),\displaystyle=trace\big{(}\bar{Z}_{\tau}^{T}\tilde{P}_{\tau}(\Delta_{\hat{A}})^{T}-\hat{B}B^{T}S_{\tau}^{T}\bar{Z}_{\tau}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)},

and

trace(i=1pMi^P^τMi^ΔP^τA^)\displaystyle trace\Big{(}\sum_{i=1}^{p}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{A}}\Big{)} =trace((A^TZ¯n,τZ¯n,τA^)ΔP^τA^)\displaystyle=trace\Big{(}\big{(}-\hat{A}^{T}\bar{Z}_{n,\tau}-\bar{Z}_{n,\tau}\hat{A}\big{)}\Delta_{\hat{P}_{\tau}}^{\hat{A}}\Big{)}
=trace((A^ΔP^τA^ΔP^τA^A^T)Z¯n,τ)\displaystyle=trace\Big{(}\big{(}-\hat{A}\Delta_{\hat{P}_{\tau}}^{\hat{A}}-\Delta_{\hat{P}_{\tau}}^{\hat{A}}\hat{A}^{T}\big{)}\bar{Z}_{n,\tau}\Big{)}
=2trace(Z¯n,τP^τ(ΔA^)TB^B^TS^τTZ¯n,τΔS^τA^).\displaystyle=2trace\big{(}\bar{Z}_{n,\tau}\hat{P}_{\tau}(\Delta_{\hat{A}})^{T}-\hat{B}\hat{B}^{T}\hat{S}_{\tau}^{T}\bar{Z}_{n,\tau}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}.

Consequently, ΔJA^\Delta_{J}^{\hat{A}} becomes

ΔJA^\displaystyle\Delta_{J}^{\hat{A}} =trace(2(Q~τ)TP~(ΔA^)T+2Q^τP^(ΔA^)T\displaystyle=trace\Big{(}-2(\tilde{Q}_{\tau})^{T}\tilde{P}(\Delta_{\hat{A}})^{T}+2\hat{Q}_{\tau}\hat{P}(\Delta_{\hat{A}})^{T}
2Z¯τTP~τ(ΔA^)T+2Z¯n,τP^τ(ΔA^)T\displaystyle\hskip 14.22636pt-2\bar{Z}_{\tau}^{T}\tilde{P}_{\tau}(\Delta_{\hat{A}})^{T}+2\bar{Z}_{n,\tau}\hat{P}_{\tau}(\Delta_{\hat{A}})^{T}
+2B^BTSτTZ¯τΔS^τA^2B^B^TS^τTZ¯n,τΔS^τA^\displaystyle\hskip 14.22636pt+2\hat{B}B^{T}S_{\tau}^{T}\bar{Z}_{\tau}\Delta_{\hat{S}_{\tau}}^{\hat{A}}-2\hat{B}\hat{B}^{T}\hat{S}_{\tau}^{T}\bar{Z}_{n,\tau}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+2P~TSτTCTC^ΔS^τA^2P^S^τTC^C^TΔS^τA^\displaystyle\hskip 14.22636pt+2\tilde{P}^{T}S_{\tau}^{T}C^{T}\hat{C}\Delta_{\hat{S}_{\tau}}^{\hat{A}}-2\hat{P}\hat{S}_{\tau}^{T}\hat{C}\hat{C}^{T}\Delta_{\hat{S}_{\tau}}^{\hat{A}}
+i=1p(2P~TSτTMiP~τMi^ΔS^τA^2P^S^τTMi^P^τMi^ΔS^τA^))\displaystyle\hskip 14.22636pt+\sum_{i=1}^{p}\big{(}2\tilde{P}^{T}S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\hat{M_{i}}\Delta_{\hat{S}_{\tau}}^{\hat{A}}-2\hat{P}\hat{S}_{\tau}^{T}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\Delta_{\hat{S}_{\tau}}^{\hat{A}}\big{)}\Big{)}
=trace(2(Q~τ)TP~(ΔA^)T+2Q^τP^(ΔA^)T\displaystyle=trace\Big{(}-2(\tilde{Q}_{\tau})^{T}\tilde{P}(\Delta_{\hat{A}})^{T}+2\hat{Q}_{\tau}\hat{P}(\Delta_{\hat{A}})^{T}
2Z¯τTP~τ(ΔA^)T+2Z¯n,τP^τ(ΔA^)T+2VΔS^τA^).\displaystyle\hskip 14.22636pt-2\bar{Z}_{\tau}^{T}\tilde{P}_{\tau}(\Delta_{\hat{A}})^{T}+2\bar{Z}_{n,\tau}\hat{P}_{\tau}(\Delta_{\hat{A}})^{T}+2V\Delta_{\hat{S}_{\tau}}^{\hat{A}}\Big{)}.

Recall that ΔS^τA^=(A^,ΔA^)\Delta_{\hat{S}_{\tau}}^{\hat{A}}=\mathcal{L}(\hat{A},\Delta_{\hat{A}}). Interchanging the trace and integral operators yields:

trace(VΔS^τA^)=trace(WΔA^);\displaystyle trace(V\Delta_{\hat{S}_{\tau}}^{\hat{A}})=trace(W\Delta_{\hat{A}});

cf. [42, 44]. Resultantly,

ΔJA^=2trace((Q~τTP~+Q^τP^Z¯τTP~τ+Z¯n,τP^τ+WT)(ΔA^)T)\displaystyle\Delta_{J}^{\hat{A}}=2trace\Big{(}\big{(}-\tilde{Q}_{\tau}^{T}\tilde{P}+\hat{Q}_{\tau}\hat{P}-\bar{Z}_{\tau}^{T}\tilde{P}_{\tau}+\bar{Z}_{n,\tau}\hat{P}_{\tau}+W^{T}\big{)}(\Delta_{\hat{A}})^{T}\Big{)}

Consequently, the gradient of JJ with respect to A^\hat{A} is:

JA^=2(Q~τTP~+Q^τP^Z¯τTP~τ+Z¯n,τP^τ+WT);\displaystyle\nabla_{J}^{\hat{A}}=2\big{(}-\tilde{Q}_{\tau}^{T}\tilde{P}+\hat{Q}_{\tau}\hat{P}-\bar{Z}_{\tau}^{T}\tilde{P}_{\tau}+\bar{Z}_{n,\tau}\hat{P}_{\tau}+W^{T}\big{)};

cf. [30]. It is evident that

Q~τTP~+Q^τP^Z¯τTP~τ+Z¯n,τP^τ+WT=0\displaystyle-\tilde{Q}_{\tau}^{T}\tilde{P}+\hat{Q}_{\tau}\hat{P}-\bar{Z}_{\tau}^{T}\tilde{P}_{\tau}+\bar{Z}_{n,\tau}\hat{P}_{\tau}+W^{T}=0 (34)

is a necessary condition for the local optimum of EH2,τ2||E||_{H_{2,\tau}}^{2}. By substituting equations (16)-(19) into (34), we obtain:

Q~τTP~τ+Q^τP^τZ~τTP~τ+Z^τP^τ+Lτ=0.\displaystyle-\tilde{Q}_{\tau}^{T}\tilde{P}_{\tau}+\hat{Q}_{\tau}\hat{P}_{\tau}-\tilde{Z}_{\tau}^{T}\tilde{P}_{\tau}+\hat{Z}_{\tau}\hat{P}_{\tau}+L_{\tau}=0.

Given that Q~τ=Y~τ+Z~τ\tilde{Q}_{\tau}=\tilde{Y}_{\tau}+\tilde{Z}_{\tau} and Q^τ=Y^τ+Z^τ\hat{Q}_{\tau}=\hat{Y}_{\tau}+\hat{Z}_{\tau}, the equation simplifies to:

(Y~τ+2Z~τ)TP~τ+(Y^τ+2Z^τ)P^τ+Lτ\displaystyle-(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}\tilde{P}_{\tau}+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{P}_{\tau}+L_{\tau} =0.\displaystyle=0.

A small first-order change to the matrix Mi^\hat{M_{i}}, denoted as ΔMi^\Delta_{\hat{M_{i}}}, induces corresponding changes in other variables: JJ becomes J+ΔJMi^J+\Delta_{J}^{\hat{M_{i}}}, Q~τ\tilde{Q}_{\tau} becomes Q~τ+ΔQ~τMi^\tilde{Q}_{\tau}+\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}}, and Q^τ\hat{Q}_{\tau} becomes Q^τ+ΔQ^τMi^\hat{Q}_{\tau}+\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}. The resulting first-order perturbation in JJ, denoted ΔJMi^\Delta_{J}^{\hat{M_{i}}}, can be expressed as

ΔJMi^=trace(2BTΔQ~τMi^B^+B^TΔQ^τMi^B^).\displaystyle\Delta_{J}^{\hat{M_{i}}}=trace(-2B^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}}\hat{B}+\hat{B}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}\hat{B}).

Furthermore, based on equations (14) and (15), ΔQ~τMi^\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}} and ΔQ^τMi^\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}} are solutions to the following Lyapunov equations:

ATΔQ~τMi^+ΔQ~τMi^A^+MiP~τΔM^iSτTMiP~τΔM^iS^τ=0,\displaystyle A^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}}+\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}}\hat{A}+M_{i}\tilde{P}_{\tau}\Delta_{\hat{M}_{i}}-S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\Delta_{\hat{M}_{i}}\hat{S}_{\tau}=0,
A^TΔQ^τMi^+ΔQ^τMi^A^+ΔM^iP^τM^i+M^iP^τΔM^i\displaystyle\hat{A}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}+\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}\hat{A}+\Delta_{\hat{M}_{i}}\hat{P}_{\tau}\hat{M}_{i}+\hat{M}_{i}\hat{P}_{\tau}\Delta_{\hat{M}_{i}}
S^τTΔM^iP^τM^iS^τS^τTM^iP^τΔM^iS^τ=0.\displaystyle\hskip 32.72049pt-\hat{S}_{\tau}^{T}\Delta_{\hat{M}_{i}}\hat{P}_{\tau}\hat{M}_{i}\hat{S}_{\tau}-\hat{S}_{\tau}^{T}\hat{M}_{i}\hat{P}_{\tau}\Delta_{\hat{M}_{i}}\hat{S}_{\tau}=0.

Observe that

trace(BTΔQ~τMi^B^)\displaystyle trace(B^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}}\hat{B}) =trace(BB^T(ΔQ~τMi^)T)\displaystyle=trace\big{(}B\hat{B}^{T}(\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}})^{T}\big{)}
=trace((AP~P~A^T)(ΔQ~τMi^)T)\displaystyle=trace\Big{(}\big{(}-A\tilde{P}-\tilde{P}\hat{A}^{T}\big{)}(\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}})^{T}\Big{)}
=trace((ATΔQ~τMi^ΔQ~τMi^A^)P~T)\displaystyle=trace\Big{(}\big{(}-A^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}}-\Delta_{\tilde{Q}_{\tau}}^{\hat{M_{i}}}\hat{A}\big{)}\tilde{P}^{T}\Big{)}
=trace((MiP~τΔM^iSτTMiP~τΔM^iS^τ)P~T)\displaystyle=trace\Big{(}\big{(}M_{i}\tilde{P}_{\tau}\Delta_{\hat{M}_{i}}-S_{\tau}^{T}M_{i}\tilde{P}_{\tau}\Delta_{\hat{M}_{i}}\hat{S}_{\tau}\big{)}\tilde{P}^{T}\Big{)}
=trace(P~τMiP~τ(ΔM^i)T).\displaystyle=trace(\tilde{P}_{\tau}M_{i}\tilde{P}_{\tau}(\Delta_{\hat{M}_{i}})^{T}).

Additionally,

trace(B^TΔQ^τMi^B^)\displaystyle trace(\hat{B}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}\hat{B}) =trace(B^B^TΔQ^τMi^)\displaystyle=trace\big{(}\hat{B}\hat{B}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}\big{)}
=trace((A^P^P^A^T)ΔQ^τMi^)\displaystyle=trace\Big{(}\big{(}-\hat{A}\hat{P}-\hat{P}\hat{A}^{T}\big{)}\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}\Big{)}
=trace((A^TΔQ^τMi^ΔQ^τMi^A^)P^)\displaystyle=trace\Big{(}\big{(}-\hat{A}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}-\Delta_{\hat{Q}_{\tau}}^{\hat{M_{i}}}\hat{A}\big{)}\hat{P}\Big{)}
=trace((ΔM^iP^τM^i+M^iP^τΔM^i\displaystyle=trace\Big{(}\big{(}\Delta_{\hat{M}_{i}}\hat{P}_{\tau}\hat{M}_{i}+\hat{M}_{i}\hat{P}_{\tau}\Delta_{\hat{M}_{i}}
S^τTΔM^iP^τM^iS^τS^τTM^iP^τΔM^iS^τ)P^)\displaystyle-\hat{S}_{\tau}^{T}\Delta_{\hat{M}_{i}}\hat{P}_{\tau}\hat{M}_{i}\hat{S}_{\tau}-\hat{S}_{\tau}^{T}\hat{M}_{i}\hat{P}_{\tau}\Delta_{\hat{M}_{i}}\hat{S}_{\tau}\big{)}\hat{P}\Big{)}
=trace(2P^τMi^P^τ).\displaystyle=trace\big{(}2\hat{P}_{\tau}\hat{M_{i}}\hat{P}_{\tau}\big{)}.

Consequently, ΔJMi^\Delta_{J}^{\hat{M_{i}}} can be expressed as:

ΔJMi^=2trace((P~τTMiP~τ+P^τMi^P^τ)(ΔMi^)T).\displaystyle\Delta_{J}^{\hat{M_{i}}}=2trace\big{(}(-\tilde{P}_{\tau}^{T}M_{i}\tilde{P}_{\tau}+\hat{P}_{\tau}\hat{M_{i}}\hat{P}_{\tau})(\Delta_{\hat{M_{i}}})^{T}\big{)}.

From this, the gradient of JJ with respect to Mi^\hat{M_{i}} is:

JMi^=2(P~τTMiP~τ+P^τMi^P^τ).\displaystyle\nabla_{J}^{\hat{M_{i}}}=2(-\tilde{P}_{\tau}^{T}M_{i}\tilde{P}_{\tau}+\hat{P}_{\tau}\hat{M_{i}}\hat{P}_{\tau}).

Therefore, a necessary condition for a local minimum of E2,τ2||E||_{\mathcal{H}_{2,\tau}}^{2} is:

P~τTMiP~τ+P^τMi^P^τ=0.\displaystyle-\tilde{P}_{\tau}^{T}M_{i}\tilde{P}_{\tau}+\hat{P}_{\tau}\hat{M_{i}}\hat{P}_{\tau}=0.

A small perturbation to the matrix B^\hat{B}, denoted as ΔB^\Delta_{\hat{B}}, results in corresponding changes to other variables: JJ becomes J+ΔJB^J+\Delta_{J}^{\hat{B}}, P~τ\tilde{P}_{\tau} becomes P~τ+ΔP~τB^\tilde{P}_{\tau}+\Delta_{\tilde{P}_{\tau}}^{\hat{B}}, P^τ\hat{P}_{\tau} becomes P^τ+ΔP^τB^\hat{P}_{\tau}+\Delta_{\hat{P}_{\tau}}^{\hat{B}}, Q~τ\tilde{Q}_{\tau} becomes Q~τ+ΔQ~τB^\tilde{Q}_{\tau}+\Delta_{\tilde{Q}_{\tau}}^{\hat{B}}, and Q^τ\hat{Q}_{\tau} becomes Q^τ+ΔQ^τB^\hat{Q}_{\tau}+\Delta_{\hat{Q}_{\tau}}^{\hat{B}}. The resulting first-order change in JJ, represented by ΔJB^\Delta_{J}^{\hat{B}}, can be expressed as:

ΔJB^=trace(2Q~τTB(ΔB^)T+2Q^τB^(ΔB^)T2BB^(ΔQ~τB^)T+B^B^TΔQ^τB^).\displaystyle\Delta_{J}^{\hat{B}}=trace\big{(}-2\tilde{Q}_{\tau}^{T}B(\Delta_{\hat{B}})^{T}+2\hat{Q}_{\tau}\hat{B}(\Delta_{\hat{B}})^{T}-2B\hat{B}(\Delta_{\tilde{Q}_{\tau}}^{\hat{B}})^{T}+\hat{B}\hat{B}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{B}}\big{)}.

Based on equations (8) through (15), the variables ΔP~τB^\Delta_{\tilde{P}_{\tau}}^{\hat{B}}, ΔP^τB^\Delta_{\hat{P}_{\tau}}^{\hat{B}}, ΔQ~τB^\Delta_{\tilde{Q}_{\tau}}^{\hat{B}}, and ΔQ^τB^\Delta_{\hat{Q}_{\tau}}^{\hat{B}} satisfy the following equations:

AΔP~τB^+ΔP~τB^A^T+B(ΔB^)TSτB(ΔB^)TS^τT=0,\displaystyle A\Delta_{\tilde{P}_{\tau}}^{\hat{B}}+\Delta_{\tilde{P}_{\tau}}^{\hat{B}}\hat{A}^{T}+B(\Delta_{\hat{B}})^{T}-S_{\tau}B(\Delta_{\hat{B}})^{T}\hat{S}_{\tau}^{T}=0,
A^ΔP^τB^+ΔP^τB^A^T+ΔB^B^T+B^(ΔB^)TS^τΔB^B^TS^τTS^τB^(ΔB^)TS^τT=0\displaystyle\hat{A}\Delta_{\hat{P}_{\tau}}^{\hat{B}}+\Delta_{\hat{P}_{\tau}}^{\hat{B}}\hat{A}^{T}+\Delta_{\hat{B}}\hat{B}^{T}+\hat{B}(\Delta_{\hat{B}})^{T}-\hat{S}_{\tau}\Delta_{\hat{B}}\hat{B}^{T}\hat{S}_{\tau}^{T}-\hat{S}_{\tau}\hat{B}(\Delta_{\hat{B}})^{T}\hat{S}_{\tau}^{T}=0
ATΔQ~τB^+ΔQ~τB^A^+i=1p(MiΔP~τB^Mi^SτTMiΔP~τB^Mi^S^τ)=0,\displaystyle A^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{B}}+\Delta_{\tilde{Q}_{\tau}}^{\hat{B}}\hat{A}+\sum_{i=1}^{p}\big{(}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{B}}\hat{M_{i}}-S_{\tau}^{T}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{B}}\hat{M_{i}}\hat{S}_{\tau}\big{)}=0,
A^TΔQ^τB^+ΔQ^τB^A^+i=1p(Mi^ΔP^τB^Mi^S^τTMi^ΔP^τB^Mi^S^τ)=0.\displaystyle\hat{A}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{B}}+\Delta_{\hat{Q}_{\tau}}^{\hat{B}}\hat{A}+\sum_{i=1}^{p}\big{(}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\hat{M_{i}}-\hat{S}_{\tau}^{T}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\hat{M_{i}}\hat{S}_{\tau}\big{)}=0.

It can be observed that

trace(BB^T(ΔQ~τB^)T)\displaystyle trace\big{(}B\hat{B}^{T}(\Delta_{\tilde{Q}_{\tau}}^{\hat{B}})^{T}\big{)} =trace((AP~P~A^T)(ΔQ~τB^)T)\displaystyle=trace\Big{(}\big{(}-A\tilde{P}-\tilde{P}\hat{A}^{T}\big{)}(\Delta_{\tilde{Q}_{\tau}}^{\hat{B}})^{T}\Big{)}
=trace((ATΔQ~τB^ΔQ~τB^A^)P~T)\displaystyle=trace\Big{(}\big{(}-A^{T}\Delta_{\tilde{Q}_{\tau}}^{\hat{B}}-\Delta_{\tilde{Q}_{\tau}}^{\hat{B}}\hat{A}\big{)}\tilde{P}^{T}\Big{)}
=trace((i=1p(MiΔP~τB^Mi^SτTMiΔP~τB^Mi^S^τ))P~T)\displaystyle=trace\Bigg{(}\Big{(}\sum_{i=1}^{p}\big{(}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{B}}\hat{M_{i}}-S_{\tau}^{T}M_{i}\Delta_{\tilde{P}_{\tau}}^{\hat{B}}\hat{M_{i}}\hat{S}_{\tau}\big{)}\Big{)}\tilde{P}^{T}\Bigg{)}
=trace(i=1pMiP~τMi^(ΔP~τB^)T).\displaystyle=trace\Big{(}\sum_{i=1}^{p}M_{i}\tilde{P}_{\tau}\hat{M_{i}}(\Delta_{\tilde{P}_{\tau}}^{\hat{B}})^{T}\Big{)}.

Similarly,

trace(B^B^TΔQ^τB^)\displaystyle trace\big{(}\hat{B}\hat{B}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{B}}\big{)} =trace((A^P^P^A^T)ΔQ^τB^)\displaystyle=trace\Big{(}\big{(}-\hat{A}\hat{P}-\hat{P}\hat{A}^{T}\big{)}\Delta_{\hat{Q}_{\tau}}^{\hat{B}}\Big{)}
=trace((A^TΔQ^τB^ΔQ^τB^A^)P^)\displaystyle=trace\Big{(}\big{(}-\hat{A}^{T}\Delta_{\hat{Q}_{\tau}}^{\hat{B}}-\Delta_{\hat{Q}_{\tau}}^{\hat{B}}\hat{A}\big{)}\hat{P}\Big{)}
=trace((i=1p(Mi^ΔP^τB^Mi^S^τTMi^ΔP^τB^Mi^S^τ))P^)\displaystyle=trace\Bigg{(}\Big{(}\sum_{i=1}^{p}\big{(}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\hat{M_{i}}-\hat{S}_{\tau}^{T}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\hat{M_{i}}\hat{S}_{\tau}\big{)}\Big{)}\hat{P}\Bigg{)}
=trace(i=1pMi^P^τMi^ΔP^τB^).\displaystyle=trace\Big{(}\sum_{i=1}^{p}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\Big{)}.

Therefore, ΔJB^\Delta_{J}^{\hat{B}} can be expressed as:

ΔJB^\displaystyle\Delta_{J}^{\hat{B}} =trace(2Q~τTB(ΔB^)T+2Q^τB^(ΔB^)T\displaystyle=trace\Big{(}-2\tilde{Q}_{\tau}^{T}B(\Delta_{\hat{B}})^{T}+2\hat{Q}_{\tau}\hat{B}(\Delta_{\hat{B}})^{T}
2i=1pMiP~τMi^(ΔP~τB^)T+i=1pMi^P^τMi^ΔP^τB^).\displaystyle\hskip 42.67912pt-2\sum_{i=1}^{p}M_{i}\tilde{P}_{\tau}\hat{M_{i}}(\Delta_{\tilde{P}_{\tau}}^{\hat{B}})^{T}+\sum_{i=1}^{p}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\Big{)}.

It can be shown that

trace(i=1pMiP~τMi^(ΔP~τB^)T)\displaystyle trace\Big{(}\sum_{i=1}^{p}M_{i}\tilde{P}_{\tau}\hat{M_{i}}(\Delta_{\tilde{P}_{\tau}}^{\hat{B}})^{T}\Big{)} =trace((ATZ¯τZ¯τA^)(ΔP~τB^)T)\displaystyle=trace\Big{(}\big{(}-A^{T}\bar{Z}_{\tau}-\bar{Z}_{\tau}\hat{A}\big{)}(\Delta_{\tilde{P}_{\tau}}^{\hat{B}})^{T}\Big{)}
=trace((AΔP~τB^ΔP~τB^A^T)Z¯τT)\displaystyle=trace\Big{(}\big{(}-A\Delta_{\tilde{P}_{\tau}}^{\hat{B}}-\Delta_{\tilde{P}_{\tau}}^{\hat{B}}\hat{A}^{T}\big{)}\bar{Z}_{\tau}^{T}\Big{)}
=trace((B(ΔB^)TSτB(ΔB^)TS^τT)Z¯τT)\displaystyle=trace\Big{(}\big{(}B(\Delta_{\hat{B}})^{T}-S_{\tau}B(\Delta_{\hat{B}})^{T}\hat{S}_{\tau}^{T}\big{)}\bar{Z}_{\tau}^{T}\Big{)}
=trace(Z~τTB(ΔB^)T),\displaystyle=trace\big{(}\tilde{Z}_{\tau}^{T}B(\Delta_{\hat{B}})^{T}\big{)},

and

trace(i=1pMi^P^τMi^ΔP^τB^)\displaystyle trace\Big{(}\sum_{i=1}^{p}\hat{M_{i}}\hat{P}_{\tau}\hat{M_{i}}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\Big{)} =trace((A^TZ¯n,τZ¯n,τA^)ΔP^τB^)\displaystyle=trace\Big{(}\big{(}-\hat{A}^{T}\bar{Z}_{n,\tau}-\bar{Z}_{n,\tau}\hat{A}\big{)}\Delta_{\hat{P}_{\tau}}^{\hat{B}}\Big{)}
=trace((A^ΔP^τB^ΔP^τB^A^T)Z¯n,τ)\displaystyle=trace\Big{(}\big{(}-\hat{A}\Delta_{\hat{P}_{\tau}}^{\hat{B}}-\Delta_{\hat{P}_{\tau}}^{\hat{B}}\hat{A}^{T}\big{)}\bar{Z}_{n,\tau}\Big{)}
=trace((ΔB^B^T+B^(ΔB^)TS^τΔB^B^TS^τT\displaystyle=trace\Bigg{(}\Big{(}\Delta_{\hat{B}}\hat{B}^{T}+\hat{B}(\Delta_{\hat{B}})^{T}-\hat{S}_{\tau}\Delta_{\hat{B}}\hat{B}^{T}\hat{S}_{\tau}^{T}
S^τB^(ΔB^)TS^τT)Z¯n,τ)\displaystyle\hskip 42.67912pt-\hat{S}_{\tau}\hat{B}(\Delta_{\hat{B}})^{T}\hat{S}_{\tau}^{T}\Big{)}\bar{Z}_{n,\tau}\Bigg{)}
=2trace(Z^τTB^(ΔB^)T).\displaystyle=2trace\big{(}\hat{Z}_{\tau}^{T}\hat{B}(\Delta_{\hat{B}})^{T}\big{)}.

As a result, ΔJB^\Delta_{J}^{\hat{B}} simplifies to:

ΔJB^\displaystyle\Delta_{J}^{\hat{B}} =trace(2Q~τTB(ΔB^)T+2Q^τB^(ΔB^)T\displaystyle=trace\big{(}-2\tilde{Q}_{\tau}^{T}B(\Delta_{\hat{B}})^{T}+2\hat{Q}_{\tau}\hat{B}(\Delta_{\hat{B}})^{T}
2Z~τTB(ΔB^)T+2Z^τB^(ΔB^)T).\displaystyle\hskip 42.67912pt-2\tilde{Z}_{\tau}^{T}B(\Delta_{\hat{B}})^{T}+2\hat{Z}_{\tau}\hat{B}(\Delta_{\hat{B}})^{T}\big{)}.

From this, the gradient of JJ with respect to B^\hat{B} is:

JB^=2(Q~τTB+Q^τB^Z~τTB+Z^τB^).\displaystyle\nabla_{J}^{\hat{B}}=2(-\tilde{Q}_{\tau}^{T}B+\hat{Q}_{\tau}\hat{B}-\tilde{Z}_{\tau}^{T}B+\hat{Z}_{\tau}\hat{B}).

Therefore, a necessary condition for a local minimum of E2,τ2||E||_{\mathcal{H}_{2,\tau}}^{2} is:

Q~τTB+Q^τB^Z~τTB+Z^τB^=(Y~τ+2Z~τ)TB+(Y^τ+2Z^τ)B^=0.\displaystyle-\tilde{Q}_{\tau}^{T}B+\hat{Q}_{\tau}\hat{B}-\tilde{Z}_{\tau}^{T}B+\hat{Z}_{\tau}\hat{B}=-(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}B+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{B}=0.

We begin by rewriting the cost function JJ in a slightly different form:

J\displaystyle J =trace(2BT(Y~τ+Z~τ)B^+B^T(Y^τ+Z^τ)B^)\displaystyle=trace(-2B^{T}(\tilde{Y}_{\tau}+\tilde{Z}_{\tau})\hat{B}+\hat{B}^{T}(\hat{Y}_{\tau}+\hat{Z}_{\tau})\hat{B})
=trace(2BTY~τB^2BTZ~τB^+B^TY^τB^+B^TZ^τB^).\displaystyle=trace(-2B^{T}\tilde{Y}_{\tau}\hat{B}-2B^{T}\tilde{Z}_{\tau}\hat{B}+\hat{B}^{T}\hat{Y}_{\tau}\hat{B}+\hat{B}^{T}\hat{Z}_{\tau}\hat{B}).

Observing that

trace(2BTY~τB^+B^TY^τB^)=trace(2CP~τC^T+C^P^τC^T),\displaystyle trace(-2B^{T}\tilde{Y}_{\tau}\hat{B}+\hat{B}^{T}\hat{Y}_{\tau}\hat{B})=trace(2C\tilde{P}_{\tau}\hat{C}^{T}+\hat{C}\hat{P}_{\tau}\hat{C}^{T}),

we can rewrite JJ as:

J=trace(2CP~τC^T+C^P^τC^T2BTZ~τB^+B^TZ^τB^);\displaystyle J=trace(2C\tilde{P}_{\tau}\hat{C}^{T}+\hat{C}\hat{P}_{\tau}\hat{C}^{T}-2B^{T}\tilde{Z}_{\tau}\hat{B}+\hat{B}^{T}\hat{Z}_{\tau}\hat{B});

cf. [40]. Introducing a small perturbation to the matrix C^\hat{C}, denoted as ΔC^\Delta_{\hat{C}}, leads to a corresponding change in JJ, expressed as J+ΔJC^J+\Delta_{J}^{\hat{C}}. The resulting first-order change in JJ, represented by ΔJC^\Delta_{J}^{\hat{C}}, is given by:

ΔJC^=trace(2CP~τ(ΔC^)T+2C^P^τ(ΔC^)T)\displaystyle\Delta_{J}^{\hat{C}}=trace(2C\tilde{P}_{\tau}(\Delta_{\hat{C}})^{T}+2\hat{C}\hat{P}_{\tau}(\Delta_{\hat{C}})^{T})

Consequently, the gradient of JJ with respect to C^\hat{C} is:

JC^=2(CP~τ+C^P^τ).\displaystyle\nabla_{J}^{\hat{C}}=2(C\tilde{P}_{\tau}+\hat{C}\hat{P}_{\tau}).

Therefore, a necessary condition for a local minimum of E2,τ2||E||_{\mathcal{H}_{2,\tau}}^{2} is:

CP~τ+C^P^τ=0.\displaystyle C\tilde{P}_{\tau}+\hat{C}\hat{P}_{\tau}=0.

This concludes the proof.

Appendix B

Pre-multiplying equation (8) by W^T\hat{W}^{T} yields:

W^T(AP~τ+P~τA^T+BB^TSτBB^TS^τT)\displaystyle\hat{W}^{T}\big{(}A\tilde{P}_{\tau}+\tilde{P}_{\tau}\hat{A}^{T}+B\hat{B}^{T}-S_{\tau}B\hat{B}^{T}\hat{S}_{\tau}^{T}\big{)} =0\displaystyle=0
A^+A^T+B^B^TS^τB^B^TS^τ\displaystyle\hat{A}+\hat{A}^{T}+\hat{B}\hat{B}^{T}-\hat{S}_{\tau}\hat{B}\hat{B}^{T}\hat{S}_{\tau} =0.\displaystyle=0.

Given the uniqueness of equation (9), we conclude that P^τ=I\hat{P}_{\tau}=I.

It can readily be noted that (Y~τ+2Z~τ)T(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T} and Y^τ+2Z^τ\hat{Y}_{\tau}+2\hat{Z}_{\tau} satisfy the following equations:

A^T(Y~τ+2Z~τ)T+(Y~τ+2Z~τ)TA+C^TCS^τTC^TCSτ\displaystyle\hat{A}^{T}(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}+(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}A+\hat{C}^{T}C-\hat{S}_{\tau}^{T}\hat{C}^{T}CS_{\tau}
+i=1p2Mi^P~τTMii=1p2S^τTMi^P~τTMiSτ=0,\displaystyle\hskip 68.28644pt+\sum_{i=1}^{p}2\hat{M_{i}}\tilde{P}_{\tau}^{T}M_{i}-\sum_{i=1}^{p}2\hat{S}_{\tau}^{T}\hat{M_{i}}\tilde{P}_{\tau}^{T}M_{i}S_{\tau}=0, (35)
A^T(Y^τ+2Z^τ)+(Y^τ+2Z^τ)A^+C^TC^S^τTC^TC^S^τ\displaystyle\hat{A}^{T}(\hat{Y}_{\tau}+2\hat{Z}_{\tau})+(\hat{Y}_{\tau}+2\hat{Z}_{\tau})\hat{A}+\hat{C}^{T}\hat{C}-\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\hat{S}_{\tau}
+i=1p2Mi^P^τTM^ii=1p2S^τTMi^P^τTM^iS^τ=0.\displaystyle\hskip 68.28644pt+\sum_{i=1}^{p}2\hat{M_{i}}\hat{P}_{\tau}^{T}\hat{M}_{i}-\sum_{i=1}^{p}2\hat{S}_{\tau}^{T}\hat{M_{i}}\hat{P}_{\tau}^{T}\hat{M}_{i}\hat{S}_{\tau}=0. (36)

By post-multiplying (35) by V^\hat{V} results in:

(A^T(Y~τ+2Z~τ)T+(Y~τ+2Z~τ)TA+C^TCS^τTC^TCSτ\displaystyle\Big{(}\hat{A}^{T}(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}+(\tilde{Y}_{\tau}+2\tilde{Z}_{\tau})^{T}A+\hat{C}^{T}C-\hat{S}_{\tau}^{T}\hat{C}^{T}CS_{\tau}
+i=1p2Mi^P~τTMii=1p2S^τTMi^P~τTMiSτ)V^\displaystyle\hskip 42.67912pt+\sum_{i=1}^{p}2\hat{M_{i}}\tilde{P}_{\tau}^{T}M_{i}-\sum_{i=1}^{p}2\hat{S}_{\tau}^{T}\hat{M_{i}}\tilde{P}_{\tau}^{T}M_{i}S_{\tau}\Big{)}\hat{V}
=A^T+A^+C^TC^S^τTC^TC^S^τ\displaystyle=\hat{A}^{T}+\hat{A}+\hat{C}^{T}\hat{C}-\hat{S}_{\tau}^{T}\hat{C}^{T}\hat{C}\hat{S}_{\tau}
+i=1p2Mi^Mi^i=1p2S^τTMi^Mi^S^τ=0.\displaystyle\hskip 42.67912pt+\sum_{i=1}^{p}2\hat{M_{i}}\hat{M_{i}}-\sum_{i=1}^{p}2\hat{S}_{\tau}^{T}\hat{M_{i}}\hat{M_{i}}\hat{S}_{\tau}=0.

Due to the uniqueness of equations (9) and (36),we find that P^τ=I\hat{P}_{\tau}=I and Y^τ+2Z^τ=I\hat{Y}_{\tau}+2\hat{Z}_{\tau}=I. Consequently, the optimality conditions (21)-(23) are satisfied.

References

  • [1] J. M. Montenbruck, S. Zeng, F. Allgöwer, Linear systems with quadratic outputs, in: 2017 American Control Conference (ACC), IEEE, 2017, pp. 1030–1034.
  • [2] A. Van Der Schaft, Port-Hamiltonian systems: An introductory survey, in: International Congress of Mathematicians, European Mathematical Society Publishing House (EMS Ph), 2006, pp. 1339–1365.
  • [3] B. Picinbono, P. Devaut, Optimal linear-quadratic systems for detection and estimation, IEEE Transactions on Information Theory 34 (2) (1988) 304–311.
  • [4] Q. Aumann, S. W. Werner, Structured model order reduction for vibro-acoustic problems using interpolation and balancing methods, Journal of Sound and Vibration 543 (2023) 117363.
  • [5] R. Van Beeumen, K. Van Nimmen, G. Lombaert, K. Meerbergen, Model reduction for dynamical systems with quadratic output, International Journal for Numerical Methods in Engineering 91 (3) (2012) 229–248.
  • [6] P. Benner, V. Mehrmann, D. C. Sorensen, Dimension reduction of large-scale systems, Vol. 45, Springer, 2005.
  • [7] A. Antoulas, D. Sorensen, K. A. Gallivan, P. Van Dooren, A. Grama, C. Hoffmann, A. Sameh, Model reduction of large-scale dynamical systems, in: Computational Science-ICCS 2004: 4th International Conference, Kraków, Poland, June 6-9, 2004, Proceedings, Part III 4, Springer, 2004, pp. 740–747.
  • [8] A. C. Antoulas, R. Ionutiu, N. Martins, E. J. W. ter Maten, K. Mohaghegh, R. Pulch, J. Rommes, M. Saadvandi, M. Striebel, Model order reduction: Methods, concepts and properties, Coupled Multiscale Simulation and Optimization in Nanoelectronics (2015) 159–265.
  • [9] A. C. Antoulas, C. A. Beattie, S. Güğercin, Interpolatory methods for model reduction, SIAM, 2020.
  • [10] B. Moore, Principal component analysis in linear systems: Controllability, observability, and model reduction, IEEE Transactions on Automatic Control 26 (1) (1981) 17–32.
  • [11] D. F. Enns, Model reduction with balanced realizations: An error bound and a frequency weighted generalization, in: The 23rd IEEE Conference on Decision and Control, IEEE, 1984, pp. 127–132.
  • [12] V. Mehrmann, T. Stykel, Balanced truncation model reduction for large-scale systems in descriptor form, in: Dimension Reduction of Large-Scale Systems: Proceedings of a Workshop held in Oberwolfach, Germany, October 19–25, 2003, Springer, 2005, pp. 83–115.
  • [13] T. Reis, T. Stykel, Balanced truncation model reduction of second-order systems, Mathematical and Computer Modelling of Dynamical Systems 14 (5) (2008) 391–406.
  • [14] H. Sandberg, A. Rantzer, Balanced truncation of linear time-varying systems, IEEE Transactions on Automatic Control 49 (2) (2004) 217–229.
  • [15] N. T. Son, P.-Y. Gousenbourger, E. Massart, T. Stykel, Balanced truncation for parametric linear systems using interpolation of gramians: a comparison of algebraic and geometric approaches, Model reduction of complex dynamical systems (2021) 31–51.
  • [16] B. Kramer, K. Willcox, Balanced truncation model reduction for lifted nonlinear systems, in: Realization and Model Reduction of Dynamical Systems: A Festschrift in Honor of the 70th Birthday of Thanos Antoulas, Springer, 2022, pp. 157–174.
  • [17] P. Benner, P. Goyal, M. Redmann, Truncated gramians for bilinear systems and their advantages in model order reduction, Model reduction of parametrized systems (2017) 285–300.
  • [18] N. Wong, V. Balakrishnan, Fast positive-real balanced truncation via quadratic alternating direction implicit iteration, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 26 (9) (2007) 1725–1731.
  • [19] C. Guiver, M. R. Opmeer, Bounded real and positive real balanced truncation for infinite-dimensional systems, Mathematical Control and Related Fields 3 (1) (2013) 83–119.
  • [20] N. Wong, V. Balakrishnan, C.-K. Koh, Passivity-preserving model reduction via a computationally efficient project-and-balance scheme, in: Proceedings of the 41st annual Design Automation Conference, 2004, pp. 369–374.
  • [21] A. Sarkar, J. M. Scherpen, Structure-preserving generalized balanced truncation for nonlinear port-hamiltonian systems, Systems & Control Letters 174 (2023) 105501.
  • [22] S. Gugercin, A. C. Antoulas, A survey of model reduction by balanced truncation and some new results, International Journal of Control 77 (8) (2004) 748–766.
  • [23] R. Van Beeumen, K. Meerbergen, Model reduction by balanced truncation of linear systems with a quadratic output, in: AIP Conference Proceedings, Vol. 1281, American Institute of Physics, 2010, pp. 2033–2036.
  • [24] R. Pulch, A. Narayan, Balanced truncation for model order reduction of linear dynamical systems with quadratic outputs, SIAM Journal on Scientific Computing 41 (4) (2019) A2270–A2295.
  • [25] P. Benner, P. Goyal, I. P. Duff, Gramians, energy functionals, and balanced truncation for linear dynamical systems with quadratic outputs, IEEE Transactions on Automatic Control 67 (2) (2021) 886–893.
  • [26] D. Wilson, Optimum solution of model-reduction problem, in: Proceedings of the Institution of Electrical Engineers, Vol. 117, IET, 1970, pp. 1161–1165.
  • [27] S. Gugercin, A. C. Antoulas, C. Beattie, 2\mathcal{H}_{2} model reduction for large-scale linear dynamical systems, SIAM journal on matrix analysis and applications 30 (2) (2008) 609–638.
  • [28] Y. Xu, T. Zeng, Optimal 2\mathcal{H}_{2} model reduction for large scale MIMO systems via tangential interpolation, International Journal of Numerical Analysis & Modeling 8 (1) (2011).
  • [29] P. Benner, M. Køhler, J. Saak, Sparse-dense Sylvester equations in 2\mathcal{H}_{2}-model order reduction, Preprint MPIMD/11-11, Max Planck Institute Magdeburg, available from http://www.mpi-magdeburg.mpg.de/preprints/ (Dec. 2011).
  • [30] S. Reiter, I. Pontes Duff, I. V. Gosea, S. Gugercin, 2\mathcal{H}_{2} optimal model reduction of linear systems with multiple quadratic outputs, arXiv preprint arXiv:2405.05951 (2024).
  • [31] P. Kundur, Power system stability, Power system stability and control, McGraw- Hill, New York, (1994).
  • [32] G. Rogers, The nature of power system oscillations, in: Power System Oscillations, Springer, 2000, pp. 7–30.
  • [33] M. Grimble, Solution of finite-time optimal control problems with mixed end constraints in the s-domain, IEEE Transactions on Automatic Control 24 (1) (1979) 100–108.
  • [34] W. Gawronski, J.-N. Juang, Model reduction in limited time and frequency intervals, International Journal of Systems Science 21 (2) (1990) 349–376.
  • [35] U. Zulfiqar, X. Du, Q.-Y. Song, Z.-H. Xiao, V. Sreeram, Relative error-based time-limited 2\mathcal{H}_{2} model order reduction via oblique projection, Journal of the Franklin Institute 361 (2) (2024) 1093–1114.
  • [36] P. Kürschner, Balanced truncation model order reduction in limited time intervals for large systems, Advances in Computational Mathematics 44 (6) (2018) 1821–1844.
  • [37] P. Benner, S. W. Werner, Frequency-and time-limited balanced truncation for large-scale second-order systems, Linear Algebra and its Applications 623 (2021) 68–103.
  • [38] H. R. Shaker, M. Tahavori, Time-interval model reduction of bilinear systems, International Journal of Control 87 (8) (2014) 1487–1495.
  • [39] Q.-Y. Song, U. Zulfiqar, Z.-H. Xiao, M. M. Uddin, V. Sreeram, Balanced truncation of linear systems with quadratic outputs in limited time and frequency intervals, arXiv preprint arXiv:2402.11445 (2024).
  • [40] P. Goyal, M. Redmann, Time-limited 2\mathcal{H}_{2}-optimal model order reduction, Applied Mathematics and Computation 355 (2019) 184–197.
  • [41] K. Sinani, S. Gugercin, 2(tf)\mathcal{H}_{2}(t_{f}) optimality conditions for a finite-time horizon, Automatica 110 (2019) 108604.
  • [42] N. J. Higham, Functions of matrices: theory and computation, SIAM, 2008.
  • [43] U. Zulfiqar, V. Sreeram, X. Du, Time-limited pseudo-optimal-model order reduction, IET Control Theory & Applications 14 (14) (2020) 1995–2007.
  • [44] K. B. Petersen, M. S. Pedersen, et al., The matrix cookbook, Technical University of Denmark 7 (15) (2008) 510.