
The Difference and Unity of Irregular LQ Control and Standard LQ Control and Its Solution

Huanshui Zhang and Juanjuan Xu

*This work is supported by the National Natural Science Foundation of China under Grants 61633014, 61873332, U1806204, U1701264, 61922051, the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (61821004), and the Youth Innovation Group Project of Shandong University (2020QNQT016). H. Zhang is with Shandong University and Shandong University of Science and Technology, Shandong, P.R. China (most of the work was completed at Shandong University). Juanjuan Xu is with the School of Control Science and Engineering, Shandong University, Shandong, P.R. China. [email protected], [email protected]

Abstract

Irregular linear quadratic (LQ) control, formerly called singular LQ control, has been a long-standing problem since the 1970s. This paper shows that a deterministic irregular LQ control problem is solvable for an arbitrary initial value if and only if the LQ cost can be rewritten as a regular one by changing the terminal cost $x'(T)Hx(T)$ to $x'(T)[H+P_1(T)]x(T)$, while the optimal controller achieves $P_1(T)x(T)=0$ at the same time. In other words, the irregular controller (if it exists) must do two things simultaneously: minimize the cost and achieve the terminal constraint $P_1(T)x(T)=0$. This clarifies the essential difference between irregular LQ control and standard LQ control, where the controller only minimizes the cost.

Based on this result, we further study irregular LQ control for stochastic systems with multiplicative noise. A sufficient solvability condition and the optimal controller are presented based on Riccati equations.

Index Terms:
Irregular, LQ control, Riccati equation, Stochastic control.

I Introduction

Linear-quadratic (LQ) optimal control has received much attention in recent years due to its wide applications in modern engineering [2, 4, 14]. Depending on the singularity of the control weighting matrix in the cost function, LQ optimal control problems divide mainly into regular and irregular ones. Most previous work has focused on the regular case. In particular, when the control weighting matrix in the cost function is positive definite, the LQ optimal control problem is naturally regular; this case has been extensively studied in [14, 17, 21, 3]. For the more general case of an indefinite control weighting matrix, [24] studied the stochastic optimal control problem and obtained the optimal solution under the strict requirement that the stochastic Riccati equation be regular, i.e., the results apply only to regular LQ problems.

In the irregular case, optimal LQ control has remained a major challenge, although much effort has been made since the 1970s. In [10, 19, 28] and the references therein, singular LQ control was studied by the approach of ‘transformation in state space’, where the problem with zero control weighting matrix ($R=0$) was considered. It was shown that the problem is solvable if the initial value satisfies $x_2(0)=C_{21}(0)x_1(0)$; otherwise, an impulse control must be applied at the initial time [12]. In other words, the approach of ‘transformation in state space’ is only applicable to the case of a specified initial value. In [16, 5, 34], the approach of the ‘higher order maximum principle’ was applied to singular LQ control. However, if the higher derivatives vanish, it is impossible to find the singular control with this approach [9]. The third approach is the perturbation approach in [6], [27], where the optimal solution is obtained as the limit of the solution to a Riccati equation as the perturbation approaches zero.

More recently, using the analytical solution to forward and backward differential equations (FBDEs), [32] considered irregular LQ control for deterministic systems with arbitrary initial value, where the irregular controller was designed based on a regular Riccati equation and the controllability of a subsystem.

In this paper, we study the irregular optimal control problem, aiming to explore the difference between regular control and irregular control (see Theorem 1 below). It is shown that an irregular LQ control problem is solvable if and only if the LQ cost can be rewritten as a regular one by changing the terminal term of the LQ cost, while the controller makes the added terminal term vanish. Moreover, we extend the results to the stochastic control problem with irregular cost (Theorem 2).

The remainder of the paper is organized as follows. Section II presents the solution to the deterministic optimal control problem with irregular performance. The solution to the stochastic optimal control problem with irregular performance is given in Section III. Some concluding remarks are given in Section IV. Some proofs are presented in the Appendix.

The following notation is used throughout this paper: $R^n$ denotes the family of $n$-dimensional vectors. $x'$ denotes the transpose of $x$, and $\|x\|^2=x'x$. A symmetric matrix $M>0$ ($\geq 0$) is strictly positive definite (positive semi-definite). $Range(M)$ represents the range of the matrix $M$. $M^{\dagger}$ is called the Moore-Penrose inverse [20] of the matrix $M$ if it satisfies $MM^{\dagger}M=M$, $M^{\dagger}MM^{\dagger}=M^{\dagger}$, $(MM^{\dagger})'=MM^{\dagger}$ and $(M^{\dagger}M)'=M^{\dagger}M$.
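The four defining conditions of the Moore-Penrose inverse can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `is_moore_penrose_inverse` and the sample matrix are illustrative choices, not part of the paper.

```python
import numpy as np

def is_moore_penrose_inverse(M, X, tol=1e-10):
    """Return True if X satisfies the four Penrose conditions for M."""
    M = np.asarray(M, dtype=float)
    X = np.asarray(X, dtype=float)
    c1 = np.allclose(M @ X @ M, M, atol=tol)      # M M^+ M = M
    c2 = np.allclose(X @ M @ X, X, atol=tol)      # M^+ M M^+ = M^+
    c3 = np.allclose((M @ X).T, M @ X, atol=tol)  # (M M^+)' = M M^+
    c4 = np.allclose((X @ M).T, X @ M, atol=tol)  # (M^+ M)' = M^+ M
    return c1 and c2 and c3 and c4

# Illustrative singular weighting matrix (an assumption for this sketch).
R = np.array([[1.0, 0.0], [0.0, 0.0]])
R_dagger = np.linalg.pinv(R)
```

For this diagonal sample, `np.linalg.pinv(R)` returns `R` itself, while the identity matrix fails the second condition.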

II Deterministic optimal control with irregular performance

In this section, we consider deterministic optimal control with irregular performance, where the linear system is governed by the differential equation

$$\dot{x}(t) = A(t)x(t)+B(t)u(t), \quad x(t_0)=x_0, \tag{1}$$

where $x\in R^n$ is the state and $u\in R^m$ is the control input. The matrices $A(t)$ and $B(t)$ have appropriate dimensions, and $x_0$ represents the initial value. The cost function is given by

$$J_0(t_0,x_0;u) = \int_{t_0}^{T}\big[x'(t)Q(t)x(t)+u'(t)R(t)u(t)\big]dt+x'(T)Hx(T), \tag{2}$$

where $Q(t)\geq 0$ and $R(t)\geq 0$ are symmetric matrices with appropriate dimensions.

Problem 1.

For any $(t_0,x_0)$, find a controller $u(t)$ such that (2) is minimized subject to (1).

Since $R(t)$ is only positive semi-definite, the problem has usually been called singular control [10, 28]; it has remained unsolved because of the difficulty caused by irregularity.

II-A What is an irregular LQ control

Singular LQ control contains both regular and irregular control. The regular case is easily handled by the standard control approach, whereas the irregular case is much more involved, as noted above. To define the irregular control problem, we introduce the following Riccati equation associated with system (1) and cost (2):

$$0 = \dot{P}(t)+A'(t)P(t)+P(t)A(t)+Q(t)-P(t)B(t)R^{\dagger}(t)B'(t)P(t), \tag{3}$$

where the terminal condition is $P(T)=H$ and $R^{\dagger}(t)$ represents the Moore-Penrose inverse of $R(t)$. If $Range[B'(t)P(t)]\subseteq Range[R(t)]$, the LQ control problem is standard and is called regular. Otherwise, if

$$Range[B'(t)P(t)]\not\subseteq Range[R(t)], \tag{4}$$

the LQ control problem is called irregular, and the performance cost (2) is called irregular accordingly.
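At a fixed time, the range condition above can be tested through the equivalent identity $[I-R R^{\dagger}]B'P=0$, since $RR^{\dagger}$ is the orthogonal projector onto $Range(R)$. The sketch below assumes NumPy; the function name `is_regular` and the sample matrices are hypothetical and only illustrate the test.

```python
import numpy as np

def is_regular(B, P, R, tol=1e-10):
    """Check Range(B'P) within Range(R) via (I - R R^+) B' P = 0."""
    B, P, R = (np.atleast_2d(np.asarray(M, dtype=float)) for M in (B, P, R))
    m = R.shape[0]
    # R R^+ projects onto Range(R); the residual is the part of B'P outside it.
    residual = (np.eye(m) - R @ np.linalg.pinv(R)) @ B.T @ P
    return bool(np.linalg.norm(residual) <= tol)

# Illustrative data (assumptions): n = m = 2, B = P = I.
B = np.eye(2)
P = np.eye(2)
R_singular = np.diag([1.0, 0.0])   # positive semi-definite, rank 1
R_definite = np.diag([1.0, 2.0])   # positive definite
```

With a positive definite `R` the test always passes; with a singular `R` it passes only when `B'P` happens to lie in the range of `R`.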

II-B Why is it difficult?

Irregularity means that the controller cannot be obtained with classical control theory; in fact, it makes the controller extremely difficult to obtain. To show this, we present an example where the system is governed by

$$\dot{x}(t) = x(t)+\begin{bmatrix}1 & -1\end{bmatrix}u(t), \quad x(t_0)=x_0, \tag{6}$$

and the cost function is given by

$$J_T(t_0,x_0;u)=\int_{t_0}^{T}u'(t)\begin{bmatrix}1&0\\ 0&0\end{bmatrix}u(t)\,dt+x'(T)x(T). \tag{9}$$

The solution to the Riccati equation $0=\dot{P}(t)+2P(t)-P^2(t)$ with $P(T)=1$ is $P(t)=\frac{2}{1+e^{2(t-T)}}$. It then holds that $Range[B'P(t)]\nsubseteq Range(R)$, where $B=\begin{bmatrix}1&-1\end{bmatrix}$ and $R=\begin{bmatrix}1&0\\ 0&0\end{bmatrix}$. This implies that $u(t)$ cannot be obtained from the classical equilibrium condition $Ru(t)+B'P(t)x(t)=0$ for arbitrary $x(t)$.
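The closed-form solution and the range violation in this example can be checked numerically. The following sketch assumes NumPy, an illustrative horizon `T = 1.0`, and a central finite difference for the derivative; none of these choices come from the paper.

```python
import numpy as np

T = 1.0  # assumed horizon for the illustration

def P(t):
    """Closed-form solution of 0 = P' + 2P - P^2 with P(T) = 1."""
    return 2.0 / (1.0 + np.exp(2.0 * (t - T)))

def riccati_residual(t, h=1e-6):
    """Residual P' + 2P - P^2, with P' from a central difference."""
    dP = (P(t + h) - P(t - h)) / (2.0 * h)
    return dP + 2.0 * P(t) - P(t) ** 2

B = np.array([[1.0, -1.0]])
R = np.array([[1.0, 0.0], [0.0, 0.0]])

ts = np.linspace(0.0, T, 11)
max_residual = max(abs(riccati_residual(t)) for t in ts)

# Component of B'P(t) outside Range(R) at t = 0.5; nonzero means irregular.
gap = (np.eye(2) - R @ np.linalg.pinv(R)) @ B.T * P(0.5)
```

The residual stays at the level of finite-difference error, while `gap` has a nonzero second component, which is the part of $B'P(t)$ that $R$ cannot absorb.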

Irregularity also leads to a fundamental difficulty: the sum of squares cannot be completed for the irregular LQ cost (2). Indeed, for the above problem of minimizing (9) subject to (6), differentiating $x'(t)P(t)x(t)$ yields

$$\frac{d}{dt}[x'(t)P(t)x(t)]=2u'(t)B'P(t)x(t)+x'(t)P^2(t)x(t).$$

Integrating this equation from $t_0$ to $T$, the cost function (9) can be rewritten as

$$\begin{aligned}
J_T(t_0,x_0;u) &= x'(t_0)P(t_0)x(t_0)+\int_{t_0}^{T}\Big[u'(t)Ru(t)+2u'(t)B'P(t)x(t)+x'(t)P^2(t)x(t)\Big]dt \\
&= x'(t_0)P(t_0)x(t_0)+\int_{t_0}^{T}\Big\{\big[u(t)+R^{\dagger}B'P(t)x(t)\big]'R\big[u(t)+R^{\dagger}B'P(t)x(t)\big] \\
&\qquad +2u'(t)(I-RR^{\dagger})B'P(t)x(t)\Big\}dt.
\end{aligned}$$

From this expression and the fact that $Range[B'P(t)]\nsubseteq Range(R)$, it is seen that the optimal controller cannot be obtained by completing the sum of squares, because the last term is not zero.
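The integrand identity behind this rewriting, and the leftover cross term, can be checked pointwise for the example (6), (9). The sketch below assumes NumPy; the function names are hypothetical, and the scalars `x` and `p` stand for $x(t)$ and $P(t)$ at a fixed time.

```python
import numpy as np

B = np.array([1.0, -1.0])                 # row of B in (6), stored as a vector
R = np.array([[1.0, 0.0], [0.0, 0.0]])    # control weight in (9)
R_dagger = np.linalg.pinv(R)
I2 = np.eye(2)

def integrand_before(u, x, p):
    """u'Ru + 2u'B'Px + x'P^2 x for scalar x and scalar P = p."""
    return float(u @ R @ u + 2.0 * (u @ B) * p * x + (p * x) ** 2)

def integrand_after(u, x, p):
    """Completed square plus the leftover cross term; returns (total, cross)."""
    v = u + (R_dagger @ B) * p * x
    square = float(v @ R @ v)
    cross = float(2.0 * (u @ ((I2 - R @ R_dagger) @ B)) * p * x)
    return square + cross, cross
```

The cross term is linear in the second control component, so it is not bounded below by the square alone; this is the obstruction described in the text.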

Thus irregularity invalidates the standard methods for LQ control. To find a way to solve the irregular control problem, we first explore the essential difference between irregular and regular control in Theorem 1 below.

II-C Solution to deterministic optimal control with irregular performance

First, we present the maximum principle for Problem 1 [32].

Lemma 1.

If Problem 1 is solvable, then the optimal controller satisfies

$$0 = R(t)u(t)+B'(t)p(t), \tag{10}$$

where $p(t)$ obeys the dynamics

$$\dot{p}(t) = -A'(t)p(t)-Q(t)x(t), \tag{11}$$

with the terminal value $p(T)=Hx(T)$. Conversely, if the FBDEs (1), (11) and (10) are solvable, then Problem 1 is also solvable.

Proof. The proof follows from the maximum principle by using the fact that $Q(t)\geq 0$, $R(t)\geq 0$, so we omit the details. $\blacksquare$

We next introduce some notation for the derivation of the main result. Let $rank(R(t))=m_0<m$, so that $rank\big[I-R^{\dagger}(t)R(t)\big]=m-m_0>0$. There is an elementary row transformation matrix $T_0(t)$ such that

$$T_0(t)\big[I-R^{\dagger}(t)R(t)\big]=\begin{bmatrix}0\\ \Upsilon_{T_0}(t)\end{bmatrix}, \tag{14}$$

where $\Upsilon_{T_0}(t)\in R^{[m-m_0]\times m}$ has full row rank. Furthermore, denote

\begin{align}
A_0(t) &= A(t)-B(t)R^{\dagger}(t)B'(t)P(t), \notag\\
D_0(t) &= -B(t)R^{\dagger}(t)B'(t), \notag\\
\begin{bmatrix}\ast & B_0(t)\end{bmatrix} &= B(t)\big[I-R^{\dagger}(t)R(t)\big]T_0^{-1}(t), \tag{16}\\
\begin{bmatrix}\ast & G_0(t)\end{bmatrix} &= T_0^{-1}(t), \tag{18}
\end{align}

and let $P_1(t)$ satisfy

$$0 = \dot{P}_1(t)+P_1(t)A_0(t)+A_0'(t)P_1(t)+P_1(t)D_0(t)P_1(t), \tag{19}$$

where the terminal value $P_1(T)$ is to be determined.
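One possible way to build matrices with the structure of (14), (16) and (18) numerically is sketched below. It is an assumption of this sketch, not the paper's construction, that $T_0$ is taken as an orthogonal matrix from the eigendecomposition of the projector $I-R^{\dagger}R$ rather than as a product of elementary row operations; any invertible $T_0$ satisfying (14) plays the same role. The function name `null_space_factorization` is hypothetical.

```python
import numpy as np

def null_space_factorization(R):
    """Return (T0, Upsilon, G0) with T0 (I - R^+ R) = [0; Upsilon], T0^{-1} = [* G0]."""
    R = np.asarray(R, dtype=float)
    m = R.shape[0]
    M = np.eye(m) - np.linalg.pinv(R) @ R     # orthogonal projector onto ker(R)
    # eigh sorts eigenvalues ascending: zeros first, ones last.
    w, V = np.linalg.eigh((M + M.T) / 2.0)
    k = int(np.sum(w > 0.5))                  # k = m - m0
    T0 = V.T                                  # orthogonal, so T0^{-1} = V
    Upsilon = (T0 @ M)[m - k:, :]             # nonzero block, full row rank
    G0 = V[:, m - k:]                         # last k columns of T0^{-1}
    return T0, Upsilon, G0

R = np.array([[1.0, 0.0], [0.0, 0.0]])        # illustrative rank-1 weight
T0, Upsilon, G0 = null_space_factorization(R)
M = np.eye(2) - np.linalg.pinv(R) @ R
B0 = np.array([[1.0, -1.0]]) @ G0             # B0 for B = [1 -1], since B M T0^{-1} = [0, B G0]
```

With this choice, $[I-R^{\dagger}R]=G_0\Upsilon_{T_0}$, which is the factorization used later to pass from an arbitrary vector $z(t)$ to the reduced input $u_1(t)$.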

We are now in the position to give the main result of this section as follows.

Theorem 1.

Problem 1 is solvable if and only if there exists a matrix $P_1(T)$ satisfying $0=B_0'(T)[P(T)+P_1(T)]$ such that the changed cost

$$\bar{J}_0(t_0,x_0;u) = J_0(t_0,x_0;u)+x'(T)P_1(T)x(T) \tag{20}$$

is regular and $P_1(T)x(T)=0$ is achieved with the controller minimizing (20).

Proof. “Sufficiency”: The aim is to prove that if there exists a matrix $P_1(T)$ satisfying $0=B_0'(T)[P(T)+P_1(T)]$ such that (20) is regular and $P_1(T)x(T)=0$ is achieved, then Problem 1 is solvable. Based on Lemma 1, it is sufficient to show that the FBDEs (1), (11) and (10) are solvable. To this end, we verify that the following $(p(t),x(t))$ solves the FBDEs:

\begin{align}
p(t) &= P(t)x(t)+P_1(t)x(t), \tag{21}\\
\dot{x}(t) &= \Big\{A(t)-B(t)R^{\dagger}(t)B'(t)\big[P(t)+P_1(t)\big]\Big\}x(t)+B(t)\big[I-R^{\dagger}(t)R(t)\big]z(t), \tag{22}
\end{align}

where $P(t)$ is defined in (3), $P_1(t)$ is defined by (19) with $P_1(T)$ satisfying $0=B_0'(T)[P(T)+P_1(T)]$, and $z(t)$ is a vector with compatible dimension such that $P_1(T)x(T)=0$.

The verification is divided into two steps. The first step is to prove that the Riccati equation satisfied by $P(t)+P_1(t)$ is regular. From the regularity of (20), the following Riccati equation is regular:

$$0 = \dot{\bar{P}}(t)+A'(t)\bar{P}(t)+\bar{P}(t)A(t)+Q(t)-\bar{P}(t)B(t)R^{\dagger}(t)B'(t)\bar{P}(t), \tag{23}$$

with terminal value $\bar{P}(T)=H+P_1(T)$. That is,

$$\big[I-R(t)R^{\dagger}(t)\big]B'(t)\bar{P}(t)=0. \tag{24}$$

In addition, note that $P(t)+P_1(t)$ satisfies

$$\begin{aligned}
0 ={}& \dot{P}(t)+\dot{P}_1(t)+A'(t)\big[P(t)+P_1(t)\big]+\big[P(t)+P_1(t)\big]A(t)+Q(t) \\
&-\big[P(t)+P_1(t)\big]B(t)R^{\dagger}(t)B'(t)\big[P(t)+P_1(t)\big],
\end{aligned} \tag{25}$$

with the same terminal value $H+P_1(T)$ as (23). This implies that

$$\bar{P}(t)=P(t)+P_1(t). \tag{26}$$

Thus, it is obtained from (24) that

$$\big[I-R(t)R^{\dagger}(t)\big]B'(t)\big[P(t)+P_1(t)\big]=\big[I-R(t)R^{\dagger}(t)\big]B'(t)\bar{P}(t)=0.$$

The second step is to differentiate the right-hand side of (21). By using (26), (23) and (22), we derive from (21) that

$$\begin{aligned}
\frac{d}{dt}\big[\bar{P}(t)x(t)\big] &= \frac{d}{dt}\Big[[P(t)+P_1(t)]x(t)\Big] \\
&= -\Big[A'(t)\bar{P}(t)+\bar{P}(t)A(t)+Q(t)-\bar{P}(t)B(t)R^{\dagger}(t)B'(t)\bar{P}(t)\Big]x(t) \\
&\quad +\bar{P}(t)\Big[A(t)-B(t)R^{\dagger}(t)B'(t)\bar{P}(t)\Big]x(t)+\bar{P}(t)B(t)\big[I-R^{\dagger}(t)R(t)\big]z(t) \\
&= -A'(t)\bar{P}(t)x(t)-Q(t)x(t),
\end{aligned}$$

where the term involving $z(t)$ vanishes by (24). Furthermore, by denoting $u(t)=-R^{\dagger}(t)B'(t)\bar{P}(t)x(t)+[I-R^{\dagger}(t)R(t)]z(t)$, it follows from (24) that $0=R(t)u(t)+B'(t)\bar{P}(t)x(t)$. In addition, (22) can be rewritten as $\dot{x}(t)=A(t)x(t)+B(t)u(t)$. Accordingly, $(p(t),x(t))$ defined by (21) and (22) solves the FBDEs (1), (11) and (10). Thus, based on Lemma 1, Problem 1 is solvable.

“Necessity”: The aim is to verify the regularity of (20) and $P_1(T)x(T)=0$. By Theorem 2 in [32], a necessary condition for the solvability of Problem 1 is that $B_0'(t)[P(t)+P_1(t)]=0$ and $P_1(T)x(T)=0$ hold. Thus, the key is to prove that (20) is regular, that is, (24) holds where $\bar{P}(t)$ satisfies (23). In fact, by using (26) and the fact that $\Upsilon_{T_0}(t)$ has full row rank, it is obtained from $B_0'(t)[P(t)+P_1(t)]=0$ that

$$\begin{aligned}
0 &= \Upsilon_{T_0}'(t)B_0'(t)\bar{P}(t) \\
&= \begin{bmatrix}0 & \Upsilon_{T_0}'(t)\end{bmatrix}\begin{bmatrix}\ast\\ B_0'(t)\end{bmatrix}\bar{P}(t) \\
&= \big[I-R(t)R^{\dagger}(t)\big]T_0'(t)[T_0'(t)]^{-1}\big[I-R(t)R^{\dagger}(t)\big]B'(t)\bar{P}(t) \\
&= \big[I-R(t)R^{\dagger}(t)\big]B'(t)\bar{P}(t).
\end{aligned}$$

This implies that (20) is regular. The proof is now complete. $\blacksquare$

Remark 1.
  • It is obvious that $P_1(T)=0$ for regular (standard) LQ control, while $P_1(T)\neq 0$ for irregular LQ control. This reveals an essential difference between irregular and regular LQ control: the irregular controller (if it exists) must do two things at the same time, namely minimize the cost (20) and achieve the terminal constraint $P_1(T)x(T)=0$.

  • Despite this difference, the LQ control problem (irregular or regular) can be solved in a unified way as in Theorem 1.

To conclude this section, we present the optimal controller of Problem 1.

Corollary 1.

If there exists a matrix $P_1(T)$ satisfying $0=B_0'(T)[P(T)+P_1(T)]$ such that the changed cost (20) is regular, then the optimal controller is given by

$$u(t) = -R^{\dagger}(t)B'(t)\big[P(t)+P_1(t)\big]x(t)+G_0(t)u_1(t), \tag{31}$$

where $P(t)$ and $P_1(t)$ satisfy the Riccati equations (3) and (19), and $u_1(t)$ is chosen such that $P_1(T)x(T)=0$. The optimal cost is given by

$$J^{*}(t_0,x_0;u)=x_0'\big[P(t_0)+P_1(t_0)\big]x_0. \tag{32}$$

Proof. By solving the regular optimal control problem of minimizing (20) subject to (1), the optimal control is given by

$$u(t) = -R^{\dagger}(t)B'(t)\bar{P}(t)x(t)+\big[I-R^{\dagger}(t)R(t)\big]z(t), \tag{33}$$

where $z(t)$ is chosen such that $P_1(T)x(T)=0$. Using the notation introduced above (19), the last term in (33) can be rewritten as

$$\begin{aligned}
\big[I-R^{\dagger}(t)R(t)\big]z(t) &= T_0^{-1}(t)T_0(t)\big[I-R^{\dagger}(t)R(t)\big]z(t) \\
&= \begin{bmatrix}\ast & G_0(t)\end{bmatrix}\begin{bmatrix}0\\ \Upsilon_{T_0}(t)\end{bmatrix}z(t) \\
&= G_0(t)\Upsilon_{T_0}(t)z(t).
\end{aligned}$$

By letting $\Upsilon_{T_0}(t)z(t)=u_1(t)$, the optimal controller (31) follows. The proof is now complete. $\blacksquare$
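As an illustration, Corollary 1 can be traced on the example (6), (9). The following is a sketch under stated assumptions, not the authors' computation: taking $T_0=I$ gives $B_0=-1$ and $G_0=[0\ \ 1]'$, so the condition $0=B_0'(T)[P(T)+P_1(T)]$ suggests $P_1(T)=-1$; then $P_1(t)=-P(t)$ solves (19), $\bar{P}(t)\equiv 0$, and the constraint $P_1(T)x(T)=0$ reduces to $x(T)=0$. The horizon, the initial value and the particular steering input $u_1$ (a straight-line state trajectory) are arbitrary choices made for the sketch.

```python
import numpy as np

t0, T, x0 = 0.0, 1.0, 1.0   # assumed horizon and initial value

def P(t):
    """Solution of (3) for the example: 0 = P' + 2P - P^2, P(T) = 1."""
    return 2.0 / (1.0 + np.exp(2.0 * (t - T)))

def P1(t):
    """Candidate solution of (19) with P1(T) = -1, so that P + P1 = 0."""
    return -P(t)

def simulate(n_steps=2000):
    """Euler simulation of x' = x + B u with u = G0 u1 = [0, u1]."""
    R = np.array([[1.0, 0.0], [0.0, 0.0]])
    B = np.array([1.0, -1.0])
    dt = (T - t0) / n_steps
    x, running_cost = x0, 0.0
    for _ in range(n_steps):
        # u1 = x + x0/(T - t0) gives x' = -x0/(T - t0), so x reaches 0 at T.
        u = np.array([0.0, x + x0 / (T - t0)])
        running_cost += float(u @ R @ u) * dt
        x += (x + float(B @ u)) * dt
    return x, running_cost + x * x   # terminal state and total cost (9)

x_T, J = simulate()
J_formula = x0 * (P(t0) + P1(t0)) * x0   # value predicted by (32)
```

In this sketch both the simulated cost and the value from (32) are zero: the second control component is not penalized by $R$, so it can steer the state to the origin for free.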

III Stochastic optimal control with irregular performance

In this section, we extend the above results to stochastic optimal control with irregular performance, where the linear control system is governed by the Itô stochastic differential equation

$$dx(t) = \big[A(t)x(t)+B(t)u(t)\big]dt+\big[\bar{A}(t)x(t)+\bar{B}(t)u(t)\big]dw(t), \quad x(t_0)=x_0, \tag{38}$$

where $x\in R^n$ is the state and $u\in R^m$ is the control input. $w(t)$ is a standard one-dimensional Brownian motion. The filtration $\mathcal{F}_t$ is generated by $\{w(t),t\geq t_0\}$, that is, $\mathcal{F}_t=\sigma\{w(s),t_0\leq s\leq t\}$. The matrices $A(t),B(t),\bar{A}(t),\bar{B}(t)$ are deterministic matrices with compatible dimensions. The cost function is given by

$$J(t_0,x_0;u) = E\int_{t_0}^{T}\big[x'(t)Q(t)x(t)+u'(t)R(t)u(t)\big]dt+Ex'(T)Hx(T), \tag{39}$$

where $Q(t),R(t),H$ are symmetric matrices with compatible dimensions. The set of admissible controllers is denoted by

$$\mathcal{U}[t_0,T] = \Big\{u(t),t\in[t_0,T]\ \Big|\ u(t)\ \mbox{is}\ \mathcal{F}_t\mbox{-adapted},\ E\int_{t_0}^{T}\|u(t)\|^2dt<\infty\Big\}.$$
Problem 2.

For any $(t_0,x_0)$, find an $\mathcal{F}_t$-adapted controller $u(t)$ such that (39) is minimized subject to (38).
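To make the problem statement concrete, the cost (39) of a given admissible controller can be estimated by simulation. The sketch below is only an illustration under several assumptions not made in the paper: scalar state and input, constant coefficients, a linear state feedback $u=Kx$, Euler-Maruyama discretization, and a hypothetical function name `estimate_cost`. It evaluates a candidate controller; it does not solve Problem 2.

```python
import numpy as np

def estimate_cost(A, B, A_bar, B_bar, Q, R, H, K, x0, t0=0.0, T=1.0,
                  n_steps=200, n_paths=500, seed=0):
    """Monte Carlo estimate of (39) for the scalar system (38) under u = K x."""
    rng = np.random.default_rng(seed)
    dt = (T - t0) / n_steps
    x = np.full(n_paths, float(x0))
    running = np.zeros(n_paths)
    for _ in range(n_steps):
        u = K * x                                   # adapted: depends on current state only
        running += (Q * x**2 + R * u**2) * dt       # running cost
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # Euler-Maruyama step with state- and control-dependent noise.
        x = x + (A * x + B * u) * dt + (A_bar * x + B_bar * u) * dw
    return float(np.mean(running + H * x**2))
```

With all drift and noise coefficients set to zero the estimate reduces to the deterministic value $(T-t_0)Qx_0^2+Hx_0^2$, which gives a simple sanity check of the discretization.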

To guarantee the solvability of Problem 2, we make the following assumption:

Assumption 1.

(Convexity)

$$J(t_0,0;u)\geq 0.$$

Under Assumption 1, we have the following maximum principle for Problem 2 [33].

Lemma 2.

If Problem 2 is solvable, then the optimal controller satisfies

$$0=R(t)u(t)+B'(t)p(t)+\bar{B}'(t)q(t), \tag{40}$$

where $(p(t),q(t))$ obey the backward stochastic differential equation (BSDE)

$$dp(t) = -\big[A'(t)p(t)+\bar{A}'(t)q(t)+Q(t)x(t)\big]dt+q(t)dw(t), \tag{41}$$

with the terminal value $p(T)=Hx(T)$. Conversely, if the FBSDEs (38), (41) and (40) are solvable, then Problem 2 is also solvable.

Proof. The proof can be found in Lemma 1 in [33]. $\blacksquare$

Parallel to (3), we introduce the generalized Riccati equation

$$0 = \dot{P}(t)+A'(t)P(t)+\bar{A}'(t)P(t)\bar{A}(t)+P(t)A(t)+Q(t)-\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t), \tag{42}$$

where

\begin{align}
\Upsilon_0(t) &= R(t)+\bar{B}'(t)P(t)\bar{B}(t), \tag{43}\\
\Gamma_0(t) &= B'(t)P(t)+\bar{B}'(t)P(t)\bar{A}(t), \tag{44}
\end{align}

and the terminal condition is $P(T)=H$.

As in Section II, we focus on stochastic optimal control with irregular performance, that is,

$$Range[\Gamma_0(t)]\not\subseteq Range[\Upsilon_0(t)]. \tag{45}$$
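The stochastic irregularity test (45) can be evaluated at a fixed time in the same way as the deterministic one, with $\Upsilon_0$ and $\Gamma_0$ from (43) and (44) in place of $R$ and $B'P$. The sketch below assumes NumPy; the function name and the numerical data are illustrative assumptions. The data reuse $B$ and $R$ from the example in Section II and add a hypothetical noise input matrix $\bar{B}$.

```python
import numpy as np

def stochastic_regularity_gap(B, B_bar, A_bar, P, R):
    """Return (Upsilon0, Gamma0, gap), gap = (I - Upsilon0 Upsilon0^+) Gamma0."""
    Upsilon0 = R + B_bar.T @ P @ B_bar            # (43)
    Gamma0 = B.T @ P + B_bar.T @ P @ A_bar        # (44)
    m = Upsilon0.shape[0]
    # gap == 0 exactly when Range(Gamma0) lies inside Range(Upsilon0).
    gap = (np.eye(m) - Upsilon0 @ np.linalg.pinv(Upsilon0)) @ Gamma0
    return Upsilon0, Gamma0, gap

B = np.array([[1.0, -1.0]])
R = np.array([[1.0, 0.0], [0.0, 0.0]])
P = np.array([[1.0]])
A_bar = np.array([[0.0]])

# No control-dependent noise: the deterministic irregular case is recovered.
_, _, gap_no_noise = stochastic_regularity_gap(B, np.zeros((1, 2)), A_bar, P, R)
# Hypothetical noise entering through the second input only.
_, _, gap_with_noise = stochastic_regularity_gap(B, np.array([[0.0, 1.0]]), A_bar, P, R)
```

In this particular data set the term $\bar{B}'P\bar{B}$ fills the null direction of $R$, so the gap vanishes; whether this happens in general depends on $\bar{B}$ and $P$.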

III-A Preliminaries on stochastic optimal control with irregular performance

In this subsection, we first introduce some notation. Without loss of generality, we assume that $rank\big[\Upsilon_0(t)\big]=m_0<m$. Thus $rank\big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\big]=m-m_0>0$, and there is an elementary row transformation matrix $T_0(t)$ such that

$$T_0(t)\big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\big]=\begin{bmatrix}0\\ \Upsilon_{T_0}(t)\end{bmatrix}, \tag{48}$$

where $\Upsilon_{T_0}(t)\in R^{(m-m_0)\times m}$ has full row rank. Furthermore, we denote

\begin{align}
\begin{bmatrix}\ast & C_0'(t)\end{bmatrix} &= \Gamma_0'(t)\big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\big]T_0^{-1}(t), \tag{50}\\
\begin{bmatrix}\ast & B_0(t)\end{bmatrix} &= B(t)\big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\big]T_0^{-1}(t), \tag{52}\\
\begin{bmatrix}\ast & \bar{B}_0(t)\end{bmatrix} &= \bar{B}(t)\big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\big]T_0^{-1}(t), \tag{54}\\
\begin{bmatrix}\ast & G(t)\end{bmatrix} &= T_0^{-1}(t), \tag{56}\\
A_0(t) &= A(t)-B(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t), \notag\\
\bar{A}_0(t) &= \bar{A}(t)-\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t), \notag\\
D_0(t) &= -B(t)\Upsilon_0^{\dagger}(t)B'(t), \notag\\
\bar{D}_0(t) &= -\bar{B}(t)\Upsilon_0^{\dagger}(t)B'(t), \notag\\
F_0(t) &= -B(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t), \notag\\
\bar{F}_0(t) &= -\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t), \notag
\end{align}

where $C_0'(t),B_0(t),\bar{B}_0(t)\in R^{n\times(m-m_0)}$ and $G(t)\in R^{m\times(m-m_0)}$, and define the following Riccati equation:

$$\begin{aligned}
0 ={}& \dot{P}_1(t)+P_1(t)A_0(t)+A_0'(t)P_1(t)+P_1(t)D_0(t)P_1(t) \\
&+\big[\bar{A}_0'(t)+P_1(t)F_0(t)\big]\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\big],
\end{aligned} \tag{57}$$

where the terminal value $P_1(T)$ is to be determined. Moreover, we define

$$\begin{aligned}
A_1(t) &= A_0(t)+D_0(t)P_1(t)+F_0(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\big], \\
B_1(t) &= B_0(t)+F_0(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\bar{B}_0(t), \\
\bar{A}_1(t) &= \bar{A}_0(t)+\bar{D}_0(t)P_1(t)+\bar{F}_0(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\big], \\
\bar{B}_1(t) &= \bar{B}_0(t)+\bar{F}_0(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\bar{B}_0(t).
\end{aligned}$$

In view of the Riccati equations (42) and (57), we make the following assumptions on their solutions $P(t)$ and $P_1(t)$.

Assumption 2.
  1. $$L'(t) = L'(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}\big[I-P_1(t)\bar{F}_0(t)\big],$$

    where $L(t)$ may be $B(t)$, $\bar{A}(t)$, or $\bar{B}(t)$.

  2. $$\big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\big]^{\dagger}L'(t) = \Big\{I-\Upsilon_0^{\dagger}(t)\bar{B}'(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\bar{B}(t)\Big\}\Upsilon_0^{\dagger}(t)L'(t), \tag{59}$$

    where $L(t)$ may be $B(t)$, $\bar{A}(t)$, or $\bar{B}(t)$.

  3. $$0 = \bar{B}_0'(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\bar{B}_0(t). \tag{60}$$

Remark 2.

It is noted that the above assumption is not restrictive. In fact, the first condition of Assumption 2 and (59) hold when the Moore-Penrose inverses become ordinary inverses. Also, the first condition, (59) and (60) hold naturally for deterministic systems.

Based on this assumption, we present the following lemmas, which are useful for the derivation of the main result.

Lemma 3.

Under the first condition of Assumption 2, it holds that

  1. Commutative law:

    $$L_1'(t)P_1(t)\big[I-\bar{F}_0(t)P_1(t)\big]^{\dagger}L_2(t) = L_1'(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)L_2(t),$$

    where $L_1(t),L_2(t)$ may be $B(t)$, $\bar{A}(t)$, or $\bar{B}(t)$.

  2. Formula of the Moore-Penrose inverse for a sum of matrices:

    $$\big[I-\bar{F}_0(t)P_1(t)\big]^{\dagger}L(t) = \Big\{I+\bar{F}_0(t)\big[I-P_1(t)\bar{F}_0(t)\big]^{\dagger}P_1(t)\Big\}L(t),$$

    where $L(t)$ may be $B(t)$, $\bar{A}(t)$, or $\bar{B}(t)$.

Proof. The proof is given in Appendix -A. $\blacksquare$

By using Lemma 3, we have a uniform equation for P(t)+P1(t)P(t)+P_{1}(t) as follows.

Lemma 4.

Under the assumption (LABEL:p11)-(59), it holds that P¯(t)=P(t)+P1(t)\bar{P}(t)=P(t)+P_{1}(t) satisfies the following Riccati equation:

0\displaystyle 0 =\displaystyle= P¯˙(t)+A(t)P¯(t)+P¯(t)A(t)+A¯(t)P¯(t)A¯(t)\displaystyle\dot{\bar{P}}(t)+A^{\prime}(t)\bar{P}(t)+\bar{P}(t)A(t)+\bar{A}^{\prime}(t)\bar{P}(t)\bar{A}(t) (61)
+Q(t)Γ¯(t)Υ¯(t)Γ¯(t),\displaystyle+Q(t)-\bar{\Gamma}^{\prime}(t)\bar{\Upsilon}^{{\dagger}}(t)\bar{\Gamma}(t),

where P(t)P(t) and P1(t)P_{1}(t) are solutions of (42) and (57) respectively, the terminal value is given by P¯(T)=H+P1(T)\bar{P}(T)=H+P_{1}(T) and

Υ¯(t)\displaystyle\bar{\Upsilon}(t) =\displaystyle= R(t)+B¯(t)P¯(t)B¯,\displaystyle R(t)+\bar{B}^{\prime}(t)\bar{P}(t)\bar{B},
Γ¯(t)\displaystyle\bar{\Gamma}(t) =\displaystyle= B(t)P¯(t)+B¯(t)P¯(t)A¯(t).\displaystyle B^{\prime}(t)\bar{P}(t)+\bar{B}^{\prime}(t)\bar{P}(t)\bar{A}(t).

Proof. The proof is given in Appendix -B. \blacksquare
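To illustrate how the Riccati equation (61) can be evaluated numerically, the following Python sketch integrates it backward from the terminal value with a classical Runge-Kutta scheme. It assumes constant coefficient matrices (the paper allows time-varying ones), and the function names `riccati_rhs` and `solve_riccati_backward` are hypothetical. The test case is a scalar deterministic problem with invertible $\bar{\Upsilon}$, chosen because its solution $\tanh(T-t)$ is known in closed form; in a genuinely irregular problem the pseudo-inverse may change rank along the trajectory, and a fixed-step integrator like this one is not guaranteed to behave well.

```python
import numpy as np

def riccati_rhs(Pb, A, Ab, B, Bb, Q, R):
    """Return d(Pbar)/dt as implied by (61), using the Moore-Penrose inverse."""
    Ups = R + Bb.T @ Pb @ Bb             # \bar{Upsilon}(t)
    Gam = B.T @ Pb + Bb.T @ Pb @ Ab      # \bar{Gamma}(t)
    return -(A.T @ Pb + Pb @ A + Ab.T @ Pb @ Ab + Q
             - Gam.T @ np.linalg.pinv(Ups) @ Gam)

def solve_riccati_backward(PT, A, Ab, B, Bb, Q, R, T, steps=2000):
    """Integrate (61) from t = T down to t = 0 with classical RK4."""
    h = -T / steps                       # negative step: backward in time
    P = np.array(PT, dtype=float)
    args = (A, Ab, B, Bb, Q, R)
    for _ in range(steps):
        k1 = riccati_rhs(P, *args)
        k2 = riccati_rhs(P + 0.5 * h * k1, *args)
        k3 = riccati_rhs(P + 0.5 * h * k2, *args)
        k4 = riccati_rhs(P + h * k3, *args)
        P = P + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return P

# Scalar check: A = Abar = Bbar = 0, B = Q = R = 1, Pbar(T) = 0.
# Then (61) reduces to dP/dt = P^2 - 1, whose solution is P(t) = tanh(T - t).
one = np.array([[1.0]])
zero = np.array([[0.0]])
P0 = solve_riccati_backward(PT=zero, A=zero, Ab=zero, B=one, Bb=zero,
                            Q=one, R=one, T=1.0)
print(P0[0, 0], np.tanh(1.0))
```

For the terminal value $\bar{P}(T)=H+P_1(T)$ of Lemma 4, one would pass `PT = H + P1_T`, where `P1_T` is whatever terminal matrix is being tested.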

At the end of this subsection, we give an equivalent solvability condition for Problem 2 by reformulating the FBSDEs (38), (41) and (40) using the notation introduced below (48).

Lemma 5.

If Problem 2 has a solution, then the optimal controller satisfies

\begin{align*}
u(t) ={}& -\Upsilon_0^{\dagger}(t)\Big[\Gamma_0(t)x(t)+B'(t)\Theta(t)+\bar{B}'(t)\bar{\Theta}(t)\Big] \\
& +G(t)u_1(t), \tag{62}
\end{align*}

where $u_1(t)\in R^{m-m_0}$ is an arbitrary vector such that

\begin{align*}
0 = C_0(t)x(t)+B_0'(t)\Theta(t)+\bar{B}_0'(t)\bar{\Theta}(t), \tag{63}
\end{align*}

and $(x(t),\Theta(t),\bar{\Theta}(t))$ obey the following FBSDEs:

\begin{align*}
dx(t) ={}& \Big[A_0(t)x(t)+D_0(t)\Theta(t)+F_0(t)\bar{\Theta}(t)+B_0(t)u_1(t)\Big]dt \\
& +\Big[\bar{A}_0(t)x(t)+\bar{D}_0(t)\Theta(t)+\bar{F}_0(t)\bar{\Theta}(t)+\bar{B}_0(t)u_1(t)\Big]dw(t), \tag{64} \\
d\Theta(t) ={}& -\Big[A_0'(t)\Theta(t)+\bar{A}_0'(t)\bar{\Theta}(t)+C_0'(t)u_1(t)\Big]dt+\bar{\Theta}(t)dw(t), \tag{65}
\end{align*}

with $x(0)=x_0$ and $\Theta(T)=0$. Conversely, if the FBSDEs (64), (65) and (63) are solvable, then Problem 2 is also solvable.

Proof. The proof is given in Appendix -C. \blacksquare

III-B Solution to stochastic optimal control with irregular performance

We are now in the position to present the main result for the stochastic optimal control with irregular performance.

Theorem 2.

Under Assumption 2, Problem 2 is solvable if there exists a matrix $P_1(T)$ such that the modified cost function

\begin{align*}
\bar{J}(x_0;u)=J(x_0;u)+E\Big\{x'(T)\Big[H+P_1(T)\Big]x(T)\Big\} \tag{66}
\end{align*}

is regular and $P_1(T)x(T)=0$ is achieved by the controller minimizing (66).

Proof. By Lemma 5, Problem 2 is solvable if the FBSDEs (64), (65) and (63) are solvable. Thus, it suffices to prove that if there exists a matrix $P_1(T)$ such that (66) is regular and $P_1(T)x(T)=0$, then the FBSDEs (64), (65) and (63) are solvable. The proof is divided into two steps. The first step is to show that the regularity of (66) implies the following condition:

\begin{align*}
0 ={}& C_0(t)+B_0'(t)P_1(t)+\bar{B}_0'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger} \\
& \times P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big]. \tag{67}
\end{align*}

The second step is to verify that, under Assumption 2, (67) and $P_1(T)x(T)=0$, the triple $(x(t),\Theta(t),\bar{\Theta}(t))$ defined below solves the FBSDEs (64), (65) and (63):

\begin{align*}
\Theta(t) ={}& P_1(t)x(t), \tag{68} \\
\bar{\Theta}(t) ={}& P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big], \tag{69} \\
dx(t) ={}& \Big[A_1(t)x(t)+B_1(t)u_1(t)\Big]dt \\
& +\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]dw(t). \tag{70}
\end{align*}

First of all, we prove that if (66) is regular, then (67) holds. In fact, by using (78) in Appendix -B, we have

\begin{align*}
& \Big\{\Gamma_0(t)+B'(t)P_1(t)+\bar{B}'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big]\Big\}' \\
={}& \Big[\bar{A}'(t)P_1(t)\bar{B}(t)+\Gamma_0'(t)+P_1(t)B(t)\Big] \\
& \times\Big\{I-\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Big\} \\
={}& \bar{\Gamma}'(t)\Big\{I-\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Big\}.
\end{align*}

Together with the notation introduced below (48), it further follows that

\begin{align*}
& \Upsilon_{T_0}'(t)\Big\{C_0(t)+B_0'(t)P_1(t)+\bar{B}_0'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger} \\
& \times P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big]\Big\} \tag{71} \\
={}& \Big[I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t)\Big]\Big\{I-\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger} \\
& \times P_1(t)\bar{B}(t)\Big\}'\bar{\Gamma}(t) \\
={}& \Big\{I-\Upsilon_0(t)\bar{\Upsilon}^{\dagger}(t)-\bar{B}'(t)P_1(t)\Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger} \\
& \times\bar{B}(t)\Upsilon_0^{\dagger}(t)\Big\}\bar{\Gamma}(t) \\
={}& \Big[I-\bar{\Upsilon}(t)\bar{\Upsilon}^{\dagger}(t)\Big]\bar{\Gamma}(t) \\
={}& 0,
\end{align*}

where the last equality follows from the regularity of (66). Since $\Upsilon_{T_0}(t)$ has full row rank, we conclude that (67) holds.

Next, we prove that $(x(t),\Theta(t),\bar{\Theta}(t))$ defined by (68)-(70) solves the FBSDEs (64), (65) and (63). In fact, applying Itô's formula to $P_1(t)x(t)$, we have

\begin{align*}
& d\Big[P_1(t)x(t)\Big] \\
={}& \dot{P}_1(t)x(t)dt+P_1(t)\Big[A_1(t)x(t)+B_1(t)u_1(t)\Big]dt \\
& +P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]dw(t) \\
={}& -A_0'(t)P_1(t)x(t)dt-\bar{A}_0'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger} \\
& \times\Big\{P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big]x(t)+P_1(t)\bar{B}_0(t)u_1(t)\Big\}dt \\
& +\Big\{P_1(t)B_0(t)+\Big[\bar{A}_0'(t)+P_1(t)F_0(t)\Big]\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\bar{B}_0(t)\Big\}u_1(t)dt \\
& +P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]dw(t),
\end{align*}

where the equation (57) for $P_1(t)$ has been used in the derivation of the last equality. Combining this with (67), the above equation can be further written as

\begin{align*}
& d\Big[P_1(t)x(t)\Big] \tag{72} \\
={}& -A_0'(t)P_1(t)x(t)dt-\bar{A}_0'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger} \\
& \times\Big\{P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big]x(t)+P_1(t)\bar{B}_0(t)u_1(t)\Big\}dt \\
& -C_0'(t)u_1(t)dt+P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]dw(t).
\end{align*}

Using (78) in Appendix -B again, we obtain

\begin{align*}
& L_1'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)L_2(t) \\
={}& L_1'(t)\Big\{I+P_1(t)\bar{F}_0(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}\Big\}P_1(t)L_2(t).
\end{align*}

Accordingly, (72) can be reformulated as

\begin{align*}
& d\Big[P_1(t)x(t)\Big] \tag{73} \\
={}& -A_0'(t)P_1(t)x(t)dt-\bar{A}_0'(t)\Big\{I+P_1(t)\bar{F}_0(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}\Big\} \\
& \times\Big\{P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big]x(t)+P_1(t)\bar{B}_0(t)u_1(t)\Big\}dt \\
& -C_0'(t)u_1(t)dt+P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]dw(t) \\
={}& -A_0'(t)P_1(t)x(t)dt-\bar{A}_0'(t)P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]dt \\
& -C_0'(t)u_1(t)dt+P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]dw(t).
\end{align*}

In addition, by using (60) and (67), we have

\begin{align*}
& C_0(t)x(t)+B_0'(t)P_1(t)x(t)+\bar{B}_0'(t)P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big] \tag{74} \\
={}& \Big\{C_0(t)+B_0'(t)P_1(t)+\bar{B}_0'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger} \\
& \times P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big]\Big\}x(t) \\
& +\bar{B}_0'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\bar{B}_0(t)u_1(t) \\
={}& 0.
\end{align*}

Similarly, we can rewrite (70) as follows:

\begin{align*}
dx(t) ={}& \Big\{A_0(t)x(t)+D_0(t)P_1(t)x(t)+F_0(t)P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]+B_0(t)u_1(t)\Big\}dt \\
& +\Big\{\bar{A}_0(t)x(t)+\bar{D}_0(t)P_1(t)x(t)+\bar{F}_0(t)P_1(t)\Big[\bar{A}_1(t)x(t)+\bar{B}_1(t)u_1(t)\Big]+\bar{B}_0(t)u_1(t)\Big\}dw(t).
\end{align*}

Comparing (LABEL:p9), (73) and (74) with (64), (65) and (63), and using the existence of $u_1(t)$ such that $P_1(T)x(T)=0$, it follows that (68)-(70) solve the FBSDEs (64), (65) and (63).

Based on Lemma 5, Problem 2 is solvable. The proof is now completed. \blacksquare

We now present the optimal controller for the stochastic optimal control with irregular performance.

Corollary 2.

Under Assumption 2, if there exists a matrix $P_1(T)$ such that the cost function (66) is regular and $P_1(T)x(T)=0$ can be achieved by the controller minimizing (66), then the optimal controller $u(t)$ is given by

\begin{align*}
u(t) = -\bar{\Upsilon}^{\dagger}(t)\bar{\Gamma}(t)x(t)+\Big[I-\bar{\Upsilon}^{\dagger}(t)\bar{\Upsilon}(t)\Big]z(t), \tag{76}
\end{align*}

where $z(t)\in R^{m}$ is an arbitrary vector such that $P_1(T)x(T)=0$. The optimal cost is given by

\begin{align*}
J^{*}(x_0;u) = x_0'\bar{P}(t_0)x_0. \tag{77}
\end{align*}

Proof. By solving the regular optimal control problem of minimizing (66) subject to (38), we have the optimal control (76) directly. The proof is now completed. \blacksquare
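The structure of the controller (76) can be illustrated at a single time instant with the short Python sketch below. It checks the range condition $[I-\bar{\Upsilon}\bar{\Upsilon}^{\dagger}]\bar{\Gamma}=0$, which the proof of Theorem 2 derives from regularity, and then evaluates (76) for a singular $\bar{\Upsilon}$. The function names and the numerical values are hypothetical, and the sketch does not address how $z(t)$ should be chosen to achieve $P_1(T)x(T)=0$; it only shows that the free term lives in the null space of $\bar{\Upsilon}$.

```python
import numpy as np

def range_condition_holds(Ups_bar, Gam_bar, tol=1e-10):
    """Check [I - Ups Ups^+] Gam = 0 (hypothetical helper)."""
    m = Ups_bar.shape[0]
    resid = (np.eye(m) - Ups_bar @ np.linalg.pinv(Ups_bar)) @ Gam_bar
    return bool(np.linalg.norm(resid) < tol)

def controller(Ups_bar, Gam_bar, x, z):
    """Evaluate (76): u = -Ups^+ Gam x + [I - Ups^+ Ups] z."""
    m = Ups_bar.shape[0]
    Ups_pinv = np.linalg.pinv(Ups_bar)
    return -Ups_pinv @ Gam_bar @ x + (np.eye(m) - Ups_pinv @ Ups_bar) @ z

# Illustrative values only: a singular Ups_bar with a compatible Gam_bar.
Ups_bar = np.array([[1.0, 0.0],
                    [0.0, 0.0]])
Gam_bar = np.array([[1.0, 2.0],
                    [0.0, 0.0]])
x = np.array([1.0, 1.0])
z = np.array([5.0, 7.0])

regular = range_condition_holds(Ups_bar, Gam_bar)
u = controller(Ups_bar, Gam_bar, x, z)

# Stationarity residual Ups_bar u + Gam_bar x, which vanishes for any z
# when the range condition holds.
residual = Ups_bar @ u + Gam_bar @ x
print(regular, u, residual)
```

In this example the second component of `u` is set entirely by `z`, which is the freedom the controller uses to meet the terminal constraint.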

IV Conclusions

In this paper, we have investigated the essential nature of irregular LQ control. It was shown that the difference between irregular LQ control and standard (regular) LQ control is that the irregular controller must do two things at the same time: minimize the LQ cost and achieve the terminal state condition. As an application, we have presented a sufficient solvability condition, together with the optimal controller, for irregular LQ control of stochastic systems with multiplicative noise.

-A Proof of Lemma 3

  1. By using (LABEL:p11), we have

    \begin{align*}
    & L_1'(t)P_1(t)\Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger}L_2(t) \\
    ={}& L_1'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}\Big[I-P_1(t)\bar{F}_0(t)\Big]P_1(t)\Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger}L_2(t) \\
    ={}& L_1'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\Big[I-\bar{F}_0(t)P_1(t)\Big]\Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger}L_2(t) \\
    ={}& L_1'(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)L_2(t).
    \end{align*}

  2. Using (LABEL:p11) again, we obtain

    \begin{align*}
    & \Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger}L(t) \\
    ={}& L(t)+\bar{F}_0(t)P_1(t)\Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger}L(t) \\
    ={}& L(t)+\bar{F}_0(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}\Big[I-P_1(t)\bar{F}_0(t)\Big]P_1(t)\Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger}L(t) \\
    ={}& L(t)+\bar{F}_0(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\Big[I-\bar{F}_0(t)P_1(t)\Big]\Big[I-\bar{F}_0(t)P_1(t)\Big]^{\dagger}L(t) \\
    ={}& L(t)+\bar{F}_0(t)\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)L(t).
    \end{align*}

The proof is now completed.

-B Proof of Lemma 4

First, we carry out some algebraic calculations on the Riccati equation (57). Based on Lemma 3, it can be derived that

\begin{align*}
L_1'(t)P_1(t)L_2(t) ={}& L_1'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big] \\
& \times\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)L_2(t), \tag{78}
\end{align*}

where $L_1(t), L_2(t)$ may be $B(t)$, $\bar{A}(t)$, or $\bar{B}(t)$. This implies that

\begin{align*}
& \bar{A}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \tag{79} \\
={}& \bar{A}'(t)P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
& -\bar{A}'(t)P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
={}& \bar{A}'(t)P_1(t)\bar{B}(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}\Gamma_0(t),
\end{align*}

where (59) has been used in the derivation of the last equality. Applying similar steps, we obtain

\begin{align*}
& \Big[\bar{A}_0'(t)+P_1(t)F_0(t)\Big]\Big[I-P_1(t)\bar{F}_0(t)\Big]^{\dagger}P_1(t)\Big[\bar{A}_0(t)+\bar{D}_0(t)P_1(t)\Big] \\
={}& \Big[\bar{A}'(t)-\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)-P_1(t)B(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big] \\
& \times\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t) \\
& \times\Big[\bar{A}(t)-\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t)-\bar{B}(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t)\Big] \\
={}& \bar{A}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{A}(t) \\
& -\bar{A}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
& -\bar{A}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
& -\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{A}(t) \\
& +\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
& +\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
& -P_1(t)B(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{A}(t) \\
& +P_1(t)B(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
& +P_1(t)B(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big[I+P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]^{\dagger}P_1(t)\bar{B}(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
={}& \bar{A}'(t)P_1(t)\bar{A}(t) \\
& -\bar{A}'(t)P_1(t)\bar{B}(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}\bar{B}'(t)P_1(t)\bar{A}(t) \\
& -\bar{A}'(t)P_1(t)\bar{B}(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}\Gamma_0(t) \\
& -\bar{A}'(t)P_1(t)\bar{B}(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}B'(t)P_1(t) \\
& -\Gamma_0'(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}\bar{B}'(t)P_1(t)\bar{A}(t) \\
& -\Gamma_0'(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}\Gamma_0(t)+\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
& -\Gamma_0'(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}B'(t)P_1(t)+\Gamma_0'(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
& -P_1(t)B(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}\bar{B}'(t)P_1(t)\bar{A}(t) \\
& -P_1(t)B(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}\Gamma_0(t)+P_1(t)B(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
& -P_1(t)B(t)\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger}B'(t)P_1(t)+P_1(t)B(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
={}& \bar{A}'(t)P_1(t)\bar{A}(t) \\
& -\Big[\bar{A}'(t)P_1(t)\bar{B}(t)+P_1(t)B(t)+\Gamma_0'(t)\Big]\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger} \\
& \times\Big[\bar{B}'(t)P_1(t)\bar{A}(t)+B'(t)P_1(t)+\Gamma_0(t)\Big] \\
& +\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t)+\Gamma_0'(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
& +P_1(t)B(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t)+P_1(t)B(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t).
\end{align*}

Together with (42) and (57), it follows that

\begin{align*}
0 ={}& \dot{P}_1(t)+\dot{P}(t)+\Big[P_1(t)+P(t)\Big]A(t)+A'(t)\Big[P_1(t)+P(t)\Big]+Q(t) \\
& -P_1(t)B(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t)-\Gamma_0'(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t)-P_1(t)B(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
& +\bar{A}'(t)\Big[P_1(t)+P(t)\Big]\bar{A}(t) \\
& -\Big[\bar{A}'(t)P_1(t)\bar{B}(t)+P_1(t)B(t)+\Gamma_0'(t)\Big]\Big[\Upsilon_0(t)+\bar{B}'(t)P_1(t)\bar{B}(t)\Big]^{\dagger} \\
& \times\Big[\bar{B}'(t)P_1(t)\bar{A}(t)+B'(t)P_1(t)+\Gamma_0(t)\Big] \\
& +\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t)+\Gamma_0'(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
& +P_1(t)B(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t)+P_1(t)B(t)\Upsilon_0^{\dagger}(t)B'(t)P_1(t) \\
& -\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\Gamma_0(t) \\
={}& \dot{P}_1(t)+\dot{P}(t)+\Big[P_1(t)+P(t)\Big]A(t)+A'(t)\Big[P_1(t)+P(t)\Big]+Q(t) \\
& +\bar{A}'(t)\Big[P_1(t)+P(t)\Big]\bar{A}(t) \\
& -\Big[\bar{A}'(t)P_1(t)\bar{B}(t)+P_1(t)B(t)+\Gamma_0'(t)\Big]\Big[R(t)+\bar{B}'(t)\big(P_1(t)+P(t)\big)\bar{B}(t)\Big]^{\dagger} \\
& \times\Big[\bar{B}'(t)P_1(t)\bar{A}(t)+B'(t)P_1(t)+\Gamma_0(t)\Big].
\end{align*}

This is exactly (61). The proof is now completed.

-C Proof of Lemma 5

The key is to reformulate the FBSDEs (38), (41) and (40) as the FBSDEs (64), (65) and (63). Without loss of generality, we assume that

\begin{align*}
\Theta(t)=p(t)-P(t)x(t), \tag{80}
\end{align*}

where $P(t)$ obeys the Riccati equation (42). Since $P(T)=H$, it is clear that $\Theta(T)=p(T)-P(T)x(T)=0$. We also assume without loss of generality that

\begin{align*}
d\Theta(t)=\hat{\Theta}(t)dt+\bar{\Theta}(t)dw(t), \tag{81}
\end{align*}

where $\hat{\Theta}(t)$ and $\bar{\Theta}(t)$ are to be determined. Applying Itô's formula to (80) yields

\begin{align*}
dp(t) ={}& \dot{P}(t)x(t)dt+P(t)\Big[A(t)x(t)+B(t)u(t)\Big]dt \\
& +P(t)\Big[\bar{A}(t)x(t)+\bar{B}(t)u(t)\Big]dw(t)+\hat{\Theta}(t)dt+\bar{\Theta}(t)dw(t). \tag{82}
\end{align*}

Using (80), we rewrite (41) as

\begin{align*}
dp(t) ={}& -\Big[A'(t)P(t)x(t)+A'(t)\Theta(t)+\bar{A}'(t)q(t)+Q(t)x(t)\Big]dt+q(t)dw(t). \tag{83}
\end{align*}

Comparing (82) with (83), it follows that

\begin{align*}
q(t) ={}& P(t)\Big[\bar{A}(t)x(t)+\bar{B}(t)u(t)\Big]+\bar{\Theta}(t), \tag{84} \\
0 ={}& \dot{P}(t)x(t)+P(t)A(t)x(t)+P(t)B(t)u(t)+\hat{\Theta}(t) \\
& +A'(t)P(t)x(t)+A'(t)\Theta(t)+\bar{A}'(t)q(t)+Q(t)x(t). \tag{85}
\end{align*}

Using (84) and (80), (40) becomes

\begin{align*}
0 ={}& R(t)u(t)+B'(t)P(t)x(t)+B'(t)\Theta(t) \\
& +\bar{B}'(t)P(t)\bar{A}(t)x(t)+\bar{B}'(t)P(t)\bar{B}(t)u(t)+\bar{B}'(t)\bar{\Theta}(t) \\
={}& \Upsilon_0(t)u(t)+\Gamma_0(t)x(t)+B'(t)\Theta(t)+\bar{B}'(t)\bar{\Theta}(t), \tag{86}
\end{align*}

where $\Upsilon_0(t)$ and $\Gamma_0(t)$ are as in (43) and (44), respectively.

Thus, (86) can be equivalently written as

\begin{align*}
u(t) ={}& -\Upsilon_0^{\dagger}(t)\Big[\Gamma_0(t)x(t)+B'(t)\Theta(t)+\bar{B}'(t)\bar{\Theta}(t)\Big] \\
& +\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]z(t), \tag{87}
\end{align*}

where $z(t)$ is a vector of compatible dimension such that the following equality holds:

\begin{align*}
0 = \Big[I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t)\Big]\Big[\Gamma_0(t)x(t)+B'(t)\Theta(t)+\bar{B}'(t)\bar{\Theta}(t)\Big]. \tag{88}
\end{align*}

Denote

\begin{align*}
T_0(t)\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]z(t)=\left[\begin{array}{c}0\\ u_1(t)\end{array}\right], \tag{91}
\end{align*}

where $u_1(t)=\Upsilon_{T_0}(t)z(t)\in R^{m-m_0}$. Since $\Upsilon_{T_0}(t)\in R^{[m-m_0]\times m}$ has full row rank, $u_1(t)$ can be arbitrary because $z(t)$ is arbitrary. Moreover, the optimal controller (62) follows from (87) by using the definition of $G(t)$.

Now we rewrite (88) as (63). First, note that

\begin{align*}
& I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t) \tag{93} \\
={}& \Big[I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t)\Big]\Big[I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t)\Big] \\
={}& \Big[I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t)\Big]T_0'(t)\Big[T_0^{-1}(t)\Big]'\Big[I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t)\Big] \\
={}& \left[\begin{array}{cc}0 & \Upsilon_{T_0}'(t)\end{array}\right]\Big[T_0^{-1}(t)\Big]'\Big[I-\Upsilon_0(t)\Upsilon_0^{\dagger}(t)\Big],
\end{align*}

where (48) has been used in the derivation of the last equality. Using the notation introduced below (48), (88) can be written as

\begin{align*}
0 = \Upsilon_{T_0}'(t)\Big[C_0(t)x(t)+B_0'(t)\Theta(t)+\bar{B}_0'(t)\bar{\Theta}(t)\Big]. \tag{94}
\end{align*}

Since $\Upsilon_{T_0}'(t)$ has full column rank, (94) is equivalent to (63). Substituting (87) and (84) into (85) and using (42), we obtain

\begin{align*}
0 ={}& \dot{P}(t)x(t)+P(t)A(t)x(t)+\hat{\Theta}(t)+A'(t)P(t)x(t) \\
& +A'(t)\Theta(t)+\bar{A}'(t)P(t)\bar{A}(t)x(t)+\bar{A}'(t)\bar{\Theta}(t) \\
& +Q(t)x(t)-\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\Big[\Gamma_0(t)x(t)+B'(t)\Theta(t)+\bar{B}'(t)\bar{\Theta}(t)\Big] \\
& +\Gamma_0'(t)\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]z(t) \tag{95} \\
={}& \hat{\Theta}(t)+\Big[A'(t)-\Gamma_0'(t)\Upsilon_0^{\dagger}(t)B'(t)\Big]\Theta(t) \\
& +\Big[\bar{A}'(t)-\Gamma_0'(t)\Upsilon_0^{\dagger}(t)\bar{B}'(t)\Big]\bar{\Theta}(t) \\
& +\Gamma_0'(t)\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]z(t).
\end{align*}

Since $\big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\big]^2=I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)$, we obtain

\begin{align*}
& \Gamma_0'(t)\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]z(t) \tag{98} \\
={}& \Gamma_0'(t)\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]T_0^{-1}(t)T_0(t)\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]z(t) \\
={}& \Gamma_0'(t)\Big[I-\Upsilon_0^{\dagger}(t)\Upsilon_0(t)\Big]T_0^{-1}(t)\left[\begin{array}{c}0\\ u_1(t)\end{array}\right] \\
={}& C_0'(t)u_1(t). \tag{99}
\end{align*}
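The projector identity used in this step can be checked numerically. The Python sketch below builds a random rank-deficient positive semidefinite matrix as a stand-in for $\Upsilon_0(t)$ (an assumption made only for illustration) and confirms that $I-\Upsilon_0^{\dagger}\Upsilon_0$ is idempotent and annihilated by $\Upsilon_0$, which is why the term in $z(t)$ does not disturb the equilibrium condition (86).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for Upsilon_0(t): 3x3, symmetric PSD, rank 2.
G = rng.standard_normal((3, 2))
Ups0 = G @ G.T

# Projector onto the null space of Ups0.
Pi = np.eye(3) - np.linalg.pinv(Ups0) @ Ups0

idempotent = np.allclose(Pi @ Pi, Pi, atol=1e-10)            # [I - U^+ U]^2 = I - U^+ U
annihilated = np.allclose(Ups0 @ Pi, np.zeros((3, 3)), atol=1e-10)  # U [I - U^+ U] = 0

print(idempotent, annihilated)
```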

Thus, we have from (95) that

\begin{equation}
\hat{\Theta}(t)=-\Big[A^{\prime}_{0}(t)\Theta(t)+\bar{A}^{\prime}_{0}(t)\bar{\Theta}(t)+C^{\prime}_{0}(t)u_{1}(t)\Big], \tag{100}
\end{equation}

which, together with (81), implies that the dynamics of $\Theta(t)$ are given by (65).

Substituting (87) into (38) yields the state dynamics (64).

Based on the above derivations, the FBSDEs (38), (41) and (40) have been equivalently rewritten as the FBSDEs (64), (65) and (63). Combining this with Lemma 2 yields the result, which completes the proof.
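The projector property used in the proof can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `null_space_projector` and the 2x2 matrix `upsilon0` are hypothetical stand-ins for $\Upsilon_{0}(t)$ at a fixed $t$ and are not taken from the paper.

```python
import numpy as np


def null_space_projector(upsilon: np.ndarray) -> np.ndarray:
    """Return I - pinv(upsilon) @ upsilon, the orthogonal projector onto ker(upsilon)."""
    n = upsilon.shape[1]
    return np.eye(n) - np.linalg.pinv(upsilon) @ upsilon


# Hypothetical singular (rank-one) symmetric matrix playing the role of Upsilon_0(t).
upsilon0 = np.array([[1.0, 1.0],
                     [1.0, 1.0]])

proj = null_space_projector(upsilon0)

# Idempotence: (I - pinv(U) U)^2 equals I - pinv(U) U up to rounding.
print(np.allclose(proj @ proj, proj))

# The projector maps into the kernel of upsilon0, so upsilon0 @ proj vanishes.
print(np.allclose(upsilon0 @ proj, 0.0))
```

When `upsilon0` is invertible the projector is the zero matrix, which corresponds to the regular case where no free component $u_{1}(t)$ remains.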

References

  • [1] B. D. O. Anderson, J. B. Moore. Optimal control: linear quadratic methods. Englewood Cliffs, NJ: Prentice Hall, 1990.
  • [2] D. J. Bell, Singular problems in optimal control-a survey, International Journal of Control, 21(2): 319-331, 1975.
  • [3] R. Bellman, The theory of dynamic programming. Bulletin of the American Mathematical Society, 60(6): 503-516, 1954.
  • [4] R. Bellman, I. Glicksberg, O. Gross, Some aspects of the mathematical theory of control processes, Rand Corporation, R-313, 1958.
  • [5] J. F. Bonnans, F. J. Silva, First and second order necessary conditions for stochastic optimal control problems. Applied Mathematics & Optimization, 2012, 65: 403-439.
  • [6] H.-F. Chen, Unified controls applicable to general case under quadratic index, Acta Mathematicae Applicatae Sinica, 5(1): 45-52, 1982.
  • [7] S. Chen, X. Li, X. Zhou, Stochastic linear quadratic regulators with indefinite control weight costs, SIAM Journal on Control and Optimization, 36(5): 1685-1702, 1998.
  • [8] D. Clements, B. Anderson, Singular optimal control: The linear-quadratic problem, Springer-Verlag, New York, 1978.
  • [9] R. Gabasov, F. M. Kirillova, High order necessary conditions for optimality. SIAM J. Control, 1972, 10: 127-168.
  • [10] V. Gurman, The method of multiple maxima and optimization problems for space maneuvers. Proc. Second Readings of K. E. Tsiolkovskii, Moscow, 1968, 39-51.
  • [11] D. Hoehener, Variational approach to second-order optimality conditions for control problems with pure state constraints. SIAM Journal on Control and Optimization, 2012, 50: 1139-1173.
  • [12] Y. Ho, Linear Stochastic Singular Control Problems, Journal of Optimization Theory and Applications, 9(1): 24-31, 1972.
  • [13] T. Hsia, On the existence and synthesis of optimal singular control with quadratic performance index, IEEE Trans. Autom. Control, 12(6): 778-779, 1967.
  • [14] R. E. Kalman, Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, 5: 102-119, 1960.
  • [15] I. Kliger, Discussion on the stability of the singular trajectory with respect to “Bang-Bang” control, IEEE Trans. Autom. Control, 9(4): 583-585, 1964.
  • [16] A. J. Krener, The high order maximal principle and its application to singular extremals. SIAM Journal on Control and Optimization, 1977, 15: 256-293.
  • [17] A. M. Letov, The analytical design of control systems, Automat. Remote Control, 22: 363-372, 1961.
  • [18] F. L. Lewis, D. L. Vrabie, V. L. Syrmos, Optimal control. John Wiley & Sons, Inc., 2012.
  • [19] J. Moore, The singular solutions to a singular quadratic minimization problem. International Journal of Control, 1974, 20(3): 383-393.
  • [20] R. Penrose, A generalized inverse of matrices, Mathematical Proceedings of the Cambridge Philosophical Society, 52: 17-19, 1955.
  • [21] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, E. F. Mishchenko. The mathematical theory of optimal processes. English translation. Interscience, 1962.
  • [22] Q. Qi, H. Zhang, Time-inconsistent stochastic linear quadratic control for discrete-time systems, SCIENCE CHINA Information Sciences, 60(12), 120204:1-120204:13, 2017.
  • [23] M. Ait Rami, X. Chen, X. Y. Zhou, Discrete-time indefinite LQ control with state and control dependent noise, Journal of Global Optimization, 23: 245-265, 2002.
  • [24] M. A. Rami, J. B. Moore, X. Y. Zhou, Indefinite stochastic linear quadratic control and generalized Riccati equation, SIAM Journal on Control and Optimization, 40(4): 1296-1311, 2001.
  • [25] J. Shi, G. Wang, J. Xiong, Linear-quadratic stochastic Stackelberg differential game with asymmetric information, SCIENCE CHINA Information Sciences, 60, 092202:1-092202:15, 2017.
  • [26] J. Speyer, D. Jacobson, Necessary and Sufficient Conditions for Optimality for Singular Control Problems, Journal of Mathematical Analysis and Applications, Vol 31, No. 1, 1971.
  • [27] J. Sun, X. Li, J. Yong, Open-loop and closed-loop solvabilities for stochastic linear quadratic optimal control problems, SIAM Journal on Control and Optimization, 54(5): 2274-2308, 2016.
  • [28] J. C. Willems, A. Kitapci, L. M. Silverman, Singular optimal control: a geometric approach. SIAM Journal on Control and Optimization, 1986, 24(2): 323-337.
  • [29] J. Xu, J. Shi, H. Zhang, A leader-follower stochastic linear quadratic differential game with time delay, SCIENCE CHINA Information Sciences, 61:112202, 2018.
  • [30] H. Zhang, L. Lin, J. Xu, M. Fu, Linear quadratic regulation and stabilization of discrete-time Systems with delay and multiplicative noise, IEEE Transactions on Automatic Control, 60(10): 2599-2613, 2015.
  • [31] H. Zhang, J. Xu, Control for Itô stochastic systems with input delay, IEEE Transactions on Automatic Control, 62(1): 350-365, 2017.
  • [32] H. Zhang, J. Xu, Optimal control with irregular performance, SCIENCE CHINA Information Sciences, 62:192203, 2019.
  • [33] H. Zhang, J. Xu, Control for Itô stochastic systems with input delay, IEEE Trans. Autom. Control, 62(1): 350-365, 2017.
  • [34] H. Zhang, X. Zhang, Pointwise second-order necessary conditions for stochastic optimal controls, Part I: The case of convex control constraint. SIAM Journal on Control and Optimization, 2015, 53(4): 2267-2296.