
Data-Driven Predictive Control With Adaptive Disturbance Attenuation For Constrained Systems

Nan Li [email protected] (Department of Aerospace Engineering, Auburn University, Auburn, AL 36849, USA)    Ilya Kolmanovsky [email protected] (Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI 48105, USA)    Hong Chen [email protected] (College of Electronic and Information Engineering, Tongji University, Shanghai, China)
Abstract

In this paper, we propose a novel data-driven predictive control approach for systems subject to time-domain constraints. The approach combines the strengths of \mathcal{H}_{\infty} control for rejecting disturbances and MPC for handling constraints. In particular, the approach can dynamically adapt the \mathcal{H}_{\infty} disturbance attenuation performance, depending on the measured system state and the forecasted disturbance level, to satisfy constraints. We establish theoretical properties of the approach, including robust guarantees of closed-loop stability, disturbance attenuation, and constraint satisfaction under noisy data, as well as sufficient conditions for recursive feasibility, and illustrate the approach with a numerical example.

keywords:
Data-Driven Control, \mathcal{H}_{\infty} Control, Model Predictive Control, Constraints, Linear Matrix Inequality


1 Introduction

In addition to stability, disturbance rejection and constraint satisfaction are important topics in control systems engineering, especially as modern systems often operate in complex and dynamic environments while increasingly stringent safety and performance requirements are imposed on them. \mathcal{H}_{\infty} control is an effective approach to addressing the former – it aims to minimize the effect of disturbances on system outputs [1]. Since its introduction in the early 1980s, \mathcal{H}_{\infty} control has been successfully implemented in aerospace, automotive, and many other sectors. However, time-domain constraints on state and control variables are not handled in conventional \mathcal{H}_{\infty} control. Meanwhile, Model Predictive Control (MPC) stands out from many other control approaches for its ability to explicitly and non-conservatively handle time-domain constraints [2]. Therefore, combining \mathcal{H}_{\infty} control and MPC to achieve desired disturbance rejection properties while satisfying constraints is appealing and has been studied in, e.g., [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]. A major strength of such a combination is the ability to dynamically adapt the \mathcal{H}_{\infty} disturbance attenuation performance (depending on the measured/estimated system state and the forecasted disturbance level) to satisfy constraints by solving an MPC optimization problem repeatedly in an online manner [3, 4].

Conventional \mathcal{H}_{\infty} control and MPC methods rely on parametric models of the system to be controlled. As engineered systems become increasingly complex and cyber-physical, first-principles models are more difficult to obtain. Meanwhile, with the rapid advances in sensing, computation, and communication technologies, data is more readily available. This has spurred the development of data-driven methods for system modeling, analysis, and control. In particular, practitioners may favor an end-to-end solution that bypasses the intermediate steps of modeling and analysis and produces a controller with desired properties directly from measured data of system behavior. Therefore, it is beneficial to extend model-based methods for \mathcal{H}_{\infty} control, MPC, and the aforementioned combination to their data-driven counterparts.

Here we provide a brief review of existing data-driven control methods in the literature. Reinforcement Learning (RL) can be used to train optimal controllers from data [13, 14]. However, the majority of RL methods optimize the average behavior of the closed-loop system when it is subject to disturbances with certain statistics and do not provide a worst-case robustness guarantee. Furthermore, they typically handle constraints through penalties, hence making the constraints soft. Integrating data-driven uncertainty estimation/system identification with(in) the MPC framework can lead to reliable usage of data for improving MPC performance while maintaining certain robustness guarantees, especially for satisfying constraints [15, 16]. Along these lines, Data-enabled Predictive Control (DeePC) is an emerging technique that has attracted increasing attention from researchers recently. DeePC uses measured input-output data to create a non-parametric system model based on behavioral systems theory and uses this model to predict future trajectories [17]. It has demonstrated superior performance in various applications [18, 19, 20, 21]. However, DeePC entails a higher computational cost than conventional MPC due to its high-dimensional, data-based non-parametric system model [22, 23]. Also, how to handle noisy data and provide certain robustness guarantees under noisy data remains an open question in DeePC [24, 25, 26]. Furthermore, to the best of our knowledge, there has been no previous work integrating \mathcal{H}_{\infty} control and MPC (or, designing an MPC that has an \mathcal{H}_{\infty}-type disturbance attenuation property) in the general data-driven MPC literature. An approach to synthesizing an \mathcal{H}_{\infty} controller using noisy data based on a matrix S-lemma was introduced in [27]. The approach reduces a data-driven \mathcal{H}_{\infty} control synthesis problem to a low-dimensional Linear Matrix Inequality (LMI) optimization problem, which is computationally tractable with state-of-the-art interior-point LMI solvers. However, time-domain constraints are not handled by the approach of [27], and a moving-horizon, MPC-type implementation of the approach was not considered in [27]. More classical data-driven control methods include self-tuning regulators [28] and iterative learning control [29]. A comprehensive survey of data-driven control methods can be found in [30].

In this paper, we fill the gap in the literature by proposing a novel data-driven control approach that combines the strengths of \mathcal{H}_{\infty} control for rejecting disturbances and MPC for handling constraints. Our approach can be viewed as a data-driven counterpart of the model-based moving-horizon \mathcal{H}_{\infty} control approach of [3, 4] and enjoys similar properties including dynamic adaptation of \mathcal{H}_{\infty} performance depending on measured system state and forecasted disturbance level for satisfying constraints. Specifically, the contributions include:

  1.

    Our approach is the first data-driven MPC method in the literature that focuses on \mathcal{H}_{\infty}-type disturbance attenuation for systems with time-domain constraints.

  2.

    We conduct a comprehensive analysis of the theoretical properties of our approach. The results include robust guarantees of closed-loop stability, disturbance attenuation, and constraint satisfaction under noisy data, as well as conditions for online problem recursive feasibility.

The paper is organized as follows: We describe the problem treated in this paper including key assumptions and preliminaries in Section 2. We develop an approach to synthesizing an \mathcal{H}_{\infty} controller that also enforces time-domain constraints for an unknown system using noisy trajectory data in Section 3. The development in this section has merit in its own right because it extends the data-driven \mathcal{H}_{\infty} control synthesis approach of [27] (which does not handle constraints) to the constrained case. It is also an essential building block of the data-driven MPC algorithm developed in Section 4. Then, Section 4 presents our proposed data-driven MPC algorithm and analyzes its theoretical properties. We illustrate the algorithm with a numerical example in Section 5. Finally, we conclude the paper in Section 6.

The notation used in this paper is mostly standard. We use \mathbb{R}^{n} to denote the space of n-dimensional real vectors, \mathbb{R}^{n\times m} the space of n-by-m real matrices, and \mathbb{N} the set of natural numbers including zero. Given a vector x\in\mathbb{R}^{n}, we use \|x\| to denote its Euclidean norm, i.e., \|x\|=\sqrt{x^{\top}x}. Given a matrix M\in\mathbb{R}^{n\times m}, its kernel is the subspace of all x\in\mathbb{R}^{m} such that Mx=0, i.e., \text{ker}(M)=\{x\in\mathbb{R}^{m}:Mx=0\}. Given two symmetric matrices M,N\in\mathbb{R}^{n\times n}, M\succ N means that M-N is positive definite, i.e., x^{\top}(M-N)x>0 for all non-zero x\in\mathbb{R}^{n}, and M\succeq N means that M-N is positive semidefinite, i.e., x^{\top}(M-N)x\geq 0 for all x\in\mathbb{R}^{n}. Similarly, M\prec N and M\preceq N mean that M-N is negative definite and negative semidefinite, respectively. For an optimization problem, by “solution” we mean a feasible solution, i.e., a set of values for the decision variables that satisfies all constraints, and by “optimal solution” we mean a feasible solution at which the objective function (almost) reaches its maximum (or minimum) value. Because the optimization problems appearing in this paper are all convex, an “optimal solution” is globally optimal.

2 Problem Statement and Preliminaries

We consider the control of dynamic systems that can be represented by the following linear time-invariant model:

x(t+1)=A_{\text{o}}x(t)+B_{\text{o}}u(t)+w(t) (1a)
y_{1}(t)=C_{1}x(t)+D_{1}u(t) (1b)
y_{2}(t)=C_{2}x(t)+D_{2}u(t) (1c)

where x(t)\in\mathbb{R}^{n} denotes the system state at the discrete time t\in\mathbb{N}, u(t)\in\mathbb{R}^{m} denotes the control input, w(t)\in\mathbb{R}^{n} denotes an unmeasured disturbance input, and y_{1}(t)\in\mathbb{R}^{p_{1}} and y_{2}(t)\in\mathbb{R}^{p_{2}} are two outputs the roles of which are introduced below. The goal is to design a control algorithm that achieves the following three objectives:

  1.

    Guaranteeing closed-loop stability;

  2.

    Optimizing disturbance attenuation in terms of the \mathcal{H}_{\infty} performance from w to output y_{1};

  3.

    Enforcing the following constraints on output y_{2} at all times t\in\mathbb{N}:

    y_{2v}(t)\leq y_{2v,\max},\quad v=1,2,\dots,p_{2} (2)

    where y_{2v}(t) denotes the vth entry of y_{2}(t) and y_{2v,\max}\geq 0 for all v=1,2,\dots,p_{2}. Constraints in this form can represent state/output variable bounds, control input limits, etc.

In addition to the standard assumptions of (A_{\text{o}},B_{\text{o}}) being stabilizable and (C_{1},A_{\text{o}}) being detectable, we also assume that the system model (A_{\text{o}},B_{\text{o}}) is unknown and only trajectory data are available. This calls for a data-driven control approach.

Assume we have data \{(x^{+}_{j},x_{j},u_{j})\}_{j=1}^{J}, where (x_{j},u_{j}) denotes a pair of previous state and control input values, x^{+}_{j} denotes the corresponding next state value, and the subscript j indicates the jth data point. The data can be collected from a single or multiple trajectories. According to (1a), we have

x^{+}_{j}=A_{\text{o}}x_{j}+B_{\text{o}}u_{j}+w_{j} (3)

where w_{j} is the disturbance input value at the time of collecting the data point (x^{+}_{j},x_{j},u_{j}). Organizing the data into the following matrices:

X^{+}=\left[x^{+}_{1},\,x^{+}_{2},\,\cdots,\,x^{+}_{J}\right] (4a)
X=\left[x_{1},\,x_{2},\,\cdots,\,x_{J}\right] (4b)
U=\left[u_{1},\,u_{2},\,\cdots,\,u_{J}\right] (4c)
W=\left[w_{1},\,w_{2},\,\cdots,\,w_{J}\right] (4d)

the relation (3) implies

W=X^{+}-A_{\text{o}}X-B_{\text{o}}U (5)

Because the disturbance input w_{j} is not measured, W in (4d) and (5) is unknown. We make the following assumption about the data:

Assumption 1. The disturbance input values w_{1},w_{2},\dots,w_{J}, collected in W, satisfy the following quadratic matrix inequality:

\begin{bmatrix}I\\ W^{\top}\end{bmatrix}^{\top}\begin{bmatrix}\Phi_{11}&\Phi_{12}\\ \Phi_{12}^{\top}&\Phi_{22}\end{bmatrix}\begin{bmatrix}I\\ W^{\top}\end{bmatrix}\succeq 0 (6)

where \Phi_{11}=\Phi_{11}^{\top}\in\mathbb{R}^{n\times n}, \Phi_{12}\in\mathbb{R}^{n\times J}, and \Phi_{22}=\Phi_{22}^{\top}\prec 0 are known matrices.

When \Phi_{12}=0 and \Phi_{22}=-I, (6) reduces to \sum_{j=1}^{J}w_{j}w_{j}^{\top}\preceq\Phi_{11}, which has the interpretation that the total energy of the disturbance inputs in the data is bounded by \Phi_{11}. A known norm bound on each individual disturbance w_{j}, \|w_{j}\|\leq\varepsilon, implies a bound in the form of (6) with \Phi_{11}=(J\varepsilon^{2})I, \Phi_{12}=0, and \Phi_{22}=-I.
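As a concrete illustration, the following NumPy sketch builds the matrices of Assumption 1 from a per-sample bound \|w_{j}\|\leq\varepsilon and numerically checks (6); the dimensions, the value of \varepsilon, and the synthetic disturbance samples are illustrative assumptions, not values taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
n, J, eps = 3, 100, 1e-2                             # illustrative dimensions and bound

# Synthetic disturbance samples with ||w_j|| <= eps (columns of W)
W = rng.standard_normal((n, J))
W = W / np.linalg.norm(W, axis=0) * (eps * rng.uniform(0.0, 1.0, J))

# Assumption 1 matrices implied by the per-sample bound
Phi11, Phi12, Phi22 = J * eps**2 * np.eye(n), np.zeros((n, J)), -np.eye(J)

# Quadratic matrix inequality (6): Phi11 + Phi12 W^T + W Phi12^T + W Phi22 W^T >= 0
lhs = Phi11 + Phi12 @ W.T + W @ Phi12.T + W @ Phi22 @ W.T
print(np.all(np.linalg.eigvalsh(lhs) >= -1e-12))     # True, up to numerical tolerance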

Substituting W=X^{+}-AX-BU into (6) and rearranging, we obtain

\begin{bmatrix}I\\ A^{\top}\\ B^{\top}\end{bmatrix}^{\top}\begin{bmatrix}\Theta&\star&\star\\ -X\Phi_{12}^{\top}-X\Phi_{22}(X^{+})^{\top}&X\Phi_{22}X^{\top}&\star\\ -U\Phi_{12}^{\top}-U\Phi_{22}(X^{+})^{\top}&U\Phi_{22}X^{\top}&U\Phi_{22}U^{\top}\end{bmatrix}\begin{bmatrix}I\\ A^{\top}\\ B^{\top}\end{bmatrix}\succeq 0 (7)

where \Theta=\Phi_{11}+X^{+}\Phi_{12}^{\top}+\Phi_{12}(X^{+})^{\top}+X^{+}\Phi_{22}(X^{+})^{\top}, and \star indicates the transpose of the related element below the diagonal. We let \Sigma be the collection of (A,B) satisfying (7):

\Sigma=\{(A,B):\text{(7) is satisfied}\} (8)

According to (5) and Assumption 1, (A_{\text{o}},B_{\text{o}})\in\Sigma.
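To make the set \Sigma concrete, the following NumPy sketch assembles \Theta and the data-dependent matrix in (7) and tests whether a candidate pair (A,B) satisfies the quadratic matrix inequality. The system, data, and noise bound below are synthetic, illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
n, m, J, eps = 3, 1, 100, 1e-2                       # illustrative dimensions

A_true = rng.uniform(-0.5, 0.5, (n, n))              # stand-in for the unknown (A_o, B_o)
B_true = rng.uniform(-1.0, 1.0, (n, m))

X = rng.standard_normal((n, J))                      # data matrices (4b), (4c)
U = rng.standard_normal((m, J))
Wd = rng.standard_normal((n, J))
Wd = Wd / np.linalg.norm(Wd, axis=0) * (eps * rng.uniform(0.0, 1.0, J))
Xp = A_true @ X + B_true @ U + Wd                    # data relation (3)

Phi11, Phi12, Phi22 = J * eps**2 * np.eye(n), np.zeros((n, J)), -np.eye(J)
Theta = Phi11 + Xp @ Phi12.T + Phi12 @ Xp.T + Xp @ Phi22 @ Xp.T
N21 = -X @ Phi12.T - X @ Phi22 @ Xp.T
N31 = -U @ Phi12.T - U @ Phi22 @ Xp.T
N = np.block([[Theta, N21.T, N31.T],
              [N21, X @ Phi22 @ X.T, (U @ Phi22 @ X.T).T],
              [N31, U @ Phi22 @ X.T, U @ Phi22 @ U.T]])

def in_Sigma(A, B, tol=1e-9):
    """Check the quadratic matrix inequality (7) for a candidate (A, B)."""
    Z = np.vstack([np.eye(n), A.T, B.T])             # [I; A^T; B^T]
    return bool(np.all(np.linalg.eigvalsh(Z.T @ N @ Z) >= -tol))

print(in_Sigma(A_true, B_true))                      # True: the data-generating system is in Sigma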

Under Assumption 1, we propose a control approach to achieving the three objectives listed below (1) for the unknown system (1). Our approach is based on the following matrix S-lemma developed in [27]:

Lemma 1 [27]. Let M,N\in\mathbb{R}^{(2n+m)\times(2n+m)} be symmetric and partitioned as follows:

M=\begin{bmatrix}M_{11}&M_{12}\\ M_{12}^{\top}&M_{22}\end{bmatrix}\quad N=\begin{bmatrix}N_{11}&N_{12}\\ N_{12}^{\top}&N_{22}\end{bmatrix} (9)

where M_{11},N_{11}\in\mathbb{R}^{n\times n}. Suppose M_{22}\preceq 0, N_{22}\preceq 0, \text{ker}(N_{22})\subseteq\text{ker}(N_{12}), and there exists \bar{Z}\in\mathbb{R}^{(n+m)\times n} satisfying \begin{bmatrix}I\\ \bar{Z}\end{bmatrix}^{\top}N\begin{bmatrix}I\\ \bar{Z}\end{bmatrix}\succ 0. Then, we have that

\begin{bmatrix}I\\ Z\end{bmatrix}^{\top}\!M\begin{bmatrix}I\\ Z\end{bmatrix}\succ 0\text{ for all $Z$ satisfying }\begin{bmatrix}I\\ Z\end{bmatrix}^{\top}\!N\begin{bmatrix}I\\ Z\end{bmatrix}\succeq 0 (10)

if and only if there exist \alpha\geq 0 and \beta>0 such that

M-\alpha N\succeq\begin{bmatrix}\beta I&0\\ 0&0\end{bmatrix} (11)

Proof. See Theorem 13 of [27]. \blacksquare

3 Constrained \mathcal{H}_{\infty} Control

In this section, we consider a linear feedback u(t)=Kx(t) with a constant gain matrix K\in\mathbb{R}^{m\times n}. This controller yields the following closed-loop system:

x(t+1)=\left(A_{\text{o}}+B_{\text{o}}K\right)x(t)+w(t) (12a)
y_{1}(t)=\left(C_{1}+D_{1}K\right)x(t) (12b)
y_{2}(t)=\left(C_{2}+D_{2}K\right)x(t) (12c)

The closed-loop transfer matrix from disturbance input w to performance output y_{1} is given by

G_{1}(z)=\left(C_{1}+D_{1}K\right)\left(zI-(A_{\text{o}}+B_{\text{o}}K)\right)^{-1} (13)

For a performance level \gamma>0, the closed-loop matrix A_{\text{c}}=A_{\text{o}}+B_{\text{o}}K is Schur stable and the \mathcal{H}_{\infty} norm of G_{1}(z) satisfies \|G_{1}(z)\|_{\mathcal{H}_{\infty}}<\gamma if and only if there exists a matrix P=P^{\top}\succ 0 satisfying

P-(A_{\text{o}}+B_{\text{o}}K)^{\top}P(A_{\text{o}}+B_{\text{o}}K)-(C_{1}+D_{1}K)^{\top}(C_{1}+D_{1}K)-(A_{\text{o}}+B_{\text{o}}K)^{\top}P(\gamma^{2}I-P)^{-1}P(A_{\text{o}}+B_{\text{o}}K)\succ 0 (14a)
\gamma^{2}I-P\succ 0 (14b)

See Theorem 2.2 of [31]. Now let Q=P^{-1} (hence, Q=Q^{\top}\succ 0) and Y=KQ. With some algebra and Schur complement arguments, it can be shown that (14) is equivalent to

R=Q-(C_{1}Q+D_{1}Y)^{\top}(C_{1}Q+D_{1}Y)\succ 0 (15a)
Q-\gamma^{-2}I-(A_{\text{o}}Q+B_{\text{o}}Y)R^{-1}(A_{\text{o}}Q+B_{\text{o}}Y)^{\top}\succ 0 (15b)

Note that (15b) can be written as

\begin{bmatrix}I\\ A_{\text{o}}^{\top}\\ B_{\text{o}}^{\top}\end{bmatrix}^{\top}\begin{bmatrix}Q-\gamma^{-2}I&0&0\\ 0&-QR^{-1}Q&-QR^{-1}Y^{\top}\\ 0&-YR^{-1}Q&-YR^{-1}Y^{\top}\end{bmatrix}\begin{bmatrix}I\\ A_{\text{o}}^{\top}\\ B_{\text{o}}^{\top}\end{bmatrix}\succ 0 (16)
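For completeness, the step from (15b) to (16) uses the factorization A_{\text{o}}Q+B_{\text{o}}Y=\begin{bmatrix}A_{\text{o}}&B_{\text{o}}\end{bmatrix}\begin{bmatrix}Q\\ Y\end{bmatrix}, which gives

(A_{\text{o}}Q+B_{\text{o}}Y)R^{-1}(A_{\text{o}}Q+B_{\text{o}}Y)^{\top}=\begin{bmatrix}I\\ A_{\text{o}}^{\top}\\ B_{\text{o}}^{\top}\end{bmatrix}^{\top}\begin{bmatrix}0&0&0\\ 0&QR^{-1}Q&QR^{-1}Y^{\top}\\ 0&YR^{-1}Q&YR^{-1}Y^{\top}\end{bmatrix}\begin{bmatrix}I\\ A_{\text{o}}^{\top}\\ B_{\text{o}}^{\top}\end{bmatrix}

so that subtracting this term from Q-\gamma^{-2}I yields exactly the quadratic form on the left-hand side of (16).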

Now consider the quadratic Lyapunov function V(x(t))=x(t)^{\top}Px(t). When (14) holds, we can derive the following dissipation inequality:

V(x(t+1))-V(x(t))=x(t+1)^{\top}Px(t+1)-x(t)^{\top}Px(t)
=x(t)^{\top}(A_{\text{c}}^{\top}PA_{\text{c}}-P)x(t)+w(t)^{\top}Pw(t)+x(t)^{\top}A_{\text{c}}^{\top}Pw(t)+w(t)^{\top}PA_{\text{c}}x(t)
\leq-x(t)^{\top}(C_{1}+D_{1}K)^{\top}(C_{1}+D_{1}K)x(t)+\gamma^{2}w(t)^{\top}w(t)-x(t)^{\top}A_{\text{c}}^{\top}P(\gamma^{2}I-P)^{-1}PA_{\text{c}}x(t)-w(t)^{\top}(\gamma^{2}I-P)w(t)+x(t)^{\top}A_{\text{c}}^{\top}Pw(t)+w(t)^{\top}PA_{\text{c}}x(t)
=-y_{1}(t)^{\top}y_{1}(t)+\gamma^{2}w(t)^{\top}w(t)-S
\leq-\|y_{1}(t)\|^{2}+\gamma^{2}\|w(t)\|^{2} (17)

where S=\|(\gamma^{2}I-P)^{-1/2}PA_{\text{c}}x(t)-(\gamma^{2}I-P)^{1/2}w(t)\|^{2} satisfies S\geq 0. We define, for r>0, the ellipsoidal set

\mathcal{E}(P,r)=\{x\in\mathbb{R}^{n}:V(x)\leq r\} (18)

and we arrive at the following result:

Lemma 2. Suppose (14) holds and the energy of the disturbance input is bounded as \sum_{t=0}^{\infty}\|w(t)\|^{2}\leq\sigma_{0} for some \sigma_{0}\geq 0. Then, for any r_{0}\geq x(0)^{\top}Px(0)+\gamma^{2}\sigma_{0}, \mathcal{E}(P,r_{0}) is an invariant set of (12), i.e., x(t)\in\mathcal{E}(P,r_{0}) for all t\in\mathbb{N}.

Proof. Suppose (14) holds, \sum_{t=0}^{\infty}\|w(t)\|^{2}\leq\sigma_{0}, and r_{0}\geq x(0)^{\top}Px(0)+\gamma^{2}\sigma_{0}. Then, the dissipation inequality (17) implies

V(x(t))\leq V(x(0))-\sum_{i=0}^{t-1}\|y_{1}(i)\|^{2}+\gamma^{2}\sum_{i=0}^{t-1}\|w(i)\|^{2}\leq x(0)^{\top}Px(0)+\gamma^{2}\sum_{i=0}^{\infty}\|w(i)\|^{2}\leq r_{0} (19)

for all t\in\mathbb{N}. This shows x(t)\in\mathcal{E}(P,r_{0}) for all t\in\mathbb{N}. \blacksquare

We now consider the optimization problem (20) for designing the feedback gain K=YQ^{-1}. In (20), e_{v} denotes the vth standard basis vector, \sigma_{0}>0 is a forecasted energy bound of the disturbance input, and r_{0}>0 is a design parameter to be tuned so that (20) is feasible. We note that (20) is an LMI (hence, convex) problem in the decision variables (\eta,Q,Y,\alpha,\beta). A design of K based on (20) has several desirable properties, stated in the following theorem:

 

\max_{\eta>0,\,Q=Q^{\top}\!,\,Y,\,\alpha\geq 0,\,\beta>0}\ \eta (20a)

s.t. \begin{bmatrix}Q-(\eta+\beta)I&\star&\star&\star&\star\\ 0&0&\star&\star&\star\\ 0&0&0&\star&\star\\ 0&Q&Y^{\top}&Q&\star\\ 0&0&0&C_{1}Q+D_{1}Y&I\end{bmatrix}\succeq\alpha\begin{bmatrix}\Theta&\star&\star&\star&\star\\ -X\Phi_{12}^{\top}-X\Phi_{22}(X^{+})^{\top}&X\Phi_{22}X^{\top}&\star&\star&\star\\ -U\Phi_{12}^{\top}-U\Phi_{22}(X^{+})^{\top}&U\Phi_{22}X^{\top}&U\Phi_{22}U^{\top}&\star&\star\\ 0&0&0&0&\star\\ 0&0&0&0&0\end{bmatrix} (20b)

\begin{bmatrix}Q&\star\\ C_{1}Q+D_{1}Y&I\end{bmatrix}\succ 0 (20c)

\begin{bmatrix}r_{0}&\star&\star\\ x(0)&Q&\star\\ 1&0&\sigma_{0}^{-1}\eta\end{bmatrix}\succeq 0 (20d)

\begin{bmatrix}y_{2v,\max}^{2}\,r_{0}^{-1}&\star\\ (C_{2}Q+D_{2}Y)^{\top}e_{v}&Q\end{bmatrix}\succeq 0,\quad v=1,2,\dots,p_{2} (20e)

where \Theta=\Phi_{11}+X^{+}\Phi_{12}^{\top}+\Phi_{12}(X^{+})^{\top}+X^{+}\Phi_{22}(X^{+})^{\top}, and \star indicates the transpose of the related element below the diagonal.
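Problem (20) can be passed directly to an off-the-shelf semidefinite programming tool. The following Python/CVXPY sketch shows one possible, non-authoritative encoding under stated assumptions: the data matrices X^{+}, X, U, the bound matrices of Assumption 1, and the remaining problem data are supplied as NumPy arrays; the strict inequalities (Q\succ 0 in (20c), \eta>0, \beta>0) are approximated by small margins; and the solver is whichever SDP-capable solver is installed with CVXPY. It is a sketch of the construction, not the implementation used for the results in Section 5.

import numpy as np
import cvxpy as cp

def solve_problem_20(Xp, X, U, Phi11, Phi12, Phi22,
                     C1, D1, C2, D2, y2max, x0, sigma0, r0, margin=1e-6):
    """Sketch of the LMI problem (20); returns (eta, Q, Y) or None if infeasible."""
    n, m, p1, p2 = X.shape[0], U.shape[0], C1.shape[0], C2.shape[0]
    x0 = np.asarray(x0, dtype=float)

    eta = cp.Variable(nonneg=True)
    Q = cp.Variable((n, n), symmetric=True)
    Y = cp.Variable((m, n))
    alpha = cp.Variable(nonneg=True)
    beta = cp.Variable(nonneg=True)
    C1QD1Y = C1 @ Q + D1 @ Y                              # C_1 Q + D_1 Y
    C2QD2Y = C2 @ Q + D2 @ Y                              # C_2 Q + D_2 Y

    # Left-hand side of (20b): 5x5 symmetric block matrix
    Znn, Znm, Znp = np.zeros((n, n)), np.zeros((n, m)), np.zeros((n, p1))
    lhs = cp.bmat([
        [Q - (eta + beta) * np.eye(n), Znn, Znm, Znn, Znp],
        [Znn, Znn, Znm, Q, Znp],
        [Znm.T, Znm.T, np.zeros((m, m)), Y, np.zeros((m, p1))],
        [Znn, Q, Y.T, Q, C1QD1Y.T],
        [Znp.T, Znp.T, np.zeros((p1, m)), C1QD1Y, np.eye(p1)],
    ])

    # Right-hand side of (20b): the data matrix of (7), zero-padded to the 5x5 blocks
    Theta = Phi11 + Xp @ Phi12.T + Phi12 @ Xp.T + Xp @ Phi22 @ Xp.T
    N21 = -X @ Phi12.T - X @ Phi22 @ Xp.T
    N31 = -U @ Phi12.T - U @ Phi22 @ Xp.T
    Ndata = np.block([[Theta, N21.T, N31.T],
                      [N21, X @ Phi22 @ X.T, (U @ Phi22 @ X.T).T],
                      [N31, U @ Phi22 @ X.T, U @ Phi22 @ U.T]])
    Npad = np.zeros((3 * n + m + p1, 3 * n + m + p1))
    Npad[:2 * n + m, :2 * n + m] = Ndata

    constraints = [
        lhs - alpha * Npad >> 0,                                           # (20b)
        cp.bmat([[Q, C1QD1Y.T], [C1QD1Y, np.eye(p1)]])
        >> margin * np.eye(n + p1),                                        # (20c), strict
        cp.bmat([[np.array([[r0]]), x0.reshape(1, n), np.eye(1)],
                 [x0.reshape(n, 1), Q, np.zeros((n, 1))],
                 [np.eye(1), np.zeros((1, n)), (eta / sigma0) * np.eye(1)]]) >> 0,  # (20d)
        eta >= margin, beta >= margin,
    ]
    for v in range(p2):                                                    # (20e), one per output row
        ev = np.zeros((p2, 1))
        ev[v, 0] = 1.0
        cv = C2QD2Y.T @ ev                                                 # (C_2 Q + D_2 Y)^T e_v
        constraints.append(cp.bmat([[np.array([[y2max[v] ** 2 / r0]]), cv.T],
                                    [cv, Q]]) >> 0)

    prob = cp.Problem(cp.Maximize(eta), constraints)
    prob.solve()                                   # any installed SDP-capable solver (e.g., SCS)
    if prob.status not in (cp.OPTIMAL, cp.OPTIMAL_INACCURATE):
        return None
    return float(eta.value), Q.value, Y.value

For the moving-horizon problem (47) in Section 4, the same construction applies with x(0), \sigma_{0}, r_{0} replaced by x(t), \sigma_{t}, r_{t} and with the additional LMI (47f) appended for t\geq 1.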

Theorem 1. Suppose

  (a)

    The offline data satisfy Assumption 1 and there exists (A,B)=(\bar{A},\bar{B}) for which the left-hand side of (7) is positive definite (i.e., (7) holds with strict inequality);

  (b)

    The online disturbance inputs satisfy the energy bound \sum_{t=0}^{\infty}\|w(t)\|^{2}\leq\sigma_{0};

  (c)

    The tuple (\eta_{0},Q_{0},Y_{0},\alpha_{0},\beta_{0}) is a solution to (20).

Then, if we control the unknown system (1) using the linear feedback u(t)=K_{0}x(t), with K_{0}=Y_{0}Q_{0}^{-1}, the closed-loop system has the following properties:

  (i)

    The closed-loop matrix A_{\text{c},0}=A_{\text{o}}+B_{\text{o}}K_{0} is Schur stable and the closed-loop \mathcal{H}_{\infty} norm from w to y_{1} is less than \gamma_{0}=\eta_{0}^{-1/2};

  (ii)

    The constraints in (2) are satisfied at all times.

Proof. First, let

M=\begin{bmatrix}M_{11}&M_{12}\\ M_{12}^{\top}&M_{22}\end{bmatrix}=\begin{bmatrix}Q-\eta I&0&0\\ 0&-QR^{-1}Q&-QR^{-1}Y^{\top}\\ 0&-YR^{-1}Q&-YR^{-1}Y^{\top}\end{bmatrix}

N=\begin{bmatrix}N_{11}&N_{12}\\ N_{12}^{\top}&N_{22}\end{bmatrix}=\begin{bmatrix}\Theta&\star&\star\\ -X\Phi_{12}^{\top}-X\Phi_{22}(X^{+})^{\top}&X\Phi_{22}X^{\top}&\star\\ -U\Phi_{12}^{\top}-U\Phi_{22}(X^{+})^{\top}&U\Phi_{22}X^{\top}&U\Phi_{22}U^{\top}\end{bmatrix}

where R=Q-(C_{1}Q+D_{1}Y)^{\top}(C_{1}Q+D_{1}Y). Using a Schur complement argument, it can be shown that (20b) is equivalent to

M-\alpha N\succeq\begin{bmatrix}\beta I&0\\ 0&0\end{bmatrix} (34)

Meanwhile, for the M and N defined and partitioned as above, the following conditions can be verified: 1) M_{22}\preceq 0, using the fact that R\succ 0 due to the constraint (20c), 2) N_{22}\preceq 0, using \Phi_{22}\prec 0 according to Assumption 1, and 3) \text{ker}(N_{22})\subseteq\text{ker}(N_{12}). Now with Z=[A,B]^{\top} and \bar{Z}=[\bar{A},\bar{B}]^{\top} (with (\bar{A},\bar{B}) given in assumption (a)), it can be seen that the assumptions of Lemma 1 are all satisfied. According to Lemma 1, (34) holds for some \alpha\geq 0 and \beta>0 if and only if

\begin{bmatrix}I\\ A^{\top}\\ B^{\top}\end{bmatrix}^{\top}M\begin{bmatrix}I\\ A^{\top}\\ B^{\top}\end{bmatrix}\succ 0\text{ for all $(A,B)\in\Sigma$} (35)

where \Sigma is defined in (8). Recall that (A_{\text{o}},B_{\text{o}})\in\Sigma. Therefore, we have shown that if (\eta,Q=Q^{\top},Y,\alpha,\beta) satisfies (20b), (20c), and \eta>0,\alpha\geq 0,\beta>0, then (16) (hence, (15b)), with \gamma=\eta^{-1/2}, necessarily holds. Note that (20c) also implies (15a) and Q\succ 0. Now, if we let P=Q^{-1} and K=YQ^{-1}, we arrive at (14) with \gamma=\eta^{-1/2}. This proves part (i).

Now suppose (\eta,Q=Q^{\top},Y,\alpha,\beta) satisfies (20b)-(20e) and \eta>0,\alpha\geq 0,\beta>0. Above we have shown that, with P=Q^{-1}, K=YQ^{-1}, and \gamma=\eta^{-1/2}, (14) holds. Meanwhile, using a Schur complement argument, (20d) implies r_{0}\geq x(0)^{\top}Q^{-1}x(0)+\eta^{-1}\sigma_{0}=x(0)^{\top}Px(0)+\gamma^{2}\sigma_{0}. In this case, according to Lemma 2, \sum_{t=0}^{\infty}\|w(t)\|^{2}\leq\sigma_{0} in assumption (b) implies that \mathcal{E}(P,r_{0}) is an invariant set of the closed-loop system, i.e., x(t)\in\mathcal{E}(P,r_{0}) for all t\in\mathbb{N}. Recall that \mathcal{E}(P,r) is the ellipsoidal set defined in (18). The support function of \mathcal{E}(P,r) is h_{\mathcal{E}(P,r)}(\zeta)=\sqrt{r\,\zeta^{\top}P^{-1}\zeta}. Hence, x(t)\in\mathcal{E}(P,r_{0}) implies

y_{2v}(t)=e_{v}^{\top}(C_{2}+D_{2}K)x(t)\leq h_{\mathcal{E}(P,r_{0})}((C_{2}+D_{2}K)^{\top}e_{v})
=\sqrt{r_{0}\,e_{v}^{\top}(C_{2}+D_{2}K)P^{-1}(C_{2}+D_{2}K)^{\top}e_{v}}
=\sqrt{r_{0}\,e_{v}^{\top}(C_{2}Q+D_{2}Y)Q^{-1}(C_{2}Q+D_{2}Y)^{\top}e_{v}} (36)

Meanwhile, using a Schur complement argument, (20e) is equivalent to

y_{2v,\max}^{2}\,r_{0}^{-1}-e_{v}^{\top}(C_{2}Q+D_{2}Y)Q^{-1}(C_{2}Q+D_{2}Y)^{\top}e_{v}\geq 0
\iff\sqrt{r_{0}\,e_{v}^{\top}(C_{2}Q+D_{2}Y)Q^{-1}(C_{2}Q+D_{2}Y)^{\top}e_{v}}\leq y_{2v,\max} (37)

Combining (36) and (37), we obtain y_{2v}(t)\leq y_{2v,\max}. This completes the proof of part (ii). \blacksquare

Theorem 1 states the stability, disturbance attenuation, and constraint enforcement properties of the feedback gain K synthesized for the unknown system (1) from its trajectory data according to (20). The existence of (\bar{A},\bar{B}) such that (7) holds with strict inequality in assumption (a) of Theorem 1 can be checked offline. We make the following two remarks about Theorem 1:

Remark 1. The \mathcal{H}_{\infty} norm from w to y_{1} represents a level of disturbance attenuation of the closed-loop system. In particular, using (17) recursively, we can obtain the following dissipation inequality, which holds for all t\in\mathbb{N}:

\sum_{i=0}^{t}\|y_{1}(i)\|^{2}\leq\gamma^{2}\sum_{i=0}^{t}\|w(i)\|^{2}+x(0)^{\top}Px(0) (38)

and this indicates that the \ell_{2}-gain from disturbance w to output y_{1} is bounded by \gamma. Because (20) maximizes \eta>0, which is equivalent to minimizing \gamma=\eta^{-1/2}, the obtained controller seeks to maximize disturbance attenuation while satisfying the time-domain constraints in (2).

Remark 2. Differently from many other robust control formulations that assume set-bounded disturbances (i.e., w(t)\in\mathbb{W} for some known bounded set \mathbb{W} and for all t\in\mathbb{N}), our formulation (20) enforces constraints for disturbance inputs that have bounded total energy, where the energy can be distributed arbitrarily over time (i.e., \sum_{t=0}^{\infty}\|w(t)\|^{2}\leq\sigma_{0}). This total energy model is particularly suitable for modeling transient disturbances such as wind gusts in aircraft flight control or wind turbine control, power outages in power systems, temporary actuator or sensor failures, etc. Such disturbances occur infrequently and typically last a short period of time (as compared to persistent disturbances), but they can have a significant magnitude. Meanwhile, predicting exactly when they will occur can be difficult or impossible. In such a case, on the one hand, our formulation based on a total energy disturbance model may lead to a less conservative solution (hence, better performance) than those assuming a set bound at all times; on the other hand, estimating/predicting an energy bound for such transient disturbance events using historical data and/or real-time information (such as weather conditions) is possible in many applications. In what follows, we assume that a mechanism able to forecast an energy bound for future disturbances is available. Treating set-bounded persistent disturbances is left for future research.

4 Moving-Horizon Control

We now consider a strategy for implementing the control developed in Section 3 in a moving-horizon manner: At each time t\in\mathbb{N}, one solves (20) with the current system state as the initial condition x(0) in (20d) for a feedback gain K_{t} and uses the control u(t)=K_{t}x(t) over one time step. This way, the feedback gain becomes adaptive to the system state and the control becomes nonlinear, possibly leading to improved performance while satisfying constraints. However, this simple implementation may fail to guarantee stability and disturbance attenuation, as discussed in [3, 4]. To recover a disturbance attenuation guarantee, following the strategy of [3, 4], we consider the following inequality:

x(t)^{\top}P_{t}x(t)-x(t)^{\top}P_{t-1}x(t)\leq\Delta_{t} (39)

where P_{t} is associated with the feedback gain K_{t} that is used at time t and with an \mathcal{H}_{\infty} performance level of \gamma_{t} (i.e., the triple (P_{t},K_{t},\gamma_{t}) satisfies (14)), P_{t-1} is associated with K_{t-1} and \gamma_{t-1}, and \Delta_{t} keeps track of a previous dissipation level and is defined recursively according to

\Delta_{t}=\Delta_{t-1}-\Big(x(t-1)^{\top}P_{t-1}x(t-1)-x(t-1)^{\top}P_{t-2}x(t-1)\Big) (40)

for t\geq 2 and \Delta_{1}=0. Note that the definition of \Delta_{t} in (40) only uses state and P-matrix information up to time t-1.

Lemma 3. Suppose (39) holds for all t\geq 1, where \Delta_{t} is defined recursively according to (40). Then, the following dissipation inequality will be satisfied for all t\in\mathbb{N}:

\sum_{i=0}^{t}\|y_{1}(i)\|^{2}\leq\bar{\gamma}_{t}^{2}\sum_{i=0}^{t}\|w(i)\|^{2}+x(0)^{\top}P_{0}x(0) (41)

where \bar{\gamma}_{t}=\max\{\gamma_{0},\gamma_{1},\dots,\gamma_{t}\}.

Proof. For each i, because (P_{i},K_{i},\gamma_{i}) satisfies (14), similar to (17), we can derive the following inequality:

x(i+1)^{\top}P_{i}x(i+1)-x(i)^{\top}P_{i}x(i)\leq-\|y_{1}(i)\|^{2}+\gamma_{i}^{2}\|w(i)\|^{2} (42)

Summing over i=0,1,\dots,t, we obtain

\sum_{i=0}^{t}\|y_{1}(i)\|^{2}\leq\sum_{i=0}^{t}\gamma_{i}^{2}\|w(i)\|^{2}+x(0)^{\top}P_{0}x(0)+\sum_{i=1}^{t}\Big(x(i)^{\top}P_{i}x(i)-x(i)^{\top}P_{i-1}x(i)\Big)-x(t+1)^{\top}P_{t}x(t+1) (43)

The definition (40) yields the following closed-form expression for \Delta_{t}:

\Delta_{t}=-\sum_{i=1}^{t-1}\Big(x(i)^{\top}P_{i}x(i)-x(i)^{\top}P_{i-1}x(i)\Big) (44)

Supposing (39) holds at time t and combining it with (44), we have

\sum_{i=1}^{t}\Big(x(i)^{\top}P_{i}x(i)-x(i)^{\top}P_{i-1}x(i)\Big)\leq 0 (45)

Combining (43), (45), and \bar{\gamma}_{t}=\max_{i=0,\dots,t}\gamma_{i}, we obtain

\sum_{i=0}^{t}\|y_{1}(i)\|^{2}\leq\sum_{i=0}^{t}\gamma_{i}^{2}\|w(i)\|^{2}+x(0)^{\top}P_{0}x(0)-x(t+1)^{\top}P_{t}x(t+1)\leq\bar{\gamma}_{t}^{2}\sum_{i=0}^{t}\|w(i)\|^{2}+x(0)^{\top}P_{0}x(0)\quad\blacksquare (46)

We now consider the following moving-horizon approach to constrained \mathcal{H}_{\infty} control for the unknown system (1): At each time t\in\mathbb{N}, we solve the optimization problem (47) for the feedback gain K_{t}=Y_{t}Q_{t}^{-1} and use the control u(t)=K_{t}x(t) over one time step (an implementation sketch is given after (47)). In particular, the constraint (47f) is excluded at the initial time t=0 and included for t\geq 1. In (47), x(t) denotes the measured current system state, P_{t-1}=Q_{t-1}^{-1} is from the previous time, \Delta_{t} is defined according to (40), \sigma_{t} represents a forecasted bound on the total energy of present and future disturbances, i.e., \sum_{i=t}^{\infty}\|w(i)\|^{2}\leq\sigma_{t}, and r_{t}>0 is a design parameter, a design method for which is informed by Lemma 4 and Theorem 2. For a moving-horizon control algorithm, recursive feasibility (i.e., the online optimization problem being feasible at a given time implying that the problem is feasible again at the next time) is a highly desirable property. Before we discuss the closed-loop properties of the proposed algorithm, the following lemma provides a recursive feasibility result:

 

\max_{\eta>0,\,Q=Q^{\top}\!,\,Y,\,\alpha\geq 0,\,\beta>0}\ \eta (47a)

s.t. \begin{bmatrix}Q-(\eta+\beta)I&\star&\star&\star&\star\\ 0&0&\star&\star&\star\\ 0&0&0&\star&\star\\ 0&Q&Y^{\top}&Q&\star\\ 0&0&0&C_{1}Q+D_{1}Y&I\end{bmatrix}\succeq\alpha\begin{bmatrix}\Theta&\star&\star&\star&\star\\ -X\Phi_{12}^{\top}-X\Phi_{22}(X^{+})^{\top}&X\Phi_{22}X^{\top}&\star&\star&\star\\ -U\Phi_{12}^{\top}-U\Phi_{22}(X^{+})^{\top}&U\Phi_{22}X^{\top}&U\Phi_{22}U^{\top}&\star&\star\\ 0&0&0&0&\star\\ 0&0&0&0&0\end{bmatrix} (47b)

\begin{bmatrix}Q&\star\\ C_{1}Q+D_{1}Y&I\end{bmatrix}\succ 0 (47c)

\begin{bmatrix}r_{t}&\star&\star\\ x(t)&Q&\star\\ 1&0&\sigma_{t}^{-1}\eta\end{bmatrix}\succeq 0 (47d)

\begin{bmatrix}y_{2v,\max}^{2}\,r_{t}^{-1}&\star\\ (C_{2}Q+D_{2}Y)^{\top}e_{v}&Q\end{bmatrix}\succeq 0,\quad v=1,2,\dots,p_{2} (47e)

\begin{bmatrix}x(t)^{\top}P_{t-1}x(t)+\Delta_{t}&\star\\ x(t)&Q\end{bmatrix}\succeq 0 (47f)

where \Theta=\Phi_{11}+X^{+}\Phi_{12}^{\top}+\Phi_{12}(X^{+})^{\top}+X^{+}\Phi_{22}(X^{+})^{\top}, \star indicates the transpose of the related element below the diagonal, and the constraint (47f) is excluded at t=0 and included for t\geq 1.
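The receding-horizon implementation described above can be organized as a simple loop. The sketch below is illustrative: solve_lmi_47 is a hypothetical wrapper that assembles and solves (47) (e.g., along the lines of the CVXPY sketch for (20), with x(t), \sigma_{t}, r_{t} in place of x(0), \sigma_{0}, r_{0} and with (47f) appended for t\geq 1), plant_step applies the control to the real plant, and forecast_sigma supplies the forecasted energy bounds; none of these functions are specified in the paper.

import numpy as np

def moving_horizon_hinf(x0, sigma0, r0, T, plant_step, forecast_sigma, solve_lmi_47):
    """Moving-horizon loop around problem (47) with r_t = r_0 (cf. Lemma 4)."""
    x = np.asarray(x0, dtype=float)
    sigma = float(sigma0)
    P_prev, Delta = None, 0.0              # (47f) is dropped at t = 0; Delta_1 = 0
    for t in range(T):
        eta_t, Q_t, Y_t = solve_lmi_47(x, sigma, r0, P_prev, Delta)
        P_t = np.linalg.inv(Q_t)
        K_t = Y_t @ P_t                    # K_t = Y_t Q_t^{-1}
        u = K_t @ x                        # apply u(t) = K_t x(t) for one step
        x_next = plant_step(x, u)          # response of the (unknown) plant
        if P_prev is not None:             # bookkeeping (40): update Delta_{t+1}
            Delta -= float(x @ (P_t - P_prev) @ x)
        P_prev, x = P_t, x_next
        sigma = forecast_sigma(t + 1)      # forecasted bound on sum_{i >= t+1} ||w(i)||^2
    return x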

Lemma 4. Suppose (47) is feasible at time t and (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) denotes a solution. At time t+1, if the forecasted disturbance energy bound \sigma_{t+1} satisfies \sigma_{t+1}\leq\sigma_{t}-\|w(t)\|^{2} and the parameter r_{t+1} is chosen to be r_{t+1}=r_{t}, then (47) is feasible again. In particular, in this case, (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) remains a solution to (47) at time t+1.

Proof. First, we note that the constraints (47b), (47c), and (47e) with r_{t+1}=r_{t} do not change from t to t+1. The solution (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) satisfying (47b) and (47c) implies (14) with P=Q_{t}^{-1}, K=Y_{t}Q_{t}^{-1}, and \gamma=\eta_{t}^{-1/2}. In this case, similar to (17), the following inequality holds:

x(t+1)^{\top}Q_{t}^{-1}x(t+1)-x(t)^{\top}Q_{t}^{-1}x(t)\leq-\|y_{1}(t)\|^{2}+\eta_{t}^{-1}\|w(t)\|^{2} (48)

Meanwhile, (\eta_{t},Q_{t}) satisfying the constraint (47d) at time t is equivalent to

r_{t}\geq x(t)^{\top}Q_{t}^{-1}x(t)+\eta_{t}^{-1}\sigma_{t} (49)

If \sigma_{t+1}\leq\sigma_{t}-\|w(t)\|^{2} and r_{t+1}=r_{t}, then combining (48) and (49) we obtain

x(t+1)^{\top}Q_{t}^{-1}x(t+1)+\eta_{t}^{-1}\sigma_{t+1}\leq x(t)^{\top}Q_{t}^{-1}x(t)+\eta_{t}^{-1}\|w(t)\|^{2}+\eta_{t}^{-1}(\sigma_{t}-\|w(t)\|^{2})=x(t)^{\top}Q_{t}^{-1}x(t)+\eta_{t}^{-1}\sigma_{t}\leq r_{t}=r_{t+1} (50)

which implies that (\eta_{t},Q_{t}) also satisfies the constraint (47d) at time t+1. Last, for t=0, we have \Delta_{t+1}=\Delta_{1}=0; for t\geq 1, Q_{t} satisfying the constraint (47f) implies (39) with P_{t}=Q_{t}^{-1}. In the latter case, similar to (44)–(45), we have

\Delta_{t+1}=-\sum_{i=1}^{t}\Big(x(i)^{\top}P_{i}x(i)-x(i)^{\top}P_{i-1}x(i)\Big)\geq 0 (51)

At time t+1, the constraint (47f) is equivalent to

x(t+1)^{\top}Q^{-1}x(t+1)-x(t+1)^{\top}P_{t}x(t+1)\leq\Delta_{t+1} (52)

where P_{t}=Q_{t}^{-1}. Because \Delta_{t+1}\geq 0, it is clear that Q=Q_{t} satisfies (52) (hence, (47f)).

Therefore, we have shown that if \sigma_{t+1}\leq\sigma_{t}-\|w(t)\|^{2} and r_{t+1}=r_{t}, a solution at time t, (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}), still satisfies all constraints of (47) at time t+1, i.e., (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) remains a solution to (47) at t+1. This proves the result. \blacksquare

Remark 3. Lemma 4 provides a sufficient condition for the online optimization problem (47) to be recursively feasible. The assumption \sigma_{t+1}\leq\sigma_{t}-\|w(t)\|^{2} is reasonable because \sigma_{t} bounds the total energy of the disturbance inputs from time t onward and \sigma_{t+1} bounds that from time t+1 onward – they differ by the energy of the disturbance input at time t. Then, r_{t+1}=r_{t} yields a simple strategy for setting the parameter r_{t}: at each time t, r_{t} is set to its previous value. At time instants at which (47) is infeasible with r_{t}=r_{t-1} (due to errors in the forecasted disturbance energy bounds and violations of \sigma_{t+1}\leq\sigma_{t}-\|w(t)\|^{2}), an alternative strategy that promotes feasibility is to include r_{t} or its inverse \rho_{t}=r_{t}^{-1} as a decision variable optimized together with the other variables (\eta,Q,Y,\alpha,\beta). In this case, (47) becomes a nonconvex problem with a single nonconvex variable r_{t} or \rho_{t}, which can be solved using a branch-and-bound type algorithm with branching on r_{t} or \rho_{t} [32].

We now analyze the closed-loop properties of the system under the proposed moving-horizon control approach. The properties are given in Theorem 2.

Theorem 2. Suppose

  (a)

    The offline data satisfy Assumption 1 and there exists (A,B)=(\bar{A},\bar{B}) for which (7) holds with strict inequality;

  (b)

    The online disturbance inputs satisfy \sum_{i=t}^{\infty}\|w(i)\|^{2}\leq\sigma_{t} for all t\in\mathbb{N}, where the \sigma_{t} are the forecasted energy bounds used in (47);

  (c)

    The online optimization problem (47) is feasible at all t\in\mathbb{N} and (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) denotes a solution to (47) at each t;

  (d)

    The solutions (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) have a common lower bound \underline{\eta}>0 on their objective values, i.e., \eta_{t}\geq\underline{\eta} for all t\in\mathbb{N}.

Then, if we control the unknown system (1) using u(t)=K_{t}x(t), with K_{t}=Y_{t}Q_{t}^{-1}, at all t\in\mathbb{N}, the closed-loop system has the following properties:

  (i)

    The system state x(t) converges to 0 as t\to\infty;

  (ii)

    The following dissipation inequality is satisfied for all t\in\mathbb{N}:

    \sum_{i=0}^{t}\|y_{1}(i)\|^{2}\leq\bar{\gamma}^{2}\sum_{i=0}^{t}\|w(i)\|^{2}+x(0)^{\top}P_{0}x(0) (53)

    where \bar{\gamma}=\underline{\eta}^{-1/2}, indicating that the \ell_{2}-gain from disturbance w to output y_{1} is bounded by \bar{\gamma};

  (iii)

    The constraints in (2) are satisfied at all times.

Furthermore, suppose (a) and (b) hold and

  (e)

    Problem (47) is feasible at the initial time t=0, the energy bounds \sigma_{t} satisfy \sigma_{t+1}\leq\sigma_{t}-\|w(t)\|^{2} for all t\in\mathbb{N}, and the parameters r_{t} are chosen as r_{t}=r_{0} for all t\in\mathbb{N};

  (f)

    The tuple (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) is not only a feasible but also an optimal solution to (47) at each t\in\mathbb{N}.

Then, the following results hold:

  (iv)

    The conditions (c) and (d) are necessarily satisfied;

  (v)

    The solutions have non-decreasing objective values, i.e., \eta_{t+1}\geq\eta_{t} for all t\in\mathbb{N};

  (vi)

    The dissipation inequality (53) holds with the gain \bar{\gamma}=\eta_{0}^{-1/2}.

Proof. We start with proving the inequality (53) in (ii). At each time t\in\mathbb{N}, the solution (\eta_{t},Q_{t},Y_{t},\alpha_{t},\beta_{t}) satisfies the constraints (47b) and (47c). Following similar steps as in the proof of Theorem 1, part (i), we can show that (14), with P=Q_{t}^{-1}, K=Y_{t}Q_{t}^{-1}, and \gamma=\eta_{t}^{-1/2}, holds. The constraint (47f) is equivalent to

x(t)^{\top}Q^{-1}x(t)-x(t)^{\top}P_{t-1}x(t)\leq\Delta_{t} (54)

Because (47f) (hence, (54)) is satisfied by Q_{t} at each t\geq 1, we have (39) for all t\geq 1. In this case, according to Lemma 3, (41) holds for all t\in\mathbb{N}. Meanwhile, \eta_{t}\geq\underline{\eta} for all t\in\mathbb{N} (assumption (d)) implies that \bar{\gamma}_{t}\leq\bar{\gamma} for all t, where \bar{\gamma}_{t}=\max_{i=0,\dots,t}\gamma_{i}, \gamma_{i}=\eta_{i}^{-1/2}, and \bar{\gamma}=\underline{\eta}^{-1/2}. The combination of (41) and \bar{\gamma}_{t}\leq\bar{\gamma} leads to (53). This proves part (ii). Now, because (53) holds for all t\in\mathbb{N}, as t\to\infty, we obtain

\sum_{i=0}^{\infty}\|y_{1}(i)\|^{2}\leq\bar{\gamma}^{2}\sum_{i=0}^{\infty}\|w(i)\|^{2}+x(0)^{\top}P_{0}x(0) (55)

Note that under assumption (b), the right-hand side of (55) is bounded by \bar{\gamma}^{2}\sigma_{0}+x(0)^{\top}P_{0}x(0), and hence the series on the left-hand side converges according to the monotone convergence theorem. Then, (55) implies y_{1}(t)\to 0 as t\to\infty. Because (C_{1},A_{\text{o}}) is detectable, y_{1}(t)\to 0 implies x(t)\to 0. This proves part (i). For part (iii), because (\eta_{t},Q_{t},Y_{t}) satisfies (47d) and (47e), following similar steps as in the proof of Theorem 1, part (ii), it can be shown that x(t)\in\mathcal{E}(Q_{t}^{-1},r_{t}) (due to (47d)), which implies y_{2v}(t)=e_{v}^{\top}(C_{2}+D_{2}K_{t})x(t)\leq y_{2v,\max} for all v=1,2,\dots,p_{2} (due to (47e)).

Now suppose (a), (b), (e), and (f) hold. According to Lemma 4, when (47) is feasible at t=0, \sigma_{1}\leq\sigma_{0}-\|w(0)\|^{2}, and r_{1}=r_{0}, an optimal solution (\eta_{0},Q_{0},Y_{0},\alpha_{0},\beta_{0}) to (47) at t=0 remains a feasible solution to (47) at t=1. In this case, (47) is feasible at t=1 and an optimal solution (\eta_{1},Q_{1},Y_{1},\alpha_{1},\beta_{1}) has an objective value at least as large as that of the feasible solution (\eta_{0},Q_{0},Y_{0},\alpha_{0},\beta_{0}), i.e., \eta_{1}\geq\eta_{0}. Using the same argument recursively, we can conclude that (47) is feasible at all t\in\mathbb{N} and \eta_{t+1}\geq\eta_{t} for all t. The latter also implies that \eta_{0}>0 is a common lower bound for all \eta_{t}, i.e., (d) is satisfied with \underline{\eta}=\eta_{0}. Accordingly, (53) holds with \bar{\gamma}=\underline{\eta}^{-1/2}=\eta_{0}^{-1/2}. This completes the proofs of parts (iv), (v), and (vi). \blacksquare

We make the following remark about Theorem 2:

Remark 4. Theorem 2 shows that our objectives of closed-loop stability, disturbance attenuation, and constraint enforcement stated in Section 2 are fulfilled by the proposed data-driven moving-horizon \mathcal{H}_{\infty} control based on (47). In particular, part (i) shows that the system state converges to zero asymptotically for any disturbance input signal that has bounded total energy. Parts (v) and (vi) show that, under assumption (e) (which is reasonable, as discussed in Remark 3), the moving-horizon approach based on (47) achieves a disturbance attenuation performance at least as good as that achieved by a constant linear feedback designed based on (20). In particular, the moving-horizon approach based on (47) dynamically adapts the feedback gain K_{t} to the measured/estimated system state x(t) and the forecasted disturbance energy bound \sigma_{t} and hence has the potential to achieve improved disturbance attenuation performance while satisfying constraints. We will demonstrate this improvement with a numerical example in the following section.

5 Numerical Example

In this section, we use a numerical example to illustrate our proposed control approach. Consider a system in the form of (1) with the following parameters:

A_{\text{o}}=\begin{bmatrix}0.8147&0.9134&0.2785\\ 0.9058&0.6324&0.5469\\ 0.1270&0.0975&0.9575\end{bmatrix}\quad B_{\text{o}}=\begin{bmatrix}-0.6787\\ -0.7577\\ -0.7431\end{bmatrix} (56a)
C_{1}=\begin{bmatrix}1&0&0\end{bmatrix}\quad D_{1}=0 (56b)
C_{2}=\begin{bmatrix}0&1&0\\ 0&0&0\end{bmatrix}\quad D_{2}=\begin{bmatrix}0\\ 1\end{bmatrix}\quad y_{\max}=\begin{bmatrix}1\\ 0.5\end{bmatrix} (56c)

The parameters in (56b) mean that we want to minimize the effect of the disturbance input w(t) on the first state x_{1}(t), and the parameters in (56c) mean that the second state x_{2}(t) and the control input u(t) should satisfy the constraints x_{2}(t)\leq 1 and u(t)\leq 0.5. Assume (A_{\text{o}},B_{\text{o}}) is unknown. We simulate the system and collect J=100 data points (x^{+}_{j},x_{j},u_{j}) with random disturbance inputs w_{j} satisfying \|w_{j}\|\leq\varepsilon=10^{-2}. Then, we construct the matrices in Assumption 1 as \Phi_{11}=(J\varepsilon^{2})I, \Phi_{12}=0, and \Phi_{22}=-I.
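For reference, the data-collection step of this example can be reproduced along the following lines. The excitation inputs and the states at which data are collected are not specified in the paper, so the sampling below is an illustrative assumption; only the system matrices (56), J=100, and \varepsilon=10^{-2} are taken from the text.

import numpy as np

A_o = np.array([[0.8147, 0.9134, 0.2785],
                [0.9058, 0.6324, 0.5469],
                [0.1270, 0.0975, 0.9575]])
B_o = np.array([[-0.6787], [-0.7577], [-0.7431]])

n, m, J, eps = 3, 1, 100, 1e-2
rng = np.random.default_rng(0)

X = np.zeros((n, J))
U = np.zeros((m, J))
Xp = np.zeros((n, J))
for j in range(J):
    xj = rng.uniform(-1.0, 1.0, n)                  # assumed sampling of states
    uj = rng.uniform(-1.0, 1.0, m)                  # assumed excitation inputs
    wj = rng.standard_normal(n)
    wj *= eps * rng.uniform() / np.linalg.norm(wj)  # ||w_j|| <= eps = 1e-2
    X[:, j], U[:, j], Xp[:, j] = xj, uj, A_o @ xj + B_o @ uj + wj

# Matrices of Assumption 1 implied by the per-sample bound, as used in the example
Phi11, Phi12, Phi22 = J * eps**2 * np.eye(n), np.zeros((n, J)), -np.eye(J)

These arrays, together with C_1, D_1, C_2, D_2, and the constraint bounds from (56), can be passed to the problem-(20) sketch given in Section 3.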

We implement three data-driven approaches using the same set of collected data \{(x^{+}_{j},x_{j},u_{j})\}_{j=1}^{J} to control the system: 1) the data-driven \mathcal{H}_{\infty} control approach of [27] (which does not handle constraints), 2) the constrained \mathcal{H}_{\infty} control based on (20) (without moving-horizon implementation), and 3) the moving-horizon \mathcal{H}_{\infty} control based on (47).

We consider the initial condition x(0)=[0.95,0,0]^{\top}. For (20) and (47), to illustrate the results of Theorems 1 and 2, we use the parameters \sigma_{0}=10^{-2}, \sigma_{t+1}=\sigma_{t}-\|w(t)\|^{2}, and r_{t}=10 for all t\in\mathbb{N}.

The system is stabilized and x(t) converges to 0 under each of the three approaches. However, as shown in Fig. 1, the control using the approach of [27] violates the constraint u(t)\leq 0.5. This is expected because the approach of [27] does not consider any constraints. In contrast, (20) and (47) both satisfy the constraints, illustrating the effectiveness of our proposed approach for handling constraints. We then compare in Fig. 2 the time history of \gamma under the constrained \mathcal{H}_{\infty} control based on (20) (without moving-horizon implementation) versus under the moving-horizon \mathcal{H}_{\infty} control based on (47). Note that a smaller \gamma indicates a stronger attenuation of the effect of the disturbance input w(t) on the performance output y_{1}(t). It can be seen that both approaches start with a large \gamma. This is because the control has to sacrifice some disturbance attenuation performance to satisfy the constraints on x_{2}(t) and u(t). Because the approach of (20) uses a fixed gain, the disturbance attenuation level \gamma remains constant over time. In contrast, as the state x(t) and the control u(t) move farther away from the constraint boundaries, the moving-horizon approach based on (47) adjusts the gain and achieves a lower \gamma, illustrating the effectiveness of our proposed moving-horizon approach for improving performance while satisfying constraints.

Figure 1: Control input time history.

Figure 2: Disturbance attenuation level time history.

In this example, the optimization problem (47) is feasible at every time step, which is consistent with the recursive feasibility result of Lemma 4 and Theorem 2. The average computation time per step for solving (47) using the MATLAB-based LMI solver mincx on a MacBook Air (M1 CPU, 8 GB RAM) is 12.5 ms, indicating the computational feasibility of the approach.

6 Conclusions

In this paper, we proposed a novel data-driven moving-horizon control approach for constrained systems. The approach optimizes \mathcal{H}_{\infty}-type disturbance rejection while satisfying constraints. We established theoretical guarantees of the approach regarding closed-loop stability, disturbance attenuation, constraint satisfaction under noisy offline data, and online problem recursive feasibility. The effectiveness of the approach has been illustrated with a numerical example. Future work includes applying the proposed approach to practical control engineering problems.

References

  • [1] K. Zhou, J. C. Doyle, and K. Glover, Robust and optimal control. USA: Prentice Hall, 1996.
  • [2] J. M. Maciejowski, Predictive control: with constraints. USA: Prentice Hall, 2002.
  • [3] H. Chen and C. Scherer, “Disturbance attenuation with actuator constraints by moving horizon H_{\infty} control,” IFAC Proceedings, vol. 37, no. 1, pp. 415–420, 2004.
  • [4] H. Chen and C. W. Scherer, “Moving horizon H_{\infty} control with performance adaptation for constrained linear systems,” Automatica, vol. 42, no. 6, pp. 1033–1040, 2006.
  • [5] S.-M. Lee and J. H. Park, “Robust H_{\infty} model predictive control for uncertain systems using relaxation matrices,” International Journal of Control, vol. 81, no. 4, pp. 641–650, 2008.
  • [6] P. E. Orukpe, “Towards a less conservative model predictive control based on mixed H_{2}/H_{\infty} control approach,” International Journal of Control, vol. 84, no. 5, pp. 998–1007, 2011.
  • [7] H. Huang, D. Li, and Y. Xi, “Mixed H_{2}/H_{\infty} robust model predictive control with saturated inputs,” International Journal of Systems Science, vol. 45, no. 12, pp. 2565–2575, 2014.
  • [8] M. Benallouch, G. Schutz, D. Fiorelli, and M. Boutayeb, “H_{\infty} model predictive control for discrete-time switched linear systems with application to drinking water supply network,” Journal of Process Control, vol. 24, no. 6, pp. 924–938, 2014.
  • [9] Y. Song, X. Fang, and Q. Diao, “Mixed H_{2}/H_{\infty} distributed robust model predictive control for polytopic uncertain systems subject to actuator saturation and missing measurements,” International Journal of Systems Science, vol. 47, no. 4, pp. 777–790, 2016.
  • [10] Y. Song, Z. Wang, D. Ding, and G. Wei, “Robust H_{2}/H_{\infty} model predictive control for linear systems with polytopic uncertainties under weighted MEF-TOD protocol,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 49, no. 7, pp. 1470–1481, 2017.
  • [11] Y. Zhang, C.-C. Lim, and F. Liu, “Robust mixed H_{2}/H_{\infty} model predictive control for Markov jump systems with partially uncertain transition probabilities,” Journal of the Franklin Institute, vol. 355, no. 8, pp. 3423–3437, 2018.
  • [12] A. Shokrollahi and S. Shamaghdari, “Robust H_{\infty} model predictive control for constrained Lipschitz non-linear systems,” Journal of Process Control, vol. 104, pp. 101–111, 2021.
  • [13] F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, “Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers,” IEEE Control Systems Magazine, vol. 32, no. 6, pp. 76–105, 2012.
  • [14] B. Recht, “A tour of reinforcement learning: The view from continuous control,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 2, pp. 253–279, 2019.
  • [15] U. Rosolia and F. Borrelli, “Learning model predictive control for iterative tasks. A data-driven control framework,” IEEE Transactions on Automatic Control, vol. 63, no. 7, pp. 1883–1896, 2017.
  • [16] L. Hewing, K. P. Wabersich, M. Menner, and M. N. Zeilinger, “Learning-based model predictive control: Toward safe learning in control,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 3, pp. 269–296, 2020.
  • [17] J. Coulson, J. Lygeros, and F. Dörfler, “Data-enabled predictive control: In the shallows of the DeePC,” in 18th European Control Conference, pp. 307–312, IEEE, 2019.
  • [18] E. Elokda, J. Coulson, P. N. Beuchat, J. Lygeros, and F. Dörfler, “Data-enabled predictive control for quadcopters,” International Journal of Robust and Nonlinear Control, vol. 31, no. 18, pp. 8916–8936, 2021.
  • [19] L. Huang, J. Coulson, J. Lygeros, and F. Dörfler, “Decentralized data-enabled predictive control for power system oscillation damping,” IEEE Transactions on Control Systems Technology, vol. 30, no. 3, pp. 1065–1077, 2021.
  • [20] V. Chinde, Y. Lin, and M. J. Ellis, “Data-enabled predictive control for building HVAC systems,” Journal of Dynamic Systems, Measurement, and Control, vol. 144, no. 8, p. 081001, 2022.
  • [21] N. Li, E. Taheri, I. Kolmanovsky, and D. Filev, “Minimum-time trajectory optimization with data-based models: A linear programming approach,” arXiv preprint arXiv:2312.05724, 2023.
  • [22] S. Baros, C.-Y. Chang, G. E. Colon-Reyes, and A. Bernstein, “Online data-enabled predictive control,” Automatica, vol. 138, p. 109926, 2022.
  • [23] L. Dai, T. Huang, R. Gao, Y. Zhang, and Y. Xia, “Cloud-based computational data-enabled predictive control,” IEEE Internet of Things Journal, vol. 9, no. 24, pp. 24949–24962, 2022.
  • [24] J. Berberich, J. Köhler, M. A. Müller, and F. Allgöwer, “Data-driven model predictive control with stability and robustness guarantees,” IEEE Transactions on Automatic Control, vol. 66, no. 4, pp. 1702–1717, 2020.
  • [25] J. Coulson, J. Lygeros, and F. Dörfler, “Distributionally robust chance constrained data-enabled predictive control,” IEEE Transactions on Automatic Control, vol. 67, no. 7, pp. 3289–3304, 2021.
  • [26] L. Huang, J. Zhen, J. Lygeros, and F. Dörfler, “Robust data-enabled predictive control: Tractable formulations and performance guarantees,” IEEE Transactions on Automatic Control, vol. 68, no. 5, pp. 3163–3170, 2023.
  • [27] H. J. van Waarde, M. K. Camlibel, and M. Mesbahi, “From noisy data to feedback controllers: Nonconservative design via a matrix S-lemma,” IEEE Transactions on Automatic Control, vol. 67, no. 1, pp. 162–175, 2020.
  • [28] K. J. Åström, U. Borisson, L. Ljung, and B. Wittenmark, “Theory and applications of self-tuning regulators,” Automatica, vol. 13, no. 5, pp. 457–476, 1977.
  • [29] D. A. Bristow, M. Tharayil, and A. G. Alleyne, “A survey of iterative learning control,” IEEE Control Systems Magazine, vol. 26, no. 3, pp. 96–114, 2006.
  • [30] Z.-S. Hou and Z. Wang, “From model-based control to data-driven control: Survey, classification and perspective,” Information Sciences, vol. 235, pp. 3–35, 2013.
  • [31] C. E. de Souza and L. Xie, “On the discrete-time bounded real lemma with application in the characterization of static state feedback H_{\infty} controllers,” Systems & Control Letters, vol. 18, no. 1, pp. 61–71, 1992.
  • [32] H. Tuy, “On nonconvex optimization problems with separated nonconvex variables,” Journal of Global Optimization, vol. 2, pp. 133–144, 1992.