This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Harnessing Data for Accelerating Model Predictive Control by Constraint Removal

Zhinan Hou, Feiran Zhao, Keyou You *The research was supported by National Science and Technology Major Project of China (2022ZD0116700) and National Natural Science Foundation of China (62033006, 62325305). Z. Hou, F. Zhao, and K. You are with the Department of Automation and BNRist, Tsinghua University, Beijing 100084, China. e-mail: [email protected], [email protected], [email protected].
Abstract

Model predictive control (MPC) solves a receding-horizon optimization problem in real-time, which can be computationally demanding when there are thousands of constraints. To accelerate online computation of MPC, we utilize data to adaptively remove the constraints while maintaining the MPC policy unchanged. Specifically, we design the removal rule based on the Lipschitz continuity of the MPC policy. This removal rule can use the information of historical data according to the Lipschitz constant and the distance between the current state and historical states. In particular, we provide the explicit expression for calculating the Lipschitz constant by the model parameters. Finally, simulations are performed to validate the effectiveness of the proposed method.

I Introduction

Model predictive control (MPC) is a popular optimal control technique and has many successful applications, including autonomous driving [1], industrial process [2], and robotic manipulation [3]. By solving online a receding-horizon optimization problem, the MPC computes the control input in real-time that ensures the satisfaction of system constraints. However, solving an optimization problem per timestep can be computationally demanding even for linear dynamical systems, especially when there are thousands of constraints [4]. Thus, reducing the computation burden has become an important problem for both theoretical and practical researchers.

Manifold approaches to accelerating the online computation include model reduction [5], explicit MPC [6], tailored solvers [7], and constraint removal [8]. Recently, there has been an increasing interest in the constraint removal approach as it can utilize online data to further reduce the computation time [9]. The constraint removal method first pre-determines unnecessary (inactive) constraints of the receding-horizon optimization problem by exploiting local properties of the optimization problem, such as reachable sets [9], region of activity [10], and contraction properties of the cost function [11]. Then, it removes these constraints to reduce the scale of the optimization problem, while maintaining the solution unchanged. Very recently, Nouwens et al. [12] proposes a general system-theoretic principle to design the constraint removal rule using the online closed-loop state-input data, leading to the constraint-adaptive MPC (ca-MPC) framework. While the ca-MPC enables the use of system-theoretic properties (e.g., reachability and optimality), it is conservative as only the data at current time is utilized. In fact, the historical data should also be taken into account in some applications. For example, the racing car routinely runs on the same track, and using data from past iterations may further improve the performance of the constraint-removal method.

In this paper, we propose a new constraint removal method for the ca-MPC of linear systems, which uses the historical state-input data to further accelerate the online computation. Instead of using reachability and optimality of the receding-horizon problem [12], we exploit the Lipschitz continuity of the MPC policy in the state to design the removal rule. However, an explicit Lipschitz constant is challenging to obtain and is absent in the literature. Fortunately, by estimating the KKT solution of the receding-horizon problem, we are able to provide an explicit expression for the Lipschitz constant. In particular, the Lipschitz constant can be computed offline using only the model parameters. By leveraging the Lipschitz continuity, the proposed constraint removal rule can utilize the historical state data that is “close” to the current system state. Finally, simulations are performed to validate the effectiveness of the proposed method.

Our result on the Lipschitz constants of the MPC policy has independent interests of its own. The Lipschitz constant is beneficial in various realization of MPC including robustness against disturbance [13] and neural network-based approximation [14]. However, an explicit Lipschitz constant is challenging to obtain. While recent works estimate the Lipschitz constant via enumerating potential active sets [15] or a mixed-integer linear program [14][16], they are time-consuming and lack an explicit expression. In contrast, we provide the first explicit Lipschitz constant that can be efficiently computed using model parameters.

The rest of this paper is organized as follows. The notation is given in the remainder of this section. Section II introduces the linear MPC and formulates the problem. We provide an explicit expression of the Lipschitz constant and present the ca-MPC scheme in Section III. We illustrate the effectiveness of the proposed scheme with simulation in Section IV and give a conclusion in Section V.

Notation. We denote the set of real number by \mathbb{R}, the set of nn-dimensional real-valued vectors by n\mathbb{R}^{n} and the set of n×mn\times m-dimensional real-valued matrices by n×m\mathbb{R}^{n\times m}. Given a matrix Am×nA\in\mathbb{R}^{m\times n}, ATA^{T} denotes its transpose, AjA_{j} denotes its jj-th row and so is the vector. λi(A)\lambda_{i}(A) is the i-th largest eigenvalue of AA. Specially, we define λmin(A)\lambda_{min}(A) as the smallest eigenvalue of AA. For two vectors a,ba,b, we denote element-wise multiplication by aba\circ b . For any set of indices 𝕀{1,,m}\mathbb{I}\subseteq\{1,...,m\}, A𝕀A_{\mathbb{I}} denotes the submatrix obtained by selecting the rows indicated in 𝕀\mathbb{I}. card(S)\text{card}(S) is denoted as the number of elements in the set SS. [a,b]={n|anb}\mathbb{N}_{[a,b]}=\{n\in\mathbb{N}\ |\ a\leq n\leq b\}. For two sets A,BA,B, we define AB={x|xA,xB}A-B=\{x\ |\ x\in A,x\notin B\}.

II Problem formulation and background

In this section, we introduce the linear system to be controlled and the linear MPC setup (see Section II-A). Furthermore, we introduce the constraint-adaptive MPC and formulate our problem in Section II-B.

II-A Linear MPC

We consider a discrete-time linear time-invariant (LTI) system:

xk+1=Axk+Buk,x_{k+1}=Ax_{k}+Bu_{k}, (1)

where xknx_{k}\in\mathbb{R}^{n} and ukmu_{k}\in\mathbb{R}^{m} denote the states and the inputs, respectively, at discrete time k𝒩k\in\mathcal{N}. An×nA\in\mathbb{R}^{n\times n} and Bm×mB\in\mathbb{R}^{m\times m} are known system matrices. The system is subject to polyhedral state and input constraints:

(xk,uk)𝕃={(x,u)|Cx+DuE},(x_{k},u_{k})\in\mathbb{L}=\{(x,u)\ |\ Cx+Du\leq E\}, (2)

where Cc×n,Dc×m,EcC\in\mathbb{R}^{c\times n},\ D\in\mathbb{R}^{c\times m},\ E\in\mathbb{R}^{c} and origin is in the interior of 𝕃\mathbb{L}.

Based on the system dynamic (1) and the constraints (2), an MPC over a finite time horizon of length T1T\leq 1, given the state xkx_{k} at time kk\in\mathbb{N}, can be formulated as

minu.|k\displaystyle\min_{u_{.|k}} J(x.|k,u.|k),\displaystyle J(x_{.|k},u_{.|k}), (3)
s.t. xt+1|k=Axt|k+But|k,\displaystyle x_{t+1|k}=Ax_{t|k}+Bu_{t|k},
Cxt|k+Dut|kE,\displaystyle Cx_{t|k}+Du_{t|k}\leq E,
CTxN|kET,\displaystyle C_{T}x_{N|k}\leq E_{T},
x0|k=xk,\displaystyle x_{0|k}=x_{k},
t=0,1,,N1,\displaystyle t=0,1,...,N-1,

where

J(x.|k,u.|k):=t=0N112(xt|kTQxt|k+ut|kTRut|k)+12xN|kTPxN|k.J(x_{.|k},u_{.|k}):=\sum^{N-1}_{t=0}\frac{1}{2}(x_{t|k}^{T}Qx_{t|k}+u_{t|k}^{T}Ru_{t|k})+\frac{1}{2}x_{N|k}^{T}Px_{N|k}. (4)

Here, x.|k,u.|kx_{.|k},u_{.|k} denote the predicted state and the predicted input. J(x.|k,u.|k)J(x_{.|k},u_{.|k}) is composed of the stage cost and the terminal cost where the weights Q,R,PQ,R,P are symmetric positive definite matrices. The constraint 𝕏T={x|CTxET}\mathbb{X}_{T}=\{x\ |\ C_{T}x\leq E_{T}\} is the terminal constraint.

Let u.|ku^{*}_{.|k} be the solution of (3). Then, the first input is applied to the system:

uMPC(k)=u0|k.u_{MPC}(k)=u^{*}_{0|k}. (5)

The optimization (3) can be expressed as a standard multiparameter quadratic programming (mp-QP) form by regarding xx as the parameter vector:

minz\displaystyle\min_{z}\ 12zTHz+xTFz,\displaystyle\frac{1}{2}z^{T}Hz+x^{T}Fz, (6a)
s.t. z(x):={z|GzW+Sx},\displaystyle z\in\mathbb{Z}(x):=\{z\ |\ Gz\leq W+Sx\}, (6b)

where xnx\in\mathbb{R}^{n} is the current state, Hnz×nz,Fn×nz,Gnc×nz,Wnc,Snc×nH\in\mathbb{R}^{n_{z}\times n_{z}},F\in\mathbb{R}^{n\times n_{z}},G\in\mathbb{R}^{n_{c}\times n_{z}},W\in\mathbb{R}^{n_{c}},S\in\mathbb{R}^{n_{c}\times n} are parameter matrices which are obtained from Q,R,P,A,B,C,D,E,CT,ETQ,R,P,A,B,C,D,E,C_{T},E_{T}. We denote an optimal solution to (6) by z(x)z^{*}(x). The set of indices of the active constraints responding to xx is denoted by 𝒜(x)\mathcal{A}(x).

If mp-QP has many constraints in (6b), then it is challenging and time-consuming to solve this optimization problem. One approach to reduce computational burden is to remove some redundant constraints.

II-B Constraint-adaptive MPC

Refer to caption
Figure 1: The illustrations of Lemma 1. The black dashed lines denote level sets of the cost function. The orange region denotes (x)\mathcal{M}(x). The green solid and dashed lines denote the constraints specified by 𝕀(x)\mathbb{I}(x) and (x)\mathbb{C}(x), respectively. The left half-space of the green dashed line is the set (x,(x))\mathbb{Z}(x,\mathbb{C}(x)).

Nouwens et al. [12] proposed a online constraint removal framework for linear system, called constraint-adaptive MPC. Specially, his method introduced the reduced MPC problem which have fewer constraints compared to the original MPC problem. The mp-QP version of the reduced MPC problem is formulated as follows:

minz\displaystyle\min_{z}\ 12zTHz+xTFz,\displaystyle\frac{1}{2}z^{T}Hz+x^{T}Fz, (7a)
s.t. z(x,𝕀(x))={z|GjzWj+Sjx,j𝕀(x)},\displaystyle z\in\mathbb{Z}(x,\mathbb{I}(x))=\{z\ |\ G_{j}z\leq W_{j}+S_{j}x,j\in\mathbb{I}(x)\}, (7b)

where we define the mapping 𝕀:n[1,nc]\mathbb{I}:\mathbb{R}^{n}\rightrightarrows\mathbb{N}_{[1,n_{c}]} from the state space to the indices set of reduced constraints. z(x,𝕀(x))z^{*}(x,\mathbb{I}(x)) denote the optimal solution of (7).

To guarantee that the optimal solution to (7) is the same as that to (6), Nouwens et al. [12] build upon the theorem as follows:

Lemma 1 (Theorem 1, [12]).

Consider the original and reduced MPC optimization problem (6) and (7). Let the set-valued mapping :n[1,nc]\mathbb{C}:\mathbb{R}^{n}\rightrightarrows\mathbb{N}_{[1,n_{c}]} denote the set of indices of removed constraints, i.e., (x)=[1,nc]𝕀(x)\mathbb{C}(x)=\mathbb{N}_{[1,n_{c}]}-\mathbb{I}(x). If there exists a mapping :nnz\mathcal{M}:\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{n_{z}} such that for all xnx\in\mathbb{R}^{n},

z(x,𝕀(x))(x),\displaystyle z^{*}(x,\mathbb{I}(x))\in\mathcal{M}(x), (8a)
(x)(x,(x)).\displaystyle\mathcal{M}(x)\subset\mathbb{Z}(x,\mathbb{C}(x)). (8b)

Then, it holds that z(x,𝕀(x))=z(x)z^{*}(x,\mathbb{I}(x))=z^{*}(x).

In Lemma 1, (x)\mathcal{M}(x) is regarded as the outer approximation of z(x,𝕀(x))z^{*}(x,\mathbb{I}(x)). This approximation can be obtained by some information about the optimal solution (e.g., reachability and optimality [12]).

The way to apply the Lemma 1 is to first construct the outer approximation (x)\mathcal{M}(x) of z(x,𝕀(x))z^{*}(x,\mathbb{I}(x)) for all possible choices of 𝕀\mathbb{I}. Then, the indices in (x)\mathbb{C}(x) are selected such that (8b) is satisfied. Finally, constraints in (x)\mathbb{C}(x) are removed to formulate the reduced mp-QP problem (7). As shown in Fig. 1, (x)\mathcal{M}(x) is in the half-space formed by (x)\mathbb{C}(x) and contained by (x,(x))\mathbb{Z}(x,\mathbb{C}(x)). In this case, the constraints specified by (x)\mathbb{C}(x) can be removed directly without changing the optimal solution, i.e. z(x,𝕀(x))=z(x)z^{*}(x,\mathbb{I}(x))=z^{*}(x). Thus, the design of outer approximation (x)\mathcal{M}(x) is essential, and a good approximation leads to efficient constraint removal. In this paper, we are interested in the following problem:

Problem 1.

Design an approximation (x)\mathcal{M}(x) in Lemma 1 and improve the constraints removal.

III Exploiting historical data for ca-MPC

In this section, we use the historical data to design the outer approximation (x)\mathcal{M}(x). This is motivated by an observation that the optimal solution for mp-QP is Lipschitz continuous. If the current state xx is close to the previous state x~\tilde{x}, then the optimal solution z(x)z^{*}(x) of the current state is in a circle centered around that z(x~)z^{*}(\tilde{x}) of which radius is κmaxxx~2\kappa_{max}\|x-\tilde{x}\|_{2}. This information can be used to design the outer approximation (x)\mathcal{M}(x), then evaluate and remove redundant constraints. Specifically, we provide an explicit form of the Lipschitz constant of linear MPC in Section III-A. Based on this, we design a constraint removal mechanism that utilizes historical data and present our MPC scheme in Section III-B.

III-A The Lipschitz constant of reduced mp-QP

Consider a mp-QP problem with constraints selected by 𝕀\mathbb{I} which has no relation with the state x compared to (7):

minz\displaystyle\min_{z}\ 12zTHz+xTFz,\displaystyle\frac{1}{2}z^{T}Hz+x^{T}Fz, (9a)
s.t. z𝕀(x)={z|G𝕀zW𝕀+S𝕀x}.\displaystyle z\in\mathbb{Z}_{\mathbb{I}}(x)=\{z\ |\ G_{\mathbb{I}}z\leq W_{\mathbb{I}}+S_{\mathbb{I}}x\}. (9b)

Here, we denote the optimal solution of (9) by z𝕀(x)z^{*}_{\mathbb{I}}(x). For each 𝕀\mathbb{I}, there exists a κ𝕀\kappa_{\mathbb{I}} [6] such that

z𝕀(x1)z𝕀(x2)κ𝕀x1x2.\|z_{\mathbb{I}}^{*}(x_{1})-z_{\mathbb{I}}^{*}(x_{2})\|\leq\kappa_{\mathbb{I}}\|x_{1}-x_{2}\|. (10)

The Lipschitz constant κ𝕀\kappa_{\mathbb{I}} depends on 𝕀\mathbb{I}. It is hard to compute the Lipschitz constant κ𝕀\kappa_{\mathbb{I}} unless the explicit law is known [15]. In fact, the explicit law is intractable for large-scale system. In our cases, we need to calculate the maximum Lipschitz constant of mp-QP in different 𝕀\mathbb{I}, i.e., κmax=max𝕀[1,nc]κ𝕀\kappa_{\text{max}}=\max_{\mathbb{I}\subseteq\mathbb{N}_{[1,n_{c}]}}\kappa_{\mathbb{I}}. As far as we know, no methods can calculate this constant efficiently. To address this problem, we provide an explicit form for κmax\kappa_{\text{max}} with very low computational cost as follows:

Theorem 1.

For any 𝕀[1,nc],x1,x2n\mathbb{I}\subseteq\mathbb{N}_{[1,n_{c}]},x_{1},x_{2}\in\mathbb{R}^{n}, it holds that

z𝕀(x1)z𝕀(x2)2κmaxx1x22,\|z_{\mathbb{I}}^{*}(x_{1})-z_{\mathbb{I}}^{*}(x_{2})\|_{2}\leq\kappa_{\text{max}}\|x_{1}-x_{2}\|_{2}, (11)

with

κmax=\displaystyle\kappa_{\text{max}}= H1FT2+1minjGjH1GjTH1GT2\displaystyle\|H^{-1}F^{T}\|_{2}+\frac{1}{\min_{j}G_{j}H^{-1}G_{j}^{T}}\|H^{-1}G^{T}\|_{2} (12)
×S+GH1FT2.\displaystyle\times\|S+GH^{-1}F^{T}\|_{2}.
Remark 1.

This Lipschitz constant is finite because H1H^{-1} is positive definite matrix and GjH1GjT>0G_{j}H^{-1}G_{j}^{T}>0. Also, this constant can be offline computed from the model parameters.

Theorem 1 provides an efficient computing method for the Lipschitz constant. Its computation includes matrix multiplication and matrix norm. Let us analyze the computation complexity of the matrix norm. Here, we are interested in the mp-QP that exists many constraints, i.e., ncn+nzn_{c}\gg n+n_{z}. For a matrix Xa×bX\in\mathbb{R}^{a\times b}, its matrix norm X2=λ1(XXT)=λ1(XXT)\|X\|_{2}=\sqrt{\lambda_{1}(XX^{T})}=\sqrt{\lambda_{1}(XX^{T})}. The computation complexity per step of λ1(XXT)\lambda_{1}(XX^{T}) and λ1(XTX)\lambda_{1}(X^{T}X) with power iteration methods are 𝒪(a2)\mathcal{O}(a^{2}) and 𝒪(b2)\mathcal{O}(b^{2}), respectively. So the computation complexity of X2\|X\|_{2} is 𝒪(a2)(a<b)\mathcal{O}(a^{2})(a<b) or 𝒪(b2)(a>b)\mathcal{O}(b^{2})(a>b). The dimensions of three matrix in (12) to compute norm are nz×nn_{z}\times n, nz×nzn_{z}\times n_{z} and nc×nn_{c}\times n. Although ncn+nzn_{c}\gg n+n_{z}, the computation complexity of matrix norm in (12) is 𝒪(n2+nz2)\mathcal{O}(n^{2}+n_{z}^{2}) and are not influenced by the number ncn_{c} of constraints.

This upper bound in Theorem 1 may be not close. To decrease the gap, we can utilize some techiques to find a smaller value κmax\kappa_{\text{max}}. Let ΦRnc×nc\Phi\in R^{n_{c}\times n_{c}} be a full-rank matrix. We substitute G,W,SG,W,S with ΦG,ΦW,ΦS\Phi G,\Phi W,\Phi S in (6b) and the feasible region and the optimal solution are not changed. The variant of (12) is obtained:

κ^max\displaystyle\hat{\kappa}_{\text{max}} =H1FT2+1minj(ΦG)jH1(ΦG)jTH1GTΦT2\displaystyle=\|H^{-1}F^{T}\|_{2}+\frac{1}{\min_{j}(\Phi G)_{j}H^{-1}(\Phi G)_{j}^{T}}\|H^{-1}G^{T}\Phi^{T}\|_{2} (13)
×ΦS+ΦGH1FT2.\displaystyle\times\|\Phi S+\Phi GH^{-1}F^{T}\|_{2}.

Equation (13) can be used as the Lipschitz constant in (11). We can select the appropriate matrix Φ\Phi to find a smaller value κ^max\hat{\kappa}_{\text{max}} than κmax\kappa_{\text{max}} in (12). In Section IV, we provide a empirical approach to choose Φ\Phi.

III-B The outer approximation (x)\mathcal{M}(x)

We now utilize the Theorem 1 to construct the outer approximation (x)\mathcal{M}(x). Then, we propose the ca-MPC scheme based on the mapping 𝕀\mathbb{I} computed by (x)\mathcal{M}(x).

We present the key lemma used in the design of (x)\mathcal{M}(x):

Lemma 2.

For any x1,x2nx_{1},x_{2}\in\mathbb{R}^{n}, if 𝒜(x2)𝕀(x1)\mathcal{A}(x_{2})\subseteq\mathbb{I}(x_{1}), we have

z(x1,𝕀(x1))z(x2)2κmaxx1x22.\|z^{*}(x_{1},\mathbb{I}(x_{1}))-z^{*}(x_{2})\|_{2}\leq\kappa_{\text{max}}\|x_{1}-x_{2}\|_{2}. (14)
Proof.

Let 𝕀=𝕀(x1)\mathbb{I}=\mathbb{I}(x_{1}) in Theorem 1:

z(x1,𝕀(x1))z(x2,𝕀(x1))2κmaxx1x22.\|z^{*}(x_{1},\mathbb{I}(x_{1}))-z^{*}(x_{2},\mathbb{I}(x_{1}))\|_{2}\leq\kappa_{\text{max}}\|x_{1}-x_{2}\|_{2}. (15)

We denote the objective function as V(z)=12zTHz+x2TFzV(z)=\frac{1}{2}z^{T}Hz+x^{T}_{2}Fz. Because 𝒜(x2)𝕀(x1)[1,nc]\mathcal{A}(x_{2})\subseteq\mathbb{I}(x_{1})\subseteq\mathbb{N}_{[1,n_{c}]}, we have

V(z(x2,𝒜(x2)))V(z(x2,𝕀(x1)))V(z(x2)).V(z^{*}(x_{2},\mathcal{A}(x_{2})))\leq V(z^{*}(x_{2},\mathbb{I}(x_{1})))\leq V(z^{*}(x_{2})).

Then, it follows from V(z(x2,𝒜(x2)))=V(z(x2))V(z^{*}(x_{2},\mathcal{A}(x_{2})))=V(z^{*}(x_{2})) that

V(z(x2,𝒜(x2)))=V(z(x2,𝕀(x1)))=V(z(x2)).V(z^{*}(x_{2},\mathcal{A}(x_{2})))=V(z^{*}(x_{2},\mathbb{I}(x_{1})))=V(z^{*}(x_{2})). (16)

z(x2,𝕀(x1)),z(x2)(x2,𝒜(x2))z^{*}(x_{2},\mathbb{I}(x_{1})),z^{*}(x_{2})\in\mathbb{Z}(x_{2},\mathcal{A}(x_{2})) and are optimal in (7) where let x=x2x=x_{2} and 𝕀(x)=𝒜(x2)\mathbb{I}(x)=\mathcal{A}(x_{2}) according to (16). Since this mp-QP have a unique optimal solution, so we have

z(x2,𝕀(x1))=z(x2).z^{*}(x_{2},\mathbb{I}(x_{1}))=z^{*}(x_{2}). (17)

Substituting (17) into (15) and the proof is finished. ∎

In Lemma 2, we consider x1=xx_{1}=x as the current state and x2=x~x_{2}=\tilde{x} as a historical state. Let reduced constraint indices 𝕀(x)\mathbb{I}(x) satisfy 𝒜(x~)𝕀(x)\mathcal{A}(\tilde{x})\subseteq\mathbb{I}(x):

z(x,𝕀(x))(x),z^{*}(x,\mathbb{I}(x))\in\mathcal{M}(x), (18)

where

(x)={z|zz(x~)2κmaxxx~2}.\mathcal{M}(x)=\{z\ |\ \|z-z^{*}(\tilde{x})\|_{2}\leq\kappa_{\text{max}}\|x-\tilde{x}\|_{2}\}. (19)

Equation (19) indicates that (x)\mathcal{M}(x) is a sphere set. The term xx~2\|x-\tilde{x}\|_{2} in (19) is the distance between the current state and historical state. This distance is expected to be small so that (x)\mathcal{M}(x) can be contained in many constraints of (x)\mathbb{Z}(x). Based on this distance, we can evaluate if the historical state can provide good information for (x)\mathbb{C}(x) and can help remove some redundant constraints. As a result, the historical data which have the smallest distance xx~2\|x-\tilde{x}\|_{2} is used to design (x)\mathcal{M}(x) and construct (x)\mathbb{C}(x).

To select the removal constraints (x)\mathbb{C}(x), we utilize the sphere half-space intersection check to evaluate if the sphere set (x)\mathcal{M}(x) is entirely constained in the half-space specified by the constraint in (x)\mathbb{Z}(x). This check leads to our removal rule:

(x)={j[1,nc]|κmaxxx~2Gj2<Wj+SjxGjz(x~)}𝒜(x~).\begin{split}\mathbb{C}(x)=\{j\in\mathbb{N}_{[1,n_{c}]}\ |\ &\kappa_{\text{max}}\|x-\tilde{x}\|_{2}\|G_{j}\|_{2}<W_{j}+S_{j}x\\ &-G_{j}z^{*}(\tilde{x})\}-\mathcal{A}(\tilde{x}).\end{split} (20)

In Algorithm 1, we present the removal process and present our overview of the proposed MPC scheme. Theorem 2 shows that our removal rule (20) maintains the trajectory unchanged.

Algorithm 1 implementation of ca-MPC using historical data
1:k=0k=0;
2:Initialize historical data DD;
3:κmax(12) or (13)\kappa_{\text{max}}\leftarrow\eqref{3_5}\text{ or }\eqref{3_6};
4:while True do
5:     Measure xkx_{k};
6:     Find x~\tilde{x} with the smallest distance xkx~2\|x_{k}-\tilde{x}\|_{2} from historical data DD
7:     (xk)(20)\mathbb{C}(x_{k})\leftarrow\eqref{3_14} with data (x~,z(x~),𝒜(x~))(\tilde{x},z^{*}(\tilde{x}),\mathcal{A}(\tilde{x}));
8:     𝕀(xk)=[1,nc](xk)\mathbb{I}(x_{k})=\mathbb{N}_{[1,n_{c}]}-\mathbb{C}(x_{k});
9:     z(xk),𝒜(xk)(7)z^{*}(x_{k}),\mathcal{A}(x_{k})\leftarrow\eqref{2_8} with x=xkx=x_{k} and 𝕀(xk)\mathbb{I}(x_{k});
10:     Store data (xk,z(xk),𝒜(xk))(x_{k},z^{*}(x_{k}),\mathcal{A}(x_{k})) in DD;
11:     Apply control input from zz^{*} into plant;
12:     kk+1k\leftarrow k+1;
13:end while
Theorem 2.

Given a original and reduced mp-QP problem (6) and (7), the Lipschitz constant κmax\kappa_{\text{max}} (12), the removal rule (x)\mathbb{C}(x) (20), the state trajectory under Algorithm 1 is the same with closed-loop trajectory under the policy (6).

Proof.

We prove this Theorem by induction.

At the k step of the Algorithm 1, it follows from line 7-8 that

𝒜(x~)𝕀(xk).\mathcal{A}(\tilde{x})\subset\mathbb{I}(x_{k}). (21)

Then it follows from Lemma 2 that

z(xk,𝕀(xk))(xk).z^{*}(x_{k},\mathbb{I}(x_{k}))\in\mathcal{M}(x_{k}). (22)

with

(xk)={z|zz(x~))2κmaxxkx~2}.\mathcal{M}(x_{k})=\{z\ |\ \|z-z^{*}(\tilde{x}))\|_{2}\leq\kappa_{\text{max}}\|x_{k}-\tilde{x}\|_{2}\}. (23)

For each j(xk)j\in\mathbb{C}(x_{k}), it follows from line 7 that

κmaxxx~2<Wj+Sjx~Gjz(x~)Gj2.\kappa_{\text{max}}\|x-\tilde{x}\|_{2}<\frac{W_{j}+S_{j}\tilde{x}-G_{j}z^{*}(\tilde{x})}{\|G_{j}\|_{2}}. (24)

We know that the distance between z(x~)z^{*}(\tilde{x}) and plane Zj={v|Gjv=Wj+Sjx~}Z_{j}=\{v\ |\ G_{j}v=W_{j}+S_{j}\tilde{x}\} is Wj+Sjx~Gjz(x~)Gj2\frac{W_{j}+S_{j}\tilde{x}-G_{j}z^{*}(\tilde{x})}{\|G_{j}\|_{2}}. It follow from (22)(23)(24) that (xk)\mathcal{M}(x_{k}) does not intersect with the plane ZjZ_{j}. Also we know z(x~)Zjz^{*}(\tilde{x})\in Z_{j}. Then (xk)Zj\mathcal{M}(x_{k})\subset Z_{j}. As a result, it holds that

(xk)j(xk)Zj=(x,(xk)).\mathcal{M}(x_{k})\subset\cap_{j\in\mathbb{C}(x_{k})}Z_{j}=\mathbb{Z}(x,\mathbb{C}(x_{k})). (25)

According to Lemma 1, combining (22), (25) with (21) yields

z(xk,𝕀(xk))=z(xk).z^{*}(x_{k},\mathbb{I}(x_{k}))=z^{*}(x_{k}).

which can complete the proof by induction.

Theorem 2 indicates that our algorithm exploits the historical data to accelerate the optimization without changing the state trajectory. Compared to solving the optimization problem (6) with a large number of constraints, the computation in line 6 of Algorithm 1 is low unless the historical data is very large. Even though this computation is limited by the number of historical data, some techiques can be used to overcome this problem. For example, the redundant data can removed when distance between two historical data is small.

In Algorithm 1, when data set DD is empty, we can initialize with one data (x,z(x),𝒜(x))=(0,0,)(x,z^{*}(x),\mathcal{A}(x))=(0,0,\emptyset). In (20), The terms such as Gj2\|G_{j}\|_{2} can be calculated offline which can accelerate the construction of removal constraints.

IV Simulations

In this section, we apply the proposed method to the double integrator system. The simulations are conducted in MATLAB.

Consider the double integrator system:

xk+1=[10.101]xk+[0.0050.1]uk,x_{k+1}=\begin{bmatrix}1&0.1\\ 0&1\end{bmatrix}x_{k}+\begin{bmatrix}0.005\\ 0.1\end{bmatrix}u_{k},

where the input is constrained by |u|1|u|\leq 1. The state constraints are

(x1,jd)TP1(xd)1,\displaystyle(x^{1,j}-d)^{T}P_{1}(x-d)\leq 1,
(x2,jd)TP2(xd)1,\displaystyle(x^{2,j}-d)^{T}P_{2}(x-d)\leq 1,

which are linearly approximated by two quadratic constraints (27). x1,jx^{1,j} and x2,jx^{2,j} are the boundary points of

(xd)TP1(xd)1,\displaystyle(x-d)^{T}P_{1}(x-d)\leq 1, (27a)
(xd)TP2(xd)1,\displaystyle(x-d)^{T}P_{2}(x-d)\leq 1, (27b)

respectively, and

d=[2.150],P1=[0.140.170.171.7],P2=[0.200.050.050.21].d=\begin{bmatrix}-2.15\\ 0\end{bmatrix},\ P_{1}=\begin{bmatrix}0.14&0.17\\ 0.17&1.7\end{bmatrix},\ P_{2}=\begin{bmatrix}0.20&0.05\\ 0.05&0.21\end{bmatrix}.

Similarly, the terminal constraint is constructed by piecewise linear approximation of two quadratic constraints (27a) and (xd)TP3(xd)1(x-d)^{T}P_{3}(x-d)\leq 1 with P3=[0.10.70.70.97]P_{3}=\begin{bmatrix}0.1&0.7\\ 0.7&0.97\end{bmatrix}. The state constraints and terminal constraint are shown in Fig. 2.

The MPC cost function is defined as (4) where

Q=[1001],P=[1001],R=1.Q=\begin{bmatrix}1&0\\ 0&1\end{bmatrix},\ P=\begin{bmatrix}1&0\\ 0&1\end{bmatrix},R=1.

The prediction horizon is N=12N=12 and the total number of constraints is nc=1948n_{c}=1948. The Lipschitz constant κ^max=21.44\hat{\kappa}_{\text{max}}=21.44 is calculated by (13) where

Φ=diag(1G12,1G22,,1Gnc2).\Phi=\text{diag}(\frac{1}{\|G_{1}\|_{2}},\frac{1}{\|G_{2}\|_{2}},\cdots,\frac{1}{\|G_{n_{c}}\|_{2}}).

The system is initialized at x0=[4.10]Tx_{0}=\begin{bmatrix}-4.1&0\end{bmatrix}^{T} and expected to converge to the origin under the control policy. The data set DD is initialized with one data (0,0,)(0,0,\emptyset). The resulting state trajectories are shown in Fig. 2. The closed-loop trajectories under the original MPC and the proposed ca-MPC steer towards the origin without violating the constraints. Moreover, our proposed ca-MPC produces the same trajectories as that of the original MPC. The comparation of the constraints number and computation time between two schemes are shown in Fig. 3. After 10 steps, the proposed scheme decreases more than 80% constraints, then the proposed scheme removes almost all of constraints after 80 steps. In addition, our ca-MPC scheme is 10-100 times faster to compute compared to the original MPC scheme.

Refer to caption
Figure 2: The closed-loop state trajectory using the original MPC policy and the proposed ca-MPC policy.
Refer to caption
Figure 3: The percentage of constraints and the computation time with respect to the original MPC solution over time.

V Conclusions

In this paper, we presented the constraint-adaptive MPC scheme which can exploit the historical data to remove constraints. A crucial aspect of our method is to analyze the Lipschitz continuity and design the outer approximation. In particular, we provided an explicit form of the Lipschitz constant which can be calculated offline and efficiently. Then, we proved that the closed-loop behavior remains unchanged compared to the original MPC policy. Simulations validated that our scheme can significantly reduce the number of constraints and improve the computational speed.

Appendix A Proof of Theorem 1

The KKT conditions related to the mp-QP (9) are

Hz+FTx+G𝕀Tλ=0,\displaystyle Hz+F^{T}x+G_{\mathbb{I}}^{T}\lambda=0, (28a)
λ(G𝕀zW𝕀S𝕀x)=0,\displaystyle\lambda\circ(G_{\mathbb{I}}z-W_{\mathbb{I}}-S_{\mathbb{I}}x)=0, (28b)
λ0,\displaystyle\lambda\geq 0, (28c)
G𝕀zW𝕀+S𝕀x.\displaystyle G_{\mathbb{I}}z\leq W_{\mathbb{I}}+S_{\mathbb{I}}x. (28d)

Solving (28a) for z yields

z=H1(FTx+G𝕀Tλ).z=-H^{-1}(F^{T}x+G_{\mathbb{I}}^{T}\lambda). (29)

Let superscript ~\tilde{} denotes the variable corresponding to active constraints 𝒜𝕀(x)\mathcal{A}_{\mathbb{I}}(x). Without loss of generality, we assume that the rows of G~\tilde{G} are linearly independent [6]. For active constraints, G~𝕀zW~𝕀S~𝕀x=0\tilde{G}_{\mathbb{I}}z-\tilde{W}_{\mathbb{I}}-\tilde{S}_{\mathbb{I}}x=0. Combining it with (29):

λ~=(G~𝕀H1G~𝕀T)1(W~𝕀+(S~𝕀+G~𝕀H1FT)x)\tilde{\lambda}=-(\tilde{G}_{\mathbb{I}}H^{-1}\tilde{G}_{\mathbb{I}}^{T})^{-1}(\tilde{W}_{\mathbb{I}}+(\tilde{S}_{\mathbb{I}}+\tilde{G}_{\mathbb{I}}H^{-1}F^{T})x) (30)

where G~𝕀,W~𝕀,S~𝕀\tilde{G}_{\mathbb{I}},\tilde{W}_{\mathbb{I}},\tilde{S}_{\mathbb{I}} correspond to the set of active constraints, and G~H1G~T\tilde{G}H^{-1}\tilde{G}^{T} is invertible because the rows of G~\tilde{G} are linearly independent. Substituting (30) into (29):

z𝕀(x)=\displaystyle z^{*}_{\mathbb{I}}(x)= H1FTx\displaystyle-H^{-1}F^{T}x
+H1G~𝕀T(G~𝕀H1G~𝕀T)1(W~𝕀+(S~𝕀+G~𝕀H1FT)x).\displaystyle+H^{-1}\tilde{G}_{\mathbb{I}}^{T}(\tilde{G}_{\mathbb{I}}H^{-1}\tilde{G}_{\mathbb{I}}^{T})^{-1}(\tilde{W}_{\mathbb{I}}+(\tilde{S}_{\mathbb{I}}+\tilde{G}_{\mathbb{I}}H^{-1}F^{T})x).

Define

K=H1FT+H1G~𝕀T(G~𝕀H1G~𝕀T)1(S~𝕀+G~𝕀H1FT).K=-H^{-1}F^{T}+H^{-1}\tilde{G}_{\mathbb{I}}^{T}(\tilde{G}_{\mathbb{I}}H^{-1}\tilde{G}_{\mathbb{I}}^{T})^{-1}(\tilde{S}_{\mathbb{I}}+\tilde{G}_{\mathbb{I}}H^{-1}F^{T}).

We obtain

K2H1FT2+H1G~𝕀T(G~𝕀H1G~𝕀T)1(S~𝕀+G~𝕀H1FT)2.\begin{split}\|K\|_{2}\leq&\|H^{-1}F^{T}\|_{2}\\ &+\|H^{-1}\tilde{G}_{\mathbb{I}}^{T}(\tilde{G}_{\mathbb{I}}H^{-1}\tilde{G}_{\mathbb{I}}^{T})^{-1}(\tilde{S}_{\mathbb{I}}+\tilde{G}_{\mathbb{I}}H^{-1}F^{T})\|_{2}.\end{split} (31)

In fact, G~𝕀,S~𝕀\tilde{G}_{\mathbb{I}},\tilde{S}_{\mathbb{I}} are the submatrices composed by the rows in G,SG,S respectively. We can represent these matrix with JJ such that

G~𝕀=JG,S~𝕀=JS.\tilde{G}_{\mathbb{I}}=JG,\ \tilde{S}_{\mathbb{I}}=JS.

Here, the matrix JJ is non-square matrix that has at most one entry of 1 in each row and each column with all other entries 0. Substituting this into the second term of (31) yields

H1G~𝕀T(G~𝕀H1G~𝕀T)1(S~𝕀+G~𝕀H1FT)2\displaystyle\|H^{-1}\tilde{G}_{\mathbb{I}}^{T}(\tilde{G}_{\mathbb{I}}H^{-1}\tilde{G}_{\mathbb{I}}^{T})^{-1}(\tilde{S}_{\mathbb{I}}+\tilde{G}_{\mathbb{I}}H^{-1}F^{T})\|_{2} (32)
=H1GTJT(JGH1GTJT)1(JS+JGH1FT)2\displaystyle=\|H^{-1}G^{T}J^{T}(JGH^{-1}G^{T}J^{T})^{-1}(JS+JGH^{-1}F^{T})\|_{2}
H1GT2JT(JGH1GTJT)1J2S+GH1FT2.\displaystyle\leq\|H^{-1}G^{T}\|_{2}\|J^{T}(JGH^{-1}G^{T}J^{T})^{-1}J\|_{2}\|S+GH^{-1}F^{T}\|_{2}.

The last inequation is built with the submultiplicative property of the spectral norm. It follows that

JT(JGH1GTJT)1J2\displaystyle\|J^{T}(JGH^{-1}G^{T}J^{T})^{-1}J\|_{2} (33)
JT2(JGH1GTJT)12J2\displaystyle\leq\|J^{T}\|_{2}\|(JGH^{-1}G^{T}J^{T})^{-1}\|_{2}\|J\|_{2}
=(JGH1GTJT)12=λ1((JGH1GTJT)1)\displaystyle=\|(JGH^{-1}G^{T}J^{T})^{-1}\|_{2}=\lambda_{1}((JGH^{-1}G^{T}J^{T})^{-1})
=1λmin(JGH1GTJT),\displaystyle=\frac{1}{\lambda_{\text{min}}(JGH^{-1}G^{T}J^{T})},

where the second and third equation are satisfied because (JGH1GTJT)1(JGH^{-1}G^{T}J^{T})^{-1} is positive definite matrix.

H1H^{-1} is positive definite matrix and can be written as H1=LLTH^{-1}=LL^{T} (i.e. Cholesky decomposition) where LL has full rank. Let Y=GLY=GL The matrices JGLLTGTJTJGLL^{T}G^{T}J^{T} and LTGTJTJGLL^{T}G^{T}J^{T}JGL have the same non-zero eigenvalues and the number of non-zero eigenvalues is 𝒜(x)\mathcal{A}(x):

λmin(JGH1GTJT)=λmin(JYYTJT)=λcard(𝒜(x))(YTJTJY)=λcard(𝒜(x))(j𝒜(x)YjTYj).\begin{split}&\lambda_{\text{min}}(JGH^{-1}G^{T}J^{T})=\lambda_{\text{min}}(JYY^{T}J^{T})\\ &=\lambda_{\text{card}(\mathcal{A}(x))}(Y^{T}J^{T}JY)=\lambda_{\text{card}(\mathcal{A}(x))}(\sum_{j\in\mathcal{A}(x)}Y^{T}_{j}Y_{j}).\end{split} (34)

According to the Corollary in [17, Corollary 4.3.15.], if two matrices A,BA,B are Hermitian, then λi(A)+λ1(B)λi(A+B)\lambda_{i}(A)+\lambda_{1}(B)\leq\lambda_{i}(A+B). As a result,

λcard(𝒜(x))(j𝒜(x)YjTYj)λcard(𝒜(x))(j𝒜(x),jj0YjTYj)+λ1(Yj0TYj0),\begin{split}&\lambda_{\text{card}(\mathcal{A}(x))}(\sum_{j\in\mathcal{A}(x)}Y^{T}_{j}Y_{j})\\ &\geq\lambda_{\text{card}(\mathcal{A}(x))}(\sum_{j\in\mathcal{A}(x),j\neq j_{0}}Y^{T}_{j}Y_{j})+\lambda_{1}(Y^{T}_{j_{0}}Y_{j_{0}}),\end{split}

where j0j_{0} can be any element in 𝒜(x)\mathcal{A}(x). The rank of j𝒜(x),jj0YjTYj\sum_{j\in\mathcal{A}(x),j\neq j_{0}}Y^{T}_{j}Y_{j} is less than card(𝒜(x))\text{card}(\mathcal{A}(x)). So its card(𝒜(x))\text{card}(\mathcal{A}(x))-th largest eigenvalue is zero. Moreover, λ1(Yj0TYj0)=Yj0Yj0T\lambda_{1}(Y^{T}_{j_{0}}Y_{j_{0}})=Y_{j_{0}}Y^{T}_{j_{0}}. We have

λcard(𝒜(x))(j𝒜(x)YjTYj)Yj0Yj0TminjYjYjT=minjGjH1GjT.\begin{split}&\lambda_{\text{card}(\mathcal{A}(x))}(\sum_{j\in\mathcal{A}(x)}Y^{T}_{j}Y_{j})\geq Y_{j_{0}}Y^{T}_{j_{0}}\\ &\ \ \ \ \geq\min_{j}Y_{j}Y_{j}^{T}=\min_{j}G_{j}H^{-1}G_{j}^{T}.\end{split} (35)

Combing (32),(33),(34),(35) with (31) leads to

K2H1FT2+1minjGjH1GjTH1GT2S+GH1FT2,\begin{split}\|K\|_{2}&\leq\|H^{-1}F^{T}\|_{2}\\ &+\frac{1}{\min_{j}G_{j}H^{-1}G_{j}^{T}}\|H^{-1}G^{T}\|_{2}\|S+GH^{-1}F^{T}\|_{2},\end{split}

which completes the proof.

References

  • [1] G. Cesari, G. Schildbach, A. Carvalho, and F. Borrelli, “Scenario model predictive control for lane change assistance and autonomous driving on highways,” IEEE Intelligent transportation systems magazine, vol. 9, no. 3, pp. 23–35, 2017.
  • [2] S. J. Qin and T. A. Badgwell, “A survey of industrial model predictive control technology,” Control engineering practice, vol. 11, no. 7, pp. 733–764, 2003.
  • [3] C. M. Best, M. T. Gillespie, P. Hyatt, L. Rupert, V. Sherrod, and M. D. Killpack, “A new soft robot control method: Using model predictive control for a pneumatically actuated humanoid,” IEEE Robotics & Automation Magazine, vol. 23, no. 3, pp. 75–84, 2016.
  • [4] J. G. Van Antwerp and R. D. Braatz, “Model predictive control of large scale processes,” Journal of Process Control, vol. 10, no. 1, pp. 1–8, 2000.
  • [5] S. Hovland, K. Willcox, and J. T. Gravdahl, “Mpc for large-scale systems via model reduction and multiparametric quadratic programming,” in Proceedings of the 45th IEEE Conference on Decision and Control.   IEEE, 2006, pp. 3418–3423.
  • [6] A. Bemporad, M. Morari, V. Dua, and E. N. Pistikopoulos, “The explicit linear quadratic regulator for constrained systems,” Automatica, vol. 38, no. 1, pp. 3–20, 2002.
  • [7] J. L. Jerez, E. C. Kerrigan, and G. A. Constantinides, “A condensed and sparse qp formulation for predictive control,” in 2011 50th IEEE Conference on Decision and Control and European Control Conference.   IEEE, 2011, pp. 5217–5222.
  • [8] A. J. Ardakani and F. Bouffard, “Acceleration of umbrella constraint discovery in generation scheduling problems,” IEEE Transactions on Power Systems, vol. 30, no. 4, pp. 2100–2109, 2014.
  • [9] S. Nouwens, B. de Jager, M. M. Paulides, and W. M. H. Heemels, “Constraint removal for mpc with performance preservation and a hyperthermia cancer treatment case study,” in 2021 60th IEEE Conference on Decision and Control (CDC).   IEEE, 2021, pp. 4103–4108.
  • [10] M. Jost and M. Mönnigmann, “Accelerating model predictive control by online constraint removal,” in 52nd IEEE Conference on Decision and Control.   IEEE, 2013, pp. 5764–5769.
  • [11] M. Jost, G. Pannocchia, and M. Mönnigmann, “Online constraint removal: Accelerating mpc with a lyapunov function,” Automatica, vol. 57, pp. 164–169, 2015.
  • [12] S. A. N. Nouwens, M. M. Paulides, and M. Heemels, “Constraint-adaptive mpc for linear systems: A system-theoretic framework for speeding up mpc through online constraint removal,” Automatica, vol. 157, p. 111243, 2023.
  • [13] P. O. Scokaert, J. B. Rawlings, and E. S. Meadows, “Discrete-time stability with perturbations: Application to model predictive control,” Automatica, vol. 33, no. 3, pp. 463–470, 1997.
  • [14] F. Fabiani and P. J. Goulart, “Reliably-stabilizing piecewise-affine neural network controllers,” IEEE Transactions on Automatic Control, vol. 68, no. 9, pp. 5201–5215, 2023.
  • [15] M. S. Darup, M. Jost, G. Pannocchia, and M. Mönnigmann, “On the maximal controller gain in linear mpc,” IFAC-PapersOnLine, vol. 50, no. 1, pp. 9218–9223, 2017.
  • [16] D. Teichrib and M. S. Darup, “Efficient computation of lipschitz constants for mpc with symmetries,” in 2023 62nd IEEE Conference on Decision and Control (CDC).   IEEE, 2023, pp. 6685–6691.
  • [17] R. A. Horn and C. R. Johnson, Matrix analysis.   Cambridge university press, 2012.