Data-Based Receding Horizon Control of Linear Network Systems
This work was supported by ARO-W911NF-18-1-0213.
Abstract
We propose a distributed data-based predictive control scheme to stabilize a network system described by linear dynamics. Agents cooperate to predict the future system evolution without knowledge of the dynamics, relying instead on learning a data-based representation from a single sample trajectory. We employ this representation to reformulate the finite-horizon Linear Quadratic Regulator problem as a network optimization with separable objective functions and locally expressible constraints. We show that the controller resulting from approximately solving this problem using a distributed optimization algorithm in a receding horizon manner is stabilizing. We validate our results through numerical simulations.
Index Terms:
Data-based control, network systems, predictive control of linear systems
I Introduction
With the growing complexity of engineering systems, data-based methods in control theory are becoming increasingly popular, particularly for systems where it is too difficult to develop models from first principles and parameter identification is impractical or too costly. An important class of such systems is network systems, which arise in many applications such as neuroscience, power systems, traffic management, and robotics. Without a system model, agents must use sampled data to characterize the network behavior. However, the decentralized nature of the system means that agents only have access to information that can be measured locally, and must coordinate with one another to predict the network response and decide their control actions. These observations motivate the focus here on distributed data-based control of network systems with linear dynamics.
Literature Review
Distributed control of network systems is a burgeoning area of research, see e.g., [1, 2, 3] and references therein. In general, designing optimal controllers for network systems is an NP-hard problem, but under certain conditions optimal distributed controllers for linear systems can be obtained as the solution to a convex program [4]. When these conditions do not hold, suboptimal controllers can be obtained by convex relaxations [5, 6] or convex restrictions [7] of the original problem. Although these methods produce distributed controllers, the computation of the controller itself is typically done offline, in a centralized manner, and requires knowledge of the underlying system model. Reinforcement learning (RL) is an increasingly popular approach for controlling robots [8] and multi-agent systems [9]. However, RL approaches typically require a very large number of samples to perform effectively [10] and their complexity makes it difficult to get stability, safety, and robustness guarantees as is standard with other control approaches. For applications where safety assurances are required, model predictive control (MPC) is widely used since performance and safety constraints can be directly incorporated into an optimization problem that is solved online. Several distributed MPC formulations are available for multi-agent systems where the dynamics of the agents are coupled, such as [11, 12] where each agent implements a control policy minimizing its own objective while accounting for network interactions locally, or [13] where agents cooperate to minimize a system-wide objective using a network optimization algorithm. Data-based approaches to predictive control have also been proposed. System identification [14] is often leveraged to learn a parameterized model which can then be used with any of the MPC formulations previously mentioned. Methods for implementing a controller directly from sampled data without any intermediate identification also exist. 
The fundamental lemma from behavioral systems theory [15], which characterizes system trajectories from a single sample trajectory, has recently gained attention in the area of data-based control [16, 17, 18], and has been used for predictive control in the recently developed DeePC framework [19, 20]. Our work here extends the DeePC framework to network systems where each node only has partial access to the data.
Statement of Contributions
We develop distributed data-based feedback controllers for network systems.1 A group of agents whose states evolve according to unknown coupled linear dynamics each have access to their own state and those of their neighbors in some sample trajectory. Their collective objective is to drive the network state to the origin while minimizing a quadratic objective function without knowledge of the system dynamics. The approach we use computes the control policy online and in a distributed manner by extending the DeePC formalism to the network case. Building upon the fundamental lemma, we introduce a new distributed, data-based representation of possible network trajectories. We use this representation to pose the control synthesis as a network optimization problem, without state or input constraints, in terms of the data available to each agent. We show that this optimization problem is equivalent to the standard finite-horizon Linear Quadratic Regulation (LQR) problem and introduce a primal-dual method along with a suboptimality certificate to allow agents to cooperatively find an approximate solution. Finally, we show that the controller that results from implementing the distributed solver in a receding horizon manner is stabilizing.
1Throughout the paper, we make use of the following notation. Given integers, with , let . Let be an undirected graph with nodes, where and . The neighbors of are . Given and a vector , we denote . For with , we let . For positive semidefinite , we denote . For , we denote by its Moore-Penrose pseudoinverse. The Hankel matrix of a signal with block rows is the matrix . Given two signals and , let be the signal where for , and for .
II Preliminaries
We briefly recall here basic concepts on the identifiability of Linear Time-Invariant (LTI) systems from data. Given $T, L \in \mathbb{Z}_{>0}$ with $L \le T$, a signal $u : [0, T-1] \to \mathbb{R}^m$ is persistently exciting of order $L$ if its Hankel matrix with $L$ block rows has full row rank, $\operatorname{rank}(\mathcal{H}_L(u)) = Lm$. Informally, this means that any arbitrary signal of length $L$ can be described as a linear combination of windows of width $L$ in the signal $u$. A necessary condition for persistence of excitation is $T \ge (m+1)L - 1$.
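Since persistency of excitation reduces to a rank condition on a Hankel matrix, it can be checked numerically. The following sketch is illustrative only (the signal shapes and names are generic, not the paper's notation):

```python
import numpy as np

def hankel_blocks(u, L):
    """Block-Hankel matrix of the signal u (shape T x m) with L block rows:
    column j stacks the window u[j], ..., u[j+L-1]."""
    return np.column_stack([u[j:j + L].reshape(-1) for j in range(len(u) - L + 1)])

def is_persistently_exciting(u, L):
    """u is persistently exciting of order L iff its Hankel matrix with
    L block rows has full row rank L*m."""
    H = hankel_blocks(u, L)
    return np.linalg.matrix_rank(H) == H.shape[0]
```

A generic random signal of sufficient length passes the test, while a constant signal fails it for any order above one.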
Lemma II.1.
(Fundamental Lemma [15]): Consider the LTI system $x^+ = Ax + Bu$, with $(A, B)$ controllable. Let $u^d : [0, T-1] \to \mathbb{R}^m$, $x^d : [0, T-1] \to \mathbb{R}^n$ be sequences such that $(u^d, x^d)$ is a trajectory of the system and $u^d$ is persistently exciting of order $L + n$. Then for any pair $u : [0, L-1] \to \mathbb{R}^m$, $x : [0, L-1] \to \mathbb{R}^n$, $(u, x)$ is a trajectory of the system if and only if there exists $g \in \mathbb{R}^{T-L+1}$ such that $\begin{bmatrix} \mathcal{H}_L(u^d) \\ \mathcal{H}_L(x^d) \end{bmatrix} g = \begin{bmatrix} u \\ x \end{bmatrix}$.
Lemma II.1 is stated here in state-space form, even though the result was originally presented in the language of behavioral systems theory. The result states that all trajectories of a controllable LTI system can be characterized by a single trajectory if the corresponding input is persistently exciting of sufficiently high order, obviating the need for a model or parameter estimation when designing a controller.
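Lemma II.1 can be verified numerically on a toy system: with a persistently exciting input of sufficiently high order, any fresh trajectory should be reproducible as a linear combination of the columns of the stacked input/state Hankel matrices. The system matrices below are an assumed illustrative choice, not taken from the paper.

```python
import numpy as np

def hankel(w, L):
    """Block-Hankel matrix of the signal w (shape T x q) with L block rows."""
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(len(w) - L + 1)])

# Illustrative controllable pair (assumed for this sketch)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])

def simulate(x0, u):
    """States x_0, ..., x_{T-1} of x^+ = A x + B u from initial state x0."""
    x = [np.asarray(x0, dtype=float)]
    for uk in u:
        x.append(A @ x[-1] + B @ uk)
    return np.array(x[:-1])

rng = np.random.default_rng(1)
T, L, n = 60, 4, 2
ud = rng.standard_normal((T, 1))   # generically persistently exciting of order L + n
xd = simulate(np.zeros(2), ud)     # single sample trajectory

# A fresh trajectory, from a different initial state and input
u_new = rng.standard_normal((L, 1))
x_new = simulate([0.5, -0.3], u_new)

# The fresh trajectory must be a linear combination of the Hankel columns
H = np.vstack([hankel(ud, L), hankel(xd, L)])
w_new = np.concatenate([u_new.reshape(-1), x_new.reshape(-1)])
g, *_ = np.linalg.lstsq(H, w_new, rcond=None)
residual = np.linalg.norm(H @ g - w_new)
```

The least-squares residual is zero up to machine precision, confirming that the single sample trajectory characterizes the fresh one without access to $A$ or $B$.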
III Problem Formulation
Consider a network system described by an undirected graph with nodes. Each node corresponds to an agent with sensing, communication, and computation capabilities. Each edge corresponds to both a physical coupling and a communication link between the corresponding agents. A subset of the nodes , with , also have actuation capabilities via inputs . The system dynamics are then
$x_i^+ = \sum_{j \in \mathcal{N}_i \cup \{i\}} A_{ij} x_j + B_i u_i \qquad (1)$
where , and , with the input term absent for unactuated nodes. Let and and define and . Let $A$ and $B$ be matrices so that (1) takes the compact form $x^+ = Ax + Bu$.
To each node , we associate an objective of the form when and otherwise. Here, each is positive semidefinite, each is positive definite, is a system trajectory, and is the time horizon of trajectories being considered.
Each node wants to drive its state to the origin while minimizing and satisfying the constraints. The resulting network objective function is the sum of the objective functions across the nodes. Letting and , this objective can be written as
so that . If the system starts from , the agents’ goal can be formulated as the network optimization problem:
$$\min_{x,\,u} \;\; J(x, u) \qquad \text{subject to} \qquad x(k+1) = A x(k) + B u(k) \ \ \text{for } k \in \{0, \dots, \bar{T}-1\}, \quad x(0) = x_0, \quad x(\bar{T}) = 0 \qquad (2)$$
Note that the agents’ decisions on their control inputs are coupled through the constraints. Since , if (2) is feasible, its optimal state and input trajectories are unique.
A key aspect of this paper is that we consider scenarios where the system matrices and are unknown to the network. Instead, we assume that, for a set of given input sequences , the corresponding state trajectories are available, and each node has access to its own state trajectory as well as those of its neighbors. Actuated nodes also have access to their own input , but this is unknown to its neighbors . Our aim is to synthesize a control policy that can be implemented by each node in a distributed way with data available to it. The resulting controller should stabilize the system to the origin while minimizing .
IV Data-Based Representation for Optimization
Here, we introduce a data-based representation of system trajectories that is employed to pose a network optimization problem equivalent to (2). Throughout this section, we let , be sequences such that is a trajectory of (1). Let
for each and . Then is the data available to each node. Let , be arbitrary sequences where . Define and
Let for , and otherwise. We define to be the matrix consisting of all ones and zeros such that and .
IV-A Data-Based Representation of Network Trajectories
Lemma II.1 states conditions under which the behavior of the system can be described completely by the Hankel matrix of the sampled data. Here we extend Lemma II.1 to the setting of a network system to build a data-based representation of network trajectories using the Hankel matrices of the data available to each agent, . We show that under certain conditions the image of is the set of all possible trajectories of node .
Proposition IV.1.
(Sufficiency of Data-Based Image Representation): If for each there exists with , then is a trajectory of (1).
Proof.
Writing so for all , , it follows that so for ,
By a similar computation, we can show that for each ,
which is consistent with (1). ∎
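The distributed representation can be illustrated on an assumed two-node toy network (not from the paper): node 1 is unactuated, node 2 is actuated, each builds a Hankel matrix only from the data it can measure locally (its own state, its neighbor's state, and its own input), and the local window of a genuine network trajectory lies in the image of each local Hankel matrix.

```python
import numpy as np

def hankel(w, L):
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(len(w) - L + 1)])

# Assumed two-node toy network: node 1 unactuated, node 2 actuated
A = np.array([[0.8, 0.2], [0.1, 0.7]])
B = np.array([[0.0], [1.0]])

rng = np.random.default_rng(4)
T, L = 80, 5
ud = rng.standard_normal((T, 1))
xd = np.zeros((T, 2))
for k in range(T - 1):
    xd[k + 1] = A @ xd[k] + B @ ud[k]

# Local Hankel matrices: each node stacks only what it can measure
H1 = hankel(xd, L)                           # node 1: (x1, x2)
H2 = hankel(np.column_stack([xd, ud]), L)    # node 2: (x1, x2, u2)

# A fresh network trajectory
u_new = rng.standard_normal((L, 1))
x_new = np.zeros((L, 2))
x_new[0] = [1.0, -0.5]
for k in range(L - 1):
    x_new[k + 1] = A @ x_new[k] + B @ u_new[k]

# Its local windows lie in the image of the corresponding local Hankel matrices
w1 = x_new.reshape(-1)
w2 = np.column_stack([x_new, u_new]).reshape(-1)
g1, *_ = np.linalg.lstsq(H1, w1, rcond=None)
g2, *_ = np.linalg.lstsq(H2, w2, rcond=None)
r1 = np.linalg.norm(H1 @ g1 - w1)
r2 = np.linalg.norm(H2 @ g2 - w2)
```

Both residuals vanish up to numerical precision, so each node can explain its local view of the trajectory from its own data alone; consistency of the overlapping components across neighbors is what ties the local pieces into a single network trajectory.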
Next we identify conditions for the converse of the above result to hold, i.e., when the Hankel matrices of all the agents characterize all possible network trajectories.
Proposition IV.2.
(Necessity of Data-Based Image Representation): If is controllable, is a trajectory of (1) and either
(i) is persistently exciting of order ;
(ii) is persistently exciting of order for each , and is persistently exciting of order for each ;
then for all there exists such that .
Proof.
In the case of (i) we simply apply Lemma II.1 to obtain where , and note that for all , , so the result follows by letting . For case (ii), we think of for as an input to node . Letting , where , and defining
we have
Let be arbitrary, and such that the th block component is . Since is controllable, there exists an input such that the corresponding state trajectory with has . Note that if , then is the state trajectory corresponding to the input , and and . Likewise, if , is the state trajectory corresponding to the input , and and . Hence is controllable for all and the result follows from Lemma II.1. ∎
Remark IV.3.
(Feasibility of Identifiability Conditions): Proposition IV.2 gives conditions under which the data is rich enough to characterize all possible trajectories of the system. Condition (i) concerns the input sequence, , and guarantees a priori the identifiability of the system from data. This condition is generically true in the sense that the set of sequences which are not persistently exciting of order (even though for all , is) has zero Lebesgue measure. In general, it is difficult to verify condition (i) in a distributed manner. On the other hand, it is straightforward to verify condition (ii) using only information available to the individual agents. However, this verification must be done in an ad hoc manner, after the input has been applied to the system. While the condition is sufficient for identifiability, there are systems where, for all inputs , the resulting trajectory will never satisfy it.
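The local verification of condition (ii) amounts to each node stacking the signals it measures (the neighboring states playing the role of inputs to its subsystem, together with its own input if actuated) and checking the rank of the resulting Hankel matrix. The sketch below is schematic: the exact excitation orders required by Proposition IV.2 are stated in the paper's (omitted) notation, so here the order is simply passed in as a parameter.

```python
import numpy as np

def hankel_blocks(w, L):
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(len(w) - L + 1)])

def node_pe_check(neighbor_states, u_i, order):
    """Local persistency-of-excitation test for one node: stack the locally
    measured signals (neighbor states, plus the node's own input if it is
    actuated) and check that the Hankel matrix with `order` block rows has
    full row rank. `u_i` is None for unactuated nodes."""
    w = neighbor_states if u_i is None else np.column_stack([neighbor_states, u_i])
    H = hankel_blocks(w, order)
    return np.linalg.matrix_rank(H) == H.shape[0]
```

Each node runs this test on its own measurements after the experiment, which is the ad hoc, a posteriori character of condition (ii) noted in the remark.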
IV-B Equivalent Network Optimization Problem
Here, we build on the data-based image representation of network trajectories in a distributed fashion to pose a network optimization problem that can be solved with the data available to each agent, which is equivalent to an LQR problem with a time horizon of . Each node can use this representation along with past states and inputs to predict future trajectories, assuming that the hypotheses of Proposition IV.2 are satisfied. Formally, let and let and be sequences such that is a long trajectory of the system. In the network optimization we introduce below, we optimize over system trajectories of length , constrained so the first samples of and are and , respectively. This plays a similar role to the initial condition constraint in (2).
For each node , define
(3)
Consider the following problem
(4)
subject to
Although (4) does not necessarily have a unique optimizer, any optimizer of (4) is such that and are the optimal input and state sequences of (2), as formalized next.
Proposition IV.4.
We omit the proof for space reasons, but note that it is the analogue of Theorem 5.1 and Corollary 5.2 in [19] for the case of network systems once one invokes Propositions IV.1 and IV.2 above. Unlike the original network optimization problem (2), for which agents lack knowledge of the system matrices , , the network optimization problem (4) can be solved in a distributed way with the information available to them. The structure of the problem (aggregate objective functions plus locally expressible constraints) makes it amenable to a variety of distributed optimization algorithms, see e.g., [21, 22]. In Section V-B below, we employ a primal-dual dynamic to find asymptotically a solution of (4) in a distributed way.
Remark IV.5.
(Scalability of Network Optimization): As the number of nodes in the network increases, so does the state dimension; hence more data is required in order to maintain persistency of excitation. A necessary condition is . Assuming that , , we have . The decision variable for each node is when and otherwise. The size of is . However, using the distributed optimization algorithm of Section V-B, agent only needs to send messages of size to agent .
V Distributed Data-Based Predictive Control
Here we introduce a distributed data-based predictive control scheme to stabilize the system (1) to the origin, as described in Section III. To do this, we solve the network optimization problem (4) in a receding horizon manner with and updated every time step based on the system's current state. The control scheme is summarized in Algorithm 1.
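The receding horizon loop of Algorithm 1 can be sketched end-to-end on a toy two-state system. This is an assumed, centralized, unconstrained instance (a zero terminal state standing in for the paper's terminal condition, and an exact KKT solve in place of the distributed primal-dual iteration): at each step the controller matches the most recent samples, solves the data-based horizon problem, and applies the first predicted input.

```python
import numpy as np

def hankel(w, L):
    return np.column_stack([w[j:j + L].reshape(-1) for j in range(len(w) - L + 1)])

# Toy unstable, controllable pair (illustrative; not the paper's system)
A = np.array([[1.1, 0.3], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])

rng = np.random.default_rng(3)
T, Tini, N = 120, 2, 8
L = Tini + N

# Offline: one sample trajectory under a random (persistently exciting) input
ud = rng.standard_normal((T, 1))
xd = np.zeros((T, 2))
for k in range(T - 1):
    xd[k + 1] = A @ xd[k] + B @ ud[k]
Hu, Hx = hankel(ud, L), hankel(xd, L)

# Quadratic cost on the N future samples only
Q, R = np.eye(2), 0.01 * np.eye(1)
mask = np.diag([0.0] * Tini + [1.0] * N)
M = Hu.T @ np.kron(mask, R) @ Hu + Hx.T @ np.kron(mask, Q) @ Hx

def control_step(u_past, x_past):
    """Match the Tini most recent samples, force the terminal state to zero,
    minimize the quadratic cost; lstsq handles the degenerate KKT system."""
    E = np.vstack([Hu[:Tini], Hx[:2 * Tini], Hx[-2:]])
    c = np.concatenate([u_past.reshape(-1), x_past.reshape(-1), np.zeros(2)])
    n_g = Hu.shape[1]
    KKT = np.block([[2 * M, E.T], [E, np.zeros((E.shape[0], E.shape[0]))]])
    sol, *_ = np.linalg.lstsq(KKT, np.concatenate([np.zeros(n_g), c]), rcond=None)
    return (Hu @ sol[:n_g]).reshape(L, 1)[Tini]   # first predicted future input

# Online: warm up with zero input, then run the receding horizon loop
x = np.array([2.0, -1.0])
past_u, past_x = np.zeros((Tini, 1)), np.zeros((Tini, 2))
for k in range(Tini):
    past_x[k] = x
    x = A @ x
x0_norm = np.linalg.norm(x)
for t in range(60):
    u0 = control_step(past_u, past_x)
    past_u = np.vstack([past_u[1:], u0.reshape(1, 1)])
    past_x = np.vstack([past_x[1:], x.reshape(1, 2)])
    x = A @ x + B @ u0
```

The closed-loop state is driven to (numerical) zero even though the controller never uses $A$, $B$ after data collection; the plant matrices appear only to simulate the true system.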
The rest of the section proceeds by first showing that the controller resulting from Algorithm 1 is stabilizing even when the network optimization (4) is solved only approximately; and then introducing a particular distributed solver for (4) along with a suboptimality certificate to check, in a distributed manner, the stopping condition in Step 4 of Algorithm 1.
V-A Stability Analysis of Closed-Loop System
In the rest of the paper, we rely on the following assumption.
Assumption V.1.
Since the system is controllable (cf. (i)), a sufficient condition for guaranteeing the feasibility in (ii) is . In fact, in that case, there exists a trajectory such that . It follows that is feasible for (2), so by Proposition IV.4, there exists some such that is feasible for (4).
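The feasibility argument rests on controllability: any initial state can be steered to the origin in at most $n$ steps, so a trajectory ending at zero always exists once the horizon is long enough. A quick numerical check on an assumed toy pair:

```python
import numpy as np

# Assumed controllable toy pair (not the paper's system)
A = np.array([[1.2, 0.4], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
n = 2

# Controllability matrix ordered so column k multiplies u_k in
#   x_n = A^n x0 + A^{n-1} B u_0 + ... + B u_{n-1}
C = np.column_stack([np.linalg.matrix_power(A, n - 1 - k) @ B for k in range(n)])
x0 = np.array([1.0, -2.0])
u_stack, *_ = np.linalg.lstsq(C, -np.linalg.matrix_power(A, n) @ x0, rcond=None)

# Applying the computed inputs drives the state to the origin in n steps
x = x0.copy()
for k in range(n):
    x = A @ x + B @ u_stack[k:k + 1]
```

Since $C$ has full rank for a controllable pair, the linear system always admits a solution, and the resulting trajectory padded with zeros is feasible for any longer horizon.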
Under Assumption V.1, the closed-loop system with the controller corresponding to a receding horizon implementation of (2) is globally exponentially stable, cf. [23, Theorem 12.2]. Unlike [19], we do not assume we have access to the exact solution of (4) since distributed optimization algorithms typically only converge asymptotically to the true optimizer and must be terminated in finite time. Here we show that Algorithm 1 still stabilizes the system when the tolerance is sufficiently small.
Theorem V.2.
Proof.
Let be the feedback corresponding to a receding horizon implementation of (2). Consider the system where . Let , where is an optimizer of (2). Because (4) is nondegenerate, is continuously differentiable, cf. [23, Theorem 6.9], and the system is input-to-state stable (ISS) with being an ISS-Lyapunov function satisfying for constants , cf. [24, Theorem 1] (although the result is stated there for systems where is Schur stable, the same reasoning is valid when there are no state or input constraints and is unstable). Because the system without disturbances is exponentially stable [25], it follows by [26, proof of Lemma 3.5] that the gain function is linear, so there exist and a class function such that
for all . Let be a trajectory of the closed-loop dynamics of (1) with the controller described by Algorithm 1 where and define . It follows that . Note that , where and so it follows that We claim that for all , there exists such that whenever . The case when follows by observing that and there exists such that for all . If the claim holds for some , then for all ,
so by choosing such that for all , then for all and the claim follows by induction. Hence To show global Lyapunov stability, let be arbitrary, and suppose that is chosen so that and . Then for all ,
If , then . It follows by induction on that for all . ∎
V-B Primal-Dual Solver for Network Optimization
In this section we introduce a method for solving the optimization problem (4) in a distributed way. We let
and . Note , where for and otherwise. Problem (4) can be written as
(5)
subject to
for suitable , , , and , with . The Lagrangian of (5) is
If is an optimizer of the dual problem, then the pair is a (min-max) saddle point of , meaning that for all and . The saddle-point property of the Lagrangian suggests that the primal-dual flow, which descends along the gradient of the primal variable and ascends along the gradient of the dual variable,
$\dot{z} = -\nabla_z \mathcal{L}(z, \lambda), \qquad \dot{\lambda} = \nabla_\lambda \mathcal{L}(z, \lambda) \qquad (6)$
can be used to compute the optimizer. Here, is the matrix such that . By [22, Corollary 4.5], the flow converges asymptotically to a saddle point of . This procedure is fully distributed, since the flow equations in (6) can be computed with the information available to each agent or its direct neighbors. In particular, if , then the message agent shares with agent consists of , which is (cf. Remark IV.5).
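The saddle-point flow can be illustrated on a small equality-constrained quadratic program (an assumed, well-conditioned toy instance, not the matrices of (5)): a forward-Euler discretization descends in the primal variable, ascends in the dual, and converges to the KKT point.

```python
import numpy as np

# Toy equality-constrained QP:  min_z 0.5 z'Pz + q'z  s.t.  A z = b
# (well-conditioned instance chosen so the Euler-discretized flow converges)
P = 2.0 * np.eye(4)
q = np.array([1.0, 0.0, -1.0, 0.0])
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
b = np.array([1.0, -1.0])

# Saddle-point (primal-dual) dynamics, forward-Euler step h:
#   z   <- z   - h * (P z + q + A' lam)   (descend in the primal)
#   lam <- lam + h * (A z - b)            (ascend in the dual)
z, lam, h = np.zeros(4), np.zeros(2), 0.1
for _ in range(5000):
    z_next = z - h * (P @ z + q + A.T @ lam)
    lam = lam + h * (A @ z - b)
    z = z_next

# Compare with the KKT solution of the QP
KKT = np.block([[P, A.T], [A, np.zeros((2, 2))]])
sol = np.linalg.solve(KKT, np.concatenate([-q, b]))
err = np.linalg.norm(np.concatenate([z, lam]) - sol)
```

In the distributed setting, each agent would hold the components of the primal and dual variables associated with its own constraints, so every update above uses only neighbor information; the step size must be chosen small enough for the discretized flow to remain stable.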
We conclude by providing a certificate that can be used to verify the stopping condition of Step 4 in Algorithm 1.
Proposition V.3.
Proof.
The set of saddle points of is . Since the optimal input and state trajectories are unique, all saddle points share the property that, for each , the components of their equal . Given an arbitrary , we have for all , and hence . The set of saddle points can also be described as , for any . Therefore, . Since is the orthogonal projection onto ,
and therefore,
∎
The suboptimality certificate can be checked in a fully distributed manner using information locally available to each agent provided that is known. Because depends only on the objective and constraints , which in turn come from the sample trajectory , it can be computed offline. Finally, it is possible for each agent to compute a bound on using the fact that, for , one has for all . It follows that for all ,
for , so each agent can compute a bound on using data available to itself and its neighbors.
Figure 1: Closed-loop state trajectories and number of iterations per time step for (a) the Newman-Watts-Strogatz network and (b) the star network.
V-C Numerical Simulations
We simulate the proposed distributed data-based predictive controller on a Newman-Watts-Strogatz network [27] and a star network. In each case, and are chosen at random so that is controllable. The input sequence is chosen as , where is a matrix so that is marginally stable (the data does not need to be generated from a stable system, but this is done to avoid numerical issues), and is a Gaussian white noise process. We use Proposition IV.2 to ensure that the data is informative enough for data-driven control. In both cases, condition (i) is satisfied. Condition (ii) fails for the Newman-Watts-Strogatz network, but is satisfied by the star network (cf. Remark IV.3). We integrate the primal-dual flow (6), using the stopping condition in Proposition V.3 to terminate it. Fig. 1 shows the closed-loop state trajectories and the number of iterations at each time step for different values of . For the Newman-Watts-Strogatz network, cf. Fig. 1(a), the time horizon is . For the star network, cf. Fig. 1(b), the time horizon is , but the optimization at each time step is still feasible. In both cases, the distributed data-based predictive controller better approximates the exact MPC for smaller values of at the cost of more iterations per time step.
VI Conclusions and Future Work
We have introduced a distributed data-based predictive controller for stabilizing network systems with linear dynamics and unknown system matrices. Instead of building a dynamic model, agents learn a non-parametric representation based on a single trajectory and use it to implement a controller as the solution of a network optimization problem, solved in a receding horizon manner and in a distributed way. Future work will explicitly quantify the tolerance in terms of the available data and study ways to construct a terminal cost, without knowledge of the underlying model, that guarantees stability when the stabilizing terminal constraint is omitted. We plan to extend the results to cases where there are constraints on the state and input, characterize the robustness properties of the introduced control scheme, investigate ways of improving its scalability, and consider more general scenarios, including the presence of noise in the data, inputs not persistently exciting of sufficiently high order, and partial observations of the network state. We also plan to explore improvements to the primal-dual flow to solve the optimization problem with fewer iterations and less communication between the agents.
References
- [1] F. Bullo, J. Cortés, and S. Martinez, Distributed Control of Robotic Networks. Applied Mathematics Series, Princeton University Press, 2009.
- [2] M. Mesbahi and M. Egerstedt, Graph Theoretic Methods in Multiagent Networks. Applied Mathematics Series, Princeton University Press, 2010.
- [3] R. R. Negenborn and J. M. Maestre, “Distributed model predictive control: An overview and roadmap of future research opportunities,” IEEE Control Systems, vol. 34, no. 4, pp. 87–97, 2014.
- [4] M. Rotkowitz and S. Lall, “A characterization of convex problems in decentralized control,” IEEE Transactions on Automatic Control, vol. 51, no. 2, pp. 274–286, 2006.
- [5] F. Lin, M. Fardad, and M. R. Jovanovic, “Design of optimal sparse feedback gains via the alternating direction method of multipliers,” IEEE Transactions on Automatic Control, vol. 58, no. 9, pp. 2426–2431, 2013.
- [6] G. Fazelnia, R. Madani, A. Kalbat, and J. Lavaei, “Convex relaxation for optimal distributed control problems,” IEEE Transactions on Automatic Control, vol. 62, no. 1, pp. 206–221, 2017.
- [7] L. Furieri, Y. Zheng, A. Papachristodoulou, and M. Kamgarpour, “On separable quadratic Lyapunov functions for convex design of distributed controllers,” in European Control Conference, (Naples, Italy), pp. 42–49, 2019.
- [8] J. Kober, J. A. Bagnell, and J. Peters, “Reinforcement learning in robotics: A survey,” International Journal of Robotics Research, vol. 32, no. 11, pp. 1238–1274, 2013.
- [9] L. Buşoniu, R. Babuška, and B. De Schutter, “A comprehensive survey of multiagent reinforcement learning,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 38, no. 2, pp. 156–172, 2008.
- [10] B. Recht, “A tour of reinforcement learning: The view from continuous control,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 2, pp. 253–279, 2019.
- [11] D. Jia and B. Krogh, “Min-max feedback model predictive control for distributed control with communication,” in American Control Conference, (Anchorage, AK), pp. 4507–4512, 2002.
- [12] W. B. Dunbar, “Distributed receding horizon control of dynamically coupled nonlinear systems,” IEEE Transactions on Automatic Control, vol. 52, no. 7, pp. 1249–1263, 2007.
- [13] B. T. Stewart, A. N. Venkat, J. B. Rawlings, S. J. Wright, and G. Pannocchia, “Cooperative distributed model predictive control,” Systems & Control Letters, vol. 59, no. 8, pp. 460–469, 2010.
- [14] L. Ljung, System Identification: Theory for the User. Prentice Hall information and system sciences series, Prentice Hall, 1999.
- [15] J. C. Willems, P. Rapisarda, I. Markovsky, and B. L. M. De Moor, “A note on persistency of excitation,” Systems & Control Letters, vol. 54, no. 4, pp. 325–329, 2005.
- [16] U. Park and M. Ikeda, “Stability analysis and control design of LTI discrete-time systems by the direct use of time series data,” Automatica, vol. 45, no. 5, pp. 1265–1271, 2009.
- [17] T. M. Maupong and P. Rapisarda, “Data-driven control: A behavioral approach,” Systems & Control Letters, vol. 101, pp. 37–43, 2017.
- [18] C. De Persis and P. Tesi, “Formulas for data-driven control: Stabilization, optimality and robustness,” arXiv preprint arXiv:1903.06842, 2019.
- [19] J. Coulson, J. Lygeros, and F. Dörfler, “Data-enabled predictive control: in the shallows of the DeePC,” in European Control Conference, (Naples, Italy), pp. 307–312, 2019.
- [20] J. Berberich, J. Köhler, M. A. Müller, and F. Allgöwer, “Data-driven model predictive control with stability and robustness guarantees,” arXiv preprint arXiv:1906.04679, 2019.
- [21] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
- [22] A. Cherukuri, B. Gharesifard, and J. Cortés, “Saddle-point dynamics: conditions for asymptotic stability of saddle points,” SIAM Journal on Control and Optimization, vol. 55, no. 1, pp. 486–511, 2017.
- [23] F. Borrelli, A. Bemporad, and M. Morari, Predictive Control for Linear and Hybrid Systems. Cambridge, UK: Cambridge University Press, 2017.
- [24] A. Jadbabaie and A. S. Morse, “On the ISS property for receding horizon control of constrained linear systems,” IFAC Proceedings Volumes, vol. 35, no. 1, pp. 37–40, 2002.
- [25] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert, “Constrained model predictive control: Stability and optimality,” Automatica, vol. 36, pp. 789–814, 2000.
- [26] Z.-P. Jiang and Y. Wang, “Input-to-state stability for discrete-time nonlinear systems,” Automatica, vol. 37, no. 6, pp. 857–869, 2001.
- [27] M. E. J. Newman and D. J. Watts, “Renormalization group analysis of the small-world network model,” Physics Letters A, vol. 263, no. 4-6, pp. 341–346, 1999.