Ambiguity Tube MPC

Fan Wu, Mario E. Villanueva, Boris Houska

(ShanghaiTech University)

Abstract

This paper is about a class of distributionally robust model predictive controllers (MPC) for nonlinear stochastic processes that evaluate risk and control performance measures by propagating ambiguity sets in the space of state probability measures. A framework for formulating such ambiguity tube MPC controllers is presented, which is based on modern measure-theoretic methods from the field of optimal transport theory. Moreover, a supermartingale based analysis technique is proposed, leading to stochastic stability results for a large class of distributionally robust controllers for linear and nonlinear systems. In this context, we also discuss how to construct terminal cost functions for stochastic and distributionally robust MPC that ensure closed-loop stability and asymptotic convergence to robust invariant sets. The corresponding theoretical developments are illustrated by tutorial-style examples and a numerical case study.

1 Introduction

Traditional robust MPC formulations that systematically take model uncertainties and external disturbances into account can be categorized into two classes. The first class of robust MPC controllers are based on min-max [15, 29] or tube-based MPC formulations [21, 25, 28], which typically assume that worst-case bounds on the uncertainty are available. This is in contrast to the second class of optimization based robust controllers, namely stochastic MPC controllers [17, 26], which assume that the probability distribution of the external disturbance is known. The main practical difference between these formulations is that most stochastic MPC controllers attempt to either bound or penalize the probability of a constraint violation, but, in contrast to min-max MPC formulations, conservative worst-case constraints are not enforced.

In terms of recent developments in the field of robust MPC, several attempts have been made to unify the above classes by considering distributionally robust MPC controllers [35]. Here, one assumes that the uncertainty is stochastic, but the associated probability distribution is only known to be in a given ambiguity set. Thus, in the most general setting, distributionally robust MPC formulations contain both traditional stochastic MPC as well as min-max MPC as special cases: in the context of stochastic MPC the ambiguity set is a singleton whereas min-max MPC is based on ambiguity sets that contain all uncertainty distributions with a given bounded support. Notice that modern distributionally robust MPC formulations are often formulated by using risk measures [33]. This trend is motivated by the availability of rather general classes of coherent—and, most importantly, computationally tractable—risk measures, such as the conditional value at risk [30].

The current paper focuses on distributionally robust MPC problems that are formulated as ambiguity controllers. Here, the main idea is to propagate sets in the space of probability measures on the state space. The primary motivation for analyzing such classes of controllers is, however, not to develop yet another way to formulate robust MPC, but to develop a coherent stability theory for a very general class of distributionally robust MPC controllers that contain traditional tube based MPC as well as stochastic MPC as a special case.

Now, before we can outline why such a general stability theory for ambiguity controllers is of fundamental interest for control theory research—especially, in the emerging era of learning based MPC [13, 41], where uncertain models are omni-present—one has to first mention that there exist already many stability results for MPC. First of all, the stability of classical (certainty-equivalent) MPC has been analyzed by many authors—be it for tracking or economic MPC, with or without terminal costs or regions [7, 12, 29]. Similarly, the stability of variants of min-max MPC schemes have been analyzed exhaustively [25, 40], although the development of a unified stability analysis for general set-based MPC controllers is a topic of ongoing research [38].

The above reviewed results on the stability of classical certainty-equivalent and tube MPC controllers have in common that they are all based on the construction of Lyapunov functions, which descend along the closed-loop trajectories of the robust controller. This is in contrast to existing results on the stability of stochastic MPC, which are usually based on the theory of non-negative supermartingales [9, 10]. The corresponding mathematical foundation for such stability results has been developed by H.J. Kushner and R.S. Bucy for analyzing general Markov processes. The corresponding original work can be found in [5] and [19]. We additionally refer to [20] for an review of the history of this theory. At the current status of research on stochastic MPC, martingale theory has been applied to special classes of linear MPC controllers with multiplicative uncertainty [2]. Moreover, an impressive collection of articles by M. Cannon and B. Kouvaritakis has appeared during the last two decades, which has had significant impact on shaping the state-of-the-art of stochastic MPC. As we cannot possibly list all of the papers of these authors, we refer at this point to their textbook [17] for an overview of formulations and stability results for stochastic linear MPC. Additionally, the book chapter [18] comes along with an excellent overview of recursive feasibility results for stochastic MPC as well as a proof of stochastic stability with respect to ellipsoidal regions that are derived by using non-negative supermartingales, too.

Given the above list of articles it can certainly be stated that the question how to establish stability results for both robust mix-max as well as stochastic MPC has received significant attention and much progress has been made. Nevertheless, looking back at the MPC literature from the last decade, it must also be stated that this question has raised a critical discussion. For example, a general critique of robust MPC can be found in [24], which points out the lack of a satisfying treatment of stabilizing terminal conditions for stochastic MPC. Similarly, [6] points out various discrepancies between deterministic and stochastic MPC. From these articles, it certainly becomes clear that one has to distinguish carefully between rigorous supermartingale based stability analysis in the above reviewed sense of Kushner and Bucy and weaker properties of stochastic MPC from which stability may not be inferred. Among these weaker properties are bounds on the asymptotic average performance of stochastic MPC [17, 18], which do not necessarily imply stability. Moreover, in the past 5 years several articles have appeared, which all exploit input-to-state-stability assumptions for establishing convergence of stochastic MPC controllers to robust invariant sets [23, 32]. One of the strongest results in this context appeared only a few weeks ago in [27], where an input-to-state-stability assumption in combination with the Borel-Cantelli lemma is used to establish conditions under which the state of a potentially nonlinear Markov process converges almost surely to a minimal robust invariant set. These conditions are applicable for establishing convergence of a variety of stochastic MPC formulations. Nevertheless, it has to be recalled here that, in general, neither stability implies convergence nor convergence implies stability. As such, none of the above reviewed articles proposes a completely satisfying answer to the question how asymptotic stability conditions can be established for general stochastic, let alone distributionally robust, MPC.

Contribution.

This paper is about the mathematical formulation and stochastic stability analysis of distributionally robust MPC controllers for general, potentially nonlinear, but Lipschitz continuous stochastic discrete-time systems. Here, the focus is on so-called ambiguity tube MPC controllers that are based on the propagation of sets in the space of state probability measures, such that all theoretical results are applicable to traditional stochastic as well as tube based min-max MPC as special cases. The corresponding contributions of the current article can be outlined as follows.

1.

Section 2 develops a novel framework for formulating ambiguity tube MPC problems by exploiting measure-theoretic concepts from the field of modern optimal transport theory [37]. In detail, we propose a Wasserstein metric based setting, which leads to a well-formulated class of ambiguity tube MPC controllers that admit a continuous value function; see Theorem 1.
2.

Section 3 presents a complete stability analysis for ambiguity tube MPC that is applicable to general Lipschitz continuous stochastic discrete-time system under mild assumptions on the coherentness of the optimized performance and risk measures, as well as on the consistency of the terminal cost function of the MPC controller. In detail, Theorem 2 establishes conditions under which the cost function of the ambiguity tube MPC controller is a non-negative supermartingale along its closed-loop trajectories, which can then be used to establish robust stability or—under a slightly stronger regularity assumption—robust asymptotic stability of the closed loop system in a stochastic sense, as summarized in Theorems 3 and 4. Notice that these stability results are both more general than the existing stability and convergence statements about stochastic MPC in [6, 18, 27], as they apply to general nonlinear systems and ambiguity set based formulations. Besides, Theorem 4 establishes conditions under which the closed-loop trajectories of stochastic (or ambiguity based) MPC is asymptotically stable with respect to a minimal robust invariant set. This is in contrast to the less tight results in [18], which only establish stability and convergence of linear stochastic MPC closed-loop systems with respect to an ellipsoidal set that overestimates the actual (in general, non-ellipsoidal) limit set of the stochastic ancillary closed-loop system.
3.

Section 4 discusses how to implement ambiguity tube MPC in practice. Here, our focus is—for the sake of simplicity of presentation—on linear systems, although remarks on how this can be implemented for nonlinear systems are provided, too. The purpose of this section is to illustrate how the technical assumptions from Section 3 can be satisfied in practice. In this context, a relevant technical contribution is summarized in Lemma 3, which explains how to construct stabilizing terminal cost functions for stochastic and ambiguity tube MPC.

Notice that as much this paper attempts to make a step forward towards a more coherent stability analysis and treatment of stabilizing terminal conditions for stochastic and distributionally robust MPC, it must be stated clearly that we do not claim to be anywhere close to addressing the long list of conceptual issues of robust MPC that D. Mayne summarized in his critique [24]. Nevertheless, in order to assess the role of this paper in the ongoing development of robust MPC, Section 5 does not only summarize and interpret the contributions of this paper, but also comments on the long list of open issues that research on robust MPC is currently facing.

Notation.

If $(R,r)$ is a metric space with given distance function $r:R\times R\to\mathbb{R}_{+}$ , we use the notation $\mathbb{K}(R)$ to denote the set of compact subsets of $R$ —assuming that it is clear from the context what $r$ is. Similarly, if $(R_{1},r_{1})$ and $(R_{2},r_{2})$ are two metric spaces, we use the notation $\mathcal{L}(R_{1},R_{2})$ to denote the set of Lipschitz continuous functions from $R_{1}$ to $R_{2}$ with respect to the distance functions $r_{1}$ and $r_{2}$ . Moreover, $\mathcal{L}_{1}(R_{1},R_{2})$ denotes the subset of $\mathcal{L}(R_{1},R_{2})$ that consist of all functions from $R_{1}$ to $R_{2}$ whose Lipschitz constant is smaller than or equal to $1$ . Finally, $R_{1}\times R_{2}$ is again a metric space with distance function

r((a,b),(c,d))\;\overset{\mathrm{def}}{=}\;r_{1}(a,c)+r_{2}(b,d)

for all $a,c\in R_{1}$ and all $b,d\in R_{2}$ . If nothing else is stated, we assume that the distance function in the new metric space $R_{1}\times R_{2}$ is constructed in the above additive way—without always saying this explicitly. Additionally, at some points in this paper, we make use of the notation

\mathrm{dist}_{q}(a,B)\;\overset{\mathrm{def}}{=}\;\min_{b\in B}\;\|a-b\|_{q}

to denote the distance of a point $a\in\mathbb{R}^{n}$ to a compact set $B\in\mathbb{K}(\mathbb{R}^{n})$ , where $\|\cdot\|_{q}$ denotes the Hölder $q$ -norm.

For the special case that $R_{1}\in\mathbb{K}(\mathbb{R}^{n})$ and $R_{2}\in\mathbb{K}(\mathbb{R}^{m})$ are equipped with the standard Euclidean distance, we denote with

d_{\mathcal{L}}:\mathcal{L}(R_{1},R_{2})\times\mathcal{L}(R_{1},R_{2})\to\mathbb{R}_{+}

the corresponding Hilbert-Sobolev distance,

d_{\mathcal{L}}(\mu,\nu)\;\overset{\mathrm{def}}{=}\;\sqrt{\int\|\mu-\nu\|_{2}^{2}+\|\nabla\mu-\nabla\nu\|_{2}^{2}\;\mathrm{d}x}\;,

which is defined for all $\mu,\nu\in\mathcal{L}(R_{1},R_{2})$ , where $\nabla$ denotes the weak gradient operator recalling that Lipschitz continuous functions are differentiable almost everywhere. With this notation, $(\mathcal{L}(R_{1},R_{2}),d_{\mathcal{L}})$ is a metric space.

Next, let $X\in\mathbb{K}(\mathbb{R}^{n})$ denote a given compact set in $\mathbb{R}^{n}$ . We use the notation $\mathcal{P}(X)$ to denote the set of Borel probability measures on $X$ . Notice that this definition is such that $p(X)=1$ for all $p\in\mathcal{P}(X)$ ; and $\mathcal{P}(X)$ is convex. In this context, we additionally introduce the notation $\mathcal{B}(X)$ to denote the associated Borel $\sigma$ -algebra of $X$ . Notice that $\mathcal{P}(X)$ turns out to be a metric space with respect to the Wasserstein distance function¹¹1A very detailed review of the history and mathematical properties of Wasserstein distances can be found in [37, Chapter 6], which is defined as follows.

Definition 1

Throughout this paper, we use the notation $d_{\mathrm{W}}:\mathcal{P}(X)\times\mathcal{P}(X)\to\mathbb{R}_{+}$ to denote the Wasserstein distance, given by

\forall p,q\in\mathcal{P}(X),\quad d_{\mathrm{W}}(p,q)\;\overset{\mathrm{def}}{=}\sup_{\varphi\in\mathcal{L}_{1}(X,\mathbb{R})}\left|\int\varphi\,\mathrm{d}(p-q)\right|\,.

Notice that $d_{\mathrm{W}}$ is well-defined in our context, as all Lipschitz continuous functions are $\mathcal{B}(X)$ -measurable and, consequently, the integrals in the above definition exist and are finite, since we assume that $X$ is compact.

Throughout this paper, we use the notation $\delta_{y}\in\mathcal{P}(X)$ to denote the Dirac measure at a point $y\in X$ , given by

\forall Y\in\mathcal{B}(X),\qquad\delta_{y}(Y)\;\overset{\mathrm{def}}{=}\;\left\{\begin{array}[]{ll}1\quad\text{if}\;y\in Y\\ 0\quad\text{otherwise}\;.\end{array}\right.

Because this paper makes intense use of the concept of ambiguity sets, we additionally introduce the shorthand notation

\mathcal{K}(X)\;\overset{\mathrm{def}}{=}\;\mathbb{K}(\mathcal{P}(X))

to denote the set of subsets of $\mathcal{P}(X)$ that are compact in the Wasserstein space $(\mathcal{P}(X),d_{\mathrm{W}})$ . By construction, $\mathcal{K}(X)$ is itself a metric space with respect to the Hausdorff-Wasserstein distance,

d_{\mathrm{H}}(P,Q)\overset{\mathrm{def}}{=}\max\left\{\max_{p\in P}\min_{q\in Q}\,d_{\mathrm{W}}(p,q),\,\max_{q\in Q}\min_{p\in P}\,d_{\mathrm{W}}(p,q)\right\},

defined for all $P,Q\in\mathcal{K}(X)$ .

Remark 1

If $w$ is a random variable with given Lebesgue integrable probability distribution $\rho:\mathbb{R}^{n}\to\mathbb{R}_{+}$ , the probability of the event $w\in W$ for a Borel set $W\subseteq\mathbb{R}^{n}$ is denoted by

\mathrm{Pr}(w\in W)\ \overset{\mathrm{def}}{=}\ \int_{W}1\,\mathrm{d}p\ \overset{\mathrm{def}}{=}\ p(W)\ \overset{\mathrm{def}}{=}\ \int_{W}\rho(w)\,\mathrm{d}w,

where $p\in\mathcal{P}(\mathbb{R}^{n})$ is called the probability measure of $w$ . Notice that all four notations are, by definition, equivalent. Many articles on stochastic control, for instance, [5, 19, 20] work with measures rather than probability distributions, as this has many technical advantages [37]. This means that we specify the probability measure $p$ rather than the probability distribution $\rho$ . Notice that the relation

\rho=\frac{\mathrm{d}p}{\mathrm{d}w}

holds, where the right-hand expression denotes the Radon-Nikodyn derivative of the measure $p$ with respect to the traditional Lebesgue measure [34].

2 Control of Ambiguity Tubes

This section introduces a general class of uncertain control system models and their associated Markovian kernels that can be used to propagate state probability measures. Section 2.3 exploits the properties of these Markovian kernels to introduce a topologically coherent framework for defining associated ambiguity tubes that have compact cross-section in $\mathcal{K}(X)$ . Moreover, Section 2.4 focuses on an axiomatic characterization of proper risk and performance measures, which are then used in Section 2.5 to introduce a general class of ambiguity tube MPC controllers, completing our problem formulation.

2.1 Uncertain Control Systems

This paper concerns uncertain nonlinear discrete-time control systems of the form

\displaystyle\forall k\in\mathbb{N}_{0},\qquad x_{k+1}=f(x_{k},u_{k},w_{k})\;,

(1)

where $x_{k}$ denotes the state at time $k\in\mathbb{N}_{0}$ , evolving in the state-space domain $X\in\mathbb{K}(\mathbb{R}^{n_{x}})$ . Moreover, $u_{k}\in U$ denotes the control input at time $k$ with domain $U\in\mathbb{K}(\mathbb{R}^{n_{u}})$ and, finally, $w_{k}$ denotes an uncertain external disturbance with compact support $W\in\mathbb{K}(\mathbb{R}^{n_{w}})$ .

Assumption 1

We assume that the right-hand side function $f$ satisfies

\displaystyle f\in\mathcal{L}\left(\,X\times U\times W,\,X\,\right)\;.

(2)

The above assumption does not only require the function $f$ to be Lipschitz continuous, but it also requires that the image set of $f$ must be contained in $X$ . Consequently, it should be mentioned that, throughout this paper, we interpret the set $X$ as a sufficiently large region of interest in which we wish to analyze the system’s behavior. Consequently, if $f$ is Lipschitz continuous on $\mathbb{R}^{n_{x}}$ but not bounded, we redefine $f\leftarrow\mathrm{proj}_{X}\circ f$ with $\mathrm{proj}_{X}$ denoting a Lipschitz continuous projection onto $X$ , such that (2) holds by construction. Notice that the set $X$ should not be mixed up with the set $\mathbb{X}\subseteq X$ that could, for example, model a so-called state-constraint; that is, a region in which we would like to keep the system state with high probability.

In the following, we additionally introduce a compact set $\mathcal{U}\in\mathbb{K}(\mathcal{L}(X,U))$ in order to denote a class of ancillary feedback laws in the metric space $(\mathcal{L}(X,U),d_{\mathcal{L}})$ . In the context of this paper, $\mathcal{U}$ models a suitable class of computer representable feedback laws.

Example 1

One of the easiest examples for a class of computer representable feedback policies is the set

\mathcal{U}=\left\{\,x\to Kx+k\,\middle|\begin{array}[]{l}\exists k\in\mathbb{R}^{n_{u}},\exists K\in\mathbb{R}^{n_{u}\times n_{x}}:\\ \ \ \|K\|\leq L\end{array}\right\}

of affine control laws with bounded feedback gain, where $L<\infty$ is a given bound on the norm of the matrix $K$ , such that all functions in $\mathcal{U}$ are Lipschitz continuous. Specific feedback laws can in this case be represented by storing the finite dimensional matrix $K$ and the offset vector $k$ .

2.2 Models of Stochastic Uncertainties

In order to refine the above uncertain system model, we introduce the probability spaces $(W,\mathcal{B}(W),\omega_{k})$ . Here, $\omega_{k}\in\mathcal{P}(W)$ denotes the probability measures of the random variable $w_{k}:W\to\mathbb{R}$ such that

\forall W^{\prime}\in\mathcal{B}(W),\quad\mathrm{Pr}(w_{k}\in W^{\prime})=\omega_{k}(W^{\prime})\;.

In the most general setting, we might not know the measure $\omega_{k}$ up to a high precision, but we work with the more realistic assumption that an ambiguity set $\Omega\in\mathcal{K}(W)$ is given, which means that the sequence $\omega_{k}$ is only known to satisfy $\omega_{k}\in\Omega$ for all $k\in\mathbb{N}_{0}$ .

In order to proceed with this modeling assumption, we analyze the closed-loop system

\displaystyle\forall k\in\mathbb{N}_{0},\qquad x_{k+1}=f(x_{k},\mu(x_{k}),w_{k})\;,

(3)

for a given feedback law $\mu\in\mathcal{U}$ . Because the disturbance sequence $w_{0},w_{1},\ldots$ consists of independent random variables, the states $x_{k}$ are random variables, too. Now, if $p_{k}\in\mathcal{P}(X)$ denotes a probability measure that is associated with $x_{k}$ , then the probability measure $p_{k+1}\in\mathcal{P}(X)$ of the random variable $x_{k+1}$ in (3) is a function of $p_{k}$ , $\mu$ , and $\omega_{k}$ . Formally, this propagation of measures can be defined by using a parametric Markovian kernel, $\mathcal{N}[x,\mu,\omega]:\mathcal{B}(X)\to\mathbb{R}$ , given by

\mathcal{N}[x,\mu,\omega](X^{+})\overset{\mathrm{def}}{=}\omega\left(\;\left\{\,w\in W\,\middle|\,f(x,\mu(x),w)\in X^{+}\,\right\}\;\right)

for all Borel sets $X^{+}\in\mathcal{B}(X)$ . The corresponding transition map

\displaystyle\Phi(p,\mu,\omega)\;\overset{\mathrm{def}}{=}\;\int_{X}\mathcal{N}[x,\mu,\omega]\,p(\mathrm{d}x)

(4)

is then well-defined for all $p\in\mathcal{P}(X)$ , all $\mu\in\mathcal{U}$ , and all probability measures $\omega\in\mathcal{P}(W)$ , where $\mathrm{d}x$ denotes the traditional Lebesgue measure [10]. This follows from our assumption that $f$ is Lipschitz continuous such that the Markovian kernel $\mathcal{N}[\cdot,\mu,\omega](X^{+})$ is for any given Borel set $X^{+}$ a $\mathcal{B}(X)$ -measurable function in $x$ . In summary, the associated recursion for the sequence of measures $p_{0},p_{1},p_{2},\ldots$ can be written in the form

\displaystyle\forall k\in\mathbb{N}_{0},\qquad p_{k+1}=\Phi(p_{k},\mu,\omega_{k})\;.

(5)

The following technical lemma establishes an important property of the function $\Phi$ .

Lemma 1

If Assumption 1 holds, then we have

\Phi\in\mathcal{L}(\;\mathcal{P}(X)\times\mathcal{U}\times\mathcal{P}(W),\;\mathcal{P}(X)\;)\,;

that is, $\Phi$ is jointly Lipschitz continuous with respect to all three input arguments. Here, we recall that $\mathcal{P}(X)$ and $\mathcal{P}(W)$ are both metric spaces with respect to their associated Wasserstein distance functions $d_{\mathrm{W}}$ , while $\mathcal{U}$ is equipped with the Hilbert-Sobolev distance function $d_{\mathcal{L}}$ .

Proof. First of all, as mentioned above, $\Phi$ is well defined by (4), as Assumption 1 ensures not only that $f$ is a Lipschitz continuous function but also that the image set of $f$ is contained in $X$ —such that the image set of $\Phi$ is contained in $\mathcal{P}(X)$ . Now, let $p,q\in\mathcal{P}(X)$ and $\omega,\xi\in\mathcal{P}(W)$ be given measures and $\mu,\nu\in\mathcal{U}$ given feedback laws. We set $p^{+}=\Phi(p,\mu,\omega)$ and $q^{+}=\Phi(q,\nu,\xi)$ . The definition of the Wasserstein metric implies that

\displaystyle\begin{array}[]{l}\displaystyle d_{\mathrm{W}}(p^{+},q^{+})=\underset{\varphi\in\mathcal{L}_{1}(X,\mathbb{R})}{\sup}\left|\int\varphi\,\mathrm{d}p^{+}-\int\varphi\,\mathrm{d}q^{+}\right|\\[11.38092pt] \displaystyle=\underset{\varphi\in\mathcal{L}_{1}(X,\mathbb{R})}{\sup}\left|\int\int\varphi\circ f_{\mu}\,\mathrm{d}p\,\mathrm{d}\omega-\int\int\varphi\circ f_{\nu}\,\mathrm{d}q\,\mathrm{d}\xi\right|\end{array}

(7)

Here, $\circ$ denotes the composition operator and, additionally, we have introduced the shorthand notation

\forall\mu\in\mathcal{U},\qquad f_{\mu}(x,w)\;\overset{\mathrm{def}}{=}\;f(x,\mu(x),w)\;.

Because we assume that $\mathcal{U}\in\mathbb{K}(\mathcal{L}(X,U))$ and because the particular definition of the Hilbert-Sobolev distance implies that all functions in $\mathcal{U}$ are uniformly Lipschitz continuous, the functions $f_{\mu}$ are—by construction—uniformly Lipschitz continuous over the compact set $\mathcal{U}$ . Let $\gamma_{1}<\infty$ denote the associated uniform Lipschitz constant. Since $\varphi$ is $1$ -Lipschitz continuous, the functions of the form $\varphi\circ f_{\mu}$ are also Lipschitz continuous with uniform Lipschitz constant $1*\gamma_{1}=\gamma_{1}$ . Thus, the triangle inequality yields the estimate

\displaystyle\begin{array}[]{l}\displaystyle\left|\int\int\varphi\circ f_{\mu}\,\mathrm{d}p\,\mathrm{d}\omega-\int\int\varphi\circ f_{\nu}\,\mathrm{d}q\,\mathrm{d}\xi\right|\\[11.38092pt] \displaystyle\leq\;\left|\int\int\varphi\circ f_{\mu}\,\mathrm{d}p\,\mathrm{d}\omega-\int\int\varphi\circ f_{\mu}\,\mathrm{d}q\,\mathrm{d}\omega\right|\\[5.69046pt] \displaystyle\hphantom{\leq\;}+\left|\int\int\varphi\circ f_{\mu}\,\mathrm{d}q\,\mathrm{d}\omega-\int\int\varphi\circ f_{\mu}\,\mathrm{d}q\,\mathrm{d}\xi\right|\\[5.69046pt] \displaystyle\hphantom{\leq\;}+\left|\int\int\left[\varphi\circ f_{\mu}-\varphi\circ f_{\nu}\right]\,\mathrm{d}q\,\mathrm{d}\xi\right|\\[11.38092pt] \displaystyle\leq\;\gamma_{1}\cdot d_{\mathrm{W}}(p,q)+\gamma_{1}\cdot d_{\mathrm{W}}(\omega,\xi)\\[7.11317pt] \displaystyle\hphantom{\leq\;}+\left|\int\int\left[\varphi\circ f_{\mu}-\varphi\circ f_{\nu}\right]\,\mathrm{d}q\,\mathrm{d}\xi\right|\;,\end{array}

(14)

which holds uniformly for all $\varphi\in\mathcal{L}_{1}(X,\mathbb{R})$ and all functions $\mu,\nu\in\mathcal{U}$ . Additionally, since $q$ and $\xi$ are probability measures, the last integral term can be bounded as

\left|\int\int\left[\varphi\circ f_{\mu}-\varphi\circ f_{\nu}\right]\,\mathrm{d}q\,\mathrm{d}\xi\right|\leq\gamma_{2}\cdot|X|\cdot|W|\cdot d_{\mathcal{L}}(\mu,\nu)\;,

where $\gamma_{2}$ denotes the Lipschitz constant of $f$ with respect to its second argument, $|X|$ the diameter of the compact set $X$ and $|W|$ the diameter of the set $W$ . Finally, by substituting all the above inequalities we find that

d_{\mathrm{W}}(p^{+},q^{+})\leq\gamma\left(d_{\mathrm{W}}(p,q)+d_{\mathrm{W}}(\omega,\xi)+d_{\mathcal{L}}(\mu,\nu)\right)

with $\gamma=\max\{\gamma_{1},\gamma_{2}\cdot|X|\cdot|W|\}$ . But this inequality implies that $\Phi$ is indeed Lipschitz continuous. $\diamond$

Remark 2

The proof of Lemma 1 relies on the properties of Wasserstein (Kantorovich-Rubinstein) distances, which have originally been introduced independently by several authors including Kantorovich [16] and Wasserstein [36], see also [37]. In order to explain why the Wasserstein metric is remarkably powerful in the context of control system analysis, let us briefly discuss what would have happened if we had used, for example, a total variation distance,

d_{\mathrm{TV}}(p,q)\;\overset{\mathrm{def}}{=}\;\sup_{A\in\mathcal{B}(X)}|p(A)-q(A)|\;,

instead of the Wasserstein distance in order to define the metric space $\mathcal{P}(X)$ . For this aim, we consider a scalar system with $f(x,u,w)=u$ and parametric feedback laws of the form $\mu_{\kappa}(x)={\kappa}$ , $\mathcal{U}=\{\mu_{\kappa}\mid{\kappa}\in[-1,1]\}$ . In this example, we have

	$\displaystyle d_{\mathrm{TV}}(\Phi(p,\mu_{\kappa},\omega),\Phi(p,\mu_{{\kappa}^{\prime}},\omega))$	$\displaystyle=$	$\displaystyle d_{\mathrm{TV}}(\delta_{\kappa},\delta_{{\kappa}^{\prime}})$		(17)
		$\displaystyle=$	$\displaystyle\left\{\begin{array}[]{ll}0&\text{if}\;{\kappa}={\kappa}^{\prime}\\ 1&\text{otherwise}\end{array}\right.$		(17)

implying that $\Phi$ is not Lipschitz continuous with respect to the parametric feedback law on $\mathcal{U}$ if we use the total variation distance to define our underlying metric space. In other words, the statement of Lemma 1 happens to be wrong in general, if we replace the Wasserstein distance with other distances such as the total variation distance.

2.3 Ambiguity Tubes

In this section, we generalize the considerations from the previous section by introducing ambiguity tubes. The motivation for considering such a general setting is twofold: firstly, in practice, one might not know the exact probability measure $\omega_{k}$ of the process noise $w_{k}$ , but only have a set $\Omega\in\mathcal{K}(W)$ of possible probability measures. For instance, one might know a couple of lower order moments of $w_{k}$ , such as the expected value and variance, while higher order moments are known less accurately or they could be even completely unknown. And secondly, in the context of nonlinear system analysis as well as in the context of high dimensional state spaces, a propagation of the exact state distribution can be difficult or impossible. In such cases, it may be easier to bound the true probability measure of the state by a so-called enclosure; that is, a set of probability measures that—in a suitable, yet to be defined sense—overestimates the actual probability measure of the state.

In order to prepare a mathematical definition of ambiguity tubes, we introduce an ambiguity transition map $F:\mathcal{K}(X)\times\mathcal{U}\to\mathcal{K}(X)$ , which is defined as

\displaystyle F(P,\mu)\;\overset{\mathrm{def}}{=}\;\left\{\;\Phi(p,\mu,\omega)\;\middle|\;p\in P,\;\omega\in\Omega\;\right\}

(18)

for all $P\in\mathcal{K}(X)$ and all $\mu\in\mathcal{U}$ . Here, the ambiguity set $\Omega\in\mathcal{K}(W)$ of possible disturbance probability measures is assumed to be given and constant. The following corollary is a direct consequence of Lemma 1.

Corollary 1

Let Assumption 1 hold. The function $F$ is Lipschitz continuous,

F\in\mathcal{L}(\;\mathcal{K}(X)\times\mathcal{U},\;\mathcal{K}(X)\;)\;,

recalling that $\mathcal{K}(X)$ is equipped with the Hausdorff-Wasserstein metric $d_{\mathrm{H}}$ .

Proof. First of all, the fact that the image sets of $F$ are compact follows from (18) and the fact that the function $\Phi$ is Lipschitz continuous. Next, $F$ directly inherits the Lipschitz continuity of $\Phi$ —including its Lipschitz constant $\gamma$ that has been introduced in the proof of Lemma 1—as $d_{\mathrm{H}}$ is the Hausdorff metric of $d_{\mathrm{W}}$ . $\diamond$

In order to formalize the concept of ambiguity enclosures, the following definition is introduced.

Definition 2

An ambiguity set $Q\in\mathcal{K}(X)$ is called an enclosure of an ambiguity set $P\in\mathcal{K}(X)$ , denoted by $P\preceq Q$ , if

\sup_{\varphi\in\mathcal{L}_{1}(X,\mathbb{R})}\left[\max_{p\in P}\min_{q\in Q}\int\varphi\,\mathrm{d}(p-q)\right]\leq 0\;.

Two ambiguity sets $P,Q\in\mathcal{K}(X)$ are considered equivalent, denoted by $P\simeq Q$ , if both $P\preceq Q$ and $Q\preceq P$ .

Notice that the above definition of the relation “ $\preceq$ ” should not be mixed up with the set inclusion relation “ $\subseteq$ ”, as used for the definition of set enclosures in the field of set-based tube MPC and global optimization. The corresponding conceptual difference is illustrated by the following example.

Example 2

Let us consider the ambiguity sets

P=\{\delta_{0},\delta_{1}\}\quad\text{and}\quad Q=\{\delta_{0},\delta_{1},0.5\delta_{0}+0.5\delta_{1}\}

of the compact set $X=[0,1]\subseteq\mathbb{R}$ recalling that $\delta_{0}$ and $\delta_{1}$ denote the Dirac measures at $0$ and $1$ , respectively. In this example, the upper bounds of the integrals,

	$\displaystyle\max_{p\in P}\int\varphi\;\mathrm{dp}$	$\displaystyle=$	$\displaystyle\max\{\;\varphi(0),\;\varphi(1)\;\}$
	$\displaystyle\text{and}\qquad\max_{q\in Q}\int\varphi\;\mathrm{dq}$	$\displaystyle=$	$\displaystyle\max\{\;\varphi(0),\;\varphi(1)\;\}\;,$

coincide for any Lipschitz continuous function $\varphi$ . Similarly, the associated lower bounds

	$\displaystyle\min_{p\in P}\int\varphi\;\mathrm{dp}$	$\displaystyle=$	$\displaystyle\min\{\;\varphi(0),\;\varphi(1)\;\}$
	$\displaystyle\text{and}\qquad\min_{q\in Q}\int\varphi\;\mathrm{dq}$	$\displaystyle=$	$\displaystyle\min\{\;\varphi(0),\;\varphi(1)\;\}\;,$

coincide, too. Consequently, in the sense of Definition 2, the ambiguity sets $P$ and $Q$ are equivalent, $Q\simeq P$ . In particular, we have $Q\preceq P$ but we do not have $Q\subseteq P$ . Thus, the relations $\preceq$ and $\subseteq$ are not the same.

The following proposition ensures that the relation $\preceq$ defines a proper partial order on $\mathcal{K}(X)$ with respect to the equivalence relation $\simeq$ . Moreover, topological compatibility with respect to our Hausdorff-Wasserstein metric setting is established.

Proposition 1

Let the enclosure relation $\preceq$ be defined as in Definition 2. Then, the following properties are satisfied for any $P,Q,T\in\mathcal{K}(X)$ .

1.

Reflexivity: we have $P\preceq P$ .
2.

Anti-Symmetry: if $P\preceq Q$ and $Q\preceq P$ , then $P\simeq Q$ .
3.

Transitivity: if $P\preceq Q$ and $Q\preceq T$ , then $P\preceq T$ .
4.

Compactness: The set

$\mathcal{S}=\{(P,Q)\in\mathcal{K}(X)\times\mathcal{K}(X)\mid P\preceq Q\}$

is compact; that is, $\mathcal{S}\in\mathbb{K}(\mathcal{K}(X)\times\mathcal{K}(X))$ .

Proof. Reflexivity, anti-symmetry with respect to the equivalence relation $\simeq$ , and transitivity follow directly from Definition 2. Thus, our focus is on the last statement, which claims to establish compatibility of our definition of enclosures and the proposed Wasserstein-Hausdorff metric setting. Let $P_{0},P_{1},P_{2},\ldots\in\mathcal{K}(X)$ and $Q_{0},Q_{1},Q_{2},\ldots\in\mathcal{K}(X)$ be two convergent sequences with

P^{\star}\overset{\mathrm{def}}{=}\lim_{k\to\infty}P_{k}\qquad\text{and}\qquad Q^{\star}\overset{\mathrm{def}}{=}\lim_{k\to\infty}Q_{k}

and such that $P_{k}\preceq Q_{k}$ for all $k\in\mathbb{N}$ . Because all sets are compact the maximizers

p_{k,\varphi}^{\star}\overset{\mathrm{def}}{=}\underset{p\in P_{k}}{\mathrm{argmax}}\int\varphi\,\mathrm{d}p\quad\text{and}\quad q_{k,\varphi}^{\star}\overset{\mathrm{def}}{=}\underset{q\in Q_{k}}{\mathrm{argmax}}\int\varphi\,\mathrm{d}q\;.

exist for all $\varphi\in\mathcal{L}_{1}(X,\mathbb{R})$ . Next, since $\mathcal{K}(X)$ is a compact set of compact sets, we have not only $P^{\star},Q^{\star}\in\mathcal{K}(X)$ , but the triangle inequality for the Hausdorff-Wasserstein metric additionally yields that

	$\displaystyle\widetilde{p}_{\varphi}\overset{\mathrm{def}}{=}\lim_{k\to\infty}p_{k,\varphi}^{\star}$	$\displaystyle=$	$\displaystyle\underset{p\in P^{\star}}{\mathrm{argmax}}\int\varphi\,\mathrm{d}p$
	$\displaystyle\text{and}\qquad\widetilde{q}_{\varphi}\overset{\mathrm{def}}{=}\lim_{k\to\infty}q_{k,\varphi}^{\star}$	$\displaystyle=$	$\displaystyle\underset{q\in Q^{\star}}{\mathrm{argmax}}\int\varphi\,\mathrm{d}q\;.$

A direct consequence of these equations is that we have

	$\displaystyle\sup_{\varphi\in\mathcal{L}_{1}(X,\mathbb{R})}\left[\max_{p\in P^{\star}}\min_{q\in Q^{\star}}\int\varphi\,\mathrm{d}(p-q)\right]$
	$\displaystyle\qquad=\sup_{\varphi\in\mathcal{L}_{1}(X,\mathbb{R})}\{\widetilde{p}_{\varphi}-\widetilde{q}_{\varphi}\}$
	$\displaystyle\qquad=\sup_{\varphi\in\mathcal{L}_{1}(X,\mathbb{R})}\lim_{k\to\infty}\{\underbrace{p_{\varphi,k}^{\star}-q_{\varphi,k}^{\star}}_{\leq 0}\}\leq 0\;,$

which shows that $P^{\star}\preceq Q^{\star}$ . Notice that this means that if $(P_{k},Q_{k})\in S$ is a Cauchy sequence, then the limit point satisfies $(P^{\star},Q^{\star})\in S$ ; that is $S$ is closed. Because $S$ is bounded by construction, this also implies that $S$ is compact, $S\in\mathbb{K}(\mathcal{K}(X)\times\mathcal{K}(X))$ . $\diamond$

After this technical preparation, the following definition of ambiguity tubes is possible.

Definition 3

The sequence $(P_{0},P_{1},\ldots,P_{N})\in\mathcal{K}(X)^{N+1}$ is called an ambiguity tube of (1) on the discrete-time horizon $\{0,1,\ldots,N\}$ , if there exist an associated sequence of ancillary feedback controllers $\mu_{0},\mu_{1},\ldots,\mu_{N-1}\in\mathcal{U}$ such that

\forall k\in\{0,1,\ldots,N-1\},\qquad F(P_{k},\mu_{k})\preceq P_{k+1}\;.

Notice that Definition 3 is inspired by methods from the field of set-theoretic methods in control, in particular, the concept of set-valued tubes, as used in the field of Tube MPC [15, 21, 25]. In detail, the step from set-valued robust forward invariant tubes to ambiguity tubes is, however, not straightforward. For instance, the standard set inclusion relation “ $\subseteq$ ” would be too strong for a practical definition of ambiguity tubes and is here replaced by the relation $\preceq$ . This adaption of concepts to our measure based setting is needed, as the purpose of constructing tubes in traditional set propagation and ambiguity set propagation are different. As we will also discuss in the following sections, ambiguity tubes can be used to assess, analyze, and trade-off the risk of constraint violations with other performance measures rather than enforcing the conservative worst-case constraints that are implemented in traditional tube MPC.

Remark 3

An equivalent characterization of the relations in Definition 2 can be obtained by borrowing notation from the field of convex optimization that is related to the concept of duality and support functions [4, 31]. In order to explain this, we denote with $d_{P}:\mathcal{L}_{1}(X,\mathbb{R})\to\mathbb{R}$ the support function of the ambiguity set $P$ ,

\forall\varphi\in\mathcal{L}_{1}(X,\mathbb{R}),\qquad d_{P}(\varphi)\;\overset{\mathrm{def}}{=}\;\max_{p\in P}\int\varphi\,\mathrm{d}p\;.

This notation is such that we have $d_{P}=d_{Q}$ if and only if $P\simeq Q$ . Similarly, we have $d_{P}\leq d_{Q}$ if and only if $P\preceq Q$ .

2.4 Proper Ambiguity Measures

The goal of this section is to formalize certain concepts of modeling performance and risk in the space of ambiguity sets. For this aim, we introduce maps of the form

\ell:\mathcal{K}(X)\to\mathbb{R}\;,

which assign real values to ambiguity sets. In this context, we propose to introduce the following regularity condition under which an ambiguity measure is considered “proper”²²2The conditions in Definition 4 are of an axiomatic nature that is inspired by similar axioms for coherent risk measures, as introduced in [30]. One difference though is that we work with ambiguity sets rather than single probability measures. Moreover, Definition 4 is, at least in this form, tailored to our proposed Wasserstein-Hausdorff metric setting in which Lipschitz continuity (not only closedness of image sets as required for regular risk measures [30]) is needed for ensuring topological compatibility..

Definition 4

We say that $\ell:\mathcal{K}(X)\to\mathbb{R}$ is a proper ambiguity measure, if $\ell$ is Lipschitz continuous, $\ell\in\mathcal{L}(\mathcal{K}(X),\mathbb{R})$ , linear with respect to weighted Minkowski sums; that is,

\forall\theta\in[0,1],\quad\ell(\theta P\oplus(1-\theta)Q)=\theta\ell(P)+(1-\theta)\ell(Q)\;,

and monotonous; that is, $P\preceq Q$ implies $\ell(P)\leq\ell(Q)$ for $P,Q\in\mathcal{K}(X)$ .

The Lipschitz continuity requirement in the above definition ensures that proper ambiguity measures $\ell$ also satisfy the equation $\ell(P)=\ell(Q)$ for all $P,Q\in\mathcal{K}(X)$ with $P\simeq Q$ . Together with the monotonicity relation, this implies that proper ambiguity measures are compatible with our definition of the relations “ $\preceq$ ” and “ $\simeq$ ” from Definition 2. In order to understand why practical performance and risk measures can be assumed to be proper without adding much of a restriction, it is helpful to have the following examples in mind.

Example 3

If $l\in\mathcal{L}(\mathbb{R}^{n},\mathbb{R})$ denotes a cost function, for example, the stage cost of a nominal MPC controller, the associated worst-case average performance

\ell(P)\;\overset{\mathrm{def}}{=}\;\max_{p\in P}\int l\,\mathrm{d}p

is well defined, where the maximizer exists for compact ambiguity sets $P\in\mathcal{K}(X)$ . It is easy to check that $\ell$ is a proper ambiguity measure in the sense of Definition 4.

Example 4

If $\mathbb{X}\in\mathbb{K}(X)$ denotes a state constraint set, the associated maximum expected constraint violation at risk is given by

\mathcal{R}(P)\;\overset{\mathrm{def}}{=}\;\max_{p\in P}\int\mathrm{dist}_{1}(x,\mathbb{X})\,p(\mathrm{d}x)

recalling that $\mathrm{dist}(x,\mathbb{X})=\min_{z\in\mathbb{X}}\|x-z\|_{1}$ denotes the distance function with respect to the $1$ -norm. Similar to the previous example, $\mathcal{R}$ turns out to be a proper ambiguity measure in the sense of Definition 4, which can here be interpreted as a risk measure. In fact, this risk measure is closely related to the so-called worst-case conditional value at risk, as introduced by T.R. Rockafellar, see for example [30], which is by now accepted as one of the most practical and computationally tractable risk measures in engineering and management sciences.

2.5 Ambiguity Tube MPC

The focus of this section is on the formulation of ambiguity tube MPC problems of the form

	$\displaystyle\mathcal{V}(y)\;\overset{\mathrm{def}}{=}$	$\displaystyle\underset{P,\mu}{\min}$	$\displaystyle\sum_{k=0}^{N-1}L(P_{k},\mu_{k})+M(P_{N})$		(23)
		$\displaystyle\mathrm{s.t.}$	$\displaystyle\left\{\begin{array}[]{l}\forall k\in\{0,1,\ldots,N-1\},\\ F(P_{k},\mu_{k})\preceq P_{k+1}\\ P_{k}\in\mathcal{K}(X),\;\mu_{k}\in\mathcal{U}\\ \delta_{y}\in P_{0}\;.\end{array}\right.$		(23)

Here, the sequence of ambiguity sets $P=(P_{0},P_{1},\ldots,P_{N})$ and the ancillary feedback laws $\mu=(\mu_{0},\mu_{1},\ldots,\mu_{N})$ are optimization variables. Moreover, $y$ denotes the current state measurement—recalling that $\delta_{y}$ denotes the associated Dirac measure—while

L:\mathcal{K}(X)\times\mathcal{U}\to\mathbb{R}\quad\text{and}\quad M:\mathcal{K}(X)\to\mathbb{R}

denote the stage and end cost terms. If $\mu_{0}^{\star}[y]\in\mathcal{U}$ denotes the parametric minimizer of (23), then the actual MPC feedback law is given by

\displaystyle\mu_{\mathrm{MPC}}(y)\overset{\mathrm{def}}{=}\mu_{0}^{\star}[y](y)\;.

(24)

Notice that in this notation, the current time of the MPC controller is reset to $0$ after every iteration. The following theorem introduces a minimum requirement under which one could call (23) well-formulated.

Theorem 1

Let Assumption 1 be satisfied and let $L$ and $M$ be continuous functions on the compact domains $\mathcal{K}(X)\times\mathcal{U}$ and $\mathcal{K}(X)$ , respectively, then (23) admits a minimizer for any $y\in X$ . Moreover, the function $\mathcal{V}$ is continuous on $X$ .

Proof. Because Assumption 1 holds, we can combine the results of Corollary 1 with the fourth statement of Proposition 1 to conclude that the feasible set of (23) is non-empty and compact. Consequently, if $L$ and $M$ are continuous functions, Weierstrass’ theorem yields the first statement of this theorem. Similarly, the second statement follows from a variant of Berge’s theorem [1]; see, also [31, Thm 1.17]. $\diamond$

Notice that the above theorem has been formulated under a rather weak requirement on the continuity of the functions $L$ and $M$ ; that is, without necessarily requiring that these functions are proper ambiguity measures. However, as we will discuss in the following sections, much stronger assumptions on the cost functions $L$ and $M$ are needed, if one is interested in analyzing the stability properties of the MPC controller (23).

Remark 4

Notice that the ambiguity tube MPC formulation (23) contains traditional tube MPC as well as existing stochastic MPC formulations as special cases that are obtained for the case that $\Omega$ is the set of all probability distributions with support set $W$ or, alternatively, a singleton, $\Omega=\{\omega\}$ . However, at this point, it should be mentioned that (23) is formulated in the understanding that state-constraints are taken into account by adding suitable risk measures to the objective, as explained by Example 4. This notation is—at least from the viewpoint of stochastic MPC—rather natural, as one would in such a setting usually be interested in an MPC objective that allows one to tradeoff between the risk of violating a constraint and control performance. Nevertheless, for the sake of generality of the following analysis, it should be mentioned that if one is interested in enforcing explicit chance constraints on the probability of a constraint formulation, the corresponding MPC controllers can only be reformulated as a problem of the form (23), if additional assumptions on the regularity³³3If we work with proper ambiguity measures in order to formulate constraints, this is clearly sufficient to ensure regularity. and recursive feasibility of these constraints are made—such that they can be added to the stage cost in the form of $L_{1}$ -penalties without altering the problem formulation. Notice that such regularity and recursive feasibility conditions have been discussed in all detail in [29] for min-max MPC and in [18] for stochastic MPC.

3 Stability Analysis

As mentioned in the introduction, the basic concepts for analyzing stability of Markovian systems have been established in [5] and [19] by using martingale theory. The goal of this section is to lay the foundation for applying this theory to analyze the stochastic closed-loop stability properties of the ambiguity tube MPC controller (23) in the presence of uncertainties. For this aim, this section is divided into three parts: Section 3.1 concisely collects and elaborates on all assumptions that will be needed for this stability analysis, Section 3.2 establishes an important technical result regarding the concavity of MPC cost functions with respect to Minkowski addition of ambiguity sets, and Section 3.3 uses this concavity property to construct a non-negative supermartingale, which finally leads to the powerful stability results for ambiguity tube MPC that are summarized in Theorems 3 and 4.

3.1 Conditions on the Stage and Terminal Cost Function

Throughout the following stability analysis, two main assumptions on the stage and end cost function of the MPC controller (23) are needed, as introduced below.

Assumption 2

The functions $L(\cdot,\mu)$ and $M$ are for any given $\mu\in\mathcal{U}$ proper ambiguity measures. Moreover, we assume that $L$ is continuous on $\mathcal{K}(X)\times\mathcal{U}$ .

Notice that Examples 3 and 4 discuss how to formulate practical risk and performance measures in such a way that the above assumption holds. From here on, a separate assumption on the continuity of $M$ (as in Theorem 1) is not needed anymore, as proper ambiguity measures are Lipschitz continuous functions and, consequently, also continuous. Assumption 2 does, however, add the explicit requirement that $L$ is continuous, as this function depends in general also on the feedback law $\mu$ —in this way, we make sure that the conditions of Theorem 1 are satisfied whenever Assumptions 1 and 2 are satisfied. The following assumption formulates an additional condition under which we consider the terminal cost function $M$ admissible.

Assumption 3

The functions $L$ and $M$ are non-negative and satisfy the terminal descent condition

\forall P\in\mathcal{K}(X),\;\exists\mu\in\mathcal{U}:\;\;L(P,\mu)+M(F(P,\mu))\leq M(P)\,.

The above assumption can be interpreted as a Lyapunov decent condition—similar conditions are used for constructing terminal cost functions for classical certainty-equivalent as well as tube based MPC controllers [7, 12, 29, 38]. The question how to construct functions $L$ and $M$ that simultaneously satisfy Assumptions 2 and 3 in practice will be discussed in Section 4.2.

Remark 5

Assumptions 2 and 3 together imply that $M$ must be a Lipschitz continuous Control Lyapunov Function (CLF). In the general context of traditional nonlinear system analysis, conditions under which such Lipschitz continuous CLFs exist have been analyzed by various authors [8, 22]. However, if one considers nonlinear MPC problems with explicit state constraints (see also Remark 4), it is possible to construct systems—for example, based on Artstein’s circles—that are asymptotically stabilizable yet fail to admit a continuous CLF [11]. It is, however, also pointed out in [11] that systems that only admit discontinuous CLFs often lead to non-robust MPC controllers. Therefore, in the context of robust MPC design, the motivation for working with Assumptions 2 and 3 is to exclude such pathological non-robustly stabilizable systems. Notice that more general regularity assumptions, under which a robust control design is possible, are beyond the scope of this paper.

3.2 On Concave Cost Functions

After summarizing all main assumptions of this section, we can now focus on the properties of certain cost functions that will later be used to construct a supermartingale for the proposed ambiguity tube MPC controller. For this aim, we first introduce the auxiliary function

	$\displaystyle\mathcal{J}_{\mu}(Q)\;\overset{\mathrm{def}}{=}$	$\displaystyle\underset{P}{\min}$	$\displaystyle\sum_{k=0}^{N-1}L(P_{k},\mu_{k})+M(P_{N})$		(29)
		$\displaystyle\mathrm{s.t.}$	$\displaystyle\left\{\begin{array}[]{l}\forall k\in\{0,1,\ldots,N-1\},\\ F(P_{k},\mu_{k})\preceq P_{k+1}\\ P_{k}\in\mathcal{K}(X)\\ Q\preceq P_{0},\;P_{N}=P^{\star}\;,\end{array}\right.$		(29)

which is formally defined for all $Q\in\mathcal{K}(X)$ and all ancillary feedback laws $\mu\in\mathcal{U}^{N}$ .

Lemma 2

If Assumptions 1 and 2 are satisfied, then $\mathcal{J}_{\mu}$ is for any given $\mu\in\mathcal{U}^{N}$ a proper ambiguity measure.

Proof. In the following, we may assume that $\mu\in\mathcal{U}^{N}$ is constant and given. Our proof is divided into two parts, where the first part focuses on establishing a linearity property of the function $F$ . The second part of the proof builds upon the first part in order to further analyze the properties of the function $\mathcal{J}_{\mu}$ .

PART I: Recall that the function $\Phi$ , which has been defined in (4), is—by construction—linear in its first argument. Consequently, we have

\theta\Phi(p,\nu,\omega)+(1-\theta)\Phi(q,\nu,\omega)=\Phi(\theta p+(1-\theta)q,\nu,\omega)

for all $p,q\in\mathcal{P}(X)$ and all $\theta\in[0,1]$ , for any given $\omega\in\mathcal{P}(W)$ and $\nu\in\mathcal{U}$ . But this property of $\Phi$ implies directly that the function $F$ satisfies

\theta F(P,\nu)\oplus(1-\theta)F(Q,\nu)=F(\theta P\oplus(1-\theta)Q,\nu)\;,

for all $P,Q\in\mathcal{K}(X)$ , all $\nu\in\mathcal{U}$ , and all $\theta\in[0,1]$ , which follows from the definition of $F$ in (18).

PART II: Let $\mathfrak{P}_{0}[Q],\ldots,\mathfrak{P}_{N}[Q]\in\mathcal{K}(X)$ denote the solution of the recursion

	$\displaystyle\mathfrak{P}_{0}[Q]$	$\displaystyle=$	$\displaystyle Q\;,$
	$\displaystyle\mathfrak{P}_{k+1}[Q]$	$\displaystyle=$	$\displaystyle F(\mathfrak{P}_{k}[Q],\mu_{k})$

for $k\in\{0,1,\ldots,N-1\}$ recalling that $\mu$ is given. Corollary 1 ensures that the transition map $F$ is Lipschitz continuous such that the above recursion generates indeed compact ambiguity sets for any compact input set $Q\in\mathcal{K}(X)$ , such that the sequence $\mathfrak{P}_{0},\ldots,\mathfrak{P}_{N}$ is well-defined. Due to the linearity of $F$ with respect to its first argument (see Part I), it follows by induction over $k$ that

\displaystyle\mathfrak{P}_{k}[\theta Q\oplus(1-\theta)Q^{\prime}]=\theta\mathfrak{P}_{k}[Q]\oplus(1-\theta)\mathfrak{P}_{k}[Q^{\prime}]

(30)

for all $Q,Q^{\prime}\in\mathcal{K}(X)$ and all $\theta\in[0,1]$ . Next, because both the ambiguity measures $L$ and $M$ are—due to Assumption 2—monotonous and Lipschitz continuous, we have

\displaystyle\mathcal{J}_{\mu}(Q)=\sum_{k=0}^{N-1}L(\mathfrak{P}_{k}[Q],\mu_{k})+M(\mathfrak{P}_{N}[Q])

(31)

for any $Q\in\mathcal{K}(X)$ .

Finally, it remains to combine the results (30) and (31) with our assumption that $L$ and $M$ are proper ambiguity measures, which implies that $\mathcal{J}_{\mu}$ is a proper ambiguity measure, too. $\diamond$

Notice that the auxiliary function $\mathcal{J}_{\mu}$ depends on the feedback law $\mu$ , which is optimized in the context of MPC. Consequently, we are in the following not directly interested in this auxiliary function, but rather in the actual cost-to-go function

\displaystyle J(Q)\;\overset{\mathrm{def}}{=}\;\min_{\mu\in\mathcal{U}^{N}}\;\mathcal{J}_{\mu}(Q)\;,

(32)

which is defined for all $Q\in\mathcal{K}(X)$ , too. Clearly, the function $J$ is closely related to the value function $\mathcal{V}$ of the ambiguity tube MPC controller (23), as we have

\displaystyle\forall y\in X,\qquad\mathcal{V}(y)=J(\{\delta_{y}\})\;.

(33)

This equation follows directly by comparing the definition of $\mathcal{V}$ in (23) with the definitions in (29) and (32). The following corollary summarizes an important consequence of Lemma 2.

Corollary 2

Let Assumptions 1 and 2 be satisfied. The cost-to-go function $J$ is concave with respect to weighted Minkowski addition; that is, we have

\displaystyle J(\theta Q\oplus(1-\theta)Q^{\prime})\;\geq\;\theta J(Q)+(1-\theta)J(Q^{\prime})

(34)

for all $Q,Q^{\prime}\in\mathcal{K}(X)$ and all $\theta\in[0,1]$ . Moreover, $J$ is monotonous, $Q\preceq Q^{\prime}$ implies $J(Q)\leq J(Q^{\prime})$ .

Proof. The key idea for establishing the first statement of this corollary is to use the linearity of the auxiliary function $\mathcal{J}_{\mu}$ (Lemma 2). This yields the inequality

\displaystyle\begin{array}[]{l}J(\theta Q\oplus(1-\theta)Q^{\prime})\\[7.11317pt] \begin{array}[]{cl}=&\underset{\mu\in\mathcal{U}^{N}}{\min}\mathcal{J}_{\mu}(\theta Q\oplus(1-\theta)Q^{\prime})\\[8.5359pt] =&\underset{\mu\in\mathcal{U}^{N}}{\min}\left\{\theta\mathcal{J}_{\mu}(Q)+(1-\theta)\mathcal{J}_{\mu}(Q^{\prime})\right\}\\[8.5359pt] \geq&\theta\left(\underset{\mu\in\mathcal{U}^{N}}{\min}\,\mathcal{J}_{\mu}(Q)\right)+(1-\theta)\left(\underset{\mu^{\prime}\in\mathcal{U}^{N}}{\min}\,\mathcal{J}_{\mu^{\prime}}(Q^{\prime})\right)\\[14.22636pt] =&\theta J(Q)+(1-\theta)J(Q^{\prime})\end{array}\end{array}

(38)

for all $Q,Q^{\prime}\in\mathcal{K}(X)$ and all $\theta\in[0,1]$ . This corresponds to the first statement of the lemma. The second statement follows from the fact that any minimizer of the optimization problem

	$\displaystyle\mathcal{J}_{\mu}(Q^{\prime})\;=$	$\displaystyle\underset{P,\mu}{\min}$	$\displaystyle\sum_{k=0}^{N-1}L(P_{k},\mu_{k})+M(P_{N})$		(43)
		$\displaystyle\mathrm{s.t.}$	$\displaystyle\left\{\begin{array}[]{l}\forall k\in\{0,1,\ldots,N-1\},\\ F(P_{k},\mu_{k})\preceq P_{k+1}\\ P_{k}\in\mathcal{K}(X)\;,\;\mu_{k}\in\mathcal{U}\\ Q^{\prime}\preceq P_{0}\;,\end{array}\right.$		(43)

is a feasible point of the corresponding optimization problem that is obtained when replacing the ambiguity set $Q^{\prime}\in\mathcal{K}(X)$ with an ambiguity set $Q$ that satisfies $Q\preceq Q^{\prime}$ , which then implies monotonicity; that is, $J(Q)\leq J(Q^{\prime})$ . $\diamond$

Corollary 2 is central to establishing stability properties of stochastic MPC or, in the context of this paper, more general ambiguity tube MPC schemes. The following example helps to understand why this concavity statement is important.

Example 5

Let us consider the example that $Q=\{\delta_{a}\}$ and $Q^{\prime}=\{\delta_{b}\}$ for two points $a,b\in X$ . In words, this means that $J(Q)$ is the cost that is associated with knowing that we are currently at the point $a$ . Similarly, $J(Q^{\prime})$ can be interpreted as the cost that is associated with knowing that we are currently at the point $b$ . Now, if we set $\theta=\frac{1}{2}$ , the corresponding ambiguity set

\theta Q\oplus(1-\theta)Q^{\prime}=\left\{\frac{1}{2}\delta_{a}+\frac{1}{2}\delta_{b}\right\}

can be associated with the situation that we don’t know whether we are at the point $a$ or at the point $b$ , as both events could happen with probability $\frac{1}{2}$ . Thus, in this example, the first statement of Corollary 2 is saying that the cost that is associated with not knowing whether we are at $a$ or $b$ —that is, $J(\theta Q\oplus(1-\theta)Q^{\prime})$ —is larger or equal than the expected cost that is obtained when planning to first measure whether we are at $a$ or $b$ and then evaluating the cost function.

The following section exploits the conceptual idea from the above example and Corollary 2 in order to construct a non-negative supermartingale for ambiguity tube MPC.

3.3 Supermartingales for Ambiguity Tube MPC

The goal of this section is to establish conditions under which the value function $\mathcal{V}$ is a supermartingale along the trajectories of the closed system that is associated with the ambiguity tube MPC feedback law $\mu_{\mathrm{MPC}}$ , as defined in (24). Let us first recall that

\displaystyle\forall k\in\mathbb{N},\qquad y_{k+1}=f(y_{k},\mu_{\mathrm{MPC}}(y_{k}),w_{k})

(44)

denotes the closed-loop stochastic process that is associated with the MPC controller (23). Here, we additionally recall that $w_{0},w_{1},\ldots:W\to W$ , $w_{k}(w)=w$ , are independent $B(W)$ -measurable random variables in the probability spaces $(W,\mathcal{B}(W),\omega_{k})$ that depend on the sequence of measures $\omega_{0},\omega_{1},\ldots\in\Omega$ . This implies that the states $y_{0},y_{1},\ldots$ are random variables, too. In the following, we use the notation $\mathfrak{S}_{k}=\sigma(y_{0},y_{1},\ldots,y_{k})$ to denote the minimal $\sigma$ -field of the sequence $y_{0},y_{1},\ldots,y_{k}$ , such that

\forall k\in\mathbb{N},\qquad\mathfrak{S}_{k}\subseteq\mathfrak{S}_{k+1}

is a filtration of $\sigma$ -algebras. Moreover, we use the standard notation [34]

\displaystyle\mathbb{E}\{\mathcal{V}(y_{k+1})\mid\mathfrak{S}_{k}\}

\displaystyle\overset{\mathrm{def}}{=}

\displaystyle\int\mathcal{V}(f(y_{k},\mu_{\mathrm{MPC}}(y_{k}),\cdot))\,\mathrm{d}\omega_{k}

to denote the conditional expectation of $\mathcal{V}(y_{k+1})$ given $\mathfrak{S}_{k}$ . Notice that this definition depends on the sequence of probability measures $\omega_{0},\omega_{1},\ldots\in\Omega$ . Clearly, if Assumptions 1, and 2 hold, Theorem 1 ensures that $\mathcal{V}$ is continuous. Since Assumption 1 also ensures that $f$ is Lipschitz continuous, the integrand in above expression is trivially $\mathcal{B}(W)$ -measurable and, consequently, the conditional expectation of $\mathcal{V}(y_{k+1})$ given $\mathfrak{S}_{k}$ is well-defined.

Theorem 2

Let Assumptions 1, 2, and 3 be satisfied. Then the function $\mathcal{V}$ is a supermartingale along the trajectories of the closed-loop system (44); that is, we have

\forall k\in\mathbb{N},\qquad\mathbb{E}\{\mathcal{V}(y_{k+1})\mid\mathfrak{S}_{k}\}\leq\mathcal{V}(y_{k})

independent of the choice of $\omega_{0},\omega_{1}\ldots\in\Omega$ .

Proof. Throughout this proof we use the notation $P_{0}^{\star}(y),P_{1}^{\star}(y),\ldots\in\mathcal{K}(X)$ to denote the parametric minimizers of (23) such that the inequality

\displaystyle\mathbb{E}\{\mathcal{V}(y_{k+1})\mid\mathfrak{S}_{k}\}

\displaystyle\leq

\displaystyle\max_{p\in P_{1}^{\star}(y_{n})}\,\int\mathcal{V}\,\mathrm{d}p

(45)

holds—by definition of $P_{1}^{\star}(y_{n})$ —for any choice of the sequence $\omega_{0},\omega_{1}\ldots\in\Omega$ . Next, since Assumptions 1 and 2 are satisfied, we can use the first statement of Corollary 2 to establish the inequality

	$\displaystyle\int\mathcal{V}\,\mathrm{d}p$	$\displaystyle\overset{\eqref{eq::VJ}}{=}$	$\displaystyle\int J\left(\{\delta_{y}\}\right)\,p(\mathrm{d}y)$		(46)
		$\displaystyle\overset{\eqref{eq::concave}}{\leq}$	$\displaystyle J\left(\,\left\{\int\delta_{y}\,p(\mathrm{d}y)\right\}\,\right)=J(\{p\})\;,$		(46)

which holds for any probability measure $p\in\mathcal{P}(X)$ . Moreover, we know from the second statement of Corollary 2 that $J$ is monotonous, which implies that the implication chain

\displaystyle p\in P\quad\Longrightarrow\quad\{p\}\preceq P\quad\Longrightarrow\quad J(\{p\})\leq J(P)

(47)

holds for any $P\in\mathcal{K}(X)$ . In order to briefly summarize our intermediate results so far, we combine (45), (46), and (47), which yields the inequality

\displaystyle\mathbb{E}\{\mathcal{V}(y_{k+1})\mid\mathfrak{S}_{k}\}

\displaystyle\leq

\displaystyle J(P_{1}^{\star}(y_{k}))\;.

(48)

Next, in analogy to the construction of Lyapunov functions for traditional MPC controllers [29], we can use that Assumption 3 implies that the cost-to-go function $J$ descends along the iterates $P_{i}^{\star}(y_{k})$ , which means that

\displaystyle J(P_{1}^{\star}(y_{k}))

\displaystyle\leq

\displaystyle J(P_{0}^{\star}(y_{k}))\;=\;\mathcal{V}(y_{k})\;.

(49)

But then (48) and (49) imply that we also have

\mathbb{E}\{\mathcal{V}(y_{k+1})\mid\mathfrak{S}_{k}\}\;\leq\;\mathcal{V}(y_{k})\;.

The latter inequality does not depend on the choice of the sequence $\omega_{0},\omega_{1},\ldots\in\Omega$ and, consequently, corresponds to the statement of the theorem. $\diamond$

3.4 Stability of Ambiguity Tube MPC

As established in Theorem 1, Assumptions 1 and 2 are sufficient to ensure that $\mathcal{V}$ is a continuous function. Consequently,

\displaystyle\mathcal{V}^{\star}\;\overset{\mathrm{def}}{=}\;\min_{y\in X}\,\mathcal{V}(y)\quad\text{and}\quad Y^{\star}\;\overset{\mathrm{def}}{=}\;\underset{y\in X}{\text{argmin}}\;\mathcal{V}(y)

(50)

are well-defined and the set $Y^{\star}$ is compact, $Y^{\star}\in\mathbb{K}(X)$ , and non-empty (see Theorem 1). Moreover, we may assume, without loss of generality, that $\mathcal{V}^{\star}=0$ , as adding constant offsets to $\mathcal{V}$ does not affect the result of Theorem 2. In the following, we introduce the notation

\displaystyle\mathcal{N}_{\varepsilon}(Y^{\star})

\displaystyle\overset{\mathrm{def}}{=}

\displaystyle\left\{\;x\in X\;\middle|\;\mathrm{dist}_{2}(x,Y^{\star})<\varepsilon\;\right\}

to denote an $\varepsilon$ -neighborhood of $Y^{\star}$ . The definition below introduces a (rather standard) notion of stability of the closed-loop system (44). Here, it is helpful to recall that the probability measures $p_{k}$ of the sequence $y_{k}$ are given by the recursion

p_{k+1}=\Phi(p_{k},\mu_{\mathrm{MPC}},\omega_{k})\;,

which, in turn, depends on the sequence $\omega_{0},\omega_{1},\ldots\in\Omega$ and on the given initial state measurement $y_{0}$ , as we set $p_{0}=\delta_{y_{0}}$ .

Definition 5

The closed-loop system (44) is called robustly stable with respect to the set $Y^{\star}$ , if there exists for every $\varepsilon>0$ a $\delta>0$ such that for any $y_{0}\in\mathcal{N}_{\delta}(Y^{\star})$ we have

\forall k\in\mathbb{N},\quad p_{k}\left(\mathcal{N}_{\varepsilon}(Y^{\star})\right)\;>\;1-\varepsilon\,,

independent of the choice of the probability measures $\omega_{0},\omega_{1},\ldots\in\Omega$ . If we additionally have that

\lim_{k\to\infty}\;p_{k}(Y^{\star})=1\;,

independent of the choice of $\omega_{0},\omega_{1},\ldots\in\Omega$ , we say that the closed-loop system is robustly asymptotically stable with respect to $Y^{\star}$ .

The following theorem summarizes the main result of this section.

Theorem 3

If Assumptions 1, 2, and 3 hold, then (44) is robustly stable with respect to $Y^{\star}$ .

Proof. The assumptions of this theorem ensure that $\mathcal{V}$ is continuous (Theorem 1) and a supermartingale along the trajectories of (44), independent of the choice of the probability measures $\omega_{0},\omega_{1},\ldots\in\Omega$ (Theorem 2). Moreover, since we work with a compact support, all random variables are essentially bounded. Consequently, we can apply Bucy’s supermartingale stability theorem [5, Thm. 1] to conclude that (44) is robustly stable with respect to $Y^{\star}$ . $\diamond$

Remark 6

The above proof is based on the historical result of Bucy’s original article on positive supermartingales, who—at the time of publishing his original article—formally only established stability of Markov processes with respect to an isolated equilibrium; that is, for the case that the set $Y^{\star}$ is a singleton. However, the proof of Theorem 1 in [5] generalizes trivially to the version that is needed in the above proof after replacing the distance between the states of the Markov system and the equilibrium point by the corresponding distance of these iterates to the set $Y^{\star}$ . By now, this and other generalization of the supermartingale based stability theorems by Bucy and Kushner for Markov processes are, of course, well-known and can in very similar versions also (and besides many others) be found in [10, 19, 20, 34].

For the case that we are not only interested in robust stability but also robust asymptotic stability, the above theorem can be extended after introducing a slightly stronger regularity requirement on the function $L$ , which is introduced below.

Definition 6

We say that the function $L$ is positive definite with respect to $Y^{\star}$ if $L(P,\mu)>0$ for all $P$ with

\min_{p\in P}\;p(Y^{\star})<1

and all $\mu\in\mathcal{U}$ .

The corresponding stronger version of Theorem 3 can now be formulated as follows.

Theorem 4

Let Assumptions 1, 2, and 3 hold. If $L$ is positive definite with respect to $Y^{\star}$ , then (44) is robustly asymptotically stable with respect to $Y^{\star}$ .

Proof. The statement of this theorem is similar to the statement of Theorem 3, but we need to work with a slightly tighter version of the supermartingale inequality from Theorem 2. For this aim, we use that Assumption 3 implies that

\displaystyle L(P_{0}^{\star}(y_{k}),\mu^{\star}[y_{k}])+J(P_{1}^{\star}(y_{k}))

\displaystyle\leq

\displaystyle J(P_{0}^{\star}(y_{k}))\;.

(51)

Consequently, the inequality (49) can be replaced by its tighter version,

\min_{p\in P_{0}^{\star}(y_{k})}\;p(Y^{\star})<1\quad\Longrightarrow\quad J(P_{1}^{\star}(y_{k}))<\mathcal{V}(y_{k})\;.

Thus, the corresponding argument in the proof of Theorem 2 can be modified finding that we also have

\min_{p\in P_{0}^{\star}(y_{k})}\;p(Y^{\star})<1\quad\Longrightarrow\quad\mathbb{E}\{\mathcal{V}(y_{k+1})\mid\mathfrak{S}_{k}\}<\mathcal{V}(y_{k})

for all $k\in\mathbb{N}$ and independent of the choice of the sequence $\omega_{0},\omega_{1}\ldots\in\Omega$ . But this means that $\mathcal{V}$ is a strict non-negative supermartingale and we can apply the standard result from [5, Thm. 2] (of course, again after replacing Bucy’s outdated definition of equilibrium points with our definition of the set $Y^{\star}$ —see Remark 6) to establish the statement of this theorem. $\diamond$

4 Practical Implementation of Ambiguity Tube MPC for Linear Systems

In order to illustrate and discuss the above theoretical results, this section develops a practical framework for reformulating a class of ambiguity tube MPC controllers for linear systems as convex optimization problems that can then be solved with existing MPC software. In particular, Section 4.2 focuses on the question how to construct stabilizing terminal costs for stochastic and ambiguity tube MPC.

4.1 Linear Stochastic Systems

Let us consider linear stochastic discrete-time systems with (projected) linear right-hand function

f=\mathrm{proj}_{X}\circ\hat{f}\qquad\text{with}\qquad\hat{f}(x,u,w)=Ax+Bu+w\;,

where $A\in\mathbb{R}^{n_{x}\times n_{x}}$ and $B\in\mathbb{R}^{n_{x}\times n_{u}}$ are given matrices. Here, we assume that an asymptotically stabilizing linear feedback gain $K\in\mathbb{R}^{n_{u}\times n_{x}}$ for $(A,B)$ exists; that is, such that all eigenvalues of $A+BK$ are in the open unit disc. For simplicity of presentation, we focus on parametric ancillary feedback controllers of the form

\mu[u_{\mathrm{c}}](x)=u_{\mathrm{c}}+Kx\;,\quad\text{where}\quad\mathcal{U}=\left\{\;\mu[u_{\mathrm{c}}]\;\mid\;u_{\mathrm{c}}\in U_{\mathrm{c}}\;\right\}

denotes the associated compact set of representable ancillary control laws for the pre-computed control gain $K$ . In this context, $U_{\mathrm{c}}\in\mathbb{K}(\mathbb{R}^{n_{u}})$ denotes a control constraint that is associated with the central control offset $u_{\mathrm{c}}$ . Similarly, we introduce the notation

\omega[w_{\mathrm{c}}](X^{\prime})=\overline{\omega}(\{w_{\mathrm{c}}\}\oplus X^{\prime})\;\;\text{and}\;\;\Omega=\{\omega[w_{\mathrm{c}}]\mid w_{\mathrm{c}}\in W_{\mathrm{c}}\}

for all $X^{\prime}\in\mathcal{B}(X)$ , assuming that $\overline{\omega}\in\mathcal{P}(\overline{W})$ , the central set $W_{\mathrm{c}}\in\mathbb{K}(\mathbb{R}^{n_{x}})$ , and the domain $\overline{W}\in\mathbb{K}(\mathbb{R}^{n_{x}})$ are given. Finally, the corresponding domain of the uncertainty sequence is denoted by

W\;\overset{\mathrm{def}}{=}\;W_{\mathrm{c}}\oplus\overline{W}\;.

At this point, there are two remarks in order.

Remark 7

Because $K$ is a pre-stabilizing feedback, one can construct the compact domain $X\in\mathbb{K}(\mathbb{R}^{n_{x}})$ such that

X\supseteq(A+BK)X\oplus\left[BU_{\mathrm{c}}\oplus W\right]\;.

This construction is such that the closed-loop uncertainty measure propagation operator $\Phi$ is unaffected by the projection onto $X$ —it would have been the same, if we would have set $f=\hat{f}$ . This illustrates how Assumption 1 can—at least for asymptotically stabilizable linear systems—formally be satisfied by a simple projection onto an invariant set, without altering the original physical problem formulation.

Remark 8

Notice that the ambiguity set $\Omega$ in the above system model can be used to overestimate nonlinear terms. For example, if we have a system of the form

x^{+}=Ax+Bu+g(x)+\overline{w}\;,

where $g(x)$ is bounded on $X$ , such that $g(x)\in W_{\mathrm{c}}$ while $\overline{w}$ is a random variable with probability measure $\overline{\omega}$ , then the ambiguity set $\Omega$ is such that the probability measure $\omega_{g}$ of the random variable $w=g(x)+\overline{w}$ satisfies $\{\omega_{g}\}\preceq\Omega$ . This example can be used as a starting point to develop computationally tractable ambiguity tube MPC formulations for nonlinear systems, although a discussion of less conservative nonlinearity bounders, as developed for set propagation in [40], are beyond the scope of this paper.

4.2 Construction of Stage and Terminal Costs

In order to discuss how to design stage and terminal costs, $L$ and $M$ , which satisfy the requirements from Assumptions 2 and 3, this section focuses on the case that the stage cost has the form

\displaystyle L(P,\mu)=\max_{p\in P}\int\Theta(x,u_{\mathrm{c}})\,p(\mathrm{d}x)\;,

(52)

where $\Theta\in\mathcal{L}(X\times U_{\mathrm{c}},\mathbb{R}_{+})$ is a non-negative and Lipschitz continuous control performance function. In the following, we assume that we have $0\in U_{\mathrm{c}}$ as well $\Theta(y^{\star},0)=0$ for all $y^{\star}\in Y^{\star}$ , where $Y^{\star}$ denotes the limit set of the considered linear stochastic system that is obtained for the offset-free ancillary control law; that is,

Y^{\star}\;=\;\bigoplus_{k=0}^{\infty}(A+BK)^{k}\,W\;.

Notice that this set can be computed by using standard methods from the field of set based computing [3, 15].

Example 6

Let us assume that $\mathbb{X}\in\mathbb{K}(\mathbb{R}^{n_{x}})$ is a given state constraint with $Y^{\star}\subseteq\mathbb{X}$ . In this case, the risk and performance measure

\Theta(x,u)=\|u\|_{2}^{2}+\mathrm{dist}_{2}(x,Y^{\star})^{2}+\tau\cdot\mathrm{dist}_{1}(x,\mathbb{X})

is Lipschitz continuous. It can be used to model a trade-off between the least-squares control performance term

\|u\|_{2}^{2}+\mathrm{dist}_{2}(x,Y^{\star})^{2}

that penalizes control offsets and the distance of the state to the target region $Y^{\star}$ , and the constraint violation term $\tau\cdot\mathrm{dist}_{1}(x,\mathbb{X})$ that is $0$ if $x$ satisfies the constraint. Here, $\tau>0$ is a tuning parameter that can be used to adjust how risk-averse the controller is; see also Examples 3 and 4.

Now, the key idea for constructing the terminal cost $M$ is to first construct a non-negative function $\Pi\in\mathcal{L}(X,\mathbb{R}_{+})$ that satisfies the ancillary Lyapunov descent condition

\displaystyle\forall x\in X,\quad\Theta(x,0)+\underset{w\in W}{\max}\;\Pi(f(x,Kx,w))\;\leq\;\Pi(x)\;.

(53)

Notice that such a function $\Pi$ exists, as we assume that the closed-loop system matrix $(A+BK)$ is asymptotically stabilizing. It can be constructed as follows. Let $\Sigma$ denote the positive definite solution of the algebraic Lyapunov equation

(A+BK)^{\intercal}\Sigma(A+BK)+I=\Sigma,

let $|\lambda_{\mathrm{max}}(A+BK)|\leq\overline{\lambda}<1$ be an upper bound on the spectral radius of the matrix $A+BK$ , and let $\Lambda>0$ be the Lipschitz constant of $\Theta$ with respect to the weighted Euclidean norm, $\|x\|_{\Sigma}\overset{\mathrm{def}}{=}\sqrt{x^{\intercal}\Sigma x}$ , such that

\forall x\in X,\qquad\Theta(x,0)\ \leq\ \Lambda\min_{x^{\prime}\in Y^{\star}}\|x-x^{\prime}\|_{\Sigma}\;.

Next, we claim that the function

\forall x\in X,\qquad\Pi(x)\ \overset{\mathrm{def}}{=}\ \frac{\Lambda}{1-\overline{\lambda}}\left[\min_{x^{\prime}\in Y^{\star}}\|x-x^{\prime}\|_{\Sigma}\right]

satisfies (53). In order to prove this, notice that the inequality

\forall x\in X,\qquad\underset{w\in W}{\max}\;\Pi(f(x,Kx,w))\ \leq\ \overline{\lambda}\cdot\Pi(x)

holds by construction of $\Sigma$ , $\Pi$ and $Y^{\star}$ . Thus, we have

	$\displaystyle\Theta(x,0)+\underset{w\in W}{\max}\;\Pi(f(x,Kx,w))$
	$\displaystyle\ \leq\ \Lambda\min_{x^{\prime}\in Y^{\star}}\\|x-x^{\prime}\\|_{\Sigma}+\overline{\lambda}\cdot\Pi(x)$
	$\displaystyle\ =\ \left[\Lambda+\frac{\Lambda\overline{\lambda}}{1-\overline{\lambda}}\right]\min_{x^{\prime}\in Y^{\star}}\\|x-x^{\prime}\\|_{\Sigma}=\Pi(x)$		(54)

for all $x\in X$ ; that is, $\Pi$ satisfies (53). Next, the associated ambiguity measure

\displaystyle M(P)=\max_{p\in P}\int\Pi\,\mathrm{d}p

(55)

can be used as an associated terminal cost that satisfies Assumption 3. As this result is of high practical relevance, we summarize it in the form of the following lemma.

Lemma 3

Let $L$ and $M$ be defined as in (52) and (55). If the functions $\Theta$ and $\Pi$ are non-negative and Lipschitz continuous such that (53) is satisfies and if $0\in U_{\mathrm{c}}$ , then $L$ and $M$ satisfy all requirements from Assumptions 2 and 3.

Proof. Because $\Theta$ and $\Pi$ are Lipschitz continuous $L$ and $M$ are—by construction—proper ambiguity measures and Assumption 2 is satisfied. Moreover, since $\Theta$ and $\Pi$ are non-negative, $L$ and $M$ are non-negative. Next, (55) implies that

$\displaystyle M(F(P,\mu))$	$\displaystyle\overset{\eqref{eq::MM}}{=}$	$\displaystyle\max_{p\in F(P,\mu)}\int\Pi\,\mathrm{d}p$	(56)
	$\displaystyle=$	$\displaystyle\max_{p\in P,\omega\in\Omega}\int\int\Pi(f(x,Kx,w))\,p(\mathrm{d}x)\,\omega(\mathrm{d}w)$
	$\displaystyle\leq$	$\displaystyle\max_{p\in P}\int\max_{w\in W}\Pi(f(x,Kx,w))\,p(\mathrm{d}x)\;,$

where the second equation holds for the offset-free ancillary feedback law $\mu(x)=Kx$ . Furthermore, according to (53), we have

\displaystyle\max_{w\in W}\;\Pi(f(x,Kx,w))

\displaystyle\leq

\displaystyle\Pi(x)-\Theta(x,0)

(57)

for all $x\in X$ . Consequently, we can substitute this inequality in (56) finding

	$\displaystyle M(F(P,\mu))$	$\displaystyle\overset{\eqref{eq::Auxx},\eqref{eq::Auxx2}}{\leq}$	$\displaystyle\max_{p\in P}\int\left[\Pi-\Theta(\cdot,0)\right]\,\mathrm{d}p$
		$\displaystyle\overset{\eqref{eq::MM},\eqref{eq::LL}}{=}$	$\displaystyle M(x)-L(P,\mu)$

recalling that this holds for the particular feedback law $\mu(x)=Kx$ . In other words, because we assume that $0\in U_{\mathrm{c}}$ , there exists for every $P\in\mathcal{K}(X)$ a $\mu\in\mathcal{U}$ for which

L(P,\mu)+M(F(P,\mu))\;\leq\;M(P)

and the conditions from Assumption 3 are satisfied. This corresponds to the statement of the lemma. $\diamond$

Remark 9

Notice that many articles on stochastic MPC, for example [6, 18, 24], start their construction of the stage cost by assuming that a nominal (non-negative) cost function $l:X\times U\to\mathbb{R}_{+}$ is given. For example, in the easiest case, one could consider the least-squares cost

l(x,u)=x^{2}+u^{2}\;.

In the above context, however, we cannot simply set $\Theta=l$ , as Condition (53) can only be satisfied if we have $\Theta(y^{\star},0)=0$ for all $y^{\star}\in Y^{\star}$ —but $Y^{\star}$ is usually not a singleton. However, one can find a Lipschitz continuous function $\Theta$ that approximates the function

\Theta(x,u)\approx\left\{\begin{array}[]{ll}l(x,u)&\text{if}\;\;x\notin Y^{\star}\\[2.84544pt] 0&\text{if}\;\;x\in Y^{\star}\end{array}\right.

up to any approximation accuracy such that $\Theta$ coincides with $l$ on the domain $X\setminus Y^{\star}$ with high precision. In fact, the approximation is in this context only needed for technical reasons, such that $\Theta$ is Lipschitz continuous. Notice that this construction satisfies the requirements of Lemma 3 and is, as such, fully compatible with our stability analysis framework. Next, we construct the hybrid feedback law

\widetilde{\mu}(y)\;\overset{\mathrm{def}}{=}\;\left\{\begin{array}[]{ll}\mu_{\mathrm{MPC}}(y)&\text{if}\;\;y\notin Y^{\star}\\[2.84544pt] Ky&\text{if}\;\;y\in Y^{\star}\;,\end{array}\right.

which simply switches to the ancillary control law $x\to Kx$ whenever the current state is already inside the target region $Y^{\star}$ . This construction is compatible with the stability statements from Theorems 3 and 4, as we modify the closed-loop system only inside the robust control invariant target region. Of course, this is in the understanding that the control gain $K$ is optimized beforehand and that this linear controller leads to a close-to-optimal control performance (with respect to worst-case expected value of the given cost function $l$ ) inside the region $Y^{\star}$ —if not, one needs to work with more sophisticated ancillary controllers and redefine $\mathcal{U}$ . The robust MPC controller is in this case only taking care of the case that the current state is in $X\setminus Y^{\star}$ —but in this region $\Theta$ coincides with the given cost function $l$ as desired.

4.3 Implementation Details

Ambiguity tube MPC can be implemented by pre-computing the stage and terminal cost offline. This has the advantage that, in the online phase, a simple convex optimization problem is solved. For this aim, we pre-compute the central sets

\displaystyle Z_{k+1}

\displaystyle\overset{\mathrm{def}}{=}

\displaystyle\bigoplus_{i=0}^{k}(A+BK)^{i}W_{\mathrm{c}}

(58)

for all $k\in\{0,1,\ldots,N-1\}$ by using standard set computation techniques [3]. Similarly, by introducing the Markovian kernel

\forall X^{\prime}\in\mathcal{B}(X),\quad\Delta[x](X^{\prime})\overset{\mathrm{def}}{=}\omega(X^{\prime}\oplus\{-(A+BK)x\})\;,

we can pre-compute offset-free measures $q_{k}$ via the Markovian recursion

	$\displaystyle\forall k\in\mathbb{N},\quad q_{k+1}$	$\displaystyle=$	$\displaystyle\int\Delta[x]\,q_{k}(\mathrm{d}x)$		(59)
	$\displaystyle q_{0}$	$\displaystyle=$	$\displaystyle\delta_{0}\;.$

For example, if $\omega$ denotes a uniform probability measure with compact zonotopic support, the measures $q_{k}$ can be computed with high precision by using a generalized Lyapunov recursion in combination with a Gram-Charlier expansion [39]. After this preparation, we can pre-compute Chebyshev representations of the functions

	$\displaystyle S_{k}(z,u)$	$\displaystyle=$	$\displaystyle\max_{z_{\mathrm{c}}\in Z_{k}}\int\Theta(z+z_{\mathrm{c}}+\overline{z},u)\,q_{k}(\mathrm{d}\overline{z})$
	$\displaystyle\text{and}\qquad S_{N}(z)$	$\displaystyle=$	$\displaystyle\max_{z_{\mathrm{c}}\in Z_{N}}\int\Pi(z+z_{\mathrm{c}}+\overline{z})\,q_{N}(\mathrm{d}\overline{z})$

with high accuracy, as discussed in [39], too. Here, the function $\Theta$ and $\Pi$ are constructed as in the previous section. In particular, if $\Theta$ is convex, as in Example 6, $S_{k}$ is convex. Similarly, if $\Pi$ is convex, $S_{N}$ is convex. Finally, the associated online optimization problem,

	$\displaystyle\mathcal{V}(y)\;=$	$\displaystyle\underset{v,z}{\min}$	$\displaystyle\sum_{k=0}^{N-1}S_{k}(z_{k},v_{k})+S_{N}(z_{N})$		(63)
		$\displaystyle\mathrm{s.t.}$	$\displaystyle\left\{\begin{array}[]{l}\forall k\in\{0,1,\ldots,N-1\},\\[4.55254pt] z_{k+1}=Az_{k}+Bv_{k}\\[4.55254pt] z_{0}=y\;,\end{array}\right.$		(63)

can be solved with existing MPC software. The associated ambiguity tube MPC feedback has then, by construction, the form

\mu_{\mathrm{MPC}}(y)=v_{0}^{\star}(y)\;,

where $v_{0}^{\star}(y)$ denotes the first element of an optimal control input sequence of (63) as a function of $y$ . This construction can be further refined by implementing the hybrid control law from Remark 9.

Refer to caption — Figure 1: Randomly generated closed-loop scenarios of the ambiguity tube MPC controller (63) with (LEFT) $\tau=10^{-1}$ and (RIGHT) $\tau=10$ . The state constraint, $x_{1}\leq 5$ , is visualized in the form of the dark gray-shaded infeasible region $x_{1}>5$ . For the very small penalty parameter $\tau=\frac{1}{10}$ the controller risks marginal constraint violation for the sake of better nominal control performance. All trajectories converge to the light gray-shaded target region $Y^{\star}$ . Notice that the initial state, $y_{0}=[-60,5]^{\intercal}$ , would be far off to the left of the figure and it is therefore not visualized—the colored dots correspond to the discrete-time states after the first MPC iteration.

4.4 Numerical Illustration

This section illustrates the performance of the above ambiguity tube MPC controller for the nonlinear system

	$\displaystyle x_{1}^{+}$	$\displaystyle=$	$\displaystyle-\frac{x_{1}}{8}+x_{2}+u_{1}+w_{1},$
	$\displaystyle x_{2}^{+}$	$\displaystyle=$	$\displaystyle-\frac{x_{1}}{2}+\frac{x_{2}}{4}+u_{2}+\cos(x_{1})\sin(5x_{2})^{3}+w_{2}\;.$

In order to write this system in the above form, we introduce the notation

A\,\overset{\mathrm{def}}{=}\,\left(\begin{array}[]{rr}-\frac{1}{8}&1\\[4.55254pt] -\frac{1}{2}&\frac{1}{4}\end{array}\right)\;,\quad B\,\overset{\mathrm{def}}{=}\,I\quad\text{and}\quad K\,\overset{\mathrm{def}}{=}\,\left(\begin{array}[]{rr}\frac{1}{8}&-\frac{1}{2}\\[4.55254pt] \frac{1}{4}&-\frac{1}{4}\end{array}\right)

to denote the system matrices and a suitable ancillary control gain. Notice that the influence of the nonlinear term, $\cos(x_{1})\sin(5x_{2})^{3}$ , can be over-estimated by introducing the central set $W_{\mathrm{c}}\overset{\mathrm{def}}{=}\{0\}\times[-1,1]$ (see Remark 8). Additionally, we set $U_{\mathrm{c}}\overset{\mathrm{def}}{=}[-10,10]^{2}$ . Moreover, the uncertain input is modeled by the distribution measure

\omega(A)\overset{\mathrm{def}}{=}\int_{A}\rho(w)\mathrm{d}w

with Radon-Nikodyn derivative (density function)

\rho(w)\;\overset{\mathrm{def}}{=}\;\left\{\begin{array}[]{ll}\mathfrak{d}(w_{1})&\text{if}\;w_{2}\in[-1,1]\\ 0&\text{otherwise}\end{array}\right\}\;,

where $\mathfrak{d}\overset{\mathrm{def}}{=}\partial\delta/\partial w$ denotes the standard Dirac distribution. The objective is constructed as in Example 6,

\Theta(x,u)\;\overset{\mathrm{def}}{=}\;\|u\|_{2}^{2}+\mathrm{dist}_{2}(x,Y^{\star})^{2}+\tau\cdot\mathrm{dist}_{1}(x,\mathbb{X})\;,

where $\tau>0$ is a risk-parameter that is associated with the given state constraint set

\mathbb{X}\;\overset{\mathrm{def}}{=}\;\left\{\;x\in\mathbb{R}^{2}\;\middle|\;x_{1}\leq 5\;\right\}\;.

Here, it is not difficult to check that the function

\Pi(x)\;\overset{\mathrm{def}}{=}\;\left[2+\frac{7}{27}\cdot\tau\right]\mathrm{dist}_{2}(x,Y^{\star})^{2}

satisfies the requirements from Lemma 3. Consequently, the associated ambiguity measures $L$ and $M$ satisfy all technical requirements of Theorem 4. Notice that the functions $S_{k}$ in (63) are in our implementation pre-computed with high precision such that the convex optimization problem (63) can be solved in much less than $1\,\mathrm{ms}$ by using ACADO Toolkit [14]. The prediction horizon of the MPC controller is set to $N=5$ .

Figure 1 shows two ambiguity tube MPC closed-loop simulations that are both started at the initial point $y_{0}=[-60,5]^{\intercal}$ . In the left figure, we have set $\tau=\frac{1}{10}$ , which means that the constraint violation penalty is small compared to the nominal control performance objective. Consequently, during the randomly generated closed-loop scenarios marginal constraint violation can be observed. This is in contrast to the right part of Figure 1, which shows randomly generated closed-loop scenarios for the case $\tau=10$ , leading to much smaller expected constraint violations at risk. In all cases, that is, independent of how the penalty parameter $\tau>0$ is chosen and independent of the particular uncertainty scenario, the closed-loop trajectories converge to the terminal region $Y^{\star}$ after a short transition period. In this particular example, we observe that this happens typically after $3$ to $8$ discrete-time steps, which confirms the robust asymptotic convergence statement of Theorem 4.

5 Conclusion

This paper has presented a coherent measure-theoretic framework for analyzing the stability of a rather general class of ambiguity tube MPC controllers. In detail, we have proposed a Wasserstein-Hausdorff metric leading to our first main result in Theorem 1, where conditions for the existence of a continuous value function of ambiguity tube MPC controllers have been established. Moreover, Theorem 2 has built upon this topological framework to establish conditions under which the stage and terminal cost are proper ambiguity measures, such that the cost function of the MPC controller can be turned into a non-negative supermartingale along the trajectories of the stochastic closed-loop system. Related stochastic stability and convergence results for ambiguity tube MPC have been summarized in Theorems 3 and 4.

In the sense that Lemma 3 proposes a practical strategy for constructing stabilizing terminal costs for stochastic and ambiguity tube MPC, the current article has outlined a path towards a more consistent stability theory that goes much beyond the existing convergence results from [6, 18, 27]. At the same time, however, it should also be pointed out that these results are based on a slightly different strategy of modeling the stage cost of the MPC controller, as discussed in Remark 9, where it is also explained why it may be advisable to use a hybrid feedback control law that switches to a pre-optimized ancillary controller whenever the state is inside its associated target region $Y^{\star}$ .

Last but not least, as much as this article has been attempting to make a step forward, towards a more consistent stability theory and practical formulation of robust MPC, it should also be stated that many open problems and conceptual challenges remain. In the line of this paper, for instance, a discussion of more advanced representations of ambiguity set representations and handling of nonlinearities, a more in-depth analysis of the interplay of the choice of $\mathcal{U}$ , the performance of the ambiguity tube controller, and its computational tractability, bounds on the concentrations of the state distributions, economic objectives for ambiguity tube MPC, as well as a deeper analysis of risk measures and related issues of recursive feasibility are only a small and incomplete selection of open problems in the field of distributionally robust MPC.

References

[1] C. Berge. Topological Spaces. Oliver and Boyd, 1963.
[2] D. Bernardini and A. Bemporad. Stabilizing model predictive control of stochastic constrained linear systems. IEEE Transactions on Automatic Control, 57(6):1468–1480, 2012.
[3] F. Blanchini and S. Miani. Set-theoretic methods in control. Birkhäuser, 2015.
[4] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[5] R.S. Bucy. Stability and positive supermartingales. Journal of Differential Equations, 1:151–155, 1965.
[6] D. Chatterjee and J. Lygeros. On stability and performance of stochastic predictive control techniques. IEEE Transactions on Automatic Control, 60:509–514, 2015.
[7] H. Chen and F. Allgöwer. A quasi-infinite horizon nonlinear model predictive control scheme with guaranteed stability. Automatica, 34(10):1205–1217, 1998.
[8] F. Clarke, Y. Ledyaev, and R. Stern. Asymptotic stability and smooth Lyapunov functions. Journal of Differential Equations, 149(1):69–114, 1998.
[9] J.L. Doob. Stochastic Processes. Wiley, 1953.
[10] W. Feller. Introduction to Probability Theory and Its Applications. Wiley, 1971.
[11] M.J. Grimm, G. Messina, S.E. Tuna, and A.R. Teel. Examples when nonlinear model predictive control is nonrobust. Automatica, 40:1729–1738, 2004.
[12] L. Grüne. Analysis and design of unconstrained nonlinear MPC schemes for finite and infinite dimensional systems. SIAM Journal on Control and Optimization, 48(2):1206–1228, 2009.
[13] L. Hewing, K.P. Wabersich, M. Menner, and M.N. Zeilinger. Learning-based model predictive control: Toward safe learning in control. Annual Review of Control, Robotics, and Autonomous Systems, 3:269–296, 2020.
[14] B. Houska, H.J. Ferreau, and M. Diehl. An auto-generated real-time iteration algorithm for nonlinear MPC in the microsecond range. Automatica, 47:2279–2285, 2011.
[15] B. Houska and M.E. Villanueva. Robust optimzation for MPC. In S. Raković and W. Levine, editors, Handbook of Model Predictive Control, pages 413–443. Birkäuser, 2019.
[16] L.V. Kantorovich. On the translocation of masses. Journal of Mathematical Sciences, 133(4):1381–1382, 2006.
[17] B. Kouvaritakis and M. Cannon. Model Predictive Control: Classical, Robust and Stochastic. Springer, 2015.
[18] B. Kouvaritakis and M. Cannon. Feasibility, Stability, Convergence and Markov Chains. In: Model Predictive Control. Advanced Textbooks in Control and Signal Processing, pages 271–301. Springer, 2016.
[19] H.J. Kushner. On the stability of stochastic dynamical systems. Proceedings of the National Academy of Sciences, 53(1):8–12, 1965.
[20] H.J. Kushner. A partial history of the early development of continuous-time nonlinear stochastic systems theory. Automatica, 50(2):303–334, 2014.
[21] W. Langson, I. Chryssochoos, S.V. Raković, and D.Q. Mayne. Robust model predictive control using tubes. Automatica, 40(1):125–133, 2004.
[22] Y. Ledyaev and E. Sontag. A Lyapunov characterization of robust stabilization. Nonlinear Analysis, 37:813–840, 1999.
[23] M. Lorenzen, F. Dabbene, R. Tempo, and F. Allgöwer. Constraint-tightening and stability in stochastic model predictive control. IEEE Transactions on Automatic Control, 2016.
[24] D.Q. Mayne. Robust and stochastic MPC: are we going in the right direction? IFAC-PapersOnLine, 48(23):1–8, 2015.
[25] D.Q. Mayne, M.M. Seron, and S. Raković. Robust model predictive control of constrained linear systems with bounded disturbances. Automatica, 41(2):219–224, 2005.
[26] A. Mesbah. Stochastic model predictive control: An overview and perspectives for future research. IEEE Control Systems Magazine, 36(6):30–44, 2016.
[27] D. Munoz-Carpintero and M. Cannon. Convergence of stochastic nonlinear systems and implications for stochastic model predictive control. IEEE Transactions on Automatic Control, pages 1–8, 2020 (early access).
[28] S. Raković, B. Kouvaritakis, R. Findeisen, and M. Cannon. Homothetic tube model predictive control. Automatica, 48(8):1631–1638, 2012.
[29] J.B. Rawlings, D.Q. Mayne, and M.M. Diehl. Model Predictive Control: Theory and Design. Madison, WI: Nob Hill Publishing, 2018.
[30] R.T. Rockafellar and S. Uryasev. The fundamental risk quadrangle in risk management, optimization and statistical estimation. Surveys in Operations Research and Management Science, 18:33–53, 2013.
[31] R.T. Rockafellar and R.J. Wets. Variational Analysis. Springer, 2005.
[32] M.A. Sehr and R.R. Bitmead. Stochastic output-feedback model predictive control. Automatica, 94:315–323, 2018.
[33] P. Sopasakis, D. Herceg, A. Bemporad, and P. Patrinos. Risk-averse model predictive control. Automatica, 100:281–288, 2019.
[34] J.C. Taylor. An introduction to measure and probability. Springer, 1996.
[35] B. Van Parys, D. Kuhn, P. Goulart, and M. Morari. Distributionally robust con- trol of constrained stochastic systems. IEEE Transactions on Automatic Control, 61(2):430–442, 2016.
[36] L.N. Vasershtein. Markov processes over denumerable products of spaces describing large system of automata. Problemy Peredači Informacii, 5(3):64–72, 1969.
[37] C. Villani. Optimal transport, old and new. Springer, 2005.
[38] M.E. Villanueva, E. De Lazzari., M.A. Müller, and B. Houska. A set-theoretic generalization of dissipativity with applications in Tube MPC. Automatica, 122(109179), 2020.
[39] M.E. Villanueva and B. Houska. On stochastic linear systems with zonotopic support sets. Automatica, 111(108652), 2020.
[40] M.E. Villanueva, R. Quirynen, M. Diehl, B. Chachuat, and B. Houska. Robust MPC via min–max differential inequalities. Automatica, 77:311–321, 2017.
[41] M. Zanon and S. Gros. Safe reinforcement learning using robust MPC. IEEE Transactions on Automatic Control, 66(8):3638–3652, 2021.