
Robust Control Lyapunov-Value Functions for Nonlinear Disturbed Systems

Zheng Gong    Sylvia Herbert    La Jolla, San Diego
Abstract

Control Lyapunov Functions (CLFs) have been extensively used in the control community. A well-known drawback is the absence of a systematic way to construct CLFs for general nonlinear systems, and the problem becomes more complex with input or state constraints. Our preliminary work on constructing Control Lyapunov Value Functions (CLVFs) using Hamilton-Jacobi (HJ) reachability analysis provides a method for finding a non-smooth CLF. In this paper, we extend our work on CLVFs to systems with bounded disturbance and define the Robust CLVF (R-CLVF). The R-CLVF naturally inherits all properties of the CLVF; i.e., it first identifies the "smallest robust control invariant set (SRCIS)" and stabilizes the system to it with a user-specified exponential rate. The region from which the exponential rate can be met is called the "region of exponential stabilizability (ROES)." We provide clearer definitions of the SRCIS and more rigorous proofs of several important theorems. Since the computation of the R-CLVF suffers from the "curse of dimensionality," we also provide two techniques (warmstart and system decomposition) that mitigate it, along with the necessary proofs. Three numerical examples are provided, validating our definition of the SRCIS, illustrating the trade-off between a faster decay rate and a smaller ROES, and demonstrating the efficiency of computation using warmstart and decomposition.

keywords:
Optimal Control; HJ Reachability Analysis; Control Lyapunov Function.
All authors are with the Department of Mechanical and Aerospace Engineering, UC San Diego. {zhgong, sherbert}@ucsd.edu. This work is supported by ONR YIP N00014-22-1-2292.


1 Introduction

Liveness and safety are two main concerns for autonomous systems working in the real world. Using control Lyapunov functions (CLFs) to stabilize the trajectories of a system to an equilibrium point [1, 2, 3] is a popular approach to ensure liveness, whereas using control barrier functions (CBFs) to guarantee forward control invariance is popular for maintaining safety [4, 5, 6]. However, there is no general procedure for finding CLFs and CBFs, so users of these methods typically rely on hand-designed or application-specific functions [7, 8, 9, 10, 11]. Hand-crafting such functions is itself difficult, especially for high-dimensional systems with state or input constraints.

Liveness and safety can also be achieved by formal methods such as Hamilton-Jacobi (HJ) reachability analysis [12]. This method formulates liveness and safety as optimal control problems, and has been used for applications in aerospace, autonomous driving, and more [13, 14, 15, 16, 17]. It computes a value function whose level sets provide information about safety (or liveness) over space and time, and whose gradients provide the safety (or liveness) controller. This value function can be computed numerically using dynamic programming for general nonlinear systems and can accommodate input and disturbance bounds. Undermining these appealing benefits is the "curse of dimensionality." Ongoing research has improved computational efficiency and refined the approximation [18, 19, 20, 21], but performing dynamic programming in high dimensions (6D or more) remains challenging.

Standard HJ reachability analysis focuses on problems such as minimum time to reach a goal, or avoiding certain states for all time. It does not stabilize a system to a goal after reaching it. In our previous work [22], we modified the value function and defined the control Lyapunov value function (CLVF) for undisturbed systems. The CLVF finds the smallest control invariant set (SCIS) and the region of exponential stabilizability (ROES) of the system. Its gradient can be used to synthesize controllers that stabilize the system to the SCIS with a user-specified exponential rate $\gamma$. It also handles complex dynamics and input bounds well.

However, the previous CLVF work applies only to systems without disturbance, and the term "SCIS" used there does not coincide with the minimal control invariant set as defined in [23, 24]. Further, the "curse of dimensionality" restricts its application to relatively low-dimensional systems (5D or lower). This journal extension addresses these limitations. The main contributions are:

  1. We define the time-varying robust CLVF (TV-R-CLVF) and the robust CLVF (R-CLVF) for systems with bounded disturbance and control. We prove that the R-CLVF is Lipschitz continuous, satisfies the dynamic programming principle, and is the unique viscosity solution to the corresponding R-CLVF variational inequality (VI).

  2. We define the smallest robustly control invariant set (SRCIS) of a system. We show that the SRCIS of a given system is the zero-level set of the computed R-CLVF.

  3. We relax the choice of the loss function to any vector norm. We show that different choices of the norm result in different SRCIS, ROES, and trajectories.

  4. Two methods to accelerate computation are introduced: warmstart R-CLVF and system decomposition. A point-wise optimal R-CLVF quadratic program (QP) controller is provided and the algorithm for computing the R-CLVF is updated.

  5. We provide numerical examples to validate the theory and show numerical efficiency with warmstart R-CLVF and system decomposition.

The paper is organized as follows. Sec 2 provides background information on HJ reachability analysis and the CLVF. Sec 3 introduces the TV-R-CLVF, builds up the theoretical foundation for the R-CLVF, and provides an optimal R-CLVF-QP controller. Sec 4 introduces warmstart R-CLVF and system decomposition, which accelerate the computation. Sec 5 shows three numerical examples that validate the theory.

2 Background

In this paper, we seek to exponentially stabilize a given nonlinear time-invariant dynamic system with bounded input and disturbance to its SRCIS. We start by defining crucial terms.

2.1 Problem Formulation

Consider the nonlinear time-invariant system

$$\dot{x}(s)=f\left(x(s),u(s),d(s)\right),\quad s\in[t,0],\quad x(t)=x_{0}, \tag{1}$$

where $t<0$ is the initial time, and $x_{0}\in\mathbb{R}^{n}$ is the initial state. The control signal $u(\cdot)$ and disturbance signal $d(\cdot)$ are drawn from the sets of measurable functions $\mathbb{U}$ and $\mathbb{D}$, respectively. Assume also that the control input $u$ and disturbance $d$ take values in convex compact sets $\mathcal{U}\subset\mathbb{R}^{m}$ and $\mathcal{D}\subset\mathbb{R}^{p}$, respectively. We have:

$$u(\cdot):[t,0]\mapsto\mathcal{U},\qquad d(\cdot):[t,0]\mapsto\mathcal{D}.$$

Assume the dynamics $f:\mathbb{R}^{n}\times\mathcal{U}\times\mathcal{D}\mapsto\mathbb{R}^{n}$ is uniformly continuous in $(x,u,d)$, Lipschitz continuous in $x$ for fixed $u(\cdot)$ and $d(\cdot)$, and bounded $\forall x\in\mathbb{R}^{n},u\in\mathcal{U},d\in\mathcal{D}$. Under these assumptions, given initial state $x$ and control and disturbance signals $u(\cdot)$, $d(\cdot)$, there exists a unique solution $\xi(s;t,x,u(\cdot),d(\cdot))$, $s\in[t,0]$, of the system (1). When the initial condition, control, and disturbance signal used are not important, we use $\xi(s)$ to denote the solution, which is also called the trajectory in this paper. Further assume the disturbance signal can be determined as a strategy with respect to the control signal: $\lambda:\mathbb{U}\mapsto\mathbb{D}$, drawn from the set of non-anticipative maps $\lambda\in\Lambda$ [25].

In this paper, we seek to stabilize the system (1) to its SRCIS. We first introduce the notion of a robustly control invariant set.

Definition 1.

(Robustly Control Invariant Set.) A closed set $\mathcal{I}$ is robustly control invariant for (1) if $\forall x\in\mathcal{I}$, $\forall\lambda\in\Lambda$, $\exists u(\cdot)\in\mathbb{U}$ such that $\xi(s;t,x,u(\cdot),\lambda[u])\in\mathcal{I}$, $\forall s\in[t,0]$.

When the system has equilibrium points, we assume $0$ is one, i.e., $f(0,0,0)=0$. When the system does not have an equilibrium point, we assume it has some robustly control invariant set around the origin.

We are also interested in finding the region of exponential stabilizability (ROES) of a set. We first define the distance from a point to a set $\mathcal{A}$ to be

$$\mathrm{dst}(x;\mathcal{A})=\min_{a\in\partial\mathcal{A}}\|x-a\|, \tag{2}$$

where $\partial\mathcal{A}$ is the boundary of $\mathcal{A}$ and any vector norm is applicable here.

Definition 2.

The ROES of a set $\mathcal{I}$ is the set of states from which the trajectory converges to $\mathcal{I}$ with an exponential rate $\gamma$:

$$\begin{aligned}\mathcal{D}_{\text{ROES}}:=\bigl\{x\in\mathbb{R}^{n}\;\big|\;&\forall\lambda\in\Lambda,\ \exists u(\cdot)\in\mathbb{U},\ \gamma,k>0\ \text{ s.t. }\\ &\mathrm{dst}(\xi(s;t,x,u(\cdot),\lambda[u]);\mathcal{I})\leq ke^{-\gamma(s-t)}\,\mathrm{dst}(x;\mathcal{I})\bigr\}.\end{aligned}$$

2.2 HJ Reachability and CLVF

In the conference version [22], we proposed to construct the CLVF using HJ reachability analysis. This is done by formulating a reachability safety problem, where the system tries to avoid all regions of the state space that are not the origin. This problem can be solved as an optimal control problem.

Traditionally in HJ reachability analysis, the continuous loss function $\bar{\ell}:\mathbb{R}^{n}\mapsto\mathbb{R}$ is defined such that its zero super-level set is the failure set $\mathcal{F}=\{x:\bar{\ell}(x)\geq 0\}$.

The finite-time horizon cost function captures whether a trajectory enters $\mathcal{F}$ at any time in $[t,0]$ under given control and disturbance signals:

$$J(t,x,u(\cdot),d(\cdot))=\max_{s\in[t,0]}\bar{\ell}\bigl(\xi(s;t,x,u(\cdot),d(\cdot))\bigr). \tag{3}$$

The value function is the cost under the optimal control signal and the worst-case disturbance:

$$\begin{aligned}V(x,t)&=\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}J(t,x,u(\cdot),\lambda[u])\\ &=\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}\max_{s\in[t,0]}\bar{\ell}\bigl(\xi(s;t,x,u(\cdot),\lambda[u])\bigr).\end{aligned} \tag{4}$$

The value function is the viscosity solution to the Hamilton-Jacobi-Isaacs variational inequality (HJI-VI) [26]:

$$0=\min\Bigl\{\bar{\ell}(x)-V(x,t),\ D_{t}V(x,t)+\max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}D_{x}V(x,t)\cdot f(x,u,d)\Bigr\}. \tag{5}$$

Therefore the value function (4) can be computed using dynamic programming by solving this HJI-VI recursively over time. The infinite-time horizon value function is defined by taking the limit of $V(x,t)$ as $t\rightarrow-\infty$ [27],

$$V^{\infty}(x)=\lim_{t\rightarrow-\infty}V(x,t). \tag{6}$$

For the time-varying value function, $V(x,t)\geq 0$ means that, regardless of the control signal used, there always exists a disturbance signal such that the trajectory starting from the point $x$ will enter $\mathcal{F}$ at some time $s\in[t,0]$. The sub-zero level set of $V(x,t)$ is therefore safe for the time horizon $[t,0]$. This can be extended to say that each $\alpha$ sub-level set $\mathcal{V}_{\alpha}=\{x:V(x,t)\leq\alpha\}$ is safe with respect to the set defined by $\mathcal{F}_{\alpha}=\{x:\bar{\ell}(x)\leq\alpha\}$.

In the infinite-time setting, for all states in the $\alpha$ sub-level set of $V^{\infty}(x)$, there always exists a control signal such that the maximum loss is lower than $\alpha$ despite the disturbance signal. This means every $\alpha$ sub-level set of $V^{\infty}(x)$ is robustly control invariant and the trajectories can be maintained within a particular level set boundary. Further, this set is the largest RCIS contained within the $\alpha$ sub-level set of $\bar{\ell}(x)$.

Remark 1.

In this paper, we restrict the selection of $\bar{\ell}(x)$ to vector norms (e.g., a $p$-norm or a weighted $Q$-norm). In other words, the loss function measures the distance of a state to the origin. With this restriction, the cost function (3) captures the largest deviation from the origin of a given trajectory, initialized at $x$ with $u(\cdot)$ and $d(\cdot)$ applied, over the time horizon $[t,0]$. The value function (4) captures the largest deviation under the optimal control and disturbance signals over the finite time horizon, and the infinite-time value function (6) captures the same quantity over the infinite time horizon.

Denote the minimal value of $V^{\infty}(x)$ as $V^{\infty}_{m}:=\min_{x}V^{\infty}(x)$. The $V^{\infty}_{m}$-level set of $V^{\infty}$ is the smallest RCIS (SRCIS), denoted by $\mathcal{I}_{m}$. Further, all the states in the SRCIS have the same value:

$$V^{\infty}_{m}=\max_{a\in\partial\mathcal{I}_{m}}\bar{\ell}(a). \tag{7}$$
Remark 2.

Here the term smallest should be understood as 'smallest distance to the origin measured by $\bar{\ell}(x)$,' and the SRCIS should be understood as 'the largest RCIS, with the smallest distance to the origin.' This is different from the 'minimal RCIS' as defined in [23] (where 'minimal' is defined as 'no subset is robustly control invariant').

Figure 1: SRCIS corresponding to different loss functions for system (8). Top, left to right: R-CLVF when $\bar{\ell}(x)=\|x\|_{2}$, $\|x\|_{\infty}$, and $\|x\|_{Q}$, where $\|x\|_{Q}=\sqrt{x^{T}Qx}$ with $Q=\mathrm{diag}[0.2,1]$. Bottom, left to right: the corresponding SRCIS and a trajectory starting inside the SRCIS. The robust control invariance is validated.

The following example illustrates this difference:

$$\dot{x}=-x+d,\quad\dot{y}=y+u, \tag{8}$$

where $u\in[-1,1]$ and $d\in[-0.5,0.5]$. This system has an undisturbed, uncontrolled equilibrium point $[x,u,d]=[0,0,0]$. It can be verified that $\mathcal{I}=\{(x,y):x\in[-0.5,0.5],\ y=0\}$ is one 'minimal RCIS,' as none of its proper subsets is robustly control invariant. In fact, picking any $y\in[-1,1]$ results in a 'minimal RCIS.' On the other hand, picking $\bar{\ell}(x)=\|x\|_{\infty}$, the SRCIS is $\mathcal{I}_{m}=\{(x,y):x,y\in[-0.5,0.5]\}$. This is because, although the control can stabilize any $|y|<1$ to the origin, the disturbance is also strong enough to push any $|x|<0.5$ away from the origin. Therefore, all states s.t. $x,y\in[-0.5,0.5]$ have the same value, and the SRCIS measured by the $\infty$-norm is a square. Fig. 1 shows the SRCIS for three different choices of $\bar{\ell}(x)$ and the corresponding value function.
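The invariance of the square $[-0.5,0.5]^2$ for system (8) can be checked numerically. The following is a minimal sketch under stated assumptions, not the R-CLVF controller of this paper: it uses a forward-Euler rollout, a randomly switching extreme disturbance, and a hand-picked saturated feedback $u=-2y$ clipped to $[-1,1]$. The function name `simulate`, the step size, and the initial states are hypothetical choices made only for illustration.

```python
import random

def simulate(x0, y0, steps=5000, dt=1e-3, seed=0):
    """Forward-Euler rollout of system (8); returns the largest infinity-norm seen."""
    rng = random.Random(seed)
    x, y = x0, y0
    worst = max(abs(x), abs(y))
    for _ in range(steps):
        # Illustrative disturbance: a random extreme value of d in [-0.5, 0.5].
        d = rng.choice((-0.5, 0.5))
        # Illustrative feedback (an assumption): u = -2y saturated to [-1, 1].
        u = max(-1.0, min(1.0, -2.0 * y))
        x += dt * (-x + d)   # x_dot = -x + d
        y += dt * (y + u)    # y_dot =  y + u
        worst = max(worst, abs(x), abs(y))
    return worst

# Start on the boundary and in the interior of the candidate SRCIS.
starts = [(0.5, 0.5), (-0.5, 0.3), (0.1, -0.5), (0.0, 0.0)]
worst_case = max(simulate(x0, y0, seed=i) for i, (x0, y0) in enumerate(starts))
print("largest infinity-norm along all rollouts:", worst_case)
```

Under these assumptions every rollout stays inside the square, which is consistent with the SRCIS measured by the $\infty$-norm. A random disturbance is only a spot check; it does not replace the worst-case analysis.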

An interesting observation is that adding a constant to, or subtracting a constant from, the loss function $\bar{\ell}(x)$ leaves the corresponding SRCIS unchanged.

Proposition 1.

Define $\ell(x)=\bar{\ell}(x)-a$, and denote the corresponding value function as $\underline{V}(x,t)$, then

$$\begin{aligned}\underline{V}(x,t)&=\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}\max_{s\in[t,0]}\ell\bigl(\xi(s;t,x,u(\cdot),\lambda[u])\bigr)\\ &=\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}\max_{s\in[t,0]}\bigl(\bar{\ell}(\xi(s;t,x,u(\cdot),\lambda[u]))-a\bigr)\\ &=V(x,t)-a.\end{aligned} \tag{9}$$

This means that adding a constant to (or subtracting a constant from) the loss function shifts the value function by the same constant, so the level sets, and in particular the SRCIS, are unchanged.
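The shift property in (9) only uses the fact that a constant offset commutes with each of the nested max and min operations. The sketch below illustrates this on a finite toy problem: a table of sampled losses indexed by a disturbance candidate (row), a control candidate (column), and time samples. The table entries and the name `value` are hypothetical, and the toy deliberately ignores the non-anticipative strategy structure of $\Lambda$.

```python
def value(loss_table, shift=0.0):
    """max over disturbance rows, min over control columns, max over time samples."""
    return max(
        min(max(v - shift for v in samples) for samples in row)
        for row in loss_table
    )

# Hypothetical sampled losses: loss_table[d][u] is a list of loss values over time.
loss_table = [
    [[0.2, 0.9, 0.4], [0.3, 0.5, 0.1]],
    [[0.7, 0.6, 0.8], [1.0, 0.2, 0.3]],
]
a = 0.25

v_original = value(loss_table)           # value with the unshifted loss
v_shifted = value(loss_table, shift=a)   # value with the loss shifted by -a
print(v_original, v_shifted, v_original - a)
```

In this toy case the shifted value equals the original value minus `a`, mirroring $\underline{V}(x,t)=V(x,t)-a$.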

However, each level set of the HJ value function (6) is only robustly control invariant; there is no guarantee that the system can be stabilized to lower level sets or to the origin. In our preliminary paper [22], we defined the control Lyapunov-value function (CLVF) for undisturbed systems. We proved that the CLVF satisfies the dynamic programming principle and is the unique viscosity solution to the corresponding CLVF-VI. We also proved that the domain of the CLVF is the ROES of the SCIS. A feasibility-guaranteed QP was provided for controller synthesis.

In this article, we further develop the theory of CLVFs for disturbed systems and provide necessary theorems for numerical implementation in high-dimensional nonlinear systems.

3 Robust Control Lyapunov-Value functions

In this section, we start by defining the TV-R-CLVF and prove some important properties of it. We then define the R-CLVF, which is the limit function of the TV-R-CLVF. We show that the existence of the R-CLVF is equivalent to the exponential stabilizability of the system to its SRCIS and that its domain is the ROES.

3.1 TV-R-CLVF

Definition 3.

A TV-R-CLVF is a function $V_{\gamma}(x,t):\mathbb{R}^{n}\times\mathbb{R}_{-}\rightarrow\mathbb{R}$ defined as:

$$V_{\gamma}(x,t)=\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}\max_{s\in[t,0]}e^{\gamma(s-t)}\ell\bigl(\xi(s;t,x,u(\cdot),\lambda[u])\bigr), \tag{10}$$

where the corresponding cost function $J_{\gamma}(t,x,u(\cdot),d(\cdot))$ is

$$J_{\gamma}(t,x,u(\cdot),d(\cdot))=\max_{s\in[t,0]}e^{\gamma(s-t)}\ell\bigl(\xi(s;t,x,u(\cdot),d(\cdot))\bigr), \tag{11}$$

$\gamma\geq 0$ is a user-specified parameter that represents the desired decay rate, and $\ell(x)=\bar{\ell}(x)-V^{\infty}_{m}$.

The cost at a state captures the maximum exponentially amplified distance between the trajectory starting from this state and the zero-level set of $\ell(x)$ (positive outside and negative inside). The optimal control tries to minimize this cost and seeks to drive the system towards the origin. In contrast, the disturbance tries to maximize the cost and push the system away from the origin.
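To make the role of the weight $e^{\gamma(s-t)}$ concrete, the sketch below evaluates a sampled version of the cost (11) along a fixed, hypothetical scalar trajectory $\xi(s)=e^{-(s-t)}$, which decays with rate 1. It assumes $V^{\infty}_{m}=0$, so that $\ell(x)=|x|$; the helper name `cost_gamma` and the sampling grid are illustrative choices.

```python
import math

def cost_gamma(times, states, loss, gamma):
    """Sampled version of (11): max over s of exp(gamma * (s - t)) * loss(xi(s))."""
    t0 = times[0]
    return max(math.exp(gamma * (s - t0)) * loss(x) for s, x in zip(times, states))

# Hypothetical trajectory xi(s) = exp(-(s - t)) on [t, 0] with t = -2.
t, n = -2.0, 200
times = [t + k * (-t / n) for k in range(n + 1)]
states = [math.exp(-(s - t)) for s in times]
loss = abs  # assumes V_m = 0, so ell(x) = |x|

J_slow = cost_gamma(times, states, loss, gamma=0.5)  # requested rate below the true rate
J_fast = cost_gamma(times, states, loss, gamma=2.0)  # requested rate above the true rate
print(J_slow, J_fast)
```

For this trajectory the weighted loss is $e^{(\gamma-1)(s-t)}$: with $\gamma=0.5$ the maximum sits at the initial time and the cost stays at 1, whereas with $\gamma=2$ the maximum sits at $s=0$ and grows like $e^{-t}$ as the horizon lengthens. This is the mechanism behind the trade-off between the decay rate and the region where the limit exists.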

Proposition 2.

The TV-R-CLVF is bounded and Lipschitz in $x$ on any compact set $\mathcal{C}$.

Proof.

Since the solution exists and the loss function is chosen to be a vector norm, within any finite time horizon $[t,0]$ the cost function $J_{\gamma}(t,x,u(\cdot),d(\cdot))$ is bounded, and this holds for all control and disturbance signals. Therefore the TV-R-CLVF is also bounded.

For the local Lipschitz property, we start by proving that the cost function is locally Lipschitz continuous in $x$. Because of the continuous dependence on the initial condition, $\forall x,y\in\mathcal{C}$, there exists a constant $c>0$ such that

$$\|\xi(s;t,x,u(\cdot),d(\cdot))-\xi(s;t,y,u(\cdot),d(\cdot))\|\leq c\|x-y\|$$

(see inequality (3.16) in [13]). Since $\ell(x)=\|x\|-V^{\infty}_{m}$, using the triangle inequality of vector norms, we have

$$\begin{aligned}&\bigl|\ell\bigl(\xi(s;t,x,u(\cdot),d(\cdot))\bigr)-\ell\bigl(\xi(s;t,y,u(\cdot),d(\cdot))\bigr)\bigr|\\ =\ &\bigl|\,\|\xi(s;t,x,u(\cdot),d(\cdot))\|-\|\xi(s;t,y,u(\cdot),d(\cdot))\|\,\bigr|\\ \leq\ &\|\xi(s;t,x,u(\cdot),d(\cdot))-\xi(s;t,y,u(\cdot),d(\cdot))\|\\ \leq\ &c\|x-y\|.\end{aligned}$$

Multiplying both sides by $e^{\gamma(s-t)}$, we get

$$e^{\gamma(s-t)}\bigl|\ell\bigl(\xi(s;t,x,u(\cdot),d(\cdot))\bigr)-\ell\bigl(\xi(s;t,y,u(\cdot),d(\cdot))\bigr)\bigr|\leq e^{\gamma(s-t)}c\|x-y\|. \tag{12}$$

Further, we have:

$$\begin{aligned}&\bigl|J_{\gamma}(t,x,u,d)-J_{\gamma}(t,y,u,d)\bigr|\\ =\ &\Bigl|\max_{s\in[t,0]}e^{\gamma(s-t)}\ell\bigl(\xi(s;t,x,u(\cdot),d(\cdot))\bigr)-\max_{s\in[t,0]}e^{\gamma(s-t)}\ell\bigl(\xi(s;t,y,u(\cdot),d(\cdot))\bigr)\Bigr|\\ \leq\ &\max_{s\in[t,0]}\Bigl|e^{\gamma(s-t)}\ell\bigl(\xi(s;t,x,u(\cdot),d(\cdot))\bigr)-e^{\gamma(s-t)}\ell\bigl(\xi(s;t,y,u(\cdot),d(\cdot))\bigr)\Bigr|\\ \leq\ &\max_{s\in[t,0]}e^{\gamma(s-t)}c\|x-y\|=e^{-\gamma t}c\|x-y\|.\end{aligned}$$

This shows the cost function is Lipschitz in $x$ with Lipschitz constant $e^{-\gamma t}c$. Since the above conclusion holds for arbitrary control and disturbance signals, we conclude that the TV-R-CLVF is also Lipschitz with the same Lipschitz constant:

$$|V_{\gamma}(x,t)-V_{\gamma}(y,t)|\leq e^{-\gamma t}c\|x-y\|.$$

Denote the zero-level set of the TV-R-CLVF as

$$\mathcal{Z}_{\gamma}(t):=\{x:V_{\gamma}(x,t)=0\}. \tag{13}$$

An important property of the TV-R-CLVF is that for all different $\gamma\geq 0$, the zero-level sets at a given time $t$ are the same.

Lemma 1.

For all $\gamma\geq 0$, the sets $\mathcal{Z}_{\gamma}(t)$ are the same.

Proof.

Assume $0\leq\gamma_{1}<\gamma_{2}$. We prove the following: $x\in\mathcal{Z}_{\gamma_{1}}(t)\iff x\in\mathcal{Z}_{\gamma_{2}}(t)$.

($\Rightarrow$) We first prove $x\in\mathcal{Z}_{\gamma_{1}}(t)\implies x\in\mathcal{Z}_{\gamma_{2}}(t)$.

Since $e^{\gamma_{1}(s-t)}>0$ for all $s\in[t,0]$, equation (10) implies that $\ell(\xi(s))$ must remain non-positive for all $s\in[t,0]$; otherwise $V_{\gamma_{1}}(x,t)>0$. Further, there must exist a $t_{1}\in[t,0]$ s.t. $\ell(\xi(t_{1};t,x,u^{*}(\cdot),d^{*}(\cdot)))=0$, where $d^{*}(\cdot)$ and $u^{*}(\cdot)$ are the optimal disturbance and control signals. With the same control and disturbance signals, we have

$$J_{\gamma_{2}}(t,x,u^{*}(\cdot),d^{*}(\cdot))=0.$$

If $d^{*}(\cdot)$ and $u^{*}(\cdot)$ are not the optimal disturbance and control for the TV-R-CLVF with $\gamma_{2}$, then there must exist $u_{1}(\cdot)$ and $d_{1}(\cdot)$ s.t. $\ell(\xi(s;t,x,u_{1}(\cdot),d_{1}(\cdot)))<0$ for all $s\in[t,0]$. However, if this is the case, then applying $u_{1}(\cdot)$ and $d_{1}(\cdot)$ to the TV-R-CLVF with $\gamma_{1}$ gives

$$J_{\gamma_{1}}(t,x,u_{1}(\cdot),d_{1}(\cdot))<0=V_{\gamma_{1}}(x,t),$$

which contradicts the assumption. Therefore $d^{*}(\cdot)$ and $u^{*}(\cdot)$ are optimal for the TV-R-CLVF with $\gamma_{2}$, i.e., $V_{\gamma_{2}}(x,t)=0$.

($\Leftarrow$) Switching $\gamma_{1}$ and $\gamma_{2}$ and following the same process, we get $x\in\mathcal{Z}_{\gamma_{2}}(t)\implies x\in\mathcal{Z}_{\gamma_{1}}(t)$.

The essence of this lemma is that $\forall x\in\mathcal{Z}_{\gamma}(t)$, if $u(\cdot)$ and $d(\cdot)$ are optimal w.r.t. $\gamma_{1}$, they are also optimal w.r.t. all $\gamma$.
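The mechanism behind Lemma 1 is that a strictly positive weight $e^{\gamma(s-t)}$ cannot change the sign of the weighted maximum. The sketch below checks this along fixed, hypothetical loss profiles (it does not perform the max-min optimization over strategies): a profile that is non-positive and touches zero yields a weighted maximum of zero for every $\gamma$, and a strictly negative profile stays strictly negative. The name `weighted_max` and the profiles are illustrative assumptions.

```python
import math

def weighted_max(times, losses, gamma):
    """max over samples of exp(gamma * (s - t)) * loss, with t the first sample time."""
    t0 = times[0]
    return max(math.exp(gamma * (s - t0)) * l for s, l in zip(times, losses))

times = [-2.0 + 0.01 * k for k in range(201)]
# Hypothetical loss profile: non-positive everywhere, exactly zero at the midpoint.
touching = [-0.01 * abs(k - 100) for k in range(201)]
# Hypothetical loss profile: strictly negative everywhere.
inside = [l - 0.1 for l in touching]

gammas = (0.0, 0.5, 3.0)
touching_values = [weighted_max(times, touching, g) for g in gammas]
inside_values = [weighted_max(times, inside, g) for g in gammas]
print(touching_values, inside_values)
```

The numerical values of the strictly negative case depend on $\gamma$, but their sign does not, which is why the zero-level set is shared across decay rates.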

We now show that the TV-R-CLVF satisfies the dynamic programming principle and is the unique viscosity solution to the TV-R-CLVF-VI.

Theorem 2.

$V_{\gamma}(x,t)$ satisfies the following dynamic programming principle for all $t<t+\delta\leq 0$:

$$V_{\gamma}(x,t)=\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}\max\Bigl\{e^{\gamma\delta}V_{\gamma}(\xi(t+\delta),t+\delta),\ \max_{s\in[t,t+\delta]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}. \tag{14}$$
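The factor $e^{\gamma\delta}$ in (14) comes from re-referencing the exponential weight from time $t$ to time $t+\delta$. The sketch below checks this splitting along a single fixed trajectory, i.e., for fixed control and disturbance signals and without the outer max-min; the loss profile, the split index, and the name `weighted_max` are hypothetical.

```python
import math

def weighted_max(times, losses, gamma, t_ref):
    """max over samples of exp(gamma * (s - t_ref)) * loss."""
    return max(math.exp(gamma * (s - t_ref)) * l for s, l in zip(times, losses))

t, dt, n = -3.0, 0.01, 300
times = [t + k * dt for k in range(n + 1)]
# Hypothetical loss profile along a fixed trajectory (sign-changing on purpose).
losses = [math.sin(3.0 * s) + 0.2 * s for s in times]
gamma = 0.7
k_split = 120                      # index of the intermediate time t + delta
delta = times[k_split] - t

full = weighted_max(times, losses, gamma, t)
head = weighted_max(times[:k_split + 1], losses[:k_split + 1], gamma, t)
tail = weighted_max(times[k_split:], losses[k_split:], gamma, times[k_split])
recombined = max(math.exp(gamma * delta) * tail, head)
print(full, recombined)
```

Here `tail` plays the role of the cost-to-go from time $t+\delta$, and scaling it by $e^{\gamma\delta}$ recovers the weight measured from $t$.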
Theorem 3.

The TV-R-CLVF is the unique viscosity solution to the following TV-R-CLVF-VI,

$$\max\Bigl\{\ell(x)-V_{\gamma}(x,t),\ D_{t}V_{\gamma}+\max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}D_{x}V_{\gamma}\cdot f(x,u,d)+\gamma V_{\gamma}\Bigr\}=0, \tag{15}$$

with initial condition $V_{\gamma}(x,0)=\ell(x)$.

The proofs of the above two theorems follow analogously from Theorems 2 and 3 in [28] and are omitted here. Here, $H:\mathcal{D}_{\gamma}\times\mathbb{R}\times\mathbb{R}^{n}\rightarrow\mathbb{R}$ is called the Hamiltonian:

$$H(x,v,p)=\max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}p\cdot f(x,u,d)+\gamma v.$$

Further, we can show that the Hamiltonian is a continuous function of $(x,v,p)$. Since $H(x,v,p)$ is affine in $v$, continuity in $v$ follows. Also, $p$ is a continuous function of $(x,v,p)$, and by assumption $f$ is continuous in $(x,v,p)$ for fixed $u$ and $d$. The dot product of two continuous functions is a continuous function, so $p\cdot f(x,u,d)$ is continuous in $(x,v,p)$. This holds for all $u,d$, and since $\mathcal{U}$ and $\mathcal{D}$ are compact and $f$ is uniformly continuous, $\max_{d}\min_{u}p\cdot f(x,u,d)$ is continuous in $(x,v,p)$. In conclusion, $H$ is continuous in $(x,v,p)$.
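As a concrete instance, for the example system (8) the dynamics are affine in $u$ and $d$, so the inner optimization separates and the Hamiltonian has the closed form $H(x,v,p)=-p_{1}x+0.5|p_{1}|+p_{2}y-|p_{2}|+\gamma v$, with state $(x,y)$ and $p=(p_{1},p_{2})$. The sketch below compares this closed form with a brute-force max-min over grids on $[-1,1]$ and $[-0.5,0.5]$. The function names and the test points are hypothetical and serve only as a sanity check of the formula.

```python
def hamiltonian_closed_form(state, v, p, gamma):
    """Closed-form Hamiltonian for system (8): optimal d and u sit at the bounds."""
    x, y = state
    p1, p2 = p
    return -p1 * x + 0.5 * abs(p1) + p2 * y - abs(p2) + gamma * v

def hamiltonian_brute_force(state, v, p, gamma, n=101):
    """max over a grid of d, min over a grid of u, of p . f(x, u, d) + gamma * v."""
    x, y = state
    p1, p2 = p
    us = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]   # u in [-1, 1]
    ds = [-0.5 + 1.0 * i / (n - 1) for i in range(n)]   # d in [-0.5, 0.5]
    return max(
        min(p1 * (-x + d) + p2 * (y + u) for u in us) for d in ds
    ) + gamma * v

# Hypothetical test points (state, value, gradient).
points = [((0.3, -0.2), 0.4, (1.0, -2.0)),
          ((-0.1, 0.7), -0.3, (-0.5, 0.25)),
          ((0.0, 0.0), 0.0, (0.0, 0.0))]
gamma = 0.5
gaps = [abs(hamiltonian_closed_form(s, v, p, gamma)
            - hamiltonian_brute_force(s, v, p, gamma)) for s, v, p in points]
print(gaps)
```

The closed form is continuous and piecewise linear in $p$, consistent with the continuity argument above, though it is not differentiable where $p_{1}=0$ or $p_{2}=0$.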

3.2 R-CLVF

We now turn our attention to the infinite-time horizon and the R-CLVF.

Definition 4.

(Robust Control Lyapunov-Value Function, R-CLVF.) Given a compact set $\mathcal{D}_{\gamma}\subseteq\mathbb{R}^{n}$, the function $V^{\infty}_{\gamma}:\mathcal{D}_{\gamma}\mapsto\mathbb{R}_{+}$ is an R-CLVF if the following limit exists:

$$V^{\infty}_{\gamma}(x)=\lim_{t\rightarrow-\infty}V_{\gamma}(x,t). \tag{16}$$

It should be noted that the domain of the TV-R-CLVF is $\mathbb{R}^{n}$, while for the R-CLVF it is $\mathcal{D}_{\gamma}$. Also, Remark 2 from [22] still holds, i.e., the convergence in equation (16) is uniform in $\mathcal{D}_{\gamma}$. The existence of the R-CLVF on $\mathcal{D}_{\gamma}$ is justified by the following lemma.

Lemma 4.

The R-CLVF exists on a compact set $\mathcal{D}_{\gamma}$ (or $\mathbb{R}^{n}$) if the system is exponentially stabilizable to its SRCIS from $\mathcal{D}_{\text{ROES}}$ (or $\mathbb{R}^{n}$). Further, $\mathcal{D}_{\gamma}=\mathcal{D}_{\text{ROES}}$.

Proof.

Assume the system is exponentially stabilizable to the SRCIS. Using Definition 2, we have $\forall\lambda\in\Lambda$, $\exists u^{*}(\cdot)$ s.t.

$$\mathrm{dst}(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}]);\mathcal{I}_{m})\leq ke^{-\gamma(s-t)}\,\mathrm{dst}(x;\mathcal{I}_{m}).$$

Substituting the definition (2),

$$\min_{a\in\partial\mathcal{I}_{m}}\|\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])-a\|\leq ke^{-\gamma(s-t)}\min_{a\in\partial\mathcal{I}_{m}}\|x-a\|. \tag{17}$$

Substituting $\ell(x)=\bar{\ell}(x)-V^{\infty}_{m}=\|x\|-V^{\infty}_{m}$, we have

$$\begin{aligned}&\ell(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}]))\\ =\ &\bar{\ell}(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}]))-V^{\infty}_{m}\\ =\ &\|\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])\|-\max_{a\in\partial\mathcal{I}_{m}}\|a\|\\ =\ &\min_{a\in\partial\mathcal{I}_{m}}\bigl(\|\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])\|-\|a\|\bigr)\\ \leq\ &\min_{a\in\partial\mathcal{I}_{m}}\|\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])-a\|\\ \leq\ &ke^{-\gamma(s-t)}\min_{a\in\partial\mathcal{I}_{m}}\|x-a\|,\end{aligned} \tag{18}$$

where we used equation (17) for the last inequality. Multiplying both sides by $e^{\gamma(s-t)}$ gives

$$\begin{aligned}&e^{\gamma(s-t)}\ell(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}]))\\ \leq\ &e^{\gamma(s-t)}ke^{-\gamma(s-t)}\min_{a\in\partial\mathcal{I}_{m}}\|x-a\|\\ =\ &k\min_{a\in\partial\mathcal{I}_{m}}\|x-a\|,\end{aligned}$$

which holds for all $s\in[t,0]$. Therefore

$$\begin{aligned}V_{\gamma}(x,t)&=\max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}]))\\ &\leq k\min_{a\in\partial\mathcal{I}_{m}}\|x-a\|.\end{aligned}$$

This upper bound $k\min_{a\in\partial\mathcal{I}_{m}}\|x-a\|$ is independent of $t$, therefore as $t\rightarrow-\infty$, we have $V_{\gamma}^{\infty}(x)\leq k\min_{a\in\partial\mathcal{I}_{m}}\|x-a\|$. Since $V_{\gamma}(x,t)$ monotonically increases as $t\rightarrow-\infty$, we conclude that the limit in (16) exists $\forall x\in\mathcal{D}_{\text{ROES}}$, and $\mathcal{D}_{\gamma}=\mathcal{D}_{\text{ROES}}$.

Denote the zero-level set of the R-CLVF as

$$\mathcal{Z}_{\gamma}^{\infty}:=\{x:V_{\gamma}^{\infty}(x)=0\}.$$

The R-CLVFs with different $\gamma$ have the same zero-level set.

Proposition 3.

For all $\gamma\geq 0$, the sets $\mathcal{Z}_{\gamma}^{\infty}$ are the same.

The proof is analogous to the proof of Lemma 1 and is omitted here.

Proposition 4.

The R-CLVF is locally Lipschitz continuous in $x$.

Proof.

Since the convergence is uniform, $\forall\epsilon>0$, $\exists t_{N}<0$ s.t. $\forall t\leq t_{N}$ and $\forall x,y\in\mathcal{D}_{\gamma}$ we have

$$\begin{aligned}-\epsilon&\leq V_{\gamma}^{\infty}(x)-V_{\gamma}(x,t_{N})\leq\epsilon,\\ -\epsilon&\leq V_{\gamma}^{\infty}(y)-V_{\gamma}(y,t_{N})\leq\epsilon,\end{aligned}$$

which gives us

$$\begin{aligned}|V_{\gamma}^{\infty}(x)-V_{\gamma}^{\infty}(y)|&\leq|V_{\gamma}(x,t_{N})-V_{\gamma}(y,t_{N})|+2\epsilon\\ &\leq e^{-\gamma t_{N}}c\|x-y\|+2\epsilon,\end{aligned}$$

where we used Proposition 2 for the last inequality. Since $\epsilon$ can be chosen arbitrarily small, we conclude that the R-CLVF is Lipschitz in $\mathcal{D}_{\gamma}$ (refer to the proof of Theorem 3.2 of [13]).

Theorem 5.

(CLVF Dynamic Programming Principle) For all $t\leq 0$, the following is satisfied, where $z=\xi(0;t,x,u(\cdot),\lambda[u])$ denotes the state reached at time $0$:

$$V_{\gamma}^{\infty}(x)=\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}V^{\infty}_{\gamma}(z),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s;t,x,u,\lambda[u]))\Bigr\}. \tag{19}$$
Proof.

From the definition of the R-CLVF and Theorem 2, $\forall t<t+\delta\leq 0$ and $\forall x\in\mathcal{D}_{\gamma}$ we have:

$$\begin{aligned}V_{\gamma}^{\infty}(x)&=\lim_{t\rightarrow-\infty}V_{\gamma}(x,t)\\ &=\lim_{t\rightarrow-\infty}\max_{\lambda\in\Lambda}\min_{u(\cdot)\in\mathbb{U}}\max\Bigl\{e^{\gamma\delta}V_{\gamma}(\xi(t+\delta),t+\delta),\ \max_{s\in[t,t+\delta]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}\\ &=\lim_{t\rightarrow-\infty}\max\Bigl\{e^{\gamma\delta}V_{\gamma}(\xi^{*}(t+\delta),t+\delta),\ \max_{s\in[t,t+\delta]}e^{\gamma(s-t)}\ell(\xi^{*}(s))\Bigr\},\end{aligned} \tag{20}$$

where $\xi^{*}(s)=\xi(s;t,x,u^{*}(\cdot),\lambda^{*}[u])$, and $u^{*}(\cdot)$ and $\lambda^{*}[u]$ are the optimal control and disturbance strategy. Further, since the dynamics are time-invariant, for any $T<t\leq 0$, define

$$\hat{u}(s)=\begin{cases}\bar{u}(s)&\text{if}\quad T-t\leq s\leq 0,\\ u^{*}(s-(T-t))&\text{if}\quad T\leq s<T-t,\end{cases}$$

and the corresponding disturbance

$$\hat{\lambda}(s)=\begin{cases}\bar{\lambda}(s)&\text{if}\quad T-t\leq s\leq 0,\\ \lambda[u^{*}](s-(T-t))&\text{if}\quad T\leq s<T-t.\end{cases}$$

It can be verified that $\forall s\in[T,0]$

$$\max_{s\in[t,0]}e^{\gamma(s-t)}\ell\bigl(\xi(s;t,x,u(\cdot),\lambda[u])\bigr)=\max_{s\in[T,T-t]}e^{\gamma(s-T)}\ell\bigl(\xi(s;T,x,\hat{u}(\cdot),\hat{\lambda}[\hat{u}])\bigr).$$

In other words, if we only change the initial time, but keep the time horizon unchanged, the cost will stay the same, with optimal control and disturbance determined by shifting the original optimal control and disturbance signal with the corresponding time. Denote

\[
\xi^{*}(t+\delta)=\xi(t+\delta;t,x,u^{*}(\cdot),\lambda^{*}[u])=z,
\]

we have:

\[
\lim_{t\rightarrow-\infty}e^{\gamma\delta}V_{\gamma}(\xi^{*}(t+\delta),t+\delta)=\lim_{t\rightarrow-\infty}e^{\gamma\delta}V_{\gamma}(z,t+\delta)=e^{\gamma\delta}V_{\gamma}^{\infty}(z), \tag{21}
\]

and

\[
\lim_{t\rightarrow-\infty}\max_{s\in[t,t+\delta]}e^{\gamma(s-t)}\ell(\xi^{*}(s))=\max_{s\in[t,t+\delta]}e^{\gamma(s-t)}\ell(\xi^{*}(s)). \tag{22}
\]

Combining equations (21) and (22), equation (20) can be written as

\[
\begin{aligned}
V_{\gamma}^{\infty}(x)&=\max\Bigl\{\lim_{t\rightarrow-\infty}e^{\gamma\delta}V_{\gamma}(\xi^{*}(t+\delta),t+\delta),\ \lim_{t\rightarrow-\infty}\max_{s\in[t,t+\delta]}e^{\gamma(s-t)}\ell(\xi^{*}(s))\Bigr\}\\
&=\max\Bigl\{e^{\gamma\delta}V_{\gamma}^{\infty}(z),\ \max_{s\in[t,t+\delta]}e^{\gamma(s-t)}\ell(\xi^{*}(s))\Bigr\}.
\end{aligned}
\]

Choosing $\delta=-t$, we get:

\[
V_{\gamma}^{\infty}(x)=\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}V^{\infty}_{\gamma}(z),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell\bigl(\xi(s;t,x,u,\lambda[u])\bigr)\Bigr\}.
\]

Theorem 6.

(R-CLVF-VI viscosity solution) The R-CLVF is the unique continuous solution to the following R-CLVF-VI in the viscosity sense,

\[
\max\Bigl\{\ell(x)-V_{\gamma}^{\infty}(x),\ \max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}D_{x}V_{\gamma}^{\infty}\cdot f(x,u,d)+\gamma V_{\gamma}^{\infty}(x)\Bigr\}=0. \tag{23}
\]
Proof.

We prove this theorem using the stability of viscosity solutions.

First, define the function $\mathcal{F}(x,v,p):\mathcal{D}_{\gamma}\times\mathbb{R}\times\mathbb{R}^{n}\mapsto\mathbb{R}$

\[
\mathcal{F}(x,v,p)=\max\{\ell(x)-v,\ H(x,v,p)\}. \tag{24}
\]

Since $H(x,v,p)$ and $\ell(x)$ are continuous functions, $\mathcal{F}(x,v,p)$ is also continuous.

Now, fix $t$ and only look at $x$. Consider a sequence $\{t_{n}\}$ with $t_{n}\neq-\infty$ and $\lim_{n\rightarrow\infty}t_{n}=-\infty$. Evaluating $V(x,t)$ and $D_{t}V(x,t)$ at each $t_{n}$, we get two sequences of functions $\{V_{n}\}$ and $\{D_{t}V_{n}\}$, with $\lim_{n\rightarrow\infty}V_{n}(x)=V_{\gamma}^{\infty}(x)$ and $\lim_{n\rightarrow\infty}D_{t}V_{n}(x)=0$ uniformly. Also, denote

\[
\mathcal{F}_{n}(x,v_{n},p_{n})=\max\{\ell(x)-v_{n},\ D_{t}V_{n}(x)+H(x,v_{n},p_{n})\}.
\]

We have a sequence of functions $\{\mathcal{F}_{n}(x,v_{n},p_{n})\}$, and

\[
\lim_{n\rightarrow\infty}\mathcal{F}_{n}(x,v_{n},p_{n})=\max\Bigl\{\ell(x)-v,\ H(x,v,p)\Bigr\},
\]

which is the left-hand side (LHS) of equation (23), and the convergence is uniform. Further, Theorem 3 shows that $V_{n}(x)$ is the viscosity solution to $\mathcal{F}_{n}(x,V_{n},p_{n})=0$.

By Theorem I.2 of [15], $V_{\gamma}^{\infty}$ is the viscosity solution of the R-CLVF-VI (23). ∎

It should be noted that in the numerical solver, we cannot directly solve for equation (23). Instead, we solve for equation (15) and backpropagate using dynamic programming to get the value at the previous time step. This is why we do not specify the boundary condition for equation (23).

Proposition 5.

At any point (differentiable or non-differentiable) in the domain $\mathcal{D}_{\gamma}$ of the R-CLVF, $\forall d\in\mathcal{D}$, there exists some control $u\in\mathcal{U}$ such that

\[
\max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}\dot{V}^{\infty}_{\gamma}\leq-\gamma V^{\infty}_{\gamma}. \tag{25}
\]
Proof.

Since the R-CLVF is only Lipschitz continuous, there exist points that are not differentiable. For those points, [16] showed that either a super-differential ($D^{+}V_{\gamma}^{\infty}(x)$) or a sub-differential ($D^{-}V_{\gamma}^{\infty}(x)$) exists, whose elements are called super-gradients and sub-gradients respectively. A function is differentiable at $x$ if $D^{-}V_{\gamma}^{\infty}(x)=D^{+}V_{\gamma}^{\infty}(x)$. Non-differentiable points only have a super-differential or a sub-differential. At non-differentiable points, define $\dot{V}_{\gamma}^{\infty}(x)=p\cdot f(x,u,d)$, where $p$ is either a sub-gradient or a super-gradient.

For non-differentiable points with a super-differential, the corresponding solution is called a sub-solution, and

\[
\max\Bigl\{\ell(x)-V_{\gamma}^{\infty}(x),\ \max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}p^{+}\cdot f(x,u,d)+\gamma V_{\gamma}^{\infty}(x)\Bigr\}\leq 0,\quad\forall p^{+}\in D^{+}V_{\gamma}^{\infty}(x).
\]

The maximum of the two terms is less than or equal to 0, which implies both terms must be less than or equal to 0:

\[
\forall p^{+}\in D^{+}V_{\gamma}^{\infty}(x),\quad\max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}p^{+}\cdot f(x,u,d)\leq-\gamma V_{\gamma}^{\infty}(x).
\]

This means that for any super-gradient, there exists some control input that provides a sufficient decrease in the value along the trajectory.

When a sub-differential exists, we have:

\[
\max\Bigl\{\ell(x)-V_{\gamma}^{\infty}(x),\ \max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}p^{-}\cdot f(x,u,d)+\gamma V_{\gamma}^{\infty}(x)\Bigr\}\leq 0,\quad\forall p^{-}\in D^{-}V_{\gamma}^{\infty}(x).
\]

Using Theorem 2.3 in [17], we have

\[
\forall p^{-}\in D^{-}V_{\gamma}^{\infty}(x),\quad\max_{d\in\mathcal{D}}\min_{u\in\mathcal{U}}p^{-}\cdot f(x,u,d)=-\gamma V_{\gamma}^{\infty}(x).
\]

Combined, we get the desired inequality: $\dot{V}_{\gamma}^{\infty}\leq-\gamma V_{\gamma}^{\infty}$ holds for all points in $\mathcal{D}_{\gamma}$. ∎

In Lemma 4, we showed that the existence of the R-CLVF can be derived when the system is robustly exponentially stabilizable to its SRCIS. Now, we show that the existence of the R-CLVF implies the robust exponential stabilizability.

Lemma 7.

The system can be exponentially stabilized to its smallest robustly control invariant set $\mathcal{I}_{m}$ from $\mathcal{D}_{\gamma}\setminus\mathcal{I}_{m}$ (or $\mathbb{R}^{n}\setminus\mathcal{I}_{m}$), if the R-CLVF exists in $\mathcal{D}_{\gamma}$ (or $\mathbb{R}^{n}$).

Proof.

Assume the limit in (16) exists in $\mathcal{D}_{\gamma}$. For any initial state $x\in\mathcal{D}_{\gamma}\setminus\mathcal{I}_{m}$, consider the optimal trajectory $\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])$, $\forall t\leq s\leq 0$. From Proposition 5:

\[
D_{x}V_{\gamma}^{\infty}(x)\cdot f(x,u^{*},d^{*})=\dot{V}_{\gamma}^{\infty}\leq-\gamma V^{\infty}_{\gamma}.
\]

Using the comparison principle, we have $\forall s\in[t,0]$,

\[
V_{\gamma}^{\infty}\bigl(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])\bigr)\leq e^{-\gamma(s-t)}V_{\gamma}^{\infty}(x). \tag{26}
\]

Since $V_{\gamma}(x,0)\leq V_{\gamma}^{\infty}(x)$, we have:

\[
\begin{aligned}
&||\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])||\\
=\ &V_{\gamma}\bigl(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}]),0\bigr)+V^{\infty}_{m}\\
\leq\ &V_{\gamma}^{\infty}\bigl(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])\bigr)+V^{\infty}_{m}.
\end{aligned}
\]

Therefore, we have:

\[
\begin{aligned}
&\min_{a\in\partial\mathcal{I}_{m}}||\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])-a||\\
\leq\ &||\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])||+\min_{a\in\partial\mathcal{I}_{m}}||a||\\
\leq\ &||\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])||+V^{\infty}_{m}\\
\leq\ &V_{\gamma}^{\infty}\bigl(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])\bigr)+2V^{\infty}_{m}.
\end{aligned}
\]

Plugging in (26) gives us

\[
\begin{aligned}
&\min_{a\in\partial\mathcal{I}_{m}}||\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}])-a||\\
\leq\ &e^{-\gamma(s-t)}V_{\gamma}^{\infty}(x)+2V^{\infty}_{m}\\
=\ &e^{-\gamma(s-t)}k_{1}\min_{a\in\partial\mathcal{I}_{m}}||x-a||+e^{-\gamma(s-t)}k_{2}\min_{a\in\partial\mathcal{I}_{m}}||x-a||\\
=\ &e^{-\gamma(s-t)}(k_{1}+k_{2})\min_{a\in\partial\mathcal{I}_{m}}||x-a||,
\end{aligned}
\]

where

\[
k_{1}=\frac{V_{\gamma}^{\infty}(x)}{\min_{a\in\partial\mathcal{I}_{m}}||x-a||},\qquad
k_{2}=\frac{2V^{\infty}_{m}}{e^{-\gamma(s-t)}\min_{a\in\partial\mathcal{I}_{m}}||x-a||},
\]

and $0<k_{1},k_{2}<\infty$. In other words, the controlled system can be locally exponentially stabilized to $\mathcal{I}_{m}$ from $\mathcal{D}_{\text{ROES}}$ if the R-CLVF exists on $\mathcal{D}_{\gamma}$. Further, if the R-CLVF exists on $\mathbb{R}^{n}$, the above result holds globally.
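The comparison step behind (26) can be checked numerically on a scalar stand-in. The sketch below is illustrative only: the decay law $\dot{v}=-(\gamma+c)v$ and the names `simulate_value`, `extra_decay`, and `satisfies_exponential_bound` are hypothetical, chosen so that $\dot{v}\leq-\gamma v$ holds when $c\geq 0$; it is not the value function of any system in this paper.

```python
import math

def simulate_value(v0, gamma, extra_decay, horizon, dt=1e-3):
    """Forward-Euler integration of v' = -(gamma + extra_decay) * v."""
    steps = int(round(horizon / dt))
    v = v0
    history = [v]
    for _ in range(steps):
        v += dt * (-(gamma + extra_decay) * v)
        history.append(v)
    return history

def satisfies_exponential_bound(history, v0, gamma, dt=1e-3, tol=1e-9):
    """Check v(k*dt) <= exp(-gamma * k * dt) * v0 at every sample."""
    return all(
        v <= math.exp(-gamma * k * dt) * v0 + tol
        for k, v in enumerate(history)
    )

# A decay rate at least gamma respects the envelope; a slower one does not.
fast = simulate_value(2.0, 0.5, 0.2, 3.0)
slow = simulate_value(2.0, 0.5, -0.3, 3.0)
print(satisfies_exponential_bound(fast, 2.0, 0.5))
print(satisfies_exponential_bound(slow, 2.0, 0.5))
```

The first trajectory decays at least as fast as $e^{-\gamma(s-t)}$ and stays under the envelope, while the second violates $\dot{v}\leq-\gamma v$ and leaves it after the first step.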

Combining Lemma 4 and Lemma 7, we provide the following theorem.

Theorem 8.

The system can be exponentially stabilized to its smallest robustly control invariant set $\mathcal{I}_{m}$ from $\mathcal{D}_{\gamma}\setminus\mathcal{I}_{m}$ (or $\mathbb{R}^{n}\setminus\mathcal{I}_{m}$), if and only if the R-CLVF exists in $\mathcal{D}_{\gamma}$ (or $\mathbb{R}^{n}$).

Remark 3.

From (10) and (16), it can be seen that if $\gamma_{1}>\gamma_{2}$, then $V^{\infty}_{\gamma_{1}}>V^{\infty}_{\gamma_{2}}$. Denoting their corresponding domains by $\mathcal{D}_{\gamma_{1}}$ and $\mathcal{D}_{\gamma_{2}}$, we have $\mathcal{D}_{\gamma_{1}}\subset\mathcal{D}_{\gamma_{2}}$. From Theorem 8, we have the following conclusion: a larger $\gamma$ corresponds to a faster convergence rate but a smaller ROES.

3.3 R-CLVF-QP

Consider the control- and disturbance-affine system

\[
\dot{x}=f(x,u,d)=g(x)+h_{u}(x)u+h_{d}(x)d, \tag{27}
\]

where $g:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$, $h_{u}:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n\times m_{u}}$, and $h_{d}:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n\times m_{d}}$. For such systems, (25) is equivalent to the following linear inequality in $u$:

\[
D_{x}V_{\gamma}^{\infty}(x)\cdot g(x)+\min_{u\in\mathcal{U}}D_{x}V_{\gamma}^{\infty}(x)\cdot h_{u}(x)u+\max_{d\in\mathcal{D}}D_{x}V_{\gamma}^{\infty}(x)\cdot h_{d}(x)d\leq-\gamma V^{\infty}_{\gamma}(x).
\]
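For symmetric box sets, the inner minimum and maximum in the inequality above have closed forms: each is a weighted sum of absolute values of the entries of $D_{x}V_{\gamma}^{\infty}h_{u}$ and $D_{x}V_{\gamma}^{\infty}h_{d}$. The following sketch evaluates the left-hand side minus the right-hand side under that assumption. The box sets $|u_{i}|\leq u_{\max,i}$, $|d_{j}|\leq d_{\max,j}$, the function name `affine_decay_margin`, and the numbers in the usage lines (including the gradient `p` and the value) are hypothetical.

```python
def affine_decay_margin(p, g, h_u, h_d, u_max, d_max, gamma, value):
    """Return p.g + min_u p.h_u u + max_d p.h_d d + gamma * value.

    Assumes box sets |u_i| <= u_max[i] and |d_j| <= d_max[j] (an assumption
    of this sketch). A non-positive result means the inequality holds.
    """
    n = len(p)
    drift = sum(p[i] * g[i] for i in range(n))
    # Row vectors p^T h_u and p^T h_d.
    p_hu = [sum(p[i] * h_u[i][j] for i in range(n)) for j in range(len(u_max))]
    p_hd = [sum(p[i] * h_d[i][j] for i in range(n)) for j in range(len(d_max))]
    # Over a symmetric box, a linear form c.u attains -sum|c_i| u_max_i at best
    # and +sum|c_j| d_max_j at worst.
    best_control = -sum(abs(c) * m for c, m in zip(p_hu, u_max))
    worst_disturbance = sum(abs(c) * m for c, m in zip(p_hd, d_max))
    return drift + best_control + worst_disturbance + gamma * value

# Double-integrator-like example: x1' = x2 + d, x2' = u, evaluated at x2 = -0.5.
margin = affine_decay_margin(
    p=[1.0, 2.0], g=[-0.5, 0.0],
    h_u=[[0.0], [1.0]], h_d=[[1.0], [0.0]],
    u_max=[1.0], d_max=[0.5], gamma=0.1, value=1.0,
)
print(margin <= 0.0)
```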
Theorem 9.

(Feasibility Guaranteed R-CLVF-QP) Given some reference control $u_{r}$, the optimal controller can be synthesized by the following R-CLVF-QP with guaranteed feasibility $\forall x\in\mathcal{D}_{\gamma}$:

\[
\begin{aligned}
\min_{u\in\mathcal{U}}\quad&(u-u_{r})^{T}(u-u_{r})\\
\text{s.t.}\quad&D_{x}V_{\gamma}^{\infty}(x)\cdot g(x)+D_{x}V_{\gamma}^{\infty}(x)\cdot h_{u}(x)u+D_{x}V_{\gamma}^{\infty}(x)\cdot h_{d}(x)d\leq-\gamma V_{\gamma}^{\infty}(x).
\end{aligned}
\]
Proof.

This is a direct result of Proposition 5. ∎
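When the input bound $u\in\mathcal{U}$ is inactive, a QP with a single linear constraint $a^{T}u\leq b$ reduces to a Euclidean projection of $u_{r}$ onto a half-space. The sketch below shows that special case only; it is not the solver used for the results in this paper. Here `a` stands for $(D_{x}V_{\gamma}^{\infty}h_{u})^{T}$ and `b` collects the remaining terms of the constraint, with the disturbance term taken at its worst case (an assumption of this sketch). The name `clvf_qp_unbounded` is hypothetical.

```python
def clvf_qp_unbounded(u_ref, a, b):
    """Minimize ||u - u_ref||^2 subject to a.u <= b, ignoring input bounds.

    Closed-form half-space projection; the set U is assumed inactive here.
    """
    violation = sum(ai * ui for ai, ui in zip(a, u_ref)) - b
    if violation <= 0.0:
        # The reference control already satisfies the decay constraint.
        return list(u_ref)
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        raise ValueError("the control does not enter the constraint")
    scale = violation / norm_sq
    # Move u_ref along -a just far enough to reach the constraint boundary.
    return [ui - scale * ai for ui, ai in zip(u_ref, a)]

u = clvf_qp_unbounded(u_ref=[0.0, 0.0], a=[3.0, 4.0], b=-5.0)
print(u)
```

With input bounds present, a general QP solver is needed, since clipping the projected control to $\mathcal{U}$ can violate the decay constraint.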

Note that the QP controller is only point-wise optimal, with respect to “staying close to the reference controller.” It is not optimal w.r.t. the value function, as will be shown in the numerical examples.

4 R-CLVF with Numerical Implementation

In the numerical implementation for computing the R-CLVF, equation (5) is solved on a discrete grid until some convergence threshold is met. This leads to the well-known “curse of dimensionality.” In this section, we provide two main methods to overcome this issue: the warmstarting technique and the system decomposition technique. Necessary proofs are provided, and the effectiveness is validated with a 10D example in the numerical examples.

4.1 R-CLVF with Warmstarting

In the previous conference paper, we introduced a two-step process, that first finds the SRCIS, and then finds the CLVF. This process requires solving the TV-R-CLVF-VI two times, each with a complete initialization. In this subsection, we show that the converged value function for the first step can be used to warmstart the second step computation.

Denote the time-varying value function with initial value $k(x)$ as $\bar{V}_{\gamma}(x,t)$, and the infinite time value function as $\bar{V}^{\infty}_{\gamma}(x)$, with the corresponding domain $\bar{\mathcal{D}}_{\gamma}$. We still have the same loss function $\ell(x)$.

Theorem 10.

For all initializations $\bar{V}_{\gamma}(x,0)=k(x)$, $\bar{V}_{\gamma}(x,t)\geq V_{\gamma}(x,t)$ holds $\forall x$, $\forall t<0$.

Proof.
  1. Assume $k(x)=\ell(x)$. Then $\bar{V}_{\gamma}(x,t)\geq V_{\gamma}(x,t)$ holds $\forall x$.

  2. Assume $k(x)>\ell(x)$. From (5), we have $\forall t<0$:

    \[
    \begin{aligned}
    \bar{V}_{\gamma}(x,t)
    &=\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}\bar{V}_{\gamma}(z,0),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}\\
    &=\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}k(\xi(0)),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}\\
    &\geq\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}\ell(\xi(0)),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}\\
    &=V_{\gamma}(x,t).
    \end{aligned}
    \]

  3. Assume $k(x)<\ell(x)$. Then, at time $t=0$, we have $\bar{V}_{\gamma}(x,0)<V_{\gamma}(x,0)$. Consider an infinitesimal time step $0^{-}$. Using (5), we have:

    \[
    \begin{aligned}
    \bar{V}_{\gamma}(x,0^{-})
    &=\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma 0^{-}}k(\xi(0)),\ \max_{s\in[0^{-},0]}e^{\gamma(s-0^{-})}\ell(\xi(s))\Bigr\}\\
    &=\max\Bigl\{e^{-\gamma 0^{-}}k(\xi(0)),\ e^{-\gamma 0^{-}}\ell(\xi(0^{-}))\Bigr\}\\
    &=e^{-\gamma 0^{-}}\ell(\xi(0^{-}))\\
    &\geq\ell(\xi(0^{-}))=V_{\gamma}(x,0^{-}).
    \end{aligned}
    \]

    In other words, after one infinitesimally small step, we get $\bar{V}_{\gamma}(x,0^{-})\geq V_{\gamma}(x,0^{-})$. Now, replacing $k(x)$ by $\bar{V}_{\gamma}(x,0^{-})$, we return to the first or second case, and the remaining proof follows.

Theorem 10 shows that no matter what the initial value is, the value function propagated with this initial value is always an over-approximation of the TV-R-CLVF. However, for R-CLVF, we have the following Proposition and Theorem.

Proposition 6.

If $\bar{V}_{\gamma}^{\infty}(x)$ exists on $\bar{\mathcal{D}}_{\gamma}$, then $\bar{V}_{\gamma}^{\infty}(x)\geq V_{\gamma}^{\infty}(x)$ and $\bar{\mathcal{D}}_{\gamma}\subseteq\mathcal{D}_{\gamma}$.

Proof.

The first part is a direct result of Theorem 10. The second part can be proved by contradiction. Assume $x\in\bar{\mathcal{D}}_{\gamma}$ but $x\notin\mathcal{D}_{\gamma}$. This means $\bar{V}_{\gamma}^{\infty}(x)$ is finite, but $V_{\gamma}^{\infty}(x)$ is infinite, which contradicts the first part of this proposition. ∎

Theorem 11.

For initialization $\bar{V}_{\gamma}(x,0)=k(x)\leq V_{\gamma}^{\infty}(x)$, we have $\bar{V}_{\gamma}^{\infty}(x)=V_{\gamma}^{\infty}(x)$.

Proof.

Denote $\tilde{k}(x)=V_{\gamma}^{\infty}(x)$, and the value function initialized with $\tilde{k}(x)$ as $\tilde{V}_{\gamma}(x,t)$. We have $\forall x$, $\forall t\leq 0$:

\[
\begin{aligned}
\tilde{V}_{\gamma}(x,t)
&=\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}\tilde{V}_{\gamma}(z,0),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}\\
&=\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}\tilde{k}(\xi(0)),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}\\
&\geq\max_{\lambda\in\Lambda}\min_{u\in\mathbb{U}}\max\Bigl\{e^{-\gamma t}k(\xi(0)),\ \max_{s\in[t,0]}e^{\gamma(s-t)}\ell(\xi(s))\Bigr\}\\
&=\bar{V}_{\gamma}(x,t).
\end{aligned}
\]

Since $V_{\gamma}^{\infty}(x)$ is already the converged value function, we have $V_{\gamma}^{\infty}(x)=\tilde{V}_{\gamma}(x,t)\geq\bar{V}_{\gamma}(x,t)$.

Similar to Proposition 6, if $V_{\gamma}^{\infty}(x)$ exists on $\mathcal{D}_{\gamma}$, then $\bar{V}_{\gamma}^{\infty}(x)\leq V_{\gamma}^{\infty}(x)$ and $\mathcal{D}_{\gamma}\subseteq\bar{\mathcal{D}}_{\gamma}$. Combined, we get $\mathcal{D}_{\gamma}=\bar{\mathcal{D}}_{\gamma}$, and $\forall x\in\mathcal{D}_{\gamma}$

\[
\bar{V}_{\gamma}^{\infty}(x)=V_{\gamma}^{\infty}(x).
\]

Using Theorem 11, we provide an enhanced version of the original algorithm for computing the R-CLVF, shown in Alg. 1.

Algorithm 1 Obtaining the R-CLVF for general nonlinear systems (offline)
Input: System dynamics $f(x,u,d)$, $\mathcal{U}$, $\mathcal{D}$, desired exponential rate $\gamma>0$, convergence threshold $\Delta$, loss function $\ell(x)$, time step $\delta t$.
Output: $V_{\gamma}^{\infty}(x)$, $\mathcal{I}_{m}$
Initialization:
     $V(x,t_{0})\leftarrow\ell(x)$
Find $\mathcal{I}_{m}$:
     $V^{\infty}(x)\leftarrow$ update_value($f$, $\mathcal{U}$, $\mathcal{D}$, $\Delta$, $\delta t$, $V(x,0)$, $\ell(x)$)
     $V^{\infty}_{m}\leftarrow\min_{x}V^{\infty}(x)$, $\mathcal{I}_{m}\leftarrow\{V^{\infty}(x)=V^{\infty}_{m}\}$
Find R-CLVF:
     $\ell(x)\leftarrow\ell(x)-V^{\infty}_{m}$, $V(x,t_{0})\leftarrow V^{\infty}(x)-V^{\infty}_{m}$
     $V_{\gamma}^{\infty}(x)\leftarrow$ update_value($f$, $\mathcal{U}$, $\mathcal{D}$, $\Delta$, $\delta t$, $V(x,0)$, $\ell(x)$)
update_value($f$, $\mathcal{U}$, $\mathcal{D}$, $\Delta$, $\delta t$, $V(x,0)$, $\ell(x)$):
     $t\leftarrow 0$
     while $dV\geq\Delta$ do
          $V(x,t+\delta t)\leftarrow V(x,t)$
          update $V(x,t+\delta t)$ using equation (2)
          $dV(x)=V(x,t+\delta t)-V(x,t)$
          $t\leftarrow t+\delta t$
     end while
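The two-step structure of Alg. 1 can be sketched on a 1D toy problem. Everything below is an illustrative assumption rather than the solver used for the results in this paper: the dynamics $\dot{x}=u+d$, the finite control and disturbance sets, the one-step update $V\leftarrow\max\{\ell,\ e^{\gamma\delta t}\min_{u}\max_{d}V(x+\delta t\,f)\}$ with linear interpolation (the disturbance is allowed to respond to the control, a conservative choice), and the use of $\gamma=0$ in the first step. The names `interp`, `update_value`, and `compute_r_clvf` are hypothetical.

```python
import math

def interp(xs, vals, x):
    """Linear interpolation on a uniform grid, clamped at both ends."""
    if x <= xs[0]:
        return vals[0]
    if x >= xs[-1]:
        return vals[-1]
    h = xs[1] - xs[0]
    i = min(int((x - xs[0]) / h), len(xs) - 2)
    w = min(max((x - xs[i]) / h, 0.0), 1.0)
    return (1.0 - w) * vals[i] + w * vals[i + 1]

def update_value(xs, f, controls, disturbances, gamma, dt, v0, loss, tol,
                 max_iter=2000):
    """Iterate the one-step update until the largest change drops below tol."""
    v = list(v0)
    growth = math.exp(gamma * dt)
    for _ in range(max_iter):
        v_next = []
        for x, l in zip(xs, loss):
            # Control minimizes the worst case over the disturbance.
            best = min(
                max(interp(xs, v, x + dt * f(x, u, d)) for d in disturbances)
                for u in controls
            )
            v_next.append(max(l, growth * best))
        change = max(abs(a - b) for a, b in zip(v_next, v))
        v = v_next
        if change < tol:
            break
    return v

def compute_r_clvf(xs, f, controls, disturbances, gamma, dt, tol):
    loss = [abs(x) for x in xs]
    # Step 1 (gamma = 0, an assumption here): find V_m and the SRCIS.
    v_inf = update_value(xs, f, controls, disturbances, 0.0, dt, loss, loss, tol)
    v_m = min(v_inf)
    srcis = [x for x, v in zip(xs, v_inf) if v - v_m <= tol]
    # Step 2: shift the loss and warmstart from the shifted step-1 result.
    shifted_loss = [l - v_m for l in loss]
    warm = [v - v_m for v in v_inf]
    v_gamma = update_value(xs, f, controls, disturbances, gamma, dt, warm,
                           shifted_loss, tol)
    return v_gamma, v_m, srcis

grid = [-2.0 + 0.05 * i for i in range(81)]
v_gamma, v_m, srcis = compute_r_clvf(
    grid, lambda x, u, d: u + d, [-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5],
    gamma=0.5, dt=0.05, tol=1e-6,
)
```

The iteration cap `max_iter` is a safeguard added for this sketch; Alg. 1 itself stops only on the convergence threshold.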

4.2 R-CLVF with Decomposition

To discuss the R-CLVF with decomposition, we first introduce the self-contained subsystems decomposition.

Definition 5.

(Self-contained subsystem decomposition) (SCSD) Consider the following special case $z=(z_{1},z_{2},z_{c})$, with $z_{1}\in\mathbb{R}^{n_{1}}$, $z_{2}\in\mathbb{R}^{n_{2}}$, $z_{c}\in\mathbb{R}^{n_{c}}$, $n_{1},n_{2}>0$, $n_{c}\geq 0$, and $n_{1}+n_{2}+n_{c}=n$. $z_{1}$, $z_{2}$, $z_{c}$ are called “state partitions” of the system.

Given the system (1), the two subsystems of it are

\[
\dot{x}_{1}=f_{1}(x_{1})+g_{1}(x_{1})u,\quad\dot{x}_{2}=f_{2}(x_{2})+g_{2}(x_{2})u,
\]

with $x_{1}=(z_{1},z_{c})\in\mathcal{X}_{1}\subseteq\mathbb{R}^{n_{1}+n_{c}}$, and $x_{2}=(z_{2},z_{c})\in\mathcal{X}_{2}\subseteq\mathbb{R}^{n_{2}+n_{c}}$.

Theorem 12.

Assume the system can be decomposed into several self-contained subsystems, and there are no shared controls or states between the subsystems. Denote the corresponding R-CLVFs for the subsystems as $V_{\gamma,i}^{\infty}(x_{i})$ with domain $\mathcal{D}_{\gamma_{i},x_{i}}$, and define

\[
W_{\gamma}^{\infty}(x)=\sum_{i}V_{\gamma,i}^{\infty}(x_{i}). \tag{28}
\]

Then

\[
\dot{W}_{\gamma}^{\infty}(x)=\sum_{i}\dot{V}_{\gamma,i}^{\infty}(x_{i})\leq\sum_{i}-\gamma V_{\gamma,i}^{\infty}(x_{i})=-\gamma W_{\gamma}^{\infty}(x).
\]

This reconstructed value function is a Lipschitz continuous robust CLF, but not necessarily the R-CLVF of the full-dimensional system. Since we assume no shared control between subsystems, the controller for the full-dimensional system can be determined by solving R-CLVF-QPs for the subsystems.
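The summation argument of Theorem 12 can be illustrated with plain numbers. In the sketch below, the subsystem values and their time derivatives along a closed-loop trajectory are hypothetical inputs, not results from this paper; the function `reconstruct` simply forms $W_{\gamma}^{\infty}$ and $\dot{W}_{\gamma}^{\infty}$ as in (28) and reports whether the decay conditions hold.

```python
def reconstruct(sub_values, sub_rates, gamma):
    """Sum subsystem values and rates, then check the decay conditions.

    sub_values[i] plays the role of V_i(x_i) and sub_rates[i] of its time
    derivative; both are supplied by the caller (hypothetical numbers here).
    """
    w = sum(sub_values)
    w_dot = sum(sub_rates)
    each_ok = all(r <= -gamma * v for v, r in zip(sub_values, sub_rates))
    total_ok = w_dot <= -gamma * w
    return w, w_dot, each_ok, total_ok

# Three subsystems, each decaying at least at rate gamma = 0.1.
w, w_dot, each_ok, total_ok = reconstruct(
    sub_values=[1.0, 2.0, 0.5], sub_rates=[-0.2, -0.3, -0.1], gamma=0.1
)
print(each_ok, total_ok)
```

Because the bound is summed term by term, the reconstructed function inherits the rate $\gamma$ whenever every subsystem satisfies its own inequality; the converse does not hold in general.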

5 Numerical Examples

5.1 2D System Revisit

Consider again the system given by equation (8), and specify $\bar{\ell}(x)=||x||_{\infty}$. We compute the R-CLVF with $\gamma_{1}=0.1$ and $\gamma_{2}=0.2$. The results are shown in Fig. 2. It should be noted that for this system, the SRCIS for $\gamma=0.1$ and $\gamma=0.2$ are both $\mathcal{I}_{m}=\{|x|\leq 0.5,|y|\leq 0.5\}$, and the ROES is $\mathcal{D}_{\text{ROES}}=\{|x|>0.5,|y|<1\}\setminus\mathcal{I}_{m}$.

Figure 2: Top: R-CLVF with $\gamma=0.1$ (left) and $\gamma=0.2$ (right). Bottom left: ROES, SRCIS, and the two optimal trajectories using the R-CLVF-QP controller. The ROES and SRCIS for different $\gamma$ are all the same, while the optimal trajectories are different. To see this, first consider a point on the boundary of the ROES, $[0.1,1]$: $d$ will make $x$ increase from 0.1 to 1, while $u$ cannot decrease $y$. Since the distance is measured by $||x||_{\infty}$, we have $\ell(\xi(s;t,x,u^{*}(\cdot),\lambda[u^{*}]))=1$, $\forall t<0$. Using equations (10) and (16), the value will be infinite. However, for any $|y|<1$, the control can decrease $y$ to 0, and for all $x$, it either goes to 0.5 or -0.5. Note both happen in a finite time horizon. Therefore, using equations (10) and (16), the value will be finite for all $\gamma\geq 0$. Bottom mid: value decay along the two optimal trajectories. All controllers were generated using R-CLVF-QP. With a 151-by-151 grid, the computation time for $\gamma=0.1$ is 215.6s with warmstart and 289.7s w/o warmstart; for $\gamma=0.2$, it is 211.5s with warmstart and 258.4s w/o warmstart.

5.2 3D Dubins Car

Consider the 3D Dubins car example:

\[
\dot{x}=v\cos(\theta)+d_{x},\quad\dot{y}=v\sin(\theta)+d_{y},\quad\dot{\theta}=u,
\]

where $v=1$, $u\in[-\pi/2,\pi/2]$ is the control, and $d_{x},d_{y}\in[-0.1,0.1]$ are the disturbances. This system has no equilibrium point. The SRCISs with different $\bar{\ell}(x)$ are shown in Fig. 3, and the trajectory converges to the SRCIS exponentially.
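The Dubins car dynamics above can be written as a short function, together with a check of why no equilibrium exists: the planar speed is bounded below by $v$ minus the largest disturbance magnitude. The function names `dubins_dynamics` and `min_planar_speed` are hypothetical, and the state ordering $(x,y,\theta)$ is assumed.

```python
import math

def dubins_dynamics(state, u, d, v=1.0):
    """Right-hand side of the disturbed Dubins car; state = (x, y, theta)."""
    _, _, theta = state
    dx, dy = d
    return (v * math.cos(theta) + dx, v * math.sin(theta) + dy, u)

def min_planar_speed(v=1.0, d_max=0.1):
    """Lower bound on the planar speed for |d_x|, |d_y| <= d_max."""
    return v - math.hypot(d_max, d_max)

x_dot, y_dot, theta_dot = dubins_dynamics((0.0, 0.0, 0.0), 0.5, (0.1, -0.1))
# The bound is strictly positive, so (x_dot, y_dot) can never vanish.
print(min_planar_speed() > 0.0)
```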

Figure 3: Different SRCISs with different $\bar{\ell}(x)$. Top left: SRCIS and optimal trajectory with $\bar{\ell}(x)=||x||_{2}$. Top right: SRCIS and optimal trajectory with $\bar{\ell}(x)=||x||_{Q}$, where $Q=\mathrm{diag}[1,1,0]$. Bottom left: the value along the optimal trajectories. All controllers were generated using R-CLVF-QP. With a 51-by-51-by-53 grid, the computation time for $\bar{\ell}(x)=||x||_{2}$ is 264s with warmstart and 386.6s w/o warmstart; for $\bar{\ell}(x)=||x||_{Q}$, it is 143.4s with warmstart and 207.7s w/o warmstart.

5.3 10D Quadrotor

Consider the 10D quadrotor system:

\[
\begin{aligned}
\dot{x}&=v_{x}+d_{x}, &\dot{v}_{x}&=g\tan\theta_{x}, &\dot{\theta}_{x}&=-d_{1}\theta_{x}+\omega_{x},\\
\dot{\omega}_{x}&=-d_{0}\theta_{x}+n_{0}u_{x}, &\dot{y}&=v_{y}+d_{y}, &\dot{v}_{y}&=g\tan\theta_{y},\\
\dot{\theta}_{y}&=-d_{1}\theta_{y}+\omega_{y}, &\dot{\omega}_{y}&=-d_{0}\theta_{y}+n_{0}u_{y},\\
\dot{z}&=v_{z}+d_{z}, &\dot{v}_{z}&=u_{z},
\end{aligned} \tag{29}
\]

where $(x,y,z)$ denote the position, $(v_{x},v_{y},v_{z})$ denote the velocity, $(\theta_{x},\theta_{y})$ denote the pitch and roll, $(\omega_{x},\omega_{y})$ denote the pitch and roll rates, and $(u_{x},u_{y},u_{z})$ are the controls. The system parameters are set to be $d_{0}=10$, $d_{1}=8$, $n_{0}=10$, $k_{T}=0.91$, $g=9.81$, $|u_{x}|,|u_{y}|\leq\pi/9$, $u_{z}\in[-1,1]$, $|d_{x}|,|d_{y}|\leq 0.1$, $|d_{z}|\leq 0.5$.

This 10D system can be decomposed into three subsystems: X-sys with states $[x,v_{x},\theta_{x},\omega_{x}]$, Y-sys with states $[y,v_{y},\theta_{y},\omega_{y}]$, and Z-sys with states $[z,v_{z}]$. It can be verified that all three subsystems have an equilibrium point at the origin. Further, there are no shared controls or states among the subsystems. We use $\bar{\ell}(x)=||x||_{2}$.
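The decomposition can be made concrete by writing (29) as a function and listing which state indices belong to each subsystem. The state ordering, the index tuples `X_SYS`, `Y_SYS`, `Z_SYS`, and the helper names are assumptions of this sketch. The code follows (29) as written, so the listed parameter $k_{T}$ does not appear.

```python
import math

# Assumed ordering: (x, vx, th_x, w_x, y, vy, th_y, w_y, z, vz).
X_SYS, Y_SYS, Z_SYS = (0, 1, 2, 3), (4, 5, 6, 7), (8, 9)

def quadrotor_dynamics(state, control, disturbance,
                       d0=10.0, d1=8.0, n0=10.0, g=9.81):
    """Right-hand side of the 10D quadrotor model in equation (29)."""
    x, vx, th_x, w_x, y, vy, th_y, w_y, z, vz = state
    ux, uy, uz = control
    dx, dy, dz = disturbance
    return (
        vx + dx, g * math.tan(th_x), -d1 * th_x + w_x, -d0 * th_x + n0 * ux,
        vy + dy, g * math.tan(th_y), -d1 * th_y + w_y, -d0 * th_y + n0 * uy,
        vz + dz, uz,
    )

def subsystem(vector, indices):
    """Extract the entries of a 10D vector that belong to one subsystem."""
    return tuple(vector[i] for i in indices)

# With zero control and disturbance, the origin is an equilibrium.
rest = quadrotor_dynamics((0.0,) * 10, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
print(all(value == 0.0 for value in rest))
```

Each derivative in an index group depends only on states, controls, and disturbances of the same group, which is what allows the three R-CLVFs to be computed independently.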

Table 1: Comparison of the computation time for the 10D quadrotor. X/Y dim has 17 grids for each state, and Z dim has 101 grids for each state.

System           Z dim      X/Y dim     Full Sys
w/o Warmstart    405.2 s    3731.7 s    7868.6 s
w. Warmstart     234.9 s    3564.6 s    7364.1 s
A CLF is reconstructed using equation (28), and the QP controllers for each subsystem are synthesized using Theorem 9. The results are shown in Fig. 4, and the computation time is shown in Tab. 1. A comparison of the R-CLVF with and without warmstart is shown in Fig. 5, showing that the warmstart provides the exact result.
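The timings in Tab. 1 can be turned into relative savings with a few lines of arithmetic. The dictionary name `TIMES` and the function `relative_saving` are hypothetical; the numbers are copied from the table.

```python
# (without warmstart, with warmstart), in seconds, from Table 1.
TIMES = {
    "Z dim": (405.2, 234.9),
    "X/Y dim": (3731.7, 3564.6),
    "Full Sys": (7868.6, 7364.1),
}

def relative_saving(cold, warm):
    """Fraction of computation time saved by warmstarting."""
    return (cold - warm) / cold

savings = {name: relative_saving(c, w) for name, (c, w) in TIMES.items()}
for name, saving in savings.items():
    print(f"{name}: {100.0 * saving:.1f}% saved")
```

The savings come out to roughly 42% for the Z subsystem, 4.5% for each of the X and Y subsystems, and 6.4% for the full system. The full-system times equal the Z time plus twice the X/Y time, consistent with the three subsystems being solved independently.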

Figure 4: Top left: R-CLVF for Z-sys, with 101 grids for each state, time step = 0.1, convergence threshold = 0.001. Top right: R-CLVF for X(Y)-sys, with 17 grids for each state, time step = 0.1, convergence threshold = 0.01. Bottom left: Optimal trajectory using R-CLVF-QP controller, converging to the SRCIS shown in black. Bottom right: the decay of the value along the optimal trajectory.
Figure 5: Comparison of R-CLVF with and without warmstart for the Z-subsystem. The difference is almost negligible.

6 Conclusions

In this paper, we extend our preliminary work on constructing CLVFs using HJ reachability analysis to systems with bounded disturbances. We provided more detailed discussions on several important claims and theorems compared to the previous version. Also, warmstarting and SCSD are proposed to address the “curse of dimensionality,” and the effectiveness of both techniques is validated with numerical examples.

Future directions include finding conditions on when the SCSD provides R-CLVF and incorporating learning-based methods to tune the exponential rate γ\gamma for online execution in robotics applications.

References

  • [1] E. D. Sontag, “A ‘universal’ construction of Artstein’s theorem on nonlinear stabilization,” Systems & control letters, 1989.
  • [2] R. A. Freeman and J. A. Primbs, “Control lyapunov functions: New ideas from an old source,” in Conf. on Decision and Control, 1996.
  • [3] K. K. Hassan et al., “Nonlinear systems,” Department of Electrical and Computer Engineering, Michigan State University, 2002.
  • [4] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, “Control barrier functions: Theory and applications,” in European Control Conf., 2019.
  • [5] A. D. Ames, K. Galloway, K. Sreenath, and J. W. Grizzle, “Rapidly exponentially stabilizing control Lyapunov functions and hybrid zero dynamics,” Trans. on Automatic Control, 2014.
  • [6] A. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs for safety critical systems,” Trans. on Automatic Control, 2017.
  • [7] Z. Artstein, “Stabilization with relaxed controls,” Nonlinear Analysis: Theory, Methods & Applications, 1983.
  • [8] F. Camilli, L. Grüne, and F. Wirth, “Control Lyapunov functions and Zubov’s method,” SIAM Journal on Control and Optimization, 2008.
  • [9] P. Giesl and S. Hafstein, “Review on computational methods for lyapunov functions,” Discrete & Continuous Dynamical Systems, 2015.
  • [10] P. Giesl, “Construction of a local and global lyapunov function for discrete dynamical systems using radial basis functions,” Journal of Approximation Theory, 2008.
  • [11] X. Xu, P. Tabuada, J. W. Grizzle, and A. D. Ames, “Robustness of control barrier functions for safety critical control,” Int. Federation of Automatic Control, 2015.
  • [12] S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-Jacobi reachability: A brief overview and recent advances,” in Conf. on Decision and Control, 2017.
  • [13] L. C. Evans and P. E. Souganidis, “Differential games and representation formulas for solutions of Hamilton-Jacobi-Isaacs equations,” Indiana University Mathematics Journal, 1984.
  • [14] M. Bardi and I. Capuzzo-Dolcetta, Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations.   Springer, 2008.
  • [15] M. G. Crandall and P.-L. Lions, “Viscosity solutions of hamilton-jacobi equations,” Trans. of the American mathematical society, 1983.
  • [16] M. G. Crandall, L. C. Evans, and P.-L. Lions, “Some properties of viscosity solutions of hamilton-jacobi equations,” Trans. of the American Mathematical Society, 1984.
  • [17] H. Frankowska, “Hamilton-jacobi equations: viscosity solutions and generalized gradients,” Journal of mathematical analysis and applications, 1989.
  • [18] M. Chen, S. L. Herbert, M. S. Vashishtha, S. Bansal, and C. J. Tomlin, “Decomposition of reachable sets and tubes for a class of nonlinear systems,” Trans. on Automatic Control, 2018.
  • [19] S. Bansal and C. J. Tomlin, “Deepreach: A deep learning approach to high-dimensional reachability,” in Int. Conf. on Robotics and Automation, 2021.
  • [20] S. Herbert, J. J. Choi, S. Sanjeev, M. Gibson, K. Sreenath, and C. J. Tomlin, “Scalable learning of safety guarantees for autonomous systems using Hamilton-Jacobi reachability,” in Int. Conf. on Robotics and Automation, 2021.
  • [21] C. He, Z. Gong, M. Chen, and S. Herbert, “Efficient and guaranteed hamilton–jacobi reachability via self-contained subsystem decomposition and admissible control sets,” IEEE Control Systems Letters, vol. 7, pp. 3824–3829, 2023.
  • [22] Z. Gong, M. Zhao, T. Bewley, and S. Herbert, “Constructing control lyapunov-value functions using hamilton-jacobi reachability analysis,” IEEE Control Systems Letters, vol. 7, pp. 925–930, 2022.
  • [23] S. Rakovic, E. Kerrigan, K. Kouramas, and D. Mayne, “Invariant approximations of the minimal robust positively invariant set,” IEEE Transactions on Automatic Control, vol. 50, no. 3, pp. 406–410, 2005.
  • [24] Y. Chen, H. Peng, J. Grizzle, and N. Ozay, “Data-driven computation of minimal robust control invariant set,” in 2018 IEEE Conference on Decision and Control (CDC).   IEEE, 2018, pp. 4052–4058.
  • [25] P. P. Varaiya, “On the existence of solutions to a differential game,” SIAM Journal on Control, vol. 5, no. 1, pp. 153–162, 1967.
  • [26] J. F. Fisac, M. Chen, C. J. Tomlin, and S. S. Sastry, “Reach-avoid problems with time-varying dynamics, targets and constraints,” in Hybrid Systems: Computation and Control.   ACM, 2015.
  • [27] I. J. Fialho and T. T. Georgiou, “Worst case analysis of nonlinear systems,” Trans. on Automatic Control, 1999.
  • [28] J. J. Choi, D. Lee, K. Sreenath, C. J. Tomlin, and S. L. Herbert, “Robust control barrier-value functions for safety-critical control,” Conf. on Decision and Control, 2021.