Real-time Safety Index Adaptation for Parameter-varying Systems
via Determinant Gradient Ascend

Rui Chen¹, Weiye Zhao¹, Ruixuan Liu¹, Weiyang Zhang², and Changliu Liu¹

¹Carnegie Mellon University, Pittsburgh, PA. Contact: {ruic3, weiyezha, ruixuanl, cliu6}@andrew.cmu.edu
²University of Michigan, Ann Arbor, MI. Contact: [email protected]

This work is partially supported by the National Science Foundation, Grant No. 2144489.

Abstract

Safety Index Synthesis (SIS) is critical for deriving safe control laws. Recent works propose to synthesize a safety index (SI) via nonlinear programming and derive a safe control law such that the system 1) achieves forward invariance (FI) of some safe set and 2) guarantees finite-time convergence (FTC) to that safe set. However, real-world system dynamics can vary during run-time, making the control law infeasible and invalidating the initial SI. Since the full SIS nonlinear programming is computationally expensive, it is infeasible to re-synthesize the SI each time the dynamics are perturbed. To address this, the paper proposes an efficient approach to adapting the SI to varying system dynamics and maintaining the feasibility of the safe control law. The proposed method leverages determinant gradient ascend and derives a closed-form update to the safety index parameters, enabling real-time adaptation. A numerical study validates the effectiveness of our approach.

I Introduction

Autonomous systems are entering many application domains, e.g., autonomous vehicles [1], human-robot collaboration [2, 3], etc. As autonomous systems are deployed to more dynamic environments, safety becomes increasingly critical. It is important to ensure that the system would not harm the agents sharing the environment (i.e., humans and the workspace).

Safe control has been widely studied to guarantee the safety of autonomous systems. In particular, energy functions [4] are widely used in the safe control field to quantify system safety and derive control laws to ensure safety, such as the safe set algorithm (SSA) [5] and control barrier functions (CBF) [6]. To achieve provable safety, the safe control law needs to satisfy two critical properties: (a) forward invariance (FI), meaning that the system should stay in a safe region once entering it, and (b) finite-time convergence (FTC), meaning that the system should land in the safe region in finite time even when starting from an unsafe state. To achieve such a provably safe control law, a safety index (SI) needs to be carefully synthesized so that the constraints derived from the SI are always feasible. Namely, in every state of interest, there must exist a control in the control space (either bounded or unbounded) that satisfies the safety constraints. Therefore, Safety Index Synthesis (SIS) is critical [7, 8].

Figure 1: Illustration of safety index adaptation. After the drone picks up a package whose weight is not known in advance, its dynamics change. The safe control law is adapted to the new dynamics and continues to guarantee safety, e.g., collision avoidance.

SIS has been widely studied. Previous works [9, 10] address SIS for dynamic systems with unbounded control. [7, 11, 8] address SIS for systems with known bounded control. Recent work [12] further addresses the SIS problem for dynamic systems with varying (i.e., state-dependent) control bounds, which is more practical in reality. Although existing approaches are promising, most of them consider invariant dynamic systems. In practice, the dynamics of real-world systems usually vary. For example, when a drone is used for package delivery, its dynamics change every time a package is added or removed (see Figure 1); when a robot arm is used for pick-and-place, its dynamics can change due to the object being manipulated. Under perturbed dynamics, the safe control law derived from the previous safety index might no longer be feasible, and can no longer guarantee safety. A naive fix is to re-synthesize the SI whenever the dynamics change. However, a full SIS generally requires non-trivial computation and is infeasible for real-time adaptation. For instance, it can take more than 10 minutes to synthesize a single SI for a simplistic unicycle model with state-dependent control bounds [12].

This paper studies efficient safety index adaptation (SIA) for parameter-varying systems. Our intuition is that when the system dynamics change, it should be sufficient to fine-tune the safety index instead of generating a new one from scratch. To achieve that, we first observe that the full SIS problem is in fact solved via a semidefinite program with a positive-semidefiniteness (PSD) constraint that depends on the system dynamics. That constraint is normally violated when the dynamics change, invalidating the previous safety index. A reasonable solution is to fine-tune the SI parameters such that the PSD constraint is satisfied again. Leveraging Sylvester’s criterion [13], we are able to derive closed-form updates to the SI parameters that are computationally efficient enough for real-time adaptation.

In short, our major contribution is introducing determinant gradient ascent (DGA), a closed-form safety index adaptation algorithm that guarantees user-defined safety for parameter-varying dynamic systems. For the rest of the paper, we review the literature in Section II. In Section III, we introduce the goal of safe control and the full SIS problem before formulating the problem of safety index adaptation. In Section IV, we derive our efficient SIA approach which is then validated via a numerical study in Section V. We finally provide future directions and conclude with Section VI.

II Related Work

Previous works [9, 10] address SIS for known dynamics. SIS is similar to CBF synthesis for enforcing constraints [6], but different in that the desired safety index refers to a specific class of energy functions usually for collision avoidance with the safe set algorithm (SSA) [5, 14].

Real-world system dynamics are usually imperfectly known (i.e., uncertainty exists). To address this, [15] introduces adaptive CBF (aCBF) to ensure the safety of dynamic systems with estimated parametric model uncertainty. [16] introduces robust aCBF (RaCBF), which results in a less conservative safe control behavior than aCBF. [17] applies adaptive control to CBF for safe control of systems with parametric uncertainty by adjusting the adaptation gain online. [18, 19, 20] assume bounded dynamics noise and use learning-based approaches to synthesize the CBF of the mismatched system dynamics. [21] focuses on high relative degree safety constraints for systems with dynamics uncertainty. It leverages concurrent learning to estimate the system uncertainty parameters online and synthesizes CBF. [22] addresses high-order CBF for time-varying system dynamics and state constraints. However, these works do not consider control bounds, which are important in real-world systems and could violate safety guarantees.

[7, 8, 11] address SIS for known systems with invariant bounded control. [23] introduces time-varying penalty functions to construct adaptive CBF when addressing systems with noisy dynamics and time-varying control bounds. Recent work [12] addresses the SIS problem for dynamic systems with varying (i.e., state-dependent) control bounds. Despite the rapid advancement in the field, existing works do not consider systems with both varying dynamics and varying control bounds, which will be addressed in this paper.

III Preliminaries and Problem Formulation

III-A Dynamic System

We follow [12] and consider a dynamic system with state-dependent control limits. Let $x\in\mathcal{X}\subset\mathbb{R}^{N_{x}}$ be the system state and $u\in\mathcal{U}$ be the control input. The state space $\mathcal{X}$ is bounded by a set of inequalities $\mathcal{X}\vcentcolon=\{x\mid h_{i}(x)\geq 0,\forall i=1,\dots,N_{h}\}$ . The control space is bounded element-wise, i.e., $\mathcal{U}\vcentcolon=\{u\in\mathbb{R}^{N_{u}}\mid\underline{u}\leq u\leq\bar{u}\}$ . The dynamics is given by

\dot{x}=f(x)+g(x)u,\leavevmode\nobreak\ u\in\mathcal{U},

(1)

where $f:\mathbb{R}^{N_{x}}\mapsto\mathbb{R}^{N_{x}}$ and $g:\mathbb{R}^{N_{x}}\mapsto\mathbb{R}^{N_{x}\times N_{u}}$ are both locally Lipschitz continuous.

III-B Preliminary: Safe Control

Safety Specification: For safety, we require the state to stay within a closed subset $\mathcal{X}_{S}$ (i.e., safe set) of the state space $\mathcal{X}$ . $\mathcal{X}_{S}$ is assumed to be the zero sublevel set of some piecewise smooth function $\phi_{0}:\mathcal{X}\mapsto\mathbb{R}$ , i.e., $\mathcal{X}_{S}\vcentcolon=\{x\in\mathcal{X}\mid\phi_{0}(x)\leq 0\}$ . Both $\mathcal{X}_{S}$ and $\phi_{0}$ should be designed by users. For instance, $\phi_{0}$ can be $\phi_{0}=d_{\mathrm{min}}-d$ if we want to keep the distance $d$ to some obstacle above $d_{\mathrm{min}}$ .

Safe Control Objectives: Following [12], we focus on safe control with two objectives: (a) forward invariance (FI), meaning if the state $x$ is already within the safe set, it should never leave that set and (b) finite-time convergence (FTC), meaning if the state $x$ is outside the safe set, it should land in the safe set in finite time.

Safe Control Backbone: When the control $u$ does not appear in $\dot{\phi}_{0}$ (e.g., $\dot{\phi}_{0}=-\dot{d}$ does not depend on the acceleration input for a second-order system), we cannot derive constraints on $u$ to ensure safety. To solve that issue, the safe set algorithm (SSA) [5] provides a systematic approach to design an alternative safety quantification $\phi$ to handle general relative degrees ( $>1$ ) between $\phi_{0}$ and the control. SSA introduces a continuous, piecewise smooth energy function $\phi:\mathcal{X}\mapsto\mathbb{R}$ (a.k.a. the safety index). The general form of an $n^{\mathrm{th}}$ ( $n\geq 0$ ) order safety index $\phi_{n}$ is given as $\phi_{n}=(1+a_{1}s)(1+a_{2}s)\dots(1+a_{n}s)\phi_{0}$ where $s$ is the differentiation operator. $\phi_{n}$ can equivalently be expanded to

\phi_{n}\vcentcolon=\phi_{0}+\textstyle\sum_{i=1}^{n}k_{i}\phi^{(i)}_{0}.

(2)

where $\phi_{0}^{(i)}$ is the $i^{\mathrm{th}}$ time derivative of $\phi_{0}$ . The safe control law $c_{\phi_{n}}$ of SSA can be written as the following optimization:

\min_{u\in\mathcal{U}}\ \mathcal{J}(u)\quad\mathrm{s.t.}\quad\dot{\phi}_{n}(x,u)\leq-\eta\ \ \mathrm{if}\ \ \phi_{n}(x)\geq 0

(3)

where the objective $\mathcal{J}$ is arbitrary. By [5, 12], if (a) the roots of the characteristic equation $\prod_{i=1}^{n}(1+a_{i}s)=0$ are all negative real, (b) $\phi_{0}^{(n)}$ has relative degree one to the control input, and (c) the problem (3) is always feasible, both FI and FTC are guaranteed. Note that (3) only considers constraint satisfaction, which makes it compatible with arbitrary control objectives. For instance, for reference tracking, we can set $\mathcal{J}(u)=\|u-u^{r}\|$ to find $u$ that is minimally invasive to the nominal control $u^{r}$ , presumably generated by a given tracking controller with asymptotic stability.
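As a small illustration of how the factored form of the safety index relates to the coefficients $k_{i}$ in (2), the sketch below expands $\prod_{i}(1+a_{i}s)$ by polynomial multiplication. The function name `safety_index_coefficients` and the sample values of $a_{i}$ are hypothetical and not taken from the paper. The check `a_i > 0` reflects condition (a): the root of $1+a_{i}s=0$ is $-1/a_{i}$, which is negative real exactly when $a_{i}>0$.

```python
def safety_index_coefficients(a):
    """Expand prod_i (1 + a_i * s) and return [k_1, ..., k_n] as in (2)."""
    coeffs = [1.0]  # coefficients of 1, s, s^2, ... (lowest order first)
    for a_i in a:
        if a_i <= 0:
            # Condition (a): the root -1/a_i must be negative real.
            raise ValueError("each a_i must be positive")
        nxt = [0.0] * (len(coeffs) + 1)
        for power, c in enumerate(coeffs):
            nxt[power] += c            # contribution of the "1" term
            nxt[power + 1] += c * a_i  # contribution of the "a_i * s" term
        coeffs = nxt
    return coeffs[1:]  # drop the constant term, which multiplies phi_0


# Hypothetical second-order example: (1 + 0.5 s)(1 + 2 s) = 1 + 2.5 s + 1.0 s^2
k = safety_index_coefficients([0.5, 2.0])
```

Here `k[i - 1]` plays the role of $k_{i}$, i.e., the weight on the $i^{\mathrm{th}}$ time derivative of $\phi_{0}$.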

III-C Preliminary: Safety Index Synthesis

To achieve safety guarantees by implementing (3), we need to construct $\phi$ to make the optimization feasible. Such an objective is referred to as Safety Index Synthesis (SIS), mathematically described as Problem 1.

Problem 1 (Safety Index Synthesis).

Find safety index as $\phi_{\theta}\vcentcolon=\phi_{0}+\sum_{i=1}^{n}k_{i}\phi^{(i)}_{0}$ with parameter $\theta\in\Theta\vcentcolon=\{[k_{1},k_{2},\dots,k_{n}]\mid k_{i}\in\mathbb{R},k_{i}\geq 0,\forall i\}$ , such that

\forall x\in\mathcal{X}\ \ \mathrm{s.t.}\ \ \phi_{\theta}(x)\geq 0,\quad\min_{u\in\mathcal{U}}\dot{\phi}_{\theta}(x,u)<-\eta.

(4)

$\phi_{\theta}$ is the $n^{\mathrm{th}}$ order safety index parameterized by $\theta$ and is used interchangeably with $\phi_{n}$ hereafter. Note that Problem 1 depends on the dynamics (1) (i.e., $f$ and $g$ ) since $\dot{\phi}_{\theta}(x,u)=\frac{\partial\phi_{\theta}}{\partial x}f(x)+\frac{\partial\phi_{\theta}}{\partial x}g(x)u$ in (4). Problem 1 is also difficult because it has infinitely many constraints: (4) needs to hold for every state $x$ such that $\phi_{\theta}(x)\geq 0$ . To tackle that challenge, we follow [8, 12] and leverage Positivstellensatz [24] to transform Problem 1 into a sum-of-squares program (SOSP), which is further converted to nonlinear programming (NP). Specifically, a refute set $\{x\mid\zeta_{i=1,\dots,N_{\zeta}}(x)=0,\gamma_{i=1,\dots,N}(x)\geq 0\}$ is first established for (4), then proved empty by solving an SOSP. We refer readers to [12] for details on the construction of the refute set. The SOSP finds $p^{\prime}_{i}\in\mathbb{R},p_{i}\geq 0,\ \forall i>0$ such that

\begin{aligned}p_{0}=&-1-\textstyle\sum_{i=1}^{N_{\zeta}}p^{\prime}_{i}\zeta_{i}-p_{1}\gamma_{1}-p_{2}\gamma_{2}-\dots-p_{N}\gamma_{N}\\ &-p_{12}\gamma_{1}\gamma_{2}-\dots-p_{12\dots N}\gamma_{1}\dots\gamma_{N}\in SOS,\end{aligned}

(5)

where $\zeta_{i}$ , $\gamma_{i}$ are functions of $x$ and also depend on $f$ and $g$ . The SOS condition is enforced by finding the positive semi-definite (PSD) decomposition $p_{0}=\bm{x}^{\top}Q(\theta,\bm{p})\bm{x}$ where $Q(\theta,\bm{p})\succeq 0$ . Assuming $p_{0}$ has degree $2d$ , $\bm{x}\vcentcolon=\left[1,x[1],\dots,x[N_{x}],x[1]x[2],\dots,x[N_{x}]^{d}\right]^{\top}$ contains all monomials of $x$ with degree no more than $d$ . $\bm{p}\vcentcolon=[p^{\prime}_{1},\dots,p^{\prime}_{N_{\zeta}},p_{1},p_{2},\dots,p_{12\dots N}]^{\top}$ denotes the auxiliary decision variable. The final NP is given by:

Problem 2 (Nonlinear Programming).

Find $\theta\in\Theta$ and $\bm{p}$ where $\bm{p}[j]\in\mathbb{R}$ for $j>0$ and $\bm{p}[j]\geq 0$ for $j>N_{\zeta}$ , such that $Q(\theta,\bm{p})\succeq 0$ .

Remark 2.1.

The positive-semidefiniteness of the parametric coefficient matrix $Q(\theta,\bm{p})$ guarantees the nonnegativity of the polynomial $p_{0}$ . Since $Q$ is derived from $\zeta_{i}$ and $\gamma_{i}$ , it depends on the dynamics (1), i.e., $f$ , $g$ , and the control limits $\mathcal{U}$ .

Remark 2.2.

The general form of the SOSP (5) allows $p_{i}^{\prime}$ and $p_{i}$ to be polynomials of $x$ . Hence, due to the simplifications (i.e., constraining $p_{i}^{\prime}$ and $p_{i}$ to real values), Problem 2 solves a sufficient but not necessary condition for Problem 1.
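To make the decomposition $p_{0}=\bm{x}^{\top}Q\bm{x}$ concrete, the following sketch uses a hypothetical univariate polynomial $p_{0}(x)=1+2x+2x^{2}$ with monomial vector $[1,x]^{\top}$. Neither the polynomial nor the matrix comes from the paper. The matrix below is positive definite (its principal minors are $1$, $2$ and $1$), so the polynomial is nonnegative everywhere; indeed $p_{0}(x)=(1+x)^{2}+x^{2}$.

```python
def quad_form(Q, v):
    """Evaluate v^T Q v for a square matrix Q given as a list of rows."""
    n = len(v)
    return sum(Q[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


def monomials(x):
    """Monomial vector [1, x] for a degree-2 polynomial in one variable."""
    return [1.0, x]


def p0(x):
    """Hypothetical polynomial whose nonnegativity we want to certify."""
    return 1.0 + 2.0 * x + 2.0 * x * x


# Symmetric coefficient matrix: off-diagonal entries split the 2x term.
Q = [[1.0, 1.0],
     [1.0, 2.0]]

samples = [-2.0, -0.5, 0.0, 1.0, 3.0]
max_gap = max(abs(p0(x) - quad_form(Q, monomials(x))) for x in samples)
min_value = min(p0(x) for x in samples)
```

In Problem 2 the entries of $Q$ are not fixed numbers but functions of $\theta$ and $\bm{p}$, and the solver searches for values that make $Q$ PSD.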

III-D Formulation of Safety Index Adaptation

As motivated in Section I, practical dynamic systems can contain varying parameters only known during runtime. We denote varying parameters as $\rho$ and extend (1) as

\dot{x}=f(x,\rho)+g(x,\rho)u,\leavevmode\nobreak\ u\in\mathcal{U}(\rho).

(6)

Assume that prior to deployment, the initial value $\rho_{0}$ is known, and a feasible safety index $\phi_{\theta_{0}}$ has been found by solving Problem 1. As explained in Remark 2.1, $\phi_{\theta}$ depends on the system dynamics. With the extended dynamics (6), $\phi_{\theta}$ also depends on $\rho$ . As a result, when $\rho$ is updated during runtime, the previously solved $\phi_{\theta}$ might no longer satisfy the feasibility condition (4) and render the system unsafe. Hence, it is imperative that $\phi_{\theta}$ is updated accordingly, formulated as:

Problem 3 (Safety Index Adaptation (SIA)).

Given a solution $\phi_{\theta}$ to Problem 1 with system parameter $\rho$ , find $\phi_{\theta^{\prime}}$ to solve Problem 1 with system parameter $\rho^{\prime}$ (see footnote 1).

Footnote 1: We assume bounded step changes in the system parameters, i.e., $\|\rho-\rho^{\prime}\|\leq\delta$ for some $\delta>0$ . Theoretical results on how the step size $\delta$ influences the adaptation performance are left for future work.

Remark 3.1.

A naive solution to Problem 3 is to directly re-run the full synthesis given by Problem 2. However, the solving time of the NP is significant even for simplistic systems, e.g., over $10$ minutes for a second-order unicycle model [12]. For safety-critical tasks, the safety guarantees of the safe control law should be recovered as soon as possible.

IV SIA via Determinant Gradient Ascend

Although re-running the full synthesis (Problem 2) is infeasible in real time, we can leverage the NP formulation to design an adaptation strategy. Observe that solving Problem 1 is ultimately achieved by making the parametric coefficient matrix $Q(\theta,\bm{p},\rho)\succeq 0$ , where the dependency on $\rho$ follows (6). Then, Problem 3 naturally translates to:

\displaystyle\theta^{\prime},\bm{p}^{\prime}=\mathop{\underset{\theta,\bm{p}}{\mathop{\mathbf{argmin}}}}\leavevmode\nobreak\ \mathcal{J}(\theta,\bm{p})\leavevmode\nobreak\ \mathop{\mathbf{s.t.}}Q(\theta,\bm{p},\rho^{\prime})\succeq 0

(7)

given $Q(\theta,\bm{p},\rho)\succeq 0$ , where the objective $\mathcal{J}$ is a design parameter to guide the search for $\theta$ and $\bm{p}$ . If $\rho^{\prime}$ does not change significantly from $\rho$ , i.e., $\|\rho-\rho^{\prime}\|$ is bounded (to be formalized later), we are essentially searching for a new point $(\theta^{\prime},\bm{p}^{\prime},\rho^{\prime})$ near the neighborhood of $(\theta,\bm{p},\rho)$ to maintain the positive-semidefiniteness of $Q$ .

Note that the positive-semidefiniteness of $Q$ can be tested by computing determinants using Sylvester’s criterion [13], which says that a Hermitian matrix is positive-semidefinite if and only if all the principal minors are nonnegative. Namely, we can re-write the constraint in (7) as

\displaystyle\mathrm{Det}[Q(\theta,\bm{p},\rho^{\prime})]_{I,I}\geq 0,\leavevmode\nobreak\ \forall I\subseteq[1,\dots,M]

(8)

where $M$ is the size of $Q$ . $[Q]_{I,J}$ denotes the submatrix of $Q$ corresponding to the rows with indices $I$ and columns with indices $J$ . Since the principal minors are explicit functions of $\theta$ and $\bm{p}$ , (8) can be readily satisfied via gradient ascent on those parameters as $[{\theta},{\bm{p}}]=[{\theta},{\bm{p}}]+\lambda\delta$ where the gradient $\delta$ is given by

\displaystyle\delta=\left.\nabla_{[\theta,\bm{p}]}\mathrm{Det}[Q(\theta,\bm{p},\rho^{\prime})]_{I^{*},I^{*}}\right\rvert_{\theta=\theta,\bm{p}=\bm{p}}.

(9)

$\mathrm{Det}[Q(\theta,\bm{p},\rho^{\prime})]_{I^{*},I^{*}}$ refers to the current lowest principal minor with indices $I^{*}$ , and $\lambda$ is the step size. When $\rho$ changes, $(\theta^{\prime},\bm{p}^{\prime})$ is initialized to the previous feasible values $(\theta,\bm{p})$ , and updated according to (9) until all principal minors are nonnegative. We refer to such an approach as the determinant gradient ascend (DGA). After $\phi_{n}$ is fully updated for $\rho^{\prime}$ , (3) would be feasible and guarantee FI and FTC with respect to $\mathcal{X}_{S}$ . Future work remains to study the system behaviors during DGA adaptation, when (3) might be infeasible.
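The PSD test in (8) and the selection of the lowest principal minor $I^{*}$ can be sketched as follows. This is a minimal pure-Python illustration under the assumption of a small symmetric matrix stored as a list of rows. The helper names (`det`, `principal_minors`, `lowest_principal_minor`, `is_psd`) are hypothetical, and the cofactor expansion is only practical for small matrices.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        sub = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(sub)
    return total


def principal_minors(Q):
    """Map every nonempty index set I to Det [Q]_{I,I}."""
    n = len(Q)
    minors = {}
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[Q[r][c] for c in idx] for r in idx]
            minors[idx] = det(sub)
    return minors


def lowest_principal_minor(Q):
    """Return (I*, value) for the smallest principal minor."""
    minors = principal_minors(Q)
    idx = min(minors, key=minors.get)
    return idx, minors[idx]


def is_psd(Q, tol=0.0):
    """Sylvester-type test: all principal minors must be nonnegative."""
    return all(value >= -tol for value in principal_minors(Q).values())


psd_example = is_psd([[2.0, 1.0], [1.0, 2.0]])
# Leading minors are both 0 here, yet the matrix is not PSD: the
# principal minor on index set (1,) is -1, so all subsets I are needed.
non_psd_example = is_psd([[0.0, 0.0], [0.0, -1.0]])
worst = lowest_principal_minor([[1.0, 2.0], [2.0, 1.0]])
```

The second example shows why (8) ranges over every index set $I$ rather than only the leading ones: nonnegative leading minors alone do not imply positive-semidefiniteness.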

Remark.

Since $Q$ depends on $f$ and $g$ which are fixed functions of $x$ and $\rho$ , the form of gradient update (9) is also fixed. Hence, with a pre-computed symbolic expression of the update, one only has to evaluate (9) on different $(\theta,\bm{p},\rho)$ values during deployment, which is fast enough to support real-time adaptation. In summary, the determinant gradient ascend (DGA) method enables closed-form solutions to safety index adaptation using previous indices for warm start.
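The adaptation loop can be sketched on a toy problem. The matrix $Q(\theta,p,\rho)=\begin{bmatrix}p&1\\ 1&\rho\theta\end{bmatrix}$ used below is purely hypothetical and is not the coefficient matrix of the paper. It is chosen because its three principal minors ($p$, $\rho\theta$, and $p\rho\theta-1$) and their gradients with respect to $(\theta,p)$ can be written by hand, which mimics the pre-computed symbolic expressions mentioned in the remark. Gradient normalization and the step size are also assumptions.

```python
import math


def minors_and_grads(theta, p, rho):
    """Closed-form principal minors of the toy Q and gradients w.r.t. (theta, p)."""
    return [
        (p, (0.0, 1.0)),                                  # I = {1}
        (rho * theta, (rho, 0.0)),                        # I = {2}
        (p * rho * theta - 1.0, (p * rho, rho * theta)),  # I = {1, 2}
    ]


def dga(theta, p, rho_new, step=1e-2, max_iters=10000):
    """Warm-started determinant gradient ascent for the toy Q."""
    for it in range(max_iters):
        value, grad = min(minors_and_grads(theta, p, rho_new), key=lambda t: t[0])
        if value >= 0.0:
            return theta, p, it  # all principal minors are nonnegative
        norm = math.hypot(grad[0], grad[1])
        if norm == 0.0:
            raise RuntimeError("zero gradient; cannot ascend")
        # One update of the form [theta, p] += lambda * delta, as in (9).
        theta += step * grad[0] / norm
        p += step * grad[1] / norm
    raise RuntimeError("DGA did not reach a feasible point")


# (theta, p) = (1.0, 1.5) is feasible for rho = 1.0 but not for rho' = 0.5.
theta_new, p_new, iters = dga(theta=1.0, p=1.5, rho_new=0.5)
final_minors = [value for value, _ in minors_and_grads(theta_new, p_new, 0.5)]
```

Each iteration only evaluates fixed expressions at the current $(\theta,p,\rho^{\prime})$, so the per-step cost does not involve solving an optimization problem.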

V Numerical Study

To validate our SIA approach, we provide a numerical study on a parameter-varying system based on a 2-DOF (degree of freedom) planar robot arm. The robot arm has a second-order dynamics model with joint acceleration as the input. We first derive the baseline NP problem for SIS following Section III-C and then derive the update rule for SIA in the form of (9). The feasibility of the adaptive safety index is validated by sample-based evaluations.

V-A Parameter-varying 2-DOF Robot Arm

We consider a 2-DOF robot arm with state $x\vcentcolon=[\theta_{1},\theta_{2},\dot{\theta}_{1},\dot{\theta}_{2}]^{\top}$ , where $\theta_{1,2}\in[-\pi/2,-\pi/18]\cup[\pi/18,\pi/2]$ are the joint positions as shown in Figure 2. $\dot{\theta}_{1,2}\in[-1,1]$ are joint velocities. The two links have length $l_{1}$ and $l_{2}$ respectively. The control $u\vcentcolon=[u_{1},u_{2}]^{\top}$ includes bounded joint acceleration input $u_{1,2}\equiv\ddot{\theta}_{1,2}\in[u_{\mathrm{min}},u_{\mathrm{max}}]$ . The dynamics of the 2-DOF robot is given by $\dot{x}=f(x)+g(x)u$ where

f(x)=\begin{bmatrix}\dot{\theta}_{1}\\ \dot{\theta}_{2}\\ 0\\ 0\end{bmatrix},\quad g(x)=\begin{bmatrix}0&0\\ 0&0\\ 1&0\\ 0&1\end{bmatrix}

(10)

In real-world scenarios, system dynamics might change due to external factors. For instance, the total mass of a drone changes with different payloads, which in turn changes its dynamics; the torque limit of an arm motor might change due to insufficient power supply. In those cases, safety index adaptation is necessary to guarantee safety. Hence, to verify our SIA approach, we extend (10) to an affine parameter-varying system

\dot{x}=f(x,\rho)+g(x,\rho)u=A^{f}f(x)+A^{g}g(x)u+b

(11)

where $A^{f}=\mathbf{I}$ , $A^{g}=\mathrm{diag}([1,1,c_{1},c_{2}])$ and $b=[0,0,b_{1},b_{2}]^{\top}$ . We assume $c_{1,2}\geq 0$ and $b_{1,2}\in\mathbb{R}$ . The parameters $\rho\vcentcolon=[c_{1,2},b_{1,2}]$ are the system parameters, which can change during runtime and can be directly observed. The robot is allowed to move within the free space and should not collide with the obstacle which is a wall placed $d_{\mathrm{max}}$ from the robot base.
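Written out component-wise, (11) with these choices of $A^{f}$, $A^{g}$ and $b$ reduces to $\ddot{\theta}_{j}=c_{j}u_{j}+b_{j}$. The sketch below implements this right-hand side together with a forward-Euler step. The function names, the integration scheme, and the numerical values are assumptions made for illustration only.

```python
def arm_dynamics(x, u, c=(1.0, 1.0), b=(0.0, 0.0)):
    """Right-hand side of (11): x = [th1, th2, dth1, dth2], u = [u1, u2]."""
    _, _, dth1, dth2 = x
    return [
        dth1,                    # d(th1)/dt
        dth2,                    # d(th2)/dt
        c[0] * u[0] + b[0],      # d(dth1)/dt, scaled and biased input
        c[1] * u[1] + b[1],      # d(dth2)/dt
    ]


def euler_step(x, u, dt, c=(1.0, 1.0), b=(0.0, 0.0)):
    """One forward-Euler integration step (an assumed discretization)."""
    dx = arm_dynamics(x, u, c, b)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]


# Nominal parameters (c = 1, b = 0) versus a hypothetical perturbation.
x0 = [0.5, 1.0, 0.1, -0.2]
u0 = [2.0, -3.0]
nominal = arm_dynamics(x0, u0)
perturbed = arm_dynamics(x0, u0, c=(0.5, 2.0), b=(0.1, 0.0))
```

With `c = (1, 1)` and `b = (0, 0)` the function reproduces the nominal model (10).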

V-B Safety Index Adaptation Rule

We first derive the full SIS solution which is required to derive DGA update rules. With $\phi_{0}=l_{1}\cos(\theta_{1})+l_{2}\cos(\theta_{2})-d_{\mathrm{max}}$ , SIS produces a safety index $\phi_{\theta}=\phi_{0}+k\dot{\phi}_{0}$ such that the control law (3) always keeps the end-effector at most $d_{\mathrm{max}}$ away horizontally from the base, not colliding with the wall. The SI parameter $\theta$ contains a single parameter $k\geq 0$ . The immediate next step is to write out the feasibility condition (4) to be met by $\phi_{\theta}$ . We first handle the main condition $\mathbf{min}_{u\in\mathcal{U}}\dot{\phi}_{\theta}(x,u)<-\eta$ . Plugging in $\phi_{0}$ , we have

\phi_{\theta}=l_{1}\cos\theta_{1}+l_{2}\cos\theta_{2}-kl_{1}\sin\theta_{1}\dot{\theta}_{1}-kl_{2}\sin\theta_{2}\dot{\theta}_{2}-d_{\mathrm{max}}.

Taking time derivative, we have

\begin{aligned}\dot{\phi}_{\theta}&=\textstyle\sum_{j=1,2}\left(-l_{j}\sin\theta_{j}\dot{\theta}_{j}-kl_{j}\cos\theta_{j}\dot{\theta}_{j}^{2}-kl_{j}\sin\theta_{j}\ddot{\theta}_{j}\right)\\ &=\textstyle\sum_{j=1,2}\left(-l_{j}\sin\theta_{j}\dot{\theta}_{j}-kl_{j}\cos\theta_{j}\dot{\theta}_{j}^{2}-kl_{j}\sin\theta_{j}(c_{j}u_{j}+b_{j})\right)\end{aligned}

Note that $k,l_{j},c_{j}\geq 0$ , hence the minimum of $\dot{\phi}_{\theta}$ is reached at $u_{j}=u_{\mathrm{max}}$ if $\sin\theta_{j}\geq 0$ and $u_{j}=u_{\mathrm{min}}$ otherwise. Since $\theta_{j}\in[-\pi/2,-\pi/18]\cup[\pi/18,\pi/2]$ , the sign of $\sin\theta_{j}$ depends on which interval $\theta_{j}$ falls into, namely whether $\theta_{j}\leq-\pi/18$ or $\theta_{j}\geq\pi/18$ . With indicators $\mathbb{I}_{1,2}=\pm 1$ , those conditions can be written as

\displaystyle\mathbb{I}_{j}\sin\theta_{j}-\sin(\pi/18)\geq 0

(12)

Then, the main feasibility condition becomes

\textstyle\sum_{j=1,2}\left(-l_{j}\sin\theta_{j}\dot{\theta}_{j}-kl_{j}\cos\theta_{j}\dot{\theta}_{j}^{2}-kl_{j}\sin\theta_{j}(c_{j}\tilde{u}_{j}+b_{j})\right)<-\eta

(13)

where $\tilde{u}_{j}=u_{\mathrm{max}}$ if $\mathbb{I}_{j}=1$ and $\tilde{u}_{j}=u_{\mathrm{min}}$ if $\mathbb{I}_{j}=-1$ for $j=1,2$ . Next, we add conditions to consider the state limits, i.e., $\theta_{j}\in[\pi/18,\pi/2]$ , $\dot{\theta}_{j}\in[-1,1]$ and $\dot{\theta}_{j}^{2}\in[0,1]$ :

$-\mathbb{I}_{j}\sin\theta_{j}+1\geq 0$	(14)
$1-\dot{\theta}_{j}^{2}\geq 0$	(15)
$-(\dot{\theta}_{j}^{2})^{2}+\dot{\theta}_{j}^{2}\geq 0$	(16)
$\sin^{2}\theta_{j}+\cos^{2}\theta_{j}-1=0$	(17)
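The closed-form minimizer behind (13) can be sanity-checked numerically. The sketch below evaluates $\dot{\phi}_{\theta}$ for the parameter-varying arm and compares the sign-based choice of $\tilde{u}_{j}$ against a brute-force search over the four corners of the input box. Since $\dot{\phi}_{\theta}$ is affine in $u$, its minimum over a box is attained at a corner. The link lengths, gain $k$, and sample state are hypothetical values chosen only for illustration.

```python
import math


def phi_dot(x, u, k, l, c, b):
    """Time derivative of phi_theta = phi_0 + k * d(phi_0)/dt under (11)."""
    total = 0.0
    for j in range(2):
        th, dth = x[j], x[j + 2]
        acc = c[j] * u[j] + b[j]  # joint acceleration from (11)
        total += (-l[j] * math.sin(th) * dth
                  - k * l[j] * math.cos(th) * dth ** 2
                  - k * l[j] * math.sin(th) * acc)
    return total


def min_phi_dot(x, k, l, c, b, u_min, u_max):
    """Minimize phi_dot over the input box using the sign of sin(theta_j)."""
    u_star = [u_max if math.sin(x[j]) >= 0 else u_min for j in range(2)]
    return phi_dot(x, u_star, k, l, c, b), u_star


# Hypothetical state and parameters (assumed, not taken from the paper).
x = [math.pi / 4, -math.pi / 3, 0.5, -0.5]
k, l, c, b = 1.0, (1.0, 1.0), (1.0, 1.0), (0.0, 0.0)
best, u_star = min_phi_dot(x, k, l, c, b, -100.0, 100.0)

# Brute-force check over the corners of the input box.
corners = [phi_dot(x, (u1, u2), k, l, c, b)
           for u1 in (-100.0, 100.0) for u2 in (-100.0, 100.0)]
```

The main feasibility condition at this state then amounts to checking `best < -eta` for the chosen margin $\eta$.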

The last condition in (4) is $\phi_{\theta}\geq 0$ , which is omitted here to enable decreasing safety index at all levels (i.e., $\phi_{\theta}\in\mathbb{R}$ ), instead of only the unsafe regions (i.e., $\phi_{\theta}\geq 0$ ). Now, (4) translates to: for any state satisfying (12) and (14) to (17), (13) holds. To achieve that, we construct a refute set by collecting (12) to (17), with (13) negated, and prove that the refute set is empty (see [12] for the theoretical results of such an approach). With $\alpha_{j}\vcentcolon=\sin\theta_{j}$ , $\beta_{j}\vcentcolon=\cos\theta_{j}$ , $y_{j}\vcentcolon=\dot{\theta}_{j}$ and $z_{j}\vcentcolon=\dot{\theta}_{j}^{2}$ for $j=1,2$ , the refute set is given by:

\begin{cases}
\gamma_{1}\vcentcolon=-l_{1}\alpha_{1}y_{1}-kl_{1}\beta_{1}z_{1}-kl_{1}(c_{1}\tilde{u}_{1}+b_{1})\alpha_{1}\\
\qquad\ -l_{2}\alpha_{2}y_{2}-kl_{2}\beta_{2}z_{2}-kl_{2}(c_{2}\tilde{u}_{2}+b_{2})\alpha_{2}\geq 0\\
\gamma_{2}\vcentcolon=\mathbb{I}_{1}\alpha_{1}-\sin(\pi/18)\geq 0\\
\gamma_{3}\vcentcolon=-\mathbb{I}_{1}\alpha_{1}+1\geq 0\\
\gamma_{4}\vcentcolon=1-y_{1}^{2}\geq 0\\
\gamma_{5}\vcentcolon=-z_{1}^{2}+z_{1}\geq 0\\
\zeta_{1}\vcentcolon=\alpha_{1}^{2}+\beta_{1}^{2}-1=0\\
\gamma_{6}\vcentcolon=\mathbb{I}_{2}\alpha_{2}-\sin(\pi/18)\geq 0\\
\gamma_{7}\vcentcolon=-\mathbb{I}_{2}\alpha_{2}+1\geq 0\\
\gamma_{8}\vcentcolon=1-y_{2}^{2}\geq 0\\
\gamma_{9}\vcentcolon=-z_{2}^{2}+z_{2}\geq 0\\
\zeta_{2}\vcentcolon=\alpha_{2}^{2}+\beta_{2}^{2}-1=0
\end{cases}

(18)

The refute set is represented by four versions of (18) with different sign values of $\mathbb{I}_{1,2}$ . Following (5), for the $i^{\mathrm{th}}$ assignment ( $i\in[4]$ ) of $(\mathbb{I}_{1},\mathbb{I}_{2})$ , we have

\displaystyle p_{i,0}=-1-p^{\prime}_{i,1}\zeta_{i,1}-p^{\prime}_{i,2}\zeta_{i,2}-\textstyle\sum_{n=1}^{9}p_{i,n}\gamma_{i,n}

(19)

and decompose it as $p_{i,0}=\bm{x}^{\top}Q_{i}(\theta,\bm{p}_{i},\rho)\bm{x}$ where $\bm{x}\vcentcolon=[1,y_{1},z_{1},\alpha_{1},\beta_{1},y_{2},z_{2},\alpha_{2},\beta_{2}]^{\top}$ , $\theta\vcentcolon=[k]$ , and $\bm{p}_{i}\vcentcolon=[p^{\prime}_{i,1},p^{\prime}_{i,2},p_{i,1},\dots,p_{i,9}]$ . Letting $[Q]_{m,n}$ denote the element of $Q$ at row $m$ and column $n$ , we have

\begin{cases}[Q_{i}]_{2,4}=-l_{1}p_{i,1}\\ [Q_{i}]_{3,5}=-kl_{1}p_{i,1}\\ [Q_{i}]_{1,4}=-kl_{1}(c_{1}\tilde{u}_{i,1}+b_{1})p_{i,1}+\mathbb{I}_{i,1}p_{i,2}-\mathbb{I}_{i,1}p_{i,3}\\ [Q_{i}]_{4,4}=p^{\prime}_{i,1}\\ [Q_{i}]_{2,2}=-p_{i,4}\\ [Q_{i}]_{3,3}=-p_{i,5}\\ [Q_{i}]_{1,3}=p_{i,5}\\ [Q_{i}]_{5,5}=p^{\prime}_{i,1}\\ [Q_{i}]_{6,8}=-l_{2}p_{i,1}\\ [Q_{i}]_{7,9}=-kl_{2}p_{i,1}\\ [Q_{i}]_{1,8}=-kl_{2}(c_{2}\tilde{u}_{i,2}+b_{2})p_{i,1}+\mathbb{I}_{i,2}p_{i,6}-\mathbb{I}_{i,2}p_{i,7}\\ [Q_{i}]_{8,8}=p^{\prime}_{i,2}\\ [Q_{i}]_{6,6}=-p_{i,8}\\ [Q_{i}]_{7,7}=-p_{i,9}\\ [Q_{i}]_{1,7}=p_{i,9}\\ [Q_{i}]_{9,9}=p^{\prime}_{i,2}\\ \end{cases}

(20)

With that in hand, the gradient updates (9) can be obtained by taking derivatives of the principal minors of $\{Q_{i}\}_{i=1,\dots,4}$ with respect to $[\theta,\bm{p}_{1},\dots,\bm{p}_{4}]$ . Specifically, given new parameter $\rho^{\prime}$ to adapt to, we compute the gradients as:

\begin{aligned}\delta_{\theta}&=\frac{1}{4}\sum_{i=1}^{4}\left.\nabla_{\theta}\mathrm{Det}[Q_{i}(\theta,\bm{p}_{i},\rho^{\prime})]_{I^{*}_{i},I^{*}_{i}}\right\rvert_{\theta=\theta,\bm{p}_{i}=\bm{p}_{i}}\\ \delta_{\bm{p}_{i}}&=\left.\nabla_{\bm{p}_{i}}\mathrm{Det}[Q_{i}(\theta,\bm{p}_{i},\rho^{\prime})]_{I^{*}_{i},I^{*}_{i}}\right\rvert_{\theta=\theta,\bm{p}_{i}=\bm{p}_{i}}\end{aligned}

(21)

With learning rates $\lambda_{\theta}$ and $\lambda_{\bm{p}}$ , we apply the update rule

\displaystyle\theta=\theta+\lambda_{\theta}\delta_{\theta},\leavevmode\nobreak\ \bm{p}_{i}=\bm{p}_{i}+\lambda_{\bm{p}}\delta_{\bm{p}_{i}}

(22)

until all principal minors of all $Q_{i}$ ’s are non-negative.

V-C Experiment and Results

We initialize the robot arm with nominal parameters $\rho=[c_{1}=c_{2}=1,b_{1}=b_{2}=0]$ and run the full safety index synthesis (see Problem 2) to acquire an initial safety index $\phi_{\theta}$ . The inputs are limited to $u_{\mathrm{min}}=-100,u_{\mathrm{max}}=100$ . To validate our SIA approach, we simulate multiple disturbances to the system parameters $\rho$ . For each perturbed system with parameters $\rho^{\prime}$ , we invoke the SIA update rules (22) to acquire a new $\phi_{\theta^{\prime}}$ . Figure 3 shows an example of such adaptation where the parameters $\rho$ are perturbed after the arm end-effector reaches its goal. Without adaptation, the arm runs into a state where the safe control law (3) is infeasible and fails to ensure safety. With adaptation, the arm quickly updates the SI to $\phi_{\theta^{\prime}}$ and manages to find safe actions.

For quantitative evaluation, we apply each adapted $\phi_{\theta^{\prime}}$ by running the safe control law (3) on $1000$ uniformly sampled states under the perturbed system and compare to the nominal safety index $\phi_{\theta}$ . If (3) is feasible, we mark the safety index as feasible at the corresponding state. Due to the uncertainty of the nonlinear programming in Problem 2, we repeat the whole process $10$ times and plot the feasibility rate of the safety index before and after adaptation, the adapted SI parameter $\theta$ , and the adaptation time. We also run the full SIS on each perturbed system and compare the computation time. See Figure 4 for the plots. We observe that the more $\rho^{\prime}$ deviates from $\rho$ (i.e., the smaller $c_{1,2}$ is), the less likely the control law under the nominal safety index is to be feasible, whereas the adapted safety index achieves a $100\%$ feasibility rate. The adaptation time is also consistently lower than that of solving full SIS, validating that our SIA approach is computationally efficient for real-time deployment. Although only $c_{1,2}$ are perturbed in our simulations, our approach directly accommodates other variations, for instance changing $b_{1,2}$ or, more generally, changing $A^{f}$ , $A^{g}$ and $b$ in (11).
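A sample-based feasibility check of this kind might look like the sketch below. It draws states uniformly from the state space of Section V-A and marks a state as feasible when the constraint in (3) is either inactive ($\phi_{\theta}<0$) or satisfiable at the best input ($\min_{u}\dot{\phi}_{\theta}\leq-\eta$). The link lengths, $d_{\mathrm{max}}$, $\eta$, the gain $k$, and the random seed are assumptions, since the source does not list them, so the resulting numbers are not the paper's results.

```python
import math
import random


def sample_state(rng):
    """Uniform sample with |theta_j| in [pi/18, pi/2] and dtheta_j in [-1, 1]."""
    def joint():
        mag = rng.uniform(math.pi / 18, math.pi / 2)
        return mag if rng.random() < 0.5 else -mag
    return [joint(), joint(), rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]


def phi(x, k, l, d_max):
    """Safety index phi_theta = phi_0 + k * d(phi_0)/dt for the 2-DOF arm."""
    return sum(l[j] * math.cos(x[j]) - k * l[j] * math.sin(x[j]) * x[j + 2]
               for j in range(2)) - d_max


def min_phi_dot(x, k, l, c, b, u_lim):
    """Minimum of phi_dot over u in [-u_lim, u_lim]^2 (sign-based choice)."""
    total = 0.0
    for j in range(2):
        s, dth = math.sin(x[j]), x[j + 2]
        u_j = u_lim if s >= 0 else -u_lim
        total += (-l[j] * s * dth - k * l[j] * math.cos(x[j]) * dth ** 2
                  - k * l[j] * s * (c[j] * u_j + b[j]))
    return total


def feasibility_rate(k, c, b, l=(1.0, 1.0), d_max=1.5, eta=0.1,
                     u_lim=100.0, n_samples=1000, seed=0):
    """Fraction of sampled states at which the constraint in (3) can be met."""
    rng = random.Random(seed)
    feasible = 0
    for _ in range(n_samples):
        x = sample_state(rng)
        inactive = phi(x, k, l, d_max) < 0.0
        if inactive or min_phi_dot(x, k, l, c, b, u_lim) <= -eta:
            feasible += 1
    return feasible / n_samples


rate_nominal = feasibility_rate(k=1.0, c=(1.0, 1.0), b=(0.0, 0.0))
rate_no_authority = feasibility_rate(k=1.0, c=(0.0, 0.0), b=(0.0, 0.0))
```

Because both calls use the same seed, they evaluate the same states, so the two rates differ only through the control authority $c_{1,2}$.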

V-D Discussions

Tolerance against variations. It can be observed from Figure 4 that the adapted value of SI parameter $k$ shows a negative correlation with respect to the system parameters $c_{1,2}$ . In our experiments, we discovered that when $c_{1,2}$ are increased, the original $k$ is normally still feasible, and no adaptation is required. Intuitively, the larger $c_{1,2}$ is, the more sensitive the system is to inputs; the larger $k$ is, the more sensitive the control law is to unsafe regions. When $c_{1,2}$ increases, the system becomes more reactive, keeping the original $k$ feasible. When $c_{1,2}$ decreases, a more aggressive safe control law is needed to react to unsafe regions in advance, necessitating a larger $k$ . Note that the above only applies to our specific system, while the tolerance analysis for general systems is left for future work.

Scalability against system dimensions. The scalability of both full SIS and SI adaptation largely depends on the size of the refute set (18) as well as the coefficient matrix $Q_{i}$ in (20). For an $n$ -DOF 2D robot arm, the size of the refute set is given by $1+5n$ ; the size of $Q_{i}$ is $1+4n$ ; and there are $2^{n}$ such $Q_{i}$ to prove PSD for full SIS. Despite the exponential scaling of SIS, our DGA approach allows one to pre-generate all gradient updates from $Q_{i}$ in symbolic form and only evaluate those expressions during online adaptation. That renders our approach highly efficient even for high-dimensional systems.
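The counts quoted above can be tabulated with a few lines of code. The function name and dictionary keys are hypothetical. The last entry, the number of principal minors per $Q_{i}$, is not stated in the text: it is added here as the number of nonempty index sets $I\subseteq\{1,\dots,M\}$ in (8), namely $2^{M}-1$ with $M=1+4n$.

```python
def sis_problem_size(n_dof):
    """Problem sizes for an n-DOF planar arm, following the counts in the text."""
    q_size = 1 + 4 * n_dof
    return {
        "refute_set_size": 1 + 5 * n_dof,       # number of gamma and zeta conditions
        "q_size": q_size,                       # Q_i is q_size x q_size
        "num_q": 2 ** n_dof,                    # one Q_i per sign assignment
        "principal_minors_per_q": 2 ** q_size - 1,  # nonempty index sets in (8)
    }


two_dof = sis_problem_size(2)
six_dof = sis_problem_size(6)
```

For the 2-DOF arm this reproduces the eleven conditions in (18), the nine monomials in $\bm{x}$, and the four sign assignments of $(\mathbb{I}_{1},\mathbb{I}_{2})$.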

Gradient-based Optimization. When implementing our update rule (22), we normalize the gradients $\delta_{\theta}$ and $\delta_{\bm{p}_{i}}$ and set the learning rates to $\lambda_{\theta}=\lambda_{\bm{p}}=10^{-5}$. Empirically, one should always normalize the gradients and start experimenting with small learning rates to help DGA converge. Moreover, our DGA is presented as a first-order gradient update in (22). Second-order approaches such as Newton’s method can also be applied for better convergence rates when the change of $\rho$ is minimal and a feasible $k^{\prime}$ can be found within a small neighborhood of the current $k$.
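
The role of gradient normalization and a fixed step size can be illustrated with the toy sketch below. It is not our update rule (22): it adapts only a single scalar parameter $k$ and omits the $\bm{p}_{i}$ update. The $2\times 2$ matrix `q_matrix`, the stopping margin, and the step size `lr=1e-2` (much larger than the $10^{-5}$ used in our experiments, chosen only to keep the demo short) are assumptions made for illustration. Positive definiteness is checked with Sylvester's criterion.

```python
def q_matrix(k, c):
    """Hypothetical 2x2 symmetric coefficient matrix; not the paper's Q_i."""
    return [[k * c, 1.0], [1.0, 1.0]]


def det2(q):
    return q[0][0] * q[1][1] - q[0][1] * q[1][0]


def ddet_dk(k, c):
    """Closed-form derivative of det(Q(k, c)) = k * c - 1 with respect to k."""
    return c


def is_positive_definite(q, margin=0.0):
    # Sylvester's criterion for a symmetric 2x2 matrix.
    return q[0][0] > margin and det2(q) > margin


def adapt_k(k, c, lr=1e-2, margin=1e-3, max_iters=100_000):
    """Normalized gradient ascent on det(Q) until Q is positive definite."""
    for it in range(max_iters):
        if is_positive_definite(q_matrix(k, c), margin):
            return k, it
        grad = ddet_dk(k, c)
        norm = abs(grad)
        if norm == 0.0:
            break  # no ascent direction available
        k += lr * grad / norm  # step length is lr regardless of gradient scale
    raise RuntimeError("did not reach a positive definite Q")


k_nominal = 1.5                              # feasible for the nominal c = 1.0
k_adapted, iters = adapt_k(k_nominal, c=0.5)  # perturbed system parameter
print(f"adapted k = {k_adapted:.3f} after {iters} iterations")
```

In this toy matrix, the determinant is $kc-1$, so a smaller $c$ requires a larger $k$, which matches qualitatively the trend discussed under tolerance against variations. Because the gradient is normalized, the number of iterations depends only on the distance to the feasible region and the learning rate.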

VI Conclusion and Future Work

In this paper, we presented a safety index adaptation (SIA) approach to update safe control laws in response to varying system dynamics in real time. Our approach replaces full safety index synthesis, which is extremely slow, with fast closed-form updates to controller parameters. Through numerical studies, we verified that our approach allows the agent to quickly adapt to new system dynamics and achieve zero safety violations.

In practice, after the system dynamics change, the system is inevitably guarded by an outdated safety index during the adaptation computation time. Hence, as future work, it is worth studying the system’s behavior during such a transition period to draw critical insights, for instance, whether the adaptation can finish before the agent crashes into unsafe regions. If not, the agent should stop navigation and wait for the new safety index. Another promising direction is to handle continuously changing dynamics as opposed to step parameter changes, which will bring new questions on the tolerance of synthesized safety indices and the criterion of triggering SIA. Finally, we aim to provide theoretical results such as the proof of convergence to the new safety index as well as the convergence rate.

References

[1] E. Yurtsever, J. Lambert, A. Carballo, and K. Takeda, “A survey of autonomous driving: Common practices and emerging technologies,” IEEE Access, vol. 8, pp. 58 443–58 469, 2020.
[2] H. Christensen, N. Amato, H. Yanco, M. Mataric, H. Choset, A. Drobnis, K. Goldberg, J. Grizzle, G. Hager, J. Hollerbach, et al., “A roadmap for US robotics–from internet to robotics 2020 edition,” Foundations and Trends in Robotics, vol. 8, no. 4, pp. 307–424, 2021.
[3] R. Liu, R. Chen, and C. Liu, “Safe interactive industrial robots using jerk-based safe set algorithm,” in International Symposium on Flexible Automation, 2022.
[4] T. Wei and C. Liu, “Safe control algorithms using energy functions: A unified framework, benchmark, and new directions,” in Conference on Decision and Control, 2019.
[5] C. Liu and M. Tomizuka, “Control in a safe set: Addressing safety in human-robot interactions,” in Dynamic Systems and Control Conference, 2014.
[6] A. D. Ames, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs with application to adaptive cruise control,” in Conference on Decision and Control, 2014.
[7] W. Zhao, T. He, and C. Liu, “Model-free safe control for zero-violation reinforcement learning,” in Conference on Robot Learning, 2021.
[8] W. Zhao, T. He, T. Wei, S. Liu, and C. Liu, “Safety index synthesis via sum-of-squares programming,” in American Control Conference, 2023.
[9] H. Ma, C. Liu, S. E. Li, S. Zheng, and J. Chen, “Joint synthesis of safety certificate and safe control policy using constrained reinforcement learning,” in Annual Learning for Dynamics and Control Conference, 2022.
[10] C. Dawson, Z. Qin, S. Gao, and C. Fan, “Safe nonlinear control using robust neural lyapunov-barrier functions,” in Conference on Robot Learning, 2022.
[11] T. Wei and C. Liu, “Safe control with neural network dynamic models,” in Learning for Dynamics and Control Conference, 2022.
[12] R. Chen, W. Zhao, and C. Liu, “Safety index synthesis with state-dependent control space,” in American Control Conference, 2024.
[13] R. A. Horn and C. R. Johnson, Matrix analysis. Cambridge university press, 2012.
[14] H.-C. Lin, C. Liu, Y. Fan, and M. Tomizuka, “Real-time collision avoidance algorithm on industrial manipulators,” in Conference on Control Technology and Applications, 2017.
[15] A. J. Taylor and A. D. Ames, “Adaptive safety with control barrier functions,” in American Control Conference, 2020.
[16] B. T. Lopez, J.-J. E. Slotine, and J. P. How, “Robust adaptive control barrier functions: An adaptive and data-driven approach to safety,” IEEE Control Systems Letters, vol. 5, no. 3, pp. 1031–1036, 2021.
[17] B. T. Lopez and J.-J. E. Slotine, “Unmatched control barrier functions: Certainty equivalence adaptive safety,” in American Control Conference, 2023.
[18] D. D. Fan, J. Nguyen, R. Thakker, N. Alatur, A.-a. Agha-mohammadi, and E. A. Theodorou, “Bayesian learning-based adaptive control for safety critical systems,” in International Conference on Robotics and Automation, 2020.
[19] L. Brunke, S. Zhou, and A. P. Schoellig, “Barrier bayesian linear regression: Online learning of control barrier conditions for safety-critical control of uncertain systems,” in Annual Learning for Dynamics and Control Conference, 2022.
[20] M. A. Khan, T. Ibuki, and A. Chatterjee, “Gaussian control barrier functions: Non-parametric paradigm to safety,” IEEE Access, vol. 10, pp. 99 823–99 836, 2022.
[21] M. H. Cohen and C. Belta, “High order robust adaptive control barrier functions and exponentially stabilizing adaptive control lyapunov functions,” in American Control Conference, 2022.
[22] H. Wang, J. Peng, J. Xu, F. Zhang, and Y. Wang, “High-order control barrier functions-based optimization control for time-varying nonlinear systems with full-state constraints: A dynamic sub-safe set approach,” International Journal of Robust and Nonlinear Control, vol. 33, no. 8, pp. 4490–4503, 2023.
[23] W. Xiao, C. Belta, and C. G. Cassandras, “Adaptive control barrier functions,” IEEE Transactions on Automatic Control, vol. 67, no. 5, pp. 2267–2281, 2022.
[24] P. A. Parrilo, “Semidefinite programming relaxations for semialgebraic problems,” Mathematical programming, vol. 96, pp. 293–320, 2003.
