Dynamic Adaptation Gains for Nonlinear Systems with Unmatched Uncertainties

Brett T. Lopez$^{1}$ and Jean-Jacques Slotine$^{2}$
$^{1}$Verifiable and Control-Theoretic Robotics Laboratory, University of California, Los Angeles, Los Angeles, CA.
$^{2}$Nonlinear Systems Laboratory, Massachusetts Institute of Technology, Cambridge, MA.
Abstract

We present a new direct adaptive control approach for nonlinear systems with unmatched and matched uncertainties. The method relies on adjusting the adaptation gains of individual unmatched parameters whose adaptation transients would otherwise destabilize the closed-loop system. The approach also guarantees the restoration of the adaptation gains to their nominal values and can readily incorporate direct adaptation laws for matched uncertainties. The proposed framework is general as it only requires stabilizability for all possible models.

Index Terms:
Adaptive control, uncertain systems.

I Introduction

Adaptive control of systems with unmatched uncertainties, i.e., model perturbations outside the span of the control input matrix, is notoriously difficult because the uncertainties cannot be directly canceled by the control input. This is especially true for nonlinear systems where combining a stable model estimator with a nominal feedback controller does not necessarily yield a stable closed-loop system without imposing limits on the growth rate of the uncertainties, so as to prevent finite escape. The prevailing approach for handling nonlinear systems with unmatched uncertainties has been to construct nominal feedback controllers that are robust to time-varying parameter estimates [1]. More precisely, one seeks to construct an input-to-state stable control Lyapunov function (ISS-clf) [2] which, by construction, ensures the system converges to a region near the desired state despite parameter estimation error and transients. Once an ISS-clf and corresponding controller are known, then any stable model estimator can be employed without concern of instability. From a theoretical standpoint, this framework guarantees stable closed-loop control and estimation despite unmatched uncertainties but requires constructing an ISS-clf – a nontrivial task unless the system takes a particular structure.

The approach discussed above is categorized as indirect adaptive control and entails combining a robust controller with a stable model estimator. Conversely, direct adaptive control uses Lyapunov-like stability arguments to construct a parameter adaptation law that guarantees the state converges to a desired value. This eliminates the robustness requirements inherent to indirect adaptive control which, in some ways, simplifies design. However, since unmatched uncertainties cannot be directly canceled through control, it is necessary to construct a family of Lyapunov functions that depend on the estimates of the unmatched parameters. This model dependency introduces sign-indefinite terms (related to the parameter estimation transients) in the stability proof that require sophisticated direct adaptive control schemes to achieve stable closed-loop control and adaptation. Adaptive control Lyapunov functions (aclf) were proposed to cancel the problematic transient terms by synthesizing a family of clf’s for a modified dynamical system that depends on the family of clf’s [3]. Recently, [4] proposed a method that adjusts the adaptation gain online to cancel the undesirable transient terms and requires no modifications to the standard clf definition or dynamical system of interest.

The main contribution of this work is a new direct adaptive control methodology which generalizes and improves the online adaptation gain adjustment approach from [4], subsequently called dynamic adaptation gains, yielding a superior framework for handling various forms of model uncertainties. The new framework has two distinguishing properties: 1) adaptation gains are individually adjusted online to prevent parameter adaptation transients from destabilizing the system and 2) each adaptation gain is guaranteed to return to its nominal value asymptotically. Conceptually, the adaptation gain is lowered, i.e., adaptation is slowed, for parameters whose transients are destabilizing; the adaptation gain then returns to its nominal value once the transients subside. Noteworthy properties are examined, e.g., handling of matched uncertainties, and several forms of the dynamic adaptation gain update law are derived. The approach relies only on stabilizability of the uncertain system, so it is a very general framework. Simulation results for a nonlinear system with unmatched and matched uncertainties demonstrate the approach.

Notation: The sets of nonnegative and strictly positive scalars will be denoted as $\mathbb{R}_{\geq 0}$ and $\mathbb{R}_{>0}$, respectively. The shorthand notation for a function $T$ parameterized by a vector $a$ with vector argument $s$ will be $T_{a}(s)=T(s;a)$. The partial derivative of a function $N(x,y)$ will sometimes be denoted as $\nabla_{x}N(x,y)=\frac{\partial N}{\partial x}$ where the subscript on $\nabla$ is omitted when there is no ambiguity.

II Problem Formulation

This work addresses control of uncertain dynamical systems of the form

$$\dot{x}=f(x,t)-\Delta(x,t)^{\top}\theta+B(x,t)u, \tag{1}$$

with state $x\in\mathbb{R}^{n}$, control input $u\in\mathbb{R}^{m}$, nominal dynamics $f:\mathbb{R}^{n}\times\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}^{n}$, and control input matrix $B:\mathbb{R}^{n}\times\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}^{n\times m}$. The uncertain dynamics are a linear combination of known regression vectors $\Delta:\mathbb{R}^{n}\times\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}^{p\times n}$ and unknown parameters $\theta\in\mathbb{R}^{p}$. We assume that Equation 1 is locally Lipschitz uniformly and that the state $x$ is measured. The following assumption is made on the unknown parameters $\theta$.

Assumption 1.

The unknown parameters $\theta$ belong to a known compact convex set $\Theta\subset\mathbb{R}^{p}$.

An immediate consequence of Assumption 1 is that the parameter estimation error $\tilde{\theta}\triangleq\hat{\theta}-\theta$ must also belong to a known compact convex set $\tilde{\Theta}$. Each parameter must then have a finite maximum error where $|\tilde{\theta}_{i}|\leq\tilde{\vartheta}_{i}<\infty$ for $i=1,\dots,p$.

III Main Results

III-A Overview

This section presents the main results of this work. We first review the definition of the unmatched control Lyapunov function [4] and its role in the proposed approach. We then derive the first main result followed by several noteworthy extensions, e.g., the scenario where unmatched and matched uncertainties are present in addition to augmenting the adaptation law with model estimation (composite adaptation). We also present the so-called leakage modification that guarantees the adaptation gains return to their nominal values asymptotically. Conceptually, the proposed approach relies on adjusting the adaptation gain of individual parameter estimates online to prevent destabilization induced by parameter adaptation transients. This is in stark contrast to [4] where a single gain for all parameters was adjusted. Consequently, this work can be considered a more general and natural formulation of [4].

III-B Unmatched Control Lyapunov Functions

Similar to [4], the so-called unmatched control Lyapunov function will be used throughout this letter.

Definition 1 (cf. [4]).

A smooth, positive-definite function $V_{\theta}:\mathbb{R}^{n}\times\mathbb{R}^{p}\times\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}_{\geq 0}$ is an unmatched control Lyapunov function (uclf) if it is radially unbounded in $x$ and for each $\theta\in\Theta\subset\mathbb{R}^{p}$

$$\inf_{u\in\mathbb{R}^{m}}\left\{\frac{\partial V_{\theta}}{\partial t}+\frac{\partial V_{\theta}}{\partial x}^{\top}\left[f-\Delta^{\top}\theta+Bu\right]\right\}\leq-Q_{\theta}$$

where $Q_{\theta}:\mathbb{R}^{n}\times\mathbb{R}^{p}\rightarrow\mathbb{R}_{\geq 0}$ is continuously differentiable, radially unbounded in $x$, and positive-definite.

Remark 1.

Definition 1 can also be stated in terms of a contraction metric [5], see [4, Def. 1] for more details.

Definition 1 is noteworthy because the existence of an uclf is equivalent (see [4, Prop. 1]) to system Equation 1 being stabilizable for all $\theta\in\Theta$. This is the weakest possible requirement one can impose on Equation 1 and therefore showcases the generality of the approach. Definition 1 involves constructing a clf for each possible model realization, i.e., a family of clf’s, so convergence to the desired equilibrium can be achieved if a suitable adaptation law can be derived. Pragmatically, this can be done analytically, e.g., via backstepping, or numerically via discretization or sum-of-squares programming with the uclf search occurring over the state-parameter space. Definition 1 adopts the certainty equivalence principle philosophy: design a clf (or equivalently a controller) as if the unknown parameters are known and simply replace them with their estimated values online. This intuitive design approach is not generally possible for nonlinear systems with unmatched uncertainties unless additional robustness properties are imposed or the system is modified, both of which represent a departure from certainty equivalence. The approach developed in this work completely bypasses any additional requirements or system modifications by instead adjusting the adaptation gain online where the so-called adaptation gain update law yields a stable closed-loop system. In other words, the proposed dynamic adaptation gains method expands the use of the certainty equivalence principle to general nonlinear systems with unmatched uncertainties, simplifying the design of stable adaptive controllers.

III-C Dynamic Adaptation Gains

The proposed method involves adjusting the adaptation gain, typically denoted as a scalar $\gamma$ or symmetric positive-definite matrix $\Gamma$ in the literature, to prevent the parameter adaptation transients from destabilizing the closed-loop system. It is convenient to express the adaptation gain as a function of a scalar argument that has certain properties. We will make use of the following definition in deriving an adaptation gain update law that achieves closed-loop stability.

Definition 2.

An admissible dynamic adaptation gain $\gamma:\mathbb{R}\rightarrow\mathbb{R}_{>0}$ is a function with scalar argument $\rho$ such that $\gamma(0)=\bar{\gamma}$ is the nominal adaptation gain, $\lim_{\rho\rightarrow-\infty}\gamma(\rho)=c>0$, and $0<\nabla\gamma(\rho)<\infty$.

Remark 2.

There are several functions at the disposal of the designer when selecting an admissible dynamic adaptation gain. Two examples are $\gamma(\rho)=\bar{\gamma}\,\bigl(\frac{0.9}{\rho^{2}+1}+0.1\bigr)$ or $\gamma(\rho)=\bar{\gamma}\,\bigl(0.9\,\exp(\rho/\tau)+0.1\bigr)$.
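The two candidate gains in Remark 2 can be sanity-checked numerically against Definition 2. The following is a minimal sketch rather than part of the original development: the values of `GAMMA_BAR` and `TAU` and all function names are illustrative assumptions. The slope of the rational form is sampled only for $\rho<0$ because its derivative vanishes at $\rho=0$.

```python
import math

GAMMA_BAR = 2.0  # nominal adaptation gain (illustrative value, an assumption)
TAU = 1.0        # scale of the exponential gain (illustrative value, an assumption)


def gamma_exp(rho, gamma_bar=GAMMA_BAR, tau=TAU):
    """Exponential gain from Remark 2: gamma_bar * (0.9 exp(rho / tau) + 0.1)."""
    return gamma_bar * (0.9 * math.exp(rho / tau) + 0.1)


def grad_gamma_exp(rho, gamma_bar=GAMMA_BAR, tau=TAU):
    """Derivative of gamma_exp with respect to rho."""
    return gamma_bar * 0.9 * math.exp(rho / tau) / tau


def gamma_rational(rho, gamma_bar=GAMMA_BAR):
    """Rational gain from Remark 2: gamma_bar * (0.9 / (rho^2 + 1) + 0.1)."""
    return gamma_bar * (0.9 / (rho ** 2 + 1.0) + 0.1)


def grad_gamma_rational(rho, gamma_bar=GAMMA_BAR):
    """Derivative of gamma_rational with respect to rho."""
    return -gamma_bar * 1.8 * rho / (rho ** 2 + 1.0) ** 2


# Properties listed in Definition 2, checked at a few sample points.
nominal_ok = (abs(gamma_exp(0.0) - GAMMA_BAR) < 1e-12
              and abs(gamma_rational(0.0) - GAMMA_BAR) < 1e-12)
# Both gains approach the lower bound c = 0.1 * gamma_bar as rho -> -infinity.
lower_bound_ok = (abs(gamma_exp(-50.0) - 0.1 * GAMMA_BAR) < 1e-9
                  and abs(gamma_rational(-1e6) - 0.1 * GAMMA_BAR) < 1e-9)
# Slopes sampled on rho < 0 only (the rational slope is zero at rho = 0).
slope_ok = all(grad_gamma_exp(r) > 0.0 and grad_gamma_rational(r) > 0.0
               for r in (-10.0, -1.0, -0.1))
```

Under these assumptions both gains start at $\bar{\gamma}$ and saturate at $c=0.1\,\bar{\gamma}$ as $\rho\rightarrow-\infty$.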

With Definitions 1 and 2, we are now ready to state the first main theorem of this work.

Theorem 1.

Consider the uncertain system Equation 1 with $x_{d}$ being the desired equilibrium point. If an uclf $V_{\theta}(x,t)$ exists, then $x\rightarrow x_{d}$ asymptotically with the adaptation law

$$\dot{\hat{\theta}}=-\,\mathrm{diag}(\gamma_{1}(\rho_{1}),\dots,\gamma_{p}(\rho_{p}))\,\Delta(x,t)\frac{\partial V_{\hat{\theta}}}{\partial x} \tag{2a}$$

where each $\gamma_{i}(\cdot)$ is an admissible dynamic adaptation gain whose update law satisfies

$$\dot{\gamma}_{i}(\rho_{i})\leq-2\frac{\gamma_{i}(\rho_{i})^{2}}{(\eta_{i}-\tilde{\theta}^{2}_{i})}\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i} \tag{2b}$$

for $\eta_{i}>\tilde{\vartheta}_{i}\geq\tilde{\theta}_{i}$ with $i=1,\dots,p$.

Proof.

Consider the Lyapunov-like function

$$V_{c}(t)=V_{\hat{\theta}}(x,t)+\frac{1}{2}\sum_{i=1}^{p}\frac{(\tilde{\theta}_{i}^{2}-\eta_{i})}{\gamma_{i}(\rho_{i})}, \tag{3}$$

where $\tilde{\theta}=\hat{\theta}-\theta$ and $\eta_{i}>\tilde{\vartheta}_{i}\geq\tilde{\theta}_{i}$ with $\eta_{i}$ finite. Differentiating Equation 3 yields

$$\begin{aligned}\dot{V}_{c}(t)=&\,\frac{\partial V_{\hat{\theta}}}{\partial t}+\frac{\partial V_{\hat{\theta}}}{\partial x}^{\top}[f(x,t)-\Delta(x,t)^{\top}\theta+B(x,t)u]\\ &+\sum_{i=1}^{p}\biggl[\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i}+\frac{\tilde{\theta}_{i}}{\gamma_{i}(\rho_{i})}\dot{\hat{\theta}}_{i}+\frac{1}{2}\frac{(\eta_{i}-\tilde{\theta}_{i}^{2})}{\gamma_{i}(\rho_{i})^{2}}\,\dot{\gamma}_{i}(\rho_{i})\biggr].\end{aligned}$$

Since $V_{\hat{\theta}}(x,t)$ is an uclf then

$$\begin{aligned}\dot{V}_{c}(t)\leq&\,-Q_{\hat{\theta}}(x)+\frac{\partial V_{\hat{\theta}}}{\partial x}^{\top}\Delta(x,t)^{\top}\tilde{\theta}\\ &+\sum_{i=1}^{p}\biggl[\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i}+\frac{\tilde{\theta}_{i}}{\gamma_{i}(\rho_{i})}\dot{\hat{\theta}}_{i}+\frac{1}{2}\frac{(\eta_{i}-\tilde{\theta}_{i}^{2})}{\gamma_{i}(\rho_{i})^{2}}\,\dot{\gamma}_{i}(\rho_{i})\biggr].\end{aligned}$$

Noting

$$\sum_{i=1}^{p}\frac{\tilde{\theta}_{i}}{\gamma_{i}(\rho_{i})}\dot{\hat{\theta}}_{i}=\tilde{\theta}^{\top}[\mathrm{diag}(\gamma_{1}(\rho_{1}),\dots,\gamma_{p}(\rho_{p}))]^{-1}\,\dot{\hat{\theta}}$$

then applying Equation 2a yields

$$\dot{V}_{c}(t)\leq-Q_{\hat{\theta}}(x)+\sum_{i=1}^{p}\biggl[\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i}+\frac{1}{2}\frac{(\eta_{i}-\tilde{\theta}_{i}^{2})}{\gamma_{i}(\rho_{i})^{2}}\,\dot{\gamma}_{i}(\rho_{i})\biggr].$$

If $\dot{\gamma}_{i}(\rho_{i})$ is chosen so Equation 2b is satisfied, then the term in brackets is nonpositive so $\dot{V}_{c}(t)\leq-Q_{\hat{\theta}}(x)<0$ if $x\neq x_{d}$. Hence, $V_{c}(t)$ is nonincreasing so $V_{\hat{\theta}}(x,t)$ and $\tilde{\theta}$ are bounded since $\gamma_{i}(\rho_{i})$ is lower-bounded and $\eta_{i}$ is bounded by construction. Since $V_{\hat{\theta}}(x,t)$ is radially unbounded then $x$ must also be bounded. By definition, $Q_{\hat{\theta}}(x)$ is continuously differentiable so $\dot{Q}_{\hat{\theta}}(x)$ is bounded for bounded $x$, $\tilde{\theta}$ and therefore $Q_{\hat{\theta}}(x)$ is uniformly continuous. Since $V_{c}(t)$ is nonincreasing and bounded from below then $\lim_{t\rightarrow\infty}\int_{0}^{t}Q_{\hat{\theta}}(x(\tau))\,d\tau\leq V_{c}(0)-\lim_{t\rightarrow\infty}V_{c}(t)$ exists and is finite so by Barbalat’s lemma $\lim_{t\rightarrow\infty}Q_{\hat{\theta}}(x(t))=0$. Recalling $Q_{\hat{\theta}}(x)=0\iff x=x_{d}$ then we can conclude $x(t)\rightarrow x_{d}$ as $t\rightarrow\infty$ as desired. ∎

Remark 3.

Since the unknown parameters belong to a compact convex set $\Theta$ then one can employ the projection operator $\mathrm{Proj}_{\Theta}(\cdot)$ to ensure $\hat{\theta}\in\Theta$ without affecting stability [6, 7].

Remark 4.

Equation 2b is expressed as an update to $\dot{\gamma}_{i}(\rho_{i})$ to make the dynamic adaptation gain interpretation more obvious. For implementation, one should rewrite Equation 2b as a bound on $\dot{\rho}_{i}$ via the chain rule. Note $\nabla\gamma_{i}(\rho_{i})$ is invertible by Definition 2.
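As a sketch of the rewriting suggested in Remark 4: since $\dot{\gamma}_{i}(\rho_{i})=\nabla\gamma_{i}(\rho_{i})\,\dot{\rho}_{i}$ and $\nabla\gamma_{i}(\rho_{i})>0$ by Definition 2, dividing Equation 2b by $\nabla\gamma_{i}(\rho_{i})$ preserves the inequality. The second line specializes the slope to the exponential gain of Remark 2, which is used here only as an illustration.

```latex
\begin{align*}
\dot{\rho}_i &\leq -\frac{2\,\gamma_i(\rho_i)^2}{(\eta_i-\tilde{\theta}_i^2)\,\nabla\gamma_i(\rho_i)}\,
                \frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_i}\,\dot{\hat{\theta}}_i, \\
\nabla\gamma_i(\rho_i) &= \frac{0.9\,\bar{\gamma}_i}{\tau}\,e^{\rho_i/\tau}
  \quad\text{for}\quad \gamma_i(\rho_i)=\bar{\gamma}_i\,\bigl(0.9\,e^{\rho_i/\tau}+0.1\bigr).
\end{align*}
```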

Before deriving various implementable forms of the dynamic adaptation gain update law, it is instructive to analyze its general behavior. For the case where a parameter’s adaptation transients are destabilizing, i.e., $\tfrac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\tfrac{d}{dt}\hat{\theta}_{i}>0$ for some $i$, we see from Equation 2b that $\dot{\gamma}_{i}(\rho_{i})<0$ so the adaptation gain decreases to prevent destabilization. In other words, the adaptation rate is slowed for parameters whose adaptation transients could cause instability. Note that there is no concern of the adaptation gain $\gamma_{i}(\rho_{i})$ becoming negative since $\gamma_{i}(\rho_{i})$ must be an admissible dynamic adaptation gain so, by definition, $\gamma_{i}(\rho_{i})$ is lower-bounded by a positive constant. With that said, $\rho_{i}$ could become a large negative number which does not affect the proof of Theorem 1 but can have practical implications; this will be discussed in more detail below. Now considering the case where a parameter’s adaptation transients are stabilizing, i.e., $\tfrac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\tfrac{d}{dt}\hat{\theta}_{i}\leq 0$ for some $i$, we see from Equation 2b that $\dot{\gamma}_{i}(\rho_{i})$ is simply upper-bounded by a nonnegative quantity. Consequently, one can set $\dot{\gamma}_{i}(\rho_{i})=0$ and still obtain a stable closed-loop system since the transients term is nonpositive, thereby leading to the desired stability inequality in the proof of Theorem 1. Another choice would be to let $\dot{\gamma}_{i}(\rho_{i})>0$ (without exceeding the inequality Equation 2b) until $\gamma_{i}(\rho_{i})=\bar{\gamma}_{i}$, i.e., the nominal adaptation gain value, at which point one sets $\dot{\gamma}_{i}(\rho_{i})=0$. This leads us to one implementable form of the adaptation gain update law.

Corollary 1.

The adaptation gain update law given by

$$\dot{\gamma}_{i}(\rho_{i})=\begin{cases}-\dfrac{2\bar{\gamma}_{i}^{2}}{(\eta_{i}-\tilde{\vartheta}^{2}_{i})}\dfrac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i} & \text{when}~\dfrac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i}>0\\[6pt] -\dfrac{2c_{i}^{2}}{\eta_{i}}\dfrac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i} & \text{when}~\dfrac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i}\leq 0~\text{and}~\gamma_{i}(\rho_{i})<\bar{\gamma}_{i}\\[6pt] 0 & \text{when}~\dfrac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i}\leq 0~\text{and}~\gamma_{i}(\rho_{i})=\bar{\gamma}_{i},\end{cases}$$

for $i=1,\dots,p$ satisfies condition Equation 2b in Theorem 1 so $x\rightarrow x_{d}$ asymptotically as desired.

The component-wise nature of the adaptation gain update law presented in Corollary 1 systematically checks the adaptation transients of each parameter and adjusts the adaptation gain accordingly to preserve stability. Algorithmically, the update law cycles through each parameter and updates the adaptation gain on the parameters whose transients are problematic. Treating parameters individually rather than as a whole (as in [4], see the Appendix) allows for targeted adjustments to problematic parameters which yields less myopic behavior of the controller and better closed-loop performance. Note this behavior is quite different than that in [8] where the gain was only increased to achieve stability.
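The per-parameter cycling described above can be sketched in a few lines. This is an illustrative implementation of the three cases of Corollary 1, not the authors' code: the function and argument names are assumptions, `s_i` stands for the product $\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\dot{\hat{\theta}}_{i}$ (assumed to be computed elsewhere), and `c` is the lower bound of the gain from Definition 2.

```python
def gain_rate(gamma_i, s_i, gamma_bar, c, eta, vartheta):
    """Rate of change of one adaptation gain following the cases of Corollary 1.

    gamma_i   : current gain value gamma_i(rho_i)
    s_i       : dV/dtheta_hat_i * theta_hat_dot_i (transient term, hypothetical input)
    gamma_bar : nominal gain
    c         : positive lower bound of the gain (Definition 2)
    eta       : constant eta_i from Theorem 1
    vartheta  : maximum parameter error bound for parameter i
    """
    if s_i > 0.0:
        # Destabilizing transient: lower the gain.
        return -2.0 * gamma_bar ** 2 / (eta - vartheta ** 2) * s_i
    if gamma_i < gamma_bar:
        # Stabilizing transient and gain below nominal: let the gain recover.
        return -2.0 * c ** 2 / eta * s_i
    # Stabilizing transient and gain already nominal: hold.
    return 0.0


def gain_rates(gammas, s, gamma_bars, cs, etas, varthetas):
    """Apply gain_rate to every parameter in turn."""
    return [gain_rate(g, si, gb, ci, e, v)
            for g, si, gb, ci, e, v in zip(gammas, s, gamma_bars, cs, etas, varthetas)]
```

A discrete-time implementation would likely also need to clip the gain at $\bar{\gamma}_{i}$ after each integration step, since the equality test in the third case is rarely met exactly in floating point; that detail is an implementation assumption and is not addressed in the text.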

Remark 5.

Corollary 1 for $\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\tfrac{d}{dt}{\hat{\theta}}_{i}\leq 0$ can be rewritten as

$$\dot{\gamma}_{i}(\rho_{i})=-\frac{2c^{2}_{i}}{\eta_{i}}\,\frac{\partial}{\partial\hat{\theta}_{i}}\bigl[\log(h(x)(V_{\hat{\theta}}(x,t)+c))\bigr]\,\dot{\hat{\theta}}_{i}$$

where $h:\mathbb{R}^{n}\rightarrow\mathbb{R}_{>0}$ and $c\in\mathbb{R}_{>0}$. This form shows that the adaptation gain update is unaffected if an uclf is scaled by a uniformly positive function $h(x)$. This could have implications for safety-critical adaptive control if $h(x)$ were akin to a barrier function. If $h(x)$ were also upper-bounded then a similar relationship can be obtained for the case $\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\tfrac{d}{dt}{\hat{\theta}}_{i}>0$. The gradient term above is similar to the score function $\nabla\log p(x)$ in diffusion-based generative modeling [9], where $p(x)$ is a probability density known only within a scaling factor (the partition function) which disappears in $\nabla\log p(x)$.

III-D Leakage Modification: Bounding $\rho$

The proof of Theorem 1 requires the individual adaptation gains be lower-bounded. This is accomplished through appropriate selection of each $\gamma_{i}(\cdot)$ to meet the conditions of Definition 2. It is desirable to ensure the scalar argument $\rho_{i}$ remains bounded through means beyond simple parameter tuning to prevent numerical instability caused by $|\rho_{i}|$ becoming too large. Having the adaptation gains return to their nominal value automatically after the adaptation transients have subsided is also ideal. We propose to replace the pure integrator $\rho_{i}$ dynamics with nonlinear first-order dynamics, i.e., a leakage modification, to achieve these desirable properties. The modified adaptation gain update law takes the form

$$\dot{\rho}_{i}=2\frac{\gamma_{i}(\rho_{i})^{2}}{\nabla\gamma_{i}(\rho_{i})}\big[-\lambda_{i}\,\rho_{i}+K_{i}\,w_{i}(x)\big]\,, \tag{4}$$

where $w_{i}(x)=\Bigl[\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}}^{\top}\Delta(x,t)\frac{\partial V_{\hat{\theta}}}{\partial x}\Bigr]_{i}$ with $[\cdot]_{i}$ the $i$-th component, $K_{i}=\bar{\gamma}_{i}/(\eta_{i}-\tilde{\vartheta}^{2}_{i})$ if $w_{i}(x)<0$ and $K_{i}=0$ otherwise, and $\lambda_{i}\in\mathbb{R}_{>0}$. Note $w_{i}(x)<0$ corresponds to the transients being destabilizing, so the input of Equation 4 is zero if the transients are stabilizing or have subsided. The following lemma establishes important properties of Equation 4.

Lemma 1.

The output of the dynamic adaptation gain update law Equation 4 remains bounded if the exogenous signal $w_{i}(x)$ is bounded. Furthermore, the output tends to zero if $w_{i}(x)\rightarrow 0$.

The proof can be found in the Appendix. We now show Equation 4 when combined with Equation 2a yields a stable closed-loop system.

Theorem 2.

Consider the uncertain system Equation 1 with $x_{d}$ being the desired equilibrium point. If an uclf $V_{\theta}(x,t)$ exists, then $x\rightarrow x_{d}$ asymptotically with the adaptation law Equation 2a and dynamic adaptation gain update law Equation 4. Furthermore, $\rho_{i}\rightarrow 0$ and in turn $\gamma_{i}(\rho_{i})\rightarrow\bar{\gamma}_{i}$ asymptotically for each $i=1,\dots,p$.

Proof.

Consider the Lyapunov-like function

$$V_{c}(t)=V_{\hat{\theta}}(x,t)+\sum_{i=1}^{p}\eta_{i}\,\lambda_{i}\int_{t}^{T}|\rho_{i}(\tau)|\,d\tau+\frac{1}{2}\sum_{i=1}^{p}\frac{(\tilde{\theta}_{i}^{2}-\eta_{i})}{\gamma_{i}(\rho_{i})},$$

where $\eta_{i}$ and $\lambda_{i}$ are defined as before and $T$ is sufficiently large [10, 11] but finite to ensure the integral is finite. Differentiating and applying Definition 1 together with Equations 2a and 4,

$$\begin{aligned}\dot{V}_{c}(t)&\leq-Q_{\hat{\theta}}(x)-\sum_{i=1}^{p}\eta_{i}\,\lambda_{i}\,|\rho_{i}|\\ &\hphantom{\leq}+\sum_{i=1}^{p}\left[\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\dot{\hat{\theta}}_{i}+(\eta_{i}-\tilde{\theta}_{i}^{2})[-\lambda_{i}\,\rho_{i}+K_{i}\,w_{i}(x)]\right]\\ &\leq-Q_{\hat{\theta}}(x)-\sum_{i=1}^{p}\left[\eta_{i}\,\lambda_{i}\,|\rho_{i}|+(\eta_{i}-\tilde{\theta}_{i}^{2})\,\lambda_{i}\,\rho_{i}\right],\end{aligned}$$

where the second inequality holds by choice of $K_{i}$. Also by choice of $K_{i}$, Equation 4 is only driven by an input that is either negative or zero, so $\rho_{i}(t)\leq 0$ for all $t\geq 0$ since $\rho_{i}(0)=0$. Hence, each term in brackets equals $\lambda_{i}\,\tilde{\theta}_{i}^{2}\,|\rho_{i}|\geq 0$, so the remaining sum is nonpositive, which yields $\dot{V}_{c}(t)\leq-Q_{\hat{\theta}}(x)$ so $V_{c}(t)$ is nonincreasing and in turn $x$ and $\tilde{\theta}$ are bounded. Using the same arguments as in Theorem 1 and noting $V_{c}(t)$ is still lower-bounded, $x(t)\rightarrow x_{d}$ as $t\rightarrow\infty$ via Barbalat’s lemma. Since $\tfrac{\partial V_{\hat{\theta}}}{\partial x}\rightarrow 0\iff x\rightarrow x_{d}$ by construction, then each $w_{i}(x)\rightarrow 0$ as $x\rightarrow x_{d}$. Hence, by Lemma 1, $\rho_{i}\rightarrow 0$ and in turn $\gamma_{i}(\rho_{i})\rightarrow\bar{\gamma}_{i}$ for $i=1,\dots,p$. ∎
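The qualitative behavior of Equation 4 established in Lemma 1 and Theorem 2 can be illustrated with a scalar simulation. The sketch below is an assumption-laden toy, not the experiment from the paper: it uses the exponential gain with $\bar{\gamma}=1$, a forward-Euler integrator, and a hypothetical input `kw` (standing for $K_{i}w_{i}(x)$) that is a negative pulse followed by zero.

```python
import math

GAMMA_BAR = 1.0  # nominal gain (illustrative value, an assumption)
LAM = 1.0        # leakage rate lambda_i (illustrative value, an assumption)


def gamma(rho):
    """Exponential dynamic adaptation gain."""
    return GAMMA_BAR * (0.9 * math.exp(rho) + 0.1)


def grad_gamma(rho):
    """Slope of the gain with respect to rho."""
    return GAMMA_BAR * 0.9 * math.exp(rho)


def rho_dot(rho, kw):
    """Right-hand side of Equation 4; kw stands for K_i * w_i(x) <= 0."""
    return 2.0 * gamma(rho) ** 2 / grad_gamma(rho) * (-LAM * rho + kw)


def simulate(t_final=10.0, dt=1e-3, pulse_end=1.0, pulse=-1.0):
    """Forward-Euler integration of rho with rho(0) = 0."""
    rho, history = 0.0, []
    for k in range(int(round(t_final / dt))):
        # Destabilizing transient (negative input) early on, zero afterwards.
        kw = pulse if k * dt < pulse_end else 0.0
        rho += dt * rho_dot(rho, kw)
        history.append(rho)
    return history


history = simulate()
final_gain = gamma(history[-1])
```

In this toy run $\rho$ stays nonpositive, never drops below `pulse / LAM`, and leaks back toward zero once the input vanishes, so the gain recovers toward its nominal value.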

III-E Matched and Unmatched Uncertainties

A particularly useful property of the dynamic adaptation gains method is the ability to treat matched and unmatched uncertainties separately. This is in stark contrast to the approach taken in [4] where all adaptation gains were adjusted to cancel adaptation transients. The separability inherent to the current method is indicative of the more natural formalism of treating adaptation transients on an individual basis rather than as a whole. The following theorem solidifies this point.

Theorem 3.

Assume system Equation 1 can be rewritten as

$$\dot{x}=f(x,t)-\Delta(x,t)^{\top}\theta+B(x,t)[u-\Psi(x,t)^{\top}\phi] \tag{5}$$

where $\phi\in\Phi\subset\mathbb{R}^{q}$ are the matched parameters with known regression vectors $\Psi:\mathbb{R}^{n}\times\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}^{q\times n}$. Let $x_{d}$ denote the desired equilibrium point. If an uclf $V_{\theta}(x,t)$ exists with corresponding control law $u_{\theta}$, then $x\rightarrow x_{d}$ asymptotically with the controller $\kappa=u_{\hat{\theta}}+\Psi(x,t)^{\top}\hat{\phi}$ and adaptation laws

$$\begin{aligned}\dot{\hat{\phi}}&=-\,\Gamma\,B(x,t)^{\top}\Psi(x,t)\,\frac{\partial V_{\hat{\theta}}}{\partial x}\\ \dot{\hat{\theta}}&=-\,\mathrm{diag}(\gamma_{1}(\rho_{1}),\dots,\gamma_{p}(\rho_{p}))\,\Delta(x,t)\,\frac{\partial V_{\hat{\theta}}}{\partial x}\end{aligned} \tag{6}$$

where $\Gamma$ is a constant symmetric positive-definite matrix and each $\gamma_{i}(\cdot)$ is an admissible dynamic adaptation gain.

Proof.

Consider the new Lyapunov-like function

$$V_{c}(t)=V_{\hat{\theta}}(x,t)+\frac{1}{2}\tilde{\phi}^{\top}\Gamma^{-1}\tilde{\phi}+\frac{1}{2}\sum_{i=1}^{p}\frac{(\tilde{\theta}_{i}^{2}-\eta_{i})}{\gamma_{i}(\rho_{i})},$$

where $\tilde{\phi}\triangleq\hat{\phi}-\phi$ and $\tilde{\theta},~\eta_{i}$ are defined as before. Differentiating $V_{c}(t)$ along Equation 5 and substituting $\kappa$ yields

$$\begin{aligned}\dot{V}_{c}(t)\leq&\,-Q_{\hat{\theta}}(x)+\frac{\partial V_{\hat{\theta}}}{\partial x}^{\top}\Delta(x,t)^{\top}\tilde{\theta}\\ &+\frac{\partial V_{\hat{\theta}}}{\partial x}^{\top}\Psi(x,t)^{\top}B(x,t)\,\tilde{\phi}+\tilde{\phi}^{\top}\Gamma^{-1}\,\dot{\hat{\phi}}\\ &+\sum_{i=1}^{p}\biggl[\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}_{i}}\,\dot{\hat{\theta}}_{i}+\frac{\tilde{\theta}_{i}}{\gamma_{i}(\rho_{i})}\dot{\hat{\theta}}_{i}+\frac{1}{2}\frac{(\eta_{i}-\tilde{\theta}_{i}^{2})}{\gamma_{i}(\rho_{i})^{2}}\,\dot{\gamma}_{i}(\rho_{i})\biggr].\end{aligned}$$

Substituting in Eqs. 6 and 2b yields $\dot{V}_{c}(t)\leq-Q_{\hat{\theta}}(x)$ so $V_{c}(t)$ is nonincreasing. Similar to Theorem 1, we can conclude that $\lim_{t\rightarrow\infty}Q_{\hat{\theta}}(x(t))=0\implies x(t)\rightarrow x_{d}$ as $t\rightarrow\infty$. ∎

Theorem 3 can be immediately extended to use other well-known direct matched adaptation laws, e.g., those based on sliding variables [6], nonlinear damping [1], or any other suitable adaptation law. This further highlights the versatility of individual dynamic adaptation gains compared to [4].

III-F Composite Adaptation

Parameter adaptation transients can be markedly improved by combining direct and indirect adaptation schemes to form a composite adaptation law [12]. The following proposition shows composite adaptation with dynamic adaptation gains also yields a stable closed-loop system.

Proposition 1.

Assume a signal $\varepsilon_{\hat{\theta}}=W(x,t)^{\top}\tilde{\theta}$ for some matrix $W(x,t)$ is available via measurement or computation. If an uclf $V_{\theta}(x,t)$ exists, then $x\rightarrow x_{d}$ asymptotically with the composite adaptation law

$$\dot{\hat{\theta}}=-\mathrm{diag}(\gamma_{1}(\rho_{1}),\dots,\gamma_{p}(\rho_{p}))\Bigl(\Delta(x,t)\frac{\partial V_{\hat{\theta}}}{\partial x}+\beta\,W(x,t)\,\varepsilon_{\hat{\theta}}\Bigr)$$

where $\beta\in\mathbb{R}_{>0}$ and each $\gamma_{i}$ is an admissible dynamic adaptation gain whose update law satisfies Equation 2b.

Proof.

Using the Lyapunov-like function from Theorem 1 and applying the composite adaptation law with Equation 2b yields $\dot{V}_{c}(t)\leq-Q_{\hat{\theta}}(x)-\beta\,\tilde{\theta}^{\top}W(x,t)\,\varepsilon_{\hat{\theta}}=-Q_{\hat{\theta}}(x)-\beta\,\tilde{\theta}^{\top}W(x,t)W(x,t)^{\top}\tilde{\theta}\implies\dot{V}_{c}(t)\leq-Q_{\hat{\theta}}(x)$ so $V_{c}(t)$ is nonincreasing. Similar to Theorem 1, we can conclude that $\lim_{t\rightarrow\infty}Q_{\hat{\theta}}(x(t))=0\implies x(t)\rightarrow x_{d}$ as $t\rightarrow\infty$. ∎
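The key identity in the proof, namely that the composite term contributes $-\beta\,\|W^{\top}\tilde{\theta}\|^{2}\leq 0$ after the cross term with $\Delta\,\frac{\partial V_{\hat{\theta}}}{\partial x}$ cancels, can be checked numerically. The sketch below uses arbitrary made-up numbers and hypothetical names (`composite_rate`, `delta_grad` for the vector $\Delta(x,t)\frac{\partial V_{\hat{\theta}}}{\partial x}$); it is only a consistency check, not a controller.

```python
def composite_rate(gammas, delta_grad, W, eps, beta):
    """theta_hat_dot = -diag(gamma) (Delta dV/dx + beta W eps).

    gammas     : list of p gain values
    delta_grad : p-vector Delta(x, t) dV/dx (assumed computed elsewhere)
    W          : p-by-k matrix as nested lists
    eps        : k-vector W^T theta_tilde
    """
    p = len(gammas)
    W_eps = [sum(W[i][j] * eps[j] for j in range(len(eps))) for i in range(p)]
    return [-gammas[i] * (delta_grad[i] + beta * W_eps[i]) for i in range(p)]


# Arbitrary illustrative data (p = 2 parameters, k = 3 prediction errors).
theta_tilde = [0.5, -1.0]
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
eps = [sum(W[i][j] * theta_tilde[i] for i in range(2)) for j in range(3)]
gammas = [0.7, 1.3]
delta_grad = [0.3, -0.2]
beta = 2.0

theta_hat_dot = composite_rate(gammas, delta_grad, W, eps, beta)

# theta_tilde^T diag(gamma)^{-1} theta_hat_dot + theta_tilde^T Delta dV/dx
residual = (sum(theta_tilde[i] / gammas[i] * theta_hat_dot[i] for i in range(2))
            + sum(theta_tilde[i] * delta_grad[i] for i in range(2)))
expected = -beta * sum(e * e for e in eps)
```

Here `residual` matches `expected`, the negative semidefinite term that appears in the bound on $\dot{V}_{c}(t)$.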

IV Simulation Experiments

The developed method was tested on the system

$$\begin{bmatrix}\dot{x}_{1}\\ \dot{x}_{2}\\ \dot{x}_{3}\end{bmatrix}=\begin{bmatrix}x_{3}-\theta_{1}x_{1}\\ -x_{2}-\theta_{2}x^{2}_{1}\\ \tanh(x_{2})-\theta_{3}x_{3}-\theta_{4}x_{1}^{2}\end{bmatrix}+\begin{bmatrix}0\\ 0\\ 1\end{bmatrix}u, \tag{7}$$

with state $x=[x_{1},~x_{2},~x_{3}]^{\top}$, unknown parameters $\theta=[\theta_{1},~\theta_{2},~\theta_{3},~\theta_{4}]^{\top}$ where $\theta_{1},\,\theta_{2}$ are unmatched, and $x_{d}$ being the origin. The system is not feedback linearizable and is not in strict feedback form. The true model parameters are $\theta^{*}=[-1.8,\,-2.4,\,-0.75,\,-2.25]^{\top}$ with the set of allowable variations $\theta\in[-2.1,\,1.5]\times[-3,\,1.5]\times[-1.8,\,2.25]\times[-5.25,\,1.5]$; the projection operator from [6] was used to bound each parameter. The dynamic adaptation gains took the form $\gamma_{i}(\rho_{i})=0.9\,e^{\rho_{i}}+0.1$ and $\eta_{i}=10+\vartheta_{i}^{2}$ for $i=1,2$. Similar to [4], the Riemannian energy of a geodesic connecting $x$ and $x_{d}$ was chosen to be the uclf.
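For readers who wish to reproduce the setup, the test system (7), the parameter values, and the gain function can be written down directly. This is a minimal sketch: the names are illustrative, and `clip_to_bounds` is a simple component-wise clipping used as a stand-in for the projection operator from [6], which it does not replicate. The uclf and controller are not included.

```python
import math

THETA_TRUE = [-1.8, -2.4, -0.75, -2.25]
THETA_BOUNDS = [(-2.1, 1.5), (-3.0, 1.5), (-1.8, 2.25), (-5.25, 1.5)]


def dynamics(x, u, theta=THETA_TRUE):
    """Right-hand side of system (7) for state x, scalar input u, parameters theta."""
    x1, x2, x3 = x
    th1, th2, th3, th4 = theta
    return [
        x3 - th1 * x1,
        -x2 - th2 * x1 ** 2,
        math.tanh(x2) - th3 * x3 - th4 * x1 ** 2 + u,
    ]


def gamma(rho):
    """Dynamic adaptation gain used in the simulation: 0.9 exp(rho) + 0.1."""
    return 0.9 * math.exp(rho) + 0.1


def clip_to_bounds(theta_hat, bounds=THETA_BOUNDS):
    """Component-wise clipping; a simplified stand-in for the projection in [6]."""
    return [min(max(th, lo), hi) for th, (lo, hi) in zip(theta_hat, bounds)]
```

For example, `dynamics([1.0, 0.0, 0.0], 0.0)` evaluates the open-loop vector field at $x=[1,0,0]^{\top}$ with the true parameters, and `gamma(0.0)` returns the nominal gain.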

Figure 1(a) shows the uclf, an indicator of tracking performance, with and without adaptation. The proposed method (Corollary 1) successfully stabilizes the origin as predicted by Theorem 1. This is in stark contrast to the no adaptation case where only bounded error can be achieved. The uclf with leakage modification (omitted for clarity) exhibits nearly identical behavior to the uclf with the update law from Corollary 1, which confirms the result stated in Theorem 2. Figure 1(b) shows the individual adaptation gains with and without the leakage modification. The nominal case (blue) sees a reduction in the adaptation gains by 47% and 9%, respectively, indicating the adaptation transients of $\theta_{1}$ would have a large destabilizing effect if not compensated for. The leakage modification (red, $\lambda=1$) ensures the adaptation gains return to their nominal values as predicted by Theorem 2.

Figure 1: Performance and behavior of the dynamic adaptation gains method. (a): The uclf with and without adaptation shows the method is able to stabilize the origin while the system is unstable with no adaptation. (b): The adaptation gains are reduced by as much as 47% to ensure stability. When the leakage modification is added the gains return to their nominal values as desired.

V Conclusion

We presented a new direct adaptation law that adjusts individual adaptation gains online to achieve stable closed-loop control and learning for nonlinear systems with unmatched uncertainties. A bound on the rate of change of individual adaptation gains that prevents destabilization by the adaptation transients was derived. Noteworthy extensions and modifications were also discussed. The results presented here have important implications in adaptive safety [13] and adaptive optimal control [14]. Future work will explore these implications, along with deeper investigations into fundamental properties of the approach.

Appendix

Proof of Lemma 1.

Consider the time interval $t\in[t_{0},\,t_{1}]$ where $\bar{w}\triangleq\sup_{t\in[t_{0},\,t_{1}]}(|Kw(x(t))|)$ is the supremum of the input to Equation 4 over the time interval of interest [1]. Note the $i$ subscript is dropped for clarity. Consider the comparison system $\dot{z}=-\lambda a(z)z+a(z)\bar{w}$ where $|\rho|\leq z$ uniformly and $a(z)\triangleq 2\frac{\gamma(z)^{2}}{\nabla\gamma(z)}$ is uniformly strictly positive. The comparison system has the virtual dynamics $\dot{y}=-\lambda a(z)y+a(z)\bar{w}$ which is contracting in $y$ so any two particular solutions converge exponentially to each other [15]. Letting $\alpha(t)\triangleq\lambda\,\int_{t_{0}}^{t}a(\rho(\tau))\,d\tau>0$, then $y(t)=y(t_{0})e^{-\alpha(t)}+\tfrac{1}{\lambda}(1-e^{-\alpha(t)})\,\bar{w}$ is the solution to the virtual dynamics and is bounded. Because $z$ is a particular solution of the virtual system and $|\rho|\leq z$, $\rho$ is bounded. Also, as $w(x)\rightarrow 0$ then so does $\bar{w}$ over some time interval, hence $y\rightarrow 0$ and in turn $\rho\rightarrow 0$ as desired. ∎
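The comparison argument can be checked numerically. The sketch below is an illustration only: it assumes a constant $\bar{w}$, borrows the gain $\gamma(z)=0.9\,e^{z}+0.1$ from the simulation section, and integrates the comparison system with forward Euler. The function names, initial condition, and step size are arbitrary choices rather than values from the paper. For this $\gamma$, writing $s=0.9\,e^{z}$ gives $a(z)=2(s+0.2+0.01/s)\geq 0.8$, so $a$ is indeed uniformly strictly positive.

```python
import math


def gamma(z):
    return 0.9 * math.exp(z) + 0.1


def grad_gamma(z):
    return 0.9 * math.exp(z)


def a(z):
    """a(z) = 2 * gamma(z)^2 / grad_gamma(z), as defined in the proof."""
    return 2.0 * gamma(z) ** 2 / grad_gamma(z)


def simulate(z0, lam, w_bar, t_final, dt=1e-4):
    """Forward-Euler integration of z_dot = -lam*a(z)*z + a(z)*w_bar.

    Also accumulates alpha(t) = lam * integral of a(z) along the trajectory.
    """
    z, alpha = z0, 0.0
    for _ in range(int(round(t_final / dt))):
        az = a(z)
        z += dt * (-lam * az * z + az * w_bar)
        alpha += dt * lam * az
    return z, alpha


def closed_form(z0, alpha, lam, w_bar):
    """Solution formula from the proof, evaluated at a given alpha(t)."""
    return z0 * math.exp(-alpha) + (w_bar / lam) * (1.0 - math.exp(-alpha))


if __name__ == "__main__":
    z0, lam, w_bar = 2.0, 1.0, 0.5  # arbitrary illustrative values
    for t_final in (0.2, 10.0):
        z, alpha = simulate(z0, lam, w_bar, t_final)
        print(t_final, z, closed_form(z0, alpha, lam, w_bar))
```

In this sketch the simulated trajectory tracks the closed-form expression and settles at $\bar{w}/\lambda$, which shrinks to zero as $\bar{w}\rightarrow 0$, in line with the last step of the proof.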

Theorem 4 (cf. [4]).

Consider the uncertain system Equation 1 with $x_{d}$ being the desired equilibrium. If an uclf $V_{\theta}(x,t)$ exists, then, for any strictly-increasing and uniformly-positive scalar function $\upsilon(\rho)$, $x\rightarrow x_{d}$ asymptotically with the adaptation law

$$\begin{aligned}
\dot{\hat{\theta}} &= -\upsilon(\rho)\,\Gamma\,\Delta(x,t)\,\frac{\partial V_{\hat{\theta}}}{\partial x},\\
\dot{\rho} &= -\frac{\upsilon(\rho)}{\nabla\upsilon(\rho)}\frac{1}{V_{\hat{\theta}}(x,t)+c}\frac{\partial V_{\hat{\theta}}}{\partial\hat{\theta}}^{\top}\,\dot{\hat{\theta}},
\end{aligned}$$

where $\Gamma$ is a symmetric positive-definite matrix and $c\in\mathbb{R}_{>0}$.
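The two update equations of Theorem 4 can be evaluated pointwise as in the sketch below. It is a hedged illustration, not the authors' implementation: it assumes $\Delta(x,t)$ is a $p\times n$ matrix so that $\Delta\,\partial V_{\hat{\theta}}/\partial x$ is a $p$-vector, it picks $\upsilon(\rho)=0.9\,e^{\rho}+0.1$ as one example of a strictly increasing function bounded below by a positive constant, and it takes the uclf value and its gradients as given numbers. All function and argument names are hypothetical.

```python
import math


def upsilon(rho):
    """Assumed example: strictly increasing and bounded below by 0.1."""
    return 0.9 * math.exp(rho) + 0.1


def grad_upsilon(rho):
    return 0.9 * math.exp(rho)


def adaptation_rates(rho, Gamma, Delta, dV_dx, dV_dtheta, V, c=1.0):
    """Evaluate (theta_hat_dot, rho_dot) from Theorem 4 at one time instant.

    Gamma: p-by-p symmetric positive-definite matrix (list of lists).
    Delta: p-by-n matrix (assumed shape), dV_dx: n-vector,
    dV_dtheta: p-vector, V: uclf value, c: positive constant.
    """
    p, n = len(Delta), len(dV_dx)
    # Delta(x, t) * dV/dx
    regressor = [sum(Delta[i][j] * dV_dx[j] for j in range(n)) for i in range(p)]
    # theta_hat_dot = -upsilon(rho) * Gamma * Delta * dV/dx
    theta_hat_dot = [
        -upsilon(rho) * sum(Gamma[i][k] * regressor[k] for k in range(p))
        for i in range(p)
    ]
    # (dV/dtheta_hat)^T * theta_hat_dot
    coupling = sum(dV_dtheta[i] * theta_hat_dot[i] for i in range(p))
    rho_dot = -(upsilon(rho) / grad_upsilon(rho)) * coupling / (V + c)
    return theta_hat_dot, rho_dot


if __name__ == "__main__":
    Gamma = [[1.0, 0.0], [0.0, 1.0]]
    Delta = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
    print(adaptation_rates(0.0, Gamma, Delta, [1.0, 1.0, 1.0], [0.5, 0.25], V=1.0))
```

Since $\upsilon/\nabla\upsilon>0$ and $V_{\hat{\theta}}+c>0$, the sign of $\dot{\rho}$ is opposite to that of $(\partial V_{\hat{\theta}}/\partial\hat{\theta})^{\top}\dot{\hat{\theta}}$: $\rho$, and with it the effective gain $\upsilon(\rho)$, decreases whenever the parameter update by itself would increase the uclf.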

Acknowledgements    We thank Miroslav Krstic for stimulating discussions.

References

  • [1] M. Krstic, P. V. Kokotovic, and I. Kanellakopoulos, Nonlinear and adaptive control design. John Wiley & Sons, 1995.
  • [2] E. D. Sontag and Y. Wang, “On characterizations of the input-to-state stability property,” Systems & Control Letters, vol. 24, no. 5, pp. 351–359, 1995.
  • [3] M. Krstić and P. V. Kokotović, “Control Lyapunov functions for adaptive nonlinear stabilization,” Systems & Control Letters, vol. 26, no. 1, pp. 17–23, 1995.
  • [4] B. T. Lopez and J.-J. E. Slotine, “Universal adaptive control of nonlinear systems,” IEEE Control Systems Letters, vol. 6, pp. 1826–1830, 2021.
  • [5] W. Lohmiller and J.-J. E. Slotine, “On contraction analysis for non-linear systems,” Automatica, vol. 34, no. 6, pp. 683–696, 1998.
  • [6] J.-J. E. Slotine and W. Li, Applied nonlinear control. Prentice Hall, 1991.
  • [7] P. A. Ioannou and J. Sun, Robust adaptive control. Courier Corporation, 2012.
  • [8] H. Lei and W. Lin, “Universal adaptive control of nonlinear systems with unknown growth rate by output feedback,” Automatica, vol. 42, no. 10, pp. 1783–1789, 2006.
  • [9] Y. Song, J. Sohl-Dickstein, D. Kingma, A. Kumar, S. Ermon, and B. Poole, “Score-based generative modeling through stochastic differential equations,” ICLR, 2021.
  • [10] R. E. Kalman and J. E. Bertram, “Control system analysis and design via the “second method” of lyapunov: I—continuous-time systems,” Trans. ASME Basic Engineering, Ser. D, vol. 82, pp. 371–400, 1960.
  • [11] D. G. Luenberger, Introduction to dynamic systems; theory, models, and applications. John Wiley & Sons, 1979.
  • [12] J.-J. E. Slotine and W. Li, “Composite adaptive control of robot manipulators,” Automatica, vol. 25, no. 4, pp. 509–519, 1989.
  • [13] B. T. Lopez and J.-J. E. Slotine, “Unmatched control barrier functions: Certainty equivalence adaptive safety,” in 2023 American Control Conference (ACC), pp. 3662–3668, IEEE, 2023.
  • [14] B. T. Lopez and J.-J. E. Slotine, “Adaptive variants of optimal feedback policies,” in Learning for Dynamics and Control, PMLR, 2022.
  • [15] W. Wang and J.-J. E. Slotine, “On partial contraction analysis for coupled nonlinear oscillators,” Biological Cybernetics, vol. 92, no. 1, pp. 38–53, 2005.