
Feedback Interconnected Mean-Field Density Estimation and Control

Tongjia Zheng1, Qing Han2 and Hai Lin1 *This work was supported by the National Science Foundation under Grant Nos. IIS-1724070, CNS-1830335, and IIS-2007949. 1Tongjia Zheng and Hai Lin are with the Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA. [email protected], [email protected]. 2Qing Han is with the Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA. [email protected].
Abstract

Swarm robotic systems have foreseeable applications in the near future. Recently, there has been an increasing amount of literature that employs mean-field partial differential equations (PDEs) to model the time evolution of the probability density of swarm robotic systems, and uses density feedback to design stabilizing control laws that act on individuals such that their density converges to a target profile. However, several problems remain largely unexplored: how to estimate the mean-field density, how the density estimation algorithms affect the control performance, and whether the estimation performance in turn depends on the control algorithms. In this work, we study the interplay of these algorithms. Specifically, we propose new density control laws which use the mean-field density and its gradient as feedback, and prove that they are globally input-to-state stable (ISS) with respect to estimation errors. Then, we design filtering algorithms to estimate the density and its gradient separately, and prove that these estimates are convergent assuming the control laws are known. Finally, we show that the feedback interconnection of these estimation and control algorithms is still globally ISS, which is attributed to the bilinearity of the PDE system. An agent-based simulation is included to verify the stability of these algorithms and their feedback interconnection.

I Introduction

Swarm robotic systems (such as drones) have foreseeable applications in the near future. Compared with small-scale robotic systems, the dramatic increase in the number of involved robots provides numerous advantages such as robustness, efficiency, and flexibility, but also poses significant challenges to their estimation and control problems.

Many methods have been proposed for controlling large-scale systems, such as graph theoretic design [1] and game theoretic formulations, especially potential games [2] and mean-field games [3]. Our work is also inspired by modelling and control strategies based on mean-field limit approximations. However, unlike mean-field games, where the mean-field density is used to approximate the collective effect of the swarm, we aim at direct control of the mean-field density, which results in a PDE control problem. Mean-field models include Markov chains and PDEs. The first category partitions the spatial domain to obtain an abstracted Markov chain model and designs the transition probabilities to stabilize the density [4, 5], which usually suffers from the state explosion issue. In PDE-based models, individual robots are modelled by a family of stochastic differential equations and their density evolves according to a PDE. In this way, the density control problem of a robotic swarm is converted into a regulation problem for the PDE.

For density control, early efforts usually adopt an optimal control formulation [6]. While an optimal control formulation provides more flexibility in the control objective, the solution relies on numerically solving the optimality conditions, which is computationally expensive and essentially open-loop. Density control is also studied in [7, 8] by relating it to the so-called Schrödinger Bridge problem. However, numerically solving the associated Schrödinger Bridge problem is also known to suffer from the curse of dimensionality except in the linear case. In recent years, researchers have sought to design control laws that explicitly use the mean-field density as feedback to form closed-loop control [9, 10, 11]. This density feedback technique guarantees closed-loop stability and can be computed efficiently since it is given in closed form. However, several problems remain largely unexplored: how to estimate the mean-field density, how the density estimation algorithm affects the control performance, and whether the estimation performance in turn depends on the density control algorithms. These problems become more critical because most density feedback designs depend, to varying degrees, on the gradient of the mean-field density. Since the gradient is an unbounded operator, any density estimation algorithm that produces accurate density estimates may still have arbitrarily large estimation errors for the gradient. This raises significant concerns for the density estimation problem.

In this work, we study the interplay of density feedback laws and estimation algorithms. In [11], we proposed density feedback laws and obtained preliminary results on their robustness to density estimation errors. This work extends these feedback laws so that they are less restrictive and have more verifiable robustness properties in the presence of estimation errors. We also reported density filtering algorithms in [12, 13], designed for large-scale stochastic systems modelled by mean-field PDEs. This work extends these algorithms to directly estimate the gradient of the density, a quantity required by almost all existing density feedback controllers in the literature. Furthermore, we study the interconnection of these estimation and control algorithms and prove their closed-loop stability.

Our contribution is threefold. First, we propose new density feedback laws and show their robustness using the notion of ISS. Second, we design infinite-dimensional filters to estimate the gradient of the density and study their stability and optimality. Third, we prove that the feedback interconnection of these estimation and control algorithms is still globally ISS.

The rest of the paper is organized as follows. Section II introduces some preliminaries. The problem formulation is given in Section III. Section IV presents our main results, in which we propose new density estimation and control laws and study their interconnected stability. Section V presents an agent-based simulation to verify their effectiveness.

II Preliminaries

II-A Notations

For a vector $x\in\mathbb{R}^{n}$, its Euclidean norm is denoted by $\|x\|$. Let $E\subset\mathbb{R}^{n}$ be a measurable set. For $f:E\to\mathbb{R}$, its $L^{2}$-norm is denoted by $\|f\|_{L^{2}(E)}:=(\int_{E}|f(x)|^{2}dx)^{1/2}$. Given $g(x)$ with $0<\inf_{x\in E}g(x)\leq\sup_{x\in E}g(x)<\infty$, the weighted norm $\|f\|_{L^{p}(E;g)}:=(\int_{E}|f(x)|^{p}g(x)dx)^{1/p}$ is equivalent to $\|f\|_{L^{p}(E)}$. We will omit $E$ in the notation when it is clear. The gradient and Laplacian of a scalar function $f$ are denoted by $\nabla f$ and $\Delta f$, respectively. The divergence of a vector field $F$ is denoted by $\nabla\cdot F$.

Lemma 1 (Poincaré inequality for density functions [14])

Let $\Omega\subset\mathbb{R}^{n}$ be a bounded convex open set with a Lipschitz boundary. Let $g$ be a continuous density function on $\Omega$ such that $0<c_{1}\leq\inf g\leq\sup g\leq c_{2}$ for some constants $c_{1}$ and $c_{2}$. Then, there exists a constant $C>0$ such that for all $f\in H^{1}(\Omega)$,

\int_{\Omega}|\nabla f|^{2}g\,dx\geq C\int_{\Omega}\left|f-\int_{\Omega}fg\,dx\right|^{2}g\,dx. (1)

II-B Input-to-state stability

We introduce ISS for infinite-dimensional systems [15]. Define the following classes of comparison functions:

\begin{aligned}
\mathscr{P}&:=\{\gamma:\mathbb{R}_{+}\to\mathbb{R}_{+}\mid\gamma\text{ is continuous, }\gamma(0)=0,\text{ and }\gamma(r)>0\text{ for }r>0\}\\
\mathscr{K}&:=\{\gamma\in\mathscr{P}\mid\gamma\text{ is strictly increasing}\}\\
\mathscr{K}_{\infty}&:=\{\gamma\in\mathscr{K}\mid\gamma\text{ is unbounded}\}\\
\mathscr{L}&:=\{\gamma:\mathbb{R}_{+}\to\mathbb{R}_{+}\mid\gamma\text{ is continuous and strictly decreasing with }\lim_{t\to\infty}\gamma(t)=0\}\\
\mathscr{KL}&:=\{\beta:\mathbb{R}_{+}\times\mathbb{R}_{+}\to\mathbb{R}_{+}\mid\beta(\cdot,t)\in\mathscr{K},\ \forall t\geq 0,\ \beta(r,\cdot)\in\mathscr{L},\ \forall r>0\}.
\end{aligned}

Let $(X,\|\cdot\|_{X})$ and $(U,\|\cdot\|_{U})$ be the state and input spaces, endowed with norms $\|\cdot\|_{X}$ and $\|\cdot\|_{U}$, respectively. Denote $U_{c}=PC(\mathbb{R}_{+};U)$, the space of piecewise right-continuous functions from $\mathbb{R}_{+}$ to $U$, equipped with the sup-norm. Consider a control system $\Sigma=(X,U_{c},\phi)$ where $\phi:\mathbb{R}_{+}\times X\times U_{c}\to X$ is a transition map. Let $x(t)=\phi(t,x_{0},u)$.

Definition 1

$\Sigma$ is called input-to-state stable (ISS) if there exist $\beta\in\mathcal{KL}$ and $\gamma\in\mathcal{K}$ such that

\|x(t)\|_{X}\leq\beta(\|x_{0}\|_{X},t)+\gamma\Big(\sup_{0\leq s\leq t}\|u(s)\|_{U}\Big),

for all $x_{0}\in X$, $u\in U_{c}$, and $t\geq 0$.

Definition 2

A continuous function $V:\mathbb{R}_{+}\times X\to\mathbb{R}_{+}$ is called an ISS-Lyapunov function for $\Sigma$ if there exist $\psi_{1},\psi_{2}\in\mathcal{K}_{\infty}$, $\chi\in\mathcal{K}$, and $W\in\mathcal{P}$ such that:

  • (i) $\psi_{1}(\|x\|_{X})\leq V(t,x)\leq\psi_{2}(\|x\|_{X})$, $\forall x\in X$;

  • (ii) $\forall x\in X$, $\forall u\in U_{c}$ with $u(0)=\xi\in U$ it holds that

    $\|x\|_{X}\geq\chi(\|\xi\|_{U})\Rightarrow\dot{V}(t,x)\leq-W(\|x\|_{X})$.

Theorem 1

If $\Sigma$ admits an ISS-Lyapunov function, then it is ISS.

ISS is useful for studying the stability of cascade systems. Consider two systems $\Sigma_{i}=(X_{i},U_{ci},\phi_{i})$, $i=1,2$, where $U_{ci}=PC(\mathbb{R}_{+};U_{i})$ and $X_{1}\subset U_{2}$. We say they form a cascade connection if $u_{2}(t)=\phi_{1}(t,x_{01},u_{1})$.

Theorem 2

[16] The cascade connection of two ISS systems is ISS.

II-C Infinite-dimensional Kalman filters

We introduce the infinite-dimensional Kalman filters presented in [17]. Let $\mathcal{H},\mathcal{K}$ be real Hilbert spaces. Consider the following infinite-dimensional linear system:

du(t)=\mathcal{A}(t)u(t)dt+\mathcal{B}(t)dw(t),\quad u(0)=u_{0}
dz(t)=\mathcal{C}(t)u(t)dt+\mathcal{F}(t)dv(t),\quad z(0)=0,

where $\mathcal{A}(t)$ is a linear closed operator on $\mathcal{H}$, $\mathcal{B}(\cdot)\in L^{\infty}([0,T];\mathcal{L}(\mathcal{H}))$, $\mathcal{C}(\cdot)\in L^{\infty}([0,T];\mathcal{L}(\mathcal{H},\mathcal{K}))$, and $\mathcal{F}(\cdot),\mathcal{F}(\cdot)^{-1}\in L^{\infty}([0,T];\mathcal{L}(\mathcal{K}))$. $w(t)$ and $v(t)$ are independent Wiener processes on $\mathcal{H}$ and $\mathcal{K}$ with covariance operators $\mathcal{W}$ and $\mathcal{V}$, respectively. The infinite-dimensional Kalman filter is given by:

d\hat{u}(t)=\mathcal{A}(t)\hat{u}dt+\mathcal{K}(t)(dz(t)-\mathcal{C}(t)\hat{u}(t)dt),\quad\hat{u}(0)=0

where $\mathcal{K}(t)=\mathcal{P}(t)\mathcal{C}^{*}(t)\left(\mathcal{F}(t)\mathcal{V}\mathcal{F}^{*}(t)\right)^{-1}$ is the Kalman gain, and $\mathcal{P}(t)$ is the solution of the Riccati equation:

\frac{d\mathcal{P}(t)}{dt}=\mathcal{A}(t)\mathcal{P}(t)+\mathcal{P}(t)\mathcal{A}^{*}(t)+\mathcal{B}(t)\mathcal{W}\mathcal{B}^{*}(t)-\mathcal{P}(t)\mathcal{C}^{*}(t)\left(\mathcal{F}(t)\mathcal{V}\mathcal{F}^{*}(t)\right)^{-1}\mathcal{C}(t)\mathcal{P}(t).

III Problem formulation

This work studies the density control problem of robotic swarms. Consider $N$ robots in a bounded convex domain $\Omega\subset\mathbb{R}^{n}$, which are assumed to be homogeneous and satisfy:

dX_{t}^{i}=v(X_{t}^{i},t)dt+\sqrt{2\sigma(t)}dB_{t}^{i},\quad i=1,\dots,N, (2)

where $X_{t}^{i}\in\Omega$ represents the position of the $i$-th robot, $v=(v_{1},\dots,v_{n})\in\mathbb{R}^{n}$ is the velocity field to be designed, $\{B_{t}^{i}\}$ are standard Wiener processes assumed to be independent across the robots, and $\sqrt{2\sigma(t)}\in\mathbb{R}$ is the standard deviation.

In the mean-field limit as $N\to\infty$, the collective behavior of the robots can be captured by their probability density

p(x,t)\approx\frac{1}{N}\sum_{i=1}^{N}\delta_{X_{t}^{i}}

with $\delta_{x}$ being the Dirac distribution, and this density satisfies a Fokker-Planck equation given by:

\begin{split}
\partial_{t}p=-\nabla\cdot(vp)+\Delta(\sigma p)&\quad\text{in}\quad\Omega\times(0,T),\\
p=p_{0}&\quad\text{on}\quad\Omega\times\{0\},\\
\boldsymbol{n}\cdot(\nabla(\sigma p)-vp)=0&\quad\text{on}\quad\partial\Omega\times(0,T),
\end{split} (3)

where $\boldsymbol{n}$ is the unit inner normal to the boundary $\partial\Omega$, and $p_{0}(x)$ is the initial density. The last equation is the reflecting boundary condition to confine the robots within $\Omega$.
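As a concrete illustration, the following is a minimal sketch of how the individual dynamics (2) can be integrated with an Euler–Maruyama scheme on a square domain; it is not the paper's implementation, and the function name `step_robots`, the callable `velocity`, and the wall-reflection step (a simple surrogate for the no-flux boundary condition in (3)) are our own assumptions.

```python
import numpy as np

def step_robots(X, velocity, sigma, dt, lo=0.0, hi=1.0, rng=None):
    """One Euler-Maruyama step of dX = v(X,t) dt + sqrt(2*sigma) dB for all robots.

    X        : (N, n) array of robot positions.
    velocity : callable mapping an (N, n) array of positions to (N, n) velocities.
    sigma    : diffusion coefficient, so the per-step noise std is sqrt(2*sigma*dt).
    Robots are kept inside the box [lo, hi]^n by reflecting at the walls,
    which mimics the reflecting (no-flux) boundary condition in (3).
    """
    rng = np.random.default_rng() if rng is None else rng
    N, n = X.shape
    X_new = X + velocity(X) * dt + np.sqrt(2.0 * sigma * dt) * rng.standard_normal((N, n))
    # Reflect positions that left the box back inside, then clip for safety.
    X_new = np.where(X_new < lo, 2.0 * lo - X_new, X_new)
    X_new = np.where(X_new > hi, 2.0 * hi - X_new, X_new)
    return np.clip(X_new, lo, hi)
```

For example, with $\sqrt{2\sigma}=0.01$ and a time step of $0.01$ (as in Section V), one would call `step_robots(X, velocity, sigma=0.5*0.01**2, dt=0.01)`.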

Remark 1

This work focuses on the interconnected stability of density estimation and control. For clarity, the robots are assumed to be first-order integrators in (2). Density control for heterogeneous higher-order nonlinear systems is studied in a separate work [18]. The interconnected stability results to be presented later can be generalized to these more general systems by combining the stability results in [18].

The problems studied in this work are stated as follows.

Problem 1 (Density control)

Consider (3). Given a target density $p_{*}(x)$, design $v$ such that $p(x,t)\to p_{*}(x)$.

Problem 2 (Density estimation)

Consider (3). Given the robots’ states $\{X_{t}^{i}\}_{i=1}^{N}$, estimate $p$ and its gradient.

Problem 3 (Feedback interconnection)

Consider (3). Given a target density $p_{*}(x)$, let $v$ be designed as a feedback function of density estimates that are computed based on (3). Prove that $p(x,t)\to p_{*}(x)$.

IV Main results

IV-A Modified density feedback laws

Given a smooth target density $p_{*}(x)$, bounded from above and below by positive constants, we design:

v(x,t)=-\alpha(x,t)\nabla\frac{p(x,t)}{p_{*}(x)}+\frac{\sigma(t)\nabla p_{*}(x)}{p_{*}(x)} (4)

where $\alpha\geq 0$ is a design parameter for individuals to adjust their velocity magnitude. At the collective level, (4) is called density feedback because of its explicit dependence on the mean-field density $p$. At the individual level, (4) is essentially nonlinear state feedback, and the velocity input for the $i$-th robot is simply given by $v(X_{t}^{i},t)$.

Remark 2

Compared with the control laws proposed in [9, 10, 11], the key difference of (4) is that $p$ does not appear in any denominator, which provides several advantages. First, it relaxes the requirement for $p$ to be strictly positive and avoids producing large velocities when $p$ is close to 0. Second, it will enable us to obtain ISS results with respect to estimation errors in the $L^{2}$ norm. (Such results are difficult to obtain for the laws in [9, 10, 11].) The significance of this property will become apparent when we study the interconnected stability of the control and estimation algorithms.

Define $\Phi=p-p_{*}$ as the convergence error.

Theorem 3

Consider system (3). Let $v$ be given by (4). Then $\|\Phi(t)\|_{L^{2}}$ converges to 0 exponentially.

Proof:

Substituting (4) into (3), we obtain:

\begin{split}
\partial_{t}p&=\nabla\cdot\left(\alpha p\nabla\frac{p}{p_{*}}\right)-\nabla\cdot\left(\frac{\sigma p}{p_{*}}\nabla p_{*}\right)+\Delta(\sigma p)\\
&=\nabla\cdot\left(\alpha p\nabla\frac{p}{p_{*}}\right)-\nabla\cdot\left[\frac{\sigma p}{p_{*}}\nabla p_{*}-\nabla(\sigma p)\right]\\
&=\nabla\cdot\left(\alpha p\nabla\frac{p}{p_{*}}\right)+\nabla\cdot\left(\sigma p_{*}\nabla\frac{p}{p_{*}}\right).
\end{split}

Consider a Lyapunov function $V(t)=\frac{1}{2}\|\Phi\|_{L^{2}(1/p_{*})}^{2}$. By the divergence theorem and the boundary condition, we have

\begin{split}
\frac{dV}{dt}&=\int_{\Omega}\frac{p-p_{*}}{p_{*}}\,\partial_{t}p\,dx\\
&=\int_{\Omega}\frac{p-p_{*}}{p_{*}}\left[\nabla\cdot\left(\alpha p\nabla\frac{p}{p_{*}}\right)+\nabla\cdot\left(\sigma p_{*}\nabla\frac{p}{p_{*}}\right)\right]dx\\
&=\int_{\Omega}-\alpha p\left(\nabla\frac{p}{p_{*}}\right)^{2}-\sigma p_{*}\left(\nabla\frac{p}{p_{*}}\right)^{2}dx\\
&\leq-(\alpha_{1}(t)+\alpha_{2}(t))\left\|\nabla\frac{p}{p_{*}}\right\|_{L^{2}}^{2}\\
&=-(\alpha_{1}(t)+\alpha_{2}(t))\left\|\nabla\frac{p-p_{*}}{p_{*}}\right\|_{L^{2}}^{2}.
\end{split}

By the Poincaré inequality (1) (where we set $f=\frac{p-p_{*}}{p_{*}}$ and $g=p_{*}$) and the fact that $\int_{\Omega}(p-p_{*})dx=0$, we have

\begin{split}
\frac{dV}{dt}&\leq-(\alpha_{1}(t)+\alpha_{2}(t))C^{2}\left\|\frac{p-p_{*}}{p_{*}}\right\|_{L^{2}}^{2}\\
&=-(\alpha_{1}(t)+\alpha_{2}(t))C^{2}\left\|\Phi\right\|_{L^{2}(1/p_{*}^{2})}^{2}
\end{split}

where $\alpha_{1}(t),\alpha_{2}(t)\in\mathbb{R}$ satisfy $0\leq\min_{x}\alpha p\leq\alpha_{1}(t)\leq\max_{x}\alpha p$ and $0<\min_{x}\sigma p_{*}\leq\alpha_{2}(t)\leq\max_{x}\sigma p_{*}$, and $C>0$ is the Poincaré constant. By the strong maximum principle [19], there exist $t_{1},a>0$ such that $p\geq a$ for $t\geq t_{1}$. Hence, $\alpha_{1}=0$ if and only if $\alpha=0$ for $t\geq t_{1}$. ∎

Remark 3

Note that we allow $\alpha=0$ in (4). In this case, (4) becomes $v=\frac{\sigma\nabla p_{*}}{p_{*}}$, which is a well-known law in physics for driving stochastic particles towards a target distribution [20]. However, the convergence will be very slow since $\sigma$ is small in general. We add the feedback term $-\alpha\nabla\frac{p}{p_{*}}$ to provide extra, locally adjustable convergence speed. The accelerated convergence is reflected by $\alpha_{1}$ in the proof.

The mean-field density $p$ cannot be measured directly. We will study how to estimate $p$ in the next section. Here we first establish some robustness results regardless of which estimation algorithm is used. It is useful to rewrite (4) as:

v=-\alpha\frac{p_{*}\nabla p-p\nabla p_{*}}{p_{*}^{2}}+\frac{\sigma(t)\nabla p_{*}}{p_{*}}. (5)

In general, we need to estimate $p$ and $\nabla p$ separately because $\nabla$ is an unbounded operator, i.e., any algorithm that produces accurate estimates of $p$ may have arbitrarily large estimation errors for $\nabla p$. Let $\hat{p}$ and $\widehat{\nabla p}$ be the estimates of $p$ and $\nabla p$. Based on (5), the control law using these estimates is given by:

v=-\alpha\frac{p_{*}\widehat{\nabla p}-\hat{p}\nabla p_{*}}{p_{*}^{2}}+\frac{\sigma(t)\nabla p_{*}}{p_{*}}. (6)
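For a computational view, the sketch below evaluates (6) on a uniform 2-D grid from the estimates $\hat{p}$ and $\widehat{\nabla p}$; setting the estimates to the true $p$ and $\nabla p$ recovers (4). This is a hedged illustration only: the grid layout, the use of central differences (np.gradient) for $\nabla p_{*}$, and the function name are our own assumptions, not the paper's implementation.

```python
import numpy as np

def velocity_from_estimates(p_hat, grad_p_hat, p_star, alpha, sigma, dx):
    """Evaluate the density feedback law (6) on a uniform 2-D grid.

    p_hat      : (M, M) density estimate.
    grad_p_hat : tuple (dpdy_hat, dpdx_hat) of (M, M) arrays, the estimate of grad p
                 (array axes are assumed to be ordered [y, x]).
    p_star     : (M, M) target density, bounded away from zero.
    Returns (vx, vy), the two components of the velocity field.
    """
    dpdy_hat, dpdx_hat = grad_p_hat
    # Gradient of the smooth, known target density via central differences.
    dpsdy, dpsdx = np.gradient(p_star, dx)
    # v = -alpha*(p_star*grad_p_hat - p_hat*grad_p_star)/p_star^2 + sigma*grad_p_star/p_star
    vx = -alpha * (p_star * dpdx_hat - p_hat * dpsdx) / p_star**2 + sigma * dpsdx / p_star
    vy = -alpha * (p_star * dpdy_hat - p_hat * dpsdy) / p_star**2 + sigma * dpsdy / p_star
    return vx, vy
```

Each robot would then apply $v(X_{t}^{i},t)$, obtained by interpolating this grid field at its own position.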

Define $\epsilon=\hat{p}-p$ and $\epsilon_{g}=\widehat{\nabla p}-\nabla p$ as the estimation errors. Substituting (6) into (3), we obtain the closed-loop system:

\partial_{t}\Phi=\partial_{t}p=\nabla\cdot\left(\alpha p\frac{p_{*}\widehat{\nabla p}-\hat{p}\nabla p_{*}}{p_{*}^{2}}+\sigma p_{*}\nabla\frac{p}{p_{*}}\right). (7)

We have the following robustness result.

Theorem 4

Consider (3). Let $v$ be given by (6). If $\alpha>0$, then $\|\Phi(t)\|_{L^{2}}$ is ISS with respect to $\|\epsilon(t)\|_{L^{2}}$ and $\|\epsilon_{g}(t)\|_{L^{2}}$.

Proof:

Consider $V(t)=\frac{1}{2}\|\Phi\|_{L^{2}(1/p_{*})}^{2}$. By the divergence theorem and the boundary condition, we have

\begin{split}
\frac{dV}{dt}&=\int_{\Omega}\frac{p-p_{*}}{p_{*}}\left[\nabla\cdot\left(\alpha p\frac{p_{*}\widehat{\nabla p}-\hat{p}\nabla p_{*}}{p_{*}^{2}}+\sigma p_{*}\nabla\frac{p}{p_{*}}\right)\right]dx\\
&=\int_{\Omega}-\alpha p\nabla\frac{p}{p_{*}}\cdot\left(\frac{p_{*}\nabla p-p\nabla p_{*}+p_{*}\epsilon_{g}-\epsilon\nabla p_{*}}{p_{*}^{2}}\right)-\sigma p_{*}\left(\nabla\frac{p}{p_{*}}\right)^{2}dx\\
&=\int_{\Omega}-\alpha p\nabla\frac{p}{p_{*}}\cdot\left(\nabla\frac{p}{p_{*}}+\frac{1}{p_{*}}\epsilon_{g}-\frac{\nabla p_{*}}{p_{*}^{2}}\epsilon\right)-\sigma p_{*}\left(\nabla\frac{p}{p_{*}}\right)^{2}dx\\
&\leq-(\alpha_{1}(t)+\alpha_{2}(t))\left\|\nabla\frac{p}{p_{*}}\right\|_{L^{2}}^{2}+\alpha_{3}(t)\left\|\nabla\frac{p}{p_{*}}\right\|_{L^{2}}\left\|\frac{1}{p_{*}}\epsilon_{g}-\frac{\nabla p_{*}}{p_{*}^{2}}\epsilon\right\|_{L^{2}}
\end{split}

where $\alpha_{1}$ and $\alpha_{2}$ are defined in the proof of Theorem 3, and $\alpha_{3}(t)\in\mathbb{R}$ satisfies $0\leq\min_{x}\alpha p\leq\alpha_{3}(t)\leq\max_{x}\alpha p$. Using a constant $\theta\in(0,1)$ to split the first term and applying the Poincaré inequality (1), we have

\begin{split}
\frac{dV}{dt}&\leq-(\alpha_{1}(t)+\alpha_{2}(t))(1-\theta)C^{2}\left\|\frac{p-p_{*}}{p_{*}}\right\|_{L^{2}}^{2}\\
&\quad-(\alpha_{1}(t)+\alpha_{2}(t))\theta C\left\|\nabla\frac{p}{p_{*}}\right\|_{L^{2}}\left\|\frac{p-p_{*}}{p_{*}}\right\|_{L^{2}}\\
&\quad+\alpha_{3}(t)\left(\|\epsilon_{g}\|_{L^{2}(\frac{1}{p_{*}})}+\|\nabla p_{*}\|_{L^{\infty}}\|\epsilon\|_{L^{2}(\frac{1}{p_{*}^{2}})}\right)\left\|\nabla\frac{p}{p_{*}}\right\|_{L^{2}}\\
&\leq-(\alpha_{1}(t)+\alpha_{2}(t))(1-\theta)C^{2}\|\Phi\|_{L^{2}(1/p_{*}^{2})}^{2},
\end{split}

if $\|\Phi\|_{L^{2}(\frac{1}{p_{*}^{2}})}\geq\alpha_{\sup}\big(\|\epsilon_{g}\|_{L^{2}(\frac{1}{p_{*}})}+\|\nabla p_{*}\|_{L^{\infty}}\|\epsilon\|_{L^{2}(\frac{1}{p_{*}^{2}})}\big)$, where $\alpha_{\sup}:=\sup_{t}\frac{\alpha_{3}(t)}{(\alpha_{1}(t)+\alpha_{2}(t))\theta C}$. The ISS property then follows from Theorem 1. ∎

This theorem ensures that $\|\Phi(t)\|_{L^{2}}$ is always bounded by a function of $\|\epsilon(t)\|_{L^{2}}$ and $\|\epsilon_{g}(t)\|_{L^{2}}$, and will approach 0 exponentially if both $\hat{p}$ and $\widehat{\nabla p}$ are accurate.

IV-B Density and gradient estimation

Now we design algorithms to estimate $p$ and $\nabla p$ separately, referred to as density filters and gradient filters, respectively. The former has been studied in [12, 13] and is included for completeness. In the estimation problem, the system (3) is assumed known (including $v$). We use kernel density estimation (KDE) to construct noisy measurements of $p$ and $\nabla p$ and exploit two important properties: the dynamics governing $p$ and $\nabla p$ are linear, and their measurement noises are approximately Gaussian. Then we design two infinite-dimensional Kalman filters to estimate them separately.

First, the processes $\{X_{t}^{i}\}_{i=1}^{N}$ become asymptotically independent as $N\to\infty$ [3]. Hence, for a large $N$, $\{X_{t}^{i}\}_{i=1}^{N}$ can be approximately treated as independent samples of $p(x,t)$. We then use KDE to construct a priori estimates which are used as noisy measurements of $p$ and $\nabla p$.

We need to derive an evolution equation for $\nabla p$. By applying the gradient operator to both sides of (3), we have:

\begin{split}
\partial_{t}(\nabla p)=\nabla(\partial_{t}p)&=\nabla[-\nabla\cdot(vp)+\Delta(\sigma p)]\\
&=\nabla[-\nabla\cdot(v\mathscr{I}(\nabla p))+\Delta(\sigma\mathscr{I}(\nabla p))]
\end{split}

where $\mathscr{I}$ is an integration operator defined as follows. For a given vector field $F:\Omega\to\mathbb{R}^{n}$, define $\mathscr{I}(F)=f$, where $f:\Omega\to\mathbb{R}$ is uniquely determined by the following relations:

\nabla f=F\quad\text{and}\quad\boldsymbol{n}\cdot(\nabla(\sigma f)-vf)=0\text{ on }\partial\Omega. (8)

For simplicity, denote $q=\nabla p$, i.e., $q:\Omega\times(0,T)\to\mathbb{R}^{n}$. We obtain a partial integro-differential equation for $q$:

\partial_{t}q=\nabla[-\nabla\cdot(v\mathscr{I}(q))+\Delta(\sigma\mathscr{I}(q))] (9)

which is a linear equation. We will rewrite (3) and (9) as evolution equations in $\mathcal{H}_{1}=L^{2}(\Omega)$ and $\mathcal{H}_{2}=L^{2}(\Omega)^{n}:=\{(F_{1},\dots,F_{n}):\Omega\to\mathbb{R}^{n}\mid F_{i}\in L^{2}(\Omega)\}$, respectively. Specifically, define the following linear operators:

\mathcal{A}(t)p:=-\nabla\cdot[v(t)p]+\Delta[\sigma(t)p],
\mathcal{A}_{g}(t)q:=\nabla[-\nabla\cdot(v(t)\mathscr{I}(q))+\Delta(\sigma(t)\mathscr{I}(q))].

For any $t$, we use KDE and the samples $\{X_{t}^{i}\}_{i=1}^{N}$ to construct an a priori estimate $p_{\text{KDE}}(x,t)$ [21]:

p_{\text{KDE}}(x,t)=\frac{1}{Nh^{n}}\sum_{i=1}^{N}K\Big(\frac{1}{h}(x-X_{t}^{i})\Big),

where $K(x)$ is a kernel function and $h$ is the bandwidth. We treat $p_{\text{KDE}}(x,t)$ as a noisy measurement of $p(x,t)$. We also treat $\nabla p_{\text{KDE}}(x,t)$, the gradient of $p_{\text{KDE}}(x,t)$, as a noisy measurement of $q(x,t)$. Define $w(x,t)=p_{\text{KDE}}(x,t)-p(x,t)$ and $w_{g}(x,t)=\nabla p_{\text{KDE}}(x,t)-q(x,t)$. Then, $w(x,t)$ and $w_{g}(x,t)$ are approximately infinite-dimensional Gaussian noises with diagonal covariance operators $\bar{\mathcal{R}}(t)=k\operatorname{diag}(p(t))$ and $\bar{\mathcal{R}}_{g}(t)=k_{g}\operatorname{diag}(q(t))$, respectively, where $k,k_{g}>0$ are constants depending on the kernels and $N$. (See the appendices in [13] for why $w$ and $w_{g}$ are approximately Gaussian.)
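The sketch below (an illustration under our own assumptions, not the paper's code) constructs both measurements from the same samples using a Gaussian kernel, for which the gradient of the KDE is available in closed form; the function name `kde_and_gradient` is hypothetical.

```python
import numpy as np

def kde_and_gradient(X, grid, h):
    """Gaussian-kernel estimates of p and grad(p) at the given evaluation points.

    X    : (N, n) robot positions, treated as (approximately independent) samples of p.
    grid : (M, n) evaluation points.
    h    : bandwidth.
    Returns p_kde of shape (M,) and grad_kde of shape (M, n); for a Gaussian
    kernel, grad(p_KDE) is obtained in closed form, so both measurements
    y and y_g come from the same sample set.
    """
    N, n = X.shape
    diff = (grid[:, None, :] - X[None, :, :]) / h                        # (M, N, n)
    K = np.exp(-0.5 * np.sum(diff**2, axis=-1)) / (2 * np.pi)**(n / 2)   # (M, N)
    p_kde = K.sum(axis=1) / (N * h**n)
    grad_kde = -(diff * K[..., None]).sum(axis=1) / (N * h**(n + 1))
    return p_kde, grad_kde
```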

We obtain two infinite-dimensional linear systems:

\left\{\begin{array}{l}\dot{p}=\mathcal{A}(t)p\\ y=p_{\text{KDE}}(t)=p+w(t)\end{array}\right. (12)

\left\{\begin{array}{l}\dot{q}=\mathcal{A}_{g}(t)q\\ y_{g}=\nabla p_{\text{KDE}}(t)=q+w_{g}(t)\end{array}\right. (15)

We can design infinite-dimensional Kalman filters for (12) and (15). However, $\bar{\mathcal{R}}$ and $\bar{\mathcal{R}}_{g}$ are unknown because they depend on $p$ and $q$, the states to be estimated. Hence, we approximate $\bar{\mathcal{R}}$ and $\bar{\mathcal{R}}_{g}$ with $\mathcal{R}(t)=k\operatorname{diag}(p_{\text{KDE}}(t))$ and $\mathcal{R}_{g}(t)=k_{g}\operatorname{diag}(\nabla p_{\text{KDE}}(t))$. In this way, the “suboptimal” density filter is given by:

\dot{\hat{p}}=\mathcal{A}(t)\hat{p}+\mathcal{P}(t)\mathcal{R}^{-1}(t)(y(t)-\hat{p}), (16)
\dot{\mathcal{P}}=\mathcal{A}(t)\mathcal{P}+\mathcal{P}\mathcal{A}^{*}(t)-\mathcal{P}\mathcal{R}^{-1}(t)\mathcal{P}, (17)

and the “suboptimal” gradient filter is given by:

\dot{\hat{q}}=\mathcal{A}_{g}(t)\hat{q}+\mathcal{Q}(t)\mathcal{R}_{g}^{-1}(t)(y_{g}(t)-\hat{q}), (18)
\dot{\mathcal{Q}}=\mathcal{A}_{g}(t)\mathcal{Q}+\mathcal{Q}\mathcal{A}_{g}^{*}(t)-\mathcal{Q}\mathcal{R}_{g}^{-1}(t)\mathcal{Q}, (19)

where $\hat{p}$ and $\hat{q}$ are our estimates of $p$ and $q$ (i.e., $\nabla p$).
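In practice, (16)–(19) are implemented after spatial discretization, so the operators become matrices and the filters become finite-dimensional. The sketch below is one possible forward-Euler realization of the density filter (16)–(17) under that assumption; the matrix `A` and the helper name are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def density_filter_step(p_hat, P, A, y, R_inv, dt):
    """One forward-Euler step of a discretized version of the density filter (16)-(17).

    p_hat : (M,) current density estimate on the grid.
    P     : (M, M) estimation error covariance.
    A     : (M, M) finite-difference discretization of the operator A(t),
            which depends on the current velocity field v.
    y     : (M,) KDE measurement of the density.
    R_inv : (M, M) inverse measurement-noise covariance, e.g.
            np.diag(1.0 / (k * p_kde)) for a constant k > 0.
    """
    gain = P @ R_inv                                   # suboptimal Kalman gain L = P R^{-1}
    p_hat = p_hat + dt * (A @ p_hat + gain @ (y - p_hat))
    P = P + dt * (A @ P + P @ A.T - P @ R_inv @ P)     # Riccati update (17)
    return p_hat, P
```

The gradient filter (18)–(19) takes exactly the same form with the discretized $\mathcal{A}_{g}$, $\mathcal{Q}$, and $\mathcal{R}_{g}^{-1}$, stacking the $n$ components of $q$ into one vector.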

Now we study the stability and optimality of these two filters. Let $\bar{\mathcal{P}}$ and $\bar{\mathcal{Q}}$ be the corresponding solutions of (17) and (19) when $\mathcal{R}$ and $\mathcal{R}_{g}$ are respectively replaced by $\bar{\mathcal{R}}$ and $\bar{\mathcal{R}}_{g}$, their true but unknown values. Then, $\bar{\mathcal{P}}$ and $\bar{\mathcal{Q}}$ represent the optimal flows of the estimation error covariance. We denote $\mathcal{L}=\mathcal{P}\mathcal{R}^{-1}$ and $\mathcal{L}_{g}=\mathcal{Q}\mathcal{R}_{g}^{-1}$ as the suboptimal Kalman gains, and $\bar{\mathcal{L}}=\bar{\mathcal{P}}\bar{\mathcal{R}}^{-1}$ and $\bar{\mathcal{L}}_{g}=\bar{\mathcal{Q}}\bar{\mathcal{R}}_{g}^{-1}$ as the optimal Kalman gains. Define $\epsilon=\hat{p}-p$ and $\epsilon_{g}=\hat{q}-q$ as the estimation errors of the filters. Then $\epsilon$ and $\epsilon_{g}$ satisfy the following equations:

\dot{\epsilon}=(\mathcal{A}(t)-\mathcal{P}(t)\mathcal{R}^{-1}(t))\epsilon+\mathcal{P}(t)\mathcal{R}^{-1}(t)w, (20)
\dot{\epsilon}_{g}=(\mathcal{A}_{g}(t)-\mathcal{Q}(t)\mathcal{R}_{g}^{-1}(t))\epsilon_{g}+\mathcal{Q}(t)\mathcal{R}_{g}^{-1}(t)w_{g}. (21)

We can show that the suboptimal filters are stable and remain close to the optimal ones. The following theorem, proved in [12, 13], is for the density filter (16)–(17).

Theorem 5

Assume that $\|\mathcal{P}(t)\|$ and $\|\bar{\mathcal{P}}(t)\|$ are uniformly bounded, and there exist $c_{1},c_{2}>0$ such that for $t\geq 0$,

c_{1}\mathcal{I}\leq\mathcal{R}^{-1}(t),\bar{\mathcal{R}}^{-1}(t),\mathcal{P}^{-1}(t),\bar{\mathcal{P}}^{-1}(t)\leq c_{2}\mathcal{I}. (22)

Then we have:

  • (i) $\|\epsilon\|_{L^{2}}$ is ISS with respect to (w.r.t.) $\|w\|_{L^{2}}$ (and is uniformly exponentially stable if $w=0$);

  • (ii) $\|\bar{\mathcal{P}}-\mathcal{P}\|$ is LISS (locally ISS) w.r.t. $\|\bar{\mathcal{R}}^{-1}-\mathcal{R}^{-1}\|$;

  • (iii) $\|\bar{\mathcal{L}}-\mathcal{L}\|$ is LISS w.r.t. $\|\bar{\mathcal{R}}^{-1}-\mathcal{R}^{-1}\|$.

Property (i) means that the suboptimal filter (16) is stable even if $\mathcal{R}$ is only an approximation of $\bar{\mathcal{R}}$. Property (ii) means that the solution of the suboptimal operator Riccati equation (17) remains close to the solution of the optimal one (when $\mathcal{R}$ is replaced by $\bar{\mathcal{R}}$). Property (iii) means that the suboptimal Kalman gain remains close to the optimal Kalman gain. We have similar results for the gradient filter (18)–(19). The proof is similar to that of Theorem 5 and is thus omitted.

Theorem 6

Assume that $\|\mathcal{Q}(t)\|$ and $\|\bar{\mathcal{Q}}(t)\|$ are uniformly bounded, and there exist $c_{3},c_{4}>0$ such that for $t\geq 0$,

c_{3}\mathcal{I}\leq\mathcal{R}_{g}^{-1}(t),\bar{\mathcal{R}}_{g}^{-1}(t),\mathcal{Q}^{-1}(t),\bar{\mathcal{Q}}^{-1}(t)\leq c_{4}\mathcal{I}. (23)

Then we have:

  • (i) $\|\epsilon_{g}\|_{L^{2}}$ is ISS w.r.t. $\|w_{g}\|_{L^{2}}$ (and is uniformly exponentially stable if $w_{g}=0$);

  • (ii) $\|\bar{\mathcal{Q}}-\mathcal{Q}\|$ is LISS w.r.t. $\|\bar{\mathcal{R}}_{g}^{-1}-\mathcal{R}_{g}^{-1}\|$;

  • (iii) $\|\bar{\mathcal{L}}_{g}-\mathcal{L}_{g}\|$ is LISS w.r.t. $\|\bar{\mathcal{R}}_{g}^{-1}-\mathcal{R}_{g}^{-1}\|$.

IV-C Stability of feedback interconnection

Now we discuss the stability of the feedback interconnection of density estimation and control. We collect equations (7), (20) and (21) in the following for clarity:

\begin{split}
&\partial_{t}\Phi=\nabla\cdot\left(\alpha p\frac{p_{*}(\epsilon_{g}+\nabla p)-(\epsilon+p)\nabla p_{*}}{p_{*}^{2}}+\sigma p_{*}\nabla\frac{p}{p_{*}}\right),\\
&\dot{\epsilon}=[\mathcal{A}(t;\Phi,\epsilon,\epsilon_{g})-\mathcal{P}(t)\mathcal{R}^{-1}(t)]\epsilon+\mathcal{P}(t)\mathcal{R}^{-1}(t)w,\\
&\dot{\epsilon}_{g}=[\mathcal{A}_{g}(t;\Phi,\epsilon,\epsilon_{g})-\mathcal{Q}(t)\mathcal{R}_{g}^{-1}(t)]\epsilon_{g}+\mathcal{Q}(t)\mathcal{R}_{g}^{-1}(t)w_{g},
\end{split} (24)

where we write $\mathcal{A}(t;\Phi,\epsilon,\epsilon_{g})$ and $\mathcal{A}_{g}(t;\Phi,\epsilon,\epsilon_{g})$ to emphasize their dependence on $\Phi$, $\epsilon$, and $\epsilon_{g}$ through $v$.

A critical observation is that the ISS results we have established for $\epsilon$ and $\epsilon_{g}$ remain valid in spite of this dependence. This is because, when designing the estimation algorithms, $\mathcal{A}$ and $\mathcal{A}_{g}$ (and $v$) are treated as known system coefficients. In this way, the bilinear control system (3) becomes a linear system. The dependence on $\Phi$, $\epsilon$, and $\epsilon_{g}$ can be seen as part of the time-varying nature of $\mathcal{A}$ and $\mathcal{A}_{g}$. Hence, the stability results for our density and gradient filters are not affected. (Nevertheless, since in the interconnection $v$ depends on $(\hat{p},\hat{q})$ and becomes stochastic, the presented filters reduce to the best “linear” filters rather than the minimum-covariance filters.) In this regard, we can treat (24) as a cascade system. By Theorem 2, we have the following stability result for (24).

Theorem 7 (Interconnected stability)

Under the assumptions of Theorems 5 and 6, $\|\Phi\|_{L^{2}}$, $\|\epsilon\|_{L^{2}}$, and $\|\epsilon_{g}\|_{L^{2}}$ are all ISS w.r.t. $\|w\|_{L^{2}}$ and $\|w_{g}\|_{L^{2}}$.

This theorem has two implications. First, the stability results for the filters are independent of the density controller. Hence, they can be used in any control design where density feedback is required. Second, the interconnected system is always ISS as long as the density feedback laws are designed such that the tracking error is ISS w.r.t. the $L^{2}$ norms of the estimation errors. Considering that norms on infinite-dimensional spaces are not equivalent in general, obtaining ISS results w.r.t. $L^{2}$ norms is critical. This in turn highlights the advantage of (4), because it is difficult to obtain such an ISS result if $p$ appears in the denominator, as in [9, 10, 11].

Remark 4

In our recent work [13], we proved that, under some regularity assumptions on $v$, the density filter (16) alone also produces convergent gradient estimates, which means the gradient filter may not be needed. However, those assumptions are not satisfied in the feedback interconnected system considered in this work. Hence, the gradient filter (18) still serves as a general solution for estimating the gradient. Determining conditions under which the density filter (16) produces convergent gradient estimates in feedback interconnected systems is the subject of continuing research.

V Simulation studies

Figure 1: Evolution of the swarm (top), the density estimates $\hat{p}(x,t)$ (middle) and the velocity fields $v(x,t)$ (bottom). Each column represents a single time step.

An agent-based simulation using 1024 robots is performed in Matlab to verify the proposed control law. We set $\Omega=(0,1)^{2}$, $\sqrt{2\sigma}=0.01$, and $\alpha=0.003$. Each robot is simulated according to (2), where $v$ is given by (6). The initial positions are drawn from a uniform distribution on $[0.15,0.85]^{2}$. The desired density $p_{*}(x)$ is illustrated in Fig. 2(a). The implementation of the filters and the feedback controller is based on the finite difference method. We discretize $\Omega$ into a $30\times 30$ grid, and the time step is $0.01$ s. We use KDE (with bandwidth $h=0.04$) to obtain $p_{\text{KDE}}$ and $\nabla p_{\text{KDE}}$. The numerical implementation of the filters is described in [13]. Simulation results are given in Fig. 1. It is seen that the swarm evolves towards the desired density. The convergence error $\|\hat{p}-p_{*}\|_{L^{2}(\Omega)}$ is given in Fig. 2(b), which shows that the error converges exponentially to a small neighbourhood of 0 and remains bounded, verifying the ISS property of the proposed algorithms.
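For reference, the convergence error reported in Fig. 2(b) can be approximated on the grid by a Riemann sum; a minimal sketch under the assumption of a uniform 2-D grid with spacing `dx`:

```python
import numpy as np

def l2_error(p_hat, p_star, dx):
    """Riemann-sum approximation of ||p_hat - p_star||_{L^2(Omega)}
    on a uniform 2-D grid with spacing dx (cell area dx**2)."""
    return np.sqrt(np.sum((p_hat - p_star)**2) * dx**2)
```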

Figure 2: (a) Target density $p_{*}(x)$. (b) Convergence error.

VI Conclusion

This paper studied the interplay of density estimation and control algorithms. We proposed new density feedback laws for robust density control of swarm robotic systems and filtering algorithms for estimating the mean-field density and its gradient. We also proved that the interconnection of these algorithms is globally ISS. Our future work is to incorporate the density control algorithm for higher-order nonlinear systems in [18] and the distributed density estimation algorithm in [13], and study their interconnected stability.

References

  • [1] R. Olfati-Saber, J. A. Fax, and R. M. Murray, “Consensus and cooperation in networked multi-agent systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233, 2007.
  • [2] J. R. Marden, G. Arslan, and J. S. Shamma, “Cooperative control and potential games,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 39, no. 6, pp. 1393–1407, 2009.
  • [3] M. Huang, R. P. Malhamé, P. E. Caines et al., “Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle,” Communications in Information & Systems, vol. 6, no. 3, pp. 221–252, 2006.
  • [4] B. Açikmeşe and D. S. Bayard, “A Markov chain approach to probabilistic swarm guidance,” in 2012 American Control Conference (ACC).   IEEE, 2012, pp. 6300–6307.
  • [5] S. Bandyopadhyay, S.-J. Chung, and F. Y. Hadaegh, “Probabilistic and distributed control of a large-scale swarm of autonomous agents,” IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1103–1123, 2017.
  • [6] K. Elamvazhuthi and S. Berman, “Optimal control of stochastic coverage strategies for robotic swarms,” in 2015 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2015, pp. 1822–1829.
  • [7] Y. Chen, T. T. Georgiou, and M. Pavon, “Optimal steering of a linear stochastic system to a final probability distribution, part i,” IEEE Transactions on Automatic Control, vol. 61, no. 5, pp. 1158–1169, 2015.
  • [8] J. Ridderhof, K. Okamoto, and P. Tsiotras, “Nonlinear uncertainty control with iterative covariance steering,” in 2019 IEEE 58th Conference on Decision and Control (CDC).   IEEE, 2019, pp. 3484–3490.
  • [9] U. Eren and B. Açıkmeşe, “Velocity field generation for density control of swarms using heat equation and smoothing kernels,” IFAC-PapersOnLine, vol. 50, no. 1, pp. 9405–9411, 2017.
  • [10] K. Elamvazhuthi, H. Kuiper, M. Kawski, and S. Berman, “Bilinear controllability of a class of advection–diffusion–reaction systems,” IEEE Transactions on Automatic Control, vol. 64, no. 6, pp. 2282–2297, 2018.
  • [11] T. Zheng, Q. Han, and H. Lin, “Transporting robotic swarms via mean-field feedback control,” IEEE Transactions on Automatic Control, pp. 1–1, 2021.
  • [12] ——, “PDE-based dynamic density estimation for large-scale agent systems,” IEEE Control Systems Letters, vol. 5, no. 2, pp. 541–546, 2020.
  • [13] ——, “Distributed mean-field density estimation for large-scale systems,” IEEE Transactions on Automatic Control, pp. 1–1, 2021.
  • [14] D. Bakry, I. Gentil, and M. Ledoux, Analysis and geometry of Markov diffusion operators.   Springer Science & Business Media, 2013, vol. 348.
  • [15] S. Dashkovskiy and A. Mironchenko, “Input-to-state stability of infinite-dimensional control systems,” Mathematics of Control, Signals, and Systems, vol. 25, no. 1, pp. 1–35, 2013.
  • [16] H. K. Khalil and J. W. Grizzle, Nonlinear systems.   Prentice hall Upper Saddle River, NJ, 2002, vol. 3.
  • [17] R. F. Curtain, “Infinite-dimensional filtering,” SIAM Journal on Control, vol. 13, no. 1, pp. 89–104, 1975.
  • [18] T. Zheng, Q. Han, and H. Lin, “Backstepping density control for large-scale heterogeneous nonlinear stochastic systems,” in 2022 American Control Conference (ACC), 2022.
  • [19] G. M. Lieberman, Second order parabolic differential equations.   World scientific, 1996.
  • [20] P. A. Markowich and C. Villani, “On the trend to equilibrium for the Fokker-Planck equation: an interplay between physics and functional analysis,” Mat. Contemp, vol. 19, pp. 1–29, 2000.
  • [21] T. Cacoullos, “Estimation of a multivariate density,” Annals of the Institute of Statistical Mathematics, vol. 18, no. 1, pp. 179–189, 1966.