
Distributed Nash Equilibrium Seeking for Games in Systems with Unknown Control Directions

Maojiao Ye    Shengyuan Xu    Jizhao Yin
Abstract

Distributed Nash equilibrium seeking for games in uncertain networked systems without prior knowledge of the control directions is explored in this paper. More specifically, the dynamics of the players are supposed to be first-order or second-order systems in which the control directions are unknown and there are parametric uncertainties. To achieve Nash equilibrium seeking in a distributed way, Nussbaum function based strategies are proposed through separately designing an optimization module and a state regulation module. The optimization module generates a reference trajectory that searches for the Nash equilibrium and passes it to the state regulation module. The state regulator is designed to steer the players' actions to the reference trajectory. An adaptive law is included in the state regulation module to compensate for the uncertain parameter in the players' dynamics, and the Nussbaum function is included to address the unavailability of the control directions. Fully distributed implementations of the proposed algorithms are discussed and investigated. Using Barbalat's lemma, we show analytically that the proposed seeking strategies drive the players' actions to the Nash equilibrium asymptotically without requiring the players' unknown control directions to be homogeneous. A numerical example is given to support the theoretical analysis of the proposed algorithms.

keywords:
Unknown control directions; Nussbaum function; distributed Nash equilibrium seeking; parametric uncertainties
This work was supported by the National Natural Science Foundation of China (NSFC), No. 61803202, the Natural Science Foundation of Jiangsu Province, No. BK20180455, and the Fundamental Research Funds for the Central Universities, No. 30920032203.
M. Ye, S. Xu and J. Yin are with the School of Automation, Nanjing University of Science and Technology, Nanjing 210094, P.R. China (Email: [email protected], [email protected], [email protected]).

1 Introduction

Games under distributed communication networks are receiving increasing attention due to their wide applications in numerous fields. For example, the connectivity control of mobile sensor networks was modeled as a game in which each sensor's objective function contains a local cost that models the sensor's local goal (e.g., source seeking) and a global cost that describes the sensor's willingness to keep connectivity with other sensors [1]. Inspired by the observation that practical engineering systems are usually afflicted with model uncertainties and disturbances, an extended-state-observer based robust Nash equilibrium seeking strategy was proposed in [1]. Energy consumption control can be formulated as an aggregative game, in which each user's cost function depends on the user's own energy consumption and the total energy consumption of all users [2]. Dynamic average consensus algorithms can be adapted as an aggregation estimator, based on which distributed Nash equilibrium seeking algorithms were constructed [2]. Congestion control problems in wireless sensor networks can be viewed as a semi-aggregative game, in which each data transmitter makes decisions on its data transmission to maximize its own profit [3]. Interference graphs can be introduced to describe the interactions among the data transmitters, based on which Nash equilibrium seeking algorithms were designed in [3]. Moreover, noncooperative games can be utilized to illustrate the interactions among groups of discrete-time and continuous-time agents under distributed communication networks [4]. Motivated by the broad applications of networked games, distributed Nash equilibrium seeking has attracted considerable interest in the past few years and quite a few distributed schemes have been proposed.
The existing works provide some interesting viewpoints on Nash equilibrium seeking for games in which the players' actions can be freely designed (e.g., [20][21][22][23]), are governed by simple dynamics (see, e.g., [24][25]), or are possibly subject to disturbances and unmodeled dynamics (see, e.g., [1]). A common premise of the existing works is that the control directions are known to the players.

Control directions determine the direction of motion of a control system. They matter greatly because a control force applied in the wrong direction may degrade the system and lead to undesired control performance [13]. With information on the control directions, the controller design becomes much simpler. Nevertheless, in some practical circumstances, the control directions are unknown. For instance, due to inaccurate camera parameters and image depth, the manipulator trajectory tracking control of visual servo systems may need to address unknown control directions [6]. Affected by speed variations and loading conditions in a complex, varying environment, the model of a ship contains large uncertainties and hence the autopilot design of time-varying ships requires the accommodation of unknown control directions [7]. It was recognized that the longitudinal dynamics of the air-breathing hypersonic vehicle suffer from unknown control directions as well [9]. Furthermore, the authors in [10] argued that in some situations, it is difficult to detect the control directions of quadrotor unmanned aerial vehicles. Without information on the control directions, the controller design becomes much more challenging, especially for multi-agent systems.

Many researchers have investigated systems with unknown control directions. Adaptive designs with Nussbaum-type functions, which can be traced back to [8], have been shown to be effective in dealing with uncertainties in control directions. In [12], Nussbaum-type functions were adopted to achieve adaptive control of nonlinear systems with arbitrary dynamic order and parametric uncertainties. Extremum seekers with unknown control directions were proposed in [15]. Output feedback control for discrete-time systems without prior knowledge of the control direction was studied in [17], in which a discrete Nussbaum gain was utilized to achieve asymptotic output tracking. Nussbaum functions were discussed in [13] for systems with time-varying unknown control directions. With the development of multi-agent systems, cooperative control of multi-agent systems with unknown control directions has received increasing attention. For example, the authors in [11] considered consensus among a network of first-order integrator-type agents with unknown control directions. In [16], the authors supposed that some control directions are known, based on which consensus of multi-agent systems with partially unknown and non-identical control directions was addressed. Cooperative output consensus in heterogeneous multi-agent systems with non-identical control directions was considered in [18], where Nussbaum-type functions were adopted to achieve global cooperative output regulation. Distributed optimization among a network of high-order integrator-type agents was addressed in [19] without utilizing prior knowledge about control directions. Fully distributed consensus among high-order nonlinear systems in which the agents have heterogeneous unknown control directions was investigated in [29]. A new Nussbaum function was employed to deal with the unknown control directions and it was shown that the agents' outputs can achieve asymptotic consensus [29].
Nevertheless, to the best of the authors' knowledge, distributed Nash equilibrium seeking for networked games in which the players are subject to unknown control directions and uncertain parameters still remains to be addressed. Motivated by the above observations, this paper tries to shed some light on distributed Nash equilibrium seeking strategy design without utilizing control direction information.

In comparison with the existing works, the main contributions of this paper are summarized as follows.

  1.

    Different from the existing works that consider games with known control directions, the seeking strategies proposed in this paper do not require prior direction information. To the best of the authors' knowledge, this is the first work that addresses distributed Nash equilibrium seeking for games with unknown control directions. Besides, this paper also accommodates parametric uncertainties in the players' dynamics. Through a modular design, this paper proposes Nussbaum function based adaptive seeking strategies to achieve distributed Nash equilibrium seeking for games in both first-order and second-order systems with unknown control directions and parametric uncertainties.

  2.

    Based on Barbalat's lemma, it is theoretically shown that the players' actions can be steered to the Nash equilibrium while the other auxiliary variables stay bounded by utilizing the proposed algorithms.

  3.

    Discussions on fully distributed implementation of the proposed algorithms are provided. The explorations show that through adaptive parameter designs, the proposed fully distributed algorithms are effective.

We organize the remaining sections as follows. Some preliminaries are given in Section 2 and the considered problem is formulated in Section 3. Section 4 presents the main results of the paper, in which first-order and second-order systems with unknown control directions and parametric uncertainties are visited, successively. Discussions on fully distributed implementations of the proposed methods are provided in Section 5. Following the theoretical investigations of the developed methods, Section 6 provides numerical studies. In the end, conclusions are given in Section 7.

2 Preliminaries

The following definitions or lemmas will be utilized in the rest of the paper.

Definition 1.

[11] A continuously differentiable function $N_{0}(\cdot)$ is called a Nussbaum function if

$$\begin{aligned}
\lim_{q\rightarrow\infty}\sup\frac{1}{q}\int_{0}^{q}N_{0}(s)\,ds&=\infty,\\
\lim_{q\rightarrow\infty}\inf\frac{1}{q}\int_{0}^{q}N_{0}(s)\,ds&=-\infty.
\end{aligned} \tag{1}$$

Typical examples of Nussbaum functions include $k^{2}\cos(k)$ and $k^{2}\sin(k)$, to mention just a few. Interested readers are referred to [13] for more detailed discussions of Nussbaum functions. In this paper, we adopt $N_{0}(k)=k^{2}\sin(k)$.
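The two limits in (1) can be checked by hand for $N_{0}(k)=k^{2}\sin(k)$, because the integral has a closed form. The following is a minimal sketch, not part of the original paper; the function names are illustrative. It evaluates the running mean at $q=2n\pi$, where it equals $-q$, and at $q=(2n+1)\pi$, where it equals $q-4/q$, so the mean swings towards both $-\infty$ and $+\infty$.

```python
import math


def nussbaum(k: float) -> float:
    """N0(k) = k^2 * sin(k), the Nussbaum function adopted in the text."""
    return k * k * math.sin(k)


def nussbaum_mean(q: float) -> float:
    """Return (1/q) * integral_0^q s^2 sin(s) ds for q > 0.

    Uses the antiderivative -s^2 cos(s) + 2 s sin(s) + 2 cos(s),
    whose value at s = 0 is 2.
    """
    antiderivative = -q * q * math.cos(q) + 2.0 * q * math.sin(q) + 2.0 * math.cos(q)
    return (antiderivative - 2.0) / q


# The mean alternates in sign with growing magnitude along these two sequences.
for n in (1, 10, 100):
    q_even = 2 * n * math.pi        # mean equals -q here
    q_odd = (2 * n + 1) * math.pi   # mean equals q - 4/q here
    print(n, nussbaum_mean(q_even), nussbaum_mean(q_odd))
```

Evaluating at multiples of $\pi$ is only a convenient choice, since the sine terms vanish there and the growth of the mean is visible without numerical quadrature.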

Lemma 1.

[12] Suppose that $V(\cdot)$ and $k(\cdot)$ are smooth functions defined on $[0,t_{f})$, where $t_{f}$ is a positive constant, and $V(t)\geq 0$ for all $t\in[0,t_{f})$. If

$$V(t)\leq\int_{0}^{t}(a_{0}N_{0}(k(\tau))+1)\dot{k}(\tau)\,d\tau+c,\quad\forall t\in[0,t_{f}), \tag{2}$$

where $a_{0}$ is a nonzero constant, $N_{0}$ is an even smooth Nussbaum function, and $c$ is a suitable constant, then $V(t)$, $k(t)$ and $\int_{0}^{t}(a_{0}N_{0}(k(\tau))+1)\dot{k}(\tau)\,d\tau$ are bounded on $[0,t_{f})$.

Lemma 2.

(Barbalat's Lemma [26]) Suppose that $g(t):\mathbb{R}\rightarrow\mathbb{R}$ is a uniformly continuous function. Then, $\lim_{t\rightarrow\infty}g(t)=0$ given that $\lim_{t\rightarrow\infty}\int_{0}^{t}g(s)\,ds$ exists and is finite.

A graph $\mathcal{G}$ contains a node set $\mathcal{V}=\{1,2,\cdots,M\}$ ($M\geq 2$ is an integer) and an edge set $\mathcal{E}_{d}$. The elements of $\mathcal{E}_{d}$ are represented by $(i,j)$, which denotes an edge from node $i$ to node $j$ and indicates that node $j$ can receive information from node $i$ but not necessarily vice versa. If $(i,j)\in\mathcal{E}_{d}$ implies that $(j,i)\in\mathcal{E}_{d}$ for all $i,j\in\mathcal{V}$, the network is undirected. A directed path from node $i_{k}$ to node $i_{k+l}$ is a sequence of ordered edges denoted by $(i_{k+j},i_{k+j+1})$, $j=0,1,2,\cdots,l-1$. A directed graph is said to be strongly connected if there is a directed path between any two distinct nodes. Similarly, an undirected graph is connected if there is a path between any two distinct nodes. The adjacency matrix $\mathcal{A}$ of a directed graph $\mathcal{G}$ is a matrix whose $(i,j)$th entry is $a_{ij}$, which is positive if $(j,i)\in\mathcal{E}_{d}$; otherwise, $a_{ij}=0$. Moreover, $a_{ii}=0$. The adjacency matrix of an undirected graph is similarly defined with the further requirement that $a_{ij}=a_{ji}$ for all $i\neq j$. Moreover, the Laplacian matrix of graph $\mathcal{G}$ is $\mathcal{L}=\mathcal{D}-\mathcal{A}$, in which $\mathcal{D}$ is a diagonal matrix whose $i$th diagonal entry is $\sum_{j=1}^{M}a_{ij}$ [5][30].
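As a small illustration of these graph notions, the sketch below builds the Laplacian $\mathcal{L}=\mathcal{D}-\mathcal{A}$ from an adjacency matrix and checks connectivity of an undirected graph by breadth-first search. It is not taken from the paper: the three-node path graph and the helper names `laplacian` and `is_connected` are assumptions chosen for the example.

```python
from collections import deque
from typing import List

Matrix = List[List[float]]


def laplacian(adj: Matrix) -> Matrix:
    """Return L = D - A, where D[i][i] is the i-th row sum of A."""
    n = len(adj)
    lap = [[-adj[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] += sum(adj[i])
    return lap


def is_connected(adj: Matrix) -> bool:
    """Breadth-first search on an undirected graph (symmetric adjacency matrix)."""
    n = len(adj)
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if adj[i][j] > 0 and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == n


# Hypothetical undirected path graph 1 - 2 - 3 with unit weights.
PATH_GRAPH = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(laplacian(PATH_GRAPH))     # [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]
print(is_connected(PATH_GRAPH))  # True
```

Each row of the Laplacian sums to zero by construction, which is the property the consensus-type estimator dynamics rely on.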

3 Problem Formulation

Consider a game with $N$ players in which the action and cost function of player $i$ are represented by $x_{i}\in\mathbb{R}$ and $f_{i}(\mathbf{x}):\mathbb{R}^{N}\rightarrow\mathbb{R}$, respectively, where $\mathbf{x}=[x_{1},x_{2},\cdots,x_{N}]^{T}$. Denote the player set as $\mathcal{N}=\{1,2,\cdots,N\}$ and suppose that the players' actions are governed by

$$\dot{x}_{i}=b_{i}u_{i}+\phi_{i}(x_{i})\theta_{i},\quad\forall i\in\mathcal{N}, \tag{3}$$

or

$$\begin{aligned}
\dot{x}_{i}&=v_{i},\\
\dot{v}_{i}&=b_{i}u_{i}+\phi_{i}(x_{i})\theta_{i},\quad\forall i\in\mathcal{N}.
\end{aligned} \tag{4}$$

Note that in (3) and (4), $u_{i}$ is the control input to be designed and $b_{i}\neq 0$ is an unknown constant. Moreover, $\phi_{i}(x_{i})$ is a sufficiently smooth known function, $\theta_{i}$ is an unknown parameter, and $v_{i}\in\mathbb{R}$ is a state variable of player $i$.

Furthermore, second-order systems in which player $i$'s action is generated by

$$\begin{aligned}
\dot{x}_{i}&=b_{i1}v_{i}+\phi_{i1}(x_{i})\theta_{i1},\\
\dot{v}_{i}&=b_{i2}u_{i}+\phi_{i2}(x_{i},v_{i})\theta_{i2},\quad\forall i\in\mathcal{N},
\end{aligned} \tag{5}$$

where $\theta_{i1},\theta_{i2},b_{i1},b_{i2}$ are unknown constants and $\phi_{i1}(x_{i})$ and $\phi_{i2}(x_{i},v_{i})$ are smooth functions, will also be considered. Note that in (5), $b_{i1}$ and $b_{i2}$ are nonzero.

The paper aims to design distributed control strategies $u_{i}$ for the systems in (3), (4) and (5), successively, such that $\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0$, where $\mathbf{x}^{*}$ is the Nash equilibrium defined as follows.

Definition 2.

An action profile $\mathbf{x}^{*}=(x_{i}^{*},\mathbf{x}_{-i}^{*})$ is a Nash equilibrium if for $i\in\mathcal{N}$,

$$f_{i}(x_{i}^{*},\mathbf{x}_{-i}^{*})\leq f_{i}(x_{i},\mathbf{x}_{-i}^{*}), \tag{6}$$

for $x_{i}\in\mathbb{R}$, where $\mathbf{x}_{-i}=[x_{1},x_{2},\cdots,x_{i-1},x_{i+1},\cdots,x_{N}]^{T}$ [5].
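Condition (6) says that no player can lower its own cost by deviating unilaterally. A minimal sketch of this check is given here for a hypothetical two-player quadratic game, $f_{i}(\mathbf{x})=(x_{i}-r_{i})^{2}+0.5x_{1}x_{2}$ with $r=(1,2)$. The game, the helper names and the deviation grid are assumptions for illustration and do not come from the paper. Setting $\partial f_{i}/\partial x_{i}=0$ for both players gives $\mathbf{x}^{*}=(8/15,\,28/15)$.

```python
from typing import Sequence

TARGETS = (1.0, 2.0)  # hypothetical r_1, r_2


def cost(i: int, x: Sequence[float]) -> float:
    """f_i(x) = (x_i - r_i)^2 + 0.5 * x_1 * x_2 for players i = 0, 1."""
    return (x[i] - TARGETS[i]) ** 2 + 0.5 * x[0] * x[1]


def is_nash(x_star: Sequence[float], deviations: Sequence[float], tol: float = 1e-12) -> bool:
    """Check (6) on a finite grid of unilateral deviations."""
    for i in range(len(x_star)):
        base = cost(i, x_star)
        for d in deviations:
            x = list(x_star)
            x[i] += d  # only player i changes its action
            if cost(i, x) < base - tol:
                return False
    return True


X_STAR = (8.0 / 15.0, 28.0 / 15.0)
DEVIATIONS = [step / 10.0 for step in range(-50, 51)]
print(is_nash(X_STAR, DEVIATIONS))      # True
print(is_nash((0.0, 0.0), DEVIATIONS))  # False
```

A grid check of this kind can only refute a candidate equilibrium. For this game the costs are strictly convex in each player's own action, so the first-order condition already certifies $\mathbf{x}^{*}$.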

The rest of the paper is based on the following assumptions, which are widely adopted in related works.

Assumption 1.

For each $i\in\mathcal{N}$, $f_{i}(\mathbf{x})$ is sufficiently smooth and $\frac{\partial f_{i}(\mathbf{x})}{\partial x_{i}}$ is globally Lipschitz with constant $l_{i}$.

Assumption 2.

There exists a positive constant $m$ such that for $\mathbf{x},\mathbf{z}\in\mathbb{R}^{N}$,

$$(\mathbf{x}-\mathbf{z})^{T}(\mathcal{P}(\mathbf{x})-\mathcal{P}(\mathbf{z}))\geq m||\mathbf{x}-\mathbf{z}||^{2}, \tag{7}$$

where $\mathcal{P}(\mathbf{x})=\left[\frac{\partial f_{1}(\mathbf{x})}{\partial x_{1}},\frac{\partial f_{2}(\mathbf{x})}{\partial x_{2}},\cdots,\frac{\partial f_{N}(\mathbf{x})}{\partial x_{N}}\right]^{T}$.
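To make the strong monotonicity condition (7) concrete, the sketch below samples random pairs $(\mathbf{x},\mathbf{z})$ and evaluates the ratio $(\mathbf{x}-\mathbf{z})^{T}(\mathcal{P}(\mathbf{x})-\mathcal{P}(\mathbf{z}))/||\mathbf{x}-\mathbf{z}||^{2}$ for a hypothetical two-player game with costs $f_{i}(\mathbf{x})=(x_{i}-r_{i})^{2}+0.5x_{1}x_{2}$ and $r=(1,2)$. This game is an assumption made for illustration, not the paper's example. Its pseudo-gradient is affine with symmetric matrix $[[2,0.5],[0.5,2]]$, whose eigenvalues are $1.5$ and $2.5$, so (7) holds with $m=1.5$.

```python
import random
from typing import List, Sequence


def pseudo_gradient(x: Sequence[float]) -> List[float]:
    """P(x) = [df_1/dx_1, df_2/dx_2] for the hypothetical quadratic game."""
    return [2.0 * (x[0] - 1.0) + 0.5 * x[1],
            2.0 * (x[1] - 2.0) + 0.5 * x[0]]


def monotonicity_ratio(x: Sequence[float], z: Sequence[float]) -> float:
    """(x - z)^T (P(x) - P(z)) divided by ||x - z||^2."""
    px, pz = pseudo_gradient(x), pseudo_gradient(z)
    num = sum((a - b) * (p - q) for a, b, p, q in zip(x, z, px, pz))
    den = sum((a - b) ** 2 for a, b in zip(x, z))
    return num / den


rng = random.Random(0)  # fixed seed so the sample is reproducible
ratios = []
for _ in range(1000):
    x = [rng.uniform(-10.0, 10.0) for _ in range(2)]
    z = [rng.uniform(-10.0, 10.0) for _ in range(2)]
    ratios.append(monotonicity_ratio(x, z))

# Every sampled ratio should lie between the eigenvalues 1.5 and 2.5.
print(min(ratios), max(ratios))
```

Random sampling is only a sanity check. For an affine pseudo-gradient the constant $m$ is obtained exactly as the smallest eigenvalue of the symmetric part of its matrix.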

Assumption 3.

The players are equipped with an undirected and connected communication graph $\mathcal{G}$.

For the systems in (3) and (4), the nonlinear term should satisfy the following condition.

Assumption 4.

For each $i\in\mathcal{N}$, $\phi_{i}(x_{i})$ and $\frac{\partial\phi_{i}(x_{i})}{\partial x_{i}}$ are bounded provided that $x_{i}$ is bounded.

Moreover, for the system in (5), the nonlinear terms should satisfy the following condition.

Assumption 5.

For each $i\in\mathcal{N}$, $\phi_{i1}(x_{i})$ and $\frac{\partial\phi_{i1}(x_{i})}{\partial x_{i}}$ are bounded provided that $x_{i}$ is bounded. Moreover, $\phi_{i2}(x_{i},v_{i})$ is bounded if $x_{i}$ and $v_{i}$ are bounded.

Remark 1.

Different from existing works on distributed Nash equilibrium seeking that consider the control directions to be known, we suppose that the control directions are unknown a priori, as $b_{i}$ (or $b_{i1},b_{i2}$) for all $i\in\mathcal{N}$ are not known. Moreover, the players may have different control directions, as we do not require $\mathrm{sign}(b_{i})$ (or $\mathrm{sign}(b_{i1}),\mathrm{sign}(b_{i2})$) to be the same for all $i\in\mathcal{N}$. Note that in (3) and (4), $\theta_{i}$ is supposed to be unknown as well, indicating that the players are subject to parametric uncertainties.

4 Main Results

In this section, we will establish distributed Nash equilibrium seeking algorithms for games in which the players' actions are governed by (3), (4) and (5), successively. In the following, Nash equilibrium seekers that are able to accommodate the unknown control directions and parametric uncertainties will be proposed, followed by their corresponding convergence analyses.

4.1 Distributed Nash equilibrium seeking for first-order systems with unknown control directions

In this section, we consider that the action of player $i$ is governed by

$$\dot{x}_{i}=b_{i}u_{i}+\phi_{i}(x_{i})\theta_{i},\quad\forall i\in\mathcal{N}. \tag{8}$$

In the following, method development and convergence analysis will be presented.

4.1.1 Method Development

To achieve distributed Nash equilibrium seeking for systems with unknown control directions, let

$$u_{i}=N_{0}(k_{i})(x_{i}-y_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}), \tag{9}$$

where $N_{0}(k_{i})=k_{i}^{2}\sin(k_{i})$ and

$$\begin{aligned}
\dot{k}_{i}&=(x_{i}-y_{i})(x_{i}-y_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}),\\
\dot{\hat{\theta}}_{i}&=\phi_{i}(x_{i})(x_{i}-y_{i}).
\end{aligned} \tag{10}$$

Moreover, $y_{i}$ is an auxiliary variable generated by

$$\dot{y}_{i}=-\nabla_{i}f_{i}(\mathbf{z}_{i}), \tag{11}$$

where $\nabla_{i}f_{i}(\mathbf{z}_{i})=\left.\frac{\partial f_{i}(\mathbf{x})}{\partial x_{i}}\right|_{\mathbf{x}=\mathbf{z}_{i}}$, $\mathbf{z}_{i}=[z_{i1},z_{i2},\cdots,z_{iN}]^{T}$ and

$$\dot{z}_{ij}=-\delta_{ij}\left(\sum_{k=1}^{N}a_{ik}(z_{ij}-z_{kj})+a_{ij}(z_{ij}-y_{j})\right), \tag{12}$$

in which $\delta_{ij}=\delta\bar{\delta}_{ij}$, $\delta$ is a positive constant to be determined and $\bar{\delta}_{ij}$ is a fixed positive constant.
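The regulation part (9)-(10) is a pointwise map from the current values $(x_{i},y_{i},k_{i},\hat{\theta}_{i})$ to the control input and the two adaptation rates. A minimal sketch of that map is given here, with hypothetical names (`regulator`, `phi`) and an arbitrary constant $\phi_{i}$ chosen only so that the numbers can be checked by hand. Note that neither $b_{i}$ nor $\theta_{i}$ appears among the arguments.

```python
import math
from typing import Callable, Tuple


def nussbaum(k: float) -> float:
    """N0(k) = k^2 * sin(k)."""
    return k * k * math.sin(k)


def regulator(x: float, y: float, k: float, theta_hat: float,
              phi: Callable[[float], float]) -> Tuple[float, float, float]:
    """Evaluate (9) and (10) for one player.

    Returns (u, k_dot, theta_hat_dot). The unknown b_i and theta_i are
    deliberately absent: the law uses only measured or computed signals.
    """
    e = x - y                        # tracking error x_i - y_i
    s = e + phi(x) * theta_hat       # common factor in (9) and (10)
    u = nussbaum(k) * s              # control input (9)
    k_dot = e * s                    # first line of (10)
    theta_hat_dot = phi(x) * e       # second line of (10)
    return u, k_dot, theta_hat_dot


# Hand-checkable values: e = 0.5, s = 1.5, N0(pi/2) = (pi/2)^2.
u, k_dot, theta_hat_dot = regulator(1.5, 1.0, math.pi / 2, 2.0, lambda x: 0.5)
print(u, k_dot, theta_hat_dot)
```

Writing the law as a pure function keeps it independent of the integration scheme, so the same function could be called from any ODE stepper.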

Remark 2.

The seeking strategy in (9)-(12) can be viewed as two modules. The subsystem in (9)-(10) is designed to drive $x_{i}$ to $y_{i}$. The Nussbaum function in (9) is employed to accommodate the unknown control directions and the second equation in (10) is utilized to compensate for the unknown parameter $\theta_{i}$. In addition, the subsystem in (11)-(12) is adapted from [5] to act as a reference generator that drives $\mathbf{y}=[y_{1},y_{2},\cdots,y_{N}]^{T}$ to the Nash equilibrium $\mathbf{x}^{*}$ [5]. The schematic outline of (9)-(12) is depicted in Fig. 1.

Figure 1: The illustration of the information flows in the seeking strategy.
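The reference generator (11)-(12) can be exercised on its own, since it does not depend on the players' physical states. The sketch below integrates it with forward Euler for a hypothetical two-player game with costs $f_{i}(\mathbf{x})=(x_{i}-r_{i})^{2}+0.5x_{1}x_{2}$, $r=(1,2)$, on a two-node undirected graph. The game, the gain $\delta=10$ with $\bar{\delta}_{ij}=1$, the step size and the horizon are all assumptions made for illustration and are not the paper's numerical example. For this game the Nash equilibrium is $(8/15,\,28/15)$.

```python
from typing import List, Tuple

A = [[0.0, 1.0], [1.0, 0.0]]  # adjacency matrix of the two-node graph
TARGETS = (1.0, 2.0)          # hypothetical r_1, r_2


def partial_grad(i: int, z_i: List[float]) -> float:
    """d f_i / d x_i evaluated at player i's estimate z_i of all actions."""
    return 2.0 * (z_i[i] - TARGETS[i]) + 0.5 * z_i[1 - i]


def simulate_reference(delta: float = 10.0, dt: float = 1e-3,
                       t_end: float = 20.0) -> Tuple[List[float], List[List[float]]]:
    """Forward-Euler integration of (11)-(12) with delta_bar_ij = 1."""
    n = 2
    y = [0.0, 0.0]
    z = [[0.0] * n for _ in range(n)]
    for _ in range(int(t_end / dt)):
        # (11): gradient play on the local estimate.
        y_dot = [-partial_grad(i, z[i]) for i in range(n)]
        # (12): consensus on estimates plus injection of neighbours' y_j.
        z_dot = [[-delta * (sum(A[i][k] * (z[i][j] - z[k][j]) for k in range(n))
                            + A[i][j] * (z[i][j] - y[j]))
                  for j in range(n)] for i in range(n)]
        for i in range(n):
            y[i] += dt * y_dot[i]
            for j in range(n):
                z[i][j] += dt * z_dot[i][j]
    return y, z


y_final, z_final = simulate_reference()
print(y_final)  # approaches (8/15, 28/15) for this hypothetical game
```

The estimator gain is deliberately larger than the gradient-play rate, which mirrors the requirement $\delta>\delta^{*}$ in the analysis. The particular value $10$ is a guess that happens to be sufficient for this small example.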

4.1.2 Convergence Analysis

In this section, we provide the convergence analysis for the seeking strategy proposed in (9)-(12). Before we proceed to present the convergence results, the following supportive lemma is given.

Lemma 3.

Suppose that Assumptions 1-3 are satisfied. Then, there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$, the following conclusions hold:

  • For each $i,j\in\mathcal{N}$, $y_{i}(t)$ and $z_{ij}(t)$ are bounded for $t\in[0,\infty)$.

  • For each $i\in\mathcal{N}$, $\dot{y}_{i}(t)$ globally exponentially decays to zero.

  • For each $i\in\mathcal{N}$, $\dot{y}_{i}(t)$ is square integrable over $t\in[0,\infty)$, i.e., $\int_{0}^{\infty}\dot{y}_{i}^{2}(s)\,ds\leq c_{i}$ for some positive constant $c_{i}$.

Proof: Following the results in [5], it can be obtained that there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$, $\mathbf{y}$ and $\mathbf{z}$, where $\mathbf{y}=[y_{1},y_{2},\cdots,y_{N}]^{T}$ and $\mathbf{z}=[\mathbf{z}_{1}^{T},\mathbf{z}_{2}^{T},\cdots,\mathbf{z}_{N}^{T}]^{T}$, globally exponentially converge to $\mathbf{x}^{*}$ and $\mathbf{1}_{N}\otimes\mathbf{x}^{*}$, respectively [5]. Hence, the first conclusion directly follows from the results in [5]. The second conclusion can be reasoned as follows. As $\mathbf{y}$ and $\mathbf{z}$ globally exponentially converge to $\mathbf{x}^{*}$ and $\mathbf{1}_{N}\otimes\mathbf{x}^{*}$, respectively, there are positive constants $\eta_{1},\eta_{2}$ such that

$$||[(\mathbf{y}-\mathbf{x}^{*})^{T},(\mathbf{z}-\mathbf{1}_{N}\otimes\mathbf{x}^{*})^{T}]^{T}||\leq\eta_{1}e^{-\eta_{2}t}. \tag{13}$$

For each $i\in\mathcal{N}$, we get that

$$||\dot{y}_{i}||=||\nabla_{i}f_{i}(\mathbf{z}_{i})-\nabla_{i}f_{i}(\mathbf{x}^{*})||, \tag{14}$$

by noticing that $\nabla_{i}f_{i}(\mathbf{x}^{*})=0,\forall i\in\mathcal{N}$ according to Assumption 2. By the Lipschitz condition of $\nabla_{i}f_{i}$ in Assumption 1, we get that

$$||\dot{y}_{i}||\leq l_{i}||\mathbf{z}_{i}-\mathbf{x}^{*}||\leq l_{i}||\mathbf{z}-\mathbf{1}_{N}\otimes\mathbf{x}^{*}||\leq l_{i}\eta_{1}e^{-\eta_{2}t}, \tag{15}$$

thus arriving at the second conclusion.

For the third conclusion,

$$\begin{aligned}
\int_{0}^{\infty}\dot{y}_{i}^{2}(s)\,ds={}&\int_{0}^{\infty}||\nabla_{i}f_{i}(\mathbf{z}_{i}(s))||^{2}\,ds\\
\leq{}&l_{i}^{2}\int_{0}^{\infty}||\mathbf{z}_{i}(s)-\mathbf{x}^{*}||^{2}\,ds\\
\leq{}&l_{i}^{2}\eta_{1}^{2}\int_{0}^{\infty}e^{-2\eta_{2}s}\,ds\leq\frac{l_{i}^{2}\eta_{1}^{2}}{2\eta_{2}},
\end{aligned} \tag{16}$$

thus arriving at the third conclusion with $c_{i}=\frac{l_{i}^{2}\eta_{1}^{2}}{2\eta_{2}}$. $\Box$

Note that by Lemma 3 and (11)-(12), $\dot{y}_{i}$ and $\dot{z}_{ij}$ are also bounded, as $y_{i}$ and $z_{ij}$ are bounded for all $i,j\in\mathcal{N}$.

With the above results in mind, we are now ready to show that the players' actions $\mathbf{x}$ can be driven to the Nash equilibrium $\mathbf{x}^{*}$ by utilizing the proposed method.

Theorem 1.

Suppose that Assumptions 1-4 are satisfied. Then, there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$,

$$\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0, \tag{17}$$

and $k_{i}(t)$, $\hat{\theta}_{i}(t)$ for all $i\in\mathcal{N}$ stay bounded.

Proof: Define a sub-Lyapunov candidate function for player $i$ as

$$V_{i}=\frac{1}{2}(x_{i}-y_{i})^{2}+\frac{1}{2}(\theta_{i}-\hat{\theta}_{i})^{2}. \tag{18}$$

Then, the time derivative of $V_{i}$ along the trajectory is

$$\begin{aligned}
\dot{V}_{i}={}&(x_{i}-y_{i})(\dot{x}_{i}-\dot{y}_{i})+(\hat{\theta}_{i}-\theta_{i})\dot{\hat{\theta}}_{i}\\
={}&(x_{i}-y_{i})\left(N_{0}(k_{i})b_{i}(x_{i}-y_{i}+\phi_{i}(x_{i})\hat{\theta}_{i})+\phi_{i}(x_{i})\theta_{i}\right)\\
&-(x_{i}-y_{i})\dot{y}_{i}+(\hat{\theta}_{i}-\theta_{i})\phi_{i}(x_{i})(x_{i}-y_{i})\\
\leq{}&N_{0}(k_{i})b_{i}\dot{k}_{i}-(x_{i}-y_{i})\dot{y}_{i}+\hat{\theta}_{i}\phi_{i}(x_{i})(x_{i}-y_{i})\\
\leq{}&-(x_{i}-y_{i})^{2}+(N_{0}(k_{i})b_{i}+1)\dot{k}_{i}-(x_{i}-y_{i})\dot{y}_{i}\\
\leq{}&-\left(1-\frac{C_{i}}{2}\right)(x_{i}-y_{i})^{2}+(N_{0}(k_{i})b_{i}+1)\dot{k}_{i}+\frac{(\dot{y}_{i})^{2}}{2C_{i}},
\end{aligned} \tag{19}$$

by noticing that $|(x_{i}-y_{i})\dot{y}_{i}|\leq\frac{C_{i}(x_{i}-y_{i})^{2}}{2}+\frac{(\dot{y}_{i})^{2}}{2C_{i}}$, where $C_{i}$ is a positive constant that satisfies $C_{i}<2$.
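The step in (19) that introduces the term $(N_{0}(k_{i})b_{i}+1)\dot{k}_{i}$ relies only on the definition of $\dot{k}_{i}$ in (10). Written out as a short worked step (a restatement for readability, using no symbols beyond those already defined):

```latex
\begin{aligned}
\dot{k}_{i} &= (x_{i}-y_{i})^{2}+\hat{\theta}_{i}\phi_{i}(x_{i})(x_{i}-y_{i})
&&\text{(expanding (10))}\\
\hat{\theta}_{i}\phi_{i}(x_{i})(x_{i}-y_{i}) &= \dot{k}_{i}-(x_{i}-y_{i})^{2}
&&\text{(rearranging)}\\
N_{0}(k_{i})b_{i}\dot{k}_{i}+\hat{\theta}_{i}\phi_{i}(x_{i})(x_{i}-y_{i})
&= -(x_{i}-y_{i})^{2}+(N_{0}(k_{i})b_{i}+1)\dot{k}_{i}
&&\text{(substituting)}
\end{aligned}
```

The unknown $\theta_{i}$ has already cancelled one line earlier, because $(x_{i}-y_{i})\phi_{i}(x_{i})\theta_{i}$ from the plant and $-\theta_{i}\phi_{i}(x_{i})(x_{i}-y_{i})$ from the adaptive law sum to zero.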

Integrating both sides of (19), it can be obtained that

$$\begin{aligned}
\int_{0}^{t_{f}}\dot{V}_{i}(\tau)\,d\tau\leq{}&-\int_{0}^{t_{f}}\left(1-\frac{C_{i}}{2}\right)(x_{i}-y_{i})^{2}\,d\tau\\
&+\int_{0}^{t_{f}}(N_{0}(k_{i})b_{i}+1)\dot{k}_{i}\,d\tau+\int_{0}^{t_{f}}\frac{(\dot{y}_{i})^{2}}{2C_{i}}\,d\tau\\
\leq{}&\int_{0}^{t_{f}}(N_{0}(k_{i})b_{i}+1)\dot{k}_{i}\,d\tau+\frac{c_{i}}{2C_{i}}.
\end{aligned} \tag{20}$$

Note that the last inequality is obtained by noticing that $\int_{0}^{t_{f}}\frac{(\dot{y}_{i})^{2}}{2C_{i}}\,d\tau\leq\int_{0}^{\infty}\frac{(\dot{y}_{i})^{2}}{2C_{i}}\,d\tau\leq\frac{c_{i}}{2C_{i}}$ according to Lemma 3.

Since $\int_{0}^{t_{f}}\dot{V}_{i}(\tau)\,d\tau=V_{i}(t_{f})-V_{i}(0)$, (20) has the form of (2) with $a_{0}=b_{i}$ and $c=V_{i}(0)+\frac{c_{i}}{2C_{i}}$. Hence, $V_{i}(t)$ and $k_{i}(t)$ are bounded on $[0,t_{f})$ by Lemma 1, which indicates that $x_{i}-y_{i}$ and $\hat{\theta}_{i}$ are bounded. Moreover, as $y_{i}$ is bounded by Lemma 3, we obtain that $x_{i}$ is bounded for $t\in[0,t_{f})$, from which we can further obtain that $\dot{x}_{i},\dot{k}_{i},\dot{\hat{\theta}}_{i}$ are bounded over the time interval $[0,t_{f})$. This implies that there is no finite-time escape for the closed-loop system and hence $t_{f}=\infty$.

Taking the time derivative of $\dot{k}_{i}$ gives

$$\begin{aligned}
\ddot{k}_{i}={}&(\dot{x}_{i}-\dot{y}_{i})(x_{i}-y_{i}+\phi_{i}(x_{i})\hat{\theta}_{i})\\
&+(x_{i}-y_{i})\left(\dot{x}_{i}-\dot{y}_{i}+\frac{\partial\phi_{i}(x_{i})}{\partial x_{i}}\dot{x}_{i}\hat{\theta}_{i}+\phi_{i}(x_{i})\dot{\hat{\theta}}_{i}\right).
\end{aligned} \tag{21}$$

As $x_{i}$ is bounded for $t\in[0,\infty)$, $\frac{\partial\phi_{i}(x_{i})}{\partial x_{i}}$ is bounded for $t\in[0,\infty)$ by Assumption 4. Moreover, noticing that $x_{i},y_{i},\phi_{i}(x_{i}),\hat{\theta}_{i},\dot{x}_{i},\dot{y}_{i},\dot{\hat{\theta}}_{i}$ are all bounded, we get that $\ddot{k}_{i}$ is bounded. Hence, $\dot{k}_{i}(t)$ is uniformly continuous with respect to $t$. In addition,

$$\int_{0}^{\infty}\dot{k}_{i}(s)\,ds=k_{i}(\infty)-k_{i}(0)\leq k_{i}^{*}, \tag{22}$$

where $k_{i}^{*}$ is a finite constant determined by the bounds of $k_{i}(t)$.

Therefore, $(x_{i}(t)-y_{i}(t))(x_{i}(t)-y_{i}(t)+\phi_{i}(x_{i}(t))\hat{\theta}_{i}(t))$ is integrable over $t\in[0,\infty)$. Hence

$$\lim_{t\rightarrow\infty}(x_{i}(t)-y_{i}(t))(x_{i}(t)-y_{i}(t)+\phi_{i}(x_{i}(t))\hat{\theta}_{i}(t))=0, \tag{23}$$

by Lemma 2.

On the other hand, taking the time derivative of $\dot{\hat{\theta}}_{i}$ gives

$$\ddot{\hat{\theta}}_{i}=\frac{\partial\phi_{i}(x_{i})}{\partial x_{i}}\dot{x}_{i}(x_{i}-y_{i})+\phi_{i}(x_{i})(\dot{x}_{i}-\dot{y}_{i}), \tag{24}$$

from which we see that $\ddot{\hat{\theta}}_{i}$ is bounded by noticing that $x_{i},y_{i},\dot{x}_{i},\dot{y}_{i},\frac{\partial\phi_{i}(x_{i})}{\partial x_{i}},\phi_{i}(x_{i})$ are bounded.

Therefore, $\dot{\hat{\theta}}_{i}$ is uniformly continuous with respect to $t$. Moreover,

$$\int_{0}^{\infty}\dot{\hat{\theta}}_{i}(t)\,dt=\hat{\theta}_{i}(\infty)-\hat{\theta}_{i}(0)\leq\hat{\theta}_{i}^{*}, \tag{25}$$

where $\hat{\theta}_{i}^{*}$ is a constant determined by the bounds of $\hat{\theta}_{i}$. Hence, by Lemma 2, we can obtain that

$$\lim_{t\rightarrow\infty}\phi_{i}(x_{i}(t))(x_{i}(t)-y_{i}(t))=0. \tag{26}$$

By (23), in the limit as $t\rightarrow\infty$ we have $x_{i}(t)=y_{i}(t)$ or, alternatively, $x_{i}(t)-y_{i}(t)+\phi_{i}(x_{i}(t))\hat{\theta}_{i}(t)=0$. Moreover, by (26), in the limit we have $\phi_{i}(x_{i}(t))=0$ or $x_{i}(t)=y_{i}(t)$. Suppose that $x_{i}(t)\neq y_{i}(t)$ in the limit. Then $\phi_{i}(x_{i}(t))=0$ must be satisfied, in which case $x_{i}(t)-y_{i}(t)+\phi_{i}(x_{i}(t))\hat{\theta}_{i}(t)=x_{i}(t)-y_{i}(t)\neq 0$, indicating that (23) cannot be satisfied. Hence, we arrive at a contradiction and obtain that $x_{i}(t)-y_{i}(t)\rightarrow 0$ as $t\rightarrow\infty$. Recalling that $\mathbf{y}(t)\rightarrow\mathbf{x}^{*}$ as $t\rightarrow\infty$, which is proven in Lemma 3, we arrive at the conclusion that $\mathbf{x}(t)\rightarrow\mathbf{x}^{*}$ as $t\rightarrow\infty$, thus completing the proof. $\Box$

In this paper, we focus on the case in which the communication graph is undirected and connected for simplicity. However, it should be noted that the presented results are still valid for strongly connected digraphs. To highlight this point, the following corollary is given.

Corollary 1.

Suppose that Assumptions 1-2, 4 are satisfied and the communication graph is strongly connected. Then, there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$,

$$\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0, \tag{27}$$

and $k_{i}(t),\hat{\theta}_{i}(t)$ for all $i\in\mathcal{N}$ stay bounded.

Proof: The proof follows that of Theorem 1 by noticing that the results in Lemma 3 are still valid for strongly connected digraphs. $\Box$

In Theorem 1, we consider that each player $i$'s action is subject to both an unknown control direction ($b_{i}$ is unknown) and an uncertain parameter $\theta_{i}$, i.e.,

$$\dot{x}_{i}=b_{i}u_{i}+\phi_{i}(x_{i})\theta_{i}. \tag{28}$$

If there is no uncertain parameter and the players' actions are generated by

$$\dot{x}_{i}=b_{i}u_{i}, \tag{29}$$

then the proposed seeking strategy can be revised to be

$$u_{i}=N_{0}(k_{i})(x_{i}-y_{i}), \tag{30}$$

where $N_{0}(k_{i})=k_{i}^{2}\sin(k_{i})$,

$$\dot{k}_{i}=(x_{i}-y_{i})^{2}, \tag{31}$$

and $y_{i}$ is generated by (11)-(12). In this case, the following corollary can be obtained.

Corollary 2.

Suppose that Assumptions 1-3 are satisfied. Then, there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$,

$$\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0, \tag{32}$$

and $k_{i}(t)$ for all $i\in\mathcal{N}$ stays bounded.
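To illustrate how the Nussbaum gain in (30)-(31) copes with either sign of $b_{i}$, the sketch below isolates the regulation module and lets a single integrator (29) track a constant reference $y$, once with $b=1$ and once with $b=-1$. This is a simplification introduced here: in the full strategy $y_{i}$ comes from (11)-(12), whereas the constant reference, the initial conditions, the forward-Euler step and the horizon are assumptions chosen only for illustration.

```python
import math
from typing import Tuple


def track_constant_reference(b: float, y: float = 1.0, x0: float = 2.0,
                             k0: float = 0.0, dt: float = 1e-3,
                             t_end: float = 20.0) -> Tuple[float, float]:
    """Forward-Euler simulation of (29)-(31) with a constant reference y.

    The controller never reads b: only x, y and its own state k are used.
    Returns the final (x, k).
    """
    x, k = x0, k0
    for _ in range(int(t_end / dt)):
        e = x - y
        u = k * k * math.sin(k) * e   # (30): u = N0(k) (x - y)
        x += dt * b * u               # (29): plant with unknown sign of b
        k += dt * e * e               # (31): k grows while the error persists
    return x, k


for b in (1.0, -1.0):
    x_final, k_final = track_constant_reference(b)
    print(b, x_final, k_final)
```

For this scalar case the error and the gain satisfy $\frac{d}{dk}\left(\frac{e^{2}}{2}\right)=bk^{2}\sin(k)$, so $k$ keeps increasing until it reaches a region where $b\sin(k)$ is negative and the error is driven to zero. With $b=1$ the error first grows before the gain reaches such a region, which is the transient one would expect from a sign-probing controller.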

4.2 Distributed Nash equilibrium seeking for second-order systems

In this section, we suppose that for each $i\in\mathcal{N}$, player $i$'s action $x_{i}$ is governed by

$$\begin{aligned}
\dot{x}_{i}&=v_{i},\\
\dot{v}_{i}&=b_{i}u_{i}+\phi_{i}(x_{i})\theta_{i},
\end{aligned} \tag{33}$$

in which $v_{i}\in\mathbb{R}$ is a state of player $i$.

4.2.1 Method Development

To achieve distributed Nash equilibrium seeking for games in which each player $i$'s dynamics is governed by (33), the control input $u_{i}$ is designed as

$$u_{i}=N_{0}(k_{i})(x_{i}-y_{i}+v_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}+(\dot{x}_{i}-\dot{y}_{i})), \tag{34}$$

where $N_{0}(k_{i})=k_{i}^{2}\sin(k_{i})$ and

$$\begin{aligned}
\dot{k}_{i}=&\ (x_{i}-y_{i}+v_{i})(x_{i}-y_{i}+v_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}+(\dot{x}_{i}-\dot{y}_{i})),\\
\dot{\hat{\theta}}_{i}=&\ \phi_{i}(x_{i})(x_{i}-y_{i}+v_{i}).
\end{aligned}\tag{35}$$

Moreover, $y_{i}$ is an auxiliary variable generated by

$$\dot{y}_{i}=-\nabla_{i}f_{i}(\mathbf{z}_{i}), \tag{36}$$

where $\mathbf{z}_{i}=[z_{i1},z_{i2},\cdots,z_{iN}]^{T}$. Furthermore,

$$\dot{z}_{ij}=-\delta_{ij}\left(\sum_{k=1}^{N}a_{ik}(z_{ij}-z_{kj})+a_{ij}(z_{ij}-y_{j})\right), \tag{37}$$

where $\delta_{ij}=\delta\bar{\delta}_{ij}$, $\delta$ is a positive constant to be determined and $\bar{\delta}_{ij}$ is a fixed positive constant.
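
The optimization module (36)-(37) can be sketched on a toy problem. Everything specific below is an assumption made for illustration and does not come from the paper: a two-player quadratic game with costs $f_{1}=x_{1}^{2}+x_{1}x_{2}-2x_{1}$ and $f_{2}=x_{2}^{2}+x_{1}x_{2}-2x_{2}$ (Nash equilibrium $x_{1}^{*}=x_{2}^{*}=2/3$), a two-node undirected graph, $\bar{\delta}_{ij}=1$, the gain value $\delta=50$, and a forward-Euler integrator.

```python
def simulate_reference(delta=50.0, dt=1e-3, t_end=40.0):
    """Forward-Euler integration of (36)-(37) for a hypothetical 2-player game.

    Assumed costs: f_1 = x1^2 + x1*x2 - 2*x1 and f_2 = x2^2 + x1*x2 - 2*x2,
    whose Nash equilibrium is x1* = x2* = 2/3.
    """
    n = 2
    a = [[0.0, 1.0], [1.0, 0.0]]     # adjacency matrix of the graph 1 - 2
    y = [3.0, -2.0]                  # reference actions y_i
    z = [[0.0] * n for _ in range(n)]  # z[i][j]: player i's estimate of y_j

    def partial_grad(i, zi):
        # Partial gradient of f_i with respect to its own action, evaluated at z_i
        return 2.0 * zi[i] + zi[1 - i] - 2.0

    for _ in range(int(t_end / dt)):
        y_dot = [-partial_grad(i, z[i]) for i in range(n)]              # (36)
        z_dot = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                consensus = sum(a[i][k] * (z[i][j] - z[k][j]) for k in range(n))
                z_dot[i][j] = -delta * (consensus + a[i][j] * (z[i][j] - y[j]))  # (37)
        y = [y[i] + dt * y_dot[i] for i in range(n)]
        z = [[z[i][j] + dt * z_dot[i][j] for j in range(n)] for i in range(n)]
    return y, z


if __name__ == "__main__":
    y, z = simulate_reference()
    print("y =", y)   # both entries should be close to 2/3
    print("z =", z)   # each row should be close to y
```

Each player only uses its own estimate vector and those of its neighbours, which is what makes the update one-hop. The value of `delta` is picked by hand here; the analysis only asserts that a sufficiently large $\delta$ exists.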

To establish the results for second-order systems, the following assumption is also needed.

Assumption 6.

For each $i,j\in\mathcal{N}$, $\frac{\partial\nabla_{i}f_{i}(\mathbf{x})}{\partial x_{j}}$ is bounded given that $\mathbf{x}$ is bounded.

Remark 3.

Comparing the strategy in (34)-(37) with that in (9)-(12), we see that the optimization modules are the same while the regulation modules are different. As the system in (33) is a second-order system, we further utilize $\dot{x}_{i}$ and $\dot{y}_{i}$ in the seeking strategy. Since $\dot{x}_{i}=v_{i}$ by (33) and $\dot{y}_{i}=-\nabla_{i}f_{i}(\mathbf{z}_{i})$ by (36) depend only on player $i$'s own state and local estimates, the communication in the proposed seeking strategy is still one-hop.

4.2.2 Convergence analysis

The following theorem illustrates the convergence result for the proposed method.

Theorem 2.

Suppose that Assumptions 1-3 and 5-6 are satisfied. Then, there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$,

$$\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0. \tag{38}$$

Moreover, $k_{i}(t)$ and $\hat{\theta}_{i}(t)$ for all $i\in\mathcal{N}$ stay bounded.

Proof: For notational convenience, let $\xi_{i}=x_{i}-y_{i}+v_{i}$. Define the sub-Lyapunov candidate function for player $i$ as

$$V_{i}=\frac{1}{2}(x_{i}-y_{i})^{2}+\frac{1}{2}\xi_{i}^{2}+\frac{1}{2}(\theta_{i}-\hat{\theta}_{i})^{2}. \tag{39}$$

Then, the time derivative of $V_{i}$ is

$$\begin{aligned}
\dot{V}_{i}=&-(x_{i}-y_{i})(x_{i}-y_{i}-\xi_{i})-(x_{i}-y_{i})\dot{y}_{i}\\
&+\xi_{i}\left(b_{i}N_{0}(k_{i})(\xi_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}+(\dot{x}_{i}-\dot{y}_{i}))\right)\\
&+\xi_{i}\left(\phi_{i}(x_{i})\theta_{i}+\dot{x}_{i}-\dot{y}_{i}\right)+(\hat{\theta}_{i}-\theta_{i})\phi_{i}(x_{i})\xi_{i}\\
=&-(x_{i}-y_{i})^{2}-\xi_{i}^{2}+(b_{i}N_{0}(k_{i})+1)\dot{k}_{i}\\
&+(x_{i}-y_{i})\xi_{i}-(x_{i}-y_{i})\dot{y}_{i}\\
\leq&-\left(\frac{1}{2}-\frac{C_{i}}{2}\right)(x_{i}-y_{i})^{2}-\frac{1}{2}\xi_{i}^{2}\\
&+(b_{i}N_{0}(k_{i})+1)\dot{k}_{i}+\frac{(\dot{y}_{i})^{2}}{2C_{i}},
\end{aligned}\tag{40}$$

where $C_{i}$ is a positive constant that satisfies $C_{i}<1$.
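
The algebra behind (40) can be spot-checked numerically. The sketch below samples random values of the states and parameters, evaluates $\dot{V}_{i}$ directly from (33)-(35) and (39), and compares it with the closed form and the upper bound in (40). The regressor $\phi_{i}(x_{i})=x_{i}$, the sampling ranges and the function name `lyapunov_terms` are assumptions chosen only for this check.

```python
import math
import random


def nussbaum(k):
    """Nussbaum-type gain N0(k) = k^2 sin(k)."""
    return k * k * math.sin(k)


def lyapunov_terms(rng):
    """Return (V_dot, closed form in (40), upper bound in (40)) at a random point."""
    x, y, v, theta, theta_hat, k, b, y_dot = (rng.uniform(-3.0, 3.0) for _ in range(8))
    c = rng.uniform(0.05, 0.95)        # the constant C_i, with 0 < C_i < 1
    phi = x                            # hypothetical regressor phi_i(x_i) = x_i
    xi = x - y + v
    x_dot = v                          # first line of (33)
    s = xi + phi * theta_hat + (x_dot - y_dot)
    u = nussbaum(k) * s                # control input (34)
    v_dot = b * u + phi * theta        # second line of (33)
    k_dot = xi * s                     # first line of (35)
    theta_hat_dot = phi * xi           # second line of (35)

    # Derivative of (39) along the closed-loop trajectories
    big_v_dot = ((x - y) * (x_dot - y_dot)
                 + xi * (x_dot - y_dot + v_dot)
                 + (theta_hat - theta) * theta_hat_dot)
    gain_term = (b * nussbaum(k) + 1.0) * k_dot
    closed_form = -(x - y) ** 2 - xi ** 2 + gain_term + (x - y) * xi - (x - y) * y_dot
    bound = (-(0.5 - c / 2.0) * (x - y) ** 2 - 0.5 * xi ** 2
             + gain_term + y_dot ** 2 / (2.0 * c))
    return big_v_dot, closed_form, bound


if __name__ == "__main__":
    rng = random.Random(0)
    worst = max(abs(a - b) for a, b, _ in (lyapunov_terms(rng) for _ in range(1000)))
    print("largest gap between V_dot and the closed form:", worst)
```

The final inequality in (40) follows from $(x_{i}-y_{i})\xi_{i}\leq\frac{1}{2}(x_{i}-y_{i})^{2}+\frac{1}{2}\xi_{i}^{2}$ and $-(x_{i}-y_{i})\dot{y}_{i}\leq\frac{C_{i}}{2}(x_{i}-y_{i})^{2}+\frac{\dot{y}_{i}^{2}}{2C_{i}}$, which is what the `bound` expression encodes.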

Integrating both sides of (40) over $t\in[0,t_{f})$ gives

$$\begin{aligned}
\int_{0}^{t_{f}}\dot{V}_{i}d\tau\leq&-\int_{0}^{t_{f}}\left[\left(\frac{1}{2}-\frac{C_{i}}{2}\right)(x_{i}-y_{i})^{2}+\frac{1}{2}\xi_{i}^{2}\right]d\tau\\
&+\int_{0}^{t_{f}}(b_{i}N_{0}(k_{i})+1)\dot{k}_{i}d\tau+\int_{0}^{t_{f}}\frac{(\dot{y}_{i})^{2}}{2C_{i}}d\tau\\
\leq&\int_{0}^{t_{f}}(b_{i}N_{0}(k_{i})+1)\dot{k}_{i}d\tau+\frac{c_{i}}{2C_{i}},
\end{aligned}\tag{41}$$

where the last inequality drops the nonpositive first integral and uses the square integrability of $\dot{y}_{i}$, i.e., $\int_{0}^{\infty}\dot{y}_{i}^{2}(s)ds\leq c_{i}$.

Hence, by Lemma 1, we get that $V_{i}$ and $k_{i}$ are bounded for $t\in[0,t_{f})$, which further indicates that $x_{i}-y_{i}$, $\xi_{i}$ and $\hat{\theta}_{i}$ are bounded for $t\in[0,t_{f})$. Recalling that $y_{i}$ is bounded, we get that $x_{i}$ is bounded for $t\in[0,t_{f})$. Hence, $v_{i}=\xi_{i}-(x_{i}-y_{i})$ is bounded. Therefore, there is no finite-time escape for the closed-loop system, which indicates that $t_{f}=\infty$. Recalling (34)-(37), we can obtain that $\dot{x}_{i},\dot{v}_{i},\dot{y}_{i},\dot{k}_{i},\dot{\hat{\theta}}_{i}$ are all bounded. Taking the time derivative of $\dot{k}_{i}(t)$ gives

$$\begin{aligned}
\ddot{k}_{i}=&\ (\dot{v}_{i}+\dot{x}_{i}-\dot{y}_{i})(\xi_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}+(\dot{x}_{i}-\dot{y}_{i}))\\
&+\xi_{i}\left(\dot{x}_{i}-\dot{y}_{i}+\dot{v}_{i}+\frac{\partial\phi_{i}(x_{i})}{\partial x_{i}}\dot{x}_{i}\hat{\theta}_{i}+\phi_{i}(x_{i})\dot{\hat{\theta}}_{i}\right)\\
&+\xi_{i}(\ddot{x}_{i}-\ddot{y}_{i}).
\end{aligned}\tag{42}$$

Note that $\ddot{x}_{i}=\dot{v}_{i}$ is bounded and $\ddot{y}_{i}=-\left(\left.\frac{\partial\nabla_{i}f_{i}(\mathbf{x})}{\partial\mathbf{x}}\right|_{\mathbf{x}=\mathbf{z}_{i}}\right)^{T}\dot{\mathbf{z}}_{i}$ is bounded, as $\mathbf{z}_{i}$ and $\dot{\mathbf{z}}_{i}$ are bounded and $\left.\frac{\partial\nabla_{i}f_{i}(\mathbf{x})}{\partial\mathbf{x}}\right|_{\mathbf{x}=\mathbf{z}_{i}}$ is bounded for bounded $\mathbf{z}_{i}$ (by Assumption 6). It follows that $\ddot{k}_{i}$ is bounded. Hence, $\dot{k}_{i}(t)$ is uniformly continuous. Moreover,

$$\int_{0}^{\infty}\dot{k}_{i}(\tau)d\tau=k_{i}(\infty)-k_{i}(0)\leq k_{i}^{*}, \tag{43}$$

where $k_{i}^{*}$ is a constant determined by the bounds of $k_{i}(t)$. Hence, by Lemma 2, we can obtain that $(v_{i}+x_{i}-y_{i})(x_{i}-y_{i}+v_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}+(\dot{x}_{i}-\dot{y}_{i}))\rightarrow 0$ as $t\rightarrow\infty$.

Similarly,

$$\int_{0}^{\infty}\dot{\hat{\theta}}_{i}(\tau)d\tau=\hat{\theta}_{i}(\infty)-\hat{\theta}_{i}(0)\leq\hat{\theta}_{i}^{*}, \tag{44}$$

where $\hat{\theta}_{i}^{*}$ is a constant determined by the bounds of $\hat{\theta}_{i}$. Hence, by Lemma 2, we can obtain that $\phi_{i}(x_{i})(v_{i}+x_{i}-y_{i})\rightarrow 0$ as $t\rightarrow\infty$.

Hence, for $t=\infty$, we have $\phi_{i}(x_{i})(v_{i}+x_{i}-y_{i})=0$ and $(v_{i}+x_{i}-y_{i})(x_{i}-y_{i}+v_{i}+\phi_{i}(x_{i})\hat{\theta}_{i}+(\dot{x}_{i}-\dot{y}_{i}))=0$.

Case I: $\phi_{i}(x_{i})=0$ but $v_{i}+x_{i}-y_{i}\neq 0$ for $t=\infty$. In this case, $x_{i}-y_{i}+v_{i}+(\dot{x}_{i}-\dot{y}_{i})=0$. Recalling that $v_{i}=\dot{x}_{i}$ and that, as $t\rightarrow\infty$, $y_{i}\rightarrow x_{i}^{*}$ and $\dot{y}_{i}\rightarrow 0$, we get that

$$\dot{x}_{i}=-\frac{1}{2}(x_{i}-x_{i}^{*}), \tag{45}$$

from which it is clear that $\mathbf{x}(t)\rightarrow\mathbf{x}^{*}$ as $t\rightarrow\infty$.

Case II: $v_{i}+x_{i}-y_{i}=0$ for $t=\infty$. In this case, since $v_{i}=\dot{x}_{i}$,

$$\dot{x}_{i}=-(x_{i}-y_{i}), \tag{46}$$

as $t\rightarrow\infty$. Recalling that $\lim_{t\rightarrow\infty}(y_{i}(t)-x_{i}^{*})=0$, we can obtain that $\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0$. To this end, the conclusion is obtained. $\Box$

Similar to Corollary 1, the following result can be obtained if the communication graph is strongly connected.

Corollary 3.

Suppose that Assumptions 1-2, 4, 6 are satisfied and the communication graph is strongly connected. Then, there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$,

$$\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0 \tag{47}$$

and $k_{i}(t)$, $\hat{\theta}_{i}(t)$ for all $i\in\mathcal{N}$ stay bounded.

4.3 Distributed Nash equilibrium seeking for more general second-order systems

In this section, we consider a game in which each player $i$'s action is governed by

$$\begin{aligned}
\dot{x}_{i}=&\ b_{i1}v_{i}+\phi_{i1}(x_{i})\theta_{i1},\\
\dot{v}_{i}=&\ b_{i2}u_{i}+\phi_{i2}(x_{i},v_{i})\theta_{i2}.
\end{aligned}\tag{48}$$

Remark 4.

Note that, compared with (33), the effect of $v_{i}$ on $x_{i}$ is also uncertain in (48) as $b_{i1}$ is also unknown. In addition, an uncertain nonlinear term $\phi_{i1}(x_{i})\theta_{i1}$ is addressed as well. Hence, in this problem, $b_{i1}$ and $b_{i2}$ are both unknown control directions that should be addressed. Moreover, both $\theta_{i1}$ and $\theta_{i2}$ result in uncertain nonlinearities that should be accommodated.

Motivated by [12], the Nash equilibrium seeking strategy is designed in the following process:

Step 1: Generate a reference trajectory $y_{i}$ for $i\in\mathcal{N}$ that would converge to the Nash equilibrium according to

$$\begin{aligned}
\dot{y}_{i}=&-\nabla_{i}f_{i}(\mathbf{z}_{i}),\\
\dot{z}_{ij}=&-\delta_{ij}\left(\sum_{k=1}^{N}a_{ik}(z_{ij}-z_{kj})+a_{ij}(z_{ij}-y_{j})\right),
\end{aligned}\tag{49}$$

where $j\in\mathcal{N}$, $\mathbf{z}_{i}=[z_{i1},z_{i2},\cdots,z_{iN}]^{T}$, $\delta_{ij}=\delta\bar{\delta}_{ij}$, $\delta$ is a positive constant to be determined and $\bar{\delta}_{ij}$ is a fixed positive constant.

Step 2: Generate a reference trajectory $\alpha_{i}$ for $v_{i}$ as

$$\begin{aligned}
\alpha_{i}=&\ N_{0}(k_{i1})(x_{i}-y_{i}+\phi_{i1}(x_{i})\hat{\theta}_{i1}),\\
\dot{k}_{i1}=&\ (x_{i}-y_{i})(x_{i}-y_{i}+\phi_{i1}(x_{i})\hat{\theta}_{i1}),\\
\dot{\hat{\theta}}_{i1}=&\ \phi_{i1}(x_{i})(x_{i}-y_{i}).
\end{aligned}\tag{50}$$

Step 3: Let $\beta_{i}=v_{i}-\alpha_{i}$. Then, through direct calculation, it can be obtained that

$$\begin{aligned}
\dot{\beta}_{i}=&\ \dot{v}_{i}-\dot{\alpha}_{i}\\
=&\ b_{i2}u_{i}+\phi_{i2}(x_{i},v_{i})\theta_{i2}+\Psi_{i1}(x_{i},k_{i1},\hat{\theta}_{i1})\theta_{i1}\\
&+\Psi_{i2}(k_{i1},x_{i},y_{i},\hat{\theta}_{i1},\dot{y}_{i})+\Psi_{i3}(k_{i1},x_{i},\hat{\theta}_{i1},v_{i})b_{i1},
\end{aligned}\tag{51}$$

where

$$\begin{aligned}
\Psi_{i1}=&-N_{0}(k_{i1})\left(\phi_{i1}(x_{i})+\frac{\partial\phi_{i1}(x_{i})}{\partial x_{i}}\hat{\theta}_{i1}\phi_{i1}(x_{i})\right),\\
\Psi_{i2}=&-(2k_{i1}\sin(k_{i1})+k_{i1}^{2}\cos(k_{i1}))(x_{i}-y_{i})(x_{i}-y_{i}+\phi_{i1}(x_{i})\hat{\theta}_{i1})^{2}\\
&-N_{0}(k_{i1})(-\dot{y}_{i}+\phi_{i1}^{2}(x_{i})(x_{i}-y_{i})),\\
\Psi_{i3}=&-N_{0}(k_{i1})\left(\frac{\partial\phi_{i1}(x_{i})}{\partial x_{i}}\hat{\theta}_{i1}+1\right)v_{i}.
\end{aligned}$$

Accordingly, the control input $u_{i}$ is designed as

$$\begin{aligned}
u_{i}=&\ N_{0}(k_{i2})(\beta_{i}+\phi_{i2}\bar{\theta}_{i2}+\Psi_{i1}\bar{\theta}_{i1}+\Psi_{i2}+\Psi_{i3}\bar{b}_{i1}),\\
\dot{k}_{i2}=&\ \beta_{i}(\beta_{i}+\phi_{i2}\bar{\theta}_{i2}+\Psi_{i1}\bar{\theta}_{i1}+\Psi_{i2}+\Psi_{i3}\bar{b}_{i1}),\\
\dot{\bar{\theta}}_{i2}=&\ \beta_{i}\phi_{i2},\quad\dot{\bar{\theta}}_{i1}=\beta_{i}\Psi_{i1},\\
\dot{\bar{b}}_{i1}=&\ \beta_{i}\Psi_{i3}.
\end{aligned}\tag{52}$$

Remark 5.

Note that (50) is designed to drive $x_{i}$ to $y_{i}$ and (52) is designed to drive $v_{i}$ to $\alpha_{i}$. The design of the control input in (52) is motivated by [12], which treats $v_{i}$ as a virtual control input for $x_{i}$. To deal with the unknown constants $b_{i1}$ and $b_{i2}$, two Nussbaum functions are included. To accommodate multiple Nussbaum functions, the idea is to design the control input such that $\beta_{i}$ is square integrable (see also [12]).
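
The decomposition of $\dot{\alpha}_{i}$ used in (51) can be cross-checked numerically. The sketch below differentiates $\alpha_{i}$ from (50) along the flow of (48) and (50) with a central difference and compares the result with $-(\Psi_{i1}\theta_{i1}+\Psi_{i2}+\Psi_{i3}b_{i1})$. The regressor $\phi_{i1}(x_{i})=x_{i}$, the sampling ranges, the step `h` and all function names are assumptions made only for this check.

```python
import math
import random


def n0(k):
    """Nussbaum-type gain N0(k) = k^2 sin(k)."""
    return k * k * math.sin(k)


def dn0(k):
    """Derivative of N0 with respect to k."""
    return 2.0 * k * math.sin(k) + k * k * math.cos(k)


def phi1(x):
    return x            # hypothetical regressor phi_i1(x_i) = x_i


def dphi1(x):
    return 1.0          # its derivative with respect to x_i


def alpha(k1, x, y, th1):
    """Virtual control alpha_i from (50)."""
    return n0(k1) * (x - y + phi1(x) * th1)


def alpha_dot_pair(rng, h=1e-6):
    """Return (finite-difference alpha_dot, alpha_dot rebuilt from Psi_i1..Psi_i3)."""
    k1, x, y, th1, v, y_dot, b1, theta1 = (rng.uniform(-2.0, 2.0) for _ in range(8))
    x_dot = b1 * v + phi1(x) * theta1                 # first line of (48)
    k1_dot = (x - y) * (x - y + phi1(x) * th1)        # second line of (50)
    th1_dot = phi1(x) * (x - y)                       # third line of (50)

    # Central difference of alpha along the flow direction
    fwd = alpha(k1 + h * k1_dot, x + h * x_dot, y + h * y_dot, th1 + h * th1_dot)
    bwd = alpha(k1 - h * k1_dot, x - h * x_dot, y - h * y_dot, th1 - h * th1_dot)
    finite_diff = (fwd - bwd) / (2.0 * h)

    psi1 = -n0(k1) * (phi1(x) + dphi1(x) * th1 * phi1(x))
    psi2 = (-dn0(k1) * (x - y) * (x - y + phi1(x) * th1) ** 2
            - n0(k1) * (-y_dot + phi1(x) ** 2 * (x - y)))
    psi3 = -n0(k1) * (dphi1(x) * th1 + 1.0) * v
    from_psi = -(psi1 * theta1 + psi2 + psi3 * b1)
    return finite_diff, from_psi


if __name__ == "__main__":
    rng = random.Random(1)
    gaps = [abs(fd - fp) for fd, fp in (alpha_dot_pair(rng) for _ in range(500))]
    print("largest mismatch:", max(gaps))
```

Only $\Psi_{i1}$ and $\Psi_{i3}$ multiply unknown constants ($\theta_{i1}$ and $b_{i1}$), which is why (52) carries the separate estimates $\bar{\theta}_{i1}$ and $\bar{b}_{i1}$, while $\Psi_{i2}$ is computable and is cancelled directly.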

The following theorem establishes the stability of Nash equilibrium under the control input designed in (52).

Theorem 3.

Suppose that Assumptions 1-3 and 5 are satisfied. Then, there exists a positive constant $\delta^{*}$ such that for each $\delta\in(\delta^{*},\infty)$,

$$\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0, \tag{53}$$

and all the other variables stay bounded.

Proof: The proof is similar to those in [12] and Theorem 2. For the convenience of the readers, a sketch of the proof is given as follows.

Step 1: Show that $\beta_{i}$ is square integrable by defining the sub-Lyapunov candidate function as

$$\begin{aligned}
V_{i1}=&\ \frac{1}{2}\beta_{i}^{2}+\frac{1}{2}(\bar{\theta}_{i2}-\theta_{i2})^{2}\\
&+\frac{1}{2}(\bar{\theta}_{i1}-\theta_{i1})^{2}+\frac{1}{2}(\bar{b}_{i1}-b_{i1})^{2}.
\end{aligned}\tag{54}$$

Then, following the proof of Theorem 2, it can be obtained that

$$\dot{V}_{i1}\leq-\beta_{i}^{2}+(b_{i2}N_{0}(k_{i2})+1)\dot{k}_{i2}. \tag{55}$$
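
The step from (54) to (55) can be spelled out as follows. This is a reconstruction from (51), (52) and (54) rather than a derivation quoted from the source, and the shorthand $S_{i}$ is introduced here only for compactness.

```latex
\begin{aligned}
S_{i} &:= \beta_{i}+\phi_{i2}\bar{\theta}_{i2}+\Psi_{i1}\bar{\theta}_{i1}+\Psi_{i2}+\Psi_{i3}\bar{b}_{i1},
\qquad u_{i}=N_{0}(k_{i2})S_{i},\qquad \dot{k}_{i2}=\beta_{i}S_{i},\\
\dot{V}_{i1} &= \beta_{i}\dot{\beta}_{i}+(\bar{\theta}_{i2}-\theta_{i2})\beta_{i}\phi_{i2}
+(\bar{\theta}_{i1}-\theta_{i1})\beta_{i}\Psi_{i1}+(\bar{b}_{i1}-b_{i1})\beta_{i}\Psi_{i3}\\
&= b_{i2}N_{0}(k_{i2})\beta_{i}S_{i}
+\beta_{i}\left(\phi_{i2}\bar{\theta}_{i2}+\Psi_{i1}\bar{\theta}_{i1}+\Psi_{i2}+\Psi_{i3}\bar{b}_{i1}\right)\\
&= b_{i2}N_{0}(k_{i2})\dot{k}_{i2}+\beta_{i}(S_{i}-\beta_{i})\\
&= -\beta_{i}^{2}+(b_{i2}N_{0}(k_{i2})+1)\dot{k}_{i2}.
\end{aligned}
```

Under this reading, the terms involving the unknown $\theta_{i2}$, $\theta_{i1}$ and $b_{i1}$ cancel exactly against the adaptive laws in (52), so (55) in fact holds with equality.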

Moreover, integrating both sides of (55) over $[0,t_{f})$, we get that

$$\int_{0}^{t_{f}}\dot{V}_{i1}d\tau\leq-\int_{0}^{t_{f}}\beta_{i}^{2}d\tau+\int_{0}^{t_{f}}(b_{i2}N_{0}(k_{i2})+1)\dot{k}_{i2}d\tau, \tag{56}$$

from which it can be obtained that $V_{i1}$, $k_{i2}$ and $\int_{0}^{t_{f}}(b_{i2}N_{0}(k_{i2})+1)\dot{k}_{i2}d\tau$ are bounded by Lemma 1.

Moreover, from (56), it is clear that

$$\int_{0}^{t_{f}}\beta_{i}^{2}d\tau\leq V_{i1}(0)-V_{i1}(t_{f})+\int_{0}^{t_{f}}(b_{i2}N_{0}(k_{i2})+1)\dot{k}_{i2}d\tau. \tag{57}$$

Hence, $\beta_{i}$ is square integrable for $t\in[0,t_{f})$.

Step 2: Show that $x_{i}$ can be driven to $y_{i}$ by defining the other sub-Lyapunov function as

$$V_{i2}=\frac{1}{2}(x_{i}-y_{i})^{2}+\frac{1}{2}(\hat{\theta}_{i1}-\theta_{i1})^{2}. \tag{58}$$

Then, the time derivative of $V_{i2}$ is

$$\begin{aligned}
\dot{V}_{i2}=&\ (x_{i}-y_{i})(b_{i1}\alpha_{i}+b_{i1}\beta_{i}+\phi_{i1}(x_{i})\theta_{i1}-\dot{y}_{i})+(\hat{\theta}_{i1}-\theta_{i1})\dot{\hat{\theta}}_{i1}\\
=&\ (x_{i}-y_{i})\left(b_{i1}N_{0}(k_{i1})(x_{i}-y_{i}+\phi_{i1}(x_{i})\hat{\theta}_{i1})+b_{i1}\beta_{i}+\phi_{i1}(x_{i})\theta_{i1}-\dot{y}_{i}\right)\\
&+(\hat{\theta}_{i1}-\theta_{i1})\dot{\hat{\theta}}_{i1}\\
\leq&-(x_{i}-y_{i})^{2}+(b_{i1}N_{0}(k_{i1})+1)\dot{k}_{i1}+(x_{i}-y_{i})b_{i1}\beta_{i}-(x_{i}-y_{i})\dot{y}_{i}\\
\leq&-\frac{1}{2}(x_{i}-y_{i})^{2}+(b_{i1}N_{0}(k_{i1})+1)\dot{k}_{i1}+\frac{C_{i1}}{2}b_{i1}^{2}\beta_{i}^{2}+\frac{C_{i2}}{2}\dot{y}_{i}^{2},
\end{aligned}\tag{59}$$

where $C_{i1},C_{i2}$ are positive constants that satisfy $\frac{1}{C_{i1}}+\frac{1}{C_{i2}}\leq 1$. Noticing that both $\beta_{i}$ and $\dot{y}_{i}$ are square integrable for $t\in[0,t_{f})$, we obtain that $V_{i2}$ and $k_{i1}$ are bounded for $t\in[0,t_{f})$. Combining the above two steps, it can be seen that $x_{i}-y_{i}$, $k_{i1}$, $\hat{\theta}_{i1}$ as well as $\beta_{i}$, $k_{i2}$, $\bar{\theta}_{i1}$, $\bar{\theta}_{i2}$ and $\bar{b}_{i1}$ are all bounded. Recalling the definition of $\alpha_{i}$, it can be obtained that $v_{i}=\alpha_{i}+\beta_{i}$ is bounded. Furthermore, $x_{i}$ is bounded as $y_{i}$ is bounded by Lemma 3. To this end, we have shown that all the variables contained in the closed-loop system are bounded for $t\in[0,t_{f})$, indicating that there is no finite-time escape and $t_{f}=\infty$.

Step 3: The rest of the analysis follows the proof of Theorem 2. First, show that the time derivatives of $\dot{k}_{i1}$ and $\dot{\hat{\theta}}_{i1}$ are bounded, so that $\dot{k}_{i1}$ and $\dot{\hat{\theta}}_{i1}$ are uniformly continuous. Then, integrate them over $[0,\infty)$ to show that their integrals are bounded. With these conclusions, by Barbalat's lemma, $\lim_{t\rightarrow\infty}\dot{k}_{i1}=0$ and $\lim_{t\rightarrow\infty}\dot{\hat{\theta}}_{i1}=0$, from which it can be obtained that $\lim_{t\rightarrow\infty}(x_{i}(t)-y_{i}(t))=0$ by following the arguments in the proof of Theorem 1. $\Box$

Remark 6.

The system dynamics considered in (48) is similar to the one in [12] and the state regulation part is motivated by [12]. However, different from [12], which regulates the state to zero, this paper needs to regulate the state to a time-varying reference trajectory ($y_{i}(t)$ for $i\in\mathcal{N}$) generated by the optimization module.

5 Discussions on Fully Distributed Implementation of the Proposed Algorithms

In Section 4, the proposed seeking strategies contain a centralized control gain $\delta$, which depends on the players' objective functions and the communication graph. In general, such centralized information can hardly be obtained. In [28], we proposed fully distributed Nash equilibrium seeking strategies by adaptively adjusting the control gains. In the following, we further prove that the adaptive algorithms in [28] can also be utilized in the proposed algorithms to achieve fully distributed Nash equilibrium seeking in the considered problem.

By the methods in [28], one can replace (11)-(12) in the proposed algorithms with

$$\begin{aligned}
\dot{y}_{i}=&-\nabla_{i}f_{i}(\mathbf{z}_{i}),\\
\dot{z}_{ij}=&-\delta_{ij}\left(\sum_{k=1}^{N}a_{ik}(z_{ij}-z_{kj})+a_{ij}(z_{ij}-y_{j})\right),\\
\dot{\delta}_{ij}=&\left(\sum_{k=1}^{N}a_{ik}(z_{ij}-z_{kj})+a_{ij}(z_{ij}-y_{j})\right)^{2},
\end{aligned}\tag{60}$$

for $i\in\mathcal{N}$.

Then, the following result can be obtained.

Lemma 4.

Suppose that Assumptions 1-3 are satisfied. Then, with the strategy in (60), the following conclusions can be obtained:

  • For each $i,j\in\mathcal{N}$, $y_{i}(t)$, $z_{ij}(t)$ and $\delta_{ij}(t)$ are bounded for $t\in[0,\infty)$.

  • For each $i\in\mathcal{N}$, $\dot{y}_{i}(t)$ is square integrable over $t\in(0,\infty)$, i.e., $\int_{0}^{\infty}\dot{y}_{i}^{2}(s)ds\leq c_{i}$ for some positive constant $c_{i}$.

Proof: Following the proof in [28], define

$$V=\mathbf{e}^{T}M\mathbf{e}+\frac{1}{2}(\mathbf{y}-\mathbf{x}^{*})^{T}(\mathbf{y}-\mathbf{x}^{*})+\sum_{i=1}^{N}\sum_{j=1}^{N}(\delta_{ij}-\delta_{ij}^{*})^{2},$$

where $\delta_{ij}^{*}>\frac{8m||M||\sqrt{N}\max_{i\in\mathcal{V}}\{l_{i}\}+(2||M||N+\max_{i\in\mathcal{V}}\{l_{i}\})^{2}}{8m\lambda_{min}(MM)}$, $\mathbf{e}=[z_{11}-y_{1},z_{12}-y_{2},\cdots,z_{1N}-y_{N},z_{21}-y_{1},\cdots,z_{NN}-y_{N}]^{T}$, $\mathbf{y}=[y_{1},y_{2},\cdots,y_{N}]^{T}$, $M=\mathcal{L}\otimes I_{N\times N}+\mathcal{A}_{0}$ and $\mathcal{A}_{0}$ is a diagonal matrix with its elements being $a_{ij}$.

Then, it follows from [28] that

$$\dot{V}\leq-a||\mathbf{E}||^{2}, \tag{61}$$

where $a>0$ and $\mathbf{E}=[(\mathbf{y}-\mathbf{x}^{*})^{T},\mathbf{e}^{T}]^{T}$, from which it can be obtained that for each $i,j\in\mathcal{N}$, $y_{i}$, $z_{ij}$ and $\delta_{ij}$ are bounded for $t\in[0,\infty)$.

Moreover,

$$\begin{aligned}
\int_{0}^{\infty}\dot{y}_{i}^{2}(s)ds\leq&\int_{0}^{\infty}|\nabla_{i}f_{i}(\mathbf{z}_{i}(s))-\nabla_{i}f_{i}(\mathbf{x}^{*})|^{2}ds\\
\leq&\ l_{i}^{2}\int_{0}^{\infty}||\mathbf{z}_{i}(s)-\mathbf{x}^{*}||^{2}ds.
\end{aligned}\tag{62}$$

Taking integration on both sides of (61), we obtain that

$$\int_{0}^{\infty}\dot{V}(s)ds\leq-a\int_{0}^{\infty}||\mathbf{E}(s)||^{2}ds, \tag{63}$$

by which

$$V(\infty)+a\int_{0}^{\infty}||\mathbf{E}||^{2}ds\leq V(0). \tag{64}$$

By further noticing that

$$\int_{0}^{\infty}||\mathbf{z}_{i}(s)-\mathbf{x}^{*}||^{2}ds\leq\int_{0}^{\infty}||\mathbf{E}||^{2}ds, \tag{65}$$

we obtain that

$$V(\infty)+a\int_{0}^{\infty}||\mathbf{z}_{i}(s)-\mathbf{x}^{*}||^{2}ds\leq V(0). \tag{66}$$

Hence

$$\int_{0}^{\infty}\dot{y}_{i}^{2}(s)ds\leq\frac{(V(0)-V(\infty))l_{i}^{2}}{a}, \tag{67}$$

thus arriving at the second conclusion. \Box
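
The adaptive-gain module (60) can be sketched on the same kind of toy problem. As before, the setting is an assumption made purely for illustration and is not taken from the paper or from [28]: a two-player quadratic game with costs $f_{1}=x_{1}^{2}+x_{1}x_{2}-2x_{1}$ and $f_{2}=x_{2}^{2}+x_{1}x_{2}-2x_{2}$ (Nash equilibrium $x_{1}^{*}=x_{2}^{*}=2/3$), a two-node undirected graph, zero initial gains, and forward-Euler integration with hand-picked `dt` and `t_end`.

```python
def simulate_adaptive(dt=1e-3, t_end=100.0):
    """Forward-Euler integration of (60) for a hypothetical 2-player game.

    No centralized gain delta is supplied: every delta_ij starts at zero and
    is driven by the squared local consensus error.
    """
    n = 2
    a = [[0.0, 1.0], [1.0, 0.0]]           # adjacency matrix of the graph 1 - 2
    y = [3.0, -2.0]                        # reference actions y_i
    z = [[0.0] * n for _ in range(n)]      # z[i][j]: player i's estimate of y_j
    gain = [[0.0] * n for _ in range(n)]   # adaptive gains delta_ij, delta_ij(0) = 0

    for _ in range(int(t_end / dt)):
        # r[i][j] is the bracketed term shared by the z- and delta-updates in (60)
        r = [[sum(a[i][k] * (z[i][j] - z[k][j]) for k in range(n))
              + a[i][j] * (z[i][j] - y[j]) for j in range(n)] for i in range(n)]
        # Partial gradients of the assumed costs, evaluated at the local estimates
        y_dot = [-(2.0 * z[i][i] + z[i][1 - i] - 2.0) for i in range(n)]
        z = [[z[i][j] - dt * gain[i][j] * r[i][j] for j in range(n)] for i in range(n)]
        gain = [[gain[i][j] + dt * r[i][j] ** 2 for j in range(n)] for i in range(n)]
        y = [y[i] + dt * y_dot[i] for i in range(n)]
    return y, z, gain


if __name__ == "__main__":
    y, z, gain = simulate_adaptive()
    print("y =", y)
    print("delta =", gain)
```

Each gain can only increase, and it stops growing once the local consensus error has died out, which is consistent with the boundedness of $\delta_{ij}(t)$ stated in Lemma 4.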

With the results in Lemma 4, fully distributed implementations of the proposed algorithms can be achieved, as stated in the following theorem.

Theorem 4.

Suppose that Assumptions 1-4 are satisfied, and consider the system in (3) with the control input in (9)-(10), where $y_{i}$ is generated by (60). Then,

$$\lim_{t\rightarrow\infty}||\mathbf{x}(t)-\mathbf{x}^{*}||=0, \tag{68}$$

and all the other variables stay bounded.

It is worth mentioning that, for the systems considered in (4) and (5) and the control inputs designed for them, one can replace $y_{i}$ therein with the one generated by (60) to achieve fully distributed implementations of the proposed algorithms. We only present the result for the system in (3) and omit the rest to avoid repetition.

Remark 7.

In this section, we only provide an example to illustrate the fully distributed implementations of the proposed algorithms. However, it is worth noting that the proposed algorithms actually provide a general framework to deal with games in systems with unknown control directions. That is, one may utilize alternative approaches that result in square integrable $\dot{y}_{i}$ and bounded state variables as well as bounded time derivatives to achieve fully distributed Nash equilibrium seeking for systems with unknown control directions.

Remark 8.

The modular design in this paper is motivated by [25] and [31]. Though it was required that the controls should be bounded in [25], the control directions were supposed to be known. Moreover, [31] designed an extremum seeker through robust state regulation and numerical optimization, in which the control directions are also considered to be known. Different from [19], which considered distributed optimization problems with unknown control directions, this paper addresses Nash equilibrium seeking problems with both unknown control directions and parametric uncertainties. In particular, the existence of multiple unknown control directions and uncertain parameters is addressed. Though we only investigate first-order and second-order systems analytically in this paper, we believe that, under the proposed framework, it is not challenging to extend the current results to high-order systems by backstepping techniques.

Though, for simplicity of presentation, we suppose that $x_{i}\in\mathbb{R}$, it should be noted that the presented results can be directly adapted to games in which the players' actions are of multiple, heterogeneous dimensions. In the subsequent section, an example in which $x_{i}\in\mathbb{R}^{2}$ for $i\in\mathcal{N}$ will be numerically studied.

6 A Numerical Example

In this section, we consider the connectivity control game among a network of $7$ mobile sensors considered in [1]. The objective function of player $i$ engaged in the game is defined as

$$F_{i}(\mathbf{x})=h_{i}(x_{i})+l_{i}(\mathbf{x}), \tag{69}$$

where $x_{i}=[x_{i1},x_{i2}]^{T}\in\mathbb{R}^{2}$ and

$$h_{i}(x_{i})=x^{T}_{i}m_{ii}x_{i}+x^{T}_{i}m_{i}+i^{2}, \tag{70}$$

in which $m_{ii}=\mathrm{diag}\{2i,i\}$ and $m_{i}=[i,2i]^{T}$. Moreover, $l_{1}(\mathbf{x})=\lVert x_{1}-x_{2}\rVert^{2}$, $l_{2}(\mathbf{x})=\lVert x_{2}-x_{3}\rVert^{2}$, $l_{3}(\mathbf{x})=\lVert x_{3}-x_{1}\rVert^{2}$, $l_{4}(\mathbf{x})=\lVert x_{4}-x_{3}\rVert^{2}$, $l_{5}(\mathbf{x})=\lVert x_{5}-x_{1}\rVert^{2}+\lVert x_{5}-x_{6}\rVert^{2}$, $l_{6}(\mathbf{x})=\lVert x_{6}-x_{3}\rVert^{2}+\lVert x_{6}-x_{1}\rVert^{2}$ and $l_{7}(\mathbf{x})=\lVert x_{7}-x_{2}\rVert^{2}$. It can be calculated that the Nash equilibrium of the game is $x^{*}_{i1}=-\frac{1}{4}$ and $x^{*}_{i2}=-1$ for $i\in\{1,2,\cdots,7\}$. In the simulation, the undirected and connected communication graph is plotted in Fig. 2. In the following, games with dynamics in (3), (4) and (5) will be numerically explored successively.

Figure 2: The communication graph among the players.
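
The stated Nash equilibrium can be checked against the first-order conditions of (69)-(70). The sketch below evaluates each player's partial gradient $\partial F_{i}/\partial x_{i}$ at $x_{i}^{*}=[-\frac{1}{4},-1]^{T}$. The dictionary `COUPLED` simply transcribes which players appear in each $l_{i}$ above; the function and variable names are hypothetical. Since each $F_{i}$ is a strongly convex quadratic in $x_{i}$, a vanishing partial gradient is sufficient for a best response.

```python
# Players whose actions appear in l_i(x), transcribed from the definitions of l_1..l_7
COUPLED = {1: [2], 2: [3], 3: [1], 4: [3], 5: [1, 6], 6: [3, 1], 7: [2]}


def partial_gradient(i, x):
    """Gradient of F_i in (69) with respect to player i's own action x_i.

    x maps each player index 1..7 to its action [x_i1, x_i2].
    """
    m_ii = (2.0 * i, 1.0 * i)     # diagonal of m_ii = diag{2i, i}
    m_i = (1.0 * i, 2.0 * i)      # m_i = [i, 2i]^T
    grad = []
    for c in range(2):
        g = 2.0 * m_ii[c] * x[i][c] + m_i[c]                       # from h_i in (70)
        g += sum(2.0 * (x[i][c] - x[j][c]) for j in COUPLED[i])    # from l_i
        grad.append(g)
    return grad


if __name__ == "__main__":
    x_star = {i: [-0.25, -1.0] for i in range(1, 8)}
    for i in range(1, 8):
        print(i, partial_gradient(i, x_star))
```

At the candidate point all players share the same action, so every coupling term $2(x_{i}-x_{j})$ vanishes and the condition reduces to $2m_{ii}x_{i}+m_{i}=0$, which gives $x_{i1}=-\frac{1}{4}$ and $x_{i2}=-1$ for every $i$.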

6.1 Distributed Nash equilibrium seeking for first-order systems

In this section, we simulate the first-order systems in (3), where the control input is designed in (9)-(12). Note that as $x_{i}\in\mathbb{R}^{2}$, $b_{i}\in\mathbb{R}^{2\times 2}$. In the simulation, $b_{1}=\mathrm{diag}\{3,3\}$, $b_{2}=\mathrm{diag}\{5,5\}$, $b_{3}=\mathrm{diag}\{-2,-2\}$, $b_{4}=\mathrm{diag}\{1,2\}$, $b_{5}=\mathrm{diag}\{-3,-3\}$, $b_{6}=\mathrm{diag}\{-1,-1\}$ and $b_{7}=\mathrm{diag}\{2,2\}$. Moreover, $\phi_{i}=ix_{i}$. Let $\mathbf{x}(0)=[-5,3,-4,-6,1,8,0,-8,-1,10,1,2,3,0]^{T}$, and let the initial values of all the other variables in (9)-(12) be zero. The players' action trajectories $x_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ generated by (9)-(12) are plotted in Fig. 3, from which it is clear that the players' action trajectories converge to the Nash equilibrium asymptotically. Moreover, Figs. 4-5 illustrate the trajectories of $k_{ij}(t)$ and $\hat{\theta}_{ij}(t)$ for all $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$, respectively. From Figs. 4-5, it can be seen that these variables stay bounded. Therefore, Theorem 1 is numerically validated.

Figure 3: The trajectories of $x_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ generated by (9)-(12).
Figure 4: The trajectories of $k_{ij}$ for $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$ generated by (9)-(12).
Figure 5: The trajectories of $\hat{\theta}_{ij}$ for $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$ generated by (9)-(12).

6.2 Distributed Nash equilibrium seeking for second-order systems

In this section, we simulate the system in (4), where the control input is designed in (34)-(37). In the simulation, $b_{i}$, $\phi_{i}(x_{i})$ and $\mathbf{x}(0)$ follow those in Section 6.1 and the initial values of all the other variables are zero.

The players' action trajectories $x_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ generated by (34)-(37) are depicted in Fig. 6, from which it can be seen that the players' actions converge to the actual Nash equilibrium of the game. In addition, $k_{ij}(t)$ and $\hat{\theta}_{ij}(t)$ for all $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$ are given in Figs. 7-8, from which we can conclude that they stay bounded. Furthermore, Fig. 9 demonstrates that $v_{i}(t)$ for all $i\in\{1,2,\cdots,7\}$ decay to zero, which is aligned with the results in Theorem 2. To this end, the conclusions in Theorem 2 have been numerically verified.

Figure 6: The trajectories of $x_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ generated by (34)-(37).
Figure 7: The trajectories of $k_{ij}$ for $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$ generated by (34)-(37).
Figure 8: The trajectories of $\hat{\theta}_{ij}$ for $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$ generated by (34)-(37).
Figure 9: The trajectories of $v_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ generated by (34)-(37).

6.3 Distributed Nash equilibrium seeking for more general second-order uncertain nonlinear systems

In this section, we numerically verify the control input designed for the uncertain nonlinear systems in (5). To illustrate the case, we suppose that the action of player $7$ is governed by (5), and its control input is given by (49)-(52), while all the other players' actions are governed by (4) with their control inputs being (34)-(37). For players $1$-$6$, $b_{i}$ and $\phi_{i}(x_{i})$ are chosen to be the same as those in Section 6.2. For player $7$, $b_{71}=b_{72}=\mathrm{diag}\{2,2\}$, $\phi_{71}(x_{7})=7x_{7}$ and $\phi_{72}(x_{7},v_{7})=[7x_{72},7v_{72}]^{T}$. In addition, $\mathbf{x}(0)=[-5,3,-4,-6,1,8,0,-8,-1,10,1,2,3,0]^{T}$ and the initial conditions of all the other variables are zero. The players' actions $x_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ generated by the proposed methods are shown in Fig. 10, from which we see that the players' actions converge to the Nash equilibrium. Moreover, $k_{i1}(t)$ and $\hat{\theta}_{i1}(t)$ for $i\in\{1,2,\cdots,7\}$ are plotted in Figs. 11-12, which show that $k_{i1}(t)$ and $\hat{\theta}_{i1}(t)$ stay bounded. The evolution of $v_{i}(t)$ is shown in Fig. 13, which shows that $v_{i}(t)$ for all $i\in\{1,2,\cdots,7\}$ are also bounded. Hence, the effectiveness of the method in (49)-(52) is also verified.

Figure 10: The trajectories of $x_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ with player $7$'s control strategy being (49)-(52) and the rest of the players' control strategy being (34)-(37).
Figure 11: The trajectories of $k_{i1}(t)$ for $i\in\{1,2,\cdots,7\}$ with player $7$'s control strategy being (49)-(52) and the rest of the players' control strategy being (34)-(37).
Figure 12: The trajectories of $\hat{\theta}_{i1}(t)$ for $i\in\{1,2,\cdots,7\}$ with player $7$'s control strategy being (49)-(52) and the rest of the players' control strategy being (34)-(37).
Figure 13: The trajectories of $v_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ with player $7$'s control input being (49)-(52) and the rest of the players' control input being (34)-(37).

6.4 Fully distributed implementations of the proposed algorithms

To verify the fully distributed implementations of the proposed methods, we take first-order systems as an example. The simulation setting of this section follows that of Section 6.1, with $\delta_{ij}(0)=0$ for all $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$. The simulation results for the system considered in (3) with the control input in (9)-(10), where $y_{i}$ is generated by (60), are given in Figs. 14-17. Fig. 14 plots the players' actions, from which we see that the players' actions converge to the Nash equilibrium. Moreover, Figs. 15-17 plot $k_{i}(t)$, $\hat{\theta}_{i}(t)$ and $\delta_{ij}(t)$, respectively, from which it is clear that they stay bounded. Therefore, the results in Theorem 4 are numerically verified. Note that, compared with the simulation in Section 6.1, there is no centralized control gain in (60), thus verifying the effectiveness of the distributively implemented algorithms.

Figure 14: The trajectories of $x_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ for the system considered in (3) with the control input in (9)-(10), where $y_{i}$ is generated by (60).
Figure 15: The trajectories of $k_{i}(t)$ for $i\in\{1,2,\cdots,7\}$ for the system considered in (3) with the control input in (9)-(10), where $y_{i}$ is generated by (60).
Figure 16: The trajectories of $\hat{\theta}_{ij}(t)$ for $i\in\{1,2,\cdots,7\}$ for the system considered in (3) with the control input in (9)-(10), where $y_{i}$ is generated by (60).
Figure 17: The trajectories of $\delta_{ij}(t)$ for $i\in\{1,2,\cdots,7\}$, $j\in\{1,2\}$ for the system considered in (3) with the control input in (9)-(10), where $y_{i}$ is generated by (60).

7 Conclusion

This paper considers distributed Nash equilibrium seeking for games in which the players' actions are subject to both unknown control directions and parametric uncertainties. First-order and second-order systems are addressed successively. To cope with the unavailability of the control directions, a Nussbaum function is adopted, and the parametric uncertainties are addressed by adaptive laws. The seeking strategy combines an optimization module with a state regulation module. Based on Barbalat's lemma, it is proven that the players' actions can be driven to the Nash equilibrium. Lastly, fully distributed implementations of the proposed algorithms are discussed, and it is shown that adaptive techniques can be employed to achieve equilibrium seeking in a fully distributed way.

References

  • [1] M. Ye, ``Distributed robust seeking of Nash equilibrium for networked games: an extended state observer-based approach," IEEE Transactions on Cybernetics, accepted, published online, DOI: 10.1109/TCYB.2020.2989755.
  • [2] M. Ye and G. Hu, ``Game design and analysis for price-based demand response: an aggregate game approach," IEEE Transactions on Cybernetics, vol. 47, no. 3, pp. 720-730, 2017.
  • [3] M. Ye, G. Hu, F. Lewis, L. Xie, ``A unified strategy for solution seeking in graphical $N$-coalition noncooperative games," IEEE Transactions on Automatic Control, vol. 64, no. 11, pp. 4645-4652, 2019.
  • [4] J. Ma, M. Ye, Y. Zheng, Y. Zhu, ``Consensus analysis of hybrid multiagent systems: a game-theoretic approach," International Journal of Robust and Nonlinear Control, vol. 29, pp. 1840-1853, 2019.
  • [5] M. Ye and G. Hu, ``Distributed Nash equilibrium seeking by a consensus based approach," IEEE Transactions on Automatic Control, vol. 62, no. 9, pp. 4811-4818, 2017.
  • [6] P. Jiang, P. Woo, R. Unbehauen, ``Iterative learning control for manipulator trajectory tracking without any control singularity," Robotica, vol. 20, no. 2, pp. 149-158, 2002.
  • [7] J. Du, C. Guo, S. Yu, Y. Zhao, ``Adaptive autopilot design of time-varying uncertain ships with completely unknown control coefficients," IEEE Journal of Oceanic Engineering, vol. 32, no. 2, pp. 346-352, 2007.
  • [8] R. Nussbaum, ``Some remarks on a conjecture in parameter adaptive control," Systems and Control Letters, vol. 3, no. 5, pp. 243-246, 1983.
  • [9] X. Bu, D. Wei, X. Wu, J. Huang, ``Guaranteeing preselected tracking quality for air-breathing hypersonic non-affine models with an unknown control direction via concise neural control," Journal of the Franklin Institute, vol. 353, no. 13, pp. 3207-3232, 2016.
  • [10] L. Wang, W. Deng, J. Liu and R. Mei, ``Adaptive sliding mode trajectory tracking control of quadrotor UAV with unknown control direction," In: R. Wang, Z. Chen, W. Zhang, Q. Zhu (eds) Proceedings of the 11th International Conference on Modelling, Identification and Control, Lecture Notes in Electrical Engineering, vol. 582. Springer, Singapore.
  • [11] J. Peng, X. Ye, ``Cooperative control of multiple heterogeneous agents with unknown high-frequency-gain signs," Systems and Control Letters, vol. 68, pp. 51-56, 2014.
  • [12] X. Ye and J. Jiang, ``Adaptive nonlinear design without a priori knowledge of control directions," IEEE Transactions on Automatic Control, vol. 43, no. 11, pp. 1617-1621, 1998.
  • [13] Z. Chen, ``Nussbaum functions in adaptive control with time-varying unknown control coefficients," Automatica, vol. 102, pp. 72-79, 2019.
  • [14] H. Khalil, Nonlinear Systems, Upper Saddle River, NJ: Prentice Hall, 2002.
  • [15] A. Scheinker, M. Krstic, ``Minimum-seeking for CLFs: universal semiglobally stabilizing feedback under unknown control directions," IEEE Transactions on Automatic Control, vol. 58, no. 5, pp. 1107-1122, 2013.
  • [16] C. Chen, C. Wen, Z. Liu, K. Xie, Y. Zhang, C. Chen, ``Adaptive consensus of nonlinear multi-agent systems with non-identical partially unknown control directions and bounded modelling errors," IEEE Transactions on Automatic Control, vol. 62, no. 9, pp. 4654-4659, 2017.
  • [17] C. Yang, S. Ge, T. Lee, ``Output feedback adaptive control of a class of nonlinear discrete-time systems with unknown control directions," Automatica, vol. 45, no. 1, pp. 270-276, 2008.
  • [18] M. Guo, D. Xu, L. Liu, ``Cooperative output regulation of heterogeneous nonlinear multi-agent systems with unknown control directions," IEEE Transactions on Automatic Control, vol. 62, no. 6, pp. 3039-3045, 2017.
  • [19] Y. Tang, ``Multi-agent optimal consensus with unknown control directions," arXiv:2005.10492.
  • [20] J. Koshal, A. Nedic and U. Shanbhag, ``Distributed algorithms for aggregative games on graphs," Operations Research, vol. 64, pp. 680-704, 2016.
  • [21] F. Salehisadaghiani and L. Pavel, ``Distributed Nash equilibrium seeking: A gossip-based algorithm," Automatica, vol. 72, pp. 209-216, 2016.
  • [22] M. Ye, G. Hu and S. Xu, ``An extremum seeking-based approach for Nash equilibrium seeking in $N$-cluster noncooperative games," Automatica, vol. 114, 108815, 2020.
  • [23] M. Ye, G. Hu, and F. L. Lewis, ``Nash equilibrium seeking for $N$-coalition non-cooperative games," Automatica, vol. 95, pp. 266-272, 2018.
  • [24] A. Ibrahim, T. Hayakawa, ``Nash equilibrium seeking with second-order dynamic agents," IEEE Conference on Decision and Control, pp. 2514-2518, 2018.
  • [25] M. Ye, ``Distributed Nash equilibrium seeking for games in systems with bounded control inputs," submitted to IEEE Transactions on Automatic Control, revised, available online: arXiv:1901.09333, 2019.
  • [26] J. Slotine, W. Li, Applied Nonlinear Control, Prentice Hall, Englewood Cliffs, 1991.
  • [27] M. Ye, G. Hu, ``Distributed Nash equilibrium seeking in multi-agent games under switching communication topologies," IEEE Transactions on Cybernetics, vol. 48, no. 11, pp. 3208-3217, 2018.
  • [28] M. Ye, G. Hu, ``Adaptive approaches for fully distributed Nash equilibrium seeking in networked games," submitted to Automatica, revised, available online: arXiv:1912.00415.
  • [29] J. Huang, Y. Song, W. Wang, C. Wen and G. Li, ``Fully distributed adaptive consensus control of a class of high-order nonlinear systems with a directed topology and unknown control directions," IEEE Transactions on Cybernetics, vol. 48, no. 8, pp. 2349-2356, 2018.
  • [30] Z. Li, Z. Duan, Cooperative Control of Multi-agent Systems: A Consensus Region Approach, Taylor and Francis/CRC Press, Boca Raton, FL, 2014. ISBN: 978-1-4665-6994-2.
  • [31] M. Ye, G. Hu, ``A robust extremum seeking scheme for dynamic systems with uncertainties and disturbances," Automatica, vol. 66, pp. 172-178, 2016.