
Nash Equilibrium Seeking in N-Coalition Games via a Gradient-Free Method

Yipeng Pang and Guoqiang Hu This research was supported by Singapore Ministry of Education Academic Research Fund Tier 1 RG180/17(2017-T1-002-158).Y. Pang and G. Hu are with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore [email protected], [email protected].
Abstract

This paper studies an N-coalition non-cooperative game problem, where the players in the same coalition cooperatively minimize the sum of their local cost functions under a directed communication graph, while collectively acting as a virtual player to play a non-cooperative game with other coalitions. Moreover, it is assumed that the players have no access to the explicit functional form but only the function value of their local costs. To solve the problem, a discrete-time gradient-free Nash equilibrium seeking strategy, based on the gradient tracking method, is proposed. Specifically, a gradient estimator is developed locally based on Gaussian smoothing to estimate the partial gradients, and a gradient tracker is constructed locally to trace the average sum of the partial gradients among the players within the coalition. With a sufficiently small constant step-size, we show that all players' actions approximately converge to the Nash equilibrium at a geometric rate under a strongly monotone game mapping condition. Numerical simulations are conducted to verify the effectiveness of the proposed algorithm.

Index Terms:
Nash equilibrium seeking, gradient-free methods, non-cooperative games.

I Introduction

Recently, great efforts have been devoted to the research of collaboration and competition among multiple rational decision-makers, of which distributed optimization and Nash equilibrium (NE) seeking in non-cooperative games are the two main lines. In particular, the distributed optimization problem is concerned with a network of agents that cooperatively minimize a combination of their local cost functions, which has attracted great interest in applications such as parameter estimation, source localization, resource allocation, and multi-robot coordination. Non-cooperative games, on the other hand, consider a group of players, each motivated by self-interest to minimize its own cost function in response to other players' actions, and have been widely applied in the fields of transportation network control, power network control, electricity markets, smart grids, etc.

This paper subsumes the research of both distributed optimization and NE seeking in non-cooperative games by considering an N-coalition non-cooperative game. In this game, each coalition consists of a number of players. On the one hand, each coalition can be regarded as a virtual player that aims to minimize its cost function by adjusting its actions based on the other coalitions' actions in a non-cooperative game. On the other hand, the coalition's cost function is defined as the sum of the local cost functions associated with the players in the corresponding coalition, and minimization of this cost is realized by the collaboration among the corresponding players. Moreover, it is assumed that the players have no access to the explicit functional form but only the function value of their local costs. A discrete-time gradient-free NE seeking strategy is developed to drive the players' actions to the NE.

Related Work. Recently, vast results on the design of NE seeking algorithms have been reported to solve matrix games, potential games, aggregate games, and generalized games [4, 2, 26, 13], to list a few. For the N-coalition game problem, the formulation is related to the problem of two-sub-network zero-sum games (e.g., [3, 6]), where two adversarial networks have opposing objectives with regard to the optimization of a common cost function, which is collaboratively minimized by the agents in the corresponding network. The N-coalition game problem is essentially an extension of such a problem with the number of sub-networks being N. To solve the problem, the work in [23] proposed an NE seeking strategy, where a dynamic average consensus protocol was adopted to estimate the averaged gradients of the coalitions' cost functions, and a gradient play method was implemented to drive the states to the equilibrium. It was further improved in [24] to reduce the communication and computation costs by incorporating an interference graph which characterizes the interactions among the players in each coalition. The N-coalition game problem was also studied in [28], where the players are subject to both local set constraints and a coupled inequality constraint, and the associated cost functions are allowed to be non-smooth. A distributed algorithm based on projection operators, subgradient dynamics, and differential inclusions was proposed to find the generalized NE. Different from all these works, the authors in [25] proposed an extremum seeking-based approach without knowledge of the explicit expressions of the players' local cost functions, which is the most relevant to the problem considered in this paper. However, it should be noted that the methods proposed for N-coalition game problems in the existing literature are continuous-time approaches. This paper intends to devise a discrete-time gradient-free NE seeking strategy to drive the players' actions to the NE.

The newly proposed gradient-free NE seeking strategy follows the idea of Gaussian smoothing to estimate the gradient of a function, which was first reported in [7] for convex optimization, and further studied in both distributed optimization [27, 8, 9, 10, 11] and non-cooperative games [12]. This type of method, including the payoff-based methods in [17, 18] and the extremum seeking-based methods in [25], can be regarded as non-model based, since the implementation of such algorithms does not require model information. One advantage of Gaussian smoothing over other non-model based methods is that it can deal with both smooth and non-smooth problem setups. Though this technique has been well studied in [7] and its extensions [27, 8, 9, 10, 11] for optimization problems, the derived properties cannot be directly applied to non-cooperative game problems due to the couplings in the players' actions. Moreover, an algorithm based on the Gaussian smoothing technique can only drive the players' actions to the NE of a smoothed game rather than the original game, and how close the NE of the smoothed game is to that of the original game has not been studied. With the Gaussian smoothing technique, the proposed strategy utilizes the estimated gradient information in a gradient tracking method to trace the gradient of the coalition's cost function. The gradient tracking technique has been widely adopted in distributed optimization to achieve a fast convergence rate with a constant step-size, see [20, 21, 14, 22], but has received little attention in NE seeking. It should be highlighted that solving N-coalition game problems is not equivalent to solving N independent distributed optimization problems among the N coalitions, since these N distributed optimization problems are coupled with each other through the players' actions, and each coalition only has authority over its own action rather than the entire decision variable as in distributed optimization problems. The action of one coalition influences the actions of the other coalitions. Moreover, the cost functions in non-cooperative games are only convex with respect to the player's own action, rather than the entire decision variable. Such partial convexity also brings additional challenges to the convergence analysis of the gradient tracking method in non-cooperative games.

Contributions. The major contributions of this paper are threefold. 1) Different from the existing works on N-coalition game problems [23, 24, 28], the problem considered in this paper follows the settings in [25], where the players are assumed to have no access to the explicit functional form, but only the function value, of their local cost functions. As the methods proposed for N-coalition game problems in all existing literature [23, 24, 28, 25] are continuous-time approaches, this paper is the first attempt to address N-coalition game problems with a discrete-time method, where a non-model based NE seeking strategy is proposed via a gradient-free method. 2) This paper adopts the Gaussian smoothing technique to estimate the partial gradient of each player's local cost function, and a gradient tracking method is employed to trace the gradient of the coalition's cost function. Since Gaussian smoothing-based algorithms can only achieve convergence to the NE of a smoothed game rather than the original game, we are the first to explicitly quantify the distance between the NEs of the smoothed and the original games as a function of the smoothing parameter. As compared to the existing gradient tracking methods, such as [20, 21, 14, 22], this work is the first attempt to adopt gradient tracking techniques in N-coalition games. 3) The convergence property of the proposed algorithm is carefully studied. Under standard assumptions on the graph connectivity, the local cost functions and the game mapping, it is shown that the players' actions driven by the proposed algorithm with a small constant step-size converge to a small neighborhood of the NE at a geometric rate, with the error being proportional to the step-size and the smoothing parameter.

Notations. We use \mathbb{R}, \mathbb{R}_{\geq 0} and \mathbb{R}^{p} to denote real numbers, non-negative real numbers, and p-dimensional column vectors, respectively. \mathbf{1}_{p} (\mathbf{0}_{p}) represents a vector with all elements equal to 1 (0), and I_{p} denotes the p\times p identity matrix. For a vector u, we use \|u\| for its standard Euclidean norm, i.e., \|u\|\triangleq\sqrt{\langle u,u\rangle}. For a vector a or a matrix A, we use [a]_{i} to denote its i-th entry, and [A]_{ij} to denote the element in the i-th row and j-th column. The transpose and spectral norm of a matrix A are denoted by A^{\top} and \|A\|, respectively. We use \rho(A) to represent the spectral radius of a square matrix A. For a totally differentiable function f(x,y), we use \nabla f(x,y) for its total derivative, and \nabla_{x}f(x,y) for its partial derivative with respect to x for a fixed y.

II Problem Statement

II-A Game Formulation

Definition 1

(N-Coalition Games). An N-coalition non-cooperative game is defined by \Gamma(\mathcal{I},\{f^{i}\},\{\Omega^{i}\}), where each coalition i (indexed by \mathcal{I}\triangleq\{1,2,\ldots,N\}) owns a cost function f^{i} and consists of n_{i} players (denoted by \mathcal{V}^{i}\triangleq\{1,2,\ldots,n_{i}\}), with each player's action subject to a constraint set \Omega^{i}_{j}\subset\mathbb{R}^{p^{i}_{j}}, where p^{i}_{j} denotes the dimension of the action of player j in coalition i. Denote n\triangleq\sum_{i=1}^{N}n_{i}, p_{i}\triangleq\sum_{j=1}^{n_{i}}p^{i}_{j}, p\triangleq\sum_{i=1}^{N}p_{i}, \Omega^{i}=\Omega^{i}_{1}\times\cdots\times\Omega^{i}_{n_{i}}\subset\mathbb{R}^{p_{i}}, and \Omega\triangleq\Omega^{1}\times\cdots\times\Omega^{N}\subset\mathbb{R}^{p}. The cost function f^{i}:\Omega\to\mathbb{R} is defined as

f^{i}(\mathbf{x}^{i},\mathbf{x}^{-i})\triangleq\frac{1}{n_{i}}\sum_{j=1}^{n_{i}}f^{i}_{j}(\mathbf{x}^{i},\mathbf{x}^{-i}),\quad\mathbf{x}^{i}\in\Omega^{i},\quad\forall i\in\mathcal{I},

where f^{i}_{j}(\mathbf{x}^{i},\mathbf{x}^{-i}) is the local cost function of player j in coalition i, \mathbf{x}^{i}\triangleq[x^{i\top}_{1},\ldots,x^{i\top}_{n_{i}}]^{\top}\in\Omega^{i} is the collection of all players' actions in coalition i, and \mathbf{x}^{-i}\in\Omega\backslash\Omega^{i} is the collection of the actions of all players outside coalition i. x^{i}_{j}\in\Omega^{i}_{j} is the action of player j in coalition i. Collectively, we denote \mathbf{x}\triangleq(\mathbf{x}^{i},\mathbf{x}^{-i})=[\mathbf{x}^{1\top},\ldots,\mathbf{x}^{N\top}]^{\top}.

Definition 2

(Nash Equilibrium of N-Coalition Games). A vector \mathbf{x}^{*}\triangleq(\mathbf{x}^{i*},\mathbf{x}^{-i*})\in\Omega is said to be a Nash equilibrium of the N-coalition non-cooperative game \Gamma(\mathcal{I},\{f^{i}\},\{\Omega^{i}\}) if and only if

f^{i}(\mathbf{x}^{i*},\mathbf{x}^{-i*})\leq f^{i}(\mathbf{x}^{i},\mathbf{x}^{-i*}),\quad\forall\mathbf{x}^{i}\in\Omega^{i},\quad\forall i\in\mathcal{I}.

In this paper, we consider \Omega^{i}_{j}=\mathbb{R} for simplicity. For an N-coalition non-cooperative game \Gamma(\mathcal{I},\{f^{i}\},\{\mathbb{R}^{n_{i}}\}), the players in the same coalition cooperatively minimize the sum of their local cost functions, while collectively acting as a virtual player to play a non-cooperative game with the other coalitions. To achieve the mutual interest inside the coalition, the players in each coalition i\in\mathcal{I} are equipped with a communication network, characterized by a directed graph \mathcal{G}_{i}(\mathcal{V}^{i},\mathcal{E}^{i}) with an adjacency matrix A^{i}\in\mathbb{R}^{n_{i}\times n_{i}}, where [A^{i}]_{jk}>0 if (k,j)\in\mathcal{E}^{i} and [A^{i}]_{jk}=0 otherwise. We assume (k,k)\in\mathcal{E}^{i},\forall k\in\mathcal{V}^{i}. Moreover, it is assumed that all players have limited knowledge of their local cost functions, similar to the settings in [12, 25]. That is, each player can only access the output value of its local cost function, whose explicit form is unknown. Suppose that the considered N-coalition game \Gamma(\mathcal{I},\{f^{i}\},\{\mathbb{R}^{n_{i}}\}) admits an NE. The objective of this paper is to design an NE seeking strategy such that all players' actions converge to an NE.

This paper considers the case where each agent has full access to all other agents' decisions, which is known as the full-decision information setting. In this game setup, even though agents have full access to other agents' decisions, each agent's cost function, including its gradient information, remains private. Hence, across coalitions, the agents do not need any extra communication. Within the same coalition, since the agents need to collaboratively minimize the coalition's cost function, they need to communicate with each other to obtain some necessary information, such as the gradient tracker in this paper. Thus, we make the following assumptions on the communication graph of each coalition and the players' local cost functions.

Assumption 1

The digraph \mathcal{G}_{i} is strongly connected. The associated adjacency matrix A^{i} is doubly stochastic, i.e., \mathbf{1}_{n_{i}}^{\top}A^{i}=\mathbf{1}_{n_{i}}^{\top} and A^{i}\mathbf{1}_{n_{i}}=\mathbf{1}_{n_{i}}.

Define \sigma_{A^{i}}\triangleq\|A^{i}-\frac{1}{n_{i}}\mathbf{1}_{n_{i}}\mathbf{1}_{n_{i}}^{\top}\|. It follows from [14, Lemma 1] that \sigma_{A^{i}}<1. Denote \bar{\sigma}\triangleq\max_{i\in\mathcal{I}}\sigma_{A^{i}} and \varsigma\triangleq\max_{i\in\mathcal{I}}(1+\sigma_{A^{i}}^{2})/(1-\sigma_{A^{i}}^{2}).
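To make these graph-related constants concrete, the following minimal Python sketch computes \sigma_{A^{i}}, \bar{\sigma} and \varsigma for doubly-stochastic weight matrices; the example matrices are hypothetical and not taken from the paper.

```python
import numpy as np

def sigma_A(A):
    """Spectral norm of A - (1/n) * 1 1^T for a doubly stochastic A."""
    n = A.shape[0]
    return np.linalg.norm(A - np.ones((n, n)) / n, 2)

# Hypothetical doubly stochastic adjacency matrices for two coalitions
# (each row and column sums to one, with self-loops).
A1 = np.array([[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5],
               [0.5, 0.0, 0.5]])
A2 = np.array([[0.6, 0.4],
               [0.4, 0.6]])

sigmas = [sigma_A(A) for A in (A1, A2)]
sigma_bar = max(sigmas)                                   # \bar{\sigma}
varsigma = max((1 + s**2) / (1 - s**2) for s in sigmas)   # \varsigma

print(sigmas, sigma_bar, varsigma)  # each sigma_{A^i} should be < 1
```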

Assumption 2

For each j\in\mathcal{V}^{i},i\in\mathcal{I}, the local cost function f^{i}_{j}(\mathbf{x}^{i},\mathbf{x}^{-i}) is convex in \mathbf{x}^{i} and continuously differentiable in \mathbf{x}. The total gradient \nabla f^{i}_{j}(\mathbf{x}) is \mathcal{L}-Lipschitz continuous in \mathbf{x}, i.e., \|\nabla f^{i}_{j}(\mathbf{x})-\nabla f^{i}_{j}(\mathbf{x}^{\prime})\|\leq\mathcal{L}\|\mathbf{x}-\mathbf{x}^{\prime}\|, \forall\mathbf{x},\mathbf{x}^{\prime}\in\mathbb{R}^{n}.

Next, we define the game mapping of \Gamma(\mathcal{I},\{f^{i}\},\{\mathbb{R}^{n_{i}}\}) as F(\mathbf{x})\triangleq[\nabla_{\mathbf{x}^{1}}f^{1}(\mathbf{x})^{\top},\ldots,\nabla_{\mathbf{x}^{N}}f^{N}(\mathbf{x})^{\top}]^{\top}.

In this paper, we impose a strong monotonicity assumption on the game mapping F(\mathbf{x}), which is common in many works, e.g., [2, 15, 19, 12, 26].

Assumption 3

The game mapping F(\mathbf{x}) of game \Gamma is \chi-strongly monotone, i.e., \forall\mathbf{x},\mathbf{x}^{\prime}\in\mathbb{R}^{n}, we have \langle F(\mathbf{x})-F(\mathbf{x}^{\prime}),\mathbf{x}-\mathbf{x}^{\prime}\rangle\geq\chi\|\mathbf{x}-\mathbf{x}^{\prime}\|^{2}.

Remark 1

It is known that under Assumptions 2 and 3, game \Gamma admits a unique Nash equilibrium [16, Thm. 3].

II-B Preliminaries on Gaussian Smoothing

To facilitate the gradient-free techniques, some preliminaries on Gaussian smoothing and the randomized gradient oracle originating from [7] are presented in this part.

For the local cost function f^{i}_{j}(\mathbf{x}) of player j\in\mathcal{V}^{i} in coalition i\in\mathcal{I}, a Gaussian-smoothed function f^{i}_{j,\mu}(\mathbf{x}) with respect to \mathbf{x} is defined as

f^{i}_{j,\mu}(\mathbf{x})\triangleq\frac{1}{\kappa}\int_{\mathbb{R}^{n}}f^{i}_{j}(\mathbf{x}+\mu\xi)e^{-\frac{1}{2}\|\xi\|^{2}}d\xi, (1)

where \kappa\triangleq\int_{\mathbb{R}^{n}}e^{-\frac{1}{2}\|\xi\|^{2}}d\xi=(2\pi)^{n/2}, \xi\in\mathbb{R}^{n} is a normally distributed random variable, and \mu\geq 0 is called the smoothing parameter.

Within each coalition i\in\mathcal{I}, the randomized gradient-free oracle for player j with respect to player k, j,k\in\mathcal{V}^{i}, at step t\geq 0 is designed as¹ (¹In some cases, the two-point sampling in Gaussian smoothing may necessitate some coordination between players in the gradient approximation, which may be more stringent than the one-point sampling in other non-model based methods.)

\pi^{i}_{jk}(\mathbf{x}_{t})\triangleq\frac{f^{i}_{j}(\mathbf{x}_{t}+\mu\xi^{i}_{j,t})-f^{i}_{j}(\mathbf{x}_{t})}{\mu}[\xi^{i}_{j,t}]^{i}_{k}, (2)

where [\xi^{i}_{j,t}]^{i}_{k} denotes the (\sum_{l=0}^{i-1}n_{l}+k)-th element of \xi^{i}_{j,t} with n_{0}=0, \xi^{i}_{j,t} is player j's own copy of \xi at step t, and \mu>0. Let \mathcal{F}_{t} denote the \sigma-field generated by the entire history of all random variables from step 0 to t-1. Then, the following properties of f^{i}_{j,\mu}(\mathbf{x}) and \pi^{i}_{jk}(\mathbf{x}_{t}) are directly collected from [7].
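As an illustration of the two-point oracle (2), the following minimal Python sketch averages many oracle samples and compares the result with the true partial gradient, in line with the unbiasedness property collected in Lemma 1 below. The quadratic local cost and all numerical values are hypothetical, not from the paper.

```python
import numpy as np

def gradient_free_oracle(f_ij, x, mu, xi, idx):
    """Two-point randomized oracle (2): estimates the partial derivative of the
    local cost f_ij at the joint action x along coordinate idx, using one
    Gaussian perturbation xi of the same dimension as x."""
    return (f_ij(x + mu * xi) - f_ij(x)) / mu * xi[idx]

# Hypothetical example: a quadratic local cost (not taken from the paper).
f = lambda x: 0.5 * np.dot(x, x) + x[0] * x[1]
x = np.array([1.0, -0.5, 2.0])
rng = np.random.default_rng(0)

# Averaging many oracle samples approximates the partial gradient of the
# smoothed cost; here the true partial derivative w.r.t. x[0] is x[0] + x[1].
samples = [gradient_free_oracle(f, x, 1e-3, rng.standard_normal(x.size), 0)
           for _ in range(20000)]
print(np.mean(samples), x[0] + x[1])
```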

Lemma 1

(see [7]) Under Assumption 2, \forall j,k\in\mathcal{V}^{i},i\in\mathcal{I}, we have

  1.

    The function f^{i}_{j,\mu}(\mathbf{x}) is convex in \mathbf{x}^{i} and totally differentiable in \mathbf{x}.

  2.

    The total gradient \nabla f^{i}_{j,\mu}(\mathbf{x}) is \mathcal{L}-Lipschitz continuous in \mathbf{x}, i.e., \forall\mathbf{x},\mathbf{x}^{\prime}\in\mathbb{R}^{n}, \|\nabla f^{i}_{j,\mu}(\mathbf{x})-\nabla f^{i}_{j,\mu}(\mathbf{x}^{\prime})\|\leq\mathcal{L}\|\mathbf{x}-\mathbf{x}^{\prime}\|; and it satisfies \|\nabla f^{i}_{j,\mu}(\mathbf{x})-\nabla f^{i}_{j}(\mathbf{x})\|\leq\frac{1}{2}(n+3)^{\frac{3}{2}}\mathcal{L}\mu.

  3.

    The oracle \pi^{i}_{jk}(\mathbf{x}_{t}) satisfies \mathbb{E}[\pi^{i}_{jk}(\mathbf{x}_{t})|\mathcal{F}_{t}]=\nabla_{x^{i}_{k}}f^{i}_{j,\mu}(\mathbf{x}_{t}) and \mathbb{E}[\|\pi^{i}_{jk}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]\leq 4(n+4)\|\nabla f^{i}_{j,\mu}(\mathbf{x}_{t})\|^{2}+3(n+4)^{3}\mathcal{L}^{2}\mu^{2}.

II-C Gaussian-Smoothed Game Formulation

We define a Gaussian-smoothed N-coalition game, denoted by \Gamma_{\mu}(\mathcal{I},\{f^{i}_{\mu}\},\{\mathbb{R}^{n_{i}}\}), which has the same set of coalitions and action sets, but the cost function f^{i}_{\mu} is given by

f^{i}_{\mu}(\mathbf{x}^{i},\mathbf{x}^{-i})\triangleq\frac{1}{n_{i}}\sum_{j=1}^{n_{i}}f^{i}_{j,\mu}(\mathbf{x}^{i},\mathbf{x}^{-i}),\quad\forall i\in\mathcal{I},

where f^{i}_{j,\mu} is given by (1). The game mapping of \Gamma_{\mu} is defined as F_{\mu}(\mathbf{x})\triangleq[\nabla_{\mathbf{x}^{1}}f^{1}_{\mu}(\mathbf{x})^{\top},\ldots,\nabla_{\mathbf{x}^{N}}f^{N}_{\mu}(\mathbf{x})^{\top}]^{\top}.

In the next lemma, we proceed to show that F_{\mu}(\mathbf{x}) is strongly monotone, and characterize the distance between the NE of the smoothed game \Gamma_{\mu} and the NE of the original game \Gamma in terms of the smoothing parameter \mu.

Lemma 2

Under Assumptions 2 and 3, \forall\mu\geq 0, the smoothed game \Gamma_{\mu}(\mathcal{I},\{f^{i}_{\mu}\},\{\mathbb{R}^{n_{i}}\}) satisfies the following:

  1.

    The game mapping F_{\mu}(\mathbf{x}) is \chi-strongly monotone.

  2.

    It admits a unique NE (denoted by \mathbf{x}_{\mu}^{*}) satisfying

    \|\mathbf{x}_{\mu}^{*}-\mathbf{x}^{*}\|\leq\frac{n(n+3)^{\frac{3}{2}}\mathcal{L}\gamma}{2(1-\sqrt{1-\gamma\chi})}\mu,

    where \mathbf{x}^{*} is the unique NE of the original game \Gamma, and \gamma\in(0,\frac{\chi}{n^{2}\mathcal{L}^{2}}] is a constant.

Proof: (1) Under Assumption 2, f^{i}_{j}(\mathbf{x}) is differentiable in \mathbf{x}^{i}, and both f^{i}_{j}(\mathbf{x}) and \nabla_{\mathbf{x}^{i}}f^{i}_{j}(\mathbf{x}) are continuous in \mathbf{x}. Then, the integrand in (1) and its partial derivative with respect to \mathbf{x}^{i} are both continuous in (\mathbf{x},\xi). Thus, for \mu\geq 0, we may derive \nabla_{\mathbf{x}^{i}}f^{i}_{j,\mu}(\mathbf{x}) by applying the Leibniz integral rule to (1), which gives

\nabla_{\mathbf{x}^{i}}f^{i}_{j,\mu}(\mathbf{x})=\frac{1}{\kappa}\int_{\mathbb{R}^{n}}\nabla_{\mathbf{x}^{i}}f^{i}_{j}(\mathbf{x}+\mu\xi)e^{-\frac{1}{2}\|\xi\|^{2}}d\xi. (3)

For all \mathbf{x},\mathbf{y}\in\mathbb{R}^{n}, it can be shown that

\langle F_{\mu}(\mathbf{x})-F_{\mu}(\mathbf{y}),\mathbf{x}-\mathbf{y}\rangle=\sum_{i=1}^{N}\langle\nabla_{\mathbf{x}^{i}}f^{i}_{\mu}(\mathbf{x})-\nabla_{\mathbf{y}^{i}}f^{i}_{\mu}(\mathbf{y}),\mathbf{x}^{i}-\mathbf{y}^{i}\rangle
=\frac{1}{\kappa}\int_{\mathbb{R}^{n}}\sum_{i=1}^{N}\langle\nabla_{\mathbf{x}^{i}}f^{i}(\mathbf{x}+\mu\xi)-\nabla_{\mathbf{y}^{i}}f^{i}(\mathbf{y}+\mu\xi),\mathbf{x}^{i}-\mathbf{y}^{i}\rangle e^{-\frac{1}{2}\|\xi\|^{2}}d\xi
=\frac{1}{\kappa}\int_{\mathbb{R}^{n}}\langle F(\mathbf{x}+\mu\xi)-F(\mathbf{y}+\mu\xi),(\mathbf{x}+\mu\xi)-(\mathbf{y}+\mu\xi)\rangle e^{-\frac{1}{2}\|\xi\|^{2}}d\xi\geq\chi\|\mathbf{x}-\mathbf{y}\|^{2},

where the second equality is due to (3) and the last inequality follows from Assumption 3. Thus, for \mu\geq 0, the game mapping of \Gamma_{\mu} is \chi-strongly monotone.

(2) With the convexity of f^{i}_{\mu}(\mathbf{x}) in \mathbf{x}^{i} (cf. Lemma 1-(1)) and the strong monotonicity of F_{\mu}(\mathbf{x}), game \Gamma_{\mu} admits a unique NE, denoted by \mathbf{x}_{\mu}^{*}. It is noted that an N-coalition game is equivalent to an N-player non-cooperative game in which each virtual player's cost is the corresponding coalition's cost function; hence we have F_{\mu}(\mathbf{x}^{*}_{\mu})=\mathbf{0}_{n}.

We use the notation \bm{F}(\mathbf{x},\mu):\mathbb{R}^{n}\times\mathbb{R}_{\geq 0}\to\mathbb{R}^{n} for the game mapping F_{\mu}(\mathbf{x}) of the smoothed game, and \mathbf{x}^{*}(\mu) for the unique NE \mathbf{x}_{\mu}^{*}, to explicitly quantify the effect of the smoothing parameter \mu. For all \mu\geq 0, it follows from part (1) that \bm{F}(\mathbf{x},\mu) is \chi-strongly monotone in \mathbf{x}, i.e., \forall\mathbf{x},\mathbf{x}^{\prime}\in\mathbb{R}^{n},

\langle\mathbf{x}-\mathbf{x}^{\prime},\bm{F}(\mathbf{x},\mu)-\bm{F}(\mathbf{x}^{\prime},\mu)\rangle\geq\chi\|\mathbf{x}-\mathbf{x}^{\prime}\|^{2}. (4)

When \mu=0, f^{i}_{j,0}(\mathbf{x}) reduces to f^{i}_{j}(\mathbf{x}) by definition (1). That implies the two games \Gamma_{\mu} and \Gamma coincide at \mu=0. Hence their respective unique NEs are equal at \mu=0, i.e., \mathbf{x}^{*}(0)=\mathbf{x}^{*}. Now, establishing an upper bound for \|\mathbf{x}_{\mu}^{*}-\mathbf{x}^{*}\| is equivalent to studying \|\mathbf{x}^{*}(\mu)-\mathbf{x}^{*}(0)\|, which is the Lipschitz property of \mathbf{x}^{*}(\mu) at \mu=0.

The rest of the proof follows the idea in [1, Thm. 2.1]. We first show that \bm{F}(\mathbf{x},\mu) is (i) Lipschitz continuous in \mathbf{x} for all \mu\geq 0, and (ii) Lipschitz continuous in \mu at \mu=0 for all \mathbf{x}\in\mathbb{R}^{n}. For (i), it follows from Lemma 1-(2) that, \forall\mathbf{x},\mathbf{x}^{\prime}\in\mathbb{R}^{n}, \|\nabla_{\mathbf{x}^{i}}f^{i}_{\mu}(\mathbf{x})-\nabla_{\mathbf{x}^{i}}f^{i}_{\mu}(\mathbf{x}^{\prime})\|=\|\sum_{j=1}^{n_{i}}(\nabla_{\mathbf{x}^{i}}f^{i}_{j,\mu}(\mathbf{x})-\nabla_{\mathbf{x}^{i}}f^{i}_{j,\mu}(\mathbf{x}^{\prime}))\|\leq\sum_{j=1}^{n_{i}}\|\nabla f^{i}_{j,\mu}(\mathbf{x})-\nabla f^{i}_{j,\mu}(\mathbf{x}^{\prime})\|\leq n_{i}\mathcal{L}\|\mathbf{x}-\mathbf{x}^{\prime}\|. Hence, we have \|\bm{F}(\mathbf{x},\mu)-\bm{F}(\mathbf{x}^{\prime},\mu)\|=(\sum_{i=1}^{N}\|\nabla_{\mathbf{x}^{i}}f^{i}_{\mu}(\mathbf{x})-\nabla_{\mathbf{x}^{i}}f^{i}_{\mu}(\mathbf{x}^{\prime})\|^{2})^{\frac{1}{2}}, which gives

\|\bm{F}(\mathbf{x},\mu)-\bm{F}(\mathbf{x}^{\prime},\mu)\|\leq n\mathcal{L}\|\mathbf{x}-\mathbf{x}^{\prime}\|,\quad\forall\mu\geq 0. (5)

For (ii), it follows from Lemma 1-(2) that, \forall\mathbf{x}\in\mathbb{R}^{n}, \|\nabla_{\mathbf{x}^{i}}f^{i}_{\mu}(\mathbf{x})-\nabla_{\mathbf{x}^{i}}f^{i}(\mathbf{x})\|=\|\sum_{j=1}^{n_{i}}(\nabla_{\mathbf{x}^{i}}f^{i}_{j,\mu}(\mathbf{x})-\nabla_{\mathbf{x}^{i}}f^{i}_{j}(\mathbf{x}))\|\leq\sum_{j=1}^{n_{i}}\|\nabla f^{i}_{j,\mu}(\mathbf{x})-\nabla f^{i}_{j}(\mathbf{x})\|\leq\frac{1}{2}n_{i}(n+3)^{\frac{3}{2}}\mathcal{L}\mu. Hence, we have

\|\bm{F}(\mathbf{x},\mu)-\bm{F}(\mathbf{x},0)\|\leq\frac{1}{2}n(n+3)^{\frac{3}{2}}\mathcal{L}\mu,\quad\forall\mathbf{x}\in\mathbb{R}^{n}. (6)

Now, we consider the map M(\mathbf{x},\mu)\triangleq\mathbf{x}-\gamma\bm{F}(\mathbf{x},\mu). For 0<\gamma\leq\frac{\chi}{n^{2}\mathcal{L}^{2}} and \forall\mathbf{x},\mathbf{y}\in\mathbb{R}^{n},

\|M(\mathbf{x},\mu)-M(\mathbf{y},\mu)\|^{2}\leq\|\mathbf{x}-\mathbf{y}\|^{2}-2\gamma\langle\mathbf{x}-\mathbf{y},\bm{F}(\mathbf{x},\mu)-\bm{F}(\mathbf{y},\mu)\rangle+\gamma^{2}\|\bm{F}(\mathbf{x},\mu)-\bm{F}(\mathbf{y},\mu)\|^{2}
\leq(1-2\gamma\chi+n^{2}\gamma^{2}\mathcal{L}^{2})\|\mathbf{x}-\mathbf{y}\|^{2}\leq(1-\gamma\chi)\|\mathbf{x}-\mathbf{y}\|^{2}, (7)

where the second inequality follows from (4) and (5). Hence, (7) implies that the map M(\mathbf{x},\mu) is a contraction with respect to \mathbf{x}. By the Banach fixed point theorem, the map M(\mathbf{x},\mu) has a unique fixed point. On the other hand, any fixed point of the map M(\mathbf{x},\mu) is an NE of game \Gamma_{\mu} (since at such a point \bm{F}(\mathbf{x},\mu)=F_{\mu}(\mathbf{x})=\mathbf{0}_{n}). Thus, we deduce that the unique NE \mathbf{x}^{*}(\mu) is the unique fixed point of the map M(\mathbf{x},\mu). Then, it follows from (7) that

\|\mathbf{x}^{*}(\mu)-\mathbf{x}^{*}(0)\|=\|M(\mathbf{x}^{*}(\mu),\mu)-M(\mathbf{x}^{*}(0),0)\|
\leq\|M(\mathbf{x}^{*}(\mu),\mu)-M(\mathbf{x}^{*}(0),\mu)\|+\|M(\mathbf{x}^{*}(0),\mu)-M(\mathbf{x}^{*}(0),0)\|
\leq\sqrt{1-\gamma\chi}\|\mathbf{x}^{*}(\mu)-\mathbf{x}^{*}(0)\|+\|M(\mathbf{x}^{*}(0),\mu)-M(\mathbf{x}^{*}(0),0)\|.

Noting that the last term

\|M(\mathbf{x}^{*}(0),\mu)-M(\mathbf{x}^{*}(0),0)\|
=\|(\mathbf{x}^{*}(0)-\gamma\bm{F}(\mathbf{x}^{*}(0),\mu))-(\mathbf{x}^{*}(0)-\gamma\bm{F}(\mathbf{x}^{*}(0),0))\|
=\gamma\|\bm{F}(\mathbf{x}^{*}(0),\mu)-\bm{F}(\mathbf{x}^{*}(0),0)\|\leq\frac{1}{2}n(n+3)^{\frac{3}{2}}\mathcal{L}\gamma\mu,

where the last inequality is due to (6). Combining the above two relations gives the desired result. \Box

III Nash Equilibrium Seeking Strategy in N-Coalition Games

In this section, we present the details of the NE seeking strategy.

At time-step t, each player j\in\mathcal{V}^{i} in each coalition i\in\mathcal{I} maintains the following variables: its own action variable x^{i}_{j,t}, auxiliary action variables y^{i}_{jk,t}, and gradient tracker variables \phi^{i}_{jk,t}, \forall k\in\mathcal{V}^{i}. A gradient estimator \pi^{i}_{jk}, \forall k\in\mathcal{V}^{i}, is also constructed to estimate the partial gradients based on the values of the local cost functions. The algorithm is initialized with arbitrary x^{i}_{j,0},y^{i}_{jk,0}\in\mathbb{R} and \phi^{i}_{jk,0}=\pi^{i}_{jk}(\mathbf{x}_{0}). Then, each player j\in\mathcal{V}^{i},i\in\mathcal{I} updates these variables according to the following update laws

y^{i}_{jk,t+1} =\sum_{l=1}^{n_{i}}[A^{i}]_{jl}y^{i}_{lk,t}-\alpha\phi^{i}_{jk,t}, (8a)
x^{i}_{j,t+1} =y^{i}_{jj,t+1}, (8b)
\phi^{i}_{jk,t+1} =\sum_{l=1}^{n_{i}}[A^{i}]_{jl}\phi^{i}_{lk,t}+\pi^{i}_{jk}(\mathbf{x}_{t+1})-\pi^{i}_{jk}(\mathbf{x}_{t}), (8c)

where \pi^{i}_{jk}(\mathbf{x}_{t}) is the gradient estimator given by (2), and \alpha>0 is a constant step-size. Thus, at time-step t, agent j\in\mathcal{V}^{i} of coalition i\in\mathcal{I} first updates the auxiliary action variables y^{i}_{jk,t}, its decision variable x^{i}_{j,t} and the gradient tracker variables \phi^{i}_{jk,t},k\in\mathcal{V}^{i} according to the update laws (8), and then passes the updated auxiliary action variables y^{i}_{jk,t} and gradient tracker variables \phi^{i}_{jk,t}, k\in\mathcal{V}^{i} to its out-neighbors in the same coalition. Then the iteration continues. These procedures are tabulated in Algorithm 1.

Algorithm 1 NE seeking in N-coalition games
1: Initialize: j\in\mathcal{V}^{i},i\in\mathcal{I}
2:      set x^{i}_{j,0},y^{i}_{jk,0}\in\mathbb{R}; generate \{\xi^{i}_{j,t}\}_{t\geq 0}\sim\mathcal{N}(\mathbf{0}_{n},I_{n}); set \phi^{i}_{jk,0}=\pi^{i}_{jk}(\mathbf{x}_{0}), \forall k\in\mathcal{V}^{i}
3: Iteration (t\geq 0): j\in\mathcal{V}^{i},i\in\mathcal{I}
4:      (communicate with neighbor l within coalition i to obtain y^{i}_{lk,t-1}, \forall k\in\mathcal{V}^{i}) update y^{i}_{jk,t} based on (8a); update x^{i}_{j,t} based on (8b) (\mathbf{x}_{t} is broadcast to all players across all coalitions); compute \pi^{i}_{jk}(\mathbf{x}_{t}) based on (2), \forall k\in\mathcal{V}^{i}; (communicate with neighbor l within coalition i to obtain \phi^{i}_{lk,t-1}, \forall k\in\mathcal{V}^{i}) update \phi^{i}_{jk,t} based on (8c), \forall k\in\mathcal{V}^{i}
5: Output: j\in\mathcal{V}^{i},i\in\mathcal{I}
6:      x^{i}_{j,t}\to x^{i*}_{j}
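For concreteness, the following Python sketch mirrors Algorithm 1 on a hypothetical two-coalition game with scalar actions and quadratic local costs; the costs, weight matrices, step-size and smoothing parameter are illustrative assumptions and are not the paper's simulation setup. It implements the update laws (8a)-(8c) with the gradient-free oracle (2).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical problem data: 2 coalitions of 2 players each, scalar actions.
coalitions = [[0, 1], [2, 3]]          # global indices of the players per coalition
n = 4                                   # total number of players/actions
Q = np.array([[2.0, 0.3, 0.2, 0.0],
              [0.3, 2.0, 0.0, 0.2],
              [0.2, 0.0, 2.0, 0.3],
              [0.0, 0.2, 0.3, 2.0]])
b = np.array([1.0, -1.0, 0.5, -0.5])

def local_cost(i, j, x):
    """Hypothetical local cost f^i_j(x) of the j-th player of coalition i."""
    g = coalitions[i][j]
    return 0.5 * x @ Q @ x + b[g] * x[g]

A = [np.array([[0.5, 0.5], [0.5, 0.5]]) for _ in coalitions]  # doubly stochastic weights
alpha, mu = 0.01, 1e-3                   # constant step-size and smoothing parameter

def oracle(i, j, x, xi):
    """Gradient-free oracle (2): one estimate per coordinate k of coalition i."""
    diff = (local_cost(i, j, x + mu * xi) - local_cost(i, j, x)) / mu
    return np.array([diff * xi[coalitions[i][k]] for k in range(len(coalitions[i]))])

# Initialization (y_{jj,0} is set to the initial action for convenience).
x = rng.standard_normal(n)
y = [np.tile(x[c], (len(c), 1)) for c in coalitions]          # y[i][j, k]
xi = [[rng.standard_normal(n) for _ in c] for c in coalitions]
pi_prev = [np.array([oracle(i, j, x, xi[i][j]) for j in range(len(c))])
           for i, c in enumerate(coalitions)]
phi = [p.copy() for p in pi_prev]                             # phi[i][j, k]

for t in range(20000):
    for i, c in enumerate(coalitions):
        y[i] = A[i] @ y[i] - alpha * phi[i]                   # (8a)
        for j in range(len(c)):
            x[c[j]] = y[i][j, j]                              # (8b)
    for i, c in enumerate(coalitions):                        # x_{t+1} is now known to all
        xi[i] = [rng.standard_normal(n) for _ in c]
        pi_new = np.array([oracle(i, j, x, xi[i][j]) for j in range(len(c))])
        phi[i] = A[i] @ phi[i] + pi_new - pi_prev[i]          # (8c)
        pi_prev[i] = pi_new

print(x)   # should end up close to the (smoothed) Nash equilibrium
```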

IV Convergence Analysis

To facilitate the convergence analysis, we first introduce some notation. Denote n_{s}\triangleq\sum_{i=1}^{N}n_{i}^{2} and n_{c}\triangleq\sum_{i=1}^{N}n_{i}^{3}. For k\in\mathcal{V}^{i},i\in\mathcal{I}, define \mathbf{y}^{i}_{k,t}\triangleq[y^{i}_{1k,t},\ldots,y^{i}_{n_{i}k,t}]^{\top}\in\mathbb{R}^{n_{i}}, \bar{y}^{i}_{k,t}\triangleq\frac{1}{n_{i}}\mathbf{1}_{n_{i}}^{\top}\mathbf{y}^{i}_{k,t}\in\mathbb{R}, \bar{\mathbf{y}}^{i}_{t}\triangleq[\bar{y}^{i}_{1,t},\ldots,\bar{y}^{i}_{n_{i},t}]^{\top}\in\mathbb{R}^{n_{i}}, \bar{\mathbf{y}}_{t}\triangleq[\bar{\mathbf{y}}^{1\top}_{t},\ldots,\bar{\mathbf{y}}^{N\top}_{t}]^{\top}\in\mathbb{R}^{n}, \phi_{t}\triangleq[\phi^{1}_{11,t},\ldots,\phi^{N}_{n_{N}n_{N},t}]^{\top}\in\mathbb{R}^{n}, \phi^{i}_{k,t}\triangleq[\phi^{i}_{1k,t},\ldots,\phi^{i}_{n_{i}k,t}]^{\top}\in\mathbb{R}^{n_{i}}, \bar{\phi}^{i}_{k,t}\triangleq\frac{1}{n_{i}}\mathbf{1}_{n_{i}}^{\top}\phi^{i}_{k,t}\in\mathbb{R}, \bar{\bm{\phi}}^{i}_{t}\triangleq[\bar{\phi}^{i}_{1,t},\ldots,\bar{\phi}^{i}_{n_{i},t}]^{\top}\in\mathbb{R}^{n_{i}}, \bar{\bm{\phi}}_{t}\triangleq[\bar{\bm{\phi}}^{1\top}_{t},\ldots,\bar{\bm{\phi}}^{N\top}_{t}]^{\top}\in\mathbb{R}^{n}, \pi^{i}_{k}\triangleq[\pi^{i}_{1k},\ldots,\pi^{i}_{n_{i}k}]^{\top}\in\mathbb{R}^{n_{i}}, \bar{\pi}^{i}_{k}\triangleq\frac{1}{n_{i}}\mathbf{1}_{n_{i}}^{\top}\pi^{i}_{k}\in\mathbb{R}, and \nabla_{x^{i}_{k}}\mathbf{f}^{i}_{\mu}(\mathbf{x})\triangleq[\nabla_{x^{i}_{k}}f^{i}_{1,\mu}(\mathbf{x}),\ldots,\nabla_{x^{i}_{k}}f^{i}_{n_{i},\mu}(\mathbf{x})]^{\top}\in\mathbb{R}^{n_{i}}. Then, the update laws (8a) and (8c) can be compactly written as

\mathbf{y}^{i}_{k,t+1} =A^{i}\mathbf{y}^{i}_{k,t}-\alpha\phi^{i}_{k,t}, (9a)
\phi^{i}_{k,t+1} =A^{i}\phi^{i}_{k,t}+\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t}). (9b)

Multiplying both sides of (9a) from the left by \frac{1}{n_{i}}\mathbf{1}_{n_{i}}^{\top} and stacking the resulting relations into a compact form, we obtain

\bar{\mathbf{y}}_{t+1}=\bar{\mathbf{y}}_{t}-\alpha\bar{\bm{\phi}}_{t}. (10)

Now, we are ready for the convergence analysis of the proposed algorithm, which consists of two parts: 1) in Sec. IV-A, we derive iterative inequalities for three major quantities: i) \sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}], the total consensus error of the auxiliary variables, ii) \mathbb{E}[\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}], the gap between the stacked averaged auxiliary variable and the NE of game \Gamma_{\mu}, and iii) \sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}], the total gradient tracking error; 2) in Sec. IV-B, we establish a linear system from these three inequalities to analyze its convergence, characterize the gap between the players' actions and the NE of game \Gamma_{\mu}, \mathbb{E}[\|\mathbf{x}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}], in terms of the aforementioned quantities, and finally deduce the gap between the players' actions and the NE of game \Gamma, \mathbb{E}[\|\mathbf{x}_{t}-\mathbf{x}^{*}\|^{2}], using the result in Lemma 2-(2).

IV-A Auxiliary Results

First, we derive some basic properties of the averaged gradient tracker \bar{\phi}^{i}_{k,t} and provide a bound on the stacked gradient tracker \phi^{i}_{k,t} in Lemmas 3 and 4, respectively.

Lemma 3

Under Assumptions 1, 2 and 3, the averaged gradient tracker \bar{\phi}^{i}_{k,t},\forall k\in\mathcal{V}^{i},i\in\mathcal{I}, satisfies

  1.

    \bar{\phi}^{i}_{k,t}=\bar{\pi}^{i}_{k}(\mathbf{x}_{t}),

  2.

    \mathbb{E}[\bar{\phi}^{i}_{k,t}|\mathcal{F}_{t}]=\nabla_{x^{i}_{k}}f^{i}_{\mu}(\mathbf{x}_{t}),

  3.

    \mathbb{E}[\|\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]\leq 12(n+4)\mathcal{L}^{2}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}+12(n+4)\mathcal{L}^{2}\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}+12(n+4)G^{2}+3(n+4)^{3}\mathcal{L}^{2}\mu^{2}, where G\triangleq\max_{j\in\mathcal{V}^{i},i\in\mathcal{I}}\|\nabla f^{i}_{j,\mu}(\mathbf{x}^{*}_{\mu})\|.

Proof: For (1), multiplying both sides of (9b) from the left by \frac{1}{n_{i}}\mathbf{1}_{n_{i}}^{\top}, and noting that A^{i} is doubly stochastic, we have \bar{\phi}^{i}_{k,t+1}=\bar{\phi}^{i}_{k,t}+\bar{\pi}^{i}_{k}(\mathbf{x}_{t+1})-\bar{\pi}^{i}_{k}(\mathbf{x}_{t}). Recursively expanding the above relation and noting that \phi^{i}_{k,0}=\pi^{i}_{k}(\mathbf{x}_{0}) completes the proof.

For (2), following the result of part (1) and Lemma 1-(3), we obtain \mathbb{E}[\bar{\phi}^{i}_{k,t}|\mathcal{F}_{t}]=\mathbb{E}[\bar{\pi}^{i}_{k}(\mathbf{x}_{t})|\mathcal{F}_{t}]=\frac{1}{n_{i}}\mathbf{1}_{n_{i}}^{\top}\mathbb{E}[\pi^{i}_{k}|\mathcal{F}_{t}]=\frac{1}{n_{i}}\mathbf{1}_{n_{i}}^{\top}\nabla_{x^{i}_{k}}\mathbf{f}^{i}_{\mu}(\mathbf{x}_{t})=\nabla_{x^{i}_{k}}f^{i}_{\mu}(\mathbf{x}_{t}).

For (3), it is noted that

\mathbb{E}[\|\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]=\mathbb{E}[\|\bar{\pi}^{i}_{k}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]=\frac{1}{n_{i}^{2}}\mathbb{E}[\|\mathbf{1}_{n_{i}}^{\top}\pi^{i}_{k}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]\leq\frac{1}{n_{i}}\sum_{j=1}^{n_{i}}\mathbb{E}[\|\pi^{i}_{jk}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}].

From Lemma 1-(3),

\mathbb{E}[\|\pi^{i}_{jk}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]\leq 4(n+4)\|\nabla f^{i}_{j,\mu}(\mathbf{x}_{t})\|^{2}+3(n+4)^{3}\mathcal{L}^{2}\mu^{2}
\leq 12(n+4)\|\nabla f^{i}_{j,\mu}(\mathbf{x}_{t})-\nabla f^{i}_{j,\mu}(\bar{\mathbf{y}}_{t})\|^{2}
\quad+12(n+4)\|\nabla f^{i}_{j,\mu}(\bar{\mathbf{y}}_{t})-\nabla f^{i}_{j,\mu}(\mathbf{x}^{*}_{\mu})\|^{2}
\quad+12(n+4)\|\nabla f^{i}_{j,\mu}(\mathbf{x}^{*}_{\mu})\|^{2}+3(n+4)^{3}\mathcal{L}^{2}\mu^{2}
\leq 12(n+4)\mathcal{L}^{2}\|\mathbf{x}_{t}-\bar{\mathbf{y}}_{t}\|^{2}+12(n+4)\mathcal{L}^{2}\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}
\quad+12(n+4)G^{2}+3(n+4)^{3}\mathcal{L}^{2}\mu^{2}
\leq 12(n+4)\mathcal{L}^{2}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}+3(n+4)^{3}\mathcal{L}^{2}\mu^{2}
\quad+12(n+4)\mathcal{L}^{2}\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}+12(n+4)G^{2}, (11)

where G\triangleq\max_{j\in\mathcal{V}^{i},i\in\mathcal{I}}\|\nabla f^{i}_{j,\mu}(\mathbf{x}^{*}_{\mu})\| and the last inequality follows from (8b) that

\|\mathbf{x}_{t}-\bar{\mathbf{y}}_{t}\|^{2}=\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|y^{i}_{kk,t}-\bar{y}^{i}_{k,t}\|^{2}\leq\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}. (12)

The proof is completed by combining the preceding relations. \Box

Lemma 4

Under Assumptions 1, 2 and 3, the stacked gradient tracker \{\phi^{i}_{k,t}\}_{t\geq 0},\forall k\in\mathcal{V}^{i},i\in\mathcal{I}, satisfies

\mathbb{E}[\|\phi^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]\leq 2\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]
\quad+24n_{i}^{2}(n+4)\mathcal{L}^{2}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}
\quad+24n_{i}^{2}(n+4)\mathcal{L}^{2}\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}
\quad+24n_{i}^{2}(n+4)G^{2}+6n_{i}^{2}(n+4)^{3}\mathcal{L}^{2}\mu^{2}.

Proof: For k\in\mathcal{V}^{i},i\in\mathcal{I}, we have

\|\phi^{i}_{k,t}\|^{2}\leq 2\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}+2n_{i}^{2}\|\bar{\phi}^{i}_{k,t}\|^{2}.

The proof is completed by taking the conditional expectation on \mathcal{F}_{t} and substituting Lemma 3-(3). \Box

In the subsequent Lemmas 5, 6 and 7, we characterize the iterative inequalities of the aforementioned three major quantities. Specifically, we begin by bounding the total consensus error of the auxiliary variables \sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2} in terms of the previous iterates of the three quantities and some constants.

Lemma 5

Under Assumptions 1, 2 and 3, the total consensus error of the auxiliary variables \sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2} satisfies

\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,t+1}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t+1}\|^{2}|\mathcal{F}_{t}]\leq\bigg{(}\frac{1+\bar{\sigma}^{2}}{2}+24(n+4)n_{c}\varsigma\mathcal{L}^{2}\alpha^{2}\bigg{)}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}
\quad+24(n+4)n_{c}\varsigma\mathcal{L}^{2}\alpha^{2}\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}
\quad+2\varsigma\alpha^{2}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]
\quad+24(n+4)n_{c}\varsigma G^{2}\alpha^{2}+6(n+4)^{3}n_{c}\varsigma\mathcal{L}^{2}\mu^{2}\alpha^{2}.

Proof: It follows from (9a) that for i\in\mathcal{I},

\|\mathbf{y}^{i}_{k,t+1}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t+1}\|^{2}=\bigg{\|}A^{i}\mathbf{y}^{i}_{k,t}-\alpha\phi^{i}_{k,t}-\frac{1}{n_{i}}\mathbf{1}_{n_{i}}\mathbf{1}_{n_{i}}^{\top}(A^{i}\mathbf{y}^{i}_{k,t}-\alpha\phi^{i}_{k,t})\bigg{\|}^{2}
\quad\leq\|A^{i}\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}+\alpha^{2}\bigg{\|}\bigg{(}I_{n_{i}}-\frac{1}{n_{i}}\mathbf{1}_{n_{i}}\mathbf{1}_{n_{i}}^{\top}\bigg{)}\phi^{i}_{k,t}\bigg{\|}^{2}
\quad\quad-2\bigg{\langle}A^{i}\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t},\alpha\bigg{(}I_{n_{i}}-\frac{1}{n_{i}}\mathbf{1}_{n_{i}}\mathbf{1}_{n_{i}}^{\top}\bigg{)}\phi^{i}_{k,t}\bigg{\rangle}.

Taking the conditional expectation on \mathcal{F}_{t} and noting that \|I_{n_{i}}-\frac{1}{n_{i}}\mathbf{1}_{n_{i}}\mathbf{1}_{n_{i}}^{\top}\|=1, we obtain

\mathbb{E}[\|\mathbf{y}^{i}_{k,t+1}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t+1}\|^{2}|\mathcal{F}_{t}]\leq\sigma_{A^{i}}^{2}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}+\alpha^{2}\mathbb{E}[\|\phi^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]
\quad+\frac{1-\sigma_{A^{i}}^{2}}{2\sigma_{A^{i}}^{2}}\mathbb{E}[\|A^{i}\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]+\frac{2\sigma_{A^{i}}^{2}}{1-\sigma_{A^{i}}^{2}}\alpha^{2}\mathbb{E}[\|\phi^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]
\leq\frac{1+\sigma_{A^{i}}^{2}}{1-\sigma_{A^{i}}^{2}}\alpha^{2}\mathbb{E}[\|\phi^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]+\frac{1+\sigma_{A^{i}}^{2}}{2}\mathbb{E}[\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]
\leq\varsigma\alpha^{2}\mathbb{E}[\|\phi^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]+\frac{1+\bar{\sigma}^{2}}{2}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}.

Applying Lemma 4 and summing over k=1 to n_{i}, i=1 to N complete the proof. \Box

Then, we quantify the gap \|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2} in terms of its own previous iterate, the total consensus error of the auxiliary variables, and some constants.

Lemma 6

Under Assumptions 1, 2 and 3, the gap between the stacked averaged auxiliary variable and the NE of game \Gamma_{\mu}, \|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}, satisfies

\mathbb{E}[\|\bar{\mathbf{y}}_{t+1}-\mathbf{x}^{*}_{\mu}\|^{2}|\mathcal{F}_{t}]\leq(1-\chi\alpha+12n(n+4)\mathcal{L}^{2}\alpha^{2})\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}
\quad+\bigg{(}\frac{\alpha\mathcal{L}^{2}}{\chi}+12n(n+4)\mathcal{L}^{2}\alpha^{2}\bigg{)}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}
\quad+3n(n+4)^{3}\mathcal{L}^{2}\mu^{2}\alpha^{2}+12n(n+4)G^{2}\alpha^{2}.

Proof: It follows from (10) that

\bar{\mathbf{y}}_{t+1}-\mathbf{x}^{*}_{\mu}=\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}-\alpha\bar{\bm{\phi}}_{t}.

Taking the squared Euclidean norm on both sides and noting that F_{\mu}(\mathbf{x}^{*}_{\mu})=\mathbf{0}_{n},

\|\bar{\mathbf{y}}_{t+1}-\mathbf{x}^{*}_{\mu}\|^{2}\leq\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}+\alpha^{2}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\bar{\phi}^{i}_{k,t}\|^{2}
\quad-2\alpha\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\langle\bar{y}^{i}_{k,t}-x^{i*}_{k,\mu},\bar{\phi}^{i}_{k,t}-\nabla_{x^{i}_{k}}f^{i}_{\mu}(\mathbf{x}_{t})\rangle (13a)
\quad-2\alpha\langle\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu},F_{\mu}(\mathbf{x}_{t})-F_{\mu}(\bar{\mathbf{y}}_{t})\rangle (13b)
\quad-2\alpha\langle\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu},F_{\mu}(\bar{\mathbf{y}}_{t})-F_{\mu}(\mathbf{x}^{*}_{\mu})\rangle. (13c)

For (13a), it follows from Lemma 3-(2) that

\mathbb{E}\bigg{[}-2\alpha\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\langle\bar{y}^{i}_{k,t}-x^{i*}_{k,\mu},\bar{\phi}^{i}_{k,t}-\nabla_{x^{i}_{k}}f^{i}_{\mu}(\mathbf{x}_{t})\rangle\bigg{|}\mathcal{F}_{t}\bigg{]}=0.

For (13b), it follows from Lemma 1-(2) that

-2\alpha\langle\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu},F_{\mu}(\mathbf{x}_{t})-F_{\mu}(\bar{\mathbf{y}}_{t})\rangle
\leq 2\alpha\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\bar{y}^{i}_{k,t}-x^{i*}_{k,\mu}\|\|\nabla_{x^{i}_{k}}f^{i}_{\mu}(\mathbf{x}_{t})-\nabla_{x^{i}_{k}}f^{i}_{\mu}(\bar{\mathbf{y}}_{t})\|
\leq 2\alpha\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\bar{y}^{i}_{k,t}-x^{i*}_{k,\mu}\|\mathcal{L}\|\mathbf{x}_{t}-\bar{\mathbf{y}}_{t}\|
\leq\chi\alpha\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}+\frac{\alpha\mathcal{L}^{2}}{\chi}\|\mathbf{x}_{t}-\bar{\mathbf{y}}_{t}\|^{2}
\leq\chi\alpha\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}+\frac{\alpha\mathcal{L}^{2}}{\chi}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2},

where the last inequality follows from (12). For (13c), it follows from Lemma 2-(1) that

-2\alpha\langle\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu},F_{\mu}(\bar{\mathbf{y}}_{t})-F_{\mu}(\mathbf{x}^{*}_{\mu})\rangle\leq-2\chi\alpha\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}.

Taking the conditional expectation on \mathcal{F}_{t} for (13), and substituting the above three relations and Lemma 3-(3) yield the desired result. \Box

Finally, we derive a bound on the total gradient tracking error \sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2} in terms of the previous iterates of the three quantities and some constants.

Lemma 7

Under Assumptions 1, 2 and 3, the total gradient tracking error \sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2} satisfies

\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\phi^{i}_{k,t+1}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t+1}\|^{2}|\mathcal{F}_{t}]\leq\bigg{(}\frac{1+\bar{\sigma}^{2}}{2}+48(n+4)n_{s}\varsigma^{2}\mathcal{L}^{2}\alpha^{2}\bigg{)}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]
\quad+24(n+4)n_{s}\varsigma\mathcal{L}^{2}\bigg{(}\frac{3+\bar{\sigma}^{2}}{2}+\frac{\alpha\mathcal{L}^{2}}{\chi}+12n(n+4)\mathcal{L}^{2}\alpha^{2}+24(n+4)n_{c}\varsigma\mathcal{L}^{2}\alpha^{2}\bigg{)}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}
\quad+24(n+4)n_{s}\varsigma\mathcal{L}^{2}[2+12n(n+4)\mathcal{L}^{2}\alpha^{2}+24(n+4)n_{c}\varsigma\mathcal{L}^{2}\alpha^{2}]\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}
\quad+24(n+4)n_{s}\varsigma\mathcal{L}^{2}[24(n+4)n_{c}\varsigma G^{2}\alpha^{2}+6(n+4)^{3}n_{c}\varsigma\mathcal{L}^{2}\mu^{2}\alpha^{2}+3n(n+4)^{3}\mathcal{L}^{2}\mu^{2}\alpha^{2}+12n(n+4)G^{2}\alpha^{2}]
\quad+48(n+4)n_{s}\varsigma G^{2}+12(n+4)^{3}n_{s}\varsigma\mathcal{L}^{2}\mu^{2}.

Proof: It follows from (9b) that

\|\phi^{i}_{k,t+1}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t+1}\|^{2}=\|A^{i}\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}
\quad+\bigg{\|}\bigg{(}I_{n_{i}}-\frac{1}{n_{i}}\mathbf{1}_{n_{i}}\mathbf{1}_{n_{i}}^{\top}\bigg{)}(\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t}))\bigg{\|}^{2}
\quad+2\bigg{\langle}A^{i}\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t},\bigg{(}I_{n_{i}}-\frac{1}{n_{i}}\mathbf{1}_{n_{i}}\mathbf{1}_{n_{i}}^{\top}\bigg{)}(\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t}))\bigg{\rangle}.

Taking the conditional expectation on \mathcal{F}_{t}, we have

\mathbb{E}[\|\phi^{i}_{k,t+1}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t+1}\|^{2}|\mathcal{F}_{t}]\leq\sigma_{A^{i}}^{2}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]+\mathbb{E}[\|\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]
\quad+2\mathbb{E}[\|A^{i}\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|\|\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t})\||\mathcal{F}_{t}]
\leq\sigma_{A^{i}}^{2}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]+\mathbb{E}[\|\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]
\quad+\frac{1-\sigma_{A^{i}}^{2}}{2}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]+\frac{2\sigma_{A^{i}}^{2}}{1-\sigma_{A^{i}}^{2}}\mathbb{E}[\|\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]
\leq\frac{1+\bar{\sigma}^{2}}{2}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}|\mathcal{F}_{t}]+\varsigma\mathbb{E}[\|\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}], (14)

where the last term of (14) follows from (11) that

\mathbb{E}[\|\pi^{i}_{k}(\mathbf{x}_{t+1})-\pi^{i}_{k}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]
\quad\leq 2\sum_{j=1}^{n_{i}}\mathbb{E}[\|\pi^{i}_{jk}(\mathbf{x}_{t+1})\|^{2}|\mathcal{F}_{t}]+2\sum_{j=1}^{n_{i}}\mathbb{E}[\|\pi^{i}_{jk}(\mathbf{x}_{t})\|^{2}|\mathcal{F}_{t}]
\quad\leq 24n_{i}(n+4)\mathcal{L}^{2}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,t+1}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t+1}\|^{2}|\mathcal{F}_{t}]
\quad+24n_{i}(n+4)\mathcal{L}^{2}\mathbb{E}[\|\bar{\mathbf{y}}_{t+1}-\mathbf{x}^{*}_{\mu}\|^{2}|\mathcal{F}_{t}]
\quad+24n_{i}(n+4)\mathcal{L}^{2}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}
\quad+24n_{i}(n+4)\mathcal{L}^{2}\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}
\quad+48n_{i}(n+4)G^{2}+12n_{i}(n+4)^{3}\mathcal{L}^{2}\mu^{2}.

Applying Lemmas 5 and 6 in the above relation, and summing (14) over k=1 to n_{i}, i=1 to N complete the proof. \Box

IV-B Main Results

Next, we are ready to analyze the convergence of the proposed algorithm with the following definitions.

\Psi_{t}\triangleq\begin{bmatrix}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}]\\ \mathbb{E}[\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}]\\ \sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\phi^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,t}\|^{2}]\end{bmatrix},
\mathbf{M}_{\alpha}\triangleq\begin{bmatrix}1-k_{1}+k_{2}\alpha^{2}&k_{2}\alpha^{2}&k_{3}\alpha^{2}\\ k_{4}\alpha+k_{5}\alpha^{2}&1-k_{6}\alpha+k_{5}\alpha^{2}&0\\ k_{7}+k_{8}\alpha+k_{9}\alpha^{2}&k_{10}+k_{9}\alpha^{2}&1-k_{1}+k_{11}\alpha^{2}\end{bmatrix},
\Upsilon_{\alpha}\triangleq\begin{bmatrix}k_{12}\alpha^{2}\\ k_{13}\alpha^{2}\\ k_{14}+k_{15}\alpha^{2}\end{bmatrix},

where $k_{1}\triangleq\frac{1-\bar{\sigma}^{2}}{2}$, $k_{2}\triangleq 24(n+4)n_{c}\varsigma\mathcal{L}^{2}$, $k_{3}\triangleq 2\varsigma$, $k_{4}\triangleq\mathcal{L}^{2}/\chi$, $k_{5}\triangleq 12n(n+4)\mathcal{L}^{2}$, $k_{6}\triangleq\chi$, $k_{7}\triangleq 12(n+4)n_{s}\varsigma\mathcal{L}^{2}(3+\bar{\sigma}^{2})$, $k_{8}\triangleq 24(n+4)n_{s}\varsigma\mathcal{L}^{2}k_{4}$, $k_{9}\triangleq 24(n+4)n_{s}\varsigma\mathcal{L}^{2}(k_{2}+k_{5})$, $k_{10}\triangleq 48(n+4)n_{s}\varsigma\mathcal{L}^{2}$, $k_{11}\triangleq\varsigma k_{10}$, $k_{12}\triangleq 24(n+4)n_{c}\varsigma G^{2}+6(n+4)^{3}n_{c}\varsigma\mathcal{L}^{2}\mu^{2}$, $k_{13}\triangleq 12n(n+4)G^{2}+3n(n+4)^{3}\mathcal{L}^{2}\mu^{2}$, $k_{14}\triangleq 48(n+4)n_{s}\varsigma G^{2}+12(n+4)^{3}n_{s}\varsigma\mathcal{L}^{2}\mu^{2}$, and $k_{15}\triangleq 24(n+4)n_{s}\varsigma\mathcal{L}^{2}(k_{12}+k_{13})$.

Combining the results of Lemmas 5, 6, and 7, and taking the total expectation on both sides, we obtain the following dynamical system:

\Psi_{t+1}\leq\mathbf{M}_{\alpha}\Psi_{t}+\Upsilon_{\alpha}. \quad (15)

Based on the above dynamical system, the convergence of all players' actions to a neighborhood of the unique NE $\mathbf{x}^{*}$ of game $\Gamma$ is established in the following theorem.

Theorem 1

Suppose Assumptions 1, 2, and 3 hold. Let the auxiliary action variables $\{y^{i}_{jk,t}\}_{t\geq 0}$, the players' actions $\{x^{i}_{j,t}\}_{t\geq 0}$, and the gradient trackers $\{\phi^{i}_{jk,t}\}_{t\geq 0}$ be generated by (8) with a constant step-size $\alpha$ satisfying

0<\alpha<\min\{\alpha_{1},\alpha_{2},\alpha_{3},1/k_{6},1\},

where $\alpha_{1}\triangleq[k_{1}^{2}k_{6}/(k_{1}k_{2}k_{6}+2k_{1}k_{2}k_{4}+4k_{3}k_{4}k_{10}+2k_{3}k_{6}k_{7}+2k_{3}k_{6}k_{8})]^{\frac{1}{2}}$, $\alpha_{2}\triangleq k_{1}k_{4}k_{6}/(k_{1}k_{5}k_{6}+2k_{1}k_{4}k_{5})$, and $\alpha_{3}\triangleq\{k_{1}(2k_{4}k_{10}+k_{6}k_{7}+k_{6}k_{8})/[k_{1}k_{6}k_{9}+2k_{1}k_{4}k_{9}+k_{11}(4k_{4}k_{10}+2k_{6}k_{7}+2k_{6}k_{8})]\}^{\frac{1}{2}}$. Then, $\rho(\mathbf{M}_{\alpha})<1$, and the following quantities:

  • $\sup_{\ell\geq t}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,\ell}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,\ell}\|^{2}]$

  • $\sup_{\ell\geq t}\mathbb{E}[\|\bar{\mathbf{y}}_{\ell}-\mathbf{x}^{*}_{\mu}\|^{2}]$

  • $\sup_{\ell\geq t}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\phi^{i}_{k,\ell}-\mathbf{1}_{n_{i}}\bar{\phi}^{i}_{k,\ell}\|^{2}]$

converge at a geometric rate with exponent $\rho(\mathbf{M}_{\alpha})$. Moreover,

\limsup_{t\to\infty}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}]\leq\mathcal{O}(\alpha^{2}),
\limsup_{t\to\infty}\mathbb{E}[\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}]\leq\mathcal{O}(\alpha),

and we further have

\limsup_{t\to\infty}\mathbb{E}[\|\mathbf{x}_{t}-\mathbf{x}^{*}\|^{2}]\leq\mathcal{O}(\alpha)+\mathcal{O}(\mu).

Proof: For the dynamical system (15), if $\rho(\mathbf{M}_{\alpha})<1$, then $\mathbf{M}^{t}_{\alpha}$ converges to $\mathbf{0}$ at a geometric rate with exponent $\rho(\mathbf{M}_{\alpha})$ [5, Thm. 5.6.12], which implies that each element of $\Psi_{t}$ converges to a neighborhood of 0 at the same rate.
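To illustrate this mechanism, the following minimal sketch (with a hypothetical non-negative matrix and perturbation vector, not the paper's constants) iterates the equality version of (15) and shows the geometric approach to the fixed point $(I-\mathbf{M})^{-1}\Upsilon$.

```python
import numpy as np

# Minimal sketch with a hypothetical non-negative matrix M (rho(M) < 1) and a
# perturbation vector upsilon -- not the paper's constants. Iterating the
# equality version of (15) drives psi_t towards (I - M)^{-1} upsilon at a
# geometric rate governed by rho(M).
M = np.array([[0.60, 0.10, 0.05],
              [0.20, 0.50, 0.00],
              [0.10, 0.20, 0.60]])
upsilon = np.array([1e-4, 1e-3, 1e-4])

rho = np.max(np.abs(np.linalg.eigvals(M)))
assert rho < 1.0  # geometric convergence requires a spectral radius below one

fixed_point = np.linalg.solve(np.eye(3) - M, upsilon)

psi = np.ones(3)  # arbitrary positive initial condition
for _ in range(200):
    psi = M @ psi + upsilon

# After 200 iterations the residual has decayed roughly like rho**200.
print("rho(M) =", rho)
print("residual:", np.linalg.norm(psi - fixed_point))
```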

The following lemma is leveraged to ensure $\rho(\mathbf{M}_{\alpha})<1$:

Lemma 8

(see [5, Cor. 8.1.29]) Let $A\in\mathbb{R}^{m\times m}$ be a matrix with non-negative entries and $\bm{\theta}\in\mathbb{R}^{m}$ be a vector with positive entries. If there exists a constant $\lambda\geq 0$ such that $A\bm{\theta}<\lambda\bm{\theta}$, then $\rho(A)<\lambda$.
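As a quick numerical sanity check of Lemma 8 (illustrative only, not part of the proof), the sketch below draws a random non-negative matrix and a positive vector, chooses an admissible $\lambda$, and confirms the spectral-radius bound.

```python
import numpy as np

# Sanity check of Lemma 8 on a random instance: draw a non-negative matrix A
# and a positive vector theta, choose lambda so that A @ theta < lambda * theta
# holds componentwise, and confirm rho(A) < lambda.
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(3, 3))   # non-negative entries
theta = rng.uniform(0.5, 1.5, size=3)    # positive entries

lam = 1.01 * np.max(A @ theta / theta)   # strict componentwise margin

assert np.all(A @ theta < lam * theta)
rho = np.max(np.abs(np.linalg.eigvals(A)))
print(f"rho(A) = {rho:.4f} < lambda = {lam:.4f}: {rho < lam}")
```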

To invoke Lemma 8, we first set $\alpha<\frac{1}{k_{6}}$ so that each element of $\mathbf{M}_{\alpha}$ is non-negative. It then suffices to find a vector $\bm{\theta}\triangleq[\theta_{1},\theta_{2},\theta_{3}]^{\top}$ with $\theta_{1},\theta_{2},\theta_{3}>0$ such that $\mathbf{M}_{\alpha}\bm{\theta}<\bm{\theta}$, i.e.,

(1-k_{1}+k_{2}\alpha^{2})\theta_{1}+(k_{2}\alpha^{2})\theta_{2}+(k_{3}\alpha^{2})\theta_{3}<\theta_{1},
(k_{4}\alpha+k_{5}\alpha^{2})\theta_{1}+(1-k_{6}\alpha+k_{5}\alpha^{2})\theta_{2}<\theta_{2},
(k_{7}+k_{8}\alpha+k_{9}\alpha^{2})\theta_{1}+(k_{10}+k_{9}\alpha^{2})\theta_{2}+(1-k_{1}+k_{11}\alpha^{2})\theta_{3}<\theta_{3}.

Without loss of generality, letting $\theta_{3}=1$, it suffices to find $\theta_{1}$ and $\theta_{2}$ such that the following inequalities hold

(k_{2}\theta_{1}+k_{2}\theta_{2}+k_{3})\alpha^{2}<k_{1}\theta_{1}, \quad (16a)
(k_{5}\theta_{1}+k_{5}\theta_{2})\alpha<k_{6}\theta_{2}-k_{4}\theta_{1}, \quad (16b)
[k_{9}(\theta_{1}+\theta_{2})+k_{11}]\alpha^{2}<k_{1}-(k_{7}+k_{8})\theta_{1}-k_{10}\theta_{2}, \quad (16c)

where we have used $\alpha<1$ in the third inequality. Letting $\theta_{1}=\frac{k_{1}k_{6}}{4k_{4}k_{10}+2k_{6}k_{7}+2k_{6}k_{8}}$ and $\theta_{2}=\frac{k_{1}k_{4}}{2k_{4}k_{10}+k_{6}k_{7}+k_{6}k_{8}}$, solving (16) for $\alpha$ yields

\alpha<\alpha_{1},\quad\alpha<\alpha_{2},\quad\alpha<\alpha_{3},

where $\alpha_{1}\triangleq[k_{1}^{2}k_{6}/(k_{1}k_{2}k_{6}+2k_{1}k_{2}k_{4}+4k_{3}k_{4}k_{10}+2k_{3}k_{6}k_{7}+2k_{3}k_{6}k_{8})]^{\frac{1}{2}}$, $\alpha_{2}\triangleq k_{1}k_{4}k_{6}/(k_{1}k_{5}k_{6}+2k_{1}k_{4}k_{5})$, and $\alpha_{3}\triangleq\{k_{1}(2k_{4}k_{10}+k_{6}k_{7}+k_{6}k_{8})/[k_{1}k_{6}k_{9}+2k_{1}k_{4}k_{9}+k_{11}(4k_{4}k_{10}+2k_{6}k_{7}+2k_{6}k_{8})]\}^{\frac{1}{2}}$, which gives the desired range of $\alpha$.
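For concreteness, the following sketch evaluates the step-size bounds $\alpha_{1}$, $\alpha_{2}$, $\alpha_{3}$ for hypothetical positive constants $k_{1},\ldots,k_{11}$ with $k_{1}<1$ (placeholders, not the problem-dependent values defined earlier) and verifies numerically that $\rho(\mathbf{M}_{\alpha})<1$ for an admissible $\alpha$.

```python
import numpy as np

# Hypothetical positive constants (k1 < 1); the true k_1,...,k_11 depend on
# n, n_i, L, sigma_bar, varsigma, chi and are problem-specific.
k1, k2, k3, k4, k5, k6 = 0.3, 2.0, 1.5, 0.8, 3.0, 0.5
k7, k8, k9, k10, k11 = 0.05, 0.4, 1.0, 0.02, 0.6

# Step-size bounds of Theorem 1.
alpha1 = np.sqrt(k1**2 * k6 / (k1*k2*k6 + 2*k1*k2*k4
                               + 4*k3*k4*k10 + 2*k3*k6*k7 + 2*k3*k6*k8))
alpha2 = k1*k4*k6 / (k1*k5*k6 + 2*k1*k4*k5)
alpha3 = np.sqrt(k1*(2*k4*k10 + k6*k7 + k6*k8)
                 / (k1*k6*k9 + 2*k1*k4*k9
                    + k11*(4*k4*k10 + 2*k6*k7 + 2*k6*k8)))

alpha = 0.9 * min(alpha1, alpha2, alpha3, 1.0/k6, 1.0)  # admissible step-size

M_alpha = np.array([
    [1 - k1 + k2*alpha**2,        k2*alpha**2,                k3*alpha**2],
    [k4*alpha + k5*alpha**2,      1 - k6*alpha + k5*alpha**2, 0.0],
    [k7 + k8*alpha + k9*alpha**2, k10 + k9*alpha**2,          1 - k1 + k11*alpha**2],
])
print("alpha =", alpha)
print("rho(M_alpha) =", np.max(np.abs(np.linalg.eigvals(M_alpha))))  # below 1
```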

To determine the steady-state value, taking the limit superior on both sides of (15) gives

\limsup_{t\to\infty}\Psi_{t}\leq\mathbf{M}_{\alpha}\limsup_{t\to\infty}\Psi_{t}+\Upsilon_{\alpha}.

Notice that

I_{3}-\mathbf{M}_{\alpha}=\begin{bmatrix}k_{1}-k_{2}\alpha^{2}&-k_{2}\alpha^{2}&-k_{3}\alpha^{2}\\ -k_{4}\alpha-k_{5}\alpha^{2}&k_{6}\alpha-k_{5}\alpha^{2}&0\\ -k_{7}-k_{8}\alpha-k_{9}\alpha^{2}&-k_{10}-k_{9}\alpha^{2}&k_{1}-k_{11}\alpha^{2}\end{bmatrix}.

Its determinant can be computed as

\mathrm{Det}(I_{3}-\mathbf{M}_{\alpha})=\alpha(k_{1}-k_{11}\alpha^{2})[k_{1}k_{6}-k_{1}k_{5}\alpha-k_{2}(k_{4}+k_{6})\alpha^{2}].

Then, we can obtain that

\limsup_{t\to\infty}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}]\leq[(I_{3}-\mathbf{M}_{\alpha})^{-1}\Upsilon_{\alpha}]_{1}
\quad\leq\frac{k_{12}\alpha^{2}\mathcal{O}(\alpha)+k_{13}\alpha^{2}\mathcal{O}(\alpha^{2})+(k_{14}+k_{15}\alpha^{2})\mathcal{O}(\alpha^{3})}{\alpha(k_{1}-k_{11}\alpha^{2})[k_{1}k_{6}-k_{1}k_{5}\alpha-k_{2}(k_{4}+k_{6})\alpha^{2}]}
\quad=\mathcal{O}(\alpha^{2}),

and

\limsup_{t\to\infty}\mathbb{E}[\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}]\leq[(I_{3}-\mathbf{M}_{\alpha})^{-1}\Upsilon_{\alpha}]_{2}
\quad\leq\frac{k_{12}\alpha^{2}\mathcal{O}(\alpha)+k_{13}\alpha^{2}\mathcal{O}(1)+(k_{14}+k_{15}\alpha^{2})\mathcal{O}(\alpha^{3})}{\alpha(k_{1}-k_{11}\alpha^{2})[k_{1}k_{6}-k_{1}k_{5}\alpha-k_{2}(k_{4}+k_{6})\alpha^{2}]}
\quad=\mathcal{O}(\alpha).
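As a numerical illustration of these two steady-state bounds (again with hypothetical positive constants, not the problem-dependent $k_{1},\ldots,k_{15}$ defined earlier), the sketch below solves $(I_{3}-\mathbf{M}_{\alpha})z=\Upsilon_{\alpha}$ for decreasing $\alpha$ and shows that the first component scales like $\alpha^{2}$ while the second scales like $\alpha$.

```python
import numpy as np

# Hypothetical constants for illustration only.
k1, k2, k3, k4, k5, k6 = 0.3, 2.0, 1.5, 0.8, 3.0, 0.5
k7, k8, k9, k10, k11 = 0.05, 0.4, 1.0, 0.02, 0.6
k12, k13, k14, k15 = 1.0, 1.0, 0.01, 0.5

for alpha in [1e-2, 1e-3, 1e-4]:
    M = np.array([
        [1 - k1 + k2*alpha**2,        k2*alpha**2,                k3*alpha**2],
        [k4*alpha + k5*alpha**2,      1 - k6*alpha + k5*alpha**2, 0.0],
        [k7 + k8*alpha + k9*alpha**2, k10 + k9*alpha**2,          1 - k1 + k11*alpha**2],
    ])
    upsilon = np.array([k12*alpha**2, k13*alpha**2, k14 + k15*alpha**2])
    z = np.linalg.solve(np.eye(3) - M, upsilon)
    # The printed ratios stay roughly constant as alpha shrinks, matching the
    # O(alpha^2) and O(alpha) bounds derived above.
    print(f"alpha={alpha:.0e}: z1/alpha^2={z[0]/alpha**2:.3f}, z2/alpha={z[1]/alpha:.3f}")
```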

Thus, it follows from (12) that

\limsup_{t\to\infty}\mathbb{E}[\|\mathbf{x}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}]\leq 2\limsup_{t\to\infty}\mathbb{E}[\|\mathbf{x}_{t}-\bar{\mathbf{y}}_{t}\|^{2}]+2\limsup_{t\to\infty}\mathbb{E}[\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}]
\quad\leq 2\limsup_{t\to\infty}\sum_{i=1}^{N}\sum_{k=1}^{n_{i}}\mathbb{E}[\|\mathbf{y}^{i}_{k,t}-\mathbf{1}_{n_{i}}\bar{y}^{i}_{k,t}\|^{2}]+2\limsup_{t\to\infty}\mathbb{E}[\|\bar{\mathbf{y}}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}]=\mathcal{O}(\alpha).

Applying Lemma 2-(2), we have

\limsup_{t\to\infty}\mathbb{E}[\|\mathbf{x}_{t}-\mathbf{x}^{*}\|^{2}]\leq 2\limsup_{t\to\infty}\mathbb{E}[\|\mathbf{x}_{t}-\mathbf{x}^{*}_{\mu}\|^{2}]+2\|\mathbf{x}^{*}_{\mu}-\mathbf{x}^{*}\|^{2}=\mathcal{O}(\alpha)+\mathcal{O}(\mu),

which completes the proof. \Box

Remark 2

Theorem 1 shows that the players' actions converge to a neighborhood of the NE at a geometric rate, with an error bound proportional to both the step-size and the smoothing parameter. Choosing a smaller step-size and a smaller smoothing parameter therefore reduces the error bound and yields better accuracy.

Remark 3

The upper bound on the step-size depends on global information, which may be infeasible to compute in a distributed manner. However, any sufficiently small positive step-size satisfies the bound, so the step-size can simply be chosen small enough to guarantee convergence of the algorithm. The notion of a sufficiently small step-size is common in gradient tracking methods, see also [20, 21, 14, 22].

V Numerical Simulations

In this section, we illustrate the performance of the proposed algorithm with a numerical example of the Cournot competition game. The game consists of $N=4$ coalitions, each coalition $i$ having $n_{i}=6$ players. The local cost function of player $j\in\{1,\ldots,6\}$ in coalition $i\in\{1,\ldots,4\}$ is given by $f^{i}_{j}(\mathbf{x})=c^{i}_{j}(\mathbf{x})-x^{i}_{j}r^{i}_{j}(\mathbf{x})$, where $c^{i}_{j}(\mathbf{x})=5(x^{i}_{j})^{2}+5x^{i}_{j}+5(x^{i}_{j}-6j)$, $r^{1}_{j}(\mathbf{x})=60-x^{1}_{j}-x^{2}_{j}-x^{3}_{j}-x^{4}_{j}$, $r^{2}_{j}(\mathbf{x})=60-x^{2}_{j}$, $r^{3}_{j}(\mathbf{x})=60-x^{1}_{j}-x^{2}_{j}$, and $r^{4}_{j}(\mathbf{x})=60-x^{1}_{j}-x^{2}_{j}-x^{3}_{j}$ (see the sketch after Fig. 1). The cost functions are quadratic, so Assumptions 2 and 3 are satisfied. The directed communication graph for each coalition $i$ is shown in Fig. 1, and an adjacency matrix $A^{i}$ satisfying Assumption 1 is readily obtained.

Figure 1: Communication network.
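The local cost functions above can be encoded as follows (a minimal sketch with hypothetical helper names; players are indexed from 1 as in the paper, with 0-based array storage).

```python
import numpy as np

N, n_i = 4, 6   # 4 coalitions, 6 players per coalition

def c(i, j, x):
    """Production cost c^i_j(x); x is a 4 x 6 array of actions."""
    xij = x[i - 1, j - 1]
    return 5 * xij**2 + 5 * xij + 5 * (xij - 6 * j)

def r(i, j, x):
    """Price term r^i_j(x) as specified in the example."""
    col = x[:, j - 1]                      # actions x^1_j, ..., x^4_j
    if i == 1:
        return 60 - col[0] - col[1] - col[2] - col[3]
    if i == 2:
        return 60 - col[1]
    if i == 3:
        return 60 - col[0] - col[1]
    return 60 - col[0] - col[1] - col[2]   # i == 4

def f(i, j, x):
    """Local cost f^i_j(x) = c^i_j(x) - x^i_j * r^i_j(x), queried by value only."""
    return c(i, j, x) - x[i - 1, j - 1] * r(i, j, x)

# Example value query at an arbitrary joint action.
x0 = np.ones((N, n_i))
print(f(1, 3, x0))
```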

In the simulation, we set the smoothing parameter $\mu=10^{-4}$. To validate the convergence of the players' actions, we set the constant step-size $\alpha=0.1$. The algorithm is initialized with arbitrary $x^{i}_{j,0}$ and $y^{i}_{jk,0}$, and with $\phi^{i}_{jk,0}=\pi^{i}_{jk}(\mathbf{x}_{0})$. The trajectories of the players' actions in the four coalitions are plotted in Figs. 2, 3, 4, and 5. As can be seen, all players' actions approximately converge to the NE.
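As a rough illustration of the zeroth-order oracle behind these updates (in the spirit of [7]), the sketch below shows a two-point Gaussian-smoothing gradient estimate built from cost values only; it is a simplified stand-in, not the full update (8), which additionally runs consensus on the auxiliary variables $y^{i}_{jk}$ and the gradient trackers $\phi^{i}_{jk}$. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def gradient_estimate(f_value, x, mu=1e-4):
    """Two-point Gaussian-smoothing estimate of the gradient of f_value at x."""
    u = rng.standard_normal(x.shape)                 # random Gaussian direction
    return (f_value(x + mu * u) - f_value(x)) / mu * u

# Example on a simple quadratic, whose true gradient at x is 2 * x.
f_value = lambda x: float(np.sum(x**2))
x = np.array([1.0, -2.0, 0.5])
print(gradient_estimate(f_value, x))                 # noisy estimate of [2, -4, 1]
```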

Figure 2: Trajectories of players' actions in coalition 1.
Figure 3: Trajectories of players' actions in coalition 2.
Figure 4: Trajectories of players' actions in coalition 3.
Figure 5: Trajectories of players' actions in coalition 4.

To verify the convergence rate, we run the algorithm with constant step-sizes $\alpha=0.005$, $0.01$, $0.02$, and $0.05$. The trajectories of the error gap $\|\mathbf{x}_{t}-\mathbf{x}^{*}\|$ for these step-sizes are plotted in Fig. 6. As can be observed, the error gap decreases linearly on the log scale in all cases. Moreover, a smaller step-size $\alpha$ leads to a slower convergence rate but better accuracy, which is consistent with the results of Theorem 1.

Figure 6: Trajectories of the error gap $\|\mathbf{x}_{t}-\mathbf{x}^{*}\|$.

VI Conclusions

In this paper, we have studied an $N$-coalition non-cooperative game problem, where the players have access only to the values, not the explicit forms, of their local cost functions. A discrete-time gradient-free Nash equilibrium seeking algorithm, based on the gradient tracking method, has been proposed to search for the Nash equilibrium of the game. Under a strongly monotone game mapping condition, we have established that, with a sufficiently small constant step-size, all players' actions converge linearly to a neighborhood of the Nash equilibrium, where the gap is proportional to the step-size and the smoothing parameter. The performance of the proposed algorithm has been illustrated by numerical simulations.

References

  • [1] Stella Dafermos. Sensitivity Analysis in Variational Inequalities. Mathematics of Operations Research, 13(3):421–434, 1988.
  • [2] Zhenhua Deng and Shu Liang. Distributed algorithms for aggregative games of multiple heterogeneous Euler-Lagrange systems. Automatica, 99:246–252, 2019.
  • [3] B. Gharesifard and J. Cortes. Distributed convergence to Nash equilibria in two-network zero-sum games. Automatica, 49(6):1683–1692, 2013.
  • [4] Sergio Grammatico. Dynamic Control of Agents Playing Aggregative Games with Coupling Constraints. IEEE Transactions on Automatic Control, 62(9):4537–4548, 2017.
  • [5] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 1990.
  • [6] Youcheng Lou, Yiguang Hong, Lihua Xie, Guodong Shi, and Karl Henrik Johansson. Nash Equilibrium Computation in Subnetwork Zero-Sum Games With Switching Communications. IEEE Transactions on Automatic Control, 61(10):2920–2935, 2016.
  • [7] Yurii Nesterov and Vladimir Spokoiny. Random Gradient-Free Minimization of Convex Functions. Foundations of Computational Mathematics, 17(2):527–566, 2017.
  • [8] Yipeng Pang and Guoqiang Hu. A distributed optimization method with unknown cost function in a multi-agent system via randomized gradient-free method. In 2017 11th Asian Control Conference (ASCC), pages 144–149, 2017.
  • [9] Yipeng Pang and Guoqiang Hu. Exact Convergence of Gradient-Free Distributed Optimization Method in a Multi-Agent System. In 2018 IEEE 57th Conference on Decision and Control (CDC), pages 5728–5733, 2018.
  • [10] Yipeng Pang and Guoqiang Hu. Randomized Gradient-Free Distributed Online Optimization with Time-Varying Cost Functions. In 2019 IEEE 58th Conference on Decision and Control (CDC), pages 4910–4915, 2019.
  • [11] Yipeng Pang and Guoqiang Hu. Randomized Gradient-Free Distributed Optimization Methods for a Multi-Agent System with Unknown Cost Function. IEEE Transactions on Automatic Control, 65(1):333–340, 2020.
  • [12] Yipeng Pang and Guoqiang Hu. Distributed Nash Equilibrium Seeking with Limited Cost Function Knowledge via A Consensus-Based Gradient-Free Method. IEEE Transactions on Automatic Control, 66(4):1832–1839, 2021.
  • [13] Lacra Pavel. Distributed GNE seeking under partial-decision information over networks via a doubly-augmented operator splitting approach. IEEE Transactions on Automatic Control, 65(4):1584–1597, 2020.
  • [14] Shi Pu and Angelia Nedić. Distributed stochastic gradient tracking methods. Mathematical Programming, pages 1–49, 2020.
  • [15] Farzad Salehisadaghiani and Lacra Pavel. Distributed Nash equilibrium seeking: A gossip-based algorithm. Automatica, 72:209–216, 2016.
  • [16] Gesualdo Scutari, Francisco Facchinei, Jong Shi Pang, and Daniel P. Palomar. Real and complex monotone communication games. IEEE Transactions on Information Theory, 60(7):4197–4231, 2014.
  • [17] Tatiana Tatarenko and Maryam Kamgarpour. Learning Generalized Nash Equilibria in a Class of Convex Games. IEEE Transactions on Automatic Control, 64(4):1426–1439, 2019.
  • [18] Tatiana Tatarenko and Maryam Kamgarpour. Learning Nash Equilibria in Monotone Games. In 2019 IEEE 58th Conference on Decision and Control (CDC), pages 3104–3109, 2019.
  • [19] Tatiana Tatarenko, Wei Shi, and Angelia Nedic. Accelerated Gradient Play Algorithm for Distributed Nash Equilibrium Seeking. In 2018 IEEE 57th Conference on Decision and Control (CDC), pages 3561–3566, 2018.
  • [20] Chenguang Xi, Van Sy Mai, Eyad H. Abed, and Usman A. Khan. Linear convergence in directed optimization with row-stochastic matrices. IEEE Transactions on Automatic Control, 63(10):3558–3565, 2018.
  • [21] Ran Xin and Usman A. Khan. Distributed heavy-ball: A generalization and acceleration of first-order methods with gradient tracking. IEEE Transactions on Automatic Control, 2019.
  • [22] Ran Xin, Anit Kumar Sahu, Usman A. Khan, and Soummya Kar. Distributed stochastic optimization with gradient tracking over strongly-connected networks. In 2019 IEEE 58th Conference on Decision and Control (CDC), pages 8353–8358, 2019.
  • [23] Maojiao Ye, Guoqiang Hu, and F.L. Lewis. Nash equilibrium seeking for N-coalition noncooperative games. Automatica, 95:266–272, 2018.
  • [24] Maojiao Ye, Guoqiang Hu, Frank L. Lewis, and Lihua Xie. A Unified Strategy for Solution Seeking in Graphical N-Coalition Noncooperative Games. IEEE Transactions on Automatic Control, 64(11):4645–4652, 2019.
  • [25] Maojiao Ye, Guoqiang Hu, and Shengyuan Xu. An extremum seeking-based approach for Nash equilibrium seeking in N-cluster noncooperative games. Automatica, 114:108815, 2020.
  • [26] Peng Yi and Lacra Pavel. An operator splitting approach for distributed generalized Nash equilibria computation. Automatica, 102:111–121, 2019.
  • [27] Deming Yuan and Daniel W. C. Ho. Randomized Gradient-Free Method for Multiagent Optimization Over Time-Varying Networks. IEEE Transactions on Neural Networks and Learning Systems, 26(6):1342–1347, 2015.
  • [28] Xianlin Zeng, Jie Chen, Shu Liang, and Yiguang Hong. Generalized Nash equilibrium seeking strategy for distributed nonsmooth multi-cluster game. Automatica, 103:20–26, 2019.