This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Solving Monotone Variational Inequalities with Best Response Dynamics

Yu-Wen Chen, Can Kizilkale, Murat Arcak Y.-W. Chen, C. Kizilkale and M. Arcak are with the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA. {yuwen_chen, cankizilkale, arcak}@berkeley.edu. This work was supported by the National Science Foundation grant CNS-2135791.
Abstract

We leverage best response dynamics to solve monotone variational inequalities on compact and convex sets. Specialization of the method to variational inequalities in game theory recovers convergence results to Nash equilibria when agents select the best response to the current distribution of strategies. We apply the method to generalize population games with additional convex constraints. Furthermore, we explore the robustness of the method by introducing various types of time-varying disturbances.

©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, collecting new collected works for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

I INTRODUCTION

Variational inequalities, introduced in [1], cover wide-ranging problems in modeling equilibria and optimization problems featuring inequality constraints. In transportation, they represent traffic equilibria [2]. In operations research, they model equilibria in supply chains, network flows, and facility locations [3]. In finance, they reflect equilibrium prices, offering insights into financial market dynamics [4]. In game theory, they capture Nash equilibria of noncooperative games [5]. For atomic games, the Nash equilibria can be captured by variational inequalities derived from first order optimality conditions. In nonatomic (population) games, the equilibrium condition comes directly in the form of a variational inequality [6]. Many generalizations of variational inequalities have been proposed, including the complementary problem [7], the quasi variational inequality [8] and the general variational inequality [9].

To solve variational inequalities, constrained gradient descent methods are widely adopted in the literature [10, 11, 12]. These methods iteratively update the solution by incorporating both the gradient information of the objective function and the constraints to ensure that the iterates remain feasible. One such method is projected gradient descent, which optimizes constrained problems by iteratively adjusting solutions along the negative gradient direction and projecting them onto the feasible set. Another variant, the Frank-Wolfe method [13], solves the problem by iteratively moving towards the solution along linear approximations of the objective function. The Augmented Lagrangian Method [14] solves constrained optimization problems by iteratively updating the Lagrange multipliers while minimizing a penalized version of the objective function. It combines the benefits of penalty methods and the method of multipliers.

Finding the Nash equilibria in noncooperative games, as an application of variational inequalities, has its own track of research. In particular, the best response dynamics is one of the incentive-oriented methods, describing the iterative process where players sequentially update their strategies to minimize their cost based on the current strategies of others. If the process stops, meaning that no player benefits from changing unilaterally, then the Nash equilibrium is reached. The existing literature examines the convergence characteristics of the best response dynamics in game theory, notably in potential games as outlined in [15]. The best response dynamics for population games follow the method proposed in [16], where the best response describes the best strategy distribution to the present cost. The convergence results of the best response dynamics for population games have been well-studied for particular cost structures [17, 18, 19, 20].

The first contribution of this paper is to leverage the best response dynamics to solve general variational inequalities, recovering results for games as special cases. Furthermore, the method extends the scope of population games by allowing additional convex constraints, thus extending previous work for standard population games [17].

The second contribution is to account for disturbances in best response dynamics, moving beyond static forms of disturbance typically studied in variational inequalities. We introduce various types of time-varying disturbances and prove appropriate forms of Input-to-State Stability [21].

The remainder of the paper is organized as follows. In Section II, preliminary results are given. In Section III, solutions of variational inequalities are studied using the best response dynamics. In Section IV, various types of disturbance are introduced and the robustness of the best response dynamics is analyzed. In Section V, several applications are presented. Finally in Section VI, conclusions are given.

II PRELIMINARIES

In this section, we review variational inequalities and the best response dynamics.

II-A Variational Inequalities

Definition 1 (Variational Inequality).

Given a set KnK\subseteq\mathbb{R}^{n} and a mapping F:KnF:K\rightarrow\mathbb{R}^{n}, we say that xKx\in K solves the variational inequality, denoted as VI(K,F)VI(K,F), if

(yx)TF(x)0,yK.\displaystyle(y-x)^{T}F(x)\geq 0,\quad\forall y\in K. (1)

The set of all solutions is denoted as SOL(K,F)SOL(K,F).

Definition 2 (Strong Monotonicity / Monotonicity).

A function F:KnnF:K\subseteq\mathbb{R}^{n}\rightarrow\mathbb{R}^{n} is strongly monotone (or monotone) on KK if c>0\exists\ c>0 (or c=0c=0) s.t.

(xy)T(F(x)F(y))cxy2,x,yK.\displaystyle(x-y)^{T}\left(F(x)-F(y)\right)\geq c||x-y||^{2},\quad\forall x,y\in K. (2)

If FF is continuously differentiable, then (2) is equivalent to

zTDF(x)zcz2,xK,zTK,\displaystyle z^{T}DF(x)z\geq c||z||^{2},\quad\forall x\in K,z\in TK, (3)

where TKTK denotes the tangent space to KK.

Most existing solvers for variational inequalities are closely connected to constrained optimization. Especially, lots of them are based on the projected gradient iterative algorithms; e.g., [22] proposes the continuous version,

x˙\displaystyle\dot{x} =ΠTK(x)(F(x)),\displaystyle=\Pi_{TK(x)}(-F(x)), (4)

where TK(x)TK(x) denotes the tangent cone at xx.

II-B Best response dynamics

II-B1 Best response dynamics in atomic games

Consider nn agents. For i=1i=1 to nn, agent-ii has a did_{i}-dimensional optimization variable xidix_{i}\in\mathbb{R}^{d_{i}} and an objective function fi(xi,xi)f_{i}(x_{i},x_{-i}) with a constraint set 𝒳idi\mathcal{X}_{i}\subseteq\mathbb{R}^{d_{i}}, where xix_{-i} denotes all the optimization variables except xix_{i}. Then, the best response is defined as

argminxi𝒳ifi(xi,xi).\displaystyle\arg\min_{x_{i}\in\mathcal{X}_{i}}\ f_{i}(x_{i},x_{-i}). (5)

That is, the best response of an agent refers to a strategy that optimizes its objective function given the actions taken by the other agents.

The best response dynamics describes an iterative process where agents take turns making their best responses. This process iterates until no player has an incentive to deviate from its chosen strategy unilaterally. Thus, the resulting strategies form a Nash equilibrium. The Nash equilibrium is a solution to the variational inequality (1) where F=[f1,,fn]TF=[\nabla f_{1},\cdots,\nabla f_{n}]^{T} and KK is the Cartesian product set 𝒳1××𝒳n\mathcal{X}_{1}\times\cdots\times\mathcal{X}_{n} [23].

II-B2 Best response dynamics in population games

In a single population game, the set of strategies available to the agents is denoted as 𝒮={1,,n}\mathcal{S}=\{1,\cdots,n\}. Then, the social state is defined as x=(x1,,xn)+nx=(x_{1},\cdots,x_{n})\in\mathbb{R}^{n}_{+}, where xix_{i} represents the fraction of players choosing strategy ii. Note that the social state xx lies in the probability simplex X={v=(v1,,vn)+n:i=1nvi=1,vi0}X=\{v=(v_{1},\cdots,v_{n})\in\mathbb{R}^{n}_{+}:\sum_{i=1}^{n}v_{i}=1,v_{i}\geq 0\}. A cost function G:XnG:X\rightarrow\mathbb{R}^{n} maps the social state xx to a cost, denoted as π=G(x)\pi=G(x). For a state xXx\in X, the best response to cost π=G(x)\pi=G(x) is the strategy distribution that minimizes the total cost,

argminyXyTπ.\displaystyle\arg\min_{y\in X}\ y^{T}\pi.

Evolutionary dynamics models for population games assume each agent continually revises its strategy and revision opportunities follow a Poisson process. When the opportunity arises, the agent switches its strategy according to a probability distribution defined by a learning rule [18]. When the rule is to select the strategy with the lowest cost (“best response”), the resulting mean dynamics are

x˙(t)(argminyXyTπ(t))x(t),\displaystyle\dot{x}(t)\in\left(\arg\min_{y\in X}\ y^{T}\pi(t)\right)-x(t),

where π(t)=G(x(t))\pi(t)=G(x(t)).

The solution of the best response dynamics above has been proven to converge to Nash equilibria in various games, e.g., potential games, monotone games, and supermodular games; see [18] for a comprehensive review.

III BEST RESPONSE DYNAMICS FOR VARIATIONAL INEQUALITIES

We now generalize the best response dynamics to solve broader variational inequalities than those describing Nash equilibria in population games. To do so, we extend the feasible set of the best response from a probability simplex XX to an arbitrary compact and convex set KK.

Definition 3 (Best response).

Let KnK\subseteq\mathbb{R}^{n} be a compact and convex set. Given a vector πn\pi\in\mathbb{R}^{n}, the best response mapping β(π)\beta(\pi) is defined as

β(π)=argminyKyTπ.\displaystyle\beta(\pi)=\arg\min_{y\in K}\ y^{T}\pi. (6)

Note that β\beta is a set-valued map. Next we define the best response dynamics.

Definition 4 (Best response dynamics).

Given a cost trajectory π(t)\pi(t), the best response dynamics is the differential inclusion,

x˙(t)β(π(t))x(t),t0.\displaystyle\dot{x}(t)\in\beta(\pi(t))-x(t),\quad\forall t\geq 0. (7)

Since KK is convex, β(π(t))x(t)\beta(\pi(t))-x(t) belongs to the tangent cone at x(t)x(t) making KK invariant under (7). Therefore, we do not require projection steps onto KK to ensure feasibility as in projected gradient methods.

In the following, we use the best response dynamics to solve the variational inequality VI(K,F)VI(K,F).

Theorem 1.

Consider the variational inequality (1) where KnK\subseteq\mathbb{R}^{n} is compact and convex and F:KnF:K\rightarrow\mathbb{R}^{n} is C1C^{1} monotone. Then SOL(KK,FF) is globally asymptotically stable under the best response dynamics (7) with π(t)=F(x(t))\pi(t)=F(x(t)).

Proof.

Define the mapping on the right hand side of (7) as f(x):=β(F(x))xf(x):=\beta(F(x))-x. Since ff is upper-hemicontinous, nonempty, compact-valued, and convex-valued, there exists a Carathéodory solution x(t),t0x(t),\forall t\geq 0 [24].

Define the Lyapunov function candidate

V(x)=U(x,F(x)),\displaystyle V(x)=U(x,F(x)), (8)

where

U(x,π)=xTπminyKyTπ:=m(π),\displaystyle U(x,\pi)=x^{T}\pi-\underbrace{\min_{y\in K}\ y^{T}\pi}_{:=m(\pi)}, (9)

which is nonnegative on KK and vanishes only on SOL(K,F)SOL(K,F). To apply the nonsmooth Lyapunov techniques [25], we need to show that (9) is Lipschitz continuous w.r.t. its first and second arguments, respectively. The Lipschitzness w.r.t the first argument is from the affine form of (9). To prove the Lipschitzness of m(π)m(\pi) w.r.t. π\pi, we note that

m(π1)m(π2)\displaystyle m(\pi_{1})-m(\pi_{2}) =minyKyTπ1minyKyTπ2\displaystyle=\min_{y\in K}\ y^{T}\pi_{1}-\min_{y\in K}\ y^{T}\pi_{2}
minyKyT(π1π2)Mπ1π2,\displaystyle\geq\min_{y\in K}\ y^{T}(\pi_{1}-\pi_{2})\geq-M||\pi_{1}-\pi_{2}||,

where M=maxxKxM=\max_{x\in K}||x||. Similarly, m(π2)m(π1)Mπ2π1m(\pi_{2})-m(\pi_{1})\geq-M||\pi_{2}-\pi_{1}||, then the Lipschitz property: |m(π1)m(π2)|Mπ1π2|m(\pi_{1})-m(\pi_{2})|\leq M||\pi_{1}-\pi_{2}|| follows. Therefore, (9) is Lipschitz w.r.t. its second argument. Moreover, x(t)x(t) is absolutely continuous w.r.t. tt and, thus, so is π(t)\pi(t). As a result, for almost all points where VV is differentiable and x˙\dot{x} exists, we have

ddtV(x(t))\displaystyle\frac{d}{dt}V(x(t)) =Ux(x(t),π(t))x˙(t)+Uπ(x(t),π(t))π˙(t)\displaystyle=\frac{\partial U}{\partial x}(x(t),\pi(t))\dot{x}(t)+\frac{\partial U}{\partial\pi}(x(t),\pi(t))\dot{\pi}(t)
=π(t)Tx˙(t)+x(t)Tπ˙(t)mπ(π(t))π˙(t)\displaystyle=\pi(t)^{T}\dot{x}(t)+x(t)^{T}\dot{\pi}(t)-\frac{\partial m}{\partial\pi}(\pi(t))\dot{\pi}(t)
=(a)π(t)Tx˙(t)+(x(t)β(π(t)))Tπ˙(t)\displaystyle\stackrel{{\scriptstyle(a)}}{{=}}\pi(t)^{T}\dot{x}(t)+\left(x(t)-\beta(\pi(t))\right)^{T}\dot{\pi}(t)
=π(t)Tx˙(t)x˙(t)DF(x(t))x˙(t)0, by (3)\displaystyle=\pi(t)^{T}\dot{x}(t)-\underbrace{\dot{x}(t)\ DF(x(t))\ \dot{x}(t)}_{\geq 0,\text{ by (\ref{eq:PD})}}
π(t)T(x(t)+β(π(t)))=V(x(t)),\displaystyle\leq\pi(t)^{T}\left(-x(t)+\beta(\pi(t))\right)=-V(x(t)), (10)

where (a)(a) follows from the Envelope Theorem [26]. Using Clarke’s generalized gradient [27], we can extend the inequality (10) to hold for almost all tt. Finally, by the Theorem A.2 in [17], we have V(x(t))0V(x(t))\rightarrow 0. Therefore, SOL(K,F)SOL(K,F) is globally asymptotically stable. ∎

When we specialize variational inequalities to population games, where KK is the probability simplex, SOL(K,F)SOL(K,F) characterizes the Nash equilibria. Thus, the result in [17] is a special case of 1.

A connection between (4) and (7) can be made as follows. If FF is the gradient of a function, then (4) is the projected gradient descent method and (7) is the Frank-Wolfe method. Although (4) and (7) have comparable complexity, (4) requires a projection onto the varying tangent cone at the current point TK(x)TK(x) to remain feasible but (7) only needs to select a point in the static set KK and automatically remains feasible. This feature brings benefits to analysis, especially when disturbances are considered in the following section.

IV ROBUSTNESS ANALYSIS

In 1, the function FF is perfectly known. We now consider a time-varying disturbance Δ(t)\Delta(t) which perturbs F(x(t))F(x(t)) into F(x(t))+Δ(t)F(x(t))+\Delta(t). Then, the best response dynamics become

x˙(t)β(π~(t))x(t),π~(t)=F(x(t))+Δ(t).\displaystyle\dot{x}(t)\in\beta(\tilde{\pi}(t))-x(t),\quad\tilde{\pi}(t)=F(x(t))+\Delta(t). (11)

This form of the disturbance alters the trajectory x(t)x(t) indirectly via changes to the cost function FF. In contrast, a disturbance ε(t)\varepsilon(t) may affect x(t)x(t) by directly perturbing the best response dynamics into

x˙(t)β(π(t))x(t)+ε(t),π(t)=F(x(t)).\displaystyle\dot{x}(t)\in\beta(\pi(t))-x(t)+\varepsilon(t),\quad\pi(t)=F(x(t)). (12)
Definition 5 (Cost disturbance and Dynamics disturbance).

We call disturbances appearing as in (11) cost disturbances and those as in (12) dynamics disturbances.

To analyze the dynamics disturbance in (12), we assume that the disturbance does not violate the constraints. For example, in a congestion game, the distribution of the traffic flows over routes add up to the total demand despite disturbances in the dynamics governing the evolution of flows.

Definition 6 (Admissible Dynamics Disturbance).

For (12), a dynamics disturbance ε(t)\varepsilon(t) is admissible if it is piecewise continuous, bounded, and satisfies

x˙(t)TK(x(t)),t0,\displaystyle\dot{x}(t)\in TK(x(t)),\quad\forall t\geq 0,

that is, the resulting trajectory x(t)x(t) will remain in KK.

In the following, we explore the general case where the best response dynamics are subject to both dynamics disturbances and cost disturbances. We show that the effects of disturbances are captured by the notion of Input-to-State Stability. Denote comparison functions class-𝒦\mathcal{K}, class-𝒦\mathcal{K}_{\infty}, and class-𝒦\mathcal{KL}, by 𝒦\mathcal{K}, 𝒦\mathcal{K}_{\infty} and 𝒦\mathcal{KL}, respectively [28].

Theorem 2.

Consider best response dynamics subject to a dynamics disturbance ε(t)\varepsilon(t) and a cost disturbance Δ(t)\Delta(t),

x˙(t)β(π~(t))x(t)+ε(t),π~(t)=F(x(t))+Δ(t),\displaystyle\dot{x}(t)\in\beta(\tilde{\pi}(t))-x(t)+\varepsilon(t),\quad\tilde{\pi}(t)=F(x(t))+\Delta(t),

where ε(t)\varepsilon(t) is admissible and Δ(t)\Delta(t) is bounded and with a bounded derivative. Let KnK\subseteq\mathbb{R}^{n} be compact and convex and F:KnF:K\rightarrow\mathbb{R}^{n} be C1C^{1} strongly monotone. Then,

||x(t)\displaystyle||x(t) x||ω(||x(0)x||,t)\displaystyle-x^{*}||\leq\omega(||x(0)-x^{*}||,t)
+γ1(max{ε,Δ})+γ2(Δ˙),\displaystyle+\gamma_{1}\left(\max\left\{||\varepsilon||_{\infty},||\Delta||_{\infty}\right\}\right)+\gamma_{2}(||\dot{\Delta}||_{\infty}), (13)

where xx^{*} is the unique solution of VI(K,F)VI(K,F), ω𝒦\omega\in\mathcal{KL}, and γ1,γ2𝒦\gamma_{1},\gamma_{2}\in\mathcal{K}. In particular, if ε(t)0\varepsilon(t)\rightarrow 0, Δ(t)0\Delta(t)\rightarrow 0, and Δ˙(t)0\dot{\Delta}(t)\rightarrow 0, then x(t)xx(t)\rightarrow x^{*}.

Proof.

Similar to the proof of 1, we define

V1(x)\displaystyle V_{1}(x) =U1(x,F(x)),\displaystyle=U_{1}(x,F(x)), (14)
V2(x,t)\displaystyle V_{2}(x,t) =U2(x,F(x),Δ(t)),\displaystyle=U_{2}(x,F(x),\Delta(t)), (15)

where

U1(x,π)\displaystyle U_{1}(x,\pi) =xTπminyKyTπ,\displaystyle=x^{T}\pi-\min_{y\in K}\ y^{T}\pi, (16)
U2(x,π,Δ)\displaystyle U_{2}(x,\pi,\Delta) =xT(π+Δ)minyKyT(π+Δ).\displaystyle=x^{T}(\pi+\Delta)-\min_{y\in K}\ y^{T}(\pi+\Delta). (17)

We first derive the following relations:

V1(x)\displaystyle V_{1}(x) α1(xx),\displaystyle\geq\alpha_{1}(||x-x^{*}||), (18)
V2(x,t)\displaystyle V_{2}(x,t) α2(xx)+DKΔ(t),\displaystyle\leq\alpha_{2}(||x-x^{*}||)+D_{K}||\Delta(t)||, (19)

for some α1,α2𝒦\alpha_{1},\alpha_{2}\in\mathcal{K}_{\infty} and DK>0D_{K}>0 is the diameter of the compact and convex set KK. Note that xx^{*} is the unique solution to VI(K,F)VI(K,F). Denote π=F(x)\pi^{*}=F(x^{*}), then we have xTπminyKyTπ=0{x^{*}}^{T}\pi^{*}-\min_{y\in K}\ y^{T}\pi^{*}=0. As a result, we have

V1(x)\displaystyle V_{1}(x) =xTπminyKyTπ+xTπminyKyTπ\displaystyle=x^{T}\pi-\min_{y\in K}\ y^{T}\pi+{x^{*}}^{T}\pi^{*}-\min_{y\in K}\ y^{T}\pi^{*}
=(xx)T(ππ)+(xTπminyKyTπ)\displaystyle=(x-x^{*})^{T}(\pi-\pi^{*})+({x^{*}}^{T}\pi-\min_{y\in K}\ y^{T}\pi)
+(xTπminyKyTπ)\displaystyle\hskip 20.0pt+(x^{T}\pi^{*}-\min_{y\in K}\ y^{T}\pi^{*})
cxx2:=α1(xx),\displaystyle\geq c||x-x^{*}||^{2}:=\alpha_{1}(||x-x^{*}||), (20)

where π\pi is attached to F(x)F(x). On the other hand,

V2(\displaystyle V_{2}( x,t)=xT(π+Δ(t))minyKyT(π+Δ(t))\displaystyle x,t)=x^{T}(\pi+\Delta(t))-\min_{y\in K}\ y^{T}(\pi+\Delta(t))
xTπ+minyKyTπ\displaystyle\hskip 35.0pt-{x^{*}}^{T}\pi^{*}+\min_{y\in K}\ y^{T}\pi^{*}
=xT(π+Δ(t))xTπy1T(π+Δ(t))+y2Tπ\displaystyle=x^{T}(\pi+\Delta(t))-{x^{*}}^{T}\pi^{*}-y_{1}^{T}(\pi+\Delta(t))+y_{2}^{T}\pi^{*}
(xx)T(ππ)+(xx)Tπ\displaystyle\leq(x-x^{*})^{T}(\pi-\pi^{*})+(x-x^{*})^{T}\pi^{*}
+xT(ππ)y1T(ππ)+(xy1)TΔ(t)\displaystyle\hskip 10.0pt+{x^{*}}^{T}(\pi-\pi^{*})-y_{1}^{T}(\pi-\pi^{*})+(x-y_{1})^{T}\Delta(t)
Lxx2+Mxx:=α2(xx)+DKΔ(t),\displaystyle\leq\underbrace{L||x-x^{*}||^{2}+M||x-x^{*}||}_{:=\alpha_{2}(||x-x^{*}||)}+D_{K}||\Delta(t)||, (21)

where π\pi is attached to F(x)F(x), y1=argminyKyT(π+Δ(t))y_{1}=\arg\min_{y\in K}\ y^{T}(\pi+\Delta(t)), and y2=argminyKyTπy_{2}=\arg\min_{y\in K}\ y^{T}\pi^{*}. The last inequality with L,M>0L,M>0 follows from the Cauchy-Schwartz inequality, compactness of KK, and Lipschitzness of FF. Next, we prove

V2(x,t)V1(x)DKΔ(t).\displaystyle||V_{2}(x,t)-V_{1}(x)||\leq D_{K}||\Delta(t)||. (22)
V2(x,t)V1(x)\displaystyle V_{2}(x,t)-V_{1}(x) =xTΔ(t)+y1Tπy2T(π+Δ(t))\displaystyle=x^{T}\Delta(t)+y_{1}^{T}\pi-y_{2}^{T}(\pi+\Delta(t))
xTΔ(t)+y2Tπy2T(π+Δ(t))\displaystyle\leq x^{T}\Delta(t)+y_{2}^{T}\pi-y_{2}^{T}(\pi+\Delta(t))
DKΔ(t),\displaystyle\leq D_{K}||\Delta(t)||,

where π\pi is attached to F(x)F(x), y1=argminyKyTπy_{1}=\arg\min_{y\in K}y^{T}\pi, and y2=argminyKyT(π+Δ(t))y_{2}=\arg\min_{y\in K}y^{T}(\pi+\Delta(t)). Similarly, V2(x,t)V1(x)(xy1)TΔDKΔV_{2}(x,t)-V_{1}(x)\geq(x-y_{1})^{T}\Delta\geq-D_{K}||\Delta||. Thus, (22) is proved.

In the following, we will derive a bound for the trajectory V2(x(t),t)V_{2}(x(t),t). We slightly abuse the notation by abbreviating x(t)x(t), π(t)\pi(t), Δ(t)\Delta(t) as xx, π\pi, and Δ\Delta, respectively.

ddt\displaystyle\frac{d}{dt} V2(x(t),t)=x˙T(π+Δ)+(xβ(π+Δ))T(π˙+Δ˙)\displaystyle V_{2}(x(t),t)=\dot{x}^{T}(\pi+\Delta)+\left(x-\beta(\pi+\Delta)\right)^{T}(\dot{\pi}+\dot{\Delta})
=V2+εT(π+Δ)+(xβ(π+Δ))TΔ˙\displaystyle=-V_{2}+\varepsilon^{T}(\pi+\Delta)+\left(x-\beta(\pi+\Delta)\right)^{T}\dot{\Delta}
(β(π+Δ)x)TDF(x)(β(π+Δ)x+ε)\displaystyle\hskip 20.0pt-\left(\beta(\pi+\Delta)-x\right)^{T}DF(x)\left(\beta(\pi+\Delta)-x+\varepsilon\right)
V2cxβ(π+Δ)2+xβ(π+Δ)Δ˙\displaystyle\leq-V_{2}-c||x-\beta(\pi+\Delta)||^{2}+||x-\beta(\pi+\Delta)||||\dot{\Delta}||
+σxβ(π+Δ)ε+εT(π+Δ)\displaystyle\hskip 20.0pt+\sigma||x-\beta(\pi+\Delta)||||\varepsilon||+\varepsilon^{T}(\pi+\Delta)
V2+12cΔ˙2+σ22cε2+M1ε+εΔ:=Γ(t),\displaystyle\leq-V_{2}+\underbrace{\frac{1}{2c}||\dot{\Delta}||^{2}+\frac{\sigma^{2}}{2c}||\varepsilon||^{2}+M_{1}||\varepsilon||+||\varepsilon||||\Delta||}_{:=\Gamma(t)},

where cc is from 2, σ=DF2\sigma=||DF||_{2}, and M1=maxxKπ(x)M_{1}=\max_{x\in K}||\pi(x)||. Then, by Comparison Lemma [28], we get

V2(x(t),t)\displaystyle V_{2}(x(t),t) V2(x(0),0)et+Γ.\displaystyle\leq V_{2}\left(x(0),0\right)e^{-t}+||\Gamma||_{\infty}.

Therefore by (22), we have

V1(x(t))V2(x(0),0)et+Γ+DKΔ.\displaystyle V_{1}(x(t))\leq V_{2}\left(x(0),0\right)e^{-t}+||\Gamma||_{\infty}+D_{K}||\Delta||_{\infty}. (23)

Finally, (13) follows from (18), (19), and (23). ∎

We next study a state-dependent perturbation δ\delta on FF:

x˙(t)β(π~(t))x(t),π~(t)=(F+δ)(x(t)),\displaystyle\dot{x}(t)\in\beta(\tilde{\pi}(t))-x(t),\quad\tilde{\pi}(t)=(F+\delta)(x(t)), (24)

Although we can analyze (24) with 2 by setting Δ(t)=δ(x(t))\Delta(t)=\delta(x(t)) and ε(t)=0\varepsilon(t)=0, in the next theorem we exploit the additional structure to derive a refined bound.

Theorem 3 (State-dependent cost disturbance).

Consider (24). Let KnK\subseteq\mathbb{R}^{n} be compact and convex, F:KnF:K\rightarrow\mathbb{R}^{n} be C1C^{1} strongly monotone, and δ:Kn\delta:K\rightarrow\mathbb{R}^{n} be such that π~\tilde{\pi} is C1C^{1} strongly monotone. Then, the dynamics converge to a new perturbed equilibrium point x~\tilde{x}^{*} and

xx~α11(h(x~)),\displaystyle||x^{*}-\tilde{x}^{*}||\leq\alpha_{1}^{-1}\left(h(\tilde{x}^{*})\right), (25)

where xx^{*} is the unique solution of unperturbed VI(K,F)VI(K,F) and α1𝒦\alpha_{1}\in\mathcal{K}_{\infty}. The function hh is given as

h(x)=max{(zx)Tδ(x):z=argminyKyTF(x)}.\displaystyle h(x)=\max\left\{(z-x)^{T}\delta(x):z=\arg\min_{y\in K}\ y^{T}F(x)\right\}.
Proof.

The proof is similar to Theorem 2. Define

V1(x)\displaystyle V_{1}(x) =U(x,F(x))\displaystyle=U(x,F(x)) (26)
V2(x)\displaystyle V_{2}(x) =U(x,(F+δ)(x))\displaystyle=U(x,(F+\delta)(x)) (27)

where

U(x,π)\displaystyle U(x,\pi) =xTπminyKyTπ.\displaystyle=x^{T}\pi-\min_{y\in K}\ y^{T}\pi. (28)

Refer to the prove of (18) and (22), then we have

α1(xx)V1(x)\displaystyle\alpha_{1}(||x-x^{*}||)\leq V_{1}(x) V2(x)+(zx)Tδ(x),\displaystyle\leq V_{2}(x)+(z-x)^{T}\delta(x),

where z=argminyKyTπz=\arg\min_{y\in K}\ y^{T}\pi. By Theorem 1, with the dynamics (24), x(t)x(t) converges to a perturbed equilibrium x~\tilde{x}^{*} satisfying V2(x~)=0V_{2}(\tilde{x}^{*})=0. As a result, (25) follows. ∎

Remark 1.

The model (24) encompasses the perturbed best response dynamics, which was used in [29] to study a smooth approximation of the best response dynamics. The perturbed best response dynamics is defined as

x˙(t)B~(x(t))x(t),B~(x(t))=argminyXyTπ(t)+H(y),\displaystyle\dot{x}(t)\in\tilde{B}(x(t))-x(t),\tilde{B}(x(t))=\arg\min_{y\in X}y^{T}\pi(t)+H(y),

where π(t)=F(x(t))\pi(t)=F(x(t)) and HH is strictly convex, twice differentiable, and with its magnitude of gradient approaching infinity near the boundary of XX. For the optimization problem B~\tilde{B}, if instead of directly picking the argmin\arg\min, we apply the Frank-Wolfe method, then the point selected is

argminyXyT(F(x(t))+Hy(x(t))),\displaystyle\arg\min_{y\in X}\ y^{T}\left(F(x(t))+\frac{\partial H}{\partial y}(x(t))\right),

which is covered by (24) with δ(x(t))=Hy(x(t))\delta(x(t))=\frac{\partial H}{\partial y}(x(t)). Since HH is strictly convex, δ\delta is strictly monotone. Therefore, 3 reproduces the convergence results for perturbed best response dynamics [29]. In addition, it provides a bound on the perturbed distance.

V APPLICATIONS

V-A Traffic network

Consider the network in Fig. 1, which includes 3 routes and 3 Y-intersections with traffic lights shown in boxes with crosses. The flows on each route are denoted x1x_{1}, x2x_{2}, and x3x_{3}, respectively, and their sum is one. We zoom in on one of the Y-intersections, shown in Fig. 2, to explain the effects of traffic lights. An actuated traffic light results in the interdependency of Link 1 and Link 2 delays on each others’ flows. Assuming the light prioritizes the branch line, we let Φ1(x1,x2)=x1+3x2\Phi_{1}(x_{1},x_{2})=x_{1}+3x_{2} and Φ2(x2,x1)=x1+x2\Phi_{2}(x_{2},x_{1})=x_{1}+x_{2}. For the rest of the delay functions in the network, we model similarly as Φ3(x,y)=Φ5(x,y)=x+3y\Phi_{3}(x,y)=\Phi_{5}(x,y)=x+3y and Φ4(x,y)=Φ6(x,y)=x+y\Phi_{4}(x,y)=\Phi_{6}(x,y)=x+y. Then, the total delays on each route are 2x1+3x2+x32x_{1}+3x_{2}+x_{3}, x1+2x2+3x3x_{1}+2x_{2}+3x_{3}, and 3x1+x2+2x33x_{1}+x_{2}+2x_{3}, respectively. Thus, the equilibrium of the traffic network is characterized by the variational inequality (1) where KK is the probability simplex and

F(x)=[2x1+3x2+x3x1+2x2+3x33x1+x2+2x3].\displaystyle F(x)=\begin{bmatrix}2x_{1}+3x_{2}+x_{3}\\ x_{1}+2x_{2}+3x_{3}\\ 3x_{1}+x_{2}+2x_{3}\end{bmatrix}.

Note that this is not a potential game but a monotone game, since DFDF is not symmetric but DF+DFTDF+DF^{T} is positive semidefinite. The simulation result is provided in Fig. 3(a) where the best response dynamics lead to a spiral trajectory and converge to the equilibrium point [13,13,13]T[\frac{1}{3},\frac{1}{3},\frac{1}{3}]^{T}.

Refer to caption
Figure 1: The network is composed of three Y-intersections. Three possible routes are given. The delay functions are provided in the contexts.
Refer to caption
Figure 2: The figure describes the Y-intersection where the traffic lights induce delay. The traffic flows are in red, while the delay functions are in black. The branch line, Link 2, and the mainline, Link 1, affect each other through the traffic light making their delay functions depend on x1x_{1} and x2x_{2}.

As an illustration of the case where KK is a subset of the simplex, we add a capacity constraint on Link 7: x2+x30.9x_{2}+x_{3}\leq 0.9, and delay constraints on Link 3 and Link 5: Φ3(x2,x3)=x2+3x32\Phi_{3}(x_{2},x_{3})=x_{2}+3x_{3}\leq 2 and Φ5(x3,x1)=x3+3x12\Phi_{5}(x_{3},x_{1})=x_{3}+3x_{1}\leq 2. Although the feasible set is no longer a simplex, 1 ensures convergence. Simulation results are given in Fig. 3(b).

Refer to caption
(a) without constraints
Refer to caption
(b) with constraints
Figure 3: Since the trajectory evolves on the 3-dimensional probability simplex, we project it and draw it on a plane. Compared with Fig. 3(a), the trajectory in Fig. 3(b) changes drastically and remains in the feasible set.

V-B Cost disturbances and dynamics disturbances

In the following, we first discuss congestion games with cost disturbances and then dynamics disturbances.

V-B1 Cost disturbances

In standard congestion games, the delays are merely based on the flow. However, there may be other time-varying factors that affect the delays. For example, weather conditions cause additional delays, and accidents increase the delays for a period until the road is cleared. To capture these scenarios, we introduce a bounded time-varying cost disturbance Δ(t)\Delta(t) with bounded derivative Δ˙(t)\dot{\Delta}(t).

A congestion game is strongly monotone if all the delay functions are strictly increasing. In this case, by Theorem 2, the convergence results of best response dynamics are robust to a certain extent of cost disturbances. The settings of the simulation are described in Fig. 5, where the corresponding FF for the variational inequality is F(x)=[2x1+0.5x3+0.3,1.5x2+0.5x3+0.5,0.5x1+0.5x2+2x3+0.6]TF(x)=[2x_{1}+0.5x_{3}+0.3,1.5x_{2}+0.5x_{3}+0.5,0.5x_{1}+0.5x_{2}+2x_{3}+0.6]^{T}.

Refer to caption
Figure 4: delay functions for links
Refer to caption
Figure 5: without disturbance

The results of solving the variational inequality by the best response dynamics without cost disturbances are provided in Fig. 5. Note that the function FF is strongly monotone and thus admits a unique Nash equilibrium.

Next, we give simulation results with two examples of cost disturbances. The first one is the periodic cost disturbance, Δ1(t)=[0.1sin(0.01t),0.05cos(0.02t+10),0.15sin(0.05t20)]T\Delta_{1}(t)=[0.1\sin(0.01t),-0.05\cos(0.02t+10),0.15\sin(0.05t-20)]^{T}. The result is shown in Fig. 6. Since this disturbance is periodic and does not vanish, the trajectory reaches a neighborhood of the unperturbed equilibrium rather than the equilibrium itself. The second example is the diminishing cost disturbance, Δ2(t)=20e0.01tΔ1(t)\Delta_{2}(t)=20e^{-0.01t}\Delta_{1}(t). The simulation result is provided in Fig. 7. Since the disturbance is large in the early stage, the trajectory moves away from the original trajectory initially. However, as the disturbance vanishes, the trajectory gradually converges to the non-disturbed one as in Fig. 5.

Refer to caption
Figure 6: A periodic cost disturbance is injected. The state trajectory moves around the unperturbed equilibrium as in Fig. 5.
Refer to caption
Figure 7: A diminishing cost disturbance is introduced. Initially, when the disturbance is large, the trajectory is driven away from the original unperturbed trajectory. However, as the disturbance vanishes, the trajectory gradually converges to the unperturbed one as in Fig. 5

V-B2 Dynamics disturbances

Next we simulate the system with a dynamics disturbance ε(t)=[0.7sin(0.1t),0.7cos(0.2t10),0.7sin(0.1t)0.7cos(0.2t10)]T\varepsilon(t)=[0.7\sin(0.1t),0.7\cos(0.2t-10),-0.7\sin(0.1t)-0.7\cos(0.2t-10)]^{T}. Note that ε(t)\varepsilon(t) is admissible as defined in Definition 6. By Theorem 2, the resulting trajectory should remain close to the unperturbed one as in Fig. 5. The simulation results are provided in Fig. 8.

Refer to caption
Figure 8: A periodic dynamics disturbance is added. The resulting trajectory remains close to the unperturbed one as in Fig. 5.

VI CONCLUSIONS AND FUTURE WORK

In this paper, we leveraged the best response dynamics to solve monotone variational inequalities and provided robustness guarantees against classes of disturbances. When the set KK is a subset of the probability simplex, as in our example, x(t)x(t) can be viewed as the distribution of the agents to the strategies in a population game with constraints. In this special case, it would be possible to study learning rules other than the best response, such as those reviewed in [30]. Note, however, that our convergence results apply to any compact and convex set KK; therefore, they are not restricted to subsets of the standard simplex used in population games. This flexibility allows for addressing population games where the entries of x(t)x(t) do not necessarily sum to a constant. This would allow “outside options,” e.g., choosing not to drive in a congestion game. Future research will address such scenarios as well as the incorporation of other learning rules.

References

  • [1] J. L. Lions and G. Stampacchia, “Variational inequalities,” Communications on Pure and Applied Mathematics, vol. 20, no. 3, pp. 493–519, 1967.
  • [2] S. Dafermos, “Traffic equilibrium and variational inequalities,” Transportation Science, vol. 14, no. 1, pp. 42–54, 1980.
  • [3] E. Allevi, A. Gnudi, I. V. Konnov, and G. Oggioni, “Evaluating the effects of environmental regulations on a closed-loop supply chain network: A variational inequality approach,” Annuals of Operations Research, vol. 261, no. 1, pp. 1–43, 2018.
  • [4] A. Nagurney, Network economics: A variational inequality approach.   Springer Science & Business Media, 2013, vol. 10.
  • [5] F. Facchinei and J.-S. Pang, Finite-dimensional variational inequalities and complementarity problems.   Springer, 2003.
  • [6] S. Sorin and C. Wan, “Finite composite games: Equilibria and dynamics,” Journal of Dynamics and Games, vol. 3, no. 1, pp. 101–120, 2016.
  • [7] C. E. Lemke, “Bimatrix equilibrium points and mathematical programming,” Management Science, vol. 11, no. 7, pp. 681–689, 1965.
  • [8] D. Chan and J. S. Pang, “The generalized quasi-variational inequality problem,” Mathematics of Operations Research, vol. 7, no. 2, pp. 211–222, 1982.
  • [9] M. A. Noor, “General variational inequalities,” Applied Mathematics Letters, vol. 1, no. 2, pp. 119–122, 1988.
  • [10] D. V. Hieu, J. J. Strodiot, and L. D. Muu, “An explicit extragradient algorithm for solving variational inequalities,” Journal of Optimization Theory and Applications, vol. 185, pp. 476–503, 2020.
  • [11] G. Gidel, T. Jebara, and S. Lacoste-Julien, “Frank-Wolfe Algorithms for Saddle Point Problems,” in International Conference on Artificial Intelligence and Statistics, 2017.
  • [12] A. Allibhoy and J. Cortés, “Safety-critical control as a design paradigm for anytime solvers of variational inequalities,” in IEEE 61st Conference on Decision and Control (CDC), 2022, pp. 6211–6216.
  • [13] M. Frank and P. Wolfe, “An algorithm for quadratic programming,” Naval Research Logistics Quarterly, vol. 3, no. 1-2, pp. 95–110, 1956.
  • [14] J. Nocedal and S. J. Wright, Numerical Optimization.   Springer, 2006.
  • [15] D. Monderer and L. S. Shapley, “Potential games,” Games and Economic Behavior, vol. 14, no. 1, pp. 124–143, 1996.
  • [16] A. Matsui, “Best response dynamics and socially stable strategies,” Journal of Economic Theory, vol. 57, no. 2, pp. 343–362, 1992.
  • [17] J. Hofbauer and W. H. Sandholm, “Stable games and their dynamics,” Journal of Economic Theory, vol. 144, no. 4, pp. 1665–1693, 2009.
  • [18] W. H. Sandholm, Population games and evolutionary dynamics.   MIT press, 2010.
  • [19] M. J. Fox and J. S. Shamma, “Population games, stable games, and passivity,” Games, vol. 4, no. 4, pp. 561–583, 2013.
  • [20] M. Arcak and N. C. Martins, “Dissipativity tools for convergence to nash equilibria in population games,” IEEE Transactions on Control of Network Systems, vol. 8, pp. 39–50, 2020.
  • [21] E. Sontag, “Smooth stabilization implies coprime factorization,” IEEE Transactions on Automatic Control, vol. 34, no. 4, pp. 435–443, 1989.
  • [22] A. Nagurney and D. Zhang, Projected dynamical systems and variational inequalities with applications.   Springer Science & Business Media, 2012, vol. 2.
  • [23] L. Pavel, “Dissipativity theory in game theory: On the role of dissipativity and passivity in nash equilibrium seeking,” IEEE Control Systems Magazine, vol. 42, no. 3, pp. 150–164, 2022.
  • [24] J. P. Aubin and A. Cellina, Differential Inclusions: Set-valued Maps and Viability Theory.   Springer-Verlag, 1984.
  • [25] D. Shevitz and B. Paden, “Lyapunov stability theory of nonsmooth systems,” IEEE Transactions on Automatic Control, vol. 39, no. 9, pp. 1910–1914, 1994.
  • [26] P. Milgrom and I. Segal, “Envelope theorems for arbitrary choice sets,” Econometrica, vol. 70, no. 2, pp. 583–601, 2002.
  • [27] F. H. Clarke, “Generalized gradients and applications,” Transactions of the American Mathematical Society, vol. 205, pp. 247–262, 1975.
  • [28] H. Khalil, Nonlinear Systems, ser. Pearson Education.   Prentice Hall, 2002.
  • [29] J. Hofbauer and W. H. Sandholm, “Evolution in games with randomly disturbed payoffs,” Journal of Economic Theory, vol. 132, no. 1, pp. 47–69, 2007.
  • [30] S. Park, N. C. Martins, and J. S. Shamma, “From population games to payoff dynamics models: A passivity-based approach,” in IEEE 58th Conference on Decision and Control (CDC), 2019, pp. 6584–6601.