Solving Monotone Variational Inequalities with Best Response Dynamics
Abstract
We leverage best response dynamics to solve monotone variational inequalities on compact and convex sets. Specialization of the method to variational inequalities in game theory recovers convergence results to Nash equilibria when agents select the best response to the current distribution of strategies. We apply the method to generalize population games with additional convex constraints. Furthermore, we explore the robustness of the method by introducing various types of time-varying disturbances.
I INTRODUCTION
Variational inequalities, introduced in [1], cover wide-ranging problems in modeling equilibria and optimization problems featuring inequality constraints. In transportation, they represent traffic equilibria [2]. In operations research, they model equilibria in supply chains, network flows, and facility locations [3]. In finance, they reflect equilibrium prices, offering insights into financial market dynamics [4]. In game theory, they capture Nash equilibria of noncooperative games [5]. For atomic games, the Nash equilibria can be captured by variational inequalities derived from first order optimality conditions. In nonatomic (population) games, the equilibrium condition comes directly in the form of a variational inequality [6]. Many generalizations of variational inequalities have been proposed, including the complementary problem [7], the quasi variational inequality [8] and the general variational inequality [9].
To solve variational inequalities, constrained gradient descent methods are widely adopted in the literature [10, 11, 12]. These methods iteratively update the solution by incorporating both the gradient information of the objective function and the constraints to ensure that the iterates remain feasible. One such method is projected gradient descent, which optimizes constrained problems by iteratively adjusting solutions along the negative gradient direction and projecting them onto the feasible set. Another variant, the Frank-Wolfe method [13], solves the problem by iteratively moving towards the solution along linear approximations of the objective function. The Augmented Lagrangian Method [14] solves constrained optimization problems by iteratively updating the Lagrange multipliers while minimizing a penalized version of the objective function. It combines the benefits of penalty methods and the method of multipliers.
Finding the Nash equilibria in noncooperative games, as an application of variational inequalities, has its own track of research. In particular, the best response dynamics is one of the incentive-oriented methods, describing the iterative process where players sequentially update their strategies to minimize their cost based on the current strategies of others. If the process stops, meaning that no player benefits from changing unilaterally, then the Nash equilibrium is reached. The existing literature examines the convergence characteristics of the best response dynamics in game theory, notably in potential games as outlined in [15]. The best response dynamics for population games follow the method proposed in [16], where the best response describes the best strategy distribution to the present cost. The convergence results of the best response dynamics for population games have been well-studied for particular cost structures [17, 18, 19, 20].
The first contribution of this paper is to leverage the best response dynamics to solve general variational inequalities, recovering results for games as special cases. Furthermore, the method extends the scope of population games by allowing additional convex constraints, thus extending previous work for standard population games [17].
The second contribution is to account for disturbances in best response dynamics, moving beyond static forms of disturbance typically studied in variational inequalities. We introduce various types of time-varying disturbances and prove appropriate forms of Input-to-State Stability [21].
The remainder of the paper is organized as follows. In Section II, preliminary results are given. In Section III, solutions of variational inequalities are studied using the best response dynamics. In Section IV, various types of disturbance are introduced and the robustness of the best response dynamics is analyzed. In Section V, several applications are presented. Finally in Section VI, conclusions are given.
II PRELIMINARIES
In this section, we review variational inequalities and the best response dynamics.
II-A Variational Inequalities
Definition 1 (Variational Inequality).
Given a set and a mapping , we say that solves the variational inequality, denoted as , if
(1) |
The set of all solutions is denoted as .
Definition 2 (Strong Monotonicity / Monotonicity).
A function is strongly monotone (or monotone) on if (or ) s.t.
(2) |
If is continuously differentiable, then (2) is equivalent to
(3) |
where denotes the tangent space to .
Most existing solvers for variational inequalities are closely connected to constrained optimization. Especially, lots of them are based on the projected gradient iterative algorithms; e.g., [22] proposes the continuous version,
(4) |
where denotes the tangent cone at .
II-B Best response dynamics
II-B1 Best response dynamics in atomic games
Consider agents. For to , agent- has a -dimensional optimization variable and an objective function with a constraint set , where denotes all the optimization variables except . Then, the best response is defined as
(5) |
That is, the best response of an agent refers to a strategy that optimizes its objective function given the actions taken by the other agents.
The best response dynamics describes an iterative process where agents take turns making their best responses. This process iterates until no player has an incentive to deviate from its chosen strategy unilaterally. Thus, the resulting strategies form a Nash equilibrium. The Nash equilibrium is a solution to the variational inequality (1) where and is the Cartesian product set [23].
II-B2 Best response dynamics in population games
In a single population game, the set of strategies available to the agents is denoted as . Then, the social state is defined as , where represents the fraction of players choosing strategy . Note that the social state lies in the probability simplex . A cost function maps the social state to a cost, denoted as . For a state , the best response to cost is the strategy distribution that minimizes the total cost,
Evolutionary dynamics models for population games assume each agent continually revises its strategy and revision opportunities follow a Poisson process. When the opportunity arises, the agent switches its strategy according to a probability distribution defined by a learning rule [18]. When the rule is to select the strategy with the lowest cost (“best response”), the resulting mean dynamics are
where .
The solution of the best response dynamics above has been proven to converge to Nash equilibria in various games, e.g., potential games, monotone games, and supermodular games; see [18] for a comprehensive review.
III BEST RESPONSE DYNAMICS FOR VARIATIONAL INEQUALITIES
We now generalize the best response dynamics to solve broader variational inequalities than those describing Nash equilibria in population games. To do so, we extend the feasible set of the best response from a probability simplex to an arbitrary compact and convex set .
Definition 3 (Best response).
Let be a compact and convex set. Given a vector , the best response mapping is defined as
(6) |
Note that is a set-valued map. Next we define the best response dynamics.
Definition 4 (Best response dynamics).
Given a cost trajectory , the best response dynamics is the differential inclusion,
(7) |
Since is convex, belongs to the tangent cone at making invariant under (7). Therefore, we do not require projection steps onto to ensure feasibility as in projected gradient methods.
In the following, we use the best response dynamics to solve the variational inequality .
Theorem 1.
Proof.
Define the mapping on the right hand side of (7) as . Since is upper-hemicontinous, nonempty, compact-valued, and convex-valued, there exists a Carathéodory solution [24].
Define the Lyapunov function candidate
(8) |
where
(9) |
which is nonnegative on and vanishes only on . To apply the nonsmooth Lyapunov techniques [25], we need to show that (9) is Lipschitz continuous w.r.t. its first and second arguments, respectively. The Lipschitzness w.r.t the first argument is from the affine form of (9). To prove the Lipschitzness of w.r.t. , we note that
where . Similarly, , then the Lipschitz property: follows. Therefore, (9) is Lipschitz w.r.t. its second argument. Moreover, is absolutely continuous w.r.t. and, thus, so is . As a result, for almost all points where is differentiable and exists, we have
(10) |
where follows from the Envelope Theorem [26]. Using Clarke’s generalized gradient [27], we can extend the inequality (10) to hold for almost all . Finally, by the Theorem A.2 in [17], we have . Therefore, is globally asymptotically stable. ∎
When we specialize variational inequalities to population games, where is the probability simplex, characterizes the Nash equilibria. Thus, the result in [17] is a special case of 1.
A connection between (4) and (7) can be made as follows. If is the gradient of a function, then (4) is the projected gradient descent method and (7) is the Frank-Wolfe method. Although (4) and (7) have comparable complexity, (4) requires a projection onto the varying tangent cone at the current point to remain feasible but (7) only needs to select a point in the static set and automatically remains feasible. This feature brings benefits to analysis, especially when disturbances are considered in the following section.
IV ROBUSTNESS ANALYSIS
In 1, the function is perfectly known. We now consider a time-varying disturbance which perturbs into . Then, the best response dynamics become
(11) |
This form of the disturbance alters the trajectory indirectly via changes to the cost function . In contrast, a disturbance may affect by directly perturbing the best response dynamics into
(12) |
Definition 5 (Cost disturbance and Dynamics disturbance).
To analyze the dynamics disturbance in (12), we assume that the disturbance does not violate the constraints. For example, in a congestion game, the distribution of the traffic flows over routes add up to the total demand despite disturbances in the dynamics governing the evolution of flows.
Definition 6 (Admissible Dynamics Disturbance).
For (12), a dynamics disturbance is admissible if it is piecewise continuous, bounded, and satisfies
that is, the resulting trajectory will remain in .
In the following, we explore the general case where the best response dynamics are subject to both dynamics disturbances and cost disturbances. We show that the effects of disturbances are captured by the notion of Input-to-State Stability. Denote comparison functions class-, class-, and class-, by , and , respectively [28].
Theorem 2.
Consider best response dynamics subject to a dynamics disturbance and a cost disturbance ,
where is admissible and is bounded and with a bounded derivative. Let be compact and convex and be strongly monotone. Then,
(13) |
where is the unique solution of , , and . In particular, if , , and , then .
Proof.
Similar to the proof of 1, we define
(14) | ||||
(15) |
where
(16) | ||||
(17) |
We first derive the following relations:
(18) | ||||
(19) |
for some and is the diameter of the compact and convex set . Note that is the unique solution to . Denote , then we have . As a result, we have
(20) |
where is attached to . On the other hand,
(21) |
where is attached to , , and . The last inequality with follows from the Cauchy-Schwartz inequality, compactness of , and Lipschitzness of . Next, we prove
(22) |
where is attached to , , and . Similarly, . Thus, (22) is proved.
We next study a state-dependent perturbation on :
(24) |
Although we can analyze (24) with 2 by setting and , in the next theorem we exploit the additional structure to derive a refined bound.
Theorem 3 (State-dependent cost disturbance).
Consider (24). Let be compact and convex, be strongly monotone, and be such that is strongly monotone. Then, the dynamics converge to a new perturbed equilibrium point and
(25) |
where is the unique solution of unperturbed and . The function is given as
Proof.
Remark 1.
The model (24) encompasses the perturbed best response dynamics, which was used in [29] to study a smooth approximation of the best response dynamics. The perturbed best response dynamics is defined as
where and is strictly convex, twice differentiable, and with its magnitude of gradient approaching infinity near the boundary of . For the optimization problem , if instead of directly picking the , we apply the Frank-Wolfe method, then the point selected is
which is covered by (24) with . Since is strictly convex, is strictly monotone. Therefore, 3 reproduces the convergence results for perturbed best response dynamics [29]. In addition, it provides a bound on the perturbed distance.
V APPLICATIONS
V-A Traffic network
Consider the network in Fig. 1, which includes 3 routes and 3 Y-intersections with traffic lights shown in boxes with crosses. The flows on each route are denoted , , and , respectively, and their sum is one. We zoom in on one of the Y-intersections, shown in Fig. 2, to explain the effects of traffic lights. An actuated traffic light results in the interdependency of Link 1 and Link 2 delays on each others’ flows. Assuming the light prioritizes the branch line, we let and . For the rest of the delay functions in the network, we model similarly as and . Then, the total delays on each route are , , and , respectively. Thus, the equilibrium of the traffic network is characterized by the variational inequality (1) where is the probability simplex and
Note that this is not a potential game but a monotone game, since is not symmetric but is positive semidefinite. The simulation result is provided in Fig. 3(a) where the best response dynamics lead to a spiral trajectory and converge to the equilibrium point .


As an illustration of the case where is a subset of the simplex, we add a capacity constraint on Link 7: , and delay constraints on Link 3 and Link 5: and . Although the feasible set is no longer a simplex, 1 ensures convergence. Simulation results are given in Fig. 3(b).


V-B Cost disturbances and dynamics disturbances
In the following, we first discuss congestion games with cost disturbances and then dynamics disturbances.
V-B1 Cost disturbances
In standard congestion games, the delays are merely based on the flow. However, there may be other time-varying factors that affect the delays. For example, weather conditions cause additional delays, and accidents increase the delays for a period until the road is cleared. To capture these scenarios, we introduce a bounded time-varying cost disturbance with bounded derivative .
A congestion game is strongly monotone if all the delay functions are strictly increasing. In this case, by Theorem 2, the convergence results of best response dynamics are robust to a certain extent of cost disturbances. The settings of the simulation are described in Fig. 5, where the corresponding for the variational inequality is .


The results of solving the variational inequality by the best response dynamics without cost disturbances are provided in Fig. 5. Note that the function is strongly monotone and thus admits a unique Nash equilibrium.
Next, we give simulation results with two examples of cost disturbances. The first one is the periodic cost disturbance, . The result is shown in Fig. 6. Since this disturbance is periodic and does not vanish, the trajectory reaches a neighborhood of the unperturbed equilibrium rather than the equilibrium itself. The second example is the diminishing cost disturbance, . The simulation result is provided in Fig. 7. Since the disturbance is large in the early stage, the trajectory moves away from the original trajectory initially. However, as the disturbance vanishes, the trajectory gradually converges to the non-disturbed one as in Fig. 5.


V-B2 Dynamics disturbances
Next we simulate the system with a dynamics disturbance . Note that is admissible as defined in Definition 6. By Theorem 2, the resulting trajectory should remain close to the unperturbed one as in Fig. 5. The simulation results are provided in Fig. 8.

VI CONCLUSIONS AND FUTURE WORK
In this paper, we leveraged the best response dynamics to solve monotone variational inequalities and provided robustness guarantees against classes of disturbances. When the set is a subset of the probability simplex, as in our example, can be viewed as the distribution of the agents to the strategies in a population game with constraints. In this special case, it would be possible to study learning rules other than the best response, such as those reviewed in [30]. Note, however, that our convergence results apply to any compact and convex set ; therefore, they are not restricted to subsets of the standard simplex used in population games. This flexibility allows for addressing population games where the entries of do not necessarily sum to a constant. This would allow “outside options,” e.g., choosing not to drive in a congestion game. Future research will address such scenarios as well as the incorporation of other learning rules.
References
- [1] J. L. Lions and G. Stampacchia, “Variational inequalities,” Communications on Pure and Applied Mathematics, vol. 20, no. 3, pp. 493–519, 1967.
- [2] S. Dafermos, “Traffic equilibrium and variational inequalities,” Transportation Science, vol. 14, no. 1, pp. 42–54, 1980.
- [3] E. Allevi, A. Gnudi, I. V. Konnov, and G. Oggioni, “Evaluating the effects of environmental regulations on a closed-loop supply chain network: A variational inequality approach,” Annuals of Operations Research, vol. 261, no. 1, pp. 1–43, 2018.
- [4] A. Nagurney, Network economics: A variational inequality approach. Springer Science & Business Media, 2013, vol. 10.
- [5] F. Facchinei and J.-S. Pang, Finite-dimensional variational inequalities and complementarity problems. Springer, 2003.
- [6] S. Sorin and C. Wan, “Finite composite games: Equilibria and dynamics,” Journal of Dynamics and Games, vol. 3, no. 1, pp. 101–120, 2016.
- [7] C. E. Lemke, “Bimatrix equilibrium points and mathematical programming,” Management Science, vol. 11, no. 7, pp. 681–689, 1965.
- [8] D. Chan and J. S. Pang, “The generalized quasi-variational inequality problem,” Mathematics of Operations Research, vol. 7, no. 2, pp. 211–222, 1982.
- [9] M. A. Noor, “General variational inequalities,” Applied Mathematics Letters, vol. 1, no. 2, pp. 119–122, 1988.
- [10] D. V. Hieu, J. J. Strodiot, and L. D. Muu, “An explicit extragradient algorithm for solving variational inequalities,” Journal of Optimization Theory and Applications, vol. 185, pp. 476–503, 2020.
- [11] G. Gidel, T. Jebara, and S. Lacoste-Julien, “Frank-Wolfe Algorithms for Saddle Point Problems,” in International Conference on Artificial Intelligence and Statistics, 2017.
- [12] A. Allibhoy and J. Cortés, “Safety-critical control as a design paradigm for anytime solvers of variational inequalities,” in IEEE 61st Conference on Decision and Control (CDC), 2022, pp. 6211–6216.
- [13] M. Frank and P. Wolfe, “An algorithm for quadratic programming,” Naval Research Logistics Quarterly, vol. 3, no. 1-2, pp. 95–110, 1956.
- [14] J. Nocedal and S. J. Wright, Numerical Optimization. Springer, 2006.
- [15] D. Monderer and L. S. Shapley, “Potential games,” Games and Economic Behavior, vol. 14, no. 1, pp. 124–143, 1996.
- [16] A. Matsui, “Best response dynamics and socially stable strategies,” Journal of Economic Theory, vol. 57, no. 2, pp. 343–362, 1992.
- [17] J. Hofbauer and W. H. Sandholm, “Stable games and their dynamics,” Journal of Economic Theory, vol. 144, no. 4, pp. 1665–1693, 2009.
- [18] W. H. Sandholm, Population games and evolutionary dynamics. MIT press, 2010.
- [19] M. J. Fox and J. S. Shamma, “Population games, stable games, and passivity,” Games, vol. 4, no. 4, pp. 561–583, 2013.
- [20] M. Arcak and N. C. Martins, “Dissipativity tools for convergence to nash equilibria in population games,” IEEE Transactions on Control of Network Systems, vol. 8, pp. 39–50, 2020.
- [21] E. Sontag, “Smooth stabilization implies coprime factorization,” IEEE Transactions on Automatic Control, vol. 34, no. 4, pp. 435–443, 1989.
- [22] A. Nagurney and D. Zhang, Projected dynamical systems and variational inequalities with applications. Springer Science & Business Media, 2012, vol. 2.
- [23] L. Pavel, “Dissipativity theory in game theory: On the role of dissipativity and passivity in nash equilibrium seeking,” IEEE Control Systems Magazine, vol. 42, no. 3, pp. 150–164, 2022.
- [24] J. P. Aubin and A. Cellina, Differential Inclusions: Set-valued Maps and Viability Theory. Springer-Verlag, 1984.
- [25] D. Shevitz and B. Paden, “Lyapunov stability theory of nonsmooth systems,” IEEE Transactions on Automatic Control, vol. 39, no. 9, pp. 1910–1914, 1994.
- [26] P. Milgrom and I. Segal, “Envelope theorems for arbitrary choice sets,” Econometrica, vol. 70, no. 2, pp. 583–601, 2002.
- [27] F. H. Clarke, “Generalized gradients and applications,” Transactions of the American Mathematical Society, vol. 205, pp. 247–262, 1975.
- [28] H. Khalil, Nonlinear Systems, ser. Pearson Education. Prentice Hall, 2002.
- [29] J. Hofbauer and W. H. Sandholm, “Evolution in games with randomly disturbed payoffs,” Journal of Economic Theory, vol. 132, no. 1, pp. 47–69, 2007.
- [30] S. Park, N. C. Martins, and J. S. Shamma, “From population games to payoff dynamics models: A passivity-based approach,” in IEEE 58th Conference on Decision and Control (CDC), 2019, pp. 6584–6601.