
LQG Mean Field Games with a Major Agent: Nash Certainty Equivalence versus Probabilistic Approach

Dena Firoozi ([email protected]), Department of Decision Sciences, HEC Montréal, Montreal, QC, Canada
Abstract

Mean field game (MFG) systems consisting of a major agent and a large number of minor agents were introduced in (Huang, 2010) in an LQG setup. The Nash certainty equivalence was used to obtain a Markovian closed-loop Nash equilibrium for the limiting system as the number of minor agents tends to infinity. Over the past years, several approaches to major-minor mean field game problems have been developed, principally (i) the Nash certainty equivalence and analytic approach, (ii) master equations, (iii) asymptotic solvability, and (iv) the probabilistic approach. For the LQG case, the recent work (Huang, 2021) establishes the equivalence of the Markovian closed-loop Nash equilibrium obtained via (i) with those obtained via (ii) and (iii). In this work, we demonstrate that the Markovian closed-loop Nash equilibrium of (i) is equivalent to that of (iv) for the LQG case. Together, these two studies answer the long-standing question of whether the solutions to major-minor LQG MFG systems derived using different approaches are consistent.

keywords:
major-minor LQG mean field games; Nash equilibrium; Nash certainty equivalence; probabilistic approach.

Corresponding author: D. Firoozi. The author would like to acknowledge helpful discussions with M. Huang, R. Malhamé, P. E. Caines, and M. Pazoki.

1 Introduction

Mean field game (MFG) systems with major and minor agents were first introduced in [5] in an LQG setting, where a major agent (whose impact does not vanish in the limit of infinite population size) interacts with a large population of minor agents (each of which has an individually asymptotically negligible effect). In this setting, the major agent's state appears in both the dynamics and the cost functional of each minor agent. Moreover, each agent interacts with the average state of the minor agents through couplings in the dynamics and cost functionals. As a result, the mean field for such systems is a stochastic process progressively measurable with respect to the filtration generated by the major agent's Wiener process. In [5], the author uses the Nash certainty equivalence to establish the existence of Markovian closed-loop $\epsilon$-Nash equilibria and to derive the individual agents' explicit control laws which together yield an equilibrium. This methodology is extended in [8] to a general nonlinear case where the major agent's state appears in the nonlinear dynamics and cost functional of each minor agent, and all agents are coupled with the empirical distribution of the minor agents' states. The best-response strategy of an agent in the limiting case is formulated as the solution to a set of coupled forward-backward (FB) PDEs, i.e., a Hamilton–Jacobi–Bellman equation and a Fokker–Planck–Kolmogorov equation. Subsequently, it is shown that the set of best-response strategies yields a Markovian closed-loop $\epsilon$-Nash equilibrium for the system. This methodology, which mainly relies on the dynamic programming principle, is known in part of the literature as the analytic approach.

A probabilistic approach to major-minor (MM) MFG systems using the stochastic maximum principle is developed in [3], where the authors establish the existence of open-loop $\epsilon$-Nash equilibria for a general case as the solutions to a set of FBSDEs, and provide explicit solutions for an LQG case. In [3, Section 6], it is discussed in detail that the obtained open-loop equilibrium differs from the Markovian closed-loop equilibrium derived in [5] for the LQG case. Following this work, an alternative probabilistic formulation is proposed in [2], where the stochastic maximum principle is used and the search for Nash equilibria in the infinite-population limit is formulated as a search for fixed points in the space of best-response control maps for the major and minor agents. Using this method, the authors retrieve the same set of FBSDEs as in [3] characterizing the open-loop equilibrium, without explicitly solving it. However, [2] does not compare the Markovian closed-loop Nash equilibrium it obtains with that of the existing work [5], and is therefore inconclusive on this important point.

The MFG master equation methodology, which encapsulates the MFG system in an infinite-dimensional backward nonlinear PDE, is used in [7, 1] to characterize Nash equilibria for a general MM MFG system. Moreover, [1] shows that the solution of the finite-population MM MFG Nash system converges to the solution of the system of master equations as the number of minor agents tends to infinity.

The solutions to MM LQG MFG systems obtained using the methods discussed above are seemingly different. Consequently, to address the questions raised in the MFG community about the consistency of the Nash certainty equivalence solutions [5] with those obtained via other methods, the following works have emerged. [4] uses a variational analysis to retrieve the Markovian solutions of [5], where no assumption is imposed a priori on the mean field evolution. Moreover, [6] establishes that the Nash certainty equivalence solutions [5] are equivalent to the Markovian closed-loop solutions obtained via the master equations [7] and via asymptotic solvability. The current work serves as the last piece of this long-standing puzzle about MM LQG MFG systems: we demonstrate that the Markovian closed-loop Nash equilibria obtained through the Nash certainty equivalence [5] and through the probabilistic approach [2] for the limiting MM LQG MFG systems are equivalent. (For the detailed analysis related to the derivation of the consistency equations, the best-response strategies, and the $\epsilon$-Nash property, we refer the reader to [5], [4], and [2].)

In this paper, we first introduce finite-population MM LQG MFG systems in Section 2 and their infinite-population counterparts in Section 3. We present the Nash certainty equivalence solutions ([4], [5]) in Section 4, and the Markovian closed-loop solutions obtained via the probabilistic approach ([2]) in Section 5. Finally, we show that the two solutions are equivalent in Section 6.

2 Finite-Population MM LQG MFG Systems

We consider a large population of $N$ minor agents, each denoted by $\mathcal{A}_i$, $i \in \mathfrak{N} := \{1,\dots,N\}$, $N < \infty$, and a major agent denoted by $\mathcal{A}_0$. To capture the essence of the two approaches, we consider a simple LQG case (for a general case with heterogeneous minor agents see [4, 6, 5]), where the major and minor agents' states, respectively, satisfy

dx_t^0 = [A_0\,x_t^0 + F_0\,x_t^{(N)} + B_0\,u_t^0]\,dt + \sigma_0\,dw_t^0,   (1)
dx_t^i = [A\,x_t^i + F\,x_t^{(N)} + G\,x_t^0 + B\,u_t^i]\,dt + \sigma\,dw_t^i,   (2)

for $t \in \mathfrak{T} = [0,T]$, $i \in \mathfrak{N}$. Here $x_t^i \in \mathbb{R}^n$, $i \in \mathfrak{N}_0 := \{0,\dots,N\}$, are the states, $(u_t^i)_{t\in\mathfrak{T}}$, $u_t^i \in \mathbb{R}^m$, $i \in \mathfrak{N}_0$, are the control inputs, and $w = \{(w_t^i)_{t\in\mathfrak{T}},\, w_t^i \in \mathbb{R}^r,\, i \in \mathfrak{N}_0\}$ denotes $(N+1)$ independent standard Wiener processes. Moreover, $x_t^{(N)} := \frac{1}{N}\sum_{i\in\mathfrak{N}} x_t^i$ denotes the average state of the minor agents. All matrices in (1) and (2) are constant and of appropriate dimension.
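For concreteness, the population dynamics (1)-(2) can be simulated directly with an Euler-Maruyama scheme. The following minimal sketch uses scalar states and zero placeholder controls (an equilibrium policy, e.g. (13a) or (16a) below, would replace them); all numerical values are illustrative and not taken from the paper:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative scalar parameters (n = m = r = 1); none come from the paper.
A0, F0, B0, sigma0 = -0.5, 0.2, 1.0, 0.3
A, F, G, B, sigma = -0.4, 0.1, 0.3, 1.0, 0.3
N, T, steps = 500, 1.0, 1000
dt = T / steps

x0 = rng.normal()              # major agent's state x^0_t
x = rng.normal(size=N)         # minor agents' states x^i_t
u0 = lambda x0, xbar: 0.0      # placeholder control laws; equilibrium
u = lambda xi, x0, xbar: 0.0   # policies would replace these

for _ in range(steps):
    xN = x.mean()              # empirical average state x^{(N)}_t
    dw0 = np.sqrt(dt) * rng.normal()
    dw = np.sqrt(dt) * rng.normal(size=N)
    # Euler-Maruyama steps for (1) and (2), using same-time values.
    x0_new = x0 + (A0 * x0 + F0 * xN + B0 * u0(x0, xN)) * dt + sigma0 * dw0
    x = x + (A * x + F * xN + G * x0 + B * u(x, x0, xN)) * dt + sigma * dw
    x0 = x0_new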

Assumption 1.

The initial states $\{x_0^i,\ i \in \mathfrak{N}_0\}$ defined on $(\Omega, \mathcal{F}, P)$ are identically distributed, mutually independent, and independent of $w$, with $\mathbb{E} x_0^i = \xi$, $i \in \mathfrak{N}$, and $\sup_i \mathbb{E}\|x_0^i\|^2 \leq c < \infty$, $i \in \mathfrak{N}_0$, where $c$ is independent of $N$. Moreover, each agent $\mathcal{A}_i$, $i \in \mathfrak{N}_0$, observes $\xi$ and $x_0^0$.

We denote $\|a\|_B^2 := a^\intercal B a$, where $a$ and $B$ are matrices of appropriate dimension. We also denote $u^{-0} := (u^1, \dots, u^N)$ and $u^{-i} := (u^0, \dots, u^{i-1}, u^{i+1}, \dots, u^N)$. Then the cost functionals for the major agent and a minor agent $\mathcal{A}_i$, $i \in \mathfrak{N}$, are given by

J_0^N(u^0, u^{-0}) = \tfrac{1}{2}\,\mathbb{E}\int_0^T \big\{ \|x_t^0 - \Phi_t^{(N)}\|_{Q_0}^2 + \|u_t^0\|_{R_0}^2 \big\}\,dt,   (3)
J_i^N(u^i, u^{-i}) = \tfrac{1}{2}\,\mathbb{E}\int_0^T \big\{ \|x_t^i - \Psi_t^{(N)}\|_Q^2 + \|u_t^i\|_R^2 \big\}\,dt,   (4)
\Phi_t^{(N)} := H_0\,x_t^{(N)} + \eta_0, \quad \Psi_t^{(N)} := H\,x_t^0 + \widehat{H}\,x_t^{(N)} + \eta.   (5)
Assumption 2.

(Convexity) $R_0 > 0$, $Q_0 \geq 0$, $R > 0$, $Q \geq 0$.

3 Infinite-Population MM LQG MFG Systems

The dynamics and cost functionals of the major agent $\mathcal{A}_0$ and a generic minor agent $\mathcal{A}_i$ in the infinite-population case are given by

dx_t^0 = [A_0\,x_t^0 + F_0\,\bar{x}_t + B_0\,u_t^0]\,dt + \sigma_0\,dw_t^0,   (6)
dx_t^i = [A\,x_t^i + F\,\bar{x}_t + G\,x_t^0 + B\,u_t^i]\,dt + \sigma\,dw_t^i,   (7)
J_0^\infty(u^0) = \tfrac{1}{2}\,\mathbb{E}\int_0^T \big\{ \|x_t^0 - H_0\bar{x}_t - \eta_0\|_{Q_0}^2 + \|u_t^0\|_{R_0}^2 \big\}\,dt,   (8)
J_i^\infty(u^i, u^0) = \tfrac{1}{2}\,\mathbb{E}\int_0^T \big\{ \|x_t^i - H x_t^0 - \widehat{H}\bar{x}_t - \eta\|_Q^2 + \|u_t^i\|_R^2 \big\}\,dt,   (9)

where the mean field is defined as $\bar{x}_t := \lim_{N\to\infty} \frac{1}{N}\sum_{j\in\mathfrak{N}} x_t^j$, provided the limit exists. It is equivalently defined as the conditional expectation $\bar{x}_t = \mathbb{E}[x_t^i \,|\, \mathcal{F}_t^0]$ of the state of a generic minor agent $\mathcal{A}_i$ given the information set $\mathcal{F}_t^0$ defined below.

Information Sets. We define (i) the major agent's information set $\mathcal{F}^0 := (\mathcal{F}_t^0)_{t\in\mathfrak{T}}$ as the filtration generated by $(w_t^0)_{t\in\mathfrak{T}}$, and (ii) a generic minor agent $\mathcal{A}_i$'s information set $\mathcal{F}^i := (\mathcal{F}_t^i)_{t\in\mathfrak{T}}$ as the filtration generated by $(w_t^i, w_t^0)_{t\in\mathfrak{T}}$.

Assumption 3.

(Admissible Controls) (i) For the major agent $\mathcal{A}_0$, the set of admissible control inputs $\mathcal{U}^0$ is defined to be the collection of Markovian linear closed-loop control laws $u^0 := (u_t^0)_{t\in\mathfrak{T}}$ such that $\mathbb{E}[\int_0^T u_t^{0\intercal} u_t^0\,dt] < \infty$. More specifically, $u_t^0 = \ell_0^0(t) + \ell_0^1(t)\,x_t^0 + \ell_0^2(t)\,\bar{x}_t$ for some deterministic functions $\ell_0^0(t)$, $\ell_0^1(t)$, and $\ell_0^2(t)$. (ii) For each minor agent $\mathcal{A}_i$, $i \in \mathfrak{N}$, the set of admissible control inputs $\mathcal{U}^i$ is defined to be the collection of Markovian linear closed-loop control laws $u^i := (u_t^i)_{t\in\mathfrak{T}}$ such that $\mathbb{E}[\int_0^T u_t^{i\intercal} u_t^i\,dt] < \infty$. More specifically, $u_t^i = \ell^0(t) + \ell^1(t)\,x_t^i + \ell^2(t)\,x_t^0 + \ell^3(t)\,\bar{x}_t$ for some deterministic functions $\ell^0(t)$, $\ell^1(t)$, $\ell^2(t)$, and $\ell^3(t)$.

4 Nash Certainty Equivalence Approach

In the Nash certainty equivalence approach, a dynamics for the mean field is first postulated a priori. The idea is then to Markovianize (i) the major agent's limiting system by extending its state with the mean field, and (ii) a generic minor agent's limiting system by extending its state with the major agent's state and the mean field. This state extension leads to a set of decoupled classical optimal control problems for the individual agents, linked with each other through the major agent's state and the mean field. Given the individual information sets, each agent can solve its own stochastic optimal control problem to obtain a best-response strategy. Subsequently, a Nash equilibrium is defined as the set of best-response Markovian closed-loop strategies of the individual agents such that they collectively generate the same mean field that was used in the first step to obtain the best-response strategies. This yields a set of consistency equations, the fixed-point solution of which characterizes the Nash equilibrium. ([4] uses a variational analysis and obtains the same Nash equilibrium without assuming an a priori mean field evolution.)

Mean Field Evolution. According to [5, 6], if a generic minor agent adopts a Markovian linear closed-loop strategy, $\bar{x}_t$ satisfies

d\bar{x}_t = \left(\bar{A}\,\bar{x}_t + \bar{G}\,x_t^0 + \bar{m}(t)\right)dt,   (10)

where $\bar{A}, \bar{G} \in \mathbb{R}^{n\times n}$ and $\bar{m} \in \mathbb{R}^n$ are functions of the fixed-point solutions to the consistency equations (18)-(26). We now present the agents' Markovianized systems.

Major Agent. From (6) and (10), the major agent's extended state $X_t^0 = [x_t^{0\intercal}\;\;\bar{x}_t^\intercal]^\intercal$ satisfies

dX_t^0 = \left(\mathbb{A}_0 X_t^0 + \mathbb{B}_0 u_t^0 + \mathbb{M}_0\right)dt + \Sigma_0\,dw_t^0,   (11a)
\mathbb{A}_0 = \begin{bmatrix} A_0 & F_0 \\ \bar{G} & \bar{A} \end{bmatrix}, \quad \mathbb{M}_0 = \begin{bmatrix} 0 \\ \bar{m} \end{bmatrix}, \quad \mathbb{B}_0 = \begin{bmatrix} B_0 \\ 0 \end{bmatrix}, \quad \Sigma_0 = \begin{bmatrix} \sigma_0 \\ 0 \end{bmatrix}.   (11b)

The major agent's cost functional in terms of $X_t^0$ is given by

J_0^\infty(u^0) = \tfrac{1}{2}\,\mathbb{E}\int_0^T \big\{ \|X_s^0\|_{\mathbb{Q}_0}^2 + \|u_s^0\|_{R_0}^2 - 2\,(X_s^0)^\intercal \bar{\eta}_0 \big\}\,ds,   (12a)
\mathbb{Q}_0 = \|[\mathbb{I}_n, -H_0]\|_{Q_0}^2, \quad \bar{\eta}_0 = [\mathbb{I}_n, -H_0]^\intercal Q_0\,\eta_0.   (12b)

According to [4, Thm 5] (the finite-horizon version of [5, Thm 10] for general MM LQG MFG systems), the best response strategy for the major agent is given by

u_t^{0,*} = -R_0^{-1}\mathbb{B}_0^\intercal\big(\Pi_0(t)\,X_t^{0,*} + s_0(t)\big),   (13a)
-\dot{\Pi}_0 = \Pi_0\mathbb{A}_0 + \mathbb{A}_0^\intercal\Pi_0 - \Pi_0\mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal\Pi_0 + \mathbb{Q}_0,   (13b)
-\dot{s}_0 = \left[\mathbb{A}_0^\intercal - \Pi_0\,\mathbb{B}_0\,R_0^{-1}\,\mathbb{B}_0^\intercal\right]s_0 + \Pi_0\mathbb{M}_0 - \bar{\eta}_0,   (13c)
\Pi_0(T) = 0, \quad s_0(T) = 0.   (13d)
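Numerically, (13b)-(13d) form a terminal-value matrix Riccati equation coupled with a linear offset equation, both integrated backward from $t = T$. A minimal sketch follows, assuming for simplicity that the extended matrices are constant (in the consistency system below, $\mathbb{A}_0$ and $\mathbb{M}_0$ are themselves time-varying); the function name and arguments are ours, not the paper's:

import numpy as np
from scipy.integrate import solve_ivp

def solve_major_riccati(A0e, B0e, Q0e, R0, M0e, eta0e, T, num=201):
    """Integrate (13b)-(13c) backward from Pi_0(T) = 0, s_0(T) = 0."""
    d = A0e.shape[0]
    R0inv = np.linalg.inv(R0)

    def rhs(t, z):
        Pi = z[:d * d].reshape(d, d)
        s = z[d * d:]
        # (13b): -dPi/dt = Pi A + A' Pi - Pi B R^{-1} B' Pi + Q
        dPi = -(Pi @ A0e + A0e.T @ Pi
                - Pi @ B0e @ R0inv @ B0e.T @ Pi + Q0e)
        # (13c): -ds/dt = [A' - Pi B R^{-1} B'] s + Pi M - eta_bar
        ds = -((A0e.T - Pi @ B0e @ R0inv @ B0e.T) @ s + Pi @ M0e - eta0e)
        return np.concatenate([dPi.ravel(), ds])

    # Integrating over [T, 0] runs the terminal-value problem backward.
    return solve_ivp(rhs, [T, 0.0], np.zeros(d * d + d),
                     t_eval=np.linspace(T, 0.0, num), rtol=1e-8)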

Minor Agent. From (6)-(7) and (10), for a generic minor agent $\mathcal{A}_i$, the extended state $X_t^i = [x_t^{i\intercal}, x_t^{0\intercal}, \bar{x}_t^\intercal]^\intercal$ is governed by the dynamics

dX_t^i = \left(\mathbb{A}X_t^i + \mathbb{B}\,u_t^i + \mathbb{M}(t)\right)dt + \Sigma\,dW_t^i,   (14a)
\mathbb{A} = \begin{bmatrix} A & [G\;\;F] \\ 0 & \mathbb{A}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal\Pi_0 \end{bmatrix}, \quad \mathbb{B} = \begin{bmatrix} B \\ 0 \end{bmatrix}, \quad W_t^i = \begin{bmatrix} w_t^i \\ w_t^0 \end{bmatrix},   (14b)
\mathbb{M} = \begin{bmatrix} 0 \\ \mathbb{M}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal s_0 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \sigma & 0 \\ 0 & \Sigma_0 \end{bmatrix}.   (14c)

The cost functional for $\mathcal{A}_i$, $i \in \mathfrak{N}$, in terms of $X_t^i$ is given by

J_i^\infty(u^i) = \tfrac{1}{2}\,\mathbb{E}\int_0^T \big\{ \|X_s^i\|_{\mathbb{Q}}^2 + \|u_s^i\|_R^2 - 2\,(X_s^i)^\intercal \bar{\eta} \big\}\,ds,   (15a)
\mathbb{Q} = \|[\mathbb{I}_n, -H, -\widehat{H}]\|_Q^2, \quad \bar{\eta} = [\mathbb{I}_n, -H, -\widehat{H}]^\intercal Q\,\eta.   (15b)

According to [4, Thm 5] and [5, Thm 10], the best response strategy for a generic minor agent $\mathcal{A}_i$ is given by

u_t^{i,*} = -R^{-1}\mathbb{B}^\intercal\big(\Pi(t)\,X_t^{i,*} + s(t)\big),   (16a)
-\dot{\Pi} = \Pi\mathbb{A} + \mathbb{A}^\intercal\Pi - \Pi\mathbb{B}R^{-1}\mathbb{B}^\intercal\Pi + \mathbb{Q}, \quad \Pi(T) = 0,   (16b)
-\dot{s} = [\mathbb{A}^\intercal - \Pi\mathbb{B}R^{-1}\mathbb{B}^\intercal]s + \Pi\mathbb{M} - \bar{\eta}, \quad s(T) = 0.   (16c)
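Numerically, (16b)-(16c) share the backward Riccati-plus-offset structure of (13b)-(13d), so a routine like the sketch following (13) applies verbatim to the $3n$-dimensional extended matrices (14b)-(14c) and cost matrices (15b).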

Mean Field Consistency Equations. We first define

\Pi = \begin{bmatrix} \Pi_{11} & \bar{\Pi}_{12} \\ \bar{\Pi}_{21} & \bar{\Pi}_{22} \end{bmatrix}, \quad \bar{\Pi}_{12} = \begin{bmatrix} \Pi_{12} & \Pi_{13} \end{bmatrix}, \quad s = \begin{bmatrix} s_1 \\ \bar{s}_2 \end{bmatrix},   (17)

where $\Pi_{11}, \Pi_{12}, \Pi_{13} \in \mathbb{R}^{n\times n}$, $\bar{\Pi}_{22} \in \mathbb{R}^{2n\times 2n}$, $\bar{\Pi}_{12} \in \mathbb{R}^{n\times 2n}$, $\bar{\Pi}_{21} \in \mathbb{R}^{2n\times n}$, $s_1 \in \mathbb{R}^n$, and $\bar{s}_2 \in \mathbb{R}^{2n}$. Then, according to the Nash certainty equivalence approach [5], the consistency equations are obtained by effectively equating (10) with the mean field equation resulting from the collective action of the mass of minor agents. The consistency equations determining $\bar{A}$, $\bar{G}$, and $\bar{m}$ are given by

-\dot{\Pi}_0 = \Pi_0\mathbb{A}_0 + \mathbb{A}_0^\intercal\Pi_0 - \Pi_0\mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal\Pi_0 + \mathbb{Q}_0,   (18)
-\dot{\Pi} = \Pi\mathbb{A} + \mathbb{A}^\intercal\Pi - \Pi\mathbb{B}R^{-1}\mathbb{B}^\intercal\Pi + \mathbb{Q},   (19)
\bar{A} = A - BR^{-1}B^\intercal\Pi_{11} + F - BR^{-1}B^\intercal\Pi_{13},   (20)
\bar{G} = G - BR^{-1}B^\intercal\Pi_{12},   (21)
\Pi_0(T) = 0, \quad \Pi(T) = 0,   (22)
-\dot{s}_0 = [\mathbb{A}_0^\intercal - \Pi_0\mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal]s_0 + \Pi_0\mathbb{M}_0 - \bar{\eta}_0,   (23)
-\dot{s} = [\mathbb{A}^\intercal - \Pi\mathbb{B}R^{-1}\mathbb{B}^\intercal]s + \Pi\mathbb{M} - \bar{\eta},   (24)
\bar{m} = -BR^{-1}\mathbb{B}^\intercal s,   (25)
s_0(T) = 0, \quad s(T) = 0.   (26)
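Operationally, (18)-(26) define a fixed-point problem: guess the mean field parameters $(\bar{A}, \bar{G}, \bar{m})$, solve the two backward equations, and update via (20)-(21) and (25) until convergence. The following minimal numerical sketch illustrates the loop under simplifying assumptions: scalar states, $\eta = \eta_0 = 0$ (so $s_0 = s = 0$ and (23)-(26) drop out), illustrative parameter values, backward-Euler stepping, and a Picard iteration whose convergence is assumed rather than proven:

import numpy as np

# Illustrative scalar data (n = m = 1); all values are placeholders.
A0, F0, B0 = -0.5, 0.2, 1.0
A, F, G, B = -0.4, 0.1, 0.3, 1.0
Q0, R0, Q, R = 1.0, 1.0, 1.0, 1.0
H0, H, Hh = 0.5, 0.4, 0.3
T, K = 1.0, 400
dt = T / K

Q0e = Q0 * np.outer([1.0, -H0], [1.0, -H0])          # \mathbb{Q}_0, (12b)
Qe = Q * np.outer([1.0, -H, -Hh], [1.0, -H, -Hh])    # \mathbb{Q}, (15b)
B0e = np.array([[B0], [0.0]])                        # \mathbb{B}_0, (11b)
Be = np.array([[B], [0.0], [0.0]])                   # \mathbb{B}, (14b)

Abar = np.full(K + 1, A + F)     # initial guess for \bar{A}(t)
Gbar = np.full(K + 1, G)         # initial guess for \bar{G}(t)

for _ in range(100):
    # Solve (18) backward for Pi_0, given the current (Abar, Gbar).
    Pi0 = np.zeros((K + 1, 2, 2))
    for k in range(K, 0, -1):
        A0e = np.array([[A0, F0], [Gbar[k], Abar[k]]])
        P = Pi0[k]
        Pi0[k - 1] = P + dt * (P @ A0e + A0e.T @ P
                               - P @ B0e @ B0e.T @ P / R0 + Q0e)
    # Solve (19) backward for Pi; its drift matrix (14b) embeds Pi_0.
    Pi = np.zeros((K + 1, 3, 3))
    for k in range(K, 0, -1):
        A0e = np.array([[A0, F0], [Gbar[k], Abar[k]]])
        Acl = A0e - (B0e @ B0e.T / R0) @ Pi0[k]
        Ae = np.block([[np.array([[A, G, F]])],
                       [np.zeros((2, 1)), Acl]])
        P = Pi[k]
        Pi[k - 1] = P + dt * (P @ Ae + Ae.T @ P
                              - P @ Be @ Be.T @ P / R + Qe)
    # Update the mean field parameters via (20)-(21).
    Abar_new = A + F - B / R * B * (Pi[:, 0, 0] + Pi[:, 0, 2])
    Gbar_new = G - B / R * B * Pi[:, 0, 1]
    gap = max(abs(Abar_new - Abar).max(), abs(Gbar_new - Gbar).max())
    Abar, Gbar = Abar_new, Gbar_new
    if gap < 1e-10:
        break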
Theorem 4.

The consistency equations (18)-(26) reduce to

-\dot{\Pi}_0 = \Pi_0\mathbb{A}_0 + \mathbb{A}_0^\intercal\Pi_0 - \Pi_0\mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal\Pi_0 + \mathbb{Q}_0,   (27)
-\dot{\bar{\Pi}}_{12} = \Pi_{11}\left[G\;\;F\right] + A^\intercal\bar{\Pi}_{12} + \bar{\Pi}_{12}(\mathbb{A}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal\Pi_0) - \Pi_{11}BR^{-1}B^\intercal\bar{\Pi}_{12} - Q[H\;\;\widehat{H}],   (28)
\bar{A} = A - BR^{-1}B^\intercal\Pi_{11} + F - BR^{-1}B^\intercal\Pi_{13},   (29)
\bar{G} = G - BR^{-1}B^\intercal\Pi_{12},   (30)
\Pi_0(T) = 0, \quad \bar{\Pi}_{12}(T) = 0,   (31)
-\dot{s}_0 = [\mathbb{A}_0^\intercal - \Pi_0\mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal]s_0 + \Pi_0\mathbb{M}_0 - \bar{\eta}_0,   (32)
-\dot{s}_1 = [A^\intercal - \Pi_{11}BR^{-1}B^\intercal]s_1 - Q\eta + \bar{\Pi}_{12}(\mathbb{M}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal s_0),   (33)
\bar{m} = -BR^{-1}B^\intercal s_1,   (34)
s_0(T) = 0, \quad s_1(T) = 0,   (35)

where $\Pi_{11}(T) = 0$ and

-\dot{\Pi}_{11} = \Pi_{11}A + A^\intercal\Pi_{11} - \Pi_{11}BR^{-1}B^\intercal\Pi_{11} + Q,   (36)
\bar{\Pi}_{12} = [\Pi_{12}, \Pi_{13}], \quad \mathbb{A}_0 = \begin{bmatrix} A_0 & F_0 \\ \bar{G} & \bar{A} \end{bmatrix}, \quad \mathbb{B}_0 = \begin{bmatrix} B_0 \\ 0 \end{bmatrix},
\mathbb{Q}_0 = \begin{bmatrix} Q_0 & -Q_0H_0 \\ -H_0^\intercal Q_0 & H_0^\intercal Q_0H_0 \end{bmatrix}, \quad \bar{\eta}_0 = \begin{bmatrix} Q_0\eta_0 \\ -H_0^\intercal Q_0\eta_0 \end{bmatrix}, \quad \mathbb{M}_0 = \begin{bmatrix} 0 \\ \bar{m} \end{bmatrix}.

Proof. Given that $\mathbb{B}^\intercal\Pi = B^\intercal\big[\Pi_{11}\;\;\bar{\Pi}_{12}\big]$, the optimal control (16) of the minor agent $\mathcal{A}_i$ is given by

u_t^{i,\ast} = -R^{-1}B^\intercal\Big(\Pi_{11}x_t^i + \bar{\Pi}_{12}\big[x_t^{0\intercal}\;\;\bar{x}_t^\intercal\big]^\intercal + s_1\Big).   (37)

Hence only the first block rows of $\Pi$ and $s$ appear in a generic minor agent's optimal control; the remaining blocks are irrelevant. Therefore, we use (19) and (24) to derive the equations that $\Pi_{11}$, $\bar{\Pi}_{12}$, and $s_1$ satisfy. To this end, we first treat the terms in (19) one by one. Block multiplications for the first and second terms on the right-hand side of (19) yield

\Pi\mathbb{A} = \begin{bmatrix} \Pi_{11}A & E_1^\intercal \\ \bar{\Pi}_{21}A & E_2^\intercal \end{bmatrix}, \quad \mathbb{A}^\intercal\Pi = \begin{bmatrix} A^\intercal\Pi_{11} & A^\intercal\bar{\Pi}_{12} \\ E_1 & E_2 \end{bmatrix},   (38)
E_1 = [G\;\;F]^\intercal\Pi_{11} + (\mathbb{A}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal\Pi_0)^\intercal\bar{\Pi}_{21},   (39)
E_2 = [G\;\;F]^\intercal\bar{\Pi}_{12} + (\mathbb{A}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal\Pi_0)^\intercal\bar{\Pi}_{22}.   (40)

For the third and fourth terms in (19) we have

\Pi\mathbb{B}R^{-1}\mathbb{B}^\intercal\Pi = \begin{bmatrix} \Pi_{11}BR^{-1}B^\intercal\Pi_{11} & \Pi_{11}BR^{-1}B^\intercal\bar{\Pi}_{12} \\ \ast & \ast \end{bmatrix}, \quad \mathbb{Q} = \begin{bmatrix} Q & -Q[H\;\;\widehat{H}] \\ -[H\;\;\widehat{H}]^\intercal Q & [H\;\;\widehat{H}]^\intercal Q[H\;\;\widehat{H}] \end{bmatrix},   (41)
where $\ast$ denotes blocks that are irrelevant for the sequel.

From (19) and (38)-(41), through block-by-block correspondence, we obtain the ODEs that $\Pi_{11}$ and $\bar{\Pi}_{12}$ satisfy, as in (36) and (28), respectively.

Similarly, block multiplications for the terms in (24) result in

\Pi\mathbb{B}R^{-1}\mathbb{B}^\intercal s = \begin{bmatrix} \Pi_{11}BR^{-1}B^\intercal s_1 \\ \ast \end{bmatrix}, \quad \bar{\eta} = \begin{bmatrix} Q\eta \\ -[H\;\;\widehat{H}]^\intercal Q\eta \end{bmatrix}, \quad \Pi\mathbb{M} = \begin{bmatrix} \bar{\Pi}_{12}(\mathbb{M}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal s_0) \\ \bar{\Pi}_{22}(\mathbb{M}_0 - \mathbb{B}_0R_0^{-1}\mathbb{B}_0^\intercal s_0) \end{bmatrix}.   (42)

From (24) and (42), $s_1$ satisfies (33). $\square$
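The block identities used in the proof are straightforward to sanity-check numerically: for random test matrices with the block layout (17), the first block row of the right-hand side of (19) must reproduce the right-hand sides of (36) and (28). A small sketch with illustrative random data (all names and values are for this check only):

import numpy as np

rng = np.random.default_rng(1)
n = 2
rnd = lambda r, c: rng.normal(size=(r, c))

A, G, F, B = rnd(n, n), rnd(n, n), rnd(n, n), rnd(n, n)
H, Hh = rnd(n, n), rnd(n, n)
Qm, R, R0 = np.eye(n), np.eye(n), np.eye(n)
A0e = rnd(2 * n, 2 * n)                        # stand-in for \mathbb{A}_0
B0e = np.vstack([rnd(n, n), np.zeros((n, n))])
Pi0 = rnd(2 * n, 2 * n); Pi0 = Pi0 + Pi0.T
Pi = rnd(3 * n, 3 * n); Pi = Pi + Pi.T         # symmetric, block layout (17)

Acl = A0e - B0e @ np.linalg.inv(R0) @ B0e.T @ Pi0
Ae = np.block([[A, np.hstack([G, F])],
               [np.zeros((2 * n, n)), Acl]])   # \mathbb{A}, (14b)
Be = np.vstack([B, np.zeros((2 * n, n))])      # \mathbb{B}, (14b)
C = np.hstack([np.eye(n), -H, -Hh])
Qe = C.T @ Qm @ C                              # \mathbb{Q}, (15b)

rhs = Pi @ Ae + Ae.T @ Pi - Pi @ Be @ np.linalg.inv(R) @ Be.T @ Pi + Qe

Pi11, Pi12b = Pi[:n, :n], Pi[:n, n:]           # Pi_11 and \bar{Pi}_12
rhs36 = Pi11 @ A + A.T @ Pi11 - Pi11 @ B @ np.linalg.inv(R) @ B.T @ Pi11 + Qm
rhs28 = (Pi11 @ np.hstack([G, F]) + A.T @ Pi12b + Pi12b @ Acl
         - Pi11 @ B @ np.linalg.inv(R) @ B.T @ Pi12b
         - Qm @ np.hstack([H, Hh]))

assert np.allclose(rhs[:n, :n], rhs36)   # first block:  RHS of (36)
assert np.allclose(rhs[:n, n:], rhs28)   # second block: RHS of (28)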

5 Probabilistic Approach

In [2], the search for Nash equilibria for MM MFGs is formulated as the search for fixed points in the space of best response control maps for the major and minor agents in the infinite-population limit. In this section, we present the approach of [2] for obtaining a Markovian closed-loop equilibrium for MM LQG MFG systems.

To keep the presentation self-contained, Table 1 matches the notation of the current work with that of [2] for the model parameters and processes; all other notation is identical in the two works.

Table 1: Associated parameters and processes
Dynamics. Current work: $A_0$, $\sigma_0$, $u_t^0$, $A$, $\sigma$, $u_t^i$. Reference [2]: $L_0$, $D_0$, $\alpha_t^0$, $L$, $D$, $\alpha_t^i$.
Cost. Current work: $Q_0$, $Q$, $R_0$, $R$, $\widehat{H}$. Reference [2]: $2Q_0$, $2Q$, $2R_0$, $2R$, $H_1$.

In [2], Markovian linear closed-loop control actions as in Assumption 3 are considered for the major agent and a representative minor agent, denoted, respectively, by $\alpha_t^0$ and $\alpha_t^i$. The mean field dynamics is then obtained by forming the closed-loop system of the representative minor agent $\mathcal{A}_i$ under $\alpha_t^i$ and taking the conditional expectation $\mathbb{E}[x_t^i \,|\, \mathcal{F}_t^0]$ of its state $x_t^i$ given the information set $\mathcal{F}_t^0$. Subsequently, to solve the major agent's problem, its state is extended with the mean field in the same manner as in [5]. Then, using the stochastic maximum principle for the extended system, the major agent's optimal control action is obtained as

\alpha_t^{0,\ast} = -(2R_0)^{-1}\big[0\;\;B_0^\intercal\big]\mathbb{Y}_t,   (43)

and a set of McKean-Vlasov FBSDEs is derived which solves for the major agent's extended state and the decoupling field (adjoint process) $\mathbb{Y}_t$. To solve the FBSDEs, an ansatz is adopted for $\mathbb{Y}_t$ as in

\mathbb{Y}_t = K_t\big[\bar{x}_t^\intercal\;\;x_t^{0\intercal}\big]^\intercal + k_t,   (44)

where $K_t$ and $k_t$ are deterministic matrix- and vector-valued functions of appropriate dimensions; a set of ODEs that $K_t$ and $k_t$ satisfy is then derived. Subsequently, the notion of a deviating minor agent is introduced: an extra, virtual minor agent who deviates from the strategy of its peers and optimizes in response to the major agent and the rest of the minor agents. In the optimal control problem of the deviating minor agent, the major agent's state and the mean field are treated as exogenous stochastic coefficients, determined offline by solving a set of SDEs, i.e., the major agent's closed-loop extended system. Then, using the stochastic maximum principle, the deviating minor agent's optimal control is obtained as

\alpha_t^{i,\ast} = -(2R)^{-1}B^\intercal Y_t^i,   (45)

and a set of FBSDEs with random coefficients (not of McKean-Vlasov type) that the minor agent's state $x_t^i$ and decoupling field (adjoint process) $Y_t^i$ satisfy is derived. To solve the FBSDEs, an ansatz is considered for $Y_t^i$ as in

Y_t^i = \mathbb{S}_t\big[\bar{x}_t^\intercal\;\;x_t^{0\intercal}\big]^\intercal + S_t x_t^i + \bar{s}_t,   (46)

where $\mathbb{S}_t$, $S_t$, and $\bar{s}_t$ are matrices of appropriate dimension. To solve for the exogenous stochastic coefficients $[\bar{x}_t^\intercal\;\;x_t^{0\intercal}]^\intercal$ in the minor agent's optimal control problem, the mean field equation resulting from a minor agent using $\alpha_t^{i,\ast}$ must match the one obtained using $\alpha_t^i$. Subsequently, the consistency equations whose fixed-point solutions determine $K_t, k_t$ and $\mathbb{S}_t, S_t, \bar{s}_t$ are given by ([2, eq. (31)-(32)])

-\dot{K}_t = K_t[\mathbb{L}_t - \bar{\mathbb{B}}(2R)^{-1}B^\intercal\mathbb{S}_t] - K_t\bar{\mathbb{B}}_0(2R_0)^{-1}\bar{\mathbb{B}}_0^\intercal K_t + [\mathbb{L}_t - \bar{\mathbb{B}}(2R)^{-1}B^\intercal\mathbb{S}_t]^\intercal K_t + 2\mathbb{F}_0,   (47)
-\dot{\mathbb{S}}_t = \mathbb{S}_t[\mathbb{L}_t - \bar{\mathbb{B}}(2R)^{-1}B^\intercal\mathbb{S}_t] - \mathbb{S}_t\bar{\mathbb{B}}_0(2R_0)^{-1}\bar{\mathbb{B}}_0^\intercal K_t + [L^\intercal - S_tB(2R)^{-1}B^\intercal]\mathbb{S}_t + [S_tF - 2QH_1\;\;\;S_tG - 2QH],   (48)
K_T = 0, \quad \mathbb{S}_T = 0,   (49)
-\dot{k}_t = [\mathbb{L}_t - \bar{\mathbb{B}}(2R)^{-1}B^\intercal\mathbb{S}_t]^\intercal k_t - K_t\bar{\mathbb{B}}_0(2R_0)^{-1}\bar{\mathbb{B}}_0^\intercal k_t - K_t\bar{\mathbb{B}}(2R)^{-1}B^\intercal\bar{s}_t + 2f_0,   (50)
-\dot{\bar{s}}_t = [L^\intercal - S_tB(2R)^{-1}B^\intercal]\bar{s}_t - \mathbb{S}_t\bar{\mathbb{B}}_0(2R_0)^{-1}\bar{\mathbb{B}}_0^\intercal k_t - \mathbb{S}_t\bar{\mathbb{B}}(2R)^{-1}B^\intercal\bar{s}_t - 2Q\eta,   (51)
k_T = 0, \quad \bar{s}_T = 0,   (52)

where

-\dot{S}_t = S_tL + L^\intercal S_t - S_tB(2R)^{-1}B^\intercal S_t + 2Q, \quad S_T = 0,   (53)
\mathbb{L} = \begin{bmatrix} L + F - B(2R)^{-1}B^\intercal S_t & G \\ F_0 & L_0 \end{bmatrix}, \quad \bar{\mathbb{B}} = \begin{bmatrix} B \\ 0 \end{bmatrix}, \quad \bar{\mathbb{B}}_0 = \begin{bmatrix} 0 \\ B_0 \end{bmatrix},
\mathbb{F}_0 = \begin{bmatrix} H_0^\intercal Q_0H_0 & -H_0^\intercal Q_0 \\ -Q_0H_0 & Q_0 \end{bmatrix}, \quad f_0 = \begin{bmatrix} H_0^\intercal Q_0\eta_0 \\ -Q_0\eta_0 \end{bmatrix}.   (54)

6 Comparison of the Two Approaches

We start with the following theorem.

Theorem 5.

For the MM LQG MFG system (6)-(9), the Markovian closed-loop Nash equilibrium obtained via the Nash certainty equivalence is equivalent to the one obtained via the probabilistic approach.

Proof. By inspection, the Markovian linear optimal control laws $\{u^{0,\ast}, u^{i,\ast}, i \in \mathfrak{N}\}$ obtained through the Nash certainty equivalence (see (13a), (37)) have the same structure as the laws $\{\alpha^{0,\ast}, \alpha^{i,\ast}, i \in \mathfrak{N}\}$ obtained through the probabilistic approach (see (43)-(46)). It remains to show the equivalence of the two sets of consistency equations, the fixed-point solutions of which yield the coefficients in the above control laws. More specifically, we show that the reduced consistency equations (27)-(36) obtained via the Nash certainty equivalence are the same as the consistency equations (47)-(53) obtained via the probabilistic approach. To this end, we first define a block elementary operator which operates on a matrix to produce the desired interchanged block rows, as in

\mathfrak{I} = \begin{bmatrix} 0 & \mathbb{I} \\ \mathbb{I} & 0 \end{bmatrix},   (55)

where the identity matrices $\mathbb{I}$ are of appropriate dimension. Then we correspond the processes in (27)-(36) with those in (47)-(53) as shown in Table 2.

Table 2: Corresponding processes in consistency equations
Current work: $\mathfrak{I}^\intercal\Pi_0(t)\mathfrak{I}$, $\Pi_{11}(t)$, $\bar{\Pi}_{12}(t)\mathfrak{I}$, $\mathfrak{I}s_0(t)$, $s_1(t)$
Reference [2]: $K_t$, $S_t$, $\mathbb{S}_t$, $k_t$, $\bar{s}_t$
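In matrix terms, conjugation by $\mathfrak{I}$ simply reorders the extended state from the $(x^0, \bar{x})$ convention used here to the $(\bar{x}, x^0)$ convention of [2]. A short numerical illustration (arbitrary data, $n = 2$):

import numpy as np

n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [np.eye(n), np.zeros((n, n))]])   # the operator in (55)

M = np.arange(16.0).reshape(2 * n, 2 * n)       # any 2x2-block matrix
S = J.T @ M @ J                                 # swaps block rows and columns
assert np.allclose(S[:n, :n], M[n:, n:])
assert np.allclose(S[:n, n:], M[n:, :n])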

Using Tables 1-2, we can retrieve (53) from (36), and vice versa. Next, we correspond the terms in (27)-(28) to those in (47)-(48). First, we use Table 1 to replace the system parameters in (27) with those considered in [2] (i.e., we replace $Q_0, R_0$, respectively, with $2Q_0, 2R_0$ in (27)). Then, as per Table 2, to obtain $K_t$ we multiply both sides of (27) from the right by $\mathfrak{I}$ and from the left by $\mathfrak{I}^\intercal$, as in

-\mathfrak{I}^\intercal\dot{\Pi}_0\mathfrak{I} = \mathfrak{I}^\intercal(\Pi_0\mathbb{A}_0 + \mathbb{A}_0^\intercal\Pi_0 - \Pi_0\mathbb{B}_0(2R_0)^{-1}\mathbb{B}_0^\intercal\Pi_0 + 2\mathbb{Q}_0)\mathfrak{I},

which by inspection is the same equation as (47); in particular,

\mathfrak{I}^\intercal\Pi_0\mathbb{A}_0\mathfrak{I} = K_t[\mathbb{L}_t - \bar{\mathbb{B}}(2R)^{-1}B^\intercal\mathbb{S}_t], \quad 2\,\mathfrak{I}^\intercal\mathbb{Q}_0\mathfrak{I} = 2\mathbb{F}_0,
\mathfrak{I}^\intercal\Pi_0\mathbb{B}_0(2R_0)^{-1}\mathbb{B}_0^\intercal\Pi_0\mathfrak{I} = K_t\bar{\mathbb{B}}_0(2R_0)^{-1}\bar{\mathbb{B}}_0^\intercal K_t.   (56)

Next, we use Tables 1-2 to match (28) and (48). We first replace $R_0, R, Q, \widehat{H}$, respectively, with $2R_0, 2R, 2Q, H_1$ in (28), and then multiply both sides from the right by $\mathfrak{I}$, which gives equation (48). Now we show that (50) can be retrieved from (32). From Tables 1-2, we first replace $R_0, Q_0$ with $2R_0, 2Q_0$ in (32), and then multiply both sides from the left by $\mathfrak{I}$, as in

-\mathfrak{I}\dot{s}_0 = \mathfrak{I}\big([\mathbb{A}_0^\intercal - \Pi_0\mathbb{B}_0(2R_0)^{-1}\mathbb{B}_0^\intercal]s_0 + \Pi_0\mathbb{M}_0 - 2\bar{\eta}_0\big),   (57)

where $-2\mathfrak{I}\bar{\eta}_0 = 2f_0$ and

\mathfrak{I}\Pi_0\mathbb{M}_0 = -K_t\begin{bmatrix} B(2R)^{-1}B^\intercal\bar{s}_t \\ 0 \end{bmatrix} = -K_t\bar{\mathbb{B}}(2R)^{-1}B^\intercal\bar{s}_t.

Finally, replacing $R_0, R, Q$, respectively, with $2R_0, 2R, 2Q$ in (33) yields (51), which matches the two equations. $\square$
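As a quick numerical illustration of the theorem in the scalar case, consider the $\Pi_{11}$/$S_t$ pair: integrating (36) with the parameters of the current work, and (53) with $L = A$ and the Table 1 substitution, yields Riccati solutions whose feedback gains coincide, so $u^{i,\ast}$ and $\alpha^{i,\ast}$ apply the same feedback to the minor agent's own state. The sketch below uses illustrative values only:

import numpy as np
from scipy.integrate import solve_ivp

# Scalar sanity check: the gain R^{-1} B Pi_11 from (36)-(37) should equal
# (2R)^{-1} B S_t from (45)-(46), where S_t solves (53) with L = A and the
# Table 1 substitution Q -> 2Q, R -> 2R. All values are illustrative.
A, B, Q, R, T = -0.3, 1.0, 2.0, 0.5, 1.0
grid = np.linspace(T, 0.0, 101)

def riccati(q, r):
    # Backward scalar Riccati ODE: -p' = 2 A p - p B r^{-1} B p + q, p(T) = 0.
    rhs = lambda t, p: -(2 * A * p - p * B / r * B * p + q)
    return solve_ivp(rhs, [T, 0.0], [0.0], t_eval=grid,
                     rtol=1e-9, atol=1e-9).y[0]

pi11 = riccati(Q, R)         # equation (36)
S = riccati(2 * Q, 2 * R)    # equation (53) with L = A
assert np.allclose(B / R * pi11, B / (2 * R) * S, atol=1e-6)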

Discussions. To obtain Markovian closed-loop Nash equilibria for MM LQG MFG systems, both the Nash certainty equivalence and the probabilistic approach assume that a generic minor agent adopts a Markovian linear closed-loop strategy. Under this assumption, the mean field equation is derived using an ansatz for the minor agent's control action. Then, to solve the major agent's limiting optimal control problem, its state is extended with the mean field in both approaches. In [5], the optimal linear state-feedback control for the major agent's extended system is obtained using known results for single-agent LQG systems. In contrast, [2] uses the stochastic maximum principle with a linear ansatz for the adjoint process in terms of the major agent's state and the mean field. The two methods are equivalent and result in the same optimal control for the major agent. For a generic minor agent's optimal control problem, [5] Markovianizes the minor agent's system by extending its state with the major agent's state and the mean field; then, again using known results for single-agent LQG systems, the minor agent's optimal control is obtained as a linear function of its own state, the major agent's state, and the mean field. In contrast, [2] treats the major agent's state and the mean field as exogenous stochastic coefficients in the minor agent's system; then, using the stochastic maximum principle, an optimal control is obtained for the minor agent by adopting an ansatz for the adjoint process that is linear in its own state, the major agent's state, and the mean field. Although the optimal control actions for a generic minor agent derived in [5] and [2] do not look the same at first glance, Theorem 5 establishes that they are indeed equivalent. Hence, both approaches yield the same Markovian closed-loop Nash equilibrium for MM LQG MFGs.

The fact that the set of consistency equations of [5] reduces to that of [2] stems from an interaction asymmetry in the minor agent's extended system in the former. Indeed, in the minor agent's extended system, the individual minor agent's state and control action do not affect the joint system of the major agent and the mean field (i.e., the major agent's extended system), whereas the major agent's state and the mean field do affect the dynamics and cost functional of the individual minor agent. At its core, the minor agent's extended system (as modelled in [5]) works in the same manner as the individual minor agent's system with exogenous stochastic coefficients obtained by solving the major agent's extended system (as modelled in [2]). Such asymmetric interactions do not occur in the major agent's extended system: the interactions there are mutual, since the major agent's state appears in the mean field dynamics and the mean field appears in both the major agent's dynamics and its cost functional.

References

  • [1] P. Cardaliaguet, M. Cirant, and A. Porretta. Remarks on Nash equilibria in mean field game models with a major player. Proceedings of the American Mathematical Society, 148(10):4241–4255, 2020.
  • [2] R. Carmona and P. Wang. An alternative approach to mean field game with major and minor players, and applications to herders impacts. Applied Mathematics & Optimization, 76(1):5–27, 2017.
  • [3] R. Carmona and X. Zhu. A probabilistic approach to mean field games with major and minor players. Annals of Applied Probability, 26(3):1535–1580, 2016.
  • [4] D. Firoozi, S. Jaimungal, and P. E. Caines. Convex analysis for LQG systems with applications to major-minor LQG mean-field game systems. Systems & Control Letters, 142:104734, 2020.
  • [5] M. Huang. Large-population LQG games involving a major player: The Nash certainty equivalence principle. SIAM Journal on Control and Optimization, 48(5):3318–3353, 2010.
  • [6] M. Huang. Linear-quadratic mean field games with a major player: Nash certainty equivalence versus master equations. Communications in Information and Systems, 21(3):441–471, 2021.
  • [7] J. M. Lasry and P. L. Lions. Mean-field games with a major player. Comptes Rendus Mathematique, 356(8):886–890, 2018.
  • [8] M. Nourian and P. E. Caines. $\epsilon$-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM Journal on Control and Optimization, 51(4):3302–3331, 2013.