
Network reconstruction from asynchronously updated evolutionary game

Hong-Li Zeng School of Science, and New Energy Technology Engineering Laboratory of Jiangsu Province, Nanjing University of Posts and Telecommunications, Nanjing 210023, China    Shu-Xuan Wang School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China    Yan-Dong Guo College of Electronic Science and Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210046, China    Shao-Meng Qin [email protected] College of Science, Civil Aviation University of China, Tianjin 300300, China
Abstract

The interactions between players of the prisoner's dilemma (PD) game are reconstructed from evolutionary game data. All participants play the game with their counterparts and gain corresponding rewards during each round, but their strategies are updated asynchronously during the evolutionary PD game. Two methods for inferring the interactions between players are derived, based on the naive mean-field (nMF) approximation and on maximum log-likelihood estimation (MLE), respectively. Both methods are tested numerically on fully connected, asymmetric Sherrington-Kirkpatrick (SK) models, varying the data length, the asymmetric degree, the payoff and the system noise (coupling strength). We find that the reconstruction mean square error (MSE) of the MLE method is proportional to the inverse of the data length and is typically half that of nMF (a benefit of the extra information carried by the update times). Both methods are robust to the asymmetric degree but work better for large payoffs. Compared with MLE, nMF is more sensitive to the coupling strength and prefers weak couplings.


I Introduction

Networks are widely used to describe systems composed of interacting elements. Such systems usually generate large amounts of time-series data through their underlying dynamics. Uncovering network structures from observations is crucial for understanding the collective dynamics of such systems and has attracted great interest in different fields, ranging from biological systems Schneidman et al. (2006); Nguyen et al. (2016); Roudi and Hertz (2011a); Zeng and Aurell (2020a) to social science Pincus and Kalman (2004); Bury (2013); Zeng et al. (2014).

Meanwhile, the Prisoner's Dilemma (PD) game has long been studied for the emergence of cooperation among connected (links of the network) selfish individuals (nodes of the network). In the PD game, the criminals may take the cooperation strategy (confessing the crime to the judge) or the defection strategy (denying the crime). A large number of studies have focused on collective behaviors under different network structures, e.g., small-world networks Masuda and Aihara (2003), two-dimensional lattices Szabó and Töke (1998) and scale-free networks Chen et al. (2007). Furthermore, the couplings between players in those studies are typically symmetric and binary (with / without interactions). To quantify the strength of interactions between players, here we study fully connected asymmetric Sherrington-Kirkpatrick (SK) models Kirkpatrick and Sherrington (1978), with an asymmetric degree parameter that can vary between 0 and 1. The couplings between players are thus Gaussian and not necessarily symmetric, which is closer to the reality of natural social systems. We are more interested in reconstructing the couplings between players than in the collective behaviors of the system.

For an evolutionary PD game, the players' strategies follow a binary-state dynamics, which can be naturally mapped onto binary states, i.e., $+1$ for cooperation and $-1$ for defection. Hence, it is straightforward to pose the coupling-inference problem for an evolutionary game as an inverse Ising problem. There are many algorithms for recovering the couplings in inverse Ising problems, as reviewed in Nguyen et al. (2017) and references therein.

Generally, the couplings can be reconstructed with equilibrium Aurell and Ekeberg (2012); Cocco et al. (2009a); Weigt et al. (2009) or non-equilibrium Pillow et al. (2008); Roudi and Hertz (2011b); Zhang (2012); Zeng et al. (2011, 2013) models, given time-dependent or time-independent system data. However, in many applications, equilibrium Ising models are not the ideal choice for network inference and statistical modeling of the data: a mismatch between the real couplings and the inferred ones appears for large systems Roudi et al. (2009); Roudi and Hertz (2011a); Zeng and Aurell (2020b). This is mainly because such systems are out of equilibrium and have asymmetric couplings between elements. Researchers have therefore moved to kinetic Ising models and developed exact and approximate methods for network reconstruction Cocco et al. (2009b); Roudi and Hertz (2011a); Zeng et al. (2011, 2013), which have wider generality.

In kinetic Ising models, spins may update synchronously or asynchronously. Similarly, in evolutionary games, synchronous means that all players update their strategies simultaneously at every time step Nowak and May (1992). This can be mathematically described by a finite-difference equation, but it rarely exists in social or natural systems. Players, or organisms, usually act at different and uncorrelated times with incoming information that may be delayed and imperfect Huberman and Glance (1993). For instance, a neuron is in a refractory period after releasing a spike: it cannot respond to an input signal because it is still processing or recovering from the previous one Greil et al. (2007); Zeng et al. (2020). The dynamics of such asynchronous updates is usually expressed by differential equations, whose solutions are not always the same as those of their finite-difference counterparts. Thus, players here update their strategies asynchronously, even though all of them receive their corresponding rewards through the games with all their neighbors. With asynchronous strategy updates, only one player is assigned to update the strategy in each game round. For a finite number of players $N$ this is not exactly the asynchronous update described above, but we do not see any difference, at least for kinetic Ising models Zeng et al. (2011, 2013).

For network reconstruction from evolutionary game data, some inference methods Wang et al. (2016, 2011) have been developed based on the compressive sensing technique Foucart and Rauhut (2017), a paradigm for sparse-signal reconstruction. However, the compressive sensing methodology assumes that the interactions between individuals are explicit and stationary and that the system can be handled linearly in some way. For example, in the studies of Wang et al. (2011); Guo et al. (2016); Wu et al. (2016), the observations are the strategies and the total payoff of each player in the evolutionary game. Furthermore, all agents are assumed to play the same game with a given payoff matrix; one player's total payoff is hence the sum of the payoffs from all neighbors. In the studies of Steinke et al. (2007); Chang et al. (2014); Li et al. (2017), dynamical functions are employed, but these can also be decomposed into a linear sum of scalar basis functions.

Distinct from compressive-sensing-based inference methods, we present here an exact and intuitive method based on maximum log-likelihood estimation (MLE). We also present an approximate reconstruction method based on the naive mean-field (nMF) approximation. Both inference formulae for evolutionary PD game data are inspired by the asynchronously updated Glauber dynamics Glauber (1963), which converges to a stationary state described by the Gibbs-Boltzmann distribution when the couplings are symmetric; this is not necessarily true in the synchronous case. Hence, we are interested in using a kinetic Ising-like model to reconstruct an evolutionary game network dynamically.

In the evolutionary PD game, the update times $\{\tau_i\}$, the history of strategies $\{s_i(t)\}$ and the rewards of each player $\{x_i(t)\}$ constitute the full model data generated during the game process. MLE needs all of these data, while nMF needs neither the update times nor the rewards of each game round: the nMF method relies on the means, the correlations and the temptation-to-defect parameter $b$ in the payoff matrix. With massive numerical simulations, we find that nMF performs very close to the MLE method while avoiding a long iteration process. One equilibrium inference method is also tested for the sake of completeness; it reconstructs the couplings of our non-equilibrium processes poorly.

The paper is organized as follows. In Sec. II we present the tested network structure as well as the dynamics of the evolutionary PD game. In Sec. III, we derive the MLE and nMF inference formulae from evolutionary game data. Sec. IV presents how these two methods perform for network reconstruction, while Sec. V summarizes and discusses the results.

II Asymmetric SK Model and Glauber-Like Game Dynamics

The quenched asymmetric SK (aSK) model is composed of $N$ vertices, which represent $N$ game players with two kinds of strategies ($s_i = 1$ for cooperation, $s_i = -1$ for defection). This is a fully connected model, i.e., all players are connected to each other. The interactions $J_{ij}$ between each pair of players have the form

$J_{ij} = J_{ij}^{s} + k J_{ij}^{as}, \qquad k \geq 0,$ (1)

with $k$ the asymmetric degree of these interactions, while $J_{ij}^{s} = J_{ji}^{s}$ and $J_{ij}^{as} = -J_{ji}^{as}$ are symmetric and antisymmetric matrices, respectively. They are Gaussian random variables with zero mean and variances

$\sigma^2(J_{ij}^{s}) = \sigma^2(J_{ij}^{as}) = \frac{g^2}{N}\,\frac{1}{1+k^2}.$ (2)

The players do not play the game with themselves. This implies $J_{ii} = 0$, and the diagonal elements of $J_{ij}^{s}$ and $J_{ij}^{as}$ are 0 as well.
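As a concrete illustration, the coupling matrix of Eqs. (1)-(2) can be sampled as in the following minimal sketch (our own code, not the authors'; the function name and the construction of $J^{s}$ and $J^{as}$ from i.i.d. Gaussians are our choices):

```python
import numpy as np

def make_ask_couplings(N, g=0.3, k=1.0, seed=None):
    """Sample J = J^s + k * J^as of Eq. (1) with the variances of Eq. (2)."""
    rng = np.random.default_rng(seed)
    var = g**2 / (N * (1 + k**2))             # variance of Eq. (2)
    A = rng.normal(0.0, np.sqrt(var), (N, N))
    B = rng.normal(0.0, np.sqrt(var), (N, N))
    Js = (A + A.T) / np.sqrt(2.0)             # symmetric part, variance var
    Jas = (B - B.T) / np.sqrt(2.0)            # antisymmetric part, variance var
    J = Js + k * Jas
    np.fill_diagonal(J, 0.0)                  # players do not play themselves
    return J
```

For $k=0$ the couplings are symmetric, while $k=1$ gives the fully asymmetric case studied below.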

We now describe the dynamics of the evolutionary game on the aSK network model above. In an evolutionary PD game, the judge knows the players are guilty but has no evidence. A player then has two possible strategies: cooperation (confess the crime to the judge) or defection (deny the offense to the judge). Players get different payoffs depending on the strategies they take. The rewards are usually expressed by a payoff matrix $X$; for the prisoner's dilemma (PD) game:

$X = \begin{pmatrix} 1 & 0 \\ b & 0 \end{pmatrix},$ (3)

with $1 \leq b \leq 2$. If both players choose cooperation, each of them gets a reward of 1; if one cooperates while the counterpart defects, the cooperator gets a reward of 0 while the defector gets $b$ (the temptation to defect); if both players choose defection, neither of them gets any reward.

During each time step $t$, all players play games simultaneously and receive the corresponding rewards according to the payoff matrix $X$ in (3). Thus, for each player, the rewards come from the games played with, as well as those originated by, the neighbours. Once the rewards are collected, one randomly chosen player $i$ is assigned to update his strategy according to the following formula:

$P(s_i(t)) = \frac{\exp\left(\sum_j J_{ij} x_j(t-1)\,\delta_{s_j(t-1),\,s_i(t)}\right)}{Z_i(t)},$ (4)

where

$Z_i(t) = \sum_{s_i(t) = \pm 1} \exp\left(\sum_j J_{ij} x_j(t-1)\,\delta_{s_j(t-1),\,s_i(t)}\right)$

is the normalization factor (local partition function), which sums over both the cooperation and the defection state of player $i$.

We denote the fields on player $i$ at time $t$ associated with the cooperative and the defective strategy as $\alpha_i(t)$ and $\beta_i(t)$, respectively:

$\alpha_i(t) = \sum_j J_{ij} x_j(t-1)\,\delta_{s_j(t-1),\,+1},$ (5a)
$\beta_i(t) = \sum_j J_{ij} x_j(t-1)\,\delta_{s_j(t-1),\,-1}.$ (5b)

Then the partition function can be rewritten as

$Z_i(t) = \exp(\alpha_i(t)) + \exp(\beta_i(t)).$ (6)

Define

$H_i(t) = \frac{1}{2}\left(\alpha_i(t) - \beta_i(t)\right) = \sum_j J_{ij} s_j(t-1)\,\frac{x_j(t-1)}{2},$ (7)

and substitute Zi(t)Z_{i}(t) in (4) with Eq. (6), then we have

$p(s_i(t)|\mathbf{S}) = \frac{\exp(\alpha_i(t))}{\exp(\alpha_i(t)) + \exp(\beta_i(t))} = \frac{\exp(s_i(t) H_i(t))}{2\cosh(H_i(t))} = \frac{1}{2}\left[1 + s_i(t)\tanh(H_i(t))\right].$ (8)

During each time step, all players join the prisoners' dilemma game and obtain the corresponding payoffs according to the payoff matrix (3) from each counterpart; the average gain of player $i$ during a time step is derived in Eq. (25). However, only one of the players updates his strategy, which does not necessarily change his current choice. This is why we refer to this dynamics as an asynchronous PD game.
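The full generative process can thus be summarized in a few lines. The sketch below is our own code (the function names, the time discretization, and the use of the population-level cooperation density of Eq. (27) in the payoffs are our assumptions); it simulates the asynchronous PD game and records the histories used for inference later:

```python
import numpy as np

def payoffs(s, b):
    """Average rewards x_i(t), cf. Eq. (25), on the fully connected network."""
    fc = np.mean((1 + s) / 2)                  # cooperation density, Eq. (27)
    return 2 * ((1 + s) / 2 * fc + (1 - s) / 2 * b * fc)

def run_game(J, b=1.5, T=100_000, seed=None):
    """One asynchronous update (Eq. (8)) per round; returns {s}, {x}, {tau}."""
    rng = np.random.default_rng(seed)
    N = J.shape[0]
    s = rng.choice([-1, 1], size=N)            # random initial strategies
    S, X, tau = [], [], []
    for t in range(T):
        x = payoffs(s, b)                      # all players collect rewards
        i = rng.integers(N)                    # one random player updates
        H = J[i] @ (s * x / 2)                 # field of Eq. (7)
        S.append(s.copy()); X.append(x); tau.append(i)
        s[i] = 1 if rng.random() < 0.5 * (1 + np.tanh(H)) else -1   # Eq. (8)
    return np.array(S), np.array(X), np.array(tau, dtype=int)
```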

III Inference for Game Networks

III.1 Maximum Log-likelihood Estimation (MLE)

Suppose we have the following data: the history of players' strategies $\mathbf{s} \equiv \{s_i(t)\}$, the update times $\mathbf{\tau} \equiv \{\tau_i\}$ and the payoffs of each player $\mathbf{x} \equiv \{x_i(t)\}$, of length $L = T \times N$ steps. We reconstruct the game network from these data by maximizing the likelihood $P(\mathbf{s}, \mathbf{\tau}, \mathbf{x}) = P(\mathbf{s}, \mathbf{x}|\mathbf{\tau})\,p(\mathbf{\tau})$ over the model parameters. As mentioned in Zeng et al. (2013), for each player $i$ the chance of updating the strategy is a Poisson process. This means that the probability of the update history, $p(\mathbf{\tau})$, is independent of the model parameters, so we can safely take the objective function as $\log P(\mathbf{s}, \mathbf{x}|\mathbf{\tau})$, which from Eq. (8) reads

$\mathcal{L} = \sum_i \sum_{\tau_i} \left[ s_i(\tau_i + \delta t)\,H_i(\tau_i) - \log 2\cosh H_i(\tau_i) \right].$ (9)

The sum runs only over the update times instead of over all time steps as in the synchronous case Roudi and Hertz (2011a). It leads to a learning rule for the couplings $J_{ij}$:

$\delta J_{ij} \propto \frac{\partial \mathcal{L}}{\partial J_{ij}} \propto \sum_{\tau_i} \left[ s_i(\tau_i + \delta t) - \tanh H_i(\tau_i) \right] s_j(\tau_i)\,\frac{x_j(\tau_i)}{2}.$ (10)

We refer to this algorithm as "MLE". It is an exact method, as no approximation enters the derivation of the inference formula. It is also notable that if the rewards equal 2 and are the same for every player, the PD game process degenerates to the original Glauber dynamics Glauber (1963).
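A plain gradient-ascent implementation of the learning rule (10) is sketched below (our own code; the learning rate `eta`, the iteration count and the vectorization are our choices, not the authors'):

```python
import numpy as np

def mle_reconstruct(S, X, tau, eta=0.5, n_iter=500):
    """S, X: (T, N) strategy/payoff histories; tau: (T,) updated player ids."""
    T, N = S.shape
    J = np.zeros((N, N))
    V = S[:-1] * X[:-1] / 2                    # s_j(t) x_j(t) / 2
    i_up = tau[:-1]                            # player updated between t, t+1
    s_next = S[1:][np.arange(T - 1), i_up]     # s_i(tau_i + dt) in Eq. (10)
    for _ in range(n_iter):
        H = np.sum(J[i_up] * V, axis=1)        # H_i(tau_i) of Eq. (7)
        err = s_next - np.tanh(H)
        dJ = np.zeros((N, N))
        np.add.at(dJ, i_up, err[:, None] * V)  # gradient of Eq. (10)
        J += eta * dJ / T
        np.fill_diagonal(J, 0.0)
    return J
```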

III.2 Naive Mean-Field (nMF) Approximation

We start from the master equation

$\frac{dp(\mathbf{S};t)}{dt} = \sum_i \omega_i(-s_i(t))\,p(s_1, \ldots, -s_i, \ldots, s_N; t) - \sum_i \omega_i(s_i(t))\,p(\mathbf{S}; t),$ (11)

and the flipping rate is:

$\omega_i(\mathbf{S};t) = \frac{1}{2}\left[1 - s_i(t)\tanh(H_i(t))\right],$ (12)

which can be obtained from the detailed balance condition Van Kampen (1992):

$\frac{p(-s_i)}{p(s_i)} = \frac{\omega(s_i)}{\omega(-s_i)},$ (13a)
$p(-s_i) + p(s_i) = 1.$ (13b)

With the strategy history $\{s_i(t)\}$ of each player, the time-dependent means and correlations are defined as

$m_i(t) = \langle s_i(t) \rangle,$ (14a)
$C_{ij}(t - t_0) = \langle s_i(t) s_j(t_0) \rangle - m_i(t)\,m_j(t_0).$ (14b)

From Eqs. (11) and (12), the temporal evolution of the means and correlations reads

$\frac{dm_i(t)}{dt} = -m_i(t) + \langle\tanh[H_i(t)]\rangle,$ (15a)
$\frac{d\langle s_i(t) s_j(t_0)\rangle}{dt} = -\langle s_i(t) s_j(t_0)\rangle + \langle\tanh[H_i(t)]\,s_j(t_0)\rangle.$ (15b)

The external field (influence) $H_i(t)$ on player $i$ at time $t$ in (7) can thus be rewritten as

$H_i(t) = \sum_j J_{ij} s_j(t-1)\,\frac{x_j(t-1)}{2} = \sum_j J_{ij}\left(B_0 + B_1 s_j(t-1)\right),$ (16)

where $B_0 = (1-b)(1+\bar{m}(t))/4$ and $B_1 = (1+b)(1+\bar{m}(t))/4$ are two non-fluctuating terms depending on the parameter $b$, and $\bar{m}(t) = \frac{1}{N}\sum_i s_i(t)$ is the population average. The detailed derivation of Eq. (16) is given in Appendix A.

The nMF approximation is hence formally the same as that for the original asynchronously updated kinetic Ising model in Zeng et al. (2011),

$m_i = \tanh b_i,$ (17)

but with

$b_i = \sum_{j \in \partial i} J_{ij}(B_0 + B_1 m_j),$ (18)

which is obtained by taking the stationary solution of Eq. (15a) under the naive mean-field approximation. The details are given in Appendix B.

In the time derivative of the correlations, Eq. (15b), expanding the $\tanh$ function with respect to $b_i$ yields

$\frac{d}{dt}\langle s_i(t) s_j(t_0)\rangle + \langle s_i(t) s_j(t_0)\rangle = m_i m_j + B_1(1 - m_i^2)\sum_k J_{ik}\langle\delta s_k(t)\,\delta s_j(t_0)\rangle.$ (19)

Denoting $\tau = t - t_0$, we get

$\frac{d}{d\tau} C_{ij}(\tau) + C_{ij}(\tau) = B_1(1 - m_i^2)\sum_k J_{ik} C_{kj}(\tau).$ (21)

Taking the limit $\tau \to 0$, we get the nMF inference formula for the PD game network

$J = \frac{A^{-1} D C^{-1}}{B_1},$ (22)

with $D = \dot{C}(0) + C(0)$, $A_{ij} = \delta_{ij}(1 - m_i^2)$ and $B_1 = (1+b)(\bar{m}+1)/4$. With a full data history, it is safe to replace $\bar{m}(t)$ by the time average $\bar{m} = \langle\bar{m}(t)\rangle_t$.
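Equation (22) translates directly into a non-iterative estimator. The sketch below is a minimal implementation under our own assumptions: $\dot{C}(0)$ is estimated by a one-step finite difference, and one update round is taken as time $\delta t = 1/N$ (the time unit is our choice, not specified in the text):

```python
import numpy as np

def nmf_reconstruct(S, b):
    """nMF inference of Eq. (22) from the (T, N) strategy history S."""
    T, N = S.shape
    m = S.mean(axis=0)
    dS = S - m                                  # fluctuations delta s_i(t)
    C0 = dS.T @ dS / T                          # equal-time correlations C(0)
    C1 = dS[1:].T @ dS[:-1] / (T - 1)           # one-step-lagged correlations
    Cdot = (C1 - C0) * N                        # dC/dt with dt = 1/N (assumed)
    D = Cdot + C0
    A_inv = np.diag(1.0 / (1.0 - m**2))
    B1 = (1 + b) * (m.mean() + 1) / 4           # Eq. (29b), time-averaged
    J = A_inv @ D @ np.linalg.inv(C0) / B1      # Eq. (22)
    np.fill_diagonal(J, 0.0)
    return J
```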

IV Simulation Results

We infer the evolutionary game network with the MLE (10) and nMF (22) formulae, respectively. In Fig. 1, scatter plots of the reconstructed couplings against the true ones are presented for both methods: Fig. 1(a) for data length $L = N \times 10^5$ and Fig. 1(b) for $L = N \times 10^7$. Both nMF and MLE perform better with larger data length $L$, and MLE works slightly better than nMF in both cases. The nMF method overestimates the couplings, especially for $L = N \times 10^7$, as shown in Fig. 1(b).


Figure 1: Scatter plots of reconstructed couplings versus the tested ones for different data lengths $L$. Blue open dots are couplings recovered by nMF, while open red upper triangles are for MLE. Left panel: data length $L = N \times 10^5$; right panel: data length $L = N \times 10^7$. The other parameters are the same: number of players $N = 20$, asymmetric degree $k = 1$, coupling strength $g = 0.3$.

We introduce the mean square error (MSE) to measure the distance between the reconstructed and the true tested network structure.

$\text{MSE} = \frac{\sum_{ij}\left(J_{ij}^{Rec.} - J_{ij}^{True}\right)^2}{N(N-1)},$ (23)

where $J_{ij}^{True}$ are the tested network couplings and $J_{ij}^{Rec.}$ the inferred ones.
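For reference, Eq. (23) amounts to the following short helper (our naming; `J_rec` and `J_true` are assumed to be numpy arrays):

```python
def mse(J_rec, J_true):
    """Reconstruction error of Eq. (23); diagonals are zero by construction."""
    N = J_true.shape[0]
    return ((J_rec - J_true) ** 2).sum() / (N * (N - 1))
```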

With the reconstruction error MSE defined in (23), we compare the performance of the MLE and nMF methods for the fully connected SK model (1). We study both methods for different data lengths $L$, asymmetric degrees $k$, temptation-of-defection parameters $b$ in the payoff matrix (3), and coupling strengths $g$. The inverse of $g$ corresponds to the temperature in spin-glass systems and can be interpreted as the noise (say, fake information, rumors, etc.) in the evolutionary PD game.


Figure 2: (color online). Mean square error (MSE) versus (a) data length $L$, (b) asymmetric degree $k$, (c) the parameter $b$ in the payoff matrix and (d) temperature $1/g$, respectively. Blue open dots represent nMF, red solid dots MLE. The parameters are $N = 20$, $L = N \times 10^6$, $k = 1$ and $g = 0.3$, except for the one varied in the corresponding panel.

Figure 2 shows the performance of both methods. Similar to the results for the asynchronous Ising case Zeng et al. (2013), the MSE for MLE is $\propto 1/L$ here. However, for the data lengths $L$ tested, nMF does not approach a saturated value as it does in the asynchronous Ising case; instead, its reconstruction error is about twice that of MLE, see Fig. 2(a). This could indicate that for game networks the nMF inference formula can also be obtained by averaging over the update times in the MLE formula (10), as has been proved in Zeng et al. (2013) for asynchronous Ising models. Figure 2(b) shows that the MSE of both methods is not sensitive to the asymmetric degree $k$, no matter whether the network is symmetric ($k=0$) or fully asymmetric ($k=1$). Here, the data length is $L = N \times 10^6$ with $N = 20$ and $g = 0.3$. The difference between the two methods is affected mainly by the data length $L$: the MSE of MLE is about half that of the nMF approach, which agrees well with Fig. 2(a). Figure 2(c) shows the performance of the MLE and nMF algorithms for different values of the reward parameter $b$ in the payoff matrix (3), which is specific to game networks. Both algorithms recover the network better for larger $b$; however, the ratio of the MSE of nMF to that of MLE increases slightly with $b$. Finally, the effect of the inverse of $g$ on the network inference is illustrated in Fig. 2(d). For strong couplings, nMF works much worse than MLE; for $g \leq 1$, however, the performance of neither method is affected by $g$ anymore, and MLE retains its factor-of-2 advantage (conferred by the information of the update times) over nMF.

To illustrate how an equilibrium inference method works for this non-equilibrium process, we use the naive mean-field equilibrium inference algorithm Kappen and Spanjers (2000), which depends on the equal-time correlations only:

$J^{Equi} = -C^{-1}.$ (24)

We refer to this as equilibrium inference to avoid confusion with the nMF method derived here for the asynchronous case. We then present the inferred networks through scatter plots in Fig. 3, with both the equilibrium and the non-equilibrium methods, for evolutionary data generated by PD games.
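As a baseline, Eq. (24) needs only the equal-time correlation matrix; a minimal sketch (our own code, for comparison with the nMF snippet above):

```python
import numpy as np

def equilibrium_reconstruct(S):
    """Equilibrium baseline of Eq. (24) from the (T, N) strategy history."""
    dS = S - S.mean(axis=0)
    C0 = dS.T @ dS / S.shape[0]      # equal-time correlations C(0)
    J = -np.linalg.inv(C0)           # Eq. (24)
    np.fill_diagonal(J, 0.0)
    return J
```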

Figures 3(a) and (b) present the reconstructed couplings for the symmetric ($k=0$) and fully asymmetric ($k=1$) SK model, respectively. Our asynchronous nMF (blue open dots) and MLE (red upper triangles) methods recover the couplings nicely, while the equilibrium nMF (24) (open black squares) works much worse, even in the symmetric case $k=0$. This is not surprising, as the equilibrium nMF (24) tested here does not take the game process into account at all. For the original asynchronous Ising model without the game process, the equilibrium nMF should work as well as the non-equilibrium methods Zeng and Aurell (2020b), since they describe the same dynamics.


Figure 3: Scatter plots of recovered links versus the tested ones for different asymmetric degrees $k$. Black open squares are couplings inferred with the equilibrium formula $J^{equi} = -C(0)^{-1}$. Blue open dots are for asynchronous nMF, while red open triangles are for MLE. Left panel: asymmetric degree $k = 0$, symmetric SK model; right panel: $k = 1$, fully asymmetric SK model. The other parameters are the same: data length $L = N \times 10^6$, number of players $N = 20$, coupling strength $g = 0.3$.

V Discussion

We have studied the network reconstruction problem with asynchronously updated evolutionary game data. The players play the PD game with their counterparts simultaneously but update their strategies asynchronously. Two methods (the exact, iterative MLE and the approximate nMF) are introduced to infer the fully connected, Gaussian-distributed couplings between players.

Compared with the original asynchronous Ising model Zeng et al. (2011, 2013), we combine the PD game with Glauber dynamics, so that the rewards collected by each player during the game are taken into account. The MLE and nMF reconstruction formulae for asynchronous Ising models in Zeng et al. (2011, 2013) hence have to be generalized to the asynchronous evolutionary game studied here. MLE is derived from the log-likelihood of the system, while nMF follows from the equations of motion of the means and correlations. However, the two are not independent of each other: MLE utilizes the full model histories $\{s_i(t)\}$, $\{x_i(t)\}$ and $\{\tau_i\}$, and averaging over the update histories in Eq. (10), keeping the linear term of the averaged $\tanh$ function, yields the nMF inference formula (22).

With the derived MLE and nMF methods for asynchronously updated PD games, our numerical results show that the inference performance of nMF is comparable to that of MLE. Both methods are insensitive to the asymmetric degree $k$, and their errors are proportional to the inverse of the data length. For fixed data length, the reconstruction error (MSE) of MLE is half that of the nMF method (a benefit of the update times). Furthermore, nMF prefers weak coupling strengths $g$, while MLE holds over a much wider range of $g$. Both methods recover the network better for a large payoff parameter $b$. This is reasonable: when the rewards of all players equal 2, the external field Eq. (7) degenerates to that of the original asynchronous Ising model, and as $b$ increases the rewards of the players also approach 2, which yields better network reconstruction.

However, compared with the data that the MLE method uses for inference, nMF needs only the parameter $b$ and the moments of the strategy history $\{s_i(t)\}$. Furthermore, nMF infers much faster than MLE, as no iterative procedure is involved in the network inference. The scatter plots in Fig. 1 and Fig. 3 show that the $J_{ij}^{nMF}$ are clearly comparable with the $J_{ij}^{MLE}$, even though the MSE of nMF is roughly twice that of MLE. Thus, we expect that for network inference with data from asynchronous PD games, nMF works well enough; higher orders of the $\tanh$ expansion (say, TAP) are not necessary.

Besides, we have blindly tested a widely used equilibrium inference method (24) on our dynamical data generated from evolutionary PD games; it recovers the couplings much worse than the non-equilibrium inference methods, even for symmetric interactions between players, as illustrated in Fig. 3(a). The message is that, given time-series data generated by some dynamics, one should try non-equilibrium methods first. Furthermore, although we start from the detailed balance condition to derive the asynchronous nMF formula for the network inference, the numerical results show that the inference formula also holds for the asymmetric case $k=1$, where the system is no longer in the equilibrium realm.

This work focuses on a simple and idealized prisoner's dilemma game with fully connected, Gaussian-distributed interactions, which could be extended to other types of structures or other distributions of the couplings (say, power-law, with strong couplings between a few players and weak ones between most of them). Besides, we use a constant value of the defection temptation $b$, which could instead differ between players. Meanwhile, a non-zero but small $r$ in the payoff matrix (3) could also be tested. These questions are left for further work.

Appendix A Reformulating the external field $H_i(t)$

According to the payoff matrix (3) of the prisoner's dilemma game, the average reward that player $i$ gains at time $t$ is (for simplicity, $r$ is set to 0):

$x_i(t) = 2 \times \left\{ \frac{1+s_i(t)}{2}\,\frac{1}{N}\sum_{j\in\partial i}\frac{1+s_j(t)}{2} + \frac{1-s_i(t)}{2}\,\frac{b}{N}\sum_{j\in\partial i}\frac{1+s_j(t)}{2} \right\}.$ (25)

The factor 2 outside the braces comes from the fact that, within each time step of the synchronous PD games, players obtain rewards both from the games they initiate with their neighbors and from the games originated by their neighbors. Then $s_i(t)\frac{x_i(t)}{2}$ in Eq. (7) is:

$s_i(t)\,\frac{x_i(t)}{2} = \frac{s_i(t)+1}{2}\,f(c) + \frac{s_i(t)-1}{2}\,b f(c),$ (26)

with $f(c)$ the cooperation density in the neighborhood of player $i$:

$f(c) = \frac{1}{N}\sum_{j\in\partial i}\frac{1+s_j(t)}{2} = \frac{\bar{m}(t)+1}{2}.$ (27)

Hence we rewrite Eq. (26) as:

$s_i(t)\,\frac{x_i(t)}{2} = B_0 + B_1 s_i(t),$ (28)

with

$B_0 = \frac{1-b}{2}\,f(c) = \frac{(1-b)(\bar{m}(t)+1)}{4},$ (29a)
$B_1 = \frac{1+b}{2}\,f(c) = \frac{(1+b)(\bar{m}(t)+1)}{4},$ (29b)

and $\bar{m}(t) = \frac{1}{N}\sum_i s_i(t)$.

Thus, Eq. (7) is reformulated as:

$H_i(t) = \sum_j J_{ij}\left(B_0 + B_1 s_j(t-1)\right).$ (30)
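As a sanity check, the identity between Eqs. (7) and (30) can be verified numerically once the payoffs of Eq. (25) are computed with the population-level cooperation density of Eq. (27). The following snippet is our own illustration (all parameter values are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, b = 20, 1.5
J = rng.normal(0, 0.3 / np.sqrt(N), (N, N)); np.fill_diagonal(J, 0)
s = rng.choice([-1, 1], size=N)                     # one strategy configuration
fc = np.mean((1 + s) / 2)                           # f(c) of Eq. (27)
x = 2 * ((1 + s) / 2 * fc + (1 - s) / 2 * b * fc)   # rewards of Eq. (25)
B0, B1 = (1 - b) * fc / 2, (1 + b) * fc / 2         # Eqs. (29a)-(29b)
H_7 = J @ (s * x / 2)                               # field via Eq. (7)
H_30 = J @ (B0 + B1 * s)                            # field via Eq. (30)
print(np.allclose(H_7, H_30))                       # prints: True
```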

Appendix B nMF approximation

Splitting $s_j(t-1)$ into its non-fluctuating and fluctuating parts gives:

$H_i(t) = \sum_j J_{ij}\left(B_0 + B_1\left(m_j + \delta s_j(t-1)\right)\right),$ (31)

and define

$b_i = \sum_j J_{ij}(B_0 + B_1 m_j),$ (32)

then

$\langle\tanh[H_i(t)]\rangle = \left\langle\tanh\left[b_i + B_1\sum_j J_{ij}\,\delta s_j(t-1)\right]\right\rangle.$ (33)

Expanding the $\tanh$ function around $b_i$ to first order, we get:

$\langle\tanh[H_i(t)]\rangle = \tanh b_i + B_1(1 - \tanh^2 b_i)\left\langle\sum_j J_{ij}\,\delta s_j(t-1)\right\rangle,$ (34)

where the second term on the right-hand side vanishes. Hence we have the naive mean-field approximation for $m_i$:

$m_i = \tanh b_i,$ (35)

which is the stationary solution of the equation of motion for the magnetization, Eq. (15a).

Acknowledgements.
We sincerely thank John Hertz, Erik Aurell and Pan Zhang for heuristic and fruitful discussions and suggestions. H.-L. Zeng was supported by NSFC Project No. 11705097. HLZ was also supported by Natural Science Foundation of Jiangsu Province (BK20170895), Natural Science Fund for Colleges and Universities in Jiangsu Province (17KJB140015), Jiangsu Government Scholarship for Overseas Studies of 2018, and Scientific Research Foundation of Nanjing University of Posts and Telecommunications (NY217013). S.-M. Qin was supported by Project NSFC No. 11705279.

References

  • Schneidman et al. (2006) E. Schneidman, M. J. Berry II, R. Segev,  and W. Bialek, Nature 440, 1007 (2006).
  • Nguyen et al. (2016) J. P. Nguyen, F. B. Shipley, A. N. Linder, G. S. Plummer, M. Liu, S. U. Setru, J. W. Shaevitz,  and A. M. Leifer, Proceedings of the National Academy of Sciences 113, E1074 (2016).
  • Roudi and Hertz (2011a) Y. Roudi and J. Hertz, Physical review letters 106, 048702 (2011a).
  • Zeng and Aurell (2020a) H.-L. Zeng and E. Aurell, “Inferring genetic fitness from genomic data,”  (2020a), arXiv:2001.02173 [q-bio.PE] .
  • Pincus and Kalman (2004) S. Pincus and R. E. Kalman, Proceedings of the National Academy of Sciences 101, 13709 (2004), https://www.pnas.org/content/101/38/13709.full.pdf .
  • Bury (2013) T. Bury, Physica A: Statistical Mechanics and its Applications 392, 1375 (2013).
  • Zeng et al. (2014) H.-L. Zeng, R. Lemoy,  and M. Alava, Journal of Statistical Mechanics: Theory and Experiment 2014, P07008 (2014).
  • Masuda and Aihara (2003) N. Masuda and K. Aihara, Physics Letters, Section A: General, Atomic and Solid State Physics 313, 55 (2003).
  • Szabó and Töke (1998) G. Szabó and C. Töke, Physical Review E 58, 69 (1998).
  • Chen et al. (2007) Y. S. Chen, H. Lin,  and C. X. Wu, Physica A: Statistical Mechanics and its Applications 385, 379 (2007).
  • Kirkpatrick and Sherrington (1978) S. Kirkpatrick and D. Sherrington, Physical Review B 17, 4384 (1978).
  • Nguyen et al. (2017) H. C. Nguyen, R. Zecchina,  and J. Berg, Advances in Physics 66, 197 (2017).
  • Aurell and Ekeberg (2012) E. Aurell and M. Ekeberg, Phys. Rev. Lett. 108, 090201 (2012).
  • Cocco et al. (2009a) S. Cocco, S. Leibler,  and R. Monasson, Proceedings of the National Academy of Sciences 106, 14058 (2009a).
  • Weigt et al. (2009) M. Weigt, R. A. White, H. Szurmant, J. A. Hoch,  and T. Hwa, Proceedings of the National Academy of Sciences 106, 67 (2009).
  • Pillow et al. (2008) J. W. Pillow, J. Shlens, L. Paninski, A. Sher, A. M. Litke, E. Chichilnisky,  and E. P. Simoncelli, Nature 454, 995 (2008).
  • Roudi and Hertz (2011b) Y. Roudi and J. Hertz, Phys. Rev. Lett. 106, 048702 (2011b).
  • Zhang (2012) P. Zhang, Journal of Statistical Physics 148, 502 (2012).
  • Zeng et al. (2011) H.-L. Zeng, E. Aurell, M. Alava,  and H. Mahmoudi, Physical Review E 83, 041135 (2011).
  • Zeng et al. (2013) H.-L. Zeng, M. Alava, E. Aurell, J. Hertz,  and Y. Roudi, Physical review letters 110, 210601 (2013).
  • Roudi et al. (2009) Y. Roudi, S. Nirenberg,  and P. E. Latham, PLoS computational biology 5, e1000380 (2009).
  • Zeng and Aurell (2020b) H.-L. Zeng and E. Aurell, arXiv:2002.05222  (2020b).
  • Cocco et al. (2009b) S. Cocco, S. Leibler,  and R. Monasson, Proceedings of the National Academy of Sciences 106, 14058 (2009b).
  • Nowak and May (1992) M. A. Nowak and R. M. May, Nature 359, 826 (1992).
  • Huberman and Glance (1993) B. A. Huberman and N. S. Glance, Proceedings of the National Academy of Sciences 90, 7716 (1993).
  • Greil et al. (2007) F. Greil, B. Drossel,  and J. Sattler, New Journal of Physics 9, 373 (2007).
  • Zeng et al. (2020) H.-L. Zeng, C.-P. Zhu, S.-X. Wang, Y.-D. Guo, Z.-M. Gu,  and C.-K. Hu, Physica A: Statistical Mechanics and its Applications 540, 123191 (2020).
  • Wang et al. (2016) W.-X. Wang, Y.-C. Lai,  and C. Grebogi, Physics Reports 644, 1 (2016).
  • Wang et al. (2011) W.-X. Wang, Y.-C. Lai, C. Grebogi,  and J. Ye, Phys. Rev. X 1, 021021 (2011).
  • Foucart and Rauhut (2017) S. Foucart and H. Rauhut, Bull. Am. Math 54, 151 (2017).
  • Guo et al. (2016) Q. Guo, G. Liang, J.-Q. Fu, J.-T. Han,  and J.-G. Liu, Physical Review E 94, 052303 (2016).
  • Wu et al. (2016) K. Wu, J. Liu,  and S. Wang, Scientific reports 6, 37771 (2016).
  • Steinke et al. (2007) F. Steinke, M. Seeger,  and K. Tsuda, BMC systems biology 1, 51 (2007).
  • Chang et al. (2014) Y. H. Chang, J. W. Gray,  and C. J. Tomlin, BMC bioinformatics 15, 400 (2014).
  • Li et al. (2017) J. Li, Z. Shen, W.-X. Wang, C. Grebogi,  and Y.-C. Lai, Phys. Rev. E 95, 032303 (2017).
  • Glauber (1963) R. J. Glauber, Journal of mathematical physics 4, 294 (1963).
  • Van Kampen (1992) N. G. Van Kampen, Stochastic processes in physics and chemistry, Vol. 1 (Elsevier, 1992).
  • Kappen and Spanjers (2000) H. J. Kappen and J. J. Spanjers, Phys. Rev. E 61, 5658 (2000).