
Perturbation theory for evolution of cooperation on networks

Lingqi Meng ([email protected]) · Naoki Masuda^{1,2}

1. Department of Mathematics, University at Buffalo, State University of New York, Buffalo, NY 14260-2900, USA

2. Computational and Data-Enabled Science and Engineering Program, University at Buffalo, State University of New York, Buffalo, NY 14260-5030, USA
Abstract

Network structure is a mechanism for promoting cooperation in social dilemma games. In the present study, we explore graph surgery, i.e., slightly perturbing a given network, towards a network that better fosters cooperation. To this end, we develop a perturbation theory to assess the change in the propensity of cooperation when we add or remove a single edge to/from the given network. Our perturbation theory builds on a previously proposed random-walk-based theory that provides the threshold benefit-to-cost ratio, (b/c)^{*}, which is the value of the benefit-to-cost ratio in the donation game above which the cooperator is more likely to fixate than in a control case, for any finite network. We find that (b/c)^{*} decreases when we remove a single edge in a majority of cases and that our perturbation theory captures with reasonable accuracy which edge removal makes (b/c)^{*} small to facilitate cooperation. In contrast, (b/c)^{*} tends to increase when we add an edge, and the perturbation theory is not good at predicting the edge addition that changes (b/c)^{*} by a large amount. Our perturbation theory significantly reduces the computational complexity of evaluating the outcome of graph surgery.

keywords:
evolutionary game; prisoner’s dilemma; network reciprocity; fixation; stochastic dynamics

1 Introduction

Since Darwin’s time, explaining cooperative behavior in groups of self-interested individuals has been a challenge [1, 2, 3, 4, 5, 6, 7, 8]. Game theory, including evolutionary game theory, has shown that a population of self-interested individuals playing a social dilemma game of the prisoner’s dilemma type does not sustain cooperation without an additional mechanism. To explain cooperation in social dilemma situations in nature, including in biological populations, and to promote cooperation in human society, various mathematical mechanisms supporting cooperation have been proposed. Population structure, as represented by contact networks of individuals, is one such mechanism. The structure of contact networks constrains who can interact with whom and promotes the emergence and endurance of clusters of cooperative players, both in local regions of spatial lattices [2, 9, 10, 11] and in adjacent pairs of nodes in general networks [12, 13, 14, 11, 15].

A major indicator of the success of a mutant trait in evolutionary dynamics is the fixation probability. It is defined as the probability that the mutant type will spread and eventually occupy the entire population as a result of evolutionary dynamics, given an initial distribution of mutants [16, 17, 18, 5]. When each individual is in either of the two types (i.e., wild and mutant) at any given time and the population structure is described by a network on N nodes, the state of the network is specified by an N-dimensional binary vector whose ith entry encodes the type of the ith node. In the absence of mutation, the fixation probability of the mutant starting from the state in which all the nodes are of the wild type is equal to 0. The fixation probability of the mutant is equal to 1 if all the nodes are initially mutant. For general initial conditions, the exact calculation of the fixation probability requires solving a linear system of 2^{N}-2 equations [18, 13]. Therefore, it is difficult to exactly compute the fixation probability except for small networks, highly symmetric networks, or networks with other mathematically convenient properties.

We focus on social dilemma situations, in particular the prisoner’s dilemma game, in the present paper. In the prisoner’s dilemma, the wild and mutant types correspond to cooperator and defector, respectively, or vice versa. The calculation of the fixation probability for the prisoner’s dilemma game on networks, potentially with some additional assumptions, is usually more involved than in the case of constant selection, in which the fitness of the wild and mutant types is fixed throughout the evolutionary dynamics. In games, the fitness of an individual generally depends on how other individuals behave, which makes setting up the linear system of 2^{N}-2 equations and efficiently solving it, particularly the latter, a difficult task. Under this circumstance, weak selection is an assumption that often facilitates analytical evaluation of the fixation probability of the mutant type, including in social dilemma games [16]. Let us write down each individual’s fitness as a sum of a constant term, called the baseline fitness, and the payoff that the individual receives by playing the game. By definition, weak selection means that the payoff is small compared to the baseline fitness. Under weak selection, Ohtsuki et al. developed a pair approximation theory that enables us to analytically derive the conditions under which cooperation fixates with a larger probability than a baseline on random regular graphs, i.e., random graphs in which all nodes have the same number of neighbors [13]. Furthermore, Allen et al. extended this result to the case of arbitrary networks using coalescence times from random walk theory [15]. With these methods, one can avoid dealing with a set of 2^{N}-2 linear equations and calculate the leading term of the fixation probability in polynomial time in terms of N.

In Ref. [15], the authors derived a key indicator to quantify the ease of cooperation in networks, i.e., the threshold benefit-to-cost ratio above which selection favors cooperation, denoted by (b/c)^{*}. In fact, substantial changes in (b/c)^{*} may occur when one only slightly perturbs the network structure, an operation referred to as graph surgery [15]. A carefully designed graph surgery may enhance cooperation by reducing (b/c)^{*} by a larger amount than a random graph surgery does. For example, a small mean degree (i.e., number of neighbors that a node has) of the network tends to induce cooperation [13, 15]. Therefore, decreasing the weight of an edge or removing an edge is expected to enhance cooperation. However, this may not be an optimal choice. Which particular edge should we perturb or remove to efficiently enhance cooperation? One can answer this question by removing just one edge from the original network, calculating (b/c)^{*} for the perturbed network, and repeating the same procedure for each different perturbation of the original network. However, this procedure may be computationally costly. Note that the method to calculate the fixation probability for cooperation in arbitrary networks, developed in Ref. [15], is still computationally costly although its computational complexity is polynomial in N.

In the current study, we develop a perturbation theory with the aim of predicting the direction and amount of the change in (b/c)^{*} when one slightly perturbs the weight of an arbitrary single edge. We find that, for most networks, the actual change in (b/c)^{*} when we remove an edge and the change predicted by our perturbation theory are strongly correlated, which makes it possible to propose a single edge to be removed for efficiently enhancing cooperation. However, the correlation between the result of direct numerical simulations and the perturbation theory is considerably weaker when one adds a new edge to the existing network. Therefore, our perturbation theory is not practically useful when one adds new edges. Compared to the direct numerical simulations, our perturbation theory is much faster, which allows us to compute the fixation probability under graph surgery in larger networks.

2 Fixation of cooperation on networks under weak selection

We assume that the graph G is connected and undirected. We denote the set of nodes by V=\{1,\ldots,N\}, where N is the number of nodes. For each pair of nodes i,j\in V, we denote the edge weight by w_{ij}\geq 0. If there is no edge between i and j, we set w_{ij}=0. We allow self-loops, i.e., positive values of w_{ii} [15]. The weighted degree of node i, denoted by s_{i}=\sum_{j=1}^{N}w_{ij} and also called the node strength, is the sum of the weights of the edges connected to the node.

A discrete-time random walker is said to be simple if the walker located at node i moves to one of its neighbors, denoted by j, in a single time step with probability proportional to w_{ij}, i.e., with probability p_{ij}=w_{ij}/s_{i}. Let W=(w_{ij}) be the N\times N weighted adjacency matrix. The transition probability matrix P=(p_{ij}) of the simple random walk is given by P=D^{-1}W, where D=\mathrm{diag}(s_{1},\ldots,s_{N}) is the diagonal matrix whose diagonal entries are s_{1},s_{2},\ldots,s_{N}. Let \boldsymbol{\pi}=(\pi_{1},\ldots,\pi_{N}) be the stationary probability vector of the random walk with transition probability matrix P, i.e., the solution of \boldsymbol{\pi}P=\boldsymbol{\pi}. It holds true that [19, 20]

\pi_{i}=\frac{s_{i}}{\sum_{\ell=1}^{N}s_{\ell}},\quad i\in\{1,\ldots,N\}. (1)
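As a sanity check on Eq. (1), one can verify \boldsymbol{\pi}P=\boldsymbol{\pi} numerically. The small weighted graph below is an arbitrary illustration, not one of the networks studied in this paper.

```python
import numpy as np

# Weighted adjacency matrix of a small connected, undirected graph
# (an illustrative choice; any symmetric W with nonnegative entries works)
W = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])

s = W.sum(axis=1)        # weighted degrees (node strengths) s_i
P = W / s[:, None]       # transition matrix P = D^{-1} W

# Stationary distribution from Eq. (1): pi_i = s_i / sum_l s_l
pi = s / s.sum()

# Check that pi P = pi
assert np.allclose(pi @ P, pi)
```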

We use the donation game, which is a special case of the prisoner’s dilemma game. In the donation game, which is a two-player game, one player, called the donor, decides whether or not to pay a cost c (>0). If the donor pays c, which we refer to as cooperation, then the other player, called the recipient, receives benefit b (>c). If the donor does not pay c, which we refer to as defection, then the donor does not lose anything, and the recipient does not gain anything. Therefore, the payoff matrix of the donation game for a pair of players is given by

\bordermatrix{&\mathrm{C}&\mathrm{D}\cr\mathrm{C}&b-c&-c\cr\mathrm{D}&b&0}, (2)

where C and D represent cooperation and defection, respectively, and the payoff values are those for the row player. We assume that each player on a node participates in the game as the donor half of the time and as the recipient the other half.

We assign 0 and 1 to the defector and cooperator, respectively. Then, we can represent a state of the entire network by a binary vector \boldsymbol{x}=(x_{1},\ldots,x_{N})\in\{0,1\}^{N}. With this notation, the payoff of node i averaged over all its neighbors is given by

f_{i}(\boldsymbol{x})=-cx_{i}+b\sum_{j=1}^{N}p_{ij}x_{j}. (3)
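Equation (3) is straightforward to evaluate numerically. The following sketch computes the averaged payoffs on an unweighted triangle; the values of b, c, and the state vector are illustrative choices.

```python
import numpy as np

# Payoffs f_i(x) = -c x_i + b sum_j p_ij x_j, Eq. (3), on a triangle
b, c = 3.0, 1.0
W = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
P = W / W.sum(axis=1, keepdims=True)

x = np.array([1, 1, 0])      # nodes 0 and 1 cooperate; node 2 defects
f = -c * x + b * (P @ x)

# Node 2 pays no cost and both of its neighbors cooperate, so f_2 = b
assert np.isclose(f[2], b)
```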

The reproductive rate of node i in state \boldsymbol{x} is given by

R_{i}(\boldsymbol{x})=1+\eta f_{i}(\boldsymbol{x}), (4)

where \eta represents the strength of the selection. If \eta=0, the reproductive rate does not depend on the payoff matrix or the action (i.e., cooperation or defection) of any node. This case is equivalent to the so-called voter model. If \eta\to 0, the payoff weakly impacts the selection, and this limit is called the weak selection regime. The idea behind weak selection is that, in reality, many different factors may contribute to the overall fitness of an individual, and the game under consideration is just one such factor [13, 15].

We drive evolutionary dynamics by the death-birth process with selection on birth on an arbitrary network composed of cooperators and defectors [13, 15]. Specifically, we first select a node to be updated, denoted by i, uniformly at random. Second, we select one of i’s neighbors, denoted by j, for reproduction with probability proportional to w_{ij}R_{j}(\boldsymbol{x}). Third, node i, as the offspring of j, inherits the type of j. This completes a single round of the evolutionary dynamics, which we schematically show in Fig. 1.
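The three steps of a single round can be sketched as follows. The network, parameter values, and function name are illustrative choices, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def death_birth_round(x, W, b, c, eta):
    """One round of death-birth updating with selection on birth.
    x is the binary state vector (1 = cooperator, 0 = defector)."""
    N = len(x)
    P = W / W.sum(axis=1, keepdims=True)
    f = -c * x + b * (P @ x)          # payoffs, Eq. (3)
    R = 1.0 + eta * f                 # reproductive rates, Eq. (4)
    i = rng.integers(N)               # node to be updated, uniformly at random
    weights = W[i] * R                # neighbor j chosen prop. to w_ij R_j
    j = rng.choice(N, p=weights / weights.sum())
    x_new = x.copy()
    x_new[i] = x[j]                   # node i inherits the type of j
    return x_new

# One round on an unweighted triangle with a single initial cooperator
W = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
x = np.array([1, 0, 0])
x = death_birth_round(x, W, b=3.0, c=1.0, eta=0.01)
```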

Refer to caption
Figure 1: Death-birth process with selection on birth on an unweighted network. (a) Each individual obtains a payoff by interacting with all its neighbors. C and D represent cooperator and defector, respectively. (b) We select a node whose type is to be replaced uniformly at random, shown in gray. Then, one of the three neighbors of this node, whose payoff values are indicated, will replace the gray node. We select each of the cooperating neighbors for reproduction with probability [1+\eta(b/2-c)]/[1+\eta(b/2-c)+1+\eta(b/2-c)+1+\eta(2b/3)]=[6+3\eta(b-2c)]/[18+2\eta(5b-6c)] and the defecting neighbor with probability [1+\eta(2b/3)]/[1+\eta(b/2-c)+1+\eta(b/2-c)+1+\eta(2b/3)]=(3+2\eta b)/[9+\eta(5b-6c)]. (c) In this example, we select the cooperating neighbor to the left, which replaces the gray node.

The death-birth process in any finite population without mutation will eventually reach the state in which all individuals are cooperators or all are defectors and halt. In other words, cooperation or defection fixates in finite time with probability 1. Consider the initial conditions in which one node is a cooperator and the other N-1 nodes are defectors. There are N such initial conditions depending on which node is the cooperator. We consider the initial probability distribution over all possible states that assigns probability 1/N to each of the states with exactly one cooperator and probability zero to all the other states. We denote by \rho_{\mathrm{C}} the probability that cooperation fixates under this distribution of the initial state. If \rho_{\mathrm{C}}>1/N, natural selection favors cooperation [16, 13, 5, 15]. In Ref. [15], Allen et al. showed that

\rho_{\mathrm{C}}=\frac{1}{N}+\frac{\eta}{2N}\left[-c\tau_{2}+b(\tau_{3}-\tau_{1})\right]+O(\eta^{2}), (5)

where

\tau_{k}=\sum_{i=1}^{N}\sum_{j=1}^{N}\pi_{i}p_{ij}^{(k)}t_{ij}, (6)

p_{ij}^{(k)} is the (i,j)th entry of matrix P^{k}, which implies that p_{ij}^{(1)}=p_{ij}, and

t_{ij}=\begin{cases}0&\mathrm{if}\ i=j,\\ 1+\frac{1}{2}\sum_{k=1}^{N}(p_{ik}t_{jk}+p_{jk}t_{ik})&\mathrm{otherwise}.\end{cases} (7)

Equation (7) implies that t_{ij}=t_{ji} is the mean coalescence time of two random walkers when one walker is initially located at node i and the other at node j. Note that p_{ij}^{(k)} is the k-step transition probability of the random walk from node i to node j. Therefore, \tau_{k} is the expected value of t_{ij} when i and j are the two ends of a k-step random walk trajectory on G under the stationary distribution [15]. Equation (5) implies that the threshold value of the benefit-to-cost ratio above which natural selection favors cooperation (i.e., \rho_{\mathrm{C}}>1/N) is given by

\left(\frac{b}{c}\right)^{*}=\frac{\tau_{2}}{\tau_{3}-\tau_{1}}. (8)

Natural selection favors cooperation if b/c>(b/c)^{*}.
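Equations (6)-(8) can be implemented directly by solving the linear system (7) with a dense solver. This is a didactic O(N^{6}) sketch rather than the optimized method of Ref. [15], and the helper names are our own.

```python
import numpy as np

def coalescence_times(P):
    """Solve Eq. (7) for the mean coalescence times t_ij, using one
    equation per ordered pair (i, j); the row for (i, i) enforces t_ii = 0."""
    N = P.shape[0]
    A = np.eye(N * N)
    d = np.zeros(N * N)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            r = i * N + j
            d[r] = 1.0
            for k in range(N):
                A[r, j * N + k] -= P[i, k] / 2.0   # -(1/2) p_ik t_jk
                A[r, i * N + k] -= P[j, k] / 2.0   # -(1/2) p_jk t_ik
    return np.linalg.solve(A, d).reshape(N, N)

def bc_threshold(W):
    """(b/c)^* = tau_2 / (tau_3 - tau_1), from Eqs. (6) and (8)."""
    s = W.sum(axis=1)
    P = W / s[:, None]
    pi = s / s.sum()
    T = coalescence_times(P)
    tau = [pi @ ((np.linalg.matrix_power(P, k) * T).sum(axis=1))
           for k in (1, 2, 3)]
    return tau[1] / (tau[2] - tau[0])

# Cycle on six nodes, a 2-regular graph
N = 6
W = np.zeros((N, N))
for i in range(N):
    W[i, (i + 1) % N] = W[(i + 1) % N, i] = 1.0
print(bc_threshold(W))   # 4.0, consistent with Eqs. (9)-(11): (N-2)/(N/k-2)
```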

For example, if the underlying network is regular with degree k, we have

\tau_{1}=N-1, (9)
\tau_{2}=N-2, (10)

and

\tau_{3}=N+\frac{N}{k}-3, (11)

such that

\left(\frac{b}{c}\right)^{*}=k (12)

as N\to\infty [15]. Note that the right-hand side of Eq. (8) only depends on the adjacency matrix of the network. In other words, the structure of the contact network determines whether and how much natural selection favors cooperation.
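For a k-regular graph, Eqs. (9)-(11) give the closed form (b/c)^{*}=(N-2)/(N/k-2), which is positive for N>2k and tends to k as N\to\infty. A minimal sketch of the convergence:

```python
# (b/c)^* for a k-regular graph from Eqs. (9)-(11):
# tau_2 / (tau_3 - tau_1) = (N - 2) / (N/k - 2), valid for N > 2k
def bc_star_regular(N, k):
    return (N - 2) / (N / k - 2)

for N in (10, 100, 1000, 10000):
    print(N, bc_star_regular(N, k=3))   # approaches k = 3 as N grows
```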

3 Perturbation theory for graph surgery

In this section, we develop a perturbation theory to determine the change in (b/c)^{*} when one perturbs the weight of a single edge. To this end, we start by rewriting Eq. (6) in terms of matrices and vectors. Let \boldsymbol{1}=(1,\ldots,1)^{\top}, where \top represents the transposition. Let T=(t_{ij}) be the N\times N matrix of the mean coalescence times. Using these notations, we rewrite Eq. (6) as

\tau_{k}=\boldsymbol{\pi}\left(P^{k}\circ T\right)\boldsymbol{1}, (13)

where k=1,2,3, and \circ represents the Hadamard product.

If one changes the weight of an edge (i_{0},j_{0}) by \varepsilon, where |\varepsilon|\ll 1, including the case in which we create a new edge with weight \varepsilon (>0), the perturbed network remains connected and undirected. Therefore, one can still use Eq. (8) to compute (b/c)^{*}. Equation (8) uses Eq. (6), which requires \boldsymbol{\pi}, P, and T. We denote these variables after the perturbation by \boldsymbol{\pi}(\varepsilon), P(\varepsilon), and T(\varepsilon). To distinguish the quantities before and after the perturbation, we denote these variables before the perturbation by \boldsymbol{\pi}(0), P(0), and T(0).

To write down \boldsymbol{\pi}(\varepsilon), we denote by

S=\sum_{i=1}^{N}s_{i}=\sum_{i=1}^{N}\sum_{j=1}^{N}w_{ij} (14)

the sum of the weighted degrees over all the nodes. Under a small perturbation, we carry out a Taylor expansion of Eq. (1) to obtain

\boldsymbol{\pi}(\varepsilon)=\boldsymbol{\pi}(0)+\varepsilon\Delta\boldsymbol{\pi}+o(\varepsilon), (15)

where \Delta\boldsymbol{\pi}=(\Delta\pi_{1},\ldots,\Delta\pi_{N}). We obtain

\Delta\pi_{i}=\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{S}-\frac{2\pi_{i}(0)}{S}, (16)

where \delta_{ij} is the Kronecker delta. We present the derivation of Eq. (16) in Appendix A.
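Equation (16) can be checked against a finite-difference computation of the stationary distribution. The graph, the perturbed pair (i_{0},j_{0}) (here a newly created edge), and \varepsilon below are illustrative choices.

```python
import numpy as np

def stationary(W):
    s = W.sum(axis=1)
    return s / s.sum()

# Small connected undirected graph (illustrative)
W = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0]])
i0, j0, eps = 0, 3, 1e-6
S = W.sum()               # S of Eq. (14); counts both directions of each edge
pi0 = stationary(W)

# Predicted first-order change, Eq. (16)
delta = np.zeros(len(W))
delta[[i0, j0]] += 1.0 / S
delta -= 2.0 * pi0 / S

# Direct perturbation: add eps to w_{i0 j0} in both directions (undirected)
Wp = W.copy()
Wp[i0, j0] += eps
Wp[j0, i0] += eps
fd = (stationary(Wp) - pi0) / eps

assert np.allclose(fd, delta, atol=1e-6)
```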

To calculate P(\varepsilon), we define a symmetric indicator function, denoted by \chi_{i_{0}j_{0}}, by

\chi_{i_{0}j_{0}}(i,j)=\begin{cases}1&\mathrm{if\ }(i,j)=(i_{0},j_{0})\ \mathrm{or}\ (i,j)=(j_{0},i_{0}),\\ 0&\mathrm{otherwise}.\end{cases} (17)

We obtain

P(\varepsilon)=P(0)+\varepsilon\Theta^{(1)}+o(\varepsilon), (18)
P^{2}(\varepsilon)=P^{2}(0)+\varepsilon\left[\Theta^{(1)}P(0)+P(0)\Theta^{(1)}\right]+o(\varepsilon)
:=P^{2}(0)+\varepsilon\Theta^{(2)}+o(\varepsilon), (19)
P^{3}(\varepsilon)=P^{3}(0)+\varepsilon\left[\Theta^{(1)}P^{2}(0)+P(0)\Theta^{(1)}P(0)+P^{2}(0)\Theta^{(1)}\right]+o(\varepsilon)
:=P^{3}(0)+\varepsilon\Theta^{(3)}+o(\varepsilon), (20)

where \Theta^{(1)}=(\theta^{(1)}_{ij}), \Theta^{(2)}=(\theta^{(2)}_{ij}), and \Theta^{(3)}=(\theta^{(3)}_{ij}) are N\times N matrices whose entries are given by

\theta^{(1)}_{ij}=\frac{\chi_{i_{0}j_{0}}(i,j)}{s_{i}}-p_{ij}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}, (21)
\theta^{(2)}_{ij}=\frac{\delta_{ii_{0}}p_{j_{0}j}(0)}{s_{i}}+\frac{\delta_{ij_{0}}p_{i_{0}j}(0)}{s_{i}}-p_{ij}^{(2)}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}
+\frac{\delta_{jj_{0}}p_{ii_{0}}(0)}{s_{i_{0}}}+\frac{\delta_{ji_{0}}p_{ij_{0}}(0)}{s_{j_{0}}}-p_{ii_{0}}(0)p_{i_{0}j}(0)\frac{1}{s_{i_{0}}}-p_{ij_{0}}(0)p_{j_{0}j}(0)\frac{1}{s_{j_{0}}}, (22)

and

\theta^{(3)}_{ij}=\frac{\delta_{ii_{0}}}{s_{i_{0}}}p^{(2)}_{j_{0}j}(0)+\frac{\delta_{ij_{0}}}{s_{j_{0}}}p^{(2)}_{i_{0}j}(0)-p^{(3)}_{ij}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}
+\frac{p_{ii_{0}}(0)p_{j_{0}j}(0)}{s_{i_{0}}}+\frac{p_{ij_{0}}(0)p_{i_{0}j}(0)}{s_{j_{0}}}-\frac{p_{ii_{0}}(0)p_{i_{0}j}^{(2)}(0)}{s_{i_{0}}}-\frac{p_{ij_{0}}(0)p_{j_{0}j}^{(2)}(0)}{s_{j_{0}}}
+\frac{\delta_{jj_{0}}}{s_{i_{0}}}p^{(2)}_{ii_{0}}(0)+\frac{\delta_{ji_{0}}}{s_{j_{0}}}p^{(2)}_{ij_{0}}(0)-p^{(2)}_{ii_{0}}(0)p_{i_{0}j}(0)\frac{1}{s_{i_{0}}}-p^{(2)}_{ij_{0}}(0)p_{j_{0}j}(0)\frac{1}{s_{j_{0}}}. (23)

We show the derivation of Eqs. (21), (22), and (23) in Appendix B.

We next calculate T(\varepsilon). Matrix T(0)=(t_{ij}(0)) satisfies

t_{ij}(0)=\begin{cases}0&\mathrm{if}\ i=j,\\ 1+\frac{1}{2}\left[\sum_{k=1}^{j-1}p_{ik}(0)t_{kj}(0)+\sum_{k=j+1}^{N}p_{ik}(0)t_{jk}(0)+\sum_{k=1}^{i-1}p_{jk}(0)t_{ki}(0)+\sum_{k=i+1}^{N}p_{jk}(0)t_{ik}(0)\right]&\mathrm{if}\ i<j,\\ t_{ji}(0)&\mathrm{if}\ i>j,\end{cases} (24)

which we obtain by applying t_{ij}(0)=t_{ji}(0) to Eq. (7). Note that \{p_{11}(0),p_{12}(0),\ldots,p_{NN}(0)\} are known from the network structure and that \{t_{11}(0),t_{12}(0),\ldots,t_{NN}(0)\} are unknowns. We stack Eq. (24) for the different i and j values in lexicographic order of (i,j) on the left-hand side. In other words, the first equation is t_{11}(0)=0, the second equation is t_{12}(0)-\frac{1}{2}p_{11}(0)t_{12}(0)-\frac{1}{2}\sum_{k=3}^{N}p_{1k}(0)t_{2k}(0)-\frac{1}{2}\sum_{k=2}^{N}p_{2k}(0)t_{1k}(0)=1, the third equation is t_{13}(0)-\frac{1}{2}p_{11}(0)t_{13}(0)-\frac{1}{2}p_{12}(0)t_{23}(0)-\frac{1}{2}\sum_{k=4}^{N}p_{1k}(0)t_{3k}(0)-\frac{1}{2}\sum_{k=2}^{N}p_{3k}(0)t_{1k}(0)=1, and so on. Denote by \mathrm{vec}(T(0)) the thus obtained vectorization of matrix T(0), i.e.,

\mathrm{vec}(T(0))=(t_{11}(0),\ldots,t_{1N}(0);t_{21}(0),\ldots,t_{2N}(0);\ldots;t_{N1}(0),\ldots,t_{NN}(0))^{\top}. (25)

Equation (25) is a redundant expression because T(0) is a symmetric matrix and its diagonal elements are equal to 0. However, we use Eq. (25) in the following text because it makes the theoretical derivations and computational implementation easier than the most compact vector form of T(0), which would be N(N-1)/2-dimensional. Using Eq. (25), we rewrite Eq. (24) as

M(0)\mathrm{vec}(T(0))=\boldsymbol{d}, (26)

where M(0) is the N^{2}\times N^{2} matrix whose entries are determined by Eq. (24), and \boldsymbol{d} is the N^{2}-dimensional column vector whose ((k-1)N+k)th entry is equal to 0 for all k\in\{1,\ldots,N\} and whose other entries are equal to 1. Because it also holds true that M(\varepsilon)\mathrm{vec}(T(\varepsilon))=\boldsymbol{d}, the calculation of T(\varepsilon) requires M(\varepsilon), i.e., the matrix under the perturbation, defined similarly to M(0). We obtain the entries of M(\varepsilon) from those of M(0) with each p_{ij}(0) (with i,j\in\{1,\ldots,N\}) replaced by p_{ij}(\varepsilon). We write the Taylor expansion of M(\varepsilon) as
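For concreteness, the system (26) can be assembled explicitly. In this sketch we fill the rows for all ordered pairs directly from the unsymmetrized recurrence (7) instead of the symmetrized rows of Eq. (24); both formulations have the same solution \mathrm{vec}(T(0)), and the function name is our own.

```python
import numpy as np

def build_system(P):
    """Assemble an N^2 x N^2 system M vec(T) = d with the same solution as
    Eq. (26): one row per ordered pair (i, j), and rows for (i, i) enforce
    t_ii = 0."""
    N = P.shape[0]
    M = np.eye(N * N)
    d = np.zeros(N * N)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            r = i * N + j
            d[r] = 1.0
            for k in range(N):
                M[r, j * N + k] -= P[i, k] / 2.0
                M[r, i * N + k] -= P[j, k] / 2.0
    return M, d

# Path graph 0 - 1 - 2 as a toy example
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
P = W / W.sum(axis=1, keepdims=True)
M, d = build_system(P)
T = np.linalg.solve(M, d).reshape(3, 3)

assert np.allclose(T, T.T)             # coalescence times are symmetric
assert np.allclose(np.diag(T), 0.0)    # and vanish on the diagonal
```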

M(\varepsilon)=M(0)+\varepsilon\Delta M+o(\varepsilon) (27)

and calculate \Delta M as follows.

We write \Delta M as a block matrix

\Delta M=\begin{pmatrix}\Delta_{11}&\Delta_{12}&\cdots&\Delta_{1N}\\ \Delta_{21}&\Delta_{22}&\cdots&\Delta_{2N}\\ \vdots&\vdots&\ddots&\vdots\\ \Delta_{N1}&\Delta_{N2}&\cdots&\Delta_{NN}\end{pmatrix}, (28)

where each \Delta_{ij} is an N\times N matrix. We show the derivation of each \Delta_{ij} in Appendix C. We point out that the number of nonzero rows of \Delta M is equal to 2\times(N-2)+(N-1)\times 2=4N-6, which is much smaller than N^{2} for large N (see Appendix C).

To derive the first-order term of T(\varepsilon) from \Delta M, we use Eq. (27) to obtain

\mathrm{vec}(T(\varepsilon))=M(\varepsilon)^{-1}\boldsymbol{d}
=\left[M(0)+\varepsilon\Delta M+o(\varepsilon)\right]^{-1}\boldsymbol{d}
=\left[M(0)\left(I+\varepsilon M(0)^{-1}\Delta M+o(\varepsilon)\right)\right]^{-1}\boldsymbol{d}
=\left[I-\varepsilon M(0)^{-1}\Delta M+o(\varepsilon)\right]M(0)^{-1}\boldsymbol{d}
=\left(I-\varepsilon M(0)^{-1}\Delta M\right)\mathrm{vec}(T(0))+o(\varepsilon)
=\mathrm{vec}(T(0))-\varepsilon M(0)^{-1}\Delta M\,\mathrm{vec}(T(0))+o(\varepsilon). (29)

Therefore, we obtain

T(\varepsilon):=T(0)+\varepsilon\Delta T+o(\varepsilon), (30)

where \Delta T is the N\times N matrix satisfying

\mathrm{vec}(\Delta T)=-M(0)^{-1}\Delta M\,\mathrm{vec}(T(0)). (31)

Finally, using Eq. (13), we derive the perturbed \tau_{k}(\varepsilon) as follows:

\tau_{k}(\varepsilon)=\tau_{k}(0)+\varepsilon\Gamma_{k}+o(\varepsilon), (32)

where

\Gamma_{k}=\Delta\boldsymbol{\pi}\left(P^{k}(0)\circ T(0)\right)\boldsymbol{1}+\boldsymbol{\pi}(0)\left(\Theta^{(k)}\circ T(0)\right)\boldsymbol{1}+\boldsymbol{\pi}(0)\left(P^{k}(0)\circ\Delta T\right)\boldsymbol{1}. (33)

By substituting Eq. (32) in Eq. (8), we obtain

\left(\frac{b}{c}\right)^{*}(\varepsilon):=\left(\frac{b}{c}\right)^{*}(0)+\varepsilon\Delta\left(\frac{b}{c}\right)^{*}+o(\varepsilon), (34)

where

\Delta\left(\frac{b}{c}\right)^{*}=\frac{(\tau_{3}(0)-\tau_{1}(0))\Gamma_{2}-\tau_{2}(0)(\Gamma_{3}-\Gamma_{1})}{(\tau_{3}(0)-\tau_{1}(0))^{2}}. (35)

4 Time complexity

To calculate (b/c)^{*} for a network with N nodes, the original algorithm requires calculating the mean coalescence times by solving a linear system of N(N-1)/2 variables, i.e., t_{ij} (with i,j\in\{1,\ldots,N\} and i<j), which has a time complexity of O(N^{6}). With the Coppersmith-Winograd algorithm [21], the time complexity is reduced to O(N^{4.75}) [15]. To determine the single edge whose removal decreases (b/c)^{*} by the largest amount, for example, one needs to repeat this procedure for each edge. Therefore, the entire procedure with an ordinary algorithm and with the Coppersmith-Winograd algorithm requires O(N^{6}|E|) and O(N^{4.75}|E|) time, respectively, where |E| is the number of edges. For a sparse network, for which |E|=O(N), the time complexity is O(N^{7}) and O(N^{5.75}), respectively.
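The per-edge procedure described above (recompute (b/c)^{*} once for each candidate edge removal) can be sketched as follows. This is the costly baseline that the perturbation theory replaces; the function names and the connectivity test are our own illustrative choices, and the threshold computation from section 2 is repeated so that the sketch is self-contained.

```python
import itertools

import numpy as np

def bc_threshold(W):
    """(b/c)^* = tau_2 / (tau_3 - tau_1), computed directly from Eqs. (6)-(8)."""
    N = W.shape[0]
    s = W.sum(axis=1)
    P = W / s[:, None]
    pi = s / s.sum()
    A = np.eye(N * N)
    d = np.zeros(N * N)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            r = i * N + j
            d[r] = 1.0
            for k in range(N):
                A[r, j * N + k] -= P[i, k] / 2.0
                A[r, i * N + k] -= P[j, k] / 2.0
    T = np.linalg.solve(A, d).reshape(N, N)
    tau = [pi @ ((np.linalg.matrix_power(P, m) * T).sum(axis=1)) for m in (1, 2, 3)]
    return tau[1] / (tau[2] - tau[0])

def best_edge_removal(W):
    """Try removing each edge in turn and return the edge minimizing (b/c)^*.
    Removals that disconnect the network are skipped."""
    N = W.shape[0]
    best = (None, np.inf)
    for i, j in itertools.combinations(range(N), 2):
        if W[i, j] == 0:
            continue
        Wp = W.copy()
        Wp[i, j] = Wp[j, i] = 0.0
        # connectivity check: (I + A)^N has no zero entry iff connected
        reach = np.linalg.matrix_power(np.eye(N) + (Wp > 0), N) > 0
        if not reach.all():
            continue
        val = bc_threshold(Wp)
        if val < best[1]:
            best = ((i, j), val)
    return best

# Example: the 6-cycle (all removals are equivalent by symmetry)
N = 6
W = np.zeros((N, N))
for i in range(N):
    W[i, (i + 1) % N] = W[(i + 1) % N, i] = 1.0
edge, val = best_edge_removal(W)
```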

The matrix \Delta M defined by Eq. (28) is sparse and has a special pattern. If the ith row of \Delta M is a zero row, then the ith element of vector \Delta M\mathrm{vec}(T(0)) is zero, and we do not need to calculate it. Therefore, to calculate \Delta M\mathrm{vec}(T(0)), we only need to focus on its ((i_{0}-1)N+k)th entries, where k\in\{1,\ldots,N\}\setminus\{i_{0}\}, its ((j_{0}-1)N+k)th entries, where k\in\{1,\ldots,N\}\setminus\{j_{0}\}, and its ((k-1)N+i_{0})th and ((k-1)N+j_{0})th entries, where k\in\{1,\ldots,N\}\setminus\{i_{0},j_{0}\}. All the other entries of \Delta M\mathrm{vec}(T(0)) are equal to 0. We show a pseudo algorithm to calculate \Delta T in Algorithm 1.

Input: Matrices \Theta^{(1)} and M(0)^{-1}; vector (\boldsymbol{v}_{1},\boldsymbol{v}_{2},\ldots,\boldsymbol{v}_{N})^{\top}; edge (i_{0},j_{0})
Output: Matrix \Delta T

/* Compute \Delta M\mathrm{vec}(T(0)) */
Initialize the N^{2}-dimensional vector u=0
while k\in\{1,\ldots,N\}\setminus\{i_{0},j_{0}\} do
      u_{(i_{0}-1)N+k}\leftarrow\Theta_{i_{0}}\cdot\boldsymbol{v}_{k}
      u_{(j_{0}-1)N+k}\leftarrow\Theta_{j_{0}}\cdot\boldsymbol{v}_{k}
      u_{(k-1)N+i_{0}}\leftarrow\Theta_{i_{0}}\cdot\boldsymbol{v}_{k}
      u_{(k-1)N+j_{0}}\leftarrow\Theta_{j_{0}}\cdot\boldsymbol{v}_{k}
end while
u_{(i_{0}-1)N+j_{0}}\leftarrow\Theta_{i_{0}}\cdot\boldsymbol{v}_{j_{0}}+\Theta_{j_{0}}\cdot\boldsymbol{v}_{i_{0}}
u_{(j_{0}-1)N+i_{0}}\leftarrow\Theta_{i_{0}}\cdot\boldsymbol{v}_{j_{0}}+\Theta_{j_{0}}\cdot\boldsymbol{v}_{i_{0}}
/* u=(u_{1},\ldots,u_{N^{2}})^{\top} is now equal to \Delta M\mathrm{vec}(T(0)) */

/* Compute \mathrm{vec}(\Delta T) */
/* Multiply M(0)^{-1} and the already calculated \Delta M\mathrm{vec}(T(0)) */
Initialize the N^{2}-dimensional vector \mathrm{vec}(\Delta T)=0
while k\in\{1,\ldots,N\}\setminus\{i_{0},j_{0}\} do
      \mathrm{vec}(\Delta T)\leftarrow\mathrm{vec}(\Delta T)+u_{(i_{0}-1)N+k}\left(M(0)^{-1}\right)_{(i_{0}-1)N+k}
      \mathrm{vec}(\Delta T)\leftarrow\mathrm{vec}(\Delta T)+u_{(j_{0}-1)N+k}\left(M(0)^{-1}\right)_{(j_{0}-1)N+k}
      \mathrm{vec}(\Delta T)\leftarrow\mathrm{vec}(\Delta T)+u_{(k-1)N+i_{0}}\left(M(0)^{-1}\right)_{(k-1)N+i_{0}}
      \mathrm{vec}(\Delta T)\leftarrow\mathrm{vec}(\Delta T)+u_{(k-1)N+j_{0}}\left(M(0)^{-1}\right)_{(k-1)N+j_{0}}
end while
\mathrm{vec}(\Delta T)\leftarrow\mathrm{vec}(\Delta T)+u_{(i_{0}-1)N+j_{0}}\left(M(0)^{-1}\right)_{(i_{0}-1)N+j_{0}}
\mathrm{vec}(\Delta T)\leftarrow\mathrm{vec}(\Delta T)+u_{(j_{0}-1)N+i_{0}}\left(M(0)^{-1}\right)_{(j_{0}-1)N+i_{0}}
Return \Delta T
Algorithm 1: Pseudoalgorithm to compute \Delta T. Let \left(M(0)^{-1}\right)_{i} be the ith column of M(0)^{-1}. Let \Theta_{i} be the ith row of \Theta^{(1)}, where \Theta^{(1)} is defined by Eq. (21). Let \boldsymbol{v}_{i} be the ith row of T(0) such that \mathrm{vec}(T(0))=(\boldsymbol{v}_{1},\boldsymbol{v}_{2},\ldots,\boldsymbol{v}_{N})^{\top}.

We now discuss the computational complexity of our perturbation method. Because the inner product of N-dimensional vectors has a time complexity of O(N), the first while loop in Algorithm 1 has a complexity of O(N^{2}). The second while loop computes \mathrm{vec}(\Delta T). Because the scalar multiplication of an N^{2}-dimensional vector requires O(N^{2}) time, this entire while loop has a time complexity of O(N^{3}). Therefore, for a single perturbation experiment, one can carry out the entire algorithm in O(N^{3}) time to obtain the perturbed \{t_{ij}\}, and hence (b/c)^{*}. This is considerably smaller than the O(N^{4.75}) and O(N^{6}) of the Coppersmith-Winograd algorithm and the standard algorithm, respectively. The entire procedure to determine the single edge to be removed to maximally promote cooperation with the perturbation theory requires O(N^{3}|E|) time in general networks and O(N^{4}) time for sparse networks.

5 Data

We use the following four synthetic networks and seven empirical networks in our numerical analysis in section 6. We show the numbers of nodes and edges of each network in Table 1 and visualize the networks in Fig. 2. All the networks are connected and have no self-loops.

Refer to caption
Figure 2: Visualization of the networks used in the numerical analysis. (a) ER random graph. (b) BA model. (c) Planted 2-partition model. (d) LFR model. (e) Karate club. (f) Weaver. (g) Sparrow. (h) Lizard. (i) Dolphin. (j) Email. (k) Bird. All the networks are undirected. The linear size of the node is proportional to its degree. We have ignored the weight of the edge in this figure and our analysis.

We use a network generated by the Erdős–Rényi (ER) random graph model with N=100N=100 nodes. We connect 300300 pairs of nodes, selected uniformly at random out of the N(N1)/2=4950N(N-1)/2=4950 possible pairs. The average degree is k=6\langle k\rangle=6.

With the Barabási–Albert (BA) model, we sequentially add new nodes, each with m=3m=3 edges that connect to existing nodes according to the linear preferential attachment rule [22]. We start the growth process from the star graph with four nodes. In the limit of NN\to\infty, the degree distribution approximately obeys p(k)k3p(k)\propto k^{-3}, where p(k)p(k) is the probability that a node has degree kk, and \propto represents “proportional to”. We set N=100N=100 and m=3m=3, which yields 291291 edges, implying k=5.826\langle k\rangle=5.82\approx 6.

The planted \ell-partition model, also called the random partition (RP) graph, partitions the set of NN nodes into \ell groups, each of which has N/N/\ell nodes [23]. Any pair of nodes in the same group is adjacent with probability pinp_{\mathrm{in}}. Any pair of nodes belonging to different groups is adjacent with probability poutp_{\mathrm{out}}. If pin>poutp_{\mathrm{in}}>p_{\mathrm{out}}, the intra-cluster edge density exceeds the inter-cluster edge density such that the network has community structure. We set N=100N=100, =2\ell=2, pin=0.11p_{\mathrm{in}}=0.11, and pout=0.01p_{\mathrm{out}}=0.01 such that the mean degree k=pin(N/1)+poutN(1)/=5.89\langle k\rangle=p_{\mathrm{in}}(N/\ell-1)+p_{\mathrm{out}}N(\ell-1)/\ell=5.89 in theory. We use a network generated by this model having k=6.12\langle k\rangle=6.12.

The Lancichinetti–Fortunato–Radicchi (LFR) model generates networks with community structure [24]. The model generates a power-law degree distribution with power-law exponent γ\gamma, and a power-law distribution of the size of the community with power-law exponent κ\kappa. The model also requires the maximal degree kmaxk_{\mathrm{max}} and mean degree k\langle k\rangle as input. The mixing parameter μ¯(0,1)\overline{\mu}\in(0,1) specifies the fraction of edges that connect different communities. A small value of μ¯\overline{\mu} leads to strong community structure. We set N=100N=100, γ=3\gamma=3, κ=2\kappa=2, k=6\langle k\rangle=6, kmax=100k_{\mathrm{max}}=100, and μ¯=0.1\overline{\mu}=0.1. A network generated by this model that we use has k=6.08\langle k\rangle=6.08.
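For reference, the first three synthetic models can be generated with NetworkX, which the paper already uses for community detection; the generator names below follow the NetworkX API and the seeds are arbitrary. Note that nx.barabasi_albert_graph starts its growth from an m-node seed graph rather than the four-node star used here, and the LFR model is available as nx.LFR_benchmark_graph (not generated below, since its rejection-sampling construction can fail for some parameter sets):

```python
import networkx as nx

# ER: G(N, M) variant with N = 100 nodes and exactly 300 edges, so <k> = 6
er = nx.gnm_random_graph(100, 300, seed=1)

# BA: N = 100 nodes, m = 3 edges per new node, giving (100 - 3) * 3 = 291 edges
ba = nx.barabasi_albert_graph(100, 3, seed=1)

# Planted 2-partition: l = 2 groups of 50 nodes, p_in = 0.11, p_out = 0.01
rp = nx.planted_partition_graph(2, 50, 0.11, 0.01, seed=1)
```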

We consider the following seven empirical networks. The karate club network consists of 34 nodes and 78 edges [25]. Each node represents a member of a karate club in a university in the United States, who were observed between 1970 and 1972. The edges represent interaction outside the activities of the club.

The weaver network has 42 nodes and 151 edges [26]. Each node represents a sociable weaver (Philetairus socius) observed in Benfontein Game Farm, Kimberley, South Africa. The observation lasted for 10 months in total: September–December 2010 and 2011, and January–February 2013. Two nodes are adjacent to each other if the two weavers used the same nest chambers either for roosting or nest-building within a series of observations in the same year.

The sparrow network has 52 nodes and 516 edges [27]. A node represents a golden-crowned sparrow (Zonotrichia atricapilla) observed at the University of California, Santa Cruz Arboretum. The data was recorded between January and March 2010 [27]. Although the original network is weighted, we regard this network as an unweighted network.

The lizard network has 60 nodes and 318 edges [28]. Each node represents a lizard (Tiliqua rugosa) observed in a chenopod shrubland near Bundey Bore Station in South Australia. A data logger unit was attached to the dorsal surface of each lizard’s tail, which recorded synchronized GPS locations every 10 minutes. Two lizards are adjacent to each other if they were within 2 meters of each other in any GPS record.

The dolphin network has 62 nodes and 159 edges [29]. Each node represents a bottlenose dolphin (Tursiops). An edge represents a frequent association between two dolphins.

The email network has 167 nodes and 3251 edges [30]. Each node represents an employee of a mid-sized manufacturing company in Poland. An edge between two nodes (i.e., employees) indicates that there exists at least one email correspondence between the two individuals. We do not distinguish between senders and recipients and treat the network as undirected.

The bird network has 202 nodes and 11900 edges [31]. In the experiment, the researchers placed nest boxes in Wytham Woods, Oxford, UK, for six days to record individuals that landed on the entrance hole while prospecting for breeding territories. Each node represents a wild bird, which is either a great tit (Parus major), blue tit (Cyanistes caeruleus), marsh tit (Poecile palustris), coal tit (Periparus ater), or Eurasian nuthatch (Sitta europaea). An edge connects two birds whose nest-box exploration patterns overlapped on the same day.

6 Numerical results

6.1 Addition or removal of a single edge

We examine the accuracy at which our perturbation theory describes the change in (b/c)(b/c)^{*} when we add or remove an edge in the given unweighted network. Before the perturbation, wij=wji=1w_{ij}=w_{ji}=1 if there exists an edge between the iith and jjth nodes, and wij=wji=0w_{ij}=w_{ji}=0 otherwise. In the case of edge addition, we add an edge with weight ε\varepsilon between a pair of nodes (i0,j0)(i_{0},j_{0}) without an edge in the original network unless we state otherwise, where 0<ε10<\varepsilon\leq 1. Therefore, wi0j0(=wj0i0)w_{i_{0}j_{0}}(=w_{j_{0}i_{0}}) changes from 0 to ε\varepsilon, and all the other wij{0,1}w_{ij}\in\{0,1\} values remain unchanged. The addition of an unweighted edge corresponds to ε=1\varepsilon=1. In the case of edge removal, we reduce the weight of an edge (i0,j0)(i_{0},j_{0}) in the original network by ε-\varepsilon, where 1ε<0-1\leq\varepsilon<0. Therefore, wi0j0(=wj0i0)w_{i_{0}j_{0}}(=w_{j_{0}i_{0}}) changes from 11 to 1+ε1+\varepsilon, and all the other wijw_{ij} values remain unchanged. The complete removal of an unweighted edge corresponds to ε=1\varepsilon=-1.

We are interested in whether the linear approximation to (b/c)(ε)(b/c)^{*}(\varepsilon) given by Eq. (34), i.e., Δ(b/c)\Delta(b/c)^{*}, which we call the slope, predicts the change in (b/c)(b/c)^{*} in response to the addition of a single edge, i.e., (b/c)(1)(b/c)(0)(b/c)^{*}(1)-(b/c)^{*}(0), or the removal of a single edge, i.e., (b/c)(1)(b/c)(0)(b/c)^{*}(-1)-(b/c)^{*}(0). We start by directly computing the change in (b/c)(b/c)^{*}, i.e., (b/c)(ε)(b/c)(0)(b/c)^{*}(\varepsilon)-(b/c)^{*}(0), in response to adding a new edge of weight ε\varepsilon (>0)(>0) or reducing the weight of an existing edge by changing the edge weight to 1+ε1+\varepsilon (<1)(<1) for various values of ε\varepsilon for relatively small networks. The outcome of our perturbation theory, i.e., Δ(b/c)\Delta(b/c)^{*} is equal to limε0[(b/c)(ε)(b/c)(0)]/ε\lim_{\varepsilon\to 0}\left[(b/c)^{*}(\varepsilon)-(b/c)^{*}(0)\right]/\varepsilon, where (b/c)(0)(b/c)^{*}(0) and (b/c)(ε)(b/c)^{*}(\varepsilon) are the values obtained by the direct numerical simulations.
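The slope Δ(b/c)* is thus a one-sided derivative at ε = 0. As a generic sketch, with f standing in for the map ε ↦ (b/c)*(ε) (which in practice requires solving the full linear system and is not reproduced here), this limit can be approximated by a forward difference:

```python
def slope_at_zero(f, h=1e-6):
    """Forward-difference estimate of lim_{eps -> 0} [f(eps) - f(0)] / eps."""
    return (f(h) - f(0.0)) / h

# Check with a known function: f(eps) = (1 + eps)^3 has slope 3 at eps = 0.
assert abs(slope_at_zero(lambda e: (1.0 + e) ** 3) - 3.0) < 1e-4
```

The perturbation theory computes this slope analytically, without evaluating f at any ε > 0.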

We show the relationship between (b/c)(ε)(b/c)(0)(b/c)^{*}(\varepsilon)-(b/c)^{*}(0) and ε\varepsilon when we reduce the weight of a single edge in a BA network with N=100N=100 nodes in Fig. 3(a). Each line in the figure corresponds to an edge whose weight is gradually reduced. Note that ε=0\varepsilon=0 corresponds to the original network. Figure 3(a) indicates that (b/c)(b/c)^{*} roughly monotonically decreases as we gradually decrease the edge weight (i.e., decrease ε\varepsilon from 0 to negative values) except near ε=0\varepsilon=0. For this network, the removal of any single edge (i.e., ε=1\varepsilon=-1) leads to a decrease in (b/c)(b/c)^{*}, implying that the edge removal promotes cooperation. However, we note that a small decrease in the weight of an edge in the original network (e.g., ε=0.3\varepsilon=-0.3) increases (b/c)(b/c)^{*} for some edges, making cooperation more difficult than in the original network. Figure 3(a) implies that the perturbation theory is not accurate at describing the amount of the change in (b/c)(b/c)^{*} upon the edge removal because most of the curves shown in the figures, corresponding to the different edges in the original network, are far from being linear. However, we observe that the curves with the largest values of the slope of the curve at ε=0\varepsilon=0 tend to yield the smallest values of (b/c)(b/c)^{*} at ε=1\varepsilon=-1. Therefore, the perturbation theory, which produces the slope value, is expected to be efficient at detecting the edges whose removal yields the largest decrease in (b/c)(b/c)^{*}.

We show in Fig. 3(b) the change in (b/c)(b/c)^{*} plotted against ε\varepsilon when we add a new edge with weight ε\varepsilon. Each line corresponds to a pair of nodes between which there is initially no edge. Note that ε=1\varepsilon=1 corresponds to the addition of an unweighted edge. We find that the addition of any unweighted edge increases (b/c)(b/c)^{*}, making cooperation difficult. However, in contrast to the case of edge removal, the addition of an unweighted edge (i.e., with edge weight ε=1\varepsilon=1) does not necessarily yield the largest change in (b/c)(b/c)^{*} among edges of different weights ε(0,1]\varepsilon\in(0,1]. Specifically, for many node pairs that are initially not adjacent to each other, adding an edge with an intermediate edge weight (e.g., ε0.7\varepsilon\approx 0.7) maximizes the increase in (b/c)(b/c)^{*} (see Fig. 3(b)). Another observation is that the slope of the curve at ε=0\varepsilon=0, corresponding to the perturbation theory, is apparently less predictive of the effect of adding an unweighted edge (i.e., ε=1\varepsilon=1). Specifically, Fig. 3(b) indicates that, even if the slope at ε=0\varepsilon=0 is large, (b/c)(b/c)^{*} at ε=1\varepsilon=1 can be relatively small because (b/c)(b/c)^{*} decreases as ε\varepsilon increases when ε\varepsilon is close to 1. Furthermore, the curves with the largest slopes at ε=0\varepsilon=0 do not yield the largest changes in the (b/c)(b/c)^{*} value at ε=1\varepsilon=1, which implies that the perturbation theory is expected to be inefficient at predicting the edge addition that makes the cooperation most difficult.

We find similar results for the planted 22-partition model for the gradual removal of a single edge (see Fig. 3(c)). A notable difference from the case of the BA model is that there exists one edge whose complete removal increases (b/c)(b/c)^{*}, making cooperation difficult. The two nodes forming this edge have degrees 22 and 99, which are not outstanding. Furthermore, we have confirmed by running a deterministic approximate modularity maximization algorithm [32], using function greedy_modularity_communities in NetworkX, that these two nodes belong to the same community among the four communities detected. Therefore, this particular edge appears to be an ordinary edge.

We show in Fig. 3(d) the dependence of (b/c)(b/c)^{*} on ε\varepsilon when we gradually increase the weight of an edge that is initially absent in the planted 22-partition network. The slope of the (b/c)(b/c)^{*} curve at ε=0\varepsilon=0 is apparently not strongly related to the change in (b/c)(b/c)^{*} at ε=1\varepsilon=1.

We show the results of edge removal in the dolphin network in Fig. 3(e). There are two edges, out of the 150 edges, whose removal (i.e., ε=1\varepsilon=-1) increases (b/c)(b/c)^{*}, making cooperation difficult. These two edges are formed by two nodes with degrees 2 and 5 and two other nodes with degrees 2 and 7. These degree values are not outstanding in the entire network. The four nodes belong to the same community among the four communities detected by the same approximate modularity maximization algorithm [32]. These results suggest that the two edges are apparently ordinary ones. The removal of any other edge decreases (b/c)(b/c)^{*}, enhancing cooperation. Similar to the BA model, the curves with the largest slopes at ε=0\varepsilon=0 yield the largest decreases in (b/c)(b/c)^{*} at ε=1\varepsilon=-1.

We show in Fig. 3(f) the dependence of (b/c)(b/c)^{*} on ε\varepsilon when we gradually increase the weight of an edge that is initially absent in the dolphin network. The results are similar to those for the planted 22-partition model shown in Fig. 3(d). Many curves yield a decrease in (b/c)(b/c)^{*} at ε=1\varepsilon=1, implying that the edge addition can promote cooperation, whereas the converse is the case for many other curves. The slope of the (b/c)(b/c)^{*} curve at ε=0\varepsilon=0 is apparently not strongly related to the change in (b/c)(b/c)^{*} at ε=1\varepsilon=1.

Table 1: Pearson correlation coefficient, rr, between the shift in (b/c)(b/c)^{*} obtained by direct numerical simulations and that predicted by the perturbation theory. Recall that NN is the number of nodes and |E||E| is the number of edges. A large positive value of rr upon edge addition or enhancement implies that the perturbation theory is good at predicting the outcome of adding or enhancing an edge. A large negative value of rr upon edge removal implies that the perturbation theory is good at predicting the outcome of removing an edge.
Network   N     |E|    r, edge addition   r, edge removal   r, edge enhancement
ER        100   300    −0.55              −0.87              0.99
BA        100   291    −0.36              −0.86              0.99
RP        100   306    −0.39              −0.80              0.98
LFR       100   304     0.27              −0.84              0.98
Karate     34    78     0.35              −0.88              0.99
Weaver     42   152     0.94              −0.93              0.99
Sparrow    52   516    −0.01              −0.95              1.00
Lizard     60   318     0.72              −0.93              0.99
Dolphin    62   159     0.56              −0.72              0.96

The nonlinearity in the curves shown in Fig. 3 indicates that our perturbation theory is not accurate at predicting the amount of change in (b/c)(b/c)^{*} when we completely remove or add an edge in most cases. Therefore, we turn to ask whether the slope obtained from the perturbation theory is useful at determining the edge whose removal or addition changes (b/c)(b/c)^{*} by a large amount, representing strong promotion or suppression of cooperation in networks. We show in Fig. 4(a) the relationship between the change in (b/c)(b/c)^{*} when we remove an edge from the BA network and the slope Δ(b/c)\Delta(b/c)^{*} obtained from Eq. (34). The two quantities are strongly negatively correlated (Pearson correlation coefficient r=0.86r=-0.86, sample size n=291,p<0.01n=291,p<0.01). This result indicates that the perturbation theory, which is theoretically accurate only in the vicinity of ε=0\varepsilon=0, is good at predicting the outcome of removing an edge. We show in Fig. 4(b) the change in (b/c)(b/c)^{*} when we add a new edge to the same BA network as a function of the slope, Δ(b/c)\Delta(b/c)^{*}. The change in (b/c)(b/c)^{*} is not strongly positively correlated with Δ(b/c)\Delta(b/c)^{*}, suggesting that the perturbation theory is not good at predicting the outcome of adding an edge, whereas the correlation coefficient is significant due to a large sample size (r=0.36,n=4659,p<0.01r=-0.36,n=4659,p<0.01). Note that a large positive correlation coefficient when we add an edge would imply that the perturbation theory is good at predicting the outcome of adding an edge.
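The correlation coefficients reported here and in Table 1 are ordinary Pearson correlations between the simulated change in (b/c)* and the predicted slope. A minimal self-contained implementation (equivalent to scipy.stats.pearsonr without the p-value) is:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm = x - x.mean()
    ym = y - y.mean()
    return float((xm @ ym) / np.sqrt((xm @ xm) * (ym @ ym)))
```

A value near −1 for edge removal means that a large positive slope Δ(b/c)* reliably identifies the edges whose actual removal lowers (b/c)* the most.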

We show in Figs. 4(c) and 4(d) the results for the same correlation analysis for the planted 22-partition model network. When one removes an existing edge, the change in (b/c)(b/c)^{*} and slope Δ(b/c)\Delta(b/c)^{*} are strongly negatively correlated (r=0.80,n=306,p<0.01r=-0.80,n=306,p<0.01; see Fig. 4(c)), which is similar to the result for the BA model shown in Fig. 4(a), suggesting that the perturbation theory is good at predicting the outcome of removing an edge. When one adds a new edge, the change in (b/c)(b/c)^{*} and slope Δ(b/c)\Delta(b/c)^{*} are weakly correlated for this network (r=0.39,n=4644,p<0.01r=-0.39,n=4644,p<0.01; see Fig. 4(d)), which is similar to the result for the BA model shown in Fig. 4(b).

We show the corresponding results for the dolphin network in Figs. 4(e) and 4(f). The change in (b/c)(b/c)^{*} and slope Δ(b/c)\Delta(b/c)^{*} are strongly negatively correlated when one removes an edge (r=0.72,n=150,p<0.01r=-0.72,n=150,p<0.01; see Fig. 4(e)) and less strongly correlated when one adds a new edge (r=0.56,n=1732,p<0.01r=0.56,n=1732,p<0.01; see Fig. 4(f)). A strongly negative correlation for the edge removal (i.e., r=0.72r=-0.72) is similar to the result for the BA model. A positive correlation for the edge addition (i.e., r=0.56r=0.56) implies that the perturbation theory is to some extent good at predicting the outcome of adding an edge.

We show in Table 1 the same relationships for the other networks. For all synthetic and empirical networks, the slope Δ(b/c)\Delta(b/c)^{*} obtained from perturbation theory is strongly negatively correlated with the change in (b/c)(b/c)^{*} when we remove an existing edge (r0.72r\leq-0.72). Therefore, the perturbation theory is effective at predicting the outcome of removing an edge across different networks. However, the correlation is strongly positive only for a small fraction of networks (i.e., r0.5r\geq 0.5 for three out of the nine networks) when we add a new edge to the network.

Refer to caption
Figure 3: Change in (b/c)(b/c)^{*} as a function of the change in the edge weight, ε\varepsilon. (a) BA model, removal of an existing edge. (b) BA model, addition of a new edge. (c) Planted 2-partition model, removal of an existing edge. (d) Planted 2-partition model, addition of a new edge. (e) Dolphin network, removal of an existing edge. (f) Dolphin network, addition of a new edge. In (a), (c), and (e), each line represents an edge in the original network. In (b), (d), and (f), each line represents a pair of nodes that is not adjacent in the original network. The line colors are only a guide to the eye.
Refer to caption
Figure 4: Change in (b/c)(b/c)^{*} when we remove or add an unweighted edge as a function of the slope Δ(b/c)\Delta(b/c)^{*} of the curves shown in Fig. 3 at ε=0\varepsilon=0. (a) BA model, removal of an existing edge. (b) BA model, addition of a new edge. (c) Planted 2-partition model, removal of an existing edge. (d) Planted 2-partition model, addition of a new edge. (e) Dolphin network, removal of an existing edge. (f) Dolphin network, addition of a new edge. Each circle in (a), (c), and (e) represents an edge in the original network. Each circle in (b), (d), and (f) represents a pair of nodes that is not adjacent to each other in the original network.

6.2 Enhancement of the weight of an existing edge

In this section, we allow weighted networks and consider an increase or decrease in the weight of an existing edge of the network. Because we effectively analyzed the case of the decrease in the edge weight in section 6.1 (i.e., by setting 1<ε<0-1<\varepsilon<0), here we only consider enhancement of the weight of an existing edge by ε\varepsilon.

We enhanced the weight of an existing edge by 0<ε10<\varepsilon\leq 1, making the edge weight 1+ε1+\varepsilon, and numerically examined (b/c)(b/c)^{*} in the altered weighted networks. We plot in Figs. 5(a), 5(c), and 5(e) the change in (b/c)(b/c)^{*} relative to the original network against ε\varepsilon for the three networks used in Figs. 3 and 4. For the BA model, increasing the weight of 74 out of the 291 existing edges from 11 to 22 (i.e., ε=1\varepsilon=1) led to an increase in (b/c)(b/c)^{*}, making cooperation more difficult, whereas the opposite is the case when one enhances the weight of any other edge (see Fig. 5(a)). This result contrasts with the case of adding a new edge to the same network, which always increases (b/c)(b/c)^{*} (see Fig. 3(b)). In the planted 2-partition model (Fig. 5(c)) and the dolphin network (Fig. 5(e)), enhancing the edge weight of 7 out of the 306 edges and 23 out of the 159 edges, respectively, led to an increase in (b/c)(b/c)^{*}. Therefore, in a majority of cases, cooperation becomes easier by enhancing the weight of a single edge, which contrasts with the results for adding a new edge to these networks (see Figs. 3(d) and 3(f)). These results altogether suggest that adding new edges and enhancing the weight of existing edges often lead to different results.

Figures 5(a), 5(c), and 5(e) also indicate that the change in (b/c)(b/c)^{*} is close to linear as a function of ε\varepsilon. Therefore, our perturbation theory should be accurate at estimating the change in (b/c)(b/c)^{*} with ε=1\varepsilon=1. To verify this prediction, we show in Figs. 5(b), 5(d), and 5(f) the relationship between the change in (b/c)(b/c)^{*} in response to changing the weight of a single edge from 11 to 22 and Δ(b/c)\Delta(b/c)^{*} obtained by the perturbation theory for the three networks. As expected, the accuracy of the perturbation theory is high. We have confirmed that a high accuracy also holds true for other networks (see the last column of Table 1). These high-accuracy results are in stark contrast to those in the case of adding a new edge, for which the accuracy of the perturbation theory is low.

Refer to caption
Figure 5: Change in (b/c)(b/c)^{*} when one enhances the weight of an existing edge. Panels (a), (c), (e): Change in (b/c)(b/c)^{*} as a function of the increase in the weight of an existing edge, ε\varepsilon. Each line represents an edge in the original network. Panels (b), (d), (f): Change in (b/c)(b/c)^{*} when we enhance an edge weight by ε=1\varepsilon=1, plotted against the slope Δ(b/c)\Delta(b/c)^{*} of the curves shown in panels (a), (c), and (e) at ε=0\varepsilon=0. Each circle represents an edge in the original network. (a) and (b): BA model. (c) and (d): Planted 2-partition model. (e) and (f): Dolphin network.

6.3 Sequential edge removal

The nonlinearity in the curves shown in Fig. 3, and the results shown in Fig. 4 and Table 1 indicate that our perturbation theory is not accurate at estimating the amount of change in (b/c)(b/c)^{*} upon an edge removal. Therefore, we turn to investigate whether our perturbation theory is good at finding edges to be sequentially removed to decrease (b/c)(b/c)^{*} by a large amount in larger networks. Denote by G0G_{0} an original network. We remove the edge with the largest Δ(b/c)\Delta(b/c)^{*}, resulting in network G1G_{1}. Then, we calculate Δ(b/c)\Delta(b/c)^{*} for each existing edge in G1G_{1} and remove the edge with the largest Δ(b/c)\Delta(b/c)^{*}, resulting in network G2G_{2}. We repeat this procedure another three times to eventually obtain network G5G_{5}, which has five fewer edges than G0G_{0}.

A simple rule of thumb to determine edges to be removed to enhance cooperation is to use the degree of nodes composing the edge. In particular, (b/c)(b/c)^{*} for the death-birth rule is small for random regular graphs with small degrees [13] and general networks with a small mean degree [15]. Therefore, we test the performance of our perturbation theory against a degree-based heuristic to remove an edge for enhancing cooperation, which we define as follows. Denote by (i,j)(i,j) the edge to be removed and by kik_{i} and kjk_{j} the degree of the iith and jjth nodes, respectively. Note that ki==1Nwi(==1Nwi)k_{i}=\sum_{\ell=1}^{N}w_{i\ell}(=\sum_{\ell=1}^{N}w_{\ell i}) for our networks, which are unweighted. For each network, we remove the edge whose ki+kjk_{i}+k_{j} is largest. After removing an edge according to this criterion, we select the edge with the largest ki+kjk_{i}+k_{j} in the reduced network and remove it. We repeat this procedure another three times to remove five edges in total. In our numerical experiments described below, we have verified that the selected edges are always the same if the score for the edge is defined by kikjk_{i}k_{j} instead of ki+kjk_{i}+k_{j}.
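Both sequential procedures share the same greedy loop and differ only in the score assigned to each edge. A sketch with NetworkX, instantiated with the degree-sum score (the perturbative variant would instead score each edge by Δ(b/c)*, whose computation is not reproduced here); the karate club network below is just an example input:

```python
import networkx as nx

def greedy_edge_removal(G, score, n_remove=5):
    """Repeatedly remove the edge maximizing score(H, i, j), recomputing
    scores on the reduced network after each removal."""
    H = G.copy()
    removed = []
    for _ in range(n_remove):
        i, j = max(H.edges(), key=lambda e: score(H, *e))
        H.remove_edge(i, j)
        removed.append((i, j))
    return H, removed

def degree_sum(H, i, j):
    """Degree-based heuristic score k_i + k_j."""
    return H.degree(i) + H.degree(j)

G = nx.karate_club_graph()
H, removed = greedy_edge_removal(G, degree_sum)
```

Note that max breaks ties arbitrarily, whereas the experiments in the text enumerate all tied choices separately.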

We carry out sequential edge removal experiments on three synthetic networks and three empirical networks. Note that the six networks are mostly larger than those used in the previous numerical simulations. For these networks, it is computationally difficult to exactly calculate (b/c)(b/c)^{*} for all possible networks with, for example, one edge being removed from the original network.

We show the change in (b/c)(b/c)^{*} relative to the original network as we sequentially remove five edges using our perturbation theory by the red lines in Fig. 6. As expected, (b/c)(b/c)^{*} decreases, corresponding to negative Δ(b/c)\Delta(b/c)^{*} values, as we remove edges one by one. We also show the result of the sequential edge removal based on the degree sum ki+kjk_{i}+k_{j} by the blue lines in the same figure. For all networks, there are multiple edges that have the same value of ki+kjk_{i}+k_{j} at least in one of the five steps to remove a single edge. In this case, we calculated Δ(b/c)\Delta(b/c)^{*} for all the possible scenarios of removing one of the edges that maximize ki+kjk_{i}+k_{j} in each step of edge removal. This is why we have obtained multiple blue lines in the figure. In all cases, (b/c)(b/c)^{*} decreases as we sequentially remove edges with the largest ki+kjk_{i}+k_{j} value. Figure 6 indicates that the edge removal based on our perturbation theory results in a larger decrease in (b/c)(b/c)^{*} than that based on ki+kjk_{i}+k_{j} for all the networks. To be quantitative, we measured the decrease in (b/c)(b/c)^{*} after the removal of five edges compared to the original network with the perturbation theory and with the degree sum. The former was larger than the average of the latter (i.e., average of the blue lines in Fig. 6) by a factor of 1.021.02, 1.011.01, 1.021.02, 1.051.05, 1.021.02, and 1.021.02 for the ER random graph (Fig. 6(a)), BA model (Fig. 6(b)), planted 2-partition network (Fig. 6(c)), lizard network (Fig. 6(d)), email network (Fig. 6(e)), and bird network (Fig. 6(f)), respectively.

Refer to caption
Figure 6: Changes in (b/c)(b/c)^{*} upon sequential removal of five edges. (a) ER random graph with 300 nodes and 900 edges. (b) BA model network with 300 nodes and 891 edges. (c) Planted 2-partition network with 300 nodes and 939 edges. (d) Lizard network with 60 nodes and 318 edges. (e) Email network with 167 nodes and 3251 edges. (f) Bird network with 202 nodes and 11900 edges. The red lines represent the edge removal according to the perturbation theory. The blue lines represent the edge removal according to the rank of the degree sum.

7 Conclusions

To determine (b/c)(b/c)^{*} for an arbitrary network, one needs to solve a system of N2N^{2} linear equations such that the time complexity is O(N6)O(N^{6}). With the Coppersmith-Winograd algorithm, the time complexity is reduced to O(N4.75)O(N^{4.75}), but this is still large (see section 4). In particular, it is computationally costly to carry out graph surgery with various possible edges to be added or removed to compare the results in terms of (b/c)(b/c)^{*}. Therefore, we have developed a perturbation theory for graph surgery with which we can evaluate the perturbed (b/c)(b/c)^{*} in O(N3)O(N^{3}) time. We have verified that the first-order term Δ(b/c)\Delta(b/c)^{*} obtained from our perturbation theory predicts the rank of the change in (b/c)(b/c)^{*} when one removes an edge from the network with a high accuracy. Specifically, we have numerically shown that the edge with the largest Δ(b/c)\Delta(b/c)^{*} value is the one whose actual removal decreases (b/c)(b/c)^{*} by the largest amount in two out of the three networks (see Figs. 4(a), 4(c), and 4(e)). Therefore, we conclude that our perturbation theory is useful for finding the edge whose removal efficiently enhances cooperation in the given network with a reduced computational cost.

We focused on the death-birth process because it tends to foster cooperation compared to other rules of strategy updating [13, 11]. However, it is straightforward to formulate similar perturbation methods in the case of other updating rules such as the birth-death process [18, 13] and the pairwise comparison rule [33, 34, 16, 35] as well as in the case of other payoff matrices. In particular, our theory should be applicable to the case of constant selection [18, 36], with which the payoff matrix is independent of the opponent’s action. The perturbation theory may be more accurate for other update rules or games than the combination of the death-birth rule and the prisoner’s dilemma game examined in the present study. Exploitation of our perturbation approach in these directions is left for future work.

Another direction of future work is interaction between the selection strength and network perturbation. In the present work, we have assumed the weak selection limit. However, one can retain a selection strength parameter (which is η\eta in this article) to be finite and write down a formal solution. Then, it may be interesting to consider the simultaneous limit of weak selection η0\eta\to 0 and weak network perturbation ε0\varepsilon\to 0 in such a way that η\eta and ε\varepsilon are interrelated. Apart from this research direction, assessing the validity of the present perturbation theory under strong selection is left for future work. To this end, we first need to understand the accuracy of the original theory of fixation of cooperation in networks [15], which our theory is based on, under strong selection.

We do not know why the perturbation theory is more accurate when one removes an edge than when one adds an edge. Furthermore, we have found that the perturbation theory is fairly accurate at predicting the result for adding a parallel edge where an edge already exists, whereas it is not accurate when adding an edge where an edge does not exist in the original network. In a related vein, we observed nonmonotonic behavior in the cooperativity in terms of (b/c)(b/c)^{*} especially when we gradually added a weighted edge (Figs. 3(b) and 3(f)). These results lead us to hypothesize that we can engineer networks that promote cooperation better by considering weighted networks than unweighted networks. These topics also warrant future work.

Data accessibility

The empirical network data sets are open resources and available at [37, 38, 39, 40, 41, 42, 43]. The Python codes and synthetic network data sets used in the present study are available on GitHub [44].

Acknowledgments

N.M. acknowledges support from AFOSR European Office (under Grant No. FA9550-19-1-7024), the Sumitomo Foundation, the Japan Science and Technology Agency (JST) Moonshot R&D (under Grant No. JPMJMS2021), and the National Science Foundation (under Grant Nos. 2052720 and 2204936).

Appendix A Derivation of Eq. (16)

We restate the stationary probability πi(0)\pi_{i}(0), given by Eq. (1), as

πi(0)=si(0)S(0),\displaystyle\pi_{i}(0)=\frac{s_{i}(0)}{S(0)}, (36)

where si(0)s_{i}(0) is the original weighted degree of the iith node, and S(0)S(0), given by Eq. (14), is the original total edge weight of the network. If one perturbs the weight of an edge (i0,j0)(i_{0},j_{0}) by ε\varepsilon, the weighted degree of the iith node, si(ε)s_{i}(\varepsilon), is equal to si(0)+δii0ε+δij0εs_{i}(0)+\delta_{ii_{0}}\varepsilon+\delta_{ij_{0}}\varepsilon. After the perturbation, the total edge weight of the network changes to S(ε)=S(0)+2εS(\varepsilon)=S(0)+2\varepsilon. Therefore, the stationary probability after the perturbation, πi(ε)\pi_{i}(\varepsilon), is given by

πi(ε)=si(0)+δii0ε+δij0εS(0)+2ε.\displaystyle\pi_{i}(\varepsilon)=\frac{s_{i}(0)+\delta_{ii_{0}}\varepsilon+\delta_{ij_{0}}\varepsilon}{S(0)+2\varepsilon}. (37)

The first-order derivative of πi(ε)\pi_{i}(\varepsilon) with respect to ε\varepsilon, denoted by Δπi\Delta\pi_{i}, is given by

Δπi(ε)=(δii0+δij0)[S(0)+2ε][si(0)+δii0ε+δij0ε]×2[S(0)+2ε]2.\displaystyle\Delta\pi_{i}(\varepsilon)=\frac{(\delta_{ii_{0}}+\delta_{ij_{0}})\left[S(0)+2\varepsilon\right]-\left[s_{i}(0)+\delta_{ii_{0}}\varepsilon+\delta_{ij_{0}}\varepsilon\right]\times 2}{\left[S(0)+2\varepsilon\right]^{2}}. (38)

By setting ε=0\varepsilon=0 in Eq. (38), we obtain

Δπi(0)\displaystyle\Delta\pi_{i}(0) =S(0)(δii0+δij0)2si(0)S(0)2\displaystyle=\frac{S(0)(\delta_{ii_{0}}+\delta_{ij_{0}})-2s_{i}(0)}{S(0)^{2}}
=δii0+δij0S(0)2πi(0)S(0).\displaystyle=\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{S(0)}-\frac{2\pi_{i}(0)}{S(0)}. (39)

We wrote SS instead of S(0)S(0) in Eq. (16) for brevity.
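As a sanity check on Eq. (39), one can compare the first-order prediction with a finite-difference estimate on a small network. The following sketch is illustrative only: the four-node weight matrix and the perturbed edge are arbitrary hypothetical choices (nodes are 0-indexed in the code), and S is computed as the sum of the weighted degrees, consistent with S(ε) = S(0) + 2ε.

```python
import numpy as np

# Hypothetical four-node weighted network (symmetric weight matrix W).
W = np.array([[0.0, 1.0, 2.0, 0.0],
              [1.0, 0.0, 1.5, 0.5],
              [2.0, 1.5, 0.0, 1.0],
              [0.0, 0.5, 1.0, 0.0]])
N = W.shape[0]
i0, j0 = 1, 2        # perturbed edge (nodes are 0-indexed here)
eps = 1e-6           # small perturbation of the edge weight

s = W.sum(axis=1)    # weighted degrees s_i(0)
S = s.sum()          # S(0); note S(eps) = S(0) + 2 * eps
pi0 = s / S          # stationary probabilities pi_i(0), Eq. (36)

# Stationary probabilities after the perturbation, Eq. (37)
W_eps = W.copy()
W_eps[i0, j0] += eps
W_eps[j0, i0] += eps
s_eps = W_eps.sum(axis=1)
pi_eps = s_eps / s_eps.sum()

# First-order prediction, Eq. (39)
delta = np.zeros(N)
delta[[i0, j0]] = 1.0             # delta_{i i0} + delta_{i j0}
dpi = delta / S - 2.0 * pi0 / S

fd = (pi_eps - pi0) / eps         # finite-difference estimate of the derivative
print(np.max(np.abs(fd - dpi)))   # discrepancy is O(eps)
```

Because the πi(ε) sum to one for every ε, the entries of the derivative vector must sum to zero, which the formula in Eq. (39) reproduces.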

Appendix B Derivation of Eqs. (21), (22), and (23)

B.1 Derivation of Eq. (21)

Let pij(ε)p_{ij}(\varepsilon) be the transition probability of the random walk from node ii to node jj on the network after perturbing the weight of edge (i0,j0)(i_{0},j_{0}) by ε\varepsilon. Note that pij(0)p_{ij}(0) is the transition probability on the original network. By definition, we have

pij(ε)=wij(ε)si(ε),\displaystyle p_{ij}(\varepsilon)=\frac{w_{ij}(\varepsilon)}{s_{i}(\varepsilon)}, (40)

where wij(ε)w_{ij}(\varepsilon) represents the weight of edge (i,j)(i,j) after the same perturbation, and we recall that si(ε)s_{i}(\varepsilon) is the weighted degree of the iith node after the perturbation. In Appendix A, we showed that

si(ε)=si(0)+δii0ε+δij0ε.\displaystyle s_{i}(\varepsilon)=s_{i}(0)+\delta_{ii_{0}}\varepsilon+\delta_{ij_{0}}\varepsilon. (41)

We can also verify that

wij(ε)=wij(0)+εχi0j0(i,j),\displaystyle w_{ij}(\varepsilon)=w_{ij}(0)+\varepsilon\chi_{i_{0}j_{0}}(i,j), (42)

where χi0j0(i,j)\chi_{i_{0}j_{0}}(i,j), defined by Eq. (17), is the indicator function, which is equal to 11 if and only if edge (i,j)(i,j) is the perturbed edge (i0,j0)(i_{0},j_{0}); it is equal to 0 otherwise. By substituting Eqs. (41) and (42) in Eq. (40), we obtain

pij(ε)=wij(0)+εχi0j0(i,j)si(0)+δii0ε+δij0ε.\displaystyle p_{ij}(\varepsilon)=\frac{w_{ij}(0)+\varepsilon\chi_{i_{0}j_{0}}(i,j)}{s_{i}(0)+\delta_{ii_{0}}\varepsilon+\delta_{ij_{0}}\varepsilon}. (43)

By taking the first-order derivative of Eq. (43) with respect to ε\varepsilon, we obtain

pij(ε)=χi0j0(i,j)[si(0)+δii0ε+δij0ε][wij(0)+εχi0j0(i,j)](δii0+δij0)[si(0)+δii0ε+δij0ε]2,\displaystyle p_{ij}^{\prime}(\varepsilon)=\frac{\chi_{i_{0}j_{0}}(i,j)\left[s_{i}(0)+\delta_{ii_{0}}\varepsilon+\delta_{ij_{0}}\varepsilon\right]-\left[w_{ij}(0)+\varepsilon\chi_{i_{0}j_{0}}(i,j)\right](\delta_{ii_{0}}+\delta_{ij_{0}})}{\left[s_{i}(0)+\delta_{ii_{0}}\varepsilon+\delta_{ij_{0}}\varepsilon\right]^{2}}, (44)

which leads to

pij(0)\displaystyle p_{ij}^{\prime}(0) =χi0j0(i,j)si(0)wij(0)(δii0+δij0)[si(0)]2\displaystyle=\frac{\chi_{i_{0}j_{0}}(i,j)s_{i}(0)-w_{ij}(0)(\delta_{ii_{0}}+\delta_{ij_{0}})}{\left[s_{i}(0)\right]^{2}}
=χi0j0(i,j)sipij(0)δii0+δij0si.\displaystyle=\frac{\chi_{i_{0}j_{0}}(i,j)}{s_{i}}-p_{ij}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}. (45)

We omitted the argument of si(0)s_{i}(0) in the last line of Eq. (45) because there is no ambiguity. This completes the proof of Eq. (21). Note that pij(0)p_{ij}^{\prime}(0) is equivalent to θij(1)\theta^{(1)}_{ij} in Eq. (21). We chose a different notation Θ(1)\Theta^{(1)} in the main text to avoid confusion between the power of matrix PP and the derivative of PP.
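Equation (45) admits the same kind of numerical check: build Θ(1) entrywise from the formula and compare it with a finite-difference derivative of the transition matrix. The sketch below uses the same hypothetical four-node toy network (0-indexed nodes) purely for illustration.

```python
import numpy as np

# Hypothetical four-node weighted network; nodes are 0-indexed.
W = np.array([[0.0, 1.0, 2.0, 0.0],
              [1.0, 0.0, 1.5, 0.5],
              [2.0, 1.5, 0.0, 1.0],
              [0.0, 0.5, 1.0, 0.0]])
N = W.shape[0]
i0, j0 = 1, 2
eps = 1e-6

s = W.sum(axis=1)                  # weighted degrees s_i
P0 = W / s[:, None]                # transition matrix P(0), Eq. (40)

# Indicator chi_{i0 j0}(i, j) and delta_{i i0} + delta_{i j0}
chi = np.zeros((N, N))
chi[i0, j0] = chi[j0, i0] = 1.0
delta = np.zeros(N)
delta[[i0, j0]] = 1.0

# Theta^(1) entrywise from Eq. (45)
Theta1 = chi / s[:, None] - P0 * (delta / s)[:, None]

# Finite-difference estimate of P'(0)
W_eps = W.copy()
W_eps[i0, j0] += eps
W_eps[j0, i0] += eps
P_eps = W_eps / W_eps.sum(axis=1, keepdims=True)
fd = (P_eps - P0) / eps
print(np.max(np.abs(fd - Theta1)))  # discrepancy is O(eps)
```

Because each row of P(ε) sums to one for every ε, each row of Θ(1) must sum to zero, which the entrywise formula reproduces.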

B.2 Derivation of Eq. (22)

We have shown that

P(ε)=P(0)+εΘ(1)+o(ε),\displaystyle P(\varepsilon)=P(0)+\varepsilon\Theta^{(1)}+o(\varepsilon), (46)

where Θ(1)=(θij(1))=(pij(0))\Theta^{(1)}=(\theta^{(1)}_{ij})=(p_{ij}^{\prime}(0)). By squaring Eq. (46), we obtain

P2(ε)=P2(0)+ε[Θ(1)P(0)+P(0)Θ(1)]+o(ε).\displaystyle P^{2}(\varepsilon)=P^{2}(0)+\varepsilon\left[\Theta^{(1)}P(0)+P(0)\Theta^{(1)}\right]+o(\varepsilon). (47)

Let us evaluate matrix Θ(1)P(0)\Theta^{(1)}P(0) first. We denote by aija_{ij} the (i,j)(i,j) entry of Θ(1)P(0)\Theta^{(1)}P(0). By substituting Eq. (21) in

aij=k=1Nθik(1)pkj(0),\displaystyle a_{ij}=\sum_{k=1}^{N}\theta^{(1)}_{ik}p_{kj}(0), (48)

we obtain

aij\displaystyle a_{ij} =k=1N[χi0j0(i,k)sipik(0)δii0+δij0si]pkj(0)\displaystyle=\sum_{k=1}^{N}\left[\frac{\chi_{i_{0}j_{0}}(i,k)}{s_{i}}-p_{ik}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}\right]p_{kj}(0)
=k=1Nχi0j0(i,k)pkj(0)sik=1Npik(0)pkj(0)δii0+δij0si\displaystyle=\sum_{k=1}^{N}\frac{\chi_{i_{0}j_{0}}(i,k)p_{kj}(0)}{s_{i}}-\sum_{k=1}^{N}p_{ik}(0)p_{kj}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}
=k=1Nχi0j0(i,k)pkj(0)sipij(2)(0)δii0+δij0si,\displaystyle=\sum_{k=1}^{N}\frac{\chi_{i_{0}j_{0}}(i,k)p_{kj}(0)}{s_{i}}-p_{ij}^{(2)}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}, (49)

where pij(2)(0)p_{ij}^{(2)}(0) is the transition probability of the random walk from node ii to node jj in two time steps. Note that χi0j0(i,k)=1\chi_{i_{0}j_{0}}(i,k)=1 if and only if (i,k)=(i0,j0)(i,k)=(i_{0},j_{0}) or (i,k)=(j0,i0)(i,k)=(j_{0},i_{0}). If i=i0i=i_{0}, then k=j0k=j_{0} must hold true for the term χi0j0(i,k)pkj(0)\chi_{i_{0}j_{0}}(i,k)p_{kj}(0) not to vanish. Similarly, if i=j0i=j_{0}, then k=i0k=i_{0} must hold true for χi0j0(i,k)pkj(0)\chi_{i_{0}j_{0}}(i,k)p_{kj}(0) not to vanish. If ii0,j0i\neq i_{0},j_{0}, then we obtain χi0j0(i,k)pkj(0)=0\chi_{i_{0}j_{0}}(i,k)p_{kj}(0)=0 k\forall k. Therefore, we can simplify Eq. (49) into

aij\displaystyle a_{ij} =δii0pj0j(0)si+δij0pi0j(0)sipij(2)(0)δii0+δij0si.\displaystyle=\frac{\delta_{ii_{0}}p_{j_{0}j}(0)}{s_{i}}+\frac{\delta_{ij_{0}}p_{i_{0}j}(0)}{s_{i}}-p_{ij}^{(2)}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}. (50)

Similarly, by substituting Eq. (21) in

bijk=1Npik(0)θkj(1),\displaystyle b_{ij}\equiv\sum_{k=1}^{N}p_{ik}(0)\theta^{(1)}_{kj}, (51)

we obtain

bij\displaystyle b_{ij} =k=1Npik(0)[χi0j0(k,j)skpkj(0)δki0+δkj0sk]\displaystyle=\sum_{k=1}^{N}p_{ik}(0)\left[\frac{\chi_{i_{0}j_{0}}(k,j)}{s_{k}}-p_{kj}(0)\frac{\delta_{ki_{0}}+\delta_{kj_{0}}}{s_{k}}\right]
=k=1Nχi0j0(k,j)pik(0)skk=1Npik(0)pkj(0)δki0+δkj0sk\displaystyle=\sum_{k=1}^{N}\frac{\chi_{i_{0}j_{0}}(k,j)p_{ik}(0)}{s_{k}}-\sum_{k=1}^{N}p_{ik}(0)p_{kj}(0)\frac{\delta_{ki_{0}}+\delta_{kj_{0}}}{s_{k}}
=δjj0pii0(0)si0+δji0pij0(0)sj0pii0(0)pi0j(0)1si0pij0(0)pj0j(0)1sj0.\displaystyle=\frac{\delta_{jj_{0}}p_{ii_{0}}(0)}{s_{i_{0}}}+\frac{\delta_{ji_{0}}p_{ij_{0}}(0)}{s_{j_{0}}}-p_{ii_{0}}(0)p_{i_{0}j}(0)\frac{1}{s_{i_{0}}}-p_{ij_{0}}(0)p_{j_{0}j}(0)\frac{1}{s_{j_{0}}}. (52)

Because θij(2)=aij+bij\theta^{(2)}_{ij}=a_{ij}+b_{ij}, we have proved Eq. (22).
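The closed forms in Eqs. (50) and (52) can likewise be verified numerically: the matrix with entries a_ij should equal Θ(1)P(0), the matrix with entries b_ij should equal P(0)Θ(1), and their sum should match a finite-difference derivative of P²(ε). A sketch under the same hypothetical toy-network assumptions (0-indexed nodes):

```python
import numpy as np

# Hypothetical four-node weighted network; nodes are 0-indexed.
W = np.array([[0.0, 1.0, 2.0, 0.0],
              [1.0, 0.0, 1.5, 0.5],
              [2.0, 1.5, 0.0, 1.0],
              [0.0, 0.5, 1.0, 0.0]])
N = W.shape[0]
i0, j0 = 1, 2
eps = 1e-6

s = W.sum(axis=1)
P0 = W / s[:, None]
P2 = P0 @ P0                       # two-step transition probabilities p_ij^(2)(0)

chi = np.zeros((N, N))
chi[i0, j0] = chi[j0, i0] = 1.0
delta = np.zeros(N)
delta[[i0, j0]] = 1.0
Theta1 = chi / s[:, None] - P0 * (delta / s)[:, None]   # Eq. (45)

# a_ij from Eq. (50)
a = np.zeros((N, N))
a[i0] += P0[j0] / s[i0]
a[j0] += P0[i0] / s[j0]
a -= P2 * (delta / s)[:, None]

# b_ij from Eq. (52)
b = np.zeros((N, N))
b[:, j0] += P0[:, i0] / s[i0]
b[:, i0] += P0[:, j0] / s[j0]
b -= np.outer(P0[:, i0], P0[i0]) / s[i0]
b -= np.outer(P0[:, j0], P0[j0]) / s[j0]

Theta2 = a + b                     # Eq. (22)

# Finite-difference derivative of P^2(eps)
W_eps = W.copy()
W_eps[i0, j0] += eps
W_eps[j0, i0] += eps
P_eps = W_eps / W_eps.sum(axis=1, keepdims=True)
fd2 = (P_eps @ P_eps - P2) / eps
print(np.max(np.abs(fd2 - Theta2)))  # discrepancy is O(eps)
```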

B.3 Derivation of Eq. (23)

By matrix multiplication, we obtain

Θ(3)=Θ(1)P2(0)+P(0)Θ(1)P(0)+P2(0)Θ(1).\displaystyle\Theta^{(3)}=\Theta^{(1)}P^{2}(0)+P(0)\Theta^{(1)}P(0)+P^{2}(0)\Theta^{(1)}. (53)

Let cijc_{ij}, dijd_{ij}, and eije_{ij} be the (i,j)(i,j) entry of matrices Θ(1)P2(0)\Theta^{(1)}P^{2}(0), P(0)Θ(1)P(0)P(0)\Theta^{(1)}P(0), and P2(0)Θ(1)P^{2}(0)\Theta^{(1)}, respectively.

Using Eq. (21), we obtain

cij\displaystyle c_{ij} =k=1Nθik(1)pkj(2)(0)\displaystyle=\sum_{k=1}^{N}\theta^{(1)}_{ik}p^{(2)}_{kj}(0)
=k=1N[χi0j0(i,k)sipik(0)δii0+δij0si]pkj(2)(0)\displaystyle=\sum_{k=1}^{N}\left[\frac{\chi_{i_{0}j_{0}}(i,k)}{s_{i}}-p_{ik}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}\right]p^{(2)}_{kj}(0)
=k=1Nχi0j0(i,k)pkj(2)(0)sik=1Npik(0)pkj(2)(0)δii0+δij0si\displaystyle=\sum_{k=1}^{N}\frac{\chi_{i_{0}j_{0}}(i,k)p^{(2)}_{kj}(0)}{s_{i}}-\sum_{k=1}^{N}p_{ik}(0)p^{(2)}_{kj}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}
=δii0si0pj0j(2)(0)+δij0sj0pi0j(2)(0)pij(3)(0)δii0+δij0si.\displaystyle=\frac{\delta_{ii_{0}}}{s_{i_{0}}}p^{(2)}_{j_{0}j}(0)+\frac{\delta_{ij_{0}}}{s_{j_{0}}}p^{(2)}_{i_{0}j}(0)-p^{(3)}_{ij}(0)\frac{\delta_{ii_{0}}+\delta_{ij_{0}}}{s_{i}}. (54)

Similarly, we obtain

dij\displaystyle d_{ij} =k=1Nbikpkj(0)\displaystyle=\sum_{k=1}^{N}b_{ik}p_{kj}(0)
=k=1N[δkj0pii0(0)si0+δki0pij0(0)sj0pii0(0)pi0k(0)1si0pij0(0)pj0k(0)1sj0]pkj(0)\displaystyle=\sum_{k=1}^{N}\left[\frac{\delta_{kj_{0}}p_{ii_{0}}(0)}{s_{i_{0}}}+\frac{\delta_{ki_{0}}p_{ij_{0}}(0)}{s_{j_{0}}}-p_{ii_{0}}(0)p_{i_{0}k}(0)\frac{1}{s_{i_{0}}}-p_{ij_{0}}(0)p_{j_{0}k}(0)\frac{1}{s_{j_{0}}}\right]p_{kj}(0)
=pii0(0)pj0j(0)si0+pij0(0)pi0j(0)sj0pii0(0)pi0j(2)(0)si0pij0(0)pj0j(2)(0)sj0\displaystyle=\frac{p_{ii_{0}}(0)p_{j_{0}j}(0)}{s_{i_{0}}}+\frac{p_{ij_{0}}(0)p_{i_{0}j}(0)}{s_{j_{0}}}-\frac{p_{ii_{0}}(0)p_{i_{0}j}^{(2)}(0)}{s_{i_{0}}}-\frac{p_{ij_{0}}(0)p_{j_{0}j}^{(2)}(0)}{s_{j_{0}}} (55)

and

eij\displaystyle e_{ij} =k=1Npik(2)(0)θkj(1)\displaystyle=\sum_{k=1}^{N}p_{ik}^{(2)}(0)\theta_{kj}^{(1)}
=k=1Npik(2)(0)[χi0j0(k,j)skpkj(0)δki0+δkj0sk]\displaystyle=\sum_{k=1}^{N}p_{ik}^{(2)}(0)\left[\frac{\chi_{i_{0}j_{0}}(k,j)}{s_{k}}-p_{kj}(0)\frac{\delta_{ki_{0}}+\delta_{kj_{0}}}{s_{k}}\right]
=δjj0si0pii0(2)(0)+δji0sj0pij0(2)(0)pii0(2)(0)pi0j(0)1si0pij0(2)(0)pj0j(0)1sj0.\displaystyle=\frac{\delta_{jj_{0}}}{s_{i_{0}}}p^{(2)}_{ii_{0}}(0)+\frac{\delta_{ji_{0}}}{s_{j_{0}}}p^{(2)}_{ij_{0}}(0)-p^{(2)}_{ii_{0}}(0)p_{i_{0}j}(0)\frac{1}{s_{i_{0}}}-p^{(2)}_{ij_{0}}(0)p_{j_{0}j}(0)\frac{1}{s_{j_{0}}}. (56)

Because θij(3)=cij+dij+eij\theta^{(3)}_{ij}=c_{ij}+d_{ij}+e_{ij}, we have proved Eq. (23).
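The same check applies to Eqs. (54), (55), and (56): the matrices with entries c_ij, d_ij, and e_ij, built from the closed forms, should sum to Θ(3) = Θ(1)P²(0) + P(0)Θ(1)P(0) + P²(0)Θ(1) and match a finite-difference derivative of P³(ε). Again a sketch on a hypothetical toy network (0-indexed nodes):

```python
import numpy as np

# Hypothetical four-node weighted network; nodes are 0-indexed.
W = np.array([[0.0, 1.0, 2.0, 0.0],
              [1.0, 0.0, 1.5, 0.5],
              [2.0, 1.5, 0.0, 1.0],
              [0.0, 0.5, 1.0, 0.0]])
N = W.shape[0]
i0, j0 = 1, 2
eps = 1e-6

s = W.sum(axis=1)
P0 = W / s[:, None]
P2 = P0 @ P0
P3 = P2 @ P0

chi = np.zeros((N, N))
chi[i0, j0] = chi[j0, i0] = 1.0
delta = np.zeros(N)
delta[[i0, j0]] = 1.0
Theta1 = chi / s[:, None] - P0 * (delta / s)[:, None]   # Eq. (45)

# c_ij from Eq. (54)
c = np.zeros((N, N))
c[i0] += P2[j0] / s[i0]
c[j0] += P2[i0] / s[j0]
c -= P3 * (delta / s)[:, None]

# d_ij from Eq. (55)
d = (np.outer(P0[:, i0], P0[j0]) - np.outer(P0[:, i0], P2[i0])) / s[i0] \
    + (np.outer(P0[:, j0], P0[i0]) - np.outer(P0[:, j0], P2[j0])) / s[j0]

# e_ij from Eq. (56)
e = np.zeros((N, N))
e[:, j0] += P2[:, i0] / s[i0]
e[:, i0] += P2[:, j0] / s[j0]
e -= np.outer(P2[:, i0], P0[i0]) / s[i0]
e -= np.outer(P2[:, j0], P0[j0]) / s[j0]

Theta3 = c + d + e                 # Eq. (23)

# Finite-difference derivative of P^3(eps)
W_eps = W.copy()
W_eps[i0, j0] += eps
W_eps[j0, i0] += eps
P_eps = W_eps / W_eps.sum(axis=1, keepdims=True)
fd3 = (P_eps @ P_eps @ P_eps - P3) / eps
print(np.max(np.abs(fd3 - Theta3)))  # discrepancy is O(eps)
```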

Appendix C Derivation of ΔM\Delta M

In this section, we derive ΔM\Delta M as a block matrix

ΔM=(Δ11Δ12Δ1NΔ21Δ22Δ2NΔN1ΔN2ΔNN),\displaystyle\Delta M=\begin{pmatrix}\Delta_{11}&\Delta_{12}&\cdots&\Delta_{1N}\\ \Delta_{21}&\Delta_{22}&\cdots&\Delta_{2N}\\ \vdots&\vdots&\ddots&\vdots\\ \Delta_{N1}&\Delta_{N2}&\cdots&\Delta_{NN}\end{pmatrix}, (57)

where each Δij\Delta_{ij} is an N×NN\times N matrix.

The iith row of the diagonal block Δii\Delta_{ii} is filled with zeros, and all the other rows are the same as those of matrix 12Θ(1)-\frac{1}{2}\Theta^{(1)}. For example, we obtain

Δ22=12(θ11(1)θ12(1)θ1N(1)000θ31(1)θ32(1)θ3N(1)θN1(1)θN2(1)θNN(1)).\displaystyle\Delta_{22}=-\frac{1}{2}\begin{pmatrix}\theta^{(1)}_{11}&\theta^{(1)}_{12}&\cdots&\theta^{(1)}_{1N}\\ 0&0&\cdots&0\\ \theta^{(1)}_{31}&\theta^{(1)}_{32}&\cdots&\theta^{(1)}_{3N}\\ \vdots&\vdots&\ddots&\vdots\\ \theta^{(1)}_{N1}&\theta^{(1)}_{N2}&\cdots&\theta^{(1)}_{NN}\end{pmatrix}. (58)

For iji\neq j, the jjth row of Δij\Delta_{ij} is equal to the iith row of 12Θ(1)-\frac{1}{2}\Theta^{(1)}, and all the other rows are filled with zeros. For example, we obtain

Δ21=12(θ21(1)θ22(1)θ2N(1)000000)\displaystyle\Delta_{21}=-\frac{1}{2}\begin{pmatrix}\theta^{(1)}_{21}&\theta^{(1)}_{22}&\cdots&\theta^{(1)}_{2N}\\ 0&0&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&0\\ \end{pmatrix} (59)

and

Δ23=12(000000θ21(1)θ22(1)θ2N(1)000000).\displaystyle\Delta_{23}=-\frac{1}{2}\begin{pmatrix}0&0&\cdots&0\\ 0&0&\cdots&0\\ \theta^{(1)}_{21}&\theta^{(1)}_{22}&\cdots&\theta^{(1)}_{2N}\\ 0&0&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&0\end{pmatrix}. (60)

Equation (21) implies that only the i0i_{0}th and j0j_{0}th rows of Θ(1)\Theta^{(1)} may be nonzero. Owing to this property, there are only 3N23N-2 nonzero matrix blocks Δij\Delta_{ij} out of the N2N^{2} blocks of ΔM\Delta M, which we show using an example as follows. Assume that we perturb edge (i0,j0)=(4,7)(i_{0},j_{0})=(4,7). Then, the fourth and seventh rows are the only nonzero rows of Θ(1)\Theta^{(1)}. Therefore, Δii\Delta_{ii}, where i{i0,j0}i\notin\{i_{0},j_{0}\}, has only the i0i_{0}th and j0j_{0}th rows nonzero. Any off-diagonal matrix block Δij\Delta_{ij}, where i{i0,j0}i\not\in\{i_{0},j_{0}\} and jij\neq i, is zero because it has (θi1(1),,θiN(1))/2-\left(\theta_{i1}^{(1)},\ldots,\theta_{iN}^{(1)}\right)/2 in the jjth row, this row is zero given that i{i0,j0}i\notin\{i_{0},j_{0}\}, and all the other rows are zero. We next consider Δi0j\Delta_{i_{0}j}, where j{1,,N}j\in\{1,\ldots,N\}. Diagonal block Δi0i0\Delta_{i_{0}i_{0}} has only the j0j_{0}th row nonzero, which is given by (θj01(1),,θj0N(1))/2-\left(\theta^{(1)}_{j_{0}1},\ldots,\theta^{(1)}_{j_{0}N}\right)/2. Off-diagonal block Δi0j\Delta_{i_{0}j}, where ji0j\neq i_{0}, has only the jjth row nonzero, which is equal to (θi01(1),,θi0N(1))/2-\left(\theta^{(1)}_{i_{0}1},\ldots,\theta^{(1)}_{i_{0}N}\right)/2. Therefore, all the blocks Δi0j\Delta_{i_{0}j} with j{1,,N}j\in\{1,\ldots,N\} are nonzero in general. Likewise, Δj0j0\Delta_{j_{0}j_{0}} has only the i0i_{0}th row nonzero, which is given by (θi01(1),,θi0N(1))/2-\left(\theta^{(1)}_{i_{0}1},\ldots,\theta^{(1)}_{i_{0}N}\right)/2. Off-diagonal block Δj0j\Delta_{j_{0}j}, where jj0j\neq j_{0}, has only the jjth row nonzero, which is equal to (θj01(1),,θj0N(1))/2-\left(\theta^{(1)}_{j_{0}1},\ldots,\theta^{(1)}_{j_{0}N}\right)/2. Therefore, all the blocks Δj0j\Delta_{j_{0}j} with j{1,,N}j\in\{1,\ldots,N\} are nonzero in general. This proves that there are (N2)+N+N=3N2(N-2)+N+N=3N-2 nonzero matrix blocks Δij\Delta_{ij}.

Furthermore, most rows of ΔM\Delta M are zero rows. Specifically, consider NN rows of ΔM\Delta M given in a block matrix form by (Δi1,,ΔiN)\left(\Delta_{i1},\ldots,\Delta_{iN}\right), where i{i0,j0}i\notin\{i_{0},j_{0}\}. As we have shown, the only non-zero block among Δi1\Delta_{i1}, \ldots, ΔiN\Delta_{iN} is Δii\Delta_{ii}, and the only nonzero rows of Δii\Delta_{ii} are the i0i_{0}th and j0j_{0}th rows. Therefore, the N2N-2 rows of (Δi1,,ΔiN)\left(\Delta_{i1},\ldots,\Delta_{iN}\right), i.e., jjth rows with j{i0,j0}j\notin\{i_{0},j_{0}\}, are zero rows. In addition, the i0i_{0}th row of (Δi01,,Δi0N)\left(\Delta_{i_{0}1},\ldots,\Delta_{i_{0}N}\right) and the j0j_{0}th row of (Δj01,,Δj0N)\left(\Delta_{j_{0}1},\ldots,\Delta_{j_{0}N}\right) are zero rows. Therefore, the number of nonzero rows of ΔM\Delta M is equal to 2×(N2)+(N1)×2=4N62\times(N-2)+(N-1)\times 2=4N-6.
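The block structure and the counts derived above (3N−2 nonzero blocks and 4N−6 nonzero rows) can be confirmed by assembling ΔM explicitly. The following sketch does so for a hypothetical four-node toy network (0-indexed nodes), for which 3N−2 = 4N−6 = 10:

```python
import numpy as np

# Hypothetical four-node weighted network; nodes are 0-indexed.
W = np.array([[0.0, 1.0, 2.0, 0.0],
              [1.0, 0.0, 1.5, 0.5],
              [2.0, 1.5, 0.0, 1.0],
              [0.0, 0.5, 1.0, 0.0]])
N = W.shape[0]
i0, j0 = 1, 2                       # perturbed edge

s = W.sum(axis=1)
P0 = W / s[:, None]
chi = np.zeros((N, N))
chi[i0, j0] = chi[j0, i0] = 1.0
delta = np.zeros(N)
delta[[i0, j0]] = 1.0
Theta1 = chi / s[:, None] - P0 * (delta / s)[:, None]   # Eq. (45)

# Assemble Delta M block by block, following Eqs. (58)-(60)
DM = np.zeros((N * N, N * N))
for i in range(N):
    for j in range(N):
        block = np.zeros((N, N))
        if i == j:
            block = -0.5 * Theta1.copy()
            block[i, :] = 0.0       # the i-th row of Delta_ii is zero
        else:
            block[j, :] = -0.5 * Theta1[i, :]  # only the j-th row may be nonzero
        DM[i * N:(i + 1) * N, j * N:(j + 1) * N] = block

# Count nonzero blocks and nonzero rows
nonzero_blocks = sum(
    np.any(DM[i * N:(i + 1) * N, j * N:(j + 1) * N] != 0.0)
    for i in range(N) for j in range(N)
)
nonzero_rows = int(np.sum(np.any(DM != 0.0, axis=1)))
print(nonzero_blocks, nonzero_rows)   # 3N - 2 = 10 and 4N - 6 = 10 for N = 4
```

The counts rely on the fact that only the i0th and j0th rows of Θ(1) are nonzero, which holds by construction of Theta1 in the sketch.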

References

  • Trivers [1971] Trivers, R.L.: The evolution of reciprocal altruism. Q. Rev. Biol. 46, 35–57 (1971)
  • Axelrod [1984] Axelrod, R.: The Evolution of Cooperation. Basic Books, New York (1984)
  • Hofbauer and Sigmund [1998] Hofbauer, J., Sigmund, K.: Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge, UK (1998)
  • Nowak [2006a] Nowak, M.A.: Five rules for the evolution of cooperation. Science 314, 1560–1563 (2006)
  • Nowak [2006b] Nowak, M.A.: Evolutionary Dynamics: Exploring the Equations of Life. Harvard University Press, Cambridge, MA (2006)
  • Henrich and Henrich [2007] Henrich, N., Henrich, J.P.: Why Humans Cooperate: A Cultural and Evolutionary Explanation. Oxford University Press, Oxford, UK (2007)
  • Sigmund [2010] Sigmund, K.: The Calculus of Selfishness. Princeton University Press, Princeton, NJ (2010)
  • Bowles and Gintis [2011] Bowles, S., Gintis, H.: A Cooperative Species. Princeton University Press, Princeton, NJ (2011)
  • Nowak and May [1992] Nowak, M.A., May, R.M.: Evolutionary games and spatial chaos. Nature 359, 826–829 (1992)
  • Nowak and May [1993] Nowak, M.A., May, R.M.: The spatial dilemmas of evolution. Int. J. Bifurcation Chaos 3, 35–78 (1993)
  • Szabó and Fath [2007] Szabó, G., Fath, G.: Evolutionary games on graphs. Phys. Rep. 446, 97–216 (2007)
  • Santos and Pacheco [2005] Santos, F.C., Pacheco, J.M.: Scale-free networks provide a unifying framework for the emergence of cooperation. Phys. Rev. Lett. 95, 098104 (2005)
  • Ohtsuki et al. [2006] Ohtsuki, H., Hauert, C., Lieberman, E., Nowak, M.A.: A simple rule for the evolution of cooperation on graphs and social networks. Nature 441, 502–505 (2006)
  • Santos et al. [2006] Santos, F.C., Pacheco, J.M., Lenaerts, T.: Evolutionary dynamics of social dilemmas in structured heterogeneous populations. Proc. Natl. Acad. Sci. U.S.A. 103, 3490–3494 (2006)
  • Allen et al. [2017] Allen, B., Lippner, G., Chen, Y.-T., Fotouhi, B., Momeni, N., Yau, S.-T., Nowak, M.A.: Evolutionary dynamics on any population structure. Nature 544, 227–230 (2017)
  • Nowak et al. [2004] Nowak, M.A., Sasaki, A., Taylor, C., Fudenberg, D.: Emergence of cooperation and evolutionary stability in finite populations. Nature 428, 646–650 (2004)
  • Ewens [2004] Ewens, W.J.: Mathematical Population Genetics: Theoretical Introduction vol. 1. Springer, New York, NY (2004)
  • Lieberman et al. [2005] Lieberman, E., Hauert, C., Nowak, M.A.: Evolutionary dynamics on graphs. Nature 433, 312–316 (2005)
  • [19] Aldous, D., Fill, J.A.: Reversible Markov chains and random walks on graphs. Unfinished monograph, recompiled 2014 (2002). Available at http://www.stat.berkeley.edu/~aldous/RWG/book.html. Accessed on August 6, 2022
  • Masuda et al. [2017] Masuda, N., Porter, M.A., Lambiotte, R.: Random walks and diffusion on networks. Phys. Rep. 716, 1–58 (2017)
  • Coppersmith and Winograd [1987] Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. In: Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, pp. 1–6 (1987)
  • Barabási and Albert [1999] Barabási, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
  • Fortunato [2010] Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75–174 (2010)
  • Lancichinetti et al. [2008] Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78, 046110 (2008)
  • Zachary [1977] Zachary, W.W.: An information flow model for conflict and fission in small groups. J. Anthropol. Res. 33, 452–473 (1977)
  • van Dijk et al. [2014] van Dijk, R.E., Kaden, J.C., Argüelles-Ticó, A., Dawson, D.A., Burke, T., Hatchwell, B.J.: Cooperative investment in public goods is kin directed in communal nests of social birds. Ecol. Lett. 17, 1141–1148 (2014)
  • Arnberg et al. [2015] Arnberg, N.N., Shizuka, D., Chaine, A.S., Lyon, B.E.: Social network structure in wintering golden-crowned sparrows is not correlated with kinship. Mol. Ecol. 24, 5034–5044 (2015)
  • Bull et al. [2012] Bull, C.M., Godfrey, S., Gordon, D.M.: Social networks and the spread of salmonella in a sleepy lizard population. Mol. Ecol. 21, 4386–4392 (2012)
  • Lusseau et al. [2003] Lusseau, D., Schneider, K., Boisseau, O.J., Haase, P., Slooten, E., Dawson, S.M.: The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. Behav. Ecol. Sociobiol. 54, 396–405 (2003)
  • Michalski et al. [2011] Michalski, R., Palus, S., Kazienko, P.: Matching organizational structure and social network extracted from email communication. In: International Conference on Business Information Systems, pp. 197–206 (2011). Springer, Berlin, Germany
  • Firth and Sheldon [2015] Firth, J.A., Sheldon, B.C.: Experimental manipulation of avian social structure reveals segregation is carried over across contexts. Proc. R. Soc. B 282, 20142350 (2015)
  • Clauset et al. [2004] Clauset, A., Newman, M.E.J., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70, 066111 (2004)
  • Blume [1993] Blume, L.E.: The statistical mechanics of strategic interaction. Games Econ. Behav. 5, 387–424 (1993)
  • Szabó and Tőke [1998] Szabó, G., Tőke, C.: Evolutionary prisoner’s dilemma game on a square lattice. Phys. Rev. E 58, 69 (1998)
  • Traulsen et al. [2006] Traulsen, A., Nowak, M.A., Pacheco, J.M.: Stochastic dynamics of invasion and fixation. Phys. Rev. E 74, 011909 (2006)
  • Allen et al. [2021] Allen, B., Sample, C., Steinhagen, P., Shapiro, J., King, M., Hedspeth, T., Goncalves, M.: Fixation probabilities in graph-structured populations under weak selection. PLoS Comput. Biol. 17, e1008695 (2021)
  • [37] Karate network data set: https://networkrepository.com/karate.php organized by R. Rossi and N. Ahmed. Accessed on June 26, 2022
  • [38] Weaver network data set: https://networkrepository.com/aves-weaver-social-03.php organized by R. Rossi and N. Ahmed. Accessed on June 26, 2022
  • [39] Sparrow network data set: https://networkrepository.com/aves-sparrow-social.php organized by R. Rossi and N. Ahmed. Accessed on June 26, 2022
  • [40] Lizard network data set: https://networkrepository.com/reptilia-lizard-network-social.php organized by R. Rossi and N. Ahmed. Accessed on June 26, 2022
  • [41] Dolphin network data set: https://networkrepository.com/dolphins.php organized by R. Rossi and N. Ahmed. Accessed on June 26, 2022
  • [42] Email network data set: https://networkrepository.com/ia-radoslaw-email.php organized by R. Rossi and N. Ahmed. Accessed on June 26, 2022
  • [43] Bird network data set: https://networkrepository.com/aves-wildbird-network.php organized by R. Rossi and N. Ahmed. Accessed on June 26, 2022
  • [44] Meng, L.: Data sets and Python codes. Available at https://github.com/lingqime/Perturbation-theory-for-evolutionary-game-dynamics-on-networks. Accessed on August 13, 2022