Fostering Cooperation in Structured Populations Through Local and Global Interference Strategies
Abstract
We study the situation of an exogenous decision-maker aiming to encourage a population of autonomous, self-regarding agents to follow a desired behaviour at a minimal cost. The primary goal is therefore to reach an efficient trade-off between pushing the agents to achieve the desired configuration while minimising the total investment. To this end, we test several interference paradigms resorting to simulations of agents facing a cooperative dilemma in a spatial arrangement. We systematically analyse and compare interference strategies rewarding local or global behavioural patterns. Our results show that taking into account the neighbourhood’s local properties, such as its level of cooperativeness, can lead to a significant improvement regarding cost efficiency while guaranteeing high levels of cooperation. As such, we argue that local interference strategies are more efficient than global ones in fostering cooperation in a population of autonomous agents.
1 Introduction
The problem of understanding and predicting collective behaviours of agents in evolving multiagent systems is a well studied research topic in evolutionary game theory, as it can be found in a variety of real-world situations, ranging from ecosystems to human organisations and social networks santos2006pnas; Sigmund2001PNAS; tuyls2007evolutionary; RaghunandanS12; HanJaamas2016; SahraeiBBTW14. In this context, cooperation is typically assumed to emerge from the combined actions of participating agents within the system. However, in many scenarios, such behaviours are advocated and promoted by an exogenous agent, which is not part of the system, calling for a new set of heuristics capable of engineering a desired collective behaviour in a self-organised multiagent system. For instance, consider a wildlife management organisation (e.g., the WWF) that aims to maintain a desired level of biodiversity in a particular region. To do so, the organisation, not being part of the region’s eco-system, has to decide whether to modify the current population of some species, and if so, then when, and in what degree to interfere in the eco-system (i.e., to modify the composition and the biodiversity of the population) levin2000multiple. Note that since more efficient population controlling typically implies more physical actions, which requires higher (monetary) expenses in both human resources and equipment, the organisation has to achieve a balance between efficient wildlife management and a low total investment cost. Moreover, due to the evolutionary dynamics of the eco-system (e.g., frequency and structure dependence santos2006pnas), undesired behaviours can reoccur over time, for example when the interference was not sufficiently strong in the past. Given this, the decision-maker also has to take into account the fact that she will have to repeatedly interfere in the eco-system in order to sustain the level of biodiversity over time. That is, she has to find an efficient sequential interference strategy that leads to her desired goals, while also minimising the total cost of interference.
Although the (sequential) decision-making literature provides a number of techniques to tackle similar problems where the goal is to provide a sequence of decisions that lead to optimal behaviour of the system (e.g., the desired level of biodiversity) while minimising the total cost of making such decisions Madani2004; GuhaMunagala2007; Bachrach2009; Tran-Thanh2012; DingEtAl2013, these approaches typically ignore the fact that the agents, with whom the decision-maker has to interact, also have their own strategic behaviours that together drive the (evolutionary) dynamics of the system. Given this, we argue that such solutions will not be able to exploit the system characteristics, and thus, will fail in providing efficient performance in achieving the desired system status (e.g., the status quo between the fighting opponents, or the desired diversity of population). On the other hand, game theoretic literature, which deals with these strategic behaviours, typically focuses on the extremes of the problem. In particular, researchers either assume that the system is fully closed (i.e., there are no outsider decision-makers) or the decision-maker has full control over the behaviour of the agents. Typical models for the former are classical (both non-cooperative and coalitional) game theoretical models Bachrach2009; AzizBH10; AadithyaMJ11. The latter includes models from mechanism design, where the decision-maker is the system designer, and can define some set of norms and penalties such that the agents are incentivised not to deviate from the desired behaviour endriss2011incentive; Wooldridge12; LevitEtAl2013; HarrensteinTW14. Given this, these works are not suitable to meet our current aims either.
This paper presents an alternative approach to address the problem of the external decision-maker by combining the decision-making process design with an evolutionary game theoretic perspective. While the former captures the strategic behaviour of the decision-maker, the latter can be used to formalise the evolutionary dynamics of the multiagent system and the strategic decisions within large populations exhibiting arbitrary population structures and social tensions. In particular, we consider a multiagent system where the agents, distributed in a network, interact with their neighbours via the one-shot Prisoner’s Dilemma (PD), where uncooperative behaviour is preferred over cooperation Sigmund2001PNAS; santos2006pnas. As an outsider decision-maker, we aim to promote cooperation by interfering in the system, rewarding particular agents in the population at specific moments;
The research question here is to identify when and how much to invest (i.e., pay the agents) at each time step, in order to achieve our desired ratio of cooperation within the system such that the total cost of interference is minimised. We investigate two general classes or approaches of interference strategies; the first is based on the current composition of the population while the second is based on local neighbourhood properties, in particular, for any agent being considered for investment we examine its current cooperativeness level (the number of its neighbours who will cooperate). These two classes represent a large fraction of the sequential interference strategies in real–world scenarios, requiring different levels of information in the decision making process. While the first requires as input the overall population cooperativeness, the second requires more detailed, local information regarding each neighbourhood to make a decision. They represent two different approaches of governing and incentivising institutions: a large global institution overseeing the whole population (such as the United Nations) as opposed to multiple local institutions (such as regional or group-wide ones) vasconcelos2013bottom. We are particularly interested in whether taking into account neighbourhood information will lead to more efficient interference and how this may be achieved.
The remainder of the paper is structured as follows: the next section provides a brief overview of the related work, this is followed by a detailed description of our model, its results and a final discussion.
2 Related Work
In the literature of evolution of cooperation, different mechanisms of cooperation have been studied, including direct and indirect reciprocity nowak:2005:nature, kin and group selections traulsen:2006:pnas, network reciprocity santos2006pnas, punishment and rewarding Sigmund2001PNAS, and pre-commitments Han2017AAAI. These mechanisms are incorporated as part of individual strategic behaviours, in order to study how they evolve and whether their evolution promotes a better outcome for cooperative behaviour. These works, however, do not consider the influence of an exogenous decision-maker, but rather rely on the changes made by internal agents. In contrast, our interference strategies are external, i.e. they are not incorporated into the individual strategy. In addition, the aim of our strategies is to minimise the cost of interference while guaranteeing high levels of cooperation, contrary to past literature where the cost optimisation is often omitted. In this respect, our work is also different from the modelling works of institutional incentives to encourage cooperation through costly reward and punishment (see e.g., sigmund2010social; vasconcelos2013bottom) as well as through enforcing agreements Han2017AAAI.
Closely related to the current work is the analysis in han2015cost; han2014cost where cost-efficient interference is studied in well-mixed populations (i.e. having a fully connected graph structure). Thus, the interference therein only depends on the composition of the population. In the more complex scenario of structured populations we consider in this paper, agents might have different types of neighbourhood. As shown below, taking into account neighbourhood properties leads to more cost-efficient interference strategies than that based solely on the population composition.
Works on cooperation in social networks also assume that changes are initiated from inside the system RaghunandanS12; SahraeiBBTW14; FranksGJ13; FranksGA14. Among them, more relevant to our paper is the recent work by Franks et al. FranksGJ13; FranksGA14, which has explored the use of influencers on complex networks. However, these influencers are also part of the system and thus, similar to the cases mentioned above, this work does not consider external interference mechanisms. Given this, it does not address similar decision-making problems that we examine here. Apart from these examples, some other works apply schemes of mechanism design to control the behaviour of the system. Endriss et al. endriss2011incentive investigated how to tax games to incentivise certain behaviours in system equilibrium points and Wooldridge et al. Wooldridge12 discussed the options to manipulate games in order to achieve desired behaviours. These works, however, assume that the decision-maker has full control on the agents within the systems, which allows her to both reward the good agents (i.e., those who follow the desired behaviours) and penalising the deviant ones. With our approach, the decision-maker has little or no control on the agents so they can rely only on rewarding schemes, nudging agents to adopt a given strategy. Moreover, mechanism design work typically does not focus on the cost of maintaining the desired behaviour, whereas cost optimisation is one of our objectives.
Last but not least, Bachrach et al. Bachrach2009 investigated how much investment into a cooperative game is needed in order to ensure that a certain coalition structure is stable. This work, however, only considers non-evolving systems with a single time step, and thus, interference can only be applied once. In contrast, in our system, due to its evolutionary dynamics (and stochasticity), interference is repeatedly carried out over time (see the next section for more detail). Thus, the related work (above) cannot be applied directly to our approach.
3 Models and Methods
3.1 Prisoner’s Dilemma on Square Lattice Networks
We consider a population of agents on a square lattice of size with periodic boundary conditions— a widely adopted population structure in population dynamics and evolutionary games (for a survey, see szabo2007evolutionary). We focus our analysis on the efficiency of various interference strategies in spatial settings, adopting an agent-based model directly comparable with the setup of recent lab experiments on cooperation rand2014static.
Initially each agent is designated either as a cooperator (C) or defector (D) with equal probability. Agents’ interaction is modelled using the one-shot Prisoner’s Dilemma game, where mutual cooperation (mutual defection) yields the reward (penalty ) and unilateral cooperation gives the cooperator the sucker’s payoff and the defector the temptation . As a popular interaction model of structured populations szabo2007evolutionary, we adopt the following scaled payoff matrix of the PD: , , 111The results obtained in this paper remain robust when is a small positive value, resulting in a strict PD nowak1992evolutionary. (with ).
At each time step or generation, each agent plays the PD with its (four) immediate neighbours. The score for each agent is the sum of the payoffs in these encounters. At the start of the next generation, each agent’s strategy is changed to that of its highest scored neighbour nowak1992evolutionary; szabo2007evolutionary. Our analysis will be primarily based on this deterministic, standard evolutionary process in order to focus on understanding the cost-efficiency of different interference strategies. However, we confirm that all our conclusions remain valid for a stochastic update rule (see Section 4.4 for more details). Namely, instead of coping the highest scored neighbour, at the end of each generation an agent with score chooses to copy the strategy of a randomly selected neighbour agent with score with a probability given by the Fermi rule traulsen2006stochastic: , where denotes the amplitude of noise in the imitation process szabo2007evolutionary. Varying K allows capturing a wide range of update rules and levels of stochasticity, including those used by humans, as measured in lab experiments zisisSciRep2015; randUltimatum.
We simulate this evolutionary process until a stationary state or a cyclic pattern is reached. Similarly to nowak1992evolutionary, all the simulations in this work (described in next section) converge quickly to such a state. For the sake of a clear and fair comparison, all simulations are run for 200 generations. Moreover, for each simulation, the results are averaged from additional 50 generations after that. Furthermore, to improve accuracy, for each set of parameter values, the final results are obtained from averaging 50 independent realisations.
Note that we do not consider mutations or explorations in this work. Thus, whenever the population reaches a homogeneous state (i.e. when the population consists of 100% of agents adopting the same strategy), it will remain in that state regardless of interference. Hence, whenever detecting such a state, no further interference will be made. It is noteworthy that the results remain robust assuming a sufficiently small mutation rate, allowing the population to reach a stable state before any new mutants arise.
3.2 Cost-Efficient Interference in Networks
As already stated, we aim to study how one can efficiently interfere in a structured population to achieve high levels of cooperation while minimising the cost of interference. An investment in a cooperator consists of a cost (to the external decision-maker/investor).
In particular, we investigate whether global interference strategies (where investments are triggered based on network level information) or their local counterparts (where investments are based on local neighbourhood information) lead to successful behaviour with better cost efficiency.
To do so, we consider two main classes of interference strategies based on the global composition of the population and the neighbourhood information.
1. Population composition based (POP): In this class of strategies the decision to interfere (i.e. to invest on all cooperators in the population) is based on the current composition of the population (we denote the number of cooperators currently in the population). Namely, they invest when the number of cooperators in the population is below a certain threshold, (i.e. ), for . They do not invest otherwise (i.e. when ). The value describes how widespread defection strategy should be to trigger the support of cooperators’ survival against defectors.
2. Neighbourhood based (NEB): In this class of strategies, the decision to invest in a given cooperator is based on the cooperativeness level in that cooperator’s neighbourhood. While POP can be seen as a global interference strategy, NEB adopts a local interference paradigm. Namely, the decision-maker invests in a cooperator when the number of its cooperative neighbours is below a certain threshold, , for ; otherwise, no investment is made.
In addition to these primary strategy classes, we consider two additional advanced local strategies, assuming that the external decision-maker can assess individual scores of players in a neighbourhood (e.g. an institution might have access to individuals’ income records).
(NEB-i) Maintaining-C: for every cooperator, if the best scoring neighbour is a D which has a payoff larger than that of C, then invest an amount equal to the difference of the payoff of best scoring neighbour to that of C, plus a small amount .
(NEB-ii) Maintaining-C plus influencing D-neighbours. For every C we proceed as in (NEB-i). We then perform an additional second step in which we lure D neighbours of C to copy its strategy. To achieve this, if the highest scoring neighbour of D is a defector, then invest an additional amount to ensure that C has a higher payoff than that of the most fit D, by a small amount .
Note that these interference strategies are increasingly more subtle, requiring more information and so more detailed observations of the population. In particular, POP requires knowledge of overall cooperativeness of the population, while NEB requires information about local cooperativeness in each neighbourhood. NEB-i and NEB-ii need to access the fitness levels of all players in each neighbourhood.
4 Results
4.1 Cost-Efficient Interference: POP vs NEB
We compare the population-based and neighbourhood based interference strategies, i.e. POP vs NEB, namely their abilities to promote cooperation while maintaining a minimal total cost. In Figure 1, we investigate the evolution over time of the frequency of cooperation and the corresponding cost required as per-generation for POP and NEB strategies, and for different values of and , respectively. We observe that for POP, the population converges to cyclic patterns for a small , while when is sufficiently large (close to 100% in general), the population quickly converges to 100% of cooperation, avoiding cyclic behaviours. Cyclic patterns occur since interference can lead to an increase in the frequency of Cs; when this rate becomes larger than , it causes the interference by POP to stop, which in turn leads to the decrease of the frequency of cooperative actions and then causes interference to restart.
On the other hand, for NEB, when , the population converges to a stable level of cooperation nurtured by constant interference. When , the population quickly converges to 100% of cooperation, thus avoiding the cost of maintaining a stable state. Most interestingly, comparing and , we observe that the per-generation cost increases for the former until reaching 100% of cooperation, while it decreases for the latter. The reason is that there are increasingly more cooperators requiring costly investment in the former case while such a number decreases in the latter. As such, NEB with has a significantly lower total cost than NEB with while guaranteeing similar high levels of cooperation (see Figure 4 for other parameter values of ).
Next, for each interference approach (POP and NEB) and threshold ( and ), we relate the stationary cooperation level with the cost spent per-individual () (see Figure 2). In general, the C frequency obtained by POP mostly depends on – it requires a high ( of the whole population) to reach a high level of cooperation, turning POP a costly strategy. For NEB, similar high levels of cooperation can be achieved when (comparing panels A and C). Considering the total cost minimisation (panels B and D), NEB has a significantly larger area where a low cost is required to achieve high levels of cooperation when compared to POP. Thus, adopting a local interference strategy is largely more efficient than a global population-wide paradigm.
4.2 Conditions for Cost-Efficient Interference
From Figure 2, we also observe that both POP and NEB strategies require a certain threshold of to be successful. Next, we derive a closed-form expression for such critical thresholds, for arbitrary .
From Figure 2B we observe that, for POP to obtain an efficient cost while ensuring high levels of cooperation (the black area near top right corner), must be sufficiently high. Our analysis of Figure 3A shows that whenever , different cyclic patterns or stable mixed configurations are eventually reached. Particularly, when is slightly less than , the population always ends up in a stable pattern, characterised by a number of standalone D individuals surrounded by C neighbours. When (i.e. when , ), cyclic patterns can be escaped, leading to a small total cost and high levels of cooperation.
Similarly, from Figure 2D, for NEB to obtain an efficient cost while ensuring high levels of cooperation (the black area near top right corner), must be sufficiently high. We have derived the threshold of for : (i.e. when , ), see Figure 3 (panels B and C). Moreover, since NEB with is equivalent to POP with — both mean always invest in all cooperators — the same condition is obtained for these two limiting cases (i.e. ).
We can see that the threshold of for NEB with is lower than the one obtained for NEB with ( compared to ), which is also confirmed by the numerical results in Figure 2D. When lies between and , NEB with is more cost-efficient.
We confirm by performing additional simulations that the two conditions hold for other values of and are also robust for different population sizes .
4.3 Comparing with Fitness Observation based Strategies: NEB-i and NEB-ii
We compare the cost efficiency of NEB with and then with where the conditions for their efficiency (derived above) are satisfied, namely, and , respectively. Abusing notations, let us assume that and for the two cases, respectively. This allows us to conveniently compare these strategies with the more advanced ones, NEB-i and NEB-ii. The parameter represents the extra amount to the threshold value of which guarantees (close to) 100% of cooperation.
Figure 4 shows the total costs for the four interference strategies for varying . They all guarantee high levels of cooperation (100%) for . In general, NEB-3 is always significantly more cost-efficient than that with . Moreover, assuming that it is possible to assess individual scores of players in a neighbourhood, more cost-efficient strategies are obtained. Namely, both NEB-ii and NEB-i are slightly better than NEB-3.
4.4 Stochastic Update
We confirm that our results are robust with respect to the stochastic update rule (see again Section 3.1). In particular, similarly to the deterministic case, we observe that the local interference approach (NEB) leads to more cost-efficient strategies than the global interference one (POP). Namely, NEB with is highly cost-efficient (while guaranteeing high levels of cooperation) – see Figure 5 for a typical evolution of strategies and per-generation costs (and the total costs) for NEB with and POP with . Interestingly, due to the stochastic effect (where there is a chance that a higher scored individual copies the lower scored neighbour’s strategy), it is easier to escape cyclic configurations and reach full cooperation in the population. As a result, a lower threshold of is required to reach good performance of the interference strategies (e.g. in Figure 5, was sufficient to reach full cooperation, compared to in the deterministic case). However, due to this stochastic effect, it takes longer to converge to a homogeneous state (compared Figures 5 and 1 where and , noting that NEB with is equivalent to POP with ).
5 Conclusions and Future Work
In summary, in this paper, we seek to determine how to efficiently interfere in a spatially distributed system (i.e. a structured population) of agents to achieve a desired collective state (i.e. high levels of cooperation). In addition to understanding the mechanisms underlying the emergence of cooperation in a multiagent system, we aim at identifying the most effective strategy to foster cooperative scenarios. In particular, the cost of interference measured as the consumption of (monetary) resources, and the greater impact we want to make, the higher the cost we have to pay. To tackle this problem, we have combined different interference paradigms with an evolutionary game theoretic model in structured populations, showing that local strategies (even their simplest versions) outperform global strategies in promoting cooperation in networked communities.
In particular, we have systematically studied two major interference approaches: a global paradigm which oversees the status of the whole population when deciding whether to interfere and a local approach which considers local properties of a neighbourhood when deciding to intervene. We have shown that local approaches lead to more cost-efficient interference strategies. Notably, we have shown that a policy which only invests in a cooperator if it has at least a defective neighbour (i.e. NEB-3), is highly efficient, promoting substantial levels of cooperation. We have also derived the analytical condition of optimal thresholds above which it is profitable to invest and confirmed the results are robust for other stochastic update rules. Furthermore, we have demonstrated that more advanced strategies based on neighbourhood properties can be achieved by assessing individual scores of players in a neighbourhood.
Our findings provide useful insights for decision-makers in a large variety of application domains, ranging from drug prevention programmes (e.g., the Unplugged programme of Mentor-Adepis in the UK which aims to promote behaviours that lead to the decrease of alcohol and drug usage among the youth) and international educational aids to developing countries (e.g., the Global Partnership for Education fund), to wildlife conservation initiatives (e.g., WWF) and environmental governance (i.e., how to promote and enforce behaviours that lead to sustainable environment among both industrial and domestic actors). According to our results, it is better to focus on strategies which look at each individual’s neighbourhood and support cooperation based on local information, instead of using global observables as the primary trigger for decision-making.
Note that this work may foster future studies of even more efficient interference strategies. In particular, it would be interesting to consider more adaptive strategies, which modify the amount and the frequency of investment dynamically depending on the current state of the system. The analysis of the resulting systems, however, is not straightforward, as it remains unclear whether to increase or decrease the amount/frequency of investments will lead to more efficient performance. Also, we aim to extend our investigation to systems with other, more complicated scenarios such as heterogeneous networks and multi-player games santos2008social where more behavioural equilibria and degree-dependent strategies are expected duong2015expected.