
Exact solutions of the simplified March model for organizational learning

Hang-Hyun Jo, Department of Physics, The Catholic University of Korea, Bucheon 14662, Republic of Korea
Abstract

James G. March's celebrated agent-based simulation model for organizational learning [March, Organization Science 2, 71 (1991)] has been extensively studied over the past few decades. Yet the model has not been fully understood, owing to the lack of analytical solutions. We simplify the March model so that an analytical approach using master equations can be taken. We then derive exact solutions for some of the simplest yet nontrivial cases and numerically estimate the master equations for more complicated cases. Both analytical and numerical results are in good agreement with agent-based simulations. These results are also compared to those of the original March model. Our approach enables us to rigorously understand the results of the simplified model as well as, to a large extent, those of the original model.


I Introduction

James G. March introduced an agent-based simulation model for organizational learning in his seminal paper in 1991 [1]. Since then, the original March model, its variants, and other similar models have been extensively studied over the past few decades [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. March's model considers an external reality, an organizational code (code hereafter), and individual members of the organization. The code represents a set of norms, rules, etc., that is updated using the knowledge of individuals about the reality, while individuals in turn learn about the reality from the code. By doing so, organizational knowledge about the reality is collected from individuals and disseminated to them at the same time. March studied the effects of the learning rates of individuals and of the code on their achieved knowledge about the reality. To consider more realistic situations, he also took personnel turnover and environmental turbulence into account, finding that there may exist an optimal turnover rate that maximizes the achieved knowledge, depending on the learning rates.

We remark that the March model can be considered within the framework of opinion dynamics in networks [16, 17, 18, 19, 20, 21, 22, 23, 24]. That is, the code plays the role of a hub node in a hub-and-spoke network, while individuals are dangling nodes [25]. Nodes update their opinions or beliefs according to their neighbors' opinions or beliefs, while the external reality acts as an external field or source affecting all nodes. This implies that various analytical approaches developed for opinion dynamics can be applied to the March model.

The March model and its variants have provided insights into management and business administration [8], but mostly by means of computer simulations [14]. In general, for a rigorous understanding of models, the derivation of exact, analytical solutions is of utmost importance. In our work, we simplify March's original model so that master equations describing the dynamics of the model can be written down explicitly. We then derive exact solutions of the simplified model for some of the simplest yet nontrivial cases. Numerical estimation of the master equations is performed for more complicated cases. Both analytical and numerical results are shown to be in good agreement with agent-based simulation results. Our approach enables us to rigorously understand the results of not only the simplified model but also, to a large extent, the original model.

The paper is organized as follows. In Sec. II, we describe the original March model and our simplified version. In Sec. III, analytical, numerical, and simulation results of the simplified model are presented and compared to the results of the original March model. Finally, we conclude our work in Sec. IV.

II Models

II.1 Original March model

As mentioned, the original model by March considers the external reality, the code, and individual members of the organization [1]. We remark that in this Subsection we use the mathematical symbols originally used in March's paper; they should not be confused with the symbols used in the next Subsection and throughout the rest of the paper. The model is based on the following assumptions.

(i) The reality is characterized in terms of an $m$-dimensional vector, each element of which may have the value of $1$ or $-1$ with equal probabilities.

(ii) The code and $n$ individuals in the organization have beliefs about the reality. Each belief is also represented by an $m$-dimensional vector, each element of which may have the value of $1$, $0$, or $-1$ with equal probabilities. These beliefs may change over time.

(iii) At each time step, each individual may change elements of its belief that are different from those of the code unless the code's element is $0$. Each such element of the individual changes to that of the code with a probability $p_1$, independently of the other elements.

(iv) At the same time, the code updates its belief based on the beliefs of some individuals. For this, individuals whose beliefs are closer to the reality than the code's belief is are identified; they are called the superior group. Then each element of the code's belief changes to the dominant element within the superior group with a probability $p_2$, independently of the other elements.

So far, the reality has been assumed to be fixed and the individuals are not replaced by new ones; such a setting is called a closed system. March first considered a homogeneous population in the closed system, in which all individuals are assigned the identical learning probability. Then he considered a heterogeneous population in the closed system, such that some individuals have a higher learning probability than the others. Finally, a homogeneous population in an open system is also considered; in the open system individuals may be replaced by new ones (turnover) and/or the reality changes over time (turbulence). The turnover probability is denoted by $p_3$ and the turbulence probability by $p_4$. That is, with a probability $p_3$ each individual is replaced by a new one having a random belief vector at each time step. Also, each element of the reality shifts to the other value, i.e., from $1$ to $-1$ or from $-1$ to $1$, with a probability $p_4$.

II.2 Simplified March model

Let us simplify the March model. As in the original model we consider an external reality, a code, and $N$ agents. At a time step $t$, the external reality, denoted by a variable $r(t)$, can have a value of $0$ or $1$. Beliefs of the code and agents about the reality are respectively represented by variables, namely, $c(t)\in\{0,1\}$ for the code and $\sigma_i(t)\in\{0,1\}$ for the $i$th agent with $i=1,\ldots,N$. For a given initial condition of $r(0)$, $c(0)$, and $\{\sigma_i(0)\}$, each time step consists of four stages:

(i) Every agent $i\in\{1,\ldots,N\}$ independently updates its belief by learning from the code with a socialization probability $p^{(i)}\in(0,1]$:

$$\sigma_i(t+1)=c(t). \qquad (1)$$

(ii) Each agent is replaced by a new agent with a turnover probability $u\in[0,1]$, and the new agent is assigned a belief randomly drawn from $\{0,1\}$.

(iii) The code learns from agents who are superior to the code. Here superior agents indicate those whose beliefs are closer to the reality than the code's belief is. For example, if $r(t)=1$, the code learns from superior agents only when $c(t)=0$ and there is at least one agent with $\sigma_i(t)=1$. Denoting a superior agent to the code by $j$, the code updates its belief with a codification probability $q\in(0,1]$:

$$c(t+1)=\sigma_j(t)\ \textrm{for}\ j\in\{i\,|\,\delta_{\sigma_i(t),r(t)}>\delta_{c(t),r(t)}\}, \qquad (2)$$

where $\delta_{\cdot,\cdot}$ is a Kronecker delta.

(iv) With a turbulence probability $v\in[0,1]$, the reality is assigned a new value randomly drawn from $\{0,1\}$, which closes the time step.

Since the reality, the code, and the agents update their value or beliefs synchronously, the order of the four stages does not affect the result, except for the socialization and turnover of agents. Note that the parameters $\{p^{(i)}\}$, $q$, $u$, and $v$ in our simplified model correspond to $p_1$, $p_2$, $p_3$, and $p_4$ in the original March model [1], respectively.

Our simplified model can be called a closed system if $u=v=0$, otherwise it is an open system. In the open system, the reality can vary over time (turbulence), and/or agents can be replaced by new agents (turnover). In contrast, in the closed system, the reality is assumed to have the fixed value of $1$ for the entire period of time, i.e., $r(t)=1$ for all $t$, without loss of generality, and there is no turnover of agents.
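To make the update rules concrete, the following is a minimal sketch of a single run of the simplified model defined by stages (i)-(iv) above. It is ours, not taken from Ref. [1]; a homogeneous socialization probability $p$ is assumed and all function and variable names are illustrative.

```python
# Minimal sketch of one run of the simplified March model (stages (i)-(iv)).
# Names are ours; a homogeneous socialization probability p is assumed.
import random

def run_simplified_march(N, p, q, u, v, T, seed=None):
    rng = random.Random(seed)
    r = rng.randint(0, 1)                              # reality r(0)
    c = rng.randint(0, 1)                              # code belief c(0)
    sigma = [rng.randint(0, 1) for _ in range(N)]      # agent beliefs sigma_i(0)
    for _ in range(T):
        # (i) socialization: each agent copies the code with probability p
        new_sigma = [c if rng.random() < p else s for s in sigma]
        # (ii) turnover: each agent is replaced by a random belief with probability u
        new_sigma = [rng.randint(0, 1) if rng.random() < u else s for s in new_sigma]
        # (iii) codification: if some agent at time t is superior to the code,
        # the code adopts the superior belief (which equals r) with probability q
        if c != r and any(s == r for s in sigma):
            new_c = r if rng.random() < q else c
        else:
            new_c = c
        # (iv) turbulence: the reality is redrawn uniformly with probability v
        new_r = rng.randint(0, 1) if rng.random() < v else r
        sigma, c, r = new_sigma, new_c, new_r
    return r, c, sigma

# Example run (parameter values are illustrative):
# r, c, sigma = run_simplified_march(N=40, p=0.3, q=0.5, u=0.0, v=0.0, T=1000)
```

In the closed-system analyses below, $u=v=0$ and the reality is fixed to $r=1$; the sketch above draws $r(0)$ at random merely for generality.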

III Results

Figure 1: (a) Schematic diagram of the homogeneous learning model in a closed system with $r=1$, showing how the code and agents learn from each other with learning probabilities $p$ and $q$ [Eqs. (1) and (2)]. (b) Transition structure between states of the system. Each state is denoted by $(c,n)$, where $c$ is the belief of the code and $n$ is the number of agents whose belief is $1$. Self-loops are not shown for better visualization. Both $(0,0)$ and $(1,N)$ are absorbing states. (c) Analytic solutions of $\rho(\infty)$ in Eq. (14) (lines) with simulation results (symbols) for the case with $N=40$ and an initial condition that $P_{cn}(0)=\delta_{c,0}\delta_{n,1}$. (d) Numerical estimation of $\rho(T)$ using the master equation in Eq. (7) (lines) for the case with $N=40$ and an initial condition that $P_{cn}(0)=\frac{1}{2}\delta_{n,N/2}$ for each $c\in\{0,1\}$. Here $T$ is the first time step satisfying $|\rho(T)-\rho(T-1)|<10^{-4}$. The corresponding simulation results are shown with symbols. In panels (c, d), each symbol was averaged over $10^{4}$ different runs. Standard errors are omitted as they are smaller than symbols.

III.1 Homogeneous learning in a closed system

We consider the homogeneous learning model in a closed system. That is, $r(t)=1$ for all $t$, there is no turnover of agents, and every agent has the same learning probability $p$, i.e.,

$$p^{(i)}=p\ \textrm{for}\ i=1,\ldots,N. \qquad (3)$$

See also Fig. 1(a). The state of the system at each time step $t$ can be summarized in terms of the code's belief $c(t)$ and the number of agents whose belief matches the reality, which we denote by $n(t)\in\{0,\ldots,N\}$. Precisely, $n(t)$ is defined as

$$n(t)\equiv\sum_{i=1}^{N}\delta_{\sigma_i(t),r(t)}. \qquad (4)$$

In our case with $r(t)=1$, one simply has $n(t)=\sum_i\sigma_i(t)$. Then the expected density of agents with the belief matching the reality is given by

$$\rho(t)\equiv\left\langle\frac{n(t)}{N}\right\rangle, \qquad (5)$$

which can be interpreted as the expected belief of a randomly chosen agent, or average individual knowledge [3].

Depending on the initial belief of the code, two scenarios are possible. Firstly, if $c(0)=1$, the code does not change its belief because it already coincides with the reality, and the agents' beliefs will eventually converge to the value of $1$ by Eq. (1). This leads to an absorbing state in which the code and all agents share the same value as the reality, denoted by $(c,n)=(1,N)$. Secondly, if $c(0)=0$, $n(t)$ will decrease until the code's belief changes to $1$ by Eq. (2), which can happen as long as there is at least one agent with belief of $1$. Once the code's belief becomes $1$, $n(t)$ will increase to reach the absorbing state $(c,n)=(1,N)$. However, this is not always the case; $n(t)$ may reach $0$ before $c(t)$ changes to $1$, implying that both the code and the agents have the belief of $0$ without further dynamics. This indicates another absorbing state $(c,n)=(0,0)$. Figure 1(b) shows the transition structure between states with the two absorbing states emphasized in red.

For the analysis, let us denote by $P_{cn}(t)$ the probability that at time step $t$ the code's belief is $c$ and there are exactly $n$ agents with belief of $1$. These probabilities satisfy the normalization condition

$$\sum_{c=0}^{1}\sum_{n=0}^{N}P_{cn}(t)=1. \qquad (6)$$

They evolve according to the following master equation in discrete time:

$$P_{cn}(t+1)=\sum_{c'n'}W_{c'n'\to cn}P_{c'n'}(t), \qquad (7)$$

where the transition probabilities read [see Fig. 1(b)]

$$\begin{split}
&W_{00\to cn}=\delta_{0,c}\delta_{0,n},\\
&W_{0n'(\neq 0)\to 0n}=\begin{cases}\binom{n'}{n}p^{n'-n}\bar{p}^{n}\bar{q} & \textrm{if}\ n'\geq n,\\ 0 & \textrm{if}\ n'<n,\end{cases}\\
&W_{0n'(\neq 0)\to 1n}=\begin{cases}\binom{n'}{n}p^{n'-n}\bar{p}^{n}q & \textrm{if}\ n'\geq n,\\ 0 & \textrm{if}\ n'<n,\end{cases}\\
&W_{1n'\to 0n}=0\ \forall\ n',n,\\
&W_{1n'\to 1n}=\begin{cases}\binom{N-n'}{N-n}p^{n-n'}\bar{p}^{N-n} & \textrm{if}\ n'\leq n,\\ 0 & \textrm{if}\ n'>n.\end{cases}
\end{split} \qquad (8)$$

Here we have used $\bar{p}\equiv 1-p$ and $\bar{q}\equiv 1-q$. Calculating Eq. (7) recursively with any initial condition $\{P_{cn}(0)\}$, one can in principle obtain $P_{cn}(t)$ for any $c$, $n$, and $t$, hence $\rho(t)$ in Eq. (5), i.e.,

$$\rho(t)=\frac{1}{N}\sum_{c=0}^{1}\sum_{n=0}^{N}nP_{cn}(t). \qquad (9)$$
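As an illustration of how Eqs. (7)-(9) can be iterated in practice, the sketch below (ours; numpy and scipy are assumed, and all names are illustrative) builds the transition probabilities of Eq. (8) and propagates an arbitrary initial distribution.

```python
# Sketch (names are ours) of iterating the master equation (7) with the
# transition probabilities of Eq. (8), yielding rho(t) of Eq. (9).
import numpy as np
from scipy.stats import binom

def transition_matrix(N, p, q):
    """W[c', n', c, n]: one-step probability of (c', n') -> (c, n), Eq. (8)."""
    W = np.zeros((2, N + 1, 2, N + 1))
    W[0, 0, 0, 0] = 1.0                                # (0, 0) is absorbing
    for n_old in range(1, N + 1):                      # code belief 0, n' >= 1
        for n in range(n_old + 1):
            keep = binom.pmf(n, n_old, 1 - p)          # n of n' believers keep belief 1
            W[0, n_old, 0, n] = keep * (1 - q)         # code stays at 0
            W[0, n_old, 1, n] = keep * q               # code adopts the belief 1
    for n_old in range(N + 1):                         # code belief 1: n can only grow
        for n in range(n_old, N + 1):
            W[1, n_old, 1, n] = binom.pmf(n - n_old, N - n_old, p)
    return W

def rho_trajectory(N, p, q, P0, T):
    """Iterate Eq. (7) from the initial distribution P0[c, n] for T steps."""
    W, P, ns = transition_matrix(N, p, q), P0.copy(), np.arange(N + 1)
    rhos = []
    for _ in range(T):
        P = np.einsum('cn,cnkm->km', P, W)             # Eq. (7)
        rhos.append(float((P * ns).sum()) / N)         # Eq. (9)
    return rhos

# Example: initial condition P_{cn}(0) = delta_{c,0} delta_{n,1} with N = 40
# P0 = np.zeros((2, 41)); P0[0, 1] = 1.0
# print(rho_trajectory(40, 0.3, 0.5, P0, 100)[-1])
```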

We focus on steady states of the model. It is obvious that all initial probabilities eventually end up in one of the two absorbing states, i.e., $(c,n)=(0,0)$ and $(1,N)$, implying $P_{00}(\infty)+P_{1N}(\infty)=1$. Thus, the average individual knowledge in Eq. (9) reads

$$\rho(\infty)=P_{1N}(\infty)=1-P_{00}(\infty). \qquad (10)$$

As the simplest yet nontrivial case, let us consider an initial condition that $P_{cn}(0)=\delta_{c,0}\delta_{n,1}$, namely, $P_{01}(0)=1$ and $P_{cn}(0)=0$ for all other states $(c,n)\neq(0,1)$. Using

$$W_{01\to 01}=\bar{p}\bar{q}\equiv\alpha\ \textrm{and}\ W_{01\to 00}=p\bar{q}\equiv\beta, \qquad (11)$$

the master equations for $P_{01}$ and $P_{00}$ [Eq. (7)] are written as follows:

$$\begin{split}
&P_{01}(t+1)=\alpha P_{01}(t),\\
&P_{00}(t+1)=P_{00}(t)+\beta P_{01}(t).
\end{split} \qquad (12)$$

The master equations for states other than $(0,1)$ and $(0,0)$ are irrelevant for calculating $P_{00}(\infty)$ in Eq. (10). Since $P_{01}(t)=\alpha^{t}$, one obtains

$$P_{00}(t)=\beta(1+\alpha+\ldots+\alpha^{t-1}), \qquad (13)$$

leading to

$$\rho(\infty)=1-\frac{\beta}{1-\alpha}=\frac{q}{p+q-pq}. \qquad (14)$$

This solution is not a function of $N$ due to the choice of the initial condition that $P_{cn}(0)=\delta_{c,0}\delta_{n,1}$.
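A quick, self-contained consistency check of Eq. (14), again a sketch of ours, iterates the reduced recursion in Eq. (12) directly:

```python
# Check Eq. (14) by iterating the two-state recursion of Eq. (12); names are ours.
p, q = 0.3, 0.5
alpha, beta = (1 - p) * (1 - q), p * (1 - q)       # Eq. (11)
P01, P00 = 1.0, 0.0                                # initial condition P_{01}(0) = 1
for _ in range(1000):
    P01, P00 = alpha * P01, P00 + beta * P01       # Eq. (12), synchronous update
print(1 - P00, q / (p + q - p * q))                # both should equal rho(inf) of Eq. (14)
```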

We observe that $\rho(\infty)$ in Eq. (14) is a decreasing function of $p$ but an increasing function of $q$ [Fig. 1(c)], already partly implying qualitatively similar behavior to the simulation results of the original March model, i.e., Fig. 1 in Ref. [1]. A larger $q$ leads to more correct beliefs of agents about the reality, which is easily understood by noting that $q$ is the learning probability of the code from superior agents. On the other hand, the effect of $p$ on $\rho(\infty)$ is less straightforward to understand, because a large value of $p$ speeds up the probability flow not only to the state $(0,0)$ but also to the state $(1,N)$. That is, a large $p$ always helps spread the code's belief to agents whether the code's belief is correct or not. When the code's belief is incorrect, a large $p$ increases the amount of flow to the state $(0,0)$ [Fig. 1(a)]. In contrast, when the code's belief is correct, a large $p$ does not increase the amount of flow to the state $(1,N)$, but only speeds up the flow. As we focus on the steady behavior, this asymmetric role of $p$ leads to the decreasing behavior of $\rho(\infty)$ as a function of $p$. Such behavior has been interpreted as slow socialization allowing for longer exploration, resulting in better organizational learning [1].

For general initial conditions, one can estimate $\rho(T)$ for a sufficiently large $T$ by iterating the master equation in Eq. (7) for a given initial condition $\{P_{cn}(0)\}$. For a demonstration, we consider a system of $N=40$ agents and the initial condition that $P_{cn}(0)=\frac{1}{2}\delta_{n,N/2}$ for each $c\in\{0,1\}$. We estimate the value of $\rho(T)$ at the first time step $T$ when $|\rho(T)-\rho(T-1)|<10^{-4}$ is satisfied. From the results shown in Fig. 1(d), we find that $\rho(T)$ is a decreasing function of $p$ but an increasing function of $q$, showing the same tendency as the solution of $\rho(\infty)$ in Eq. (14) for the simpler initial condition.

These exact and numerical results are supported by agent-based simulations. We perform simulations of the model using the rules in Eqs. (1) and (2) together with Eq. (3) for the system with $N=40$ agents and the mentioned initial conditions. Firstly, the initial condition with $P_{cn}(0)=\delta_{c,0}\delta_{n,1}$ used for the analysis is realized in the simulation such that only one agent has an initial belief of $1$, while all other agents and the code have the belief of $0$. Secondly, as for the initial condition with $P_{cn}(0)=\frac{1}{2}\delta_{n,N/2}$ for each $c\in\{0,1\}$, we set $\sigma_i(0)=1$ for $i=1,\ldots,20$ and $\sigma_i(0)=0$ for the rest of the agents, while the value of $c(0)$ is randomly chosen from $\{0,1\}$ with equal probabilities. Eventually every run ends up in one of the absorbing states, implying that $n(\infty)=0$ or $N$. For each pair of $p$ and $q$, we take the average of $n(\infty)/N$ over $10^{4}$ different runs to get the value of $\rho(\infty)$ in Eq. (5). Such averages are shown with symbols in Fig. 1(c, d), which are indeed in good agreement with the analytical and numerical solutions, respectively.

III.2 Heterogeneous learning in a closed system

Next, we study the heterogeneous version of the model in a closed system with $r=1$ by using two distinct values of learning probability, i.e., by setting

$$\begin{split}
&p^{(i)}=p_{1}\ \textrm{for}\ i=1,\ldots,N_{1},\\
&p^{(i)}=p_{2}\ \textrm{for}\ i=N_{1}+1,\ldots,N,
\end{split} \qquad (15)$$

where $1\leq N_{1}\leq N-1$ [Fig. 2(a)]. Agents with the larger (smaller) learning probability among $p_1$ and $p_2$ can be called fast (slow) learners [1]. The state of the system at each time step $t$ can be summarized in terms of the code's belief $c(t)$, the number of agents with $p_1$ whose belief is $1$, which we denote by $n(t)\in\{0,\ldots,N_1\}$, and the number of agents with $p_2$ whose belief is $1$, which we denote by $m(t)\in\{0,\ldots,N_2\}$. Here $N_2\equiv N-N_1$. Then the expected density of agents with belief of $1$ is given as

$$\rho(t)\equiv\left\langle\frac{n(t)+m(t)}{N}\right\rangle, \qquad (16)$$

which can also be interpreted as the expected belief of a randomly chosen agent.

Similarly to the homogeneous version of the model, the master equation reads

$$P_{cnm}(t+1)=\sum_{c'n'm'}W_{c'n'm'\to cnm}P_{c'n'm'}(t), \qquad (17)$$

where the transition probabilities are written as

$$\begin{split}
&W_{000\to cnm}=\delta_{0,c}\delta_{0,n}\delta_{0,m},\\
&W_{0n'm'(\neq 00)\to 0nm}=\begin{cases}\binom{n'}{n}p_{1}^{n'-n}\bar{p}_{1}^{n}\binom{m'}{m}p_{2}^{m'-m}\bar{p}_{2}^{m}\bar{q} & \textrm{if}\ n'\geq n\ \&\ m'\geq m,\\ 0 & \textrm{otherwise},\end{cases}\\
&W_{0n'm'(\neq 00)\to 1nm}=\begin{cases}\binom{n'}{n}p_{1}^{n'-n}\bar{p}_{1}^{n}\binom{m'}{m}p_{2}^{m'-m}\bar{p}_{2}^{m}q & \textrm{if}\ n'\geq n\ \&\ m'\geq m,\\ 0 & \textrm{otherwise},\end{cases}\\
&W_{1n'm'\to 0nm}=0\ \forall\ n',m',n,m,\\
&W_{1n'm'\to 1nm}=\begin{cases}\binom{N_{1}-n'}{N_{1}-n}p_{1}^{n-n'}\bar{p}_{1}^{N_{1}-n}\binom{N_{2}-m'}{N_{2}-m}p_{2}^{m-m'}\bar{p}_{2}^{N_{2}-m} & \textrm{if}\ n'\leq n\ \&\ m'\leq m,\\ 0 & \textrm{otherwise}.\end{cases}
\end{split} \qquad (18)$$

Here we have used $\bar{p}_1\equiv 1-p_1$ and $\bar{p}_2\equiv 1-p_2$. It is obvious that there are two absorbing states, i.e., $(0,0,0)$ and $(1,N_1,N_2)$, implying that $P_{000}(\infty)+P_{1N_1N_2}(\infty)=1$. Calculating Eq. (17) recursively with any initial condition $\{P_{cnm}(0)\}$, one can in principle obtain $P_{cnm}(t)$ for any $c$, $n$, $m$, and $t$, hence $\rho(t)$ in Eq. (16).
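In the same spirit as the homogeneous case, a sketch of ours for building the transition probabilities of Eq. (18) (Python, with numpy/scipy assumed and illustrative names) reads:

```python
# Sketch (names are ours) of the transition probabilities of Eq. (18).
import numpy as np
from scipy.stats import binom

def transition_matrix_hetero(N1, N2, p1, p2, q):
    """W[c', n', m', c, n, m]: one-step probability in the heterogeneous model."""
    W = np.zeros((2, N1 + 1, N2 + 1, 2, N1 + 1, N2 + 1))
    W[0, 0, 0, 0, 0, 0] = 1.0                          # (0, 0, 0) is absorbing
    for n_old in range(N1 + 1):
        for m_old in range(N2 + 1):
            if n_old == 0 and m_old == 0:
                continue                               # handled above
            for n in range(n_old + 1):                 # code belief 0: n, m can only shrink
                for m in range(m_old + 1):
                    keep = binom.pmf(n, n_old, 1 - p1) * binom.pmf(m, m_old, 1 - p2)
                    W[0, n_old, m_old, 0, n, m] = keep * (1 - q)
                    W[0, n_old, m_old, 1, n, m] = keep * q
    for n_old in range(N1 + 1):
        for m_old in range(N2 + 1):
            for n in range(n_old, N1 + 1):             # code belief 1: n, m can only grow
                for m in range(m_old, N2 + 1):
                    W[1, n_old, m_old, 1, n, m] = (binom.pmf(n - n_old, N1 - n_old, p1)
                                                   * binom.pmf(m - m_old, N2 - m_old, p2))
    return W

# rho(t) then follows by iterating P <- einsum('cnm,cnmdkl->dkl', P, W)
# and averaging (n + m) / N over the resulting distribution, cf. Eq. (16).
```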

Figure 2: (a) Schematic diagram of the heterogeneous learning model in a closed system with $r=1$, showing how the code and agents learn from each other with learning probabilities $p_1$, $p_2$, and $q$. (b) Transition structure between states of the system. Each state is denoted by $(c,n,m)$, where $c$ is the belief of the code and $n$ ($m$) is the number of agents with learning probability $p_1$ ($p_2$) whose belief is $1$. Self-loops and states with $c=1$ are not shown. (c) Analytic solutions of $\rho(\infty)$ in Eq. (22) (lines) for the case with $N=40$ ($N_1=N_2=20$) and an initial condition that $P_{cnm}(0)=\delta_{c,0}\delta_{n,1}\delta_{m,1}$. (d) Numerical estimation of $\rho(T)$ using the master equation in Eq. (17) (lines) for the case with $N=40$ ($N_1=N_2=20$) and an initial condition that $P_{cnm}(0)=\frac{1}{2}\delta_{n,N_1/2}\delta_{m,N_2/2}$ for each $c\in\{0,1\}$. Here $T$ is the first time step satisfying $|\rho(T)-\rho(T-1)|<10^{-4}$. (e) Numerical estimation of expected beliefs of the code (“code” in the figure), fast-learning agents with $p_1=0.9$ (“fast”), slow-learning agents with $p_2=0.1$ (“slow”), and all agents (“avg”) at $t=3$ using the master equation in Eq. (17) for the case with $N=40$ and $N_1=2,4,6,\ldots,38$ (lines). We use an initial condition that $P_{cnm}(0)=\frac{1}{2}\delta_{n,N_1/2}\delta_{m,N_2/2}$ for each $c\in\{0,1\}$ and for each $N_1$. In panels (c-e), simulation results are shown with symbols, each symbol was averaged over $2\times 10^{4}$ different runs, and standard errors are omitted as they are smaller than symbols.

As the simplest yet nontrivial case, let us consider an initial condition that $P_{cnm}(0)=\delta_{c,0}\delta_{n,1}\delta_{m,1}$. Denoting

$$\begin{split}
&\alpha_{i_1 i_2}\equiv W_{0i_1 i_2\to 0i_1 i_2},\\
&\beta_{i_1 i_2,j_1 j_2}\equiv W_{0i_1 i_2\to 0j_1 j_2}\ [(i_1,i_2)\neq(j_1,j_2)],
\end{split} \qquad (19)$$

the master equations for $P_{011}$, $P_{010}$, $P_{001}$, and $P_{000}$ [Eq. (17)] are written as follows:

$$\begin{split}
P_{011}(t+1)&=\alpha_{11}P_{011}(t),\\
P_{010}(t+1)&=\alpha_{10}P_{010}(t)+\beta_{11,10}P_{011}(t),\\
P_{001}(t+1)&=\alpha_{01}P_{001}(t)+\beta_{11,01}P_{011}(t),\\
P_{000}(t+1)&=P_{000}(t)+\beta_{11,00}P_{011}(t)+\beta_{10,00}P_{010}(t)+\beta_{01,00}P_{001}(t),
\end{split} \qquad (20)$$

where $\alpha_{11}=\bar{p}_1\bar{p}_2\bar{q}$, $\alpha_{10}=\bar{p}_1\bar{q}$, $\alpha_{01}=\bar{p}_2\bar{q}$, $\beta_{11,10}=\bar{p}_1 p_2\bar{q}$, $\beta_{11,01}=p_1\bar{p}_2\bar{q}$, $\beta_{11,00}=p_1 p_2\bar{q}$, $\beta_{10,00}=p_1\bar{q}$, and $\beta_{01,00}=p_2\bar{q}$. After some algebra, one obtains

$$P_{000}(\infty)=\frac{\beta_{11,00}}{1-\alpha_{11}}+\frac{\beta_{11,10}\beta_{10,00}}{(1-\alpha_{11})(1-\alpha_{10})}+\frac{\beta_{11,01}\beta_{01,00}}{(1-\alpha_{11})(1-\alpha_{01})}, \qquad (21)$$

leading to

$$\rho(\infty)=1-\frac{p_1 p_2\bar{q}}{1-\bar{p}_1\bar{p}_2\bar{q}}\left[1+\frac{\bar{p}_1\bar{q}}{1-\bar{p}_1\bar{q}}+\frac{\bar{p}_2\bar{q}}{1-\bar{p}_2\bar{q}}\right]. \qquad (22)$$

This result is not a function of $N_1$ and $N_2$ due to the choice of the initial condition that $P_{cnm}(0)=\delta_{c,0}\delta_{n,1}\delta_{m,1}$. It is straightforward to prove that setting $p_1=p_2=p$ reduces the solution in Eq. (22) to the solution of the homogeneous model with the initial condition that $P_{cn}(0)=\delta_{c,0}\delta_{n,2}$.

To demonstrate the effect of heterogeneous learning on $\rho(\infty)$ in Eq. (22), we parameterize $p_1=p+\delta$ and $p_2=p-\delta$ with non-negative $\delta$ and $p\in(\delta,1-\delta]$. Here $\delta$ controls the degree of heterogeneity of agents. As shown in Fig. 2(c), the larger $\delta$ leads to the higher values of $\rho(\infty)$ in Eq. (22) for the entire range of $p$, which is consistent with the simulation results of the original March model, i.e., Fig. 2 in Ref. [1]. Such behaviors can be essentially understood by comparing the transition probability $W_{011\to 000}=p_1 p_2\bar{q}$ in the heterogeneous model to its counterpart $W_{02\to 00}=p^{2}\bar{q}$ in the homogeneous model [Eq. (8)] to get

$$\frac{W_{011\to 000}}{W_{02\to 00}}=1-\frac{\delta^{2}}{p^{2}}\leq 1. \qquad (23)$$

This implies that for positive $\delta$ the probability flow to the absorbing state $(0,0,0)$ in the heterogeneous model is always smaller than the flow to the absorbing state $(0,0)$ in the homogeneous model, hence the larger $\rho(\infty)$ for the heterogeneous model than for the homogeneous model. We also remark that the ratio in Eq. (23) gets closer to $1$ for larger values of $p$, hence the smaller gap between the heterogeneous and homogeneous models. This expectation is indeed borne out, as depicted in Fig. 2(c).
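The closed form in Eq. (22) is easy to evaluate directly; the following short sketch of ours does so and illustrates the heterogeneity effect described above.

```python
# Evaluate the closed form (22); names are ours.
def rho_inf_hetero(p1, p2, q):
    qb = 1.0 - q
    flow = p1 * p2 * qb / (1.0 - (1.0 - p1) * (1.0 - p2) * qb)
    bracket = (1.0
               + (1.0 - p1) * qb / (1.0 - (1.0 - p1) * qb)
               + (1.0 - p2) * qb / (1.0 - (1.0 - p2) * qb))
    return 1.0 - flow * bracket                        # Eq. (22)

# Larger heterogeneity delta gives a larger rho(inf) at fixed p, cf. Fig. 2(c)
p, q = 0.5, 0.5
for delta in (0.0, 0.1, 0.3):
    print(delta, rho_inf_hetero(p + delta, p - delta, q))
```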

For general initial conditions, we numerically estimate $\rho(T)$ for a sufficiently large $T$ by iterating the master equation in Eq. (17) for a given initial condition $\{P_{cnm}(0)\}$. For a demonstration, we consider a system of $N=40$ agents ($N_1=N_2=20$) and the initial condition that $P_{cnm}(0)=\frac{1}{2}\delta_{n,N_1/2}\delta_{m,N_2/2}$ for each $c\in\{0,1\}$. We estimate the value of $\rho(T)$ at the first time step $T$ when $|\rho(T)-\rho(T-1)|<10^{-4}$ is satisfied. From the results shown in Fig. 2(d), we find that $\rho(T)$ has higher values for more heterogeneous systems.

We also perform agent-based simulations of the heterogeneous model for the system with $N=40$ agents ($N_1=N_2=20$) and the mentioned initial conditions. Firstly, the initial condition with $P_{cnm}(0)=\delta_{c,0}\delta_{n,1}\delta_{m,1}$ is realized in the simulation such that one agent with $p_1$ and one agent with $p_2$ have an initial belief of $1$, while all other agents as well as the code have the belief of $0$. Secondly, as for the initial condition with $P_{cnm}(0)=\frac{1}{2}\delta_{n,N_1/2}\delta_{m,N_2/2}$ for each $c\in\{0,1\}$, we set $\sigma_i(0)=1$ for $i=1,\ldots,10,21,\ldots,30$ and $\sigma_i(0)=0$ for the rest of the agents, while the value of $c(0)$ is randomly chosen from $\{0,1\}$ with equal probabilities. Eventually every run ends up in one of the absorbing states, implying that $n(\infty)+m(\infty)=0$ or $N$. For each combination of $p_1$, $p_2$, and $q$, we take the average of $[n(\infty)+m(\infty)]/N$ over $2\times 10^{4}$ different runs to get the value of $\rho(\infty)$ in Eq. (16). Such averages are shown with symbols in Fig. 2(c, d), which are indeed in good agreement with the analytical and numerical solutions, respectively.

Finally, we note that our setup for heterogeneous agents differs from that in the original March model [1]. In the original paper, the heterogeneity was controlled by the number of agents having $p_1$, i.e., $N_1$, while the learning probabilities were fixed to be $p_1=0.9$ and $p_2=0.1$. We test this original setup using our simplified model, both by estimating $\rho(3)$ from the master equations in Eq. (17) and by performing agent-based simulations up to $t=3$. For a system of $N=40$ agents, we consider $N_1=2,4,6,\ldots,38$ with the initial condition that $P_{cnm}(0)=\frac{1}{2}\delta_{n,N_1/2}\delta_{m,N_2/2}$ for each $c\in\{0,1\}$. In addition to the expected belief of all agents in Eq. (16), we measure the expected belief of fast-learning agents with $p_1$, that of slow-learning agents with $p_2$, and that of the code. Results from the numerical estimation of the master equations and from the agent-based simulations are in good agreement with each other, as depicted in Fig. 2(e). These results show qualitatively the same behaviors as in the original March model, i.e., Fig. 3 in Ref. [1].

III.3 Homogeneous learning in an open system

Figure 3: (a) Different behaviors of the analytic solution of $\rho(2)$ in Eq. (28) as a function of $u$, depicted in the plane of $(p,q)$. $\rho(2)$ can have either a maximum value for $u\in(0,1)$ (red shade, denoted by “max”) or a minimum value for $u\in(0,1)$ (tin shade, denoted by “min”), or it can monotonically increase (“increasing”) or decrease (“decreasing”). Four empty symbols are chosen to demonstrate their different functional forms of $\rho(2)$ in panel (b). (b) Analytic solutions of $\rho(2)$ in Eq. (28) as a function of $u$ for several combinations of $p$ and $q$ (lines) with corresponding simulation results (symbols). We use an initial condition that $P_{c\sigma}(0)=\delta_{c,0}\delta_{\sigma,1}$. (c) Heatmaps of the analytic solution of $\rho(\infty)$ in Eq. (34) as a function of $u$ and $v$ for the case with $p=q=0.5$. Two empty symbols are chosen to demonstrate different behaviors of $\rho(t)$ in panel (d). (d) Numerical estimation of $\rho(t)$ using the master equation (solid lines) and simulation results (dotted lines) for $u=0$ and $0.1$ when $p=0.5$, $q=0.5$, and $v=0.02$. In panels (b, d), each symbol was averaged over $2\times 10^{5}$ different runs. Standard errors are omitted as they are smaller than symbols.

We finally study the effects of turnover of agents and turbulence of the external reality on organizational learning. For this, we focus on the simplest yet nontrivial case with $N=1$, indicating that there is only one agent in the system. This agent's belief is denoted by $\sigma(t)$. The case with general $N>1$ can be studied too within our framework.

We first consider the system with turnover of agents only, while $r(t)=1$ for all $t$, namely, $u>0$ and $v=0$. Let us denote by $P_{c\sigma}(t)$ the probability that at time step $t$ the code's belief is $c$ and the agent's belief is $\sigma$. These probabilities satisfy the normalization condition

$$\sum_{c,\sigma\in\{0,1\}}P_{c\sigma}(t)=1. \qquad (24)$$

They evolve according to the following master equation in discrete time:

$$P_{c\sigma}(t+1)=\sum_{c'\sigma'}W_{c'\sigma'\to c\sigma}P_{c'\sigma'}(t), \qquad (25)$$

where, using $u'\equiv u/2$ and $\bar{u}'\equiv 1-u'$, the transition probabilities read as follows:

$$\begin{split}
&W_{00\to 00}=\bar{u}',\ W_{00\to 01}=u',\\
&W_{01\to 00}=(p\bar{u}'+\bar{p}u')\bar{q},\ W_{01\to 01}=(\bar{p}\bar{u}'+pu')\bar{q},\\
&W_{01\to 10}=(p\bar{u}'+\bar{p}u')q,\ W_{01\to 11}=(\bar{p}\bar{u}'+pu')q,\\
&W_{10\to 10}=\bar{p}\bar{u}'+pu',\ W_{10\to 11}=p\bar{u}'+\bar{p}u',\\
&W_{11\to 10}=u',\ W_{11\to 11}=\bar{u}',
\end{split} \qquad (26)$$

and all other transition probabilities are zero. Note that due to $u>0$, both $(c,\sigma)=(0,0)$ and $(1,1)$ are no longer absorbing states.

For the steady state, we derive the analytical solution of $\rho(\infty)$ as

$$\rho(\infty)=\sum_{c\in\{0,1\}}P_{c1}(\infty)=\frac{p+(\frac{1}{2}-p)u}{p+(1-p)u}. \qquad (27)$$

This solution is independent of the initial condition, and it is not a function of $q$ because the change of the code's belief from $0$ to $1$ is irreversible; as long as $q>0$, all initial probabilities end up in states with $c=1$. We also find that $\lim_{u\to 0}\rho(\infty)=1$ and $\rho(\infty)=1/2$ for $u=1$, both of which are irrespective of $p$. That is, $\rho(\infty)$ is a decreasing function of $u$ for $u>0$, whereas for $u=0$ it can have a finite value less than one, e.g., as given in Eq. (14), as long as $p>0$. Thus one can conclude that $\rho(\infty)$ shows an “increasing” and then decreasing behavior in the range of $0\leq u\leq 1$. This argument is important for discussing the optimal turnover of agents that maximizes the effectiveness of organizational learning.
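As a cross-check, the steady state of the four-state chain defined by Eqs. (25) and (26) can be obtained by brute-force iteration and compared with Eq. (27). The following sketch (ours, with illustrative names) does exactly that.

```python
# Check Eq. (27) by iterating the 4-state chain of Eqs. (25)-(26); names are ours.
import numpy as np

def open_system_matrix(p, q, u):
    """Row-stochastic matrix over states (c, sigma) = (0,0), (0,1), (1,0), (1,1)."""
    up, ub = u / 2.0, 1.0 - u / 2.0                    # u' and 1 - u'
    a, b = p * ub + (1 - p) * up, (1 - p) * ub + p * up
    W = np.zeros((4, 4))
    W[0, 0], W[0, 1] = ub, up                          # from (0, 0)
    W[1, 0], W[1, 1], W[1, 2], W[1, 3] = a * (1 - q), b * (1 - q), a * q, b * q
    W[2, 2], W[2, 3] = b, a                            # from (1, 0)
    W[3, 2], W[3, 3] = up, ub                          # from (1, 1)
    return W

p, q, u = 0.5, 0.5, 0.2
W = open_system_matrix(p, q, u)
P = np.array([1.0, 0.0, 0.0, 0.0])                     # any initial distribution works
for _ in range(5000):
    P = P @ W
print(P[1] + P[3], (p + (0.5 - p) * u) / (p + (1 - p) * u))   # both equal rho(inf), Eq. (27)
```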

Next, we focus on the transient dynamics instead of the steady state. Starting from the initial condition that $P_{c\sigma}(0)=\delta_{c,0}\delta_{\sigma,1}$, we obtain, e.g., at $t=2$

$$\rho(2)=(\tfrac{1}{2}-p)(1-p)u^{2}-(2p^{2}+pq-\tfrac{7}{2}p+1)u+(1-p)^{2}+pq. \qquad (28)$$

It turns out that $\rho(2)$ is a quadratic function of $u$, meaning that it can have either a maximum or a minimum value in the range $u\in(0,1)$, or it may be a monotonically increasing or decreasing function of $u$, depending on the choice of $p$ and $q$. Figure 3(a) summarizes such behaviors in the plane of $(p,q)$, and Fig. 3(b) depicts $\rho(2)$ as a function of $u$ for several cases of $p$ and $q$. For example, for sufficiently large $q$, $\rho(2)$ is a monotonically decreasing function of $u$ irrespective of $p$. This implies that if the code learns fast from the superior agent, the maximal organizational learning is achieved when there is no turnover. This result can be understood from the fact that turnover introduces randomness or new information from outside the system. In contrast, if the code learns slowly from the superior agent but the agent learns fast from the code, the maximal organizational learning is achieved for the largest turnover. In such a case, without turnover, both the code and the agent are likely to get stuck in a suboptimal situation, and strong turnover may help the system escape it. Precisely, we find the increasing and then decreasing behavior of $\rho(2)$ for $p=0.7$ and $q=0.6$, and the monotonically decreasing behavior of $\rho(2)$ for $p=0.4$ and $q=0.6$, which are consistent with the results of the original March model, e.g., Fig. 4 in Ref. [1].

We now consider the effect of turbulence on the organizational learning in the presence of turnover of agents. For this, we define an extended system consisting of both the system and the reality, whose states can be denoted by $(r,c,\sigma)\in\{0,1\}^{3}$. Let us denote by $P_{rc\sigma}(t)$ the probability that at time step $t$ the reality is $r$, the code's belief is $c$, and the agent's belief is $\sigma$. These probabilities satisfy the normalization condition

$$\sum_{r,c,\sigma\in\{0,1\}}P_{rc\sigma}(t)=1. \qquad (29)$$

They evolve according to the following master equation in discrete time:

$$P_{rc\sigma}(t+1)=\sum_{r'c'\sigma'}W_{r'c'\sigma'\to rc\sigma}P_{r'c'\sigma'}(t). \qquad (30)$$

Denoting $v'\equiv v/2$ and $\bar{v}'\equiv 1-v'$ and using Eq. (26), we get the transition probabilities for Eq. (30) as follows:

$$\begin{split}
&W_{0c'\sigma'\to 0c\sigma}=\overline{W}_{c'\sigma'\to c\sigma}\bar{v}',\\
&W_{0c'\sigma'\to 1c\sigma}=\overline{W}_{c'\sigma'\to c\sigma}v',\\
&W_{1c'\sigma'\to 0c\sigma}=W_{c'\sigma'\to c\sigma}v',\\
&W_{1c'\sigma'\to 1c\sigma}=W_{c'\sigma'\to c\sigma}\bar{v}',
\end{split} \qquad (31)$$

where we have used

$$\overline{W}_{c'\sigma'\to c\sigma}\equiv W_{1-c',1-\sigma'\to 1-c,1-\sigma}. \qquad (32)$$

As $r(t)$ is no longer constant, the average individual knowledge is obtained as

$$\rho(t)=\sum_{r,c,\sigma\in\{0,1\}}\delta_{r,\sigma}P_{rc\sigma}(t). \qquad (33)$$

After some algebra, we derive an exact solution of $\rho(\infty)$ for the steady state as follows:

$$\begin{split}
\rho(\infty)=\ &[(p^{2}q-p^{2}-\tfrac{5}{2}pq+2p+q-1)u^{2}v^{2}-(p^{2}q+p^{2}-\tfrac{7}{2}pq+2p+\tfrac{3}{2}q-1)u^{2}v-(p-\tfrac{1}{2})qu^{2}\\
&-(2p^{2}q-2p^{2}-\tfrac{9}{2}pq+3p+\tfrac{3}{2}q-1)uv^{2}+(2p^{2}q-2p^{2}-\tfrac{9}{2}pq+2p+q)uv+pqu\\
&+(p^{2}q-p^{2}-2pq+p+\tfrac{1}{2}q)v^{2}-p(pq-p-q)v]\\
/\ &[2(p^{2}q-p^{2}-2pq+2p+q-1)u^{2}v^{2}-(2p^{2}q-2p^{2}-5pq+4p+3q-2)u^{2}v+(1-p)qu^{2}\\
&-(4p^{2}q-4p^{2}-8pq+6p+3q-2)uv^{2}+(4p^{2}q-4p^{2}-7pq+4p+2q)uv+pqu\\
&+(2p^{2}q-2p^{2}-4pq+2p+q)v^{2}-2p(pq-p-q)v].
\end{split} \qquad (34)$$

This analytical solution is depicted as a heatmap in Fig. 3(c) for the case with $p=q=0.5$. We find that for each value of turbulence $v$ there exists an optimal turnover probability $u^{*}\in(0,1)$ that maximizes the effectiveness of organizational learning. Such an optimal turnover probability for a given $v$ is obtained as

$$u^{*}(v)=\frac{v^{2}+3v-\sqrt{2v(v+3)(3v+1)}}{v^{2}-3v-2}, \qquad (35)$$

which is an increasing function of $v$. This implies that the system requires a larger turnover to adapt to a more turbulent external reality. Yet the value of $\rho(\infty)$ with the optimal turnover tends to decrease with $v$ [Fig. 3(c)].
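For the fully open system, the eight-state chain of Eqs. (30)-(32) can again be iterated numerically. The sketch below (ours; it reuses open_system_matrix from the previous sketch, and all other names are illustrative) computes the steady-state value of Eq. (33) and scans $u$ on a grid to locate the optimal turnover for given $p$, $q$, and $v$, which can then be compared with Eqs. (34) and (35).

```python
# Steady state of the extended chain, Eqs. (30)-(32); assumes open_system_matrix
# from the earlier sketch. State index: 4*r + 2*c + sigma. Names are ours.
import numpy as np

def extended_matrix(p, q, u, v):
    W2 = open_system_matrix(p, q, u)                   # (c, sigma) block for r = 1, Eq. (26)
    vp, vb = v / 2.0, 1.0 - v / 2.0                    # v' and 1 - v'
    W = np.zeros((8, 8))
    for cs_old in range(4):
        for cs in range(4):
            w = W2[cs_old, cs]                         # r = 1 block
            wbar = W2[3 - cs_old, 3 - cs]              # mirrored block for r = 0, Eq. (32)
            W[cs_old, cs] = wbar * vb                  # r: 0 -> 0
            W[cs_old, 4 + cs] = wbar * vp              # r: 0 -> 1
            W[4 + cs_old, cs] = w * vp                 # r: 1 -> 0
            W[4 + cs_old, 4 + cs] = w * vb             # r: 1 -> 1
    return W

def rho_steady(p, q, u, v, T=20000):
    P = np.full(8, 1.0 / 8.0)
    P = P @ np.linalg.matrix_power(extended_matrix(p, q, u, v), T)
    return sum(P[4 * r + 2 * c + r] for r in (0, 1) for c in (0, 1))  # Eq. (33)

# Grid search for the optimal turnover at p = q = 0.5, to be compared with Eq. (35)
p, q, v = 0.5, 0.5, 0.2
us = np.linspace(0.01, 0.99, 99)
u_opt = us[np.argmax([rho_steady(p, q, u, v) for u in us])]
print(u_opt, (v**2 + 3*v - np.sqrt(2*v*(v + 3)*(3*v + 1))) / (v**2 - 3*v - 2))
```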

Finally, we look at the transient dynamics of $\rho(t)$ for different values of $u$ when $p$, $q$, and $v$ are given. We numerically obtain $\rho(t)$ by iterating the master equation in Eq. (30) using the initial condition that $P_{rc\sigma}(0)=1/8$ for each state. Numerical results are depicted as solid lines in Fig. 3(d). The agent-based simulations are also performed using the initial condition that each of $r(0)$, $c(0)$, and $\sigma(0)$ is randomly and independently drawn from $\{0,1\}$. Simulation results are shown as dotted lines in Fig. 3(d) and are in good agreement with the numerical results. These results are also qualitatively similar to those of the original March model, e.g., Fig. 5 in Ref. [1].

IV Conclusion

In our work, the celebrated organizational learning model proposed by March [1] has been simplified, enabling us to explicitly write down the master equations for the dynamics of the model. We have derived exact solutions for the simplest yet nontrivial cases and numerically estimated quantities of interest using the master equations; both kinds of results are found to be in good agreement with agent-based simulation results. Our results help to rigorously understand not only the simplified model but also, to a large extent, the original March model.

Our theoretical framework for the simplified March model can be applied to the original March model as well as to variants of March's model that incorporate other relevant factors, such as forgetting of beliefs [3, 11] and direct interaction and communication between agents in the organization [4, 5, 6]. For modeling the interaction structure between agents, various network models might be deployed [26, 25, 27, 28]. In conclusion, we expect to gain deeper insights into organizational learning using our analytical approach.

Acknowledgements.
H.-H.J. acknowledges financial support by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1A2C1007358).

References

  • March [1991] J. G. March, Exploration and Exploitation in Organizational Learning, Organization Science 2, 71 (1991).
  • Rodan [2005] S. Rodan, Exploration and exploitation revisited: Extending March’s model of mutual learning, Scandinavian Journal of Management 21, 407 (2005).
  • Blaschke and Schoeneborn [2006] S. Blaschke and D. Schoeneborn, The forgotten function of forgetting: Revisiting exploration and exploitation in organizational learning, Soziale Systeme 12, 99 (2006).
  • Miller et al. [2006] K. D. Miller, M. Zhao, and R. J. Calantone, Adding Interpersonal Learning and Tacit Knowledge to March’s Exploration-Exploitation Model, Academy of Management Journal 49, 709 (2006).
  • Kane and Alavi [2007] G. C. Kane and M. Alavi, Information Technology and Organizational Learning: An Investigation of Exploration and Exploitation Processes, Organization Science 18, 796 (2007).
  • Kim and Rhee [2009] T. Kim and M. Rhee, Exploration and exploitation: Internal variety and environmental dynamism, Strategic Organization 7, 11 (2009).
  • Fang et al. [2010] C. Fang, J. Lee, and M. A. Schilling, Balancing Exploration and Exploitation Through Structural Design: The Isolation of Subgroups and Organizational Learning, Organization Science 21, 625 (2010).
  • Sachdeva [2013] M. Sachdeva, Encounter with March’s Organizational Learning Model, Review of Integrative Business and Economics Research 2, 602 (2013).
  • Schilling and Fang [2014] M. A. Schilling and C. Fang, When hubs forget, lie, and play favorites: Interpersonal network structure, information distortion, and organizational learning, Strategic Management Journal 35, 974 (2014).
  • Chanda and Ray [2015] S. S. Chanda and S. Ray, Optimal exploration and exploitation: The managerial intentionality perspective, Computational and Mathematical Organization Theory 21, 247 (2015).
  • Miller and Martignoni [2016] K. D. Miller and D. Martignoni, Organizational learning with forgetting: Reconsidering the exploration–exploitation tradeoff, Strategic Organization 14, 53 (2016).
  • Chanda [2017] S. S. Chanda, Inferring final organizational outcomes from intermediate outcomes of exploration and exploitation: The complexity link, Computational and Mathematical Organization Theory 23, 61 (2017).
  • Chanda et al. [2018] S. S. Chanda, S. Ray, and B. Mckelvey, The Continuum Conception of Exploration and Exploitation: An Update to March’s Theory, M@n@gement 21, 1050 (2018).
  • Chanda and Miller [2019] S. S. Chanda and K. D. Miller, Replicating agent-based models: Revisiting March’s exploration–exploitation study, Strategic Organization 17, 425 (2019).
  • Marín-Idárraga et al. [2022] D. A. Marín-Idárraga, J. M. Hurtado González, and C. Cabello Medina, Factors affecting the effect of exploitation and exploration on performance: A meta-analysis, BRQ Business Research Quarterly 25, 312 (2022).
  • Castellano et al. [2009] C. Castellano, S. Fortunato, and V. Loreto, Statistical physics of social dynamics, Reviews of Modern Physics 81, 591 (2009).
  • Acemoglu and Ozdaglar [2011] D. Acemoglu and A. Ozdaglar, Opinion dynamics and learning in social networks, Dynamic Games and Applications 1, 3 (2011).
  • Sen and Chakrabarti [2014] P. Sen and B. K. Chakrabarti, Sociophysics: An Introduction (Oxford University Press, Oxford, 2014).
  • Sîrbu et al. [2017] A. Sîrbu, V. Loreto, V. D. P. Servedio, and F. Tria, Opinion dynamics: Models, extensions and external effects, in Participatory Sensing, Opinions and Collective Awareness, edited by V. Loreto, M. Haklay, A. Hotho, V. D. Servedio, G. Stumme, J. Theunis, and F. Tria (Springer International Publishing, Cham, 2017) pp. 363–401.
  • Proskurnikov and Tempo [2017] A. V. Proskurnikov and R. Tempo, A tutorial on modeling and analysis of dynamic social networks. Part I, Annual Reviews in Control 43, 65 (2017).
  • Proskurnikov and Tempo [2018] A. V. Proskurnikov and R. Tempo, A tutorial on modeling and analysis of dynamic social networks. Part II, Annual Reviews in Control 45, 166 (2018).
  • Baronchelli [2018] A. Baronchelli, The emergence of consensus: A primer, Royal Society Open Science 5, 172189 (2018).
  • Anderson and Ye [2019] B. D. O. Anderson and M. Ye, Recent advances in the modelling and analysis of opinion dynamics on influence networks, International Journal of Automation and Computing 16, 129 (2019).
  • Noorazar [2020] H. Noorazar, Recent advances in opinion propagation dynamics: A 2020 survey, The European Physical Journal Plus 135, 521 (2020).
  • Barabási and Pósfai [2016] A.-L. Barabási and M. Pósfai, Network Science (Cambridge University Press, Cambridge, 2016).
  • Borgatti et al. [2009] S. P. Borgatti, A. Mehra, D. J. Brass, and G. Labianca, Network analysis in the social sciences, Science 323, 892 (2009).
  • Newman [2018] M. E. J. Newman, Networks, 2nd ed. (Oxford University Press, Oxford, 2018).
  • Menczer et al. [2020] F. Menczer, S. Fortunato, and C. A. Davis, A First Course in Network Science (Cambridge University Press, Cambridge, 2020).