Dynamics of cascades on burstiness-controlled temporal networks

Samuel Unicomb [email protected] Université de Lyon, ENS de Lyon, INRIA, CNRS, UMR 5668, IXXI, 69364 Lyon, France Gerardo Iñiguez Department of Network and Data Science, Central European University, A-1100 Vienna, Austria Department of Computer Science, Aalto University School of Science, FI-00076 Aalto, Finland IIMAS, Universidad Nacional Autonóma de México, 01000 Ciudad de México, Mexico James P. Gleeson MACSI and Insight Centre for Data Analytics, University of Limerick, Limerick V94 T9PX, Ireland Márton Karsai Department of Network and Data Science, Central European University, A-1100 Vienna, Austria Université de Lyon, ENS de Lyon, INRIA, CNRS, UMR 5668, IXXI, 69364 Lyon, France

Abstract

Burstiness, the tendency of interaction events to be heterogeneously distributed in time, is critical to information diffusion in physical and social systems. However, an analytical framework capturing the effect of burstiness on generic dynamics is lacking. We develop a master equation formalism to study cascades on temporal networks with burstiness modelled by renewal processes. Supported by numerical and data-driven simulations, we describe the interplay between heterogeneous temporal interactions and models of threshold-driven and epidemic spreading. We find that increasing interevent time variance can both accelerate and decelerate spreading for threshold models, but can only decelerate epidemic spreading. When accounting for the skewness of different interevent time distributions, spreading times collapse onto a universal curve. Our framework uncovers a deep yet subtle connection between generic diffusion mechanisms and underlying temporal network structures that impacts on a broad class of networked phenomena, from spin interactions to epidemic contagion and language dynamics.

Temporal networks provide a representation of real-world complex systems where interactions between components vary in time [holme2012temporal, masuda2016guide, holme2015modern]. Although they were initially modelled as Poisson processes, where independent events are homogeneously distributed in time, real-world network interactions have been found to be heterogeneously distributed and to exhibit temporal correlations [barabasi2005origin, goh2008burstiness, karsai2012universal]. In particular, interaction events in real systems concentrate within short periods of intense activity followed by long intervals of inactivity, an effect known as burstiness. Bursty dynamics appear in diverse physical phenomena including earthquakes [davidsen2013earthquake] and solar flares [deArcangelis2006universality], biological processes like neuron firing [turnbull2005string], and even the dynamics of human social interaction [karsai2018bursty, goh2008burstiness].

Burstiness in temporal interactions has profound implications for the diffusion of information over temporal networks, as demonstrated in a growing number of works [karsai2011small, lambiotte2013burstiness, jo2014analytically, horvath2014spreading, williams2019effects, vazquez2007impact, mancastroppa2019burstiness]. This is true in the case of epidemic processes, often referred to as simple contagion, where the probability of infection of an uninfected node depends linearly on the number of exposures, i.e., temporal interactions with infected neighbours in the network [pastorsatorras2015epidemic]. Epidemic models successfully describe the spread of biological disease [vespignani2020modelling], and have been shown to critically depend on burstiness and other patterns of temporal interactions [starnini2017equivalence, lambiotte2013burstiness, liu2014controlling, masuda2017temporal, masuda2020small]. Epidemic spreading over temporal networks appears to be slowed due to burstiness in some cases [karsai2011small, miritello2011dynamical, hiraoka2018correlated, min2011spreading], while accelerated in others [rocha2011simulated]. Threshold mechanisms provide another class of phenomena where bursty temporal networks play a crucial role. Threshold dynamics, also known as complex contagion, are used to model the spread of information where infection requires the reinforced influence of at least a certain fraction of neighbours in the network [granovetter1978threshold]. Threshold driven dynamics over static networks have been extensively studied both empirically [karsai2016local] and theoretically [watts2002simple, gleeson2008cascades, karsai2016local, unicomb2018threshold, unicomb2019reentrant], but analysis of their behaviour on temporal networks has been limited to a small number of empirical studies [karimi2013Athreshold, karimi2013Btemporal, takaguchi2013bursty, backlund2014effects]. Here we propose an analytical framework to systematically describe the relationship between the diffusion of information and bursty temporal interactions, thus providing the theoretical foundation necessary to uncover the role of burstiness in generic diffusion processes, including simple and complex contagion models of physical, biological and social phenomena.

We incorporate the most widely documented features of temporal interactions into a framework of binary state dynamics and benchmark its behaviour with standard models of threshold driven and epidemic spreading. Although stochastic bursty interactions are likely emergent phenomena [barabasi2005origin, vazquez2006modeling], their dynamics are well approximated by renewal processes [whitt1982approximating]. Temporal heterogeneity in network interactions can then be characterised by the variability in interevent times $\tau$ (the time between consecutive events on a given edge), parameterised by the interevent time distribution $\psi(\tau)$ , while other features of the temporal network are considered maximally random. Renewal processes represent the simplest model of bursty, non-Markovian dynamics, and a departure from the memoryless assumption implicit in Poisson processes. Nevertheless, we are able to show that such a system can be accurately captured by a master equation formalism, which is essentially memoryless, implying the existence of a purely Markovian system with almost identical behaviour. We show both analytically and numerically that bursty temporal interactions give rise to a percolation transition in the connectivity of the temporal network, separating phases of slow and rapid dynamics for both epidemic and threshold models of information diffusion. We find that diffusion dynamics are sensitive to the choice of interevent time distribution, particularly in regard to its skewness, and we demonstrate a data collapse across distributions when controlling for this effect.

Temporal network model. To model a temporal network, we consider an undirected, unweighted static network of $N$ nodes as the underlying structure, which acts as a skeleton on top of which temporal interactions take place. The degree of a node (how many neighbours it has) takes discrete values $k=0,\ldots,N-1$ from a degree distribution $p_{k}$ . Pairwise temporal interactions, or events, occur independently at random on each static edge via a renewal process with interevent time distribution $\psi(\tau)$ . Time is continuous and events are instantaneous, while consecutive interevent times are uncorrelated. We also assume that the renewal process is stationary (for further details see Methods and Supplementary Note 4). By using a static underlying network, we assume the time scales of edge formation and node addition or removal are far longer and thus negligible relative to the time scale of event dynamics over existing edges.

In its simplest form, information diffusion is a binary-state process where each node occupies one of two mutually exclusive states, which we term uninfected and infected. The probability of a node changing state is a function of the state of its neighbours, as well as the strength of their interactions. Interaction strength, also referred to as mutual influence, is a non-negative scalar that we consider to be a function of the elapsed renewal process time series. We desire that the mean of the emergent distribution of interaction strengths be stationary, and invariant to the underlying burstiness of the system. This is achieved when the contribution of a single event to interaction strength (i) goes to zero as the event ages, and (ii) is additive, meaning a spike in edge activity leads to a spike in the interaction strength between neighbours. Under these assumptions, the simplest such coupling is a step function, i.e., the contribution of an event to interaction strength is constant for a duration $\eta$ , after which it goes to zero. As such we define the interaction strength $w_{j}$ of an edge at time $t$ (or state $j$ for short), as the number $j$ of events having occurred in the preceding time window of width $\eta$ .

It follows that the local configuration of a node is determined by the number $k_{j}$ of its neighbours connected via edges in state $j$ , with the degree $k$ of the node related to its $k_{j}$ values by $k=\sum_{j}k_{j}$ at any time $t$ . We introduce $m_{j}$ as the number of infected neighbours of a node connected via edges in state $j$ . Consequently, $0\leq m_{j}\leq k_{j}$ with $m=\sum_{j}m_{j}$ the total number of infected neighbours. For each node, we store $k_{j}$ and $m_{j}$ for all $j$ in vectors $\mathbf{k}$ and $\mathbf{m}$ , providing a description of edge and node states in the local neighbourhood of a node. Nodes in class $(\textbf{k},\textbf{m})$ become infected at a rate $F_{\mathbf{k},\mathbf{m}}$ , and are statistically identical in this sense. We also store the interaction strength $w_{j}=j$ in the vector $\mathbf{w}$ for all $j$ . The dynamics of the influence received by a node is thus fully determined by $(\textbf{k},\textbf{m})$ and $\mathbf{w}$ .

Models of information diffusion. To examine the effect of temporal interactions on information diffusion, we explore three widely known models of transmission. We consider both relative (RT) and absolute (AT) variants of a threshold mechanism [watts2002simple, granovetter1978threshold, centola2007complex], as well as the Susceptible-Infected (SI) model of epidemic spreading [valdano2015analytical] (see Table 1 for details). All models are non-recovery, meaning the uninfected state cannot be reentered, and we consider infection due to external noise at a low, but nonzero rate $p$ .

Table 1: Transmission rate

F_{\mathbf{k},\mathbf{m}}

for nodes in configuration

(\textbf{k},\textbf{m})

with interaction strength

\mathbf{w}

and infection rate

p

due to external noise. In complex contagion models with relative (RT) and absolute (AT) thresholds, infection is regulated by parameters

\phi

and

M_{\phi}

, respectively. In the Susceptible-Infected (SI) model, infection is determined by the rate

\lambda

threshold
relative	absolute	SI
$\begin{cases}1,\enspace\mathbf{m}\cdot\mathbf{w}\geq\phi\mathbf{k}\cdot\mathbf{w}\\ p,\enspace\text{otherwise}\end{cases}$	$\begin{cases}1,\enspace\mathbf{m}\cdot\mathbf{w}\geq M_{\phi}\\ p,\enspace\text{otherwise}\end{cases}$	$\max(p,\ \textbf{m}\cdot\boldsymbol{\lambda})$

The study of threshold dynamics focuses on the conditions leading to cascades, or large avalanches of infections that sweep through the network. In the simplest implementation of threshold dynamics, infection occurs when the number $m$ of infected neighbours of an uninfected node exceeds a fraction $\phi$ of its degree $k$ [watts2002simple, granovetter1978threshold]. Generalising this rule to the case or arbitrary interaction strength [yagan2012analysis, unicomb2018threshold], in the RT model infection occurs when the influence of infected neighbours, $\mathbf{m}\cdot\mathbf{w}$ , exceeds a fraction $\phi$ of all potential influence, $\mathbf{k}\cdot\mathbf{w}$ . The RT model captures instances of real-world diffusion where interaction between elements affect the probability of infection only in aggregate, similar to the response of individuals to new behavioural patterns or transmission in biological neural networks [gerstner2014neuronal, iyer2013influence]. When considering the RT model over temporal networks, the probability of infection may increase during bursts of interaction events with infected neighbours or, conversely, bursts of activity with uninfected neighbours may temporarily maintain a node in the uninfected state. In the AT model, influence from infected neighbours is not normalised, but compared to some absolute value $M_{\phi}$ [centola2007complex]. In contrast to the RT model, infection is not hindered by interaction activity with uninfected neighbours, and bursts can only increase the probability of infection. In the SI model, finally, each interaction event with an infected neighbour triggers infection at a rate $\lambda$ . In our framework of temporal networks, infected neighbours trigger infection via edges in state $j$ at a rate $\lambda j$ . Writing $\boldsymbol{\lambda}=\lambda\mathbf{w}$ , the infection rate for a node with a neighbourhood of infected nodes described by $\mathbf{m}$ is $\textbf{m}\cdot\boldsymbol{\lambda}$ . Similar to the AT model, bursts can only increase the probability of infection in the SI model.

Binary dynamics over temporal networks. We extend a master equation formalism [gleeson2011high, unicomb2018threshold] to account for network temporality. We introduce the state space of all configurations $(\textbf{k},\textbf{m})$ allowed by the underlying degree distribution $p(k)$ , under the condition that each edge is in one of a finite number of possible edge states (see Methods, and Supplementary Note 1 for lattice diagrams of this space). We introduce the state vector $\mathbf{s}(t)$ containing the probability that a randomly selected node with underlying degree $k$ is uninfected and in class $(\textbf{k},\textbf{m})$ at time $t$ . The time evolution of $\mathbf{s}$ is governed by the matrix $W(\mathbf{s},t)$ , containing the transition rate $W_{ij}$ from the $i$ -th to the $j$ -th configuration $(\textbf{k},\textbf{m})$ at time $t$ . Transitions arise from three mechanisms. First, ego transitions, contained in the matrix $W_{ego}$ , describe the loss to configuration $(\textbf{k},\textbf{m})$ due to its nodes becoming infected. This occurs at a rate $F_{\mathbf{k},\mathbf{m}}$ , as per Table 1, so the diagonal terms of $W_{ego}$ are given by $-F_{\mathbf{k},\mathbf{m}}$ and off-diagonals are zero. Second, neighbour transitions, contained in matrix $W_{neigh}$ , describe the gain or loss to configuration $(\textbf{k},\textbf{m})$ due to the infection of neighbours of nodes in this class. This transition is determined by $\beta_{j}dt$ , the probability of an uninfected neighbour in configuration $j$ becoming infected over an interval $dt$ (see Methods for an explicit calculation). Taken together, $W_{ego}$ and $W_{neigh}$ accurately describe diffusion dynamics over static and heterogeneously distributed edges, such as weighted and multiplex networks [unicomb2018threshold, unicomb2019reentrant].

Temporal networks require a third component, edge transitions, contained in the matrix $W_{edge}$ , describing the gain or loss to configuration $(\textbf{k},\textbf{m})$ due to changes in an edge’s state $j$ . This applies to any temporal network model that can be formulated in terms of discrete, dynamic edge states. We denote by $\mu_{j}dt$ and $\nu_{j}dt$ the probabilities that a randomly selected edge in state $j$ undergoes a positive or negative transition and enters state $j+1$ or $j-1$ , respectively, over an interval $dt$ . Combining these terms gives the master equation

\dfrac{d}{dt}\mathbf{s}=(W_{ego}+W_{neigh}+W_{edge})\mathbf{s}=W(\mathbf{s},t)\mathbf{s}.

(1)

Modelling temporal network dynamics amounts to solving Eq. (1), which along with the initial condition $\mathbf{s}(0)$ , determine the evolution of the system.

Refer to caption — Figure 1: Normalised density of noise-induced infections $\rho_{f}$ as a function of interevent time standard deviation $\sigma_{\tau}$ and memory $\eta$ . Normalised diffusion time $t_{f}$ produces an almost identical effect [see (a), inset]. The small $\sigma_{\tau}$ limit, leading to regular patterns in $\tau$ , comprises the quenched limit in (a) where $\eta=1$ and the network is effectively static. The large $\sigma_{\tau}$ limit produces large bursts in activity, comprising the annealed regime where the network is effectively sparsified and plays no role in information diffusion ( $\rho_{f}=1$ ). Mirroring results are achieved by varying memory $\eta$ for fixed $\sigma_{\tau}$ in generated (b) and empirical (c-d) temporal networks. Analytic solution is denoted by dashed lines, and Monte Carlo results by solid lines. Generated networks have lognormal degree distribution with mean $\langle k\rangle=7$ and standard deviation $\sigma_{k}=2$ . We use Weibull-distributed interevent times with mean $\langle\tau\rangle=1$ . Plot (b) uses $\sigma_{\tau}=1$ . For empirical data description see Methods. Node dynamics correspond to the RT model with threshold $\phi=0.15$ and external noise $p=2\times 10^{-4}$ . Cutoff density is $\rho_{c}=0.4$ . Monte Carlo simulations are averaged over $10^{4}$ realisations. Network size is $10^{6}$ in (a), and $5\times 10^{3}$ in (b).

To apply this formalism we derive the edge transition rates $\mu_{j}$ and $\nu_{j}$ in the case of renewal processes. We first note that microscopically, on the scale of a single edge, transitions from state $j$ to $j\pm 1$ cannot be described by a constant rate. In a renewal process, the probability of an event occurring is conditional on the time elapsed since the previous event. Therefore, this probability is history dependent, meaning edges have an effective memory and are non-Markovian by definition. Further, since it is only the previous event that is determinant, there is clearly no $j$ dependence at this scale. A renewal process may then seem at odds with a Markovian master equation [where $\mathbf{s}(t+dt)$ depends only on $\mathbf{s}(t)$ , as per Eq. (1)]. Macroscopically however, on the scale of large ensembles of edges, the renewal process exhibits effective $j$ -dependent rates that are constant in time. We can calculate the probability $E_{j}$ that a randomly selected edge is in state $j$ , and the probability that it transitions to state $j\pm 1$ over an interval $dt$ , giving $\mu_{j}$ and $\nu_{j}$ [see Methods for explicit expressions for $j>0$ , with the $j=0$ case of $E_{j}$ and $\mu_{j}$ comprising a special case that we define in Eqs. (2) and (3) below].

Since the rates $\mu_{j}$ and $\nu_{j}$ are heterogeneous in terms of $j$ , they can be viewed as a signature of the model parameters $\psi(\tau)$ and $\eta$ , and of the non-Markovianity inherent at the scale of a single edge. On a macroscopic scale, $\mu_{j}$ , $\nu_{j}$ , and $E_{j}$ are constant in time, meaning our system is indistinguishable from a continuous-time Markov chain model of edge state. That is, a random walk on the non-negative integers, with transition rates given by $\mu_{j}$ and $\nu_{j}$ , and a stationary distribution of walkers given by $E_{j}$ [see Supplementary Note 2 for an illustration of $\mu_{j}$ and $\nu_{j}$ in the case of a gamma distribution $\psi(\tau)$ ]. Applying the system-wide rates $\mu_{j}$ and $\nu_{j}$ at the finer-grained level of configurations $(\textbf{k},\textbf{m})$ amounts to a mean field approximation. Monte Carlo simulations (see Supplementary Fig. 8) demonstrate that the actual edge transition rates deviate slightly from $\mu_{j}$ and $\nu_{j}$ for each configuration $(\textbf{k},\textbf{m})$ , even if they are exact for the network as a whole, in the limit of large $N$ . The accuracy of the master equation solution provides a measure of the remarkable similarity between a renewal process, where Eq. (1) is an approximation, and the biased random walk interpretation of edge state, where Eq. (1) is exact.

Burstiness and information diffusion. We validate our analytical framework with Monte Carlo simulations of diffusion dynamics over temporal networks. Simulations use an underlying static, configuration-model network with lognormal degree distribution of mean $\langle k\rangle$ and standard deviation $\sigma_{k}$ . We measure the time $t_{c}$ required to reach an arbitrary density $\rho_{c}$ of infected nodes, in the presence of background noise at rate $p$ . We also measure $\rho_{f}$ , the relative frequency of infections due to external noise, such that $0<\rho_{f}\leq 1$ , with $1/\rho_{f}$ the ratio of all to noise-induced infections, measuring the catalytic effect of external noise (for a detailed description of $\rho_{f}$ see Methods). We normalise $t_{c}$ by the time taken to reach the desired density by noise only, providing $t_{f}$ , such that $0<t_{f}\leq 1$ . Remarkably, $\rho_{f}$ and $t_{f}$ are almost equivalent, with a value of $\rho_{f}=t_{f}=1$ indicating slow diffusion with complete reliance on external noise, and small $\rho_{f}$ and $t_{f}$ representing rapid diffusion with external noise producing a substantial catalytic effect. Together, they measure the extent to which the temporal network, rather than external noise, drives the diffusion of information.

We first examine the effect of varying interevent time standard deviation $\sigma_{\tau}$ for fixed memory $\eta=\langle\tau\rangle=1$ [Fig. 1(a)]. We choose a Weibull interevent time distribution $\psi(\tau)$ , used widely to model behavioural bursts in both human [jiang2013calling] and animal [sorribes2011origin] dynamics. A Weibull distribution reduces to the exponential distribution for $\sigma_{\tau}=\langle\tau\rangle=1$ . Node dynamics follow the RT model for threshold $\phi=0.15$ and background noise $p=2\times 10^{-4}$ . Approaching the small $\sigma_{\tau}$ limit from above, events arrive in an increasingly regular pattern, and an increasing fraction of edges are frozen in the mean state $\eta/\langle\tau\rangle=1$ . We refer to this as the quenched regime, whereby edges converge to a single state and the network is effectively static. In the opposing limit of large $\sigma_{\tau}$ , burstiness means that at any given time, edge activity is concentrated among an arbitrarily small fraction of edges that undergo large spikes in activity, with the remainder in state $j=0$ . We refer to this as the annealed regime, where the network is maximally sparse and has a vanishingly small role in information diffusion ( $\rho_{f}$ and $t_{f}$ approach one).

Both quenched and annealed regimes lead to slow, noise-reliant diffusion, where the expected edge state $\eta/\langle\tau\rangle$ is preserved [Fig. 1(a)]. For intermediate values of $\sigma_{\tau}$ there is a well-mixed regime where relatively rapid diffusion is due to edge state fluctuations that are ultimately favourable to transmission. In the RT model this implies a spike of activity on an infected neighbour overcoming a node’s threshold, or decreased activity on uninfected edges lowering the relative influence to be overcome. The decelerative effect of quenching is increased for narrower underlying degree distributions, since an increasing fraction of nodes are frozen in a state unfavourable to transmission, a static network effect already reported in [watts2002simple].

A mirroring effect can be obtained by varying memory $\eta$ for constant $\sigma_{\tau}=\langle\tau\rangle=1$ [Fig. 1(b)]. The quenched limit is recovered for large $\eta$ , as large samples of events on each edge result in edges converging to a mean state, $\eta/\langle\tau\rangle$ , with an increasingly narrow distribution, due to the central limit theorem. As for the case of fixed $\eta$ , quenching may be decelerative if cascades on the corresponding static network are noise dependent. For example, increasing $\phi$ can cause slower diffusion in the quenched limit [Fig. 1(b)]. The annealed (noise-driven) regime is effectively recovered when $\eta$ is vanishingly small, meaning almost all edges are in state $j=0$ and the role of the network in information diffusion vanishes ( $\rho_{f}=t_{f}=1$ ). The correspondence between $\sigma_{\tau}$ and $\eta$ suggests data-driven experiments that allow an indirect inference of the effects of varying $\sigma_{\tau}$ in real systems, an open problem in the study of information diffusion. We simulate the RT model on two empirical temporal networks and vary only the memory $\eta$ , recovering qualitatively the effects observed on synthetic networks [Fig. 1(c-d), see Methods for data description]. This suggests the accelerative and decelerative effects of burstiness may well be a feature of real-world information diffusion.

We systematically explore the RT, AT, and SI models with Monte Carlo simulations over $(\sigma_{\tau},\eta)$ -space (Fig. 2). The underlying degree distribution is lognormal with $\langle k\rangle=7$ and $\sigma_{k}=0.5$ , and interevent times are Weibull-distributed with $\langle\tau\rangle=1$ . Our aim is to understand how the temporal connectivity evolves over $(\sigma_{\tau},\eta)$ -space. As previously observed, the quenched regime appears either in the small $\sigma_{\tau}$ limit for constant (but sufficiently large) $\eta$ , or in the large $\eta$ limit for constant $\sigma_{\tau}$ . The temporal network enters the annealed regime in two ways, either by taking the small $\eta$ limit for constant $\sigma_{\tau}$ , or the large $\sigma_{\tau}$ limit for constant $\eta$ . The two regimes are separated by a percolation transition, i.e., the emergence of a giant connected component in the subgraph formed by edges in state $j>0$ [see regimes and boundary in Fig. 2(c)]. To quantify this transition, we introduce

\xi_{E}=\int_{\eta}^{\infty}\Psi(\tau)d\tau

(2)

and

\xi_{\mu}=\dfrac{\int_{\eta}^{\infty}\psi(\tau)d\tau}{\int_{\eta}^{\infty}\Psi(\tau)d\tau},

(3)

where $\xi_{E}$ equals $E_{0}$ , the density of edges in state zero, and $\xi_{\mu}$ equals $\mu_{0}$ , the probability that a randomly selected edge in state zero enters state one over an interval $dt$ . We may refer to $\xi_{E}$ as the effective sparsification, or alternatively, the effective annealing. Here $\Psi(\tau)$ is the complementary cumulative distribution relating to $\psi(\tau)$ . We denote by $q(k)$ the degree distribution obtained by randomly removing a fraction $\xi_{E}$ of edges in a static configuration model network with degree distribution $p(k)$ , which is identical to the expected subgraph formed by removing state zero edges in the stochastic temporal network. The percolation transition for $q(k)$ can be computed analytically (see Supplementary Note 3), and despite the static assumption, provides an excellent estimate of the boundary between quenched and annealed regimes (Fig. 2), indicating the onset of slow, noise-dependent diffusion for all diffusion dynamics considered.

Even if the outcome of information diffusion (as measured by $t_{f}$ ) is qualitatively similar across diffusion models with respect to the features of the temporal network ( $\sigma_{\tau}$ and $\eta$ ), we can identify differences due to node dynamics by measuring the values of $\sigma_{\tau}$ that produce a minimum diffusion time for given $\eta$ (see dashed lines in Fig. 2). In the RT model, both quenched and annealed regimes produce relatively slow diffusion. Between the two regimes, the minimum diffusion time shifts to larger $\sigma_{\tau}$ for increasing memory $\eta$ , and eventually exceeds $\sigma_{\tau}=1$ , meaning burstiness is accelerative. As $\eta$ increases, larger and larger fluctuations in $\tau$ are required to exit the quenched regime and enter the mixing phase where rapid diffusion occurs, a consequence of the central limit theorem [Fig. 2(a)]. Increasing $\sigma_{\tau}$ also produces an accelerative effect in the AT model [Fig. 2(b)]. In contrast to the RT model, burstiness is accelerative only for small values of memory $\eta$ . Since in the AT model we do not normalise infectious influence by total influence, increasing $\eta$ is always favourable to transmission, and quenching never slows down diffusion [see Fig. 3(a)]. In the SI model burstiness does not have an accelerative effect [Fig. 2(c) and Fig. 3(d)], since its infection rate is unbounded (as opposed to threshold models, as per Table 1). The annealing effect of burstiness overwhelms the increased rate of local diffusion afforded by unbounded transmission rates, which increases proportionally to the size of the burst. As such, acceleration due to burstiness appears to be a hallmark of threshold mechanisms, whether relative or absolute.

Finally, we examine the effect of the choice of interevent time distribution $\psi(\tau)$ (Fig. 3). We measure the noise dependence $\rho_{f}$ for lognormal, Weibull and gamma distributions, controlling for both $\langle\tau\rangle$ and $\sigma_{\tau}$ . Consider first the AT model with $M_{\phi}=2$ and $\eta=\langle\tau\rangle=1$ [Fig. 3(b)]. Here, we observe a striking dependence on $\psi(\tau)$ , with the lognormal distribution leading to the most rapid diffusion, outpacing the gamma distribution in diffusion speed and relative noise dependence by up to a factor of $83$ , and the Weibull distribution by up to a factor of $14$ . These differences can be accounted for by comparing the rate of onset of annealing in terms of $\xi_{E}$ as we increase $\sigma_{\tau}$ . The gamma distribution rapidly anneals the network, yielding the largest $\xi_{E}$ values of all choices of distribution, meaning the most edges in state $j=0$ . As a result, it exhibits the slowest, most noise reliant diffusion. In terms of the value of $\xi_{E}$ induced, the gamma is followed by the Weibull distribution, then the lognormal distribution. In fact, the lognormal requires order-of-magnitude larger $\sigma_{\tau}$ to produce equal values of $\xi_{E}$ as the Weibull and gamma distributions. By plotting $\rho_{f}$ against $\xi_{E}$ we observe the data to collapse approximately onto a single curve, revealing $\xi_{E}$ to be a far better predictor of dynamics than $\sigma_{\tau}$ [see Fig. 3(c) in contrast to Fig. 3(b)]. Some disagreement persists, however [Fig. 3(c), left inset], which can be explained by noting that increased rates of mixing $\xi_{\mu}$ [Fig. 3(c), right inset] ensure that the small number of active edges redistribute about the network at a greater rate, thus mediating cascades more effectively. An identical effect is observed for the SI model [Fig. 3(e-f)].

The data collapse in Fig. 3(c) and (f) confirms that above all it is $\xi_{E}$ , the density of edges in state $j=0$ , that ultimately determines the diffusion dynamics in our framework. It remains to determine why the value of $\xi_{E}$ is so sensitive to the choice of interevent time distribution $\psi$ , and in particular, what the properties are of a given distribution $\psi$ that most contribute to the value of $\xi_{E}$ , beyond its mean and standard deviation. We have found two properties that correlate with our observations in Fig. 3(b) and (e), at least qualitatively, as shown in Supplementary Fig. 9. They are the third raw moment, $\langle\tau^{3}\rangle$ , which is closely related to skewness, and differential entropy (as defined in Supplementary Note 8). These measures provide a rule of thumb such that, for instance, given two distributions $\psi$ with equal mean and variance, it is the one with the greater skewness that produces the lowest $\xi_{E}$ , and the most rapid diffusion.

Discussion. Our study shows that generic dynamics of information diffusion are closely tied to the level of burstiness in the underlying temporal network. By considering three binary-state models of transmission, we have demonstrated that they differ in their response to burstiness only in their details. For instance, while having a purely decelerative effect on SI models, increasing burstiness at intermediate values can be accelerative for threshold models. Nevertheless, the prevailing trend is that increasing burstiness is strongly decelerative overall, with the onset of the decelerative phase heavily dependent on the choice of interevent time distribution. The key assumptions here are that the underlying network is fixed, and that due to a memory mechanism, a fraction of edges enter a non-interacting state due to long waiting times. These assumptions result in a temporal network topology that has profound implications for many dynamical processes. It is likely that structural features of the temporal network, such as the percolation transition separating slow and fast diffusion, and the data collapse observed when controlling for the effective sparsity, will also be critical for the more general class of binary-state dynamics, including not only threshold models and models of disease, but language, voter, and Ising models, among others.

Our master equation formalism can be extended to a broad class of temporal network models. In particular, any model that can be formulated in terms of discrete, dynamic edge states is a candidate for our approach. This includes growing, decaying and adaptive networks, as well as models of rewiring. In line with our use of renewal processes, a large family of point processes have natural descriptions in terms of discrete edge states, such as cascading Poisson and Cox processes. Extensions to the Poisson process in general suggest promising applications of our approach. In particular, our treatment of non-Markovianity could be applied to other systems. That is, while a single component in a large system may be strongly non-Markovian, as was the case in our renewal process, stationary statistics may emerge at an ensemble level that act as a signature of the non-Markovianity occurring microscopically. Our biased random walk interpretation of the renewal process model shows that strikingly similar Markovian counterparts may be available for analysis. Incidentally, biased random walk models of edge state suggest a broad class of Markovian models to which our master equation applies exactly. These may be extended, for example, to Lévy flights, and used as a probe of various complex systems where memory is critical.

Acknowledgements

S.U. acknowledges the Pôle Scientifique de Modélisation Numérique from ENS Lyon for their computing support, as well as L. Taulelle and V. Lefèvre for useful technical advice. G.I. acknowledges partial funding by the European Commission through H2020 project HumanE AI under G.A. No. 761758. J.P.G. is supported by Science Foundation Ireland (grant numbers 16/IA/4470, 16/RC/3918, 12/RC/2289 P2 and 18/CRT/6049) with co-funding from the European Regional Development Fund. M.K. is supported by the DataRedux (ANR-19-CE46-0008) and SoSweet (ANR-15-CE38-0011) projects funded by ANR and the SoBigData++ (H2020-871042) project.

Author contributions

S.U. derived the analytical solution, and performed all related calculations and numerical experiments. S.U., G.I., J.P.G., and M.K. designed the research and contributed to the manuscript.

Competing interests

The authors declare no competing interests.

Methods

Master equation configuration space. We provide here a minimal description of the master equation formalism, with a focus on class transition rates, with a complete description provided in Supplementary Note 1. We introduce $C_{\mathbf{k},\mathbf{m}}$ , the set of all nodes in the network with local configuration $(\textbf{k},\textbf{m})$ , such that $\mathbf{0}\leq\mathbf{m}\leq\mathbf{k}$ . Whereas $C_{\mathbf{k},\mathbf{m}}$ is a set of nodes, we define $C_{k}$ as the set of all sets $C_{\mathbf{k},\mathbf{m}}$ with total degree $k$ . This can be written $C_{k}=\{C_{\mathbf{k},\mathbf{m}}\mid\sum_{j}k_{j}=k\}$ . Then, we refer to the configuration space $C$ as the set of all possible sets $C_{\mathbf{k},\mathbf{m}}$ . Given a degree distribution $p_{k}$ , we define

C=\{(\textbf{k},\textbf{m})\mid k\in\text{supp}\ p(k)\enspace\text{and}\enspace\mathbf{0}\leq\mathbf{m}\leq\mathbf{k}\},

(4)

which partitions the network at any given time. Written this way, $C$ is potentially infinite. To ensure that it be finite in numerical constructions, we assume an upper cutoff in the degree distribution $p(k)$ , and the set of edge states to be of a finite size $n$ . Note that $C$ includes any set for which $C_{\mathbf{k},\mathbf{m}}$ is empty at a given time. The cardinality $|C|$ of configuration space is thus determined entirely by the support of $p_{k}$ , along with $n$ . Since $(\textbf{k},\textbf{m})$ does not convey ego state, just edge and neighbour configuration, we partition $C_{\mathbf{k},\mathbf{m}}$ into sets of uninfected and infected nodes, such that $C_{\mathbf{k},\mathbf{m}}=S_{\textbf{k},\textbf{m}}\cup I_{\textbf{k},\textbf{m}}$ . Similar definitions allow us to introduce $S_{\mathbf{k}}$ and $I_{\mathbf{k}}$ , $S_{k}$ and $I_{k}$ , as well as $S$ and $I$ . Although in general $|S_{\textbf{k},\textbf{m}}|\neq|I_{\textbf{k},\textbf{m}}|$ , the structure of the uninfected and infected configuration spaces is identical, such that $|C|=|S|=|I|$ , $|C_{k}|=|S_{k}|=|I_{k}|$ and $|C_{\mathbf{k}}|=|S_{\mathbf{k}}|=|I_{\mathbf{k}}|$ .

The evolution of a dynamical process over a network amounts to a flow of nodes through the sets $S_{\mathbf{k},\mathbf{m}}$ and $I_{\mathbf{k},\mathbf{m}}$ over time. Since the number of nodes $N$ in the network is conserved, it is their distribution over the sets $S_{\mathbf{k},\mathbf{m}}$ and $I_{\mathbf{k},\mathbf{m}}$ that evolves in time. These distributions provide the state of the network at time $t$ . Since our formalism is independent of network size, we deal with the densities of nodes rather than the absolute sizes of these sets. To this end we introduce

\|C_{k}\|\equiv\sum_{C_{\mathbf{k},\mathbf{m}}\in C_{k}}|C_{\mathbf{k},\mathbf{m}}|

(5)

as shorthand for the number of nodes with underlying degree $k$ . This is in contrast to $|C_{\mathbf{k}}|$ and $|C_{k}|$ which give the number of configurations with degrees $\mathbf{k}$ and $k$ , respectively. To convert from absolute node count to densities of nodes, we need to normalise $S_{\mathbf{k},\mathbf{m}}$ and $I_{\mathbf{k},\mathbf{m}}$ by some non-zero quantity that is conserved over the course of a dynamical process. Since our temporal network models assume a static underlying network, a node’s underlying degree $k$ is preserved, and as a result, so is $\|C_{k}\|$ , defined in Eq. (5). The density of uninfected nodes in class $(\textbf{k},\textbf{m})$ in this case is given by

s_{\mathbf{k},\mathbf{m}}=\dfrac{|S_{\textbf{k},\textbf{m}}|}{\|C_{k}\|},

(6)

with $i_{\mathbf{k},\mathbf{m}}$ defined analogously. The node conservation principle leads to the condition $\sum_{C_{k}}(s_{\mathbf{k},\mathbf{m}}+i_{\mathbf{k},\mathbf{m}})=1$ , which is to say that the sum of all densities $s_{\mathbf{k},\mathbf{m}}$ and $i_{\mathbf{k},\mathbf{m}}$ with underlying degree $k$ , is one. We then have

\rho_{k}=1-\sum_{C_{k}}s_{\textbf{k},\textbf{m}}

(7)

and

\rho=\sum_{k}p(k)\rho_{k},

(8)

where the sum in the first expression is over all configurations $(\textbf{k},\textbf{m})$ that satisfy $\sum_{j}k_{j}=k$ . The term $\rho_{k}$ gives the probability that a randomly selected node with underlying degree $k$ will be infected, and $\rho$ the probability that any randomly selected node will be infected.

As discussed in the main text, $\mathbf{s}$ is the $|C|$ -dimensional vector storing the densities $s_{\mathbf{k},\mathbf{m}}$ . In practice, we use lexicographic ordering of the tuples in $C$ to define a one-to-one mapping $(\textbf{k},\textbf{m})\mapsto i$ , for some $i\in\{1,\ldots,|C|\}$ to define the $i$ -th element $s_{i}$ of $\mathbf{s}$ . Finally, it is possible to show that for fixed $n$ and limiting $k$ , the size of $C$ behaves like $\Theta(k^{2n})$ . Now that we have defined the space of allowed configurations, we turn to its dynamics.

Master equation transition rates. Ego transitions occur at rates $F_{\mathbf{k},\mathbf{m}}$ , and involve the flow of nodes from set $S_{\textbf{k},\textbf{m}}$ to $I_{\textbf{k},\textbf{m}}$ . As such, no change to the ego’s local neighbourhood $(\textbf{k},\textbf{m})$ takes place, and the transition represents a type of self-edge, or loop, in the lattice representation of configuration space, illustrated in Supplementary Note 1. The rates $F_{\mathbf{k},\mathbf{m}}$ are encoded in transmission functions such as those shown in Table 1. Flux measurements of these transitions, such as those in Supplementary Note 7, are expected to be exact, and are an important benchmark for verification of experiments. The rates $F_{\mathbf{k},\mathbf{m}}$ are contained in the matrix $W_{ego}$ .

Neighbour transitions are based on the probability $\beta_{j}dt$ that an uninfected neighbour of an uninfected node becomes infected over an interval $dt$ . To calculate $\beta_{j}$ we use a straightforward ensemble average over $S$ . To obtain the expected fraction of neighbours undergoing transitions, we observe the number of nodes undergoing ego transitions at time $t$ , and count the number of neighbour transitions produced as a result. That is, when an uninfected node in class $(\textbf{k},\textbf{m})$ becomes infected, which occurs with probability $F_{\mathbf{k},\mathbf{m}}dt$ , it has $k_{j}-m_{j}$ uninfected neighbours that observe this transition, or $k_{j}-m_{j}$ nodes undergoing neighbour transitions. The number of such edges across the entire network is given by $\sum_{S}p_{\mathbf{k}}(k_{j}-m_{j})F_{\mathbf{k},\mathbf{m}}s_{\mathbf{k},\mathbf{m}}$ , where the sum is over all uninfected classes. We compare this to the total number of uninfected-uninfected edges, $\sum_{S}p_{\mathbf{k}}(k_{j}-m_{j})s_{\mathbf{k},\mathbf{m}}$ , giving the neighbour transition rate

\beta_{j}dt=\dfrac{\sum_{S}p_{\mathbf{k}}(k_{j}-m_{j})F_{\mathbf{k},\mathbf{m}}s_{\mathbf{k},\mathbf{m}}}{\sum_{S}p_{\mathbf{k}}(k_{j}-m_{j})s_{\mathbf{k},\mathbf{m}}}dt,

(9)

which has previously been used in master equation solutions of binary-state dynamics on static networks. The rates $\beta_{j}$ are contained in the matrix $W_{neigh}$ , weighted by the values $k_{j}$ and $m_{j}$ of the relevant classes $(\textbf{k},\textbf{m})$ , as detailed in Supplementary Note 1.

Edge transitions occur at rates $\mu_{j}$ and $\nu_{j}$ , and give the probability of edges in state $j$ transitioning to state $j+1$ or $j-1$ , respectively, over an interval $dt$ . Their value depends upon the temporal network model in question. In this work, edge transition rates are determined by renewal processes following interevent time distributions $\psi(\tau)$ , with complementary cumulative distributions $\Psi$ . If the state of an edge is determined by the number of events $j$ having occurred in the preceding time window of duration $\eta$ due to a renewal process, edge transition rates are

\mu_{j}dt=\dfrac{\Psi\ast\psi^{\ast j}}{\Psi\ast\psi^{\ast(j-1)}\ast\Psi}dt

(10)

and

\nu_{j}dt=\dfrac{\Psi\ast\psi^{\ast(j-1)}}{\Psi\ast\psi^{\ast(j-1)}\ast\Psi}dt,

(11)

with

E_{j}=\Psi\ast\psi^{\ast(j-1)}\ast\Psi

(12)

giving the probability that a randomly selected edge is in state $j$ . It is this quantity that provides the normalising constant for the rates $\mu_{j}$ and $\nu_{j}$ . Here, $\psi^{\ast j}$ is the $j$ -th convolution power of $\psi$ . A complete derivation is given in Supplementary Note 1. The Gaver-Stehfest algorithm is used to compute the inverse Laplace transforms, and an efficient numerical procedure reducing $\mu_{j}$ and $\nu_{j}$ to a matrix-vector product is developed in Supplementary Note 5. These expressions hold for $j>0$ , with Eqs. (2) and (3) in the main text giving the special case of $j=0$ for $E_{j}$ and $\mu_{j}$ , respectively. Regardless of the form of $\psi$ , the mean edge state $\eta/\langle\tau\rangle$ is always conserved on a network-wide level. Applying Eqs. (10) and (11) at the level of class transitions amounts to a mean field approximation, since flux measurements of Monte Carlo simulation show edge transition rates to deviate slightly from $\mu_{j}$ and $\nu_{j}$ at the class level $(\textbf{k},\textbf{m})$ , even if exact for the network as a whole, as shown in Supplementary Note 7.

Simulation. We simulate networks $\mathcal{G}=(\mathcal{V},\mathcal{E})$ composed of a node set $\mathcal{V}$ of size $N$ , and an underlying edge set $\mathcal{E}$ . The edge set is produced by a desired degree distribution, wired according to the configuration model. Overlying temporal network activity is initialised to the steady state, such that at time $t=0$ , the time to the first event follows exactly the residual distribution $\Psi$ , in the limit of large networks. Specifically, we set the time to $t=-\eta$ , and draw $|\mathcal{E}|$ residual times from $\Psi$ , or one for each edge. Subsequent interevent times are drawn from $\psi$ . Advancing in time from $-\eta$ ensures that a stationary distribution of edge states $E_{j}$ is achieved exactly at $t=0$ , when we begin to allow node dynamics to evolve. Due to the large values of interevent time standard deviation studied in this work, out-of-the-box sampling routines were either inefficient or broke down for large $\sigma_{\tau}$ . As such, we develop a simple, yet efficient routine in Supplementary Note 4 based on approximate inverse transform sampling of $\psi$ and $\Psi$ , using a bisection method. This is performed on a numerical grid of $\Psi$ values, with relevant details of the probability distributions outlined in detail in Supplementary Note 9. A third-order spline interpolation on a logarithmic scale provides intermediate values of the grid, such that the resultant underlying distribution is close to exact.

Node dynamics are implemented via a Gillespie algorithm, which uses the fact that the waiting time to infection for an uninfected node in class $(\textbf{k},\textbf{m})$ follows an exponential distribution with mean $1/F_{\mathbf{k},\mathbf{m}}$ . Initially all nodes are in the uninfected state, and the diffusion process is triggered by low-level background noise at rate $p$ . To simulate the temporal network itself, a time-ordered sequence of edge events is implemented in parallel with the node update sequence. This amounts to two separate time-ordered sequences of events executed simultaneously. Algorithms are described in detail in Supplementary Note 4 with pseudocode.

We use the normalised density of noise-induced infections, $\rho_{f}$ , and normalised diffusion time, $t_{f}$ , as measures of the diffusion process. We define these quantities as follows. The probability that a randomly selected node has been infected as a result of external noise is

\tilde{\rho}(t)=p\int_{0}^{t}(1-\rho(\tau))d\tau,

(13)

meaning $0<\tilde{\rho}\leq\rho$ . We define $\rho_{f}$ as the fraction of infections that are due to noise $\rho_{f}=\tilde{\rho}/\rho$ , such that $0<\rho_{f}\leq 1$ . This value cannot equal zero since there must be at least one noise induced infection, namely, the first infection in the diffusion process. A value approaching $\rho_{f}=1$ means almost all infection is due to external noise. This occurs in the annealed limit, when almost all edges are in state $j=0$ , and network interactions play a vanishingly small role in the diffusion process. As a consequence, the time evolution of the diffusion process is governed by

\dot{\rho}=p(1-\rho)

(14)

whose solution $\rho=1-e^{-pt}$ can be inverted to give the time required to the achieve a given density $\rho$ of infections relying solely on noise, that is,

t=\dfrac{-\ln(1-\rho)}{p}.

(15)

If $t_{c}$ is the time required in the general case to reach a cutoff density of infections $\rho_{c}$ , normalising $t_{c}$ by Eq. (15) evaluated at $\rho_{c}$ defines $t_{f}$ , such that $0<t_{f}\leq 1$ . A value of $t_{f}=1$ means the system is driven entirely by noise, and a value approaching $0$ a rapid diffusion process. An important feature of this work is that $t_{f}$ and $\rho_{f}$ seem to be interchangeable, as per the inset of Fig. 1(a), and any result shown in terms of $\rho_{f}$ produces an identical picture in $t_{f}$ .

Data description. In this work we use two empirical temporal networks used by [saramaki2015exploring] and references therein, which we describe below. To simulate diffusion processes on these networks we use periodic boundary conditions, starting at a randomly selected point in time.

The first dataset is a temporal network of email exchange [saramaki2015exploring, eckmann2004entropy], extracted from the log files of a university email server. The sender, recipient and the timestamp are used to form the network. The dataset consists of $N=3188$ nodes, and $|\mathcal{E}|=31857$ underlying edges, such that the average degree is $19.99$ . A total of $308730$ events were recorded, with a resolution of one second over a period of $81.3$ days. An average of $9.691$ events occur per edge. We determine the interevent time distribution by taking the the subset of edges observing more than one event, of which there are $21199$ . The mean interevent time is then calculated to be $\langle\tau\rangle=3.125\text{ days}$ , with standard deviation $\sigma_{\tau}=6.620\text{ days}$ . This yields a coefficient of variation $\sigma_{\tau}/\langle\tau\rangle=2.118$ .

The second dataset is a temporal network of forum interactions [saramaki2015exploring, karimi2014structural], an online community where users discuss movies. Similar to the email dataset, the sender, recipient and the timestamp are extracted from the messages. The dataset consists of $N=7083$ nodes, and $|\mathcal{E}|=138144$ underlying edges, such that the average degree is $39.01$ . A total of $1428493$ events were recorded, with a resolution of one second over a period of $3133$ days. An average of $10.34$ events occur per edge. We determine the interevent time distribution by taking the the subset of edges observing more than one event, of which there are $70902$ . The mean interevent time is then calculated to be $\langle\tau\rangle=16.60\text{ days}$ , with standard deviation $\sigma_{\tau}=76.53\text{ days}$ . This yields a coefficient of variation $\sigma_{\tau}/\langle\tau\rangle=4.611$ .