Agent based simulators for epidemic modelling: Simulating larger models using smaller ones

Daksh Mittal, Sandeep Juneja, Shubhada Agrawal

Abstract

Agent-based simulators (ABS) are a popular epidemiological modelling tool to study the impact of various non-pharmaceutical interventions in managing an epidemic in a city (or a region). They provide the flexibility to accurately model a heterogeneous population with time and location varying, person-specific interactions as well as detailed governmental mobility restrictions. Typically, for accuracy, each person is modelled separately. This however may make computational time prohibitive when the city population and the simulated time is large. In this paper, we dig deeper into the underlying probabilistic structure of a generic, locally detailed ABS for epidemiology to arrive at modifications that allow smaller models (models with less number of agents) to give accurate statistics for larger ones, thus substantially speeding up the simulation. We observe that simply considering a smaller aggregate model and scaling up the output leads to inaccuracies. We exploit the observation that in the initial disease spread phase, the starting infections create a family tree of infected individuals more-or-less independent of the other trees and are modelled well as a multi-type super-critical branching process. Further, although this branching process grows exponentially, the relative proportions amongst the population types stabilise quickly. Once enough people have been infected, the future evolution of the epidemic is closely approximated by its mean field limit with a random starting state. We build upon these insights to develop a shifted, scaled and restart-based algorithm that accurately evaluates the ABS’s performance using a much smaller model while carefully reducing the bias that may otherwise arise. We apply our algorithm for Covid-19 epidemic in a city and theoretically support the proposed algorithm through an asymptotic analysis where the population size increases to infinity. We develop nuanced coupling based arguments to show that the epidemic process is close to the branching process early on in the simulation.

1 Introduction

Agent-based simulators (ABS) are a popular tool in epidemiology (See [12], [5]). In this paper we focus on ABS used to model epidemics such as Covid-19 in cities and we illustrate our methodological contributions through the use of ABS for modelling Covid evolution in the city of Mumbai. As is apparent from the paper, the underlying ideas are valid more generally for any epidemic in a large city that may have an initial exponential growth phase that tapers down when sufficient population is infected (See [4], [7]).

As is well known, in an ABS in epidemic modelling of a city, a synthetic copy of it is constructed on a computer that captures the population interaction spaces and detailed disease spread as well as disease spreading interactions as they evolve in time. Typically, each individual in the city is modelled as an agent, so that total number of agents equal the total city population. The constructed individuals reside in homes, children may go to schools, adults may go to work. Individuals also engage with each other in community spaces (to capture interactions in marketplaces, restaurants, public transport, and other public places). Homes, workplaces, schools sizes and locations and individuals associated with them, their gender and age, are created to match the city census data and are distributed to match its geography. Government policies, such as partial, location specific lockdowns for small periods of time, case isolation of the infected and home quarantine of their close contacts, closure of schools and colleges, partial openings of workplaces, etc. lead to mobility reduction and are easily modelled. Similarly, variable compliance behaviour in different segments of the population that further changes with time, is easily captured in an ABS. Further, it is easy to introduce new variants as they emerge, the individual vaccination status, as well as the protection offered by the vaccines against different variants as a function of the evolving state of the epidemic and individual characteristics, such as age, density of individual’s interactions, etc (see [13]). Thus, this microscopic level modelling flexibility allows ABS to become an effective strategic and operational tool to manage and control the the disease spread. See [5] for an ABS used for UK and USA related studies specific to COVID-19, [8] for a COVID-19 study on Sweden, [1], [13] for a study on Bangalore and Mumbai in India. See, e.g., [9] and [12] for an overview of different agent-based models.

Key drawback: However, when a reasonably large population is simulated, especially over a long time horizon, an ABS can take huge computational time and this is its key drawback. This becomes particularly prohibitive when multiple runs are needed using different parameters. For instance, any predictive analysis involves simulating a large number of scenarios to provide a comprehensive view of potential future sample paths. In model calibration, the key parameters such as transmission rates, infectiousness of new variants are fit to the observed infection data over relevant time horizons, requiring many computationally demanding simulations.

Key contributions: We develop a shift-scale-restart algorithm (described later) that carefully exploits the closeness of the underlying infection process (process of number infected of each type at each time), first to a multi-type super critical branching process, and then, suitably normalised, to the mean field limit of the infection process, so that the output from the smaller model accurately matches the output from the larger one. Essentially, both the smaller and the larger model, with identical initial conditions of small number of infections evolve similarly in the early days of the infection growth. Interestingly, in a super-critical branching process, while the number of infections grows exponentially, the proportions of the different types infected quickly stabilises, and this allows us to shift a scaled path from a smaller model to a later time with negligible change in the underlying distribution. Therefore, once there are enough infections in the system, output from the smaller model, when scaled, matches that from the larger model at a later ‘shifted’ time. This shifting and scaling of the paths from the smaller model does a good job of representing output from the larger model when there are no interventions to the system. However, realistically, government intervenes and population mobility behaviour changes with increasing infections. To get the timings of these interventions right, we restart the smaller model and synchronise the timings of interventions in the shifted and scaled path to the actual timings in the original path of the larger model.

We present numerical results where our strategy is implemented on a model for Mumbai with 12.8 million population that realistically captures interactions at home, school, workplace and community as well as mobility restrictions through interventions such as lockdown, home quarantine, case isolation, schools closed, limited attendance at workplaces, etc. Using our approach, we more or less exactly replicate 12.8 million population model for Mumbai with all its complexities, using only 1 million people model, providing an almost 12.8 times speed-up. We numerically observe similar results with a smaller half-million population model, although further reduction leads to increase in errors as the branching process phase is very small and mean field approximations break down. Mumbai is a densely populated city. To check for validity of our approach to less dense cities, we reduce the parameters that capture interactions and numerically observe that the proposed approach works well even in substantially sparser cities.

Further, we provide theoretical support for the proposed approach through an asymptotic analysis where the population size $N$ increases to infinity. We show that early on till time $\frac{\log({N}/{i\log N})}{\log(\rho)}$ , where $\rho$ is the exponential epidemic growth rate in early stages and $i$ denotes the number exposed at time zero, the epidemic process is well approximated by an associated branching process. Our analysis relies on developing a nuanced coupling between the epidemic and the branching process during this period. Then, after time $\log(\epsilon N)/\log(\rho)$ for any $\epsilon>0$ , and large $N$ , we show that the epidemic process is closely approximated by its mean field limit that can be seen to follow a large dimensional discretized ODE. While, our simulation model allows each person’s state to lie in an uncountable state space, our theoretical analysis is conducted under somewhat simpler finite state assumptions, that subsumes general compartmental models.

In the Appendix we also identify the behaviour of simple compartmental models such as susceptible-infected and recovered (SIR) and some generalisations in the early branching process phase before it is governed by the ODE (see Appendix B).

Structure of remaining paper: To illustrate the key ideas practically, we first show the implementation of our algorithm on a realistic model of Mumbai in Section 2. Then in Section 3, we briefly summarise our agent based simulator. In Section 4 we spell out the shift-scale-restart algorithm. We provide theoretical asymptotic analysis supporting the efficacy of the proposed algorithm in Section 5. We first conduct our analysis in a simpler set-up where the coupling between the epidemic and the branching process and the related analysis is easier. A more general analysis, and a more intricate coupling is analysed in the Appendix D. Section 5 is accompanied with brief proof sketches of the results while the details are presented in Appendix C. In Section 6 we demonstrate the performance of our algorithm for less dense cities and smaller half million and one lakh models. In section 7 we conclude.

2 Speeding up ABS: The big picture

A naive approach to speed up the ABS maybe to use a representative smaller population model and scale up the results. Thus, for instance, while a realistic model for Mumbai city may have 12.8 million agents (see [11]), we may construct a sparser Mumbai city having, say, a million agents, that matches the bigger model in essential features, so that, roughly speaking, in the two models each infectious person contributes the same total infection rate to all susceptibles at each time. The output numbers from the smaller model may be scaled by a factor of 12.8 to estimate the output from the larger model. We observe, somewhat remarkably, that this naive approach is actually accurate if the initial seed infections in the smaller model (and hence also the larger model) are large, say, of the order of thousands, and are identically distributed in both the smaller and the larger model (see Figure 3 where the number exposed are plotted under a counter-factual no interventions scenario. The comparative statements hold equally well for other statistics such as the number infected, hospitalised, in ICUs and deceased). The rationale is that in this setting both the smaller and the larger model have sufficient infections so that the proportion of the infected population in both the models well-approximate their identical mean field limits.

Refer to caption — Figure 1: Scaled no. exposed in the smaller model match the larger model when we start with large, 12800 infections.

However, modelling initial randomness in the disease spread is important for reasons including ascertaining the distribution of when and where an outbreak may be initiated, the probability that some of the initial infection clusters die-down, getting an accurate distribution of geographical spread of infection over time, capturing the intensity of any sample path (the random variable $W$ in the associated branching process, described in Theorem 5.2), etc. These are typically captured by setting the initial infections to a small number, say, around a hundred, and the model is initiated at a well chosen time (see [1]). In such settings, we observe that the scaled output from the smaller model (with proportionately lesser initial infections) is noisy and biased so that the simple scaling fix no longer works (see Figure 3. In Appendix E, we explain in a simple setting why the scaled smaller model is biased and reports lower number of infections compared to the larger model in the early infection spread phase).

Additionally, we observe that in the early days of the infection, the smaller and the larger model with the same number of initial infections, similarly clustered, behave more or less identically (see Figure 3), so that the smaller model with the unscaled number of initial seed infections provides an accurate approximation to the larger one. Here again in the early phase in the two models, each infectious person contributes roughly the same total infection rate to susceptibles at home, workplace and community. Probabilistically this is true because early on, both the models closely approximate an identical multi-type branching process. Shift-scale-restart algorithm outlined in Section 4 exploits these observations to speed up the simulator. We briefly describe it below.

Fixing ideas: Suppose that for Mumbai with an estimated population of $12.8$ million, a $12.8$ million agent model is seeded with $100$ randomly distributed infections on day zero. To get the statistics of interest such as expected hospitalisations and fatalities over time, instead of running the $12.8$ million agent model, we start a $1$ million agent model seeded with $100$ similarly randomly distributed infections at day zero and generate a complete path for the requisite duration. To get the statistics for the larger model, we first observe that under the no-intervention scenario, the output of the smaller model more or less exactly matches that of the larger model for around first $35$ days (Figure 5). As our analysis in Section 5 suggests, the two models closely approximate the associated branching process till time $\frac{\log(N/i\log N)}{\log\rho}$ where $N$ denotes the population of the smaller model and equals 1 million, $i$ denotes the number exposed at time zero and equals 100, and $\rho$ denotes the exponential growth rate in the early fatalities and is estimated from fatality data to equal 1.21. The quantity $\frac{\log(N/i\log N)}{\log\rho}$ is estimated to equal 34.5 suggesting that both are close to the underlying branching process around day 35. After this initial period of around $35$ days, the city has an average of $x$ thousand infections. We then determine the day when the city had $\frac{x}{12.8}$ thousand infections. This turns out to be day $21.5$ in our example. We take the path from day 21.5 onwards, scale it by a factor of 12.8, and concatenate it to the original path starting at day $35$ . Theoretical justification for this time shift comes from the branching process theory, where while a super-critical multi-type branching process can be seen to grow exponentially with a sample path dependent intensity, the relative proportions amongst types along each sample path stabilise fairly quickly and become more-or-less stationary (see Theorem 5.2). This shifted and scaled output after 35 days matches that of the larger model remarkably well. See Figure 5 where the generated infections from the larger $12.8$ million model and the shifted and scaled smaller $1$ million model are compared. The choice of day 35 is not critical above. Similar results would be achieved if we used lesser, as low as 25 days in the original model.

In a realistic setting, administration may intervene once the reported cases begin to grow. Suppose in the above example, an intervention happens on day 40. (This is reasonable, as in modelling Mumbai, our calibration exercise had set the day zero to February 13, 2020 (see, [11]), the resulting infections and reported cases reached worrying levels around the second week of March. Restrictions in the city were imposed around March 20, 2020.) In that case, the shifted and scaled path from the day 21.5 would need to have the restrictions imposed on day 26.5 (so that it approximates day 40 for the larger model). We achieve this by using the first generated a path till day 35, computing the appropriate scaled path time (21.5 days, in this case), and then, using common random numbers, restarting an identical path from time zero that has restrictions imposed from day 26.5. This path is scaled from day 21.5 onwards and concatenated to the original path at day 35. In Section 5 we note that the restarted path need not use common random numbers (see Remark 1). Even if it is generated independently, we get very similar output.

Figure 5 compares the number exposed in a 12.8 million Mumbai city simulation (in no intervention scenario) with the estimates from the shift-scale-restart algorithm applied to the smaller 1 million city. Numerical parameters for the no-intervention scenario for Mumbai are described in Appendix F. Figure 5 compares the exposed population process for the 12.8 million population Mumbai model with the smaller 1 million one as per our algorithm under realistic interventions (lockdown, case isolation, home quarantine, masking etc.) introduced at realistic times, as implemented in [1] using similar parameters, for 250 days. In this experiment, smaller and larger city evolves similarly till day 22. Intervention is at day 33, therefore smaller city is restarted with intervention at day 22.5 and scaled estimates from day 11.5 of the restarted simulation are appended at day 22 of the initial run (i.e. shifted by 10.5 days). We see that the smaller model faithfully replicates the larger one with negligible error. Similar results hold for other health statistics such as the number hospitalised and the number of cumulative fatalities, and for larger models that include variants (see Appendix A).

3 Agent Based Simulator

We informally and briefly describe the main drivers of the dynamics in our ABS model. A more detailed discussion can be seen in [1].

The model consists of individuals and various interaction spaces such as households, schools, workplaces and community spaces. Infected individuals interact with susceptible individuals in these interaction spaces. The number of individuals living in a household, their age, whether they go to school or work or neither, schools and workplaces size and composition all have distributions that may be set to match the available data. The model proceeds in discrete time steps of constant width $\Delta t$ (six hours in our set-up). At a well chosen time zero, a small number of individuals can be set to either exposed, asymptomatic, or symptomatic states, to seed the infection. At each time $t$ , an infection rate $\lambda_{n}(t)$ is computed for each susceptible individual $n$ based on its interactions with other infected individuals in different interaction spaces (households, schools, workplaces and community). In the next $\Delta t$ time, each susceptible individual moves to the exposed state with probability $1-\exp\{-\lambda_{n}(t)\cdot\Delta t\}$ , independently of all other events. Further, disease may progress independently in the interval $\Delta t$ for the population already afflicted by the virus. The probabilistic dynamics of disease progression as well as implementation of public health safety measures are briefly summarized below under simplified assumptions (underlying model is similar to that in [5] and [1]). Simulation time is then incremented to $t+\Delta t$ , and the state of each individual is updated to reflect the new exposures, changes to infectiousness, hospitalisations, recoveries, quarantines, etc., during the period $t$ to $t+\Delta t$ . The overall process repeats incrementally until the end of the simulation time.

3.1 Computing $\lambda_{n}(t)$

A susceptible individual $n$ at any time $t$ receives a total infection rate $\lambda_{n}(t)$ which is sum of the infection rates $\lambda_{n}^{h}(t)$ (from home), $\lambda_{n}^{s}(t)$ (from school), $\lambda_{n}^{w}(t)$ (from workplace) and $\lambda_{n}^{c}(t)$ (from community) coming in from infected individuals in respective interaction spaces of individual $n$ .

\lambda_{n}(t)=\lambda_{n}^{h}(t)+\lambda_{n}^{s}(t)+\lambda_{n}^{w}(t)+\lambda_{n}^{c}(t).

Briefly, the transmission rate ( $\beta$ ) of virus by an infected individual in each interaction space is the expected number of eventful (infection spreading) contact opportunities with all the individuals in that interaction space. It accounts for the combined effect of frequency of meetings and the probability of infection spread during each meeting. An infected individual can transmit the virus in the infective (pre-symptomatic or asymptomatic stage) or in the symptomatic stage. Each individual has two other parameters: a severity variable (individual attendance in school and workplace depends on the severity of disease) and a relative infectiousness variable, virus transmission is related linearly to this. Both bring in heterogeneity to the model.

Specifically, let $e_{n^{\prime}}(t)=1$ when an individual $n^{\prime}$ can transmit the virus at time $t$ , $e_{n^{\prime}}(t)=0$ otherwise. To keep notation simple, we avoid describing the details of individual infectiousness, severity, age dependent mobility and community density factor (see [1] for details). Let $\beta_{h}$ , $\beta_{s}$ , $\beta_{w}$ , and $\beta_{c}$ denote the transmission coefficients at home, school, workplace and community spaces, respectively. A susceptible individual $n$ who belongs to home $h(n)$ , school $s(n)$ , workplace $w(n)$ and community space $c(n)$ sees the following infection rates at time $t$ in different interaction spaces:

$\lambda_{n}^{h}(t)$ is the average transmission rate coming in to individual $n$ from each infected individual in his home.

\lambda_{n}^{h}(t)=\beta_{h}\frac{\sum_{n^{\prime}:h(n^{\prime})=h(n)}e_{n^{\prime}}(t)}{n_{h(n)}}.

$\lambda_{n}^{s}(t)$ and $\lambda_{n}^{w}(t)$ are also computed similarly based on infected individuals in respective interaction spaces of the individual $n$ .

Different wards in the city constitute different communities. In our numerical experiments for Mumbai, since slums form more than half the population and are extremely dense, we further divide each ward into slum and non-slum community. To compute infection rate to an individual from the community, we first compute the infection rate $h_{c^{\prime}}(t)$ for each community $c^{\prime}$ at a given time $t$ . This is set to the sum of transmission rate from all the infected individuals of the community assigned a weight that is proportional to the strictly decreasing function of the distance between the individual and the community centre. Specifically,

h_{c^{\prime}}(t)=\beta_{c}\left(\frac{\sum_{n^{\prime}:c(n^{\prime})=c^{\prime}}f(d_{n^{\prime},c^{\prime}})\cdot e_{n^{\prime}}(t)}{\sum_{n^{\prime}}f(d_{n^{\prime},c^{\prime}})}\right)

(1)

where $f(d)$ is a distance kernel that is strictly decreasing in $d$ . As in [6], one may select $f(d)=1/(1+(d/a)^{b})$ and $d\ll a$ for appropriately chosen non-negative constants $a$ and $b$ . Note that $h_{c^{\prime}}(t)$ is dominated by $\beta_{c}$ and it equals $\beta_{c}$ only when everyone in the community $c^{\prime}$ is infected.

The infection rate $\lambda_{n}^{c}(t)$ seen by an individual $n$ in community $c$ is set to sum of the community infection rates from different communities of the city multiplied with weights that are again proportional to the strictly decreasing function of the distance between the two communities. This quantity is further adjusted for the distance between individual $n$ and its community centre. Specifically,

\lambda_{n}^{c}(t)=\frac{f(d_{n,c(n)})}{\sum_{c^{\prime}}f(d_{c(n),c^{\prime}})}\sum_{c^{\prime}}f(d_{c(n),c^{\prime}})h_{c^{\prime}}(t).

(2)

Again observe that, if $f(d_{n,c(n)})<1$ , then $\lambda_{n}^{c}(t)<\beta_{c}$ . Further, an age dependent factor determining the mobility of the individuals in the community may also be accounted in $\lambda_{n}^{c}(t)$ (see [1]).

3.2 Disease progression

In the numerical experiments, the disease progression model is based on COVID-19 progression as described in [16] and [5]. An individual may have one of the following states: susceptible, exposed, infective (pre-symptomatic or asymptomatic), recovered, symptomatic, hospitalised, critical, or deceased. Briefly, an individual after getting exposed to the virus at some time observes an incubation period which is random with a Gamma distribution. Individuals are infectious for an exponentially distributed period. This covers both presymptomatic transmission and possible asymptomatic transmission. We assume that a third of the patients recover, these are the asymptomatic patients; the remaining develop symptoms. Individuals either recover or move to the hospital after a random duration that is exponentially distributed (see Appendix F for model input data details). The probability that an individual recovers depends on the individual’s age. While hospitalised individuals may continue to be infectious, they are assumed to be sufficiently isolated, and hence do not further contribute to the spread of the infection. Further progression of hospitalised individuals to critical care and further to fatality is also age dependent.

3.3 Public health safety measures (PHSMs)

We introduce methodologies to model different PHSMs (or Interventions) such as lockdowns, home quarantine, case isolation, social distancing of elderly population, mobility restrictions, masks etc. Table 7 in Appendix F summarizes some of the key interventions implemented. The PHSM’s mentioned above when implemented put some restrictions on the individual’s mobility. However, it is often the case that when several restrictions are in place, only a fraction of the population comply with these restrictions. Therefore, we restrict the movement of only the individuals in the compliant fraction of households in the city.

4 Shift-scale-restart algorithm

Let $\mu_{0}(N)$ denote the initial distribution of the infected population at time zero in our model with population $N$ and let the simulation run for a total of $T$ time units. E.g., for Mumbai at a suitably chosen time 0, we select $I_{0}=100$ people at random from the non-slum population and mark them as exposed. (Since initially the infection came from international travellers flying into Mumbai, it is reasonable to assume that most of them were residing in non-slums). Algorithm 1 summarises the simulation dynamics.

Algorithm 1 Simulation Dynamics

1:At

t=0

, start the simulation with

I_{0}

infections distributed as per

\mu_{0}(N)

2:while

t<T

3: For each susceptible individual

n

, calculate

\lambda_{n}(t)

. Its status then changes to exposed with probability

1-\exp({-\lambda_{n}(t)})

4: All individuals in some state other than susceptible, independently transition to another state as per the disease progression dynamics.

t\leftarrow t+1

6:The above simulation is independently repeated many times and average of performance measures such as number exposed, number infected, number hospitalised, and number of fatalities as a function of time are reported.

Scaling the model: For $k,N\in\mathbb{N}$ , $k>1$ , let $kN$ be the number of individuals in the larger city, and $N$ in the smaller city. Roughly speaking, the larger city has $k$ times more homes, schools and workplaces compared to the smaller city. The joint distribution of people in homes, schools or workplaces is unchanged, and transmission rates $\beta_{h}$ , $\beta_{s}$ , $\beta_{w}$ and $\beta_{c}$ are unchanged (recall that each transmission rate of an infected individual in each interaction space is the expected number of infection spreading contact opportunities with all the individuals in that interaction space. It accounts for the combined effect of frequency of meetings and the probability of infection spread during each meeting).

When we initiate both the larger as well as the smaller city with a same few and well spread infections, the disease spreads similarly in homes, schools and workplaces. To understand the disease spread through communities, for simplicity assume that there is a single community. It is easy to see that each susceptible person sees approximately $1/k$ times the community infection rate in the larger city compared to the smaller city. On the other hand, the larger city has $k$ times more susceptible population. This is true even when there are more communities and for a general distance function $f$ (see Section 3.1 for role of $f$ in community infection rates). Therefore, early on in the simulation, the total number getting infected through communities is also essentially identical between the larger and the smaller city, and the infection process in the two cities evolves very similarly.

Let $t_{S}$ denote the time till the two cities evolve essentially identically (as seen empirically and suggested by theoretical analysis, this is close to $\log_{\rho}N^{*}$ for $N^{*}=N/i\log{N}$ . Here $\log_{\rho}m=\log{m}/\log\rho$ for any $m\in\Re^{+}$ and $\rho$ denotes the initial infection exponential growth rate.)

Algorithm 2 Shift and scale algorithm in no-intervention setting

1:At

t=0

, start the simulation with

I_{0}

infections distributed as per

\mu_{0}(N)

. Generate the simulation sample path

[y_{1},y_{2},...,y_{T}]

where

y_{t}

denotes the statistics of the affected population at time

t

2:Suppose there are

x

infections at

t_{S}

, determine an earlier time

t_{x/k}

in the simulation when there where approximately

\frac{x}{k}

infections in the city.

3:The statistics of affected for the larger city is then obtained as

[y_{1},y_{2},...,y_{t_{S}},k\times y_{t_{x/k}+1},...,k\times y_{T-(t_{S}-t_{x/k})}].

4:As is Step 6, Algorithm 1.

In this setting, we propose that a shift and scale Algorithm 2, that builds upon Algorithm 1, be applied to the smaller city to generate output that resembles the larger city. Algorithm 2 is graphically illustrated in Figure 7.

In a realistic scenario, as the infection spreads, the administration will intervene and impose mobility restrictions. Thus, our simulation adjustments to the small city need to account for the timings of these interventions accurately. Let $t_{I}$ denote the first intervention time (e.g., lockdown; typically after $\beta\log_{\rho}N$ time for small $\beta\in(0,1)$ ). Let $t_{min}\approx\min\{t_{I},t_{S}\}$ . We need to restart our simulation to ensure that the shift and scaled path incorporates the intervention at the correct time. The Algorithm 3 achieves this and is graphically illustrated in Figure 7.

Algorithm 3 Shift, scale and restart algorithm

1:At

t=0

, start the simulation with

I_{0}

infections distributed as per

\mu_{0}(N)

. Generate the simulation sample path

[y_{1},y_{2},...,y_{t_{min}}]

where

y_{t}

denotes the statistics of the affected population (e.g., number exposed, number hospitalised, number of fatalities) at time

t

2:Suppose there are

x

infections at

t_{min}

, determine an earlier time

t_{x/k}

in the simulation when there where approximately

\frac{x}{k}

infections in the city.

3:Restart a new simulation of the city using common random numbers, but with the intervention introduced at time

t_{\frac{x}{k}}+t_{I}-t_{min}

. Simulate it upto time

T-(t_{min}-t_{x/k})

. Denote the time series of statistics of the affected population in the restart simulation by

z_{1},z_{2},...,z_{t_{x/k}},...,z_{T-(t_{min}-t_{x/k})}.

4:The approximate statistics of the affected population for the larger city is then obtained as

[y_{1},y_{2},...,y_{t_{min}},k\times z_{t_{x/k}+1},...,k\times z_{T+t_{x/k}-t_{min}}].

5:As is Step 6, Algorithm 1.

As we see empirically, and as is suggested by Proposition 5.1, in Algorithm 3, evolution of the infection process after time $t_{x/k}$ , given the state at that time, is more or less deterministic (even though the process may be close to the associated branching process at this time).

5 Asymptotic Analysis

In SSR algorithm, theoretical justification is needed for the fact that early on in the small city simulation, we could take a path at one time period, scale it, and stitch it to the path at another appropriately chosen time period, to accurately generate a path for the larger city. We provide this through an asymptotic analysis as the city population $N$ increases to infinity. We prove that in the initial disease spread phase, the epidemic process is close to a multi-type super-critical branching process by coupling the two processes. Further we prove that, once enough people have been infected, the future evolution of the epidemic is closely approximated by its mean field limit with a random starting state.

Recall that the synthetic model of the city is generated in two steps. We first instantiate the individuals, households, schools, workplaces and communities using the population level statistics of the demographic profile of the city. Secondly, these individuals are randomly assigned to households, schools etc. As a result, the graph of the population in the whole city might be connected to each other through schools, workplaces and homes (apart from being connected through community). As we also discuss later, in our asymptotic analysis the number of homes, schools and workplaces increase to infinity with $N$ . This implies that the state space of the population network will go to $\infty$ as $N\to\infty$ , making the analysis significantly more complex.

For the ease of analysis, we assume that the state of the city takes fixed, finitely many values as $N\to\infty$ . Specifically, we assume that the city contains many groups of individuals, each group of fixed finite population. An individual in a group interacts with another group only through community so an infectious individual in one group can infect a person in another group only through community interactions. All the other interactions (home, school, workplace) of individuals are within a group. Each group may be thought as comprising many households where children may go to schools and adults may go to work within the same group. This group structure remedies the problem of infinite state space, as a group of finite size can only have finitely many types (considering different types of individuals and their networks). Additionally, it was experimentally observed that for large group sizes, the results from the actual simulation model and similar group structured model were very close.

Even though the group structure makes the state space finite, the analysis involves nuanced arguments for coupling the epidemic and branching processes. To keep the discussion simple and intuitive, in the main body, we consider a simpler model with only one individual in a group thereby ignoring the interaction spaces of homes, workplaces and schools. For groups of size greater than 1, the analysis using the group structure involves an intricate coupling argument and is presented in the Appendix D.

Table 1, elaborates the differences between the simulation model and the two models analysed asymptotically - the intermediate group-structure model and the simplified model with groups of size 1.

Table 1: Comparison between city simulation model, intermediate (group-structure) model for asymptotic analysis and the simplified model for asymptotic analysis.

	City	Intermediate	Simplified
	simulation	model for	model for	Remarks
	model	analysis	analysis
Presence of				Intermediate model: Population is divided into
homes, schools	Yes	Yes	No	groups. All the home, school and workplace
and workplaces				interactions of an individual are within a group.
Number of	Multiple	1	1	Intermediate, simplified model: Extension to
communities				multiple communities is straightforward.
Distance of an
individual from	Continuous	Constant	Constant	Intermediate, simplified model: Extension to
community				multiple distances is straightforward.
center
	Not			Intermediate model: Single group
Groups	applicable	Yes	Yes	type is considered. Extension to
				Multiple group types is straightforward.

We briefly review the standard multi-type branching process in Section 5.1. In the following analysis, we define an epidemic process in Section 5.2. Further, we define a multi-type super-critical branching process tailored to the epidemic process in Section 5.3. We describe the coupling between the two processes in Section 5.4. We then state the results demonstrating the closeness of the epidemic and the branching process in the early disease spread phase in Section 5.5. The results demonstrating the closeness of the epidemic process to its mean field limit once the epidemic process has grown are given in Section 5.6.

5.1 Supercritical multi-type branching process review

In this section, we first define a multi-type branching process and state a key result associated with a super critical multi-type branching process and an assumption outlining a set of sufficient conditions for it to hold. See [2] for details.

Let ${{\mathbf{\tilde{}}{B}}_{t}}$ be a $\tilde{\eta}$ dimension vector denoting the multi-type branching process at time $t$ , where $\tilde{\eta}<\infty$ . Component $i$ of ${{\mathbf{\tilde{}}{B}}_{t}}$ denotes the number of individuals of type $i$ at time $t$ . As is well known, in a multi-type branching process, at the end of each time period an individual may give birth to children of different types and itself dies (it may reincarnate as a child of same or different type). The number of children each individual of type $i$ gives birth to is independent and identically distributed. Therefore, multi-type branching process is a Markov chain $({\mathbf{\tilde{}}{B}}_{t}\in\mathbb{{R^{+}}^{\tilde{\eta}}}:t\geq 0)$ . Suppose ${\mathbf{\tilde{}}{B}}_{t}=(b_{1},\ldots,b_{\tilde{\eta}})$ , then ${\mathbf{\tilde{}}{B}}_{t+1}$ is sum of independent offsprings of $b_{1}$ type $1$ parents, independent offsprings $b_{2}$ type $2$ parents, and so on. Thus, ${\mathbf{\tilde{}}{B}}_{t+1}$ is sum of $b_{1}+b_{2}+...+b_{\tilde{\eta}}$ independent random vectors in $\mathbb{{R^{+}}^{\tilde{\eta}}}$ .

Consider a matrix $\tilde{K}\in{\Re^{+}}^{\tilde{\eta}\times\tilde{\eta}}$ such that $\tilde{K}(i,j)=\mathbb{E}\left({\tilde{B}_{1}(j)|{\mathbf{\tilde{}}{B}}_{0}={\mathbf{e}_{i}}}\right)$ , that is expected number of type $j$ offsprings of a single type $i$ individual in one time period. Then,

\displaystyle\mathbb{E}\left({{\mathbf{\tilde{}}{B}}_{t+1}}\right)

\displaystyle=\tilde{K}^{T}\mathbb{E}\left({{\mathbf{\tilde{}}{B}}_{t}}\right)=(\tilde{K}^{T})^{t+1}\mathbb{E}\left({{\mathbf{\tilde{}}{B}}_{0}}\right).

Recall that the Perron Frobenius eigenvalue of a non-negative irreducible matrix (a non-negative square matrix $A$ is irreducible if there exists an integer $m>0$ such that all entries of $A^{m}$ are strictly positive) is its largest eigenvalue in absolute value, and can be seen to be positive. The following assumption is standard in multi-type branching process theory (see [2]).

Assumption 1.

$\tilde{K}$ is irreducible and its Perron Frobenius eigenvalue $\tilde{\rho}>1.$ Furthermore,

\mathbb{E}\left({\tilde{B}_{1}(j)\log\tilde{B}_{1}(j)|{\mathbf{\tilde{}}{B}}_{0}={\mathbf{e}_{i}}}\right)<\infty,

for all $1\leq i,j\leq\tilde{\eta}$ .

By $X_{n}\xrightarrow{P}X$ we denote that the sequence of random variables $\{X_{n}\}$ converges to $X$ in probability as $n\rightarrow\infty$ . Theorem 4 below is well known (see [2]).

Suppose that Assumption 1 holds. Then, corresponding to eigenvalue $\tilde{\rho}$ , there exist strictly positive right and left eigenvectors of $\tilde{K}$ , ${\mathbf{\tilde{}}{u}}$ and ${\mathbf{\tilde{}}{v}}$ such that ${\mathbf{\tilde{}}{u}}^{T}{\mathbf{\tilde{}}{v}}=1$ and $\sum_{i=1}^{\tilde{\eta}}{\tilde{u}(i)}=1$ .

Theorem 5.1.

Under Assumption 1,

\lim_{t\to\infty}\frac{\tilde{K}^{t}}{\tilde{\rho}^{t}}={\mathbf{\tilde{}}{u}}{\mathbf{\tilde{}}{v}}^{T}.

Furthermore,

\frac{{\mathbf{\tilde{}}{B}}_{t}}{\tilde{\rho}^{t}}\xrightarrow{P}W{\mathbf{\tilde{}}{v}}\quad\text{ as }t\to\infty,

(3)

where $W$ is a non-negative random variable such that $P\{W>0\}>0$ and $\mathbb{E}\left({W|{\mathbf{\tilde{}}{B}}_{0}={\mathbf{e}_{i}}}\right)=\tilde{u}(i)$ for all $i=1,...,\eta$ . Let $A=\{\omega:\tilde{B}_{t}(\omega)\to\infty\mbox{ as }t\to\infty\}$ . Then, for any $\epsilon>0$ and for all $j\in[1,\tilde{\eta}]$

\lim_{t\to\infty}P\{\omega:\omega\in A,\left\lvert{\frac{{\tilde{B}}_{t}(j)}{\sum_{i=1}^{\tilde{\eta}}\tilde{B}_{t}(i)}-\frac{{\tilde{v}(j)}}{\sum_{i=1}^{\tilde{\eta}}\tilde{v}(i)}}\right\rvert>\epsilon\}=0.

(4)

5.2 Epidemic process dynamics

Notation: The city comprises of $N$ individuals and our interest is in analyzing the city asymptotically as $N\rightarrow\infty$ .

•

An individual at any time can be in one of the following disease states: Susceptible, exposed, infective, symptomatic, hospitalised, critical, dead or recovered. Individuals are infectious only in infective or symptomatic states. Denote all the disease states by $\mathcal{D}$ . For simplicity, we ignore possible reinfections, although incorporating them would not alter our conclusions.
•

Each individual has some characteristics that are assumed to remain unchanged throughout the epidemic regardless of it’s disease state. These include individual’s age group, disease progression profile (e.g., some may be more infectious than others), community transmission rates (e.g., individuals living in congested slums may be modelled to have higher transmission rates), mobility in the community (e.g., elder population may travel less to the community compared to the working age population). We assume that set of all possible individual characteristics are finite, and denote them by $\mathcal{A}$ . Let $N_{a}$ , for $a\in\mathcal{A}$ , denote the total number of individuals with characteristic $a$ in system $N$ , and set $\pi_{a}=\frac{N_{a}}{N}$ for all $a\in\mathcal{A}$ . We assume that $\pi_{a}$ is independent of $N$ as $N\rightarrow\infty$ .
•

Hence, each individual at any time may be classified by a type ${\mathbf{s}}=(a,d)$ , where $a$ denotes the individual characteristic and $d$ the disease state. Let $\mathcal{S}=\mathcal{A}\times\mathcal{D}$ denote the set of all types.
•

Let $\mathcal{U}\subset\mathcal{S}$ denote all the types with susceptible disease state. Hence, $\mathcal{S}\setminus\mathcal{U}$ denote the types where individuals are already affected (that is, they have been exposed to the disease at some point in the past). Let $\eta=\lvert\mathcal{S}\setminus\mathcal{U}\rvert$ .
•

Denote the number of individuals of type ${\mathbf{s}}$ at time $t$ by $X_{t}^{N}({\mathbf{s}})$ and set ${\mathbf{X}}_{t}^{N}=(X_{t}^{N}({\mathbf{s}}):{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U})$ . Then, ${{\mathbf{X}}_{t}^{N}\in\mathbbm{Z^{+}}^{\eta}}$ .
•

Let $A_{t}^{N}=\sum_{{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}}X_{t}^{N}({\mathbf{s}})$ denote the total number of affected individuals in the system $N$ at or before time $t$ .

Dynamics: At time zero, for each $N$ , ${{\mathbf{X}}_{0}^{N}}$ is initialised by setting a suitably selected small and fixed number of people randomly from some distribution $\mu_{0}(N)$ and assigning them to the exposed state. All others are set as susceptible. The distribution $\mu_{0}(N)$ is assumed to be independent of $N$ so we can set ${{\mathbf{X}}_{0}^{N}}={\mathbf{X}}_{0}$ for all N.

Given ${{\mathbf{X}}_{t}^{N}}$ , ${{\mathbf{X}}_{t+\Delta t}^{N}}$ is arrived at through two mechanisms. (For the ease of notation we will set $\Delta t=1$ .) i) Infectious individuals at time $t$ who make Poisson distributed infectious contacts with the rest of the population moving the contacted susceptible population to exposed state, and ii) through population already affected moving further along in their disease state. Specifically,

•

We assume that an infectious individual with characteristic $a\in\mathcal{A}$ spreads the disease in the community with transmission rate $\beta_{a}$ . Thus, the total number of infectious contacts it makes with all the individuals (both susceptible and affected) in one time step is Poisson distributed with rate ${\beta_{a}}$ . The individuals contacted are selected randomly and an individual with characteristic $\tilde{a}\in\mathcal{A}$ is selected with probability proportional to ${\psi_{a,\tilde{a}}}$ (independent of $N$ ). $\psi_{a,\tilde{a}}$ helps model biases such as an individuals living in a dense region are more likely to infect other individuals living in the same dense region. As a normalisation, set $\sum_{\tilde{a}}\pi_{\tilde{a}}\psi_{a,\tilde{a}}=1$ .
•

Once the number of infectious contacts for a particular characteristic $\tilde{a}\in\mathcal{A}$ have been generated, each contact is made with an individual selected uniformly at random from all the individuals with characteristic $\tilde{a}$ .
•

If an already affected individual has one or more contact from an infectious individual, its type remains unchanged. On the other hand, if a susceptible individual has at least one contact from any infectious individual, it gets exposed at time $t+1$ . Each susceptible individual with characteristic $\tilde{a}\in\mathcal{A}$ who gets exposed at time $t$ , increments ${{\mathbf{X}}_{t+1}^{N}}(\tilde{a},exposed)$ by 1. A susceptible individual that has no contact with an infectious individual remains susceptible in the next time period.
•

Once an individual gets exposed, its disease progression is independent of all the other individuals and depends only on its characteristics, that is, the disease progression profile of the characteristic class the individual belongs to. The waiting time in each state (except susceptible, dead and recovered) is assumed to be geometrically distributed. Transition to symptomatic, hospitalised, critical, dead or recovered state happens with respective characteristic (disease progression profile) dependent transition probabilities. Thus, an individual of type ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ , some disease state other than susceptible, at time $t$ transitions to some other state ${\mathbf{q}}$ at time $t+1$ with the transition probability $P({\mathbf{s}},{\mathbf{q}})$ in one time step. The probability $P({\mathbf{s}},{\mathbf{q}})$ is independent of time $t$ and $N$ . If the transition happens, $X_{t+1}^{N}({\mathbf{s}})$ is decreased by 1 and $X_{t+1}^{N}({\mathbf{q}})$ is increased by 1. Observe that $P({\mathbf{s}},{\mathbf{q}})=0$ if characteristic of types ${\mathbf{s}}$ and ${\mathbf{q}}$ is different.

5.3 Associated branching process dynamics

For each $t$ , let ${{{\mathbf{B}}_{t}}\in\mathbbm{Z^{+}}^{\eta}}$ , ${{\mathbf{B}}_{t}}=(B_{t}({\mathbf{s}}):{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U})$ where $B_{t}({\mathbf{s}})$ denote the number of individuals of ${\mathbf{s}}$ at time $t$ in the branching process.

Dynamics: At time zero ${{\mathbf{B}}_{0}}={{\mathbf{X}}_{0}}$ . Given ${\bm{B}_{t}}$ , we arrive at ${\bm{B}_{t+1}}$ as follows:

•

At time $t$ , every infectious individual of type $a$ , for all $a$ , gives birth to independent Poisson distributed offspring of type $(\tilde{a},exposed)$ at time $t+1$ with rate $\pi_{\tilde{a}}\psi_{a,\tilde{a}}\beta_{a}$ for each $\tilde{a}$ . $B_{t+1}(\tilde{a},exposed)$ is increased accordingly.
•

Once an individual gets exposed, disease progression of the individual has same transition probabilities as in each epidemic process. An individual of type ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ , that is, in disease state other than susceptible at time $t$ , transitions to some other disease state ${\mathbf{q}}$ at time $t+1$ with probability $P({\mathbf{s}},{\mathbf{q}})$ . If the transition happens, $B_{t+1}({\mathbf{s}})$ is decreased by 1 and $B_{t+1}({\mathbf{q}})$ is increased by 1.

Let $A_{t}^{B}=\sum_{{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}}B_{t}({\mathbf{s}})$ denote the total number of offsprings generated by time $t$ in the branching process.

The expected offsprings matrix $K\in{\Re^{+}}^{\eta\times\eta}$ for $\{{{\mathbf{B}}_{t}}\}$ : Let each entry ${K}({\mathbf{s}},{\mathbf{q}})$ of $K$ denote the expected number of type ${\mathbf{q}}$ offspring of a single type ${\mathbf{s}}$ individual in one time step. Let $\mathcal{H}\subset\mathcal{S}$ denote all the types with the disease states that are infectious or may become infectious in subsequent time steps (that is, types with disease state either exposed, infective or symptomatic). Let $\mathcal{H}^{c}$ denote its complement. Individuals in $\mathcal{H}^{c}$ do not contribute to community infection. Let ${\mathcal{\tilde{H}}}\subset\mathcal{S}$ denote all the types with the disease states that are infectious (that is infective or symptomatic state). Let $\mathcal{E}$ denote $\mathcal{H}\setminus\mathcal{\tilde{H}}$ , that is, the set of all the types corresponding to exposed individuals. Let $\hat{\eta}=\left\lvert{\mathcal{H}}\right\rvert$ .

As described above, an individual of type ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ , may give birth to other exposed individuals if it is infectious, and/or may itself transition to some other type in one time step. Then, $K$ can be written as,

	$\displaystyle K({\mathbf{s}},{\mathbf{q}})$	$\displaystyle=P({\mathbf{s}},{\mathbf{q}})+\pi_{\tilde{a}}\psi_{a,\tilde{a}}\beta_{a}~{}\forall~{}{\mathbf{s}}=(a,d)\in\mathcal{\tilde{H}},{\mathbf{q}}=(\tilde{a},\tilde{d})\in\mathcal{E}$		(5)
	$\displaystyle K({\mathbf{s}},{\mathbf{q}})$	$\displaystyle=P({\mathbf{s}},{\mathbf{q}})\text{ otherwise.}$		(5)

It follows that

\mathbb{E}\left({{\mathbf{B}}_{t+1}}\right)=K^{T}\mathbb{E}\left({{\mathbf{B}}_{t}}\right)=(K^{T})^{t+1}\mathbb{E}\left({{\mathbf{B}}_{0}}\right).

(6)

Theorem 5.2 for multi-type branching processes holds under a standard assumption that the expected offspring matrix ${K}$ is irreducible (for more details, see Appendix 5.1). However, $K$ as defined in (5) is not irreducible. Lemma 5.1 below sheds further light on the structure of $K$ . Theorem 5.2 observes that standard conclusions continue to hold for the branching process $\{{{\mathbf{B}}_{t}}\}$ associated with epidemic processes $\{{{\mathbf{X}}_{t}^{N}}\}$ .

For any matrix $M$ , let $\rho(M)$ denote its spectral radius, that is, the maximum of the absolute values of all its eigenvalues. Henceforth, we assume that $\rho=\rho(K)>1$ in all subsequent analysis.

Lemma 5.1.

There exist matrices $K_{1}\in{\Re^{+}}^{\hat{\eta}\times\hat{\eta}}$ , $C\in{\Re^{+}}^{(\eta-\hat{\eta})\times(\eta-\hat{\eta})}$ and $M\in{\Re^{+}}^{(\eta)\times(\eta-\hat{\eta})}$ such that

K=\big{(}\begin{smallmatrix}K_{1}&M\\ 0&C\end{smallmatrix}\big{)},

where $K_{1}\in{\Re^{+}}^{\hat{\eta}\times\hat{\eta}}$ is irreducible. Further, $\rho(C)\leq\rho(K_{1})$ .

As $K_{1}$ defined in Lemma 5.1 is irreducible, its Perron Frobenius eigenvalue is equal to its spectral radius $\rho(K_{1})$ .

Theorem 5.2.

Spectral radius of $K$ is equal to the spectral radius of $K_{1}$ , that is, $\rho(K)=\rho(K_{1})$ . Furthermore, $\rho=\rho(K)$ is a unique eigenvalue of $K$ with maximum absolute value and $\lim_{t\to\infty}{K^{t}}/{\rho^{t}}={\mathbf{u}}{\mathbf{v}}^{T},$ where ${\mathbf{u}}$ and ${\mathbf{v}}$ are the strictly positive right and left eigenvectors of $K$ corresponding to eigenvalue $\rho=\rho(K)$ such that ${\mathbf{u}}^{T}{\mathbf{v}}=1$ and $\sum_{i=1}^{\eta}{u}(i)=1$ .

In addition, ${{\mathbf{B}}_{t}}/{\rho^{t}}\xrightarrow{P}W{\mathbf{v}}$ as $t\to\infty$ , where $W$ is a non-negative random variable such that $P\{W>0\}>0$ iff $B_{0}({\mathbf{s}})\neq 0$ for some ${\mathbf{s}}\in\mathcal{H}$ and $\mathbb{E}\left({W|{\mathbf{B}}_{0}={\mathbf{e}_{i}}}\right)=u(i)$ for all $i=1,...,\eta$ . Also, let $A=\{\omega:B_{t}(\omega)\to\infty\}$ as $t\to\infty$ . Then, for any $\epsilon>0$ and for all $j\in[1,\eta]$ ,

\lim_{t\to\infty}P\{\omega:\omega\in A,\left\lvert{\frac{{B}_{t}(j)}{\sum_{i=1}^{\eta}B_{t}(i)}-\frac{{v}(j)}{\sum_{i=1}^{\eta}v(i)}}\right\rvert>\epsilon\}=0.

Observe that $\mathbb{E}\left({{B}_{1}(j)\log{B}_{1}(j)|{\mathbf{B}}_{0}={\mathbf{e}_{i}}}\right)<\infty$ for all $1\leq i,j\leq{\eta}$ holds because the offsprings for our branching process have a Poisson distribution.

Proof sketch for Lemma 5.1 and Theorem 5.2: Lemma 5.1 follows by observing that: 1) $K_{1}$ correspond to types of individuals with disease states that are infectious or may become infectious. These types may give birth to exposed individuals of each characteristic in subsequent time steps and $K_{1}$ is therefore irreducible. 2) As any individual in hospitalised, critical, dead or recovered disease states neither transitions nor gives birth to offspring in exposed, infective or symptomatic disease states, therefore, we have $0$ submatrix in $K$ . 3) $C$ corresponds to type of individuals that do not contribute to any infections in the city. In addition, these individuals either transition to the next disease state or remain in the same disease state with positive probability. This results in all the eigenvalues of C being less than or equal to 1. Theorem 5.2 then follows easily by combining the Lemma 5.1 and the standard results for branching processes. For detailed proof, see Appendix C.1 and C.2.

5.4 Coupling epidemic and branching process

•

Recall that at time zero, ${\mathbf{B}}_{0}={\mathbf{X}}_{0}$ . We couple each exposed individual of each type in the epidemic process to an exposed individual of the same type in the branching process at time zero.
•

The coupled individuals in each of these processes follow the same disease progression (using same randomness) and stay coupled throughout the simulation.
•

Further, when infectious, they generate identical Poisson number of contacts (in epidemic process) and offsprings (in branching process). Specifically, when a coupled individual with characteristic $a\in\mathcal{A}$ is infectious, the number of contacts it makes in a time step with all the individuals with characteristic $\tilde{a}\in\mathcal{A}$ in epidemic process is Poisson distributed with rate $\pi_{\tilde{a}}\psi_{a,\tilde{a}}\beta_{a}$ . This equals the offsprings generated by the corresponding coupled individual in the branching process where the offsprings are with characteristic $\tilde{a}\in\mathcal{A}$ and are of type $(\tilde{a},exposed)$ .
•

In epidemic process, each contact is made with an individual randomly selected from the population with the same characteristic. If a contact is with a susceptible person, then that person is marked exposed in the next time period and is coupled with the corresponding person in the branching process.
•

On the other hand, if in epidemic process, a new contact is with an already affected individual, then this does not result in any person getting exposed so that the corresponding offspring in the branching process is uncoupled. The descendants of uncoupled individuals in the branching process are also uncoupled. In our analysis, we will show that such uncoupled individuals in the branching process vis-a-vis epidemic process are a negligible fraction of the coupled individuals till time $\log_{\rho}N^{*}$ for large $N$ , where recall that $N^{*}=N/i\log N$ and $\log_{\rho}N^{*}=\log N^{*}/\log\rho$ and $\rho$ denotes the exponential growth rate of the branching process.

Denote the new uncoupled individuals born from the coupled individuals in branching process at each time $t$ by $G_{t}^{N}$ (referred to as “ghost individuals”). As mentioned earlier, these and their descendants remain uncoupled to epidemic system $N$ . Let $D_{i,t-i}^{N}$ denote all the descendants of $G_{i}^{N}$ (ghost individuals born at time $i$ ) after $t-i$ time steps, i.e. at time $t$ .

5.5 The initial branching process phase

Following result shows that epidemic process is close to the multi-type branching process till time $\log_{\rho}N^{*}$ .

Theorem 5.3.

As $N\rightarrow\infty$ , for all ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ ,

\sup\limits_{t\in[0,\log_{\rho}({N^{*}}/{\sqrt{N}})]}\left\lvert{X}_{t}^{N}({\mathbf{s}})-{B}_{t}({\mathbf{s}})\right\rvert\xrightarrow{P}0.

(7)

\sup\limits_{t\in\left[0,\log_{\rho}N^{*}\right]}\left\lvert{\frac{{X}_{t}^{N}({\mathbf{s}})}{\rho^{t}}-\frac{{B}_{t}({\mathbf{s}})}{\rho^{t}}}\right\rvert\xrightarrow{P}0.

(8)

Result (7) was earlier shown for SIR setting in [3]. We extend this result to general models and also prove the result (8) in this more general setting.

Lemma 5.2.

For all ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ , and for some $\tilde{e}>0$ ,

\mathbb{E}\left({\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert}\right)\leq\tilde{e}\frac{\rho^{2t}}{N}.

Proof sketch for Theorem 8 and Lemma 5.2: For all $\bm{s}\in\mathcal{S}\setminus\mathcal{U}$ , $\left\lvert{{{B}}_{t}(\bm{s})-{{X}}^{N}_{t}(\bm{s})}\right\rvert$ is bounded from above by $\sum_{i=1}^{t}D_{i,t-i}^{N}$ (total number of uncoupled individuals till time $t$ ). Recall that the ghost individuals ( $G_{i}^{N}$ ) arise as a result of the infectious individuals in epidemic process making contact with already affected individuals and is related to their product. Further, infectious individuals are upper bounded in no. by affected individuals and hence it can easily be shown that $\mathbb{E}\left({{G}^{N}_{i}}\right)\leq\hat{\beta}\mathbb{E}\left({\frac{(A_{i-1}^{N})^{2}}{N}}\right)$ for some $\hat{\beta}>0$ and any $i\geq 1$ . Further, from coupling we have $A_{t}^{N}\leq A_{t}^{B}$ for all $t\in\mathbb{N}$ . Inequality (9) then follows by bounding from above the no. of descendants of $G^{N}_{i}$ individuals at time $t-i$ in the branching process. ${\mathbf{1}}\in{\Re^{+}}^{\eta}$ is a vector with each entry equal to 1.

\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq{K}^{t-i}\mathbb{E}\left({{G}^{N}_{i}}\right)\bm{1}\leq\frac{\hat{\beta}}{N}\bm{1}^{T}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{B})^{2}}\right)\bm{1}.

(9)

Lemma 5.2 follows from (9) and a standard branching process result that $\mathbb{E}\left({\left({A^{B}_{i}}\right)^{2}}\right)\leq\tilde{c}\rho^{2i}$ , for some $\tilde{c}>0$ . Theorem 8 follows from Lemma 5.2 using Markov’s inequality. Details are given in Appendix C.3 and C.4.

Recall from Theorem 5.2 that, ${{\mathbf{B}}_{t}}/{\rho^{t}}\xrightarrow{P}W{\mathbf{v}}$ as $t\to\infty,$ where $W$ is a non-negative random variable representing the intensity of branching process and ${{\mathbf{v}}}\in{\Re^{+}}^{\eta}$ is the left eigenvector corresponding to eigenvalue $\rho$ of matrix $K$ . Therefore, initially (till time $\log_{\rho}N^{*}$ ) epidemic process grows exponentially at rate $\rho$ , with sample path dependent intensity being determined by $W$ .

The following proposition follows directly from Theorem 5.2 and Theorem 8 and justifies the fact that the proportions across different types stabilize quickly in the epidemic process, and thus early on, till time $\log_{\rho}N^{*}$ , paths can be patched from one time period to the other with negligible error due to change in the proportions.

Proposition 5.1.

For $t_{N}\rightarrow\infty$ as $N\rightarrow\infty~{}~{}$ and $~{}~{}\limsup_{N\rightarrow\infty}\frac{t_{N}}{\log_{\rho}N^{*}}<\infty$ , then for all ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ :

\left\lvert\frac{{X}_{t_{N}}^{N}({\mathbf{s}})}{\sum_{{\mathbf{q}}\in\mathcal{S}\setminus\mathcal{U}}{X}_{t_{N}}^{N}({\mathbf{q}})}-\frac{{v}({\mathbf{s}})}{\sum_{{\mathbf{q}}\in\mathcal{S}\setminus\mathcal{U}}v({\mathbf{q}})}\right\rvert\xrightarrow{P}0,\quad\text{ as }~{}N\rightarrow\infty.

Remark 1.

In Algorithm 3 we had suggested that the restarted simulations should use common random numbers as in the original simulation paths so as to identically reproduce them. However, in our experiments we observe that even if the restarted paths are generated using independent samples, that leads to a negligible anomaly. To understand this, observe that to replicate an original path, we essentially need to replicate $W$ along that path, as after small initial period, the associated branching process is well specified once $W$ is known. (Note that this $W$ is implicitly generated in a simulation, it is not explicitly computed). Common random numbers achieve this. However, for approximating the statistics of the larger city after $t_{\min}$ (Day 22 in Figure 5), independent restarted simulations provide equally valid sample paths as the ones using the common random numbers. The only difference is that the $W$ associated with each independently generated path may not match the $W$ associated with the original path, so patching them together at time $t_{\min}$ may result in a mismatch. However, since we are reporting statistics associated with the average of generated paths, the statistics at time $t_{\min}$ depend linearly on the average of the $W$ ’s associated with generated sample paths. Thus, if restarted paths are independently generated, the corresponding average of associated $W$ ’s may not match the average of $W$ ’s associated with the original simulations. Empirically, we observe negligible mismatch as the average of $W$ ’s appear to have small variance.

Compartmental models are widely used to model epidemics. Usually, in these models we start with infected population which is a positive fraction of the overall population. However, little is known about the dynamics if we start with a small, constant number of infections. In Appendix B we describe the results for some popular compartmental models using Theorem 8 and 5.1.

5.6 Deterministic phase

As Theorem 8 and Proposition 5.1 note, early in the infection growth, till time $\log_{\rho}N^{*}$ for large $N$ , while the number affected grows exponentially, the proportion of individuals across different types stabilizes. Here, the types corresponding to the susceptible population are not considered because at this stage, the affected are a negligible fraction of the total susceptible population. The growth in the affected population in this phase is sample path dependent and depends upon non-negative random variable $W$ .

However, at time $\log_{\rho}(\epsilon N)$ for any $\epsilon>0$ and large $N$ , this changes as the affected population equals $\Theta(N)$ . Hereafter, the population growth closely approximates its mean field limit whose initial state depends on random $W$ and where the proportions across types may change as the time progresses. Our key result in this setting is Theorem 5.4. To this end we need Assumption 2.

Let $t_{N}=log_{\rho}(\epsilon N)$ . Let ${\mathbf{\mu}}^{N}_{t}$ denote the empirical distribution across types at time $t_{N}+t$ . This corresponds to augmenting the vector ${\mathbf{X}}_{t_{N}+t}^{N}$ with the types associated with the susceptible population at time $t_{N}+t$ and scaling the resultant vector with factor $N^{-1}$ .

Assumption 2.

There exists a random distribution ${\mathbf{\bar{}}{\mu}}_{0}(W)\in{\Re^{+}}^{(\eta+1)}$ that is independent of $N$ such that ${\mathbf{\mu}}^{N}_{0}\xrightarrow{P}{\mathbf{\bar{}}{\mu}}_{0}(W)$ as $N\rightarrow\infty$ .

Observe that ${\mathbf{\bar{}}{\mu}}_{0}(W)$ above is path dependent in that it depends on the random variable $W$ . The above assumption is seen to hold empirically. While, we do not have a proof for it (this appears to be a difficult and open problem), Corollary 5.1 below supports this assumption. This corollary follows from Lemma 5.2.

Corollary 5.1.

For $\epsilon\in(0,1)$ , $t=\log_{\rho}(\epsilon N)$ and for all ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ ,

\mathbb{E}\left({\left\lvert{\frac{{X}_{t}^{N}({\mathbf{s}})}{\rho^{t}}-\frac{{B}_{t}({\mathbf{s}})}{\rho^{t}}}\right\rvert}\right)\leq\epsilon\tilde{e},

for some constant $\tilde{e}>0$ .

Also note that Assumption 2 holds along some subsequence $\{N_{i}\}$ since ${\mathbf{\mu}}^{N}_{0}$ is a bounded sequence.

Let $c^{N}_{t}({a})$ denote the total incoming-infection rate from the community as seen by an individual with characteristic ${a}\in\mathcal{A}$ at time $t$ . It is determined from the disease state of all the individuals at time $t$ . In our setup, $c^{N}_{t}({a})$ equals

\sum\limits_{{\mathbf{q}}=(\tilde{a},\tilde{d})\in\mathcal{S}\setminus\mathcal{U}}\mu^{N}_{t}({\mathbf{q}})\mathbbm{1}(\text{type }{\mathbf{q}}\text{ is infectious})\psi_{\tilde{a},{a}}\beta_{\tilde{a}},

(10)

where $\mathbbm{1}$ denotes the indicator function. For each individual $n\leq N$ , let $S^{t}_{n}$ denote its type at time $t$ . Then, for ${\mathbf{s}}=(a,d)\in\mathcal{S}$ , $\mathbb{P}\left(S^{t+1}_{n}={\mathbf{s}}|\mathcal{\mu}_{t}^{N}\right)={p}(S^{t}_{n},{\mathbf{s}},c^{N}_{t}({a})),$ for a continuous function $p$ . In particular, the transition probability only depends on the disease-state of the individual at the previous time, the disease-state to which it is transitioning, and the infection rate incoming to the individual at that time.

Recall that from Assumption 2 we have defined ${\mathbf{\bar{}}{\mu}}_{0}(W)\in{\Re^{+}}^{(\eta+1)}$ such that ${\mathbf{\mu}}^{N}_{0}\xrightarrow{P}{\mathbf{\bar{}}{\mu}}_{0}(W)$ as $N\rightarrow\infty$ . Define ${\mathbf{\bar{}}{\mu}}_{t}(W)\in{\Re^{+}}^{(\eta+1)}$ such that for all $t\in\mathbbm{N}$ , ${\mathbf{s}}=(a,d)\in\mathcal{S}$ ,

\bar{\mu}_{t}({\mathbf{s}},W):=\sum\limits_{{\mathbf{s}^{\prime}}\in\mathcal{S}}\bar{\mu}_{t-1}({{\mathbf{s}^{\prime}}},W)p({\mathbf{s}^{\prime}}{\mathbf{s}},\bar{c}_{t-1}(a,W)),

(11)

where

\bar{c}_{t-1}({a},W)=\sum\limits_{{\mathbf{q}}=(\tilde{a},\tilde{d})\in\mathcal{S}\setminus\mathcal{U}}\bar{\mu}_{t-1}({\mathbf{q}},W)\mathbbm{1}(\text{state }{\mathbf{q}}\text{ is infectious})\psi_{\tilde{a},{a}}\beta_{\tilde{a}}.

Theorem 5.4.

Under Assumption 2 and for $t\in\mathbbm{N}$ , ${\mathbf{\mu}}^{N}_{t}\xrightarrow{P}{\mathbf{\bar{}}{\mu}}_{t}(W)$ as $N\rightarrow\infty$ .

Proof sketch: Rewrite $\mu^{N}_{t+1}(\bm{s})=\frac{1}{N}\sum\limits_{n=1}^{N}\mathbbm{1}\left(S^{t+1}_{n}=\bm{s}\right)$ as,

\mu^{N}_{t+1}(\bm{s})=\sum\limits_{\bm{s^{\prime}}\in\mathcal{S}}~{}\left[{p}(\bm{s^{\prime}},\bm{s},c^{N}_{t}({a}))\mu^{N}_{t}(\bm{s^{\prime}})~{}+~{}\mathcal{M}^{N}_{t+1}(\bm{s})\right],

(12)

where $\mathcal{M}^{N}_{t+1}(\bm{s})=\frac{1}{N}\sum\limits_{n=1}^{N}\mathbbm{1}\left(S^{t+1}_{n}=\bm{s}\right)-{p}(S^{t}_{n},\bm{s},c^{N}_{t}({a})).$ Observe that $\mathcal{M}^{N}_{t+1}(\bm{s})$ is a sequence of $0$ -mean random variables whose variance converges to $0$ as $N\rightarrow\infty$ . Also, $c^{N}_{t}({a})$ is a continuous and bounded function of $\mu^{N}_{t}$ and ${h}$ is uniformly continuous in its arguments since it is a continuous function defined on a compact set. Therefore, ${\mathbf{\mu}}^{N}_{t}\xrightarrow{P}{\mathbf{\bar{}}{\mu}}_{t}(W)$ follows by inducting on $t$ . Further details are given in Appendix C.5.

In particular, if ${\mathbf{\bar{}}{\mu}}_{t}$ denotes the mean field limit of the normalised process at time $t+\log_{\rho}(\epsilon N)$ , then, the number of infections observed in a smaller model with population $N$ is approximately $N*{\mathbf{\bar{}}{\mu}}_{t}$ and that of a larger model is approximately $kN*{\mathbf{\bar{}}{\mu}}_{t}$ . Thus, the larger model infection process can be approximated by the smaller model infection process by scaling it by $k$ .

Remark 2.

While the deterministic equations (conditioned on $W$ ) in Theorem 5.4 can be easily solved when the number of types is small and initial state is established via simulation, these become much harder in a realistic model with all its complexity, where the number of types is extremely large and may be uncountable if non-memoryless probability distributions are involved. One may thus view the key role of the simulator as a tool that identifies the random initial state of these deterministic mean field equations and then solves them somewhat efficiently using stochastic methods.

6 Further Experiments

6.1 Smaller networks

In our earlier experiments the interaction spaces included homes, workplaces, schools and communities. In practice, there maybe subspaces within these interaction spaces where individuals have more frequent interaction. In the following experiments, we capture these subspaces by introducing smaller project networks in workspaces, classes within each school, and neighbourhood clusters within communities. The exact details of these constructions are given in [1]. Roughly, speaking a project network within a workplace is a cluster of size that is uniformly chosen within 3 to 10. In schools, students of same age are kept in the same class. Each neighbourhood consists of approximately 200 families. The $\beta$ ’s within these clusters are increased while those outside the cluster are reduced.

Theses clusters are identically created in the larger as well as the smaller city (in the larger city, their numbers are scaled up along with the population size). Figure 9 shows that the numerical results using the SSR algorithm on the smaller 1 million city match the larger 12.8 million city under no-intervention scenario. Figure 9 shows analogous results under the intervention - home quarantine day 40 onwards.

Cautionary note: We could not get the SSR algorithm to give similar matching results when the neighbourhood clusters were based on geography as in [1], where in the larger model each geographical cluster had an area corresponding to circle of 100 meter radius. To match the population in the smaller model, we expanded the cluster area by a factor of 12.8. However, the variability in the number present in each cluster, created errors in scaling the smaller model to the larger one. In contrast, by creating the clusters in terms of number of households gets rid of this error. This underscores a larger point that for SSR algorithm to work, we need to ensure that the smaller model is probabilistically an almost exact replica of the larger one. As in our cluster design modification, often this is easy to do without compromising the broad aim of the simulation.

6.2 SSR on sparser cities

While our experiments are designed for Mumbai, a very dense city, it is reasonable to wonder if the results would hold for less dense cities or regions. In this section we report the experimental results for the sparser cities that are created by reducing the contact and hence the transmission rates between individual in different interaction spaces. Specifically, we consider two reasonable scenarios:

•

Scenario 1: The community beta is kept at one tenth the level of the community beta for Mumbai considered in our earlier experiments in Section 2. Figure 11 shows that the result using SSR algorithm matches the larger model in the no-intervention scenario. Figure 11 shows that the result using SSR algorithm matches the larger model in the scenario with an intervention of home quarantine after 40 days. Although we did not conduct elaborate experimentation, similar results should be true under more general and extensive interventions.

Figure 10: Shift and scale smaller model (no. of exposed) matches the larger model under no intervention scenario ( $\beta_{c}^{new}=\beta_{c}/10$ ).

Figure 11: Shift-scale-restart smaller model (no. of exposed) matches the larger model under intervention ( $\beta_{c}^{new}=\beta_{c}/10$ ).
•

Scenario 2: Beta community is reduced by a factor of 20, beta home, school and workplace by a factor 4. Again as seen in Figure 13, the result using SSR algorithm matches the larger model in the no-intervention scenario. It can be seen that even in the no-intervention scenario, the number of infections are very low after 200 days. Under home quarantine after 40 days, the exposed cases from the smaller model are within 20% of the larger model although both are more-or-less negligible numbers, the highest less than 250 (see Figure 13).

6.3 How small can the smaller model be

In the earlier experiments in Section 2, we had compared the results of a 12.8 million city to the results from using SSR on a smaller 1 million city. A natural question to ask is how small the city can be for SSR to continue to be accurate. Below we consider cities of size 500,000 and 100,000.

•

500,000 city: Figure 15 shows that the result using SSR algorithm matches the larger model in the no-intervention scenario. Figure 15 shows that the result using SSR algorithm matches the larger model in the scenario with illustrative single intervention - home quarantine day 40 onwards.
•

100,000 city: Figure 17 shows that in this case using SSR algorithm poorly matches the larger model in the no-intervention scenario. The reason for this mismatch is that 12.8 million city and 100,000 city are essentially identical for only 24 days (see Figure 17). However, the number of exposed at day 24 are around 8,000. After scaling 8,000 by 128, we get 62.5, we need to take the simulation output from the smaller model when the number of exposed equalled 62.5 and stitch that to day 24 by scaling it by 128. This is not feasible as we start with 100 persons exposed. Ad-hoc adjustments, that allow us to start with 100 or more exposed under SSR lead to much noisier output. For this reason, we are also unable to implement SSR in this setting when there are interventions.

7 Conclusion

In this paper we considered large ABS models used to model epidemic spread in a city or a region. These models are of great use in capturing details of population types, their behaviour, time varying regulatory instructions, etc. A key drawback of ABS models is that computational time can quickly become prohibitive as the city size and the simulated time increases. In this paper we proposed a shift-scale-restart algorithm that exploits the underlying probabilistic structure and allows smaller cities to provide extremely accurate approximations to larger ones using much less computational time. We supported our experiments and the algorithm through asymptotic analysis where we showed that initial part of the epidemic process is close to a corresponding multi-type branching process and very quickly the evolution of the epidemic process is well approximated by its mean field limit. The fact that in multi-type branching process the distribution of population across types stabilizes quickly helped us shift sample paths across time without loss of accuracy. To show closeness of the epidemic process to a branching process we developed careful coupling arguments. To show the robustness of the proposed methods, we numerically tested their effectiveness in presence of small networks, sparser city population, and use of substantially smaller city models to generate output that matches the large city models.

Acknowledgments

We acknowledge the support of A.T.E. Chandra Foundation for this research. We further acknowledge the support of the Department of Atomic Energy, Government of India, to TIFR under project no. 12-R&D-TFR-5.01-0500.

References

[1] S. Agrawal, S. Bhandari, A. Bhattacharjee, A. Deo, N. M. Dixit, P. Harsha, S. Juneja, P. Kesarwani, A. Swamy, K. P. Patil, N. Rathod, R. Saptharishi, S. Shriram, P. Srivastava, R. Sundaresan, N. K. Vaidhiyan, and S. Yasodharan. City-scale agent-based simulators for the study of non-pharmaceutical interventions in the context of the covid-19 epidemic. Journal of the Indian Institute of Science, 100(4):809–847, 2020.
[2] K. Athreya and P. Ney. Branching processes. 1972.
[3] T. Britton and E. Pardoux. Stochastic Epidemic Models with Inference. Springer, Switzerland, 2019.
[4] J.-P. Chretien, S. Riley, and D. B. George. Mathematical modeling of the west africa ebola epidemic. eLife, 4:e09186, 2015.
[5] N. Ferguson, D. Laydon, G. Nedjati Gilani, N. Imai, K. Ainslie, M. Baguelin, S. Bhatia, A. Boonyasiri, Z. Cucunuba Perez, G. Cuomo-Dannenburg, A. Dighe, I. Dorigatti, H. Fu, K. Gaythorpe, W. Green, A. Hamlet, W. Hinsley, L. C. Okell, S. V. Elsland, H. Thompson, R. Verity, E. Volz, H. Wang, Y. Wang, P. G. Walker, C. Walters, P. Winskill, C. Whittaker, C. A. Donnelly, S. Riley, and A. C. Ghani. Impact of Non-pharmaceutical Interventions (NPIs) to Reduce COVID19 Mortality and Healthcare Demand. Technical Report 9, MRC Centre for Global Infectious Disease Analysis, Imperial College London, U.K., 2020.
[6] N. M. Ferguson, D. A. Cummings, S. Cauchemez, C. Fraser, S. Riley, A. Meeyai, S. Iamsirithaworn, and D. S. Burke. Strategies for containing an emerging influenza pandemic in Southeast Asia. Nature, 437(7056):209–214, 2005.
[7] E. Frias-Martinez, G. Williamson, and V. Frías-Martínez. An agent-based model of epidemic spread using human mobility and social network information. pages 57–64, 10 2011.
[8] J. M. Gardner, L. Willem, W. van der Wijngaart, S. C. L. Kamerlin, N. Brusselaers, and P. Kasson. Intervention strategies against covid-19 and their estimated impact on swedish healthcare capacity. medRxiv, 2020.
[9] M. E. Halloran, N. M. Ferguson, S. Eubank, I. M. Longini, D. A. Cummings, B. Lewis, S. Xu, C. Fraser, A. Vullikanti, T. C. Germann, et al. Modeling targeted layered containment of an influenza pandemic in the United States. Proceedings of the National Academy of Sciences, 105(12):4639–4644, 2008.
[10] T. Harris. The Theory of Branching Processes. 1963.
[11] P. Harsha, S. Juneja, D. Mittal, and R. Saptharishi. Covid-19 epidemic in Mumbai: Projections, full economic opening, and containment zones versus contact tracing and testing: An update. Technical report, TIFR Mumbai, India, 2020.
[12] E. Hunter, B. Mac Namee, and J. D. Kelleher. A taxonomy for agent-based models in human infectious disease epidemiology. Journal of Artificial Societies and Social Simulation, 20(3):1–2, 2017.
[13] S. Juneja and D. Mittal. Modelling the Second Covid-19 Wave in Mumbai. Technical report, TIFR Mumbai, India, 2021.
[14] A. Malani, D. Shah, G. Kang, G. Lobo, J. Shastri, M. Mohanan, R. Jain, S. Agrawal, S. Juneja, S. Imad, and U. Kolthur-Seetharam. Seroprevalence of SARS-CoV-2 in slums versus non-slums in Mumbai, India. The Lancet Global Health, 9(2):E110–E111, 2020.
[15] M. Meckes. Lecture notes on matrix analysis. 2019.
[16] R. Verity, L. C. Okell, I. Dorigatti, P. Winskill, C. Whittaker, N. Imai, G. Cuomo-Dannenburg, H. Thompson, P. G. Walker, H. Fu, et al. Estimates of the severity of coronavirus disease 2019: a model-based analysis. The Lancet Infectious Diseases, 2020.

In Appendix A, we first display graphs of additional numerical experiments for Mumbai. These include a graph of hopitalised and deceased in the two models. We also show the output of a longer 470 day experiment where variants were also modelled. In Appendix B, we describe the behaviour of the compartmental models such as SIR, SEIR in early branching process phase before the number of infected are sufficient to follow the ODE.

We provide detailed proofs of the results stated in Section 5 in Appendix C. Then in Appendix D, we present the asymptotic analysis of the intermediate (group-structure) model. As we observed earlier, naive scaling of the smaller model underestimates the number infected in the larger one when we start with small number of initial infections. We show this in a simpler SIR setting in Section E. Finally, we provide the key details of the parameters and the city statistics used in our numerical experiments in Section F.

Appendix A Additional numerical experiments

Figures 19 and 19 compare the number hospitalised, and the number of cumulative fatalities in a 12.8 million Mumbai city simulation (no intervention) with estimates from the shift-scale-restart algorithm applied to the smaller 1 million city. In this experiment, smaller and larger city evolves similarly till day 35. There is no intervention and therefore scaled estimates from day 21.5 of the same simulation are appended after day 35 (i.e. shifted by 13.5 days).

Figures 21, 21 and 23 compare the number exposed, the number hospitalised, and the number of cumulative fatalities in a 12.8 million Mumbai city simulation (intervention included here is that home quarantine for the infected and compliant starts from day 40) with estimates from the shift-scale-restart algorithm applied to the smaller 1 million city. In this experiment, smaller and larger city evolves similarly till day 35. Intervention is at day 40, therefore smaller city is restarted with intervention at day 26.5 and scaled estimates from day 21.5 of the restarted simulation are appended at day 35 of the initial run (i.e., shifted by 13.5 days).

Figure 23 similarly compares the two models for a longer 470 days horizon. This includes the introduction of a new variant after 350 days as well as the phase before the new variant (methodology for introducing new variants is explained below). In addition realistic interventions such as lockdown, case isolation, home quarantine, masking etc. are introduced at times matching the actual interventions in Mumbai. In this experiment, smaller and larger city evolve similarly till day 22. Intervention is at day 29, therefore smaller city is restarted with intervention at day 18.5 and scaled estimates from day 11.5 of the restarted simulation are appended at day 22 of the initial run (i.e., shifted by 10.5 days). As noted earlier, the input data used in the experiments is spelled out in Section F.

Modelling multiple strains : Recall that in the current methodology, a susceptible individual $n$ sees an incoming infection rate of $\lambda_{n}(t)$ from all the infected individuals in the city at time $t$ , based on which it gets exposed with probability $1-\exp\{-\lambda_{n}(t)\cdot\Delta t\}$ . However, with the introduction of a new infectious strain, it is important to identify whether the person was exposed from an infected person with a new strain or an infected person with an original strain. To estimate this, we compute infection rates coming in from individuals infected with original strain ( $\lambda^{original}_{n}(t)$ ) and those with new infectious strain ( $\lambda^{infectious}_{n}(t)$ ) separately. To account for the increased infectiousness of the strain (let’s say $k$ times more infectious then the original strain), if an individual is infected with the infectious strain, its infectiousness is increased by the factor $k$ . Then, $\lambda_{n}(t)$ can be written as $\lambda_{n}(t)=\lambda^{original}_{n}(t)+\lambda^{infectious}_{n}(t)$ . In our model, initially a fixed percentage (2.5%) of individuals chosen randomly from infected population (active) are assumed to be from a more infectious strain on a fixed day (after 350 days from start of the simulation in this case). After this initial seeding of infectious strain, it spreads in the following manner: every individual $n$ who gets exposed to the disease at time $t$ is deemed to be infected with the infectious strain with probability $\frac{\lambda^{infectious}_{n}(t)}{\lambda_{n}(t)}$ and with the original strain with probability $1-\frac{\lambda^{infectious}_{n}(t)}{\lambda_{n}(t)}$ . The above methodology can be easily extended to more than two strains.

Appendix B Behaviour of simple compartmental models in the early branching process phase

Example 1 (SIR Model).

Susceptible-Infectious-Recovered (SIR) models are the simplest aggregate compartmental models used in epidemiology. All the individuals in the city are assumed to be identical, i.e., they have same community transmission rate and disease progression transition probabilities. Individuals can be in three disease states (susceptible, infected, recovered). Let $S(t),I(t),R(t)$ denote the number of susceptible, infected and recovered at any time $t$ . Let community transmission rate of an infected individual be $\beta$ and an infected individual recovers at rate $r$ . SIR dynamics can be expressed by the following system of differential equations:

\frac{dS(t)}{dt}=-\frac{\beta}{N}S(t)I(t),

\frac{dI(t)}{dt}=\frac{\beta}{N}S(t)I(t)-rI(t),

\frac{dR(t)}{dt}=rI(t).

A discrete version of this model fits our setting. Again, for the ease of notation, time step is set to 1. Let $\beta$ denote community transmission rate of an infected individual, i.e., number of contacts made by an individual with other individuals in one time step is Poisson distributed with rate $\beta$ , and an infected individual recovers at rate $r$ . Again, till time $t=\log_{\rho}N^{*}$ , the epidemic process is close to the branching process. For branching process, let the infected population be denoted as type 1, and the recovered population as type 2. Then, the $K$ matrix of the corresponding branching process is:

K=\big{(}\begin{smallmatrix}1+\beta-r&r\\ 0&1\end{smallmatrix}\big{)}.

Observe that, matrix $K_{1}=\big{(}1+\beta-r\big{)}$ , $C=\big{(}1\big{)}$ , $M=\big{(}r\big{)}$ and $\rho(K)=\rho(K_{1})=1+\beta-r$ . Finally,

v=\big{(}\begin{smallmatrix}1\\ \frac{r}{\beta-r}\end{smallmatrix}\big{)}.

Therefore, initially (till time $\ \log_{\rho}N^{*}$ for large $N$ ) epidemic process grows exponentially at rate $\rho=1+\beta-r$ , and the ratio of infected to recovered individuals in this phase quickly stabilizes to $\frac{\beta-r}{r}$ .

Example 2 (SEIR model).

Recall that SEIR model corresponds to susceptible-exposed-infected-recovered model. As the name suggests, SEIR model considers an additional disease state of exposed as a refinement to SIR model. Again, all the individuals are identical. Community transmission rate of an infected individual is $\beta$ (i.e., number of contacts made by an individual with other individuals is Poisson distributed with rate $\beta$ ). An exposed individual transitions to infected state with rate $p$ and an infected individual recovers at rate $r$ .

Let the exposed population be denoted as type 1, infected population as type 2, and recovered population as type 3. Then, the $K$ matrix of the branching process is:

K=\left(\begin{matrix}1-p&p&0\\ \beta&1-r&r\\ 0&0&1\end{matrix}\right).

Observe that, $C=\big{(}1\big{)}$ , $M=\big{(}\begin{smallmatrix}0\\ r\end{smallmatrix}\big{)}$ , $K_{1}=\big{(}\begin{smallmatrix}1-p&p\\ \beta&1-r\end{smallmatrix}\big{)}$ and $\rho(K)=\rho(K_{1})=1+\frac{1}{2}\big{(}((p+r)^{2}+4p(\beta-r))^{1/2}-(p+r)\big{)}$ . Finally,

v=\left(\begin{matrix}\beta\\ p+\rho-1\\ r+\frac{rp}{\rho-1}\end{matrix}\right).

Therefore, initially (till time $t=\log_{\rho}N^{*}$ , for large $N$ ) epidemic process grows exponentially at rate $\rho=1+\frac{1}{2}\big{(}{((p+r)^{2}+4p(\beta-r))^{1/2}}-(p+r)\big{)}$ , and the proportions of exposed, infected and recovered quickly stabilize to $\big{(}\beta,(p+\rho-1),(r+\frac{rp}{\rho-1})\big{)}$ normalised by their sum.

Example 3 (SIR model with two age groups).

Usually different age groups have different disease progression profiles. To account for this, we consider SIR model with two age groups. We limit ourselves to two groups so that the growth rate of epidemic process can be expressed in closed form (solution to a quadratic equation. In general, the degree of the equation is same as number of age groups). Suppose that the fraction of individuals in each age group is $\pi_{1}$ and $\pi_{2}=1-\pi_{1}$ . There are only 3 disease states (susceptible, infected and recovered). Community transmission rate of an infected individual is $\beta$ (i.e., number of contacts made by an individual with other individuals is Poisson distributed with rate $\beta$ ). The infected individual make contacts with people belonging to age group 1 and age group 2 at rate $\pi_{1}\beta$ and $\pi_{2}\beta$ , respectively. Further, suppose that an infected individual from first and second age group recover at rates $r_{1}$ and $r_{2}$ , respectively.

We then have four types of population - first and second type denote the number infected in the first and second age group. Third and fourth type correspond to number recovered in first and second type age group. Then the $K$ matrix of this branching process is:

K=\left(\begin{matrix}\pi_{1}\beta+1-r_{1}&\pi_{2}\beta&r_{1}&0\\ \pi_{1}\beta&\pi_{2}\beta+1-r_{2}&0&r_{2}\\ 0&0&1&0\\ 0&0&0&1\end{matrix}\right).

Here, as per the notation in Lemma 5.1, $C=\big{(}\begin{smallmatrix}1&0\\ 0&1\end{smallmatrix}\big{)}$ , $M=\big{(}\begin{smallmatrix}r_{1}&0\\ 0&r_{2}\end{smallmatrix}\big{)}$ ,

K_{1}=\left(\begin{matrix}\pi_{1}\beta+1-r_{1}&\pi_{2}\beta\\ \pi_{1}\beta&\pi_{2}\beta+1-r_{2}\end{matrix}\right),

$\displaystyle\rho(K)$	$\displaystyle=\rho(K_{1})$	(13)
	$\displaystyle=1+\frac{1}{2}\bigg{(}\Big{(}(\beta-r_{1}-r_{2})^{2}+4\pi_{1}r_{2}(\beta-r_{1})$
	$\displaystyle+4\pi_{2}r_{1}(\beta-r_{2})\Big{)}^{1/2}+(\beta-r_{1}-r_{2})\bigg{)}$

Finally,

v=\left(\begin{matrix}\pi_{1}\beta\\ r_{1}+\rho-1-\pi_{1}\beta\\ \frac{r_{1}}{\rho-1}\pi_{1}\beta\\ \frac{r_{2}}{\rho-1}(r_{1}+\rho-1-\pi_{1}\beta)\end{matrix}\right).

Therefore, initially (for time $\leq\log_{\rho}N^{*}$ and large $N$ ) epidemic process grows exponentially at rate

	$\displaystyle\rho$	$\displaystyle=1+\frac{1}{2}\bigg{(}\Big{(}(\beta-r_{1}-r_{2})^{2}+4\pi_{1}r_{2}(\beta-r_{1})$		(14)
		$\displaystyle+4\pi_{2}r_{1}(\beta-r_{2})\Big{)}^{1/2}+(\beta-r_{1}-r_{2})\bigg{)}$		(14)

and the population of the four types (infected in two age groups and recovered in two age groups) quickly stabilizes to be proportional to

\left(\pi_{1}\beta,(r_{1}+\rho-1-\pi_{1}\beta),(\frac{r_{1}}{\rho-1}\pi_{1}\beta),(\frac{r_{2}}{\rho-1}(r_{1}+\rho-1-\pi_{1}\beta))\right).

Appendix C Detailed proofs of Section 5 results

In this section we give detailed proofs of the asymptotic analysis results in Section 5 of the main paper.

C.1 Proof of Lemma 5.1

•

Observe that types ${\mathbf{s}}\in\mathcal{H}^{c}\setminus\mathcal{U}$ correspond to hospitalised, critical, dead or recovered states and types ${\mathbf{q}}\in\mathcal{H}$ correspond to exposed infective or symptomatic state. As any individual in hospitalised, critical, dead or recovered disease states neither transitions nor gives birth to offspring in exposed, infective or symptomatic disease states, therefore $K({\mathbf{s}},{\mathbf{q}})=0$ for all ${\mathbf{s}}\in\mathcal{H}^{c}\setminus\mathcal{U}$ and ${\mathbf{q}}\in\mathcal{H}$ .
•

Recall that $K_{1}\in{\Re^{+}}^{\hat{\eta}\times\hat{\eta}}$ is such that $K_{1}({\mathbf{s}},{\mathbf{q}})=K({\mathbf{s}},{\mathbf{q}})$ for all ${\mathbf{s}}\in\mathcal{H}$ and ${\mathbf{q}}\in\mathcal{H}$ . Observe that $K_{1}$ correspond to the “types” with the disease states that are infectious or may become infectious in subsequent time steps. As every infectious individual gives birth to exposed individuals of each characteristic therefore $K_{1}$ is irreducible.
•

Recall that matrix $C\in{\Re^{+}}^{(\eta-\hat{\eta})\times(\eta-\hat{\eta})}$ is such that $C({\mathbf{s}},{\mathbf{q}})=K({\mathbf{s}},{\mathbf{q}})$ for all ${\mathbf{s}}\in\mathcal{H}^{c}\setminus\mathcal{U}$ and ${\mathbf{q}}\in\mathcal{H}^{c}\setminus\mathcal{U}$ . The individuals of the type ${\mathbf{s}}\in\mathcal{H}^{c}\setminus\mathcal{U}$ do not contribute to any infections in the city. In addition, this individual either transitions to the next disease state or remains in the same disease state with positive probability. Therefore matrix $C$ is an upper triangular matrix with diagonal entries equalling the probability of an individual in some disease state remaining in the same disease state in one time step (which is less than 1). Hence, all the eigenvalues of matrix C are positive and less than or equal to 1.
•

Recall that matrix $M\in{\Re^{+}}^{\hat{\eta}\times(\eta-\hat{\eta})}$ is such that $M({\mathbf{s}},{\mathbf{q}})=K({\mathbf{s}},{\mathbf{q}})$ for all ${\mathbf{s}}\in\mathcal{H}$ and ${\mathbf{q}}\in\mathcal{H}^{c}\setminus\mathcal{U}$ .

Therefore, there exists $K_{1}\in{\Re^{+}}^{\hat{\eta}\times\hat{\eta}}$ , $C\in{\Re^{+}}^{(\eta-\hat{\eta})\times(\eta-\hat{\eta})}$ and $M\in{\Re^{+}}^{(\eta)\times(\eta-\hat{\eta})}$ such that

K=\big{(}\begin{smallmatrix}K_{1}&M\\ 0&C\end{smallmatrix}\big{)},

where $K_{1}\in{\Re^{+}}^{\hat{\eta}\times\hat{\eta}}$ is irreducible, and $\rho(C)<\rho(K)=\rho(K_{1})$ .

C.2 Proof of Theorem 5.2

Recall from Lemma 5.1,

K=\big{(}\begin{smallmatrix}K_{1}&M\\ 0&C\end{smallmatrix}\big{)}.

Observe that, set of eigenvalues of $K$ is same as combined set of eigenvalues of $K_{1}$ and $C$ . Furthermore,

•

$\rho(C)\leq 1$ from proof of Lemma 5.1, therefore $\rho(K)=\rho(K_{1})$ .
•

$K_{1}$ is irreducible from Lemma 5.1, therefore its spectral radius $\rho(K_{1})$ is Perron-Frobenius eigenvalue. In addition, spectral radius $\rho(K)=\rho(K_{1})$ of $K$ is also a unique eigenvalue of $K$ with highest absolute value.
•

$\lim_{t\to\infty}\frac{K^{t}}{\rho^{t}}={\mathbf{u}}{\mathbf{v}}^{T}$ then follows from [15] (Lemma 8.16, pg. 69).

Results (3) and (4) in Theorem 4 for supercritical multi-type branching process (Section 5.1) were proved in [2], Chap. V. Critical step in proving these results was the fact that corresponding matrix $\tilde{K}$ had following property $\lim_{t\to\infty}\frac{\tilde{K}^{t}}{\tilde{\rho}^{t}}={\mathbf{\tilde{}}{u}}{\mathbf{\tilde{}}{v}}^{T}$ (see Lemma 1 - Chap. V, pg. 194, in [2]).

As shown above, matrix $K$ of the associated branching process also satisfies this property ( $\lim_{t\to\infty}\frac{K^{t}}{\rho^{t}}={\mathbf{u}}{\mathbf{v}}^{T}$ ), therefore results (3) and (4) in Theorem 4 hold for our branching process as well.

C.3 Proof of Lemma 5.2

Some notation is needed to aid in proving Lemma 5.2. Let $I_{t}^{N}$ denote the number of coupled individuals between epidemic process and branching process at time $t$ . Since all the affected individuals in epidemic process are coupled with some individual in branching process of the same type, $I_{t}^{N}=A_{t}^{N}$ and we have,

X_{t}^{N}({\mathbf{s}})\leq B_{t}({\mathbf{s}})\quad\forall{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}.

\mbox{ Further, }{A_{t}^{N}}\leq{A^{B}_{t}}.

(15)

Recall that the new uncoupled individuals born from the coupled individuals in branching process at each time $t$ are referred to as “ghost individuals” and are denoted by $G_{t}^{N}$ and $D_{i,j}^{N}$ denote all the descendants of $G_{i}^{N}$ (ghost individuals born at time $i$ ) after $j$ time steps. Let $H_{t}^{N}$ denote the total number of uncoupled individuals in the branching process at time $t$ . Then,

H_{t}^{N}=\sum_{i=1}^{t}D_{i,t-i}^{N}.

(16)

Therefore, for all ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$

\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert\leq H^{N}_{t}.

(17)

From (16) and (17) we have

\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert\leq\sum_{i=1}^{t}D_{i,t-i}^{N}.

(18)

Before proceeding with further analysis, we assume two results (19) and (20), which we will show later.

For some constant $\hat{\beta}>0$ ,

\sum_{i=1}^{t}\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=0}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{B})^{2}}\right){\mathbf{1}},

(19)

where $A_{i}^{B}$ is total number of affected individuals in branching process at time $i$ and ${\mathbf{1}}\in{\Re^{+}}^{\eta}$ is a vector with each entry equal to 1.

For some constant $\tilde{c}$ ,

\mathbb{E}\left({\left({A^{B}_{i}}\right)^{2}}\right)\leq\tilde{c}\rho^{2i}.

(20)

Assuming (19) to be true, from (18) and (19) we have

\mathbb{E}\left({\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=0}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{B})^{2}}\right){\mathbf{1}}.

(21)

As $\lim_{t\to\infty}\frac{K^{t}}{\rho^{t}}={\mathbf{u}}{\mathbf{v}}^{T}$ (from Theorem 5.2), therefore there exists $K_{3}\in{\Re^{+}}^{\eta\times\eta}$ such that for all $t\geq 0$

K^{t}\leq{\rho}^{t}K_{3}.

(22)

Assuming (20) to be true, from (20), (21) and (22) we have,

	$\displaystyle\mathbb{E}\left({\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert}\right)$	$\displaystyle\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=0}^{t}\rho^{t-i}K_{3}\tilde{c}\rho^{2i-2}{\mathbf{1}}$
		$\displaystyle\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}K_{3}\tilde{c}\frac{\rho^{2t-1}}{(\rho-1)}{\mathbf{1}}$
		$\displaystyle=\tilde{e}\frac{\rho^{2t}}{N},$

where $\tilde{e}=\bigg{(}\frac{\hat{\beta}}{\rho}{\mathbf{1}}^{T}\frac{K_{3}\tilde{c}}{(\rho-1)}{\mathbf{1}}\bigg{)}$ .

Proof of (19): Recall that ghost individuals $G_{t}^{N}$ is the number of uncoupled exposed individuals born from the coupled individuals in branching process when compared to epidemic process at time $t$ . These extra individuals in branching process are born whenever a coupled infectious individual in epidemic process makes contact with already affected individual. Therefore, $G^{N}_{t}$ is equal to the number of contacts made to already affected individuals in epidemic process in one time step between time $t-1$ and $t$ . Let ${Y}_{t}^{N}$ be the total number of community contacts in one time step between time $t-1$ and $t$ . For a given contact, probability of hitting an already affected individual is less then $\psi_{max}\frac{A^{N}_{t-1}+{Y}_{t}^{N}}{N}$ , where $\psi_{max}=\max_{a\in\mathcal{A},\tilde{a}\in\mathcal{A}}\{\psi_{a,\tilde{a}}\}$ . Therefore, we have,

	$\displaystyle\mathbb{E}\left({{G}^{N}_{t}}\right)$	$\displaystyle=\mathbb{E}\left({\mathbb{E}\left({{G}^{N}_{t}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({\psi_{max}\frac{(A_{t-1}^{N}+{Y}_{t}^{N})}{N}{Y}_{t}^{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({\psi_{max}\frac{A_{t-1}^{N}{Y}_{t}^{N}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)+\mathbb{E}\left({\mathbb{E}\left({\psi_{max}\frac{({Y}_{t}^{N})^{2}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right).$

To upper bound ${Y}_{t}^{N}$ , we assume that all the affected individuals in epidemic process at time $t-1$ are infectious and that community hits by each infectious individual is Poisson distributed with $\beta_{max}=\max_{a\in\mathcal{A}}\{\beta_{a}\}$ . Then,

\mathbb{E}\left({{G}^{N}_{t}}\right)\leq\psi_{max}\mathbb{E}\left({\beta_{max}\frac{(A_{t-1}^{N})^{2}}{N}}\right)+\psi_{max}\mathbb{E}\left({\frac{\beta_{max}A_{t-1}^{N}+\beta_{max}^{2}(A_{t-1}^{N})^{2}}{N}}\right).

Since $A_{t-1}^{N}\geq 1$ , and setting $\hat{\beta}=\psi_{max}(\beta_{max}^{2}+2\beta_{max})$ , we have

\mathbb{E}\left({{G}^{N}_{t}}\right)\leq\hat{\beta}\mathbb{E}\left({\frac{(A_{t-1}^{N})^{2}}{N}}\right).

To upper bound the $\mathbb{E}\left({D_{i,t-i}^{N}}\right)$ , we assume that each state ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ has $G_{i}^{N}$ ghost individuals born at time $i$ . As ghost individuals $G_{i}^{N}$ born at time $i$ have descendants according to the same branching process dynamics,

\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{N})^{2}}\right){\mathbf{1}}.

Taking summation on both sides we have,

\sum_{i=1}^{t}\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=1}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{N})^{2}}\right){\mathbf{1}}.

(23)

From (15) and (23), we have

\sum_{i=1}^{t}\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=1}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{B})^{2}}\right){\mathbf{1}}.

Proof of (20): Recall that $A^{B}_{i}=\sum_{{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}}B_{i}({\mathbf{s}})$ . Then, $A^{B}_{i}={\mathbf{1}}^{T}{\mathbf{B}}_{i}={\mathbf{B}}_{i}^{T}{\mathbf{1}}$ . Hence,

	$\displaystyle\mathbb{E}\left({\left({A^{B}_{i}}\right)^{2}}\right)$	$\displaystyle\leq\mathbb{E}\left({({\mathbf{1}}^{T}{B}_{i})^{2}}\right)$		(24)
		$\displaystyle={\mathbf{1}}^{T}\mathbb{E}\left({{\mathbf{B}}_{i}{\mathbf{B}}_{i}^{T}}\right){\mathbf{1}}.$		(24)

Let $Var({\mathbf{B}}_{1})\in{\Re^{+}}^{\eta\times\eta}$ , be a matrix with $({\mathbf{s}},{\mathbf{q}})$ entry equalling $\mathbb{E}\left({B_{1}({\mathbf{s}})B_{1}({\mathbf{q}})}\right)-\mathbb{E}\left({B_{1}({\mathbf{s}})}\right)\mathbb{E}\left({B_{1}({\mathbf{q}})}\right)$ , $V_{{\mathbf{s}}}$ denote $Var({\mathbf{B}}_{1}|{\mathbf{B}}_{0}={\mathbf{e}}_{{\mathbf{s}}})$ and $C_{0}$ denote the matrix with $({\mathbf{s}},{\mathbf{q}})$ entry set to $\mathbb{E}\left({B_{0}({\mathbf{s}})B_{0}({\mathbf{q}})}\right)$ . From (Chap. 2, pg 37, [10]) we have,

\displaystyle\mathbb{E}\left({{\mathbf{B}}_{i}{\mathbf{B}}_{i}^{T}}\right)

\displaystyle=({K}^{i})^{T}C_{0}{K}^{i}+

\displaystyle\sum_{j=1}^{i}({K}^{i-j})^{T}\Big{(}\sum_{{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}}V_{{\mathbf{s}}}\mathbb{E}\left({{\mathbf{B}}_{j-1}({\mathbf{s}})}\right)\Big{)}{K}^{i-j}.

(25)

Furthermore, for our branching process, there exists a constant $v>0$ such that $V_{{\mathbf{s}}}\leq v{K}$ for all ${\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}$ . Using this in (25) we observe that

	$\displaystyle\mathbb{E}\left({{\mathbf{B}}_{i}{\mathbf{B}}^{T}_{i}}\right)$	$\displaystyle=({K}^{i})^{T}C_{0}{K}^{i}+$		(26)
		$\displaystyle v\sum_{j=1}^{i}({K}^{i-j})^{T}\Big{(}\sum_{{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}}\mathbb{E}\left({{B}_{j-1}({\mathbf{s}})}\right)\Big{)}{K}^{i-j+1}.$		(26)

Recalling that $\mathbb{E}\left({{\mathbf{B}}_{t+1}}\right)=(K^{T})^{t+1}\mathbb{E}\left({{\mathbf{B}}_{0}}\right)$ , then from (22) we have

{\mathbf{B}}_{i}\leq\rho^{i}(K_{3})^{T}{\mathbf{B}}_{0}.

(27)

From (26) and (27) we have

	$\displaystyle\mathbb{E}\left({{\mathbf{B}}_{i}{\mathbf{B}}^{T}_{i}}\right)$	$\displaystyle\leq\rho^{2i}(K_{3})^{T}C_{0}K_{3}+v\tilde{d}\sum_{j=1}^{i}\rho^{2i-j}(K_{3})^{T}K_{3}$
		$\displaystyle\leq\rho^{2i}\bigg{(}(K_{3})^{T}C_{0}K_{3}+\frac{\rho v\tilde{d}}{\rho-1}(K_{3})^{T}K_{3}\bigg{)}$
		$\displaystyle=\rho^{2i}\tilde{K}_{3},$

where, $\tilde{d}:=\sum_{{\mathbf{s}}\in\mathcal{S}\setminus\mathcal{U}}\Big{(}\mathbb{E}\left({{B}_{0}^{T}}\right)K_{3}\Big{)}({\mathbf{s}})$ and $\tilde{K}_{3}:=\bigg{(}(K_{3})^{T}C_{0}K_{3}+\frac{\rho v\tilde{d}}{\rho-1}(K_{3})^{T}K_{3}\bigg{)}$ .

Using above bound in (24) and setting $\tilde{c}={\mathbf{1}}^{T}\tilde{K}_{3}{\mathbf{1}}$ , we get

\displaystyle\mathbb{E}\left({\left({A^{B}_{i}}\right)^{2}}\right)\leq\tilde{c}\rho^{2i}.

C.4 Proof for Theorem 8

For $\zeta>0$ , we have

	$\displaystyle\mathbb{P}\left({\sup\limits_{t\in[0,\log_{\rho}(N^{*}/\sqrt{N})]}\left\lvert{{{X}}^{N}_{t}({\mathbf{s}})-{{B}}_{t}({\mathbf{s}})}\right\rvert\geq\zeta}\right)$
	$\displaystyle\leq\frac{1}{\zeta}\mathbb{E}\left({\sup\limits_{t\in[0,\log_{\rho}(N^{*}/\sqrt{N})]}\left\lvert{{{X}}^{N}_{t}({\mathbf{s}})-{{B}}_{t}({\mathbf{s}})}\right\rvert}\right)$
	$\displaystyle\leq\frac{1}{\zeta}\sum\limits_{t=0}^{\log_{\rho}(N^{*}/\sqrt{N})}~{}\mathbb{E}\left({\left\lvert{{{X}}^{N}_{t}({\mathbf{s}})-{{B}}_{t}({\mathbf{s}})}\right\rvert}\right),$

where the first inequality follows from the Markov’s inequality, and the second follows from observing that maximum of non-negative random variables is bounded by their sum. Similarly,

	$\displaystyle\mathbb{P}\left({\sup\limits_{t\in[0,\log_{\rho}N^{*}]}\left\lvert{\frac{{{X}}^{N}_{t}({\mathbf{s}})}{\rho^{t}}-\frac{{{B}}_{t}({\mathbf{s}})}{\rho^{t}}}\right\rvert\geq\zeta}\right)$
	$\displaystyle\leq\frac{1}{\zeta}\mathbb{E}\left({\sup\limits_{t\in[0,\log_{\rho}N^{*}]}\left\lvert{\frac{{{X}}^{N}_{t}({\mathbf{s}})}{\rho^{t}}-\frac{{{B}}_{t}({\mathbf{s}})}{\rho^{t}}}\right\rvert}\right)$
	$\displaystyle\leq\frac{1}{\zeta}\sum_{t=0}^{\log_{\rho}N^{*}}~{}\mathbb{E}\left({\left\lvert{\frac{{{X}}^{N}_{t}({\mathbf{s}})}{\rho^{t}}-\frac{{{B}}_{t}({\mathbf{s}})}{\rho^{t}}}\right\rvert}\right).$

Thus, it is sufficient to show that

\lim\limits_{N\rightarrow\infty}~{}\left\{{~{}\sum_{t=0}^{\log_{\rho}(N^{*}/\sqrt{N})}~{}\mathbb{E}\left({\left\lvert{{{X}}^{N}_{t}({\mathbf{s}})-{{B}}_{t}({\mathbf{s}})}\right\rvert}\right)}\right\}=0,

and

\lim\limits_{N\rightarrow\infty}~{}\left\{{~{}\sum_{t=0}^{\log_{\rho}N^{*}}~{}\mathbb{E}\left({\left\lvert{\frac{{{X}}^{N}_{t}({\mathbf{s}})}{\rho^{t}}-\frac{{{B}}_{t}({\mathbf{s}})}{\rho^{t}}}\right\rvert}\right)}\right\}=0.

From Lemma 5.2, we have,

\displaystyle\mathbb{E}\left({\left\lvert{{{X}}_{t}^{N}({\mathbf{s}})-{{B}}_{t}({\mathbf{s}})}\right\rvert}\right)

\displaystyle\leq\tilde{e}\frac{\rho^{2t}}{N},

(28)

Taking sum from $t=0$ to $t=\log_{\rho}(N^{*}/\sqrt{N})$ above, we observe that

\sum\limits_{i=0}^{\log_{\rho}(N^{*}/\sqrt{N})}~{}\mathbb{E}\left({\left\lvert{{{X}}^{N}_{i}({\mathbf{s}})-{{B}}_{i}({\mathbf{s}})}\right\rvert}\right)\leq\frac{\rho^{2\log_{\rho}(N^{*}/\sqrt{N})}}{N}\frac{\tilde{e}}{\rho^{2}-1},

recall that $N^{*}=N/\log{N}$ . Therefore,

\lim\limits_{N\rightarrow\infty}~{}\mathbb{E}\left({\sup\limits_{t\in[0,\log_{\rho}(N^{*}/\sqrt{N})]}\left\lvert{{{X}}_{t}^{N}({\mathbf{s}})-{{B}}_{t}({\mathbf{s}})}\right\rvert}\right)=0.

Using (28) to bound $\sum_{i=0}^{t}\frac{1}{\rho^{i}}\Big{(}\mathbb{E}(\left\lvert{{{X}}_{i}^{N}({\mathbf{s}})-{{B}}_{i}({\mathbf{s}})}\right\rvert)\Big{)}$ we have,

\displaystyle\sum_{i=0}^{t}\frac{1}{\rho^{i}}\Big{(}\ \mathbb{E}(\left\lvert{{X}_{i}^{N}({\mathbf{s}})-{B}_{i}({\mathbf{s}})}\right\rvert)\Big{)}

\displaystyle\leq\frac{\rho^{t}}{N}\frac{\tilde{e}}{\rho-1}.

Setting $t=\log_{\rho}N^{*}$ , we observe that

\sum\limits_{i=0}^{t}~{}\mathbb{E}\left({\left\lvert{\frac{{X}^{N}_{i}({\mathbf{s}})}{\rho^{i}}-\frac{{B}_{i}({\mathbf{s}})}{\rho^{i}}}\right\rvert}\right)\leq\frac{1}{\log N}\frac{\tilde{e}}{\rho-1}.

Thus,

\lim\limits_{N\rightarrow\infty}~{}\mathbb{E}\left({\sup\limits_{t\in[0,\log_{\rho}N^{*}]}\left\lvert{\frac{{X}^{N}_{t}({\mathbf{s}})}{\rho^{t}}-\frac{{B}_{t}({\mathbf{s}})}{\rho^{t}}}\right\rvert}\right)=0.

C.5 Proof of Theorem 5.4

We will prove Theorem 5.4 through induction. For the ease of notation we hide $W$ in representing the mean field process ${\mathbf{\bar{}}{\mu}}_{t}(W)$ and $\bar{c}_{t}(W,a)$ in the proof. Assumption 2 in Section 5.6 forms the base case for induction. Assume that ${\mathbf{\mu}}^{N}_{t}\xrightarrow{\text{P}}{\mathbf{\bar{}}{\mu}}_{t}$ . From the definition we know that for all ${\mathbf{s}}=({a},{d})\in\mathcal{S}$ ,

\mu^{N}_{t+1}({\mathbf{s}})=\frac{1}{N}\sum\limits_{n=1}^{N}\mathbbm{1}\left(S^{t+1}_{n}={\mathbf{s}}\right).

Recall that

\mathbb{P}\left(S^{t+1}_{n}={\mathbf{s}}|\mathcal{\mu}^{N}_{t}\right)={p}(S^{t}_{n},{\mathbf{s}},c^{N}_{t}({a})),

Define $\mathcal{M}^{N}_{t+1}({\mathbf{s}})$ to be

\frac{1}{N}\sum\limits_{n=1}^{N}\mathbbm{1}\left(S^{t+1}_{n}={\mathbf{s}}\right)-{p}(S^{t}_{n},{\mathbf{s}},c^{N}_{t}({a})).

Clearly,

\mu^{N}_{t+1}({\mathbf{s}})=\sum\limits_{{\mathbf{s}^{\prime}}\in\mathcal{S}}~{}\left[{p}({\mathbf{s}^{\prime}},{\mathbf{s}},c^{N}_{t}({a}))\mu^{N}_{t}({\mathbf{s}^{\prime}})~{}+~{}\mathcal{M}^{N}_{t+1}({\mathbf{s}})\right].

(29)

Observe that $\mathcal{M}^{N}_{t+1}({\mathbf{s}})$ is a sequence of $0$ -mean random variables whose variance converges to $0$ as $N\rightarrow\infty$ . Therefore

\mathcal{M}^{N}_{t+1}({\mathbf{s}})\xrightarrow{P}0.

Assuming that as $N\to\infty$ ,

{p}({\mathbf{s}^{\prime}},{\mathbf{s}},c^{N}_{t}({a}))\mu^{N}_{t}({\mathbf{s}^{\prime}})\xrightarrow{P}{p}({\mathbf{s}^{\prime}},{\mathbf{s}},\bar{c}_{t}({a}))\bar{\mu}_{t}({\mathbf{s}^{\prime}}),

(30)

we get

{\mu}_{t+1}^{N}({\mathbf{s}})\xrightarrow{P}\sum\limits_{{\mathbf{s}^{\prime}}\in\mathcal{S}}{p}({\mathbf{s}^{\prime}},{\mathbf{s}},\bar{c}_{t}({a}))\bar{\mu}_{t}({\mathbf{s}^{\prime}})~{}=:~{}\bar{\mu}_{t+1}({\mathbf{s}}).

To see (30), observe that ${p}({\mathbf{s}^{\prime}},{\mathbf{s}},c^{N}_{t-1}({a}))\mu^{N}_{t}({\mathbf{s}^{\prime}})$ , can be re-written as

	$\displaystyle{p}({\mathbf{s}^{\prime}},{\mathbf{s}},\bar{c}_{t}({a}))\mu^{N}_{t}({\mathbf{s}^{\prime}})+$
	$\displaystyle({p}({\mathbf{s}^{\prime}},{\mathbf{s}},c^{N}_{t}({a}))-{p}({\mathbf{s}^{\prime}},{\mathbf{s}},\bar{c}_{t}({a})))\mu^{N}_{t}({\mathbf{s}^{\prime}}).$

Taking limit as $N\rightarrow\infty$ for the first term above, we get

{p}({\mathbf{s}^{\prime}},{\mathbf{s}},\bar{c}_{t}({a}))\mu^{N}_{t}({\mathbf{s}^{\prime}})\xrightarrow{P}{p}({\mathbf{s}},{\mathbf{s}^{\prime}},\bar{c}_{t}({a}))\bar{\mu}_{t}({\mathbf{s}^{\prime}}),

since $\mu^{N}_{t}({\mathbf{s}})$ converges to $\bar{\mu}_{t}$ in probability, by induction hypothesis. Moreover, the second term equals $0$ . To see this, observe that the second term above can be bounded by

\left\lvert{{p}({\mathbf{s}^{\prime}},{\mathbf{s}},c^{N}_{t}({a}))-{p}({\mathbf{s}^{\prime}},{\mathbf{s}},\bar{c}_{t}({a}))}\right\rvert.

(31)

Recall that

c^{N}_{t}({a})=\sum\limits_{{\mathbf{q}}=(\tilde{a},\tilde{d})\in\mathcal{S}\setminus\mathcal{U}}\mu^{N}_{t}({\mathbf{q}})\mathbbm{1}(\text{type }{\mathbf{q}}\text{ is infectious})\psi_{\tilde{a},{a}}\beta_{\tilde{a}},

we can see that $c^{N}_{t}({a})$ is a continuous function of $\mu^{N}_{t}$ and $\mu^{N}_{t}\xrightarrow{P}\bar{\mu}_{t}$ by induction hypothesis. Therefore

c^{N}_{t}({a})~{}\xrightarrow{P}~{}\sum\limits_{{\mathbf{q}}\in\mathcal{S}}\bar{\mu}_{t}(\bm{q})\mathbbm{1}(\text{state }{\mathbf{q}}\text{ is infectious})\psi_{\tilde{a},{a}}\beta_{\tilde{a}}=~{}\bar{c}_{t}({a}).

Observe that $c^{N}_{t}$ and $\bar{c}_{t}$ are bounded (by definition). Moreover, ${p}$ is uniformly continuous in its arguments since it is a continuous function defined on a compact set. Thus, taking limit as $N\rightarrow\infty$ , (31) converges to zero since ${p}({\mathbf{s}^{\prime}},{\mathbf{s}},c^{N}_{t}({a}))\xrightarrow{P}{p}({\mathbf{s}^{\prime}},{\mathbf{s}},\bar{c}_{t}({a}))$ uniformly as $N\rightarrow\infty$ .

Appendix D Intermediate (group-structure) model for asymptotic analysis

We consider the below intermediate (group-structure) model:

•

The population comprises of $N$ groups each of size $g$ . A group interacts with another group only through community. So an infectious individual in one group can infect a person in another group only through community interactions. All the other interactions (home, school, workplace) of individuals are within a group. Each group may be thought as comprising many households where children may go to schools and adults may go to work within the same group. The model has a single community.
•

Each individual $j\in\{1,2,...,g\}$ in a group is probabilistically identical across all the groups, that is they have identical interaction spaces (home, school, workplaces, community) and same age, disease progression profile, community transmission rates, mobility in the community etc.
•

Further, each person is at distance 1 from the centre of the community.

Again, generalizing to multiple communities, including modelling local transport as a single commonplace of interaction amongst the population, having multiple but finite types of groups with probabilistically different individuals, and having individuals at finitely many different distances from the center, involve more notation but are otherwise straightforward.

We first present the dynamics of the epidemic process and then outline the dynamics of the associated branching process. We then state and prove the technicalities needed for our main results from Section 5.

D.1 Epidemic process dynamics

Notation: The city comprises of $N$ groups each of size $g$ .

•

Every group is uniquely indexed $n\in[1,N]$ .
•

Individual $j\in[1,g]$ in group $n\in[1,N]$ is indexed $(j,n)$ . Recall that the individuals $(j,\cdot)\in[1,g]\times[1,N]$ are probabilistically identical across all the groups.
•

Each group at any time may be classified by a type ${\mathbf{s}}$ based on the disease state of each individual $j\in[1,g]$ . Thus, ${\mathbf{s}}\in\mathcal{D}^{g}$ . Let $\mathbf{u}$ denote the group type where all the individuals are susceptible. Let $\eta=\left\lvert{\mathcal{D}^{g}\setminus\{u\}}\right\rvert$ .
•

Denote the number of groups of type ${\mathbf{s}}$ at time $t$ by $X_{t}^{N}({\mathbf{s}})$ and set ${\mathbf{X}}_{t}^{N}=(X_{t}^{N}({\mathbf{s}}):{\mathbf{s}}\in\mathcal{D}^{g}\setminus\mathbf{u})$ . Then, ${{\mathbf{X}}_{t}^{N}\in\mathbbm{Z^{+}}^{\eta}}$ .
•

Let $A_{t}^{N}=\sum_{{\mathbf{s}}\in\mathcal{D}^{g}\setminus\{\mathbf{u}\}}X_{t}^{N}({\mathbf{s}})$ denote the total number of affected groups in the system $N$ at or before time $t$ .

Dynamics : At time zero, for each $N$ , ${{\mathbf{X}}_{0}^{N}}$ is initialised by setting a suitably selected small and fixed number of people randomly from some distribution $\mu_{0}(N)$ and assigning them to the exposed state. All others are set as susceptible. The distribution $\mu_{0}(N)$ is assumed to be independent of $N$ so we can set ${{\mathbf{X}}_{0}^{N}}={\mathbf{X}}_{0}$ for all N.

Given ${{\mathbf{X}}_{t}^{N}}$ , ${{\mathbf{X}}_{t+1}^{N}}$ is arrived at through two mechanisms. (For the ease of notation we will set $\Delta t=1$ .) i) Infectious individuals at time $t$ who make Poisson distributed infectious contacts with the rest of the population (at homes, schools, workplaces within each individual’s group and in the community) moving the contacted susceptible population to exposed state, and ii) through population already affected moving further along in their disease state. Specifically,

•

We assume that an infectious individual $(j,n)\in[1,g]\times[1,N]$ spreads the disease in the community with transmission rate $\beta_{j}^{c}$ , in his home with transmission rate $\beta_{j}^{h}$ , in his school with transmission rate $\beta_{j}^{s}$ , in his workplace with transmission rate $\beta_{j}^{w}$ .
•

Therefore, the number of infectious contacts it makes with all the individuals having index $(\tilde{j},\cdot)$ (both susceptible and affected) in the community in one time step is Poisson distributed with rate ${\beta_{j}^{c}}/g$ . Once the number of infectious contacts for a particular index $(\tilde{j},\cdot)$ have been generated, each contact is made with an individual selected uniformly at random from all the $N$ individuals with index $(\tilde{j},\cdot)$ .
•

The number of infectious contacts an infectious individual $(j,\cdot)$ makes with each individual in his home (school, workplace) in one time step is Poisson distributed with rate $\frac{\beta_{j}^{h}}{n^{h}_{j}}$ ( $\frac{\beta_{j}^{s}}{n^{s}_{j}}$ , $\frac{\beta_{j}^{w}}{n^{w}_{j}}$ ), where $n^{h}_{j}$ ( $n^{s}_{j}$ , $n^{w}_{j}$ ) is the number of individuals in the home (school, workplace) of individual $(j,\cdot)$ .
•

If an already affected individual has one or more contact from an infectious individual, its type remains unchanged. On the other hand, if a susceptible individual has at least one contact from any infectious individual, it gets exposed at time $t+1$ .
•

Once an individual gets exposed, its disease progression is the same as that defined for the simplified model (Section 5).
•

Thus, for a group $n\in[1,N]$ in state $s$ , at time $t$ , susceptible individuals may transition to exposed state and already affected individuals might transition to some other disease state. Let the resulting type of group $n$ be $\tilde{s}$ at time $t+1$ after accounting for all the transitions above. Therefore we decrease ${{\mathbf{X}}_{t+1}^{N}}(\mathbf{s})$ by 1 and increase the ${{\mathbf{X}}_{t+1}^{N}}(\mathbf{\tilde{}}{s})$ by 1.

Defining random variables associated with epidemic process : Suppose an individual $(j,n)$ gets exposed at time $\tau_{(j,n)}$ . Let $W_{(j,n)}(i)$ be the collection of all the random variables associated with individual $(j,n)$ at time $\tau_{(j,n)}+i$ . These random variables are:

•

Those associated with the disease state transition of the individual $(j,n)$ , at time $\tau_{(j,n)}+i$ .
•

Those associated with number of infectious contacts it makes with individual $(\tilde{j},G)$ within his group at time $\tau_{(j,n)}+i$ ), through home, school and workplace.
•

Those associated with number of infectious contacts it makes with individuals $(\tilde{j},\cdot)$ at time $\tau_{(j,n)}+i$ , and the associated random variables uniformly distributed to choose amongst $N$ individuals with index $(\tilde{j},\cdot)$ with whom the infectious contact will be made.

Further, we denote the collection of all the random variables $W_{(j,n)}(i)$ for all $i$ by $W{(j,n)}$ . Observe that for all the individuals $(j,\cdot)$ , each $W_{(j,\cdot)}$ has the same distribution.

D.2 The associated branching process dynamics

Branching process has identical group structure as the epidemic process, that is, there are $g$ individuals in each group and the $j\in[1,g]$ individual in branching process is probablistically identical to corresponding individual $j$ in epidemic process. For each $t$ , let ${{{\mathbf{B}}_{t}}\in\mathbbm{Z^{+}}^{\eta}}$ , ${{\mathbf{B}}_{t}}=(B_{t}({\mathbf{s}}):{\mathbf{s}}\in\mathcal{D}^{g}\setminus\{\mathbf{u}\})$ where $B_{t}({\mathbf{s}})$ denotes the number of groups of type ${\mathbf{s}}$ at time $t$ in the branching process. Let $A_{t}^{B}=\sum_{{\mathbf{s}}\in\mathcal{D}^{g}\setminus\{\mathbf{u}\}}B_{t}({\mathbf{s}})$ denote the total number of affected groups till time $t$ in the branching process.

Dynamics: At time zero ${{\mathbf{B}}_{0}}={{\mathbf{X}}_{0}}$ . Given ${\bm{B}_{t}}$ , we arrive at ${\bm{B}_{t+1}}$ as follows:

•

At time $t$ , every infectious individual $j$ in all the groups, for all $j\in[1,g]$ , gives birth to independent Poisson distributed groups of type $\mathbf{s}$ (type $\mathbf{s}$ is such that individual $\tilde{j}$ is exposed and all other individuals susceptible) at time $t+1$ with rate $\beta_{j}/g$ for each $\tilde{j}\in[1,g]$ . ${\bm{B}_{t+1}}(\mathbf{s})$ is increased accordingly.
•

A group of type $\mathbf{s}\in\mathcal{D}^{g}\setminus\{u\}$ can undergo transition to some type $\mathbf{\tilde{}}{s}\in\mathcal{D}^{g}\setminus\{u\}$ through following two processes:

i) An infectious individual $j$ makes infectious contacts with each individual in his home (school, workplace) in one time step which are Poisson distributed with rate $\frac{\beta_{j}^{h}}{n^{h}_{j}}$ ( $\frac{\beta_{j}^{s}}{n^{s}_{j}}$ , $\frac{\beta_{j}^{w}}{n^{w}_{j}}$ ), where $n^{h}_{j}$ ( $n^{s}_{j}$ , $n^{w}_{j}$ ) is the number of individuals in the home (school, workplace) of individual $j$ . A susceptible individual might get exposed through these infectious contacts.

ii) An already affected individual might transition from one disease state to other (disease progression of the individual has same transition probabilities as in each epidemic process).

Correspondingly, we decrease $B_{t+1}({\mathbf{s}})$ by 1 and increase $B_{t+1}({\mathbf{\tilde{}}{s}})$ 1. Also, let the probability of transition be denoted by $P(\mathbf{s},\mathbf{\tilde{}}{s})$ .

The expected offsprings matrix $K\in{\Re^{+}}^{\eta\times\eta}$ for $\{{{\mathbf{B}}_{t}}\}$ : Let each entry ${K}({\mathbf{s}},{\mathbf{q}})$ of $K$ denote the expected number of type ${\mathbf{q}}$ offsprings of a single type ${\mathbf{s}}$ group in one time step. Let $\mathbf{e}_{j}\in\mathcal{D}^{g}$ denote the group type with individual $(j,\cdot)$ in exposed state and the remaining individuals in susceptible state. Let ${\mathcal{H}}\subset\mathcal{D}^{g}$ denote the group states having at least one individual who is infectious or may become infectious in subsequent time steps (that is, types with atleast one individual in exposed, infective or symptomatic state). Let $\mathcal{H}^{c}$ denote its complement. Groups in $\mathcal{H}^{c}$ do not contribute to the community infection. Let ${\mathcal{\tilde{H}}}\subset\mathcal{S}$ denote all the group types with atleast one individual in a disease state that is infectious (that is infective or symptomatic state). Let $\hat{\eta}=\left\lvert{\mathcal{H}}\right\rvert$ .

As described above, a group of type ${\mathbf{s}}\in\mathcal{S}\setminus\{\mathbf{u}\}$ , may give birth to groups $\mathbf{e}^{j},j\in[1,g]$ if it has an infectious individual, and/or may itself transition to some other type in one time step. Then, $K$ can be written as,

	$\displaystyle K({\mathbf{s}},{\mathbf{e}_{\tilde{j}}})$	$\displaystyle=P({\mathbf{s}},{\mathbf{e}_{\tilde{j}}})+\sum_{j\in[1,g]:j\text{ is infectious }}\beta_{j}^{c}/g\quad\quad\forall~{}{\mathbf{s}}\in\mathcal{\tilde{H}},~{}\forall~{}\tilde{j}\in[1,g]$		(32)
	$\displaystyle K({\mathbf{s}},{\mathbf{q}})$	$\displaystyle=P({\mathbf{s}},{\mathbf{q}})\text{ otherwise.}$		(32)

Indexing of individuals in branching process : We will introduce a unique index $(j,\cdot,\cdot)$ for each individual $j$ in different groups of the branching process. This indexing will help define all the random variables associated with branching process and later help in the coupling the epidemic and the branching process. Although our indexing of individuals in branching process will depend on $N$ , the branching process in itself is independent of $N$ and the only difference is indexing of the individuals, that may depend upon $N$ .

Rules for assigning the second index in $(j,\cdot,\cdot)$ :

•

Recall that at time zero, ${\mathbf{B}}_{0}={\mathbf{X}}_{0}$ . Therefore each individual $(j,n)$ in epidemic process is matched to a corresponding individual $j$ in branching process and the individual is given index $(j,n,\cdot)$ .

All the individuals $\tilde{j}$ within the group of an individual with index $(j,n,\cdot)$ are given index $(\tilde{j},n,\cdot)$ , for all $j\in[1,g]$ , $\tilde{j}\in[1,g]$ and $n\in[1,N]$ .
•

Recall that at any time $t$ , an infectious individual $(j,\cdot,\cdot)$ for all $j\in[1,g]$ , gives birth to independent Poisson distributed groups of type $\mathbf{e}_{\tilde{j}}$ with rate $\beta_{j}/g$ for each $\tilde{j}\in[1,g]$ . The exposed individual $\tilde{j}$ in this offspring group is given an index $(\tilde{j},\tilde{n},\cdot)$ , where $\tilde{n}$ is selected uniformly at random from $\{1,...,N\}$ .

Again all the individuals $\tilde{j}$ within the group of an individual with index $(j,n,\cdot)$ are given index $(\tilde{j},n,\cdot)$ , for all $j\in[1,g]$ , $\tilde{j}\in[1,g]$ and $n\in[1,N]$ .

Rules for assigning third index in $(j,\cdot,\cdot)$ :

•

All the individuals within a group again have the same third index.
•

When $i^{th}$ group containing individual with index $(\cdot,n,\cdot)$ is born, the individuals within that group are given index $(j,n,i)$ for all $j\in[1,g]$
•

If some groups containing individual with index $(\cdot,n,\cdot)$ are born at same time, they are randomly arranged and given indexes sequentially.

Defining random variables associated with branching process and the above defined indexing: Let $W_{(j,\cdot,\cdot)}$ be the vector of random variables associated with individual $(j,\cdot,\cdot)$ . This includes random variables associated with the disease progression and infectious contacts of the individual $(j,\cdot,\cdot)$ . Suppose individual $(j,\cdot,\cdot)$ is exposed at time $\tau_{(j,\cdot,\cdot)}$ then $W_{(j,\cdot,\cdot)}(i)$ denotes all the random variables associated with the individual $(j,\cdot,\cdot)$ , at time $\tau_{(j,\cdot,\cdot)}+i$ . Specifically, $W_{(j,\cdot,\cdot)}(i)$ consists of the following random variables:

•

Those associated with the disease state transition of the individual $(j,\cdot,\cdot)$ at time $\tau_{(j,\cdot,\cdot)}+i$ .
•

Those associated with number of infectious contacts it makes with individual $(\tilde{j},\cdot,\cdot)$ within his group at time $\tau_{(j,\cdot,\cdot)}+i$ through home, school and workplace.
•

Those associated with number of offspring groups it has of type $e^{\tilde{j}}$ at time $\tau_{(j,\cdot,\cdot)}+i$ .
•

Those associated with choosing second index of the exposed individual $(\tilde{j},\cdot,\cdot)$ in the new group above, which is selected uniformly at random from $\{1,...,N\}$ .

Further, we denote the collection of all the random variables $W_{(j,\cdot,\cdot)}(i)$ for all $i$ by $W{(j,\cdot,\cdot)}$ . Observe that for all the individuals $(j,\cdot,\cdot)$ , $W_{(j,\cdot,\cdot)}$ has same distribution and is equal to the distribution of $W_{(j,\cdot)}$ defined for the epidemic process.

D.3 Coupling

Before defining the coupling, we explain that why do we need a more delicate coupling in this model. Recall that an important consequence of the coupling in simplified model (Section 5) was the inequality $A^{N}_{t}\leq A^{B}_{t}$ . This inequality was in turn critical in proving Lemma 5.2.

Figure 24 demonstrates a scenario in which the inequality $A^{N}_{t}\leq A^{B}_{t}$ is violated, if we do not carefully couple the two processes. To follow this figure, suppose that each group contains only 3 types of people (rectangle, circle and triangle) living in the same home. The disease spread evolution in Figure 24 is explained below:

•

$t=0$ : Two infected individuals $1$ and $2$ in epidemic and branching processes with each individual in a separate group. Each of the individuals in epidemic process is coupled to the corresponding individual in the branching process.
•

$t=1$ : A rare event happens in the epidemic process - individual 1 infects the susceptible individual 3 in the same group as individual 2. Correspondingly, in the branching process individual 1 gives birth to a new group with the only exposed individual being 3. The individuals 3 in each of the processes are coupled to each other.
•

$t=2$ : In epidemic process, individual 3 makes an infectious contact with an individual 2 (circle) within his group - home contact. But, as the individual 2 was already infected, it has no impact. On the other hand, in the branching process, the home contact of the corresponding individual 3 infects the susceptible individual 4 (circle). Now, individual 4 is an uncoupled individual in the branching process.
•

$t=3$ : Nothing happens in epidemic process. Individual 4, through home contact, infects individual 5 - rectangle. Creating again an uncoupled individual.
•

$t=4$ : In epidemic process, individual 3 infects an individual (rectangle) within his group through home contact. But, in the branching process, corresponding individual 3 could not infect the individual 5 (rectangle) as he was already infected at t=3. This results in an uncoupled individual in epidemic process. But an uncoupled individual can potentially cause many infections through community, potentially leading to the violation of $A^{N}_{t}\leq A^{B}_{t}$ .

To avoid such a scenario we allow for coupling across times, which means that an individual $(j,n)$ in epidemic process can be coupled to an individual $(j,n,i)$ in branching process, with $\tau_{(j,n,i)}\leq\tau_{(j,n)}$ . For example, we will couple individual 5 in epidemic process at $t=4$ with individual 5 in branching process at $t=3$ .

We will couple the epidemic process and the branching process so that that each individual $(j,n)$ in the epidemic process is coupled with some individual $(j,n,i)$ of the branching process.

By coupling an individual $(j,n)$ with individual $(j,n,i)$ , we mean that $W_{(j,n)}=W_{(j,n,i)}$ , that is, the random variables $W_{(j,n)}$ used by individual $(j,n)$ after he gets exposed are same as random variables $W_{(j,n,i)}$ used by individual $(j,n,i)$ after he gets exposed. Therefore, the coupled individuals in each of these processes follow the same disease progression (using same randomness). Further, when infectious ( $i$ time steps after they get exposed, respectively), they generate identical Poisson number of contacts at their home, school and workplace. Additionally, when infectious, in community they generate identical Poisson number of contacts with individual $(\tilde{j},\tilde{n})$ (in epidemic process) and offspring groups $e^{\tilde{j}}$ with exposed individual having index $(\tilde{j},\tilde{n},i)$ (in branching process). As both are Poisson distributed with rate $\beta^{c}_{j}/gN$ .

Individuals in the epidemic process and branching process are coupled as follows:

1.

Recall that at time zero, ${\mathbf{B}}_{0}={\mathbf{X}}_{0}$ and each individual $(j,n)$ in epidemic process was matched to a corresponding individual $(j,n,\cdot)$ in branching process. Therefore at time zero, we couple each exposed individual $(j,n)$ of the epidemic process with corresponding individual $(j,n,\cdot)$ in branching process.
2.

Suppose an individual $(j,n)$ in epidemic process infects a susceptible individual $(\tilde{j},\tilde{n})$ through community. The corresponding coupled individual $(j,n,i)$ in branching process would also have given birth to an exposed individual $(\tilde{j},\tilde{n},\tilde{i})$ in a completely susceptible group. If the individual $(\tilde{j},\tilde{n})$ in epidemic process was not already coupled with some individual of branching process then it will be coupled to $(\tilde{j},\tilde{n},\tilde{i})$ .
3.

Suppose a group in branching process has a coupled individual $(j,n,i)$ and a new individual $(\tilde{j},n,i)$ is exposed within this group. If the individual $(\tilde{j},n)$ of the epidemic process was not already coupled with some individual of the branching process then it will be coupled to $(\tilde{j},n,i)$ .
4.
All the other individuals in the branching process remain uncoupled. Specifically :
1. (a)
  
  When a infectious individual in epidemic process makes a community contact with an already affected individual, the offspring of the corresponding coupled individual in the branching process is uncoupled.
2. (b)
  
  The community offsprings of the uncoupled individuals in branching process are uncoupled.
3. (c)
  
  If a group in branching process does not contain any coupled individuals, all the individuals within that group will remain uncoupled forever - or - If a group in branching process has a coupled individual, the first exposed individual in that group must have been coupled.

Observe that coupling rules (1), (2) and (3) ensure that all the individuals $(j,n)$ in epidemic process are coupled with some individual $(j,n,i)$ in branching process, for all $j\in[1,g]$ and $n\in[1,N]$ .

Defining “Completely coupled groups” “multiple hit groups”, “Ghost groups”:

•

Completely coupled groups: At time $t$ , a group $n$ in epidemic process is called a completely coupled group if each affected individual $(j,n)$ of the group $n$ is coupled with an individual $(j,n,i)$ ( $i$ being same for all $j$ ) in the branching process. Additionally, each coupled individual $(j,n)$ in epidemic process should be exposed at same time as the corresponding individual $(j,n,i)$ in the branching process. The corresponding group in the branching process is also called completely coupled group.
•

All other groups in epidemic process are called “multiple hit groups”. While in the branching process are called “ghost groups”.
•

At time $t$ , if an infectious individual $(j,n)$ in the epidemic process hits some susceptible individual in a completely coupled group, it becomes a “multiple hit group”. The corresponding completely coupled group in the branching process becomes a “ghost group”. Additionally, the coupled individual $(j,n,i)$ in branching process will produce another ghost group.
•

At time $t$ , if an infectious individual $(j,n)$ in the epidemic process hits some affected individual in a completely coupled group, the group remains completely coupled to the corresponding group in the branching process. However, the coupled individual $(j,n,i)$ in branching process will still produce a ghost group.

Observe that, the number of completely coupled groups at any time $t$ is same in both the processes. Also, a completely coupled group in epidemic process at time $t$ has same state as the corresponding group in the branching process and vice versa. Therefore only multiple hit groups and ghost groups contribute to the difference in $X_{t}^{N}({\mathbf{s}})$ and $B_{t}({\mathbf{s}})$ .

Denote the number of new ghost groups born in branching process due to an infectious individual hitting some individual in a good group in epidemic process at each time $t$ by $G_{t}^{N}$ . Observe that, all the ghost groups at time $t$ in branching process are either new “ ghost groups” born at time $t$ or descendants of the new“ ghost groups” born before time $t$ . Let $D_{i,t-i}^{N}$ denote all the descendants of $G_{i}^{N}$ after $t-i$ time steps, i.e. at time $t$ .

For the results in Section 5.5 and Section 5.6, the only major change from the previous proofs (Appendix C) is in the proof of Lemma 5.2. Other proofs remain similar and are straightforward.

D.4 Proof of Lemma 5.2

Before proceeding to the proof of Lemma 5.2, we first prove below result :

{A_{t}^{N}}\leq{A^{B}_{t}}.

(33)

Proof of (33): As each individual $(j,n)$ in epidemic process is coupled with some individual $(j,n,i)$ in branching process, for all $j\in[1,g]$ and $n\in[1,N]$ . Therefore ${A_{t}^{N}}\leq{A^{B}_{t}}$ , follows if we show that every individual $(j,n)$ in epidemic process is exposed at a time equal or later than the corresponding coupled individual $(j,n,i)$ in branching process. We prove above by induction on $t$ .

•

Base case: It is clearly true at $t=0$ , by the coupling rule (1).
•

Induction hypothesis: At time $t$ all the exposed individuals $(j,n)$ in epidemic process are exposed at a time equal or later than the corresponding coupled individual $(j,n,i)$ in branching process.
•
$\mathbf{t}\implies\mathbf{t}+1$ : For the sake of contradiction, assume that an individual $(\tilde{j},\tilde{n})$ gets exposed at time $t+1$ for whom there is no eligible individual $(\tilde{j},\tilde{n},\cdot)$ with $\tau_{(\tilde{j},\tilde{n},\cdot)}\leq t+1$ to couple in branching process. There are two possible cases :
- –
  
  Individual $(\tilde{j},\tilde{n})$ got exposed through a community contact of individual $(j,n)$ . From induction hypothesis, we have that corresponding coupled individual $(j,n,i)$ was exposed earlier than $(j,n)$ . Therefore, $(j,n,i)$ must have given birth to an exposed individual $(\tilde{j},\tilde{n},\cdot)$ till time $t+1$ . Clearly, by rule (2), we have a contradiction.
- –
  
  Individual $(\tilde{j},n)$ got exposed through a (home/school/workplace) contact of individual $(j,n)$ . From induction hypothesis, we have that corresponding coupled individual $(j,n,i)$ was exposed earlier than $(j,n)$ . Therefore, $(j,n,i)$ must have contacted $(\tilde{j},n,i)$ till time $t+1$ . Therefore, $(\tilde{j},n,i)$ must be exposed till time $t+1$ . Additionally, as individual $(j,n,i)$ is coupled, by rule (4-c), we have that the first exposed individual in that group must also be coupled. Therefore, $(\tilde{j},n,i)$ is eligible for coupling by rule (3) and got exposed at time $\leq t+1$ , contradiction.

Some notation is needed to aid in proving Lemma 5.4. Let $I_{t}^{N}$ denote the number of infectious individuals in epidmeic process at time $t$ . Clearly, $I_{t}^{N}\leq gA_{t}^{N}$ , where $A_{t}^{N}$ is the number of affected groups in epidemic process.

Recall that the number of new ghost groups born in branching process at each time $t$ due to an infectious individual hitting some individual in a good group in epidemic process are denoted by $G_{t}^{N}$ . As mentioned earlier and $D_{i,t-i}^{N}$ denote all the descendants of $G_{i}^{N}$ after $t-i$ time steps, i.e. at time $t$ . Let $H_{t}^{N}$ denote the total number of ghost groups in the branching process at time $t$ . Then,

H_{t}^{N}\leq\sum_{i=1}^{t}D_{i,t-i}^{N}.

(34)

As the number of completely coupled groups at any time $t$ in branching process and epidemic process is same and ${A_{t}^{N}}\leq{A^{B}_{t}}$ , therefore the number of multiple hit groups in epidemic process at any time $t$ is less than number of ghost groups in branching process. Therefore, for all ${\mathbf{s}}\in\mathcal{D}^{g}\setminus\{\mathbf{u}\}$

\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert\leq 2H^{N}_{t}.

(35)

From (34) and (35) we have

\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert\leq 2\sum_{i=1}^{t}D_{i,t-i}^{N}.

(36)

Before proceeding with further analysis, we assume two results (37) and (38), which we will show later.

For some constant $\hat{\beta}>0$ ,

\sum_{i=1}^{t}\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=0}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{B})^{2}}\right){\mathbf{1}},

(37)

where $A_{i}^{B}$ is total number of affected groups in branching process at time $i$ and ${\mathbf{1}}\in{\Re^{+}}^{\eta}$ is a vector with each entry equal to 1.

For some constant $\tilde{c}$ ,

\mathbb{E}\left({\left({A^{B}_{i}}\right)^{2}}\right)\leq\tilde{c}\rho^{2i}.

(38)

Assuming (37) to be true, from (36) and (37) we have

\mathbb{E}\left({\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert}\right)\leq\frac{2\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=0}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{B})^{2}}\right){\mathbf{1}}.

(39)

As $\lim_{t\to\infty}\frac{K^{t}}{\rho^{t}}={\mathbf{u}}{\mathbf{v}}^{T}$ (from Theorem 5.2), therefore there exists $K_{3}\in{\Re^{+}}^{\eta\times\eta}$ such that for all $t\geq 0$

K^{t}\leq{\rho}^{t}K_{3}.

(40)

Assuming (38) to be true, from (38), (39) and (40) we have,

	$\displaystyle\mathbb{E}\left({\left\lvert{{{B}}_{t}({\mathbf{s}})-{{X}}^{N}_{t}({\mathbf{s}})}\right\rvert}\right)$	$\displaystyle\leq\frac{2\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=0}^{t}\rho^{t-i}K_{3}\tilde{c}\rho^{2i-2}{\mathbf{1}}$
		$\displaystyle\leq\frac{2\hat{\beta}}{N}{\mathbf{1}}^{T}K_{3}\tilde{c}\frac{\rho^{2t-1}}{(\rho-1)}{\mathbf{1}}$
		$\displaystyle=\tilde{e}\frac{\rho^{2t}}{N},$

where $\tilde{e}=\bigg{(}\frac{2\hat{\beta}}{\rho}{\mathbf{1}}^{T}\frac{K_{3}\tilde{c}}{(\rho-1)}{\mathbf{1}}\bigg{)}$ .

Proof of (37): Recall that $G_{t}^{N}$ is the number of new ghost groups born from at time $t$ in the branching process. If an infectious individual makes community contact with a susceptible individual in a completely coupled group two new ghost groups are born. On the other hand, if an infectious individual makes community contact with an already affected individual in a completely coupled group, one new ghost group is born. In any case, the number of ghost groups born is atmost 2 per contact made to an affected group. Also, the number of completely coupled groups at time $t-1$ is less than or equal to $A_{t-1}^{N}$ . Let ${Y}_{t}^{N}$ be the total number of community contacts in epidemic process in one time step between time $t-1$ and $t$ . For a given contact, probability of hitting an individual in already affected group is less then $g\frac{A^{N}_{t-1}+{Y}_{t}^{N}}{N}$ . Therefore, we have,

	$\displaystyle\mathbb{E}\left({{G}^{N}_{t}}\right)$	$\displaystyle=\mathbb{E}\left({\mathbb{E}\left({{G}^{N}_{t}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({2g\frac{(A_{t-1}^{N}+{Y}_{t}^{N})}{N}{Y}_{t}^{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({2g\frac{A_{t-1}^{N}{Y}_{t}^{N}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)+\mathbb{E}\left({\mathbb{E}\left({2g\frac{({Y}_{t}^{N})^{2}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right).$

To upper bound ${Y}_{t}^{N}$ , we assume that all the individuals in affected groups in epidemic process at time $t-1$ are infectious and that community hits by each infectious individual is Poisson distributed with $\beta_{max}=\max_{j\in[1,g]}\{\beta_{j}^{c}\}$ . Then,

\mathbb{E}\left({{G}^{N}_{t}}\right)\leq 2g\mathbb{E}\left({\beta_{max}\frac{(A_{t-1}^{N})^{2}}{N}}\right)+2g\mathbb{E}\left({\frac{\beta_{max}A_{t-1}^{N}+\beta_{max}^{2}(A_{t-1}^{N})^{2}}{N}}\right).

Since $A_{t-1}^{N}\geq 1$ , and setting $\hat{\beta}=2g(\beta_{max}^{2}+2\beta_{max})$ , we have

\mathbb{E}\left({{G}^{N}_{t}}\right)\leq\hat{\beta}\mathbb{E}\left({\frac{(A_{t-1}^{N})^{2}}{N}}\right).

To upper bound the $\mathbb{E}\left({D_{i,t-i}^{N}}\right)$ , we assume that each state ${\mathbf{s}}\in\mathcal{D}^{g}\setminus\{\mathbf{u}\}$ has $G_{i}^{N}$ new ghost groups born at time $i$ . As new ghost groups $G_{i}^{N}$ born at time $i$ have descendants according to the same branching process dynamics,

\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{N})^{2}}\right){\mathbf{1}}.

Taking summation on both sides we have,

\sum_{i=1}^{t}\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=1}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{N})^{2}}\right){\mathbf{1}}.

(41)

From (33) and (41), we have

\sum_{i=1}^{t}\mathbb{E}\left({D_{i,t-i}^{N}}\right)\leq\frac{\hat{\beta}}{N}{\mathbf{1}}^{T}\sum_{i=1}^{t}{K}^{t-i}\mathbb{E}\left({(A_{i-1}^{B})^{2}}\right){\mathbf{1}}.

Proof of (38) is same as before.

Appendix E Bias in the smaller model: A justification

In this section we compare infections in a large city with the appropriately scaled version of the smaller city and show that the smaller city underestimates the reported number of infections compared to the larger city. This bias is more pronounced when the initial number of infections are small, and becomes much less significant when this initial number infected becomes large. As our analysis suggests, it is the initial variability in the infection process that causes this bias. For simplicity of presentation, we use deterministic SIR model as used in Appendix B.

First consider a city with total population $N$ . Assume that there are $i_{0}$ infected in the city at time 0. Let $i_{N}(t)$ and $r_{N}(t)$ be the number infected and recovered at time $t$ . Denote by $i^{B}(t)$ the number of infections in the associated branching process at time $t$ when starting from 1 infection at time 0. Discretized SIR model then follows these equations for all $t\geq 0$ :

s_{N}(t+1)=s_{N}(t)-\beta\frac{s_{N}(t)}{N}i_{N}(t)\Delta t,

i_{N}(t+1)=i_{N}(t)\Big{(}1+\big{(}\beta\frac{s_{N}(t)}{N}-r\big{)}\Delta t\Big{)},

r_{N}(t+1)=r_{N}(t)+r\cdot i_{N}(t)\Delta t,

s_{N}(t)+i_{N}(t)+r_{N}(t)=N.

We know that for large $N$ till time $t_{N}=\log_{\rho}\frac{N^{*}}{i_{0}}$ (recall that $N^{*}=N/\log N$ ) the city evolves approximately as a branching process. Therefore, each starting infection gives rise to an approximately independent tree of descendant infections till this time, and

i_{N}(t_{N})=\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})+o(\rho^{t_{N}}),

where $i^{B}_{j}(t_{N})$ for each $j\in[1,i_{0}]$ is distributed as an independent copy of $i^{B}(t)$ . Recall from Example 3.1 that the proportion amongst the infected and the recovered stabilises in the branching process as time increases. Hence,

{r_{N}(t_{N})}=\frac{r}{\beta-r}\sum_{j=1}^{i_{0}}{i^{B}_{j}(t_{N})}+o(\rho^{t_{N}}),

and taking expectations,

\mathbb{E}\left({i_{N}(t_{N})}\right)={i_{0}}\mathbb{E}\left({i^{B}(t_{N})}\right)+o(\rho^{t_{N}}).

$\mathbb{E}\left({i_{N}(t_{N}+1)}\right)$ can be estimated as follows,

$\displaystyle=$	$\displaystyle\mathbb{E}\left({\beta\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})\bigg{(}1-\frac{1}{N}\big{(}\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})+\sum_{j=1}^{i_{0}}r^{B}_{j}(t_{N})\big{)}\bigg{)}}\right)+$	(42)
	$\displaystyle\mathbb{E}\left({(1-r)\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})}\right)+o(\rho^{t_{N}})$
$\displaystyle=$	$\displaystyle\mathbb{E}\left({\beta\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})\bigg{(}1-\frac{1}{N}\big{(}\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})+\frac{r}{\beta-r}\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})\big{)}\bigg{)}}\right)+$
	$\displaystyle\frac{1}{N}(1-r){i_{0}}\mathbb{E}\left({i^{B}(t_{N})}\right)+o(\rho^{t_{N}})$
$\displaystyle=$	$\displaystyle\mathbb{E}\left({\beta\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})\bigg{(}1-\frac{\beta}{N(\beta-r)}\sum_{j=1}^{i_{0}}i^{B}_{j}(t_{N})\bigg{)}}\right)+$
	$\displaystyle\frac{1}{N}(1-r){i_{0}}\mathbb{E}\left({i^{B}(t_{N})}\right)+o(\rho^{t_{N}})$
$\displaystyle=$	$\displaystyle{(1+\beta-r){i_{0}}\mathbb{E}\left({i^{B}(t_{N})}\right)-\frac{\beta^{2}i_{0}}{N(\beta-r)}\mathbb{E}\left({[i^{B}(t_{N})]^{2}}\right)}-$
	$\displaystyle{\frac{\beta^{2}i_{0}(i_{0}-1)}{N(\beta-r)}\Big{[}\mathbb{E}\left({i^{B}(t_{N})}\right)}\Big{]}^{2}+o(\rho^{t_{N}})$
$\displaystyle=$	$\displaystyle{(1+\beta-r){i_{0}}\mathbb{E}\left({i^{B}(t_{N})}\right)-\frac{\beta^{2}i_{0}}{N(\beta-r)}\text{Var}(i^{B}(t_{N}))}-$
	$\displaystyle{\frac{\beta^{2}i_{0}^{2}}{N(\beta-r)}\Big{[}\mathbb{E}\left({i^{B}(t_{N})}\right)}\Big{]}^{2}+o(\rho^{t_{N}}).$

Now consider a larger city with total population $kN$ for $k>1$ . At time 0, we proportionately increase the initial infections to $ki_{0}$ . Let $i_{kN}(t)$ and $r_{kN}(t)$ be the number of infections and recovered at time $t$ in this larger city.

Again, till time $t_{kN}=\log_{\rho}\frac{kN^{*}}{ki_{0}}=\log_{\rho}\frac{N^{*}}{i_{0}}=t_{N}$ , the city evolves as a branching process. Therefore, each starting infection has an approximately independent infection tree till this time, and

i_{kN}(t_{N})=\sum_{j=1}^{ki_{0}}i^{B}_{j}(t_{N})+o(\rho^{t_{N}}),

where again each $i^{B}_{j}(t_{N})$ is distributed as an independent copy of $i^{B}(t)$ . Again,

{r_{kN}(t_{N})}=\frac{r}{\beta-r}\sum_{j=1}^{ki_{0}}{i^{B}_{j}(t_{N})}+o(\rho^{t_{N}}).

Following similar steps as earlier to estimate $\mathbb{E}\left({i_{kN}(t_{N}+1)}\right)$ , we have

$\displaystyle\mathbb{E}\left({i_{kN}(t_{N}+1)}\right)$	$\displaystyle={k(1+\beta-r){i_{0}}\mathbb{E}\left({i^{B}(t_{N})}\right)}$	(43)
	$\displaystyle{-\frac{\beta^{2}i_{0}}{N(\beta-r)}\text{Var}{(i^{B}(t_{N}))}-k\frac{\beta^{2}i_{0}^{2}}{N(\beta-r)}\Big{[}\mathbb{E}\left({i^{B}(t_{N})}\right)}\Big{]}^{2}$
	$\displaystyle+o(\rho^{t_{N}}).$

Scaling $k$ times the estimate of $i_{N}(t_{N}+1)$ in (42), we have

$\displaystyle k\mathbb{E}\left({i_{N}(t_{N}+1)}\right)$	$\displaystyle={k(1+\beta-r){i_{0}}\mathbb{E}\left({i^{B}(t_{N})}\right)}$	(44)
	$\displaystyle{-k\frac{\beta^{2}i_{0}}{N(\beta-r)}\text{Var}(i^{B}(t_{N}))-k\frac{\beta^{2}i_{0}^{2}}{N(\beta-r)}\Big{[}\mathbb{E}\left({i^{B}(t_{N})}\right)}\Big{]}^{2}$
	$\displaystyle+o(\rho^{t_{N}}).$

Observe that the second term $[\frac{\beta^{2}i_{0}}{N(\beta-r)}\text{Var}{(i^{B}(t_{N}))}]$ on right-hand side of (43) is smaller than second term $[k\frac{\beta^{2}i_{0}}{N(\beta-r)}\text{Var}(i^{B}(t_{N}))]$ on right-hand side of (44). Other terms are equal in (43) and (44). This suggests that $\mathbb{E}\left({i_{kN}(t_{N}+1)}\right)>k\mathbb{E}\left({i_{N}(t_{N}+1)}\right)$ . Thus a large city with $kN$ population should have more infections at time $t_{N}+1$ as compared to the scaled smaller city with population $N$ . Observe that the variance term in the output (44) of smaller model induces an error when we estimate the larger model by scaling the output of smaller model. Inductively, this error between the larger model and the scaled smaller model can be seen to hold till O( $log_{\rho}N$ ) time. This is also evident in Figure 2 of the main paper.

Also observe that in the estimate of $k\mathbb{E}\left({i_{N}(t_{N}+1)}\right)$ for smaller city, the ratio of the third term to the second term for smaller city is $i_{0}\frac{\Big{[}\mathbb{E}\left({i^{B}(t_{N})}\right)\Big{]}^{2}}{\text{Var}(i^{B}(t_{N}))}$ , that is the third term becomes dominant over the second term as $i_{0}$ increases. As the third term is the same for both the larger city and the scaled smaller city, therefore as $i_{0}$ (initial number of infections) increases, the difference between the number of infections in the larger city and the scaled smaller city becomes smaller, suggesting that scaling the smaller city provides a good approximation to the larger city when $i_{0}$ is large. This is also evident from Figure 1 of the main paper.

Appendix F Input data for numerical experiments

In this section for completeness, we summarise the city statistics and disease related parameters used in our numerical experiments. This data is similar to that reported in [1] where it is more fully justified.

Recall that for the numerical experiments, we first create a synthetic city that closely models the actual Mumbai. A synthetic model is set to match the numbers employed, numbers in schools, commute distances, etc in Mumbai. Tables 4 and 5 show the household size distribution and school size distribution in the model. Fraction of working population is set to 40.33%. Workplace size distribution can be seen in [1].

For Figures 3-5, Figures 9-17 and Figures 19-23, we consider one community space and the whole population is considered to be living in non-slums. For Figure 5 and 23, we consider 48 community spaces, to model the 24 administrative wards in Mumbai further divided into slums and non-slums. Mumbai slums are densely crowded. This leads to difficulties in maintaining social distancing and increases the transmission rate between the infected and the uninfected. We account for this by selecting a higher community transmission rate in the slums (2 times the non-slums). This leads to slum and non-slum prevalence that closely matches the seroprevalence data observed in July 2020 (see [14]). Table 2 summarises the non-slum population age distribution and Table 3 summarises the non-slum population age distribution.

Transmission rates are set as follows : $\beta_{h}=1.227$ , $\beta_{w}=0.919$ , $\beta_{s}=1.82$ , $\beta_{c}=0.233$ . See, Table 6 and Figure 25 for details of the disease progression. Symptomatic patients are assumed to be more infectious during the symptomatic period than during the pre-symptomatic infective stage (1.5 times more infectious in our model).

Table 7 summarizes the details of interventions modelled in the simulator.

For Figure 5, interventions were introduced based on the actual interventions as happened in the city. Specifically, no intervention for first 33 days, i.e. till 20 March, 2020 (Simulation starts on 13th Feb, 2020). Lockdown from March 20 to May 17. Masks are active from 9 April, 2020 (from day 53). Rules for higher social distancing of elderly are enforced from May 1, 2020 (from day 75). Schools are closed throughout the pandemic period except in the earlier no intervention period. Other interventions such as home quarantine and case isolation are also implemented post the lockdown. Attendance schedules at workplaces after May 17 are set as follows : 5% in May 18-31, 20% in all of June , 33% in all of July, and 50% thereafter. These numbers were selected keeping in mind the prevalent restrictions and observing the transportation and the Google mobility data. Compliance levels are set at 60% in non-slums and 40% in slums. These appeared reasonable and these matched the fatality data early on in the pandemic (till June 2020).

Similarly for Figure 23, interventions were introduced based on the actual interventions as happened in the city. Specifically, no intervention for first 33 days, i.e. till 20 March, 2020 (Simulation starts on 13th Feb, 2020). Lockdown from March 20 to May 17. Masks are active from 9 April, 2020 (from day 53). Rules for higher social distancing of elderly are enforced from May 1, 2020 (from day 75). Schools are closed throughout the pandemic period except in the earlier no intervention period. Other interventions such as home quarantine and case isolation are also implemented post the lockdown. Attendance schedules at workplaces after May 17 are set as follows : 5% attendance from May 18 to May 31st, 2020, 15% attendance in June, 25% in July, 33% in August, 50% from September 2020 to January, 2021, 65% from February, 2021 and 20% from 15 April - 31 May 2021. To account for increased intermingling due to the Ganpati festival, from August 20 to September 1, we increase $\beta$ for community by 2/3, and we reduce compliance from 60% in non-slums and 40% in slums to 40% in non-slums and 20% in slums. Similarly for Diwali and Christmas festivities. Compliance levels are set at 60% in Non slums (NS) and 40% in slums (S) before December 2020 and outside festivals (during festival periods compliance is 40% NS and 20% S). These change to (50%, 30%) in Dec 2020, (40%, 20%) in Jan 2021, (20%, 10%) in 1-18 February 2021 and (40%, 20%) from Feb 19 to April 14, 2021. During the lockdown 15 April - 31 May 2021, it is (60%, 40%). We assume that there existed a single variant that accounted for 2.5% of all the infected population on Feb. 1 in our model. These were randomly chosen amongst all the infected on Feb 1. Further we assumed that this variant was 2.25 times more infectious compared to the original strain.

Table 2: Non-slum population age distribution

Age (yrs)	Fraction of population
0-4	0.0757
5-9	0.0825
10-14	0.0608
15-19	0.0669
20-24	0.0705
25-29	0.0692
30-34	0.0777
35-39	0.0716
40-44	0.0762

Age (yrs)	Fraction of population
45-49	0.0664
50-54	0.0795
55-59	0.0632
60-64	0.0560
65-69	0.0380
70-74	0.0227
75-79	0.0136
80+	0.0094

Table 3: Slum population age distribution

Age (yrs)	Fraction of population
0-4	0.0757
5-9	0.0825
10-14	0.0875
15-19	0.0963
20-24	0.0934
25-29	0.0917
30-34	0.0921
35-39	0.0849
40-44	0.0606

Age (yrs)	Fraction of population
45-49	0.0529
50-54	0.0632
55-59	0.0503
60-64	0.0327
65-69	0.0221
70-74	0.0073
75-79	0.0044
80+	0.0021

Table 4: Household size distribution

Household size	Fraction of households
1	0.0485
2	0.1030
3	0.1715
4	0.2589
5	0.1819

Household size	Fraction of households
6	0.1035
7-10	0.1165
11-14	0.0126
15+	0.0035

Table 5: School size distribution

School size	Fraction of schools
0-100	0.0185
100-200	0.1284
200-300	0.2315
300-400	0.2315
400-500	0.1574

School size	Fraction of schools
500-600	0.0889
600-700	0.063
700-800	0.0481
800-900	0.0408
900+	0

Table 6: Disease progression parameters

Parameter description	Values
Incubation Period	Gamma distributed with shape 2 and scale 2.29
Asymptomatic Period	Exponentially distributed with mean duration 0.5 of a day
Symptomatic Period	Exponentially distributed with mean duration of 5 days
Hospitalisation Period	8 days
Critical Period	8 days

Table 7: Interventions as implemented in the simulator

Intervention	Description
No intervention	Business as usual
Lockdown	For compliant households, household rates are doubled, no workplace
	interactions except for 25% leakage (for essential services), community
	interactions reduce by 75%. For non-compliant households, workplace
	interactions only have a leakage of 25%, community interactions are
	unchanged, and household interactions increase by 50%
Case Isolation	Compliant symptomatic individuals stay at home for 7 days, reducing non-
	household contacts by 75%. Household contacts remain unchanged.
Home Quarantine	Following identification of a symptomatic case in a compliant household,
	all household members remain at home for 14 days. Household contact
	rates double, contacts in the community reduce by 75%
Social distancing of the elderly	All compliant individuals over 65 years of age reduce their community
	interactions by 75%
Schools and colleges closed	Self explanatory
Masks	Reduce community transmission by 20%

	$\displaystyle\mathbb{E}\left({{G}^{N}_{t}}\right)$	$\displaystyle=\mathbb{E}\left({\mathbb{E}\left({{G}^{N}_{t}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({\psi_{max}\frac{(A_{t-1}^{N}+{Y}_{t}^{N})}{N}{Y}_{t}^{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({\psi_{max}\frac{A_{t-1}^{N}{Y}_{t}^{N}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)+\mathbb{E}\left({\mathbb{E}\left({\psi_{max}\frac{({Y}_{t}^{N})^{2}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right).$

	$\displaystyle\mathbb{E}\left({{G}^{N}_{t}}\right)$	$\displaystyle=\mathbb{E}\left({\mathbb{E}\left({{G}^{N}_{t}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({2g\frac{(A_{t-1}^{N}+{Y}_{t}^{N})}{N}{Y}_{t}^{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)$
		$\displaystyle\leq\mathbb{E}\left({\mathbb{E}\left({2g\frac{A_{t-1}^{N}{Y}_{t}^{N}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right)+\mathbb{E}\left({\mathbb{E}\left({2g\frac{({Y}_{t}^{N})^{2}}{N}\|\mathbf{X}^{N}_{t-1}}\right)}\right).$

Agent based simulators for epidemic modelling: Simulating larger models using smaller ones

Abstract

1 Introduction

2 Speeding up ABS: The big picture

3 Agent Based Simulator

3.1 Computing λn​(t)\lambda_{n}(t)

3.2 Disease progression

3.3 Public health safety measures (PHSMs)

4 Shift-scale-restart algorithm

5 Asymptotic Analysis

5.1 Supercritical multi-type branching process review

Assumption 1.

Theorem 5.1.

5.2 Epidemic process dynamics

5.3 Associated branching process dynamics

Lemma 5.1.

Theorem 5.2.

5.4 Coupling epidemic and branching process

5.5 The initial branching process phase

Theorem 5.3.

Lemma 5.2.

Proposition 5.1.

Remark 1.

5.6 Deterministic phase

Assumption 2.

Corollary 5.1.

Theorem 5.4.

Remark 2.

6 Further Experiments

6.1 Smaller networks

6.2 SSR on sparser cities

6.3 How small can the smaller model be

7 Conclusion

Acknowledgments

References

Appendix A Additional numerical experiments

Appendix B Behaviour of simple compartmental models in the early branching process phase

Example 1 (SIR Model).

Example 2 (SEIR model).

Example 3 (SIR model with two age groups).

Appendix C Detailed proofs of Section 5 results

C.1 Proof of Lemma 5.1

C.2 Proof of Theorem 5.2

C.3 Proof of Lemma 5.2

C.4 Proof for Theorem 8

C.5 Proof of Theorem 5.4

Appendix D Intermediate (group-structure) model for asymptotic analysis

D.1 Epidemic process dynamics

D.2 The associated branching process dynamics

D.3 Coupling

D.4 Proof of Lemma 5.2

Appendix E Bias in the smaller model: A justification

Appendix F Input data for numerical experiments

3.1 Computing $\lambda_{n}(t)$