Cutoff for the logistic SIS epidemic model with self-infection
Abstract.
We study a variant of the classical Markovian logistic SIS epidemic model on a complete graph, which has the additional feature that healthy individuals can become infected without contacting an infected member of the population. This additional “self-infection” is used to model situations where there is an unknown source of infection or an external disease reservoir, such as an animal carrier population. In contrast to the classical logistic SIS epidemic model, the version with self-infection has a non-degenerate stationary distribution, and we derive precise asymptotics for the time to converge to stationarity (mixing time) as the population size becomes large. It turns out that the chain exhibits the cutoff phenomenon, which is a sharp transition in time from one to zero of the total variation distance to stationarity. We obtain the exact leading constant for the cutoff time, and show that the window size is of constant (optimal) order. While this result is interesting in its own right, an additional contribution of our work is that the proof illustrates a recently formalised methodology of Barbour, Brightwell and Luczak [6], which can be used to show cutoff via a combination of concentration of measure inequalities for the trajectory of the chain, and coupling techniques.
1. Introduction
Mathematical models of disease spread are widely used in epidemiology, to assist in preventing and managing epidemic outbreaks. One well-studied epidemic model is the logistic Susceptible-Infected-Susceptible model (SIS model) on the complete graph with n vertices, which is a continuous-time birth-and-death Markov chain with state space {0, 1, …, n}, and transitions given by

X → X + 1 at rate λX(n − X)/n,  X → X − 1 at rate μX,

where λ > 0 and μ > 0. The model describes an epidemic spreading in a closed population with n individuals, with the number of infected people at time t represented by X(t). To explain the dynamics, each infected person encounters a random member of the population at rate λ, and if the other individual is currently susceptible to infection, that individual then becomes infected. Each infected individual recovers at rate μ, and once they are recovered, they immediately become susceptible again. If, at any time, the number of infected people becomes zero, we say that the epidemic outbreak is extinct. This model was first formulated and studied by Feller [18] in the 1930s, but was not heavily studied until the 1970s, when it was popularised by Weiss and Dishon [33].
We are interested in asymptotics as n → ∞, with λ, μ fixed positive constants. We use the phrase “with high probability” to mean “with probability tending to 1 as n → ∞”. A key quantity in the SIS model is the so-called basic reproduction ratio R₀ = λ/μ. The epidemic is called supercritical if R₀ > 1, and subcritical if R₀ < 1. It is well-known that for the supercritical SIS model, if there are a large number of infected individuals at time 0, then with high probability, the scaled process X(t)/n heads rapidly towards the stable fixed point of a corresponding deterministic model given by a (logistic) differential equation, then spends most of its time before extinction in the neighbourhood of that fixed point. The time to extinction is exponential in n (in mean and with high probability) as n → ∞; see, for example, Nåsell [25]. For more detailed aspects of the supercritical logistic SIS model such as quasi-stationarity, see, for example, Andersson and Djehiche [4], Barbour [5] and Nåsell [26].
In the subcritical case where R₀ < 1, Doering et al. [16] showed in 2005 that, if the initial state is of order n, then the expected extinction time is asymptotic to (μ − λ)⁻¹ log n.
Brightwell et al. [10] obtained a formula for the asymptotic distribution of the extinction time in the subcritical case, and, in particular, proved that, if the starting state is of order n, then for any constant C large enough, the probability of extinction by time (μ − λ)⁻¹ log n − C is nearly zero, while by time (μ − λ)⁻¹ log n + C, it is nearly one. To be precise, an asymptotic formula for the extinction time, where the randomness has a Gumbel distribution, is given in [10] in the case where λ < μ and both are fixed constants. The authors also consider the case where λ and μ are allowed to vary with n, with μ − λ tending to zero suitably slowly (the “barely subcritical” case), and show that the same formula for the extinction time holds in this regime.
Recent work of Foxall [19] summarises and derives further results for the asymptotic behaviour of the extinction time under different asymptotic regimes for the initial infection size and reproduction ratio (also allowing λ and μ to vary with n).
The model studied in this paper is a variant of the logistic SIS epidemic model known as the logistic SIS epidemic model with self-infection (the ε-SIS model), which is a continuous-time birth-and-death Markov chain with state space {0, 1, …, n}, and transitions given by

X → X + 1 at rate (λX/n + ε)(n − X),  X → X − 1 at rate μX,

where λ, μ > 0 and ε > 0. A literature review for this lesser-studied model is given later in the present section.
Compared to the standard SIS model, the ε-SIS model has an additional source of infection, which can be thought of as an infection from an unknown source or external disease reservoir, such as an animal carrier of the disease. Thus, any susceptible person can be infected (at rate ε) without having contact with any infected member of the population. Excluding the case ε = 0 (done throughout this paper), which is the classical well-studied SIS model, the ε-SIS model admits a non-degenerate stationary distribution, and so the relevant quantity of interest is the time to stationarity, which is analogous to the extinction time in the classical SIS model. We are interested in asymptotics as the population size n tends to infinity. Assuming the parameters of the model, λ, μ and ε, are absolute constants not depending on n, we show that the sequence of ε-SIS Markov chains exhibits a cutoff, as per the following definition.
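To make the transition rates concrete, the following minimal Python sketch simulates the ε-SIS chain by a Gillespie-style algorithm. The parameter values are illustrative only (not taken from the paper), and the fixed-point formula anticipates the deterministic analysis of Section 2.

```python
import math
import random

random.seed(1)
# Illustrative (assumed) parameters, not taken from the paper:
lam, mu, eps, n = 0.8, 1.0, 0.1, 200

def step(x):
    """One jump of the epsilon-SIS chain from state x: returns (holding time, next state)."""
    up = (lam * x / n + eps) * (n - x)   # infection by contact, plus self-infection
    down = mu * x                        # recoveries
    total = up + down                    # > 0 for every x in {0, ..., n}, since eps > 0
    hold = random.expovariate(total)
    return hold, (x + 1 if random.random() < up / total else x - 1)

def simulate(x0, horizon):
    t, x, path = 0.0, x0, [(0.0, x0)]
    while t < horizon:
        dt, x = step(x)
        t += dt
        path.append((t, x))
    return path

path = simulate(x0=n, horizon=200.0)

# The long-run time average should sit near n * x_star, where x_star is the
# stable fixed point of the deterministic model (formula anticipated from Section 2).
c = math.sqrt((lam - mu - eps) ** 2 + 4 * lam * eps)
x_star = ((lam - mu - eps) + c) / (2 * lam)
avg = sum((t2 - t1) * x for (t1, x), (t2, _) in zip(path, path[1:])) / path[-1][0]
```

With these values the time average settles near n·x_star, illustrating the non-degenerate stationary behaviour just described; with ε = 0 the same code would instead get absorbed at 0.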
Definition 1.1.
The sequence of Markov chains (Xₙ(t))_{t≥0} with finite state space 𝒳ₙ, transition function pₙᵗ(x, ·), and stationary distribution πₙ is said to exhibit a cutoff at time tₙ with window size wₙ = o(tₙ), if

lim_{c→∞} lim inf_{n→∞} dₙ(tₙ − cwₙ) = 1 and lim_{c→∞} lim sup_{n→∞} dₙ(tₙ + cwₙ) = 0,

where

dₙ(t) := max_{x∈𝒳ₙ} ‖pₙᵗ(x, ·) − πₙ‖_TV

is the worst-case total variation distance between the distribution of the chain at time t and the chain’s stationary distribution πₙ.
Informally, cutoff means that, for any starting state, the distribution of the chain goes from being about as far as possible in total variation distance from the stationary distribution to about as close as possible, over a window of time of length of order wₙ centred around time tₙ.
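This informal picture can be checked numerically on a small instance. The sketch below (illustrative parameters; the rates follow the verbal description of the ε-SIS model above) computes the law of the chain at time t by uniformisation, the stationary law via birth-and-death detailed balance, and then the total variation distance from the worst-case start.

```python
import math

# Illustrative (assumed) parameters for a small instance:
lam, mu, eps, n = 0.8, 1.0, 0.1, 30

def up(x):    # infection by contact, plus self-infection
    return (lam * x / n + eps) * (n - x)

def down(x):  # recovery
    return mu * x

# Stationary distribution via detailed balance for birth-and-death chains.
w = [1.0]
for x in range(n):
    w.append(w[-1] * up(x) / down(x + 1))
Z = sum(w)
pi = [v / Z for v in w]

LAM = max(up(x) + down(x) for x in range(n + 1))  # uniformisation rate

def dist_at(t, x0):
    """Distribution of the chain at time t > 0, started from x0, by uniformisation."""
    K = int(LAM * t + 10.0 * math.sqrt(LAM * t) + 20.0)  # Poisson truncation level
    p = [0.0] * (n + 1)
    p[x0] = 1.0
    out = [0.0] * (n + 1)
    wk = math.exp(-LAM * t)  # Poisson(LAM * t) weight for k = 0 jumps
    for k in range(K):
        for x in range(n + 1):
            out[x] += wk * p[x]
        q = [0.0] * (n + 1)  # one step of the uniformised discrete-time chain
        for x in range(n + 1):
            u, d = up(x), down(x)
            if x < n:
                q[x + 1] += p[x] * u / LAM
            if x > 0:
                q[x - 1] += p[x] * d / LAM
            q[x] += p[x] * (1.0 - (u + d) / LAM)
        p = q
        wk *= LAM * t / (k + 1)
    return out

def tv(t, x0):
    return 0.5 * sum(abs(a - b) for a, b in zip(dist_at(t, x0), pi))

tvs = [tv(t, n) for t in (0.1, 3.0, 15.0)]  # well before, around, and well past mixing
```

Plotted over a fine grid of times, tv(t, n) stays near one and then falls towards zero; for fixed small n the transition is of course not abrupt, but it sharpens as n grows, which is the cutoff shape described above.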
The cutoff phenomenon for Markov chains was first identified in the 1980s for random transpositions on the symmetric group by Diaconis and Shahshahani [14], and the name was coined by Aldous and Diaconis [2]. Establishing cutoff for specific models of Markov chains is to this day an active area of research and in general a difficult problem. For the model in question, we have the following main result of the paper, which gives the precise cutoff time of the chain, with an optimal window.
Theorem 1.2.
The sequence of ε-SIS Markov chains has a cutoff at tₙ = (2c)⁻¹ log n with constant window size, where c = √((λ − μ − ε)² + 4λε).
Before going further, we highlight a few points about the theorem. First, Theorem 1.2 implies that for a very small ε (and λ < μ), the cutoff time of the ε-SIS model is about (2(μ − λ))⁻¹ log n. This is about twice as fast as the subcritical classical SIS model, which has a cutoff at (μ − λ)⁻¹ log n. This finding does not appear to be at all a priori obvious: we will give an intuitive explanation for it after describing the proof in more detail at the end of the introduction.
Second, general conditions that guarantee that a sequence of birth-and-death chains exhibits cutoff are given in Ding et al. [15]. They show that, if the sequence satisfies the so-called product condition, that is, the relaxation time t_rel (defined to be one over the spectral gap of the chain) is of smaller order, as n goes to infinity, than the mixing time t_mix (defined to be the first time the total variation distance to stationarity is less than 1/4), then the chain exhibits cutoff with window of order at most √(t_rel · t_mix). While it is likely possible in our setting to verify the product condition, or related conditions (see for example the follow-up works [11, 12, 13]), we would still need to derive bounds on the mixing time, and the window size obtained in this way is at best of order √(log n), because the relaxation time is at least one, and the mixing time is of order log n. Thus our Theorem 1.2 is sharper than what would be obtained from these general results, by giving the constant for the cutoff time, and showing the window is of constant order. Ding et al. [15] also show the window size given in their results is unimprovable for general birth-and-death chains, so our result is also interesting as a non-trivial example where the window is smaller than predicted by the general result.
We also mention in this vein of research the work of Basu et al. [8], which gives analogous conditions for cutoff of reversible Markov chains on finite trees, and semi birth-and-death chains, and the very recent work of Salez [28], which gives a “product-like” condition for cutoff assuming that the Markov chain has “non-negative curvature”. These works also give a cutoff window which, as in the birth-and-death case, may not be sharp. Thus, in many applications precise information about the exact cutoff time and window size needs to be obtained by “bare hands” methods.
The technique we use for proving Theorem 1.2 is adapted from Barbour et al. [6], where the authors developed a general approach to establishing the cutoff phenomenon for suitable chains by using couplings and concentration of measure inequalities; see also that paper for a comprehensive literature review of the cutoff phenomenon. The approach has been used in other places such as Lopes and Luczak [24], who derived a formula for the asymptotic distribution of the extinction time for the weaker of two competing SIS epidemics (while that result is not an example of the cutoff phenomenon, it has a similar flavour); and Eskenazis and Nestoridi [17], who established the cutoff phenomenon for the Bernoulli–Laplace urn model with swaps. Thus, an additional contribution of our work is to provide another example of this approach, which we hope will serve to standardise it as a tool for establishing cutoff.
A final contribution of this article is to bring the ε-SIS model to the attention of researchers in applied probability, and so we next discuss the existing literature. Afterwards, the end of the introduction gives an overview of the proof of Theorem 1.2, and simultaneously the organisation of the paper.
Related literature. The ε-SIS model has appeared sporadically in the applied literature. The theoretical primer (aimed at applied researchers) of Keeling and Ross [21] includes the additional self-infection term in their definition of the (vanilla) SIS model. Stone et al. [29] apply the ε-SIS model to head lice epidemic data, after deriving closed-form analytic formulas for some stationary quantities, and Hill et al. [20] use it to model obesity spread. Nieddu et al. [27] derive analytic quantities of the model to understand the parameter space where the disease is endemic.
The most systematic study of the model appears in a series of papers by Van Mieghem, alone and with a number of co-authors [1, 30, 31, 32]. Van Mieghem [30] uses an exact expression for the mean prevalence (i.e., the expectation of X(t)) in equilibrium to identify a sharp phase transition in this mean as λ varies, for fixed n and a fixed very small positive value of ε. A follow-up paper of Van Mieghem and Wang [32] studies time-dependent behaviour in this regime. We remark here that known results about the supercritical classical logistic SIS model (λ > μ) help to explain this phase transition phenomenon. When λ > μ and ε is very small, the behaviour of the ε-SIS model is very similar to that of the usual logistic SIS model in the supercritical regime, except that, after reaching state 0, the process restarts after the next self-infection, so that the mean waiting time in state 0 is 1/(nε). It is known (see, for instance, [4]) that the duration of the supercritical logistic epidemic is asymptotically exponential, with a mean that grows exponentially in n. If 1/(nε) is much smaller than this mean duration, then the mean prevalence is of order n, as the ε-SIS model will be spending most of its time near the stable fixed point of the logistic equation at 1 − μ/λ. On the other hand, if 1/(nε) is much larger than the mean duration, then the mean prevalence is near zero, since then most of the time the epidemic will be extinct, waiting to restart.
While papers [30, 32] assume that the population size n is fixed, the regime considered there is qualitatively very similar to the case where n is very large, ε is exponentially small in n, and λ > μ are fixed constants, and therefore very different to the regime considered in the present paper.
Proof and paper overview. In Section 2 we provide a brief discussion of the deterministic ε-SIS model, represented by the differential equation given in (2.1). The solution to this differential equation provides an approximation for the scaled process X(t)/n over an appropriate time scale, for large n.
In Section 3, we show in Lemma 3.1 that, for a (deterministic) period of time that is stretched exponential in n, with high probability, the scaled process X(t)/n stays within a small fixed distance of the solution of equation (2.1). In addition, we show in Lemma 3.2 that for large enough times t, the mean of X(t)/n is close to the stable fixed point x₊ of (2.1), given explicitly at (2.2).
In Section 4, we obtain an upper bound on the mixing time of the ε-SIS model. We show that by time tₙ + C, where C is a constant that is independent of the parameters of the model and the population size n, the distribution of the chain is sufficiently close to its stationary distribution. This is done by using the Markov chain coupling method, where the coupling used consists of running two copies of the chain independently, over three phases. During the burn-in phase, the concentration results of Section 3 imply that after a constant amount of time, both copies are close to nx₊ with high probability; see Corollary 4.4. Given that this event has occurred, during the intermediate phase, the copies come within order √n of each other after an additional (2c)⁻¹ log n time with high probability; see Lemma 4.5. In the final phase, given the copies are within order √n of each other, their distance may be compared to an unbiased random walk, which hits 0 from a state of order √n within a constant time, given that the transition rate is of order n; see Lemma 4.6.
In Section 5, we obtain a lower bound on the mixing time. We first show an improved version of the concentration inequality given in Section 3, which only holds if the starting state of the chain is in a “good set”. We then break the proof of the lower bound into two parts. First, we combine the rapid mixing result from the previous section and the improved concentration inequality to show that the stationary distribution is concentrated within a ball of order √n around nx₊. Then we use concentration around the deterministic solution to show that, shortly before time tₙ, most of the mass of the distribution of the chain is located outside that ball, and thus by that time the distribution of the chain cannot be close to stationarity.
Finally, as mentioned already, for very small ε the ε-SIS model mixes twice as fast as the classical subcritical SIS model. To see why this is the case, for the ε-SIS model, the (non-normalised) process arrives within distance √n of the fixed point nx₊, where most of the mass of the stationary distribution lies, after time of order about (2(μ − λ))⁻¹ log n. After that, the stochastic model can be shown to do better than the deterministic one, as we can compare its behaviour to that of an unbiased random walk in continuous time taking steps at rate of order n, since the transition rates remain of order n throughout. Such a random walk covers the remaining distance of order √n in constant time with high probability.
For the classical subcritical SIS model, the process also arrives within √n of its target, the absorbing state 0, after about (2(μ − λ))⁻¹ log n time. However, at that point events happen at rate of order √n only, and the rate decreases further as we get closer to 0, so any comparison with an unbiased random walk would be useless. Instead, we need to keep taking advantage of the negative drift right until the end, first following the differential equation closely, and then approximating by a linear birth-and-death process.
2. Deterministic version of the ε-SIS model
In this section, we consider the deterministic version of the logistic SIS model with self-infection. The model represents a spreading epidemic governed by the autonomous ordinary differential equation (ODE)

(2.1)  x′(t) = λx(t)(1 − x(t)) + ε(1 − x(t)) − μx(t),

where x(0) = x₀ ∈ [0, 1]. The proportion of infected people at time t is modelled by x(t). When ε = 0, we recover the classical deterministic logistic SIS model.
We set the right-hand side of (2.1) equal to zero to identify the fixed points. There are two solutions to the resulting quadratic equation; however, since only non-negative solutions have a biological meaning, we require x(t) ∈ [0, 1] for all t in (2.1). The only fixed point that falls in this range is

(2.2)  x₊ = ((λ − μ − ε) + c)/(2λ),

where we recall from the statement of Theorem 1.2 that c = √((λ − μ − ε)² + 4λε). Since x₊ models the limiting proportion of infected people, x₊ ∈ (0, 1], as one would expect. When ε = 0, the solution simplifies to x₊ = 1 − μ/λ, if λ > μ, whereas if λ ≤ μ the solution degenerates to zero. The other solution is

x₋ = ((λ − μ − ε) − c)/(2λ),

which is non-positive and degenerates to zero when ε = 0, provided λ > μ. The following proposition is easily verified by simple calculations and standard results, for example, Theorem 2.4.2 in [9].
Proposition 2.1.
The differential equation (2.1) subject to the initial condition x(0) = x₀ ∈ [0, 1] has an explicit solution

(2.3)  x(t) = [x₊(x₀ − x₋) − x₋(x₀ − x₊)e^{−ct}] / [(x₀ − x₋) − (x₀ − x₊)e^{−ct}].
It is easy to see from (2.3) that, as t → ∞, the solution monotonically approaches x₊ from below when x₀ < x₊, whereas it monotonically approaches x₊ from above if x₀ > x₊. This can also be seen from the phase diagram. The graph of x′ versus x is a downward parabola, so x′ > 0 when x < x₊, and x′ < 0 if x > x₊. This implies that regardless of the starting point, any solution of (2.1) approaches x₊ monotonically as t → ∞, so the fixed point is globally attractive, over the set [0, 1].
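These facts are easy to verify numerically. The sketch below (with assumed illustrative parameter values) computes the two roots of the right-hand side of (2.1) via the quadratic formula, and integrates the ODE by forward Euler from starting points on either side of x₊:

```python
import math

# Assumed illustrative parameters:
lam, mu, eps = 0.8, 1.0, 0.1

def f(x):
    """Right-hand side of (2.1)."""
    return lam * x * (1 - x) + eps * (1 - x) - mu * x

c = math.sqrt((lam - mu - eps) ** 2 + 4 * lam * eps)
x_plus = ((lam - mu - eps) + c) / (2 * lam)    # stable fixed point, in (0, 1]
x_minus = ((lam - mu - eps) - c) / (2 * lam)   # non-positive root, discarded

def solve(x0, T=40.0, h=1e-3):
    """Forward-Euler approximation of the solution of (2.1) at time T."""
    x = x0
    for _ in range(int(T / h)):
        x += h * f(x)
    return x
```

Starting above x₊ the iterates decrease towards it, and starting below they increase towards it, matching the monotone convergence described above.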
3. Concentration around the deterministic process
For the rest of this paper, we use x(t) to denote the solution to the governing equation (2.1) with initial state x₀ ∈ [0, 1]. There are two main results of this section. We first show that the scaled process X(t)/n closely follows the solution x(t) defined in (2.3). To be precise, we prove the following result.
Lemma 3.1.
There exist positive constants depending only on and such that for all , the following relation holds for all large enough (depending on and ; see (3.11)),
(3.1) |
where
(3.2) |
As will be seen in the proof, it is enough to consider fixed (say equal to ), so the dependence of and on is ultimately not important.
Our other result in this section controls the mean of , and is used in the proof of the lower bound on the mixing time in Section 5.
Lemma 3.2.
Let be the expectation given the starting state . The following two statements hold:
-
(1)
There exists a positive constant , depending only on and , such that, for any and large enough (depending on , and ),
-
(2)
Suppose that , for some . Then there exists a positive constant , depending on and , such that, for any and large enough (depending on , and ),
3.1. Proof of Lemma 3.1
We study the centred processes and , with . The overall strategy of the proof is to give an integral representation of the deterministic process (Lemma 3.3), which can be compared to a similar representation of , but with the addition of a martingale term (Lemma 3.4). From here, the result essentially follows by controlling the martingale term (Lemma 3.6), using a simplified version of [24, Lemma 7] (stated here in Lemma 3.5) and then applying Grönwall’s lemma.
We first give a representation of .
Lemma 3.3.
In the notation just defined, we have
(3.3) |
and
(3.4) |
Proof.
We compare to via a representation that is similar to (3.4).
Lemma 3.4.
In the notation just defined, we have
(3.6) |
where
(3.7) |
is a zero-mean martingale.
Proof.
To show that the process defined in (3.7) is a martingale, note that the family of Markov chains is a density dependent family (see Section 4 in [22]) with transitions of the form
and the state space is compact. It is easy to check that the scaled process satisfies the conditions of [22, Proposition 2.1] and thus
is a zero-mean martingale, where . Then formula (3.6) holds, by [7, Lemma 4.1]. ∎
Comparing expressions (3.4) and (3.6), it is clear that to be able to approximate by we must control the size of . To do this, we use the following simplified version of [24, Lemma 7] for multi-dimensional Markov jump chains.
Lemma 3.5.
[24, Lemma 7] Let be a pure jump Markov chain with state space . For such that , let denote the rate at which the chain jumps from to , and assume that there is a deterministic such that for . Suppose further that, for each , the drift at can be written in the form
Let be the corresponding Dynkin martingale given by
and let . For , define to be the infimum of time such that exceeds in absolute value. Further, given , let , and suppose that for some , . Then for any and , we have
Specialising the lemma to our setting, we obtain the following result, which controls the size of the term .
Lemma 3.6.
Fix , set , , and
Then
Proof.
We apply Lemma 3.5 to our process . Clearly, the only possible jumps for are so we can take . The drift of is given by
Hence it is easily seen that the drift of is given by , where and .
Moreover, there is a uniform bound on , where and is the rate of the jump from to for , :
The result then follows directly from Lemma 3.5 by setting and . ∎
The last missing ingredient for the proof of Lemma 3.1 is a quantitative upper bound on the speed of convergence for the solution to the governing equation (2.1) with initial condition to the fixed point . This will then be used to bound . The following lemma with an elementary calculus proof gives such a bound.
Lemma 3.7.
Using the notation defined above, the following inequality holds,
(3.8) |
Proof.
Recall that , where . We have . Also, from (2.3) and the fact , we obtain
It then follows by easy manipulation that
We can now prove Lemma 3.1. As previously discussed, we use representations (3.4) and (3.6) to bound the difference between and , and use Lemma 3.6 to control the martingale term in (3.6).
Proof of Lemma 3.1.
Using (3.4) and (3.6), we have
Using (3.8) to bound , we obtain
(3.9) |
To bound the first two terms on the right-hand side of (3.9), we define
where (the exact form is not so important). Recall the definition of , and from Lemma 3.6. We will show that on the event , one has
(3.10) |
For now, suppose this is true. Define
By definition, . Moreover, since we have assumed that (3.10) holds, we also have , which implies that . Applying Lemma 3.6, we obtain
which completes the proof. The only thing left to check is (3.10). On the event , it follows from (3.9) that
To further bound , a standard method is to use Grönwall’s lemma (see for example Theorem 1.3.1 in [3]). However, to apply the inequality we first need to control the term. On the event , one has
Choose , so that
(3.11) |
and hence
Let , then on the event , we have
for large enough. Applying Grönwall’s lemma to , we obtain for that
Dividing both sides of the inequality above then establishes (3.10), as required. ∎
3.2. Proof of Lemma 3.2
To explain the proof of the other main result of this section, write , which is the corresponding starting state of the chain . Recall the definition of in (3.7). Since for all , we have
(3.12) |
where we have used Fubini in the last equality to change the order of expectation and the integral. If we could replace by , then would exactly solve the integral equation (3.4), and the claim of the lemma would be trivial. Thus we prove the result by showing the difference between and is small for all , and so the difference between and is also small for not too large .
Proof of Lemma 3.2.
Let solve (3.5) and for fixed and from Lemma 3.1. Applying that lemma with this , we conclude that is exponentially small in for . Together with the fact that is bounded for all , we obtain
and using this and a similar argument implies
(3.13) |
for . Rewriting (3.2) to take advantage of these bounds, we have
(3.14) |
With this equation and the bound (3.2), we can use standard differential equation comparison theorems to bound . Let and differentiate the integral equation (3.14) to get the initial value problem
(3.15) |
It follows from (3.2) that there exists a suitable constant such that and so we obtain
From standard comparison theory of differential equations (see for example Theorem 2.2.2 in [3]), we have for all , where is the minimal solution to the initial value problem
This differential equation is of the same form as the governing equation (2.1), so it is not hard to see that for sufficiently large such that , the solution is uniquely defined for all and thus
where
For any , we can write , , and , where . Using this notation, the bound above then becomes
(3.16) |
Note that if then, as increases, monotonically approaches from above, whereas monotonically approaches from below if . (Note that is not biologically viable.) Moreover, for any fixed , is monotonically increasing as increases, which can be verified by differentiating (3.16) with respect to .
On the other hand, since , it follows from (3.15) that
Again, by the comparison theorem and using (3.5), we have
(3.17) |
Similar to the behaviour of , if then monotonically approaches from above, whereas monotonically approaches from below if . (Note that is not biologically viable.) Also, by differentiating (3.17) with respect to , one can see that for any fixed , is monotonically increasing as increases.
For any , we take , which is then non-negative for sufficiently large. If in the right-hand side of the relation (3.16), one has , then, since and as , we obtain
Therefore, there exists a constant depending on and such that for large enough,
(3.18) |
which finishes the proof of the second statement in Lemma 3.2. For the first statement, in view of (3.16) and (3.17), and the monotonicity of with respect to the initial value , one has, for any ,
(3.19) |
where
Similarly, for any ,
(3.20) |
where
Also, observe that when , we have for all . Since , it follows from (3.19) and (3.20) and the previous observation that
Now, fix any , so that . The proof is completed by noticing that for some positive constants depending only on and one has
for large enough. ∎
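As an informal numerical companion to the concentration statements of this section (with assumed illustrative parameters, not the constants of the lemmas), the following sketch simulates the scaled chain and records its maximal deviation from an Euler solution of (2.1); already for moderate n the deviation is small, of order n^(−1/2):

```python
import random

random.seed(11)
# Assumed illustrative values:
lam, mu, eps, n = 0.8, 1.0, 0.1, 2000

def gillespie_path(x0, horizon, dt_grid=0.05):
    """Simulate the epsilon-SIS chain; record the scaled state X(t)/n on a regular grid."""
    t, x = 0.0, x0
    grid, vals = [0.0], [x0 / n]
    next_t = dt_grid
    while t < horizon:
        up = (lam * x / n + eps) * (n - x)
        down = mu * x
        t += random.expovariate(up + down)
        while next_t <= min(t, horizon):   # state is constant over the holding interval
            grid.append(next_t)
            vals.append(x / n)
            next_t += dt_grid
        x += 1 if random.random() * (up + down) < up else -1
    return grid, vals

def ode_vals(x0, grid, h=1e-3):
    """Forward-Euler solution of (2.1), evaluated on the same grid."""
    out, x, t = [], x0, 0.0
    for g in grid:
        while t < g - 1e-12:
            x += h * (lam * x * (1 - x) + eps * (1 - x) - mu * x)
            t += h
        out.append(x)
    return out

grid, vals = gillespie_path(x0=n // 2, horizon=10.0)
ode = ode_vals(0.5, grid)
max_err = max(abs(a - b) for a, b in zip(vals, ode))
```

Repeating the experiment for increasing n shows the maximal deviation shrinking, which is the qualitative content of Lemma 3.1 (the lemma of course gives much more: a quantitative bound holding over stretched-exponential time scales).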
4. Proof of the upper bound for the mixing time
In this section, we prove the upper bound in Theorem 1.2, which in particular shows that, for a population of size n, the mixing time of the Markov chain is asymptotically at most (2c)⁻¹ log n.
Lemma 4.1 (Cutoff upper bound, rapid mixing).
For all , we have
where is a non-negative real valued function depending only on and and as .
Let and be two copies of the process , with initial states , moving independently until the coalescence time, which is defined as
After the coalescence time, the two copies are in the same state and move together forever after. Note that this defines a Markovian coupling.
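The coupling just described is easy to simulate. The sketch below (illustrative assumed parameters, with the ε-SIS rates as in the introduction) runs the two copies independently from the two extreme starting states until they first meet:

```python
import random

random.seed(7)
# Assumed illustrative parameters:
lam, mu, eps, n = 0.8, 1.0, 0.1, 500

def rates(x):
    """(up, down) jump rates of the epsilon-SIS chain at state x."""
    return (lam * x / n + eps) * (n - x), mu * x

def coalescence_time(x0, y0, horizon=100.0):
    """Independent coupling: run both copies until they first agree.
    Returns the coalescence time, or None if the horizon is reached first."""
    t, x, y = 0.0, x0, y0
    while t < horizon:
        if x == y:
            return t
        ux, dx = rates(x)
        uy, dy = rates(y)
        total = ux + dx + uy + dy
        t += random.expovariate(total)
        r = random.random() * total   # pick which of the four jumps occurs
        if r < ux:
            x += 1
        elif r < ux + dx:
            x -= 1
        elif r < ux + dx + uy:
            y += 1
        else:
            y -= 1
    return None

tau = coalescence_time(0, n)
```

Since both copies make jumps of size one and (almost surely) never jump simultaneously, they cannot cross without meeting, so checking for equality at each jump detects coalescence.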
By standard results (see Corollary 5.5 in [23]), to prove Lemma 4.1, it is sufficient to show the following. Write for the probability measure of the chain given and .
Lemma 4.2.
For all with , and all , we have
where is a non-negative real valued function depending only on and and as .
Before we dive into the details, we give an outline of the proof. As mentioned in the introduction, we will break up the analysis into the following phases.
-
Initial phase: We use the concentration result from Section 3 to show that with high probability, after a constant amount of time, both copies will be in the interior of a good set consisting of a ball around x₊. We also prove that if the copies start anywhere in the interior of the good set, then with high probability they remain in the good set for a time period that is exponential in n.
-
Intermediate phase: After both copies reach the interior of the good set, the independent coupling of the chains is contractive while the copies stay in the good set. As a consequence, after another (2c)⁻¹ log n time (plus a constant time), the distance between them will drop to order √n with high probability.
-
Final phase: Once the copies are within order √n of each other and given that they are still in the good set, we show the coalescence time of the copies is no greater than the time it takes for an unbiased random walk (with transition rate of order n, so on average, there are order n events happening in a unit time period) to hit 0, which is at most a time of constant order with high probability.
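The final-phase heuristic above can be tested directly: a continuous-time unbiased walk with total jump rate 2n, started from a distance of order √n (a hypothetical stand-in for the separation left over from the intermediate phase; the parameters are illustrative), hits 0 within a fixed constant time horizon with probability close to one:

```python
import math
import random

random.seed(3)

def hit_fraction(n, horizon=20.0, trials=100):
    """Fraction of runs in which an unbiased +/-1 walk with total jump rate 2n,
    started from about sqrt(n), hits 0 by the fixed time horizon."""
    hits = 0
    for _ in range(trials):
        x, t = int(math.sqrt(n)), 0.0
        while x > 0 and t < horizon:
            t += random.expovariate(2 * n)  # exponential holding time at rate 2n
            x += 1 if random.random() < 0.5 else -1
        hits += (x == 0)
    return hits / trials

frac = hit_fraction(900)
```

Heuristically, after a constant time t the walk has made about 2nt steps, and a simple random walk started from √n hits 0 within m steps except with probability of order √n/√m, which is of order 1/√t uniformly in n; this is why a constant-order final phase suffices.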
Throughout the rest of this section, we use to denote the underlying probability measure of the process with starting state .
4.1. Initial phase
We first show that, with high probability uniformly over all starting states, after a constant amount of time the scaled process will be close to the fixed point x₊. This will be done by a simple application of Lemma 3.1, on account of Proposition 2.1.
To state the result, we first need to introduce some notation, which will be used frequently later. Consider intervals of the form
(4.1) |
We denote the first time that the scaled chain leaves an interval by
(4.2) |
To prove the upper bound for the mixing time, we fix a and consider the following set to be the good set,
(4.3) |
where was defined in Lemma 3.1 and . We also define the interior of the good set by . Moreover, to simplify the notation, we write for , i.e. the first exit time of .
Recall that from (3.2), where was defined in Lemma 3.1. We prove the following lemma, where the first part of the lemma serves as the initial phase and the second part controls the exit time of the interval.
Lemma 4.3.
Proof.
Recall the solution to the governing equation of the deterministic -SIS model given in (2.3). A simple calculation using Lemma 3.7 shows that
(4.5) |
uniformly for . It follows from a union bound and Lemma 3.1 that, for large enough such that ,
which completes the proof of the first statement. For the second statement, note that, given , by the monotonicity of in , we have for all . The condition (4.4) implies that for sufficiently large . Moreover, it follows from Lemma 3.1 that the probability that is within of for all is at least . Together with the previous observation, this implies that the probability that for all is at least . ∎
The next corollary states the result of Lemma 4.3 in terms of the two coupled copies and . That is, by the time , both have entered the interior of the good set with high probability. Moreover, once they have reached the interior, they will then remain in the good set for at least time which is exponential in with high probability. Let
(4.6) |
be the first time either copy leaves the good set. The corollary follows directly from Lemma 4.3 by applying a union bound.
Corollary 4.4.
For any , the following holds for sufficiently large such that and as per Lemma 3.1:
-
(1)
For any ,
-
(2)
Suppose , then we have .
4.2. Intermediate phase
For this phase, we prove that, after the burn-in period in the first phase, the distance between the two copies will drop to with high probability after another time (plus a constant time), by showing that the coupling is contracting if both copies are in the good set , provided that is large enough so that , where is as defined in (4.3). To be precise, recall from (4.6) that is the first time either copy exits the good set . Also, let be the difference between the two copies of the chain at time . The lemma below states that, for any , given that is of order , with high probability, on the event , drops below by the time .
For use in the proof, the jump rates of the coupling for are given by
(4.7) | ||||||
where
Note that before colliding the two copies almost surely do not jump simultaneously, and thus they do not cross without colliding. Since the chains move together after colliding and we assume (without loss of generality) that , the coupling is monotonic, meaning for all .
Lemma 4.5.
For , let . Suppose , so that . There exists a positive constant depending only on and such that the following relation holds for sufficiently large and any ,
(4.8) |
Proof.
Recall the possible transitions for this phase defined in (4.7). Since the coupling is monotonic, is non-negative for all . Also, . Note that the process is not a Markov process by itself, but is uniquely determined by the two-dimensional process , which is Markov with respect to its natural filtration.
For each , Proposition 2.1 in [22] implies the process
(4.9) |
is a zero-mean martingale, where
Taking expectations in (4.9) and subtracting the coordinates, we obtain
Note that, for all , one has . Moreover, for any states , we have
Choose large enough so that . Applying the optional stopping theorem to the Dynkin martingale for (which is a function of the two-dimensional Markov chain ) and bounded stopping time , as well as Fubini’s theorem, and the fact that is non-negative, we obtain
Letting , we see that satisfies the following integral inequality:
Then, for all large enough so that , by adapting Bellman’s proof of Grönwall’s inequality (see Theorem 1.2.2 in [3]) to this setting, we obtain that for all ,
Since we have assumed that , we have
Now, using the fact that and the previous display, we have
The proof is complete after noting that, as , we have . ∎
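For reference, the Bellman-type Grönwall argument invoked above can be stated in the following generic form, with placeholder constants $K, c > 0$ standing in for the model-dependent constants derived in the proof:

```latex
\text{If } u \ge 0 \text{ satisfies } \quad
u(t) \;\le\; u(0) + \int_0^t \bigl( K - c\, u(s) \bigr)\, \mathrm{d}s
\qquad \text{for all } t \ge 0,
\\[4pt]
\text{then } \quad
u(t) \;\le\; u(0)\, e^{-ct} + \frac{K}{c}\bigl( 1 - e^{-ct} \bigr)
\;\le\; u(0)\, e^{-ct} + \frac{K}{c}.
```

The first term decays exponentially in $t$, while the second is small provided $K$ is small relative to $c$, which is the mechanism behind the distance dropping below the target threshold by the stated time.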
4.3. Final phase
We now show that, once the two copies are within distance of each other, the additional time to coalescence is at most (an absolute constant), with high probability, on the event that the copies stay in the good set for time at least .
Lemma 4.6.
Let . Suppose and . Then, for all sufficiently large such that , the following relation holds for all ,
To prove this lemma, we require [6, Proposition 4.1] (stated below as Proposition 4.7), which is a continuous-time analogue of [23, Proposition 17.19] and controls the tails of certain hitting times. With the right setup, the proof of Lemma 4.6 is a straightforward application of the proposition.
Proposition 4.7.
[6, Proposition 4.1] Let be a continuous-time Markov jump chain with state space and rate matrix , which is stable, conservative and non-explosive. Let and be positive, and let be a function. Set , and assume that
-
(i)
the drift of is non-positive for all ;
-
(ii)
makes jumps of magnitude at most ;
-
(iii)
for all .
Define the hitting time of . Then, for any ,
The drift condition of the proposition does not hold globally for , but it does hold in the good set . Since we are only interested in the behaviour of the chain before it leaves the good set, conceptually the proposition still applies. To apply it directly, we introduce a modified version of the chain that reflects at the “boundary” of the good set. Let and , be the states corresponding to leaving the good set . Define a state space . Let be a Markov chain on , which evolves as follows. If , then it makes jumps according to the transition rates of the original process . Otherwise, we have
(4.10) |
The processes and admit a natural coupling, under which the two processes start from the same initial state and evolve together until time defined in (4.2). This leads to the following immediate observation: for any event in the stopped sigma-algebra of , one has , where we use to denote the underlying probability measure of and to denote the underlying probability measure of . That is, the two processes and are path-wise indistinguishable in probability up to time .
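The reflection mechanism can be illustrated with a small simulation (a sketch only: the precise boundary rates are those in (4.10), and the interior rates below are the same assumed parametrisation used in the earlier sketch). The chain lives on an interval together with two extra boundary states, and any visit to a boundary state is immediately followed by a jump back inside, so the chain never strays further than one step outside the interval:

```python
import random


def reflected_path(lo=40, hi=60, n=100, lam=2.0, eps=0.05,
                   boundary_rate=1.0, t_max=50.0, seed=2):
    """Birth-and-death chain on {lo-1, ..., hi+1}.  The two outer states,
    reached when the chain steps out of [lo, hi], jump straight back in
    (hypothetical boundary_rate; the paper's rates in (4.10) may differ)."""
    rng = random.Random(seed)
    t, i = 0.0, (lo + hi) // 2
    states = [i]
    while t < t_max:
        if i == lo - 1:          # lower boundary state: reflect up
            t += rng.expovariate(boundary_rate)
            i = lo
        elif i == hi + 1:        # upper boundary state: reflect down
            t += rng.expovariate(boundary_rate)
            i = hi
        else:                    # interior: usual (assumed) SIS-type rates
            up = lam * i * (n - i) / n + eps * (n - i)
            down = float(i)
            t += rng.expovariate(up + down)
            i += 1 if rng.random() * (up + down) < up else -1
        states.append(i)
    return states
```

Until the first exit from the interval, the reflected path and the original chain can be run from the same randomness, which is the path-wise indistinguishability used below.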
As before, let be two independent copies of with transitions specified by (4.7) before coalescence, and moving together after coalescence. Also, let evolve as two independent copies of with until coalescence, and moving together after coalescence. Analogously to the relationship between and , the two-dimensional processes and are path-wise indistinguishable in probability up to the first time exits , that is until the time defined in (4.6).
Proof of Lemma 4.6.
Set
and write for the underlying probability measure of and for the underlying probability measure of . Since for all , the process leaves either by having the lower one jump down or by having the upper one jump up. Then, as in the previous discussion of the relationship between and , we have the following two immediate observations:
(4.11) | ||||
(4.12) |
Therefore, it suffices to prove the result for the process . To apply [6, Proposition 4.1] (Proposition 4.7) to , first set
Clearly, is non-negative and integer-valued, so for all such that . Also, Condition (ii) of Proposition 4.7 is satisfied by taking .
We now verify Condition (i) of Proposition 4.7, namely that the drift of is non-positive for all x such that . First note that, in our case, there are four terms in the sum since, before coalescence, the coupled process has four possible ways of jumping. The movements
are always allowed and lead to . The other two possible jumps,
always lead to and the boundary conditions are reflected by the indicators in the transition rates. Also, note that for all ,
with defined in (4.3). It then follows that
(4.13) |
4.4. Bounding the time to coalescence.
We combine the previous results from this section for a proof of Lemma 4.2.
Proof of Lemma 4.2.
Fix . Denote the good event that the coupled processes and are in the interior good set by time by
Then the first part of Corollary 4.4 implies
(4.15) |
By the Markov property, to bound the first term on the right hand side of (4.15), it is enough to bound, uniformly in ,
where the first inequality follows from the second part of Corollary 4.4, and for the second inequality we assume is sufficiently large so that .
Lemma 4.5 and Lemma 4.6, combined with a straightforward application of the Markov property, imply that for all large enough,
Combining the displays above gives
and taking the limit superior as on both sides finishes the proof. ∎
5. Proof of the lower bound for the mixing time
In this section, we prove the following lower bound on the mixing time, which together with Lemma 4.1 proves Theorem 1.2. Recall and from Definition 1.1 and Theorem 1.2.
Lemma 5.1.
For sufficiently large , we have
where is a non-negative real valued function depending only on and with as .
Before outlining the strategy of the proof, we first state an improved version of the concentration of measure inequality from Lemma 3.1 for the scaled process , which holds only if the starting state of the chain is close enough to . Recall the interval of the form defined in (4.1). In the previous section, we took , in defining our good set , so that the size of the good set shrinks as gets large. To prove the lower bound, we consider as a positive constant, i.e. independent of .
Lemma 5.2.
Fix , and . Let and let . Then the following holds for all large enough (depending on and ):
The lemma essentially follows from an application of [6, Theorem 3.3], but the details are technical, so we postpone the proof until the end of this section. Assuming that Lemma 5.2 holds, the proof of Lemma 5.1 is then broken down into two steps.
The first step is taken in Lemma 5.3, where we show that the stationary distribution is with high probability concentrated in a ball of order around the fixed point . The explicit expression for the stationary distribution can be obtained by solving the detailed balance equations (see, for example, the displays (A1) and (A2) in [30]), but it is not easy to read off the concentration from this expression. The outline of our strategy is to first fix and choose large enough such that . We use the improved concentration inequality of Lemma 5.2 to show that the distribution of is concentrated in a -ball around given . Next, the first part of Lemma 3.2 implies that is within of . Together these imply that is in a ball around with high probability. Finally, since the distribution of is close to stationarity by the time , as stated in Lemma 4.1, this implies that also assigns most of its mass to a ball around .
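Although the closed-form stationary distribution in [30] is not transparent, the concentration can be checked numerically: for a birth-and-death chain, detailed balance gives π(i+1)/π(i) = (birth rate at i)/(death rate at i+1). The sketch below uses the same assumed parametrisation as before (infection rate λ i(n−i)/n plus self-infection ε(n−i), recovery rate i); note that ε > 0 makes the birth rate at 0 positive, which is exactly why the self-infection model has a non-degenerate stationary distribution.

```python
import math


def stationary_dist(n, lam, eps):
    """Stationary distribution via detailed balance,
    pi(i+1)/pi(i) = birth(i)/death(i+1), accumulated in log-space
    to avoid overflow for large n."""
    log_pi = [0.0]
    for i in range(n):
        birth = lam * i * (n - i) / n + eps * (n - i)  # > 0 at i = 0 since eps > 0
        death = i + 1.0
        log_pi.append(log_pi[-1] + math.log(birth / death))
    m = max(log_pi)
    w = [math.exp(v - m) for v in log_pi]
    z = sum(w)
    return [v / z for v in w]


def fixed_point(lam, eps):
    """Positive root of lam*x*(1-x) + eps*(1-x) = x, i.e. the
    deterministic fixed point of the scaled drift (assumed form)."""
    # rearranged: lam*x^2 - (lam - 1 - eps)*x - eps = 0
    a, b, c = lam, -(lam - 1.0 - eps), -eps
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
```

For instance, with n = 2000, λ = 2 and ε = 0.01, essentially all of the stationary mass lies within a few multiples of √n of n times the fixed point, in line with the concentration asserted in Lemma 5.3.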
To prove the lower bound, it is sufficient to consider only one specific initial state. For convenience, we take the initial state to be . The second step of the proof of Lemma 5.1 is then to show that the distribution of given is concentrated in a region that is sufficiently far from ; combined with the concentration of around , this gives a lower bound on the mixing time. The outline of the proof of this result is as follows. First fix a large enough such that , where are as in Lemma 3.2. By the improved concentration inequality, the distribution of is concentrated in a -ball around . The second part of Lemma 3.2 then implies that is at least above the fixed point . Together, these two results indicate that with high probability
but by Lemma 5.3, we know that assigns most of its mass within of .
Now, with the outline in mind, we state and prove concentration for the stationary distribution .
Lemma 5.3.
Proof.
Let , let and let . Lemma 5.2 implies that, as long as is sufficiently large,
Moreover, applying the first part of Lemma 3.2 with , we obtain
for large enough. It then follows that, for large enough,
Thus, for any and large enough, the inequality implies that
where is the law of given the initial state . The result now follows by taking the limsup and applying Lemma 4.1. ∎
We can now prove our lower bound on the mixing time.
Proof of Lemma 5.1.
Fix a large enough such that and let . It follows from the second part of Lemma 3.2 by taking that
for all large enough. Moreover, note that and so for any such that , Lemma 5.2 implies that
for all large enough. Combining the previous two relations gives
(5.1) |
where is the set defined in Lemma 5.3. Now,
so that (5.1) and Lemma 5.3 imply that for all sufficiently large ,
The only thing left at this point is to prove Lemma 5.2. To do this, we shall make use of [6, Theorem 3.3], which provides a concentration inequality for contracting Markov chains.
Theorem 5.4.
[6, Theorem 3.3] Let be a stable, conservative, non-explosive continuous-time Markov chain on a discrete state space , with generator matrix . Suppose that is a metric on , and let be a function such that, for some constant , for all . Let be a subset of , and let and be constants such that for all and whenever and is such that . For , let . Suppose there is a Markovian coupling of two copies of with generator and constant , such that for all ,
(5.2) |
Then, for all and ,
We are not able to apply this theorem directly to our chain, since it is not contracting on the entire state space. One could modify the theorem to obtain a version that requires the contraction property to hold only on a “good” subset of the state space, as is the case for near . (Indeed, [6, Theorem 2.3] is stated in such a form, but for discrete time.) Instead, we avoid this issue by making use of the reflected chain defined for the final phase immediately after Proposition 4.7, which behaves as in a good set and reflects at the boundary. Here, with a slight abuse of notation, we define to be the reflected chain on the state space , where and are considered to be the upper and lower boundaries respectively and is a fixed positive constant.
Proof of Lemma 5.2.
Recall the definition of from (4.2). Just as we argued in the passage following (4.10), it is sufficient to prove the concentration for on , where . To apply Theorem 5.4, we take to be the identity function and the metric so that . The jump size of is clearly bounded by and we have for all
Finally, we show that there is a coupling of two copies of which satisfies (5.2) for all , and hence is contracting in Wasserstein distance. The coupling is the one used earlier in Section 4.3, where the copies move independently until they collide, and then move in unison. Verifying (5.2) for this coupling is very similar to verifying Condition (i) in Proposition 4.7.
To compute the left-hand side of (5.2), first note that if the two copies have already collided with each other, then the left-hand side is . Otherwise, the left-hand side is the same as in (4.3) with replaced by . Since we have chosen to be less than , for sufficiently large we must have . Therefore, (5.2) is satisfied with . Then, Theorem 5.4 implies that for all , , and ,
Moreover, since and are path-wise indistinguishable in probability before leaving , we have
The second part of Lemma 4.3 implies that for all , and all sufficiently large , we have
(5.3) |
and hence
Now, for any fixed , choose large enough such that . It follows that
Together with (5.3), we have
for large enough, as required. ∎
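As a numerical sanity check of the contraction property (5.2) (illustrative only, with the same assumed rates as in the earlier sketches), one can estimate the distance between the two coupled copies started on either side of the deterministic fixed point: within a short time the mean distance collapses to essentially zero.

```python
import random


def coupled_distance(n=500, gap=30, t_max=8.0, seed=3, lam=2.0, eps=0.01):
    """One run of the coalescing coupling; returns |Y - X| at time t_max
    (zero once the copies have coalesced)."""
    rng = random.Random(seed)

    def rates(i):
        return lam * i * (n - i) / n + eps * (n - i), float(i)

    # start symmetrically around the (approximate) deterministic fixed point
    center = int(0.505 * n)
    x, y, t = center - gap, center + gap, 0.0
    while t < t_max and x < y:
        ux, dx = rates(x)
        uy, dy = rates(y)
        tot = ux + dx + uy + dy
        t += rng.expovariate(tot)
        r = rng.random() * tot
        if r < ux:
            x += 1
        elif r < ux + dx:
            x -= 1
        elif r < ux + dx + uy:
            y += 1
        else:
            y -= 1
    return y - x


def mean_distance(reps=100, **kw):
    """Monte Carlo estimate of the expected distance at time t_max."""
    return sum(coupled_distance(seed=s, **kw) for s in range(reps)) / reps
```

Near the fixed point the drift of the difference is strictly negative, so the expected gap shrinks exponentially until the diffusive fluctuations bring the two copies together, mirroring the combination of Lemma 4.5 and Lemma 4.6.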
References
- [1] Massimo A Achterberg, Bastian Prasse and Piet Van Mieghem “Analysis of continuous-time Markovian -SIS epidemics on networks” In Physical Review E 105.5, 2022, pp. 054305
- [2] David Aldous and Persi Diaconis “Shuffling cards and stopping times” In The American Mathematical Monthly 93.5 Taylor & Francis, 1986, pp. 333–348
- [3] William F Ames and BG Pachpatte “Inequalities for differential and integral equations” Elsevier, 1997
- [4] Håkan Andersson and Boualem Djehiche “A threshold limit theorem for the stochastic logistic epidemic” In Journal of Applied Probability 35.03, 1998, pp. 662–670
- [5] A. D. Barbour “Quasi–stationary distributions in Markov population processes” In Advances in Applied Probability 8.2 Cambridge University Press, 1976, pp. 296–314
- [6] A. D. Barbour, Graham Brightwell and Malwina Luczak “Long-term concentration of measure and cut-off” In Stochastic Processes and their Applications 152 Elsevier, 2022, pp. 378–423
- [7] A. D. Barbour and MJ Luczak “A law of large numbers approximation for Markov population processes with countably many types” In Probability Theory and Related Fields 153.3 Springer, 2012, pp. 727–757
- [8] Riddhipratim Basu, Jonathan Hermon and Yuval Peres “Characterization of cutoff for reversible Markov chains” In Proceedings of the twenty-sixth annual ACM-SIAM symposium on Discrete algorithms, 2014, pp. 1774–1791 SIAM
- [9] William E Boyce, Richard C DiPrima and Hugo Villagómez Velázquez “Elementary differential equations and boundary value problems”, 2004
- [10] Graham Brightwell, Thomas House and Malwina Luczak “Extinction times in the subcritical stochastic SIS logistic epidemic” In Journal of Mathematical Biology 77.2 Springer, 2018, pp. 455–493
- [11] Guan-Yu Chen and Laurent Saloff-Coste “On the mixing time and spectral gap for birth and death chains” In ALEA Latin American Journal of Probability and Mathematical Statistics 10.1, 2013, pp. 293–321
- [12] Guan-Yu Chen and Laurent Saloff-Coste “Spectral computations for birth and death chains” In Stochastic Processes and their Applications 124.1, 2014, pp. 848–882
- [13] Guan-Yu Chen and Laurent Saloff-Coste “Computing cutoff times of birth and death chains” In Electronic Journal of Probability 20, 2015, pp. no. 76, 47
- [14] Persi Diaconis and Mehrdad Shahshahani “Generating a random permutation with random transpositions” In Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 57.2 Springer, 1981, pp. 159–179
- [15] Jian Ding, Eyal Lubetzky and Yuval Peres “Total variation cutoff in birth-and-death chains” In Probability Theory and Related Fields 146.1-2, 2010, pp. 61–85
- [16] Charles R. Doering, Khachik V. Sargsyan and Leonard M. Sander “Extinction Times for Birth-Death Processes: Exact Results, Continuum Asymptotics, and the Failure of the Fokker–Planck Approximation” In Multiscale Modeling & Simulation 3.2, 2005, pp. 283–299
- [17] Alexandros Eskenazis and Evita Nestoridi “Cutoff for the Bernoulli–Laplace urn model with swaps” In Annales de l’Institut Henri Poincaré, Probabilités et Statistiques 56.4, 2020, pp. 2621–2639 Institut Henri Poincaré
- [18] Willy Feller “Die Grundlagen der Volterraschen Theorie des Kampfes ums Dasein in wahrscheinlichkeitstheoretischer Behandlung” In Acta Biotheoretica 5.1, 1939, pp. 11–40
- [19] Eric Foxall “Extinction time of the logistic process” In Journal of Applied Probability 58.3 Cambridge University Press, 2021, pp. 637–676
- [20] Alison L. Hill, David G. Rand, Martin A. Nowak and Nicholas A. Christakis “Infectious Disease Modeling of Social Contagion in Networks” In PLOS Computational Biology 6.11 Public Library of Science, 2010, pp. 1–15
- [21] Matthew James Keeling and Joshua V Ross “On methods for studying stochastic disease dynamics” In Journal of the Royal Society Interface 5.19 The Royal Society London, 2008, pp. 171–181
- [22] Thomas G Kurtz “Limit theorems for sequences of jump Markov processes approximating ordinary differential processes” In Journal of Applied Probability 8.2 Cambridge University Press, 1971, pp. 344–356
- [23] David A Levin and Yuval Peres “Markov chains and mixing times” American Mathematical Soc., 2017
- [24] Fabio Lopes and Malwina Luczak “Extinction time for the weaker of two competing SIS epidemics” In The Annals of Applied Probability 30.6 Institute of Mathematical Statistics, 2020, pp. 2880–2922
- [25] Ingemar Nåsell “The quasi-stationary distribution of the closed endemic SIS model” In Advances in Applied Probability 28.3 Cambridge University Press, 1996, pp. 895–932
- [26] Ingemar Nåsell “On the quasi-stationary distribution of the stochastic logistic epidemic” In Mathematical Biosciences 156.1-2, 1999, pp. 21–40
- [27] Garrett T Nieddu, Eric Forgoston and Lora Billings “Characterizing outbreak vulnerability in a stochastic SIS model with an external disease reservoir” In Journal of the Royal Society Interface 19.192 The Royal Society, 2022, pp. 20220253
- [28] Justin Salez “Cutoff for non-negatively curved Markov chains” In Journal of the European Mathematical Society 26.11, 2024, pp. 4375–4392
- [29] Patricia Stone, Hilde Wilkinson-Herbots and Valerie Isham “A stochastic model for head lice infections” In Journal of Mathematical Biology 56 Springer, 2008, pp. 743–763
- [30] Piet Van Mieghem “Explosive phase transition in susceptible-infected-susceptible epidemics with arbitrary small but nonzero self-infection rate” In Physical Review E 101.3 APS, 2020, pp. 032303
- [31] Piet Van Mieghem and Eric Cator “Epidemics in networks with nodal self-infection and the epidemic threshold” In Physical Review E 86.1 APS, 2012, pp. 016116
- [32] Piet Van Mieghem and Fenghua Wang “Time dependence of susceptible-infected-susceptible epidemics on networks with nodal self-infections” In Physical Review E 101.5 APS, 2020, pp. 052310
- [33] George H. Weiss and Menachem Dishon “On the asymptotic behavior of the stochastic and deterministic models of an epidemic” In Mathematical Biosciences 11.3-4, 1971, pp. 261–265