This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Data Forecasts of the Epidemic COVID-19 by Deterministic and Stochastic Time-Dependent Models

Bo-Sheng Chen Corresponding author: Department of Mathematics, National Taiwan Normal University, Taipei, Taiwan. Zong-Ying Wu Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan. [email protected] Yen-Jia Chen Jann-Long Chern
Abstract

We propose a deterministic SAIVRD model and a stochastic SARV model of the epidemic COVID-19 involving asymptomatic infections and vaccinations to conduct data forecasts using time-dependent parameters. The forecast by our deterministic model conducts 10-day predictions to see whether the epidemic will ease or become more severe in the short term. The forecast by our stochastic model predicts the probability distributions of the final size and the maximum size to see how large the epidemic will be in the long run. The first forecast using the data set from the USA gives the relative errors within 3%3\% in 55 days and 7%7\% in 1010 days for the prediction of isolated infectious cases. The distributions in the second forecast using the time-varying parameters from the first forecast are bimodal in our model with time-independent parameters in our simulations of smaller populations. For the model with time-dependent model, what are different are that there is another peak in the final size distribution, that the probability of minor outbreak is higher and that the maximum size distribution is oscillating with time-dependent parameters. The final size distributions are similar between different populations and so are the maximum size distributions, which means that we can expect that with the same parameters and in a large population, the ratio of the final size and the maximum size are distributed similarly. The result shows that under recent transmissibility of this disease in the USA, when an initial infection is introduced into all-susceptible large population, major outbreak occurs with around 95%95\% of the population and with high probability the epidemic is maximized to around 30%30\% of the population.

1 Introduction

According to WHO [1] report, as of 7:06pm CEST, 15 September 2022, there have been 607,745,726 confirmed cases of COVID-19, including 6,498,747 deaths, globally. The quick spread of this epidemic makes a worldwide impact on not only the human health but the economics and developments. Even worse, the uncontrollability of asymptomatic infections causes a problem that we may underestimate the transmission of the disease. [2] estimates that the pooled percentage of asymptomatic infections among the confirmed cases was 40.50%40.50\% globally. Nevertheless, tough tactics, such as lockdown and traffic halt, to block the spread of the disease are not long plans because these require large social cost. Fortunately, vaccinations help the immunity and lower the probability of getting infected or dead from this disease [3]. As of 12 September 2022, a total of 12,613,484,608 vaccine doses, reported to WHO [1], have been administered. Hence, it is important to consider the contribution of vaccinations to make more precise policy decision.

In [4] and [5], traditional epidemic models, such as SIS, SIR and SEIR models, and mathematical models of some well-known pandemics, such as smallpox, HIV and malaria, have been discussed in detail. In [6] and [7], the authors considered asymptomatic infections and the quarantine policy when studying the transmission of SARS in 2003. [8], [9] and [10] also investigate mathematical models of COVID-19 considering also the asymptomatic infections. Especially, [9] and [10] discretize the differential equations to difference equations and assume that some parameters are time-dependent. They use the data from different countries, such as China, the USA and Brazil, and apply a popular machine learning method: the Finite Impulse Response (FIR) filters to track the time series of these parameters and then predict the desired values, such as the number of infections and recoveries. Furthermore, [11] considers vaccinations against COVID-19 and uses an Ensemble Kalman Filter to conduct data forecast.

In contrast of deterministic models, stochastic models are more realistic in epidemiology. Many stochastic models of epidemics are also proposed in recent years. The most common two ways of stochastic modeling of epidemics are to conduct a Markov branching process [12], [13] and [14] and to conduct stochastic differential equations [15], [16] and [17]. The quantities in interest shall be probability distributions instead of exact values. For example, [18] provides formulae of the final size in its deterministic model, while [19], [20] and [21] give algorithms calculating the final size probability distribution.

The probability distributions of the final size and the maximum size are important topics in stochastic epidemic models. [12], [19] and [22] are good materials for a review of calculations of these two distributions. Many optimized algorithms calculating these distributions have been proposed, such as in [20] and in [21]. It was calculated that these probability distributions are bimodal.

In this stage of the epidemic COVID-19, the quarantine policy is loosen gradually in many countries and so the data of quarantine is more unavailable. Hence, we will only consider an epidemic prevention policy: vaccination in our models in this paper. In this paper, we will conduct data forecasts to see whether the epidemic will ease or become more severe in the short term and how large the epidemic will be in the long run.

For the first question, we conduct a deterministic SAIVRD model using ordinary differential equations, which involves six compartments: susceptible people (S), (unconfirmed) asymptomatically infectious cases (A), (confirmed) isolated infectious cases (I), recovered people (R), vaccinated-immune people (V) and deaths (D). In this model, we will analyze the existence and the asymptotic stability of the disease-free and endemic equilibria related to the basic reproduction number R0R_{0}. It is shown that if R0>1,R_{0}>1, then there is a unique endemic equilibrium which is locally asymptotically stable, and if R0<1,R_{0}<1, then the disease-free equilibrium is locally asymptotically stable. We also assume the time-dependence of some parameters and track their time series to predict the time-varying parameters and values of confirmed cases II using FIR filters. Using the data set [23], [24] and [25] from the USA for 4040 days from June 30-th 2022 to August 8-th, 2022 to predict the next 1010 days, we get the relative prediction errors of II within 3%3\% in 55 days and 7%7\% in 1010 days, and the trend of change is closed to the real data of II. This shows that our proposed model meets the real situation.

For the second question, we conduct a time-dependent Markov SARV model with no demography assuming that there will not be another outbreak and considering the administration of vaccinations. The four compartments are susceptible people (S), asmytomatic infections (A), isolations/recoveries/deaths (R) and vaccine-immune (V). Under this model, we approximate the probability distributions of the final size and the maximum size with one initial infectious person using the time-varying parameters obtained from the first forecast.

The desired distributions are also bimodal in our model with time-independent parameters in our simulations of smaller populations. For the model with time-dependent model, what are different are that there is another peak in the final size distribution, that the probability of ‘‘minor outbreak’’ [15] is higher with time-dependent parameters and that the maximum size distribution is oscillating. The distributions are similar between different populations and so we can estimate the “ratio” of the final size and the maximum size distributions by observing those in small populations. Under recent transmissibility of this disease in the USA, when an initial infection is introduced into all-susceptible large population, ‘‘major outbreak’’ [15] occurs with around 95%95\% of the population; with high probability the epidemic is maximized to around 30%30\% of the population. Moreover, to estimate the extinction probability of this epidemic in a large population, we assume the infinite population. As shown in [21] and [26], the plateau value of cumulative probability of the final size in a large population is approximately the extinction probability. It is also interesting to note that this probability is approximately the reciprocal of the basic reproduction number R0SARVR_{0}^{\mathrm{SARV}} (defined in the SARV model) whenever R0SARV>1R_{0}^{\mathrm{SARV}}>1, which demonstrates numerically the result in [15]. These data forecasts can serve as reference to public health policy.

We organize the article as follows. We derive our deterministic SAIVRD model, define the basic reproduction number R0R_{0} and investigate the existence and asymptotic stability of the disease-free and endemic equilibria in section 2. Section 3 uses the FIR filter to train the time-varying rates to predict the trend of the epidemic in the short term. The probability distributions of the final size and the maximum size in the Markov SARV model with both time-independent and time-dependent parameters will be calculated in section 4. Section 5 provides numerical results. Section 6 states our concluding remarks.

2 The Derivation of the Deterministic SAIVRD Model and Stability Analysis

We firstly introduce our deterministic SAIVRD model in this section. The six compartments are defined in section 1. The model is under the following assumptions: there must be a period that an infected case is not confirmed yet which is asymptomatically infectious; asymptomatically infectious cases will not enter RR or DD; a dead person cannot infect others; the system is closed, that is, there is no migration (the border is well-controlled); confirmed cases are well isolated so that they have very low adequate contact with others; nosocomial infections and re-infections are omitted.

The parameters are set as following and to be non-negative:

Notation Description
BB The new births per unit of time
μ\mu Natural death rate
β\beta Transmission rate of AA
vv Full-vaccination rate multiplying the vaccination efficiency
ww Progression rate from AA to II
ρ\rho Death rate from II
γ\gamma Recovering rate from II


Let N0N_{0} be the initial total population. The flow chart is shown as following.

SSAAVVIIDDRRBBμS\mu SμV\mu VμA\mu AμI\mu IμR\mu RvSvSβASN0\dfrac{\beta AS}{N_{0}}wAwAρI\rho IγI\gamma I

Then the model equations are

{S(t)=BvS(t)βN0A(t)S(t)μS(t)A(t)=βN0A(t)S(t)(w+μ)A(t)I(t)=wA(t)(ρ+γ+μ)I(t)V(t)=vS(t)μV(t)R(t)=γI(t)μR(t)D(t)=ρI(t)\begin{cases}S^{\prime}(t)=B-vS(t)-\dfrac{\beta}{N_{0}}A(t)S(t)-\mu S(t)\\ A^{\prime}(t)=\dfrac{\beta}{N_{0}}A(t)S(t)-(w+\mu)A(t)\\ I^{\prime}(t)=wA(t)-(\rho+\gamma+\mu)I(t)\\ V^{\prime}(t)=vS(t)-\mu V(t)\\ R^{\prime}(t)=\gamma I(t)-\mu R(t)\\ D^{\prime}(t)=\rho I(t)\end{cases} (1)

with the initial conditions S(0)=S0>0,A(0)=A0>0,I(0)=V(0)=R(0)=D(0)=0.S(0)=S_{0}>0,A(0)=A_{0}>0,I(0)=V(0)=R(0)=D(0)=0. Clearly, we have the result in theorem 2.1.

Theorem 2.1.

All solutions of eq. 1 always lie in the positively invariant set

Ω:={(S,A,I,V,R,D)6:0S+A+I+V+R+DBμ+N0}\Omega:=\{(S,A,I,V,R,D)\in\mathbb{R}^{6}:0\leq S+A+I+V+R+D\leq\frac{B}{\mu}+N_{0}\}

for all t0t\geq 0. In particular, SS and AA remain positive for all t0t\geq 0.

Proof.

Note that N=S+A+I+V+R+DN=S+A+I+V+R+D and N=S+A+I+V+R+D=BμN as functions of t.N^{\prime}=S^{\prime}+A^{\prime}+I^{\prime}+V^{\prime}+R^{\prime}+D^{\prime}=B-\mu N\mbox{ as functions of }t. Then

0N(t)=Bμ+(N0Bμ)eμtBμ+N0.0\leq N(t)=\frac{B}{\mu}+\big{(}N_{0}-\frac{B}{\mu}\big{)}e^{-\mu t}\leq\frac{B}{\mu}+N_{0}.

Since S(v+βN0N(t)+μ)S(v+βN0(Bμ+N0)+μ)S,\displaystyle S^{\prime}\geq-(v+\dfrac{\beta}{N_{0}}N(t)+\mu)S\geq-(v+\dfrac{\beta}{N_{0}}(\frac{B}{\mu}+N_{0})+\mu)S, we have

SS0exp((v+βN0(Bμ+N0)+μ))>0.S\geq S_{0}\exp\big{(}-(v+\dfrac{\beta}{N_{0}}(\frac{B}{\mu}+N_{0})+\mu)\big{)}>0.

Similarly for A,I,V,RA,I,V,R and DD and this theorem is proved. ∎

Since the number of deaths can be calculated by D=NSAIVRD=N-S-A-I-V-R, the model can be reduced to have five variables S,A,I,VS,A,I,V and RR.

We now derive the basic reproduction number R0R_{0} in our SAIVRD model. The Jacobian matrix of the reduced model from eq. 1 at an equilibrium (S,A,I,V,R)(S^{*},A^{*},I^{*},V^{*},R^{*}) is given by

[(v+βN0A+μ)βN0S000βN0AβN0S(w+μ)0000w(ρ+γ+μ)00v0γμ00000μ],\begin{bmatrix}-(v+\dfrac{\beta}{N_{0}}A^{*}+\mu)&-\dfrac{\beta}{N_{0}}S^{*}&0&0&0\\ \dfrac{\beta}{N_{0}}A^{*}&\dfrac{\beta}{N_{0}}S^{*}-(w+\mu)&0&0&0\\ 0&w&-(\rho+\gamma+\mu)&0&0\\ v&0&\gamma&-\mu&0\\ 0&0&0&0&-\mu\end{bmatrix},

whose characteristic polynomial is

P(λ)=\displaystyle P(\lambda)= ((ρ+γ+μ)λ)(μλ)2((v+μ)(w+μβN0S)\displaystyle(-(\rho+\gamma+\mu)-\lambda)(-\mu-\lambda)^{2}((v+\mu)(w+\mu-\dfrac{\beta}{N_{0}}S^{*}) (2)
+βN0A(w+μ)(v+w+2μ+βN0(AS))λ+λ2).\displaystyle+\dfrac{\beta}{N_{0}}A^{*}(w+\mu)-(v+w+2\mu+\dfrac{\beta}{N_{0}}(A^{*}-S^{*}))\lambda+\lambda^{2}).

At the disease-free equilibrium (DFE) (Bv+μ,0,0,vμBv+μ,0),(\displaystyle\frac{B}{v+\mu},0,0,\frac{v}{\mu}\frac{B}{v+\mu},0), the Jacobian of eq. 1 is

[(v+μ)BβN0(v+μ)0000BβN0(v+μ)(w+μ)0000w(ρ+γ+μ)00v00μ000γ0μ],\begin{bmatrix}-(v+\mu)&\displaystyle-\frac{B\beta}{N_{0}(v+\mu)}&0&0&0\\ 0&\displaystyle\frac{B\beta}{N_{0}(v+\mu)}-(w+\mu)&0&0&0\\ 0&w&-(\rho+\gamma+\mu)&0&0\\ v&0&0&-\mu&0\\ 0&0&\gamma&0&-\mu\end{bmatrix},

whose eigenvalues are (v+μ),(ρ+γ+μ),μ,μ,BβN0(v+μ)(w+μ).-(v+\mu),-(\rho+\gamma+\mu),-\mu,-\mu,\displaystyle\frac{B\beta}{N_{0}(v+\mu)}-(w+\mu). Hence, we have BβN0(v+μ)(w+μ)<0\dfrac{B\beta}{N_{0}(v+\mu)}-(w+\mu)<0 if and only if the DFE is asymptotically stable. Hence, if we put

R0=BβN0(v+μ)(w+μ),R_{0}=\frac{B\beta}{N_{0}(v+\mu)(w+\mu)}, (3)

then we have theorem 2.2.

Theorem 2.2.

The DFE (Bv+μ,0,0,vμBv+μ,0)(\displaystyle\frac{B}{v+\mu},0,0,\frac{v}{\mu}\frac{B}{v+\mu},0) is locally asymptotically stable if and only if R0<1.R_{0}<1.

We then define the basic reproduction number by eq. 3 and explain why this definition of R0R_{0} is reasonable in epidemiology as follows. Firstly, the value of R0R_{0} has positive correlations with BB and β\beta, which means that there are more people will be infected when more people are born or when more people are contacted by infectious people. Secondly, R0R_{0} has negative correlations with w,vw,v and μ\mu, which means that if we can improve the efficiency of disease screening (to shorten the period of infecting others), improve the efficiency of vaccinations or the infected people die fast, then fewer people would be infected.

It is also worthwhile to note that the rates of recovery γ\gamma and death ρ\rho of the confirmed cases are not involved in R0R_{0}. This is because the class II is well-controlled such that people in the class II cannot cause further infections.

There may be an equilibrium in which AA^{*} and II^{*} are positive in eq. 1, whose stability means that the human will coexist the virus finally and which is called an endemic equilibrium. The existence and asymptotic stability of the endemic equilibria are stated in theorem 2.3.

Theorem 2.3.

There is a unique locally asymptotically stable endemic equilibrium whenever R0>1R_{0}>1.

Proof.

Solving

{0=BvSβN0ASμS0=βN0AS(w+μ)A0=wA(ρ+γ+μ)I0=vSμV0=γIμR,\begin{cases}0=B-vS^{*}-\dfrac{\beta}{N_{0}}A^{*}S^{*}-\mu S^{*}\\ 0=\dfrac{\beta}{N_{0}}A^{*}S^{*}-(w+\mu)A^{*}\\ 0=wA^{*}-(\rho+\gamma+\mu)I^{*}\\ 0=vS^{*}-\mu V^{*}\\ 0=\gamma I^{*}-\mu R^{*},\end{cases}

we have

A=ρ+γ+μwI,R=γμI,S=w+μβN0,V=v(w+μ)βμN0,A^{*}=\frac{\rho+\gamma+\mu}{w}I^{*},R^{*}=\frac{\gamma}{\mu}I^{*},S^{*}=\frac{w+\mu}{\beta}N_{0},V^{*}=\frac{v(w+\mu)}{\beta\mu}N_{0},

where I=B(v+μ)(w+μ)N0/β(1+1/w)(w+μ)(ρ+γ+μ).I^{*}=\dfrac{B-(v+\mu)(w+\mu)N_{0}/\beta}{(1+1/w)(w+\mu)(\rho+\gamma+\mu)}. Since I>0I^{*}>0 if and only if R0>1,R_{0}>1, the existence follows.
Plug this equilibrium into eq. 2 and then under the assumptions, since
βN0A(w+μ)>0,\dfrac{\beta}{N_{0}}A^{*}(w+\mu)>0, the roots of P(λ)P(\lambda) are all negative if and only if v+w+2μ+βN0(AS)>0,v+w+2\mu+\dfrac{\beta}{N_{0}}(A^{*}-S^{*})>0, which can be easily calculated to be equivalent to R0+w>0R_{0}+w>0 and so this theorem is proved. ∎

3 Prediction Methods Using Time-Varying Parametric Models

In this section, we assume the time-dependence of some parameters in our SAIVRD model derived in section 2 to conduct the predictions of infections and time-varying rates in the short term. Since the data is updated in days, it is reasonable to consider the discrete-time model instead of the ordinary differential equations.

Suppose that the parameters BB and μ\mu are constants and that the other parameters ρ(t),γ(t),w(t),v(t)\rho(t),\gamma(t),w(t),v(t) and β(t)\beta(t) are time-dependent. Then we have the model difference equations in eq. 4.

{S(t+1)S(t)=B(v(t)+μ)S(t)β(t)N0A(t)S(t)A(t+1)A(t)=β(t)N0A(t)S(t)(w(t)+μ)A(t)I(t+1)I(t)=w(t)A(t)(ρ(t)+γ(t)+μ)I(t)V(t+1)V(t)=v(t)S(t)μV(t)R(t+1)R(t)=γ(t)I(t)μR(t)D(t+1)D(t)=ρ(t)I(t).\begin{cases}S(t+1)-S(t)=B-(v(t)+\mu)S(t)-\dfrac{\beta(t)}{N_{0}}A(t)S(t)\\ A(t+1)-A(t)=\dfrac{\beta(t)}{N_{0}}A(t)S(t)-(w(t)+\mu)A(t)\\ I(t+1)-I(t)=w(t)A(t)-(\rho(t)+\gamma(t)+\mu)I(t)\\ V(t+1)-V(t)=v(t)S(t)-\mu V(t)\\ R(t+1)-R(t)=\gamma(t)I(t)-\mu R(t)\\ D(t+1)-D(t)=\rho(t)I(t).\end{cases} (4)

We will use the known data S(t),A(t),I(t),V(t),R(t)S(t),A(t),I(t),V(t),R(t) and D(t)D(t) for t=0,1,,T1t=0,1,\cdots,\\ T-1 to track these five time series: ρ(t),γ(t),w(t),v(t)\rho(t),\gamma(t),w(t),v(t) and β(t).\beta(t). Then we predict the values of S(T),A(T),I(T),V(T),R(T),D(T),ρ(T1),γ(T1),w(T1),v(T1)S(T),A(T),I(T),V(T),R(T),D(T),\rho(T-1),\gamma(T-1),w(T-1),v(T-1) and β(T1).\beta(T-1).

For t=0,1,,T2,t=0,1,\cdots,T-2, we solve ρ(t),γ(t),w(t),v(t)\rho(t),\gamma(t),w(t),v(t) and β(t)\beta(t) in eq. 5.

{ρ(t)=D(t+1)D(t)I(t),γ(t)=R(t+1)(1μ)R(t)I(t),w(t)=I(t+1)+(ρ(t)+γ(t)+μ1)I(t)A(t),v(t)=V(t+1)(1μ)V(t)S(t),β(t)=A(t+1)+(w(t)+μ1)A(t)A(t)S(t)N0, for t=0,1,,T2.\begin{cases}\rho(t)=\dfrac{D(t+1)-D(t)}{I(t)},\\ \gamma(t)=\dfrac{R(t+1)-(1-\mu)R(t)}{I(t)},\\ w(t)=\dfrac{I(t+1)+(\rho(t)+\gamma(t)+\mu-1)I(t)}{A(t)},\\ v(t)=\dfrac{V(t+1)-(1-\mu)V(t)}{S(t)},\\ \beta(t)=\dfrac{A(t+1)+(w(t)+\mu-1)A(t)}{A(t)S(t)}\cdot N_{0},\mbox{ for }t=0,1,\cdots,T-2.\end{cases} (5)

In our numerical experiment in section 5, some β(t)\beta(t) are negative, which is epidemiologically unreasonable. In this case, we reset β(t)\beta(t) to be 0.

We will apply a data set in the USA from [23], [24] and [25] that is of large number of infections and so the Finite Impulse Response (FIR) filters is an appropriate method to predict the time-varying rates [9]:

{ρ^(t)=x1ρ(t1)++xJ1ρ(tJ1)+x0=j=1J1xjρ(tj)+x0γ^(t)=y1γ(t1)++yJ2γ(tJ2)+y0=j=1J2yjγ(tj)+y0w^(t)=z1w(t1)++zJ3w(tJ3)+z0=j=1J3zjw(tj)+z0v^(t)=u1v(t1)++uJ4v(tJ4)+u0=j=1J4ujv(tj)+u0β^(t)=b1β(t1)++bJ5β(tJ5)+b0=j=1J5bjβ(tj)+b0,\begin{cases}\hat{\rho}(t)=x_{1}\rho(t-1)+\dots+x_{J_{1}}\rho(t-J_{1})+x_{0}=\sum_{j=1}^{J_{1}}x_{j}\rho(t-j)+x_{0}\\ \hat{\gamma}(t)=y_{1}\gamma(t-1)+\dots+y_{J_{2}}\gamma(t-J_{2})+y_{0}=\sum_{j=1}^{J_{2}}y_{j}\gamma(t-j)+y_{0}\\ \hat{w}(t)=z_{1}w(t-1)+\dots+z_{J_{3}}w(t-J_{3})+z_{0}=\sum_{j=1}^{J_{3}}z_{j}w(t-j)+z_{0}\\ \hat{v}(t)=u_{1}v(t-1)+\dots+u_{J_{4}}v(t-J_{4})+u_{0}=\sum_{j=1}^{J_{4}}u_{j}v(t-j)+u_{0}\\ \hat{\beta}(t)=b_{1}\beta(t-1)+\dots+b_{J_{5}}\beta(t-J_{5})+b_{0}=\sum_{j=1}^{J_{5}}b_{j}\beta(t-j)+b_{0},\end{cases} (6)

where Ji,i=1,2,3,4,5J_{i},i=1,2,3,4,5 are orders of the five FIR filters and xj,yj,zj,uj,bjx_{j},y_{j},z_{j},u_{j},b_{j} are coefficients of the impulse responses of these five FIR filters. The coefficients can be determined by considering the following minimization problems.

{min𝐱J1+1t=J1T2(ρ(t)ρ^(t))2+α1j=0J1xj2min𝐲J2+1t=J2T2(γ(t)γ^(t))2+α2j=0J2yj2min𝐳J3+1t=J3T2(w(t)w^(t))2+α3j=0J3zj2min𝐮J4+1t=J4T2(v(t)v^(t))2+α4j=0J4uj2min𝐛J5+1t=J5T2(β(t)β^(t))2+α5j=0J5bj2\begin{cases}\min\limits_{{\bf{x}}\in\mathbb{R}^{J_{1}+1}}\sum_{t=J_{1}}^{T-2}(\rho(t)-\hat{\rho}(t))^{2}+\alpha_{1}\sum_{j=0}^{J_{1}}x_{j}^{2}\\ \min\limits_{{\bf{y}}\in\mathbb{R}^{J_{2}+1}}\sum_{t=J_{2}}^{T-2}(\gamma(t)-\hat{\gamma}(t))^{2}+\alpha_{2}\sum_{j=0}^{J_{2}}y_{j}^{2}\\ \min\limits_{{\bf{z}}\in\mathbb{R}^{J_{3}+1}}\sum_{t=J_{3}}^{T-2}(w(t)-\hat{w}(t))^{2}+\alpha_{3}\sum_{j=0}^{J_{3}}z_{j}^{2}\\ \min\limits_{{\bf{u}}\in\mathbb{R}^{J_{4}+1}}\sum_{t=J_{4}}^{T-2}(v(t)-\hat{v}(t))^{2}+\alpha_{4}\sum_{j=0}^{J_{4}}u_{j}^{2}\\ \min\limits_{{\bf{b}}\in\mathbb{R}^{J_{5}+1}}\sum_{t=J_{5}}^{T-2}(\beta(t)-\hat{\beta}(t))^{2}+\alpha_{5}\sum_{j=0}^{J_{5}}b_{j}^{2}\end{cases} (7)

using rigid regression, where αi,i=1,2,3,4,5\alpha_{i},i=1,2,3,4,5 are regulation parameters, which avoid overfitting. After predicting the values of ρ^(T1),γ^(T1),w^(T1),v^(T1)\hat{\rho}(T-1),\hat{\gamma}(T-1),\hat{w}(T-1),\hat{v}(T-1) and β^(T1),\hat{\beta}(T-1), we plug these data back into eq. 4 to get the desired values S^(T),A^(T),\hat{S}(T),\hat{A}(T), I^(T),V^(T),R^(T)\hat{I}(T),\hat{V}(T),\hat{R}(T) and D^(T)\hat{D}(T) by replacing tt with T1T-1.

After doing prediction for one day, we can treat this predicted day as known data to predict more one day, and step by step we calculate more days. More precisely, we will predict the values of ρ^(t),γ^(t),w^(t),v^(t),β^(t),S^(t+1),A^(t+1),I^(t+1),V^(t+1),R^(t+1)\hat{\rho}(t),\hat{\gamma}(t),\hat{w}(t),\hat{v}(t),\hat{\beta}(t),\hat{S}(t+1),\hat{A}(t+1),\hat{I}(t+1),\hat{V}(t+1),\hat{R}(t+1) and D^(t+1)\hat{D}(t+1) for t=T1,T,,T+d1.t=T-1,T,\cdots,T+d-1.

In this process, we still need to determine the orders Ji,i=1,2,3,4,5J_{i},i=1,2,3,4,5 and find the formulae to solve the optimization problems in eq. 7. Given the orders, we can use a result from [10] named ‘‘Normal Gradient Equation’’ and described as follows, to compute the desired coefficients. Let f(0),f(1),,f(T2)f(0),f(1),\cdots,f(T-2) be the known data, and the FIR filter f^(t)=j=1Jxjf(tj)+x0\hat{f}(t)=\sum_{j=1}^{J}x_{j}f(t-j)+x_{0} be the predicted data, where t=J,,T2t=J,\cdots,T-2. Define the function FF by

F(x0,x1,,xJ;α)=t=JT2(f(t)f^(t))2+αj=0Jxj2,F(x_{0},x_{1},\dots,x_{J};\alpha)=\sum_{t=J}^{T-2}(f(t)-\hat{f}(t))^{2}+\alpha\sum_{j=0}^{J}x_{j}^{2},

where α\alpha is the regulation parameter. If det(αI+A)0,\det(\alpha I+A)\neq 0, then

(αI+A)1𝐛=argmin𝐱J+1F(𝐱,α),(\alpha I+A)^{-1}{\bf{b}}=\operatornamewithlimits{argmin}\limits_{{\bf{x}}\in\mathbb{R}^{J+1}}F({\bf{x}},\alpha),

where II is (J+1)×(J+1)(J+1)\times(J+1) identity matrix,
A=[(TJ1)t=JT2f(t1)t=JT2f(tJ)t=JT2f(t1)t=JT2(f(t1))2t=JT2f(t1)f(tJ)t=JT2f(tJ)t=JT2f(tJ)f(t1)t=JT2(f(tJ))2]A=\begin{bmatrix}(T-J-1)&\sum_{t=J}^{T-2}f(t-1)&\cdots&\sum_{t=J}^{T-2}f(t-J)\\ \sum_{t=J}^{T-2}f(t-1)&\sum_{t=J}^{T-2}(f(t-1))^{2}&\cdots&\sum_{t=J}^{T-2}f(t-1)f(t-J)\\ \vdots&\vdots&\ddots&\vdots\\ \sum_{t=J}^{T-2}f(t-J)&\sum_{t=J}^{T-2}f(t-J)f(t-1)&\cdots&\sum_{t=J}^{T-2}(f(t-J))^{2}\end{bmatrix} and
𝐛=[t=JT2f(t)t=JT2f(t)f(t1)t=JT2f(t)f(tJ)]{\bf{b}}=\begin{bmatrix}\sum_{t=J}^{T-2}f(t)\\ \sum_{t=J}^{T-2}f(t)f(t-1)\\ \vdots\\ \sum_{t=J}^{T-2}f(t)f(t-J)\end{bmatrix}.

We give the process of calculation in algorithm 1 to algorithm 3 [10]. Firstly, we use algorithm 1 to find the best choice of orders. Then after some pre-processing of data, we use algorithm 2 to predict the next-day data of S,A,I,V,R,DS,A,I,V,R,D and the time-varying rates. Finally, we predict the data for several days in algorithm 3.

The procedure of algorithm 1 is described as follows. Firstly, we input the known data (the time-dependent parameters), and decide how large T\ell_{T} the training set we want to extract from is, bounds UJU_{J} and LJL_{J} and regulation parameters. Then we use different JJ’s to find the best JJ which gives the least residual.

Algorithm 1 Best Order Searcher

Input: Data =(f(0),f(1),,f(T2))=(f(0),f(1),\cdots,f(T-2)); training size: T\ell_{T}; lower bound of order: LJL_{J} ; upper bound of order: UJU_{J}; Regulation Parameter: α\alpha.
Define DataT=(f(0),f(1),,f(T1))\mathrm{Data}_{T}=(f(0),f(1),\cdots,f(\ell_{T}-1)) and DataV=(f(T),f(T+1),,f(T2))\mathrm{Data}_{V}=(f(\ell_{T}),f(\ell_{T}+1),\cdots,f(T-2)).
Calculate the validation length: VTT1\ell_{V}\leftarrow T-\ell_{T}-1.

for JLJJ\leftarrow L_{J} to UJU_{J} do
     for tTt\leftarrow\ell_{T} to T2T-2 do
         Calculate 𝐱=(x0,x1,,xJ)T=argmin𝐱J+1F(𝐱,α){\bf{x}}=(x_{0},x_{1},\cdots,x_{J})^{T}=\operatornamewithlimits{argmin}\limits_{{\bf{x}}\in\mathbb{R}^{J+1}}F({\bf{x}},\alpha)
         Estimate f^J(t)=j=1Jxjf(tj)+x0,\hat{f}_{J}(t)=\displaystyle\sum_{j=1}^{J}x_{j}f(t-j)+x_{0}, where f(t1),f(t2),,f(T)f(t-1),f(t-2),\cdots,f(\ell_{T}) are taken to be f^J(t1),f^J(t2),,f^J(T)\hat{f}_{J}(t-1),\hat{f}_{J}(t-2),\cdots,\hat{f}_{J}(\ell_{T}) and f(T1),f(T2),,f(tJ+1),f(tJ)f(\ell_{T}-1),f(\ell_{T}-2),\cdots,f(t-J+1),f(t-J) are taken from DataT\mathrm{Data}_{T}.
     end for
     Calculate err(J):=t=TT2|f^J(t)f(t)|.\mathrm{err}(J):=\displaystyle\sum_{t=\ell_{T}}^{T-2}|\hat{f}_{J}(t)-f(t)|.
end for

return Order of FIR: Jfit=argminJ=LJ,,UJerr(J)J_{\mathrm{fit}}=\operatornamewithlimits{argmin}\limits_{J=L_{J},\cdots,U_{J}}\mathrm{err}(J).

We next conduct short-term prediction in algorithm 2. Due to the limitation of available data from [25], we need to make more modifications and assumptions as follows. Firstly, the cumulative confirmed cases I~\tilde{I} and cumulative vaccinations V~\tilde{V} provided in our data set [25] is different from the definition of II and VV in our model. Our definition of II is the simultaneous confirmed cases, and VV is the number of valid immunity by vaccination. To adapt our definition, we update II to be (I~RD)(1μ)(\tilde{I}-R-D)(1-\mu) and update VV to be V~σ(1μ),\tilde{V}\sigma(1-\mu), where σ\sigma is the efficiency of vaccination. Secondly, the asymptotic infections AA is updated by A=pI,A=pI, where pp is the estimated ratio between asymptomatic and symptomatic infections. Thirdly, S(t)S(t) is updated by S(t)=(N0A(t)I(t)V(t)R(t)D(t)+tB)(1μ)S(t)=(N_{0}-A(t)-I(t)-V(t)-R(t)-D(t)+tB)(1-\mu) for t=0,1,,T1.t=0,1,\cdots,T-1.

After these settings, we can predict the data for one day by algorithm 2. Firstly, we calculate the time-varying rate by known data and reset β(t)\beta(t) if it is negative. Then use algorithm 1 to find the best FIR orders for prediction and calculate argmin𝐱J+1F(𝐱,α)\operatornamewithlimits{argmin}\limits_{{\bf{x}}\in\mathbb{R}^{J+1}}F({\bf{x}},\alpha) to predict the rates. Finally, plug these predicted rates into eq. 4 to predict the data of the six compartments.

Algorithm 2 Short-Term Prediction

Input: Initial total population N0N_{0}, number TT of days of known data, revised numbers S(t),A(t),I(t),V(t),R(t)S(t),A(t),I(t),V(t),R(t) and D(t)D(t) for t=0,1,,T1,t=0,1,\cdots,T-1, and regulation parameters αi,i=1,2,3,4,5.\alpha_{i},i=1,2,3,4,5.

for t0t\leftarrow 0 to T2T-2 do
     Calculate ρ(t),γ(t),w(t),v(t)\rho(t),\gamma(t),w(t),v(t) and β(t)\beta(t) by eq. 5.
     if β(t)<0\beta(t)<0 then
         β(t)0.\beta(t)\leftarrow 0.
     end if
end forFind the best FIR orders Ji,i=1,2,3,4,5J_{i},i=1,2,3,4,5 by algorithm 1.
Calculate the coefficients by argmin𝐱J+1F(𝐱,α)\operatornamewithlimits{argmin}\limits_{{\bf{x}}\in\mathbb{R}^{J+1}}F({\bf{x}},\alpha) using the best orders.
Calculate ρ^(T1),γ^(T1),w^(T1),v^(T1)\hat{\rho}(T-1),\hat{\gamma}(T-1),\hat{w}(T-1),\hat{v}(T-1) and β^(T1)\hat{\beta}(T-1) by eq. 6 and S^(T),A^(T),I^(T),V^(T),R^(T)\hat{S}(T),\hat{A}(T),\hat{I}(T),\hat{V}(T),\hat{R}(T) and D^(T)\hat{D}(T) by eq. 4.
if β^(T1)<0\hat{\beta}(T-1)<0 then
     β^(T1)0.\hat{\beta}(T-1)\leftarrow 0.
end if

return ρ^(T1),γ^(T1),w^(T1),v^(T1),β^(T1),S^(T),A^(T),I^(T),V^(T),R^(T)\hat{\rho}(T-1),\hat{\gamma}(T-1),\hat{w}(T-1),\hat{v}(T-1),\hat{\beta}(T-1),\hat{S}(T),\hat{A}(T),\hat{I}(T),\hat{V}(T),\hat{R}(T) and D^(T)\hat{D}(T).

When we get the prediction for one day, we can treat them as the known data to predict more days by algorithm 3. Firstly, apply algorithm 2 to predict the data for one day. Then iteratively treat the predicted data as known data to conduct predictions for several days and our goal of this section is accomplished so far.

Algorithm 3 Prediction for Several Days

Input: Same as in algorithm 2 and Number dd of predicted days in addition.
Calculate ρ^(T1),γ^(T1),w^(T1),v^(T1)\hat{\rho}(T-1),\hat{\gamma}(T-1),\hat{w}(T-1),\hat{v}(T-1) and β^(T1)\hat{\beta}(T-1), and S^(T),A^(T),I^(T),V^(T),R^(T)\hat{S}(T),\hat{A}(T),\hat{I}(T),\hat{V}(T),\hat{R}(T) and D^(T)\hat{D}(T) by algorithm 2.

for tTt\leftarrow T to T+d1T+d-1 do
     Redefine II by concatenating II and I^(t)\hat{I}(t) and V,R,DV,R,D are redefined in the similar way.
     Calculate ρ^(t),γ^(t),w^(t),v^(t)\hat{\rho}(t),\hat{\gamma}(t),\hat{w}(t),\hat{v}(t) and β^(t),\hat{\beta}(t), and S^(t+1),A^(t+1),I^(t+1),V^(t+1),R^(t+1)\hat{S}(t+1),\hat{A}(t+1),\hat{I}(t+1),\hat{V}(t+1),\hat{R}(t+1) and D^(t+1)\hat{D}(t+1) by algorithm 2.
end for

return ρ^(t),γ^(t),w^(t),v^(t)\hat{\rho}(t),\hat{\gamma}(t),\hat{w}(t),\hat{v}(t) and β^(t),\hat{\beta}(t), and S^(t+1),A^(t+1),I^(t+1),V^(t+1),R^(t+1)\hat{S}(t+1),\hat{A}(t+1),\hat{I}(t+1),\hat{V}(t+1),\hat{R}(t+1) and D^(t+1)\hat{D}(t+1) for t=T1,T,,T+d1.t=T-1,T,\cdots,T+d-1.

4 Probability Distributions of the Final Size and the Maximum Size

To estimate how large the epidemic caused by a single infected person is in the long run, we calculate the probability distributions of the final size and the maximum size. The final size is defined to be the cumulative number of the infections until the end of epidemic and the maximum size is defined to be the maximum number of the simultaneous infections during the prevalence of the epidemic. We propose a simplified stochastic SARV model whose four compartments are defined in section 1. We do not consider the demography and assume that there will not be another outbreak, that each asymptomatic infectious case transmits the disease to others according to a Poisson process with parameter λ\lambda, that when a susceptible is infected, it enters AA, that AA enters RR (isolated, dead or recovered) after the period TT exponentially distributed with parameter α\alpha and that θ\theta is the average ratio of the number of vaccinations to the number of jumps [12].

The reasons why we treat the isolated cases as the compartment RR are that they cannot infect others and that this model does not require huge calculations.

Let NN be the total population. Then the process {(a(j),s(j))}\{(a(j),s(j))\} is a discrete-time Markov process on the state space

X={(a,s):a+sN}.X=\{(a,s):a+s\leq N\}.

The flow chart is shown as following.

SSAAVVRRλASN\dfrac{\lambda AS}{N}θS\theta SαA\alpha A

The basic reproduction number is then defined to be

R0SARV=λ(1θ)α.R_{0}^{\mathrm{SARV}}=\dfrac{\lambda(1-\theta)}{\alpha}.

The parameters under these assumptions are time-independent. However, just as in section 3, the parameters can also be time-dependent. In section 4.1, we will calculate the probability distributions of the final size and the maximum size with time-independent parameters, whose calculation is based on [12] and [21]. In section 4.2, we assume the time-dependence of parameters and apply the parameters from the first forecast in section 3 to approximate these two distributions and compare them with time-independent model.

4.1 Distributions with Time-Independent Parameters

Firstly, we can calculate the transition probabilities as following:

((A(j+1),S(j+1))=(a+1,s1)|(A(j),S(j))=(a,s))=λs(1jθ)λs(1jθ)+Nα,\mathbb{P}((A(j+1),S(j+1))=(a+1,s-1)|(A(j),S(j))=(a,s))=\dfrac{\lambda s(1-j\theta)}{\lambda s(1-j\theta)+N\alpha},
((A(j+1),S(j+1))=(a1,s)|(A(j),S(j))=(a,s))=Nαλs(1jθ)+Nα.\mathbb{P}((A(j+1),S(j+1))=(a-1,s)|(A(j),S(j))=(a,s))=\dfrac{N\alpha}{\lambda s(1-j\theta)+N\alpha}.

Let τ=inf{j0:A(j)=0}\tau=\inf\{j\geq 0:A(j)=0\} and

Pm(a,s)=((A(τ),S(τ))=(0,Nm)|(A(0),S(0))=(a,s)),P_{m}(a,s)=\mathbb{P}((A(\tau),S(\tau))=(0,N-m)|(A(0),S(0))=(a,s)),

which gives the probability that the final cumulative number of infections is mm with initial state (a,s).(a,s). Then what we seek is the final size distribution Pm(1,N1)P_{m}(1,N-1) for m=0,1,,N.m=0,1,\cdots,N.

We firstly set up the boundary conditions in eq. 8:

{Pm(a,s)=0 for s=0,1,,Nm1, and a=0,1,,Ns.Pm(0,Nm)=1.\begin{cases}P_{m}(a,s)=0\mbox{ for }s=0,1,\cdots,N-m-1,\mbox{ and }a=0,1,\cdots,N-s.\\ P_{m}(0,N-m)=1.\end{cases} (8)

The concept behind these is very simple: if the number of susceptible people is less than Nm,N-m, the cumulative number of infections must be greater than mm and so with probability 0 the epidemic ends with size m.m. If there is no infectious people and NmN-m susceptible people, then the epidemic must stop with size m.m.

For s=1,2,,Ns=1,2,\cdots,N and a=1,2,,Nsa=1,2,\cdots,N-s, by conditioning on the first step [27], we have

Pm(a,s)=λs(1θ)λs(1θ)+NαPm(a+1,s1)+Nαλs(1θ)+NαPm(a1,s).P_{m}(a,s)=\dfrac{\lambda s(1-\theta)}{\lambda s(1-\theta)+N\alpha}P_{m}(a+1,s-1)+\dfrac{N\alpha}{\lambda s(1-\theta)+N\alpha}P_{m}(a-1,s). (9)

The iteration can be explained as following: λs(1θ)λs(1θ)+Nα\dfrac{\lambda s(1-\theta)}{\lambda s(1-\theta)+N\alpha} and Nαλs(1θ)+Nα\dfrac{N\alpha}{\lambda s(1-\theta)+N\alpha} are the probabilities of infection and recovery in a step, respectively. The former multiplying Pm(a+1,s1)P_{m}(a+1,s-1) is the probability that when infection occurs in the first step, the epidemic ends with size m;m; the latter multiplying Pm(a1,s)P_{m}(a-1,s) is the probability that when recovery occurs in the first step, the epidemic ends with size m.m. Pm(a,s),P_{m}(a,s), the probability that the cumulative number of infections is mm with initial state (a,s),(a,s), equals the sum of these two probabilities.

Similarly as the final size distribution, we can calculate the maximum size distribution. Let

Qm(a,s)=(supj[0,τ]A(j)=m|(A(0),S(0))=(a,s))Q_{m}(a,s)=\mathbb{P}(\sup\limits_{j\in[0,\tau]}A(j)=m|(A(0),S(0))=(a,s))

for m=1,2,,N.m=1,2,\cdots,N. Then what we seek is the maximum size distribution Qm(1,N1)Q_{m}(1,N-1) for m=1,2,,Nm=1,2,\cdots,N and we have the boundary conditions and the iteration formula as in eq. 10.

{Qm(m,0)=1.Qm(a,0)=0 for am.Q0(0,s)=1.Qm(0,s)=0 for m>0.Qm(a,s)=0 for m<a or a+s<m.Qm(a,s)=λs(1θ)λs(1θ)+NαQm(a+1,s1)+Nαλs(1θ)+NαQm(a1,s), otherwise.\begin{cases}Q_{m}(m,0)=1.\\ Q_{m}(a,0)=0\mbox{ for }a\neq m.\\ Q_{0}(0,s)=1.\\ Q_{m}(0,s)=0\mbox{ for }m>0.\\ Q_{m}(a,s)=0\mbox{ for }m<a\mbox{ or }a+s<m.\\ Q_{m}(a,s)=\dfrac{\lambda s(1-\theta)}{\lambda s(1-\theta)+N\alpha}Q_{m}(a+1,s-1)\\ \hskip 108.00018pt+\dfrac{N\alpha}{\lambda s(1-\theta)+N\alpha}Q_{m}(a-1,s),\mbox{ otherwise.}\end{cases} (10)

The iteration can be explained in a similar way of that of the final size distribution. Since when s=0s=0 or a=0,a=0, there will no more infection occurs. It follows directly by the first to fourth conditions. If m<a,m<a, then the initial state contains the epidemic size larger than m;m; if a+s<m,a+s<m, then the epidemic size must be no more than mm afterwards and so the fifth condition follows.

4.2 Distributions with Time-Varying Parameters

As in section 3, the parameters can be time-dependent, so we can use the predicted time-varying rates to calculate the distributions of final size and maximum size. To adapt the simplified model, let

λ~(t)=β(t),α~(t)=w(t) and θ~(t)=v(t),\tilde{\lambda}(t)=\beta(t),\tilde{\alpha}(t)=w(t)\mbox{ and }\tilde{\theta}(t)=v(t), (11)

where tt is the count of predicted days. Let jj be the step in which (a,s)(a,s) lies in starting from (1,N1).(1,N-1).

It is not reasonable to assume that the process proceeds one step in one day, so we assume that there are rr steps in one day. That is, there are rr jj’s corresponding to a same tt. Then the step jj lies in the tj:=jrt_{j}:=\lfloor\dfrac{j}{r}\rfloor-th day and so we can set

λ(j)=λ~(tj)r,α(j)=α~(tj)r and θ(j)=θ~(tj)r.\lambda(j)=\dfrac{\tilde{\lambda}(t_{j})}{r},\alpha(j)=\dfrac{\tilde{\alpha}(t_{j})}{r}\mbox{ and }\theta(j)=\dfrac{\tilde{\theta}(t_{j})}{r}. (12)

With the same boundary conditions as in eq. 8, we rewrite eq. 9 as

Pm(a,s)=\displaystyle P_{m}(a,s)= λ(j)s(1θ(j))λ(j)s(1θ(j))+Nα(j)Pm(a+1,s1)\displaystyle\dfrac{\lambda(j)s(1-\theta(j))}{\lambda(j)s(1-\theta(j))+N\alpha(j)}P_{m}(a+1,s-1) (13)
+Nα(j)λ(j)s(1θ(j))+Nα(j)Pm(a1,s).\displaystyle+\dfrac{N\alpha(j)}{\lambda(j)s(1-\theta(j))+N\alpha(j)}P_{m}(a-1,s).

The choice of rr must satisfy that tjT,t_{j}\leq T, the total number of predicted days.

Note that every step moves a state by either (1,0)(-1,0) or (1,1).(1,-1). For a state (a,s)(a,s) with s=1,2,,Ns=1,2,\cdots,N and a=1,2,,Ns,a=1,2,\cdots,N-s, write (a,s)=(1,N1)+(x,y),(a,s)=(1,N-1)+(x,-y), that is, (1,N1)(1,N-1) is moved to (a,s)(a,s) by (x,y).(x,-y). In the jj-th step, (x,y)=(j+2y,y).(x,-y)=(-j+2y,-y). Then j=2yx=2(N1s)(a1).j=2y-x=2(N-1-s)-(a-1).

Similarly for the maximum size distribution, we set the same boundary conditions as in eq. 10 and rewrite the iteration as

Qm(a,s)=\displaystyle Q_{m}(a,s)= λ(j)s(1θ(j))λ(j)s(1θ(j))+Nα(j)Qm(a+1,s1)\displaystyle\dfrac{\lambda(j)s(1-\theta(j))}{\lambda(j)s(1-\theta(j))+N\alpha(j)}Q_{m}(a+1,s-1) (14)
+Nα(j)λ(j)s(1θ(j))+Nα(j)Qm(a1,s).\displaystyle+\dfrac{N\alpha(j)}{\lambda(j)s(1-\theta(j))+N\alpha(j)}Q_{m}(a-1,s).

Algorithm 4 computes the final size and the maximum size distributions with time-dependent parameters. The first and third for loops set up the boundary conditions of PmP_{m} and Qm.Q_{m}. Then we compute PmP_{m} iteratively in the second for loop. The fourth for loop computes QmQ_{m} iteratively [21].

Algorithm 4 Final Size and Maximum Size Distributions with Time-Dependent Parameters

Input: Total population N,N, the number rr of steps in one day and predicted parameters λ~(t),α~(t)\tilde{\lambda}(t),\tilde{\alpha}(t) and θ~(t)\tilde{\theta}(t) for t=0,1,,T1.t=0,1,\cdots,T-1.

for m0m\leftarrow 0 to NN do
     Pm=O(N+1)×(N+1).P_{m}=O_{(N+1)\times(N+1)}.
     Pm(0,Nm)=1.P_{m}(0,N-m)=1.
     Qm=O(N+1)×(N+1).Q_{m}=O_{(N+1)\times(N+1)}.
end for
for s1s\leftarrow 1 to NN do
     for a1a\leftarrow 1 to NsN-s do
         tj2(N1s)(a1)r.t_{j}\leftarrow\lfloor\dfrac{2(N-1-s)-(a-1)}{r}\rfloor.
         for m0m\leftarrow 0 to NN do
              Calculate Pm(a,s)P_{m}(a,s) by eq. 11, eq. 12 and eq. 13.
         end for
     end for
end for
for a0a\leftarrow 0 to NN do
     Qa(a,0)=1.Q_{a}(a,0)=1.
end for
for s1s\leftarrow 1 to N1N-1 do
     Q0(0,s)=1.Q_{0}(0,s)=1.
     for a1a\leftarrow 1 to NsN-s do
         tj2(N1s)(a1)r.t_{j}\leftarrow\lfloor\dfrac{2(N-1-s)-(a-1)}{r}\rfloor.
         for ma+1m\leftarrow a+1 to a+sa+s do
              Calculate Qm(a,s)Q_{m}(a,s) by eq. 11, eq. 12 and eq. 14.
         end for
         Qa(a,s)=1m=a+1a+sQm(a,s).Q_{a}(a,s)=1-\sum_{m=a+1}^{a+s}Q_{m}(a,s).
     end for
end for

return P=(P0(1,N1),P1(1,N1),,PN(1,N1))P=(P_{0}(1,N-1),P_{1}(1,N-1),\cdots,P_{N}(1,N-1)) and Q=(Q0(1,N1),Q1(1,N1),,QN(1,N1))Q=(Q_{0}(1,N-1),Q_{1}(1,N-1),\cdots,Q_{N}(1,N-1)).

4.3 Extinction Probability in a Large Population

In this subsection, we discuss the extinction probability with one initial infection in an infinite (or very large) population with exponential distributed infectious period related to the basic reproduction number R0SARVR^{\mathrm{SARV}}_{0} in our stochastic SARV model. We will also see the relation between the extinction probability and the final size distribution.

The following calculation is based on the method proposed in [28]. [21] also did this discussion in their proposed SIkRSIkR model to approximate the outbreak probability, which is complementary to the extinction probability.

Let XX be a random variable of number of infections caused by a single infectious person during exponential distributed infectious period. Let TT be the random variable of the infectious period. Since λTs(1θ)N=λT(Na)(1θ)NλT(1θ)\dfrac{\lambda Ts(1-\theta)}{N}=\dfrac{\lambda T(N-a)(1-\theta)}{N}\rightarrow\lambda T(1-\theta) as N,N\to\infty, we write the probability generating function of XX as 𝔼(uX)=𝔼(eλT(1θ)(1u)).\mathbb{E}(u^{X})=\mathbb{E}(e^{-\lambda T(1-\theta)(1-u)}). Using the moment generating function 𝔼(etT)=ααt\mathbb{E}(e^{tT})=\dfrac{\alpha}{\alpha-t} of T,T, we have

𝔼(uX)=αα+λ(1θ)(1u).\mathbb{E}(u^{X})=\dfrac{\alpha}{\alpha+\lambda(1-\theta)(1-u)}.

Then the extinction probability uu of the Markov branching process is the smaller solution of the fixed point equation αα+λ(1θ)(1u)=u,\dfrac{\alpha}{\alpha+\lambda(1-\theta)(1-u)}=u, which gives

u=min{1,αλ(1θ)}=min{1,1R0SARV}.u=\min\{1,\dfrac{\alpha}{\lambda(1-\theta)}\}=\min\{1,\dfrac{1}{R_{0}^{\mathrm{SARV}}}\}.

For the relation with the final size distribution, we do the following calculation considering the method proposed by [26]. Let Ωg(z)=jωj(g)zj\Omega_{g}(z)=\displaystyle\sum_{j}\omega_{j}(g)z^{j} be the probability generating function for the distribution of the number of recoveries at step gg. Let Ω(z)=limgΩg(z)\Omega_{\infty}(z)=\lim\limits_{g\rightarrow\infty}\Omega_{g}(z) be the pointwise limit. Let ω\omega_{\infty} be the probability that the cumulative number of the infections is infinite and define z=1z^{\infty}=1 as z=1z=1 and z=0z^{\infty}=0 as z[0,1)z\in[0,1). Then we can express Ω(z)\Omega_{\infty}(z) by Ω(z)=r<ωrzr+ωz.\Omega_{\infty}(z)=\displaystyle\sum_{r<\infty}\omega_{r}z^{r}+\omega_{\infty}z^{\infty}. Let z1z\to 1- and we have r<ωr=1ω,\displaystyle\sum_{r<\infty}\omega_{r}=1-\omega_{\infty}, which gives the extinction probability. Therefore, this probability is exactly the cumulative probability of the final size which is finite in an infinite population. In order to demonstrate this concept graphically, we will use a small population N=1000N=1000 with time-independent parameters in section 5.5 to observe the plateau value in the cumulative probability distribution of the final size, which is approximately the value of the extinction probability uu.

5 Numerical Results

5.1 Data Setting

For the first forecast, we track the data for 4040 days in the USA from [23], [24] and [25] from June 30-th, 2022 to August 8-th, 2022 to predict the data on next 10 days, i.e., August 9-th, 2022 to August 18-th, 2022, and estimate the relative errors.

LJL_{J} is taken to be 22. UJU_{J} and T\ell_{T} are taken to be 3535. σ\sigma is taken (underestimate) to be 0.50.5 [29] and pp is taken to be 4010040=23\dfrac{40}{100-40}=\dfrac{2}{3} [2]. For demography, it was estimated in [23] that the total population in the USA is N0=338,279,857N_{0}=338,279,857, that the number of new births per day in the USA is about B=10,800B=10,800 and that death rate per day is μ=7855N0\mu=\dfrac{7855}{N_{0}}.

Unlike the FIR filters that can predict the real-world quantities, the calculation of the desired probability distributions is too large to apply the real data. Hence, we only apply the estimated rates from the short-term prediction for 2020 days to small populations N=100,N=500N=100,N=500 and N=1000N=1000 to demonstrate the use of calculation of the desired distributions. Finally, we consider the rates to be time-dependent. We assume that the epidemic ends within 2020 days and set r=10,r=50r=10,r=50 and r=100,r=100, respectively. We will see that the distributions are similar between different populations and so we can expect that with the same parameters and in a large population, the ‘‘ratios’’ of the final size and the maximum size are distributed similarly.

5.2 Prediction Using Time-Varying Parametric Models

The predicted time-varying v(t)v(t) is around 1.2423×104,ρ(t)1.2423\times 10^{-4},\rho(t) is around 1.1445×1041.1445\times 10^{-4} and γ(t)\gamma(t) is around 0.0305.0.0305. The predicted values of β(t)\beta(t) and w(t)w(t) vary drastically and they are presented in fig. 1; fig. 2 shows the real and predicted data of confirmed I(t)I(t); the time-varying basic reproduction numbers from in the SAIVRD model are presented in fig. 3.

Refer to caption
Refer to caption
Figure 1: Predictions of β(t)\beta(t) (left) and w(t)w(t) (right). The hollow points represent the real data and the star-shaped points represent the predicted values.
Refer to caption
Figure 2: Prediction of I(t)I(t). The hollow points represent the real data and the star-shaped points represent the predicted values.
Refer to caption
Figure 3: The Time-Varying Basic Reproduction Numbers R0(t)R_{0}(t). The hollow points represent the real data and the star-shaped points represent the predicted values.

5.3 Distributions With Time-Independent Parameters

From the first forecast in section 3, we also get the time-varying rates λ~(t),α~(t)\tilde{\lambda}(t),\tilde{\alpha}(t) and θ~(t)\tilde{\theta}(t) from August 9-th, 2022 to August 28-th, 2022. Then we estimate λ,α\lambda,\alpha and θ\theta to be the average of the predicted values λ~(t),α~(t)\tilde{\lambda}(t),\tilde{\alpha}(t) and θ~(t)\tilde{\theta}(t), respectively, namely, 1.3304×103,4.5077×1041.3304\times 10^{-3},4.5077\times 10^{-4} and 1.2460×1061.2460\times 10^{-6}, respectively. Using eq. 8, eq. 9 and eq. 10, we compute the probability distributions of the final size and the maximum size and present them in fig. 4 and fig. 5.

Refer to caption
Refer to caption
Refer to caption
Figure 4: Final Size Distribution with Time-Independent Parameters as N=100,N=500N=100,N=500 and N=1000.N=1000.
Refer to caption
Refer to caption
Refer to caption
Figure 5: Maximum Size Distribution with Time-Independent Parameters as N=100,N=500N=100,N=500 and N=1000.N=1000.

5.4 Distributions With Time-Varying Parameters

We next use the time-varying rates λ~(t),α~(t)\tilde{\lambda}(t),\tilde{\alpha}(t) and θ~(t)\tilde{\theta}(t) from the section 5.2 and apply algorithm 4 to compute the probability distributions of the final size and the maximum size under the time-dependent assumption and present them in fig. 6 and fig. 7.

Refer to caption
Refer to caption
Refer to caption
Figure 6: Final Size Distribution with Time-Dependent Parameters as N=100,N=500N=100,N=500 and N=1000.N=1000.
Refer to caption
Refer to caption
Refer to caption
Figure 7: Maximum Size Distribution with Time-Dependent Parameters as N=100,N=500N=100,N=500 and N=1000.N=1000.

5.5 Results and Discussions

The relative prediction errors of II are within 3%3\% in 55 days and 7%7\% in 1010 days. The trend of change is closed to the real data of II. Therefore, the methods provided in [9] and [10] also have good performance in our SAIVRD model and our model meets the real situation and then the predicted parameters are reliable.

We can observe, from fig. 3, that the time-varying basic reproduction number in our SAIVRD model oscillates not far from 11 and so the epidemic is likely to be lowly prevalent and there may not be another outbreak in the short term.

The distributions of the final size and the maximum size are bimodal, which agrees the results from many previous papers, such as [12], [20] and [21]. These mean that with high probability, the epidemic ends and is maximized with very high or very low size. With time-dependent parameters, the distributions look slightly different. There is another peak in the final size distribution, the probability of minor outbreak is higher and the maximum size distribution is oscillating with time-dependent parameters. As mentioned in section 5.1, we can observe that the distributions are similar between different populations and so we can estimate the “ratio” of the final size and the maximum size distributions by observing those in small populations. From fig. 4 and fig. 6, under recent transmissibility of this disease in the USA, when an initial infection is introduced into all-susceptible (large) population, ‘‘major outbreak’’ [15] occurs with around 95%95\% of the population; from fig. 5 and fig. 7, with high probability the epidemic is maximized to around 30%30\% of the population. On the other hand, for the distributions with time-dependent parameters, the probability of ‘‘minor outbreak’’ [15] is higher than that with time-independent parameters and of course it follows that the probability of major outbreak is lower than that with time-independent parameters.

Finally, we discuss the relation between the final size distribution and the extinction probability with one initial infection in an infinite (or very large) population with exponential distributed infectious period. The cumulative probability distribution of the final size using the settings in section 5.3 is presented in fig. 8.

Refer to caption
Figure 8: Cumulative Final Size Distribution with Time-Independent Parameters, N=1000.N=1000.

By using the parameters from section 5.3, the basic reproduction number in this SARV model is R0SARV=2.9513R_{0}^{\mathrm{SARV}}=2.9513 and the extinction probability is u=0.3388,u=0.3388, which is approximately the value of the plateau in fig. 8 and also 1R0SARV\dfrac{1}{R_{0}^{\mathrm{SARV}}}. In view of the derivation in Section 2 in [26], using the plateau value of the cumulative final size distribution with N=1000N=1000 is an ideal simulation for approximation of the extinction probability and the outbreak probability with one initial infection in a large population.

6 Conclusion and Future Works

At this stage of COVID-19, many countries, such as the UK and the USA [30] and [31], loosen their public health policies against this epidemic, looking ahead to living with COVID-19. This decision may have theoretical basis. Our discussion of extinction probability is an example. We can see that the extinction probability is as low as about 33.9%33.9\% by our simulation.

In our paper, we proposed a deterministic SAIVRD model and a stochastic Markov SARV model, simplified from the SAIVRD model, of the epidemic COVID-19. In the deterministic SAIVRD model, we analyzed the existence and the asymptotic stability of disease-free and endemic equilibria related to the basic reproduction number. Based on this model, we conducted the numerical simulations for data forecast to do short-term prediction with time-varying rates and got small relative errors (metioned in section 1 and section 5.5), which means that our proposed model meets the real situation in the society of our data set. This forecast can also answer the first question in section 1. Next, in our stochastic Markov SARV model, we extended the parameters in branching process to be time-dependent and used this model to approximate the probability distributions of the final size and the maximum size. These can answer the second question in section 1. The results with time-dependent and time-independent parameters are a little different and so it is worthwhile to investigate distributions considering the time-dependent parameters instead of time-independent ones. Finally, we estimate the probability of extinction in real situation by both calculating directly and using the cumulative probability of the final size.

In time-dependent Markov SARV model, we assumed the number rr of steps in a day. One interesting question comes here: how does the choice of rr effect the results? One possible direction is to estimate the time until the epidemic ends and then we can decide how long do we need to predict in section 5.2 so that rr can be chosen so that every step lies within this time. With aid of [32] and [33], estimating the ending time of an epidemic is one of future tasks to make a better choice of rr in our time-dependent Markov SARV model.

Furthermore, we would extend our models to many aspects of epidemiology modeling, such as data forecast by stochastic differential equations [34], the effect of diffusion [35] and even more combining them to conduct models with stochastic partial differential equations to conduct more epidemiologically realistic models.

References

  • [1] WHO ‘‘Coronavirus (COVID-19) Dashboard’’, https://covid19.who.int/
  • [2] Qiuyue Ma et al. ‘‘Global percentage of asymptomatic SARS-CoV-2 infections among the tested population and individuals with confirmed COVID-19 diagnosis: a systematic review and meta-analysis’’ In JAMA network open 4.12 American Medical Association, 2021, pp. e2137257–e2137257
  • [3] CDC ‘‘CDC COVID-19 Study Shows mRNA Vaccines Reduce Risk of Infection by 91 Percent for Fully Vaccinated People’’, https://www.cdc.gov/media/releases/2021/p0607-mrna-reduce-risks.html
  • [4] Herbert W Hethcote ‘‘The mathematics of infectious diseases’’ In SIAM review 42.4 SIAM, 2000, pp. 599–653
  • [5] Fred Brauer, Carlos Castillo-Chavez and Zhilan Feng ‘‘Mathematical models in epidemiology’’ Springer, 2019
  • [6] Sze-Bi Hsu and Ying-Hen Hsieh ‘‘Modeling intervention measures and severity-dependent public response during severe acute respiratory syndrome outbreak’’ In SIAM Journal on Applied Mathematics 66.2 SIAM, 2005, pp. 627–647
  • [7] Sze-Bi Hsu and Ying-Hen Hsieh ‘‘On the role of asymptomatic infection in transmission dynamics of infectious diseases’’ In Bulletin of Mathematical Biology 70.1 Springer, 2008, pp. 134–155
  • [8] Ugo Avila-Ponce León, Ángel GC Pérez and Eric Avila-Vales ‘‘An SEIARD epidemic model for COVID-19 in Mexico: mathematical analysis and state-level forecast’’ In Chaos, Solitons & Fractals 140 Elsevier, 2020, pp. 110165
  • [9] Yi-Cheng Chen, Ping-En Lu, Cheng-Shang Chang and Tzu-Hsuan Liu ‘‘A time-dependent SIR model for COVID-19 with undetectable infected persons’’ In Ieee transactions on network science and engineering 7.4 IEEE, 2020, pp. 3279–3294
  • [10] Bo-Cyuan Lin et al. ‘‘The data forecast in COVID-19 model with applications to US, South Korea, Brazil, India, Russia and Italy’’ In arXiv preprint arXiv:2011.04738, 2020
  • [11] Rabih Ghostine, Mohamad Gharamti, Sally Hassrouny and Ibrahim Hoteit ‘‘An extended SEIR model with vaccination for forecasting the COVID-19 pandemic in Saudi Arabia using an ensemble Kalman filter’’ In Mathematics 9.6 MDPI, 2021, pp. 636
  • [12] Priscilla E Greenwood and Luis F Gordillo ‘‘Stochastic epidemic modeling’’ In Mathematical and statistical estimation approaches in epidemiology Springer, 2009, pp. 31–52
  • [13] Edwin C Yuan, David L Alderson, Sean Stromberg and Jean M Carlson ‘‘Optimal vaccination in a stochastic epidemic model of two non-interacting populations’’ In PLoS One 10.2 Public Library of Science San Francisco, CA USA, 2015, pp. e0115826
  • [14] Tom Britton et al. ‘‘Stochastic epidemic models with inference’’ Springer, 2019
  • [15] Linda JS Allen ‘‘A primer on stochastic epidemic models: Formulation, numerical simulation, and analysis’’ In Infectious Disease Modelling 2.2 Elsevier, 2017, pp. 128–142
  • [16] Olusegun Michael Otunuga ‘‘Global stability of nonlinear stochastic SEI epidemic model with fluctuations in transmission rate of disease’’ In International Journal of Stochastic Analysis 2017 Hindawi, 2017, pp. 6313620
  • [17] Andrés Ríos-Gutiérrez, Soledad Torres and Viswanathan Arunachalam ‘‘Studies on the basic reproduction number in stochastic epidemic models with random perturbations’’ In Advances in difference equations 2021.1 Springer, 2021, pp. 1–24
  • [18] Yukihiko Nakata and Ryosuke Omori ‘‘R0R_{0} fails to predict the outbreak potential in the presence of natural-boosting immunity’’ In arXiv preprint arXiv:1808.08749, 2018
  • [19] Thomas House, Joshua V Ross and David Sirl ‘‘How big is an outbreak likely to be? Methods for epidemic final-size calculation’’ In Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 469.2150 The Royal Society Publishing, 2013, pp. 20120436
  • [20] Andrew J Black and Joshua V Ross ‘‘Computation of epidemic final size distributions’’ In Journal of theoretical biology 367 Elsevier, 2015, pp. 159–165
  • [21] Zeynep Gökçe İşlier, Refik Güllü and Wolfgang Hörmann ‘‘An exact and implementable computation of the final outbreak size distribution under Erlang distributed infectious period’’ In Mathematical Biosciences 325 Elsevier, 2020, pp. 108363
  • [22] HE Daniels ‘‘The maximum size of a closed epidemic’’ In Advances in Applied Probability 6.4 Cambridge University Press, 1974, pp. 607–621
  • [23] Census ‘‘U.S. and World Population Clock’’, https://www.census.gov/popclock/
  • [24] Our World Data ‘‘Coronavirus (COVID-19) Vaccinations’’, https://ourworldindata.org/covid-vaccinations?country=USA
  • [25] Coronaboard ‘‘COVID-19 Dashboard’’, https://coronaboard.com/
  • [26] Joel C Miller ‘‘A primer on the use of probability generating functions in infectious disease modeling’’ In Infectious Disease Modelling 3 Elsevier, 2018, pp. 192–248
  • [27] Richard Durrett and R Durrett ‘‘Essentials of stochastic processes’’ Springer, 1999
  • [28] Hakan Andersson and Tom Britton ‘‘Stochastic epidemic models and their statistical analysis’’ Springer Science & Business Media, 2012
  • [29] WHO ‘‘Vaccine efficacy, effectiveness and protection’’, https://www.who.int/news-room/feature-stories/detail/vaccine-efficacy-effectiveness-and-protection
  • [30] ‘‘These Countries Are Looking Ahead to Living With Covid-19’’, https://www.wsj.com/articles/these-countries-are-looking-ahead-to-living-with-covid-19-11626012000
  • [31] NDTV ‘‘Europe Starts To Consider Treating Covid Like The Flu, Live With It’’, https://www.ndtv.com/world-news/europe-starts-to-consider-treating-covid-like-the-flu-live-with-it-2703334
  • [32] Kris V Parag, Christl A Donnelly, Rahul Jha and Robin N Thompson ‘‘An exact method for quantifying the reliability of end-of-epidemic declarations in real time’’ In PLoS Computational Biology 16.11 Public Library of Science San Francisco, CA USA, 2020, pp. e1008478
  • [33] Tusneem Elhassan and Ameera Gaafar ‘‘Mathematical modeling of the COVID-19 prevalence in Saudi Arabia’’ In medRxiv Cold Spring Harbor Laboratory Press, 2020
  • [34] Abdon Atangana and Seda İğret Araz ‘‘Modeling and forecasting the spread of COVID-19 with stochastic and deterministic approaches: Africa and Europe’’ In Advances in Difference Equations 2021.1 Springer, 2021, pp. 1–107
  • [35] Wonjun Lee, Siting Liu, Wuchen Li and Stanley Osher ‘‘Mean field control problems for vaccine distribution’’ In arXiv preprint arXiv:2104.11887, 2021