A control theory approach to optimal pandemic mitigation

P. Godara, S. Herminghaus, K. M. Heidemann Max Planck Institute for Dynamics and Self-Organization, Am Fassberg 17, 37073 Göttingen, Germany

Abstract

In the framework of homogeneous susceptible-infected-recovered (SIR) models, we use a control theory approach to identify optimal pandemic mitigation strategies. We derive rather general conditions for reaching herd immunity while minimizing the costs incurred by the introduction of societal control measures (such as closing schools, social distancing, lockdowns, etc.), under the constraint that the infected fraction of the population does never exceed a certain maximum corresponding to public health system capacity. Optimality is derived and verified by variational and numerical methods for a number of model cost functions. The effects of immune response decay after recovery are taken into account and discussed in terms of the feasibility of strategies based on herd immunity.

I Introduction

The recent outbreak of the illness COVID-19, caused by the SARS-CoV-2 virus, has resulted in a pandemic with unprecedented impact on societies all over the globe. Mitigation measures included complete lockdowns of societal life, with severe psychic, social, and economic consequences et al. (2020); Enserink and Kupferschmidt (2020). Hence a major focus of governments is on designing containment strategies which are as mild as possible, but substantial enough to limit the severity of the outbreak in order not to overwhelm the health service system (HSS). This requires reliable forecast, based on careful collection of data on the fraction of infected citizens, as well as extensive simulation Enserink and Kupferschmidt (2020); Dehning et al. (2020).

We discuss the system in terms of a so-called SIR model Hethcote (2000), referring to the fraction of susceptible (S), infected (I), and recovered (R) citizens in the population. We identify the recovered with all those who are neither susceptible nor infected ( $R=1-S-I$ ); the dynamics are thus fully described by a set of two equations:

\begin{array}[]{r c l}\partial_{t}S&=&-\beta SI\,,\\ &&\\ \partial_{t}I&=&\beta SI-\frac{I}{\tau}\,,\\ \end{array}

(1)

where $S,I\in[0,1]$ are the fraction of susceptible and infected individuals in the population, respectively. Note that $I(t)$ denotes the actual fraction of acutely infected citizens at time $t$ , no matter whether or not the infection has been realized by the individual, or has even been recorded. The infection of a susceptible individual by an infected is described by the infection rate $\beta>0$ , while $\tau$ is the average duration of the infection of an individual until her recovery.

The task we address in this study is to limit, during the whole period of the pandemic, the current number of infected individuals such as to prevent the number of those needing intensive care from exceeding the capacity of the deployed HSS. Such control may be described by a control parameter $\alpha(t)$ , which quantifies the effect of mitigation strategies upon the infection rate. We may write $\beta=\beta_{0}(1-\alpha)$ , such that $\alpha=0$ and $\alpha=1$ correspond to usual societal life and complete mutual isolation of citizens, respectively.

In order to define $\alpha$ in a general way, we state that a certain value of $\alpha=1-\beta/\beta_{0}$ denotes the subset of all possible mitigation measures which lead to an infection rate $\beta\leq\beta_{0}$ . We thus do not need to refer to any specific measures, but can formulate our approach in a very general way. The (more or less) accurate determination of these subsets is then the task of careful social (e.g., infection history) data analysis among citizens. This is illustrated in Fig 1, where mitigation strategies, followed by the public authorities, are indicated by the dashed and dotted curves, within a space spanned by the effect of the measures upon the infection rate, $\alpha$ , and the cost incurred for economy and society as a whole, $f(\alpha)$ .

Refer to caption — Figure 1: Space of mitigation measures. Sketch of the space of possible mitigation measures (grey shade), spanned by their effect on the infection rate, $\alpha$ , and their socio-economic cost, $f(\alpha)$ . Normal societal life is at the origin, while the upper right corner corresponds to total mutual isolation of all citizens, which is the strongest possible intervention. The dashed and dotted curves depict possible choices for mitigation measures. Such curves correspond to the cost functions referred to in the manuscript (see Eq (7).

As indicated by the explicit time dependence of $\alpha(t)$ , we follow a control theoretic approach. This is in some contrast to earlier treatments which have assumed mitigation measures to be constant over time Harko et al. (2014); Gros et al. (2020); Zlatic et al. (2020); Ferguson et al. (2020). Instead, we aim at determining the optimal function $\alpha(t)$ which minimizes the impact on society, while at the same time avoiding the HSS to become overloaded. At the end of the mitigation scenario, herd immunity shall be reached, so that the epidemic comes to an end without further control. We do not consider potential vaccination scenarios here (as is done elsewhere Perkins and Espana (2020); Djidjou-Demasse et al. (2020)), so immunization can only be achieved via infection with the virus in the present study.

A dimensionless quantity which is frequently used in epidemiology is the reproduction number, $R\coloneqq\beta\tau S$ , which denotes the average number of susceptibles infected by one infected individual. At the beginning of an epidemic ( $S\approx 1$ ) and without mitigation measures deployed ( $\beta=\beta_{0}$ ), one observes the basic reproduction number $R_{0}=\beta_{0}\tau$ . In case of the COVID-19 pandemic, typical estimates are $R_{0}\approx 3$ Liu et al. (2020) and $\tau\approx$ ten days et al. (2020); He et al. (2020). It has been shown before that the inherently random, network-like structure of interactions between individuals mainly results in a readjustment of $R_{0}$ Parshani et al. (2010). Hence we follow a mean field approach, disregarding small scale inhomogeneities of the system. We consider a homogeneous scenario, where $\beta(t)$ depends on time, but is spatially constant. Defining a normalized time variable, $\theta:=\beta t$ , we can rewrite Eq (1) as

\begin{array}[]{r c l}\partial_{\theta}S&=&-SI\,,\\ &&\\ \partial_{\theta}I&=&SI-\frac{I}{R_{0}(1-\alpha)}\,.\\ \end{array}

(2)

The trajectory of the system in the $S$ - $I$ -plane, $I(S)$ , can be obtained by dividing the equations displayed in Eq (2) by each other. This yields

\frac{dI}{dS}=\frac{1}{R_{0}(1-\alpha)S}-1\,,

(3)

which has the solution:

I(S)=\frac{\ln S}{R_{0}(1-\alpha)}-S+C\,,

(4)

where $C$ is a constant of integration. When no mitigation measures are in place ( $\alpha=0$ ), we have $I(1)=0$ and thus obtain

I(S)=\frac{\ln S}{R_{0}}-S+1\,.

(5)

This is plotted as the dashed curve in Fig 2 for the case $R_{0}=3$ . The maximum turns out to occur at $S_{\textrm{peak}}=1/R_{0}$ , where it reaches a value of

I_{\textrm{peak}}=1-\frac{\ln R_{0}}{R_{0}}-\frac{1}{R_{0}}\,.

(6)

At $S=S_{\textrm{peak}}=1/R_{0}$ , the population has reached herd immunity since from then on the number of infected citizens decreases until zero.

If the disease is serious, one is faced with the problem that with a fraction of $I_{\textrm{peak}}$ people being infected, the number of those in need of hospitalization or even intense care may exceed the capacity of the HSS. We denote by $I_{h}$ $<I_{\textrm{peak}}$ the maximum fraction of infected citizens which can be managed by the HSS. It is limited by infrastructural aspects, such as the availability of staff or the size and areal density of hospitals, and is indicated by the horizontal dotted line in Fig 2. Any substantial overshoot of the dashed curve over the dotted line constitutes a catastrophe, as a major fraction of the population will then not receive proper health care or treatment. This must clearly be avoided by means of suitable measures, such as reducing mutual contacts between individuals, banning major assemblies, reducing mobility etc., thus reducing the infection rate. Such measures are described by the parameter $\alpha(t)$ , which is to be discussed next.

II Optimal control problem

It is clear that the aforementioned measures will have a more or less substantial impact on society, mainly through their detrimental effects on economy, but also through other societal (e.g., cultural) damage. This may be described by means of a cost functional,

K\{\alpha\}\coloneqq\int f(\alpha(t))\,dt\,,

(7)

where the cost function $f$ corresponds to the mitigation strategy chosen, i.e., the curve chosen in Fig 1. It denotes the cost incurred at a given control $\alpha$ , along with the assumption $\partial f/\partial\alpha\geq 0\ \forall\alpha$ ; later we will require $\frac{\partial^{2}f}{\partial\alpha^{2}}\geq 0$ ¹¹1Convexity of the cost function is a reasonable assumption since one can imagine that chosen mitigation measures become more and more costly as $\alpha$ approaches unity. ²²2In principle, $f(\cdot)$ could depend on time $t$ explicitly or involve memory, e.g., because prolonged measures cause disproportinally higher cost. For simplicity, we do not treat such cases here..

The control problem we choose to address is to find a control trajectory, denoted by the function $\alpha(t)$ , such that the impact on society, as described by $K$ , is being minimized under the constraint that $I(t)$ never exceeds $I_{h}$ (capacity of the HSS) and at the end of mitigation—at unknown terminal time $t_{e}$ —herd immunity is reached, i.e., $S(t_{e})=1/R_{0}$ (end of phase II in Fig 2).

We thus need to

\begin{split}\textrm{minimize}\quad&K\{\alpha\}=\int_{0}^{t_{e}}f(\alpha(t))\,dt\,,\\ \textrm{such that}\quad&\dot{\bm{x}}(t)=\bm{h}(\bm{x},\alpha(t))\,,\,\bm{x}(0)=\bm{x}_{0}\,,\\ &I(t)\leq I_{h}\,,\,S(t_{e})=1/R_{0}\,,\end{split}

(8)

where $\bm{x}=(S,I)$ , and $\bm{h}(\bm{x},\alpha(t))$ as given by Eq 1 (with $\beta=\beta_{0}(1-\alpha)$ ). Minimization of mitigation time is covered by setting $f(\cdot)\equiv\mathrm{const}$ .

Solving Eq 8 can be recast into minimization of the following functional:

J\coloneqq\int_{0}^{t_{e}}f(\alpha(t))+\bm{\lambda}(t)\cdot[\dot{\bm{x}}(t)-\bm{h}(\bm{x},\alpha(t))]+\mu(t)(I-I_{h})\,dt\,,

(9)

where $\bm{\lambda}(t)=(\lambda_{S}(t),\lambda_{I}(t))$ , $\mu(t))$ are Lagrange multipliers. The introduction of $\mu(t)$ for the inequality constraint introduces additional constraints on $\mu(t)$ , namely $\mu(t)\geq 0$ and the complementary slackness condition $\mu(t)(I^{*}-I_{h})=0$ . These are also known as KKT conditions Kuhn and Tucker (1951). The star (^∗) represents the optimal quantities. Additionally, $S(t_{e})=1/R_{0}$ and $\bm{x}(0)=\bm{x}_{0}$ need to be enforced.

The necessary conditions for optimality can be evaluated by setting the first variation of Eq 9 to zero (for a detailed derivation see appendix A), we obtain:

	$\displaystyle f(\alpha^{}(t^{}_{e}))-\bm{\lambda}(t^{}_{e})\cdot\bm{h}(\bm{x}^{}(t^{}_{e}),\alpha^{}(t^{*}_{e}))=0\,,$		(10)
	$\displaystyle\dot{\bm{\lambda}}(t)=-\bm{\lambda}(t)\cdot\bm{\nabla}_{\bm{x}}\bm{h}\|_{\bm{x}^{}(t)}+\mu(t)\,\bm{\nabla}_{\bm{x}}(I-I_{h})\|_{\bm{x}^{}(t)}\,,$		(11)
	$\displaystyle\partial_{\alpha}f\|_{\alpha^{}(t)}-\bm{\lambda}(t)\cdot\partial_{\alpha}\bm{h}\|_{\alpha^{}(t)}=0\,,$		(12)
	$\displaystyle\lambda_{I}(t^{*}_{e})=0\,,$		(13)
	$\displaystyle\mu(t)\geq 0\,,$		(14)
	$\displaystyle\mu(t)(I^{*}-I_{h})=0\,.$		(15)

In addition to these, one also has the optimal system dynamics $\dot{\bm{x}}^{*}(t)=\bm{h}(\bm{x}^{*},\alpha^{*}(t))$ . The necessary conditions become sufficient conditions if $\bm{h}(\bm{x},\alpha)$ and $f(\alpha)$ are convex in $\bm{x}$ and $\alpha$ Mangasarian (1966). The former can be checked to be valid for the SIR model and the latter implies that $\frac{\partial^{2}f}{\partial\alpha^{2}}\geq 0$ .

The task remains to find Lagrange multipliers ${\bm{\lambda}}(t)$ and $\mu(t)$ which simultaneously satisfy the above conditions. This task usually involves a numerical search for the initial conditions of the Lagrange multipliers and evolve the system of ODE’s until the terminal conditions given by Eqs 10 and 13 are met. We escape the numerical difficulties arising with this procedure by first guessing a solution and then finding the appropriate Lagrange multipliers which verify optimality.

II.1 Heuristic approach

Let us first consider what is necessary to keep the fraction of infected citizens at a constant value, $I_{c}$ . Since $S$ varies with time, $dI/dt=0$ entails $dI/dS=0$ , and hence from Eq 3 we obtain

\alpha(t)=1-\frac{1}{R_{0}S(t)}\,.

(16)

This is indicated by the dash-dotted curve in Fig 2. Note that $\alpha(t)$ does not depend on the value of $I_{c}$ .

Next we consider the cost function for proceeding from some $S=S_{0}$ to some $S=S_{1}<S_{0}$ while maintaining $I=I_{c}$ . Inserting Eq 16 in Eq 1, we find

dS=-I_{c}\frac{dt}{\tau}\,.

(17)

We use this substitution to express the cost function as

K=\int\limits_{t(S_{0})}^{t(S_{1})}f(\alpha(t))\,dt=\frac{\tau}{I_{c}}\int\limits_{S_{1}}^{S_{0}}f(\alpha(S))\,dS\,.

(18)

Hence if $S_{0}$ and $S_{1}$ are fixed, $I_{c}$ must be as large as possible to minimize $K$ . This now guides our heuristic: we should control $\alpha$ such as to maintain $I=I_{h}$ for as long as possible.

If our guess is valid, the trajectory we have to follow in order to optimally control the pandemic is the one indicated as the solid curve in Fig 2. It starts at $(S,I)=(1,0)$ and proceeds until $I=I_{h}$ is reached. This completes phase I of the process, during which we set $\alpha=0$ . Mitigation measures are then deployed, such that $\alpha$ jumps upwards to the dash-dotted curve. It follows that curve all through phase II, hence keeping $I=I_{h}$ constant. As $S$ decreases, $\alpha$ is gradually reduced until it reaches zero at the end of phase II. All through phase III, $\alpha$ is maintained at zero, while $I$ gradually decays to zero because $R<1$ . This ends the pandemic.

II.2 Validation of the solution

We now proceed to verify our heuristic solution. We focus on phase II, as this is where the pandemic will spend the most amount of time. To do this we ask the question: Is it true that if the pandemic starts with $I_{0}=I_{h}$ , then for all $S_{0}>1/R_{0}$ and for all the cost functions $f(\alpha(t))$ such that $\frac{\partial f}{\partial\alpha},\frac{\partial^{2}f}{\partial\alpha^{2}}\geq 0$ , optimal pandemic control must keep $I(t)=I_{h}$ until $S(t_{e})=\frac{1}{R_{0}}$ is reached?

As we will see, the answer to the above question is yes. We proceed by setting $\alpha^{*}(t)=1-\frac{1}{R_{0}S^{*}(t)}$ and $S^{*}(t)=S_{0}-\frac{I_{h}}{\tau}t$ for $t\in[0,t^{*}_{e}]$ and $t^{*}_{e}=(S_{0}-\frac{1}{R_{0}})\frac{\tau}{I_{0}}$ . The terminal conditions for the dynamics are given by $\bm{x}^{*}(t^{*}_{e})=(\frac{1}{R_{0}},I_{h})$ . We can substitute the terminal conditions in Eq 10 and get the terminal condition for $\lambda_{S}$ (the terminal condition for $\lambda_{I}$ is given by Eq 13). The task remains to find $\mu(t)$ such that Eqs 11 and 12 are satisfied simultaneously.

Let’s have a look at Eqs 11 and 12 after making the substitutions. We have

\begin{split}\dot{\lambda}_{S}(t)=\frac{I_{h}}{\tau S^{*}(t)}[\lambda_{S}-\lambda_{I}]\,,\\ \dot{\lambda}_{I}(t)=\frac{1}{\tau}\lambda_{S}+\mu(t)\,,\end{split}

(19)

and

\left.\frac{\partial f}{\partial\alpha}\right|_{\alpha^{*}(t)}=(\lambda_{S}(t)-\lambda_{I}(t))\beta I_{h}S^{*}(t)\,.

(20)

One can hence find the Lagrange parameter $\mu(t)$ as

\begin{split}\mu(t)=-\frac{\lambda_{S}}{\tau}+\frac{\partial^{2}f}{\partial\alpha^{2}}\frac{1}{R_{0}^{2}S^{*3}}\,.\end{split}

(21)

Lastly, there is also the issue of non-negativity of $\mu$ . If we assume the convexity of $f$ we have $\frac{\partial^{2}f}{\partial\alpha^{2}}\geq 0$ . Hence the second summand in Eq 21 is non-negative. $\frac{\partial f}{\partial\alpha}\geq 0$ also implies that $\lambda_{S}$ is monotonically increasing (Eqs 19, 20). If the cost of zero control is zero, then the terminal condition Eq 10 implies that $\lambda_{S}(t^{*}_{e})=0$ , thereby implying $\lambda_{S}(t)\leq 0$ in the interval $[0,t^{*}_{e}]$ . This shows that for time independent cost functions $f$ under the assumptions that $\frac{\partial f}{\partial\alpha},\frac{\partial^{2}f}{\partial\alpha^{2}}\geq 0$ and $f(0)=0$ our heuristic solution is optimal in phase II.

II.3 Numerical results

We have shown that an optimal trajectory starting on the boundary ( $I_{0}=I_{h}$ ) remains on that boundary. To obtain optimal control trajectories for arbitrary initial conditions, we perform direct numerical optimization using the software library PSOPT Becerra (2010). In Fig 3 we show the numerical solutions to the control problem Eq (8) for various cost functions $f_{i}(\alpha(t))\in\{\alpha(t),\alpha^{2}(t),\alpha^{3}(t)\}$ .

Clearly, in all scenarios the optimal trajectory $I^{*}$ reaches the threshold value $I_{h}$ and remains there until $S^{*}(t^{*}_{e})$ is reached (phase II). Phase I, however, depends on the cost function applied. For linear costs, $\alpha(t)=0$ until $I=I_{h}$ ³³3The same holds for a constant cost function, i.e. if we assume societal costs to be independent of $\alpha(t)$ ; numerical data not shown here.. With higher order cost terms, we observe non-zero control from the very beginning (see Fig 3). This is to reduce the amount of time spent at large control values $\alpha$ and thereby the total integrated costs. The optimal terminal time $t_{e}^{*}$ increases with the order of the cost function (see Fig 3). We should note, however, that the influence of the functional form of $f(\alpha(t))$ , as expressed in the different shapes of the numerically derived curves, is minute, since the time axis is logarithmic, and the deviations are noticeable only during a very small fraction of time. Hence we see that the influence of the cost function, which corresponds to the chosen mitigation strategy, is finite, but can be regarded as negligible for practical purposes.

III Duration of the pandemic

If immune response acquired after recovery from an infection is permanent, the pandemic will last until herd immunity is reached at the end of phase II. This is when $S(t)=S_{1}=1/R_{0}$ , as indicated by the left vertical dotted line in Fig 2. This is the start of phase III, in which the number of infected citizens decays with no mitigation measures anymore in place (i.e., at $\alpha=0$ ). We will now discuss the time we expect it to take until this point is reached. Using Eq (17) with $I_{c}=I_{h}$ , we can write

dt=-\frac{\tau}{I_{h}}\,dS

(22)

and hence for the total duration of the pandemic, $T_{0}$ , until herd immunity is reached,

T_{0}=-\frac{\tau}{I_{h}}\int\limits_{S_{0}}^{S_{1}}dS\approx\frac{\tau}{I_{h}}\left(1-\frac{1}{R_{0}}\right)\,.

(23)

Here we have exploited the fact that in all cases of practical relevance, $I_{h}$ will be very small as compared to unity. Consequently, the duration of phase I will be very small as compared to phase II, such that the evaluation of the true duration of phase I is of minor importance. As a very good approximation, we have simply set $S_{0}=1$ and neglected the impact of $\alpha(t)$ on the dynamics for the short period of phase I.

III.1 Influence of immune response decay

The introductory discussion was based on the idea that recovered patients stay immune for all times. However, it is well known that for some diseases, in particular of the SARS-CoV type, the immune response tends to decay after some time Wu (2007). Hence there is some finite probability that recovered patients become susceptible again.

We now assume that the transition from the recovered to the susceptible state can be described as a Poisson process. In other words, we assume the probability that a randomly chosen, formerly infected citizen becomes susceptible in a time interval $[t,t+dt]$ to be proportional to $dt$ and independent of $t$ . This modifies the dynamical system (1) to

\begin{array}[]{r c l}\partial_{t}S&=&-\beta SI+\frac{1}{\rho}\left(1-S-I\right)\,,\\ &&\\ \partial_{t}I&=&\beta SI-\frac{I}{\tau}\,,\\ \end{array}

(24)

with $\beta=\beta_{0}(1-\alpha)$ , and $\rho$ the average life time of the immune state, averaged over all formerly infected individuals. Note that we conceptually include those who fell victim to the disease and thus do not become susceptible again. Their contribution to the average resusceptibilization frequency is zero, which merely increases the average immune lifetime, $\rho$ . From the data in Wu (2007), we find that after three years the average IgG immune response against SARS-CoV had decayed to 55.6 percent. For a corresponding Poissonian process we can estimate $\rho\approx 931$ days.

In Eq 24, we see immediately that the conditions to fulfill $\partial_{t}I=0$ have not changed with respect to Eq 1. Hence the optimal control trajectory still obeys $\alpha(t)=1-\frac{1}{R_{0}S(t)}$ . In phase II, with optimal control, we obtain

\partial_{t}S=-\frac{I_{h}}{\tau}+\frac{1}{\rho}\left(1-I_{h}\right)-\frac{1}{\rho}S

(25)

with the solution

S(t)=I_{h}\left(1+\frac{\rho}{\tau}\right)\left[e^{-t/\rho}-1\right]+1\,.

(26)

Again, herd immunity (and hence the end of the pandemic) is reached when $S=1/R_{0}$ , at a time we call $T_{r}$ , referring to resusceptibilization (i.e., decaying immune response). Inserting this into Eq 26 yields

1-e^{-T_{r}/\rho}=\frac{1-1/R_{0}}{I_{h}(1+\rho/\tau)}

(27)

and hence

T_{r}=-\rho\ln\left[1-\frac{1-1/R_{0}}{I_{h}(1+\rho/\tau)}\right]\,.

(28)

Although this looks rather awkward, it can be rewritten conveniently in terms of the pandemic duration, $T_{0}$ , which we would find for infinite $\rho$ . Defining the variable $X=T_{0}/(\tau+\rho)$ , we find

\frac{T_{r}}{T_{0}}=-\frac{1}{X}\ln\left[1-X\right]\,.

(29)

$X$ is the total duration of the pandemic if no loss of immunity occurs, divided by the average time it takes for a patient from her infection to the loss of immunity after recovery, $\tau+\rho$ . If immunity lasts very much longer than $T_{0}$ , $X$ is small. In this case, the logarithm in Eq 29 can be expanded and we recover $T_{r}\approx T_{0}$ . If, however, $\rho$ is of order $T_{0}$ (remember that $T_{0}\gg\tau$ in all relevant cases), $T_{r}$ diverges. This behavior is summarized in Fig 4a, in which $X$ is chosen as the abscissa. We see that the duration of the pandemic becomes uncomfortably large when the total time from infection to resusceptibilization, $\tau+\rho$ , comes close to the pandemic duration with infinite immunity, $T_{0}$ (vertical dotted line).

We might now ask how many acute infections the health system must be able to deal with in order to reach herd immunity at all. This can be derived by demanding $\lim_{t\to\infty}S(t)=1/R_{0}$ in Eq 26. It is readily shown that the health system capacity required for reaching herd immunity is given by

\hat{I}_{h}=\frac{1-\frac{1}{R_{0}}}{1+\frac{\rho}{\tau}}.

(30)

This is shown in Fig 4b for different values of $R_{0}$ . The dotted curve represents the (unrealistic) limiting case $R_{0}\longrightarrow\infty$ .

Only with infinite immune response lifetime ( $\rho\to\infty$ ), we observe an exponential decay of $I$ after herd immunity has been reached (see also Fig 3). To understand the long time dynamics after mitigation (phase III) for finite $\rho$ , we draw the phase portrait (see Fig 5).

There exist two fixed points, $(I_{1},S_{1})=(0,1)$ , a saddle for $R_{0}\geq 1$ , and $(I_{\infty},S_{\infty})=(\hat{I}_{h},1/R_{0})$ , a stable fixed point for $R_{0}\geq 1$ (see appendix B for details on the linear stability analysis). So for any initial conditions with $I_{0}>0$ , the uncontrolled system will approach the stationary state, $(I_{\infty},S_{\infty})$ . Interestingly, the stationary fraction of infected coincides with the minimal HSS capacity $\hat{I}_{h}$ needed to reach herd immunity.

We have thus shown that given $I_{h}>\hat{I}_{h}$ , herd immunity can be reached in finite time during mitigation phase II. After mitigation measures have been released (phase III), $I$ converges to $\hat{I}_{h}$ in the long time limit. Moreover, $I$ remains below $I_{h}$ (see Fig 5) since re-entering the regime $I<I_{h}$ from above would require a trajectory to cross itself (not possible for an autonomous system of ODEs (Eq 24) with unique solutions).

IV A few scenarios

Let us finally consider a few scenarios for some typical parameters, as we have in the current COVID-19 pandemic. The maximum number of known acute infections in Germany in spring 2020 was around 100.000, which was well tolerable for the HSS. Depending upon the percentage of cases which are officially recorded, the actual number of infected citizens may be considerably larger. If we assume a factor of two here, corresponding to 200.000 cases, we have $I_{h}\approx 0.0025$ (given $\approx 80$ million citizens in Germany). On the other hand, if there are more, and if we also take into account that the HSS could well take a few more patients, we might also consider a scenario with 800.000 acute infections at a time, corresponding then to $I_{h}\approx 0.01$ . In both cases we also have to vary the average lifetime of the immune state, $\rho$ , and observe its effect on the duration of the pandemic.

The two sets of scenarios are represented in the graphs in Fig 6. The remaining fraction of susceptible citizens is shown as the solid curves, while the dashed curves represent the control parameter, $\alpha$ . As before, we have assumed $R_{0}=3$ for the basic reproduction number. We take the value $\rho=931$ days mentioned above for another SARS-CoV strain as a reasonable estimate. Using $\tau=$ ten days, this corresponds to $\rho=93\tau$ . In order to cover this case, we have used the values $\rho/\tau=\{50,93,200,\infty\}$ . For Germany, an HSS capacity of $I_{h}=0.01$ (top (a) graph) would correspond to roughly $32\%$ of hospital beds ⁴⁴4We do not consider the need for intensive care units here, one reason being lack of knowledge about the fraction of ICU cases. ( $500.000$ in total) utilized for patients with COVID-19, if we assume a hospitalization rate of $20\%$ ⁵⁵5This is a conservative estimate given that the number published by the Robert Koch Institute Robert Koch Institute (2020) (17%) is based on reported cases only; the true hospitalization rate might be significantly lower.. The bottom (b) graph corresponds to a smaller HSS capacity ( $I_{h}=0.0025$ ), for Germany, corresponding to $8\%$ utilization of hospital beds.

From the steadily decreasing dashed curves representing $\alpha(t)$ , it is obvious that the mitigation measures can be gradually alleviated as time proceeds. In the top (a) graph ( $I_{h}=0.01$ ) for infinite immunity ( $\rho\to\infty$ ), one would reach the end of mitigation measures after about two years ( $=66.7\tau$ , with $\tau=10$ days). This is, however, hardly realistic. For the more realistic case, $\rho/\tau=93$ , it would take about three years ( $\approx 114.9\tau$ ). For an HSS capacity of $I_{h}=0.0025$ , bottom (b) graph, clearly, there would be no chance to ever reach herd immunity for $\rho/\tau=93$ . Instead, one would not reach any further than $\alpha\approx 0.5$ , which still corresponds to rather harsh measures.

It should finally be noted that the number of fatalities is limited for all cases where herd immunity is reached. In particular, if $\phi$ is the fraction of fatalities among those which are infected, the fraction of fatalities with respect to the population is $F=\phi(1-R_{0}^{-1})$ for infinite $\rho$ . If $\rho$ is finite, we find

F=\frac{\phi T_{r}}{T_{0}}\left(1-\frac{1}{R_{0}}\right)=\left(1-\frac{1}{R_{0}}\right)\frac{\phi\ln(1-X)}{X}.

(31)

This is precisely the scaling described by the curve displayed in Fig 4a, which shows that already well below $X=1$ , the death toll incurred by the herd immunity strategy becomes prohibitively high if immune response decay plays a significant role.

V Conclusions

We have shown that for a wide class of cost functions, in order to reach herd immunity without vaccination, it provides an optimal control strategy to keep the effective reproduction number, $R$ , at unity during the majority of the duration of the pandemic. Deviations which depend upon the specific form of the cost function are limited to a narrow time window and can be considered negligible for practical purposes.

Reducing $R$ can be achieved through various measures, e.g., increased hygiene, physical distancing, or contact tracing Bulchandani et al. (2020). Keeping $R$ at the critical value of unity—above which epidemic spreading sets in—is, however, hardly feasible in practice, due to uncertainties as well as observation delays concerning the effects of mitigation measures. Development of robust optimal control scenarios taking such uncertainties into account is left for future investigations.

In this study, costs incurred at time $t$ have been considered local in time. Cost functions nonlocal in time (with a memory kernel) would be an interesting extension but go beyond the scope of this work. Costs associated with the number of infections have not been considered explicitly. Instead we kept the number of infections below an upper bound, e.g., the capacity limit of the health service system (HSS). Of course there are societal costs due to infections even below the limit of the HSS. However, if herd immunity is the goal and vaccination is not available, then there is no way around a fraction of $1-1/R_{0}$ of the population going through the infection. Moreover, the effectiveness of specific mitigation measures can depend on the number of infections; while contact tracing is an efficient measure for low case numbers, local health authorities can be overwhelmed if case numbers are high Contreras et al. (2020, 2021). Therefore, the socio-economic costs for establishing a given reproduction number $R$ might depend on $I$ as well. Here we focused on simple costs functions as a starting point allowing for analytical treatment.

Explicit expressions for the expected duration of the pandemic have been given, and we have seen that the duration of the pandemic increases strongly as the average lifetime of the immune state decreases. In particular, we can conclude that in case the immune response to SARS-CoV2 decays in a similar manner as for the formerly encountered SARS-CoV1 strain Wu (2007), using infection mediated herd immunity as a vaccination strategy for SARS-CoV2 would require a substantial fraction of health system capacity dedicated to COVID-19 patients (see Fig 6). However, as a consequence of global mobility there may be more pandemics coming which show different infection and immune response behavior. We therefore think that our results should be borne in mind for future use, as they are of rather general nature.

References

et al. (2020) W. H. O. et al., Report of the who-china joint mission on coronavirus disease 2019 (covid-19), Tech. Rep. (2020).
Enserink and Kupferschmidt (2020) M. Enserink and K. Kupferschmidt, Science 367, 1414 (2020).
Dehning et al. (2020) J. Dehning, J. Zierenberg, F. P. Spitzner, M. Wibral, J. P. Neto, M. Wilczek, and V. Priesemann, Science 369 (2020), 10.1126/science.abb9789.
Hethcote (2000) H. W. Hethcote, SIAM Review 42, 599 (2000).
Harko et al. (2014) T. Harko, F. Lobo, and M. Mak, Appl. Math. and Comput. 236, 184 (2014).
Gros et al. (2020) C. Gros, R. Valenti, L. Schneider, K. Valenti, and D. Gros, “Containment efficiency and control strategies for the corona pandemic costs,” (2020).
Zlatic et al. (2020) V. Zlatic, I. Barjasic, A. Kadovic, H. Stefancic, and A. Gabrielli, “Bi-stability of sudr+k model of epidemics and test kits applied to covid-19,” (2020).
Ferguson et al. (2020) N. Ferguson, D. Laydon, G. Nedjati Gilani, N. Imai, K. Ainslie, M. Baguelin, S. Bhatia, A. Boonyasiri, Z. Cucunuba Perez, G. Cuomo-Dannenburg, A. Dighe, I. Dorigatti, H. Fu, K. Gaythorpe, W. Green, A. Hamlet, W. Hinsley, L. Okell, S. Van Elsland, H. Thompson, R. Verity, E. Volz, H. Wang, Y. Wang, P. Walker, P. Winskill, C. Whittaker, C. Donnelly, S. Riley, and A. Ghani, Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand, Tech. Rep. (Imperial College London, 2020).
Perkins and Espana (2020) A. Perkins and G. Espana, Optimal control of the COVID-19 pandemic with non-pharmaceutical interventions, preprint (Epidemiology, 2020).
Djidjou-Demasse et al. (2020) R. Djidjou-Demasse, Y. Michalakis, M. Choisy, M. T. Sofonea, and S. Alizon, Optimal COVID-19 epidemic control until vaccine deployment, preprint (Infectious Diseases (except HIV/AIDS), 2020).
Liu et al. (2020) Y. Liu, A. A. Gayle, A. Wilder-Smith, and J. Rocklöv, Journal of Travel Medicine 27, taaa021 (2020).
He et al. (2020) X. He, E. H. Lau, P. Wu, X. Deng, J. Wang, X. Hao, Y. C. Lau, J. Y. Wong, Y. Guan, X. Tan, et al., Nat Med 26, 672 (2020).
Parshani et al. (2010) R. Parshani, S. Carmi, and S. Havlin, Phys. Rev. Lett. 104, 258701 (2010).
Note (1) Convexity of the cost function is a reasonable assumption since one can imagine that chosen mitigation measures become more and more costly as $\alpha$ approaches unity.
Note (2) In principle, $f(\cdot)$ could depend on time $t$ explicitly or involve memory, e.g., because prolonged measures cause disproportinally higher cost. For simplicity, we do not treat such cases here.
Kuhn and Tucker (1951) H. W. Kuhn and A. W. Tucker, in Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability (University of California Press, Berkeley, Calif., 1951) pp. 481–492.
Mangasarian (1966) O. L. Mangasarian, SIAM Journal on Control 4, 139 (1966), https://doi.org/10.1137/0304013 .
Becerra (2010) V. M. Becerra, in 2010 IEEE International Symposium on Computer-Aided Control System Design (IEEE, 2010) pp. 1391–1396.
Note (3) The same holds for a constant cost function, i.e. if we assume societal costs to be independent of $\alpha(t)$ ; numerical data not shown here.
Wu (2007) L.-P. Wu, Emerging Infectious Diseases 13, 1562 (2007).
Note (4) We do not consider the need for intensive care units here, one reason being lack of knowledge about the fraction of ICU cases.
Note (5) This is a conservative estimate given that the number published by the Robert Koch Institute Robert Koch Institute (2020) (17%) is based on reported cases only; the true hospitalization rate might be significantly lower.
Bulchandani et al. (2020) V. B. Bulchandani, S. Shivam, S. Moudgalya, and S. L. Sondhi, “Digital Herd Immunity and COVID-19,” (2020), arXiv: 2004.07237.
Contreras et al. (2020) S. Contreras, J. Dehning, S. B. Mohr, F. P. Spitzner, and V. Priesemann, “Towards a long-term control of covid-19 at low case numbers,” (2020).
Contreras et al. (2021) S. Contreras, J. Dehning, M. Loidolt, J. Zierenberg, F. P. Spitzner, J. H. Urrea-Quintero, S. B. Mohr, M. Wilczek, M. Wibral, and V. Priesemann, Nat Commun 12, 1 (2021).
Robert Koch Institute (2020) Robert Koch Institute, Coronavirus Disease 2019 (COVID-19). Daily Situation Report of the Robert Koch Institute 29/04/2020, Tech. Rep. (Robert Koch Institute, 2020).

Appendix A First order necessary conditions for optimality

The problem of optimal control is given in Eq. (8). Defining $\psi(\bm{x}(t))=I(t)-I_{h}$ we rewrite the functional in Eq. 9:

\begin{split}J\{\alpha\}=\int_{0}^{t_{e}}f(\alpha(t))+\bm{\lambda}(t)\cdot[\dot{\bm{x}}(t)-\bm{h}(\bm{x},\alpha(t))]\\ +\mu(t)\psi(\bm{x}(t))\,dt\,,\end{split}

(32)

with the complimentary slackness condition $\mu(t)\psi(\bm{x}^{*}(t))=0$ and $\mu(t)\geq 0$ . The slackness condition can be seen as activation of the constraint, i.e., in the region when the optimal trajectory satisfies $\psi(\bm{x}^{*}(t))<0$ the Lagrange multiplier is $0$ , or in other words the constraint is inactive.

The first order conditions for optimality can be found by setting the first variation of the functional $\delta J=0$ . To this end we consider the variations $\bm{x}(t)=\bm{x}^{*}(t)+\eta\,\bm{\sigma}(t)$ , $\alpha(t)=\alpha^{*}(t)+\eta\,\theta(t)$ and $t_{e}=t^{*}_{e}+\eta\,\chi$ for some infinitesimal scalar parameter $\eta$ . Upon making the substitutions we have:

\begin{split}J=J^{*}+\eta\,\delta J=\int_{0}^{t^{*}_{e}+\eta\chi}f(\alpha^{*}(t))+\partial_{\alpha}f|_{\alpha^{*}(t)}\eta\,\theta(t)\\ +\bm{\lambda}(t)\cdot\big{[}\dot{\bm{x}^{*}}(t)+\eta\,\dot{\bm{\sigma}}(t)-\bm{h}(\bm{x}^{*}(t),\alpha^{*}(t))-\\ \bm{\nabla}_{\bm{x}}\bm{h}|_{\bm{x}^{*}(t)}\cdot\eta\,{\bm{\sigma}}(t)-\partial_{\alpha}\bm{h}|_{\alpha^{*}(t)}\cdot\eta\,\theta(t)\big{]}\\ +\mu(t)\psi(\bm{x}^{*}(t))+\bm{\nabla}_{\bm{x}}\psi|_{\bm{x}^{*}(t)}\cdot\eta\,{\bm{\sigma}}(t)\mu(t)\,dt\,,\end{split}

(33)

where $J^{*}=\int_{0}^{t^{*}_{e}}f(\alpha^{*}(t))+\bm{\lambda}(t)\cdot[\dot{\bm{x}^{*}}(t)-\bm{h}(\bm{x}^{*}(t),\alpha^{*}(t))]\\ +\mu(t)\psi(\bm{x}^{*}(t))\,dt$ . Now noting that $\int_{0}^{t^{*}_{e}+\eta\chi}A(t)\,dt=\int_{0}^{t^{*}_{e}}A(t)\,dt+\int_{t^{*}_{e}}^{t^{*}_{e}+\eta\chi}A(t)\,dt$ and $\int_{t^{*}_{e}}^{t^{*}_{e}+\eta\chi}A(t)\,dt=A(t^{*}_{e})\eta\chi$ and considering terms only up to first order in $\eta$ , we have

\begin{split}\delta J=\bigg{[}f(\alpha^{*}(t^{*}_{e}))+\bm{\lambda}(t^{*}_{e})\cdot[\dot{\bm{x}}(t^{*}_{e})-\bm{h}(\bm{x}(t^{*}_{e}),\alpha^{*}(t^{*}_{e}))]\\ +\mu(t^{*}_{e})\psi(\bm{x}^{*}(t^{*}_{e}))\bigg{]}\chi\\ +\int_{0}^{t^{*}_{e}}\partial_{\alpha}f|_{\alpha^{*}(t)}\theta(t)+\bm{\lambda}(t)\cdot\big{[}\dot{\bm{\sigma}}(t)-\bm{\nabla}_{\bm{x}}\bm{h}|_{\bm{x}^{*}(t)}\cdot{\bm{\sigma}}(t)\\ -\partial_{\alpha}\bm{h}|_{\alpha^{*}(t)}\theta(t)\big{]}+\bm{\nabla}_{\bm{x}}\psi|_{\bm{x}^{*}(t)}\cdot{\bm{\sigma}}(t)\mu(t)\,dt\,.\end{split}

(34)

Making use of the slackness condition, performing integration by parts on the $\bm{\lambda}(t)\cdot\dot{\bm{\sigma}}(t)$ term and rearranging the terms we get:

\begin{split}\delta J=\bigg{[}f(\alpha^{*}(t^{*}_{e}))+\bm{\lambda}(t^{*}_{e})\cdot[\dot{\bm{x}^{*}}(t^{*}_{e})-\bm{h}(\bm{x}^{*}(t^{*}_{e}),\alpha^{*}(t^{*}_{e}))]\bigg{]}\chi\\ +\int_{0}^{t^{*}_{e}}\theta(t)\bigg{[}\partial_{\alpha}f|_{\alpha^{*}(t)}-\bm{\lambda}(t)\cdot\partial_{\alpha}\bm{h}|_{\alpha^{*}(t)}\bigg{]}dt\\ +\int_{0}^{t^{*}_{e}}\bm{\sigma}(t)\cdot\bigg{[}-\dot{\bm{\lambda}}(t)-\bm{\lambda}(t)\cdot\bm{\nabla}_{\bm{x}}\bm{h}|_{\bm{x}^{*}(t)}+\bm{\nabla}_{\bm{x}}\psi|_{\bm{x}^{*}(t)}\mu(t)\bigg{]}dt\\ +[\bm{\lambda}(t)\cdot\bm{\sigma}(t)]_{0}^{t^{*}_{e}}\,.\end{split}

(35)

For the last term we make use of the initial conditions $\bm{x}(0)=\bm{x}_{0}$ , hence we require $\bm{\sigma}(0)=\bm{0}$ . The terminal constraint implies $R_{0}^{-1}=S(t_{e})=S(t^{*}_{e}+\eta\chi)=S^{*}(t_{e}^{*})+\eta(\dot{S}^{*}(t^{*}_{e})\chi+\sigma_{s}(t^{*}_{e}))+\mathcal{O}(\eta^{2})$ , and therefore $\dot{S}^{*}(t^{*}_{e})\chi+\sigma_{s}(t^{*}_{e})=0$ . Taking these into account we finally get the first order necessary conditions for optimal control:

	$\displaystyle f(\alpha^{}(t^{}_{e}))-\bm{\lambda}(t^{}_{e})\cdot\bm{h}(\bm{x}^{}(t^{}_{e}),\alpha^{}(t^{*}_{e}))=0\,,$		(36)
	$\displaystyle\dot{\bm{\lambda}}(t)=-\bm{\lambda}(t)\cdot\bm{\nabla}_{\bm{x}}\bm{h}\|_{\bm{x}^{}(t)}+\bm{\nabla}_{\bm{x}}\psi\|_{\bm{x}^{}(t)}\mu(t)\,,$		(37)
	$\displaystyle\partial_{\alpha}f\|_{\alpha^{}(t)}-\bm{\lambda}(t)\cdot\partial_{\alpha}\bm{h}\|_{\alpha^{}(t)}=0\,,$		(38)
	$\displaystyle\lambda_{I}(t^{*}_{e})=0\,.$		(39)

Appendix B Stability analysis of the uncontrolled SIR model with finite immune response

The uncontrolled SIR model with finite immune response (Eq. 24 with $\alpha=0$ ) can be written as:

\begin{array}[]{r c l}\partial_{t}S&=&-\frac{1}{\tau}R_{0}SI+\frac{1}{\rho}\left(1-S-I\right)\,,\\ &&\\ \partial_{t}I&=&\frac{1}{\tau}(R_{0}SI-I)\,.\\ \end{array}

(40)

It has a fixed point at

\displaystyle x_{\infty}=(S_{\infty},I_{\infty})=\left(\frac{1}{R_{0}},\frac{1-1/R_{0}}{1+\rho/\tau}\right)\,.

(41)

Linear stability analysis requires computation of the Jacobian of the dynamical system evaluated at the fixed point. It is given via:

\displaystyle J|_{x_{\infty}}=\begin{bmatrix}\frac{1-R_{0}}{\tau+\rho}-1/\rho&-1/\tau-1/\rho\\ \frac{R_{0}-1}{\tau+\rho}&0\end{bmatrix}\,.

(42)

Its eigenvalues are given by:

\displaystyle\lambda_{1,2}=-\frac{\rho R_{0}+\tau}{2\rho(\tau+\rho)}\pm\sqrt{\left(\frac{\rho R_{0}+\tau}{2\rho(\tau+\rho)}\right)+\frac{1-R_{0}}{\tau\rho}}\,.

(43)

The first term is always negative, we thus have three regimes:

\begin{array}[]{rl}\lambda_{1,2}\in\mathbb{R}\;\land\;\lambda_{1,2}<0:&\quad\text{stable node}\\ \lambda_{1,2}\in\mathbb{R}\;\land\;\lambda_{1}>0,\,\lambda_{2}<0:&\quad\text{saddle}\\ \operatorname{Im}(\lambda_{1,2})\neq 0\;\land\;\operatorname{Re}(\lambda_{1,2})<0:&\quad\text{stable spiral}\end{array}

(44)

In Fig. 7 we show the different regimes of stability. The fixed point $x_{\infty}$ is stable for any $R_{0}\geq 1$ . The stable node regime is limited to a very small, rather unrealistic parameter regime (mainly $\rho\leq\tau$ , i.e., loss of immunity is faster than recovery from the infection). So for most cases, the fixed point is a stable spiral, in particular for the estimated values for SARS-CoV-2 ( $\rho/\tau\approx 93$ , $R_{0}\approx 3$ ).