Control Policies for Recovery of Interdependent Systems After Disruptions

Hemant Gehlot, Shreyas Sundaram, and Satish V. Ukkusuri Hemant Gehlot and Satish V. Ukkusuri are with the Lyles School of Civil Engineering at Purdue University. Email: {hgehlot,sukkusur}@purdue.edu. Shreyas Sundaram is with the School of Electrical and Computer Engineering at Purdue University. Email: [email protected]. This research was supported by National Science Foundation award CMMI 1638311.

Abstract

We examine a control problem where the states of the components of a system deteriorate after a disruption, if they are not being repaired by an entity. There exist a set of dependencies in the form of precedence constraints between the components, captured by a directed acyclic graph (DAG). The objective of the entity is to maximize the number of components whose states are brought back to the fully repaired state within a given time. We prove that the general problem is NP-hard, and therefore we characterize near-optimal control policies for special instances of the problem. We show that when the deterioration rates are larger than or equal to the repair rates and the precedence constraints are given by a DAG, it is optimal to continue repairing a component until its state reaches the fully recovered state before switching to repair any other component. Under the aforementioned assumptions and when the deterioration and the repair rates are homogeneous across all the components, we prove that the control policy that targets the healthiest component at each time-step while respecting the precedence and time constraints fully repairs at least half the number of components that would be fully repaired by an optimal policy. Finally, we prove that when the repair rates are sufficiently larger than the deterioration rates, the precedence constraints are given by a set of disjoint trees that each contain at most $k$ nodes, and there is no time constraint, the policy that targets the component with the least value of health minus the deterioration rate at each time-step while respecting the precedence constraints fully repairs at least $1/k$ times the number of components that would be fully repaired by an optimal policy.

I Introduction

We focus on a problem where multiple components of a system have been damaged after a disruption (e.g., a disaster, cyberphysical attack, etc.) and the states of the components continue to reduce if they are not being repaired, until they reach a value referred to as permanent failure. There is an entity (or a controller) whose objective is to maximize a performance criteria or reward, e.g., maximizing the number of components whose states are brought back to their maximum value (known as the permanent repair state) within a given time. The states of the components do not change once they reach permanent failure or permanent repair. There exists a set of dependencies between the components that are represented by precedence constraints such that if there is a precedence constraint between an ordered pair of components $(i,j)$ , then component $i$ needs to be permanently repaired before the entity can target component $j$ . This problem has relevance in multiple applications, e.g., disaster recovery, protection of cyberphysical systems against attacks, and fire-fighting. For instance, infrastructure components face accelerated deterioration after disasters due to processes such as corrosion and flooded roads. If the components are not repaired in a timely manner, they can deteriorate to such a level that they become unusable and require full replacement, which is usually expensive [1]. Furthermore, there can be various dependencies between infrastructure components; for example, in order to bring back the functionality of water systems (such as irrigation pumps), it might be necessary to first repair some sections of the power network. Similarly, there might be a limited time before components (e.g., computing elements) of a cyberphysical system become fully compromised after an attack [2]. There might be dependencies or hierarchies between the components such that the components at the top of the hierarchy may be required to be repaired first in order to stop the spread of attack to other components [3]. Thus, the protecting agency would have to make optimal recovery decisions if there is a repair budget (e.g., manpower and resources).

This problem can be cast in the general framework of optimal control and scheduling of switched systems [4, 5]. In switched systems, there are multiple subsystems such that only one subsystem can be controlled at a time. In our problem, components correspond to the subsystems of a switched system. Since the entity can only control one component at a time, our problem is analogous to optimal control and scheduling of discrete time linear switched systems [4]. One of the main complexities that arises in solving switched controlled systems is the combinatorially large number of feasible switching sequences [5]. Despite recent advances in the optimal control and scheduling of switched systems[4], finding general theoretical results or frameworks for efficiently solving optimal control and scheduling problems for switched systems remains an active area of research [6].

At a high level, our problem also has analogies to the resource (e.g., time slot) allocation problem at a base-station to time-dependent flows [7], optimal control of robotic systems [8], switched system control problems such as scheduling of thermostatically controlled loads [9] and patient triage scheduling problems [10]. However, these studies do not consider precedence constraints. In addition, some of these studies [7, 8] do not consider permanent failure of flows/components being served/targeted. Job scheduling with linear time deterioration and precedence constraints that minimize the total number of late jobs also have similarities to our problem [11]. However, job scheduling studies typically do not consider the notion of permanent failure, as jobs are completely processed even if their completion times exceed due dates. In addition, jobs are considered to be late if their completion times exceed the corresponding due dates; however, in our setting, a repair agency should start targeting a component before it fails. Scheduling of real-time tasks [12] also has some high level analogies to our work. However, these studies focus on tasks that sporadically become available for processing, in contrast to the components in our problem that are available for repair starting at the same time (i.e., immediately after a disruption), subject to precedence constraints.

Under the setting described before, we focus on characterizing (near-) optimal switching policies for maximizing the number of components that are permanently repaired. The paper [13] characterized optimal policies for this problem assuming homogeneous rates of repair and deterioration, and [14] studied the problem for the case when there are heterogeneous rates of repair and deterioration. However, these studies do not consider precedence constraints among various components and also do not constrain the maximum time that is available to repair the components. In the conference version of this paper [15], precedence constraints were considered among various components but there was no constraint on the time that is available for repair, and no discussion on the complexity of the general problem. This paper significantly expands on the conference paper by addressing those gaps.

Our contributions

First, we prove that when the deterioration rates are larger than or equal to the repair rates, and the precedence constraints are given by a directed acyclic graph (DAG), the optimal switching policy is to permanently repair a component that is targeted before switching to another component, while respecting the precedence and time constraints. Second, we prove that under the aforementioned assumptions and when the deterioration and the repair rates are homogeneous across all the components, the switching policy that targets the healthiest component at each time-step while respecting the precedence and time constraints is $1/2$ -optimal.¹¹1For $\alpha\in(0,1]$ , a policy is said to be $\alpha$ -optimal if the reward obtained by that policy is at least $\alpha$ times the reward obtained by an optimal policy. We also prove that the aforementioned policy is optimal when time is constrained but when there are no precedence constraints. Third, we prove that when the repair rates are sufficiently larger than the deterioration rates, the precedence constraints are given by a set of disjoint trees that each contain at most $k$ nodes, and there is no time constraint, the policy that targets the component with the least health value minus the deterioration rate at each time-step while respecting the precedence constraints is $1/k$ -optimal.²²2The paper [15] characterizes a near-optimal policy with the same factor $(1/k)$ under the aforementioned assumptions where the root nodes of the trees are first targeted and other nodes are only targeted when it is not possible to target any root node; the policy that is characterized in this paper does not require that condition. Finally, we prove that the general problem is NP-hard (even if there is no time constraint).

In the next section, we formally define the problem that we focus in this paper. After that, we divide the characterization of sequencing policies into two cases, corresponding to the relative sizes of the rates of deterioration and repair.

II Problem Statement

There are $N(\geq 2)$ components indexed by the set $\mathcal{V}=\{1,\ldots,N\}$ . A component could represent damaged infrastructure in an area, an infected server, etc., depending on the context. We refer to different components as nodes. There is a repair agency (also referred to as an entity), which is responsible for repairing the nodes. We index time-steps by the variable $t\in\mathbb{N}=\{0,1,2,\ldots\}$ , capturing the resolution at which the entity decides to repair nodes. Let $T\in\mathbb{N}\cup\{\infty\}$ be the largest time-step that is available for repairing nodes (when $T=\infty$ we say that there is no time constraint). The control action of the entity at time-step $t$ is represented by $u_{t}\in\mathcal{V}$ , i.e., the entity targets one node to repair at each time-step.³³3We keep the analysis of the problem where the entity can simultaneously target multiple nodes for future work. Each node $j\in\mathcal{V}$ has an associated health value $v_{t}^{j}\in[0,1]$ at time-step $t$ . We denote the initial health value of each node $j$ by $v_{0}^{j}\in(0,1)$ . If a node $j$ is being repaired at a time-step then the health value of that node increases by $\Delta_{inc}^{j}\in(0,1)$ . A node $j$ is considered to be permanently repaired at time-step $t$ if $v^{j}_{t}=1$ and $v^{j}_{t-1}<1$ . If a node $j$ is not being repaired at a time-step, then the health value of node $j$ decreases by $\Delta_{dec}^{j}\in(0,1)$ . A node $j$ is considered to be permanently failed at time-step $t$ if $v^{j}_{t}=0$ and $v^{j}_{t-1}>0$ . Note that the health value of a node does not change further once it gets permanently repaired or permanently failed. Thus, for all $j\in\{1,\ldots,N\}$ , the health value of node $j$ dynamically changes depending on the control action taken by the entity as follows:

v^{j}_{t+1}=\begin{cases}1&\text{if }v^{j}_{t}=1,\\ 0&\text{if }v^{j}_{t}=0,\\ \min(1,v^{j}_{t}+\Delta_{inc}^{j})&\hbox{if }u_{t}=j\hskip 2.84526pt\text{and}\hskip 2.84526ptv^{j}_{t}\in(0,1),\\ \max(0,v^{j}_{t}-\Delta_{dec}^{j})&\hbox{if }u_{t}\neq j\hskip 2.84526pt\text{and}\hskip 2.84526ptv^{j}_{t}\in(0,1).\end{cases}

There is a set of dependency constraints between different nodes, represented by a set $\mathcal{E}=\{(j,k)|j,k\in\mathcal{V}\}$ consisting of edges between various nodes. An edge $(j,k)\in\mathcal{E}$ starting from node $j$ and ending at node $k$ represents a precedence constraint that node $j$ needs to be permanently repaired (i.e., to full health) before the entity can start targeting node $k$ ; $j$ is an in-neighbor of $k$ . The in-neighbor set of a node $k\in\mathcal{V}$ is the set $\mathcal{W}_{k}=\{j\in\mathcal{V}|(j,k)\in\mathcal{E}\}$ . We assume that the precedence constraints form a directed acyclic graph (DAG) denoted by $G=\{\mathcal{V},\mathcal{E}\}$ ; such graphs are a common form for representing general precedence constraints in other problems such as job scheduling [11].

We now define the feasible control set as follows.

Definition 1.

Let the health values of all the nodes at time-step $t$ be given by $v_{t}=\{v_{t}^{1},\ldots,v_{t}^{N}\}$ . Then, the feasible control set at time-step $t$ is the set $\mathcal{U}(v_{t})\subseteq\mathcal{V}$ containing the nodes whose in-neighbors are all in the permanently repair state at time-step $t$ , i.e., $\mathcal{U}(v_{t})=\{k\in\mathcal{V}|v^{j}_{t}=1,\forall j\in\mathcal{W}_{k}\}$ .

We now present the definition of a control sequence that respects the precedence constraints as follows.

Definition 2.

A control sequence $\bar{u}_{0:T}=\{\bar{u}_{0},\bar{u}_{1},\ldots,\bar{u}_{T}\}$ is said to respect the precedence constraints if for all $t\in[0,T]$ , $\bar{u}_{t}\in\mathcal{U}(v_{t})$ .

We define the reward function as follows.

Definition 3.

Given a set of initial health values $\{v_{0}^{j}\}$ and a control sequence $\bar{u}_{0:T}$ that respects the precedence constraints, the reward $J(v_{0},\bar{u}_{0:T})$ is defined as the number of nodes that get permanently repaired through that sequence. More formally, $J(v_{0},\bar{u}_{0:T})=|\{j\in\mathcal{V}\hskip 2.84526pt|\hskip 2.84526pt\exists\hskip 2.84526ptt\text{ s.t. }0\leq t\leq T\text{ and }v^{j}_{t+1}=1\}|$ .

Based on the above definitions, the following problem is the focus of this paper.

Problem 1 (Optimal Control Sequencing over a DAG).

Consider a directed acyclic graph $G=\{\mathcal{V},\mathcal{E}\}$ consisting of $N(\geq 2)$ nodes with initial health values $\{v_{0}^{j}\}$ , along with repair and deterioration rates $\{\Delta_{inc}^{j}\}$ and $\{\Delta_{dec}^{j}\}$ , respectively. Given a time constraint $T\in\mathbb{N}\cup\{\infty\}$ , find a control sequence $u^{\ast}_{0:T}$ that respects the precedence constraints and maximizes the reward $J(v_{0},u^{\ast}_{0:T})$ .

For our purposes, we also define the concept of a jump.

Definition 4.

If the entity starts targeting a node before permanently repairing the node it targeted in the last time-step, then the entity is considered to have jumped at that time-step. Mathematically, if $u_{t-1}=j$ , $v_{t}^{j}<1$ and $u_{t}\neq j$ , then a jump is said to have been made by the entity at time-step $t$ . A control sequence that does not contain any jumps is called a non-jumping sequence.

In this paper, we will be obtaining sequences that are generated by state feedback policies, i.e., we will be providing a mapping $\mu:[0,1]^{N}\rightarrow\mathcal{V}$ , such that $\bar{u}_{t}=\mu(v_{t})$ for all $t\in[0,T]$ . Thus, we will use the same terminology defined above (for sequences) for policies (e.g., a policy respecting the precedence constraints, a non-jumping policy, etc.). Given a policy $\mu$ that respects the precedence constraints, we denote the reward (from Definition 3) by $J_{\mu}(v_{0})$ , with the time-constraint being implicit.

We now define an $\alpha$ -optimal policy as follows.

Definition 5.

Let $\mu^{*}$ be an optimal policy for Problem 1. For $\alpha\in(0,1]$ , a policy $\mu$ (that respects the precedence and time constraints) is said to be $\alpha$ -optimal if $J_{\mu}(v_{0})\geq\alpha J_{\mu^{*}}(v_{0}),\forall v_{0}\in[0,1]^{N}$ .

In the next section, we start the analysis of the problem for $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\quad\forall j\in\{1,\ldots,N\}$ .

III Control Sequences for $\Delta_{{dec}}^{{j}}\geq\Delta_{{inc}}^{{j}},\quad\forall{j}\in\{1,\ldots,N\}$

In this section, we start by showing the optimality of non-jumping sequences when $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\quad\forall j\in\{1,\ldots,N\}$ .

III-A Optimality of non-jumping sequences

To show that non-jumping policies are optimal when $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\forall j\in\{1,\ldots,N\}$ , we first analyze the properties of sequences containing at most one jump and later generalize to sequences containing an arbitrary number of jumps. We start with the following result.

Lemma 1.

Let there be $N(\geq 2)$ nodes that have a set of precedence constraints given by a DAG $G=\{\mathcal{V},\mathcal{E}\}$ , $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\forall j\in\{1,\ldots,N\}$ and $T\in\mathbb{N}\cup\{\infty\}$ . Consider a control sequence $A$ that respects the precedence and time constraints and targets $N$ nodes as shown in Figure 1. Suppose sequence $A$ permanently repairs all the nodes and contains exactly one jump, where the entity partially repairs node $i_{1}$ before moving to node $i_{2}$ at time-step $\bar{t}_{1}^{A}$ . Sequence $A$ then permanently repairs nodes $i_{2},i_{3},\ldots,i_{k}$ , before returning to node $i_{1}$ and permanently repairing it. Then, there exists a non-jumping control sequence $B$ as shown in Figure 2 that targets nodes in the order $\{i_{2},i_{3},\ldots,i_{k},i_{1},i_{k+1},\ldots,i_{N}\}$ (which respects the precedence and time constraints) and permanently repairs all the nodes. Furthermore, if $t_{j}^{A}$ (resp. $t_{j}^{B}$ ) is the number of time-steps taken to permanently repair node $i_{j}$ in sequence $A$ (resp. sequence $B$ ), then the following holds true:

$\displaystyle t_{j}^{A}$	$\displaystyle\geq t_{j}^{B}+\left(2^{j-2}\right)\overline{t}_{1}^{A}\hskip 8.53581pt\forall j\in\{2,\ldots,k\},$	(1)
$\displaystyle t_{1}^{A}$	$\displaystyle\geq t_{1}^{B}+(2^{k-1}-2)\overline{t}_{1}^{A},$	(2)
$\displaystyle t_{j}^{A}$	$\displaystyle\geq t_{j}^{B}+\left(2^{j-1}-2^{j-k}\right)\overline{t}_{1}^{A}\hskip 8.53581pt\forall j\in\{k+1,\ldots,N\}.$	(3)

Refer to caption — Figure 1: Sequence $A$ with a single jump.

Proof:

Since node $i_{1}$ is partially repaired by the entity before it targets the nodes $i_{2},\ldots,i_{k}$ in sequence $A$ , there cannot be an edge that starts from $i_{1}$ and ends at a node in the set $\{i_{2},\ldots,i_{k}\}$ . Thus, if sequence $A$ follows the precedence constraints, the non-jumping sequence $B$ also respects the precedence constraints. We can now leverage the analysis in [14], which proved that (1)-(3) hold and each node in sequence $B$ is targeted at an earlier time-step than in sequence $A$ . Therefore, sequence $B$ permanently repairs all the nodes that are permanently repaired by sequence $A$ while respecting the precedence and time constraints. Thus, the result follows. ∎

The above result considered sequences containing exactly one jump. This leads us to the following key result pertaining to the optimal control policy when $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\forall j$ .

Theorem 1.

The proof of the above result follows immediately by applying Lemma 1 on sequence $A$ to obtain a sequence that has one less jump than $A$ , and repeating this procedure on the obtained sequences to get a non-jumping sequence.

Since non-jumping sequences are optimal by the above result, we will only focus on non-jumping sequences when $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\forall j\in\{1,\ldots,N\}$ . We now present the following NP-hardness result, whose proof is provided in the appendix.

Theorem 2.

Problem 1 is NP-hard.

Since Problem 1 is NP-hard, it is unlikely that the optimal solution can be efficiently computed for all instances of the problem [16]. Thus, we will characterize optimal or near-optimal policies for special instances of the problem.

III-B Near-optimal policy

In this section, we characterize a near-optimal policy for the case when $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\forall j\in\{1,\ldots,N\}$ . We will make the following assumption in this section.

Assumption 1.

For all $j\in\{1,\ldots,N\}$ , suppose $\Delta_{inc}^{j}=\Delta_{inc}$ , $\Delta_{dec}^{j}=\Delta_{dec}$ , and $\Delta_{dec}\geq\Delta_{inc}$ . Also, suppose there exists a positive integer $n$ such that $\Delta_{dec}=n\Delta_{inc}$ , and for each node $j\in\{1,\ldots,N\}$ , there exists a positive integer $m_{j}$ such that $1-v^{j}_{0}=m_{j}\Delta_{inc}$ .

Note that the conditions $\Delta_{dec}=n\Delta_{inc}$ and $1-v^{j}_{0}=m_{j}\Delta_{inc},\hskip 5.69054pt\forall j\in\{1,\ldots,N\}$ , in the assumption ensure that no node gets permanently repaired partway through a time-step.

We now start with the following result.

Lemma 2.

Let there be $N(\geq 2)$ nodes with precedence constraints given by a DAG $G=\{\mathcal{V},\mathcal{E}\}$ . Suppose Assumption 1 holds and $T\in\mathbb{N}\cup\{\infty\}$ . Consider a non-jumping sequence $A$ that respects the precedence and time constraints, and permanently repairs $x(\leq N)$ nodes. Suppose there is a node $h$ from the set of $N$ nodes that has larger initial health value than the first node of sequence $A$ and the precedence constraints allow node $h$ to be permanently repaired before the first node of sequence $A$ . Consider a non-jumping sequence $B$ where node $h$ is first targeted and then the remaining targeted nodes of sequence $A$ are targeted after node $h$ . Then, at least $x-1$ nodes are permanently repaired in sequence $B$ while respecting the precedence and time constraints.

Proof:

Let $x$ be the number of nodes that are permanently repaired in sequence $A$ . Denote the order of nodes in the sequence $A$ as $\{i_{1},\ldots,i_{x}\}$ . The proof for the case when $x=1$ is trivial because $x-1=0$ ; we thus focus on the case when $x\geq 2$ . Denote the ordering of the first $x-1$ nodes in sequence $B$ as $\{i^{\prime}_{1},\ldots,i^{\prime}_{x-1}\}$ , where $i^{\prime}_{1}=h$ . Note that the proof when $x=2$ is also trivial because node $i^{\prime}_{1}=h$ in the non-jumping sequence $B$ is permanently repaired since the time taken to permanently repair node $h$ in sequence $B$ is less than the time taken to permanently repair the nodes in sequence $A$ (because node $h$ has larger initial health than node $i_{1}$ ). Therefore, the non-trivial cases are when $x\geq 3$ .

We first argue that sequence $B$ permanently repairs at least $x-1$ nodes, regardless of the time it takes to repair the nodes. We prove the result by contradiction. Suppose sequence $B$ permanently repairs less than $x-1$ nodes. Let $i^{\prime}_{k}$ be the first node that permanently fails in sequence $B$ , where $2\leq k\leq x-1$ . We first argue that either $i^{\prime}_{k}=i_{k-1}$ or $i^{\prime}_{k}=i_{k}$ . If node $h$ does not appear in sequence $A$ , then the first $x-1$ nodes of sequence $B$ are ordered as follows $\{h,i_{1},\ldots,i_{x-2}\}$ and thus $i^{\prime}_{k}=i_{k-1}$ . On the other hand, if node $h$ is permanently repaired in sequence $A$ , let the position of node $h$ in sequence $A$ be $z$ , where $z\in\{2,\ldots,x\}$ . Then, $i^{\prime}_{1}=h$ , $i^{\prime}_{j}=i_{j-1},\quad\forall j\in\{2,\ldots,z\}$ and $i^{\prime}_{j}=i_{j},\quad\forall j\in\{z+1,\ldots,x-1\}$ . Thus, if the first node that fails in sequence $B$ (i.e., node $i^{\prime}_{k}$ ) is such that $k\leq z$ , then $i^{\prime}_{k}=i_{k-1}$ ; otherwise $i^{\prime}_{k}=i_{k}$ . We now consider the cases when $i^{\prime}_{k}=i_{k-1}$ and $i^{\prime}_{k}=i_{k}$ one by one.

Suppose $i^{\prime}_{k}=i_{k-1}$ . If node $i_{k-1}$ fails in sequence $B$ , then we show that node $i_{k+1}$ must permanently fail in sequence $A$ (where $2\leq k\leq x-1$ ). Consider $k=2$ . Let $t_{h}=\frac{1-v_{0}^{h}}{\Delta_{inc}}$ be the number of time-steps taken to permanently repair node $h$ from its initial health to the full health. If node $i_{1}$ permanently fails after permanently repairing node $h$ in sequence $B$ , then $v_{0}^{i_{1}}\leq\Delta_{dec}t_{h}$ . That is,

1-v_{0}^{i_{1}}\geq 1-\Delta_{dec}t_{h}.

(4)

Denote $t_{j}=\frac{1-v_{0}^{i_{j}}}{\Delta_{inc}}$ as the number of time-steps taken to permanently repair node $i_{j}$ from its initial health to full health. Then,

\Delta_{dec}t_{1}=\Delta_{dec}\left(\frac{1-v_{0}^{i_{1}}}{\Delta_{inc}}\right)=n\left(1-v_{0}^{i_{1}}\right)\geq\left(1-v_{0}^{i_{1}}\right),

(5)

as $n\geq 1$ and $v_{0}^{i_{1}}<1$ . Therefore,

\Delta_{dec}t_{1}\geq 1-\Delta_{dec}t_{h}>1-\Delta_{dec}t_{1},

(6)

where the first inequality from the left comes by conditions (4)-(5) and the second inequality comes from the fact that $t_{h}<t_{1}$ (as node $h$ has larger initial health than node $i_{1}$ ). Note that in sequence $A$ , it takes $t_{1}$ time-steps to repair node $i_{1}$ , and $nt_{1}+t_{2}$ time-steps to repair node $i_{2}$ (it takes $nt_{1}$ time-steps to repair the health that is lost by node $i_{2}$ due to deterioration and it takes $t_{2}$ time-steps to repair the difference in the initial health of $i_{2}$ and full health). By (6), the total reduction in the health of node $i_{3}$ after $(1+n)t_{1}+t_{2}$ time-steps in sequence $A$ satisfies

	$\displaystyle\Delta_{dec}\left((1+n)t_{1}+t_{2}\right)$	$\displaystyle=\Delta_{dec}\left(nt_{1}+t_{2}\right)+\Delta_{dec}t_{1}$
		$\displaystyle>\Delta_{dec}\left(nt_{1}+t_{2}\right)+1-\Delta_{dec}t_{1}$
		$\displaystyle>1,$

as $n\geq 1$ and $t_{1},t_{2}>0$ . Thus, node $i_{3}$ permanently fails in sequence $A$ by the time it is reached, contradicting the fact that $A$ permanently repairs $x(\geq 3)$ nodes.

Now consider the case when $k=3$ . If node $i_{2}$ permanently fails in sequence $B$ , then $v_{0}^{i_{2}}\leq\Delta_{dec}((1+n)t_{h}+t_{1})$ . Following the same arguments as before, we get

\Delta_{dec}t_{2}>1-\Delta_{dec}\left((2+n)t_{1}\right).

(7)

As before, the total number of time-steps required to permanently repair nodes $i_{1},i_{2}$ , and $i_{3}$ in sequence $A$ is equal to $(1+n)^{2}t_{1}+(1+n)t_{2}+t_{3}$ . Thus, the total reduction in the health of node $i_{4}$ by the time it is reached in sequence $A$ satisfies

\Delta_{dec}\left((1+n)^{2}t_{1}+(1+n)t_{2}+t_{3}\right)>1,

by (7), $n\geq 1$ and $t_{1},t_{2},t_{3}>0$ . Thus, $i_{4}$ permanently fails by the time it is reached in sequence $A$ , leading to a contradiction.

We can repeat the above arguments to show that if node $i_{k-1}$ permanently fails in sequence $B$ , where $k>3$ , then

\Delta_{dec}t_{k-1}>1-\Delta_{dec}((2+n)(1+n)^{k-3}t_{1}+\\ (1+n)^{k-4}t_{2}+\ldots+(1+n)^{0}t_{k-2}).

(8)

Therefore, the total reduction in the health of node $i_{k+1}$ after $(1+n)^{k-1}t_{1}+(1+n)^{k-2}t_{2}+\ldots+(1+n)^{0}t_{k}$ time-steps in sequence $A$ is

	$\displaystyle\Delta_{dec}\left((1+n)^{k-1}t_{1}+\ldots+(1+n)^{1}t_{k-1}+(1+n)^{0}t_{k}\right)=$
	$\displaystyle\Delta_{dec}\left((1+n)^{k-1}t_{1}+\ldots+nt_{k-1}+(1+n)^{0}t_{k}\right)+\Delta_{dec}t_{k-1}.$		(9)

Thus, for $k>3$ , we get

\displaystyle\Delta_{dec}\left((1+n)^{k-1}t_{1}+\ldots+(1+n)^{1}t_{k-1}+(1+n)^{0}t_{k}\right)>1

by (8), (9), $n\geq 1$ and $t_{j}>0\hskip 2.84526pt\forall j$ . Since, the total reduction in the health value of node $i_{k+1}$ before the entity starts targeting it is larger than one, it is not possible to permanently repair node $i_{k+1}$ in sequence $A$ , leading to a contradiction.

We now consider the case when $i^{\prime}_{k}=i_{k}$ (along with $x\geq 3$ ). We prove that if node $i^{\prime}_{k}=i_{k}$ permanently fails in sequence $B$ , then it is not possible to permanently repair node $i^{\prime}_{k+1}=i_{k+1}$ in sequence $A$ . Recall that the case $i^{\prime}_{k}=i_{k}$ happens when node $h$ is permanently repaired in sequence $A$ and $k>z$ . Since $z\geq 2$ , we have $k>2$ . Consider $k=3$ . Then, $z=2$ (i.e., $i_{2}=h$ ) because $z<k$ and $z\geq 2$ . If node $i^{\prime}_{3}=i_{3}$ permanently fails in sequence $B$ , then

v_{0}^{i_{3}}\leq\Delta_{dec}((1+n)t_{h}+t_{1})<\Delta_{dec}\left((2+n)t_{1}\right),

or equivalently,

\Delta_{dec}t_{3}>1-\Delta_{dec}\left((2+n)t_{1}\right).

(10)

Therefore, the total reduction in the health of node $i_{4}$ after $(1+n)^{2}t_{1}+(1+n)t_{2}+t_{3}$ time-steps in $A$ satisfies

	$\displaystyle\Delta_{dec}\left((1+n)^{2}t_{1}+(1+n)t_{2}+t_{3}\right)$
	$\displaystyle=\Delta_{dec}\left((1+n)^{2}t_{1}+(1+n)t_{2}\right)+\Delta_{dec}t_{3}>1,$

by (10), $n\geq 1$ and $t_{1},t_{2}>0$ . Thus, node $i_{4}$ permanently fails in sequence $A$ , contradicting the fact that $A$ permanently repairs $x$ nodes.

Similarly, it can be shown that if node $i^{\prime}_{k}=i_{k}$ permanently fails in sequence $B$ , where $k>3$ , then

\Delta_{dec}t_{k}>1-\Delta_{dec}((2+n)(1+n)^{k-3}t_{1}+\\ (1+n)^{k-4}t_{2}+\ldots+(1+n)^{k-1-z}t_{z-1}+\\ (1+n)^{k-2-z}t_{z+1}+\ldots+(1+n)^{0}t_{k-1}).

(11)

Therefore, the total reduction in the health of node $i_{k+1}$ (where $k>3$ ) after $(1+n)^{k-1}t_{1}+(1+n)^{k-2}t_{2}+\ldots+(1+n)^{0}t_{k}$ time-steps in sequence $A$ is

	$\displaystyle\Delta_{dec}\left((1+n)^{k-1}t_{1}+\ldots+(1+n)^{1}t_{k-1}+(1+n)^{0}t_{k}\right)$
	$\displaystyle=\Delta_{dec}\left((1+n)^{k-1}t_{1}+\ldots+(1+n)^{1}t_{k-1}\right)+\Delta_{dec}t_{k}>1,$

by (11), $n\geq 1$ and $t_{j}>0\hskip 2.84526pt\forall j$ . Therefore, it would not be possible to permanently repair $x$ nodes in sequence $A$ , leading to a contradiction.

We will now show that sequence $B$ permanently repairs $x-1$ nodes in less time than the time taken to permanently repair $x$ nodes in sequence $A$ . Suppose node $h$ is not permanently repaired in sequence $A$ . Consider $x=3$ . Then, the time taken to permanently repair the first two nodes, i.e., nodes $h$ and $i_{1}$ , in sequence $B$ is equal to $(1+n)t_{h}+t_{1}$ and the time taken to permanently repair three nodes in sequence $A$ is equal to $(1+n)^{2}t_{1}+(1+n)t_{2}+t_{3}$ . Note that $(1+n)t_{h}+t_{1}<(2+n)t_{1}<(1+n)^{2}t_{1}+(1+n)t_{2}+t_{3}$ because $t_{h}<t_{1}$ , $n\geq 1$ and $t_{1},t_{2},t_{3}>0$ . Consider the case when $x>3$ . Then, the time taken to permanently repair $x-1$ nodes in sequence $B$ is equal to

(1+n)^{x-2}t_{h}+(1+n)^{x-3}t_{1}+\ldots+(1+n)^{0}t_{x-2},

(12)

and the time taken to permanently repair $x$ nodes in sequence $A$ is equal to

(1+n)^{x-1}t_{1}+(1+n)^{x-2}t_{2}+\ldots+(1+n)^{0}t_{x}.

(13)

Then, $(1+n)^{x-2}t_{h}+(1+n)^{x-3}t_{1}+(1+n)^{x-4}t_{2}+\ldots+(1+n)^{0}t_{x-2}<\left((2+n)(1+n)^{x-3}\right)t_{1}+(1+n)^{x-4}t_{2}+\ldots+(1+n)^{0}t_{x-2}<(1+n)^{x-1}t_{1}+(1+n)^{x-2}t_{2}+\ldots+(1+n)^{0}t_{x}$ as $t_{h}<t_{1}$ , $n\geq 1$ and $t_{j}>0,\forall j\in\{1,\ldots,x\}$ .

Now consider the case when node $h$ is permanently repaired in sequence $A$ . When $z\geq x-1$ (and $x\geq 3$ ), the time taken to permanently repair $x-1$ nodes in sequence $B$ and the time taken to permanently repair $x$ nodes in sequence $A$ are given by expressions (12) and (13), respectively, and therefore the proof proceeds in the same way as when node $h$ is not permanently repaired in sequence $A$ . Thus, the proof for $x=3$ follows in exactly the same way as before because $z\geq 2$ . We now focus on the case when $z\leq x-2$ and $x>3$ . Then, the total time taken to permanently repair $x-1$ nodes in sequence $B$ is equal to

(1+n)^{x-2}t_{h}+(1+n)^{x-3}t_{1}+\ldots+(1+n)^{x-1-z}t_{z-1}+\\ (1+n)^{x-2-z}t_{z+1}+\ldots+(1+n)^{0}t_{x-1},

(14)

and the time taken to permanently repair $x$ nodes in sequence $A$ is given by the expression (13). Note that the value of the expression (14) is less than the value of the expression (13) because $t_{h}<t_{1}$ , $n\geq 1$ and $t_{j}>0,\forall j\in\{1,\ldots,x\}$ . Therefore, if sequence $A$ permanently repairs $x$ nodes by time-step $T$ , then sequence $B$ must permanently repair at least $x-1$ nodes by time-step $T$ . Thus, the result follows. ∎

Using the above result, we now propose a near-optimal policy for a subclass of Problem 1.

Theorem 3.

Proof:

Let $A$ be the optimal (non-jumping) sequence. Then, we go to the first time-step $T^{\prime}(\leq T)$ in sequence $A$ at which the healthiest available node (i.e., the healthiest node that can be targeted at time-step $T^{\prime}$ without violating the precedence constraints) is not targeted. Denote the portion of sequence $A$ from time-step $0$ to time-step $T^{\prime}-1$ as $A^{\prime}$ and the portion of sequence from time-step $T^{\prime}$ onwards as $A^{\prime\prime}$ . We modify the sequence $A^{\prime\prime}$ by Lemma 2 to generate sequence $A^{*}$ such that the healthiest available node (while respecting the precedence and time constraints) is targeted at time-step $T^{\prime}$ in $A^{*}$ . Let $y$ be the number of nodes that are permanently repaired by sequence $A^{\prime\prime}$ . Then, at least $y-1$ nodes are permanently repaired in sequence $A^{*}$ (while respecting the precedence and time constraints) by Lemma 2. Then, a sequence $B$ is generated by concatenating $A^{\prime}$ with $A^{*}$ . After this, we go to the next time-step in sequence $B$ at which the healthiest node (while respecting the precedence and time constraints) is not targeted and repeat this procedure. We iteratively repeat this procedure, so that we finally get a sequence in which the healthiest node (while respecting the precedence and time constraints) is targeted at each time-step. The number of nodes that are permanently repaired in the final sequence will be at least half of the number of nodes that are permanently repaired in sequence $A$ because 1) at each iteration of this procedure we move one node across the given sequence and the number of nodes that are permanently repaired in the modified sequence reduces at most by one, and 2) in the last iteration of this procedure where there is only one node, there is no reduction in the number of permanently repaired nodes because if the last node in the given sequence can be permanently repaired then a healthier node that replaces it can also be permanently repaired. ∎

We now show that the factor of $1/2$ in the above result is sharp, in that there exist problem instances where targeting the healthiest node at each time-step repairs only half as many nodes as an optimal policy.

Example 1.

Consider a graph consisting of three nodes as shown in Figure 3. Let the initial health values of the nodes 1, 2 and 3 be 0.6, 0.3 and 0.8, respectively. The deterioration and repair rates are homogeneous across all the nodes and both are equal to 0.1. Suppose $T=\infty$ . The healthiest node that can be targeted in the first time-step is node 1 (as we need to permanently repair node 2 before targeting node 3). If node 1 is first permanently repaired then node 2 fails by the time the entity reaches it and we cannot permanently repair any more nodes. However, if node 2 is first permanently repaired then node 3 can also be permanently repaired. Note that although the policy that targets the healthiest node at each time-step (while respecting the precedence and time constraints) is not optimal in this example, it is indeed $1/2$ -optimal as proved in Theorem 3.

Figure 3: Graph for illustrating optimal policies in the presence of precedence constraints.

Remark 1.

In this paper, we consider all the nodes to be equally important. However, one could also consider weights on the nodes to signify their relative importance, and the objective of Problem 1 can be changed to maximize the total weight of the nodes that are permanently repaired. Then, it is easy to argue that the policy characterized in Theorem 3 would be $\frac{w_{min}}{2w_{max}}$ -optimal, where $w_{max}$ and $w_{min}$ are the maximum and minimum weights of nodes, respectively.

Theorem 3 shows that the feedback policy that targets the healthiest node at each time-step while respecting the precedence and time constraints is $1/2$ -optimal. In the next section, we show that this policy is, in fact, optimal for special instances of the problem.

III-C Optimal sequencing

The paper [14] showed that the policy that targets the healthiest node at each time-step is optimal when all the conditions of Theorem 3 are satisfied but when there are no precedence and time constraints. We now prove that the aforementioned policy is also optimal when there is a time constraint but there are no precedence constraints.

Theorem 4.

Let there be $N(\geq 2)$ nodes such that there are no precedence constraints (i.e., $\mathcal{E}=\emptyset$ ). Suppose Assumption 1 holds and $T\in\mathbb{N}\cup\{\infty\}$ . Then, the policy that targets the healthiest node at each time-step while respecting the time constraint is optimal.

Proof:

Consider any optimal (non-jumping) sequence $A=\{i_{1},\ldots,i_{x}\}$ that permanently repairs $x(\leq N)$ nodes. Let $T^{\prime}(\leq T+1)$ be the total number of time-steps taken by the sequence $A$ to permanently repair all $x$ nodes.⁴⁴4Note that the total number of time-steps is one larger than the largest time-step index $T$ because the first time-step in a sequence is time-step 0. Then,

T^{\prime}=\frac{1-E_{1}}{\Delta_{inc}}+\frac{1-E_{2}}{\Delta_{inc}}+\ldots+\frac{1-E_{x}}{\Delta_{inc}},

(15)

where $E_{j}$ is the health of node $i_{j}$ when it is reached in the sequence, i.e., $E_{1}=v_{0}^{i_{1}}$ and $E_{k}=v_{0}^{i_{k}}-n\sum_{j=2}^{k}\left(1-E_{j-1}\right)$ for $k\in\{2,\ldots,x\}$ . Alternatively, $E_{k}=v_{0}^{i_{1}}n\left(1+n\right)^{k-2}+v^{i_{2}}_{0}n(1+n)^{k-3}+\ldots+v_{0}^{i_{k-1}}n+v^{i_{k}}_{0}-n\left(1+n\right)^{k-2}-n(1+n)^{k-3}-\ldots-n$ for all $k\in\{2,\ldots,x\}$ . Note that $E_{1}$ is the largest when node $i_{1}$ has the largest initial health; $E_{2}$ is the largest when node $i_{1}$ has the largest initial health and node $i_{2}$ has the second largest initial health (as the coefficients of $v_{0}^{i_{1}}$ and $v_{0}^{i_{2}}$ are $n$ and 1, respectively); $E_{3}$ is the largest when node $i_{1}$ has the largest initial health, node $i_{2}$ has the second largest initial health and node $i_{3}$ has the third largest initial health (as the coefficients of $v_{0}^{i_{1}}$ , $v_{0}^{i_{2}}$ and $v_{0}^{i_{3}}$ are $n(1+n)$ , $n$ and 1, respectively); and so on. Thus, the non-jumping sequence $B$ that targets the nodes in decreasing order of their initial health is optimal when time is constrained because it permanently repairs $x$ nodes in less time in comparison to the time taken by sequence $A$ to permanently repair $x$ nodes (as the value of $T^{\prime}$ in (15) for sequence $B$ is less than the corresponding value for sequence $A$ ). Therefore, the policy that targets the healthiest node at each time-step while respecting the time constraint is optimal by Theorem 1. ∎

We also show that the policy of targeting the healthiest node at each time-step (while respecting the precedence and time constraints) is optimal in certain classes of DAGs. We start with the following definition.

Definition 6.

Given a DAG, the nodes that do not have any incoming edges are said to belong to the “first level”; after removing all the nodes in the first level and all the outgoing edges of the nodes in the first level, the nodes that do not have any incoming edges are said to belong to the “second level”, and so on. A DAG in which all the nodes in a level have incoming edges from all the nodes in the previous level is termed as a “complete series graph”.

Figure 4 shows a complete series graph that has three levels.

Proposition 1.

Let there be $N(\geq 2)$ nodes with precedence constraints given by a complete series graph $G=\{\mathcal{V},\mathcal{E}\}$ . Suppose Assumption 1 holds and $T\in\mathbb{N}\cup\{\infty\}$ . Then, the policy that targets the healthiest node at each time-step while respecting the precedence and time constraints is optimal.

Proof:

Due to precedence constraints, we need to permanently repair all the nodes in the first level before targeting the other nodes in a complete series graph. Since the optimal policy is to target the healthiest node at each time-step (while respecting the time constraint) when there are no precedence constraints (by Theorem 4), we follow this policy for the nodes in the first level. After permanently repairing all the nodes in the first level, we need to permanently repair all the nodes in the second level before targeting other nodes. Thus, the policy of targeting the healthiest node at each time-step (while respecting the time constraint) in the second level is followed after permanently repairing all the nodes in the first level. The rest of the proof follows the above argument. ∎

So far, we assumed that $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\forall j\in\{1,\ldots,N\}$ . In the next section, we analyze other cases.

IV Control Sequences for $\Delta_{{dec}}^{j}<\Delta_{{inc}}^{j},\quad\forall{j}\in\{1,\ldots,N\}$

We now consider the case when repair rates are sufficiently larger than the deterioration rates, i.e., $\Delta_{inc}^{j}>(N-1)\Delta_{dec}^{j},\forall j\in\{1,\ldots,N\}$ and $\Delta_{inc}^{j}>\sum_{l\neq j}\Delta_{dec}^{l},\quad\forall j\in\{1,\ldots,N\}$ . We will use the concept of a modified health value defined as follows.

Definition 7.

The modified health value of a node at a time-step is the health value of a node at that time-step minus its deterioration rate.

IV-A Near-optimal policy

Consider a directed rooted tree where the total number of nodes is at most equal to $k(\geq 1)$ . In the context of disaster recovery, such a tree can represent damaged infrastructure components (with dependencies) within a particular neighborhood. We will consider such trees for the representation of precedence constraints as follows.

Theorem 5.

Let there be $N(\geq 2)$ nodes with precedence constraints represented by a graph $G=\{\mathcal{V},\mathcal{E}\}$ , which is a set of disjoint rooted trees that each contain at most $k(\geq 1)$ nodes. Suppose $\Delta_{inc}^{j}>(N-1)\Delta_{dec}^{j},\forall j\in\{1,\ldots,N\}$ , $\Delta_{inc}^{j}>\sum_{l\neq j}\Delta_{dec}^{l},\quad\forall j\in\{1,\ldots,N\}$ and $T=\infty$ . Then, the policy that targets the node with the least modified health value at each time-step while respecting the precedence constraints is $\frac{1}{k}$ -optimal.

Proof:

Let $\mathcal{R}$ be the set of root nodes of the trees. Let $A$ denote the sequence that targets the node with the least modified health value in the set $\mathcal{R}$ at each time-step, and let $B$ denote the optimal sequence. Let the number of nodes that are permanently repaired by sequences $A$ and $B$ be $y$ and $x$ , respectively. We will show that $y\geq\frac{x}{k}$ . We prove this by contradiction. Suppose $\frac{x}{k}>y$ . Denote the set of root nodes that are permanently repaired in sequence $B$ by $\mathcal{F}\subseteq\mathcal{R}$ , and let $z\triangleq|\mathcal{F}|$ . Note that $z\geq\left\lceil\frac{x}{k}\right\rceil$ because the root node in a tree should be permanently repaired before targeting other nodes in the tree. Then, $z\geq y+1$ because $\frac{x}{k}>y$ implies $\left\lceil\frac{x}{k}\right\rceil\geq y+1$ . We now construct a sequence $C$ by modifying sequence $B$ such that at every time-step of sequence $B$ where a node not belonging to the set $\mathcal{F}$ is targeted, we replace that node with a node from the set $\mathcal{F}$ . Then, sequence $C$ also permanently repairs all the nodes in the set $\mathcal{F}$ because the nodes of set $\mathcal{F}$ are targeted more frequently in sequence $C$ than in sequence $B$ . Since the optimal policy is to target the node with the least modified health value at each time-step when there are no precedence and time constraints (by Proposition 1 of [14]), we obtain a contradiction because sequence $C$ permanently repairs $z\geq y+1$ nodes whereas sequence $A$ (which is the optimal sequence to permanently repair the set of root nodes) permanently repairs $y$ nodes. Therefore, the assumption that $\frac{x}{k}>y$ is not true.

Let the sequence that targets the node with the least modified health value (in the set $\mathcal{V}$ ) at each time-step while respecting the precedence constraints be denoted by $D$ . We now argue that sequence $D$ permanently repairs at least as many nodes as sequence $A$ . Denote the set of root nodes that are permanently repaired in sequence $A$ by $\mathcal{Z}\subseteq\mathcal{R}$ and the number of nodes in the set $\mathcal{Z}$ by $y$ . Then, there exists an ordering of nodes in the set $\mathcal{Z}$ , denoted by $\{i_{1},i_{2},\ldots,i_{y}\}$ , such that the initial health values of those nodes satisfy

v_{0}^{i_{j}}>(y-j)\Delta_{dec}^{i_{j}},\quad\forall j\in\{1,\ldots,y\}

(16)

because of the following argument. Let $\mathcal{A}_{t}\subseteq\mathcal{Z}$ denote the set of nodes that have not been targeted at least once by the entity prior to time $t$ in sequence $A$ . Note that $\mathcal{Z}=\mathcal{A}_{0}\supseteq\mathcal{A}_{1}\supseteq\ldots\supseteq\mathcal{A}_{y-1}$ . At time $t=0$ , $|\mathcal{A}_{t}|=y$ , where $|\mathcal{A}_{t}|$ denotes the cardinality of set $\mathcal{A}_{t}$ . At $t=1$ , $|\mathcal{A}_{t}|=y-1$ as there are $y-1$ nodes of the set $\mathcal{Z}$ that have not been targeted at least once by the entity in sequence $A$ and thus each node $k\in\mathcal{A}_{1}$ should have initial health value larger than $\Delta_{dec}^{k}$ to survive until $t=1$ . At $t=2$ , $|\mathcal{A}_{t}|\geq y-2$ as there are at least $y-2$ nodes of the set $\mathcal{Z}$ that have not been targeted at least once by the entity in sequence $A$ and thus each node $k\in\mathcal{A}_{2}$ should have initial health value larger than $2\Delta_{dec}^{k}$ . Repeating this argument for the next $y-3$ time-steps, we can argue that there must be a permutation $\{i_{1},\ldots,i_{y}\}$ of nodes that satisfies the conditions (16) in order for $y$ nodes to be permanently repaired in sequence $A$ .

After this, it can be argued that sequence $D$ permanently repairs at least $y$ nodes by the same way as in Proposition 1 of [14]. Let the number of nodes that are permanently repaired by sequence $D$ be $y^{\prime}$ . Then, $y^{\prime}\geq y$ and therefore $\frac{x}{y^{\prime}}\leq\frac{x}{y}\leq k$ . Thus, the policy that targets the node with the least modified health value at each time-step while respecting the precedence constraints is $\frac{1}{k}$ -optimal. ∎

We now show that the factor of $\frac{1}{k}$ in Theorem 5 is sharp, in that there exist problem instances where targeting the node with the least modified health value at each time-step repairs only $\frac{1}{k}$ times the nodes as an optimal policy.

Example 2.

Consider the graph in Figure 3, which is a set of disjoint rooted trees that each contain at most two nodes. Let the initial health values of the nodes 1, 2 and 3 be 0.01, 0.02 and 0.8, respectively, and $T=\infty$ . The deterioration and repair rates are homogeneous across all the nodes and are equal to 0.1 and 0.25, respectively. The least healthy node that can be targeted in the first time-step is node 1 (since deterioration rates are homogeneous across all the nodes, targeting the node with the least modified health value is equivalent to targeting the node with least health value). However, if node 1 is targeted in the first time-step then node 2 fails by the time the entity reaches it and we cannot target any more nodes. However, if node 2 is first permanently repaired then node 3 can also be permanently repaired. Note that although the policy that targets the least healthy node at each time-step while respecting the precedence constraints is not optimal in this example, it is indeed $\frac{1}{k}$ -optimal as proved in Theorem 5.

We now also provide an example to show that the policy that is characterized in Theorem 5 need not be $\frac{1}{k}$ -optimal when $T\in\mathbb{N}$ .

Example 3.

Consider a graph consisting of two nodes without any precedence constraints (i.e., the graph is a union of disjoint trees that each contain one node). Let the initial health values of the nodes 1 and 2 be 0.01 and 0.11, respectively. The deterioration and repair rates are homogeneous across all the nodes and are equal to 0.1 and 0.11, respectively, and $T=10$ . If the policy of targeting the node with the least health value at each time-step is followed then no node gets permanently repaired while respecting the time constraint. However, if the non-jumping policy that first targets node 2 is followed, then node 2 can be permanently repaired. Thus, the policy that is characterized in Theorem 5 is not a 1-optimal policy (or optimal policy) in this example.

Therefore, characterizing near-optimal policies for this case under a time constraint is an avenue for future research.

IV-B Optimal sequencing

The paper [14] proved that the policy that targets the node with the least modified health value at each time-step is optimal when there are no precedence and time constraints. Examples 2 and 3 showed that this need not be true when there is a precedence or time constraint. However, the policy of targeting the node with the least modified health value at each time-step is optimal for special cases such as when the precedence constraints are given by a complete series graph (defined in the previous section).

Proposition 2.

Let there be $N(\geq 2)$ nodes with precedence constraints given by a complete series graph $G=\{\mathcal{V},\mathcal{E}\}$ . Suppose $\Delta_{inc}^{j}>(N-1)\Delta_{dec}^{j},\forall j\in\{1,\ldots,N\}$ , $\Delta_{inc}^{j}>\sum_{l\neq j}\Delta_{dec}^{l},\quad\forall j\in\{1,\ldots,N\}$ and $T=\infty$ . The optimal policy is to target the node with the least modified health at each time-step while respecting the precedence constraints.

This result can be proved similarly as Proposition 1 by using the fact that the policy of targeting the node with the least modified health at each step is optimal when there are no precedence and time constraints [14].

V Conclusions

We studied the problem of finding (near-) optimal control policies for targeting different components whose states are disrupted, when there exist precedence constraints between the components. We characterized control policies depending on the relationship between the repair and deterioration rates. There are several options for future extensions to this work: introducing stochasticity in the health values, extending the problem to time-dependent deterioration and repair rates, characterizing policies with improved ratios $\alpha$ , and characterizing near-optimal policies when for all $j\in\mathcal{V}$ , $\Delta_{dec}^{j}<\Delta_{inc}^{j}<(N-1)\Delta_{dec}^{j}$ .

VI Appendix

In this section, we prove that Problem 1 is NP-hard. This proof is inspired from the paper [17]; however, the job scheduling problem in [17] does not consider deterioration of jobs and there are additional differences between the problem that is considered in [17] and Problem 1 as mentioned in the review of job scheduling studies in Section I. We start by defining the NP-complete Clique problem [16] and an instance of a decision version of Problem 1 referred to as $OR_{d}$ .

Problem 2 (Clique).

Given an undirected graph $G^{\prime}=(\mathcal{V^{\prime}},\mathcal{E^{\prime}})$ consisting of $s$ vertices and $q$ edges, and a positive integer $p(\leq s)$ , does $G^{\prime}$ have a complete subgraph of size $p$ , i.e., a set of $p$ vertices such that each pair of vertices in the set is connected by an edge in $\mathcal{E^{\prime}}$ ?

Problem 3 ( $OR_{d}$ ).

Given a directed acyclic graph $G=\{\mathcal{V},\mathcal{E}\}$ consisting of $N(\geq 2)$ nodes with initial health values $\{v_{0}^{j}\}$ , along with repair and deterioration rates $\{\Delta_{inc}^{j}\}$ and $\{\Delta_{dec}^{j}\}$ , respectively, $T=\infty$ , and an integer $z$ such that $0\leq z\leq N$ , is there a non-jumping control sequence $u^{\ast}_{0:T}$ that respects the precedence constraints and gives a reward $J(v_{0},u^{\ast}_{0:T})\geq N-z$ ?

We now present the proof of Theorem 2.

Proof:

Given an instance of Clique we construct an instance of $OR_{d}$ as follows. We construct a total of $N=s+q+1$ nodes that are of three types as follows: a v-node is constructed corresponding to each vertex in $G^{\prime}$ , an e-node is constructed corresponding to each edge in $G^{\prime}$ , and a root node $r$ is constructed. The parameters of these nodes are set as follows. For each v-node $j$ , set $v_{0}^{j}=\frac{2(2^{s+q}-1)+1}{2+2(2^{s+q}-1)+1}$ and $\Delta_{dec}^{j}=\Delta_{inc}^{j}=\frac{1}{2+2(2^{s+q}-1)+1}$ . For each e-node $j$ , set $v_{0}^{j}=\frac{2(2^{\frac{p(p+1)}{2}}-1)+1}{2+2(2^{\frac{p(p+1)}{2}}-1)+1}$ and $\Delta_{dec}^{j}=\Delta_{inc}^{j}=\frac{1}{2+2(2^{\frac{p(p+1)}{2}}-1)+1}$ . The root node $r$ has the same parameters as a v-node. We construct directed edges such that there is an edge starting from a v-node and ending in an e-node if the vertex corresponding to the v-node lies at one of the ends of the edge (in graph $G^{\prime}$ ) corresponding to the e-node. We also construct directed edges starting from the root node and ending in all the other nodes, so that a DAG is formed. Finally, set $z=q-\frac{p(p-1)}{2}$ . Note that the constructed instance of $OR_{d}$ is polynomial in the size of the given instance of Clique.

We will now prove that the answer to the given instance of Clique is yes if and only if the answer to the constructed instance of $OR_{d}$ is yes. Suppose that the answer to the instance of Clique is yes. Then, we will show that it is possible to create a non-jumping sequence for the constructed instance of $OR_{d}$ in which no more than $q-\frac{p(p-1)}{2}$ nodes permanently fail while respecting the precedence constraints. The first node in the created sequence is node $r$ because of the precedence constraints. Note that node $r$ is permanently repaired in the created sequence because the first node in any non-jumping sequence is always permanently repaired (when $T=\infty$ ). After permanently repairing node $r$ , $p$ v-nodes whose corresponding vertices form a complete graph in the Clique are targeted in the created sequence. Note that it takes two time-steps to permanently repair node $r$ in the created sequence because $\frac{1-v_{0}^{r}}{\Delta_{dec}^{r}}=2$ . Also, the health value of each v-node $j$ after two time-steps is equal to $v_{2}^{j}=v_{0}^{j}-2\Delta_{dec}^{j}=\frac{2(2^{s+q}-1)+1-2}{2+2(2^{s+q}-1)+1}$ . Thus, the number of time-steps it takes to permanently repair the first v-node $j$ in the created sequence is equal to $\frac{1-v_{2}^{j}}{\Delta_{dec}^{j}}=4$ . Thus, the second v-node $j$ starts getting targeted after six time-steps in the created sequence and at that time its health value is equal to $v_{6}^{j}=v_{0}^{j}-6\Delta_{dec}^{j}=\frac{2(2^{s+q}-1)+1-6}{2+2(2^{s+q}-1)+1}$ . Thus, the number of time-steps it takes to permanently repair the second v-node $j$ in the created sequence is equal to $\frac{1-v_{6}^{j}}{\Delta_{dec}^{j}}=8$ . Proceeding in this way, it can be shown that the total number of time-steps taken to permanently repair node $r$ and $p-1$ v-nodes is equal to $2(2^{p}-1)$ because the $i$ th node in the created sequence (where $1\leq i\leq p$ ) takes $2^{i}$ time-steps to get permanently repaired. Note that $\frac{v_{0}^{j}}{\Delta_{dec}^{j}}=2\left(2^{s+q}-1\right)+1$ for all v-nodes $j$ , and represents the number of time-steps it takes for $j$ to permanently fail if it is not targeted within the first $\frac{v_{0}^{j}}{\Delta_{dec}^{j}}$ time-steps. Thus, for all v-nodes $j$ , we define $\gamma_{\textit{v}}=\frac{v_{0}^{j}}{\Delta_{dec}^{j}}=2(2^{s+q}-1)+1$ . Note that the time-step at which the $p$ th v-node starts getting targeted in the created sequence is less than $\gamma_{\textit{v}}$ because $2(2^{p}-1)\leq 2^{s+1}-2<2^{s+1}-1\leq 2(2^{s+q}-1)+1$ (as $p\leq s$ and $q\geq 0$ ). Thus, all the $p$ v-nodes are permanently repaired in the created sequence. After permanently repairing $p$ v-nodes that correspond to the solution of Clique, $\frac{p(p-1)}{2}$ e-nodes that correspond to the edges of the complete subgraph in the solution of Clique are targeted. Note that $\frac{v_{0}^{j}}{\Delta_{dec}^{j}}=2(2^{\frac{p(p+1)}{2}}-1)+1$ for all e-nodes $j$ . Thus, for all e-nodes $j$ , we define $\gamma_{\textit{e}}=\frac{v_{0}^{j}}{\Delta_{dec}^{j}}=2(2^{\frac{p(p+1)}{2}}-1)+1$ . Note that there are $1+p+\frac{p(p-1)}{2}-1=\frac{p(p+1)}{2}$ nodes that are targeted before the last e-node among the aforementioned e-nodes in the created sequence. Thus, the last e-node among the aforementioned e-nodes starts getting targeted in the created sequence at time-step $2(2^{\frac{p(p+1)}{2}}-1)$ because the $i$ th node in the created sequence (where $1\leq i\leq\frac{p(p+1)}{2}$ ) takes $2^{i}$ time-steps to get permanently repaired. Thus, all the aforementioned e-nodes get permanently repaired because $2(2^{\frac{p(p+1)}{2}}-1)<\gamma_{\textit{e}}$ . Finally, the remaining $s-p$ v-nodes are targeted in the created sequence. Note that there are $1+p+\frac{p(p-1)}{2}+s-p-1=s+\frac{p(p-1)}{2}$ nodes that are targeted before the last v-node in the created sequence. Thus, it takes $2(2^{s+\frac{p(p-1)}{2}}-1)$ time-steps in order to start targeting the last v-node because the $i$ th node in the created sequence (where $1\leq i\leq s+\frac{p(p-1)}{2}$ ) takes $2^{i}$ time-steps to get permanently repaired. Thus, it is possible to permanently repair all the v-nodes because $2(2^{s+\frac{p(p-1)}{2}}-1)<\gamma_{\textit{v}}$ (as $q\geq\frac{p(p-1)}{2}$ ). Thus, except the remaining $q-\frac{p(p-1)}{2}$ e-nodes, all the nodes are permanently repaired in the created sequence.

Now we show the opposite direction. Suppose the answer to the constructed instance of $OR_{d}$ is yes. Then, there exists a non-jumping sequence in which at most $z=q-\frac{p(p-1)}{2}$ nodes permanently fail. Note that there are at most $N-1=s+q$ nodes that are permanently repaired before the last node that is targeted in the given sequence. Thus, the largest time-step at which the last targeted node in the given non-jumping sequence starts getting targeted is equal to $2(2^{s+q}-1)$ because the $i$ th node in the given sequence takes $2^{i}$ time-steps to get permanently repaired. Thus, all the v-nodes are permanently repaired in the given sequence because $2(2^{s+q}-1)<2(2^{s+q}-1)+1=\gamma_{\textit{v}}$ . Note that the first node that is targeted in the given sequence is node $r$ because of the precedence constraints. Since the first node in the given sequence is permanently repaired and all v-nodes are also permanently repaired, the set of all the nodes that permanently fail in the given sequence is a subset of all the e-nodes. The number of e-nodes that are permanently repaired in the given sequence is at least equal to $\frac{p(p-1)}{2}$ because at most $q-\frac{p(p-1)}{2}$ nodes permanently fail in the given sequence and the total number of e-nodes is equal to $q$ . Therefore, at least $p$ v-nodes need to be permanently repaired before at least $\frac{p(p-1)}{2}$ e-nodes can be permanently repaired in the given sequence because of the following two reasons. First, a v-node is an in-neighbor of an e-node if the vertex corresponding to the v-node lies at one of the ends of the edge (in graph $G^{\prime}$ ) corresponding to the e-node. Secondly, among all the undirected graphs that have $\frac{p(p-1)}{2}$ edges, a complete graph (containing $p$ vertices) has the least number of vertices. Note that all the e-nodes that are permanently repaired in the given sequence start getting targeted before time-step $\gamma_{\textit{e}}=2(2^{\frac{p(p+1)}{2}}-1)+1$ . Also, the maximum number of nodes (when the $i$ th node takes $2^{i}$ time-steps to get permanently repaired) that can be targeted such that the last node starts getting targeted before time-step $2(2^{\frac{p(p+1)}{2}}-1)+1$ is equal to $\frac{p(p+1)}{2}+1$ . Thus, node $r$ , $p$ v-nodes and $\frac{p(p-1)}{2}$ e-nodes are targeted before time-step $2(2^{\frac{p(p+1)}{2}}-1)+1$ in the given sequence such that the vertices corresponding to the $p$ v-nodes form a complete subgraph of size $p$ in graph $G^{\prime}$ .

Since non-jumping sequences are optimal when the precedence constraints are given by a DAG, $\Delta_{dec}^{j}\geq\Delta_{inc}^{j},\forall j\in\{1,\ldots,N\}$ and $T\in\mathbb{N}\cup\{\infty\}$ by Theorem 1, we conclude that Problem 1 is NP-hard. ∎

References

[1] E. I. Chisolm and J. C. Matthews, “Impact of hurricanes and flooding on buried infrastructure,” Leadership and Management in Engineering, vol. 12, no. 3, pp. 151–156, 2012.
[2] D. J. Leversage and E. J. Byres, “Estimating a system’s mean time-to-compromise,” IEEE Security & Privacy, vol. 6, no. 1, pp. 52–60, 2008.
[3] R. J. La, “Interdependent security with strategic agents and cascades of infection,” IEEE/ACM Transactions on Networking, vol. 24, no. 3, pp. 1378–1391, 2015.
[4] D. Gorges, M. Izak, and S. Liu, “Optimal control and scheduling of switched systems,” IEEE Transactions on Automatic Control, vol. 56, no. 1, pp. 135–140, 2010.
[5] A. Bemporad and N. Giorgetti, “Logic-based solution methods for optimal control of hybrid systems,” IEEE Transactions on Automatic Control, vol. 51, no. 6, pp. 963–976, 2006.
[6] F. Zhu and P. J. Antsaklis, “Optimal control of hybrid switched systems: A brief survey,” Discrete Event Dynamic Systems, vol. 25, no. 3, pp. 345–364, 2015.
[7] A. Eryilmaz and R. Srikant, “Fair resource allocation in wireless networks using queue-length-based scheduling and congestion control,” IEEE/ACM Transactions on Networking (TON), vol. 15, no. 6, pp. 1333–1344, 2007.
[8] S. L. Smith, M. Schwager, and D. Rus, “Persistent robotic tasks: Monitoring and sweeping in changing environments,” IEEE Transactions on Robotics, vol. 28, no. 2, pp. 410–426, 2012.
[9] P. Nilsson and N. Ozay, “On a class of maximal invariance inducing control strategies for large collections of switched systems,” in Proceedings of the 20th International Conference on Hybrid Systems: Computation and Control. ACM, 2017, pp. 187–196.
[10] N. T. Argon, S. Ziya, and R. Righter, “Scheduling impatient jobs in a clearing system with insights on patient triage in mass casualty incidents,” Probability in the Engineering and Informational Sciences, vol. 22, no. 3, pp. 301–332, 2008.
[11] V. S. Gordon, C. N. Potts, V. A. Strusevich, and J. D. Whitehead, “Single machine scheduling models with deterioration and learning: handling precedence constraints via priority generation,” Journal of Scheduling, vol. 11, no. 5, p. 357, 2008.
[12] F. Zhang and A. Burns, “Schedulability analysis for real-time systems with EDF scheduling,” IEEE Transactions on Computers, no. 9, pp. 1250–1258, 2009.
[13] H. Gehlot, S. Sundaram, and S. V. Ukkusuri, “Optimal sequencing policies for recovery of physical infrastructure after disasters,” in 2019 American Control Conference (ACC). IEEE, 2019, pp. 3605–3610.
[14] ——, “Optimal policies for recovery of multiple systems after disruptions,” arXiv preprint arXiv:1904.11615, 2020.
[15] ——, “Approximation algorithms for the recovery of infrastructure after disasters under precedence constraints,” IFAC-PapersOnLine, vol. 52, no. 20, pp. 175–180, 2019.
[16] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to algorithms. MIT press, 2009.
[17] M. Garey and D. S. Johnson, “Scheduling tasks with nonuniform deadlines on two processors,” Journal of the ACM (JACM), vol. 23, no. 3, pp. 461–467, 1976.

Shreyas Sundaram is an Associate Professor in the School of Electrical and Computer Engineering at Purdue University. He received his MS and PhD degrees in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2005 and 2009, respectively. He was a Postdoctoral Researcher at the University of Pennsylvania from 2009 to 2010, and an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Waterloo from 2010 to 2014. He is a recipient of the NSF CAREER award, and an Air Force Research Lab Summer Faculty Fellowship. At Purdue, he received the Hesselberth Award for Teaching Excellence and the Ruth and Joel Spira Outstanding Teacher Award. At Waterloo, he received the Department of Electrical and Computer Engineering Research Award and the Faculty of Engineering Distinguished Performance Award. He received the M. E. Van Valkenburg Graduate Research Award and the Robert T. Chien Memorial Award from the University of Illinois, and he was a finalist for the Best Student Paper Award at the 2007 and 2008 American Control Conferences. His research interests include network science, analysis of large-scale dynamical systems, fault-tolerant and secure control, linear system and estimation theory, game theory, and the application of algebraic graph theory to system analysis.

Control Policies for Recovery of Interdependent Systems After Disruptions

Abstract

I Introduction

Our contributions

II Problem Statement

Definition 1.

Definition 2.

Definition 3.

Problem 1 (Optimal Control Sequencing over a DAG).

Definition 4.

Definition 5.

III Control Sequences for Δd​e​cj≥Δi​n​cj,∀j∈{1,…,N}\Delta_{{dec}}^{{j}}\geq\Delta_{{inc}}^{{j}},\quad\forall{j}\in\{1,\ldots,N\}

III-A Optimality of non-jumping sequences

Lemma 1.

Proof:

Theorem 1.

Theorem 2.

III-B Near-optimal policy

Assumption 1.

Lemma 2.

Proof:

Theorem 3.

Proof:

Example 1.

Remark 1.

III-C Optimal sequencing

Theorem 4.

Proof:

Definition 6.

Proposition 1.

Proof:

IV Control Sequences for Δd​e​cj<Δi​n​cj,∀j∈{1,…,N}\Delta_{{dec}}^{j}<\Delta_{{inc}}^{j},\quad\forall{j}\in\{1,\ldots,N\}

Definition 7.

IV-A Near-optimal policy

Theorem 5.

Proof:

Example 2.

Example 3.

IV-B Optimal sequencing

Proposition 2.

V Conclusions

VI Appendix

Problem 2 (Clique).

Problem 3 (O​RdOR_{d}).

Proof:

References

III Control Sequences for $\Delta_{{dec}}^{{j}}\geq\Delta_{{inc}}^{{j}},\quad\forall{j}\in\{1,\ldots,N\}$

IV Control Sequences for $\Delta_{{dec}}^{j}<\Delta_{{inc}}^{j},\quad\forall{j}\in\{1,\ldots,N\}$

Problem 3 ( $OR_{d}$ ).