Control Policies for Recovery of Interdependent Systems After Disruptions
Abstract
We examine a control problem where the states of the components of a system deteriorate after a disruption, if they are not being repaired by an entity. There exist a set of dependencies in the form of precedence constraints between the components, captured by a directed acyclic graph (DAG). The objective of the entity is to maximize the number of components whose states are brought back to the fully repaired state within a given time. We prove that the general problem is NP-hard, and therefore we characterize near-optimal control policies for special instances of the problem. We show that when the deterioration rates are larger than or equal to the repair rates and the precedence constraints are given by a DAG, it is optimal to continue repairing a component until its state reaches the fully recovered state before switching to repair any other component. Under the aforementioned assumptions and when the deterioration and the repair rates are homogeneous across all the components, we prove that the control policy that targets the healthiest component at each time-step while respecting the precedence and time constraints fully repairs at least half the number of components that would be fully repaired by an optimal policy. Finally, we prove that when the repair rates are sufficiently larger than the deterioration rates, the precedence constraints are given by a set of disjoint trees that each contain at most nodes, and there is no time constraint, the policy that targets the component with the least value of health minus the deterioration rate at each time-step while respecting the precedence constraints fully repairs at least times the number of components that would be fully repaired by an optimal policy.
I Introduction
We focus on a problem where multiple components of a system have been damaged after a disruption (e.g., a disaster, cyberphysical attack, etc.) and the states of the components continue to reduce if they are not being repaired, until they reach a value referred to as permanent failure. There is an entity (or a controller) whose objective is to maximize a performance criteria or reward, e.g., maximizing the number of components whose states are brought back to their maximum value (known as the permanent repair state) within a given time. The states of the components do not change once they reach permanent failure or permanent repair. There exists a set of dependencies between the components that are represented by precedence constraints such that if there is a precedence constraint between an ordered pair of components , then component needs to be permanently repaired before the entity can target component . This problem has relevance in multiple applications, e.g., disaster recovery, protection of cyberphysical systems against attacks, and fire-fighting. For instance, infrastructure components face accelerated deterioration after disasters due to processes such as corrosion and flooded roads. If the components are not repaired in a timely manner, they can deteriorate to such a level that they become unusable and require full replacement, which is usually expensive [1]. Furthermore, there can be various dependencies between infrastructure components; for example, in order to bring back the functionality of water systems (such as irrigation pumps), it might be necessary to first repair some sections of the power network. Similarly, there might be a limited time before components (e.g., computing elements) of a cyberphysical system become fully compromised after an attack [2]. There might be dependencies or hierarchies between the components such that the components at the top of the hierarchy may be required to be repaired first in order to stop the spread of attack to other components [3]. Thus, the protecting agency would have to make optimal recovery decisions if there is a repair budget (e.g., manpower and resources).
This problem can be cast in the general framework of optimal control and scheduling of switched systems [4, 5]. In switched systems, there are multiple subsystems such that only one subsystem can be controlled at a time. In our problem, components correspond to the subsystems of a switched system. Since the entity can only control one component at a time, our problem is analogous to optimal control and scheduling of discrete time linear switched systems [4]. One of the main complexities that arises in solving switched controlled systems is the combinatorially large number of feasible switching sequences [5]. Despite recent advances in the optimal control and scheduling of switched systems[4], finding general theoretical results or frameworks for efficiently solving optimal control and scheduling problems for switched systems remains an active area of research [6].
At a high level, our problem also has analogies to the resource (e.g., time slot) allocation problem at a base-station to time-dependent flows [7], optimal control of robotic systems [8], switched system control problems such as scheduling of thermostatically controlled loads [9] and patient triage scheduling problems [10]. However, these studies do not consider precedence constraints. In addition, some of these studies [7, 8] do not consider permanent failure of flows/components being served/targeted. Job scheduling with linear time deterioration and precedence constraints that minimize the total number of late jobs also have similarities to our problem [11]. However, job scheduling studies typically do not consider the notion of permanent failure, as jobs are completely processed even if their completion times exceed due dates. In addition, jobs are considered to be late if their completion times exceed the corresponding due dates; however, in our setting, a repair agency should start targeting a component before it fails. Scheduling of real-time tasks [12] also has some high level analogies to our work. However, these studies focus on tasks that sporadically become available for processing, in contrast to the components in our problem that are available for repair starting at the same time (i.e., immediately after a disruption), subject to precedence constraints.
Under the setting described before, we focus on characterizing (near-) optimal switching policies for maximizing the number of components that are permanently repaired. The paper [13] characterized optimal policies for this problem assuming homogeneous rates of repair and deterioration, and [14] studied the problem for the case when there are heterogeneous rates of repair and deterioration. However, these studies do not consider precedence constraints among various components and also do not constrain the maximum time that is available to repair the components. In the conference version of this paper [15], precedence constraints were considered among various components but there was no constraint on the time that is available for repair, and no discussion on the complexity of the general problem. This paper significantly expands on the conference paper by addressing those gaps.
Our contributions
First, we prove that when the deterioration rates are larger than or equal to the repair rates, and the precedence constraints are given by a directed acyclic graph (DAG), the optimal switching policy is to permanently repair a component that is targeted before switching to another component, while respecting the precedence and time constraints. Second, we prove that under the aforementioned assumptions and when the deterioration and the repair rates are homogeneous across all the components, the switching policy that targets the healthiest component at each time-step while respecting the precedence and time constraints is -optimal.111For , a policy is said to be -optimal if the reward obtained by that policy is at least times the reward obtained by an optimal policy. We also prove that the aforementioned policy is optimal when time is constrained but when there are no precedence constraints. Third, we prove that when the repair rates are sufficiently larger than the deterioration rates, the precedence constraints are given by a set of disjoint trees that each contain at most nodes, and there is no time constraint, the policy that targets the component with the least health value minus the deterioration rate at each time-step while respecting the precedence constraints is -optimal.222The paper [15] characterizes a near-optimal policy with the same factor under the aforementioned assumptions where the root nodes of the trees are first targeted and other nodes are only targeted when it is not possible to target any root node; the policy that is characterized in this paper does not require that condition. Finally, we prove that the general problem is NP-hard (even if there is no time constraint).
In the next section, we formally define the problem that we focus in this paper. After that, we divide the characterization of sequencing policies into two cases, corresponding to the relative sizes of the rates of deterioration and repair.
II Problem Statement
There are components indexed by the set . A component could represent damaged infrastructure in an area, an infected server, etc., depending on the context. We refer to different components as nodes. There is a repair agency (also referred to as an entity), which is responsible for repairing the nodes. We index time-steps by the variable , capturing the resolution at which the entity decides to repair nodes. Let be the largest time-step that is available for repairing nodes (when we say that there is no time constraint). The control action of the entity at time-step is represented by , i.e., the entity targets one node to repair at each time-step.333We keep the analysis of the problem where the entity can simultaneously target multiple nodes for future work. Each node has an associated health value at time-step . We denote the initial health value of each node by . If a node is being repaired at a time-step then the health value of that node increases by . A node is considered to be permanently repaired at time-step if and . If a node is not being repaired at a time-step, then the health value of node decreases by . A node is considered to be permanently failed at time-step if and . Note that the health value of a node does not change further once it gets permanently repaired or permanently failed. Thus, for all , the health value of node dynamically changes depending on the control action taken by the entity as follows:
There is a set of dependency constraints between different nodes, represented by a set consisting of edges between various nodes. An edge starting from node and ending at node represents a precedence constraint that node needs to be permanently repaired (i.e., to full health) before the entity can start targeting node ; is an in-neighbor of . The in-neighbor set of a node is the set . We assume that the precedence constraints form a directed acyclic graph (DAG) denoted by ; such graphs are a common form for representing general precedence constraints in other problems such as job scheduling [11].
We now define the feasible control set as follows.
Definition 1.
Let the health values of all the nodes at time-step be given by . Then, the feasible control set at time-step is the set containing the nodes whose in-neighbors are all in the permanently repair state at time-step , i.e., .
We now present the definition of a control sequence that respects the precedence constraints as follows.
Definition 2.
A control sequence is said to respect the precedence constraints if for all , .
We define the reward function as follows.
Definition 3.
Given a set of initial health values and a control sequence that respects the precedence constraints, the reward is defined as the number of nodes that get permanently repaired through that sequence. More formally, .
Based on the above definitions, the following problem is the focus of this paper.
Problem 1 (Optimal Control Sequencing over a DAG).
Consider a directed acyclic graph consisting of nodes with initial health values , along with repair and deterioration rates and , respectively. Given a time constraint , find a control sequence that respects the precedence constraints and maximizes the reward .
For our purposes, we also define the concept of a jump.
Definition 4.
If the entity starts targeting a node before permanently repairing the node it targeted in the last time-step, then the entity is considered to have jumped at that time-step. Mathematically, if , and , then a jump is said to have been made by the entity at time-step . A control sequence that does not contain any jumps is called a non-jumping sequence.
In this paper, we will be obtaining sequences that are generated by state feedback policies, i.e., we will be providing a mapping , such that for all . Thus, we will use the same terminology defined above (for sequences) for policies (e.g., a policy respecting the precedence constraints, a non-jumping policy, etc.). Given a policy that respects the precedence constraints, we denote the reward (from Definition 3) by , with the time-constraint being implicit.
We now define an -optimal policy as follows.
Definition 5.
Let be an optimal policy for Problem 1. For , a policy (that respects the precedence and time constraints) is said to be -optimal if .
In the next section, we start the analysis of the problem for .
III Control Sequences for
In this section, we start by showing the optimality of non-jumping sequences when .
III-A Optimality of non-jumping sequences
To show that non-jumping policies are optimal when , we first analyze the properties of sequences containing at most one jump and later generalize to sequences containing an arbitrary number of jumps. We start with the following result.
Lemma 1.
Let there be nodes that have a set of precedence constraints given by a DAG , and . Consider a control sequence that respects the precedence and time constraints and targets nodes as shown in Figure 1. Suppose sequence permanently repairs all the nodes and contains exactly one jump, where the entity partially repairs node before moving to node at time-step . Sequence then permanently repairs nodes , before returning to node and permanently repairing it. Then, there exists a non-jumping control sequence as shown in Figure 2 that targets nodes in the order (which respects the precedence and time constraints) and permanently repairs all the nodes. Furthermore, if (resp. ) is the number of time-steps taken to permanently repair node in sequence (resp. sequence ), then the following holds true:
(1) | ||||
(2) | ||||
(3) |


Proof:
Since node is partially repaired by the entity before it targets the nodes in sequence , there cannot be an edge that starts from and ends at a node in the set . Thus, if sequence follows the precedence constraints, the non-jumping sequence also respects the precedence constraints. We can now leverage the analysis in [14], which proved that (1)-(3) hold and each node in sequence is targeted at an earlier time-step than in sequence . Therefore, sequence permanently repairs all the nodes that are permanently repaired by sequence while respecting the precedence and time constraints. Thus, the result follows. ∎
The above result considered sequences containing exactly one jump. This leads us to the following key result pertaining to the optimal control policy when .
Theorem 1.
Let there be nodes that have a set of precedence constraints given by a DAG , and . Suppose there is a sequence with one or more jumps that respects the precedence and time constraints, and permanently repairs nodes. Then, there exists a non-jumping sequence that respects the precedence and time constraints, and permanently repairs nodes. Thus, non-jumping sequences are optimal when .
The proof of the above result follows immediately by applying Lemma 1 on sequence to obtain a sequence that has one less jump than , and repeating this procedure on the obtained sequences to get a non-jumping sequence.
Since non-jumping sequences are optimal by the above result, we will only focus on non-jumping sequences when . We now present the following NP-hardness result, whose proof is provided in the appendix.
Theorem 2.
Problem 1 is NP-hard.
III-B Near-optimal policy
In this section, we characterize a near-optimal policy for the case when . We will make the following assumption in this section.
Assumption 1.
For all , suppose , , and . Also, suppose there exists a positive integer such that , and for each node , there exists a positive integer such that .
Note that the conditions and , in the assumption ensure that no node gets permanently repaired partway through a time-step.
We now start with the following result.
Lemma 2.
Let there be nodes with precedence constraints given by a DAG . Suppose Assumption 1 holds and . Consider a non-jumping sequence that respects the precedence and time constraints, and permanently repairs nodes. Suppose there is a node from the set of nodes that has larger initial health value than the first node of sequence and the precedence constraints allow node to be permanently repaired before the first node of sequence . Consider a non-jumping sequence where node is first targeted and then the remaining targeted nodes of sequence are targeted after node . Then, at least nodes are permanently repaired in sequence while respecting the precedence and time constraints.
Proof:
Let be the number of nodes that are permanently repaired in sequence . Denote the order of nodes in the sequence as . The proof for the case when is trivial because ; we thus focus on the case when . Denote the ordering of the first nodes in sequence as , where . Note that the proof when is also trivial because node in the non-jumping sequence is permanently repaired since the time taken to permanently repair node in sequence is less than the time taken to permanently repair the nodes in sequence (because node has larger initial health than node ). Therefore, the non-trivial cases are when .
We first argue that sequence permanently repairs at least nodes, regardless of the time it takes to repair the nodes. We prove the result by contradiction. Suppose sequence permanently repairs less than nodes. Let be the first node that permanently fails in sequence , where . We first argue that either or . If node does not appear in sequence , then the first nodes of sequence are ordered as follows and thus . On the other hand, if node is permanently repaired in sequence , let the position of node in sequence be , where . Then, , and . Thus, if the first node that fails in sequence (i.e., node ) is such that , then ; otherwise . We now consider the cases when and one by one.
Suppose . If node fails in sequence , then we show that node must permanently fail in sequence (where ). Consider . Let be the number of time-steps taken to permanently repair node from its initial health to the full health. If node permanently fails after permanently repairing node in sequence , then . That is,
(4) |
Denote as the number of time-steps taken to permanently repair node from its initial health to full health. Then,
(5) |
as and . Therefore,
(6) |
where the first inequality from the left comes by conditions (4)-(5) and the second inequality comes from the fact that (as node has larger initial health than node ). Note that in sequence , it takes time-steps to repair node , and time-steps to repair node (it takes time-steps to repair the health that is lost by node due to deterioration and it takes time-steps to repair the difference in the initial health of and full health). By (6), the total reduction in the health of node after time-steps in sequence satisfies
as and . Thus, node permanently fails in sequence by the time it is reached, contradicting the fact that permanently repairs nodes.
Now consider the case when . If node permanently fails in sequence , then . Following the same arguments as before, we get
(7) |
As before, the total number of time-steps required to permanently repair nodes , and in sequence is equal to . Thus, the total reduction in the health of node by the time it is reached in sequence satisfies
by (7), and . Thus, permanently fails by the time it is reached in sequence , leading to a contradiction.
We can repeat the above arguments to show that if node permanently fails in sequence , where , then
(8) |
Therefore, the total reduction in the health of node after time-steps in sequence is
(9) |
Thus, for , we get
by (8), (9), and . Since, the total reduction in the health value of node before the entity starts targeting it is larger than one, it is not possible to permanently repair node in sequence , leading to a contradiction.
We now consider the case when (along with ). We prove that if node permanently fails in sequence , then it is not possible to permanently repair node in sequence . Recall that the case happens when node is permanently repaired in sequence and . Since , we have . Consider . Then, (i.e., ) because and . If node permanently fails in sequence , then
or equivalently,
(10) |
Therefore, the total reduction in the health of node after time-steps in satisfies
by (10), and . Thus, node permanently fails in sequence , contradicting the fact that permanently repairs nodes.
Similarly, it can be shown that if node permanently fails in sequence , where , then
(11) |
Therefore, the total reduction in the health of node (where ) after time-steps in sequence is
by (11), and . Therefore, it would not be possible to permanently repair nodes in sequence , leading to a contradiction.
We will now show that sequence permanently repairs nodes in less time than the time taken to permanently repair nodes in sequence . Suppose node is not permanently repaired in sequence . Consider . Then, the time taken to permanently repair the first two nodes, i.e., nodes and , in sequence is equal to and the time taken to permanently repair three nodes in sequence is equal to . Note that because , and . Consider the case when . Then, the time taken to permanently repair nodes in sequence is equal to
(12) |
and the time taken to permanently repair nodes in sequence is equal to
(13) |
Then, as , and .
Now consider the case when node is permanently repaired in sequence . When (and ), the time taken to permanently repair nodes in sequence and the time taken to permanently repair nodes in sequence are given by expressions (12) and (13), respectively, and therefore the proof proceeds in the same way as when node is not permanently repaired in sequence . Thus, the proof for follows in exactly the same way as before because . We now focus on the case when and . Then, the total time taken to permanently repair nodes in sequence is equal to
(14) |
and the time taken to permanently repair nodes in sequence is given by the expression (13). Note that the value of the expression (14) is less than the value of the expression (13) because , and . Therefore, if sequence permanently repairs nodes by time-step , then sequence must permanently repair at least nodes by time-step . Thus, the result follows. ∎
Using the above result, we now propose a near-optimal policy for a subclass of Problem 1.
Theorem 3.
Let there be nodes with precedence constraints given by a DAG . Suppose Assumption 1 holds and . Then, the policy that targets the healthiest node at each time-step while respecting the precedence and time constraints is -optimal.
Proof:
Let be the optimal (non-jumping) sequence. Then, we go to the first time-step in sequence at which the healthiest available node (i.e., the healthiest node that can be targeted at time-step without violating the precedence constraints) is not targeted. Denote the portion of sequence from time-step to time-step as and the portion of sequence from time-step onwards as . We modify the sequence by Lemma 2 to generate sequence such that the healthiest available node (while respecting the precedence and time constraints) is targeted at time-step in . Let be the number of nodes that are permanently repaired by sequence . Then, at least nodes are permanently repaired in sequence (while respecting the precedence and time constraints) by Lemma 2. Then, a sequence is generated by concatenating with . After this, we go to the next time-step in sequence at which the healthiest node (while respecting the precedence and time constraints) is not targeted and repeat this procedure. We iteratively repeat this procedure, so that we finally get a sequence in which the healthiest node (while respecting the precedence and time constraints) is targeted at each time-step. The number of nodes that are permanently repaired in the final sequence will be at least half of the number of nodes that are permanently repaired in sequence because 1) at each iteration of this procedure we move one node across the given sequence and the number of nodes that are permanently repaired in the modified sequence reduces at most by one, and 2) in the last iteration of this procedure where there is only one node, there is no reduction in the number of permanently repaired nodes because if the last node in the given sequence can be permanently repaired then a healthier node that replaces it can also be permanently repaired. ∎
We now show that the factor of in the above result is sharp, in that there exist problem instances where targeting the healthiest node at each time-step repairs only half as many nodes as an optimal policy.
Example 1.
Consider a graph consisting of three nodes as shown in Figure 3. Let the initial health values of the nodes 1, 2 and 3 be 0.6, 0.3 and 0.8, respectively. The deterioration and repair rates are homogeneous across all the nodes and both are equal to 0.1. Suppose . The healthiest node that can be targeted in the first time-step is node 1 (as we need to permanently repair node 2 before targeting node 3). If node 1 is first permanently repaired then node 2 fails by the time the entity reaches it and we cannot permanently repair any more nodes. However, if node 2 is first permanently repaired then node 3 can also be permanently repaired. Note that although the policy that targets the healthiest node at each time-step (while respecting the precedence and time constraints) is not optimal in this example, it is indeed -optimal as proved in Theorem 3.

Remark 1.
In this paper, we consider all the nodes to be equally important. However, one could also consider weights on the nodes to signify their relative importance, and the objective of Problem 1 can be changed to maximize the total weight of the nodes that are permanently repaired. Then, it is easy to argue that the policy characterized in Theorem 3 would be -optimal, where and are the maximum and minimum weights of nodes, respectively.
Theorem 3 shows that the feedback policy that targets the healthiest node at each time-step while respecting the precedence and time constraints is -optimal. In the next section, we show that this policy is, in fact, optimal for special instances of the problem.
III-C Optimal sequencing
The paper [14] showed that the policy that targets the healthiest node at each time-step is optimal when all the conditions of Theorem 3 are satisfied but when there are no precedence and time constraints. We now prove that the aforementioned policy is also optimal when there is a time constraint but there are no precedence constraints.
Theorem 4.
Let there be nodes such that there are no precedence constraints (i.e., ). Suppose Assumption 1 holds and . Then, the policy that targets the healthiest node at each time-step while respecting the time constraint is optimal.
Proof:
Consider any optimal (non-jumping) sequence that permanently repairs nodes. Let be the total number of time-steps taken by the sequence to permanently repair all nodes.444Note that the total number of time-steps is one larger than the largest time-step index because the first time-step in a sequence is time-step 0. Then,
(15) |
where is the health of node when it is reached in the sequence, i.e., and for . Alternatively, for all . Note that is the largest when node has the largest initial health; is the largest when node has the largest initial health and node has the second largest initial health (as the coefficients of and are and 1, respectively); is the largest when node has the largest initial health, node has the second largest initial health and node has the third largest initial health (as the coefficients of , and are , and 1, respectively); and so on. Thus, the non-jumping sequence that targets the nodes in decreasing order of their initial health is optimal when time is constrained because it permanently repairs nodes in less time in comparison to the time taken by sequence to permanently repair nodes (as the value of in (15) for sequence is less than the corresponding value for sequence ). Therefore, the policy that targets the healthiest node at each time-step while respecting the time constraint is optimal by Theorem 1. ∎
We also show that the policy of targeting the healthiest node at each time-step (while respecting the precedence and time constraints) is optimal in certain classes of DAGs. We start with the following definition.
Definition 6.
Given a DAG, the nodes that do not have any incoming edges are said to belong to the “first level”; after removing all the nodes in the first level and all the outgoing edges of the nodes in the first level, the nodes that do not have any incoming edges are said to belong to the “second level”, and so on. A DAG in which all the nodes in a level have incoming edges from all the nodes in the previous level is termed as a “complete series graph”.
Figure 4 shows a complete series graph that has three levels.
Proposition 1.
Let there be nodes with precedence constraints given by a complete series graph . Suppose Assumption 1 holds and . Then, the policy that targets the healthiest node at each time-step while respecting the precedence and time constraints is optimal.
Proof:
Due to precedence constraints, we need to permanently repair all the nodes in the first level before targeting the other nodes in a complete series graph. Since the optimal policy is to target the healthiest node at each time-step (while respecting the time constraint) when there are no precedence constraints (by Theorem 4), we follow this policy for the nodes in the first level. After permanently repairing all the nodes in the first level, we need to permanently repair all the nodes in the second level before targeting other nodes. Thus, the policy of targeting the healthiest node at each time-step (while respecting the time constraint) in the second level is followed after permanently repairing all the nodes in the first level. The rest of the proof follows the above argument. ∎

So far, we assumed that . In the next section, we analyze other cases.
IV Control Sequences for
We now consider the case when repair rates are sufficiently larger than the deterioration rates, i.e., and . We will use the concept of a modified health value defined as follows.
Definition 7.
The modified health value of a node at a time-step is the health value of a node at that time-step minus its deterioration rate.
IV-A Near-optimal policy
Consider a directed rooted tree where the total number of nodes is at most equal to . In the context of disaster recovery, such a tree can represent damaged infrastructure components (with dependencies) within a particular neighborhood. We will consider such trees for the representation of precedence constraints as follows.
Theorem 5.
Let there be nodes with precedence constraints represented by a graph , which is a set of disjoint rooted trees that each contain at most nodes. Suppose , and . Then, the policy that targets the node with the least modified health value at each time-step while respecting the precedence constraints is -optimal.
Proof:
Let be the set of root nodes of the trees. Let denote the sequence that targets the node with the least modified health value in the set at each time-step, and let denote the optimal sequence. Let the number of nodes that are permanently repaired by sequences and be and , respectively. We will show that . We prove this by contradiction. Suppose . Denote the set of root nodes that are permanently repaired in sequence by , and let . Note that because the root node in a tree should be permanently repaired before targeting other nodes in the tree. Then, because implies . We now construct a sequence by modifying sequence such that at every time-step of sequence where a node not belonging to the set is targeted, we replace that node with a node from the set . Then, sequence also permanently repairs all the nodes in the set because the nodes of set are targeted more frequently in sequence than in sequence . Since the optimal policy is to target the node with the least modified health value at each time-step when there are no precedence and time constraints (by Proposition 1 of [14]), we obtain a contradiction because sequence permanently repairs nodes whereas sequence (which is the optimal sequence to permanently repair the set of root nodes) permanently repairs nodes. Therefore, the assumption that is not true.
Let the sequence that targets the node with the least modified health value (in the set ) at each time-step while respecting the precedence constraints be denoted by . We now argue that sequence permanently repairs at least as many nodes as sequence . Denote the set of root nodes that are permanently repaired in sequence by and the number of nodes in the set by . Then, there exists an ordering of nodes in the set , denoted by , such that the initial health values of those nodes satisfy
(16) |
because of the following argument. Let denote the set of nodes that have not been targeted at least once by the entity prior to time in sequence . Note that . At time , , where denotes the cardinality of set . At , as there are nodes of the set that have not been targeted at least once by the entity in sequence and thus each node should have initial health value larger than to survive until . At , as there are at least nodes of the set that have not been targeted at least once by the entity in sequence and thus each node should have initial health value larger than . Repeating this argument for the next time-steps, we can argue that there must be a permutation of nodes that satisfies the conditions (16) in order for nodes to be permanently repaired in sequence .
After this, it can be argued that sequence permanently repairs at least nodes by the same way as in Proposition 1 of [14]. Let the number of nodes that are permanently repaired by sequence be . Then, and therefore . Thus, the policy that targets the node with the least modified health value at each time-step while respecting the precedence constraints is -optimal. ∎
We now show that the factor of in Theorem 5 is sharp, in that there exist problem instances where targeting the node with the least modified health value at each time-step repairs only times the nodes as an optimal policy.
Example 2.
Consider the graph in Figure 3, which is a set of disjoint rooted trees that each contain at most two nodes. Let the initial health values of the nodes 1, 2 and 3 be 0.01, 0.02 and 0.8, respectively, and . The deterioration and repair rates are homogeneous across all the nodes and are equal to 0.1 and 0.25, respectively. The least healthy node that can be targeted in the first time-step is node 1 (since deterioration rates are homogeneous across all the nodes, targeting the node with the least modified health value is equivalent to targeting the node with least health value). However, if node 1 is targeted in the first time-step then node 2 fails by the time the entity reaches it and we cannot target any more nodes. However, if node 2 is first permanently repaired then node 3 can also be permanently repaired. Note that although the policy that targets the least healthy node at each time-step while respecting the precedence constraints is not optimal in this example, it is indeed -optimal as proved in Theorem 5.
We now also provide an example to show that the policy that is characterized in Theorem 5 need not be -optimal when .
Example 3.
Consider a graph consisting of two nodes without any precedence constraints (i.e., the graph is a union of disjoint trees that each contain one node). Let the initial health values of the nodes 1 and 2 be 0.01 and 0.11, respectively. The deterioration and repair rates are homogeneous across all the nodes and are equal to 0.1 and 0.11, respectively, and . If the policy of targeting the node with the least health value at each time-step is followed then no node gets permanently repaired while respecting the time constraint. However, if the non-jumping policy that first targets node 2 is followed, then node 2 can be permanently repaired. Thus, the policy that is characterized in Theorem 5 is not a 1-optimal policy (or optimal policy) in this example.
Therefore, characterizing near-optimal policies for this case under a time constraint is an avenue for future research.
IV-B Optimal sequencing
The paper [14] proved that the policy that targets the node with the least modified health value at each time-step is optimal when there are no precedence and time constraints. Examples 2 and 3 showed that this need not be true when there is a precedence or time constraint. However, the policy of targeting the node with the least modified health value at each time-step is optimal for special cases such as when the precedence constraints are given by a complete series graph (defined in the previous section).
Proposition 2.
Let there be nodes with precedence constraints given by a complete series graph . Suppose , and . The optimal policy is to target the node with the least modified health at each time-step while respecting the precedence constraints.
V Conclusions
We studied the problem of finding (near-) optimal control policies for targeting different components whose states are disrupted, when there exist precedence constraints between the components. We characterized control policies depending on the relationship between the repair and deterioration rates. There are several options for future extensions to this work: introducing stochasticity in the health values, extending the problem to time-dependent deterioration and repair rates, characterizing policies with improved ratios , and characterizing near-optimal policies when for all , .
VI Appendix
In this section, we prove that Problem 1 is NP-hard. This proof is inspired from the paper [17]; however, the job scheduling problem in [17] does not consider deterioration of jobs and there are additional differences between the problem that is considered in [17] and Problem 1 as mentioned in the review of job scheduling studies in Section I. We start by defining the NP-complete Clique problem [16] and an instance of a decision version of Problem 1 referred to as .
Problem 2 (Clique).
Given an undirected graph consisting of vertices and edges, and a positive integer , does have a complete subgraph of size , i.e., a set of vertices such that each pair of vertices in the set is connected by an edge in ?
Problem 3 ().
Given a directed acyclic graph consisting of nodes with initial health values , along with repair and deterioration rates and , respectively, , and an integer such that , is there a non-jumping control sequence that respects the precedence constraints and gives a reward ?
We now present the proof of Theorem 2.
Proof:
Given an instance of Clique we construct an instance of as follows. We construct a total of nodes that are of three types as follows: a v-node is constructed corresponding to each vertex in , an e-node is constructed corresponding to each edge in , and a root node is constructed. The parameters of these nodes are set as follows. For each v-node , set and . For each e-node , set and . The root node has the same parameters as a v-node. We construct directed edges such that there is an edge starting from a v-node and ending in an e-node if the vertex corresponding to the v-node lies at one of the ends of the edge (in graph ) corresponding to the e-node. We also construct directed edges starting from the root node and ending in all the other nodes, so that a DAG is formed. Finally, set . Note that the constructed instance of is polynomial in the size of the given instance of Clique.
We will now prove that the answer to the given instance of Clique is yes if and only if the answer to the constructed instance of is yes. Suppose that the answer to the instance of Clique is yes. Then, we will show that it is possible to create a non-jumping sequence for the constructed instance of in which no more than nodes permanently fail while respecting the precedence constraints. The first node in the created sequence is node because of the precedence constraints. Note that node is permanently repaired in the created sequence because the first node in any non-jumping sequence is always permanently repaired (when ). After permanently repairing node , v-nodes whose corresponding vertices form a complete graph in the Clique are targeted in the created sequence. Note that it takes two time-steps to permanently repair node in the created sequence because . Also, the health value of each v-node after two time-steps is equal to . Thus, the number of time-steps it takes to permanently repair the first v-node in the created sequence is equal to . Thus, the second v-node starts getting targeted after six time-steps in the created sequence and at that time its health value is equal to . Thus, the number of time-steps it takes to permanently repair the second v-node in the created sequence is equal to . Proceeding in this way, it can be shown that the total number of time-steps taken to permanently repair node and v-nodes is equal to because the th node in the created sequence (where ) takes time-steps to get permanently repaired. Note that for all v-nodes , and represents the number of time-steps it takes for to permanently fail if it is not targeted within the first time-steps. Thus, for all v-nodes , we define . Note that the time-step at which the th v-node starts getting targeted in the created sequence is less than because (as and ). Thus, all the v-nodes are permanently repaired in the created sequence. After permanently repairing v-nodes that correspond to the solution of Clique, e-nodes that correspond to the edges of the complete subgraph in the solution of Clique are targeted. Note that for all e-nodes . Thus, for all e-nodes , we define . Note that there are nodes that are targeted before the last e-node among the aforementioned e-nodes in the created sequence. Thus, the last e-node among the aforementioned e-nodes starts getting targeted in the created sequence at time-step because the th node in the created sequence (where ) takes time-steps to get permanently repaired. Thus, all the aforementioned e-nodes get permanently repaired because . Finally, the remaining v-nodes are targeted in the created sequence. Note that there are nodes that are targeted before the last v-node in the created sequence. Thus, it takes time-steps in order to start targeting the last v-node because the th node in the created sequence (where ) takes time-steps to get permanently repaired. Thus, it is possible to permanently repair all the v-nodes because (as ). Thus, except the remaining e-nodes, all the nodes are permanently repaired in the created sequence.
Now we show the opposite direction. Suppose the answer to the constructed instance of is yes. Then, there exists a non-jumping sequence in which at most nodes permanently fail. Note that there are at most nodes that are permanently repaired before the last node that is targeted in the given sequence. Thus, the largest time-step at which the last targeted node in the given non-jumping sequence starts getting targeted is equal to because the th node in the given sequence takes time-steps to get permanently repaired. Thus, all the v-nodes are permanently repaired in the given sequence because . Note that the first node that is targeted in the given sequence is node because of the precedence constraints. Since the first node in the given sequence is permanently repaired and all v-nodes are also permanently repaired, the set of all the nodes that permanently fail in the given sequence is a subset of all the e-nodes. The number of e-nodes that are permanently repaired in the given sequence is at least equal to because at most nodes permanently fail in the given sequence and the total number of e-nodes is equal to . Therefore, at least v-nodes need to be permanently repaired before at least e-nodes can be permanently repaired in the given sequence because of the following two reasons. First, a v-node is an in-neighbor of an e-node if the vertex corresponding to the v-node lies at one of the ends of the edge (in graph ) corresponding to the e-node. Secondly, among all the undirected graphs that have edges, a complete graph (containing vertices) has the least number of vertices. Note that all the e-nodes that are permanently repaired in the given sequence start getting targeted before time-step . Also, the maximum number of nodes (when the th node takes time-steps to get permanently repaired) that can be targeted such that the last node starts getting targeted before time-step is equal to . Thus, node , v-nodes and e-nodes are targeted before time-step in the given sequence such that the vertices corresponding to the v-nodes form a complete subgraph of size in graph .
References
- [1] E. I. Chisolm and J. C. Matthews, “Impact of hurricanes and flooding on buried infrastructure,” Leadership and Management in Engineering, vol. 12, no. 3, pp. 151–156, 2012.
- [2] D. J. Leversage and E. J. Byres, “Estimating a system’s mean time-to-compromise,” IEEE Security & Privacy, vol. 6, no. 1, pp. 52–60, 2008.
- [3] R. J. La, “Interdependent security with strategic agents and cascades of infection,” IEEE/ACM Transactions on Networking, vol. 24, no. 3, pp. 1378–1391, 2015.
- [4] D. Gorges, M. Izak, and S. Liu, “Optimal control and scheduling of switched systems,” IEEE Transactions on Automatic Control, vol. 56, no. 1, pp. 135–140, 2010.
- [5] A. Bemporad and N. Giorgetti, “Logic-based solution methods for optimal control of hybrid systems,” IEEE Transactions on Automatic Control, vol. 51, no. 6, pp. 963–976, 2006.
- [6] F. Zhu and P. J. Antsaklis, “Optimal control of hybrid switched systems: A brief survey,” Discrete Event Dynamic Systems, vol. 25, no. 3, pp. 345–364, 2015.
- [7] A. Eryilmaz and R. Srikant, “Fair resource allocation in wireless networks using queue-length-based scheduling and congestion control,” IEEE/ACM Transactions on Networking (TON), vol. 15, no. 6, pp. 1333–1344, 2007.
- [8] S. L. Smith, M. Schwager, and D. Rus, “Persistent robotic tasks: Monitoring and sweeping in changing environments,” IEEE Transactions on Robotics, vol. 28, no. 2, pp. 410–426, 2012.
- [9] P. Nilsson and N. Ozay, “On a class of maximal invariance inducing control strategies for large collections of switched systems,” in Proceedings of the 20th International Conference on Hybrid Systems: Computation and Control. ACM, 2017, pp. 187–196.
- [10] N. T. Argon, S. Ziya, and R. Righter, “Scheduling impatient jobs in a clearing system with insights on patient triage in mass casualty incidents,” Probability in the Engineering and Informational Sciences, vol. 22, no. 3, pp. 301–332, 2008.
- [11] V. S. Gordon, C. N. Potts, V. A. Strusevich, and J. D. Whitehead, “Single machine scheduling models with deterioration and learning: handling precedence constraints via priority generation,” Journal of Scheduling, vol. 11, no. 5, p. 357, 2008.
- [12] F. Zhang and A. Burns, “Schedulability analysis for real-time systems with EDF scheduling,” IEEE Transactions on Computers, no. 9, pp. 1250–1258, 2009.
- [13] H. Gehlot, S. Sundaram, and S. V. Ukkusuri, “Optimal sequencing policies for recovery of physical infrastructure after disasters,” in 2019 American Control Conference (ACC). IEEE, 2019, pp. 3605–3610.
- [14] ——, “Optimal policies for recovery of multiple systems after disruptions,” arXiv preprint arXiv:1904.11615, 2020.
- [15] ——, “Approximation algorithms for the recovery of infrastructure after disasters under precedence constraints,” IFAC-PapersOnLine, vol. 52, no. 20, pp. 175–180, 2019.
- [16] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to algorithms. MIT press, 2009.
- [17] M. Garey and D. S. Johnson, “Scheduling tasks with nonuniform deadlines on two processors,” Journal of the ACM (JACM), vol. 23, no. 3, pp. 461–467, 1976.
![]() |
Hemant Gehlot is a PhD candidate in the Lyles School of Civil Engineering at Purdue University, being supervised by Dr. Satish V. Ukkusuri and Dr. Shreyas Sundaram. He received his BTech and MTech degrees from the Indian Institute of Technology Kanpur in 2015. He was a finalist for the Best Student Paper Award at the IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys) 2019. His research interests include optimal control and combinatorial optimization. |
![]() |
Shreyas Sundaram is an Associate Professor in the School of Electrical and Computer Engineering at Purdue University. He received his MS and PhD degrees in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2005 and 2009, respectively. He was a Postdoctoral Researcher at the University of Pennsylvania from 2009 to 2010, and an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Waterloo from 2010 to 2014. He is a recipient of the NSF CAREER award, and an Air Force Research Lab Summer Faculty Fellowship. At Purdue, he received the Hesselberth Award for Teaching Excellence and the Ruth and Joel Spira Outstanding Teacher Award. At Waterloo, he received the Department of Electrical and Computer Engineering Research Award and the Faculty of Engineering Distinguished Performance Award. He received the M. E. Van Valkenburg Graduate Research Award and the Robert T. Chien Memorial Award from the University of Illinois, and he was a finalist for the Best Student Paper Award at the 2007 and 2008 American Control Conferences. His research interests include network science, analysis of large-scale dynamical systems, fault-tolerant and secure control, linear system and estimation theory, game theory, and the application of algebraic graph theory to system analysis. |
![]() |
Satish V. Ukkusuri is a Professor in the Lyles School of Civil Engineering and Director of the Urban Mobility Networks and Intelligence Lab at Purdue University. His research is in the area of interdisciplinary transportation networks with current interests in data driven mobility solutions, disaster management, resilience of interdependent networks, connected and autonomous traffic systems, dynamic traffic networks and smart logistics. He has published more than 350 peer reviewed journal and conference articles on these topics. |