A $D$ -competitive algorithm for the Multilevel Aggregation Problem with Deadlines

Jeremy McMahan Dept. of Computer Science, University of Wisconsin-Madison. [email protected]

Abstract

In this paper, we consider the multi-level aggregation problem with deadlines
(MLAPD) previously studied by Bienkowski et al. [8], Buchbinder et al. [14], and Azar and Touitou [3]. This is an online problem where the algorithm services requests arriving over time and can save costs by aggregating similar requests. Costs are structured in the form of a rooted tree. This problem has applications to many important areas such as multicasting, sensor networks, and supply-chain management. In particular, the TCP-acknowledgment problem, joint-replenishment problem, and assembly problem are all special cases of the delay version of the problem.

We present a $D$ -competitive algorithm for MLAPD. This beats the $6(D+1)$ -competitive algorithm given in Buchbinder et al. [14]. Our approach illuminates key structural aspects of the problem and provides an algorithm that is simpler to implement than previous approaches. We also give improved competitive ratios for special cases of the problem.

1 Introduction

In many optimization contexts, aggregating tasks and performing them together can be more efficient. For example, consider sending multiple gifts to a relative. If there are multiple presents, it may be cheaper to bundle them all in one large box rather than sending each present individually. Although, aggregation is useful in many contexts, it is often constrained. For the gift example, the postal service may have weight restrictions on packages so bundling items together may be impossible. In such a situation, multiple smaller bundles may have to be constructed resulting in a complex optimization problem. The type of aggregation constraints we will be interested in form a tree-like structure that commonly appear in communication networks.

Besides the aggregation constraints, another difficulty in these problems are the urgency of the tasks. A task might have a hard deadline of when it must be completed or may accrue a cost the longer it is delayed. At the same time, delaying a task allows more tasks to pile up which better exploits the power of aggregation. This illustrates a natural trade-off between the aggregation benefits of delaying tasks and the cost of doing so. Generally, we will refer to strategies that mostly delay tasks to maximize aggregation benefits as lazy-aggregating and to strategies that prioritize the urgency of the tasks as eager-aggregating.

To illustrate these ideas, consider the example of a communication tree of employees at a restaurant. A health inspector may continually issue fines to a host at a restaurant until some concern is resolved. Depending on the scope of the concern, it may indicate issues more broadly in the company and so would need to be delivered to a higher official than just the local manager. However, the cost of the fines might not be as expensive as the cost for the host to send a message up the chain of command. So, at any time, the host will have to decide whether to send the fine now or wait to receive more messages to send together. In fact, there are many problems of a similar flavor involving transmitting control packages in communication networks [4, 10, 19, 18]. Similarly, many optimization problems in supply chain management face similar considerations [16, 20, 2, 26].

The online multi-level aggregation problem with delay (MLAP) introduced in [8] captures many of the scenarios mentioned above. In the deadline version (MLAPD) [14], which is a special case of the delay version, requests arrive over time and must be served between their time of arrival and the time they are due. The aggregation constraints are modeled using a node-weighted rooted tree and a service involves transmitting a subtree that contains the root. All requests in a service subtree are satisfied to model aggregation. Then, the goal is to minimize the total cost of all services. MLAP encapsulates many classic problems such as the TCP-acknowledgment Problem, the Joint Replenishment Problem (JRP), and the Assembly Problem.

1.1 Our Contributions

In this paper, we present a $D$ -competitive algorithm for MLAPD. This beats the previous best algorithm constructed in [14], which was roughly $6(D+1)$ -competitive. Also, the proposed algorithm attacks the problem directly rather than relying on a reduction to $L$ -decreasing trees. This illuminates key structural aspects of the problem and results in the algorithm being simpler to implement than previous approaches.

The main analysis used is based off the framework of critically overdue nodes introduced in [8], but this framework is paired with a generalization of phases introduced in [7] to achieve a more refined analysis. In addition, using the ideas of late and early nodes within the phases context provides additional structure that can be exploited. Overall, the analysis uses a charging strategy that uses the concept of node investments to charge nodes proportionally to their impact on a service. The strategy exploits both top-down and bottom-up views of a service in order to maximize the precision of the charges. Investments also allow charging future nodes without causing complications. This in contrast to past approaches that only considered nodes available at the current moment or earlier.

Additionally, we give improved competitive algorithms for special cases of MLAPD. In particular, we present an algorithm with competitive ratio strictly less than $4$ for paths. This shows that the known lower-bound of $4$ for infinite graphs does not hold for finite graphs. The main analysis used exploits a notion of phases that utilizes the special structure of path instances. This approach involves looking at nested subpaths and then charging several such subpaths to a single path of the optimal schedule.

1.2 Related Work

MLAP and MLAPD were originally formulated by [8]. The offline versions of MLAP and MLAPD have been well studied. Even for $D=2$ , MLAPD and so MLAP is known to be APX-hard [6, 23, 2]. When the tree is a path, MLAP can be solved efficiently [9]. For MLAPD, the best approximation factor is $2$ [5, 8]. Using an algorithm for the assembly problem from [21], [24] constructs a $(2+\epsilon)$ -approximation for MLAP where $\epsilon>0$ is arbitrary.

For the online variant, the best competitive ratio for MLAP is $O(D^{2})$ due to an algorithm constructed in [3]. For MLAPD, the best competitive ratio known was $6(D+1)$ due to an algorithm developed in [14]. The best known lower bound for MLAPD and MLAP for finite depth trees is $2$ , which comes from the lower bound of $2$ on JRP with deadlines (JRPD) given in [7]. If the depth of the tree is unbounded, the best lower bound is $4$ as shown in [8] by constructing a hard instance where the tree is a path of arbitrary length. This improves on the lower bound of $2+\phi$ presented in [9] for trees of unbounded depth. Additionally, [8] shows that $4$ is in fact the optimal competitive ratio for paths of unbounded length improving on the results in [9, 11].

There is even more known about MLAP for small $D$ . For $D=1$ , MLAP is equivalent to the TCP-acknowledgment problem which involves sending control messages in a single packet in a network. The offline version of the problem is equivalent to the lot sizing problem [26] and can be solved efficiently [1]. For the online variant, it is known that the optimal deterministic competitive ratio is $2$ [17]. The optimal competitive ratio when randomness can be used is known to be $\frac{e}{e-1}$ [19, 25, 12, 13]. For the deadline case, a simple greedy algorithm is exactly $1$ -competitive.

For $D=2$ , MLAP is equivalent to JRP. JRP tries to optimize the total shipping cost from suppliers to retailers when a warehouse is used as an intermediary place for storage. Although JRPD and so JRP is APX-hard [6, 23, 2], JRP permits an $1.791$ -approximation [7] which improves on the results in [22]. In addition, JRPD permits a $1.574$ -approximation [6]. In the online variant, [15] gives a $3$ -competitive algorithm for JRP which improves upon the $5$ -competitive algorithm given in [11]. This competitive ratio is nearly tight as the best known lower bound for the competitive ratio of JRP is $2.754$ [7]. For JRPD, the optimal competitive ratio is exactly $2$ [7].

2 Preliminaries

Multi-level aggregation problem.

The multi-level aggregation problem with deadlines (MLAPD) is defined by a pair, $(\mathcal{T},\mathcal{R})$ , where $\mathcal{T}$ is a node-weighted tree rooted at a node $r$ and $\mathcal{R}$ is a set of requests that arrive at the nodes of $\mathcal{T}$ over some time period. A request $\rho=(v,a,d)$ is specified by an arrival time $a$ , a deadline $d$ , and a node at which the request is issued, $v\in\mathcal{T}$ . Without loss of generality, we can assume that the deadlines of each request are distinct and that the cost of every node is positive. We define $D$ to the height of the tree plus one or, equivalently, the maximum number of nodes present in a path in $\mathcal{T}$ . We call $D$ the depth or number of levels of the tree. Then, we define the level of a node, $L_{v}$ , to be the number of nodes on the unique $r$ to $v$ path in $\mathcal{T}$ .

A service is a pair $(S,t)$ where $S$ is a sub-tree of $\mathcal{T}$ containing the root and $t$ is the time that $S$ is transmitted. Notice the definition of a service implies that if $v\in S$ and $u$ is an ancestor of $v$ , then $u\in S$ . We refer to this fact as the subtree property of services. We sometimes refer to the subtree $S$ as the service when the time of the service is clear or irrelevant to the given context. A request is satisfied by a service, $(S,t)$ , if $v\in S$ and $a\leq t\leq d$ . A schedule is a set of services $(S_{1},t_{1}),\ldots,(S_{k},t_{k})$ . A schedule is a solution to MLAPD if every request in $\mathcal{R}$ is satisfied by some service in the schedule. The cost of a schedule is $\sum_{i=1}^{k}c(S_{i})$ where $c(S_{i})=\sum_{u\in S_{i}}c(u)$ is the total cost of nodes in the service. Note that this node-weighted definition is equivalent to the original definition of the problem as shown by [14].

In the online setting, $\mathcal{T}$ is known to the algorithm but the request arrive in an online fashion. In particular, if $\rho=(v,a,d)\in\mathcal{R}$ is a request, then the algorithm receives this request at time $a$ . At time $t$ , the algorithm must decide whether to transmit a service and if so decide what nodes to include in the service given only knowledge of requests whose arrival time are before or at time $t$ . The algorithms we consider only keep track of active requests, i.e. requests that have arrived yet have not been served previously.

For a node $v$ , we let $\mathcal{T}_{v}$ denote the subtree of $\mathcal{T}$ rooted at $v$ . Given a request $\rho$ and a time $t$ , we use the notation $\rho\in\mathcal{T}_{v}^{t}$ to mean that $\rho$ was issued at a node in $\mathcal{T}_{v}$ by time $t$ and has not yet been satisfied. Then, if $\rho\in\mathcal{T}_{v}^{t}$ , we define $P(v\to\rho)$ to be the nodes on the path in $\mathcal{T}_{v}$ from $v$ to the node at which $\rho$ is issued. For the root, we use the special notation $P_{\rho}$ to be $P(r\to\rho)$ . If $S$ is any subtree, then we define $P(v\to\rho|S)=P(v\to\rho)\setminus S$ to be the nodes on the path to $\rho$ in $\mathcal{T}_{v}$ that are not already in $S$ . We also define $c(v\to\rho|S)=c(P(v\to\rho|S))$ to be the cost of the nodes of the path to $\rho$ that are not in $S$ .

Critically overdue.

It is known that there is always an optimal schedule that only transmits at times that are deadlines of requests. Given such an optimal schedule and a schedule of some algorithm, we can define a notion of a node being late. Suppose $(S,t)$ is a service of the algorithm in the given schedule and that $v\in S$ . Then, we define $d^{t}(v)=\min_{\rho\in\mathcal{T}_{v}^{t}}d_{\rho}$ to be the deadline of the earliest request in $\mathcal{T}_{v}^{t}$ , which we call the deadline of $v$ or the urgency of $v$ . Also, we define $\text{nos}_{v}^{t}$ to be the time of the next service of the optimal schedule that includes $v$ and is transmitted at a time strictly after time $t$ . Note, we sometimes also use $\text{nos}_{v}^{t}$ to reference this service of the optimal schedule instead of the time at which it was transmitted. If no other services of the optimal schedule include $v$ at time $t$ or later, then we define $\text{nos}_{v}^{t}=\infty$ . Now, we say that a node $v$ is late at time $t$ if $d^{t}(v)<\text{nos}_{v}^{t}$ . This is so called since the optimal schedule must have already satisfied the earliest due request in $v$ as the next time this schedule includes $v$ this request will have already been due. Next, we say $v$ is critically overdue at time $t$ if $v$ is late at time $t$ and if there is no service of the algorithm’s schedule including $v$ that is transmitted at a time in $(t,\text{nos}_{v}^{t})$ [8]. We often will say the pair $(v,t)$ is critically overdue to mean $v$ is critically overdue at time $t$ .

3 Special Tree Classes

In this section we present new upper bounds on the competitive ratio for MLAPD when restricted to special classes of graphs. The arguments in this section serve as a warm up for the main argument in the next section. Also, the new upper bound for path instances shows that the known lower-bound of $4$ on the competitive ratio does not hold for finite graphs.

3.1 Increasing Trees

An increasing tree is a tree where weights continually increase as one travels away from the root down any path in the tree. We consider an algorithm that serves only the path to a due request. In particular, whenever a request becomes due, the algorithm transmits exactly the path to the due request and nothing else. Hence, every service is of the form $(P_{\rho},d_{\rho})$ . We call this algorithm Noadd since it never adds any nodes to a service other than the bare minimum.

Theorem 3.1.

Noadd is $D$ -competitive for any Increasing Tree of depth $D$ .

Proof.

We use a simple charging argument. Suppose $\rho$ is a request that causes a service of ALG at time $t=d_{\rho}$ . Further, suppose that $v$ , the node at which $\rho$ was issued, is at level $L$ and $r=v_{1},\ldots,v_{L}=v$ are the nodes on the path $P_{\rho}$ . Then, by definition, we know that at the time of its satisfaction by ALG, $\rho$ is due and so must have been served by OPT by this time. Suppose $(O,t^{\prime})$ was the service of OPT that satisfied $\rho$ . We charge the entire cost of this service of ALG just to the occurrence of $v$ appearing in $O$ . We note that $\sum_{i=1}^{L}c(v_{i})\leq\sum_{i=1}^{L}c(v)=Lc(v)$ since the tree is increasing. Then, the charge to $v$ is exactly $\frac{Lc(v)}{c(v)}=L$ . Since $L\leq D$ , the total charge to $v$ is at most $D$ . Also, since all the requests at $v$ are cleared after this transmission, the next time a request issued at $v$ becomes due must be strictly after the current time $t$ . By the same argument, at that time $t^{\prime}$ a new service of OPT must have been transmitted that contains $v$ and it will be this occurrence of $v$ that is charged next. Thus, all the charges are disjoint and so the charge to any node of OPT is at most $D$ . Thus, ALG is $D$ -competitive.

∎

Note that the argument above is actually proving $v$ is critically overdue at time $d_{\rho}$ . Then, the charging scheme boils down to charging critically overdue nodes. We can use the same ideas to get an even better result for $L$ -increasing trees. $L$ -increasing trees simply satisfy the property that if $u$ is an ancestor of $v$ then $c(v)\geq Lc(u)$ .

Corollary 3.1.1.

Noadd is $\frac{L}{L-1}$ -competitive for any $L$ -increasing Tree of depth $D$ .

Proof.

We use the same argument as for increasing trees but refine the charge to any node. To avoid confusion we use $\ell$ to mean the level and $L$ to be the factor of increase. With the same setup as before, the only change is the total charge to $v$ . Note that $c(v)\geq Lc(v_{\ell-1})\geq L^{2}c(v_{\ell-2})\geq\ldots\geq L^{\ell-1}c(v_{1})$ since the tree is $L$ -increasing. In particular, $c(v_{i})\leq L^{i-\ell}c(v)$ . Then,

\sum_{i=1}^{\ell}c(v_{i})\leq\sum_{i=1}^{\ell}L^{i-\ell}c(v)=c(v)\sum_{i=0}^{\ell-1}L^{-i}=\frac{L-L^{1-\ell}}{L-1}c(v)\leq\frac{L}{L-1}c(v)

So, the total charge to $v$ is $\frac{L}{L-1}$ . The rest of the argument goes through as before. Thus, ALG is $\frac{L}{L-1}$ -competitive.

∎

3.2 Paths

In our context, we consider a path to simply be a path graph where one of the endpoints is the root of the tree. Consider the algorithm, Double, that upon a request $\rho$ becoming due, first adds the subpath from the root to the node at which $\rho$ was issued, $P_{\rho}$ . Then, the algorithm continually tries to extend the current subpath containing the root by adding other subpaths that connect the last node in the current subpath to nodes holding urgent requests. If adding a subpath would cause the resultant service to cost more than $2c(P_{\rho})$ , then the path is not added and the algorithm transmits the current service. This iterative procedure is described by Algorithm 1.

Data: Request

\rho

just became due

Result: A subtree is transmitted that serves

\rho

1 Initiate a service

S=P_{\rho}

2 while there is an unsatisfied request do

3 Let

\gamma

be the earliest due request not yet satisfied by

S

4 if $c(S)+c(r\to\gamma|S)>2c(P_{\rho})$ then

5 break;

6 end if

S\leftarrow S\cup P(r\to\gamma|S)

9 end while

transmit

S

Algorithm 1 Double

Theorem 3.2.

Double is $(4-2^{-D})$ -competitive for any path with depth $D$ .

Proof.

We will use a charging argument that charges the cost of several services of ALG to a single service of OPT. We visualize each subpath as an interval going from the root to another node. Then, this strategy boils down to charging sequences of nested intervals of ALG to a single interval of OPT.

Let $(S,d)=(P_{\rho}\cup A,d_{\rho})$ be a service of ALG. Since $\rho$ is due at time $d_{\rho}$ , OPT must have already satisfied $\rho$ at some earlier time $t$ . If $(O,t)$ is the service of OPT satisfying $\rho$ , then we know that $P_{\rho}\subseteq O$ by the subtree property of services. Also, the subtree property and the fact that the tree is a path implies that either $O\subseteq S$ or $S\subseteq O$ . Since each cost is positive, this corresponds to $c(O)\leq c(S)$ in the first case and $c(S)\leq c(O)$ in the second case.

•

Suppose that $c(S)\geq c(O)$ so that ALG’s interval contains OPT’s interval. By definition of ALG, we have that $c(A)\leq c(P_{\rho})$ . Thus, we charge $(S,d_{\rho})$ to $(O,t)$ at a total charge of

$c(S)=c(P_{\rho})+c(A)\leq 2c(P_{\rho})\leq 2c(O)$

Here, we used the fact that $P_{\rho}\subseteq O$ and so $c(P_{\rho})\leq c(O)$ . We claim that no other service $(S^{\prime},d^{\prime})=(P_{\gamma}\cup B,d_{\gamma})$ of ALG satisfying $c(S^{\prime})\geq c(O)$ will be charged to $(O,t)$ . For the sake of contradiction, suppose that $(S^{\prime},d^{\prime})$ and $(S,d)$ are both charged to $(O,t)$ . WLOG, we assume that $d_{\gamma}>d_{\rho}$ . Note, it must be the case that $\gamma$ and $\rho$ both arrive before time $t$ for $O$ to satisfy them. Thus, when $S$ was constructed, $\gamma$ had already arrived. Since $\gamma$ triggered a service, we know it was not satisfied by $S$ . Hence, the stopping conditions of ALG imply that $S_{i}\subset P_{\gamma}$ . But, $S$ containing $O$ then implies that $O\subseteq S\subset P_{\gamma}$ so $O$ could not have satisfied $\gamma$ , a contradiction. Thus, only one service of ALG with larger cost may be charged to any service of OPT.
•

Suppose that $c(S)<c(O)$ so that ALG’s interval is strictly contained in OPT’s interval. We will pay for all such $S$ at once. Suppose that $\rho_{1},\ldots,\rho_{k}$ are requests in order of increasing deadline that are all satisfied by $(O,t)$ and that each trigger a service of ALG having smaller service cost than $c(O)$ . Let $(S_{i},t_{i})=(P_{\rho_{i}}\cup A_{i},d_{\rho_{i}})$ be the services of ALG triggered by $\rho_{i}$ . By Lemma 3.3 we know that

$\sum_{i=1}^{k}c(S_{i})<(2-2^{-k})c(O)$

and so the charge to $O$ for all these intervals is $2-2^{-k}$ .

Overall, we see any service of OPT is charged at most $2$ larger services of ALG and at most $2-2^{-k}$ for all the smaller services of ALG. Thus, the total charge made to any service of OPT is at most $2+(2-2^{-k})=4-2^{-k}$ . Hence, the algorithm is $4-2^{-k}$ -competitive, where $k$ is the maximal number of smaller services that we charge to any service of OPT. A trivial upper bound for $k$ is $D$ as each interval must differ by at least one node. Thus, ALG is $(4-2^{-D})$ -competitive.

∎

Lemma 3.3.

Suppose that $(O,t)$ is a service of OPT and $\rho_{1},\ldots,\rho_{k}$ are requests in order of increasing deadline that are satisfied by $(O,t)$ and that trigger services $(S_{1},t_{1}),\ldots,(S_{k},t_{k})$ of ALG each having smaller service cost than $c(O)$ . Then,

\sum_{i=1}^{k}c(S_{i})<(2-2^{-k})c(O)

We defer the proof of Lemma 3.3 to section A of the appendix. In fact, the analysis above can be refined to give a $(4-2^{1-\frac{D}{2}})$ -competitive ratio. The proof of this stronger result can also be found in section A of the appendix.

Corollary 3.3.1.

The competitive ratio lower bound of $4$ does not hold for finite paths. Thus, the true competitive ratio for MLAPD for path instances is either at most $3$ or non-integer.

Proof.

The claims are immediate as $4-2^{-D}<4$ for any finite $D$ . ∎

4 Arbitrary Trees

In this section, we present a $D$ -competitive algorithm for MLAPD on arbitrary trees of depth $D$ .

Intuition.

Since the path to a due request must always be served, the main task is to decide what other nodes will be added to the service. The main observation is that several of these nodes have been included in ALG unnecessarily in the past. So, we need to justify their inclusion in the current service. For a node $v$ , we do this by satisfying requests in $\mathcal{T}_{v}$ using a procedure called a $v$ -fall. This way if $v$ was indeed serviced too many times before, we have other nodes that can effectively be used to pay for $v$ ’s cost in the analysis. Repeating this logic, we need to recursively justify all the nodes added by the $v$ -fall, which will be nodes of higher level than $v$ . Eventually, we find good nodes of low level, or this process reaches the leaves, which will always have the desired properties.

Clearly, for this to work the fall will have to include enough new nodes to pay for the cost of the given node. At the same time, adding too many nodes would runs the risk of too much eager-aggregation which would just cause more of the same problem we seek to avoid. To deal with this, we used a fixed budget that determines how many nodes can be added by the fall. However, the path to a request in $\mathcal{T}_{v}$ may have cost far greater than $v$ ’s budget. To avoid $v$ ’s budget going to waste, we discount the cost of this path so that it becomes easier to include by other falls. We treat this amount discounted as an investment that can be used to justify $v$ ’s inclusion. To implement this, we keep track of a price, $p(u)$ , for every node that represents how much of $c(u)$ is leftover to be paid for. Then, we only need a fall to pay for the price of a path in order to include it in a service, rather than its full cost.

Overall, the fall will add paths to requests in increasing order of deadline. Naturally, this ordering ensures we satisfy many urgent requests. Though, some requests that aren’t urgent may be cheap to add and so should be considered as well. However, the fact that every node included triggers its own fall ensures that requests that aren’t globally urgent but are inexpensive to include are added to the service.

Falls.

Formally, given a tentative service tree $S$ at time $t$ , we define a $v$ -fall to be an iterative procedure that considers requests in $\mathcal{T}_{v}^{t}$ for addition to the service. We define $F_{v}^{t}$ to be the set of nodes, $\mathcal{A}_{v}$ , constructed by the $v$ -fall at time $t$ . We will write $P(v\to\rho)$ to mean $P(v\to\rho|S^{\prime})$ , where $S^{\prime}$ is the tentative service tree right before this path was added to the fall $F_{v}^{t}$ . If the nodes on a path to a request that are not already included in the service, $S$ , can be included without the total price exceeding $v$ ’s budget, then the nodes are added to the service and the budget reduced by the corresponding price of these nodes. This procedure continues until there are no pending requests in $\mathcal{T}_{v}^{t}$ or until the addition of a path to a request exceeds the remaining budget. If a path $P(v\to\rho|S)$ exceeds the remaining budget, then we reduce the price of nodes on this path by the remaining budget. In particular, we evenly divide the remaining budget amongst the price of the path and then reduce each node’s price proportional to its contribution to the total price of the path. We refer to these nodes as overflow nodes for $F_{v}^{t}$ . Whenever a fall causes some node’s price to change, we think of this fall as investing in that node. This entire process is described as Algorithm 4. Note we refer to the entire process (Algorithm 2, Algorithm 3, Algorithm 4) as well as just Algorithm 3 as Waterfall.

Waterfall Interpretation.

Algorithm 3 behaves like a flow in a flow network defined by $\mathcal{T}$ . The sources are the root and other nodes on the path to the due request and the sinks are generally the leaves. Roughly, each $v$ -fall is a way of selecting which children of $v$ in the network the flow will continue through. Recursively, all such nodes also send the amount of flow it received down its subtree, and we continue the process until we reach the leaves which cannot send their flow elsewhere. Nodes may have excesses if they don’t add enough in a fall, but excess is invested into the overflow nodes for their eventual inclusion. So, if we take future inclusions into account then conservation is preserved. With this flow interpretation, the algorithm behaves like a waterfall where water runs over ledges which act as containers. Whenever enough water fills up on a ledge, it spills over causing a smaller waterfall.

1 for $v\in\mathcal{T}$ do

p(v)\leftarrow c(v)

3 end for

Algorithm 2 Initialization

Data: Request

\rho

just became due

Result: A subtree is transmitted that serves

\rho

1 Initiate a service

S=P_{\rho}

and set

p(v)=c(v)

for each

v\in P_{\rho}

2 Let

Q

be a queue for nodes. Enqueue the nodes of

P_{\rho}

into

Q

3 while $Q\not=\varnothing$ do

4 Dequeue

v

from

Q

\mathcal{A}_{v}\leftarrow F_{v}(S)

6 For every

u\in\mathcal{A}_{v}

enqueue

u

into

Q

8 end while

transmit

S

Algorithm 3 Waterfall

Input : A tentative service subtree

S

Result: The service subtree

S

is possibly extended to satisfy some requests in

\mathcal{T}_{v}

and the set of added nodes is returned.

\mathcal{A}_{v}=\emptyset,b(v)=c(v)

2 for $\rho\in\mathcal{T}_{v}$ in increasing deadline order do

3 if $p(v\to\rho|S)>b(v)$ then

4 for $u\in P(v\to\rho|S)$ do

p(u)\leftarrow p(u)(1-\frac{b(v)}{p(v\to\rho|S)})

6 end for

7 break;

8 else

\mathcal{A}_{v}\leftarrow\mathcal{A}_{v}\cup P(v\to\rho|S)

b(v)\leftarrow b(v)-p(v\to\rho|S)

11 for $u\in P(v\to\rho|S)$ do

p(u)\leftarrow c(u)

13 end for

S\leftarrow S\cup\mathcal{A}_{v}

16 end if

18 end for

return $\mathcal{A}_{v}$

Algorithm 4

v

-fall,

F_{v}(S)

Theorem 4.1.

Waterfall is $D$ -competitive on trees of depth $D$ .

The main approach to analyzing Waterfall will be a charging argument. Whenever a node is included in a service of ALG, its cost must be paid for by charging to a service of OPT. We think of the amount invested in a node at some time as the total cost it is responsible for. If the node is critically overdue, then it can pay for its own cost and the amount invested in it. Otherwise, a fall investing a node’s budget to its descendants can be interpreted as the node delegating the payment of its cost to its descendants. Then, these descendants will either be able to pay for their own costs as well as the ancestors’, or will further transfer responsibility of payment to their descendants. Continuing this process, nodes that can pay off the costs will eventually be reached. In particular, the leaves of the tree will eventually be reached and these will be critically overdue.

However, nodes in the subtree of a critically overdue node may be early and so cannot pay for themselves. The cost of such nodes will also be charged to their critically overdue ancestor. Thus, a critically overdue node will have to pay both for its ancestors (including itself) and its early descendants. This charging idea can be visualized as the evolution of debt in a family tree. The family may continually pass down debt from generation to generation. At some point, the descendants will have accumulated enough wealth to pay-off the debt of their ancestors. Though these descendants might also have children who will accumulate further debt. These unlucky descendants then would be be tasked with paying off debts from both sides, ancestors and descendants.

4.1 ALG’s Structure

Phases

It will be helpful to break services of ALG into different phases. Suppose that $(O_{1},t_{1}),\ldots,(O_{j},t_{j})$ are all the services of OPT that include the node $v$ in chronological order of transmission time. Then, for some $i\in[j-1]$ suppose $(A_{1},t_{1}^{\prime}),\ldots,(A_{k},t_{k}^{\prime})$ are all the services of ALG that include $v$ and satisfy $t_{i}\leq t_{h}^{\prime}<t_{i+1}$ where $h\in[k]$ . We call these services of ALG a $v$ -phase and say that $(O_{i},t_{i})$ starts the $i$ -th $v$ -phase. For $i=j$ , a $v$ -phase consists of all the services of ALG containing $v$ that are transmitted at time $t_{j}$ or later. Note that if a request $\rho$ becomes due in the $i$ -th $v$ -phase, then $\rho$ must be satisfied by one of $(O_{h},t_{h})$ for $h\in[i]$ . Consequently, if a request, $\rho$ , arrives in $\mathcal{T}_{v}$ during a $v$ -phase, it cannot become due before the next $v$ -phase since no service of OPT includes $v$ during a $v$ -phase and so cannot satisfy $\rho$ . Thus, if $v$ is late at time $t$ during a phase, it must have been satisfied earlier. Also, for any service $(A,t)$ in the $i$ th $v$ -phase, we have that $\text{pos}_{v}^{t}=t_{i}$ and $\text{nos}_{v}^{t}=t_{i+1}$ or $\text{nos}_{v}^{t}=\infty$ if $i=j$ .

Suppose $v$ is late at the time $t_{i}$ when $A_{i}$ is transmitted in the phase. Then, we have $v$ is late at all previous services in the phase. This is because ALG always considers the earliest due requests in $\mathcal{T}_{v}$ for satisfaction first. Specifically, since late requests cannot arrive in the middle of a phase, we know the request causing ALG to enter $\mathcal{T}_{v}$ , $\rho_{i}$ , must have been available in the previous services of ALG. Hence, the earliest due request in $\mathcal{T}_{v}^{t_{h}}$ for $h<i$ , $\rho_{h}$ , had an earlier deadline so is also overdue. Formally,

d^{t_{h}}(v)=d_{\rho_{h}}<d_{\rho_{i}}<\text{nos}_{v}^{t_{i}}=\text{nos}_{v}^{t_{h}}

Thus, we can partition any phase into two pieces: an initial segment where $v$ is late and a later segment where $v$ is not late. We call the former services late with respect to the phase, and the latter early. Note it could be that a phase has only early or only late services. However, $r$ -phases have only late services since transmissions of the algorithm only happen when a request becomes due.

Let $v$ be a node included in a service of ALG at time $t$ . Recall, we say $v$ is critically overdue at time $t$ if $v$ is late at time $t$ and if for any service of the algorithm’s schedule including $v$ at a time $t^{\prime}\in(t,\text{nos}_{v}^{t})$ we have that $v$ is not late at time $t^{\prime}$ . In terms of phases, $v$ is critically overdue at time $t$ if the service of ALG at time $t$ is the last late service in its respective $v$ -phase. We “charge” to critically overdue nodes, $(v,t)$ , by charging to the occurrence of $v$ that appears in $\text{pos}_{v}^{t}$ . We have that charging to critically overdue nodes is injective since only one service in any phase may be critically overdue.

Next, we discuss structural properties of late and critically overdue nodes. First, we note that if $(v,t)$ is late and the $v$ -fall satisfies all requests in $\mathcal{T}_{v}^{t}$ , then $(v,t)$ must be critically overdue. This is clear since a new request must arrive after time $t$ in order for $v$ to become late again and a new service of OPT must be transmitted to satisfy this new request.

Lemma 4.2.

Suppose that $v$ is late at time $t$ and that $\rho$ is the earliest due request in $\mathcal{T}_{v}^{t}$ . Then, for all $u\in P(v\to\rho)$ , $u$ is late at time $t$ . Furthermore, if $v$ is late at time $t$ and $u\in F_{v}^{t}$ is early at time $t$ , then $v$ is critically overdue at time $t$ .

Proof.

If $v$ is late at time $t$ then, by definition, we know that $d^{t}(v)<\text{nos}_{v}^{t}$ . Since $\rho$ is the earliest due request in $\mathcal{T}_{v}^{t}$ , we also know that $d^{t}(v)=d_{\rho}$ . If for $u\in P(v\to\rho)$ a request $\gamma\in\mathcal{T}_{u}^{t}$ was due earlier than $\rho$ , $\gamma$ would also be an earlier due request in $\mathcal{T}_{v}^{t}\supseteq\mathcal{T}_{u}^{t}$ . Hence, we have $\rho$ is the earliest due request in $\mathcal{T}_{u}^{t}$ , so $d^{t}(v)=d^{t}(u)$ . Now, by the subtree property of services, if OPT includes $u$ in a service it must also include $v$ in that service implying that $\text{nos}_{v}^{t}\leq\text{nos}_{u}^{t}$ . Thus, we see that $d^{t}(u)=d^{t}(v)<\text{nos}_{v}^{t}\leq\text{nos}_{u}^{t}$ and so $u$ is late at time $t$ .

Now, further suppose that $u\in F_{v}^{t}$ is early at time $t$ . Also, suppose that ALG next includes $v$ at time $t^{\prime}$ . If $v$ is early at time $t^{\prime}$ , then either this service at time $t^{\prime}$ is in a new $v$ -phase or this service is an early service in the same $v$ -phase. In either case, the partition of phases into earlier and late parts implies that the service at time $t$ is the last late service for the $v$ -phase. Hence, $v$ is critically overdue at time $t$ .

Next, suppose that $v$ is late at time $t^{\prime}$ . Again, if the service at time $t^{\prime}$ is in a new $v$ -phase, $v$ is critically overdue at time $t$ . If the service at time $t^{\prime}$ is in the same $v$ -phase as the service at time $t$ , we show that $u$ at time $t$ was actually late. In particular, the services at times $t$ and $t^{\prime}$ being in the same $v$ -phase implies that $\text{pos}_{v}^{t}=\text{pos}_{v}^{t^{\prime}}$ and $\text{nos}_{v}^{t}=\text{nos}_{v}^{t^{\prime}}$ . We claim that the earliest due request, $\gamma$ , in $\mathcal{T}_{v}^{t^{\prime}}$ was issued by time $t$ . If not, we have that $\text{pos}_{v}^{t}\leq t<a_{\gamma}\leq d_{\gamma}<\text{nos}_{v}^{t}$ . This means that $\gamma$ arrived and became due between two services of OPT and so OPT never satisfied this request. This is impossible by feasibility of OPT and so $\gamma\in\mathcal{T}_{v}^{t}$ . Since $u$ was included in the $v$ -fall at time $t$ , but $\gamma$ was not satisfied by this $v$ -fall, we know that $d^{t}(u)<d_{\gamma}$ . But, since $d_{\gamma}=d^{t^{\prime}}(v)$ and $v$ is late at time $t^{\prime}$ , we have

d^{t}(u)<d_{\gamma}=d^{t^{\prime}}(v)<\text{nos}_{v}^{t^{\prime}}=\text{nos}_{v}^{t}

Finally, applying the subtree property at time $t$ , we have that $d^{t}(u)<\text{nos}_{v}^{t}\leq\text{nos}_{u}^{t}$ and so $u$ is late at time $t$ . This contradicts that $u$ is early and hence $v$ must be critically overdue at time $t$ .

∎

Direct Investments.

Suppose that $(S,t)$ is a service of ALG with $v\in S$ . We say that the pair $(v,t)$ directly invests in the pair $(w,t^{\prime})$ if $w\in F_{v}^{t}$ and $t=t^{\prime}$ or if $w$ is an overflow node for $F_{v}^{t}$ and $t^{\prime}$ is the next time after $t$ that $w$ is included in a service of ALG. The investment made, $I_{v}^{t}(w,t^{\prime})$ , is $p(w)$ , $w$ ’s price during the construction of $F_{v}^{t}$ , in the former case. In the latter case, the investment made is $\frac{rb(v,t)}{p(v\to\rho)}p(w)$ where $rb(v,t)$ is the remaining budget, $b(v)$ , at the end of the $v$ -fall’s construction and $\rho$ is this earliest request not satisfied by this fall. If $rb(v,t)=0$ , then we do not consider any $0$ investments and say there are no overflow nodes for the fall. We imagine a fall including a node as reducing its price to $0$ before resetting its price to its cost. Then, we see the direct investment in any node is exactly the amount by which the fall reduces the node’s price.

Lemma 4.3.

The total direct investment made by any pair $(v,t)$ is at most $c(v)$ and is strictly less than $c(v)$ only if $F_{v}^{t}$ satisfied all requests in $\mathcal{T}_{v}^{t}$ . In addition, the total direct investments made to a pair $(v,t)$ is at most $c(v)$ and is strictly less than $c(v)$ exactly when $v$ is on the path to the request that triggers the service of ALG at time $t$ .

Proof.

The total direct investment made by $(v,t)$ is exactly the total price of all nodes included in $F_{v}^{t}$ plus the investments made to the overflow nodes. Note that the budget starts at $c(v)$ and is reduced by the price of each node added to the fall in each iteration. Hence, the total investment made by by $(v,t)$ to nodes included in $F_{v}^{t}$ is exactly $c(v)-rb(v,t)$ . If all requests in $\mathcal{T}_{v}^{t}$ are satisfied by $F_{v}^{t}$ , then this is all the direct investments made and this is clearly at most $c(v)$ . If there are overflow nodes, each overflow node, $(u,t^{\prime})$ , is invested exactly $\frac{rb(v,t)}{p(v\to\rho)}p(u)$ . Thus, the total investment made by $(v,t)$ to overflow nodes is exactly

\sum_{u\in P(v\to\rho)}\frac{rb(v,t)}{p(v\to\rho)}p(u)=\frac{rb(v,t)}{p(v\to\rho)}\sum_{u\in P(v\to\rho)}p(u)=rb(v,t)

So, the total direct investments made by $(v,t)$ is $c(v)-rb(v,t)+rb(v,t)=c(v)$ . Hence, as long as there are overflow nodes, the total direct investment will be exactly $c(v)$ . So, the only way the total direct investment can be strictly less than $c(v)$ is if there are no overflow nodes, which can only happen if $F_{v}^{t}$ satisfied all requests in $\mathcal{T}_{v}^{t}$ by definition.

For the second claim, first note that every direct investment made to $(v,t)$ will be when $v$ is a overflow node other than possibly the very last direct investment as $v$ will be included in the service at that point and then cannot be invested in further by definition. Now, the direct investments made to $v$ are exactly the amounts the price of the node is reduced by. Since the price of the node begins at its cost and is reduced until the price becomes $0$ or until the node is included in a service due to a request becoming due, we see the total direct investments made to $v$ is at most $c(v)$ . If $v$ is added by a fall then the last investment was its final price and so the total direct investment is simply the final price $p(v)$ plus the amounts the price was reduced by which is $c(v)-p(v)$ for $c(v)$ total direct investments. Hence, we see the total direct investment in $(v,t)$ is less than $c(v)$ exactly when it is not added by some fall. This happens if and only if $v$ was on a path to a request due at time $t$ and so the initial part of Algorithm 3 added $v$ rather than a fall.

∎

Investments.

We say $(v,t)$ invests in $(w,t^{\prime})$ if either $(v,t)$ directly invests in $(w,t^{\prime})$ or if $(v,t)$ invests in $(u,t^{\prime\prime})$ and $(u,t^{\prime\prime})$ directly invests in $(w,t^{\prime})$ . Equivalently, the latter case can be defined so that $(v,t)$ directly invests in $(u,t^{\prime\prime})$ and $(u,t^{\prime\prime})$ invests in $(w,t^{\prime})$ and we will use both versions interchangeably. The equivalence follows similarly to how we can define a walk in a graph inductively by either looking at the last step or the first step in the walk. In the inductive case, we define $I_{v}^{t}(w,t^{\prime})=\frac{I_{v}^{t}(u,t^{\prime\prime})}{c(u)}\cdot I_{u}^{t^{\prime\prime}}(w,t^{\prime})$ where one of the investments will be direct and the other will be recursively constructed depending on which definition we use. Intuitively, this is just the investment made to $(w,t^{\prime})$ by $(u,t^{\prime\prime})$ scaled by the relative reduction in $c(u)$ caused by $(v,t)$ ’s investment to $(u,t^{\prime\prime})$ .

We define the total investment in $(v,t)$ to be $I(v,t)=\sum_{\begin{subarray}{c}(w,t^{\prime})\text{ invests}\\ \text{in }(v,t)\end{subarray}}I_{w}^{t^{\prime}}(v,t)$ . Note, that we can calculate the total investment by focusing on direct investments. In particular, $I(v,t)=\sum_{\begin{subarray}{c}(w,t^{\prime})\text{ directly }\\ \text{invests in }(v,t)\end{subarray}}\frac{I(w,t^{\prime})}{c(w)}I_{w}^{t^{\prime}}(v,t)$ which follows since any pair that invests in $(v,t)$ must invest in some pair directly investing in $(v,t)$ . To streamline our arguments we will consider $(v,t)$ as investing in $(v,t)$ and the investment made is $c(v)$ . Note that only ancestors of $v$ can invest in a pair $(v,t)$ .

Lemma 4.4.

For any pair $(v,t)$ , we have $I(v,t)\leq L_{v}c(v)$ .

Proof.

We proceed by induction on $L_{v}$ . For the base case, we have $L_{v}=1$ , so $v=r$ . The only ancestor of $r$ is $r$ itself, so the only investment made is from $r$ itself. Hence, the total investment to $(r,t)$ is $I(r,t)=c(r)=L_{r}c(r)$ .

Suppose that $L_{v}>1$ , so that $v$ is a proper descendant of $r$ . If no ancestors of $v$ directly invested in $v$ , then again $I(v,t)=c(v)\leq L_{v}c(v)$ . Suppose that $(u_{1},t_{1}),\ldots,(u_{k},t_{k})$ directly invested in $(v,t)$ . Note that each $u_{i}$ must be a proper ancestor of $v$ and so $L_{u_{i}}\leq L_{v}-1$ . The induction hypothesis implies that $I(u_{i},t_{i})\leq L_{u_{i}}c(u_{i})$ for each $i$ . Excluding the investment from $(v,t)$ to itself, the total investment in $(v,t)$ is

	$\displaystyle I(v,t)-c(v)$	$\displaystyle=\sum_{i=1}^{k}\frac{I(u_{i},t_{i})}{c(u_{i})}I_{u_{i}}^{t_{i}}(v,t)$
		$\displaystyle\leq\sum_{i=1}^{k}\frac{L_{u_{i}}c(u_{i})}{c(u_{i})}I_{u_{i}}^{t_{i}}(v,t)$
		$\displaystyle=\sum_{i=1}^{k}L_{u_{i}}I_{u_{i}}^{t_{i}}(v,t)$
		$\displaystyle\leq(L_{v}-1)\sum_{i=1}^{k}I_{u_{i}}^{t_{i}}(v,t)$

Lastly, we use the fact from Lemma 4.3 that the total direct investment into $(v,t)$ is at most $c(v)$ , so this final sum is at most $c(v)$ . Thus, we have that $I(v,t)-c(v)\leq(L_{v}-1)c(v)$ implying that $I(v,t)\leq L_{v}c(v)$ .

∎

We will not only want to look at the amount invested in a pair, but also how much a pair invests in other pairs. To this end, we define $IM(v,t)=\sum_{\begin{subarray}{c}(v,t)\text{ invests}\\ \text{in }(w,t^{\prime}),w\not=v\end{subarray}}I_{v}^{t}(w,t^{\prime})$ to be the total investment made by $(v,t)$ to its proper descendants. Again, we can focus on the pairs that are directly invested in as follows: $IM(v,t)=\sum_{\begin{subarray}{c}(v,t)\text{ directly }\\ \text{invests in }(w,t^{\prime})\end{subarray}}I_{v}^{t}(w,t^{\prime})+\frac{I_{v}^{t}(w,t^{\prime})}{c(w)}IM(w,t^{\prime})$ .

Lemma 4.5.

For any pair $(v,t)$ , $IM(v,t)\leq(D-L_{v})c(v)$ .

Proof.

We proceed by induction on $D-L_{v}$ . For the base case, $D-L_{v}=0$ , so $v$ is a leaf. In this case, $v$ has no proper descendants and so $IM(v,t)=0=(D-L_{v})c(v)$ .

Now, suppose that $D-L_{v}>0$ so that $v$ is not a leaf. If $(v,t)$ does not directly invest in any pair we are done, so suppose that $(v,t)$ directly invests in $(w_{1},t_{1}),\ldots,(w_{k},t_{k})$ . By Lemma 4.3, we know the total direct investment made by $(v,t)$ is at most $c(v)$ . In symbols, $\sum_{i=1}^{k}I_{v}^{t}(w_{i},t_{i})\leq c(v)$ . Also, each $w_{i}$ is a proper descendant of $v$ and so the induction hypothesis implies that $IM(w_{i},t_{i})\leq(D-L_{w_{i}})c(w_{i})$ for all $i$ . Using these two facts gives,

	$\displaystyle IM(v,t)$	$\displaystyle=\sum_{i=1}^{k}I_{v}^{t}(w_{i},t_{i})+\frac{I_{v}^{t}(w_{i},t_{i})}{c(w_{i})}IM(w_{i},t_{i})$
		$\displaystyle\leq\sum_{i=1}^{k}I_{v}^{t}(w_{i},t_{i})+\frac{I_{v}^{t}(w_{i},t_{i})}{c(w_{i})}(D-L_{w_{i}})c(w_{i})$
		$\displaystyle=\sum_{i=1}^{k}(D-L_{w_{i}}+1)I_{v}^{t}(w_{i},t_{i})$
		$\displaystyle\leq\sum_{i=1}^{k}(D-L_{v})I_{v}^{t}(w_{i},t_{i})$
		$\displaystyle\leq(D-L_{v})c(v)$

Above we have used the fact that $L_{w_{i}}\geq L_{v}+1$ since descendants have higher level and so $D-L_{w_{i}}+1\leq D-L_{v}$ .

∎

4.2 Charging Scheme

Now, we begin the construction of the charges we will make. For a set of pairs, $S$ , we let $I_{v}^{t}(S)=\sum_{(w,t^{\prime})\in S}I_{v}^{t}(w,t^{\prime})$ be the total investment $(v,t)$ made to the pairs in $S$ .

Lemma 4.6.

Let $(S,t)$ be a service of ALG and suppose $v\in S$ is late at time $t$ . Then, there exists a set $U_{v}^{t}\subseteq\mathcal{T}_{v}$ of critically overdue nodes such that $I_{v}^{t}(U_{v}^{t})=c(v)$ .

Proof.

We proceed by induction on $D-L_{v}$ .

For the base case, $D-L_{v}=0$ , so $v$ is a leaf. In this case, we claim $v$ is critically overdue at time $t$ and hence $U_{v}^{t}=\{(v,t)\}$ satisfies the requirements as $I_{v}^{t}(v,t)=c(v)$ . Suppose that there was another service of ALG including $v$ at a time $t^{\prime}\in(t,\text{nos}_{v}^{t})$ . Further, suppose that $v$ is late at time $t^{\prime}$ . Now, since all requests at $v$ are satisfied after time $t$ by $v$ ’s inclusion, it must be that some other request $\rho$ arrived at $v$ after time $t$ . However, $v$ late at time $t^{\prime}$ implies $t<a_{\rho}\leq d_{\rho}<\text{nos}_{v}^{t}$ . This implies $\rho$ becomes due before OPT could have satisfied it, contradicting feasibility of OPT. Thus, $(v,t)$ is critically overdue.

Now, suppose that $D-L_{v}>0$ , so $v$ is not a leaf. If $(v,t)$ is critically overdue, we again have $U_{v}^{t}=\{(v,t)\}$ satisfies the requirements. Otherwise, suppose that $v$ is not critically overdue at time $t$ . Then, by definition, there must be a time $t^{\prime}>t$ at which ALG transmits a service, and this service contains $v$ , which is late at time $t^{\prime}$ . By the partitioning of phases into early and late parts, we can assume that time $t^{\prime}$ is the very next time that ALG includes $v$ after time $t$ since at time $t^{\prime}$ , $v$ must be late if it is at a even later time in the phase.

Consider the request $\gamma$ whose satisfaction caused $v$ to be included at time $t^{\prime}$ and $\rho$ the last request considered by $F_{v}^{t}$ . We claim that $\rho=\gamma$ . If $\gamma\not=\rho$ , then it must be the case that $d_{\gamma}<d_{\rho}$ . However, the definition of $v$ -falls imply they consider requests in earliest deadline order. Thus, for $\rho$ to be considered before $\gamma$ it must be the case that $\gamma$ arrived after time $t$ . But, this implies $t^{\prime}$ is in another phase and so $v$ was critically overdue at time $t$ , a contradiction. Hence, $\rho=\gamma$ .

Then, Lemma 4.2 implies all the nodes in $P(v\to\rho)$ are late at time $t^{\prime}$ as $\rho$ is the earliest request in $\mathcal{T}_{v}^{t^{\prime}}$ . Also, the contrapositive of the last statement in the lemma implies that any $u\in F_{v}^{t}$ is late at time $t$ since $(v,t)$ is not critically overdue. Thus, all of the nodes in $F_{v}^{t}$ and the overflow nodes $P(v\to\rho)$ are higher level nodes than $v$ that are late so the induction hypothesis applies to these nodes. In particular, if $(w_{1},t_{1}),\ldots,(w_{k},t_{k})$ are the nodes that $(v,t)$ directly invested in, then we have that for each $i$ there exists a set of critically overdue nodes, $U_{w_{i}}^{t_{i}}$ , satisfying $I_{w_{i}}^{t_{i}}(U_{w_{i}}^{t_{i}})=c(w_{i})$ . Since $(v,t)$ is not critically overdue, it must be the case that $F_{v}^{t}$ did not satisfy every request in $\mathcal{T}_{v}^{t}$ as mentioned previously. Thus, by Lemma 4.3 we have that the total direct investment made by $(v,t)$ is exactly $c(v)$ . Hence, we have that $\sum_{i=1}^{k}I_{v}^{t}(w_{i},t_{i})=c(v)$ . Defining $U_{v}^{t}=\cup_{i=1}^{k}U_{w_{i}}^{t_{i}}$ , we then have,

	$\displaystyle I_{v}^{t}(U_{v}^{t})$	$\displaystyle=\sum_{i=1}^{k}I_{v}^{t}(U_{w_{i}}^{t_{i}})$
		$\displaystyle=\sum_{i=1}^{k}\frac{I_{v}^{t}(w_{i},t_{i})}{c(w_{i})}I_{w_{i}}^{t_{i}}(U_{w_{i}}^{t_{i}})$
		$\displaystyle=\sum_{i=1}^{k}I_{v}^{t}(w_{i},t_{i})=c(v)$

Also, $U_{v}^{t}\subseteq\mathcal{T}_{v}$ contains only nodes that are critically overdue.

∎

Fix any service $(S,t)$ of ALG. Lemma 4.6 implies for any late node $v\in S$ , we can pay for the $c(v)$ in this service by charging the critically overdue nodes in $U_{v}^{t}$ the amount $(v,t)$ invested in them. This will be the charges we make for late nodes. Note, at this point the total charge to any critically overdue node, $(v,t)$ , will be at most $L_{v}$ . This follows since we at most charge a node the total investment made in it and this is at most $L_{v}c(v)$ by Lemma 4.4.

We also need to pay for early nodes of a service, which we do next. The analysis above essentially covers the cost of all nodes “above” a node $v$ , i.e. its ancestors. Now, we want to bound the cost of all nodes “below” a node $v$ , i.e. its descendants. To continue the charging strategy, we will charge every critically overdue node, $(v,t)$ , for all of the early nodes that it invests in. This is at most the totality of all investments made by $(v,t)$ so this adds a charge of at most $D-L_{v}$ to $(v,t)$ according to Lemma 4.5. Thus, each node is charged $L_{v}$ for nodes above it and $D-L_{v}$ for nodes below it for a total of $D$ . We just need to argue every cost is in fact covered by this strategy.

Lemma 4.7.

Let $(S,t)$ be a service of ALG. Then, the charging strategy pays for all the costs of nodes in $S$ .

Proof.

Any late node is clearly paid for by the definition of the charging scheme and Lemma 4.6. Next, we argue all early nodes are also paid for. Note any node on a path to a due request cannot be early by Lemma 4.2. Thus, if a node is early it cannot be on the path to a request that triggers a service of ALG. Consequently, Lemma 4.3 implies the total investments to an early node must exactly equal its cost.

We proceed by induction on $L_{v}$ to show if $v$ is early at time $t$ then the cost of $v$ is exactly paid for by charges to some critically overdue nodes as dictated by the charging scheme. For the base case, $L_{v}=1$ and we have that $v=r$ . We know $r$ is never early by definition of ALG only including $r$ when a request becomes due and so this case vacuously holds.

Suppose that $L_{v}>1$ . Let $(u,t^{\prime})$ be a pair that directly invested in $(v,t)$ . If $(u,t^{\prime})$ is critically overdue, then we know by the charging scheme that $(u,t^{\prime})$ pays for the investment $I_{u}^{t^{\prime}}(v,t)$ that it made to $(v,t)$ . We just need to ensure that investments of the other pairs that are not critically overdue were paid for. First, we claim all non-critically overdue pairs that invested in $(v,t)$ are early. For the sake of contradiction, suppose that $(u,t^{\prime})$ directly invested in $(v,t)$ and $(w,t^{\prime})$ is late but not critically overdue.

•

If $v\in F_{u}^{t^{\prime}}$ is early at time $t^{\prime}$ , then Lemma 4.2 implies that $(u,t^{\prime})$ is critically overdue, a contradiction.
•

If $F_{u}^{t^{\prime}}$ does not add $v$ at time $t$ , then $v$ was an overflow node for $(u,t^{\prime})$ . Since $(u,t^{\prime})$ is late but not critically overdue, we know the next time $t^{\prime\prime}$ that ALG includes $u$ after time $t^{\prime}$ is in the same $u$ -phase as time $t^{\prime}$ . However, since $v$ was an overflow node at time $t^{\prime}$ , we know that the earliest request in $\mathcal{T}_{u}^{t^{\prime}}$ that is not satisfied is in $\mathcal{T}_{v}^{t^{\prime}}$ .
- –
  
  If the next time we enter $u$ the path includes $v$ , then this happens at time $t^{\prime\prime}$ and so $t=t^{\prime\prime}$ . Then, Lemma 4.2 implies $(v,t)$ is late since its on the path to the earliest due request in $\mathcal{T}_{u}^{t}$ .
- –
  
  If the next time we enter $u$ the path does not include $v$ , it must be that a new request arrived after time $t^{\prime}$ causing $u$ to be late. However, late requests cannot arrive in the middle of a phase as it would become due before OPT includes $u$ again.

In all cases, we reach a contradiction and so if $(u,t^{\prime})$ is not critically overdue then it must be early.

Now, for each $(u,t^{\prime})$ that is early we have that $u$ is an ancestor of $v$ so the induction hypothesis implies that some charges to critically overdue nodes paid for $c(w)$ . In particular, by our charging scheme this means there were critically overdue nodes $(w_{1},t_{1}),\ldots,(w_{j},t_{k})$ satisfying $\sum_{i=1}^{k}I_{w_{i}}^{t_{i}}(u,t^{\prime})=c(u)$ . However, since $v$ is an early descendant of $u$ , $v$ is also an early descendant of each $w_{i}$ . Hence, our charging scheme would also ensure that $(w_{i},t_{i})$ paid for its investment to $(v,t)$ since it paid its investment to $(u,t^{\prime})$ . By definition of the investments, the total investment made to $(v,t)$ by these critically overdue nodes is exactly

\sum_{i=1}^{k}I_{w_{i}}^{t_{i}}(v,t)=\sum_{i=1}^{k}\frac{I_{w_{i}}^{t_{i}}(u,t^{\prime})}{c(u)}I_{u}^{t^{\prime}}(v,t)=\frac{I_{u}^{t^{\prime}}(v,t)}{c(u)}\sum_{i=1}^{k}I_{w_{i}}^{t_{i}}(u,t^{\prime})=I_{u}^{t^{\prime}}(v,t)

Now, since this holds true for all early pairs investing in $(v,t)$ , we have that all direct investments to $(v,t)$ are paid for by our charging scheme. But the total direct investment for an early node is its cost, and so $c(v)$ is paid for by the charges. Thus, the charging scheme pays for all of the costs of the service $(S,t)$ .

∎

Summarizing the charging strategy above gives a proof of Theorem 4.1.

Proof.

For any service $(S,t)$ of ALG, we apply Lemma 4.6 to create sets of critically overdue nodes to charge. In particular, each node $v\in S$ that is late at time $t$ we pay for by charging each pair in $U_{v}^{t}$ the investment $(v,t)$ made to each. This pays for every late node as $I_{v}^{t}(U_{v}^{t})=c(v)$ . By Lemma 4.4 the total investments made to $(v,t)$ is at most $L_{v}c(v)$ and so the charge to any critically overdue node is at most $L_{v}$ to pay for late nodes through investments. Then, we charge to each critically overdue node the total amount of investments it made to early nodes. These charges add a total of charge of at most $D-L_{v}$ to any node $v$ as the total investments made by $(v,t)$ is at most $(D-L_{v})c(v)$ by Lemma 4.5. Hence, the total charge to any critically overdue node is at most $D$ . Lastly, Lemma 4.7 implies every node in a service of ALG is paid for by this charging. Thus, the total cost of ALG’s schedule is at most $D$ times the total cost of OPT’s schedule. Consequently, ALG is $D$ -competitive.

∎

One thing to note about the above analysis is that its basically tight. Most critically overdue nodes are charged exactly $D$ . This optimizes the charging possible if we restrict to only charging to nodes OPT has included prior to a fixed time. The only way to improve upon this analysis would be to allow charging early nodes as well. However, recursive instances give evidence that no such approach would be possible as the adversary could change the instance if it sees ALG early-aggregate many nodes.

At a higher level, there is a clear trade-off between the eager and lazy aggregation costs and neither cost seems to be more dominant than the other. So, perfectly balancing both likely would give the best competitive ratio for the problem and this is exactly what Waterfall does. Specifically, the maximum early aggregation is $D$ resultant from the early aggregation made by the root and the maximum lazy aggregation will come from the leaves which will also have a charge of $D$ . More generally, the sum of the early charges and the late charges add to $D$ as the analysis shows. Although they may not always be exactly $D/2$ each, we always have the early and lazy aggregation costs are in equilibrium with each other balancing out to the fixed number $D$ . Depending on the tree structure and the requests received, strictly more eager aggregation or strictly more lazy aggregation may be more beneficial and so this more general type of balancing seems necessary. This further indicates that $D$ may be the best competitive ratio possible for MLAPD.

5 Conclusions

One prominent open question left by this paper is whether $D$ is the true competitive ratio for MLAPD. In particular, a critical result would be new lower-bounds for MLAPD that improve upon the $2$ lower-bound for the $D=2$ case. In addition, determining the difference in complexity between the delay and deadline cases is a critical open problem.

For $D=1$ , we know the optimal competitive ratio for MLAPD is $1$ and that for MLAP it is $2$ . For $D=2$ , we know that the optimal competitive ratio for MLAPD is $2$ and for MLAP it is essentially $3$ (improving the known lower-bound of $2.754$ to $3$ is still unresolved). Unless the problems for larger $D$ differ greatly than that for smaller $D$ , this pattern would suggest that the optimal competitive ratio for MLAP is exactly $1$ more than that for MLAPD.

In fact, [15] essentially took an algorithm for JRPD and converted it into an algorithm for JRP using a primal dual framework. The resultant algorithm had competitive ratio $1$ more than the competitive ratio for the algorithm for JRPD. If this technique could be extended for $D>2$ then using Waterfall as the MLAPD algorithm could result in a $D+1$ competitive algorithm for MLAP, which would improve on the best known $O(D^{2})$ -competitive algorithm. This is especially promising since the JRPD algorithm used for that result behaves almost exactly as Waterfall does on trees of depth $2$ .

Another possible approach would be to extend Waterfall to work for the delay case directly. By simply replacing deadlines with maturation times and making minor adjustments, the modified Waterfall may in fact be $D+1$ -competitive. However, unlike the nice structural properties of phases being partitioned into late and early parts, no such structure is clear for the delay version. Thus, a different analysis strategy must be employed absent new structural insights.

References

[1] Alok Aggarwal and James K. Park “Improved Algorithms for Economic Lot Size Problems” In Operations Research 41.3, 1993, pp. 549–571 DOI: 10.1287/opre.41.3.549
[2] Esther Arkin, Dev Joneja and Robin Roundy “Computational complexity of uncapacitated multi-echelon production planning problems” In Operations Research Letters 8.2, 1989, pp. 61–66 DOI: https://doi.org/10.1016/0167-6377(89)90001-1
[3] Yossi Azar and Noam Touitou “General Framework for Metric Optimization Problems with Delay or with Deadlines” In CoRR abs/1904.07131, 2019 arXiv: http://arxiv.org/abs/1904.07131
[4] B.R. Badrinath and P. Sudame “Gathercast: the design and implementation of a programmable aggregation mechanism for the Internet” In Proceedings Ninth International Conference on Computer Communications and Networks (Cat.No.00EX440), 2000, pp. 206–213 DOI: 10.1109/ICCCN.2000.885492
[5] LUCA Becchetti et al. “Latency-Constrained Aggregation in Sensor Networks” In ACM Trans. Algorithms 6.1 New York, NY, USA: Association for Computing Machinery, 2010 DOI: 10.1145/1644015.1644028
[6] Marcin Bienkowski et al. “Approximation algorithms for the joint replenishment problem with deadlines” In Journal of Scheduling 18.6 Springer ScienceBusiness Media LLC, 2014, pp. 545–560 DOI: 10.1007/s10951-014-0392-y
[7] Marcin Bienkowski et al. “Better Approximation Bounds for the Joint Replenishment Problem” In CoRR abs/1307.2531, 2013 arXiv: http://arxiv.org/abs/1307.2531
[8] Marcin Bienkowski et al. “Online Algorithms for Multi-Level Aggregation” In CoRR abs/1507.02378, 2015 arXiv: http://arxiv.org/abs/1507.02378
[9] Marcin Bienkowski et al. “Online Control Message Aggregation in Chain Networks” In Proceedings of the 13th International Conference on Algorithms and Data Structures, WADS’13 London, ON, Canada: Springer-Verlag, 2013, pp. 133–145 DOI: 10.1007/978-3-642-40104-6˙12
[10] Edward Bortnikov and Reuven Cohen “Schemes for scheduling control messages by hierarchical protocols” In Comput. Commun. 24.7-8, 2001, pp. 731–743 DOI: 10.1016/S0140-3664(00)00276-0
[11] Carlos Fisch Brito, Elias Koutsoupias and Shailesh Vaya “Competitive Analysis of Organization Networks or Multicast Acknowledgment: How Much to Wait?” In Algorithmica 64.4, 2012, pp. 584–605
[12] Niv Buchbinder and Joseph (Seffi) Naor “The Design of Competitive Online Algorithms via a Primal: Dual Approach” In Found. Trends Theor. Comput. Sci. 3.2–3 Hanover, MA, USA: Now Publishers Inc., 2009, pp. 93–263 DOI: 10.1561/0400000024
[13] Niv Buchbinder, Kamal Jain and Joseph Seffi Naor “Online Primal-Dual Algorithms for Maximizing Ad-Auctions Revenue” In Proceedings of the 15th Annual European Conference on Algorithms, ESA’07 Eilat, Israel: Springer-Verlag, 2007, pp. 253–264
[14] Niv Buchbinder, Moran Feldman, Joseph Naor and Ohad Talmon “O(depth)-Competitive Algorithm for Online Multi-level Aggregation” In CoRR abs/1701.01936, 2017 arXiv: http://arxiv.org/abs/1701.01936
[15] Niv Buchbinder et al. “Online make-to-order joint replenishment model: Primal-dual competitive algorithms” In Operations Research 61.4 INFORMS Inst.for Operations Res.and the Management Sciences, 2013, pp. 1014–1029 DOI: 10.1287/opre.2013.1188
[16] Wallace B. Crowston and Michael H. Wagner “Dynamic Lot Size Models for Multi-Stage Assembly Systems” In Management Science 20.1, 1973, pp. 14–21 DOI: 10.1287/mnsc.20.1.14
[17] Daniel R. Dooly, Sally A. Goldman and Stephen D. Scott “On-Line Analysis of the TCP Acknowledgment Delay Problem” In J. ACM 48.2 New York, NY, USA: Association for Computing Machinery, 2001, pp. 243–273 DOI: 10.1145/375827.375843
[18] F. Hu, X. Cao and C. May “Optimized scheduling for data aggregation in wireless sensor networks” In International Conference on Information Technology: Coding and Computing (ITCC’05) - Volume II 2, 2005, pp. 557–561 Vol. 2 DOI: 10.1109/ITCC.2005.219
[19] Anna R. Karlin, Claire Kenyon and Dana Randall “Dynamic TCP Acknowledgement and Other Stories about e/(e-1)” In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, STOC ’01 Hersonissos, Greece: Association for Computing Machinery, 2001, pp. 502–509 DOI: 10.1145/380752.380845
[20] Alf Kimms “Multi-level lot sizing and scheduling: methods for capacitated, dynamic, and deterministic models” Springer Science & Business Media, 2012
[21] Retsef Levi, Robin Roundy and David Shmoys “Primal-Dual Algorithms for Deterministic Inventory Problems.” In Math. Oper. Res. 31, 2006, pp. 267–284 DOI: 10.1145/1007352.1007410
[22] Retsef Levi and Maxim Sviridenko “Improved Approximation Algorithm for the One-Warehouse Multi-Retailer Problem” In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 188–199
[23] Tim Nonner and Alexander Souza “A 5/3-Approximation Algorithm for Joint Replenishment with Deadlines” In Proceedings of the 3rd International Conference on Combinatorial Optimization and Applications, COCOA ’09 Huangshan, China: Springer-Verlag, 2009, pp. 24–35 DOI: 10.1007/978-3-642-02026-1˙3
[24] L… Pedrosa, Private communication reported by Bienkowski et al., 2013
[25] Steven S. Seiden “A Guessing Game and Randomized Online Algorithms” In Proceedings of the thirty-second annual ACM symposium on theory of computing (STOC), 2000, 2000, pp. 592–601
[26] Harvey M. Wagner and Thomson M. Whitin “Dynamic Version of the Economic Lot Size Model” In Management Science 5.1, 1958, pp. 89–96 DOI: 10.1287/mnsc.5.1.89

Appendix A Proofs for section 3.2

We begin with the proof of Lemma 3.3.

Proof.

In symbols, the assumptions state that $d_{\rho_{1}}<\ldots<d_{\rho_{k}}$ , $c(S_{i})<c(O)$ , and $P_{\rho_{i}}\subseteq P_{\rho_{k}}\subseteq O$ for all $i\leq k$ . We will show that each interval transmitted by ALG has cost at at least double the previously transmitted interval, and use this to show that in reverse order each interval must be at least a power of two smaller than the cost of $O$ .

Let $P(\rho_{i-1}\to\rho_{i})=P_{\rho_{i}}\setminus P_{\rho_{i-1}}$ be the nodes that need to be added to $P_{\rho_{i-1}}$ in order to reach the node at which $\rho_{i}$ was issued. Since $\rho_{i}$ triggered a service of ALG, we know that $\rho_{i}$ was not satisfied by previous services of ALG. Then, we know that the nodes in $P(\rho_{i-1}\to\rho_{i})$ were not added to the service. By definition of ALG, this implies that $c(\rho_{i-1}\to\rho_{i})>c(P_{\rho_{i-1}})$ . Thus,

c(P_{\rho_{i}})=c(P_{\rho_{i-1}})+c(\rho_{i-1}\to\rho_{i})>2c(P_{\rho_{i-1}})

In other words, the intervals of the requests are at least doubling each time a request is not satisfied. This implies that $c(P_{\rho_{i-1}})<\frac{1}{2}c(P_{\rho_{i}})$ holds for all $i>1$ . As $c(P_{\rho_{k}})\leq 2^{-(k-k)}c(P_{\rho_{k}})$ , an easy induction shows that $c(P_{\rho_{i}})\leq 2^{-(k-i)}c(P_{\rho_{k}})$ holds for all $i$ . The definition of ALG then implies that $c(S_{i})\leq 2c(P_{\rho_{i}})\leq 2^{-(k-i)+1}c(P_{\rho_{k}})$ .

Suppose $\gamma$ is the earliest request due at the node $v$ that is the furthest node from the root contained in $O$ . As $\rho_{k}$ was satisfied by $O$ by assumption, and $t\leq t_{k}$ , it must be the case that the request $\gamma$ arrived before the deadline of $\rho_{k}$ and so has been issued at the time of $S_{k}$ ’s construction. Then, we know that $\gamma$ was not satisfied by $S_{k}$ otherwise $v$ being the last node in $O$ ’s interval would imply $c(S_{k})\geq c(O)$ . Thus, the same arguments from before imply that $c(P_{\rho_{k}})\leq\frac{1}{2}c(P_{\gamma})=c(O)$ . Thus, for all $i$ we have that

c(S_{i})\leq 2^{-(k-i)+1}c(P_{\rho_{k}})<2^{-(k-i)+1}(\frac{1}{2}c(O))=2^{-(k-i)}c(O)

The total cost of these intervals ALG transmitted is

\displaystyle\sum_{i=1}^{k}c(S_{i})=\sum_{i=1}^{k}c(S_{k-i+1})=\sum_{i=0}^{k-1}c(S_{k-i})<\sum_{i=0}^{k-1}2^{-i}c(O)=(2-2^{-k})c(O)

∎

Next, we show the improved competitive ratio for Double.

Proof.

We simply improve the upper bound on $k$ from $D$ to $\frac{D}{2}-1$ to achieve the claimed competitiveness. Notice that the worst charge to a service of OPT happens when both a larger service and smaller services of ALG both charge to it. In this case, to achieve a $2$ charge from the larger service it must be that at least two nodes are not included by the last small service (one that is due and another that is added too early). So, we know that $k\leq D-2$ . Now, in one situation we could have every request is at its own node and each node has very large cost and so $k=D-2$ . However, this will not achieve the largest charge. In particular, this means $c(S_{i})=c(\rho_{i})$ for each $i$ and so instead of $c(S_{k-i})\leq 2^{-i}c(O)$ we have $c(S_{k-i})\leq 2^{-(i+1)}c(O)$ , which overall yields a decrease of at least $1$ in the charge.

In order for $c(S_{k})$ to be very close to the $c(O)$ , which happens in the worst case, we need at least an additional node between the node $\rho_{k}$ is issued at and the last node served by $O$ . This pushes the total cost of $S_{k}$ from $\frac{1}{2}c(O)$ closer to $c(O)$ in exchange for one fewer interval than $D-2$ could be transmitted in this range. However, the overall increase in cost is positive. In particular, even if we added infinitely many intervals, the total cost would be $\sum_{j=1}^{\infty}2^{-j}=1$ , whereas just losing one later interval also pushes the cost to $1$ , so there is no improvement for choosing more intervals over a longer initial interval. In fact, if the path is finite, the longer interval is a strict improvement. Similarly, for any request $\rho_{k-i}$ where we would want to exchange a longer service $S_{k-i}$ for fewer total intervals, this is an improvement as $\sum_{j=i}^{\infty}2^{-j}=2^{1-i}$ which is exactly the largest we could make $S_{k-i}$ by sacrificing one smaller interval. Hence, the worst case occurs when each smaller interval is as large as possible, which requires at least one node to exist between any two request nodes. Thus, we remove half of the nodes from consideration giving that $k\leq\frac{D}{2}-1$ . Thus, ALG is $4-2^{1-\frac{D}{2}}$ -competitive. ∎

We note that the analysis above actually depends not only on $k$ , but the minimal difference, $\epsilon$ , between intervals charged to a longer service. The most accurate competitive ratio that could be achieved by the above analysis is likely closer to $4-\epsilon k2^{1-k}$ by taking the minimal differences into account appropriately. This indicates an even better competitive ratio may be possible for path instances.

A DD-competitive algorithm for the Multilevel Aggregation Problem with Deadlines

Abstract

1 Introduction

1.1 Our Contributions

1.2 Related Work

2 Preliminaries

Multi-level aggregation problem.

Critically overdue.

3 Special Tree Classes

3.1 Increasing Trees

Theorem 3.1.

Proof.

Corollary 3.1.1.

Proof.

3.2 Paths

Theorem 3.2.

Proof.

Lemma 3.3.

Corollary 3.3.1.

Proof.

4 Arbitrary Trees

Intuition.

Falls.

Waterfall Interpretation.

Theorem 4.1.

4.1 ALG’s Structure

Phases

Lemma 4.2.

Proof.

Direct Investments.

Lemma 4.3.

Proof.

Investments.

Lemma 4.4.

Proof.

Lemma 4.5.

Proof.

4.2 Charging Scheme

Lemma 4.6.

Proof.

Lemma 4.7.

Proof.

Proof.

5 Conclusions

References

Appendix A Proofs for section 3.2

Proof.

Proof.

A $D$ -competitive algorithm for the Multilevel Aggregation Problem with Deadlines