Dynamic geometric set cover and hitting set
Abstract
We investigate dynamic versions of geometric set cover and hitting set where points and ranges may be inserted or deleted, and we want to efficiently maintain an (approximately) optimal solution for the current problem instance. While their static versions have been extensively studied in the past, surprisingly little is known about dynamic geometric set cover and hitting set. For instance, even for the most basic case of one-dimensional interval set cover and hitting set, no nontrivial results were known. The main contribution of our paper is two frameworks that lead to efficient data structures for dynamically maintaining set covers and hitting sets in R¹ and R². The first framework uses bootstrapping and gives a (1+ε)-approximate data structure for dynamic interval set cover in R¹ with O(n^α/ε) amortized update time for any constant α > 0; in R², this method gives O(1)-approximate data structures for unit-square (and quadrant) set cover and hitting set with O(n^(1/2+α)) amortized update time. The second framework uses local modification, and leads to a (1+ε)-approximate data structure for dynamic interval hitting set in R¹ with Õ(1/ε) amortized update time; in R², it gives O(1)-approximate data structures for unit-square (and quadrant) set cover and hitting set in the partially dynamic settings with Õ(1) amortized update time.
1 Introduction
Given a pair (P, R) where P is a set of points and R is a collection of geometric ranges in a Euclidean space, the geometric set cover (resp., hitting set) problem is to find the smallest number of ranges in R (resp., points in P) that cover all points in P (resp., hit all ranges in R). Geometric set cover and hitting set are classical geometric optimization problems, with numerous applications in databases, sensor networks, VLSI design, etc.
In many applications, the problem instance can change over time and re-computing a new solution after each change is too costly. In these situations, a dynamic algorithm that can update the solution after a change more efficiently than constructing the entire new solution from scratch is highly desirable. This motivates the main problem studied in our paper: dynamically maintaining geometric set covers and hitting sets under insertion and deletion of points and ranges.
Although (static) geometric set cover and hitting set have been extensively studied over the years, their dynamic variants are surprisingly open. For example, even for the most fundamental case, dynamic interval set cover and hitting set in one dimension, no nontrivial results were previously known. In this paper, we propose two algorithmic frameworks for the problems, which lead to efficient data structures for dynamic set cover and hitting set for intervals in and unit squares and quadrants in . We believe that our approaches can be extended to solve dynamic set cover and hitting set in other geometric settings, or more generally, other dynamic problems in computational geometry.
1.1 Related work
The set cover and hitting set problems in the general setting are well-known to be NP-complete [18]. A simple greedy algorithm achieves an O(log n)-approximation [11, 19, 21], which is tight under appropriate complexity-theoretic assumptions [13, 20]. In many geometric settings, the problems remain NP-hard or even hard to approximate [5, 22, 23]. However, by exploiting the geometric nature of the problems, efficient algorithms with better approximation factors can be obtained. For example, Mustafa and Ray [25] showed the existence of polynomial-time approximation schemes (PTASes) for halfspace hitting set in R³ and disk hitting set. There is also a PTAS for unit-square set cover, given by Erlebach and van Leeuwen [14]. Agarwal and Pan [2] proposed approximation algorithms with near-linear running time for the set cover and hitting set problems for halfspaces in R³, disks in R², and orthogonal rectangles.
Dynamic problems have received considerable attention in recent years [3, 6, 7, 9, 10, 17, 26, 27]. In particular, dynamic set cover in the general setting has been studied in [1, 8, 16]. All these results were achieved in the partially dynamic setting where the ranges are fixed and only the points are dynamic. Gupta et al. [16] showed that an O(log n)-approximation can be maintained using O(f log n) amortized update time and an O(f³)-approximation can be maintained using O(f²) amortized update time, where f is the maximum number of ranges that a point belongs to. Bhattacharya et al. [8] gave an O(f²)-approximation data structure for dynamic set cover with O(f log n) amortized update time. Abboud et al. [1] proved that one can maintain a (1+ε)f-approximation using O(f² log n/ε⁵) amortized update time.
In geometric settings, only the dynamic hitting set problem has been considered [15]. Ganjugunte [15] studied two different dynamic settings: (i) only the range set R is dynamic, and (ii) R is dynamic and the point set P is semi-dynamic (i.e., insertion-only). Ganjugunte [15] showed that, for pseudo-disks in R², dynamic hitting set in both settings can be solved with an O(1) approximation factor, with amortized update time depending on the time needed to find a point of P contained in a query pseudo-trapezoid (see [15] for details). Dynamic geometric hitting set in the fully dynamic setting (where both points and ranges can be inserted or deleted), as well as dynamic geometric set cover, has not been studied before, to the best of our knowledge.
1.2 Our results
Let (P, R) be a dynamic geometric set cover (resp., hitting set) instance. We are interested in designing efficient data structures that maintain an (approximately) optimal solution for (P, R). This goal can be formalized in various ways, resulting in different variants of the problem. A natural variant is to maintain the number opt, the size of an optimal set cover (resp., hitting set) for (P, R), or an approximation of opt. However, in many applications, maintaining only the optimum is not satisfactory, and one may hope to maintain a “real” set cover (resp., hitting set) for the dynamic instance. Therefore, in this paper, we formulate the problem as follows. We require the data structure to store (implicitly), after each update, a solution for the current problem instance satisfying some quality requirement, such that certain information about the solution can be queried efficiently. For example, one can ask how large the solution is, whether a specific element of R (resp., P) is used in the solution, what the entire solution is, etc. We will make this more precise shortly.
In dynamic settings, it is usually more natural and convenient to consider a multiset solution for the set cover (resp., hitting set) instance. That is, we allow the solution to be a multiset of elements of R (resp., P) that cover all points in P (resp., hit all ranges in R), and the quality of the solution is also evaluated in terms of the multiset cardinality. In the static problems, one can always efficiently remove the duplicates in a multiset solution to obtain an (ordinary) set cover or hitting set of even better quality (i.e., smaller cardinality), hence computing a multiset solution is essentially equivalent to computing an ordinary solution. However, in dynamic settings, the update time has to be sublinear, which in general is not sufficient for detecting and removing duplicates. Therefore, in this paper, we mainly focus on multiset solutions (though some of our data structures can maintain an ordinary set cover or hitting set). Unless explicitly mentioned otherwise, solutions for set cover and hitting set always refer to multiset solutions hereafter.
More precisely, we require a dynamic set cover (resp., hitting set) data structure to store implicitly, after each update, a set cover (resp., a hitting set) for the current instance such that the following queries are supported.
• Size query: report the (multiset) size of the maintained solution.

• Membership query: report, for a given range (resp., a given point), the number of copies of that range (resp., point) contained in the solution.

• Reporting query: report all the elements of the solution.
We require the size query to be answered in O(1) time, a membership query to be answered in O(log |R|) time (resp., O(log |P|) time), and the reporting query to be answered in O(k) time, where k is the size of the maintained solution; this is the best one can expect in the pointer machine model.
We say that a set cover (resp., hitting set) instance is fully dynamic if insertions and deletions on both points and ranges are allowed, and partially dynamic if only the points (resp., ranges) can be inserted and deleted. This paper mainly focuses on the fully dynamic setting, while some results are achieved in the partially dynamic setting. Thus, unless explicitly mentioned otherwise, problems are always considered in the fully dynamic setting.
The main contribution of this paper is two frameworks for designing dynamic geometric set cover and hitting set data structures, leading to efficient data structures in R¹ and R² (see Table 1). The first framework is based on bootstrapping and results in efficient (approximate) dynamic data structures for interval set cover and quadrant/unit-square set cover and hitting set (see the first three rows of Table 1 for detailed bounds). The second framework is based on local modification and results in efficient (approximate) dynamic data structures for interval hitting set and quadrant/unit-square set cover and hitting set in the partially dynamic setting (see the last three rows of Table 1 for detailed bounds).
Framework | Problem | Range | Approx. factor | Update time | Setting
---|---|---|---|---|---
Bootstrapping | SC | Interval | 1+ε | O(n^α/ε) | Fully dynamic
 | SC & HS | Quadrant | O(1) | O(n^(1/2+α)) | Fully dynamic
 | SC & HS | Unit square | O(1) | O(n^(1/2+α)) | Fully dynamic
Local modification | HS | Interval | 1+ε | Õ(1/ε) | Fully dynamic
 | SC & HS | Quadrant | O(1) | Õ(1) | Part. dynamic
 | SC & HS | Unit square | O(1) | Õ(1) | Part. dynamic
Organization.
The rest of the paper is organized as follows. Section 2 gives the preliminaries required for the paper and Section 3 gives an overview of our two frameworks. Our first framework (bootstrapping) and second framework (local modification) for designing efficient dynamic geometric set cover and hitting set data structures are presented in Section 4 and Section 5, respectively. To make the paper more readable, the proofs of the technical lemmas and some details are deferred to the appendix.
2 Preliminaries
In this section, we introduce the basic notions used throughout the paper.
Multi-sets and disjoint union.
A multi-set is a set in which elements can have multiple copies. The multiplicity of an element x in a multi-set A is the number of copies of x in A. For two multi-sets A and B, we use A ⊎ B to denote the disjoint union of A and B, in which the multiplicity of each element x is the sum of the multiplicities of x in A and B.
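In programming terms, this disjoint-union operation is simply multiset addition. As an illustrative sketch (not part of the paper's data structures), Python's `Counter` models a multi-set and its `+` operator implements ⊎:

```python
from collections import Counter

# A multi-set modeled as a Counter mapping each element to its multiplicity.
# Counter addition implements the disjoint union: multiplicities are summed.
A = Counter({"I1": 2, "I2": 1})
B = Counter({"I2": 3, "I3": 1})

union = A + B  # disjoint union of A and B
assert union == Counter({"I1": 2, "I2": 4, "I3": 1})
```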
Basic data structures.
A data structure built on a dataset (e.g., a point set, a range set, or a set cover or hitting set instance) of size n is basic if it can be constructed in Õ(n) time and can be dynamized with Õ(1) update time (with a slight abuse of terminology, we sometimes also use “basic data structure” to denote the dynamized version of such a data structure).
Output-sensitive algorithms.
In some set cover and hitting set problems, if the problem instance is properly stored in some basic data structure, it is possible to compute an (approximate) optimal solution in sub-linear time. An output-sensitive algorithm for a set cover or hitting set problem refers to an algorithm that can compute an (approximate) optimal solution in Õ(out) time (where out is the size of the output solution), using some basic data structure built on the problem instance.
3 An overview of our two frameworks
The basic idea of our first framework is bootstrapping. Namely, we begin with a simple, inefficient dynamic set cover or hitting set data structure (e.g., one that re-computes a solution after each update), and repeatedly use the current data structure to obtain an improved one. The main challenge here is to design the bootstrapping procedure: how to use a given data structure to construct a new data structure with improved update time. We achieve this by using output-sensitive algorithms and carefully partitioning the problem instance into sub-instances.
Our second framework, based on local modification, is much simpler. Namely, we construct a new solution by slightly modifying the previous one after each update, and re-compute a new solution periodically using an output-sensitive algorithm. This framework applies to problems that are stable, in the sense that the optimum of a dynamic instance does not change significantly between consecutive updates.
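The rebuild-every-so-often loop behind this framework can be sketched as follows (all names are illustrative, not the paper's; the concrete patch and rebuild rules for each problem appear in Section 5):

```python
class LocallyModifiedSolution:
    """Sketch of the local-modification framework: patch the solution after
    each update, and recompute it from scratch (via an output-sensitive
    solver) once the accumulated updates could distort the optimum by more
    than an eps fraction of the solution size."""

    def __init__(self, instance, solve, patch, eps):
        self.instance = instance
        self.solve = solve      # output-sensitive solver: instance -> solution
        self.patch = patch      # local fix-up: (solution, update) -> solution
        self.eps = eps
        self.solution = solve(instance)
        self.updates_since_rebuild = 0

    def update(self, op):
        op(self.instance)       # apply the insertion/deletion to the instance
        self.solution = self.patch(self.solution, op)
        self.updates_since_rebuild += 1
        # Stability: each update changes the optimum by at most a constant,
        # so rebuilding every ~eps*|solution| updates keeps the quality loss
        # bounded by a 1+O(eps) factor between rebuilds.
        if self.updates_since_rebuild > self.eps * max(1, len(self.solution)):
            self.solution = self.solve(self.instance)
            self.updates_since_rebuild = 0
```

The amortized cost per update is the patch cost plus the rebuild cost spread over the period, which is where the output-sensitive algorithm pays off.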
4 First framework: Bootstrapping
In this section, we present our first framework for dynamic geometric set cover and hitting set, which is based on bootstrapping and results in data structures with sub-linear update time for dynamic interval set cover and for dynamic quadrant and unit-square set cover (resp., hitting set).
4.1 Warm-up: 1D set cover for intervals
As a warm-up, we first study the 1D problem: dynamic interval set cover. First, we observe that interval set cover admits a simple exact output-sensitive algorithm. Indeed, interval set cover can be solved by the greedy algorithm that repeatedly picks the leftmost uncovered point and covers it using the interval (among those containing the point) with the rightmost right endpoint, and this algorithm can easily be made output-sensitive if we store the points and intervals in binary search trees.
Lemma 1.
Interval set cover admits an exact output-sensitive algorithm.
This algorithm will play an important role in the design of our data structure.
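A concrete (non-output-sensitive) sketch of the greedy algorithm may be helpful. Here sorted arrays play the role of the binary search trees; the linear scan marked below is exactly what a max-augmented search tree removes to reach Õ(1) time per chosen interval:

```python
import bisect

def interval_set_cover(points, intervals):
    """Greedy 1D set cover: repeatedly take the leftmost uncovered point
    and cover it with the rightmost-reaching interval that contains it.
    Returns a list of intervals (pairs (l, r)), or None if some point
    cannot be covered."""
    pts = sorted(points)
    ivs = sorted(intervals)                 # sorted by left endpoint
    best = []                               # best[i]: max right endpoint in ivs[0..i]
    cur = float("-inf")
    for _, r in ivs:
        cur = max(cur, r)
        best.append(cur)
    cover, i = [], 0
    while i < len(pts):
        p = pts[i]
        # last interval whose left endpoint is <= p
        j = bisect.bisect_right(ivs, (p, float("inf"))) - 1
        if j < 0 or best[j] < p:
            return None                     # p lies in no interval
        # among ivs[0..j], pick the one reaching farthest right; this linear
        # scan is what a max-augmented search tree answers in O(log n) time
        k = max(range(j + 1), key=lambda t: ivs[t][1])
        cover.append(ivs[k])
        # all points up to the chosen right endpoint are now covered
        i = bisect.bisect_right(pts, ivs[k][1], lo=i)
    return cover
```

Each iteration of the loop emits one interval of the greedy (optimal) solution, so with search trees the running time is proportional to the output size, up to logarithmic factors.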
4.1.1 Bootstrapping
As mentioned before, our data structure is designed using bootstrapping. Specifically, we prove the following bootstrapping theorem, which is the technical heart of our result. The theorem roughly states that given a dynamic interval set cover data structure, one can obtain another dynamic interval set cover data structure with improved update time.
Theorem 2.
Let 0 < α ≤ 1 be a number. If there exists a μ-approximate dynamic interval set cover data structure D_old with Õ(n^α) amortized update time and Õ(n_0) construction time for some μ ≥ 1, then there exists a (1+ε)μ-approximate dynamic interval set cover data structure D_new with Õ(n^α′/ε) amortized update time and Õ(n_0) construction time for any ε > 0, where α′ = α/(1+α). Here n (resp., n_0) denotes the size of the current (resp., initial) problem instance.
Assuming the existence of D_old as in the theorem, we are going to design the improved data structure D_new. Let (P, I) be a dynamic interval set cover instance, and let ε > 0 be the given approximation parameter. We denote by n (resp., n_0) the size of the current (resp., initial) instance (P, I).
The construction of .
Initially, (P, I) has size n_0. Essentially, our data structure D_new consists of two parts. (At the implementation level, we may need some additional, very simple, support data structures; for simplicity of exposition, we mention them when discussing the implementation details.) The first part is the basic data structure required by the output-sensitive algorithm of Lemma 1. The second part is a family of data structures defined as follows. Let f be a function to be determined shortly. We partition the real line into m = O(n_0/f(n_0)) connected portions (i.e., intervals) J_1, …, J_m such that each portion contains O(f(n_0)) points of P and endpoints of the intervals of I. Define P_i = P ∩ J_i, and define I_i ⊆ I as the sub-collection consisting of the intervals that “partially intersect” J_i, i.e., the intervals I ∈ I with J_i ∩ I ∉ {∅, J_i}. When the instance (P, I) is updated, the partition J_1, …, J_m remains unchanged, but the P_i's and I_i's change along with P and I. We view each (P_i, I_i) as a dynamic interval set cover instance, and let D_i be the D_old data structure built on (P_i, I_i) with approximation factor μ. Thus, D_i maintains a μ-approximate optimal set cover S_i for (P_i, I_i). The second part of D_new consists of the data structures D_1, …, D_m.
Update and reconstruction.
After each operation on (P, I), we update the basic data structure of the first part. Also, we update the data structure D_i if the instance (P_i, I_i) changes due to the operation. Note that an operation on P changes exactly one P_i, and an operation on I changes at most two I_i's (because an interval can belong to at most two I_i's). Thus, we in fact only need to update at most two D_i's. Besides these updates, we also reconstruct the entire data structure D_new periodically (as is the case for many dynamic data structures). Specifically, the first reconstruction of D_new happens after processing f(n_0) operations. The reconstruction is the same as the initial construction of D_new, except that n_0 is replaced with n_1, the size of (P, I) at the time of reconstruction. The second reconstruction then happens after processing f(n_1) operations since the first reconstruction, and so forth.
Maintaining a solution.
We now discuss how to maintain a (1+ε)μ-approximate optimal set cover S_cov for (P, I). Let opt denote the optimum (i.e., the size of an optimal set cover) of the current (P, I). Set δ = n_0/f(n_0). If opt ≤ δ, then the output-sensitive algorithm can compute an optimal set cover for (P, I) in Õ(δ) time. Thus, we simulate the output-sensitive algorithm within that amount of time. If the algorithm successfully computes a solution, we use it as our S_cov. Otherwise, we construct S_cov as follows. For i ∈ {1, …, m}, we say J_i is coverable if there exists I ∈ I such that J_i ⊆ I, and uncoverable otherwise. Let C = {i : J_i is coverable} and C̄ = {1, …, m} \ C. We try to use the intervals in I to “cover” all coverable portions. That is, for each i ∈ C, we find an interval in I that contains J_i, and we denote by I_cov the collection of these intervals. Then we consider the uncoverable portions. If for some i ∈ C̄ the data structure D_i tells us that the current (P_i, I_i) has no set cover, then we immediately make a no-solution decision, i.e., decide that the current (P, I) has no feasible set cover, and continue to the next operation. Otherwise, for every i ∈ C̄, the data structure D_i maintains a μ-approximate optimal set cover S_i for (P_i, I_i). We then define S_cov = I_cov ⊎ (⊎_{i∈C̄} S_i).
Later we will prove that S_cov is always a (1+ε)μ-approximate optimal set cover for (P, I). Before that, let us consider how to store S_cov properly so as to support the size, membership, and reporting queries within the required query times. If S_cov is computed by the output-sensitive algorithm, then its size is at most δ, and we have all its elements in hand. In this case, it is not difficult to build a data structure on S_cov that supports the desired queries. On the other hand, if S_cov is defined as the disjoint union of I_cov and the S_i's, the size of S_cov might be very large, and thus we are not able to explicitly extract all its elements. Fortunately, in this case, each S_i is already maintained in the data structure D_i. Therefore, we actually only need to compute I_cov, C, and C̄; with these in hand, one can already build a data structure that supports the desired queries for S_cov. We defer the detailed discussion to Appendix B.
Correctness.
We now prove the correctness of our data structure . We first show the correctness of the no-solution decision.
Lemma 3.
D_new makes a no-solution decision iff the current (P, I) has no set cover.
Next, we show that the solution S_cov maintained by D_new is truly a (1+ε)μ-approximate optimal set cover for (P, I). If S_cov is computed by the output-sensitive algorithm, then it is an optimal set cover for (P, I). Otherwise, opt > δ, i.e., either opt = ∞ or δ < opt < ∞. If opt = ∞, then the current (P, I) has no set cover and thus D_new makes a no-solution decision by Lemma 3. So assume opt < ∞. In this case, S_cov = I_cov ⊎ (⊎_{i∈C̄} S_i). For each i ∈ C̄, let opt_i be the optimum of the instance (P_i, I_i). Then we have |S_i| ≤ μ · opt_i for all i ∈ C̄. Since |I_cov| ≤ m, we have

|S_cov| ≤ m + μ · Σ_{i∈C̄} opt_i.   (1)

Let I_opt be an optimal set cover for (P, I). We observe that for i ∈ C̄, I_opt ∩ I_i is a set cover for (P_i, I_i), because J_i is uncoverable (so the points in P_i cannot be covered by any interval in I \ I_i). It immediately follows that opt_i ≤ |I_opt ∩ I_i| for all i ∈ C̄. Therefore, we have

Σ_{i∈C̄} opt_i ≤ Σ_{i∈C̄} |I_opt ∩ I_i|.   (2)
The right-hand side of the above inequality can be larger than opt = |I_opt|, as some intervals in I_opt can belong to two I_i's. The following lemma bounds the number of such intervals.
Lemma 4.
There are at most O(m) intervals in I_opt that belong to exactly two I_i's.
Time analysis.
We briefly discuss the update and construction time of D_new; a detailed analysis can be found in Appendix B. Since D_new is reconstructed periodically, it suffices to consider the first period (i.e., the period before the first reconstruction). The construction of D_new can easily be done in Õ(n_0) time. The update time of D_new consists of the time for updating the basic data structure and the D_i's, the time for maintaining the solution, and the time for reconstruction. Since the period consists of f(n_0) operations, the size of each (P_i, I_i) is always bounded by O(f(n_0)) during the period. As argued before, we only need to update at most two D_i's after each operation. Thus, updating the data structures takes Õ(f^α(n_0)) amortized time. Maintaining the solution can be done in Õ(n_0/f(n_0)) time, with a careful implementation. The time for reconstruction is bounded by Õ(n_0); we amortize it over the f(n_0) operations in the period, and the amortized cost is then Õ(n_0/f(n_0)). In total, the amortized update time of D_new (during the first period) is Õ(f^α(n_0) + n_0/f(n_0)). If we set f(n_0) = ⌈n_0^{1/(1+α)}⌉, where α is as defined in Theorem 2, a careful calculation (see Appendix B) shows that the amortized update time becomes Õ(n^α′/ε).
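As a small numerical check of this balancing step (under the choice f(n) = n^{1/(1+α)} as reconstructed above; illustrative only), the two cost terms coincide at the chosen f and both equal n^{α/(1+α)}:

```python
# With f(n) = n^(1/(1+alpha)), the two dominant cost terms f(n)^alpha and
# n/f(n) are both equal to n^(alpha/(1+alpha)) = n^(alpha'), which is where
# the improved exponent alpha' = alpha/(1+alpha) comes from.
def terms(n, alpha):
    f = n ** (1.0 / (1.0 + alpha))
    return f ** alpha, n / f

for alpha in (1.0, 0.5, 0.25):
    a, b = terms(10 ** 6, alpha)
    assert abs(a - b) / b < 1e-9                      # the two terms coincide
    alpha_new = alpha / (1.0 + alpha)
    assert abs(a - (10 ** 6) ** alpha_new) / a < 1e-9  # both equal n^(alpha')
```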
4.1.2 Putting everything together
With the bootstrapping theorem in hand, we are now able to design our dynamic interval set cover data structure. The starting point is a “trivial” data structure, which simply uses the output-sensitive algorithm of Lemma 1 to re-compute an optimal interval set cover after each update. Clearly, the update time of this data structure is Õ(n) and the construction time is Õ(n_0). Thus, there exists a 1-approximate dynamic interval set cover data structure with Õ(n^{α_0}) amortized update time for α_0 = 1 and Õ(n_0) construction time. Define α_i = 1/(i+1) for i ≥ 0. By applying Theorem 2 i times for a constant i (re-parameterizing ε in each application), we see the existence of a (1+ε)-approximate dynamic interval set cover data structure with Õ(n^{α_i}/ε) amortized update time and Õ(n_0) construction time. One can easily verify that the recurrence α_{i+1} = α_i/(1+α_i) indeed gives α_i = 1/(i+1) for all i. Therefore, for any constant α > 0, we have an index i satisfying α_i ≤ α, and hence Õ(n^{α_i}) = Õ(n^α). We finally conclude the following.
Theorem 5.
For a given approximation parameter ε > 0 and any constant α > 0, there exists a (1+ε)-approximate dynamic interval set cover data structure with Õ(n^α/ε) amortized update time and Õ(n_0) construction time.
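The exponent sequence α_i = 1/(i+1) obtained by iterating the recurrence α′ = α/(1+α) from the trivial exponent α_0 = 1 can be verified mechanically (an illustrative check, assuming this reconstructed form of the recurrence):

```python
from fractions import Fraction

# Start from the trivial data structure (update time ~ n^1) and apply the
# bootstrapping recurrence alpha' = alpha/(1+alpha) repeatedly.
alpha = Fraction(1)
for i in range(1, 10):
    alpha = alpha / (1 + alpha)          # one application of the theorem
    assert alpha == Fraction(1, i + 1)   # alpha_i = 1/(i+1)
```

Since 1/(i+1) tends to 0, a constant number of applications drives the exponent below any fixed constant α > 0.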
4.2 2D set cover and hitting set for quadrants and unit squares
In this section, we present our bootstrapping framework for 2D dynamic set cover and hitting set. Our framework works for quadrants and unit squares.
We first show that dynamic unit-square set cover, dynamic unit-square hitting set, and dynamic quadrant hitting set can all be reduced to dynamic quadrant set cover.
Lemma 6.
Suppose there exists a μ-approximate dynamic quadrant set cover data structure with f(n) amortized update time and g(n) construction time, where f is an increasing function. Then there exist O(μ)-approximate dynamic unit-square set cover, dynamic unit-square hitting set, and dynamic quadrant hitting set data structures with Õ(f(n)) amortized update time and Õ(g(n)) construction time.
Now it suffices to consider dynamic quadrant set cover. In order to apply bootstrapping, we need an output-sensitive algorithm for quadrant set cover, analogous to the one in Lemma 1 for intervals. Designing such an algorithm is considerably more difficult than in the 1D case, and we defer it to Section 4.2.2. Before that, let us first discuss the bootstrapping procedure, assuming the existence of a μ-approximate output-sensitive algorithm for quadrant set cover.
4.2.1 Bootstrapping
We prove the following bootstrapping theorem, which is the technical heart of our result.
Theorem 7.
Assume quadrant set cover admits a μ-approximate output-sensitive algorithm for some constant μ ≥ 1.
Then we have the following result.
(*) Let 0 < α ≤ 1 be a number.
If there exists a μ′-approximate dynamic quadrant set cover data structure D_old with Õ(n^α) amortized update time and Õ(n_0) construction time for some μ′ ≥ 1,
then there exists a (1+ε)μμ′-approximate dynamic quadrant set cover data structure D_new with Õ(n^α′/ε) amortized update time and Õ(n_0) construction time for any ε > 0, where α′ = 2α/(1+2α).
Here n (resp., n_0) denotes the size of the current (resp., initial) problem instance.
Assuming the existence of D_old as in the theorem, we are going to design the improved data structure D_new. Let (P, Q) be a dynamic quadrant set cover instance. As before, we denote by n (resp., n_0) the size of the current (resp., initial) instance (P, Q).
The construction of .
Initially, (P, Q) has size n_0. Essentially, our data structure D_new consists of two parts. The first part is the basic data structure required by the μ-approximate output-sensitive algorithm. The second part is a family of data structures defined as follows. Let f be a function to be determined shortly. We use an orthogonal grid to partition the plane into r × r cells, where r = O(n_0/f(n_0)), such that each row (resp., column) of the grid contains O(f(n_0)) points of P and vertices of the quadrants of Q (see Figure 1 for an illustration). Denote by □_{i,j} the cell in the i-th row and j-th column. Define P_{i,j} = P ∩ □_{i,j}. Also, we need to define a sub-collection Q_{i,j} ⊆ Q. Recall that in the 1D case, we defined I_i as the sub-collection of intervals in I that partially intersect the portion J_i. However, for technical reasons, here we cannot simply define Q_{i,j} as the sub-collection of quadrants in Q that partially intersect □_{i,j}. Instead, we define Q_{i,j} as follows. We include in Q_{i,j} all the quadrants in Q whose vertices lie in □_{i,j}. Besides, we also include in Q_{i,j} the following (at most) four special quadrants. We say a quadrant Q left intersects □_{i,j} if Q partially intersects □_{i,j} and contains the left edge of □_{i,j} (see Figure 2 for an illustration); similarly, we define “right intersects”, “top intersects”, and “bottom intersects”. Among a collection of quadrants, the leftmost/rightmost/topmost/bottommost quadrant refers to the quadrant whose vertex is the leftmost/rightmost/topmost/bottommost. We include in Q_{i,j} the rightmost quadrant in Q that left intersects □_{i,j}, the leftmost quadrant in Q that right intersects □_{i,j}, the bottommost quadrant in Q that top intersects □_{i,j}, and the topmost quadrant in Q that bottom intersects □_{i,j} (if these quadrants exist). When the instance (P, Q) is updated, the grid remains unchanged, but the P_{i,j}'s and Q_{i,j}'s change along with P and Q. We view each (P_{i,j}, Q_{i,j}) as a dynamic quadrant set cover instance, and let D_{i,j} be the D_old data structure built on it with approximation factor μ′. The second part of D_new consists of the data structures D_{i,j} for i, j ∈ {1, …, r}.
Update and reconstruction.
After each operation on (P, Q), we update the basic data structure of the first part. Also, if some (P_{i,j}, Q_{i,j}) changes, we update the data structure D_{i,j}. Note that an operation on P changes exactly one P_{i,j}, and an operation on Q may only change the Q_{i,j}'s in one row and one column (specifically, if the vertex of the inserted/deleted quadrant lies in □_{i,j}, then only the Q_{i,j}'s in the i-th row and the j-th column may change). Thus, we in fact only need to update the D_{i,j}'s in one row and one column. Besides these updates, we also reconstruct the entire data structure D_new periodically, where the first reconstruction happens after processing f(n_0) operations. This part is exactly the same as in our 1D data structure.
Maintaining a solution.
We now discuss how to maintain an O(1)-approximate optimal set cover S_cov for (P, Q). Let opt denote the optimum of the current (P, Q). Set δ = r². If opt ≤ δ, then the output-sensitive algorithm can compute a μ-approximate optimal set cover for (P, Q) in Õ(δ) time. Thus, we simulate the output-sensitive algorithm within that amount of time. If the algorithm successfully computes a solution, we use it as our S_cov. Otherwise, we construct S_cov as follows. We say the cell □_{i,j} is coverable if there exists Q ∈ Q that contains □_{i,j}, and uncoverable otherwise. Let C = {(i,j) : □_{i,j} is coverable} and C̄ = {1, …, r}² \ C. We try to use the quadrants in Q to “cover” all coverable cells. That is, for each (i,j) ∈ C, we find a quadrant in Q that contains □_{i,j}, and we denote by Q_cov the set of all these quadrants. Then we consider the uncoverable cells. If for some (i,j) ∈ C̄ the data structure D_{i,j} tells us that the instance (P_{i,j}, Q_{i,j}) has no set cover, then we immediately make a no-solution decision, i.e., decide that the current (P, Q) has no feasible set cover, and continue to the next operation. Otherwise, for each (i,j) ∈ C̄, the data structure D_{i,j} maintains a μ′-approximate optimal set cover S_{i,j} for (P_{i,j}, Q_{i,j}). We then define S_cov = Q_cov ⊎ (⊎_{(i,j)∈C̄} S_{i,j}).
We will see later that S_cov is always an O(1)-approximate optimal set cover for (P, Q). Before that, let us briefly discuss how to store S_cov to support the desired queries. If S_cov is computed by the output-sensitive algorithm, then we have all its elements in hand and can easily store them in a data structure to support the queries. Otherwise, S_cov is defined as the disjoint union of Q_cov and the S_{i,j}'s. In this case, the size and reporting queries can be handled in the same way as in the 1D problem, by taking advantage of the fact that each S_{i,j} is maintained in D_{i,j}. However, the situation for the membership query is more complicated, because now a quadrant in Q may belong to many Q_{i,j}'s. This issue can be handled by collecting all special quadrants in S_cov and building on them a data structure that supports the membership query. We defer the detailed discussion to Appendix C.
Correctness.
We now prove the correctness of our data structure . First, we show that the no-solution decision made by our data structure is correct.
Lemma 8.
D_new makes a no-solution decision iff the current (P, Q) has no set cover.
Next, we show that the solution S_cov maintained by D_new is truly an O(1)-approximate optimal set cover for (P, Q). If S_cov is computed by the output-sensitive algorithm, then it is a μ-approximate optimal set cover for (P, Q). Otherwise, opt > δ, i.e., either opt = ∞ or δ < opt < ∞. If opt = ∞, then (P, Q) has no set cover and D_new makes a no-solution decision by Lemma 8. So assume opt < ∞. In this case, S_cov = Q_cov ⊎ (⊎_{(i,j)∈C̄} S_{i,j}). For each (i,j) ∈ C̄, let opt_{i,j} be the optimum of (P_{i,j}, Q_{i,j}). Then we have |S_{i,j}| ≤ μ′ · opt_{i,j} for all (i,j) ∈ C̄. Since |Q_cov| ≤ r², we have

|S_cov| ≤ r² + μ′ · Σ_{(i,j)∈C̄} opt_{i,j}.   (4)
Let Q_opt be an optimal set cover for (P, Q), and let Q′_opt ⊆ Q_opt consist of the non-special quadrants, i.e., those whose vertices are in an uncoverable cell.
Lemma 9.
We have opt_{i,j} ≤ |{Q ∈ Q′_opt : the vertex of Q is in □_{i,j}}| + 4 for all (i,j) ∈ C̄, and in particular,

Σ_{(i,j)∈C̄} opt_{i,j} ≤ opt + 4|C̄|.   (5)
Time analysis.
We briefly discuss the update and construction time of ; a detailed analysis can be found in Appendix C. It suffices to consider the first period (i.e., the period before the first reconstruction). We first observe the following fact.
Lemma 10.
At any time in the first period, we have Σ_{j=1}^{r} n_{i,j} = O(f(n_0)) for all i and Σ_{i=1}^{r} n_{i,j} = O(f(n_0)) for all j, where n_{i,j} denotes the size of the instance (P_{i,j}, Q_{i,j}).
The above lemma implies that the sum of the sizes of all the (P_{i,j}, Q_{i,j})'s is O(n_0) at any time in the first period. Therefore, constructing D_new can easily be done in Õ(n_0) time. The update time of D_new consists of the time for reconstruction, the time for updating the basic data structure and the D_{i,j}'s, and the time for maintaining the solution. Using almost the same analysis as in the 1D problem, we can show that the reconstruction takes Õ(n_0/f(n_0)) amortized time and that maintaining the solution can be done in Õ(r²) time, with a careful implementation. The time for updating the D_{i,j}'s requires a different analysis. Let n_{i,j} denote the current size of (P_{i,j}, Q_{i,j}). As argued before, we in fact only need to update the data structures in one row and one column (say the i-th row and the j-th column). Hence, updating the data structures takes Õ(Σ_{j′=1}^{r} n_{i,j′}^α + Σ_{i′=1}^{r} n_{i′,j}^α) amortized time. Lemma 10 implies that Σ_{j′} n_{i,j′} = O(f(n_0)) and Σ_{i′} n_{i′,j} = O(f(n_0)). Since 0 < α ≤ 1, by Hölder's inequality and Lemma 10,

Σ_{j′=1}^{r} n_{i,j′}^α ≤ r^{1-α} · (Σ_{j′=1}^{r} n_{i,j′})^α = O(r^{1-α} · f^α(n_0)),

and similarly Σ_{i′=1}^{r} n_{i′,j}^α = O(r^{1-α} · f^α(n_0)). It follows that updating the data structures takes Õ(r^{1-α} · f^α(n_0)) amortized time. In total, the amortized update time of D_new (during the first period) is Õ(r^{1-α} · f^α(n_0) + n_0/f(n_0) + r²). If we set f(n_0) = ⌈n_0^{(1+α)/(1+2α)}⌉, where α is as defined in Theorem 7, a careful calculation (see Appendix C) shows that the amortized update time becomes Õ(n^α′/ε).
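The Hölder step used above, Σ_j n_j^α ≤ r^{1-α} (Σ_j n_j)^α for 0 < α ≤ 1, can be sanity-checked numerically (illustrative only):

```python
import random

# Hölder's inequality step from the row/column update-time bound:
# for nonnegative n_1..n_r and 0 < alpha <= 1,
#   sum(n_j ** alpha) <= r**(1 - alpha) * (sum(n_j)) ** alpha.
random.seed(7)
for _ in range(100):
    r = random.randint(1, 50)
    ns = [random.randint(0, 1000) for _ in range(r)]
    for alpha in (0.25, 0.5, 0.75, 1.0):
        lhs = sum(n ** alpha for n in ns)
        rhs = r ** (1 - alpha) * sum(ns) ** alpha
        assert lhs <= rhs + 1e-9
```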
4.2.2 An output-sensitive quadrant set cover algorithm
We propose an O(1)-approximate output-sensitive algorithm for quadrant set cover, which is needed for applying Theorem 7. Let (P, Q) be a quadrant set cover instance of size n, and let opt be its optimum. Our goal is to compute an O(1)-approximate optimal set cover for (P, Q) in Õ(opt) time, using some basic data structures built on (P, Q).
For simplicity, let us assume that (P, Q) has a set cover; how to handle the no-solution case is discussed in Appendix D.1. There are four types of quadrants in Q: southeast, southwest, northeast, and northwest; we denote by Q_SE, Q_SW, Q_NE, Q_NW the sub-collections of these types of quadrants, respectively. Let U_SE denote the union of the quadrants in Q_SE, and define U_SW, U_NE, U_NW similarly. Since (P, Q) has a set cover, we have P ⊆ U_SE ∪ U_SW ∪ U_NE ∪ U_NW. Therefore, if we can compute O(1)-approximate optimal set covers for the instances (P ∩ U_SE, Q_SE), (P ∩ U_SW, Q_SW), (P ∩ U_NE, Q_NE), and (P ∩ U_NW, Q_NW), then the union of these four set covers is an O(1)-approximate optimal set cover for (P, Q).
With this observation, it now suffices to show how to compute an O(1)-approximate optimal set cover for (P ∩ U_SE, Q_SE) in Õ(opt_SE) time, where opt_SE is the optimum of (P ∩ U_SE, Q_SE). The main challenge is to guarantee the running time and the approximation ratio simultaneously. We begin by introducing some notation. Let Γ denote the boundary of U_SE, which is an orthogonal staircase curve from bottom-left to top-right. For a nonempty subset S ⊆ Q_SE, the portion of Γ covered by the quadrants in S is a connected portion of Γ that contains the bottom-left end of Γ. Define η(S) as the “endpoint” of this portion, i.e., the point on it that is closest to the top-right end of Γ. See Figure 3 for an illustration. If S = ∅, we define η(S) as the bottom-left end of Γ (which is a point whose y-coordinate equals -∞). For a number x, we define succ(x) as the leftmost point in P ∩ U_SE whose x-coordinate is greater than x; we say succ(x) does not exist if no point in P ∩ U_SE has x-coordinate greater than x. For a point a and a collection Q′ of quadrants, we define rm(a, Q′) and tm(a, Q′) as the rightmost and topmost quadrants in Q′ that contain a, respectively. For a quadrant Q, we denote by x(Q) and y(Q) the x- and y-coordinates of the vertex of Q, respectively.
To get some intuition, let us consider a very simple case, where consists only of southeast quadrants. In this case, one can compute an optimal set cover for using a greedy algorithm similar to the 1D interval set cover algorithm: repeatedly pick the leftmost uncovered point in and cover it using the topmost (southeast) quadrant in . Using the notation defined above, we can describe this algorithm as follows. Set and initially, and repeatedly do , , , until does not exist. Eventually, is the set cover we want.
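The southeast-only greedy can be sketched as follows. This is our own illustrative implementation, not the paper's: the function name and the tuple encoding of quadrant vertices are ours, and the single linear merge over the x-sorted quadrants stands in for the binary-search-tree queries used in the actual data structure.

```python
def greedy_southeast_cover(points, quadrants):
    """points: (x, y) tuples; quadrants: vertices (vx, vy) of southeast
    quadrants {(a, b): a >= vx, b <= vy}.  Greedily covers the leftmost
    uncovered point with the topmost quadrant containing it.  Returns the
    chosen cover, or None if some point lies in no quadrant."""
    pts = sorted(points)
    qs = sorted(quadrants)              # by vertex x-coordinate
    cover = []
    covered_up_to = float("-inf")       # max vy over the chosen quadrants
    best = None                         # topmost quadrant with vx <= current x
    j = 0
    for x, y in pts:
        while j < len(qs) and qs[j][0] <= x:
            if best is None or qs[j][1] > best[1]:
                best = qs[j]
            j += 1
        if y <= covered_up_to:
            continue                    # already covered by a chosen quadrant
        if best is None or best[1] < y:
            return None                 # no quadrant contains this point
        cover.append(best)              # topmost quadrant containing (x, y)
        covered_up_to = best[1]
    return cover
```

Since the points are scanned in x-order and every admissible quadrant has its vertex to the left, a later point is covered exactly when its y-coordinate is at most the largest vy chosen so far, which is why a single running maximum suffices.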
Now we try to extend this algorithm to the general case. However, the situation becomes much more complicated, since we may have three other types of quadrants in , which have to be handled carefully to guarantee correctness. But the intuition remains the same: we still construct the solution in a greedy manner. The following procedure describes our algorithm.
-
1.
. . If does not exist, then go to Step 6.
-
2.
. . If exists, then , else go to Step 6.
-
3.
If , then and go to Step 6.
-
4.
If , then and where is the vertex of , otherwise .
-
5.
. . If exists, then and go to Step 3.
-
6.
Output .
The following lemma proves the correctness of our algorithm.
Lemma 11.
covers all points in , and .
The remaining task is to show how to perform our algorithm in time using basic data structures. It is clear that our algorithm terminates in steps, since we add at least one quadrant to in each iteration of the loop Step 3–5 and eventually by Lemma 11. Thus, it suffices to show that each step can be done in time. In every step of our algorithm, all work can be done in constant time except the tasks of computing the point , testing whether and for a given point , computing the quadrants for a given point , and computing for a given number . All these tasks except the computation of can be easily done in time by storing the quadrants in binary search trees. Computing in time is more difficult, and we achieve this by properly using range trees built on both and . The details are presented in Appendix D.2.
Using the above algorithm, we can compute -approximate optimal set covers for , , , and . As argued before, the union of these four set covers, denoted by , is an -approximate optimal set cover for .
Theorem 12.
Quadrant set cover admits an -approximate output-sensitive algorithm.
4.2.3 Putting everything together
With the bootstrapping theorem in hand, we are now able to design our dynamic quadrant set cover data structure. Again, the starting point is a “trivial” data structure that uses the output-sensitive algorithm of Theorem 12 to re-compute an optimal quadrant set cover after each update. Clearly, the update time of this data structure is and the construction time is . Let be the approximation ratio of the output-sensitive algorithm. The trivial data structure implies the existence of a -approximate dynamic quadrant set cover data structure with amortized update time for and construction time. Define for . By applying Theorem 7 times for a constant , we obtain a -approximate dynamic quadrant set cover data structure with amortized update time and construction time. One can easily verify that for all . Therefore, for any constant , we have an index satisfying and hence . Setting to be any constant, we finally conclude the following.
Theorem 13.
For any constant , there exists an -approximate dynamic quadrant set cover data structure with amortized update time and construction time.
By the reduction of Lemma 6, we have the following corollary.
Corollary 14.
For any constant , there exist -approximate dynamic unit-square set cover, dynamic unit-square hitting set, and dynamic quadrant hitting set data structures with amortized update time and construction time.
5 Second framework: Local modification
In this section, we give our second framework for dynamic geometric set cover and hitting set, which results in poly-logarithmic update time data structures for dynamic interval hitting set, as well as dynamic quadrant and unit-square set cover (resp., hitting set) in the partially dynamic setting. This framework applies to stable dynamic geometric set cover and hitting set problems, in which each operation cannot change the optimum of the problem instance significantly. One can easily see that any dynamic set cover (resp., hitting set) problem in the partially dynamic setting is stable. Interestingly, we will see that the dynamic interval hitting set problem, even in the fully dynamic setting, is stable.
The basic idea behind this framework is quite simple, and we call it local modification. Namely, at the beginning, we compute a good solution for the problem instance using an output-sensitive algorithm, and after each operation, we slightly modify , by adding/removing a constant number of elements, to guarantee that is a feasible solution for the current instance. After processing many operations, we re-compute a new and repeat this procedure. The stability of the problem can guarantee that is always a good approximation for the optimal solution.
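The control flow of local modification can be sketched generically as follows. All names here are illustrative, not from the paper: `recompute` stands for the output-sensitive algorithm, `patch` for the constant-size local modification, and the rebuild threshold `eps * opt_at_rebuild` is a stand-in for the exact threshold used in the analysis (the precise constants are set in the proofs).

```python
class LocalModification:
    """Minimal sketch of the second framework: keep a solution, patch it
    locally after each operation, and periodically rebuild from scratch."""

    def __init__(self, instance, recompute, patch, eps=0.5):
        self.instance, self.recompute, self.patch = instance, recompute, patch
        self.eps = eps
        self.solution = recompute(instance)      # initial solution
        self.counter = 0                         # operations since last rebuild
        self.opt_at_rebuild = max(len(self.solution), 1)

    def update(self, op):
        op(self.instance)                        # apply the operation itself
        self.counter += 1
        if self.counter >= self.eps * self.opt_at_rebuild:
            # enough operations have accumulated: rebuild from scratch
            self.solution = self.recompute(self.instance)
            self.counter = 0
            self.opt_at_rebuild = max(len(self.solution), 1)
        else:
            # otherwise, modify the solution by O(1) elements
            self.patch(self.solution, self.instance, op)
```

Stability is what makes this correct: between rebuilds the optimum cannot drift far from `opt_at_rebuild`, so a solution that grows by at most one element per operation stays a good approximation.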
Next, we formally define the notion of stability. The quasi-optimum of a set cover instance is the size of an optimal set cover for , where consists of the points that can be covered by . The quasi-optimum of a hitting set instance is defined in a similar way. For a dynamic set cover (resp., hitting set) instance , we use to denote the quasi-optimum of at the time (i.e., after the -th operation).
Definition 15.
A dynamic set cover (resp., hitting set) instance is stable if for all . A dynamic set cover (resp., hitting set) problem is stable if any instance of the problem is stable.
Lemma 16.
Any dynamic set cover (resp., hitting set) problem in the partially dynamic setting is stable. The dynamic interval hitting set problem is stable.
5.1 Dynamic interval hitting set
We show how to use the idea of local modification to obtain a -approximate dynamic interval hitting set data structure with amortized update time.
Similarly to interval set cover, the interval hitting set problem can also be solved using a simple greedy algorithm, which can be performed in time if we properly store the points and intervals in basic data structures (e.g., binary search trees). We omit the proof of the following lemma, as it is almost the same as the proof of Lemma 1.
Lemma 17.
The interval hitting set problem admits an exact output-sensitive algorithm.
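A minimal Python sketch of the greedy algorithm behind this lemma might look as follows; the function name and tuple encodings are ours, and the `bisect` search over a sorted point list stands in for the binary search trees of the basic data structures. Intervals are processed by right endpoint, and an interval that is not yet hit is hit by the rightmost available point inside it.

```python
import bisect

def greedy_interval_hitting(points, intervals):
    """Exact greedy hitting set for intervals on the line.
    points: numbers; intervals: (l, r) pairs.  Returns a minimum hitting
    set drawn from points, or None if some interval contains no point."""
    pts = sorted(points)
    hitting, last = [], None
    for l, r in sorted(intervals, key=lambda iv: iv[1]):
        if last is not None and l <= last <= r:
            continue                    # already hit by the last chosen point
        i = bisect.bisect_right(pts, r) - 1   # rightmost point <= r
        if i < 0 or pts[i] < l:
            return None                 # no point of the set hits [l, r]
        last = pts[i]
        hitting.append(last)
    return hitting
```

Because intervals are scanned in order of right endpoint, the chosen points form an increasing sequence, so checking only the last chosen point suffices.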
Let be a dynamic interval hitting set instance. We denote by (resp., ) the size of the current (resp., initial) , and by the optimum of the current .
The data structure.
Our data structure basically consists of three parts. The first part is the basic data structure built on required for the output-sensitive algorithm. The second part is the following dynamic data structure which can tell us whether the current instance has a hitting set.
Lemma 18.
There exists a dynamic data structure built on with update time and construction time that can indicate whether the current has a hitting set or not.
The third part consists of the following two basic data structures and .
Lemma 19.
One can store in a basic data structure such that given a point , the rightmost (resp., leftmost) point in to the left (resp., right) of can be reported in time (if they exist).
Lemma 20.
One can store in a basic data structure such that given an interval , a point that is contained in can be reported in time (if it exists).
Maintaining a solution.
Let be the initial , and define as the sub-collection consisting of all intervals that are hit by . In the construction of , after building the data structures and , we compute an optimal hitting set for . Set and initially. The data structure handles the operations as follows. After each operation, first increment by 1 and query the data structure to see whether the current has a hitting set. Whenever and the current has a hitting set, we re-compute an optimal hitting set for the current using the output-sensitive algorithm and reset and . Otherwise, we apply local modification on (i.e., “slightly” modify ) as follows.
-
•
If the operation inserts a point to , then include in . If the operation deletes a point from , then remove from (if ) and include in the rightmost point in to the left of and the leftmost point in to the right of (if they exist), which can be found using the data structure of Lemma 19.
-
•
If the operation inserts an interval to , then include in an arbitrary point in that hits (if it exists), which can be found using the data structure of Lemma 20. If the operation deletes an interval from , keep unchanged.
Note that we do local modification on regardless of whether the current has a hitting set. After this, our data structure uses as the solution for the current , if the current has a hitting set. In order to support the desired queries for , we can simply store in a (dynamic) binary search tree. When we re-compute , we re-build the binary search tree that stores . For each local modification on , we perform the corresponding insertions/deletions on the binary search tree according to the change of .
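The two bullet rules above can be sketched as a single patch routine. The encoding is ours: `P` is the current point set kept as a sorted list (a stand-in for the structures of Lemmas 19 and 20), `H` is the maintained hitting set, and operations are tagged tuples.

```python
import bisect

def local_modify(H, P, op):
    """Apply one local modification to the hitting set H.
    P: sorted list of current points (updated in place);
    op: ('+p', x) point insert, ('-p', x) point delete,
        ('+i', (l, r)) interval insert, ('-i', (l, r)) interval delete."""
    kind, arg = op
    if kind == '+p':
        bisect.insort(P, arg)
        H.add(arg)                      # a newly inserted point joins H
    elif kind == '-p':
        i = bisect.bisect_left(P, arg)
        del P[i]
        H.discard(arg)
        if i > 0:
            H.add(P[i - 1])             # rightmost point to the left of arg
        if i < len(P):
            H.add(P[i])                 # leftmost point to the right of arg
    elif kind == '+i':
        l, r = arg
        j = bisect.bisect_left(P, l)    # some point of P hitting [l, r]
        if j < len(P) and P[j] <= r:
            H.add(P[j])
    # '-i': interval deletions leave H unchanged
```

Each branch adds at most two elements to H, matching the fact that the size of H grows by a constant per operation.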
Correctness.
To see the correctness of our data structure, we have to show that is always a -approximate optimal hitting set for the current , whenever has a hitting set. To this end, we observe some simple facts. First, the counter is exactly the number of operations processed after the last re-computation of , and the number is the optimum at the last re-computation of . Second, after each local modification, the size of either increases by 1 or stays unchanged. Based on these two facts and the stability of dynamic interval hitting set (Lemma 16), we can easily bound the size of .
Lemma 21.
We have at any time.
With the above lemma in hand, it now suffices to show that is a feasible hitting set.
Lemma 22.
Whenever the current has a hitting set, is a hitting set for .
Time analysis.
The construction time of is clearly . Specifically, when constructing , we need to build the data structures , all of which can be built in time. Also, we need to compute an optimal hitting set for using the output-sensitive algorithm, which can also be done in time. We show that the amortized update time of is . After each operation, we need to update the data structures , which takes time. If we do local modification on , then it takes time. Indeed, in each local modification, there are elements to be added to or removed from . These elements can be found in time using the data structures and . After these elements are found, we need to modify the binary search tree that stores for supporting the desired queries, which can also be done in time. In the case where we re-compute a new , it takes time. However, we can amortize this time over the operations in between the last re-computation of and the current one. Let be the quasi-optimum of at the last re-computation. Suppose there are operations in between the two re-computations. Then . By the stability of dynamic interval hitting set (Lemma 16), we have , because is also the quasi-optimum of the current . It follows that
Thus, the amortized time for the current re-computation is . We conclude the following.
Theorem 23.
For a given approximation factor , there is a -approximate dynamic interval hitting set data structure with amortized update time and construction time.
5.2 Dynamic set cover and hitting set in the partially dynamic settings
The idea of local modification can also be applied to dynamic geometric set cover (resp., hitting set) problems in the partially dynamic setting (which are stable by Lemma 16), as long as we have an output-sensitive algorithm for the problem and an efficient dynamic data structure that can indicate whether the current instance has a feasible solution or not.
As an example, let us consider dynamic quadrant set cover in the partially dynamic setting. Let be a dynamic quadrant set cover instance with only point updates. As before, we denote by (resp., ) the size of the current (resp., initial) , and by the optimum of the current .
The data structure.
Our data structure consists of three parts. The first part is the basic data structure required for performing the output-sensitive algorithm of Theorem 12. The second part is the following dynamic data structure which can tell us whether the current instance has a set cover.
Lemma 24.
There exists a dynamic data structure built on with update time and construction time that can indicate whether the current has a set cover or not.
The third part is a simple (static) data structure built on that can report in time, for a given point , a quadrant in that contains (if it exists); this can be done using a standard orthogonal range-search data structure, e.g., range tree, whose construction time is .
Maintaining a solution.
Let be the initial , and define as the subset consisting of the points covered by . At the beginning, we compute a -approximate optimal set cover for , where is the approximation ratio of the output-sensitive algorithm. This can be done as follows. We first compute by testing whether each point in is covered by some range in using the data structure , and then build on the basic data structure required for the output-sensitive algorithm and perform the output-sensitive algorithm on . Set and . After each operation, first increment by 1 and check, using the data structure , whether the current has a set cover. Whenever and the current has a set cover, we re-compute a -approximate optimal set cover for the current using the output-sensitive algorithm, and reset and . Otherwise, we apply local modification on as follows. If the operation inserts a point to , then include in an arbitrary range in that contains (if it exists), which can be found using the data structure . If the operation deletes a point from , then keep unchanged. After this, our data structure uses as the solution for the current . As in the last section, we can store in a (dynamic) binary search tree to support the desired queries (by fixing some order on the family of quadrants).
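The patch step for this partially dynamic setting can be sketched as follows, restricted to southeast quadrants for concreteness. The names are ours, and the linear scan is a stand-in for the range-tree query described in the text.

```python
def se_contains(q, p):
    """Southeast quadrant with vertex q = (vx, vy) contains p = (x, y)."""
    return p[0] >= q[0] and p[1] <= q[1]

def patch_cover(C, quadrants, op):
    """Local modification for point updates only.
    C: maintained set cover (set of quadrant vertices);
    op: ('+p', (x, y)) point insert or ('-p', (x, y)) point delete."""
    kind, p = op
    if kind == '+p':
        for q in quadrants:             # stand-in for a range-tree lookup
            if se_contains(q, p):
                C.add(q)                # cover the new point with any containing quadrant
                break
    # '-p': point deletions leave C unchanged
```

The asymmetry mirrors the text: insertions may grow the cover by one range, while deletions never shrink it before the next rebuild.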
Correctness.
Using the same argument as in the last section, we can show that at any time. Furthermore, it is also easy to see that is always a feasible set cover.
Lemma 25.
Whenever the current has a set cover, is a set cover for .
Time analysis.
Using the same analysis as in the last section, we can show that the construction time of the data structure is , and the amortized update time is . Setting to be any constant, we conclude the following.
Theorem 26.
There exists an -approximate partially dynamic quadrant set cover data structure with amortized update time and construction time.
Using the reduction of Lemma 6, we also have the following result. Note that although the reduction of Lemma 6 is described for the fully dynamic setting, it also applies to reduce dynamic unit-square set cover (resp., hitting set) and dynamic quadrant hitting set in the partially dynamic setting to dynamic quadrant set cover in the partially dynamic setting.
Corollary 27.
There exist -approximate partially dynamic unit-square set cover, unit-square hitting set, and quadrant hitting set data structures with amortized update time and construction time.
Appendix A Missing proofs
In this section, we give the proofs of the technical lemmas in Section 4.1 and Section 4.2, which are missing in the main body of the paper.
A.1 Proof of Lemma 1
Consider an interval set cover instance of size .
Lemma 28.
One can store in a basic data structure that can report in time, for a query point , the leftmost point in that is to the right of (if it exists).
Proof.
We simply store in a binary search tree. Given a query point , one can find the leftmost point in to the right of by searching in the tree in time. Clearly, the tree can be built in time and dynamized with update time, and hence it is basic. ∎
Lemma 29.
One can store in a basic data structure that can report in time, for a query point , the interval in with the rightmost right endpoint that contains (if it exists).
Proof.
We store in a binary search tree where the key of each interval is its left endpoint. We augment each node with an additional field which stores the interval in the subtree rooted at that has the rightmost right endpoint. Given a query point , we first look for the interval with the rightmost right endpoint among the intervals in whose key is smaller than or equal to . With the augmented fields, this interval can be found in time. If this interval contains , then we report it; otherwise, no interval in contains . Clearly, can be built in time and dynamized with update time, and hence it is basic. ∎
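A static version of this structure can be sketched with a sorted array and prefix maxima, which play the role of the augmented subtree fields (the dynamized tree of the lemma additionally supports updates). The function names are ours.

```python
import bisect

def build_rightmost(intervals):
    """Sort intervals by left endpoint and record, for every prefix,
    the interval with the rightmost right endpoint seen so far."""
    ivs = sorted(intervals)
    lefts = [l for l, _ in ivs]
    best, cur = [], None
    for iv in ivs:
        if cur is None or iv[1] > cur[1]:
            cur = iv
        best.append(cur)                # prefix maximum by right endpoint
    return lefts, best

def query_rightmost(struct, q):
    """Interval containing q with the rightmost right endpoint, or None."""
    lefts, best = struct
    i = bisect.bisect_right(lefts, q) - 1   # rightmost left endpoint <= q
    if i < 0:
        return None
    iv = best[i]
    return iv if iv[1] >= q else None       # report only if it contains q
```

As in the proof, if the candidate interval does not contain the query point, no interval does, since every other interval with left endpoint at most q has a smaller right endpoint.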
Recall the greedy algorithm for the static interval set cover problem. The algorithm computes an optimal set cover for by repeatedly picking the leftmost uncovered point and covering it with the interval whose right endpoint is rightmost. Formally, the procedure is presented below.
-
1.
and .
-
2.
Find the leftmost point to the right of .
-
3.
, where is the interval in covering whose right endpoint is rightmost.
-
4.
the right endpoint of , then go to Step 2.
With the data structures of Lemma 28 and Lemma 29 in hand, the greedy algorithm can be performed in time, where is the optimum of . Indeed, the data structure of Lemma 28 can be used to implement Step 2 and the data structure of Lemma 29 can be used to implement Step 3, both in time. Since the algorithm terminates after iterations, the total time cost is .
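Steps 1–4 can be sketched directly in Python; the names are ours, and the linear scan in Step 3 is a stand-in for the Lemma 29 structure (which answers that query in logarithmic time).

```python
def greedy_interval_set_cover(points, intervals):
    """Exact greedy interval set cover.
    points: numbers; intervals: (l, r) pairs.  Returns a minimum cover,
    or None if some point lies in no interval."""
    pts = sorted(points)
    cover, x, i = [], float("-inf"), 0
    while True:
        while i < len(pts) and pts[i] <= x:     # Step 2: leftmost point right of x
            i += 1
        if i == len(pts):
            break
        p = pts[i]
        # Step 3: interval containing p with the rightmost right endpoint
        cand = max((iv for iv in intervals if iv[0] <= p <= iv[1]),
                   key=lambda iv: iv[1], default=None)
        if cand is None:
            return None                         # p is uncovered by every interval
        cover.append(cand)
        x = cand[1]                             # Step 4: covered up to this endpoint
    return cover
```

Each loop iteration adds one interval to the cover, so the number of iterations equals the size of the output, matching the output-sensitive bound.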
A.2 Proof of Lemma 3
Our algorithm makes a no-solution decision iff has no set cover for some . If has a set cover for all , then clearly has a set cover, because the portions for are coverable. If has no set cover for some , then has no set cover, because is uncoverable and thus the points in can only be covered by the intervals in . Therefore, the no-solution decision of our data structure is correct.
A.3 Proof of Lemma 4
Suppose the portions are sorted from left to right. Let be the separation point of and . Observe that an interval belongs to exactly two ’s only if contains one of the separation points . We claim that for each , at most two intervals in contain . Suppose for contradiction that three intervals contain . Without loss of generality, assume that (resp., ) has the leftmost left endpoint (resp., the rightmost right endpoint) among . Then one can easily see that . Therefore, is also a set cover for , contradicting the optimality of . Thus, at most two intervals in contain . It follows that there are at most intervals in that contain some separation point, and only these intervals can belong to exactly two ’s, which proves the lemma.
A.4 Proof of Lemma 6
We first observe that unit-square set cover and hitting set are equivalent. A unit square contains a point iff the unit square centered at contains the center of . Therefore, given a unit-square set cover instance , if we replace each point with a unit square centered at and replace each unit square with the center of , then is reduced to an equivalent unit-square hitting set instance . In the same way, one can also reduce a unit-square hitting set instance to an equivalent unit-square set cover instance. Thus, we only need to consider unit-square set cover.
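The containment symmetry underlying this equivalence is easy to state in code. The following sketch uses our own names; squares are axis-aligned with side length 1 and are given by their centers.

```python
def unit_square_contains(c, p):
    """Axis-aligned unit square centered at c contains point p
    (closed boundary): both coordinate offsets are at most 1/2."""
    return abs(p[0] - c[0]) <= 0.5 and abs(p[1] - c[1]) <= 0.5

def cover_to_hitting(points, centers):
    """Reduce a unit-square set cover instance (points to cover, squares
    given by centers) to the equivalent hitting set instance: the old
    centers become the candidate points, and the old points become the
    centers of the squares to hit."""
    return list(centers), [tuple(p) for p in points]
```

Since `unit_square_contains(c, p)` and `unit_square_contains(p, c)` test the same absolute differences, a set of squares covers all points iff the corresponding centers hit all reflected squares, which is exactly the reduction described above.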
In order to reduce dynamic unit-square set cover to dynamic quadrant set cover, we apply a grid technique. To this end, let us first consider a static unit-square set cover instance . We build a grid on the plane consisting of square cells of side-length 1. For each cell of the grid, we define and . We call a cell nonempty if or , and truly nonempty if . Let be the set of all truly nonempty cells. It is clear that has a set cover iff has a set cover for all , since the points in can only be covered using the unit squares in . Let be a -approximate optimal set cover for . We claim that is a -approximate optimal set cover for . Consider an optimal set cover for . We have , because is a set cover for . On the other hand, we have , since any unit square can intersect at most four cells of . It follows that
In this way, we reduce the instance to the instances for . It seems that solving each is still a unit-square set cover problem. However, it is in fact a quadrant set cover problem. Indeed, a cell is a square of side-length 1, hence the intersection of any unit square and is equal to the intersection of a quadrant and , where is the quadrant obtained by removing the two edges of outside . Since the points in are all contained in , the quadrant covers the same set of points in as the unit square . Therefore, is equivalent to the quadrant set cover instance .
The above observation gives us a reduction from dynamic unit-square set cover to dynamic quadrant set cover. Suppose we have a -approximate dynamic quadrant set cover data structure with amortized update time and construction time, where is an increasing function. We are going to design a -approximate dynamic unit-square set cover data structure . Let be a dynamic unit-square set cover instance. When constructing , we build the grid described above and compute for all nonempty cells. The grid is always fixed, while the instances change along with the change of and . As argued before, we can regard each as a dynamic quadrant set cover instance. We then construct a data structure for each , which is the dynamic quadrant set cover data structure built on . If has a set cover, then maintains a -approximate optimal set cover for , which we denote by . Furthermore, we also need two data structures to maintain the nonempty and truly nonempty cells of , respectively. We store the nonempty cells in a (dynamic) binary search tree (by fixing some total order of the cells of ). Each node has a pointer pointing to the data structure where is the cell corresponding to . Similarly, we store all the truly nonempty cells in a (dynamic) binary search tree , with a pointer at each node pointing to the data structure for the corresponding cell . The data structures and the binary search trees form our dynamic unit-square set cover data structure . Besides this, also maintains two numbers and , where is the number of the truly nonempty cells such that has no set cover and is the sum of over all truly nonempty cells such that has a set cover. The construction time of is . Indeed, finding the nonempty (and truly nonempty) cells and building the binary search trees can be easily done in time. Furthermore, the sum of the sizes of all is , since each point in lies in exactly one cell and each unit-square in intersects at most four cells. 
Therefore, computing the instances can be done in time. The time for building each data structure is near-linear in the size of , thus building all takes time.
Next, we consider how to handle updates on . For an update on , there are (at most) four cells for which changes, and we call them involved cells. For each involved cell , we do the following. First, if is previously empty (resp., not truly nonempty) and becomes nonempty (resp., truly nonempty) after the update, we insert a new node to the binary search tree (resp., ) corresponding to . Conversely, if is previously nonempty (resp., truly nonempty) and becomes empty (resp., not truly nonempty) after the update, we delete the node corresponding to from (resp., ). After this, we need to update the data structure . If is previously empty and becomes nonempty after the update, we create a new data structure for , otherwise we find the data structure (which can be done by searching for the node corresponding to in and using the pointer at ) and update it. We observe that all the above work can be done in amortized time. Indeed, updating the binary search trees and takes logarithmic time. By assumption, the amortized update time of is , where is the size of the current , which is smaller than or equal to as is an increasing function. Since there are at most four involved cells, the amortized update time of is .
At any point, if , then directly decides that the current has no set cover. Otherwise, uses as the set cover solution for the current , where is the set of the truly nonempty cells. By our previous observation, is a -approximate optimal set cover for . To support the desired queries for is quite easy. First, the size of is just the number maintained by , hence the size query for can be answered in time. To answer a membership query, let be the query element. We want to know the multiplicity of in . There are at most four cells of that intersect . For each such cell , we search in the binary search tree . If we cannot find a node of corresponding to , then . Otherwise, the node corresponding to gives us a pointer to the data structure , which can report the multiplicity of in in time. The sum of all these multiplicities is just the multiplicity of in . The time cost for answering the membership query is , since the size of is . Note that , since for all (for otherwise is not a set cover for ). Thus, a membership query takes time. To answer the reporting query, we do a traversal in . For each node , we use the data structure to report the elements in in time, where is the truly nonempty cell corresponding to . The time cost is . Again, since , the reporting query can be answered in time. To summarize, is a -approximate dynamic unit-square set cover data structure with amortized update time and construction time.
Finally, we show how to reduce dynamic quadrant hitting set to dynamic quadrant set cover. Let us first consider a static quadrant hitting set instance . There are four types of quadrants in , southeast, southwest, northeast, and northwest; we denote by the sub-collections consisting of these types of quadrants, respectively. Let be -approximate optimal hitting sets for the instances , respectively. We claim that is a -approximate hitting set for . Indeed, we have , where is the optimum of and is the optimum of . Similarly, we can show that for all . Thus, . Therefore, if we can solve dynamic quadrant hitting set for a single type of quadrants with an approximation factor , we are able to solve the general dynamic quadrant hitting set problem with an approximation factor .
It is easy to see that dynamic quadrant hitting set for a single type of quadrants is equivalent to dynamic quadrant set cover for a single type of quadrants. To see this, let us consider southeast quadrants. Note that a point is contained in a southeast quadrant iff the northwest quadrant whose vertex is (denoted by ) contains the vertex of . Therefore, given a quadrant hitting set instance where only contains southeast quadrants, if we replace each point with the northwest quadrant and replace each southeast quadrant with its vertex , then is reduced to an equivalent quadrant set cover instance . To summarize, if we can solve dynamic quadrant set cover with an approximation factor , then we can also solve dynamic quadrant hitting set for a single type of quadrants with an approximation factor and in turn solve the general dynamic quadrant hitting set problem with an approximation factor . This completes the proof.
A.5 Proof of Lemma 8
Our algorithm makes a no-solution decision iff has no set cover for some . If has a set cover for all , then clearly has a set cover, because the cells for are coverable. Suppose has no set cover for some . Then there exists a point that is not covered by any quadrant in . We claim that is not covered by any quadrant in . Consider a quadrant . If does not intersect , then . Otherwise, must partially intersect , because is uncoverable. If the vertex of lies in , then and thus . The remaining case is that partially intersects and contains one edge of . Without loss of generality, assume left intersects . The rightmost quadrant that left intersects is contained in . Since and , we have . It follows that is not covered by and hence has no set cover. Therefore, the no-solution decision of our data structure is correct.
A.6 Proof of Lemma 9
Fix . Let consist of all quadrants in that intersect . Then covers all points in , because the points in can only be covered by quadrants that intersect . Since is uncoverable by , all quadrants in must partially intersect . It follows that a quadrant in either has its vertex in or contains one edge of . We claim that if a point is covered by , then is covered by either or a special quadrant in . Let be a quadrant such that . If , we are done. Otherwise, must contain one edge of , say the left edge. In this case, should contain a special quadrant , which is the rightmost quadrant that left intersects . It is clear that , which implies that . From this claim, it directly follows that all points in are covered by , where consists of the special quadrants. Since and covers , we have . Note that and . So we have
Equation 5 then follows from the equation
A.7 Proof of Lemma 10
It suffices to show for all at any time in the first period. Fix . Initially, the -th row of the grid contains points in and vertices of the quadrants in . Since the period only consists of operations, the -th row always contains points in and vertices of the quadrants in during the entire period. Thus, at any time in the period, and the total number of the non-special quadrants in is bounded by . The number of the special quadrants in is clearly , since each contains at most four special quadrants. Thus, the equation holds at any time in the period.
A.8 Proof of Lemma 11
We first prove an invariant of our algorithm: whenever is updated in the algorithm, the following two properties hold.
1. is set to be the leftmost point in that is not covered by .
2. Any point in above is not covered by .
The first update of happens in Step 2. At this time, , , and is set to be , i.e., the leftmost point in whose -coordinate is larger than . In order to prove the invariant, we only need to show that (1) any point in strictly to the left of is covered by and (2) any point in above (including itself) is not covered by . Let be a point strictly to the left of . First, the -coordinate of is at most , since is the leftmost point in whose -coordinate is larger than . Therefore, if is to the right of , then . Now assume is strictly to the left of . Because is on the boundary of , must be below . Thus, . We see that covers . Let be a point above . Then the -coordinate of is greater than , which implies . Furthermore, must be strictly to the right of . Indeed, if is to the left of , then is below (as argued above) and hence the -coordinate of is at most , resulting in a contradiction. Now we see is strictly above and strictly to the right of . Note that is on the boundary of , which implies and in particular . So is not covered by . The invariant holds when we update in Step 2.
In the rest of the procedure, is only updated in Step 5. Note that Steps 3–5 form a loop in which we include a constant number of quadrants to cover ; see Figure 4. It suffices to show that if the two properties hold at the beginning of an iteration of the loop (before Step 3), then they also hold after we update in Step 5 of that iteration. Suppose we are at the beginning of an iteration, and the two properties hold. If , then we go directly from Step 3 to Step 6 and the algorithm terminates. Otherwise, we go to Step 4. Here we need to distinguish two cases: and .
[Case 1] We first consider the case . In this case, we only include in a new quadrant . Note that after including , covers all the points in whose -coordinates are at most . To see this, consider a point whose -coordinate is at most . If is strictly to the left of , then is covered by even before we include , since is the leftmost point in that is not covered by at the beginning of this iteration (by our assumption). Otherwise, is to the right of and is hence covered by . On the other hand, we also notice that after including , does not cover any of the points in whose -coordinates are greater than . To see this, consider a point whose -coordinate is greater than . Then and is strictly above (since ). By our assumption, at the beginning of this iteration (when is not included in ), does not cover any point above , and in particular does not cover . Since , we see that, after we include in , the new still does not cover . To summarize, the new contains all points in whose -coordinates are at most and no points in whose -coordinates are greater than . Therefore, when we set to be in Step 5, is the leftmost point in that is not covered by and any point in above is not covered by .
[Case 2] Next, we consider the case . In this case, we include in three new quadrants: , , and where is the vertex of . We first show that after including these three new quadrants, covers all the points in whose -coordinates are at most . Consider a point whose -coordinate is at most . If the -coordinate of is at least , then . Otherwise, is strictly to the left of (because ). Thus, if is above , then . It now suffices to consider the case when is strictly below . In this case, is strictly below , since is the vertex of . If is to the right of , then . If is strictly to the left of , then is covered by even before we include the three new quadrants, since is the leftmost point in that is not covered by at the beginning of this iteration (by our assumption). In any case, is covered by . On the other hand, we notice that after including these three new quadrants, does not cover any points in whose -coordinates are greater than . To see this, consider a point whose -coordinate is greater than . We first establish some obvious facts. Since the NW quadrant and the SE quadrant both contain , we have . It follows that , by the definition of . Therefore, is strictly above , since the -coordinate of is at most . By our assumption, at the beginning of this iteration (when the three new quadrants are not included in ), does not cover any point above , and in particular does not cover . Also, the fact implies . To see , let be a quadrant that contains (such a quadrant exists as ). Then . By the definition of , we have and thus , which implies . Finally, it is clear that . So after we include the three new quadrants in , the new still does not cover . To summarize, the new contains all points in whose -coordinates are at most and no points in whose -coordinates are greater than . Therefore, when we set to be in Step 5, is the leftmost point in that is not covered by and any point in above is not covered by .
Combining the discussions for the two cases, the invariant is proved. With the invariant in hand, we now prove the lemma. For convenience, we denote by the point in the -th iteration (before the update of Step 5) of the loop Step 3–5. The invariant of our algorithm guarantees that is the leftmost point in that is not covered by at the beginning of the -th iteration. Furthermore, in the -th iteration, we always add the quadrant to , hence is covered by at the end of the -th iteration. This implies that our algorithm always terminates, because covers at least one more point in in each iteration.
We first show that (the eventual) covers all points in . If we go to Step 6 from Step 1, then and nothing needs to be proved. If we go to Step 6 from Step 2, we know that the -coordinates of all points in are at most . Consider a point . If is to the right of , then , as the -coordinate of is at most . If is to the left of , then , because any point in to the left of is covered by . Thus, is covered by , implying that all points in are covered by . The next case is that we go to Step 6 from Step 3. By the invariant, we know that is the leftmost point in that is not covered by just before Step 3. When we include the two quadrants and in , any point to the right of is covered by . Thus, all points in are covered by when we go from Step 3 to Step 6. The remaining case is that we go to Step 6 from Step 5. This case happens only when does not exist, or equivalently, the -coordinate of every point in is at most . Recall that when proving the invariant, we showed that after we include in in Step 5, covers all the points in whose -coordinates are at most . Hence, all points in are covered by when we go from Step 5 to Step 6.
We then show that . To this end, we notice that in each iteration of the loop Step 3–5, we include in a constant number of quadrants. Suppose we do iterations in total during the algorithm. It suffices to show that . We claim that any quadrant contains at most one of . First, we observe that any quadrant in may only contain . Indeed, if for some , then in the -th iteration, the algorithm goes from Step 3 to Step 6 and terminates. Next, we show that any quadrant in does not contain (or equivalently, ) for all . Let . By the invariant we proved before, is the leftmost point in that is not covered by at the beginning of the -th iteration. Since we include and in Step 2, we know that and . Thus, must be strictly above , since any point below is covered by either or . It follows that is to the right of , because and is on the boundary of . In fact, is strictly to the right of , since any point in that has the same -coordinate as is covered by . Because is also on the boundary of , given the fact that is strictly above and strictly to the right of , we see that .
Now it suffices to verify that any quadrant in or contains at most one of . Let such that . We want to show that no quadrant in or contains both and . Since is not covered by at the beginning of the -th iteration, it is also not covered by at any point before the -th iteration and in particular, at the beginning of the -th iteration. This implies that is to the right of . In the -th iteration, we always add the quadrant to . We have , since is not covered by at the end of the -th iteration. Thus, is strictly above . Now let be a quadrant that contains . Note that a point to the right of is contained in only if it is contained in , simply by the definition of . Because , we have . Consequently, no quadrant in contains both and . Next, we show that no quadrant in contains both and . If , we are done. So assume . In this case, we add the quadrant to in Step 4 of the -th iteration. Again, we have for is not covered by at the end of the -th iteration. Let be a quadrant that contains . Note that a point above is contained in only if it is contained in , simply by the definition of . Because , we have . Consequently, no quadrant in contains both and .
To summarize, we showed that any quadrant contains at most one of . Therefore, covering all of requires at least quadrants in . This implies , because . As a result, .
A.9 Proof of Lemma 16
Let be a dynamic set cover instance with only point updates, and be a number. We claim that . Denote by and the point sets at the times and , respectively. Also, let and consist of the points that are covered by . Suppose the -th operation inserts a point to , so . If is not covered by , then and . If is covered by , then . Let be a range that covers . An optimal set cover for together with is a set cover for . It follows that . The case where the -th operation deletes a point from is symmetric. This shows that dynamic set cover in the partially dynamic setting is stable. The same argument can also be applied to show that dynamic hitting set in the partially dynamic setting is stable.
Next, we consider the dynamic interval hitting set problem (in the fully dynamic setting). Let be a dynamic interval hitting set instance, and be a number. Denote by and the point set and interval collection at the time . Let consist of the intervals that are hit by . We want to show that . We distinguish two cases.
[Case 1] The -th operation happens on . Suppose the -th operation is an insertion on , and let be the interval inserted. If is not hit by , then and . If is hit by , then . In this case, we have , since an optimal hitting set for together with a point hitting is a hitting set for . Thus, . The case where the -th operation is a deletion on is symmetric.
[Case 2] The -th operation happens on . Suppose the -th operation is an insertion on , and let be the point inserted. Then where . It is clear that , because an optimal hitting set for together with the point is a hitting set for . It suffices to show that . Let be an optimal hitting set for . If , then is also an optimal hitting set for . Otherwise, let (resp., ) be the rightmost (resp., leftmost) point to the left (resp., right) of ; in the case where is the leftmost (resp., rightmost) point in , let (resp., ) be an arbitrary point in . We claim that is a hitting set for . Consider an interval . If , then is hit by some point in . Otherwise, , and thus is hit by . Since , is also hit by some point in , say . If is to the left of , then must be hit by . On the other hand, if is to the right of , must be hit by . Thus, in any case, is hit by . It follows that . The case where the -th operation is a deletion on is symmetric.
A.10 Proof of Lemma 18
For convenience, we include in two dummy points and . We assume these two dummy points are always in . Since and do not hit any interval, including them in is safe.
First, we store in a binary search tree where the key of an interval is its left endpoint. We augment each node with an additional field which stores the interval in the subtree rooted at with the leftmost right endpoint. We notice that using , we can decide in time, for two given numbers , whether there is an interval in both of whose endpoints are in the open interval . Specifically, we first look for the interval in with the leftmost right endpoint whose key is greater than . With the augmented fields, this interval can be found in time. If the two endpoints of this interval are contained in , then we return “yes”; otherwise, no interval in has two endpoints in and we return “no”. Clearly, can be constructed in time and dynamized with update time, and hence it is basic.
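As a concrete sketch of this query, the following stand-in replaces the balanced tree with a sorted array plus a suffix-minimum table (so it is static, unlike the dynamized structure in the proof); the class and variable names are ours:

```python
import bisect

class IntervalStructure:
    """Answers: is there an interval with both endpoints strictly
    inside (a, b)? A balanced BST keyed by left endpoint, augmented
    with subtree minima of right endpoints, supports the same query
    with O(log n) updates; here a suffix-minimum table plays that role."""

    def __init__(self, intervals):
        self.intervals = sorted(intervals)        # sorted by left endpoint
        self.lefts = [l for l, _ in self.intervals]
        # suffix_min[i] = min right endpoint among intervals[i:]
        self.suffix_min = [0] * len(self.intervals)
        m = float('inf')
        for i in range(len(self.intervals) - 1, -1, -1):
            m = min(m, self.intervals[i][1])
            self.suffix_min[i] = m

    def has_interval_inside(self, a, b):
        # Smallest right endpoint among intervals with left endpoint > a;
        # such an interval lies inside (a, b) iff that endpoint is < b.
        i = bisect.bisect_right(self.lefts, a)
        return i < len(self.intervals) and self.suffix_min[i] < b

S = IntervalStructure([(1, 5), (2, 3), (6, 9)])
assert S.has_interval_inside(1, 4)      # (2, 3) fits strictly inside
assert not S.has_interval_inside(3, 6)  # nothing inside (3, 6)
```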
We then store in a standard range tree . The points in are stored at the leaves of . Each node corresponds to a canonical subset of consisting of the points stored in the subtree rooted at . At each internal node , we store the leftmost and rightmost points in as well as a separation point such that all points in the left (resp., right) subtree of are to the left (resp., right) of . At each leaf of that does not correspond to the point , we maintain an additional field indicating whether there exists an interval in both of whose endpoints are in the open interval where is the point corresponding to and is the leftmost point that is to the right of ; we set if such an interval exists and set otherwise. Note that using the binary search tree , can be computed in time for each leaf of . Indeed, we only need to find , which can be done in time by a walk in , and query to see whether there is an interval in both of whose endpoints are in the open interval . Besides, we also maintain a counter that is the total number of the leaves whose -values are equal to 1. It is easy to see that has a hitting set iff .
It is clear that can be constructed in time. We show that when changes, we can maintain with the -fields in time, by using . Suppose a point is inserted to . We need to first compute the -value of the leaf corresponding to using . Let be the rightmost point to the left of . Due to the insertion of , the -value of the leaf of corresponding to may also change. So we need to find , which can be done in time by a walk in , and update using . The -values of all the other leaves remain unchanged. A deletion of a point from is handled similarly. Now suppose an interval is inserted to or deleted from . We find in the leaf of corresponding to the rightmost point in that is to the left of the left endpoint of . Note that the insertion/deletion of does not change the -values of the leaves other than . So it suffices to re-compute using . The time cost for all the cases above is . The rotations of (for self-balancing) do not change the -fields. It follows that the counter can also be maintained in time.
The dynamic data structure in the lemma just consists of (the dynamic versions of) the binary search tree and the range tree . The update time of is time and the construction time is . To indicate whether the current has a hitting set or not, we simply check whether or not.
A.11 Proof of Lemma 19
We can simply store in a binary search tree. Then the rightmost (resp., leftmost) point in to the left (resp., right) of a given point can be reported in time by searching in the tree. Clearly, the binary search tree can be constructed in time and dynamized with update time, and thus it is basic.
A.12 Proof of Lemma 20
We can simply store in a binary search tree. Then a point contained in a given interval can be reported in time by searching in the tree. Clearly, the binary search tree can be constructed in time and dynamized with update time, and thus it is basic.
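Both this lemma and Lemma 19 reduce to standard predecessor/successor searches in a balanced tree. A sorted-array sketch of the three queries (a static stand-in for the binary search tree; function names are illustrative):

```python
import bisect

def predecessor(points, x):
    """Rightmost point of a sorted list strictly to the left of x (Lemma 19)."""
    i = bisect.bisect_left(points, x)
    return points[i - 1] if i > 0 else None

def successor(points, x):
    """Leftmost point of a sorted list strictly to the right of x (Lemma 19)."""
    i = bisect.bisect_right(points, x)
    return points[i] if i < len(points) else None

def point_in_interval(points, lo, hi):
    """Some point of a sorted list contained in [lo, hi], or None (Lemma 20)."""
    i = bisect.bisect_left(points, lo)
    return points[i] if i < len(points) and points[i] <= hi else None

P = [1, 4, 7, 10]
assert predecessor(P, 7) == 4
assert successor(P, 7) == 10
assert point_in_interval(P, 5, 8) == 7
assert point_in_interval(P, 8, 9) is None
```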
A.13 Proof of Lemma 21
We use to denote the quasi-optimum of the current . It suffices to show at any time, since we always have .
Initially, is an optimal hitting set for , so we have at that time. If the current is obtained by re-computing using the output-sensitive algorithm, then , as we only do re-computation when the current has a hitting set. Suppose is obtained by local modification. Consider the last re-computation of , and we use to denote the at that point and use to denote the quasi-optimum of at that point. Then we have . As argued before, is the number of the operations processed after the last re-computation of . By the stability of dynamic interval hitting set (Lemma 16), we have , implying . Furthermore, by the fact that the size of either increases by 1 or remains unchanged after each local modification, we have . It follows that
Note that , for otherwise we should re-compute . This implies and hence .
A.14 Proof of Lemma 22
Let denote the instance at the time , and be the point set at the time . Define as the sub-collection consisting of the intervals that are hit by . Then always has a hitting set. Furthermore, if has a hitting set, then . So it suffices to show that is always a hitting set for . We prove this by induction. It is clear that is a hitting set for . Assume is a hitting set for and we show that is a hitting set for . If is obtained by re-computing, then , since we only re-compute when the current has a hitting set. In this case, is clearly a hitting set for both and . So suppose is obtained by local modification.
We consider different cases separately according to the -th operation. If the -th operation inserts a point to , then and . In this case, and , where . The intervals in are hit by and the intervals in are hit by the point . Hence, is a hitting set for . If the -th operation deletes a point from , then and . In this case, is obtained from by adding the rightmost point in to the left of and the leftmost point in to the right of (if they exist). Consider an interval . We want to show that is hit by . Note that . Thus, and is hit by . If , then is hit by and hence hit by . Otherwise, . Since and , must be hit by some point different from . If is to the left of , then must be hit by (as is in between and ). On the other hand, if is to the right of , must be hit by (as is in between and ). Therefore, is hit by . If the -th operation inserts an interval to , then and . In this case, is obtained from by adding an arbitrary point that hits (if is hit by ). If is not hit by , then and , thus is a hitting set for . If is hit by , then and . The intervals in are hit by by our induction hypothesis and is hit by the point . Hence, is a hitting set for . If the -th operation deletes an interval from , then and . In this case, . Note that , which implies that is a hitting set for .
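The deletion case above (dropping the deleted point from the maintained hitting set and adding its two neighbors among the remaining points) can be sketched as follows; the brute-force `hits` check and all names are ours, for illustration only:

```python
import bisect

def repair_after_deletion(hitting_set, points, p):
    """Sketch of the local modification for deleting point p: remove p
    from the maintained hitting set and add its predecessor and
    successor in the remaining sorted point set, if they exist."""
    H = set(hitting_set) - {p}
    i = bisect.bisect_left(points, p)      # p is already removed from points
    if i > 0:
        H.add(points[i - 1])               # rightmost remaining point left of p
    if i < len(points):
        H.add(points[i])                   # leftmost remaining point right of p
    return H

def hits(H, interval):
    """Brute-force check that some point of H lies in the interval."""
    lo, hi = interval
    return any(lo <= x <= hi for x in H)

points = [1, 3, 6, 8]                      # the point 5 has just been deleted
H = repair_after_deletion({5, 8}, points, 5)
assert H == {3, 6, 8}
assert hits(H, (4, 7))                     # an interval that 5 used to hit
```

Any interval that was hit only by the deleted point and still contains some remaining point must contain the predecessor or the successor, which is exactly why the repair preserves the hitting-set property.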
A.15 Proof of Lemma 24
Note that is static. We simply store in a static data structure that can decide in time, for a given point , whether there exists a quadrant in that covers ; this can be done using a standard orthogonal stabbing data structure with construction time. Then our data structure simply maintains the number of the points in that are not covered by . Initially, can be computed in time by considering every point in and using to check whether it is covered by in time. After an operation on , we can update in time by checking whether the inserted/deleted point is covered by using .
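A minimal sketch of this counter maintenance, with a toy coverage oracle standing in for the orthogonal stabbing structure (all names and the example oracle are ours):

```python
class UncoveredCounter:
    """Maintains the number of points not covered by a *fixed* range
    collection, given an oracle covered(p) (in the lemma, an orthogonal
    stabbing structure answering in O(log n) time)."""

    def __init__(self, points, covered):
        self.covered = covered
        self.count = sum(1 for p in points if not covered(p))

    def insert(self, p):
        if not self.covered(p):
            self.count += 1

    def delete(self, p):
        if not self.covered(p):
            self.count -= 1

    def has_set_cover(self):
        # A set cover exists iff every current point is covered.
        return self.count == 0

# Toy oracle: covered iff the point lies in the SE quadrant with vertex (0, 5).
covered = lambda p: p[0] >= 0 and p[1] <= 5
C = UncoveredCounter([(1, 2), (-1, 3)], covered)
assert not C.has_set_cover()   # (-1, 3) is uncovered
C.delete((-1, 3))
assert C.has_set_cover()
```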
A.16 Proof of Lemma 25
Let denote the set at the time and the at the time . Define as the subset consisting of the points that are covered by . Then always has a set cover. Furthermore, if has a set cover, then . So it suffices to show that is always a set cover for . We prove this by induction. It is clear that is a set cover for . Assume is a set cover for and we show that is a set cover for . If is obtained by re-computing, then , since we only re-compute when the current has a set cover. In this case, is clearly a set cover for both and . So suppose is obtained by local modification. If the -th operation inserts a point to , then . In this case, is obtained by including in an arbitrary quadrant that contains (if such a quadrant exists). If is not covered by , then and , thus is a set cover for . If is covered by , then and . The points in are covered by by our induction hypothesis and the point is covered by . Hence, is a set cover for . If the -th operation deletes a point from , then . In this case, . Note that , which implies that is a set cover for .
Appendix B Implementation details and detailed time analysis for the dynamic interval set cover data structure
We present the implementation details of our dynamic interval set cover data structure in Section 4.1 as well as a detailed time analysis. Assume the function we choose satisfies two properties: (1) for any , and (2) .
First, we discuss how to construct . Constructing the data structure takes time, as it is basic. The portions can be computed in time by sorting the points in and the endpoints of the intervals in . Once are computed, we build a (static) point location data structure which can report in time, for a given point , the portion that contains . Clearly, can be constructed in time. With in hand, we can determine in time, for each point (resp., each interval ), the portion that contains (resp., the two ’s that contain the endpoints of ). By doing this for all points in and all intervals in , we obtain all and all in time. After this, we can build the data structures ’s. Constructing each takes time since . Hence, the time for constructing all is .
The support data structure will be used later in the implementation of the update procedure of (we do not need to update since it is static). Besides, we need another support data structure defined as follows.
Lemma 30.
One can store in a basic data structure such that given an interval , an interval in that contains can be reported in time (if it exists).
Proof.
We store in a binary search tree where the key of each interval is its left endpoint. We augment each node with an additional field which stores the interval in the subtree rooted at that has the rightmost right endpoint. Given a query interval , we first look for the interval in with the rightmost right endpoint whose key is smaller than or equal to the left endpoint of . With the augmented fields, this interval can be found in time. If this interval contains , then we report it; otherwise, no interval in contains . Clearly, can be built in time and dynamized with update time, and thus is basic. ∎
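A static stand-in for this augmented tree, using a sorted array with a prefix table recording the rightmost right endpoint seen so far (class and variable names are ours):

```python
import bisect

class ContainQuery:
    """Finds an interval containing a query interval [lo, hi]: among
    intervals whose left endpoint is <= lo, the one with the rightmost
    right endpoint contains the query iff any interval does."""

    def __init__(self, intervals):
        self.intervals = sorted(intervals)       # sorted by left endpoint
        self.lefts = [l for l, _ in self.intervals]
        self.prefix_best = []                    # index of best interval so far
        best = None
        for i, (_, r) in enumerate(self.intervals):
            if best is None or r > self.intervals[best][1]:
                best = i
            self.prefix_best.append(best)

    def containing(self, lo, hi):
        i = bisect.bisect_right(self.lefts, lo) - 1
        if i < 0:
            return None
        l, r = self.intervals[self.prefix_best[i]]
        return (l, r) if r >= hi else None

C = ContainQuery([(0, 4), (2, 9), (5, 7)])
assert C.containing(3, 6) == (2, 9)
assert C.containing(5, 10) is None
```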
Since is basic, it can be constructed in time and updated in time. We conclude that the construction time of is .
Next, we consider how to implement the update procedure of . After each operation, we need to update the data structure and the support data structure , which can be done in time since they are basic. Also, if some changes, we need to update the data structure . An operation on changes exactly one and an operation on changes at most two ’s. Thus, we only need to update at most two ’s, and by using we can find these ’s in time. Note that the size of each is bounded by during the first period (i.e., before the first reconstruction), because the period only consists of operations. Thus, updating the data structures takes amortized time.
Then we discuss the maintenance of the solution. The time for simulating the output-sensitive algorithm is , i.e., . If the algorithm gives the solution , we compute and store in a binary search tree; by doing this, we can answer the size, membership, and reporting queries for in the required query times. This step takes time, i.e., time, since in this case. If the output-sensitive algorithm fails, we compute the sets and . This can be done in time by using the support data structure . After this, we compute , which again takes time by using ; specifically, we consider each and use to find an interval in that contains . We have . To support the size query for in time, we need to compute . This can be done in time, because we can query to obtain in time. In order to support the membership and reporting queries, we store in a binary search tree . Also, we store the set , and for each we store a pointer pointing to the data structure . Consider a membership query . Using the binary search tree , we can obtain the multiplicity of in in time. Then we use to find in time the (at most) two ’s that contain (say and ). By querying and , we know the multiplicities of in and , which takes . The sum of these multiplicities is just the multiplicity of in . The time for answering the query is . Note that . Furthermore, because (as the output-sensitive algorithm fails), we have . Thus, the time cost for a membership query is . Finally, consider the reporting query. We first use the binary search tree to report the elements in , using time. Then for each , we query the data structure to report the elements in . The total time cost is . Since and as argued before, the time for answering the reporting query is . The above work for storing takes time, since and . To summarize, maintaining the solution takes time.
After processing operations, we need to reconstruct the entire data structure . The reconstruction is the same as the initial construction, except that is replaced with , the size of at the time of reconstruction. Thus, the reconstruction takes time. We amortize the time cost over all the operations in the period. Since , the amortized time for reconstruction is , i.e., .
Combining the time for updating the data structures, the time for maintaining the solution, and the time for reconstruction, we see that the amortized update time of is during the first period (since is reconstructed periodically, it suffices to analyze the update time in the first period). By property (1) of , we have at any time in the period, i.e., the size of is at any time in the period. By property (2) of , we further have at any time in the period. It follows that the amortized update time of is during the period. To minimize the time complexity while guaranteeing the two conditions of , we set where is as defined in Theorem 2, i.e., . The following lemma shows that our choice of makes the time bound .
Lemma 31.
When , we have
Proof.
If , then one can easily verify the equation in the lemma via a direct computation (by bounding each of the three terms on the left-hand side). It suffices to verify the equation for the case . In this case, we have , implying that . It follows that . So the first term on the left-hand side is bounded by . The second term is , which is bounded by . The third term is clearly . This proves the equation in the lemma. ∎
Appendix C Implementation details and detailed time analysis for the dynamic quadrant set cover data structure
We present the implementation details of our dynamic quadrant set cover data structure in Section 4.2 as well as a detailed time analysis. Since we are interested in the asymptotic bounds, we may assume that the approximation factor is sufficiently small, say . Assume the function we choose satisfies two properties: (1) for any , and (2) . Note that property (1) implies that and .
First, we discuss how to construct . Constructing the data structure takes time, as it is basic. The grid can be built in time by sorting the points in and the vertices of the quadrants in . Once the grid is computed, we build a (static) point location data structure which can report in time, for a given point , the grid cell that contains . Since the grid has cells, can be built in time. With in hand, we can determine in time, for each point (resp., each quadrant ), the cell that contains (resp., the vertex of ). By doing this for all points in and all quadrants in , we obtain all and the non-special quadrants in all in time, i.e., time. In order to compute the special quadrants, we need the following support data structure .
Lemma 32.
One can store in a basic data structure such that given an orthogonal rectangle , the leftmost (resp., rightmost) quadrant in that right (resp., left) intersects and the topmost (resp., bottommost) quadrant in that bottom (resp., top) intersects can be reported in time (if they exist).
Proof.
It suffices to show how to compute the leftmost quadrant in that right intersects a given orthogonal rectangle . Note that only southeast and northeast quadrants can right intersect a rectangle, and without loss of generality, it suffices to see how to find the leftmost southeast quadrant in that right intersects . Let consist of the southeast quadrants. We store the vertices of the quadrants in in a basic 3-sided range-minimum data structure (Lemma 36) with query time, by setting the weight of each vertex to be its -coordinate. Let be a given orthogonal rectangle. Observe that a quadrant in right intersects iff its vertex lies in the 3-sided rectangle . Thus, the leftmost quadrant in that right intersects corresponds to the lightest vertex (i.e., the vertex with the smallest weight) in , which can be reported by the 3-sided range-minimum data structure in time. ∎
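The query pattern can be sketched by brute force as follows. The exact 3-sided region determined by the rectangle is elided in the text above, so the region shape used here is only illustrative; a real implementation would use the range-minimum structure of Lemma 36 for O(log n) queries:

```python
def three_sided_min(vertices, a, b, c):
    """Among vertices lying in an (assumed) 3-sided region
    {(x, y): a <= y <= b, x >= c}, report the one with the smallest
    x-coordinate (the "lightest" vertex, weight = x-coordinate)."""
    candidates = [(x, y) for (x, y) in vertices if a <= y <= b and x >= c]
    return min(candidates) if candidates else None

V = [(2, 5), (4, 1), (6, 3)]
assert three_sided_min(V, 0, 4, 3) == (4, 1)
assert three_sided_min(V, 0, 4, 7) is None
```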
Using the support data structure , we can find the special quadrants in all in time. After this, we can build the data structures ’s. Constructing each takes time. By Lemma 10, the time for constructing all ’s is . Therefore, the entire construction time of is , i.e., .
The support data structures and will be used later in the implementation of the update procedure of . Thus, will be updated after each operation (while is static). Besides and , we need another support data structure defined as follows.
Lemma 33.
One can store in a basic data structure such that given an orthogonal rectangle , a quadrant that contains can be reported in time (if it exists).
Proof.
It suffices to consider the southeast quadrants. Let consist of the southeast quadrants. We store in a binary search tree where the key of a quadrant is the -coordinate of its vertex. We augment each node with an additional field which stores the topmost quadrant in the subtree rooted at . Given an orthogonal rectangle , we first look for the topmost quadrant whose key is smaller than or equal to . With the augmented fields, can be found in time using . If contains , then we report ; otherwise, no quadrant in contains (because a southeast quadrant contains iff the -coordinate of its vertex is smaller than or equal to and the -coordinate of its vertex is greater than or equal to ). Clearly, can be built in time and dynamized with update time, and thus is basic. ∎
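A static stand-in for this query, again trading the augmented BST for a sorted array with a prefix table of the topmost vertex (the coordinate conventions and names are ours):

```python
import bisect

# The SE quadrant with vertex (vx, vy) is {(x, y): x >= vx, y <= vy},
# so it contains the rectangle [x1, x2] x [y1, y2] iff vx <= x1 and vy >= y2.

class QuadrantContainQuery:
    """Among SE-quadrant vertices with x-coordinate <= the rectangle's
    left side, the topmost one contains the rectangle iff any does."""

    def __init__(self, vertices):
        self.vertices = sorted(vertices)          # sorted by x-coordinate
        self.xs = [x for x, _ in self.vertices]
        self.prefix_top = []                      # topmost vertex so far
        top = None
        for v in self.vertices:
            if top is None or v[1] > top[1]:
                top = v
            self.prefix_top.append(top)

    def containing(self, x1, y1, x2, y2):
        i = bisect.bisect_right(self.xs, x1) - 1
        if i < 0:
            return None
        v = self.prefix_top[i]
        return v if v[1] >= y2 else None

Q = QuadrantContainQuery([(0, 3), (1, 6), (4, 8)])
assert Q.containing(2, 0, 5, 5) == (1, 6)
assert Q.containing(0, 0, 2, 5) is None
```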
Next, we consider how to implement the update procedure of . We first discuss the update of the data structures. If the operation is an insertion or deletion on , then we use to find the cell that contains the inserted/deleted point and update the data structure . The situation is more complicated when the operation happens on . Suppose the operation inserts a quadrant to . Without loss of generality, assume is a southeast quadrant. Using , we can find the cell that contains the vertex of . We then update the data structure by inserting to . Besides, the insertion of may also change for and for . Fix an index . The quadrant bottom intersects . We use the support data structure to find the topmost quadrant in (before the insertion of ) that bottom intersects . Then is a special quadrant in . If the -coordinate of the vertex of is greater than the -coordinate of the vertex of , then we delete from and insert to , and we update twice for these two operations. The data structures ’s can be updated similarly. Updating each takes amortized time, where is the size of the current , and updating each takes amortized time. Therefore, the total amortized time cost for updating these data structures is bounded by . By Lemma 10 and the fact , we have and at any time of the first period. Since , by Hölder’s inequality and Lemma 10,
and similarly . It follows that updating the data structures takes amortized time. Updating the data structure and the support data structures and can be done in time since they are basic.
Then we discuss the maintenance of the solution. The time for simulating the output-sensitive algorithm is , i.e., . If the algorithm gives the solution , we compute and store in a binary search tree; by doing this, we can answer the size, membership, and reporting queries for in the required query times. This step takes time, i.e., time, since in this case. If the output-sensitive algorithm fails, we compute the sets and . This can be done in time by using the support data structure . After this, we compute , which again takes time by using ; specifically, we consider each and use to find a quadrant in that contains . We have . To support the size query for in time, we need to compute . This can be done in time, because we can query to obtain in time. To support the reporting query in time, we only need to store and , and store at each a pointer to the data structure . In this way, we can report the quadrants in and for each , report the quadrants in in time by querying . Supporting the membership query in time is more difficult, since a quadrant may belong to many ’s. To handle this issue, the idea is to collect all the special quadrants in the ’s. Specifically, let consist of the (at most) four special quadrants. We can compute for each in time by first finding the (at most) four special quadrants in using and computing the multiplicity of each special quadrant in by querying . We then store in a binary search tree . Given a query quadrant , we first use to compute the multiplicity of in in time. Then we use to find the cell that contains the vertex of in time and query to find the multiplicity of in in time. One can easily verify that is the multiplicity of in . The query takes time. Note that and , where the latter follows from the fact that (as the output-sensitive algorithm fails). Therefore, we can support the membership query in time. The above work for storing takes time, since and . To summarize, maintaining the solution takes time.
After processing operations, we need to reconstruct the entire data structure . The reconstruction is the same as the initial construction, except that is replaced with , the size of at the time of reconstruction. Thus, the reconstruction takes time. We amortize the time cost over all the operations in the period. Since , the amortized time for reconstruction is , i.e., .
Combining the time for updating the data structures, the time for maintaining the solution, and the time for reconstruction, we see that the amortized update time of is during the first period (since is reconstructed periodically, it suffices to analyze the update time in the first period). By property (1) of , we have at any time in the period, i.e., the size of is at any time in the period. By property (2) of , we further have at any time in the period. It follows that the amortized update time of is . To minimize the time complexity while guaranteeing the two conditions of , we set where is as defined in Theorem 7, i.e., . Note that by doing this we have because of our assumption . The following lemma shows that our choice of makes the time bound .
Lemma 34.
When , we have
Proof.
If , then one can easily verify the equation in the lemma via a direct computation (by bounding each of the three terms on the left-hand side). It suffices to verify the equation for the case . In this case, we have , implying that . It follows that . So the first term on the left-hand side is bounded by . The second term is , which is bounded by . The third term is clearly . This proves the equation in the lemma. ∎
Appendix D Missing details in the output-sensitive quadrant set cover algorithm
D.1 Handling the no-solution case
Let . Observe that, regardless of whether has a set cover, the algorithm in Section 4.2.2 gives an -approximate optimal set cover for . Thus, in order to handle the no-solution case, we only need to check whether covers all points in , after is computed. Define . Note that is a set cover for iff does not contain any point in . The area is a rectilinear domain (not necessarily connected). Since is the union of quadrants, the complexity of is , and so is the complexity of . Furthermore, it is easy to compute and decompose it into rectangles in time, once is at hand. We then only need to test, for each such rectangle, whether contains any point in , which can be done via an orthogonal range-emptiness query on ; there are existing basic data structures that support orthogonal range-emptiness queries in time [24].
D.2 Implementing the algorithm using basic data structures
In this section, we show that the output-sensitive quadrant set cover algorithm can be implemented in time by storing the instance in some basic data structure. As argued in Section 4.2.2, it suffices to show how to implement the following operations in time using basic data structures (please refer to Section 4.2.2 for the notation).
• Computing the point .
• Given a point , testing whether and .
• Given a point , computing the quadrants , , , and .
• Given a number , computing the point .
Computing .
Recall that is the boundary of , which is a staircase curve from bottom-left to top-right. The area contains the bottom-left end of (if ). The point is the “endpoint” of , i.e., the point on closest to the top-right end of that is contained in . We define the height of at , denoted by , as the largest number such that the point with coordinates is contained in ; similarly, we can define the height of at , denoted by . We say is higher than at if ; otherwise, we say is lower than at . It is easy to see the following three facts about the point .
1. The -coordinate of , denoted by , is equal to for some ; recall that is the -coordinate of the vertex of .
2. is higher than at all and is lower than at all .
3. The coordinates of are .
Based on these facts, we compute as follows. First, we need a basic data structure built on (resp., ) that can compute the height function in time.
Lemma 35.
One can store (resp., ) in a basic data structure which can report (resp., ) for a given number in time.
Proof.
It suffices to consider . We store in a standard binary search tree , by using as the key of each quadrant . At each node , we store a field which is the maximum of for all quadrants in the subtree rooted at . Clearly, can be constructed in time where and can be dynamized with update time to support insertions and deletions on , hence it is a basic data structure. Next, we consider how to compute for a given using . One can easily see that . In other words, is the maximum of for all corresponding to the nodes in whose keys are at least . Therefore, using the field , can be computed in time simply via a top-down walk in . The walk begins at the root of , and we set initially. If the key of the current node is smaller than , then we just go to its right child. Otherwise, we update as where is the quadrant corresponding to the current node and is the right child of the current node (if the current node has no right child, we update as ), and go to the left child of the current node. When the walk ends at a leaf node, the number is just equal to . The walk takes time, hence can be computed in time using . This completes the proof of the lemma. ∎
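The top-down walk in the proof can be sketched as follows. This is a hedged illustration under our own assumptions: we take the quadrants to be of a southwest type, so that the height of their union at abscissa x is the maximum vertex y-coordinate over all quadrants whose key is at least x; an unbalanced BST stands in for a balanced one, and the field names are ours.

```python
class HNode:
    def __init__(self, key, vy):
        self.key = key            # vertex x-coordinate (the BST key)
        self.vy = vy              # vertex y-coordinate
        self.maxy = vy            # max vertex y-coordinate in this subtree
        self.left = self.right = None

def h_insert(root, key, vy):
    """Plain BST insertion, maintaining the subtree-maximum field."""
    if root is None:
        return HNode(key, vy)
    if key < root.key:
        root.left = h_insert(root.left, key, vy)
    else:
        root.right = h_insert(root.right, key, vy)
    root.maxy = max(root.maxy, vy)
    return root

def height_at(root, x):
    """Height of the union at abscissa x: the maximum vertex y-coordinate
    over all quadrants with key >= x (-inf if no quadrant covers x).
    A single root-to-leaf walk, as in the proof of the lemma."""
    h = float("-inf")
    node = root
    while node is not None:
        if node.key < x:
            node = node.right              # the whole left subtree has keys < x
        else:
            h = max(h, node.vy)            # the node itself is eligible ...
            if node.right is not None:
                h = max(h, node.right.maxy)  # ... and so is its entire right subtree
            node = node.left
    return h
```

The walk never descends into a subtree it has already accounted for via the subtree-maximum field, so on a balanced tree it takes O(log n) time.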
With the above lemma in hand, we may now assume that the height functions and can be computed in time for any . In particular, we can test in time whether is higher or lower than at any . We then build a binary search tree on , by using as the key of each quadrant . Clearly, can be constructed in time where and can be dynamized with update time, hence it is a basic data structure. We observe that, using , we can determine in time for any given number which one of the following three is true: (1) , (2) , (3) . To see this, consider a given number . We first search in to see whether for some . If not, we know , because for some (as we observed before). In this case, we can decide whether or by simply testing whether is higher or lower than at . Specifically, if is higher than at , then , otherwise , because is higher (resp., lower) than at all (resp., ) as we observed before. The remaining case is that for some . In this case, we also test whether is higher or lower than at ; by doing this, we can decide whether or . Suppose . To see whether or , we search in to find the smallest key that is larger than . Let be any number such that . If is higher than at , then we have . If is lower than at , then we know . Note that any number in is not equal to for any , due to the choice of and the inequality . Therefore, , which implies . To summarize, we can determine in time for any which one of the following three is true: (1) , (2) , (3) .
This allows us to compute via binary search. Specifically, we do a top-down walk from the root of . If the key of the current node is equal to , we are done. Otherwise, if the key of the current node is smaller (resp., larger) than , we go to its right (resp., left) child, because the keys of the nodes in the left (resp., right) subtree are all smaller (resp., larger) than . During the walk, we are guaranteed to find a node whose key is , since for some , i.e., is the key of some node in . In this way, we can compute in time. As we observed before, the coordinates of are . Thus, once we know , it suffices to compute and , which takes time by Lemma 35. We conclude that computing can be done in time (by properly storing and in basic data structures).
Testing whether and .
It suffices to consider how to test whether . We store in a binary search tree , by using as the key of each quadrant . We augment each node with a field which stores the leftmost quadrant in the subtree rooted at . Clearly, can be constructed in time where and can be dynamized with update time, hence it is basic. Given a point , we first look for the leftmost quadrant in whose key is smaller than or equal to the -coordinate of . With the augmented fields, can be found in time. If , then we know . Otherwise, we claim that . Indeed, a quadrant in contains only if its key is smaller than or equal to the -coordinate of . Since is the leftmost one among such quadrants and , we know that is not contained in any quadrant in , i.e., . We conclude that testing whether and for a given point can be done in time (by properly storing and in basic data structures).
Computing .
Recall that for a point and a collection of quadrants, and denote the rightmost and topmost quadrants in that contain , respectively. We want to compute , , , and in time for a given point , using basic data structures. Here we only consider how to compute ; the other three can be computed in the same way. We store in a binary search tree , by using as the key of each quadrant . At each node , we store a field that is the rightmost quadrant in the subtree rooted at . Clearly, can be constructed in time where and can be dynamized with update time, hence it is a basic data structure. Given a point , we first look for the rightmost quadrant in whose key is greater than or equal to the -coordinate of . With the augmented fields, can be found in time. If contains , then is the rightmost quadrant in (i.e., in ) that contains , because any quadrant that contains must satisfy . Otherwise, no quadrant in contains . We conclude that computing , , , and for a given point can be done in time (by properly storing , , , in basic data structures).
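One of these four queries can be sketched as follows, under assumptions of our own (not the paper's): we take southwest quadrants, where a quadrant with vertex (a, b) contains p = (px, py) iff a ≥ px and b ≥ py, key the tree on b, and augment each node with the maximum vertex x-coordinate (`rm`) of its subtree. The rightmost quadrant among those with key ≥ py then either contains p, or certifies that no quadrant does.

```python
class RNode:
    def __init__(self, b, a):
        self.key = b              # vertex y-coordinate (the BST key)
        self.a = a                # vertex x-coordinate
        self.rm = a               # max vertex x-coordinate in this subtree
        self.left = self.right = None

def r_insert(root, b, a):
    if root is None:
        return RNode(b, a)
    if b < root.key:
        root.left = r_insert(root.left, b, a)
    else:
        root.right = r_insert(root.right, b, a)
    root.rm = max(root.rm, a)
    return root

def _rightmost_in(node):
    """Descend along the 'rm' fields to the node attaining the subtree maximum."""
    while node.a != node.rm:
        if node.left is not None and node.left.rm == node.rm:
            node = node.left
        else:
            node = node.right
    return (node.key, node.a)

def rmost(root, px, py):
    """Rightmost southwest quadrant containing p = (px, py), or None.
    Among the quadrants with key >= py we take the one with maximum vertex
    x-coordinate; it contains p iff that coordinate is at least px, and in
    that case no quadrant further right contains p either."""
    best = None                   # (b, a) with the largest a seen so far
    node = root
    while node is not None:
        if node.key < py:
            node = node.right
            continue
        cand = (node.key, node.a)
        if node.right is not None and node.right.rm > cand[1]:
            cand = _rightmost_in(node.right)
        if best is None or cand[1] > best[1]:
            best = cand
        node = node.left
    return best if best is not None and best[1] >= px else None
```

The other three queries differ only in the choice of key and in which extreme (leftmost/topmost) the augmented field tracks.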
Computing .
Recall that for a number , is the leftmost point in whose -coordinate is greater than . We want to store and in some basic data structure such that can be computed in time for any given . For simplicity of exposition, let us make a general-position assumption: the points in and the vertices of the quadrants in have distinct -coordinates. The first thing we need is a data structure that supports the so-called 3-sided range-minimum query. A 3-sided range-minimum query on a set of weighted points in asks for the lightest point (i.e., the point with the smallest weight) contained in a given 3-sided query rectangle .
Lemma 36.
There exists a basic data structure that supports 3-sided range-minimum queries in time.
Proof.
A standard range tree can answer static 3-sided range-minimum queries in time and can be constructed in time, where is the size of the dataset. Since range-minimum queries are decomposable, the approach of [4] can be applied to dynamize the static data structure with update time, at the cost of an extra logarithmic factor in the query time. This gives us a basic data structure that supports 3-sided range-minimum queries in time. ∎
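The dynamization used in the proof is the logarithmic method of Bentley and Saxe [4], which applies to any decomposable query. The sketch below shows the scheme itself; a naive linear-scan structure stands in for the static range tree (the bucketing scheme, not this placeholder, is the point), all names are ours, and only insertions are shown for brevity.

```python
class StaticRangeMin:
    """Placeholder static structure. In the paper this is a range tree with
    O(polylog) query time; a list with a linear scan keeps the sketch short."""
    def __init__(self, points):
        self.points = list(points)        # (x, y, weight) triples

    def query(self, x1, x2, y):
        """Lightest point in the 3-sided rectangle [x1, x2] x (-inf, y]."""
        best = None
        for (px, py, w) in self.points:
            if x1 <= px <= x2 and py <= y and (best is None or w < best[2]):
                best = (px, py, w)
        return best

class LogMethod:
    """Bentley-Saxe logarithmic method: bucket i holds a static structure on
    exactly 2^i points or is empty. An insertion merges full buckets like
    binary addition with a carry; a query is answered on every bucket, which
    is valid because range-minimum queries are decomposable."""
    def __init__(self):
        self.buckets = []                 # buckets[i]: StaticRangeMin or None

    def insert(self, point):
        carry = [point]
        i = 0
        while i < len(self.buckets) and self.buckets[i] is not None:
            carry += self.buckets[i].points   # merge this bucket into the carry
            self.buckets[i] = None
            i += 1
        if i == len(self.buckets):
            self.buckets.append(None)
        self.buckets[i] = StaticRangeMin(carry)  # rebuild at the first empty slot

    def query(self, x1, x2, y):
        best = None
        for b in self.buckets:
            if b is None:
                continue
            r = b.query(x1, x2, y)
            if r is not None and (best is None or r[2] < best[2]):
                best = r
        return best
```

Each point is rebuilt O(log n) times over its lifetime, which yields the amortized update bound, and a query pays an extra logarithmic factor for visiting every bucket.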
We store in the basic 3-sided range-minimum data structure of the above lemma, by setting the weight of each point in to be its -coordinate. Besides , we need a (1D) range tree built on for -coordinates, where is the set of the vertices of the quadrants in . By the definition of a range tree, the points in correspond one-to-one to the leaves of , where the left-to-right order of the leaves corresponds to the increasing order of the -coordinates of the points. The canonical subset of each node refers to the subset of consisting of the points stored in the subtree rooted at . For a node , we denote by the set of the points in that are contained in the canonical subset of , and denote by the collection of the quadrants in whose vertices are contained in the canonical subset of . Also, we write . At each node , we store the following four fields.
• : the -coordinate of the bottommost point in the canonical subset of .
• : the -coordinate of the topmost point in the canonical subset of .
• : the leftmost quadrant in .
• : the leftmost point in that is contained in .
Let be a node and be its left and right children, respectively. It is clear that , , can be computed in time knowing the , , and fields of and . We claim that can be computed in time based on the information stored at , , , and the 3-sided range-minimum data structure . By definition, is the leftmost point in that is contained in . Since , it suffices to compute the leftmost point in contained in and the leftmost point in contained in . Note that any point in is not contained in , since any quadrant in is “below” any point in . It follows that the leftmost point in that is contained in is just . To compute the leftmost point in that is contained in , we only need to compute the leftmost point in contained in and the leftmost point in contained in , because . The leftmost point in contained in is just . In order to compute the leftmost point in contained in , we observe that a point in is contained in iff it is contained in the quadrant . Thus, the leftmost point in contained in is just the leftmost point in contained in . Note that is exactly the set of the points in that lie in the strip , i.e., . Hence, we can query the data structure with the 3-sided rectangle to obtain the leftmost point in contained in , i.e., the leftmost point in contained in , which takes time. We conclude that, by using the 3-sided range-minimum data structure , all the fields of a node can be computed in time based on the information stored at and their children. Therefore, the range tree with the augmented fields can be dynamized with update time using the standard technique for dynamizing augmented trees [12]. The construction time of is clearly where , thus is a basic data structure.
Next, we consider how to use to compute in time for a given . We first find the canonical nodes corresponding to the range . Suppose are sorted from left to right in . By the property of canonical nodes, the canonical subsets of are disjoint and their union is the subset of consisting of the points whose -coordinates are in the range . The point we are looking for is just the leftmost point in that is contained in . Note that is not contained in any southeast quadrant with . Thus, is the leftmost point in that is contained in . To compute , it suffices to know the leftmost point in that is contained in for all . If , then any point in is not contained in . If , then the leftmost point in contained in is just . If , then a point in is contained in iff it is contained in the quadrant . Note that the points in are exactly the set of the points in that lie in the strip , i.e., . Therefore, the leftmost point in that is contained in can be computed in time by querying the data structure with the 3-sided rectangle . To summarize, the leftmost point in that is contained in can be computed in time for any . Since , computing the leftmost point in that is contained in for all takes time. Once we have these points, the leftmost one among them is just the leftmost point in that is contained in , i.e., the point . We conclude that computing can be done in time (by properly storing and in some basic data structure).
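The canonical-node decomposition used above can be illustrated in isolation. The sketch below (names ours) builds a balanced tree over a sorted list of x-values and collects the O(log n) disjoint canonical nodes covering a query range; in the paper, each such node would additionally carry the augmented fields described earlier, and the structure would be dynamized.

```python
class TNode:
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi        # range of x-values stored in this subtree
        self.left, self.right = left, right

def build(xs):
    """Balanced tree over a nonempty, sorted list of x-values (the leaves)."""
    if len(xs) == 1:
        return TNode(xs[0], xs[0])
    mid = len(xs) // 2
    l, r = build(xs[:mid]), build(xs[mid:])
    return TNode(l.lo, r.hi, l, r)

def canonical_nodes(node, lo, hi, out):
    """Append to 'out' the O(log n) disjoint canonical nodes whose subtrees
    together contain exactly the x-values in [lo, hi]."""
    if node is None or node.hi < lo or node.lo > hi:
        return                           # subtree disjoint from the query range
    if lo <= node.lo and node.hi <= hi:
        out.append(node)                 # subtree lies entirely inside the range
        return
    canonical_nodes(node.left, lo, hi, out)
    canonical_nodes(node.right, lo, hi, out)
```

At most two nodes are opened per level of the tree, which is where the O(log n) bound on the number of canonical nodes comes from.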