This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Self-Improving Voronoi Construction for a Hidden Mixture of Product Distributionsthanks: Research of Cheng and Wong are supported by Research Grants Council, Hong Kong, China (project no. 16200317).

Siu-Wing Cheng111Department of Computer Science and Engineering, HKUST, Hong Kong, China. Email: [email protected], [email protected]   Man Ting Wong22footnotemark: 2
Abstract

We propose a self-improving algorithm for computing Voronoi diagrams under a given convex distance function with constant description complexity. The nn input points are drawn from a hidden mixture of product distributions; we are only given an upper bound m=o(n)m=o(\sqrt{n}) on the number of distributions in the mixture, and the property that for each distribution, an input instance is drawn from it with a probability of Ω(1/n)\Omega(1/n). For any ε(0,1)\varepsilon\in(0,1), after spending O(mnlogO(1)(mn)+mεn1+εlog(mn))O\bigl{(}mn\log^{O(1)}(mn)+m^{\varepsilon}n^{1+\varepsilon}\log(mn)\bigr{)} time in a training phase, our algorithm achieves an O(1εnlogm+1εn2O(logn)+1εH)O\bigl{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}n2^{O(\log^{*}n)}+\frac{1}{\varepsilon}H\bigr{)} expected running time with probability at least 1O(1/n)1-O(1/n), where HH is the entropy of the distribution of the Voronoi diagram output. The expectation is taken over the input distribution and the randomized decisions of the algorithm. For the Euclidean metric, the expected running time improves to O(1εnlogm+1εH)O\bigl{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}H\bigr{)}.

1 Introduction

Self-improving algorithms, proposed by Ailon et al. [2], is a framework for studying algorithmic complexity beyond the worst case. There is a training phase that allows some auxiliary structures about the input distribution to be constructed. In the operation phase, these auxiliary structures help to achieve an expected running time, called the limiting complexity, that may surpass the worst-case optimal time complexity.

Self-improving algorithms have been designed for product distributions [2, 10]. Let nn be the input size. A product distribution 𝒟=(D1,,Dn)\mathscr{D}=(D_{1},\ldots,D_{n}) consists of nn distributions DiD_{i} such that the iith input item is drawn independently from DiD_{i}. It is possible that Di=DjD_{i}=D_{j} for some iji\not=j, but the draws of the iith and jjth input items are independent. No further information about 𝒟\mathscr{D} is given. Sorting, Delaunay triangulation, 2D maxima, and 2D convex hull have been studied for product distributions. For all four problems, the training phase uses O(nε)O(n^{\varepsilon}) input instances, and the space complexity is O(n1+ε)O(n^{1+\varepsilon}). The limiting complexities of sorting and Delaunay triangulation are O(1εn+1εHout)O\bigl{(}\frac{1}{\varepsilon}n+\frac{1}{\varepsilon}H_{\text{out}}\bigr{)} for any ε(0,1)\varepsilon\in(0,1), where HoutH_{\text{out}} is the entropy of the output distribution [2]. The limiting complexities for 2D maxima and 2D convex hull are O(OptM+n)O(\mathrm{OptM}+n) and O(OptC+nloglogn)O(\mathrm{OptC}+n\log\log n) respectively, where OptM and OptC are the expected depths of the optimal linear decision trees for the two problems [10].

Extensions that allow dependence among input items have been developed. One extension is that there is a hidden partition of [n][n] into groups. The input items with indices in the kkth group follow some hidden functions of a common parameter uku_{k}. The parameters u1,u2,u_{1},u_{2},\cdots follow a product distribution. The partition of [n][n] is not given though. If the hidden functions are known to be linear, sorting can be solved in a limiting complexity of O(1εn+1εHout)O\bigl{(}\frac{1}{\varepsilon}n+\frac{1}{\varepsilon}H_{\text{out}}\bigr{)} after a training phase that takes O(n2log3n)O(n^{2}\log^{3}n) time [8]. If it is only known that each hidden function has O(1)O(1) extrema and the graphs of two functions intersect in O(1)O(1) places (without knowing any of the functions, or any of these extrema and intersections), sorting can be solved in a limiting complexity of O(n+Hout)O(n+H_{\text{out}}) after an O~(n3)\tilde{O}(n^{3})-time training phase [7]. For the Delaunay triangulation problem, if it is known that the hidden functions are bivariate polynomials of O(1)O(1) degree (without knowing the polynomials), a limiting complexity of O(nα(n)+Hout)O(n\alpha(n)+H_{\text{out}}) can be achieved after a polynomial-time training phase [7].

Another extension is that the input instance II is drawn from a hidden mixture of at most mm product distributions. That is, there are at most mm product distributions 𝒟1,𝒟2,\mathscr{D}_{1},\mathscr{D}_{2},\ldots such that Pr[I𝒟a]=λa\Pr[I\sim\mathscr{D}_{a}]=\lambda_{a} for some fixed positive value λa\lambda_{a}. The upper bound mm is given, but no information about the λa\lambda_{a}’s and the 𝒟a\mathscr{D}_{a}’s is provided. Sorting can be solved in a limiting complexity of O(1εnlogm+1εHout)O\bigl{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}H_{\text{out}}\bigr{)} after a training phase that takes O(mnlog2(mn)+mεn1+εlog(mn))O(mn\log^{2}(mn)+m^{\varepsilon}n^{1+\varepsilon}\log(mn)) time [8].

In this paper, we present a self-improving algorithm for constructing Voronoi diagrams under a convex distance function dQd_{Q} in 2\mathbb{R}^{2}, assuming that the input distribution is a hidden mixture of at most mm product distributions. The convex distance function dQd_{Q} is induced by a given convex polygon QQ of O(1)O(1) size. The upper bound mm is given, and we assume that m=o(n)m=o(\sqrt{n}). We also assume that for each product distribution 𝒟a\mathscr{D}_{a} in the mixture, λa=Ω(1/n)\lambda_{a}=\Omega(1/n). Let ε(0,1)\varepsilon\in(0,1) be a parameter fixed beforehand. The training phase uses O(mnlog(mn))O(mn\log(mn)) input instances and takes O(mnlogO(1)(mn)+mεn1+εlog(mn))O\bigl{(}mn\log^{O(1)}(mn)+m^{\varepsilon}n^{1+\varepsilon}\log(mn)\bigr{)} time. In the operation phase, given an input instance II, we can construct its Voronoi diagram VorQ(I)\mathrm{Vor}_{Q}(I) under dQd_{Q} in a limiting complexity of O(1εnlogm+1εn2O(logn)+1εH)O\big{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}n2^{O(\log^{*}n)}+\frac{1}{\varepsilon}H\bigr{)}, where HH denotes the entropy of the distribution of the Voronoi diagram output. Note that Ω(H)\Omega(H) is a lower bound of the expected running time of any comparison-based algorithm. Our algorithm also works for the Euclidean case, and the limiting complexity improves to O(1εnlogm+1εH)O\big{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}H\bigr{)}.

For simplicity, we will assume throughout the rest of this paper that the hidden mixture has exactly mm product distributions. We give an overview of our method in the following.

We follow the strategy in [2] for computing a Euclidean Delaunay triangulation. The idea is to form a set SS of sample points and build Del(S)\mathrm{Del}(S) and some auxiliary structures in the training phase so that any future input instance II can be merged quickly into Del(S)\mathrm{Del}(S) to form Del(SI)\mathrm{Del}(S\cup I), and then Del(I)\mathrm{Del}(I) can be split off in O(n)O(n) expected time. Merging II into Del(S)\mathrm{Del}(S) requires locating the input points in Del(S)\mathrm{Del}(S). The location distribution is gathered in the training phase so that distribution-sensitive point location can be used to avoid the logarithmic query time as much as possible. Modifying Del(S)\mathrm{Del}(S) efficiently into Del(SI)\mathrm{Del}(S\cup I) requires that only O(1)O(1) points in II fall into the same neighborhood in Del(S)\mathrm{Del}(S) in expectation.

In our case, since there are mm product distributions, we will need a larger set SS of mnmn sample points in order to ensure that only O(1)O(1) points in II fall into the same neighborhood in VorQ(S)\mathrm{Vor}_{Q}(S) in expectation. But then merging II into VorQ(S)\mathrm{Vor}_{Q}(S) in the operation phase would be too slow because scanning VorQ(S)\mathrm{Vor}_{Q}(S) already requires Θ(mn)\Theta(mn) time. We need to extract a subset RSR\subseteq S such that RR has O(n)O(n) size and RR contains all points in SS whose Voronoi cells conflict with the input points.

Still, we cannot afford to construct VorQ(R)\mathrm{Vor}_{Q}(R) in O(nlogn)O(n\log n) time. In the training phase, we form a metric dd related to dQd_{Q} and construct a net-tree TST_{S} for SS under dd [16]. In the operation phase, after finding the appropriate RSR\subseteq S, we use nearest common ancestor queries [22] to compress TST_{S} in O(nloglogm)O(n\log\log m) time to a subtree TRT_{R} for RR that has O(n)O(n) size. Next, we use TRT_{R} to construct a well-separated pair decomposition of RR under dd in O(n)O(n) time [16], use the decomposition to compute the nearest neighbor graph of RR under dd in O(n)O(n) time, and then construct VorQ(R)\mathrm{Vor}_{Q}(R) from the nearest neighbor graph in O(n)O(n) expected time. The merging of II into VorQ(R)\mathrm{Vor}_{Q}(R) to form VorQ(RI)\mathrm{Vor}_{Q}(R\cup I), and the splitting of VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) into VorQ(I)\mathrm{Vor}_{Q}(I) and VorQ(R)\mathrm{Vor}_{Q}(R) are obtained by transferring their analogous results in the Euclidean case [2, 6].

We have left out the expected time to locate the input points in VorQ(S)\mathrm{Vor}_{Q}(S). It is bounded by O(1/ε)O(1/\varepsilon) times the sum of the entropies of the point location outcomes. We show that VorQ(I)\mathrm{Vor}_{Q}(I) allows us to locate the input points in VorQ(S)\mathrm{Vor}_{Q}(S) in O(nlogm+n2O(logn))O(n\log m+n2^{O(\log^{*}n)}) time. Then, a result in [2] implies that the sum of the entropies of the point location outcomes is O(nlogm+n2O(logn)+H)O(n\log m+n2^{O(\log^{*}n)}+H). The expected running time is thus O(1εnlogm+1εn2O(logn)+1εH)O(\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}n2^{O(\log^{*}n)}+\frac{1}{\varepsilon}H), which dominates the limiting complexity. In the Euclidean case, Vor(I)\mathrm{Vor}(I) allows us to locate the input points in O(nlogm)O(n\log m) time, so the limiting complexity improves to O(1εnlogm+1εH)O(\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}H).

2 Preliminaries

Let QQ be a convex polygon that has O(1)O(1) complexity and contains the origin in its interior. Let \partial and int()\mathrm{int}(\cdot) be the boundary and interior operators, respectively. So QQ’s boundary is Q\partial Q and its interior is int(Q)\mathrm{int}(Q). Let dQd_{Q} be the distance function induced by QQ: x,y2\forall\,x,y\in\mathbb{R}^{2}, dQ(x,y)=min{λ[0,):yλQ+x}d_{Q}(x,y)=\min\{\lambda\in[0,\infty):y\in\lambda Q+x\}. As QQ may not be centrally symmetric (i.e., xQxQx\in Q\iff-x\in Q), dQd_{Q} may not be a metric.

The bisector of two points pp and qq is {x2:dQ(p,x)=dQ(q,x)}\{x\in\mathbb{R}^{2}:d_{Q}(p,x)=d_{Q}(q,x)\}, which is an open polygonal curve of O(1)O(1) size. The Voronoi diagram of a set Σ\Sigma of nn points, VorQ(Σ)\mathrm{Vor}_{Q}(\Sigma), is a partition of 2\mathbb{R}^{2} into interior-disjoint cells Vp(Σ)={x2:qΣ,dQ(p,x)dQ(q,x)}V_{p}(\Sigma)=\{x\in\mathbb{R}^{2}:\,\forall q\in\Sigma,\,d_{Q}(p,x)\leq d_{Q}(q,x)\} for all pΣp\in\Sigma. There are algorithms for constructing VorQ(Σ)\mathrm{Vor}_{Q}(\Sigma) in O(nlogn)O(n\log n) time [9, 19].

Vp(Σ)V_{p}(\Sigma) is simply connected and star-shaped with respect to pp [9]. We use Np(Σ)N_{p}(\Sigma) to denote the set of Voronoi neighbors of pp in VorQ(Σ)\mathrm{Vor}_{Q}(\Sigma). The Voronoi edges of VorQ(Σ)\mathrm{Vor}_{Q}(\Sigma) form a planar graph of O(|Σ|)O(|\Sigma|) size. Each Voronoi edge is a polygonal line, and we call its internal vertices Voronoi edge bends. We use VΣV_{\Sigma} to denote the set of Voronoi edge bends and Voronoi vertices in VorQ(Σ)\mathrm{Vor}_{Q}(\Sigma). For the infinite Voronoi edges, their endpoints at infinity are included in VΣV_{\Sigma}.

Define Q={x:xQ}Q^{*}=\{-x:x\in Q\}. For any points x,y2x,y\in\mathbb{R}^{2}, dQ(x,y)=dQ(y,x)d_{Q^{*}}(x,y)=d_{Q}(y,x). At any point xx on a Voronoi edge of VorQ(Σ)\mathrm{Vor}_{Q}(\Sigma) defined by p,qΣp,q\in\Sigma, there exists λ(0,)\lambda\in(0,\infty) such that dQ(x,p)=dQ(p,x)=dQ(q,x)=dQ(x,q)=λd_{Q^{*}}(x,p)=d_{Q}(p,x)=d_{Q}(q,x)=d_{Q^{*}}(x,q)=\lambda and dQ(x,s)=dQ(s,x)λd_{Q^{*}}(x,s)=d_{Q}(s,x)\geq\lambda for all sΣs\in\Sigma. Hence, {p,q}(λQ+x)\{p,q\}\subset\partial(\lambda Q^{*}+x) and int(λQ+x)Σ=\mathrm{int}(\lambda Q^{*}+x)\cap\Sigma=\emptyset, i.e., an “empty circle property”.

Take a point xx. Consider the largest homothetic222A homothetic copy of a shape is a scaled and translated copy of it. copy QxQ^{*}_{x} of QQ^{*} centered at xx such that int(Qx)Σ=\mathrm{int}(Q^{*}_{x})\cap\Sigma=\emptyset. If we insert a new point qq to Σ\Sigma, we say that qq conflicts with xx if qQxq\in Q^{*}_{x}. We say that qq conflicts with a cell Vp(Σ)V_{p}(\Sigma) if qq conflicts with some point in Vp(Σ)V_{p}(\Sigma). Clearly, Vp(Σ)V_{p}(\Sigma) must be updated by the insertion of qq. We use VΣ|qV_{\Sigma}|_{q} to denote the subset of VΣV_{\Sigma} that conflict with qq. The Voronoi edge bends and Voronoi vertices in VΣ|qV_{\Sigma}|_{q} will be destroyed by the insertion of qq.

We make three general position assumptions. First, no two sides of QQ are parallel. Second, for every pair of input points, their support line is not parallel to any side of QQ. Third, no four input points lie on the boundary of any homothetic copy of QQ^{*}, which implies that every Voronoi vertex has degree three.

It is much more convenient if all Voronoi cells of the input points are bounded. We assume that all possible input points appear in some fixed bounding square \cal B centered at the origin. We place O(1)O(1) dummy points outside \cal B so that all Voronoi cells of the input points are bounded, and their portions inside \cal B remain the same as before. Refer to Figure 1. Take λQ\lambda Q^{*} for some large enough λ\lambda\in\mathbb{R} such that for every point xx\in{\cal B}, λQ+x\lambda Q^{*}+x contains \cal B. Refer to the left image in Figure 1. We slide a copy of λQ\lambda Q^{*} around \cal B to generate the outer convex polygon. The dashed polygon demonstrates the sliding of λQ\lambda Q^{*} around \cal B. This outer polygon contains a translational copy of every edge of \cal B and two translational copies of every edge of λQ\lambda Q^{*}. We add the vertices of this outer polygon as dummy points. Any homothetic copy of QQ^{*} that intersects \cal B cannot be expanded indefinitely without containing some of these dummy points. So all Voronoi cells of input points are bounded. For each point xx\in{\cal B}, since the dummy points lie outside λQ+x\lambda Q^{*}+x and λQ+x{\cal B}\subseteq\lambda Q^{*}+x (i.e., λQ+x\lambda Q^{*}+x is not empty of the input points), the portion of the Voronoi diagram inside \cal B is unaffected by the dummy points.

Refer to caption

Figure 1: The left image shows the bounding square \cal B and the large enclosing λQ\lambda Q^{*}. In the right image, we slide a copy of λQ\lambda Q^{*} around \cal B to generate the outer convex polygon. The dashed polygon demonstrates the sliding of λQ\lambda Q^{*} around \cal B. The bold edges on this convex polygon are translates of the boundary edges of \cal B. Every edge of λQ\lambda Q^{*} has two translational copies too as labelled.

3 Training phase

Sample set S\boldsymbol{S}. Take mnln(mn)mn\ln(mn) instances I1,I2,,Imnln(mn)I_{1},I_{2},\ldots,I_{mn\ln(mn)}. Define x1,,xmnln(mn)x_{1},\ldots,x_{mn\ln(mn)} by taking the p1p_{1}’s in I1,,Imln(mn)I_{1},\ldots,I_{m\ln(mn)} to be x1,,xmln(mn)x_{1},\ldots,x_{m\ln(mn)}, p2p_{2}’s in Imln(mn)+1,,I2mln(mn)I_{m\ln(mn)+1},\ldots,I_{2m\ln(mn)} to be xmln(mn)+1,,x2mln(mn)x_{m\ln(mn)+1},\ldots,x_{2m\ln(mn)}, and so on. The set SS of sample points includes a 1mn\frac{1}{mn}-net of the xix_{i}’s with respect to the family of homothetic copies of QQ^{*}, as well as the O(1)O(1) dummy points. The set SS has O(mn)O(mn) points and can be constructed in O(mnlogO(1)(mn))O(mn\log^{O(1)}(mn)) time as homothetic copies of QQ^{*} are pseudo-disks [2, 21].

Point location. Compute VorQ(S)\mathrm{Vor}_{Q}(S) and triangulate it by connecting each pSp\in S to VSVp(S)V_{S}\cap\partial V_{p}(S), i.e., the Voronoi edge bends and Voronoi vertices in Vp(S)\partial V_{p}(S). For unbounded Voronoi cells, we view the infinite Voronoi edges as leading to some vertices at infinity; an extra triangulation edge that goes between two infinite Voronoi edges also leads to a vertex at infinity, giving rise to unbounded triangles. Figure 2 shows an example.

Refer to caption

Figure 2: Part of the triangulation of a Voronoi diagram induced by the triangle shown with a gray center. The solid edges form the Voronoi diagram. The dashed edges refine it into a triangulation.

Construct a point location structure LSL_{S} for the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) with O(log(mn))O(\log(mn)) query time [13]. Take another mεnεm^{\varepsilon}n^{\varepsilon} input instances and use LSL_{S} to locate the points in these input instances in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S). For every i[n]i\in[n] and every triangle tt, we compute π~i,t\tilde{\pi}_{i,t} to be the ratio of the frequency of tt hit by pip_{i} to mεnεm^{\varepsilon}n^{\varepsilon}, which is an estimate of Pr[pit]\Pr[p_{i}\in t]. For each i[n]i\in[n], form a subdivision 𝒮i{\cal S}_{i} that consists of triangles with positive π~i,t\tilde{\pi}_{i,t}’s, triangulate the exterior of 𝒮i{\cal S}_{i}, and give these new triangles a zero estimated probability. Set the weight of each triangle in 𝒮i{\cal S}_{i} to be the maximum of (mn)ε(mn)^{-\varepsilon} and its estimated probability. Construct a distribution-sensitive point location structure LiL_{i} for 𝒮i{\cal S}_{i} based on the triangle weights [1, 17]. Note that LiL_{i} has O(mεnε)O(m^{\varepsilon}n^{\varepsilon}) size, and locating a point in a triangle t𝒮it\in{\cal S}_{i} takes O(logWiwt)O\bigl{(}\log\frac{W_{i}}{w_{t}}\bigr{)} time, where wtw_{t} is the weight of tt and WiW_{i} is the total weight in 𝒮i{\cal S}_{i}.

For any input instance (p1,,pn)(p_{1},\ldots,p_{n}) in the operation phase, we will query LiL_{i} to locate pip_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S), which may fail if pip_{i} falls into a triangle with zero estimated probability. If the search fails, we query LSL_{S} to locate pip_{i}.

Net-tree. We first define a metric that is induced by a centrally symmetric convex polygon. Define Q^={xy:x,yQ}\hat{Q}=\{x-y:x,y\in Q^{*}\}, i.e., the Minkowski sum of QQ^{*} and Q-Q^{*}, or equivalently QQ^{*} and QQ. It is centrally symmetric by definition. It can be visualized as the region covered by all possible placements of QQ^{*} that has the origin in the polygon boundary. Since Q^\hat{Q} is a Minkowski sum, its number of vertices is within a constant factor of the total number of vertices of QQ^{*} and Q-Q^{*}, which is O(1)O(1).

Let dd be the metric induced by the centrally symmetric convex polygon Q^\hat{Q}, which is a doubling metric—there is a constant λ>0\lambda>0 such that for any point x2x\in\mathbb{R}^{2} and any positive number rr, the ball with respect to dd centered at xx with radius rr can be covered by λ\lambda balls with respect to dd of radius r/2r/2.

Given a set of points PP, a net-tree for PP with respect to dd [16] is an analog of the well-separated pair decomposition for Euclidean spaces [4]. It is a rooted tree whose leaves are the points in PP. For each node vv, let 𝑝𝑎𝑟𝑒𝑛𝑡(v)\mathit{parent}(v) denote its parent, and let PvP_{v} denote the subset of PP at the leaves that descend from vv. Every tree node vv is given a representative point pvp_{v} and an integer level v\ell_{v}. Let τ11\tau\geq 11 be a fixed constant. Let B(x,h)B(x,h) denote the ball {y2:d(x,y)h}\{y\in\mathbb{R}^{2}:d(x,y)\leq h\}. By the results in [16] (Definition 2.1 and the remark that follows Proposition 2.2), the following properties are satisfied by a net-tree:

  1. (a)

    pvPvp_{v}\in P_{v}.

  2. (b)

    For every non-root node vv, v<𝑝𝑎𝑟𝑒𝑛𝑡(v)\ell_{v}<\ell_{\mathit{parent}(v)}, and if vv is a leaf, then v=\ell_{v}=-\infty.

  3. (c)

    Every internal node has at least two and at most a constant number of children.

  4. (d)

    For every node vv, B(pv,2ττ1τv)B\bigl{(}p_{v},\frac{2\tau}{\tau-1}\cdot\tau^{\ell_{v}}\bigr{)} contains PvP_{v}.

  5. (e)

    For every non-root node vv, B(pv,τ52τ2τ𝑝𝑎𝑟𝑒𝑛𝑡(v)1)PPvB\bigl{(}p_{v},\frac{\tau-5}{2\tau-2}\cdot\tau^{\ell_{\mathit{parent}(v)}-1}\bigr{)}\cap P\subset P_{v}.

  6. (f)

    For every internal node vv, there is a child ww of vv such that pw=pvp_{w}=p_{v}.

Clusters. We construct a net-tree TST_{S} for SS in O(mnlog(mn))O(mn\log(mn)) expected time [16]. We define clusters as follows. Label all leaves of TST_{S} as unclustered initially. Select the leftmost mm unclustered leaves of TST_{S}; if there are fewer than mm such leaves, select them all. Find the subtree rooted at a node vv of TST_{S} that contains the selected unclustered leaves, but no child subtree of vv contains them all. We call the subtree rooted at vv a cluster and label all its leaves clustered. Then, we repeat the above until all leaves of TST_{S} are clustered. By construction, the clusters are disjoint, each cluster has O(m)O(m) nodes, and there are O(n)O(n) clusters in TST_{S}.

We assign nodes in each cluster a unique cluster index in the range [1,O(n)][1,O(n)]. We also assign each node of a cluster three indices from the range [1,O(m)][1,O(m)] according to its rank in the preorder, inorder, and postorder traversals of that cluster. The preorder and postorder indices allow us to tell in O(1)O(1) time whether two nodes are an ancestor-descendant pair.

We keep an initially empty van Emde Boas tree 𝐸𝐵c\mathit{EB}_{c} [23] with each cluster cc. The universe for 𝐸𝐵c\mathit{EB}_{c} is the set of leaves in the cluster cc, and the inorder of these leaves in cc is the total order for 𝐸𝐵c\mathit{EB}_{c}. We also build a nearest common ancestor query data structure for each cluster [22]. The nearest common ancestor query of any two nodes can be reported in O(loglogm)O(\log\log m) time.

Planar separator. VorQ(S)\mathrm{Vor}_{Q}(S) is a planar graph of O(mn)O(mn) size with all Voronoi edge bends and Voronoi vertices as graph vertices. By a recursive application of the planar separator theorem, one can produce an m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S): it is divided into O(n/m)O(n/m) regions, each region contains O(m2)O(m^{2}) vertices, and the boundary of each region contains O(m)O(m) vertices [14].

Extract the subset BSB\subset S of points whose Voronoi cell boundaries contain some region boundary vertices in the m2m^{2}-division. So |B|=O(mn/m)=O(n)|B|=O(m\cdot n/m)=O(n). Compute VorQ(B)\mathrm{Vor}_{Q}(B) and triangulate it as in triangulating VorQ(S)\mathrm{Vor}_{Q}(S). By our choice of BB, the region boundaries in the m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S) form a subgraph of VorQ(B)\mathrm{Vor}_{Q}(B). Label in O(n)O(n) time the Voronoi edge bends and Voronoi vertices in VorQ(B)\mathrm{Vor}_{Q}(B) whether they exist in VorQ(S)\mathrm{Vor}_{Q}(S).

We construct point location data structures for every region Π\Pi in the m2m^{2}-division as follows. For every boundary vertex ww of Π\Pi, let QwQ^{*}_{w} be the largest homothetic copy of QQ^{*} centered at ww such that int(Qw)B=\mathrm{int}(Q^{*}_{w})\cap B=\emptyset. These QwQ^{*}_{w}’s form an arrangement of O(m2)O(m^{2}) complexity, and we construct a point location data structure that allows a point to be located in this arrangement in O(logm)O(\log m) time. We also construct a point location data structure for the portion of the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) inside Π\Pi. Since the region has O(m2)O(m^{2}) complexity, this point location data structure can return in O(logm)O(\log m) time the triangle in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains a point inside Π\Pi.

Output and performance. The following result summarizes the output and performance of the training phase. Its proof is given in Appendix A. The proof of Lemma 3.1(a) is similar to an analogous result for sorting in [8].

Lemma 3.1

Let 𝒟a\mathscr{D}_{a}, a[m]a\in[m], be the distributions in the hidden mixture. The training phase computes the following structures in O(mnlogO(1)(mn)+mεn1+εlog(mn))O(mn\log^{O(1)}(mn)+m^{\varepsilon}n^{1+\varepsilon}\log(mn)) time.

  1. (a)

    A set SS of O(mn)O(mn) points and VorQ(S)\mathrm{Vor}_{Q}(S). It holds with probability at least 11/n1-1/n that for any a[1,m]a\in[1,m] and any vVSv\in V_{S}, i=1nPr[Xiv|I𝒟a]=O(1/m)\sum_{i=1}^{n}\mathrm{Pr}[X_{iv}\,|\,I\sim\mathscr{D}_{a}]=O(1/m), where Xiv=1X_{iv}=1 if piIp_{i}\in I conflicts with vv and Xiv=0X_{iv}=0 otherwise.

  2. (b)

    Point location structures LSL_{S} and LiL_{i} for each i[n]i\in[n] that allow us to locate pip_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) in O(1εH(ti))O\bigl{(}\frac{1}{\varepsilon}H(t_{i})\bigr{)} expected time, where tit_{i} is the random variable that represents the point location outcome, and H(ti)H(t_{i}) is the entropy of the distribution of tit_{i}.

  3. (c)

    A net-tree TST_{S} for SS, the O(n)O(n) clusters in TST_{S}, the initially empty van Emde Boas trees for the clusters, and the nearest common ancestor data structures for the clusters.

  4. (d)

    An m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S), the subset BSB\subseteq S of O(n)O(n) points whose Voronoi cell boundaries contain some region boundary vertices in the m2m^{2}-division, VorQ(B)\mathrm{Vor}_{Q}(B), and the point location data structures for the regions in the m2m^{2}-division.

Lemma 3.1(a) leads to Lemma 3.2 below, which implies that for any vVSv\in V_{S}, if we feed the input points that conflict with vv to a procedure that runs in quadratic time in the worst case, the expected running time of this procedure over all points in VSV_{S} is O(n)O(n). The proof of Lemma 3.2 is just an algebraic manipulation of the probabilities, and it is given in Appendix B.

Lemma 3.2

For every vVSv\in V_{S}, let ZvZ_{v} be the subset of input points that conflict with vv. It holds with probability at least 1O(1/n)1-O(1/n) that vVSE[|Zv|2]=O(n)\sum_{v\in V_{S}}\mathrm{E}\bigl{[}|Z_{v}|^{2}\bigr{]}=O(n).

We state two technical results. Their proofs are in Appendix B. Figure 3(a) and (b) illustrate these two lemmas.

Lemma 3.3

Consider VorQ(Y)\mathrm{Vor}_{Q}(Y) for some point set YY. For any point x2x\in\mathbb{R}^{2}, let QxQ^{*}_{x} be the largest homothetic copy of QQ^{*} centered at xx such that int(Qx)Y=\mathrm{int}(Q^{*}_{x})\cap Y=\emptyset. Let w1w_{1} and w2w_{2} be two adjacent Voronoi edge bends or Voronoi vertices in VorQ(Y)\mathrm{Vor}_{Q}(Y). For any point xw1w2x\in w_{1}w_{2}, QxQw1Qw2Q^{*}_{x}\subseteq Q^{*}_{w_{1}}\cup Q^{*}_{w_{2}}. The same property holds if w1w_{1} and w2w_{2} are Voronoi vertices connected by a Voronoi edge, and xx lies on that Voronoi edge.

Refer to caption Refer to caption
(a) (b)
Figure 3: (a) The points qq and qq^{\prime} define a Voronoi edge, and w1w_{1} and w2w_{2} are two adjacent Voronoi edge bends or Voronoi vertices on this edge. At any point xx between w1w_{1} and w2w_{2}, the polygon QxQ^{*}_{x} (shown dashed) is a subset of Qw1Qw2Q^{*}_{w_{1}}\cup Q^{*}_{w_{2}}. (b) A triangle quvquv in the triangulated Voronoi diagram in Figure 2 is shown. If a point pp conflicts with the white dot (i.e., lies inside the bold dashed triangle), then pp conflicts with uu or vv (i.e., lies inside one of the two light dashed circles.)
Lemma 3.4

Let qq be a point in some point set YY. Let quvquv be a triangle in the triangulated VorQ(Y)\mathrm{Vor}_{Q}(Y). If a point pYp\not\in Y conflicts with a point in quvquv, then pp conflicts with uu or vv. Hence, if pp conflicts with Vq(Y)V_{q}(Y), pp conflicts with a Voronoi edge bend or Voronoi vertex in Vq(Y)\partial V_{q}(Y).

4 Operation phase

Given an instance I=(p1,,pn)I=(p_{1},\cdots,p_{n}), we construct VorQ(I)\mathrm{Vor}_{Q}(I) using the pseudocode below.

Operation Phase

  1. 1.

    For each i[n]i\in[n], query LiL_{i} to find the triangle tit_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains pip_{i}, and if the search fails, query LSL_{S} to find tit_{i}.

  2. 2.

    For each i[n]i\in[n], search VorQ(S)\mathrm{Vor}_{Q}(S) from tit_{i} to find VS|piV_{S}|_{p_{i}}, i.e., the subset of VSV_{S} that conflict with pip_{i}. This also gives the subset of SS whose Voronoi cells conflict with the input points. Let RR be the union of this subset of SS and the set of representative points of all cluster roots in TST_{S}.

  3. 3.

    Compute the compression TRT_{R} of TST_{S} to RR.

  4. 4.

    Construct the nearest neighbor graph 1-NNR\mathrm{NN}_{R} under the metric dd from TRT_{R}.

  5. 5.

    Compute VorQ(R)\mathrm{Vor}_{Q}(R) from 1-NNR\mathrm{NN}_{R}.

  6. 6.

    Modify VorQ(R)\mathrm{Vor}_{Q}(R) to produce VorQ(RI)\mathrm{Vor}_{Q}(R\cup I).

  7. 7.

    Split VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) to produce VorQ(I)\mathrm{Vor}_{Q}(I) and VorQ(R)\mathrm{Vor}_{Q}(R). Return VorQ(I)\mathrm{Vor}_{Q}(I).

We analyze step 1 in Section 4.1, steps 2 and 3 in Section 4.2, steps 4 and 5 in Section 4.3, and steps 6 and 7 in Section 4.4. Step 1 is the most time-consuming; all other steps run in O(n)O(n) expected time or O(nloglogm)O(n\log\log m) expected time.

4.1 Point location

By Lemma 3.1(b), step 1 runs in O(i=1n1εH(ti))O\bigl{(}\sum_{i=1}^{n}\frac{1}{\varepsilon}H(t_{i})\bigr{)} expected time, which is O(1εnlogm+1εH(t1,,tn))O\bigl{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}H(t_{1},\ldots,t_{n})\bigr{)} as we will show later. By Lemma 4.1 below, if there is an algorithm that can use VorQ(I)\mathrm{Vor}_{Q}(I) to determine t1,,tnt_{1},\ldots,t_{n} in c(n)c(n) expected time, then H(t1,,tn)=O(c(n)+H)H(t_{1},\ldots,t_{n})=O(c(n)+H), implying that step 1 takes O(1ε(nlogm+c(n)+H))O\big{(}\frac{1}{\varepsilon}(n\log m+c(n)+H)\bigr{)} expected time. Any preprocessing cost of SS is excluded from c(n)c(n). We present such an algorithm.

Lemma 4.1 (Lemma 2.3 in [2])

Let 𝒟\mathscr{D} be a distribution on a universe 𝒰\cal U. Let X:𝒰𝒳X:{\cal U}\rightarrow{\cal X}, and let Y:𝒰𝒴Y:{\cal U}\rightarrow{\cal Y} be two random variables. Suppose that there is a comparison-based algorithm that computes a function f:(I,X(I))Y(I)f:(I,X(I))\rightarrow Y(I) in CC expected comparisons over 𝒟\mathscr{D} for every I𝒰I\in{\cal U}. Then H(Y)=C+O(H(X))H(Y)=C+O(H(X)).

Recall that we have computed in the training phase the subset BSB\subseteq S whose Voronoi cell boundaries contain some region boundary vertices in the m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S). Note that |B|=O(n)|B|=O(n). We have also computed VorQ(B)\mathrm{Vor}_{Q}(B) and point location data structures associated with the regions in the m2m^{2}-division. We use VorQ(B)\mathrm{Vor}_{Q}(B) and these point location data structures determines t1,,tnt_{1},\ldots,t_{n} as follows.

  • Task 1: Merge VorQ(B)\mathrm{Vor}_{Q}(B) with VorQ(I)\mathrm{Vor}_{Q}(I) to form the triangulated VorQ(BI)\mathrm{Vor}_{Q}(B\cup I).

  • Task 2: Use VorQ(S)\mathrm{Vor}_{Q}(S), VorQ(B)\mathrm{Vor}_{Q}(B), and VorQ(BI)\mathrm{Vor}_{Q}(B\cup I) to find the triangles t1,,tnt_{1},\ldots,t_{n}.

We discuss these two tasks in the following.

Task 1. For every point pBp\in B, define a polygonal cone surface Cp={(a,b,dQ(p,(a,b)):(a,b)2}C_{p}=\bigl{\{}(a,b,d_{Q}(p,(a,b)):(a,b)\in\mathbb{R}^{2}\bigr{\}}. Each horizontal cross-section of CpC_{p} is a scaled copy of QQ centered at pp. The triangulated VorQ(B)\mathrm{Vor}_{Q}(B) is the vertical projection of the lower envelope of {Cp:pB}\{C_{p}:p\in B\}, denoted by (B){\cal L}(B). Similarly, (I){\cal L}(I) projects to VorQ(I)\mathrm{Vor}_{Q}(I). We take the lower envelope of (B){\cal L}(B) and (I){\cal L}(I) to form (BI){\cal L}(B\cup I) which projects to VorQ(BI)\mathrm{Vor}_{Q}(B\cup I). We do so in O(n2O(logn))O(n2^{O(\log^{*}n)}) expected time with a randomized algorithm that is based on an approach proposed and analyzed by Chan [5, Section 4]. More details are given in Appendix C.1.

Task 2. Suppose that for an input point piIp_{i}\in I, we have determined some subset BiB_{i} that satisfies BBiSB\subseteq B_{i}\subseteq S, and we have computed a Voronoi edge bend or Voronoi vertex viv_{i} in VorQ(Bi)\mathrm{Vor}_{Q}(B_{i}) that conflicts with pip_{i} and is known to be in VSV_{S} or not.

If viVSv_{i}\in V_{S}, we search VorQ(S)\mathrm{Vor}_{Q}(S) from viv_{i} to find VS|piV_{S}|_{p_{i}} (i.e., the subset of VSV_{S} that conflict with pip_{i}), which by Lemma 3.3 also gives the triangle tit_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains pip_{i}. By Lemma 3.2, the expected total running time of this procedure over all input points is O(n)O(n).

Suppose that viVSv_{i}\not\in V_{S}. So viv_{i} is not a region boundary vertex in the m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S), i.e., viv_{i} lies inside a region in the m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S), say Π\Pi. For each boundary vertex ww of Π\Pi, let QwQ^{*}_{w} be the largest homothetic copy of QQ^{*} centered at ww such that int(Qw)B=\mathrm{int}(Q^{*}_{w})\cap B=\emptyset. These QwQ^{*}_{w}’s form an arrangement of O(m2)O(m^{2}) complexity, and we locate pip_{i} in this arrangement in O(logm)O(\log m) time. It tells us whether piQwp_{i}\in Q^{*}_{w} for some boundary vertex ww of Π\Pi. If so, then pip_{i} conflicts with ww, which belongs to VSV_{S}, and we search VorQ(S)\mathrm{Vor}_{Q}(S) from ww to find VS|piV_{S}|_{p_{i}} and hence the triangle tit_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains pip_{i}. Otherwise, pip_{i} must lie inside Π\Pi in order to conflict with viv_{i} inside Π\Pi without conflicting with any boundary vertex of Π\Pi. So we do a point location in O(logm)O(\log m) time to locate pip_{i} in the portion of the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) inside Π\Pi. This gives tit_{i}.

How do we compute viv_{i} for pip_{i}? We discuss this computation and provide more details of Step 2 in Appendix C.2. The following lemma summarizes the result that follows from the discussion above.

Lemma 4.2

Given VorQ(I)\mathrm{Vor}_{Q}(I), the triangles t1,,tnt_{1},\ldots,t_{n} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contain p1,,pnIp_{1},\ldots,p_{n}\in I can be computed in O(nlogm+n2O(logn))O\left(n\log m+n2^{O(\log^{*}n)}\right) expected time.

Lemma 4.3

Step 11 of the operation phase takes O(1ε(nlogm+n2O(logn)+H))O\bigl{(}\frac{1}{\varepsilon}(n\log m\!+\!n2^{O(\log^{*}n)}\!+\!\!H)\bigr{)} expected time, where HH is the entropy of the distribution of VorQ(I)\mathrm{Vor}_{Q}(I).

Proof.  Let A[1,m]A\in[1,m] be a random variable that indicates which distribution in the mixture generates the input instance. By the chain rule for conditional entropy [24, Proposition 2.23], H(ti)H(ti)+H(A|ti)=H(ti,A)=H(A)+H(ti|A)H(t_{i})\leq H(t_{i})+H(A|t_{i})=H(t_{i},A)=H(A)+H(t_{i}|A). It is known that H(A)log2(domain size of A)=log2mH(A)\leq\log_{2}(\text{domain size of $A$})=\log_{2}m [24, Theorem 2.43]. Thus, i=1nH(ti)nlog2m+i=1nH(ti|A)\sum_{i=1}^{n}H(t_{i})\leq n\log_{2}m+\sum_{i=1}^{n}H(t_{i}|A). The variables t1|A,,tn|At_{1}|A,\ldots,t_{n}|A are mutually independent. So i=1nH(ti|A)=H(t1,,tn|A)\sum_{i=1}^{n}H(t_{i}|A)=H(t_{1},\ldots,t_{n}|A). Since entropy is not increased by conditioning [24, Theorem 2.38], we get i=1nH(ti|A)=H(t1,,tn|A)H(t1,,tn)\sum_{i=1}^{n}H(t_{i}|A)=H(t_{1},\ldots,t_{n}|A)\leq H(t_{1},\ldots,t_{n}). By Lemma 4.2, we can determine t1,,tnt_{1},\ldots,t_{n} using VorQ(I)\mathrm{Vor}_{Q}(I) in O(nlogm+n2O(logn))O(n\log m+n2^{O(\log^{*}n)}) expected time. So H(t1,,tn)=O(nlogm+n2O(logn)+H)H(t_{1},\ldots,t_{n})=O(n\log m+n2^{O(\log^{*}n)}+H) by Lemma 4.1, where HH is the entropy of the distribution of VorQ(I)\mathrm{Vor}_{Q}(I).    

In the Euclidean metric, merging Vor(B)\mathrm{Vor}(B) and Vor(I)\mathrm{Vor}(I) into Vor(BI)\mathrm{Vor}(B\cup I) can be reduced to finding the intersection of two convex polyhedra of O(n)O(n) size in 3\mathbb{R}^{3}, which can be solved in O(n)O(n) time [5]. So the expected running time of step 1 improves to O(1ε(nlogm+H))O\bigl{(}\frac{1}{\varepsilon}(n\log m+H)\bigr{)}.

4.2 Construction of 𝑹\boldsymbol{R}

Step 1 determines the triangle tit_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains piIp_{i}\in I. We search VorQ(S)\mathrm{Vor}_{Q}(S) from tit_{i} to find VS|piV_{S}|_{p_{i}}, which takes O(|VS|pi|)O\bigl{(}\bigl{|}V_{S}|_{p_{i}}\bigr{|}\bigr{)} time [19]. This search also gives the Voronoi cells that conflict with pip_{i}. The total time over all i[n]i\in[n] is O(vVS|Zv|)O\bigl{(}\sum_{v\in V_{S}}|Z_{v}|\bigr{)}, where ZvZ_{v} is the subset of input points that conflict with vv. Since RR includes all sites whose cells conflict with the input points and the representative points of all cluster roots in TST_{S}, we have |R|vVS|Zv|+O(n)|R|\leq\sum_{v\in V_{S}}|Z_{v}|+O(n). The following result follows from Lemma 3.2.

Lemma 4.4

The set RR has O(n)O(n) expected size. Step 2 of the operation phase constructs RR in O(n)O(n) expected time.

4.3 Extraction of 𝐕𝐨𝐫𝑸(𝑹)\boldsymbol{\mathrm{Vor}_{Q}(R)}

4.3.1 Construction of 𝑻𝑹\boldsymbol{T_{R}}

We define a compression of a net-tree TT. Select a subset UU of leaves in TT. Let TTT^{\prime}\subseteq T be the minimal subtree that spans UU. Bypass all internal nodes in TT^{\prime} that have only one child. The resulting tree is the compression of TT to UU. The following result is an easy observation.

Lemma 4.5

Let TT be a net-tree. Let T1T_{1} be the compression of TT to a subset U1U_{1} of leaves. The compression of T1T_{1} to any subset U2U_{2} of leaves in T1T_{1} can also be obtained by a compression of TT to U2U_{2}.

Conceptually, TRT_{R} is defined as follows. Select all leaves of TST_{S} that are points in RR, and TRT_{R} is the compression of TST_{S} to these selected leaves. Since RR includes the representative points of all cluster roots, all ancestors of the cluster roots in TST_{S} will survive the compression and exist as nodes in TRT_{R}. The compression affects the clusters only. More precisely, for each cluster cc in TST_{S}, we select its leaves that are points in RR and compute the compression TcT_{c} of the cluster cc to these selected leaves. Substituting every cluster cc in TST_{S} by TcT_{c} gives the desired TRT_{R}. It remains to discuss how to compute the TcT_{c}’s.

We divide RR in O(n)O(n) expected time into sublists R1,R2,R_{1},R_{2},\ldots such that RcR_{c} consists of the points that are leaves in cluster cc. Recall that every cluster cc has an initially empty van Emde Boas tree 𝐸𝐵c\mathit{EB}_{c} for its leaves in left-to-right order. For each RcR_{c}, we insert all leaves in RcR_{c} into 𝐸𝐵c\mathit{EB}_{c} and then repeatedly perform extract-min on 𝐸𝐵c\mathit{EB}_{c}. This gives in O(|Rc|loglogm)O(|R_{c}|\log\log m) time a sorted list RcR^{\prime}_{c} of the leaves in RcR_{c} according to their left-to-right order in the cluster cc.

If |Rc|=1|R^{\prime}_{c}|=1, then TcT_{c} consists of the single leaf in RcR_{c}. Suppose that |Rc|2|R^{\prime}_{c}|\geq 2. We construct TcT_{c} using a stack. Initially, TcT_{c} is a single node which is the first leaf in RcR^{\prime}_{c}. The stack stores the nodes on the rightmost root-to-leaf path in the current TcT_{c}, with the root at the stack bottom and the leaf at the stack top. When we scan the next leaf qq in RcR^{\prime}_{c}, we find in cluster cc the nearest common ancestor xx of qq and qq’s predecessor in RcR^{\prime}_{c}. This takes O(loglogm)O(\log\log m) time [22]. If we see xx at the stack top, we add qq as a new leaf to TcT_{c} with xx as its parent, and then we push qq onto the stack. Refer to the left image in Figure 4. If we see an ancestor zz of xx at the stack top, let yy be the node that was immediately above zz in the stack and was just popped, we make xx the rightmost child of zz in TcT_{c} (which was yy previously), we also make yy and qq the left and right children of xx respectively, and then we push xx and qq in this order onto the stack. Refer to the middle image in Figure 4. If neither of the two conditions above happens and the stack is not empty, we pop the stack and repeat. Refer to the right image in Figure 4. If the stack becomes empty, we make xx the new root of TcT_{c}, we also make the old root of TcT_{c} and qq the left and right children of xx respectively, and then we push xx and qq in this order onto the stack. The construction of TcT_{c} takes O(|Rc|loglogm)O(|R_{c}|\log\log m) time.

Refer to caption

Figure 4: Three different cases in the manipulation of the stack. The tree shown is a part of TST_{S}. The gray nodes are nodes in TcT_{c}. The gray leaves are leaves in RcR_{c}.
Lemma 4.6

The compression TRT_{R} of TST_{S} to RR can be computed in O(nloglogm)O(n\log\log m) time.

4.3.2 Construction of the 𝒌\boldsymbol{k}-nearest neighbor graph

Let XX be any subset of SS. Assume that the compression TXT_{X} of TST_{S} to XX is available. We show how to use TXT_{X} to construct in O(k|X|)O(k|X|) time the kk-nearest neighbor graph of XX under the metric dd. We denote this graph by kk-NNX\mathrm{NN}_{X}. We will use the well-separated pair decomposition or WSPD for short. For any c1c\geq 1, a set {{A1,B1},,{As,Bs}}\bigl{\{}\{A_{1},B_{1}\},\ldots,\{A_{s},B_{s}\}\bigr{\}} is a cc-WSPD of XX under dd if the following properties are satisfied:

  • i,Ai,BiX\forall\,i,\,\,\,A_{i},B_{i}\subseteq X.

  • distinctx,yX\forall\,\text{distinct}\,x,y\in X, i\exists\,i such that {x,y}{{a,b}:aAibBi}\bigl{\{}x,y\}\in\bigl{\{}\{a,b\}:a\in A_{i}\wedge b\in B_{i}\bigr{\}}.

  • i\forall\,i, the maximum of the diameters of AiA_{i} and BiB_{i} under dd is less than 1cd(Ai,Bi)\frac{1}{c}\cdot d(A_{i},B_{i}). It implies that AiBi=A_{i}\cap B_{i}=\emptyset.

It is known that a cc-WSPD has O(cO(1)|X|)O(c^{O(1)}|X|) size and can be constructed in O(c(O(1)|X|)O(c^{(O(1)}|X|) time from a net-tree for XX [16]. The same method works for a compression TXT_{X} of TST_{S} to XX, giving a cc-WPSD of O((c+1)O(1)|X|)O((c+1)^{O(1)}|X|) size in O((c+1)O(1)|X|)O((c+1)^{O(1)}|X|) time. The details of the WSPD construction are given in Appendix D.1. To compute kk-NNX\mathrm{NN}_{X}, we transfer a strategy in [4] for constructing a Euclidean kk-nearest neighbor graph using a WSPD. The details are given in Appendix D.2.

Lemma 4.7

Given the compression TXT_{X} of TST_{S} to any subset XSX\subseteq S, the kk-NNX\mathrm{NN}_{X} can be constructed in O(k|X|)O(k|X|) time.

The next result shows that the vertex degree of 11-NNX\mathrm{NN}_{X} is O(1)O(1). Its proof is given in Appendix D.3 which is adapted from an analogous result in the Euclidean case [20].

Lemma 4.8

For any subset XSX\subseteq S, every vertex in 11-NNX\mathrm{NN}_{X} has O(1)O(1) degree, and adjacent vertices in 11-NNX\mathrm{NN}_{X} are Voronoi neighbors in VorQ(X)\mathrm{Vor}_{Q}(X).

4.3.3 𝐕𝐨𝐫𝑸(𝑹)\boldsymbol{\mathrm{Vor}_{Q}(R)} from the nearest neighbor graph

We show how to construct VorQ(R)\mathrm{Vor}_{Q}(R) in O(n)O(n) expected time using 1-NNR\mathrm{NN}_{R}. We use the following recursive routine which is similar to the one in [3] for constructing an Euclidean Delaunay triangulation from the Euclidean nearest neighbor graph. The top-level call is VorNN(R,TR)(R,T_{R}).

VorNN(Y,TY)(Y,T_{Y})

  1. 1.

    If |Y|=O(1)|Y|=O(1), compute VorQ(Y)\mathrm{Vor}_{Q}(Y) directly and return.

  2. 2.

    Compute 1-NNY\mathrm{NN}_{Y} under the metric dd using TYT_{Y}.

  3. 3.

    Let XYX\subseteq Y be a random sample such that XX meets every connected component of 1-NNY\mathrm{NN}_{Y}, and Pr[pX]=1/2\mathrm{Pr}[p\in X]=1/2 for every pYp\in Y.

  4. 4.

    Compute the compression TXT_{X} of TYT_{Y} to XX.

  5. 5.

    Call VorNN(X,TX)(X,T_{X}) to compute VorQ(X)\mathrm{Vor}_{Q}(X).

  6. 6.

    Using 1-NNY\mathrm{NN}_{Y} as a guide, insert the points in YXY\setminus X into VorQ(X)\mathrm{Vor}_{Q}(X) to form VorQ(Y)\mathrm{Vor}_{Q}(Y).

There are two differences from [3]. First, we use a compression TYT_{Y} of TST_{S} to compute 1-NNY\mathrm{NN}_{Y} in step 2, which takes O(|Y|)O(|Y|) time by Lemma 4.7. Second, we need to compress TYT_{Y} to TXT_{X} in step 4. This compression works in almost the same way as described in Section 4.3.1 except that we can afford to traverse TYT_{Y} in O(|Y|)O(|Y|) time to answer all nearest common ancestor queries required for constructing TXT_{X}. Thus, step 4 runs in O(|Y|)O(|Y|) time.

Step 3 is implemented as follows [3]. Form an arbitrary maximal matching of 1-NNY\mathrm{NN}_{Y}. By the definition of 1-NNY\mathrm{NN}_{Y}, each connected component of 1-NNY\mathrm{NN}_{Y} contains at least one matched pair. Randomly select one point from every matched pair. Then, among those unmatched points in 1-NNY\mathrm{NN}_{Y}, select each one with probability 1/2 uniformly at random. The selected points form the subset XX required in step 3. The time needed is O(|Y|)O(|Y|).

In step 6, for each pYXp\in Y\setminus X that is connected to some point qXq\in X in 1-NNY\mathrm{NN}_{Y}, pp and qq are Voronoi neighbors in VorQ(Y)\mathrm{Vor}_{Q}(Y) by Lemma 4.8. So pp conflicts with a point in Vq(X)V_{q}(X). By Lemma 3.4, pp conflicts with a Voronoi edge bend or Voronoi vertex in Vq(X)\partial V_{q}(X), which can be found in O(|Vq(X)|)O\bigl{(}\bigl{|}\partial V_{q}(X)\bigr{|}\bigr{)} time. After finding a Voronoi edge bend or Voronoi vertex vv in Vq(X)\partial V_{q}(X) that conflicts with pp, we search VorQ(X)\mathrm{Vor}_{Q}(X) from vv to find all Voronoi edge bends and Voronoi vertices that conflict with pp. In the same search of VorQ(X)\mathrm{Vor}_{Q}(X) , we modify VorQ(X)\mathrm{Vor}_{Q}(X) into VorQ(X{p})\mathrm{Vor}_{Q}\bigl{(}X\cup\{p\}\bigr{)} as in a randomized incremental construction [19]. By the Clarkson-Shor analysis [11], the expected running time of the search of VorQ(X)\mathrm{Vor}_{Q}(X) and the Voronoi diagram modification over the insertions of all points in YXY\setminus X is O(|Y|)O(|Y|). We spend O(|Vq(X)|)O\bigl{(}\bigl{|}\partial V_{q}(X)\bigr{|}\bigr{)} time to find vv. It translates to an O(1)O(1) charge at each vertex of Vq(X)V_{q}(X). This charging happens only for qq’s neighbors in 1-NNY\mathrm{NN}_{Y}. By Lemma 4.8, there are O(1)O(1) such neighbors of qq, so the charge at each vertex of Vq(X)V_{q}(X) is O(1)O(1). Moreover, if a vertex of Vq(X)V_{q}(X) is destroyed by the insertion of a point from YXY\setminus X, that vertex will not reappear. So the O(|Vq(X)|)O\bigl{(}\bigl{|}\partial V_{q}(X)\bigr{|}\bigr{)} cost is absorbed by the structural changes which is already taken care of by the Clarkson-Shor analysis. Unwinding the recursion gives a total expected running time of O(|R|+|R|/2+|R|/4+)=O(|R|)O(|R|+|R|/2+|R|/4+\cdots)=O(|R|). For completeness, more details of the analysis given in [3] can be found in Appendix E.

Lemma 4.9

VorNN(R,TR)(R,T_{R}) computes VorQ(R)\mathrm{Vor}_{Q}(R) in O(|R|)O(|R|) expected time.

4.4 Computing 𝐕𝐨𝐫𝑸(𝑰)\boldsymbol{\mathrm{Vor}_{Q}(I)} from 𝐕𝐨𝐫𝑸(𝑹)\boldsymbol{\mathrm{Vor}_{Q}(R)} and 𝑰\boldsymbol{I}

Let qq be a point in RR. Let v1,v2,v_{1},v_{2},\ldots be the vertices of Vq(R)V_{q}(R), in clockwise order, which may be Voronoi edge bends or Voronoi vertices. Let QviQ^{*}_{v_{i}} denote the largest homothetic copy of QQ^{*} centered at viv_{i} such that int(Qvi)R=\mathrm{int}(Q^{*}_{v_{i}})\cap R=\emptyset. Let Zvi=QviIZ_{v_{i}}=Q^{*}_{v_{i}}\cap I where II is an input instance.

Lemma 4.10

The portions of VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) and VorQ({q}ZviZvi+1)\mathrm{Vor}_{Q}\bigl{(}\{q\}\cup Z_{v_{i}}\cup Z_{v_{i+1}}\bigr{)} inside the triangle qvivi+1qv_{i}v_{i+1} are identical.

Proof.  Let pp be a point in (RI){q}(R\cup I)\setminus\{q\} that contributes to VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) inside qvivi+1qv_{i}v_{i+1}. As qvivi+1Vq(R)qv_{i}v_{i+1}\subseteq V_{q}(R), pRp\not\in R. So pIp\in I. By Lemma 3.4, pp conflicts with viv_{i} or vi+1v_{i+1}.    

Step 2 of the operation phase has found VS|piV_{S}|_{p_{i}} for each piIp_{i}\in I. VS|piV_{S}|_{p_{i}} and the portions of the Voronoi edges of VorQ(S)\mathrm{Vor}_{Q}(S) among the points in VS|piV_{S}|_{p_{i}} are preserved in VorQ(R)\mathrm{Vor}_{Q}(R) because RR includes the subset of SS whose Voronoi cells conflict with the input points. Hence, i=1nVS|pi\bigcup_{i=1}^{n}V_{S}|_{p_{i}} is the set URU_{R} of Voronoi edge bends and Voronoi vertices in VorQ(R)\mathrm{Vor}_{Q}(R) that conflict with the input points (refer to Appendix F for a proof). By Lemma 4.10, we locally compute pieces of VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) and stitch them together. The running time is O(u,v(|Zu|+|Zv|)log(|Zu|+|Zv|))O\bigl{(}\sum_{u,v}(|Z_{u}|+|Z_{v}|)\log(|Z_{u}|+|Z_{v}|)\bigr{)}, where the sum is over all pairs {u,v}\{u,v\} of adjacent Voronoi edge bends and Voronoi vertices in VorQ(R)\mathrm{Vor}_{Q}(R) such that {u,v}UR\{u,v\}\cap U_{R}\not=\emptyset. Since the degrees of Voronoi edge bends and Voronoi vertices are two and three respectively, this running time can be bounded by O(vUR|Zv|log|Zv|)O\big{(}\sum_{v\in U_{R}}|Z_{v}|\log|Z_{v}|\bigr{)}. Since URVSU_{R}\subseteq V_{S}, by Lemma 3.2, step 6 of the operation phase computes VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) in O(n)O(n) expected time.

In step 7, the splitting of VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) into VorQ(R)\mathrm{Vor}_{Q}(R) and VorQ(I)\mathrm{Vor}_{Q}(I) can be performed in O(n)O(n) expected time by using the algorithm in [6] for splitting a Euclidean Delaunay triangulation. That algorithm is combinatorial in nature. It relies on the Voronoi diagram being planar and of O(n)O(n) size, all points having O(1)O(1) degrees in the nearest neighbor graph, and that one can delete a site from a Voronoi diagram in time proportional to its number of Voronoi neighbors. The first two properties hold in our case, and it is known how to delete a site from an abstract Voronoi diagram so that the expected running time is proportional to its number of Voronoi neighbors [18].

Lemma 4.11

Step 6 of the operation phase computes VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) in O(n)O(n) expected time, and step 7 splits VorQ(RI)\mathrm{Vor}_{Q}(R\cup I) into VorQ(I)\mathrm{Vor}_{Q}(I) and VorQ(R)\mathrm{Vor}_{Q}(R) in O(n)O(n) expected time.

In summary, since steps 2-7 of the operation phase take O(n)O(n) expected time, the limiting complexity is dominated by the O(1εnlogm+1εn2O(logn)+1εH)O\bigl{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}n2^{O(\log^{*}n)}+\frac{1}{\varepsilon}H\big{)} expected running time of step 1. In the Euclidean case, step 1 runs faster in O(1εnlogm+1εH)O\bigl{(}\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}H\big{)} time.

Theorem 4.1

Let QQ be a convex polygon with O(1)O(1) complexity. Let nn be the input size. For any ε(0,1)\varepsilon\in(0,1) and any hidden mixture of at most m=o(n)m=o(\sqrt{n}) product distributions such that each distribution contributes an instance with a probability of Ω(1/n)\Omega(1/n), there is a self-improving algorithm for constructing a Voronoi diagram under dQd_{Q} with a limiting complexity of O(1εnlogm+1εn2O(logn)+1εH)O(\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}n2^{O(\log^{*}n)}+\frac{1}{\varepsilon}H). For the Euclidean metric, the limiting complexity is O(1εnlogm+1εH)O(\frac{1}{\varepsilon}n\log m+\frac{1}{\varepsilon}H). The training phase runs in O(mnlog2(mn)+mεn1+εlog(mn))O(mn\log^{2}(mn)+m^{\varepsilon}n^{1+\varepsilon}\log(mn)) time. The success probability is at least 1O(1/n)1-O(1/n).

5 Conclusion

It is open whether one can get rid of the requirement that each distribution in the mixture contributes an instance with a probability of Ω(1/n)\Omega(1/n), which is not needed for self-improving sorting [8]. Eliminating the n2O(logn)n2^{O(\log^{*}n)} term from the limiting complexity might require solving the question raised in [5] that whether there is an O(n)O(n)-time algorithm for computing the lower envelope of pseudo-planes. As a Voronoi diagram can be interpreted as the lower envelope of some appropriate surfaces, a natural question is what surfaces admit a self-improving lower envelope algorithm.

References

  • [1] S. Arya, T. Malamatos, D.M. Mount, and K.C. Wong. Optimal expected-case planar point location. SIAM Journal on Computing, 37 (2007), 584–610.
  • [2] N. Ailon, B. Chazelle, K. Clarkson, D. Liu, W. Mulzer, and C. Seshadhri. Self-improving algorithms. SIAM Journal on Computing, 40 (2011), 350–375.
  • [3] K. Buchin and W. Mulzer. Delaunay triangulations in o(sort(n))o(\text{sort}(n)) time and more. Journal of the ACM, 58 (2011), 6:1–6:27.
  • [4] P.B. Callahan and S.R. Kosaraju. A decomposition of multidimensional point sets with applications to kk-nearest-neighbors and nn-body potential fields. Journal of the ACM, 42 (1995), 67–90.
  • [5] T.M. Chan. A simpler linear-time algorithm for intersecting two convex polyhedra in three dimensions. Discrete & Computational Geometry, 56 (2016), 860–865.
  • [6] B. Chazelle, O. Devillers, F. Hurtado, M. Mora, V. Sacristan, and M. Teillaud. Splitting a delaunay triangulation in linear time. Algorithmica, 34 (2002), 39–46.
  • [7] S.-W. Cheng, M.-K. Chiu, K. Jin, and M.T. Wong. A generalization of self-improving algorithms. Proceedings of the International Symposium on Computational Geometry, 2020, 29:1–29:13. Full version: arXiv:2003.08329v2.
  • [8] S.-W. Cheng, K. Jin, and L. Yan. Extensions of self-improving sorters. Algorithmica, 82 (2020), 88–106.
  • [9] L. Paul Chew and R.L. Scot Drysdale. Voronoi diagrams based on convex distance functions. Proceedings of the 1st Annual Symposium on Computational Geometry, 1985, 235–244.
  • [10] K.L. Clarkson, W. Mulzer, and C. Seshadhri. Self-improving algorithms for coordinatewise maxima and convex hulls. SIAM Journal on Computing, 43(2):617–653, 2014.
  • [11] K.L. Clarkson and P.W. Shor. Applications of random sampling in computational geometry, II. Discrete and Computational Geometry, 4 (1989), 387–421.
  • [12] T.M. Cover and J.A. Thomas. Elements of Information Theory. Wiley-Interscience, New York, 2nd edition, 2006.
  • [13] H. Edelsbrunner, L.J. Guibas, and J. Stolfi. Optimal point location in a monotone subdivision. SIAM Journal on Computing, 15 (1986), 317–340.
  • [14] G.N. Frederickson. Fast algorithms for shortest paths in planar graphs, with applications. SIAM Journal on Computing, 16 (1987), 1004–1022.
  • [15] M.L. Fredman. How good is the information theory bound in sorting? Theoretical Computer Science, 1(4):355 – 361, 1976.
  • [16] S. Har-Peled and M. Mendel. Fast construction of nets in low-dimensional metrics and their applications. SIAM Journal on Computing, 35 (2006), 1148–1184.
  • [17] J. Iacono. Expected asymptotically optimal planar point location. Computational Geometry: Theory and Applications, 29 (2004), 19–22.
  • [18] K. Junginer and E. Papadopoulou. Deletion in abstract Voronoi diagram in expected linear time. Proceedings of the 34th International Symposium on Computational Geometry, 2018, 50:1–50:14.
  • [19] R. Klein, K. Mehlhorn, and S. Meiser. Randomized incremental construction of abstract Voronoi diagrams. Computational Geometry: Theory and Applications, 3 (1993), 157–184.
  • [20] G.L. Miller, S.-H. Teng, W. Thurston, and S.A. Vavasis. Separators for sphere-packings and nearest neighbor graphs. Journal of the ACM, 44 (1997), 1–29.
  • [21] E. Pyrga and S. Ray. New existence proofs for ϵ\epsilon-nets. Proceedings of the 24th Annual Symposium on Computational Geometry, 2008, 199–207.
  • [22] A.K. Tsakalides and J. van Leeuwen. An optimal pointer machine algorithm for finding nearest common ancestors. Technical Report RUU-CS-88-17, Department of Computer Science, University of Utrecht, 1988.
  • [23] P. van Emde Boas, R. Kaas, and E. Zijlstra. Design and implementation of an efficient priority queue. Mathematical Systems Theory, 10 (1977), 99–127.
  • [24] R.W. Yeung. A First Course in Information Theory, Kluwer Academic/Plenum Publishers, 2002.

Appendix A Proof of Lemma 3.1

We restate Lemma 3.1 and then give its proof.

Statement of Lemma 3.1:   Let 𝒟a\mathscr{D}_{a}, a[m]a\in[m], be the distributions in the hidden mixture. The training phase computes the following structures in O(mnlogO(1)(mn)+mεn1+εlog(mn))O(mn\log^{O(1)}(mn)+m^{\varepsilon}n^{1+\varepsilon}\log(mn)) time.

  1. (a)

    A set SS of O(mn)O(mn) points and VorQ(S)\mathrm{Vor}_{Q}(S). It holds with probability at least 11/n1-1/n that for any a[1,m]a\in[1,m] and any vVSv\in V_{S}, i=1nPr[Xiv|I𝒟a]=O(1/m)\sum_{i=1}^{n}\mathrm{Pr}[X_{iv}\,|\,I\sim\mathscr{D}_{a}]=O(1/m), where Xiv=1X_{iv}=1 if piIp_{i}\in I conflicts with vv and Xiv=0X_{iv}=0 otherwise.

  2. (b)

    Point location structures LSL_{S} and LiL_{i} for each i[n]i\in[n] that allow us to locate pip_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) in O(1εH(ti))O\bigl{(}\frac{1}{\varepsilon}H(t_{i})\bigr{)} expected time, where tit_{i} is the random variable that represents the point location outcome, and H(ti)H(t_{i}) is the entropy of the distribution of tit_{i}.

  3. (c)

    A net-tree TST_{S} for SS, the O(n)O(n) clusters in TST_{S}, the initially empty van Emde Boas trees for the clusters, and the nearest common ancestor data structures for the clusters.

  4. (d)

    An m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S), the subset BSB\subseteq S of O(n)O(n) points whose Voronoi cell boundaries contain some region boundary vertices in the m2m^{2}-division, VorQ(B)\mathrm{Vor}_{Q}(B), and the point location data structures for the regions in the m2m^{2}-division.

Proof.  Let X={x1,,xmnln(mn)}X=\{x_{1},\ldots,x_{mn\ln(mn)}\} be the set of points from which the 1mn\frac{1}{mn}-net was extracted. The set SS consists of this 1mn\frac{1}{mn}-net and the O(1)O(1) dummy points. Let σ={j1,j2,j3}[1,mnln(mn)]\sigma=\{j_{1},j_{2},j_{3}\}\subset[1,mn\ln(mn)] be a triple of distinct indices. Let QσQ^{*}_{\sigma} be the homothetic copy of QQ^{*} that circumscribes xj1x_{j_{1}}, xj2x_{j_{2}} and xj3x_{j_{3}} if it exists; otherwise, we ignore σ\sigma. Assume that σ\sigma is not ignored. We analyze the number of points in any input instance that fall inside QσQ^{*}_{\sigma}.

Fix any product distribution 𝒟a\mathscr{D}_{a} in the hidden mixture. Let 𝒥a,σ={i[1,mnln(mn)]σ:xi is drawn 𝒟a}{\cal J}_{a,\sigma}=\{i\in[1,mn\ln(mn)]\setminus\sigma:\text{$x_{i}$ is drawn $\mathscr{D}_{a}$}\}. How large is 𝒥a,σ{\cal J}_{a,\sigma}? Since Pr[I𝒟a]=Ω(1/n)\Pr[I\sim\mathscr{D}_{a}]=\Omega(1/n), the expected size of 𝒥a,σ{\cal J}_{a,\sigma} is Ω(mln(mn))\Omega(m\ln(mn)). Then, Chernoff bound implies that |𝒥a,σ|=Ω(mln(mn))|{\cal J}_{a,\sigma}|=\Omega(m\ln(mn)) with probability at least 1(mn)Ω(m)1-(mn)^{-\Omega(m)}.

For every i𝒥a,σi\in{\cal J}_{a,\sigma}, define Ya,σ(i)=1Y_{a,\sigma}(i)=1 if xiQσx_{i}\in Q^{*}_{\sigma}; otherwise, Ya,σ(i)=0Y_{a,\sigma}(i)=0. Let Ya,σ=i𝒥a,σYa,σ(i)Y_{a,\sigma}=\sum_{i\in{\cal J}_{a,\sigma}}Y_{a,\sigma}(i). The variables Ya,σ(i)Y_{a,\sigma}(i)’s are independent from each other, so the Chernoff bound is applicable to Ya,σY_{a,\sigma}. It says that for any λ(0,1)\lambda\in(0,1), Pr[Ya,σ>(1λ)E[Ya,σ]]>1e12λ2E[Ya,σ]\mathrm{Pr}\bigl{[}Y_{a,\sigma}>(1-\lambda)\mathrm{E}[Y_{a,\sigma}]\bigr{]}>1-e^{-\frac{1}{2}\lambda^{2}\mathrm{E}[Y_{a,\sigma}]}.

If E[Ya,σ]>2λ2(1λ)ln(mn)\mathrm{E}[Y_{a,\sigma}]>\frac{2}{\lambda^{2}(1-\lambda)}\ln(mn), then Pr[Ya,σ>2λ2ln(mn)]>1(mn)1/(1λ)\Pr\bigl{[}Y_{a,\sigma}>\frac{2}{\lambda^{2}}\ln(mn)\bigr{]}>1-(mn)^{-1/(1-\lambda)}. Setting λ=4/5\lambda=4/5 gives E[Ya,σ]>1258ln(mn)Pr[Ya,σ>258ln(mn)]>1(mn)5\mathrm{E}[Y_{a,\sigma}]>\frac{125}{8}\ln(mn)\Rightarrow\Pr\bigl{[}Y_{a,\sigma}>\frac{25}{8}\ln(mn)\bigr{]}>1-(mn)^{-5}. There are fewer than m3n3ln3(mn)m^{3}n^{3}\ln^{3}(mn) triples of distinct indices. By the union bound, it holds with probability at least 1ln3(mn)/(m2n2)>11/(mn)1-\ln^{3}(mn)/(m^{2}n^{2})>1-1/(mn) that for any triple σ\sigma of distinct indices, if E[Ya,σ]>1258ln(mn)\mathrm{E}[Y_{a,\sigma}]>\frac{125}{8}\ln(mn), then Ya,σ>258ln(mn)Y_{a,\sigma}>\frac{25}{8}\ln(mn).

Consider any Voronoi vertex vVSv\in V_{S} and its defining triple σ\sigma. If |QσX||X|/(mn)=ln(mn)|Q^{*}_{\sigma}\cap X|\geq|X|/(mn)=\ln(mn), then QσSQ^{*}_{\sigma}\cap S\not=\emptyset because SS is a 1mn\frac{1}{mn}-net of XX. But QσSQ^{*}_{\sigma}\cap S is empty as vv is a Voronoi vertex, which implies that |QσX|<ln(mn)|Q^{*}_{\sigma}\cap X|<\ln(mn). If we restrict our attention to instances in 𝒥a,σ{\cal J}_{a,\sigma} that contribute to QσXQ^{*}_{\sigma}\cap X, the count does not get bigger. That is, Ya,σ|QσX|<ln(mn)Y_{a,\sigma}\leq|Q^{*}_{\sigma}\cap X|<\ln(mn). By the contrapositive of the result that we obtained earlier on the relation between E[Ya,σ]\mathrm{E}[Y_{a,\sigma}] and Ya,σY_{a,\sigma}, we conclude that E[Ya,σ]1258ln(mn)\mathrm{E}[Y_{a,\sigma}]\leq\frac{125}{8}\ln(mn). Moreover, this upper bound on E[Ya,σ]\mathrm{E}[Y_{a,\sigma}] hold simultaneously for all defining triples of the Voronoi vertices in VSV_{S} with probability at least 11/(mn)1-1/(mn).

Since the input distribution is oblivious of the training and operation phases, we can use the instances in 𝒥a,σ{\cal J}_{a,\sigma} to derive the following inequality: E[Ya,σ]|𝒥a,σ|(i=1nPr[Xiv|I𝒟a])3\mathrm{E}\left[Y_{a,\sigma}\right]\geq|{\cal J}_{a,\sigma}|\cdot\left(\sum_{i=1}^{n}\mathrm{Pr}\bigl{[}X_{iv}\,|\,I\sim\mathscr{D}_{a}\bigr{]}\right)-3. The additive term of 3-3 stems from the fact that the indices in σ\sigma are excluded from 𝒥a,σ{\cal J}_{a,\sigma} in the definition of Ya,σY_{a,\sigma}, but they are allowed in |𝒥a,σ|i=1nPr[Xiv|I𝒟a]|{\cal J}_{a,\sigma}|\cdot\sum_{i=1}^{n}\mathrm{Pr}\bigl{[}X_{iv}\,|\,I\sim\mathscr{D}_{a}\bigr{]}. Rearranging terms gives

i=1nPr[Xiv|I𝒟a]=O(E[Ya,σ]+3|𝒥a,σ|)=O(E[Ya,σ]+3mln(mn))=O(1m).\sum_{i=1}^{n}\mathrm{Pr}\bigl{[}X_{iv}\,|\,I\sim\mathscr{D}_{a}\bigr{]}=O\left(\frac{\mathrm{E}\left[Y_{a,\sigma}\right]+3}{|{\cal J}_{a,\sigma}|}\right)=O\left(\frac{\mathrm{E}\left[Y_{a,\sigma}\right]+3}{m\ln(mn)}\right)=O\left(\frac{1}{m}\right).

As discussed before, the above result holds for 𝒟a\mathscr{D}_{a} with probability at least 11/(mn)1-1/(mn). Applying the union bound over all a[m]a\in[m], we get a success probability of at least 11/n1-1/n.

Consider (b). If pip_{i} falls into a triangle t𝒮it\in{\cal S}_{i} with weight wtw_{t}, the distribution-sensitive point location data structure [1, 17] ensures that the query time of LiL_{i} is O(log(W/wt))O(\log(W/w_{t})), where W=t𝒮iwtW=\sum_{t\in{\cal S}_{i}}w_{t}. Since wtw_{t} is defined to be max{(mn)ε,π~i,t}\max\bigl{\{}(mn)^{-\varepsilon},\tilde{\pi}_{i,t}\bigr{\}} and the complexity of 𝒮i{\cal S}_{i} is O(mεnε)O(m^{\varepsilon}n^{\varepsilon}), we have Wt𝒮i((mn)ε+π~i,t)=O(1)W\leq\sum_{t\in{\cal S}_{i}}\bigl{(}(mn)^{-\varepsilon}+\tilde{\pi}_{i,t}\bigr{)}=O(1). Let πi,t\pi_{i,t} be the true probability of pip_{i} hitting a triangle tt in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S). Using the Chernoff bound, one can prove as in [2, Lemma 3.4] that, with probability at least 1O(1/(mn))1-O(1/(mn)), for every i[n]i\in[n] and every tt, if πi,t>(mn)ε/3\pi_{i,t}>(mn)^{-\varepsilon/3}, then π~i,t[0.5πi,t,1.5πi,t]\tilde{\pi}_{i,t}\in[0.5\pi_{i,t},1.5\pi_{i,t}]. As wt=max{(mn)ε,π~i,t}w_{t}=\max\bigl{\{}(mn)^{-\varepsilon},\tilde{\pi}_{i,t}\bigr{\}}, if πi,t>(mn)ε/3\pi_{i,t}>(mn)^{-\varepsilon/3}, the query time is O(log1/wt)=O(log(1/πi,t))O(\log 1/w_{t})=O(\log(1/\pi_{i,t})). If πi,t(mn)ε/3\pi_{i,t}\leq(mn)^{-\varepsilon/3}, we may query LSL_{S} as well, so the query time is O(log(1/wt))+O(log(mn))=O(εlog(mn))+O(log(mn))=O(1εlog(1/πi,t))O(\log(1/w_{t}))+O(\log(mn))=O(\varepsilon\log(mn))+O(\log(mn))=O\bigl{(}\frac{1}{\varepsilon}\log(1/\pi_{i,t})\bigr{)}. Therefore, the expected query time of LiL_{i} is bounded by O(t𝒮iπi,t1εlog(1/πi,t))=1εH(t)O\left(\sum_{t\in{\cal S}_{i}}\pi_{i,t}\cdot\frac{1}{\varepsilon}\log(1/\pi_{i,t})\right)=\frac{1}{\varepsilon}H(t).

The correctness of (c) and (d) follows from [14, 16] and our previous description.    

Appendix B Missing details in Section 3

We restate the results below and give their proofs.

Statement of Lemma 3.2:  For every vVSv\in V_{S}, let ZvZ_{v} be the subset of input points that conflict with vv. It holds with probability at least 1O(1/n)1-O(1/n) that vVSE[|Zv|2]=O(n)\sum_{v\in V_{S}}\mathrm{E}\bigl{[}|Z_{v}|^{2}\bigr{]}=O(n).

Proof.  For every i[n]i\in[n] and every vVSv\in V_{S}, define Xiv=1X_{iv}=1 if piZvp_{i}\in Z_{v} and Xiv=0X_{iv}=0 otherwise.

vVSE[|Zv|2]=E[vVS(i[n]Xiv)2]=vVSi,j[n]E[XivXjv]\displaystyle~{}~{}~{}~{}\sum_{v\in V_{S}}\mathrm{E}\bigl{[}|Z_{v}|^{2}\bigr{]}=\mathrm{E}\left[\sum_{v\in V_{S}}\left(\sum_{i\in[n]}X_{iv}\right)^{2}\right]=\sum_{v\in V_{S}}\sum_{i,j\in[n]}\mathrm{E}\bigl{[}X_{iv}X_{jv}\bigr{]}
=a[m]vVSi[n]Pr[Xiv|I𝒟a]Pr[I𝒟a]+\displaystyle=\sum_{a\in[m]}\sum_{v\in V_{S}}\sum_{i\in[n]}\mathrm{Pr}\bigl{[}X_{iv}|I\sim{\cal D}_{a}\bigr{]}\cdot\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}+
a[m]vVSijPr[XivXjv|I𝒟a]Pr[I𝒟a]\displaystyle\quad\quad\sum_{a\in[m]}\sum_{v\in V_{S}}\sum_{i\not=j}\mathrm{Pr}\bigl{[}X_{iv}\wedge X_{jv}|I\sim{\cal D}_{a}\bigr{]}\cdot\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}
=a[m]Pr[I𝒟a]vVSO(1/m)+a[m]vVSijPr[XivXjv|I𝒟a]Pr[I𝒟a]\displaystyle=\sum_{a\in[m]}\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}\sum_{v\in V_{S}}O(1/m)+\sum_{a\in[m]}\sum_{v\in V_{S}}\sum_{i\not=j}\mathrm{Pr}\bigl{[}X_{iv}\wedge X_{jv}|I\sim{\cal D}_{a}\bigr{]}\cdot\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}
=O(n)+a[m]vVSijPr[XivXjv|I𝒟a]Pr[I𝒟a].\displaystyle=O(n)+\sum_{a\in[m]}\sum_{v\in V_{S}}\sum_{i\not=j}\mathrm{Pr}\bigl{[}X_{iv}\wedge X_{jv}|I\sim{\cal D}_{a}\bigr{]}\cdot\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}.

Lemma 3.1(a) is invoked in the third step. The last step is due to the fact that |VS|=O(mn)|V_{S}|=O(mn) and a[m]Pr[I𝒟a]=1\sum_{a\in[m]}\mathrm{Pr}\bigl{[}I\sim\mathscr{D}_{a}\bigr{]}=1. Under the condition that I𝒟aI\sim\mathscr{D}_{a}, XivX_{iv} and XjvX_{jv} are independent. Therefore, Pr[XivXjv|I𝒟a]=Pr[Xiv|I𝒟a]Pr[Xjv|I𝒟a]\mathrm{Pr}\bigl{[}X_{iv}\wedge X_{jv}|I\sim{\cal D}_{a}\bigr{]}=\mathrm{Pr}\bigl{[}X_{iv}|I\sim{\cal D}_{a}\bigr{]}\cdot\mathrm{Pr}\bigl{[}X_{jv}|I\sim{\cal D}_{a}\bigr{]}. As a result,

a[m]vVSijPr[XivXjv|I𝒟a]Pr[I𝒟a]\displaystyle~{}~{}~{}~{}\sum_{a\in[m]}\sum_{v\in V_{S}}\sum_{i\not=j}\mathrm{Pr}\bigl{[}X_{iv}\wedge X_{jv}|I\sim{\cal D}_{a}\bigr{]}\cdot\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}
=a[m]Pr[I𝒟a]vVSijPr[Xiv|I𝒟a]Pr[Xjv|I𝒟a]\displaystyle=\sum_{a\in[m]}\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}\sum_{v\in V_{S}}\sum_{i\not=j}\mathrm{Pr}\bigl{[}X_{iv}|I\sim{\cal D}_{a}\bigr{]}\cdot\mathrm{Pr}\bigl{[}X_{jv}|I\sim{\cal D}_{a}\bigr{]}
a[m]Pr[I𝒟a]vVS(i[n]Pr[Xiv|I𝒟a])2\displaystyle\leq\sum_{a\in[m]}\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}\sum_{v\in V_{S}}\Bigl{(}\sum_{i\in[n]}\mathrm{Pr}\bigl{[}X_{iv}|I\sim{\cal D}_{a}\bigr{]}\Bigr{)}^{2}
=a[m]Pr[I𝒟a]vVSO(1/m2)=O(n/m).\displaystyle=\sum_{a\in[m]}\mathrm{Pr}\bigl{[}I\sim{\cal D}_{a}\bigr{]}\sum_{v\in V_{S}}O(1/m^{2})~{}=~{}O(n/m).

In the last step, we use Lemma 3.1(a) and the relations that |VS|=O(mn)|V_{S}|=O(mn) and a[m]Pr[I𝒟a]=1\sum_{a\in[m]}\mathrm{Pr}\bigl{[}I\sim\mathscr{D}_{a}\bigr{]}=1.    

Statement of Lemma 3.3:  Consider VorQ(Y)\mathrm{Vor}_{Q}(Y) for some point set YY. For any point x2x\in\mathbb{R}^{2}, let QxQ^{*}_{x} be the largest homothetic copy of QQ^{*} centered at xx such that int(Qx)Y=\mathrm{int}(Q^{*}_{x})\cap Y=\emptyset. Let w1w_{1} and w2w_{2} be two adjacent Voronoi edge bends or Voronoi vertices in VorQ(Y)\mathrm{Vor}_{Q}(Y). For any point xw1w2x\in w_{1}w_{2}, QxQw1Qw2Q^{*}_{x}\subseteq Q^{*}_{w_{1}}\cup Q^{*}_{w_{2}}. The same property holds if w1w_{1} and w2w_{2} are Voronoi vertices connected by a Voronoi edge, and xx lies on that Voronoi edge.

Proof.  We assume that QxQ^{*}_{x} is not equal to Qw1Q^{*}_{w_{1}} or Qw2Q^{*}_{w_{2}} as there is nothing to prove otherwise. Let qq and qq^{\prime} be two of the defining points of w1w_{1} and w2w_{2}. So w1w_{1} and w2w_{2} lie on the Voronoi edge ee defined by qq and qq^{\prime}. Place an imaginary point q1q_{1} in Qw1Qw2\partial Q^{*}_{w_{1}}\setminus Q^{*}_{w_{2}} such that q1q_{1} does not lie on the same edge of Qw1Q^{*}_{w_{1}} as qq or qq^{\prime}. Place an imaginary point q2q_{2} similarly in Qw2Qw1\partial Q^{*}_{w_{2}}\setminus Q^{*}_{w_{1}}. Figure 5(a) shows an example.

Refer to caption Refer to caption
(a) (b)
Figure 5: (a) The placement of q1q_{1} and q2q_{2} in Qw1\partial Q^{*}_{w_{1}} and Qw2\partial Q^{*}_{w_{2}}. (b) The chain in Qw1\partial Q^{*}_{w_{1}} that goes from qq to qq^{\prime} through q1q_{1} lies outside QxQ^{*}_{x}.

We claim that dQ(q1,x)dQ(q,x)=dQ(q,x)d_{Q}(q_{1},x)\geq d_{Q}(q^{\prime},x)=d_{Q}(q,x). Suppose not. Then dQ(q1,x)<dQ(q,x)=dQ(q,x)d_{Q}(q_{1},x)<d_{Q}(q^{\prime},x)=d_{Q}(q,x). Move along the Voronoi edge ee from xx towards w2w_{2}. We must reach some point yy before reaching w2w_{2} such that dQ(q1,y)=dQ(q,y)=dQ(q,y)d_{Q}(q_{1},y)=d_{Q}(q^{\prime},y)=d_{Q}(q,y) because q1q_{1} is farther from w2w_{2} than qq, qq^{\prime}, and q2q_{2}. Let QyQ^{*}_{y} be the homothetic copy of QQ^{*} centered at yy that includes qq, qq^{\prime}, and q1q_{1} in its boundary. As Qw1QyQ^{*}_{w_{1}}\not=Q^{*}_{y}, either one is strictly contained in the other, or their boundaries intersect transversally at two points. The former case is impossible as at least one of qq, qq^{\prime}, and q1q_{1} would lie in the interior of Qw1Q^{*}_{w_{1}} or QyQ^{*}_{y}, an impossibility. If Qw1\partial Q^{*}_{w_{1}} and Qy\partial Q^{*}_{y} intersect transversally at two points, then one of qq, qq^{\prime} and q1q_{1} would not lie in Qw1\partial Q^{*}_{w_{1}} or Qy\partial Q^{*}_{y}, an impossibility again. This proves our claim that dQ(q1,x)dQ(q,x)=dQ(q,x)d_{Q}(q_{1},x)\geq d_{Q}(q^{\prime},x)=d_{Q}(q,x).

Similarly, dQ(q2,x)dQ(q,x)=dQ(q,x)d_{Q}(q_{2},x)\geq d_{Q}(q^{\prime},x)=d_{Q}(q,x).

Since both dQ(q1,x)d_{Q}(q_{1},x) and dQ(q2,x)d_{Q}(q_{2},x) are at least dQ(q,x)=dQ(q,x)d_{Q}(q,x)=d_{Q}(q^{\prime},x), neither q1q_{1} nor q2q_{2} belongs to int(Qx)\mathrm{int}(Q^{*}_{x}). Since QxQw1Q^{*}_{x}\not=Q^{*}_{w_{1}}, either one of QxQ^{*}_{x} and Qw1Q^{*}_{w_{1}} is strictly contained in the other, or their boundaries intersect transversally. The former case is impossible because one of q1q_{1}, qq and qq^{\prime} would lie in the interior of QxQ^{*}_{x} or Qw1Q^{*}_{w_{1}}. In the second case, Qx\partial Q^{*}_{x} and Qw1\partial Q^{*}_{w_{1}} must intersect transversally at qq and qq^{\prime}. It follows that one of the two chains in Qw1\partial Q^{*}_{w_{1}} delimited by qq and qq^{\prime} lies outside QxQ^{*}_{x}. Figure 5(b) shows an example. Since dQ(x,q1)=dQ(q1,x)dQ(q,x)=dQ(q,x)d_{Q^{*}}(x,q_{1})=d_{Q}(q_{1},x)\geq d_{Q}(q,x)=d_{Q}(q^{\prime},x), we conclude that the chain that contains q1q_{1} lies outside QxQ^{*}_{x}. Similarly, we can show that QxQ^{*}_{x} does not contain the chain in Qw2\partial Q^{*}_{w_{2}} that goes from qq through q2q_{2} to qq^{\prime}. Hence, QxQ^{*}_{x} must be contained in Qw1Qw2Q^{*}_{w_{1}}\cup Q^{*}_{w_{2}}.

The same proof applies when w1w_{1} and w2w_{2} are Voronoi vertices connected by a Voronoi edge, and xx lies on that Voronoi edge.    

Statement of Lemma 3.4:  Let qq be a point in some point set YY. Let quvquv be a triangle in the triangulated VorQ(Y)\mathrm{Vor}_{Q}(Y). If a point pYp\not\in Y conflicts with a point in quvquv, then pp conflicts with uu or vv. Hence, if pp conflicts with Vq(Y)V_{q}(Y), pp conflicts with a Voronoi edge bend or Voronoi vertex in Vq(Y)\partial V_{q}(Y).

Proof.  For any point y2y\in\mathbb{R}^{2}, let QyQ^{*}_{y} be the largest homothetic copy of QQ^{*} centered at yy such that int(Qy)Y=\mathrm{int}(Q^{*}_{y})\cap Y=\emptyset. It suffices to show that the point pp that conflicts with Vq(Y)V_{q}(Y) belongs to QuQ^{*}_{u} or QvQ^{*}_{v}. If pp conflicts with any point in uvuv, Lemma 3.3 implies that pQup\in Q^{*}_{u} or pQvp\in Q^{*}_{v}. Suppose that pp does not conflict with any point in uvuv. So uvVq(Y{p})uv\subseteq V_{q}(Y\cup\{p\}). Recall that the Voronoi cells of a Voronoi diagram under a convex distance function are star-shaped with respect to their sites. So Vq(Y{p})V_{q}(Y\cup\{p\}) is star-shaped with respect to qq. However, some segment that connects qq to some point in uvuv must cross Vp(Y{p})V_{p}(Y\cup\{p\}) as pp conflicts with a point in quvquv, a contradiction. Figure 6 illustrates this situation.    

Refer to caption

Figure 6: The left image shows a schematic drawing of Vq(Y)V_{q}(Y). If Vp(Y{p})V_{p}(Y\cup\{p\}) lies inside quvquv such as the shaded region on the right, then Vq(Y{p})V_{q}(Y\cup\{p\}) is obtained by removing the shaded region from Vq(Y)V_{q}(Y), and Vq(Y{p})V_{q}(Y\cup\{p\}) cannot be star-shaped with respect to qq.

Appendix C Missing details in Section 4.1

We will show that Task 1 in Section 4.1 runs in O(n2O(logn))O(n2^{O(\log^{*}n)}) expected time and Task 2 in Section 4.1 runs in O(nlogm)O(n\log m) expected time.

C.1 Lower envelope of two lower envelopes

We describe a algorithm based on the randomized divide-and-conquer approach due to Chan [5, Section 4] to compute the lower envelope of (B){\cal L}(B) and (I){\cal L}(I).

Construct point location data structures for the triangulated VorQ(B)\mathrm{Vor}_{Q}(B) and VorQ(I)\mathrm{Vor}_{Q}(I) in O(n)O(n) time. Let 𝙱\mathtt{B} be the multiset version of BB in which each pp has multiplicity equal to the complexity of Vp(B)V_{p}(B). Draw a random sample 𝙱\mathtt{B}^{\prime} of 𝙱\mathtt{B} of size O(n/logn)O(n/\log n), and let BB^{\prime} denote the equivalent of 𝙱\mathtt{B}^{\prime} without the multiplicity. Similarly, 𝙸\mathtt{I} is multiset version of II, 𝙸\mathtt{I}^{\prime} is a random sample of 𝙸\mathtt{I} of size O(n/logn)O(n/\log n), and II^{\prime} is the equivalent of 𝙸\mathtt{I}^{\prime} without the multiplicity.

Compute (BI){\cal L}(B^{\prime}\cup I^{\prime}), which has O(n/logn)O(n/\log n) size, directly in O((n/logn)logn)=O(n)O((n/\log n)\cdot\log n)=O(n) time. For each triangle tt in (BI){\cal L}(B^{\prime}\cup I^{\prime}), the strategy is to extract the subsets B|t={pB:some point in t is above Cp}B|_{t}=\{p\in B:\text{some point in $t$ is above $C_{p}$}\} and I|t={pI:some point in t is above Cp}I|_{t}=\{p\in I:\text{some point in $t$ is above $C_{p}$}\}, and then recursively compute the patch (B|tI|t{pt})t^{\cal L}\bigl{(}B|_{t}\cup I|_{t}\cup\{p_{t}\}\bigr{)}\cap\hat{t}, where ptp_{t} is the point in BIB^{\prime}\cup I^{\prime} such that tCptt\subseteq C_{p_{t}}, and t^\hat{t} is the vertical prism obtained by sweeping tt upward and downward. Collect these patches over all triangles in (BI){\cal L}(B^{\prime}\cup I^{\prime}) and stitch them together to form (BI){\cal L}(B\cup I). The recurrence for the expected running time is T(n)=t(BI)T(nt+1)+ET(n)=\sum_{t\in{\cal L}(B^{\prime}\cup I^{\prime})}T(n_{t}+1)+E, where nt=|B|t|+|I|t|n_{t}=\bigl{|}B|_{t}\bigr{|}+\bigl{|}I|_{t}\bigr{|} and EE is the total expected running time to identify B|tB|_{t} and I|tI|_{t} for all triangles tt in (BI){\cal L}(B^{\prime}\cup I^{\prime}).

We identify B|tB|_{t} as follows. If some point xx of tt is above a cone CpC_{p}, then pp conflicts with the projection of xx in the projection of tt. Lemma 3.4 implies that pp conflicts with the projection of a vertex of tt different from ptp_{t}. Take a vertex vv of tt different from ptp_{t}. Locate vv’s vertical projection in a triangle tvt_{v} in the triangulated VorQ(B)\mathrm{Vor}_{Q}(B) by a point location query. If vv is below tvt_{v}, then vv does not conflict with any point in BB. If vv is above tvt_{v}, we search VorQ(B)\mathrm{Vor}_{Q}(B) within the vertical projection of tt to determine the subset B|v={pBB:v is above Cp}B|_{v}=\{p\in B\setminus B^{\prime}:\text{$v$ is above $C_{p}$}\}. The time needed is O(logn+pB|v|Vp(B)|)O\bigl{(}\log n+\sum_{p\in B|_{v}}|\partial V_{p}(B)|\bigr{)}. Summing over all triangles in (BI){\cal L}(B^{\prime}\cup I^{\prime}), the total point location time is O(n/lognlogn)=O(n)O(n/\log n\cdot\log n)=O(n). By Clarkson-Shor’s analysis [11, Corollary 3.8], it holds with probability at least 2/32/3 that the sum of pB|v|Vp(B)|\sum_{p\in B|_{v}}|\partial V_{p}(B)| over all vertices of (BI){\cal L}(B^{\prime}\cup I^{\prime}) is O(n)O(n), and maxt(BI)max{nt}=O(log2n)\max_{t\in{\cal L}(B^{\prime}\cup I^{\prime})}\max\{n_{t}\}=O(\log^{2}n). We identify I|tI|_{t} in the same way, which involves determining Iv={pII:v is above Cp}I_{v}=\{p\in I\setminus I^{\prime}:\text{$v$ is above $C_{p}$}\} for the vertices vv of (BI){\cal L}(B^{\prime}\cup I^{\prime}), and the same analysis applies.

When determining B|vB|_{v} and I|vI|_{v} over all vertices of (BI){\cal L}(B^{\prime}\cup I^{\prime}), if the total number of steps exceeds cncn for some appropriate constant cc, we abort, resample 𝙱\mathtt{B}^{\prime} and 𝙸\mathtt{I}^{\prime}, and then repeat. Similarly, if maxt(BI)max{nt}\max_{t\in{\cal L}(B^{\prime}\cup I^{\prime})}\max\{n_{t}\} exceeds clog2nc^{\prime}\log^{2}n for some appropriate constant cc^{\prime}, we also abort, resample 𝙱\mathtt{B}^{\prime} and 𝙸\mathtt{I}^{\prime}, and then repeat. We expect to succeed in O(1)O(1) trials and proceed with the recursive calls.

In summary, EE is O(n)O(n) in the recurrence T(n)=t(BI)T(nt+1)+ET(n)=\sum_{t\in{\cal L}(B^{\prime}\cup I^{\prime})}T(n_{t}+1)+E, and there are O(logn)O(\log^{*}n) levels of recursion in expectation. The hidden big-Oh constant may thus be raised to a power of O(logn)O(\log^{*}n), resulting in an expected running time of n2O(logn)n2^{O(\log^{*}n)}.

C.2 Determining 𝒕𝟏,𝒕𝟐,,𝒕𝒏\boldsymbol{t_{1},t_{2},\ldots,t_{n}}

We generalize a method in [2], which does not work directly in our case because no information about VorQ(B)\mathrm{Vor}_{Q}(B) is gathered in the training phase. Our method works in two stages for each point piIp_{i}\in I as follows.

  • Stage 1: Determine some subset BiB_{i} that satisfies BBiSB\subseteq B_{i}\subseteq S, and compute a Voronoi edge bend or Voronoi vertex viv_{i} in VorQ(Bi)\mathrm{Vor}_{Q}(B_{i}) that conflicts with pip_{i} and is known to be in VSV_{S} or not.

  • Stage 2: Use viv_{i} to find the triangle tit_{i} that contains pip_{i}.

We provide the details of these two stages for each input point in the following. We discuss the second stage first because it is easier.

C.2.1 Stage 2

If viVSv_{i}\in V_{S}, we search VorQ(S)\mathrm{Vor}_{Q}(S) from viv_{i} to find VS|piV_{S}|_{p_{i}} (i.e., the subset of VSV_{S} that conflict with pip_{i}), which by Lemma 3.3 will also give the triangle in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains pip_{i}. The time needed is O(|VS|pi|)O\bigl{(}\bigl{|}V_{S}|_{p_{i}}\bigr{|}\bigr{)}.

Suppose that viVSv_{i}\not\in V_{S}. Then viv_{i} cannot be a region boundary vertex in the m2m^{2}-division of VorQ(S)\mathrm{Vor}_{Q}(S), so viv_{i} lies inside a region in the m2m^{2}-division, say Π\Pi. We check whether pip_{i} conflicts with any boundary vertex of Π\Pi. For each boundary vertex ww of Π\Pi, let QwQ^{*}_{w} be the largest homothetic copy of QQ^{*} centered at ww such that int(Qw)B=\mathrm{int}(Q^{*}_{w})\cap B=\emptyset. These QwQ^{*}_{w}’s form an arrangement of O(m2)O(m^{2}) complexity. The point pip_{i} conflicts with ww if and only if piQwp_{i}\in Q^{*}_{w}. So we do a point location in the arrangement in O(logm)O(\log m) time to decide whether pip_{i} is contained in QwQ^{*}_{w} for some boundary vertex ww of Π\Pi.

If so, we can search VorQ(S)\mathrm{Vor}_{Q}(S) from ww as before to find the triangle in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains pip_{i}. Suppose that piQwp_{i}\not\in Q^{*}_{w} for any boundary vertex ww of Π\Pi. We claim that pip_{i} lies inside Π\Pi, which means that we can do a point location in VorQ(S)Π\mathrm{Vor}_{Q}(S)\cap\Pi to find the triangle in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains pip_{i}. The time needed is O(logm)O(\log m) as VorQ(S)Π\mathrm{Vor}_{Q}(S)\cap\Pi has O(m2)O(m^{2}) size. The running time of Stage 2 is O(|VS|pi|+logm)O\bigl{(}\bigl{|}V_{S}|_{p_{i}}\bigr{|}+\log m\bigr{)}.

We prove the claim as follows. Assume to the contrary that it is false. Since pip_{i} conflicts with viv_{i} that lies inside Π\Pi, we have piQvip_{i}\in Q^{*}_{v_{i}}, the largest homothetic copy of QQ^{*} centered at viv_{i} such that int(Qvi)B=\mathrm{int}(Q^{*}_{v_{i}})\cap B=\emptyset. The segment pivip_{i}v_{i} intersects the boundary of Π\Pi at some point xx. We can define a deformation of QviQ^{*}_{v_{i}} so that its center moves linearly from viv_{i} to pip_{i} while the polygon shrinks linearly to the point pip_{i}. Since xpivix\in p_{i}v_{i}, at some point during this deformation of QviQ^{*}_{v_{i}}, we must obtain a homothetic copy Q~x\tilde{Q}_{x} of QxQ^{*}_{x} such that xx is the center of Q~x\tilde{Q}_{x}, int(Q~x)B=\mathrm{int}(\tilde{Q}_{x})\cap B=\emptyset, and piQ~xp_{i}\in\tilde{Q}_{x}. Then, we can invoke Lemma 3.3 to obtain the contradiction that pip_{i} must conflict with a boundary vertex of Π\Pi.

C.2.2 Stage 1

For efficiency purpose, we will present a procedure that runs stage 1 for all input points in II in an inductive manner. The procedure is a generalization of a method for a similar task in [2]. During the running of this procedure, whenever we have computed viv_{i} for an input point pip_{i} as required of stage 1, the procedure will invoke stage 2 for pip_{i} in order to locate the triangle tit_{i}, and if viVSv_{i}\not\in V_{S}, compute a Voronoi edge bend or Voronoi vertex viVSv^{\prime}_{i}\in V_{S} that conflicts with pip_{i}.

We first present a technical result that will be used by the procedure.

Lemma C.1

Let pp be a point in BIB\cup I. Let BpB_{p} be any subset that satisfies BBpSB\subseteq B_{p}\subseteq S. Assume that Vp(BpI)V_{p}(B_{p}\cup I), Vp(Bp{p})V_{p}(B_{p}\cup\{p\}), and the edges of VorQ(Bp)\mathrm{Vor}_{Q}(B_{p}) that intersect Vp(Bp{p})V_{p}(B_{p}\cup\{p\}) are known. For each piNp(BpI)Np(Bp{p})p_{i}\in N_{p}(B_{p}\cup I)\setminus N_{p}(B_{p}\cup\{p\}), we can compute a Voronoi edge eie_{i} in Vp(Bp)V_{p}(B_{p}) that conflicts with pip_{i}. The total running time is O(|Np(BpI)|+|Np(Bp{p})|)O\bigl{(}\bigl{|}N_{p}(B_{p}\cup I)\bigr{|}+\bigl{|}N_{p}(B_{p}\cup\{p\})\bigr{|}\bigr{)}.

Proof.  Suppose that pIp\in I. Take a point piNp(BpI)Np(Bp{p})p_{i}\in N_{p}(B_{p}\cup I)\setminus N_{p}(B_{p}\cup\{p\}). The segment ppipp_{i} lies between pqpq and pqpq^{\prime} for some q,qNp(Bp{p})q,q^{\prime}\in N_{p}(B_{p}\cup\{p\}) that are consecutive in the cyclic order of Np(Bp{p})N_{p}(B_{p}\cup\{p\}) around pp. Recall that the dummy points make all Voronoi cells of input points bounded. So there is a Voronoi vertex ww in Vp(Bp{p})\partial V_{p}(B_{p}\cup\{p\}) defined by pp, qq and qq^{\prime}.

We claim that pip_{i} conflicts with ww. Let QwQ^{*}_{w} be the largest homothetic copy of QQ^{*} centered at ww that circumscribes pp, qq and qq^{\prime}. Since piNp(BpI)p_{i}\in N_{p}(B_{p}\cup I), there exists a homothetic copy QxQ^{*}_{x} of QQ^{*} such that {p,pi}Qx\{p,p_{i}\}\subset\partial Q^{*}_{x} and int(Qx)(BpI)=\mathrm{int}(Q^{*}_{x})\cap(B_{p}\cup I)=\emptyset. If pip_{i} does not conflict with ww, then piQwp_{i}\not\in Q^{*}_{w}. Refer to Figure 7. But then as QwQ^{*}_{w} and QxQ^{*}_{x} intersects transversally at zero or two points, QxQ^{*}_{x} is forced to contain qq or qq^{\prime} in its interior, a contradiction. Clearly, qq and qq^{\prime} define a Voronoi edge eie_{i} in VorQ(Bp)\mathrm{Vor}_{Q}(B_{p}) that intersects Vp(Bp{p})V_{p}(B_{p}\cup\{p\}) and contains ww, so eie_{i} is the Voronoi edge that we look for.

By the analysis above, a synchronized cyclic scan of Np(Bp{p})N_{p}(B_{p}\cup\{p\}) and Np(BpI)N_{p}(B_{p}\cup I) gives the Voronoi edges of Vp(Bp)V_{p}(B_{p}) that we look for.

The remaining case is that pBp\in B. Hence, Vp(Bp{p})=Vp(Bp)V_{p}(B_{p}\cup\{p\})=V_{p}(B_{p}). Every point piNp(BpI)p_{i}\in N_{p}(B_{p}\cup I) must conflict with some point in Vp(Bp)\partial V_{p}(B_{p}) in order that pip_{i} becomes a Voronoi neighbor of pp in Np(BpI)N_{p}(B_{p}\cup I). Each piNp(BpI)p_{i}\in N_{p}(B_{p}\cup I) conflicts with a connected portion of Vp(Bp)\partial V_{p}(B_{p}). Moreover, this portion of Vp(Bp)\partial V_{p}(B_{p}) is not nested within the portion of Vp(Bp)\partial V_{p}(B_{p}) that conflicts with any other pjNp(BpI)p_{j}\in N_{p}(B_{p}\cup I). It follows that a synchronized cyclic scan of Vp(Bp)\partial V_{p}(B_{p}) and Np(BpI)N_{p}(B_{p}\cup I) gives the Voronoi edges that we look for.    

Refer to caption

Figure 7: Illustration for the proof of Lemma C.1.

The following is the pseudocode for determining the triangles in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains the input points in II.

  1. 1.

    Initialize a queue LL to contain all points in BB.

  2. 2.

    Mark all points in BIB\cup I as unvisited.

  3. 3.

    While LL is non-empty do:

    1. (a)

      Dequeue the next point pp from LL.

    2. (b)

      If pBp\in B, let Bp=BB_{p}=B. Otherwise, p=pjIp=p_{j}\in I, and vjv_{j} has been inductively determined, and we perform the following steps:

      1. (i)

        We will show in Lemma C.2 below that vjVSv_{j}\in V_{S}. Search VorQ(S)\mathrm{Vor}_{Q}(S) from vjv_{j} to determine VS|pV_{S}|_{p}.

      2. (ii)

        Let SpS_{p} be the set of defining points of the elements of VS|pV_{S}|_{p}. Let SpS^{\prime}_{p} be the set of defining points of Voronoi edge bends and Voronoi vertices in VorQ(S)\mathrm{Vor}_{Q}(S) that are adjacent to the elements of VS|pV_{S}|_{p}. Note that |Sp||S_{p}| and |Sp||S^{\prime}_{p}| are O(|VS|p|)O\bigl{(}\bigl{|}V_{S}|_{p}\bigr{|}\bigr{)}.

      3. (iii)

        Bp:=BSpSpB_{p}:=B\cup S_{p}\cup S^{\prime}_{p}. The motivation for this definition of BpB_{p} is to ensure that Vp(Bp{p})=Vp(S{p})V_{p}(B_{p}\cup\{p\})=V_{p}(S\cup\{p\}).

      4. (iv)

        In the same search in step 3(b)(i), we construct Vp(Bp{p})V_{p}(B_{p}\cup\{p\}) and the edges of VorQ(Bp)\mathrm{Vor}_{Q}(B_{p}) that intersect Vp(Bp{p})V_{p}(B_{p}\cup\{p\}) without increasing the asymptotic running time.

      5. (v)

        Merge Vp(Bp{p})V_{p}(B_{p}\cup\{p\}) and Vp(BI)V_{p}(B\cup I) to form Vp(BpI)V_{p}(B_{p}\cup I).

    3. (c)

      Use Lemma C.1 to determine for each piNp(BpI)Np(Bp{p})p_{i}\in N_{p}(B_{p}\cup I)\setminus N_{p}(B_{p}\cup\{p\}), the Voronoi edge eie_{i} in VorQ(Bp)\mathrm{Vor}_{Q}(B_{p}) that conflicts with pip_{i}. By Lemma 3.3, pip_{i} must conflict with some Voronoi edge bend or endpoint of eie_{i}, which is the desired Voronoi edge bend or Voronoi vertex viv_{i} in VorQ(Bp)\mathrm{Vor}_{Q}(B_{p}) for each unvisited piNp(BpI)Np(Bp{p})p_{i}\in N_{p}(B_{p}\cup I)\setminus N_{p}(B_{p}\cup\{p\}).

    4. (d)

      For each unvisited piNp(BpI)Np(Bp{p})p_{i}\in N_{p}(B_{p}\cup I)\setminus N_{p}(B_{p}\cup\{p\}),

      1. (i)

        Invoke the second stage to find the triangle tit_{i} in the triangulated VorQ(S)\mathrm{Vor}_{Q}(S) that contains pip_{i}, mark pip_{i} as visited, and append pip_{i} to LL.

      2. (ii)

        If pBp\in B, let u,uVSu,u^{\prime}\in V_{S} be two of the vertices of tit_{i}, and update viv_{i} to be uu or uu^{\prime} whichever conflicts with pip_{i}. Note that if pIp\in I, we will prove in Lemma C.2 below that viv_{i} already belongs to VSV_{S}.

Lemma C.2

At the end of step 3(c), if pIp\in I, then for each piNp(BpI)Np(Bp{p})p_{i}\in N_{p}(B_{p}\cup I)\setminus N_{p}(B_{p}\cup\{p\}), viVSv_{i}\in V_{S}. At the end of step 3(d)(ii), for each visited piIp_{i}\in I, viVSv_{i}\in V_{S}.

Proof.  We prove the lemma by induction. Consider an iteration of the while-loop in step 3. The newly visited points piIp_{i}\in I are those in Np(BpI)Np(Bp{p})N_{p}(B_{p}\cup I)\setminus N_{p}(B_{p}\cup\{p\}).

Suppose that pIp\in I. By the definition of BpB_{p}, Vp(Bp{p})=Vp(S{p})V_{p}(B_{p}\cup\{p\})=V_{p}(S\cup\{p\}). Therefore, eie_{i} in Lemma C.1 is a Voronoi edge in VorQ(S)\mathrm{Vor}_{Q}(S) that conflicts with pip_{i}. Moreover, the proof of Lemma C.1 reveals that pip_{i} conflicts with a Voronoi vertex ww of VorQ(Bp{p})\mathrm{Vor}_{Q}(B_{p}\cup\{p\}) that lies on eie_{i}, and ww is defined by pp and the two defining points of eie_{i}. Therefore, pp conflicts with ww on the edge eie_{i} of VorQ(S)\mathrm{Vor}_{Q}(S) too. By Lemma 3.3, pp conflicts with some point in VSeiV_{S}\cap e_{i}. The defining points of VS|pV_{S}|_{p} are included in BpB_{p} by definition, which includes the defining points of VS|peiV_{S}|_{p}\cap e_{i}. Therefore, we have obtained the edge eie_{i} during the construction of Vp(Bp{p})V_{p}(B_{p}\cup\{p\}), which allows eie_{i} to be returned for pip_{i} in the application of Lemma C.1 in step 3(c). When we search along eie_{i} in step 3(c), we find the point viVSv_{i}\in V_{S} that conflicts with pip_{i}.

Suppose that pBp\in B. Then, Bp=BB_{p}=B. Consider step 3(d)(ii). Let the vertices of tit_{i} be {q,u,u}\{q,u,u^{\prime}\}, where qq is a point in SS. Since pitip_{i}\in t_{i}, Lemma 3.4 implies that pip_{i} conflicts with uu or uu^{\prime}. Note that both uu and uu^{\prime} belong to VSV_{S}.    

The correctness of the pseudocode follows from induction using Lemmas C.1 and C.2.

We can view step 3(b)(v) as taking the lower envelope of two polygonal cones. In particular, we lift Vp(Bp{p})V_{p}(B_{p}\cup\{p\}) and Vp(BI)V_{p}(B\cup I) to 3\mathbb{R}^{3}. Then Vp(BpI)V_{p}(B_{p}\cup I) is the lower envelope of these two polygonal cones, which can be obtained by a synchronized cyclic scan of Np(Bp{p})N_{p}(B_{p}\cup\{p\}) and Np(BI)N_{p}(B\cup I) in linear time. In step 3(d)(i), when we invoke stage 2 for a point piIp_{i}\in I, we are supposed to know whether viSv_{i}\in S. If pIp\in I, then Lemma C.2 ensures that viSv_{i}\in S. If pBp\in B, we can assume that all Voronoi edge bends and Voronoi vertices in VorQ(B)\mathrm{Vor}_{Q}(B) have been labelled in preprocessing whether they belong to VSV_{S} or not.

The total running time is O(pBI|Np(BI)|+pB|Np(B)|+piI|VS|pi|+nlogm)O\bigl{(}\sum_{p\in B\cup I}|N_{p}(B\cup I)|+\sum_{p\in B}|N_{p}(B)|+\sum_{p_{i}\in I}\bigl{|}V_{S}|_{p_{i}}\bigr{|}+n\log m\bigr{)} from our previous discussion. The first two terms are O(n)O(n). By Lemma 3.2, the expected value of the third term is O(n)O(n). Therefore, the expected running time of the pseudocode is O(nlogm)O(n\log m).

Appendix D Missing details in Section 4.3.2

D.1 𝒄\boldsymbol{c}-WSPD from a compression 𝑻𝑿\boldsymbol{T_{X}} of 𝑻𝑺\boldsymbol{T_{S}}

Let TXT_{X} be the compression of TST_{S} to XX. We compute a cc-WSPD as described in [16] for a doubling metric, which is dd in our case, using TXT_{X}. The pseudocode is given below. The top-level call is Build(r,r)(r,r), where rr is the root of TXT_{X}.

Build(u,v)(u,v)

  1. 1.

    Swap uu and vv if necessary to ensure that u>v\ell_{u}>\ell_{v}, or u=v\ell_{u}=\ell_{v} and uu is to the left of vv in the inorder traversal.

  2. 2.

    If 4ττ1τu<1c+2d(pu,pv)\frac{4\tau}{\tau-1}\cdot\tau^{\ell_{u}}<\frac{1}{c+2}\cdot d(p_{u},p_{v}) then return {{Pu,Pv}}\bigl{\{}\{P_{u},P_{v}\}\bigr{\}}.

  3. 3.

    Otherwise, let w1,,wjw_{1},\ldots,w_{j} be the children of uu, return i=1jBuild(wi,v)\bigcup_{i=1}^{j}\mathrm{Build}(w_{i},v).

Lemma D.1

Build constructs a cc-WSPD of size O((c+1)O(1)|X|)O((c+1)^{O(1)}|X|) in O((c+1)O(1)|X|)O((c+1)^{O(1)}|X|) time.

Proof.  We first show that Build outputs a cc-WSPD. Clearly, the working of Build guarantees that for every distinct pair of points x,yXx,y\in X, there exists a pair {Pu,Pv}\{P_{u},P_{v}\} in the output of Build such that xPux\in P_{u} and yPvy\in P_{v}. Let {Pu,Pv}\{P_{u},P_{v}\} be a pair in the output of Build. Let PuP^{\prime}_{u} and PvP^{\prime}_{v} be the subsets of points for the subtrees of TST_{S} rooted at uu and vv. Let δu\delta^{\prime}_{u} and δv\delta^{\prime}_{v} be the diameters of PuP^{\prime}_{u} and PvP^{\prime}_{v} under dd, respectively. By property (c) of a net-tree, PuB(pu,2ττ1τu)P^{\prime}_{u}\subseteq B(p_{u},\frac{2\tau}{\tau-1}\tau^{\ell_{u}}). Thus, max{δu,δv}4ττ1max{τu,τv}\max\{\delta^{\prime}_{u},\delta^{\prime}_{v}\}\leq\frac{4\tau}{\tau-1}\max\{\tau^{\ell_{u}},\tau^{\ell_{v}}\}, which is less than 1c+2d(pu,pv)\frac{1}{c+2}d(p_{u},p_{v}) by steps 1 and 2 of Build. Then, max{δu,δv}<1c+2d(pu,pv)1c+2(d(Pu,Pv)+δu+δv)\max\{\delta^{\prime}_{u},\delta^{\prime}_{v}\}<\frac{1}{c+2}d(p_{u},p_{v})\leq\frac{1}{c+2}\bigl{(}d(P^{\prime}_{u},P^{\prime}_{v})+\delta^{\prime}_{u}+\delta^{\prime}_{v}\bigr{)}. Rearranging terms gives max{δu,δv}<1cd(Pu,Pv)\max\{\delta^{\prime}_{u},\delta^{\prime}_{v}\}<\frac{1}{c}\cdot d(P^{\prime}_{u},P^{\prime}_{v}). Let δu\delta_{u} and δv\delta_{v} be the diameters of PuP_{u} and PvP_{v} respectively. Observe that PuPuP_{u}\subseteq P^{\prime}_{u} and PvPvP_{v}\subseteq P^{\prime}_{v}. Hence, max{δu,δv}max{δu,δv}<1cd(Pu,Pv)1cd(Pu,Pv)\max\{\delta_{u},\delta_{v}\}\leq\max\{\delta^{\prime}_{u},\delta^{\prime}_{v}\}<\frac{1}{c}\cdot d(P^{\prime}_{u},P^{\prime}_{v})\leq\frac{1}{c}\cdot d(P_{u},P_{v}). In summary, the output of Build is a cc-WSPD.

It remains to bound the output size and the running time of Build.

Consider a pair {Pu,Pv}\{P_{u},P_{v}\} in the output. Without loss of generality, assume that Build(u,v)(u,v) is called by Build(u,w)(u,w), where w=𝑝𝑎𝑟𝑒𝑛𝑡(v)w=\mathit{parent}(v). We charge the pair {Pu,Pv}\{P_{u},P_{v}\} to ww. Since Build considers the children of ww instead of those of uu in processing (u,w)(u,w), we must have wu\ell_{w}\geq\ell_{u}. We claim that 𝑝𝑎𝑟𝑒𝑛𝑡(u)w\ell_{\mathit{parent}(u)}\geq\ell_{w}. Suppose not. There must be a call Build(𝑝𝑎𝑟𝑒𝑛𝑡(u),w)(\mathit{parent}(u),w^{\prime}) before the call Build(u,w)(u,w), where w=ww^{\prime}=w or ww^{\prime} is an ancestor of ww. Since 𝑝𝑎𝑟𝑒𝑛𝑡(u)<ww\ell_{\mathit{parent}(u)}<\ell_{w}\leq\ell_{w^{\prime}}, Build(𝑝𝑎𝑟𝑒𝑛𝑡(u),w)(\mathit{parent}(u),w^{\prime}) eventually leads to the call Build(𝑝𝑎𝑟𝑒𝑛𝑡(u),v)(\mathit{parent}(u),v), which must then call Build(u,v)(u,v) for {Pu,Pv}\{P_{u},P_{v}\} to be included in the output. But this contradicts our assumption that Build(u,v)(u,v) is called by Build(u,w)(u,w). Therefore, 𝑝𝑎𝑟𝑒𝑛𝑡(u)wu\ell_{\mathit{parent}(u)}\geq\ell_{w}\geq\ell_{u}. Hence, uu belongs to the set NX(w)N_{X}(\ell_{w}) as defined for TXT_{X} below:

NX()={node u of TXu𝑝𝑎𝑟𝑒𝑛𝑡(u)}.N_{X}(\ell)=\{\text{node $u$ of $T_{X}$: $\ell_{u}\leq\ell\leq\ell_{\mathit{parent}(u)}$}\}.

Note that the parent-child relation in the definition of NX()N_{X}(\ell) refers to TXT_{X}. Consider the following similar definition for TST_{S}:

NS()={node u of TSu𝑝𝑎𝑟𝑒𝑛𝑡(u)}.N_{S}(\ell)=\{\text{node $u$ of $T_{S}$: $\ell_{u}\leq\ell\leq\ell_{\mathit{parent}(u)}$}\}.

Note that the parent-child relation in the definition of NS()N_{S}(\ell) refers to TST_{S}. Since the pair {Pu,Pw}\{P_{u},P_{w}\} is not included in the output, we must have 4ττ1τw1c+2d(pu,pw)\frac{4\tau}{\tau-1}\tau^{\ell_{w}}\geq\frac{1}{c+2}d(p_{u},p_{w}). That is, puB(pw,O(cτw))p_{u}\in B\bigl{(}p_{w},O(c\tau^{\ell_{w}})\bigr{)}. We analyze the total charge on ww in the following.

First, we charge the nodes in NX(w)N_{X}(\ell_{w}) to some nodes in NS(w)N_{S}(\ell_{w}). Take any node uNX(w)u\in N_{X}(\ell_{w}). If uNS(w)u\in N_{S}(\ell_{w}), we charge uu to itself. Suppose that uNS(w)u\not\in N_{S}(\ell_{w}). Let u′′=𝑝𝑎𝑟𝑒𝑛𝑡(u)u^{\prime\prime}=\mathit{parent}(u) in TXT_{X}. It means that there are some internal nodes on the path from u′′u^{\prime\prime} to uu in TST_{S}, and all of them are pruned by the compression of TST_{S} to XX. It follows from the definition of NS()N_{S}(\ell) that exactly one of these pruned internal node belongs to NS(w)N_{S}(\ell_{w}), say uu^{\prime}. We charge uNX(w)u\in N_{X}(\ell_{w}) to uNS(w)u^{\prime}\in N_{S}(\ell_{w}). Note that uu^{\prime} cannot be charged by another node in NX(w)N_{X}(\ell_{w}). Otherwise, TXT_{X} would contain nodes in two different child subtrees of uu^{\prime} in TST_{S}, which would force uu^{\prime} to be a node of TXT_{X}. This contradicts the fact that uu^{\prime} is pruned by the compression of TST_{S} to XX.

Second, consider a node uNX(w)u\in N_{X}(\ell_{w}) that charges an ancestor uNS(w)u^{\prime}\in N_{S}(\ell_{w}) of uu in TST_{S}. We have shown earlier that d(pu,pw)=O(cτw)d(p_{u},p_{w})=O(c\tau^{\ell_{w}}) if {Pu,Pw}\{P_{u},P_{w}\} does not appear in the output of Build. By property (d) of the net-tree TST_{S}, we have d(pu,pu)=O(τu)d(p_{u},p_{u^{\prime}})=O(\tau^{\ell_{u^{\prime}}}), which is O(τw)O(\tau^{\ell_{w}}) as uw\ell_{u^{\prime}}\leq\ell_{w} by the definition of NS(w)N_{S}(\ell_{w}). As a result, d(pu,pw)d(pu,pu)+d(pu,pw)=O((c+1)τw)d(p_{u^{\prime}},p_{w})\leq d(p_{u},p_{u^{\prime}})+d(p_{u},p_{w})=O((c+1)\tau^{\ell_{w}}). Consequently, the nodes in NS(w)N_{S}(\ell_{w}) that are charged by the nodes in {uNX(w):{Pu,Pw} does not appear in the output of Build}\bigl{\{}u\in N_{X}(\ell_{w}):\text{$\{P_{u},P_{w}\}$ does not appear in the output of Build}\bigr{\}} lie in B(pw,O((c+1)τw))B(p_{w},O((c+1)\tau^{\ell_{w}})).

It is known that any two nodes in NS(w)N_{S}(\ell_{w}) are at distance 14τw1\frac{1}{4}\tau^{\ell_{w}-1} or more apart [16, Proposition 2.2]. Therefore, NS(w)B(pw,O((c+1)τw))N_{S}(\ell_{w})\cap B(p_{w},O((c+1)\tau^{\ell_{w}})) has size at most (c+1)O(1)(c+1)^{O(1)} by the doubling property.

In summary, the total charge on the node ww in TXT_{X} is (c+1)O(1)(c+1)^{O(1)}. Since TXT_{X} has O(|X|)O(|X|) nodes, the size bound of the cc-WSPD follows.

Construct a computation tree 𝒯\cal T in which each node is labelled (u,v)(u,v) for the call Build(u,v)(u,v), and a node (u,v)(u,v) is a child of another node (u,w)(u,w) if Build(u,w)(u,w) calls Build(u,v)(u,v). The leaves of 𝒯\cal T correspond to the pairs output by Build. Each internal node of 𝒯\cal T has at least two children because each internal node of TXT_{X} has at least two children. So 𝒯\cal T has O((c+1)O(1)|X|)O((c+1)^{O(1)}|X|) nodes as it has O((c+1)O(1)|X|)O((c+1)^{O(1)}|X|) leaves. Clearly, Build spends O(1)O(1) time at each node of 𝒯\cal T, establishing the running time bound.    

D.2 Correctness of the extraction of 𝒌\boldsymbol{k}-nearest neighbor graph

Compute a subset CvXC_{v}\subseteq X for every leaf vv of TXT_{X} such that |Cv|=O(k)|C_{v}|=O(k) and CvC_{v} contains the subset {pX:the point in Pv is a k-nearest neighbor of p}\bigl{\{}p\in X:\text{the point in $P_{v}$ is a $k$-nearest neighbor of $p$}\bigr{\}}. The containment may be strict, so the point in PvP_{v} may not be a kk-nearest neighbor of some point pCvp\in C_{v}. We will discuss shortly how to compute such CvC_{v}’s. After obtaining all CvC_{v}’s, for each point pXp\in X, construct Lp={Pv:v is a leaf of TXpCv}L_{p}=\bigcup\bigl{\{}P_{v}:\text{$v$ is a leaf of $T_{X}$}\wedge p\in C_{v}\bigr{\}}. By definition, all kk-nearest neighbors of pp are included in LpL_{p}, although LpL_{p} may contain more points. We select in O(|Lp|)O(|L_{p}|) time the kk-th farthest point pp^{\prime} in LpL_{p} from pp under dd. Then, we scan in O(|Lp|)O(|L_{p}|) time using d(p,p)d(p,p^{\prime}) to find the kk-nearest neighbors of pp. Hence, as |Cv|=O(k)|C_{v}|=O(k), the total running time is O(pX|Lp|)=O(leaf v|Cv|)=O(k|X|)O(\sum_{p\in X}|L_{p}|)=O(\sum_{\text{leaf $v$}}|C_{v}|)=O(k|X|) plus the time to compute the CvC_{v}’s. It remains to discuss the computation of the CvC_{v}’s.

Compute a 4-WSPD Δ\Delta of XX which takes O(|X|)O(|X|) time by Lemma D.1. We define a subset CuXC_{u}\subseteq X for every node uu of TXT_{X} with a generalized requirement. For every node uu of TXT_{X}, we require that |Cu|=O(k)|C_{u}|=O(k) and CuC_{u} contains the subset {p:{Pw,Pw}Δs.t.pPw\bigl{\{}p:\exists\,\{P_{w},P_{w^{\prime}}\}\in\Delta\,\text{s.t.}\,p\in P_{w^{\prime}}, ww is uu or an ancestor of uu, and PuP_{u} contains a kk-nearest neighbor of pp }\bigr{\}}. The containment may be strict, i.e., some pCup\in C_{u} may violate the property above.

This generalization is consistent with the requirement for CvC_{v} at a leaf vv because if the single point qq in PvP_{v} is a kk-nearest neighbor of some pp, then as Δ\Delta is a WSPD, there exists {Pw,Pw}Δ\{P_{w},P_{w^{\prime}}\}\in\Delta such that qPwq\in P_{w} and pPwp\in P_{w^{\prime}}; ww is clearly either vv or an ancestor of vv.

The CuC_{u}’s are generated in a preorder traversal of TXT_{X}. The computation of CuC_{u} is complete after visiting uu. When visiting a node uu, we initialize a set CC and prune CC later to obtain CuC_{u}. The initial CC is C𝑝𝑎𝑟𝑒𝑛𝑡(u){p:{Pu,Pw}Δs.t.pPw|Pw|k}C_{\mathit{parent}(u)}\cup\bigl{\{}p:\exists\,\{P_{u},P_{w^{\prime}}\}\in\Delta\;\text{s.t.}\;p\in P_{w^{\prime}}\wedge|P_{w^{\prime}}|\leq k\bigr{\}}. (If uu is the root of TT, take C𝑝𝑎𝑟𝑒𝑛𝑡(u)C_{\mathit{parent}(u)} to be \emptyset.) Lemma D.2 below shows that the initial CC satisfies the requirement for CuC_{u} except that |Cu||C_{u}| may not be O(k)O(k). As |C𝑝𝑎𝑟𝑒𝑛𝑡(u)|=O(k)|C_{\mathit{parent}(u)}|=O(k) inductively, the initialization of CC takes O(k+|{p:{Pu,Pw}Δs.t.pPw|Pw|k}|)O\bigl{(}k+\bigl{|}\bigl{\{}p:\exists\,\{P_{u},P_{w^{\prime}}\}\in\Delta\;\text{s.t.}\;p\in P_{w^{\prime}}\wedge|P_{w^{\prime}}|\leq k\bigr{\}}\bigr{|}\bigr{)} time.

Lemma D.2

The initial CC contains the subset {p:{Pw,Pw}Δs.t.pPw\bigl{\{}p:\exists\,\{P_{w},P_{w^{\prime}}\}\in\Delta\,\text{s.t.}\,p\in P_{w^{\prime}}, ww is uu or an ancestor of uu, and PuP_{u} contains a kk-nearest neighbor of pp }\bigr{\}}.

Proof.  Let K={p:{Pw,Pw}Δs.t.pPwK=\bigl{\{}p:\exists\,\{P_{w},P_{w^{\prime}}\}\in\Delta\,\text{s.t.}\,p\in P_{w^{\prime}}, ww is uu or an ancestor of uu, and PuP_{u} contains a kk-nearest neighbor of pp }\bigr{\}}. Partition KK into a disjoint union KK′′K^{\prime}\cup K^{\prime\prime}, where KK^{\prime} covers those pairs {Pw,Pw}Δ\{P_{w},P_{w^{\prime}}\}\in\Delta such that ww is an ancestor of uu, and K′′K^{\prime\prime} covers those pairs {Pu,Pw}Δ\{P_{u},P_{w^{\prime}}\}\in\Delta. Inductively, KC𝑝𝑎𝑟𝑒𝑛𝑡(u)K^{\prime}\subseteq C_{\mathit{parent}(u)}. We just need to argue that K′′K^{\prime\prime} is contained in the subset {p:{Pu,Pw}Δs.t.pPw|Pw|k}\bigl{\{}p:\exists\,\{P_{u},P_{w^{\prime}}\}\in\Delta\;\text{s.t.}\;p\in P_{w^{\prime}}\wedge|P_{w^{\prime}}|\leq k\bigr{\}} which is part of the initial CC. That is, if pPwp\in P_{w^{\prime}} for some {Pu,Pw}Δ\{P_{u},P_{w^{\prime}}\}\in\Delta and some qPuq\in P_{u} is a kk-nearest neighbor of pp, we need to show that |Pw|k|P_{w^{\prime}}|\leq k. In this case, as Δ\Delta is a 4-WSPD, the diameter of PwP_{w^{\prime}} is less than 14d(Pu,Pw)<d(p,q)\frac{1}{4}d(P_{u},P_{w^{\prime}})<d(p,q). So all points in Pw{p}P_{w^{\prime}}\setminus\{p\} are closer to pp than qq, which implies that |Pw|k|P_{w^{\prime}}|\leq k because qq is a kk-nearest neighbor of pp.    

We prune CC as follows. Recall that Q^\hat{Q} is the polygon of O(1)O(1) size that induces the metric dd. Let Ξ=(ξ1,ξ2,)\Xi=(\xi_{1},\xi_{2},\ldots) be a maximal set of points in Q^\partial\hat{Q} in clockwise order such that for any ξiΞ\xi_{i}\in\Xi, d(ξi,ξi+1)[18,14]d(\xi_{i},\xi_{i+1})\in\bigl{[}\frac{1}{8},\frac{1}{4}\bigr{]}. The set Ξ\Xi has O(1)O(1) size and can be computed in O(1)O(1) time by placing points greedily in Q^\partial\hat{Q}. Let γi\gamma_{i} be the ray from the origin through ξi\xi_{i}. These rays divide 2\mathbb{R}^{2} into cones. Fix an arbitrary point q0Puq_{0}\in P_{u}. Compute in O(|C|)O(|C|) time, for all ii, the subset CiC_{i} of CC in the cone bounded by γi+q0\gamma_{i}+q_{0} and γi+1+q0\gamma_{i+1}+q_{0}. Determine the kk-th nearest point in CiC_{i} from q0q_{0} in O(|Ci|)O(|C_{i}|) time. Then, scan CiC_{i} in O(|Ci|)O(|C_{i}|) time to retain only the kk nearest points in CiC_{i} from q0q_{0}. Repeat the same for every CiC_{i}. The union of the pruned CiC_{i}’s is CuC_{u} which clearly has O(k)O(k) size. Lemma D.3 below shows that the pruning of CiC_{i} only removes a point pp if no point in PuP_{u} can be a kk-nearest neighbor of pp. Therefore, the union of the pruned CiC_{i}’s satisfies the requirement for CuC_{u}. The running time over all CiC_{i}’s is O(|C|)=O(k+|{p:{Pu,Pw}Δs.t.pPw|Pw|k}|)O\bigl{(}|C|)=O\bigl{(}k+\bigl{|}\bigl{\{}p:\exists\,\{P_{u},P_{w^{\prime}}\}\in\Delta\;\text{s.t.}\;p\in P_{w^{\prime}}\wedge|P_{w^{\prime}}|\leq k\bigr{\}}\bigr{|}\bigr{)}.

In summary, the computation of all CuC_{u}’s takes O(k|Δ|)=O(k|X|)O(k|\Delta|)=O(k|X|) time.

Lemma D.3

For any point pCip\in C_{i}, if there are at least kk points in pCi{p}p^{\prime}\in C_{i}\setminus\{p\} such that d(p,q0)d(p,q0)d(p^{\prime},q_{0})\leq d(p,q_{0}), then no point in PuP_{u} can be a kk-nearest neighbor of pp.

Proof.  By our initialization of CC, we can inductively show that for any point pp included in the initial CC, there exists {Pw,Pw}Δ\{P_{w},P_{w^{\prime}}\}\in\Delta such that pPwp\in P_{w^{\prime}}, and ww is uu or an ancestor of uu.

Pick a point pCip\in C_{i}. Let pp^{\prime} be a point in Ci{p}C_{i}\setminus\{p\} such that d(p,q0)d(p,q0)d(p^{\prime},q_{0})\leq d(p,q_{0}). Let λ0\lambda_{0} be the factor such that p(λ0Q^+q0)p^{\prime}\in\partial(\lambda_{0}\hat{Q}+q_{0}). Similarly, let λ1λ0\lambda_{1}\geq\lambda_{0} be the factor such that p(λ1Q^+q0)p\in\partial(\lambda_{1}\hat{Q}+q_{0}). Let p′′p^{\prime\prime} be the intersection between pq0pq_{0} and (λ0Q^+q0)\partial(\lambda_{0}\hat{Q}+q_{0}). Refer to Figure 8.

Refer to caption

Figure 8: The two concentric polygons are homothetic copies of Q^\hat{Q}. The smaller one is λ0Q^+q0\lambda_{0}\hat{Q}+q_{0}. The larger one is λ1Q^+q0\lambda_{1}\hat{Q}+q_{0}.

Since {p,p′′}(λ0Q^+q0)\{p^{\prime},p^{\prime\prime}\}\subset\partial(\lambda_{0}\hat{Q}+q_{0}), and pp^{\prime} and p′′p^{\prime\prime} lie in the cone that is bounded by the rays γi+q0\gamma_{i}+q_{0} and γi+1+q0\gamma_{i+1}+q_{0}, we can deduce from the property of d(ξi,ξi+1)1/4d(\xi_{i},\xi_{i+1})\leq 1/4 that {p,p′′}λ04Q^+λ0ξi+q0\{p^{\prime},p^{\prime\prime}\}\subset\frac{\lambda_{0}}{4}\hat{Q}+\lambda_{0}\xi_{i}+q_{0}. Therefore, d(p,p′′)λ0/2d(p^{\prime},p^{\prime\prime})\leq\lambda_{0}/2. The edge of λ0Q^+q0\lambda_{0}\hat{Q}+q_{0} that contains p′′p^{\prime\prime} and the edge of λ1Q^+q0\lambda_{1}\hat{Q}+q_{0} that contains pp are homothetic copies of the same edge of Q^\hat{Q}. Since pp, p′′p^{\prime\prime} and q0q_{0} are collinear, the Euclidean length of pp′′pp^{\prime\prime} is (λ1λ0)/λ0(\lambda_{1}-\lambda_{0})/\lambda_{0} times the Euclidean length of q0p′′q_{0}p^{\prime\prime}. Therefore, (λ1λ0)Q^+p′′(\lambda_{1}-\lambda_{0})\hat{Q}+p^{\prime\prime} contains pp in its boundary, which implies that d(p,p′′)=λ1λ0d(p,p^{\prime\prime})=\lambda_{1}-\lambda_{0}. By the triangle inequality,

d(p,p)\displaystyle d(p,p^{\prime}) d(p,p′′)+d(p,p′′)λ1λ0/2\displaystyle\leq d(p,p^{\prime\prime})+d(p^{\prime},p^{\prime\prime})\leq\lambda_{1}-\lambda_{0}/2
=d(p,q0)d(p,q0)/2.\displaystyle=d(p,q_{0})-d(p^{\prime},q_{0})/2. (1)

As mentioned at the beginning of this proof, since pCp^{\prime}\in C, there exists {Pw,Pw}Δ\{P_{w},P_{w^{\prime}}\}\in\Delta such that pPwp^{\prime}\in P_{w^{\prime}}, and ww is uu or an ancestor of uu. So q0PuPwq_{0}\in P_{u}\subseteq P_{w}. Let δu\delta_{u} and δw\delta_{w} be the diameters of PuP_{u} and PwP_{w} under dd, respectively. We have d(p,q0)d(p,Pw)d(Pw,Pw)d(p^{\prime},q_{0})\geq d(p^{\prime},P_{w})\geq d(P_{w^{\prime}},P_{w}). Since Δ\Delta is a 4-WSPD, d(Pw,Pw)4δw4δud(P_{w^{\prime}},P_{w})\geq 4\delta_{w}\geq 4\delta_{u}. It follows that d(p,q0)/22δud(p^{\prime},q_{0})/2\geq 2\delta_{u}. Substituting into (1) gives d(p,p)d(p,q0)2δud(p,p^{\prime})\leq d(p,q_{0})-2\delta_{u}.

For any point yPuy\in P_{u}, d(p,p)d(p,q0)2δud(p,y)+d(q0,y)2δud(p,y)δu<d(p,y)d(p,p^{\prime})\leq d(p,q_{0})-2\delta_{u}\leq d(p,y)+d(q_{0},y)-2\delta_{u}\leq d(p,y)-\delta_{u}<d(p,y). As a result, pp is closer to pp^{\prime} than any point in PuP_{u}. If there are at least kk such pp^{\prime}’s, no point in PuP_{u} can be a kk-nearest neighbor of pp.    

D.3 Nearest neighbor graph

We restate Lemma 4.8 and give its proof.

Statement of Lemma 4.8:  For any subset XSX\subseteq S, every vertex in 11-NNX\mathrm{NN}_{X} has O(1)O(1) degree, and adjacent vertices in 11-NNX\mathrm{NN}_{X} are Voronoi neighbors in VorQ(X)\mathrm{Vor}_{Q}(X).

Proof.  For every point pXp\in X, let Q^p\hat{Q}_{p} be the largest homothetic copy of Q^\hat{Q} centered at pp such that int(Q^p)(X{p})=\mathrm{int}(\hat{Q}_{p})\cap(X\setminus\{p\})=\emptyset. In 1-NNX\mathrm{NN}_{X}, a point qXq\in X is connected to its nearest neighbor, and if any other point pXp\in X is connected to qq, then qQ^pq\in\partial\hat{Q}_{p}. Therefore, the vertex degree of 1-NNX\mathrm{NN}_{X} is bounded from above by the maximum number of polygons in {Q^p:pX}\{\hat{Q}_{p}:p\in X\} that are intersected by a point in 2\mathbb{R}^{2}.

Let yy be a point in 2\mathbb{R}^{2} that intersects the maximum number of polygons in {Q^p:pX}\{\hat{Q}_{p}:p\in X\}. Let {Q^p1,,Q^ps}\{\hat{Q}_{p_{1}},\ldots,\hat{Q}_{p_{s}}\} be the polygons intersected by yy. Shrink each Q^pi\hat{Q}_{p_{i}} concentrically to a polygon Q^i\hat{Q}_{i} that just contains yy in its boundary. So Q^iQ^pi\hat{Q}_{i}\subseteq\hat{Q}_{p_{i}}, meaning that int(Q^i)(X{pi})=\mathrm{int}(\hat{Q}_{i})\cap(X\setminus\{p_{i}\})=\emptyset. Take the largest λ>0\lambda>0 such that the interior of Q^y=λQ^+y\hat{Q}_{y}=\lambda\hat{Q}+y does not intersect {p1,,ps}\{p_{1},\ldots,p_{s}\}. For i[1,s]i\in[1,s], let qiq_{i} be the intersection between the segment piyp_{i}y and Q^y\partial\hat{Q}_{y}. Refer to Figure 9.

Refer to caption

Figure 9: Illustration for the proof of Lemma 4.8.

We claim that d(qi,qj)λd(q_{i},q_{j})\geq\lambda for all iji\not=j. Without loss of generality, assume that d(y,pi)d(y,pj)d(y,p_{i})\geq d(y,p_{j}). Let zz be the point in the segment piyp_{i}y such that d(y,z)=d(y,pj)d(y,z)=d(y,p_{j}). Since yy, zz and pip_{i} are collinear, we have d(y,z)=d(y,pi)d(pi,z)d(y,z)=d(y,p_{i})-d(p_{i},z). Since pjint(Q^pi)p_{j}\not\in\mathrm{int}(\hat{Q}_{p_{i}}) and yQ^piy\in\hat{Q}_{p_{i}}, we have d(pi,pj)d(y,pi)d(p_{i},p_{j})\geq d(y,p_{i}), which implies that d(y,z)d(pi,pj)d(pi,z)d(pi,z)+d(pj,z)d(pi,z)=d(pj,z)d(y,z)\leq d(p_{i},p_{j})-d(p_{i},z)\leq d(p_{i},z)+d(p_{j},z)-d(p_{i},z)=d(p_{j},z). Since the wedge yqiqjyq_{i}q_{j} is a scaled copy of the wedge yzpjyzp_{j}, the inequality d(pj,z)d(y,z)d(p_{j},z)\geq d(y,z) implies that d(qi,qj)d(y,qi)=λd(q_{i},q_{j})\geq d(y,q_{i})=\lambda.

Our claim implies that we can place non-overlapping copies of λ2Q^\frac{\lambda}{2}\hat{Q} centered at the qiq_{i}’s. Each copy has half the area of Q^y\hat{Q}_{y}, and all these copies are contained inside 2Q^y2\hat{Q}_{y}. A packing argument shows that there are O(1)O(1) such copies. This shows that the vertex degree of 1-NNX\mathrm{NN}_{X} is O(1)O(1).

Let pqpq be an edge in 1-NNX\mathrm{NN}_{X}. We assume without loss of generality that pp is the nearest neighbor of qq in XX. Thus, there exists λ>0\lambda>0 such that p(λQ^+q)p\in\partial(\lambda\hat{Q}+q) and int(λQ^+q)X=\mathrm{int}(\lambda\hat{Q}+q)\cap X=\emptyset. By the definition of Q^\hat{Q}, it means that there exists a point x2x\in\mathbb{R}^{2} such that λQ+xλQ^+q\lambda Q^{*}+x\subset\lambda\hat{Q}+q and {p,q}(λQ+x)\{p,q\}\subset\partial(\lambda Q^{*}+x). So int(λQ+x)X=\mathrm{int}(\lambda Q^{*}+x)\cap X=\emptyset which certifies that pp and qq are Voronoi neighbors in VorQ(X)\mathrm{Vor}_{Q}(X).    

Appendix E Missing details in Section 4.3.3

Only the proof of Lemma 4.9 is missing. We restate the lemma and give its proof below. The proof contains the details of the analysis in [3] that works for step 6 of VorNN.

Statement of Lemma 4.9:  VorNN(R,TR)(R,T_{R}) computes VorQ(R)\mathrm{Vor}_{Q}(R) in O(|R|)O(|R|) expected time.

Proof.  We first give the details of step 6 of VorNN. As in [3], we grow XX and VorQ(X)\mathrm{Vor}_{Q}(X) by moving points repeatedly from YXY\setminus X to XX. We keep track of edges pqpq in 1-NNY\mathrm{NN}_{Y} such that pYXp\in Y\setminus X and qXq\in X. Take such an edge.

Since pqpq is an edge in 1-NNY\mathrm{NN}_{Y}, pqpq must also be an edge in 1-NNX{p}\mathrm{NN}_{X\cup\{p\}}. By Lemma 4.8, pp and qq are Voronoi neighbors in VorQ(X{p})\mathrm{Vor}_{Q}(X\cup\{p\}). So pp must conflict with a point in Vq(X)V_{q}(X). Lemma 3.4 implies that pp must conflict with some Voronoi edge bend or Voronoi vertex vv in Vq(X)\partial V_{q}(X). We search VorQ(X)\mathrm{Vor}_{Q}(X) from vv to find all Voronoi edge bends and Voronoi vertices that conflict with pp. During this search, we can modify VorQ(X)\mathrm{Vor}_{Q}(X) to VorQ(X{p})\mathrm{Vor}_{Q}(X\cup\{p\}) in time proportional to the number of Voronoi edge bends and Voronoi vertices that conflict with pp [19].

Consider the total running time. The identification of the starting Voronoi edge bend or Voronoi vertex involves checking Vq(X)\partial V_{q}(X). In other words, Vq(X)\partial V_{q}(X) is examined once for each neighbor of qq in 1-NNY\mathrm{NN}_{Y}. Therefore, any Voronoi edge bend or Voronoi vertex ww that was constructed at some point during the algorithm can be examined as many times as the degree sum in 1-NNY\mathrm{NN}_{Y} of the points that define ww. This degree sum is O(1)O(1) by Lemma 4.8. Also, if ww is subsequently destroyed by the insertion of a point in YXY\setminus X, we can charge the time to destroy ww to the creation of ww. Hence, the total expected running time is bounded by the expected number of Voronoi edge bends and Voronoi vertices created in the course of the algorithm.

Let QwQ^{*}_{w} denote the homothetic copy of QQ^{*} that circumscribes the points that define ww. Let s=|YQw|s=|Y\cap Q^{*}_{w}|. A necessary condition for ww to be created in step 6 is that YQwY\cap Q^{*}_{w} contains no point in XX right before the execution of step 6. If some points in YQwY\cap Q^{*}_{w} form matching pair(s) in step 3 of VorNN, at least one of them must be included in XX in step 3, which means that we cannot possibly create ww in step 6. If the points in YQwY\cap Q^{*}_{w} do not form any matching pair in step 3, then the points in YQwY\cap Q^{*}_{w} are sampled independently with probability 1/2. Therefore, the probability that none of these points is selected in step 3 is at most 1/2s1/2^{s}, so the probability that ww is created in step 6 is at most 1/2s1/2^{s}. By the result of Clarkson and Shor [11, Theorem 3.1], there are O(|Y|s2)O(|Y|s^{2}) Voronoi edge bends and Voronoi vertices whose circumscribing homothetic copies of QQ^{*} contain at most ss points in YY. Therefore, the expected number of Voronoi edge bends and Voronoi vertices created in step 6 of VorNN is O(s=0|Y|s2/2s)=O(|Y|)O(\sum_{s=0}^{\infty}|Y|s^{2}/2^{s})=O(|Y|). The expected size of XX is |Y|/2|Y|/2. Therefore, unwinding the recursion starting from the top-level call VorNN(R,TR)(R,T_{R}) gives a total expected running time of O(|R|+|R|/2+|R|/4+)=O(|R|)O(|R|+|R|/2+|R|/4+\cdots)=O(|R|).    

Appendix F Missing details in Section 4.4

Recall that URU_{R} is the set of Voronoi edge bends and Voronoi vertices in VorQ(R)\mathrm{Vor}_{Q}(R) that conflict with the input points p1,,pnp_{1},\ldots,p_{n}. We need to prove that UR=i=1nVS|piU_{R}=\bigcup_{i=1}^{n}V_{S}|_{p_{i}}. Clearly, i=1nVS|piUR\bigcup_{i=1}^{n}V_{S}|_{p_{i}}\subseteq U_{R} by the definition of RR. The following result proves that the containment also holds in the other direction.

Lemma F.1

URi=1nVS|piU_{R}\subseteq\bigcup_{i=1}^{n}V_{S}|_{p_{i}}.

Proof.  Take any piIp_{i}\in I. For each qNpi(S{pi})q\in N_{p_{i}}\bigl{(}S\cup\{p_{i}\}\bigr{)}, pip_{i} must conflict with Vq(S)V_{q}(S) in order that qq becomes a Voronoi neighbor of pip_{i} in VorQ(S{pi})\mathrm{Vor}_{Q}\big{(}S\cup\{p_{i}\}\bigr{)}. As a result, Npi(S{pi})RN_{p_{i}}\big{(}S\cup\{p_{i}\}\bigr{)}\subseteq R, which implies that Vpi(S{pi})=Vpi(R{pi})V_{p_{i}}\bigl{(}S\cup\{p_{i}\}\bigr{)}=V_{p_{i}}\bigl{(}R\cup\{p_{i}\}\big{)}. In VorQ(S)\mathrm{Vor}_{Q}(S), the region Vpi(S{pi})V_{p_{i}}\bigl{(}S\cup\{p_{i}\}\bigr{)} is partitioned and distributed among the Voronoi cells of points in Npi(S{pi})RN_{p_{i}}\bigl{(}S\cup\{p_{i}\}\bigr{)}\subseteq R. Thus, any Voronoi edge bend or Voronoi vertex ww in VorQ(R)\mathrm{Vor}_{Q}(R) that does not exist in VSV_{S} must lie strictly outside Vpi(S{pi})=Vpi(R{pi})V_{p_{i}}\bigl{(}S\cup\{p_{i}\}\bigr{)}=V_{p_{i}}\bigl{(}R\cup\{p_{i}\}\big{)}. Hence, ww cannot conflict with pip_{i}.