This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Simpler is More: Efficient Top-K Nearest Neighbors Search on Large Road Networks

Yiqi Wang1, Long Yuan2, Wenjie Zhang1, Xuemin Lin3, Zi Chen1, Qing Liu4 1 The University of New South Wales, 2 Nanjing University of Science and Technology 3 Shanghai Jiaotong University, 4 Data61, CSIRO, Australia [email protected], [email protected], [email protected] zhangw,[email protected],[email protected]
Abstract.

Top-kk Nearest Neighbors (kkNN) problem on road network has numerous applications on location-based services. As direct search using the ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}โ€™s algorithm results in a large search space, a plethora of complex-index-based approaches have been proposed to speedup the query processing. However, even with the current state-of-the-art approach, long query processing delays persist, along with significant space overhead and prohibitively long indexing time. In this paper, we depart from the complex index designs prevalent in existing literature and propose a simple index named ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. With ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, we can answer a kkNN query optimally and progressively with small and size-bounded index. To improve the index construction performance, we propose a bidirectional construction algorithm which can effectively share the common computation during the construction. Theoretical analysis and experimental results on real road networks demonstrate the superiority of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} over the state-of-the-art approach in query processing performance, index size, and index construction efficiency.


1. Introduction

Top kk nearest neighbors (kkNN) search on road network is a fundamental operation in location-based services (Bhatia etย al., 2010; Abbasifard etย al., 2014; Nodarakis etย al., 2017). Formally, given a road network Gโ€‹(V,E)G(V,E), a set of candidate objects โ„ณ\mathcal{M}, and a query vertex uu, kkNN search identifies kk objects in โ„ณ\mathcal{M} with the shortest distance to uu. kkNN search finds many important real world applications. For example, in the accommodation booking platforms like Booking (Booking, [n.d.]), Airbnb (Airnb, [n.d.]) and Trip (Trip, [n.d.]), an important operation is to show several accommodations closest to the location provided by users. In restaurant-review services, such as Yelp (Yelp, [n.d.]), Dianping (Dianping, [n.d.]) and OpenRice (OpenRice, [n.d.]), platforms utilize kkNN search to present several nearby restaurants to the user. In ride-hailing services like Uber (Uber, [n.d.]) and Didi (DiDi, [n.d.]), several available vehicles near the pickup location are presented before users send the ride-hailing request.

Refer to caption
Figure 1. kkNN Search in Location-based Service (k=3k=3)
Example 1.1.

Figureย 1 shows a kkNN search example in location-based service. Assume that tourists in New York, such as โ€Johnโ€ and โ€Jennieโ€, want to find Starbucks nearby to drink coffee, the location-based service providers like Google Map generally present several candidate stores based on the distance from their locations, which can be modeled as kkNN search problem. In Figureย 1, there are 66 Starbucks stores (marked with A,B,โ‹ฏ,FA,B,\cdots,F), therefore, the candidate object set โ„ณ={A,B,C,D,E,F}\mathcal{M}=\{A,B,C,D,E,F\}. For โ€Jennieโ€, the 33NN search returns {E,F,B}\{E,F,B\} while the 33NN search for โ€Johnโ€ returns {C,D,B}\{C,D,B\}.

Motivation. Given a kkNN query for vertex uu, the query can be directly answered by exploring the vertices based on their distance to uu using ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}โ€™s algorithm (Dijkstra, 1959). Nevertheless, this method is inefficient, especially when the road network is large and the candidate objects are far from uu. Therefore, researchers resort to indexing-based solutions to accelerate query processing (Papadias etย al., 2003b; Demiryurek etย al., 2009a; Lee etย al., 2010; Zhong etย al., 2015; Shen etย al., 2017; Luo etย al., 2018; He etย al., 2019; Ouyang etย al., 2020a).

Although existing index-based approaches have made strides in accelerating the query processing, they still suffer from the long query processing delay and their performance is far from optimal. Additionally, these solutions exhibit significant space overhead and prohibitively long indexing times, severely limiting their practical applicability. Take the state-of-the-art approach ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}(Ouyang etย al., 2020a) for kkNN queries on road networks as an example. The size of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}on the dataset ๐–ด๐–ฒ๐– \mathsf{USA} with only 23.95 million vertices and 58.33 million edges (nearly 442.5 MB if an edge is stored by two 4 byte integers) exceeds 160160 GB, and it takes more than 5.45.4 hours to construct the corresponding index. Motivated by these, this paper aims to propose a new index-based solution for kkNN query that can overcome the shortcomings of existing solutions in query processing performance, index size and index construction.

A Minimalist kkNN Index Design. Revisiting the existing solutions, they generally design a complex index to speedup the query processing. For example, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}consists of three different parts. Incredibly, as an index for kkNN query, one of the three parts is even a complete index structure for shortest distance query. The complex-index design leads to the drawbacks of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}as analyzed in Sectionย 3. This drives us to ask: is this complex-index design thinking really suitable for kkNN query?

In this paper, we adopt a completely opposite design approach. Going back to the essence of kkNN search, it only needs to return the kk nearest neighbors for the query vertex. Moreover, the kk value of the kkNN search used in real applications is typically not large as users often have limited attention spans and prefer to quickly obtain relevant information to reduce cognitive load and facilitate decision-making (Tugend, y 27; Ph.D., er 1; Sthapit, 2018; Chernev etย al., 2015; Schwartz, 2004; Luo etย al., 2018). For example, Yelp App (Yelp, [n.d.]) provides customers with 2020 results every time when searching nearest specific place type, such as restaurant or gas station. A similar strategy is also adopted in other Apps like OpenRice (OpenRice, [n.d.]) and OpenTable (OpenTable, [n.d.]). Therefore, our proposed new index named ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} only simply records the kk nearest neighbors of each vertex. The benefits of this minimalist kkNN index design are twofold: regarding the query processing, the query can be answered progressively 111Progressive query processing outputs results gradually in a well-bounded delay, allowing users to obtain useful results even before query processing completes. It also provides a choice to early terminate the search when a user finds sufficient answers (Papadias etย al., 2003a). in optimal time. Regarding the space-consumption of the index, only the essential information directly to kkNN query is stored in the index and the value of kk is small in practice, resulting in a well-bounded index space.

New Challenges. ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} successfully addresses the issues of long query delays and oversized indexes by directly storing the kk nearest neighbors for each vertex. However, this strategy leaves the trouble to the index construction as the index structure intuitively implies that we have to explore all the query space before constructing it. A straightforward approach is to compute the kk nearest neighbors for each vertex by ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}โ€™s algorithm (Dijkstra, 1959). However, the time complexity of this approach is Oโ€‹(nโ‹…(m+nโ€‹logโกn))O(n\cdot(m+n\log n)), where nn is the number of vertices and mm is the number of edges in the road network. Clearly, this approach is impractical to handle large road networks. Another possible approach is to use the existing index like ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}to accelerate the computation of kk nearest neighbors for each vertex. Nevertheless, this approach unavoidably induces the drawbacks of existing approaches as discussed above. Overall, the efficiency of the index construction algorithm determines the applicability of our index while it is challenging to design such an efficient index construction algorithm that could outperform existing solutions.

Our Idea. The above discussed approaches compute the kk nearest neighbors for each vertex independently, which miss the potential opportunities to re-use the intermediate results during the construction. Therefore, we adopt a computation sharing strategy to achieve the efficient index construction. To effectively share the computation, we introduce the concept of bridge neighbor set for a vertex vv and reveal the hidden relationships between its bridge neighbor set and kk nearest neighbors. Following these findings, we design a bridge neighbor preserved graph (๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph}) of the input road network with which the bridge neighbor set of a vertex can be easily obtained. Based on ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph}, we first propose a bottom-up index construction algorithm in which the intermediate results during the construction can be largely shared and further improve the performance by introducing a bidirectional construction algorithm. Additionally, the given candidate objects โ„ณ\mathcal{M} may be updated in some cases (Ouyang etย al., 2020a), we also design efficient algorithm to incrementally maintain the index for these updates.

Contributions. In this paper, we make the following contributions:

(1) A new attempt at an alternative kkNN index design paradigm with a simple yet effective kkNN index. Recognizing the complex index in the existing solutions leads to long query processing delay, oversized index and prohibitive indexing time, we embrace minimalism and design a simple kkNN index that has a well-bounded space and enables progressive and optimal query processing. To the best of our knowledge, this is the first work that systematically studies such simple yet effective index for kkNN query.

(2) Efficient index construction and maintenance algorithms. Following the designed index, we propose a novel index construction algorithm with which the shortest distance computation regarding a vertex and its top kk nearest neighbors can be effectively shared. We also propose index maintenance algorithms to handle object insertion and deletion. We provide time complexity analysis for all proposed algorithms.

(3) Extensive experiments on real-world road networks. We extensively evaluate our proposed algorithms on real road networks. Compared with the state-of-the-art approach ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, experimental results demonstrate that our approach reduces the index space two order of magnitude, speeds up the query time up to two orders of magnitude, and achieves up to two orders of magnitude speedup in index construction.

Outline. Sectionย 2 provides the problem definition. Sectionย 3 introduces the state-of-the-art algorithm. Sectionย 4, Sectionย 5, and Sectionย 6 present the new indexing approach. Sectionย 7 evaluates our algorithms and Sectionย 8 reviews the related work. Sectionย 9 concludes the paper.

2. Preliminaries

Notations Descriptions
๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) kk nearest neighbor set of uu
๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) decreasing rank partial kkNN of uu
Gโ€ฒG^{\prime} bridge neighbor preserved graph of GG
๐–ฆโ€ฒโฃ<โ€‹(u){\mathsf{G^{\prime<}}}(u) decreasing rank path subgraph of uu
๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u) increasing rank path subgraph of uu
๐–ก๐–ญ๐–ฒโ€‹(u){\mathsf{BNS}}(u) neighbors of uu in Gโ€ฒG^{\prime}
๐–ก๐–ญ๐–ฒ<โ€‹(u){\mathsf{BNS^{<}}}(u) neighbors of uu in Gโ€ฒG^{\prime} with lower rank than uu
๐–ก๐–ญ๐–ฒ>โ€‹(u){\mathsf{BNS^{>}}}(u) neighbors of uu in Gโ€ฒG^{\prime} with higher rank than uu
๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v){\mathsf{dist_{<}}}(u,v) length of decreasing rank shortest path between uu and vv
Table 1. Notations

Let G=(V,E)G=(V,E) be a connected and weighted graph to represent a real-world road network, where Vโ€‹(G)V(G) and Eโ€‹(G)E(G) is the set of vertices and edges in GG, respectively. We use n=|Vโ€‹(G)|n=|V(G)| (resp. m=|Eโ€‹(G)|m=|E(G)|) to denote the number of vertices (resp. edges) in GG. For each vertex vโˆˆVโ€‹(G)v\in V(G), the neighbours of vv, denoted by ๐—‡๐–ป๐—‹โ€‹(v,G){\mathsf{nbr}}(v,G), is defined as ๐—‡๐–ป๐—‹โ€‹(v,G)={u|(u,v)โˆˆEโ€‹(G)}{\mathsf{nbr}}(v,G)=\{u|(u,v)\in E(G)\}. The degree of a vertex vv is the number of neighbors of vv, i.e., ๐–ฝ๐–พ๐—€โ€‹(v,G)=|๐—‡๐–ป๐—‹โ€‹(v,G)|{\mathsf{deg}}(v,G)=|{\mathsf{nbr}}(v,G)|. The weight of an edge (u,v)(u,v) is denoted as ฯ•โ€‹((u,v),G)\phi((u,v),G). A path pp in GG is a sequence of vertices p=(v0,v1,v2,โ€ฆ,vn)p=(v_{0},v_{1},v_{2},\dots,v_{n}), such that viโˆˆVโ€‹(G)v_{i}\in V(G) for each 0โ‰คiโ‰คn0\leq i\leq n. The length of pp, denoted by ๐—…๐–พ๐—‡โ€‹(p){\mathsf{len}}(p), is the sum for the weight of edges in pp, i.e., ๐—…๐–พ๐—‡โ€‹(p)=โˆ‘i=1nฯ•โ€‹(viโˆ’1,vi){\mathsf{len}}(p)=\sum_{i=1}^{n}\phi(v_{i-1},v_{i}). Given two vertices uu and vv, the shortest path between uu and vv in GG is a path pp from uu to vv with smallest ๐—…๐–พ๐—‡โ€‹(p){\mathsf{len}}(p). The distance between u and v in G, denoted as ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v){\mathsf{dist}}(u,v), is the weight of the shortest path between them.

Regarding the given set of candidate objects โ„ณ\mathcal{M}, we assume all objects in โ„ณ\mathcal{M} are on vertices following the previous works (Shen etย al., 2017; Luo etย al., 2018; He etย al., 2019; Ouyang etย al., 2020a). In real-world road networks, each object oโˆˆโ„ณo\in\mathcal{M} may appear on any point of edges. For an object oo not on a vertex, we can see oo as a vertex object with an offset, and the distance between oo and a query vertex vv can be computed by mapping oo to an adjacent vertex with an offset following the previous works (Shen etย al., 2017; Luo etย al., 2018; He etย al., 2019; Ouyang etย al., 2020a) as well. Specifically, assume oo is on an edge (uo,uoโ€ฒ)(u_{o},u^{\prime}_{o}) with a distance ฯ•o\phi_{o} to uoโ€ฒu^{\prime}_{o}, qq is on an edge (uq,uqโ€ฒ)(u_{q},u^{\prime}_{q}) with a distance ฯ•q\phi_{q}. The distance between qq and oo is represented as ๐–ฝ๐—‚๐—Œ๐—โ€‹(q,o)=ฯ•q+๐–ฝ๐—‚๐—Œ๐—โ€‹(uq,uo)+ฯ•o{\mathsf{dist}}(q,o)=\phi_{q}+{\mathsf{dist}}(u_{q},u_{o})+\phi_{o}. We denote the kkNN result of a vertex uu as ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) and define the problem of kkNN search as follows.

Problem Definition. Given a road network G=(V,E)G=(V,E), a query vertex uu, an integer kk, and a set of candidate objects โ„ณ\mathcal{M} (|โ„ณ|>k),โ„ณโІVโ€‹(G)(|\mathcal{M}|>k),\mathcal{M}\subseteq V(G)), we aim to computes kk objects from โ„ณ\mathcal{M}, denoted by ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u), such that โˆ€vโˆˆ๐–ต๐—„โ€‹(u),\forall v\in{\mathsf{V_{k}}}(u), wโˆˆโ„ณโˆ–๐–ต๐—„โ€‹(u),w\in\mathcal{M}\setminus{\mathsf{V_{k}}}(u), ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)โ‰ค๐–ฝ๐—‚๐—Œ๐—โ€‹(u,w){\mathsf{dist}}(u,v)\leq{\mathsf{dist}}(u,w).

Refer to caption
Figure 2. A Road Network
Example 2.1.

Consider the graph GG in Figureย 2 and assume all vertices are in the candidate object set. For a given query vertex v12v_{12} and k=5k=5, V5โ€‹(v12)={v12,v5,v11,v4,v19}V_{5}(v_{12})=\{v_{12},v_{5},v_{11},v_{4},v_{19}\}. The corresponding distances between v12v_{12} and vertex in V5โ€‹(v12)V_{5}(v_{12}) are 0,1,1,20,1,1,2 and 22 respectively.

3. The State-of-the-art Solution

๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}(Ouyang etย al., 2020a) is the state-of-the-art index-based approach for kkNN queries on road networks. ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}designs an index based on tree decomposition (Robertson and Seymour, 1984; Xu etย al., 2005) and ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} (Ouyang etย al., 2018), which proves superiority over other existing approaches. Specifically,

Index Structure. ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}decomposes the input road network into a tree-like structure by tree decomposition (Xu etย al., 2005). Given the decomposed tree structure, each vertex uu has a child vertex set ๐•‹โ€‹(u)\mathbb{T}(u) and an ancestor vertex set ๐”ธโ€‹(u)\mathbb{A}(u). Apart from the decomposed tree structure, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}contains the other two parts: kkTNN for each vertex uu which stores the top kk nearest neighbors of uu in ๐•‹โ€‹(u)\mathbb{T}(u) and ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} (Ouyang etย al., 2018) which is used to compute the shortest distance between uu and vโˆˆ๐”ธโ€‹(u)v\in\mathbb{A}(u).

Query Processing. Given a query vertex uu, for each vertex vv in the kkNN of uu, there exists a vertex pp such that pโˆˆ๐”ธโ€‹(u)โˆช{u}p\in\mathbb{A}(u)\cup\{u\} and vv in kkTNN of pp. Following this idea, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}answers the kkNN query in kk rounds. In each ii round (1โ‰คiโ‰คk1\leq i\leq k), it outputs the top ii-th result by iterating the vertices in ๐”ธโ€‹(u)โˆช{u}\mathbb{A}(u)\cup\{u\} and computing the corresponding shortest distance through ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}.

Index Construction. To construct the index, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}first decomposes the graph following (Xu etย al., 2005). With the decomposed tree, ๐”ธโ€‹(u)\mathbb{A}(u) and ๐•‹โ€‹(u)\mathbb{T}(u) can be obtained accordingly. After that, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}builds the ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} based on (Ouyang etย al., 2018). At last, the kkTNN for each vertex is constructed by querying the shortest distance of corresponding vertex pairs through ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}.

Drawbacks. Although ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}accelerates the kkNN query processing on road network, the following drawbacks limit its applicability in practice:

  • โ€ข

    Oversized Index. The size of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}is generally huge in practice. As verified in our experiments, the size of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}on ๐–ด๐–ฒ๐– \mathsf{USA} (only 23,947,34723,947,347 vertices and 58,333,34458,333,344 edges) exceeds 172.80172.80 GB, in which ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} takes 169.23169.23 GB space.

  • โ€ข

    Long Query Delay. To answer a kkNN query regarding vertex uu, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}has to iterate the vertices in ๐”ธโ€‹(u)โˆช{u}\mathbb{A}(u)\cup\{u\} and compute the corresponding shortest distance in kk rounds. Moreover, the shortest distance computation is not free, and needs heavy exploration on the ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. These two factors lead to long query delay of ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}.

  • โ€ข

    Prohibitive Indexing Time. As shown in the above, to construct the index, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}has to decompose the road network first, and then build the ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} and compute the kkTNN accordingly. Obviously, the time cost of these procedures are expensive, especially the ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} construction. For the dataset USA, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}takes 1966619666s to construct the index, in which ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} consumes 1963219632s.

4. Our Indexing Approach

According to the above analysis, although the use of ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} accelerates the query processing of ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, heavily depending on the ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} directly leads to the drawbacks of ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. This raises a natural question: why do we need an index for shortest distance such as ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} when addressing kkNN problem? Based on the logic of ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, partial kkNN (namely kkTNN) is maintained for each vertex and ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is used to refine the partial kkNN to obtain the final results when processing the query. This motivates us to further ask: Is it necessary to maintain the partial kkNN? How about maintaining the kkNN for each vertex directly as an index? In this way, the drawbacks regarding index size and query delay can be totally addressed. Following this idea, we propose the following index and query processing algorithm.

4.1. Index Structure and Query Processing

Our index just simply records the kkNN for each vertex in the graph, which is formally defined as follows:

Definition 4.1.

(๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}) Given a graph GG, an integer kk and a set of candidate objects โ„ณ\mathcal{M} (|โ„ณ|>k)(|\mathcal{M}|>k), for each vertex vโˆˆGv\in G, ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} records the top-kk nearest neighbors of vv in โ„ณ\mathcal{M}, namely ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u), in the increasing order of their shortest distances from vv.

Example 4.2.

Given the graph GG in Figureย 2, assume the candidate object set is all vertices in GG and k=5k=5, the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} of GG is shown in Figureย 3. Take v8v_{8} as an example, V5โ€‹(v8)={v8,v20,v2,v9,v1}V_{5}(v_{8})=\{v_{8},v_{20},v_{2},v_{9},v_{1}\}, with shortest distance 0, 3, 3, 4 and 4 respectively.

vv ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} vv ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}
v1v_{1} (v1,0)โ€‹(v2,1)โ€‹(v6,2)โ€‹(v7,4)โ€‹(v8,4)(v_{1},0)(v_{2},1)(v_{6},2)(v_{7},4)(v_{8},4) v11v_{11} (v11,0)โ€‹(v12,1)โ€‹(v19,1)โ€‹(v5,2)โ€‹(v18,2)(v_{11},0)(v_{12},1)(v_{19},1)(v_{5},2)(v_{18},2)
v2v_{2} (v2,0)โ€‹(v1,1)โ€‹(v6,3)โ€‹(v8,3)โ€‹(v7,5)(v_{2},0)(v_{1},1)(v_{6},3)(v_{8},3)(v_{7},5) v12v_{12} (v12,0)โ€‹(v11,1)โ€‹(v5,1)โ€‹(v4,2)โ€‹(v19,2)(v_{12},0)(v_{11},1)(v_{5},1)(v_{4},2)(v_{19},2)
v3v_{3} (v3,0)โ€‹(v12,3)โ€‹(v5,4)โ€‹(v11,4)โ€‹(v4,5)(v_{3},0)(v_{12},3)(v_{5},4)(v_{11},4)(v_{4},5) v13v_{13} (v13,0)โ€‹(v18,3)โ€‹(v19,3)โ€‹(v11,4)โ€‹(v20,4)(v_{13},0)(v_{18},3)(v_{19},3)(v_{11},4)(v_{20},4)
v4v_{4} (v4,0)โ€‹(v12,2)โ€‹(v5,3)โ€‹(v11,3)โ€‹(v19,4)(v_{4},0)(v_{12},2)(v_{5},3)(v_{11},3)(v_{19},4) v14v_{14} (v14,0)โ€‹(v10,1)โ€‹(v16,1)โ€‹(v15,3)โ€‹(v20,3)(v_{14},0)(v_{10},1)(v_{16},1)(v_{15},3)(v_{20},3)
v5v_{5} (v5,0)โ€‹(v12,1)โ€‹(v11,2)โ€‹(v17,2)โ€‹(v4,3)(v_{5},0)(v_{12},1)(v_{11},2)(v_{17},2)(v_{4},3) v15v_{15} (v15,0)โ€‹(v16,2)โ€‹(v14,3)โ€‹(v17,3)โ€‹(v10,4)(v_{15},0)(v_{16},2)(v_{14},3)(v_{17},3)(v_{10},4)
v6v_{6} (v6,0)โ€‹(v1,2)โ€‹(v7,2)โ€‹(v2,3)โ€‹(v20,4)(v_{6},0)(v_{1},2)(v_{7},2)(v_{2},3)(v_{20},4) v16v_{16} (v16,0)โ€‹(v14,1)โ€‹(v10,2)โ€‹(v15,2)โ€‹(v20,4)(v_{16},0)(v_{14},1)(v_{10},2)(v_{15},2)(v_{20},4)
v7v_{7} (v7,0)โ€‹(v6,2)โ€‹(v20,2)โ€‹(v9,3)โ€‹(v1,4)(v_{7},0)(v_{6},2)(v_{20},2)(v_{9},3)(v_{1},4) v17v_{17} (v17,0)โ€‹(v5,2)โ€‹(v12,3)โ€‹(v15,3)โ€‹(v11,4)(v_{17},0)(v_{5},2)(v_{12},3)(v_{15},3)(v_{11},4)
v8v_{8} (v8,0)โ€‹(v20,2)โ€‹(v2,3)โ€‹(v9,3)โ€‹(v1,4)(v_{8},0)(v_{20},2)(v_{2},3)(v_{9},3)(v_{1},4) v18v_{18} (v18,0)โ€‹(v11,2)โ€‹(v12,3)โ€‹(v13,3)โ€‹(v19,3)(v_{18},0)(v_{11},2)(v_{12},3)(v_{13},3)(v_{19},3)
v9v_{9} (v9,0)โ€‹(v20,1)โ€‹(v19,2)โ€‹(v7,3)โ€‹(v8,3)(v_{9},0)(v_{20},1)(v_{19},2)(v_{7},3)(v_{8},3) v19v_{19} (v19,0)โ€‹(v11,1)โ€‹(v9,2)โ€‹(v12,2)โ€‹(v5,3)(v_{19},0)(v_{11},1)(v_{9},2)(v_{12},2)(v_{5},3)
v10v_{10} (v10,0)โ€‹(v14,1)โ€‹(v16,2)โ€‹(v15,4)โ€‹(v20,4)(v_{10},0)(v_{14},1)(v_{16},2)(v_{15},4)(v_{20},4) v20v_{20} (v20,0)โ€‹(v9,1)โ€‹(v7,2)โ€‹(v8,2)โ€‹(v14,3)(v_{20},0)(v_{9},1)(v_{7},2)(v_{8},2)(v_{14},3)
Figure 3. ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} of GG (k=5k=5)

Query Processing. Based on our ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, for a kkNN query regarding a vertex vv, we can answer the query directly by retrieving the corresponding items of vv in the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}.

4.2. Theoretical Analysis

Following the index structure and query processing algorithm, we have the following theoretical results.

Optimal Query Processing. Since our query processing algorithm can answer the query directly by scanning the corresponding items of the query vertex in the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, the following theorem exists obviously:

Theorem 4.3.

Given a kkNN query, our algorithm takes Oโ€‹(k)O(k) time to process the query.

To answer a kkNN query, any algorithm needs to output the kk results at least, which takes Oโ€‹(k)O(k) time. On the other hand, Theoremย 4.3 shows the time complexity of our query processing algorithm is Oโ€‹(k)O(k). Therefore, the optimality holds.

Incremental Polynomial Query Processing. Consider an algorithm that returns several results. Let kk be the number of results in the output. An algorithm is said to have incremental polynomial if for all iโ‰คki\leq k, the output time of the first ii results is bounded by a polynomial function of the input size and ii (Chang etย al., 1994). Since the items for each vertex vv in the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} are recorded in the increasing order of their distance from vv, we have:

Theorem 4.4.

Given a kkNN query regarding vv, for every 1โ‰คiโ‰คk1\leq i\leq k, our algorithm outputs the top ii-th nearest neighbor in Oโ€‹(i)O(i) time.

Theoremย 4.4 shows that our query processing algorithm is incremental polynomial, indicating that it progressively provides results for a query within a bounded delay. The capability of incremental polynomial query processing is considered as a significant technical contribution of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}(Ouyang etย al., 2020a) and Theoremย 4.4 confirms that our algorithm also possesses this desirable theoretical guarantee.

Bounded Index Space. Since ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} only stores the top-kk nearest neighbors of each vertex in the road network, we have:

Theorem 4.5.

Given a road network GG and an integer kk, the size of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is bounded by Oโ€‹(nโ‹…k)O(n\cdot k).

Remark. According to the above analysis, ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} processes a query in optimal time while ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}takes Oโ€‹(hโ‹…k)O(h\cdot k) time (hh is the height of the tree decomposition), which means ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} surpasses ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}at the query processing. For the index size, the size of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is Oโ€‹(nโ‹…k)O(n\cdot k) while that of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}is Oโ€‹(nโ‹…h)O(n\cdot h). As introduced in Sectionย 1, the kk value of kkNN search in real road network applications is not large, therefore, the index size of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is advantageous compared with ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}in practice. Note that although (Ouyang etย al., 2020a) claims TEN-Index is parameter-free index, both TEN-Index and KNN-Index constructs their index based on a specific kk, which means we needs to construct different indices for different values of kk. A compromise solution to address this problem for ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and our index is that we can construct the index based on a moderately large kk value, and the kkNN search with smaller kk values can be answered based on the constructed index directly.

5. Index Construction

Based on the structure of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, it can be constructed straightforwardly by computing the top kk nearest neighbors of each vertex through ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€ฒโ€‹๐—Œ\mathsf{Dijkstra^{\prime}s} algorithm or ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. However, these approaches are time-consuming and inefficient to handle large road network. In this section, we present our new approach to construct the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}.

5.1. Key Properties of ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u)

The above discussed direct approaches using ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€ฒโ€‹๐—Œ\mathsf{Dijkstra^{\prime}s} algorithm or ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}compute the kk nearest neighbors for each vertex independently, which misses the potential opportunities to re-use the intermediate results during the index construction. In this section, we introduce two important properties regarding the distance computation, which lays the foundation for our computation-sharing index construction algorithms. We first define:

Definition 5.1.

(Bridge Neighbor Set) Given a vertex uโˆˆVโ€‹(G)u\in V(G), the bridge neighbor set of uu, denoted by ๐–ก๐–ญ๐–ฒโ€‹(u){\mathsf{BNS}}(u), is the set of uโ€ฒโ€‹su^{\prime}s neighbors vv such that the weight of the edge (u,v)(u,v) is equal to the distance between uu and vv in GG, i.e., ๐–ก๐–ญ๐–ฒโ€‹(u)={v|vโˆˆ๐—‡๐–ป๐—‹โ€‹(u,G)โˆงฯ•โ€‹((u,v),G)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)}{\mathsf{BNS}}(u)=\{v|v\in{\mathsf{nbr}}(u,G)\wedge\phi((u,v),G)={\mathsf{dist}}((u,v),G)\}.

Example 5.2.

Consider v8v_{8} in Figureย 2. ๐—‡๐–ป๐—‹โ€‹(v8,G)={v2,v6,v7,v20,v9}{\mathsf{nbr}}(v_{8},G)=\{v_{2},v_{6},v_{7},v_{20},\\ v_{9}\}. The shortest path between v8v_{8} and v9v_{9} is (v8,v20,v9)(v_{8},v_{20},v_{9}), and the distance is ๐–ฝ๐—‚๐—Œ๐—โ€‹((v8,v9),G)=3{\mathsf{dist}}((v_{8},v_{9}),G)=3. As ๐–ฝ๐—‚๐—Œ๐—โ€‹((v8,v9),G)โ‰ ฯ•โ€‹((v8,v9),G){\mathsf{dist}}((v_{8},v_{9}),G)\neq\phi((v_{8},v_{9}),G), v9v_{9} is not in ๐–ก๐–ญ๐–ฒโ€‹(v8){\mathsf{BNS}}(v_{8}). Similarly, v6v_{6} and v7v_{7} also does not belong to ๐–ก๐–ญ๐–ฒโ€‹(v8){\mathsf{BNS}}(v_{8}). For the graph GG in Figureย 2, the bridge neighbor set of v8v_{8} is ๐–ก๐–ญ๐–ฒโ€‹(v8)={v2,v20}{\mathsf{BNS}}(v_{8})=\{v_{2},v_{20}\}.

Based on Definitionย 5.1, we have following property regarding the bridge neighbor set of uu and its kk nearest neighbors:

Property 1.

Given a vertex uโˆˆVโ€‹(G)u\in V(G), ๐–ต๐—„โ€‹(u)โІโˆชvโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(u)\subseteq\cup_{v\in{\mathsf{BNS}}(u)}{\mathsf{V_{k}}}(v).

Proof: We prove this property by contradiction. Assume that wโˆˆ๐–ต๐—„โ€‹(u)w\in{\mathsf{V_{k}}}(u) but wโˆ‰โˆชvโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)๐–ต๐—„โ€‹(v)w\notin\cup_{v\in{\mathsf{BNS}}(u)}{\mathsf{V_{k}}}(v). According to Definitionย 5.1, the shortest path between ww and uu must pass through one vertex vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)v\in{\mathsf{BNS}}(u) such that for all viโˆˆ๐–ต๐—„โ€‹(v)v_{i}\in{\mathsf{V_{k}}}(v), iโˆˆ[1,k]i\in[1,k], we have ๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vi)<๐–ฝ๐—‚๐—Œ๐—โ€‹(w,v){\mathsf{dist}}(v,v_{i})<{\mathsf{dist}}(w,v). Therefore, ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)+๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vi)<๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)+๐–ฝ๐—‚๐—Œ๐—โ€‹(v,w){\mathsf{dist}}(u,v)+{\mathsf{dist}}(v,v_{i})<{\mathsf{dist}}(u,v)+{\mathsf{dist}}(v,w). This implies that there are at least kk vertices whose distance to uu are smaller than the distance between uu and ww, which contradicts wโˆˆ๐–ต๐—„โ€‹(u)w\in{\mathsf{V_{k}}}(u). The proof completes. โ–ก\Box

Following Propertyย 1, we have:

Property 2.

Given a vertex uโˆˆVโ€‹(G)u\in V(G), ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,w),G)=๐—†๐—‚๐—‡vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)โ€‹{๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)+๐–ฝ๐—‚๐—Œ๐—โ€‹((v,w),G)}{\mathsf{dist}}((u,w),G)={\mathsf{min}}_{v\in{\mathsf{BNS}}(u)}\\ \{{\mathsf{dist}}((u,v),G)+{\mathsf{dist}}((v,w),G)\} where wโˆˆ๐–ต๐—„โ€‹(u)w\in{\mathsf{V_{k}}}(u).

Proof: According to Definitionย 5.1, for โˆ€wโˆˆ๐–ต๐—„โ€‹(u)\forall w\in{\mathsf{V_{k}}}(u), each shortest path between ww and uu must pass through at least one vertex vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)v\in{\mathsf{BNS}}(u), so we have ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,w)=๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)+๐–ฝ๐—‚๐—Œ๐—โ€‹(v,w){\mathsf{dist}}(u,w)={\mathsf{dist}}(u,v)+{\mathsf{dist}}(v,w). โ–ก\Box

Based on Propertyย 1, when a vertex wโˆˆ๐–ต๐—„โ€‹(u)w\in{\mathsf{V_{k}}}(u), ww must be in โˆชvโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)๐–ต๐—„โ€‹(v)\cup_{v\in{\mathsf{BNS}}(u)}{\mathsf{V_{k}}}(v). Moreover, when the bridge neighbor set ๐–ก๐–ญ๐–ฒโ€‹(u){\mathsf{BNS}}(u) of uu, the distance ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v){\mathsf{dist}}(u,v) and ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) (๐–ฝ๐—‚๐—Œ๐—โ€‹((v,w),G){\mathsf{dist}}((v,w),G) accordingly where wโˆˆ๐–ต๐—„โ€‹(v)w\in{\mathsf{V_{k}}}(v)) for all vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)v\in{\mathsf{BNS}}(u) have been computed, we can compute ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,w),G){\mathsf{dist}}((u,w),G) for each wโˆˆโˆชvโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)๐–ต๐—„โ€‹(v)w\in\cup_{v\in{\mathsf{BNS}}(u)}{\mathsf{V_{k}}}(v) efficiently following Propertyย 2. Obviously, ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) just selects kk vertices from โˆชvโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)๐–ต๐—„โ€‹(v)\cup_{v\in{\mathsf{BNS}}(u)}{\mathsf{V_{k}}}(v) with the smallest distance values. Therefore, if we process the vertices in GG in a certain order, and when processing each vertex uu, the vertices vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)v\in{\mathsf{BNS}}(u) and ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) have been computed, then ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) and thereby ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} can be computed efficiently by sharing the computed results. The remaining problem is how to make this idea practically applicable. In next section, we present a bottom-up computation-sharing algorithm, which paves the way to our final index construction algorithm.

5.2. A Bottom-Up Computation-Sharing Algorithm

To compute the bridge neighbor set and share the computation effectively, we construct the index based on the bridge neighbor preserved graph Gโ€ฒG^{\prime} of the road network GG, which is defined as:

Definition 5.3.

(BN-Graph) Given a road network GG, a graph Gโ€ฒG^{\prime} is a bridge neighbor preserved graph (๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph}) of GG if (1) Vโ€‹(Gโ€ฒ)=Vโ€‹(G)V(G^{\prime})=V(G); (2) for each edge (u,v)โˆˆEโ€‹(Gโ€ฒ)(u,v)\in E(G^{\prime}), ฯ•โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)\phi((u,v),G^{\prime})={\mathsf{dist}}((u,v),G); (3) for any two vertices u,vโˆˆVโ€‹(Gโ€ฒ)u,v\in V(G^{\prime}), ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G^{\prime})={\mathsf{dist}}((u,v),G).

1 Gโ€ฒโ†GG^{\prime}\leftarrow G;
2 forย each wโˆˆVโ€‹(G)w\in V(G) in increasing order of ฯ€โ€‹(w)\pi(w)ย do
3ย ย ย ย ย  ๐’ฉโ†{v|vโˆˆ๐—‡๐–ป๐—‹โ€‹(w,Gโ€ฒ)โˆงฯ€โ€‹(v)>ฯ€โ€‹(w)}\mathcal{N}\leftarrow\{v|v\in{\mathsf{nbr}}(w,G^{\prime})\wedge\pi(v)>\pi(w)\};
4ย ย ย ย ย  forย each pair of vertices u,vโˆˆ๐’ฉu,v\in\mathcal{N}ย do
5ย ย ย ย ย ย ย ย ย ย  ifย (u,v)โˆ‰Eโ€‹(Gโ€ฒ)(u,v)\notin E(G^{\prime})ย then
6ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย  insert (u,v)(u,v) into Gโ€ฒG^{\prime};
7ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย  ฯ•โ€‹((u,v),Gโ€ฒ)โ†ฯ•โ€‹((u,w),Gโ€ฒ)+ฯ•โ€‹((w,v),Gโ€ฒ)\phi((u,v),G^{\prime})\leftarrow\phi((u,w),G^{\prime})+\phi((w,v),G^{\prime});
8ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย 
9ย ย ย ย ย ย ย ย ย ย else ifย ฯ•โ€‹((u,w),Gโ€ฒ)+ฯ•โ€‹((w,v),Gโ€ฒ)<ฯ•โ€‹((u,v),Gโ€ฒ)\phi((u,w),G^{\prime})+\phi((w,v),G^{\prime})<\phi((u,v),G^{\prime})ย then
10ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย  ฯ•โ€‹((u,v),Gโ€ฒ)โ†ฯ•โ€‹((u,w),Gโ€ฒ)+ฯ•โ€‹((w,v),Gโ€ฒ)\phi((u,v),G^{\prime})\leftarrow\phi((u,w),G^{\prime})+\phi((w,v),G^{\prime});
11ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย 
12forย each wโˆˆVโ€‹(G)w\in V(G) in decreasing order of ฯ€โ€‹(w)\pi(w) ย do
13ย ย ย ย ย  ๐’ฉโ†{v|vโˆˆ๐—‡๐–ป๐—‹โ€‹(w,Gโ€ฒ)โˆงฯ€โ€‹(v)>ฯ€โ€‹(w)}\mathcal{N}\leftarrow\{v|v\in{\mathsf{nbr}}(w,G^{\prime})\wedge\pi(v)>\pi(w)\};
14ย ย ย ย ย  forย each pair of u,vโˆˆ๐’ฉu,v\in\mathcal{N}ย do
15ย ย ย ย ย ย ย ย ย ย  ifย ฯ•โ€‹((w,v),Gโ€ฒ)+ฯ•โ€‹((v,u),Gโ€ฒ)<ฯ•โ€‹((w,u),Gโ€ฒ)\phi((w,v),G^{\prime})+\phi((v,u),G^{\prime})<\phi((w,u),G^{\prime})ย then
16ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย  ฯ•โ€‹((w,u),Gโ€ฒ)โ†ฯ•โ€‹((w,v),Gโ€ฒ)+ฯ•โ€‹((v,u),Gโ€ฒ)\phi((w,u),G^{\prime})\leftarrow\phi((w,v),G^{\prime})+\phi((v,u),G^{\prime});
17ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย  mark (w,u)(w,u) as removed;
18ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย 
19remove all the marked edges in Gโ€ฒG^{\prime};
20 forย each vโˆˆVโ€‹(Gโ€ฒ)v\in V(G^{\prime})ย do
21ย ย ย ย ย  ๐–ก๐–ญ๐–ฒโ€‹(v)โ†๐—‡๐–ป๐—‹โ€‹(v,Gโ€ฒ){\mathsf{BNS}}(v)\leftarrow{\mathsf{nbr}}(v,G^{\prime});
22ย ย ย ย ย 
Algorithmย 1 ๐–ฒ๐–ฃโ€‹-โ€‹๐–ฆ๐—‹๐–บ๐—‰๐—โ€‹-โ€‹๐–ฆ๐–พ๐—‡โ€‹(G,ฯ€){\mathsf{SD}}\textrm{-}{\mathsf{Graph}}\textrm{-}{\mathsf{Gen}}(G,\pi)

Based on Definitionย 5.3, we propose Algorithmย 1 to compute the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} of an input road network and obtain the bridge neighbor set for each vertex accordingly. Intuitively, a ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} of GG with larger bridge neighbor set for each vertex has more potential possibility to share the computation following the analysis of Sectionย 5.1. Meanwhile, the construction of ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} should not be costly. Following this idea, for a given road network GG and a total vertex order ฯ€\pi (the order used in our paper is discussed at the end of this section), our algorithm (Algorithmย 1) contains two steps to construct ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph}: (1) Edge insertion, it aims to add edges to connect vertices to enlarge the bridge neighbor set. (2) Edge deletion, it deletes edges to guarantee that the bridge neighbor set is enlarged correctly. Specifically,

โˆ™\bullet Step 1. Edge Insertion: Given a graph GG and a rank over all vertices in GG, it initializes Gโ€ฒG^{\prime} as GG, and iterates every vertex in the increasing order of ฯ€โ€‹(w)\pi(w) (line 1-2). For every pair of vertices u,vu,v among the neighbors of ww in Gโ€ฒG^{\prime} with higher ranks than ww, if (u,v)โˆ‰Eโ€‹(Gโ€ฒ)(u,v)\notin E(G^{\prime}), a new edge (u,v)(u,v) with weight ฯ•โ€‹((u,v),Gโ€ฒ)=ฯ•โ€‹((u,w),Gโ€ฒ)+ฯ•โ€‹((v,w),Gโ€ฒ)\phi((u,v),G^{\prime})=\phi((u,w),G^{\prime})+\phi((v,w),G^{\prime}) is inserted into Gโ€ฒG^{\prime} (line 5-7). Otherwise, if ฯ•โ€‹((u,w),Gโ€ฒ)+ฯ•โ€‹((w,v),Gโ€ฒ)<ฯ•โ€‹((u,v),Gโ€ฒ)\phi((u,w),G^{\prime})+\phi((w,v),G^{\prime})<\phi((u,v),G^{\prime}), it updates ฯ•โ€‹((u,v),Gโ€ฒ)\phi((u,v),G^{\prime}) as ฯ•โ€‹((u,w),Gโ€ฒ)\phi((u,w),G^{\prime}) +ฯ•โ€‹((w,v),Gโ€ฒ)+\phi((w,v),G^{\prime}) (line 8-9).

โˆ™\bullet Step 2. Edge Deletion: After the edge insertion step, it further iterates the vertex in the decreasing order of ฯ€โ€‹(w)\pi(w) (line 10). For every pair of vertices u,vu,v among the neighbors of ww in Gโ€ฒG^{\prime} with higher ranks than ww (line 11-12), if ฯ•โ€‹((w,v),Gโ€ฒ)+ฯ•โ€‹((v,u),Gโ€ฒ)<ฯ•โ€‹((w,u),Gโ€ฒ)\phi((w,v),G^{\prime})+\phi((v,u),G^{\prime})<\phi((w,u),G^{\prime}), it updates ฯ•โ€‹((w,u),Gโ€ฒ)\phi((w,u),G^{\prime}) as ฯ•โ€‹((w,v),Gโ€ฒ)+ฯ•โ€‹((v,u),Gโ€ฒ)\phi((w,v),G^{\prime})+\phi((v,u),G^{\prime}) and marks the updated edge as removed (line 13-15). At last, the marked edges in Gโ€ฒG^{\prime} are removed (line 16), and ๐–ก๐–ญ๐–ฒโ€‹(w){\mathsf{BNS}}(w) for each vertex is set as ๐—‡๐–ป๐—‹โ€‹(w,Gโ€ฒ){\mathsf{nbr}}(w,G^{\prime}) (line 17-18).

Refer to caption
Figure 4. ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of GG
Example 5.4.

Consider the road network GG in Figureย 2 and assume the vertex order ฯ€=(v1,v2,โ€ฆ,v20)\pi=(v_{1},v_{2},...,v_{20}), the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of GG is shown in Figureย 4. To construct Gโ€ฒG^{\prime}, we first conduct the edge insertion step. For v1v_{1}, its ๐’ฉ\mathcal{N} is {v2,v6}\{v_{2},v_{6}\}. There exists no edge (v2,v6)(v_{2},v_{6}) in Gโ€ฒG^{\prime} currently, then (v2,v6)(v_{2},v_{6}) with ฯ•โ€‹((v2,v6),Gโ€ฒ)=3\phi((v_{2},v_{6}),G^{\prime})=3 is added into Gโ€ฒG^{\prime}. The procedure continues until all vertices are processed. In the edge deletion step, vertices are processed in the reverse order of ฯ€\pi. Take v7v_{7} as an example. When processing v7v_{7}, its ๐’ฉ\mathcal{N} is {v8,v20}\{v_{8},v_{20}\}. Since ฯ•โ€‹((v7,v20),Gโ€ฒ)+ฯ•โ€‹((v20,v8),Gโ€ฒ)=2+2<ฯ•โ€‹((v7,v8),Gโ€ฒ)=5\phi((v_{7},v_{20}),G^{\prime})+\phi((v_{20},v_{8}),G^{\prime})=2+2<\phi((v_{7},v_{8}),G^{\prime})=5, (v7,v8)(v_{7},v_{8}) is marked. When all the vertices are processed, the marked edges are removed, and Figureย 4 shows the final Gโ€ฒG^{\prime}.

Based on the procedure of Algorithmย 1, we have:

Lemma 5.5.

The graph Gโ€ฒG^{\prime} generated at the end of Algorithmย 1 is a ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} of GG.

Proof: Based on Algorithmย 1, it is direct that Vโ€‹(Gโ€ฒ)=Vโ€‹(G)V(G^{\prime})=V(G), meeting the condition (1) of ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph}. For any two vertices u,vโˆˆGu,v\in G, ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)=โˆ‘i=1nฯ•โ€‹((viโˆ’1,vi),G){\mathsf{dist}}((u,v),G)=\sum_{i=1}^{n}\phi((v_{i-1},v_{i}),G), where ฯ•โ€‹((viโˆ’1,vi),G)=๐–ฝ๐—‚๐—Œ๐—โ€‹((viโˆ’1,vi),G)\phi((v_{i-1},v_{i}),G)={\mathsf{dist}}((v_{i-1},v_{i}),G). Clearly, Gโ€ฒG^{\prime} retains all edges with ฯ•โ€‹((u,v),G)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)\phi((u,v),G)={\mathsf{dist}}((u,v),G) in GG and includes all inserted edges with ฯ•โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)\phi((u,v),G^{\prime})={\mathsf{dist}}((u,v),G) in Gโ€ฒG^{\prime}. Therefore, for any two vertices u,vโˆˆGโ€ฒu,v\in G^{\prime}, ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G^{\prime})={\mathsf{dist}}((u,v),G), satisfying the condition (3) of ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph}. Next, we prove that Gโ€ฒG^{\prime} satisfies condition (2) via induction. Obviously, for vnโˆ’1v_{n-1}, ฯ•โ€‹((vnโˆ’1,vn),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((vnโˆ’1,vn),G)\phi((v_{n-1},v_{n}),G^{\prime})={\mathsf{dist}}((v_{n-1},v_{n}),G). Assume that for vnโˆ’kv_{n-k}, ฯ•โ€‹((vnโˆ’k,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((vnโˆ’k,v),G)\phi((v_{n-k},v),G^{\prime})={\mathsf{dist}}((v_{n-k},v),G) for โˆ€vโˆˆ๐’ฉโ€‹(vnโˆ’k)\forall v\in\mathcal{N}(v_{n-k}), ๐’ฉโ€‹(vnโˆ’k)={v|vโˆˆ๐—‡๐–ป๐—‹โ€‹(vnโˆ’k,Gโ€ฒ)โˆงฯ€โ€‹(v)>ฯ€โ€‹(vnโˆ’kโˆ’1)}\mathcal{N}(v_{n-k})=\{v|v\in{\mathsf{nbr}}(v_{n-k},G^{\prime})\wedge\pi(v)>\pi(v_{n-k-1})\}. Now, we prove it for vnโˆ’kโˆ’1v_{n-k-1}. Suppose vnโˆ’kv_{n-k} connects to vnโˆ’kโˆ’1v_{n-k-1}. From the insertion step, vnโˆ’kโˆ’1v_{n-k-1} and ๐’ฉโ€‹(vnโˆ’kโˆ’1)\mathcal{N}(v_{n-k-1}) form a clique. Thus, the shortest path from vnโˆ’kโˆ’1v_{n-k-1} to vv either only contains vnโˆ’kโˆ’1v_{n-k-1} and vv, or passes a vertex in ๐’ฉโ€‹(vnโˆ’kโˆ’1)โˆ–{vnโˆ’kโˆ’1,v}\mathcal{N}(v_{n-k-1})\setminus\{v_{n-k-1},v\}, i.e, ๐–ฝ๐—‚๐—Œ๐—โ€‹((vnโˆ’kโˆ’1,v),Gโ€ฒ)=ฯ•โ€‹((vnโˆ’kโˆ’1,v),Gโ€ฒ){\mathsf{dist}}((v_{n-k-1},v),G^{\prime})=\phi((v_{n-k-1},v),G^{\prime}) or ๐–ฝ๐—‚๐—Œ๐—โ€‹((vnโˆ’kโˆ’1,v),Gโ€ฒ)=minuโˆˆ๐’ฉโก(ฯ•โ€‹((vnโˆ’kโˆ’1,u),Gโ€ฒ)+๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ)){\mathsf{dist}}((v_{n-k-1},v),G^{\prime})=\min_{u\in\mathcal{N}}(\phi((v_{n-k-1},u),G^{\prime})+{\mathsf{dist}}((u,v),G^{\prime})). Since the ๐’ฉโ€‹(vnโˆ’kโˆ’1)\mathcal{N}(v_{n-k-1}) includes vkv_{k} and ๐’ฉโ€‹(vk)\mathcal{N}(v_{k}), we have ฯ•โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ)\phi((u,v),G^{\prime})={\mathsf{dist}}((u,v),G^{\prime}) for any two vertices u,vโˆˆ๐’ฉโ€‹(vnโˆ’kโˆ’1)u,v\in\mathcal{N}(v_{n-k-1}). Thus, ๐–ฝ๐—‚๐—Œ๐—โ€‹((vnโˆ’kโˆ’1,v),Gโ€ฒ)=minuโˆˆ๐’ฉโ€‹(vnโˆ’kโˆ’1)โก(ฯ•โ€‹((vnโˆ’kโˆ’1,u),Gโ€ฒ)+ฯ•โ€‹((u,v),Gโ€ฒ)){\mathsf{dist}}((v_{n-k-1},v),G^{\prime})=\min_{u\in\mathcal{N}(v_{n-k-1})}(\phi((v_{n-k-1},u),G^{\prime})+\phi((u,v),G^{\prime})). As line 10-15 of Algorithmย 1 can guarantee that ฯ•โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ)\phi((u,v),G^{\prime})={\mathsf{dist}}((u,v),G^{\prime}), and Gโ€ฒG^{\prime} satisfies condition (3) of ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph}, i.e. ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G^{\prime})={\mathsf{dist}}((u,v),G), we have ฯ•โ€‹((u,v),Gโ€ฒ)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)\phi((u,v),G^{\prime})={\mathsf{dist}}((u,v),G). Therefore, Gโ€ฒG^{\prime} is a ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} of GG. โ–ก\Box

Following Lemmaย 5.5, it is clear that for each vertex vโˆˆVโ€‹(G)v\in V(G), its kkNN in GG is the same as that in Gโ€ฒG^{\prime} based on the condition (3) of Definitionย 5.3. Moreover, ๐—‡๐–ป๐—‹โ€‹(w,Gโ€ฒ){\mathsf{nbr}}(w,G^{\prime}) is the bridge neighbor set of ww in Gโ€ฒG^{\prime} based on the condition (2) of Definitionย 5.3. The following problem is how to compute ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) for each vertex uu via Gโ€ฒG^{\prime} and ๐–ก๐–ญ๐–ฒโ€‹(u){\mathsf{BNS}}(u). According to the discussion in Sectionย 5.1, to fully utilize the intermediate computed results during the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} construction, we define a special type of path based on the given total vertex order as follows:

Definition 5.6.

(Monotonic Rank Path) Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uโˆˆVโ€‹(Gโ€ฒ)u\in V(G^{\prime}), a path pโ€‹(u,v)=(u=v1,v2,โ€ฆ,vj=v)p(u,v)=(u=v_{1},v_{2},\dots,v_{j}=v) in Gโ€ฒG^{\prime} is a decreasing rank path of uu if ฯ€โ€‹(vj)<ฯ€โ€‹(vjโˆ’1)<โ‹ฏ<ฯ€โ€‹(v1)\pi(v_{j})<\pi(v_{j-1})<\dots<\pi(v_{1}), and it is an increasing rank path of uu if ฯ€โ€‹(vj)>ฯ€โ€‹(vjโˆ’1)>โ‹ฏ>ฯ€โ€‹(v1)\pi(v_{j})>\pi(v_{j-1})>\dots>\pi(v_{1}).

Definition 5.7.

(Monotonic Rank Path Subgraph) Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uu, the decreasing rank path subgraph of uu, denoted by ๐–ฆโ€ฒโฃ<โ€‹(u){\mathsf{G^{\prime<}}}(u), is the subgraph induced by all decreasing rank paths of uu in Gโ€ฒG^{\prime}. The increasing rank path subgraph, denoted by ๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u), is the subgraph induced by all increasing rank paths of uu in Gโ€ฒG^{\prime}.

Example 5.8.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} in Figureย 4, for vertex v1v_{1}, increasing rank paths of v1v_{1} contain pโ€‹(v1,v2)=(v1,v2),pโ€‹(v1,v6)=(v1,v6),pโ€‹(v1,v8)=(v1,v2,v8)p(v_{1},v_{2})=(v_{1},v_{2}),p(v_{1},v_{6})=(v_{1},v_{6}),p(v_{1},v_{8})=(v_{1},v_{2},v_{8}) or (v1,v6,v8),pโ€‹(v1,v7)=(v1,v6,v7)(v_{1},v_{6},v_{8}),p(v_{1},v_{7})=(v_{1},v_{6},v_{7}) or (v1,v2,v6,v7),pโ€‹(v1,v20)=(v1,v2,v8,v20)(v_{1},v_{2},v_{6},v_{7}),p(v_{1},v_{20})=(v_{1},v_{2},v_{8},v_{20}), (v1,v6,v8,v20)(v_{1},v_{6},v_{8},v_{20}), (v1,v2,v7,v20)(v_{1},v_{2},\\ v_{7},v_{20}), (v1,v2,v6,v7,v20)(v_{1},v_{2},v_{6},v_{7},v_{20}), (v1,v2,v6,v8,v20)(v_{1},v_{2},v_{6},v_{8},v_{20}). The increasing rank path subgraph of v1v_{1}, i.e., ๐–ฆโ€ฒโฃ>โ€‹(v1){\mathsf{G^{\prime>}}}(v_{1}), is the subgraph induced by these paths, which is shown in pink in Figureย 4. The decreasing rank path subgraph of v17v_{17}, i.e., ๐–ฆโ€ฒโฃ<โ€‹(v17){\mathsf{G^{\prime<}}}(v_{17}), can be obtained similarly, which is shown in green in Figureย 4.

Definition 5.9.

(Decreasing Rank Partial kkNN) Given a vertex uโˆˆVโ€‹(G)u\in V(G) and a set of candidate objects โ„ณ\mathcal{M}, the decreasing rank partial kkNN of uu, denoted by ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u), is the kkNN of uu in ๐–ฆโ€ฒโฃ<โ€‹(u){\mathsf{G^{\prime<}}}(u).

Lemma 5.10.

Given a vertex uโˆˆVโ€‹(G)u\in V(G) in a road network GG, ๐–ต๐—„โ€‹(u)โІโˆชwโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))๐–ต๐—„<โ€‹(w){\mathsf{V_{k}}}(u)\subseteq\cup_{w\in V({\mathsf{G^{\prime>}}}(u))}{\mathsf{V_{k}^{<}}}(w).

Proof: This lemma can be proved directly based on Propertyย 1. โ–ก\Box

Therefore, if we can obtain ๐–ต๐—„<โ€‹(w){\mathsf{V_{k}^{<}}}(w) for each vertex, we can obtain ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) following Lemmaย 5.10. Moreover, we have:

Lemma 5.11.

Given a road network of GG and a set of candidate objects โ„ณ\mathcal{M}, let u1u_{1} be the vertex with the lowest rank, we have ๐–ต๐—„<โ€‹(u1)={โ„ณโˆฉ{u1}}{\mathsf{V_{k}^{<}}}(u_{1})=\{\mathcal{M}\cap\{u_{1}\}\}.

Proof: From Definitionย 5.7, Vโ€‹(๐–ฆโ€ฒโฃ<โ€‹(u1))={u1}V({\mathsf{G^{\prime<}}}(u_{1}))=\{u_{1}\}. Based on Definitionย 5.9, ๐–ต๐—„<โ€‹(u1)={โ„ณโˆฉVโ€‹(๐–ฆโ€ฒโฃ<โ€‹(u1))}={โ„ณโˆฉ{u1}}{\mathsf{V_{k}^{<}}}(u_{1})=\{\mathcal{M}\cap V({\mathsf{G^{\prime<}}}(u_{1}))\}=\{\mathcal{M}\cap\{u_{1}\}\}. โ–ก\Box

Based on Lemmaย 5.11, the decreasing rank partial kkNN for the vertex with the lowest rank can be computed directly. Regarding the remaining vertices, we further divide ๐–ก๐–ญ๐–ฒโ€‹(u){\mathsf{BNS}}(u) into two parts: ๐–ก๐–ญ๐–ฒ<โ€‹(u){\mathsf{BNS^{<}}}(u) which contains the neighbors of uu in Gโ€ฒG^{\prime} with lower rank than uu, i.e., ๐–ก๐–ญ๐–ฒ<โ€‹(u)={v|vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)โˆงฯ€โ€‹(v)<ฯ€โ€‹(u)}{\mathsf{BNS^{<}}}(u)=\{v|v\in{\mathsf{BNS}}(u)\wedge\pi(v)<\pi(u)\} and ๐–ก๐–ญ๐–ฒ>โ€‹(u){\mathsf{BNS^{>}}}(u) which contains the neighbors of uu in Gโ€ฒG^{\prime} with higher rank than uu, i.e., ๐–ก๐–ญ๐–ฒ>โ€‹(u)={v|vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)โˆงฯ€โ€‹(v)>ฯ€โ€‹(u)}{\mathsf{BNS^{>}}}(u)=\{v|v\in{\mathsf{BNS}}(u)\wedge\pi(v)>\pi(u)\}. We have:

Lemma 5.12.

Given a vertex uโˆˆVโ€‹(G)u\in V(G) in a road network GG, ๐–ต๐—„<โ€‹(u)โІ{โ„ณโˆฉ{u}}โˆชvโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)๐–ต๐—„<โ€‹(v){\mathsf{V_{k}^{<}}}(u)\subseteq\{\mathcal{M}\cap\{u\}\}\cup_{v\in{\mathsf{BNS^{<}}}(u)}{\mathsf{V_{k}^{<}}}(v).

Proof: This lemma can be proved directly based on Propertyย 1 and Definitionย 5.9. โ–ก\Box

Lemmaย 5.12 indicates the scope of ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) for each vertex. To obtain ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u), we only need to compute the distance between uu and wโˆˆ{โ„ณโˆฉ{u}}โˆชvโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)๐–ต๐—„<โ€‹(v)w\in\{\mathcal{M}\cap\{u\}\}\cup_{v\in{\mathsf{BNS^{<}}}(u)}{\mathsf{V_{k}^{<}}}(v), and retrieve the top kk objects. To avoid the expensive ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€ฒโ€‹๐—Œ\mathsf{Dijkstra^{\prime}s} algorithm, we define:

Definition 5.13.

(Decreasing Rank Shortest Path) Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for two vertices u,vโˆˆVโ€‹(Gโ€ฒ)u,v\in V(G^{\prime}), the decreasing rank shortest path between uu and vv is the rank decreasing path from uu to vv with the smallest length in Gโ€ฒG^{\prime}.

In ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of GG, for any two vertices u,vโˆˆGโ€ฒu,v\in G^{\prime}, one shortest path between uu and vv is a decreasing rank shortest path. We call the length of decreasing rank shortest path between uu and vv as decreasing rank distance and denote it as ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v){\mathsf{dist_{<}}}(u,v). We have:

Lemma 5.14.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uโˆˆVโ€‹(Gโ€ฒ)u\in V(G^{\prime}), ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v)=๐—†๐—‚๐—‡wโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)โ€‹{ฯ•โ€‹((u,w),Gโ€ฒ)+๐–ฝ๐—‚๐—Œ๐—<โ€‹(w,v)}{\mathsf{dist_{<}}}(u,v)={\mathsf{min}}_{w\in{\mathsf{BNS^{<}}}(u)}\{\phi((u,w),G^{\prime})+{\mathsf{dist_{<}}}(w,v)\}, where vโˆˆ๐–ต๐—„<โ€‹(u)v\in{\mathsf{V_{k}^{<}}}(u).

Proof: Based on Definitionย 5.7 and Definitionย 5.9, we have ๐–ต๐—„<โ€‹(u)โІ{โ„ณโˆฉVโ€‹(๐–ฆโ€ฒโฃ<โ€‹(u))}{\mathsf{V_{k}^{<}}}(u)\subseteq\{\mathcal{M}\cap V({\mathsf{G^{\prime<}}}(u))\}. According to Definitionย 5.13, for โˆ€vโˆˆ๐–ฆโ€ฒโฃ<โ€‹(u)\forall v\in{\mathsf{G^{\prime<}}}(u), there is one decreasing rank shortest path between uu and vv, which passes through one vertex wโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)w\in{\mathsf{BNS^{<}}}(u). Therefore, ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v)=mโ€‹iโ€‹nwโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)โ€‹{ฯ•โ€‹((u,w),Gโ€ฒ)+๐–ฝ๐—‚๐—Œ๐—<โ€‹(w,v)}{\mathsf{dist_{<}}}(u,v)=min_{w\in{\mathsf{BNS^{<}}}(u)}\{\phi((u,w),G^{\prime})+{\mathsf{dist_{<}}}(w,v)\}. โ–ก\Box

Lemma 5.15.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uu, let vโˆˆ๐–ต๐—„<โ€‹(u)โˆฉ๐–ต๐—„โ€‹(u)v\in{\mathsf{V_{k}^{<}}}(u)\cap{\mathsf{V_{k}}}(u), if ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ){\mathsf{dist_{<}}}(u,v)={\mathsf{dist}}((u,v),G^{\prime}), there is a shortest path between uu and vv in Gโ€ฒG^{\prime}, which is also a decreasing rank shortest path.

Proof: According to Definitionย 5.7 and Definitionย 5.9, if vโˆˆ๐–ต๐—„<โ€‹(u)v\in{\mathsf{V_{k}^{<}}}(u), we know vโˆˆVโ€‹(๐–ฆโ€ฒโฃ<โ€‹(u))v\in V({\mathsf{G^{\prime<}}}(u)). Based on Definitionย 5.13, there is one decreasing rank shortest path between uu and vv. When ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v)=๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),Gโ€ฒ){\mathsf{dist_{<}}}(u,v)\\ ={\mathsf{dist}}((u,v),G^{\prime}), there is a shortest path between uu and vv in Gโ€ฒG^{\prime}, which is also a decreasing shortest path. โ–ก\Box

Based on Lemmaย 5.11, Lemmaย 5.12, and Lemmaย 5.14, to obtain ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) for each vertex, we can adopt a bottom-up strategy based on the increasing order of ฯ€โ€‹(u)\pi(u), and the computed distance for a lower rank vertex can be re-used to compute the distance for a higher rank vertex. However, ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) only contains the vertices vโˆˆ๐–ต๐—„โ€‹(u)v\in{\mathsf{V_{k}}}(u) whose shortest paths to uu pass through ๐–ก๐–ญ๐–ฒ<โ€‹(u){\mathsf{BNS^{<}}}(u), the vertices vโˆˆ๐–ต๐—„โ€‹(u)v\in{\mathsf{V_{k}}}(u) whose shortest paths to uu pass through ๐–ก๐–ญ๐–ฒ>โ€‹(u){\mathsf{BNS^{>}}}(u) does not considered. Unfortunately, these vertices cannot be obtained by only exploring the vertices in โˆชvโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)๐–ต๐—„<โ€‹(v)\cup_{v\in{\mathsf{BNS^{>}}}(u)}{\mathsf{V_{k}^{<}}}(v) in the similar way as discussed above since this approach only explores the vertices whose ranks are not higher than ๐—†๐–บ๐—‘vโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)โ€‹ฯ€โ€‹(v){\mathsf{max}}_{v\in{\mathsf{BNS^{>}}}(u)}\pi(v). On the other hand, we have the following lemmas regarding the distance between uu and vโˆˆ๐–ต๐—„โ€‹(u)v\in{\mathsf{V_{k}}}(u) based on Lemmaย 5.10:

Lemma 5.16.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uโˆˆVโ€‹(G)u\in V(G), let v,vโ€ฒโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))v,v^{\prime}\in V({\mathsf{G^{\prime>}}}(u)), ๐–ฝ๐—‚๐—Œ๐—โ€‹((v,vโ€ฒ),๐–ฆโ€ฒโฃ>โ€‹(u))=๐–ฝ๐—‚๐—Œ๐—โ€‹((v,vโ€ฒ),G){\mathsf{dist}}((v,v^{\prime}),{\mathsf{G^{\prime>}}}(u))={\mathsf{dist}}((v,v^{\prime}),G).

Proof: This lemma can be proved by Lemmaย 5.5. โ–ก\Box

Lemma 5.17.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uโˆˆVโ€‹(G)u\in V(G), ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)=๐—†๐—‚๐—‡wโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))โ€‹{๐–ฝ๐—‚๐—Œ๐—โ€‹((u,w),๐–ฆโ€ฒโฃ>โ€‹(u))+๐–ฝ๐—‚๐—Œ๐—<โ€‹(w,v)}{\mathsf{dist}}((u,v),G)={\mathsf{min}}_{w\in V({\mathsf{G^{\prime>}}}(u))}\{{\mathsf{dist}}((u,w),{\mathsf{G^{\prime>}}}\\ (u))+{\mathsf{dist_{<}}}(w,v)\}, where vโˆˆ๐–ต๐—„โ€‹(u)v\in{\mathsf{V_{k}}}(u).

Proof: This lemma can be proved directly based on Lemmaย 5.10 and Lemmaย 5.16. โ–ก\Box

1 Gโ€ฒโ†๐–ฒ๐–ฃโ€‹-โ€‹๐–ฆ๐—‹๐–บ๐—‰๐—โ€‹-โ€‹๐–ฆ๐–พ๐—‡โ€‹(G,ฯ€)G^{\prime}\leftarrow{\mathsf{SD}}\textrm{-}{\mathsf{Graph}}\textrm{-}{\mathsf{Gen}}(G,\pi);
2 ๐’ฎโ†โˆ…\mathcal{S}\leftarrow\emptyset, ๐–ต๐—„<โ€‹(โ‹…)โ†โˆ…{\mathsf{V_{k}^{<}}}(\cdot)\leftarrow\emptyset, ๐–ต๐—„โ€‹(โ‹…)โ†โˆ…{\mathsf{V_{k}}}(\cdot)\leftarrow\emptyset;
3 forย each uu in increasing order of ฯ€โ€‹(u)\pi(u)ย do
4ย ย ย ย ย  ๐’ฎโ†{โ„ณโˆฉ{u}}โˆชwโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)๐–ต๐—„<โ€‹(w)\mathcal{S}\leftarrow\{\mathcal{M}\cap\{u\}\}\cup_{w\in{\mathsf{BNS^{<}}}(u)}{\mathsf{V_{k}^{<}}}(w);
5ย ย ย ย ย  forย each vโˆˆ๐’ฎv\in\mathcal{S}ย do
6ย ย ย ย ย ย ย ย ย ย  ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v)โ†minwโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)โก{ฯ•โ€‹((u,w),Gโ€ฒ)+๐–ฝ๐—‚๐—Œ๐—<โ€‹(w,v)}{\mathsf{dist_{<}}}(u,v)\leftarrow\min_{w\in{\mathsf{BNS^{<}}}(u)}\{\phi((u,w),G^{\prime})+{\mathsf{dist_{<}}}(w,v)\};
7ย ย ย ย ย ๐–ต๐—„<โ€‹(u)โ†{\mathsf{V_{k}^{<}}}(u)\leftarrow kk vertices in ๐’ฎ\mathcal{S} with the smallest ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v){\mathsf{dist_{<}}}(u,v);
8ย ย ย ย ย 
9forย each uu in increasing order of ฯ€โ€‹(u)\pi(u)ย do
10ย ย ย ย ย  construct ๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u) by conducting ๐–ก๐–ฅ๐–ฒ\mathsf{BFS} search from uu on Gโ€ฒG^{\prime} following edge (v,vโ€ฒ)(v,v^{\prime}) with ฯ€โ€‹(v)<ฯ€โ€‹(vโ€ฒ)\pi(v)<\pi(v^{\prime});
11ย ย ย ย ย  forย each wโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))w\in V({\mathsf{G^{\prime>}}}(u))ย do
12ย ย ย ย ย ย ย ย ย ย  compute ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,w),๐–ฆโ€ฒโฃ>โ€‹(u)){\mathsf{dist}}((u,w),{\mathsf{G^{\prime>}}}(u));
13ย ย ย ย ย ย ย ย ย ย 
14ย ย ย ย ย ๐’ฎโ†๐–ต๐—„<โ€‹(u)โˆชwโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))๐–ต๐—„<โ€‹(w)\mathcal{S}\leftarrow{\mathsf{V_{k}^{<}}}(u)\cup_{w\in V({\mathsf{G^{\prime>}}}(u))}{\mathsf{V_{k}^{<}}}(w);
15ย ย ย ย ย  forย each vโˆˆ๐’ฎv\in\mathcal{S}ย do
16ย ย ย ย ย ย ย ย ย ย  ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)โ†minwโˆˆ๐–ฆโ€ฒโฃ>โ€‹(u)โก๐–ฝ๐—‚๐—Œ๐—โ€‹((u,w),๐–ฆโ€ฒโฃ>โ€‹(u)){\mathsf{dist}}((u,v),G)\leftarrow\min_{w\in{\mathsf{G^{\prime>}}}(u)}{\mathsf{dist}}((u,w),{\mathsf{G^{\prime>}}}(u))+๐–ฝ๐—‚๐—Œ๐—<โ€‹(w,v){\mathsf{dist_{<}}}(w,v);
17ย ย ย ย ย ๐–ต๐—„โ€‹(u)โ†{\mathsf{V_{k}}}(u)\leftarrow kk vertices in ๐’ฎ\mathcal{S} with the smallest ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G);
18ย ย ย ย ย 
Algorithmย 2 ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œโ€‹(G,ฯ€,โ„ณ){\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}(G,\pi,\mathcal{M})
Refer to caption
(a) ๐’ฎ\mathcal{S} for ๐–ต5<โ€‹(v17){\mathsf{V}}^{<}_{5}(v_{17})
Refer to caption
(b) Computation of ๐–ต5<โ€‹(v17){\mathsf{V}}^{<}_{5}(v_{17})
Refer to caption
(c) ๐’ฎ\mathcal{S} for ๐–ต5โ€‹(v17){\mathsf{V}}_{5}(v_{17})
Refer to caption
(d) Computation of ๐–ต5โ€‹(v17){\mathsf{V}}_{5}(v_{17})
Figure 5. Procedure of Algorithmย 2 to Compute ๐–ต5โ€‹(v17){\mathsf{V}}_{5}(v_{17})

Algorithm. By combing the above two cases together, our index construction algorithm is shown in Algorithmย 2. It first generates the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} using Algorithmย 1 (line 1). Then, it adopts a bottom-up strategy to compute ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) in the increasing order of ฯ€โ€‹(u)\pi(u) (line 2-7). Specifically, for each vertex uu, it retrieves {โ„ณโˆฉ{u}}โˆชvโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(u)๐–ต๐—„<โ€‹(v)\{\mathcal{M}\cap\{u\}\}\cup_{v\in{\mathsf{BNS^{<}}}(u)}{\mathsf{V_{k}^{<}}}(v) based on Lemmaย 5.12 (line 4) and computes ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v){\mathsf{dist_{<}}}(u,v) based on Lemmaย 5.14 (line 5-6). Then, ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) is the kk vertices in ๐’ฎ\mathcal{S} with the smallest ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v){\mathsf{dist_{<}}}(u,v) (line 7). After that, it constructs ๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u) by conducting ๐–ก๐–ฅ๐–ฒ\mathsf{BFS} search from uu on Gโ€ฒG^{\prime} (line 9). And we compute the single source shortest distance ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,w),๐–ฆโ€ฒโฃ>โ€‹(u)){\mathsf{dist}}((u,w),{\mathsf{G^{\prime>}}}(u)) from uu to each vertex ww in ๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u) using the Dijkstraโ€™s Algorithm (line 10-11). Then, following Lemmaย 5.10, it retrieves โˆชwโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))๐–ต๐—„<โ€‹(w)\cup_{w\in V({\mathsf{G^{\prime>}}}(u))}{\mathsf{V_{k}^{<}}}(w) (line 12) and computes ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G) based on Lemmaย 5.17 (line 13-14). ๐–ฝ๐—‚๐—Œ๐—<โ€‹(w,v){\mathsf{dist_{<}}}(w,v) can be obtained from ๐–ต๐—„<โ€‹(w){\mathsf{V_{k}^{<}}}(w) directly. ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,w),๐–ฆโ€ฒโฃ>โ€‹(u)){\mathsf{dist}}((u,w),{\mathsf{G^{\prime>}}}(u)) can be computed (line 10-11) after the construction of ๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u) (line 99) following Lemmaย 5.16. At last, the kk vertices in ๐’ฎ\mathcal{S} with the smallest ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G) is returned as ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) in line 15.

Example 5.18.

Following the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} in Figureย 4, Figureย 5 takes v17v_{17} as an example to show the procedure of Algorithmย 2 to compute ๐–ต5โ€‹(v17){\mathsf{V}}_{5}(v_{17}). According to Algorithmย 2, we compute ๐–ต๐Ÿง<โ€‹(v17){\mathsf{V_{5}^{<}}}(v_{17}) first. Based on Gโ€ฒG^{\prime}, ๐–ก๐–ญ๐–ฒ<โ€‹(v17)={v5,v12,v15,v16}{\mathsf{BNS^{<}}}(v_{17})=\{v_{5},v_{12},v_{15},v_{16}\}, which is shown in green in Figureย 5 (a). Following Algorithmย 2, when computing ๐–ต๐Ÿง<โ€‹(v17){\mathsf{V_{5}^{<}}}(v_{17}), we already have ๐–ต๐Ÿง<โ€‹(v5),๐–ต๐Ÿง<โ€‹(v15),๐–ต๐Ÿง<โ€‹(v12){\mathsf{V_{5}^{<}}}(v_{5}),{\mathsf{V_{5}^{<}}}(v_{15}),{\mathsf{V_{5}^{<}}}(v_{12}) and ๐–ต๐Ÿง<โ€‹(v16){\mathsf{V_{5}^{<}}}(v_{16}), which is shown in Figureย 5 (b). Consequently, following line 6 of Algorithmย 2, we can achieve ๐’ฎ={โ„ณโˆฉ{v17}}โˆชwโˆˆ๐–ก๐–ญ๐–ฒ<โ€‹(v17)๐–ต๐Ÿง<โ€‹(w)={(v17,0),(v5,2),(v12,3),(v15,3),(v11,4),(v4,5),(v16,5),(v3,6),(v14,6),(v10,7)}\mathcal{S}=\{\mathcal{M}\cap\{v_{17}\}\}\cup_{w\in{\mathsf{BNS^{<}}}(v_{17})}{\mathsf{V_{5}^{<}}}(w)=\{(v_{17},0),(v_{5},2),(v_{12},3),(v_{15},3),(v_{11},4),(v_{4},5),(v_{16},5),(v_{3},6),(v_{14},\\ 6),(v_{10},7)\}. Figureย 5 (b) shows this set ๐’ฎ\mathcal{S} for constructing ๐–ต๐Ÿง<โ€‹(v17){\mathsf{V_{5}^{<}}}(v_{17}). After sorting distance, we have ๐–ต๐Ÿง<โ€‹(v17)={(v17,0),(v5,2),(v12,3),(v15,3),(v11,4)}{\mathsf{V_{5}^{<}}}(v_{17})=\{(v_{17},0),(v_{5},2),(v_{12},3),\\ (v_{15},3),(v_{11},4)\}.

Figureย 5 (c) shows the ๐–ฆโ€ฒโฃ>โ€‹(v17){\mathsf{G^{\prime>}}}(v_{17}) in purple with bold lines. Using ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€ฒโ€‹๐—Œ\mathsf{Dijkstra^{\prime}s} Algorithm, we compute the distance from v17v_{17} to each vertex in ๐–ฆโ€ฒโฃ>โ€‹(v17){\mathsf{G^{\prime>}}}(v_{17}). And ๐–ฝ๐—‚๐—Œ๐—โ€‹((v17,v18),๐–ฆโ€ฒโฃ>โ€‹(v17))=6,๐–ฝ๐—‚๐—Œ๐—โ€‹((v17,v19),๐–ฆโ€ฒโฃ>โ€‹(v17))=5,{\mathsf{dist}}((v_{17},v_{18}),{\mathsf{G^{\prime>}}}(v_{17}))=6,{\mathsf{dist}}((v_{17},v_{19}),\\ {\mathsf{G^{\prime>}}}(v_{17}))=5, and ๐–ฝ๐—‚๐—Œ๐—โ€‹((v17,v20),๐–ฆโ€ฒโฃ>โ€‹(v17))=8{\mathsf{dist}}((v_{17},v_{20}),{\mathsf{G^{\prime>}}}(v_{17}))=8. Following line 12 of Algorithmย 2, when computing ๐–ต๐Ÿงโ€‹(v17){\mathsf{V_{5}}}(v_{17}), we have ๐–ต๐Ÿง<โ€‹(v18),๐–ต๐Ÿง<โ€‹(v19){\mathsf{V_{5}^{<}}}(v_{18}),{\mathsf{V_{5}^{<}}}(v_{19}) and ๐–ต๐Ÿง<โ€‹(v20){\mathsf{V_{5}^{<}}}(v_{20}), which is shown in Figureย 5 (d). Following line 14 of Algorithmย 2, we achieve ๐’ฎ=๐–ต๐Ÿง<โ€‹(v17)โˆชwโˆˆ๐–ฆโ€ฒโฃ>โ€‹(v17)๐–ต๐Ÿง<โ€‹(w)={(v17,0),(v5,2),(v12,3),(v15,3),(v11,4),(v19,5),(v18,6),(v9,7),(v20,8),(v13,9),(v7,10),(v8,10)}\mathcal{S}={\mathsf{V_{5}^{<}}}(v_{17})\cup_{w\in{\mathsf{G^{\prime>}}}(v_{17})}{\mathsf{V_{5}^{<}}}(w)=\{(v_{17},0),(v_{5},2),(v_{12},3),(v_{15},3),(v_{11},4),(v_{19},5),(v_{18},6),(v_{9},7),(v_{20}\\ ,8),(v_{13},9),(v_{7},10),(v_{8},10)\}, which is shown in Figureย 5 (c). Then, we select 55 nearest objects from ๐’ฎ\mathcal{S} as the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} of v17v_{17}, namely, ๐–ต๐—„โ€‹(v17)={(v17,0),(v5,2),(v12,3),(v15,3),(v11,4)}{\mathsf{V_{k}}}(v_{17})=\{(v_{17},0),(v_{5},2),(v_{12},3),(v_{15},3),(v_{11},4)\} .

The correctness of Algorithmย 2 is straightforward following the above discussion. For the efficiency of the algorithm, we have:

Theorem 5.19.

The time complexity of Algorithmย 2 is bounded by Oโ€‹(nโ‹…(ฯ2+ฮทโ‹…ฯ„โ‹…lโ€‹oโ€‹gโ€‹(ฮท)+(ฯ„+ฮท)โ‹…k))O(n\cdot(\rho^{2}+\eta\cdot\tau\cdot log(\eta)+(\tau+\eta)\cdot k)), where ฯ\rho represents the maximum degree of vertices in the graph generated by Algorithmย 1 when Step 1 finishes, ฮท=๐—†๐–บ๐—‘vโˆˆVโ€‹(G)โ€‹|๐–ฆโ€ฒโฃ>โ€‹(v)|\eta={\mathsf{max}}_{v\in V(G)}|{\mathsf{G^{\prime>}}}(v)| and ฯ„=๐—†๐–บ๐—‘vโˆˆVโ€‹(G)โ€‹|๐–ก๐–ญ๐–ฒ>โ€‹(v)|\tau={\mathsf{max}}_{v\in V(G)}|{\mathsf{BNS^{>}}}(v)|.

Proof: Algorithmย 1 requires Oโ€‹(nโ‹…ฯ2)O(n\cdot\rho^{2}) time (line 1 of Algorithmย 2). Specifically, in the for loop (line 2-9 of Algorithmย 1), for each vertex ww, line 4-9 of Algorithmย 1 takes Oโ€‹(ฯ2)O(\rho^{2}) time and the for loop terminates at nn iterations. Therefore, the edge insertion step (line 2-9 of Algorithmย 1) requires Oโ€‹(nโ‹…ฯ2)O(n\cdot\rho^{2}) time. Similarly, the edge deletion step requires Oโ€‹(nโ‹…ฯ2)O(n\cdot\rho^{2}) (line 10-16 of Algorithmย 1). Scanning all vertices to achieve ๐–ก๐–ญ๐–ฒโ€‹(โ‹…){\mathsf{BNS}}(\cdot) is bounded by Oโ€‹(nโ‹…ฯ„)O(n\cdot\tau) (line 17-18 of Algorithmย 1). Obviously, for โˆ€uโˆˆVโ€‹(G)\forall u\in V(G), ฯ„โ‰คฯ\tau\leq\rho. Therefore, the time complexity of Algorithmย 1 is Oโ€‹(nโ‹…(ฯ2+ฯ„))=Oโ€‹(nโ‹…ฯ2)O(n\cdot(\rho^{2}+\tau))=O(n\cdot\rho^{2}). In the for loop from line 3 to line 7 of Algorithmย 2, line 4 of Algorithmย 2 takes Oโ€‹(ฯ„โ‹…k)O(\tau\cdot k), since each vertex uu is only explored by the vertex wโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)w\in{\mathsf{BNS^{>}}}(u). At the same time with obtaining ๐’ฎ\mathcal{S} (line 4 of Algorithmย 2), line 5-7 of Algorithmย 2 could be done. Therefore, ๐–ต๐—„<โ€‹(โ‹…){\mathsf{V_{k}^{<}}}(\cdot) construction (line 3-7 of Algorithmย 2) requires Oโ€‹(nโ‹…ฯ„โ‹…k)O(n\cdot\tau\cdot k) time. In the for loop (line 8-15 of Algorithmย 2), constructing ๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u) by conducting ๐–ก๐–ฅ๐–ฒ\mathsf{BFS} search requires Oโ€‹(ฮทโ‹…ฯ„)O(\eta\cdot\tau) time (line 9 of Algorithmย 2). Computing ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v){\mathsf{dist}}(u,v) for โˆ€vโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))\forall v\in V({\mathsf{G^{\prime>}}}(u)) via ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€ฒโ€‹๐—Œ\mathsf{Dijkstra^{\prime}s} algorithm (line 10-11 of Algorithmย 2) consumes Oโ€‹(ฮทโ‹…ฯ„โ‹…lโ€‹oโ€‹gโ€‹(ฮท))O(\eta\cdot\tau\cdot log(\eta)) time. Obtaining ๐’ฎ\mathcal{S} and distance computation require Oโ€‹(ฮทโ‹…k)O(\eta\cdot k) (line 12-15 of Algorithmย 2). Therefore, ๐–ต๐—„โ€‹(โ‹…){\mathsf{V_{k}}}(\cdot) construction (line 8-15 of Algorithmย 2) requires Oโ€‹(nโ‹…(ฮทโ‹…ฯ„โ‹…lโ€‹oโ€‹gโ€‹(ฮท)+(ฯ„+ฮท)โ‹…k))O(n\cdot(\eta\cdot\tau\cdot log(\eta)+(\tau+\eta)\cdot k)). In summary, the time complexity of Algorithmย 2 is Oโ€‹(nโ‹…(ฯ2+ฮทโ‹…ฯ„โ‹…lโ€‹oโ€‹gโ€‹(ฮท)+(ฯ„+ฮท)โ‹…k))O(n\cdot(\rho^{2}+\eta\cdot\tau\cdot log(\eta)+(\tau+\eta)\cdot k)). โ–ก\Box

Remark. Based on Theoremย 5.19, we prefer the generated ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} with smaller ฯ\rho and ฯ„\tau. Thus, we use the following heuristic total order ฯ€\pi in this paper: (1) The vertex with the minimum degree in GG has the lowest rank (the vertex with a smallest ๐—‚๐–ฝ\mathsf{id} has the lowest rank if more than one vertices have the minimum degree); (2) for two unprocessed vertices uu and vv in line 2 of Algorithmย 1, ฯ€โ€‹(u)>ฯ€โ€‹(v)\pi(u)>\pi(v) if the number of unprocessed neighbors of uu is bigger than that of vv in Gโ€ฒG^{\prime}. Note this order can be obtained incidentally in Algorithmย 1, and does not affect the time complexity of Algorithmย 1.

5.3. A Bidirectional Construction Algorithm

Algorithmย 2 adopts a bottom-up strategy to construct the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} with which the computation regarding ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) is well shared. However, it still needs to invoke ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€ฒโ€‹๐—Œ\mathsf{Dijkstra^{\prime}s} algorithm to compute the distance between uu and wโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))w\in V({\mathsf{G^{\prime>}}}(u)) in line 11-12, which is costly. To address this problem, we propose a new algorithm to further improve the index construction efficiency. Instead of following the sole bottom-up direction which adopted in Algorithmย 2, the new algorithm constructs the index in a bidirectional manner, which totally avoids the invocation of ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€ฒโ€‹๐—Œ\mathsf{Dijkstra^{\prime}s} algorithm. Before introducing our algorithm, we have:

Lemma 5.20.

Given a road network GG, let unu_{n} be the vertex with the highest rank, ๐–ต๐—„โ€‹(un)=๐–ต๐—„<โ€‹(un){\mathsf{V_{k}}}(u_{n})={\mathsf{V_{k}^{<}}}(u_{n}).

Proof: Following Definitionย 5.7, Vโ€‹(๐–ฆโ€ฒโฃ>โ€‹(un))={un}V({\mathsf{G^{\prime>}}}(u_{n}))=\{u_{n}\}. Based on Lemmaย 5.10, ๐–ต๐—„โ€‹(un)โІโˆชwโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(un))๐–ต๐—„<โ€‹(w)=๐–ต๐—„<โ€‹(un){\mathsf{V_{k}}}(u_{n})\subseteq\cup_{w\in V({\mathsf{G^{\prime>}}}(u_{n}))}{\mathsf{V_{k}^{<}}}(w)={\mathsf{V_{k}^{<}}}(u_{n}). โ–ก\Box

Lemma 5.21.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uโˆˆVโ€‹(G)u\in V(G), ๐–ต๐—„โ€‹(u)โІ๐–ต๐—„<โ€‹(u)โˆชwโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)๐–ต๐—„โ€‹(w){\mathsf{V_{k}}}(u)\subseteq{\mathsf{V_{k}^{<}}}(u)\cup_{w\in{\mathsf{BNS^{>}}}(u)}{\mathsf{V_{k}}}(w).

Proof: This lemma can be proved directly based on Propertyย 1. โ–ก\Box

Lemmaย 5.20 and Lemmaย 5.21 imply that if we process the vertices in the decreasing order of their ranks when computing ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u), it can re-use the computed information of vertices with higher ranks in the computation of the kkNN for vertices with lower ranks. Moreover, we have:

Lemma 5.22.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of a road network GG, for a vertex uโˆˆVโ€‹(G)u\in V(G), ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G)=๐—†๐—‚๐—‡โ€‹{๐—†๐—‚๐—‡wโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)โ€‹{ฯ•โ€‹((u,w),Gโ€ฒ)+๐–ฝ๐—‚๐—Œ๐—โ€‹((w,v),G)},๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v)}{\mathsf{dist}}((u,v),G)={\mathsf{min}}\{{\mathsf{min}}_{w\in{\mathsf{BNS^{>}}}(u)}\{\phi((u,w),G^{\prime})\\ +{\mathsf{dist}}((w,v),G)\},{\mathsf{dist_{<}}}(u,v)\}, where vโˆˆ๐–ต๐—„โ€‹(u)v\in{\mathsf{V_{k}}}(u).

Proof: For vโˆˆ๐–ต๐—„โ€‹(u)v\in{\mathsf{V_{k}}}(u), there are two parts. The one part contains all vv whose shortest paths to uu pass through ๐–ก๐–ญ๐–ฒ>โ€‹(u){\mathsf{BNS^{>}}}(u), this distance computation can be proved based on Propertyย 2. The other part contains all vv whose shortest paths to uu pass through ๐–ก๐–ญ๐–ฒ<โ€‹(u){\mathsf{BNS^{<}}}(u), ๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v){\mathsf{dist_{<}}}(u,v) can be directly obtained from ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) based on Lemmaย 5.15. โ–ก\Box

1 Gโ€ฒโ†๐–ฒ๐–ฃโ€‹-โ€‹๐–ฆ๐—‹๐–บ๐—‰๐—โ€‹-โ€‹๐–ฆ๐–พ๐—‡โ€‹(G,ฯ€)G^{\prime}\leftarrow{\mathsf{SD}}\textrm{-}{\mathsf{Graph}}\textrm{-}{\mathsf{Gen}}(G,\pi) ;
2 ๐–ต๐—„<โ€‹(โ‹…)โ†{\mathsf{V_{k}^{<}}}(\cdot)\leftarrow line 3-7 of Algorithmย 2;
3 ๐’ฎโ†โˆ…\mathcal{S}\leftarrow\emptyset, ๐–ต๐—„โ€‹(โ‹…)โ†โˆ…{\mathsf{V_{k}}}(\cdot)\leftarrow\emptyset;
4 forย each uu in decreasing order of ฯ€โ€‹(u)\pi(u)ย do
5ย ย ย ย ย  ๐’ฎโ†๐–ต๐—„<โ€‹(u)โˆชwโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)๐–ต๐—„โ€‹(w)\mathcal{S}\leftarrow{\mathsf{V_{k}^{<}}}(u)\cup_{w\in{\mathsf{BNS^{>}}}(u)}{\mathsf{V_{k}}}(w);
6ย ย ย ย ย  forย each vโˆˆ๐’ฎv\in\mathcal{S}ย do
7ย ย ย ย ย ย ย ย ย ย  dโ†minwโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)โก{ฯ•โ€‹((u,w),Gโ€ฒ)+๐–ฝ๐—‚๐—Œ๐—โ€‹((w,v),G)};d\leftarrow\min_{w\in{\mathsf{BNS^{>}}}(u)}\{\phi((u,w),G^{\prime})+{\mathsf{dist}}((w,v),G)\};
8ย ย ย ย ย ย ย ย ย ย  ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)โ†minโก{d,๐–ฝ๐—‚๐—Œ๐—<โ€‹(u,v)}{\mathsf{dist}}(u,v)\leftarrow\min\{d,{\mathsf{dist_{<}}}(u,v)\};
9ย ย ย ย ย ย ย ย ย ย 
10ย ย ย ย ย ๐–ต๐—„โ€‹(u)โ†{\mathsf{V_{k}}}(u)\leftarrow kk vertices in ๐’ฎ\mathcal{S} with the smallest ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G);
11ย ย ย ย ย 
Algorithmย 3 ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+โ€‹(G,ฯ€,โ„ณ){\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}(G,\pi,\mathcal{M})

Algorithm. Following Lemmaย 5.22, our new bidirectional construction algorithm is shown in Algorithmย 3. It first generates the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of GG using Algorithmย 1 (line 1) and computes ๐–ต๐—„<โ€‹(u){\mathsf{V_{k}^{<}}}(u) in the same way as Algorithmย 2 (line 2). After that, it processes the vertices in the decreasing order of their ranks (line 4-9). For each vertex uu, it retrieves ๐–ต๐—„<โ€‹(u)โˆชwโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)๐–ต๐—„โ€‹(w){\mathsf{V_{k}^{<}}}(u)\cup_{w\in{\mathsf{BNS^{>}}}(u)}{\mathsf{V_{k}}}(w) based on Lemmaย 5.21 and stores them in ๐’ฎ\mathcal{S} (line 5). Then, the distance between uu and vโˆˆ๐’ฎv\in\mathcal{S} is computed following Lemmaย 5.22 (line 6-8). Since the index construction procedure follows the decreasing order of ฯ€โ€‹(u)\pi(u), ๐–ต๐—„โ€‹(w){\mathsf{V_{k}}}(w) for โˆ€wโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(u)\forall w\in{\mathsf{BNS^{>}}}(u) has been computed before computing ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u). ๐–ฝ๐—‚๐—Œ๐—โ€‹((w,v),G){\mathsf{dist}}((w,v),G) can be obtained from ๐–ต๐—„โ€‹(w){\mathsf{V_{k}}}(w) directly. And ฯ•โ€‹((u,w),Gโ€ฒ)\phi((u,w),G^{\prime}) can be achieved from ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} directly. At last, the kk vertices in ๐’ฎ\mathcal{S} with the smallest ๐–ฝ๐—‚๐—Œ๐—โ€‹((u,v),G){\mathsf{dist}}((u,v),G) is returned as ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) in line 9.

Refer to caption
(a) ๐’ฎ\mathcal{S} for ๐–ต5โ€‹(v17){\mathsf{V}}_{5}(v_{17})
Refer to caption
(b) Computation of ๐–ต5โ€‹(v17){\mathsf{V}}_{5}(v_{17})
Figure 6. Procedure of Algorithmย 3 to Compute ๐–ต๐Ÿงโ€‹(v17){\mathsf{V_{5}}}(v_{17})
Example 5.23.

Figureย 6 shows the ๐–ต๐Ÿงโ€‹(v17){\mathsf{V_{5}}}(v_{17}) construction procedure following Algorithmย 3. Based on the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} in Figureย 4, for v17v_{17}, ๐–ก๐–ญ๐–ฒ>โ€‹(v17)={v18,v19}{\mathsf{BNS^{>}}}(v_{17})=\{v_{18},v_{19}\}, which is shown in pink in Figureย 6 (a). ๐–ต๐Ÿง<โ€‹(v17){\mathsf{V_{5}^{<}}}(v_{17}) can be constructed in the same way as shown in Exampleย 5.18. Following line 5 of Algorithmย 3, when computing ๐–ต๐Ÿง<โ€‹(v17){\mathsf{V_{5}^{<}}}(v_{17}), we already have ๐–ต๐Ÿง<โ€‹(v17),๐–ต๐Ÿงโ€‹(v18){\mathsf{V_{5}^{<}}}(v_{17}),{\mathsf{V_{5}}}(v_{18}), and ๐–ต๐Ÿงโ€‹(v19){\mathsf{V_{5}}}(v_{19}), which is shown in Figureย 6 (b). According to line 7-8 of Algorithmย 3, we have ๐’ฎ=๐–ต๐Ÿง<โ€‹(v17)โˆชwโˆˆ๐–ก๐–ญ๐–ฒ>โ€‹(v17)๐–ต๐Ÿงโ€‹(w)={(v17,0),(v5,2),(v12,3),(v15,3),(v11,4),(v19,5),(v18,6),(v9,8),(v13,9)}\mathcal{S}={\mathsf{V_{5}^{<}}}(v_{17})\cup_{w\in{\mathsf{BNS^{>}}}(v_{17})}{\mathsf{V_{5}}}(w)=\{(v_{17},0),(v_{5},2),(v_{12},3),(v_{15},\\ 3),(v_{11},4),(v_{19},5),(v_{18},6),(v_{9},8),(v_{13},9)\}. After sorting distance, the 55 nearest neighbors for v17v_{17} is selected from the set ๐’ฎ\mathcal{S}, namely, ๐–ต๐Ÿงโ€‹(v17)={(v17,0),(v5,2),(v12,3),(v15,3),(v11,4)}{\mathsf{V_{5}}}(v_{17})=\{(v_{17},0),(v_{5},2),(v_{12},3),(v_{15},3),(v_{11},4)\}.

Theorem 5.24.

Given a road network GG, the time complexity of Algorithmย 3 is bounded by Oโ€‹(nโ‹…ฯ2+nโ‹…ฯ„โ‹…k)O(n\cdot\rho^{2}+n\cdot\tau\cdot k) where ฯ\rho represents the maximum degree of vertices in the graph generated by Algorithmย 1 when Step 1 finishes, and ฯ„=๐—†๐–บ๐—‘vโˆˆVโ€‹(G)โ€‹|๐–ก๐–ญ๐–ฒ>โ€‹(v)|\tau={\mathsf{max}}_{v\in V(G)}|{\mathsf{BNS^{>}}}(v)|.

Proof: As proved in Theoremย 5.19, Algorithmย 1 requires Oโ€‹(nโ‹…ฯ2)O(n\cdot\rho^{2}) time (line 1 of Algorithmย 3) and ๐–ต๐—„<โ€‹(โ‹…){\mathsf{V_{k}^{<}}}(\cdot) construction requires Oโ€‹(nโ‹…ฯ„โ‹…k)O(n\cdot\tau\cdot k) (line 2 of Algorithmย 3). In the for loop from line 4 to line 9 of Algorithmย 3, obtaining ๐’ฎ\mathcal{S} and distance computation require Oโ€‹(ฯ„โ‹…k)O(\tau\cdot k) (line 5-9 of Algorithmย 3) and the loop terminates in nn iterations. Therefore, the for loop takes Oโ€‹(nโ‹…ฯ„โ‹…k)O(n\cdot\tau\cdot k) (line 4-9 of Algorithmย 3). In summary, the bidirectional ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{KNN}}\textrm{-}{\mathsf{Index}} construction (Algorithmย 3) requires Oโ€‹(nโ‹…ฯ2+nโ‹…ฯ„โ‹…k)O(n\cdot\rho^{2}+n\cdot\tau\cdot k). โ–ก\Box

Compared with Theoremย 5.19, Theoremย 5.24 shows that the time complexity of our new bidirectional algorithm to construct the index is significantly improved, which is also verified by the experimental results illustrated in Sectionย 7.

6. Candidate Object Update

In some cases, the candidate objects โ„ณ\mathcal{M} may be updated by inserting new objects or deleting existing objects. Straightforwardly, we can reconstruct the index from scratch by Algorithmย 3 to handle the update. However, this approach is inefficient as the update of a candidate object may not affect the kkNN results of all the vertices. In this section, we discuss how to maintain the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} incrementally when the candidate objects are updated.

Obviously, when a candidate object uu is inserted or deleted, the update of uu will not affect the kkNN results of a vertex vv if uu and vv are far away from each other. Specifically, let vkv_{k} be the vertex in ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) with the largest distance to vv. If ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)>๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk){\mathsf{dist}}(u,v)>{\mathsf{dist}}(v,v_{k}), then uu cannot be in ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v), which means deleting or inserting uu will not affect ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v). Moreover, we have the following lemma based on Propertyย 1 and Propertyย 2:

Lemma 6.1.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} of the road network GG, for a vertex vโˆˆV(G))v\in V(G)), when an object uu is inserted/deleted, ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) could be affected if and only if there exists at least one vertex wโˆˆ๐–ก๐–ญ๐–ฒโ€‹(v)w\in{\mathsf{BNS}}(v) whose ๐–ต๐—„โ€‹(w){\mathsf{V_{k}}}(w) changes due to the update of uu.

Proof: This lemma can be directly proved based on Propertyย 1 and Propertyย 2. โ–ก\Box

Therefore, we can maintain the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} starting from the vertex of the updated object uu. Based on the definition of kkNN, it is clear that ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) will be changed. Following Lemmaย 6.1, the change of ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u) will possibly lead to the change of ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) where vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(u)v\in{\mathsf{BNS}}(u). Then, we check whether ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) needs to be updated based on ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v){\mathsf{dist}}(u,v) and ๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk){\mathsf{dist}}(v,v_{k}) as discussed above. We continue to repeat the above procedure recursively, and it is obvious that the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is correctly maintained when no more vertices whose kkNN results change. Based on the above idea, our algorithms to handle the candidate object insertion and deletion are shown in Algorithmย 4 and Algorithmย 5, respectively.

1 ๐–ฝ๐—‚๐—Œ๐—โ€‹[โ‹…]โ†+โˆž{\mathsf{dist}}[\cdot]\leftarrow+\infty; ๐’ฎโ†{โˆ…}\mathcal{S}\leftarrow\{\emptyset\}; Qโ†โˆ…Q\leftarrow\emptyset;
2 ๐–ฝ๐—‚๐—Œ๐—โ€‹[u]โ†0{\mathsf{dist}}[u]\leftarrow 0; ๐’ฎโ†{u}\mathcal{S}\leftarrow\{u\}; Q.pโ€‹uโ€‹sโ€‹hโ€‹(u)Q.push(u);
3 whileย Qโ‰ โˆ…Q\neq\emptysetย do
4ย ย ย ย ย  wโ†Q.๐—‰๐—ˆ๐—‰โ€‹()w\leftarrow Q.{\mathsf{pop}}();
5ย ย ย ย ย  forย each vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(w)v\in{\mathsf{BNS}}(w)ย do
6ย ย ย ย ย ย ย ย ย ย  ๐–ฝ๐—‚๐—Œ๐—โ€‹[v]โ†minโก{๐–ฝ๐—‚๐—Œ๐—โ€‹[v],๐–ฝ๐—‚๐—Œ๐—โ€‹[w]+ฯ•โ€‹((w,v),Gโ€ฒ)}{\mathsf{dist}}[v]\leftarrow\min\{{\mathsf{dist}}[v],{\mathsf{dist}}[w]+\phi((w,v),G^{\prime})\};
7ย ย ย ย ย ย ย ย ย ย  ifย vโˆ‰๐’ฎโˆง๐–ผ๐—๐–พ๐–ผ๐—„๐–จ๐—‡๐—Œโ€‹(v,๐–ต๐—„โ€‹(v),๐–ฝ๐—‚๐—Œ๐—โ€‹[v])v\notin\mathcal{S}\wedge{\mathsf{checkIns}}(v,{\mathsf{V_{k}}}(v),{\mathsf{dist}}[v])ย then
8ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย  Q.๐—‰๐—Ž๐—Œ๐—โ€‹(v)Q.{\mathsf{push}}(v); ๐’ฎโ†๐’ฎโˆช{v}\mathcal{S}\leftarrow\mathcal{S}\cup\{v\};
9ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย 
10forย each vโˆˆ๐’ฎv\in\mathcal{S}ย do
11ย ย ย ย ย  remove vkv_{k} from ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v); insert uu into ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v);
12ย ย ย ย ย 
13 Procedureย ๐–ผ๐—๐–พ๐–ผ๐—„๐–จ๐—‡๐—Œ\mathsf{checkIns}(v,๐–ต๐—„โ€‹(v),d)(v,{\mathsf{V_{k}}}(v),d)
14ย ย ย ย ย  vkโ†v_{k}\leftarrow the vertex with the largest distance to vv in ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v);
15ย ย ย ย ย  if ๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk)โ‰คd{\mathsf{dist}}(v,v_{k})\leq d then return ๐–ฅ๐–บ๐—…๐—Œ๐–พ{\mathsf{False}};
16ย ย ย ย ย  else return ๐–ณ๐—‹๐—Ž๐–พ{\mathsf{True}};
Algorithmย 4 ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–จ๐—‡๐—Œโ€‹(Gโ€ฒ,๐–ต๐—„โ€‹(โ‹…),u){\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Ins}}(G^{\prime},{\mathsf{V_{k}}}(\cdot),u)

Object Insertion. Algorithmย 4 shows the algorithm for candidate object insertion. An array ๐–ฝ๐—‚๐—Œ๐—โ€‹[โ‹…]{\mathsf{dist}}[\cdot] stores the distance between uu and other vertices, a set ๐’ฎ\mathcal{S} stores vertices whose kkNN results should be updated, and a queue QQ stores the vertices whose bridge neighbor sets should be checked (line 1). Then, it initializes ๐–ฝ๐—‚๐—Œ๐—โ€‹[u]{\mathsf{dist}}[u] as 0 and adds uu in ๐’ฎ\mathcal{S} and QQ (line 1). After that, it pops a vertex ww from QQ (line 4), and for each vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(w)v\in{\mathsf{BNS}}(w), it computes the distance between uu and vv, which can be obtained based on the fact ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)=๐—†๐—‚๐—‡wโ€ฒโˆˆ๐–ก๐–ญ๐–ฒโ€‹(v)โ€‹{๐–ฝ๐—‚๐—Œ๐—โ€‹(u,wโ€ฒ)+๐–ฝ๐—‚๐—Œ๐—โ€‹(wโ€ฒ,v)}{\mathsf{dist}}(u,v)={\mathsf{min}}_{w^{\prime}\in{\mathsf{BNS}}(v)}\{{\mathsf{dist}}(u,w^{\prime})+{\mathsf{dist}}(w^{\prime},v)\} (line 6). Note that instead of visting all wโ€ฒโˆˆ๐–ก๐–ญ๐–ฒโ€‹(v)w^{\prime}\in{\mathsf{BNS}}(v), only the vertices whose ๐–ต๐—„โ€‹(w){\mathsf{V_{k}}}(w) changes due to the update of uu need to be explored following Lemmaย 6.1, which is captured by QQ. If vv is not in ๐’ฎ\mathcal{S} and ๐–ฝ๐—‚๐—Œ๐—โ€‹[v]{\mathsf{dist}}[v] is smaller than the distance between vv and vkv_{k}, where vkv_{k} is the vertex with the largest distance to vv in ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) which can be obtained directly based on ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, it adds vv into QQ and ๐’ฎ\mathcal{S} (line 7-8). The procedure terminates when QQ becomes empty (line 3). At last, for each vertex vโˆˆ๐’ฎv\in\mathcal{S}, it removes vkv_{k} from ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) and inserts uu into ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) (line 9-10). When inserting uu into ๐–ต๐—„โ€‹(u){\mathsf{V_{k}}}(u), ๐–ฝ๐—‚๐—Œ๐—โ€‹[v]{\mathsf{dist}}[v] is the shortest distance between uu and vv, which can be guaranteed by Lemmaย 6.1.

Theorem 6.2.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} and its corresponding ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} of a road network GG, Algorithmย 4 maintains the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} correctly when an object uu is inserted.

Proof: Algorithmย 4 (line 5-6) can guarantee ๐–ฝ๐—‚๐—Œ๐—โ€‹[v]{\mathsf{dist}}[v] for โˆ€vโˆˆ๐’ฎ\forall v\in\mathcal{S} is the distance between uu and vv before inserting uu to ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) for vโˆˆ๐’ฎv\in\mathcal{S} (line 9-10). Even though when using ๐–ผ๐—๐–พ๐–ผ๐—„๐–จ๐—‡๐—Œโ€‹(v,๐–ต๐—„โ€‹(v),d){\mathsf{checkIns}}(v,{\mathsf{V_{k}}}(v),d) (line 7), dd may not be the distance between uu and vv and ๐–ผ๐—๐–พ๐–ผ๐—„๐–จ๐—‡๐—Œโ€‹(v,๐–ต๐—„โ€‹(v),d){\mathsf{checkIns}}(v,{\mathsf{V_{k}}}(v),d) returns ๐–ณ๐—‹๐—Ž๐–พ\mathsf{True}, the final result can not be affected. Since ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)โ‰คd{\mathsf{dist}}(u,v)\leq d, ๐–ผ๐—๐–พ๐–ผ๐—„๐–จ๐—‡๐—Œ{\mathsf{checkIns}} returning ๐–ณ๐—‹๐—Ž๐–พ\mathsf{True} denotes that d<๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk)d<{\mathsf{dist}}(v,v_{k}). Therefore, we have ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)<๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk){\mathsf{dist}}(u,v)<{\mathsf{dist}}(v,v_{k}). Overall, Algorithmย 4 maintains the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} correctly when an object uu is inserted. โ–ก\Box

Theorem 6.3.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} and its corresponding ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} of a road network GG, when an object uu is inserted, Algorithmย 4 maintains the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} in Oโ€‹(ฮ”โ‹…ฯ„โ€ฒ)O(\Delta\cdot\tau^{\prime}), where ฮ”=|๐’ฎ|\Delta=|\mathcal{S}| and ฯ„โ€ฒ=๐—†๐–บ๐—‘vโˆˆVโ€‹(G)โ€‹|๐–ก๐–ญ๐–ฒโ€‹(v)|\tau^{\prime}={\mathsf{max}}_{v\in V(G)}|{\mathsf{BNS}}(v)|.

Proof: The time complexity of ๐–ผ๐—๐–พ๐–ผ๐—„๐–จ๐—‡๐—Œโ€‹(v,๐–ต๐—„โ€‹(v),d){\mathsf{checkIns}}(v,{\mathsf{V_{k}}}(v),d) (line 11-14) is Oโ€‹(1)O(1). In the for loop (line 3-8), for each vertex ww, line 5-8 requires Oโ€‹(ฯ„โ€ฒ)O(\tau^{\prime}) time and the loop terminates in at most ฮ”\Delta iterations. Therefore, the for loop (line 3-8) requires Oโ€‹(ฮ”โ‹…ฯ„โ€ฒ)O(\Delta\cdot\tau^{\prime}) time. In the for loop (line 9-10), removing vkv_{k} from ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) requires Oโ€‹(1)O(1), inserting uu into ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) in the correct position needs Oโ€‹(k)O(k) and the loop stops in ฮ”\Delta iterations. Therefore, the for loop (line 9-10) requires Oโ€‹(ฮ”โ‹…k)O(\Delta\cdot k) time. In summary, the overall time complexity of Algorithmย 4 is O(ฮ”โ‹…(ฯ„โ€ฒ+k)=O(ฮ”โ‹…ฯ„โ€ฒ)O(\Delta\cdot(\tau^{\prime}+k)=O(\Delta\cdot\tau^{\prime}), since in real applications the parameter kk is not large and k<ฯ„โ€ฒk<\tau^{\prime}. โ–ก\Box

1 ๐–ฝ๐—‚๐—Œ๐—โ€‹[โ‹…]โ†+โˆž{\mathsf{dist}}[\cdot]\leftarrow+\infty; ๐’ฎโ†{โˆ…}\mathcal{S}\leftarrow\{\emptyset\}; Qโ†โˆ…Q\leftarrow\emptyset;
2 ๐–ฝ๐—‚๐—Œ๐—โ€‹[u]โ†0{\mathsf{dist}}[u]\leftarrow 0; ๐’ฎโ†{u}\mathcal{S}\leftarrow\{u\}; Q.pโ€‹uโ€‹sโ€‹hโ€‹(u)Q.push(u);
3 whileย Qโ‰ โˆ…Q\neq\emptysetย do
4ย ย ย ย ย  wโ†Q.๐—‰๐—ˆ๐—‰โ€‹()w\leftarrow Q.{\mathsf{pop}}();
5ย ย ย ย ย  forย each vโˆˆ๐–ก๐–ญ๐–ฒโ€‹(w)v\in{\mathsf{BNS}}(w)ย do
6ย ย ย ย ย ย ย ย ย ย  ๐–ฝ๐—‚๐—Œ๐—โ€‹[v]โ†minโก{๐–ฝ๐—‚๐—Œ๐—โ€‹[v],๐–ฝ๐—‚๐—Œ๐—โ€‹[w]+ฯ•โ€‹((w,v),Gโ€ฒ)}{\mathsf{dist}}[v]\leftarrow\min\{{\mathsf{dist}}[v],{\mathsf{dist}}[w]+\phi((w,v),G^{\prime})\};
7ย ย ย ย ย ย ย ย ย ย  ifย vโˆ‰๐’ฎโˆง๐–ผ๐—๐–พ๐–ผ๐—„๐–ฃ๐–พ๐—…โ€‹(u,v,๐–ต๐—„โ€‹(v),๐–ฝ๐—‚๐—Œ๐—โ€‹[v])v\notin\mathcal{S}\wedge{\mathsf{checkDel}}(u,v,{\mathsf{V_{k}}}(v),{\mathsf{dist}}[v])ย then
8ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย  Q.๐—‰๐—Ž๐—Œ๐—โ€‹(v)Q.{\mathsf{push}}(v); ๐’ฎโ†๐’ฎโˆช{v}\mathcal{S}\leftarrow\mathcal{S}\cup\{v\};
9ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย 
10forย each vโˆˆ๐’ฎv\in\mathcal{S} in decreasing order of ฯ€โ€‹(v)\pi(v)ย do
11ย ย ย ย ย  ๐—‰๐—‹๐—ˆ๐–ผ๐–พ๐—Œ๐—Œ๐–ฃ๐–พ๐—…โ€‹(v,๐–ก๐–ญ๐–ฒโ€‹(v),๐–ต๐—„โ€‹(โ‹…)){\mathsf{processDel}}(v,{\mathsf{BNS}}(v),{\mathsf{V_{k}}}(\cdot)); delete uu from ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v);
12ย ย ย ย ย 
13 Procedureย ๐–ผ๐—๐–พ๐–ผ๐—„๐–ฃ๐–พ๐—…\mathsf{checkDel}(u,v,๐–ต๐—„โ€‹(v),d)(u,v,{\mathsf{V_{k}}}(v),d)
14ย ย ย ย ย  vkโ†v_{k}\leftarrow the vertex with the largest distance to vv in ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v);
15ย ย ย ย ย  if ๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk)<dโˆจuโˆ‰๐–ต๐—„โ€‹(v){\mathsf{dist}}(v,v_{k})<d\vee u\notin{\mathsf{V_{k}}}(v) then return ๐–ฅ๐–บ๐—…๐—Œ๐–พ\mathsf{False};
16ย ย ย ย ย 
17ย ย ย ย ย else return ๐–ณ๐—‹๐—Ž๐–พ{\mathsf{True}};
18ย ย ย ย ย 
19Procedureย ๐—‰๐—‹๐—ˆ๐–ผ๐–พ๐—Œ๐—Œ๐–ฃ๐–พ๐—…\mathsf{processDel}(v,๐–ก๐–ญ๐–ฒโ€‹(v),๐–ต๐—„โ€‹(โ‹…))(v,{\mathsf{BNS}}(v),{\mathsf{V_{k}}}(\cdot))
20ย ย ย ย ย  ๐’ฎโ€ฒโ†{โˆชwโˆˆ๐–ก๐–ญ๐–ฒโ€‹(v)๐–ต๐—„โ€‹(w)}โˆ–๐–ต๐—„โ€‹(v)\mathcal{S^{\prime}}\leftarrow\{\cup_{w\in{\mathsf{BNS}}(v)}{\mathsf{V_{k}}}(w)\}\setminus{\mathsf{V_{k}}}(v);
21ย ย ย ย ย 
22ย ย ย ย ย vโ€ฒโ†๐–บ๐—‹๐—€๐—†๐—‚๐—‡vโ€ฒโˆˆ๐’ฎโ€ฒโ€‹๐–ฝ๐—‚๐—Œ๐—โ€‹(vโ€ฒ,v)v^{\prime}\leftarrow{\mathsf{argmin}}_{v^{\prime}\in\mathcal{S}^{\prime}}{\mathsf{dist}}(v^{\prime},v);
23ย ย ย ย ย  insert vโ€ฒv^{\prime} into ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v);
24ย ย ย ย ย 
Algorithmย 5 ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ฃ๐–พ๐—…โ€‹(Gโ€ฒ,๐–ต๐—„โ€‹(โ‹…),u){\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Del}}(G^{\prime},{\mathsf{V_{k}}}(\cdot),u)

Object Deletion. Algorithmย 5 shows the algorithm for candidate object deletion, which follows a similar framework as Algorithmย 4. The main difference is in line 10. When the vertices whose ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) need to be updated are determined, Algorithmย 5 finds a new vertex vโ€ฒv^{\prime} to replace uu in ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) by procedure ๐—‰๐—‹๐—ˆ๐–ผ๐–พ๐—Œ๐—Œ๐–ฃ๐–พ๐—…\mathsf{processDel} and deletes uu from ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v). For procedure ๐—‰๐—‹๐—ˆ๐–ผ๐–พ๐—Œ๐—Œ๐–ฃ๐–พ๐—…\mathsf{processDel}, it is easy to know that vโ€ฒv^{\prime} must be the vertex in {โˆชwโˆˆ๐–ก๐–ญ๐–ฒโ€‹(v)๐–ต๐—„โ€‹(w)}โˆ–๐–ต๐—„โ€‹(v)\{\cup_{w\in{\mathsf{BNS}}(v)}{\mathsf{V_{k}}}(w)\}\setminus{\mathsf{V_{k}}}(v) with the smallest distance to vv according to Propertyย 1, thus it first retrieves such set of vertices, namely ๐’ฎโ€ฒ\mathcal{S^{\prime}} (line 16), and finds the vertex in ๐’ฎโ€ฒ\mathcal{S^{\prime}} with the smallest distance to vv (line 17, since the vertices are processed in decreasing order of ฯ€โ€‹(v)\pi(v), ๐–ฝ๐—‚๐—Œ๐—โ€‹(vโ€ฒ,v){\mathsf{dist}}(v^{\prime},v) can be obtained in the similar way as line 7-8 of Algorithmย 3 following the same idea). At last, it inserted vโ€ฒv^{\prime} into ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) in line 18.

Theorem 6.4.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} and its corresponding ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} of a road network GG, Algorithmย 5 maintains the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} correctly when an object uu is deleted.

Proof: Algorithmย 5 (line 5-6) can guarantee ๐–ฝ๐—‚๐—Œ๐—โ€‹[v]{\mathsf{dist}}[v] for โˆ€vโˆˆ๐’ฎ\forall v\in\mathcal{S} is the distance between uu and vv, before processing each vโˆˆ๐’ฎv\in\mathcal{S} (line 9-10). Even though when using ๐–ผ๐—๐–พ๐–ผ๐—„๐–ฃ๐–พ๐—…โ€‹(v,๐–ต๐—„โ€‹(v),d){\mathsf{checkDel}}(v,{\mathsf{V_{k}}}(v),d) (line 7), dd may not be the distance between uu and vv and ๐–ผ๐—๐–พ๐–ผ๐—„๐–ฃ๐–พ๐—…โ€‹(v,๐–ต๐—„โ€‹(v),d){\mathsf{checkDel}}(v,{\mathsf{V_{k}}}(v),d) returns ๐–ณ๐—‹๐—Ž๐–พ\mathsf{True}, the final result can not be affected. Since ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)โ‰คd{\mathsf{dist}}(u,v)\leq d, ๐–ผ๐—๐–พ๐–ผ๐—„๐–ฃ๐–พ๐—…{\mathsf{checkDel}} returning ๐–ณ๐—‹๐—Ž๐–พ\mathsf{True} denotes that d<๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk)d<{\mathsf{dist}}(v,v_{k}). Therefore, we have ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v)<๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk){\mathsf{dist}}(u,v)<{\mathsf{dist}}(v,v_{k}). Overall, Algorithmย 5 maintains the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} correctly when an object uu is deleted. โ–ก\Box

Theorem 6.5.

Given the ๐–ก๐–ญ\mathsf{BN}-๐–ฆ๐—‹๐–บ๐—‰๐—\mathsf{Graph} Gโ€ฒG^{\prime} and its corresponding ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} of a road network GG, when an object uu is deleted, Algorithmย 5 maintains the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} in Oโ€‹(ฮ”โ‹…ฯ„โ€ฒโ‹…k)O(\Delta\cdot\tau^{\prime}\cdot k), where ฮ”=|๐’ฎ|\Delta=|\mathcal{S}| and ฯ„โ€ฒ=๐—†๐–บ๐—‘vโˆˆVโ€‹(G)โ€‹|๐–ก๐–ญ๐–ฒโ€‹(v)|\tau^{\prime}={\mathsf{max}}_{v\in V(G)}|{\mathsf{BNS}}(v)|.

Proof: The time complexity of ๐–ผ๐—๐–พ๐–ผ๐—„๐–ฃ๐–พ๐—…โ€‹(u,v,๐–ต๐—„โ€‹(v),d){\mathsf{checkDel}}(u,v,{\mathsf{V_{k}}}(v),d) (line 11-14) is Oโ€‹(1)O(1). In the for loop (line 3-8), for each vertex ww, line 5-8 requires Oโ€‹(ฯ„โ€ฒ)O(\tau^{\prime}) time and the loop terminates in at most ฮ”\Delta iterations. Therefore, the for loop (line 3-8) requires Oโ€‹(ฮ”โ‹…ฯ„โ€ฒ)O(\Delta\cdot\tau^{\prime}) time. For the procedure ๐—‰๐—‹๐—ˆ๐–ผ๐–พ๐—Œ๐—Œ๐–ฃ๐–พ๐—…(v,๐–ก๐–ญ๐–ฒ(v),๐–ต๐—„(โ‹…){\mathsf{processDel}}(v,{\mathsf{BNS}}(v),{\mathsf{V_{k}}}(\cdot) (line 15-18), retrieving ๐’ฎโ€ฒ\mathcal{S}^{\prime} from {โˆชwโˆˆ๐–ก๐–ญ๐–ฒโ€‹(v)๐–ต๐—„โ€‹(w)}โˆ–๐–ต๐—„โ€‹(v)\{\cup_{w\in{\mathsf{BNS}}(v)}{\mathsf{V_{k}}}(w)\}\setminus{\mathsf{V_{k}}}(v) needs Oโ€‹(|๐–ก๐–ญ๐–ฒโ€‹(v)|โ‹…|๐–ต๐—„โ€‹(w)|)=Oโ€‹(ฯ„โ€ฒโ‹…k)O(|{\mathsf{BNS}}(v)|\cdot|{\mathsf{V_{k}}}(w)|)=O(\tau^{\prime}\cdot k) time (line 16). At the same time with retrieving ๐’ฎโ€ฒ\mathcal{S}^{\prime}, the vertex vโ€ฒv^{\prime} with the smallest distance from ๐’ฎโ€ฒ\mathcal{S}^{\prime} can be achieved (line 17). Line 18 requires Oโ€‹(1)O(1) time, since ๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vโ€ฒ)โ‰ค๐–ฝ๐—‚๐—Œ๐—โ€‹(v,vk){\mathsf{dist}}(v,v^{\prime})\leq{\mathsf{dist}}(v,v_{k}) and vโ€ฒv^{\prime} should be inserted into the end of ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v). Therefore, the time complexity of ๐—‰๐—‹๐—ˆ๐–ผ๐–พ๐—Œ๐—Œ๐–ฃ๐–พ๐—…(v,๐–ก๐–ญ๐–ฒ(v),๐–ต๐—„(โ‹…){\mathsf{processDel}}(v,{\mathsf{BNS}}(v),{\mathsf{V_{k}}}(\cdot) (line 15-18) is Oโ€‹(ฯ„โ€ฒโ‹…k)O(\tau^{\prime}\cdot k). In the for loop (line 9-10), deleting uu from ๐–ต๐—„โ€‹(v){\mathsf{V_{k}}}(v) requires Oโ€‹(k)O(k), and the loop stops in ฮ”\Delta iterations. Therefore, the for loop (line 9-10) requires Oโ€‹(ฮ”โ‹…ฯ„โ€ฒโ‹…k)O(\Delta\cdot\tau^{\prime}\cdot k) time. In summary, the overall time complexity of Algorithmย 4 is Oโ€‹(ฮ”โ‹…ฯ„โ€ฒโ‹…(1+k))=Oโ€‹(ฮ”โ‹…ฯ„โ€ฒโ‹…k)O(\Delta\cdot\tau^{\prime}\cdot(1+k))=O(\Delta\cdot\tau^{\prime}\cdot k). โ–ก\Box

7. Experiments

In this section, we compare our algorithms with the state-of-the-art method. All experiments are conducted on a machine with an Intel Xeon CPU and 384 GB main memory running Linux.

Dataset Name nn mm ฮท\eta ฯ„\tau ฯ\rho
New York City ๐–ญ๐–ธ\mathsf{NY} 264,346 733,846 725 56 116
San Francisco Bay Area ๐–ก๐– ๐–ธ\mathsf{BAY} 321,270 800,172 388 45 100
Colorado ๐–ข๐–ฎ๐–ซ\mathsf{COL} 435,666 1,057,066 524 65 122
Florida ๐–ฅ๐–ซ๐– \mathsf{FLA} 1,070,376 2,712,798 556 49 85
Northwest USA ๐–ญ๐–ถ\mathsf{NW} 1,207,945 2,840,208 619 49 119
Northeast USA ๐–ญ๐–ค\mathsf{NE} 1,524,453 3,897,636 1096 81 149
California and Nevada ๐–ข๐– ๐–ซ\mathsf{CAL} 1,890,815 4,657,742 795 93 204
Great Lakes ๐–ซ๐–ช๐–ฒ\mathsf{LKS} 2,758,119 6,885,658 1674 124 327
Eastern USA ๐–ค๐–ด๐–ฒ\mathsf{EUS} 3,598,623 8,778,114 1089 102 233
Western USA ๐–ถ๐–ด๐–ฒ\mathsf{WUS} 6,262,104 15,248,146 1356 128 276
Central USA ๐–ข๐–ณ๐–ฑ\mathsf{CTR} 14,081,816 34,292,496 2811 234 531
Full USA ๐–ด๐–ฒ๐– \mathsf{USA} 23,947,347 58,333,344 3315 257 587
Table 2. Datasets in Experiments
Refer to caption
(a) NY
Refer to caption
(b) BAY
Refer to caption
(c) COL
Refer to caption
(d) FLA
Refer to caption
(e) NW
Refer to caption
(f) NE
Refer to caption
(g) CAL
Refer to caption
(h) LKS
Refer to caption
(i) EUS
Refer to caption
(j) WUS
Refer to caption
(k) CTR
Refer to caption
(l) USA
Figure 7. Query Processing Time by Varying kk

Datasets. We use twelve publicly available real road networks from DIMACS 222http://users.diag.uniroma1.it/challenge9/download.shtml. In each road network, vertices represent intersections between roads, edges correspond to roads or road segments, the weight of an edge is the physical distance between two vertices. Tableย 2 provides the details about these datasets. Tableย 2 also shows the value of ฮท\eta, ฯ„\tau and ฯ\rho for each road network. Clearly, ฮท\eta, ฯ„\tau and ฯ\rho are small in practice.

Algorithms. We compare the following algorithms:

  • โ€ข

    ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{{\mathsf{TEN}}\textrm{-}{\mathsf{Index}}}: The state-of-the-art algorithm for kkNN queries queries, which is introduced in Sectionย 3.

  • โ€ข

    ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}: Our proposed algorithms for kkNN queries. For the index construction algorithms, we further distinguish between ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}ย (Algorithmย 2) and ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}(Algorithm 3) for comparison.

  • โ€ข

    ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}: Another algorithm for kkNN queries proposed in (Luo etย al., 2018), which is introduced in Sectionย 8.

  • โ€ข

    ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บโ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ\mathsf{{\mathsf{Dijkstra}}\textrm{-}{\mathsf{Cons}}}: Using Dijkstraโ€™s Algorithm to compute top-kk nearest neighbors for all vertices in a given graph GG to construct the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} as discussed in Sectionย 5.

  • โ€ข

    ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ\mathsf{{\mathsf{TEN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}}: Using ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} to compute top-kk nearest neighbors for all vertices in a given graph GG to construct the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} as discussed in Sectionย 5.

All the algorithms are implemented in C++ and compiled in GCC with -O3. The time cost is measured as the amount of wall-clock time elapsed during the programโ€™s execution. If an algorithm cannot finish in 6 hours, we denote the processing time as ๐–ญ๐– \mathsf{NA}.

Parameter Settings. Following previous kkNN works (Ouyang etย al., 2020a; He etย al., 2019; Luo etย al., 2018), we randomly select candidate objects in each dataset with a density ฮผ=|โ„ณ|/|V|\mu=|\mathcal{M}|/|V|. The candidate density ฮผ\mu and the query parameter kk settings are shown in Tableย 3, default values display in bold and italic font.

Parameters Values
ฮผ\mu 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001
kk 100, 80, 60, 40, 30, 20, 10
Table 3. Parameter Settings

Exp-1: Query Processing Time when Varying kk. In this experiment, we evaluate the query processing time of our algorithms ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, the SOTA solutions ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}by varying the parameter kk. We randomly generate 10,00010,000 queries and report average running time of each algorithm in Figureย 7.

As shown in Figureย 7, our algorithm is the most efficient one compared with ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}and the growth for query processing time of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}is sharper than that of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} with increase of kk. For example, on ๐–ญ๐–ธ\mathsf{NY} dataset, when kk increases from 1010 to 100100, the query processing time of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}grows from 13.21713.217us to 127.025127.025us and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}raises from 18.17218.172us to 138.212138.212us. However, our ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} processing time grows from 0.3950.395us to 1.2101.210us. This is consistent with our analysis in Sectionย 4.2. For ๐–ด๐–ฒ๐– \mathsf{USA}, ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} is out of memory, the query processing time for ๐–ด๐–ฒ๐– \mathsf{USA} could
not be tested at this and following experiments.

Refer to caption
(a) NY
Refer to caption
(b) BAY
Refer to caption
(c) COL
Refer to caption
(d) FLA
Refer to caption
(e) NW
Refer to caption
(f) NE
Refer to caption
(g) CAL
Refer to caption
(h) LKS
Refer to caption
(i) EUS
Refer to caption
(j) WUS
Refer to caption
(k) CTR
Refer to caption
(l) USA
Figure 8. Query Processing Time by Varying ฮผ=|โ„ณ|/|V|\mu=|\mathcal{M}|/|V|

Exp-2: Query Processing Time when Varying โ„ณ\mathcal{M}. We also compare our ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} with the SOTA solutions ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} by varying object โ„ณ\mathcal{M} (the density ฮผ=|โ„ณ|/|V|\mu=|\mathcal{M}|/|V|, therefore, we vary โ„ณ\mathcal{M} by changing ฮผ\mu). We randomly generate 10,00010,000 queries for every dataset. We report the average processing time of each algorithm in Figureย 8.

As shown in Figureย 8, the query processing time of our algorithm is stable with the decrease of candidate object โ„ณ\mathcal{M}. However, the query processing time of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} increases significantly with the decrease of candidate density ฮผ\mu. For example, when ฮผ=0.0001\mu=0.0001 ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} achieves 2 orders of magnitude speedup compared with ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}in all datasets, and ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} achieves 4 orders of magnitude speedup compared with ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. Moreover, the more sparsely the object set distributes, the larger speedup is. This is because our proposed algorithm is optimal regarding query processing as analyzed in Sectionย 4.2.

Refer to caption
(a) NY
Refer to caption
(b) BAY
Refer to caption
(c) COL
Refer to caption
(d) FLA
Refer to caption
(e) NW
Refer to caption
(f) NE
Refer to caption
(g) CAL
Refer to caption
(h) LKS
Refer to caption
(i) EUS
Refer to caption
(j) WUS
Refer to caption
(k) CTR
Refer to caption
(l) USA
Figure 9. Query Processing Time for Different Outputs

Exp-3: Progressive Query Processing. In this experiment, we evaluate the progressive query processing strategy of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} by outputing every 55 outputs in all datasets with k=60k=60. As shown in Figureย 9, the lines of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} are linear in all datasets, and the total time of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is always smaller than that of ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. The reasons are similar as explained in Exp-1 and Exp-2. ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} does not support incremental polynomial query processing, so it has the same the total time for different outputs.

Refer to caption
Figure 10. Indexing Time (s)
Refer to caption
Figure 11. Index Size (GB)

Exp-4: Indexing Time. In this experiment, we evaluate the indexing time for ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}, ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}-๐–ข๐—ˆ๐—‡๐—Œ\mathsf{Cons}, ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}, ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}ย and ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}-๐–ข๐—ˆ๐—‡๐—Œ\mathsf{Cons}. Figureย 10 shows that ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}ย is the fastest in all datasets, and achieves up to 2 orders of magnitude speedup compared with ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. For example, ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}ย only takes 283.78283.78s for ๐–ด๐–ฒ๐– \mathsf{USA} while ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}costs 19655.6819655.68s. ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}-๐–ข๐—ˆ๐—‡๐—Œ\mathsf{Cons} and ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}takes the similar indexing time as ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}-๐–ข๐—ˆ๐—‡๐—Œ\mathsf{Cons} depends on the ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. They both rely on ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. Also, the indexing time of ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}and ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}are similar, since ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}constructs the additional grid index on the basis of ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. As shown in Figureย 10, ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}ย cannot complete the index construction within 66 hours for ๐–ข๐–ณ๐–ฑ\mathsf{CTR} and ๐–ด๐–ฒ๐– \mathsf{USA}. And for ๐–ด๐–ฒ๐– \mathsf{USA} ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}is out of memory. ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}-๐–ข๐—ˆ๐—‡๐—Œ\mathsf{Cons} cannot finish index construction within 66 hours for ๐–ญ๐–ค\mathsf{NE}, ๐–ซ๐–ช๐–ฒ\mathsf{LKS}, ๐–ค๐–ด๐–ฒ\mathsf{EUS}, ๐–ถ๐–ด๐–ฒ\mathsf{WUS}, ๐–ข๐–ณ๐–ฑ\mathsf{CTR} and ๐–ด๐–ฒ๐– \mathsf{USA}. Although index construction frameworks in ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}ย and ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}ย are similar, ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}ย consumes much more time compared with ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}. For example, for ๐–ถ๐–ด๐–ฒ\mathsf{WUS}, ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}ย only costs 50.1550.15s, but ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}ย costs 42061.2042061.20s. This is because ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}ย first uses BFS to construct ๐–ฆโ€ฒโฃ>โ€‹(u){\mathsf{G^{\prime>}}}(u) for each vertex uโˆˆVโ€‹(G)u\in V(G), and then uses Dijkstraโ€™s Algorithm to compute ๐–ฝ๐—‚๐—Œ๐—โ€‹(u,v){\mathsf{dist}}(u,v) for โˆ€vโˆˆVโ€‹(๐–ฆโ€ฒโฃ>โ€‹(u))\forall v\in V({\mathsf{G^{\prime>}}}(u)) when constructing the index. However, ๐–ช๐–ญ๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘โ€‹-โ€‹๐–ข๐—ˆ๐—‡๐—Œ+{\mathsf{KNN}}\textrm{-}{\mathsf{Index}}\textrm{-}{\mathsf{Cons}}^{+}ย adopts a bidirectional construction strategy to avoid the time-consuming BFS search and the computation of Dijkstraโ€™s Algorithm during the index construction. The experimental results demonstrate the efficiency of our proposed algorithm regarding index construction.

Exp-5: Index Size. In this experiment, we evaluate the index size for ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. The experimental results for the 1212 road networks are shown in Figureย 11. Figureย 11 shows the index size of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is much smaller than that of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. For example, for the dataset ๐–ด๐–ฒ๐– \mathsf{USA}, the ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} size is only 3.573.57 GB while ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}size is 169.28169.28 GB, which is 47.4247.42 times smaller than that of ๐–ณ๐–ค๐–ญ\mathsf{TEN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}.

USA CTR
kk ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }} ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }} ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }} ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }} ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}
kk Indexing Time (s) Indexing Time (s) Index Size (GB) Index Size (GB) Indexing Time (s) Indexing Time (s) Index Size (GB) Index Size (GB)
1010 19669.787 266.005 169.265 1.784 10877.527 169.578 98.021 1.049
2020 19670.198 283.850 169.277 3.568 10878.179 179.480 98.028 2.908
3030 19670.336 297.224 169.286 5.353 10881.515 186.859 98.034 3.148
4040 19671.470 321.845 169.293 7.1369 10882.434 201.225 98.039 4.197
6060 19672.287 345.169 169.305 10.705 10884.793 215.737 98.046 6.295
8080 19672.839 382.074 169.315 14.274 10885.981 238.944 98.053 8.393
100100 19710.726 418.251 169.324 17.842 10887.369 253.219 98.058 10.492
Table 4. Indexing Time and Index Size when Varying kk

Exp-6: Indexing Time and Index Space when Varying kk. We evaluate the performance of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} and ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}when varying kk on ๐–ด๐–ฒ๐– \mathsf{USA} and ๐–ข๐–ณ๐–ฑ\mathsf{CTR} with k=10,20,30,40,60,80,100k=10,20,30,40,60,80,100. The results on the other datasets are omitted due to similar trends. As shown in Tableย 4, the index size and the indexing time of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} both increases with the growth of kk slightly. For example, when kk increases from 1010 to 100100, the indexing time for ๐–ด๐–ฒ๐– \mathsf{USA} increases by 40.94040.940s and the index size for ๐–ด๐–ฒ๐– \mathsf{USA} increases by 0.0590.059 GB. This is consistent with our analysis in Sectionย 4.2.

Refer to caption
Figure 12. Update Time (us)

Exp-7: Scalability when Varying Graph Size. In this experiment, we evaluate the scalability of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. To test the scalability for indexing time and index size, we divide the map of the whole US into 10ร—1010\times 10 grids. We select a 1ร—11\times 1 grid in the middle and generate an induced network by all vertices falling the grid. Using this method, we generate 1010 datasets. We report the indexing time and index size for these ten networks in Figureย 13. The labels on xx-axis represent the number of vertices. For the largest datasets, ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}is out of memory. As shown in Figureย 13, when the dataset increases from 10610^{6} to 24โˆ—10624*10^{6}, the indexing time increases stably for all algorithms, which verifies our ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} has a good scalability. Moreover, our ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} always outperforms ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. The reasons are similar as explained in Exp-4 and Exp-5.

Refer to caption
(a) Index Size
Refer to caption
(b) Indexing Time
Figure 13. Scalability

Exp-8: Object Update. In this experiment, we evaluate the performance of our update algorithms. To generate updated objects, we randomly select an object uu with either insertion or deletion. We skip the update if uโˆ‰โ„ณu\notin\mathcal{M} for deletion and uโˆˆโ„ณu\in\mathcal{M} for insertion. For each dataset, we repeat this step until 10,00010,000 updates are generated. The average time for each update is reported in Figureย 12. The update time of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is slower than that of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and that of ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}, since our update algorithm needs more time to compute the distance between each vertex and the updated objects. As analyzed in Sectionย 3, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}contains ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} can compute the distance between any two vertices efficiently. Therefore, based on ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}can finish insertion or deletion in the shorter time. Since ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} only needs to update objectsโ€™ grid index, the update operation is easier and costs shorter time.

Exp-9: System Throughput. We evaluate the throughput of answering queries mixed by kkNN queries and object updates. Following (Ouyang etย al., 2020b), the throughput is calculated based on two models, which are (1) Batch Update Arrival + Query First (๐–ก๐–ด๐– \mathsf{BUA} +๐–ฐ๐–ฅ\mathsf{QF}) and (2) Random Update Arrival + First Come First Served (๐–ฑ๐–ด๐– \mathsf{RUA} +๐–ฅ๐–ข๐–ฅ๐–ฒ\mathsf{FCFS}) (Luo etย al., 2018). We report the throughput of each algorithm under different kk and ฮผ\mu for a representative dataset ๐–ญ๐–ธ\mathsf{NY}, which is also used in (Ouyang etย al., 2020b). The results are shown in Figureย 14. We can see that the throughput of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is larger for the ๐–ก๐–ด๐– \mathsf{BUA} +๐–ฐ๐–ฅ\mathsf{QF} model as our query processing algorithm is very fast. However, in the ๐–ฑ๐–ด๐– \mathsf{RUA} +๐–ฅ๐–ข๐–ฅ๐–ฒ\mathsf{FCFS} model, the throughput of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is smaller than that of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. This is because our update algorithm costs more time than the update algorithms for ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}. Also, the throughput of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} all will go down with the decrease of ฮผ\mu. When the candidate density is relatively high, our update algorithms are faster. When ฮผ\mu decreases in Figureย 14(c), our update algorithms cost more time. Even if our query processing time is not affected by ฮผ\mu, our throughput still decrease with the fall of ฮผ\mu. For ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD}, the query processing time and updating time both will be longer with decrease of ฮผ\mu. Therefore, the throughput of ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} have a similar trend with that of ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}. In summary, ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} is good at handling the ๐–ก๐–ด๐– \mathsf{BUA} +๐–ฐ๐–ฅ\mathsf{QF} workload while ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} are good at handling ๐–ฑ๐–ด๐– \mathsf{RUA} +๐–ฅ๐–ข๐–ฅ๐–ฒ\mathsf{FCFS} workload, and ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} have their respective advantages regarding throughput.

Refer to caption
(a) ๐–ก๐–ด๐– \mathsf{BUA} +๐–ฐ๐–ฅ\mathsf{QF} (vary ฮผ)\mu)
Refer to caption
(b) ๐–ก๐–ด๐– \mathsf{BUA} +๐–ฐ๐–ฅ\mathsf{QF} (vary kk)
Refer to caption
(c) ๐–ฑ๐–ด๐– \mathsf{RUA} +๐–ฅ๐–ข๐–ฅ๐–ฒ\mathsf{FCFS} (vary ฮผ\mu)
Refer to caption
(d) ๐–ฑ๐–ด๐– \mathsf{RUA} +๐–ฅ๐–ข๐–ฅ๐–ฒ\mathsf{FCFS} (vary kk)
Figure 14. Throughput for ๐–ญ๐–ธ\mathsf{NY}

Exp-10: Indexing Time of Different Vertex Total Orders. We evaluate index construction performance using different total orders. We adopt three total orders: (1) degree-based total order in which the vertex with the smallest degree is processed first; (2) id-based total order in which the vertex with the smallest id is processed first; (3) and our proposed total order. As shown in Figureย 15, degree-based order can finish ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} construction for four small datasets, and id-based order only can construct ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} for 3 small datasets within 66 hours. Our order can finish constructing ๐–ช๐–ญ๐–ญ\mathsf{KNN}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} for all datasets, and the construction time in our order is 4 orders of magnitude faster than the other two orders. The experimental results are also consistent with our analysis in Sectionย 5.2.

Refer to caption
Figure 15. Indexing Time of Different Vertex Orders (s)

8. Related Work

The direct approach to answer a kkNN query is the ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}โ€™s algorithm (Dijkstra, 1959). Nevertheless, this approach is inefficient obviously. Therefore, a plethora of index based ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}-search enhanced solutions (Papadias etย al., 2003b; Demiryurek etย al., 2009a; Lee etย al., 2010; Zhong etย al., 2015; Shen etย al., 2017; Luo etย al., 2018) are proposed in the literature, which generally adopts the following search framework for a given query vertex uu: (1) Initialize the distance for vertices vv it connected as their edge weights and other vertices as +โˆž+\infty. (2) Maintain two vertex sets SS and TT. SS contains vertices whose distance to uu is computed. TT contains vertices whose distance to uu is not computed yet, but have neighbors in SS. Initially, uu is inserted in SS and the neighbors of uu are inserted in TT. (3) Select one node vv with the smallest distance to uu from TT, and add it to SS. Then, the neighbors of vv are inserted into TT. Here, different indexing methods add different restrictions, pruning unnecessary vertices to be inserted in TT, to improve the query processing performance. (4) Repeat (3) until |S|=k|S|=k.

Specifically, ๐–จ๐–ค๐–ฑ\mathsf{IER} (Papadias etย al., 2003b) uses Euclidean distance as a pruning bound to acquire the kkNN results. ๐–จ๐–ญ๐–ค\mathsf{INE} (Papadias etย al., 2003b) improves ๐–จ๐–ค๐–ฑ\mathsf{IER}โ€™s Euclidean distance bound by expending searching space from the query location. (Demiryurek etย al., 2009a) adapts a Euclidean restriction-based method to deal with continuous kk nearest neighbor problem. (Demiryurek etย al., 2009a) divides the map into Nร—NN\times N grids and records which vertices and edges belong to some grid. Given a query vertex, the fixed distance between grids is used to filter a proximate range. ๐–ฑ๐–ฎ๐– ๐–ฃ\mathsf{ROAD} (Lee etย al., 2010) separates the input graph GG into many subgraphs hierarchically and skips the subgraphs without candidate objects to speedup kkNN query processing. ๐–ฆ\mathsf{G}-๐—๐—‹๐–พ๐–พ\mathsf{tree} (Zhong etย al., 2015) adapts a binary tree division method to divide a graph into two disjoint subgraphs recursively until the number of vertices in a tree node is smaller than a predefined parameter. In each subgraph, ๐–ฆ\mathsf{G}-๐—๐—‹๐–พ๐–พ\mathsf{tree} maintains a distance matrix which stores distance between borders and vertices, which is used to prune unnecessary vertex exploration during the ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra} search. ๐–ต\mathsf{V}-๐—๐—‹๐–พ๐–พ\mathsf{tree} (Shen etย al., 2017) constructs a similar structure as ๐–ฆ\mathsf{G}-๐—๐—‹๐–พ๐–พ\mathsf{tree} but adds additional kk nearest objects for borders, which leads to a faster query processing than ๐–ฆ\mathsf{G}-๐—๐—‹๐–พ๐–พ\mathsf{tree}. Based on the contraction hierarchy (๐–ข๐–ง\mathsf{CH}) (Geisberger etย al., 2008), ๐–ณ๐–ฎ๐– ๐–จ๐–ญ\mathsf{TOAIN} (Luo etย al., 2018) constructs a kkDNN index recording the top-kk nearest neighbors for each vertex uu from objects whose ranks are lower than uu, where the rank is defined by the contraction hierarchy. To answer a kkNN query with vertex uu, ๐–ณ๐–ฎ๐– ๐–จ๐–ญ\mathsf{TOAIN} performs ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra} search from uu following the ๐–ข๐–ง\mathsf{CH} and maintains a candidate result set RR. When visiting a vertex vv, if there is a vertex ww in the kkDNN of vv such that the distance of ww and uu is smaller than the kk-th distance to uu in RR, ๐–ณ๐–ฎ๐– ๐–จ๐–ญ\mathsf{TOAIN} updates RR. The processing finishes when the ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra} search is far enough or all vertices are explored. Although the methods design different pruning algorithms to reduce the ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra} search space in step (3), the number of explored vertices cannot be well-bounded. In worst case, these methods degenerate into ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}โ€™s algorithm, which leads to long query processing delay unavoidably. For ๐–ณ๐–ฎ๐– ๐–จ๐–ญ\mathsf{TOAIN}, asit constructs kkDNN based on ๐–ข๐–ง\mathsf{CH}, which causes a relatively huge index size. Additionally, the vertex ranking method in ๐–ณ๐–ฎ๐– ๐–จ๐–ญ\mathsf{TOAIN} employs ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}โ€™s Algorithm, which incurs an expensive time cost regarding index construction. The experimental results of (Ouyang etย al., 2020a) also verify above discussions.

Apart from the ๐–ฃ๐—‚๐—ƒ๐—„๐—Œ๐—๐—‹๐–บ\mathsf{Dijkstra}-search enhanced solutions, (Li etย al., 2018) exploits the massive parallelism of GPU to accelerate the kkNN query processing. ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} (He etย al., 2019) partitions the road network into 2xร—2x2^{x}\times 2^{x} girds based on the geographical coordinate of each vertex. When answering a kkNN query, it starts the search from the grid containing the query vertex and updates the candidate result via probing vertices in neighbor grids iteratively. It avoids the exploration to the vertices in a grid if the minimum Euclidean distance between any vertex inside the grid and the query vertex is not less than the largest distance in the candidate result. As ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} needs to use ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index} to compute the exact shortest distance to select the final exact kkNN results, the query processing is long. Moreover, since ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} depends on ๐–ง๐Ÿค๐–ง\mathsf{H2H}-๐–จ๐—‡๐–ฝ๐–พ๐—‘\mathsf{Index}, the index size of ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} is huge and the indexing time of ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} is long, which are similar to ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}(Ouyang etย al., 2020a). ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}(Ouyang etย al., 2020a) is the state-of-the-art approach to kkNN query in road network, which has been discussed in Sectionย 3. (Li etย al., 2023) extend ๐–ณ๐–ค๐–ญโ€‹-โ€‹๐–จ๐—‡๐–ฝ๐–พ๐—‘{\mathsf{TEN}}\textrm{-}{\mathsf{Index\ }}(Ouyang etย al., 2020a) and ๐–ฆ๐–ซ๐– ๐–ฃ\mathsf{GLAD} (He etย al., 2019) onto time-dependent road networks. (Jiang etย al., 2023) extends tree decomposition method (Ouyang etย al., 2018) to deal with kkNN search on flow graph.

Besides, continuous kkNN query problem on road network is also studied in the literature (Shahabi etย al., 2003; Kolahdouzan and Shahabi, 2005; Cho and Chung, 2005; Mouratidis etย al., 2006; Demiryurek etย al., 2009b; Cho etย al., 2013; Zheng etย al., 2016; Kolahdouzan and Shahabi, 2004; Jiang etย al., 2023, 2021). Different from our setting, these studies generally assume that the query vertex is moving on the road network, and thus are orthogonal to ours. As a result, the proposed techniques in these studies cannot be used to address our problem.

9. Conclusion

Motivated by existing complex-index-based approaches for classical top kk nearest neighbors search in road networks suffers from the long query processing delay, oversized index space, and prohibitive indexing time, we embrace minimalism and design a simple index for kkNN query. The index has a well-bounded space and supports progressive and optimal query processing. Moreover, we further design efficient algorithms to support the index construction. Experimental results demonstrate the significant superiority of our index over the state-of-the-art approach.

References

  • (1)
  • Abbasifard etย al. (2014) Mohammadย Reza Abbasifard, Bijan Ghahremani, and Hassan Naderi. 2014. A survey on nearest neighbor search methods. International Journal of Computer Applications 95, 25 (2014).
  • Airnb ([n.d.]) Airnb. [n.d.]. https://www.airbnb.com/.
  • Bhatia etย al. (2010) Nitin Bhatia etย al. 2010. Survey of nearest neighbor techniques. arXiv preprint arXiv:1007.0085 (2010).
  • Booking ([n.d.]) Booking. [n.d.]. https://www.booking.com/.
  • Chang etย al. (1994) Y.ย H. Chang, Jia-Shung Wang, and Richard C.ย T. Lee. 1994. Generating All Maximal Independent Sets on Trees in Lexicographic Order. Inf. Sci. 76, 3-4 (1994), 279โ€“296.
  • Chernev etย al. (2015) Alexander Chernev, Ulf Bรถckenholt, and Joseph Goodman. 2015. Choice overload: A conceptual review and meta-analysis. Journal of Consumer Psychology 25, 2 (2015), 333โ€“358.
  • Cho etย al. (2013) Hyung-Ju Cho, Seย Jin Kwon, and Tae-Sun Chung. 2013. A safe exit algorithm for continuous nearest neighbor monitoring in road networks. Mob. Inf. Syst. 9, 1 (2013), 37โ€“53.
  • Cho and Chung (2005) Hyung-Ju Cho and Chin-Wan Chung. 2005. An efficient and scalable approach to CNN queries in a road network. In Proceedings of VLDB, Vol.ย 2. International Conference on VLDB, 865โ€“876.
  • Demiryurek etย al. (2009a) Ugur Demiryurek, Farnoush Banaei-Kashani, and Cyrus Shahabi. 2009a. Efficient continuous nearest neighbor query in spatial networks using euclidean restriction. In Proceedings of SSTD. Springer, 25โ€“43.
  • Demiryurek etย al. (2009b) Ugur Demiryurek, Farnoushย Banaei Kashani, and Cyrus Shahabi. 2009b. Efficient Continuous Nearest Neighbor Query in Spatial Networks Using Euclidean Restriction. In Proceedings of SSTD, Vol.ย 5644. 25โ€“43.
  • Dianping ([n.d.]) Dianping. [n.d.]. https://www.dianping.com/.
  • DiDi ([n.d.]) DiDi. [n.d.]. https://www.didiglobal.com.
  • Dijkstra (1959) Edsgerย W. Dijkstra. 1959. A note on two problems in connexion with graphs. Vol.ย 1. 269โ€“271.
  • Geisberger etย al. (2008) Robert Geisberger, Peter Sanders, Dominik Schultes, and Daniel Delling. 2008. Contraction hierarchies: Faster and simpler hierarchical routing in road networks. In Proceedings of WEA. Springer, 319โ€“333.
  • He etย al. (2019) Dan He, Sibo Wang, Xiaofang Zhou, and Reynold Cheng. 2019. An efficient framework for correctness-aware kNN queries on road networks. In Proceedings of ICDE. IEEE, 1298โ€“1309.
  • Jiang etย al. (2023) Wei Jiang, Guanyu Li, Mei Bai, Bo Ning, Xite Wang, and Fangliang Wei. 2023. Graph-Indexed k NN Query Optimization on Road Network. Electronics 12, 21 (2023), 4536.
  • Jiang etย al. (2021) Wei Jiang, Fangliang Wei, Guanyu Li, Mei Bai, Yongqiang Ren, and Jingmin An. 2021. Tree index nearest neighbor search of moving objects along a road network. Wireless Communications and Mobile Computing 2021 (2021), 1โ€“18.
  • Kolahdouzan and Shahabi (2004) Mohammadย R. Kolahdouzan and Cyrus Shahabi. 2004. Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases. In Proceedings of VLDB. 840โ€“851.
  • Kolahdouzan and Shahabi (2005) Mohammadย R. Kolahdouzan and Cyrus Shahabi. 2005. Alternative Solutions for Continuous K Nearest Neighbor Queries in Spatial Network Databases. GeoInformatica 9, 4 (2005), 321โ€“341.
  • Lee etย al. (2010) Kenย CK Lee, Wang-Chien Lee, Baihua Zheng, and Yuan Tian. 2010. ROAD: A new spatial object search framework for road networks. IEEE transactions on knowledge and data engineering 24, 3 (2010), 547โ€“560.
  • Li etย al. (2018) Chuanwen Li, Yu Gu, Jianzhong Qi, Jiayuan He, Qingxu Deng, and Ge Yu. 2018. A GPU Accelerated Update Efficient Index for kNN Queries in Road Networks. In Proceedings of ICDE. 881โ€“892.
  • Li etย al. (2023) Jiajia Li, Cancan Ni, Dan He, Lei Li, Xiufeng Xia, and Xiaofang Zhou. 2023. Efficient k NN query for moving objects on time-dependent road networks. The VLDB Journal 32, 3 (2023), 575โ€“594.
  • Luo etย al. (2018) Siqiang Luo, Ben Kao, Guoliang Li, Jiafeng Hu, Reynold Cheng, and Yudian Zheng. 2018. Toain: a throughput optimizing adaptive index for answering dynamic k nn queries on road networks. Proceedings of VLDB 11, 5 (2018), 594โ€“606.
  • Mouratidis etย al. (2006) Kyriakos Mouratidis, Manย Lung Yiu, Dimitris Papadias, and Nikos Mamoulis. 2006. Continuous Nearest Neighbor Monitoring in Road Networks. In Proceedings of VLDB. 43โ€“54.
  • Nodarakis etย al. (2017) Nikolaos Nodarakis, Angeliki Rapti, Spyros Sioutas, Athanasiosย K Tsakalidis, Dimitrios Tsolis, Giannis Tzimas, and Yannis Panagis. 2017. (A) kNN query processing on the cloud: a survey. In Algorithmic Aspects of Cloud Computing: Second International Workshop, ALGOCLOUD 2016, Aarhus, Denmark, August 22, 2016, Revised Selected Papers 2. Springer, 26โ€“40.
  • OpenRice ([n.d.]) OpenRice. [n.d.]. https://www.openrice.com/.
  • OpenTable ([n.d.]) OpenTable. [n.d.]. https://www.opentable.com.au/.
  • Ouyang etย al. (2018) Dian Ouyang, Lu Qin, Lijun Chang, Xuemin Lin, Ying Zhang, and Qing Zhu. 2018. When hierarchy meets 2-hop-labeling: Efficient shortest distance queries on road networks. In Proceedings of SIGMOD. 709โ€“724.
  • Ouyang etย al. (2020a) Dian Ouyang, Dong Wen, Lu Qin, Lijun Chang, Ying Zhang, and Xuemin Lin. 2020a. Progressive top-k nearest neighbors search in large road networks. In Proceedings of SIGMOD. 1781โ€“1795.
  • Ouyang etย al. (2020b) Dian Ouyang, Long Yuan, Lu Qin, Lijun Chang, Ying Zhang, and Xuemin Lin. 2020b. Efficient shortest path index maintenance on dynamic road networks with theoretical guarantees. Proceedings of the VLDB Endowment 13, 5 (2020), 602โ€“615.
  • Papadias etย al. (2003a) Dimitris Papadias, Yufei Tao, Greg Fu, and Bernhard Seeger. 2003a. An optimal and progressive algorithm for skyline queries. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data. 467โ€“478.
  • Papadias etย al. (2003b) Dimitris Papadias, Jun Zhang, Nikos Mamoulis, and Yufei Tao. 2003b. Query processing in spatial network databases. In Proceedings of VLDB. Elsevier, 802โ€“813.
  • Ph.D. (er 1) Frederickย Muench Ph.D. 2010, November 1. The Burden of Choice. https://www.psychologytoday.com/ca/blog/more-tech-support/201011/the-burden-choice.
  • Robertson and Seymour (1984) Neil Robertson and Paulย D Seymour. 1984. Graph minors. III. Planar tree-width. Journal of Combinatorial Theory, Series B 36, 1 (1984), 49โ€“64.
  • Schwartz (2004) Barry Schwartz. 2004. The paradox of choice: Why more is less. New York (2004).
  • Shahabi etย al. (2003) Cyrus Shahabi, Mohammadย R. Kolahdouzan, and Mehdi Sharifzadeh. 2003. A Road Network Embedding Technique for K-Nearest Neighbor Search in Moving Object Databases. GeoInformatica 7, 3 (2003), 255โ€“273.
  • Shen etย al. (2017) Bilong Shen, Ying Zhao, Guoliang Li, Weimin Zheng, Yue Qin, Bo Yuan, and Yongming Rao. 2017. V-tree: Efficient knn search on moving objects with road-network constraints. In Proceedings of ICDE. IEEE, 609โ€“620.
  • Sthapit (2018) Erose Sthapit. 2018. The more the merrier: Souvenir shopping, the absence of choice overload and preferred attributes. Tourism management perspectives 26 (2018), 126โ€“134.
  • Trip ([n.d.]) Trip. [n.d.]. https://www.trip.com/.
  • Tugend (y 27) Alina Tugend. 2010, February 27. Too many choices: A problem that can paralyze. https://www.nytimes.com/2010/02/27/your-money/27shortcuts.html?_r=1&.
  • Uber ([n.d.]) Uber. [n.d.]. https://www.uber.com/.
  • Xu etย al. (2005) Jinbo Xu, Feng Jiao, and Bonnie Berger. 2005. A tree-decomposition approach to protein structure prediction. In 2005 IEEE Computational Systems Bioinformatics Conference (CSBโ€™05). IEEE, 247โ€“256.
  • Yelp ([n.d.]) Yelp. [n.d.]. https://www.yelp.com/.
  • Zheng etย al. (2016) Bolong Zheng, Kai Zheng, Xiaokui Xiao, Han Su, Hongzhi Yin, Xiaofang Zhou, and GuoHui Li. 2016. Keyword-aware continuous kNN query on road networks. In Proceedings of ICDE. IEEE Computer Society, 871โ€“882.
  • Zhong etย al. (2015) Ruicheng Zhong, Guoliang Li, Kian-Lee Tan, Lizhu Zhou, and Zhiguo Gong. 2015. G-tree: An efficient and scalable index for spatial search on road networks. IEEE Transactions on Knowledge and Data Engineering 27, 8 (2015), 2175โ€“2189.