Sub-quadratic -approximate Euclidean Spanners, with Applications
Abstract
We study graph spanners for point-set in the high-dimensional Euclidean space. On the one hand, we prove that spanners with stretch and subquadratic size are not possible, even if we add Steiner points. On the other hand, if we add extra nodes to the graph (non-metric Steiner points), then we can obtain -approximate spanners of subquadratic size. We show how to construct a spanner of size , as well as a directed version of the spanner of size .
We use our directed spanner to obtain an algorithm for computing -approximation to Earth-Mover Distance (optimal transport) between two sets of size in time .
1 Introduction
A spanner for a metric defined by the graph is a subgraph such that all shortest path distances in approximate shortest path distances in up to some approximation . Originally introduced in [PS89], the notion has been very influential in theory and practice — see, for example, the recent survey [ABS+20] and earlier surveys [Epp99, Zwi01]. The classic result is that, for any integer , for any weighted graph , one can construct a spanner of size approximating all distances up to factor [PS89], termed stretch. This is tight under the Erdös girth conjecture [Erd63].
Graph spanners have been studied in a number of settings and variants. While we refer the reader to the survey [ABS+20] for the vast list of relevant papers, we highlight that there are a number of parameters/versions of spanners considered: approximation (e.g., multiplicative, additive, and others), which pairs to preserve, properties of the graph , computation time, data structures (distance oracles), etc.
An almost disjoint line of research concerns geometric spanners, where the graph is defined implicitly as the pairwise distance in some metric ; introduced in [Che86] even before spanners where defined. In particular, here is a complete graph, where each pair of nodes is connected with an edge with weight equal to their distance. The classic results here show that an -point metric in the Euclidean space admits a -approximation spanner of size [Cla87, Kei88, KG92, ADD+93]; see also the survey [NS07].
More recent progress [LS22, BT22] showed tightness of the bound , but also that Steiner spanners can improve the size to . A Steiner spanner is a graph whose nodes are the pointset together with some extra Steiner points . Whenever an edge is added to the spanner, its weight is the direct distance between the corresponding points (so that no pair is shortcut).
In contrast to the two extremely well-studied settings from above, the high-dimensional geometric spanners have surprisingly received much less attention. In particular, we are aware of only one (theoretical) result: [HPIS13] who obtained -spanner of size for any . This is despite the ubiquity of high-dimensional datasets and the fact that many (massive) graphs are actually “similarity graphs” 111While similarity graph is a bit different from the aforementioned complete graph — e.g., the edges are only for pairs of points at distance — the tools from here apply to such graphs nonetheless., obtained by connecting close points of some dataset — subsequently used for various clustering applications; see recent [DEL+22] and references therein. Indeed, the spanner from [HPIS13] has been implemented in [CHJ+22], yielding orders of magnitude speed up in graph building, without affecting down-the-stream clustering which leverages the spanner only.
To put our lack of understanding into perspective, we don’t even know if we can obtain subquadratic-size spanners for approximation .
1.1 Our contributions
We build new high-dimensional geometric spanners in the approximation regime, asking the following main question:
Question 1.1.
Do there exist spanners for high-dimensional Euclidean pointset with approximation and subquadratic size?
From now on, we assume we have an -point dataset under the Euclidean distance, where . For a graph , we use to denote the shortest path distance between to in . We also let be the aspect ratio of : .
Lower bound for spanners.
We first give indication that it is impossible to obtain approximation with a sub-quadratic-sized spanner even if we allow Steiner points. In particular, we prove the following two lower bounds for the Hamming metric and Euclidean metric, respectively.
Theorem 1.2 (See Section 2.1).
For any , there exists a dataset such that the following holds. Fix a graph with vertices identified with , such that for any ,
Then must have edges. This remains true even if contains Steiner vertices corresponding to points in .
Theorem 1.3 (See Section 2.2).
For any , there exists a dataset such that the following holds. Fix a graph with vertices identified with , such that for any ,
Then must have edges. This remains true even if contains Steiner vertices corresponding to points in .
Spanners with non-metric Steiner nodes.
Next, we show that if we consider a more general class of Steiner spanners, where we allow graph nodes that do not correspond to points , we can indeed obtain -spanners of subquadratic size, answering positively the question from above. We refer to such spanners as non-metric Steiner spanners.
Furthermore, we distinguish two types of non-metric Steiner spanners: undirected, and directed. In particular, a directed spanner for sets with stretch is a directed graph with vertex-set , satisfying for all :
where as the shortest path from to . When we have one dataset , we take to be a copy of . Note that directed spanner only makes sense for non-metric Steiner version.
Theorem 1.4 (See Section 3).
Fix . For any dataset , there exists a directed graph on vertices , such that for any ,
Furthermore, the number of edges of , the size of , and the construction time are all bounded by , where is the aspect ratio of .
While the directed spanner turns out to be already sufficient for our application below, there other applications which may require an undirected graph (as many graph problems are easier for undirected graphs). Thus, we also show that one can construct an undirected spanner, albeit the parameters are less efficient than for the directed case.
Theorem 1.5 (See Section 4).
Fix . For any dataset , we can construct an undirected graph on vertices , such that for any ,
Furthermore, the construction time, the number of edges of and the size of are bounded by , where is the aspect ratio of .
Sub-quadratic algorithms for the Earth-Mover Distance in high dimensions problem
We highlight one concrete algorithmic application of our spanner construction. In particular, we consider the problem of Earth-Mover Distance in the high-dimensional spaces: given two -points sets , compute the smallest cost bi-chromatic matching where ranges over all permutations. In different communities, the EMD problem is also termed optimal transport, Wasserstein distance, Monge-Kantarovich, and others.222Formally, all these are defined over distributions over points, equivalently weighted pointsets. All our results apply to the weighted version as well, and hence we ignore the difference for simplicity of presentation.
The basic problem is to compute the (near-) optimal efficiently. When dimension is small, a long progression of results, starting from [Vai89] and culminating in [KNP19] yielded an -approximate algorithm running in time .
Another line of research concerns the general metric (or cost) setting, i.e., more general than the Euclidean setting. Here, the best possible runtime is , which is necessary in general. One approach uses Sinkhorn iteration algorithm to obtain additive error of [Cut13, ANWR17]. A different approach uses faster linear programming algorithms to solve the standard bi-partite matching in time near-linear in the size of the (complete) bi-partite graph on [VDBLL+21, CKL+22] — this approach can obtain an exact result.
Nevertheless, the most common setting for general cost is still the high-dimensional Euclidean (or ) space (see, e.g., the experiments from [ANWR17], or this most recent paper [ARSS23]). In this setting, one can hope for runtime , as the input size is only .
The best algorithm for the high-dimensional setting obtains runtime of for -approximation, for . This follows from using the spanner of [HPIS13] in the following general approach. One can take the pointset and build a spanner on it. Then the EMD problem reduces to solving the “graphical EMD” case: EMD under the metric defined by the shortest path distance in . The latter problem is equivalent to the problem of uncapacitated min-cost flow, or transshipment problem. This problem in turn can be solved in time near-linear in the graph size [She17, ASZ20, Li20]. While these algorithms are for undirected graphs (and hence would require an undirected spanner), the most recent algorithm does solve the min-cost flow problem in directed graph in near-linear time [CKL+22].
Thus, we obtain the following statement, which is direct corollary of our directed spanner Theorem 1.4 combined with [CKL+22]:
Theorem 1.6.
For any , given two sets of points of size , we can solve the problem within approximation in time time.
1.2 Our techniques
The natural place to start is to understand why [HPIS13] does not answer our main Question 1.1, instead obtaining a subquadratic-sized spanner only for approximation . First, we note that it is enough to solve the problem for some fixed scale : for given dataset , build a spanner where all pairs of points at distance have a path of length , while no pairwise distance contracts. For this, the authors use Locality-Sensitive Hashing, originally designed to solve the Approximate Near Neighbor Search (ANNS) problem in :
Definition 1.7 ([HIM12]).
Fix dimension , approximation , and scale . A distribution over hash functions , for some discrete set , is called -Locality Sensitive Hashing (LSH) if:
-
•
for any two points with , we have ;
-
•
for any two points with , we have .
It is known that one can obtain -LSH with for any [AI08]. The overall [HPIS13] algorithm proceeds as follows. Sample hash functions from -LSH where . Each such hash function partitions the dataset into buckets. In each bucket, we connect all points to a fixed point with an edge of length — i.e., add a star to the graph . One can immediately deduce that points at distance will be connected by 2-hop path of length , and that the total size is . Furthermore, since , one can argue that no pair at distance ends up in the same bucket.
Note the size is sub-quadratic only for , and there are two contributions to the exponent 12. The first one is the “star” gadget inside the bucket: most colliding points are connected by a 2-hop path that imposes a factor-2 stretch. The second one is the fact that we need to set to ensure that no pair of points at distance collide in a bucket.
As we discuss below, both these sources of stretch loss are intrinsic to the algorithm and require new ideas to overcome.
Lower bound.
First, we show that we cannot obtain approximation using vanilla spanners or even spanners with Steiner points in the Euclidean/Hamming metric. Indeed, consider a random pointset on the surface of a unit sphere in . Then all distances are concentrated around . It is immediate that, for any spanner , any pair of points without a direct edge will have stretch 2. One can also show that even if we have Steiner points in , then a subquadratic-sized spanner much have stretch at least . For intuition, note that the best Steiner point for is the center of the sphere. A similar lower bound holds for Steiner spanners in the Hamming space.
Upper bound.
To deal with the first source of approximation from [HPIS13] algorithm, we consider adding Steiner vertices to the spanner that does not represent points in the metric itself. In particular, in the above equidistant case, the solution is to add a node to the graph, and connect it to each of the points in with an edge of length . We term this a “star gadget” below.
Now we can focus on the second source of stretch: that we need to set as low as , which is a much more significant technical challenge. The natural first question would be: why not set and hence for ?
Consider a pair of points at distance that we would like to collide in some bucket, in which case they are connected by a path of distance (using our new star gadget from above). It is immediate to argue that will collide with probability , and hence over all hash functions. The tricky part is to ensure that this successful bucket does not also contain a pair of far points (i.e., at distance ), in which case the star gadget will shortcut the pair.
To analyze the probability of far pair of points falling into the bucket, we can consider, for example, the probability that a “far” point at distance collides with . Standard analysis says that the expected number of far points is , and hence, by Markov’s, there’s, say, probability that no far point collides with . The crucial caveat is that this probability is not independent of the event that collide: i.e., it may be the case that whenever collide, the number of colliding far points is much larger than the expectation.
We are lead to consider the “3-point collision” probability: . If we were able to prove that these two events are essentially independent, we would be all set. Such property has indeed been considered, at least in [AR15] who used it for data-dependent LSH algorithms. They proved such desired independence by a very intricate reduction to the “pseudo-random” case: where is essentially random on the sphere, except for the pair of interest . Alas, their reduction does not seem applicable here for a couple of reasons. The first is that the independence holds only in the case when far point is far from both and . The second is that, while in [AR15] one can think of being a “rare” close point to (this helps in the reduction), in our setting, there could be both close points and far points for most ’s!
Our approach is different and uses two main ideas: the first enough to get directed spanners, and the second necessary for the undirected spanners.
Upper bound: first idea.
We circumvent the “3-point collision” analysis by considering a different setting of the hashing, roughly corresponding to the ANNS algorithm with sub-polynomial query time. In particular, we use the hashing scheme from [ALRW17]. Think of the dataset as being two sets and , and we need to worry only about the collision of points vs (in the end we take ). Then [ALRW17] design a hashing scheme such that for some , there’s a function which describes for each point which set of buckets it is hashed to, and there’s an equivalent function. Then we have that, for any pair at distance , we have that with constant probability. Furthermore, and for satisfying some relation, and the runtime to compute is roughly proportional to their output size. It is possible to set in which case (we note that this bound is tight [ALRW17]).
Our solution uses the above asymmetric hashing as follows: split the dataset into groups of size . For each group, preprocess as above, taking a total of time, and then query each , taking a total of time. We can then prove that, for every close pair , with probability at least , there exists a bucket where 1) collide, but 2) contains no point that is far from .
The remaining caveat is that we need to verify that a bucket succeeded — all hashed are close to each other — as we cannot add star gadget to the failed ones (otherwise, we create shortcuts). For this, we need to employ furthest neighbor search data structure. A naïve adaptation of the best currently known data structure, would result in extra runtime , a slow-down we manage to avoid.
Upper bound: second idea.
The above procedure obtains only a directed spanner since we guarantee distances between and but among pairs or pairs. This turns out to be much more difficult to ensure.
In fact, our original “star gadget” is not even enough to get an undirected spanner. In particular, consider the case when, we have points drawn from sphere of radius and drawn sphere of radius . Then the distance between a pair is typically ; fix . At the same time the distance between points is . It seems impossible to construct an undirected spanner of sub-quadratic size with stretch using the “star gadget” from before — because it would shortcut -to- distances.
Instead, we introduce a new gadget: asymmetric star gadget. In particular, in the above example, we have a Steiner vertex that is connected with distance to points and with distance to points . Now the distance -to- is ; -to- is ; and -to- is as well. While the -to- distance is stretched (it will be taken care of at a different scale ), the -scale distances are taken care of, and there are not shortcuts created.
To be able to use this gadget, we start from where we left off with the directed spanner: a pair of sets of such that each is close to each . We show that in this case, there exists with , such that we can decompose (and ) into smaller parts so that the diameter of each part is (and respectively). The number of such parts is bounded by and respectively. Then we use the asymmetric gadget on each pair of such parts.
Acknowledgements
We thank Negev Shekel Nosatzki for important discussions and suggestions for the undirected spanner construction in Section 4. Research supported in part by NSF (CCF2008733) and ONR (N00014-22-1-2713).
2 Lower Bound for Spanners, with or without Steiner Points
In this section, we prove the lower bounds on the size of the spanner for geometric pointsets. We consider spanners that contains Steiner nodes corresponding to points in the metric. We note that since the spanner cannot shortcut any pair, for any spanner edge connecting two points , the optimal length of the edge is the distance between points and .
We prove lower bounds for both the Hamming and Euclidean () spaces separately.333While it is known that can be embedded into (and into -squared), a lower bound for the () space does not immediately imply a lower bound for the (resp. ) space since the Steiner points are to be taken in the considered space. Both lower bounds use the following lemma, which we prove in Section 2.3.
Lemma 2.1.
Consider a graph and integer . Let be any disjoint sets. Suppose and . For every and , let be any path from to in . Then, there exists distinct points and distinct points , such that there is a node that lays on all paths .
2.1 Lower bound in the Hamming metric
We prove the following lower bound on Steiner spanner size in the metric.
Theorem 2.2 (Restatement of Theorem 1.2).
For any , there exists a point set such that the following holds. Let be any spanner that can use extra Steiner points , but can only use the distance as the edge weight between two points. Suppose for all , the distance between and on graph satisfies
then must use edges.
Proof of Theorem 2.2.
Let be chosen as every point is i.i.d. uniformly sampled from the hamming cube with . With high probability, every pair of points in has distance .
For every two points , we define , where is taking average of and . Intuitively, for , the set contains all possible Steiner points in such that the path won’t have large approximation: if , then since .
The following claim shows that has small volume.
Claim 2.3.
For every , if are uniformly sampled from , then
Proof.
Define be the set of coordinates where and agree on. has an equivalent definition . For , let be the random variable that if and , otherwise . If are sampled uniformly randomly from , then each with probability and with probability, and all are independent. And is equivalent to .
By Chernoff bound , replacing with , we have
∎
We are now ready to proceed with the main argument. For any , let be any spanner stated in this theorem. Suppose has edges for some tbd. By Lemma 2.1, there must exist distinct points such that the shortest path between intersect on some vertex (which could be Steiner or an original point in ). Since the edge weight can only use the actual Hamming distance, we must have for all . Thus, must be in every by definition of .
Define the event as follows: there exists a point , and distinct points such that . We have argued that if the claimed spanner with edges exists for every , then . We will show next that when , the probability happens (over the random choice of ) is less than . Hence, there exists a dataset for which does not happen, and therefore a claimed spanner does not exist for .
By Claim 2.3, for a fixed , we have
Therefore, for ,
Thus, we conclude by union bound on all points , and therefore the theorem follows. ∎
2.2 Lower bound in metric
We now prove the lower bound for the metric:
Theorem 2.4 ( lower bound).
For any , there exists a dataset , such that the following holds. Let be any spanner that can use extra Steiner points , but can only use the distance as the edge weight between two points. Suppose for all , the distance between and on graph satisfies
then must use edges, where .
First, we give some definitions. For any and two points , let be defined as . If is clear from the context, we write it as .
The following lemma shows that for a fixed point , it is rare contains for a random pair of points and .
Lemma 2.5.
For , let be any point in the unit ball. Let be sampled uniformly from unit sphere . Then
Proof.
It is enough to bound for any fixed . By symmetry, this is equivalent to
where , and can be any two fixed points on that , say, .
Note that is an ellipsoid with focal points on and . Let . Let
This means , where is the sphere cap including all points for which . Using the bound on the volume of a high dimensional sphere cap given by [Tko12], we get
∎
Proof of Theorem 2.4.
Let be generated as every point is i.i.d. uniformly sampled from unit sphere with . With high probability, every pair of points in has distance .
Let be any spanner stated in this theorem with edges for . Wlog, we can assume every Steiner point of is in the unit ball , since otherwise we can project them onto the unit sphere. By Lemma 2.1, there must exists distinct points such that the shortest path between intersect on some point . Since the edge weight can only use the actual distance in the Euclidean space, we must have for all . Let be the closest point to such that every coordinate of is a multiple of , and thus , and therefore, satisfies that for all ,
Let include all points in ball whose coordinates are multiple of and let the event be: there exists a point and distinct points such that for all . The above argument indicates that if such a spanner exists, then must happen. Next, we will show that happens with probability for a random dataset ; therefore, we assert that such spanner does not exist for some .
Let be any point. Since every pair of points in has distance , by Lemma 2.5, we have
By union bound on all , we have
therefore, the theorem follows. ∎
2.3 Proof of Lemma 2.1
Proof of Lemma 2.1.
Suppose there is a graph for which the property doesn’t hold. We will turn into another bipartite graph with less than edges but can connect all pairs of points and by a direct edge. Thus, by proof of contradiction, we can conclude such graph doesn’t exist.
First of all, we assume every path never touches or except its endpoints, since otherwise, we can create dummy points for every point in and and let the path use dummy points. If the new graph has paths intersecting on one point, then so does the original graph.
Next, to simplify the proof, we give direction to edges. Let be any path, and let be points on the path in order. We say the directed edge if and for some . We assume only has useful edges. From now on, is a directed graph and
To turn into , we have three intermediate steps – from to , , , then , each graph has better properties. In the first graph , we will have , , and every vertex is light which we will explain later. The procedure of making is as follows.
For every vertex , let be initially empty. For every path with and , let be the points on the path. For every , we add to .
We show that we modify the graph to obtain a simpler condition — that contains of either distinct ’s or distinct ’s. For every vertex , can be regarded as a bipartite graph with a collection of edges between and . In this bipartite graph, the maximum matching is less than , otherwise there are paths with distinct endpoints intersecting on .
Theorem 2.6 (Extended Hall’s theorem).
For any bipartite graph with , the cardinality of maximum matching is
where range over all subset of and separately, and denote the set of neighbors of the set in .
Apply the above theorem on the bipartite graph , there must be a set or such that . We must have and . The analysis for the cases and would be similar, so we assume the first case. Let and . Note that there is no path in crossing with and . In the new graph , we create three copies of : and and , where and . So . For each edge in , we create at most edges linking each pair of copies of and . Thus, .
Let be vertices on path . For all , we replace with its copy where and get a path in . In the new graph , for every vertex , let be initially empty. For every path with and , let be the points on the path. For every , we add to and add to .
Since from the construction of , for every vertex , we have either or . If , we call left-light, otherwise we call right-light. So there are four types of points in : , , left-light and right-light points.
For any graph and a vertex , we call as the set of edges entering , and as the set of edges leaving . In some context, we also refer as those vertices connected with an edge entering/leaving .
Then we construct the next graph as follows. Initially, . Then, for every left-light vertex , we delete all edges in and add edge for every . For every right-light vertex , we delete all edges in and add edge for every . Intuitively, the construction is replacing edges with shortcuts. One can check that every pair of points is still connected with a path in . The number of edges is bounded by . The purpose of constructing is to have for left-light vertices , and for right-light vertices . Thus, is in fact a four-layered graph ordered from – left-light vertices– right-light vertices – , and every edge must go strictly from lower level to higher level.
To construct from , we need to remove all points in . Define the removing procedure applied on vertex on graph , as follows: delete all edges connecting , and then for every vertex and , add an edge from to .
We apply on every left-light vertices in to get graph , then apply on every right-light vertices in to get graph . The order of deleting does not affect the outcome since there are no edges connecting two left-light vertices, nor two right-light vertices. We have
where the last step follows by for left-light vertices .
Furthermore, we have
Thus, . And is our – a bipartite graph that connects all pairs of points from and with less than edges. ∎
3 Directed Non-metric Steiner Spanners
In this section we construct a directed non-metric Steiner spanner, proving Theorem 1.4.
Theorem 3.1 (Restatement of Theorem 1.4).
Fix . For any dataset , there exists a directed graph on vertices , such that for every ,
Furthermore, the number of edges of , the size of , and the construction time are all bounded by , where is the aspect ratio of .
We prove this theorem by reducing to the one-scale case: we fix distance , and we construct a spanner such that, for all pairs of points within distance , they are all connected in with a proper path. Here is the formal statement.
Lemma 3.2.
Let be a point set on the unit sphere. Let , , and is the approximation. There is an algorithm running in time, that can generate a directed graph with . Here is the set of extra non-metric Steiner points. With high probability, satisfies:
-
(1)
, ;
-
(2)
with , there is a 2-hop directed path in from to of length .
The proof of this lemma is in the rest of this section. For now, we finish the proof of our main theorem using Lemma 3.2.
Proof of Theorem 3.1.
In Lemma 3.2, we show how to build a spanner, so that every pair of points with are handled, given any parameter . We can do doubling on from least distance to largest distance to create a sequence of spanners . Then is the final spanner.
There is only one issue to solve in applying Lemma 3.2: that we need to be on the sphere and has to be about . Indeed, it is possible to reduce any Euclidean instance to such an instance, while preserving the distances at the scale . See one such reduction in Corollary A.2 from [AIMN23] (based on [BRS11, Lemma 6]).
∎
3.1 Algorithm for one scale
The algorithm heavily uses the data-independent (asymmetric) locality sensitive hashing from [ALRW17] as a subroutine. We use the following primitive from [ALRW17]; while this lemma is not explicitly given in [ALRW17], we show how it is obtained in Appendix A.
Lemma 3.3 (Data-independent LSH from Section 3 of [ALRW17]).
Let be a fixed parameter, be the approximation ratio. There is a data structure that, takes a random seed , implicitly 444 can be super-polynomially large. gives two random collections of subspaces and . For any point , denote as the set of subspaces that contains . Similarly, define . Then, for any , we have
-
1.
, ;
-
2.
, ;
-
3.
that , ,
-
4.
that , ,
-
5.
, one can find all in time, and all in time.
Description of algorithm for directed spanner construction. The formal algorithm is presented as Algorithm 1. The algorithm takes as input a dataset , a distance parameter , and a approximation factor , and output a Steiner spanner graph . It never create “shortcut”, i.e., cannot be smaller than . For those , it guarantees that .
The algorithm consists of rounds. In each round, for every pair of points at distance , they have chance being connected by a 2-hop path of total length of . Furthermore, we avoid creating a short-cut on any pair of points. Thus, after rounds, we expect our guarantee is satisfied.
In each round, we first arbitrarily partition all points into groups of size . Denote groups as . Then, we use Locality Sensitive Hashing from Lemma 3.3 (with parameters be the approximation and for some ). Note that we use the same LSH function for all groups. Assume are half-spaces generated by LSH function. Define sets and . Intuitively, every point in will be close to every point in for all , because it’s exactly the building and querying procedure of a nearest neighbor data structure – in the pre-processing phase, falls into -th bucket, and in the query phase, queries the -th bucket.
To guarantee that some is at distance from all the points in (which may fail to happen with non-trivial probability), we perform a Furthest Neighbor Search (FNS) check. For the latter, we use an FNS data structure, stated in the lemma below. It follows immediately from the reduction from [Ind01] to a nearest neighbor search data structure, which we instantiate with the data structure from [ALRW17].
Lemma 3.4 (FNS data structure [Ind01, ALRW17]).
There is a -approximate Furthest Neighbor Search data structure with preprocessing time and query time, for any satisfying the trade-off of:
We build an FNS data structure (Lemma 3.4) for each with approximation and parameters and . For every point , we check if every point in is approximately -close to . If so, we can safely put into . As a result, every point and every point has Euclidean distance so we can create an extra Steiner node as the center and add a star gadget. We will prove later that if are close, they will fall into and with probability, for some .
3.2 One scale correctness: Proof of Lemma 3.2
We now prove Lemma 3.2. We first analyze the size of sets and , and then bound the size of spanner.
Observation 3.5.
Let and be defined in Algorithm 1. We have
-
1.
for all , and .
-
2.
;
-
3.
;
-
4.
,
-
5.
and .
Proof.
We need to prove (2), (3), (4).
For (2), (3), by Lemma 3.3, each is mapped to different , and each is mapped to different . For (4), we note that for every fixed , any can appear in for at most one value of . ∎
Proof of Lemma 3.2.
We are going to prove the lemma in four aspects: the graph our algorithm output satisfies both property (1) and (2), also the edge number and running time are both upper bounded by . For simplicity, we will run Algorithm 1 with . The lemma then follows with a substitution.
Property 1: , .
The only situation we add edges is when creating paths between and , and all paths have length . Because every enters only after it passes FNS test (Line 16), which means every point in is close to . So property (1) is true.
Property 2: with , there is a 2-hop directed path in from to of length .
Let be any pair of points with . Let’s compute the probability of connecting to with a path in one round.
Suppose is partitioned into group . We are going to calculate the probability that and , for some . Define as the set of far points of . If , while is included in , then will pass the FNS test and thus be included in .
Since , by union bound on all , we can bound the probability that the bad event happens:
Thus, by choose a proper , we have
Also, By Lemma 3.3, we have
Thus, with probability, and for some , and hence the spanner will have a 2-hop path from to , via , with a path of length . By repeating rounds, connects to with high probability.
Bounding the number of edges. In one round, the total number of edges added from to is at most , where the last step is by Observation 3.5. So the total number of edges from to is at most .
On the other hand, the total number of edges from to is at most
where the first step is because , the second step is by (Observation 3.5).
Thus, .
Running time. By Lemma 3.3, we can find all that contains for all in time, and all that contains for all in time.
The major running time cost is on building and querying the FNS data structure on every . We use Lemma 3.4 with and (See line 13).
In the preprocessing phase, we spend time to build the FNS data structure for every . The total time cost is
where the second step is because , and , the last step is by our choice of .
In the query phase, for every point , if is in , needs to query FNS data structure built on for every . Since is in at most different ’s, with our choice of , the query time cost on each point is at most . Thus, the query time for all , and also the overall running time is
∎
4 Undirected Non-metric Steiner Spanners
In this section, we show how to turn the directed spanner from Theorem 3.1 into an undirected spanner, with some loss in its size. Here is the formal statement:
Theorem 4.1 (Restatement of Theorem 1.5).
Fix . For any dataset , we can construct an undirected graph on vertices , such that for any ,
Furthermore, the construction time, the number of edges of and the size of are bounded by , where is the aspect ratio.
The proof of this theorem starts with the directed spanner constructed in the previous section, modifying it further. We note that the directed edges appear in line 20 to line 22 in Algorithm 1, where we link all to Steiner point with edge length and link to all with edge length . If these directed edges lose their direction, they will potentially create shortcuts between or (depending on how we would assign the weights to these edges).
We show how to replace the above (directed) gadget with a more involved (undirected) graph. For now, we focus on one pair. The high level idea is to partition and into clusters of bounded diameter and , respectively, with . Then, for each pair of clusters , we add a Steiner node with connecting to each with edge of length , and to with edge length . In this way, we connect and with path of length without shortcuting any pair of points. As long as the number of clusters is , the number of edges is .
We start with some definitions.
Definition 4.2 (Average square distance).
For a point set , define . Define .
Remark 4.3.
When only has two points , , since there is half probability that or . Thus, is can be smaller than .
The average square distance has the following properties.
Claim 4.4.
If two sets of points satisfy , then .
Proof.
Note that Claim 4.4 is also true for any subset of and . Thus, we have the following proposition.
Proposition 4.5.
If two sets of points satisfy , then .
Definition 4.6 (Low diameter decomposition).
Let be a point set. We say is an -diameter decomposition of , if they form a partition of , and every has diameter at most .
Lemma 4.7.
Fix . Suppose we have two point sets with . Let be any real number such that . Given , an -diameter decomposition of , and , a -diameter decomposition of , we can build an undirected graph such that for the shortest path distance in :
-
•
for all , , ;
-
•
for all , ,
and the size of is . The construction time is .
Proof.
Choose any and such that .
For every and , we create a Steiner point . For every , add edge with length . For every , add edge with length . Now, every are connected with distance , every are connected by , and every are connected by . Because has diameter at most , has diameter at most , the constraint always holds throughout this procedure.
The number of edges is at most . ∎
With the above lemma, it remains to show how to construct small low-diameter decompositions, which we will do next.
4.1 Computing low diameter decomposition
We now show that we can compute efficient low-diameter decompositions to be used in Lemma 4.7. We note that we use and . Hence we need an efficient algorithm to compute -diameter decomposition for any point set . We will use the notions of -close and -far. Namely, if two points have distance , we say they are a -close pair. If their distance is , we say they are a -far pair.
This section gives an important subroutine for the decomposition, presented as Algorithm 2. The input set has the guarantee that at least pairs of points are -close. The algorithm will then output disjoint clusters , in which every has diameter , and contains a significant portion of . is a parameter the algorithm can chose from .
The guarantees of the Algorithm 2 are captured by the following lemma.
Lemma 4.8 (cluster extraction).
Fix an -point set . Let be the approximation parameter. Let be a parameter such that there are at least -close pairs, where .
Proof.
In our algorithm, we pick a random -LSH function with and . We note that an -LSH function maps an -close pair into the same bucket with probability , and map a -far pair into the same bucket with probability .
We can assume, wlog, that the number of buckets is at most , since otherwise, we can randomly group buckets into buckets, increasing by at most a factor of . The expected number of -far pairs that collide is , and the expected number of colliding -close pairs is .
For any such LSH function , we denote and as the number of colliding far and close pairs, respectively. We use if is true, otherwise .
Since , we have
Also,
Combining the above two inequalities, we get
Since , we get
Furthermore, there must exist some bucket such that: 1) the number of close pairs colliding in bucket is , and 2) the number of colliding far pairs satisfies that .
This means, with probability , we succeed at the if-check in line 8, because we have a bucket with and . Pick points at random from this bucket , and call them . Then, for set defined as the -far pairs in , we have . Thus, with probability, points are pairwise -close and they are extracted in line 13. Note that each extraction will decrease by at most , so we can extract times safely as the same analysis holds.
Running time. We check line 8 for times in expectation. Each time we use time to implement the hashing. In each check, instead of using time calculating exact and , we can do sampling to estimate. samples can approximate w.h.p, and samples is enough for .
Total running time would be
∎
4.2 Proof of Theorem 4.1
The last step is to compute small-diameter decomposition for sets . In particular, we will use Algorithm 2 to compute -diameter decomposition of a set , where is the diameter of .
We accomplish this in the next three lemmas. Lemma 4.9 shows that most pairs inside a set are at distance at most . Lemma 4.10 shows a decomposition of a set into clusters with diameter . Lemma 4.11 shows how to decompose specifically from Algorithm 1.
Lemma 4.9.
Given a point set and , let the average distance be define in Def. 4.2, i.e., . Then, there are at least pairs of such that .
Proof.
By Markov’s, . Thus, there are fraction pairs of that have . ∎
Lemma 4.10.
Fix . Consider a set of size . We can construct , which is a -diameter decomposition of in time. Furthermore, we have .
Proof.
Since every has , we can compute a quantity in time satisfying (simply by computing an empirical average of a number of sampled pairs).
By Lemma 4.9, there are at least number of -close pairs. Let so that . Fix . Run . By Lemma 4.8, we extract points from and that takes time.
Denote the remaining part of as . Now, iteratively recalculate and run Extract Clusters on until no points are remained. Denote as the final partition of . By Lemma 4.8, is at most since each has size lower bound. And every has diameter at most . The total running time is . Replace with to conclude the lemma. ∎
Lemma 4.11 (Decompose ).
Proof.
The idea is to compute low-diameter decomposition for all of , and the decomposition for can be calculated in a straightforward way.
Fix . For every , we iteratively run Algorithm 2 with parameters Extract Clusters(, , , , ) and delete the extracted points from , until we cannot extract anymore — specifically, until the Repeat loop (line 10) cannot be satisfied within the bounded time Lemma 4.8 gives, which can be checked directly. Let the extracted clusters be () and the remaining set be (). Note that every has diameter .
Now consider one . Let be defined in Def. 4.2. Because every point in is -close to , we have . We claim that there exists such that the partition of defined as consists of at most clusters, and therefore, this is the decomposition of we want. The only thing we need to prove is that .
Let . By Lemma 4.9, there are at least pairs of with . Let be the least multiplicity of . So . Since as a remaining set fails on , by Lemma 4.8, this means the number of -close pairs is at most . This means . Therefore, .
There are at most different and we can try all to find the correct one. Use Lemma 4.10, computing the decomposition for all and uses at most running time. Thus, decomposing all is done in time
∎
Proof of Theorem 4.1.
Suppose we are in (Algorithm 1). Let , , be as in the algorithm. Now, instead of doing line 20 to line 22, we do the following.
First, compute an -diameter decomposition of , denoted as , and compute an -diameter decomposition of , denoted as . Then, use the gadget in Lemma 4.7 to add edges in our undirected spanner. The reason we can use the gadget is because by Proposition 4.5. Thus pairs in are distorted by at most .
Decomposing all can be done in running time proportional to
Computing the decompositions of all can be done in time.
Size of spanner. By Lemma 4.7, every pair of and adds edges to the spanner. We have (Lemma 4.10) and (Lemma 4.11). Since , the total size of our spanner is at most
Here we copied the constraints for and from Observation 3.5.
-
1.
for all , and .
-
2.
;
-
3.
;
-
4.
,
-
5.
and .
We bound the two terms as follows.
- 1.
- 2.
Concluding, we proved that the size of our spanner is . ∎
5 Removing dependence on
If we ignore the time cost for constructing, we can remove the dependence on in the number of edges used in the spanners. Below we show this for the directed spanner reducing the size from to ; similar result for the undirected spanner is analogous. We use the idea from [HPIS13].
We start by defining the -net.
Definition 5.1 (-net).
Let be a metric space, and let . A maximal set , such that for any , , is called a net for .
Assume the dataset has closest distance and largest distance . Let . Define so that . For every , let be a -net for . This means every point not in must have a near neighbor in that is close, and we call this point the leader of , denoted as .
We can pick those nets so that
We suppose to run Lemma 3.2 on each net with parameter . But note that many point in may not even has a neighbor within distance . Thus, we will do the following to reduce the size of the spanner. For , let be the set of points in that has at least one neighbor within distance in . Then we run Lemma 3.2 on each with parameter to get . We claim that is a spanner with dilation.
We prove by induction on the distance that for every pair of point with , . Assume this is true for all pairs of point with distance . Let be . If , we let , otherwise we let be the leader of . Similarly, we define if or otherwise we let . Then . Since , by induction we must have . Since and , must be also in . Thus, we have
Thus, .
Also, because , so .
Next, we will upper bound the size of .
For every point and , we create an imaginary point called if . Denote as the largest integer that . We build a tree on all such points as follows. We link an edge between to for all , and link to where is the leader of . The root of the tree is , in which has only one point.
Consider any and such that . This means has a point that is close to . By properties of -net, and cannot co-exists in net , where . This means there must be a vertex on the chain having two children.
The tree has leaves and it has at most vertices having two children, so .
The size of the spanner is
Theorem 5.2 (Existence result for directed spanner).
Fix . For any dataset , there exists a directed graph on vertices with size , such that for any ,
Using the same technique, we can prove similar results for undirected graph.
Theorem 5.3 (Existence result for undirected spanner).
Fix . For any dataset , there exists an undirected graph on vertices with size , such that for any ,
Appendix
Appendix A Data-independent LSH from [ALRW17]
We show here how Lemma 3.3 follows from results in [ALRW17, Section 3]. We use the notations from [ALRW17], noting that our corresponds to from [ALRW17].
Definition A.1 (Definition of , ).
For any , let be any two points on a unit sphere at distance . Define as the cosine of the angle between and , and as the sine of the same angle.
Definition A.2 (Definition of and ).
Let be any point on . Define
For any , let be any point on with . Define
Note that doesn’t depend on the specific choice of and due to the symmetry of Gaussian.
Lemma A.3 (Lemma 3.1 of [ALRW17]).
For ,
Lemma A.4 (Lemma 3.2 of [ALRW17]).
For every , if and , one has
Lemma A.5 (Data-independent LSH from Section 3 of [ALRW17]).
Let be a fixed parameter. For any integer and , there is a data structure that, takes a random seed , implicitly gives two random collections of subspaces and . For any point , denote as the set of subspaces that contains . Similarly, define .
-
1.
, ;
-
2.
, ;
-
3.
,
-
4.
, one can find all in time, and all in time;
Lemma A.6 (parameters).
Let and is the approximation ratio. Then, for any , there is a choice of and such that
-
1.
;
-
2.
;
-
3.
.
-
4.
.
Proof.
Let be two parameters that and . Thus, can be written as . We set and as what [ALRW17] did in Section 3.3.3, in the case where .
The proof of part 3 follows the same proof as in [ALRW17]. Here we have
Also, part 2 is true because
where
For part 1, we have
where
Part 4 follows by computing directly and using the fact that . ∎
References
- [ABS+20] Reyan Ahmed, Greg Bodwin, Faryad Darabi Sahneh, Keaton Hamm, Mohammad Javad Latifi Jebelli, Stephen Kobourov, and Richard Spence. Graph spanners: A tutorial review. Computer Science Review, 37:100253, 2020.
- [ADD+93] Ingo Althöfer, Gautam Das, David Dobkin, Deborah Joseph, and José Soares. On sparse spanners of weighted graphs. Discrete & Computational Geometry, 9(1):81–100, 1993.
- [AI08] Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Communications of the ACM, 51(1):117–122, 2008.
- [AIMN23] Alexandr Andoni, Piotr Indyk, Sepideh Mahabadi, and Shyam S. Narayanan. Differentially private approximate near neighbor counting in high dimensions. In NeurIPS, 2023.
- [ALRW17] Alexandr Andoni, Thijs Laarhoven, Ilya Razenshteyn, and Erik Waingarten. Optimal hashing-based time-space trade-offs for approximate near neighbors. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 47–66. SIAM, 2017.
- [ANWR17] Jason Altschuler, Jonathan Niles-Weed, and Philippe Rigollet. Near-linear time approximation algorithms for optimal transport via sinkhorn iteration. Advances in neural information processing systems, 30, 2017.
- [AR15] Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. 2015. Full version at http://arxiv.org/abs/1501.01062.
- [ARSS23] Pankaj K Agarwal, Sharath Raghvendra, Pouyan Shirzadian, and Rachita Sowle. A higher precision algorithm for computing the -wasserstein distance. In The Eleventh International Conference on Learning Representations, 2023.
- [ASZ20] Alexandr Andoni, Clifford Stein, and Peilin Zhong. Parallel approximate undirected shortest paths via low hop emulators. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 322–335, 2020.
- [BRS11] Yair Bartal, Ben Recht, and Leonard J. Schulman. Dimensionality reduction: Beyond the johnson-lindenstrauss bound. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 868–887. SIAM, 2011.
- [BT22] Sujoy Bhore and Csaba D Tóth. Euclidean steiner spanners: Light and sparse. SIAM Journal on Discrete Mathematics, 36(3):2411–2444, 2022.
- [Che86] Paul Chew. There is a planar graph almost as good as the complete graph. In Proceedings of the second annual symposium on Computational geometry, pages 169–177, 1986.
- [CHJ+22] CJ Carey, Jonathan Halcrow, Rajesh Jayaram, Vahab Mirrokni, Warren Schudy, and Peilin Zhong. Stars: Tera-scale graph building for clustering and graph learning. arXiv preprint arXiv:2212.02635, 2022.
- [CKL+22] Li Chen, Rasmus Kyng, Yang P Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 612–623. IEEE, 2022.
- [Cla87] Ken Clarkson. Approximation algorithms for shortest path motion planning. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 56–65, 1987.
- [Cut13] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. Advances in neural information processing systems, 26, 2013.
- [DEL+22] Laxman Dhulipala, David Eisenstat, Jakub Lacki, Vahab Mirrokni, and Jessica Shi. Hierarchical agglomerative graph clustering in poly-logarithmic depth. Advances in Neural Information Processing Systems, 35:22925–22940, 2022.
- [Epp99] David Eppstein. Spanning trees and spanners. handbook of computational geometry, j.-r. sack and j. urrutia, eds, 1999.
- [Erd63] P. Erdös. Extremal problems in graph theory. Theory of Graphs and its Applications (Proc. Symp. Smolenice, 1963), pages 29–36, 1963.
- [HIM12] Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of Computing, 1(8):321–350, 2012.
- [HPIS13] Sariel Har-Peled, Piotr Indyk, and Anastasios Sidiropoulos. Euclidean spanners in high dimensions. In Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms, pages 804–809. SIAM, 2013.
- [Ind01] Piotr Indyk. High-dimensional computational geometry. Ph.D. Thesis. Department of Computer Science, Stanford University, 2001.
- [Kei88] J Mark Keil. Approximating the complete euclidean graph. In SWAT 88: 1st Scandinavian Workshop on Algorithm Theory Halmstad, Sweden, July 5–8, 1988 Proceedings 1, pages 208–213. Springer, 1988.
- [KG92] J Mark Keil and Carl A Gutwin. Classes of graphs which approximate the complete euclidean graph. Discrete & Computational Geometry, 7:13–28, 1992.
- [KNP19] Andrey Boris Khesin, Aleksandar Nikolov, and Dmitry Paramonov. Preconditioning for the geometric transportation problem. In 35th International Symposium on Computational Geometry (SoCG 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
- [Li20] Jason Li. Faster parallel algorithm for approximate shortest path. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 308–321, 2020.
- [LS22] Hung Le and Shay Solomon. Truly optimal euclidean spanners. SIAM Journal on Computing, (0):FOCS19–135, 2022.
- [NS07] Giri Narasimhan and Michiel Smid. Geometric spanner networks. Cambridge University Press, 2007.
- [PS89] David Peleg and Alejandro A Schäffer. Graph spanners. Journal of graph theory, 13(1):99–116, 1989.
- [Sch35] Isaac J Schoenberg. Remarks to Maurice Frechet’s article “sur la definition axiomatique d’une classe d’espace distances vectoriellement applicable sur l’espace de hilbert”. Annals of Mathematics, pages 724–732, 1935.
- [Sei91] JJ Seidel. Quasiregular two-distance sets. In Geometry and Combinatorics, pages 267–273. Elsevier, 1991.
- [She17] Jonah Sherman. Generalized preconditioning and undirected minimum-cost flow. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 772–780. SIAM, 2017.
- [Tko12] Tomasz Tkocz. An upper bound for spherical caps. The American Mathematical Monthly, 119(7):606–607, 2012.
- [Vai89] Pravin M. Vaidya. Geometry helps in matching. SIAM Journal on Computing, 18(6):1201–1225, 1989.
- [VDBLL+21] Jan Van Den Brand, Yin Tat Lee, Yang P Liu, Thatchaphol Saranurak, Aaron Sidford, Zhao Song, and Di Wang. Minimum cost flows, mdps, and -regression in nearly linear time for dense instances. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 859–869, 2021.
- [Zwi01] Uri Zwick. Exact and approximate distances in graphs—a survey. In Algorithms—ESA 2001: 9th Annual European Symposium Århus, Denmark, August 28–31, 2001 Proceedings, pages 33–48. Springer, 2001.