Hardness of Approximate Nearest Neighbor Search
under L-infinity

Young Kun Ko Courant Institute of Mathematical Sciences, New York University. Email: [email protected]. Min Jae Song Courant Institute of Mathematical Sciences, New York University. Email: [email protected]. Research supported by the Simons Collaboration on Algorithms and Geometry and by the National Science Foundation (NSF) under Grant No. CCF-1814524.

Abstract

We show conditional hardness of Approximate Nearest Neighbor Search (ANN) under the $\ell_{\infty}$ norm with two simple reductions. Our first reduction shows that hardness of a special case of the Shortest Vector Problem (SVP), which captures many provably hard instances of SVP, implies a lower bound for ANN with polynomial preprocessing time under the same norm. Combined with a recent quantitative hardness result on SVP under $\ell_{\infty}$ (Bennett et al., FOCS 2017), our reduction implies that finding a $(1+\varepsilon)$ -approximate nearest neighbor under $\ell_{\infty}$ with polynomial preprocessing requires near-linear query time, unless the Strong Exponential Time Hypothesis (SETH) is false. This complements the results of Rubinstein (STOC 2018), who showed hardness of ANN under $\ell_{1}$ , $\ell_{2}$ , and edit distance.

Further improving the approximation factor for hardness, we show that, assuming SETH, near-linear query time is required for any approximation factor less than $3$ under $\ell_{\infty}$ . This shows a conditional separation between ANN under the $\ell_{1}/\ell_{2}$ norm and the $\ell_{\infty}$ norm since there are sublinear time algorithms achieving better than $3$ -approximation for the $\ell_{1}$ and $\ell_{2}$ norm. Lastly, we show that the approximation factor of $3$ is a barrier for any naive gadget reduction from the Orthogonal Vectors problem.

1 Introduction

Nearest Neighbor Search is formally defined as the following problem: Given a set of points $P=\{p_{1},\ldots,p_{N}\}\subset\mathbb{R}^{d}$ , and a query point $t\in\mathbb{R}^{d}$ , find a point $p\in P$ that is closest to $t$ . This is a fundamental algorithmic problem that has numerous applications in machine learning, computer vision, and databases (See [SDI06] and references therein for applications).

A naive solution is to enumerate all points in $p\in P$ , compare the distance with $t\in\mathbb{R}^{d}$ , and return $p$ that attains the minimum. This naturally gives a linear time ( $O(N)$ where $N=|P|$ ) algorithm assuming one can compute the distance between two points in $O(1)$ -time. Unfortunately, for a very large $N$ , which is the usual setting, this is costly. Ideally, we would like to return the solution without looking at every point in $P$ . Another naive solution on the other extreme is to enumerate all possible query points $t$ and, for each $t$ , store its nearest neighbor in $P$ in a look-up table. That way, we can handle each query $t$ using only $O(1)$ query time. However, this would require almost exponential (in dimension $d$ ) preprocessing time as one needs to enumerate through all possible $t$ and compute the corresponding nearest neighbor. Furthermore, this requires a huge space overhead since one needs to store all the corresponding solutions for each $t$ .

A natural question is, can we obtain efficient preprocessing time (say polynomial in $N$ and $d$ ) and efficient query time (sublinear in $N$ ) simultaneously. When $d=O(1)$ , such algorithms are known. If $d=1$ , for instance, one can construct a binary search tree with $P$ and determine the closest point to any query $t\in\mathbb{R}$ in $O(\log N)$ time. For arbitrary $d$ , the Voronoi diagram gives a solution using $N^{O(d)}$ space with $(d+\log N)^{O(1)}$ query time, but it is not at all clear if this can be further improved without paying an exponential price in $d$ (this is typically called the curse of dimensionality).

Instead, if we are satisfied with an approximate nearest neighbor rather than an exact nearest neighbor, i.e., find $\tilde{p}\in P$ such that $d(t,\tilde{p})\leq\gamma\cdot\min_{p\in P}d(t,p)$ , we can do much better. We refer to such a problem as $\gamma$ -approximate nearest neighbor search (or $\gamma$ -ANN for short). Previous works have shown that for many interesting metrics, we can indeed achieve improvements. For various distance metrics such as $\ell_{1},\ell_{2}$ , and edit distance, upper and lower bounds are well known. For $\ell_{1}$ (Hamming or Manhattan) and $\ell_{2}$ (Euclidean), one can use dimensionality reduction, locality sensitive hashing, or recently developed data-dependent hashing techniques [AR15] to improve upon the naive linear scan. Under these norms, current state-of-the-art techniques for worst-case data achieve $\gamma$ -approximation using $O(dN^{1+\rho})$ preprocessing time and $O(dN^{\rho})$ query time where $\rho=\frac{1}{2\gamma^{2}-1}$ . Furthermore, these are known to be optimal for hashing based techniques [And+17]. We refer the reader to the survey by Andoni et al. [AIR18] for further information.

Even if one uses non-hashing based techniques for these norms, one cannot hope to indefinitely improve the query time’s dependence on $\varepsilon$ for $(1+\varepsilon)$ -approximation unless one refutes the Strong Exponential Time Hypothesis (SETH). More precisely, for any $\delta>0$ , there exists a constant $\varepsilon=\varepsilon(\delta)>0$ such that $(1+\varepsilon)$ -ANN under $\ell_{1},\ell_{2}$ , and edit distance requires $N^{1-\delta}$ query time with polynomial preprocessing time unless SETH is false [Rub18]. Yet, unlike $\ell_{1}$ or $\ell_{2}$ , $\ell_{\infty}$ remains a mystery as we have not seen any improvement on either the upper or lower bound front for the past decade [Ind01, ACP08, PTW10]. In Section 1.1, we elaborate on the current status of ANN under $\ell_{\infty}$ . Moreover, we give a brief overview of the Shortest Vector Problem ( $\mathsf{SVP}$ ), a fundamental lattice problem central to post-quantum cryptography, and its connection to ANN.

1.1 Previous Works

Approximate nearest neighbors (ANN) in $\ell_{\infty}$ .

Whether or not we can achieve a better time-space tradeoff for $\ell_{\infty}$ is not just a purely intellectual endeavor, but a practical one as well. For instance, if the coordinates are heterogeneous, adding up different coordinates may not make sense as it would be “comparing apples to oranges”. To circumvent this difficulty, one can convert each coordinate to a rank space and use the maximum rank difference as the distance measure [Fag96, Fag99]. Another motivation for studying $\ell_{\infty}$ comes from the fact that it can be used as a target space for embedding any general normed spaces. (e.g., one can embed any metric space into $\ell_{\infty}$ in $O(cN^{1/c}\log N)$ dimensions with $2c-1$ distortion [Mat96]). Then, an efficient algorithm for ANN under $\ell_{\infty}$ can be used as a black box to give efficient algorithms for other norms, as achieved in [And+17a]. Therefore, if one can design a better algorithm for ANN under $\ell_{\infty}$ , algorithms for other norms may see improvements as well.

From a technical perspective, $\ell_{\infty}$ stands unique compared to other $\ell_{p}$ norms. While there has been fruitful algorithmic progress for other $\ell_{p}$ norms by using various techniques such as dimensionality reduction [JL84, Kle97], locality sensitive hashing [AI08] and data-dependent hashing [And+17], $\ell_{\infty}$ remains resistant to these techniques. The seemingly unorthodox upper bound devised by Indyk [Ind01] almost two decades ago remains state-of-the-art. This algorithm achieves $O(\log_{\rho}\log d)$ -approximation with $O(d{\operatorname{poly}}\log N)$ query time with $O(dN^{\rho}{\operatorname{poly}}\log N)$ preprocessing time for any $\rho>1$ . Surprisingly, this regime is known to be tight under some restricted computation model (decision tree model) [ACP08, PTW10, BK18].

However, for a general computational model such as word-RAM,¹¹1Here we remark that the only known technique for word-RAM lower bounds uses the cell-probe model [Yao79], in which computation is given for free and one only gets charged for information access. Proving any super-logarithmic query time lower bound for the cell-probe model (under any polynomial preprocessing time) would lead to a breakthrough in circuit complexity [DGW19]. it is a major open problem whether the tradeoff given by Indyk’s algorithm is optimal. Even if we allow a polynomial yet sublinear query time, it is open whether we can achieve a constant factor approximation with polynomial preprocessing time (and no exponential dependence on $\log d$ ).

Shortest vector problem (SVP).

$\mathsf{SVP}$ is a fundamental problem in lattice-based cryptography. For instance, the average-case hardness of learning with errors (LWE) [Reg09], which serves as the basis of many post-quantum cryptography proposals [Ala+20], is based on the worst-case hardness of approximating $\mathsf{SVP}$ up to polynomial factors in the $\ell_{2}$ norm. For finite $\ell_{p}$ norms, NP-hardness of exact $\mathsf{SVP}_{p}$ , the Shortest Vector Problem in the $\ell_{p}$ norm, was shown by [Ajt98] using randomized reductions. This hardness result has been improved to hardness of constant factor approximations [Mic01, Kho05], and $2^{(\log n)^{1-\varepsilon}}$ (almost polynomial) factor approximations [Kho05, HR07] under the assumption that NP problems do not have randomized quasi-polynomial time algorithms. Since approximating $\mathsf{SVP}$ up to factors larger than $\sqrt{n}$ is in $\mathsf{NP}\cap\mathsf{coNP}$ , it is unlikely that NP-hardness can be shown for this regime [AR05]. For further information on $\mathsf{SVP}$ , we refer the reader to [MG02] and [Kho10].

In this work, we consider the special case of finding short vectors in the $\ell_{\infty}$ norm, which we denote by $\mathsf{SVP}_{\infty}$ . This problem is interesting because of its relevance to practical lattice-based cryptosystems, in which the key size has to be as small as possible for efficiency while being large enough to rule out attacks from any conceivable adversaries. For instance, the practical security of Dilithium, a recent lattice-based signature scheme by [Duc+18], is based on the intractability of solving $\mathsf{SVP}_{\infty}$ . NP-hardness of exact $\mathsf{SVP}_{\infty}$ was shown by [Boa81], and [Din02] showed hardness of approximation up to factor $n^{c/\log\log n}$ . Recently, [BGSD17] proved that for any constant $\varepsilon>0$ , there is a constant $\gamma_{\varepsilon}\in(1,2)$ such that $\mathsf{SVP}_{\infty}$ does not have a $2^{(1-\varepsilon)n}$ time algorithm achieving $\gamma_{\varepsilon}$ -approximation assuming SETH. We note that the best known provable algorithm for $\gamma$ -approximate $\mathsf{SVP}_{\infty}$ runs in time $3^{n}\cdot\Big{(}\frac{\gamma}{\gamma-2}\Big{)}^{n}$ [AM18].

The relationship between nearest neighbor search and $\mathsf{SVP}$ has already been explored in previous works [BJG15, Bec+16, Laa15], but the focus of these works was on speeding up heuristic sieving algorithms for $\mathsf{SVP}$ using nearest neighbor search. More precisely, LSH-based ANN algorithms have been used as a subprocedure in sieving algorithms to solve $\mathsf{SVP}$ with better time-space tradeoffs under heuristic assumptions. In this work, we turn this relationship around and prove a near-linear query time lower bound for $\gamma$ -ANN with polynomial preprocessing time using the aforementioned (conditional) lower bound on $\mathsf{SVP}_{\infty}$ [BGSD17].

1.2 Our Results

We give two separate reductions that show conditional hardness of ANN with polynomial preprocessing time under the $\ell_{\infty}$ norm. Our result demonstrates a separation between the $\ell_{1}/\ell_{2}$ norm and the $\ell_{\infty}$ norm. That is, while algorithms with polynomial preprocessing and sublinear query time for $(1+\varepsilon)$ -ANN, where $\varepsilon>0$ is an arbitrary constant and the data is in the high dimensional regime $d=O(\log N)$ , exist for the $\ell_{1}$ and $\ell_{2}$ norm [IM98, Val15, AIR18], our second reduction (Corollary 1.4) shows that we cannot expect to have sublinear algorithms for $\gamma$ -ANN under the $\ell_{\infty}$ norm for any $\gamma<3$ in the high dimensional regime.

Our first result is the following theorem, which translates any hardness assumption on $\gamma$ - $\mathsf{SVP}_{p}^{\{0,1\}}$ , which is the Shortest Vector Problem under the $\ell_{p}$ norm restricted to lattice vectors with $\{0,1\}$ -coefficients in the given basis (See Section 2 for a formal definition), into a hardness result (Corollary 1.2) for $\gamma$ -ANN under the $\ell_{p}$ norm. While this reduction leads to a weaker hardness result compared to Corollary 1.4, it establishes a simple connection between the two canonical problems which could potentially lead to fine-grained hardness results that are not based on SETH (See Section 1.3). We remark that an analogous reduction can be shown for a special case of the Closest Vector Problem ( $\mathsf{CVP}$ ), which is the problem of computing the distance from some target point to the lattice (See Remark 3.4).

Theorem 1.1 (Informal).

For any $\varepsilon>0,\gamma>1$ and $1\leq p\leq\infty$ , if there is no algorithm which solves $\gamma$ - $\mathsf{SVP}_{p}^{\{0,1\}}$ in $2^{(1-\varepsilon)n}$ time, where $n$ is the rank of the given lattice, then there exists a $\delta=\delta(\varepsilon)>0$ such that $\gamma$ -Approximate Nearest Neighbor under the $\ell_{p}$ norm cannot be solved with polynomial preprocessing and $N^{1-\delta}$ query time.

The key observation that connects this rather generic translation between $\mathsf{SVP}^{\{0,1\}}$ and ANN to a quantitative hardness result on ANN under the $\ell_{\infty}$ norm is that $k$ -SAT reduces to this special subcase of $\mathsf{SVP}_{\infty}$ (with a small gap, in fact). The quantitative hardness of $\mathsf{SVP}^{\{0,1\}}_{\infty}$ based on SETH [BGSD17] (See Theorem 2.8) implies the hardness of ANN under the $\ell_{\infty}$ norm.

Corollary 1.2 (Informal).

Assuming SETH, for any $\delta>0$ , there exists a constant $\gamma\in(1,2)$ such that $\gamma$ -ANN under the $\ell_{\infty}$ norm cannot be solved with polynomial preprocessing and $N^{1-\delta}$ query time.

One caveat of Corollary 1.2 is that the precise relationship between the approximation factor and query time lower bound is not known (See Remark 2.9). Moreover, the approximation factor for which hardness can be shown is less than 2. A natural question is whether hardness can be shown for larger approximation factors. Further strengthening the hardness result based on SETH, we show that a modification of the reduction from online partial matching by Indyk [Ind01] demonstrates hardness with a larger gap of $\gamma=3$ for Bichromatic Closest Pair under $\ell_{\infty}$ .

Theorem 1.3 (Informal).

Assuming SETH, for any $\delta>0$ and $\gamma<3$ , there exists a constant $c>0$ such that $\gamma$ -approximate Bichromatic Closest Pair on instances with dimension $d=c\log N$ under the $\ell_{\infty}$ norm cannot be solved in $N^{2-\delta}$ time.

Since a lower bound for $\gamma$ -approximate Bichromatic Closest Pair implies a lower bound for $\gamma$ -ANN with polynomial preprocessing time [WW18], Theorem 1.3 implies the following corollary.

Corollary 1.4 (Informal).

Assuming SETH, for any $\delta>0$ and $\gamma<3$ , there exists a constant $c>0$ such that $\gamma$ -ANN on instances with dimension $d=c\log N$ under the $\ell_{\infty}$ norm cannot be solved with polynomial preprocessing and $N^{1-\delta}$ query time.

Furthermore, we show that any naive gadget reduction from Orthogonal Vectors (Definition 2.2), which essentially captures most known reduction techniques from SETH, to $\gamma$ -approximate Bichromatic Closest Pair cannot show hardness for any approximation factor larger than 3 (Theorem 3.9). This essentially follows from the triangle inequality. We note that this limitation was mentioned in [Rub18] as well. Hence, showing any hardness with an approximation factor larger than 3 would require a novel technique bypassing the naive gadget reduction techniques. Therefore, our hardness result for ANN under $\ell_{\infty}$ is tight in this sense.

We note that the result of [Rub18], who showed hardness under the $\ell_{2}$ norm, already implies the hardness of ANN under the $\ell_{\infty}$ norm since one can embed $\ell_{2}$ into a subspace of $\ell_{\infty}$ with low distortion [Ind03]. This reduction is similar to that of [RR06], who showed that $\ell_{2}$ is, in a certain sense, the “easiest norm” for lattice problems. However, hardness under the $\ell_{\infty}$ norm through this reduction only holds for $d={\operatorname{poly}}\log N$ (as opposed to $d=O(\log N)$ in our results) because of the embedding’s dimension blow-up. Moreover, our reductions are simpler and do not require the distributed PCP [ARW17] machinery used in [Rub18]. Our Corollary 1.4 is also stronger in the sense that it holds for larger (and known) approximation factors. The approximation factor for hardness in [Rub18] is exponentially small in $\delta$ and depends on an unspecified relation between the reduction parameters due to SETH.

1.3 Future Directions

Our result shows that the hardness of ANN under $\ell_{\infty}$ for approximation factor 3 is the best one can hope for with current reduction techniques. The difficulty of proving hardness beyond $3$ -approximation was hinted in [Rub18], but no norm which explicitly instantiates this barrier was known previously.

An obvious next direction is determining whether hardness of $\gamma$ -ANN under $\ell_{\infty}$ for $\gamma>3$ can be shown, as it is widely believed [ACP08, PTW10, BK18] that the tradeoff between the required preprocessing/query time and approximation guarantee given by Indyk’s algorithm [Ind01] is optimal. However, it is not clear whether such a result can be obtained from SETH. Our connection between $\gamma$ -ANN and $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}$ suggests a direction for overcoming this technical barrier. Instead of basing hardness on SETH, consider the following conjecture on $\mathsf{SVP}_{\infty}^{\{0,1\}}$ .

Conjecture 1.5.

For any $\varepsilon\in(0,1/2)$ , there exists $\gamma=\omega(1)$ such that there is no $2^{(1-\varepsilon)n}$ time algorithm for $\gamma$ - $\mathsf{SVP}_{\infty}^{\{0,1\}}$ .

It is straightforward to see that Lemma 3.1 and 3.2 (formal version of Theorem 1.1) together with Conjecture 1.5 would imply hardness of $\omega(1)$ -ANN under $\ell_{\infty}$ . We remark that almost all instances arising in NP-hardness proofs of $\gamma$ - $\mathsf{SVP}_{\infty}$ are indeed captured by $\mathsf{SVP}_{\infty}^{\{0,1\}}$ [MG02, Din02, BGSD17].²²2 [Din02] generates a lattice in which coefficients of the short lattice vector take values in $\{-1,0,1\}$ instead of $\{0,1\}$ . Still, there are only $3^{n}$ many candidates for the short vector, so our reduction applies to this case as well with minor modifications.

One potential direction is to prove Conjecture 1.5 assuming SETH. However, this would then overcome the barrier of $3$ -approximation for ANN, suggesting that such a result would be difficult to obtain. Another interesting direction is to show the quantitative hardness of other computational problems assuming Conjecture 1.5 since $\mathsf{SVP}^{\{0,1\}}_{\infty}$ is itself a natural problem which encapsulates the difficulty of canonical problems such as $k$ -SAT and Subset Sum. Additional motivation for this direction is given by [BGSD17, Agg+21], who remark that the concrete lower bound for general $\mathsf{SVP}_{\infty}$ may in fact be $3^{n}$ since the kissing number in the $\ell_{\infty}$ norm is $3^{n}-1$ . Since it is not at all clear whether a $3^{n}$ lower bound for $\mathsf{SVP}_{\infty}$ can be deduced from SETH, it may make sense to directly assume the quantitative hardness of $\mathsf{SVP}_{\infty}$ or $\mathsf{SVP}^{\{0,1\}}_{\infty}$ .

Acknowledgments.

We thank Noah Stephens-Davidowitz and Oded Regev for helpful comments.

2 Preliminaries

We present two reductions that show hardness of $\gamma$ -ANN. Our first reduction reduces a special case of the Shortest Vector Problem to $\gamma$ -ANN. Our second reduction reduces the Subset Query problem to $\gamma$ -ANN. Both reductions involve an intermediate problem, referred to as the Bichromatic Closest Pair problem in [Rub18]. In this section, we present formal definitions of these problems and their lower bounds assuming SETH.

2.1 Complexity Assumptions

Definition 2.1 (Strong Exponential Time Hypothesis [IP99]).

For any $\varepsilon>0$ , there exists $k=k(\varepsilon)$ such that $k$ -SAT on $n$ variables cannot be solved in $2^{(1-\varepsilon)n}$ time.

Orthogonal Vectors (OV) is an intermediate problem commonly used to show fine-grained hardness results in $\mathsf{P}$ [Wil05, Wil07]. Note that SETH implies the Orthogonal Vectors Conjecture, which states that Orthogonal Vectors cannot be solved in subquadratic time. For recent developments in fine-grained complexity assuming the Orthogonal Vectors Conjecture, we refer the reader to the survey [Wil17] and references therein.

Definition 2.2 (Orthogonal Vectors).

Given two sets $A,B\subset\{0,1\}^{d}$ , decide whether there exists a pair $(a,b)\in A\times B$ such that $\langle a,b\rangle=0$ .

Conjecture 2.3 (Orthogonal Vectors Conjecture).

For every $\delta>0$ , there exists a constant $c=c(\delta)$ such that given two sets $A,B\subset\{0,1\}^{d}$ each of cardinality $N$ , deciding if there is a pair $(a,b)\in A\times B$ such that $\langle a,b\rangle=0$ cannot be solved in $O(N^{2-\delta})$ time on instances with $d=c\log N$ .

2.2 Nearest Neighbor Search Problems

Definition 2.4 (Approximate Nearest Neighbor Search).

The $\gamma$ -Approximate Nearest Neighbor Search problem ( $\gamma$ -ANN) under the $\ell_{p}$ norm is defined as follows. Let $A\subset\mathbb{R}^{d}$ be a set of $N$ vectors. Given numbers $\gamma>1$ , $r>0$ , and a vector $x\in\mathbb{R}^{d}$ , distinguish between the following two cases after preprocessing $A$ :

•

(YES) There exists $a\in A$ such that with $\left\lVert a-x\right\rVert_{p}\leq r$ .
•

(NO) For all $a\in A$ , $\left\lVert a-x\right\rVert_{p}\geq\gamma r$ .

Definition 2.5 (Approximate Bichromatic Closest Pair).

The $\gamma$ -approximate Bichromatic Closest Pair problem under the $\ell_{p}$ norm is defined as follows. Let $A,B\subset\mathbb{R}^{d}$ each be a set of size $N$ . Given numbers $\gamma>1$ and $r>0$ , distinguish between the following two cases:

•

(YES) There exists $a\in A$ and $b\in B$ with $\left\lVert a-b\right\rVert_{p}\leq r$ .
•

(NO) For all $a\in A$ and $b\in B$ , $\left\lVert a-b\right\rVert_{p}\geq\gamma r$ .

A standard technique [WW18] shows that any $\gamma$ -ANN algorithm with sublinear query time and polynomial preprocessing time implies a subquadratic algorithm for $\gamma$ -approximate Bichromatic Closest Pair (See Lemma 3.1).

2.3 Lattice Problems

A lattice $\mathcal{L}$ is defined as the set of all integer combinations of linearly independent basis vectors $b_{1},\ldots,b_{n}\in\mathbb{R}^{d}$ , i.e.,

\displaystyle\mathcal{L}=\mathcal{L}(b_{1},\ldots,b_{n})=\left\{\sum_{i=1}^{n}\alpha_{i}b_{i}\mid\alpha_{i}\in\mathbb{Z}\right\}\;.

We denote the rank of $\mathcal{L}$ by $n$ and the ambient dimension by $d$ .

Definition 2.6 (Shortest Vector Problem).

The $\gamma$ -approximate Shortest Vector Problem under the $\ell_{p}$ norm ( $\gamma$ - $\mathsf{SVP}_{p}$ ) is defined as follows. Let $\mathcal{L}$ be a $d$ -dimensional lattice of rank $n$ given in the form of a basis $\mathcal{B}=(b_{1},\ldots,b_{n})$ , where $\mathcal{B}\subset\mathbb{R}^{d}$ . Given numbers $\gamma>1$ and $r>0$ , distinguish between the following two cases:

•

(YES) There exists a non-zero vector $\vec{\alpha}\in\mathbb{Z}^{n}$ such that

$\left\lVert\sum_{i=1}^{n}\alpha_{i}b_{i}\right\rVert_{p}\leq r\;.$
•

(NO) For any non-zero $\vec{\alpha}\in\mathbb{Z}^{n}$ ,

$\left\lVert\sum_{i=1}^{n}\alpha_{i}b_{i}\right\rVert_{p}\geq\gamma r\;.$

Definition 2.7 (Shortest Vector Problem with Binary Restriction).

The $\gamma$ -approximate Shortest Vector Problem with Binary Restriction under the $\ell_{p}$ norm ( $\gamma$ - $\mathsf{SVP}_{p}^{\{0,1\}}$ ) is defined as follows. Let $\mathcal{L}$ be a $d$ -dimensional lattice of rank $n$ given in the form of a basis $\mathcal{B}=(b_{1},\ldots,b_{n})$ , where $\mathcal{B}\subset\mathbb{R}^{d}$ . Given numbers $\gamma>1$ and $r>0$ , distinguish between the following two cases:

•

(YES) There exists a non-zero vector $\vec{\alpha}\in\{0,1\}^{n}$ such that

$\left\lVert\sum_{i=1}^{n}\alpha_{i}b_{i}\right\rVert_{p}\leq r\;.$
•

(NO) For any non-zero $\vec{\alpha}\in\mathbb{Z}^{n}$ ,

$\left\lVert\sum_{i=1}^{n}\alpha_{i}b_{i}\right\rVert_{p}\geq\gamma r\;.$

Theorem 2.8 ([BGSD17, Corollary 6.7]).

Assuming SETH, for any $\varepsilon>0$ , there exists $\gamma=\gamma(\varepsilon)$ such that $\gamma$ - $\mathsf{SVP}_{\infty}^{\{0,1\}}$ cannot be solved in $2^{(1-\varepsilon)n}$ time, where $n$ is the rank of the given lattice.

Remark 2.9.

The precise form of the approximation factor $\gamma$ in Theorem 2.8 is $\gamma=1+2/(k-1)$ , where $k=k(\varepsilon)$ comes from the hard $k$ -SAT instance from SETH. Since the dependence $k=k(\varepsilon)$ is not known, so is the dependence of $\gamma$ on $\varepsilon$ .

2.4 Subset Query Problems

Definition 2.10 (Subset Query).

The Subset Query problem is defined as follows. Given $N$ sets $S_{1},\ldots,S_{N}\subset[d]$ and a query $Q\subset[d]$ , distinguish between the following two cases:

•

(YES) There exists $i\in[N]$ such that $Q\subseteq S_{i}$ .
•

(NO) For all $i\in[N]$ , $Q\not\subseteq S_{i}$ .

Definition 2.11 (Bichromatic Subset Query).

The Bichromatic Subset Query problem is defined as follows. Given two collections of sets $S_{1},\ldots,S_{N}\subset[d]$ and $T_{1},\ldots,T_{N}\subset[d]$ , distinguish between the following two cases:

•

(YES) There exists $i,j\in[N]$ such that $T_{i}\subseteq S_{j}$ .
•

(NO) For all $i,j\in[N]$ , $T_{i}\not\subseteq S_{j}$ .

It is well-known that Bichromatic Subset Query is SETH-hard (as it is equivalent to Orthogonal Vectors) due to the following celebrated result of Williams [Wil05].

Theorem 2.12 ([Wil05, Theorem 5.1]).

Assuming SETH, for any $\delta>0$ , there exists a constant $c=c(\delta)$ such that Bichromatic Subset Query cannot be solved in $N^{2-\delta}$ time on instances with $d=c\log N$ .

3 Proof of Main Theorems

We give two simple reductions that imply the hardness of $\gamma$ -ANN assuming SETH. To this end, we first present a reduction from Bichromatic Closest Pair to ANN. Then, we reduce $\mathsf{SVP}_{\infty}^{\{0,1\}}$ and Bichromatic Subset Query to Bichromatic Closest Pair, thereby showing the conditional quantitative hardness of $\gamma$ -ANN. Furthermore, we show that the hard approximation factor of $3$ , which our reduction from Bichromatic Subset Query achieves, is the best possible for any “natural” reduction from the Orthogonal Vectors problem (See Theorem 3.9). Since most known SETH-based hardness results are obtained via a reduction from Orthogonal Vectors, our result implies that showing hardness for approximation factor larger than $3$ would require new techniques.

3.1 Bichromatic Closest Pair to ANN

A standard reduction from [WW18] shows that a sublinear algorithm for $\gamma$ -ANN with polynomial preprocessing time implies a subquadratic algorithm for $\gamma$ -approximate Bichromatic Closest Pair. We include a short proof of this fact for completeness.

Lemma 3.1.

Given numbers $\delta>0$ and $C>1$ , if there exists an algorithm for $\gamma$ -ANN under the $\ell_{p}$ norm with preprocessing time $N^{C}$ and query time $N^{1-\delta}$ , where $N$ denotes the number of data points, then there exists a $N^{2-\Omega\left(\frac{\delta}{C-1}\right)}$ -time algorithm for $\gamma$ -approximate Bichromatic Closest Pair under $\ell_{p}$ .

Proof.

Consider the following algorithm for $\gamma$ -approximate Bichromatic Closest Pair under the $\ell_{p}$ norm for two sets $A=\{p_{1},\ldots,p_{N}\}\subset\mathbb{R}^{d}$ and $B=\{q_{1},\ldots,q_{N}\}\subset\mathbb{R}^{d}$ .

•

Divide $A$ into $N/\ell$ batches, each of size $\ell$ .
•

Preprocess each batch separately, constructing $N/\ell$ many data structures.
•

For each query point $q_{j}$ , query each data structure created by the preprocessing step.

The total time used for preprocessing is

\frac{N}{\ell}\cdot(\ell)^{C}=N\cdot\ell^{C-1}\;.

For each query $q_{j}$ , one needs to query $N/\ell$ data structures. Hence, the total query time spent on $q_{1},\ldots,q_{N}$ is

N\cdot\frac{N}{\ell}\cdot\ell^{1-\delta}=\frac{N^{2}}{\ell^{\delta}}.

Let $\delta^{\prime}>0$ be a number satisfying $\frac{\delta^{\prime}}{1-\delta^{\prime}}<\frac{\delta}{C-1}$ and choose $\ell$ such that

N^{\delta^{\prime}/\delta}<\ell<N^{\frac{1-\delta^{\prime}}{C-1}}\;.

This gives a $O(N^{2-\delta^{\prime}})$ -time algorithm for $\gamma$ -approximate Bichromatic Closest Pair. ∎

3.2 $\mathsf{SVP}_{p}^{\{0,1\}}$ Hardness implies Hardness of ANN under $\ell_{p}$

In this section, we show that a fast algorithm for $\gamma$ -approximate Bichromatic Closest Pair under $\ell_{p}$ implies a fast algorithm for $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{p}$ . Combined with Lemma 3.1, this shows that a lower bound for $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{p}$ implies a lower bound for $\gamma$ -ANN under $\ell_{p}$ . Assuming SETH, lower bounds for $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{\infty}$ are known (Theorem 2.8). Hence, our reduction implies hardness of $\gamma$ -ANN under the $\ell_{\infty}$ norm.

Lemma 3.2.

If there exists a $f(N)$ -time algorithm that solves $\gamma$ -approximate Bichromatic Closest Pair under the $\ell_{p}$ norm (where $1\leq p\leq\infty$ ), then there exists an algorithm for $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{p}$ with the same ambient dimension that runs in $O\left(f(2^{n/2})\right)$ time, where $n$ denotes the rank of the input lattice.

Proof.

Consider an instance of $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{p}$ with a lattice basis $\mathcal{B}\subset\mathbb{R}^{d}$ of size $n$ . Divide $\mathcal{B}$ into $\mathcal{B}_{1}$ and $\mathcal{B}_{2}$ , each of size $n/2$ , and consider the following sets of vectors.

	$\displaystyle~{}A_{0}:=\left\{\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{1},\vec{\alpha}\in\{0,1\}^{n/2}\right\}$
	$\displaystyle~{}A_{1}:=\left\{\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{1},\vec{\alpha}\in\{0,1\}^{n/2},\vec{\alpha}\neq\vec{0}\right\}$
	$\displaystyle~{}B_{0}:=\left\{-\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{2},\vec{\alpha}\in\{0,1\}^{n/2}\right\}$
	$\displaystyle~{}B_{1}:=\left\{-\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{2},\vec{\alpha}\in\{0,1\}^{n/2},\vec{\alpha}\neq\vec{0}\right\}.$

Let $\mathcal{A}$ be an $f(N)$ -time algorithm for $\gamma$ -approximate Bichromatic Closest Pair. We run $\mathcal{A}$ two times with the following inputs, $\mathcal{A}(A_{0},B_{1})$ , $\mathcal{A}(A_{1},B_{0})$ and return the OR of the outputs. Since $|A_{1}|,|B_{1}|\leq|A_{0}|,|B_{0}|\leq N=2^{n/2}$ , then the runtime is $O(f(N))=O(f(2^{n/2}))$ . It remains to show the correctness of our reduction.

Claim 3.3 (Correctness).

Let $\mathcal{L}^{\{0,1\}}=\left\{\sum_{i=1}^{n}\alpha_{i}b_{i}\mid b_{i}\in\mathcal{B},\;\vec{\alpha}\in\{0,1\}^{n}\right\}$ and $A-B=\{a-b~{}|~{}a\in A,b\in B\}$ . Then

(A_{0}-B_{1})\cup(A_{1}-B_{0})=\mathcal{L}^{\{0,1\}}\setminus\{\vec{0}\}\;.

Proof.

First, notice that $A_{0}-B_{0}=\mathcal{L}^{\{0,1\}}$ . It follows that $A_{0}-B_{1}=\mathcal{L}^{\{0,1\}}\setminus A_{0}$ since $B_{1}=B_{0}\setminus\{\vec{0}\}$ and $\mathcal{B}=\mathcal{B}_{1}\cup\mathcal{B}_{2}$ is a basis. Similarly, $A_{1}-B_{0}=\mathcal{L}^{\{0,1\}}\setminus B_{0}$ . Hence,

	$\displaystyle(A_{0}-B_{1})\cup(A_{1}-B_{0})$	$\displaystyle=(\mathcal{L}^{\{0,1\}}\setminus A_{0})\cup(\mathcal{L}^{\{0,1\}}\setminus B_{0})$
		$\displaystyle=\mathcal{L}^{\{0,1\}}\setminus(A_{0}\cap B_{0})$
		$\displaystyle=\mathcal{L}^{\{0,1\}}\setminus\{\vec{0}\}\;.$

∎

From Claim 3.3, we know that if there exists a vector in $A_{0}-B_{1}$ or $A_{1}-B_{0}$ with $\ell_{p}$ -norm less than $r$ , then there exists a vector in $\mathcal{L}^{\{0,1\}}$ with $\ell_{p}$ -norm less than $r$ . If all vectors in $(A_{0}-B_{1})\cup(A_{1}-B_{0})$ have $\ell_{p}$ -norm greater than $\gamma r$ , then all non-zero vectors in $\mathcal{L}^{\{0,1\}}$ must have $\ell_{p}$ -norm greater than $\gamma r$ which concludes our proof. ∎

Remark 3.4.

A similar reduction can be shown for $\gamma\mbox{-}\mathsf{CVP}^{\{0,1\}}_{p}$ , which is the $\mathsf{CVP}$ analogue of $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{p}$ . To see this, take $\mathcal{B}_{0}^{\prime}=\mathcal{B}_{0}+t$ in the proof of Lemma 3.2, where $t\in\mathbb{R}^{d}$ is a target point which is guaranteed to be close to a $\{0,1\}$ -coefficient lattice vector. Then, run the Bichromatic Closest Pair algorithm on the input $(\mathcal{A}_{0},\mathcal{B}_{0}^{\prime})$ .

Combining Lemma 3.1 and Lemma 3.2, we obtain our main theorem which connects hardness of $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}$ and hardness of $\gamma$ -ANN.

Theorem 3.5.

For any $\delta>0$ and $C>1$ , if there is an algorithm for $\gamma$ -ANN under the $\ell_{p}$ norm with $N^{C}$ preprocessing time and $N^{1-\delta}$ query time, then there is a $2^{\left(1-\Omega\left(\frac{\delta}{C-1}\right)\right)n}$ -time algorithm for $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{p}$ with the same ambient dimension, where $n$ denotes the rank of the input lattice.

Proof.

By Lemma 3.1, we get a $N^{2-\delta^{\prime}}$ -time algorithm for $\gamma$ -approximate Bichromatic Closest Pair where $\delta^{\prime}=\Omega\left(\frac{\delta}{C-1}\right)$ from the algorithm for $\gamma$ -ANN. By Lemma 3.2, this in turn gives a $2^{\left(1-\frac{\delta}{2}\right)n}$ -time algorithm for $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{p}$ . ∎

Note that the SETH-hard instance of $\gamma\mbox{-}\mathsf{SVP}^{\{0,1\}}_{\infty}$ in Theorem 2.8 has ambient dimension $d=O(n)$ by the Sparsification Lemma [IPZ01]. Hence, we obtain the following lower bound for $\gamma$ -ANN under $\ell_{\infty}$ .

Corollary 3.6.

Assuming SETH, for any $\delta>0$ and $C>1$ , there exists a constant $\gamma=\gamma(\delta,C)\in(1,2)$ such that $\gamma$ -ANN under $\ell_{\infty}$ (where $d=O(\log N)$ ) cannot be solved with $N^{C}$ preprocessing and $N^{1-\delta}$ query time.

3.3 Bichromatic Subset Query Hardness implies Hardness of ANN under $\ell_{\infty}$

In this section, we show that solving $\gamma$ -ANN for $\gamma<3$ with polynomial preprocessing time requires nearly linear query time unless SETH is false. We first modify Indyk’s reduction [Ind01] from Subset Query to ANN to show that $\gamma$ -approximate Bichromatic Closest Pair under $\ell_{\infty}$ is as hard as Bichromatic Subset Query for any $\gamma<3$ .

Lemma 3.7.

For any $\gamma<3$ , if there exists a $f(N)$ -time algorithm for $\gamma$ -approximate Bichromatic Closest Pair under the $\ell_{\infty}$ norm, then there exists a $f(N)+O(dN)$ -time algorithm for Bichromatic Subset Query.

Proof.

Define functions $f$ and $g$ as

f(x)=\begin{cases}0&\mbox{if }x=0\\ \frac{2}{3}&\mbox{if }x=1\end{cases}

g(x)=\begin{cases}\frac{1}{3}&\mbox{if }x=0\\ 1&\mbox{if }x=1\end{cases}

For a set $S\subset[d]$ , consider its corresponding vector $\chi_{S}\in\{0,1\}^{d}$ where $\chi_{S,j}=1$ iff $j\in S$ . Let $F:\{0,1\}^{d}\to\{0,1\}^{d}$ and $G:\{0,1\}^{d}\to\{0,1\}^{d}$ be defined as

	$\displaystyle F(T)=(f(\chi_{T,1}),\ldots,f(\chi_{T,d}))$
	$\displaystyle G(S)=(g(\chi_{S,1}),\ldots,g(\chi_{S,d}))\;.$

Define $D(S,T):=\left\lVert G(S)-F(T)\right\rVert_{\infty}$ . If $T\subseteq S$ , then $D(S,T)=\frac{1}{3}$ , and $D(S,T)=1$ otherwise. Then, an instance of Bichromatic Subset Query, $S_{1},\ldots,S_{N}\subset[d]$ and $T_{1},\ldots,T_{N}\subset[d]$ , can be reduced to a $\gamma$ -approximate Bichromatic Closest Pair instance with $G(S_{1}),\ldots,G(S_{N})$ and $F(T_{1}),\ldots,F(T_{N})$ for $\gamma<3$ . If there exists a pair $G(S_{i}),F(T_{j})$ with $D(S_{i},T_{j})\leq 1/3$ , then $T_{j}\subseteq S_{i}$ . If no such pair exists, we know that for any $i,j\in[N]$ , $T_{j}\not\subset S_{i}$ . ∎

Combining Lemma 3.1, Lemma 3.7 and Theorem 2.12, we get the following theorem.

Theorem 3.8.

Assuming SETH, for any $\delta>0$ , $C>1$ , and $\gamma<3$ , there exists $c=c(\delta,C)$ such that $\gamma$ -ANN under the $\ell_{\infty}$ norm cannot be solved with $N^{C}$ preprocessing and $N^{1-\delta}$ query time on instances with $d=c\log N$ .

Proof.

Suppose there exists some $\delta>0,C>1$ , and $\gamma<3$ such that there is an algorithm for $\gamma$ -ANN under the $\ell_{\infty}$ norm with $N^{C}$ preprocessing time and $N^{1-\delta}$ query time for all $d=O(\log N)$ . Then, by Lemma 3.1, there is a $N^{2-\delta^{\prime}}$ -time algorithm for $\gamma$ -approximate Bichromatic Closest Pair under $\ell_{\infty}$ with $\delta^{\prime}=\Omega\left(\frac{\delta}{C-1}\right)$ that works for all $d=O(\log N)$ . By Lemma 3.7, such an algorithm for Bichromatic Closest Pair implies a $O(N^{2-\delta^{\prime}})$ -time algorithm for Bichromatic Subset Query applying to all $d=O(\log N)$ . Such an algorithm for Bichromatic Subset Query cannot exist unless SETH is false (Theorem 2.12). ∎

3.4 Limitation of OV-based Hardness for ANN

We show that the hard approximation factor of 3 in Theorem 3.8 is, in a certain sense, tight. More precisely, we show that any “natural” reduction from Orthogonal Vectors (which is how most SETH-based hardness results are obtained) to $\gamma$ -ANN under any metric cannot show hardness for $\gamma>3$ due to the triangle inequality. This in turn implies that any improvement to Theorem 3.8 in terms of the approximation factor would require novel techniques for proving SETH-based hardness results. This is formalized in Theorem 3.9.

Theorem 3.9.

Let $d>0$ be a positive integer and $(\mathcal{M},D)$ be a metric space. Let $F:\{0,1\}^{d}\to\mathcal{M}$ and $G:\{0,1\}^{d}\to\mathcal{M}$ be embeddings such that for any two strings $S,T\in\{0,1\}^{d}$ ,

•

If $\mathsf{DISJ}(S,T)=1$ , then $D(F(S),G(T))\leq r$ .
•

If $\mathsf{DISJ}(S,T)=0$ , then $D(F(S),G(T))\geq\gamma r$ .

Then, it must be the case that $\gamma\leq 3$ .

Proof.

For the sake of contradiction, suppose that there exists such $F$ and $G$ with $\gamma>3$ . For a string $S\subset\{0,1\}^{d}$ , denote by $S^{d}$ its $d$ -th bit and $S^{<d}$ its restriction to the first $d-1$ bits. Let $S,T$ be two strings such that $\mathsf{DISJ}(S^{<d},T^{<d})=1$ . Then, $S^{d}$ and $T^{d}$ determines the value of $\mathsf{DISJ}(S,T)$ . That is, if $S^{d}=T^{d}=1$ then $\mathsf{DISJ}(S,T)=0$ . Otherwise, $\mathsf{DISJ}(S,T)=1$ . Now consider restrictions $F^{\prime}:\{0,1\}\to\mathcal{M}$ and $G^{\prime}:\{0,1\}\to\mathcal{M}$ of $F$ and $G$ , defined as

	$\displaystyle F^{\prime}(x)=F(S^{<d},x)\;,$
	$\displaystyle G^{\prime}(y)=G(T^{<d},y)\;.$

By our assumption we have $D(F^{\prime}(1),G^{\prime}(1))>\gamma r$ , while we have $D(F^{\prime}(0),G^{\prime}(0))<r$ , $D(F^{\prime}(1),G^{\prime}(0))<r$ and $D(F^{\prime}(0),G^{\prime}(1))<r$ . But this is a contradiction if $\gamma>3$ , since it must be the case

D(F^{\prime}(1),G^{\prime}(1))\leq D(F^{\prime}(1),G^{\prime}(0))+D(F^{\prime}(0),G^{\prime}(0))+D(F^{\prime}(0),G^{\prime}(1))<3r

due to the triangle inequality. ∎

References

[ACP08] Alexandr Andoni, Dorian Croitoru and Mihai Patrascu “Hardness of Nearest Neighbor under L-infinity” In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS) Philadelphia, PA, USA: IEEE, 2008, pp. 424–433 DOI: 10.1109/FOCS.2008.89
[Agg+21] Divesh Aggarwal, Huck Bennett, Alexander Golovnev and Noah Stephens-Davidowitz “Fine-grained hardness of CVP(P) – Everything that we can prove (and nothing else)” arXiv: 1911.02440 In ACM-SIAM Symposium on Discrete Algorithms (SODA), 2021 URL: http://arxiv.org/abs/1911.02440
[AI08] Alexandr Andoni and Piotr Indyk “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions” In Communications of the ACM 51.1, 2008, pp. 117–122 DOI: 10.1145/1327452.1327494
[AIR18] Alexandr Andoni, Piotr Indyk and Ilya Razenshteyn “Approximate nearest neighbor search in high dimensions” In Proceedings of the International Congress of Mathematicians (ICM 2018) World Scientific, 2018, pp. 3287–3318 DOI: 10.1142/9789813272880˙0182
[Ajt98] Miklós Ajtai “The shortest vector problem in L ${}_{\textrm{2}}$ is NP-hard for randomized reductions (extended abstract)” In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing (STOC), STOC ’98, 1998, pp. 10–19 DOI: 10.1145/276698.276705
[Ala+20] Gorjan Alagic et al. “Status Report on the Second Round of the NIST Post-Quantum Cryptography Standardization Process”, 2020 URL: https://csrc.nist.gov/publications/detail/nistir/8309/final
[AM18] Divesh Aggarwal and Priyanka Mukhopadhyay “Improved Algorithms for the Shortest Vector Problem and the Closest Vector Problem in the Infinity Norm” ISSN: 1868-8969 In 29th International Symposium on Algorithms and Computation (ISAAC) 123, Leibniz International Proceedings in Informatics (LIPIcs) Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018, pp. 35:1–35:13 DOI: 10.4230/LIPIcs.ISAAC.2018.35
[And+17] Alexandr Andoni, Thijs Laarhoven, Ilya Razenshteyn and Erik Waingarten “Optimal Hashing-based Time-Space Trade-offs for Approximate Near Neighbors” In Proceedings of the 2017 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), Proceedings Society for IndustrialApplied Mathematics, 2017, pp. 47–66 DOI: 10.1137/1.9781611974782.4
[And+17a] Alexandr Andoni et al. “Approximate near neighbors for general symmetric norms” In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC), STOC 2017, 2017, pp. 902–913 DOI: 10.1145/3055399.3055418
[AR05] Dorit Aharonov and Oded Regev “Lattice problems in NP intersect coNP” In Journal of the ACM 52.5, 2005, pp. 749–765 DOI: 10.1145/1089023.1089025
[AR15] Alexandr Andoni and Ilya P. Razenshteyn “Optimal Data-Dependent Hashing for Approximate Near Neighbors” In Proceedings of the 47th Annual ACM on Symposium on Theory of Computing (STOC), 2015, pp. 793–801 DOI: 10.1145/2746539.2746553
[ARW17] A. Abboud, A. Rubinstein and R. Williams “Distributed PCP Theorems for Hardness of Approximation in P” ISSN: 0272-5428 In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), 2017, pp. 25–36 DOI: 10.1109/FOCS.2017.12
[Bec+16] Anja Becker, Léo Ducas, Nicolas Gama and Thijs Laarhoven “New directions in nearest neighbor searching with applications to lattice sieving” In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SODA ’16 USA: Society for IndustrialApplied Mathematics, 2016, pp. 10–24
[BGSD17] Huck Bennett, Alexander Golovnev and Noah Stephens-Davidowitz “On the Quantitative Hardness of CVP” ISSN: 0272-5428 In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), 2017, pp. 13–24 DOI: 10.1109/FOCS.2017.11
[BJG15] Anja Becker, Antoine Joux and Nicolas Gama “Speeding-up lattice sieving without increasing the memory, using sub-quadratic nearest neighbor search”, 2015 URL: https://eprint.iacr.org/2015/522
[BK18] Mark Braverman and Young Kun Ko “Semi-Direct Sum Theorem and Nearest Neighbor under l_infty” ISSN: 1868-8969 In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM) 116, Leibniz International Proceedings in Informatics (LIPIcs) Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018, pp. 6:1–6:17 DOI: 10.4230/LIPIcs.APPROX-RANDOM.2018.6
[Boa81] P. Boas “Another np-complete problem and the complexity of computing short vectors in a lattice”, 1981 URL: /paper/Another-np-complete-problem-and-the-complexity-of-a-Boas/69a18eb62802cbf51fae0cb0d2219469425fe0de
[DGW19] Zeev Dvir, Alexander Golovnev and Omri Weinstein “Static data structure lower bounds imply rigidity” In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC), STOC 2019, 2019, pp. 967–978 DOI: 10.1145/3313276.3316348
[Din02] Irit Dinur “Approximating SVP-infinity to within almost-polynomial factors is NP-hard” In Theoretical Computer Science 285.1, Algorithms and Complexity, 2002, pp. 55–71 DOI: 10.1016/S0304-3975(01)00290-0
[Duc+18] Léo Ducas et al. “CRYSTALS-Dilithium: A Lattice-Based Digital Signature Scheme” In IACR Transactions on Cryptographic Hardware and Embedded Systems (CHES), 2018, pp. 238–268 DOI: 10.13154/tches.v2018.i1.238-268
[Fag96] Ronald Fagin “Combining fuzzy information from multiple systems (extended abstract)” In Proceedings of the 15th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), PODS ’96, 1996, pp. 216–226 DOI: 10.1145/237661.237715
[Fag99] Ronald Fagin “Combining Fuzzy Information from Multiple Systems” In Journal of Computer and System Sciences 58.1, 1999, pp. 83–99 DOI: 10.1006/jcss.1998.1600
[HR07] Ishay Haviv and Oded Regev “Tensor-based hardness of the shortest vector problem to within almost polynomial factors” In Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC), STOC ’07, 2007, pp. 469–477 DOI: 10.1145/1250790.1250859
[IM98] Piotr Indyk and Rajeev Motwani “Approximate nearest neighbors: towards removing the curse of dimensionality” In Proceedings of the 30th Annual ACM on Symposium on Theory of Computing (STOC), STOC ’98, 1998, pp. 604–613 DOI: 10.1145/276698.276876
[Ind01] Piotr Indyk “On Approximate Nearest Neighbors under L-infinity Norm” In Journal of Computer and System Sciences 63.4, 2001, pp. 627–638 DOI: 10.1006/jcss.2001.1781
[Ind03] Piotr Indyk “Better algorithms for high-dimensional proximity problems via asymmetric embeddings” In Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, SODA ’03 USA: Society for IndustrialApplied Mathematics, 2003, pp. 539–545
[IP99] Russell Impagliazzo and Ramamohan Paturi “The Complexity of k-SAT” In Proceedings of the Fourteenth Annual IEEE Conference on Computational Complexity, COCO ’99 USA: IEEE Computer Society, 1999, pp. 237
[IPZ01] Russell Impagliazzo, Ramamohan Paturi and Francis Zane “Which Problems Have Strongly Exponential Complexity?” In Journal of Computer and System Sciences 63.4, 2001, pp. 512–530 DOI: 10.1006/jcss.2001.1774
[JL84] William B. Johnson and Joram Lindenstrauss “Extensions of Lipschitz mappings into a Hilbert space” In Contemporary Mathematics 26 Providence, Rhode Island: American Mathematical Society, 1984, pp. 189–206 DOI: 10.1090/conm/026/737400
[Kho05] Subhash Khot “Hardness of approximating the shortest vector problem in lattices” In Journal of the ACM 52.5, 2005, pp. 789–808 DOI: 10.1145/1089023.1089027
[Kho10] Subhash Khot “Inapproximability Results for Computational Problems on Lattices” In The LLL Algorithm: Survey and Applications, Information Security and Cryptography Berlin, Heidelberg: Springer, 2010, pp. 453–473 DOI: 10.1007/978-3-642-02295-1˙14
[Kle97] Jon M. Kleinberg “Two algorithms for nearest-neighbor search in high dimensions” In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing (STOC) El Paso, Texas, United States: ACM Press, 1997, pp. 599–608 DOI: 10.1145/258533.258653
[Laa15] Thijs Laarhoven “Sieving for Shortest Vectors in Lattices Using Angular Locality-Sensitive Hashing” In Advances in Cryptology (CRYPTO), Lecture Notes in Computer Science Berlin, Heidelberg: Springer, 2015, pp. 3–22 DOI: 10.1007/978-3-662-47989-6˙1
[Mat96] Jiří Matoušek “On the distortion required for embedding finite metric spaces into normed spaces” In Israel Journal of Mathematics 93.1, 1996, pp. 333–344 DOI: 10.1007/BF02761110
[MG02] Daniele Micciancio and Shafi Goldwasser “Complexity of Lattice Problems: A Cryptographic Perspective”, The Springer International Series in Engineering and Computer Science Springer US, 2002 DOI: 10.1007/978-1-4615-0897-7
[Mic01] Daniele Micciancio “The Shortest Vector in a Lattice is Hard to Approximate to within Some Constant” Publisher: Society for Industrial and Applied Mathematics In SIAM Journal on Computing 30.6, 2001, pp. 2008–2035 DOI: 10.1137/S0097539700373039
[PTW10] Rina Panigrahy, Kunal Talwar and Udi Wieder “Lower Bounds on Near Neighbor Search via Metric Expansion” In IEEE 51st Annual Symposium on Foundations of Computer Science (FOCS), 2010, pp. 805–814
[Reg09] Oded Regev “On lattices, learning with errors, random linear codes, and cryptography” In Journal of the ACM 56.6, 2009, pp. 34:1–34:40 DOI: 10.1145/1568318.1568324
[RR06] Oded Regev and Ricky Rosen “Lattice problems and norm embeddings” In Proceedings of the thirty-eighth annual ACM symposium on Theory of Computing, STOC ’06 New York, NY, USA: Association for Computing Machinery, 2006, pp. 447–456 DOI: 10.1145/1132516.1132581
[Rub18] Aviad Rubinstein “Hardness of approximate nearest neighbor search” In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC), 2018, pp. 1260–1268 DOI: 10.1145/3188745.3188916
[SDI06] Gregory Shakhnarovich, Trevor Darrell and Piotr Indyk “Nearest-Neighbor Methods in Learning and Vision: Theory and Practice” The MIT Press, 2006
[Val15] Gregory Valiant “Finding Correlations in Subquadratic Time, with Applications to Learning Parities and the Closest Pair Problem” In Journal of the ACM 62.2, 2015, pp. 13:1–13:45 DOI: 10.1145/2728167
[Wil05] Ryan Williams “A new algorithm for optimal 2-constraint satisfaction and its implications” In Theoretical Computer Science 348.2-3, 2005, pp. 357–365 DOI: 10.1016/j.tcs.2005.09.023
[Wil07] R. Ryan Williams “Algorithms and resource requirements for fundamental problems” AAI3274191 ISBN-13: 9780549164777, 2007
[Wil17] Virginia Vassilevska Williams “On some fine-grained questions in algorithms and complexity” In ICM 2018, 2017
[WW18] Virginia Vassilevska Williams and R. Ryan Williams “Subcubic Equivalences Between Path, Matrix, and Triangle Problems” In Journal of the ACM 65.5, 2018, pp. 1–38 DOI: 10.1145/3186893
[Yao79] Andrew Chi-Chih Yao “Some Complexity Questions Related to Distributive Computing” In Proceedings of the 11h Annual ACM SIGACT Symposium on Theory of Computing (STOC), 1979, pp. 209–213

	$\displaystyle~{}A_{0}:=\left\{\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{1},\vec{\alpha}\in\{0,1\}^{n/2}\right\}$
	$\displaystyle~{}A_{1}:=\left\{\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{1},\vec{\alpha}\in\{0,1\}^{n/2},\vec{\alpha}\neq\vec{0}\right\}$
	$\displaystyle~{}B_{0}:=\left\{-\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{2},\vec{\alpha}\in\{0,1\}^{n/2}\right\}$
	$\displaystyle~{}B_{1}:=\left\{-\sum_{i}\alpha_{i}b_{i}\|b_{i}\in\mathcal{B}_{2},\vec{\alpha}\in\{0,1\}^{n/2},\vec{\alpha}\neq\vec{0}\right\}.$

Hardness of Approximate Nearest Neighbor Search under L-infinity

Abstract

1 Introduction

1.1 Previous Works

Approximate nearest neighbors (ANN) in ℓ∞\ell_{\infty}.

Shortest vector problem (SVP).

1.2 Our Results

Theorem 1.1 (Informal).

Corollary 1.2 (Informal).

Theorem 1.3 (Informal).

Corollary 1.4 (Informal).

1.3 Future Directions

Conjecture 1.5.

Acknowledgments.

2 Preliminaries

2.1 Complexity Assumptions

Definition 2.1 (Strong Exponential Time Hypothesis [IP99]).

Definition 2.2 (Orthogonal Vectors).

Conjecture 2.3 (Orthogonal Vectors Conjecture).

2.2 Nearest Neighbor Search Problems

Definition 2.4 (Approximate Nearest Neighbor Search).

Definition 2.5 (Approximate Bichromatic Closest Pair).

2.3 Lattice Problems

Definition 2.6 (Shortest Vector Problem).

Definition 2.7 (Shortest Vector Problem with Binary Restriction).

Theorem 2.8 ([BGSD17, Corollary 6.7]).

Remark 2.9.

2.4 Subset Query Problems

Definition 2.10 (Subset Query).

Definition 2.11 (Bichromatic Subset Query).

Theorem 2.12 ([Wil05, Theorem 5.1]).

3 Proof of Main Theorems

3.1 Bichromatic Closest Pair to ANN

Lemma 3.1.

Proof.

3.2 𝖲𝖵𝖯p{0,1}\mathsf{SVP}_{p}^{\{0,1\}} Hardness implies Hardness of ANN under ℓp\ell_{p}

Lemma 3.2.

Proof.

Claim 3.3 (Correctness).

Proof.

Remark 3.4.

Theorem 3.5.

Proof.

Corollary 3.6.

3.3 Bichromatic Subset Query Hardness implies Hardness of ANN under ℓ∞\ell_{\infty}

Lemma 3.7.

Proof.

Theorem 3.8.

Proof.

3.4 Limitation of OV-based Hardness for ANN

Theorem 3.9.

Proof.

References

Hardness of Approximate Nearest Neighbor Search
under L-infinity

Approximate nearest neighbors (ANN) in $\ell_{\infty}$ .

3.2 $\mathsf{SVP}_{p}^{\{0,1\}}$ Hardness implies Hardness of ANN under $\ell_{p}$

3.3 Bichromatic Subset Query Hardness implies Hardness of ANN under $\ell_{\infty}$