
On Lattices, Learning with Errors,
Random Linear Codes, and Cryptography

Oded Regev¹

¹ School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel. Supported by an Alon Fellowship, by the Binational Science Foundation, by the Israel Science Foundation, by the Army Research Office grant DAAD19-03-1-0082, by the European Commission under the Integrated Project QAP funded by the IST directorate as Contract Number 015848, and by a European Research Council (ERC) Starting Grant.
Abstract

Our main result is a reduction from worst-case lattice problems such as GapSVP and SIVP to a certain learning problem. This learning problem is a natural extension of the ‘learning from parity with error’ problem to higher moduli. It can also be viewed as the problem of decoding from a random linear code. This, we believe, gives a strong indication that these problems are hard. Our reduction, however, is quantum. Hence, an efficient solution to the learning problem implies a quantum algorithm for GapSVP and SIVP. A main open question is whether this reduction can be made classical (i.e., non-quantum).

We also present a (classical) public-key cryptosystem whose security is based on the hardness of the learning problem. By the main result, its security is also based on the worst-case quantum hardness of GapSVP and SIVP. The new cryptosystem is much more efficient than previous lattice-based cryptosystems: the public key is of size $\tilde{O}(n^2)$ and encrypting a message increases its size by a factor of $\tilde{O}(n)$ (in previous cryptosystems these values are $\tilde{O}(n^4)$ and $\tilde{O}(n^2)$, respectively). In fact, under the assumption that all parties share a random bit string of length $\tilde{O}(n^2)$, the size of the public key can be reduced to $\tilde{O}(n)$.

1 Introduction

Main theorem.

For an integer $n \geq 1$ and a real number $\varepsilon \geq 0$, consider the ‘learning from parity with error’ problem, defined as follows: the goal is to find an unknown $\mathbf{s} \in \mathbb{Z}_2^n$ given a list of ‘equations with errors’

$$\langle \mathbf{s}, \mathbf{a}_1 \rangle \approx_\varepsilon b_1 \pmod{2}$$
$$\langle \mathbf{s}, \mathbf{a}_2 \rangle \approx_\varepsilon b_2 \pmod{2}$$
$$\vdots$$

where the $\mathbf{a}_i$’s are chosen independently from the uniform distribution on $\mathbb{Z}_2^n$, $\langle \mathbf{s}, \mathbf{a}_i \rangle = \sum_j s_j (a_i)_j$ is the inner product modulo 2 of $\mathbf{s}$ and $\mathbf{a}_i$, and each equation is correct independently with probability $1-\varepsilon$. More precisely, the input to the problem consists of pairs $(\mathbf{a}_i, b_i)$ where each $\mathbf{a}_i$ is chosen independently and uniformly from $\mathbb{Z}_2^n$ and each $b_i$ is independently chosen to be equal to $\langle \mathbf{s}, \mathbf{a}_i \rangle$ with probability $1-\varepsilon$. The goal is to find $\mathbf{s}$. Notice that the case $\varepsilon = 0$ can be solved efficiently by, say, Gaussian elimination. This requires $O(n)$ equations and $\mathrm{poly}(n)$ time.
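To make the above concrete, here is a small illustrative sketch (ours, not part of the paper’s formal development): it generates ‘equations with errors’ for a hidden $\mathbf{s}$ and solves the noiseless case $\varepsilon = 0$ by Gauss-Jordan elimination over $\mathbb{Z}_2$.

```python
# Sketch: 'learning from parity with error' samples, and Gaussian
# elimination over GF(2) for the noiseless case eps = 0.
import random

def gen_sample(s, eps):
    """One pair (a, b): a uniform in Z_2^n, b = <s, a> mod 2, flipped w.p. eps."""
    a = [random.randrange(2) for _ in range(len(s))]
    b = sum(si * ai for si, ai in zip(s, a)) % 2
    if random.random() < eps:
        b ^= 1
    return a, b

def solve_noiseless(samples, n):
    """Recover s from error-free equations by Gauss-Jordan elimination mod 2."""
    rows = [list(a) + [b] for a, b in samples]
    r, pivots = 0, []
    for col in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue  # rank deficiency; more random equations would fix this
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[r])]
        pivots.append((r, col))
        r += 1
    s = [0] * n
    for row, col in pivots:  # with full rank, each pivot row reads off s[col]
        s[col] = rows[row][n]
    return s

random.seed(1)
n = 16
secret = [random.randrange(2) for _ in range(n)]
samples = [gen_sample(secret, eps=0.0) for _ in range(3 * n)]
assert solve_noiseless(samples, n) == secret
```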

The problem seems to become significantly harder when we take any positive $\varepsilon > 0$. For example, let us consider again the Gaussian elimination process and assume that we are interested in recovering only the first bit of $\mathbf{s}$. Using Gaussian elimination, we can find a set $S$ of $O(n)$ equations such that $\sum_S \mathbf{a}_i$ is $(1,0,\ldots,0)$. Summing the corresponding values $b_i$ gives us a guess for the first bit of $\mathbf{s}$. However, a standard calculation shows that this guess is correct with probability $\frac{1}{2} + 2^{-\Theta(n)}$. Hence, in order to obtain the first bit with good confidence, we have to repeat the whole procedure $2^{\Theta(n)}$ times. This yields an algorithm that uses $2^{O(n)}$ equations and $2^{O(n)}$ time. In fact, it can be shown that given only $O(n)$ equations, the $\mathbf{s}' \in \mathbb{Z}_2^n$ that maximizes the number of satisfied equations is with high probability $\mathbf{s}$. This yields a simple maximum likelihood algorithm that requires only $O(n)$ equations and runs in time $2^{O(n)}$.

Blum, Kalai, and Wasserman [11] provided the first subexponential algorithm for this problem. Their algorithm requires only $2^{O(n/\log n)}$ equations/time and is currently the best known algorithm for the problem. It is based on a clever idea that allows one to find a small set $S$ of equations (say, of size $O(\sqrt{n})$) among $2^{O(n/\log n)}$ equations, such that $\sum_S \mathbf{a}_i$ is, say, $(1,0,\ldots,0)$. This gives us a guess for the first bit of $\mathbf{s}$ that is correct with probability $\frac{1}{2} + 2^{-\Theta(\sqrt{n})}$. We can obtain the correct value with high probability by repeating the whole procedure only $2^{O(\sqrt{n})}$ times. Their idea was later shown to have other important applications, such as the first $2^{O(n)}$-time algorithm for solving the shortest vector problem [23, 5].

An important open question is to explain the apparent difficulty in finding efficient algorithms for this learning problem. Our main theorem explains this difficulty for a natural extension of this problem to higher moduli, defined next.

Let $p = p(n) \leq \mathrm{poly}(n)$ be some prime integer and consider a list of ‘equations with error’

$$\langle \mathbf{s}, \mathbf{a}_1 \rangle \approx_\chi b_1 \pmod{p}$$
$$\langle \mathbf{s}, \mathbf{a}_2 \rangle \approx_\chi b_2 \pmod{p}$$
$$\vdots$$

where this time $\mathbf{s} \in \mathbb{Z}_p^n$, the $\mathbf{a}_i$ are chosen independently and uniformly from $\mathbb{Z}_p^n$, and $b_i \in \mathbb{Z}_p$. The error in the equations is now specified by a probability distribution $\chi : \mathbb{Z}_p \to \mathbb{R}^+$ on $\mathbb{Z}_p$. Namely, for each equation $i$, $b_i = \langle \mathbf{s}, \mathbf{a}_i \rangle + e_i$ where each $e_i \in \mathbb{Z}_p$ is chosen independently according to $\chi$. We denote the problem of recovering $\mathbf{s}$ from such equations by $\mathrm{LWE}_{p,\chi}$ (learning with error). For example, the learning from parity problem with error $\varepsilon$ is the special case where $p = 2$, $\chi(0) = 1-\varepsilon$, and $\chi(1) = \varepsilon$. Under a reasonable assumption on $\chi$ (namely, that $\chi(0) > 1/p + 1/\mathrm{poly}(n)$), the maximum likelihood algorithm described above solves $\mathrm{LWE}_{p,\chi}$ for $p \leq \mathrm{poly}(n)$ using $\mathrm{poly}(n)$ equations and $2^{O(n \log n)}$ time. Under a similar assumption, an algorithm resembling the one by Blum et al. [11] requires only $2^{O(n)}$ equations/time. This is the best known algorithm for the LWE problem.

Our main theorem shows that for certain choices of $p$ and $\chi$, a solution to $\mathrm{LWE}_{p,\chi}$ implies a quantum solution to worst-case lattice problems.

Theorem 1.1 (Informal)

Let $n, p$ be integers and $\alpha \in (0,1)$ be such that $\alpha p > 2\sqrt{n}$. If there exists an efficient algorithm that solves $\mathrm{LWE}_{p,\bar{\Psi}_\alpha}$ then there exists an efficient quantum algorithm that approximates the decision version of the shortest vector problem (GapSVP) and the shortest independent vectors problem (SIVP) to within $\tilde{O}(n/\alpha)$ in the worst case.

The exact definition of $\bar{\Psi}_\alpha$ will be given later. For now, it is enough to know that it is a distribution on $\mathbb{Z}_p$ that has the shape of a discrete Gaussian centered around 0 with standard deviation $\alpha p$, as in Figure 1. Also, the probability of 0 (i.e., no error) is roughly $1/(\alpha p)$. A possible setting for the parameters is $p = O(n^2)$ and $\alpha = 1/(\sqrt{n}\log^2 n)$ (in fact, these are the parameters that we use in our cryptographic application).

Figure 1: $\bar{\Psi}_\alpha$ for $p = 127$ with $\alpha = 0.05$ (left) and $\alpha = 0.1$ (right). The elements of $\mathbb{Z}_p$ are arranged on a circle.
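As a concrete illustration (ours, anticipating the formal definition of $\bar{\Psi}_\alpha$ via Eqs. (7) and (8) in Section 2), the following sketch samples $\bar{\Psi}_\alpha$ and checks empirically that the probability of 0 is indeed roughly $1/(\alpha p)$.

```python
# Sketch: sampling \bar{\Psi}_\alpha: draw a normal of standard deviation
# alpha/sqrt(2*pi), reduce mod 1, multiply by p, round to the nearest
# integer mod p (cf. Eqs. (7) and (8) in Section 2).
import math, random

def sample_psi_bar(alpha, p):
    x = random.gauss(0.0, alpha / math.sqrt(2 * math.pi)) % 1.0
    return round(x * p) % p

p, alpha = 127, 0.05
N = 200_000
zeros = sum(sample_psi_bar(alpha, p) == 0 for _ in range(N))
print(zeros / N, 1 / (alpha * p))  # both roughly 0.157
```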

GapSVP and SIVP are two of the main computational problems on lattices. In GapSVP, for instance, the input is a lattice, and the goal is to approximate the length of the shortest nonzero lattice vector. The best known polynomial time algorithms for them yield only mildly subexponential approximation factors [24, 38, 5]. It is conjectured that there is no classical (i.e., non-quantum) polynomial time algorithm that approximates them to within any polynomial factor. Lattice-based constructions of one-way functions, such as the one by Ajtai [2], are based on this conjecture.

One might even conjecture that there is no quantum polynomial time algorithm that approximates GapSVP (or SIVP) to within any polynomial factor. One can then interpret the main theorem as saying that based on this conjecture, the LWE problem is hard. The only evidence supporting this conjecture is that there are no known quantum algorithms for lattice problems that outperform classical algorithms, even though this is probably one of the most important open questions in the field of quantum computing.²

² If forced to make a guess, the author would say that the conjecture is true.

In fact, one could also interpret our main theorem as a way to disprove this conjecture: if one finds an efficient algorithm for LWE, then one also obtains a quantum algorithm for approximating worst-case lattice problems. Such a result would be of tremendous importance on its own. Finally, we note that it is possible that our main theorem will one day be made classical. This would make all our results stronger and the above discussion unnecessary.

The LWE problem can be equivalently presented as the problem of decoding random linear codes. More specifically, let $m = \mathrm{poly}(n)$ be arbitrary and let $\mathbf{s} \in \mathbb{Z}_p^n$ be some vector. Then, consider the following problem: given a random matrix $Q \in \mathbb{Z}_p^{m \times n}$ and the vector $\mathbf{t} = Q\mathbf{s} + \mathbf{e} \in \mathbb{Z}_p^m$ where each coordinate of the error vector $\mathbf{e} \in \mathbb{Z}_p^m$ is chosen independently from $\bar{\Psi}_\alpha$, recover $\mathbf{s}$. The Hamming weight of $\mathbf{e}$ is roughly $m(1 - 1/(\alpha p))$ (since a value chosen from $\bar{\Psi}_\alpha$ is 0 with probability roughly $1/(\alpha p)$). Hence, the Hamming distance of $\mathbf{t}$ from $Q\mathbf{s}$ is roughly $m(1 - 1/(\alpha p))$. Moreover, it can be seen that for large enough $m$, for any other word $\mathbf{s}'$, the Hamming distance of $\mathbf{t}$ from $Q\mathbf{s}'$ is roughly $m(1 - 1/p)$. Hence, we obtain that approximating the nearest codeword problem to within factors smaller than $(1 - 1/p)/(1 - 1/(\alpha p))$ on random codes is as hard as quantumly approximating worst-case lattice problems. This gives a partial answer to the important open question of understanding the hardness of decoding from random linear codes.
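In code, this matrix view is just the following (a sketch of ours, reusing sample_psi_bar from the previous sketch):

```python
# Sketch: an LWE instance in its random-linear-code form t = Q s + e mod p.
import random

def lwe_code_instance(n, m, p, alpha):
    s = [random.randrange(p) for _ in range(n)]
    Q = [[random.randrange(p) for _ in range(n)] for _ in range(m)]
    e = [sample_psi_bar(alpha, p) for _ in range(m)]
    t = [(sum(q * sj for q, sj in zip(row, s)) + ei) % p
         for row, ei in zip(Q, e)]
    return Q, t, s
```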

It turns out that certain problems, which are seemingly easier than the LWE problem, are in fact equivalent to the LWE problem. We establish these equivalences in Section 4 using elementary reductions. For example, being able to distinguish a set of equations as above from a set of equations in which the $b_i$’s are chosen uniformly from $\mathbb{Z}_p$ is equivalent to solving LWE. Moreover, it is enough to correctly distinguish these two distributions for some non-negligible fraction of all $\mathbf{s}$. The latter formulation is the one we use in our cryptographic applications.

Cryptosystem.

In Section 5 we present a public key cryptosystem and prove that it is secure based on the hardness of the LWE problem. We use the standard security notion of semantic, or IND-CPA, security (see, e.g., [20, Chapter 10]). The cryptosystem and its security proof are entirely classical. In fact, the cryptosystem itself is quite simple; the reader is encouraged to glimpse at the beginning of Section 5. Essentially, the idea is to provide a list of equations as above as the public key; encryption is performed by summing some of the equations (forming another equation with error) and modifying the right hand side depending on the message to be transmitted. Security follows from the fact that a list of equations with error is computationally indistinguishable from a list of equations in which the $b_i$’s are chosen uniformly.
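For illustration, here is a toy rendering (ours) of the scheme as just described; sample_psi_bar is the helper from the earlier sketch, and the parameters in the usage example are illustrative rather than the ones analyzed in Section 5.

```python
# Toy sketch of the cryptosystem: the public key is a list of equations
# with error; a bit is encrypted by summing a random subset of the
# equations and shifting the right-hand side by bit * floor(p/2).
import random

def keygen(n, m, p, alpha):
    s = [random.randrange(p) for _ in range(n)]
    pub = []
    for _ in range(m):
        a = [random.randrange(p) for _ in range(n)]
        b = (sum(x * y for x, y in zip(a, s)) + sample_psi_bar(alpha, p)) % p
        pub.append((a, b))
    return s, pub

def encrypt(pub, bit, n, p):
    subset = [pair for pair in pub if random.randrange(2)]
    a = [sum(pair[0][j] for pair in subset) % p for j in range(n)]
    b = (sum(pair[1] for pair in subset) + bit * (p // 2)) % p
    return a, b

def decrypt(s, ct, p):
    a, b = ct
    d = (b - sum(x * y for x, y in zip(a, s))) % p
    return 0 if min(d, p - d) < abs(d - p // 2) else 1

random.seed(2)
n, m, p, alpha = 10, 50, 1009, 0.005
s, pub = keygen(n, m, p, alpha)
assert decrypt(s, encrypt(pub, 1, n, p), p) == 1
```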

By using our main theorem, we obtain that the security of the system is based also on the worst-case quantum hardness of approximating SIVP and GapSVP to within $\tilde{O}(n^{1.5})$. In other words, breaking our cryptosystem implies an efficient quantum algorithm for approximating SIVP and GapSVP to within $\tilde{O}(n^{1.5})$. Previous cryptosystems, such as the Ajtai-Dwork cryptosystem [4] and the one by Regev [36], are based on the worst-case (classical) hardness of the unique-SVP problem, which can be related to GapSVP (but not SIVP) through the recent result of Lyubashevsky and Micciancio [26].

Another important feature of our cryptosystem is its improved efficiency. In previous cryptosystems, the public key size is $\tilde{O}(n^4)$ and the encryption increases the size of messages by a factor of $\tilde{O}(n^2)$. In our cryptosystem, the public key size is only $\tilde{O}(n^2)$ and encryption increases the size of messages by a factor of only $\tilde{O}(n)$. This possibly makes our cryptosystem practical. Moreover, using an idea of Ajtai [3], we can reduce the size of the public key to $\tilde{O}(n)$. This requires all users of the cryptosystem to share some (trusted) random bit string of length $\tilde{O}(n^2)$. This can be achieved by, say, distributing such a bit string as part of the encryption and decryption software.

We mention that learning problems similar to ours were already suggested as possible sources of cryptographic hardness in, e.g., [10, 7], although this was done without establishing any connection to lattice problems. In another related work [3], Ajtai suggested a cryptosystem that has several properties in common with ours (including its efficiency), although its security is not based on worst-case lattice problems.

Why quantum?

This paper is almost entirely classical. In fact, quantum is needed only in one step in the proof of the main theorem. Making this step classical would make the entire reduction classical. To demonstrate the difficulty, consider the following situation. Let $L$ be some lattice and let $d = \lambda_1(L)/n^{10}$ where $\lambda_1(L)$ is the length of the shortest nonzero vector in $L$. We are given an oracle that for any point $\mathbf{x} \in \mathbb{R}^n$ within distance $d$ of $L$ finds the closest lattice vector to $\mathbf{x}$. If $\mathbf{x}$ is not within distance $d$ of $L$, the output of the oracle is undefined. Intuitively, such an oracle seems quite powerful; the best known algorithms for performing such a task require exponential time. Nevertheless, we do not see any way to use this oracle classically. Indeed, it seems to us that the only way to generate inputs to the oracle is the following: somehow choose a lattice point $\mathbf{y} \in L$ and let $\mathbf{x} = \mathbf{y} + \mathbf{z}$ for some perturbation vector $\mathbf{z}$ of length at most $d$. Clearly, on input $\mathbf{x}$ the oracle outputs $\mathbf{y}$. But this is useless since we already know $\mathbf{y}$!

It turns out that quantumly, such an oracle is quite useful. Indeed, being able to compute $\mathbf{y}$ from $\mathbf{x}$ allows us to uncompute $\mathbf{y}$. More precisely, it allows us to transform the quantum state $|\mathbf{x}, \mathbf{y}\rangle$ to the state $|\mathbf{x}, 0\rangle$ in a reversible (i.e., unitary) way. This ability to erase the contents of a memory cell in a reversible way seems useful only in the quantum setting.

Techniques.

Unlike previous constructions of lattice-based public-key cryptosystems, the proof of our main theorem uses an ‘iterative construction’. Essentially, this means that instead of ‘immediately’ finding very short vectors in a lattice, the reduction proceeds in steps where in each step shorter lattice vectors are found. So far, such iterative techniques have been used only in the construction of lattice-based one-way functions [2, 12, 27, 29]. Another novel aspect of our main theorem is its crucial use of quantum computation. Our cryptosystem is the first classical cryptosystem whose security is based on a quantum hardness assumption (see [30] for a somewhat related recent work).

Our proof is based on the Fourier transform of Gaussian measures, a technique that was developed in previous papers [36, 29, 1]. More specifically, we use a parameter known as the smoothing parameter, as introduced in [29]. We also use the discrete Gaussian distribution and approximations to its Fourier transform, ideas that were developed in [1].

Open questions.

The main open question raised by this work is whether Theorem 1.1 can be dequantized: can the hardness of LWE be established based on the classical hardness of SIVP and GapSVP? We see no reason why this should be impossible. However, despite our efforts over the last few years, we were not able to show this. As mentioned above, the difficulty is that there seems to be no classical way to use an oracle that solves the closest vector problem within small distances. Quantumly, however, such an oracle turns out to be quite useful.

Another important open question is to determine the hardness of the learning from parity with errors problem (i.e., the case $p = 2$). Our theorem only works for $p > 2\sqrt{n}$. It seems that in order to prove similar results for smaller values of $p$, substantially new ideas are required. Alternatively, one can interpret our inability to prove hardness for small $p$ as an indication that the problem might be easier than believed.

Finally, it would be interesting to relate the LWE problem to other average-case problems in the literature, and especially to those considered by Feige in [15]. See Alekhnovich’s paper [7] for some related work.

Followup work.

We now describe some of the followup work that has appeared since the original publication of our results in 2005 [37].

One line of work focussed on improvements to our cryptosystem. First, Kawachi, Tanaka, and Xagawa [21] proposed a modification to our cryptosystem that slightly improves the encryption blowup to $O(n)$, essentially getting rid of a $\log$ factor. A much more significant improvement is described by Peikert, Vaikuntanathan, and Waters in [34]. By a relatively simple modification to the cryptosystem, they managed to bring the encryption blowup down to only $O(1)$, in addition to several equally significant improvements in running time. Finally, Akavia, Goldwasser, and Vaikuntanathan [6] show that our cryptosystem remains secure even if almost the entire secret key is leaked.

Another line of work focussed on the design of other cryptographic protocols whose security is based on the hardness of the LWE problem. First, Peikert and Waters [35] constructed, among other things, CCA-secure cryptosystems (see also [33] for a simpler construction). These are cryptosystems that are secure even if the adversary is allowed access to a decryption oracle (see, e.g., [20, Chapter 10]). All previous lattice-based cryptosystems (including the one in this paper) are not CCA-secure. Second, Peikert, Vaikuntanathan, and Waters [34] showed how to construct oblivious transfer protocols, which are useful, e.g., for performing secure multiparty computation. Third, Gentry, Peikert, and Vaikuntanathan [16] constructed an identity-based encryption (IBE) scheme. This is a public-key encryption scheme in which the public key can be any unique identifier of the user; very few constructions of such schemes are known. Finally, Cash, Peikert, and Sahai [13] constructed a public-key cryptosystem that remains secure even when the encrypted messages may depend upon the secret key. The security of all the above constructions is based on the LWE problem and hence, by our main theorem, also on the worst-case quantum hardness of lattice problems.

The LWE problem has also been used by Klivans and Sherstov to show hardness results related to learning halfspaces [22]. As before, due to our main theorem, this implies hardness of learning halfspaces based on the worst-case quantum hardness of lattice problems.

Finally, we mention two results giving further evidence for the hardness of the LWE problem. In the first, Peikert [32] somewhat strengthens our main theorem by replacing our worst-case lattice problems with their analogues for the $\ell_q$ norm, where $2 \leq q \leq \infty$ is arbitrary. Our main theorem only deals with the standard $\ell_2$ versions.

In another recent result, Peikert [33] shows that the quantum part of our proof can be removed, leading to a classical reduction from GapSVP to the LWE problem. As a result, Peikert is able to show that public-key cryptosystems (including many of the above LWE-based schemes) can be based on the classical hardness of GapSVP, resolving a long-standing open question (see also [26]). Roughly speaking, the way Peikert circumvents the difficulty we described earlier is by noticing that the existence of an oracle that is able to recover $\mathbf{y}$ from $\mathbf{y} + \mathbf{z}$, where $\mathbf{y}$ is a random lattice point and $\mathbf{z}$ is a random perturbation of length at most $d$, is by itself a useful piece of information as it provides a lower bound on the length of the shortest nonzero vector. By trying to construct such oracles for several different values of $d$ and checking which ones work, Peikert is able to obtain a good approximation of the length of the shortest nonzero vector.

Removing the quantum part, however, comes at a cost: the construction can no longer be iterative, the hardness can no longer be based on SIVP, and even for hardness based on GapSVP, the modulus $p$ in the LWE problem must be exponentially big unless we assume the hardness of a non-standard variant of GapSVP. Because of this, we believe that dequantizing our main theorem remains an important open problem.

1.1 Overview

In this subsection, we give a brief informal overview of the proof of our main theorem, Theorem 1.1. The complete proof appears in Section 3. We do not discuss here the reductions in Section 4 and the cryptosystem in Section 5 as these parts of the paper are more similar to previous work.

In addition to some very basic definitions related to lattices, we will make heavy use here of the discrete Gaussian distribution on $L$ of width $r$, denoted $D_{L,r}$. This is the distribution whose support is $L$ (which is typically a lattice), and in which the probability of each $\mathbf{x} \in L$ is proportional to $\exp(-\pi\|\mathbf{x}/r\|^2)$ (see Eq. (6) and Figure 2). We also mention here the smoothing parameter $\eta_\varepsilon(L)$. This is a real positive number associated with any lattice $L$ ($\varepsilon$ is an accuracy parameter which we can safely ignore here). Roughly speaking, it gives the smallest $r$ starting from which $D_{L,r}$ ‘behaves like’ a continuous Gaussian distribution. For instance, for $r \geq \eta_\varepsilon(L)$, vectors chosen from $D_{L,r}$ have norm roughly $r\sqrt{n}$ with high probability. In contrast, for sufficiently small $r$, $D_{L,r}$ gives almost all its mass to the origin $\mathbf{0}$. Although not required for this section, a complete list of definitions can be found in Section 2.

Figure 2: $D_{L,2}$ (left) and $D_{L,1}$ (right) for a two-dimensional lattice $L$. The $z$-axis represents probability.
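As a concrete illustration (ours), the one-dimensional discrete Gaussian $D_{\mathbb{Z},r}$ can be sampled by normalizing $\rho_r$ over a truncated window; the cutoff at $10r$ is a practical choice, justified by the exponentially small tail mass (cf. Lemma 2.5). Since $\rho_r$ factorizes over coordinates, this also samples $D_{\mathbb{Z}^n,r}$ coordinate-wise.

```python
# Sketch: sampling D_{Z,r} with rho_r(x) = exp(-pi * (x/r)^2), truncated.
import math, random

def sample_discrete_gaussian(r, cutoff=10):
    support = list(range(-int(cutoff * r) - 1, int(cutoff * r) + 2))
    weights = [math.exp(-math.pi * (x / r) ** 2) for x in support]
    return random.choices(support, weights=weights)[0]

# D_{Z^n, r} is a product distribution, so sample coordinate-wise:
n, r = 5, 3.0
y = [sample_discrete_gaussian(r) for _ in range(n)]
```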

Let $\alpha, p, n$ be such that $\alpha p > 2\sqrt{n}$, as required in Theorem 1.1, and assume we have an oracle that solves $\mathrm{LWE}_{p,\bar{\Psi}_\alpha}$. For concreteness, we can think of $p = n^2$ and $\alpha = 1/n$. Our goal is to show how to solve the two lattice problems mentioned in Theorem 1.1. As we prove in Subsection 3.3 using standard reductions, it suffices to solve the following discrete Gaussian sampling problem (DGS): given an $n$-dimensional lattice $L$ and a number $r \geq \sqrt{2n}\cdot\eta_\varepsilon(L)/\alpha$, output a sample from $D_{L,r}$. Intuitively, the connection to GapSVP and SIVP comes from the fact that by taking $r$ close to its lower limit $\sqrt{2n}\cdot\eta_\varepsilon(L)/\alpha$, we can obtain short lattice vectors (of length roughly $\sqrt{n}r$). In the rest of this subsection we describe our algorithm for sampling from $D_{L,r}$. We note that the exact lower bound on $r$ is not that important for purposes of this overview, as it only affects the approximation factor we obtain for GapSVP and SIVP. It suffices to keep in mind that our goal is to sample from $D_{L,r}$ for $r$ that is rather small, say within a polynomial factor of $\eta_\varepsilon(L)$.

The core of the algorithm is the following procedure, which we call the ‘iterative step’. Its input consists of a number $r$ (which is guaranteed to be not too small, namely, greater than $\sqrt{2}p\eta_\varepsilon(L)$), and $n^c$ samples from $D_{L,r}$ where $c$ is some constant. Its output is a sample from the distribution $D_{L,r'}$ for $r' = r\sqrt{n}/(\alpha p)$. Notice that since $\alpha p > 2\sqrt{n}$, $r' < r/2$. In order to perform this ‘magic’ of converting vectors of norm $\sqrt{n}r$ into shorter vectors of norm $\sqrt{n}r'$, the procedure of course needs to use the LWE oracle.

Given the iterative step, the algorithm for solving DGS works as follows. Let $r_i$ denote $r \cdot (\alpha p/\sqrt{n})^i$. The algorithm starts by producing $n^c$ samples from $D_{L,r_{3n}}$. Because $r_{3n}$ is so large, such samples can be computed efficiently by a simple procedure described in Lemma 3.2. Next comes the core of the algorithm: for $i = 3n, 3n-1, \ldots, 1$ the algorithm uses its $n^c$ samples from $D_{L,r_i}$ to produce $n^c$ samples from $D_{L,r_{i-1}}$ by calling the iterative step $n^c$ times. Eventually, we end up with $n^c$ samples from $D_{L,r_0} = D_{L,r}$ and we complete the algorithm by simply outputting the first of those. Note the following crucial fact: using $n^c$ samples from $D_{L,r_i}$, we are able to generate the same number of samples $n^c$ from $D_{L,r_{i-1}}$ (in fact, we could even generate more than $n^c$ samples). The algorithm would not work if we could only generate, say, $n^c/2$ samples, as this would require us to start with an exponential number of samples.
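Schematically, the outer loop looks as follows (an outline of ours; bootstrap_samples and iterative_step are placeholders standing in for Lemma 3.2 and for the iterative step described below).

```python
# Schematic outline of the DGS algorithm; the two subroutines are
# placeholders, not implementations from the paper.
import math

def solve_dgs(L, r, n, c, alpha, p, bootstrap_samples, iterative_step):
    num = n ** c
    # r_i = r * (alpha * p / sqrt(n))^i; start at i = 3n, where sampling is easy.
    samples = bootstrap_samples(L, r * (alpha * p / math.sqrt(n)) ** (3 * n), num)
    for i in range(3 * n, 0, -1):
        r_i = r * (alpha * p / math.sqrt(n)) ** i
        # n^c calls to the iterative step replenish the pool at width r_{i-1}.
        samples = [iterative_step(L, r_i, samples) for _ in range(num)]
    return samples[0]
```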

We now finally get to describe the iterative step. Recall that as input we have $n^c$ samples from $D_{L,r}$ and we are supposed to generate a sample from $D_{L,r'}$ where $r' = r\sqrt{n}/(\alpha p)$. Moreover, $r$ is known and guaranteed to be at least $\sqrt{2}p\eta_\varepsilon(L)$, which can be shown to imply that $p/r < \lambda_1(L^*)/2$. As mentioned above, the exact lower bound on $r$ does not matter much for this overview; it suffices to keep in mind that $r$ is sufficiently larger than $\eta_\varepsilon(L)$, and that $1/r$ is sufficiently smaller than $\lambda_1(L^*)$.

The iterative step is obtained by combining two parts (see Figure 3). In the first part, we construct a classical algorithm that uses the given samples and the LWE oracle to solve the following closest vector problem, which we denote by $\mathrm{CVP}_{L^*,\alpha p/r}$: given any point $\mathbf{x} \in \mathbb{R}^n$ within distance $\alpha p/r$ of the dual lattice $L^*$, output the closest vector in $L^*$ to $\mathbf{x}$.³ By our assumption on $r$, the distance between any two points in $L^*$ is greater than $2\alpha p/r$ and hence the closest vector is unique. In the second part, we use this algorithm to generate samples from $D_{L,r'}$. This part is quantum (and in fact, the only quantum part of our proof). The idea here is to use the $\mathrm{CVP}_{L^*,\alpha p/r}$ algorithm to generate a certain quantum superposition which, after applying the quantum Fourier transform and performing a measurement, provides us with a sample from $D_{L,r\sqrt{n}/(\alpha p)}$. In the following, we describe each of the two parts in more detail.

³ In fact, we only solve $\mathrm{CVP}_{L^*,\alpha p/(\sqrt{2}r)}$ but for simplicity we ignore the factor $\sqrt{2}$ here.

Figure 3: Two iterations of the algorithm
Part 1:

We start by recalling the main idea in [1]. Consider some probability distribution $D$ on some lattice $L$ and consider its Fourier transform $f : \mathbb{R}^n \to \mathbb{C}$, defined as

$$f(\mathbf{x}) = \sum_{\mathbf{y} \in L} D(\mathbf{y}) \exp(2\pi i \langle \mathbf{x}, \mathbf{y} \rangle) = \mathop{\mathrm{Exp}}_{\mathbf{y} \sim D}\left[\exp(2\pi i \langle \mathbf{x}, \mathbf{y} \rangle)\right]$$

where in the second equality we simply rewrite the sum as an expectation. By definition, $f$ is $L^*$-periodic, i.e., $f(\mathbf{x}) = f(\mathbf{x} + \mathbf{y})$ for any $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in L^*$. In [1] it was shown that given a polynomial number of samples from $D$, one can compute an approximation of $f$ to within $\pm 1/\mathrm{poly}(n)$. To see this, note that by the Chernoff-Hoeffding bound, if $\mathbf{y}_1, \ldots, \mathbf{y}_N$ are $N = \mathrm{poly}(n)$ independent samples from $D$, then

$$f(\mathbf{x}) \approx \frac{1}{N} \sum_{j=1}^N \exp(2\pi i \langle \mathbf{x}, \mathbf{y}_j \rangle)$$

where the approximation is to within $\pm 1/\mathrm{poly}(n)$ and holds with probability exponentially close to 1, assuming that $N$ is a large enough polynomial.
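In code, the estimator is just an empirical average (a sketch of ours):

```python
# Sketch: estimating f at a point x from samples y_1, ..., y_N, as in the
# displayed approximation above.
import cmath

def estimate_f(x, samples):
    """Empirical average of exp(2*pi*i*<x, y>) over the given samples."""
    total = sum(cmath.exp(2j * cmath.pi * sum(xi * yi for xi, yi in zip(x, y)))
                for y in samples)
    return total / len(samples)
```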

By applying this idea to the samples from $D_{L,r}$ given to us as input, we obtain a good approximation of the Fourier transform of $D_{L,r}$, which we denote by $f_{1/r}$. It can be shown that since $1/r \ll \lambda_1(L^*)$ one has the approximation

$$f_{1/r}(\mathbf{x}) \approx \exp\left(-\pi (r \cdot \mathrm{dist}(L^*, \mathbf{x}))^2\right) \qquad (1)$$

(see Figure 4). Hence, $f_{1/r}(\mathbf{x}) \approx 1$ for any $\mathbf{x} \in L^*$ (in fact an equality holds) and as one moves away from $L^*$, its value decreases. For points within distance, say, $1/r$ from the lattice, its value is still some positive constant (roughly $\exp(-\pi)$). As the distance from $L^*$ increases, the value of the function soon becomes negligible. Since the distance between any two vectors in $L^*$ is at least $\lambda_1(L^*) \gg 1/r$, the Gaussians around each point of $L^*$ are well-separated.

Figure 4: $f_{1/r}$ for a two-dimensional lattice

Although not needed in this paper, let us briefly outline how one can solve $\mathrm{CVP}_{L^*,1/r}$ using samples from $D_{L,r}$. Assume that we are given some point $\mathbf{x}$ within distance $1/r$ of $L^*$. Intuitively, this $\mathbf{x}$ is located on one of the Gaussians of $f_{1/r}$. By repeatedly computing an approximation of $f_{1/r}$ using the samples from $D_{L,r}$ as described above, we ‘walk uphill’ on $f_{1/r}$ in an attempt to find its ‘peak’. This peak corresponds to the closest lattice point to $\mathbf{x}$. Actually, the procedure as described here does not quite work: due to the error in our approximation of $f_{1/r}$, we cannot find the closest lattice point exactly. It is possible to overcome this difficulty; see [25] for the details. The same procedure actually works for slightly longer distances, namely $O(\sqrt{\log n}/r)$, but beyond that distance the value of $f_{1/r}$ becomes negligible and no useful information can be extracted from our approximation of it.

Unfortunately, solving $\mathrm{CVP}_{L^*,1/r}$ is not useful for the iterative step as it would lead to samples from $D_{L,r\sqrt{n}}$, which is a wider rather than a narrower distribution than the one we started with. This is not surprising, since our solution to $\mathrm{CVP}_{L^*,1/r}$ did not use the LWE oracle. Using the LWE oracle, we will now show that we can gain an extra $\alpha p$ factor in the radius, and obtain the desired $\mathrm{CVP}_{L^*,\alpha p/r}$ algorithm.

Notice that if we could somehow obtain samples from $D_{L,r/p}$ we would be done: using the procedure described above, we could solve $\mathrm{CVP}_{L^*,p/r}$, which is better than what we need. Unfortunately, it is not clear how to obtain such samples, even with the help of the LWE oracle. Nevertheless, here is an obvious way to obtain something similar to samples from $D_{L,r/p}$: just take the given samples from $D_{L,r}$ and divide them by $p$. This provides us with samples from $D_{L/p,r/p}$ where $L/p$ is the lattice $L$ scaled down by a factor of $p$. In the following we will show how to use these samples to solve $\mathrm{CVP}_{L^*,\alpha p/r}$.

Let us first try to understand what the distribution $D_{L/p,r/p}$ looks like. Notice that the lattice $L/p$ consists of $p^n$ translates of the original lattice $L$. Namely, for each $\mathbf{a} \in \mathbb{Z}_p^n$, consider the set

$$L + L\mathbf{a}/p = \{L\mathbf{b}/p \mid \mathbf{b} \in \mathbb{Z}^n,\ \mathbf{b} \bmod p = \mathbf{a}\}.$$

Then $\{L + L\mathbf{a}/p \mid \mathbf{a} \in \mathbb{Z}_p^n\}$ forms a partition of $L/p$. Moreover, it can be shown that since $r/p$ is larger than the smoothing parameter $\eta_\varepsilon(L)$, the probability given to each $L + L\mathbf{a}/p$ under $D_{L/p,r/p}$ is essentially the same, that is, $p^{-n}$. Intuitively, beyond the smoothing parameter, the Gaussian measure no longer ‘sees’ the discrete structure of $L$, so in particular it is not affected by translations (this will be shown in Claim 3.8).

This leads us to consider the following distribution, call it $\tilde{D}$. A sample from $\tilde{D}$ is a pair $(\mathbf{a}, \mathbf{y})$ where $\mathbf{y}$ is sampled from $D_{L/p,r/p}$, and $\mathbf{a} \in \mathbb{Z}_p^n$ is such that $\mathbf{y} \in L + L\mathbf{a}/p$. Notice that we can easily obtain samples from $\tilde{D}$ using the given samples from $D_{L,r}$. From the above discussion we have that the marginal distribution of $\mathbf{a}$ is essentially uniform. Moreover, by definition we have that the distribution of $\mathbf{y}$ conditioned on any $\mathbf{a}$ is $D_{L+L\mathbf{a}/p,\, r/p}$. Hence, $\tilde{D}$ is essentially identical to the distribution on pairs $(\mathbf{a}, \mathbf{y})$ in which $\mathbf{a} \in \mathbb{Z}_p^n$ is chosen uniformly at random, and then $\mathbf{y}$ is sampled from $D_{L+L\mathbf{a}/p,\, r/p}$. From now on, we think of $\tilde{D}$ as being this distribution.
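Concretely (a sketch of ours, with numpy assumed for the linear algebra), a sample from $\tilde{D}$ is obtained from a sample $\mathbf{y}$ of $D_{L,r}$ by reading off the coefficient vector of $\mathbf{y}$ modulo $p$:

```python
# Sketch: mapping a sample y of D_{L,r} to a pair (a, y/p) from D-tilde.
import numpy as np

def d_tilde_sample(L, y, p):
    """L: n x n basis matrix (columns are basis vectors); y: a numpy vector in L."""
    b = np.rint(np.linalg.solve(L, y)).astype(int)  # y = L b with b integral
    return b % p, y / p
```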

We now examine the Fourier transform of $D_{L+L\mathbf{a}/p,\, r/p}$ (see Figure 5). When $\mathbf{a}$ is zero, we already know that the Fourier transform is $f_{p/r}$. For general $\mathbf{a}$, a standard calculation shows that the Fourier transform of $D_{L+L\mathbf{a}/p,\, r/p}$ is given by

$$\exp(2\pi i \langle \mathbf{a}, \tau(\mathbf{x}) \rangle / p) \cdot f_{p/r}(\mathbf{x}) \qquad (2)$$

where $\tau(\mathbf{x}) \in \mathbb{Z}_p^n$ is defined as

$$\tau(\mathbf{x}) := (L^*)^{-1} \kappa_{L^*}(\mathbf{x}) \bmod p,$$

and $\kappa_{L^*}(\mathbf{x})$ denotes the (unique) closest vector in $L^*$ to $\mathbf{x}$. In other words, $\tau(\mathbf{x})$ is the vector of coefficients of the vector in $L^*$ closest to $\mathbf{x}$ when represented in the basis of $L^*$, reduced modulo $p$. So we see that the Fourier transform of $D_{L+L\mathbf{a}/p,\, r/p}$ is essentially $f_{p/r}$, except that each ‘hill’ gets its own phase depending on the vector of coefficients of the lattice point at its center. The appearance of these phases is a result of a well-known property of the Fourier transform: translation is transformed to multiplication by a phase.

Figure 5: The Fourier transform of $D_{L+L\mathbf{a}/p,\, r/p}$ with $n = 2$, $p = 2$, $\mathbf{a} = (0,0)$ (left), $\mathbf{a} = (1,1)$ (right).

Equipped with this understanding of the Fourier transform of $D_{L+L\mathbf{a}/p,\, r/p}$, we can get back to our task of solving $\mathrm{CVP}_{L^*,\alpha p/r}$. By the definition of the Fourier transform, we know that the average of $\exp(2\pi i \langle \mathbf{x}, \mathbf{y} \rangle)$ over $\mathbf{y} \sim D_{L+L\mathbf{a}/p,\, r/p}$ is given by (2). Assume for simplicity that $\mathbf{x} \in L^*$ (even though in this case finding the closest vector is trivial; it is simply $\mathbf{x}$ itself). In this case, (2) is equal to $\exp(2\pi i \langle \mathbf{a}, \tau(\mathbf{x}) \rangle / p)$. Since the absolute value of this expression is 1, we see that for such $\mathbf{x}$, the random variable $\langle \mathbf{x}, \mathbf{y} \rangle \bmod 1$ (where $\mathbf{y} \sim D_{L+L\mathbf{a}/p,\, r/p}$) must be deterministically equal to $\langle \mathbf{a}, \tau(\mathbf{x}) \rangle / p \bmod 1$ (this fact can also be seen directly). In other words, when $\mathbf{x} \in L^*$, each sample $(\mathbf{a}, \mathbf{y})$ from $\tilde{D}$ provides us with a linear equation

$$\langle \mathbf{a}, \tau(\mathbf{x}) \rangle = p \langle \mathbf{x}, \mathbf{y} \rangle \bmod p$$

with $\mathbf{a}$ distributed essentially uniformly in $\mathbb{Z}_p^n$. After collecting about $n$ such equations, we can use Gaussian elimination to recover $\tau(\mathbf{x}) \in \mathbb{Z}_p^n$. And as we shall show in Lemma 3.5 using a simple reduction, the ability to compute $\tau(\mathbf{x})$ easily leads to the ability to compute the closest vector to $\mathbf{x}$.

We now turn to the more interesting case in which $\mathbf{x}$ is not in $L^*$, but only within distance $\alpha p/r$ of $L^*$. In this case, the phase of (2) is still equal to $\exp(2\pi i \langle \mathbf{a}, \tau(\mathbf{x}) \rangle / p)$. Its absolute value, however, is no longer 1, but still quite close to 1 (depending on the distance of $\mathbf{x}$ from $L^*$). Therefore, the random variable $\langle \mathbf{x}, \mathbf{y} \rangle \bmod 1$, where $\mathbf{y} \sim D_{L+L\mathbf{a}/p,\, r/p}$, must typically be quite close to $\langle \mathbf{a}, \tau(\mathbf{x}) \rangle / p \bmod 1$ (since, as before, the average of $\exp(2\pi i \langle \mathbf{x}, \mathbf{y} \rangle)$ is given by (2)). Hence, each sample $(\mathbf{a}, \mathbf{y})$ from $\tilde{D}$ provides us with a linear equation with error,

$$\langle \mathbf{a}, \tau(\mathbf{x}) \rangle \approx \lfloor p \langle \mathbf{x}, \mathbf{y} \rangle \rceil \bmod p.$$

Notice that $p\langle \mathbf{x}, \mathbf{y} \rangle$ is typically not an integer and hence we round it to the nearest integer. After collecting a polynomial number of such equations, we call the LWE oracle in order to recover $\tau(\mathbf{x})$. Notice that $\mathbf{a}$ is distributed essentially uniformly, as required by the LWE oracle. Finally, as mentioned above, once we are able to compute $\tau(\mathbf{x})$, computing $\mathbf{x}$ is easy (this will be shown in Lemma 3.5).
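In code, producing one such equation from a sample $(\mathbf{a}, \mathbf{y})$ and the target point $\mathbf{x}$ looks as follows (a sketch of ours; the ‘noise’ in the equation comes from the distance of $\mathbf{x}$ to $L^*$):

```python
# Sketch: forming one LWE equation (a, b) from a D-tilde sample (a, y),
# as in the displayed relation above.
import numpy as np

def lwe_equation(a, y, x, p):
    b = int(np.rint(p * float(np.dot(x, y)))) % p
    return a, b
```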

The above outline ignores one important detail: what is the error distribution in the equations we produce? Recall that the LWE oracle is only guaranteed to work with error distribution $\bar{\Psi}_\alpha$. Luckily, as we will show in Claim 3.9 and Corollary 3.10 (using a rather technical proof), if $\mathbf{x}$ is at distance $\beta p/r$ from $L^*$ for some $0 \leq \beta \leq \alpha$, then the error distribution in the equations is essentially $\bar{\Psi}_\beta$. (In fact, in order to get this error distribution, we will have to modify the procedure a bit and add a small amount of normal error to each equation.) We then complete the proof by noting (in Lemma 3.7) that an oracle for solving $\mathrm{LWE}_{p,\bar{\Psi}_\alpha}$ can be used to solve $\mathrm{LWE}_{p,\bar{\Psi}_\beta}$ for any $0 \leq \beta \leq \alpha$ (even if $\beta$ is unknown).

Part 2:

In this part, we describe a quantum algorithm that, using a $\mathrm{CVP}_{L^*,\alpha p/r}$ oracle, generates one sample from $D_{L,r\sqrt{n}/(\alpha p)}$. Equivalently, we show how to produce a sample from $D_{L,r}$ given a $\mathrm{CVP}_{L^*,\sqrt{n}/r}$ oracle. The procedure is essentially the following: first, by using the CVP oracle, create a quantum state corresponding to $f_{1/r}$. Then, apply the quantum Fourier transform and obtain a quantum state corresponding to $D_{L,r}$. By measuring this state we obtain a sample from $D_{L,r}$.

In the following, we describe this procedure in more detail. Our first goal is to create a quantum state corresponding to $f_{1/r}$. Informally, this can be written as

$$\sum_{\mathbf{x} \in \mathbb{R}^n} f_{1/r}(\mathbf{x}) |\mathbf{x}\rangle. \qquad (3)$$

This state is clearly not well-defined. In the actual procedure, $\mathbb{R}^n$ is replaced with some finite set (namely, all points inside the basic parallelepiped of $L^*$ that belong to some fine grid). This introduces several technical complications and makes the computations rather tedious. Therefore, in the present discussion, we opt to continue with informal expressions as in (3).

Let us now continue our description of the procedure. In order to prepare the state in (3), we first create the uniform superposition on $L^*$,

$$\sum_{\mathbf{x} \in L^*} |\mathbf{x}\rangle.$$

(This step is actually unnecessary in the real procedure, since there we work in the basic parallelepiped of $L^*$; but for the present discussion, it is helpful to imagine that we start with this state.) On a separate register, we create a ‘Gaussian state’ of width $1/r$,

$$\sum_{\mathbf{z} \in \mathbb{R}^n} \exp(-\pi\|r\mathbf{z}\|^2) |\mathbf{z}\rangle.$$

This can be done using known techniques. The combined state of the system can be written as

$$\sum_{\mathbf{x} \in L^*,\, \mathbf{z} \in \mathbb{R}^n} \exp(-\pi\|r\mathbf{z}\|^2) |\mathbf{x}, \mathbf{z}\rangle.$$

We now add the first register to the second (a reversible operation), and obtain

$$\sum_{\mathbf{x} \in L^*,\, \mathbf{z} \in \mathbb{R}^n} \exp(-\pi\|r\mathbf{z}\|^2) |\mathbf{x}, \mathbf{x}+\mathbf{z}\rangle.$$

Finally, we would like to erase, or uncompute, the first register to obtain

$$\sum_{\mathbf{x} \in L^*,\, \mathbf{z} \in \mathbb{R}^n} \exp(-\pi\|r\mathbf{z}\|^2) |\mathbf{x}+\mathbf{z}\rangle \approx \sum_{\mathbf{z} \in \mathbb{R}^n} f_{1/r}(\mathbf{z}) |\mathbf{z}\rangle.$$

However, ‘erasing’ a register is in general not a reversible operation. In order for it to be reversible, we need to be able to compute $\mathbf{x}$ from the remaining register $\mathbf{x}+\mathbf{z}$. This is precisely why we need the $\mathrm{CVP}_{L^*,\sqrt{n}/r}$ oracle. It can be shown that almost all the mass of $\exp(-\pi\|r\mathbf{z}\|^2)$ is on $\mathbf{z}$ such that $\|\mathbf{z}\| \leq \sqrt{n}/r$. Hence, $\mathbf{x}+\mathbf{z}$ is within distance $\sqrt{n}/r$ of the lattice and the oracle finds the closest lattice point, namely, $\mathbf{x}$. This allows us to erase the first register in a reversible way.

In the final part of the procedure, we apply the quantum Fourier transform. This yields the quantum state corresponding to $D_{L,r}$, namely,

$$\sum_{\mathbf{y} \in L} D_{L,r}(\mathbf{y}) |\mathbf{y}\rangle.$$

By measuring this state, we obtain a sample from the distribution $D_{L,r}$ (or in fact from $D_{L,r}^2 = D_{L,r/\sqrt{2}}$, but this is a minor issue).

2 Preliminaries

In this section we include some notation that will be used throughout the paper. Most of the notation is standard. Some of the less standard notation is: the Gaussian function $\rho$ (Eq. (4)), the Gaussian distribution $\nu$ (Eq. (5)), the periodic normal distribution $\Psi$ (Eq. (7)), the discretization of a distribution on $\mathbb{T}$ (Eq. (8)), the discrete Gaussian distribution $D$ (Eq. (6)), the unique closest lattice vector $\kappa$ (above Lemma 2.3), and the smoothing parameter $\eta$ (Definition 2.10).

General.

For two real numbers $x$ and $y > 0$ we define $x \bmod y$ as $x - \lfloor x/y \rfloor y$. For $x \in \mathbb{R}$ we define $\lfloor x \rceil$ as the integer closest to $x$ or, in case two such integers exist, the smaller of the two. For any integer $p \geq 2$, we write $\mathbb{Z}_p$ for the cyclic group $\{0, 1, \ldots, p-1\}$ with addition modulo $p$. We also write $\mathbb{T}$ for $\mathbb{R}/\mathbb{Z}$, i.e., the segment $[0,1)$ with addition modulo 1.
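For concreteness, these two conventions read as follows in code (a sketch of ours; note the tie-breaking toward the smaller integer):

```python
# Sketch of the conventions just defined: x mod y uses the floor, and
# rounding breaks ties toward the smaller integer (2.5 -> 2, -2.5 -> -3).
import math

def mod(x, y):
    return x - math.floor(x / y) * y

def round_to_closest(x):
    return math.ceil(x - 0.5)
```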

We define a negligible amount in $n$ as an amount that is asymptotically smaller than $n^{-c}$ for any constant $c > 0$. More precisely, $f(n)$ is a negligible function in $n$ if $\lim_{n \to \infty} n^c f(n) = 0$ for any $c > 0$. Similarly, a non-negligible amount is one which is at least $n^{-c}$ for some $c > 0$. Also, when we say that an expression is exponentially small in $n$ we mean that it is at most $2^{-\Omega(n)}$. Finally, when we say that an expression (most often, some probability) is exponentially close to 1, we mean that it is $1 - 2^{-\Omega(n)}$.

We say that an algorithm $\mathcal{A}$ with oracle access is a distinguisher between two distributions if its acceptance probability when the oracle outputs samples of the first distribution and its acceptance probability when the oracle outputs samples of the second distribution differ by a non-negligible amount.

Essentially all algorithms and reductions in this paper have an exponentially small error probability, and we sometimes do not state this explicitly.

For clarity, we present some of our reductions in a model that allows operations on real numbers. It is possible to modify them in a straightforward way so that they operate in a model that approximates real numbers up to an error of $2^{-n^c}$ for an arbitrarily large constant $c$ in time polynomial in $n$.

Given two probability density functions $\phi_1, \phi_2$ on $\mathbb{R}^n$, we define the statistical distance between them as

$$\Delta(\phi_1, \phi_2) := \int_{\mathbb{R}^n} |\phi_1(\mathbf{x}) - \phi_2(\mathbf{x})| \, d\mathbf{x}$$

(notice that with this definition, the statistical distance ranges in $[0,2]$). A similar definition can be given for discrete random variables. The statistical distance satisfies the triangle inequality, i.e., for any $\phi_1, \phi_2, \phi_3$,

$$\Delta(\phi_1, \phi_3) \leq \Delta(\phi_1, \phi_2) + \Delta(\phi_2, \phi_3).$$

Another important fact which we often use is that the statistical distance cannot increase by applying a (possibly randomized) function $f$, i.e.,

$$\Delta(f(X), f(Y)) \leq \Delta(X, Y),$$

see, e.g., [28]. In particular, this implies that the acceptance probability of any algorithm on inputs from $X$ differs from its acceptance probability on inputs from $Y$ by at most $\frac{1}{2}\Delta(X, Y)$ (the factor half coming from the choice of normalization in our definition of $\Delta$).
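For discrete random variables, the analogous definition is a plain $\ell_1$ distance; a small sketch (ours):

```python
# Sketch: statistical distance of two discrete distributions, with the
# normalization used above (so it ranges in [0, 2]).
def statistical_distance(phi1, phi2):
    """phi1, phi2: dicts mapping outcomes to probabilities."""
    support = set(phi1) | set(phi2)
    return sum(abs(phi1.get(x, 0.0) - phi2.get(x, 0.0)) for x in support)
```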

Gaussians and other distributions.

Recall that the normal distribution with mean 0 and variance $\sigma^2$ is the distribution on $\mathbb{R}$ given by the density function $\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2\right)$ where $\exp(y)$ denotes $e^y$. Also recall that the sum of two independent normal variables with mean 0 and variances $\sigma_1^2$ and $\sigma_2^2$ is a normal variable with mean 0 and variance $\sigma_1^2 + \sigma_2^2$. For a vector $\mathbf{x}$ and any $s > 0$, let

$$\rho_s(\mathbf{x}) := \exp(-\pi\|\mathbf{x}/s\|^2) \qquad (4)$$

be a Gaussian function scaled by a factor of $s$. We denote $\rho_1$ by $\rho$. Note that $\int_{\mathbf{x} \in \mathbb{R}^n} \rho_s(\mathbf{x})\, d\mathbf{x} = s^n$. Hence,

$$\nu_s := \rho_s / s^n \qquad (5)$$

is an $n$-dimensional probability density function and, as before, we use $\nu$ to denote $\nu_1$. The dimension $n$ is implicit. Notice that a sample from the Gaussian distribution $\nu_s$ can be obtained by taking $n$ independent samples from the 1-dimensional Gaussian distribution. Hence, sampling from $\nu_s$ to within arbitrarily good accuracy can be performed efficiently by using standard techniques. For simplicity, in this paper we assume that we can sample from $\nu_s$ exactly.⁴ Functions are extended to sets in the usual way; i.e., $\rho_s(A) = \sum_{\mathbf{x} \in A} \rho_s(\mathbf{x})$ for any countable set $A$. For any vector $\mathbf{c} \in \mathbb{R}^n$, we define $\rho_{s,\mathbf{c}}(\mathbf{x}) := \rho_s(\mathbf{x} - \mathbf{c})$ to be a shifted version of $\rho_s$. The following simple claim bounds the amount by which $\rho_s(\mathbf{x})$ can shrink under a small change in $\mathbf{x}$.

⁴ In practice, when only finite precision is available, $\nu_s$ can be approximated by picking a fine grid, and picking points from the grid with probability approximately proportional to $\nu_s$. All our arguments can be made rigorous by selecting a sufficiently fine grid.

Claim 2.1

For all $s, t, l > 0$ and $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ with $\|\mathbf{x}\| \leq t$ and $\|\mathbf{x} - \mathbf{y}\| \leq l$,

$$\rho_s(\mathbf{y}) \geq (1 - \pi(2lt + l^2)/s^2)\,\rho_s(\mathbf{x}).$$

Proof: Using the inequality $e^{-z} \geq 1 - z$,

$$\rho_s(\mathbf{y}) = e^{-\pi\|\mathbf{y}/s\|^2} \geq e^{-\pi(\|\mathbf{x}\|/s + l/s)^2} = e^{-\pi(2l\|\mathbf{x}\|/s^2 + (l/s)^2)}\,\rho_s(\mathbf{x}) \geq (1 - \pi(2lt + l^2)/s^2)\,\rho_s(\mathbf{x}). \qquad \square$$

For any countable set $A$ and a parameter $s > 0$, we define the discrete Gaussian probability distribution $D_{A,s}$ as

$$\forall \mathbf{x} \in A, \quad D_{A,s}(\mathbf{x}) := \frac{\rho_s(\mathbf{x})}{\rho_s(A)}. \qquad (6)$$

See Figure 2 for an illustration.

For $\beta \in \mathbb{R}^+$, the distribution $\Psi_\beta$ is the distribution on $\mathbb{T}$ obtained by sampling from a normal variable with mean 0 and standard deviation $\frac{\beta}{\sqrt{2\pi}}$ and reducing the result modulo 1 (i.e., a periodization of the normal distribution),

$$\forall r \in [0,1), \quad \Psi_\beta(r) := \sum_{k=-\infty}^{\infty} \frac{1}{\beta} \cdot \exp\left(-\pi\left(\frac{r-k}{\beta}\right)^2\right). \qquad (7)$$

Clearly, one can efficiently sample from $\Psi_\beta$. The following technical claim shows that a small change in the parameter $\beta$ does not change the distribution $\Psi_\beta$ by much.

Claim 2.2

For any $0 < \alpha < \beta \leq 2\alpha$,

$$\Delta(\Psi_\alpha, \Psi_\beta) \leq 9\Big(\frac{\beta}{\alpha} - 1\Big).$$

Proof: We will show that the statistical distance between a normal variable with standard deviation $\alpha/\sqrt{2\pi}$ and one with standard deviation $\beta/\sqrt{2\pi}$ is at most $9(\frac{\beta}{\alpha} - 1)$. This implies the claim since applying a function (modulo 1 in this case) cannot increase the statistical distance. By scaling, we can assume without loss of generality that $\alpha = 1$ and $\beta = 1 + \varepsilon$ for some $0 < \varepsilon \leq 1$. Then the statistical distance that we wish to bound is given by

$$\begin{aligned} \int_{\mathbb{R}} \bigg| e^{-\pi x^2} - \frac{1}{1+\varepsilon}\, e^{-\pi x^2/(1+\varepsilon)^2} \bigg|\, dx &\leq \int_{\mathbb{R}} \big| e^{-\pi x^2} - e^{-\pi x^2/(1+\varepsilon)^2} \big|\, dx + \int_{\mathbb{R}} \bigg| \Big(1 - \frac{1}{1+\varepsilon}\Big) e^{-\pi x^2/(1+\varepsilon)^2} \bigg|\, dx \\ &= \int_{\mathbb{R}} \big| e^{-\pi x^2} - e^{-\pi x^2/(1+\varepsilon)^2} \big|\, dx + \varepsilon \\ &= \int_{\mathbb{R}} \big| e^{-\pi(1 - 1/(1+\varepsilon)^2)x^2} - 1 \big|\, e^{-\pi x^2/(1+\varepsilon)^2}\, dx + \varepsilon. \end{aligned}$$

Now, since $1 - z \leq e^{-z} \leq 1$ for all $z \geq 0$,

$$\big| e^{-\pi(1 - 1/(1+\varepsilon)^2)x^2} - 1 \big| \leq \pi(1 - 1/(1+\varepsilon)^2)x^2 \leq 2\pi\varepsilon x^2.$$

Hence we can bound the statistical distance above by

$$\varepsilon + 2\pi\varepsilon \int_{\mathbb{R}} x^2 e^{-\pi x^2/(1+\varepsilon)^2}\, dx = \varepsilon + \varepsilon(1+\varepsilon)^3 \leq 9\varepsilon. \qquad \square$$

For an arbitrary probability distribution with density function $\phi : \mathbb{T} \to \mathbb{R}^+$ and some integer $p \geq 1$ we define its discretization $\bar{\phi} : \mathbb{Z}_p \to \mathbb{R}^+$ as the discrete probability distribution obtained by sampling from $\phi$, multiplying by $p$, and rounding to the closest integer modulo $p$. More formally,

$$\bar{\phi}(i) := \int_{(i-1/2)/p}^{(i+1/2)/p} \phi(x)\, dx. \qquad (8)$$

As an example, $\bar{\Psi}_\beta$ is shown in Figure 1.

Let $p \geq 2$ be some integer, and let $\chi : \mathbb{Z}_p \to \mathbb{R}^+$ be some probability distribution on $\mathbb{Z}_p$. Let $n$ be an integer and let $\mathbf{s} \in \mathbb{Z}_p^n$ be a vector. We define $A_{\mathbf{s},\chi}$ as the distribution on $\mathbb{Z}_p^n \times \mathbb{Z}_p$ obtained by choosing a vector $\mathbf{a} \in \mathbb{Z}_p^n$ uniformly at random, choosing $e \in \mathbb{Z}_p$ according to $\chi$, and outputting $(\mathbf{a}, \langle \mathbf{a}, \mathbf{s} \rangle + e)$, where additions are performed in $\mathbb{Z}_p$, i.e., modulo $p$. We also define $U$ as the uniform distribution on $\mathbb{Z}_p^n \times \mathbb{Z}_p$.

For a probability density function $\phi$ on $\mathbb{T}$, we define $A_{\mathbf{s},\phi}$ as the distribution on $\mathbb{Z}_p^n \times \mathbb{T}$ obtained by choosing a vector $\mathbf{a} \in \mathbb{Z}_p^n$ uniformly at random, choosing $e \in \mathbb{T}$ according to $\phi$, and outputting $(\mathbf{a}, \langle \mathbf{a}, \mathbf{s} \rangle / p + e)$, where the addition is performed in $\mathbb{T}$, i.e., modulo 1.

Learning with errors.

For an integer $p = p(n)$ and a distribution $\chi$ on $\mathbb{Z}_p$, we say that an algorithm solves $\mathrm{LWE}_{p,\chi}$ if, for any $\mathbf{s} \in \mathbb{Z}_p^n$, given samples from $A_{\mathbf{s},\chi}$ it outputs $\mathbf{s}$ with probability exponentially close to 1. Similarly, for a probability density function $\phi$ on $\mathbb{T}$, we say that an algorithm solves $\mathrm{LWE}_{p,\phi}$ if, for any $\mathbf{s} \in \mathbb{Z}_p^n$, given samples from $A_{\mathbf{s},\phi}$ it outputs $\mathbf{s}$ with probability exponentially close to 1. In both cases, we say that the algorithm is efficient if it runs in time polynomial in $n$. Finally, we note that $p$ is assumed to be prime only in Lemma 4.2; in the rest of the paper, including the main theorem, $p$ can be an arbitrary integer.

Lattices.

We briefly review some basic definitions; for a good introduction to lattices, see [28]. A lattice in $\mathbb{R}^n$ is defined as the set of all integer combinations of $n$ linearly independent vectors. This set of vectors is known as a basis of the lattice and is not unique. Given a basis $(\mathbf{v}_1, \ldots, \mathbf{v}_n)$ of a lattice $L$, the fundamental parallelepiped generated by this basis is defined as

𝒫(𝐯1,,𝐯n)={i=1nxi𝐯i|xi[0,1)}.{\cal P}(\mathbf{v}_{1},\ldots,\mathbf{v}_{n})=\left\{\left.\sum_{i=1}^{n}x_{i}\mathbf{v}_{i}~{}\right|~{}x_{i}\in[0,1)\right\}.

When the choice of basis is clear, we write 𝒫(L){\cal P}(L) instead of 𝒫(𝐯1,,𝐯n){\cal P}(\mathbf{v}_{1},\ldots,\mathbf{v}_{n}). For a point 𝐱n\mathbf{x}\in\mathbb{R}^{n} we define 𝐱mod𝒫(L)\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L) as the unique point 𝐲𝒫(L)\mathbf{y}\in{\cal P}(L) such that 𝐲𝐱L\mathbf{y}-\mathbf{x}\in L. We denote by det(L)\det(L) the volume of the fundamental parallelepiped of LL or equivalently, the absolute value of the determinant of the matrix whose columns are the basis vectors of the lattice (det(L)\det(L) is a lattice invariant, i.e., it is independent of the choice of basis). The dual of a lattice LL in n\mathbb{R}^{n}, denoted LL^{*}, is the lattice given by the set of all vectors 𝐲n\mathbf{y}\in\mathbb{R}^{n} such that 𝐱,𝐲{\langle{\mathbf{x},\mathbf{y}}\rangle}\in{\mathbb{Z}} for all vectors 𝐱L\mathbf{x}\in L. Similarly, given a basis (𝐯1,,𝐯n)(\mathbf{v}_{1},\ldots,\mathbf{v}_{n}) of a lattice, we define the dual basis as the set of vectors (𝐯1,,𝐯n)(\mathbf{v}_{1}^{*},\ldots,\mathbf{v}_{n}^{*}) such that 𝐯i,𝐯j=δij{\langle{\mathbf{v}_{i},\mathbf{v}_{j}^{*}}\rangle}=\delta_{ij} for all i,j[n]i,j\in[n] where δij\delta_{ij} denotes the Kronecker delta, i.e., 11 if i=ji=j and 0 otherwise. With a slight abuse of notation, we sometimes write LL for the n×nn\times n matrix whose columns are 𝐯1,,𝐯n\mathbf{v}_{1},\ldots,\mathbf{v}_{n}. With this notation, we notice that L=(LT)1L^{*}=(L^{T})^{-1}. From this it follows that det(L)=1/det(L)\det(L^{*})=1/\det(L). As another example of this notation, for a point 𝐯L\mathbf{v}\in L we write L1𝐯L^{-1}\mathbf{v} to indicate the integer coefficient vector of 𝐯\mathbf{v}.
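The following small Python sketch makes these conventions concrete for a toy two-dimensional basis: the columns of the matrix are the basis vectors, the dual basis is obtained as (LT)1(L^{T})^{-1}, and reduction modulo the fundamental parallelepiped keeps the fractional part of the coefficient vector. The names are illustrative only.

import numpy as np

B = np.array([[2.0, 1.0],
              [0.0, 3.0]])        # columns are the basis vectors of L

B_dual = np.linalg.inv(B.T)       # dual basis, since L* = (L^T)^{-1}
assert np.isclose(abs(np.linalg.det(B)) * abs(np.linalg.det(B_dual)), 1.0)

def mod_parallelepiped(x, B):
    # The unique y in P(L) with y - x in L: keep the fractional part
    # of the coefficient vector of x in basis B.
    c = np.linalg.solve(B, x)
    return B.dot(c - np.floor(c))

print(mod_parallelepiped(np.array([5.3, -2.2]), B))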

Let λ1(L)\lambda_{1}(L) denote the length of the shortest nonzero vector in the lattice LL. We denote by λn(L)\lambda_{n}(L) the minimum length of a set of nn linearly independent vectors from LL, where the length of a set is defined as the length of the longest vector in it. For a lattice LL and a point 𝐯\mathbf{v} whose distance from LL is less than λ1(L)/2\lambda_{1}(L)/2 we define κL(𝐯)\kappa_{L}(\mathbf{v}) as the (unique) closest point to 𝐯\mathbf{v} in LL. The following useful fact, due to Banaszczyk, is known as a ‘transference theorem’. We remark that the lower bound is easy to prove.

Lemma 2.3 ([9], Theorem 2.1)

For any nn-dimensional lattice LL, 1λ1(L)λn(L)n1\leq\lambda_{1}(L)\cdot\lambda_{n}(L^{*})\leq n.

Two other useful facts by Banaszczyk are the following. The first bounds the amount by which the Gaussian measure of a lattice changes by scaling; the second shows that for any lattice LL, the mass given by the discrete Gaussian measure DL,rD_{L,r} to points of norm greater than nr\sqrt{n}r is at most exponentially small (the analogous statement for the continuous Gaussian νr\nu_{r} is easy to establish).

Lemma 2.4 ([9], Lemma 1.4(i))

For any lattice LL and a1a\geq 1, ρa(L)anρ(L)\rho_{a}(L)\leq a^{n}\rho(L).

Lemma 2.5 ([9], Lemma 1.5(i))

Let BnB_{n} denote the Euclidean unit ball. Then, for any lattice LL and any r>0r>0, ρr(LnrBn)<22nρr(L)\rho_{r}(L\setminus\sqrt{n}rB_{n})<2^{-2n}\cdot\rho_{r}(L), where LnrBnL\setminus\sqrt{n}rB_{n} is the set of lattice points of norm greater than nr\sqrt{n}r.

In this paper we consider the following lattice problems. The first two, the decision version of the shortest vector problem (GapSVP) and the shortest independent vectors problem (SIVP), are among the most well-known lattice problems and are concerned with λ1\lambda_{1} and λn\lambda_{n}, respectively. In the definitions below, γ=γ(n)1\gamma=\gamma(n)\geq 1 is the approximation factor, and the input lattice is given in the form of some arbitrary basis.

Definition 2.6

An instance of GapSVPγ\mbox{\sc GapSVP}_{\gamma} is given by an nn-dimensional lattice LL and a number d>0d>0. In YES{\rm YES} instances, λ1(L)d\lambda_{1}(L)\leq d whereas in NO{\rm NO} instances λ1(L)>γ(n)d\lambda_{1}(L)>\gamma(n)\cdot d.

Definition 2.7

An instance of SIVPγ\mbox{\sc SIVP}_{\gamma} is given by an nn-dimensional lattice LL. The goal is to output a set of nn linearly independent lattice vectors of length at most γ(n)λn(L)\gamma(n)\cdot\lambda_{n}(L).

A useful generalization of SIVP is the following somewhat less standard lattice problem, known as the generalized independent vectors problem (GIVP). Here, φ\varphi denotes an arbitrary real-valued function on lattices. Choosing φ=λn\varphi=\lambda_{n} results in SIVP.

Definition 2.8

An instance of GIVPγφ\mbox{\sc GIVP}^{\varphi}_{\gamma} is given by an nn-dimensional lattice LL. The goal is to output a set of nn linearly independent lattice vectors of length at most γ(n)φ(L)\gamma(n)\cdot\varphi(L).

Another useful (and even less standard) lattice problem is the following. We call it the discrete Gaussian sampling problem (DGS). As before, φ\varphi denotes an arbitrary real-valued function on lattices.

Definition 2.9

An instance of DGSφ\mbox{\sc DGS}_{\varphi} is given by an nn-dimensional lattice LL and a number r>φ(L)r>\varphi(L). The goal is to output a sample from DL,rD_{L,r}.

We also consider a variant of the closest vector problem (which is essentially what is known as the bounded distance decoding problem [25]): For an nn-dimensional lattice LL, and some d>0d>0, we say that an algorithm solves CVPL,d{\mbox{\sc CVP}}_{L,d} if, given a point 𝐱n\mathbf{x}\in\mathbb{R}^{n} whose distance to LL is at most dd, the algorithm finds the closest lattice point to 𝐱\mathbf{x}. In this paper dd will always be smaller than λ1(L)/2\lambda_{1}(L)/2 and hence the closest vector is unique.

The smoothing parameter.

We make heavy use of a lattice parameter known as the smoothing parameter [29]. Intuitively, this parameter provides the width beyond which the discrete Gaussian measure on a lattice behaves like a continuous one. The precise definition is the following.

Definition 2.10

For an nn-dimensional lattice LL and positive real ε>0\varepsilon>0, we define the smoothing parameter ηε(L)\eta_{\varepsilon}(L) to be the smallest ss such that ρ1/s(L{𝟎})ε\rho_{1/s}(L^{*}\setminus\{\mathbf{0}\})\leq\varepsilon.

In other words, ηε(L)\eta_{\varepsilon}(L) is the smallest ss such that a Gaussian measure scaled by 1/s1/s on the dual lattice LL^{*} gives all but ε/(1+ε)\varepsilon/(1+\varepsilon) of its weight to the origin. We usually take ε\varepsilon to be some negligible function of the lattice dimension nn. Notice that ρ1/s(L{𝟎})\rho_{1/s}(L^{*}\setminus\{\mathbf{0}\}) is a continuous and strictly decreasing function of ss. Moreover, it can be shown that lims0ρ1/s(L{𝟎})=\lim_{s\to 0}\rho_{1/s}(L^{*}\setminus\{\mathbf{0}\})=\infty and limsρ1/s(L{𝟎})=0\lim_{s\to\infty}\rho_{1/s}(L^{*}\setminus\{\mathbf{0}\})=0. So, the parameter ηε(L)\eta_{\varepsilon}(L) is well defined for any ε>0\varepsilon>0, and εηε(L)\varepsilon\mapsto\eta_{\varepsilon}(L) is the inverse function of sρ1/s(L{𝟎})s\mapsto\rho_{1/s}(L^{*}\setminus\{\mathbf{0}\}). In particular, ηε(L)\eta_{\varepsilon}(L) is also a continuous and strictly decreasing function of ε\varepsilon.

The motivation for this definition (and the name ‘smoothing parameter’) comes from the following result, shown in [29] (and included here as Claim 3.8). Informally, it says that if we choose a ‘random’ lattice point from an nn-dimensional lattice LL and add continuous Gaussian noise νs\nu_{s} for some s>ηε(L)s>\eta_{\varepsilon}(L) then the resulting distribution is within statistical distance ε\varepsilon of the ‘uniform distribution on n\mathbb{R}^{n}’. In this paper, we show (in Claim 3.9) another important property of this parameter: for s>2ηε(L)s>\sqrt{2}\eta_{\varepsilon}(L), if we sample a point from DL,sD_{L,s} and add Gaussian noise νs\nu_{s}, we obtain a distribution whose statistical distance to a continuous Gaussian ν2s\nu_{\sqrt{2}s} is at most 4ε4\varepsilon. Notice that ν2s\nu_{\sqrt{2}s} is the distribution one obtains when summing two independent samples from νs\nu_{s}. Hence, intuitively, the noise νs\nu_{s} is enough to hide the discrete structure of DL,sD_{L,s}.
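Since sρ1/s(L{𝟎})s\mapsto\rho_{1/s}(L^{*}\setminus\{\mathbf{0}\}) is continuous and strictly decreasing, the smoothing parameter can be located numerically by bisection. The following Python sketch does this for the toy case L=nL={\mathbb{Z}}^{n} (so that L=nL^{*}={\mathbb{Z}}^{n} and the sum factors into a one-dimensional theta function); it only illustrates the definition and plays no role in the proofs.

import numpy as np

def rho_dual_minus_zero(s, n, K=50):
    # rho_{1/s}(Z^n \ {0}) = theta(s)^n - 1, where
    # theta(s) = sum_k exp(-pi s^2 k^2), truncated at |k| <= K.
    k = np.arange(-K, K + 1)
    return np.exp(-np.pi * (s * k) ** 2).sum() ** n - 1.0

def eta(eps, n, lo=0.05, hi=50.0, iters=80):
    # Bisection on the strictly decreasing map s -> rho_{1/s}(Z^n \ {0}).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if rho_dual_minus_zero(mid, n) > eps else (lo, mid)
    return hi

print(eta(2.0 ** -64, n=128))  # compare with the bounds of Lemmas 2.11 and 2.12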

The following two upper bounds on the smoothing parameter appear in [29].

Lemma 2.11

For any nn-dimensional lattice LL, ηε(L)n/λ1(L)\eta_{\varepsilon}(L)\leq\sqrt{n}/\lambda_{1}(L^{*}) where ε=2n\varepsilon=2^{-n}.

Lemma 2.12

For any nn-dimensional lattice LL and ε>0\varepsilon>0,

ηε(L)ln(2n(1+1/ε))πλn(L).\eta_{\varepsilon}(L)\leq\sqrt{\frac{\ln(2n(1+1/\varepsilon))}{\pi}}\cdot\lambda_{n}(L).

In particular, for any superlogarithmic function ω(logn)\omega(\log n), ηε(n)(L)ω(logn)λn(L)\eta_{\varepsilon(n)}(L)\leq\sqrt{\omega(\log n)}\cdot\lambda_{n}(L) for some negligible function ε(n)\varepsilon(n).

We also need the following simple lower bound on the smoothing parameter.

Claim 2.13

For any lattice LL and any ε>0\varepsilon>0,

ηε(L)ln1/επ1λ1(L)ln1/επλn(L)n.\eta_{\varepsilon}(L)\geq\sqrt{\frac{\ln 1/\varepsilon}{\pi}}\cdot\frac{1}{\lambda_{1}(L^{*})}\geq\sqrt{\frac{\ln 1/\varepsilon}{\pi}}\cdot\frac{\lambda_{n}(L)}{n}.

In particular, for any ε(n)=o(1)\varepsilon(n)=o(1) and any constant c>0c>0, ηε(n)(L)>c/λ1(L)cλn(L)/n\eta_{\varepsilon(n)}(L)>c/\lambda_{1}(L^{*})\geq c\lambda_{n}(L)/n for large enough nn.

  • Proof:

    Let 𝐯L\mathbf{v}\in L^{*} be a vector of length λ1(L)\lambda_{1}(L^{*}) and let s=ηε(L)s=\eta_{\varepsilon}(L). Then,

    ε=ρ1/s(L{𝟎})ρ1/s(𝐯)=exp(π(sλ1(L))2).\varepsilon=\rho_{1/s}(L^{*}\setminus\{\mathbf{0}\})\geq\rho_{1/s}(\mathbf{v})={{\mathrm{exp}}\left(-\pi(s\lambda_{1}(L^{*}))^{2}\right)}.

    The first inequality follows by solving for ss. The second inequality is by Lemma 2.3.  

The Fourier transform.

We briefly review some of the important properties of the Fourier transform. In the following, we omit certain technical conditions as these will always be satisfied in our applications. For a more precise and in-depth treatment, see, e.g., [14]. The Fourier transform of a function h:nh:\mathbb{R}^{n}\to\mathbb{C} is defined to be

h^(𝐰)=nh(𝐱)e2πi𝐱,𝐰𝑑𝐱.\hat{h}(\mathbf{w})=\int_{\mathbb{R}^{n}}h(\mathbf{x})e^{-2\pi i{\langle{\mathbf{x},\mathbf{w}}\rangle}}d\mathbf{x}.

From the definition we can obtain two useful formulas; first, if hh is defined by h(𝐱)=g(𝐱+𝐯)h(\mathbf{x})=g(\mathbf{x}+\mathbf{v}) for some function gg and vector 𝐯\mathbf{v} then

h^(𝐰)=e2πi𝐯,𝐰g^(𝐰).\hat{h}(\mathbf{w})=e^{2\pi i{\langle{\mathbf{v},\mathbf{w}}\rangle}}\hat{g}(\mathbf{w}). (9)

Similarly, if hh is defined by h(𝐱)=e2πi𝐱,𝐯g(𝐱)h(\mathbf{x})=e^{2\pi i{\langle{\mathbf{x},\mathbf{v}}\rangle}}g(\mathbf{x}) for some function gg and vector 𝐯\mathbf{v} then

h^(𝐰)=g^(𝐰𝐯).\hat{h}(\mathbf{w})=\hat{g}(\mathbf{w}-\mathbf{v}). (10)

Another important fact is that the Gaussian is its own Fourier transform, i.e., ρ^=ρ\hat{\rho}=\rho. More generally, for any s>0s>0 it holds that ρs^=snρ1/s\widehat{\rho_{s}}=s^{n}\rho_{1/s}. Finally, we will use the following formulation of the Poisson summation formula.

Lemma 2.14 (Poisson summation formula)

For any lattice LL and any function f:nf:\mathbb{R}^{n}\to\mathbb{C},

f(L)=det(L)f^(L).f(L)=\det(L^{*})\hat{f}(L^{*}).
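As a quick sanity check of Lemma 2.14, the following Python snippet verifies the formula numerically for f=ρrf=\rho_{r} and the one-dimensional lattice L=cL=c{\mathbb{Z}}, for which L=(1/c)L^{*}=(1/c){\mathbb{Z}} and det(L)=1/c\det(L^{*})=1/c, using ρr^=rρ1/r\widehat{\rho_{r}}=r\rho_{1/r}; truncating the sums is harmless since the summands decay like eΩ(k2)e^{-\Omega(k^{2})}.

import numpy as np

c, r = 1.7, 0.9
k = np.arange(-200, 201)
lhs = np.exp(-np.pi * (c * k / r) ** 2).sum()                  # rho_r(cZ)
rhs = (1.0 / c) * r * np.exp(-np.pi * (r * k / c) ** 2).sum()  # det(L*) times rho_r-hat summed over (1/c)Z
print(lhs, rhs)  # the two sums agree to machine precision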

3 Main Theorem

Our main theorem is the following. The connection to the standard lattice problems GapSVP and SIVP will be established in Subsection 3.3 by polynomial time reductions to DGS.

Theorem 3.1 (Main theorem)

Let ε=ε(n)\varepsilon=\varepsilon(n) be some negligible function of nn. Also, let p=p(n)p=p(n) be some integer and α=α(n)(0,1)\alpha=\alpha(n)\in(0,1) be such that αp>2n\alpha p>2\sqrt{n}. Assume that we have access to an oracle WW that solves LWEp,Ψα\mbox{\sc LWE}_{p,\Psi_{\alpha}} given a polynomial number of samples. Then there exists an efficient quantum algorithm for DGS2nηε(L)/α\mbox{\sc DGS}_{\sqrt{2n}\cdot\eta_{\varepsilon}(L)/\alpha}.

  • Proof:

    The input to our algorithm is an nn-dimensional lattice LL and a number r>2nηε(L)/αr>\sqrt{2n}\cdot\eta_{\varepsilon}(L)/\alpha. Our goal is to output a sample from DL,rD_{L,r}. Let rir_{i} denote r(αp/n)ir\cdot(\alpha p/\sqrt{n})^{i}. The algorithm starts by producing ncn^{c} samples from DL,r3nD_{L,r_{3n}} where cc is the constant from the iterative step lemma, Lemma 3.3. By Claim 2.13, r3n>23nr>22nλn(L)r_{3n}>2^{3n}r>2^{2n}\lambda_{n}(L), and hence we can produce these samples efficiently by the procedure described in the bootstrapping lemma, Lemma 3.2. Next, for i=3n,3n1,,1i=3n,3n-1,\ldots,1 we use our ncn^{c} samples from DL,riD_{L,r_{i}} to produce ncn^{c} samples from DL,ri1D_{L,r_{i-1}}. The procedure that does this, called the iterative step, is the core of the algorithm and is described in Lemma 3.3. Notice that the condition in Lemma 3.3 is satisfied since for all i1i\geq 1, rir1=rαp/n>2pηε(L)r_{i}\geq r_{1}=r\alpha p/\sqrt{n}>\sqrt{2}p\eta_{\varepsilon}(L). At the end of the loop, we end up with ncn^{c} samples from DL,r0=DL,rD_{L,r_{0}}=D_{L,r} and we complete the algorithm by simply outputting the first of those.  
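The structure of the algorithm can be summarized by the following Python-style sketch, in which `bootstrap` and `iterative_step` stand for the procedures of Lemmas 3.2 and 3.3, assumed given; all names are illustrative.

import math

def dgs(L, r, alpha, p, n, c, bootstrap, iterative_step):
    # Driver loop of Theorem 3.1 (a sketch). With r_i = r*(alpha*p/sqrt(n))^i,
    # the condition alpha*p > 2*sqrt(n) means the width shrinks by a factor
    # greater than 2 in each iteration of the loop below.
    r_ = lambda i: r * (alpha * p / math.sqrt(n)) ** i
    samples = [bootstrap(L, r_(3 * n)) for _ in range(n ** c)]       # Lemma 3.2
    for i in range(3 * n, 0, -1):
        # n^c samples at width r_i yield n^c fresh samples at width r_{i-1}
        samples = [iterative_step(L, r_(i), samples) for _ in range(n ** c)]
    return samples[0]  # a sample from D_{L,r_0} = D_{L,r}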

3.1 Bootstrapping

Lemma 3.2 (Bootstrapping)

There exists an efficient algorithm that, given any nn-dimensional lattice LL and r>22nλn(L)r>2^{2n}\lambda_{n}(L), outputs a sample from a distribution that is within statistical distance 2Ω(n)2^{-\Omega(n)} of DL,rD_{L,r}.

  • Proof:

    By using the LLL basis reduction algorithm [24], we obtain a basis for LL of length at most 2nλn(L)2^{n}\lambda_{n}(L) and let 𝒫(L){\cal P}(L) be the parallelepiped generated by this basis. The sampling procedure samples a vector 𝐲\mathbf{y} from νr\nu_{r} and then outputs 𝐲(𝐲mod𝒫(L))L\mathbf{y}-(\mathbf{y}~{}{\rm{mod}}~{}{\cal P}(L))\in L. Notice that 𝐲mod𝒫(L)diam(𝒫(L))n2nλn(L)\|\mathbf{y}~{}{\rm{mod}}~{}{\cal P}(L)\|\leq\mbox{diam}({\cal P}(L))\leq n2^{n}\lambda_{n}(L).

    Our goal is to show that the resulting distribution is exponentially close to DL,rD_{L,r}. By Lemma 2.5, all but an exponentially small part of DL,rD_{L,r} is concentrated on points of norm at most nr\sqrt{n}r. So consider any 𝐱L\mathbf{x}\in L with 𝐱nr\|\mathbf{x}\|\leq\sqrt{n}r. By definition, the probability given to it by DL,rD_{L,r} is ρr(𝐱)/ρr(L)\rho_{r}(\mathbf{x})/\rho_{r}(L). By Lemma 2.14, the denominator is ρr(L)=det(L)rnρ1/r(L)det(L)rn\rho_{r}(L)=\det(L^{*})\cdot r^{n}\rho_{1/r}(L^{*})\geq\det(L^{*})\cdot r^{n} and hence the probability is at most ρr(𝐱)/(det(L)rn)=det(L)νr(𝐱)\rho_{r}(\mathbf{x})/(\det(L^{*})\cdot r^{n})=\det(L)\nu_{r}(\mathbf{x}). On the other hand, by Claim 2.1, the probability given to 𝐱L\mathbf{x}\in L by our procedure is

    𝐱+𝒫(L)νr(𝐲)𝑑𝐲(12Ω(n))det(L)νr(𝐱).\int_{\mathbf{x}+{\cal P}(L)}\nu_{r}(\mathbf{y})d\mathbf{y}\geq(1-2^{-\Omega(n)})\det(L)\nu_{r}(\mathbf{x}).

    Together, these facts imply that our output distribution is within statistical distance 2Ω(n)2^{-\Omega(n)} of DL,rD_{L,r}.  
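A direct rendering of this procedure in Python (a sketch with illustrative names; the columns of BB are assumed to form an LLL-reduced basis of LL):

import numpy as np

rng = np.random.default_rng(2)

def bootstrap_sample(B, r):
    # Lemma 3.2 as code: y ~ nu_r has independent normal coordinates of
    # standard deviation r/sqrt(2*pi), and the output lattice point is
    # y - (y mod P(L)) = B * floor(B^{-1} y).
    n = B.shape[0]
    y = rng.normal(0.0, r / np.sqrt(2 * np.pi), size=n)
    return B.dot(np.floor(np.linalg.solve(B, y)))

B = np.array([[1.0, 0.4],
              [0.0, 1.0]])
print(bootstrap_sample(B, r=40.0))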

3.2 The iterative step

Lemma 3.3 (The iterative step)

Let ε=ε(n)\varepsilon=\varepsilon(n) be a negligible function, α=α(n)(0,1)\alpha=\alpha(n)\in(0,1) be a real number, and p=p(n)2p=p(n)\geq 2 be an integer. Assume that we have access to an oracle WW that solves LWEp,Ψα\mbox{\sc LWE}_{p,\Psi_{\alpha}} given a polynomial number of samples. Then, there exists a constant c>0c>0 and an efficient quantum algorithm that, given any nn-dimensional lattice LL, a number r>2pηε(L)r>\sqrt{2}p\eta_{\varepsilon}(L), and ncn^{c} samples from DL,rD_{L,r}, produces a sample from DL,rn/(αp)D_{L,r\sqrt{n}/(\alpha p)}.

Note that the output distribution is taken with respect to the randomness (and quantum measurements) used in the algorithm, and not with respect to the input samples. In particular, this means that from the same set of ncn^{c} samples from DL,rD_{L,r} we can produce any polynomial number of samples from DL,rn/(αp)D_{L,r\sqrt{n}/(\alpha p)}.

  • Proof:

    The algorithm consists of two main parts. The first part is shown in Lemma 3.4. There, we describe a (classical) algorithm that, using WW and the samples from DL,rD_{L,r}, solves CVPL,αp/(2r){\mbox{\sc CVP}}_{L^{*},\alpha p/(\sqrt{2}r)}. The second part is shown in Lemma 3.14. There, we describe a quantum algorithm that, given an oracle that solves CVPL,αp/(2r){\mbox{\sc CVP}}_{L^{*},\alpha p/(\sqrt{2}r)}, outputs a sample from DL,rn/(αp)D_{L,r\sqrt{n}/(\alpha p)}. This is the only quantum component in this paper. We note that the condition in Lemma 3.14 is satisfied since by Claim 2.13, αp/(2r)1/ηε(L)λ1(L)/2\alpha p/(\sqrt{2}r)\leq 1/\eta_{\varepsilon}(L)\leq\lambda_{1}(L^{*})/2.  

3.2.1 From samples to CVP

Our goal in this subsection is to prove the following.

Lemma 3.4 (First part of iterative step)

Let ε=ε(n)\varepsilon=\varepsilon(n) be a negligible function, p=p(n)2p=p(n)\geq 2 be an integer, and α=α(n)(0,1)\alpha=\alpha(n)\in(0,1) be a real number. Assume that we have access to an oracle WW that solves LWEp,Ψα\mbox{\sc LWE}_{p,\Psi_{\alpha}} given a polynomial number of samples. Then, there exist a constant c>0c>0 and an efficient algorithm that, given any nn-dimensional lattice LL, a number r>2pηε(L)r>\sqrt{2}p\eta_{\varepsilon}(L), and ncn^{c} samples from DL,rD_{L,r}, solves CVPL,αp/(2r){\mbox{\sc CVP}}_{L^{*},\alpha p/(\sqrt{2}r)}.

For an nn-dimensional lattice LL, some 0<d<λ1(L)/20<d<\lambda_{1}(L)/2, and an integer p2p\geq 2, we say that an algorithm solves CVPL,d(p){\mbox{\sc CVP}}_{L,d}^{(p)} if, given any point 𝐱n\mathbf{x}\in\mathbb{R}^{n} within distance dd of LL, it outputs L1κL(𝐱)modppnL^{-1}\kappa_{L}(\mathbf{x})~{}{\rm{mod}}~{}p\in{\mathbb{Z}}_{p}^{n}, the coefficient vector of the closest vector to 𝐱\mathbf{x} reduced modulo pp. We start with the following lemma, which shows a reduction from CVPL,d{\mbox{\sc CVP}}_{L,d} to CVPL,d(p){\mbox{\sc CVP}}_{L,d}^{(p)}.

Lemma 3.5 (Finding coefficients modulo pp is sufficient)

There exists an efficient algorithm that given a lattice LL, a number d<λ1(L)/2d<\lambda_{1}(L)/2 and an integer p2p\geq 2, solves CVPL,d{\mbox{\sc CVP}}_{L,d} given access to an oracle for CVPL,d(p){\mbox{\sc CVP}}_{L,d}^{(p)}.

  • Proof:

    Our input is a point 𝐱\mathbf{x} within distance dd of LL. We define a sequence of points 𝐱1=𝐱,𝐱2,𝐱3,\mathbf{x}_{1}=\mathbf{x},\mathbf{x}_{2},\mathbf{x}_{3},\ldots as follows. Let 𝐚i=L1κL(𝐱i)n\mathbf{a}_{i}=L^{-1}\kappa_{L}(\mathbf{x}_{i})\in{\mathbb{Z}}^{n} be the coefficient vector of the closest lattice point to 𝐱i\mathbf{x}_{i}. We define 𝐱i+1=(𝐱iL(𝐚imodp))/p\mathbf{x}_{i+1}=(\mathbf{x}_{i}-L(\mathbf{a}_{i}~{}{\rm{mod}}~{}p))/p. Notice that the closest lattice point to 𝐱i+1\mathbf{x}_{i+1} is L(𝐚i(𝐚imodp))/pLL(\mathbf{a}_{i}-(\mathbf{a}_{i}~{}{\rm{mod}}~{}p))/p\in L and hence 𝐚i+1=(𝐚i(𝐚imodp))/p\mathbf{a}_{i+1}=(\mathbf{a}_{i}-(\mathbf{a}_{i}~{}{\rm{mod}}~{}p))/p. Moreover, the distance of 𝐱i+1\mathbf{x}_{i+1} from LL is at most d/pid/p^{i}. Also note that this sequence can be computed by using the oracle.

    After nn steps, we have a point 𝐱n+1\mathbf{x}_{n+1} whose distance to the lattice is at most d/pnd/p^{n}. We now apply an algorithm for approximately solving the closest vector problem, such as Babai’s nearest plane algorithm [8]. This yields a lattice point L𝐚L\mathbf{a} within distance 2nd/pnd<λ1(L)/22^{n}\cdot d/p^{n}\leq d<\lambda_{1}(L)/2 of 𝐱n+1\mathbf{x}_{n+1}. Hence, L𝐚L\mathbf{a} is the lattice point closest to 𝐱n+1\mathbf{x}_{n+1} and we managed to recover 𝐚n+1=𝐚\mathbf{a}_{n+1}=\mathbf{a}. Knowing 𝐚n+1\mathbf{a}_{n+1} and 𝐚nmodp\mathbf{a}_{n}~{}{\rm{mod}}~{}p (by using the oracle), we can now recover 𝐚n=p𝐚n+1+(𝐚nmodp)\mathbf{a}_{n}=p\mathbf{a}_{n+1}+(\mathbf{a}_{n}~{}{\rm{mod}}~{}p). Continuing this process, we can recover 𝐚n1,𝐚n2,,𝐚1\mathbf{a}_{n-1},\mathbf{a}_{n-2},\ldots,\mathbf{a}_{1}. This completes the algorithm since L𝐚1L\mathbf{a}_{1} is the closest point to 𝐱1=𝐱\mathbf{x}_{1}=\mathbf{x}.  
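In code, the reduction looks as follows (a Python sketch; `cvp_mod_p` is the CVPL,d(p){\mbox{\sc CVP}}_{L,d}^{(p)} oracle returning L1κL(𝐱)modpL^{-1}\kappa_{L}(\mathbf{x})~{}{\rm{mod}}~{}p, and `babai` is any approximation algorithm for the closest vector problem with factor at most 2n2^{n}, returning a coefficient vector; both are assumed given).

import numpy as np

def cvp_from_cvp_mod_p(B, x, p, n, cvp_mod_p, babai):
    # Lemma 3.5 as code: descend n steps, shrinking the distance to L
    # by a factor of p each time, then climb back up.
    residues = []
    for _ in range(n):
        a_mod = cvp_mod_p(x)            # a_i mod p, from the oracle
        residues.append(a_mod)
        x = (x - B.dot(a_mod)) / p      # x_{i+1} = (x_i - L(a_i mod p)) / p
    a = babai(x)                        # exact for x_{n+1}: distance <= d/p^n
    for a_mod in reversed(residues):
        a = p * a + a_mod               # a_i = p * a_{i+1} + (a_i mod p)
    return B.dot(a)                     # the lattice point closest to the input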

As we noted in the proof of Lemma 3.3, for our choice of rr, αp/(2r)λ1(L)/2\alpha p/(\sqrt{2}r)\leq\lambda_{1}(L^{*})/2. Hence, in order to prove Lemma 3.4, it suffices to present an efficient algorithm for CVPL,αp/(2r)(p){\mbox{\sc CVP}}_{L^{*},\alpha p/(\sqrt{2}r)}^{(p)}. We do this by combining two lemmas. The first, Lemma 3.7, shows an algorithm WW^{\prime} that, given samples from A𝐬,ΨβA_{\mathbf{s},\Psi_{\beta}} for some (unknown) βα\beta\leq\alpha, outputs 𝐬\mathbf{s} with probability exponentially close to 11 by using WW as an oracle. Its proof is based on Lemma 3.6. The second, Lemma 3.11, is the main lemma of this subsection, and shows how to use WW^{\prime} and the given samples from DL,rD_{L,r} in order to solve CVPL,αp/(2r)(p){\mbox{\sc CVP}}_{L^{*},\alpha p/(\sqrt{2}r)}^{(p)}.

Lemma 3.6 (Verifying solutions of LWE)

Let p=p(n)1p=p(n)\geq 1 be some integer. There exists an efficient algorithm that, given 𝐬\mathbf{s}^{\prime} and samples from A𝐬,ΨαA_{\mathbf{s},\Psi_{\alpha}} for some (unknown) 𝐬pn\mathbf{s}\in{\mathbb{Z}}_{p}^{n} and α<1\alpha<1, outputs whether 𝐬=𝐬\mathbf{s}=\mathbf{s}^{\prime} and is correct with probability exponentially close to 11.

We remark that the lemma holds also for all αO(logn)\alpha\leq O(\sqrt{\log n}) with essentially the same proof.

  • Proof:

    The idea is to perform a statistical test on samples from A𝐬,ΨαA_{\mathbf{s},\Psi_{\alpha}} that checks whether 𝐬=𝐬\mathbf{s}=\mathbf{s}^{\prime}. Let ξ\xi be the distribution on 𝕋\mathbb{T} obtained by sampling (𝐚,x)(\mathbf{a},x) from A𝐬,ΨαA_{\mathbf{s},\Psi_{\alpha}} and outputting x𝐚,𝐬/p𝕋x-{\langle{\mathbf{a},\mathbf{s}^{\prime}}\rangle}/p\in\mathbb{T}. The algorithm takes nn samples y1,,yny_{1},\ldots,y_{n} from ξ\xi. It then computes z:=1ni=1ncos(2πyi)z:=\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi y_{i}). If z>0.02z>0.02, it decides that 𝐬=𝐬\mathbf{s}=\mathbf{s}^{\prime}, otherwise it decides that 𝐬𝐬\mathbf{s}\neq\mathbf{s}^{\prime}.

    We now analyze this algorithm. Consider the distribution ξ\xi. Notice that it can be obtained by sampling ee from Ψα\Psi_{\alpha}, sampling 𝐚\mathbf{a} uniformly from pn{\mathbb{Z}}_{p}^{n} and outputting e+𝐚,𝐬𝐬/p𝕋e+{\langle{\mathbf{a},\mathbf{s}-\mathbf{s}^{\prime}}\rangle}/p\in\mathbb{T}. From this it easily follows that if 𝐬=𝐬\mathbf{s}=\mathbf{s}^{\prime}, ξ\xi is exactly Ψα\Psi_{\alpha}. Otherwise, if 𝐬𝐬\mathbf{s}\neq\mathbf{s}^{\prime}, we claim that ξ\xi has a period of 1/k1/k for some integer k2k\geq 2. Indeed, let jj be an index on which sjsjs_{j}\neq s^{\prime}_{j}. Then the distribution of aj(sjsj)modpa_{j}(s_{j}-s^{\prime}_{j})~{}{\rm{mod}}~{}p is periodic with period gcd(p,sjsj)<p{\rm{gcd}}(p,s_{j}-s^{\prime}_{j})<p. This clearly implies that the distribution of aj(sjsj)/pmod1a_{j}(s_{j}-s^{\prime}_{j})/p~{}{\rm{mod}}~{}1 is periodic with period 1/k1/k for some k2k\geq 2. Since a sample from ξ\xi can be obtained by adding a sample from aj(sjsj)/pmod1a_{j}(s_{j}-s^{\prime}_{j})/p~{}{\rm{mod}}~{}1 and an independent sample from some other distribution, we obtain that ξ\xi also has the same period of 1/k1/k.

    Consider the expectation555We remark that this expectation is essentially the Fourier series of ξ\xi at point 11 and that the following arguments can be explained in terms of properties of the Fourier series.

    z~:=Expyξ[cos(2πy)]=01cos(2πy)ξ(y)𝑑y=Re[01exp(2πiy)ξ(y)𝑑y].\displaystyle\tilde{z}:=\mathop{\mathrm{Exp}}_{y\sim\xi}[\cos(2\pi y)]=\int_{0}^{1}\cos(2\pi y)\xi(y)dy=\mathop{\rm Re}\Big{[}\int_{0}^{1}{{\mathrm{exp}}\left(2\pi iy\right)}\xi(y)dy\Big{]}.

    First, a routine calculation shows that for ξ=Ψα\xi=\Psi_{\alpha}, z~=exp(πα2)\tilde{z}={{\mathrm{exp}}\left(-\pi\alpha^{2}\right)}, which is at least 0.040.04 for α<1\alpha<1. Moreover, if ξ\xi has a period of 1/k1/k, then

    01exp(2πiy)ξ(y)𝑑y=01exp(2πi(y+1k))ξ(y)𝑑y=exp(2πi/k)01exp(2πiy)ξ(y)𝑑y\int_{0}^{1}{{\mathrm{exp}}\left(2\pi iy\right)}\xi(y)dy=\int_{0}^{1}{{\mathrm{exp}}\left(2\pi i(y+\textstyle{\frac{1}{k}})\right)}\xi(y)dy={{\mathrm{exp}}\left(2\pi i/k\right)}\int_{0}^{1}{{\mathrm{exp}}\left(2\pi iy\right)}\xi(y)dy

    which implies that if k2k\geq 2 then z~=0\tilde{z}=0. We complete the proof by noting that by the Chernoff bound, |zz~|0.01|z-\tilde{z}|\leq 0.01 with probability exponentially close to 11.  
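The statistical test itself is short; the following Python sketch implements it for samples (𝐚,x)(\mathbf{a},x) in which x𝕋x\in\mathbb{T} is represented as a real number in [0,1)[0,1).

import numpy as np

def verify_lwe_solution(samples, s_prime, p, threshold=0.02):
    # Lemma 3.6 as code (a sketch): z estimates Exp[cos(2*pi*y)] for
    # y = x - <a,s'>/p mod 1; this is about exp(-pi*alpha^2) >= 0.04
    # when s' = s, and about 0 otherwise (xi then has period 1/k, k >= 2).
    z = np.mean([np.cos(2 * np.pi * (x - np.dot(a, s_prime) / p))
                 for (a, x) in samples])
    return z > threshold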

Lemma 3.7 (Handling error Ψβ\Psi_{\beta} for βα\beta\leq\alpha)

Let p=p(n)2p=p(n)\geq 2 be some integer and α=α(n)(0,1)\alpha=\alpha(n)\in(0,1). Assume that we have access to an oracle WW that solves LWEp,Ψα\mbox{\sc LWE}_{p,\Psi_{\alpha}} by using a polynomial number of samples. Then, there exists an efficient algorithm WW^{\prime} that, given samples from A𝐬,ΨβA_{\mathbf{s},\Psi_{\beta}} for some (unknown) βα\beta\leq\alpha, outputs 𝐬\mathbf{s} with probability exponentially close to 11.

  • Proof:

    The proof is based on the following idea: by adding the right amount of noise, we can transform samples from A𝐬,ΨβA_{\mathbf{s},\Psi_{\beta}} to samples from A𝐬,ΨαA_{\mathbf{s},\Psi_{\alpha}} (or something sufficiently close to it). Assume that the number of samples required by WW is at most ncn^{c} for some c>0c>0. Let ZZ be the set of all integer multiples of n2cα2n^{-2c}\alpha^{2} between 0 and α2\alpha^{2}. For each γZ\gamma\in Z, Algorithm WW^{\prime} does the following nn times. It takes ncn^{c} samples from A𝐬,ΨβA_{\mathbf{s},\Psi_{\beta}} and adds to the second element of each sample a noise sampled independently from Ψγ\Psi_{\sqrt{\gamma}}. This creates ncn^{c} samples taken from the distribution A𝐬,Ψβ2+γA_{\mathbf{s},\Psi_{\sqrt{\beta^{2}+\gamma}}}. It then applies WW and obtains some candidate 𝐬\mathbf{s}^{\prime}. Using Lemma 3.6, it checks whether 𝐬=𝐬\mathbf{s}^{\prime}=\mathbf{s}. If the answer is yes, it outputs 𝐬\mathbf{s}^{\prime}; otherwise, it continues.

    We now show that WW^{\prime} finds 𝐬\mathbf{s} with probability exponentially close to 11. By Lemma 3.6, if WW^{\prime} outputs some value, then this value is correct with probability exponentially close to 11. Hence, it is enough to show that in one of the iterations, WW^{\prime} outputs some value. Consider the smallest γZ\gamma\in Z such that γα2β2\gamma\geq\alpha^{2}-\beta^{2}. Clearly, γα2β2+n2cα2\gamma\leq\alpha^{2}-\beta^{2}+n^{-2c}\alpha^{2}. Define α=β2+γ\alpha^{\prime}=\sqrt{\beta^{2}+\gamma}. Then,

    ααα2+n2cα2(1+n2c)α.\alpha\leq\alpha^{\prime}\leq\sqrt{\alpha^{2}+n^{-2c}\alpha^{2}}\leq(1+n^{-2c})\alpha.

    By Claim 2.2, the statistical distance between Ψα\Psi_{\alpha} and Ψα\Psi_{\alpha^{\prime}} is at most 9n2c9n^{-2c}. Hence, the statistical distance between ncn^{c} samples from Ψα\Psi_{\alpha} and ncn^{c} samples from Ψα\Psi_{\alpha^{\prime}} is at most 9nc9n^{-c}. Therefore, for our choice of γ\gamma, WW outputs 𝐬\mathbf{s} with probability at least 19nc/22Ω(n)121-9n^{-c}/2-2^{-\Omega(n)}\geq\frac{1}{2}. The probability that 𝐬\mathbf{s} is not found in any of the nn calls to WW is at most 2n2^{-n}.  

For the analysis of our main procedure in Lemma 3.11, we will need the following claims regarding Gaussian measures on lattices. On first reading, the reader can just read the statements of Claim 3.8 and Corollary 3.10 and skip directly to Lemma 3.11. All claims show that in some sense, when working above the smoothing parameter, the discrete Gaussian measure behaves like the continuous Gaussian measure. We start with the following claim, showing that above the smoothing parameter, the discrete Gaussian measure is essentially invariant under shifts.

Claim 3.8

For any lattice LL, 𝐜n\mathbf{c}\in\mathbb{R}^{n}, ε>0\varepsilon>0, and rηε(L)r\geq\eta_{\varepsilon}(L),

ρr(L+𝐜)rndet(L)(1±ε).\rho_{r}(L+\mathbf{c})\in r^{n}\det(L^{*})(1\pm\varepsilon).
  • Proof:

    Using the Poisson summation formula (Lemma 2.14) and the assumption that ρ1/r(L{𝟎})ε\rho_{1/r}(L^{*}\setminus\{\mathbf{0}\})\leq\varepsilon,

    ρr(L+𝐜)=𝐱Lρr(𝐱+𝐜)\displaystyle\rho_{r}(L+\mathbf{c})=\sum_{\mathbf{x}\in L}\rho_{r}(\mathbf{x}+\mathbf{c}) =𝐱Lρr,𝐜(𝐱)\displaystyle=\sum_{\mathbf{x}\in L}\rho_{r,-\mathbf{c}}(\mathbf{x})
    =det(L)𝐲Lρr,𝐜^(𝐲)\displaystyle=\det(L^{*})\sum_{\mathbf{y}\in L^{*}}\widehat{\rho_{r,-\mathbf{c}}}(\mathbf{y})
    =rndet(L)𝐲Lexp(2πi𝐜,𝐲)ρ1/r(𝐲)\displaystyle=r^{n}\det(L^{*})\sum_{\mathbf{y}\in L^{*}}{{\mathrm{exp}}\left(2\pi i{\langle{\mathbf{c},\mathbf{y}}\rangle}\right)}\rho_{1/r}(\mathbf{y})
    =rndet(L)(1±ε).\displaystyle=r^{n}\det(L^{*})(1\pm\varepsilon).

     

The following claim (which is only used to establish the corollary following it) says that when adding a continuous Gaussian of width ss to a discrete Gaussian of width rr, with both rr and ss sufficiently greater than the smoothing parameter, the resulting distribution is very close to a continuous Gaussian of the width we would expect, namely r2+s2\sqrt{r^{2}+s^{2}}. To get some intuition on why we need to assume that both Gaussians are sufficiently wide, notice for instance that if the discrete Gaussian is very narrow, then it is concentrated on the origin, making the sum have width ss. Also, if the continuous Gaussian is too narrow, then the discrete structure is still visible in the sum.

Claim 3.9

Let LL be a lattice, let 𝐮n\mathbf{u}\in\mathbb{R}^{n} be any vector, let r,s>0r,s>0 be two reals, and let tt denote r2+s2\sqrt{r^{2}+s^{2}}. Assume that rs/t=1/1/r2+1/s2ηε(L)rs/t=1/\sqrt{1/r^{2}+1/s^{2}}\geq\eta_{\varepsilon}(L) for some ε<12\varepsilon<\frac{1}{2}. Consider the continuous distribution YY on n\mathbb{R}^{n} obtained by sampling from DL+𝐮,rD_{L+\mathbf{u},r} and then adding a noise vector taken from νs\nu_{s}. Then, the statistical distance between YY and νt\nu_{t} is at most 4ε4\varepsilon.

  • Proof:

    The probability density function of YY can be written as

    Y(𝐱)\displaystyle Y(\mathbf{x}) =1snρr(L+𝐮)𝐲L+𝐮ρr(𝐲)ρs(𝐱𝐲)\displaystyle=\frac{1}{s^{n}\rho_{r}(L+\mathbf{u})}\sum_{\mathbf{y}\in L+\mathbf{u}}\rho_{r}(\mathbf{y})\rho_{s}(\mathbf{x}-\mathbf{y})
    =1snρr(L+𝐮)𝐲L+𝐮exp(π(𝐲/r2+(𝐱𝐲)/s2))\displaystyle=\frac{1}{s^{n}\rho_{r}(L+\mathbf{u})}\sum_{\mathbf{y}\in L+\mathbf{u}}{{\mathrm{exp}}\left(-\pi(\|\mathbf{y}/r\|^{2}+\|(\mathbf{x}-\mathbf{y})/s\|^{2})\right)}
    =1snρr(L+𝐮)𝐲L+𝐮exp(π(r2+s2r2s2𝐲r2r2+s2𝐱2+1r2+s2𝐱2))\displaystyle=\frac{1}{s^{n}\rho_{r}(L+\mathbf{u})}\sum_{\mathbf{y}\in L+\mathbf{u}}{{\mathrm{exp}}\left(-\pi\Big{(}\frac{r^{2}+s^{2}}{r^{2}\cdot s^{2}}\cdot\Big{\|}\mathbf{y}-\frac{r^{2}}{r^{2}+s^{2}}\mathbf{x}\Big{\|}^{2}+\frac{1}{r^{2}+s^{2}}\|\mathbf{x}\|^{2}\Big{)}\right)}
    =exp(πr2+s2𝐱2)1snρr(L+𝐮)𝐲L+𝐮exp(π(r2+s2r2s2𝐲r2r2+s2𝐱2))\displaystyle={{\mathrm{exp}}\left(-\frac{\pi}{r^{2}+s^{2}}\|\mathbf{x}\|^{2}\right)}\frac{1}{s^{n}\rho_{r}(L+\mathbf{u})}\sum_{\mathbf{y}\in L+\mathbf{u}}{{\mathrm{exp}}\left(-\pi\Big{(}\frac{r^{2}+s^{2}}{r^{2}\cdot s^{2}}\cdot\Big{\|}\mathbf{y}-\frac{r^{2}}{r^{2}+s^{2}}\mathbf{x}\Big{\|}^{2}\Big{)}\right)}
    =1snρt(𝐱)ρrs/t,(r/t)2𝐱𝐮(L)ρr,𝐮(L)\displaystyle=\frac{1}{s^{n}}\rho_{t}(\mathbf{x})\cdot\frac{\rho_{rs/t,(r/t)^{2}\mathbf{x}-\mathbf{u}}(L)}{\rho_{r,-\mathbf{u}}(L)}
    =1snρt(𝐱)ρrs/t,(r/t)2𝐱𝐮^(L)ρr,𝐮^(L)\displaystyle=\frac{1}{s^{n}}\rho_{t}(\mathbf{x})\cdot\frac{\widehat{\rho_{rs/t,(r/t)^{2}\mathbf{x}-\mathbf{u}}}(L^{*})}{\widehat{\rho_{r,-\mathbf{u}}}(L^{*})}
    =ρt(𝐱)/tn(t/rs)nρrs/t,(r/t)2𝐱𝐮^(L)(1/r)nρr,𝐮^(L)\displaystyle=\rho_{t}(\mathbf{x})/t^{n}\cdot\frac{(t/rs)^{n}\widehat{\rho_{rs/t,(r/t)^{2}\mathbf{x}-\mathbf{u}}}(L^{*})}{(1/r)^{n}\widehat{\rho_{r,-\mathbf{u}}}(L^{*})} (11)

    where in the next-to-last equality we used Lemma 2.14. Using Eq. (9),

    ρrs/t,(r/t)2𝐱𝐮^(𝐰)\displaystyle\widehat{\rho_{rs/t,(r/t)^{2}\mathbf{x}-\mathbf{u}}}(\mathbf{w}) =exp(2πi(r/t)2𝐱𝐮,𝐰)(rs/t)nρt/rs(𝐰),\displaystyle={{\mathrm{exp}}\left(-2\pi i{\langle{(r/t)^{2}\mathbf{x}-\mathbf{u},\mathbf{w}}\rangle}\right)}\cdot(rs/t)^{n}\rho_{t/rs}(\mathbf{w}),
    ρr,𝐮^(𝐰)\displaystyle\widehat{\rho_{r,-\mathbf{u}}}(\mathbf{w}) =exp(2πi𝐮,𝐰)rnρ1/r(𝐰).\displaystyle={{\mathrm{exp}}\left(2\pi i{\langle{\mathbf{u},\mathbf{w}}\rangle}\right)}\cdot r^{n}\rho_{1/r}(\mathbf{w}).

    Hence,

    |1(t/rs)nρrs/t,(r/t)2𝐱𝐮^(L)|\displaystyle\Big{|}1-(t/rs)^{n}\widehat{\rho_{rs/t,(r/t)^{2}\mathbf{x}-\mathbf{u}}}(L^{*})\Big{|} ρt/rs(L{𝟎})ε\displaystyle\leq\rho_{t/rs}(L^{*}\setminus\{\mathbf{0}\})\leq\varepsilon
    |1(1/r)nρr,𝐮^(L)|\displaystyle\Big{|}1-(1/r)^{n}\widehat{\rho_{r,-\mathbf{u}}}(L^{*})\Big{|} ρ1/r(L{𝟎})ε\displaystyle\leq\rho_{1/r}(L^{*}\setminus\{\mathbf{0}\})\leq\varepsilon

    where the last inequality follows from 1/rt/rs1/r\leq t/rs. Hence, the quotient in (11) is between (1ε)/(1+ε)12ε(1-\varepsilon)/(1+\varepsilon)\geq 1-2\varepsilon and (1+ε)/(1ε)1+4ε(1+\varepsilon)/(1-\varepsilon)\leq 1+4\varepsilon. This implies that,

    |Y(𝐱)ρt(𝐱)/tn|ρt(𝐱)/tn4ε.\displaystyle|Y(\mathbf{x})-\rho_{t}(\mathbf{x})/t^{n}|\leq\rho_{t}(\mathbf{x})/t^{n}\cdot 4\varepsilon.

    We complete the proof by integrating over n\mathbb{R}^{n}.  

Corollary 3.10

Let LL be a lattice, let 𝐳,𝐮n\mathbf{z},\mathbf{u}\in\mathbb{R}^{n} be vectors, and let r,α>0r,\alpha>0 be two reals. Assume that 1/1/r2+(𝐳/α)2ηε(L)1/\sqrt{1/r^{2}+(\|\mathbf{z}\|/\alpha)^{2}}\geq\eta_{\varepsilon}(L) for some ε<12\varepsilon<\frac{1}{2}. Then the distribution of 𝐳,𝐯+e{\langle{\mathbf{z},\mathbf{v}}\rangle}+e where 𝐯\mathbf{v} is distributed according to DL+𝐮,rD_{L+\mathbf{u},r} and ee is a normal variable with mean 0 and standard deviation α/2π\alpha/\sqrt{2\pi}, is within statistical distance 4ε4\varepsilon of a normal variable with mean 0 and standard deviation (r𝐳)2+α2/2π\sqrt{(r\|\mathbf{z}\|)^{2}+\alpha^{2}}/\sqrt{2\pi}. In particular, since statistical distance cannot increase by applying a function, the distribution of 𝐳,𝐯+emod1{\langle{\mathbf{z},\mathbf{v}}\rangle}+e~{}{\rm{mod}}~{}1 is within statistical distance 4ε4\varepsilon of Ψ(r𝐳)2+α2\Psi_{\sqrt{(r\|\mathbf{z}\|)^{2}+\alpha^{2}}}.

  • Proof:

    We first observe that the distribution of 𝐳,𝐯+e{\langle{\mathbf{z},\mathbf{v}}\rangle}+e is exactly the same as that of 𝐳,𝐯+𝐡{\langle{\mathbf{z},\mathbf{v}+\mathbf{h}}\rangle} where 𝐡\mathbf{h} is distributed as the continuous Gaussian να/𝐳\nu_{\alpha/\|\mathbf{z}\|}. Next, by Claim 3.9, we know that the distribution of 𝐯+𝐡\mathbf{v}+\mathbf{h} is within statistical distance 4ε4\varepsilon of the continuous Gaussian νr2+(α/𝐳)2\nu_{\sqrt{r^{2}+(\alpha/\|\mathbf{z}\|)^{2}}}. Taking the inner product of this continuous Gaussian with 𝐳\mathbf{z} leads to a normal distribution with mean 0 and standard deviation (r𝐳)2+α2/2π\sqrt{(r\|\mathbf{z}\|)^{2}+\alpha^{2}}/\sqrt{2\pi}, and we complete the proof by using the fact that statistical distance cannot increase by applying a function (inner product with 𝐳\mathbf{z} in this case).  

Lemma 3.11 (Main procedure of the first part)

Let ε=ε(n)\varepsilon=\varepsilon(n) be a negligible function, p=p(n)2p=p(n)\geq 2 be an integer, and α=α(n)(0,1)\alpha=\alpha(n)\in(0,1) be a real number. Assume that we have access to an oracle WW that for all βα\beta\leq\alpha, finds 𝐬\mathbf{s} given a polynomial number of samples from A𝐬,ΨβA_{\mathbf{s},\Psi_{\beta}} (without knowing β\beta). Then, there exists an efficient algorithm that given an nn-dimensional lattice LL, a number r>2pηε(L)r>\sqrt{2}p\eta_{\varepsilon}(L), and a polynomial number of samples from DL,rD_{L,r}, solves CVPL,αp/(2r)(p){\mbox{\sc CVP}}_{L^{*},\alpha p/(\sqrt{2}r)}^{(p)}.

  • Proof:

    We describe a procedure that given 𝐱\mathbf{x} within distance αp/(2r)\alpha p/(\sqrt{2}r) of LL^{*}, outputs samples from the distribution A𝐬,ΨβA_{\mathbf{s},\Psi_{\beta}} for some βα\beta\leq\alpha where 𝐬=(L)1κL(𝐱)modp\mathbf{s}=(L^{*})^{-1}\kappa_{L^{*}}(\mathbf{x})~{}{\rm{mod}}~{}p. By running this procedure a polynomial number of times and then using WW, we can find 𝐬\mathbf{s}.

    The procedure works as follows. We sample a vector 𝐯L\mathbf{v}\in L from DL,rD_{L,r}, and let 𝐚=L1𝐯modp\mathbf{a}=L^{-1}\mathbf{v}~{}{\rm{mod}}~{}p. We then output

    (𝐚,𝐱,𝐯/p+emod1)\displaystyle(\mathbf{a},{\langle{\mathbf{x},\mathbf{v}}\rangle}/p+e~{}{\rm{mod}}~{}1) (12)

    where ee\in\mathbb{R} is chosen according to a normal distribution with standard deviation α/(2π)\alpha/(2\sqrt{\pi}). We claim that the distribution given by this procedure is within negligible statistical distance of A𝐬,ΨβA_{\mathbf{s},\Psi_{\beta}} for some βα\beta\leq\alpha.

    We first notice that the distribution of 𝐚\mathbf{a} is very close to uniform. Indeed, the probability of obtaining each 𝐚pn\mathbf{a}\in{\mathbb{Z}}_{p}^{n} is proportional to ρr(pL+L𝐚)\rho_{r}(pL+L\mathbf{a}). Using ηε(pL)=pηε(L)<r\eta_{\varepsilon}(pL)=p\eta_{\varepsilon}(L)<r and Claim 3.8, the latter is (r/p)ndet(L)(1±ε)(r/p)^{n}\det(L^{*})(1\pm\varepsilon), which implies that the statistical distance between the distribution of 𝐚\mathbf{a} and the uniform distribution is negligible.

    Next, we condition on any fixed value of 𝐚\mathbf{a} and consider the distribution of the second element in (12). Define 𝐱=𝐱κL(𝐱)\mathbf{x^{\prime}}=\mathbf{x}-\kappa_{L^{*}}(\mathbf{x}) and note that 𝐱αp/(2r)\|\mathbf{x^{\prime}}\|\leq\alpha p/(\sqrt{2}r). Then,

    𝐱,𝐯/p+emod1=𝐱/p,𝐯+e+κL(𝐱),𝐯/pmod1.{\langle{\mathbf{x},\mathbf{v}}\rangle}/p+e~{}{\rm{mod}}~{}1={\langle{\mathbf{x^{\prime}}/p,\mathbf{v}}\rangle}+e+{\langle{\kappa_{L^{*}}(\mathbf{x}),\mathbf{v}}\rangle}/p~{}{\rm{mod}}~{}1.

    Now,

    κL(𝐱),𝐯=(L)1κL(𝐱),L1𝐯{\langle{\kappa_{L^{*}}(\mathbf{x}),\mathbf{v}}\rangle}={\langle{(L^{*})^{-1}\kappa_{L^{*}}(\mathbf{x}),L^{-1}\mathbf{v}}\rangle}

    since L1=(L)TL^{-1}=(L^{*})^{T}. In words, this says that the inner product between κL(𝐱)\kappa_{L^{*}}(\mathbf{x}) and 𝐯\mathbf{v} (and in fact, between any vector in LL^{*} and any vector in LL) is the same as the inner product between the corresponding coefficient vectors. Since the coefficient vectors are integer,

    κL(𝐱),𝐯modp=𝐬,𝐚modp{\langle{\kappa_{L^{*}}(\mathbf{x}),\mathbf{v}}\rangle}~{}{\rm{mod}}~{}p={\langle{\mathbf{s},\mathbf{a}}\rangle}~{}{\rm{mod}}~{}p

    from which it follows that κL(𝐱),𝐯/pmod1{\langle{\kappa_{L^{*}}(\mathbf{x}),\mathbf{v}}\rangle}/p~{}{\rm{mod}}~{}1 is exactly 𝐬,𝐚/pmod1{\langle{\mathbf{s},\mathbf{a}}\rangle}/p~{}{\rm{mod}}~{}1.

    We complete the proof by applying Corollary 3.10, which shows that the distribution of the remaining part 𝐱/p,𝐯+e{\langle{\mathbf{x^{\prime}}/p,\mathbf{v}}\rangle}+e is within negligible statistical distance of Ψβ\Psi_{\beta} for β=(r𝐱/p)2+α2/2α\beta=\sqrt{(r\|\mathbf{x}^{\prime}\|/p)^{2}+\alpha^{2}/2}\leq\alpha, as required. Here we used that the distribution of 𝐯\mathbf{v} is DpL+L𝐚,rD_{pL+L\mathbf{a},r} (since we are conditioning on 𝐚\mathbf{a}), the distribution of ee is normal with mean 0 and standard deviation (α/2)/2π(\alpha/\sqrt{2})/\sqrt{2\pi}, and that

    1/1/r2+(2𝐱/pα)2r/2>ηε(pL).1/\sqrt{1/r^{2}+(\sqrt{2}\|\mathbf{x}^{\prime}\|/p\alpha)^{2}}\geq r/\sqrt{2}>\eta_{\varepsilon}(pL).

     
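The sample-generation step in Eq. (12) is easy to write down explicitly. The following Python sketch emits one pair, given the basis matrix BB of LL (columns), a point 𝐱\mathbf{x} within distance αp/(2r)\alpha p/(\sqrt{2}r) of LL^{*}, and a sampler `dgs_sample` for DL,rD_{L,r} (assumed given; all names are illustrative).

import numpy as np

rng = np.random.default_rng(4)

def lwe_sample_from_bdd(B, x, p, alpha, dgs_sample):
    # Main procedure of Lemma 3.11 (a sketch): one sample close to
    # A_{s,Psi_beta} for some beta <= alpha, with s = (L*)^{-1} kappa_{L*}(x) mod p.
    v = dgs_sample()                                     # v ~ D_{L,r}
    a = np.round(np.linalg.solve(B, v)).astype(int) % p  # a = L^{-1} v mod p
    e = rng.normal(0.0, alpha / (2.0 * np.sqrt(np.pi)))  # std. dev. alpha/(2 sqrt(pi))
    b = (np.dot(x, v) / p + e) % 1.0                     # <x,v>/p + e mod 1
    return a, b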

3.2.2 From CVP to samples

In this subsection we describe a quantum procedure that uses a CVP oracle in order to create samples from the discrete Gaussian distribution. We assume familiarity with some basic notions of quantum computation, such as (pure) states, measurements, and the quantum Fourier transform. See, e.g., [31] for a good introduction. For clarity, we often omit the normalization factors from quantum states.

The following lemma shows that we can efficiently create a ‘discrete quantum Gaussian state’ of width rr as long as rr is large enough compared with λn(L)\lambda_{n}(L). It can be seen as the quantum analogue of Lemma 3.2. The assumption that LnL\subseteq{\mathbb{Z}}^{n} is essentially without loss of generality since a lattice with rational coordinates can always be rescaled so that LnL\subseteq{\mathbb{Z}}^{n}.

Lemma 3.12

There exists an efficient quantum algorithm that, given an nn-dimensional lattice LnL\subseteq{\mathbb{Z}}^{n} and r>22nλn(L)r>2^{2n}\lambda_{n}(L), outputs a state that is within 2\ell_{2} distance 2Ω(n)2^{-\Omega(n)} of the normalized state corresponding to

𝐱Lρr(𝐱)|𝐱=𝐱Lρ2r(𝐱)|𝐱.\displaystyle\sum_{\mathbf{x}\in L}\sqrt{\rho_{r}(\mathbf{x})}{|{\mathbf{x}}\rangle}=\sum_{\mathbf{x}\in L}\rho_{\sqrt{2}r}(\mathbf{x}){|{\mathbf{x}}\rangle}. (13)
  • Proof:

    We start by creating the ‘one-dimensional Gaussian state’

    x=nrnreπ(x/(2r))2|x.\displaystyle\sum_{x=-\sqrt{n}r}^{\sqrt{n}r}e^{-\pi(x/(\sqrt{2}r))^{2}}{|{x}\rangle}. (14)

    This state can be created efficiently using a technique by Grover and Rudolph [18] who show that in order to create such a state, it suffices to be able to compute for any a,b{nr,,nr}a,b\in\{-\sqrt{n}r,\ldots,\sqrt{n}r\} the sum x=abeπ(x/r)2\sum_{x=a}^{b}e^{-\pi(x/r)^{2}} to within good precision. This can be done using the same standard techniques used in sampling from the normal distribution.

    Repeating the procedure described above nn times creates a system whose state is the nn-fold tensor product of the state in Eq. (14), which can be written as

    𝐱{nr,,nr}nρ2r(𝐱)|𝐱.\displaystyle\sum_{\mathbf{x}\in\{-\sqrt{n}r,\ldots,\sqrt{n}r\}^{n}}\rho_{\sqrt{2}r}(\mathbf{x}){|{\mathbf{x}}\rangle}.

    Since nnrBn{nr,,nr}n{\mathbb{Z}}^{n}\cap\sqrt{n}rB_{n}\subseteq\{-\sqrt{n}r,\ldots,\sqrt{n}r\}^{n}, Lemma 2.5 implies that this state is within 2\ell_{2} distance 2Ω(n)2^{-\Omega(n)} of

    𝐱nρ2r(𝐱)|𝐱\displaystyle\sum_{\mathbf{x}\in{\mathbb{Z}}^{n}}\rho_{\sqrt{2}r}(\mathbf{x}){|{\mathbf{x}}\rangle} (15)

    and hence for our purposes we can assume that we have generated the state in Eq. (15).

    Next, using the LLL basis reduction algorithm [24], we obtain a basis for LL of length at most 2nλn(L)2^{n}\lambda_{n}(L) and let 𝒫(L){\cal P}(L) be the parallelepiped generated by this basis. We now compute in a new register 𝐱mod𝒫(L)\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L) and measure it. Let 𝐲𝒫(L)\mathbf{y}\in{\cal P}(L) denote the result and note that 𝐲diam(𝒫(L))n2nλn(L)\|\mathbf{y}\|\leq\mbox{diam}({\cal P}(L))\leq n2^{n}\lambda_{n}(L). The state we obtain after the measurement is

    𝐱L+𝐲ρ2r(𝐱)|𝐱.\sum_{\mathbf{x}\in L+\mathbf{y}}\rho_{\sqrt{2}r}(\mathbf{x}){|{\mathbf{x}}\rangle}.

    Finally, we subtract 𝐲\mathbf{y} from our register, and obtain

    𝐱Lρ2r(𝐱+𝐲)|𝐱.\sum_{\mathbf{x}\in L}\rho_{\sqrt{2}r}(\mathbf{x}+\mathbf{y}){|{\mathbf{x}}\rangle}.

    Our goal is to show that this state is within 2\ell_{2} distance 2Ω(n)2^{-\Omega(n)} of the one in Eq. (13). First, by Lemma 2.5, all but an exponentially small part of the 2\ell_{2} norm of the state in Eq. (13) is concentrated on points of norm at most nr\sqrt{n}\cdot r. So consider any 𝐱L\mathbf{x}\in L with 𝐱nr\|\mathbf{x}\|\leq\sqrt{n}\cdot r. The amplitude squared given to it in Eq. (13) is ρr(𝐱)/ρr(L)\rho_{r}(\mathbf{x})/\rho_{r}(L). By Lemma 2.14, the denominator is ρr(L)=det(L)rnρ1/r(L)det(L)rn\rho_{r}(L)=\det(L^{*})\cdot r^{n}\rho_{1/r}(L^{*})\geq\det(L^{*})\cdot r^{n} and hence the amplitude squared is at most ρr(𝐱)/(det(L)rn)=det(L)νr(𝐱)\rho_{r}(\mathbf{x})/(\det(L^{*})\cdot r^{n})=\det(L)\nu_{r}(\mathbf{x}).

    On the other hand, the amplitude squared given to 𝐱\mathbf{x} by our procedure is ρr(𝐱+𝐲)/ρr(L+𝐲)\rho_{r}(\mathbf{x}+\mathbf{y})/\rho_{r}(L+\mathbf{y}). By Lemma 2.14, the denominator is

    ρr(L+𝐲)=det(L)rn𝐳Le2πi𝐳,𝐲ρ1/r(𝐳)(1+2Ω(n))det(L)rn.\rho_{r}(L+\mathbf{y})=\det(L^{*})\cdot r^{n}\sum_{\mathbf{z}\in L^{*}}e^{2\pi i{\langle{\mathbf{z},\mathbf{y}}\rangle}}\rho_{1/r}(\mathbf{z})\leq(1+2^{-\Omega(n)})\det(L^{*})\cdot r^{n}.

    To obtain this inequality, first note that by the easy part of Lemma 2.3, λ1(L)1/λn(L)>n/r\lambda_{1}(L^{*})\geq 1/\lambda_{n}(L)>\sqrt{n}/r, and then apply Lemma 2.5. Moreover, by Claim 2.1, the numerator is at least (12Ω(n))ρr(𝐱)(1-2^{-\Omega(n)})\rho_{r}(\mathbf{x}). Hence, the amplitude squared given to 𝐱\mathbf{x} is at least (12Ω(n))det(L)νr(𝐱)(1-2^{-\Omega(n)})\det(L)\nu_{r}(\mathbf{x}), as required.  

For a lattice LL and a positive integer RR, we denote by L/RL/R the lattice obtained by scaling down LL by a factor of RR. The following technical claim follows from the fact that almost all the mass of ρ\rho is on points of norm at most n\sqrt{n}.666The proof originally provided in the published journal version of this paper (J. ACM, 56(6), 34, 2009) contained a bug. The bug was found and fixed by the author in summer 2023, and also independently by Yilei Chen, Zihan Hu, Qipeng Liu, Han Luo, and Yaxin Tu in late 2023. The current statement of the claim is identical to the original one. However, the condition λ1(L)>2n\lambda_{1}(L)>2\sqrt{n} is no longer used in the proof.

Claim 3.13

Let R1R\geq 1 be an integer and LL be an nn-dimensional lattice satisfying λ1(L)>2n\lambda_{1}(L)>2\sqrt{n}. Let 𝒫(L){\cal P}(L) be some basic parallelepiped of LL. Then, the 2\ell_{2} distance between the normalized quantum states corresponding to

|ϑ1\displaystyle{|{\vartheta_{1}}\rangle} =𝐱L/R,𝐱<nρ(𝐱)|𝐱mod𝒫(L), and\displaystyle=\sum_{\mathbf{x}\in L/R,\|\mathbf{x}\|<\sqrt{n}}\rho(\mathbf{x}){|{\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L)}\rangle},\text{\qquad and}
|ϑ2\displaystyle{|{\vartheta_{2}}\rangle} =𝐱L/Rρ(𝐱)|𝐱mod𝒫(L)=𝐱L/R𝒫(L)𝐲Lρ(𝐱𝐲)|𝐱\displaystyle=\sum_{\mathbf{x}\in L/R}\rho(\mathbf{x}){|{\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L)}\rangle}=\sum_{\mathbf{x}\in L/R\cap{\cal P}(L)}\sum_{\mathbf{y}\in L}\rho(\mathbf{x}-\mathbf{y}){|{\mathbf{x}}\rangle}

is 2Ω(n)2^{-\Omega(n)}.

  • Proof:

    Let ZZ be the 2\ell_{2} norm of |ϑ2{|{\vartheta_{2}}\rangle}. In the following we show that the 2\ell_{2} distance between |ϑ1{|{\vartheta_{1}}\rangle} and |ϑ2{|{\vartheta_{2}}\rangle} is at most 2Ω(n)Z2^{-\Omega(n)}Z. This is enough to establish that the 2\ell_{2} distance between the normalized quantum states corresponding to |ϑ1{|{\vartheta_{1}}\rangle} and |ϑ2{|{\vartheta_{2}}\rangle} is exponentially small.

    Define the 2n2n-dimensional lattice

    M\displaystyle M ={(𝐱1,𝐱2)L/R×L/R|𝐱1=𝐱2mod𝒫(L)}\displaystyle=\{(\mathbf{x}_{1},\mathbf{x}_{2})\in L/R\times L/R~{}|~{}\mathbf{x}_{1}=\mathbf{x}_{2}~{}{\rm{mod}}~{}{\cal P}(L)\}
    =𝐱L/R𝒫(L)(𝐱+L)×(𝐱+L),\displaystyle=\bigcup_{\mathbf{x}\in L/R\cap{\cal P}(L)}(\mathbf{x}+L)\times(\mathbf{x}+L)\;,

    where the union is of disjoint sets. It follows that ρ(M)=𝐱L/R𝒫(L)ρ(𝐱+L)2=Z2\rho(M)=\sum_{\mathbf{x}\in L/R\cap{\cal P}(L)}\rho(\mathbf{x}+L)^{2}=Z^{2}.

    Next, notice that

    |ϑ2|ϑ1=𝐱L/R𝒫(L)ρ((𝐱+L)nBn)|𝐱.{|{\vartheta_{2}}\rangle}-{|{\vartheta_{1}}\rangle}=\sum_{\mathbf{x}\in L/R\cap{\cal P}(L)}\rho((\mathbf{x}+L)\setminus\sqrt{n}B_{n}){|{\mathbf{x}}\rangle}\;.

    Therefore, |ϑ2|ϑ12=ρ(M)\|{|{\vartheta_{2}}\rangle}-{|{\vartheta_{1}}\rangle}\|^{2}=\rho(M^{\prime}) where MM^{\prime} is the subset of MM containing all pairs (𝐱1,𝐱2)(\mathbf{x}_{1},\mathbf{x}_{2}) such that both 𝐱1n\|\mathbf{x}_{1}\|\geq\sqrt{n} and 𝐱2n\|\mathbf{x}_{2}\|\geq\sqrt{n}. Since MM2nB2nM^{\prime}\subset M\setminus\sqrt{2n}B_{2n}, we can apply Lemma 2.5 (with r=1r=1, to the 2n2n-dimensional lattice MM) and obtain that ρ(M)<24nρ(M)=24nZ2\rho(M^{\prime})<2^{-4n}\rho(M)=2^{-4n}Z^{2}, as desired.  

We now prove the main lemma of this subsection.

Lemma 3.14 (Second part of iterative step)

There exists an efficient quantum algorithm that, given any nn-dimensional lattice LL, a number d<λ1(L)/2d<\lambda_{1}(L^{*})/2, and an oracle that solves CVPL,d{\mbox{\sc CVP}}_{L^{*},d}, outputs a sample from DL,n/(2d)D_{L,\sqrt{n}/(\sqrt{2}d)}.

  • Proof:

    By scaling, we can assume without loss of generality that d=nd=\sqrt{n}. Let R23nλn(L)R\geq 2^{3n}\lambda_{n}(L^{*}) be a large enough integer. We can assume that logR\log R is polynomial in the input size (since such an RR can be computed in polynomial time given the lattice LL). Our first step is to create a state exponentially close to

    𝐱L/R𝒫(L)𝐲Lρ(𝐱𝐲)|𝐱.\sum_{\mathbf{x}\in L^{*}/R\cap{\cal P}(L^{*})}\sum_{\mathbf{y}\in L^{*}}\rho(\mathbf{x}-\mathbf{y}){|{\mathbf{x}}\rangle}. (16)

    This is a state on nlogRn\log R qubits, a number that is polynomial in the input size. To do so, we first use Lemma 3.12 with r=1/2r=1/\sqrt{2} and the lattice L/RL^{*}/R to create the state

    𝐱L/Rρ(𝐱)|𝐱.\sum_{\mathbf{x}\in L^{*}/R}\rho(\mathbf{x}){|{\mathbf{x}}\rangle}.

    By Lemma 2.5, this is exponentially close to

    𝐱L/R,𝐱<nρ(𝐱)|𝐱.\sum_{\mathbf{x}\in L^{*}/R,\|\mathbf{x}\|<\sqrt{n}}\rho(\mathbf{x}){|{\mathbf{x}}\rangle}.

    Next, we compute 𝐱mod𝒫(L)\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L^{*}) in a new register and obtain

    𝐱L/R,𝐱<nρ(𝐱)|𝐱,𝐱mod𝒫(L).\sum_{\mathbf{x}\in L^{*}/R,\|\mathbf{x}\|<\sqrt{n}}\rho(\mathbf{x}){|{\mathbf{x},\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L^{*})}\rangle}.

    Using the CVP oracle, we can recover 𝐱\mathbf{x} from 𝐱mod𝒫(L)\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L^{*}). This allows us to uncompute the first register and obtain

    𝐱L/R,𝐱<nρ(𝐱)|𝐱mod𝒫(L).\sum_{\mathbf{x}\in L^{*}/R,\|\mathbf{x}\|<\sqrt{n}}\rho(\mathbf{x}){|{\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(L^{*})}\rangle}.

    Using Claim 3.13, this state is exponentially close to the required state (16).

    In the second step, we apply the quantum Fourier transform. First, using the natural mapping between L/R𝒫(L)L^{*}/R\cap{\cal P}(L^{*}) and Rn{\mathbb{Z}}_{R}^{n}, we can rewrite (16) as

    𝐬Rn𝐫nρ(L𝐬/RL𝐫)|𝐬.\sum_{\mathbf{s}\in{\mathbb{Z}}_{R}^{n}}\sum_{\mathbf{r}\in{\mathbb{Z}}^{n}}\rho(L^{*}\mathbf{s}/R-L^{*}\mathbf{r}){|{\mathbf{s}}\rangle}.

    We now apply the quantum Fourier transform on Rn{\mathbb{Z}}_{R}^{n}. We obtain a state in which the amplitude of |𝐭{|{\mathbf{t}}\rangle} for 𝐭Rn\mathbf{t}\in{\mathbb{Z}}_{R}^{n} is proportional to

    𝐬Rn𝐫nρ(L𝐬/RL𝐫)exp(2πi𝐬,𝐭/R)\displaystyle\sum_{\mathbf{s}\in{\mathbb{Z}}^{n}_{R}}\sum_{\mathbf{r}\in{\mathbb{Z}}^{n}}\rho(L^{*}\mathbf{s}/R-L^{*}\mathbf{r})\exp(2\pi i{\langle{\mathbf{s},\mathbf{t}}\rangle}/R)
    =𝐬nρ(L𝐬/R)exp(2πi𝐬,𝐭/R)\displaystyle\qquad=\sum_{\mathbf{s}\in{\mathbb{Z}}^{n}}\rho(L^{*}\mathbf{s}/R)\exp(2\pi i{\langle{\mathbf{s},\mathbf{t}}\rangle}/R)
    =𝐱L/Rρ(𝐱)exp(2πi(L)1𝐱,𝐭)\displaystyle\qquad=\sum_{\mathbf{x}\in L^{*}/R}\rho(\mathbf{x})\exp(2\pi i{\langle{(L^{*})^{-1}\mathbf{x},\mathbf{t}}\rangle})
    =𝐱L/Rρ(𝐱)exp(2πi𝐱,L𝐭)\displaystyle\qquad=\sum_{\mathbf{x}\in L^{*}/R}\rho(\mathbf{x})\exp(2\pi i{\langle{\mathbf{x},L\mathbf{t}}\rangle})
    =det(RL)𝐲RLρ(𝐲L𝐭)\displaystyle\qquad=\det(RL)\sum_{\mathbf{y}\in RL}\rho(\mathbf{y}-L\mathbf{t})

    where the last equality follows from Lemma 2.14 and Eq. (10). Hence, the resulting state can be equivalently written as

    𝐱𝒫(RL)L𝐲RLρ(𝐲𝐱)|𝐱.\sum_{\mathbf{x}\in{\cal P}(RL)\cap L}\sum_{\mathbf{y}\in RL}\rho(\mathbf{y}-\mathbf{x}){|{\mathbf{x}}\rangle}.

    Notice that λ1(RL)=Rλ1(L)R/λn(L)23n\lambda_{1}(RL)=R\lambda_{1}(L)\geq R/\lambda_{n}(L^{*})\geq 2^{3n}. Hence, we can apply Claim 3.13 to the lattice RLRL, and obtain that this state is exponentially close to

    𝐱L,𝐱<nρ(𝐱)|𝐱mod𝒫(RL).\sum_{\mathbf{x}\in L,\|\mathbf{x}\|<\sqrt{n}}\rho(\mathbf{x}){|{\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(RL)}\rangle}.

    We measure this state and obtain 𝐱mod𝒫(RL)\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(RL) for some vector 𝐱\mathbf{x} with 𝐱<n\|\mathbf{x}\|<\sqrt{n}. Since 𝐱mod𝒫(RL)\mathbf{x}~{}{\rm{mod}}~{}{\cal P}(RL) is within n\sqrt{n} of the lattice RLRL, and λ1(RL)23n\lambda_{1}(RL)\geq 2^{3n}, we can recover 𝐱\mathbf{x} by using, say, Babai’s nearest plane algorithm [8]. The output of the algorithm is 𝐱\mathbf{x}.

    We claim that the distribution of 𝐱\mathbf{x} is exponentially close to DL,1/2D_{L,1/\sqrt{2}}. Indeed, the probability of obtaining any 𝐱L,𝐱<n\mathbf{x}\in L,\|\mathbf{x}\|<\sqrt{n} is proportional to ρ(𝐱)2=ρ1/2(𝐱)\rho(\mathbf{x})^{2}=\rho_{1/\sqrt{2}}(\mathbf{x}). It remains to notice that by Lemma 2.5, all but an exponentially small fraction of the probability distribution DL,1/2D_{L,1/\sqrt{2}} is on points of norm less than n\sqrt{n}.  

3.3 Standard lattice problems

We now complete the proof of the main theorem by reducing the standard lattice problems GapSVP and SIVP to DGS. We start with SIVP. The basic idea of the reduction is simple: we call the DGS oracle enough times. We show that with high probability, there are nn short linearly independent vectors among the returned vectors. We prove this by using the following lemma, which appeared in the preliminary version of [29]. We include the proof since only a proof sketch was given there.

Lemma 3.15

Let LL be an nn-dimensional lattice and let rr be such that r2ηε(L)r\geq\sqrt{2}\eta_{\varepsilon}(L) where ε110\varepsilon\leq\frac{1}{10}. Then for any subspace HH of dimension at most n1n-1 the probability that 𝐱H\mathbf{x}\notin H where 𝐱\mathbf{x} is chosen from DL,rD_{L,r} is at least 110\frac{1}{10}.

  • Proof:

    Assume without loss of generality that the vector (1,0,,0)(1,0,\ldots,0) is orthogonal to HH. Using Lemma 2.14,

    Exp𝐱DL,r[exp(π(x1/r)2)]\displaystyle\mathop{\mathrm{Exp}}_{\mathbf{x}\sim D_{L,r}}[{{\mathrm{exp}}\left(-\pi(x_{1}/r)^{2}\right)}] =1ρr(L)𝐱Lexp(π(2x1/r)2)exp(π(x2/r)2)exp(π(xn/r)2)\displaystyle=\frac{1}{\rho_{r}(L)}\sum_{\mathbf{x}\in L}{{\mathrm{exp}}\left(-\pi(\sqrt{2}x_{1}/r)^{2}\right)}{{\mathrm{exp}}\left(-\pi(x_{2}/r)^{2}\right)}\cdots{{\mathrm{exp}}\left(-\pi(x_{n}/r)^{2}\right)}
    =det(L)rn2ρr(L)𝐲Lexp(π(ry1/2)2)exp(π(ry2)2)exp(π(ryn)2)\displaystyle=\frac{\det(L^{*})\,r^{n}}{\sqrt{2}\rho_{r}(L)}\,\sum_{\mathbf{y}\in L^{*}}{{\mathrm{exp}}\left(-\pi(ry_{1}/\sqrt{2})^{2}\right)}{{\mathrm{exp}}\left(-\pi(ry_{2})^{2}\right)}\cdots{{\mathrm{exp}}\left(-\pi(ry_{n})^{2}\right)}
    det(L)rn2ρr(L)ρ2/r(L)\displaystyle\leq\frac{\det(L^{*})\,r^{n}}{\sqrt{2}\rho_{r}(L)}\,\rho_{\sqrt{2}/r}(L^{*})
    det(L)rn2ρr(L)(1+ε).\displaystyle\leq\frac{\det(L^{*})\,r^{n}}{\sqrt{2}\rho_{r}(L)}\,(1+\varepsilon).

    By using Lemma 2.14 again, we see that ρr(L)=det(L)rnρ1/r(L)det(L)rn\rho_{r}(L)=\det(L^{*})\,r^{n}\rho_{1/r}(L^{*})\geq\det(L^{*})\,r^{n}. Therefore, the expectation above is at most 12(1+ε)<0.9\frac{1}{\sqrt{2}}\,(1+\varepsilon)<0.9. Since exp(π(x1/r)2)=1{{\mathrm{exp}}\left(-\pi(x_{1}/r)^{2}\right)}=1 for every 𝐱H\mathbf{x}\in H, the expectation is at least Pr[𝐱H]\Pr[\mathbf{x}\in H], and the lemma follows.  

Corollary 3.16

Let LL be an nn-dimensional lattice and let rr be such that r2ηε(L)r\geq\sqrt{2}\eta_{\varepsilon}(L) where ε110\varepsilon\leq\frac{1}{10}. Then, the probability that a set of n2n^{2} vectors chosen independently from DL,rD_{L,r} contains no nn linearly independent vectors is exponentially small.

  • Proof:

    Let 𝐱1,,𝐱n2\mathbf{x}_{1},\ldots,\mathbf{x}_{n^{2}} be n2n^{2} vectors chosen independently from DL,rD_{L,r}. For i=1,,ni=1,\ldots,n, let BiB_{i} be the event that

    dimspan(𝐱1,,𝐱(i1)n)=dimspan(𝐱1,,𝐱in)<n.\dim{\rm span}(\mathbf{x}_{1},\ldots,\mathbf{x}_{(i-1)n})=\dim{\rm span}(\mathbf{x}_{1},\ldots,\mathbf{x}_{in})<n.

    Clearly, if none of the BiB_{i}’s happens, then dimspan(𝐱1,,𝐱n2)=n\dim{\rm span}(\mathbf{x}_{1},\ldots,\mathbf{x}_{n^{2}})=n. Hence, in order to complete the proof it suffices to show that for all ii, Pr[Bi]2Ω(n)\Pr[B_{i}]\leq 2^{-\Omega(n)}. So fix some ii, and let us condition on some fixed choice of 𝐱1,,𝐱(i1)n\mathbf{x}_{1},\ldots,\mathbf{x}_{(i-1)n} such that dimspan(𝐱1,,𝐱(i1)n)<n\dim{\rm span}(\mathbf{x}_{1},\ldots,\mathbf{x}_{(i-1)n})<n. By Lemma 3.15, the probability that

    𝐱(i1)n+1,,𝐱inspan(𝐱1,,𝐱(i1)n)\mathbf{x}_{(i-1)n+1},\ldots,\mathbf{x}_{in}\in{\rm span}(\mathbf{x}_{1},\ldots,\mathbf{x}_{(i-1)n})

    is at most (9/10)n=2Ω(n)(9/10)^{n}=2^{-\Omega(n)}. This implies that Pr[Bi]2Ω(n)\Pr[B_{i}]\leq 2^{-\Omega(n)}, as required.  

In the following lemma we give the reduction from SIVP (in fact, GIVP) to DGS. It shows that under the assumptions of Theorem 3.1, there exists an efficient quantum algorithm for GIVP22nηε(L)/α\mbox{\sc GIVP}_{2\sqrt{2}n\eta_{\varepsilon}(L)/\alpha}. By Lemma 2.12, this algorithm also solves SIVPO~(n/α)\mbox{\sc SIVP}_{\tilde{O}(n/\alpha)}.

Lemma 3.17

For any ε=ε(n)110\varepsilon=\varepsilon(n)\leq\frac{1}{10} and any φ(L)2ηε(L)\varphi(L)\geq\sqrt{2}\eta_{\varepsilon}(L), there is a polynomial time reduction from GIVP2nφ\mbox{\sc GIVP}_{2\sqrt{n}\varphi} to DGSφ\mbox{\sc DGS}_{\varphi}.

  • Proof:

    As mentioned above, the idea of the reduction is to simply call the DGS oracle in an attempt to find nn short linearly independent vectors. One technical complication is that the function φ\varphi is not necessarily efficiently computable, and hence we do not know which parameter rr to give the DGS oracle. The solution is easy: we just try many values of rr and take the shortest set of nn linearly independent vectors found.

    We now present the reduction in detail. The input to the reduction is a lattice LL. We first apply the LLL algorithm [24] to obtain nn linearly independent vectors of length at most 2nλn(L)2^{n}\lambda_{n}(L). Let SS denote the resulting set, and let λn~\tilde{\lambda_{n}} be the length of the longest vector in SS. By construction we have λn(L)λn~2nλn(L)\lambda_{n}(L)\leq\tilde{\lambda_{n}}\leq 2^{n}\lambda_{n}(L). For each i{0,,2n}i\in\{0,\ldots,2n\} call the DGS oracle n2n^{2} times with the pair (L,ri)(L,r_{i}) where ri=λn~2ir_{i}=\tilde{\lambda_{n}}2^{-i}, and let SiS_{i} be the resulting set of vectors. At the end, look for a set of nn linearly independent vectors in each of S,S0,S1,,S2nS,S_{0},S_{1},\ldots,S_{2n}, and output the shortest set found.

    We now prove correctness. If φ(L)λn~\varphi(L)\geq\tilde{\lambda_{n}} then SS is already shorter than 2nφ(L)2\sqrt{n}\varphi(L) and so we are done. Otherwise, let i{0,,2n}i\in\{0,\ldots,2n\} be such that φ(L)<ri2φ(L)\varphi(L)<r_{i}\leq 2\varphi(L). Such an ii must exist by Claim 2.13. By Corollary 3.16, SiS_{i} contains nn linearly independent vectors with probability exponentially close to 11. Moreover, by Lemma 2.5, all vectors in SiS_{i} are of length at most rin2nφ(L)r_{i}\sqrt{n}\leq 2\sqrt{n}\varphi(L) with probability exponentially close to 11. Hence, our reduction outputs a set of nn linearly independent vectors of length at most 2nφ(L)2\sqrt{n}\varphi(L), as required.  
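A compact Python sketch of this reduction follows (illustrative only; `dgs` stands for the DGS oracle and `lll_set` for the nn linearly independent LLL vectors, both assumed given).

import numpy as np

def givp_from_dgs(n, dgs, lll_set):
    # Lemma 3.17 as code: try r_i = lambda_n_tilde * 2^{-i} for i = 0..2n
    # with n^2 oracle calls each, and return the shortest set of n linearly
    # independent vectors found (including the initial LLL set).
    length = lambda S: max(np.linalg.norm(v) for v in S)
    best = lll_set
    lam_tilde = length(lll_set)
    for i in range(2 * n + 1):
        S = sorted((dgs(lam_tilde * 2.0 ** (-i)) for _ in range(n * n)),
                   key=np.linalg.norm)
        indep = []
        for v in S:  # greedily collect linearly independent vectors, shortest first
            if np.linalg.matrix_rank(np.array(indep + [v])) > len(indep):
                indep.append(v)
        if len(indep) == n and length(indep) < length(best):
            best = indep
    return best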

We now present the reduction from GapSVP to DGS. We first define the decision version of the closest vector problem (GapCVP) and a slight variant of it.

Definition 3.18

An instance of GapCVPγ\mbox{\sc GapCVP}_{\gamma} is given by an nn-dimensional lattice LL, a vector 𝐭\mathbf{t}, and a number d>0d>0. In YES{\rm YES} instances, dist(𝐭,L)d{\rm{dist}}(\mathbf{t},L)\leq d, whereas in NO{\rm NO} instances, dist(𝐭,L)>γ(n)d{\rm{dist}}(\mathbf{t},L)>\gamma(n)\cdot d.

Definition 3.19

An instance of GapCVPγ\mbox{\sc GapCVP}^{\prime}_{\gamma} is given by an nn-dimensional lattice LL, a vector 𝐭\mathbf{t}, and a number d>0d>0. In YES{\rm YES} instances, dist(𝐭,L)d{\rm{dist}}(\mathbf{t},L)\leq d. In NO{\rm NO} instances, λ1(L)>γ(n)d\lambda_{1}(L)>\gamma(n)\cdot d and dist(𝐭,L)>γ(n)d{\rm{dist}}(\mathbf{t},L)>\gamma(n)\cdot d.

In [17] it is shown that for any γ=γ(n)1\gamma=\gamma(n)\geq 1, there is a polynomial time reduction from GapSVPγ\mbox{\sc GapSVP}_{\gamma} to GapCVPγ\mbox{\sc GapCVP}^{\prime}_{\gamma} (see also Lemma 5.22 in [29]). Hence, it suffices to show a reduction from GapCVP\mbox{\sc GapCVP}^{\prime} to DGS. This reduction is given in the following lemma. By using Lemma 2.11, we obtain that under the assumptions of Theorem 3.1 there exists an efficient quantum algorithm for GapCVPO(n/α)\mbox{\sc GapCVP}^{\prime}_{O(n/\alpha)} (and hence also for GapSVPO(n/α)\mbox{\sc GapSVP}_{O(n/\alpha)}).

Lemma 3.20

For any γ=γ(n)1\gamma=\gamma(n)\geq 1, there is a polynomial time reduction from GapCVP100nγ(n)\mbox{\sc GapCVP}^{\prime}_{100\sqrt{n}\cdot\gamma(n)} to DGSnγ(n)/λ1(L)\mbox{\sc DGS}_{\sqrt{n}\gamma(n)/\lambda_{1}(L^{*})}.

  • Proof:

    The main component in our reduction is the NP verifier for coGapCVP shown in [1]. In more detail, the authors of [1] present an efficient algorithm, call it ${\cal V}$, whose input consists of an $n$-dimensional lattice $L$, a vector $\mathbf{t}$, a number $d>0$, and a sequence of vectors $\mathbf{w}_{1},\ldots,\mathbf{w}_{N}$ in $L^{*}$ for some $N={\rm poly}(n)$. When ${\rm dist}(\mathbf{t},L)\leq d$, the algorithm is guaranteed to reject. When ${\rm dist}(\mathbf{t},L)>100\sqrt{n}d$ and $\mathbf{w}_{1},\ldots,\mathbf{w}_{N}$ are chosen from the distribution $D_{L^{*},1/(100d)}$, the algorithm accepts with probability exponentially close to $1$.

    The input to the reduction is an nn-dimensional lattice LL, a vector 𝐭\mathbf{t}, and a number d>0d>0. We call the DGS oracle NN times with the lattice LL^{*} and the value 1100d\frac{1}{100d} to obtain vectors 𝐰1,,𝐰NL\mathbf{w}_{1},\ldots,\mathbf{w}_{N}\in L^{*}. We then apply 𝒱{\cal V} with LL, 𝐭\mathbf{t}, dd, and the vectors 𝐰1,,𝐰N\mathbf{w}_{1},\ldots,\mathbf{w}_{N}. We accept if and only if 𝒱{\cal V} rejects.

    To prove correctness, notice first that in the case of a YES{\rm YES} instance, dist(𝐭,L)d{\rm{dist}}(\mathbf{t},L)\leq d, and hence 𝒱{\cal V} must reject (irrespective of the 𝐰\mathbf{w}’s). In the case of a NO{\rm NO} instance we have that 1100d>nγ(n)/λ1(L)\frac{1}{100d}>\sqrt{n}\gamma(n)/\lambda_{1}(L), and hence 𝐰1,,𝐰N\mathbf{w}_{1},\ldots,\mathbf{w}_{N} are guaranteed to be valid samples from DL,1/(100d)D_{L^{*},1/(100d)}. Moreover, dist(𝐭,L)>100nγ(n)d100nd{\rm{dist}}(\mathbf{t},L)>100\sqrt{n}\gamma(n)d\geq 100\sqrt{n}d, and hence 𝒱{\cal V} accepts with probability exponentially close to 11.  
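
A short Python sketch of this reduction follows; dgs_oracle and verifier_V (the verifier of [1]) are assumed black boxes, not implementations given in this paper.

    import numpy as np

    def gapcvp_to_dgs(B, t, d, N, dgs_oracle, verifier_V):
        # B: basis of the n-dimensional lattice L (rows); t: target; d: bound.
        # dgs_oracle(B_dual, r): assumed to sample from D_{L*, r}.
        # verifier_V(B, t, d, ws): assumed to behave like the verifier of [1]:
        #   it rejects whenever dist(t, L) <= d, and accepts with overwhelming
        #   probability when dist(t, L) > 100*sqrt(n)*d and the ws are drawn
        #   from D_{L*, 1/(100*d)}.
        B_dual = np.linalg.inv(B).T                 # rows form a basis of L*
        ws = [dgs_oracle(B_dual, 1.0 / (100.0 * d)) for _ in range(N)]
        # Declare a YES instance exactly when the verifier rejects.
        return not verifier_V(B, t, d, ws)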

4 Variants of the LWE problem

In this section, we consider several variants of the LWE problem. Through a sequence of elementary reductions, we prove that all problems are as hard as LWE. The results of this section are summarized in Lemma 4.4.

Lemma 4.1 (Average-case to Worst-case)

Let n,p1n,p\geq 1 be some integers and χ\chi be some distribution on p{\mathbb{Z}}_{p}. Assume that we have access to a distinguisher WW that distinguishes A𝐬,χA_{\mathbf{s},\chi} from UU for a non-negligible fraction of all possible 𝐬\mathbf{s}. Then there exists an efficient algorithm WW^{\prime} that for all 𝐬\mathbf{s} accepts with probability exponentially close to 11 on inputs from A𝐬,χA_{\mathbf{s},\chi} and rejects with probability exponentially close to 11 on inputs from UU.

  • Proof:

    The proof is based on the following transformation. For any 𝐭pn\mathbf{t}\in{\mathbb{Z}}_{p}^{n} consider the function f𝐭:pn×ppn×pf_{\mathbf{t}}:{\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p}\rightarrow{\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p} defined by

    f𝐭(𝐚,b)=(𝐚,b+𝐚,𝐭).f_{\mathbf{t}}(\mathbf{a},b)=(\mathbf{a},b+{\langle{\mathbf{a},\mathbf{t}}\rangle}).

    It is easy to see that this function transforms the distribution A𝐬,χA_{\mathbf{s},\chi} into A𝐬+𝐭,χA_{\mathbf{s}+\mathbf{t},\chi}. Moreover, it transforms the uniform distribution UU into itself.

    Assume that for an $n^{-c_{1}}$ fraction of all possible $\mathbf{s}$, the acceptance probabilities of $W$ on inputs from $A_{\mathbf{s},\chi}$ and on inputs from $U$ differ by at least $n^{-c_{2}}$. We construct $W'$ as follows. Let $R$ denote the unknown input distribution. Repeat the following $n^{c_{1}+1}$ times. Choose a vector $\mathbf{t}\in{\mathbb{Z}}_{p}^{n}$ uniformly at random. Then estimate the acceptance probability of $W$ on $U$ and on $f_{\mathbf{t}}(R)$ by calling $W$ $O(n^{2c_{2}+1})$ times on each of the input distributions. By the Chernoff bound, with probability exponentially close to $1$ both estimates are within $\pm n^{-c_{2}}/8$ of the true acceptance probabilities. If the two estimates differ by more than $n^{-c_{2}}/2$ then we stop and accept. Otherwise we continue. If the procedure ends without accepting, we reject.

    We now prove that $W'$ distinguishes $A_{\mathbf{s},\chi}$ from $U$ for all $\mathbf{s}$. First, we claim that when $R$ is $U$, the acceptance probability of $W'$ is exponentially close to $0$. Indeed, in this case $f_{\mathbf{t}}(U)=U$, and therefore the two estimates that $W'$ performs are of the same distribution. The probability that the estimates differ by more than $n^{-c_{2}}/2>2\cdot n^{-c_{2}}/8$ is exponentially small. Next, consider the case that $R$ is $A_{\mathbf{s},\chi}$ for some $\mathbf{s}$. In each of the $n^{c_{1}+1}$ iterations, we are considering the distribution $f_{\mathbf{t}}(A_{\mathbf{s},\chi})=A_{\mathbf{s}+\mathbf{t},\chi}$ for some uniformly chosen $\mathbf{t}$. Notice that the distribution of $\mathbf{s}+\mathbf{t}$ is uniform on ${\mathbb{Z}}_{p}^{n}$. Hence, with probability exponentially close to $1$, in one of the $n^{c_{1}+1}$ iterations $\mathbf{t}$ is such that the acceptance probabilities of $W$ on inputs from $A_{\mathbf{s}+\mathbf{t},\chi}$ and on inputs from $U$ differ by at least $n^{-c_{2}}$. Since our estimates are accurate to within $\pm n^{-c_{2}}/8$, $W'$ accepts with probability exponentially close to $1$.
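
The following Python sketch mirrors the construction of $W'$ above. The procedures sample_batch (access to the unknown distribution $R$) and W (the average-case distinguisher) are assumed interfaces of the sketch.

    import random

    def worst_case_distinguisher(sample_batch, W, n, p, c1, c2):
        # sample_batch(k): assumed to return k fresh pairs (a, b) from the
        #                  unknown distribution R (either A_{s,chi} or U).
        # W(batch): the assumed average-case acceptor; returns True or False.
        batch_size = n                              # samples handed to W per call
        calls = n ** (2 * c2 + 1)                   # Chernoff: error +- n^-c2 / 8

        def shift(batch, t):
            # f_t maps A_{s,chi} to A_{s+t,chi} and U to itself.
            return [(a, (b + sum(ai * ti for ai, ti in zip(a, t))) % p)
                    for (a, b) in batch]

        def uniform_batch(k):
            return [([random.randrange(p) for _ in range(n)], random.randrange(p))
                    for _ in range(k)]

        for _ in range(n ** (c1 + 1)):
            t = [random.randrange(p) for _ in range(n)]
            est_r = sum(W(shift(sample_batch(batch_size), t))
                        for _ in range(calls)) / calls
            est_u = sum(W(uniform_batch(batch_size))
                        for _ in range(calls)) / calls
            if abs(est_r - est_u) > 0.5 * n ** (-c2):
                return True                         # R behaves like A_{s,chi}
        return False                                # R behaves like U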

Lemma 4.2 (Decision to Search)

Let $n\geq 1$ be some integer, $2\leq p\leq{\rm poly}(n)$ be a prime, and $\chi$ be some distribution on ${\mathbb{Z}}_{p}$. Assume that we have access to a procedure $W$ that for all $\mathbf{s}$ accepts with probability exponentially close to $1$ on inputs from $A_{\mathbf{s},\chi}$ and rejects with probability exponentially close to $1$ on inputs from $U$. Then, there exists an efficient algorithm $W'$ that, given samples from $A_{\mathbf{s},\chi}$ for some $\mathbf{s}$, outputs $\mathbf{s}$ with probability exponentially close to $1$.

  • Proof:

    Let us show how WW^{\prime} finds s1ps_{1}\in{\mathbb{Z}}_{p}, the first coordinate of 𝐬\mathbf{s}. Finding the other coordinates is similar. For any kpk\in{\mathbb{Z}}_{p}, consider the following transformation. Given a pair (𝐚,b)(\mathbf{a},b) we output the pair (𝐚+(l,0,,0),b+lk)(\mathbf{a}+(l,0,\ldots,0),b+l\cdot k) where lpl\in{\mathbb{Z}}_{p} is chosen uniformly at random. It is easy to see that this transformation takes the uniform distribution into itself. Moreover, if k=s1k=s_{1} then this transformation also takes A𝐬,χA_{\mathbf{s},\chi} to itself. Finally, if ks1k\neq s_{1} then it takes A𝐬,χA_{\mathbf{s},\chi} to the uniform distribution (note that this requires pp to be prime). Hence, using WW, we can test whether k=s1k=s_{1}. Since there are only p<poly(n)p<{\rm{poly}}(n) possibilities for s1s_{1} we can try all of them.  
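
In code, the search procedure looks as follows; W is the assumed acceptor of the lemma, applied here to a fresh batch of transformed samples for each guess.

    import random

    def recover_secret(samples, W, n, p):
        # samples: a list of pairs (a, b) from A_{s,chi}.
        # W(batch): assumed to accept when the batch comes from some A_{s',chi}
        #           and to reject on (nearly) uniform input.
        s = []
        for j in range(n):                     # recover one coordinate at a time
            for k in range(p):                 # p <= poly(n) guesses for s_j
                transformed = []
                for (a, b) in samples:
                    l = random.randrange(p)
                    a2 = list(a)
                    a2[j] = (a2[j] + l) % p
                    transformed.append((a2, (b + l * k) % p))
                # If k == s_j the transformation fixes A_{s,chi}; otherwise
                # (p prime) it yields the uniform distribution.
                if W(transformed):
                    s.append(k)
                    break
        return s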

Lemma 4.3 (Discrete to Continuous)

Let n,p1n,p\geq 1 be some integers, let ϕ\phi be some probability density function on 𝕋\mathbb{T}, and let ϕ¯\bar{\phi} be its discretization to p{\mathbb{Z}}_{p}. Assume that we have access to an algorithm WW that solves LWEp,ϕ¯\mbox{\sc LWE}_{p,\bar{\phi}}. Then, there exists an efficient algorithm WW^{\prime} that solves LWEp,ϕ\mbox{\sc LWE}_{p,\phi}.

  • Proof:

    Algorithm WW^{\prime} simply takes samples from A𝐬,ϕA_{\mathbf{s},\phi} and discretizes the second element to obtain samples from A𝐬,ϕ¯A_{\mathbf{s},\bar{\phi}}. It then applies WW with these samples in order to find 𝐬\mathbf{s}.  
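
In code, the discretization step is a one-liner; the sketch below assumes (as the sketch's reading of the definition of $\bar{\phi}$) that discretizing $b\in\mathbb{T}$ means rounding $pb$ to the nearest integer modulo $p$.

    def discretize(samples, p):
        # samples: pairs (a, b) with b in T = [0, 1); output pairs over Z_p,
        # obtained by rounding b to the nearest multiple of 1/p.
        return [(a, round(p * b) % p) for (a, b) in samples]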

By combining the three lemmas above, we obtain

Lemma 4.4

Let n1n\geq 1 be an integer and 2ppoly(n)2\leq p\leq{\rm{poly}}(n) be a prime. Let ϕ\phi be some probability density function on 𝕋\mathbb{T} and let ϕ¯\bar{\phi} be its discretization to p{\mathbb{Z}}_{p}. Assume that we have access to a distinguisher that distinguishes A𝐬,ϕ¯A_{\mathbf{s},\bar{\phi}} from UU for a non-negligible fraction of all possible 𝐬\mathbf{s}. Then, there exists an efficient algorithm that solves LWEp,ϕ\mbox{\sc LWE}_{p,\phi}.

5 Public Key Cryptosystem

We let nn be the security parameter of the cryptosystem. Our cryptosystem is parameterized by two integers m,pm,p and a probability distribution χ\chi on p{\mathbb{Z}}_{p}. A setting of these parameters that guarantees both security and correctness is the following. Choose p2p\geq 2 to be some prime number between n2n^{2} and 2n22n^{2} and let m=(1+ε)(n+1)logpm=(1+\varepsilon)(n+1)\log p for some arbitrary constant ε>0\varepsilon>0. The probability distribution χ\chi is taken to be Ψ¯α(n)\bar{\Psi}_{\alpha(n)} for α(n)=o(1/(nlogn))\alpha(n)=o(1/(\sqrt{n}\log n)), i.e., α(n)\alpha(n) is such that limnα(n)nlogn=0\lim_{n\to\infty}\alpha(n)\cdot\sqrt{n}\log n=0. For example, we can choose α(n)=1/(nlog2n)\alpha(n)=1/(\sqrt{n}\log^{2}n). In the following description, all additions are performed in p{\mathbb{Z}}_{p}, i.e., modulo pp.

  • Private key: Choose 𝐬pn\mathbf{s}\in{\mathbb{Z}}_{p}^{n} uniformly at random. The private key is 𝐬\mathbf{s}.

  • Public Key: Choose $m$ vectors $\mathbf{a}_{1},\ldots,\mathbf{a}_{m}\in{\mathbb{Z}}_{p}^{n}$ independently from the uniform distribution. Also choose elements $e_{1},\ldots,e_{m}\in{\mathbb{Z}}_{p}$ independently according to $\chi$. The public key is given by $(\mathbf{a}_{i},b_{i})_{i=1}^{m}$ where $b_{i}={\langle{\mathbf{a}_{i},\mathbf{s}}\rangle}+e_{i}$.

  • Encryption: In order to encrypt a bit we choose a random set SS uniformly among all 2m2^{m} subsets of [m][m]. The encryption is (iS𝐚i,iSbi)(\sum_{i\in S}\mathbf{a}_{i},\sum_{i\in S}b_{i}) if the bit is 0 and (iS𝐚i,p2+iSbi)(\sum_{i\in S}\mathbf{a}_{i},{\lfloor\frac{p}{2}\rfloor}+\sum_{i\in S}b_{i}) if the bit is 1.

  • Decryption: The decryption of a pair (𝐚,b)(\mathbf{a},b) is 0 if b𝐚,𝐬b-{\langle{\mathbf{a},\mathbf{s}}\rangle} is closer to 0 than to p2{\lfloor\frac{p}{2}\rfloor} modulo pp. Otherwise, the decryption is 11.

Notice that with our choice of parameters, the public key size is O(mnlogp)=O~(n2)O(mn\log p)=\tilde{O}(n^{2}) and the encryption process increases the size of a message by a factor of O(nlogp)=O~(n)O(n\log p)=\tilde{O}(n). In fact, it is possible to reduce the size of the public key to O(mlogp)=O~(n)O(m\log p)=\tilde{O}(n) by the following idea of Ajtai [3]. Assume all users of the cryptosystem share some fixed (and trusted) random choice of 𝐚1,,𝐚m\mathbf{a}_{1},\ldots,\mathbf{a}_{m}. This can be achieved by, say, distributing these vectors as part of the encryption and decryption software. Then the public key need only consist of b1,,bmb_{1},\ldots,b_{m}. This modification does not affect the security of the cryptosystem.
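
For concreteness, the following Python sketch implements the three procedures above. The error sampler psi_bar approximates $\bar{\Psi}_{\alpha}$ by rounding a scaled Gaussian of standard deviation $\alpha/\sqrt{2\pi}$; this sampler, and the specific parameter values in the usage note below, are assumptions of the sketch rather than a reference implementation.

    import math
    import random

    def psi_bar(p, alpha):
        # Discretized Gaussian error: a sample from Psi_alpha (a mod-1 Gaussian
        # with parameter alpha, i.e., standard deviation alpha / sqrt(2*pi))
        # rounded to Z_p. An assumption of this sketch.
        return round(p * random.gauss(0.0, alpha / math.sqrt(2 * math.pi))) % p

    def keygen(n, p, m, alpha):
        s = [random.randrange(p) for _ in range(n)]                # private key
        A = [[random.randrange(p) for _ in range(n)] for _ in range(m)]
        b = [(sum(ai * si for ai, si in zip(a, s)) + psi_bar(p, alpha)) % p
             for a in A]
        return s, (A, b)                                           # (private, public)

    def encrypt(pub, bit, p):
        A, b = pub
        S = [i for i in range(len(A)) if random.random() < 0.5]    # random subset
        a = [sum(A[i][j] for i in S) % p for j in range(len(A[0]))]
        c = (sum(b[i] for i in S) + (p // 2) * bit) % p
        return a, c

    def decrypt(s, ct, p):
        a, c = ct
        v = (c - sum(ai * si for ai, si in zip(a, s))) % p
        d0 = min(v, p - v)                              # distance to 0 mod p
        d1 = min(abs(v - p // 2), p - abs(v - p // 2))  # distance to floor(p/2)
        return 0 if d0 < d1 else 1

For example (an illustrative choice, not a recommendation), with $n=64$, $p=4099$ (a prime between $n^{2}$ and $2n^{2}$), $m=800$, and $\alpha=1/(\sqrt{n}\log^{2}n)$, a random bit survives an encrypt/decrypt round trip with overwhelming probability.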

We next prove that under a certain condition on χ\chi, mm, and pp, the probability of decryption error is small. We later show that our choice of parameters satisfies this condition. For the following two lemmas we need to introduce some additional notation. For a distribution χ\chi on p{\mathbb{Z}}_{p} and an integer k0k\geq 0, we define χk\chi^{\star k} as the distribution obtained by summing together kk independent samples from χ\chi, where addition is performed in p{\mathbb{Z}}_{p} (for k=0k=0 we define χ0\chi^{\star 0} as the distribution that is constantly 0). For a probability distribution ϕ\phi on 𝕋\mathbb{T} we define ϕk\phi^{\star k} similarly. For an element apa\in{\mathbb{Z}}_{p} we define |a||a| as the integer aa if a{0,1,,p2}a\in\{0,1,\ldots,{\lfloor\frac{p}{2}\rfloor}\} and as the integer pap-a otherwise. In other words, |a||a| represents the distance of aa from 0. Similarly, for x𝕋x\in\mathbb{T}, we define |x||x| as xx for x[0,12]x\in[0,\frac{1}{2}] and as 1x1-x otherwise.

Lemma 5.1 (Correctness)

Let δ>0\delta>0. Assume that for any k{0,1,,m}k\in\{0,1,\ldots,m\}, χk\chi^{\star k} satisfies that

Preχk[|e|<p2/2]>1δ.\Pr_{e\sim\chi^{\star k}}\Big{[}|e|<\Big{\lfloor}\frac{p}{2}\Big{\rfloor}/2\Big{]}>1-\delta.

Then, the probability of decryption error is at most δ\delta. That is, for any bit c{0,1}c\in\{0,1\}, if we use the protocol above to choose private and public keys, encrypt cc, and then decrypt the result, then the outcome is cc with probability at least 1δ1-\delta.

  • Proof:

    Consider first an encryption of 0. It is given by (𝐚,b)(\mathbf{a},b) for 𝐚=iS𝐚i\mathbf{a}=\sum_{i\in S}\mathbf{a}_{i} and

    b=\sum_{i\in S}b_{i}=\sum_{i\in S}\left({\langle{\mathbf{a}_{i},\mathbf{s}}\rangle}+e_{i}\right)={\langle{\mathbf{a},\mathbf{s}}\rangle}+\sum_{i\in S}e_{i}.

    Hence, b𝐚,𝐬b-{\langle{\mathbf{a},\mathbf{s}}\rangle} is exactly iSei\sum_{i\in S}e_{i}. The distribution of the latter is χ|S|\chi^{\star|S|}. According to our assumption, |iSei||\sum_{i\in S}e_{i}| is less than p2/2\lfloor\frac{p}{2}\rfloor/2 with probability at least 1δ1-\delta. In this case, it is closer to 0 than to p2\lfloor\frac{p}{2}\rfloor and therefore the decryption is correct. The proof for an encryption of 11 is similar.  

Claim 5.2

For our choice of parameters it holds that for any k{0,1,,m}k\in\{0,1,\ldots,m\},

PreΨ¯αk[|e|<p2/2]>1δ(n)\Pr_{e\sim\bar{\Psi}_{\alpha}^{\star k}}\Big{[}|e|<\Big{\lfloor}\frac{p}{2}\Big{\rfloor}/2\Big{]}>1-\delta(n)

for some negligible function δ(n)\delta(n).

  • Proof:

    A sample from Ψ¯αk\bar{\Psi}_{\alpha}^{\star k} can be obtained by sampling x1,,xkx_{1},\ldots,x_{k} from Ψα\Psi_{\alpha} and outputting i=1kpximodp\sum_{i=1}^{k}{\lfloor px_{i}\rceil}~{}{\rm{mod}}~{}p. Notice that this value is at most km<p/32k\leq m<p/32 away from i=1kpximodp\sum_{i=1}^{k}px_{i}~{}{\rm{mod}}~{}p. Hence, it is enough to show that |i=1kpximodp|<p/16|\sum_{i=1}^{k}px_{i}~{}{\rm{mod}}~{}p|<p/16 with high probability. This condition is equivalent to the condition that |i=1kximod1|<1/16|\sum_{i=1}^{k}x_{i}~{}{\rm{mod}}~{}1|<1/16. Since i=1kximod1\sum_{i=1}^{k}x_{i}~{}{\rm{mod}}~{}1 is distributed as Ψkα\Psi_{\sqrt{k}\cdot\alpha}, and kα=o(1/logn)\sqrt{k}\cdot\alpha=o(1/\sqrt{\log n}), the probability that |i=1kximod1|<1/16|\sum_{i=1}^{k}x_{i}~{}{\rm{mod}}~{}1|<1/16 is 1δ(n)1-\delta(n) for some negligible function δ(n)\delta(n).  
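
A quick Monte Carlo illustration of this claim (a numerical sketch, not part of the proof; the parameter choices mirror Section 5, and the logarithm base here is a choice of the sketch):

    import math
    import random

    def is_prime(q):
        return q > 1 and all(q % r for r in range(2, math.isqrt(q) + 1))

    def decryption_error_rate(n, trials=2000):
        # Estimate Pr[|e| >= floor(p/2)/2] for e ~ Psi_bar_alpha^{*m}.
        p = next(q for q in range(n * n, 2 * n * n) if is_prime(q))
        alpha = 1.0 / (math.sqrt(n) * math.log(n) ** 2)
        m = int(1.1 * (n + 1) * math.log2(p))
        bad = 0
        for _ in range(trials):
            e = sum(round(p * random.gauss(0.0, alpha / math.sqrt(2 * math.pi)))
                    for _ in range(m)) % p
            if min(e, p - e) >= (p // 2) // 2:
                bad += 1
        return bad / trials       # should be essentially 0 for moderate n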

In order to prove the security of the system, we need the following special case of the leftover hash lemma that appears in [19]. We include a proof for completeness.

Claim 5.3

Let GG be some finite Abelian group and let ll be some integer. For any ll elements g1,,glGg_{1},\ldots,g_{l}\in G consider the statistical distance between the uniform distribution on GG and the distribution given by the sum of a random subset of g1,,glg_{1},\ldots,g_{l}. Then the expectation of this statistical distance over a uniform choice of g1,,glGg_{1},\ldots,g_{l}\in G is at most |G|/2l\sqrt{|G|/2^{l}}. In particular, the probability that this statistical distance is more than |G|/2l4\sqrt[4]{|G|/2^{l}} is at most |G|/2l4\sqrt[4]{|G|/2^{l}}.

  • Proof:

    For a choice $\mathbf{g}=(g_{1},\ldots,g_{l})$ of $l$ elements from $G$, let $P_{\mathbf{g}}$ be the distribution of the sum of a random subset of $g_{1},\ldots,g_{l}$, i.e.,

    P𝐠(h)=12l|{𝐛{0,1}l|ibigi=h}|.P_{\mathbf{g}}(h)=\frac{1}{2^{l}}\left|\left\{\mathbf{b}\in\{0,1\}^{l}\left|\;\textstyle{\sum_{i}}b_{i}g_{i}=h\right.\right\}\right|.

    In order to show that this distribution is close to uniform, we compute the square of its $\ell_{2}$ norm, and note that it is very close to $1/|G|$. From this it will follow that the distribution must be close to the uniform distribution. The square of the $\ell_{2}$ norm of $P_{\mathbf{g}}$ is given by

    hGP𝐠(h)2\displaystyle\sum_{h\in G}P_{\mathbf{g}}(h)^{2} =Pr𝐛,𝐛[bigi=bigi]\displaystyle=\Pr_{\mathbf{b},\mathbf{b}^{\prime}}\left[\sum b_{i}g_{i}=\sum b_{i}^{\prime}g_{i}\right]
    12l+Pr𝐛,𝐛[bigi=bigi|𝐛𝐛].\displaystyle\leq\frac{1}{2^{l}}+\Pr_{\mathbf{b},\mathbf{b}^{\prime}}\left[\left.\sum b_{i}g_{i}=\sum b_{i}^{\prime}g_{i}\right|\mathbf{b}\neq\mathbf{b}^{\prime}\right].

    Taking expectation over 𝐠\mathbf{g}, and using the fact that for any 𝐛𝐛\mathbf{b}\neq\mathbf{b}^{\prime}, Pr𝐠[bigi=bigi]=1/|G|\Pr_{\mathbf{g}}[\sum b_{i}g_{i}=\sum b_{i}^{\prime}g_{i}]=1/|G|, we obtain that

    Exp𝐠[hP𝐠(h)2]12l+1|G|.\displaystyle\mathop{\mathrm{Exp}}_{\mathbf{g}}\left[\textstyle{\sum_{h}}P_{\mathbf{g}}(h)^{2}\right]\leq\frac{1}{2^{l}}+\frac{1}{|G|}.

    Finally, the expected distance from the uniform distribution is

    Exp𝐠[h|P𝐠(h)1/|G||]\displaystyle\mathop{\mathrm{Exp}}_{\mathbf{g}}\left[\textstyle{\sum_{h}}\left|P_{\mathbf{g}}(h)-1/|G|\right|\right] Exp𝐠[|G|1/2(h(P𝐠(h)1/|G|)2)1/2]\displaystyle\leq\mathop{\mathrm{Exp}}_{\mathbf{g}}\left[|G|^{1/2}\left(\textstyle{\sum_{h}}(P_{\mathbf{g}}(h)-1/|G|)^{2}\right)^{1/2}\right]
    =|G|Exp𝐠[(hP𝐠(h)21/|G|)1/2]\displaystyle=\sqrt{|G|}\mathop{\mathrm{Exp}}_{\mathbf{g}}\left[\left(\textstyle{\sum_{h}}P_{\mathbf{g}}(h)^{2}-1/|G|\right)^{1/2}\right]
    |G|(Exp𝐠[hP𝐠(h)2]1/|G|)1/2\displaystyle\leq\sqrt{|G|}\left(\mathop{\mathrm{Exp}}_{\mathbf{g}}\left[\textstyle{\sum_{h}}P_{\mathbf{g}}(h)^{2}\right]-1/|G|\right)^{1/2}
    |G|2l.\displaystyle\leq\sqrt{\frac{|G|}{2^{l}}}.

     
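The bound of Claim 5.3 is easy to check empirically for small groups. The following Python sketch computes, for $G={\mathbb{Z}}_{p}$, the average $\ell_{1}$ distance appearing in the proof; it enumerates all $2^{l}$ subsets, so it is feasible only for small $l$ (say $l\leq 20$).

    import itertools
    import random

    def average_l1_distance(p, l, samples=200):
        # Average, over random g_1,...,g_l in Z_p, of the l1 distance between
        # the subset-sum distribution and the uniform distribution on Z_p.
        # Claim 5.3 bounds this expectation by sqrt(p / 2**l).
        total = 0.0
        for _ in range(samples):
            g = [random.randrange(p) for _ in range(l)]
            counts = [0] * p
            for bits in itertools.product((0, 1), repeat=l):
                counts[sum(b * gi for b, gi in zip(bits, g)) % p] += 1
            total += sum(abs(c / 2 ** l - 1.0 / p) for c in counts)
        return total / samples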

We now prove that our cryptosystem is semantically secure, i.e., that it is hard to distinguish between encryptions of 0 and encryptions of 11. More precisely, we show that if such a distinguisher exists, then there exists a distinguisher that distinguishes between A𝐬,χA_{\mathbf{s},\chi} and UU for a non-negligible fraction of all 𝐬\mathbf{s}. If χ=Ψ¯α\chi=\bar{\Psi}_{\alpha} and ppoly(n)p\leq{\rm{poly}}(n) is a prime, then by Lemma 4.4, this also implies an efficient (classical) algorithm that solves LWEp,Ψα\mbox{\sc LWE}_{p,\Psi_{\alpha}}. This in turn implies, by Theorem 3.1, an efficient quantum algorithm for DGS2nηε(L)/α\mbox{\sc DGS}_{\sqrt{2n}\cdot\eta_{\varepsilon}(L)/\alpha}. Finally, by Lemma 3.17 we also obtain an efficient quantum algorithm for SIVPO~(n/α)\mbox{\sc SIVP}_{\tilde{O}(n/\alpha)} and by Lemma 3.20 we obtain an efficient quantum algorithm for GapSVPO(n/α)\mbox{\sc GapSVP}_{O(n/\alpha)}.

Lemma 5.4 (Security)

For any ε>0\varepsilon>0 and m(1+ε)(n+1)logpm\geq(1+\varepsilon)(n+1)\log p, if there exists a polynomial time algorithm WW that distinguishes between encryptions of 0 and 11 then there exists a distinguisher ZZ that distinguishes between A𝐬,χA_{\mathbf{s},\chi} and UU for a non-negligible fraction of all possible 𝐬\mathbf{s}.

  • Proof:

    Let p0(W)p_{0}(W) be the acceptance probability of WW on input ((𝐚i,bi)i=1m,(𝐚,b))((\mathbf{a}_{i},b_{i})_{i=1}^{m},(\mathbf{a},b)) where (𝐚,b)(\mathbf{a},b) is an encryption of 0 with the public key (𝐚i,bi)i=1m(\mathbf{a}_{i},b_{i})_{i=1}^{m} and the probability is taken over the randomness in the choice of the private and public keys and over the randomness in the encryption algorithm. We define p1(W)p_{1}(W) similarly for encryptions of 11 and let pu(W)p_{u}(W) be the acceptance probability of WW on inputs ((𝐚i,bi)i=1m,(𝐚,b))((\mathbf{a}_{i},b_{i})_{i=1}^{m},(\mathbf{a},b)) where (𝐚i,bi)i=1m(\mathbf{a}_{i},b_{i})_{i=1}^{m} are again chosen according to the private and public keys distribution but (𝐚,b)(\mathbf{a},b) is chosen uniformly from pn×p{\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p}. With this notation, our hypothesis says that |p0(W)p1(W)|1nc{\left|{p_{0}(W)-p_{1}(W)}\right|}\geq\frac{1}{n^{c}} for some c>0c>0.

    We now construct a $W'$ for which ${\left|{p_{0}(W')-p_{u}(W')}\right|}\geq\frac{1}{2n^{c}}$. By our hypothesis, either ${\left|{p_{0}(W)-p_{u}(W)}\right|}\geq\frac{1}{2n^{c}}$ or ${\left|{p_{1}(W)-p_{u}(W)}\right|}\geq\frac{1}{2n^{c}}$. In the former case we take $W'$ to be the same as $W$. In the latter case, we construct $W'$ as follows. On input $((\mathbf{a}_{i},b_{i})_{i=1}^{m},(\mathbf{a},b))$, $W'$ calls $W$ with $((\mathbf{a}_{i},b_{i})_{i=1}^{m},(\mathbf{a},{\lfloor\frac{p}{2}\rfloor}+b))$. Notice that this maps the distribution on encryptions of $0$ to the distribution on encryptions of $1$ and the uniform distribution to itself. Therefore, $W'$ is the required distinguisher.

    For 𝐬pn\mathbf{s}\in{\mathbb{Z}}_{p}^{n}, let p0(𝐬)p_{0}(\mathbf{s}) be the probability that WW^{\prime} accepts on input ((𝐚i,bi)i=1m,(𝐚,b))((\mathbf{a}_{i},b_{i})_{i=1}^{m},(\mathbf{a},b)) where (𝐚i,bi)i=1m(\mathbf{a}_{i},b_{i})_{i=1}^{m} are chosen from A𝐬,χA_{\mathbf{s},\chi}, and (𝐚,b)(\mathbf{a},b) is an encryption of 0 with the public key (𝐚i,bi)i=1m(\mathbf{a}_{i},b_{i})_{i=1}^{m}. Similarly, define pu(𝐬)p_{u}(\mathbf{s}) to be the acceptance probability of WW^{\prime} where (𝐚i,bi)i=1m(\mathbf{a}_{i},b_{i})_{i=1}^{m} are chosen from A𝐬,χA_{\mathbf{s},\chi}, and (𝐚,b)(\mathbf{a},b) is now chosen uniformly at random from pn×p{\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p}. Our assumption on WW^{\prime} says that |Exp𝐬[p0(𝐬)]Exp𝐬[pu(𝐬)]|12nc|\mathop{\mathrm{Exp}}_{\mathbf{s}}[p_{0}(\mathbf{s})]-\mathop{\mathrm{Exp}}_{\mathbf{s}}[p_{u}(\mathbf{s})]|\geq\frac{1}{2n^{c}}. Define

    Y={𝐬||p0(𝐬)pu(𝐬)|14nc}.Y=\left\{\mathbf{s}~{}\left|~{}|p_{0}(\mathbf{s})-p_{u}(\mathbf{s})|\geq\frac{1}{4n^{c}}\right.\right\}.

    By an averaging argument we get that a fraction of at least 14nc\frac{1}{4n^{c}} of the 𝐬\mathbf{s} are in YY. Hence, it is enough to show a distinguisher ZZ that distinguishes between UU and A𝐬,χA_{\mathbf{s},\chi} for any 𝐬Y\mathbf{s}\in Y.

    In the following we describe the distinguisher $Z$. We are given a distribution $R$ that is either $U$ or $A_{\mathbf{s},\chi}$ for some $\mathbf{s}\in Y$. We take $m$ samples $(\mathbf{a}_{i},b_{i})_{i=1}^{m}$ from $R$. Let $p_{0}((\mathbf{a}_{i},b_{i})_{i=1}^{m})$ be the probability that $W'$ accepts on input $((\mathbf{a}_{i},b_{i})_{i=1}^{m},(\mathbf{a},b))$, where the probability is taken over the choice of $(\mathbf{a},b)$ as an encryption of the bit $0$ with the public key $(\mathbf{a}_{i},b_{i})_{i=1}^{m}$. Similarly, let $p_{u}((\mathbf{a}_{i},b_{i})_{i=1}^{m})$ be the probability that $W'$ accepts on input $((\mathbf{a}_{i},b_{i})_{i=1}^{m},(\mathbf{a},b))$, where the probability is taken over the choice of $(\mathbf{a},b)$ as a uniform element of ${\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p}$. By applying $W'$ a polynomial number of times, the distinguisher $Z$ estimates both $p_{0}((\mathbf{a}_{i},b_{i})_{i=1}^{m})$ and $p_{u}((\mathbf{a}_{i},b_{i})_{i=1}^{m})$ up to an additive error of $\frac{1}{64n^{c}}$. If the two estimates differ by more than $\frac{1}{16n^{c}}$, $Z$ accepts. Otherwise, $Z$ rejects.

    We first claim that when RR is the uniform distribution, ZZ rejects with high probability. In this case, (𝐚i,bi)i=1m(\mathbf{a}_{i},b_{i})_{i=1}^{m} are chosen uniformly from pn×p{\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p}. Using Claim 5.3 with the group G=pn×pG={\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p}, we obtain that with probability exponentially close to 11, the distribution on (𝐚,b)(\mathbf{a},b) obtained by encryptions of 0 is exponentially close to the uniform distribution on pn×p{\mathbb{Z}}_{p}^{n}\times{\mathbb{Z}}_{p}. Therefore, except with exponentially small probability,

    |p0((𝐚i,bi)i=1m)pu((𝐚i,bi)i=1m)|2Ω(n).|p_{0}((\mathbf{a}_{i},b_{i})_{i=1}^{m})-p_{u}((\mathbf{a}_{i},b_{i})_{i=1}^{m})|\leq 2^{-\Omega(n)}.

    Hence, our two estimates differ by at most 132nc+2Ω(n)\frac{1}{32n^{c}}+2^{-\Omega(n)}, and ZZ rejects.

    Next, we show that if $R$ is $A_{\mathbf{s},\chi}$ for $\mathbf{s}\in Y$ then $Z$ accepts with probability at least $1/{\rm poly}(n)$. Notice that $p_{0}(\mathbf{s})$ (respectively, $p_{u}(\mathbf{s})$) is the average of $p_{0}((\mathbf{a}_{i},b_{i})_{i=1}^{m})$ (respectively, $p_{u}((\mathbf{a}_{i},b_{i})_{i=1}^{m})$) taken over the choice of $(\mathbf{a}_{i},b_{i})_{i=1}^{m}$ from $A_{\mathbf{s},\chi}$. From $|p_{0}(\mathbf{s})-p_{u}(\mathbf{s})|\geq\frac{1}{4n^{c}}$ we obtain by an averaging argument that

    |p0((𝐚i,bi)i=1m)pu((𝐚i,bi)i=1m)|18nc|p_{0}((\mathbf{a}_{i},b_{i})_{i=1}^{m})-p_{u}((\mathbf{a}_{i},b_{i})_{i=1}^{m})|\geq\frac{1}{8n^{c}}

    with probability at least $\frac{1}{8n^{c}}$ over the choice of $(\mathbf{a}_{i},b_{i})_{i=1}^{m}$ from $A_{\mathbf{s},\chi}$. Hence, with probability at least $\frac{1}{8n^{c}}$, $Z$ obtains such a sequence $(\mathbf{a}_{i},b_{i})_{i=1}^{m}$, and since our estimates are accurate to within $\frac{1}{64n^{c}}$, the difference between them is more than $\frac{1}{16n^{c}}$ and $Z$ accepts.
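
The following Python sketch assembles the distinguisher $Z$ described above; W_prime is the assumed algorithm separating encryptions of $0$ from uniform pairs, and reps should be taken large enough (say $\Theta(n^{2c})$) for the estimates to be accurate to within $\frac{1}{64n^{c}}$.

    import random

    def Z(samples, W_prime, p, c, reps):
        # samples: m pairs (a_i, b_i) drawn from the unknown distribution R.
        # W_prime(pub, ct): the assumed algorithm from the proof; accepts or
        #                   rejects a candidate ciphertext under public key pub.
        m = len(samples)
        n = len(samples[0][0])

        def enc_zero():
            # Encrypt the bit 0 under the "public key" given by the samples.
            S = [i for i in range(m) if random.random() < 0.5]
            a = [sum(samples[i][0][j] for i in S) % p for j in range(n)]
            return a, sum(samples[i][1] for i in S) % p

        def uniform_pair():
            return [random.randrange(p) for _ in range(n)], random.randrange(p)

        est0 = sum(W_prime(samples, enc_zero()) for _ in range(reps)) / reps
        estu = sum(W_prime(samples, uniform_pair()) for _ in range(reps)) / reps
        # Accept iff the estimates differ noticeably (threshold 1/(16 n^c)).
        return abs(est0 - estu) > 1.0 / (16 * n ** c)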

Acknowledgments

I would like to thank Michael Langberg, Vadim Lyubashevsky, Daniele Micciancio, Chris Peikert, Miklos Santha, Madhu Sudan, and an anonymous referee for useful comments.

References

  • [1] D. Aharonov and O. Regev. Lattice problems in NP intersect coNP. Journal of the ACM, 52(5):749–765, 2005. Preliminary version in FOCS’04.
  • [2] M. Ajtai. Generating hard instances of lattice problems. In Complexity of computations and proofs, volume 13 of Quad. Mat., pages 1–32. Dept. Math., Seconda Univ. Napoli, Caserta, 2004. Preliminary version in STOC 1996.
  • [3] M. Ajtai. Representing hard lattices with O(nlogn)O(n\log n) bits. In Proc. 37th Annual ACM Symp. on Theory of Computing (STOC), pages 94–103, 2005.
  • [4] M. Ajtai and C. Dwork. A public-key cryptosystem with worst-case/average-case equivalence. In Proc. 29th Annual ACM Symp. on Theory of Computing (STOC), pages 284–293, 1997.
  • [5] M. Ajtai, R. Kumar, and D. Sivakumar. A sieve algorithm for the shortest lattice vector problem. In Proc. 33rd ACM Symp. on Theory of Computing, pages 601–610, 2001.
  • [6] A. Akavia, S. Goldwasser, and V. Vaikuntanathan. Simultaneous hardcore bits and cryptography against memory attacks. In Proc. of 6th IACR Theory of Cryptography Conference (TCC), pages 474–495, 2009.
  • [7] M. Alekhnovich. More on average case vs approximation complexity. In Proc. 44th Annual IEEE Symp. on Foundations of Computer Science (FOCS), pages 298–307, 2003.
  • [8] L. Babai. On Lovász' lattice reduction and the nearest lattice point problem. Combinatorica, 6(1):1–13, 1986.
  • [9] W. Banaszczyk. New bounds in some transference theorems in the geometry of numbers. Mathematische Annalen, 296(4):625–635, 1993.
  • [10] A. Blum, M. Furst, M. Kearns, and R. J. Lipton. Cryptographic primitives based on hard learning problems. In Advances in cryptology—CRYPTO ’93, volume 773 of Lecture Notes in Comput. Sci., pages 278–291. Springer, Berlin, 1994.
  • [11] A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM, 50(4):506–519, 2003.
  • [12] J.-Y. Cai and A. Nerurkar. An improved worst-case to average-case connection for lattice problems. In Proc. 38th Annual IEEE Symp. on Foundations of Computer Science (FOCS), pages 468–477, 1997.
  • [13] D. Cash, C. Peikert, and A. Sahai. Efficient circular-secure encryption from hard learning problems, 2009. Submitted.
  • [14] W. Ebeling. Lattices and codes. Advanced Lectures in Mathematics. Friedr. Vieweg & Sohn, Braunschweig, revised edition, 2002. A course partially based on lectures by F. Hirzebruch.
  • [15] U. Feige. Relations between average case complexity and approximation complexity. In Proc. 34th Annual ACM Symp. on Theory of Computing (STOC), pages 534–543, 2002.
  • [16] C. Gentry, C. Peikert, and V. Vaikuntanathan. Trapdoors for hard lattices and new cryptographic constructions. In Proc. 40th ACM Symp. on Theory of Computing (STOC), pages 197–206, 2008.
  • [17] O. Goldreich, D. Micciancio, S. Safra, and J.-P. Seifert. Approximating shortest lattice vectors is not harder than approximating closest lattice vectors. Information Processing Letters, 71(2):55–61, 1999.
  • [18] L. Grover and T. Rudolph. Creating superpositions that correspond to efficiently integrable probability distributions. arXiv:quant-ph/0208112, 2002.
  • [19] R. Impagliazzo and D. Zuckerman. How to recycle random bits. In Proc. 30th Annual IEEE Symp. on Foundations of Computer Science (FOCS), pages 248–253, 1989.
  • [20] J. Katz and Y. Lindell. Introduction to modern cryptography. Chapman & Hall/CRC Cryptography and Network Security. Chapman & Hall/CRC, Boca Raton, FL, 2008.
  • [21] A. Kawachi, K. Tanaka, and K. Xagawa. Multi-bit cryptosystems based on lattice problems. In Public Key Cryptography – PKC 2007, volume 4450 of Lecture Notes in Comput. Sci., pages 315–329, Berlin, 2007. Springer.
  • [22] A. R. Klivans and A. A. Sherstov. Cryptographic hardness for learning intersections of halfspaces. J. Comput. System Sci., 75(1):2–12, 2009. Preliminary version in FOCS’06.
  • [23] R. Kumar and D. Sivakumar. On polynomial approximation to the shortest lattice vector length. In Proc. 12th Annual ACM-SIAM Symp. on Discrete Algorithms, pages 126–127, 2001.
  • [24] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász. Factoring polynomials with rational coefficients. Math. Ann., 261(4):515–534, 1982.
  • [25] Y.-K. Liu, V. Lyubashevsky, and D. Micciancio. On bounded distance decoding for general lattices. In International Workshop on Randomization and Computation - Proceedings of RANDOM 2006, volume 4110 of Lecture Notes in Comput. Sci., pages 450–461, Barcelona, Spain, Aug. 2006. Springer.
  • [26] V. Lyubashevsky and D. Micciancio. On bounded distance decoding, unique shortest vectors, and the minimum distance problem, 2009. Manuscript.
  • [27] D. Micciancio. Almost perfect lattices, the covering radius problem, and applications to Ajtai’s connection factor. SIAM Journal on Computing, 34(1):118–169, 2004. Preliminary version in STOC 2002.
  • [28] D. Micciancio and S. Goldwasser. Complexity of Lattice Problems: A Cryptographic Perspective, volume 671 of The Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers, Boston, Massachusetts, Mar. 2002.
  • [29] D. Micciancio and O. Regev. Worst-case to average-case reductions based on Gaussian measures. SIAM Journal on Computing, 37(1):267–302, 2007.
  • [30] C. Moore, A. Russell, and U. Vazirani. A classical one-way function to confound quantum adversaries. arXiv:quant-ph/0701115, 2007.
  • [31] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 2000.
  • [32] C. Peikert. Limits on the hardness of lattice problems in lpl_{p} norms. Comput. Complexity, 17(2):300–351, 2008. Preliminary version in CCC’07.
  • [33] C. Peikert. Public-key cryptosystems from the worst-case shortest vector problem. In Proc. 41st ACM Symp. on Theory of Computing (STOC), 2009.
  • [34] C. Peikert, V. Vaikuntanathan, and B. Waters. A framework for efficient and composable oblivious transfer. In Advances in cryptology—CRYPTO ’08, volume 5157 of Lecture Notes in Comput. Sci., pages 554–571. Springer, Berlin, 2008.
  • [35] C. Peikert and B. Waters. Lossy trapdoor functions and their applications. In Proc. 40th ACM Symp. on Theory of Computing (STOC), pages 187–196, 2008.
  • [36] O. Regev. New lattice based cryptographic constructions. Journal of the ACM, 51(6):899–942, 2004. Preliminary version in STOC’03.
  • [37] O. Regev. On lattices, learning with errors, random linear codes, and cryptography. In Proc. 37th ACM Symp. on Theory of Computing (STOC), pages 84–93, 2005.
  • [38] C.-P. Schnorr. A hierarchy of polynomial time lattice basis reduction algorithms. Theoret. Comput. Sci., 53(2-3):201–224, 1987.