
Smoothing Codes and Lattices:
Systematic Study and New Bounds

Thomas Debris-Alazard(1) [email protected], Léo Ducas(2,3) [email protected], Nicolas Resch(4) [email protected], and Jean-Pierre Tillich(1) [email protected]
(1) Inria (2) CWI, Amsterdam, The Netherlands (3) Mathematical Institute, Leiden University (4) Informatics Institute, University of Amsterdam
Abstract.

In this article we revisit smoothing bounds in parallel between lattices and codes. Initially introduced by Micciancio and Regev, these bounds were instantiated with Gaussian distributions and were crucial for arguing the security of many lattice-based cryptosystems. Unencumbered by direct application concerns, we provide a systematic study of how these bounds are obtained for both lattices and codes, transferring techniques between both areas. We also consider multiple choices of spherically symmetric noise distribution.

We found that the best strategy for a worst-case bound combines Parseval’s Identity, the Cauchy-Schwarz inequality, and the second linear programming bound, and this holds for both codes and lattices and all noise distributions at hand. For an average-case analysis, the linear programming bound can be replaced by a tight average count.

This alone gives optimal results for spherically uniform noise over random codes and random lattices. It also improves the previous Gaussian smoothing bound for worst-case lattices; surprisingly, it provides even better results for uniform ball noise than for Gaussian noise (or, in the case of codes, Bernoulli noise).

This counter-intuitive situation can be resolved by adequate decomposition and truncation of Gaussian and Bernoulli distributions into a superposition of uniform noise, giving further improvement for those cases, and putting them on par with the uniform cases.

The work of TDA and JPT was funded by the French Agence Nationale de la Recherche through ANR JCJC COLA (ANR-21-CE39-0011) for TDA and ANR CBCRYPT (ANR-17-CE39-0007) for JPT. Part of this work was done while NR was affiliated with the CWI and partially supported by ERC H2020 grant No. 74079 (ALGSTRONGCRYPTO). LD is supported by an ERC Starting Grant 947821 (ARTICULATE).

1. Introduction

1.1. Smoothing bounds.

In either a code or a lattice, smoothing refers to the fact that, as an error distribution grows wider and wider, the associated syndrome distribution tends towards a uniform distribution. In other words, the error distribution, reduced modulo the code or the lattice, becomes essentially flat. This phenomenon is pivotal in arguing the security of cryptosystems [MR07, GPV08, DST19]. In the information-theoretic literature, it is also sometimes referred to as flatness [LLBS14]. Informally, by a “smoothing bound” we are referring to a result which lower bounds the amount of noise which needs to be added so that the smoothed distribution “looks” flat.

To be more concrete, by a “flat distribution” we are referring to a uniform distribution over the ambient space modulo the group of interest. For a (linear) code $\mathscr{C}\subseteq\mathbb{F}_{2}^{n}$, this quotient space is $\mathbb{F}_{2}^{n}/\mathscr{C}$; for a lattice $\Lambda\subseteq\mathbb{R}^{n}$, it is $\mathbb{R}^{n}/\Lambda$. We then consider some “noise” vector $\mathbf{e}$ distributed over the ambient space $\mathbb{F}_{2}^{n}$ (resp. $\mathbb{R}^{n}$), and attempt to prove that $\mathbf{e}\bmod\mathscr{C}$ (resp. $\mathbf{e}\bmod\Lambda$) is “close” to the uniform distribution over the quotient space $\mathbb{F}_{2}^{n}/\mathscr{C}$ (resp. $\mathbb{R}^{n}/\Lambda$). To quantify “closeness” between distributions, we will use the standard choice of statistical distance.

An important question to be addressed is the choice of distribution for the noise vector $\mathbf{e}$. In lattice-based cryptography (where such smoothing bounds originated [MR07]), the literature ubiquitously uses Gaussian distributions for errors, and smoothness is guaranteed for an error growing as the inverse of the minimum distance of the dual lattice. The original chain of arguments [MR07] goes as follows:

  • Apply the Poisson summation formula (PSF);

  • Bound variations via the triangle inequality (TI) over all non-zero dual lattice points;

  • Bound the absolute sum above via the Banaszczyk tail bound [Ban93] for discrete Gaussians (BT).

An intermediate quantity called the smoothing parameter introduced by [MR07] before the last step is also often used in the lattice-based cryptographic literature. Each bounding step is potentially non-tight, and indeed more recent works have replaced the last step by the following [ADRS15]:

  • Bound the number of lattice points in balls of a given radius via the Linear Programming bound [Lev79] (LP) and “sum over all radii” (with care).

With this LP strategy, it is in principle possible to also compute a smoothing bound for spherically symmetric distributions of errors other than the Gaussian; however, we are not aware of prior work doing this explicitly. A very natural choice would be uniform distributions over Euclidean balls.

For codes, there are also two natural distributions of errors: Bernoulli noise, i.e. flipping each bit independently with some probability $p$ (a.k.a. the binary symmetric channel $\mathrm{BSC}_{p}$), and uniform noise over a Hamming sphere of a fixed radius. The latter is typically preferred for the design of concrete and practical cryptosystems [McE78, Ale11, MTSB13, DST19], while the former appears more convenient in theoretical works.(1) Cryptographic interest in code smoothing has recently arisen [BLVW19, YZ21], but results are so far limited to codes with extreme parameters and specific “balancedness” constraints. However, we note that the question is not entirely new in the coding literature (see for instance [Klø07]). In particular, an understanding of the smoothing properties of Bernoulli noise is intimately connected to the undetected error probability of a code transmitted through the $\mathrm{BSC}_{p}$.

(1) A third choice of distribution, described as a discrete-time random walk, also made an appearance for a complexity-theoretic result [BLVW19]. The expert reader may note that the Bernoulli distribution can also be treated as a continuous-time random walk, and both can be analysed via the heat kernel formalism [Chu97, Chap. 10].

In this light, it is interesting to revisit and systematize our understanding of smoothing bounds, unencumbered by direct application concerns. We find it enlightening to do this exploration in parallel between codes and lattices, transferring techniques back and forth between both areas whenever possible.

Furthermore, we keep our arguments agnostic to the specific choice of error distribution, allowing us to apply them with different error distributions and compare the results. To compare different (symmetric) distributions, we advocate parametrizing them by the expected weight/norm of a vector. That is, we quantify the magnitude of a noise vector $\mathbf{e}$ by $t=\mathbb{E}(|\mathbf{e}|)$ (where $|\cdot|$ denotes either the Hamming weight or the Euclidean norm of the vector). Our smoothing bounds will depend on this parameter, and we consider a smoothing bound to be more effective if it requires a smaller lower bound on $t$ for the smoothed distribution to be close to uniform.

1.2. Contributions.

In this work, we collect the techniques that have been used for smoothing, both in the code and lattice contexts. We view individual steps as modular components of arguments, and consider all permissible combinations of steps, thereby determining the most effective arguments. In the following, we outline our systematization efforts, describing the various proof frameworks that we tried before settling on the most effective argument.

Code smoothing bounds.

Given the relative dearth of results concerning code smoothing, it seems natural to start by adapting the first argument (PSF+TI+BT) to codes, following the proof techniques of [Ban93, MR07]. And indeed, the whole strategy translates flawlessly, with only one caveat: it leads to a very poor result, barely better than the trivial bound. Namely, smoothness is established only for Bernoulli errors with parameter very close to $p=1/2$.

The adaptation of Banaszczyk's tail bound [Ban93] to codes (together with replacing the Gaussian by a Bernoulli distribution) is rather naïve, and it is therefore not very surprising that it leads to a disappointing result. Instead, we can also follow the improved strategy for lattices from [ADRS15], and resort to linear programming bounds for codes [Bas65, MRJW77, ABL01]. Briefly, by an LP bound we are referring to a result that bounds the number of codewords (resp. lattice vectors) of a certain weight (resp. norm) in terms of the dual distance (resp. shortest dual vector) of the code (resp. lattice). In both cases, the results are obtained by considering a certain LP relaxation of the combinatorial quantities one wishes to bound, hence the name. Even more, the bounds for codes and lattices are obtained via essentially the same arguments [MRJW77, DL98, CE03]. We therefore find it natural to apply LP bounds in our effort to develop proof techniques which apply to both code- and lattice-smoothing.

The strategy (PSF+TI+LP) turns out to give a significantly better result, but it nevertheless still appears to be far from optimal. We believe that the application of the triangle inequality in the second step to bound the sum of Fourier coefficients given by the Poisson summation formula leads to the unsatisfactory bound. Indeed, a common heuristic when dealing with sums of Fourier coefficients is that, unless there is a good reason otherwise, the sum should have magnitude roughly the square-root of the order of the group (as is the case for random signs): the triangle inequality is far too crude to notice this.

Instead, we turn to another common upper bound on a sum, namely the Cauchy-Schwarz (CS) inequality. It is natural to subsequently apply Parseval's Identity (PI). It turns out that this strategy yields very promising results, on which we now elaborate. The upper bound is described in terms of the weight distribution of a code, i.e. the number of codewords of weight $w$ for each $w=1,\dots,n$. Unfortunately, it is quite difficult to understand the weight distribution of arbitrary codes, and the bounds that we do have are quite technical.

Random codes.

For this reason, we first apply our proof template to random codes, as it is quite simple to compute the (expected) weight distribution of a random code. Quite satisfyingly, the simple two-step argument (PI+CS) already yields optimal results for this case, but only when the error is sampled uniformly at random from a sphere! That is, we can show that the support size of the error distribution matches the obvious lower bound that applies to any distribution that successfully smooths a code: namely, for a code $\mathscr{C}$ the support size must be at least $\sharp(\mathbb{F}_{2}^{n}/\mathscr{C})$. Using coding-theoretic terminology, the weight of the error vector that we need to smooth is given by the ubiquitous Gilbert-Varshamov bound

\omega_{\textup{GV}}(R)=h^{-1}(1-R)

which characterizes the trade-off between a random code's rate $R$ and its minimum distance. Here, $h^{-1}$ is the inverse of the binary entropy function.
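As a quick numerical aid (a minimal sketch of ours, not part of the paper), the following Python snippet evaluates $\omega_{\textup{GV}}(R)$ by inverting the binary entropy function with a bisection, which suffices since $h$ is increasing on $[0,1/2]$:

```python
from math import log2

def h(x):
    """Binary entropy: h(x) = -x log2(x) - (1 - x) log2(1 - x)."""
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def h_inv(y, iters=60):
    """Inverse of h restricted to [0, 1/2], by bisection (h is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = (lo + hi) / 2
        if h(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def omega_gv(R):
    """Relative Gilbert-Varshamov bound: omega_GV(R) = h^{-1}(1 - R)."""
    return h_inv(1 - R)

print(omega_gv(0.5))  # ~0.110, the relative distance used later for rate-1/2 codes
```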

Moreover, as the argument is versatile enough to apply to essentially all spherical error distributions, we also tried applying it to the Bernoulli distribution, and the random walk distribution of [BLVW19]. Comparing them, we were rather surprised that our argument provided better bounds for the uniform distribution over a Hamming sphere than the other two distributions for the same average Hamming weight.

However, while the (PI+CS) sequence of arguments is more effective when the noise is sampled uniformly on the sphere, we can exploit the fact that the Hamming weight of a Bernoulli-distributed vector is tightly concentrated to recover the same smoothing bound for this distribution. In more detail, we use a “truncation” argument. First, we decompose the Bernoulli distribution into a convex combination of uniform sphere distributions. By Chernoff's bound, a Bernoulli distribution is concentrated on vectors whose weight lies in an interval of width $\varepsilon n$ around its expected weight; outside of this interval, the contribution of the Bernoulli distribution to the statistical distance is negligible. We then apply the (PI+CS) sequence of arguments to each constituent distribution close to the expected weight. In this way, we are able to demonstrate that Bernoulli distributions also optimally smooth random codes.

Arbitrary codes.

Next, we turn our attention to smoothing worst-case codes. Motivated by our success in smoothing random codes, we again follow the (PI+CS) sequence of arguments and combine this with LP bounds to derive smoothing bounds when the dual distance of the code is sufficiently large. Again, the sequence of arguments is most effective when the error is distributed uniformly over the sphere, with one caveat: we are also required to assume that the dual code is balanced, in the sense that it does not contain any vectors of too large weight. While this assumption has appeared in other works [BLVW19, YZ21], we find it somewhat unsatisfactory.

Fortunately, this condition is not required if the error is sampled according to the Bernoulli distribution. But then we run into the same issue that we had earlier with random codes: the (PI+CS) argument, followed by LP bounds, natively yields a weaker result when instantiated with Bernoulli noise. Fortunately, we have already seen how to resolve this issue: we pass to the truncated Bernoulli distribution and decompose it into uniform sphere distributions. This yields a best-of-both-worlds result: we obtain the strongest smoothing bound we can in terms of the noise magnitude, while requiring the weakest assumption on the code.

And back to lattices.

Having now uncovered this better strategy for codes, we can return to lattices and apply our new proof template. Indeed, as we outline in Section 2.3, the (PI+CS) sequence of arguments can be applied in a very broad context; see, in particular, Corollary 2.4.

Random lattices.

First, just as we set our expectations for code smoothing by first studying the random case, we analogously start here by considering random lattices. However, defining a random lattice is a non-trivial task. We actually consider two distributions. The first, based on the deep Minkowski-Hlawka-Siegel (MHS) Theorem, we describe only abstractly. Thanks to the MHS Theorem, we can very easily compute the expected value of our upper bound.

For the MHS distribution of lattices, we consider two natural error distributions: the Gaussian distribution (which is used ubiquitously in the literature), as well as the uniform distribution over the Euclidean ball. And again, perhaps surprisingly (although less so now thanks to our experience with the code case), we obtain a better result with the uniform distribution over the Euclidean ball. Moreover, the Euclidean ball result is optimal in the same sense as for codes: the support volume of the error distribution is exactly equal to the covolume of the lattice (that is, for a lattice $\Lambda$, the volume of the torus $\mathbb{R}^{n}/\Lambda$; we denote this quantity by $|\Lambda|$ from now on). We view the value $w$ such that the volume of the $n$-ball of radius $w$ equals the covolume of the lattice (which is half the quantity that appears in Minkowski's bound) as the lattice-theoretic analogue of the Gilbert-Varshamov quantity:

w_{\textup{M}/2}\stackrel{\text{def}}{=}\frac{\sqrt[n]{|\Lambda|\;\Gamma(n/2+1)}}{\sqrt{\pi}}\ .
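For intuition, here is a small Python sketch (our own helper, not from the paper) evaluating $w_{\textup{M}/2}$; it uses lgamma for numerical stability and illustrates the familiar asymptotic $w_{\textup{M}/2}\sim\sqrt{n/(2\pi e)}$ for unit-covolume lattices:

```python
from math import lgamma, exp, log, pi, sqrt, e

def w_minkowski_half(n, covol=1.0):
    """Radius w with V_n(w) = covol, i.e. w = (covol * Gamma(n/2 + 1))^(1/n) / sqrt(pi)."""
    return exp((log(covol) + lgamma(n / 2 + 1)) / n) / sqrt(pi)

for n in (64, 256, 1024):
    # The two printed columns converge as n grows.
    print(n, w_minkowski_half(n), sqrt(n / (2 * pi * e)))
```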

However, as Gaussian vectors satisfy many pleasing properties that are often exploited in lattice-theoretic literature, we would like to obtain the same smoothing bound for this error distribution. Fortunately, our experience with codes also tells us how to recover the result for Gaussian noise from the Euclidean ball noise smoothing bound: we decompose the Gaussian distribution appropriately into a convex combination of Euclidean ball distributions. Together with a basic tail bound, we recover the same smoothing bound for Gaussian noise that we had for the uniform ball noise.

We also study random $q$-ary lattices, which are more concretely defined: following the traditional lattice-theoretic terminology, they are obtained by applying Construction A to a random code. This does lead to a slight increase in the technicality of the argument – in particular, we need to apply a certain “summing over annuli” trick – but the computations are still relatively elementary. Again, we find that the argument naturally works better when the errors are distributed uniformly over a ball, but we can still transfer the bound to the Gaussian noise.

Interestingly, the same optimal bound has been recovered in a concurrent work [LLB22, Theorem 1] for Gaussian distributions. Their arguments are quite unlike ours: [LLB22] uses the Kullback–Leibler divergence in combination with other information-theoretic arguments. However, contrary to our bounds obtained via the (PI+CS) sequence of arguments, [LLB22, Theorem 1] only holds for random $q$-ary lattices.

Arbitrary lattices.

Next, we address the challenge of smoothing arbitrary lattices. And again, we follow the (PI+CS) sequence of arguments, and subsequently use the Kabatiansky and Levenshtein bound [KL78] to obtain a smoothing bound in terms of the minimum distance of the dual lattice. The Kabatiansky and Levenshtein bound is the lattice-analogue of the second LP bound from coding theory. We can directly apply the arguments with both of our error distributions of interest, and again, the uniform ball distribution wins. But the decomposition and tail-bound trick again applies to yield the same result for the Gaussian distribution that we had for the uniform ball distribution.

Comparison.

We summarize how our work improves on the state of the art in Table 1 for lattices, and in Table 2 and Figure 1 for codes. For this discussion, we let $U(\mathbb{R}^{n}/\Lambda)$ (resp. $U(\mathbb{F}_{2}^{n}/\mathscr{C})$) denote the uniform distribution over $\mathbb{R}^{n}/\Lambda$ (resp. $\mathbb{F}_{2}^{n}/\mathscr{C}$), and let $\Delta$ denote the statistical distance.

In the case of lattices (Table 1), we fix the smoothing target to exponentially small; that is, we state the minimal value of $F>0$ such that the bound on the statistical distance implies $\Delta(\mathbf{e}\bmod\Lambda,U(\mathbb{R}^{n}/\Lambda))\leq 2^{-\Omega(n)}$ when the error follows the prescribed distribution and has an average Euclidean length of $\mathbb{E}(|\mathbf{e}|_{2})=F\,n/\lambda_{1}^{*}(\Lambda)$ (in fact, the values in this table guarantee exponentially small statistical distance from the uniform distribution).

| Distribution | Proof strategy | Smoothing factor $F$ | General statement |
| Gaussian | PSF+TI+BT | $1/(2\pi)\approx 0.15915$ | Lemma 3.2 [MR07] |
| Gaussian | PSF+TI+LP | $C_{\textup{KL}}/(2\pi\sqrt{e})\approx 0.12746$ | Lemma 6.1 [ADRS15] |
| Gaussian | PI+CS+LP | $C_{\textup{KL}}/(2\pi\sqrt{2e})\approx 0.09013$ | Theorem 4.18 (this work) |
| Unif. Euclidean ball | PI+CS+LP | $C_{\textup{KL}}/(2\pi e)\approx 0.07731$ | Theorem 4.17 (this work) |
| Gaussian | via Unif. + Trunc. | $C_{\textup{KL}}/(2\pi e)\approx 0.07731$ | Theorem 4.19 (this work) |
Table 1. Comparison of smoothing bounds for various proof strategies and error distributions. The smoothing constant $F$ is the smallest constant $C$ such that the bound proves exponential smoothness when the average norm of an error is at least $Cn$ times the inverse of the minimal distance of the dual lattice. Here $C_{\textup{KL}}\approx 2^{0.401}$ denotes the constant involved in the Kabatiansky and Levenshtein bound [KL78].

In the case of codes we also fix the smoothing target to negligible (again, nothing changes if we insist that the statistical distance to uniform be exponentially small), but we compare two cases: smoothing bounds for random codes (on average) and for a fixed code (worst case). In Figure 1 we compare the minimal value $F>0$ such that $\mathbb{E}_{\mathscr{C}}\left(\Delta(\mathbf{e}\bmod\mathscr{C},U(\mathbb{F}_{2}^{n}/\mathscr{C}))\right)\leq 2^{-\Omega(n)}$ when the error $\mathbf{e}$ follows the prescribed distribution and the expectation is taken over codes of rate $R$. In Table 2 we make the same comparison, but to reach $\Delta(\mathbf{e}\bmod\mathscr{C},U(\mathbb{F}_{2}^{n}/\mathscr{C}))\leq 2^{-\Omega(n)}$ for a fixed code $\mathscr{C}$ such that the minimum distance of its dual $\mathscr{C}^{*}$ is known.

Figure 1. Comparison of smoothing constants for random codes as a function of their rate $R$ for various error distributions. The smoothing constant is the smallest constant $C$ such that the bound proves exponential smoothness when the average Hamming weight of an error is at least $Cn$.
| Distribution | Smoothing factor $F$ | Balanced-code hypothesis | General statement |
| Bernoulli | $\approx 0.24$ | NO | Eq. (17), Prop. 3.11, 3.12 |
| Discrete random walk | $\approx 0.27$ | YES | Theorem 3.14 |
| Unif. Hamming sphere | $\approx 0.17$ | YES | Theorem 3.14 |
| Bernoulli + Trunc. | $\approx 0.17$ | NO | Theorem 3.16 |
Table 2. Comparison of smoothing bounds for a code $\mathscr{C}$ of length $n$ whose dual $\mathscr{C}^{*}$ has minimum distance $0.11n$ (the typical case for a code of rate $1/2$), for various error distributions. The smoothing constant $F$ is the smallest constant $C$ such that the bound proves exponential smoothness when the average Hamming weight of an error is at least $Cn$. The balanced-code hypothesis means that we suppose there are no dual codewords $\mathbf{c}^{*}\in\mathscr{C}^{*}$ of Hamming weight larger than $(1-0.11)n$.

2. Preliminaries: Notation and Fourier Analysis over Locally Compact Abelian Groups

2.1. General Notation.

The notation $x\stackrel{\text{def}}{=}y$ means that $x$ is defined as being equal to $y$. Given a set $\mathscr{S}$, its indicator function will be denoted $1_{\mathscr{S}}$. For a finite set $\mathscr{S}$, we will denote by $\sharp\mathscr{S}$ its cardinality. Vectors will be written with bold letters (such as $\mathbf{x}$). Furthermore, we denote by $\llbracket a,b\rrbracket$ the set of integers $\{a,a+1,\dots,b\}$.

The statistical distance between two discrete probability distributions $f$ and $g$ over the same space $\mathscr{S}$ is defined as:

\Delta(f,g)\stackrel{\text{def}}{=}\frac{1}{2}\sum_{x\in\mathscr{S}}|f(x)-g(x)|.

Similarly, for two continuous probability density functions $f$ and $g$ over the same measure space $\mathscr{E}$, the statistical distance is defined as

\Delta(f,g)\stackrel{\text{def}}{=}\frac{1}{2}\int_{\mathscr{E}}|f-g|.
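In discrete settings we will occasionally want to evaluate this quantity exactly on toy examples; a minimal Python helper (ours) reads:

```python
def statistical_distance(f, g):
    """Statistical distance between two discrete distributions,
    given as dicts mapping outcomes to probabilities."""
    support = set(f) | set(g)
    return 0.5 * sum(abs(f.get(x, 0.0) - g.get(x, 0.0)) for x in support)

# Example: a biased coin vs. the uniform distribution on {0, 1}.
print(statistical_distance({0: 0.7, 1: 0.3}, {0: 0.5, 1: 0.5}))  # 0.2
```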

2.2. Codes and Lattices

We give here some basic definitions and notation about linear codes and lattices.

Linear codes. In the whole paper we will deal exclusively with binary linear codes, namely subspaces of $\mathbb{F}_{2}^{n}$ for some positive integer $n$. The space $\mathbb{F}_{2}^{n}$ is endowed with the Hamming weight $|\cdot|$, namely

\forall\mathbf{x}\in\mathbb{F}_{2}^{n},\quad|\mathbf{x}|\stackrel{\text{def}}{=}\sharp\left\{i\in\llbracket 1,n\rrbracket : x_{i}\neq 0\right\}.

We will denote by $\mathscr{S}_{w}$ the sphere with center $\mathbf{0}$ and radius $w$; its size is given by $\binom{n}{w}$ and we have $\frac{1}{n}\log_{2}\binom{n}{w}=h(w/n)+o(1)$, where $h$ denotes the binary entropy function, namely $h(x)\stackrel{\text{def}}{=}-x\log_{2}(x)-(1-x)\log_{2}(1-x)$.

An $[n,k]$-code $\mathscr{C}$ is defined as a $k$-dimensional subspace of $\mathbb{F}_{2}^{n}$. The rate of $\mathscr{C}$ is $\frac{k}{n}$. Its minimum distance is given by

d_{\textup{min}}(\mathscr{C})\stackrel{\text{def}}{=}\min\left\{|\mathbf{c}-\mathbf{c}^{\prime}| : \mathbf{c},\mathbf{c}^{\prime}\in\mathscr{C}\text{ and }\mathbf{c}\neq\mathbf{c}^{\prime}\right\}=\min\left\{|\mathbf{c}| : \mathbf{c}\in\mathscr{C}\text{ and }\mathbf{c}\neq\mathbf{0}\right\}.

The number of codewords of $\mathscr{C}$ of weight $t$ will be denoted by $N_{t}(\mathscr{C})$, namely

N_{t}(\mathscr{C})\stackrel{\text{def}}{=}\sharp\left\{\mathbf{c}\in\mathscr{C} : |\mathbf{c}|=t\right\}.

The dual of a code $\mathscr{C}$ is defined as $\mathscr{C}^{*}\stackrel{\text{def}}{=}\left\{\mathbf{c}^{*}\in\mathbb{F}_{2}^{n} : \forall\mathbf{c}\in\mathscr{C},\ \mathbf{c}\cdot\mathbf{c}^{*}=0\right\}$, where $\cdot$ denotes the standard inner product on $\mathbb{F}_{2}^{n}$.

Lattices. We will consider lattices in $\mathbb{R}^{n}$, which is endowed with the Euclidean norm $|\cdot|_{2}$, namely

\forall\mathbf{x}\in\mathbb{R}^{n},\quad|\mathbf{x}|_{2}\stackrel{\text{def}}{=}\sqrt{\sum_{i=1}^{n}x_{i}^{2}}.

We will denote by $\mathscr{B}_{w}$ the ball with center $\mathbf{0}$ and radius $w$; its volume is given by

V_{n}(w)\stackrel{\text{def}}{=}\frac{\pi^{n/2}w^{n}}{\Gamma(n/2+1)}.

An $n$-dimensional lattice $\Lambda$ is defined as a discrete subgroup of $\mathbb{R}^{n}$. The covolume $|\Lambda|\stackrel{\text{def}}{=}\textup{vol}\left(\mathbb{R}^{n}/\Lambda\right)$ of $\Lambda$ is the volume of any fundamental parallelotope. The minimum distance of $\Lambda$ is given by $\lambda_{1}(\Lambda)\stackrel{\text{def}}{=}\min\left\{|\mathbf{x}|_{2} : \mathbf{x}\in\Lambda\text{ and }\mathbf{x}\neq\mathbf{0}\right\}$. The number of lattice points of $\Lambda$ of norm at most $t$ will be denoted by $N_{\leq t}(\Lambda)$, namely

N_{\leq t}(\Lambda)\stackrel{\text{def}}{=}\sharp\left\{\mathbf{x}\in\Lambda : |\mathbf{x}|_{2}\leq t\right\}.

2.3. Fourier Analysis

We give here a brief introduction to Fourier analysis over arbitrary locally compact Abelian groups. Our general treatment will allow us to apply directly some basic results in a code and lattice context, obviating the need in each case to introduce essentially the same definitions and to provide the same proofs.

Corollary 2.4 at the end of this subsection is the starting point of our smoothing bounds: all of our results are obtained by using different facts to bound the right hand side of the inequality.

Groups and Their Duals. In what follows $G$ will denote a locally compact Abelian group. Such a group admits a Haar measure $\mu$. For instance $G=\mathbb{R}$ with $\mu$ the Lebesgue measure $\lambda$, or $G=\mathbb{F}_{2}^{n}$ with $\mu$ the counting measure $\sharp$.

The dual group $\widehat{G}$ consists of the continuous group homomorphisms $\chi$ from $G$ into the multiplicative group of complex numbers of absolute value $1$, and it is again a locally compact Abelian group. In Figure 2 we give the groups, their duals, and the associated Haar measures that will be considered in this work.

| $G$ | $\mu$ |
| $\mathbb{F}_{2}^{n}$ | $\frac{1}{2^{n}}\,\sharp$ |
| $\mathbb{F}_{2}^{n}/\mathscr{C}$ | $\frac{\sharp\mathscr{C}}{2^{n}}\,\sharp$ |
| $\widehat{\mathbb{F}_{2}^{n}/\mathscr{C}}\simeq\mathscr{C}^{*}$ | $\sharp$ |
| $\mathscr{C}$ | $\frac{1}{\sharp\mathscr{C}}\,\sharp$ |
| $\mathbb{R}^{n}$ | $\lambda$ |
| $\mathbb{R}^{n}/\Lambda$ | $\frac{1}{|\Lambda|}\,\lambda$ |
| $\widehat{\mathbb{R}^{n}/\Lambda}\simeq\Lambda^{*}$ | $\sharp$ |
| $\Lambda$ | $|\Lambda|\,\sharp$ |

Figure 2. Some groups $G$, their duals $\widehat{G}$ and their associated Haar measures. Here $\lambda$ denotes the Lebesgue measure and $\sharp$ the counting measure.

It is important to note that if $H\subseteq G$ is a closed subgroup, then $G/H$ and $H$ are also locally compact groups. Furthermore, the dual group of $G/H$ satisfies the following isomorphism:

\widehat{G/H}\simeq H^{\perp}\stackrel{\text{def}}{=}\left\{\chi\in\widehat{G} : \forall h\in H,\ \chi(h)=1\right\}.

Norms and Fourier Transforms. For any $p\in[1,\infty[$, $L_{p}(G)$ will denote the space of measurable functions $f:G\rightarrow\mathbb{C}$ (up to functions which agree almost everywhere) with finite norm $\|f\|_{p}$, which is defined as

\|f\|_{p}\stackrel{\text{def}}{=}\sqrt[p]{\int_{G}|f|^{p}\,d\mu}.

The Fourier transform of $f\in L_{1}(G)$ is defined as

\widehat{f}:\chi\in\widehat{G}\longmapsto\int_{G}f\overline{\chi}\,d\mu.

We omit here the dependence on $G$; it will be clear from context.

Theorem 2.1 (Parseval’s Identity).

Let $f\in L_{1}(G)\cap L_{2}(G)$. Then, with appropriate normalization of the Haar measure,

\|f\|_{2}=\|\widehat{f}\|_{2}.

Poisson Formula. Given $H\subseteq G$ and any function $f:G\rightarrow\mathbb{C}$, its restriction to $H$ is defined as $f_{|H}:h\in H\mapsto f(h)\in\mathbb{C}$. We define its periodization as follows.

Definition 2.2 (Periodization).

Let $H$ be a closed subgroup of $G$ and $f\in L^{1}(G)$. We define the $H$-periodization of $f$ as

f^{|H}:(g+H)\in G/H\longmapsto\int_{H}f(g+h)\,d\mu_{H}(h)\in\mathbb{C}

where $\mu_{H}$ denotes any choice of Haar measure for $H$.

There always exists a Haar measure $\mu_{G/H}$ such that for any continuous function with compact support $f:G\rightarrow\mathbb{C}$ the quotient integral formula holds:

\int_{G/H}\left(\int_{H}f(g+h)\,d\mu_{H}(h)\right)d\mu_{G/H}(g+H)=\int_{G}f(g)\,d\mu(g). \quad (1)
Theorem 2.3 (Poisson Formula).

Let $H\subseteq G$ be a closed subgroup and $f\in L^{1}(G)$. Then, with appropriate normalization of the Haar measures,

\widehat{\left(f^{|H}\right)}=\left(\widehat{f}\right)_{|\widehat{G/H}}.

The following corollary is a simple consequence of the Cauchy-Schwarz inequality, Parseval's identity and the Poisson formula. Our results on smoothing bounds are all based on this corollary.

Corollary 2.4.

Let $H$ be a closed subgroup of $G$. Let $a:x\in G/H\mapsto 1$ and let $f\in L^{1}(G)$ be such that $\int_{G}f\,d\mu=\mu_{G/H}(G/H)$. Then, with appropriate normalization of the Haar measures,(5)

\|a-f^{|H}\|_{1}\leq\sqrt{\mu_{G/H}(G/H)}\;\sqrt{\int_{\widehat{G/H}\backslash\{\chi_{\mathbf{0}}\}}|\widehat{f}|^{2}\,d\mu_{\widehat{G/H}}}

where $\chi_{\mathbf{0}}$ denotes the identity element of $\widehat{G/H}$.

(5) We choose the Haar measures $\mu_{G}$, $\mu_{H}$, $\mu_{G/H}$ and $\mu_{\widehat{G/H}}$ for which both the Poisson formula and Parseval's identity hold.

Proof.

We have

\|a-f^{|H}\|_{1} = \int_{G/H}|a-f^{|H}|\,d\mu_{G/H}
\leq \sqrt{\mu_{G/H}(G/H)}\;\|a-f^{|H}\|_{2} \quad(\mbox{by Cauchy-Schwarz})
= \sqrt{\mu_{G/H}(G/H)}\;\|\widehat{a}-\widehat{f^{|H}}\|_{2} \quad(\mbox{by Parseval})
= \sqrt{\mu_{G/H}(G/H)}\;\sqrt{\int_{\widehat{G/H}\setminus\{\chi_{\mathbf{0}}\}}|\widehat{f^{|H}}|^{2}\,d\mu_{\widehat{G/H}}} \quad (2)
= \sqrt{\mu_{G/H}(G/H)}\;\sqrt{\int_{\widehat{G/H}\setminus\{\chi_{\mathbf{0}}\}}|\widehat{f}|^{2}\,d\mu_{\widehat{G/H}}} \quad(\mbox{by Poisson})

where in Equation (2) we used the following equalities:

\widehat{f^{|H}}(\chi_{\mathbf{0}}) = \int_{G/H}f^{|H}\overline{\chi_{\mathbf{0}}}\,d\mu_{G/H} = \int_{G/H}\left(\int_{H}f(g+h)\,d\mu_{H}(h)\right)d\mu_{G/H}(g+H) = \int_{G}f = \mu_{G/H}(G/H)

(by Equation (1) and then the assumption on $f$), and

\widehat{a}(\chi_{\mathbf{0}})=\int_{G/H}a\,\overline{\chi_{\mathbf{0}}}\,d\mu_{G/H}=\mu_{G/H}(G/H)\quad\mbox{and}\quad\forall\chi\in\widehat{G/H}\setminus\{\chi_{\mathbf{0}}\},\ \widehat{a}(\chi)=\int_{G/H}\overline{\chi}\,d\mu_{G/H}=0,

which concludes the proof. ∎

In this work we will choose $G=\mathbb{R}^{n}$ and $H=\Lambda$, or $G=\mathbb{F}_{2}^{n}$ and $H=\mathscr{C}$. Haar measures associated to $G$, $G/H$ and $\widehat{G/H}$ for which the corollary holds are given in Figure 2. Furthermore, we will use Fourier transforms over $\widehat{G}$ and $\widehat{G/H}$. We describe in Figure 3 the dual groups that we will consider.

| | $\mathbb{R}^{n}$ | $\mathbb{F}_{2}^{n}$ |
| Dual group | $\widehat{\mathbb{R}^{n}/\Lambda}=\left\{\chi_{\mathbf{x}} : \mathbf{x}\in\Lambda^{*}\right\}$ | $\widehat{\mathbb{F}_{2}^{n}/\mathscr{C}}=\left\{\chi_{\mathbf{x}} : \mathbf{x}\in\mathscr{C}^{*}\right\}$ |
| Fourier transform | $\widehat{f}(\mathbf{x})=\int_{\mathbb{R}^{n}}f(\mathbf{y})e^{2i\pi\mathbf{x}\cdot\mathbf{y}}\,d\mathbf{y}$ | $\widehat{f}(\mathbf{x})=\frac{1}{2^{n}}\sum_{\mathbf{y}\in\mathbb{F}_{2}^{n}}f(\mathbf{y})(-1)^{\mathbf{x}\cdot\mathbf{y}}$ |
Figure 3. Dual groups and Fourier transforms that we will consider. We identify $\widehat{f}(\chi_{\mathbf{x}})$ with $\widehat{f}(\mathbf{x})$.
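To make these conventions concrete, the following brute-force Python sketch (a toy verification of ours, using a hypothetical [4,2] code) implements the Fourier transform of Figure 3 over $\mathbb{F}_{2}^{n}$ and checks both Parseval's identity and the Poisson formula with the Haar measures of Figure 2:

```python
from itertools import product

n = 4
vectors = [tuple(v) for v in product((0, 1), repeat=n)]
dot = lambda x, y: sum(a * b for a, b in zip(x, y)) % 2
add = lambda x, y: tuple((a + b) % 2 for a, b in zip(x, y))

# Toy [4,2] code spanned by 1100 and 0011 (it happens to be self-dual).
gens = [(1, 1, 0, 0), (0, 0, 1, 1)]
C = {tuple((a * g + b * k) % 2 for g, k in zip(*gens))
     for a, b in product((0, 1), repeat=2)}
C_dual = {x for x in vectors if all(dot(x, c) == 0 for c in C)}

def fourier(f):
    """f_hat(x) = 2^{-n} sum_y f(y) (-1)^{x.y}, as in Figure 3."""
    return {x: sum(f[y] * (-1) ** dot(x, y) for y in vectors) / 2 ** n
            for x in vectors}

# Any test function works; take a Bernoulli distribution with p = 0.2.
p = 0.2
f = {y: p ** sum(y) * (1 - p) ** (n - sum(y)) for y in vectors}
f_hat = fourier(f)

# Parseval (mu = 2^{-n} * counting measure on F_2^n, counting measure on the dual).
assert abs(sum(v ** 2 for v in f.values()) / 2 ** n
           - sum(v ** 2 for v in f_hat.values())) < 1e-12

# Poisson: the Fourier transform of the C-periodization f^{|C}, taken over
# F_2^n / C, agrees with f_hat restricted to C*.  The summand below is
# constant on cosets, so summing over all of F_2^n with weight 2^{-n} equals
# integrating over the quotient with its Haar measure.
f_per = {x: sum(f[add(x, c)] for c in C) / len(C) for x in vectors}
for xs in C_dual:
    rhs = sum(f_per[x] * (-1) ** dot(xs, x) for x in vectors) / 2 ** n
    assert abs(f_hat[xs] - rhs) < 1e-12
print("Parseval and Poisson checks passed")
```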

3. Smoothing Bounds: Code Case

Given a binary linear code $\mathscr{C}$ of length $n$, the aim of a smoothing bound is to quantify under which conditions on the noise the distribution of $\mathbf{c}+\mathbf{e}$ is statistically close to the uniform distribution over $\mathbb{F}_{2}^{n}$, when $\mathbf{c}$ is drawn uniformly from $\mathscr{C}$ and $\mathbf{e}$ is sampled according to some noise distribution $f$. Equivalently, we want to understand when $\left(\mathbf{e}\bmod\mathscr{C}\right)\in\mathbb{F}_{2}^{n}/\mathscr{C}$ is close to the uniform distribution. We will focus on the case where the distribution of $\mathbf{e}$ is radial, meaning that it only depends on the Hamming weight of $\mathbf{e}$.

Notation 3.1.

We will use throughout this section the following notation.

  • The uniform probability distribution over the quotient space $\mathbb{F}_{2}^{n}/\mathscr{C}$ will frequently recur, and for this reason we simply denote it by $u$. The uniform distribution over the whole space $\mathbb{F}_{2}^{n}$ is denoted by $u_{\textup{full}}$, and the uniform distribution over the codewords of $\mathscr{C}$ is denoted by $u_{\mathscr{C}}$.

  • We also use the uniform distribution over the sphere $\mathscr{S}_{w}$, which we denote by $u_{w}$.

  • For two probability distributions $f$ and $g$ over $\mathbb{F}_{2}^{n}$, we denote by $f\star g$ their convolution over $\mathbb{F}_{2}^{n}$: $f\star g(\mathbf{x})=\sum_{\mathbf{y}\in\mathbb{F}_{2}^{n}}f(\mathbf{x}-\mathbf{y})g(\mathbf{y})$.

It will be more convenient to work in the quotient space and for this we use the following proposition.

Proposition 3.2.

Let $f$ be a probability distribution over $\mathbb{F}_{2}^{n}$ and $\mathscr{C}$ be an $[n,k]$-code. We have

\Delta(u_{\textup{full}},u_{\mathscr{C}}\star f)=\Delta(u,f^{\mathscr{C}}),\quad\mbox{where }f^{\mathscr{C}}(\mathbf{x})\stackrel{\text{def}}{=}2^{k}\,f^{|\mathscr{C}}(\mathbf{x})=\sum_{\mathbf{c}\in\mathscr{C}}f(\mathbf{x}-\mathbf{c}).
Proof.

Let $\mathbf{c}$ and $\mathbf{e}$ be distributed according to $u_{\mathscr{C}}$ and $f$ respectively. We have the following computation:

\Delta(u_{\textup{full}},u_{\mathscr{C}}\star f) = \frac{1}{2}\sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}}\left|\frac{1}{2^{n}}-\mathbb{P}_{u_{\mathscr{C}},f}\left(\mathbf{c}+\mathbf{e}=\mathbf{x}\right)\right|
= \frac{1}{2}\sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}}\left|\frac{1}{2^{n}}-\sum_{\mathbf{c}_{0}\in\mathscr{C}}\mathbb{P}_{f}(\mathbf{c}+\mathbf{e}=\mathbf{x}\mid\mathbf{c}=\mathbf{c}_{0})\,\frac{1}{2^{k}}\right|
= \frac{1}{2}\sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}}\left|\frac{1}{2^{n}}-\frac{1}{2^{k}}\sum_{\mathbf{c}_{0}\in\mathscr{C}}f(\mathbf{x}-\mathbf{c}_{0})\right|
= \frac{1}{2}\sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}/\mathscr{C}}\left|\frac{1}{2^{n-k}}-\sum_{\mathbf{c}_{0}\in\mathscr{C}}f(\mathbf{x}-\mathbf{c}_{0})\right| \quad (3)
= \frac{1}{2}\sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}/\mathscr{C}}\left|\frac{1}{2^{n-k}}-f^{\mathscr{C}}(\mathbf{x})\right|

where in Equation (3) we used that each term of the sum is constant on $\mathbf{x}+\mathscr{C}$. ∎

As a rewriting of Corollary 2.4, we get the following proposition upper-bounding $\Delta(u,f^{\mathscr{C}})$:

Proposition 3.3.

Let $\mathscr{C}$ be an $[n,k]$-code and $f$ be a radial distribution on $\mathbb{F}_{2}^{n}$. We have

\Delta\left(u,f^{\mathscr{C}}\right)\leq 2^{n}\,\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n}N_{t}(\mathscr{C}^{*})\,|\widehat{f}(t)|^{2}}

where by abuse of notation we denote by $\widehat{f}(t)$ the common value of $\widehat{f}$ on vectors of weight $t$.

Proof.

We have that $\mathscr{C}$ is a closed subgroup of $\mathbb{F}_{2}^{n}$ with associated Haar measures

\mu_{\mathbb{F}_{2}^{n}}=\frac{1}{2^{n}}\,\sharp\quad\mbox{and}\quad\mu_{\mathbb{F}_{2}^{n}/\mathscr{C}}=\frac{2^{k}}{2^{n}}\,\sharp

for which we can apply Corollary 2.4. Let $a\stackrel{\text{def}}{=}2^{n-k}u$ and $b\stackrel{\text{def}}{=}2^{n}f$. First, it is clear that $a:\mathbf{x}\in\mathbb{F}_{2}^{n}/\mathscr{C}\mapsto 1$ and that

\int_{\mathbb{F}_{2}^{n}}b\,d\mu_{\mathbb{F}_{2}^{n}}=\frac{1}{2^{n}}\sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}}2^{n}f(\mathbf{x})=1=\mu_{\mathbb{F}_{2}^{n}/\mathscr{C}}(\mathbb{F}_{2}^{n}/\mathscr{C})

where we used that $f$ is a distribution. Therefore we can apply Corollary 2.4 with the functions $a$ and $b$. Furthermore, $b^{|\mathscr{C}}=2^{n}f^{|\mathscr{C}}=2^{n-k}f^{\mathscr{C}}$ by definition of $f^{\mathscr{C}}$. We get the following computation:

\|a-b^{|\mathscr{C}}\|_{1} = \|a-2^{n-k}f^{\mathscr{C}}\|_{1}
= \sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}/\mathscr{C}}\left|1-2^{n-k}f^{\mathscr{C}}(\mathbf{x})\right|\,\frac{1}{2^{n-k}}
= \sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}/\mathscr{C}}\left|\frac{1}{2^{n-k}}-f^{\mathscr{C}}(\mathbf{x})\right|
= 2\,\Delta(u,f^{\mathscr{C}}). \quad (4)

To conclude the proof it remains to apply Corollary 2.4 with Equation (4), and then to use that $f$ is radial, hence so is $\widehat{f}$. ∎
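As a sanity check (a toy computation of ours, on a hypothetical [6,3] code), one can compare the exact statistical distance with the bound of Proposition 3.3; for Bernoulli noise the radial Fourier transform is $\widehat{f}(t)=2^{-n}(1-2p)^{t}$, a fact also used later in this section:

```python
from itertools import product
from math import sqrt

n, k, p = 6, 3, 0.3
vectors = [tuple(v) for v in product((0, 1), repeat=n)]
dot = lambda x, y: sum(a * b for a, b in zip(x, y)) % 2
add = lambda x, y: tuple((a + b) % 2 for a, b in zip(x, y))

# Toy [6,3] code: three disjoint parity pairs (its dual happens to equal itself).
gens = [(1, 1, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0), (0, 0, 0, 0, 1, 1)]
C = {tuple(sum(a * g for a, g in zip(coef, col)) % 2 for col in zip(*gens))
     for coef in product((0, 1), repeat=k)}
C_dual = [x for x in vectors if all(dot(x, c) == 0 for c in C)]

# Bernoulli(p) noise and its periodization f^C(x) = sum_{c in C} f(x + c).
f = {y: p ** sum(y) * (1 - p) ** (n - sum(y)) for y in vectors}
fC = {x: sum(f[add(x, c)] for c in C) for x in vectors}

# Exact distance to uniform over F_2^n / C; f^C is constant on cosets, so we
# may sum over all of F_2^n and divide by #C.
delta = 0.5 * sum(abs(2.0 ** -(n - k) - fC[x]) for x in vectors) / len(C)

# Proposition 3.3: with f_hat(t) = 2^{-n} (1 - 2p)^t, the bound reads
# sqrt( sum_{t >= d_min(C*)} N_t(C*) (1 - 2p)^{2t} ).
Nt = {}
for x in C_dual:
    Nt[sum(x)] = Nt.get(sum(x), 0) + 1
bound = sqrt(sum(cnt * (1 - 2 * p) ** (2 * t) for t, cnt in Nt.items() if t > 0))

print(delta, bound)   # ~0.119 <= ~0.281
assert delta <= bound
```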

Our upper bound of Proposition 3.3 involves the weight distribution of the code $\mathscr{C}^{*}$, namely $(N_{t}(\mathscr{C}^{*}))_{t\geq d_{\textup{min}}(\mathscr{C}^{*})}$. To understand how our bound behaves for a given distribution $f$, we will start (in the following subsection) with the case of random codes. The expected value of $N_{t}$ is well known in this case. This will lead us to estimate our bound for almost all codes, and gives us some hints about the best distribution to choose for our smoothing bound in the worst case (the case that we treat in Subsection 3.2).

3.1. Smoothing Random Codes.

The probabilistic model $\pazocal{C}_{n,k}$ that we use for our random code of length $n$ is defined by sampling uniformly at random a generator matrix $\mathbf{G}\in\mathbb{F}_{2}^{k\times n}$ for it, i.e.

\mathscr{C}=\left\{\mathbf{m}\mathbf{G} : \mathbf{m}\in\mathbb{F}_{2}^{k}\right\}.

It is straightforward to check that the expected number of codewords of weight $t$ in the dual $\mathscr{C}^{*}$ is given by:

Fact 3.4.

For $\mathscr{C}$ chosen according to $\pazocal{C}_{n,k}$,

\mathbb{E}_{\mathscr{C}}(N_{t}(\mathscr{C}^{*}))=\frac{\binom{n}{t}}{2^{k}}.

This estimate, combined with Proposition 3.3, enables us to upper-bound $\mathbb{E}_{\mathscr{C}}\left(\Delta(u,f^{\mathscr{C}})\right)$.

Proposition 3.5.

We have:

\mathbb{E}_{\mathscr{C}}\left(\Delta(u,f^{\mathscr{C}})\right)\leq 2^{n}\,\sqrt{\sum_{t>0}\frac{\binom{n}{t}}{2^{k}}\,|\widehat{f}(t)|^{2}}. \quad (5)
Proof.

By using Proposition 3.3, we obtain:

\mathbb{E}_{\mathscr{C}}\left(\Delta(u,f^{\mathscr{C}})\right) \leq \mathbb{E}_{\mathscr{C}}\left(2^{n}\,\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n}N_{t}(\mathscr{C}^{*})|\widehat{f}(t)|^{2}}\right)
\leq 2^{n}\,\sqrt{\mathbb{E}_{\mathscr{C}}\left(\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n}N_{t}(\mathscr{C}^{*})|\widehat{f}(t)|^{2}\right)} \quad(\mbox{Jensen's inequality})
= 2^{n}\,\sqrt{\sum_{t>0}\frac{\binom{n}{t}}{2^{k}}\,|\widehat{f}(t)|^{2}}

where in the last line we used the linearity of expectation and Fact 3.4. ∎

It remains now to choose the distribution $f$. A natural choice in code-based cryptography is the uniform distribution $u_{w}$ over the sphere $\mathscr{S}_{w}$ of radius $w$ centered at $\mathbf{0}$.

Uniform Distribution over a Sphere. The Fourier transform of $u_{w}$ is intimately connected to Krawtchouk polynomials. The Krawtchouk polynomial of order $n$ and degree $w\in\{0,\dots,n\}$ is defined as

K_{w}(X;n)\stackrel{\text{def}}{=}\sum_{j=0}^{w}(-1)^{j}\binom{X}{j}\binom{n-X}{w-j}.

To simplify notation, since $n$ is clear from context here, we will drop the dependency on $n$ and simply write $K_{w}(X)$. The following fact allows us to relate $K_{w}$ with $\widehat{u_{w}}$ (see for instance [vL99, Lem. 3.5.1, §3.5]).

Fact 3.6.

For any $\mathbf{y}\in\mathscr{S}_{t}$,

\sum_{\mathbf{e}\in\mathscr{S}_{w}}(-1)^{\mathbf{y}\cdot\mathbf{e}}=K_{w}(t). \quad (6)

This leads us to

\widehat{u_{w}}(\mathbf{x})=\frac{1}{2^{n}}\,K_{w}(|\mathbf{x}|)\bigg/\binom{n}{w}.

By plugging this into Equation (5) of Proposition 3.5 we obtain

\mathbb{E}_{\mathscr{C}}\left(\Delta(u,u_{w}^{\mathscr{C}})\right)\leq\sqrt{\sum_{t>0}\frac{\binom{n}{t}}{2^{k}}\left(\frac{K_{w}(t)}{\binom{n}{w}}\right)^{2}}. \quad (7)

The above sum can be upper-bounded by observing that $\left(K_{w}/\sqrt{\binom{n}{w}}\right)_{0\leq w\leq n}$ is an orthonormal basis of the functions $f:\{0,1,\dots,n\}\rightarrow\mathbb{C}$ for the inner product $\langle f,g\rangle_{\textup{rad}}\stackrel{\text{def}}{=}\sum_{t=0}^{n}f(t)\overline{g(t)}\binom{n}{t}/2^{n}$. It can be viewed as the standard inner product between radial functions over $\mathbb{F}_{2}^{n}$. In particular, $\sum_{t=0}^{n}\frac{K_{w}(t)^{2}}{\binom{n}{w}}\,\frac{\binom{n}{t}}{2^{n}}=1$ [Lev95, Corollary 2.3]. Therefore, for random codes we obtain the following proposition.

Proposition 3.7.

For a random $\mathscr{C}$ chosen according to $\pazocal{C}_{n,k}$ we have

\mathbb{E}_{\mathscr{C}}\left(\Delta(u,u_{w}^{\mathscr{C}})\right)\leq\sqrt{2^{n-k}\bigg/\binom{n}{w}}. \quad (8)
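For small $n$, both Fact 3.6 and the orthonormality identity behind (8) can be verified by brute force; the following sketch (ours) does exactly that:

```python
from itertools import product
from math import comb

def krawtchouk(w, x, n):
    """K_w(x; n) = sum_j (-1)^j binom(x, j) binom(n - x, w - j)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, w - j) for j in range(w + 1))

n = 8
for w in range(n + 1):
    # Fact 3.6: the character sum over the sphere S_w only depends on |y|.
    for t in range(n + 1):
        y = (1,) * t + (0,) * (n - t)
        s = sum((-1) ** sum(a * b for a, b in zip(y, e))
                for e in product((0, 1), repeat=n) if sum(e) == w)
        assert s == krawtchouk(w, t, n)
    # Orthonormality: sum_t K_w(t)^2 binom(n, t) / 2^n = binom(n, w).
    assert sum(krawtchouk(w, t, n) ** 2 * comb(n, t)
               for t in range(n + 1)) == comb(n, w) * 2 ** n
print("Krawtchouk identities verified for n =", n)
```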

In other words, if one wants to smooth a random code to target distance $2^{-\Omega(n)}$ via the uniform distribution over a sphere, one has to choose its radius $w\leq n/2$ such that $\binom{n}{w}=2^{\Omega(n)}\,2^{n-k}$. It is readily seen that for a fixed code rate $R\stackrel{\text{def}}{=}\frac{k}{n}$, choosing any fixed ratio $\omega\stackrel{\text{def}}{=}\frac{w}{n}$ such that $\omega>\omega_{\textup{GV}}(R)$ is enough, where $\omega_{\textup{GV}}(R)$ is the asymptotic relative Gilbert-Varshamov (GV) bound

\omega_{\textup{GV}}(R)\stackrel{\text{def}}{=}h^{-1}(1-R),

with $h^{-1}:[0,1]\to[0,1/2]$ being the inverse of the binary entropy function $h(p)=-p\log_{2}(p)-(1-p)\log_{2}(1-p)$. The GV bound $\omega_{\textup{GV}}(R)$ appears ubiquitously in the coding-theoretic literature: amongst other contexts, it arises as the (expected) relative minimum distance of a random code of dimension $Rn$, or as the maximum relative error weight for which decoding over the binary symmetric channel can be successful with non-vanishing probability.

This value $n\,\omega_{\textup{GV}}(R)$ of the radius is optimal: clearly, the support size of an error distribution smoothing a code $\mathscr{C}$ must be at least $\sharp(\mathbb{F}_{2}^{n}/\mathscr{C})$. Thus, we cannot expect to smooth a code $\mathscr{C}$ with errors on the sphere $\mathscr{S}_{w}$ if its size is smaller than $2^{n-k}=\sharp(\mathbb{F}_{2}^{n}/\mathscr{C})$.

Therefore the uniform distribution over a sphere is optimal for random codes. By this, we mean that it leads to the smallest amount of possible noise (when it is concentrated on a ball) to smooth a random code. Notice that we obtained this result after applying the chain of arguments Cauchy-Schwarz, Parseval and Poisson to bound the statistical distance.

About the original chain of arguments of Micciancio and Regev. It can be verified that, by coming back to the original steps of [MR07, ADRS15], namely the Poisson summation formula and then the triangle inequality, we would obtain

\Delta\left(u,f^{\mathscr{C}}\right)\leq 2^{n}\sum_{t\geq d_{\textup{min}}(\mathscr{C}^{*})}N_{t}(\mathscr{C}^{*})\,|\widehat{f}(t)|. \quad (9)

By using that $a^{2}+b^{2}\leq(a+b)^{2}$ (when $a,b\geq 0$) we see that our bound (Proposition 3.3) is sharper. It turns out that our bound is exponentially sharper for random codes (and even in the worst case) when choosing $f$ as the uniform distribution over a sphere of radius $w$, namely $f=u_{w}$. In this case the Micciancio-Regev argument yields the following computation:

\mathbb{E}_{\mathscr{C}}\left(\Delta\left(u,u_{w}^{\mathscr{C}}\right)\right) \leq \mathbb{E}_{\mathscr{C}}\left(\sum_{t\geq d_{\textup{min}}(\mathscr{C}^{*})}N_{t}(\mathscr{C}^{*})\,\frac{|K_{w}(t)|}{\binom{n}{w}}\right) = \sum_{t>0}\frac{\binom{n}{t}}{2^{k}}\,\frac{|K_{w}(t)|}{\binom{n}{w}}. \quad (10)

To carefully estimate this upper bound (and to compare with (8)) we are going to use the following proposition, which gives the asymptotic behaviour of $K_{w}$ (see for instance [IS98, DT17]).

Proposition 3.8.

Let $n,t$ and $w$ be three positive integers. We set $\tau\stackrel{\text{def}}{=}\frac{t}{n}$, $\omega\stackrel{\text{def}}{=}\frac{w}{n}$ and $\omega^{\perp}\stackrel{\text{def}}{=}1/2-\sqrt{\omega(1-\omega)}$. We assume $w\leq n/2$. Let $z\stackrel{\text{def}}{=}\frac{1-2\tau-\sqrt{D}}{2(1-\omega)}$ where $D\stackrel{\text{def}}{=}\left(1-2\tau\right)^{2}-4\omega(1-\omega)$. In the case $\tau\in(0,\omega^{\perp})$,

K_{w}(t)=O\left(2^{n(a(\tau,\omega)+o(1))}\right)\quad\mbox{where}\quad a(\tau,\omega)\stackrel{\text{def}}{=}\tau\log_{2}(1-z)+(1-\tau)\log_{2}(1+z)-\omega\log_{2}z.

In the case $\tau\in(\omega^{\perp},1/2)$, $D$ is negative, and

K_{w}(t)=O\left(2^{n(a(\tau,\omega)+o(1))}\right)\quad\mbox{where}\quad a(\tau,\omega)\stackrel{\text{def}}{=}\frac{1}{2}(1+h(\omega)-h(\tau)).
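The exponent $a(\tau,\omega)$ is straightforward to evaluate numerically; a small sketch of ours (covering both regimes of Proposition 3.8) reads:

```python
from math import log2, sqrt

def h(x):
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def a(tau, omega):
    """Exponent of K_w(t), i.e. K_w(t) = 2^{n(a(tau, omega) + o(1))},
    for tau = t/n in (0, 1/2) and omega = w/n <= 1/2 (Proposition 3.8)."""
    D = (1 - 2 * tau) ** 2 - 4 * omega * (1 - omega)
    if D >= 0:  # tau below omega^perp = 1/2 - sqrt(omega (1 - omega))
        z = (1 - 2 * tau - sqrt(D)) / (2 * (1 - omega))
        return tau * log2(1 - z) + (1 - tau) * log2(1 + z) - omega * log2(z)
    return 0.5 * (1 + h(omega) - h(tau))  # oscillatory regime

# Two sanity checks: a(tau -> 0) recovers h(omega) (since K_w(0) = binom(n, w)),
# and at tau = 1/2 the exponent is h(omega)/2, i.e. of the order sqrt(binom(n, w)).
print(a(1e-9, 0.3), h(0.3))
print(a(0.5, 0.3), h(0.3) / 2)
```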

We let

\omega_{0}\stackrel{\text{def}}{=}\varlimsup_{n\rightarrow\infty}\left\{\frac{w}{n} : \sqrt{2^{n(1-R)}\bigg/\binom{n}{w}}\geq 1\right\},
\omega_{1}\stackrel{\text{def}}{=}\varlimsup_{n\rightarrow\infty}\left\{\frac{w}{n} : \sum_{t>0}\frac{\binom{n}{t}}{2^{Rn}}\,\frac{|K_{w}(t)|}{\binom{n}{w}}\geq 1\right\}.

In Figure 4 we compare the asymptotic values of $\omega_{0}$ and $\omega_{1}$ as functions of $R$. Notice that $\omega_{0}=\omega_{\textup{GV}}(R)$. We see that $\omega_{1}$ is undefined for rates $R<1/2$. In other words, it is impossible to show that $\mathbb{E}_{\mathscr{C}}\left(\Delta(u,u_{w}^{\mathscr{C}})\right)\leq 2^{-\Omega(n)}$ with the standard approach of [MR07, ADRS15] when $R<1/2$. Furthermore, for larger rates (and sufficiently large $n$), $\omega_{0}$ is much smaller than $\omega_{1}$.

Figure 4. $\omega_{0}$ and $\omega_{1}$ as functions of $R\stackrel{\text{def}}{=}\frac{k}{n}$.

Bernoulli Distribution. Another natural distribution to consider when dealing with codes is the so-called “Bernoulli” distribution $f_{\textup{ber},p}$, which is defined for $p\in[0,1/2]$ as

\forall\mathbf{x}\in\mathbb{F}_{2}^{n},\quad f_{\textup{ber},p}(\mathbf{x})\stackrel{\text{def}}{=}p^{|\mathbf{x}|}(1-p)^{n-|\mathbf{x}|}.

This choice leads to simpler computations than the uniform distribution over a sphere. For instance, we have $\widehat{f_{\textup{ber},p}}(\mathbf{x})=\frac{1}{2^{n}}(1-2p)^{|\mathbf{x}|}$. By plugging this into Equation (5) of Proposition 3.5 we obtain

\mathbb{E}_{\mathscr{C}}\left(\Delta(u,f_{\textup{ber},p}^{\mathscr{C}})\right) \leq \sqrt{\sum_{t>0}\frac{\binom{n}{t}}{2^{k}}(1-2p)^{2t}} \leq \sqrt{\frac{1}{2^{k}}\,(1+(1-2p)^{2})^{n}}. \quad (11)

Thus, if one wants to smooth a random code to target distance $2^{-\Omega(n)}$ with the Bernoulli distribution, the above argument says that one has to choose $p>p_{0}\stackrel{\text{def}}{=}\frac{1}{2}\left(1-\sqrt{2^{R}-1}\right)$ where $R=k/n$. As $\mathbb{E}_{f_{\textup{ber},p}}(|\mathbf{x}|)=pn$, it is meaningful to compare $p_{0}$ and $\omega_{0}$. It is readily seen that $\omega_{0}=\omega_{\textup{GV}}(R)=h^{-1}(1-R)<\frac{1}{2}\left(1-\sqrt{2^{R}-1}\right)=p_{0}$. In other words, this time the upper bound given by Proposition 3.5 does not give what would be optimal, namely the Gilbert-Varshamov relative distance $\omega_{\textup{GV}}(R)$, but a bigger quantity. However, one expects the average amount of noise needed to smooth a random code to be the same in both cases, since a Bernoulli distribution of parameter $p$ is extremely concentrated on words of Hamming weight $\approx pn$, and therefore $\Delta(u,f_{\textup{ber},p}^{\mathscr{C}})\approx\Delta(u,u_{pn}^{\mathscr{C}})$. This suggests that Proposition 3.5 is not tight in this case. This is indeed so: we can prove that the Bernoulli noise smooths a random code as soon as $p>\omega_{\textup{GV}}(R)$. This follows from the following proposition.

Proposition 3.9.

Let $\varepsilon>0$ and $p\in[0,1/2]$. Then,

\Delta(u,f^{\mathscr{C}}_{\textup{ber},p})\leq\sum_{r=(1-\varepsilon)np}^{(1+\varepsilon)np}\Delta(u,u_{r}^{\mathscr{C}})+2^{-\Omega(n)}.
Proof.

See Appendix A. ∎

This proposition shows that if one wants $\Delta(u,f^{\mathscr{C}}_{\textup{ber},p})\leq 2^{-\Omega(n)}$, it is enough to have $\Delta(u,u_{r}^{\mathscr{C}})\leq 2^{-\Omega(n)}$ for every $r\in\left[(1-\varepsilon)np,(1+\varepsilon)np\right]$. This can be achieved by choosing $\varepsilon$ and $p$ such that $(1-\varepsilon)p>\omega_{\textup{GV}}(R)$.
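Numerically, the gap between the direct Bernoulli bound (11) and the truncation route of Proposition 3.9 is easy to see; the sketch below (ours) compares $p_{0}$ with $\omega_{\textup{GV}}(R)$ and evaluates the binomial mass that the truncation discards:

```python
from math import lgamma, exp, log, log2, sqrt

def h(x):
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def h_inv(y):
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < y else (lo, mid)
    return (lo + hi) / 2

R = 0.5
p0 = (1 - sqrt(2 ** R - 1)) / 2   # what the direct bound (11) requires
w_gv = h_inv(1 - R)               # what the sphere and Proposition 3.9 achieve
print(p0, w_gv)                   # ~0.178 vs ~0.110

def tail_mass(n, p, eps):
    """Bernoulli(p) mass on weights outside [(1 - eps) n p, (1 + eps) n p]."""
    def log_pmf(t):
        return (lgamma(n + 1) - lgamma(t + 1) - lgamma(n - t + 1)
                + t * log(p) + (n - t) * log(1 - p))
    lo, hi = (1 - eps) * n * p, (1 + eps) * n * p
    return sum(exp(log_pmf(t)) for t in range(n + 1) if not lo <= t <= hi)

# The discarded mass decays exponentially with n, as Chernoff's bound predicts.
for n in (100, 400, 1600):
    print(n, tail_mass(n, p=0.12, eps=0.25))
```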

To summarize this subsection, we have the following theorem.

Theorem 3.10.

Let $\mathscr{C}$ be a random code chosen according to $\pazocal{C}_{n,k}$ and let $R\stackrel{\text{def}}{=}\frac{k}{n}$. Let $u$ (resp. $u_{\lceil pn\rceil}$) be the uniform distribution over $\mathbb{F}_{2}^{n}/\mathscr{C}$ (resp. $\mathscr{S}_{\lceil pn\rceil}$), and let $f_{\textup{ber},p}$ be the Bernoulli distribution over $\mathbb{F}_{2}^{n}$ of parameter $p$. We have

\mathbb{E}_{\mathscr{C}}\left(\Delta(u,u_{\lceil pn\rceil}^{\mathscr{C}})\right)\leq 2^{\frac{n}{2}\left(1-R-h(p)+o(1)\right)}\quad\mbox{and}\quad\mathbb{E}_{\mathscr{C}}\left(\Delta(u,f_{\textup{ber},p}^{\mathscr{C}})\right)\leq 2^{\frac{n}{2}\left(1-R-h(p)+o(1)\right)}.

In particular, $\mathbb{E}_{\mathscr{C}}\left(\Delta(u,u_{\lceil pn\rceil}^{\mathscr{C}})\right)\leq 2^{-\Omega(n)}$ and $\mathbb{E}_{\mathscr{C}}\left(\Delta(u,f_{\textup{ber},p}^{\mathscr{C}})\right)\leq 2^{-\Omega(n)}$ for any fixed $p>\omega_{\textup{GV}}(R)$.

3.2. Smoothing a Fixed Code.

Our upper bound on $\Delta(u,f^{\mathscr{C}})$ given in Proposition 3.3 involves the weight distribution of the dual of $\mathscr{C}$, namely the $N_{t}(\mathscr{C}^{*})$'s. To derive smoothing bounds for a fixed code, our strategy will simply consist in using the best known upper bounds on the $N_{t}(\mathscr{C}^{*})$'s. Roughly speaking, these bounds show that $N_{t}(\mathscr{C}^{*})\leq\binom{n}{t}2^{-Kn}$ for some constant $K$ which is a function of $d_{\textup{min}}(\mathscr{C}^{*})$.

Notation. Let $\delta\in(0,1/2)$ and $\delta\leq\tau\leq 1$. We set

b(\delta,\tau)\stackrel{\text{def}}{=}\varlimsup_{n\to\infty}\max_{\mathscr{C}}\left\{\frac{1}{n}\log_{2}N_{\lfloor\tau n\rfloor}(\mathscr{C})\right\} \quad (12)

where the maximum is taken over all codes $\mathscr{C}$ of length $n$ and minimum distance $\geq\delta n$.

We recall (or slightly extend) results taken from [ABL01]:

Proposition 3.11.

Let $\delta\in(0,1/2)$ and $\delta^{\perp}\stackrel{\text{def}}{=}1/2-\sqrt{\delta(1-\delta)}$. For any $\delta\leq\tau\leq 1$,

b(\delta,\tau)\leq c(\delta,\tau)\stackrel{\text{def}}{=}\begin{cases}h(\tau)+h\left(\delta^{\perp}\right)-1&\mbox{if }\tau\in[\delta,1-\delta],\\ 2\left(h(\delta^{\perp})-a(\tau,\delta^{\perp})\right)&\mbox{otherwise,}\end{cases} \quad (13)

where $a(\cdot,\cdot)$ is defined in Proposition 3.8.

Proof.

See Appendix B. ∎
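For concrete parameters, the first LP bound $c(\delta,\tau)$ can be evaluated with a few lines of Python (our sketch; it reuses the exponent $a(\tau,\omega)$ from Proposition 3.8, which we extend to $\tau>1/2$ via the symmetry $|K_{w}(n-t)|=|K_{w}(t)|$, an assumption of ours about the intended reading of the "otherwise" case):

```python
from math import log2, sqrt

def h(x):
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def a(tau, omega):
    """Krawtchouk exponent of Proposition 3.8 (extended to tau > 1/2
    using the symmetry |K_w(n - t)| = |K_w(t)|)."""
    tau = min(tau, 1 - tau)
    D = (1 - 2 * tau) ** 2 - 4 * omega * (1 - omega)
    if D >= 0:
        z = (1 - 2 * tau - sqrt(D)) / (2 * (1 - omega))
        return tau * log2(1 - z) + (1 - tau) * log2(1 + z) - omega * log2(z)
    return 0.5 * (1 + h(omega) - h(tau))

def c(delta, tau):
    """First LP bound (13) on the weight-distribution exponent b(delta, tau)."""
    d_perp = 0.5 - sqrt(delta * (1 - delta))
    if delta <= tau <= 1 - delta:
        return h(tau) + h(d_perp) - 1
    return 2 * (h(d_perp) - a(tau, d_perp))

# delta = 0.11: the relative dual distance used in Table 2.
for tau in (0.11, 0.3, 0.5, 0.95):
    print(tau, c(0.11, tau))
```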

Proposition 3.12 ([ABL01, Proposition 4]).

Let $\delta_{\textup{JSB}}\stackrel{\text{def}}{=}\left(1-\sqrt{1-2\delta}\right)/2$ and

\tau_{0}\stackrel{\text{def}}{=}\operatorname*{argmin}_{\delta_{\textup{JSB}}\leq\alpha\leq 1/2}\ 1-h(\alpha)+R_{1}(\alpha,\delta)

where

R_{1}(\tau,\delta)\stackrel{\text{def}}{=}h\left(\frac{1}{2}\left(1-\sqrt{1-\left(\sqrt{4\tau(1-\tau)-\delta(2-\delta)}-\delta\right)^{2}}\right)\right).

For any $\delta\leq\tau\leq 1$,

b(\delta,\tau)\leq d(\delta,\tau)\stackrel{\text{def}}{=}\begin{cases}h(\tau)-h(\tau_{0})+R_{1}(\tau_{0},\delta)&\mbox{if }\tau\in(\delta_{\textup{JSB}},1-\delta_{\textup{JSB}})\mbox{ and }\tau_{0}\leq\tau,\\ R_{1}(\tau,\delta)&\mbox{if }\tau\in(\delta_{\textup{JSB}},1-\delta_{\textup{JSB}})\mbox{ and }\tau_{0}>\tau,\\ 0&\mbox{otherwise.}\end{cases} \quad (14)

Both of these bounds are derived from “linear programming arguments”, which were initially used to upper-bound the size of a code given its minimum distance. Proposition 3.11 is an extension of [ABL01, Theorem 3] to the case of linear codes; in particular, we give an upper bound for any $\tau\in[\delta,1]$ (and not only for $\tau\in[\delta,1/2]$). The proof is in the appendix. The second bound is usually called the second linear programming bound. In terms of $\delta$ and $\tau$, Propositions 3.11 and 3.12 are among the best (known) upper bounds on $b(\delta,\tau)$. In the case where $0\leq\delta\leq 0.273$, Proposition 3.12 leads to better smoothing bounds than Proposition 3.11.
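Similarly, the second LP bound $d(\delta,\tau)$ of Proposition 3.12 can be evaluated numerically; in the following sketch of ours, the argmin $\tau_{0}$ is approximated by a simple grid search:

```python
from math import log2, sqrt

def h(x):
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def R1(tau, delta):
    inner = sqrt(4 * tau * (1 - tau) - delta * (2 - delta)) - delta
    return h(0.5 * (1 - sqrt(1 - inner ** 2)))

def d(delta, tau, grid=10000):
    """Second LP bound (14) on b(delta, tau); tau_0 is found by grid search."""
    d_jsb = (1 - sqrt(1 - 2 * delta)) / 2
    alphas = [d_jsb + (0.5 - d_jsb) * i / grid for i in range(grid + 1)]
    tau0 = min(alphas, key=lambda al: 1 - h(al) + R1(al, delta))
    if not d_jsb < tau < 1 - d_jsb:
        return 0.0
    return h(tau) - h(tau0) + R1(tau0, delta) if tau0 <= tau else R1(tau, delta)

# Again with delta = 0.11, the typical relative dual distance of a rate-1/2 code.
for tau in (0.2, 0.35, 0.5):
    print(tau, d(0.11, tau))
```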

Remark 3.13.

There exist many other bounds on b(δ,τ)b(\delta,\tau), like [ACKL05, Theorem 8] which holds only for linear codes or [ACKL05, Theorem 7]. However for our smoothing bounds, Propositions 3.11 and 3.12 lead to the best results, partly because these are the best bounds on the number of codewords of Hamming weight close to the minimum distance of the code.

We draw in Figures 5 and 6 the bounds of Propositions 3.11 and 3.12 as functions of τ[δ,1]\tau\in[\delta,1] for a couple of values of δ\delta.

Figure 5. Bounds of Propositions 3.11 and 3.12 on b(δ,τ)b(\delta,\tau) as function of τ[δ,1]\tau\in[\delta,1] for δ=0.1\delta=0.1.
Figure 6. Bounds of Propositions 3.11 and 3.12 on b(δ,τ)b(\delta,\tau) as function of τ[δ,1]\tau\in[\delta,1] for δ=0.35\delta=0.35.
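These curves are straightforward to reproduce numerically. The following Python sketch (ours, purely illustrative) evaluates the bound d(δ,τ) of Proposition 3.12 together with the middle branch of c(δ,τ) from Proposition 3.11; the outer branch of c requires the exponent a(⋅,⋅) of Proposition 3.8, which we do not reproduce here.

```python
import numpy as np

def h(x):
    # Binary entropy in bits, with the convention h(0) = h(1) = 0.
    x = np.clip(np.asarray(x, dtype=float), 1e-12, 1 - 1e-12)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def R1(tau, delta):
    # R_1(tau, delta) as defined in Proposition 3.12.
    inner = np.sqrt(4 * tau * (1 - tau) - delta * (2 - delta)) - delta
    return h(0.5 * (1 - np.sqrt(1 - inner ** 2)))

def d_bound(delta, tau, grid=4000):
    # Bound d(delta, tau) of Proposition 3.12 (second LP bound).
    d_jsb = (1 - np.sqrt(1 - 2 * delta)) / 2
    alphas = np.linspace(d_jsb, 0.5, grid)
    tau0 = alphas[np.argmin(1 - h(alphas) + R1(alphas, delta))]
    if not (d_jsb < tau < 1 - d_jsb):
        return 0.0
    return float(h(tau) - h(tau0) + R1(tau0, delta)) if tau0 <= tau \
        else float(R1(tau, delta))

def c_bound_middle(delta, tau):
    # Bound c(delta, tau) of Proposition 3.11 on its middle branch
    # tau in [delta, 1 - delta]; the outer branch needs a(., .) of Prop. 3.8.
    delta_perp = 0.5 - np.sqrt(delta * (1 - delta))
    return float(h(tau) + h(delta_perp) - 1)

delta = 0.1
for tau in (0.2, 0.3, 0.4, 0.5):
    print(tau, c_bound_middle(delta, tau), d_bound(delta, tau))
```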

Equipped with these bounds, we are ready to give our smoothing bounds for codes in the worst case, namely for a fixed code. Our study of random codes hinted that the uniform distribution over a sphere could give better results than the Bernoulli distribution. However, as we now show, the distribution over a sphere forces us to assume that no codewords of large weight belong to the dual 𝒞\mathscr{C}^{*} when we want to smooth 𝒞\mathscr{C}. This corresponds to the balanced-code hypothesis made in [BLVW19] to obtain a worst-to-average case reduction. We would like to avoid this assumption, as nothing forbids large weight vectors from belonging to a fixed code. Fortunately, as we will later show, we can avoid this hypothesis while keeping the advantages of the uniform distribution over a sphere.

Impossibility of smoothing a code whose dual is not balanced with the uniform distribution over a sphere. It is readily seen that when the dual code 𝒞\mathscr{C}^{*} is not balanced, meaning that it contains the all-one vector (in which case the dual weight distribution is symmetric: Nw(𝒞)=Nnw(𝒞)N_{w}(\mathscr{C}^{*})=N_{n-w}(\mathscr{C}^{*}) for any w{0,,n}w\in\{0,\cdots,n\}, where nn is the codelength), it is impossible to smooth 𝒞\mathscr{C} with the uniform distribution uwu_{w} over a sphere. Indeed, in this case all codewords of 𝒞\mathscr{C} have even Hamming weight (they have to be orthogonal to the all-one vector). Consequently, all vectors in a given coset (i.e. in the class of representatives of some element of 𝔽2n/𝒞\mathbb{F}_{2}^{n}/\mathscr{C}) have Hamming weights of the same parity. Therefore, half of the cosets cannot be reached when periodizing uwu_{w} over 𝒞\mathscr{C}.

Difficulty of using Proposition 3.3 for proving smoothness of the uniform distribution if the dual has large weight codewords. Even in the case where the dual is balanced, difficulties can arise if we want to use Proposition 3.3 to prove smoothness of the uniform distribution over a sphere when the dual has large weight codewords. First of all, the presence of the all-one codeword is also reflected in the upper-bound of Proposition 3.3. Recall that uw^(𝐱)=12nKw(|𝐱|)/(nw)\widehat{u_{w}}(\mathbf{x})=\frac{1}{2^{n}}K_{w}(|\mathbf{x}|)/\binom{n}{w} and that we have Kw(n)=(1)w(nw)K_{w}(n)=(-1)^{w}\binom{n}{w} (see Fact 3.6). Therefore, when the full weight vector belongs to 𝒞\mathscr{C}^{*}, our upper-bound on Δ(u,uw𝒞)\Delta(u,u_{w}^{\mathscr{C}}) from Proposition 3.3 cannot be smaller than 11. Furthermore, even if the dual does not contain the all-one codeword, codewords of weight, say, t=nO(logn)t=n-O(\log n) still give a non-negligible contribution to the upper-bound of Proposition 3.3: their contribution is only polynomially small, namely nO(1)n^{-O(1)}.
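As a quick sanity check of the value K_w(n) = (-1)^w binom(n,w) invoked above, one can evaluate the binary Krawtchouk polynomial from its standard expansion K_w(t) = sum_j (-1)^j binom(t,j) binom(n-t,w-j). The short snippet below (ours, purely illustrative) verifies the identity at t = n.

```python
from math import comb

def krawtchouk(n, w, t):
    # K_w(t) = sum_j (-1)^j C(t, j) C(n - t, w - j); comb(a, b) is 0 for b > a.
    return sum((-1) ** j * comb(t, j) * comb(n - t, w - j) for j in range(w + 1))

n = 12
for w in range(n + 1):
    assert krawtchouk(n, w, n) == (-1) ** w * comb(n, w)
print("K_w(n) = (-1)^w binom(n, w) verified for n =", n)
```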

Difficulty of using Proposition 3.3 for proving smoothness of the “discrete walk distribution” if the dual has large weight codewords. Other distributions that are meaningful in the cryptographic context display the same problem as the uniform distribution when it comes to applying Proposition 3.3 in the presence of large weight dual codewords. This applies to the discrete time random walk distribution fRW,wf_{\textup{RW},w} introduced in [BLVW19] for worst-to-average case reductions; its authors were only able to prove smoothness of this distribution when the dual code has neither small nor large weight codewords. This distribution is given by

𝐱𝔽2n,fRW,w(𝐱)=def(i=1w𝐞ui=𝐱)\forall\mathbf{x}\in\mathbb{F}_{2}^{n},\quad f_{\textup{RW},w}(\mathbf{x})\stackrel{{\scriptstyle\text{def}}}{{=}}\mathbb{P}\left(\sum_{i=1}^{w}\mathbf{e}_{u_{i}}=\mathbf{x}\right)

where the uiu_{i}’s are independently and uniformly drawn at random in {1,,n}\{1,\dots,n\} and 𝐞j\mathbf{e}_{j} denotes the jj-th canonical basis vector. Recall that [BLVW19]

fRW,w^(𝐲)=12n(12|𝐲|n)w.\widehat{f_{\textup{RW},w}}(\mathbf{y})=\frac{1}{2^{n}}\;\left(1-2\frac{|\mathbf{y}|}{n}\right)^{w}.

Therefore, fRW,w^(𝐲)=12n(1)w\widehat{f_{\textup{RW},w}}(\mathbf{y})=\frac{1}{2^{n}}(-1)^{w} when |𝐲|=n|\mathbf{y}|=n, as for the Fourier transform of the uniform distribution over a sphere; this shows that fRW,wf_{\textup{RW},w} cannot smooth a code when the full weight vector belongs to its dual. In summary, a direct application of Proposition 3.3 is quite unsatisfactory for the distributions uwu_{w} and fRW,wf_{\textup{RW},w}. If we are willing to also make an assumption on the largest weight of a dual codeword, then a direct application of Proposition 3.3 does provide meaningful smoothing bounds for them. Indeed, the following theorem is obtained by simply combining Propositions 3.3, 3.11 and 3.12.

Theorem 3.14.

Let 𝒞\mathscr{C} be a binary linear code of length nn and ω(0,1)\omega\in(0,1). Suppose that dmin(𝒞)=δnd_{\textup{min}}(\mathscr{C}^{*})=\delta^{*}n and that 𝒞\mathscr{C}^{*} has no element of Hamming weight βn\geq\beta n for some β(δ,1)\beta\in(\delta^{*},1). We have

1nlog2Δ(u,uωn𝒞)maxδτβ{12min{c(δ,τ),d(δ,τ)}+a(ω,τ)}h(ω)\frac{1}{n}\log_{2}\Delta\left(u,u_{\omega n}^{\mathscr{C}}\right)\leq\mathop{\max}\limits_{\delta^{*}\leq\tau\leq\beta}\left\{\frac{1}{2}\min\left\{c(\delta^{*},\tau),d(\delta^{*},\tau)\right\}+a(\omega,\tau)\right\}-h(\omega)
1nlog2Δ(u,fRW,ωn𝒞)maxδτβ{12min{c(δ,τ),d(δ,τ)}+ωlog2|12τ|}\frac{1}{n}\log_{2}\Delta\left(u,f_{\textup{RW},\omega n}^{\mathscr{C}}\right)\leq\mathop{\max}\limits_{\delta^{*}\leq\tau\leq\beta}\left\{\frac{1}{2}\min\left\{c(\delta^{*},\tau),d(\delta^{*},\tau)\right\}+\omega\log_{2}\left|1-2\tau\right|\right\}

where a(,)a(\cdot,\cdot), c(,)c(\cdot,\cdot) and d(,)d(\cdot,\cdot) are defined respectively in Propositions 3.8, 3.11 and 3.12.
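As a sanity check on the Fourier transform of fRW,w recalled before the theorem: for a point y of Hamming weight t, the transform equals, up to the 1/2^n normalization, the expectation of (-1) raised to the number of steps of the walk landing in the support of y, namely (1-2t/n)^w. The Monte Carlo estimate below (ours, purely illustrative) reproduces this value.

```python
import numpy as np

rng = np.random.default_rng(0)
n, w, t = 10, 4, 3           # t is the Hamming weight of y
# Take y supported on the first t coordinates; then <y, e_u> = 1 iff u < t.
u = rng.integers(0, n, size=(200_000, w))
signs = (-1.0) ** np.count_nonzero(u < t, axis=1)
print(signs.mean())           # Monte Carlo estimate
print((1 - 2 * t / n) ** w)   # predicted (1 - 2|y|/n)^w
```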

Avoiding making an assumption on the largest dual codeword: the case of the Bernoulli distribution. Although the Bernoulli distribution has some drawbacks compared to the uniform distribution over a sphere when applying Proposition 3.3 with random codes, it has a nice property concerning large weight codewords: large weight dual codewords make only a negligible contribution to the upper-bound of Proposition 3.3. To see this, let us first recall that

fber,p^(𝐱)=12n(12p)|𝐱|.\widehat{f_{\textup{ber},p}}(\mathbf{x})=\frac{1}{2^{n}}\;(1-2p)^{|\mathbf{x}|}. (15)

Therefore, by Proposition 3.3 we have

Δ(u,fber,p𝒞)t=dmin(𝒞)nNt(𝒞)(12p)2t.\Delta(u,f_{\textup{ber},p}^{\mathscr{C}})\leq\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n}N_{t}(\mathscr{C}^{*})(1-2p)^{2t}}. (16)

On the other hand, we have the following lemma which shows that large weight codewords can only have an exponentially small contribution to the above upper-bound.

Lemma 3.15.

Let 𝒞\mathscr{C} be a linear code of length nn and let t>ndmin(𝒞)/2t>n-d_{\textup{min}}(\mathscr{C})/2. There is at most one codeword of Hamming weight tt.

Proof.

Suppose by contradiction that there exist two distinct codewords 𝐜,𝐜𝒞\mathbf{c},\mathbf{c}^{\prime}\in\mathscr{C} of Hamming weight tt. By using the triangle inequality we obtain (where 𝟏\mathbf{1} denotes the all-one vector)

|𝐜𝐜|\displaystyle|\mathbf{c}-\mathbf{c}^{\prime}| |𝐜𝟏|+|𝟏𝐜|\displaystyle\leq|\mathbf{c}-\mathbf{1}|+|\mathbf{1}-\mathbf{c}^{\prime}|
=2(nt)\displaystyle=2\left(n-t\right)
<dmin(𝒞)\displaystyle<d_{\textup{min}}(\mathscr{C})

which contradicts the fact that 𝒞\mathscr{C} has minimum distance dmin(𝒞)d_{\textup{min}}(\mathscr{C}). ∎

Therefore, using Lemma 3.15 in Equation (16) gives for p(0,1/2]p\in(0,1/2],

Δ(u,fber,p𝒞)t=dmin(𝒞)ndmin(𝒞)/2Nt(𝒞)(12p)2t+2Ω(n).\Delta(u,f_{\textup{ber},p}^{\mathscr{C}})\leq\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n-d_{\textup{min}}(\mathscr{C}^{*})/2}N_{t}(\mathscr{C}^{*})(1-2p)^{2t}}+2^{-\Omega(n)}. (17)

In other words, large weight dual codewords (if they exist) have only an exponentially small contribution to our smoothing bound with the Bernoulli distribution. In principle, we could plug into Equation (17) the bounds on the Nt(𝒞)N_{t}(\mathscr{C}^{*})’s given in Propositions 3.11 and 3.12. We will improve on the bounds obtained in this way by truncating the Bernoulli distribution; we then

  • (i)(i)

    prove that, after appropriate truncation, both distributions have the same smoothing properties,

  • (ii)(ii)

    show that the truncated distribution has the same nice properties with respect to large weights,

  • (iii)(iii)

    show that we can apply Proposition 3.3 to the truncated distribution and get appropriate smoothness properties.

We obtain in this way:

Theorem 3.16.

Let 𝒞\mathscr{C} be a binary linear code of length nn such that dmin(𝒞)δnd_{\textup{min}}(\mathscr{C}^{*})\geq\delta^{*}n for some δ[0,1]\delta^{*}\in[0,1], and let p(0,1/2]p\in(0,1/2] and ε>0\varepsilon>0. We have asymptotically,

1nlog2Δ(u,fber,p𝒞)maxδτ1δ/2{12min{c(δ,τ),d(δ,τ)}+max(1ε)pλ(1+ε)p{λlog2p+(1λ)log2(1p)+a(λ,τ)}}+O(1n)\frac{1}{n}\log_{2}\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right)\leq\mathop{\max}\limits_{\delta^{*}\leq\tau\leq 1-\delta^{*}/2}\{\frac{1}{2}\min\left\{c(\delta^{*},\tau),d(\delta^{*},\tau)\right\}+\\ \mathop{\max}\limits_{(1-\varepsilon)p\leq\lambda\leq(1+\varepsilon)p}\left\{\lambda\log_{2}p+(1-\lambda)\log_{2}(1-p)+a(\lambda,\tau)\right\}\}+O\left(\frac{1}{n}\right)

where a(,)a(\cdot,\cdot), c(,)c(\cdot,\cdot) and d(,)d(\cdot,\cdot) are defined respectively in Propositions 3.8, 3.11 and 3.12.

Proof.

See Appendix C. ∎

Let i{0,1}i\in\{0,1\} and let pip_{i} be the smallest p(0,1/2]p\in(0,1/2] enabling one to reach Δ(u,fber,p𝒞)2Ω(n)\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right)\leq 2^{-\Omega(n)} with

  • Theorem 3.16 when i=0i=0,

  • Equation (17) and Propositions 3.11, 3.12 when i=1i=1.

In Figure 7 we compare these two quantities p0p_{0} and p1p_{1}, i.e. the smallest pp enabling one to reach Δ(u,fber,p𝒞)2Ω(n)\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right)\leq 2^{-\Omega(n)} with Theorem 3.16 and with Equation (17) respectively. As we can see, Theorem 3.16 leads to significantly better bounds. Furthermore, it turns out that p0np_{0}n is roughly equal to the smallest radius ww such that Δ(u,uw𝒞)2Ω(n)\Delta(u,u_{w}^{\mathscr{C}})\leq 2^{-\Omega(n)} under the supposition that no codewords of weight >ndmin(𝒞)>n-d_{\textup{min}}(\mathscr{C}^{*}) belong to 𝒞\mathscr{C}^{*}. In other words, the tweak of truncating the Bernoulli distribution yields a smoothing bound which requires no hypothesis on dual codewords of large Hamming weight, and which is as good as the bound for the uniform distribution over a sphere obtained under that hypothesis.

Figure 7. Smoothing bounds for a code 𝒞\mathscr{C} as function of δ=defdmin(𝒞)/n\delta^{*}\stackrel{{\scriptstyle\text{def}}}{{=}}d_{\textup{min}}(\mathscr{C}^{*})/n via Theorem 3.16 (for ε=102\varepsilon=10^{-2}) and Equation (17).

4. Smoothing Bounds: Lattice Case

Given an nn-dimensional lattice Λ\Lambda, the aim of a smoothing bound is to exhibit a non-trivial model of noise 𝐞n\mathbf{e}\in\mathbb{R}^{n} such that (𝐞modΛ)n/Λ(\mathbf{e}\mod\Lambda)\in\mathbb{R}^{n}/\Lambda (namely the reduction of 𝐞\mathbf{e} modulo Λ\Lambda) is close to uniformly distributed. Following Micciancio and Regev [MR07], the standard choice of noise is given by the Gaussian distribution, defined via

𝐱n,Ds(𝐱)=def1snρs(𝐱)whereρs(𝐱)=defeπ(|𝐱|2/s)2.\forall\mathbf{x}\in\mathbb{R}^{n},\quad D_{s}(\mathbf{x})\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{s^{n}}\;\rho_{s}(\mathbf{x})\quad\mbox{where}\quad\rho_{s}(\mathbf{x})\stackrel{{\scriptstyle\text{def}}}{{=}}e^{-\pi(|\mathbf{x}|_{2}/s)^{2}}\ .

The parametrization is chosen such that sn/2πs\sqrt{n/2\pi} is the standard deviation of DsD_{s}. Micciancio and Regev showed that when 𝐞\mathbf{e} is distributed according to DsD_{s}, choosing ss large enough enables 𝐞modΛ\mathbf{e}\mod\Lambda to be statistically close to the uniform distribution.
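This parametrization is easy to check empirically: per coordinate, Ds has density exp(-pi x^2 / s^2)/s and hence standard deviation s/sqrt(2 pi). The snippet below (ours, purely illustrative) compares the empirical expected norm with s sqrt(n/(2 pi)).

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 64, 2.0
# Each coordinate of D_s is Gaussian with standard deviation s / sqrt(2*pi).
x = rng.normal(0.0, s / np.sqrt(2 * np.pi), size=(100_000, n))
print(np.linalg.norm(x, axis=1).mean())   # empirical expected norm
print(s * np.sqrt(n / (2 * np.pi)))       # predicted s * sqrt(n / (2*pi))
```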

However, following the intuition from the case of codes we will first analyze the case where 𝐞\mathbf{e} is sampled uniformly from a Euclidean ball. Interestingly, just as with codes where our methodology led to stronger bounds when the uniform distribution over a sphere was used to smooth rather than the Bernoulli distribution, we will obtain better results when we work with the uniform distribution over a ball. Fortunately, using concentration of the Gaussian measure one can translate results from the case where 𝐞\mathbf{e} is uniformly distributed over a ball to the case that it is sampled according to DsD_{s}; see Proposition 4.5. This is analogous to the translation from results for the uniform distribution over a sphere to the Bernoulli distribution for codes elucidated in Proposition 3.9.

For either choice of noise, to obtain a smoothing bound we are required to bound the statistical distance between the distribution of 𝐞modΛ\mathbf{e}\mod\Lambda if 𝐞\mathbf{e} has density gg, and the uniform distribution over n/Λ\mathbb{R}^{n}/\Lambda. It is readily seen that 𝐞modΛ\mathbf{e}\mod\Lambda has density |Λ|g|Λ|\Lambda|g^{|\Lambda} which is defined as (see Definition 2.2 with the choice of Haar measures given in Table 2)

g|Λ(𝐱)=1|Λ|𝐲Λg(𝐱+𝐲).g^{|\Lambda}(\mathbf{x})=\frac{1}{|\Lambda|}\;\sum_{\mathbf{y}\in\Lambda}g(\mathbf{x}+\mathbf{y}).

Notation. For any g:ng:\mathbb{R}^{n}\rightarrow\mathbb{C},

gΛ=def|Λ|g|Λ.g^{\Lambda}\stackrel{{\scriptstyle\text{def}}}{{=}}|\Lambda|\;g^{|\Lambda}.

In the following proposition we specialize Corollary 2.4 to the case of lattices.

Proposition 4.1.

Let Λ\Lambda be an nn-dimensional lattice. Let gg be some density function on n\mathbb{R}^{n} and vv be the density of the uniform distribution over n/Λ\mathbb{R}^{n}/\Lambda. We have

Δ(v,gΛ)12𝐱Λ{𝟎}|g^(𝐱)|2.\Delta\left(v,g^{\Lambda}\right)\leq\frac{1}{2}\;\sqrt{\sum_{\mathbf{x}\in\Lambda^{*}\setminus\{\mathbf{0}\}}|\widehat{g}(\mathbf{x})|^{2}}\ .

We will restrict our instantiations to functions gg whose Fourier transforms are radial, that is, g^(𝐱)\widehat{g}(\mathbf{x}) depends only on the Euclidean norm of 𝐱\mathbf{x}, namely |𝐱|2|\mathbf{x}|_{2}.

4.1. Smoothing Random Lattices

As with codes, we begin our investigation of smoothing lattices by considering the random case. However, defining a “random lattice” is much more involved than the analogous notion of random codes. Fortunately for us, we can apply the Siegel version of the Minkowski-Hlawka theorem to conclude that there exists a random lattice model which behaves very nicely from the perspective of “test functions”. We first state the technical theorem that we require.

Theorem 4.2 (Minkowski-Hlawka-Siegel).

On the set of all the lattices of covolume MM in n\mathbb{R}^{n} there exists a probability measure μ\mu such that, for any Riemann integrable function g(𝐱)g(\mathbf{x}) which vanishes outside some bounded region (this statement holds for a larger class of functions; in particular it holds for our instantiation with the Gaussian distribution),

𝔼Λμ(𝐱Λ{𝟎}g(𝐱))=1Mng(𝐱)𝑑𝐱.\mathop{\mathbb{E}}_{\Lambda\sim\mu}\left(\sum_{\mathbf{x}\in\Lambda\setminus\{\mathbf{0}\}}g(\mathbf{x})\right)=\frac{1}{M}\int_{\mathbb{R}^{n}}g(\mathbf{x})d\mathbf{x}\ .

As intuition for the above theorem, consider the case that gg is the indicator function of a bounded, measurable subset SnS\subseteq\mathbb{R}^{n}. Then, Theorem 4.2 promises that the expected number of lattice points (other than the origin; note that as 𝟎Λ\mathbf{0}\in\Lambda holds with certainty, there is really no “randomness” for this event) in SS is equal to the volume of SS divided by the covolume of the lattice.

Uniform Distribution over a Ball. Let

uw=def1wVn(w)u_{w\mathscr{B}}\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1_{{\mathscr{B}}_{w}}}{V_{n}\left(w\right)}

be the density of the uniform distribution over the Euclidean ball of radius ww. Let us recall that Vn(w)V_{n}\left(w\right) denotes the volume of any ball of radius ww. From Theorem 4.2, we may obtain the following proposition. This should be compared with Proposition 3.7.

Proposition 4.3.

On the set of all lattices of covolume MM in n\mathbb{R}^{n} there exists a probability measure ν\nu such that, for any w>0w>0

𝔼Λν(Δ(u,uwΛ))12MVn(w).\mathop{\mathbb{E}}_{\Lambda\sim\nu}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq\frac{1}{2}\;\sqrt{\frac{M}{V_{n}\left(w\right)}}.

In particular, defining

w0=defn/2πeM1/n,w_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}\sqrt{n/2\pi e}\;M^{1/n},

if w>w0w>w_{0} we have

𝔼Λν(Δ(u,uwΛ))O(1)(w0w)n/2.\mathop{\mathbb{E}}_{\Lambda\sim\nu}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq O(1)\;\left(\frac{w_{0}}{w}\right)^{n/2}.
Proof.

We define ν\nu to be the procedure that samples a lattice of covolume M1M^{-1} according to μ\mu, then outputs its dual. In the following chain, we first apply Proposition 4.1; then, Jensen’s inequality; then, the Minkowski-Hlawka-Siegel (MHS) Theorem (Theorem 4.2) to the function |uw^|2|\widehat{u_{w\mathscr{B}}}|^{2}; and, lastly, Parseval’s Identity (Theorem 2.1). This yields:

𝔼Λν(2Δ(u,uwΛ))\displaystyle\mathop{\mathbb{E}}_{\Lambda\sim\nu}\left(2\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right) 𝔼Λμ(𝐱Λ{𝟎}|uw^(𝐱)|2)(Proposition 4.1)\displaystyle\leq\mathop{\mathbb{E}}_{\Lambda^{*}\sim\mu}\left(\sqrt{\sum_{\mathbf{x}\in\Lambda^{*}\setminus\{\mathbf{0}\}}|\widehat{u_{w\mathscr{B}}}(\mathbf{x})|^{2}}\right)\quad\text{(Proposition~{}\ref{propo:FBSDLat})}
𝔼Λμ(𝐱Λ{𝟎}|uw^(𝐱)|2)(Jensen’s Inequality)\displaystyle\leq\sqrt{\mathop{\mathbb{E}}_{\Lambda^{*}\sim\mu}\left(\sum_{\mathbf{x}\in\Lambda^{*}\setminus\{\mathbf{0}\}}|\widehat{u_{w\mathscr{B}}}(\mathbf{x})|^{2}\right)}\quad\text{(Jensen's Inequality)}
=1M1(n|uw^(𝐱)|2𝑑𝐱)(MHS Theorem)\displaystyle=\sqrt{\frac{1}{M^{-1}}\;\left(\int_{\mathbb{R}^{n}}|\widehat{u_{w\mathscr{B}}}(\mathbf{x})|^{2}d\mathbf{x}\right)}\quad\text{(MHS Theorem)}
=Mn|uw(𝐱)|2𝑑𝐱(Parseval’s Identity)\displaystyle=\sqrt{M\int_{\mathbb{R}^{n}}|u_{w\mathscr{B}}(\mathbf{x})|^{2}d\mathbf{x}}\quad\text{(Parseval's Identity)}
=MVn(w)2n1w(𝐱)𝑑𝐱\displaystyle=\sqrt{\frac{M}{V_{n}(w)^{2}}\int_{\mathbb{R}^{n}}1_{{\mathscr{B}}_{w}}(\mathbf{x})d\mathbf{x}}
=MVn(w).\displaystyle=\sqrt{\frac{M}{V_{n}(w)}}.

For the “in particular” part of the proposition, we use Stirling’s estimate to derive

Vn(w)=πn/2wnΓ(n/2+1)=πn/2wn(n2e)n/2(1+o(1))nV_{n}\left(w\right)=\frac{\pi^{n/2}\;w^{n}}{\Gamma(n/2+1)}=\frac{\pi^{n/2}\;w^{n}}{\left(\frac{n}{2e}\right)^{n/2}}\;(1+o(1))^{n}

from which it follows that if

w>w0=n/2πeM1/n,w>w_{0}=\sqrt{n/2\pi e}\;M^{1/n},

we have

MVn(w)O(1)(w0w)n/2\sqrt{\frac{M}{V_{n}(w)}}\leq O(1)\left(\frac{w_{0}}{w}\right)^{n/2}

which concludes the proof. ∎

It is easily verified that the value of w0w_{0} defined in Proposition 4.3 corresponds to the so-called Gaussian heuristic. We view the condition w>w0w>w_{0} as the equivalent of the Gilbert-Varshamov bound for codes, as discussed just below Proposition 3.7. In particular, as we need the support of the noise to have volume at least MM if we hope to smooth a lattice of covolume MM, we see that the uniform distribution over a ball is optimal for smoothing random lattices, just as the uniform distribution over a sphere was optimal for smoothing random codes.
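Concretely, w0 is the radius at which a Euclidean ball has volume roughly M, up to poly(n) factors; the small computation below (ours, purely illustrative, with hypothetical sample parameters) compares the n-th roots of Vn(w0) and M.

```python
import math

def ball_volume(n, w):
    # V_n(w) = pi^(n/2) * w^n / Gamma(n/2 + 1)
    return math.pi ** (n / 2) * w ** n / math.gamma(n / 2 + 1)

n, M = 128, 2.0 ** 128
w0 = math.sqrt(n / (2 * math.pi * math.e)) * M ** (1 / n)
# Up to poly(n) factors, V_n(w0) is close to M: compare the n-th roots.
print(ball_volume(n, w0) ** (1 / n), M ** (1 / n))
```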

Gaussian Noise. We now turn to the case of Gaussian noise. Following the proof of Proposition 4.3 to the point where we apply Parseval’s identity, but replacing uwu_{w\mathscr{B}} by DsD_{s}, we obtain that

𝔼(Δ(u,DsΛ))Mn|Ds(𝐱)|2𝑑𝐱.\mathbb{E}\left(\Delta(u,D_{s}^{\Lambda})\right)\leq\sqrt{M\int_{\mathbb{R}^{n}}|D_{s}(\mathbf{x})|^{2}d\mathbf{x}}\ .

To conclude, one uses the following routine computation

n|Ds(𝐱)|2𝑑𝐱=1s2nne2π(|𝐱|2s)2𝑑𝐱=1s2nnρs/2(𝐱)𝑑𝐱=(1s2)n.\displaystyle\int_{\mathbb{R}^{n}}|D_{s}(\mathbf{x})|^{2}d\mathbf{x}=\frac{1}{s^{2n}}\int_{\mathbb{R}^{n}}e^{-2\pi\left(\frac{|\mathbf{x}|_{2}}{s}\right)^{2}}d\mathbf{x}=\frac{1}{s^{2n}}\int_{\mathbb{R}^{n}}\rho_{s/\sqrt{2}}(\mathbf{x})d\mathbf{x}=\left(\frac{1}{s\sqrt{2}}\right)^{n}.

Thus, we obtain:

Proposition 4.4.

On the set of all the lattices of covolume MM in n\mathbb{R}^{n} there exists a probability measure ν\nu such that, for any s>0s>0,

𝔼Λν(Δ(u,DsΛ))12M(s2)n.\mathop{\mathbb{E}}_{\Lambda\sim\nu}\left(\Delta(u,D_{s}^{\Lambda})\right)\leq\frac{1}{2}\;\sqrt{\frac{M}{\left(s\sqrt{2}\right)^{n}}}\ .

In particular, if s>s0=defM1/n/2s>s_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}M^{1/n}/\sqrt{2}, we have

𝔼Λν(Δ(u,DsΛ))(s0s)n/2.\mathop{\mathbb{E}}_{\Lambda\sim\nu}\left(\Delta(u,D_{s}^{\Lambda})\right)\leq\left(\frac{s_{0}}{s}\right)^{n/2}.
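The routine computation behind Proposition 4.4 factors over coordinates; the check below (ours, purely illustrative) numerically integrates the square of the one-dimensional density and compares it with 1/(s sqrt(2)), whose n-th power gives (s sqrt(2))^(-n).

```python
import numpy as np

s = 1.7
x, dx = np.linspace(-30.0, 30.0, 600_001, retstep=True)
# Per coordinate: integral of (exp(-pi (x/s)^2) / s)^2 dx = 1 / (s * sqrt(2)).
f = (np.exp(-np.pi * (x / s) ** 2) / s) ** 2
print(f.sum() * dx, 1 / (s * np.sqrt(2)))   # simple Riemann sum vs. closed form
```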

To compare Propositions 4.3 and 4.4, we note that a random vector sampled according to DsD_{s} has an expected Euclidean norm given by sΓ(n+12)πΓ(n2)sn2πs\frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{\pi}\Gamma\left(\frac{n}{2}\right)}\sim s\sqrt{\frac{n}{2\pi}}. So, it is fair to compare the effectiveness of smoothing with a parameter ss Gaussian distribution and the uniform distribution over a ball of radius sn2πs\sqrt{\frac{n}{2\pi}}. We note that, if s0s_{0} is as in Proposition 4.4 and w0w_{0} is the radius of the so-called Gaussian heuristic, then

s0n2π=M1/n2n2π=w0e/2.s_{0}\sqrt{\frac{n}{2\pi}}=\frac{M^{1/n}}{\sqrt{2}}\sqrt{\frac{n}{2\pi}}=w_{0}\;\sqrt{e/2}.

Thus, we conclude that the parameter s0s_{0} from Proposition 4.4 is larger than what we could hope for by a factor e/2\sqrt{e/2}.

4.2. Connecting Uniform Ball Distribution to Gaussian

However, recall that in the code-case we argued that, as the Hamming weight of a vector sampled according to the Bernoulli distribution is tightly concentrated, we could obtain the same smoothing bound for the Bernoulli distribution as we did for the uniform sphere distribution, essentially by showing that we can approximate a Bernoulli distribution by a convex combination of uniform sphere distributions. Similarly, we can relate the Gaussian distribution to the uniform distribution over a ball, and thereby remove this additional e/2\sqrt{e/2} factor.

We state a general proposition that allows us to translate smoothing bounds for the uniform ball distribution to the Gaussian distribution. It guarantees that if the uniform ball distribution smooths whenever w>w0w>w_{0}, the Gaussian distribution smooths whenever s>w02πns>w_{0}\;\sqrt{\frac{2\pi}{n}}. While the intuition for the argument is the same as that which we used in the code-case, the argument is itself a bit more sophisticated.

Proposition 4.5.

Let Λ\Lambda be a random lattice of covolume MM and let u=defun/Λu\stackrel{{\scriptstyle\text{def}}}{{=}}u_{\mathbb{R}^{n}/\Lambda} be the uniform distribution over its cosets. Suppose that there is a function f(n)f(n) such that, for all w>w0w>w_{0},

𝔼Λ(Δ(u,uwΛ))f(n)(w0w)n/2.\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq f(n)\left(\frac{w_{0}}{w}\right)^{n/2}.

Let s0=defw02πns_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}w_{0}\sqrt{\frac{2\pi}{n}}. Then, for all s>s0s>s_{0}, defining η=def1s0s(0,1)\eta\stackrel{{\scriptstyle\text{def}}}{{=}}1-\frac{s_{0}}{s}\in(0,1), we have

𝔼Λ(Δ(u,DsΛ))exp(η28n)+f(n)(s0s)n/4.\mathbb{E}_{\Lambda}\left(\Delta(u,D_{s}^{\Lambda})\right)\leq\exp(-\frac{\eta^{2}}{8}\;n)+f(n)\left(\frac{s_{0}}{s}\right)^{n/4}.
Proof.

See Appendix D. ∎

Combining the above proposition with Proposition 4.3, setting f(n)=O(1)f(n)=O(1), we obtain the following theorem.

Theorem 4.6.

Let Λ\Lambda be a random lattice of covolume MM sampled according to ν\nu, let u=defun/Λu\stackrel{{\scriptstyle\text{def}}}{{=}}u_{\mathbb{R}^{n}/\Lambda} be the uniform distribution over its cosets, and let

s0=defM1/n/e.s_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}M^{1/n}/\sqrt{e}.

Then, for any s>s0s>s_{0}, setting η=def1s0s(0,1)\eta\stackrel{{\scriptstyle\text{def}}}{{=}}1-\frac{s_{0}}{s}\in(0,1), we have

𝔼Λ(Δ(u,DsΛ))exp(η28n)+O(1)(s0s)n/4.\mathbb{E}_{\Lambda}\left(\Delta(u,D_{s}^{\Lambda})\right)\leq\exp(-\frac{\eta^{2}}{8}\;n)+O(1)\left(\frac{s_{0}}{s}\right)^{n/4}.

4.3. Smoothing Random qq-ary Lattices

While the method of sampling lattices promised by the Minkowski-Hlawka-Siegel Theorem (Theorem 4.2) is indeed very convenient for computations, it does not tell us much about how to explicitly sample from the distribution. Furthermore, it is not very relevant if one is interested in the random lattices that are used in cryptography.

For a more concrete sampling procedure that is relevant to cryptography, we can consider the randomized Construction A (or, more precisely, its dual), which gives a very popular random model of lattices which are easily constructed from random codes. Specifically, for a prime qq and a linear code 𝒞(/q)n\mathscr{C}\subseteq(\mathbb{Z}/q\mathbb{Z})^{n} we obtain a lattice as follows. First, we “lift” the codewords 𝐜𝒞\mathbf{c}\in\mathscr{C} to vectors in n\mathbb{R}^{n} in the natural way by identifying /q\mathbb{Z}/q\mathbb{Z} with the set {0,1,,q1}\{0,1,\dots,q-1\}; denote the lifted vector as 𝐜~\widetilde{\mathbf{c}}. Then, we can define the following lattice

Λ𝒞=def{𝐜~:𝐜𝒞}+qn.\Lambda_{\mathscr{C}}\stackrel{{\scriptstyle\text{def}}}{{=}}\{\widetilde{\mathbf{c}}:\mathbf{c}\in\mathscr{C}\}+q\mathbb{Z}^{n}.

In other words: Λ𝒞\Lambda_{\mathscr{C}} consists of all vectors in the integer lattice n\mathbb{Z}^{n} whose reductions modulo qq give an element of 𝒞\mathscr{C}.
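For concreteness, here is a minimal sketch (ours, with small hypothetical parameters q, k, n) of this membership characterization: an integer vector lies in Λ𝒞 exactly when its reduction modulo q is a codeword of 𝒞.

```python
import itertools
import numpy as np

q, k, n = 5, 2, 4
rng = np.random.default_rng(1)
G = rng.integers(0, q, size=(k, n))          # generator matrix of the code C
codewords = {tuple(np.mod(np.array(m) @ G, q))
             for m in itertools.product(range(q), repeat=k)}

def in_construction_A(x):
    # x belongs to Lambda_C iff x is an integer vector whose reduction
    # modulo q is a codeword of C.
    return tuple(np.mod(x, q)) in codewords

c = np.array(next(iter(codewords)))
assert in_construction_A(c)                                 # a lifted codeword
assert in_construction_A(c + q * np.eye(n, dtype=int)[0])   # shifted by q*e_1
```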

Fix integers 1kn1\leq k\leq n, a prime qq and a desired covolume MM. We sample a random lattice Λ\Lambda as follows

  • First, sample a random linear code 𝒞(/q)n\mathscr{C}\subseteq(\mathbb{Z}/q\mathbb{Z})^{n} of dimension kk (recall this means that we sample a random k×nk\times n matrix 𝐆\mathbf{G} and define 𝒞={𝐦𝐆:𝐦(/q)k}\mathscr{C}=\{\mathbf{m}\mathbf{G}:\mathbf{m}\in(\mathbb{Z}/q\mathbb{Z})^{k}\}),

  • Then, we scale Λ𝒞\Lambda_{\mathscr{C}} by 1M1/n1q1k/n\frac{1}{M^{1/n}}\;\frac{1}{q^{1-k/n}},

  • Lastly, we output the dual of 1M1/n1q1k/nΛ𝒞\frac{1}{M^{1/n}}\;\frac{1}{q^{1-k/n}}\Lambda_{\mathscr{C}}.

Notice that the scaling is chosen so that, as long as 𝐆\mathbf{G} is of full rank, the lattice Λ\Lambda we output has the desired covolume MM. We denote this procedure of sampling Λ\Lambda by νA\nu_{\textup{A}} (the dependence on qq, kk and nn is left implicit).

The important fact is that, up to an error term (which decreases as qq increases), the expected number of lattice points from Λ\Lambda^{*} in a Euclidean ball of radius rr is roughly MVn(r)M\,V_{n}\left(r\right), as one would hope.

Proposition 4.7 ([Zam14, Lemma 7.9.2]).

For every n2n\geq 2, 1k<n1\leq k<n and prime power qq, for ΛνA\Lambda\sim\nu_{\textup{A}} the expected number of lattice points from Λ\Lambda^{*} in a Euclidean ball of radius w=deftnw\stackrel{{\scriptstyle\text{def}}}{{=}}t\sqrt{n} satisfies

𝔼Λ(Nw(Λ))MVn(w)n=1±δ/twhere δ=def1q1k/n.\sqrt[n]{\frac{\mathbb{E}_{\Lambda}(N_{\leq w}(\Lambda^{*}))}{M\,V_{n}\left(w\right)}}=1\pm\delta/t\quad\mbox{where }\delta\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{q^{1-k/n}}.

We now turn to bounding the expected statistical distance between uu and uwΛu_{w\mathscr{B}}^{\Lambda}, where ΛνA\Lambda\sim\nu_{\textup{A}} and w>0w>0 is the radius of the Euclidean ball from which the noise is uniformly sampled. First, we state an explicit formula for the Fourier transform of 1w1_{{\mathscr{B}}_{w}}, the indicator function of a Euclidean ball of radius ww, in terms of Bessel functions.

Notation 4.8.

For a real number μ>0\mu>0, we denote by Jμ:J_{\mu}:\mathbb{R}\to\mathbb{R} the Bessel function of the first kind of order μ\mu.

The important fact concerning Bessel functions that we will use is the following.

Fact 4.9.

We have

1w^(𝐲)=(w|𝐲|2)n/2Jn/2(2πw|𝐲|2).\displaystyle\widehat{1_{{\mathscr{B}}_{w}}}(\mathbf{y})=\left(\frac{w}{|\mathbf{y}|_{2}}\right)^{n/2}J_{n/2}(2\pi w|\mathbf{y}|_{2}). (18)

We will refrain from providing an explicit formula for Bessel functions, and instead use the following upper-bound as a black-box.

Proposition 4.10 ([Kra06]).

For any xx\in\mathbb{R} we have

|Jn/2(x)||x|1/3.|J_{n/2}(x)|\leq|x|^{-1/3}.
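This bound is easy to probe numerically. The snippet below (ours, purely illustrative, assuming SciPy's scipy.special.jv is available) checks the inequality on a grid; of course, a grid check is a sanity check rather than a proof.

```python
import numpy as np
from scipy.special import jv   # Bessel function of the first kind J_mu

n = 10
x = np.linspace(0.05, 100.0, 20_000)
# Grid check of |J_{n/2}(x)| <= |x|^(-1/3).
assert np.all(np.abs(jv(n / 2, x)) <= x ** (-1.0 / 3.0))
print("bound holds on the sampled grid")
```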

Using this proposition, we first prove a technical lemma that will be reused when we discuss smoothing arbitrary lattices. In order to state the lemma, we introduce the following auxiliary function.

Notation 4.11.

For a real w>0w>0, we define gw:g_{w}:\mathbb{R}\to\mathbb{R} via

gw(t)=def1Vn(w)1w^(𝐱)2g_{w}(t)\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{V_{n}\left(w\right)}\widehat{{1}_{{\mathscr{B}}_{w}}}(\mathbf{x})^{2}

where 𝐱\mathbf{x} is any vector in n\mathbb{R}^{n} of norm tt. Note that as 1w^(𝐱)\widehat{{1}_{{\mathscr{B}}_{w}}}(\mathbf{x}) depends only on |𝐱|2|\mathbf{x}|_{2}, this is indeed well-defined.

The following lemma leverages Proposition 4.10 to upper-bound gwg_{w} on a closed interval.

Lemma 4.12.

For any w>0w>0, any a>0a>0 and b=(1+1n)ab=\left(1+\frac{1}{n}\right)a we have, for some constant C>0C>0

maxatbgw(t)CVn(b)w2/31a2/3.\max_{a\leq t\leq b}g_{w}(t)\leq\frac{C}{V_{n}\left(b\right)w^{2/3}}\;\frac{1}{a^{2/3}}.
Proof.

First, we notice that for all t[a,b]t\in[a,b]

Vn(t)=(tb)nVn(b)(ab)nVn(b)=(1+1n)nVn(b)1CVn(b)\displaystyle V_{n}\left(t\right)=\left(\frac{t}{b}\right)^{n}V_{n}\left(b\right)\geq\left(\frac{a}{b}\right)^{n}V_{n}\left(b\right)=\left(1+\frac{1}{n}\right)^{-n}V_{n}\left(b\right)\geq\frac{1}{C^{\prime}}V_{n}\left(b\right)

for some constant C>0C^{\prime}>0. We now use Proposition 4.10 to derive

maxatbgw(t)CVn(b)maxatbJn/2(2πwt)2CVn(b)w2/31a2/3\displaystyle\max_{a\leq t\leq b}\;g_{w}(t)\leq\frac{C^{\prime}}{V_{n}\left(b\right)}\;\max_{a\leq t\leq b}J_{n/2}(2\pi wt)^{2}\leq\frac{C}{V_{n}\left(b\right)w^{2/3}}\;\frac{1}{a^{2/3}}

for an appropriate constant C>0C>0 which concludes the proof. ∎

We now provide the main theorem of this section. It demonstrates that to smooth our ensemble of random qq-ary lattices (in expectation) with the uniform distribution over the ball of radius ww, it still suffices to choose w>w0=defn/2πeM1/nw>w_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}\sqrt{n/2\pi e}\;M^{1/n}, assuming qq is not too small.

Theorem 4.13.

Let n>2n>2 and 1k<n1\leq k<n. Let qq be a prime and set γ=defn3/2q1k/n\gamma\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{n^{3/2}}{q^{1-k/n}}. Let ΛνA\Lambda\sim\nu_{\textup{A}}. For some constant C>0C>0, we have

𝔼Λ(Δ(u,uwΛ))C(nw)1/3eγ/2MVn(w).\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq C\left(\frac{n}{w}\right)^{1/3}e^{\gamma/2}\sqrt{\frac{M}{V_{n}\left(w\right)}}.

In particular, if w>w0=defn/2πeM1/nw>w_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}\sqrt{n/2\pi e}\,M^{1/n}, we have

𝔼Λ(Δ(u,uwΛ))O((nw)1/3eγ/2)(w0w)n/2.\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq O\left(\left(\frac{n}{w}\right)^{1/3}e^{\gamma/2}\right)\left(\frac{w_{0}}{w}\right)^{n/2}.
Proof.

Let tj=def(1+1n)jt_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\left(1+\frac{1}{n}\right)^{j} for jj\in\mathbb{N} and

Nj=def{𝐱Λ:tj|𝐱|2<tj+1};φj=defmaxtjttj+1gw(t).N_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\sharp\{\mathbf{x}^{*}\in\Lambda^{*}:t_{j}\leq|\mathbf{x}^{*}|_{2}<t_{j+1}\}\quad;\quad\varphi_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\max_{t_{j}\leq t\leq t_{j+1}}g_{w}(t).

Now, we apply Proposition 4.1 and the above definitions to obtain

𝔼Λ(2Δ(u,uwΛ))\displaystyle\mathbb{E}_{\Lambda}\left(2\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right) 𝔼Λ(𝐱Λ{𝟎}|uw^(𝐱)|2)\displaystyle\leq\mathbb{E}_{\Lambda}\left(\sqrt{\sum_{\mathbf{x}\in\Lambda^{*}\setminus\{\mathbf{0}\}}|\widehat{u_{w\mathscr{B}}}(\mathbf{x})|^{2}}\right)
1Vn(w)𝔼Λ(𝐱Λ{𝟎}gw(𝐱))(Jensen’s inequality)\displaystyle\leq\sqrt{\frac{1}{V_{n}\left(w\right)}\mathbb{E}_{\Lambda}\left(\sum_{\mathbf{x}\in\Lambda^{*}\setminus\{\mathbf{0}\}}g_{w}(\mathbf{x})\right)}\quad(\mbox{Jensen's inequality})
1Vn(w)𝔼Λ(j=0Njφj)\displaystyle\leq\sqrt{\frac{1}{V_{n}\left(w\right)}\mathbb{E}_{\Lambda}\left(\sum_{j=0}^{\infty}N_{j}\varphi_{j}\right)}
1Vn(w)j=0𝔼(Ntj+1(Λ))φj.\displaystyle\leq\sqrt{\frac{1}{V_{n}\left(w\right)}\sum_{j=0}^{\infty}\mathbb{E}\left(N_{\leq t_{j+1}}(\Lambda^{*})\right)\varphi_{j}}\ .

By Proposition 4.7, we may upper-bound

𝔼Λ(Ntj+1(Λ))MVn(tj+1)(1+(n(1+1n)j+1q1k/n))n.\mathbb{E}_{\Lambda}\left(N_{\leq t_{j+1}}(\Lambda^{*})\right)\leq M\;V_{n}\left(t_{j+1}\right)\left(1+\left(\frac{\sqrt{n}}{\left(1+\frac{1}{n}\right)^{j+1}q^{1-k/n}}\right)\right)^{n}\ . (19)

Now, recalling γ=n3/2q1k/n\gamma=\frac{n^{3/2}}{q^{1-k/n}} we have for any j0j\geq 0

(1+(n(1+1n)j+1q1k/n))n(1+(nq1k/n))nennq1k/n=eγ.\left(1+\left(\frac{\sqrt{n}}{\left(1+\frac{1}{n}\right)^{j+1}q^{1-k/n}}\right)\right)^{n}\leq\left(1+\left(\frac{\sqrt{n}}{q^{1-k/n}}\right)\right)^{n}\leq e^{n\frac{\sqrt{n}}{q^{1-k/n}}}=e^{\gamma}.

Thus, we conclude

𝔼Λ(2Δ(u,uwΛ))eγMVn(w)j=0Vn(tj+1)φj.\mathbb{E}_{\Lambda}\left(2\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq\sqrt{\frac{e^{\gamma}M}{V_{n}\left(w\right)}\sum_{j=0}^{\infty}V_{n}\left(t_{j+1}\right)\varphi_{j}}\ .

Now, by Lemma 4.12 we have φjC1Vn(tj+1)w2/31tj2/3\varphi_{j}\leq\frac{C_{1}}{V_{n}\left(t_{j+1}\right)w^{2/3}}\frac{1}{t_{j}^{2/3}} for all j0j\geq 0. Hence,

j=0Vn(tj+1)φj\displaystyle\sum_{j=0}^{\infty}V_{n}\left(t_{j+1}\right)\varphi_{j} C1w2/3j=0Vn(tj+1)Vn(tj+1)1tj2/3\displaystyle\leq\frac{C_{1}}{w^{2/3}}\sum_{j=0}^{\infty}\frac{V_{n}\left(t_{j+1}\right)}{V_{n}\left(t_{j+1}\right)}\frac{1}{t_{j}^{2/3}}
=C1w2/3j=01(1+1/n)2j/3\displaystyle=\frac{C_{1}}{w^{2/3}}\sum_{j=0}^{\infty}\frac{1}{(1+1/n)^{2j/3}}
=C1w2/311(1+1/n)2/3\displaystyle=\frac{C_{1}}{w^{2/3}}\frac{1}{1-(1+1/n)^{-2/3}}
C2n2/3w2/3,\displaystyle\leq\frac{C_{2}\;n^{2/3}}{w^{2/3}}\ ,

for an appropriate constant C2>0C_{2}>0. Thus, putting everything together we derive

𝔼Λ(Δ(u,uwΛ))eγM2Vn(w)C2n2/3w2/3C(nw)1/3eγ/2MVn(w)\displaystyle\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq\sqrt{\frac{e^{\gamma}M}{2V_{n}\left(w\right)}\;\frac{C_{2}n^{2/3}}{w^{2/3}}}\leq C\left(\frac{n}{w}\right)^{1/3}e^{\gamma/2}\sqrt{\frac{M}{V_{n}\left(w\right)}}

for some constant C>0C>0. The “in particular” part of the Theorem follows analogously to the corresponding argumentation (Stirling’s estimate) used in the proof of Proposition 4.3. ∎

Next, turning to Gaussian noise, we could again prove a smoothing bound “directly,” but this will lose the same factor of e/2\sqrt{e/2} as we had earlier. Instead, we apply Proposition 4.5 with the function f(n)=O((nw)1/3eγ/2)f(n)=O\left(\left(\frac{n}{w}\right)^{1/3}e^{\gamma/2}\right) to conclude the following.

Theorem 4.14.

Let n>2n>2 and 1k<n1\leq k<n. Let qq be a prime and set γ=defn3/2q1k/n\gamma\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{n^{3/2}}{q^{1-k/n}}. Let Λ\Lambda be a random qq-ary lattice sampled according to νA\nu_{A}, let u=un/Λu=u_{\mathbb{R}^{n}/\Lambda} be the uniform distribution over its cosets, and let

s0=defM1/n/e.s_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}M^{1/n}/\sqrt{e}.

Then, for any s>s0s>s_{0}, setting η=def1s0s(0,1)\eta\stackrel{{\scriptstyle\text{def}}}{{=}}1-\frac{s_{0}}{s}\in(0,1), we have

𝔼Λ(Δ(u,DsΛ))exp(η28n)+O(1)(s0/s)n/4eγ/2.\mathbb{E}_{\Lambda}\left(\Delta\left(u,D_{s}^{\Lambda}\right)\right)\leq\exp\left(-\frac{\eta^{2}}{8}\;n\right)+O(1)\;(s_{0}/s)^{n/4}\;e^{\gamma/2}.

4.4. Smoothing Arbitrary Lattices

We now turn our attention to the task of smoothing arbitrary lattices.

Analogously to how we used the minimum distance of the dual code to give our smoothing bound for worst-case codes, we will use the shortest vector of the dual lattice in order to provide our smoothing bound for worst-case lattices. The lemma that we will apply is the following where

CKL=def20.401.C_{\textup{KL}}\stackrel{{\scriptstyle\text{def}}}{{=}}2^{0.401}.
Lemma 4.15 ([PS09, Lemma 3]).

For any nn-dimensional lattice Λ\Lambda,

tλ1(Λ),Nt(Λ)Vn(t)Vn(λ1(Λ))CKLn(1+o(1)).\forall t\geq\lambda_{1}(\Lambda),\quad N_{\leq t}(\Lambda)\leq\frac{V_{n}\left(t\right)}{V_{n}\left(\lambda_{1}(\Lambda)\right)}\;C_{\textup{KL}}^{n(1+o(1))}.
Remark 4.16.

This lemma is a consequence of the Kabatiansky-Levenshtein bound [KL78] on the size of spherical codes, historically known as the “second linear programming bound”. This is why we refer to the bound of Lemma 4.15 as the second linear programming bound.

We begin by considering the effectiveness of smoothing with noise uniformly sampled from the ball. The following theorem is proved using similar techniques to those we used for Theorem 4.13, although instead of using Proposition 4.7 to bound the Nt(Λ)N_{\leq t}(\Lambda^{*})’s, we use Lemma 4.15.

Theorem 4.17.

Let Λ\Lambda be an nn-dimensional lattice and u=defun/Λu\stackrel{{\scriptstyle\text{def}}}{{=}}u_{\mathbb{R}^{n}/\Lambda} be the uniform distribution over its cosets. Then, it holds that

Δ(u,uwΛ)CKLn(1+o(1))Vn(λ1(Λ))Vn(w).\Delta\left(u,u_{w\mathscr{B}}^{\Lambda}\right)\leq\sqrt{\frac{C_{\textup{KL}}^{n(1+o(1))}}{V_{n}\left(\lambda_{1}(\Lambda^{*})\right)\;V_{n}\left(w\right)}}.

In particular, setting

w0=defnCKL1+o(1/n)2πeλ1(Λ)w_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}n\;\frac{C_{\textup{KL}}^{1+o(1/n)}}{2\pi\;e\;\lambda_{1}(\Lambda^{*})}

for all w>w0w>w_{0}, it holds that

Δ(u,uwΛ)O(1)(w0/w)n/2.\Delta\left(u,u_{w\mathscr{B}}^{\Lambda}\right)\leq O(1)(w_{0}/w)^{n/2}.
Proof.

Define

t0=defλ1(Λ),tj+1=def(1+1n)tjandφj=defmaxtjttj+1{gw(t)} for j0,t_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}\lambda_{1}(\Lambda^{*}),\quad t_{j+1}\stackrel{{\scriptstyle\text{def}}}{{=}}\left(1+\tfrac{1}{n}\right)t_{j}\quad\mbox{and}\quad\varphi_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\max_{t_{j}\leq t\leq t_{j+1}}\{g_{w}(t)\}~{}~{}\text{ for }j\geq 0,

where we recall the definition of gw(t)=1Vn(w)1w^(𝐱)2g_{w}(t)=\frac{1}{V_{n}\left(w\right)}\widehat{{1}_{{\mathscr{B}}_{w}}}(\mathbf{x})^{2} with |𝐱|2=t|\mathbf{x}|_{2}=t (see Notation 4.11). We also define

Nj=def{𝐱Λ:tj|𝐱|2tj+1}.N_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\sharp\{\mathbf{x}^{*}\in\Lambda^{*}:t_{j}\leq|\mathbf{x}^{*}|_{2}\leq t_{j+1}\}\ .

With this notation and Proposition 4.1 we have

2Δ(u,uwΛ)\displaystyle 2\Delta\left(u,u_{w\mathscr{B}}^{\Lambda}\right) 𝐱Λ{𝟎}|uw^(𝐱)|2\displaystyle\leq\sqrt{\sum_{\mathbf{x}\in\Lambda^{*}\setminus\{\mathbf{0}\}}|\widehat{u_{w\mathscr{B}}}(\mathbf{x})|^{2}}
1Vn(w)𝐱Λ{𝟎}gw(𝐱)\displaystyle\leq\sqrt{\frac{1}{V_{n}\left(w\right)}\sum_{\mathbf{x}\in\Lambda^{*}\setminus\{\mathbf{0}\}}g_{w}(\mathbf{x})}
1Vn(w)j=0Njφj\displaystyle\leq\sqrt{\frac{1}{V_{n}\left(w\right)}\sum_{j=0}^{\infty}N_{j}\varphi_{j}}
1Vn(w)j=0Ntj+1(Λ)φj.\displaystyle\leq\sqrt{\frac{1}{V_{n}\left(w\right)}\sum_{j=0}^{\infty}N_{\leq t_{j+1}}(\Lambda^{*})\varphi_{j}}\ . (20)

By Lemma 4.12, for some constant C1>0C_{1}>0, we obtain

φjC1Vn(tj+1)w2/31tj2/3.\varphi_{j}\leq\frac{C_{1}}{V_{n}(t_{j+1})w^{2/3}}\frac{1}{t_{j}^{2/3}}\ .

Combining this with the upper-bound on Ntj+1(Λ)N_{\leq t_{j+1}}(\Lambda^{*}) provided by Lemma 4.15 (note that tj+1λ1(Λ)t_{j+1}\geq\lambda_{1}(\Lambda^{*}) for all j0j\geq 0), we find

j=0Ntj+1(Λ)φj\displaystyle\sum_{j=0}^{\infty}N_{\leq t_{j+1}}(\Lambda^{*})\varphi_{j} j=0Vn(tj+1)Vn(λ1(Λ))CKLn(1+o(1))C1Vn(tj+1)w2/31tj2/3\displaystyle\leq\sum_{j=0}^{\infty}\frac{V_{n}(t_{j+1})}{V_{n}(\lambda_{1}(\Lambda^{*}))}C_{\textup{KL}}^{n(1+o(1))}\;\frac{C_{1}}{V_{n}(t_{j+1})w^{2/3}}\;\frac{1}{t_{j}^{2/3}}
=CKLn(1+o(1))Vn(λ1(Λ))w2/3j=01tj2/3\displaystyle=\frac{C_{\textup{KL}}^{n(1+o(1))}}{V_{n}(\lambda_{1}(\Lambda^{*}))w^{2/3}}\;\sum_{j=0}^{\infty}\frac{1}{t_{j}^{2/3}}
=CKLn(1+o(1))Vn(λ1(Λ))w2/3j=01λ1(Λ)2/3(1+1n)2j/3\displaystyle=\frac{C_{\textup{KL}}^{n(1+o(1))}}{V_{n}(\lambda_{1}(\Lambda^{*}))w^{2/3}}\;\sum_{j=0}^{\infty}\frac{1}{\lambda_{1}(\Lambda^{*})^{2/3}\left(1+\frac{1}{n}\right)^{2j/3}}
CKLn(1+o(1))Vn(λ1(Λ))w2/3(nwλ1(Λ))2/3.\displaystyle\leq\frac{C_{\textup{KL}}^{n(1+o(1))}}{V_{n}(\lambda_{1}(\Lambda^{*}))w^{2/3}}\;\left(\frac{n}{w\lambda_{1}(\Lambda^{*})}\right)^{2/3}.

In the above, all necessary constants were absorbed into the CKLo(n)C_{\textup{KL}}^{o(n)} term. Combining this with (20), we obtain the first part of the theorem. The “in particular” part again follows using Stirling’s approximation. ∎
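To get a feel for the scale of Theorem 4.17, the computation below (ours, purely illustrative, for a hypothetical input with λ1(Λ*) = 1) compares the worst-case smoothing radius w0, which grows linearly in n, with the Gaussian heuristic radius of a covolume-1 lattice, which grows only like sqrt(n).

```python
import math

C_KL = 2 ** 0.401
for n in (128, 256, 512):
    lam1_dual = 1.0   # hypothetical: lambda_1 of the dual lattice
    w0_worst = n * C_KL / (2 * math.pi * math.e * lam1_dual)   # Theorem 4.17
    w0_gh = math.sqrt(n / (2 * math.pi * math.e))              # Gaussian heuristic
    print(n, round(w0_worst, 2), round(w0_gh, 2))
```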

Next, we can consider the effectiveness of smoothing with the Gaussian distribution. As usual, we could follow the steps of the proof of Theorem 4.17 and obtain the same result, but with an additional multiplicative factor of e2\sqrt{\frac{e}{2}}. That is, we obtain

Theorem 4.18.

Let Λ\Lambda be an nn-dimensional lattice and u=defun/Λu\stackrel{{\scriptstyle\text{def}}}{{=}}u_{\mathbb{R}^{n}/\Lambda} be the uniform distribution over its cosets. Then, it holds that

Δ(u,DsΛ)CKLn(1+o(1))Vn(λ1(Λ))Vn(sn/(2π))(e2)n/2.\Delta\left(u,D_{s}^{\Lambda}\right)\leq\sqrt{\frac{C_{\textup{KL}}^{n(1+o(1))}}{V_{n}\left(\lambda_{1}(\Lambda^{*})\right)\;V_{n}\left(s\sqrt{n/(2\pi)}\right)}\;\left(\frac{e}{2}\right)^{n/2}}.

In particular, setting

s0=defnCKL1+o(1/n)2πeλ1(Λ),s_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}\sqrt{n}\;\frac{C_{\textup{KL}}^{1+o(1/n)}}{2\sqrt{\pi e}\;\lambda_{1}(\Lambda^{*})},

it holds for any s>s0s>s_{0} that Δ(u,DsΛ)O(1)(s0/s)n/2\Delta\left(u,D_{s}^{\Lambda}\right)\leq O(1)\;(s_{0}/s)^{n/2}.

However, as usual it is more effective to combine the bound for the uniform ball distribution with a decomposition of the Gaussian as a convex combination of uniform ball distributions, i.e. to apply Proposition 4.5. In this way, we obtain the following theorem, improving the smoothing bound s0s_{0} by another e/2\sqrt{e/2} factor. In the following theorem, we set the function f(n)f(n) of Proposition 4.5 to the O(1)O(1) term in the bound of Theorem 4.17.

Theorem 4.19.

Let Λ\Lambda be an nn-dimensional lattice, u=defun/Λu\stackrel{{\scriptstyle\text{def}}}{{=}}u_{\mathbb{R}^{n}/\Lambda} the uniform distribution over its cosets, and

s0=defnCKL1+o(1/n)2πeλ1(Λ).s_{0}\stackrel{{\scriptstyle\text{def}}}{{=}}\sqrt{n}\;\frac{C_{\textup{KL}}^{1+o(1/n)}}{\sqrt{2\pi}\;e\;\lambda_{1}(\Lambda^{*})}\ .

Then, for any s>s0s>s_{0} and letting η=def1s0s(0,1)\eta\stackrel{{\scriptstyle\text{def}}}{{=}}1-\frac{s_{0}}{s}\in(0,1), it holds that

Δ(u,DsΛ)exp(η28n)+O(1)(s0s)n/4.\Delta\left(u,D_{s}^{\Lambda}\right)\leq\exp\left(-\frac{\eta^{2}}{8}\;n\right)+O(1)\;\left(\frac{s_{0}}{s}\right)^{n/4}.

5. Acknowledgement

We would like to thank Iosif Pinelis for help with the proof of Proposition 4.5.

References

  • [ABL01] Alexei E. Ashikhmin, Alexander Barg, and Simon Litsyn. Estimates of the distance distribution of codes and designs. Electron. Notes Discret. Math., 6:4–14, 2001.
  • [ACKL05] Alexei E. Ashikhmin, Gérard D. Cohen, Michael Krivelevich, and Simon Litsyn. Bounds on distance distributions in codes of known size. IEEE Trans. Inf. Theory, 51(1):250–258, 2005.
  • [ADRS15] Divesh Aggarwal, Daniel Dadush, Oded Regev, and Noah Stephens-Davidowitz. Solving the shortest vector problem in 2n2^{n} time using discrete Gaussian sampling. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pages 733–742, 2015.
  • [Ale11] Michael Alekhnovich. More on average case vs approximation complexity. Computational Complexity, 20(4):755–786, 2011.
  • [Ban93] Wojciech Banaszczyk. New bounds in some transference theorems in the geometry of numbers. Mathematische Annalen, 296(1):625–635, 1993.
  • [Bas65] LA Bassalygo. New upper bounds for codes correcting errors. Probl. Peredachi Inform, 1(4):41–44, 1965.
  • [BLVW19] Zvika Brakerski, Vadim Lyubashevsky, Vinod Vaikuntanathan, and Daniel Wichs. Worst-case hardness for LPN and cryptographic hashing via code smoothing. In Annual international conference on the theory and applications of cryptographic techniques, pages 619–635. Springer, 2019.
  • [CE03] Henry Cohn and Noam Elkies. New upper bounds on sphere packings I. Ann. of Math, (157-2):689–714, 2003.
  • [Chu97] Fan R. K. Chung. Spectral graph theory, volume 92 of CBMS Regional Conference Series in Mathematics. American Mathematical Society, 1997.
  • [DL98] Philippe Delsarte and Vladimir Iossifovitch Levenshtein. Association schemes and coding theory. IEEE Trans. Inform. Theory, 44(6):2477–2504, 1998.
  • [DST19] Thomas Debris-Alazard, Nicolas Sendrier, and Jean-Pierre Tillich. Wave: A new family of trapdoor one-way preimage sampleable functions based on codes. In Advances in Cryptology - ASIACRYPT 2019, LNCS, Kobe, Japan, December 2019.
  • [DT17] Thomas Debris-Alazard and Jean-Pierre Tillich. Statistical decoding. preprint, January 2017. arXiv:1701.07416.
  • [GPV08] Craig Gentry, Chris Peikert, and Vinod Vaikuntanathan. Trapdoors for hard lattices and new cryptographic constructions. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 197–206, 2008.
  • [IS98] Mourad E.H. Ismail and Plamen Simeonov. Strong asymptotics for Krawtchouk polynomials. Journal of Computational and Applied Mathematics, pages 121–144, 1998.
  • [KL78] Grigory Kabatiansky and Vladimir I. Levenshtein. Bounds for packings on a sphere and in space. Problems of Information Transmission, (14):1–17, 1978.
  • [Klø07] Torleiv Kløve. Codes for Error Detection, volume 2 of Series on Coding Theory and Cryptology. WorldScientific, 2007.
  • [Kra06] Ilia Krasikov. Uniform bounds for bessel functions. Journal of Applied Analysis, 12:83–91, 06 2006.
  • [Lev79] Vladimir I. Levenshtein. On bounds for packings in nn-dimensional euclidean space. Dokl. Akad. Nauk SSSR, 245:1299–1303, 1979.
  • [Lev95] Vladimir I. Levenshtein. Krawtchouk polynomials and universal bounds for codes and designs in hamming spaces. IEEE Trans. Inf. Theory, 41(5):1303–1321, 1995.
  • [LLB22] Laura Luzzi, Cong Ling, and Matthieu R. Bloch. Secret key generation from Gaussian sources using lattice-based extractors. CoRR, abs/2206.10443, 2022.
  • [LLBS14] Cong Ling, Laura Luzzi, Jean-Claude Belfiore, and Damien Stehlé. Semantically secure lattice codes for the Gaussian wiretap channel. IEEE Transactions on Information Theory, 60(10):6399–6416, 2014.
  • [McE78] Robert J. McEliece. A Public-Key System Based on Algebraic Coding Theory, pages 114–116. Jet Propulsion Lab, 1978. DSN Progress Report 44.
  • [MR07] Daniele Micciancio and Oded Regev. Worst-case to average-case reductions based on Gaussian measures. SIAM Journal on Computing, 37(1):267–302, 2007.
  • [MRJW77] Robert J. McEliece, Eugene R. Rodemich, Howard Rumsey Jr., and Lloyd R. Welch. New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities. IEEE Trans. Inf. Theory, 23(2):157–166, 1977.
  • [MTSB13] Rafael Misoczki, Jean-Pierre Tillich, Nicolas Sendrier, and Paulo S. L. M. Barreto. MDPC-McEliece: New McEliece variants from moderate density parity-check codes. In Proc. IEEE Int. Symposium Inf. Theory - ISIT, pages 2069–2073, 2013.
  • [PS09] Xavier Pujol and Damien Stehlé. Solving the shortest lattice vector problem in time 22.465n2^{2.465n}. IACR Cryptol. ePrint Arch., 2009:605, 2009.
  • [vL99] Jacobus Hendricus van Lint. Introduction to coding theory. Graduate texts in mathematics. Springer, 3rd edition edition, 1999.
  • [Wai19] Martin J Wainwright. High-dimensional statistics: A non-asymptotic viewpoint, volume 48. Cambridge University Press, 2019.
  • [YZ21] Yu Yu and Jiang Zhang. Smoothing out binary linear codes and worst-case sub-exponential hardness for LPN. In Tal Malkin and Chris Peikert, editors, Advances in Cryptology - CRYPTO 2021 - 41st Annual International Cryptology Conference, CRYPTO 2021, Virtual Event, August 16-20, 2021, Proceedings, Part III, volume 12827 of Lecture Notes in Computer Science, pages 473–501. Springer, 2021.
  • [Zam14] Ram Zamir. Lattice Coding for Signals and Networks: A Structured Coding Approach to Quantization, Modulation and Multiuser Information Theory. Cambridge University Press, 2014.

Appendix A Proof of Proposition 3.9

Our aim in this section is to prove the following proposition

See 3.9

Roughly speaking, this proposition is a consequence of the fact that the Bernoulli distribution concentrates its Hamming weight on a small number of slices close to the expected weight (here npnp) and that, on each slice, the Bernoulli distribution is uniform. Let us introduce the truncated Bernoulli distribution over words of Hamming weight in [(1ε)pn,(1+ε)pn][(1-\varepsilon)pn,(1+\varepsilon)pn] for some ε>0\varepsilon>0, namely

ftruncBer,p(𝐱)=def{1Zfber,p(𝐱)if |𝐱|[(1ε)pn,(1+ε)pn]0otherwise.f_{\textup{truncBer},p}(\mathbf{x})\stackrel{{\scriptstyle\text{def}}}{{=}}\left\{\begin{array}[]{ll}\frac{1}{Z}\;f_{\textup{ber},p}(\mathbf{x})&\mbox{if }|\mathbf{x}|\in\left[(1-\varepsilon)pn,(1+\varepsilon)pn\right]\\ 0&\mbox{otherwise.}\end{array}\right. (21)

where

Z=def𝐲:|𝐲|[(1ε)np,(1+ε)np]fber,p(𝐲)Z\stackrel{{\scriptstyle\text{def}}}{{=}}\mathop{\sum}\limits_{\mathbf{y}:\,|\mathbf{y}|\in[(1-\varepsilon)np,(1+\varepsilon)np]}f_{\textup{ber},p}(\mathbf{y}) (22)

is the probability normalizing constant.
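The mass captured by the truncation window is easy to evaluate numerically. The snippet below (ours, purely illustrative, with hypothetical sample parameters) computes 1 - Z in log-space, exhibiting the exponentially small tail mass guaranteed by Chernoff's bound.

```python
from math import exp, lgamma, log

def log_binom_pmf(n, r, p):
    # log of binom(n, r) * p^r * (1 - p)^(n - r), computed in log-space
    return (lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)
            + r * log(p) + (n - r) * log(1 - p))

n, p, eps = 4000, 0.3, 0.1
lo, hi = int((1 - eps) * n * p), int((1 + eps) * n * p)
Z = sum(exp(log_binom_pmf(n, r, p)) for r in range(lo, hi + 1))
print(1 - Z)   # tail mass outside the window; decays exponentially with n
```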

Proposition 3.9 is a consequence of the following lemmas.

Lemma A.1.

Let ε>0\varepsilon>0. We have

Δ(fber,p,ftruncBer,p)=2Ω(n).\Delta\left(f_{\textup{ber},p},f_{\textup{truncBer},p}\right)=2^{-\Omega(n)}.
Proof.

By Chernoff’s bound

1Z=𝐲:|𝐲|[(1ε)np,(1+ε)np]fber,p(𝐲)2eε2n=2Ω(n).1-Z=\sum_{\begin{subarray}{c}\mathbf{y}:\\ |\mathbf{y}|\notin\left[(1-\varepsilon)np,(1+\varepsilon)np\right]\end{subarray}}f_{\textup{ber},p}(\mathbf{y})\leq 2e^{-\varepsilon^{2}n}=2^{-\Omega(n)}. (23)

Therefore, for any 𝐱\mathbf{x} with |𝐱|[(1ε)np,(1+ε)np]|\mathbf{x}|\in\left[(1-\varepsilon)np,(1+\varepsilon)np\right],

ftruncBer,p(𝐱)\displaystyle f_{\textup{truncBer},p}(\mathbf{x}) =112Ω(n)fber,p(𝐱)\displaystyle=\frac{1}{1-2^{-\Omega(n)}}\;f_{\textup{ber},p}(\mathbf{x})
=(1+2Ω(n))fber,p(𝐱).\displaystyle=\left(1+2^{-\Omega(n)}\right)\;f_{\textup{ber},p}(\mathbf{x}). (24)

We have now the following computation:

2Δ(fber,p,ftruncBer,p)\displaystyle 2\Delta\left(f_{\textup{ber},p},f_{\textup{truncBer},p}\right) =𝐱𝔽2n|fber,p(𝐱)ftruncBer,p(𝐱)|\displaystyle=\sum_{\mathbf{x}\in\mathbb{F}_{2}^{n}}\left|f_{\textup{ber},p}(\mathbf{x})-f_{\textup{truncBer},p}(\mathbf{x})\right|
=|𝐱|[(1ε)np,(1+ε)np]|fber,p(𝐱)ftruncBer,p(𝐱)|+|𝐱|[(1ε)np,(1+ε)np]|fber,p(𝐱)|\displaystyle=\sum_{|\mathbf{x}|\in\left[(1-\varepsilon)np,(1+\varepsilon)np\right]}\left|f_{\textup{ber},p}(\mathbf{x})-f_{\textup{truncBer},p}(\mathbf{x})\right|+\sum_{|\mathbf{x}|\notin\left[(1-\varepsilon)np,(1+\varepsilon)np\right]}\left|f_{\textup{ber},p}(\mathbf{x})\right|
=2Ω(n)(|𝐱|[(1ε)np,(1+ε)np]|fber,p(𝐱)|)+2Ω(n)(Equations (23) and (24))\displaystyle=2^{-\Omega(n)}\left(\sum_{|\mathbf{x}|\in\left[(1-\varepsilon)np,(1+\varepsilon)np\right]}\left|f_{\textup{ber},p}(\mathbf{x})\right|\right)+2^{-\Omega(n)}\quad\mbox{(Equations (23) and (24))}
=2Ω(n)\displaystyle=2^{-\Omega(n)}

where in the last line we used that fber,pf_{\textup{ber},p} is a probability distribution. ∎

Lemma A.2.

We have

Δ(u,fber,p𝒞)Δ(u,ftruncBer,p𝒞)+2Ω(n).\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right)\leq\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)+2^{-\Omega(n)}.
Proof.

By the triangle inequality,

Δ(u,fber,p𝒞)Δ(u,ftruncBer,p𝒞)+Δ(fber,p𝒞,ftruncBer,p𝒞).\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right)\leq\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)+\Delta\left(f_{\textup{ber},p}^{\mathscr{C}},f_{\textup{truncBer},p}^{\mathscr{C}}\right).

Focusing on the second term now

Δ(fber,p𝒞,ftruncBer,p𝒞)\displaystyle\Delta\left(f_{\textup{ber},p}^{\mathscr{C}},f_{\textup{truncBer},p}^{\mathscr{C}}\right) =12𝐲𝔽2n/𝒞|fber,p𝒞(𝐲)ftruncBer,p𝒞(𝐲)|\displaystyle=\frac{1}{2}\sum_{\mathbf{y}\in\mathbb{F}_{2}^{n}/\mathscr{C}}\left|f_{\textup{ber},p}^{\mathscr{C}}(\mathbf{y})-f_{\textup{truncBer},p}^{\mathscr{C}}(\mathbf{y})\right|
=12𝐲𝔽2n/𝒞|𝐜𝒞fber,p(𝐜+𝐲)𝐜𝒞ftruncBer,p(𝐜+𝐲)|\displaystyle=\frac{1}{2}\sum_{\mathbf{y}\in\mathbb{F}_{2}^{n}/\mathscr{C}}\left|\sum_{\mathbf{c}\in\mathscr{C}}f_{\textup{ber},p}(\mathbf{c}+\mathbf{y})-\sum_{\mathbf{c}\in\mathscr{C}}f_{\textup{truncBer},p}(\mathbf{c}+\mathbf{y})\right|
12𝐲𝔽2n/𝒞𝐜𝒞|fber,p(𝐜+𝐲)ftruncBer,p(𝐜+𝐲)|\displaystyle\leq\frac{1}{2}\sum_{\mathbf{y}\in\mathbb{F}_{2}^{n}/\mathscr{C}}\sum_{\mathbf{c}\in\mathscr{C}}\left|f_{\textup{ber},p}(\mathbf{c}+\mathbf{y})-f_{\textup{truncBer},p}(\mathbf{c}+\mathbf{y})\right|
=Δ(fber,p,ftruncBer,p).\displaystyle=\Delta\left(f_{\textup{ber},p},f_{\textup{truncBer},p}\right).

which concludes the proof by Lemma A.1. ∎

The following lemma is a basic property of the statistical distance.

Lemma A.3.

For any distribution ff and (gi)1im(g_{i})_{1\leq i\leq m} we have

Δ(f,i=1mλigi)i=1mλiΔ(f,gi)\Delta\left(f,\sum_{i=1}^{m}\lambda_{i}g_{i}\right)\leq\sum_{i=1}^{m}\lambda_{i}\;\Delta(f,g_{i})

where the λi\lambda_{i}’s are positive and sum to one.

We are now ready to prove Proposition 3.9.

Proof of Proposition 3.9.

First, by Lemma A.2 we have

Δ(u,fber,p𝒞)Δ(u,ftruncBer,p𝒞)+2Ω(n).\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right)\leq\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)+2^{-\Omega(n)}. (25)

To upper-bound Δ(u,ftruncBer,p𝒞)\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right) we are going to use Lemma A.3. Notice that

fber,p=r=0n(nr)pr(1p)nrur.f_{\textup{ber},p}=\sum_{r=0}^{n}\binom{n}{r}p^{r}(1-p)^{n-r}u_{r}.

Therefore it is readily seen that

ftruncBer,p=r=(1ε)np(1+ε)npλrurwhereλr=def1Z(nr)pr(1p)nr.f_{\textup{truncBer},p}=\sum_{r=(1-\varepsilon)np}^{(1+\varepsilon)np}\lambda_{r}\;u_{r}\quad\mbox{where}\quad\lambda_{r}\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{Z}\;\binom{n}{r}p^{r}(1-p)^{n-r}.

By using Lemma A.3 we obtain:

Δ(u,ftruncBer,p𝒞)\displaystyle\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right) r=(1ε)np(1+ε)npλrΔ(u,ur𝒞)\displaystyle\leq\sum_{r=(1-\varepsilon)np}^{(1+\varepsilon)np}\lambda_{r}\;\Delta\left(u,u_{r}^{\mathscr{C}}\right)
r=(1ε)np(1+ε)npΔ(u,ur𝒞)\displaystyle\leq\sum_{r=(1-\varepsilon)np}^{(1+\varepsilon)np}\Delta\left(u,u_{r}^{\mathscr{C}}\right) (26)

where in the last line we used that the λr\lambda_{r}’s are smaller than one. To conclude the proof we plug Equation (26) in (25). ∎
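The decomposition of fber,p as a mixture of the ur's used above simply says that a Bernoulli vector, conditioned on having Hamming weight r, is uniform over the sphere of radius r. The snippet below (ours, purely illustrative) checks the mixture weights empirically.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
n, p, N = 12, 0.3, 200_000
# Empirical weight distribution of Bernoulli(p) noise over F_2^n ...
wts = (rng.random((N, n)) < p).sum(axis=1)
emp = np.bincount(wts, minlength=n + 1) / N
# ... against the mixture weights binom(n, r) p^r (1-p)^(n-r) of the u_r's.
exact = np.array([comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)])
print(np.abs(emp - exact).max())   # small sampling error only
```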

Appendix B Proof of Proposition 3.11

Our aim in this section is to prove the following proposition, which is an extension of [ABL01, Theorem 3] to τ[δ,1]\tau\in[\delta,1] ([ABL01, Theorem 3] only applies for τ[δ,1/2]\tau\in[\delta,1/2]).

See 3.11

Our proof is mainly a rewriting of the proof of [ABL01, Theorem 3] which relies on the following proposition.

Proposition B.1 ([ABL01, Proposition 22 with d=0d^{\prime}=0]).

Let 𝒞\mathscr{C} be a binary code of length nn such that dmin(𝒞)=Ω(n)d_{\textup{min}}(\mathscr{C})=\Omega(n). Let t=defn2dmin(𝒞)(ndmin(𝒞))t\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{n}{2}-\sqrt{d_{\textup{min}}(\mathscr{C})(n-d_{\textup{min}}(\mathscr{C}))} and aa be such that

x1(t+1)<a<x1(t);Kt(a)Kt+1(a)=1x_{1}^{(t+1)}<a<x_{1}^{(t)}\quad\mbox{;}\quad\frac{K_{t}(a)}{K_{t+1}(a)}=-1

where x1(μ)x_{1}^{(\mu)} denotes the first root of the Krawtchouk polynomial of order μ\mu, namely KμK_{\mu}.

When 0w<tn/20\leq w<t\leq n/2, we have

𝐜𝒞\{𝟎}Kw(|𝐜|)2t+12a(nw)(nt)((nt+1)+(nt))2\sum_{\mathbf{c}\in\mathscr{C}\backslash\{\mathbf{0}\}}K_{w}(|{\mathbf{c}}|)^{2}\leq\frac{t+1}{2a}\;\frac{\binom{n}{w}}{\binom{n}{t}}\left(\binom{n}{t+1}+\binom{n}{t}\right)^{2} (27)

The approach is to optimize the choice of ww in Proposition B.1 to give an upper-bound on N(𝒞)N_{\ell}(\mathscr{C}). More precisely, we observe that

N(𝒞)1Kw()2𝐜𝒞\{𝟎}Kw(|𝐜|)21Kw()2t+12(nw)(nt)((nt+1)+(nt))2N_{\ell}(\mathscr{C})\leq\frac{1}{K_{w}(\ell)^{2}}\sum_{\mathbf{c}\in\mathscr{C}\backslash\{\mathbf{0}\}}K_{w}(|\mathbf{c}|)^{2}\leq\frac{1}{K_{w}(\ell)^{2}}\;\frac{t+1}{2}\frac{\binom{n}{w}}{\binom{n}{t}}\left(\binom{n}{t+1}+\binom{n}{t}\right)^{2} (28)

and then choose ww to minimize (nw)Kw()2\frac{\binom{n}{w}}{K_{w}(\ell)^{2}}.

Proof of Proposition 3.11.

It will be helpful to bring in the following map:

x[0,1]x=def12x(1x).x\in[0,1]\mapsto x^{\perp}\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{2}-\sqrt{x(1-x)}.

It can be verified that this map is an involution, satisfies the symmetry (1x)=x(1-x)^{\perp}=x^{\perp}, and is decreasing on [0,12][0,\frac{1}{2}].

Let 𝒞\mathscr{C} be a binary code of length nn such that dmin(𝒞)=δnd_{\textup{min}}(\mathscr{C})=\delta n where δ(0,1/2]\delta\in(0,1/2] and tt be defined as in Proposition B.1. Let ω=defwn,λ=defn\omega\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{w}{n},\lambda\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{\ell}{n} and δ=def1/2δ(1δ)\delta^{\perp}\stackrel{{\scriptstyle\text{def}}}{{=}}1/2-\sqrt{\delta(1-\delta)}. Then by Proposition B.1 we have (see Equation (28))

log2N(𝒞)nh(ω)+h(δ)2log2|Kw()|n+o(1).\frac{\log_{2}N_{\ell}(\mathscr{C})}{n}~\leq h(\omega)+h(\delta^{\perp})-\frac{2\log_{2}|K_{w}(\ell)|}{n}+o(1). (29)

Case 1: λ[δ,1δ]\lambda\in[\delta,1-\delta].
In this case it is optimal to choose ww such that ω=λε\omega=\lambda^{\perp}-\varepsilon where ε>0\varepsilon>0 and ε=o(1)\varepsilon=o(1) as nn tends to infinity. Let us first notice that λ[δ,1δ]\lambda\in[\delta,1-\delta] implies that λδ\lambda^{\perp}\leq\delta^{\perp}, which together with ω<λ\omega<\lambda^{\perp} implies that ω<δ\omega<\delta^{\perp}; this is equivalent to the condition w<tw<t needed to apply Proposition B.1. Moreover, ω<λ\omega<\lambda^{\perp} also implies λ<ω\lambda<\omega^{\perp}, and by using Proposition 3.8 we obtain

2log2|Kw()|nh(ω)+1h(λ)+o(1).\frac{2\log_{2}|K_{w}(\ell)|}{n}\leq h(\omega)+1-h(\lambda)+o(1).

Therefore

log2N(𝒞)nh(ω)+h(δ)h(ω)1+h(λ)+o(1)=h(δ)+h(λ)1+o(1).\frac{\log_{2}N_{\ell}(\mathscr{C})}{n}~\leq h(\omega)+h(\delta^{\perp})-h(\omega)-1+h(\lambda)+o(1)=h(\delta^{\perp})+h(\lambda)-1+o(1).

Case 2: λ(1δ,1]\lambda\in(1-\delta,1].
In that case, let ω=δε\omega=\delta^{\perp}-\varepsilon with ε>0\varepsilon>0 and ε=o(1)\varepsilon=o(1) as nn tends to infinity. Here, using the symmetry K_{w}(n-x)=(-1)^{w}K_{w}(x) of Krawtchouk polynomials, we can write

2log2|Kw()|n=log2(Kw()2)n=log2(Kw(n)2)n.\frac{2\log_{2}|K_{w}(\ell)|}{n}=\frac{\log_{2}(K_{w}(\ell)^{2})}{n}=\frac{\log_{2}(K_{w}(n-\ell)^{2})}{n}.

Since λ>1δ\lambda>1-\delta, we have 1λ<δ1-\lambda<\delta. On the other hand, ω<δ\omega<\delta^{\perp} implies δ<ω\delta<\omega^{\perp}. We deduce from these two inequalities that 1λ<ω1-\lambda<\omega^{\perp}. By using Proposition 3.8 again, we get

log2(Kw(n)2)n=2a(1λ,δ)+o(1)=2a(λ,δ)+o(1).\frac{\log_{2}(K_{w}(n-\ell)^{2})}{n}=2a(1-\lambda,\delta^{\perp})+o(1)=2a(\lambda,\delta^{\perp})+o(1).

By plugging this estimate in (29) we get

\frac{\log_{2}N_{\ell}(\mathscr{C})}{n}\leq 2h(\delta^{\perp})-2a(\lambda,\delta^{\perp})+o(1).

This concludes the proof. ∎
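
The three properties of x\mapsto x^{\perp} used above (involution on [0,1/2], symmetry, monotonicity) can also be confirmed numerically; here is a minimal Python sketch over a grid of (0,1/2).

from math import sqrt

def perp(x):
    # x -> 1/2 - sqrt(x(1-x))
    return 0.5 - sqrt(x * (1 - x))

xs = [i / 1000 for i in range(1, 500)]                      # grid in (0, 1/2)
assert all(abs(perp(perp(x)) - x) < 1e-9 for x in xs)       # involution on [0, 1/2]
assert all(abs(perp(1 - x) - perp(x)) < 1e-12 for x in xs)  # symmetry
assert all(perp(a) > perp(b) for a, b in zip(xs, xs[1:]))   # decreasing on [0, 1/2]
print("x -> x^perp: involution, symmetric, decreasing -- all checks passed")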

Appendix C Proof of Theorem 3.16

Our aim in this appendix is to prove the following theorem.

See 3.16

Sketch of proof. We will use the following proof strategy:

  • 1.

    By Lemma A.2 we know that on one hand

    Δ(u,fber,p𝒞)=Δ(u,ftruncBer,p𝒞)+2Ω(n).\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right)=\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)+2^{-\Omega(n)}. (30)

    This is actually a consequence of Chernoff’s bound. The same argument can also be used to show that the Fourier transforms are close to each other pointwise:

    𝐱𝔽2n,2n|ftruncBer,p^(𝐱)fber,p^(𝐱)|=2Ω(n).\forall\mathbf{x}\in\mathbb{F}_{2}^{n},\quad 2^{n}\;\left|\widehat{f_{\textup{truncBer},p}}(\mathbf{x})-\widehat{f_{\textup{ber},p}}(\mathbf{x})\right|=2^{-\Omega(n)}. (31)
  • 2.

    Equation (31), together with Lemma 3.15, is then used to show that:

    \Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)\leq 2^{n}\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n-d_{\textup{min}}(\mathscr{C}^{*})/2}N_{t}(\mathscr{C}^{*})\,\widehat{f_{\textup{truncBer},p}}(t)^{2}}+2^{-\Omega(n)}. (32)
  • 3.

    We use the two previous points to upper-bound Δ(u,fber,p𝒞)\Delta\left(u,f_{\textup{ber},p}^{\mathscr{C}}\right) as in the equation above and conclude by using bounds of Propositions 3.11 and 3.12.

Proof of Step 1. As we explained above, (30) is just Lemma A.2. Let us now prove the following lemma.

Lemma C.1.

We have

𝐱𝔽2n,2n|ftruncBer,p^(𝐱)fber,p^(𝐱)|=2Ω(n).\forall\mathbf{x}\in\mathbb{F}_{2}^{n},\quad 2^{n}\;\left|\widehat{f_{\textup{truncBer},p}}(\mathbf{x})-\widehat{f_{\textup{ber},p}}(\mathbf{x})\right|=2^{-\Omega(n)}.
Proof.

Recall that Z=|𝐲|=(1ε)np(1+ε)npfber,p(𝐲)Z=\mathop{\sum}\limits_{|\mathbf{y}|=(1-\varepsilon)np}^{(1+\varepsilon)np}f_{\textup{ber},p}(\mathbf{y}) where by Chernoff’s bound, we have

Z=12Ω(n).Z=1-2^{-\Omega(n)}. (33)

Notice now that,

f_{\textup{ber},p}=\sum_{r=0}^{n}\binom{n}{r}p^{r}(1-p)^{n-r}u_{r}\quad\mbox{and}\quad f_{\textup{truncBer},p}=\frac{1}{Z}\;\sum_{r=(1-\varepsilon)pn}^{(1+\varepsilon)pn}\binom{n}{r}p^{r}(1-p)^{n-r}u_{r}.

Let =def(1ε)pn,(1+ε)pn\mathscr{I}\stackrel{{\scriptstyle\text{def}}}{{=}}\llbracket(1-\varepsilon)pn,(1+\varepsilon)pn\rrbracket. Notice that Z=r(nr)pr(1p)nrZ=\sum_{r\in\mathscr{I}}\binom{n}{r}p^{r}(1-p)^{n-r}. By linearity of the Fourier transform we obtain the following computation:

\displaystyle\left|\widehat{f_{\textup{truncBer},p}}(\mathbf{x})-\widehat{f_{\textup{ber},p}}(\mathbf{x})\right| \leq\left(\frac{1}{Z}-1\right)\sum_{r\in\mathscr{I}}\binom{n}{r}p^{r}(1-p)^{n-r}\left|\widehat{u_{r}}(\mathbf{x})\right|
\displaystyle\qquad\qquad\qquad\qquad+\sum_{r\notin\mathscr{I}}\binom{n}{r}p^{r}(1-p)^{n-r}\left|\widehat{u_{r}}(\mathbf{x})\right|
\displaystyle\leq 2^{-\Omega(n)}\sum_{r\in\mathscr{I}}\binom{n}{r}p^{r}(1-p)^{n-r}\left|\widehat{u_{r}}(\mathbf{x})\right|+2^{-\Omega(n)}\max_{r}\left|\widehat{u_{r}}(\mathbf{x})\right| (34)

where in the last line we used Equation (33). Recall now that by definition of the Fourier transform for functions over 𝔽2n\mathbb{F}_{2}^{n} we have:

\left|\widehat{u_{r}}(\mathbf{x})\right|=\left|\frac{1}{2^{n}}\sum_{\mathbf{y}:|\mathbf{y}|=r}\frac{(-1)^{\mathbf{x}\cdot\mathbf{y}}}{\binom{n}{r}}\right|\leq\frac{1}{2^{n}}.

By plugging this in Equation (34) we get:

|ftruncBer,p^(𝐱)fber,p^(𝐱)|\displaystyle\left|\widehat{f_{\textup{truncBer},p}}(\mathbf{x})-\widehat{f_{\textup{ber},p}}(\mathbf{x})\right| 2Ω(n)2nr(nr)pr(1p)nr1+2Ω(n)2n\displaystyle\leq\frac{2^{-\Omega(n)}}{2^{n}}\underbrace{\sum_{r\in\mathscr{I}}\binom{n}{r}p^{r}(1-p)^{n-r}}_{\leq 1}+\frac{2^{-\Omega(n)}}{2^{n}}
=2Ω(n)2n\displaystyle=\frac{2^{-\Omega(n)}}{2^{n}}

which concludes the proof. ∎
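
Lemma C.1 lends itself to a direct numerical check: both Fourier transforms are radial, with 2^{n}\widehat{f_{\textup{ber},p}}(t)=(1-2p)^{t} and 2^{n}\widehat{f_{\textup{truncBer},p}}(t)=\frac{1}{Z}\sum_{r\in\mathscr{I}}p^{r}(1-p)^{n-r}K_{r}(t). The Python sketch below compares the two; the parameters n, p and \varepsilon are illustrative choices only.

from math import comb

def krawtchouk(n, k, x):
    # K_k(x) = sum_j (-1)^j C(x,j) C(n-x, k-j), computed exactly over the integers.
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

n, p, eps = 400, 0.1, 0.5
lo, hi = int((1 - eps) * n * p), int((1 + eps) * n * p)
Z = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(lo, hi + 1))

for t in (20, 60, 100):
    full = (1 - 2 * p) ** t                              # 2^n * hat(f_ber)(t)
    trunc = sum(p**r * (1 - p) ** (n - r) * krawtchouk(n, r, t)
                for r in range(lo, hi + 1)) / Z          # 2^n * hat(f_truncBer)(t)
    print(t, full, trunc, abs(full - trunc))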

Proof of Step 2. This corresponds to proving the following lemma.

Lemma C.2.
\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)\leq 2^{n}\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n-d_{\textup{min}}(\mathscr{C}^{*})/2}N_{t}(\mathscr{C}^{*})\,\widehat{f_{\textup{truncBer},p}}(t)^{2}}+2^{-\Omega(n)}.
Proof.

By applying Proposition 3.3 to ftruncBer,pf_{\textup{truncBer},p} we obtain

Δ(u,ftruncBer,p𝒞)2nt=dmin(𝒞)nNt(𝒞)|ftruncBer,p^(t)|2\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)\leq 2^{n}\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n}N_{t}(\mathscr{C}^{*})|\widehat{f_{\textup{truncBer},p}}(t)|^{2}} (35)

where ftruncBer,p^(t)\widehat{f_{\textup{truncBer},p}}(t) denotes the common value of the radial function ftruncBer,p^\widehat{f_{\textup{truncBer},p}} on vectors of Hamming weight tt. Recall now that fber,p^(𝐱)=12n(12p)|𝐱|\widehat{f_{\textup{ber},p}}(\mathbf{x})=\frac{1}{2^{n}}\;(1-2p)^{|\mathbf{x}|} and by Lemma C.1 that 2n|ftruncBer,p^(𝐱)fber,p^(𝐱)|=2Ω(n)2^{n}\;\left|\widehat{f_{\textup{truncBer},p}}(\mathbf{x})-\widehat{f_{\textup{ber},p}}(\mathbf{x})\right|=2^{-\Omega(n)}. Therefore,

𝐱𝔽2n, |𝐱|ndmin(𝒞)2:2n|ftruncBer,p^(𝐱)|=2Ω(n).\forall\mathbf{x}\in\mathbb{F}_{2}^{n},\mbox{ }|\mathbf{x}|\geq n-\frac{d_{\textup{min}}(\mathscr{C}^{*})}{2}\quad\mbox{:}\quad 2^{n}\;\left|\widehat{f_{\textup{truncBer},p}}(\mathbf{x})\right|=2^{-\Omega(n)}.

By plugging this in Equation (35) we obtain (as there is at most one dual codeword of weight \ell for each >ndmin(𝒞)/2\ell>n-d_{\textup{min}}(\mathscr{C}^{*})/2, see Lemma 3.15)

Δ(u,ftruncBer,p𝒞)2nt=dmin(𝒞)ndmin(𝒞)/2Nt(𝒞)|ftruncBer,p^(t)|2+2Ω(n)\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)\leq 2^{n}\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n-d_{\textup{min}}(\mathscr{C}^{*})/2}N_{t}(\mathscr{C}^{*})|\widehat{f_{\textup{truncBer},p}}(t)|^{2}}+2^{-\Omega(n)} (36)

which concludes the proof. ∎

Proof of Step 3. We finish the proof of Theorem 3.16 by noticing that

ftruncBer,p=1Z=(1ε)pn(1+ε)pn(n)p(1p)nuf_{\textup{truncBer},p}=\frac{1}{Z}\;\sum_{\ell=(1-\varepsilon)pn}^{(1+\varepsilon)pn}\binom{n}{\ell}p^{\ell}(1-p)^{n-\ell}u_{\ell}

where Z=def|𝐲|=(1ε)np(1+ε)npfber,p(𝐲)=12Ω(n)Z\stackrel{{\scriptstyle\text{def}}}{{=}}\mathop{\sum}\limits_{|\mathbf{y}|=(1-\varepsilon)np}^{(1+\varepsilon)np}f_{\textup{ber},p}(\mathbf{y})=1-2^{-\Omega(n)} by Chernoff’s bound. Therefore,

ftruncBer,p^=(1+2Ω(n))=(1ε)pn(1+ε)pn(n)p(1p)nu^.\widehat{f_{\textup{truncBer},p}}=\left(1+2^{-\Omega(n)}\right)\;\sum_{\ell=(1-\varepsilon)pn}^{(1+\varepsilon)pn}\binom{n}{\ell}p^{\ell}(1-p)^{n-\ell}\;\widehat{u_{\ell}}.

By plugging this in Equation (36) and using \widehat{u_{\ell}}(t)=\frac{1}{2^{n}}\,\frac{K_{\ell}(t)}{\binom{n}{\ell}} we obtain

Δ(u,ftruncBer,p𝒞)(1+2Ω(n))t=dmin(𝒞)ndmin(𝒞)/2Nt(𝒞)(=(1ε)pn(1+ε)pnp(1p)nK(t))2+2Ω(n).\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right)\leq\left(1+2^{-\Omega(n)}\right)\;\sqrt{\sum_{t=d_{\textup{min}}(\mathscr{C}^{*})}^{n-d_{\textup{min}}(\mathscr{C}^{*})/2}N_{t}(\mathscr{C}^{*})\left(\sum_{\ell=(1-\varepsilon)pn}^{(1+\varepsilon)pn}p^{\ell}(1-p)^{n-\ell}K_{\ell}(t)\right)^{2}}+2^{-\Omega(n)}.

We then use, in the right-hand term, Propositions 3.11 and 3.12, which give bounds on the 1nlog2N(𝒞)\frac{1}{n}\;\log_{2}N_{\ell}(\mathscr{C}^{*})’s (where dmin(𝒞)δnd_{\textup{min}}(\mathscr{C}^{*})\geq\delta^{*}n), together with Proposition 3.8, which gives an asymptotic expansion of Krawtchouk polynomials, to upper-bound Δ(u,ftruncBer,p𝒞)\Delta\left(u,f_{\textup{truncBer},p}^{\mathscr{C}}\right). We finish the proof of the theorem by using this upper-bound in the right-hand term of (30).

Appendix D Proof of Proposition 4.5

Our aim in this section is to prove the following proposition.

See 4.5

It will be a consequence of the following lemmas. We begin with a result decomposing the Gaussian distribution as a convex combination of uniform distributions over balls.

Lemma D.1.

The Gaussian distribution in dimension nn of parameter ss is the following convex combination of uniform distributions over balls:

Ds=1s0Gn(w/s)uw𝑑wD_{s}=\frac{1}{s}\int_{0}^{\infty}G_{n}(w/s)\;u_{w\mathscr{B}}\,dw

where Gn(x)=xn+1Vn(1) 2πexp(πx2)0G_{n}(x)=x^{n+1}\;V_{n}\left(1\right)\;2\pi\;\exp\left(-\pi x^{2}\right)\geq 0. Furthermore, we have 1s0Gn(w/s)𝑑w=1\frac{1}{s}\int_{0}^{\infty}G_{n}(w/s)\,dw=1.

Proof.

First, let gs(w)=def1snexp(πw2s2)g_{s}(w)\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{s^{n}}\;\exp\left(-\pi\tfrac{w^{2}}{s^{2}}\right) (i.e., the value that the probability density function DsD_{s} takes on vectors of norm ww) and denote hs(w)=gs(w)=2πwsn+2exp(πw2s2)h_{s}(w)=-g_{s}^{\prime}(w)=\frac{2\pi w}{s^{n+2}}\;\exp\left(-\pi\tfrac{w^{2}}{s^{2}}\right). For any 𝐱n\mathbf{x}\in\mathbb{R}^{n}, setting u=|𝐱|2u=|\mathbf{x}|_{2}, as limwgs(w)=0\lim_{w\to\infty}g_{s}(w)=0 we have

Ds(𝐱)\displaystyle D_{s}(\mathbf{x}) =gs(u)=uhs(w)𝑑w=0hs(w) 1{uw}𝑑w=0hs(w) 1w(𝐱)𝑑w.\displaystyle=g_{s}(u)=\int_{u}^{\infty}h_{s}(w)\,dw=\int_{0}^{\infty}h_{s}(w)\;1\{u\leq w\}\ dw=\int_{0}^{\infty}h_{s}(w)\;1_{\mathscr{B}_{w}}(\mathbf{x})\ dw\ .

Above, we denoted by 1{uw}1\{u\leq w\} the function which takes value 11 on input ww if uwu\leq w, and 0 otherwise. To conclude, note that 1sGn(w/s)=hs(w)Vn(w)\frac{1}{s}\;G_{n}(w/s)=h_{s}(w)\;V_{n}\left(w\right) and recall uw=1wVn(w)u_{w\mathscr{B}}=\frac{1_{\mathscr{B}_{w}}}{V_{n}\left(w\right)}.

For the “furthermore” part of the lemma, we compute

1s0Gn(w/s)𝑑w=1s0(w/s)n+1Vn(1) 2πexp(π(w/s)2)𝑑w.\displaystyle\frac{1}{s}\int_{0}^{\infty}G_{n}(w/s)\,dw=\frac{1}{s}\int_{0}^{\infty}(w/s)^{n+1}\;V_{n}\left(1\right)\;2\pi\;\exp\left(-\pi(w/s)^{2}\right)\,dw\ . (37)

We make the substitution t=π(ws)2t=\pi\left(\frac{w}{s}\right)^{2}, which means dw=s2dt2πw=s2tπdtdw=\frac{s^{2}\,dt}{2\pi w}=\frac{s}{2\sqrt{t\pi}}\,dt. Also, we recall Vn(1)=πn/2Γ(n/2+1)V_{n}\left(1\right)=\frac{\pi^{n/2}}{\Gamma(n/2+1)}. Thus,

1s0Gn(w/s)𝑑w\displaystyle\frac{1}{s}\int_{0}^{\infty}G_{n}(w/s)\,dw =1sπn/2Γ(n/2+1)0(tπ)(n+1)/2 2πets2tπ𝑑t\displaystyle=\frac{1}{s}\;\frac{\pi^{n/2}}{\Gamma(n/2+1)}\int_{0}^{\infty}\left(\frac{t}{\pi}\right)^{(n+1)/2}\;2\pi\;e^{-t}\;\frac{s}{2\sqrt{t\pi}}\,dt
=1Γ(n/2+1)0tn/2et𝑑t=Γ(n/2+1)Γ(n/2+1)=1\displaystyle=\frac{1}{\Gamma(n/2+1)}\int_{0}^{\infty}t^{n/2}\;e^{-t}\;dt=\frac{\Gamma(n/2+1)}{\Gamma(n/2+1)}=1

which concludes the proof. ∎
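
Both identities of Lemma D.1 are straightforward to confirm numerically. The following Python sketch (using scipy for the quadrature; the dimension n, the parameter s and the test radius u are illustrative choices) checks the total mass and the pointwise decomposition of the density.

import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln

n, s = 12, 1.7                                           # illustrative dimension and parameter
log_Vn1 = (n / 2) * np.log(np.pi) - gammaln(n / 2 + 1)   # log V_n(1) = log(pi^{n/2} / Gamma(n/2+1))

def G(x):
    # G_n(x) = x^{n+1} V_n(1) 2 pi exp(-pi x^2)
    return x ** (n + 1) * np.exp(log_Vn1) * 2 * np.pi * np.exp(-np.pi * x ** 2)

mass, _ = quad(lambda w: G(w / s) / s, 0, np.inf)
print("total mass:", mass)                               # should be 1

u = 0.9                                                  # test radius |x|_2 = u
mix, _ = quad(lambda w: G(w / s) / (s * np.exp(log_Vn1) * w ** n), u, np.inf)
print("mixture:", mix, " density:", s ** (-n) * np.exp(-np.pi * u ** 2 / s ** 2))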

We now quote the following bound, which makes precise the intuition that it is exponentially unlikely that a random Gaussian vector has squared norm a factor (1η)(1-\eta) smaller than its expectation. This result provides the analogue of the Chernoff bound that we used in the code case.

Lemma D.2 ([Wai19, Example 2.5]).

Let 𝐗\mathbf{X} be a random Gaussian vector of dimension nn and parameter 11. Let 0<η<10<\eta<1. Then

(|𝐗|22(1η)n2π)exp(η28n).\mathbb{P}\left(|\mathbf{X}|_{2}^{2}\leq(1-\eta)\;\frac{n}{2\pi}\right)\leq\exp(-\frac{\eta^{2}}{8}\;n).
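
A quick Monte Carlo experiment illustrates this bound: a parameter-1 Gaussian vector (density proportional to \exp(-\pi|\mathbf{x}|_{2}^{2})) has coordinate variance 1/(2\pi). The Python sketch below, with illustrative n and \eta, compares the empirical probability to the stated bound.

import numpy as np

rng = np.random.default_rng(0)
n, eta, trials = 64, 0.5, 100_000
# parameter-1 Gaussian: each coordinate is centered normal with variance 1/(2*pi)
X = rng.normal(0.0, 1.0 / np.sqrt(2 * np.pi), size=(trials, n))
emp = np.mean(np.sum(X ** 2, axis=1) <= (1 - eta) * n / (2 * np.pi))
print("empirical:", emp, " bound exp(-eta^2 n / 8):", np.exp(-eta ** 2 * n / 8))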

Lemma D.2 allows us to prove the following bound on 1s0w¯Gn(w/s)𝑑w\frac{1}{s}\int_{0}^{\overline{w}}G_{n}(w/s)dw when w¯<sn/(2π)\overline{w}<s\;\sqrt{n/(2\pi)}.

Lemma D.3.

Let η(0,1)\eta\in(0,1) and w¯=1ηsn/(2π)\overline{w}=\sqrt{1-\eta}\;s\;\sqrt{n/(2\pi)}. Then

1s0w¯Gn(w/s)𝑑wexp(η28n).\frac{1}{s}\int_{0}^{\overline{w}}G_{n}(w/s)dw\leq\exp(-\frac{\eta^{2}}{8}\;n)\ .
Proof.

Let u¯=def1ηn/(2π)\overline{u}\stackrel{{\scriptstyle\text{def}}}{{=}}\sqrt{1-\eta}\;\sqrt{n/(2\pi)}. By Lemma D.2, if 𝐗\mathbf{X} denotes a random Gaussian vector of dimension nn and parameter 11, we have

0|𝐱|2u¯exp(π|𝐱|22)𝑑𝐱=(|𝐗|22(1η)n2π)exp(η28n).\int_{0\leq|\mathbf{x}|_{2}\leq\overline{u}}\exp(-\pi|\mathbf{x}|_{2}^{2})\ d\mathbf{x}=\mathbb{P}\left(|\mathbf{X}|_{2}^{2}\leq(1-\eta)\;\frac{n}{2\pi}\right)\leq\exp(-\frac{\eta^{2}}{8}\;n). (38)

To compute this last integral, note that

0|𝐱|2u¯exp(π|𝐱|22)𝑑𝐱\displaystyle\int_{0\leq|\mathbf{x}|_{2}\leq\overline{u}}\exp(-\pi|\mathbf{x}|_{2}^{2})\ d\mathbf{x} =0u¯u𝒮n1eπu2𝑑A𝑑u,\displaystyle=\int_{0}^{\overline{u}}\int_{u\mathscr{S}^{n-1}}e^{-\pi u^{2}}dAdu\ , (39)

where u𝒮n1u\mathscr{S}^{n-1} denotes the Euclidean sphere of radius uu and dAdA is the area element. If An1(u)A_{n-1}(u) denotes the surface area of u𝒮n1u\mathscr{S}^{n-1}, then An1(u)=un1An1(1)A_{n-1}(u)=u^{n-1}A_{n-1}(1) and thus

0u¯u𝒮n1eπu2𝑑A𝑑u=An1(1)0u¯un1exp(πu2)𝑑u.\displaystyle\int_{0}^{\overline{u}}\int_{u\mathscr{S}^{n-1}}e^{-\pi u^{2}}dAdu=A_{n-1}(1)\int_{0}^{\overline{u}}u^{n-1}\exp(-\pi u^{2})\,du\ . (40)

Further, it is known that An1(1)=2πn/2Γ(n/2)A_{n-1}(1)=\frac{2\pi^{n/2}}{\Gamma(n/2)}. Therefore, plugging Equations (39) and (40) into (38) leads to

0u¯un1exp(πu2)𝑑u1An1(1)exp(η28n).\int_{0}^{\overline{u}}u^{n-1}\exp(-\pi u^{2})\,du\leq\frac{1}{A_{n-1}(1)}\;\exp(-\frac{\eta^{2}}{8}\;n)\ . (41)

Now, we look at the left-hand side of the inequality we wish to prove. We begin by making the substitution u=w/su=w/s, so that dw=sdudw=s\;du. Recalling that u¯=1ηn/(2π)\overline{u}=\sqrt{1-\eta}\;\sqrt{n/(2\pi)}, note that when w=w¯w=\overline{w} we have u=w¯/s=u¯u=\overline{w}/s=\overline{u}.

1s0w¯Gn(w/s)𝑑w\displaystyle\frac{1}{s}\int_{0}^{\overline{w}}G_{n}(w/s)\ dw =0u¯Gn(u)𝑑u\displaystyle=\int_{0}^{\overline{u}}G_{n}(u)\ du
=Vn(1) 2π0u¯un+1exp(πu2)𝑑u\displaystyle=V_{n}\left(1\right)\;2\pi\int_{0}^{\bar{u}}u^{n+1}\;\exp(-\pi u^{2})\ du
Vn(1) 2πu¯20u¯un1exp(πu2)𝑑u.\displaystyle\leq V_{n}\left(1\right)\;2\pi\bar{u}^{2}\int_{0}^{\bar{u}}u^{n-1}\exp(-\pi u^{2})\ du\ .

Combining this last inequality with (41) yields

1s0w¯Gn(w/s)𝑑wVn(1)2πu¯2An1(1)exp(η28n).\displaystyle\frac{1}{s}\int_{0}^{\overline{w}}G_{n}(w/s)\ dw\leq\frac{V_{n}\left(1\right)2\pi\;\overline{u}^{2}}{A_{n-1}(1)}\exp(-\frac{\eta^{2}}{8}\;n)\ .

To conclude the proof, note that Vn(1)=01An1(u)𝑑u=01un1An1(1)𝑑u=An1(1)nV_{n}(1)=\int_{0}^{1}A_{n-1}(u)du=\int_{0}^{1}u^{n-1}A_{n-1}(1)du=\frac{A_{n-1}(1)}{n} and therefore

Vn(1)2πu¯2An1(1)=2π(1η)n2πn=1η1.\frac{V_{n}\left(1\right)2\pi\;\overline{u}^{2}}{A_{n-1}(1)}=\frac{2\pi(1-\eta)n}{2\pi n}=1-\eta\leq 1.

This concludes the proof. ∎
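
Lemma D.3 can likewise be checked by direct quadrature; the Python sketch below (with illustrative n, s and \eta) evaluates the truncated integral and compares it to the claimed bound \exp(-\frac{\eta^{2}}{8}n).

import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln

n, s, eta = 40, 1.0, 0.5
wbar = np.sqrt(1 - eta) * s * np.sqrt(n / (2 * np.pi))

def G(x):
    # G_n(x) = x^{n+1} V_n(1) 2 pi exp(-pi x^2)
    log_Vn1 = (n / 2) * np.log(np.pi) - gammaln(n / 2 + 1)
    return x ** (n + 1) * np.exp(log_Vn1) * 2 * np.pi * np.exp(-np.pi * x ** 2)

truncated_mass, _ = quad(lambda w: G(w / s) / s, 0, wbar)
print("truncated mass:", truncated_mass, " bound:", np.exp(-eta ** 2 * n / 8))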

We are now ready to prove Proposition 4.5.

Proof of Proposition 4.5.

By Lemma D.1, DsD_{s} is a convex combination of uniform distributions over balls, namely Ds=1s0Gn(w/s)uw𝑑wD_{s}=\frac{1}{s}\int_{0}^{\infty}G_{n}(w/s)\;u_{w\mathscr{B}}\;dw. Therefore (we use here the analogue of Lemma A.3 in the context of the statistical distance between two probability density functions)

𝔼Λ(Δ(u,DsΛ))1s0Gn(w/s)𝔼Λ(Δ(u,uwΛ))𝑑w.\mathbb{E}_{\Lambda}\left(\Delta(u,D_{s}^{\Lambda})\right)\leq\frac{1}{s}\int_{0}^{\infty}G_{n}(w/s)\;\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)dw.

We split the integral into two parts at radius w¯=1ηsn/(2π)\overline{w}=\sqrt{1-\eta}\;s\;\sqrt{n/(2\pi)}. For the first part ww¯w\leq\overline{w}, we use the trivial bound 𝔼Λ(Δ(u,uwΛ))1\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq 1, which gives:

1s0w¯Gn(w/s)𝔼Λ(Δ(u,uwΛ))𝑑w1s0w¯Gn(w/s)𝑑w.\frac{1}{s}\int_{0}^{\overline{w}}G_{n}(w/s)\;\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)dw\leq\frac{1}{s}\int_{0}^{\overline{w}}G_{n}(w/s)dw.

We then apply Lemma D.3, which bounds this part by exp(η28n)\exp(-\frac{\eta^{2}}{8}\;n).

For the second part ww¯w\geq\overline{w}, we use the trivial bound 1sw¯Gn(w/s)𝑑w1\frac{1}{s}\int_{\overline{w}}^{\infty}G_{n}(w/s)dw\leq 1 and, noting

ww¯=1ηsn/(2π)=11ηs0n/(2π)>s0n/(2π)=w0,w\geq\overline{w}=\sqrt{1-\eta}\;s\;\sqrt{n/(2\pi)}=\frac{1}{\sqrt{1-\eta}}\;s_{0}\;\sqrt{n/(2\pi)}>s_{0}\;\sqrt{n/(2\pi)}=w_{0},

we may apply the assumption of the proposition, yielding

\displaystyle\mathbb{E}_{\Lambda}\left(\Delta(u,u_{w\mathscr{B}}^{\Lambda})\right)\leq f(n)\left(\frac{w_{0}}{w}\right)^{n/2}\leq f(n)\left(\frac{w_{0}}{\overline{w}}\right)^{n/2}=f(n)\left(\sqrt{1-\eta}\right)^{n/2}=f(n)\left(\frac{s_{0}}{s}\right)^{n/4}.

Adding these bounds yields the proposition. ∎