
Singularity of sparse Bernoulli matrices

Alexander E. Litvak and Konstantin E. Tikhomirov
Abstract

Let $M_n$ be an $n\times n$ random matrix with i.i.d. Bernoulli($p$) entries. We show that there is a universal constant $C\geq 1$ such that, whenever $p$ and $n$ satisfy $C\log n/n\leq p\leq C^{-1}$,

\[
\mathbb{P}\big\{M_n\text{ is singular}\big\}=(1+o_{n}(1))\,\mathbb{P}\big\{M_n\text{ contains a zero row or column}\big\}=(2+o_{n}(1))\,n\,(1-p)^{n},
\]

where $o_{n}(1)$ denotes a quantity which converges to zero as $n\to\infty$. We provide the corresponding upper and lower bounds on the smallest singular value of $M_n$ as well.


AMS 2010 Classification: primary: 60B20, 15B52; secondary: 46B06, 60C05.
Keywords: Littlewood–Offord theory, Bernoulli matrices, sparse matrices, smallest singular value, invertibility

1 Introduction

Invertibility of discrete random matrices attracts considerable attention in the literature. The classical problem in this direction — estimating the singularity probability of a square random matrix $B_n$ with i.i.d. $\pm 1$ entries — was first addressed by Komlós in the 1960s. Komlós [18] showed that $\mathbb{P}\{B_n\text{ is singular}\}$ decays to zero as the dimension grows to infinity. A breakthrough result of Kahn–Komlós–Szemerédi [16] confirmed that the singularity probability of $B_n$ is exponentially small in the dimension. Further improvements on the singularity probability were obtained by Tao–Vu [44, 45] and Bourgain–Vu–Wood [7]. An old conjecture states that $\mathbb{P}\{B_n\text{ is singular}\}=\big(\frac{1}{2}+o_{n}(1)\big)^{n}$. The conjecture was resolved in [48].

Other models of non-symmetric discrete random matrices considered in the literature include adjacency matrices of $d$-regular digraphs, as well as the closely related model of sums of independent uniform permutation matrices [19, 9, 10, 22, 23, 24, 25, 26, 2]. In particular, the recent breakthrough works [15, 35, 36] confirmed that the adjacency matrix of a uniform random $d$-regular digraph of a constant degree $d\geq 3$ is singular with probability decaying to zero as the number of vertices of the graph grows to infinity. A closely related line of research deals with the rank of random matrices over finite fields. We refer to [33] for some recent results and further references.

The development of the Littlewood–Offord theory and a set of techniques of geometric functional analysis, reworked in the random matrix context, produced strong invertibility results for a broad class of distributions. Following the works [47, 39] of Tao–Vu and Rudelson, the paper [41] of Rudelson and Vershynin established optimal small ball probability estimates for the smallest singular value in the class of square matrices with i.i.d. subgaussian entries; namely, it was shown that any $n\times n$ matrix $A$ with i.i.d. subgaussian entries of zero mean and unit variance satisfies $\mathbb{P}\{s_{\min}(A)\leq t\,n^{-1/2}\}\leq Ct+2\exp(-cn)$ for all $t>0$ and some $C,c>0$ depending only on the subgaussian moment. The assumptions of identical distribution of entries and of bounded subgaussian moment were removed in subsequent works [37, 30, 31]. This line of research led to a positive solution of the Bernoulli matrix conjecture mentioned in the first paragraph. Let us state the result of [48] for future reference.

Theorem (Invertibility of dense Bernoulli matrices, [48]).
  • For each $n$, let $B_n$ be the $n\times n$ random matrix with i.i.d. $\pm 1$ entries. Then for any $\varepsilon>0$ there is $C$ depending only on $\varepsilon$ such that the smallest singular value $s_{\min}(B_n)$ satisfies

    \[
    \mathbb{P}\big\{s_{\min}(B_{n})\leq tn^{-1/2}\big\}\leq Ct+C(1/2+\varepsilon)^{n},\quad t>0.
    \]

    In particular, $\mathbb{P}\big\{B_n\text{ is singular}\big\}=(1/2+o_{n}(1))^{n}$, where the quantity $o_{n}(1)$ tends to zero as $n$ grows to infinity.

  • For each $\varepsilon>0$ and $p\in(0,1/2]$ there is $C>0$ depending on $\varepsilon$ and $p$ such that for any $n$ and for a random $n\times n$ matrix $M_n$ with i.i.d. Bernoulli($p$) entries,

    \[
    \mathbb{P}\big\{s_{\min}(M_{n})\leq tn^{-1/2}\big\}\leq Ct+C(1-p+\varepsilon)^{n},\quad t>0.
    \]

    In particular, for a fixed $p\in(0,1/2]$, we have $\mathbb{P}\big\{M_n\text{ is singular}\big\}=(1-p+o_{n}(1))^{n}$.


Sparse analogs of the Rudelson–Vershynin invertibility theorem [41] were obtained, in particular, in the works [46, 14, 29, 3, 4, 5], with the strongest small ball probability estimates in the i.i.d. subgaussian setting available in [3, 4, 5]. Here, we state a result of Basak–Rudelson [3] for Bernoulli($p_n$) random matrices.

Theorem (Invertibility of sparse Bernoulli matrices, [3]).

There are universal constants $C,c>0$ with the following property. Let $n\in\mathbb{N}$ and let $p_n\in(0,1)$ satisfy $C\log n/n\leq p_{n}\leq 1/2$. Further, let $M_n$ be the random $n\times n$ matrix with i.i.d. Bernoulli($p_n$) entries (that is, $0/1$ random variables with expectation $p_n$). Then

\[
\mathbb{P}\big\{s_{\min}(M_{n})\leq t\,\exp\big(-C\log(1/p_{n})/\log(np_{n})\big)\,\sqrt{p_{n}/n}\big\}\leq Ct+2\exp(-cnp_{n}),\quad t>0.
\]

The singularity probabilities implied by the results [48, 3] may be regarded as suboptimal in a certain respect. Indeed, while [48] produced an asymptotically sharp base of the power in the singularity probability of $B_n$, the estimate of [48] is off by a factor $(1+o_{n}(1))^{n}$ which may (and in fact does, as an analysis of the proof shows) grow to infinity with $n$ superpolynomially fast. Further, the upper bound on the singularity probability of sparse Bernoulli matrices implied by [3] captures an exponential dependence on $np_n$, but does not recover an asymptotically optimal base of the power.

A folklore conjecture for the matrices $B_n$ asserts that $\mathbb{P}\{B_n\text{ is singular}\}=(1+o_{n}(1))\,n^{2}2^{1-n}$, where the right hand side of the expression is the probability that two rows or two columns of the matrix $B_n$ are equal up to a sign (see, for example, [16]). This conjecture can be naturally extended to the model with Bernoulli($p_n$) ($0/1$) entries as follows.

Conjecture 1.1 (Stronger singularity conjecture for Bernoulli matrices).

For each $n$, let $p_n\in(0,1/2]$, and let $M_n$ be the $n\times n$ matrix with i.i.d. Bernoulli($p_n$) entries. Then

\[
\mathbb{P}\big\{M_n\text{ is singular}\big\}=(1+o_{n}(1))\,\mathbb{P}\big\{\text{a row or a column of }M_n\text{ equals zero, or two rows or columns are equal}\big\}.
\]

In particular, if $\limsup p_{n}<1/2$ then

\[
\mathbb{P}\big\{M_n\text{ is singular}\big\}=(1+o_{n}(1))\,\mathbb{P}\big\{\text{either a row or a column of }M_n\text{ equals zero}\big\}.
\]

Conceptually, the above conjecture asserts that the main causes for singularity are local, in the sense that the linear dependencies typically appear within small subsets of rows or columns. In the special regime $np_{n}\leq\ln n+o_{n}(\ln\ln n)$, the conjecture was positively resolved in [5] (note that if $np_{n}\leq\ln n$ then the matrix has a zero row with probability at least $1-1/e-o_{n}(1)$). However, the regime $\liminf(np_{n}/\log n)>1$ was not covered in [5].

The main purpose of our paper is to develop methods capable of capturing the singularity probability with a sufficient precision to answer the above question. Interestingly, this appears to be more accessible in the sparse regime, when $p_n$ is bounded above by a small universal constant (we discuss this in the next section in more detail). It is not difficult to show that when $\liminf(np_{n}/\ln n)>1$, the events that a given row or a given column equals zero almost do not intersect, so that

\[
\mathbb{P}\big\{\text{either a row or a column of }M_n\text{ equals zero}\big\}=(2+o_{n}(1))\,n\,(1-p_{n})^{n}.
\]

Our main result can be formulated as follows.

Theorem 1.2.

There are universal constants $C,\widetilde{C}\geq 1$ with the following property. Let $n\geq 1$ and let $M_n$ be an $n\times n$ random matrix such that

The entries of $M_n$ are i.i.d. Bernoulli($p$), with $p=p_n$ satisfying $C\ln n\leq np\leq C^{-1}n$. (A)

Then

\[
\mathbb{P}\big\{M_n\text{ is singular}\big\}=(2+o_{n}(1))\,n\,(1-p)^{n},
\]

where $o_{n}(1)$ is a quantity which tends to zero as $n\to\infty$. Moreover, for every $t>0$,

\[
\mathbb{P}\big\{s_{\min}(M_{n})\leq t\,\exp(-3\ln^{2}(2n))\big\}\leq t+(1+o_{n}(1))\,\mathbb{P}\big\{M_n\text{ is singular}\big\}=t+(2+o_{n}(1))\,n\,(1-p)^{n}.
\]

In fact, our approach gives much better estimates on $s_{\min}$ in the regime when $p_{n}$ is constant, see Theorem 7.1 below. At the same time, we note that obtaining small ball probability estimates for $s_{\min}$ was not the main objective of this paper, and the argument was not fully optimized in that respect.
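
The following Monte Carlo sketch is a purely numerical aside, not part of the argument: it estimates the singularity probability of a Bernoulli($p$) matrix and compares it with $2n(1-p)^{n}$. It assumes Python with numpy, and the specific values of $n$, $p$ and the number of trials are arbitrary, so at such small dimensions the asymptotics is only indicative.

# Monte Carlo sanity check of P{M_n is singular} ~ 2 n (1-p)^n (illustration only).
import numpy as np

def singularity_frequency(n, p, trials=20000, seed=0):
    rng = np.random.default_rng(seed)
    singular = 0
    for _ in range(trials):
        M = (rng.random((n, n)) < p).astype(float)   # Bernoulli(p) 0/1 matrix
        if np.linalg.matrix_rank(M) < n:             # rank deficiency = singularity
            singular += 1
    return singular / trials

n, p = 30, 0.2
print("empirical frequency: ", singularity_frequency(n, p))
print("prediction 2n(1-p)^n:", 2 * n * (1 - p) ** n)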

Geometrically, the main result of our work asserts that (under appropriate assumptions on $p_n$) the probability that a collection of $n$ independent random vectors $X_{1}^{(n)},\dots,X_{n}^{(n)}$ in $\mathbb{R}^{n}$, with i.i.d. Bernoulli($p_n$) components, is linearly dependent is equal (up to a $(1+o_{n}(1))$ factor) to the probability of the event that either $X_{i}^{(n)}$ is zero for some $i\leq n$ or $X_{1}^{(n)},\dots,X_{n}^{(n)}$ are contained in the same coordinate hyperplane:

\[
\mathbb{P}\big\{X_{1}^{(n)},\dots,X_{n}^{(n)}\text{ are linearly dependent}\big\}=(1+o_{n}(1))\,\mathbb{P}\big\{X_{i}^{(n)}={\bf 0}\text{ for some }i\leq n\big\}+(1+o_{n}(1))\,\mathbb{P}\big\{\exists\text{ a coordinate hyperplane }H\text{ such that }X_{i}^{(n)}\in H\text{ for all }i\leq n\big\}.
\]

Thus, the linear dependencies between the vectors, when they appear, typically have the prescribed structure, falling into one of the two categories described above with the (conditional) probability $\frac{1}{2}+o_{n}(1)$.

The paper is organized as follows. In the next section, we give an overview of the proof of the main result. In Section 3, we gather some preliminary facts and important notions to be used later. In Section 4, we consider new anti-concentration inequalities for random $0/1$ vectors with a prescribed number of non-zero components, and introduce a functional (the u-degree of a vector) which enables us to classify vectors on the sphere according to anti-concentration properties of their inner products with the random $0/1$ vectors. In the same section, we prove a key technical result — Theorem 2.2 — which states, roughly speaking, that with very high probability a random unit vector orthogonal to $n-1$ columns of $M_n$ is either close to being sparse or to being a constant multiple of $(1,1,\dots,1)$, or the vector is very unstructured, i.e., has a very large u-degree.

In Section 5, we consider the special regime of constant probability of success $p$. In this regime, estimating the probability of the event that $M_n$ has an almost null vector which is either close to sparse or almost constant is relatively simple. The reader who is interested only in the regime of constant $p$ can thus skip the more technical Section 6 and obtain the proof of the main result as a combination of the theorems in Sections 4 and 5. In Section 6, we consider the entire range for $p$. Here, the treatment of almost constant and close to sparse null vectors is much more challenging and involves a careful analysis of multiple cases. Finally, in Section 7 we establish an invertibility-via-distance lemma and prove the main result of the paper. Some open questions are discussed in Section 8.

2 Overview of the proof

In this section, we provide a high-level overview of the proof; technical details will be discussed further in the text. The proof utilizes some known approaches to the matrix invertibility, which involve, in particular, a decomposition of the space into structured and unstructured parts, a form of invertibility via distance argument, small ball probability estimates based on the Esseen lemma, and various forms of the $\varepsilon$-net argument. The novel elements of the proof are anti-concentration inequalities for random vectors with a prescribed cardinality of the support, a structural theorem for normals to random hyperplanes spanned by vectors with i.i.d. Bernoulli($p$) components, and a sharp analysis of the matrix invertibility over the set of structured vectors. We will start the description with our use of the partitioning trick, followed by a modified invertibility-via-distance lemma, and then consider the anti-concentration inequality and the theorem for normals (Subsection 2.1) as well as invertibility over the structured vectors (Subsection 2.2).

The use of decompositions of the space $\mathbb{R}^{n}$ into structured and unstructured vectors has become rather standard in the literature. A common idea behind such partitions is to apply the Littlewood–Offord theory to analyse the unstructured vectors and to construct a form of the $\varepsilon$-net argument to treat the structured part. Various definitions of structured and unstructured have been used in works dealing with the matrix invertibility. One such decomposition was introduced in [28] and further developed in [41]. In this splitting the structured vectors are compressible, having a relatively small Euclidean distance to the set of sparse vectors, while the vectors in the complement are incompressible, having a large distance to sparse vectors and, as a consequence, many components of roughly comparable magnitudes. In our work, the decomposition of $\mathbb{R}^{n}$ is closer to the one introduced in [24, 27].

Let $x^{*}$ denote a non-increasing rearrangement of absolute values of components of a vector $x$, and let $r,\delta,\rho\in(0,1)$ be some parameters. Further, let $\mathbf{g}$ be a non-decreasing function from $[1,\infty)$ into $[1,\infty)$; we shall call it the growth function. At this moment, the choice of the growth function is not important; we can assume that $\mathbf{g}(t)$ grows roughly as $t^{\ln t}$. Define the set of gradual non-constant vectors as

\[
\mathcal{V}_{n}=\mathcal{V}_{n}(r,\mathbf{g},\delta,\rho):=\big\{x\in\mathbb{R}^{n}\,:\,x^{*}_{\lfloor rn\rfloor}=1,\ x^{*}_{i}\leq\mathbf{g}(n/i)\text{ for all }i\leq n,\ \text{ and }\ \exists\,Q_{1},Q_{2}\subset[n]\ \text{ such that }\ |Q_{1}|,|Q_{2}|\geq\delta n\ \text{ and }\ \max\limits_{i\in Q_{2}}x_{i}\leq\min\limits_{i\in Q_{1}}x_{i}-\rho\big\}.\tag{1}
\]

In a sense, constant multiples of the gradual non-constant vectors occupy most of the space $\mathbb{R}^{n}$; they play the role of the unstructured vectors in our argument. By negation, the structured vectors,

\[
\mathcal{S}_{n}=\mathcal{S}_{n}(r,\mathbf{g},\delta,\rho):=\mathbb{R}^{n}\setminus\bigcup_{\lambda\geq 0}\big(\lambda\,\mathcal{V}_{n}(r,\mathbf{g},\delta,\rho)\big),\tag{2}
\]

are either almost constant (with most of the components nearly equal) or have a very large ratio of $x^{*}_{i}$ and $x^{*}_{\lfloor rn\rfloor}$ for some $i<rn$.
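
For concreteness, here is a minimal sketch (Python with numpy, not taken from the paper) of a membership test for $\mathcal{V}_{n}$; the growth function is passed as a parameter, and a choice such as $\mathbf{g}(t)=t^{\ln t}$ matches the rough description above. The reduction of the $Q_{1},Q_{2}$ condition to a gap between order statistics is the one used in Lemma 3.2 below.

# Sketch of a membership test for the set V_n of gradual non-constant vectors from (1).
import numpy as np

def is_gradual_nonconstant(x, r, g, delta, rho):
    # g is assumed to be vectorized, e.g. g = lambda t: t ** np.log(t)
    x = np.asarray(x, dtype=float)
    n = len(x)
    xs = np.sort(np.abs(x))[::-1]                          # non-increasing rearrangement x*
    if not np.isclose(xs[int(np.floor(r * n)) - 1], 1.0):  # normalization x*_{floor(rn)} = 1
        return False
    if np.any(xs > g(n / np.arange(1, n + 1))):            # gradual: x*_i <= g(n/i)
        return False
    # essentially non-constant: the Q_1, Q_2 condition is equivalent to a gap of at
    # least rho between the k-th largest and the k-th smallest coordinate, k = ceil(delta n)
    k = int(np.ceil(delta * n))
    x_signed = np.sort(x)                                  # plain (signed) sort
    return x_signed[-k] - x_signed[k - 1] >= rho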

For simplicity, we only discuss the problem of singularity at this moment. As $M_n$ and $M_{n}^{\top}$ are equidistributed, to show that $\mathbb{P}\big\{M_n\text{ is singular}\big\}=(2+o_{n}(1))\,n\,(1-p)^{n}$, it is sufficient to verify that

\[
\mathbb{P}\Big(\big\{M_{n}x=0\text{ for some }x\in\mathcal{V}_{n}\big\}\cap\big\{M_{n}^{\top}x\neq 0\text{ for all }x\in\mathcal{S}_{n}\big\}\Big)=o_{n}(n)\,(1-p)^{n},\tag{3}
\]

and

\[
\mathbb{P}\big\{M_{n}x=0\text{ for some }x\in\mathcal{S}_{n}\big\}=(1+o_{n}(1))\,n\,(1-p)^{n}.
\]

The first relation is dealt with by using a variation of the invertibility via distance argument which was introduced in [41] to obtain sharp small ball probability estimates for the smallest singular value. In the form given in [41], the argument reduces the problem of invertibility over unstructured vectors to estimating distances of the form ${\rm dist}(\mathbf{C}_{i}(M_{n}),H_{i}(M_{n}))$, where $\mathbf{C}_{i}(M_{n})$ is the $i$-th column of $M_n$, and $H_{i}(M_{n})$ is the linear span of the columns of $M_n$ except for the $i$-th. In our setting, however, the argument needs to be modified to pass to estimating the distance conditioned on the size of the support of the column, as this allows using much stronger anti-concentration inequalities (see the following subsection). By the invariance of the distribution of $M_n$ under permutation of columns, it can be shown that in order to prove the relation (3), it is enough to verify that

\[
\mathbb{P}\big\{|{\rm supp\,}\mathbf{C}_{1}(M_{n})|\in[\tfrac{pn}{8},8pn]\ \text{ and }\ \langle\mathbf{Y},\mathbf{C}_{1}(M_{n})\rangle=0\ \text{ and }\ \mathbf{Y}/\mathbf{Y}^{*}_{\lfloor rn\rfloor}\in\mathcal{V}_{n}\big\}=o_{n}(n)\,(1-p)^{n},\tag{4}
\]

where $\mathbf{Y}$ is a non-zero random vector orthogonal to, and measurable with respect to, $H_{1}(M_{n})$ (see Lemma 7.4 and the beginning of the proof of Theorem 1.2). In this form, the question can be reduced to studying the anti-concentration of the linear combinations $\sum_{i=1}^{n}\mathbf{Y}_{i}b_{i}$, where the Bernoulli random variables $b_{1},\dots,b_{n}$ are jointly independent of $\mathbf{Y}$ and are conditioned to sum up to a fixed number in $[pn/8,8pn]$. This intermediate problem is discussed in the next subsection.

2.1 New anti-concentration inequalities for random vectors with prescribed support cardinality

The Littlewood–Offord theory — the study of anti-concentration properties of random variables — has been a crucial ingredient of many recent results on invertibility of random matrices, starting with the work of Tao–Vu [47]. In particular, the breakthrough result [41] of Rudelson–Vershynin mentioned in the introduction is largely based on studying the Lévy function $\mathcal{Q}(\langle\mathbf{C}_{1}(A),\mathbf{Y}\rangle,t)$, with $\mathbf{C}_{1}(A)$ being the first column of the random matrix $A$ and $\mathbf{Y}$ — a random unit vector orthogonal to the remaining columns of $A$.

We recall that given a random vector $X$ taking values in $\mathbb{R}^{n}$, the Lévy concentration function $\mathcal{Q}(X,t)$ is defined by

\[
\mathcal{Q}(X,t):=\sup\limits_{y\in\mathbb{R}^{n}}\mathbb{P}\big\{\|X-y\|\leq t\big\},\quad t\geq 0;
\]

in particular, for a scalar random variable $\xi$ we have $\mathcal{Q}(\xi,t):=\sup\limits_{\lambda\in\mathbb{R}}\mathbb{P}\{|\xi-\lambda|\leq t\}$. A common approach is to determine structural properties of a fixed vector which would imply desired upper bounds on the Lévy function of its scalar product with a random vector (say, a matrix column). The classical result of Erdős–Littlewood–Offord [12, 21] asserts that whenever $X$ is a vector in $\mathbb{R}^{n}$ with i.i.d. $\pm 1$ components, and $y=(y_{1},\dots,y_{n})\in\mathbb{R}^{n}$ is such that $|y_{i}|\geq 1$ for all $i$, we have

\[
\mathcal{Q}(\langle X,y\rangle,t)\leq Ct\,n^{-1/2}+Cn^{-1/2},
\]

where $C>0$ is a universal constant. It can be further deduced from the Lévy–Kolmogorov–Rogozin inequality [38] that the above assertion remains true whenever $X$ is a random vector with independent components $X_{i}$ satisfying $\mathcal{Q}(X_{i},c)\leq 1-c$ for some constant $c>0$. More delicate structural properties, based on whether the components of $y$ can be embedded into a generalized arithmetic progression with prescribed parameters, were employed in [47] to prove superpolynomially small upper bounds on the singularity probability of discrete random matrices.
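
As a quick numerical illustration of the Erdős–Littlewood–Offord bound (an aside, not used anywhere in the paper), one can estimate the largest atom of $\langle X,y\rangle$ by simulation; the extreme case $y=(1,\dots,1)$ already saturates the $n^{-1/2}$ rate. The use of numpy and the parameters below are arbitrary choices.

# Empirical Levy function of <X, y> at radius 0 for y = (1,...,1) and X with i.i.d. +-1 entries.
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)
n, trials = 40, 100000
y = np.ones(n)
sums = rng.choice([-1.0, 1.0], size=(trials, n)) @ y
largest_atom = Counter(np.round(sums, 8).tolist()).most_common(1)[0][1] / trials
print("Q(<X,y>, 0) ~", largest_atom, "   C*n^{-1/2} with C = 1:", n ** -0.5)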

The Least Common Denominator (LCD) of a unit vector, introduced in [41], played a central role in establishing the exponential upper bounds on the matrix singularity under more general assumptions on the entries' distributions. We recall that the LCD of a unit vector $y$ in $\mathbb{R}^{n}$ can be defined as

\[
{\rm LCD}(y):=\inf\big\{\theta>0:\ {\rm dist}(\theta y,\mathbb{Z}^{n})\leq\min(c_{1}\|\theta y\|,c_{2}\sqrt{n})\big\}\tag{5}
\]

for some parameters $c_{1},c_{2}\in(0,1)$. The small ball probability theorem of Rudelson and Vershynin [41] states that given a vector $X$ with i.i.d. components of zero mean and unit variance satisfying some additional mild assumptions,

\[
\mathcal{Q}(\langle X,y\rangle,t)\leq Ct+\frac{C^{\prime}}{{\rm LCD}(y)}+2e^{-c^{\prime}n}
\]

for some constants $C,C^{\prime},c^{\prime}>0$ (see [42] for a generalization of the statement). The LCD, or its relatives, were subsequently used in studying invertibility of non-Hermitian square matrices under broader assumptions [37, 30, 31], and delocalization of eigenvectors of non-Hermitian random matrices [43, 34, 32], among many other works.
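
To convey the definition (5) concretely, the following rough sketch (Python with numpy; not from the paper) scans $\theta$ over a grid and reports the first value at which $\theta y$ comes close to the integer lattice. The parameters $c_{1},c_{2}$, the grid, and the cut-off are ad hoc choices, so this only approximates the infimum in (5).

# Grid-scan approximation of LCD(y) = inf{theta > 0 : dist(theta*y, Z^n) <= min(c1*||theta*y||, c2*sqrt(n))}.
import numpy as np

def lcd_estimate(y, c1=0.05, c2=0.1, theta_max=100.0, step=0.01):
    y = np.asarray(y, dtype=float)
    n = len(y)
    for theta in np.arange(step, theta_max, step):
        v = theta * y
        dist = np.linalg.norm(v - np.round(v))              # dist(theta*y, Z^n)
        if dist <= min(c1 * np.linalg.norm(v), c2 * np.sqrt(n)):
            return theta
    return np.inf                                            # no admissible theta below the cut-off

print(lcd_estimate(np.array([3.0, 4.0]) / 5.0))              # rational direction: small LCD (about 5)
rng = np.random.default_rng(0)
g = rng.standard_normal(50)
print(lcd_estimate(g / np.linalg.norm(g)))                   # generic direction: exceeds the cut-off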

Anti-concentration properties of random linear combinations naturally play a central role in the current work; however, the measures of unstructuredness of vectors existing in the literature do not allow us to obtain the precise estimates we are aiming for. Here, we develop a new functional for dealing with linear combinations of dependent Bernoulli variables.

Given $n\in\mathbb{N}$, $1\leq m\leq n/2$, a vector $y\in\mathbb{R}^{n}$ and parameters $K_{1},K_{2}\geq 1$, we define the degree of unstructuredness (u-degree) of the vector $y$ by

\[
\mathbf{UD}_{n}(y,m,K_{1},K_{2}):=\sup\bigg\{t>0:\ A_{nm}\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t}^{t}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big(\big|\mathbb{E}\exp\big(2\pi\mathbf{i}\,y_{\eta[S_{i}]}\,m^{-1/2}s\big)\big|\big)\,ds\leq K_{1}\bigg\},\tag{6}
\]

where the sum is taken over all sequences $(S_{i})_{i=1}^{m}$ of disjoint subsets $S_{1},\dots,S_{m}\subset[n]$, each of cardinality $\lfloor n/m\rfloor$, and

\[
A_{nm}=\frac{\big((\lfloor n/m\rfloor)!\big)^{m}\,(n-m\lfloor n/m\rfloor)!}{n!}\,\cdot\tag{7}
\]

Here $\eta[S_{i}]$, $i\leq m$, denote mutually independent integer random variables uniformly distributed on the respective $S_{i}$'s. The function $\psi_{K_{2}}$ in the definition acts as a smoothing of $\max(\frac{1}{K_{2}},t)$, with $\psi_{K_{2}}(t)=\frac{1}{K_{2}}$ for all $t\leq\frac{1}{2K_{2}}$ and $\psi_{K_{2}}(t)=t$ for all $t\geq\frac{1}{K_{2}}$ (we prefer to skip discussion of this purely technical element of the proof in this section, and refer to the beginning of Section 4 for the full list of conditions imposed on $\psi_{K_{2}}$).

The functional $\mathbf{UD}_{n}(y,m,K_{1},K_{2})$ can be understood as follows. The expression inside the supremum is the average value of the integral

\[
\int\limits_{-t}^{t}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big(\big|\mathbb{E}\exp\big(2\pi\mathbf{i}\,y_{\eta[S_{i}]}\,m^{-1/2}s\big)\big|\big)\,ds,
\]

with the average taken over all choices of sequences $(S_{i})_{i=1}^{m}$. The function under the integral, disregarding the smoothing $\psi_{K_{2}}$, is the absolute value of the characteristic function of the random variable $\langle y,Z\rangle$, where $Z$ is a random $0/1$ vector with exactly $m$ ones, with the $i$-th one distributed uniformly on $S_{i}$. A relation between the magnitude of the characteristic function and anti-concentration properties of a random variable (the Esseen lemma, Lemma 3.12 below) has been commonly used in works on the matrix invertibility (see, for example, [40]), and determines the shape of the functional $\mathbf{UD}_{n}(\cdot)$. The definition of the u-degree is designed specifically to work with random $0/1$ vectors having a fixed sum (equal to $m$). The next statement follows from the definition of $\mathbf{UD}_{n}(\cdot)$ and the Esseen lemma.
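
The following small sketch makes the integrand of (6) explicit for one fixed family of blocks $S_{1},\dots,S_{m}$: it is the product over $i$ of the absolute values of the characteristic functions of $y_{\eta[S_{i}]}$ evaluated at $m^{-1/2}s$. This is Python with numpy; the parameters and the random choice of a single partition are arbitrary, and the smoothing $\psi_{K_{2}}$, the averaging over all partitions, and the factor $A_{nm}$ are omitted.

# Integrand of the u-degree (6) for one fixed family of blocks S_1,...,S_m (no smoothing).
import numpy as np

def ud_integrand(y, blocks, s):
    m = len(blocks)
    value = 1.0
    for S in blocks:
        # characteristic function of y_{eta[S]} (eta[S] uniform on S) at frequency s/sqrt(m)
        phi = np.mean(np.exp(2j * np.pi * y[S] * s / np.sqrt(m)))
        value *= abs(phi)
    return value

rng = np.random.default_rng(0)
n, m = 60, 6
y = rng.standard_normal(n)
perm = rng.permutation(n)
blocks = [perm[i * (n // m):(i + 1) * (n // m)] for i in range(m)]   # disjoint blocks of size n/m
print([round(ud_integrand(y, blocks, s), 4) for s in (0.0, 0.5, 1.0, 2.0)])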

Theorem 2.1 (A Littlewood–Offord-type inequality in terms of the u-degree).

Let $m,n$ be positive integers with $m\leq n/2$, and let $K_{1},K_{2}\geq 1$. Further, let $v\in\mathbb{R}^{n}$, and let $X=(X_{1},\dots,X_{n})$ be a random $0/1$ vector in $\mathbb{R}^{n}$ uniformly distributed on the set of vectors with $m$ ones and $n-m$ zeros. Then

\[
\mathcal{Q}\Big(\sum\limits_{i=1}^{n}v_{i}X_{i},\sqrt{m}\,\tau\Big)\leq C_{2.1}\,\big(\tau+\mathbf{UD}_{n}(v,m,K_{1},K_{2})^{-1}\big)\quad\text{for all }\tau>0,
\]

where $C_{2.1}>0$ may only depend on $K_{1}$.

The principal difference of the u-degree and the above theorem from the notion of the LCD and (5) is that the former allow one to obtain stronger anti-concentration inequalities in the same regime of sparsity, assuming that the coefficient vector $y$ is sufficiently unstructured. In fact, under certain conditions, sparse random $0/1$ vectors with a prescribed support cardinality admit stronger anti-concentration inequalities compared to the i.i.d. model.

The last principle can be illustrated by taking the coefficient vector $y$ to be a “typical” vector on the sphere $S^{n-1}$. First, assume that $b_{1},\dots,b_{n}$ are i.i.d. Bernoulli($p$), with $p<1/2$. Then it is easy to see that for almost all (with respect to the normalized Lebesgue measure) vectors $y\in S^{n-1}$,

\[
\mathcal{Q}\Big(\sum_{i=1}^{n}y_{i}b_{i},0\Big)=(1-p)^{n}.
\]

In words, for a typical coefficient vector $y$ on the sphere, the linear combination $\sum_{i=1}^{n}y_{i}b_{i}$ takes distinct values for any two distinct realizations of $(b_{1},\dots,b_{n})$, and thus the Lévy function at zero is equal to the probability measure of the largest atom of the distribution of $\sum_{i=1}^{n}y_{i}b_{i}$, which corresponds to all $b_{i}$ equal to zero. In contrast, if the vector $(b_{1},\dots,b_{n})$ is uniformly distributed on the set of $0/1$ vectors with support of size $d=pn$, then for almost all $y\in S^{n-1}$, the random sum $\sum_{i=1}^{n}y_{i}b_{i}$ takes ${n\choose d}$ distinct values. Thus,

\[
\mathcal{Q}\Big(\sum_{i=1}^{n}y_{i}b_{i},0\Big)={n\choose np}^{-1},
\]

where ${n\choose np}^{-1}\ll(1-p)^{n}$ for small $p$.
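
A minimal simulation comparing the largest atoms of $\sum_{i}y_{i}b_{i}$ in the two models discussed above (Python with numpy; the small values of $n$, $p$ and the number of trials are arbitrary, and this is an illustration only):

# Largest atom of sum_i y_i b_i for a generic y: i.i.d. Bernoulli(p) b_i vs. exactly d = pn ones.
import math
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)
n, p, trials = 16, 0.25, 100000
d = int(p * n)
y = rng.standard_normal(n)                      # almost surely all subset sums are distinct

iid = (rng.random((trials, n)) < p).astype(float) @ y
fixed = np.array([rng.permutation(np.r_[np.ones(d), np.zeros(n - d)]) for _ in range(trials)]) @ y

atom = lambda v: Counter(np.round(v, 10).tolist()).most_common(1)[0][1] / len(v)
print("i.i.d. model:  largest atom ~", atom(iid), "  vs (1-p)^n =", (1 - p) ** n)
print("fixed-support: largest atom ~", atom(fixed), "  vs 1/C(n,d) =", 1 / math.comb(n, d))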

The above example provides only qualitative estimates and does not give information on the location of the atoms of the distribution of $\sum_{i=1}^{n}y_{i}b_{i}$. The notion of the u-degree addresses this problem. The following theorem, which is the main result of Section 4, asserts that with a very large probability the normal vector to the (say, last) $n-1$ columns of our matrix $M_n$ is either very structured or has a very large u-degree, much greater than the critical value $(1-p)^{-n}$.

Theorem 2.2.

Let $r,\delta,\rho\in(0,1)$, $s>0$, $R\geq 1$, and let $K_{3}\geq 1$. Then there are $n_{0}\in\mathbb{N}$, $C\geq 1$ and $K_{1}\geq 1$, $K_{2}\geq 4$ depending on $r,\delta,\rho,R,s,K_{3}$ such that the following holds. Let $n\geq n_{0}$, $p\leq C^{-1}$, and $s\ln n\leq pn$. Let $\mathbf{g}:[1,\infty)\to[1,\infty)$ be an increasing (growth) function satisfying

\[
\forall a\geq 2\ \ \forall t\geq 1:\ \ \mathbf{g}(a\,t)\geq\mathbf{g}(t)+a\qquad\text{ and }\qquad\prod_{j=1}^{\infty}\mathbf{g}(2^{j})^{j\,2^{-j}}\leq K_{3}.\tag{8}
\]

Assume that $M_n$ is an $n\times n$ Bernoulli($p$) random matrix. Then with probability at least $1-\exp(-Rpn)$ one has

\[
\{\text{Set of normal vectors to }\mathbf{C}_{2}(M_{n}),\dots,\mathbf{C}_{n}(M_{n})\}\cap\mathcal{V}_{n}(r,\mathbf{g},\delta,\rho)\subset\{x\in\mathbb{R}^{n}:\ x^{*}_{\lfloor rn\rfloor}=1,\ \mathbf{UD}_{n}(x,m,K_{1},K_{2})\geq\exp(Rpn)\ \text{ for all }pn/8\leq m\leq 8pn\}.
\]

We would like to emphasize that the parameter $s$ in this theorem can take values less than one, in the regime when the matrix $M_n$ typically has null rows and columns. In this respect, the restriction $p\geq C\ln n/n$ in the main theorem comes from the treatment of structured vectors.

The proof of Theorem 2.2 is rather involved, and is based on a double counting argument and specially constructed lattice approximations of the normal vectors. We refer to Section 4 for details. Here, we only note that, by taking $R$ to be a sufficiently large constant, the theorem implies the relation (4), hence accomplishes the treatment of unstructured vectors.

2.2 Almost constant, steep and $\mathcal{R}$-vectors

In this subsection we discuss our treatment of the set of structured vectors, $\mathcal{S}_{n}$. In the proof we partition the set $\mathcal{S}_{n}$ into several subsets and work with them separately. In a simplistic form, the structured vectors are dealt with in two ways: either by constructing discretizations and taking the union bound (variations of the $\varepsilon$-net argument), or via deterministic estimates in the case when there are very few very large components in the vector. We note here that the discretization procedure has to take into account the non-centeredness of our random matrix model: while in the case of centered matrices with i.i.d. components (and under appropriate moment conditions) the norm of the matrix is typically of order $\sqrt{n}$ times the standard deviation of an entry, for our Bernoulli($p$) model it has order $pn$ (i.e., roughly $\sqrt{p}\,n$ times the standard deviation of an entry), which makes a direct application of the $\varepsilon$-net argument impossible. Fortunately, this large norm is attained only in one direction — the direction of the vector ${\bf 1}=(1,1,\dots,1)$ — while on the orthogonal complement of ${\bf 1}$ the typical norm is $\sqrt{pn}$. Therefore it is enough to take a standard net in the Euclidean norm and to make it denser in that one direction, which almost does not affect the cardinality of the net. We refer to Section 3.6 for details.

Let us first describe our approach in the (simpler) case when $p\in(q,c)$, where $c$ is a small enough absolute constant and $q\in(0,c)$ is a fixed parameter (independent of $n$). We introduce four auxiliary sets and show that the set of unit structured vectors, $\mathcal{S}_{n}\cap S^{n-1}$, is contained in the closure of their union.

The first set, $\mathcal{B}_{1}$, consists of unit vectors close to vectors of the canonical basis, specifically, unit vectors $x$ satisfying $x_{1}^{*}>6pnx_{2}^{*}$, where $x^{*}$ denotes the non-increasing rearrangement of the vector $(|x_{i}|)_{i\leq n}$. For any such vector $x$ the individual bound is rather straightforward — conditioned on the event that there are no zero columns in our matrix $M$, and that the Euclidean norms of the matrix rows are not too large, we get $Mx\neq 0$. This class is the main contributor to the bound $(1+o_{n}(1))\,n(1-p)^{n}$ for non-invertibility over the structured vectors $\mathcal{S}_{n}$.

For the other three sets we use anti-concentration probability estimates and discretizations. An application of Rogozin's lemma (Proposition 3.9) implies that the probability of having a small inner product of a given row of our matrix with $x$ is small, provided that there is a subset $A\subset[n]$ such that the maximal coordinate of $P_{A}x$ is bounded above by $c\sqrt{p}\|P_{A}x\|$, where $\|\cdot\|$ denotes the standard Euclidean norm and $P_{A}$ is the coordinate projection onto $\mathbb{R}^{A}$. Combined with the tensorization Lemma 3.8, this implies exponentially (in $n$) small probability of the event that $\|Mx\|$ is close to zero — see Proposition 3.10 below. Specifically, we define $\mathcal{B}_{2}$ as the set of unit vectors satisfying the above condition with $A=[n]$, that is, satisfying $x_{1}^{*}\leq c\sqrt{p}$, and for $\mathcal{B}_{3}$ we take all unit vectors satisfying the condition with $A=\sigma_{x}([2,n])$, that is, satisfying $x_{2}^{*}\leq c\sqrt{p}\|P_{\sigma_{x}([2,n])}x\|$, where $\sigma_{x}$ is a permutation satisfying $x^{*}_{i}=|x_{\sigma_{x}(i)}|$, $i\leq n$. For vectors from these two sets we have very good individual probability estimates, but, unfortunately, the complexity of both sets is large — they do not admit nets of small cardinality. To overcome this issue, we have to redefine these sets by intersecting them with specially chosen sets of vectors having many almost equal coordinates. For the precise definition of such sets, denoted by $U(m,\gamma)$, see Subsection 3.6. A set $U(m,\gamma)$ is a variant of the class of almost constant vectors, $\mathcal{AC}(\rho)$ (see (9) below), introduced to deal with general $p$. Having a large part of the coordinates of a vector almost equal to each other reduces the complexity of the set, making it possible to construct a net of small cardinality. This resolves the problem and allows us to deal with these two classes of sets. The remaining class of vectors, $\mathcal{B}_{4}$, consists of vectors $x$ with $x_{1}^{*}\geq x_{2}^{*}\geq c\sqrt{p}\|P_{\sigma_{x}([2,n])}x\|$, i.e., vectors whose two largest components are relatively big. For such vectors we produce the needed anti-concentration estimates for the matrix-vector products by using only those two components, i.e., we consider anti-concentration for the vector $P_{A}x$, where $A=\sigma_{x}(\{1,2\})$. Since Rogozin's lemma is not suitable for this case, we compute the anti-concentration directly in Proposition 3.11. As for the classes $\mathcal{B}_{2},\mathcal{B}_{3}$, we actually intersect the fourth class with appropriately chosen sets of almost constant vectors in order to control the cardinalities of the nets. The final step is to show that the set $\mathcal{S}_{n}$ is contained in the union of the four sets described here. A careful analysis of this approach shows that the result can be proved with all constants and parameters $r,\delta,\rho$ depending only on $q$. Thus, it works for $p$ lying between the two constants $q$ and $c$.

The case of small $p$, that is, the case $C(\ln n)/n\leq p\leq c$, requires a more sophisticated splitting of $\mathcal{S}_{n}$ — we split it into steep vectors and $\mathcal{R}$-vectors. The definition and the treatment of steep vectors essentially follow [24, 27], with corresponding adjustments for our model. The set of steep vectors consists of vectors having a large jump between order statistics measured at certain indices. The first subclass of steep vectors, $\mathcal{T}_{0}$, is the same as the class $\mathcal{B}_{1}$ described above — vectors having a very large maximal coordinate — and is treated as $\mathcal{B}_{1}$. Similarly to the case of constant $p$, this class is the main contributor to the bound $(1+o_{n}(1))\,n(1-p)^{n}$ for non-invertibility over structured vectors. Next we fix a certain $m\approx 1/p$ and consider a sequence $n_{0}=2$, $n_{j+1}/n_{j}=\ell_{0}$, $j\leq s_{0}-1$, $n_{s_{0}+1}=m$, for some specially chosen parameters $\ell_{0}$ and $s_{0}$ depending on $p$ and $n$. The class $\mathcal{T}_{1}$ is defined as the class of vectors such that there exists $j$ with $x^{*}_{n_{j}}>6pn\,x^{*}_{n_{j+1}}$. To work with vectors from this class, we first show that, for a given $j$, the following event holds with a very high probability: for every choice of two disjoint sets $J_{1},J_{2}$ with $|J_{1}|=n_{j}$ and $|J_{2}|=n_{j+1}-n_{j}$, the random Bernoulli($p$) matrix has a row with exactly one $1$ among the components indexed by $J_{1}$ and no $1$'s among the components indexed by $J_{2}$. Then, conditioned on this event, for every $x\in\mathcal{T}_{1}$, we choose $J_{1}$ corresponding to $x_{i}^{*}$, $i\leq n_{j}$, and $J_{2}$ corresponding to $x_{i}^{*}$, $n_{j}\leq i\leq n_{j+1}$, and the corresponding row. The inner product of this row with $x$ is then large in absolute value due to the jump (see Lemma 6.9 for the details). Thus, conditioned on the described event, for every $x\in\mathcal{T}_{1}$ we have a good lower bound on $\|Mx\|$. The next two classes of steep vectors, $\mathcal{T}_{2}$ and $\mathcal{T}_{3}$, consist of vectors having a jump of order $C\sqrt{pn}$, namely, vectors in $\mathcal{T}_{2}$ satisfy $x_{m}^{*}>C\sqrt{pn}\,x_{k}^{*}$ and vectors in $\mathcal{T}_{3}$ satisfy $x_{k}^{*}>C\sqrt{pn}\,x_{\ell}^{*}$, where $k\approx\sqrt{n/p}$ and $\ell=\lfloor rn\rfloor$ ($r$ is the parameter from the definition of $\mathcal{V}_{n}(r,\mathbf{g},\delta,\rho)$). Trying to apply the same idea to these two subclasses, one sees that the size of the corresponding sets $J_{1}$ and $J_{2}$ is too large to have exactly one $1$ among a row's components indexed by $J_{1}\cup J_{2}$ with high probability. Therefore the proof of the individual probability bounds is more delicate and technical, as is the construction of the corresponding nets for $\mathcal{T}_{2},\mathcal{T}_{3}$. We discuss the details in Subsection 6.6.

The class of $\mathcal{R}$-vectors consists of non-steep vectors to which Rogozin's lemma (Proposition 3.9) can be applied when we project a vector onto its $n-k$ smallest coordinates with $m<k\leq n/\ln^{2}(pn)$; thus vectors from this class satisfy $\|P_{A}x\|_{\infty}\leq c\sqrt{p}\,\|P_{A}x\|$ for $A=\sigma_{x}([k,n])$ (we will take the union over all choices of an integer $k$ in the interval $(m,n/\ln^{2}(pn)]$). Thus, the individual probability bounds for $\mathcal{R}$-vectors will follow from Rogozin's lemma together with the tensorization lemma, as for the classes $\mathcal{B}_{2}$, $\mathcal{B}_{3}$ described above. The remaining part is to construct a good net for $\mathcal{R}$-vectors. For simplicity, dealing with such vectors, we fix the normalization $x^{*}_{\lfloor rn\rfloor}=1$. Since the vectors are non-steep, we have a certain control of the largest coordinates and, thus, of the Euclidean norm of a vector. The upper bound on $k$ is chosen in such a way that the cardinality of a net corresponding to the largest coordinates of a vector is relatively small (they lie in an $n/\ln^{2}(pn)$-dimensional subspace). For the purpose of constructing a net of small cardinality, we need to control the Euclidean norm of $P_{A}x$ for an $\mathcal{R}$-vector. Therefore we split $\mathcal{R}$-vectors into level sets according to the value of $\|P_{A}x\|$. There will be two different types of level sets — vectors with relatively large Euclidean norm of $P_{A}x$ and vectors with small $\|P_{A}x\|$. A net for level sets with large $\|P_{A}x\|$ is easier to construct, since we can zero all coordinates starting with $x^{*}_{\lfloor rn\rfloor}=1$. If the Euclidean norm is small, we cannot do this, so we intersect this subclass with almost constant vectors (in fact we incorporate this intersection into the definition of $\mathcal{R}$-vectors), defined by

\[
\mathcal{AC}(\rho):=\big\{x\in\mathbb{R}^{n}\,:\,\exists\lambda\in\mathbb{R}\ \text{ s.t. }\ |\lambda|=x^{*}_{\lfloor rn\rfloor}\ \text{ and }\ |\{i\leq n\,:\,|x_{i}-\lambda|\leq\rho|\lambda|\}|>n-\lfloor rn\rfloor\big\}.\tag{9}
\]

As in the case of constant $p$, this essentially reduces the dimension corresponding to the almost constant part to one and therefore reduces the cardinality of a net. The rather technical construction of the nets is presented in Subsection 6.3. In some aspects the construction follows ideas developed in [24].

3 Preliminaries

3.1 General notation

By universal or absolute constants we always mean numbers independent of all involved parameters, in particular independent of $p$ and $n$. Given positive integers $\ell<k$ we denote the sets $\{1,2,\ldots,\ell\}$ and $\{\ell,\ell+1,\ldots,k\}$ by $[\ell]$ and $[\ell,k]$ respectively. Given two functions $f$ and $g$, we write $f\approx g$ if there are two absolute positive constants $c$ and $C$ such that $cf\leq g\leq Cf$. As usual, $\Pi_{n}$ denotes the permutation group on $[n]$.

For every vector $x=(x_{i})_{i=1}^{n}\in\mathbb{R}^{n}$, by $(x_{i}^{*})_{i=1}^{n}$ we denote the non-increasing rearrangement of the sequence $(|x_{i}|)_{i=1}^{n}$, and we fix one permutation $\sigma_{x}$ satisfying $|x_{\sigma_{x}(i)}|=x_{i}^{*}$, $i\leq n$. We use $\langle\cdot,\cdot\rangle$ for the standard inner product on $\mathbb{R}^{n}$, that is, $\langle x,y\rangle=\sum_{i=1}^{n}x_{i}y_{i}$. Further, we write $\|x\|_{\infty}=\max_{i}|x_{i}|$ for the $\ell_{\infty}$-norm of $x$. We also denote ${\bf 1}=(1,1,\dots,1)$.
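
In code (a throwaway numpy illustration of this notation, nothing more), the rearrangement and an admissible permutation $\sigma_{x}$ can be obtained as follows.

# x*_i is the i-th largest of |x_1|,...,|x_n|; sigma_x is one permutation realizing it.
import numpy as np

x = np.array([0.3, -2.0, 0.0, 1.5, -0.3])
sigma_x = np.argsort(-np.abs(x), kind="stable")   # indices with |x_{sigma_x(i)}| = x*_i
x_star = np.abs(x)[sigma_x]                        # non-increasing rearrangement (x*_i)
print(sigma_x, x_star, np.max(np.abs(x)))          # note ||x||_inf = x*_1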

3.2 Lower bound on the singularity probability

Here, we provide a simple argument showing that for the sequence of random Bernoulli($p_n$) matrices $(M_{n})$, with $p_n$ satisfying $(np_{n}-\ln n)\longrightarrow\infty$ as $n\to\infty$, we have

\[
\mathbb{P}\big\{M_n\text{ contains a zero row or column}\big\}\geq(2-o_{n}(1))\,n\,(1-p)^{n}.
\]

Our approach is similar to that applied in [5] in the related context.

Fix $n>1$ and write $p=p_{n}$. Let ${\bf 1}_{R}$ be the indicator of the event that there is a zero row in the matrix $M_n$, and, similarly, let ${\bf 1}_{C}$ be the indicator of the event that $M_n$ has a zero column. Then, obviously,

\[
\mathbb{E}\,{\bf 1}_{R}=\mathbb{E}\,{\bf 1}_{C}=1-\big(1-(1-p)^{n}\big)^{n},
\]

hence,

\[
\mathbb{E}({\bf 1}_{R}+{\bf 1}_{C})^{2}\geq 2-2\big(1-(1-p)^{n}\big)^{n}.
\]

On the other hand,

\[
\mathbb{E}\,{\bf 1}_{R}\,{\bf 1}_{C}\leq\sum_{i=1}^{n}\sum_{j=1}^{n}\mathbb{P}\big\{i\text{-th row and }j\text{-th column of }M_{n}\text{ are zero}\big\}=n^{2}(1-p)^{2n-1},
\]

implying

\[
\mathbb{E}({\bf 1}_{R}+{\bf 1}_{C})^{2}=\mathbb{P}\big\{{\bf 1}_{R}+{\bf 1}_{C}=1\big\}+4\,\mathbb{P}\big\{{\bf 1}_{R}\,{\bf 1}_{C}=1\big\}\leq\mathbb{P}\big\{{\bf 1}_{R}+{\bf 1}_{C}=1\big\}+4n^{2}(1-p)^{2n-1}.
\]

Therefore,

\[
\mathbb{P}\big\{M_n\text{ contains a zero row or column}\big\}\geq\mathbb{P}\big\{{\bf 1}_{R}+{\bf 1}_{C}=1\big\}\geq\mathbb{E}({\bf 1}_{R}+{\bf 1}_{C})^{2}-4n^{2}(1-p)^{2n-1}\geq 2-2\big(1-(1-p)^{n}\big)^{n}-4n^{2}(1-p)^{2n-1}.
\]

It remains to note that, with our assumption on the growth rate of $p=p_{n}$, we have $n(1-p)^{n}\longrightarrow 0$, which implies

\[
\frac{1}{n(1-p)^{n}}\Big(2-2\big(1-(1-p)^{n}\big)^{n}-4n^{2}(1-p)^{2n-1}\Big)\longrightarrow 2.
\]
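
A quick numerical check of this limit (an aside; the choice $p=2\ln n/n$ is an arbitrary example satisfying $np_{n}-\ln n\to\infty$, and numpy is used only for convenience):

# The ratio below should approach 2 as n grows (cf. the display above).
import numpy as np

for n in (10**3, 10**4, 10**5, 10**6):
    p = 2 * np.log(n) / n
    q = (1 - p) ** n                                   # probability that a fixed row is zero
    lower = 2 - 2 * (1 - q) ** n - 4 * n**2 * (1 - p) ** (2 * n - 1)
    print(n, lower / (n * q))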

3.3 Gradual non-constant vectors

For any $r\in(0,1)$, we define $\Upsilon_{n}(r)$ as the set of all vectors $x$ in $\mathbb{R}^{n}$ with $x^{*}_{\lfloor rn\rfloor}=1$. We will call these vectors $r$-normalized. By a growth function $\mathbf{g}$ we mean any non-decreasing function from $[1,\infty)$ into $[1,\infty)$.

Let $\mathbf{g}$ be an arbitrary growth function. We will say that a vector $x\in\Upsilon_{n}(r)$ is gradual (with respect to the function $\mathbf{g}$) if $x^{*}_{i}\leq\mathbf{g}(n/i)$ for all $i\leq n$. Further, if $x\in\Upsilon_{n}(r)$ satisfies

\[
\exists\,Q_{1},Q_{2}\subset[n]\ \text{ such that }\ |Q_{1}|,|Q_{2}|\geq\delta n\ \text{ and }\ \max\limits_{i\in Q_{2}}x_{i}\leq\min\limits_{i\in Q_{1}}x_{i}-\rho\tag{10}
\]

then we say that the vector $x$ is essentially non-constant or just non-constant (with parameters $\delta,\rho$). Recall that the set $\mathcal{V}_{n}=\mathcal{V}_{n}(r,\mathbf{g},\delta,\rho)$ was defined in (1) as

\[
\big\{x\in\Upsilon_{n}(r):\ x\text{ is gradual with respect to }\mathbf{g}\text{ and satisfies (10)}\big\}.
\]

We call vectors from this set gradual non-constant vectors.

Recall that the set $\mathcal{S}_{n}=\mathcal{S}_{n}(r,\mathbf{g},\delta,\rho)$ of structured vectors was defined in (2) as the complement of the set of scalar multiples of $\mathcal{V}_{n}(r,\mathbf{g},\delta,\rho)$. The next simple lemma will allow us to reduce the analysis of $\{x/\|x\|:\ x\in\mathcal{S}_{n}\}$ to the treatment of the set $\{x/\|x\|:\ x\in\Upsilon_{n}(r)\setminus\mathcal{V}_{n}\}$.

Lemma 3.1.

For any choice of parameters $r,\mathbf{g},\delta,\rho$, the set $\{x/\|x\|:\ x\in\mathcal{S}_{n}\}$ is contained in the closure of the set $\{x/\|x\|:\ x\in\Upsilon_{n}(r)\setminus\mathcal{V}_{n}\}$.

Proof.

Let $y$ be a unit vector such that $y=x/\|x\|$ for some $x\in\mathcal{S}_{n}$. If $x^{*}_{\lfloor rn\rfloor}\neq 0$ then $y=z/\|z\|$, where $z=x/x^{*}_{\lfloor rn\rfloor}\in\Upsilon_{n}(r)\setminus\mathcal{V}_{n}$. If $x^{*}_{\lfloor rn\rfloor}=0$, we can consider a sequence of vectors $(x(j))_{j\geq 1}$ in $\mathbb{R}^{n}$ defined by $x(j)_{i}=x_{i}$ for $i\neq\lfloor rn\rfloor$ and $x(j)_{\lfloor rn\rfloor}=1/j$. Let

\[
y(j):=x(j)/x(j)^{*}_{\lfloor rn\rfloor}\in\Upsilon_{n}(r),\quad j\geq 1.
\]

Clearly, $y(j)^{*}_{1}\longrightarrow\infty$, so for all sufficiently large $j$ we have $y(j)\notin\mathcal{V}_{n}$. Thus, for all large $j$,

\[
y(j)/\|y(j)\|\in\{x^{\prime}/\|x^{\prime}\|:\ x^{\prime}\in\Upsilon_{n}(r)\setminus\mathcal{V}_{n}\},
\]

whereas $y(j)/\|y(j)\|=x(j)/\|x(j)\|\longrightarrow x/\|x\|$. This implies the desired result. ∎

We will need the following two lemmas. The first one states that vectors which do not satisfy (10) are almost constant (that is, have a large part of their coordinates nearly equal to each other). The second one is a simple combinatorial estimate, so we omit its proof.

Lemma 3.2.

Let $n\geq 1$, $\delta,\rho,r\in(0,1)$. Denote $k=\lceil\delta n\rceil$ and $m=\lfloor rn\rfloor$ and assume $n\geq 2m>4k$. Assume $x\in\Upsilon_{n}(r)$ does not satisfy (10). Then there exist $A\subset[n]$ of cardinality $|A|>n-m$ and $\lambda$ with $|\lambda|=1$ such that $|x_{i}-\lambda|<\rho$ for every $i\in A$.

Proof.

By $(x_{i}^{\#})_{i}$ denote the non-increasing rearrangement of $(x_{i})_{i}$ (we would like to emphasize that we do not take absolute values). Note that there are two subsets $Q_{1},Q_{2}\subset[n]$ with $|Q_{1}|,|Q_{2}|\geq k$ satisfying $\max_{i\in Q_{2}}x_{i}\leq\min_{i\in Q_{1}}x_{i}-\rho$ if and only if $x_{k}^{\#}-x_{n-k+1}^{\#}\geq\rho$. Therefore, using that $x$ does not satisfy (10), we observe $x_{k}^{\#}-x_{n-k+1}^{\#}<\rho$. Next consider the set

\[
A:=\{x_{i}^{\#}\,:\,k<i\leq n-k\}.
\]

Then $|A|=n-2k>n-m$. Since $x^{*}_{m}=1$ we obtain that

\[
|\{i\,:\,|x_{i}|>1\}|<m\leq n-m\quad\text{ and }\quad|\{i\,:\,|x_{i}|<1\}|\leq n-m.
\]

Therefore, there exists an index $j\in A$ such that $|x_{j}|=1$. Taking $\lambda=x_{j}$, we observe that for every $i\in A$, $|x_{i}-\lambda|<\rho$. This completes the proof. ∎

Lemma 3.3.

For any $\delta\in(0,1]$ there are $n_{\delta}\in\mathbb{N}$, $c_{\delta}>0$ and $C_{\delta}\geq 1$ depending only on $\delta$ with the following property. Let $n\geq n_{\delta}$ and let $m\in\mathbb{N}$ satisfy $n/m\geq C_{\delta}$. Denote by $\mathcal{S}$ the collection of sequences $(S_{1},\dots,S_{m})$ of subsets of $[n]$ with $|S_{i}|=\lfloor n/m\rfloor$ and $S_{i}\cap S_{j}=\emptyset$ for all $i\neq j$. Let $A_{nm}$ be as in (7). Then for any pair $Q_{1},Q_{2}$ of disjoint subsets of $[n]$ of cardinality at least $\delta n$ each, one has

\[
\Big|\Big\{(S_{1},\dots,S_{m})\in\mathcal{S}:\ \min(|S_{i}\cap Q_{1}|,|S_{i}\cap Q_{2}|)\geq\frac{\delta}{2}\lfloor n/m\rfloor\text{ for at most }c_{\delta}m\text{ indices }i\Big\}\Big|\leq e^{-c_{\delta}n}A_{nm}^{-1}.
\]

3.4 Auxiliary results for Bernoulli r.v. and random matrices

Let $p\in(0,1)$ and let $\delta$ be a Bernoulli random variable taking value $1$ with probability $p$ and $0$ with probability $1-p$. We say that $\delta$ is a Bernoulli($p$) random variable. A random matrix with i.i.d. entries distributed as $\delta$ will be called a Bernoulli($p$) random matrix.

Here we provide four lemmas needed below. We start with the notation for random matrices used throughout the paper. The class of all $n\times n$ matrices having $0/1$ entries is denoted by $\mathcal{M}_{n}$. We will consider the probability measure on $\mathcal{M}_{n}$ induced by the distribution of an $n\times n$ Bernoulli($p$) random matrix. We will use the same notation $\mathbb{P}$ for this probability measure; the parameter $p$ will always be clear from the context. Let $M=\{\mu_{ij}\}\in\mathcal{M}_{n}$. By ${\bf R}_{i}={\bf R}_{i}(M)$ we denote the $i$-th row of $M$, and by ${\bf C}_{i}(M)$ — the $i$-th column, $i\leq n$. By $\|M\|$ we always denote the operator norm of $M$ acting as an operator $\ell_{2}\to\ell_{2}$. This norm is also called the spectral norm and equals the largest singular value.

We will need the following form of Bennett’s inequality.

Lemma 3.4.

Let $n\geq 1$, $0<q<1$, and let $\delta$ be a Bernoulli($q$) random variable. Let $\delta_{i}$ and $\delta_{ij}$, $i,j\leq n$, be independent copies of $\delta$. Define the function $h(u):=(1+u)\ln(1+u)-u$, $u\geq 0$. Then for every $t>0$,

\[
\max\left(\mathbb{P}\left(\sum_{i=1}^{n}\delta_{i}>qn+t\right),\mathbb{P}\left(\sum_{i=1}^{n}\delta_{i}<qn-t\right)\right)\leq\exp\left(-\frac{nq(1-q)}{\max^{2}(q,1-q)}\,h\left(\frac{t\max(q,1-q)}{nq(1-q)}\right)\right).
\]

In particular, for $0<\varepsilon\leq q\leq 1/2$,

\[
\max\left(\mathbb{P}\left(\sum_{i=1}^{n}\delta_{i}>(q+\varepsilon)n\right),\mathbb{P}\left(\sum_{i=1}^{n}\delta_{i}<(q-\varepsilon)n\right)\right)\leq\exp\left(-\frac{n\varepsilon^{2}}{2q(1-q)}\left(1-\frac{\varepsilon}{3q}\right)\right),
\]

and for $q\leq 1/2$, $\tau>e$,

\[
\mathbb{P}\left(\sum_{i=1}^{n}\delta_{i}>(\tau+1)qn\right)\leq\exp\left(-\tau\ln(\tau/e)\,qn\right).
\]

Furthermore, for $50/n\leq q\leq 0.1$,

\[
\mathbb{P}\Big(qn/8\leq\sum_{i=1}^{n}\delta_{i}\leq 8qn\Big)\geq 1-(1-q)^{n/2}.
\]

Moreover, if $n\geq 30$ and $p=q\geq(4\ln n)/n$ then, denoting

\[
\mathcal{E}_{sum}:=\Big\{M=\{\delta_{ij}\}_{i,j\leq n}\in\mathcal{M}_{n}\,:\,\sum_{j=1}^{n}\delta_{ij}\leq 3.5pn\ \text{ for every }i\leq n\Big\},
\]

we have $\mathbb{P}(\mathcal{E}_{sum})\geq 1-\exp(-1.5np)$.

Proof.

Recall that Bennett's inequality states that for mean zero independent random variables $\xi_{1}$, …, $\xi_{n}$ satisfying $\xi_{i}\leq\rho$ (for a certain fixed $\rho>0$) almost surely for $i\leq n$, one has for every $t>0$,

\[
\mathbb{P}\left(\sum_{i=1}^{n}\xi_{i}>t\right)\leq\exp\left(-\frac{\sigma^{2}}{\rho^{2}}\,h\left(\frac{\rho t}{\sigma^{2}}\right)\right),
\]

where $\sigma^{2}=\sum_{i=1}^{n}\mathbb{E}\xi_{i}^{2}$ (see e.g. Theorem 1.2.1 on p. 28 in [8], or Exercise 2.2 on p. 11 in [11], or Theorem 2.9 in [6]). Take $\xi_{i}=\delta_{i}-q$, $\xi_{i}^{\prime}=-\xi_{i}$, $i\leq n$. Then for every $i\leq n$, $\xi_{i}^{\prime}$ and $\xi_{i}$ are centered, $|\xi_{i}^{\prime}|=|\xi_{i}|\leq\max(q,1-q)$, and $\sigma^{2}=nq(1-q)$. Applying the Bennett inequality with $\rho=\max(q,1-q)$ twice — to $\xi_{i}$ and $\xi_{i}^{\prime}$ — we obtain the first inequality. To prove the second inequality, we take $t=\varepsilon n$ and use that $h(\cdot)$ is an increasing function satisfying $h(u)\geq u^{2}/2-u^{3}/6$ on $\mathbb{R}^{+}$. The third inequality follows by taking $t=\tau qn$ and using $h(u)\geq u\ln(u/e)$.

For the “furthermore” part, we apply the third inequality with $\tau=7$ to get

\[
\mathbb{P}\Big\{\sum_{i=1}^{n}\delta_{i}>8qn\Big\}\leq\exp(-6qn).
\]

On the other hand, using $q\leq 0.1$,

\[
\mathbb{P}\Big\{\sum_{i=1}^{n}\delta_{i}<qn/8\Big\}=\sum_{i=0}^{\lfloor qn/8\rfloor}{n\choose i}q^{i}(1-q)^{n-i}\leq(1-q)^{n}+\sum_{i=1}^{\lfloor qn/8\rfloor}\bigg(\frac{enq}{i(1-q)}\bigg)^{i}(1-q)^{n}\leq(1-q)^{n}+\frac{qn}{8}\bigg(\frac{8e}{1-q}\bigg)^{qn/8}(1-q)^{n}\leq(1-q)^{n}+\frac{qn}{8}\bigg(\frac{80e}{9}\bigg)^{qn/8}(1-q)^{n}.
\]

Since $(80e/9)^{1/8}\leq e^{0.4}$, $(1-q)^{n}\leq\exp(-qn)$, $qn\geq 50$, and $\ln x\leq x/e$ on $[0,\infty)$, this implies

\[
\mathbb{P}\Big(\sum_{i=1}^{n}\delta_{i}\notin[qn/8,\,8qn]\Big)\leq\exp(-6qn)+(1+\exp(0.45qn))(1-q)^{n}\leq(1-q)^{n/2}.
\]

Finally, to get the last inequality, we take $t=2.5qn=2.5pn$; then

\[
\mathbb{P}\left(\sum_{j=1}^{n}\delta_{ij}>3.5pn\right)\leq\exp\left(-\frac{np}{1-p}\,h(2.5)\right)\leq\exp\left(-np\,(3.5\ln 3.5-2.5)\right)\leq\exp(-1.8np).
\]

Since, under our assumptions, $n\exp(-1.8np)\leq\exp(-1.5np)$, the bound on $\mathbb{P}(\mathcal{E}_{sum})$ follows by the union bound. ∎

We need the following simple corollary of Bennett's lemma.

Lemma 3.5.

For any $R\geq 1$ there is $C_{3.5}=C_{3.5}(R)\geq 1$ with the following property. Let $n\geq 1$ and $p\in(0,1)$ satisfy $C_{3.5}\,p\leq 1$ and $C_{3.5}\leq pn$. Further, let $M$ be an $n\times n$ Bernoulli($p$) random matrix. Then with probability at least $1-\exp(-n/C_{3.5})$ one has

\[
8pn\geq|{\rm supp\,}{\bf C}_{i}(M)|\geq pn/8\quad\text{for all but }\lfloor(pR)^{-1}\rfloor\text{ indices }i\in[n]\setminus\{1\}.
\]
Proof.

For each $i\in[n]\setminus\{1\}$, let $\xi_{i}$ be the indicator of the event

\[
\big\{8pn<|{\rm supp\,}{\bf C}_{i}(M)|\quad\text{ or }\quad|{\rm supp\,}{\bf C}_{i}(M)|<pn/8\big\}.
\]

By Lemma 3.4, $\mathbb{E}\,\xi_{i}\leq e^{-pn/2}$. Since the $\xi_{i}$'s are independent, by the Markov inequality,

\[
\mathbb{P}\Big\{\sum_{i=2}^{n}\xi_{i}\geq\frac{1}{pR}\Big\}\leq{n-1\choose\lfloor(pR)^{-1}\rfloor}\big(e^{-pn/2}\big)^{\lfloor(pR)^{-1}\rfloor}\leq{n-1\choose\lfloor(pR)^{-1}\rfloor}\,e^{-n/(4R)}.
\]

The result follows. ∎

The following lemma provides a bound on the norm of a random Bernoulli matrix. It is similar to [5, Theorem 1.14], where the case of symmetric matrices was treated. For the sake of completeness we sketch its proof.

Lemma 3.6.

Let nn be large enough and (4lnn)/np1/4(4\ln n)/n\leq p\leq 1/4. Let M=(δij)i,jM=(\delta_{ij})_{i,j} be a Bernoulli(pp) random matrix. Then for every t30t\geq 30 one has

{M𝔼M2tnp}4et2pn/4 and {M2tnp+pn}4et2pn/4.\mathbb{P}\big{\{}\|M-{\mathbb{E}}M\|\geq 2t\sqrt{np}\big{\}}\leq 4e^{-t^{2}pn/4}\quad\quad\mbox{ and }\quad\quad\mathbb{P}\big{\{}\|M\|\geq 2t\sqrt{np}+pn\big{\}}\leq 4e^{-t^{2}pn/4}.

In particular, taking t=pnt=\sqrt{pn},

(M𝟏3pn3/2)4exp(n2p2/4).\mathbb{P}\left(\|M{\bf 1}\|\geq 3pn^{3/2}\right)\leq 4\exp(-n^{2}p^{2}/4). (11)
Proof.

Given an n\times n random matrix T=(t_{ij})_{i,j} with independent entries taking values in [0,1], we consider it as a vector in {\mathbb{R}}^{m} with m=n^{2}. Then the Hilbert–Schmidt norm of T is the standard Euclidean norm on {\mathbb{R}}^{m}. Let f be any function on {\mathbb{R}}^{m} which is convex and 1-Lipschitz with respect to the standard Euclidean norm. Then the Talagrand inequality (see e.g. Corollary 4.10 and Proposition 1.8 in [20]) gives that for every s>0,

(f(T)𝔼f(T)+s+4π)4exp(s2/4).\mathbb{P}\left(f(T)\geq{\mathbb{E}}f(T)+s+4\sqrt{\pi}\right)\leq 4\exp(-s^{2}/4).

We apply this inequality with the function f(T):=\|T\| to the matrix T:=M-{\mathbb{E}}M. At the end of this proof we show that {\mathbb{E}}\|M-{\mathbb{E}}M\|\leq 20\sqrt{pn}. Therefore, taking s=t\sqrt{pn} with t\geq 30, we obtain the first bound. For the second bound, note that all entries of {\mathbb{E}}M equal p, hence \|{\mathbb{E}}M\|=pn; the second bound then follows from the first by the triangle inequality.

It remains to prove that {\mathbb{E}}\|M-{\mathbb{E}}M\|\leq 20\sqrt{pn}. Recall that \delta_{ij} are the entries of M. Let \delta_{ij}^{\prime}, i,j\leq n, be independent copies of \delta_{ij} and set M^{\prime}:=(\delta_{ij}^{\prime})_{i,j}. Denote by r_{ij} independent Rademacher random variables and by g_{ij} independent standard Gaussian random variables. We assume that all our variables are mutually independent and set \xi_{ij}:=\delta_{ij}-\delta_{ij}^{\prime}. Since for every i,j\leq n the variable \xi_{ij} is symmetric, it has the same distribution as |\xi_{ij}|r_{ij}, and, since {\mathbb{E}}|g_{ij}|=\sqrt{2/\pi}, also the same distribution as \sqrt{\pi/2}\,|\xi_{ij}|r_{ij}{\mathbb{E}}|g_{ij}|. Then we have

{\mathbb{E}}_{\delta}\|M-{\mathbb{E}}M\|={\mathbb{E}}_{\delta}\|M-{\mathbb{E}}_{\delta^{\prime}}M^{\prime}\|\leq{\mathbb{E}}_{\delta}{\mathbb{E}}_{\delta^{\prime}}\|M-M^{\prime}\|={\mathbb{E}}_{\xi}\|(\xi_{ij})_{i,j}\|=\sqrt{\pi/2}\,{\mathbb{E}}_{\xi,r}\|(\xi_{ij}r_{ij}{\mathbb{E}}_{g}|g_{ij}|)_{i,j}\|
\leq\sqrt{\pi/2}\,{\mathbb{E}}_{\xi,r,g}\|(\xi_{ij}r_{ij}|g_{ij}|)_{i,j}\|=\sqrt{\pi/2}\,{\mathbb{E}}_{\xi}{\mathbb{E}}_{g}\|(\xi_{ij}|g_{ij}|)_{i,j}\|.

Applying a result of Bandeira and Van Handel (see the beginning of Section 3.1 in [1]), we obtain

𝔼δM𝔼M𝔼ξ(4max(σ1,σ2)+15σln(2n)),{\mathbb{E}}_{\delta}\|M-{\mathbb{E}}M\|\leq{\mathbb{E}}_{\xi}(4\max(\sigma_{1},\sigma_{2})+15\sigma_{*}\sqrt{\ln(2n)}),

where

σ1=maxinj=1nξij2,σ2=maxjni=1nξij2, and σ=maxi,jn|ξij|1.\sigma_{1}=\max_{i\leq n}\sqrt{\sum_{j=1}^{n}\xi_{ij}^{2}},\quad\sigma_{2}=\max_{j\leq n}\sqrt{\sum_{i=1}^{n}\xi_{ij}^{2}},\quad\mbox{ and }\quad\sigma_{*}=\max_{i,j\leq n}|\xi_{ij}|\leq 1.

Note that ξij2\xi_{ij}^{2} are Bernoulli(qq)   random variables with q=2p(1p)q=2p(1-p). Therefore, using (4lnn)/np1/2(4\ln n)/n\leq p\leq 1/2 and applying the “moreover part” of Lemma 3.4, we obtain that max(σ1,σ2)>2pn\max(\sigma_{1},\sigma_{2})>2\sqrt{pn} with probability at most 2exp(1.5nq)2/n62\exp(-1.5nq)\leq 2/n^{6}. Moreover, since ξij21\xi_{ij}^{2}\leq 1, we have max(σ1,σ2)n\max(\sigma_{1},\sigma_{2})\leq\sqrt{n}. Therefore,

𝔼ξ(4max(σ1,σ2)+15σln(2n))8pn+4/n5+15ln(2n)20pn.{\mathbb{E}}_{\xi}(4\max(\sigma_{1},\sigma_{2})+15\sigma_{*}\sqrt{\ln(2n)})\leq 8\sqrt{pn}+4/n^{5}+15\sqrt{\ln(2n)}\leq 20\sqrt{pn}.

As an elementary corollary of the above lemma, we have the following statement, in which the restriction pn\geq 4\ln n is relaxed.

Corollary 3.7.

For every s>0s>0 and R1R\geq 1 there is C3.71C_{\text{\tiny\ref{cor: norm of centered}}}\geq 1 depending on s,Rs,R with the following property. Let n16/sn\geq 16/s be large enough and let p(0,1/4]p\in(0,1/4] satisfy slnnpns\ln n\leq pn. Let MnM_{n} be an n×nn\times n Bernoulli(pp) random matrix. Then

{Mn𝔼MnC3.7pn}1exp(Rpn).{\mathbb{P}}\big{\{}\|M_{n}-{\mathbb{E}}M_{n}\|\leq C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{pn}\big{\}}\geq 1-\exp(-Rpn).
Proof.

Let w:=\max(1,\lceil 8/s\rceil), \widetilde{n}:=w\,n, and let \widetilde{M}_{n} be an \widetilde{n}\times\widetilde{n} Bernoulli(p) random matrix. Assuming that n is sufficiently large, we get

pn~=wpnsmax(1,8/s)lnn4lnn~.p\,\widetilde{n}=wpn\geq s\max(1,\lceil 8/s\rceil)\ln n\geq 4\ln\widetilde{n}.

Thus, the previous lemma is applicable, and we get

{M~n𝔼M~nC3.7pn}1exp(Rpn),{\mathbb{P}}\big{\{}\|\widetilde{M}_{n}-{\mathbb{E}}\widetilde{M}_{n}\|\leq C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{pn}\big{\}}\geq 1-\exp(-Rpn),

for some C3.7>0C_{\text{\tiny\ref{cor: norm of centered}}}>0 depending only on s,Rs,R. Since the norm of a matrix is not less than the norm of any of its submatrices, and because any n×nn\times n submatrix of M~n\widetilde{M}_{n} is equidistributed with MnM_{n}, we get the result. ∎

3.5 Anti-concentration

In this subsection we combine anti-concentration inequalities with the following tensorization lemma (see Lemma 3.2 in [48], Lemma 2.2 in [41] and Lemma 5.4 in [39]). We also provide Esseen’s lemma.

Lemma 3.8 (Tensorization lemma).

Let λ,γ>0\lambda,\gamma>0. Let ξ1,ξ2,,ξm\xi_{1},\xi_{2},\ldots,\xi_{m} be independent random variables. Assume that for all jmj\leq m, (|ξj|λ)γ\mathbb{P}(|\xi_{j}|\leq\lambda)\leq\gamma. Then for every ε(0,1)\varepsilon\in(0,1) one has

((ξ1,ξ2,,ξm)λεm)(e/ε)εmγm(1ε).\mathbb{P}(\|(\xi_{1},\xi_{2},...,\xi_{m})\|\leq\lambda\sqrt{\varepsilon m})\leq(e/\varepsilon)^{\varepsilon m}\gamma^{m(1-\varepsilon)}.

Moreover, if there exists ε0>0\varepsilon_{0}>0 and K>0K>0 such that for every εε0\varepsilon\geq\varepsilon_{0} and for all jmj\leq m one has (|ξj|ε)Kε\mathbb{P}(|\xi_{j}|\leq\varepsilon)\leq K\varepsilon then there exists an absolute constant C3.8>0C_{\text{\tiny\ref{l: tensor}}}>0 such that for every εε0\varepsilon\geq\varepsilon_{0},

((ξ1,ξ2,,ξm)εm)(C3.8Kε)m.\mathbb{P}(\|(\xi_{1},\xi_{2},...,\xi_{m})\|\leq\varepsilon\sqrt{m})\leq(C_{\text{\tiny\ref{l: tensor}}}K\varepsilon)^{m}.
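As a quick numerical illustration (not used anywhere in the argument), the first bound of the lemma can be compared with an exactly computable small-ball probability when the \xi_{j} are i.i.d. standard Gaussians, in which case \|(\xi_{1},\dots,\xi_{m})\|^{2} has a chi-square distribution; all parameter values in the sketch below are arbitrary choices.

from math import e, erf, exp, factorial, sqrt

m, lam, eps = 20, 0.1, 0.5
gamma = erf(lam / sqrt(2))                       # P(|xi_j| <= lam) for a standard Gaussian
x = lam ** 2 * eps * m                           # threshold for ||(xi_1,...,xi_m)||^2
# P(chi^2 with m = 2k degrees of freedom <= x) = P(Poisson(x/2) >= k)
k = m // 2
exact = exp(-x / 2) * sum((x / 2) ** j / factorial(j) for j in range(k, k + 30))
bound = (e / eps) ** (eps * m) * gamma ** (m * (1 - eps))
print(exact, bound)                              # the exact probability is far below the (lossy) bound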

Recall that for a real-valued random variable ξ\xi its Lévy concentration function 𝒬(ξ,t)\mathcal{Q}(\xi,t) is defined as

𝒬(ξ,t):=supλ{|ξλ|t},t>0.\mathcal{Q}(\xi,t):=\sup\limits_{\lambda\in{\mathbb{R}}}{\mathbb{P}}\bigl{\{}|\xi-\lambda|\leq t\bigr{\}},\;\;t>0.
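As an aside, \mathcal{Q}(\xi,t) is easy to estimate from a sample: sort the sample and slide a window of length 2t over it. The sketch below (illustrative only; the sample size and the choice of distribution are arbitrary) recovers, for a Bernoulli(p) variable and t=1/3, a value close to 1-p, the value of \mathcal{Q}(\delta,1/3) used in the proof of Proposition 3.10 below.

import numpy as np

def levy_concentration(samples, t):
    # empirical Q(xi, t): the largest fraction of sample points in an interval of length 2t
    s = np.sort(np.asarray(samples, dtype=float))
    right = np.searchsorted(s, s + 2 * t, side="right")
    return np.max(right - np.arange(len(s))) / len(s)

rng = np.random.default_rng(0)
p = 0.1
delta = (rng.random(100000) < p).astype(float)
print(levy_concentration(delta, 1 / 3))          # close to 1 - p = 0.9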

We will need bounds on the Lévy concentration function of sums of independent random variables. Such inequalities were investigated in many works, starting with Lévy, Doeblin, Kolmogorov, and Rogozin. We quote here a result due to Kesten [17], who improved Rogozin’s estimate [38].

Proposition 3.9.

Let ξ1,ξ2,,ξm\xi_{1},\xi_{2},\ldots,\xi_{m} be independent random variables and λ,λ1,,λm>0\lambda,\lambda_{1},...,\lambda_{m}>0 satisfy λmaximλi\lambda\geq\max_{i\leq m}\lambda_{i}. Then there exists an absolute positive constant CC such that

𝒬(i=1mξi,λ)Cλmaxim𝒬(ξi,λ)i=1mλi2(1𝒬(ξi,λi)).\mathcal{Q}\Bigl{(}\sum_{i=1}^{m}\xi_{i},\lambda\Bigr{)}\leq\frac{C\,\lambda\,\max_{i\leq m}\mathcal{Q}(\xi_{i},\lambda)}{\sqrt{\sum_{i=1}^{m}\lambda_{i}^{2}(1-\mathcal{Q}(\xi_{i},\lambda_{i}))}}.

This proposition together with Lemma 3.8 immediately implies the following consequence, in which, given A[m]A\subset[m] and xmx\in{\mathbb{R}}^{m}, xAx_{A} denotes coordinate projection of xx on A{\mathbb{R}}^{A}.

Proposition 3.10.

There exists an absolute constant C_{0}\geq 1 such that the following holds. Let p\in(0,1/2). Let \delta be a Bernoulli(p) random variable. Let \delta_{j}, j\leq n, and \delta_{ij}, i,j\leq n, be independent copies of \delta. Let M=(\delta_{ij})_{ij}. Let A\subset[n] and x\in{\mathbb{R}}^{n} be such that \|x_{A}\|_{\infty}\leq C_{0}^{-1}\sqrt{p}\,\|x_{A}\|. Then

(Mxpn32C0xA)e3n.\mathbb{P}\Bigl{(}\|Mx\|\leq\frac{\sqrt{pn}}{3\sqrt{2}C_{0}}\|x_{A}\|\Bigr{)}\leq e^{-3n}.

Moreover, if λ:=pxA3C01/3\lambda:=\frac{\sqrt{p}\,\|x_{A}\|}{3C_{0}}\leq 1/3 then 𝒬(j=1nδjxj,λ)e8.\mathcal{Q}\Bigl{(}\sum_{j=1}^{n}\delta_{j}x_{j},\lambda\Bigr{)}\leq e^{-8}.

Proof.

We start with the “moreover” part. Assume pxAC0{\sqrt{p}\,\|x_{A}\|}\leq C_{0}. Let λj=|xj|/3\lambda_{j}=|x_{j}|/3. Clearly, for every jnj\leq n, 𝒬(xjδj,|xj|/3)=𝒬(δj,1/3)=1p\mathcal{Q}(x_{j}\delta_{j},|x_{j}|/3)=\mathcal{Q}(\delta_{j},1/3)=1-p. Proposition 3.9 implies that for every λ\lambda satisfying maxjAλjλ1/3\max_{j\in A}\lambda_{j}\leq\lambda\leq 1/3 one has

𝒬(j=1nxjδj,λ)𝒬(jAxjδj,λ)CλjAλj2p=3CλpxA.\mathcal{Q}\Bigl{(}\sum_{j=1}^{n}x_{j}\delta_{j},\lambda\Bigr{)}\leq\mathcal{Q}\Bigl{(}\sum_{j\in A}x_{j}\delta_{j},\lambda\Bigr{)}\leq\frac{C\,\lambda}{\sqrt{\sum_{j\in A}\lambda_{j}^{2}\,p}}=\frac{3C\,\lambda}{\sqrt{p}\,\|x_{A}\|}.

Choosing C0=Ce8C_{0}=Ce^{8} and λ=pxA/(3C0)\lambda=\sqrt{p}\,\|x_{A}\|/(3C_{0}) (note that the assumption on xA\|x_{A}\|_{\infty} ensures that λλj\lambda\geq\lambda_{j} for all jAj\in A) we obtain the “moreover” part.

Now apply Lemma 3.8 with ξi=(Mx)i=j=1nxjδij\xi_{i}=(Mx)_{i}=\sum_{j=1}^{n}x_{j}\delta_{ij}, ε=1/2\varepsilon=1/2, γ=e8\gamma=e^{-8}, m=nm=n. We have

(Mxλn/2)(2e)n/2exp(4n)exp(3n).\mathbb{P}\Bigl{(}\|Mx\|\leq\lambda\sqrt{n/2}\Bigr{)}\leq(2e)^{n/2}\exp(-4n)\leq\exp(-3n).

This implies the bound under assumption pxAC0{\sqrt{p}\,\|x_{A}\|}\leq C_{0}, which can be removed by normalizing xx. ∎

We also will need the following combination of a simple anti-concentration fact with Lemma 3.8.

Proposition 3.11.

Let p(0,1/20)p\in(0,1/20) and α>0\alpha>0. Let δ\delta be a Bernoulli(pp) random variable. Let δj\delta_{j}, jnj\leq n, and δij\delta_{ij}, i,jni,j\leq n, be independent copies of δ\delta. Let M=(δij)ijM=(\delta_{ij})_{ij}. Let xnx\in{\mathbb{R}}^{n} be such that x2αx_{2}^{*}\geq\alpha. Then

𝒬(j=1nxjδj,α/2.1)exp(1.9p) and (Mxαpn7ln(e/p))exp(1.6np).\mathcal{Q}\Bigl{(}\sum_{j=1}^{n}x_{j}\delta_{j},\alpha/2.1\Bigr{)}\leq\exp(-1.9p)\quad\quad\mbox{ and }\quad\quad\mathbb{P}\Bigl{(}\|Mx\|\leq\frac{\alpha\sqrt{pn}}{7\sqrt{\ln(e/p)}}\Bigr{)}\leq\exp(-1.6np).
Proof.

Without loss of generality we assume that x1=|x1|x_{1}^{*}=|x_{1}| and x2=|x2|x_{2}^{*}=|x_{2}|. Note that x1δ1+x2δ2x_{1}\delta_{1}+x_{2}\delta_{2} takes value in E1:={0,x1+x2}E_{1}:=\{0,x_{1}+x_{2}\} with probability (1p)2+p211.9p(1-p)^{2}+p^{2}\leq 1-1.9p and in E2:={x1,x2}E_{2}:=\{x_{1},x_{2}\} with probability 2p(1p)11.9p2p(1-p)\leq 1-1.9p. Since the distance between E1E_{1} and E2E_{2} is min(|x1|,|x2|)=|x2|\min(|x_{1}|,|x_{2}|)=|x_{2}| and since 𝒬(j=1nxjδj,λ)𝒬(j=12xjδj,λ)\mathcal{Q}\bigl{(}\sum_{j=1}^{n}x_{j}\delta_{j},\lambda\bigr{)}\leq\mathcal{Q}\bigl{(}\sum_{j=1}^{2}x_{j}\delta_{j},\lambda\bigr{)}, the first inequality follows.

Now apply Lemma 3.8 with ξi=(Mx)i=j=1nxjδij\xi_{i}=(Mx)_{i}=\sum_{j=1}^{n}x_{j}\delta_{ij}, ε=p/(10ln(e/p))\varepsilon=p/(10\ln(e/p)), γ=e1.9p\gamma=e^{-1.9p}, m=nm=n. We note that then εln(e/ε)p/4\varepsilon\ln(e/\varepsilon)\leq p/4 and therefore we have

(Mxαpn2.110ln(e/p))(e/ε)εnexp(1.9pn(1ε))\mathbb{P}\Bigl{(}\|Mx\|\leq\frac{\alpha\sqrt{pn}}{2.1\sqrt{10\ln(e/p)}}\Bigr{)}\leq(e/\varepsilon)^{\varepsilon n}\exp(-1.9pn(1-\varepsilon))
exp(pn/41.9np(1ε))exp(1.6np),\leq\exp(pn/4-1.9np(1-\varepsilon))\leq\exp(-1.6np),

which completes the proof. ∎

Finally we provide Esseen’s lemma [13], needed to prove Theorem 2.1.

Lemma 3.12 (Esseen).

There exists an absolute constant C>0C>0 such that the following holds. Let ξi\xi_{i}, imi\leq m be independent random variables. Then for every τ>0\tau>0,

𝒬(i=1mξi,τ)\displaystyle\mathcal{Q}\Big{(}\sum\limits_{i=1}^{m}\xi_{i},\tau\Big{)} C11i=1m|𝔼exp(2π𝐢ξis/τ)|ds.\displaystyle\leq C\int\limits_{-1}^{1}\prod\limits_{i=1}^{m}|{\mathbb{E}}\exp(2\pi{\bf i}\xi_{i}s/\tau)|\,ds.
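For a weighted Bernoulli sum \sum_{j}x_{j}\delta_{j} the factors in Esseen's bound are |1-p+p\exp(2\pi{\bf i}x_{j}s/\tau)|, so both sides are easy to evaluate numerically. The sketch below (illustrative only; the weights, p, \tau, the integration grid and the sample size are arbitrary choices) compares a Monte Carlo estimate of the left-hand side with a Riemann approximation of the integral; their ratio stays bounded, in accordance with the lemma (the absolute constant C is not specified above; here the ratio comes out close to 2).

import numpy as np

rng = np.random.default_rng(1)
n, p, tau = 40, 0.3, 0.5
x = rng.uniform(0.5, 1.5, n)                     # fixed weights

# right-hand side: int_{-1}^{1} prod_j |E exp(2 pi i x_j delta_j s / tau)| ds
s = np.linspace(-1.0, 1.0, 4001)
cf = np.prod(np.abs(1 - p + p * np.exp(2j * np.pi * np.outer(x, s) / tau)), axis=0)
integral = 2.0 * cf.mean()                       # simple Riemann approximation

# left-hand side: Monte Carlo estimate of Q(sum_j x_j delta_j, tau)
sums = np.sort(((rng.random((100000, n)) < p) * x).sum(axis=1))
counts = np.searchsorted(sums, sums + 2 * tau, side="right") - np.arange(len(sums))
print(counts.max() / len(sums), integral)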

3.6 Net argument

Here we discuss special nets that will be used and corresponding approximations. We fix the following notations. Let 𝐞=𝟏/n{\bf e}={\bf 1}/\sqrt{n} be the unit vector in the direction of 𝟏{\bf 1}. Let P𝐞P_{\bf e} be the projection on 𝐞{\bf e}^{\perp} and P𝐞P_{\bf e}^{\perp} be the projection on 𝐞{\bf e}, that is P𝐞=,𝐞𝐞P_{\bf e}^{\perp}=\left\langle\cdot,{\bf e}\right\rangle{\bf e}. Similarly, for jnj\leq n, let PjP_{j} be the projection on eje_{j}^{\perp} and PjP_{j}^{\perp} be the projection on eje_{j}. Recall that for xnx\in{\mathbb{R}}^{n}, the permutation σx\sigma_{x} satisfies |xσx(i)|=xi|x_{\sigma_{x}(i)}|=x_{i}^{*}, ini\leq n. Define a (non-linear) operator Q:nnQ:{\mathbb{R}}^{n}\to{\mathbb{R}}^{n} by Qx=PF(x)xQx=P_{F(x)}x — the coordinate projection on F(x){\mathbb{R}}^{F(x)}, where F(x)=σx([2,n])F(x)=\sigma_{x}([2,n]), in other words QQ annihilates the largest coordinate of a vector. Consider the triple norm on n{\mathbb{R}}^{n} defined by

|x|2:=P𝐞x2+pnP𝐞x2|||x|||^{2}:=\|P_{\bf e}x\|^{2}+pn\|P_{\bf e}^{\perp}x\|^{2}

(note that P𝐞x=|x,𝐞|\|P_{\bf e}^{\perp}x\|=|\left\langle x,{\bf e}\right\rangle|). We will use the following notion of shifted sparse vectors (recall here that σx\sigma_{x} is the permutation responsible for the non-increasing rearrangement). Given mnm\leq n and a parameter γ>0\gamma>0, define

U(m,γ):={xn:A[n],|A|=nm,|λ|2miA one has |xiλ|γn}.U(m,\gamma):=\Big{\{}x\in{\mathbb{R}}^{n}\,\,:\,\,\exists A\subset[n],|A|=n-m,\,\,\exists|\lambda|\leq\frac{2}{\sqrt{m}}\,\,\forall i\in A\,\,\mbox{ one has }\,\,|x_{i}-\lambda|\leq\frac{\gamma}{\sqrt{n}}\Big{\}}.

Further, given another parameter β>0\beta>0, define the set

V(β):={xn:x1 and Qxβ}.V(\beta):=\{x\in{\mathbb{R}}^{n}\,:\,\|x\|_{\infty}\leq 1\,\,\mbox{ and }\,\,\|Qx\|\leq\beta\}.
Lemma 3.13.

Let 0<8γεβ0<8\gamma\leq\varepsilon\leq\beta and 1mn1\leq m\leq n. Then there exists an ε\varepsilon-net in V(β)U(m,γ)V(\beta)\cap U(m,\gamma) with respect to |||||||||\cdot||| of cardinality at most

210pn2ε2m(9βε)m(nm).\frac{2^{10}\sqrt{p}\,n^{2}}{\varepsilon^{2}\,\sqrt{m}}\left(\frac{9\beta}{\varepsilon}\right)^{m}{n\choose m}.
Proof.

Denote V:=V(β)U(m,γ)V:=V(\beta)\cap U(m,\gamma). For each xVx\in V let A(x)A(x) be a set from the definition of U(m,γ)U(m,\gamma) (if the choice of A(x)A(x) is not unique, we fix one of them).

Fix E[n]E\subset[n] of cardinality mm. We first consider vectors xVx\in V satisfying A(x)=EcA(x)=E^{c}. Fix jnj\leq n and denote

Vj=Vj(E):={xV:j=σx(1) and A(x)=Ec}V_{j}=V_{j}(E):=\{x\in V\,:\,j=\sigma_{x}(1)\,\,\mbox{ and }\,\,A(x)=E^{c}\}

(thus x1=|xj|x_{1}^{*}=|x_{j}| on VjV_{j}). We now construct a net for VjV_{j}. It will be obtained as the sum of four nets, where the first one deals with just one coordinate, jj, “killing” the maximal coordinate; the second one deals with non-constant part of the vector, consisting of at most mm coordinates (excluding x1x_{1}^{*}); the third one deals with almost constant coordinates (corresponding to A(x)A(x)); and the fourth net deals with the direction of the constant vector. This way, three of our four nets are 11-dimensional. Let PWP_{W} be the coordinate projection onto W{\mathbb{R}}^{W}, where W=E{j}W=E\setminus\{j\}. Note that the definition of V(β)V(\beta) implies that PW(x)β\|P_{W}(x)\|\leq\beta for every xVjx\in V_{j}. Let, as before, PjP_{j}^{\perp} be the projection onto eje_{j}.

Let \mathcal{N}_{1} be an \varepsilon/4-net in P_{j}^{\perp}(V_{j})\subset[-1,1]e_{j} of cardinality at most 8/\varepsilon. Let \mathcal{N}_{2} be an \varepsilon/4-net (with respect to the Euclidean metric) in P_{W}(V_{j}) of cardinality at most \left(1+8\beta/\varepsilon\right)^{m}.

Further, let 𝒩3\mathcal{N}_{3} be an ε/(8n)\varepsilon/(8\sqrt{n})-net in the segment [2/m,2/m]iEc{j}ei[-2/\sqrt{m},2/\sqrt{m}]\sum_{i\in E^{c}\setminus\{j\}}e_{i} with cardinality at most 16n/(εm)16\sqrt{n}/(\varepsilon\sqrt{m}). Then by the construction of the nets and by the definition of U(m,γ)U(m,\gamma) for every xVjx\in V_{j} there exist yxi𝒩iy^{i}_{x}\in\mathcal{N}_{i}, i3i\leq 3, such that for yx=yx1+yx2+yx3y_{x}=y^{1}_{x}+y^{2}_{x}+y^{3}_{x},

xyx2ε216+ε216+iEc{j}(γn+ε8n)23ε216;\|x-y_{x}\|^{2}\leq\frac{\varepsilon^{2}}{16}+\frac{\varepsilon^{2}}{16}+\sum_{i\in E^{c}\setminus\{j\}}\left(\frac{\gamma}{\sqrt{n}}+\frac{\varepsilon}{8\sqrt{n}}\right)^{2}\leq\frac{3\varepsilon^{2}}{16};

in particular, P𝐞(xyx)3/16ε\|P_{\bf e}(x-y_{x})\|\leq\sqrt{3/16}\varepsilon. Finally, let 𝒩4\mathcal{N}_{4} be an ε/(4pn)\varepsilon/(4\sqrt{pn})-net in the segment (ε/2)[𝐞,𝐞](\varepsilon/2)[-{\bf e},{\bf e}] with cardinality at most 8pn8\sqrt{pn}. Then for every xVjx\in V_{j} there exists yxy_{x} as above and yx4𝒩4y_{x}^{4}\in\mathcal{N}_{4} with

|xyxyx4|2=|P𝐞(xyx)+P𝐞(xyx)yx4|2=P𝐞(xyx)2+pnP𝐞(xyx)yx42ε2/4.|||x-y_{x}-y^{4}_{x}|||^{2}=|||P_{\bf e}(x-y_{x})+P_{\bf e}^{\perp}(x-y_{x})-y^{4}_{x}|||^{2}=\|P_{\bf e}(x-y_{x})\|^{2}+pn\|P_{\bf e}^{\perp}(x-y_{x})-y^{4}_{x}\|^{2}\leq\varepsilon^{2}/4.

Thus the set 𝒩E,j=𝒩1+𝒩2+𝒩3+𝒩4\mathcal{N}_{E,j}=\mathcal{N}_{1}+\mathcal{N}_{2}+\mathcal{N}_{3}+\mathcal{N}_{4} is an (ε/2)(\varepsilon/2)-net for VjV_{j} with respect to |||||||||\cdot||| and its cardinality is bounded by

210pnε2m(1+8βε)m.\frac{2^{10}\sqrt{p}\,n}{\varepsilon^{2}\sqrt{m}}\left(1+\frac{8\beta}{\varepsilon}\right)^{m}.

Taking the union of such nets over all choices of E\subset[n] of cardinality m and all j\leq n, we obtain an (\varepsilon/2)-net \mathcal{N}_{0} in |||\cdot||| for V of the desired cardinality. Using a standard argument, we pass to an \varepsilon-net \mathcal{N}\subset V for V. ∎

Later we will apply Lemma 3.13 together with the following proposition.

Proposition 3.14.

Let nn be large enough and (4lnn)/np<1/2(4\ln n)/n\leq p<1/2, and ε>0\varepsilon>0. Denote

nrm:={Mn:Mp𝟏𝟏60np and M𝟏3pn3/2}.{\mathcal{E}}_{nrm}:=\{M\in{\mathcal{M}_{n}}\,:\,\|M-p{\bf 1}{\bf 1}^{\top}\|\leq 60\sqrt{np}\quad\mbox{ and }\quad\|M{\bf 1}\|\leq 3pn^{3/2}\}.

Then for every xnx\in{\mathbb{R}}^{n} satisfying |x|ε|||x|||\leq\varepsilon and every MnrmM\in{\mathcal{E}}_{nrm} one has Mx100pnε.\|Mx\|\leq 100\sqrt{pn}\varepsilon.

Proof.

Let w=P𝐞xw=P_{\bf e}^{\perp}x. Then, by the definition of the triple norm, w|x|/pnε/pn\|w\|\leq|||x|||/\sqrt{pn}\leq\varepsilon/\sqrt{pn}. Clearly,

(p𝟏𝟏)(xw)=(p𝟏𝟏)P𝐞x=0.(p{\bf 1}{\bf 1}^{\top})(x-w)=(p{\bf 1}{\bf 1}^{\top})P_{\bf e}x=0.

Therefore, using that MnrmM\in{\mathcal{E}}_{nrm}, we get

M(xw)=(Mp𝟏𝟏)(xw)60pnxw70pnε.\|M(x-w)\|=\|(M-p{\bf 1}{\bf 1}^{\top})(x-w)\|\leq 60\sqrt{pn}\|x-w\|\leq 70\sqrt{pn}\varepsilon.

Since w=𝟏w/nw={\bf 1}\|w\|/\sqrt{n} and wε/pn\|w\|\leq\varepsilon/\sqrt{pn}, using again that MnrmM\in{\mathcal{E}}_{nrm}, we observe that

MwεpnM𝟏3pnε.\|Mw\|\leq\frac{\varepsilon}{\sqrt{p}\,n}\|M{\bf 1}\|\leq 3\sqrt{pn}\varepsilon.

The proposition follows by the triangle inequality. ∎
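A minimal Monte Carlo check of the proposition (illustrative only; by homogeneity its conclusion can be restated as \|Mx\|\leq 100\sqrt{pn}\,|||x||| for all x once M\in{\mathcal{E}}_{nrm}). The dimension, the value of p and the number of test vectors below are arbitrary choices.

import numpy as np

rng = np.random.default_rng(2)
n, p = 500, 0.05                                          # here pn = 25 >= 4 ln n
M = (rng.random((n, n)) < p).astype(float)
one = np.ones(n)
assert np.linalg.norm(M - p, 2) <= 60 * np.sqrt(n * p)    # first condition defining E_nrm
assert np.linalg.norm(M @ one) <= 3 * p * n ** 1.5        # second condition defining E_nrm

for _ in range(100):
    x = rng.normal(size=n)
    proj = (x @ one / n) * one                            # P_e^perp x = <x, e> e
    triple = np.sqrt(np.linalg.norm(x - proj) ** 2 + p * n * np.linalg.norm(proj) ** 2)
    assert np.linalg.norm(M @ x) <= 100 * np.sqrt(p * n) * triple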

4 Unstructured vectors

The goal of this section is to prove Theorem 2.2.

Recall that given growth function 𝐠{\bf g} and parameters r,δ,ρ(0,1)r,\delta,\rho\in(0,1), the set of vectors 𝒱n=𝒱n(r,𝐠,δ,ρ){\mathcal{V}}_{n}={\mathcal{V}}_{n}(r,{\bf g},\delta,\rho) was defined in (2). In the next two sections (dealing with invertibility over structured vectors) we work with two different growth functions; one will be applied to the case of constant pp and the other one (giving a worse final estimate) is suitable in the general case. For this reason, and to increase flexibility of our argument, rather than fixing a specific growth function here, we will work with an arbitrary non-decreasing function 𝐠:[1,)[1,){\bf g}\,:\,[1,\infty)\to[1,\infty) satisfying the additional assumption (8) with a “global” parameter K31K_{3}\geq 1.

4.1 Degree of unstructuredness: definition and basic properties

Below, for any non-empty finite integer subset SS, we denote by η[S]\eta[S] a random variable uniformly distributed on SS. Additionally, for any K21K_{2}\geq 1, we fix a smooth version of max(1K2,t)\max(\frac{1}{K_{2}},t). More precisely, let us fix a function ψK2:++\psi_{K_{2}}:{\mathbb{R}}_{+}\to{\mathbb{R}}_{+} satisfying

  • The function ψK2\psi_{K_{2}} is twice continuously differentiable, with ψK2=1\|\psi_{K_{2}}^{\prime}\|_{\infty}=1 and ψK2′′<\|\psi_{K_{2}}^{\prime\prime}\|_{\infty}<\infty;

  • ψK2(t)=1K2\psi_{K_{2}}(t)=\frac{1}{K_{2}} for all t12K2t\leq\frac{1}{2K_{2}};

  • 1K2ψK2(t)t\frac{1}{K_{2}}\geq\psi_{K_{2}}(t)\geq t for all 1K2t12K2\frac{1}{K_{2}}\geq t\geq\frac{1}{2K_{2}};

  • ψK2(t)=t\psi_{K_{2}}(t)=t for all t1K2t\geq\frac{1}{K_{2}}.

In what follows, we view the maximum of the second derivative of ψK2\psi_{K_{2}} as a function of K2K_{2} (the nature of this function is completely irrelevant as we do not attempt to track magnitudes of constants involved in our arguments).

Fix an integer n1n\geq 1 and an integer mn/2m\leq n/2. Recall that given a vector vnv\in{\mathbb{R}}^{n} and parameters K1,K21K_{1},K_{2}\geq 1, the degree of unstructuredness (u-degree) 𝐔𝐃n=𝐔𝐃n(v,m,K1,K2){\bf UD}_{n}={\bf UD}_{n}(v,m,K_{1},K_{2}) of vv was defined in (6). The quantity 𝐔𝐃n{\bf UD}_{n} will serve as a measure of unstructuredness of the vector vv and in its spirit is similar to the notion of the essential least common denominator introduced earlier by Rudelson and Vershynin [41]. Here unstructuredness refers to the uniformity in the locations of components of vv on the real line. The larger the degree is, the better anti-concentration properties of an associated random linear combination are. The functions ψK2\psi_{K_{2}} employed in the definition will be important when discussing certain stability properties of 𝐔𝐃n{\bf UD}_{n}.

We start with a proof of Theorem 2.1 which connects the definition of the u-degree with anti-concentration properties.

Proof of Theorem 2.1.

For any sequence of disjoint subsets S1,,SmS_{1},\dots,S_{m} of [n][n] of cardinality n/m\lfloor n/m\rfloor each, set

{\mathcal{E}}_{S_{1},\dots,S_{m}}:=\big{\{}|{\rm supp\,}X\cap S_{i}|=1\mbox{ for all $i\leq m$}\big{\}}.

Note that each point ω\omega of the probability space belongs to the same number of events from the collection {S1,,Sm}S1,,Sm\{{\mathcal{E}}_{S_{1},\dots,S_{m}}\}_{S_{1},\dots,S_{m}}, therefore, for AnmA_{nm} defined in (7) we have for any λ\lambda\in{\mathbb{R}} and τ>0\tau>0,

{|i=1nviXiλ|τ}=AnmS1,,Sm{|i=1nviXiλ|τ|S1,,Sm}.\begin{split}{\mathbb{P}}\Big{\{}&\Big{|}\sum\limits_{i=1}^{n}v_{i}X_{i}-\lambda\Big{|}\leq\tau\Big{\}}=A_{nm}\,\sum\limits_{S_{1},\dots,S_{m}}{\mathbb{P}}\Big{\{}\Big{|}\sum\limits_{i=1}^{n}v_{i}X_{i}-\lambda\Big{|}\leq\tau\;\big{|}\;{\mathcal{E}}_{S_{1},\dots,S_{m}}\Big{\}}.\end{split} (12)

Further, conditioned on an event S1,,Sm{\mathcal{E}}_{S_{1},\dots,S_{m}}, the random sum i=1nviXi\sum\limits_{i=1}^{n}v_{i}X_{i} is equidistributed with i=1mvη[Si]\sum\limits_{i=1}^{m}v_{\eta[S_{i}]} (where we assume that η[S1],,η[Sm]\eta[S_{1}],\dots,\eta[S_{m}] are jointly independent with S1,,Sm{\mathcal{E}}_{S_{1},\dots,S_{m}}). On the other hand, applying Lemma 3.12, we observe that for every τ>0\tau>0,

𝒬(i=1mvη[Si],τ)\displaystyle\mathcal{Q}\Big{(}\sum\limits_{i=1}^{m}v_{\eta[S_{i}]},\tau\Big{)} C11i=1m|𝔼exp(2π𝐢vη[Si]s/τ)|ds\displaystyle\leq C^{\prime}\int\limits_{-1}^{1}\prod\limits_{i=1}^{m}|{\mathbb{E}}\exp(2\pi{\bf i}v_{\eta[S_{i}]}s/\tau)|\,ds
=Cm1/2τm/τm/τi=1m|𝔼exp(2π𝐢vη[Si]m1/2s)|ds,\displaystyle=C^{\prime}\,m^{-1/2}\,\tau\int\limits_{-\sqrt{m}/\tau}^{\sqrt{m}/\tau}\prod\limits_{i=1}^{m}|{\mathbb{E}}\exp(2\pi{\bf i}v_{\eta[S_{i}]}\,m^{-1/2}s)|\,ds,

for a universal constant C>0C^{\prime}>0. Combining this with (12), we get for every τ>0\tau>0,

𝒬(i=1nviXi,τ)\displaystyle\mathcal{Q}\Big{(}\sum\limits_{i=1}^{n}v_{i}X_{i},\tau\Big{)} AnmS1,,Sm𝒬(i=1nviXi,τ|S1,,Sm)\displaystyle\leq A_{nm}\,\sum\limits_{S_{1},\dots,S_{m}}\mathcal{Q}\Big{(}\sum\limits_{i=1}^{n}v_{i}X_{i},\tau\;\big{|}\;{\mathcal{E}}_{S_{1},\dots,S_{m}}\Big{)}
CτAnmmS1,,Smm/τm/τi=1m|𝔼exp(2π𝐢vη[Si]m1/2s)|ds.\displaystyle\leq\frac{C^{\prime}\tau A_{nm}}{\sqrt{m}}\sum\limits_{S_{1},\dots,S_{m}}\int\limits_{-\sqrt{m}/\tau}^{\sqrt{m}/\tau}\prod\limits_{i=1}^{m}|{\mathbb{E}}\exp(2\pi{\bf i}v_{\eta[S_{i}]}\,m^{-1/2}s)|\,ds.

Setting τ:=m/𝐔𝐃n\tau:=\sqrt{m}/{\bf UD}_{n}, where 𝐔𝐃n=𝐔𝐃n(v,m,K1,K2){\bf UD}_{n}={\bf UD}_{n}(v,m,K_{1},K_{2}), we obtain

𝒬\displaystyle\mathcal{Q} (i=1nviXi,m/𝐔𝐃n)CAnm𝐔𝐃nS1,,Sm𝐔𝐃n𝐔𝐃ni=1m|𝔼exp(2π𝐢vη[Si]m1/2s)|dsCK1𝐔𝐃n,\displaystyle\Big{(}\sum\limits_{i=1}^{n}v_{i}X_{i},\sqrt{m}/{\bf UD}_{n}\Big{)}\leq\frac{C^{\prime}A_{nm}}{{\bf UD}_{n}}\,\sum\limits_{S_{1},\dots,S_{m}}\int\limits_{-{\bf UD}_{n}}^{{\bf UD}_{n}}\prod\limits_{i=1}^{m}|{\mathbb{E}}\exp(2\pi{\bf i}v_{\eta[S_{i}]}\,m^{-1/2}s)|\,ds\leq\frac{C^{\prime}K_{1}}{{\bf UD}_{n}},

in view of the definition of 𝐔𝐃n(v,m,K1,K2){\bf UD}_{n}(v,m,K_{1},K_{2}). The result follows. ∎

For future use, we state an immediate consequence of Theorem 2.1 and Lemma 3.8.

Corollary 4.1.

Let n,n,\ell\in{\mathbb{N}}, let m1,,mm_{1},\dots,m_{\ell} be integers with min/2m_{i}\leq n/2 for all ii, and let K1,K21K_{1},K_{2}\geq 1. Further, let vnv\in{\mathbb{R}}^{n}, and let BB be an ×n\ell\times n random matrix with independent rows such that the ii-th row is uniformly distributed on the set of vectors with mim_{i} ones and nmin-m_{i} zeros. Then for any non-random vector ZZ\in{\mathbb{R}}^{\ell} we have

{BvZt}(2C3.8C2.1t/minimi)for all tmaximi𝐔𝐃n(v,mi,K1,K2).{\mathbb{P}}\big{\{}\|Bv-Z\|\leq\sqrt{\ell}\,t\big{\}}\leq\Big{(}2C_{\text{\tiny\ref{l: tensor}}}C_{\text{\tiny\ref{p: cf est}}}t/\sqrt{\min\limits_{i}m_{i}}\Big{)}^{\ell}\quad\mbox{for all }t\geq\max\limits_{i}\frac{\sqrt{m_{i}}}{{\bf UD}_{n}(v,m_{i},K_{1},K_{2})}.

The parameter K_{2}, which did not participate in any way in the proof of Theorem 2.1, is needed to guarantee a certain stability property of {\bf UD}_{n}(v,m,K_{1},K_{2}). We would like to emphasize that the use of the functions \psi_{K_{2}} is a technical element of the argument.

Proposition 4.2 (Stability of the u-degree).

For any K21K_{2}\geq 1 there are c4.2,c4.2>0c_{\text{\tiny\ref{l: stability of bal}}},c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}>0 depending only on K2K_{2} with the following property. Let K11K_{1}\geq 1, vnv\in{\mathbb{R}}^{n}, kk\in{\mathbb{N}}, mn/2m\leq n/2, and assume that 𝐔𝐃n(v,m,K1,K2)c4.2k{\bf UD}_{n}(v,m,K_{1},K_{2})\leq c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}k. Then there is a vector y(1k)ny\in\big{(}\frac{1}{k}{\mathbb{Z}}\big{)}^{n} such that vy1k\|v-y\|_{\infty}\leq\frac{1}{k}, and such that

𝐔𝐃n(y,m,c4.2K1,K2)𝐔𝐃n(v,m,K1,K2)𝐔𝐃n(y,m,c4.21K1,K2){\bf UD}_{n}(y,m,c_{\text{\tiny\ref{l: stability of bal}}}K_{1},K_{2})\leq{\bf UD}_{n}(v,m,K_{1},K_{2})\leq{\bf UD}_{n}(y,m,c_{\text{\tiny\ref{l: stability of bal}}}^{-1}K_{1},K_{2})

To prove the proposition we need two auxiliary lemmas.

Lemma 4.3.

Let 0z0\neq z\in\mathbb{C}, ε[0,|z|/2]\varepsilon\in[0,|z|/2] and let WW be a random vector in \mathbb{C} with 𝔼W=0{\mathbb{E}}W=0 and with |W|ε|W|\leq\varepsilon everywhere on the probability space. Then

|𝔼|z+W||z||ε2|z|.\big{|}{\mathbb{E}}|z+W|-|z|\big{|}\leq\frac{\varepsilon^{2}}{|z|}.
Proof.

We can view both zz and WW as vectors in 2{\mathbb{R}}^{2}, and can assume without loss of generality that z=(z1,0)z=(z_{1},0), with z1=|z|z_{1}=|z|. Then |z1+W1|=z1+W1|z_{1}+W_{1}|=z_{1}+W_{1} and

z1+W1|z+W|=(z1+W1)2+W22(z1+W1)+W222|z1+W1|(z1+W1)+ε22(|z|ε).z_{1}+W_{1}\leq|z+W|=\sqrt{(z_{1}+W_{1})^{2}+W_{2}^{2}}\leq(z_{1}+W_{1})+\frac{W_{2}^{2}}{2|z_{1}+W_{1}|}\leq(z_{1}+W_{1})+\frac{\varepsilon^{2}}{2(|z|-\varepsilon)}.

Hence,

|z|=z1=𝔼(z1+W1)𝔼|z+W|𝔼(z1+W1)+ε2|z|=|z|+ε2|z|,|z|=z_{1}={\mathbb{E}}(z_{1}+W_{1})\leq{\mathbb{E}}|z+W|\leq{\mathbb{E}}(z_{1}+W_{1})+\frac{\varepsilon^{2}}{|z|}=|z|+\frac{\varepsilon^{2}}{|z|},

which implies the desired estimate. ∎

Lemma 4.4.

Let λ,μ\lambda,\mu\in{\mathbb{R}}, and let ξ\xi be a random variable in {\mathbb{R}} with 𝔼ξ=μ{\mathbb{E}}\xi=\mu and with |ξμ|λ|\xi-\mu|\leq\lambda everywhere on the probability space. Then for any ss\in{\mathbb{R}} we have

|𝔼exp(2π𝐢ξs)exp(2π𝐢μs)|(2πλs)2.\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,\xi\,s\big{)}-\exp\big{(}2\pi{\bf i}\,\mu\,s\big{)}\big{|}\leq(2\pi\lambda s)^{2}.
Proof.

Denote \xi^{\prime}=\xi-\mu. Then {\mathbb{E}}\xi^{\prime}=0 and |\xi^{\prime}|\leq\lambda. Therefore, using that |\sin x|\leq|x| and |\sin x-x|\leq x^{2}/2 for every x\in{\mathbb{R}}, we obtain

|𝔼exp\displaystyle\big{|}{\mathbb{E}}\exp (2π𝐢ξs)exp(2π𝐢μs)|=|𝔼exp(2π𝐢ξs)1|=|𝔼cos(2πξs)1+𝐢𝔼sin(2πξs)|\displaystyle\big{(}2\pi{\bf i}\,\xi\,s\big{)}-\exp\big{(}2\pi{\bf i}\,\mu\,s\big{)}\big{|}=\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,\xi^{\prime}s\big{)}-1\big{|}=\big{|}{\mathbb{E}}\cos\big{(}2\pi\xi^{\prime}s\big{)}-1+{\bf i}\,{\mathbb{E}}\sin\big{(}2\pi\xi^{\prime}s\big{)}\big{|}
=|2𝔼sin2(πξs)+𝐢𝔼(sin(2πξs)2πξs)|2(πλs)2+(2πλs)2/2=(2πλs)2.\displaystyle=\big{|}-2{\mathbb{E}}\sin^{2}\big{(}\pi\xi^{\prime}s\big{)}+{\bf i}\,{\mathbb{E}}\big{(}\sin\big{(}2\pi\xi^{\prime}s\big{)}-2\pi\xi^{\prime}s\big{)}\big{|}\leq 2(\pi\lambda s)^{2}+(2\pi\lambda s)^{2}/2=(2\pi\lambda s)^{2}.
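A quick exact check of Lemma 4.4 (illustrative only) can be made for the symmetric two-point distribution \xi\in\{\mu-\lambda,\mu+\lambda\} with equal probabilities, for which {\mathbb{E}}\exp(2\pi{\bf i}\,\xi\,s)=\exp(2\pi{\bf i}\,\mu\,s)\cos(2\pi\lambda s); the parameters and the grid of values of s below are arbitrary choices.

import numpy as np

mu, lam = 0.7, 0.2
s = np.linspace(-5.0, 5.0, 100001)
char_fn = np.exp(2j * np.pi * mu * s) * np.cos(2 * np.pi * lam * s)   # E exp(2 pi i xi s)
lhs = np.abs(char_fn - np.exp(2j * np.pi * mu * s))                   # equals |cos(2 pi lam s) - 1|
assert np.all(lhs <= (2 * np.pi * lam * s) ** 2 + 1e-12)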

Proof of Proposition 4.2.

To prove the proposition, we will use randomized rounding, which is a well-known technique in computer science and was recently applied in the random matrix context in [30] (see also [48, 31]). Define a random vector Y in \big{(}\frac{1}{k}{\mathbb{Z}}\big{)}^{n} with independent components Y_{1},\dots,Y_{n} such that each component Y_{i} has distribution

Yi={1kkvi, with probability kvikvi+1,1kkvi+1k, with probability kvikvi.Y_{i}=\begin{cases}\frac{1}{k}\lfloor kv_{i}\rfloor,&\mbox{ with probability $\lfloor kv_{i}\rfloor-kv_{i}+1$},\\ \frac{1}{k}\lfloor kv_{i}\rfloor+\frac{1}{k},&\mbox{ with probability $kv_{i}-\lfloor kv_{i}\rfloor$}.\end{cases}

Then 𝔼Yi=vi{\mathbb{E}}Y_{i}=v_{i}, ini\leq n and, deterministically, vY1/k\|v-Y\|_{\infty}\leq 1/k.
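A minimal sketch of this randomized rounding (illustrative only; the test vector, the value of k and the sample size are arbitrary choices) checks the two properties just stated: {\mathbb{E}}Y_{i}=v_{i} and \|v-Y\|_{\infty}\leq 1/k.

import numpy as np

def randomized_rounding(v, k, rng):
    # round each coordinate to one of the two neighbouring points of the lattice (1/k)Z,
    # with probabilities chosen so that the expectation equals v_i
    lower = np.floor(k * v) / k
    prob_up = k * v - np.floor(k * v)
    return lower + (rng.random(v.shape) < prob_up) / k

rng = np.random.default_rng(3)
v, k = rng.normal(size=5), 10
Y = np.stack([randomized_rounding(v, k, rng) for _ in range(100000)])
print(np.max(np.abs(Y.mean(axis=0) - v)))        # E Y_i = v_i, up to Monte Carlo error
print(np.max(np.abs(Y - v)))                     # ||v - Y||_inf <= 1/k, deterministically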

Fix for a moment a number s(0,k/(14πK2)]s\in(0,k/(14\pi K_{2})] and a subset S[n]S\subset[n] of cardinality n/m\lfloor n/m\rfloor. Our intermediate goal is to estimate the quantity

𝔼ψK2(|1n/mjSexp(2π𝐢Yjs)|).{\mathbb{E}}\,\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}\Big{)}.

Denote

V=VS:=|1n/mjSexp(2π𝐢vjs)|=|𝔼exp(2π𝐢vη[S]s)|V=V_{S}:=\left|\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,v_{j}\,s\big{)}\right|=\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,v_{\eta[S]}\,s\big{)}\big{|}

and consider two cases.

Case 1. V12K22πskV\leq\frac{1}{2K_{2}}-\frac{2\pi\,s}{k}. Using that |e𝐢x1||x||e^{{\bf i}x}-1|\leq|x| for every xx\in{\mathbb{R}}, we observe that deterministically

|exp(2π𝐢vjs)exp(2π𝐢Yjs)|2πs/k.|\exp\big{(}2\pi{\bf i}\,v_{j}\,s\big{)}-\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}|\leq 2\pi s/k. (13)

Therefore, by the definition of the function ψK2\psi_{K_{2}}, in this case we have on the entire probability space

ψK2(|1n/mjSexp(2π𝐢Yjs)|)=ψK2(V)=1K2.\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}\Big{)}=\psi_{K_{2}}(V)=\frac{1}{K_{2}}.

Case 2. V>12K22πsk14K2V>\frac{1}{2K_{2}}-\frac{2\pi\,s}{k}\geq\frac{1}{4K_{2}}. Set

z:=1n/m𝔼jSexp(2π𝐢Yjs) and W:=1n/mjSexp(2π𝐢Yjs)z.z:=\frac{1}{\lfloor n/m\rfloor}{\mathbb{E}}\,\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\quad\mbox{ and }\quad W:=\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}-z.

Then {\mathbb{E}}W=0 and, using again |e^{{\bf i}x}-1|\leq|x|, we see that |W|\leq 2\pi s/k everywhere. By Lemma 4.4 (applied to each Y_{j}, j\in S, and averaged), \big||z|-V\big|\leq(2\pi s/k)^{2}; in particular, |z|\geq V-(2\pi s/k)^{2}\geq 1/(3K_{2})\geq 4\pi s/k\geq 2|W|. Therefore we may apply Lemma 4.3 with \varepsilon=2\pi s/k, to obtain

|𝔼|W+z||z||4π2s2|z|k212π2K2s2k2.\big{|}{\mathbb{E}}|W+z|-|z|\big{|}\leq\frac{4\pi^{2}s^{2}}{|z|k^{2}}\leq\frac{12\pi^{2}K_{2}s^{2}}{k^{2}}.

This implies,

|𝔼|\displaystyle\Big{|}{\mathbb{E}}\Big{|} 1n/mjSexp(2π𝐢Yjs)|V|=|𝔼|W+z||z|+|z|V|16π2K2s2k2.\displaystyle\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}-V\Big{|}=\Big{|}{\mathbb{E}}|W+z|-|z|+|z|-V\Big{|}\leq\frac{16\pi^{2}K_{2}s^{2}}{k^{2}}. (14)

To convert the last relation to estimating ψK2()\psi_{K_{2}}(\cdot), we will use the assumption that the second derivative of ψK2\psi_{K_{2}} is uniformly bounded. Applying Taylor’s expansion around the point VV, we get

\Big{|}{\mathbb{E}}\,\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}\Big{)}-\psi_{K_{2}}\big{(}V\big{)}-{\mathbb{E}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}-V\Big{)}\,\psi_{K_{2}}^{\prime}(V)\Big{|}\leq C^{\prime\prime}\,\Big{\|}\,\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}-V\Big{\|}_{\infty}^{2},

for some C^{\prime\prime}>0 which may only depend on K_{2}. Here, \|\cdot\|_{\infty} denotes the essential supremum of the random variable, and it is bounded above by 2\pi s/k in view of (13). Together with (14) and with \|\psi_{K_{2}}^{\prime}\|_{\infty}\leq 1, this gives

|𝔼ψK2(|1n/mjSexp(2π𝐢Yjs)|)ψK2(V)|C¯s2k2,\Big{|}{\mathbb{E}}\,\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}\Big{)}-\psi_{K_{2}}(V)\Big{|}\leq\frac{\bar{C}\,s^{2}}{k^{2}},

where C¯\bar{C} depends only on K2K_{2}.

Since \psi_{K_{2}}\geq 1/(2K_{2}) everywhere on {\mathbb{R}}_{+}, in both cases we obtain, for some \hat{C}>0 depending only on K_{2},

|\displaystyle\Big{|} 𝔼ψK2(|1n/mjSexp(2π𝐢Yjs)|)ψK2(V)|C^s2k2ψK2(V).\displaystyle{\mathbb{E}}\,\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}\Big{)}-\psi_{K_{2}}(V)\Big{|}\leq\frac{\hat{C}\,s^{2}}{k^{2}}\psi_{K_{2}}(V).

Using this inequality together with definition of V=VSV=V_{S}, integrating over ss, and summing over all choices of disjoint subsets S1,,SmS_{1},\dots,S_{m} of cardinality n/m\lfloor n/m\rfloor, for every t(0,k/(14πK2)]t\in(0,k/(14\pi K_{2})] we get the relation

S1,,Sm\displaystyle\sum\limits_{S_{1},\dots,S_{m}}\; ttmax(0,1c0s2k2)mi=1mψK2(|𝔼exp(2π𝐢vη[Si]s)|)ds\displaystyle\int\limits_{-t}^{t}\max\bigg{(}0,1-\frac{c_{0}\,s^{2}}{k^{2}}\bigg{)}^{m}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,v_{\eta[S_{i}]}\,s\big{)}\big{|}\big{)}\,ds
S1,,Smtti=1m𝔼YψK2(|1n/mjSiexp(2π𝐢Yjs)|)ds\displaystyle\leq\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t}^{t}\prod\limits_{i=1}^{m}{\mathbb{E}}_{Y}\,\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S_{i}}\exp\big{(}2\pi{\bf i}\,Y_{j}\,s\big{)}\Big{|}\Big{)}\,ds
S1,,Smtt(1+C0s2k2)mi=1mψK2(|𝔼exp(2π𝐢vη[Si]s)|)ds,\displaystyle\leq\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t}^{t}\bigg{(}1+\frac{C_{0}\,s^{2}}{k^{2}}\bigg{)}^{m}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,v_{\eta[S_{i}]}\,s\big{)}\big{|}\big{)}\,ds,

where C0,c0>7πK2C_{0},c_{0}>7\pi K_{2} are constants that may only depend on K2K_{2}. Using independence of the components of YY, we can take the expectation with respect to YY out of the integral.

Given a vector Q=(q1,,qn)nQ=(q_{1},\dots,q_{n})\in{\mathbb{R}}^{n} and t(0,k/(14πK2)]t\in(0,k/(14\pi K_{2})], denote

gt(Q):=S1,,Smtti=1mψK2(|1n/mjSiexp(2π𝐢qjs)|)ds.g_{t}(Q):=\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t}^{t}\prod\limits_{i=1}^{m}\,\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S_{i}}\exp\big{(}2\pi{\bf i}\,q_{j}\,s\big{)}\Big{|}\Big{)}\,ds.

The above relation implies that there are two (non-random) realizations Y^{\prime} and Y^{\prime\prime} of Y such that

gt(Y)\displaystyle g_{t}(Y^{\prime}) I1:=max(0,1c0t2k2)mS1,,Smtti=1mψK2(|𝔼exp(2π𝐢vη[Si]s)|)ds\displaystyle\geq I_{1}:=\max\bigg{(}0,1-\frac{c_{0}\,t^{2}}{k^{2}}\bigg{)}^{m}\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t}^{t}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,v_{\eta[S_{i}]}\,s\big{)}\big{|}\big{)}\,ds

and

gt(Y′′)I2:=(1+C0t2k2)mS1,,Smtti=1mψK2(|𝔼exp(2π𝐢vη[Si]s)|)ds.\displaystyle g_{t}(Y^{\prime\prime})\leq I_{2}:=\bigg{(}1+\frac{C_{0}\,t^{2}}{k^{2}}\bigg{)}^{m}\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t}^{t}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,v_{\eta[S_{i}]}\,s\big{)}\big{|}\big{)}\,ds.

Using properties of the function ψK2\psi_{K_{2}}, we note that for any two non-random vectors Y~\widetilde{Y} and Y^\hat{Y} in the range of YY such that they differ on a single coordinate, one has gt(Y~)4K2gt(Y^).g_{t}(\widetilde{Y})\leq 4K_{2}\,g_{t}(\hat{Y}). Consider a path Y(1)=Y,Y(2),Y(3),,Y′′Y^{(1)}=Y^{\prime},Y^{(2)},Y^{(3)},\dots,Y^{\prime\prime} from YY^{\prime} to Y′′Y^{\prime\prime} consisting of a sequence of non-random vectors in the range of YY such that each adjacent pair Y(i),Y(i+1)Y^{(i)},Y^{(i+1)} differs on a single coordinate and let

S:={i:gt(Y(i))>4K2I2}[1,n1].S:=\{i\,:\,g_{t}(Y^{(i)})>4K_{2}I_{2}\}\subset[1,n-1].

If S=S=\emptyset, take 𝐘=Y(1){\bf Y}=Y^{(1)}. Otherwise, let =max{i:gt(Y(i))>4K2I2}\ell=\max\{i\,:\,g_{t}(Y^{(i)})>4K_{2}I_{2}\}. Then take 𝐘=Y(+1){\bf Y}=Y^{(\ell+1)} and note gt(Y(+1))gt(Y())/(4K2)I2I1g_{t}(Y^{(\ell+1)})\geq g_{t}(Y^{(\ell)})/(4K_{2})\geq I_{2}\geq I_{1}. Thus the vector 𝐘{\bf Y} is in the range of YY and

I1gt(𝐘)\displaystyle I_{1}\leq g_{t}({\bf Y}) 4K2I2.\displaystyle\leq 4K_{2}I_{2}.

Making substitutions s=mss^{\prime}=\sqrt{m}s, t=mtt^{\prime}=\sqrt{m}t in the integrals in I1,I2I_{1},I_{2}, and assuming that tk/max(2C0,2c0)t^{\prime}\leq k/\max(2C_{0},2c_{0}) (in this case the condition tk/(14πK2)t\leq k/(14\pi K_{2}) is satisfied), we can rewrite the last inequalities as

12S1,,Smtti=1mψK2(|𝔼exp(2π𝐢vη[Si]m1/2s)|)dsS1,,Smtti=1mψK2(|1n/mjSiexp(2π𝐢𝐘jm1/2s)|)ds6K2S1,,Smtti=1mψK2(|𝔼exp(2π𝐢vη[Si]m1/2s)|)ds.\frac{1}{2}\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t^{\prime}}^{t^{\prime}}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,v_{\eta[S_{i}]}m^{-1/2}\,s\big{)}\big{|}\big{)}\,ds\\ \leq\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t^{\prime}}^{t^{\prime}}\prod\limits_{i=1}^{m}\,\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{j\in S_{i}}\exp\big{(}2\pi{\bf i}\,{\bf Y}_{j}m^{-1/2}\,s\big{)}\Big{|}\Big{)}\,ds\\ \leq 6K_{2}\,\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-t^{\prime}}^{t^{\prime}}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,v_{\eta[S_{i}]}m^{-1/2}\,s\big{)}\big{|}\big{)}\,ds.

The result follows by the definition of 𝐔𝐃n(){\bf UD}_{n}(\cdot). ∎

The last statement to be considered in this subsection asserts that the u-degree of any vector from 𝒱n(r,𝐠,δ,ρ){\mathcal{V}}_{n}(r,{\bf g},\delta,\rho) is at least of order m\sqrt{m}.

Proposition 4.5 (Lower bound on the u-degree).

For any r,δ,ρr,\delta,\rho there is C4.5>0C_{\text{\tiny\ref{p: low bound on bal}}}>0 depending only on r,δ,ρr,\delta,\rho with the following property. Let K22K_{2}\geq 2, 1mn/C4.51\leq m\leq n/C_{\text{\tiny\ref{p: low bound on bal}}}, K1C4.5K_{1}\geq C_{\text{\tiny\ref{p: low bound on bal}}} and let x𝒱n(r,𝐠,δ,ρ)x\in{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho). Then

𝐔𝐃n(x,m,K1,K2)m.{\bf UD}_{n}(x,m,K_{1},K_{2})\geq\sqrt{m}.
Lemma 4.6.

For any ρ>0\rho>0 and κ(0,1/2]\kappa\in(0,1/2] there is a constant C~>0\widetilde{C}>0 depending only on ρ\rho and κ\kappa with the following property. Let SS\neq\emptyset be a finite subset of {\mathbb{Z}}, and let (yw)wS(y_{w})_{w\in S} be a real vector (indexed by SS). Assume further that S1,S2S_{1},S_{2} are two disjoint subsets of SS, each of cardinality at least κ|S|\kappa|S| such that minwS1ywmaxwS2yw+ρ\min\limits_{w\in S_{1}}y_{w}\geq\max\limits_{w\in S_{2}}y_{w}+\rho. Let K22K_{2}\geq 2 and ff be a function on [0,1][0,1] defined by

f(t):=ψK2(|1|S|wSexp(2π𝐢ywt)|),t[0,1].f(t):=\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{|S|}\sum_{w\in S}\exp(2\pi{\bf i}\,y_{w}\,t)\Big{|}\Big{)},\quad t\in[0,1].

Then for every b>0b>0 one has

|{t[0,1]:f(t)1b2}|C~b.\big{|}\big{\{}t\in[0,1]:\;f(t)\geq 1-b^{2}\big{\}}\big{|}\leq\widetilde{C}b.
Proof.

Clearly we may assume that b1/2b\leq 1/\sqrt{2}. Denote m=κ|S|m=\lceil\kappa|S|\rceil and

g(t):=|wSexp(2π𝐢ywt)|,t.g(t):=\Big{|}\sum_{w\in S}\exp(2\pi{\bf i}\,y_{w}\,t)\Big{|},\quad t\in{\mathbb{R}}.

Let T\subset S_{1}\times S_{2} be of cardinality |T|=m and such that for all (q,j),(q^{\prime},j^{\prime})\in T with (q,j)\neq(q^{\prime},j^{\prime}) one has q\neq q^{\prime} and j\neq j^{\prime}. Then for all t\in{\mathbb{R}},

g(t)=|wS1S2exp(2π𝐢ywt)+wS1S2exp(2π𝐢ywt)|(q,j)T|1+exp(2π𝐢(yjyq)t)|+|S|2m.g(t)=\Big{|}\sum_{w\in S_{1}\cup S_{2}}\exp(2\pi{\bf i}\,y_{w}\,t)+\sum_{w\notin S_{1}\cup S_{2}}\exp(2\pi{\bf i}\,y_{w}\,t)\Big{|}\leq\sum_{(q,j)\in T}\big{|}1+\exp(2\pi{\bf i}\,(y_{j}-y_{q})\,t)\big{|}+|S|-2m.

Further, take any u(0,1/2κ)u\in(0,1/\sqrt{2\kappa}) and observe that for each (q,j)T(q,j)\in T, since |yjyq|ρ|y_{j}-y_{q}|\geq\rho, we have

|{t[0,1]:|1+exp(2π𝐢(yjyq)t)|22u2}|Cu,\big{|}\big{\{}t\in[0,1]:\;\big{|}1+\exp(2\pi{\bf i}\,(y_{j}-y_{q})\,t)\big{|}\geq 2-2u^{2}\big{\}}\big{|}\leq C^{\prime}u,

where C>0C^{\prime}>0 may only depend on ρ\rho. This implies that

|{t[0,1]:|1+exp(2π𝐢(yjyq)t)|22u2 for at least m/2 pairs (q,j)T}|2Cu.\Big{|}\Big{\{}t\in[0,1]:\;\big{|}1+\exp(2\pi{\bf i}\,(y_{j}-y_{q})\,t)\big{|}\geq 2-2u^{2}\mbox{ for at least $m/2$ pairs $(q,j)\in T$}\Big{\}}\Big{|}\leq 2C^{\prime}u.

On the other hand, whenever t[0,1]t\in[0,1] is such that |1+exp(2π𝐢(yjyq)t)|22u2\big{|}1+\exp(2\pi{\bf i}\,(y_{j}-y_{q})\,t)\big{|}\geq 2-2u^{2} for at most m/2m/2 pairs (q,j)T(q,j)\in T, we have

g(t)m2(22u2)+m22+|S|2m=|S|mu2|S|(1κu2),g(t)\leq\frac{m}{2}(2-2u^{2})+\frac{m}{2}\cdot 2+|S|-2m=|S|-mu^{2}\leq|S|(1-\kappa u^{2}),

whence f(t)max(1K2,1κu2)=1κu2f(t)\leq\max\big{(}\frac{1}{K_{2}},1-\kappa u^{2}\big{)}=1-\kappa u^{2}. Taking u=bκu=\frac{b}{\sqrt{\kappa}} we obtain the desired result with C~=2Cκ\widetilde{C}=\frac{2C^{\prime}}{\sqrt{\kappa}}. ∎
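The conclusion of Lemma 4.6 is easy to observe numerically in a toy case (illustrative only): take half of the y_{w} equal to 0 and half equal to 1, so that S_{1} and S_{2} realize a gap \rho=1 with \kappa=1/2. Then the modulus in the definition of f equals |\cos(\pi t)|, and since 1-b^{2}>1/K_{2} for the values of b below, the superlevel sets of f and of |\cos(\pi t)| at level 1-b^{2} coincide. The grid size and the values of b are arbitrary choices.

import numpy as np

t = np.linspace(0.0, 1.0, 200001)
f = np.abs(np.cos(np.pi * t))            # |(exp(0) + exp(2 pi i t)) / 2| = |cos(pi t)|
for b in (0.05, 0.1, 0.2, 0.4):
    measure = np.mean(f >= 1 - b ** 2)   # approximates |{t in [0,1] : f(t) >= 1 - b^2}|
    print(b, measure, measure / b)       # the ratio measure / b stays bounded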

Proof of Proposition 4.5.

Let AnmA_{nm} be defined as in (7) and nδn_{\delta}, CδC_{\delta}, 𝒮\mathcal{S} be from Lemma 3.3. We assume that nnδn\geq n_{\delta} and n/mCδn/m\geq C_{\delta}. For every imi\leq m denote

fi(s)=ψK2(|𝔼exp(2π𝐢xη[Si]m1/2s)|).f_{i}(s)=\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,x_{\eta[S_{i}]}\,m^{-1/2}s\big{)}\big{|}\big{)}.

Further, let subsets Q1Q_{1} and Q2Q_{2} be taken from the definition of non-constant vectors applied to xx. Then by Lemma 3.3 and since ψK2(1)1\psi_{K_{2}}(1)\leq 1,

Anm(S1,,Sm)𝒮mmi=1mfidsecδn 2m+Anm(S1,,Sm)𝒮mmi=1mfids,\displaystyle A_{nm}\,\sum\limits_{(S_{1},\dots,S_{m})\in\mathcal{S}}\;\int\limits_{-\sqrt{m}}^{\sqrt{m}}\prod\limits_{i=1}^{m}f_{i}\,ds\leq\,e^{-c_{\delta}n}\,2\sqrt{m}+A_{nm}\,\sum\limits_{(S_{1},\dots,S_{m})\in\mathcal{S}^{\prime}}\;\int\limits_{-\sqrt{m}}^{\sqrt{m}}\prod\limits_{i=1}^{m}f_{i}\,ds,

where \mathcal{S}^{\prime} is the set of all sequences (S_{1},\dots,S_{m})\in\mathcal{S} such that

min(|SiQ1|,|SiQ2|)δ2n/m for at least cδm indices i.\min(|S_{i}\cap Q_{1}|,|S_{i}\cap Q_{2}|)\geq\frac{\delta}{2}\lfloor n/m\rfloor\,\,\mbox{ for at least $\,\,c_{\delta}m\,\,$ indices $\,i$.} (15)

Take any (S1,,Sm)𝒮(S_{1},\dots,S_{m})\in\mathcal{S}^{\prime} and denote m0:=cδmm_{0}:=\lceil c_{\delta}m\rceil. Without loss of generality we assume that (15) holds for all im0i\leq m_{0}. Applying Lemma 4.6 with κ:=δ/2\kappa:=\delta/2 and b=1ub=\sqrt{1-u}, we get for all u(0,1]u\in(0,1] and im0i\leq m_{0},

μ(u):=|{s[m,m]:fiu}|C~m1u,\mu(u):=\Big{|}\Big{\{}s\in[-\sqrt{m},\sqrt{m}]:\;f_{i}\geq u\Big{\}}\Big{|}\leq\widetilde{C}\sqrt{m}\sqrt{1-u},

where C~>0\widetilde{C}>0 depends only on δ\delta and ρ\rho. This estimate implies that for im0i\leq m_{0},

\int\limits_{-\sqrt{m}}^{\sqrt{m}}(f_{i}(s))^{m_{0}}\,ds=\int\limits_{0}^{1}m_{0}\,u^{m_{0}-1}\,\mu(u)\,du\leq\widetilde{C}\sqrt{m}\,m_{0}\,B(3/2,m_{0})\leq C_{2},

where BB denotes the Beta-function and C2>0C_{2}>0 is a constant depending only on ρ\rho and δ\delta. Applying Hölder’s inequality, we obtain

mmi=1mψK2(|𝔼exp(2π𝐢xη[Si]m1/2s)|)ds\displaystyle\int\limits_{-\sqrt{m}}^{\sqrt{m}}\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,x_{\eta[S_{i}]}\,m^{-1/2}s\big{)}\big{|}\big{)}\,ds mmi=1m0ψK2(|𝔼exp(2π𝐢xη[Si]m1/2s)|)dsC2,\displaystyle\leq\int\limits_{-\sqrt{m}}^{\sqrt{m}}\prod\limits_{i=1}^{m_{0}}\psi_{K_{2}}\big{(}\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,x_{\eta[S_{i}]}\,m^{-1/2}s\big{)}\big{|}\big{)}\,ds\leq C_{2},

which implies the desired result. ∎

4.2 No moderately unstructured normal vectors

Let M_{n} be an n\times n Bernoulli(p) random matrix. For each i\leq n, denote by H_{i}=H_{i}(M_{n}) the span of the columns {\bf C}_{j}(M_{n}), j\neq i. The goal of this subsection is to prove Theorem 2.2, which asserts that, under appropriate restrictions on n and p, with a very large probability (say, at least 1-2e^{-2pn}) the subspace H_{i}^{\perp} is either structured or very unstructured. The main ingredient of the proof — Proposition 4.9 — will be considered in the next subsection. Here, we will only state the proposition, to be used as a black box, and for this we need to introduce an additional product structure which, in a sense, replaces the set {\mathcal{V}}_{n}(r,{\bf g},\delta,\rho).


Fix a permutation σΠn\sigma\in\Pi_{n}, two disjoint subsets Q1,Q2Q_{1},Q_{2} of cardinality δn\lceil\delta n\rceil each, and a number hh\in{\mathbb{R}} such that

iQ1:h+2𝐠(n/σ1(i)) and iQ2:𝐠(n/σ1(i))hρ2.\forall i\in Q_{1}:\,\,h+2\leq{\bf g}(n/\sigma^{-1}(i))\quad\quad\mbox{ and }\quad\quad\forall i\in Q_{2}:\,\,-{\bf g}(n/\sigma^{-1}(i))\leq h-\rho-2. (16)

Define the sets Λn=Λn(k,𝐠,Q1,Q2,ρ,σ,h)\Lambda_{n}=\Lambda_{n}(k,{\bf g},Q_{1},Q_{2},\rho,\sigma,h) by

Λn:={x1kn:|xσ(i)|𝐠(n/i)for all in,miniQ1xih, and maxiQ2xihρ}.\begin{split}\Lambda_{n}:=\bigg{\{}x\in\frac{1}{k}{\mathbb{Z}}^{n}:\;&|x_{\sigma(i)}|\leq{\bf g}(n/i)\;\;\mbox{for all }i\leq n,\;\;\;\min\limits_{i\in Q_{1}}x_{i}\geq h,\,\,\,\mbox{ and }\,\,\,\max\limits_{i\in Q_{2}}x_{i}\leq h-\rho\bigg{\}}.\end{split} (17)

In what follows, we adopt the convention that Λn=\Lambda_{n}=\emptyset whenever hh does not satisfy (16).

Lemma 4.7.

There exists an absolute constant C4.71C_{\text{\tiny\ref{l: permut}}}\geq 1 such that for every n1n\geq 1 there is a subset Π¯nΠn\bar{\Pi}_{n}\subset\Pi_{n} of cardinality at most exp(C4.7n)\exp({C_{\text{\tiny\ref{l: permut}}}n}) with the following property. For any two partitions (Si)i=1m(S_{i})_{i=1}^{m} and (Si)i=1m(S_{i}^{\prime})_{i=1}^{m} of [n][n] with 2i+1n|Si|=|Si|2^{-i+1}n\geq|S_{i}|=|S_{i}^{\prime}|, imi\leq m, there is σΠ¯n\sigma\in\bar{\Pi}_{n} such that σ(Si)=Si\sigma(S_{i})=S_{i}^{\prime}, imi\leq m.

This lemma immediately follows from the fact that the total number of partitions (S_{i})_{i=1}^{m} of [n] satisfying 2^{-i+1}n\geq|S_{i}|, i\leq m, is at most exponential in n (one can take C_{\text{\tiny\ref{l: permut}}}=23). Using Lemma 4.7, we provide an efficient approximation of {\mathcal{V}}_{n}(r,{\bf g},\delta,\rho).

Lemma 4.8.

For any x𝒱n=𝒱n(r,𝐠,δ,ρ)x\in{\mathcal{V}}_{n}={\mathcal{V}}_{n}(r,{\bf g},\delta,\rho), k4/ρk\geq 4/\rho, and any y1kny\in\frac{1}{k}{\mathbb{Z}}^{n} with xy1/k\|x-y\|_{\infty}\leq 1/k one has

yq=4𝐠(6n)/ρ4𝐠(6n)/ρσ¯Π¯n|Q1|,|Q2|=δnΛn(k,𝐠(6),Q1,Q2,ρ/4,σ¯,ρq/4),y\in\bigcup\limits_{q=\lfloor-4{\bf g}(6n)/\rho\rfloor}^{\lceil 4{\bf g}(6n)/\rho\rceil}\;\bigcup\limits_{\bar{\sigma}\in\bar{\Pi}_{n}}\bigcup\limits_{|Q_{1}|,|Q_{2}|=\lceil\delta n\rceil}\Lambda_{n}(k,{\bf g}(6\,\cdot),Q_{1},Q_{2},\rho/4,\bar{\sigma},\rho q/4),

where the set of permutations Π¯n\bar{\Pi}_{n} is taken from Lemma 4.7.

Proof.

Let x𝒱nx\in{\mathcal{V}}_{n}, and assume that y1kny\in\frac{1}{k}{\mathbb{Z}}^{n} satisfies xy1/k\|x-y\|_{\infty}\leq 1/k. Then, by the definition of 𝒱n{\mathcal{V}}_{n}, there exist sets Q1,Q2[n]Q_{1},Q_{2}\subset[n], each of cardinality δn\lceil\delta n\rceil, satisfying

maxiQ2yi1kmaxiQ2ximiniQ1xiρminiQ1yiρ+1k.\max\limits_{i\in Q_{2}}y_{i}-\frac{1}{k}\leq\max\limits_{i\in Q_{2}}x_{i}\leq\min\limits_{i\in Q_{1}}x_{i}-\rho\leq\min\limits_{i\in Q_{1}}y_{i}-\rho+\frac{1}{k}.

Then maxiQ2yiminiQ1yiρ2,\max\limits_{i\in Q_{2}}y_{i}\leq\min\limits_{i\in Q_{1}}y_{i}-\frac{\rho}{2}, hence we can find a number hρ4h\in\frac{\rho}{4}{\mathbb{Z}} such that

miniQ1yih and maxiQ2yihρ4.\min\limits_{i\in Q_{1}}y_{i}\geq h\quad\quad\mbox{ and }\quad\quad\max\limits_{i\in Q_{2}}y_{i}\leq h-\frac{\rho}{4}.

By the definition of 𝒱n{\mathcal{V}}_{n} we also have |xσx(i)|𝐠(n/i)|x_{\sigma_{x}(i)}|\leq{\bf g}(n/i) for all i[n]i\in[n]. By the definition of Π¯n\bar{\Pi}_{n}, we can find a permutation σ¯Π¯n\bar{\sigma}\in\bar{\Pi}_{n} such that

σx({n/2+1,,n/21})=σ¯({n/2+1,,n/21}) for all 1.\sigma_{x}\big{(}\{\lfloor n/2^{\ell}\rfloor+1,\dots,\lfloor n/2^{\ell-1}\rfloor\}\big{)}=\bar{\sigma}\big{(}\{\lfloor n/2^{\ell}\rfloor+1,\dots,\lfloor n/2^{\ell-1}\rfloor\}\big{)}\quad\mbox{ for all }\ell\geq 1.

Clearly for such a permutation we have |xσ¯(i)|𝐠(2n/i)|x_{\bar{\sigma}(i)}|\leq{\bf g}(2n/i) for every ini\leq n. Using (8), we obtain

|yσ¯(i)||xσ¯(i)|+1k𝐠(2n/i)+1k𝐠(6n/i)2.|y_{\bar{\sigma}(i)}|\leq|x_{\bar{\sigma}(i)}|+\frac{1}{k}\leq{\bf g}(2n/i)+\frac{1}{k}\leq{\bf g}(6n/i)-2.

Thus

iσ¯1(Q1):hminiQ1yi𝐠(6n/i)2 and iσ¯1(Q2):hρ4maxiQ2yi2𝐠(6n/i).\forall i\in\bar{\sigma}^{-1}(Q_{1}):\,\,h\leq\min\limits_{i\in Q_{1}}y_{i}\leq{\bf g}(6n/i)-2\quad\mbox{ and }\quad\forall i\in\bar{\sigma}^{-1}(Q_{2}):\,\,h-\frac{\rho}{4}\geq\max\limits_{i\in Q_{2}}y_{i}\geq 2-{\bf g}(6n/i).

Since h=ρq/4h=\rho q/4 for some qq\in{\mathbb{Z}}, this implies the desired result. ∎

The following statement, together with Theorem 2.1 and Proposition 4.2, is the main ingredient of the proof of Theorem 2.2.

Proposition 4.9.

Let \varepsilon\in(0,1/8], \rho,\delta\in(0,1/4], and let the growth function {\bf g} satisfy (8). There exist K_{\text{\tiny\ref{prop: 09582593852}}}=K_{\text{\tiny\ref{prop: 09582593852}}}(\delta,\rho)\geq 1, n_{\text{\tiny\ref{prop: 09582593852}}}=n_{\text{\tiny\ref{prop: 09582593852}}}(\varepsilon,\delta,\rho,K_{3}), and C_{\text{\tiny\ref{prop: 09582593852}}}=C_{\text{\tiny\ref{prop: 09582593852}}}(\varepsilon,\delta,\rho,K_{3})\in{\mathbb{N}} with the following property. Let \sigma\in\Pi_{n}, h\in{\mathbb{R}}, and let Q_{1},Q_{2}\subset[n] be such that |Q_{1}|,|Q_{2}|=\lceil\delta n\rceil. Let 8\leq K_{2}\leq 1/\varepsilon, n\geq n_{\text{\tiny\ref{prop: 09582593852}}}, m\geq C_{\text{\tiny\ref{prop: 09582593852}}} with n/m\geq C_{\text{\tiny\ref{prop: 09582593852}}}, 1\leq k\leq\min\big{(}(K_{2}/8)^{m/2},2^{n/C_{\text{\tiny\ref{prop: 09582593852}}}}\big{)}, and let X=(X_{1},\dots,X_{n}) be a random vector uniformly distributed on \Lambda_{n}(k,{\bf g},Q_{1},Q_{2},\rho,\sigma,h). Then

{𝐔𝐃n(X,m,K4.9,K2)<km1/2/C4.9}εn.{\mathbb{P}}\big{\{}{\bf UD}_{n}(X,m,K_{\text{\tiny\ref{prop: 09582593852}}},K_{2})<km^{1/2}/C_{\text{\tiny\ref{prop: 09582593852}}}\big{\}}\leq\varepsilon^{n}.

Let us describe the proof of Theorem 2.2 informally. Assume that the hyperplane H1H_{1} admits a normal vector XX which belongs to 𝒱n(r,𝐠,δ,ρ){\mathcal{V}}_{n}(r,{\bf g},\delta,\rho). We need to show that with a large probability the u-degree 𝐔𝐃n(X,m,K1,K2){\bf UD}_{n}(X,m,K_{1},K_{2}) of XX is very large, say, at least εm\varepsilon^{-m} for a small ε>0\varepsilon>0. The idea is to split the collection 𝒱n(r,𝐠,δ,ρ){\mathcal{V}}_{n}(r,{\bf g},\delta,\rho) into about log2(εm)\log_{2}(\varepsilon^{-m}) subsets according to the magnitude of the u-degree (that is, each subset 𝒯N\mathcal{T}_{N} will have a form 𝒯N={x𝒱n(r,𝐠,δ,ρ):𝐔𝐃n(x,m,K1,K2)[N,2N)}\mathcal{T}_{N}=\big{\{}x\in{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho):\;{\bf UD}_{n}(x,m,K_{1},K_{2})\in[N,2N)\big{\}} for an appropriate NN). To show that for each NεmN\ll\varepsilon^{-m} the probability of X𝒯NX\in\mathcal{T}_{N} is very small, we define a discrete approximation 𝒜N{\mathcal{A}}_{N} of 𝒯N\mathcal{T}_{N} consisting of all vectors y1kny\in\frac{1}{k}{\mathbb{Z}}^{n} such that yx1/k\|y-x\|_{\infty}\leq 1/k for some x𝒯Nx\in\mathcal{T}_{N} and additionally, in view of Proposition 4.2, 𝐔𝐃n(y,m,c4.2K1,K2)2N{\bf UD}_{n}(y,m,c_{\text{\tiny\ref{l: stability of bal}}}K_{1},K_{2})\leq 2N and 𝐔𝐃n(y,m,c4.21K1,K2)N{\bf UD}_{n}(y,m,c_{\text{\tiny\ref{l: stability of bal}}}^{-1}K_{1},K_{2})\geq N. We can bound the cardinality of such set 𝒜N{\mathcal{A}}_{N} by (ε~k)n(\tilde{\varepsilon}\,k)^{n}, for a small ε~>0\tilde{\varepsilon}>0, by combining Proposition 4.9 with Lemma 4.8 and with the following simple fact.

Lemma 4.10.

Let k1k\geq 1, hh\in{\mathbb{R}}, ρ,δ(0,1)\rho,\delta\in(0,1), Q1,Q2[n]Q_{1},Q_{2}\subset[n] with |Q1|,|Q2|=δn|Q_{1}|,|Q_{2}|=\lceil\delta n\rceil, and 𝐠{\bf g} satisfies (8) with some K31K_{3}\geq 1. Then |Λn(k,𝐠,Q1,Q2,ρ,σ,h)|(C4.10k)n|\Lambda_{n}(k,{\bf g},Q_{1},Q_{2},\rho,\sigma,h)|\leq\big{(}C_{\text{\tiny\ref{l: Lambda_n card}}}k\big{)}^{n}, where C4.101C_{\text{\tiny\ref{l: Lambda_n card}}}\geq 1 depends only on K3K_{3}.

On the other hand, for each fixed vector yy in the set 𝒜N{\mathcal{A}}_{N} we can estimate the probability that it “approximates” a normal vector to H1H_{1} by using Corollary 4.1:

{y is an “approximate” normal vector to H1}(C/k)nfor every y𝒜N,{\mathbb{P}}\big{\{}\mbox{$y$ is an ``approximate'' normal vector to $H_{1}$}\big{\}}\leq(C^{\prime}/k)^{n}\quad\mbox{for every }y\in{\mathcal{A}}_{N},

for some constant Cε~1C^{\prime}\ll\tilde{\varepsilon}^{-1}. Taking the union bound, we obtain

{X𝒯N}{𝒜N contains an “approximate” normal vector to H1}(C/k)n(ε~k)n1.{\mathbb{P}}\big{\{}X\in\mathcal{T}_{N}\big{\}}\leq{\mathbb{P}}\big{\{}\mbox{${\mathcal{A}}_{N}$ contains an ``approximate'' normal vector to $H_{1}$}\big{\}}\leq(C^{\prime}/k)^{n}\,(\tilde{\varepsilon}\,k)^{n}\ll 1.

Below, we make this argument rigorous.


Proof of Theorem 2.2.

We start by defining the parameters. We always assume that n is large enough, so that all statements used below work for our n. Fix any R\geq 1, r>0 and s>0, and set b:=\lfloor(2pR)^{-1}\rfloor. Let K_{2}=32\exp(16R). Note that the function {\bf g}(6\,\cdot) is a growth function that satisfies condition (8) with parameter K_{3}^{\prime}=(K_{3})^{8}. In particular, choosing j so that 2^{j-1}\leq 6n\leq 2^{j}, we have

𝐠(6n)𝐠(2j)(K3)2j/j(K3)12n/log2(6n)K3n.{\bf g}(6n)\leq{\bf g}(2^{j})\leq(K_{3}^{\prime})^{2^{j}/j}\leq(K_{3}^{\prime})^{12n/\log_{2}(6n)}\leq K_{3}^{n}.

For brevity, we denote

C3.7:=C3.7(s,2R),C3.5:=C3.5(2R),c4.2:=c4.2(K2),c4.2:=c4.2(K2),C4.10=C4.10(K3).C_{\text{\tiny\ref{cor: norm of centered}}}:=C_{\text{\tiny\ref{cor: norm of centered}}}(s,2R),\,\,C_{\text{\tiny\ref{l: column supports}}}:=C_{\text{\tiny\ref{l: column supports}}}(2R),\,\,c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}:=c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}(K_{2}),\,\,c_{\text{\tiny\ref{l: stability of bal}}}:=c_{\text{\tiny\ref{l: stability of bal}}}(K_{2}),\,\,C_{\text{\tiny\ref{l: Lambda_n card}}}=C_{\text{\tiny\ref{l: Lambda_n card}}}(K_{3}^{\prime}).

Set

K1:=max(K4.9(δ,ρ/4)/c4.2,C4.5(r,δ,ρ)),K_{1}:=\max\big{(}K_{\text{\tiny\ref{prop: 09582593852}}}(\delta,\rho/4)/c_{\text{\tiny\ref{l: stability of bal}}},C_{\text{\tiny\ref{p: low bound on bal}}}(r,\delta,\rho)\big{)},

and

ε:=min(K21,c4.2(384eK3exp(C4.7)C4.10C3.8C2.1C3.7)1exp(3R))\varepsilon:=\min\Big{(}K_{2}^{-1},\,c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}\big{(}384eK_{3}\,\exp({C_{\text{\tiny\ref{l: permut}}}})\,C_{\text{\tiny\ref{l: Lambda_n card}}}C_{\text{\tiny\ref{l: tensor}}}C_{\text{\tiny\ref{p: cf est}}}C_{\text{\tiny\ref{cor: norm of centered}}}\big{)}^{-1}\exp(-3R)\Big{)}

We will assume that pnpn is sufficiently large so that

5exp(2Rpn)exp(Rpn) and exp(3Rpn)12Rpnexp(2Rpn).5\exp(-2Rpn)\leq\exp(-Rpn)\quad\mbox{ and }\quad\exp(-3Rpn)\leq\frac{1}{2Rpn}\exp(-2Rpn).

Moreover, we will assume that

2RC3.5p12RC_{\text{\tiny\ref{l: column supports}}}p\leq 1\quad and C3.5pn\quad C_{\text{\tiny\ref{l: column supports}}}\leq pn (18)

and

18pmax(C4.9(ε,δ,ρ/4,K3),C4.5(r,δ,ρ));pn16C4.9(ε,δ,ρ/4,K3)2;\displaystyle\frac{1}{8p}\geq\max(C_{\text{\tiny\ref{prop: 09582593852}}}(\varepsilon,\delta,\rho/4,K_{3}^{\prime}),C_{\text{\tiny\ref{p: low bound on bal}}}(r,\delta,\rho));\quad pn\geq 16C_{\text{\tiny\ref{prop: 09582593852}}}(\varepsilon,\delta,\rho/4,K_{3}^{\prime})^{2};
e2Rp21/C4.9(ε,δ,ρ/4,K3);c4.2/3exp(Rpn);exp(Rpn)/c4.2n2n.\displaystyle e^{2Rp}\leq 2^{1/C_{\text{\tiny\ref{prop: 09582593852}}}(\varepsilon,\delta,\rho/4,K_{3}^{\prime})};\quad c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}/3\geq\exp(-Rpn);\quad\lfloor\exp(Rpn)/c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}\rfloor\,n\leq 2^{n}.

Define two auxiliary random objects as follows. Set

Z:={xn:xrn=1,𝐔𝐃n(x,m,K1,K2)exp(Rpn) for all pn/8m8pn},Z:=\mbox{$\{x\in{\mathbb{R}}^{n}:\;x^{*}_{\lfloor rn\rfloor}=1,\,\,\,{\bf UD}_{n}(x,m,K_{1},K_{2})\geq\exp(Rpn)\,\,$ for all $\,\,pn/8\leq m\leq 8pn\}$,}

and let XX be a random vector measurable with respect to H1H_{1} and such that

  • X(𝒱n(r,𝐠,δ,ρ)H1)Zwhenever (𝒱n(r,𝐠,δ,ρ)H1)ZX\in\big{(}{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\cap H_{1}^{\perp}\big{)}\setminus Z\quad\mbox{whenever }\;\big{(}{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\cap H_{1}^{\perp}\big{)}\setminus Z\neq\emptyset;

  • X(𝒱n(r,𝐠,δ,ρ)H1)Zwhenever (𝒱n(r,𝐠,δ,ρ)H1)Z=X\in\big{(}{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\cap H_{1}^{\perp}\big{)}\cap Z\quad\mbox{whenever }\;\big{(}{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\cap H_{1}^{\perp}\big{)}\setminus Z=\emptyset and 𝒱n(r,𝐠,δ,ρ)H1{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\cap H_{1}^{\perp}\neq\emptyset;

  • X=𝟎whenever 𝒱n(r,𝐠,δ,ρ)H1=X={\bf 0}\quad\mbox{whenever }\quad{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\cap H_{1}^{\perp}=\emptyset.

(Note that H1H_{1}^{\perp} may have dimension larger than one with non-zero probability, and thus ±X\pm X is not uniquely defined). Note that to prove the theorem, it is sufficient to show that with probability at least 1exp(Rpn)1-\exp(-Rpn) one has either X=𝟎X={\bf 0} or XZX\in Z.

Next, we denote

ξ:={min8pnmpn/8𝐔𝐃n(X,m,K1,K2),whenever X𝟎;+,otherwise.\xi:=\begin{cases}\min\limits_{8pn\geq m\geq pn/8}{\bf UD}_{n}(X,m,K_{1},K_{2}),&\mbox{whenever }\;X\neq{\bf 0};\\ +\infty,&\mbox{otherwise.}\end{cases}

Then, proving the theorem amounts to showing that ξ<exp(Rpn)\xi<\exp(Rpn) with probability at most exp(Rpn)\exp(-Rpn).

We say that a collection of indices I[n]I\subset[n] is admissible if 1I1\notin I and |I|nb1|I|\geq n-b-1. For admissible sets II consider the disjoint collection of events {I}I\{{\mathcal{E}}_{I}\}_{I} defined by

I:={iI:|supp𝐂i(Mn)|[pn/8,8pn] and iI:|supp𝐂i(Mn)|[pn/8,8pn]}.\displaystyle{\mathcal{E}}_{I}:=\big{\{}\forall i\in I:\,\,|{\rm supp\,}{\bf C}_{i}(M_{n})|\in[pn/8,8pn]\quad\mbox{ and }\quad\forall i\notin I:\,\,|{\rm supp\,}{\bf C}_{i}(M_{n})|\notin[pn/8,8pn]\big{\}}.

Further, denote

~:={Mn𝔼MnC3.7pn}.\widetilde{\mathcal{E}}:=\big{\{}\|M_{n}-{\mathbb{E}}M_{n}\|\leq C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{pn}\big{\}}.

According to Corollary 3.7, (~)1exp(2Rpn){\mathbb{P}}(\widetilde{\mathcal{E}})\geq 1-\exp(-2Rpn), while by Lemma 3.5 and (18),

(II)1exp(n/C3.5)1exp(2Rpn).{\mathbb{P}}\Big{(}\bigcup_{I}{\mathcal{E}}_{I}\Big{)}\geq 1-\exp(-n/C_{\text{\tiny\ref{l: column supports}}})\geq 1-\exp(-2Rpn).

Denote by \mathcal{I} the collection of all admissible II satisfying 2(I~)(I)2{\mathbb{P}}({\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}})\geq{\mathbb{P}}({\mathcal{E}}_{I}). Then for every admissible II\notin\mathcal{I} we have (I)2(I~c){\mathbb{P}}({\mathcal{E}}_{I})\leq 2{\mathbb{P}}({\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}^{c}), and, using that the events I{\mathcal{E}}_{I} are disjoint,

(II)1exp(2Rpn)2(~c)13exp(2Rpn).{\mathbb{P}}\Big{(}\bigcup_{I\in\mathcal{I}}{\mathcal{E}}_{I}\Big{)}\geq 1-\exp(-2Rpn)-2{\mathbb{P}}(\widetilde{\mathcal{E}}^{c})\geq 1-3\exp(-2Rpn).

Hence,

{ξ<exp(Rpn)}\displaystyle{\mathbb{P}}\big{\{}\xi<\exp(Rpn)\big{\}} I({ξ<exp(Rpn)}I~)+(IIc)+(~c)\displaystyle\leq\sum\limits_{I\in\mathcal{I}}{\mathbb{P}}\big{(}\big{\{}\xi<\exp(Rpn)\big{\}}\cap{\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}\big{)}+{\mathbb{P}}\Big{(}\bigcap_{I\in\mathcal{I}}{\mathcal{E}}_{I}^{c}\Big{)}+{\mathbb{P}}(\widetilde{\mathcal{E}}^{c})
I({ξ<exp(Rpn)}|I~)(I~)+4exp(2Rpn).\displaystyle\leq\sum\limits_{I\in\mathcal{I}}{\mathbb{P}}\big{(}\big{\{}\xi<\exp(Rpn)\big{\}}\;|\;{\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}\big{)}{\mathbb{P}}({\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}})+4\exp(-2Rpn).

Therefore, to prove the theorem it is sufficient to show that for any II\in\mathcal{I},

({ξ<exp(Rpn)}|I~)exp(2Rpn).{\mathbb{P}}\big{(}\big{\{}\xi<\exp(Rpn)\big{\}}\;|\;{\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}\big{)}\leq\exp(-2Rpn).

Fix an admissible II\in\mathcal{I}, denote by BIB_{I} the |I|×n|I|\times n matrix obtained by transposing columns 𝐂i(Mn){\bf C}_{i}(M_{n}), iIi\in I, and let B~I\widetilde{B}_{I} be the non-random |I|×n|I|\times n matrix with all elements equal to pp. Note that, in view of our definition of K1K_{1}, the assumptions on pp and Proposition 4.5, we have a deterministic relation

ξpn/8\xi\geq\sqrt{pn/8}

everywhere on the probability space. For each real number NJp:=[pn/8,exp(Rpn)/2]N\in J_{p}:=[\sqrt{pn/8},\exp(Rpn)/2], denote by N,I{\mathcal{E}}_{N,I} the event

N,I:={ξ[N,2N)}I~.{\mathcal{E}}_{N,I}:=\big{\{}\xi\in[N,2N)\big{\}}\cap{\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}.

Splitting the interval JpJ_{p} into at most 2Rpn dyadic subintervals of the form [N,2N) and applying the union bound over these subintervals, we observe that it is sufficient to show that for every NJpN\in J_{p} we have

(N,I|I~)exp(3Rpn)12Rpnexp(2Rpn).{\mathbb{P}}\big{(}{\mathcal{E}}_{N,I}\;|\;{\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}\big{)}\leq\exp(-3Rpn)\leq\frac{1}{2Rpn}\exp(-2Rpn).

The rest of the argument is devoted to estimating the probability of N,I{\mathcal{E}}_{N,I} for fixed NJpN\in J_{p} and fixed II\in\mathcal{I}. Set k:=2N/c4.2k:=\lceil 2N/c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}\rceil. Let 𝐦:N,I[pn/8,8pn]{\bf m}:{\mathcal{E}}_{N,I}\to[pn/8,8pn] be a (random) integer such that

𝐔𝐃n(X,𝐦,K1,K2)[N,2N) everywhere on N,I.{\bf UD}_{n}(X,{\bf m},K_{1},K_{2})\in[N,2N)\;\,\,\,\mbox{ everywhere on }\,\,\,\;{\mathcal{E}}_{N,I}.

Since on N,I{\mathcal{E}}_{N,I} we have 𝐔𝐃n(X,𝐦,K1,K2)2Nc4.2k{\bf UD}_{n}(X,{\bf m},K_{1},K_{2})\leq 2N\leq c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}k, applying Proposition 4.2, we can construct a random vector 𝐘:N,I1kn{\bf Y}:{\mathcal{E}}_{N,I}\to\frac{1}{k}{\mathbb{Z}}^{n} having the following properties:

  • 𝐘X1/k\|{\bf Y}-X\|_{\infty}\leq 1/k everywhere on N,I{\mathcal{E}}_{N,I},

  • 𝐔𝐃n(𝐘,𝐦,c4.2K1,K2)2N{\bf UD}_{n}({\bf Y},{\bf m},c_{\text{\tiny\ref{l: stability of bal}}}K_{1},K_{2})\leq 2N everywhere on N,I{\mathcal{E}}_{N,I},

  • 𝐔𝐃n(𝐘,m,c4.21K1,K2)N{\bf UD}_{n}({\bf Y},m,c_{\text{\tiny\ref{l: stability of bal}}}^{-1}K_{1},K_{2})\geq N for all m[pn/8,8pn]m\in[pn/8,8pn] and everywhere on N,I{\mathcal{E}}_{N,I}.

The first condition together with the inclusion N,I~{\mathcal{E}}_{N,I}\subset\widetilde{\mathcal{E}} implies that

(BIB~I)(𝐘X)C3.7pn/k.\|(B_{I}-\widetilde{B}_{I})({\bf Y}-X)\|\leq C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{p}n/k.

Using that BIX=0B_{I}X=0 and that B~I(𝐘X)=p(i=1n(𝐘iXi)) 1I\widetilde{B}_{I}({\bf Y}-X)=p(\sum_{i=1}^{n}({\bf Y}_{i}-X_{i}))\,{\bf 1}_{I}, we observe that there is a random number 𝐳:N,I[pn/k,pn/k]pnk{\bf z}:{\mathcal{E}}_{N,I}\to[-pn/k,pn/k]\cap\frac{\sqrt{pn}}{k}{\mathbb{Z}} such that everywhere on N,I{\mathcal{E}}_{N,I} one has

BI𝐘𝐳 1I2C3.7pn/k.\|B_{I}{\bf Y}-{\bf z}\,{\bf 1}_{I}\|\leq 2C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{p}n/k.

Let Λ\Lambda be the subset of

q=4𝐠(6n)/ρ4𝐠(6n)/ρσ¯Π¯n|Q1|,|Q2|=δnΛn(k,𝐠(6),Q1,Q2,ρ/4,σ¯,ρq/4),\bigcup\limits_{q=\lfloor-4{\bf g}(6n)/\rho\rfloor}^{\lceil 4{\bf g}(6n)/\rho\rceil}\;\bigcup\limits_{\bar{\sigma}\in\bar{\Pi}_{n}}\bigcup\limits_{|Q_{1}|,|Q_{2}|=\lceil\delta n\rceil}\Lambda_{n}(k,{\bf g}(6\,\cdot),Q_{1},Q_{2},\rho/4,\bar{\sigma},\rho q/4),

consisting of all vectors yy such that

  • 𝐔𝐃n(y,m,c4.2K1,K2)2N{\bf UD}_{n}(y,m,c_{\text{\tiny\ref{l: stability of bal}}}K_{1},K_{2})\leq 2N for some m[pn/8,8pn]m\in[pn/8,8pn];

  • 𝐔𝐃n(y,m,c4.21K1,K2)N{\bf UD}_{n}(y,m,c_{\text{\tiny\ref{l: stability of bal}}}^{-1}K_{1},K_{2})\geq N for all m[pn/8,8pn]m\in[pn/8,8pn].

Note that by Lemma 4.8 the entire range of 𝐘{\bf Y} on N,I{\mathcal{E}}_{N,I} falls into Λ\Lambda.

Combining the above observations,

N,I{BIyz𝟏I2C3.7pn/k for some yΛ, z[pn/k,pn/k]pnk},{\mathcal{E}}_{N,I}\subset\big{\{}\|B_{I}y-z{\bf 1}_{I}\|\leq 2C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{p}n/k\;\mbox{ for some }y\in\Lambda,\mbox{ $z\in[-pn/k,pn/k]\cap\frac{\sqrt{pn}}{k}{\mathbb{Z}}$}\big{\}},

whence, using that 2(I~)(I)2{\mathbb{P}}({\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}})\geq{\mathbb{P}}({\mathcal{E}}_{I}) by the definition of \mathcal{I},

(N,I|I~)\displaystyle{\mathbb{P}}({\mathcal{E}}_{N,I}\,|\,{\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}) 2{BIyz𝟏I2C3.7pn/k for some yΛ, z[pn/k,pn/k]pnk|I}\displaystyle\leq 2{\mathbb{P}}\big{\{}\|B_{I}y-z{\bf 1}_{I}\|\leq 2C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{p}n/k\;\mbox{ for some }y\in\Lambda,\mbox{ $z\in[-pn/k,pn/k]\cap\frac{\sqrt{pn}}{k}{\mathbb{Z}}$}\;|\;{\mathcal{E}}_{I}\big{\}}
6|Λ|pnmaxzpnkmaxyΛ{BIyz𝟏I2C3.7pn/k|I}.\displaystyle\leq 6|\Lambda|\sqrt{pn}\,\max\limits_{z\in\frac{\sqrt{pn}}{k}{\mathbb{Z}}}\,\,\max\limits_{y\in\Lambda}{\mathbb{P}}\big{\{}\|B_{I}y-z{\bf 1}_{I}\|\leq 2C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{p}n/k\;|\;{\mathcal{E}}_{I}\big{\}}.

To estimate the last probability, we apply Corollary 4.1 with t:=C3.78pn/Nt:=C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{8pn}/{N} (note that k2Nk\geq 2N, 2|I|n2|I|\geq n, and that tt satisfies the assumption of the corollary). We obtain that for all admissible yy and zz,

{BIyz𝟏I2C3.7pn/k|I}\displaystyle{\mathbb{P}}\big{\{}\|B_{I}y-z{\bf 1}_{I}\|\leq 2C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{p}n/k\;|\;{\mathcal{E}}_{I}\big{\}} {BIyz𝟏IC3.78pnN|I||I}\displaystyle\leq{\mathbb{P}}\bigg{\{}\|B_{I}y-z{\bf 1}_{I}\|\leq\frac{C_{\text{\tiny\ref{cor: norm of centered}}}\sqrt{8pn}}{N}\,\sqrt{|I|}\;|\;{\mathcal{E}}_{I}\bigg{\}}
(16C3.8C2.1C3.7/N)|I|.\displaystyle\leq(16C_{\text{\tiny\ref{l: tensor}}}C_{\text{\tiny\ref{p: cf est}}}C_{\text{\tiny\ref{cor: norm of centered}}}/N)^{|I|}.

On the other hand, the cardinality of Λ\Lambda can be estimated by combining Lemma 4.10, Lemma 4.7 and Proposition 4.9 (note that our choice of parameters guarantees applicability of these statements):

|Λ|8pnεn(9𝐠(6n)/ρ)exp(C4.7n) 22n(C4.10k)n(72pn/ρ)εnK3nexp(C4.7n) 22n(C4.10k)n,|\Lambda|\leq 8pn\varepsilon^{n}\,(9{\bf g}(6n)/\rho)\exp({C_{\text{\tiny\ref{l: permut}}}n})\,2^{2n}(C_{\text{\tiny\ref{l: Lambda_n card}}}k)^{n}\leq(72pn/\rho)\varepsilon^{n}\,K_{3}^{n}\,\exp({C_{\text{\tiny\ref{l: permut}}}n})\,2^{2n}(C_{\text{\tiny\ref{l: Lambda_n card}}}k)^{n},

where C4.10=C4.10(K3)C_{\text{\tiny\ref{l: Lambda_n card}}}=C_{\text{\tiny\ref{l: Lambda_n card}}}(K_{3}^{\prime}). Thus, using our choice of parameters and assuming in addition that 2n72pn/ρ2^{n}\geq 72pn/\rho

(N,I|I~)\displaystyle{\mathbb{P}}({\mathcal{E}}_{N,I}\,|\,{\mathcal{E}}_{I}\cap\widetilde{\mathcal{E}}) εn(8K3exp(C4.7)C4.10k)n(16C3.8C2.1C3.7/N)|I|\displaystyle\leq\varepsilon^{n}\,(8K_{3}\,\exp({C_{\text{\tiny\ref{l: permut}}}})\,C_{\text{\tiny\ref{l: Lambda_n card}}}k)^{n}\,(16C_{\text{\tiny\ref{l: tensor}}}C_{\text{\tiny\ref{p: cf est}}}C_{\text{\tiny\ref{cor: norm of centered}}}/N)^{|I|}
εn(8K3exp(C4.7)C4.10k)n(48C3.8C2.1C3.7/(c4.2k))nN1+(2pR)1\displaystyle\leq\varepsilon^{n}\,(8K_{3}\,\exp({C_{\text{\tiny\ref{l: permut}}}})\,C_{\text{\tiny\ref{l: Lambda_n card}}}k)^{n}\,(48C_{\text{\tiny\ref{l: tensor}}}C_{\text{\tiny\ref{p: cf est}}}C_{\text{\tiny\ref{cor: norm of centered}}}/(c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}k))^{n}N^{1+\lfloor(2pR)^{-1}\rfloor}
εn(384K3exp(C4.7)C4.10C3.8C2.1C3.7/(c4.2))nen\displaystyle\leq\varepsilon^{n}\,(384K_{3}\,\exp({C_{\text{\tiny\ref{l: permut}}}})\,C_{\text{\tiny\ref{l: Lambda_n card}}}C_{\text{\tiny\ref{l: tensor}}}C_{\text{\tiny\ref{p: cf est}}}C_{\text{\tiny\ref{cor: norm of centered}}}/(c_{\text{\tiny\ref{l: stability of bal}}}^{\prime}))^{n}\,e^{n}
exp(3Rn),\displaystyle\leq\exp(-3Rn),

by our choice of parameters. The result follows. ∎

4.3 Anti-concentration on a lattice

The goal of this subsection is to prove Proposition 4.9. Thus, in this subsection, we fix ρ,δ(0,1/4]\rho,\delta\in(0,1/4], a growth function 𝐠{\bf g} satisfying (8), which in particular means that 𝐠(n)K32n/log2n{\bf g}(n)\leq K_{3}^{2n/\log_{2}n}, a permutation σΠn\sigma\in\Pi_{n}, a number hh\in{\mathbb{R}}, two sets Q1,Q2[n]Q_{1},Q_{2}\subset[n] such that |Q1|,|Q2|=δn|Q_{1}|,|Q_{2}|=\lceil\delta n\rceil, and we do not repeat these assumptions in lemmas below. We also always use short notation Λn\Lambda_{n} for the set Λn(k,𝐠,Q1,Q2,ρ,σ,h)\Lambda_{n}(k,{\bf g},Q_{1},Q_{2},\rho,\sigma,h) defined in (17).

We start with auxiliary probabilistic statements which are just special forms of Markov’s inequality.

Lemma 4.11 (Integral form of Markov’s inequality, I).

For each s[a,b]s\in[a,b], let ξ(s)\xi(s) be a non-negative random variable with ξ(s)1\xi(s)\leq 1 a.e. Assume that the random function ξ(s)\xi(s) is integrable on [a,b][a,b] with probability one. Assume further that for some integrable function ϕ(s):[a,b]+\phi(s):\,[a,b]\to{\mathbb{R}}_{+} and some ε>0\varepsilon>0 we have

{ξ(s)ϕ(s)}1ε{\mathbb{P}}\big{\{}\xi(s)\leq\phi(s)\big{\}}\geq 1-\varepsilon

for all s[a,b]s\in[a,b]. Then for all t>0t>0,

{abξ(s)𝑑sabϕ(s)𝑑s+t(ba)}ε/t.{\mathbb{P}}\bigg{\{}\int_{a}^{b}\xi(s)\,ds\geq\int_{a}^{b}\phi(s)\,ds+t(b-a)\bigg{\}}\leq\varepsilon/t.
Proof.

Consider a random set

I:={s[a,b]:ξ(s)>ϕ(s)}.I:=\big{\{}s\in[a,b]:\;\xi(s)>\phi(s)\big{\}}.

Since {sI}ε{\mathbb{P}}\{s\in I\}\leq\varepsilon for any s[a,b]s\in[a,b], we have 𝔼|I|ε(ba).{\mathbb{E}}|I|\leq\varepsilon(b-a). Therefore, by the Markov inequality, {|I|t(ba)}ε/t{\mathbb{P}}\big{\{}|I|\geq t(b-a)\big{\}}\leq\varepsilon/t for all t>0t>0. The result follows by noting that

abξ(s)𝑑s|I|+abϕ(s)𝑑s.\int_{a}^{b}\xi(s)\,ds\leq|I|+\int_{a}^{b}\phi(s)\,ds. ∎
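The following minimal numerical sketch (not used anywhere in the argument) illustrates Lemma 4.11 in two toy scenarios with \phi\equiv 0; all numerical values below are chosen only for illustration.

import numpy as np

rng = np.random.default_rng(0)
a, b, eps, t = 0.0, 1.0, 0.05, 0.2
trials, grid = 100000, 200

# Scenario 1 (fully correlated): xi(s) = Z for every s, with Z ~ Bernoulli(eps),
# so that P{xi(s) <= phi(s)} = 1 - eps at each fixed s.
Z = (rng.random(trials) < eps).astype(float)
int1 = Z * (b - a)                             # \int_a^b xi(s) ds = Z * (b - a)

# Scenario 2 (independent): xi(s) ~ Bernoulli(eps) independently at each grid point.
xi = rng.random((trials, grid)) < eps
int2 = xi.mean(axis=1) * (b - a)               # Riemann sum approximating \int_a^b xi(s) ds

for integrals in (int1, int2):
    lhs = np.mean(integrals >= t * (b - a))    # empirical P{ \int xi >= \int phi + t(b-a) }
    print(f"{lhs:.4f} <= {eps / t:.4f}")       # Lemma 4.11 predicts lhs <= eps / t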

Lemma 4.12 (Integral form of Markov’s inequality, II).

Let II be a finite set, and for each iIi\in I, let ξi\xi_{i} be a non-negative random variable with ξi1\xi_{i}\leq 1 a.e. Assume further that for some ϕ(i):I+\phi(i):I\to{\mathbb{R}}_{+} and some ε>0\varepsilon>0 we have

{ξiϕ(i)}1ε{\mathbb{P}}\big{\{}\xi_{i}\leq\phi(i)\big{\}}\geq 1-\varepsilon

for all iIi\in I. Then for all t>0t>0,

{1|I|iIξi1|I|iIϕ(i)+t}ε/t.{\mathbb{P}}\bigg{\{}\frac{1}{|I|}\sum_{i\in I}\xi_{i}\geq\frac{1}{|I|}\sum_{i\in I}\phi(i)+t\bigg{\}}\leq\varepsilon/t.

The proof of Lemma 4.12 is almost identical to that of Lemma 4.11, and we omit it.

Our next statement will be important in an approximation (discretization) argument used later in the proof.

Lemma 4.13 (Lipschitzness of the product ψK2()\prod\psi_{K_{2}}(\cdot)).

Let y1,,yny_{1},\dots,y_{n}\in{\mathbb{R}} and set y:=maxwn|yw|y:=\max\limits_{w\leq n}|y_{w}|. Further, let S1,,SmS_{1},\dots,S_{m} be some non-empty subsets of [n][n]. For imi\leq m denote

fi(s):=ψK2(|1|Si|wSiexp(2π𝐢yws)|) and let f(s):=i=1mfi(s).f_{i}(s):=\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{|S_{i}|}\sum_{w\in S_{i}}\exp(2\pi{\bf i}\,y_{w}s)\Big{|}\bigg{)}\quad\mbox{ and let }\quad f(s):=\prod\limits_{i=1}^{m}f_{i}(s).

Then ff (viewed as a function of ss) is (8K2πym)(8K_{2}\pi y\,m)-Lipschitz.

Proof.

By our definition, ψK2\psi_{K_{2}} is 11-Lipschitz for any K21K_{2}\geq 1, hence fif_{i} (viewed as a function of ss) is 2πy2\pi y-Lipschitz. Since |wSiexp(2π𝐢yws)||Si|\big{|}\sum_{w\in S_{i}}\exp(2\pi{\bf i}\,y_{w}s)\big{|}\leq|S_{i}|, by the definition of the function ψK2\psi_{K_{2}}, we have 1/(2K2)fi11/(2K_{2})\leq f_{i}\leq 1, hence, for all s,Δss,\Delta s\in{\mathbb{R}},

fi(s)fi(s+Δs)=1+fi(s)fi(s+Δs)fi(s+Δs)1+4K2πy|Δs|.\frac{f_{i}(s)}{f_{i}(s+\Delta s)}=1+\frac{f_{i}(s)-f_{i}(s+\Delta s)}{f_{i}(s+\Delta s)}\leq 1+4K_{2}\pi y\,|\Delta s|.

Taking the product, we obtain that

f(s)f(s+Δs)(1+4K2πy|Δs|)m1+8K2πym|Δs|\frac{f(s)}{f(s+\Delta s)}\leq\big{(}1+4K_{2}\pi y\,|\Delta s|\big{)}^{m}\leq 1+8K_{2}\pi y\,m\,|\Delta s|

whenever 8K2πym|Δs|1/28K_{2}\pi y\,m\,|\Delta s|\leq 1/2. This, together with the bound f1f\leq 1, implies that for all s,Δss,\Delta s\in{\mathbb{R}},

f(s)f(s+Δs)8K2πym|Δs|,f(s)-f(s+\Delta s)\leq 8K_{2}\pi y\,m\,|\Delta s|,

which completes the proof. ∎
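As a purely illustrative sanity check of Lemma 4.13 (not used in the proofs), one may verify the Lipschitz bound numerically. The explicit formula \psi_{K_{2}}(t)=\max(t,1/(2K_{2})) used below is an assumption made only for this illustration; it is 1-Lipschitz and takes values in [1/(2K_{2}),1] on [0,1], which are the only properties of \psi_{K_{2}} used in the proof above.

import numpy as np

rng = np.random.default_rng(1)
K2, n, m = 8.0, 60, 5
y = rng.uniform(-3.0, 3.0, size=n)                     # y_1, ..., y_n
ybar = np.abs(y).max()                                 # y := max_w |y_w|
S = np.array_split(rng.permutation(n), m)              # non-empty subsets S_1, ..., S_m of [n]

def psi(t):                                            # assumed form of psi_{K_2} (illustration only)
    return np.maximum(t, 1.0 / (2.0 * K2))

def f(s):                                              # f(s) = prod_i psi_{K_2}(|avg_{w in S_i} exp(2 pi i y_w s)|)
    return np.prod([psi(np.abs(np.exp(2j * np.pi * y[Si] * s).mean())) for Si in S])

s_grid = np.linspace(0.0, 2.0, 20001)
vals = np.array([f(s) for s in s_grid])
emp = np.max(np.abs(np.diff(vals)) / np.diff(s_grid))  # finite-difference estimate of the Lipschitz constant
print(emp, "<=", 8 * K2 * np.pi * ybar * m)            # the bound of Lemma 4.13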

In the next two lemmas we initiate the study of the random variables exp(2π𝐢η[Iw]sj/k)\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k); more specifically, we will be interested in the property that, under appropriate assumptions on the sjs_{j}’s, the sum of such variables is close to zero on average.

Lemma 4.14.

Let ε(0,1]\varepsilon\in(0,1], k1k\geq 1, 2/ε\ell\geq 2/\varepsilon. Let II be an integer interval and let s1,,ss_{1},\dots,s_{\ell} be real numbers such that for all juj\neq u,

kε|I||sjsu|k2.\frac{k}{\varepsilon|I|}\leq|s_{j}-s_{u}|\leq\frac{k}{2}.

Then

𝔼|j=1exp(2π𝐢η[I]sj/k)|2ε2.\displaystyle{\mathbb{E}}\,\Big{|}\sum_{j=1}^{\ell}\exp(2\pi{\bf i}\,\eta[I]\,s_{j}/k)\Big{|}^{2}\leq\varepsilon\ell^{2}.
Proof.

We have

𝔼|j=1exp(2π𝐢η[I]sj/k)|2=j=1u=1𝔼exp(2π𝐢η[I](sjsu)/k)+|ju𝔼exp(2π𝐢η[I](sjsu)/k)|.\begin{split}{\mathbb{E}}\,\Big{|}\sum_{j=1}^{\ell}\exp(2\pi{\bf i}\,\eta[I]\,s_{j}/k)\Big{|}^{2}&=\sum_{j=1}^{\ell}\sum_{u=1}^{\ell}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,\eta[I]\,(s_{j}-s_{u})/k\big{)}\\ &\leq\ell+\Big{|}\sum\limits_{j\neq u}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,\eta[I]\,(s_{j}-s_{u})/k\big{)}\Big{|}.\end{split} (19)

Further, denoting a=minIa=\min I and b=maxIb=\max I, we observe that for any juj\neq u,

𝔼\displaystyle{\mathbb{E}} exp(2π𝐢η[I](sjsu)/k)\displaystyle\exp\big{(}2\pi{\bf i}\,\eta[I]\,(s_{j}-s_{u})/k\big{)}
=1|I|v=abexp(2π𝐢v(sjsu)/k)\displaystyle=\frac{1}{|I|}\sum_{v=a}^{b}\exp\big{(}2\pi{\bf i}\,v\,(s_{j}-s_{u})/k\big{)}
=1|I|exp(2π𝐢a(sjsu)/k)1exp(2π𝐢(ba+1)(sjsu)/k)1exp(2π𝐢(sjsu)/k).\displaystyle=\frac{1}{|I|}\exp\big{(}2\pi{\bf i}\,a\,(s_{j}-s_{u})/k\big{)}\cdot\frac{1-\exp\big{(}2\pi{\bf i}\,(b-a+1)\,(s_{j}-s_{u})/k\big{)}}{1-\exp\big{(}2\pi{\bf i}\,(s_{j}-s_{u})/k\big{)}}.

In view of the assumptions on |sjsu||s_{j}-s_{u}|,

|1exp(2π𝐢(sjsu)/k)|=|2sin(π(sjsu)/k)|4|sjsu|k4ε|I|.\big{|}1-\exp\big{(}2\pi{\bf i}\,(s_{j}-s_{u})/k\big{)}\big{|}=\big{|}2\sin(\pi\,(s_{j}-s_{u})/k)\big{|}\geq\frac{4|s_{j}-s_{u}|}{k}\geq\frac{4}{\varepsilon|I|}.

Therefore,

|𝔼exp(2π𝐢η[I](sjsu)/k)|ε2.\big{|}{\mathbb{E}}\exp\big{(}2\pi{\bf i}\,\eta[I]\,(s_{j}-s_{u})/k\big{)}\big{|}\leq\frac{\varepsilon}{2}.

Using (19), we complete the proof. ∎
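The heart of the above proof is the single-gap estimate |{\mathbb{E}}\exp(2\pi{\bf i}\,\eta[I](s_{j}-s_{u})/k)|\leq\varepsilon/2 under the separation assumption. A minimal numerical check of this estimate, for one illustrative choice of k, \varepsilon and I (all values below are arbitrary), reads as follows.

import numpy as np

k, eps = 100.0, 0.1
I = np.arange(17, 17 + 500)                           # integer interval I with |I| = 500
gaps = np.linspace(k / (eps * len(I)), k / 2, 5000)   # admissible values of |s_j - s_u|

def char_fn(d):                                       # |E exp(2 pi i eta[I] d / k)| for eta[I] uniform on I
    return np.abs(np.exp(2j * np.pi * I * d / k).mean())

worst = max(char_fn(d) for d in gaps)
print(worst, "<=", eps / 2)                           # the estimate used in the proof of Lemma 4.14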

Lemma 4.15.

For every ε(0,1/2]\varepsilon\in(0,1/2] there are R4.15=R4.15(ε)>0R_{\text{\tiny\ref{l: aux 9871039481}}}=R_{\text{\tiny\ref{l: aux 9871039481}}}(\varepsilon)>0 and :=4.15(ε)\ell:=\ell_{\text{\tiny\ref{l: aux 9871039481}}}(\varepsilon)\in{\mathbb{N}}, 1000\ell\geq 1000, with the following property. Let k1k\geq 1, uu\geq\ell, let IwI_{w} (w=1,2,,uw=1,2,\dots,u) be integer intervals, and let s1,,ss_{1},\dots,s_{\ell} be real numbers such that |Iw||sjsq|R4.15k|I_{w}|\,|s_{j}-s_{q}|\geq R_{\text{\tiny\ref{l: aux 9871039481}}}k, and |sjsq|k/2|s_{j}-s_{q}|\leq k/2 for all jqj\neq q and wuw\leq u. Then, assuming that random variables η[Iw]\eta[I_{w}], wuw\leq u, are mutually independent, one has

{|1uw=1uexp(2π𝐢η[Iw]sj/k)|ε for at least ε indices j}εu.{\mathbb{P}}\Big{\{}\Big{|}\frac{1}{u}\sum_{w=1}^{u}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}\geq\varepsilon\;\mbox{ for at least $\varepsilon\ell$ indices }j\Big{\}}\leq\varepsilon^{u}.
Proof.

Fix any ε(0,1/2]\varepsilon\in(0,1/2], and set ε1:=210e6ε4+9/ε\varepsilon_{1}:=2^{-10}e^{-6}\varepsilon^{4+9/\varepsilon}. Set R:=1/ε1R:=1/\varepsilon_{1} and :=2/ε1\ell:=\lceil 2/\varepsilon_{1}\rceil. Assume that uu\geq\ell, and let numbers sjs_{j} and integer intervals IwI_{w} satisfy the assumptions of the lemma. Denote the event

{|1uw=1uexp(2π𝐢η[Iw]sj/k)|ε for at least ε indices j}\Big{\{}\Big{|}\frac{1}{u}\sum_{w=1}^{u}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}\geq\varepsilon\;\mbox{ for at least $\varepsilon\ell$ indices }j\Big{\}}

by {\mathcal{E}}, and additionally, for any subset Q[]Q\subset[\ell] of cardinality ε/4\lfloor\varepsilon\ell/4\rfloor and any vector z{1,1}2z\in\{-1,1\}^{2}, set

Q,z:={(1uw=1ucos(2πη[Iw]sj/k),1uw=1usin(2πη[Iw]sj/k)),zε for all jQ}.{\mathcal{E}}_{Q,z}:=\Big{\{}\Big{\langle}\Big{(}\frac{1}{u}\sum_{w=1}^{u}\cos(2\pi\,\eta[I_{w}]\,s_{j}/k),\frac{1}{u}\sum_{w=1}^{u}\sin(2\pi\,\eta[I_{w}]\,s_{j}/k)\Big{)},z\Big{\rangle}\geq\varepsilon\;\mbox{ for all }j\in Q\Big{\}}.

It is not difficult to see that

Q,zQ,z,{\mathcal{E}}\subset\bigcup\limits_{Q,z}{\mathcal{E}}_{Q,z},

whence it is sufficient to show that for any admissible Q,zQ,z,

(Q,z)14(ε/4)1εu.{\mathbb{P}}({\mathcal{E}}_{Q,z})\leq\frac{1}{4}{\ell\choose\lfloor\varepsilon\ell/4\rfloor}^{-1}\varepsilon^{u}. (20)

Without loss of generality, we can consider Q=Q0:=[ε/4]Q=Q_{0}:=\big{[}\lfloor\varepsilon\ell/4\rfloor\big{]}. The event Q0,z{\mathcal{E}}_{Q_{0},z} is contained inside the event

{|jQ0w=1uexp(2π𝐢η[Iw]sj/k)|21/2εuε/4},\Big{\{}\Big{|}\sum_{j\in Q_{0}}\sum_{w=1}^{u}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}\geq 2^{-1/2}\varepsilon u\,\lfloor\varepsilon\ell/4\rfloor\Big{\}},

while the latter is contained inside the event

{|jQ0exp(2π𝐢η[Iw]sj/k)|ε4ε/4 for at least εu/4 indices w}.\Big{\{}\Big{|}\sum_{j\in Q_{0}}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}\geq\frac{\varepsilon}{4}\,\lfloor\varepsilon\ell/4\rfloor\mbox{ for at least $\varepsilon u/4$ indices $w$}\Big{\}}.

Thus, taking the union over all admissible choices of εu/4\lceil\varepsilon u/4\rceil indices w[u]w\in[u], we get

(Q0,z)(uεu/4)maxF[u],|F|=εu/4{|jQ0exp(2π𝐢η[Iw]sj/k)|ε4ε/4 for all wF}.{\mathbb{P}}({\mathcal{E}}_{Q_{0},z})\leq{u\choose\lceil\varepsilon u/4\rceil}\max\limits_{F\subset[u],\,|F|=\lceil\varepsilon u/4\rceil}{\mathbb{P}}\Big{\{}\Big{|}\sum_{j\in Q_{0}}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}\geq\frac{\varepsilon}{4}\,\lfloor\varepsilon\ell/4\rfloor\mbox{ for all $w\in F$}\Big{\}}.

To estimate the last probability, we apply Markov’s inequality, together with the bound for the second moment from Lemma 4.14 (applied with ε1\varepsilon_{1}), and using independence of η[Iw]\eta[I_{w}], wuw\leq u. We then get

maxF[u]|F|=εu/4{|jQ0exp(2π𝐢η[Iw]sj/k)|ε4ε/4 for all wF}(ε12(ε2/32)2)εu/4e3εu/2ε2u.\max\limits_{F\subset[u]\atop|F|=\lceil\varepsilon u/4\rceil}{\mathbb{P}}\Big{\{}\Big{|}\sum_{j\in Q_{0}}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}\geq\frac{\varepsilon}{4}\,\lfloor\varepsilon\ell/4\rfloor\mbox{ for all $w\in F$}\Big{\}}\leq\bigg{(}\frac{\varepsilon_{1}\ell^{2}}{(\varepsilon^{2}\ell/32)^{2}}\bigg{)}^{\lceil\varepsilon u/4\rceil}\leq e^{-3\varepsilon u/2}\varepsilon^{2u}.

In view of (20) this implies the result, since using 8u8\leq\ell\leq u and ε<1/2\varepsilon<1/2, we have

4(ε/4)εu(uεu/4)e3εu/2ε2u4e3εu/2(4eε)ε/4(2eε)εu/2εu4(16e3)εu/4εu/41.4{\ell\choose\lfloor\varepsilon\ell/4\rfloor}\varepsilon^{-u}{u\choose\lceil\varepsilon u/4\rceil}e^{-3\varepsilon u/2}\varepsilon^{2u}\leq 4e^{-3\varepsilon u/2}\bigg{(}\frac{4e}{\varepsilon}\bigg{)}^{\varepsilon\ell/4}\bigg{(}\frac{2e}{\varepsilon}\bigg{)}^{\varepsilon u/2}\varepsilon^{u}\leq 4(16e^{-3})^{\varepsilon u/4}\,\varepsilon^{u/4}\leq 1.

Our next step is to show that for the vector X=(X1,,Xn)X=(X_{1},\dots,X_{n}) uniformly distributed on Λn\Lambda_{n} the random product i=1mψK2(|1n/mwSiexp(2π𝐢Xws)|)\prod\limits_{i=1}^{m}\psi_{K_{2}}\big{(}\big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s)\big{|}\big{)} is, in a certain sense, typically small (for most choices of ss). To do this we first show that given a collection of distinct numbers s1,,ss_{1},\dots,s_{\ell} which are pairwise well separated, the above product is small for at least one sjs_{j} with very high probability.

Lemma 4.16.

For any ε(0,1/2]\varepsilon\in(0,1/2] there are R4.16=R4.16(ε)1R_{\text{\tiny\ref{l: aux 0876958237}}}=R_{\text{\tiny\ref{l: aux 0876958237}}}(\varepsilon)\geq 1 and :=4.16(ε)\ell:=\ell_{\text{\tiny\ref{l: aux 0876958237}}}(\varepsilon)\in{\mathbb{N}} with the following property. Let k,m,nk,m,n\in{\mathbb{N}} be with n/mn/m\geq\ell. Let 1K22/ε1\leq K_{2}\leq 2/\varepsilon, X=(X1,,Xn)X=(X_{1},\dots,X_{n}) be a random vector uniformly distributed on Λn\Lambda_{n}, and let s1,,ss_{1},\dots,s_{\ell} be real numbers in [0,k/2][0,k/2] such that |sjsq|R4.16|s_{j}-s_{q}|\geq R_{\text{\tiny\ref{l: aux 0876958237}}} for all jqj\neq q. Fix disjoint subsets S1,,SmS_{1},\dots,S_{m} of [n][n], each of cardinality n/m\lfloor n/m\rfloor. Then

{j:i=1mψK2(|1n/mwSiexp(2π𝐢Xwsj)|)(K2/2)m/2}εn.{\mathbb{P}}\Big{\{}\forall j\leq\ell\,:\,\,\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s_{j})\Big{|}\bigg{)}\geq(K_{2}/2)^{-m/2}\Big{\}}\leq\varepsilon^{n}.
Proof.

Fix any ε(0,1/2]\varepsilon\in(0,1/2] and set :=4.15(ε5)1000\ell:=\ell_{\text{\tiny\ref{l: aux 9871039481}}}(\varepsilon^{5})\geq 1000 and R:=R4.15(ε5)R:=R_{\text{\tiny\ref{l: aux 9871039481}}}(\varepsilon^{5}). Assume that n/mn/m\geq\ell. Note that, by our definition of Λn\Lambda_{n}, the coordinates of XX are independent and, moreover, each variable kXwkX_{w} is distributed on an integer interval of cardinality at least kk. Thus, it is sufficient to prove that for any collection of integer intervals IjI_{j}, jnj\leq n, such that |Ij|k|I_{j}|\geq k, the event

:={j:i=1mψK2(|1n/mwSiexp(2π𝐢η[Iw]sj/k)|)(K2/2)m/2}.{\mathcal{E}}:=\Big{\{}\forall j\leq\ell\,:\,\,\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}\bigg{)}\geq(K_{2}/2)^{-m/2}\Big{\}}.

has probability at most εn\varepsilon^{n}, where, as usual, we assume that the variables η[Iw]\eta[I_{w}], wSiw\in S_{i}, imi\leq m, are jointly independent. Observe that, as ψK2(t)1\psi_{K_{2}}(t)\leq 1 for all t1t\leq 1, the event {\mathcal{E}} is contained inside the event

:={j:aij2/K2 for at least m/2 indices i},{\mathcal{E}}^{\prime}:=\Big{\{}\forall j\leq\ell\,:\,\,a_{ij}\geq 2/K_{2}\mbox{ for at least $m/2$ indices $i$}\Big{\}},

where aij:=|1n/mwSiexp(2π𝐢η[Iw]sj/k)|a_{ij}:=\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|}, imi\leq m, jj\leq\ell. Denoting bij=1b_{ij}=1 if aij2/K2a_{ij}\geq 2/K_{2} and bij=0b_{ij}=0 otherwise and using a simple counting argument for the matrix {bij}ij\{b_{ij}\}_{ij}, we obtain that

′′:={|{i:aij2/K2 for at least /4 indices j}|m/4}.{\mathcal{E}}\subset{\mathcal{E}}^{\prime}\subset{\mathcal{E}}^{\prime\prime}:=\Big{\{}\Big{|}\Big{\{}i\,:\,\,\,a_{ij}\geq 2/K_{2}\,\,\,\mbox{ for at least $\ell/4$ indices $j$}\Big{\}}\Big{|}\geq m/4\Big{\}}.

To estimate (′′){\mathbb{P}}({\mathcal{E}}^{\prime\prime}) we use Lemma 4.15 with ε5\varepsilon^{5}. Note that ε5min(2/K2,1/2)\varepsilon^{5}\leq\min(2/K_{2},1/2), and that by our choice of RR, for any jqj\neq q we have |Iw||sjsq|k|sjsq|R4.15(ε5)k|I_{w}|\,|s_{j}-s_{q}|\geq k\,|s_{j}-s_{q}|\geq R_{\text{\tiny\ref{l: aux 9871039481}}}(\varepsilon^{5})k, while |sjsq|k/2|s_{j}-s_{q}|\leq k/2. Thus,

im:{aij2/K2 for at least /4 indices j}ε5n/m.\forall i\leq m\,:\quad{\mathbb{P}}\Big{\{}a_{ij}\geq 2/K_{2}\,\,\,\mbox{ for at least $\ell/4$ indices $j$}\Big{\}}\leq\varepsilon^{5\lfloor n/m\rfloor}.

Hence,

(′′)(mm/4)ε5n/mm/42mε5n/mm/4εn,{\mathbb{P}}({\mathcal{E}}^{\prime\prime})\leq{m\choose\lceil m/4\rceil}\varepsilon^{5\lfloor n/m\rfloor\,m/4}\leq 2^{m}\varepsilon^{5\lfloor n/m\rfloor\,m/4}\leq\varepsilon^{n},

which completes the proof. ∎

Lemma 4.17 (Very small product everywhere except for a set of measure O(1)O(1)).

For any ε(0,1/2]\varepsilon\in(0,1/2] there are R4.17=R4.17(ε)1R_{\text{\tiny\ref{l: aux 2398205987305}}}=R_{\text{\tiny\ref{l: aux 2398205987305}}}(\varepsilon)\geq 1, =4.17(ε)\ell=\ell_{\text{\tiny\ref{l: aux 2398205987305}}}(\varepsilon)\in{\mathbb{N}} and n4.17=n4.17(ε,K3)n_{\text{\tiny\ref{l: aux 2398205987305}}}=n_{\text{\tiny\ref{l: aux 2398205987305}}}(\varepsilon,K_{3})\in{\mathbb{N}} with the following property. Let k,m,nk,m,n\in{\mathbb{N}}, nn4.17n\geq n_{\text{\tiny\ref{l: aux 2398205987305}}}, k2n/k\leq 2^{n/\ell}, n/mn/m\geq\ell, and 4K22/ε4\leq K_{2}\leq 2/\varepsilon. Let X=(X1,,Xn)X=(X_{1},\dots,X_{n}) be a random vector uniformly distributed on Λn\Lambda_{n}. Fix disjoint subsets S1,,SmS_{1},\dots,S_{m} of [n][n], each of cardinality n/m\lfloor n/m\rfloor. Then

{|{s[0,k/2]:i=1mψK2(|1n/mwSiexp(2π𝐢Xws)|)(K2/4)m/2}|R4.17}1(ε/2)n.{\mathbb{P}}\bigg{\{}\Big{|}\Big{\{}s\in[0,k/2]:\;\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s)\Big{|}\bigg{)}\geq(K_{2}/4)^{-m/2}\Big{\}}\Big{|}\leq R_{\text{\tiny\ref{l: aux 2398205987305}}}\bigg{\}}\geq 1-(\varepsilon/2)^{n}.
Proof.

Fix any ε(0,1/2]\varepsilon\in(0,1/2], and define ε~:=ε3/2/32\widetilde{\varepsilon}:=\varepsilon^{3/2}/32, ~:=4.16(ε~)\widetilde{\ell}:=\ell_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon}), :=2~\ell:=2\widetilde{\ell}, and R:=4R4.16(ε~)4.16(ε~)>1R:=4R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon})\ell_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon})>1.

Assume that the parameters k,m,nk,m,n and S1,,SmS_{1},\dots,S_{m} satisfy the assumptions of the lemma. In particular, we assume that nn is large enough so that (8K2πn)~2n(8K_{2}\pi n)^{\widetilde{\ell}}\leq 2^{n} and 𝐠(n)~2n{\bf g}(n)^{\widetilde{\ell}}\leq 2^{n}. Denote

β:=(8K2πm𝐠(n))1(2K2)m/2 and aij:=|1n/mwSiexp(2π𝐢η[Iw]sj/k)|,im,j~.\beta:=(8K_{2}\pi m{\bf g}(n))^{-1}(2K_{2})^{-m/2}\quad\mbox{ and }\quad a_{ij}:=\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}\,\eta[I_{w}]\,s_{j}/k)\Big{|},\,\,i\leq m,\,j\leq\widetilde{\ell}.

Let T:=[0,k/2]βT:=[0,k/2]\cap\,\beta{\mathbb{Z}}. By Lemma 4.16 for any collection s1,,s~s_{1},\dots,s_{\widetilde{\ell}} of points from TT satisfying |sjsq|R4.16(ε~)|s_{j}-s_{q}|\geq R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon}) for all jqj\neq q, we have

{j~:i=1mψK2(aij)(K2/2)m/2}ε~n.{\mathbb{P}}\bigg{\{}\forall j\leq\widetilde{\ell}\,:\,\,\,\prod\limits_{i=1}^{m}\psi_{K_{2}}(a_{ij})\geq(K_{2}/2)^{-m/2}\bigg{\}}\leq\widetilde{\varepsilon}^{\,n}.

Taking the union bound over all possible choices of s1,,s~s_{1},\dots,s_{\widetilde{\ell}} from TT, we get

{i=1mψK2(aij)(K2/2)m/2 for all j~ and for some s1,,s~Twith |spsq|R4.16(ε~) for all pq}ε~n|T|~.\begin{split}{\mathbb{P}}\bigg{\{}\prod\limits_{i=1}^{m}\psi_{K_{2}}(a_{ij})&\geq(K_{2}/2)^{-m/2}\mbox{ for all $j\leq{\widetilde{\ell}}$ and for some $s_{1},\dots,s_{\widetilde{\ell}}\in T$}\\ &\mbox{with $|s_{p}-s_{q}|\geq R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon})$ for all $p\neq q$}\bigg{\}}\leq\widetilde{\varepsilon}^{\,n}|T|^{\widetilde{\ell}}.\end{split} (21)

Further, in view of Lemma 4.13, for any realization of XwX_{w}’s the product

f(s):=i=1mψK2(|1n/mwSiexp(2π𝐢Xws)|),f(s):=\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s)\Big{|}\bigg{)},

viewed as a function of ss, is (8K2π𝐠(n)m)(8K_{2}\pi{\bf g}(n)m)-Lipschitz. This implies that for any pair (s,s)+2(s,s^{\prime})\in{\mathbb{R}}_{+}^{2}, satisfying |ss|β|s-s^{\prime}|\leq\beta, we have

f(s)(K2/2)m/2wheneverf(s)(K2/4)m/2.f(s)\geq(K_{2}/2)^{-m/2}\quad\quad\mbox{whenever}\quad f(s^{\prime})\geq(K_{2}/4)^{-m/2}.

Moreover, for any collection s1,,s~s_{1}^{\prime},\dots,s_{\widetilde{\ell}}^{\prime} of numbers from [0,k/2][0,k/2] satisfying |spsq|2R4.16(ε~)|s_{p}^{\prime}-s_{q}^{\prime}|\geq 2R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon}) for all pqp\neq q there are numbers s1,,s~Ts_{1},\dots,s_{\widetilde{\ell}}\in T with |sqsq|β|s_{q}-s_{q}^{\prime}|\leq\beta and |spsq|R4.16(ε~)|s_{p}-s_{q}|\geq R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon}) for all pqp\neq q (we used also 2β1R4.16(ε~)2\beta\leq 1\leq R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon})).

{\displaystyle{\mathbb{P}}\bigg{\{} i=1mψK2(|1n/mwSiexp(2π𝐢Xwsj)|)(K2/4)m/2 for all j~ and some s1,,s~[0,k/2]\displaystyle\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s_{j}^{\prime})\Big{|}\bigg{)}\geq(K_{2}/4)^{-m/2}\mbox{ for all $j\leq{\widetilde{\ell}}$ and some $s_{1}^{\prime},\dots,s_{\widetilde{\ell}}^{\prime}\in[0,k/2]$}
with |spsq|2R4.16(ε~) for all pq}\displaystyle\mbox{with $|s_{p}^{\prime}-s_{q}^{\prime}|\geq 2R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon})$ for all $p\neq q$}\bigg{\}}
ε~n|T|~ε~n(k/β)~ε~n 2n(8K2πm𝐠(n))~(2K2)m~/2\displaystyle\hskip 28.45274pt\leq\widetilde{\varepsilon}^{\,n}|T|^{\widetilde{\ell}}\leq\widetilde{\varepsilon}^{\,n}\,(k/\beta)^{\widetilde{\ell}}\leq\widetilde{\varepsilon}^{\,n}\,2^{n}\,(8K_{2}\pi m{\bf g}(n))^{\widetilde{\ell}}(2K_{2})^{m\widetilde{\ell}/2}
ε~n 8n(4/ε)m~/2ε~nεn/2 16n(ε/2)n.\displaystyle\hskip 28.45274pt\leq\widetilde{\varepsilon}^{\,n}\,8^{n}\,(4/\varepsilon)^{m\widetilde{\ell}/2}\,\leq\widetilde{\varepsilon}^{\,n}\,\varepsilon^{-n/2}\,16^{n}\,\leq(\varepsilon/2)^{n}.

The event whose probability is estimated above clearly contains the event in question:

{|{s[0,k/2]:i=1mψK(|1n/mwSiexp(2π𝐢Xws)|)(K2/4)m/2}|4R4.16(ε~)~}.\bigg{\{}\Big{|}\Big{\{}s\in[0,k/2]:\;\prod\limits_{i=1}^{m}\psi_{K_{2}}\Big{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s)\Big{|}\Big{)}\geq(K_{2}/4)^{-m/2}\Big{\}}\Big{|}\geq 4R_{\text{\tiny\ref{l: aux 0876958237}}}(\widetilde{\varepsilon}){\widetilde{\ell}}\bigg{\}}.

This, and our choice of parameters, implies the result. ∎

Lemma 4.18 (Moderately small product for almost all ss).

For any ε(0,1]\varepsilon\in(0,1] and z(0,1)z\in(0,1) there are ε=ε(ε)(0,1/2]\varepsilon^{\prime}=\varepsilon^{\prime}(\varepsilon)\in(0,1/2], n4.18=n4.18(ε,z)10n_{\text{\tiny\ref{l: aux 20985059837}}}=n_{\text{\tiny\ref{l: aux 20985059837}}}(\varepsilon,z)\geq 10, and C4.18=C4.18(ε,z)1C_{\text{\tiny\ref{l: aux 20985059837}}}=C_{\text{\tiny\ref{l: aux 20985059837}}}(\varepsilon,z)\geq 1 with the following property. Let nn4.18n\geq n_{\text{\tiny\ref{l: aux 20985059837}}}, 2nk12^{n}\geq k\geq 1, C4.18mn/4C_{\text{\tiny\ref{l: aux 20985059837}}}\leq m\leq n/4, and 4K21/ε4\leq K_{2}\leq 1/\varepsilon. Let X=(X1,,Xn)X=(X_{1},\dots,X_{n}) be a random vector uniformly distributed on Λn\Lambda_{n}. Fix disjoint subsets S1,,SmS_{1},\dots,S_{m} of [n][n], each of cardinality n/m\lfloor n/m\rfloor. Then

{s[z,εk]:i=1mψK2(|1n/mwSiexp(2π𝐢Xws)|)em}1(ε/2)n.{\mathbb{P}}\bigg{\{}\forall s\in[z,\varepsilon^{\prime}k]\,:\,\,\,\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s)\Big{|}\bigg{)}\leq e^{-\sqrt{m}}\bigg{\}}\geq 1-(\varepsilon/2)^{n}.
Proof.

Let ε>0\varepsilon^{\prime}>0 be a parameter to be chosen later. Fix any s[z,εk]s\in[z,\varepsilon^{\prime}k]. Assume m(εz)410m\geq(\varepsilon^{\prime}z)^{-4}\geq 10. For imi\leq m denote

γi(s):=|1n/mwSiexp(2π𝐢Xws)|,fi(s):=ψK2(γi(s)), and f(s):=i=1mfi(s)\gamma_{i}(s):=\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s)\Big{|},\quad\quad f_{i}(s):=\psi_{K_{2}}\big{(}\gamma_{i}(s)\big{)},\quad\mbox{ and }\quad f(s):=\prod\limits_{i=1}^{m}f_{i}(s)

Observe that by the definition of ψK2\psi_{K_{2}} for each imi\leq m we have fi(s)=γi(s)f_{i}(s)=\gamma_{i}(s), provided γi(s)1/K2\gamma_{i}(s)\geq 1/K_{2}. Next note that if for some complex unit numbers z1,,zNz_{1},...,z_{N} their average v:=i=1Nzi/Nv:=\sum_{i=1}^{N}z_{i}/N has length 1α>01-\alpha>0 then, taking the unit complex number z0z_{0} satisfying z0,v=|v|\left\langle z_{0},v\right\rangle=|v| we have

N(1α)i=1NRezi,z0N,N(1-\alpha)\leq\sum_{i=1}^{N}Re\left\langle z_{i},z_{0}\right\rangle\leq N,

therefore there are at least N/2+1N/2+1 indices ii such that Rezi,z014αRe\left\langle z_{i},z_{0}\right\rangle\geq 1-4\alpha. This in turn implies that there exists an index jj such that there are at least N/2N/2 indices ii with Rezi,z¯j116αRe\left\langle z_{i},\bar{z}_{j}\right\rangle\geq 1-16\alpha. Thus the event {fi(s)12m}\big{\{}f_{i}(s)\geq 1-\frac{2}{\sqrt{m}}\big{\}} is contained in the event

{wSi:cos(2πs(XwXw))132mfor at least n2m indices wSi{w}}.\Big{\{}\exists\;\;w^{\prime}\in S_{i}:\;\;\;\cos(2\pi s(X_{w}-X_{w^{\prime}}))\geq 1-\frac{32}{\sqrt{m}}\;\;\mbox{for at least }\,\frac{n}{2m}\,\mbox{ indices \, }w\in S_{i}\setminus\{w^{\prime}\}\Big{\}}.

To estimate the probability of the latter event, we take the union bound over all choices of n/(2m)n/(2m) indices from SiS_{i}, and over all choices of ww^{\prime}. We then get

{fi(s)12m}\displaystyle{\mathbb{P}}\bigg{\{}f_{i}(s)\geq 1-\frac{2}{\sqrt{m}}\bigg{\}} nm 2n/mmaxwSi,FSi{w},|F|n/(2m){wF:dist(s(XwXw),)2m1/4}\displaystyle\leq\frac{n}{m}\,2^{\lfloor n/m\rfloor}\,\max\limits_{w^{\prime}\in S_{i},\,F\subset S_{i}\setminus\{w^{\prime}\},\atop|F|\geq n/(2m)}{\mathbb{P}}\bigg{\{}\forall w\in F:\,\,{\rm dist}(s(X_{w}-X_{w^{\prime}}),{\mathbb{Z}})\leq\frac{2}{m^{1/4}}\bigg{\}}

To estimate the probability under the maximum we use the definition of Λn\Lambda_{n} and the independence of the coordinates of the vector XX. Note that for each fixed ww there is an integer interval IwI_{w} of length at least 2k2k such that XwX_{w} is uniformly distributed on Iw/kI_{w}/k. Therefore, fixing a realization Xw=b/kX_{w^{\prime}}=b/k, bb\in{\mathbb{Z}}, we need to count how many aIwa\in I_{w} are such that s(ab)/ks(a-b)/k is close to an integer. This can be done by splitting IwI_{w} into subintervals of length kk and considering the cases zs1z\leq s\leq 1, 1<sCk/m1/41<s\leq C^{\prime}k/m^{1/4} (this case can be empty), and Ck/m1/4<sεkC^{\prime}k/m^{1/4}<s\leq\varepsilon^{\prime}k. This leads to the following bound with an absolute constant C>0C^{\prime\prime}>0,

{fi(s)12m}\displaystyle{\mathbb{P}}\bigg{\{}f_{i}(s)\geq 1-\frac{2}{\sqrt{m}}\bigg{\}} nm 2n/m(max(Czm1/4,Cε))n/(2m)nm(4Cε)n/(2m)\displaystyle\leq\frac{n}{m}\,2^{n/m}\,\bigg{(}\max\Big{(}\frac{C^{\prime\prime}}{z\,m^{1/4}},C^{\prime\prime}\varepsilon^{\prime}\Big{)}\bigg{)}^{n/(2m)}\leq\frac{n}{m}\,\big{(}4C^{\prime\prime}\varepsilon^{\prime}\big{)}^{n/(2m)}

Using this estimate and the fact that ψK2(t)1\psi_{K_{2}}(t)\leq 1 for t1t\leq 1 (so, each fi(s)1f_{i}(s)\leq 1), we obtain

{f(s)(12m)3m/4}\displaystyle{\mathbb{P}}\bigg{\{}f(s)\geq\Big{(}1-\frac{2}{\sqrt{m}}\Big{)}^{3m/4}\bigg{\}} {fi(s)12mfor at least m/4 indices i}\displaystyle\leq{\mathbb{P}}\bigg{\{}f_{i}(s)\geq 1-\frac{2}{\sqrt{m}}\;\;\mbox{for at least $m/4$ indices $i$}\bigg{\}}
2m(nm(4Cε)n/(2m))m/4=(16nm)m/4(4Cε)n/8.\displaystyle\leq 2^{m}\bigg{(}\frac{n}{m}\,\big{(}4C^{\prime\prime}\varepsilon^{\prime}\big{)}^{n/(2m)}\bigg{)}^{m/4}=\bigg{(}\frac{16n}{m}\bigg{)}^{m/4}\big{(}4C^{\prime\prime}\varepsilon^{\prime}\big{)}^{n/8}.

The last step of the proof is somewhat similar to the one used in the proof of Lemma 4.17 — we discretize the interval [z,εk][z,\varepsilon^{\prime}k] and use the Lipschitzness of f(s)f(s). Recall that 𝐠(n)2n{\bf g}(n)\leq 2^{n} and thus, by Lemma 4.13, f(s)f(s) is (8K2π2nm)(8K_{2}\pi 2^{n}\,m)-Lipschitz. Let

β:=(12/m)3m/4(8K2π 2nm)1 and T:=[z,εk]β.\beta:=\big{(}1-2/\sqrt{m}\big{)}^{3m/4}\big{(}8K_{2}\pi\,2^{n}m\big{)}^{-1}\quad\quad\mbox{ and }\quad\quad T:=[z,\varepsilon^{\prime}k]\cap\beta{\mathbb{Z}}.

Then for any s,s[z,εk]s,s^{\prime}\in[z,\varepsilon^{\prime}k] satisfying |ss|β|s-s^{\prime}|\leq\beta we have |f(s)f(s)|(12/m)3m/4|f(s)-f(s^{\prime})|\leq\big{(}1-2/\sqrt{m}\big{)}^{3m/4} deterministically. This implies that

{s[z,εk]:f(s)\displaystyle{\mathbb{P}}\bigg{\{}\forall s\in[z,\varepsilon^{\prime}k]\,:\,\,\,f(s) 2(12m)3m/4}{sT:f(s)(12m)3m/4}\displaystyle\leq 2\Big{(}1-\frac{2}{\sqrt{m}}\Big{)}^{3m/4}\bigg{\}}\geq{\mathbb{P}}\bigg{\{}\forall s\in T\,:\,\,\,f(s)\leq\Big{(}1-\frac{2}{\sqrt{m}}\Big{)}^{3m/4}\bigg{\}}
1kβ(16nm)m/4(4Cε)n/81(ε/2)n,\displaystyle\geq 1-\frac{k}{\beta}\bigg{(}\frac{16n}{m}\bigg{)}^{m/4}\big{(}4C^{\prime\prime}\varepsilon^{\prime}\big{)}^{n/8}\geq 1-(\varepsilon/2)^{n},

provided that ε:=cε8\varepsilon^{\prime}:=c^{\prime\prime}\varepsilon^{8} for a sufficiently small universal constant c>0c^{\prime\prime}>0. It remains to note that 2(1-2/\sqrt{m})^{3m/4}\leq 2\exp(-3\sqrt{m}/2)\leq e^{-\sqrt{m}} since m\geq 10, which gives the asserted bound. ∎

Lemma 4.19.

Let ρ,ε(0,1]\rho,\varepsilon\in(0,1], k1k\geq 1, hh\in{\mathbb{R}}, a1h+1a_{1}\geq h+1, a2hρ1a_{2}\leq h-\rho-1. Let Y1,Y2Y_{1},Y_{2} be independent random variables, with Y1Y_{1} uniformly distributed on [h,a1]1k[h,a_{1}]\cap\frac{1}{k}{\mathbb{Z}} and Y2Y_{2} uniformly distributed on [a2,hρ]1k[a_{2},h-\rho]\cap\frac{1}{k}{\mathbb{Z}}. Then for every s[ε/8,ε/8]s\in[-\varepsilon/8,\varepsilon/8] one has

{|exp(2π𝐢Y1s)+exp(2π𝐢Y2s)|>22πρ2s2}ε.{\mathbb{P}}\big{\{}\big{|}\exp\big{(}2\pi{\bf i}Y_{1}s\big{)}+\exp\big{(}2\pi{\bf i}Y_{2}s\big{)}\big{|}>2-2\pi\rho^{2}s^{2}\big{\}}\leq\varepsilon.
Proof.

Clearly, it is enough to consider 0<s<ε/80<s<\varepsilon/8 only. Note that

|exp(2π𝐢Y1s)+exp(2π𝐢Y2s)|=|1+exp(2π𝐢(Y1Y2)s)|=2|cos(π(Y1Y2)s)|.\big{|}\exp\big{(}2\pi{\bf i}Y_{1}s\big{)}+\exp\big{(}2\pi{\bf i}Y_{2}s\big{)}\big{|}=\big{|}1+\exp\big{(}2\pi{\bf i}(Y_{1}-Y_{2})s\big{)}\big{|}=2\big{|}\cos\big{(}\pi(Y_{1}-Y_{2})s\big{)}\big{|}.

We consider two cases.

Case 1. a1h+2ε1a_{1}\leq h+2\varepsilon^{-1} and a2h2ε1a_{2}\geq h-2\varepsilon^{-1}. In this case, deterministically, ρY1Y24/ε\rho\leq Y_{1}-Y_{2}\leq 4/\varepsilon, therefore, using that cost1t2/π\cos t\leq 1-t^{2}/\pi on [π/2,π/2][-\pi/2,\pi/2], we have for every s(0,ε/8]s\in(0,\varepsilon/8],

|exp(2π𝐢Y1s)+exp(2π𝐢Y2s)|22πρ2s2.\big{|}\exp\big{(}2\pi{\bf i}Y_{1}s\big{)}+\exp\big{(}2\pi{\bf i}Y_{2}s\big{)}\big{|}\leq 2-2\pi\rho^{2}s^{2}.

Case 2. Either a1>h+2ε1a_{1}>h+2\varepsilon^{-1} or a2<h2ε1a_{2}<h-2\varepsilon^{-1}. Without loss of generality, we will assume the first inequality holds. We condition on a realization Y~2\widetilde{Y}_{2} of Y2Y_{2} (further in the proof, we compute conditional probabilities given Y2=Y~2Y_{2}=\widetilde{Y}_{2}). For any sε/8s\leq\varepsilon/8, the event

{|1+exp(2π𝐢(Y1Y~2)s)|2s2}\big{\{}\big{|}1+\exp\big{(}2\pi{\bf i}(Y_{1}-\widetilde{Y}_{2})s\big{)}\big{|}\geq 2-s^{2}\big{\}}

is contained inside the event

{dist((Y1Y~2)s,)s}.\big{\{}{\rm dist}\big{(}(Y_{1}-\widetilde{Y}_{2})s,{\mathbb{Z}}\big{)}\leq s\big{\}}.

On the other hand, since (Y1Y~2)s(Y_{1}-\widetilde{Y}_{2})s is uniformly distributed on a set [b1,b2]sk[b_{1},b_{2}]\cap\frac{s}{k}{\mathbb{Z}}, for some b2b1+2ε1sb_{2}\geq b_{1}+2\varepsilon^{-1}s, the probability of the last event is less than ε\varepsilon. The result follows. ∎
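The bound of Lemma 4.19 can also be checked by a small Monte Carlo experiment; the sketch below is illustrative only, and the parameter values are arbitrary choices consistent with the assumptions of the lemma (Case 2 of the proof, with a_{1}>h+2\varepsilon^{-1}).

import numpy as np

rng = np.random.default_rng(2)
k, h, rho, eps = 50, 0.0, 0.2, 0.25
trials = 200000
# Y1 uniform on [h, h + 40] ∩ (1/k)Z, Y2 uniform on [h - 3.2, h - rho] ∩ (1/k)Z
Y1 = rng.integers(0, 40 * k + 1, trials) / k          # integers 0 .. 2000, divided by k
Y2 = rng.integers(-160, -9, trials) / k               # integers -160 .. -10, divided by k

for s in np.linspace(-eps / 8, eps / 8, 9):
    mod = np.abs(np.exp(2j * np.pi * Y1 * s) + np.exp(2j * np.pi * Y2 * s))
    freq = np.mean(mod > 2 - 2 * np.pi * rho**2 * s**2)
    print(f"s = {s:+.5f}: empirical probability {freq:.4f}, bound eps = {eps}")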

Lemma 4.20 (Integration for small ss).

For any ε~(0,1]\widetilde{\varepsilon}\in(0,1], ρ(0,1/4]\rho\in(0,1/4] and δ(0,1/2]\delta\in(0,1/2] there are n4.20=n4.20(ε~,δ,ρ)n_{\text{\tiny\ref{l: aux -29802609872}}}=n_{\text{\tiny\ref{l: aux -29802609872}}}(\widetilde{\varepsilon},\delta,\rho), C4.20=C4.20(ε~,δ,ρ)1C_{\text{\tiny\ref{l: aux -29802609872}}}=C_{\text{\tiny\ref{l: aux -29802609872}}}(\widetilde{\varepsilon},\delta,\rho)\geq 1, and K4.20=K4.20(δ,ρ)1K_{\text{\tiny\ref{l: aux -29802609872}}}=K_{\text{\tiny\ref{l: aux -29802609872}}}(\delta,\rho)\geq 1 with the following property. Let AnmA_{nm} be defined as in (7), nn4.20n\geq n_{\text{\tiny\ref{l: aux -29802609872}}}, k1k\geq 1, mm\in{\mathbb{N}} with n/mC4.20n/m\geq C_{\text{\tiny\ref{l: aux -29802609872}}} and m2m\geq 2, and let X=(X1,,Xn)X=(X_{1},\dots,X_{n}) be a random vector uniformly distributed on Λn\Lambda_{n}. Then for every K24K_{2}\geq 4,

{\displaystyle{\mathbb{P}}\bigg{\{} AnmS1,,Smm/C4.20m/C4.20i=1mψK2(|1n/mwSiexp(2π𝐢Xwm1/2s)|)dsK4.20}(ε~/2)n,\displaystyle A_{nm}\,\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-\sqrt{m}/C_{\text{\tiny\ref{l: aux -29802609872}}}}^{\sqrt{m}/C_{\text{\tiny\ref{l: aux -29802609872}}}}\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp\big{(}2\pi{\bf i}X_{w}m^{-1/2}\,s\big{)}\Big{|}\bigg{)}\,ds\geq K_{\text{\tiny\ref{l: aux -29802609872}}}\bigg{\}}\leq(\widetilde{\varepsilon}/2)^{n},

where the sum is taken over all disjoint subsets S1,,Sm[n]S_{1},\dots,S_{m}\subset[n] of cardinality n/m\lfloor n/m\rfloor each.

Proof.

Let nδ,Cδ,cδn_{\delta},C_{\delta},c_{\delta}, and 𝒮\mathcal{S} be as in Lemma 3.3. For a given choice of subsets (S1,,Sm)𝒮(S_{1},\dots,S_{m})\in\mathcal{S}, denote

γi(s):=|1n/mwSiexp(2π𝐢Xws)|,fi(s):=ψK2(γi(s)), and f(s):=i=1mfi(s)\gamma_{i}(s):=\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp(2\pi{\bf i}X_{w}s)\Big{|},\quad\quad f_{i}(s):=\psi_{K_{2}}\big{(}\gamma_{i}(s)\big{)},\quad\mbox{ and }\quad f(s):=\prod\limits_{i=1}^{m}f_{i}(s)

(note that functions γi(s)\gamma_{i}(s), fi(s)f_{i}(s), f(s)f(s) depend on the choice of subsets SiS_{i}).

First, we study the distribution of the variable f(s)f(s) for a given choice of subsets SiS_{i}. We assume that nnδn\geq n_{\delta} and n/mCδn/m\geq C_{\delta}. We also denote ε:=210/δε~ 16/δcδ\varepsilon:=2^{-10/\delta}\,\widetilde{\varepsilon}^{\,16/\delta c_{\delta}} and

𝒮:={\displaystyle\mathcal{S}^{\prime}:=\Big{\{} (S1,,Sm)𝒮:min(|SiQ1|,|SiQ2|)δn/m/2 for at least cδm indices i}.\displaystyle(S_{1},\dots,S_{m})\in\mathcal{S}:\;\min(|S_{i}\cap Q_{1}|,|S_{i}\cap Q_{2}|)\geq\delta\lfloor n/m\rfloor/2\mbox{ for at least $c_{\delta}m$ indices $i$}\Big{\}}.

Fix a sequence (S1,,Sm)𝒮(S_{1},\dots,S_{m})\in\mathcal{S}^{\prime}, and let J[m]J\subset[m] be a subset of cardinality cδm\lceil c_{\delta}m\rceil such that

iJ:min(|SiQ1|,|SiQ2|)δn/m/2.\forall i\in J\,:\,\,\min(|S_{i}\cap Q_{1}|,|S_{i}\cap Q_{2}|)\geq\delta\lfloor n/m\rfloor/2.

For any iJi\in J, w1SiQ1w_{1}\in S_{i}\cap Q_{1}, and w2SiQ2w_{2}\in S_{i}\cap Q_{2} by Lemma 4.19 we have for s[ε/8,ε/8]s\in[-\varepsilon/8,\varepsilon/8],

{|exp(2π𝐢Xw1s)+exp(2π𝐢Xw2s)|22πρ2s2}ε.{\mathbb{P}}\big{\{}\big{|}\exp\big{(}2\pi{\bf i}X_{w_{1}}s\big{)}+\exp\big{(}2\pi{\bf i}X_{w_{2}}s\big{)}\big{|}\geq 2-2\pi\rho^{2}s^{2}\big{\}}\leq\varepsilon.

Within SiS_{i}, we can find at least δ2n/m\frac{\delta}{2}\lfloor n/m\rfloor disjoint pairs of indices (w1,w2)Q1×Q2(w_{1},w_{2})\in Q_{1}\times Q_{2} satisfying the above condition. Let TT be a set of such pairs with |T|=δ2n/m|T|=\frac{\delta}{2}\lfloor n/m\rfloor. Using the independence of coordinates of XX, and denoting z:=min(1/(πρ2δ),ε/8)z:=\min\big{(}\sqrt{1/(\pi\rho^{2}\delta)},\varepsilon/8\big{)}, we obtain for every s[z,z]s\in[-z,z],

{\displaystyle{\mathbb{P}}\bigg{\{} γi(s)1πρ2δs22}\displaystyle\gamma_{i}(s)\geq 1-\frac{\pi\rho^{2}\delta s^{2}}{2}\bigg{\}}
{|exp(2π𝐢Xw1s)+exp(2π𝐢Xw2s)|22πρ2s2 for at least δ4n/m pairs (w1,w2)T}\displaystyle\leq{\mathbb{P}}\big{\{}\big{|}\exp\big{(}2\pi{\bf i}X_{w_{1}}s\big{)}+\exp\big{(}2\pi{\bf i}X_{w_{2}}s\big{)}\big{|}\geq 2-2\pi\rho^{2}s^{2}\mbox{ for at least $\frac{\delta}{4}\lfloor n/m\rfloor$ pairs $(w_{1},w_{2})\in T$}\big{\}}
2δn/m/2εδn/m/4(4ε)δn/(4m).\displaystyle\leq 2^{\delta\lfloor n/m\rfloor/2}\,\varepsilon^{\delta\lfloor n/m\rfloor/4}\leq(4\varepsilon)^{\delta n/(4m)}.

Applying this for all iJi\in J together with the observations that f(s)1f(s)\leq 1 and fi(s)=γi(s)f_{i}(s)=\gamma_{i}(s) (when γi(s)1/K2\gamma_{i}(s)\geq 1/K_{2}), we conclude that for every s[z,z]s\in[-z,z],

{f(s)(1πρ2δs2/2)|J|/2}\displaystyle{\mathbb{P}}\bigg{\{}f(s)\geq\big{(}1-\pi\rho^{2}\delta s^{2}/2\big{)}^{|J|/2}\bigg{\}} {fi(s)1πρ2δs2/2 for at least |J|/2 indices iJ}\displaystyle\leq{\mathbb{P}}\bigg{\{}f_{i}(s)\geq 1-\pi\rho^{2}\delta s^{2}/2\,\,\mbox{ for at least $|J|/2$ indices $i\in J$}\bigg{\}}
2|J|(4ε)δ|J|n/(8m)\displaystyle\leq 2^{|J|}\,(4\varepsilon)^{\delta|J|n/(8m)}

At the next step, we apply Lemma 4.11 with ξ(s)=f(s)\xi(s)=f(s) to obtain from the previous relation

{zzf(s)dszz(1πρ2δs22)|J|/2ds+m1/2}12zm1/2 2|J|(4ε)δ|J|n/(8m).{\mathbb{P}}\bigg{\{}\int\limits_{-z}^{z}f(s)\,ds\leq\int\limits_{-z}^{z}\bigg{(}1-\frac{\pi\rho^{2}\delta s^{2}}{2}\bigg{)}^{|J|/2}\,ds+m^{-1/2}\bigg{\}}\geq 1-2zm^{1/2}\,2^{|J|}\,(4\varepsilon)^{\delta|J|n/(8m)}.

Next we apply Lemma 4.12 with I=𝒮I=\mathcal{S}^{\prime}, ξi=f(s)\xi_{i}=f(s) (recall that f(s)f(s) depends also on the choice of (S1,,Sm)𝒮(S_{1},\dots,S_{m})\in\mathcal{S}). We obtain

{Anm(S1,,Sm)𝒮zzf(s)dszz(1πρ2δs22)|J|/2ds+2m1/2}12zm 2|J|(4ε)δ|J|n/(8m).\displaystyle{\mathbb{P}}\bigg{\{}A_{nm}\,\sum\limits_{(S_{1},\dots,S_{m})\in\mathcal{S}^{\prime}}\;\int\limits_{-z}^{z}f(s)\,ds\leq\int\limits_{-z}^{z}\bigg{(}1-\frac{\pi\rho^{2}\delta s^{2}}{2}\bigg{)}^{|J|/2}\,ds+2m^{-1/2}\bigg{\}}\geq 1-2zm\,2^{|J|}\,(4\varepsilon)^{\delta|J|n/(8m)}.

Further, since by Lemma 3.3 we have |𝒮|(1ecδn)|𝒮||\mathcal{S}^{\prime}|\geq(1-e^{-c_{\delta}n})|\mathcal{S}| and since f(s)1f(s)\leq 1, we observe that

Anm(S1,,Sm)𝒮𝒮zzf(s)ds2zecδn\displaystyle A_{nm}\,\sum\limits_{(S_{1},\dots,S_{m})\in\mathcal{S}\setminus\mathcal{S}^{\prime}}\;\int\limits_{-z}^{z}f(s)\,ds\leq 2z\,e^{-c_{\delta}n}

deterministically. Recalling that |J|=cδm|J|=\lceil c_{\delta}m\rceil, we obtain

{\displaystyle{\mathbb{P}}\bigg{\{} Anm(S1,,Sm)𝒮zzf(s)dsCm1/2}12zm 2|J|(4ε)δ|J|n/(8m)1(ε~/2)n,\displaystyle A_{nm}\,\sum\limits_{(S_{1},\dots,S_{m})\in\mathcal{S}}\;\int\limits_{-z}^{z}f(s)\,ds\leq C^{\prime\prime}m^{-1/2}\bigg{\}}\geq 1-2zm\,2^{|J|}\,(4\varepsilon)^{\delta|J|n/(8m)}\geq 1-(\widetilde{\varepsilon}/2)^{n},

for some C1C^{\prime\prime}\geq 1 depending only on δ\delta and ρ\rho, provided that nn0(ε~,δ,ρ)n\geq n_{0}(\widetilde{\varepsilon},\delta,\rho). The result follows by the substitution s=m1/2us=m^{-1/2}u in the integral. ∎
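The passage to the last display uses the elementary estimate \int_{-z}^{z}\big(1-\pi\rho^{2}\delta s^{2}/2\big)^{|J|/2}\,ds\leq C^{\prime\prime}m^{-1/2} (a Gaussian-type integral, since |J|\geq c_{\delta}m). A quick numerical illustration of this decay, with arbitrarily chosen values of \rho, \delta and c_{\delta}, is sketched below.

import numpy as np

rho, delta, c_delta = 0.2, 0.2, 0.3                   # illustrative values only
z = np.sqrt(1.0 / (np.pi * rho**2 * delta))           # z as in the proof, ignoring the min with eps/8
s = np.linspace(-z, z, 200001)
ds = s[1] - s[0]
for m in (100, 400, 1600, 6400):
    J = int(np.ceil(c_delta * m))                     # |J| = ceil(c_delta * m)
    integral = np.sum((1.0 - np.pi * rho**2 * delta * s**2 / 2.0) ** (J / 2.0)) * ds
    print(m, integral * np.sqrt(m))                   # stays bounded, i.e. the integral is O(m^{-1/2})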

As a combination of Lemmas 4.17, 4.18 and 4.20, we obtain Proposition 4.9.

Proof of Proposition 4.9.

As we mentioned at the beginning of this subsection, we fix ρ,δ(0,1/4]\rho,\delta\in(0,1/4], a growth function 𝐠{\bf g} satisfying (8), a permutation σΠn\sigma\in\Pi_{n}, a number hh\in{\mathbb{R}}, two sets Q1,Q2[n]Q_{1},Q_{2}\subset[n] such that |Q1|,|Q2|=δn|Q_{1}|,|Q_{2}|=\lceil\delta n\rceil, and we use Λn\Lambda_{n} for the set Λn(k,𝐠,Q1,Q2,ρ,σ,h)\Lambda_{n}(k,{\bf g},Q_{1},Q_{2},\rho,\sigma,h) defined in (17). We also fix ε(0,1/4]\varepsilon\in(0,1/4].

We start by selecting the parameters. Assume that nn is large enough. Set :=4.17(ε)\ell:=\ell_{\text{\tiny\ref{l: aux 2398205987305}}}(\varepsilon). Let ε=ε(ε)\varepsilon^{\prime}=\varepsilon^{\prime}(\varepsilon) be taken from Lemma 4.18. Set z:=1/C4.20(ε,δ,ρ)z:=1/C_{\text{\tiny\ref{l: aux -29802609872}}}(\varepsilon,\delta,\rho). Fix an integer m[C4.18(ε,z),n/max(,C4.20)]m\in[C_{\text{\tiny\ref{l: aux 20985059837}}}(\varepsilon,z),n/\max(\ell,C_{\text{\tiny\ref{l: aux -29802609872}}})] satisfying the condition R4.17mem1R_{\text{\tiny\ref{l: aux 2398205987305}}}\sqrt{m}\,e^{-\sqrt{m}}\leq 1, and take 1kmin(2n/,(K2/8)m/2)1\leq k\leq\min\big{(}2^{n/\ell},(K_{2}/8)^{m/2}\big{)}. Let AnmA_{nm} be defined as in (7). We assume that hh is chosen in such a way that the set Λn\Lambda_{n} is non-empty. As before, XX denotes the random vector uniformly distributed on Λn\Lambda_{n}. Let 𝒮\mathcal{S} be as in Lemma 3.3. For a given choice of subsets (S1,,Sm)𝒮(S_{1},\dots,S_{m})\in\mathcal{S}, denote

f(s)=fS1,,Sm(s):=i=1mψK2(|1n/mwSiexp(2π𝐢Xwm1/2s)|).f(s)=f_{S_{1},\dots,S_{m}}(s):=\prod\limits_{i=1}^{m}\psi_{K_{2}}\bigg{(}\Big{|}\frac{1}{\lfloor n/m\rfloor}\sum_{w\in S_{i}}\exp\big{(}2\pi{\bf i}X_{w}m^{-1/2}\,s\big{)}\Big{|}\bigg{)}.

We have

AnmS1,,Sm\displaystyle A_{nm}\sum\limits_{S_{1},\dots,S_{m}}\; εm1/2kεm1/2kf(s)ds=AnmS1,,Smzmzmf(s)ds+2AnmS1,,Smzmεkmf(s)ds\displaystyle\int\limits_{-\varepsilon^{\prime}m^{1/2}k}^{\varepsilon^{\prime}m^{1/2}k}f(s)\,ds=A_{nm}\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{-z\sqrt{m}}^{z\sqrt{m}}f(s)\,ds+2A_{nm}\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{z\sqrt{m}}^{\varepsilon^{\prime}k\sqrt{m}}f(s)\,ds

In view of Lemma 4.20, with probability at least 1(ε/2)n1-(\varepsilon/2)^{n} the first summand is bounded above by K4.20K_{\text{\tiny\ref{l: aux -29802609872}}}. To estimate the second summand, we combine Lemmas 4.17 and 4.18 (we assume that zεkz\leq\varepsilon^{\prime}k as otherwise there is no second summand). Fix for a moment a collection (S1,,Sm)𝒮(S_{1},\dots,S_{m})\in\mathcal{S}. By Lemma 4.17, with probability at least 1(ε/2)n1-(\varepsilon/2)^{n} the function ff on [0,km/2][0,k\sqrt{m}/2] is bounded above by (K2/4)m/2(K_{2}/4)^{-m/2} for all points ss outside of some set of measure at most R4.17mR_{\text{\tiny\ref{l: aux 2398205987305}}}\sqrt{m} (note that we apply variable transformation sm1/2ss\to m^{-1/2}s to use the lemma here). Further, by Lemma 4.18, with probability at least 1(ε/2)n1-(\varepsilon/2)^{n} we have that ff is bounded above by eme^{-\sqrt{m}} for all s[zm,εkm]s\in[z\sqrt{m},\varepsilon^{\prime}k\sqrt{m}]. Thus, with probability at least 12(ε/2)n1-2(\varepsilon/2)^{n},

zmεkmf(s)dsmk(K24)m/2+R4.17mem.\int\limits_{z\sqrt{m}}^{\varepsilon^{\prime}k\sqrt{m}}f(s)\,ds\leq\sqrt{m}k\,\Big{(}\frac{K_{2}}{4}\Big{)}^{-m/2}+R_{\text{\tiny\ref{l: aux 2398205987305}}}\sqrt{m}\,e^{-\sqrt{m}}.

Applying Lemma 4.12 with I=𝒮I=\mathcal{S} and ξi=f(s)\xi_{i}=f(s), we obtain that

AnmS1,,Smzmεkmf(s)dsmk(K24)m/2+R4.17mem+13\displaystyle A_{nm}\sum\limits_{S_{1},\dots,S_{m}}\;\int\limits_{z\sqrt{m}}^{\varepsilon^{\prime}k\sqrt{m}}f(s)\,ds\leq\sqrt{m}k\,\Big{(}\frac{K_{2}}{4}\Big{)}^{-m/2}+R_{\text{\tiny\ref{l: aux 2398205987305}}}\sqrt{m}\,e^{-\sqrt{m}}+1\leq 3

with probability at least 12(ε/2)n1-2(\varepsilon/2)^{n}. Thus, taking K1:=K4.20+3K_{1}:=K_{\text{\tiny\ref{l: aux -29802609872}}}+3, we obtain

{𝐔𝐃n(X,m,K1,K2)εm1/2k}13(ε/2)n13εn.{\mathbb{P}}\{{\bf UD}_{n}(X,m,K_{1},K_{2})\geq\varepsilon^{\prime}m^{1/2}k\}\geq 1-3(\varepsilon/2)^{n}\geq 1-3\varepsilon^{n}. ∎

5 Complement of gradual non-constant vectors: constant pp

In this section, we study the problem of invertibility of the Bernoulli(pp) matrix MM over the set 𝒮n{\mathcal{S}}_{n} defined by (2) in the case when the parameter pp is a small constant. This setting turns out to be much simpler than the treatment of the general case Clnn/npcC\ln n/n\leq p\leq c given in the next section. Although the results of Section 6 essentially absorb the statements of this section, we prefer to include the analysis of the constant pp case in our work, first, because it provides a short and relatively simple illustration of our method and, second, because the estimates obtained here allow us to derive better quantitative bounds for the smallest singular value of MM.

5.1 Splitting of n{\mathbb{R}}^{n} and main statements

We define the following four classes of vectors 1,,4\mathcal{B}_{1},\dots,\mathcal{B}_{4}. For simplicity, we normalize vectors with respect to the Euclidean norm. The first class is the set of vectors with one coordinate much larger than the others, namely,

1=1(p):={xSn1:x1>6pnx2}.\mathcal{B}_{1}=\mathcal{B}_{1}(p):=\{x\in S^{n-1}\,:\,x_{1}^{*}>6pn\,x_{2}^{*}\}.

For the next sets we fix a parameter βp=p/C0\beta_{p}=\sqrt{p}/C_{0}, where C0C_{0} is the absolute constant from Proposition 3.10. Recall also that the operator QQ (which annihilates the maximal coordinate of a given vector) and the set U(m,γ)U(m,\gamma) were introduced in Subsection 3.6. We also fix a small enough absolute positive constant c0c_{0}. We do not try to compute the actual value of c0c_{0}; the conditions on how small c0c_{0} must be can be extracted from the proofs. We further fix an integer 1mn1\leq m\leq n.

The second class of vectors consists of those vectors for which the Euclidean norm dominates the maximal coordinate. To control cardinalities of nets (discretizations) we intersect this class with U(m,c0)U(m,c_{0}); specifically, we set

2=2(p,m):=2U(m,c0), where 2:={xSn1:x1 and x1βp}.\mathcal{B}_{2}=\mathcal{B}_{2}(p,m):=\mathcal{B}_{2}^{\prime}\cap U(m,c_{0}),\quad\mbox{ where }\quad\mathcal{B}_{2}^{\prime}:=\left\{x\in S^{n-1}\,:\,x\not\in\mathcal{B}_{1}\,\,\mbox{ and }\,\,x_{1}^{*}\leq\beta_{p}\right\}.

The next set is similar to 2\mathcal{B}_{2}, but instead of comparing x1x_{1}^{*} with the Euclidean norm of the entire vector, we compare x2x_{2}^{*} with Qx\|Qx\|. For a technical reason, we need to control the magnitude of Qx\|Qx\| precisely; thus we partition the third set into subsets. Let numbers λk\lambda_{k}, kk\leq\ell, be defined by

λ1=16pn,λk+1=3λk,k<1,1/3λ1<1 and λ=1.\lambda_{1}=\frac{1}{6pn},\quad\lambda_{k+1}=3\lambda_{k},\,\,k<\ell-1,\quad 1/3\leq\lambda_{\ell-1}<1\quad\mbox{ and }\quad\lambda_{\ell}=1. (22)

Clearly, lnn\ell\leq\ln n. Then for each k1k\leq\ell-1 we define

3,k=3,k(p,m)\displaystyle\mathcal{B}_{3,k}=\mathcal{B}_{3,k}(p,m) :={xSn1:x12,x2βpQx and λkQx<λk+1}U(m,c0λk).\displaystyle:=\left\{x\in S^{n-1}\,:\,x\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime},\,\,x_{2}^{*}\leq\beta_{p}\|Qx\|\,\,\mbox{ and }\,\,\lambda_{k}\leq\|Qx\|<\lambda_{k+1}\right\}\cap U(m,c_{0}\lambda_{k}).

To explain the choice of λ1\lambda_{1}, note that if x12x\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime} and x=1\|x\|=1, then x2x1/(6pn)βp/(6pn)x_{2}^{*}\geq x_{1}^{*}/(6pn)\geq\beta_{p}/(6pn). Thus, if in addition βpQxx2\beta_{p}\|Qx\|\geq x_{2}^{*}, then Qx1/(6pn)=λ1\|Qx\|\geq 1/(6pn)=\lambda_{1}. We set

3=3(p,m):=k=113,k.\mathcal{B}_{3}=\mathcal{B}_{3}(p,m):=\bigcup_{k=1}^{\ell-1}\mathcal{B}_{3,k}.

The fourth set covers the remaining options for vectors having a large almost constant part. Let numbers μk\mu_{k}, ksk\leq s, be defined by

μ1=βp6pn,μk+1=3μk,k<s1,1/3μs1<1 and μs=1.\mu_{1}=\frac{\beta_{p}}{6pn},\quad\mu_{k+1}=3\mu_{k},\,\,k<s-1,\quad 1/3\leq\mu_{s-1}<1\quad\mbox{ and }\quad\mu_{s}=1. (23)

Clearly, slnns\leq\ln n. Then for each ks1k\leq s-1 define the set 4,k=4,k(p,m)\mathcal{B}_{4,k}=\mathcal{B}_{4,k}(p,m) as

{xSn1:x12,x2>βpQx and μkx2<μk+1}U(m,c0μk/ln(e/p)).\left\{x\in S^{n-1}\,:\,x\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime},\,\,x_{2}^{*}>\beta_{p}\|Qx\|\,\,\mbox{ and }\,\,\mu_{k}\leq x^{*}_{2}<\mu_{k+1}\right\}\cap U(m,c_{0}\mu_{k}/\sqrt{\ln(e/p)}).

Note that if x12x\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime} and x=1\|x\|=1, then x2x1/(6pn)βp/(6pn)x_{2}^{*}\geq x_{1}^{*}/(6pn)\geq\beta_{p}/(6pn), justifying the choice of μ1\mu_{1}. We set

\mathcal{B}_{4}=\mathcal{B}_{4}(p,m):=\bigcup_{k=1}^{s-1}\mathcal{B}_{4,k}.

Finally define \mathcal{B} as the union of these four classes, =(p,m):=j=14j.\mathcal{B}=\mathcal{B}(p,m):=\bigcup_{j=1}^{4}\mathcal{B}_{j}.
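To illustrate the splitting (the following two examples are purely illustrative and are not used in the sequel), consider, in the regime of Theorem 5.1 below and for nn large, the unit vectors x=(1,0,\dots,0) and y=n^{-1/2}(1,1,\dots,1). For the first vector,

x_{1}^{*}=1>0=6pn\,x_{2}^{*},

so x\in\mathcal{B}_{1}. For the second vector, y_{1}^{*}=y_{2}^{*}=n^{-1/2}, hence y\notin\mathcal{B}_{1} (as 6pn>1), while

y_{1}^{*}=n^{-1/2}\leq\sqrt{p}/C_{0}=\beta_{p}

(since p\geq(30\ln n)/n\geq C_{0}^{2}/n for nn large), so y\in\mathcal{B}_{2}^{\prime}; whether it belongs to \mathcal{B}_{2} depends, in addition, on membership in U(m,c_{0}).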

In this section we prove the following two theorems.

Theorem 5.1.

There exist positive absolute constants c,Cc,C such that the following holds. Let nn be large enough, let mcpn/ln(e/p)m\leq cpn/\ln(e/p), and let (30lnn)/np<1/20(30\ln n)/n\leq p<1/20. Let MM be an n×nn\times n Bernoulli(pp) random matrix. Then

{x such that Mx<1Cnln(e/p)x}n(1p)n+4e1.5np,{\mathbb{P}}\Big{\{}\exists\;x\in\mathcal{B}\,\,\,\mbox{ such that }\,\,\,\|Mx\|<\frac{1}{C\sqrt{n\ln(e/p)}}\,\,\|x\|\Big{\}}\leq n(1-p)^{n}+4e^{-1.5np},

where the set =(p,m)\mathcal{B}=\mathcal{B}(p,m) is defined above.

Recall that the set 𝒱n{\mathcal{V}}_{n} was introduced in Subsection 3.3. The next theorem shows that, after a proper normalization, the complement of 𝒱n{\mathcal{V}}_{n} (taken in Υn(r){\Upsilon}_{n}(r)) is contained in \mathcal{B} for some choice of r,δ,ρr,\delta,\rho and for the growth function 𝐠(t)=(2t)3/2{\bf g}(t)=(2t)^{3/2} (clearly, satisfying (8)).

Theorem 5.2.

There exists an absolute (small) positive constant c1c_{1} such that the following holds. Let q(0,c1)q\in(0,c_{1}) be a parameter. Then there exist nq1n_{q}\geq 1, r=r(q),ρ=ρ(q)(0,1)r=r(q),\rho=\rho(q)\in(0,1) such that for nnqn\geq n_{q}, p(q,c1)p\in(q,c_{1}), δ=r/3\delta=r/3, 𝐠(t)=(2t)3/2{\bf g}(t)=(2t)^{3/2}, and m=rnm=\lfloor rn\rfloor one has

{x/x:xΥn(r)𝒱n(r,𝐠,δ,ρ)}(p,m).\Big{\{}x/\|x\|\,\,:\,\,x\in{\Upsilon}_{n}(r)\setminus{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\Big{\}}\subset\mathcal{B}(p,m).

5.2 Proof of Theorem 5.1

Theorem 5.1 is a consequence of four lemmas that we prove in this section. Each lemma treats one of the classes i\mathcal{B}_{i}, i4i\leq 4, and Theorem 5.1 follows by the union bound. Recall that U(m,γ)U(m,\gamma) was introduced in Subsection 3.6 and that given xx, we fixed one permutation, σx\sigma_{x}, such that xi=|xσx(i)|x_{i}^{*}=|x_{\sigma_{x}(i)}| for ini\leq n. Recall also that the event nrm{\mathcal{E}}_{nrm} was introduced in Proposition 3.14.

Lemma 5.3.

Let n1n\geq 1 and p(0,1/2]p\in(0,1/2]. Let sum{\mathcal{E}}_{sum} (with q=pq=p) be the event introduced in Lemma 3.4 and by coln{\mathcal{E}}_{col}\subset{\mathcal{M}_{n}} denote the subset of 0/10/1 matrices with no zero columns. Then for every MsumcolM\in{\mathcal{E}}_{sum}\cap{\mathcal{E}}_{col} and every x1x\in\mathcal{B}_{1},

Mx13nx.\|Mx\|\geq\frac{1}{3\sqrt{n}}\,\|x\|.

In particular,

{Mn:x1 with Mx13n}n(1p)n+e1.5np.\mathbb{P}\Bigl{\{}M\in{\mathcal{M}_{n}}:\;\exists x\in\mathcal{B}_{1}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{1}{3\sqrt{n}}\Bigr{\}}\leq n(1-p)^{n}+e^{-1.5np}.
Proof.

Let δij\delta_{ij}, i,jni,j\leq n, be the entries of MsumcolM\in{\mathcal{E}}_{sum}\cap{\mathcal{E}}_{col}. Let σ=σx\sigma=\sigma_{x} and denote =σ(1)\ell=\sigma(1). Since McolM\in{\mathcal{E}}_{col}, there exists sns\leq n such that δs=1\delta_{s\ell}=1. Then

|\langle R_{s}(M),\,x\rangle|=\Big{|}x_{\ell}+\sum_{j\neq\ell}\delta_{sj}x_{j}\Big{|}\geq|x_{\ell}|-\sum_{j\neq\ell}\delta_{sj}\,|x_{j}|\geq|x_{\ell}|-\sum_{j=1}^{n}\delta_{sj}\,x_{2}^{*}.

Using that MsumM\in{\mathcal{E}}_{sum} we observe that j=1nδsj3.5pn\sum_{j=1}^{n}\delta_{sj}\leq 3.5pn. Thus, since x\in\mathcal{B}_{1} gives x_{2}^{*}<x_{1}^{*}/(6pn),

\|Mx\|\geq|\langle R_{s}(M),\,x\rangle|\geq x_{1}^{*}-3.5pn\,x_{2}^{*}\geq x_{1}^{*}/3.

The trivial bound xnx1\|x\|\leq\sqrt{n}\,x_{1}^{*} completes the first estimate. The “in particular” part follows by the “moreover” part of Lemma 3.4 and since \mathbb{P}({\mathcal{E}}_{col}^{c})\leq n(1-p)^{n}. ∎

Lemma 5.4.

There exists a (small) absolute positive constant cc such that the following holds. Let nn be large enough and mcnm\leq cn. Let (4lnn)/np<1/2(4\ln n)/n\leq p<1/2 and let MM be a Bernoulli(pp) random matrix. Then

(Mnrm and x2 with Mxpn5C0)e2n.\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{2}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{\sqrt{pn}}{5C_{0}}\Bigr{)}\leq e^{-2n}.
Proof.

By Lemma 3.13 for ε[8c0,1)\varepsilon\in[8c_{0},1) there exists an (ε/2)(\varepsilon/2)–net in V(1)U(m,c0)V(1)\cap U(m,c_{0}) with respect to the triple norm |||||||||\cdot|||, with cardinality at most

Cn2ε2(18enεm)m.\frac{Cn^{2}}{\varepsilon^{2}}\left(\frac{18en}{\varepsilon m}\right)^{m}.

Since 2V(1)U(m,c0)\mathcal{B}_{2}\subset V(1)\cap U(m,c_{0}), by a standard “projection” trick, we can obtain from it an ε\varepsilon–net 𝒩\mathcal{N} in 2\mathcal{B}_{2} of the same cardinality. Let x2x\in\mathcal{B}_{2}. Let z𝒩z\in\mathcal{N} be such that |||xz|||ε|||x-z|||\leq\varepsilon. Since on 2\mathcal{B}_{2} we have z1βpz=βpz_{1}^{*}\leq\beta_{p}\|z\|=\beta_{p}, Proposition 3.10 implies that with probability at least 1e3n1-e^{-3n},

Mzpn32C0.\|Mz\|\geq\frac{\sqrt{pn}}{3\sqrt{2}C_{0}}. (24)

Further, in view of Proposition 3.14, conditioned on (24) and on {Mnrm}\{M\in{\mathcal{E}}_{nrm}\}, we have

MxMzM(xz)pn32C0100pnεpn5C0,\|Mx\|\geq\|Mz\|-\|M(x-z)\|\geq\frac{\sqrt{pn}}{3\sqrt{2}C_{0}}-100\sqrt{pn}\varepsilon\geq\frac{\sqrt{pn}}{5C_{0}},

where we have chosen ε=1/(5000C0)\varepsilon=1/(5000C_{0}). Using the union bound and our choice of ε\varepsilon, we obtain that

(Mnrm and x2 with Mxpn5C0)e3n|𝒩|e2n\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{2}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{\sqrt{pn}}{5C_{0}}\Bigr{)}\leq e^{-3n}|\mathcal{N}|\leq e^{-2n}

for sufficiently large nn and provided that c01/(40000C0)c_{0}\leq 1/(40000C_{0}) and mcnm\leq cn for small enough absolute positive constant cc. This completes the proof. ∎

Remark 5.5.

Note that we used Proposition 3.10 with the set A=[n]A=[n]. In this case we could use a slightly simpler construction of nets than the one in Lemma 3.13 — we do not need to distinguish the first coordinate in the net construction; in other words, we could have only one special direction instead of two. However, this would not lead to a better estimate, and in the remaining lemmas we will need the full strength of our construction.

Next we treat the case of vectors in 3\mathcal{B}_{3}. The proof is similar to the proof of Lemma 5.4, but we need to remove the maximal coordinate and to deal with the remaining part of the vector. Recall that the operator QQ serves this purpose.

Lemma 5.6.

There exists a (small) absolute positive constant cc such that the following holds. Let nn be large enough, let mcpn/ln(e/p)m\leq cpn/\ln(e/p), and let (4lnn)/np<1/2(4\ln n)/n\leq p<1/2. Let MM be a Bernoulli(pp) random matrix. Then

(Mnrm and x3 with Mx130C0pn)e2n.\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{3}\mbox{ with }\,\,\|Mx\|\leq\frac{1}{30C_{0}\sqrt{pn}}\Bigr{)}\leq e^{-2n}.
Proof.

Fix 1k11\leq k\leq\ell-1. By Lemma 3.13 for ε[8c0λk,λk+1)\varepsilon\in[8c_{0}\lambda_{k},\lambda_{k+1}) there exists an (ε/2)(\varepsilon/2)–net in V(λk+1)U(m,c0λk)V(\lambda_{k+1})\cap U(m,c_{0}\lambda_{k}) with respect to |||||||||\cdot|||, with cardinality at most

Cn2ε2(18eλk+1nεm)mCn2ε2(54eλknεm)m.\frac{Cn^{2}}{\varepsilon^{2}}\left(\frac{18e\lambda_{k+1}n}{\varepsilon m}\right)^{m}\leq\frac{Cn^{2}}{\varepsilon^{2}}\left(\frac{54e\lambda_{k}n}{\varepsilon m}\right)^{m}.

Again using a “projection” trick, we can construct an ε\varepsilon–net 𝒩k\mathcal{N}_{k} in 3,k\mathcal{B}_{3,k} of the same cardinality. Let x3,kx\in\mathcal{B}_{3,k}. Let z𝒩kz\in\mathcal{N}_{k} be such that |||xz|||ε|||x-z|||\leq\varepsilon. Since on 3,k\mathcal{B}_{3,k} we have z2βpQzz_{2}^{*}\leq\beta_{p}\|Qz\|, Proposition 3.10 applied with A=σz([2,n])A=\sigma_{z}([2,n]) implies that with probability at least 1e3n1-e^{-3n},

MzpnQz32C0pnλk32C0.\|Mz\|\geq\frac{\sqrt{pn}\,\|Qz\|}{3\sqrt{2}C_{0}}\geq\frac{\sqrt{pn}\,\lambda_{k}}{3\sqrt{2}C_{0}}.

Conditioned on the above inequality and on the event {Mnrm}\{M\in{\mathcal{E}}_{nrm}\}, Proposition 3.14 implies that

MxMzM(xz)pnλk32C0100pnεpnλk5C0,\|Mx\|\geq\|Mz\|-\|M(x-z)\|\geq\frac{\sqrt{pn}\,\lambda_{k}}{3\sqrt{2}C_{0}}-100\sqrt{pn}\varepsilon\geq\frac{\sqrt{pn}\,\lambda_{k}}{5C_{0}},

where we have chosen ε=λk/(5000C0)\varepsilon=\lambda_{k}/(5000C_{0}). Using the union bound, our choice of ε\varepsilon and λk1/(6pn)\lambda_{k}\geq 1/(6pn), we obtain that

P_{k}:=\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{3,k}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{\sqrt{pn}\,\lambda_{k}}{5C_{0}}\Bigr{)}\leq e^{-3n}|\mathcal{N}_{k}|\leq e^{-2.5n}

for large enough nn and for mcnm\leq cn, where c>0c>0 is a small enough absolute constant (we also assume c01/(40000C0)c_{0}\leq 1/(40000C_{0})). Since lnn\ell\leq\ln n and λkλ11/(6pn)\lambda_{k}\geq\lambda_{1}\geq 1/(6pn), we obtain

\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{3}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{1}{30C_{0}\sqrt{pn}}\Bigr{)}\leq\sum_{k=1}^{\ell-1}P_{k}\leq e^{-2n}.

This completes the proof. ∎

Finally we treat the case of vectors in 4\mathcal{B}_{4}.

Lemma 5.7.

There exists a (small) absolute positive constant cc such that the following holds. Let nn be large enough and let mcpn/ln(e/p)m\leq cpn/\ln(e/p), (30lnn)/np<1/20(30\ln n)/n\leq p<1/20. Let MM be a Bernoulli(pp) random matrix. Then

(Mnrm and x4 with Mx160C0nln(e/p))e1.5pn.\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{4}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{1}{60C_{0}\sqrt{n\ln(e/p)}}\Bigr{)}\leq e^{-1.5pn}.
Proof.

Fix 1ks11\leq k\leq s-1. By Lemma 3.13 for ε[8c0μk/ln(e/p),μk+1)\varepsilon\in[8c_{0}\mu_{k}/\sqrt{\ln(e/p)},\mu_{k+1}) there exists an (ε/2)(\varepsilon/2)–net in

V(μk+1/βp)U(m,c0μk/ln(e/p))V(\mu_{k+1}/\beta_{p})\cap U(m,c_{0}\mu_{k}/\sqrt{\ln(e/p)})

with respect to |||||||||\cdot||| with cardinality at most

Cn2ε2(18eμk+1nεmβp)mCn2ε2(54eμknεmβp)m.\frac{Cn^{2}}{\varepsilon^{2}}\left(\frac{18e\mu_{k+1}n}{\varepsilon m\beta_{p}}\right)^{m}\leq\frac{Cn^{2}}{\varepsilon^{2}}\left(\frac{54e\mu_{k}n}{\varepsilon m\beta_{p}}\right)^{m}.

By the projection trick, we get an ε\varepsilon–net 𝒩k\mathcal{N}_{k} in 4,kV(μk+1/βp)U(m,c0μk/ln(e/p))\mathcal{B}_{4,k}\subset V(\mu_{k+1}/\beta_{p})\cap U(m,c_{0}\mu_{k}/\sqrt{\ln(e/p)}).

Let x4,kx\in\mathcal{B}_{4,k}. Let z𝒩kz\in\mathcal{N}_{k} be such that |||xz|||ε|||x-z|||\leq\varepsilon. Since on 4\mathcal{B}_{4} we have z1z2μkz_{1}^{*}\geq z_{2}^{*}\geq\mu_{k}, Proposition 3.11 implies that with probability at least 1e1.6np1-e^{-1.6np},

Mzμkpn7ln(e/p).\|Mz\|\geq\frac{\mu_{k}\sqrt{pn}}{7\sqrt{\ln(e/p)}}.

Conditioned on the above and on {Mnrm}\{M\in{\mathcal{E}}_{nrm}\}, Proposition 3.14 implies that

MxMzM(xz)μkpn7ln(e/p)C1pnεμkpn10ln(e/p),\|Mx\|\geq\|Mz\|-\|M(x-z)\|\geq\frac{\mu_{k}\sqrt{pn}}{7\sqrt{\ln(e/p)}}-C_{1}\sqrt{pn}\varepsilon\geq\frac{\mu_{k}\sqrt{pn}}{10\sqrt{\ln(e/p)}},

where we have chosen

ε=μk/(50C1ln(e/p))8c0μk/ln(e/p),\varepsilon=\mu_{k}/(50C_{1}\sqrt{\ln(e/p)})\geq 8c_{0}\mu_{k}/\sqrt{\ln(e/p)},

provided that c01/40000c_{0}\leq 1/40000. Using the union bound and our choice of ε\varepsilon we obtain that

Pk:=(Mnrm and x4,k with Mxμkpn10ln(e/p))e1.6pn|𝒩k|e1.55pnP_{k}:=\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{4,k}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{\mu_{k}\sqrt{pn}}{10\sqrt{\ln(e/p)}}\Bigr{)}\leq e^{-1.6pn}|\mathcal{N}_{k}|\leq e^{-1.55pn}

for large enough nn and for mcpn/ln(e/p)m\leq cpn/\ln(e/p), where c>0c>0 is a small enough absolute constant. Since slnns\leq\ln n and μkμ1βp/(6pn)=1/(6C0np)\mu_{k}\geq\mu_{1}\geq\beta_{p}/(6pn)=1/(6C_{0}n\sqrt{p}), we obtain

(Mnrm and x4 with Mx160C0nln(e/p))k=1s1Pke1.5pn.\mathbb{P}\Bigl{(}M\in{\mathcal{E}}_{nrm}\quad\mbox{ and }\quad\exists x\in\mathcal{B}_{4}\,\,\mbox{ with }\,\,\|Mx\|\leq\frac{1}{60C_{0}\sqrt{n\ln(e/p)}}\Bigr{)}\leq\sum_{k=1}^{s-1}P_{k}\leq e^{-1.5pn}.

This completes the proof. ∎

Proof of Theorem 5.1.

Lemmas 5.3, 5.4, 5.6, and 5.7 imply that

()n(1p)n+3e1.5np+(cnrm),{\mathbb{P}}({\mathcal{E}})\leq n(1-p)^{n}+3e^{-1.5np}+\mathbb{P}({\mathcal{E}}^{c}_{nrm}),

where {\mathcal{E}} denotes the event from Theorem 5.1. Lemma 3.6 applied with t=30t=30 and (11) imply that (cnrm)e10pn,\mathbb{P}({\mathcal{E}}^{c}_{nrm})\leq e^{-10pn}, provided that pnpn is large enough. Since e^{-10pn}\leq e^{-1.5np}, the claimed bound follows. This completes the proof. ∎

5.3 Proof of Theorem 5.2

Proof.

We prove the statement with r=r(q)=cq/ln(e/q)r=r(q)=cq/\ln(e/q), where cc is the constant from Theorem 5.1, and ρ=ρ(q)=c0rβq/(6ln(e/q))\rho=\rho(q)=c_{0}\sqrt{r}\beta_{q}/(6\sqrt{\ln(e/q)}). Note that under our choice of parameters (and assuming c1c_{1} is small), 9δ/2c0βq/ln(e/q)c0βp/ln(e/p)9\delta/2\leq c_{0}\beta_{q}/\sqrt{\ln(e/q)}\leq c_{0}\beta_{p}/\sqrt{\ln(e/p)}.

Assume that xΥn(r)𝒱nx\in{\Upsilon}_{n}(r)\setminus{\mathcal{V}}_{n}. By (xi#)i(x_{i}^{\#})_{i} denote the non-increasing rearrangement of (xi)i(x_{i})_{i} (we would like to emphasize that we do not take absolute values). Note that for any t>0t>0 there are two subsets Q1,Q2[n]Q_{1},Q_{2}\subset[n] with |Q1|,|Q2|δn|Q_{1}|,|Q_{2}|\geq\lceil\delta n\rceil satisfying maxiQ2ximiniQ1xit\max\limits_{i\in Q_{2}}x_{i}\leq\min\limits_{i\in Q_{1}}x_{i}-t if and only if xδn#xnδn+1#tx_{\lceil\delta n\rceil}^{\#}-x_{n-\lceil\delta n\rceil+1}^{\#}\geq t. This leads to the two following cases.

Case 1. xδn#xnδn+1#ρx_{\lceil\delta n\rceil}^{\#}-x_{n-\lceil\delta n\rceil+1}^{\#}\geq\rho. Since x𝒱nx\notin{\mathcal{V}}_{n}, in this case there exists an index jnj\leq n with xj>(2n/j)3/2x_{j}^{*}>(2n/j)^{3/2}. Note that since xrn=1x^{*}_{\lfloor rn\rfloor}=1, we have j<rn=3δnj<rn=3\delta n.

Subcase 1a. 1<j<3δn1<j<3\delta n. Since xj>(2n/j)3/2x_{j}^{*}>(2n/j)^{3/2} we get

\|Qx\|^{2}\geq\sum_{i=2}^{j}(x_{i}^{*})^{2}\geq(j-1)\,(2n/j)^{3}\geq\frac{j}{2}\,(2n/j)^{3}=n(2n/j)^{2}.

Therefore,

xrn+1Qx1nj2n(3δ/2)n.\frac{x^{*}_{\lfloor rn\rfloor+1}}{\|Qx\|}\leq\frac{1}{\sqrt{n}}\,\frac{j}{2n}\leq\frac{(3\delta/2)}{\sqrt{n}}.

Now let y=x/xy=x/\|x\|. Then

yrn+1=xrn+1x3δ/2nQxx=3δ/2nQy.y^{*}_{\lfloor rn\rfloor+1}=\frac{x^{*}_{\lfloor rn\rfloor+1}}{\|x\|}\leq\frac{3\delta/2}{\sqrt{n}}\,\frac{\|Qx\|}{\|x\|}=\frac{3\delta/2}{\sqrt{n}}\,\|Qy\|. (25)

Our goal is to show that y(p,m)y\in\mathcal{B}(p,m) (with m=rnm=\lfloor rn\rfloor).

If y1(p)y\in\mathcal{B}_{1}(p), we are done.

Otherwise, if y2y\in\mathcal{B}_{2}^{\prime}, then (25) implies that yrn+1c0/ny^{*}_{\lfloor rn\rfloor+1}\leq c_{0}/\sqrt{n}, that is, there are at least nmn-m coordinates at the distance at most c0/nc_{0}/\sqrt{n} from zero. Thus yU(m,c0)y\in U(m,c_{0}) and hence y2y\in\mathcal{B}_{2}.

If y12y\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime} and y2βpQyy^{*}_{2}\leq\beta_{p}\|Qy\|, then necessarily λkQy<λk+13λk\lambda_{k}\leq\|Qy\|<\lambda_{k+1}\leq 3\lambda_{k} for some kk, where λk,λk+1\lambda_{k},\lambda_{k+1} are defined according to (22). Then (25) implies that yrn+1c0λk/ny^{*}_{\lfloor rn\rfloor+1}\leq c_{0}\lambda_{k}/\sqrt{n}, that is, there are at least nmn-m coordinates at the distance at most c0λk/nc_{0}\lambda_{k}/\sqrt{n} from zero. Thus yU(m,c0λk)y\in U(m,c_{0}\lambda_{k}) and hence y3,ky\in\mathcal{B}_{3,k}.

If y12y\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime} and y2>βpQyy^{*}_{2}>\beta_{p}\|Qy\| then necessarily μky2<μk+13μk\mu_{k}\leq y_{2}^{*}<\mu_{k+1}\leq 3\mu_{k}, where μk,μk+1\mu_{k},\mu_{k+1} are given by (23). Then, similarly,

yrn+13δ/2nQy3δ/2ny2βp9δ/2βpnμkc0μkln(e/p)n.y^{*}_{\lfloor rn\rfloor+1}\leq\frac{3\delta/2}{\sqrt{n}}\,\|Qy\|\leq\frac{3\delta/2}{\sqrt{n}}\,\frac{y_{2}^{*}}{\beta_{p}}\leq\frac{9\delta/2}{\beta_{p}\sqrt{n}}\,\mu_{k}\leq\frac{c_{0}\mu_{k}}{\sqrt{\ln(e/p)}\sqrt{n}}.

This implies that yU(m,c0μk/ln(e/p))y\in U(m,c_{0}\mu_{k}/\sqrt{\ln(e/p)}) and, thus, y4,ky\in\mathcal{B}_{4,k}.

Subcase 1b. j=1j=1. In this case x1(2n)3/2x_{1}^{*}\geq(2n)^{3/2}. Assume x1x\not\in\mathcal{B}_{1}, that is x1<6pnx2x_{1}^{*}<6pnx_{2}^{*}. Then

xrn+1Qx1x26pn(2n)3/2=6p23/2n.\frac{x^{*}_{\lfloor rn\rfloor+1}}{\|Qx\|}\leq\frac{1}{x_{2}^{*}}\leq\frac{6pn}{(2n)^{3/2}}=\frac{6p}{2^{3/2}\sqrt{n}}.

We can now define y:=x/xy:=x/\|x\| and, having noted that yrn+16p23/2nQyy^{*}_{\lfloor rn\rfloor+1}\leq\frac{6p}{2^{3/2}\sqrt{n}}\,\|Qy\|, proceed similarly to the Subcase 1a. We will need to use the condition 18p23/2c0βp/ln(e/p)18p\leq 2^{3/2}c_{0}\beta_{p}/\sqrt{\ln(e/p)}, which holds for small enough pp.
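For completeness we note that this last condition indeed holds for all sufficiently small pp: since \beta_{p}=\sqrt{p}/C_{0}, it can be rewritten as

\sqrt{p\,\ln(e/p)}\leq\frac{2^{3/2}c_{0}}{18\,C_{0}},

and the left-hand side tends to 00 as p\to 0^{+}.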

Case 2. xδn#xnδn+1#<ρx_{\lceil\delta n\rceil}^{\#}-x_{n-\lceil\delta n\rceil+1}^{\#}<\rho. Let σ\sigma be a permutation of [n][n] such that xi#=xσ(i)x_{i}^{\#}=x_{\sigma(i)}, ini\leq n (note that σ\sigma is in general different from the permutation σx\sigma_{x} defined in connection with the non-increasing rearrangement of the absolute values |xi||x_{i}|). Define the following set, which will play the role of the set in the definition of U(m,γ)U(m,\gamma) (see Subsection 3.6),

A:={σ(i):δn<inδn}.A:=\{\sigma(i):\,\,\lceil\delta n\rceil<i\leq n-\lceil\delta n\rceil\}.

Then |A|=n2δn|A|=n-2\lceil\delta n\rceil, and m>2δn=2rn/3m>2\lceil\delta n\rceil=2\lceil rn/3\rceil. Since xm=1x^{*}_{m}=1, we observe that either xδn+1#1x_{\lceil\delta n\rceil+1}^{\#}\geq 1 or xnδn#1x_{n-\lceil\delta n\rceil}^{\#}\leq-1 (or both). Moreover, since r<1/2r<1/2, we necessarily have that |x#i|1|x^{\#}_{i}|\leq 1 for some δn<inδn\lceil\delta n\rceil<i\leq n-\lceil\delta n\rceil. Therefore, there exists an index jAj\in A such that |xj|=1|x_{j}|=1. Taking b=xjb=x_{j}, we observe that for every iAi\in A, |xib|<ρ|x_{i}-b|<\rho. On the other hand we have

\|x\|^{2}\geq\|Qx\|^{2}\geq\sum_{i=2}^{m}(x_{i}^{*})^{2}\geq m-1\geq m/2\quad\mbox{ and }\quad\forall i\in A\,:\,\frac{|x_{i}-b|}{\|Qx\|}\leq\frac{\sqrt{2}\,\rho}{\sqrt{m}}\leq\frac{1}{\sqrt{n}}\,\,\frac{2\rho}{\sqrt{r}}.

Now let y=x/xy=x/\|x\|. Then

iA:|yibx|=|xib|QxQxx1n2ρrQy.\forall i\in A\,:\,\left|y_{i}-\frac{b}{\|x\|}\right|=\frac{|x_{i}-b|}{\|Qx\|}\,\frac{\|Qx\|}{\|x\|}\leq\frac{1}{\sqrt{n}}\,\,\frac{2\rho}{\sqrt{r}}\,\|Qy\|. (26)

The end of the proof is similar to the end of the proof of Case 1. If y1y\in\mathcal{B}_{1}, we are done. If y2y\in\mathcal{B}_{2}^{\prime}, then using (26), Qyy=1\|Qy\|\leq\|y\|=1, and 6ρ/rc06\rho/\sqrt{r}\leq c_{0} we obtain that yU(m,c0)y\in U(m,c_{0}) and, thus, y2y\in\mathcal{B}_{2}. If y12y\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime}, y2βpQyy^{*}_{2}\leq\beta_{p}\|Qy\|, and λkQy<λk+13λk\lambda_{k}\leq\|Qy\|<\lambda_{k+1}\leq 3\lambda_{k} then, using (26) and 6ρ/rc06\rho/\sqrt{r}\leq c_{0} we obtain that yU(m,c0λk)y\in U(m,c_{0}\lambda_{k}) and, thus, y3,ky\in\mathcal{B}_{3,k}. If y12y\not\in\mathcal{B}_{1}\cup\mathcal{B}_{2}^{\prime}, y2βpQyy^{*}_{2}\geq\beta_{p}\|Qy\|, and μky2<μk+13μk\mu_{k}\leq y_{2}^{*}<\mu_{k+1}\leq 3\mu_{k} then, similarly, using (26) and 6ρ/rc0βp/ln(e/p)6\rho/\sqrt{r}\leq c_{0}\beta_{p}/\sqrt{\ln(e/p)}, we obtain that yU(m,c0μk/ln(e/p))y\in U(m,c_{0}\mu_{k}/\sqrt{\ln(e/p)}) and, thus, y4,ky\in\mathcal{B}_{4,k}. This completes the proof. ∎

6 Complement of gradual non-constant vectors: general case

We split n{\mathbb{R}}^{n} into two classes of vectors. The first class, the class of steep vectors 𝒯\mathcal{T}, is constructed in essentially the same way as in [24] and [27]. The proof of the bound for this class resembles the corresponding proofs in [24] and [27]; however, due to the differences in the models of randomness, there are important modifications. The second class {\mathcal{R}}, whose elements we call {\mathcal{R}}-vectors, will consist of vectors to which Proposition 3.10 can be applied; therefore, dealing with this class is simpler. To control the cardinality of nets, part of this class will be intersected with the set of almost constant vectors. Then we show that the complement of 𝒱n(r,𝐠,δ,ρ){\mathcal{V}}_{n}(r,{\bf g},\delta,\rho) in Υn(r){\Upsilon}_{n}(r) is contained in 𝒯\mathcal{T}\cup{\mathcal{R}}.

We now introduce the following parameters, which will be used throughout this section. It will be convenient to denote d=pnd=pn. We always assume that p0.0001p\leq 0.0001 and nn is large enough (that is, larger than a certain positive absolute constant). We also always assume that the “average degree” d=pn200lnnd=pn\geq 200\ln n. Fix a sufficiently small absolute positive constant rr and a sufficiently large absolute positive constant CτC_{\tau} (we do not try to estimate the actual values of rr and CτC_{\tau}; the conditions on how small rr must be can be extracted from the proofs, and, in particular, the condition on CτC_{\tau} comes from (38)). We also fix two positive integers 0\ell_{0} and s0s_{0} such that

0=pn4ln(1/p) and 0s01164p=n64d<0s0.\ell_{0}=\left\lfloor\frac{pn}{4\ln(1/p)}\right\rfloor\quad\mbox{ and }\quad\ell_{0}^{s_{0}-1}\leq\frac{1}{64p}=\frac{n}{64d}<\ell_{0}^{s_{0}}. (27)

Note that 050\ell_{0}\geq 50 and that s0>1s_{0}>1 implies pc(lnn)/np\leq c\sqrt{(\ln n)/n}.

For 1js01\leq j\leq s_{0} we set

n0:=2,nj:=300j1,ns0+2:=n/p=nd, and ns0+3:=rn.n_{0}:=2,\,\,\,\,\,\,\,\,n_{j}:=30\ell_{0}^{j-1},\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,n_{s_{0}+2}:=\left\lfloor\sqrt{n/p}\right\rfloor=\left\lfloor\frac{n}{\sqrt{d}}\right\rfloor,\,\,\,\,\,\,\,\,\mbox{ and }\,\,\,\,\,\,\,\,n_{s_{0}+3}:=\lfloor rn\rfloor.

Then, in the case 1/(64p)15ns0\left\lfloor 1/(64p)\right\rfloor\geq 15n_{s_{0}} we set ns0+1=1/(64p)n_{s_{0}+1}=\left\lfloor 1/(64p)\right\rfloor. Otherwise, let ns0+1=ns0n_{s_{0}+1}=n_{s_{0}}. Note that with this definition we always have ns0+2>ns0+1n_{s_{0}+2}>n_{s_{0}+1} (indeed, in both cases n_{s_{0}+1}\leq 30/(64p)=30n/(64d)<n/(2\sqrt{d})\leq\lfloor n/\sqrt{d}\rfloor=n_{s_{0}+2}, since d is large). The indices njn_{j}, js0+3j\leq s_{0}+3, are global parameters which will be used throughout the section. Below we provide the proof only for the case 1/(64p)=ns0+115ns0\left\lfloor 1/(64p)\right\rfloor=n_{s_{0}+1}\geq 15n_{s_{0}}; the other case is treated similarly (in particular, in that other case the set 𝒯1(s0+1)\mathcal{T}_{1(s_{0}+1)} defined below will be empty).

We also will use another parameter,

κ=κ(p):=ln(6pn)ln0.\kappa=\kappa(p):=\frac{\ln(6pn)}{\ln\ell_{0}}. (28)

Note that the function f(p)=ln(6pn)/(4ln(1/p))f(p)=\ln(6pn)/(4\ln(1/p)) is a decreasing function on (0,1)(0,1); therefore, for p(100lnn)/np\geq(100\ln n)/n and sufficiently large nn we have 1<κlnlnn1<\kappa\leq\ln\ln n. Moreover, it is easy to see that if p(100ln2n)/np\geq(100\ln^{2}n)/n, then κ2\kappa\leq 2. We also notice that if pn6(5lnn)1+γpn\geq 6(5\ln n)^{1+\gamma} for some γ(0,1)\gamma\in(0,1) then κ1+1/γ\kappa\leq 1+1/\gamma and, using the definitions of 0\ell_{0} and s0s_{0},

(6d)s01=0(s01)κ1/(64p)κ.(6d)^{s_{0}-1}=\ell_{0}^{(s_{0}-1)\kappa}\leq 1/(64p)^{\kappa}. (29)
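For orientation, let us illustrate these parameters on the simplest example, a constant probability p\in(0,0.0001] (this computation is purely illustrative and is not used below). Then \ell_{0}=\lfloor pn/(4\ln(1/p))\rfloor\to\infty as n\to\infty, while 1/(64p) stays bounded; hence, for nn large,

\ell_{0}^{0}=1\leq\frac{1}{64p}<\ell_{0},\qquad\mbox{ so that }\qquad s_{0}=1,

and, since both \ln(6pn) and \ln\ell_{0} are equal to (1+o_{n}(1))\ln n,

\kappa=\frac{\ln(6pn)}{\ln\ell_{0}}\longrightarrow 1\quad\mbox{ as }n\to\infty.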

6.1 Two classes of vectors and main results

We first introduce the class of steep vectors. It will be constructed as a union of four subclasses. Set

𝒯0:={xn:x1>6dx2} and 𝒯11:={xn:x𝒯0 and x2>6dxn1}.\mathcal{T}_{0}:=\{x\in{\mathbb{R}}^{n}\,:\,x_{1}^{*}>6d\,x_{2}^{*}\}\quad\mbox{ and }\quad\mathcal{T}_{11}:=\{x\in{\mathbb{R}}^{n}\,:\,x\not\in\mathcal{T}_{0}\,\,\mbox{ and }\,\,x_{2}^{*}>6d\,x_{n_{1}}^{*}\}.

Then for 2js0+12\leq j\leq s_{0}+1,

𝒯1j:={xn:x𝒯0i=1j1𝒯1i and xnj1>6dxnj} and 𝒯1:=i=1s0+1𝒯1i.\mathcal{T}_{1j}:=\left\{x\in{\mathbb{R}}^{n}\,:\,x\not\in\mathcal{T}_{0}\cup\bigcup_{i=1}^{j-1}\mathcal{T}_{1i}\,\,\mbox{ and }\,\,x_{n_{j-1}}^{*}>6d\,x_{n_{j}}^{*}\right\}\quad\mbox{ and }\quad\mathcal{T}_{1}:=\bigcup_{i=1}^{s_{0}+1}\mathcal{T}_{1i}.

Finally, for k=2,3k=2,3 set j=j(k)=s0+kj=j(k)=s_{0}+k and define

𝒯k:={xn:xi=0k1𝒯i and xnj1>Cτdxnj}.\mathcal{T}_{k}:=\left\{x\in{\mathbb{R}}^{n}\,:\,x\not\in\bigcup_{i=0}^{k-1}\mathcal{T}_{i}\,\,\mbox{ and }\,\,x_{n_{j-1}}^{*}>C_{\tau}\sqrt{d}\,x_{n_{j}}^{*}\right\}.

The set of steep vectors is 𝒯:=𝒯0𝒯1𝒯2𝒯3\mathcal{T}:=\mathcal{T}_{0}\cup\mathcal{T}_{1}\cup\mathcal{T}_{2}\cup\mathcal{T}_{3}. [Figure omitted: diagram summarizing the “rules” of the partition.]
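It will be convenient to keep in mind the following immediate reformulation of the definitions (it is used, in particular, in (33) below): a vector x\in{\mathbb{R}}^{n} does not belong to \mathcal{T} if and only if

x_{1}^{*}\leq 6d\,x_{2}^{*},\qquad x_{n_{j-1}}^{*}\leq 6d\,x_{n_{j}}^{*}\ \ (1\leq j\leq s_{0}+1),\qquad x_{n_{s_{0}+1}}^{*}\leq C_{\tau}\sqrt{d}\,x_{n_{s_{0}+2}}^{*},\qquad x_{n_{s_{0}+2}}^{*}\leq C_{\tau}\sqrt{d}\,x_{n_{s_{0}+3}}^{*}.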

For this class we prove the following bound.

Theorem 6.1.

There exist positive absolute constants c,C>0c,C>0 such that the following holds. Let nCn\geq C, and let 0<p<c0<p<c satisfy pnClnnpn\geq C\ln n. Let MM be a Bernoulli(pp) random matrix and denote

steep:={x𝒯 such that Mx<c(64p)κ(pn)2min(1,1p1.5n)x},{\mathcal{E}}_{steep}:=\bigg{\{}\exists\;x\in\mathcal{T}\,\,\,\mbox{ such that }\,\,\,\|Mx\|<\frac{c(64p)^{\kappa}}{(pn)^{2}}\,\min\left(1,\frac{1}{p^{1.5}n}\right)\,\,\|x\|\bigg{\}},

where as before κ=κ(p):=(ln(6pn))/ln0.\kappa=\kappa(p):=(\ln(6pn))/\ln\ell_{0}. Then

(steep)n(1p)n+2e1.4pn.{\mathbb{P}}({\mathcal{E}}_{steep})\leq n(1-p)^{n}+2e^{-1.4pn}.

Next we introduce the class of {\mathcal{R}}-vectors, denoted by {\mathcal{R}}. Let C0C_{0} be the constant from Proposition 3.10 and recall that the class 𝒜𝒞(ρ)\mathcal{AC}(\rho) of almost constant vectors was defined by (9) in Subsection 2.2. Given ns0+1<kn/ln2dn_{s_{0}+1}<k\leq n/\ln^{2}d denote A=A(k):=[k,n]A=A(k):=[k,n] and consider the sets

k1:={x(Υn(r)𝒯)𝒜𝒞(ρ):xσx(A)xσx(A)C0p and n/2xσx(A)Cτdn},{\mathcal{R}}_{k}^{1}:=\left\{x\in\big{(}{\Upsilon}_{n}(r)\setminus\mathcal{T}\big{)}\cap\mathcal{AC}(\rho)\,\,:\,\,\frac{\|x_{\sigma_{x}(A)}\|}{\|x_{\sigma_{x}(A)}\|_{\infty}}\geq\frac{C_{0}}{\sqrt{p}}\quad\mbox{ and }\quad\sqrt{n/2}\leq\|x_{\sigma_{x}(A)}\|\leq C_{\tau}\sqrt{dn}\right\},

and

k2:={xΥn(r)𝒯:xσx(A)xσx(A)C0p and 2nrxσx(A)Cτ2dn}.{\mathcal{R}}_{k}^{2}:=\left\{x\in{\Upsilon}_{n}(r)\setminus\mathcal{T}\,\,:\,\,\frac{\|x_{\sigma_{x}(A)}\|}{\|x_{\sigma_{x}(A)}\|_{\infty}}\geq\frac{C_{0}}{\sqrt{p}}\quad\mbox{ and }\quad\frac{2\sqrt{n}}{r}\leq\|x_{\sigma_{x}(A)}\|\leq C_{\tau}^{2}d\sqrt{n}\right\}.

Define :=ns0+1<kn/ln2d(k1k2).{\mathcal{R}}:=\bigcup_{n_{s_{0}+1}<k\leq n/\ln^{2}d}\,({\mathcal{R}}_{k}^{1}\cup{\mathcal{R}}_{k}^{2}).

The class {\mathcal{R}} should be thought of as the class of sufficiently spread vectors, not steep, but possibly without having two subsets of coordinates of size proportional to nn, which are separated by ρ\rho (which would allow us to treat those vectors as part of the set 𝒱n{\mathcal{V}}_{n}). Crucially, the sets k1{\mathcal{R}}_{k}^{1} and k2{\mathcal{R}}_{k}^{2} are “low complexity” sets because they admit ε\varepsilon–nets of relatively small cardinalities (see Subsection 6.3). For the class {\mathcal{R}} we prove the following bound.

Theorem 6.2.

There are absolute constants r0,ρ0,Cr_{0},\rho_{0},C with the following property. Let 0<rr00<r\leq r_{0}, 0<ρρ00<\rho\leq\rho_{0}, let n1n\geq 1 and p(0,0.001]p\in(0,0.001] be such that d=pnClnnd=pn\geq C\ln n. Then

({x:Mxpn12C0})e2n+e200pn.\mathbb{P}\left(\left\{\exists x\in{\mathcal{R}}\,:\,\|Mx\|\leq\frac{\sqrt{p}n}{12C_{0}}\right\}\right)\leq e^{-2n}+e^{-200pn}.

Finally we show that, together with 𝒱n{\mathcal{V}}_{n}, the classes 𝒯\mathcal{T} and {\mathcal{R}} cover all (properly normalized) vectors for the growth function defined by

𝐠(t)=(2t)3/2 for  1t<64pn and 𝐠(t)=exp(ln2(2t)) for t64pn.{\bf g}(t)=(2t)^{3/2}\,\,\,\mbox{ for }\,1\leq t<64pn\quad\quad\mbox{ and }\quad\quad{\bf g}(t)=\exp(\ln^{2}(2t))\,\,\,\mbox{ for }\,t\geq 64pn. (30)

It is straightforward to check that 𝐠{\bf g} satisfies (8) with some absolute constant K3K_{3}.
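In particular, 𝐠{\bf g} is non-decreasing at the junction point t=64pn: for every t\geq 64pn one has 2t\geq 128\,pn>e^{3/2}, and therefore

\exp(\ln^{2}(2t))\geq\exp\Big(\frac{3}{2}\ln(2t)\Big)=(2t)^{3/2}.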

Theorem 6.3.

There are universal constants c,C>0c,C>0 with the following property. Let nCn\geq C, p(0,c)p\in(0,c), and assume that d=pn100lnnd=pn\geq 100\ln n. Let r(0,1/2)r\in(0,1/2), δ(0,r/3)\delta\in(0,r/3), ρ(0,1)\rho\in(0,1), and let 𝐠{\bf g} be as in (30). Then

Υn(r)𝒱n(r,𝐠,δ,ρ)𝒯.{\Upsilon}_{n}(r)\setminus{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\subset{\mathcal{R}}\cup\mathcal{T}.

6.2 Auxiliary lemmas

In the following lemma we provide a simple bound on the Euclidean norms of vectors in the class 𝒯\mathcal{T} and its complement in terms of their order statistics.

Lemma 6.4.

Let nn be large enough and (200lnn)/n<p<0.001(200\ln n)/n<p<0.001. Consider the vectors x𝒯1jx\in\mathcal{T}_{1j} for some 1js0+11\leq j\leq s_{0}+1, y𝒯2y\in\mathcal{T}_{2}, z𝒯3z\in\mathcal{T}_{3} and w𝒯cw\in\mathcal{T}^{c}. Then

xxnj164(pn)2(64p)κ,yyns0+1384(pn)3(64p)κ,zzns0+2384Cτ(pn)3.5(64p)κ, and wwns0+3384Cτ2(pn)4(64p)κ.\frac{\|x\|}{x_{n_{j-1}}^{*}}\leq\frac{64(pn)^{2}}{(64p)^{\kappa}},\quad\frac{\|y\|}{y_{n_{s_{0}+1}}^{*}}\leq\frac{384(pn)^{3}}{(64p)^{\kappa}},\quad\frac{\|z\|}{z_{n_{s_{0}+2}}^{*}}\leq\frac{384C_{\tau}(pn)^{3.5}}{(64p)^{\kappa}},\quad\mbox{ and }\quad\frac{\|w\|}{w_{n_{s_{0}+3}}^{*}}\leq\frac{384C_{\tau}^{2}(pn)^{4}}{(64p)^{\kappa}}.
Proof.

Let d=pnd=pn. Since x𝒯1jx\in\mathcal{T}_{1j}, denoting m=nj1m=n_{j-1}, we have

x1(6d)x2(6d)2xn1(6d)jxnj1=(6d)jxm.x_{1}^{*}\leq(6d)x^{*}_{2}\leq(6d)^{2}x^{*}_{n_{1}}\leq\ldots\leq(6d)^{j}x_{n_{j-1}}^{*}=(6d)^{j}x_{m}^{*}.

Since ni=300i130di1n_{i}=30\ell_{0}^{i-1}\leq 30d^{i-1}, is0i\leq s_{0}, since κ>1\kappa>1, and in view of (29), we obtain

\|x\|^{2}=(x_{1}^{*})^{2}+((x_{2}^{*})^{2}+\ldots+(x_{n_{1}}^{*})^{2})+((x_{n_{1}+1}^{*})^{2}+\dots+(x_{n_{2}}^{*})^{2})+\ldots
\leq((6d)^{2j}+n_{1}(6d)^{2(j-1)}+n_{2}(6d)^{2(j-2)}+\ldots+n_{j-1}(6d)^{2}+n)(x_{m}^{*})^{2}
\leq\Big{(}(6d)^{2j}+30(6d)^{2j-2}\sum_{i\geq 0}(6d)^{-i}+n\Big{)}(x_{m}^{*})^{2}\leq\left(2(6d)^{2(s_{0}+1)}+n\right)(x_{m}^{*})^{2}
\leq\left(2(6d)^{4}/(64p)^{2\kappa}+n\right)(x_{m}^{*})^{2}\leq\left(3(6d)^{4}/(64p)^{2\kappa}\right)(x_{m}^{*})^{2}.

This implies the first bound. The bounds for y,z,wy,z,w are obtained similarly. ∎
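For the reader’s convenience we indicate how, for instance, the bound for yy is obtained (the bounds for zz and ww are analogous). If y\in\mathcal{T}_{2}, then y\not\in\mathcal{T}_{0}\cup\mathcal{T}_{1}, hence

y_{1}^{*}\leq(6d)y_{2}^{*}\leq(6d)^{2}y_{n_{1}}^{*}\leq\ldots\leq(6d)^{s_{0}+2}y_{n_{s_{0}+1}}^{*},

and the computation above, now carried out up to the index n_{s_{0}+1} and combined with (29), yields \|y\|^{2}\leq 3(6d)^{6}(64p)^{-2\kappa}\,(y_{n_{s_{0}+1}}^{*})^{2}, that is, \|y\|\leq\sqrt{3}\,(6d)^{3}(64p)^{-\kappa}\,y_{n_{s_{0}+1}}^{*}\leq 384(pn)^{3}(64p)^{-\kappa}\,y_{n_{s_{0}+1}}^{*}.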

The next two Lemmas 6.5 and 6.6 will be used to bound from below the norm of the matrix-vector product MxMx for vectors xx with a “too large” almost constant part, which does not allow us to apply directly the Lévy–Kolmogorov–Rogozin anti-concentration inequality together with the tensorization argument. Lemma 6.5 will be used to bound Mx\|Mx\| by a single inner product |𝐑i(M),x||\langle{\bf R}_{i}(M),x\rangle| for a specially chosen index ii, while Lemma 6.6 will allow us to extract a subset of “good” rows having large inner products with xx.

Lemma 6.5.

Let n30n\geq 30 and 0<p<0.0010<p<0.001 satisfy pn200lnnpn\geq 200\ln n. Let m,=(m)2m,\ell=\ell(m)\geq 2 be such that either

m=2 and =15,m=2\mbox{ and }\ell=15,

or

m30,m164pandnp4ln1pm.m\geq 30,\quad\ell m\leq\frac{1}{64p}\quad\quad\mbox{and}\quad\quad\ell\leq\frac{np}{4\ln\frac{1}{pm}}.

Let MM be an n×nn\times n Bernoulli(pp) random matrix. By col=col(,m){\mathcal{E}}_{col}={\mathcal{E}}_{col}(\ell,m) denote the event that for any choice of two disjoint sets J1,J2[n]J_{1},J_{2}\subset[n] of cardinality |J1|=m|J_{1}|=m, |J2|=mm|J_{2}|=\ell m-m there exists a row of MM with exactly one 11 among components indexed by J1J_{1} and no 11s among components indexed by J2J_{2}. Then (col)1exp(1.5pn).\mathbb{P}({\mathcal{E}}_{col})\geq 1-\exp(-1.5pn).

Proof.

We first treat the case m30m\geq 30. Fix two disjoint sets J1,J2[n]J_{1},J_{2}\subset[n] of required cardinality. The probability that a fixed row has exactly one 11 among components indexed by J1J_{1} and no 11s among components indexed by J2J_{2} equals

q:=mp(1p)m1mpexp(2pm)29mp/30,q:=mp(1-p)^{\ell m-1}\geq mp\exp(-2p\ell m)\geq 29mp/30,

where we used mp1/64\ell mp\leq 1/64. Since the rows are independent, the probability that MM does not have such a row is

(1q)nexp(nq)exp(29mpn/30).(1-q)^{n}\leq\exp(-nq)\leq\exp(-29mpn/30).

Note that the number of all choices of J1J_{1} and J2J_{2} satisfying the conditions of the lemma is

(nmm)(nm+mm)(en(1)m)mm(enm)m(3nm)m(2)m.{n\choose\ell m-m}{n-\ell m+m\choose m}\leq\left(\frac{en}{(\ell-1)m}\right)^{\ell m-m}\left(\frac{en}{m}\right)^{m}\leq\left(\frac{3n}{\ell m}\right)^{\ell m}(2\ell)^{m}.

Thus union bound over all choices of J1J_{1} and J2J_{2} implies

((col)c)(3nm)m(2)mexp(29mpn/30).\mathbb{P}(({\mathcal{E}}_{col})^{c})\leq\left(\frac{3n}{\ell m}\right)^{\ell m}(2\ell)^{m}\exp(-29mpn/30).

Using that m1/(64p)m\leq 1/(64p) and np4ln(1/(pm))\ell\leq\frac{np}{4\ln(1/(pm))}, we observe (3nm)mexp(mpn/2).\left(\frac{3n}{\ell m}\right)^{\ell m}\leq\exp(mpn/2). Since np200lnnnp\geq 200\ln n, we have (2)mexp(2mpn/5)(2\ell)^{m}\leq\exp(2mpn/5). Thus,

((col)c)exp(mpn/15)exp(2pn),\mathbb{P}(({\mathcal{E}}_{col})^{c})\leq\exp(-mpn/15)\leq\exp(-2pn),

which proves this case.

The case m=2m=2, =15\ell=15 is similar. Fixing two disjoint sets J1,J2[n]J_{1},J_{2}\subset[n] of the required cardinality, the probability that a fixed row has exactly one 11 among components indexed by J1J_{1} and no 11s among components indexed by J2J_{2} equals

q:=2p(1p)292pexp(29p).q:=2p(1-p)^{29}\geq 2p\exp(-29p).

Since rows are independent, the probability that MM does not have such a row is

(1q)n(12pexp(29p))nexp(2pnexp(29p))exp(1.8pn).(1-q)^{n}\leq(1-2p\exp(-29p))^{n}\leq\exp(-2pn\exp(-29p))\leq\exp(-1.8pn).

Using union bound over all choices of J1J_{1} and J2J_{2} we obtain

\mathbb{P}(({\mathcal{E}}_{col})^{c})\leq\frac{n^{30}}{2\cdot 28!}\exp(-1.8pn)\leq\exp(-1.5pn),

which proves the lemma. ∎

In the next lemma we restrict a matrix to a certain set of columns and estimate the cardinality of a set of rows having exactly one 11. To be more precise, for any J[n]J\subset[n] and a 0/10/1 matrix MM denote

IJ=I(J,M):={in:|supp𝐑i(M)J|=1}.I_{J}=I(J,M):=\{i\leq n:\,|{\rm supp\,}{\bf R}_{i}(M)\cap J|=1\}.
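For illustration (a toy example, not used in the arguments): if n=5, J=\{2,4\}, and the i-th row of M has support \{1,2,5\}, then |{\rm supp\,}{\bf R}_{i}(M)\cap J|=1 and i\in I_{J}; if instead the support is \{2,4,5\}, then the intersection contains two elements and i\notin I_{J}.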

The following statement is similar to Lemma 2.7 from [24] and Lemma 3.6 in [27].

Lemma 6.6.

Let 1\ell\geq 1 be an integer and p(0,1/2]p\in(0,1/2] be such that p1/32p\ell\leq 1/32. Let MM be a Bernoulli(pp) random matrix. Then with probability at least

12(n)exp(np/4)1-2{n\choose\ell}\exp\left(-n\ell p/4\right)

for every J[n]J\subset[n] of cardinality \ell one has

pn/16|I(J,M)|2np.\ell pn/16\leq|I(J,M)|\leq 2\ell np.

In particular, if =21/(64p)n\ell=2\lfloor 1/(64p)\rfloor\leq n, n105n\geq 10^{5}, and p[100/n,0.001]p\in[100/n,0.001] then, denoting

card=card():={Mn:J[n]with|J|=one has|I(J,M)|[pn/16,2pn]},{\mathcal{E}}_{card}={\mathcal{E}}_{card}(\ell):=\{M\in{\mathcal{M}_{n}}\,:\,\forall J\subset[n]\,\,\,\mbox{with}\,\,\,|J|=\ell\,\,\,\mbox{one has}\,\,\,|I(J,M)|\in[\ell pn/16,2\ell pn]\},

we have

(card)12exp(n/500).\mathbb{P}\left({\mathcal{E}}_{card}\right)\geq 1-2\exp\left(-n/500\right).
Proof.

Fix J[n]J\subset[n] of cardinality \ell. Denote q=p(1p)1q=\ell p(1-p)^{\ell-1}. Since p1/32\ell p\leq 1/32,

15p/16p(12p)pexp(2p)qp1/2.15\ell p/16\leq\ell p(1-2p\ell)\leq\ell p\exp(-2p\ell)\leq q\leq\ell p\leq 1/2.

For every ini\leq n, let ξi\xi_{i} be the indicator of the event {iI(J,M)}\{i\in I(J,M)\}. Clearly, ξi\xi_{i}’s are independent Bernoulli(qq) random variables and |I(J,M)|=i=1nξi|I(J,M)|=\sum_{i=1}^{n}\xi_{i}. Applying Lemma 3.4, we observe that for every 0<ε<q0<\varepsilon<q

(|I(J,M)|[(qε)n,(q+ε)n])12exp(nε22q(1q)(1ε3q)).\mathbb{P}\left(|I(J,M)|\in[(q-\varepsilon)n,(q+\varepsilon)n]\right)\geq 1-2\exp\left(-\frac{n\varepsilon^{2}}{2q(1-q)}\,\left(1-\frac{\varepsilon}{3q}\right)\right).

Taking ε=14q/15\varepsilon=14q/15 we obtain that

(qε)n=qn/15pn/16and(q+ε)n2qn2pn,(q-\varepsilon)n=qn/15\geq\ell pn/16\quad\quad\mbox{and}\quad\quad(q+\varepsilon)n\leq 2qn\leq 2\ell pn,

and

nε22q(1q)(1ε3q)9831nq225450.3np(12p)np/4.\frac{n\varepsilon^{2}}{2q(1-q)}\,\left(1-\frac{\varepsilon}{3q}\right)\geq\frac{98\cdot 31nq}{225\cdot 45}\geq 0.3n\ell p(1-2\ell p)\geq n\ell p/4.

This implies the bound for a fixed JJ. The lemma follows by the union bound. ∎

6.3 Cardinality estimates for ε\varepsilon–nets

In this subsection we provide bounds on cardinality of certain discretizations of the sets of vectors introduced earlier. Recall that 𝐞{\bf e} denotes the vector 𝟏/n{\bf 1}/\sqrt{n}, P𝐞P_{\bf e} denotes the projection on 𝐞{\bf e}^{\perp}, and P𝐞P_{\bf e}^{\perp} is the projection on 𝐞{\bf e}, that is P𝐞=,𝐞𝐞P_{\bf e}^{\perp}=\left\langle\cdot,{\bf e}\right\rangle{\bf e}. We recall also that given A[n]A\subset[n], xAx_{A} denotes coordinate projection of xx on A{\mathbb{R}}^{A}, and that given xnx\in{\mathbb{R}}^{n}, σx\sigma_{x} is a (fixed) permutation corresponding to non-increasing rearrangement of {|xi|}i=1n\{|x_{i}|\}_{i=1}^{n}.

Our first lemma deals with nets for 𝒯2\mathcal{T}_{2} and 𝒯3\mathcal{T}_{3}. We will consider the following normalization:

𝒯2={x𝒯2:xns0+1=1}, and 𝒯3={x𝒯3:xns0+2=1}.\mathcal{T}^{\prime}_{2}=\{x\in\mathcal{T}_{2}\,:\,x_{n_{s_{0}+1}}^{*}=1\},\quad\mbox{ and }\quad\mathcal{T}^{\prime}_{3}=\{x\in\mathcal{T}_{3}\,:\,x_{n_{s_{0}+2}}^{*}=1\}.

The triple norm is defined by |||x|||2:=P𝐞x2+pnP𝐞x2.|||x|||^{2}:=\|P_{\bf e}x\|^{2}+pn\|P_{\bf e}^{\perp}x\|^{2}.
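Observe that, since pn\geq 1 in the regime considered here, the triple norm is equivalent to the Euclidean norm up to a factor \sqrt{pn}:

\|x\|\leq|||x|||\leq\sqrt{pn}\,\|x\|\qquad\mbox{ for every }x\in{\mathbb{R}}^{n},

because \|x\|^{2}=\|P_{\bf e}x\|^{2}+\|P_{\bf e}^{\perp}x\|^{2}.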

Lemma 6.7.

Let n1n\geq 1, p(0,0.001]p\in(0,0.001], and assume that d=pnd=pn is sufficiently large. Let i{2,3}i\in\{2,3\}. Then there exists a set 𝒩i=𝒩i+𝒩i{\mathcal{N}}_{i}={\mathcal{N}}_{i}^{\prime}+{\mathcal{N}}_{i}^{\prime\prime}, 𝒩in{\mathcal{N}}_{i}^{\prime}\subset{\mathbb{R}}^{n}, 𝒩ispan{𝟏}{\mathcal{N}}_{i}^{\prime\prime}\subset{\rm span}\,\{{\bf{}1}\}, with the following properties:

  • |𝒩i|exp(2ns0+ilnd).|{\mathcal{N}}_{i}|\leq\exp\left(2n_{s_{0}+i}\ln d\right).

  • For every u𝒩iu\in{\mathcal{N}}_{i}^{\prime} one has uj=0u_{j}^{*}=0 for all jns0+ij\geq n_{s_{0}+i}.

  • For every x𝒯ix\in\mathcal{T}_{i}^{\prime} there are u𝒩iu\in{\mathcal{N}}_{i}^{\prime} and w𝒩iw\in{\mathcal{N}}_{i}^{\prime\prime} satisfying

    xu1Cτd,w1Cτd, and |||xuw|||2nCτd.\|x-u\|_{\infty}\leq\frac{1}{C_{\tau}\sqrt{d}},\quad\|w\|_{\infty}\leq\frac{1}{C_{\tau}\sqrt{d}},\quad\mbox{ and }\quad|||x-u-w|||\leq\frac{\sqrt{2n}}{C_{\tau}\sqrt{d}}.

Since the proof of this lemma in many parts repeats the proofs of Lemma 3.8 from [24] and of Lemma 6.8 below, we only sketch it.

Proof.

Fix μ=1/(Cτd\mu=1/(C_{\tau}\sqrt{d}) and i{2,3}i\in\{2,3\}. We first repeat the proof of Lemma 3.8 from [24] with our choice of parameters. See also the beginning of the proof of Lemma 6.8 below — many definitions, constructions, and calculations are exactly the same, however note that the normalization is slightly different. In particular, the definitions of sets B1(x)B_{1}(x), B2(x)B_{2}(x) (with k1=ns0+i1k-1=n_{s_{0}+i-1}), B3(x)B_{3}(x) are the same (we do not need the sets B0(x)B_{0}(x) and B4(x)B_{4}(x)). This will show (for large enough dd) the existence of a μ\mu-net 𝒩i\mathcal{N}_{i}^{\prime} (in the \ell_{\infty} metric) for 𝒯i\mathcal{T}_{i}^{\prime} such that for every u𝒩iu\in{\mathcal{N}}_{i}^{\prime} one has uj=0u_{j}^{*}=0 for all jns0+ij\geq n_{s_{0}+i} and |𝒩i|exp(1.1ns0+ilnd)|\mathcal{N}_{i}^{\prime}|\leq\exp\left(1.1n_{s_{0}+i}\ln d\right).

Next given x𝒯ix\in\mathcal{T}_{i}^{\prime} let u=u(x)𝒩iu=u(x)\in\mathcal{N}_{i}^{\prime} be such that xuμ\|x-u\|_{\infty}\leq\mu. Then P𝐞(xu)μn\|P_{\bf e}^{\perp}(x-u)\|\leq\mu\sqrt{n}. Let 𝒩i\mathcal{N}_{i}^{\prime\prime} be a (μn/d)(\mu\sqrt{n/d})-net in the segment μn[𝐞,𝐞]\mu\sqrt{n}\,[-{\bf e},{\bf e}] of cardinality at most 2d2\sqrt{d} (note, we are in the one-dimensional setting). Note that every w𝒩iw\in\mathcal{N}_{i}^{\prime\prime} is of the form w=a𝐞=a 1/nw=a\,{\bf e}=a\,{\bf 1}/\sqrt{n}, |a|μn|a|\leq\mu\sqrt{n}, in particular, wμ\|w\|_{\infty}\leq\mu. Then for xx (and the corresponding u=u(x)u=u(x)), there exists w𝒩iw\in\mathcal{N}_{i}^{\prime\prime} such that

|||xuw|||2=P𝐞(xuw)2+dP𝐞(xuw)2=P𝐞(xu)2+dP𝐞(xu)w22μ2n.|||x-u-w|||^{2}=\|P_{\bf e}(x-u-w)\|^{2}+d\|P_{\bf e}^{\perp}(x-u-w)\|^{2}=\|P_{\bf e}(x-u)\|^{2}+d\|P_{\bf e}^{\perp}(x-u)-w\|^{2}\leq 2\mu^{2}n.

Finally, note that |𝒩i+𝒩i|2dexp(1.1ns0+ilnd)exp(2ns0+ilnd)|\mathcal{N}_{i}^{\prime}+\mathcal{N}_{i}^{\prime\prime}|\leq 2\sqrt{d}\exp\left(1.1n_{s_{0}+i}\ln d\right)\leq\exp\left(2n_{s_{0}+i}\ln d\right). This completes the proof. ∎

Let k1{\mathcal{R}}_{k}^{1}, k2{\mathcal{R}}_{k}^{2} be the vector subsets introduced in Subsection 6.1. Consider the increasing sequence λ1<λ2<<λm\lambda_{1}<\lambda_{2}<\ldots<\lambda_{m}, m1m\geq 1, defined by

\lambda_{1}=1/\sqrt{2},\quad\lambda_{i+1}=3\lambda_{i},\,\,i<m-1,\quad\mbox{ and }\quad\lambda_{m-1}<\lambda_{m}=C_{\tau}^{2}d\leq 3\lambda_{m-1}. (31)

Clearly mnm\leq n. For s{1,2}s\in\{1,2\}, ns0+1<kn/ln2dn_{s_{0}+1}<k\leq n/\ln^{2}d and imi\leq m set

kis:={xks:λinxσx([k,n])λi+1n}.{\mathcal{R}}_{ki}^{s}:=\left\{x\in{\mathcal{R}}_{k}^{s}\,\,:\,\,\lambda_{i}\sqrt{n}\leq\|x_{\sigma_{x}([k,n])}\|\leq\lambda_{i+1}\sqrt{n}\right\}.

It is not difficult to see that the union of kis{\mathcal{R}}_{ki}^{s}’s over admissible ii gives ks{\mathcal{R}}_{k}^{s}. The sets kis{\mathcal{R}}_{ki}^{s} are “low complexity” sets in the sense that they admit efficient ε\varepsilon-nets. For s=1s=1, the low complexity is a consequence of the condition that ki1𝒜𝒞(ρ){\mathcal{R}}_{ki}^{1}\subset\mathcal{AC}(\rho), i.e., the vectors have a very large almost constant part. For the sets ki2{\mathcal{R}}_{ki}^{2}, we do not assume the almost constant behavior, but instead rely on the assumption that xσx([k,n])\|x_{\sigma_{x}([k,n])}\| is large (much larger than n\sqrt{n}). This will allow us to pick ε\varepsilon much larger than n\sqrt{n}, and thus construct a net of small cardinality.

Lemma 6.8.

Let R40R\geq 40 be a (large) constant. Then there is r0>0r_{0}>0 depending on RR with the following property. Let 0<rr00<r\leq r_{0}, 0<ρ1/(2R)0<\rho\leq 1/(2R), let n1n\geq 1 and p(0,0.001]p\in(0,0.001] so that d=pnd=pn is sufficiently large (larger than a constant depending on R,rR,r). Let s{1,2}s\in\{1,2\}, ns0+1<kn/ln2dn_{s_{0}+1}<k\leq n/\ln^{2}d, tmt\leq m, and 40λtn/Rελtn40\lambda_{t}\sqrt{n}/R\leq\varepsilon\leq\lambda_{t}\sqrt{n}, where λt\lambda_{t} and mm are defined according to relation (31). Then there exists an ε\varepsilon-net 𝒩ktskts\mathcal{N}_{kt}^{s}\subset{\mathcal{R}}_{kt}^{s} for kts{\mathcal{R}}_{kt}^{s} with respect to |||||||||\cdot||| of cardinality at most (e/r)3rn(e/r)^{3rn}.

Proof.

Note that in the case s=2s=2 the set kt2{\mathcal{R}}_{kt}^{2} is empty whenever 3λt<2r3\lambda_{t}<\frac{2}{r}. So, in the course of the proof we will implicitly assume that 3λt2r3\lambda_{t}\geq\frac{2}{r} whenever s=2s=2.

We follow ideas of the proof of Lemma 3.8 from [24]. We split a given vector from kts{\mathcal{R}}_{kt}^{s} into few parts according to magnitudes of its coordinates and approximate each part separately. Then we construct nets for vectors with the same splitting and take the union over all nets. We now discuss the splitting. For each xktsx\in{\mathcal{R}}_{kt}^{s} consider the following (depending on xx) partition of [n][n]. If s=2s=2, set B0(x)=B_{0}^{\prime}(x)=\emptyset. If s=1s=1 then x𝒜𝒞(ρ)x\in\mathcal{AC}(\rho) and we set

B0(x):=σx({jn:|xjλx|ρ}),B_{0}^{\prime}(x):=\sigma_{x}(\{j\leq n\,:\,|x_{j}-\lambda_{x}|\leq\rho\}),

where λx=±1\lambda_{x}=\pm 1 is from the definition of 𝒜𝒞(ρ)\mathcal{AC}(\rho) (note that under the normalization in Υn(r){\Upsilon}_{n}(r) we have xns0+3=1x^{*}_{n_{s_{0}+3}}=1). Then |B0(x)|>nns0+3|B_{0}^{\prime}(x)|>n-n_{s_{0}+3} for s=1s=1. Next, we set

B1(x)\displaystyle B_{1}(x) =σx([ns0+1]);\displaystyle=\sigma_{x}([n_{s_{0}+1}]);
B2(x)\displaystyle B_{2}(x) =σx([k1])B1(x);\displaystyle=\sigma_{x}([k-1])\setminus B_{1}(x);
B3(x)\displaystyle B_{3}(x) =σx([ns0+3])(B1(x)B2(x));\displaystyle=\sigma_{x}([n_{s_{0}+3}])\setminus(B_{1}(x)\cup B_{2}(x));
B0(x)\displaystyle B_{0}(x) =B0(x)(B1(x)B2(x)B3(x));\displaystyle=B_{0}^{\prime}(x)\setminus(B_{1}(x)\cup B_{2}(x)\cup B_{3}(x));
B4(x)\displaystyle B_{4}(x) =[n](B0(x)B1(x)B2(x)B3(x))\displaystyle=[n]\setminus(B_{0}(x)\cup B_{1}(x)\cup B_{2}(x)\cup B_{3}(x))

(one of the sets B0(x)B_{0}(x), B4(x)B_{4}(x) could be empty). Denote x:=|B0(x)|\ell_{x}:=|B_{0}(x)|. Note that the definition of B3(x)B_{3}(x) and B4(x)B_{4}(x) imply that xnns0+3\ell_{x}\leq n-n_{s_{0}+3}, while the condition k1ns0+3k-1\leq n_{s_{0}+3} and the above observation for B0(x)B_{0}^{\prime}(x) give n2ns0+3<xn-2n_{s_{0}+3}<\ell_{x} for s=1s=1. Clearly, x=0\ell_{x}=0 for s=2s=2.

Moreover, we have both for s=1s=1 and s=2s=2:

|B1(x)|=ns0+1,|B2(x)|=k1ns0+1,|B3(x)|=ns0+3k+1,|B4(x)|=nxns0+3.|B_{1}(x)|=n_{s_{0}+1},\quad|B_{2}(x)|=k-1-n_{s_{0}+1},\quad|B_{3}(x)|=n_{s_{0}+3}-k+1,\quad|B_{4}(x)|=n-\ell_{x}-n_{s_{0}+3}. (32)

Thus, given \ell\in\{0\}\cup[n-2n_{s_{0}+3},n-n_{s_{0}+3}] and a partition of [n][n] into five sets BiB_{i}, 0i40\leq i\leq 4, with cardinalities as in (32), it is enough to construct a net for vectors xktsx\in{\mathcal{R}}_{kt}^{s} with Bi(x)=BiB_{i}(x)=B_{i}, 0i40\leq i\leq 4, x=\ell_{x}=\ell, and then to take the union of nets over all possible realizations of \ell and all such partitions {B0,B1,B2,B3,B4}\{B_{0},B_{1},B_{2},B_{3},B_{4}\} of [n][n].

Now we describe our construction. Fix \ell as above and fix two parameters μ=1/(Cτd)\mu=1/(C_{\tau}\sqrt{d}), and ν=9λtn/R\nu=9\lambda_{t}\sqrt{n}/R. We would like to emphasize that for the actual calculations in this lemma, taking μ\mu to be a small constant multiple of R1R^{-1} would be sufficient, however, we would like to run the proof with the above choice of μ\mu because this corresponds to the parameter choice in the previous Lemma 6.7 whose proof we only sketched. Note that for xktsx\in{\mathcal{R}}_{kt}^{s} we have x𝒯x\not\in\mathcal{T}, hence xns0+1Cτdxns0+2Cτ2dx^{*}_{n_{s_{0}+1}}\leq C_{\tau}\sqrt{d}x^{*}_{n_{s_{0}+2}}\leq C_{\tau}^{2}d and

x1(6d)x2(6d)2xn1(6d)s0+2xns0+1Cτ2d(6d)s0+2.x_{1}^{*}\leq(6d)x^{*}_{2}\leq(6d)^{2}x^{*}_{n_{1}}\leq\ldots\leq(6d)^{s_{0}+2}x_{n_{s_{0}+1}}^{*}\leq C_{\tau}^{2}d(6d)^{s_{0}+2}. (33)

Fix I0[n]I_{0}\subset[n] with |I0|=ns0+1|I_{0}|=n_{s_{0}+1} (which will play the role of B1B_{1}). We shall construct a μ\mu-net 𝒩I0{\mathcal{N}}_{I_{0}} (in the \ell_{\infty}-metric) for the set

𝒯I0:={PB1(x)x:xkts,B1(x)=I0}.\displaystyle\mathcal{T}_{I_{0}}:=\big{\{}P_{B_{1}(x)}x:\;x\in{\mathcal{R}}_{kt}^{s},\;B_{1}(x)=I_{0}\big{\}}.

Clearly, the nets 𝒩I0{\mathcal{N}}_{I_{0}} for various I0I_{0}’s can be related by appropriate permutations, so without loss of generality we can assume for now that I0=[ns0+1]I_{0}=[n_{s_{0}+1}]. First, consider the partition of I0I_{0} into sets I1,,Is0+2I_{1},\ldots,I_{s_{0}+2} defined by

I1=[2] and Ij=[nj1][nj2], for   2js0+2.I_{1}=[2]\quad\mbox{ and }\quad\,\,\,I_{j}=[n_{j-1}]\setminus[n_{j-2}],\,\,\mbox{ for }\,\,2\leq j\leq s_{0}+2.

Consider the set

𝒯:={x𝒯[ns0+1]:σx(Ij)=Ij,j=1,2,,s0+2}.\mathcal{T}^{*}:=\big{\{}x\in\mathcal{T}_{[n_{s_{0}+1}]}:\,\sigma_{x}(I_{j})=I_{j},\;\;j=1,2,\dots,s_{0}+2\big{\}}.

By the definition of 𝒯I0\mathcal{T}_{I_{0}}, for every x𝒯x\in\mathcal{T}^{*}, one has PIjxbj:=Cτ2d(6d)s0+3j\|P_{I_{j}}x\|_{\infty}\leq b_{j}:=C_{\tau}^{2}d(6d)^{s_{0}+3-j} for every js0+2j\leq s_{0}+2 (where as usual PIP_{I} denotes the coordinate projection onto I{\mathbb{R}}^{I}). Define a μ\mu–net (in the \ell_{\infty}-metric) for 𝒯\mathcal{T}^{*} by setting

𝒩:=𝒩1𝒩2𝒩s0+2,{\mathcal{N}}^{*}:={\mathcal{N}}_{1}\oplus{\mathcal{N}}_{2}\oplus\cdots\oplus{\mathcal{N}}_{s_{0}+2},

where 𝒩j{\mathcal{N}}_{j} is a μ\mu-net (in the \ell_{\infty}-metric) of cardinality at most

(3bj/μ)|Ij|(Cτ3d3/2(6d)s0+3j)nj1(Cτ3(6d)s0+5j)nj1(3b_{j}/\mu)^{|I_{j}|}\leq(C_{\tau}^{3}d^{3/2}(6d)^{s_{0}+3-j})^{n_{j-1}}\leq(C_{\tau}^{3}(6d)^{s_{0}+5-j})^{n_{j-1}}

in the coordinate projection of the cube PIj(bjBn)P_{I_{j}}(b_{j}B_{\infty}^{n}). Recall that n0=2n_{0}=2, nj=300j1n_{j}=30\ell_{0}^{j-1}, 1js01\leq j\leq s_{0}, where 0\ell_{0} and s0s_{0} are given by (27). Since dd is large enough,

2s0+8+30j=2s0+1(s0+5j)0j2\displaystyle 2s_{0}+8+30\sum_{j=2}^{s_{0}+1}(s_{0}+5-j)\ell_{0}^{j-2} =2s0+8+30m=1s01(m+3)0s0m1210s014.1ns0+1,\displaystyle=2s_{0}+8+30\sum_{m=1}^{s_{0}-1}(m+3)\ell_{0}^{s_{0}-m}\leq 121\ell_{0}^{s_{0}-1}\leq 4.1n_{s_{0}+1},

which implies

|𝒩|j=1s0+2|𝒩j|exp(7.1ns0+1ln(6Cτ2d)).|{\mathcal{N}}^{*}|\leq\prod_{j=1}^{s_{0}+2}|{\mathcal{N}}_{j}|\leq\exp(7.1n_{s_{0}+1}\ln(6C_{\tau}^{2}d)).

To pass from the net for 𝒯\mathcal{T}^{*} to the net for 𝒯[ns0+1]\mathcal{T}_{[n_{s_{0}+1}]}, let 𝒩[ns0+1]{\mathcal{N}}_{[n_{s_{0}+1}]} be the union of nets constructed as 𝒩{\mathcal{N}}^{*} but for arbitrary partitions I1,,Is0+2I_{1}^{\prime},\dots,I_{s_{0}+2}^{\prime} of [ns0+1][n_{s_{0}+1}] with |Ij|=|Ij||I_{j}^{\prime}|=|I_{j}|. Using that

j=1s0+1nj12+30j=0s010j2+300s01/(11/0)2ns0+1\sum_{j=1}^{s_{0}+1}{n_{j-1}}\leq 2+30\sum_{j=0}^{s_{0}-1}\ell_{0}^{j}\leq 2+30\ell_{0}^{s_{0}-1}/(1-1/\ell_{0})\leq 2n_{s_{0}+1}

and e0de\ell_{0}\leq d we obtain that the cardinality of 𝒩[ns0+1]{\mathcal{N}}_{[n_{s_{0}+1}]} is at most

|𝒩|j=1s0+1(njnj1)\displaystyle|{\mathcal{N}}^{*}|\,\prod_{j=1}^{s_{0}+1}{n_{j}\choose n_{j-1}} |𝒩|j=1s0+1(enjnj1)nj1|𝒩|j=1s0+1(e0)nj1exp(9.1ns0+1ln(6Cτ2d)).\displaystyle\leq|{\mathcal{N}}^{*}|\,\prod_{j=1}^{s_{0}+1}\Big{(}\frac{en_{j}}{n_{j-1}}\Big{)}^{n_{j-1}}\leq|{\mathcal{N}}^{*}|\,\prod_{j=1}^{s_{0}+1}(e\ell_{0})^{n_{j-1}}\leq\exp(9.1n_{s_{0}+1}\ln(6C_{\tau}^{2}d)).

Next we construct a net for the parts of the vectors corresponding to B2B_{2}. Fix J0[n]J_{0}\subset[n] with |J0|=k1ns0+1|J_{0}|=k-1-n_{s_{0}+1} (it will play the role of B2B_{2}). We construct a μ\mu-net (in the \ell_{\infty}-metric) for the set

𝒯2J0:={PB2(x)x:xΥn(r)𝒯,B2(x)=J0}.\mathcal{T}^{2}_{J_{0}}:=\{P_{B_{2}(x)}x\,:\,x\in{\Upsilon}_{n}(r)\setminus\mathcal{T},\,\,B_{2}(x)=J_{0}\}.

Since by (33), we have xns0+1Cτ2dx^{*}_{n_{s_{0}+1}}\leq C_{\tau}^{2}d for every xΥn(r)𝒯x\in{\Upsilon}_{n}(r)\setminus\mathcal{T}, it is enough to take a μ\mu-net 𝒦J0{\mathcal{K}}_{J_{0}} of cardinality at most

|𝒦J0|(3Cτ2d/μ)|J0|(3Cτ3d3/2)k|{\mathcal{K}}_{J_{0}}|\leq(3C_{\tau}^{2}d/\mu)^{|J_{0}|}\leq(3C_{\tau}^{3}d^{3/2})^{k}

in the coordinate projection of the cube PJ0(Cτ2dBn)P_{J_{0}}(C_{\tau}^{2}dB_{\infty}^{n}).

Now we turn to the part of the vectors corresponding to B3B_{3}. Fix D0[n]D_{0}\subset[n] with |D0|=ns0+3k+1|D_{0}|=n_{s_{0}+3}-k+1 (it will play the role of B3B_{3}). For this part we use 2\ell_{2}-metric and construct a ν\nu-net (in the Euclidean metric this time) for the set

𝒯3D0:={PB3(x)x:xkts,B3(x)=D0}.\mathcal{T}^{3}_{D_{0}}:=\{P_{B_{3}(x)}x\,:\,x\in{\mathcal{R}}_{kt}^{s},\,\,B_{3}(x)=D_{0}\}.

Since for xktsx\in{\mathcal{R}}_{kt}^{s} we have xB3(x)xσx([k,n])3λtn\|x_{B_{3}(x)}\|\leq\|x_{\sigma_{x}([k,n])}\|\leq 3\lambda_{t}\sqrt{n}, there exists a corresponding ν\nu-net D0{\mathcal{L}}_{D_{0}} in the coordinate projection of the Euclidean ball PD0(3λtnB2n)P_{D_{0}}(3\lambda_{t}\sqrt{n}B_{2}^{n}) of cardinality at most

|D0|(9λtn/ν)|D0|Rns0+3Rrn.|{\mathcal{L}}_{D_{0}}|\leq(9\lambda_{t}\sqrt{n}/\nu)^{|D_{0}|}\leq R^{n_{s_{0}+3}}\leq R^{rn}.

Next we approximate the almost constant part of a vector (corresponding to B0B_{0}), provided that it is not empty (otherwise we skip this step). Fix A0[n]A_{0}\subset[n] with |A0|=|A_{0}|=\ell (it will play the role of B0B_{0}) and denote

𝒯0A0:={PB0(x)x:x(Υn(r)𝒯)𝒜𝒞(ρ),B0(x)=A0}.\mathcal{T}^{0}_{A_{0}}:=\{P_{B_{0}(x)}x\,:\,x\in\big{(}{\Upsilon}_{n}(r)\setminus\mathcal{T}\big{)}\cap\mathcal{AC}(\rho),\,\,B_{0}(x)=A_{0}\}.

Let 𝒦0A0:={±PA0𝟏}{\mathcal{K}}^{0}_{A_{0}}:=\{\pm P_{A_{0}}{\bf 1}\}. Since for every xΥn(r)x\in{\Upsilon}_{n}(r) we have either λx=1\lambda_{x}=1 or λx=1\lambda_{x}=-1, by the definition of B0(x)B_{0}(x), every z𝒯0A0z\in\mathcal{T}^{0}_{A_{0}} is approximated by one of ±PA0𝟏\pm P_{A_{0}}{\bf 1} within error ρ\rho in the \ell_{\infty}-metric.

The last part of the vector, corresponding to B4B_{4}, we simply approximate by 0. Note that for any xkt1x\in{\mathcal{R}}_{kt}^{1} we have PB4(x)xrn2rλtn\|P_{B_{4}(x)}x\|\leq\sqrt{rn}\leq\sqrt{2r}\lambda_{t}\sqrt{n}, in view of the condition x𝒜𝒞(ρ)x\in\mathcal{AC}(\rho). On the other hand, for xkt2x\in{\mathcal{R}}_{kt}^{2} we have PB4(x)xn3r2λtn\|P_{B_{4}(x)}x\|\leq\sqrt{n}\leq\frac{3r}{2}\lambda_{t}\sqrt{n}.

Now we combine our nets. Consider the net

𝒩0:=,I0,J0,D0,A0{y=y1+y2+y3+y0:y1𝒩I0,y2𝒦J0,y3D0,y0𝒦0A0},{\mathcal{N}}_{0}:=\bigcup\limits_{\ell,I_{0},J_{0},D_{0},A_{0}}\big{\{}y=y_{1}+y_{2}+y_{3}+y_{0}:\,y_{1}\in{\mathcal{N}}_{I_{0}},\,y_{2}\in\mathcal{K}_{J_{0}},\,y_{3}\in\mathcal{L}_{D_{0}},y_{0}\in\mathcal{K}^{0}_{A_{0}}\big{\}},

where the union is taken over all {0}[n2ns0+3,nns0+3]\ell\in\{0\}\cup[n-2n_{s_{0}+3},n-n_{s_{0}+3}] and all partitions of [n][n] into I0,J0,D0,A0,BI_{0},J_{0},D_{0},A_{0},B with |I0|=ns0+1|I_{0}|=n_{s_{0}+1}, |J0|=k1ns0+1|J_{0}|=k-1-n_{s_{0}+1}, |D0|=ns0+3k+1|D_{0}|=n_{s_{0}+3}-k+1, |A0|=|A_{0}|=\ell, and B=[n](I0J0D0A0)B=[n]\setminus(I_{0}\cup J_{0}\cup D_{0}\cup A_{0}). Then the cardinality of 𝒩0{\mathcal{N}}_{0},

|𝒩0|\displaystyle|{\mathcal{N}}_{0}| n(nns0+1)(nns0+1k1ns0+1)(nk+1ns0+3k+1)(nns0+3)maxI0|𝒩I0|maxJ0|𝒦J0|maxD0|D0|maxA0|𝒦0A0|.\displaystyle\leq n{n\choose n_{s_{0}+1}}{n-n_{s_{0}+1}\choose k-1-n_{s_{0}+1}}{n-k+1\choose n_{s_{0}+3}-k+1}{n-n_{s_{0}+3}\choose\ell}\max\limits_{I_{0}}|{\mathcal{N}}_{I_{0}}|\max\limits_{J_{0}}|\mathcal{K}_{J_{0}}|\max\limits_{D_{0}}|\mathcal{L}_{D_{0}}|\max\limits_{A_{0}}|\mathcal{K}^{0}_{A_{0}}|.

Using that ns0+1n/(64d)n_{s_{0}+1}\leq n/(64d), kn/ln2dk\leq n/\ln^{2}d, ns0+3rnn_{s_{0}+3}\leq rn, =0\ell=0 or n2ns0+3\ell\geq n-2n_{s_{0}+3}, the obtained bounds on nets, as well as that dd is large enough and rr is small enough (smaller than a constant depending on RR), we observe that the cardinality of 𝒩0{\mathcal{N}}_{0} is bounded by

n(ed)n/d(2eln2d)n/ln2d(2e/r)rn(2e/r)rnexp(9.1nln(6Cτ2d)/(64d))(3Cτ3d3/2)n/ln2dRrn2(e/r)2.5rn.n\left(ed\right)^{n/d}\,\left(2e\ln^{2}d\right)^{n/\ln^{2}d}\,\left(2e/r\right)^{rn}\,\left(2e/r\right)^{rn}\,\exp(9.1n\ln(6C_{\tau}^{2}d)/(64d))\,(3C_{\tau}^{3}d^{3/2})^{n/\ln^{2}d}R^{rn}\,\cdot 2\leq\left(e/r\right)^{2.5rn}.

By construction, for every xktsx\in{\mathcal{R}}_{kt}^{s} there exists y=y1+y2+y3+y0𝒩0y=y_{1}+y_{2}+y_{3}+y_{0}\in{\mathcal{N}}_{0} such that

xy\displaystyle\|x-y\| PB1(x)xy1+PB2(x)xy2+PB3(x)xy3+PB4(x)x+PB0(x)xy0\displaystyle\leq\|P_{B_{1}(x)}x-y_{1}\|+\|P_{B_{2}(x)}x-y_{2}\|+\|P_{B_{3}(x)}x-y_{3}\|+\|P_{B_{4}(x)}x\|+\|P_{B_{0}(x)}x-y_{0}\|
\displaystyle\leq\mu\sqrt{n_{s_{0}+1}}+\mu\sqrt{k-1-n_{s_{0}+1}}+\nu+\sqrt{2r}\lambda_{t}\sqrt{n}+\rho\sqrt{n}\leq\frac{2\sqrt{n}}{C_{\tau}\sqrt{d}}+\sqrt{2r}\lambda_{t}\sqrt{n}+\rho\sqrt{n}+\frac{9\lambda_{t}\sqrt{n}}{R}\leq\frac{10\lambda_{t}\sqrt{n}}{R},

where we used that ρ1/(2R)λ1/(2R)λt/(2R)\rho\leq 1/(2R)\leq\lambda_{1}/(\sqrt{2}R)\leq\lambda_{t}/(\sqrt{2}R) and that rr is sufficiently small.

Finally we adjust our net to |||||||||\cdot|||. Note that by Lemma 6.4 for every xΥn(r)𝒯x\in{\Upsilon}_{n}(r)\setminus\mathcal{T},

|x,𝐞|=|i=1nxin|x384Cτ2d4(64p)ln(6d)ern.|\left\langle x,{\bf e}\right\rangle|=\left|\sum_{i=1}^{n}\frac{x_{i}}{\sqrt{n}}\right|\leq\|x\|\leq\frac{384C_{\tau}^{2}d^{4}}{(64p)^{\ln(6d)}}\leq e^{rn}.

Therefore, there exists an ε/(4pn)\varepsilon/(4\sqrt{pn})-net 𝒩\mathcal{N}_{*} in P𝐞ktsP_{\bf e}^{\perp}{\mathcal{R}}_{kt}^{s} of cardinality 8pnern/ε8\sqrt{pn}e^{rn}/\varepsilon (note, the rank of P𝐞P_{\bf e}^{\perp} is one). Then, by the constructions of nets, for every xktsx\in{\mathcal{R}}_{kt}^{s} there exist y𝒩0y\in\mathcal{N}_{0} and y𝒩y_{*}\in\mathcal{N}_{*} such that

|||xP𝐞yy|||2=P𝐞(xy)2+pnP𝐞xy2100λt2nR2+ε2/16ε2/8.|||x-P_{\bf e}y-y_{*}|||^{2}=\|P_{\bf e}(x-y)\|^{2}+pn\|P_{\bf e}^{\perp}x-y_{*}\|^{2}\leq\frac{100\lambda_{t}^{2}n}{R^{2}}+\varepsilon^{2}/16\leq\varepsilon^{2}/8.

Thus the set 𝒩=P𝐞(𝒩0)+𝒩\mathcal{N}=P_{\bf e}(\mathcal{N}_{0})+\mathcal{N}_{*} is an (ε/2)(\varepsilon/2)-net for kts{\mathcal{R}}_{kt}^{s} with respect to |||||||||\cdot||| and its cardinality is bounded by (e/r)3rn(e/r)^{3rn}. Using a standard argument we pass to an ε\varepsilon-net 𝒩ktskts\mathcal{N}_{kt}^{s}\subset{\mathcal{R}}_{kt}^{s} for kts{\mathcal{R}}_{kt}^{s}. ∎

6.4 Proof of Theorem 6.2

Proof.

Recall that the sets kis{\mathcal{R}}_{ki}^{s} were introduced just before Lemma 6.8 and the event nrm{\mathcal{E}}_{nrm} was defined in Proposition 3.14.

Fix $s\in\{1,2\}$, $k\leq n/\ln^{2}d$, $A:=[k,n]$, $i\leq m$. Set $\varepsilon:=\lambda_{i}\sqrt{n}/(600\sqrt{2}C_{0})$, where $\lambda_{i}$ and $m$ are defined according to (31). Applying Lemma 6.8 with $R=24000\sqrt{2}C_{0}$, we find an $\varepsilon$-net (in the $|||\cdot|||$-norm) $\mathcal{N}_{ki}^{s}\subset{\mathcal{R}}_{ki}^{s}$ for ${\mathcal{R}}_{ki}^{s}$ of cardinality at most $(e/r)^{3rn}$. Take for a moment any $y\in\mathcal{N}_{ki}^{s}$. Note that $\|y_{\sigma(A)}\|\geq C_{0}\|y_{\sigma(A)}\|_{\infty}/\sqrt{p}$ and $\|y_{\sigma(A)}\|\geq\lambda_{i}\sqrt{n}$ (where $\sigma=\sigma_{y}$). Then Proposition 3.10 implies $\mathbb{P}({\mathcal{E}}_{y}^{c})\leq e^{-3n}$, where

y={My>pn32C0yσ(A)}.{\mathcal{E}}_{y}=\left\{\|My\|>\frac{\sqrt{pn}}{3\sqrt{2}C_{0}}\,\|y_{\sigma(A)}\|\right\}.

Let us condition on the event nrmy𝒩kisy{\mathcal{E}}_{nrm}\cap\bigcap\limits_{y\in\mathcal{N}_{ki}^{s}}{\mathcal{E}}_{y}. Using the definition of 𝒩kis\mathcal{N}_{ki}^{s} and kis{\mathcal{R}}_{ki}^{s}, the triangle inequality, and the definition of nrm{\mathcal{E}}_{nrm} from Proposition 3.14, we get that for any xkisx\in{\mathcal{R}}_{ki}^{s} there is y𝒩kisy\in\mathcal{N}_{ki}^{s} such that |||xy|||ε|||x-y|||\leq\varepsilon, and hence

MxMyM(xy)>pn32C0yσ(A)100pnεpλin62C0.\|Mx\|\geq\|My\|-\|M(x-y)\|>\frac{\sqrt{pn}}{3\sqrt{2}C_{0}}\,\|y_{\sigma(A)}\|-100\sqrt{pn}\varepsilon\geq\frac{\sqrt{p}\lambda_{i}n}{6\sqrt{2}C_{0}}.

Using that |𝒩kis|(e/r)3rn|\mathcal{N}_{ki}^{s}|\leq(e/r)^{3rn}, that λi1/2\lambda_{i}\geq 1/\sqrt{2}, and the union bound, we obtain

(nrm{xkis:Mxpn12C0})(nrmy𝒩kisyc)e3(1rln(e/r))n.\mathbb{P}\left({\mathcal{E}}_{nrm}\cap\left\{\exists x\in{\mathcal{R}}_{ki}^{s}\,:\,\|Mx\|\leq\frac{\sqrt{p}n}{12C_{0}}\right\}\right)\leq{\mathbb{P}}\Big{(}{\mathcal{E}}_{nrm}\cap\bigcup\limits_{y\in\mathcal{N}_{ki}^{s}}{\mathcal{E}}_{y}^{c}\Big{)}\leq e^{-3(1-r\ln(e/r))n}.

Since =k,i(ki1ki2){\mathcal{R}}=\bigcup_{k,i}\,({\mathcal{R}}_{ki}^{1}\cup{\mathcal{R}}_{ki}^{2}) and rr is small enough, the result follows by the union bound and by Lemma 3.6 applied with t=30t=30 in order to estimate (nrm){\mathbb{P}}({\mathcal{E}}_{nrm}). ∎
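For orientation, the exponent $3(1-r\ln(e/r))$ appearing in the last estimate can be evaluated numerically; the snippet below (an illustration only, with arbitrary sample values of $r$) shows that it stays close to $3$ once $r$ is small, which is what makes the union bound over the net of cardinality $(e/r)^{3rn}$ affordable.

```python
import numpy as np

for r in (0.1, 0.01, 0.001):
    # exponent in the bound e^{-3(1 - r*ln(e/r))n} obtained from the union bound above
    print(f"r = {r}:  3*(1 - r*ln(e/r)) = {3 * (1 - r * np.log(np.e / r)):.4f}")
```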

6.5 Lower bounds on Mx\|Mx\| for vectors from 𝒯0𝒯1\mathcal{T}_{0}\cup\mathcal{T}_{1}

The following lemma provides a lower bound on the ratio Mx/x2\|Mx\|/\|x\|_{2} for vectors xx from 𝒯0𝒯1\mathcal{T}_{0}\cup\mathcal{T}_{1}.

Lemma 6.9.

Let n1n\geq 1, 0<p<0.0010<p<0.001, and assume that d=pn200lnnd=pn\geq 200\ln n. Then

({x𝒯0𝒯1 such that Mx(64p)κ192(pn)2x})n(1p)n+e1.4np,\mathbb{P}\left(\Big{\{}\exists\;x\in\mathcal{T}_{0}\cup\mathcal{T}_{1}\,\,\,\mbox{ such that }\,\,\,\|Mx\|\leq\frac{(64p)^{\kappa}}{192(pn)^{2}}\,\|x\|\Big{\}}\right)\leq n(1-p)^{n}+e^{-1.4np},

where κ\kappa is defined by (28).

Proof.

Let $\delta_{ij}$, $i,j\leq n$, be the entries of $M$. Let ${\mathcal{E}}$ be the event that there are no zero columns in $M$. Clearly, $\mathbb{P}({\mathcal{E}})\geq 1-n(1-p)^{n}$.

Also, for each 1js0+11\leq j\leq s_{0}+1, let j=col(0,nj1){\mathcal{E}}_{j}={\mathcal{E}}_{col}(\ell_{0},n_{j-1}) be the event introduced in Lemma 6.5 (with s0,0s_{0},\ell_{0} defined in (27)), and observe that, according to Lemma 6.5, (j)1e1.5np\mathbb{P}({\mathcal{E}}_{j})\geq 1-e^{-1.5np} for every jj.

Recall that $\sigma_{x}$ denotes a permutation of $[n]$ such that $x_{i}^{*}=|x_{\sigma(i)}|$ for $i\leq n$. Pick any $x\in\mathcal{T}_{0}\cup\mathcal{T}_{1}$. In the case $x\in\mathcal{T}_{0}$ set $m=m_{1}=1$ and $m_{2}=2$. In the case $x\in\mathcal{T}_{1j}$ for some $1\leq j\leq s_{0}+1$ set $m=m_{1}=n_{j-1}$ and $m_{2}=n_{j}$. Then by the definition of the sets $\mathcal{T}_{0},\mathcal{T}_{1}$ we have $x^{*}_{m}>6dx^{*}_{m_{2}}$. Let

J=J(x)=σx([m]),Jr=Jr(x)=σx([m21][m]), and J(x)=(JJr)cJ^{\ell}=J^{\ell}(x)=\sigma_{x}([m]),\quad J^{r}=J^{r}(x)=\sigma_{x}([m_{2}-1]\setminus[m]),\quad\mbox{ and }\quad J(x)=(J^{\ell}\cup J^{r})^{c}

(if x𝒯0x\in\mathcal{T}_{0} then Jr=J^{r}=\emptyset). Note that by our definition we have |xi|>6d|xu||x_{i}|>6d|x_{u}| for any iJ(x)i\in J^{\ell}(x) and uJ(x)u\in J(x), and that maxiJ(x)|xi|xm2\max_{i\in J(x)}|x_{i}|\leq x^{*}_{m_{2}}. Denote by I(x)I^{\ell}(x) the (random) set of rows of MM having exactly one 1 in J(x)J^{\ell}(x) and no 1’s in Jr(x)J^{r}(x). Now we recall that the event sum{\mathcal{E}}_{sum} was introduced in Lemma 3.4 (we use it with q=pq=p) and set

:=sumj=1s0+1j.{\mathcal{E}}^{\prime}:={\mathcal{E}}\cap{\mathcal{E}}_{sum}\cap\bigcap_{j=1}^{s_{0}+1}{\mathcal{E}}_{j}.

Clearly, conditioned on {\mathcal{E}}^{\prime}, the set I(x)I^{\ell}(x) is not empty for any x𝒯0𝒯1x\in\mathcal{T}_{0}\cup\mathcal{T}_{1}. By definition, for every sI(x)s\in I^{\ell}(x) there exists j(s)J(x)j(s)\in J^{\ell}(x) such that

suppRs(M)J(x)={j(s)},suppRs(M)Jr(x)=.{\rm supp\,}R_{s}(M)\cap J^{\ell}(x)=\{j(s)\},\quad{\rm supp\,}R_{s}(M)\cap J^{r}(x)=\emptyset.

Since j(s)J(x)j(s)\in J^{\ell}(x) (which implies |xj(s)|xm>6dxm2|x_{j(s)}|\geq x^{*}_{m}>6dx^{*}_{m_{2}}), we obtain

|Rs(M),x|\displaystyle|\langle R_{s}(M),\,x\rangle| =|xj(s)+jJ(x)δsjxj||xj(s)|xm2jJ(x)δsjxmxm6djJ(x)δsj.\displaystyle=\Big{|}x_{j(s)}+\sum_{j\in J(x)}\delta_{sj}x_{j}\Big{|}\geq|x_{j(s)}|-x_{m_{2}}^{*}\sum_{j\in J(x)}\delta_{sj}\geq x_{m}^{*}-\frac{x_{m}^{*}}{6d}\sum_{j\in J(x)}\delta_{sj}.

Observe that conditioned on sum{\mathcal{E}}_{sum} we have jJ(x)δsjj=1nδsj3.5pn=3.5d\sum_{j\in J(x)}\delta_{sj}\leq\sum_{j=1}^{n}\delta_{sj}\leq 3.5pn=3.5d. Thus, everywhere on {\mathcal{E}}^{\prime} we have for all x𝒯0𝒯1x\in\mathcal{T}_{0}\cup\mathcal{T}_{1},

Mx|Rs(M),x|xm/3,sI(x).\|Mx\|\geq|\langle R_{s}(M),\,x\rangle|\geq x_{m}^{*}/3,\quad s\in I^{\ell}(x).

Finally, in the case x𝒯0x\in\mathcal{T}_{0} we have m=1m=1 and xnx1\|x\|\leq\sqrt{n}x^{*}_{1}. In the case x𝒯1jx\in\mathcal{T}_{1j} by Lemma 6.4 we have

\|x\|\leq\frac{64(pn)^{2}}{(64p)^{\kappa}}\,x^{*}_{m}.

Together with $\|Mx\|\geq x^{*}_{m}/3$, this proves the lower bound on $\|Mx\|/\|x\|$ conditioned on ${\mathcal{E}}^{\prime}$. The probability bound follows by the union bound, Lemmas 3.4 and 6.5, and the fact that $s_{0}\leq\ln n$; indeed,

\mathbb{P}\left({\mathcal{E}}\cap{\mathcal{E}}_{sum}\cap\bigcap_{j=1}^{s_{0}+1}{\mathcal{E}}_{j}\right)\geq 1-n(1-p)^{n}-(s_{0}+2)e^{-1.5np}\geq 1-n(1-p)^{n}-e^{-1.4np}. ∎
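The mechanism of this proof can be illustrated numerically. The sketch below is an illustration only, not part of the argument: the dimension, $p$, and the test vector are chosen ad hoc. It samples a Bernoulli($p$) matrix, builds the index sets $J^{\ell}$, $J^{r}$, $J$ for a vector whose $m$ largest coordinates exceed the remaining ones by a factor larger than $6d$, and checks that every row with exactly one $1$ in $J^{\ell}$ and no $1$'s in $J^{r}$ satisfies $|\langle R_{s}(M),x\rangle|\geq x^{*}_{m}/3$.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 2000, 0.02                 # ad hoc parameters; d = pn = 40
d = p * n
m, m2 = 3, 6                      # as in the proof: J^ell = sigma([m]), J^r = sigma([m2-1] \ [m])

# test vector: m huge coordinates, a couple of medium ones, the rest of modulus x*_{m2}
x = np.zeros(n)
x[:m] = 10 * 6 * d                               # ensures x*_m > 6 d x*_{m2}
x[m:m2 - 1] = 5.0
x[m2 - 1:] = rng.choice([-1.0, 1.0], size=n - (m2 - 1))

J_left, J_right = np.arange(m), np.arange(m, m2 - 1)

M = (rng.random((n, n)) < p).astype(int)

# I^ell(x): rows with exactly one 1 in J^ell and no 1's in J^r
rows_left = np.where((M[:, J_left].sum(axis=1) == 1) &
                     (M[:, J_right].sum(axis=1) == 0))[0]
print("rows in I^ell(x):", len(rows_left))       # non-empty with overwhelming probability here

inner = M @ x
x_m = x[:m].min()                                # x*_m
print("min |<R_s, x>| over I^ell(x):", np.abs(inner[rows_left]).min(), " vs  x*_m/3 =", x_m / 3)
assert np.all(np.abs(inner[rows_left]) >= x_m / 3)   # holds for typical draws; not a proof
```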

6.6 Individual bounds for vectors from 𝒯2𝒯3\mathcal{T}_{2}\cup\mathcal{T}_{3}

In this section we provide individual probability bounds for vectors from the nets constructed in Lemma 6.7. To obtain the lower bounds on $\|Mx\|$, we consider the behavior of the inner products $\left\langle{\bf R}_{i}(M),x\right\rangle$, more specifically, of the Lévy concentration function of $\left\langle{\bf R}_{i}(M),x\right\rangle$. To estimate this function, we will consider $2m$ columns of $M$ corresponding to the $m$ biggest and $m$ smallest (in absolute value) coordinates of $x$, where $m=n_{s_{0}+1}$ or $m=n_{s_{0}+2}$. In a sense, our anti-concentration estimates will appear in the process of swapping $1$'s and $0$'s within a specially chosen subset of the matrix rows. A crucial element in this process is to extract a pair of subsets of indices on which the chosen matrix rows have only one non-zero component. This allows us to get anti-concentration bounds by “sending” the non-zero component into the other index subset from the pair. The main difficulty in this scheme comes from the restriction $2mp\leq 1/32$ from Lemma 6.6, which guarantees the existence of sufficiently many required subsets (and rows) but which cannot be directly applied to $m=n_{s_{0}+2}$. To resolve this problem we use an idea from [27]. We split the initially fixed set of $2m$ columns into smaller subsets of columns of size at most $1/(64p)$ each, and create independent random variables corresponding to this splitting. Then we apply Proposition 3.9, which allows us to deal with the Lévy concentration function of sums of independent random variables.

We first describe subdivisions of n{\mathcal{M}_{n}} used in [27]. Recall that n{\mathcal{M}_{n}} denotes the class of all n×nn\times n matrices with 0/10/1 entries. We recall also that the probability measure {\mathbb{P}} on n{\mathcal{M}_{n}} is always assumed to be induced by a Bernoulli(pp) random matrix. Given J[n]J\subset[n] and MnM\in{\mathcal{M}_{n}} denote

I(J,M)={in:|supp𝐑i(M)J|=1}.I(J,M)=\{i\leq n\,:\,|{\rm supp\,}{\bf R}_{i}(M)\cap J|=1\}.

By J{\mathcal{M}}_{J} we denote the set of n×|J|n\times|J| matrices with 0/10/1 entries and with columns indexed by JJ. Fix q0nq_{0}\leq n and a partition J0J_{0}, J1J_{1}, …, Jq0J_{q_{0}} of [n][n]. Given subsets I1,,Iq0I_{1},\dots,I_{q_{0}} of [n][n] and V=(vij)J0V=(v_{ij})\in{\mathcal{M}}_{J_{0}}, denote =(I1,,Iq0){\mathcal{I}}=(I_{1},\ldots,I_{q_{0}}) and consider the class

(,V)={M=(μij)n:q[q0]I(Jq,M)=Iq and injJ0μij=vij}.{\mathcal{F}}({\mathcal{I}},V)=\left\{M=(\mu_{ij})\in{\mathcal{M}_{n}}\,:\,\forall q\in[q_{0}]\quad I(J_{q},M)=I_{q}\,\,\mbox{ and }\,\,\forall i\leq n\,\forall j\in J_{0}\,\,\,\mu_{ij}=v_{ij}\right\}.

In words, we fix the columns indexed by J0J_{0} and for each q[q0]q\in[q_{0}] we fix the row indices having exactly one 11 in columns indexed by JqJ_{q}. Then, for any fixed partition J0J_{0}, J1J_{1}, …, Jq0J_{q_{0}}, n{\mathcal{M}_{n}} is the disjoint union of classes (,V){\mathcal{F}}({\mathcal{I}},V) over all VJ0V\in{\mathcal{M}}_{J_{0}} and all (𝒫([n]))q0{\mathcal{I}}\in(\mathcal{P}([n]))^{q_{0}}, where 𝒫()\mathcal{P}(\cdot) denotes the power set.
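To make the decomposition concrete, the following sketch (illustrative only; the sizes of the matrix and of the partition are arbitrary) samples one $0/1$ matrix with i.i.d. Bernoulli($p$) entries and computes, for a fixed partition $J_{0},J_{1},\dots,J_{q_{0}}$, the data $({\mathcal{I}},V)=(I_{1},\dots,I_{q_{0}},V)$ that determines the class ${\mathcal{F}}({\mathcal{I}},V)$ containing it.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q0 = 12, 0.3, 3                     # small, arbitrary sizes, for illustration only
cols = rng.permutation(n)
J = {q: cols[2 * (q - 1):2 * q] for q in range(1, q0 + 1)}   # J_1,...,J_{q0}: two columns each
J[0] = cols[2 * q0:]                                         # J_0: the remaining columns

M = (rng.random((n, n)) < p).astype(int)

# I(J_q, M): rows having exactly one 1 within the columns J_q
I = {q: np.where(M[:, J[q]].sum(axis=1) == 1)[0] for q in range(1, q0 + 1)}
V = M[:, J[0]]                                               # the columns indexed by J_0 are frozen

# the tuple (I_1,...,I_{q0}) together with V identifies the class F(I, V) containing M
for q in range(1, q0 + 1):
    print(f"I_{q} =", I[q].tolist())
print("V has shape", V.shape)
```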

The following is an important but simple observation.

Lemma 6.10.

Let (,V){\mathcal{F}}({\mathcal{I}},V) be a non-empty class (defined as above), and denote by {\mathbb{P}}_{{\mathcal{F}}} the induced probability measure on (,V){\mathcal{F}}({\mathcal{I}},V), i.e., let

(B):=(B)((,V)),B(,V).{\mathbb{P}}_{{\mathcal{F}}}(B):=\frac{{\mathbb{P}}(B)}{{\mathbb{P}}({\mathcal{F}}({\mathcal{I}},V))},\quad B\subset{\mathcal{F}}({\mathcal{I}},V).

Then the matrix rows for matrices in (,V){\mathcal{F}}({\mathcal{I}},V) are mutually independent with respect to {\mathbb{P}}_{{\mathcal{F}}}, in other words, a random matrix distributed according to {\mathbb{P}}_{{\mathcal{F}}} has mutually independent rows.

Finally, given a vector vnv\in{\mathbb{R}}^{n}, a class (,V){\mathcal{F}}({\mathcal{I}},V), indices ini\leq n, qq0q\leq q_{0}, define

ξq(i)=ξq(M,v,i):=jJqδijvj,M=(δij)(,V).\xi_{q}(i)=\xi_{q}(M,v,i):=\sum_{j\in J_{q}}\delta_{ij}v_{j},\quad M=(\delta_{ij})\in{\mathcal{F}}({\mathcal{I}},V). (34)

We will view $\xi_{q}(i)$ as random variables on ${\mathcal{F}}({\mathcal{I}},V)$ (with respect to the measure ${\mathbb{P}}_{{\mathcal{F}}}$). It is not difficult to see that for every fixed $i$, the variables $\xi_{1}(i),\dots,\xi_{q_{0}}(i)$ are mutually independent, and, moreover, whenever $i\in I_{q}$, the variable $\xi_{q}(i)$ is uniformly distributed on the multiset $\{v_{j}\}_{j\in J_{q}}$. Thus, we may apply Proposition 3.9 to

|𝐑i(M),v|=|q=0q0ξq(i)|\left|\left\langle{\bf R}_{i}(M),v\right\rangle\right|=\Big{|}\sum_{q=0}^{q_{0}}\xi_{q}(i)\Big{|}

with some $\alpha>0$ satisfying $\mathcal{Q}_{{\mathcal{F}}}(\xi_{q}(i),1/3)\leq\alpha$ for every $q\geq 1$ with $i\in I_{q}$. This gives

{\mathbb{P}}_{{\mathcal{F}}}\left\{\left|\left\langle{\bf R}_{i}(M),v\right\rangle\right|\leq 1/3\right\}\leq\frac{C_{0}\alpha}{\sqrt{(1-\alpha)|\{q\geq 1:\,i\in I_{q}\}|}}, (35)

where C0C_{0} is a positive absolute constant.
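The role of Proposition 3.9 in (35) can be illustrated by a small Monte Carlo experiment (illustration only; the constant $C_{0}$ and the precise statement of Proposition 3.9 are not reproduced here, and the values below are arbitrary): for a sum of $k$ independent variables, each equal to $0$ or to a fixed value of modulus greater than $2/3$ with probability $1/2$ each, the empirical Lévy concentration function at radius $1/3$ decays roughly at the rate $1/\sqrt{k}$ predicted by the right hand side of (35).

```python
import numpy as np

rng = np.random.default_rng(2)

def levy_concentration(samples, radius):
    """Monte Carlo estimate of Q(xi, radius) = sup_t P(|xi - t| <= radius)."""
    s = np.sort(samples)
    best = 0
    for t in s:   # the supremum over intervals of length 2*radius is attained at a sample point
        best = max(best, np.searchsorted(s, t + 2 * radius, side="right")
                   - np.searchsorted(s, t, side="left"))
    return best / len(s)

N = 20000
for k in (1, 4, 16, 64):
    vals = 0.7 + rng.random(k)                    # fixed values of modulus > 2/3
    xi = (rng.random((N, k)) < 0.5) * vals        # each xi_q equals 0 or vals[q], prob. 1/2 each
    Q = levy_concentration(xi.sum(axis=1), 1 / 3)
    print(f"k={k:3d}  Q_hat={Q:.3f}   1/sqrt(k)={1 / np.sqrt(k):.3f}")
```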

We are ready now to estimate individual probabilities.

Lemma 6.11 (Individual probabilities).

There exist absolute constants $C,C^{\prime}>1>c_{1}>0$ such that the following holds. Let $p\in(0,1/64]$ and $d=pn\geq 2$. Set $m_{0}=\lfloor 1/(64p)\rfloor$ and let $m_{1}$ and $m_{2}$ be such that

1m1<m2nm1.1\leq m_{1}<m_{2}\leq n-m_{1}.

Let yspan{𝟏}y\in{\rm span}\,\{{\bf{1}}\} and assume that xnx\in{\mathbb{R}}^{n} satisfies

xm1>2/3 and xi=0 for every i>m2.x^{*}_{m_{1}}>2/3\quad\mbox{ and }\quad x^{*}_{i}=0\,\,\,\mbox{ for every }\,\,i>m_{2}.

Denote m=min(m0,m1)m=\min(m_{0},m_{1}) and consider the event

E(x,y)={Mn:M(x+y)c1md}.E(x,y)=\left\{M\in{\mathcal{M}_{n}}\,:\,\|M(x+y)\|\leq\sqrt{c_{1}md}\right\}.

Then in the case m1m0m_{1}\leq m_{0} one has

(E(x,y)card)2md/20,{\mathbb{P}}(E(x,y)\cap{\mathcal{E}}_{card})\leq 2^{-md/20},

and in the case m1>Cm0m_{1}>C^{\prime}m_{0} one has

(E(x,y)card)(Cnm1d)md/20,{\mathbb{P}}(E(x,y)\cap{\mathcal{E}}_{card})\leq\left(\frac{Cn}{m_{1}d}\right)^{md/20},

where card{\mathcal{E}}_{card} is the event introduced in Lemma 6.6 with =2m\ell=2m.

Remark 6.12.

We apply this lemma below for sets 𝒯i\mathcal{T}_{i} with the following choice of parameters. For i=2i=2 we set

m1=m0=ns0+1=max(300s01,1/(64p)),m2=ns0+2,andp0.001,m_{1}=m_{0}=n_{s_{0}+1}=\max(30\ell_{0}^{s_{0}-1},\left\lfloor 1/(64p)\right\rfloor),\quad m_{2}=n_{s_{0}+2},\quad\mbox{and}\quad p\leq 0.001,

obtaining

(E(x,y)card)2ns0+1d/20.{\mathbb{P}}(E(x,y)\cap{\mathcal{E}}_{card})\leq 2^{-n_{s_{0}+1}d/20}.

For i=3i=3, we set

m1=ns0+2=n/d>m0=ns0+1,m2=ns0+3,andp0.001,m_{1}=n_{s_{0}+2}=\lfloor n/\sqrt{d}\rfloor>m_{0}=n_{s_{0}+1},\quad m_{2}=n_{s_{0}+3},\quad\mbox{and}\quad p\leq 0.001,

obtaining for large enough dd,

(E(x,y)card)(Cnns0+2d)ns0+1d/20(d/(2C))ns0+1d/20.{\mathbb{P}}(E(x,y)\cap{\mathcal{E}}_{card})\leq\left(\frac{Cn}{n_{s_{0}+2}d}\right)^{n_{s_{0}+1}d/20}\leq\left(\sqrt{d}/(2C)\right)^{-n_{s_{0}+1}d/20}.

To prove Lemma 6.11 it will be convenient to use the same notation as in Lemma 6.9. Given two disjoint subsets JJ^{\ell}, Jr[n]J^{r}\subset[n] and a matrix MnM\in{\mathcal{M}_{n}}, denote

I=I(M):={in:|supp𝐑i(M)J|=1 and supp𝐑i(M)Jr=},I^{\ell}=I^{\ell}(M):=\{i\leq n:\,|{\rm supp\,}{\bf R}_{i}(M)\cap J^{\ell}|=1\,\,\text{ and }\,\,{\rm supp\,}{\bf R}_{i}(M)\cap J^{r}=\emptyset\},

and

Ir=Ir(M):={in:supp𝐑i(M)J= and |supp𝐑i(M)Jr|=1}.I^{r}=I^{r}(M):=\{i\leq n:\,{\rm supp\,}{\bf R}_{i}(M)\cap J^{\ell}=\emptyset\,\,\text{ and }\,\,|{\rm supp\,}{\bf R}_{i}(M)\cap J^{r}|=1\}.

Here the upper indices \ell and rr refer to left and right.

Proof.

Let d=pnd=pn and fix γ=mp/72=md/(72n)\gamma=mp/72=md/(72n).

Fix xnx\in{\mathbb{R}}^{n} and yspan{𝟏}y\in{\rm span}\,\{{\bf 1}\} satisfying the conditions of the lemma. Let σ=σx\sigma=\sigma_{x}, that is, a permutation of [n][n] such that xi=|xσ(i)|x_{i}^{*}=|x_{\sigma(i)}| for all ini\leq n. Denote q0=m1/mq_{0}=m_{1}/m and without loss of generality assume that either q0=1q_{0}=1 or that q0q_{0} is a large enough integer. Let J1,J2,,Jq0J^{\ell}_{1},J_{2}^{\ell},\ldots,J^{\ell}_{q_{0}} be a partition of σ([m1])\sigma([m_{1}]) into sets of cardinality mm each, and let Jr1,J2r,,Jrq0J^{r}_{1},J_{2}^{r},\ldots,J^{r}_{q_{0}} be a partition of σ([nm1+1,n])\sigma([n-m_{1}+1,n]) into sets of cardinality mm each. Denote

Jq:=JqJrq for q[q0] and J0:=[n]q=1q0Jq.J_{q}:=J^{\ell}_{q}\cup J^{r}_{q}\,\,\,\mbox{ for }\,\,\,q\in[q_{0}]\quad\mbox{ and }\quad J_{0}:=[n]\setminus\bigcup_{q=1}^{q_{0}}J_{q}.

Then $J_{0}$, $J_{1}$, …, $J_{q_{0}}$ is a partition of $[n]$, which we fix in this proof. Let $M$ be a $0/1$ $n\times n$ matrix. For every pair $J^{\ell}_{q}$, $J^{r}_{q}$, let the sets $I^{\ell}_{q}(M)$ and $I^{r}_{q}(M)$ be defined as in the displays following Remark 6.12 (with $J^{\ell}=J^{\ell}_{q}$ and $J^{r}=J^{r}_{q}$), and let $I_{q}(M)=I^{\ell}_{q}(M)\cup I^{r}_{q}(M)$. Since

|Jq|=2m2m01/(32p),|J_{q}|=2m\leq 2m_{0}\leq 1/(32p),

and by the definition of the event card{\mathcal{E}}_{card} (see Lemma 6.6 with =2m\ell=2m), we have

|Iq(M)|[md/8, 4md]|I_{q}(M)|\in[md/8,\,4md] (36)

everywhere on ${\mathcal{E}}_{card}$. Now we represent ${\mathcal{M}_{n}}$ as a disjoint union of classes ${\mathcal{F}}({\mathcal{I}},V)$ defined at the beginning of this subsection, with $V\in{\mathcal{M}}_{J_{0}}$ and ${\mathcal{I}}=(I_{1},\ldots,I_{q_{0}})$. Since it is enough to prove a uniform upper bound for classes ${\mathcal{F}}({\mathcal{I}},V)\cap{\mathcal{E}}_{card}$ and since for every such non-empty class ${\mathcal{I}}$ must satisfy (36) for every $q\leq q_{0}$, we have

(E(x,y)card)max(E(x,y)card|(,V))max(E(x,y)|(,V)),{\mathbb{P}}(E(x,y)\cap{\mathcal{E}}_{card})\leq\max\mathbb{P}(E(x,y)\cap{\mathcal{E}}_{card}\,|\,{\mathcal{F}}({\mathcal{I}},V))\leq\max\mathbb{P}(E(x,y)|\,{\mathcal{F}}({\mathcal{I}},V)),

where the first maximum is taken over all (,V){\mathcal{F}}({\mathcal{I}},V) with (,V)card{\mathcal{F}}({\mathcal{I}},V)\cap{\mathcal{E}}_{card}\neq\emptyset and the second maximum is taken over all (,V){\mathcal{F}}({\mathcal{I}},V) with IqI_{q}’s satisfying condition (36).

Fix any class (,V){\mathcal{F}}({\mathcal{I}},V), where {\mathcal{I}} satisfies (36), and denote the corresponding induced probability measure on the class by \mathbb{P}_{\mathcal{F}}, that is

()=(|(,V)).\mathbb{P}_{\mathcal{F}}(\cdot)=\mathbb{P}(\cdot\,|\,{\mathcal{F}}({\mathcal{I}},V)).

Let

I:=q=1q0Iq.I:=\bigcup_{q=1}^{q_{0}}I_{q}.

Note that $|I|\leq 4q_{0}md$. We first show that the set of $i$'s which belong to many $I_{q}$'s is large. More precisely, denote

Ai={q[q0]:iIq},i[n], and I0={in:|Ai|γq0}.A_{i}=\{q\in[q_{0}]\,:\,i\in I_{q}\},\;\;i\in[n],\quad\quad\mbox{ and }\quad\quad I_{0}=\{i\leq n\,:\,|A_{i}|\geq\gamma q_{0}\}.

Then, using bounds on cardinalities of IqI_{q}’s, one has

mdq0/8q=1q0|Iq|=i=1n|Ai||I0|q0+(n|I0|)γq0|I0|q0+nγq0.mdq_{0}/8\leq\sum_{q=1}^{q_{0}}|I_{q}|=\sum_{i=1}^{n}|A_{i}|\leq|I_{0}|q_{0}+(n-|I_{0}|)\gamma q_{0}\leq|I_{0}|q_{0}+n\gamma q_{0}.

Thus,

|I0|md/8nγmd/9.|I_{0}|\geq md/8-n\gamma\geq md/9.

Without loss of generality we assume that $I_{0}=\{1,2,\ldots,|I_{0}|\}$ and only consider the first $k:=\lceil md/9\rceil$ indices from it. Then $[k]\subset I_{0}$.

Now, by definition, for matrices ME(x,y)M\in E(x,y) we have

M(x+y)2=i=1n|𝐑i(M),x+y|2c1md.\|M(x+y)\|^{2}=\sum_{i=1}^{n}|\left\langle{\bf R}_{i}(M),x+y\right\rangle|^{2}\leq c_{1}\,md.

Therefore there are at most $9c_{1}md$ rows with $|\langle{\bf R}_{i}(M),x+y\rangle|\geq 1/3$. Hence,

|{ik:|𝐑i(M),x+y|<1/3}|md/99c1md(1/99c1)md.|\{i\leq k\,:\,|\langle{\bf R}_{i}(M),x+y\rangle|<1/3\}|\geq md/9-9c_{1}md\geq(1/9-9c_{1})md.

Let k0:=(1/99c1)mdk_{0}:=\lceil(1/9-9c_{1})md\rceil and for every iki\leq k denote

Ωi:={M(,V):|𝐑i(M),x+y|<1/3} and Ω0=(,V).\Omega_{i}:=\{M\in{\mathcal{F}}({\mathcal{I}},V)\,:\,|\left\langle{\bf R}_{i}(M),x+y\right\rangle|<1/3\}\quad\mbox{ and }\quad\Omega_{0}={\mathcal{F}}({\mathcal{I}},V).

Then

(E(x,y))\displaystyle{\mathbb{P}}_{{\mathcal{F}}}(E(x,y)) B[k]|B|=k0(iBΩi)(kk0)maxB[k]|B|=k0(iBΩi).\displaystyle\leq\sum_{B\subset[k]\atop|B|=k_{0}}\,{\mathbb{P}}_{{\mathcal{F}}}\Big{(}\bigcap_{i\in B}\Omega_{i}\Big{)}\leq{k\choose k_{0}}\,\max_{B\subset[k]\atop|B|=k_{0}}\,{\mathbb{P}}_{{\mathcal{F}}}\Big{(}\bigcap_{i\in B}\Omega_{i}\Big{)}.

Without loss of generality we assume that the maximum above is attained at B=[k0]B=[k_{0}]. Then

(E(x,y))(e/(81c1))9c1mdi=1k0(Ωi|Ω1Ωi1)=(e/(81c1))9c1mdi=1k0(Ωi),{\mathbb{P}}_{{\mathcal{F}}}(E(x,y))\leq\left(e/(81c_{1})\right)^{9c_{1}md}\,\,\,\prod_{i=1}^{k_{0}}\,{\mathbb{P}}_{{\mathcal{F}}}(\Omega_{i}|\,\Omega_{1}\cap\ldots\cap\Omega_{i-1})=\left(e/(81c_{1})\right)^{9c_{1}md}\,\,\,\prod_{i=1}^{k_{0}}\,{\mathbb{P}}_{{\mathcal{F}}}(\Omega_{i}), (37)

where at the first step we used the bound ${k\choose k_{0}}={k\choose k-k_{0}}\leq(ek/(k-k_{0}))^{k-k_{0}}$ together with $k=\lceil md/9\rceil$ and $k_{0}=\lceil(1/9-9c_{1})md\rceil$, and at the last step we used the mutual independence of the events $\Omega_{i}$ (with respect to the measure ${\mathbb{P}}_{{\mathcal{F}}}$), see Lemma 6.10.

Next we estimate the factors in the product. Fix ik0i\leq k_{0} and Ai={q:iIq}A_{i}=\{q\,:\,i\in I_{q}\}. Since, by our assumptions, iI0i\in I_{0}, we have |Ai|γq0|A_{i}|\geq\gamma q_{0}. Consider the random variables ξq(i)=ξq(M,x+y,i)\xi_{q}(i)=\xi_{q}(M,x+y,i), qAiq\in A_{i}, defined in (34). Then by (35) we have

(Ωi)\displaystyle{\mathbb{P}}_{{\mathcal{F}}}(\Omega_{i}) ={|𝐑i(M),x+y|<1/3}𝒬(q=0q0ξq(i),1/3)\displaystyle={\mathbb{P}}_{{\mathcal{F}}}\big{\{}|\left\langle{\bf R}_{i}(M),x+y\right\rangle|<1/3\big{\}}\leq\mathcal{Q}_{{\mathcal{F}}}\Big{(}\sum_{q=0}^{q_{0}}\xi_{q}(i),1/3\Big{)}
𝒬(qAiξq(i),1/3)C0α(1α)|Ai|C0α(1α)γq0\displaystyle\leq\mathcal{Q}_{{\mathcal{F}}}\Big{(}\sum_{q\in A_{i}}\xi_{q}(i),1/3\Big{)}\leq\frac{C_{0}\alpha}{\sqrt{(1-\alpha)|A_{i}|}}\leq\frac{C_{0}\alpha}{\sqrt{(1-\alpha)\gamma q_{0}}}

where α=maxqAi𝒬(ξq(i),1/3)\alpha=\max_{q\in A_{i}}\mathcal{Q}_{{\mathcal{F}}}(\xi_{q}(i),1/3). Moreover, in the case q0=1q_{0}=1 we just have

{\mathbb{P}}_{{\mathcal{F}}}(\Omega_{i})\leq\alpha=\mathcal{Q}_{{\mathcal{F}}}(\xi_{1}(i),1/3).

Thus it remains to estimate 𝒬(ξq(i),1/3)\mathcal{Q}_{{\mathcal{F}}}(\xi_{q}(i),1/3) for qAiq\in A_{i}. Fix qAiq\in A_{i}, so that iIqi\in I_{q}. Recall that, by construction, the intersection of the support of 𝐑i(M){\bf R}_{i}(M) with JqJ_{q} is a singleton everywhere on (,V){\mathcal{F}}({\mathcal{I}},V). Denote the corresponding index by j(q,M)=j(q,M,i)j(q,M)=j(q,M,i). Then

ξq(i)=ξq(M,x+y,i)=jJqδij(xj+y1)=xj(q,M)+y1,\xi_{q}(i)=\xi_{q}(M,x+y,i)=\sum_{j\in J_{q}}\delta_{ij}(x_{j}+y_{1})=x_{j(q,M)}+y_{1},

and note that |xj(q,M)|>2/3|x_{j(q,M)}|>2/3 whenever j(q,M)Jqj(q,M)\in J^{\ell}_{q} and xj(q,M)=0x_{j(q,M)}=0 whenever j(q,M)Jrqj(q,M)\in J^{r}_{q}. Observe further that {j(q,M)Jrq}={j(q,M)Jq}=1/2{\mathbb{P}}_{{\mathcal{F}}}\big{\{}j(q,M)\in J^{r}_{q}\big{\}}={\mathbb{P}}_{{\mathcal{F}}}\big{\{}j(q,M)\in J^{\ell}_{q}\big{\}}=1/2. Hence, we obtain

\mathcal{Q}_{{\mathcal{F}}}(\xi_{q}(i),1/3)\leq 1/2=:\alpha.

Combining the probability estimates starting with (37) and using that γ=md/(72n)\gamma=md/(72n), we obtain in the case q0=m1/mCq_{0}=m_{1}/m\geq C^{\prime},

(E(x,y))\displaystyle{\mathbb{P}}_{{\mathcal{F}}}(E(x,y)) (e81c1)9c1md(C02γq0)(1/99c1)md\displaystyle\leq\left(\frac{e}{81c_{1}}\right)^{9c_{1}md}\,\,\,\left(\frac{C_{0}}{\sqrt{2\gamma q_{0}}}\right)^{(1/9-9c_{1})md}
=(e81c1)9c1md(6C0nm1d)(1/99c1)md(C1nm1d)md/20,\displaystyle=\left(\frac{e}{81c_{1}}\right)^{9c_{1}md}\,\,\,\left(\frac{6C_{0}\sqrt{n}}{\sqrt{m_{1}d}}\right)^{(1/9-9c_{1})md}\leq\left(\frac{C_{1}n}{m_{1}d}\right)^{md/20},

provided that c1c_{1} is small enough and C1=36C02C_{1}=36C_{0}^{2}. Note that the bound is meaningful only if CC^{\prime} is large enough. In the case q0=1q_{0}=1 we have

(E(x,y))(e81c1)9c1md(12)(1/99c1)md(12)md/20,{\mathbb{P}}_{{\mathcal{F}}}(E(x,y))\leq\left(\frac{e}{81c_{1}}\right)^{9c_{1}md}\,\,\,\left(\frac{1}{2}\right)^{(1/9-9c_{1})md}\leq\left(\frac{1}{2}\right)^{md/20},

provided that c1c_{1} is small enough. This completes the proof. ∎
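The swapping mechanism exploited above can be seen in a toy simulation (illustration only; the column sets and the values playing the roles of $x_{j}$ and $y_{1}$ below are arbitrary): conditionally on a row having exactly one $1$ in the columns $J_{q}=J^{\ell}_{q}\cup J^{r}_{q}$, the position of that $1$ is uniform over $J_{q}$, so the variable $\xi_{q}(i)=x_{j(q,M)}+y_{1}$ falls into any fixed interval of length $2/3$ with probability about $1/2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 4000, 0.01                                # arbitrary illustration parameters
x_big, y1 = 1.0, 0.2                             # |x_j| > 2/3 on J^ell_q, x_j = 0 on J^r_q

rows = (rng.random((n, 10)) < p).astype(int)     # the 10 columns J_q = J^ell_q (5) + J^r_q (5)
mask = rows.sum(axis=1) == 1                     # rows with exactly one 1 in J_q, i.e. i in I_q
pos = rows[mask].argmax(axis=1)                  # position of that single 1 within J_q

xi = np.where(pos < 5, x_big + y1, y1)           # xi_q(i) = x_{j(q,M)} + y_1
print("P(the 1 lands in J^r_q) ~", np.mean(pos >= 5))
# the two possible values are more than 2/3 apart, so a length-2/3 window captures only one of them
print("largest mass of a length-2/3 window ~",
      max(np.mean(np.abs(xi - y1) <= 1 / 3), np.mean(np.abs(xi - (x_big + y1)) <= 1 / 3)))
```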

6.7 Proof of Theorem 6.1

We are ready to complete the proof. Denote

m=m0=ns0+1:=max(300s01,1/(64p))[n/(64d),n/(2d)].m=m_{0}=n_{s_{0}+1}:=\max(30\ell_{0}^{s_{0}-1},\left\lfloor 1/(64p)\right\rfloor)\in[n/(64d),n/(2d)].

Lemma 6.9 implies that

({x𝒯0𝒯1 such that Mx(64p)κ192(pn)2x})n(1p)n+e1.4np.\mathbb{P}\left(\Big{\{}\exists\;x\in\mathcal{T}_{0}\cup\mathcal{T}_{1}\,\,\,\mbox{ such that }\,\,\,\|Mx\|\leq\frac{(64p)^{\kappa}}{192(pn)^{2}}\,\|x\|\Big{\}}\right)\leq n(1-p)^{n}+e^{-1.4np}.

We now turn to the remaining cases. Fix j{2,3}j\in\{2,3\}. Let

j:={Mn:x𝒯jsuch thatMxc1md2bjx},\displaystyle{\mathcal{E}}_{j}:=\Big{\{}M\in{\mathcal{M}_{n}}\,:\,\exists\,x\in\mathcal{T}_{j}\,\,\,\mbox{such that}\,\,\,\|Mx\|\leq\frac{\sqrt{c_{1}md}}{2\,b_{j}}\,\|x\|\Big{\}},

where c1c_{1} is the constant from Lemma 6.11, and b2=384(pn)3/(64p)κb_{2}=384(pn)^{3}/(64p)^{\kappa}, b3=384Cτ(pn)3.5/(64p)κb_{3}=384C_{\tau}(pn)^{3.5}/(64p)^{\kappa}.

Recall that nrm{\mathcal{E}}_{nrm} was defined in Proposition 3.14. For any matrix MjnrmM\in{\mathcal{E}}_{j}\cap{\mathcal{E}}_{nrm} there exists x=x(M)𝒯jx=x(M)\in\mathcal{T}_{j} satisfying

Mxc1md2bjx.\|Mx\|\leq\frac{\sqrt{c_{1}md}}{2\,b_{j}}\,\|x\|.

Normalize xx so that xns0+j1=1x_{n_{s_{0}+j-1}}^{*}=1, that is, x𝒯jx\in\mathcal{T}_{j}^{\prime}. By Lemma 6.4 we have xbj\|x\|\leq b_{j}.

Let 𝒩j=𝒩j+𝒩j{\mathcal{N}}_{j}={\mathcal{N}}_{j}^{\prime}+{\mathcal{N}}_{j}^{\prime\prime} be the net constructed in Lemma 6.7. Then there exist u𝒩ju\in{\mathcal{N}}_{j}^{\prime} with

u_{n_{s_{0}+j-1}}^{*}\geq 1-1/(C_{\tau}\sqrt{d})>2/3

and u=0u_{\ell}^{*}=0 for >ns0+j\ell>n_{s_{0}+j}, and w𝒩jspan{𝟏}w\in{\mathcal{N}}_{j}^{\prime\prime}\subset{\rm span}\,\{{\bf 1}\}, such that |||x(u+w)|||2n/(Cτd).|||x-(u+w)|||\leq\sqrt{2n}/(C_{\tau}\sqrt{d}). Applying Proposition 3.14 (where nrm{\mathcal{E}}_{nrm} was introduced), and using that CτC_{\tau} is large enough, we obtain that for every matrix MjnrmM\in{\mathcal{E}}_{j}\cap{\mathcal{E}}_{nrm} there exist u=u(M)𝒩ju=u(M)\in{\mathcal{N}}_{j}^{\prime} and w=w(M)𝒩jspan{𝟏}w=w(M)\in{\mathcal{N}}_{j}^{\prime\prime}\subset{\rm span}\,\{{\bf 1}\} with

M(u+w)Mx+M(xuw)c1md/2+2002n/Cτc1md.\|M(u+w)\|\leq\|Mx\|+\|M(x-u-w)\|\leq\sqrt{c_{1}md}/2+200\sqrt{2n}/C_{\tau}\leq\sqrt{c_{1}md}. (38)

Using our choice of ns0+1n_{s_{0}+1}, ns0+2n_{s_{0}+2}, ns0+3n_{s_{0}+3}, Lemma 6.7, and Lemma 6.11 twice — first with m1=m0=ns0+1m_{1}=m_{0}=n_{s_{0}+1}, m2=ns0+2m_{2}=n_{s_{0}+2}, then with m1=ns0+2>m0=ns0+1m_{1}=n_{s_{0}+2}>m_{0}=n_{s_{0}+1}, m2=ns0+3m_{2}=n_{s_{0}+3} (see Remark 6.12), we obtain that for small enough rr and large enough dd the probability (2nrmcard)\mathbb{P}\left({\mathcal{E}}_{2}\cap{\mathcal{E}}_{nrm}\cap{\mathcal{E}}_{card}\right) is bounded by

exp(2ns0+2lnd)2ns0+1d/20exp(ns0+1d/30)exp(n/2000)\exp\left(2n_{s_{0}+2}\ln d\right)2^{-n_{s_{0}+1}d/20}\leq\exp\left(-n_{s_{0}+1}d/30\right)\leq\exp\left(-n/2000\right)

and that the probability (3nrmcard)\mathbb{P}\left({\mathcal{E}}_{3}\cap{\mathcal{E}}_{nrm}\cap{\mathcal{E}}_{card}\right) is bounded by

exp(2ns0+3lnd)(d/(2C))ns0+1d/20exp(nlnd/10000),\exp\left(2n_{s_{0}+3}\ln d\right)\left(\sqrt{d}/(2C)\right)^{-n_{s_{0}+1}d/20}\leq\exp\left(-n\ln d/10000\right),

where card{\mathcal{E}}_{card} is the event introduced in Lemma 6.6 with =2m\ell=2m.

Combining all three cases, we conclude that the desired lower bound fails for some $x\in\mathcal{T}$ with probability at most

2\exp\left(-n/2000\right)+\mathbb{P}\left({\mathcal{E}}_{nrm}^{c}\right)+\mathbb{P}\left({\mathcal{E}}_{card}^{c}\right).

It remains to note that since npnp is large, by Lemma 3.6 (applied with t=30t=30) and by Lemma 6.6,

(nrmc)+(cardc)4e225np+2exp(n/500)exp(10pn).\mathbb{P}\left({\mathcal{E}}_{nrm}^{c}\right)+\mathbb{P}\left({\mathcal{E}}_{card}^{c}\right)\leq 4e^{-225np}+2\exp(-n/500)\leq\exp(-10pn).

\Box

6.8 Proof of Theorem 6.3

Proof.

Clearly, it is enough to show that ${\Upsilon}_{n}(r)\setminus({\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\cup\mathcal{T})\subset{\mathcal{R}}$. Let $x\in{\Upsilon}_{n}(r)\setminus\mathcal{T}$ and set $\sigma:=\sigma_{x}$. Note that $x^{*}_{n_{s_{0}+2}}\leq C_{\tau}\sqrt{d}$, where $s_{0}$ was defined in (27). Denote $m_{0}=\lfloor n/\ln^{2}d\rfloor>2n_{s_{0}+2}$.

Assume first that xx does not satisfy (10). Then by Lemma 3.2, x𝒜𝒞(ρ)x\in\mathcal{AC}(\rho). If xm0ln2dx_{m_{0}}^{*}\leq\ln^{2}d then denoting k=m0k=m_{0}, A=[k,n]A=[k,n], and using the definition of 𝒜𝒞(ρ)\mathcal{AC}(\rho), we observe

xσ(A)(nns0+3k)(1ρ)n/2,\|x_{\sigma(A)}\|\geq\sqrt{(n-n_{s_{0}+3}-k)(1-\rho)}\geq\sqrt{n/2},

whence

xσ(A)xσ(A)n/2ln2dC0p.\frac{\|x_{\sigma(A)}\|}{\|x_{\sigma(A)}\|_{\infty}}\geq\frac{\sqrt{n/2}}{\ln^{2}d}\geq\frac{C_{0}}{\sqrt{p}}.

On the other hand, $x_{m_{0}}^{*}\leq x^{*}_{n_{s_{0}+2}}\leq C_{\tau}\sqrt{d}$, hence $\|x_{\sigma(A)}\|\leq C_{\tau}\sqrt{dn}$. This implies that $x\in{\mathcal{R}}_{k}^{1}\subset{\mathcal{R}}$.

Now, if xm0>ln2dx_{m_{0}}^{*}>\ln^{2}d then denoting k=ns0+2k=n_{s_{0}+2}, A=[k,n]A=[k,n], we get

xσ(A)2i=ns0+2m0(xi)2(m0/2)ln4d(n/4)ln2d,\|x_{\sigma(A)}\|^{2}\geq\sum_{i=n_{s_{0}+2}}^{m_{0}}(x_{i}^{*})^{2}\geq(m_{0}/2)\ln^{4}d\geq(n/4)\ln^{2}d,

whence

xσ(A)xσ(A)nlnd2CτdC0p.\frac{\|x_{\sigma(A)}\|}{\|x_{\sigma(A)}\|_{\infty}}\geq\frac{\sqrt{n}\ln d}{2C_{\tau}\sqrt{d}}\geq\frac{C_{0}}{\sqrt{p}}.

As in the previous case we have xσ(A)Cτdn\|x_{\sigma(A)}\|\leq C_{\tau}\sqrt{dn}, which implies that xk1x\in{\mathcal{R}}_{k}^{1}\subset{\mathcal{R}}.

Next we assume that xx does satisfy (10). Then, by the definition of the set 𝒱n(r,𝐠,δ,ρ){\mathcal{V}}_{n}(r,{\bf g},\delta,\rho) and our function 𝐠{\bf g}, xx does not satisfy the following condition:

i164p:xiexp(ln2(2n/i)) and 164p<in:xi(2n/i)3/2.\displaystyle\forall i\leq\frac{1}{64p}:\,\,\,x^{*}_{i}\leq\exp(\ln^{2}(2n/i))\quad\mbox{ and }\quad\forall\frac{1}{64p}<i\leq n:\,\,\,x^{*}_{i}\leq(2n/i)^{3/2}.

We fix the smallest value of j1j\geq 1 which breaks this condition and consider several cases. Note that since xΥn(r)x\in{\Upsilon}_{n}(r), we must have jrnj\leq rn.

Case 1. 2m0jrn2m_{0}\leq j\leq rn. In this case by the conditions and by minimality of jj, we have xm0(2n/m0)3/2x_{m_{0}}^{*}\leq(2n/m_{0})^{3/2} and xj(2n/j)3/2x_{j}^{*}\geq(2n/j)^{3/2}. Take k=m0k=m_{0} and A=[k,n]A=[k,n]. Then we have

xσ(A)jm0+1xjj/2(2n/j)3/2rn/2(2/r)3/2=2n/r,\|x_{\sigma(A)}\|\geq\sqrt{j-m_{0}+1}\,x_{j}^{*}\geq\sqrt{j/2}\,(2n/j)^{3/2}\geq\sqrt{rn/2}\,(2/r)^{3/2}=2\sqrt{n}/r,

hence

xσ(A)xσ(A)(2r)n(2n/m0)3/2(2r)n(2lnd)3C0p.\frac{\|x_{\sigma(A)}\|}{\|x_{\sigma(A)}\|_{\infty}}\geq\left(\frac{2}{r}\right)\,\frac{\sqrt{n}}{(2n/m_{0})^{3/2}}\geq\left(\frac{2}{r}\right)\frac{\sqrt{n}}{(2\ln d)^{3}}\geq\frac{C_{0}}{\sqrt{p}}.

As above we have xσ(A)Cτdn\|x_{\sigma(A)}\|\leq C_{\tau}\sqrt{dn}, which implies that xk2x\in{\mathcal{R}}_{k}^{2}\subset{\mathcal{R}}.

Case 2. $16C_{0}^{2}n/d\leq j\leq 2m_{0}$. Take $k=\lceil j/2\rceil$ and $A=[k,n]$. Then we have $x_{k}^{*}\leq(2n/k)^{3/2}\leq(4n/j)^{3/2}$, $x_{j}^{*}\geq(2n/j)^{3/2}$, and

xσ(A)jk+1xjj/2(2n/j)3/2(2/r)n.\|x_{\sigma(A)}\|\geq\sqrt{j-k+1}\,x_{j}^{*}\geq\sqrt{j/2}\,(2n/j)^{3/2}\geq(2/r)\,\sqrt{n}.

Therefore,

xσ(A)xσ(A)(j2)1/2(2n/j)3/2(4n/j)3/2C0p.\frac{\|x_{\sigma(A)}\|}{\|x_{\sigma(A)}\|_{\infty}}\geq\left(\frac{j}{2}\right)^{1/2}\frac{(2n/j)^{3/2}}{(4n/j)^{3/2}}\geq\frac{C_{0}}{\sqrt{p}}.

Since x𝒯x\not\in\mathcal{T}, we observe xkCτ2dx_{k}^{*}\leq C_{\tau}^{2}d, hence xσ(A)Cτ2dn\|x_{\sigma(A)}\|\leq C_{\tau}^{2}d\sqrt{n} and xk2x\in{\mathcal{R}}_{k}^{2}\subset{\mathcal{R}}.

In the rest of the proof we show that we must necessarily have j16C02n/dj\geq 16C_{0}^{2}n/d.

Case 3. ns0+1j<C1n/dn_{s_{0}+1}\leq j<C_{1}n/d, where C1=16C02C_{1}=16C_{0}^{2}. Using that x𝒯x\not\in\mathcal{T}, in this case we have

Cτ2dxj(2nj)3/2(2dC1)3/2,C_{\tau}^{2}d\geq x_{j}^{*}\geq\left(\frac{2n}{j}\right)^{3/2}\geq\left(\frac{2d}{C_{1}}\right)^{3/2},

which is impossible for large enough dd.

Case 4. ns0j<ns0+1n_{s_{0}}\leq j<n_{s_{0}+1}. Using that x𝒯x\not\in\mathcal{T} and that ns0+1=1/(64p)=n/(64d)n_{s_{0}+1}=\left\lfloor 1/(64p)\right\rfloor=\left\lfloor n/(64d)\right\rfloor, in this case we have

(6d)C_{\tau}^{2}d\geq x_{j}^{*}\geq\exp(\ln^{2}(2n/j))\geq\exp(\ln^{2}(2n/n_{s_{0}+1}))\geq\exp(\ln^{2}(128d)),

which is impossible for large enough dd.

Case 5. $n_{k}\leq j<n_{k+1}$ for some $1\leq k\leq s_{0}-1$. Recall that $n_{k}=30\ell_{0}^{k-1}$ and recall also that if $s_{0}>1$ (as in this case) then $pn\leq c\sqrt{n\ln n}$. Using that $x\not\in\mathcal{T}$, in this case we have

(Cτ2d)(6d)s0k+1xjexp(ln2(2n/j))exp(ln2(2n/(300k))),(C_{\tau}^{2}d)(6d)^{s_{0}-k+1}\geq x_{j}^{*}\geq\exp(\ln^{2}(2n/j))\geq\exp(\ln^{2}(2n/(30\ell_{0}^{k}))),

hence

(Cτ2d)(6d)s0+1(6d)kexp(ln2(2n/(300k))).(C_{\tau}^{2}d)(6d)^{s_{0}+1}\geq(6d)^{k}\exp(\ln^{2}(2n/(30\ell_{0}^{k}))). (39)

Considering the function $f(k):=k\ln(6d)+\ln^{2}(2n/(30\ell_{0}^{k}))$, we observe that its derivative, $\ln(6d)-2\ln\ell_{0}\,\ln(2n/(30\ell_{0}^{k}))$, is negative for every $k\leq s_{0}-1$ (since $\ell_{0}\geq 25$ and $30\ell_{0}^{k}\leq n_{s_{0}+1}\leq n/(2d)$ in this range), so $f$ is decreasing and the right hand side of (39) is minimized over $k\in[1,s_{0}-1]$ at an endpoint. Thus, to show that (39) is impossible it is enough to consider $k=1,s_{0}-1$ only. Let $k=1$. By (29), $(6d)^{s_{0}}\leq(6d)/(64p)^{\kappa}$, where $\kappa=\frac{\ln(6d)}{\ln\ell_{0}}$. Therefore, the logarithm of the left hand side of (39) is

ln((Cτ2d)(6d)s0+1)4lnd+ln(6d)ln0ln(1/64p).\ln((C_{\tau}^{2}d)(6d)^{s_{0}+1})\leq 4\ln d+\frac{\ln(6d)}{\ln\ell_{0}}\,\ln(1/64p). (40)

On the other hand, $n/\ell_{0}\geq(4\ln(1/p))/p$, therefore the logarithm of the right hand side of (39) is larger than $\ln^{2}(\ln(1/p)/(4p))$. Thus, it is enough to check that

(1/2)ln2(ln(1/p)/(4p))4lnd and (1/2)ln2(ln(1/p)/(4p))ln0ln(6d)ln(1/64p).(1/2)\ln^{2}(\ln(1/p)/(4p))\geq 4\ln d\quad\mbox{ and }\quad(1/2)\ln^{2}(\ln(1/p)/(4p))\ln\ell_{0}\geq\ln(6d)\,\ln(1/64p).

Both inequalities follow since $pn\leq c\sqrt{n\ln n}$, $d=pn$, $d$ and $n$ are large enough, and since $\ell_{0}\geq 25$. Next assume that $k=s_{0}-1$. Note that in this case $\ell_{0}^{k}\leq n/(64d)$. Thus, to disprove (39) it is enough to show that

ln2(64d/15)ln(36Cτ2d3),\ln^{2}(64d/15)\geq\ln(36C_{\tau}^{2}d^{3}),

which clearly holds for large enough dd.

Case 6. 2j<302\leq j<30. In this case we have

(C_{\tau}^{2}d)(6d)^{s_{0}+1}\geq x_{j}^{*}\geq\exp(\ln^{2}(2n/j))\geq\exp(\ln^{2}(2n/30)).

By (40) this implies

4lnd+ln(6d)ln0ln(1/64p)ln2(2n/30),4\ln d+\frac{\ln(6d)}{\ln\ell_{0}}\,\ln(1/64p)\geq\ln^{2}(2n/30),

which is impossible for large enough $n$.

Case 7. j=1j=1. In this case we have (Cτ2d)(6d)s0+2x1exp(ln2(2n))(C_{\tau}^{2}d)(6d)^{s_{0}+2}\geq x_{1}^{*}\geq\exp(\ln^{2}(2n)) and we proceed as in Case 6. ∎

7 Proof of the main theorem

In this section, we combine the results of Sections 4, 5, and 6, as well as Subsection 3.2, to prove the main theorem, Theorem 1.2, and the following improvement for the case of constant $p$:

Theorem 7.1.

There exists an absolute positive constant $c$ with the following property. Let $q\in(0,c)$ be a parameter (independent of $n$). Then there exist $C_{q}$ and $n_{q}\geq 1$ (both depending only on $q$) such that for every $n\geq n_{q}$ and every $p\in(q,c)$ a Bernoulli($p$) $n\times n$ random matrix $M_{n}$ satisfies

{Mn is singular}=(2+on(1))n(1p)n,{\mathbb{P}}\big{\{}\mbox{$M_{n}$ is singular}\big{\}}=(2+o_{n}(1))n\,(1-p)^{n},

and, moreover, for every t>0t>0,

{smin(Mn)Cqn2.5t}t+(1+on(1)){Mn is singular}=t+(2+on(1))n(1p)n.{\mathbb{P}}\big{\{}s_{\min}(M_{n})\leq C_{q}\,n^{-2.5}\,t\big{\}}\leq t+(1+o_{n}(1)){\mathbb{P}}\big{\{}\mbox{$M_{n}$ is singular}\big{\}}=t+(2+o_{n}(1))n\,(1-p)^{n}.
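As a purely numerical illustration of the first equality (not used anywhere in the proofs), one may compare a Monte Carlo estimate of the singularity probability with the prediction $2n(1-p)^{n}$. The dimensions and the value of $p$ below are chosen only so that the simulation is feasible; they are far too small for the asymptotics to be accurate and need not satisfy the assumptions of the theorem.

```python
import numpy as np

rng = np.random.default_rng(4)

def singular_fraction(n, p, trials=20000):
    """Monte Carlo estimate of P{M_n is singular} for an n x n Bernoulli(p) matrix."""
    count = 0
    for _ in range(trials):
        M = (rng.random((n, n)) < p).astype(float)
        if np.linalg.matrix_rank(M) < n:          # numerical rank over the reals
            count += 1
    return count / trials

for n, p in [(30, 0.2), (40, 0.2)]:
    est = singular_fraction(n, p)
    pred = 2 * n * (1 - p) ** n
    print(f"n={n}, p={p}:  MC estimate {est:.4f}   2n(1-p)^n = {pred:.4f}")
```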

At this stage, the scheme of the proof to a large extent follows the approach of Rudelson and Vershynin developed in [41]. However, a crucial part of their argument — “invertibility via distance” (see [41, Lemma 3.5]) — will be reworked in order to keep sharp probability estimates for the matrix singularity and to be able to connect this part of the argument with the previous sections, where we essentially condition on row- and column-sums of our matrix.

We start by restating the main results of Sections 5 and 6 using the vector class ${\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)$ defined by (2), together with Lemma 3.1.

Corollary 7.2.

There are universal constants C1C\geq 1, δ,ρ(0,1)\delta,\rho\in(0,1) and r(0,1)r\in(0,1) with the following property. Let MnM_{n} be a random matrix satisfying (A) with CC and let the growth function 𝐠{\bf g} be given by (30). Then

{Mnxan1x for some xλ0(λ𝒱n(r,𝐠,δ,ρ))}=(1+on(1))n(1p)n,{\mathbb{P}}\Big{\{}\|M_{n}x\|\leq a_{n}^{-1}\|x\|\,\,\,\mbox{ for some }\,\,\,x\notin\bigcup\limits_{\lambda\geq 0}\big{(}\lambda\,{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\big{)}\Big{\}}=(1+o_{n}(1))n\,(1-p)^{n}, (41)

where an=(pn)2c(64p)κmax(1,p1.5n)a_{n}=\frac{(pn)^{2}}{c(64p)^{\kappa}}\,\max\left(1,p^{1.5}n\right), κ=κ(p):=(ln(6pn))/lnpn4ln(1/p).\kappa=\kappa(p):=(\ln(6pn))/\ln\big{\lfloor}\frac{pn}{4\ln(1/p)}\big{\rfloor}.

Further, Theorems 5.1 and 5.2 and Lemma 3.1 are combined as follows.

Corollary 7.3.

There are universal positive constants c,Cc,C with the following property. Let q(0,c)q\in(0,c) be a parameter. Then there exist n0=n0(q)1n_{0}=n_{0}(q)\geq 1, r=r(q),ρ=ρ(q)(0,1)r=r(q),\rho=\rho(q)\in(0,1) such that for nn0n\geq n_{0}, p(q,c)p\in(q,c), δ=r/3\delta=r/3, 𝐠(t)=(2t)3/2{\bf g}(t)=(2t)^{3/2}, the random Bernoulli(pp) n×nn\times n matrix MnM_{n} satisfies (41) with an=Cnln(e/p)a_{n}=C\sqrt{n\ln(e/p)}.

Below is our version of “invertibility via distance,” which deals with pairs of columns.

Lemma 7.4 (Invertibility via distance).

Let r,δ,ρ(0,1)r,\delta,\rho\in(0,1), and let 𝐠{\bf g} be a growth function. Further, let n6/rn\geq 6/r and let AA be an n×nn\times n random matrix. Then for any t>0t>0 we have

{\displaystyle{\mathbb{P}}\big{\{} Axtx for some x𝒱n(r,𝐠,δ,ρ)}\displaystyle\|Ax\|\leq t\,\|x\|\quad\mbox{ for some }\quad x\in{\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)\big{\}}
2(rn)2ij{dist(Hi(A),𝐂i(A))tbn and dist(Hj(A),𝐂j(A))tbn},\displaystyle\leq\frac{2}{(rn)^{2}}\sum\limits_{i\neq j}{\mathbb{P}}\big{\{}{\rm dist}(H_{i}(A),{\bf C}_{i}(A))\leq t\,b_{n}\quad\mbox{ and }\quad{\rm dist}(H_{j}(A),{\bf C}_{j}(A))\leq t\,b_{n}\big{\}},

where the sum is taken over all ordered pairs $(i,j)$ with $i\neq j$ and $b_{n}=\big(\sum_{i=1}^{n}{\bf g}(n/i)^{2}\big)^{1/2}$.

Proof.

For every iji\neq j, denote by 𝟏ij{\bf 1}_{ij} the indicator of the event

ij:={dist(Hi(A),𝐂i(A))tbn and dist(Hj(A),𝐂j(A))tbn}.{\mathcal{E}}_{ij}:=\big{\{}{\rm dist}(H_{i}(A),{\bf C}_{i}(A))\leq t\,b_{n}\quad\mbox{ and }\quad{\rm dist}(H_{j}(A),{\bf C}_{j}(A))\leq t\,b_{n}\big{\}}.

The condition

Axtx\|Ax\|\leq t\,\|x\|

for some x𝒱n=𝒱n(r,𝐠,δ,ρ)x\in{\mathcal{V}}_{n}={\mathcal{V}}_{n}(r,{\bf g},\delta,\rho) implies that for every ini\leq n,

|xi|dist(Hi(A),𝐂i(A))Axtbn,|x_{i}|\,{\rm dist}(H_{i}(A),{\bf C}_{i}(A))\leq\|Ax\|\leq t\,b_{n},

where the last inequality follows from the definition of 𝒱n{\mathcal{V}}_{n}. Since xrn=1x^{*}_{\lfloor rn\rfloor}=1, we get that everywhere on the event {Axtx for some x𝒱n}\{\|Ax\|\leq t\,\|x\|\mbox{ for some }x\in{\mathcal{V}}_{n}\} there are at least rn(rn1)(rn)2/2\lfloor rn\rfloor\,(\lfloor rn\rfloor-1)\geq(rn)^{2}/2 ordered pairs of indices (i,j)(i,j) such that for each pair the event ij{\mathcal{E}}_{ij} occurs. Rewriting this assertion in terms of indicators, we observe

{Axtx for some x𝒱n}{ij𝟏ij(rn)2/2}.\{\|Ax\|\leq t\,\|x\|\mbox{ for some }x\in{\mathcal{V}}_{n}\}\subset\Big{\{}\sum\limits_{i\neq j}{\bf 1}_{ij}\geq(rn)^{2}/2\Big{\}}.

Applying Markov’s inequality in order to estimate probability of the event on the right hand side, we obtain the desired result. ∎
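The elementary inequality $|x_{i}|\,{\rm dist}(H_{i}(A),{\bf C}_{i}(A))\leq\|Ax\|$ used at the beginning of the proof follows by projecting $Ax$ onto the orthogonal complement of $H_{i}(A)$. The sketch below (an illustration only, with arbitrary parameters) verifies it on a random instance, interpreting $H_{i}(A)$ as the span of the columns of $A$ other than the $i$-th one (this interpretation is an assumption made here for the illustration).

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 50, 0.1                                   # arbitrary illustration parameters
A = (rng.random((n, n)) < p).astype(float)
x = rng.standard_normal(n)
Ax_norm = np.linalg.norm(A @ x)

for i in range(n):
    H = np.delete(A, i, axis=1)                  # columns spanning H_i(A)
    coef, *_ = np.linalg.lstsq(H, A[:, i], rcond=None)
    dist_i = np.linalg.norm(A[:, i] - H @ coef)  # dist(H_i(A), C_i(A))
    assert abs(x[i]) * dist_i <= Ax_norm + 1e-8
print("checked |x_i| * dist(H_i(A), C_i(A)) <= ||Ax|| for every i")
```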

Proof of Theorems 1.2 and 7.1.

The proofs of both theorems are almost the same; the only difference is that Theorem 7.1 uses Corollary 7.3 while Theorem 1.2 uses Corollary 7.2. Let the parameters $\delta,\rho,r,{\bf g},a_{n}$ be taken from Corollary 7.2 or from Corollary 7.3, respectively. We always write ${\mathcal{V}}_{n}$ for ${\mathcal{V}}_{n}(r,{\bf g},\delta,\rho)$. Let $b_{n}=\big(\sum_{i=1}^{n}{\bf g}(n/i)^{2}\big)^{1/2}$. Without loss of generality, we can assume that $n\geq 6/r$. Fix $t\in(0,1]$, and denote by ${\mathcal{E}}$ the complement of the event

{Mnxan1x or Mnxan1x for some xλ0(λ𝒱n)}.\Big{\{}\|M_{n}x\|\leq a_{n}^{-1}\|x\|\;\mbox{ or }\;\|M_{n}^{\top}x\|\leq a_{n}^{-1}\|x\|\quad\mbox{ for some }\quad x\notin\bigcup\limits_{\lambda\geq 0}\big{(}\lambda\,{\mathcal{V}}_{n}\big{)}\Big{\}}.

For i=1,2i=1,2 denote

i:={dist(Hi(Mn),𝐂i(Mn))an1t}.{\mathcal{E}}_{i}:=\big{\{}{\rm dist}(H_{i}(M_{n}),{\bf C}_{i}(M_{n}))\leq a_{n}^{-1}\,t\big{\}}.

Applying Corollary 7.2 (or Corollary 7.3), Lemma 7.4 and the invariance of the conditional distribution of MnM_{n} given {\mathcal{E}} under permutation of columns, we obtain

\displaystyle{\mathbb{P}} {smin(Mn)(anbn)1t}\displaystyle\big{\{}s_{\min}(M_{n})\leq(a_{n}b_{n})^{-1}t\big{\}}
(2+on(1))n(1p)n+({Mnx(anbn)1tx for some x𝒱n})\displaystyle\leq(2+o_{n}(1))n\,(1-p)^{n}+{\mathbb{P}}\big{(}\big{\{}\|M_{n}x\|\leq(a_{n}b_{n})^{-1}\,t\|x\|\quad\mbox{ for some }\quad x\in{\mathcal{V}}_{n}\big{\}}\cap{\mathcal{E}}\big{)}
(2+on(1))n(1p)n+2r2(12).\displaystyle\leq(2+o_{n}(1))n\,(1-p)^{n}+\frac{2}{r^{2}}\,{\mathbb{P}}\big{(}{\mathcal{E}}\cap{\mathcal{E}}_{1}\cap{\mathcal{E}}_{2}\big{)}.

At the next step, we consider events

Ωi:={|supp𝐂i(Mn)|[pn/8,8pn]},i=1,2, and Ω:=Ω1Ω2.\Omega_{i}:=\big{\{}|{\rm supp\,}{\bf C}_{i}(M_{n})|\in[pn/8,8pn]\big{\}},\,i=1,2,\quad\mbox{ and }\quad\Omega:=\Omega_{1}\cup\Omega_{2}.

Since the columns of $M_{n}$ are independent and consist of i.i.d. Bernoulli($p$) variables, applying Lemma 3.4, we observe

(Ωc)=(Ω1c)(Ω2c)(1p)n.{\mathbb{P}}\big{(}\Omega^{c}\big{)}={\mathbb{P}}\big{(}\Omega_{1}^{c}\big{)}{\mathbb{P}}\big{(}\Omega_{2}^{c}\big{)}\leq(1-p)^{n}.

Therefore, in view of equidistribution of the first two columns, we get

\displaystyle{\mathbb{P}} (12)(1p)n+(12Ω)(1p)n+2(1Ω1).\displaystyle\big{(}{\mathcal{E}}\cap{\mathcal{E}}_{1}\cap{\mathcal{E}}_{2}\big{)}\leq(1-p)^{n}+{\mathbb{P}}\big{(}{\mathcal{E}}\cap{\mathcal{E}}_{1}\cap{\mathcal{E}}_{2}\cap\Omega\big{)}\leq(1-p)^{n}+2{\mathbb{P}}\big{(}{\mathcal{E}}\cap{\mathcal{E}}_{1}\cap\Omega_{1}\big{)}.

Denote by 𝐘{\bf Y} a random unit vector orthogonal to (and measurable with respect to) H1(Mn)H_{1}(M_{n}). Note that on the event 1{\mathcal{E}}_{1} the vector 𝐘{\bf Y} satisfies

|𝐘,𝐂1(Mn)|=Mn𝐘an1t𝐘,|\langle{\bf Y},{\bf C}_{1}(M_{n})\rangle|=\|M_{n}^{\top}{\bf Y}\|\leq a_{n}^{-1}\,t\,\|{\bf Y}\|,

which implies that on the event 1{\mathcal{E}}\cap{\mathcal{E}}_{1} we also have 𝐘rn0{\bf Y}^{*}_{\lfloor rn\rfloor}\neq 0, and 𝐙:=𝐘/𝐘rn𝒱n{\bf Z}:={\bf Y}/{\bf Y}^{*}_{\lfloor rn\rfloor}\in{\mathcal{V}}_{n}. By the definition of 𝒱n{\mathcal{V}}_{n}, we have 𝐙bn\|{\bf Z}\|\leq b_{n}, therefore,

P0:=(1Ω1)(Ω1{There is ZH1(Mn)𝒱n:|Z,𝐂1(Mn)|an1bnt}).\displaystyle P_{0}:={\mathbb{P}}\big{(}{\mathcal{E}}\cap{\mathcal{E}}_{1}\cap\Omega_{1}\big{)}\leq{\mathbb{P}}\big{(}\Omega_{1}\cap\big{\{}\mbox{There is $Z\in H_{1}(M_{n})^{\perp}\cap{\mathcal{V}}_{n}$}:\;|\langle Z,{\bf C}_{1}(M_{n})\rangle|\leq a_{n}^{-1}\,b_{n}\,t\big{\}}\big{)}.

On the other hand, applying Theorem 2.2 with R=2R=2, we get that for some constants K11K_{1}\geq 1 and K24K_{2}\geq 4, with probability at least 1exp(2pn)1-\exp(-2pn),

H1(Mn)𝒱n{xΥn(r):𝐔𝐃n(x,m,K1,K2)exp(2pn) for any m[pn/8,8pn]}.\displaystyle H_{1}(M_{n})^{\perp}\cap{\mathcal{V}}_{n}\subset\big{\{}x\in{\Upsilon}_{n}(r):\;{\bf UD}_{n}(x,m,K_{1},K_{2})\geq\exp(2pn)\,\,\,\mbox{ for any }\,\,\,m\in[pn/8,8pn]\big{\}}.

Combining the last two assertions and applying Theorem 2.1, we observe

P0exp(2pn)+(\displaystyle P_{0}\leq\exp(-2pn)+{\mathbb{P}}\big{(} Ω1{There is ZH1(Mn)𝒱n:|Z,𝐂1(Mn)|an1bnt, and\displaystyle\Omega_{1}\cap\big{\{}\mbox{There is $Z\in H_{1}(M_{n})^{\perp}\cap{\mathcal{V}}_{n}$}:\;|\langle Z,{\bf C}_{1}(M_{n})\rangle|\leq a_{n}^{-1}\,b_{n}\,t,\mbox{ and }
𝐔𝐃n(Z,m,K1,K2)exp(2pn) for any m[pn/8,8pn]})\displaystyle{\bf UD}_{n}(Z,m,K_{1},K_{2})\geq\exp(2pn)\mbox{ for any }m\in[pn/8,8pn]\big{\}}\big{)}
exp(2pn)+supm[pn/8,8pn],yΥn(r),𝐔𝐃n(y,m,K1,K2)exp(2pn){|y,𝐂1(Mn)|an1bnt||supp𝐂1(Mn)|=m}\displaystyle\leq\exp(-2pn)+\sup\limits_{\begin{subarray}{c}m\in[pn/8,8pn],\,y\in{\Upsilon}_{n}(r),\\ {\bf UD}_{n}(y,m,K_{1},K_{2})\geq\exp(2pn)\end{subarray}}{\mathbb{P}}\big{\{}|\langle y,{\bf C}_{1}(M_{n})\rangle|\leq a_{n}^{-1}b_{n}\,t\,\,\big{|}\,\,|{\rm supp\,}{\bf C}_{1}(M_{n})|=m\big{\}}
(1+C2.1)exp(2pn)+C2.1bnanpn/8t.\displaystyle\leq(1+C_{\text{\tiny\ref{p: cf est}}})\exp(-2pn)+\frac{C_{\text{\tiny\ref{p: cf est}}}b_{n}}{a_{n}\sqrt{pn/8}}\,t.

Thus

{smin(Mn)(anbn)1t}(2+on(1))n(1p)n+8C2.1bnr2anpnt.{\mathbb{P}}\big{\{}s_{\min}(M_{n})\leq(a_{n}b_{n})^{-1}t\big{\}}\leq(2+o_{n}(1))n\,(1-p)^{n}+\frac{8C_{\text{\tiny\ref{p: cf est}}}b_{n}}{r^{2}\,a_{n}\sqrt{pn}}\,t.

By rescaling tt we obtain

{smin(Mn)r2pn(8C2.1bn2)t}(2+on(1))n(1p)n+t,0t8C2.1bnr2anpn.{\mathbb{P}}\Big{\{}s_{\min}(M_{n})\leq\frac{r^{2}\,\sqrt{pn}}{(8C_{\text{\tiny\ref{p: cf est}}}b_{n}^{2})}\,t\Big{\}}\leq(2+o_{n}(1))n\,(1-p)^{n}+t,\quad 0\leq t\leq\frac{8C_{\text{\tiny\ref{p: cf est}}}b_{n}}{r^{2}\,a_{n}\sqrt{pn}}.

In the case of constant pp (applying Corollary 7.3) we have an=Cnln(e/p)a_{n}=C\sqrt{n\ln(e/p)} and bn23n3/2b_{n}\leq 2\sqrt{3}n^{3/2}, and we get the small ball probability estimate of Theorem 7.1.

In the case of “general” pp (with the application of Corollary 7.2) we have an=(pn)2c(64p)κmax(1,p1.5n)a_{n}=\frac{(pn)^{2}}{c(64p)^{\kappa}}\,\max\left(1,p^{1.5}n\right) and bnexp(1.5ln2(2n))b_{n}\leq\exp(1.5\ln^{2}(2n)). Therefore,

r2pn(8C2.1bn2)exp(3ln2(2n))\frac{r^{2}\,\sqrt{pn}}{(8C_{\text{\tiny\ref{p: cf est}}}b_{n}^{2})}\geq\exp(-3\ln^{2}(2n))

for large enough nn, and the smins_{\min} estimate follows.

In both cases the upper bound on tt, 8C2.1bnr2anpn\frac{8C_{\text{\tiny\ref{p: cf est}}}b_{n}}{r^{2}\,a_{n}\sqrt{pn}}, is greater than 11, so we may omit it.

Finally, applying the argument of Subsection 3.2, we get the matching lower bound for the singularity probability. This completes the proof. ∎

8 Open questions

The result of this paper leaves open the problem of estimating the singularity probability for Bernoulli matrices in two regimes: when npnnp_{n} is logarithmic in nn and when pnp_{n} is larger than the constant C1C^{-1} from Theorem 1.2.

For the first regime, we recall that the singularity probability of $M_{n}$, with $np_{n}$ in a (small) neighborhood of $\ln n$, was determined up to a $1+o_{n}(1)$ factor in the work of Basak–Rudelson [5]. It would certainly be of interest to bridge that result and the main theorem of this paper.

Problem 8.1 (A bridge: Theorem 1.2 to Basak–Rudelson).

Let pnp_{n} satisfy

1lim infnpn/lnnlim supnpn/lnn<,1\leq\liminf np_{n}/\ln n\leq\limsup np_{n}/\ln n<\infty,

and for each nn let MnM_{n} be the n×nn\times n matrix with i.i.d. Bernoulli(pnp_{n}) entries. Show that

{Mn is singular}=(1+on(1)){Mn has a zero row or a zero column}.{\mathbb{P}}\big{\{}M_{n}\mbox{ is singular}\big{\}}=(1+o_{n}(1)){\mathbb{P}}\big{\{}M_{n}\mbox{ has a zero row or a zero column}\big{\}}.

Note that the main technical result for unstructured (gradual non-constant) vectors, Theorem 2.2 proved in Section 4, remains valid for these values of $p_{n}$. It may therefore be expected that the above problem can be positively resolved by finding an efficient treatment of the structured vectors (the complement of the gradual non-constant vectors), which would replace (or augment) the argument from Section 6.

In contrast, the second problem, the singularity of random Bernoulli matrices with large values of $p_{n}$, seems to require essentially new arguments for working with the unstructured vectors, as the basic idea of Section 4 (gaining on anti-concentration estimates by grouping together several components of a random vector) does not seem to be applicable in this regime.

Problem 8.2 (Optimal singularity probability for dense Bernoulli matrices below the 1/21/2 threshold).

Let the sequence pnp_{n} satisfy

0<lim infpnlim suppn<1/2.0<\liminf p_{n}\leq\limsup p_{n}<1/2.

Show that

{Mn is singular}\displaystyle{\mathbb{P}}\big{\{}M_{n}\mbox{ is singular}\big{\}} =(1+on(1)){Mn has a zero row or a zero column}=(2+on(1))n(1pn)n.\displaystyle=(1+o_{n}(1)){\mathbb{P}}\big{\{}M_{n}\mbox{ has a zero row or a zero column}\big{\}}=(2+o_{n}(1))n\,(1-p_{n})^{n}.

Acknowledgments

K.T. was partially supported by the Sloan Research Fellowship.

References

  • [1] A.S. Bandeira, R. van Handel, Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Ann. Probab. 44 (2016), 2479–2506.
  • [2] A. Basak, N. Cook and O. Zeitouni, Circular law for the sum of random permutation matrices, Electronic Journal of Probability, 23 (2018), Paper No. 33, 51 pp.
  • [3] A. Basak and M. Rudelson, Invertibility of sparse non-Hermitian matrices, Adv. Math. 310 (2017), 426–483. MR3620692
  • [4] A. Basak and M. Rudelson, The circular law for sparse non-Hermitian matrices, arXiv:1707.03675
  • [5] A. Basak and M. Rudelson, Sharp transition of the invertibility of the adjacency matrices of random graphs, arXiv:1809.08454
  • [6] S. Boucheron, G. Lugosi, P. Massart, Concentration inequalities. A nonasymptotic theory of independence. With a foreword by Michel Ledoux. Oxford University Press, Oxford, 2013.
  • [7] J. Bourgain, V. H. Vu and P. M. Wood, On the singularity probability of discrete random matrices, J. Funct. Anal. 258 (2010), no. 2, 559–603. MR2557947
  • [8] D. Chafaï, O. Guédon, G. Lecué, A. Pajor, Interactions between compressed sensing random matrices and high dimensional geometry, Panoramas et Synthèses [Panoramas and Syntheses], 37. Soc. Math. de France, Paris, 2012.
  • [9] N. A. Cook, On the singularity of adjacency matrices for random regular digraphs, Probab. Theory Related Fields 167 (2017), no. 1-2, 143–200. MR3602844
  • [10] N. Cook, The circular law for random regular digraphs, Ann. Inst. Henri Poincare Probab. Stat., to appear, arXiv:1703.05839
  • [11] L. Devroye, G. Lugosi, Combinatorial methods in density estimation. Springer Series in Statistics. Springer-Verlag, New York, 2001.
  • [12] P. Erdös, On a lemma of Littlewood and Offord, Bull. Amer. Math. Soc. 51 (1945), 898–902. MR0014608
  • [13] C. G. Esseen, On the Kolmogorov-Rogozin inequality for the concentration function, Z. Wahrsch. Verw. Gebiete 5 (1966), 210–216. MR0205297
  • [14] F. Götze and A. Tikhomirov, The circular law for random matrices, Ann. Probab. 38 (2010), no. 4, 1444–1491. MR2663633
  • [15] J. Huang, Invertibility of adjacency matrices for random dd-regular graphs, preprint, arXiv:1807.06465, 2018.
  • [16] J. Kahn, J. Komlós and E. Szemerédi, On the probability that a random ±1\pm 1-matrix is singular, J. Amer. Math. Soc. 8 (1995), no. 1, 223–240. MR1260107
  • [17] H. Kesten, A sharper form of the Doeblin-Lévy-Kolmogorov-Rogozin inequality for concentration functions, Math. Scand. 25 (1969), 133–144. MR0258095
  • [18] J. Komlós, On the determinant of (0, 1)(0,\,1) matrices, Studia Sci. Math. Hungar 2 (1967), 7–21. MR0221962
  • [19] B. Landon, P. Sosoe, H. Yau, Fixed energy universality of Dyson Brownian motion, Adv. Math. 346 (2019), 1137–1332.
  • [20] M. Ledoux, The concentration of measure phenomenon. Mathematical Surveys and Monographs, 89. American Mathematical Society, Providence, RI, 2001.
  • [21] J. E. Littlewood and A. C. Offord, On the number of real roots of a random algebraic equation. III, Rec. Math. [Mat. Sbornik] N.S. 12(54) (1943), 277–286. MR0009656
  • [22] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann and P. Youssef, Anti-concentration property for random digraphs and invertibility of their adjacency matrices, C. R. Math. Acad. Sci. Paris 354 (2016), no. 2, 121–124. MR3456885
  • [23] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann and P. Youssef, Adjacency matrices of random digraphs: singularity and anti-concentration, J. Math. Anal. Appl. 445 (2017), no. 2, 1447–1491. MR3545253
  • [24] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann and P. Youssef, The smallest singular value of a shifted dd-regular random square matrix, Probab. Theory Related Fields, 173 (2019), 1301–1347.
  • [25] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann and P. Youssef, The circular law for sparse random regular digraphs, J. European Math. Soc., to appear.
  • [26] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann and P. Youssef, The rank of random regular digraphs of constant degree, J. of Complexity, 48 (2018), 103–110.
  • [27] A. E. Litvak, A. Lytova, K. Tikhomirov, N. Tomczak-Jaegermann and P. Youssef, Structure of eigenvectors of random regular digraphs, Trans. Amer. Math. Soc., 371 (2019), 8097–8172.
  • [28] A. E. Litvak, A. Pajor, M. Rudelson and N. Tomczak-Jaegermann, Smallest singular value of random matrices and geometry of random polytopes, Adv. Math. 195 (2005), no. 2, 491–523. MR2146352
  • [29] A. E. Litvak and O. Rivasplata, Smallest singular value of sparse random matrices, Studia Math. 212 (2012), no. 3, 195–218. MR3009072
  • [30] G. V. Livshyts, The smallest singular value of heavy-tailed not necessarily i.i.d. random matrices via random rounding, arXiv:1811.07038.
  • [31] G. V. Livshyts, K. Tikhomirov, R. Vershynin, The smallest singular value of inhomogeneous square random matrices, arXiv:1909.04219
  • [32] A. Lytova, K. Tikhomirov, On delocalization of eigenvectors of random non-Hermitian matrices, Probab. Theor. Rel. Fields, to appear.
  • [33] K. Luh, S. Meehan, H.H. Nguyen, Some new results in random matrices over finite fields, arXiv:1907.02575
  • [34] K. Luh, S. O’Rourke, Eigenvector Delocalization for Non-Hermitian Random Matrices and Applications, arXiv:1810.00489
  • [35] A. Mészáros, The distribution of sandpile groups of random regular graphs, preprint, arXiv:1806.03736, 2018.
  • [36] H.H. Nguyen and M.M. Wood, Cokernels of adjacency matrices of random r-regular graphs, preprint, arXiv:1806.10068, 2018.
  • [37] E. Rebrova, K. Tikhomirov, Coverings of random ellipsoids, and invertibility of matrices with i.i.d. heavy-tailed entries, Israel J. Math., to appear.
  • [38] B. A. Rogozin, On the increase of dispersion of sums of independent random variables, Teor. Verojatnost. i Primenen 6 (1961), 106–108. MR0131894
  • [39] M. Rudelson, Invertibility of random matrices: norm of the inverse, Ann. of Math. 168 (2008), 575–600.
  • [40] M. Rudelson, Recent developments in non-asymptotic theory of random matrices, in Modern aspects of random matrix theory, 83–120, Proc. Sympos. Appl. Math., 72, Amer. Math. Soc., Providence, RI. MR3288229
  • [41] M. Rudelson and R. Vershynin, The Littlewood–Offord problem and invertibility of random matrices, Adv. Math. 218 (2008), no. 2, 600–633. MR2407948
  • [42] M. Rudelson and R. Vershynin, Smallest singular value of a random rectangular matrix, Comm. Pure Appl. Math. 62 (2009), no. 12, 1707–1739. MR2569075
  • [43] M. Rudelson and R. Vershynin, No-gaps delocalization for general random matrices, Geom. Funct. Anal. 26 (2016), no. 6, 1716–1776. MR3579707
  • [44] T. Tao and V. Vu, On random ±1\pm 1 matrices: singularity and determinant, Random Structures Algorithms 28 (2006), no. 1, 1–23. MR2187480
  • [45] T. Tao and V. Vu, On the singularity probability of random Bernoulli matrices, J. Amer. Math. Soc. 20 (2007), no. 3, 603–628. MR2291914
  • [46] T. Tao and V. Vu, Random matrices: the circular law, Commun. Contemp. Math. 10 (2008), 261–307. MR2409368
  • [47] T. Tao and V. H. Vu, Inverse Littlewood-Offord theorems and the condition number of random discrete matrices, Ann. of Math. 169 (2009), 595–632. MR2480613
  • [48] K. Tikhomirov, Singularity of random Bernoulli matrices, Annals of Math., to appear. arXiv:1812.09016

Alexander E. Litvak
Dept. of Math. and Stat. Sciences,
University of Alberta,
Edmonton, AB, Canada, T6G 2G1.
e-mail: [email protected]

Konstantin E. Tikhomirov
School of Math., GeorgiaTech,
686 Cherry street,
Atlanta, GA 30332, USA.
e-mail: [email protected]