
Spectrum of Heavy-Tailed Elliptic Random Matrices

Andrew Campbell Department of Mathematics, University of Colorado at Boulder, Boulder, CO 80309 [email protected]  and  Sean O’Rourke Department of Mathematics, University of Colorado at Boulder, Boulder, CO 80309 [email protected]
Abstract.

An elliptic random matrix X is a square matrix whose (i,j)-entry X_{ij} is a random variable independent of every other entry except possibly X_{ji}. Elliptic random matrices generalize Wigner matrices and non-Hermitian random matrices with independent entries. When the entries of an elliptic random matrix have mean zero and unit variance, the empirical spectral distribution is known to converge to the uniform distribution on the interior of an ellipse determined by the covariance of the mirrored entries.

We consider elliptic random matrices whose entries fail to have two finite moments. Our main result shows that when the entries of an elliptic random matrix are in the domain of attraction of an \alpha-stable random variable, for 0<\alpha<2, the empirical spectral measure converges, in probability, to a deterministic limit. This generalizes a result of Bordenave, Caputo, and Chafaï for heavy-tailed matrices with independent and identically distributed entries. The key elements of the proof are (i) a general bound on the least singular value of elliptic random matrices under no moment assumptions; and (ii) the convergence, in an appropriate sense, of the matrices to a random operator on the Poisson Weighted Infinite Tree.

A. Campbell has been supported in part by NSF grant ECCS-1610003.
S. O’Rourke has been supported in part by NSF grants DMS-1810500 and ECCS-1610003.

1. Introduction

Let \operatorname{Mat}_{n}(\mathbb{F}) be the set of n\times n matrices over the field \mathbb{F}. For a matrix A\in\operatorname{Mat}_{n}(\mathbb{C}), the singular values of A are the square roots of the eigenvalues of AA^{\ast}, where A^{\ast} is the conjugate transpose of A. We let s_{n}(A)\leq\cdots\leq s_{1}(A) denote the ordered singular values of A, and \lambda_{1}(A),\dots,\lambda_{n}(A)\in\mathbb{C} the eigenvalues of A in no particular order. For a matrix A\in\operatorname{Mat}_{n}(\mathbb{C}) we define the empirical spectral measure

\mu_{A}:=\frac{1}{n}\sum_{i=1}^{n}\delta_{\lambda_{i}(A)}

and the empirical singular value measure

\nu_{A}:=\frac{1}{n}\sum_{i=1}^{n}\delta_{s_{i}(A)}.

These measures are central objects in random matrix theory, and the goal of this paper is to study the asymptotic behavior of the empirical spectral measure for a class of heavy-tailed elliptic random matrices.
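Both measures are easy to form numerically. The following minimal Python sketch (not from the paper; numpy only, with an illustrative Gaussian matrix) builds the atoms of \mu_{A} and \nu_{A} and checks that the singular values are indeed the square roots of the eigenvalues of AA^{\ast}.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)) / np.sqrt(n)  # illustrative Gaussian matrix

eigvals = np.linalg.eigvals(A)                 # atoms of the empirical spectral measure mu_A
singvals = np.linalg.svd(A, compute_uv=False)  # atoms of the singular value measure nu_A

# The singular values are the square roots of the eigenvalues of A A*:
gram_eigs = np.sort(np.linalg.eigvalsh(A @ A.conj().T))[::-1]
assert np.allclose(np.sqrt(gram_eigs), singvals)
```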

Elliptic random matrices can be thought of as interpolating between random matrices whose entries are independent and identically distributed (i.i.d.) and Wigner matrices. We now give a precise definition.

Definition 1.1 (Elliptic Random Matrix).

Let (\xi_{1},\xi_{2}) be a random vector with complex-valued random variable entries, \zeta a complex random variable, and X_{n}=(X_{ij})_{i,j=1}^{n} an n\times n random matrix. We say X_{n} is an elliptic random matrix if

  • (i)

    \{X_{ii}:1\leq i\leq n\}\cup\{(X_{ij},X_{ji}):1\leq i<j\leq n\} is a collection of independent random elements;

  • (ii)

    the pairs \{(X_{ij},X_{ji})\}_{1\leq i<j\leq n} are independent copies of (\xi_{1},\xi_{2});

  • (iii)

    the diagonal elements \{X_{ii}:1\leq i\leq n\} are independent copies of \zeta.

We refer to (\xi_{1},\xi_{2}),\zeta as the atom variables of the matrix X_{n}.

Elliptic random matrices were originally introduced by Girko [25, 26] in the 1980s, with the name coming from the limit of the empirical spectral measure. When the entries of the matrix have four finite moments, the limiting empirical spectral measure was investigated by Naumov [38]. The general case, when the entries are only assumed to have finite variance, was studied in [39].

Theorem 1.2 (Elliptic law for real random matrices, Theorem 1.5 in [39]).

Let X_{n} be an n\times n elliptic random matrix with real atom variables (\xi_{1},\xi_{2}),\zeta. Assume \xi_{1},\xi_{2} have mean zero and unit variance, and \mathbb{E}[\xi_{1}\xi_{2}]=:\rho with -1<\rho<1. Additionally assume \zeta has mean zero and finite variance. Then almost surely the empirical spectral measure \mu_{\frac{1}{\sqrt{n}}X_{n}} of \frac{1}{\sqrt{n}}X_{n} converges weakly to the uniform probability measure on

\mathcal{E}_{\rho}:=\left\{z\in\mathbb{C}:\frac{\operatorname{Re}(z)^{2}}{(1+\rho)^{2}}+\frac{\operatorname{Im}(z)^{2}}{(1-\rho)^{2}}\leq 1\right\}

as n\rightarrow\infty.
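Theorem 1.2 is easy to visualize numerically. The sketch below is illustrative only (it uses jointly Gaussian mirrored pairs, which satisfy the hypotheses) and checks that, for large n, almost all eigenvalues of \frac{1}{\sqrt{n}}X_{n} fall inside a slight enlargement of \mathcal{E}_{\rho}.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 1000, 0.5

# Mirrored pairs (X_ij, X_ji) jointly Gaussian: mean zero, unit variance,
# correlation rho; the diagonal is i.i.d. standard Gaussian.
iu = np.triu_indices(n, k=1)
L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
pairs = L @ rng.standard_normal((2, len(iu[0])))
X = np.zeros((n, n))
X[iu] = pairs[0]                       # upper-triangular entries X_ij
X[(iu[1], iu[0])] = pairs[1]           # mirrored entries X_ji
X[np.diag_indices(n)] = rng.standard_normal(n)

lam = np.linalg.eigvals(X / np.sqrt(n))
# Fraction of eigenvalues inside (a slight enlargement of) the ellipse E_rho:
inside = lam.real**2 / (1 + rho)**2 + lam.imag**2 / (1 - rho)**2 <= 1.05
print(f"{inside.mean():.3f} of eigenvalues lie in E_rho")
```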

Elliptic random matrices have also been studied in [28, 29, 40, 41].

Our main result, Theorem 1.6, gives an analogous result for heavy-tailed elliptic random matrices, i.e. when \mathbb{E}|\xi_{1}|^{2} and \mathbb{E}|\xi_{2}|^{2} are both infinite. In 1994, Cizeau and Bouchaud [18] introduced Lévy matrices as a heavy-tailed analogue of the Gaussian Orthogonal Ensemble (GOE). Instead of Gaussian entries, these matrices have entries in the domain of attraction of an \alpha-stable random variable, for 0<\alpha<2. They predicted a deterministic limit \mu_{\alpha}, which depends only on \alpha, for the empirical spectral measures of these matrices when properly scaled. Convergence to a deterministic limit was first proved by Ben Arous and Guionnet [8] and later by Bordenave, Caputo, and Chafaï [12], with an alternative characterization, in their study of random Markov matrices. In [13] Bordenave, Caputo, and Chafaï proved that the empirical spectral measure of random matrices with i.i.d. entries in the domain of attraction of a complex \alpha-stable random variable converges almost surely to an isotropic measure on \mathbb{C} with unbounded support. Notably, for Lévy matrices the limiting spectral measure inherits the tail behavior of the entries, while the limiting spectral measure of heavy-tailed random matrices with i.i.d. entries has a lighter tail and finite moments of every order.

Much of the work on heavy-tailed random matrices has been on the spectrum of symmetric matrices, either Lévy matrices or sample covariance matrices [8, 4, 5, 7, 11, 2, 49, 12, 31]. Motivated by questions of delocalization from physics, there has also been considerable work on the eigenvectors of symmetric heavy-tailed matrices [16, 17, 1, 37, 18, 52, 9].

As is often the case in random matrix theory, most of the work on heavy-tailed random matrices has focused on ensembles whose entries are independent up to symmetry conditions on the matrix. Work on matrices with dependent entries is still limited. Heavy-tailed matrices with normalized rows have been considered for random Markov chains in [12, 14] and for sample correlation matrices in [31]. In [6], extreme eigenvalue statistics of symmetric heavy-tailed random matrices with m-dependent entries were studied and shown to converge to a Poisson process. This m-dependence can be thought of as a short-range dependence meant to model stock returns that depend on stocks from the same sector, of size determined by m. To the best of our knowledge there are no results on non-Hermitian heavy-tailed matrices with long-range dependence between entries from different rows, outside the random reversible Markov chains studied in [12].

The key question when approaching heavy-tailed elliptic random matrices is how to measure the dependence between \xi_{1} and \xi_{2}. Without two finite moments, the covariance between \xi_{1} and \xi_{2}, which was the key parameter in Theorem 1.2, cannot be defined. Similar notions, such as covariation or codifference, exist for \alpha-stable random variables, but they do not seem sufficient for our purposes. The difference is that the covariation does not provide as much information for \alpha-stable random vectors as the covariance does for Gaussian random vectors. If X=(X_{1},X_{2}) is a bivariate Gaussian random vector where X_{1} and X_{2} are standard real Gaussian random variables, then the correlation \rho=\mathbb{E}[X_{1}X_{2}] uniquely determines the distribution of X. Thus one approach to measuring dependence is to find a parameter which uniquely determines the distribution of a properly normalized \alpha-stable random vector. The distribution of an \alpha-stable random vector Y in \mathbb{R}^{n} is determined uniquely through its characteristic function, which takes the form

\mathbb{E}\exp\left(iu^{T}Y\right)=\begin{cases}\exp\left(-\int_{\mathbb{S}^{n-1}}|u^{T}s|^{\alpha}\left(1-i\operatorname{sign}(u^{T}s)\tan\left(\frac{\pi\alpha}{2}\right)\right)d\theta(s)+iu^{T}y\right),&\alpha\neq 1,\\ \exp\left(-\int_{\mathbb{S}^{n-1}}|u^{T}s|\left(1+i\operatorname{sign}(u^{T}s)\log|u^{T}s|\right)d\theta(s)+iu^{T}y\right),&\alpha=1,\end{cases}

for a finite measure \theta on the unit sphere \mathbb{S}^{n-1} and a deterministic vector y. The vector Y can be translated and scaled so that y=0 and \theta is a probability measure uniquely determining the distribution of Y. The measure \theta is called the spectral measure of Y, and it turns out to be the appropriate explicit description of the dependence between the entries of Y. The definition of \theta can be extended to random variables which are not stable but rather in the domain of attraction of an \alpha-stable random variable; see Definition 1.3.

If the components of Y are independent, then \theta is supported entirely on the intersection of the axes and the unit sphere. Intuitively, when considering the mirrored entries of a random matrix, if \theta is close to a measure supported on the intersection of the axes and the unit sphere, then the entries are close to independent, after scaling. If \theta is close to a measure supported on the set \{(z_{1},z_{2}):|z_{1}|^{2}+|z_{2}|^{2}=1,\ z_{1}=\bar{z}_{2}\}, then the matrix is close to Hermitian. Numerical simulations seem to reflect this intuition in the spectrum of elliptic random matrices; see Figures 1, 2, and 3.

1.1. Matrix distribution

We will consider elliptic random matrices whose atom variables satisfy the following conditions.

Definition 1.3 (Condition C1).

We say the atom variables (\xi_{1},\xi_{2}),\zeta satisfy Condition C1 if

  • (i)

    there exist a positive number 0<\alpha<2, a sequence a_{n}=\ell(n)n^{1/\alpha} for a slowly varying function \ell (i.e. \lim_{t\rightarrow\infty}\ell(tx)/\ell(t)=1 for all x>0), and a finite measure \theta_{d} on the unit sphere in \mathbb{C}^{2} such that for all Borel subsets D of the unit sphere with \theta_{d}(\partial D)=0 and all r>0,

    \lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(\xi_{1},\xi_{2})}{\|(\xi_{1},\xi_{2})\|}\in D,\ \|(\xi_{1},\xi_{2})\|\geq ra_{n}\right)=\theta_{d}(D)m_{\alpha}([r,\infty)),

    where m_{\alpha} is the measure on (0,\infty) with density f(r)=\alpha r^{-(1+\alpha)};

  • (ii)

    there exists a constant C>0 such that \mathbb{P}(|\zeta|\geq t)\leq Ct^{-\alpha} for all t>0.

We will reserve \theta_{d} to denote the measure on the sphere associated to the atom variables of a heavy-tailed elliptic random matrix; it may be worth noting that the subscript d stands for “dependence” and is not a parameter. As it turns out, Condition C1 is enough to prove convergence of the empirical singular value distribution; see Theorem 1.5. We will need more assumptions to prove convergence of the empirical spectral measure.

Definition 1.4 (Condition C2).

We say (\xi_{1},\xi_{2}),\zeta satisfy Condition C2 if the atom variables satisfy Condition C1 and if

  • (i)

    there exists no (a,b)\in\mathbb{C}^{2}\setminus\{(0,0)\} such that

    \operatorname{supp}(\theta_{d})\subseteq\{(z,w)\in\mathbb{C}^{2}:az+bw=0,\ |z|^{2}+|w|^{2}=1\};
  • (ii)

    a_{n}/n^{1/\alpha}\rightarrow c for some constant c>0 as n\rightarrow\infty.

1.2. Main results

For simplicity, let A_{n}:=\frac{1}{a_{n}}X_{n}. Our first result gives the convergence of the singular values of A_{n}-zI_{n}, which we will denote by A_{n}-z, for z\in\mathbb{C}. Here and throughout, I_{n} denotes the n\times n identity matrix. While interesting on its own, this is the first step in the method of Hermitization used to establish convergence of the empirical spectral measure. Throughout, we use \Rightarrow to denote both weak convergence of probability measures and convergence in distribution of random variables.

Theorem 1.5 (Singular values of heavy-tailed elliptic random matrices).

Let X_{n} be an n\times n elliptic random matrix with atom variables (\xi_{1},\xi_{2}),\zeta which satisfy Condition C1. Then for each z\in\mathbb{C} there exists a deterministic probability measure \nu_{z,\alpha,\theta_{d}}, depending only on z, \alpha, and \theta_{d}, such that almost surely

\nu_{A_{n}-zI_{n}}\Rightarrow\nu_{z,\alpha,\theta_{d}}

as n\rightarrow\infty.

Under Condition C2 we prove the convergence of the empirical spectral measure of A_{n}.

Theorem 1.6 (Eigenvalues of heavy-tailed elliptic random matrices).

Let X_{n} be an n\times n elliptic random matrix with atom variables (\xi_{1},\xi_{2}),\zeta which satisfy Condition C2. Then there exists a deterministic probability measure \mu_{\alpha,\theta_{d}}, depending only on \alpha and \theta_{d}, such that

\mu_{A_{n}}\Rightarrow\mu_{\alpha,\theta_{d}}

in probability as n\rightarrow\infty. Moreover, for any smooth \varphi:\mathbb{C}\rightarrow\mathbb{C} with compact support,

\int_{\mathbb{C}}\varphi\,d\mu_{\alpha,\theta_{d}}=\frac{1}{2\pi}\int_{\mathbb{C}}\Delta\varphi(z)\left[\int_{0}^{\infty}\log(t)\,d\nu_{z,\alpha,\theta_{d}}(t)\right]dz, (1.1)

where \nu_{z,\alpha,\theta_{d}} is as in Theorem 1.5 and dz denotes Lebesgue measure on \mathbb{C}.

A distributional equation describing the Stieltjes transform of \nu_{z,\alpha,\theta_{d}} is given in Proposition 4.12, which, when combined with (1.1), gives a description of \mu_{\alpha,\theta_{d}}.

Remark 1.7.

If \theta_{d}=\frac{1}{2}(\theta_{1}+\theta_{2}) where \theta_{1} and \theta_{2} are probability measures with \operatorname{supp}(\theta_{i})\subseteq\{(z_{1},z_{2})\in\mathbb{S}:z_{i}=0\} and \theta_{1}(A)=\theta_{2}\left(\{(z_{1},z_{2}):(z_{2},z_{1})\in A\}\right), then \nu_{z,\alpha,\theta_{d}} in Theorem 1.5 and \mu_{\alpha,\theta_{d}} in Theorem 1.6 are the same measures as in the main results of Bordenave, Caputo, and Chafaï [13]. This can be seen by computing \theta_{d} when \xi_{1} and \xi_{2} are independent and identically distributed. It is also worth noting that if the matrix X_{n} is complex Hermitian but not real symmetric, then (\xi_{1},\xi_{2}) will satisfy Condition C2 (i), and thus Theorem 1.6 holds.

Numerical simulations support the intuition that the support of \theta_{d} determines how close \mu_{\alpha,\theta_{d}} is to being isotropic or supported on the real line. In Figure 3 we see that as \operatorname{supp}(\theta_{d}) moves further from \{z_{1}=z_{2}\}, the mass of the spectrum moves further from the real line. A similar phenomenon appears in Figure 2: as \operatorname{supp}(\theta_{d}) moves further from the intersection of the axes and the sphere, the spectrum becomes further from isotropic. We also see in Figure 1 that the tail behavior of the spectrum appears to depend on \theta_{d} and may vary in different directions.

Spectra similar to Figure 1 can be found in [30], where the authors study the spectral measure of C_{1}+iC_{2} for free random Lévy elements C_{1} and C_{2}. The authors use tools from free probability to show the spectrum is supported inside the “hyperbolic cross”, which appears to be the case for heavy-tailed elliptic random matrices for certain \theta_{d}. The limiting spectral measure in [13] is not contained in a “hyperbolic cross”, which suggests C_{1}+iC_{2} is not the heavy-tailed analogue of a circular element, in contrast with the case when C_{1} and C_{2} are free semicircular elements and C_{1}+iC_{2} is a circular element.

Figure 1. The plot on the left is the spectrum of an n\times n matrix n^{-1/\alpha}X where \alpha=1.25, n=2000, and the entries of X are i.i.d. random variables distributed as \varepsilon U^{-1/\alpha}, where \varepsilon is uniformly distributed on \{-1,1\} and U is uniformly distributed on [0,1]. The plot on the right is the spectrum of an n\times n elliptic random matrix n^{-1/\alpha}X where X has atom variables (U^{-1/\alpha}\cos(w),U^{-1/\alpha}\sin(w)),1 with \alpha=1.25, n=2000, U uniformly distributed on [0,1], and w uniformly distributed on [0,2\pi]. The plot window on the right is cropped to avoid extreme values.
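The elliptic simulation of Figure 1 (right) can be reproduced along the following lines. This is an illustrative sketch, not the authors' code; the atom variables (U^{-1/\alpha}\cos(w),U^{-1/\alpha}\sin(w)),1 are those described in the caption, with a smaller n for speed.

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha = 500, 1.25

# Mirrored pairs (xi_1, xi_2) = R (cos w, sin w), with heavy-tailed radius
# R = U^{-1/alpha} and angle w uniform on [0, 2*pi]; diagonal entries zeta = 1.
iu = np.triu_indices(n, k=1)
m = len(iu[0])
R = rng.uniform(size=m) ** (-1.0 / alpha)
w = rng.uniform(0.0, 2.0 * np.pi, size=m)
X = np.zeros((n, n))
X[iu] = R * np.cos(w)                  # X_ij above the diagonal
X[(iu[1], iu[0])] = R * np.sin(w)      # mirrored entry X_ji
X[np.diag_indices(n)] = 1.0            # zeta = 1

lam = np.linalg.eigvals(X / n ** (1.0 / alpha))  # the points plotted in the figure
```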

1.3. Further questions

Since our results capture both heavy-tailed Hermitian matrices and heavy-tailed matrices with i.i.d. entries, \mu_{\alpha,\theta_{d}} does genuinely depend on \theta_{d}. However, as can be seen from [13], \mu_{\alpha,\theta_{d}} does not depend on every aspect of \theta_{d}.

Question 1.8.

What properties of \theta_{d} determine \mu_{\alpha,\theta_{d}}?

When X_{n} has i.i.d. entries, the limiting empirical spectral measure has an exponential tail [13], while when X_{n} is Hermitian it has the same tail behavior as the entries [8, 12]. This leads us to ask how the tail of \mu_{\alpha,\theta_{d}} depends on \theta_{d}.

Question 1.9.

How does the tail behavior of \mu_{\alpha,\theta_{d}} vary with respect to \theta_{d}?

Figure 2. Both plots show the spectrum of an n\times n elliptic random matrix n^{-1/\alpha}X where X has atom variables (U^{-1/\alpha}\cos(w),U^{-1/\alpha}\sin(w)),1 with \alpha=1.25, n=2000, U uniformly distributed on [0,1], and w uniformly distributed on \{0,\pi/2,\pi,3\pi/2\}+[-b\pi/4,b\pi/4]. For the plot on the left b=0.1, while for the plot on the right b=0.5. Both plot windows are trimmed to avoid extreme values.
Figure 3. Both plots show the spectrum of an n\times n elliptic random matrix n^{-1/\alpha}X where X has atom variables (U^{-1/\alpha}\cos(w),U^{-1/\alpha}\sin(w)),1 with \alpha=1.25, n=2000, U uniformly distributed on [0,1], and w uniformly distributed on \{\pi/4,5\pi/4\}+[-b\pi/4,b\pi/4]. For the plot on the left b=1, while for the plot on the right b=4/3. Both plot windows are trimmed to avoid extreme values.

1.4. Outline

As is typical for the empirical spectral measure of non-Hermitian random matrices, we make use of Girko’s Hermitization method. However, instead of considering the logarithmic potential directly, we follow the approach of Bordenave, Caputo, and Chafaï [12, 13] and use the objective method of Aldous and Steele [3] to obtain convergence of the matrices to an operator on Aldous’ Poisson Weighted Infinite Tree (PWIT). This objective-method approach was extended by Jung [34] to symmetric light-tailed random matrices, adjacency matrices of sparse random graphs, and symmetric random matrices whose column entries are some combination of heavy-tailed and light-tailed random variables.

In Sections 2 and 3 we give a collection of results and background for approaching the proof of Theorem 1.5. In Section 4 we define the PWIT, establish the local weak convergence of the matrices A_{n} to an operator associated with the PWIT, and give a proof of Theorem 1.5. In Section 5 we give a very general bound on the least singular value of elliptic random matrices. In Section 6 we establish the uniform integrability of \log(\cdot) against \nu_{A_{n}-zI_{n}} and complete the proof of Theorem 1.6. The appendix contains some auxiliary results.

We conclude this section by establishing notation, giving a brief description of Hermitization, and stating some properties of \xi_{1} and \xi_{2} implied by Condition C2.

1.5. Notation

We now establish notation we will use throughout.

Let [n]:=\{1,\ldots,n\} denote the discrete interval. For a vector v=(v_{i})_{i=1}^{n}\in\mathbb{C}^{n} and a subset I\subset[n], we let v_{I}:=(v_{i})_{i\in I}\in\mathbb{C}^{I}. Similarly, for an m\times n matrix A=(A_{ij})_{i\in[m],j\in[n]} and I\subset[m], J\subset[n], we define A_{I\times J}:=(A_{ij})_{i\in I,j\in J}. For a countable set S we let I_{S} denote the identity on \ell^{2}(S). In the case when S=[n] and \ell^{2}(S)=\mathbb{C}^{n} we simply write I_{n}, or I if the dimension is clear.

For a vector x\in\mathbb{C}^{n}, \|x\| is the Euclidean norm of x. For a matrix A, A^{\mathrm{T}} is the transpose of A and A^{\ast} is the conjugate transpose of A. In addition, \|A\|=s_{1}(A) is the spectral norm of A and \|A\|_{2} is the Hilbert–Schmidt norm of A defined by the formula

\|A\|_{2}=\sqrt{\operatorname{tr}(AA^{\ast})}.

For a linear, but not necessarily bounded, operator A on a Hilbert space we let \mathcal{D}(A) denote the domain of A. We will often use the shorthand A+z to denote A+zI, where I is the identity operator.

For two complex-valued square-integrable random variables \xi and \psi, we define the correlation between \xi and \psi as

\operatorname{Corr}(\xi,\psi):=\frac{\operatorname{Cov}(\xi,\psi)}{\sqrt{\operatorname{Var}(\xi)\operatorname{Var}(\psi)}},

where \operatorname{Cov}(\xi,\psi):=\mathbb{E}[(\xi-\mathbb{E}\xi)\overline{(\psi-\mathbb{E}\psi)}] is the covariance between \xi and \psi, and \operatorname{Var}(\xi)=\mathbb{E}|\xi-\mathbb{E}\xi|^{2} is the variance of \xi. For two random elements X and Y we write X\overset{d}{=}Y if X and Y have the same distribution. We say a positive random variable Y stochastically dominates a positive random variable Z if for all x>0,

\mathbb{P}(Y\geq x)\geq\mathbb{P}(Z\geq x).

For a topological space E, \mathcal{B}(E) will always denote the Borel \sigma-algebra of E, and \mathbb{R}_{+} will denote the positive real numbers.

Throughout this paper we use asymptotic notation (O, o, \Theta, etc.) under the assumption that n\rightarrow\infty: X=O(Y) if X\leq CY for an absolute constant C>0 and all n\geq C; X=o(Y) if X\leq C_{n}Y for some C_{n}\rightarrow 0; X=\Theta(Y) if cY\leq X\leq CY for absolute constants C,c>0 and all n\geq C; and X\sim Y if X/Y\rightarrow 1.

1.6. Hermitization

Let \mathcal{P}(\mathbb{C}) be the set of probability measures on \mathbb{C} which integrate \log|\cdot| in a neighborhood of infinity. For every \mu\in\mathcal{P}(\mathbb{C}) the logarithmic potential U_{\mu} of \mu on \mathbb{C} is the function U_{\mu}:\mathbb{C}\rightarrow[-\infty,\infty) defined for every z\in\mathbb{C} by

U_{\mu}(z)=\int_{\mathbb{C}}\log|z-w|\,d\mu(w).

In \mathcal{D}^{\prime}(\mathbb{C}) one has \Delta U_{\mu}=2\pi\mu, where \mathcal{D}^{\prime}(\mathbb{C}) is the set of Schwartz–Sobolev distributions on \mathbb{C}, endowed with its usual convergence with respect to all infinitely differentiable functions with compact support.

Lemma 1.10 (Lemma A.1 in [13]).

For every \mu,\nu\in\mathcal{P}(\mathbb{C}), if U_{\mu}=U_{\nu} a.e. then \mu=\nu.

To see the connection between logarithmic potentials and random matrices, consider an n\times n random matrix A. If P_{A}(z)=\det(A-zI_{n}) is the characteristic polynomial of A, then

U_{\mu_{A}}(z)=\int_{\mathbb{C}}\log|z-w|\,d\mu_{A}(w)=\frac{1}{n}\log|P_{A}(z)|=\int_{0}^{\infty}\log(t)\,d\nu_{A-z}(t). (1.2)

Thus, through the logarithmic potential, we can pass from a question about the eigenvalues of A to a question about the singular values of A-z. We refer the reader to [15] for more on the logarithmic potential in random matrix theory. One immediate issue is that \log(\cdot) is not a bounded function on \mathbb{R}_{+}, and thus we need more control on the integral of \log(\cdot) with respect to \{\nu_{A_{n}-z}\}_{n\geq 1}.
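The identity (1.2) is an exact finite-n statement: the average of \log|z-\lambda_{i}(A)| over the eigenvalues equals the average of \log s_{i}(A-z) over the singular values, since both equal \frac{1}{n}\log|\det(A-zI_{n})|. A minimal numerical check (illustrative only, with a Gaussian matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
n, z = 200, 0.3 + 0.2j
A = rng.standard_normal((n, n)) / np.sqrt(n)

# Left-hand side of (1.2): the logarithmic potential of mu_A at z.
lhs = np.mean(np.log(np.abs(z - np.linalg.eigvals(A))))
# Right-hand side of (1.2): the integral of log against nu_{A - z}.
sv = np.linalg.svd(A - z * np.eye(n), compute_uv=False)
rhs = np.mean(np.log(sv))
assert np.isclose(lhs, rhs)
```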

Definition 1.11 (Uniform integrability almost surely and in probability).

Let \{\mu_{n}\}_{n=1}^{\infty} be a sequence of random probability measures on a measurable space (T,\mathcal{T}). We say a measurable function f:T\rightarrow\mathbb{R} is uniformly integrable almost surely with respect to \{\mu_{n}\}_{n=1}^{\infty} if

\lim\limits_{t\rightarrow\infty}\limsup\limits_{n\rightarrow\infty}\int_{|f|>t}|f|\,d\mu_{n}=0,

with probability one. We say a measurable function f:T\rightarrow\mathbb{R} is uniformly integrable in probability with respect to \{\mu_{n}\}_{n=1}^{\infty} if for every \varepsilon>0,

\lim\limits_{t\rightarrow\infty}\limsup\limits_{n\rightarrow\infty}\mathbb{P}\left(\int_{|f|>t}|f|\,d\mu_{n}>\varepsilon\right)=0.
Lemma 1.12 (Lemma 4.3 in [15]).

Let (A_{n})_{n\geq 1} be a sequence of complex random matrices where A_{n} is n\times n for every n\geq 1. Suppose that for Lebesgue almost all z\in\mathbb{C}, there exists a probability measure \nu_{z} on [0,\infty) such that

  • a.s. (\nu_{A_{n}-z})_{n\geq 1} tends weakly to \nu_{z};

  • a.s. (resp. in probability) \log(\cdot) is uniformly integrable for (\nu_{A_{n}-z})_{n\geq 1}.

Then there exists a probability measure \mu\in\mathcal{P}(\mathbb{C}) such that

  • a.s. (resp. in probability) (\mu_{A_{n}})_{n\geq 1} converges weakly to \mu;

  • for almost every z\in\mathbb{C},

    U_{\mu}(z)=\int_{0}^{\infty}\log(t)\,d\nu_{z}(t).

1.7. Individual entries and stable random vectors.

We now state some useful properties of \xi_{1} and \xi_{2} implied by Condition C2. First, if (\xi_{1}^{(1)},\xi_{2}^{(1)}),(\xi_{1}^{(2)},\xi_{2}^{(2)}),\dots are i.i.d. copies of (\xi_{1},\xi_{2}), then Condition C1 (i) guarantees (see [46]) that there exists a sequence b_{n} such that

\frac{1}{a_{n}}\sum_{i=1}^{n}(\xi_{1}^{(i)},\xi_{2}^{(i)})-b_{n}\Rightarrow Z=(Z_{1},Z_{2})

for some \alpha-stable random vector Z with spectral measure \theta_{d}. Condition C2 (i) guarantees that neither Z_{1} nor Z_{2} is identically 0. We need the following theorem to obtain results on stable random vectors.

Theorem 1.13 (Theorem 2.3.9 in [47]).

Let (X_{1},\dots,X_{d}) be an \alpha-stable random vector in \mathbb{R}^{d}. Then (X_{1},\dots,X_{k}) is an \alpha-stable random vector for any k\leq d.

From this, with k=1 for real \xi_{1} and k=2 for complex \xi_{1}, we get that \xi_{1} is in the domain of attraction of an \alpha-stable random variable and satisfies

\mathbb{P}(|\xi_{1}|\geq t)=L(t)t^{-\alpha} (1.3)

for some slowly varying function L. In addition,

\lim\limits_{t\rightarrow\infty}\mathbb{P}\left(\frac{\xi_{1}}{|\xi_{1}|}\in\cdot\,\Bigg|\,|\xi_{1}|\geq t\right)=\theta_{1}(\cdot) (1.4)

for some probability measure \theta_{1}; again, see [46]. The same holds for \xi_{2} with a possibly different probability measure \theta_{2}. Also note that

\mathbb{E}[|\xi_{i}|^{p}]<\infty (1.5)

for all 0\leq p<\alpha and i=1,2.
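For the concrete radius U^{-1/\alpha} used in the simulations above, (1.3) holds with L\equiv 1, since \mathbb{P}(U^{-1/\alpha}\geq t)=\mathbb{P}(U\leq t^{-\alpha})=t^{-\alpha} for t\geq 1, and \mathbb{E}[U^{-p/\alpha}]=\alpha/(\alpha-p)<\infty exactly when p<\alpha, consistent with (1.5). A quick numerical check of the tail (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, t = 1.25, 5.0
# xi = U^{-1/alpha} has an exact Pareto tail: P(xi >= t) = t^{-alpha} for t >= 1.
xi = rng.uniform(size=200_000) ** (-1.0 / alpha)

emp_tail = np.mean(xi >= t)
assert abs(emp_tail - t ** (-alpha)) < 0.01

# Moments E[xi^p] = alpha / (alpha - p) are finite only for p < alpha;
# in particular the variance (p = 2 > alpha) is infinite.
```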

Acknowledgment

The second author thanks Djalil Chafaï for answering his questions concerning [15, Appendix A].

2. Poisson point processes and stable distributions

In this section we give a brief review of Poisson point processes (p.p.p.) and their relation to the order statistics of random variables in the domain of attraction of an \alpha-stable distribution. See [21, 43, 44], and the references therein, for proofs.

2.1. Simple point processes

Throughout this section we assume E=\bar{\mathbb{R}}^{n}\setminus\{0\} with the relative topology, where \bar{\mathbb{R}}^{n} is the one-point compactification of \mathbb{R}^{n}, but many of the results can be extended to other topological spaces.

Denote by \mathcal{M}(E) the set of simple point Radon measures

\mu=\sum_{x\in D}\delta_{x},

where D is such that D\cap(B_{r}(0))^{c} is a finite set for any r>0, and B_{r}(0) is the ball of radius r around the point 0. Denote by \mathcal{H}(E) the set of supports corresponding to measures in \mathcal{M}(E). The elements of \mathcal{H}(E) are called configurations.

Let C_{K}(E) denote the set of real-valued continuous functions on E with compact support. The vague topology on \mathcal{M}(E) is the topology in which a sequence \mu_{n} converges to \mu if for any f\in C_{K}(E),

\int_{E}f\,d\mu_{n}\rightarrow\int_{E}f\,d\mu.

\mathcal{M}(E) with the vague topology is a Polish space, and is thus separable and completely metrizable.

If one considers the one-to-one function I:\mathcal{M}(E)\rightarrow\mathcal{H}(E) given by I(\mu)=\operatorname{supp}(\mu), then the topology of \mathcal{M}(E) can be pushed forward to \mathcal{H}(E). Vague convergence in \mathcal{M}(E) can be stated in terms of convergence of the supports. Let \mu_{n}\xrightarrow{v}\mu and fix a labeling \operatorname{supp}(\mu)=\{x^{(1)},x^{(2)},\dots\}. Then this vague convergence implies there exist labelings \operatorname{supp}(\mu_{n})=\{x^{(1)}_{n},x^{(2)}_{n},\dots\} such that for all k, x^{(k)}_{n}\rightarrow x^{(k)}. This description will be particularly useful in our setting.

A simple point process N is a measurable mapping from a probability space (\Omega,\mathcal{F},\mathbb{P}) to (\mathcal{M}(E),\mathcal{B}(\mathcal{M}(E))), where \mathcal{B}(\mathcal{M}(E)) is the Borel \sigma-algebra defined by the vague topology. Weak convergence of simple point processes is defined by weak convergence of the measures in the vague topology.

2.2. Poisson point process

Definition 2.1.

Let m be a Borel measure on E. A point process N is called a Poisson point process (p.p.p.) with intensity measure m if for any pairwise disjoint Borel sets A_{1},\dots,A_{n}, the random variables N(A_{1}),\dots,N(A_{n}) are independent Poisson random variables with expected values m(A_{1}),\dots,m(A_{n}).

Remark 2.2.

N is a.s. simple if and only if m is non-atomic.

The next proposition makes clear the connection between point processes and stable random variables; see Proposition 3.21 in [43] for a proof. First, we describe an important point process. Let \theta be a finite measure on the unit sphere in \mathbb{R}^{n}, and let m_{\alpha} be the measure with density \alpha r^{-(\alpha+1)}\,dr on \mathbb{R}_{+}. We let N_{\alpha} denote the p.p.p. on \mathbb{R}^{n} with intensity measure \theta\times m_{\alpha}.

Proposition 2.3.

Let {ξn}n1\{\xi_{n}\}_{n\geq 1} be i.i.d. d\mathbb{R}^{d}-valued random variables. Assume there exists a finite measure θ\theta on the unit sphere in d\mathbb{R}^{d} such that

limnn(ξnξnD,ξnrbn)=θ(D)mα([r,)),\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{\xi_{n}}{\|\xi_{n}\|}\in D,\|\xi_{n}\|\geq rb_{n}\right)=\theta(D)m_{\alpha}([r,\infty)),

for every r>0r>0 and all Borel subsets DD of the unit sphere with θ(D)=0\theta(\partial D)=0, where bn=n1/αL(n)b_{n}=n^{1/\alpha}L(n) for 0<α<20<\alpha<2 and a slowly varying function LL. Then

βn:=i=1nδξi/bnNα\beta_{n}:=\sum_{i=1}^{n}\delta_{\xi_{i}/b_{n}}\Rightarrow N_{\alpha}

as nn\to\infty.

Remark 2.4.

In [21], Davydov and Egorov prove this convergence in p\ell^{p}-type topologies, under a smoothness assumption on the ξn\xi_{n}.
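The convergence in Proposition 2.3 can be checked numerically in the simplest case $d=1$. The following sketch (illustrative only; the symmetric Pareto law and all parameter values are assumptions, not taken from the paper) samples $n$ i.i.d. heavy-tailed variables, rescales by $b_{n}=n^{1/\alpha}$, and verifies that the mean number of rescaled points of norm at least $r$ matches $\theta(\{-1,+1\})\,m_{\alpha}([r,\infty))=r^{-\alpha}$.

```python
import numpy as np

# Symmetric Pareto entries in R (d = 1): the unit "sphere" is {-1, +1} and
# theta is uniform on it with total mass 1.  With P(|xi| > t) = t^{-alpha}
# for t >= 1 we may take b_n = n^{1/alpha}, and the number of rescaled
# points xi_i / b_n with norm >= r is approximately Poisson with mean
# theta-mass * m_alpha([r, inf)) = r^{-alpha}.
rng = np.random.default_rng(0)
alpha, n, r, trials = 1.0, 500, 1.0, 4000
b_n = n ** (1 / alpha)
counts = np.empty(trials)
for t in range(trials):
    radii = rng.pareto(alpha, size=n) + 1.0    # P(radius > t) = t^{-alpha}, t >= 1
    signs = rng.choice([-1.0, 1.0], size=n)    # uniform direction on {-1, +1}
    xi = signs * radii
    counts[t] = np.sum(np.abs(xi) / b_n >= r)
mean_count = counts.mean()
# here n * P(|xi| >= r * b_n) is exactly r^{-alpha} = 1
assert abs(mean_count - r ** (-alpha)) < 0.08
```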

2.3. Useful properties of 𝐍α\mathbf{N_{\alpha}}

The following are useful and well known properties of the p.p.p. NαN_{\alpha}. Again, see Davydov and Egorov, [21], and the references therein for more information and proofs.

Proposition 2.5.

Let {λi}\{\lambda_{i}\} and {wi}\{w_{i}\} be independent i.i.d. sequences where λ1\lambda_{1} has exponential distribution with mean 11, and w1w_{1} is 1θ(𝕊d1)θ\frac{1}{\theta(\mathbb{S}^{d-1})}\theta distributed, with θ\theta a finite nonzero measure on 𝕊d1\mathbb{S}^{d-1}. Define Γi=λ1++λi\Gamma_{i}=\lambda_{1}+\cdots+\lambda_{i}. Then, for any α>0\alpha>0,

Nα=𝑑i=1δΓi1/α(θ(𝕊d1))1/αwi.N_{\alpha}\overset{d}{=}\sum_{i=1}^{\infty}\delta_{\Gamma_{i}^{-1/\alpha}(\theta(\mathbb{S}^{d-1}))^{1/\alpha}w_{i}}.
Lemma 2.6.

The p.p.p. NαN_{\alpha} has the following properties.

  • (1)

    Almost surely, only finitely many points of $\operatorname{supp}(N_{\alpha})$ lie outside any ball of positive radius centered at the origin.

  • (2)

    NαN_{\alpha} is simple.

  • (3)

    Almost surely, the points of $\operatorname{supp}(N_{\alpha})$ can be labeled in decreasing order of their norms: $\operatorname{supp}(N_{\alpha})=\{y_{1},y_{2},\dots\}$ with $\|y_{1}\|>\|y_{2}\|>\cdots$.

  • (4)

    With probability one, for any p>αp>\alpha,

    \sum_{i=1}^{\infty}\|y_{i}\|^{p}<\infty.
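The series representation of Proposition 2.5 gives a direct way to sample $N_{\alpha}$ and to see property (1) of Lemma 2.6 in action. In the sketch below (an illustration under the assumption that $\theta$ has total mass $c=1$), a point has norm exceeding $r$ exactly when $\Gamma_{i}<cr^{-\alpha}$, so the number of such points is finite and Poisson distributed with mean $cr^{-\alpha}$.

```python
import numpy as np

# Sample N_alpha via the series representation Gamma_i^{-1/alpha} c^{1/alpha} w_i
# and count the points of norm > r.  Since Gamma_i is a rate-1 Poisson process
# on R_+, the count is Poisson with mean c * r^{-alpha}: finitely many points
# lie outside any ball of positive radius (Lemma 2.6, property (1)).
rng = np.random.default_rng(1)
alpha, c, r, trials = 1.0, 1.0, 1.0, 20000
counts = np.empty(trials)
for t in range(trials):
    k, gamma = 0, rng.exponential(1.0)
    # |y_i| = Gamma_i^{-1/alpha} c^{1/alpha} > r  iff  Gamma_i < c * r^{-alpha}
    while gamma < c * r ** (-alpha):
        k += 1
        gamma += rng.exponential(1.0)
    counts[t] = k
assert abs(counts.mean() - c * r ** (-alpha)) < 0.05
```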

3. Bipartized resolvent matrix

In this section we follow the notation of Bordenave, Caputo, and Chafaï in [13] to define the bipartizations of matrices and operators.

3.1. Bipartization of a matrix

For an n×nn\times n complex matrix AA we consider a symmetrized version of νAzI\nu_{A-zI},

νˇAz:=12nk=1n(δsk(Az)+δsk(Az)).\check{\nu}_{A-z}:=\frac{1}{2n}\sum_{k=1}^{n}(\delta_{s_{k}(A-z)}+\delta_{-s_{k}(A-z)}).

Let +:={z:Im(z)>0}\mathbb{C}_{+}:=\{z\in\mathbb{C}:\operatorname{Im}(z)>0\} and

+:={U=(ηzz¯η):η+,z}Mat2().\mathbb{H}_{+}:=\left\{U=\begin{pmatrix}\eta&z\\ \bar{z}&\eta\end{pmatrix}:\eta\in\mathbb{C}_{+},z\in\mathbb{C}\right\}\subset\text{Mat}_{2}(\mathbb{C}).

For any z,η+z\in\mathbb{C},\eta\in\mathbb{C}_{+} and 1i,jn1\leq i,j\leq n define the following 2×22\times 2 matrices

U=U(z,η):=(ηzz¯η)andBij:=(0AijA¯ji0).U=U(z,\eta):=\begin{pmatrix}\eta&z\\ \bar{z}&\eta\end{pmatrix}\quad\text{and}\quad B_{ij}:=\begin{pmatrix}0&A_{ij}\\ \bar{A}_{ji}&0\end{pmatrix}.

Define the matrix BMatn(Mat2())Mat2n()B\in\text{Mat}_{n}(\text{Mat}_{2}(\mathbb{C}))\simeq\text{Mat}_{2n}(\mathbb{C}) by B=(Bij)1i,jnB=(B_{ij})_{1\leq i,j\leq n}. As an element of Mat2n()\text{Mat}_{2n}(\mathbb{C}), BB is Hermitian. We call BB the bipartization of the matrix AA. We define the resolvent matrix in Matn(Mat2())\text{Mat}_{n}(\text{Mat}_{2}(\mathbb{C})) by

R(U)=(BUIn)1,R(U)=(B-U\otimes I_{n})^{-1},

so that for all i,ji,j, R(U)ijMat2()R(U)_{ij}\in\text{Mat}_{2}(\mathbb{C}). For 1kn1\leq k\leq n we write,

R(U)_{kk}=\begin{pmatrix}a_{k}(z,\eta)&b_{k}(z,\eta)\\ b^{\prime}_{k}(z,\eta)&c_{k}(z,\eta)\end{pmatrix}.

Letting B(z)=BU(z,0)InB(z)=B-U(z,0)\otimes I_{n} we have

R(U)=(B(z)ηI2n)1.R(U)=(B(z)-\eta I_{2n})^{-1}.

Let mμm_{\mu} denote the Stieltjes transform of a probability measure μ\mu on \mathbb{R} defined by

mμ(η)=1xη𝑑μ(x),η+.m_{\mu}(\eta)=\int_{\mathbb{R}}\frac{1}{x-\eta}d\mu(x),\qquad\eta\in\mathbb{C}_{+}.
Theorem 3.1 (Theorem 2.1 in [13]).

Let AMatn()A\in\operatorname{Mat}_{n}(\mathbb{C}). Then μB(z)=νˇAz\mu_{B(z)}=\check{\nu}_{A-z},

mνˇAz(η)=12nk=1n(ak(z,η)+ck(z,η)),m_{\check{\nu}_{A-z}}(\eta)=\frac{1}{2n}\sum_{k=1}^{n}(a_{k}(z,\eta)+c_{k}(z,\eta)),

and in $\mathcal{D^{\prime}}(\mathbb{C})$, the space of Schwartz distributions on $\mathbb{C}$ endowed with its usual notion of convergence against infinitely differentiable, compactly supported test functions,

μA()=1πnk=1nbk(,0).\mu_{A}(\cdot)=-\frac{1}{\pi n}\sum_{k=1}^{n}\partial b_{k}(\cdot,0).
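Theorem 3.1 is easy to verify numerically for a single matrix. The following sketch (illustrative; the Gaussian entries are an arbitrary choice) builds the bipartization $B$, forms $B(z)$, and checks both that the eigenvalues of $B(z)$ are the $\pm$ singular values of $A-z$ and that the normalized trace of the resolvent equals the Stieltjes transform of $\check{\nu}_{A-z}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
z, eta = 0.3 - 0.2j, 1.0j

# Bipartization: n x n array of 2x2 blocks B_ij = [[0, A_ij], [conj(A_ji), 0]].
B = np.zeros((2 * n, 2 * n), dtype=complex)
for i in range(n):
    for j in range(n):
        B[2 * i, 2 * j + 1] = A[i, j]
        B[2 * i + 1, 2 * j] = np.conj(A[j, i])

U0 = np.array([[0, z], [np.conj(z), 0]])
Bz = B - np.kron(np.eye(n), U0)            # B(z) = B - U(z,0) (x) I_n, Hermitian

# eigenvalues of B(z) are the +/- singular values of A - z
eigs = np.sort(np.linalg.eigvalsh(Bz))
s = np.linalg.svd(A - z * np.eye(n), compute_uv=False)
assert np.allclose(eigs, np.sort(np.concatenate([s, -s])))

# Stieltjes transform of the symmetrized singular value measure equals the
# normalized trace (1/2n) sum_k (a_k + c_k) of the resolvent R(U)
R = np.linalg.inv(Bz - eta * np.eye(2 * n))
m_resolvent = np.trace(R) / (2 * n)
m_direct = np.mean(1 / (np.concatenate([s, -s]) - eta))
assert np.allclose(m_resolvent, m_direct)
```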

3.2. Bipartization of an operator

Let VV be a countable set and let 2(V)\ell^{2}(V) denote the Hilbert space defined by the inner product

ϕ,ψ:=uVϕ¯uψu,ϕu=δu,ϕ,\langle\phi,\psi\rangle:=\sum_{u\in V}\bar{\phi}_{u}\psi_{u},\quad\phi_{u}=\langle\delta_{u},\phi\rangle,

where δu\delta_{u} is the unit vector supported on uVu\in V. Let 𝒟(V)\mathcal{D}(V) denote the dense subset of 2(V)\ell^{2}(V) of vectors with finite support. Let (wuv)u,vV(w_{uv})_{u,v\in V} be a collection of complex numbers such that for all uVu\in V,

vV|wuv|2+|wvu|2<.\sum_{v\in V}|w_{uv}|^{2}+|w_{vu}|^{2}<\infty.

We then define a linear operator AA with domain 𝒟(V)\mathcal{D}(V) by

δu,Aδv=wuv.\langle\delta_{u},A\delta_{v}\rangle=w_{uv}. (3.1)

Let V^\hat{V} be a set in bijection with VV, and let v^V^\hat{v}\in\hat{V} be the image of vVv\in V under the bijection. Let Vb=VV^V^{b}=V\cup\hat{V}, and define the bipartization of AA as the symmetric operator BB on 𝒟(Vb)\mathcal{D}(V^{b}) by

δu,Bδv^=δv^,Bδu¯=wuv,\langle\delta_{u},B\delta_{\hat{v}}\rangle=\overline{\langle\delta_{\hat{v}},B\delta_{u}\rangle}=w_{uv},
δu,Bδv=δu^,Bδv^=0.\langle\delta_{u},B\delta_{v}\rangle=\langle\delta_{\hat{u}},B\delta_{\hat{v}}\rangle=0.

Let Πu:2(Vb)Span{δu,δu^}\Pi_{u}:\ell^{2}(V^{b})\rightarrow\operatorname{Span}\{\delta_{u},\delta_{\hat{u}}\} denote the orthogonal projection onto the span of δu,δu^\delta_{u},\delta_{\hat{u}}. Span{δu,δu^}\operatorname{Span}\{\delta_{u},\delta_{\hat{u}}\} is isomorphic to 2\mathbb{C}^{2} under the map δue1,δu^e2\delta_{u}\mapsto e_{1},\delta_{\hat{u}}\mapsto e_{2}. Under this isomorphism we may think of ΠuBΠv\Pi_{u}B\Pi_{v}^{*} as a linear map from 2\mathbb{C}^{2} to 2\mathbb{C}^{2} with matrix representation

ΠuBΠv=(0wuvw¯vu0).\Pi_{u}B\Pi_{v}^{*}=\begin{pmatrix}0&w_{uv}\\ \bar{w}_{vu}&0\end{pmatrix}.

Let $B(z)=B-U(z,0)\otimes I_{V}$. For simplicity we will also denote by $B(z)$ the closure of $B(z)$. Recall that the sum of an essentially self-adjoint operator and a bounded self-adjoint operator is essentially self-adjoint; thus if $B$ is (essentially) self-adjoint, then $B(z)$ is (essentially) self-adjoint. For $\eta\in\mathbb{C}_{+}$ and $U(z,\eta)\in\mathbb{H}_{+}$ we define the resolvent operator

R(U):=(B(z)ηIVb)1,R(U):=(B(z)-\eta I_{V^{b}})^{-1},

and

R(U)vv=ΠvR(U)Πv=(av(z,η)bv(z,η)bv(z,η)cv(z,η)).R(U)_{vv}=\Pi_{v}R(U)\Pi_{v}^{*}=\begin{pmatrix}a_{v}(z,\eta)&b_{v}(z,\eta)\\ b_{v}^{\prime}(z,\eta)&c_{v}(z,\eta)\end{pmatrix}. (3.2)
Lemma 3.2.

If av,bv,cv,bva_{v},b_{v},c_{v},b_{v}^{\prime} are defined by (3.2), then

  • for each zz\in\mathbb{C}, av(z,),cv(z,):++a_{v}(z,\cdot),c_{v}(z,\cdot):\mathbb{C}_{+}\rightarrow\mathbb{C}_{+},

  • the functions av(z,),bv(z,),bv(z,),cv(z,)a_{v}(z,\cdot),b_{v}(z,\cdot),b_{v}^{\prime}(z,\cdot),c_{v}(z,\cdot) are analytic on +\mathbb{C}_{+},

  • and

    |av|(Im(η))1,|cv|(Im(η))1,|bv|(2Im(η))1,and|bv|(2Im(η))1.|a_{v}|\leq(\operatorname{Im}(\eta))^{-1},\quad|c_{v}|\leq(\operatorname{Im}(\eta))^{-1},\quad|b_{v}|\leq(2\operatorname{Im}(\eta))^{-1},\quad\text{and}\quad|b_{v}^{\prime}|\leq(2\operatorname{Im}(\eta))^{-1}.

Moreover, if ηi+\eta\in i\mathbb{R}_{+}, then av,cva_{v},c_{v} are pure imaginary and bv=b¯vb_{v}^{\prime}=\bar{b}_{v}.

See Reed and Simon [42] for a proof of the first two statements and [13] for the last.

4. Convergence to the Poisson Weighted Infinite Tree

4.1. Operators on a tree

Consider a tree $T=(V,E)$ on a vertex set $V$ with edge set $E$. We write $u\sim v$ if $\{u,v\}\in E$. Assume that $w_{uv}=w_{vu}=0$ whenever $\{u,v\}\notin E$; in particular $w_{vv}=0$ for all $v\in V$. We consider the operator $A$ defined by (3.1). We begin with a useful sufficient condition for a symmetric linear operator to be essentially self-adjoint, which will be important for our purposes.

Lemma 4.1 (Lemma A.3 in [12]).

Let κ>0\kappa>0 and T=(V,E)T=(V,E) be a tree. Assume that for all u,vVu,v\in V, wuv=w¯vuw_{uv}=\bar{w}_{vu} and if {u,v}E\{u,v\}\notin E then wuv=wvu=0w_{uv}=w_{vu}=0. Assume that there exists a sequence of connected finite subsets (Sn)n1(S_{n})_{n\geq 1} of VV, such that SnSn+1S_{n}\subset S_{n+1}, nSn=V\bigcup_{n}S_{n}=V, and for every nn and vSnv\in S_{n},

uSn:uv|wuv|2κ.\sum_{u\notin S_{n}:u\sim v}|w_{uv}|^{2}\leq\kappa.

Then AA is essentially self-adjoint.

Corollary 4.2 (Corollary 2.4 in [13]).

Let κ>0\kappa>0 and T=(V,E)T=(V,E) be a tree. Assume that if {u,v}E\{u,v\}\notin E then wuv=wvu=0w_{uv}=w_{vu}=0. Assume there exists a sequence of connected finite subsets (Sn)n1(S_{n})_{n\geq 1} of VV, such that SnSn+1S_{n}\subset S_{n+1}, nSn=V\bigcup_{n}S_{n}=V, and for every nn and vSnv\in S_{n},

uSn:uv(|wuv|2+|wvu|2)κ.\sum_{u\notin S_{n}:u\sim v}(|w_{uv}|^{2}+|w_{vu}|^{2})\leq\kappa.

Then for all zz\in\mathbb{C}, B(z)B(z) is self-adjoint.

Another advantage of working with operators defined on trees is the following recursive formula for the resolvent; see Lemma 2.5 in [13] for a proof.

Proposition 4.3.

Assume $B$ is self-adjoint and let $U=U(z,\eta)\in\mathbb{H}_{+}$. Then

R(U)=(U+v(0wvw¯v0)R~(U)vv(0wvw¯v0))1,R(U)_{\varnothing\varnothing}=-\left(U+\sum_{v\sim\varnothing}\begin{pmatrix}0&w_{\varnothing v}\\ \bar{w}_{v\varnothing}&0\end{pmatrix}\tilde{R}(U)_{vv}\begin{pmatrix}0&w_{v\varnothing}\\ \bar{w}_{\varnothing v}&0\end{pmatrix}\right)^{-1}, (4.1)

where $\tilde{R}(U)_{vv}:=\Pi_{v}R_{B_{v}}\Pi_{v}^{*}$, $B_{v}$ is the bipartization of the operator $A$ restricted to $\ell^{2}(V_{v})$, and $V_{v}$ is the subtree of $V$ of vertices whose path to $\varnothing$ contains $v$.
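The recursion (4.1) can be checked numerically on the simplest nontrivial tree, a root with $k$ leaves, for which the subtree resolvent of each leaf is $\tilde{R}(U)_{vv}=-U(z,\eta)^{-1}$. The sketch below (with arbitrary Gaussian weights, an assumption for illustration) compares the root block of the full resolvent with the right-hand side of (4.1).

```python
import numpy as np

rng = np.random.default_rng(3)
k = 4
z, eta = 0.2 + 0.1j, 1.0j
w0v = rng.standard_normal(k) + 1j * rng.standard_normal(k)   # weights root -> leaf
wv0 = rng.standard_normal(k) + 1j * rng.standard_normal(k)   # weights leaf -> root

# full bipartized matrix of the star operator on k + 1 vertices (root = 0)
n = k + 1
Awts = np.zeros((n, n), dtype=complex)
Awts[0, 1:], Awts[1:, 0] = w0v, wv0
B = np.zeros((2 * n, 2 * n), dtype=complex)
for i in range(n):
    for j in range(n):
        B[2 * i, 2 * j + 1] = Awts[i, j]
        B[2 * i + 1, 2 * j] = np.conj(Awts[j, i])
U = np.array([[eta, z], [np.conj(z), eta]])
U0 = np.array([[0, z], [np.conj(z), 0]])
R = np.linalg.inv(B - np.kron(np.eye(n), U0) - eta * np.eye(2 * n))
lhs = R[0:2, 0:2]                        # R(U) at the root

# right-hand side of (4.1): each leaf v contributes M_v Rt M_v^* with
# Rt = R~(U)_vv = -U^{-1}, since a leaf has no offspring
S = np.zeros((2, 2), dtype=complex)
Rt = -np.linalg.inv(U)
for v in range(k):
    Mv = np.array([[0, w0v[v]], [np.conj(wv0[v]), 0]])
    S += Mv @ Rt @ Mv.conj().T
rhs = -np.linalg.inv(U + S)
assert np.allclose(lhs, rhs)
```

This is exactly the Schur complement of the root block in $B(z)-\eta I$, which is what (4.1) expresses.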

4.2. Local operator convergence.

We now define a useful type of convergence.

Definition 4.4 (Local Convergence).

Suppose (An)(A_{n}) is a sequence of bounded operators on 2(V)\ell^{2}(V) and AA is a linear operator on 2(V)\ell^{2}(V) with domain 𝒟(A)𝒟(V)\mathcal{D}(A)\supset\mathcal{D}(V). For any u,vVu,v\in V we say that (An,u)(A_{n},u) converges locally to (A,v)(A,v), and write

(An,u)(A,v),(A_{n},u)\rightarrow(A,v),

if there exists a sequence of bijections σn:VV\sigma_{n}:V\rightarrow V such that σn(v)=u\sigma_{n}(v)=u and, for all ϕ𝒟(V)\phi\in\mathcal{D}(V),

σn1AnσnϕAϕ,\sigma_{n}^{-1}A_{n}\sigma_{n}\phi\rightarrow A\phi,

in 2(V)\ell^{2}(V), as nn\rightarrow\infty.

Here we use $\sigma_{n}$ both for the bijection on $V$ and for the corresponding linear isometry, defined in the obvious way. This notion of convergence is useful for random matrices for two reasons. First, we will make a choice of how to define the action of an $n\times n$ matrix on $\ell^{2}(V)$, and the bijections $\sigma_{n}$ ensure that the choice of location for the support of the matrix does not matter. Second, local convergence also gives convergence of the resolvent operator at the distinguished points $u,v\in V$. This comes down to the fact that local convergence is strong operator convergence, up to the isometries. See [13] for details.

Theorem 4.5 (Theorem 2.7 in [13]).

Assume $(A_{n},u)\rightarrow(A,v)$ for some $u,v\in V$. Let $B_{n}$ be the self-adjoint bipartized operator of $A_{n}$. If the bipartized operator $B$ of $A$ is self-adjoint and $\mathcal{D}(V^{b})$ is a core for $B$ (i.e., the closure of $B$ restricted to $\mathcal{D}(V^{b})$ is $B$), then for all $U\in\mathbb{H}_{+}$,

RBn(U)uuRB(U)vv.R_{B_{n}}(U)_{uu}\rightarrow R_{B}(U)_{vv}.

To apply this to random operators, we say that $(A_{n},u)\rightarrow(A,v)$ in distribution if there exists a sequence of random bijections $\sigma_{n}$ such that $\sigma_{n}^{-1}A_{n}\sigma_{n}\phi\rightarrow A\phi$ in distribution for every $\phi\in\mathcal{D}(V)$.

4.3. Poisson Weighted Infinite Tree (PWIT)

Let $\rho$ be a positive Radon measure on $\mathbb{C}^{n}\setminus\{0\}$ such that $\rho(\mathbb{C}^{n}\setminus\{0\})=\infty$. $\mathbf{PWIT}(\rho)$ is the random weighted rooted tree defined as follows. The vertex set of the tree is identified with $\mathbb{N}^{f}:=\bigcup_{k\in\mathbb{N}\cup\{0\}}\mathbb{N}^{k}$ by indexing the root as $\mathbb{N}^{0}=\varnothing$, the offspring of the root by $\mathbb{N}$, and, more generally, the offspring of $v\in\mathbb{N}^{k}$ by $(v1),(v2),\dots\in\mathbb{N}^{k+1}$. Define $T$ as the tree on $\mathbb{N}^{f}$ with edges between parents and offspring. Let $\{\Xi_{v}\}_{v\in\mathbb{N}^{f}}$ be independent realizations of a Poisson point process with intensity measure $\rho$. Order $\Xi_{\varnothing}=\{y_{1},y_{2},\dots\}$ so that $\|y_{1}\|\geq\|y_{2}\|\geq\cdots$, and assign the weight $y_{i}$ to the edge between $\varnothing$ and $i$, assuming such an ordering is possible. More generally, writing $\Xi_{v}=\{y_{v1},y_{v2},\dots\}$ with $\|y_{v1}\|\geq\|y_{v2}\|\geq\cdots$, assign the weight $y_{vi}$ to the edge between $v$ and $vi$.

Consider a realization of 𝐏𝐖𝐈𝐓(θd×mα)\mathbf{PWIT}(\theta_{d}\times m_{\alpha}), with yvk=(yvk(1),yvk(2))y_{vk}=(y_{vk}^{(1)},y_{vk}^{(2)}). Even though the measure θd×mα\theta_{d}\times m_{\alpha} has a more natural representation in polar coordinates, we let (yvk(1),yvk(2))(y_{vk}^{(1)},y_{vk}^{(2)}) be the Cartesian coordinates of yvky_{vk}. Define an operator AA on 𝒟(f)\mathcal{D}(\mathbb{N}^{f}) by the formulas

δv,Aδvk=yvk(1), and δvk,Aδv=yvk(2)\langle\delta_{v},A\delta_{vk}\rangle=y_{vk}^{(1)},\quad\text{ and }\quad\langle\delta_{vk},A\delta_{v}\rangle=y_{vk}^{(2)} (4.2)

and $\langle\delta_{v},A\delta_{u}\rangle=0$ otherwise. For $0<\alpha<2$, we know by Lemma 2.6 that the points in $\Xi_{v}$ are almost surely square summable for every $v\in\mathbb{N}^{f}$, and thus $A$ is a well defined linear operator on $\mathcal{D}(\mathbb{N}^{f})$, though it is possibly unbounded on $\ell^{2}(\mathbb{N}^{f})$. Before showing the local convergence of the random matrices $A_{n}$ to $A$, we show that the bipartization of $A$ is self-adjoint.

Proposition 4.6.

With probability one, for all zz\in\mathbb{C}, B(z)B(z) is self-adjoint, where B(z)B(z) is the bipartization of the operator AA defined by (4.2).

We begin with a lemma on point processes, proved in Lemma A.4 from [12], before checking our criterion for self-adjointness.

Lemma 4.7.

Let κ>0,0<α<2\kappa>0,0<\alpha<2 and let 0<x1<x2<0<x_{1}<x_{2}<\cdots be a Poisson process of intensity 11 on +\mathbb{R}_{+}. Define τ=inf{t:k=t+1xk2/ακ}.\tau=\inf\{t\in\mathbb{N}:\sum_{k=t+1}^{\infty}x_{k}^{-2/\alpha}\leq\kappa\}. Then 𝔼τ\mathbb{E}\tau is finite and goes to 0 as κ\kappa goes to infinity.

Proof of Proposition 4.6.

For κ>0\kappa>0 and vfv\in\mathbb{N}^{f} define

τv=inf{t0:k=t+1yvk2κ}.\tau_{v}=\inf\left\{t\geq 0:\sum_{k=t+1}^{\infty}\|y_{vk}\|^{2}\leq\kappa\right\}.

Note that if $N$ is a homogeneous Poisson process on $\mathbb{R}_{+}$ with intensity $1$ and $f(x)=x^{-1/\alpha}$, then $f(N)$ is a Poisson process with intensity measure $\alpha r^{-1-\alpha}dr$. Thus by Lemma 4.7, $\kappa>0$ can be chosen such that $\mathbb{E}\tau_{v}<1$ for any fixed $v$. Since the random variables $\{\tau_{v}\}_{v\in\mathbb{N}^{f}}$ are i.i.d., this $\kappa$ works for all $v$. Fix such a $\kappa$. We now color the vertices red and green in order to build the sets $S_{n}$ of Corollary 4.2. Color a vertex $v$ green if $\tau_{v}\geq 1$ and red otherwise. Define the sub-forest $T^{g}$ of $T$ in which an edge between $v$ and $vk$ is included if $v$ is green and $1\leq k\leq\tau_{v}$. If the root $\varnothing$ is red, let $S_{1}=\{\varnothing\}$. Otherwise, let $T^{g}_{\varnothing}=(V^{g}_{\varnothing},E^{g}_{\varnothing})$ be the subtree of $T^{g}$ containing $\varnothing$. Let $Z_{n}$ denote the number of vertices in $T^{g}_{\varnothing}$ at depth $n$ from the root; then $Z_{n}$ is a Galton--Watson process with offspring distribution $\tau_{\varnothing}$. It is well known that if $\mathbb{E}\tau_{\varnothing}<1$, then the tree is almost surely finite; see Theorem 5.3.7 in [22]. Let $L^{g}_{\varnothing}$ be the leaves of the tree $T^{g}_{\varnothing}$, and set $S_{1}:=\bigcup_{v\in L^{g}_{\varnothing}}\{vk:1\leq k\leq\tau_{v}\}\cup V^{g}_{\varnothing}$. It is clear that for any $v\in S_{1}$,

uS1:uv(|yuv|2+|yvu|2)κ.\sum_{u\notin S_{1}:u\sim v}(|y_{uv}|^{2}+|y_{vu}|^{2})\leq\kappa.

Now define the outer boundary of {}\{\varnothing\} as τ{}={1,,max(τ,1)}\partial_{\tau}\{\varnothing\}=\{1,\dots,\max(\tau_{\varnothing},1)\} and for v=(i1ik)v=(i_{1}\cdots i_{k}) set τ{v}={(i1ik1(ik+1))}{(i1ik1),,(i1ikmax(τv,1))}\partial_{\tau}\{v\}=\{(i_{1}\cdots i_{k-1}(i_{k}+1))\}\cup\{(i_{1}\cdots i_{k}1),\dots,(i_{1}\cdots i_{k}\max(\tau_{v},1))\}. For a connected set SS define its outer boundary as

τS=(vSτ{v})S.\partial_{\tau}S=\left(\bigcup_{v\in S}\partial_{\tau}\{v\}\right)\setminus S.

Now for each u1,,ukτS1u_{1},\dots,u_{k}\in\partial_{\tau}S_{1} apply the above process to get subtrees {Tuig=(Vuig,Euig)}i=1k\{T^{g}_{u_{i}}=(V^{g}_{u_{i}},E^{g}_{u_{i}})\}_{i=1}^{k} with roots uiu_{i} and the leaves of tree TuigT^{g}_{u_{i}} denoted by LuigL^{g}_{u_{i}}. Set

S_{2}:=S_{1}\cup\bigcup_{i=1}^{k}\left(V^{g}_{u_{i}}\cup\bigcup_{v\in L^{g}_{u_{i}}}\{vj:1\leq j\leq\tau_{v}\}\right).

Apply this procedure iteratively to get the sequence of subsets (Sn)n1(S_{n})_{n\geq 1}. Apply Corollary 4.2 to complete the proof. ∎

4.4. Local convergence

For an $n\times n$ matrix $M$ we aim to define $M$ as a bounded operator on $\ell^{2}(\mathbb{N}^{f})$. For $1\leq i,j\leq n$, let $\langle\delta_{i},M\delta_{j}\rangle=M_{ij}$, and set $\langle\delta_{u},M\delta_{v}\rangle=0$ otherwise.

Theorem 4.8.

Let (Pn)n1(P_{n})_{n\geq 1} be a sequence of uniformly distributed random n×nn\times n permutation matrices, independent of AnA_{n}. Then, in distribution, (PnAnPnT,1)(A,)(P_{n}A_{n}P_{n}^{T},1)\rightarrow(A,\varnothing) where AA is the operator defined by (4.2).

Remark 4.9.

This theorem holds for any sequence of permutation matrices $P_{n}$, regardless of independence or distribution, but for our later use it is important that they are independent of $A_{n}$ and uniformly distributed. This is to get around the fact that the entries of $A_{n}$ are not exchangeable.

The rest of this subsection is devoted to the proof of Theorem 4.8. The procedure follows along the same lines as Bordenave, Caputo, and Chafaï in [12] and [13]. We will define a network as a graph with edge weights taking values in some normed space. To begin let GnG_{n} be the complete network on {1,,n}\{1,\dots,n\} whose weight on edge {i,j}\{i,j\} equals ξijn\xi^{n}_{ij} for some collection (ξijn)1ijn(\xi^{n}_{ij})_{1\leq i\leq j\leq n} of i.i.d. random variables taking values in some normed space. Now consider the rooted network (Gn,1)(G_{n},1) with the distinguished vertex 11. For any realization (ξijn)(\xi_{ij}^{n}), and for any B,HB,H\in\mathbb{N} such that (BH+11)/(B1)n(B^{H+1}-1)/(B-1)\leq n, we will define a finite rooted subnetwork (Gn,1)B,H(G_{n},1)^{B,H} of (Gn,1)(G_{n},1) whose vertex set coincides with a BB-ary tree of depth HH. To this end we partially index the vertices of (Gn,1)(G_{n},1) as elements in

JB,H:=l=0H{1,,B}lf,J_{B,H}:=\bigcup_{l=0}^{H}\{1,\dots,B\}^{l}\subset\mathbb{N}^{f},

the indexing being given by an injective map $\sigma_{n}$ from $J_{B,H}$ to $V_{n}:=\{1,\dots,n\}$. We set $I_{\varnothing}:=\{1\}$ and the index of the root $\sigma_{n}^{-1}(1)=\varnothing$. The vertex $v\in V_{n}\setminus I_{\varnothing}$ is given the index $(k)=\sigma_{n}^{-1}(v)$, $1\leq k\leq B$, if $\xi^{n}_{1,v}$ has the $k$-th largest norm among $\{\xi_{1j}^{n},j\neq 1\}$, ties being broken by lexicographic order. (To help keep track of notation in this section, note that $v=(w)\in V_{n}$ if $w\in J_{B,H}$ and $\sigma_{n}(w)=v$.) This defines the first generation; let $I_{1}$ be the union of $I_{\varnothing}$ and this generation. If $H\geq 2$, repeat this process for the vertex labeled $(1)$ on $V_{n}\setminus I_{1}$ to order $\{\xi_{(1)j}^{n}\}_{j\in V_{n}\setminus I_{1}}$ and obtain $\{11,12,\dots,1B\}$. Define $I_{2}$ to be the union of $I_{1}$ and this new collection. Repeat again for $(2),(3),\dots,(B)$ to get the second generation, and so on. Call this vertex set $V_{n}^{B,H}=\sigma_{n}J_{B,H}$.

For a realization TT of 𝐏𝐖𝐈𝐓(ρ)\mathbf{PWIT}(\rho), recall we assign the weight yvky_{vk} to the edge {v,vk}\{v,vk\}. Then (T,)(T,\varnothing) is a rooted network. Call (T,)B,H(T,\varnothing)^{B,H} the finite rooted subnetwork obtained by restricting (T,)(T,\varnothing) to the vertex set JB,HJ_{B,H}. If an edge is not present in (T,)B,H(T,\varnothing)^{B,H} assign the weight 0. We say a sequence (Gn,1)B,H(G_{n},1)^{B,H}, for fixed BB and HH, converges in distribution, as nn\rightarrow\infty, to (T,)B,H(T,\varnothing)^{B,H} if the joint distribution of the weights converges weakly.

Let πn\pi_{n} be the permutation on {1,,n}\{1,\dots,n\} associated to the permutation matrix PnP_{n}. We let

ξijn=(ξijn,(1),ξijn,(2)):=(Xπn(i),πn(j)an,Xπn(j),πn(i)an).\xi_{ij}^{n}=\left(\xi_{ij}^{n,(1)},\xi_{ij}^{n,(2)}\right):=\left(\frac{X_{\pi_{n}(i),\pi_{n}(j)}}{a_{n}},\frac{X_{\pi_{n}(j),\pi_{n}(i)}}{a_{n}}\right).

We now consider (Gn,1)B,H(G_{n},1)^{B,H} with weights ξijn\xi_{ij}^{n} and a realization, TT, of 𝐏𝐖𝐈𝐓(θd×mα)\mathbf{PWIT}(\theta_{d}\times m_{\alpha}). We aim to show (Gn,1)B,H(G_{n},1)^{B,H} converges in distribution to (T,)B,H(T,\varnothing)^{B,H}, for fixed B,HB,H as nn\rightarrow\infty.

Order the elements of JB,HJ_{B,H} lexicographically, i.e. 12B1112BB\varnothing\prec 1\prec 2\prec\cdots\prec B\prec 11\prec 12\prec\cdots\prec B\cdots B. For vJB,Hv\in J_{B,H} let 𝒪v\mathcal{O}_{v} denote the offspring of vv in (G,1)B,H(G,1)^{B,H}. By construction I={1}I_{\varnothing}=\{1\} and Iv=σn(wv𝒪w)I_{v}=\sigma_{n}\left(\bigcup_{w\prec v}\mathcal{O}_{w}\right), where wvw\prec v must be strict in this union. Thus at every step of the indexing procedure we order the weights of neighboring edges not already considered at a previous step. Thus for all vv,

(ξσn(v),jn)jIv=𝑑(ξ1jn)1<jn|Iv|.(\xi_{\sigma_{n}(v),j}^{n})_{j\notin I_{v}}\overset{d}{=}(\xi_{1j}^{n})_{1<j\leq n-|I_{v}|}.

Note that by independence, Proposition 2.3 still holds if one takes the empirical sum over $\{1,\dots,n\}\setminus I$ for any fixed finite set $I$. Thus by Proposition 2.3, the weights from a fixed parent to its offspring in $(G_{n},1)^{B,H}$ converge weakly to those of $(T,\varnothing)^{B,H}$. By independence we can extend this to joint convergence. Because for any fixed $n$, $(G_{n},1)^{B,H}$ is still a complete network on $V_{n}^{B,H}$, we must now check that the weights attached to vertices not indexed above converge to zero, which is the weight given to edges not in the tree. For $v,w\in J_{B,H}$ define

xv,wn:=ξσn(v),σn(w)n.x_{v,w}^{n}:=\xi_{\sigma_{n}(v),\sigma_{n}(w)}^{n}.

Also let $\{z_{v,w}^{n},v,w\in J_{B,H}\}$ denote independent variables distributed as $\|\xi_{12}^{n}\|$. Let $E^{B,H}$ denote the set of edges $\{v,w\}\in J_{B,H}\times J_{B,H}$ that do not belong to the finite subtree $(T,\varnothing)^{B,H}$. Because we have sorted out the largest elements, the vector $\{\|x_{v,w}^{n}\|,\{v,w\}\in E^{B,H}\}$ is stochastically dominated by the vector $Z_{n}:=\{z_{v,w}^{n},\{v,w\}\in E^{B,H}\}$ (see [12], Lemma 2.7). Since $J_{B,H}$ is finite, the vector $Z_{n}$ converges to $0$ as $n\rightarrow\infty$. Thus $(G_{n},1)^{B,H}\Rightarrow(T,\varnothing)^{B,H}$.

Let $A$ be the operator associated to $\mathbf{PWIT}(\theta_{d}\times m_{\alpha})$ defined by (4.2). For fixed $B,H$ let $\sigma_{n}^{B,H}$ be the map $\sigma_{n}$ above associated to $(G_{n},1)^{B,H}$, and arbitrarily extend $\sigma_{n}^{B,H}$ to a bijection on $\mathbb{N}^{f}$, where $V_{n}$ is considered in the natural way as a subset of the offspring of $\varnothing$. From the Skorokhod representation theorem we may assume $(G_{n},1)^{B,H}$ converges almost surely to $(T,\varnothing)^{B,H}$. Thus there are sequences $B_{n},H_{n}$ tending to infinity and $\hat{\sigma}_{n}:=\sigma_{n}^{B_{n},H_{n}}$ such that for any pair $v,w\in\mathbb{N}^{f}$, $\xi_{\hat{\sigma}_{n}(v),\hat{\sigma}_{n}(w)}^{n}$ converges almost surely to

{yvk,if w=vk for some kywk,if v=wk for some k0,otherwise.\begin{cases}y_{vk},\quad\text{if $w=vk$ for some $k$}\\ y_{wk},\quad\text{if $v=wk$ for some $k$}\\ 0,\quad\text{otherwise.}\end{cases}

Thus almost surely

δv,σ^n1PnAnPnTσ^nδw=ξσ^n(v),σ^n(w)n,(i)δv,Aδw,\langle\delta_{v},\hat{\sigma}_{n}^{-1}P_{n}A_{n}P_{n}^{T}\hat{\sigma}_{n}\delta_{w}\rangle=\xi_{\hat{\sigma}_{n}(v),\hat{\sigma}_{n}(w)}^{n,(i)}\rightarrow\langle\delta_{v},A\delta_{w}\rangle,

where $(i)=1$ or $2$ according to whether $w$ is an offspring of $v$ or vice versa; if neither, we take the convention $i=1$, in which case $\xi_{\hat{\sigma}_{n}(v),\hat{\sigma}_{n}(w)}^{n,(1)}\rightarrow 0$. To prove the local convergence of operators it is sufficient, by linearity, to prove pointwise convergence for any $\delta_{w}$. For convenience let $\phi_{n}^{w}:=\hat{\sigma}_{n}^{-1}P_{n}A_{n}P_{n}^{T}\hat{\sigma}_{n}\delta_{w}$. Thus all that remains to be shown to complete the proof of Theorem 4.8 is that almost surely, as $n\rightarrow\infty$,

uf|δu,ϕnwδu,Aδw|20.\sum_{u\in\mathbb{N}^{f}}|\langle\delta_{u},\phi_{n}^{w}\rangle-\langle\delta_{u},A\delta_{w}\rangle|^{2}\rightarrow 0.

Since $\langle\delta_{u},\phi_{n}^{w}\rangle\rightarrow\langle\delta_{u},A\delta_{w}\rangle$ for every $u$, and $\langle\delta_{u},A\delta_{w}\rangle$ is square summable in $u$, it is enough to show that, for any indexing $u_{1},u_{2},\dots$ of $\mathbb{N}^{f}$,

supn1i=k|δui,ϕnw|20\sup_{n\geq 1}\sum_{i=k}^{\infty}|\langle\delta_{u_{i}},\phi_{n}^{w}\rangle|^{2}\rightarrow 0

as kk\rightarrow\infty. This follows from the uniform square-integrability of order statistics, see Lemma 2.4 of [12]. This completes the proof of Theorem 4.8.

4.5. Resolvent matrix

Let $R$ be the resolvent of the bipartized random operator of $A$. For $U(z,\eta)\in\mathbb{H}_{+}$, set

R(U)=(a(z,η)b(z,η)b(z,η)c(z,η)).R(U)_{\varnothing\varnothing}=\begin{pmatrix}a(z,\eta)&b(z,\eta)\\ b^{\prime}(z,\eta)&c(z,\eta)\end{pmatrix}. (4.3)

We have the following result.

Theorem 4.10.

Let $P_{n}$, $A_{n}$, and $A$ be as in Theorem 4.8. Since $B(z)$ is almost surely self-adjoint, we may almost surely define $R$, the resolvent of $B(z)$. Let $\hat{R}_{n}$ be the resolvent of the bipartized matrix of $P_{n}A_{n}P_{n}^{T}$. For all $U\in\mathbb{H}_{+}$,

R^n(U)11R(U)\hat{R}_{n}(U)_{11}\Rightarrow R(U)_{\varnothing\varnothing}

as nn\to\infty.

Proof.

This follows immediately from Theorem 4.8, Proposition 4.6, and Theorem 4.5. ∎

The entries of the resolvent matrix are continuous functions and, by Lemma 3.2, bounded. Thus

limn𝔼R^n(U)11=𝔼R(U).\lim\limits_{n\rightarrow\infty}\mathbb{E}\hat{R}_{n}(U)_{11}=\mathbb{E}R(U)_{\varnothing\varnothing}. (4.4)

Note that by independence of πn\pi_{n} and AnA_{n}

𝔼R^n(U)11=𝔼Rn(U)πn1(1)πn1(1)=𝔼1ni=1nRn(U)ii.\mathbb{E}\hat{R}_{n}(U)_{11}=\mathbb{E}R_{n}(U)_{\pi_{n}^{-1}(1)\pi_{n}^{-1}(1)}=\mathbb{E}\frac{1}{n}\sum_{i=1}^{n}R_{n}(U)_{ii}. (4.5)

This is the reason for the choice of uniformly distributed PnP_{n} independent of AnA_{n}.
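The identity (4.5) is exact for any fixed matrix: conjugating by a permutation matrix commutes with bipartization and moves the $(1,1)$ resolvent block to a uniformly random diagonal block, and averaging over the permutation recovers the mean of the diagonal blocks. A numerical sketch (illustrative only):

```python
import numpy as np

def bipartize_resolvent(A, z, eta):
    """Resolvent (B(z) - eta I)^{-1} of the bipartization of A."""
    n = A.shape[0]
    B = np.zeros((2 * n, 2 * n), dtype=complex)
    for i in range(n):
        for j in range(n):
            B[2 * i, 2 * j + 1] = A[i, j]
            B[2 * i + 1, 2 * j] = np.conj(A[j, i])
    U0 = np.array([[0, z], [np.conj(z), 0]])
    return np.linalg.inv(B - np.kron(np.eye(n), U0) - eta * np.eye(2 * n))

rng = np.random.default_rng(4)
n, z, eta = 5, 0.4j, 1.0j
A = rng.standard_normal((n, n))
R = bipartize_resolvent(A, z, eta)
avg = sum(R[2 * i:2 * i + 2, 2 * i:2 * i + 2] for i in range(n)) / n

# one permutation: the (1,1) block of the conjugated resolvent is a diagonal
# block of R (in this convention, (P A P^T)_{ij} = A_{perm(i), perm(j)})
perm = rng.permutation(n)
P = np.eye(n)[perm]
Rhat = bipartize_resolvent(P @ A @ P.T, z, eta)
i0 = perm[0]
assert np.allclose(Rhat[0:2, 0:2], R[2 * i0:2 * i0 + 2, 2 * i0:2 * i0 + 2])

# averaging over permutations whose image of index 1 is uniform recovers avg
blocks = []
for i in range(n):
    perm_i = np.roll(np.arange(n), -i)   # perm_i[0] = i
    P_i = np.eye(n)[perm_i]
    blocks.append(bipartize_resolvent(P_i @ A @ P_i.T, z, eta)[0:2, 0:2])
assert np.allclose(sum(blocks) / n, avg)
```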

Theorem 4.11.

For all zz\in\mathbb{C}, almost surely the measures νˇAnz\check{\nu}_{A_{n}-z} converge weakly to a deterministic probability measure νˇα,z,θd\check{\nu}_{\alpha,z,\theta_{d}} whose Stieltjes transform is given by

mνˇα,z,θd(η)=𝔼a(z,η),m_{\check{\nu}_{\alpha,z,\theta_{d}}}(\eta)=\mathbb{E}a(z,\eta),

for η+\eta\in\mathbb{C}_{+} and a(z,η)a(z,\eta) in (4.3).

Proof.

By Proposition 4.6, for every $z\in\mathbb{C}$, $B(z)$ is almost surely self-adjoint. Thus, using the Borel functional calculus, there exists almost surely a random probability measure $\nu_{\varnothing,z}$ on $\mathbb{R}$ such that

a(z,η)=δ,R(U)δ=dν,z(x)xη=mν,z(η).a(z,\eta)=\langle\delta_{\varnothing},R(U)\delta_{\varnothing}\rangle=\int_{\mathbb{R}}\frac{d\nu_{\varnothing,z}(x)}{x-\eta}=m_{\nu_{\varnothing,z}}(\eta).

See Theorem VIII.5 in [42] for more on this measure and the Borel functional calculus for unbounded self-adjoint operators. Define RnR_{n} as the resolvent matrix of Bn(z)B_{n}(z), the bipartized matrix of AnA_{n}. For U(z,η)+U(z,\eta)\in\mathbb{H}_{+}, we write

R_{n}(U)_{kk}=\begin{pmatrix}a_{k}(z,\eta)&b_{k}(z,\eta)\\ b^{\prime}_{k}(z,\eta)&c_{k}(z,\eta)\end{pmatrix},\qquad\hat{R}_{n}(U)_{kk}=\begin{pmatrix}\hat{a}_{k}(z,\eta)&\hat{b}_{k}(z,\eta)\\ \hat{b}^{\prime}_{k}(z,\eta)&\hat{c}_{k}(z,\eta)\end{pmatrix}.

By Theorem 3.1

m_{\mathbb{E}\check{\nu}_{A_{n}-z}}(\eta)=\mathbb{E}\frac{1}{2n}\sum_{k=1}^{n}(a_{k}(z,\eta)+c_{k}(z,\eta))=\frac{1}{2}\left(\mathbb{E}\hat{a}_{1}(z,\eta)+\mathbb{E}\hat{c}_{1}(z,\eta)\right)=\mathbb{E}\hat{a}_{1}(z,\eta).

Thus, by (4.4) and (4.5),

\lim\limits_{n\rightarrow\infty}m_{\mathbb{E}\check{\nu}_{A_{n}-z}}(\eta)=\mathbb{E}a(z,\eta).

It follows that $\mathbb{E}\check{\nu}_{A_{n}-z}$ converges to a deterministic probability measure $\check{\nu}_{\alpha,z,\theta_{d}}=\mathbb{E}\nu_{\varnothing,z}$. By Lemma A.2, $\check{\nu}_{A_{n}-z}$ concentrates around its expected value, and thus by the Borel--Cantelli lemma $\check{\nu}_{A_{n}-z}$ converges almost surely to $\check{\nu}_{\alpha,z,\theta_{d}}$. ∎

Theorem 1.5 follows immediately from Theorem 4.11. We conclude this section with a recursive distributional equation describing R(U)R(U)_{\varnothing\varnothing}.

Proposition 4.12.

Let θd\theta_{d} be a probability measure. Let (w(1),w(2))(w^{(1)},w^{(2)}) be a θd\theta_{d} distributed random vector independent of the matrix given in (4.3). Additionally let ρz,η\rho_{z,\eta} be the measure on 48\mathbb{C}^{4}\cong\mathbb{R}^{8} which is the distribution of the random vector

(c(z,η)|w(1)|2b(z,η)w(1)w(2)b(z,η)w¯(1)w¯(2)a(z,η)|w(2)|2),\begin{pmatrix}c(z,\eta)|w^{(1)}|^{2}\\ b^{\prime}(z,\eta)w^{(1)}w^{(2)}\\ b(z,\eta)\bar{w}^{(1)}\bar{w}^{(2)}\\ a(z,\eta)|w^{(2)}|^{2}\end{pmatrix}, (4.6)

where a(z,η),b(z,η),b(z,η),a(z,\eta),b(z,\eta),b^{\prime}(z,\eta), and c(z,η)c(z,\eta) are as in (4.3). Then the matrix given in (4.3) satisfies the following recursive distributional equation

(a(z,η)b(z,η)b(z,η)c(z,η))=𝑑((ηzz¯η)+(S1S2S3S4))1\begin{pmatrix}a(z,\eta)&b(z,\eta)\\ b^{\prime}(z,\eta)&c(z,\eta)\end{pmatrix}\overset{d}{=}-\left(\begin{pmatrix}\eta&z\\ \bar{z}&\eta\end{pmatrix}+\begin{pmatrix}S_{1}&S_{2}\\ S_{3}&S_{4}\end{pmatrix}\right)^{-1} (4.7)

where $S=(S_{1},S_{2},S_{3},S_{4})$ is an $\alpha/2$-stable random vector whose spectral measure is given by the image of the measure $\frac{\Gamma(2-\alpha/2)\cos(\pi\alpha/4)}{1-\alpha/2}\|v\|^{\alpha/2}d\rho_{z,\eta}(v)$ under the map $v\mapsto\frac{v}{\|v\|}$.

The proof of Proposition 4.12 is given in Appendix A.4.

5. Least singular values of elliptic random matrices

Now that we have proven Theorem 1.5, we move on to show that log()\log(\cdot) is uniformly integrable, in probability, with respect to {νAnzIn}n1\{\nu_{A_{n}-zI_{n}}\}_{n\geq 1}. We begin with a bound on the least singular value of an elliptic random matrix under very general assumptions. This section is entirely self-contained.

Theorem 5.1 (Least singular value bound).

Let X=(Xij)X=(X_{ij}) be an n×nn\times n complex-valued random matrix such that

  (i)

    (off-diagonal entries) {(Xij,Xji):1i<jn}\{(X_{ij},X_{ji}):1\leq i<j\leq n\} is a collection of independent random tuples,

  (ii)

    (diagonal entries) the diagonal entries {Xii:1in}\{X_{ii}:1\leq i\leq n\} are independent of the off-diagonal entries (but can be dependent on each other),

  (iii)

    there exists a>0a>0 such that the events

    ij:={|Xij|a,|Xji|a}\mathcal{E}_{ij}:=\{|X_{ij}|\leq a,|X_{ji}|\leq a\} (5.1)

    defined for iji\neq j satisfy

    b:=mini<j(ij)>0,σ2:=minijVar(Xijij)>0,b:=\min_{i<j}\mathbb{P}(\mathcal{E}_{ij})>0,\quad\sigma^{2}:=\min_{i\neq j}\operatorname{Var}(X_{ij}\mid\mathcal{E}_{ij})>0,

    and

    ρ:=maxi<j|Corr(Xijij,Xjiji)|<1.\rho:=\max_{i<j}\left|{\operatorname{Corr}(X_{ij}\mid\mathcal{E}_{ij},X_{ji}\mid\mathcal{E}_{ji})}\right|<1.

Then there exists C=C(a,b,σ)>0C=C(a,b,\sigma)>0 such that for nCn\geq C, any MMatn()M\in\operatorname{Mat}_{n}(\mathbb{C}), any s1s\geq 1, and any 0<t10<t\leq 1,

(sn(X+M)tn,s1(X+M)s)C(log(Cns)1ρ(s5t+1n))1/4.\mathbb{P}\left(s_{n}(X+M)\leq\frac{t}{\sqrt{n}},s_{1}(X+M)\leq s\right)\leq C\left(\frac{\log(Cns)}{\sqrt{1-\rho}}\left(\sqrt{s^{5}t}+\frac{1}{\sqrt{n}}\right)\right)^{1/4}.
Remark 5.2.

The constant CC from Theorem 5.1 only depends on a,b,σa,b,\sigma and does not depend on ρ\rho. This allows one to apply Theorem 5.1 to cases where ρ\rho depends on nn.

5.1. Proof of Theorem 5.1

In this section we prove Theorem 5.1. Suppose XX and MM satisfy the assumptions of the theorem, and denote A:=X+MA:=X+M. We will use AA throughout this section; note that it is not the AnA_{n} of Theorems 1.5 and 1.6. Throughout the section, we allow all constants to depend on a,b,σa,b,\sigma without explicitly denoting this dependence. Constants, however, will not depend on ρ\rho; instead, we will state all dependence on ρ\rho explicitly.

For the proof of Theorem 5.1, it suffices to assume that AA and every principal submatrix of AA are invertible with probability 11. To see this, define X:=X+tnξIX^{\prime}:=X+\frac{t}{\sqrt{n}}\xi I, where II is the identity matrix and ξ\xi is a real-valued random variable uniformly distributed on the interval [1,1][-1,1], independent of XX. It follows that XX^{\prime} satisfies the assumptions of Theorem 5.1. However, since ξ\xi is continuously distributed, it also follows that A:=X+MA^{\prime}:=X^{\prime}+M and every principal submatrix of AA^{\prime} are invertible with probability 11. By Weyl’s inequality for the singular values (see, for instance, [10, Problem III.6.13]), we find

max1kn|sk(A)sk(A)|tns.\max_{1\leq k\leq n}|s_{k}(A)-s_{k}(A^{\prime})|\leq\frac{t}{\sqrt{n}}\leq s.

Hence, we conclude that

(sn(A)tn,s1(A)s)(sn(A)2tn,s1(A)2s).\mathbb{P}\left(s_{n}(A)\leq\frac{t}{\sqrt{n}},s_{1}(A)\leq s\right)\leq\mathbb{P}\left(s_{n}(A^{\prime})\leq\frac{2t}{\sqrt{n}},s_{1}(A^{\prime})\leq 2s\right).

In other words, it suffices to prove Theorem 5.1 under the additional assumption that AA and every principal submatrix of AA are invertible. We work under this additional assumption for the remainder of the proof.

5.2. Nets and a decomposition of the unit sphere

Consider a compact set KnK\subset\mathbb{C}^{n} and ε>0\varepsilon>0. A subset 𝒩K\mathcal{N}\subset K is called an ε\varepsilon-net of KK if for every point vKv\in K one has dist(v,𝒩)ε\operatorname{dist}(v,\mathcal{N})\leq\varepsilon.
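To make the net definition concrete, the following numerical sketch (our own illustration; the function names are not from the paper) builds an ε-net of a real interval from a grid of spacing 2ε and checks the covering property dist(v, 𝒩) ≤ ε. The cardinality grows linearly in the length of the interval, which is the mechanism behind cardinality bounds such as (5.10) below.

```python
# Illustrative sketch (not from the paper): a grid of spacing 2*eps is an
# eps-net of a real interval, with O(length / eps) points.

def epsilon_net(lo, hi, eps):
    """Return a finite eps-net of the interval [lo, hi]."""
    net, x = [], lo + eps
    while x - eps < hi:
        net.append(min(x, hi))  # clip the last grid point to the interval
        x += 2 * eps
    return net or [(lo + hi) / 2.0]

def dist_to_net(v, net):
    """dist(v, N): distance from the point v to the net."""
    return min(abs(v - p) for p in net)

net = epsilon_net(-2.0, 2.0, 0.25)
# Every point of [-2, 2] lies within 0.25 of the net, and |net| stays
# proportional to the interval length.
```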

For some real positive parameters δ,τ>0\delta,\tau>0 that will be determined later, we define the set of δ\delta-sparse vectors as

Sparse(δ):={xn:|supp(x)|δn}.\operatorname{Sparse}(\delta):=\{x\in\mathbb{C}^{n}:|\operatorname{supp}(x)|\leq\delta n\}.

We decompose the unit sphere 𝕊n1\mathbb{S}^{n-1} into the set of compressible vectors and the complementary set of incompressible vectors by

Comp(δ,τ):={x𝕊n1:dist(x,Sparse(δ))τ}\operatorname{Comp}(\delta,\tau):=\{x\in\mathbb{S}^{n-1}:\operatorname{dist}(x,\operatorname{Sparse}(\delta))\leq\tau\}

and

Incomp(δ,τ):=𝕊n1Comp(δ,τ).\operatorname{Incomp}(\delta,\tau):=\mathbb{S}^{n-1}\setminus\operatorname{Comp}(\delta,\tau).

We have the following result for incompressible vectors.

Lemma 5.3 (Incompressible vectors are spread; Lemma A.3 from [15]).

Let xIncomp(δ,τ)x\in\operatorname{Incomp}(\delta,\tau). There exists a subset π[n]\pi\subset[n] such that |π|δn/2|\pi|\geq\delta n/2 and for all iπi\in\pi,

τn|xi|2δn.\frac{\tau}{\sqrt{n}}\leq|x_{i}|\leq\sqrt{\frac{2}{\delta n}}.
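Lemma 5.3 can be illustrated numerically. The sketch below (our own; the helper name is illustrative) extracts the spread set π of coordinates lying in the window [τ/√n, √(2/(δn))]; for the flat unit vector, which is maximally spread, every coordinate lands in the window, comfortably exceeding the guaranteed δn/2.

```python
import math

def spread_set(x, delta, tau):
    """Indices i with tau/sqrt(n) <= |x_i| <= sqrt(2/(delta*n)),
    as in Lemma 5.3."""
    n = len(x)
    lo, hi = tau / math.sqrt(n), math.sqrt(2.0 / (delta * n))
    return [i for i, xi in enumerate(x) if lo <= abs(xi) <= hi]

n, delta, tau = 100, 0.1, 0.2
flat = [1.0 / math.sqrt(n)] * n          # flat unit vector
pi = spread_set(flat, delta, tau)
# Lemma 5.3 guarantees |pi| >= delta*n/2 = 5 for incompressible vectors;
# for the flat vector every coordinate qualifies.
```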

5.3. Control of compressible vectors

The case of compressible vectors roughly follows the arguments from [15]. For i,j[n]i,j\in[n], we let CjC_{j} denote the jj-th column of AA and Cj(i)C_{j}^{(i)} denote the jj-th column of AA with the ii-th entry removed.

Lemma 5.4 (Distance of a random vector to a deterministic subspace).

There exist constants ε,C,c,δ0>0\varepsilon,C,c,\delta_{0}>0 such that for all 1in1\leq i\leq n and any deterministic subspace HH of n1\mathbb{C}^{n-1} with 1dim(H)δ0n1\leq\dim(H)\leq\delta_{0}n, we have

(dist(Ci(i),H)εn)Cexp(cn).\mathbb{P}\left(\operatorname{dist}(C_{i}^{(i)},H)\leq\varepsilon\sqrt{n}\right)\leq C\exp(-cn).
Proof.

The proof follows the same arguments as those given in the proof of [15, Theorem A.2]. Fix 1in1\leq i\leq n; the arguments and bounds below are all uniform in ii. Recall the definitions of the events ij\mathcal{E}_{ij} given in (5.1). By the assumptions on ij\mathcal{E}_{ij}, the Chernoff bound gives

(ji𝟏ji(n1)b2)exp((n1)b8).\mathbb{P}\left(\sum_{j\neq i}\mathbf{1}_{{\mathcal{E}_{ji}}}\leq\frac{(n-1)b}{2}\right)\leq\exp\left(-\frac{(n-1)b}{8}\right).

In other words, with high probability, at least m:=(n1)b2m:=\left\lceil\frac{(n-1)b}{2}\right\rceil of the events ji\mathcal{E}_{ji}, jij\neq i occur. Thus, it suffices to prove the result by conditioning on the event

Em:=j[m],jiji.E_{m}:=\bigcap_{j\in[m],j\neq i}\mathcal{E}_{ji}. (5.2)

There are two cases to consider. Either i[m]i\in[m] or i>mi>m. The arguments for these two cases are almost identical, except that some notation must be changed slightly to remove the ii-th index. For the remainder of the proof, let us only consider the case when i>mi>m; the changes required for the other case are left to the reader.

Recall that it suffices to prove the result by conditioning on the event EmE_{m} defined in (5.2). In fact, as i>mi>m, the definition of the event EmE_{m} given in (5.2) can be stated as

Em:=j[m]ji.E_{m}:=\bigcap_{j\in[m]}\mathcal{E}_{ji}.

Let 𝔼m[]:=𝔼[Em,m]\mathbb{E}_{m}[\cdot]:=\mathbb{E}[\cdot\mid E_{m},\mathcal{F}_{m}] denote the conditional expectation given the event EmE_{m} and the σ\sigma-algebra m\mathcal{F}_{m} generated by XjiX_{ji}, j>mj>m, jij\neq i. Let WW be the subspace spanned by HH and the vectors

u:=(0,,0,Xm+1,i,,Xi1,i,Xi+1,i,,Xn,i)u:=(0,\ldots,0,X_{m+1,i},\ldots,X_{i-1,i},X_{i+1,i},\ldots,X_{n,i})

and

w:=(𝔼m[X1,i],,𝔼m[Xm,i],0,,0).w:=(\mathbb{E}_{m}[X_{1,i}],\ldots,\mathbb{E}_{m}[X_{m,i}],0,\ldots,0).

By construction, dim(W)dim(H)+2\dim(W)\leq\dim(H)+2 and WW is m\mathcal{F}_{m} measurable. In addition,

dist(Ci(i),H)dist(Ci(i),W)=dist(Y,W),\operatorname{dist}(C_{i}^{(i)},H)\geq\operatorname{dist}(C_{i}^{(i)},W)=\operatorname{dist}(Y,W),

where

Y:=(X1,i𝔼m[X1,i],,Xm,i𝔼m[Xm,i],0,,0)=Ci(i)uw.Y:=(X_{1,i}-\mathbb{E}_{m}[X_{1,i}],\ldots,X_{m,i}-\mathbb{E}_{m}[X_{m,i}],0,\ldots,0)=C_{i}^{(i)}-u-w.

By assumption, the coordinates Y1,,YmY_{1},\ldots,Y_{m} are independent random variables which satisfy

|Yk|2a,𝔼m[Yk]=0,𝔼m|Yk|2σ2|Y_{k}|\leq 2a,\qquad\mathbb{E}_{m}[Y_{k}]=0,\qquad\mathbb{E}_{m}|Y_{k}|^{2}\geq\sigma^{2}

for 1km1\leq k\leq m. Thus, since the function xdist(x,W)x\mapsto\operatorname{dist}(x,W) is convex and 11-Lipschitz, Talagrand’s concentration inequality (see for instance [50, Theorem 2.1.13]) yields

m(|dist(Y,W)𝔼m[dist(Y,W)]|t)Cexp(ct2a2)\mathbb{P}_{m}(|\operatorname{dist}(Y,W)-\mathbb{E}_{m}[\operatorname{dist}(Y,W)]|\geq t)\leq C\exp\left(-c\frac{t^{2}}{a^{2}}\right) (5.3)

for every t0t\geq 0, where C,c>0C,c>0 are absolute constants and m\mathbb{P}_{m} denotes the conditional probability corresponding to 𝔼m\mathbb{E}_{m}. In particular, this implies that

(𝔼mdist(Y,W))2𝔼mdist2(Y,W)ca2(\mathbb{E}_{m}\operatorname{dist}(Y,W))^{2}\geq\mathbb{E}_{m}\operatorname{dist}^{2}(Y,W)-c^{\prime}a^{2} (5.4)

for an absolute constant c>0c^{\prime}>0. Thus, if PP denotes the orthogonal projection onto the orthogonal complement of WW, we get

𝔼mdist2(Y,W)\displaystyle\mathbb{E}_{m}\operatorname{dist}^{2}(Y,W) =k=1m𝔼m|Yk|2Pkk\displaystyle=\sum_{k=1}^{m}\mathbb{E}_{m}|Y_{k}|^{2}P_{kk}
σ2(trPk=m+1n1Pkk)\displaystyle\geq\sigma^{2}\left(\operatorname{tr}P-\sum_{k=m+1}^{n-1}P_{kk}\right)
σ2((n1)dim(H)2(n1m))\displaystyle\geq\sigma^{2}((n-1)-\dim(H)-2-(n-1-m))
σ2((n1)b2δ0n2).\displaystyle\geq\sigma^{2}\left(\frac{(n-1)b}{2}-\delta_{0}n-2\right).

The last term is lower bounded by c′′σ2nc^{\prime\prime}\sigma^{2}n for all nn sufficiently large by taking δ0:=b/4\delta_{0}:=b/4. Combining this bound with (5.3) and (5.4) completes the proof. ∎
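The Chernoff step at the beginning of the proof can be compared with the exact binomial tail. In the sketch below (our own illustration, using independent events each of probability exactly b, a special case of the assumptions above), the exact probability that at most half the expected number of events occur is dominated by the bound exp(-Nb/8).

```python
import math

def binom_lower_tail(N, p, k):
    """Exact P(Bin(N, p) <= k)."""
    return sum(math.comb(N, j) * p**j * (1.0 - p)**(N - j)
               for j in range(k + 1))

N, b = 200, 0.5                           # sample values, our choice
exact = binom_lower_tail(N, b, int(N * b / 2))
chernoff = math.exp(-N * b / 8.0)
# exact <= chernoff, matching the Chernoff step of Lemma 5.4 (with N = n-1).
```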

The next bound, which follows as a corollary of Lemma 5.4, will be useful when dealing with compressible vectors.

Corollary 5.5.

There exist ε,C,c,δ0>0\varepsilon,C,c,\delta_{0}>0 such that for any deterministic subset π[n]\pi\subset[n] with |π|δ0n|\pi|\leq\delta_{0}n and any deterministic unu\in\mathbb{C}^{n}, we have

(miniπdist(Ci,Hi)εn)Cnexp(cn),\mathbb{P}\left(\min_{i\in\pi}\operatorname{dist}(C_{i},H_{i})\leq\varepsilon\sqrt{n}\right)\leq Cn\exp(-cn),

where Hi:=Span({Cj:jπ,ji}{u})H_{i}:=\operatorname{Span}\left(\{C_{j}:j\in\pi,j\neq i\}\cup\{u\}\right).

Proof.

We will apply Lemma 5.4 to control dist(Ci,Hi)\operatorname{dist}(C_{i},H_{i}). To this end, define u(i)u^{(i)} to be the vector uu with the ii-th entry removed, and set Hi(i):=Span({Cj(i):jπ,ji}{u(i)})H_{i}^{(i)}:=\operatorname{Span}\left(\{C_{j}^{(i)}:j\in\pi,j\neq i\}\cup\{u^{(i)}\}\right). Then

dist(Ci,Hi)dist(Ci(i),Hi(i))\operatorname{dist}(C_{i},H_{i})\geq\operatorname{dist}(C_{i}^{(i)},H_{i}^{(i)})

for all 1in1\leq i\leq n. Note that Hi(i)H_{i}^{(i)} is independent of Ci(i)C_{i}^{(i)}. Hence, conditioning on Hi(i)H_{i}^{(i)} and applying Lemma 5.4, we find the existence of ε,C,c,δ0\varepsilon,C,c,\delta_{0} such that for any π[n]\pi\subset[n] with |π|δ0n|\pi|\leq\delta_{0}n and all iπi\in\pi

(dist(Ci,Hi)εn)(dist(Ci(i),Hi(i))εn)Cexp(cn).\displaystyle\mathbb{P}\left(\operatorname{dist}(C_{i},H_{i})\leq{\varepsilon}\sqrt{n}\right)\leq\mathbb{P}\left(\operatorname{dist}(C_{i}^{(i)},H_{i}^{(i)})\leq{\varepsilon}\sqrt{n}\right)\leq C\exp(-cn).

Therefore, by the union bound,

(miniπdist(Ci,Hi)εn)iπ(dist(Ci,Hi)εn)Cnexp(cn),\mathbb{P}\left(\min_{i\in\pi}\operatorname{dist}(C_{i},H_{i})\leq{\varepsilon}\sqrt{n}\right)\leq\sum_{i\in\pi}\mathbb{P}\left(\operatorname{dist}(C_{i},H_{i})\leq{\varepsilon}\sqrt{n}\right)\leq Cn\exp(-cn),

and the proof is complete. ∎

Let ε,δ0\varepsilon,\delta_{0} be as in Corollary 5.5. From now on we set

τ:=14min{1,εsδ}.\tau:=\frac{1}{4}\min\left\{1,\frac{\varepsilon}{s\sqrt{\delta}}\right\}. (5.5)

(Recall that s1s\geq 1 is the upper bound for s1(X+M)s_{1}(X+M) specified in the statement of Theorem 5.1.) In particular, this definition implies that τ1/4\tau\leq 1/4. The parameter δ\delta is still to be specified. Right now we only assume that δ<δ0\delta<\delta_{0}.

Lemma 5.6 (Control of compressible vectors).

There exist constants C,c,δ>0C,c,\delta>0 such that for any deterministic vector unu\in\mathbb{C}^{n} and any s1s\geq 1

(infxComp(δ,τ)Axuε2δ,s1(A)s)Cexp(cn).\mathbb{P}\left(\inf_{x\in\operatorname{Comp}(\delta,\tau)}\|Ax-u\|\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right)\leq C\exp(-cn).
Proof.

Let 0<δ<δ00<\delta<\delta_{0} be a constant to be chosen later. We decompose Comp(δ,τ)\operatorname{Comp}(\delta,\tau) as

Comp(δ,τ)=π[n]:|π|=δnSπ,\operatorname{Comp}(\delta,\tau)=\bigcup_{\pi\subset[n]:|\pi|=\lfloor\delta n\rfloor}S_{\pi},

where

Sπ:={xComp(δ,τ):dist(x,Sparseπ(δ))τ}S_{\pi}:=\{x\in\operatorname{Comp}(\delta,\tau):\operatorname{dist}(x,\operatorname{Sparse}_{\pi}(\delta))\leq\tau\}

and

Sparseπ(δ):={ySparse(δ):supp(y)π}.\operatorname{Sparse}_{\pi}(\delta):=\{y\in\operatorname{Sparse}(\delta):\operatorname{supp}(y)\subset\pi\}.

So by the union bound,

\displaystyle\mathbb{P} (infxComp(δ,τ)Axuε2δ,s1(A)s)\displaystyle\left(\inf_{x\in\operatorname{Comp}(\delta,\tau)}\|Ax-u\|\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right) (5.6)
π[n]:|π|=δn(infxSπAxuε2δ,s1(A)s).\displaystyle\qquad\qquad\leq\sum_{\pi\subset[n]:|\pi|=\lfloor\delta n\rfloor}\mathbb{P}\left(\inf_{x\in S_{\pi}}\|Ax-u\|\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right).

Fix π[n]\pi\subset[n] with |π|=δn|\pi|=\lfloor\delta n\rfloor, and suppose there exists xSπx\in S_{\pi} such that Axuε2δ\|Ax-u\|\leq\frac{\varepsilon}{2\sqrt{\delta}} and s1(A)ss_{1}(A)\leq s. Then there exists ySparseπ(δ)y\in\operatorname{Sparse}_{\pi}(\delta) with xyτ\|x-y\|\leq\tau. In particular, this implies that y3/4\|y\|\geq 3/4. In addition, we have

ε2δAxuAyuτAAyuε4δ\displaystyle\frac{\varepsilon}{2\sqrt{\delta}}\geq\|Ax-u\|\geq\|Ay-u\|-\tau\|A\|\geq\|Ay-u\|-\frac{\varepsilon}{4\sqrt{\delta}}

by the assumption that As\|A\|\leq s and the definition of τ\tau (5.5). Hence, we obtain

Ayu3ε4δ.\|Ay-u\|\leq\frac{3\varepsilon}{4\sqrt{\delta}}. (5.7)

We now bound Ayu\|Ay-u\| from below. Indeed, we have

Ayu2\displaystyle\|Ay-u\|^{2} =iπCiyiu2maxiπ|yi|2dist2(Ci,Hi),\displaystyle=\left\|\sum_{i\in\pi}C_{i}y_{i}-u\right\|^{2}\geq\max_{i\in\pi}|y_{i}|^{2}\operatorname{dist}^{2}(C_{i},H_{i}), (5.8)

where Hi:=Span({Cj:jπ,ji}{u})H_{i}:=\operatorname{Span}\left(\{C_{j}:j\in\pi,j\neq i\}\cup\{u\}\right). In addition, we bound

maxiπ|yi|2dist2(Ci,Hi)miniπdist2(Ci,Hi)1|π|iπ|yi|2miniπdist2(Ci,Hi)(34)21δn.\displaystyle\max_{i\in\pi}|y_{i}|^{2}\operatorname{dist}^{2}(C_{i},H_{i})\geq\min_{i\in\pi}\operatorname{dist}^{2}(C_{i},H_{i})\frac{1}{|\pi|}\sum_{i\in\pi}|y_{i}|^{2}\geq\min_{i\in\pi}\operatorname{dist}^{2}(C_{i},H_{i})\left(\frac{3}{4}\right)^{2}\frac{1}{\delta n}.

Thus, combining (5.7) and (5.8) with the bound above, we find

miniπdist(Ci,Hi)εn.\min_{i\in\pi}\operatorname{dist}(C_{i},H_{i})\leq\varepsilon\sqrt{n}.

To conclude, we have shown that

(infxSπAxuε2δ,s1(A)s)(miniπdist(Ci,Hi)εn).\mathbb{P}\left(\inf_{x\in S_{\pi}}\|Ax-u\|\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right)\leq\mathbb{P}\left(\min_{i\in\pi}\operatorname{dist}(C_{i},H_{i})\leq\varepsilon\sqrt{n}\right).

In view of Corollary 5.5, there exist C,c>0C,c>0 such that for any π[n]\pi\subset[n] with |π|=δn|\pi|=\lfloor\delta n\rfloor, we have

(infxSπAxuε2δ,s1(A)s)Cnexp(cn).\mathbb{P}\left(\inf_{x\in S_{\pi}}\|Ax-u\|\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right)\leq Cn\exp(-cn).

Returning to (5.6), we conclude that

(infxComp(δ,τ)Axuε2δ,s1(A)s)\displaystyle\mathbb{P}\left(\inf_{x\in\operatorname{Comp}(\delta,\tau)}\|Ax-u\|\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right) (nδn)Cnexp(cn)\displaystyle\leq\binom{n}{\lfloor\delta n\rfloor}Cn\exp(-cn)
(neδn)δnCnexp(cn)\displaystyle\leq\left(\frac{ne}{\lfloor\delta n\rfloor}\right)^{\lfloor\delta n\rfloor}Cn\exp(-cn)
Cexp(cn)\displaystyle\leq C^{\prime}\exp(-c^{\prime}n)

for some constants C,c>0C^{\prime},c^{\prime}>0 by taking δ\delta sufficiently small (in terms of cc). ∎
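The counting estimate C(n, ⌊δn⌋) ≤ (ne/⌊δn⌋)^⌊δn⌋ used in the final display of the proof can be checked directly; the sketch below is our own illustration with sample values of n and δ.

```python
import math

n, delta = 200, 0.05                      # sample values, our choice
k = int(delta * n)                        # floor(delta * n) = 10
exact = math.comb(n, k)                   # binomial coefficient C(n, k)
bound = (n * math.e / k) ** k             # the standard bound (ne/k)^k
# exact <= bound, which drives the exp(-c'n) rate after taking delta small.
```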

We now fix δ\delta to be the constant from Lemma 5.6. Thus, δ\delta and τ\tau are now completely determined. We will also need the following corollary of Lemma 5.6.

Corollary 5.7.

There exist constants C,c>0C,c>0 such that for any deterministic vector unu\in\mathbb{C}^{n} and any s1s\geq 1

(infx/xComp(δ,τ)Axuxε4δ,s1(A)s)Csexp(cn).\mathbb{P}\left(\inf_{x/\|x\|\in\operatorname{Comp}(\delta,\tau)}\frac{\|Ax-u\|}{\|x\|}\leq\frac{\varepsilon}{4\sqrt{\delta}},s_{1}(A)\leq s\right)\leq Cs\exp(-cn). (5.9)
Proof.

The proof is based on the arguments given in [53]. We first note that if u=0u=0, then the claim follows immediately from Lemma 5.6. Assume u0u\neq 0. Let \mathcal{E} denote the event on the left-hand side of (5.9) whose probability we would like to bound. Suppose that \mathcal{E} holds. Then there exists a nonzero xnx\in\mathbb{C}^{n} such that x0:=x/xComp(δ,τ)x_{0}:=x/\|x\|\in\operatorname{Comp}(\delta,\tau) and u0:=u/xSpan(u)u_{0}:=u/\|x\|\in\operatorname{Span}(u) satisfy Ax0u0ε4δ\|Ax_{0}-u_{0}\|\leq\frac{\varepsilon}{4\sqrt{\delta}} and s1(A)ss_{1}(A)\leq s. In particular, this implies that

u0Ax0u0+Ax0ε4δ+s.\|u_{0}\|\leq\|Ax_{0}-u_{0}\|+\|Ax_{0}\|\leq\frac{\varepsilon}{4\sqrt{\delta}}+s.

Let 𝒩\mathcal{N} be a ε4δ\frac{\varepsilon}{4\sqrt{\delta}}-net of the real interval [ε4δs,ε4δ+s][-\frac{\varepsilon}{4\sqrt{\delta}}-s,\frac{\varepsilon}{4\sqrt{\delta}}+s]. In particular, we can choose 𝒩\mathcal{N} so that

|𝒩|Cs|\mathcal{N}|\leq C^{\prime}s (5.10)

for some constant C>0C^{\prime}>0. Here, we have used the assumption that s1s\geq 1. Since u0\|u_{0}\| lies in this interval, there exists c0𝒩c_{0}\in\mathcal{N} such that

uxc0uuε4δ.\left\|\frac{u}{\|x\|}-c_{0}\frac{u}{\|u\|}\right\|\leq\frac{\varepsilon}{4\sqrt{\delta}}.

By the triangle inequality, this implies that

Ax0c0uuε2δ.\left\|Ax_{0}-c_{0}\frac{u}{\|u\|}\right\|\leq\frac{\varepsilon}{2\sqrt{\delta}}.

To conclude, we have shown that

()(infc0𝒩infx0Comp(δ,τ)Ax0c0uuε2δ,s1(A)s),\mathbb{P}(\mathcal{E})\leq\mathbb{P}\left(\inf_{c_{0}\in\mathcal{N}}\inf_{x_{0}\in\operatorname{Comp}(\delta,\tau)}{\left\|Ax_{0}-c_{0}\frac{u}{\|u\|}\right\|}\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right),

and thus, by the union bound, we have

()c0𝒩(infx0Comp(δ,τ)Ax0c0uuε2δ,s1(A)s).\mathbb{P}(\mathcal{E})\leq\sum_{c_{0}\in\mathcal{N}}\mathbb{P}\left(\inf_{x_{0}\in\operatorname{Comp}(\delta,\tau)}{\left\|Ax_{0}-c_{0}\frac{u}{\|u\|}\right\|}\leq\frac{\varepsilon}{2\sqrt{\delta}},s_{1}(A)\leq s\right).

The claim now follows by the cardinality bound for 𝒩\mathcal{N} in (5.10) and Lemma 5.6. ∎

When dealing with incompressible vectors we will need the following corollary.

Corollary 5.8.

There exist constants C,c>0C,c>0 such that for any unu\in\mathbb{C}^{n} with u0u\neq 0 and any s1s\geq 1, we have

(A1uA1uComp(δ,τ),s1(A)s)Csexp(cn).\mathbb{P}\left(\frac{A^{-1}u}{\|A^{-1}u\|}\in\operatorname{Comp}(\delta,\tau),s_{1}(A)\leq s\right)\leq Cs\exp(-cn).
Proof.

If x:=A1ux:=A^{-1}u, then (Axu)/x=0(Ax-u)/\|x\|=0. Thus, we have

\displaystyle\mathbb{P} (A1uA1uComp(δ,τ),s1(A)s)\displaystyle\left(\frac{A^{-1}u}{\|A^{-1}u\|}\in\operatorname{Comp}(\delta,\tau),s_{1}(A)\leq s\right)
(infx/xComp(δ,τ)Axux=0,s1(A)s).\displaystyle\qquad\qquad\leq\mathbb{P}\left(\inf_{x/\|x\|\in\operatorname{Comp}(\delta,\tau)}\frac{\|Ax-u\|}{\|x\|}=0,s_{1}(A)\leq s\right).

The conclusion now follows from (5.9). ∎

5.4. Anti-concentration bounds

In order to handle incompressible vectors, we will need several anti-concentration bounds. The main idea is to use the rate of convergence from the Berry–Esseen Theorem to obtain the estimates. This idea appears to have originated in [35] and has been used previously in many works including [15, 35, 45].

Lemma 5.9 (Small ball probability via Berry–Esseen; Lemma A.6 from [15]).

There exists C>0C>0 such that if Z1,,ZnZ_{1},\ldots,Z_{n} are independent centered complex-valued random variables, then for all t0t\geq 0,

supz(|i=1nZiz|t)C(ti=1n𝔼|Zi|2+i=1n𝔼|Zi|3(i=1n𝔼|Zi|2)3/2).\sup_{z\in\mathbb{C}}\mathbb{P}\left(\left|\sum_{i=1}^{n}Z_{i}-z\right|\leq t\right)\leq C\left(\frac{t}{\sqrt{\sum_{i=1}^{n}\mathbb{E}|Z_{i}|^{2}}}+\frac{\sum_{i=1}^{n}\mathbb{E}|Z_{i}|^{3}}{\left(\sum_{i=1}^{n}\mathbb{E}|Z_{i}|^{2}\right)^{3/2}}\right).
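The 1/√n scaling in Lemma 5.9 can be checked exactly in the Rademacher case: for i.i.d. signs Z_i = ±1, the largest point mass of the sum is C(n, n/2)/2^n, which decays like √(2/(πn)). The sketch below is our own illustration, not from the paper.

```python
import math

def rademacher_atom(n):
    """sup_z P(Z_1 + ... + Z_n = z) for i.i.d. Z_i = +-1 equals
    C(n, n//2) / 2^n, achieved at the central value of the sum."""
    return math.comb(n, n // 2) / 2.0**n

# Here sum E|Z_i|^2 = n and sum E|Z_i|^3 = n, so Lemma 5.9 predicts a
# small-ball probability of order C/sqrt(n); the exact atom ~ sqrt(2/(pi*n)).
atoms = {n: rademacher_atom(n) for n in (100, 400)}
```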

We begin with the following anti-concentration bound for sums involving dependent random variables.

Lemma 5.10.

Let {(ξi,ψi):1in}\{(\xi_{i},\psi_{i}):1\leq i\leq n\} be a collection of independent complex-valued random tuples, and assume there exist a,b,σ>0a,b,\sigma>0 such that the events

i:={|ξi|a,|ψi|a}\mathcal{E}_{i}:=\{|\xi_{i}|\leq a,|\psi_{i}|\leq a\}

for 1in1\leq i\leq n satisfy

bmin1in(i),σ2min1inVar(ξii),σ2min1inVar(ψii).b\leq\min_{1\leq i\leq n}\mathbb{P}(\mathcal{E}_{i}),\qquad\sigma^{2}\leq\min_{1\leq i\leq n}\operatorname{Var}(\xi_{i}\mid\mathcal{E}_{i}),\qquad\sigma^{2}\leq\min_{1\leq i\leq n}\operatorname{Var}(\psi_{i}\mid\mathcal{E}_{i}).

In addition, assume there exists ρ<1\rho<1 such that

max1in|Corr(ξii,ψii)|ρ.\max_{1\leq i\leq n}|\operatorname{Corr}(\xi_{i}\mid\mathcal{E}_{i},\psi_{i}\mid\mathcal{E}_{i})|\leq\rho. (5.11)

Then for any δ,τ(0,1)\delta,\tau\in(0,1), any w=(wi)i=1nIncomp(δ,τ)w=(w_{i})_{i=1}^{n}\in\operatorname{Incomp}(\delta,\tau), any w=(wi)i=1nnw^{\prime}=(w^{\prime}_{i})_{i=1}^{n}\in\mathbb{C}^{n} with w1\|w^{\prime}\|\leq 1, any J[n]J\subset[n] with |J|n(1δ/4)|J|\geq n(1-\delta/4), and any t0t\geq 0, we have

supz\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P} (|iJ(ξiwi+ψiwi)z|tτ)\displaystyle\left(\left|\sum_{i\in J}(\xi_{i}w_{i}+\psi_{i}w_{i}^{\prime})-z\right|\leq t\tau\right)
Cσδb(1ρ)12log2(2τ2δ)log2(nτ)(t+an)+exp(δnb/32),\displaystyle\leq\frac{C}{\sigma\sqrt{\delta b(1-\rho)}}\sqrt{\left\lceil\frac{1}{2}\log_{2}\left(\frac{2}{\tau^{2}\delta}\right)\right\rceil\left\lceil\log_{2}\left(\frac{\sqrt{n}}{\tau}\right)\right\rceil}\left(t+\frac{a}{\sqrt{n}}\right)+\exp(-\delta nb/32),

where C>0C>0 is an absolute constant.

Proof.

The proof is based on the arguments from [15]. Since wIncomp(δ,τ)w\in\operatorname{Incomp}(\delta,\tau) and |J|n(1δ/4)|J|\geq n(1-\delta/4), Lemma 5.3 implies the existence of πJ\pi\subset J such that |π|δn/4|\pi|\geq\delta n/4 and

τn|wi|2δn\frac{\tau}{\sqrt{n}}\leq|w_{i}|\leq\sqrt{\frac{2}{\delta n}}

for all iπi\in\pi. By conditioning on the random variables ξi,ψi\xi_{i},\psi_{i} for iπi\not\in\pi and absorbing their contribution into the constant zz, it suffices to bound

supz(|iπ(ξiwi+ψiwi)z|tτ).\sup_{z\in\mathbb{C}}\mathbb{P}\left(\left|\sum_{i\in\pi}(\xi_{i}w_{i}+\psi_{i}w_{i}^{\prime})-z\right|\leq t\tau\right).

We now proceed to truncate the random variables ξi,ψi\xi_{i},\psi_{i} for iπi\in\pi. Indeed, by the Chernoff bound, it follows that

iπ𝟏i|π|b2δbn8\sum_{i\in\pi}\mathbf{1}_{{\mathcal{E}_{i}}}\geq\frac{|\pi|b}{2}\geq\frac{\delta bn}{8}

with probability at least 1exp(δbn/32)1-\exp(-\delta bn/32). Therefore, taking m:=δbn/8m:=\lceil\delta bn/8\rceil, it suffices to show

supzm\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P}_{m} (|i=1m(ξiwi+ψiwi)z|tτ)\displaystyle\left(\left|\sum_{i=1}^{m}(\xi_{i}w_{i}+\psi_{i}w_{i}^{\prime})-z\right|\leq t\tau\right)
Cσδb(1ρ)12log2(2τ2δ)log2(nτ)(t+an),\displaystyle\leq\frac{C}{\sigma\sqrt{\delta b(1-\rho)}}\sqrt{\left\lceil\frac{1}{2}\log_{2}\left(\frac{2}{\tau^{2}\delta}\right)\right\rceil\left\lceil\log_{2}\left(\frac{\sqrt{n}}{\tau}\right)\right\rceil}\left(t+\frac{a}{\sqrt{n}}\right),

where m():=(|Em,m)\mathbb{P}_{m}(\cdot):=\mathbb{P}(\cdot|E_{m},\mathcal{F}_{m}) is the conditional probability given m\mathcal{F}_{m}, the σ\sigma-algebra generated by all random variables except ξ1,,ξm,ψ1,,ψm\xi_{1},\ldots,\xi_{m},\psi_{1},\ldots,\psi_{m}, and the event

Em:={|ξi|a,|ψi|a:1im}{τn|wi|2δn:1im}.E_{m}:=\left\{|\xi_{i}|\leq a,|\psi_{i}|\leq a:1\leq i\leq m\right\}\bigcap\left\{\frac{\tau}{\sqrt{n}}\leq|w_{i}|\leq\sqrt{\frac{2}{\delta n}}:1\leq i\leq m\right\}.

By centering the random variables (and absorbing the expectations into the constant zz), it suffices to bound

supzm(|i=1m[(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi]z|tτ).\sup_{z\in\mathbb{C}}\mathbb{P}_{m}\left(\left|\sum_{i=1}^{m}\left[(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}\right]-z\right|\leq t\tau\right).

We again reduce to the case where we only need to consider a subset of the coordinates of ww and ww^{\prime} which are roughly comparable. Indeed, as the random tuples (ξ1,ψ1),,(ξm,ψm)(\xi_{1},\psi_{1}),\ldots,(\xi_{m},\psi_{m}) are jointly independent under the probability measure m\mathbb{P}_{m}, we can condition on any subset of them (and again absorb their contribution into the constant zz); hence, for any subset I[m]I\subset[m], we have

supzm\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P}_{m} (|i=1m[(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi]z|tτ)\displaystyle\left(\left|\sum_{i=1}^{m}\left[(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}\right]-z\right|\leq t\tau\right) (5.12)
supzm(|iI[(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi]z|tτ).\displaystyle\leq\sup_{z\in\mathbb{C}}\mathbb{P}_{m}\left(\left|\sum_{i\in I}\left[(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}\right]-z\right|\leq t\tau\right).

We now choose the subset II in a sequence of two steps. First, define

L:=212log2(2δτ2),L:=2\left\lceil\frac{1}{2}\log_{2}\left(\frac{2}{\delta\tau^{2}}\right)\right\rceil,

and for 1jL1\leq j\leq L set

Ij:={1im:2j1τn|wi|<2jτn}.I_{j}:=\left\{1\leq i\leq m:\frac{2^{j-1}\tau}{\sqrt{n}}\leq|w_{i}|<\frac{2^{j}\tau}{\sqrt{n}}\right\}.

By construction, I1,,ILI_{1},\ldots,I_{L} partition the index set [m][m]. Hence, by the pigeonhole principle, there exists jj such that |Ij|m/L|I_{j}|\geq m/L. Second, we partition the set IjI_{j} as follows. Define

K:=2log2(nτ),K:=2\left\lceil\log_{2}\left(\frac{\sqrt{n}}{\tau}\right)\right\rceil,

and for 1kK1\leq k\leq K set

Ij,k:={iIj:2k1τn|wi|<2kτn}I_{j,k}:=\left\{i\in I_{j}:\frac{2^{k-1}\tau}{\sqrt{n}}\leq|w_{i}^{\prime}|<\frac{2^{k}\tau}{\sqrt{n}}\right\}

and define

Ij,0:={iIj:|wi|<τn}.I_{j,0}:=\left\{i\in I_{j}:|w^{\prime}_{i}|<\frac{\tau}{\sqrt{n}}\right\}.

As w1\|w^{\prime}\|\leq 1 by assumption, the sets Ij,0,Ij,1,,Ij,KI_{j,0},I_{j,1},\ldots,I_{j,K} form a partition of IjI_{j}. By the pigeonhole principle, there exists kk such that

|Ij,k|mL(K+1)m2LK.|I_{j,k}|\geq\frac{m}{L(K+1)}\geq\frac{m}{2LK}. (5.13)

Applying (5.12) to the set Ij,kI_{j,k}, it now suffices to show

supzm\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P}_{m} (|iIj,k[(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi]z|tτ)\displaystyle\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}\right]-z\right|\leq t\tau\right) (5.14)
CLKσδb(1ρ)(t+an)\displaystyle\qquad\qquad\qquad\leq\frac{C\sqrt{LK}}{\sigma\sqrt{\delta b(1-\rho)}}\left(t+\frac{a}{\sqrt{n}}\right)

for some absolute constant C>0C>0.

We will apply Lemma 5.9 to obtain (5.14). For iIj,ki\in I_{j,k}, define

Zi:=(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi.Z_{i}:=(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}.

Then

q2\displaystyle q^{2} :=iIj,k𝔼m|Zi|2\displaystyle:=\sum_{i\in I_{j,k}}\mathbb{E}_{m}|Z_{i}|^{2}
iIj,k[|wi|2Varm(ξi)+|wi|2Varm(ψi)2ρ|wi||wi|Varm(ξi)Varm(ψi)]\displaystyle\geq\sum_{i\in I_{j,k}}\left[|w_{i}|^{2}\operatorname{Var}_{m}(\xi_{i})+|w^{\prime}_{i}|^{2}\operatorname{Var}_{m}(\psi_{i})-2\rho|w_{i}||w^{\prime}_{i}|\sqrt{\operatorname{Var}_{m}(\xi_{i})\operatorname{Var}_{m}(\psi_{i})}\right]

by assumption (5.11). Thus, we deduce that

q2\displaystyle q^{2} (1ρ)iIj,k[|wi|2Varm(ξi)+|wi|2Varm(ψi)]\displaystyle\geq(1-\rho)\sum_{i\in I_{j,k}}\left[|w_{i}|^{2}\operatorname{Var}_{m}(\xi_{i})+|w_{i}^{\prime}|^{2}\operatorname{Var}_{m}(\psi_{i})\right]
(1ρ)σ2iIj,k[|wi|2+|wi|2].\displaystyle\geq(1-\rho)\sigma^{2}\sum_{i\in I_{j,k}}\left[|w_{i}|^{2}+|w_{i}^{\prime}|^{2}\right]. (5.15)

In addition,

iIj,k𝔼m|Zi|32aτ(2j+2kn)q2.\sum_{i\in I_{j,k}}\mathbb{E}_{m}|Z_{i}|^{3}\leq 2a\tau\left(\frac{2^{j}+2^{k}}{\sqrt{n}}\right)q^{2}.

Hence, Lemma 5.9 implies the existence of an absolute constant C>0C>0 such that

supzm\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P}_{m} (|iIj,k[(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi]z|tτ)\displaystyle\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}\right]-z\right|\leq t\tau\right)
Cτq(t+a(2j+2k)n).\displaystyle\qquad\qquad\qquad\leq\frac{C\tau}{q}\left(t+\frac{a(2^{j}+2^{k})}{\sqrt{n}}\right). (5.16)

We complete the proof by considering two separate cases. First, if k=0k=0, then using (5.13) and (5.15), we obtain

q2σ2(1ρ)iIj,k|wi|2σ2(1ρ)|Ij,k|22j2τ2nσ2(1ρ)δb22jτ264LK.q^{2}\geq\sigma^{2}(1-\rho)\sum_{i\in I_{j,k}}|w_{i}|^{2}\geq\sigma^{2}(1-\rho)|I_{j,k}|\frac{2^{2j-2}\tau^{2}}{n}\geq\sigma^{2}(1-\rho)\delta b\frac{2^{2j}\tau^{2}}{64LK}.

Hence returning to (5.16), we find that

supzm\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P}_{m} (|iIj,k[(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi]z|tτ)\displaystyle\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}\right]-z\right|\leq t\tau\right)
CLKσδb(1ρ)(t+an),\displaystyle\qquad\qquad\qquad\leq\frac{C^{\prime}\sqrt{LK}}{\sigma\sqrt{\delta b(1-\rho)}}\left(t+\frac{a}{\sqrt{n}}\right), (5.17)

where C>0C^{\prime}>0 is an absolute constant. Similarly, if 1kK1\leq k\leq K, then

q2σ2(1ρ)|Ij,k|τ2n(22j2+22k2)σ2(1ρ)δbτ222j+22k64LK.q^{2}\geq\sigma^{2}(1-\rho)|I_{j,k}|\frac{\tau^{2}}{n}\left(2^{2j-2}+2^{2k-2}\right)\geq\sigma^{2}(1-\rho)\delta b\tau^{2}\frac{2^{2j}+2^{2k}}{64LK}.

In this case, we again apply (5.16) to deduce the existence of an absolute constant C′′>0C^{\prime\prime}>0 such that

supzm\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P}_{m} (|iIj,k[(ξi𝔼m[ξi])wi+(ψi𝔼m[ψi])wi]z|tτ)\displaystyle\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_{i}-\mathbb{E}_{m}[\xi_{i}])w_{i}+(\psi_{i}-\mathbb{E}_{m}[\psi_{i}])w^{\prime}_{i}\right]-z\right|\leq t\tau\right)
C′′LKσδb(1ρ)(t+an).\displaystyle\qquad\qquad\qquad\leq\frac{C^{\prime\prime}\sqrt{LK}}{\sigma\sqrt{\delta b(1-\rho)}}\left(t+\frac{a}{\sqrt{n}}\right). (5.18)

Combining (5.17) and (5.18), we obtain the bound (5.14) (with the absolute constant C:=max{C,C′′}C:=\max\{C^{\prime},C^{\prime\prime}\}), and the proof of the lemma is complete. ∎
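The dyadic pigeonhole step in the proof above can be made concrete. The sketch below (our own; the names and sample weights are illustrative) assigns each index to its dyadic level set I_j and verifies that the levels partition the index set and that some level captures at least an m/L fraction, as the pigeonhole argument requires.

```python
import math

def dyadic_levels(w, tau):
    """Assign each index i with |w_i| >= tau/sqrt(n) to the level j >= 1
    with 2^(j-1)*tau/sqrt(n) <= |w_i| < 2^j*tau/sqrt(n)."""
    n = len(w)
    base = tau / math.sqrt(n)
    levels = {}
    for i, wi in enumerate(w):
        r = abs(wi) / base
        if r >= 1.0:
            j = int(math.floor(math.log2(r))) + 1
            levels.setdefault(j, []).append(i)
    return levels

m, tau, delta = 64, 0.5, 0.5              # sample values, our choice
base = tau / math.sqrt(m)
w = [base * 1.1 * 2 ** (i % 3) for i in range(m)]   # weights in the window
L = 2 * math.ceil(0.5 * math.log2(2.0 / (delta * tau * tau)))   # here L = 4
levels = dyadic_levels(w, tau)
# The levels partition [m]; pigeonhole gives max_j |I_j| >= m / L.
```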

Lastly, we will need the following technical anti-concentration bound which is similar to Lemma 5.10.

Lemma 5.11.

Let {(ξi,ψi):1in}{(ξi,ψi):1in}\{(\xi_{i},\psi_{i}):1\leq i\leq n\}\cup\{(\xi_{i}^{\prime},\psi_{i}^{\prime}):1\leq i\leq n\} be a collection of independent complex-valued random tuples with the property that (ξi,ψi)(\xi_{i},\psi_{i}) has the same distribution as (ξi,ψi)(\xi_{i}^{\prime},\psi_{i}^{\prime}) for 1in1\leq i\leq n, and assume there exist a,b,σ>0a,b,\sigma>0 such that the events

i:={|ξi|a,|ψi|a}\mathcal{E}_{i}:=\{|\xi_{i}|\leq a,|\psi_{i}|\leq a\}

for 1in1\leq i\leq n satisfy

bmin1in(i),σ2min1inVar(ξii),σ2min1inVar(ψii).b\leq\min_{1\leq i\leq n}\mathbb{P}(\mathcal{E}_{i}),\qquad\sigma^{2}\leq\min_{1\leq i\leq n}\operatorname{Var}(\xi_{i}\mid\mathcal{E}_{i}),\qquad\sigma^{2}\leq\min_{1\leq i\leq n}\operatorname{Var}(\psi_{i}\mid\mathcal{E}_{i}).

In addition, let δ1,,δn\delta_{1},\ldots,\delta_{n} be i.i.d. {0,1}\{0,1\}-valued random variables with 𝔼δi=coo(0,1]\mathbb{E}\delta_{i}=c_{oo}\in(0,1], and assume δ1,,δn\delta_{1},\ldots,\delta_{n} are independent of the random tuples {(ξi,ψi),(ξi,ψi):1in}\{(\xi_{i},\psi_{i}),(\xi_{i}^{\prime},\psi_{i}^{\prime}):1\leq i\leq n\}. Then for any δ,τ(0,1)\delta,\tau\in(0,1), any w=(wi)i=1nIncomp(δ,τ)w=(w_{i})_{i=1}^{n}\in\operatorname{Incomp}(\delta,\tau), any w=(wi)i=1nnw^{\prime}=(w^{\prime}_{i})_{i=1}^{n}\in\mathbb{C}^{n} with w1\|w^{\prime}\|\leq 1, and any t0t\geq 0, we have

supz\displaystyle\sup_{z\in\mathbb{C}}\mathbb{P} (|i=1n(δiξiwi+δiξiwi)z|tτ)\displaystyle\left(\left|\sum_{i=1}^{n}(\delta_{i}\xi_{i}w_{i}+\delta_{i}\xi_{i}^{\prime}w_{i}^{\prime})-z\right|\leq t\tau\right)
Cσcooδb212log2(2τ2δ)log2(nτ)(t+an)+exp(δncoob2/16),\displaystyle\leq\frac{C}{\sigma\sqrt{c_{oo}\delta b^{2}}}\sqrt{\left\lceil\frac{1}{2}\log_{2}\left(\frac{2}{\tau^{2}\delta}\right)\right\rceil\left\lceil\log_{2}\left(\frac{\sqrt{n}}{\tau}\right)\right\rceil}\left(t+\frac{a}{\sqrt{n}}\right)+\exp(-\delta nc_{oo}b^{2}/16),

where C>0C>0 is an absolute constant.

Lemma 5.11 is very similar to Lemma 5.10. The statement of the lemma is somewhat unusual since the hypotheses involve the variables ξi,ψi,ξi,ψi\xi_{i},\psi_{i},\xi_{i}^{\prime},\psi_{i}^{\prime}, but the conclusion only involves the random variables ξi,ξi\xi_{i},\xi_{i}^{\prime}. This is to match the assumptions of Theorem 5.1. The proof of Lemma 5.11 presented below follows the same framework as the proof of Lemma 5.10.

Proof of Lemma 5.11.

Since $w\in\operatorname{Incomp}(\delta,\tau)$, Lemma 5.3 implies the existence of $\pi\subset[n]$ such that $|\pi|\geq\delta n/2$ and

\frac{\tau}{\sqrt n}\leq|w_i|\leq\sqrt{\frac{2}{\delta n}}

for all $i\in\pi$. By conditioning on the random variables $\xi_i,\psi_i,\xi_i',\psi_i',\delta_i$ for $i\not\in\pi$ and absorbing their contribution into the constant $z$, it suffices to bound

\sup_{z\in\mathbb{C}}\mathbb{P}\left(\left|\sum_{i\in\pi}(\delta_i\xi_iw_i+\delta_i\xi_i'w_i')-z\right|\leq t\tau\right).

We now proceed to truncate the random variables $\xi_i,\psi_i,\xi_i',\psi_i'$ for $i\in\pi$. Define the events

\mathcal{E}_i':=\{|\xi_i'|\leq a,\ |\psi_i'|\leq a\},

and observe that $\mathcal{E}_i'$ is independent of $\mathcal{E}_i$ for $1\leq i\leq n$. By the Chernoff bound, it follows that

\sum_{i\in\pi}\mathbf{1}_{\mathcal{E}_i\cap\mathcal{E}_i'\cap\{\delta_i=1\}}\geq\frac{c_{oo}|\pi|b^2}{2}\geq\frac{c_{oo}\delta b^2n}{4}

with probability at least $1-\exp(-c_{oo}\delta b^2n/16)$. Therefore, taking $m:=\lceil c_{oo}\delta b^2n/4\rceil$, it suffices to bound

\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i=1}^m(\delta_i\xi_iw_i+\delta_i\xi_i'w_i')-z\right|\leq t\tau\right)=\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i=1}^m(\xi_iw_i+\xi_i'w_i')-z\right|\leq t\tau\right) (5.19)

where $\mathbb{P}_m(\cdot):=\mathbb{P}(\cdot\mid E_m,\mathcal{F}_m)$ is the conditional probability given the $\sigma$-algebra $\mathcal{F}_m$ generated by all random variables except $\xi_1,\ldots,\xi_m$, $\psi_1,\ldots,\psi_m$, $\xi_1',\ldots,\xi_m'$, $\psi_1',\ldots,\psi_m'$, $\delta_1,\ldots,\delta_m$, and the event

E_m:=\bigcap_{i=1}^m\left\{|\xi_i|\leq a,\ |\psi_i|\leq a,\ |\xi_i'|\leq a,\ |\psi_i'|\leq a,\ \delta_i=1\right\}\cap\bigcap_{i=1}^m\left\{\frac{\tau}{\sqrt n}\leq|w_i|\leq\sqrt{\frac{2}{\delta n}}\right\}.

Here, we have exploited the fact that on the event $E_m$, $\delta_i=1$ for $i\in[m]$, and so all factors of $\delta_i$ have been replaced by $1$ on the right-hand side of (5.19). By centering the random variables (and absorbing the expectations into the constant $z$), it suffices to bound

\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i=1}^m\left[(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'\right]-z\right|\leq t\tau\right).
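The Chernoff step above is standard; for completeness, here is a sketch, using only the independence of the tuples and the lower bounds $b\leq\mathbb{P}(\mathcal{E}_i)$ and $c_{oo}=\mathbb{E}\delta_i$:

```latex
% The indicators 1_{E_i \cap E_i' \cap \{\delta_i = 1\}}, i \in \pi, are
% independent, and by independence of the tuples,
%   P(E_i \cap E_i' \cap \{\delta_i = 1\}) = P(E_i) P(E_i') P(\delta_i = 1)
%                                          \ge b^2 c_{oo}.
% Writing S for their sum and \mu := E[S] \ge c_{oo}|\pi|b^2, the
% multiplicative Chernoff bound gives
\[
  \mathbb{P}\left( S \le \frac{c_{oo}|\pi|b^2}{2} \right)
  \le \mathbb{P}\left( S \le \frac{\mu}{2} \right)
  \le \exp(-\mu/8)
  \le \exp\!\left(-\frac{c_{oo}\delta b^2 n}{16}\right),
\]
% using |\pi| \ge \delta n/2 in the final step.
```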

We again reduce to the case where we only need to consider a subset of the coordinates of $w$ and $w'$ which are roughly comparable. Indeed, as the random vectors $(\xi_1,\psi_1,\xi_1',\psi_1'),\ldots,(\xi_m,\psi_m,\xi_m',\psi_m')$ are jointly independent under the probability measure $\mathbb{P}_m$, we can condition on any subset of them (and again absorb their contribution into the constant $z$); hence, for any subset $I\subset[m]$, we have

\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i=1}^m\left[(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'\right]-z\right|\leq t\tau\right)\leq\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i\in I}\left[(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'\right]-z\right|\leq t\tau\right). (5.20)

We now choose the subset $I$ in a sequence of two steps, as was done in the proof of Lemma 5.10. First, define

L:=2\left\lceil\frac12\log_2\left(\frac{2}{\delta\tau^2}\right)\right\rceil,

and for $1\leq j\leq L$ set

I_j:=\left\{1\leq i\leq m:\frac{2^{j-1}\tau}{\sqrt n}\leq|w_i|<\frac{2^j\tau}{\sqrt n}\right\}.

By construction, $I_1,\ldots,I_L$ partition the index set $[m]$. Hence, by the pigeonhole principle, there exists $j$ such that $|I_j|\geq m/L$. Second, we partition the set $I_j$ as follows. Define

K:=2\left\lceil\log_2\left(\frac{\sqrt n}{\tau}\right)\right\rceil,

and for $1\leq k\leq K$ set

I_{j,k}:=\left\{i\in I_j:\frac{2^{k-1}\tau}{\sqrt n}\leq|w_i'|<\frac{2^k\tau}{\sqrt n}\right\}

and define

I_{j,0}:=\left\{i\in I_j:|w_i'|<\frac{\tau}{\sqrt n}\right\}.

As $\|w'\|\leq 1$ by assumption, the sets $I_{j,0},I_{j,1},\ldots,I_{j,K}$ form a partition of $I_j$. By the pigeonhole principle, there exists $k$ such that

|I_{j,k}|\geq\frac{m}{L(K+1)}\geq\frac{m}{2LK}. (5.21)
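A quick check of the final inequality in (5.21): since $K=2\lceil\log_2(\sqrt n/\tau)\rceil\geq 2$, we have $K+1\leq 2K$, and hence

```latex
\[
  |I_{j,k}| \;\ge\; \frac{|I_j|}{K+1} \;\ge\; \frac{m}{L(K+1)} \;\ge\; \frac{m}{2LK}.
\]
```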

Applying (5.20) to the set $I_{j,k}$, it now suffices to show

\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'\right]-z\right|\leq t\tau\right)\leq C\frac{\sqrt{LK}}{\sigma\sqrt{c_{oo}\delta b^2}}\left(t+\frac{a}{\sqrt n}\right) (5.22)

for some absolute constant $C>0$.

We will apply Lemma 5.9 to obtain (5.22). For $i\in I_{j,k}$, define

Z_i:=(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'.

Then

q^2:=\sum_{i\in I_{j,k}}\mathbb{E}_m|Z_i|^2=\sum_{i\in I_{j,k}}\left[|w_i|^2\operatorname{Var}_m(\xi_i)+|w_i'|^2\operatorname{Var}_m(\xi_i')\right]\geq\sigma^2\sum_{i\in I_{j,k}}\left[|w_i|^2+|w_i'|^2\right] (5.23)

since $\xi_i$ and $\xi_i'$ are independent under the measure $\mathbb{P}_m$ for $1\leq i\leq m$. In addition,

\sum_{i\in I_{j,k}}\mathbb{E}_m|Z_i|^3\leq 2a\tau\left(\frac{2^j+2^k}{\sqrt n}\right)q^2.
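The third-moment bound can be seen as follows (a sketch; recall that under $\mathbb{P}_m$ all of the truncated variables are bounded by $a$ in absolute value):

```latex
% Under P_m we have |\xi_i - E_m[\xi_i]| \le 2a and |\xi_i' - E_m[\xi_i']| \le 2a,
% so the dyadic ranges defining I_j and I_{j,k} give, almost surely for i in I_{j,k},
\[
  |Z_i| \le 2a\left(|w_i| + |w_i'|\right) \le 2a\tau\,\frac{2^{j}+2^{k}}{\sqrt n}
\]
% (valid for k = 0 as well, since then |w_i'| < \tau/\sqrt{n}).  Hence
\[
  \sum_{i\in I_{j,k}}\mathbb{E}_m|Z_i|^3
  \le \Big(\sup_{i\in I_{j,k}} |Z_i| \Big)\sum_{i\in I_{j,k}}\mathbb{E}_m|Z_i|^2
  \le 2a\tau\left(\frac{2^{j}+2^{k}}{\sqrt n}\right) q^2.
\]
```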

Hence, Lemma 5.9 implies the existence of an absolute constant $C>0$ such that

\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'\right]-z\right|\leq t\tau\right)\leq\frac{C\tau}{q}\left(t+\frac{a(2^j+2^k)}{\sqrt n}\right). (5.24)

We complete the proof by considering two separate cases. First, if $k=0$, then using (5.21) and (5.23), we obtain

q^2\geq\sigma^2\sum_{i\in I_{j,k}}|w_i|^2\geq\sigma^2|I_{j,k}|\frac{2^{2j-2}\tau^2}{n}\geq\sigma^2c_{oo}\delta b^2\frac{2^{2j}\tau^2}{32LK}.

Hence, returning to (5.24), we find that

\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'\right]-z\right|\leq t\tau\right)\leq\frac{C'\sqrt{LK}}{\sigma\sqrt{c_{oo}\delta b^2}}\left(t+\frac{a}{\sqrt n}\right), (5.25)

where $C'>0$ is an absolute constant. Similarly, if $1\leq k\leq K$, then

q^2\geq\sigma^2|I_{j,k}|\frac{\tau^2}{n}\left(2^{2j-2}+2^{2k-2}\right)\geq\sigma^2c_{oo}\delta b^2\tau^2\frac{2^{2j}+2^{2k}}{32LK}.

In this case, we again apply (5.24) to deduce the existence of an absolute constant $C''>0$ such that

\sup_{z\in\mathbb{C}}\mathbb{P}_m\left(\left|\sum_{i\in I_{j,k}}\left[(\xi_i-\mathbb{E}_m[\xi_i])w_i+(\xi_i'-\mathbb{E}_m[\xi_i'])w_i'\right]-z\right|\leq t\tau\right)\leq\frac{C''\sqrt{LK}}{\sigma\sqrt{c_{oo}\delta b^2}}\left(t+\frac{a}{\sqrt n}\right). (5.26)

Combining (5.25) and (5.26), we obtain the bound (5.22) (with the absolute constant $C:=\max\{C',C''\}$), and the proof of the lemma is complete. ∎

5.5. Incompressible vectors

In order to control the set of incompressible vectors, we will require the following averaging estimate.

Lemma 5.12 (Invertibility via average distance; Lemma A.4 from [15]).

Let $A$ be a random matrix taking values in $\operatorname{Mat}_n(\mathbb{C})$ with columns $C_1,\ldots,C_n$. For any $1\leq k\leq n$, let $H_k:=\operatorname{Span}\{C_i:i\neq k\}$. Then, for any $t\geq 0$,

\mathbb{P}\left(\min_{x\in\operatorname{Incomp}(\delta,\tau)}\|Ax\|\leq\frac{t\tau}{\sqrt n}\right)\leq\frac{2}{\delta n}\sum_{k=1}^n\mathbb{P}(\operatorname{dist}(C_k,H_k)\leq t).

Let $A=X+M$ be the matrix from Theorem 5.1. Let $C_1,\ldots,C_n$ be the columns of $A$ and $H_k:=\operatorname{Span}\{C_i:i\neq k\}$ be as in Lemma 5.12. Our main result for controlling the set of incompressible vectors is the following.

Lemma 5.13.

There exists a constant $C>0$ such that for every $1\leq k\leq n$, any $t>0$, and any $s\geq 1$,

\mathbb{P}(\operatorname{dist}(C_k,H_k)\leq t,\ s_1(A)\leq s)\leq C\left(\frac{\log(Csn)}{\sqrt{1-\rho}}\left(s^2\sqrt t+\frac{1}{\sqrt n}\right)\right)^{1/4}.

The rest of the subsection is devoted to the proof of Lemma 5.13; we complete the proof of Theorem 5.1 in Subsection 5.6. We will also need the following result, which is based on [53, Proposition 5.1] and [29, Statement 2.8].

Lemma 5.14 (Distance problem via bilinear forms).

Let A=(Aij)Matn()A=(A_{ij})\in\operatorname{Mat}_{n}(\mathbb{C}), let C1,,CnC_{1},\ldots,C_{n} denote the columns of AA, and fix 1kn1\leq k\leq n. Let Hk:=Span{Ci:ik}H_{k}:=\operatorname{Span}\{C_{i}:i\neq k\}, uu be the kk-th row of AA with the kk-th entry removed, vv be CkC_{k} with the kk-th entry removed, and let BB be the (n1)×(n1)(n-1)\times(n-1) submatrix of AA formed from removing the kk-th row and kk-th column. If BB is invertible, then

dist(Ck,Hk)|AkkuB1v|1+uB12.\operatorname{dist}(C_{k},H_{k})\geq\frac{|A_{kk}-uB^{-1}v|}{\sqrt{1+\|uB^{-1}\|^{2}}}. (5.27)
Proof.

The proof presented below is based on the arguments given in [29, 53]. By permuting the rows and columns, it suffices to assume that $k=1$. Let $h\in\mathbb{S}^{n-1}$ denote a unit normal to the hyperplane $H_1$. Then

\operatorname{dist}(C_1,H_1)\geq|h^\ast C_1|.

We decompose

C_1=\begin{pmatrix}A_{11}\\ v\end{pmatrix},\quad h=\begin{pmatrix}h_1\\ g\end{pmatrix},

where $h_1\in\mathbb{C}$ and $g\in\mathbb{C}^{n-1}$. Then

\operatorname{dist}(C_1,H_1)\geq|h^\ast C_1|=|\bar h_1A_{11}+g^\ast v|. (5.28)

Since $h$ is orthogonal to the columns of the matrix $\begin{pmatrix}u\\ B\end{pmatrix}$, we find

0=h^\ast\begin{pmatrix}u\\ B\end{pmatrix}=\bar h_1u+g^\ast B,

and hence

g^\ast=-\bar h_1uB^{-1}.

Returning to (5.28), we have

\operatorname{dist}(C_1,H_1)\geq|h_1||A_{11}-uB^{-1}v|. (5.29)

In addition,

1=\|h\|^2=|h_1|^2+\|g\|^2=|h_1|^2(1+\|uB^{-1}\|^2),

and so

|h_1|^2=\frac{1}{1+\|uB^{-1}\|^2}. (5.30)

The conclusion now follows from (5.29) and (5.30). ∎
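As a quick numerical sanity check of (5.27) (not part of the argument), one can compare both sides on a random matrix; `numpy` is assumed to be available, and for a generic matrix the bound in fact holds with equality, since the proof above loses nothing when $h$ is the exact unit normal:

```python
import numpy as np

# Numerical sanity check of (5.27) for k = 1 on a generic random complex matrix.
rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

C1 = A[:, 0]   # k-th column (k = 1)
u = A[0, 1:]   # k-th row with the k-th entry removed
v = A[1:, 0]   # k-th column with the k-th entry removed
B = A[1:, 1:]  # minor obtained by deleting the k-th row and column

# dist(C1, H1), with H1 the span of the remaining columns, via least squares
H = A[:, 1:]
coeffs, *_ = np.linalg.lstsq(H, C1, rcond=None)
dist = np.linalg.norm(C1 - H @ coeffs)

uBinv = u @ np.linalg.inv(B)
rhs = abs(A[0, 0] - uBinv @ v) / np.sqrt(1.0 + np.linalg.norm(uBinv) ** 2)

assert dist >= rhs - 1e-8  # the bound (5.27); equality for generic A
```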

Our study of the bilinear form $uB^{-1}v$ is based on the following general result, which will allow us to introduce some additional independence into the problem to deal with the fact that $u$ and $v$ are dependent. Similar decoupling techniques have also appeared in [19, 20, 27, 29, 53]. The lemma below is based on [53, Lemma 8.4] for quadratic forms.

Lemma 5.15 (Decoupling lemma).

Let $M\in\operatorname{Mat}_n(\mathbb{C})$, and let $x=(x_i)_{i=1}^n$, $y=(y_i)_{i=1}^n$ be random vectors in $\mathbb{C}^n$ such that $\{(x_i,y_i):1\leq i\leq n\}$ is a collection of independent random tuples. Let $(x',y')$ denote an independent copy of $(x,y)$. Then for every subset $\pi\subset[n]$ and $t\geq 0$, we have

\sup_{z\in\mathbb{C}}\mathbb{P}(|x^{\mathrm{T}}My-z|\leq t)^2\leq\mathbb{P}_{x,y,x',y'}\left(|x_\pi^{\mathrm{T}}M_{\pi\times\pi^c}(y_{\pi^c}-y'_{\pi^c})+(x_{\pi^c}-x'_{\pi^c})^{\mathrm{T}}M_{\pi^c\times\pi}y_\pi+z_0|\leq 2t\right),

where $z_0$ is a random variable whose value is determined by $M_{\pi^c\times\pi^c},x_{\pi^c},y_{\pi^c},x'_{\pi^c},y'_{\pi^c}$.

The proof of Lemma 5.15 is based on the following decoupling bound from [48, 53].

Lemma 5.16 (Lemma 8.5 from [53]).

Let $\xi$ and $\psi$ be independent random vectors, and let $\psi'$ be an independent copy of $\psi$. Let $\mathcal{E}(\xi,\psi)$ be an event which is determined by the values of $\xi$ and $\psi$. Then

\mathbb{P}(\mathcal{E}(\xi,\psi))^2\leq\mathbb{P}(\mathcal{E}(\xi,\psi)\cap\mathcal{E}(\xi,\psi')).
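To illustrate the inequality of Lemma 5.16, one can run a small Monte Carlo check; the event $\{\xi+\psi\leq 1\}$ for uniform $\xi,\psi$ is an arbitrary choice made here for concreteness, and `numpy` is assumed:

```python
import numpy as np

# Monte Carlo illustration of Lemma 5.16 with xi, psi ~ Uniform(0,1) and the
# (arbitrary) event E(xi, psi) = {xi + psi <= 1}.
rng = np.random.default_rng(1)
N = 200_000
xi = rng.random(N)
psi = rng.random(N)
psi_prime = rng.random(N)  # independent copy of psi

event = xi + psi <= 1.0              # E(xi, psi)
event_prime = xi + psi_prime <= 1.0  # E(xi, psi')

p = event.mean()                        # estimates P(E) = 1/2
p_joint = (event & event_prime).mean()  # estimates E[(1 - xi)^2] = 1/3

# Lemma 5.16: P(E)^2 <= P(E(xi,psi) and E(xi,psi')); here 1/4 <= 1/3
assert p ** 2 <= p_joint + 0.01
```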
Proof of Lemma 5.15.

Let $\xi$ be the random vector formed by the tuples $\{(x_i,y_i):i\in\pi\}$, and let $\psi$ be the random vector formed from the tuples $\{(x_i,y_i):i\not\in\pi\}$. Then $\xi$ and $\psi$ are independent by supposition, and we can apply Lemma 5.16. To this end, let $\tilde x,\tilde y$ be random vectors in $\mathbb{C}^n$ defined by

\tilde x_\pi:=x_\pi,\quad\tilde x_{\pi^c}:=x'_{\pi^c},\quad\tilde y_\pi:=y_\pi,\quad\tilde y_{\pi^c}:=y'_{\pi^c}.

An application of Lemma 5.16 yields

\mathbb{P}(|x^{\mathrm{T}}My-z|\leq t)^2\leq\mathbb{P}_{x,y,\tilde x,\tilde y}(|x^{\mathrm{T}}My-z|\leq t,\ |\tilde x^{\mathrm{T}}M\tilde y-z|\leq t)\leq\mathbb{P}_{x,y,\tilde x,\tilde y}(|x^{\mathrm{T}}My-\tilde x^{\mathrm{T}}M\tilde y|\leq 2t),

where the last inequality follows from the triangle inequality. We now note that

x^{\mathrm{T}}My-\tilde x^{\mathrm{T}}M\tilde y=x_\pi^{\mathrm{T}}M_{\pi\times\pi^c}(y_{\pi^c}-\tilde y_{\pi^c})+(x_{\pi^c}-\tilde x_{\pi^c})^{\mathrm{T}}M_{\pi^c\times\pi}y_\pi+z_0,

where $z_0$ depends only on $M_{\pi^c\times\pi^c},x_{\pi^c},y_{\pi^c},\tilde x_{\pi^c},\tilde y_{\pi^c}$. Since $\tilde x_{\pi^c}=x'_{\pi^c}$ and $\tilde y_{\pi^c}=y'_{\pi^c}$, the claim follows. ∎

We now turn to the proof of Lemma 5.13. The arguments presented here follow the general framework of [53, Section 8.3]. Fix $1\leq k\leq n$; the arguments and bounds below will all be uniform in $k$. Let $u$ be the $k$-th row of $A$ with the $k$-th entry removed, let $v$ be the $k$-th column of $A$ with the $k$-th entry removed, and let $B$ be the $(n-1)\times(n-1)$ matrix formed by removing the $k$-th row and $k$-th column from $A$. In view of Lemma 5.14, it suffices to prove that

\mathbb{P}\left(\frac{|A_{kk}-uB^{-1}v|}{\sqrt{1+\|uB^{-1}\|^2}}\leq t,\ s_1(A)\leq s\right)\leq C\left(\frac{\log(Csn)}{\sqrt{1-\rho}}\left(s^2\sqrt t+\frac{1}{\sqrt n}\right)\right)^{1/4} (5.31)

for some constant $C>0$. Our argument is based on applying Lemma 5.15 to decouple the bilinear form $uB^{-1}v$ and then applying our anti-concentration bounds from Subsection 5.4 to bound the resulting expressions. We divide the proof of (5.31) into a number of sub-steps.

5.5.1. Step 1: Constructing a random subset $\pi$

Following [53], we decompose $[n-1]$ into two random subsets $\pi$ and $\pi^c$. Let $\delta_1,\ldots,\delta_{n-1}$ be i.i.d. $\{0,1\}$-valued random variables, independent of $X$, with $\mathbb{E}\delta_i=c_{oo}/2$, where $c_{oo}$ is the constant defined by

c_{oo}:=\delta/8

and $\delta\in(0,1)$ was previously fixed. We then define $\pi:=\{i\in[n-1]:\delta_i=0\}$. By the Chernoff bound, it follows that

|\pi^c|\leq c_{oo}n (5.32)

with probability at least $1-C'_{oo}\exp(-c'_{oo}n)$ for some constants $C'_{oo},c'_{oo}>0$.

5.5.2. Step 2: Estimating $\|B^{-1}\|_2$

Lemma 5.17 below will allow us to estimate the denominator appearing on the right-hand side of (5.27). To this end, let $(u',v')$ be an independent copy of $(u,v)$, also independent of $X$.

Lemma 5.17.

There exist constants $C,c>0$ such that, for any $s\geq 1$, the random matrix $B$ has the following property with probability at least $1-Cs\exp(-cn)$: if $s_1(B)\leq s$, then

  1. (i) for any $t_0\geq 0$, with probability at least $1-C\log(Cns)\left(st_0+n^{-1/2}\right)$ in $u,u',\pi$,

    \|(u-u')_{\pi^c}B^{-1}_{\pi^c\times[n-1]}\|\geq t_0\|B^{-1}\|_2.

  2. (ii) for any $t_0\geq 0$, with probability at least $1-C\log(Cns)\left(st_0+n^{-1/2}\right)$ in $v,v',\pi$,

    \|B^{-1}_{[n-1]\times\pi^c}(v-v')_{\pi^c}\|\geq t_0\|B^{-1}\|_2.

In order to prove the lemma, we will need the following elementary result.

Lemma 5.18 (Sums of dependent random variables; Lemma 8.3 from [53]).

Let $Z_1,\ldots,Z_n$ be arbitrary non-negative random variables (not necessarily independent), and let $p_1,\ldots,p_n$ be non-negative numbers such that

\sum_{j=1}^n p_j=1.

Then, for every $t\geq 0$, we have

\mathbb{P}\left(\sum_{j=1}^n p_jZ_j\leq t\right)\leq 2\sum_{j=1}^n p_j\mathbb{P}(Z_j\leq 2t).
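Although Lemma 5.18 is quoted from [53], its proof is short; here is a sketch for $t>0$ (the case $t=0$ follows by taking limits):

```latex
% If \sum_j p_j Z_j \le t, then the indices with Z_j > 2t carry total weight
%   \sum_{j : Z_j > 2t} p_j \le (1/2t) \sum_j p_j Z_j \le 1/2,
% so \sum_j p_j 1\{Z_j \le 2t\} \ge 1/2.  Markov's inequality then yields
\[
  \mathbb{P}\Big(\sum_{j=1}^{n} p_j Z_j \le t\Big)
  \le \mathbb{P}\Big(\sum_{j=1}^{n} p_j \mathbf{1}_{\{Z_j \le 2t\}} \ge \tfrac12\Big)
  \le 2\,\mathbb{E}\sum_{j=1}^{n} p_j \mathbf{1}_{\{Z_j \le 2t\}}
  = 2\sum_{j=1}^{n} p_j\,\mathbb{P}(Z_j \le 2t).
\]
```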
Proof of Lemma 5.17.

By Corollary 5.8 and the union bound, under the assumption $s_1(B)\leq s$, we have

x_j:=\frac{B^{-1}e_j}{\|B^{-1}e_j\|}\in\operatorname{Incomp}(\delta,\tau),\qquad y_j:=\frac{e_j^\ast B^{-1}}{\|e_j^\ast B^{-1}\|}\in\operatorname{Incomp}(\delta,\tau)

for $1\leq j\leq n-1$ with probability at least $1-Cs\exp(-cn)$. Here, $e_1,\ldots,e_{n-1}$ are the standard basis elements of $\mathbb{C}^{n-1}$. Fix a realization of $B$ for which this property holds. We will prove that both properties hold with the desired probability for this fixed realization of $B$.

For (i), we note that

\|(u-u')_{\pi^c}B^{-1}_{\pi^c\times[n-1]}\|^2=\sum_{j=1}^{n-1}|(u-u')_{\pi^c}B^{-1}_{\pi^c\times[n-1]}e_j|^2=\sum_{j=1}^{n-1}|(u-u')_{\pi^c}(x_j)_{\pi^c}|^2\|B^{-1}e_j\|^2.

Taking $p_j:=\|B^{-1}e_j\|^2/\|B^{-1}\|_2^2$, we see that $\sum_{j=1}^{n-1}p_j=1$, and hence

\mathbb{P}_{u,u',\pi}\left(\|(u-u')_{\pi^c}B^{-1}_{\pi^c\times[n-1]}\|\leq t_0\|B^{-1}\|_2\right)\leq\mathbb{P}_{u,u',\pi}\left(\sum_{j=1}^{n-1}|(u-u')_{\pi^c}(x_j)_{\pi^c}|^2p_j\leq t_0^2\right)\leq 2\sum_{j=1}^{n-1}p_j\mathbb{P}_{u,u',\pi}\left(|(u-u')_{\pi^c}(x_j)_{\pi^c}|^2\leq 2t_0^2\right)\leq 2\sup_{w\in\operatorname{Incomp}(\delta,\tau)}\mathbb{P}_{u,u',\pi}\left(|(u-u')_{\pi^c}w_{\pi^c}|\leq\sqrt 2t_0\right)

by Lemma 5.18. Recalling our choice of $\delta$, $\tau$ (5.5), and $c_{oo}$, the claim now follows from the anti-concentration bound given in Lemma 5.11.

The proof of (ii) is similar. Indeed, we have

\|B^{-1}_{[n-1]\times\pi^c}(v-v')_{\pi^c}\|^2=\sum_{j=1}^{n-1}|(y_j)_{\pi^c}(v-v')_{\pi^c}|^2\|e_j^\ast B^{-1}\|^2.

Applying Lemma 5.18 with $p_j:=\|e_j^\ast B^{-1}\|^2/\|B^{-1}\|_2^2$, we conclude that

\mathbb{P}_{v,v',\pi}\left(\|B^{-1}_{[n-1]\times\pi^c}(v-v')_{\pi^c}\|\leq t_0\|B^{-1}\|_2\right)\leq 2\sum_{j=1}^{n-1}p_j\mathbb{P}_{v,v',\pi}\left(|(y_j)_{\pi^c}(v-v')_{\pi^c}|\leq\sqrt 2t_0\right)\leq 2\sup_{w\in\operatorname{Incomp}(\delta,\tau)}\mathbb{P}_{v,v',\pi}\left(|w_{\pi^c}^{\mathrm{T}}(v-v')_{\pi^c}|\leq\sqrt 2t_0\right).

As before, the conclusion now follows from the anti-concentration bound given in Lemma 5.11. ∎

5.5.3. Step 3: Working on the appropriate events

We have one last preparatory step before we can apply the decoupling lemma, Lemma 5.15. In this step, we define the events we will need to work on for the remainder of the proof. To this end, define the events

\mathcal{B}_A:=\{s_1(A)\leq s\},\qquad\mathcal{B}_B:=\{s_1(B)\leq s\},\qquad\mathcal{B}_u:=\{\|u\|\leq s\}.

We note that $\mathcal{B}_A\subset\mathcal{B}_B$ and $\mathcal{B}_A\subset\mathcal{B}_u$ since $u$ and $B$ are sub-matrices of $A$. Consider the random vectors

w:=(u-u')_{\pi^c}B^{-1}_{\pi^c\times[n-1]},\qquad w':=B^{-1}_{[n-1]\times\pi^c}(v-v')_{\pi^c}. (5.33)

It is possible that $w=0$ or $w'=0$, although we will show momentarily that these events occur with small probability.

Let $t_0>0$. Consider the event

\|B^{-1}\|_2\leq\frac{1}{t_0}\min\{\|w\|,\|w'\|\}. (5.34)

By Lemma 5.17, we find

\mathbb{P}_{B,u,u',v,v',\pi}\left((5.34)\text{ holds }\lor\mathcal{B}_B^c\right)\geq 1-C\log(Csn)(st_0+n^{-1/2})-Cs\exp(-cn)

for some constants $C,c>0$. In particular, since $\|B^{-1}\|_2>0$, it follows that when the event in (5.34) occurs, it must be the case that $w$ and $w'$ are both nonzero. In order to avoid several different cases later in the proof, let us define $\omega$ and $\omega'$ as follows. If $w$ is nonzero, we take $\omega:=w$, and if $w$ is zero, we define $\omega$ to be a fixed vector in $\operatorname{Incomp}(\delta,\tau)$. We define $\omega'$ analogously in terms of $w'$. It follows that on the event (5.34), we have

\omega=w,\qquad\omega'=w'. (5.35)

Next, consider the event

\frac{\omega}{\|\omega\|}\in\operatorname{Incomp}(\delta,\tau),\qquad\frac{\omega'}{\|\omega'\|}\in\operatorname{Incomp}(\delta,\tau). (5.36)

Let us fix an arbitrary realization of $u,v,u',v'$ and a realization of $\pi$ which satisfies (5.32). We will apply Corollary 5.8 to control the event in (5.36). Indeed, we only need to consider the cases when $w\neq 0$ or $w'\neq 0$; in these cases, $\omega=w$ or $\omega'=w'$. Let us suppose this is the case. Then $\omega=(u-u')P_{\pi^c}B^{-1}$ or $\omega'=B^{-1}P_{\pi^c}(v-v')$, where $P_{\pi^c}$ is the orthogonal projection onto those coordinates specified by $\pi^c$. Thus, from Corollary 5.8, we deduce that

\mathbb{P}_B\left((5.36)\text{ holds }\lor\mathcal{B}_B^c\mid u,v,u',v',\ \pi\text{ satisfies }(5.32)\right)\geq 1-C's\exp(-c'n)

for some constants $C',c'>0$. Combining the probabilities above, we conclude that

\mathbb{P}_{B,u,v,u',v',\pi}\left(((5.32),(5.34),(5.36)\text{ hold})\lor\mathcal{B}_B^c\right)\geq 1-C''\log(C''sn)(st_0+n^{-1/2})-C''s\exp(-c''n)=:1-p_0

for some constants $C'',c''>0$.

It follows that there exists a realization of $\pi$ that satisfies (5.32) and such that

\mathbb{P}_{B,u,v,u',v'}\left(((5.34),(5.36)\text{ hold})\lor\mathcal{B}_B^c\right)\geq 1-p_0.

We fix such a realization of $\pi$ for the remainder of the proof. Using Fubini's theorem, we deduce that the random matrix $B$ has the following property with probability at least $1-\sqrt{p_0}$:

\mathbb{P}_{u,v,u',v'}\left(((5.34),(5.36)\text{ hold})\lor\mathcal{B}_B^c\mid B\right)\geq 1-\sqrt{p_0}.

Since the event $\mathcal{B}_B$ depends only on $B$ and not on $u,v,u',v'$, it follows that the random matrix $B$ has the following property with probability at least $1-\sqrt{p_0}$: either $\mathcal{B}_B^c$ holds, or

\mathcal{B}_B\text{ holds and }\mathbb{P}_{u,v,u',v'}\left((5.34),(5.36)\text{ hold}\mid B\right)\geq 1-\sqrt{p_0}. (5.37)

5.5.4. Step 4: Decoupling

Recall that we are interested in bounding $\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_A)$, where

\mathcal{E}:=\left\{\frac{|A_{kk}-uB^{-1}v|}{\sqrt{1+\|uB^{-1}\|^2}}\leq t\right\}.

We first observe that

\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_A)\leq\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_B\land\mathcal{B}_u).

On the event $\mathcal{B}_u$, we have

\|uB^{-1}\|\leq\|u\|\|B^{-1}\|\leq s\|B^{-1}\|_2.

In addition, if $s_1(B)\leq s$, then $\|B^{-1}\|_2\geq 1/s$. Hence, on the event $\mathcal{B}_B\land\mathcal{B}_u$,

1+\|uB^{-1}\|^2\leq 1+s^2\|B^{-1}\|_2^2\leq 2s^2\|B^{-1}\|_2^2.

Thus, we obtain

\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_A)\leq\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}'\land\mathcal{B}_B),

where

\mathcal{E}':=\left\{|A_{kk}-uB^{-1}v|\leq\sqrt 2ts\|B^{-1}\|_2\right\}.

Thus, we find

\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_A)\leq\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}'\land(5.37)\text{ holds})+\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{B}_B\land(5.37)\text{ fails}).

The last probability is bounded above by $\sqrt{p_0}$ by the previous step. We conclude that

\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_A)\leq\sup_{\substack{B\text{ satisfies }(5.37)\\ A_{kk}\in\mathbb{C}}}\mathbb{P}_{u,v}(\mathcal{E}'\mid B,A_{kk})+\sqrt{p_0}.

We now begin to work with the random vectors $u',v'$ (recall that $(u',v')$ is independent of $X$). To do so, we will work on a larger probability space which also includes the random vectors $u',v'$. Indeed, computing the probability above on the larger space which includes $u',v'$, we conclude that

\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_A)\leq\sup_{\substack{B\text{ satisfies }(5.37)\\ A_{kk}\in\mathbb{C}}}\mathbb{P}_{u,v,u',v'}(\mathcal{E}'\mid B,A_{kk})+\sqrt{p_0}.

For the remainder of the proof, we fix a realization of $B$ which satisfies (5.37) and fix an arbitrary realization of $A_{kk}$. By supposition, both $B$ and $A_{kk}$ are independent of $u,v,u',v'$. It remains to bound the probability

p_1:=\sup_{z\in\mathbb{C}}\mathbb{P}_{u,v,u',v'}(\mathcal{E}_z'),

where

\mathcal{E}_z':=\left\{|z-uB^{-1}v|\leq\sqrt 2ts\|B^{-1}\|_2\right\}.

To bound $p_1$, we apply the decoupling lemma, Lemma 5.15. Indeed, by Lemma 5.15,

p_1^2\leq\mathbb{P}_{u,v,u',v'}(\mathcal{E}''),

where

\mathcal{E}'':=\left\{|u_\pi B^{-1}_{\pi\times\pi^c}(v-v')_{\pi^c}+(u-u')_{\pi^c}B^{-1}_{\pi^c\times\pi}v_\pi+z_0|\leq 2\sqrt 2st\|B^{-1}\|_2\right\}

and $z_0$ is a complex number depending only on $B_{\pi^c\times\pi^c}$, $u_{\pi^c},v_{\pi^c},u'_{\pi^c},v'_{\pi^c}$. Using (5.37) (where the conditioning on $B$ is no longer required since $B$ is now fixed), we find

p_1^2\leq\mathbb{P}_{u,v,u',v'}(\mathcal{E}''\land(5.34),(5.36)\text{ hold})+\sqrt{p_0},

and hence

p_1^2\leq\mathbb{P}_{u,v,u',v'}(\mathcal{E}'''\land(5.35),(5.36)\text{ hold})+\sqrt{p_0},

where

\mathcal{E}''':=\left\{\left|u_\pi w_\pi'+w_\pi v_\pi+z_0\right|\leq 2\sqrt 2\frac{st}{t_0}\max\{\|w\|,\|w'\|\}\right\};

here, we used the fact that on the event (5.34), the event (5.35) holds.

5.5.5. Step 5: Applying the anti-concentration bounds

Recall that $w,w'$ depend only on $u_{\pi^c},v_{\pi^c},u'_{\pi^c},v'_{\pi^c}$. In addition, $u_\pi$ and $v_\pi$ are independent of these random vectors. Let us fix a realization of the random vectors $u_{\pi^c},v_{\pi^c},u'_{\pi^c},v'_{\pi^c}$ which satisfies (5.35) and (5.36). This completely determines $w$ and $w'$; moreover, $z_0$ is also completely determined. Therefore, we conclude that

p_1^2\leq\sup_{\substack{w,w'\text{ satisfy }(5.35),(5.36)\\ z_0\in\mathbb{C}}}\mathbb{P}_{u_\pi,v_\pi}\left(\left|u_\pi w_\pi'+w_\pi v_\pi+z_0\right|\leq 2\sqrt 2\frac{st}{t_0}\max\{\|w\|,\|w'\|\}\right)+\sqrt{p_0}.

In order to bound the first term on the right-hand side, we will apply the anti-concentration bound given in Lemma 5.10. Without loss of generality, let us assume that $\max\{\|w\|,\|w'\|\}=\|w\|$. Then, dividing through by $\|w\|$, we find that

p_1^2\leq\sup_{\substack{w,w'\text{ satisfy }(5.35),(5.36)\\ z_0\in\mathbb{C}}}\mathbb{P}_{u_\pi,v_\pi}\left(\left|u_\pi\frac{w_\pi'}{\|w\|}+\frac{w_\pi}{\|w\|}v_\pi+z_0\right|\leq 2\sqrt 2\frac{st}{t_0}\right)+\sqrt{p_0},

with

\frac{\|w'\|}{\|w\|}\leq 1.

In view of Lemma 5.10 (where we recall that $|\pi|\geq n(1-\delta/8)$ due to (5.32)), we conclude that

p_1^2\leq C'''\frac{1}{\sqrt{1-\rho}}\log(C'''ns)\left(\frac{s^2t}{t_0}+\frac{1}{\sqrt n}\right)+\sqrt{p_0}

for some constant $C'''>0$.

5.5.6. Step 6: Completing the proof

Combining the bounds from the previous steps, we obtain

\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_{A})\leq p_{1}+\sqrt{p_{0}}.

We now proceed to simplify the expression to obtain (5.31). We still have the freedom to choose $t_{0}>0$; let us take $t_{0}:=\sqrt{t}$. In addition, we may assume that the expression

C^{\prime\prime\prime}\log(C^{\prime\prime\prime}ns)\left({s^{2}\sqrt{t}}+\frac{1}{\sqrt{n}}\right) (5.38)

is less than one, as the bound is trivial otherwise. In particular, this implies that $s\leq\exp(\sqrt{n})$. Among other things, this means that the error term $C^{\prime\prime}s\exp(-c^{\prime\prime}n)$ can be absorbed into terms of the form (5.38) (by increasing the constant $C^{\prime\prime\prime}$ if necessary). After some simplification, the bound for $p_{1}$ obtained in the previous step (with the substitution $t_{0}:=\sqrt{t}$) yields

\mathbb{P}_{B,u,v,A_{kk}}(\mathcal{E}\land\mathcal{B}_{A})\leq C\left(\frac{\log(Csn)}{\sqrt{1-\rho}}\left(s^{2}\sqrt{t}+\frac{1}{\sqrt{n}}\right)\right)^{1/4}

for some constant $C>0$. This establishes (5.31), and hence completes the proof of Lemma 5.13.

5.6. Proof of Theorem 5.1

In this subsection, we complete the proof of Theorem 5.1. Indeed, for any $s\geq 1$ and any $0<t\leq 1$, we have

\mathbb{P}\left(s_{n}(A)\leq\frac{t}{\sqrt{n}},s_{1}(A)\leq s\right)\leq\mathbb{P}\left(\min_{x\in\mathbb{S}^{n-1}}\|Ax\|\leq\frac{t}{\sqrt{n}},s_{1}(A)\leq s\right)\leq\mathbb{P}\left(\min_{x\in\operatorname{Incomp}(\delta,\tau)}\|Ax\|\leq\frac{t}{\sqrt{n}},s_{1}(A)\leq s\right)+\mathbb{P}\left(\min_{x\in\operatorname{Comp}(\delta,\tau)}\|Ax\|\leq\frac{1}{\sqrt{n}},s_{1}(A)\leq s\right) (5.39)

due to our decomposition of the unit sphere into compressible and incompressible vectors. It remains to bound each of the terms on the right-hand side.

For the incompressible vectors, we combine Lemmas 5.12 and 5.13 to find that, for any $t>0$,

\mathbb{P}\left(\min_{x\in\operatorname{Incomp}(\delta,\tau)}\|Ax\|\leq\frac{t\tau}{\sqrt{n}},s_{1}(A)\leq s\right)\leq\frac{2}{\delta n}\sum_{k=1}^{n}\mathbb{P}(\operatorname{dist}(C_{k},H_{k})\leq t,s_{1}(A)\leq s)\leq\frac{2C}{\delta}\left(\frac{\log(Csn)}{\sqrt{1-\rho}}\left(s^{2}\sqrt{t}+\frac{1}{\sqrt{n}}\right)\right)^{1/4}

for some constant $C>0$. Recalling the definitions of $\delta$ and $\tau$ in (5.5), we conclude that

\mathbb{P}\left(\min_{x\in\operatorname{Incomp}(\delta,\tau)}\|Ax\|\leq\frac{t}{\sqrt{n}},s_{1}(A)\leq s\right)\leq C^{\prime}\left(\frac{\log(C^{\prime}sn)}{\sqrt{1-\rho}}\left(\sqrt{s^{5}t}+\frac{1}{\sqrt{n}}\right)\right)^{1/4} (5.40)

for some constant $C^{\prime}>0$. For compressible vectors, Lemma 5.6 implies the existence of constants $C^{\prime\prime},c^{\prime\prime}>0$ such that

\mathbb{P}\left(\min_{x\in\operatorname{Comp}(\delta,\tau)}\|Ax\|\leq\frac{1}{\sqrt{n}},s_{1}(A)\leq s\right)\leq C^{\prime\prime}\exp(-c^{\prime\prime}n). (5.41)

Combining (5.40) and (5.41) with (5.39), we conclude that, for any $s\geq 1$ and any $0<t\leq 1$,

\mathbb{P}\left(s_{n}(A)\leq\frac{t}{\sqrt{n}},s_{1}(A)\leq s\right)\leq C^{\prime}\left(\frac{\log(C^{\prime}sn)}{\sqrt{1-\rho}}\left(\sqrt{s^{5}t}+\frac{1}{\sqrt{n}}\right)\right)^{1/4}+C^{\prime\prime}\exp(-c^{\prime\prime}n)\leq C^{\prime\prime\prime}\left(\frac{\log(C^{\prime\prime\prime}sn)}{\sqrt{1-\rho}}\left(\sqrt{s^{5}t}+\frac{1}{\sqrt{n}}\right)\right)^{1/4}

for some constant $C^{\prime\prime\prime}>0$, where the second inequality follows from the fact that the first error term dominates the second for all $n$ sufficiently large. The proof of Theorem 5.1 is complete.

6. Singular values of AnA_{n} and uniform integrability

6.1. Tightness

We begin with a bound on the largest singular values of $A_{n}-z$.

Lemma 6.1.

If Condition C1 holds, there exist $r>0$ and $C>0$ such that the following hold.

  • For all $z\in\mathbb{C}$, there exists $C_{z}>0$ such that almost surely

    \limsup_{n\rightarrow\infty}\int_{0}^{\infty}t^{r}d\nu_{A_{n}-z}(t)<C_{z}\text{ and thus }(\nu_{A_{n}-z})_{n\geq 1}\text{ is tight.}
  • Almost surely

    \limsup_{n\rightarrow\infty}\int_{\mathbb{C}}|w|^{r}d\mu_{A_{n}}(w)<C\text{ and thus }(\mu_{A_{n}})_{n\geq 1}\text{ is tight.}
Proof.

We follow the approach of Lemma 3.1 in [13]. The tightness follows from the moment bound and Markov’s inequality. The moment bound on $\mu_{A_{n}}$ follows from the bound on $\nu_{A_{n}}$ and Weyl’s inequality, Lemma A.7. From Lemma A.3, one has $s_{k}(A_{n}-z)\leq s_{k}(A_{n})+|z|$ for every $1\leq k\leq n$; thus, using the fact that $(x+y)^{r}\leq 2^{r}(x^{r}+y^{r})$ for any $x,y\geq 0$, we may assume $z=0$. We aim to work with matrices with independent entries, and thus decompose $A_{n}=U_{n}+L_{n}$, where $L_{n}$ is strictly lower triangular and $U_{n}$ is upper triangular. Note that for all $0\leq k\leq n-1$, we have by Lemma A.3

s_{1+k}(A_{n})\leq s_{1+\lfloor k/2\rfloor}(U_{n})+s_{1+\lceil k/2\rceil}(L_{n}).

We now restrict $r$ so that $0<r\leq 2$. Thus

\int_{0}^{\infty}t^{r}d\nu_{A_{n}}(t)\leq 8\left[\int_{0}^{\infty}t^{r}d\nu_{U_{n}}(t)+\int_{0}^{\infty}t^{r}d\nu_{L_{n}}(t)\right].

We show only

\limsup_{n\rightarrow\infty}\int_{0}^{\infty}t^{r}d\nu_{U_{n}}(t)<\infty\qquad\qquad\text{a.s.}

as the proof that

\limsup_{n\rightarrow\infty}\int_{0}^{\infty}t^{r}d\nu_{L_{n}}(t)<\infty\qquad\qquad\text{a.s.}

follows in the exact same way.

By the Schatten bound, Lemma A.8,

\int_{0}^{\infty}t^{r}d\nu_{U_{n}}(t)\leq Z_{n}:=\frac{1}{n}\sum_{i=1}^{n}Y_{n,i}

where

Y_{n,i}=\left(\sum_{j=i}^{n}a_{n}^{-2}|X_{ij}|^{2}\right)^{r/2}.

For every $1\leq i<j\leq n$, we let $X_{ji}^{\prime}$ be a copy of $\xi_{1}$, independent of all $X_{ij}$ and of all other $X_{ji}^{\prime}$. In addition, let

Y_{n,i}^{\prime}:=\left(\sum_{j=i}^{n}a_{n}^{-2}|X_{ij}|^{2}+\sum_{j=1}^{i-1}a_{n}^{-2}|X_{ij}^{\prime}|^{2}\right)^{r/2};

then

Z_{n}\leq\frac{1}{n}\sum_{i=1}^{n}Y_{n,i}^{\prime},

since $Y_{n,i}\leq Y_{n,i}^{\prime}$. The proof then follows exactly as in Lemma 3.1 of [13]. ∎
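The two singular value inequalities invoked above (Lemma A.3) are deterministic, so they can be checked numerically. The following sketch assumes NumPy and uses an arbitrary heavy-tailed matrix purely for illustration; it verifies both the shift bound $s_{k}(A_{n}-z)\leq s_{k}(A_{n})+|z|$ and the decomposition bound for $A_{n}=U_{n}+L_{n}$, as a sanity check rather than as part of the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30

# Heavy-tailed entries (Pareto tails, random signs) purely for illustration;
# the inequalities checked below are deterministic and hold for any matrix.
A = rng.pareto(0.8, (n, n)) * rng.choice([-1.0, 1.0], (n, n))

# Decompose A = U + L with U upper triangular, L strictly lower triangular.
U = np.triu(A)
L = np.tril(A, -1)

sv = lambda M: np.linalg.svd(M, compute_uv=False)  # s_1 >= ... >= s_n
sA, sU, sL = sv(A), sv(U), sv(L)

# Inequality for sums of matrices (Lemma A.3):
#   s_{1+k}(A) <= s_{1+floor(k/2)}(U) + s_{1+ceil(k/2)}(L),  0 <= k <= n-1.
for k in range(n):
    assert sA[k] <= sU[k // 2] + sL[(k + 1) // 2] + 1e-8

# Shift bound used to reduce to z = 0:  s_k(A - z) <= s_k(A) + |z|.
z = 2.0 - 1.0j
sAz = sv(A - z * np.eye(n))
assert np.all(sAz <= sA + abs(z) + 1e-8)
print("singular value inequalities verified")
```

The random seed and the Pareto tail index are arbitrary; any square matrix works, since both bounds are consequences of the variational characterization of singular values.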

6.2. Distance from a row and a vector space.

Throughout the rest of this section, we assume the atom variables of $X_{n}$ satisfy Condition C2. The proof of Proposition 3.3 from [13] can be adapted in a straightforward way to obtain Proposition 6.2 below. We give a brief explanation of the changes to the proof needed for entries that are independent but not necessarily identically distributed.

Proposition 6.2.

Let $0<\gamma<1/2$, and let $R$ be the $i$-th row of $a_{n}(A_{n}-z)$ with the $i$-th entry set to zero. There exists $\delta>0$ depending on $\alpha,\gamma$ such that for all $d$-dimensional subspaces $W$ of $\mathbb{C}^{n}$ with $n-d\geq n^{1-\gamma}$, one has

\mathbb{P}\left(\operatorname{dist}(R,W)\leq n^{(1-2\gamma)/\alpha}\right)\leq e^{-n^{\delta}}.
Proof.

Assume $R$ is the $i$-th row of $a_{n}(A_{n}-z)$ with the $i$-th entry set to zero. If $X^{(i)}$ is the $i$-th row of $X_{n}$ with the $i$-th entry set to zero, we have

\operatorname{dist}(R,W)\geq\operatorname{dist}(X^{(i)},W_{1})

where $W_{1}=\operatorname{Span}(W,e_{i})$. Note the entries of $X^{(i)}$ are independent, but have two potentially different distributions, in contrast with [13], where the entries are independent and identically distributed. However, Lemma A.9 can be applied to either distribution. Under Condition C2, the slowly varying function $L(t)$ in (1.3) is bounded and $L(t)\rightarrow c>0$ as $t\rightarrow\infty$ for both entries. To adapt the proof of Proposition 3.3 in [13], apply Lemma A.9 to the entries of $X^{(i)}$ to get uniform bounds on the truncated moments without assuming the entries are identically distributed. ∎
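The geometric fact used here (and again in the proof of Proposition 6.5 below), namely that modifying the $i$-th entry cannot decrease the distance once $e_{i}$ is adjoined to the subspace, can be checked numerically. The sketch below assumes NumPy; the dimensions and the heavy-tailed row are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, i = 40, 10, 3

def dist_to_subspace(v, basis):
    # Distance from v to the column span of `basis` via orthogonal projection.
    Q, _ = np.linalg.qr(basis)
    return np.linalg.norm(v - Q @ (Q.conj().T @ v))

W = rng.standard_normal((n, d))                           # a d-dimensional subspace
X_row = rng.pareto(0.8, n) * rng.choice([-1.0, 1.0], n)   # heavy-tailed row

R = X_row.copy()
R[i] = 0.0                                                # i-th entry set to zero

e_i = np.zeros(n)
e_i[i] = 1.0
W1 = np.column_stack([W, e_i])                            # W_1 = Span(W, e_i)

# R - X_row lies in Span(e_i), which is contained in W_1, so
#   dist(R, W) >= dist(R, W_1) = dist(X_row, W_1).
assert dist_to_subspace(R, W) >= dist_to_subspace(X_row, W1) - 1e-10
assert abs(dist_to_subspace(R, W1) - dist_to_subspace(X_row, W1)) < 1e-8
print("distance comparison verified")
```

Since $W\subseteq W_{1}$ and $R-X^{(i)}\in\operatorname{Span}(e_{i})\subseteq W_{1}$, the two assertions hold for any choice of $W$, $i$, and row vector.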

We now give some results for stable random variables which will be helpful. For $0<\beta<1$, let $Z=Z^{(\beta)}$ denote the one-sided positive $\beta$-stable random variable such that for all $s\geq 0$,

\mathbb{E}\exp(-sZ)=\exp(-s^{\beta}). (6.1)

Recall that for $y,m>0$,

y^{-m}=\Gamma(m)^{-1}\int_{0}^{\infty}x^{m-1}e^{-xy}dx.

Thus, taking $y=Z$, applying Tonelli’s theorem, and using (6.1), we obtain for all $m>0$,

\mathbb{E}[Z^{-m}]=\Gamma(m)^{-1}\int_{0}^{\infty}x^{m-1}e^{-x^{\beta}}dx, (6.2)

and if $Z_{1},\dots,Z_{n}$ are i.i.d. copies of $Z$ and $w_{1},w_{2},\dots,w_{n}$ are non-negative real numbers, then

\sum_{i=1}^{n}w_{i}Z_{i}\overset{d}{=}\left(\sum_{i=1}^{n}w_{i}^{\beta}\right)^{1/\beta}Z_{1}. (6.3)
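Identities (6.1) and (6.3) can be illustrated by Monte Carlo. The sketch below samples $Z^{(\beta)}$ via Kanter's representation (a standard sampler for one-sided stable laws, not taken from this paper) and checks the Laplace transform and the scaling identity empirically; it assumes NumPy, and the sample sizes and tolerances are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

def one_sided_stable(beta, size, rng):
    """Sample Z with E[exp(-s Z)] = exp(-s^beta), 0 < beta < 1 (Kanter's method)."""
    U = rng.uniform(0.0, np.pi, size)
    E = rng.exponential(1.0, size)
    a = (np.sin((1.0 - beta) * U)
         * np.sin(beta * U) ** (beta / (1.0 - beta))
         / np.sin(U) ** (1.0 / (1.0 - beta)))
    return (a / E) ** ((1.0 - beta) / beta)

beta, N = 0.5, 200_000
Z = one_sided_stable(beta, N, rng)

# Laplace transform (6.1) at s = 1:  E[exp(-Z)] should be close to exp(-1).
assert abs(np.exp(-Z).mean() - np.exp(-1.0)) < 0.01

# Scaling identity (6.3): sum_i w_i Z_i has Laplace transform
# exp(-s^beta * sum_i w_i^beta); we compare both sides at s = 1.
w = np.array([0.5, 0.5])
Z1 = one_sided_stable(beta, N, rng)
Z2 = one_sided_stable(beta, N, rng)
lhs = np.exp(-(w[0] * Z1 + w[1] * Z2)).mean()
rhs = np.exp(-np.sum(w ** beta))
assert abs(lhs - rhs) < 0.01
print("stable identities verified")
```

For $\beta=1/2$ the sampler reduces to the familiar Lévy law $Z\overset{d}{=}1/(2N^{2})$ with $N$ standard normal, which is one way to verify the representation analytically.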
Lemma 6.3 (Lemma 3.5 in [13]).

Let $0<\alpha<2$ and let $Y$ be a random variable such that $t^{\alpha}\mathbb{P}(|Y|\geq t)\rightarrow c$ as $t\rightarrow\infty$ for some $c>0$. Then there exist $\varepsilon>0$ and $p\in(0,1)$ such that the random variable $|Y|^{2}$ stochastically dominates the random variable $\varepsilon DZ$, where $\mathbb{P}(D=1)=1-\mathbb{P}(D=0)=p$ is a Bernoulli random variable, $Z=Z^{(\alpha/2)}$, and $D$ and $Z$ are independent.

Lemma 6.4.

Let $X^{(i)}$ be the $i$-th row of the matrix $X_{n}$, with the $i$-th entry set to zero. Let $w_{j}\in[0,1]$ be numbers such that $w(n):=\sum_{j=1}^{n}w_{j}\geq n^{1/2+\varepsilon}$ for some $\varepsilon>0$. Let $Z=Z^{(\beta)}$ with $\beta=\alpha/2$. Then there exist $\delta>0$ and a coupling of $X^{(i)}$ and $Z$ such that

\mathbb{P}\left(\sum_{j=1}^{n}w_{j}|X_{ij}|^{2}\leq\delta w(n)^{1/\beta}Z\right)\leq Ce^{-cn^{\delta}}

for constants $C,c>0$.

Proof.

Let $D=(D_{j})_{j=1}^{i}$ and $D^{\prime}=(D_{j})_{j=i+1}^{n}$ be two independent vectors of i.i.d. Bernoulli random variables given by Lemma 6.3 for $Y=X_{21}$ with parameter $p$ and for $Y=X_{12}$ with parameter $p^{\prime}$, respectively. By Lemma 6.3, there exist $\varepsilon^{\prime}>0$ and independent random variables $Z_{j}$ satisfying (6.1) such that, for every $j$, $w_{j}|X_{ij}|^{2}$ stochastically dominates $\varepsilon^{\prime}w_{j}D_{j}Z_{j}$. Then there exists a coupling (see Lemma 2.12 in [32]) such that

\mathbb{P}\left(\sum_{j=1}^{n}w_{j}|X_{ij}|^{2}\geq\varepsilon^{\prime}\sum_{j=1}^{n}w_{j}D_{j}Z_{j}\right)=1.

Let $\mathbf{a}=(a_{1},\dots,a_{n})\in\{0,1\}^{n}$, and let $A_{\mathbf{a}}$ be the event that $D_{j}=a_{j}$ for all $j$. For $\mathbf{a}\neq\mathbf{0}$, define the random variable $Z$ pointwise on $A_{\mathbf{a}}$ by

Z(\omega):=\frac{\sum_{j=1}^{n}w_{j}a_{j}Z_{j}(\omega)}{\left(\sum_{j=1}^{n}w_{j}^{\beta}a_{j}\right)^{1/\beta}}

for $\omega\in A_{\mathbf{a}}$, and set $Z(\omega)=Z_{1}(\omega)$ on $A_{\mathbf{0}}$. From (6.3), we see that $Z$ satisfies (6.1) and the distribution of $Z$ does not depend on $D_{1},\dots,D_{n}$. Thus it is sufficient to show there exists $\varepsilon^{\prime\prime}>0$ such that

\mathbb{P}\left(\sum_{j=1}^{n}w_{j}^{\beta}D_{j}\leq\varepsilon^{\prime\prime}w(n)\right)\leq Ce^{-cn^{\varepsilon^{\prime\prime}}}.

Note $w_{j}^{\beta}\geq w_{j}$ since $w_{j}\in[0,1]$ and $0<\beta<1$, and thus $\mathbb{E}\sum_{j=1}^{n}w_{j}^{\beta}D_{j}\geq\min(p,p^{\prime})w(n)$. Therefore, for $0<\varepsilon^{\prime\prime}<\min(p,p^{\prime})$,

\mathbb{P}\left(\sum_{j=1}^{n}w_{j}^{\beta}D_{j}\leq\varepsilon^{\prime\prime}w(n)\right)\leq\mathbb{P}\left(\Big{|}\sum_{j=1}^{n}(w_{j}^{\beta}D_{j}-\mathbb{E}w_{j}^{\beta}D_{j})\Big{|}\geq(\min(p,p^{\prime})-\varepsilon^{\prime\prime})w(n)\right)\leq 2e^{-\frac{1}{2}(\min(p,p^{\prime})-\varepsilon^{\prime\prime})^{2}w(n)^{2}/n},

where the last bound follows from Hoeffding’s inequality. ∎
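The final Hoeffding step can be illustrated numerically. The sketch below assumes NumPy; the parameters $n$, $p$, $\beta$, the weights, and the threshold are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

n, p, beta = 400, 0.5, 0.5        # beta = alpha/2 (alpha = 1 here, illustrative)
w = rng.uniform(0.25, 1.0, n)     # weights w_j in [0, 1]
wn = w.sum()                      # w(n)
eps = 0.25                        # any 0 < eps'' < min(p, p')

# Since w_j^beta >= w_j, the mean of sum_j w_j^beta D_j is at least p * w(n),
# and Hoeffding's inequality for the bounded summands w_j^beta D_j gives
#   P(sum_j w_j^beta D_j <= eps * w(n)) <= 2 exp(-(p - eps)^2 w(n)^2 / (2n)).
bound = 2.0 * np.exp(-0.5 * (p - eps) ** 2 * wn ** 2 / n)

trials = 20_000
D = rng.random((trials, n)) < p   # rows of i.i.d. Bernoulli(p) selectors
S = D @ (w ** beta)               # sum_j w_j^beta D_j, one value per trial
emp = np.mean(S <= eps * wn)

assert emp <= bound               # empirical tail respects the Hoeffding bound
print(f"empirical tail {emp:.4f} <= bound {bound:.4f}")
```

With these parameters the threshold sits many standard deviations below the mean of $S$, so the empirical tail is essentially zero while the Hoeffding bound is of order $10^{-2}$.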

We now give another bound on the distance between a row of $A_{n}$ and a deterministic subspace.

Proposition 6.5.

Take $0<\gamma<\alpha/4$. Let $R$ be the $i$-th row of $a_{n}(A_{n}-z)$ with the $i$-th entry set to zero. There exists an event $E$ such that for any $d$-dimensional subspace $W$ of $\mathbb{C}^{n}$ with $n-d\geq n^{1-\gamma}$, we have for sufficiently large $n$

\mathbb{E}[\operatorname{dist}^{-2}(R,W);E]\leq c(n-d)^{-2/\alpha}\quad\text{and}\quad\mathbb{P}(E^{c})\leq cn^{-\frac{1}{2}+\gamma(\frac{2}{\alpha}-\frac{1}{2})},

where $c>0$ is an absolute constant which does not depend on the choice of row.

Proof.

We follow the approach of the proof of Proposition 3.7 in [13]. The only difference with Proposition 3.7 in [13] is that the entries here are independent but not necessarily identically distributed. Assume that $R$ is the $i$-th row of $a_{n}(A_{n}-z)$ with the $i$-th entry set to zero. Note

\operatorname{dist}(R,W)\geq\operatorname{dist}(X^{(i)},W_{1}), (6.4)

where $W_{1}=\operatorname{Span}(W,e_{i})$, $e_{i}$ is the $i$-th basis vector, and $X^{(i)}=(X_{ij})_{1\leq j\leq n}$. Though the $i$-th entry of $X^{(i)}$ is not necessarily zero, inequality (6.4) still holds since the subspace spanned by $e_{i}$ is contained in $W_{1}$. Let $\mathcal{J}$ denote the set of indices $j$ such that $|X_{ij}|\leq a_{n}$. It is a straightforward application of the Chernoff bound to show there exists $\delta>0$ such that

\mathbb{P}(|\mathcal{J}|<n-\sqrt{n})\leq e^{-n^{\delta}}.

We begin by showing that for any set $J\subset\{1,\dots,n\}$ such that $|J|\geq n-\sqrt{n}$,

\mathbb{E}[\operatorname{dist}^{-2}(R,W);E_{J}|\mathcal{J}=J]\leq c(n-d)^{-2/\alpha},

for some event $E_{J}$ satisfying $\mathbb{P}(E_{J}^{c}|\mathcal{J}=J)\leq cn^{-\frac{1}{2}+\gamma(\frac{2}{\alpha}-\frac{1}{2})}$. Without loss of generality, assume $J:=\{1,\dots,n^{\prime}\}$ with $n^{\prime}\geq n-\sqrt{n}$. Let $\pi_{J}$ be the orthogonal projection onto the span of the first $n^{\prime}$ canonical basis vectors. Let $W_{2}=\pi_{J}(W_{1})$, and set

W^{\prime}=\operatorname{Span}\left(W_{2},\mathbb{E}(\pi_{J}(X^{(i)})|\mathcal{J}=J)\right).

Note that $d-\sqrt{n}\leq\dim(W^{\prime})\leq\dim(W_{1})+1\leq d+2$. Define

Y=\pi_{J}(X^{(i)})-\mathbb{E}(\pi_{J}(X^{(i)})|\mathcal{J}=J).

One has

\operatorname{dist}(R,W)\geq\operatorname{dist}(Y,W^{\prime}).

Under $\mathbb{P}(\cdot|\mathcal{J}=J)$, $Y$ is a mean zero vector. Note $W^{\prime}\subseteq\pi_{J}(\mathbb{C}^{n})$ and $Y\in\pi_{J}(\mathbb{C}^{n})$, so we will work with both as objects in $\pi_{J}(\mathbb{C}^{n})\simeq\mathbb{C}^{n^{\prime}}$ rather than in the larger vector space $\mathbb{C}^{n}$. Let $P$ be the orthogonal projection matrix onto $(W^{\prime})^{\perp}$ in $\mathbb{C}^{n^{\prime}}$. Note $\operatorname{tr}P=n^{\prime}-\dim(W^{\prime})$ satisfies, for sufficiently large $n$,

2(n-d)\geq\operatorname{tr}P\geq\frac{1}{2}(n-d). (6.5)

By construction,

\mathbb{E}(\operatorname{dist}^{2}(Y,W^{\prime})|\mathcal{J}=J)=\mathbb{E}\left[\sum_{j,k=1}^{n^{\prime}}Y_{j}P_{jk}\bar{Y}_{k}|\mathcal{J}=J\right]=\sum_{j=1}^{n^{\prime}}P_{jj}\mathbb{E}[|Y_{j}|^{2}|\mathcal{J}=J]. (6.6)

Let $S:=\sum_{j=1}^{n^{\prime}}P_{jj}|Y_{j}|^{2}$. Before beginning, note that by Lemma A.9 there exists $C>0$ such that $\mathbb{E}[|Y_{j}|^{2}|\mathcal{J}=J]\leq Ca_{n}^{2}/n$ for $1\leq j\leq n^{\prime}$. Thus (we believe there is a small typo in the bound of $\mathbb{E}(|\operatorname{dist}^{2}(Y,W^{\prime})-S|^{2}\,|\,\mathcal{J}=J)$ in the proof of Proposition 3.7 in [13])

\mathbb{E}(|\operatorname{dist}^{2}(Y,W^{\prime})-S|^{2}\big{|}\mathcal{J}=J)=\mathbb{E}\left(\left|\sum_{k\neq j}Y_{j}P_{jk}\bar{Y}_{k}\right|^{2}\Big{|}\mathcal{J}=J\right)\leq\frac{2Ca_{n}^{4}}{n^{2}}\sum_{j,k}|P_{jk}|^{2}=\frac{2Ca_{n}^{4}}{n^{2}}\|P\|_{2}^{2}=\frac{2Ca_{n}^{4}}{n^{2}}\operatorname{tr}(P^{*}P)=\frac{2Ca_{n}^{4}}{n^{2}}\operatorname{tr}(P),

where the last equality uses that $P$ is an orthogonal projection. Thus

\mathbb{E}(|\operatorname{dist}^{2}(Y,W^{\prime})-S|^{2}\big{|}\mathcal{J}=J)=O\left(a_{n}^{4}\frac{n-d}{n^{2}}\right). (6.7)

Let $Z$ be as in Lemma 6.4. Set $w_{j}=P_{jj}$ and, for $\varepsilon>0$, consider the event

\Gamma_{J}:=\left\{\sum_{j=1}^{n^{\prime}}w_{j}|X_{ij}|^{2}\geq\varepsilon(n-d)^{1/\beta}Z\right\},

where $\beta=\alpha/2$. From Lemma 6.4, there exists a coupling of $X_{i1},\dots,X_{i,n^{\prime}}$ and $Z$ such that

\mathbb{P}(\Gamma_{J}^{c})\leq Ce^{-cn^{\delta}}, (6.8)

for some $\delta>0$ and some choice of $\varepsilon>0$. Since $(a-b)^{2}\geq a^{2}/2-b^{2}$ for $a,b\in\mathbb{R}$, we have $S\geq\frac{1}{2}S_{a}-S_{b}$, where

S_{a}:=\sum_{j=1}^{n^{\prime}}w_{j}|X_{ij}|^{2}

and

S_{b}:=\sum_{j=1}^{n^{\prime}}w_{j}\mathbb{E}[|X_{ij}|\,\big{|}\,|X_{ij}|\leq a_{n}]^{2}.

From Lemma A.9 and (6.5), one has

S_{b}\leq h^{(\alpha)}(n,d)

where $h^{(\alpha)}(n,d)=\Theta((n-d)a_{n}^{2}/n^{2})$ if $\alpha\in(0,1]$ and $h^{(\alpha)}(n,d)=\Theta((n-d))$ if $\alpha\in(1,2)$. Let $G_{J}^{1}$ be the event that $S_{a}\geq 3S_{b}$. There exists some $c_{0}$ such that

\mathbb{P}((G_{J}^{1})^{c}\cap\Gamma_{J}|\mathcal{J}=J)\leq\mathbb{P}(Z\leq c_{0}(n-d)^{-1/\beta}h^{(\alpha)}(n,d)|\mathcal{J}=J).

From the assumption $n-d\geq n^{1-\gamma}$ with $0<\gamma<\alpha/4$, we have $(n-d)^{-1/\beta}h^{(\alpha)}(n,d)\leq Cn^{-\varepsilon_{0}}$ for some $C,\varepsilon_{0}>0$. From here, using the bound (6.2) on the negative second moment of $Z$, it is straightforward to show that for every $p>0$ there exists a constant $\kappa_{p}$ such that

\mathbb{P}((G_{J}^{1})^{c}\cap\Gamma_{J}|\mathcal{J}=J)\leq\kappa_{p}n^{-p}. (6.9)

Set $\tilde{\Gamma}_{J}=G_{J}^{1}\cap\Gamma_{J}$. On $\tilde{\Gamma}_{J}$, $S\geq\frac{1}{6}S_{a}\geq\frac{\varepsilon}{6}(n-d)^{2/\alpha}Z$, and therefore

\mathbb{E}[S^{-2};\tilde{\Gamma}_{J}|\mathcal{J}=J]\leq c_{1}(n-d)^{-4/\alpha}\mathbb{E}[Z^{-2}],

and thus, using again the negative second moment bound on $Z$,

\mathbb{E}[S^{-2};\tilde{\Gamma}_{J}|\mathcal{J}=J]=O\left((n-d)^{-4/\alpha}\right). (6.10)

Let $G_{J}^{2}$ be the event $\{\operatorname{dist}^{2}(Y,W^{\prime})\geq S/2\}$. Using Markov’s inequality and the Cauchy–Schwarz inequality, we obtain

\mathbb{P}\left(\operatorname{dist}^{2}(Y,W^{\prime})\leq S/2;\tilde{\Gamma}_{J}|\mathcal{J}=J\right)\leq\mathbb{P}\left(\frac{|\operatorname{dist}^{2}(Y,W^{\prime})-S|}{S}\geq 1/2;\tilde{\Gamma}_{J}|\mathcal{J}=J\right)\leq 2\mathbb{E}\left[\frac{|\operatorname{dist}^{2}(Y,W^{\prime})-S|}{S};\tilde{\Gamma}_{J}|\mathcal{J}=J\right]\leq 2\sqrt{\mathbb{E}\left[|\operatorname{dist}^{2}(Y,W^{\prime})-S|^{2}|\mathcal{J}=J\right]\mathbb{E}[S^{-2};\tilde{\Gamma}_{J}|\mathcal{J}=J]}. (6.11)

Then by (6.7), (6.10), and (6.2),

\mathbb{P}\left((G_{J}^{2})^{c}\cap\tilde{\Gamma}_{J}|\mathcal{J}=J\right)=O\left(a_{n}^{2}n^{-1}(n-d)^{\frac{1}{2}-\frac{2}{\alpha}}\right). (6.12)

By the Cauchy–Schwarz inequality,

\mathbb{E}\left[\operatorname{dist}^{-2}(R,W);G_{J}^{2}\cap\tilde{\Gamma}_{J}|\mathcal{J}=J\right]\leq 2\mathbb{E}\left[S^{-1};\tilde{\Gamma}_{J}|\mathcal{J}=J\right]=O\left((n-d)^{-2/\alpha}\right).

To conclude, take $E_{J}=G_{J}^{2}\cap\tilde{\Gamma}_{J}$. Then

\mathbb{P}(E_{J}^{c}|\mathcal{J}=J)\leq\mathbb{P}(\Gamma_{J}^{c}|\mathcal{J}=J)+\mathbb{P}((G_{J}^{1})^{c}\cap\Gamma_{J}|\mathcal{J}=J)+\mathbb{P}\left((G_{J}^{2})^{c}\cap\tilde{\Gamma}_{J}|\mathcal{J}=J\right).

It is then straightforward to show, using (6.8), (6.9), and (6.12), that

\mathbb{P}(E_{J}^{c}|\mathcal{J}=J)=O\left(n^{-\frac{1}{2}+\gamma(\frac{2}{\alpha}-\frac{1}{2})}\right).

Take $E=\bigcup_{J\in\mathbf{J}}E_{J}\cap\{\mathcal{J}=J\}$, where $\mathbf{J}=\{J\subseteq[n]:|J|\geq n-\sqrt{n}\}$, to complete the proof. ∎

6.3. Application of Theorem 5.1

We now show that Theorem 5.1 can be applied to $A_{n}$ to bound $s_{n}(A_{n}-z)$ from below with high probability. The difficulty is connecting the hypothesis on the spectral measure in Condition C2 to the correlation of truncated random variables in the statement of Theorem 5.1.

Theorem 6.6.

For all $z\in\mathbb{C}$, there exist $C,r>0$ such that

\mathbb{P}\left(s_{n}(A_{n}-z)\leq Cn^{-r}\right)=o(1).

It is worth noting that $o(1)$ can be improved to $n^{-\varepsilon}$ for some sufficiently small $\varepsilon>0$. The proof of Theorem 6.6 will be an application of Theorem 5.1. First we need a bound on the operator norm, and hence the largest singular value, of $A_{n}-z$.

Lemma 6.7.

For every $z\in\mathbb{C}$, there exists $C>0$ depending on the distribution of the entries and on $z$ such that for any $k>1/\alpha$ and $n$ sufficiently large,

\mathbb{P}(\|X_{n}-a_{n}z\|\geq n^{k})\leq\frac{C}{n^{((k-1)\alpha/2)-2}}.
Proof.

Since

\|X_{n}-a_{n}z\|\leq\|X_{n}\|+a_{n}|z|\leq\|X_{n}\|+C_{z}n^{1/\alpha},

it is sufficient to bound $\mathbb{P}\left(\|X_{n}\|\geq cn^{k}\right)$ for some $0<c<1$ and $n$ sufficiently large. Note

\|X_{n}\|^{2}\leq n^{2}\max_{1\leq i,j\leq n}\{|X_{ij}|^{2}\}.

Thus, by the union bound,

\mathbb{P}\left(\|X_{n}\|\geq cn^{k}\right)=\mathbb{P}\left(\|X_{n}\|^{2}\geq c^{2}n^{2k}\right)\leq\mathbb{P}\left(\max_{1\leq i,j\leq n}\{|X_{ij}|^{2}\}\geq c^{2}n^{2(k-1)}\right)\leq\mathbb{P}\left(\bigcup_{1\leq i,j\leq n}\{|X_{ij}|^{2}\geq c^{2}n^{2(k-1)}\}\right)\leq n^{2}\max_{i,j=1,2}\left\{\mathbb{P}\left(|X_{ij}|^{2}\geq c^{2}n^{2(k-1)}\right)\right\}\leq n^{2}\frac{C^{\prime}\mathbb{E}|X_{12}|^{\alpha/2}}{n^{2(k-1)\alpha/4}},

where the last inequality follows from Markov’s inequality and (1.5). ∎
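The elementary bound $\|X_{n}\|^{2}\leq n^{2}\max_{i,j}|X_{ij}|^{2}$ driving the union bound can be checked directly; it holds with the Frobenius norm as an intermediate step, since $\|X\|\leq\|X\|_{F}\leq n\max_{i,j}|X_{ij}|$. A minimal sketch, assuming NumPy and an illustrative heavy-tailed matrix:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50

# Heavy-tailed matrix (Pareto tails, random signs), purely illustrative.
X = rng.pareto(0.9, (n, n)) * rng.choice([-1.0, 1.0], (n, n))

op = np.linalg.norm(X, 2)        # operator norm, i.e. s_1(X)
frob = np.linalg.norm(X, 'fro')  # Frobenius norm
mx = np.abs(X).max()

# ||X|| <= ||X||_F <= n * max_{i,j} |X_ij|, i.e. ||X||^2 <= n^2 max |X_ij|^2.
assert op <= frob + 1e-8
assert frob <= n * mx + 1e-8
print("operator norm bound verified")
```

Both inequalities are deterministic, so the check succeeds for any matrix; the heavy tails only make the gap between the operator norm and $n\max_{i,j}|X_{ij}|$ more visible.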

Proof of Theorem 6.6.

Clearly $X_{n}$ satisfies conditions (i) and (ii) of Theorem 5.1, so it only remains to check condition (iii) before we can apply Theorem 5.1. Let $\mathcal{E}_{ij}^{n}:=\{|X_{ij}|<\log(n)a_{n},|X_{ji}|<\log(n)a_{n}\}$. Since $\{(X_{ij},X_{ji}):1\leq i<j\leq n\}$ is a collection of i.i.d. random tuples, we focus on showing that for some $n$, $\mathcal{E}_{12}^{n}$ satisfies the desired conditions. From the tail bounds on $X_{12}$ and $X_{21}$,

\mathbb{P}(\mathcal{E}^{n}_{12})\geq 1-\frac{C}{\log(n)^{\alpha}n}.

Note that $\operatorname{Var}(X_{12}|\mathcal{E}^{n}_{12})=0$ if and only if $X_{12}$ is constant on $\mathcal{E}^{n}_{12}$, and hence on any subset of $\mathcal{E}^{n}_{12}$. Thus if $\operatorname{Var}(X_{12}|\mathcal{E}^{n_{0}}_{12})>0$ for some $n_{0}$, then $\operatorname{Var}(X_{12}|\mathcal{E}^{n}_{12})>0$ for all $n\geq n_{0}$. Since $X_{12}$ is non-constant, there must be some $n$ sufficiently large such that $X_{12}$ is non-constant on $\mathcal{E}_{12}^{n}$. Thus $\operatorname{Var}(X_{12}|\mathcal{E}^{n}_{12})>0$ for all $n$ sufficiently large. The same argument applies to $\operatorname{Var}(X_{21}|\mathcal{E}^{n}_{12})$.

Now assume that for all $n$ sufficiently large,

|\text{Corr}\left(X_{12}|\mathcal{E}_{12}^{n},X_{21}|\mathcal{E}_{12}^{n}\right)|=1.

Then there exists $\theta_{n}\in[0,2\pi)$ such that on $\mathcal{E}_{12}^{n}$,

X_{12}-\mathbb{E}[X_{12}|\mathcal{E}^{n}_{12}]=e^{i\theta_{n}}\sqrt{\frac{\operatorname{Var}(X_{12}|\mathcal{E}^{n}_{12})}{\operatorname{Var}(X_{21}|\mathcal{E}^{n}_{12})}}\left[X_{21}-\mathbb{E}[X_{21}|\mathcal{E}^{n}_{12}]\right].

For $\alpha<1$, by Lemma A.9,

\frac{\mathbb{E}[|X_{12}|\mathbf{1}_{\{{|X_{12}|\leq t}\}}]}{\frac{\alpha}{1-\alpha}L(t)t^{1-\alpha}}\rightarrow 1

as $t\rightarrow\infty$, for some slowly varying function $L$. We assumed the atom variables satisfy Condition C2 (ii), specifically $a_{n}\sim cn^{1/\alpha}$, and thus $L(t)\rightarrow c^{\prime}$ for some constant $c^{\prime}>0$ as $t\rightarrow\infty$. For $\alpha\geq 1$, $\mathbb{E}[|X_{12}|\mathbf{1}_{\{{|X_{12}|\leq t}\}}]$ is dominated by $t^{r}$ for any $r>0$. Thus on $\mathcal{E}_{12}^{n}$,

X_{12}=R_{n}X_{21}+C_{n}

where $R_{n}$ is a sequence of complex numbers and

\lim\limits_{n\rightarrow\infty}\frac{C_{n}}{a_{n}}=0, (6.13)

since $\sqrt{\frac{\operatorname{Var}(X_{12}|\mathcal{E}^{n}_{12})}{\operatorname{Var}(X_{21}|\mathcal{E}^{n}_{12})}}$ is slowly varying by Lemma A.9. If $R_{n}$ has a bounded subsequence $R_{n_{k}}$, we pass to the corresponding subsequences $a_{n_{k}}$ and $n_{k}$, and for simplicity denote all three by $R_{n}$, $a_{n}$, and $n$. If not, then we note that on $\mathcal{E}_{12}^{n}$,

X_{21}=(R_{n})^{-1}X_{12}+C_{n}^{\prime}

where $(R_{n})^{-1}$ is bounded and (6.13) still holds for $C_{n}^{\prime}$. Thus we may assume $R_{n}$ is bounded, and we take a subsequence which converges to some $R$. Let $r>0$, and let $B$ be a Borel subset of the unit sphere in $\mathbb{C}^{2}$ such that $\theta_{d}(\partial B)=0$. Note

\theta_{d}(B)m_{\alpha}([r,\infty))=\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in B,\|(X_{12},X_{21})\|\geq ra_{n}\right)=\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in B,\|(X_{12},X_{21})\|\geq ra_{n},\mathcal{E}^{n}_{12}\right)+\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in B,\|(X_{12},X_{21})\|\geq ra_{n},(\mathcal{E}^{n}_{12})^{c}\right), (6.14)

and

\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in B,\|(X_{12},X_{21})\|\geq ra_{n},(\mathcal{E}^{n}_{12})^{c}\right)\leq\lim\limits_{n\rightarrow\infty}n\mathbb{P}((\mathcal{E}^{n}_{12})^{c})=\lim\limits_{n\rightarrow\infty}n\mathbb{P}(|X_{12}|\geq\log(n)a_{n}\text{ or }|X_{21}|\geq\log(n)a_{n})\leq\lim\limits_{n\rightarrow\infty}Cn(\log(n)a_{n})^{-\alpha}=\lim\limits_{n\rightarrow\infty}C^{\prime}n(\log(n)n^{1/\alpha})^{-\alpha}=0. (6.15)

Thus we will consider only

\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in B,\|(X_{12},X_{21})\|\geq ra_{n},\mathcal{E}^{n}_{12}\right). (6.16)

Define the set $A$ as

A=\{(z,w)\in\mathbb{C}^{2}:z=Rw,|z|^{2}+|w|^{2}=1\},

where $R$ is the limit of $R_{n}$. Let $(z_{1},z_{2})\in\mathbb{C}^{2}$ be such that $(z_{1},z_{2})\notin A$ and $|z_{1}|^{2}+|z_{2}|^{2}=1$. Let $O$ be a small open neighborhood of $(z_{1},z_{2})$ in $\mathbb{C}^{2}$ such that $A\cap\bar{O}=\emptyset$. We will consider the limit

\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in O,\|(X_{12},X_{21})\|\geq ra_{n},\mathcal{E}^{n}_{12}\right). (6.17)

Before we deal with this limit, note that on $\mathcal{E}^{n}_{12}$,

\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}=\frac{(RX_{21},X_{21})}{\|(X_{12},X_{21})\|}+\frac{((R_{n}-R)X_{21},0)}{\|(X_{12},X_{21})\|}+\frac{(C_{n},0)}{\|(X_{12},X_{21})\|}, (6.18)

\left|\|(RX_{21},X_{21})\|-\|((R_{n}-R)X_{21},0)\|-\|(C_{n},0)\|\right|\leq\|(X_{12},X_{21})\|, (6.19)

and

\|(X_{12},X_{21})\|\leq\|(RX_{21},X_{21})\|+\|((R_{n}-R)X_{21},0)\|+\|(C_{n},0)\|. (6.20)

We aim to show the unit vector in (6.17) is approaching the bad set $A$, which will lead to a contradiction of Condition C2. To this end, note that by factoring out $\|(RX_{21},X_{21})\|$ from (6.19) and (6.20), we get

\|(X_{12},X_{21})\|\geq\|(RX_{21},X_{21})\|\left|1-\frac{\|((R_{n}-R)X_{21},0)\|+|C_{n}|}{\|(RX_{21},X_{21})\|}\right|

and

\|(X_{12},X_{21})\|\leq\|(RX_{21},X_{21})\|\left[1+\frac{\|((R_{n}-R)X_{21},0)\|+|C_{n}|}{\|(RX_{21},X_{21})\|}\right].

Since $|C_{n}|=o(a_{n})$, it follows from (6.20) that if $\|(X_{12},X_{21})\|\geq ra_{n}$, then $\|(RX_{21},X_{21})\|\geq ca_{n}$ for some $c>0$. It then follows that on $\{\|(X_{12},X_{21})\|\geq ra_{n},\mathcal{E}^{n}_{12}\}$,

(X12,X21)(RX21,X21)[1o(1)],\|(X_{12},X_{21})\|\geq\|(RX_{21},X_{21})\|\left[1-o(1)\right],

and

(X12,X21)(RX21,X21)[1+o(1)].\|(X_{12},X_{21})\|\leq\|(RX_{21},X_{21})\|\left[1+o(1)\right].

Thus

\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in O,\|(X_{12},X_{21})\|\geq ra_{n},\mathcal{E}^{n}_{12}\right)
=\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(RX_{21},X_{21})}{\|(RX_{21},X_{21})\|}+o(1)\in O,\|(X_{12},X_{21})\|\geq ra_{n},\mathcal{E}^{n}_{12}\right) (6.21)
=0,

since \frac{(RX_{21},X_{21})}{\|(RX_{21},X_{21})\|}+o(1) is a small perturbation of a vector in A, which is disjoint from O. From the preceding limits we see

\lim\limits_{n\rightarrow\infty}n\mathbb{P}\left(\frac{(X_{12},X_{21})}{\|(X_{12},X_{21})\|}\in O,\|(X_{12},X_{21})\|\geq ra_{n}\right)=0.

Thus for arbitrary (z_{1},z_{2})\in\mathbb{C}^{2} we have (z_{1},z_{2})\notin\operatorname{supp}(\theta_{d}), and hence \operatorname{supp}(\theta_{d})\subseteq A, a contradiction of Condition C2 (i). Therefore there exist arbitrarily large n such that

|\operatorname{Corr}\left(X_{12}|\mathcal{E}_{12}^{n},X_{21}|\mathcal{E}_{12}^{n}\right)|<1.

From the above we see that Theorem 5.1 may be applied to A_{n}-z, which, combined with Lemma 6.7, completes the proof of Theorem 6.6. ∎

6.4. Uniform integrability

For 0<\delta<1 we define K_{\delta}=[\delta,\delta^{-1}]. We aim to show that \log(\cdot) is uniformly integrable in probability with respect to \{\nu_{A_{n}-z}\}_{n\geq 1}, i.e., for all \varepsilon>0

\lim\limits_{\delta\rightarrow 0}\lim\limits_{n\rightarrow\infty}\mathbb{P}\left(\int_{K_{\delta}^{c}}|\log(x)|\,d\nu_{A_{n}-z}(x)>\varepsilon\right)=0. (6.22)

From Lemma 6.1 there exists a constant c_{0}>0 such that with probability 1

\limsup_{n\rightarrow\infty}\int_{1}^{\infty}|\log(x)|\,d\nu_{A_{n}-z}(x)<c_{0}.

Hence the part of the integral in (6.22) over [\delta^{-1},\infty) is not a concern, and it suffices to prove that for every sequence \delta_{n} converging to 0,

\frac{1}{n}\sum_{i=0}^{n-1}\mathbf{1}_{\{s_{n-i}\leq\delta_{n}\}}\log s_{n-i}^{-2}

converges to 0 in probability. By Theorem 6.6 we may, with probability 1-o(1), lower bound s_{n-i} by cn^{-r} for all i. Take 0<\gamma<\alpha/4, to be fixed later. Using this polynomial lower bound for 0\leq i\leq n^{1-\gamma}, it suffices to prove that

\frac{1}{n}\sum_{i=\lfloor n^{1-\gamma}\rfloor}^{n-1}\mathbf{1}_{\{s_{n-i}\leq\delta_{n}\}}\log s_{n-i}^{-2}

converges in probability to 0.

We next aim to show that there exists an event F_{n} such that for some \delta>0 and c>0,

\mathbb{P}(F_{n}^{c})\leq c\exp(-n^{\delta}),

and

\mathbb{E}[s_{n-i}^{-2}|F_{n}]\leq c\left(\frac{n}{i}\right)^{\frac{2}{\alpha}+1},

for \lfloor n^{1-\gamma}\rfloor\leq i\leq n-1. To see why this implies convergence in probability to zero, note that by Markov's inequality,

\mathbb{P}(s_{n-i}\leq\delta_{n})\leq\mathbb{P}(F_{n}^{c})+c\delta_{n}^{2}\left(\frac{n}{i}\right)^{1+2/\alpha}.

It follows that \varepsilon_{n}=\delta_{n}^{1/(\frac{2}{\alpha}+1)} is a sequence tending to zero such that \mathbb{P}(s_{n-\lfloor n\varepsilon_{n}\rfloor}\leq\delta_{n}) converges to 0. Hence it suffices to prove that, given F_{n},

\frac{1}{n}\sum_{i=\lfloor n^{1-\gamma}\rfloor}^{\lfloor\varepsilon_{n}n\rfloor}\log s_{n-i}^{-2} (6.23)

converges to zero in probability. Using the negative second moment bound on F_{n} and the concavity of \log, we have

\mathbb{E}\left[\frac{1}{n}\sum_{i=\lfloor n^{1-\gamma}\rfloor}^{\lfloor\varepsilon_{n}n\rfloor}\mathbf{1}_{\{s_{n-i}\leq\delta_{n}\}}\log s_{n-i}^{-2}\Big{|}F_{n}\right]\leq\frac{1}{n}\sum_{i=\lfloor n^{1-\gamma}\rfloor}^{\lfloor\varepsilon_{n}n\rfloor}\log\mathbb{E}[s_{n-i}^{-2}|F_{n}]\leq\frac{c_{1}}{n}\sum_{i=1}^{\lfloor\varepsilon_{n}n\rfloor}\log\left(\frac{n}{i}\right)\rightarrow 0,

where the last sum converges to zero by a Riemann sum approximation. Thus, by Markov's inequality, we obtain convergence (given F_{n}) of (6.23) to zero in probability.
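As a quick numerical sketch of the Riemann sum step (not from the paper; plain Python assumed): the quantity \frac{1}{n}\sum_{i=1}^{\lfloor\varepsilon n\rfloor}\log(n/i) behaves like \int_{0}^{\varepsilon}\log(1/x)\,dx=\varepsilon(1+\log(1/\varepsilon)), which vanishes as \varepsilon\rightarrow 0.

```python
import math

def partial_log_sum(n, eps):
    # (1/n) * sum_{i=1}^{floor(eps*n)} log(n/i), the quantity bounded above
    return sum(math.log(n / i) for i in range(1, int(eps * n) + 1)) / n

# As eps shrinks the sum tracks the integral of log(1/x) over [0, eps],
# which tends to zero.
for eps in (0.1, 0.01, 0.001):
    print(eps, partial_log_sum(10**6, eps))
```

For fixed large n the value decays roughly like \varepsilon(1+\log(1/\varepsilon)) as \varepsilon shrinks, consistent with the claimed convergence.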

Now we finish with the construction of such an event F_{n}. Let B_{n} be the matrix formed by the first n-\lfloor i/2\rfloor rows of a_{n}(A_{n}-z). If s_{1}^{\prime}\geq s_{2}^{\prime}\geq\cdots\geq s_{n-\lfloor i/2\rfloor}^{\prime} are the singular values of B_{n}, then by Cauchy interlacing

s_{n-i}\geq\frac{s_{n-i}^{\prime}}{a_{n}}. (6.24)

By the Tao-Vu negative second moment lemma, Lemma A.4, we have

s_{1}^{\prime-2}+\cdots+s_{n-\lfloor i/2\rfloor}^{\prime-2}=\operatorname{dist}_{1}^{-2}+\cdots+\operatorname{dist}_{n-\lfloor i/2\rfloor}^{-2} (6.25)

where \operatorname{dist}_{j} is the distance from the j-th row of B_{n} to the subspace spanned by the other rows. Using (6.24) and (6.25) we get

\frac{i}{2}s_{n-i}^{-2}\leq a_{n}^{2}\sum_{j=i}^{n-\lfloor i/2\rfloor}\operatorname{dist}_{j}^{-2}.

Now let \operatorname{dist}_{j}^{\prime} be the distance from the j-th row of B_{n}, with its j-th entry removed, to the subspace spanned by the other rows with their j-th entries removed. Since \operatorname{dist}_{j}\geq\operatorname{dist}_{j}^{\prime}, we have

\frac{i}{2}s_{n-i}^{-2}\leq a_{n}^{2}\sum_{j=i}^{n-\lfloor i/2\rfloor}\operatorname{dist}_{j}^{\prime-2}. (6.26)

Let F_{n,i} be the event that \operatorname{dist}_{j}^{\prime}\geq(n-1)^{(1-2\gamma)/\alpha} for all 1\leq j\leq n-\lfloor i/2\rfloor. Since the span of all but one row of B_{n} has dimension at most n-\lfloor i/2\rfloor, we can use Proposition 6.2 together with a union bound to obtain, for sufficiently large n,

\mathbb{P}(F_{n,i}^{c})\leq\exp(-(n-1)^{\delta^{\prime}}),

for some \delta^{\prime}>0. Let F_{n}=\bigcap_{i=0}^{\lfloor n^{1-\gamma}\rfloor}F_{n,i}; then

\mathbb{P}(F_{n}^{c})\leq\exp(-(n-1)^{\delta}),

for some \delta>0.

We now aim to show the desired negative second moment bound on F_{n}. Recall from Proposition 6.5 that there exists an event E_{j}, independent of the rows R_{i}, i\neq j, of B_{n} with their j-th entries removed, such that \mathbb{P}(E_{j}^{c})\leq n^{-\frac{1}{2}+\gamma(\frac{2}{\alpha}-\frac{1}{2})} and for any subspace W of dimension d<n-n^{1-\gamma} one has

\mathbb{E}[\operatorname{dist}(R,W)^{-2};E_{j}]\leq c(n-d)^{-2/\alpha}

where R is the j-th row of B_{n} with its j-th entry removed. Thus

\mathbb{E}[\operatorname{dist}_{j}^{\prime-2};E_{j}]=O(i^{-2/\alpha}),

for i\leq j\leq n-\lfloor i/2\rfloor.

Taking the lower bound \operatorname{dist}_{j}^{\prime}\geq(n-1)^{(1-2\gamma)/\alpha} on E^{c}_{j}\cap F_{n}, we get

\mathbb{E}[\operatorname{dist}_{j}^{\prime-2};F_{n}]\leq c_{2}\left(i^{-2/\alpha}+n^{-\frac{1}{2}+\gamma(\frac{2}{\alpha}-\frac{1}{2})-2(1-2\gamma)/\alpha}\right).

So for \gamma sufficiently small, not depending on i, one has

\mathbb{E}[\operatorname{dist}_{j}^{\prime-2};F_{n}]\leq c_{3}i^{-2/\alpha}.

Thus by (6.26),

\mathbb{E}[s_{n-i}^{-2};F_{n}]\leq c_{3}a_{n}^{2}ni^{-(1+2/\alpha)}.

The result then follows from the assumption a_{n}\sim cn^{1/\alpha}.

6.5. Proof of Theorem 1.6

We have shown that almost surely \nu_{A_{n}-zI_{n}} converges weakly to \nu_{z,\alpha,\theta_{d}} and that \log(\cdot) is uniformly integrable in probability with respect to (\nu_{A_{n}-zI_{n}})_{n\geq 1}. Thus by Lemma 1.12, \mu_{A_{n}} converges weakly, in probability, to a deterministic probability measure \mu_{\alpha,\theta_{d}}.

Appendix A Additional lemmas

A.1. Concentration

Lemma A.1 (McDiarmid’s inequality, Theorem 3.1 in [36]).

Let X_{1},X_{2},\dots,X_{n} be independent random variables taking values in R_{1},R_{2},\dots,R_{n} respectively. Let F:R_{1}\times\cdots\times R_{n}\rightarrow\mathbb{C} be a function with the property that for every 1\leq i\leq n there exists a c_{i}>0 such that

|F(x_{1},x_{2},\dots,x_{i},\dots,x_{n})-F(x_{1},x_{2},\dots,x_{i}^{\prime},\dots,x_{n})|\leq c_{i}

for all x_{j}\in R_{j}, 1\leq j\leq n, and all x_{i}^{\prime}\in R_{i}. Then for any t>0,

\mathbb{P}(|F(X)-\mathbb{E}F(X)|\geq\sigma t)\leq C\exp(-ct^{2})

for absolute constants C,c>0, where \sigma^{2}:=\sum_{i=1}^{n}c_{i}^{2}.
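To illustrate the bounded-differences phenomenon numerically (a Monte Carlo sketch assuming NumPy, with the hypothetical choice F(X) = coordinate average of n Uniform[0,1] variables, so each c_{i}=1/n and \sigma^{2}=1/n):

```python
import numpy as np

rng = np.random.default_rng(6)
n, trials = 400, 2000

# F(X) = mean of n independent Uniform[0,1] variables; changing one coordinate
# moves F by at most c_i = 1/n, so sigma^2 = sum_i c_i^2 = 1/n.
X = rng.uniform(0.0, 1.0, size=(trials, n))
F = X.mean(axis=1)
sigma = 1.0 / np.sqrt(n)

# McDiarmid predicts deviations of size sigma * t occur with probability
# at most C * exp(-c t^2); here we count deviations beyond 3 * sigma.
frac_large = float(np.mean(np.abs(F - 0.5) >= 3.0 * sigma))
print(frac_large)
```

The empirical fraction of large deviations is essentially zero, in line with the Gaussian-type tail the lemma guarantees.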

For the following lemma we need a way of breaking an elliptic random matrix X into independent pieces. For a matrix M=(m_{ij})_{i,j=1}^{n} we let the k-wedge of M be the n\times n matrix C_{k} with entries c_{ij}=m_{ij} if either i=k and j\geq i, or j=k and i\geq j, and c_{ij}=0 otherwise. Note that

M=\sum_{k=1}^{n}C_{k},

and \operatorname{rank}(C_{k})\leq 2 for all 1\leq k\leq n. Recall that the total variation norm of a function f:\mathbb{R}\rightarrow\mathbb{R} is given by

\|f\|_{TV}=\sup\sum_{k\in\mathbb{Z}}|f(x_{k+1})-f(x_{k})|,

where the supremum runs over all sequences (x_{k})_{k\in\mathbb{Z}} with x_{k+1}\geq x_{k}.

Lemma A.2.

Let M be an n\times n random matrix with independent wedges. Then for any f:\mathbb{R}\rightarrow\mathbb{R} tending to 0 at \pm\infty with \|f\|_{TV}\leq 1 and every t\geq 0,

\mathbb{P}\left(\left|\int f\,d\nu_{M}-\mathbb{E}\int f\,d\nu_{M}\right|\geq t\right)\leq C\exp(-cnt^{2}),

for absolute constants C,c>0.

Proof.

If A,B\in\operatorname{Mat}_{n}(\mathbb{C}), let F_{A} and F_{B} be the cumulative distribution functions of \nu_{A} and \nu_{B}. By the Lidskii inequality for singular values (see Theorem 3.3.16 in [33]),

\|F_{A}-F_{B}\|_{\infty}\leq\frac{\operatorname{rank}(A-B)}{n}.

Assume first that f is smooth. Integrating by parts, we get

\left|\int f\,d\nu_{A}-\int f\,d\nu_{B}\right|=\left|\int_{\mathbb{R}}f^{\prime}(t)(F_{A}(t)-F_{B}(t))\,dt\right|\leq\frac{\operatorname{rank}(A-B)}{n}\|f\|_{TV}. (A.1)

Since \int f\,d\nu_{A}-\int f\,d\nu_{B} depends on f only through its values at finitely many points, the previous inequality extends to any function of finite total variation.
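The rank inequality underlying (A.1), \|F_{A}-F_{B}\|_{\infty}\leq\operatorname{rank}(A-B)/n, can be checked numerically with a rank-one perturbation (a sketch assuming NumPy; not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))
B = A + np.outer(rng.standard_normal(n), rng.standard_normal(n))  # rank(A - B) == 1

sA = np.sort(np.linalg.svd(A, compute_uv=False))
sB = np.sort(np.linalg.svd(B, compute_uv=False))

# Evaluate both empirical singular value CDFs at every jump point; the sup
# distance over the real line is attained at one of these points.
grid = np.concatenate([sA, sB])
FA = np.searchsorted(sA, grid, side="right") / n
FB = np.searchsorted(sB, grid, side="right") / n
gap = float(np.max(np.abs(FA - FB)))
print(gap)
```

The observed gap never exceeds \operatorname{rank}(A-B)/n=1/n, as the inequality predicts.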

Now fix f:\mathbb{R}\rightarrow\mathbb{R} tending to 0 at \pm\infty with \|f\|_{TV}\leq 1. Let \mathcal{C}_{k} be the space of all k-wedges and let F_{f}:\mathcal{C}_{1}\times\mathcal{C}_{2}\times\dots\times\mathcal{C}_{n}\rightarrow\mathbb{R} be the function defined by

F_{f}(C_{1},C_{2},\dots,C_{n})=\int f\,d\nu_{A}

where A is the matrix with k-wedge C_{k}. By (A.1),

|F_{f}(C_{1},C_{2},\dots,C_{i},\dots,C_{n})-F_{f}(C_{1},C_{2},\dots,C_{i}^{\prime},\dots,C_{n})|\leq\frac{2}{n},

and thus by McDiarmid's inequality (Lemma A.1) and the definition of F_{f},

\mathbb{P}\left(\left|\int f\,d\nu_{M}-\mathbb{E}\int f\,d\nu_{M}\right|\geq t\right)\leq C\exp(-cnt^{2}). ∎

A.2. Singular value estimates

Lemma A.3 (See [33], Chapter 3).

If A and B are n\times n complex matrices, then

s_{1}(AB)\leq s_{1}(A)s_{1}(B)\text{ and }s_{1}(A+B)\leq s_{1}(A)+s_{1}(B),
\max_{1\leq k\leq n}|s_{k}(A+B)-s_{k}(A)|\leq\|B\|,
s_{i+j-1}(A+B)\leq s_{i}(A)+s_{j}(B)

for 1\leq i,j\leq n and i+j\leq n+1. In addition,

\max_{1\leq i\leq n}|s_{i}(A)-s_{i}(B)|\leq s_{1}(A-B).
Lemma A.4 (Tao-Vu Negative Second Moment, [51] Lemma A.4).

If A is a full rank n^{\prime}\times n complex matrix with rows R_{1},\dots,R_{n^{\prime}} and R_{-i}:=\operatorname{Span}\{R_{j}:j\neq i\}, then

\sum_{i=1}^{n^{\prime}}s_{i}(A)^{-2}=\sum_{i=1}^{n^{\prime}}\operatorname{dist}(R_{i},R_{-i})^{-2}.
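This identity can be verified numerically (a sketch assuming NumPy; the distances are computed by orthogonal projection onto the span of the other rows):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
A = rng.standard_normal((n, n))  # full rank almost surely

s = np.linalg.svd(A, compute_uv=False)
lhs = float(np.sum(s ** -2.0))

rhs = 0.0
for i in range(n):
    others = np.delete(A, i, axis=0)   # rows R_j for j != i
    Q, _ = np.linalg.qr(others.T)      # orthonormal basis of Span{R_j : j != i}
    resid = A[i] - Q @ (Q.T @ A[i])    # component of R_i orthogonal to that span
    rhs += float(np.linalg.norm(resid)) ** -2.0
print(lhs, rhs)
```

The two sums agree to numerical precision, as the lemma asserts.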
Lemma A.5 (Cauchy interlacing, see [33]).

Let A be an n\times n complex matrix. If B is an n^{\prime}\times n matrix obtained by deleting n-n^{\prime} rows from A, then for every 1\leq i\leq n^{\prime},

s_{i}(A)\geq s_{i}(B)\geq s_{i+n-n^{\prime}}(A).
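A direct numerical check of the interlacing (a sketch assuming NumPy, deleting a few rows from a Gaussian matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_del = 10, 3
A = rng.standard_normal((n, n))
B = A[: n - n_del]  # delete the last n_del rows of A

# singular values in decreasing order, as in the lemma
sA = np.sort(np.linalg.svd(A, compute_uv=False))[::-1]
sB = np.sort(np.linalg.svd(B, compute_uv=False))[::-1]

# interlacing: s_i(A) >= s_i(B) >= s_{i + n - n'}(A) for 1 <= i <= n'
interlaces = all(
    sA[i] >= sB[i] - 1e-12 and sB[i] >= sA[i + n_del] - 1e-12
    for i in range(n - n_del)
)
print(interlaces)
```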
Lemma A.6 (See Remark 1 in [23]).

Let AA be an n×nn\times n matrix and let Re(A)=(A+A)/2\operatorname{Re}(A)=(A+A^{*})/2 denote the real part of AA. If s1s2sns_{1}\geq s_{2}\geq\dots\geq s_{n} and λ1λ2λn\lambda_{1}\geq\lambda_{2}\geq\dots\geq\lambda_{n} denote the singular values of AA and eigenvalues of Re(A)\operatorname{Re}(A) respectively, then

λjsj\lambda_{j}\leq s_{j} (A.2)

for every 1jn1\leq j\leq n.
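The domination (A.2) is easy to test numerically (a sketch assuming NumPy; for a real matrix \operatorname{Re}(A)=(A+A^{T})/2 is symmetric):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 12
A = rng.standard_normal((n, n))

s = np.sort(np.linalg.svd(A, compute_uv=False))[::-1]    # s_1 >= ... >= s_n
lam = np.sort(np.linalg.eigvalsh((A + A.T) / 2))[::-1]   # eigenvalues of Re(A), decreasing
dominated = bool(np.all(lam <= s + 1e-12))               # lambda_j <= s_j for all j
print(dominated)
```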

Lemma A.7 (Weyl’s inequality, [54], see also Lemma B.5 in [13]).

For every n\times n complex matrix A with eigenvalues ordered so that |\lambda_{1}(A)|\geq|\lambda_{2}(A)|\geq\dots\geq|\lambda_{n}(A)|, one has

\prod_{i=1}^{k}|\lambda_{i}(A)|\leq\prod_{i=1}^{k}s_{i}(A)\text{ and }\prod_{i=k}^{n}s_{i}(A)\leq\prod_{i=k}^{n}|\lambda_{i}(A)|

for all 1\leq k\leq n. Moreover, for r>0,

\sum_{k=1}^{n}|\lambda_{k}(A)|^{r}\leq\sum_{k=1}^{n}s_{k}(A)^{r}.
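All three of Weyl's inequalities can be exercised on a random matrix (a sketch assuming NumPy; small multiplicative tolerances absorb floating-point error):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
A = rng.standard_normal((n, n))

s = np.sort(np.linalg.svd(A, compute_uv=False))[::-1]     # decreasing singular values
lam = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]         # decreasing |eigenvalues|

# prod_{i<=k} |lambda_i| <= prod_{i<=k} s_i for every k
head_ok = all(np.prod(lam[:k]) <= np.prod(s[:k]) * (1 + 1e-9) for k in range(1, n + 1))
# prod_{i>=k} s_i <= prod_{i>=k} |lambda_i| for every k
tail_ok = all(np.prod(s[k:]) <= np.prod(lam[k:]) * (1 + 1e-9) for k in range(n))
# sum |lambda_k|^r <= sum s_k^r
moment_ok = all(np.sum(lam**r) <= np.sum(s**r) + 1e-9 for r in (0.5, 1.0, 2.0))
print(head_ok, tail_ok, moment_ok)
```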
Lemma A.8 (Schatten Bound, see proof of Theorem 3.32 in [55]).

Let A be an n\times n complex matrix with rows R_{1},\dots,R_{n}. Then for every 0<r\leq 2,

\sum_{k=1}^{n}s_{k}(A)^{r}\leq\sum_{k=1}^{n}\|R_{k}\|^{r}.
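The Schatten bound \sum_{k=1}^{n}s_{k}(A)^{r}\leq\sum_{k=1}^{n}\|R_{k}\|^{r} for 0<r\leq 2 (with equality at r=2, the Frobenius norm) can be checked numerically (a sketch assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

s = np.linalg.svd(A, compute_uv=False)
row_norms = np.linalg.norm(A, axis=1)
# sum_k s_k^r <= sum_k ||R_k||^r for 0 < r <= 2; r = 2 recovers the
# Frobenius identity sum s_k^2 = sum ||R_k||^2.
schatten_ok = all(
    np.sum(s**r) <= np.sum(row_norms**r) + 1e-9 for r in (0.5, 1.0, 1.5, 2.0)
)
print(schatten_ok)
```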

A.3. Moments of stable distributions

Lemma A.9 (See [13] and [24] Theorem VIII.9.2).

Let Z be a positive random variable such that for every t>0,

\mathbb{P}(Z\geq t)=L(t)t^{-\alpha}

for some slowly varying function L and some \alpha\in(0,2). Then for every p>\alpha,

\lim\limits_{t\rightarrow\infty}\frac{\mathbb{E}[Z^{p}\mathbf{1}_{\{Z\leq t\}}]}{c(p)L(t)t^{p-\alpha}}=1,

where c(p):=\alpha/(p-\alpha).
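For a concrete instance, take Z Pareto with \mathbb{P}(Z\geq t)=t^{-\alpha} for t\geq 1 (so L\equiv 1); then the truncated moment has the closed form \mathbb{E}[Z^{p}\mathbf{1}_{\{Z\leq t\}}]=\frac{\alpha}{p-\alpha}(t^{p-\alpha}-1), and the limit can be checked directly (a sketch in plain Python, not from the paper):

```python
# Pareto tail P(Z >= t) = t^{-alpha}, so L == 1 and
# E[Z^p 1_{Z <= t}] = integral_1^t z^p * alpha * z^{-alpha-1} dz
#                   = alpha/(p - alpha) * (t^{p - alpha} - 1).
alpha, p = 1.5, 2.0
c_p = alpha / (p - alpha)

def ratio(t):
    # ratio of the truncated moment to c(p) * L(t) * t^{p - alpha};
    # it equals 1 - t^{-(p - alpha)}, which tends to 1 as t grows.
    truncated_moment = c_p * (t ** (p - alpha) - 1.0)
    return truncated_moment / (c_p * t ** (p - alpha))

for t in (1e2, 1e4, 1e8):
    print(t, ratio(t))
```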

A.4. Proof of Proposition 4.12

From Proposition 4.3 one has

\begin{pmatrix}a(z,\eta)&b(z,\eta)\\ b^{\prime}(z,\eta)&c(z,\eta)\end{pmatrix}=-\left(\begin{pmatrix}\eta&z\\ \bar{z}&\eta\end{pmatrix}+\sum_{k=1}^{\infty}\begin{pmatrix}0&y_{k}^{(1)}\\ \bar{y}_{k}^{(2)}&0\\ \end{pmatrix}\tilde{R}(U)_{kk}\begin{pmatrix}0&y_{k}^{(2)}\\ \bar{y}_{k}^{(1)}&0\\ \end{pmatrix}\right)^{-1}.

We will consider the point process \{(y_{k}^{(1)},y_{k}^{(2)})\}_{k\geq 1} in the polar form \{(r_{k},w_{k})\}_{k\geq 1}, where \{r_{k}\}_{k\geq 1} is a Poisson point process with intensity measure m_{\alpha} and w_{1},w_{2},\dots is a collection of independent random variables with common law \theta_{d}, independent of \{r_{k}\}_{k\geq 1}. We denote the coordinates of w_{k} by (w_{k}^{(1)},w_{k}^{(2)}). In polar coordinates the recursive equation becomes

\begin{pmatrix}a(z,\eta)&b(z,\eta)\\ b^{\prime}(z,\eta)&c(z,\eta)\end{pmatrix}=-\left(\begin{pmatrix}\eta&z\\ \bar{z}&\eta\end{pmatrix}+\sum_{k=1}^{\infty}r_{k}^{2}\begin{pmatrix}c_{k}(z,\eta)|w^{(1)}_{k}|^{2}&b^{\prime}_{k}(z,\eta)w^{(1)}_{k}w^{(2)}_{k}\\ b_{k}(z,\eta)\bar{w}^{(1)}_{k}\bar{w}^{(2)}_{k}&a_{k}(z,\eta)|w^{(2)}_{k}|^{2}\end{pmatrix}\right)^{-1}, (A.3)

where

\begin{pmatrix}a_{k}(z,\eta)&b_{k}(z,\eta)\\ b^{\prime}_{k}(z,\eta)&c_{k}(z,\eta)\end{pmatrix}:=\tilde{R}(U)_{kk}.

To complete the proof that the matrix in (4.3) satisfies the distributional equation we need the following lemma, which is essentially Theorem 10.3 in [8].

Lemma A.10.

Let \{r_{k}\}_{k\geq 1} be a Poisson point process with intensity measure m_{\alpha} for \alpha\in(0,2), and let v_{1},v_{2},\dots be a collection of bounded, independent and identically distributed random vectors with common law \nu. Then

\sum_{k=1}^{\infty}r_{k}^{2}v_{k}\overset{d}{=}S, (A.4)

where S is an \frac{\alpha}{2}-stable random vector with spectral measure \Gamma_{\nu}, the measure on the unit sphere obtained as the image of \frac{\Gamma(2-\alpha/2)\cos(\pi\alpha/4)}{1-\alpha/2}\|v\|^{\alpha/2}\,d\nu(v) under the map v\mapsto\frac{v}{\|v\|}.

Proof.

Let (X_{k})_{k\geq 1} be a sequence of i.i.d. non-negative random variables such that

\mathbb{P}(X_{1}\geq u)=\frac{L(u)}{u^{\alpha/2}}, (A.5)

for some slowly varying function L, and define the normalizing constants

b_{n}:=\inf\left\{b:\mathbb{P}(X_{1}\geq b)\leq\frac{1}{n}\right\}. (A.6)

From Theorem 10.3 in [8], \frac{1}{b_{n}}\sum_{k=1}^{n}X_{k}v_{k} converges in distribution to S. It also follows from Proposition 2.3 that

\sum_{k=1}^{n}\delta_{X_{k}v_{k}/b_{n}}\Rightarrow\sum_{k=1}^{\infty}\delta_{r_{k}^{2}v_{k}}, (A.7)

as n\rightarrow\infty. It then follows from the almost sure uniform summability of the sequence (X_{k})_{k\geq 1} that

\frac{1}{b_{n}}\sum_{k=1}^{n}X_{k}v_{k}\Rightarrow\sum_{k=1}^{\infty}r_{k}^{2}v_{k}, (A.8)

as n\rightarrow\infty. ∎

An application of Lemma A.10 to the series in (A.3) gives (4.7).

References

  • [1] A. Aggarwal, P. Lopatto, and J. Marcinek. Eigenvector statistics of Lévy matrices. The Annals of Probability, 49(4):1778 – 1846, 2021.
  • [2] A. Aggarwal, P. Lopatto, and H.-T. Yau. GOE statistics for Lévy matrices. Journal of the European Mathematical Society, 23(11):3707–3800, 2021.
  • [3] D. Aldous and J. M. Steele. The objective method: Probabilistic combinatorial optimization and local weak convergence. In Probability on Discrete Structures, volume 110 of Encyclopaedia of Mathematical Sciences, page 1–72. Springer-Verlag, New York, 2004.
  • [4] A. Auffinger, G. Ben Arous, and S. Péché. Poisson convergence for the largest eigenvalues of heavy tailed random matrices. Ann. Inst. Henri Poincaré, Probab. Stat., 45(3):589–610, 2009.
  • [5] A. Auffinger and S. Tang. Extreme eigenvalues of sparse, heavy tailed random matrices. Stochastic Processes and their Applications, 126(11):3310–3330, Nov 2016.
  • [6] B. Basrak, Y. Cho, J. Heiny, and P. Jung. Extreme eigenvalue statistics of m-dependent heavy-tailed matrices. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, 57(4):2100 – 2127, 2021.
  • [7] S. Belinschi, A. Dembo, and A. Guionnet. Spectral measure of heavy tailed band and covariance random matrices. Communications in Mathematical Physics, 289(3):1023–1055, May 2009.
  • [8] G. Ben Arous and A. Guionnet. The spectrum of heavy tailed random matrices. Communications in Mathematical Physics, 278(3):715–751, Dec 2007.
  • [9] F. Benaych-Georges and A. Guionnet. Central limit theorem for eigenvectors of heavy tailed matrices. Electron. J. Probab., 19:27 pp., 2014.
  • [10] R. Bhatia. Matrix analysis, volume 169 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.
  • [11] G. Biroli, J.-P. Bouchaud, and M. Potters. On the top eigenvalue of heavy-tailed random matrices. Europhysics Letters (EPL), 78(1):10001, Mar 2007.
  • [12] C. Bordenave, P. Caputo, and D. Chafaï. Spectrum of large random reversible Markov chains: heavy-tailed weights on the complete graph. The Annals of Probability, 39(4):1544–1590, 2011.
  • [13] C. Bordenave, P. Caputo, and D. Chafaï. Spectrum of non-Hermitian heavy tailed random matrices. Communications in Mathematical Physics, 307(2):513–560, Oct. 2011.
  • [14] C. Bordenave, P. Caputo, D. Chafaï, and D. Piras. Spectrum of large random Markov chains: heavy-tailed weights on the oriented complete graph. Random Matrices: Theory and Applications, 06(02):1750006, 2017.
  • [15] C. Bordenave and D. Chafaï. Around the circular law. Probab. Surveys, 9:1–89, 2012.
  • [16] C. Bordenave and A. Guionnet. Localization and delocalization of eigenvectors for heavy-tailed random matrices. Probability Theory and Related Fields, 157(3-4):885–953, Jan 2013.
  • [17] C. Bordenave and A. Guionnet. Delocalization at small energy for heavy-tailed random matrices. Communications in Mathematical Physics, 354(1):115–159, May 2017.
  • [18] P. Cizeau and J. P. Bouchaud. Theory of Lévy matrices. Phys. Rev. E, 50:1810–1822, Sep 1994.
  • [19] K. P. Costello. Bilinear and quadratic variants on the Littlewood-Offord problem. Israel Journal of Mathematics, 194(1):359–394, Mar 2013.
  • [20] K. P. Costello, T. Tao, and V. Vu. Random symmetric matrices are almost surely nonsingular. Duke Math. J., 135(2):395–413, Nov 2006.
  • [21] Y. Davydov and V. Egorov. On convergence of empirical point processes. Statistics & Probability Letters, 76(17):1836–1844, Nov. 2006.
  • [22] R. Durrett. Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 4 edition, 2010.
  • [23] K. Fan and A. J. Hoffman. Some metric inequalities in the space of matrices. Proceedings of the American Mathematical Society, 6(1):111–116, 1955.
  • [24] W. Feller. An introduction to probability theory and its applications. Vol. II. Second edition. John Wiley & Sons Inc., New York, 1971.
  • [25] V. Girko. Elliptic law. Theory of Probability & Its Applications, 30(4):677–690, 1986.
  • [26] V. Girko. The elliptic law: ten years later I. Random Operators and Stochastic Equations, 3(3):257–302, 1995.
  • [27] F. Götze. Asymptotic expansions for bivariate von Mises functionals. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 50(3):333–355, Jan 1979.
  • [28] F. Götze, A. Naumov, and A. Tikhomirov. On a generalization of the elliptic law for random matrices. Acta Physica Polonica B, 46(9):1737–1745, 2015.
  • [29] F. Götze, A. Naumov, and A. Tikhomirov. On minimal singular values of random matrices with correlated entries. Random Matrices: Theory and Applications, 04(02):1550006, 2015.
  • [30] E. Gudowska-Nowak, A. Jarosz, M. Nowak, and G. Papp. Towards non-Hermitian random Lévy matrices. Acta Physica Polonica B, 38, 01 2007.
  • [31] J. Heiny and J. Yao. Limiting distributions for eigenvalues of sample correlation matrices from heavy-tailed populations. Available at arXiv:2003.03857, 2020.
  • [32] R. v. d. Hofstad. Random Graphs and Complex Networks, volume 1 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2016.
  • [33] R. Horn and C. R. Johnson. Topics in matrix analysis. Cambridge University Press, 1991.
  • [34] P. Jung. Lévy-Khintchine random matrices and the poisson weighted infinite skeleton tree. Transactions of the American Mathematical Society, 370, 02 2014.
  • [35] A. E. Litvak, A. Pajor, M. Rudelson, and N. Tomczak-Jaegermann. Smallest singular value of random matrices and geometry of random polytopes. Adv. Math., 195(2):491–523, 2005.
  • [36] C. McDiarmid. Concentration. In M. Habib, C. McDiarmid, J. Ramirez-Alfonsin, and B. Reed, editors, Probabilistic Methods for Algorithmic Discrete Mathematics, Algorithms and Combinatorics. Springer-Verlag, New York, 1998.
  • [37] C. Monthus. Localization transition in random Lévy matrices: multifractality of eigenvectors in the localized phase and at criticality. Journal of Statistical Mechanics: Theory and Experiment, 2016(9):093304, Sep 2016.
  • [38] A. Naumov. Elliptic law for real random matrices. Available at arXiv:1201.1639, 2012.
  • [39] H. H. Nguyen and S. O’Rourke. The elliptic law. International Mathematics Research Notices, 2015(17):7620–7689, Oct 2014.
  • [40] S. O’Rourke and D. Renfrew. Central limit theorem for linear eigenvalue statistics of elliptic random matrices. Journal of Theoretical Probability, 29(3):1121–1191, Mar 2015.
  • [41] S. O’Rourke, D. Renfrew, A. Soshnikov, and V. Vu. Products of independent elliptic random matrices. Journal of Statistical Physics, 160(1):89–119, Apr 2015.
  • [42] M. Reed and B. Simon. Methods of modern mathematical physics, volume 1. Academic Press, 1972.
  • [43] S. I. Resnick. Extreme Values, Regular Variation and Point Processes. Springer Series in Operations Research and Financial Engineering. Springer-Verlag, New York, 1 edition, 1987.
  • [44] S. I. Resnick. Heavy-Tail Phenomena. Springer Series in Operations Research and Financial Engineering. Springer-Verlag, New York, 1 edition, 2007.
  • [45] M. Rudelson and R. Vershynin. The Littlewood-Offord problem and invertibility of random matrices. Adv. Math., 218(2):600–633, 2008.
  • [46] G. Sakovich. A single form for the conditions for attraction to stable laws. Theory of Probability & Its Applications, 1(3):322–325, 1956.
  • [47] G. Samorodnitsky and M. S. Taqqu. Stable non-gaussian random processes. Stochastic Modeling. Chapman & Hall, New York, NY, 1994.
  • [48] A. Sidorenko. A correlation inequality for bipartite graphs. Graphs and Combinatorics, 9(2):201–204, Jun 1993.
  • [49] A. Soshnikov. Poisson statistics for the largest eigenvalues of Wigner random matrices with heavy tails. Electronic Communications in Probability, 9:82–91, 2004.
  • [50] T. Tao. Topics in random matrix theory, volume 132 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2012.
  • [51] T. Tao, V. Vu, and M. Krishnapur. Random matrices: Universality of ESDs and the circular law. Ann. Probab., 38(5):2023–2065, 09 2010.
  • [52] E. Tarquini, G. Biroli, and M. Tarzia. Level statistics and localization transitions of Lévy matrices. Physical Review Letters, 116(1), Jan 2016.
  • [53] R. Vershynin. Invertibility of symmetric random matrices. Random Structures & Algorithms, 44(2):135–182, 2014.
  • [54] H. Weyl. Inequalities between the two kinds of eigenvalues of a linear transformation. Proceedings of the National Academy of Sciences, 35(7):408–411, 1949.
  • [55] X. Zhan. Matrix inequalities, volume 1790 of Lecture Notes in Mathematics. Springer-Verlag, New York, 2002.