
Improved concentration of Laguerre and Jacobi ensembles

Yichen Huang (黄溢辰) [email protected] Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA Department of Physics, Harvard University, Cambridge, Massachusetts 02138, USA Aram W. Harrow [email protected] Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
Abstract

We consider the asymptotic limits where certain parameters in the definitions of the Laguerre and Jacobi ensembles diverge. In these limits, Dette, Imhof, and Nagel proved that up to a linear transformation, the joint probability distributions of the ensembles become more and more concentrated around the zeros of the Laguerre and Jacobi polynomials, respectively. In this paper, we improve the concentration bounds. Our proofs are similar to those in the original references, but the error analysis is improved and arguably simpler. For the first and second moments of the Jacobi ensemble, we further improve the concentration bounds implied by our aforementioned results.

Preprint number: MIT-CTP/5469

1 Introduction

The Gaussian, Wishart, and Jacobi ensembles are three classical ensembles in random matrix theory. They find numerous applications in physics, statistics, and other branches of applied science. The Gaussian (Wishart) ensemble is also known as the Hermite (Laguerre) ensemble due to its relationship with the Hermite (Laguerre) polynomial.

Of particular interest are the asymptotic limits where certain parameters in the definitions of the ensembles diverge. In these limits, Dette, Imhof, and Nagel [1, 2] proved that up to a linear transformation, the joint probability distributions of the Hermite, Laguerre, and Jacobi ensembles become more and more concentrated around the zeros of the Hermite, Laguerre, and Jacobi polynomials, respectively. These results allow us to transfer knowledge on the zeros of orthogonal polynomials to the corresponding ensembles.

In this paper, we improve the concentration bounds for the Laguerre and Jacobi probability distributions around the zeros of the Laguerre and Jacobi polynomials, respectively. Our proofs are similar to those in the original references [1, 2], but the error analysis is improved and arguably simpler. We also prove the concentration of the first and second moments of the Jacobi ensemble. The last result has found applications in quantum statistical mechanics [3].

The rest of this paper is organized as follows. Section 2 presents our main results, which are compared with previous results in the literature. Proofs are given in Section 3.

2 Results

In the literature, there is more than one definition of the Laguerre probability distribution. These definitions differ only by a linear transformation and are thus essentially equivalent. In this paper, we stick to one definition. When citing a result from the literature, we perform a linear transformation such that the result is presented for the definition we stick to. The same applies to the Jacobi case.

Let $n$ be the number of random variables in an ensemble. Let $\beta$ be the Dyson index, which can be an arbitrary positive number.

2.1 Laguerre ensemble

We draw $\lambda_{1}\leq\lambda_{2}\leq\cdots\leq\lambda_{n}$ from the Laguerre ensemble.

Definition 1 (Laguerre ensemble).

The probability density function of the $\beta$-Laguerre ensemble with parameters

\alpha>(n-1)\frac{\beta}{2} (1)

is

f_{\textnormal{Lag}}(\lambda_{1},\lambda_{2},\ldots,\lambda_{n})\propto\prod_{1\leq i<j\leq n}|\lambda_{i}-\lambda_{j}|^{\beta}\prod_{i=1}^{n}\lambda_{i}^{\alpha-\frac{(n-1)\beta}{2}-1}e^{-\lambda_{i}/2},\quad\lambda_{i}>0. (2)

For certain values of $\beta$, the Laguerre ensemble arises as the probability density function of the eigenvalues of a Wishart matrix $VV^{*}$, where $V$ is an $n\times\frac{2\alpha}{\beta}$ matrix with real ($\beta=1$), complex ($\beta=2$), or quaternionic ($\beta=4$) entries. In each case, the entries of $V$ are independent standard Gaussian random variables and $V^{*}$ denotes the conjugate transpose of $V$.

Let

L_{n}^{(p)}(x):=\sum_{i=0}^{n}\binom{n+p}{n-i}\frac{(-x)^{i}}{i!},\quad p>-1 (3)

be the Laguerre polynomial, whose zeros are all in the interval with endpoints [4]

2n+p-2\pm\sqrt{1+4(n-1)(n+p-1)\cos^{2}\frac{\pi}{n+1}}. (4)

Let $x_{1}<x_{2}<\cdots<x_{n}$ be the zeros of the Laguerre polynomial $L_{n}^{(2\alpha/\beta-n)}(x/\beta)$.

We are interested in the limit $\alpha\to\infty$ but do not assume that $n\to\infty$. Note that if $\beta$ is a constant, then $n\to\infty$ implies $\alpha\to\infty$; see (1).

Theorem 1 (Theorem 2.1 in Ref. [1]).

For any $0<\epsilon<1$,

\Pr\left(\frac{1}{2\alpha}\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|>\epsilon\right)\leq 4n(1+\epsilon^{2}/25)^{\alpha}e^{-\alpha\epsilon^{2}/25}. (5)

This theorem can be restated as

Corollary 1.

There exist positive constants $C_{1},C_{2}$ such that for any $0<\epsilon<1$,

\Pr\left(\frac{1}{2\alpha}\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|>\epsilon\right)\leq C_{1}ne^{-C_{2}\alpha\epsilon^{4}}. (6)
Theorem 2 (Theorem 2.4 in Ref. [1]).

Let $\kappa\geq 1$ be a parameter. If

n-1+1/\beta\leq 2\alpha/\beta\leq n-1+\kappa\quad\textnormal{and}\quad 2\kappa\beta/\alpha<\epsilon<1, (7)

then there exist positive constants $C_{1},C_{2},C_{3}$ such that

\Pr\left(\frac{1}{2\alpha}\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|>\epsilon\right)\leq C_{1}n\left(e^{-C_{2}\alpha\epsilon^{2}/\kappa}+e^{C_{3}\kappa^{2}\beta-C_{2}\alpha\epsilon^{2}}\right). (8)

The original upper bound on $\Pr(\frac{1}{2\alpha}\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|>\epsilon)$ in Theorem 2.4 of Ref. [1] is a complicated expression without implicit constants. The right-hand side of (8) is its simplification using implicit constants.

If condition (7) is satisfied, (8) may be an improvement over (6). In particular, for a constant $\beta$, the right-hand side of (8) becomes $C^{\prime}_{1}ne^{-C^{\prime}_{2}\alpha\epsilon^{2}}$ ($C^{\prime}_{1},C^{\prime}_{2}$ are positive constants) if and only if $\kappa$ is upper bounded by a constant.

As the main result of this subsection, Theorem 3 is an improvement of Corollary 1 and Theorem 2.

Theorem 3.

There exist positive constants $C_{1},C_{2}$ such that for any $\epsilon>0$,

\Pr\left(\frac{1}{2\alpha}\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|>\epsilon\right)\leq C_{1}ne^{-C_{2}\alpha\epsilon\min\{\epsilon,1\}}. (9)

Let $n\leq s$ be two positive integers and $V$ be an $n\times s$ matrix whose elements are independent standard real Gaussian random variables. Then, $VV^{T}$ is a real Wishart matrix, whose joint eigenvalue distribution is given by (2) with $\beta=1$ and $\alpha=s/2$. Theorem 3 implies that

Corollary 2.

Let $\lambda_{1}\leq\lambda_{2}\leq\cdots\leq\lambda_{n}$ be the eigenvalues of $VV^{T}$ and $x_{1}<x_{2}<\cdots<x_{n}$ be the zeros of the Laguerre polynomial $L_{n}^{(s-n)}(x)$. There exist positive constants $C_{1},C_{2}$ such that for any $\epsilon>0$,

\Pr\left(\frac{1}{s}\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|>\epsilon\right)\leq C_{1}ne^{-C_{2}s\epsilon\min\{\epsilon,1\}}. (10)

Analogues of Corollary 2 for complex ($\beta=2$) and quaternionic ($\beta=4$) Wishart matrices also follow directly from Theorem 3.
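As a numerical illustration of Corollary 2 (a Monte Carlo sketch, not part of the proof; the matrix size and the tolerance 0.2 are arbitrary illustrative choices, not the constants of the corollary), one can compare the spectrum of a sampled real Wishart matrix with the zeros of the corresponding Laguerre polynomial:

```python
import numpy as np
from scipy.special import roots_genlaguerre

rng = np.random.default_rng(0)
n, s = 4, 20_000
V = rng.standard_normal((n, s))
lam = np.sort(np.linalg.eigvalsh(V @ V.T))    # eigenvalues of VV^T
x = np.sort(roots_genlaguerre(n, s - n)[0])   # zeros of L_n^{(s-n)}(x)
# (1/s) max_i |lambda_i - x_i| should be small with overwhelming probability
assert np.max(np.abs(lam - x)) / s < 0.2
```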

Let

M_{1}^{\textnormal{L}}:=\frac{1}{n}\sum_{i=1}^{n}\lambda_{i} (11)

be the first moment of the Laguerre ensemble. The distribution of $M_{1}^{\textnormal{L}}$ has a particularly simple form.

Fact 1.

$M_{1}^{\textnormal{L}}$ is distributed as $\frac{1}{n}\chi_{2\alpha n}^{2}$, where $\chi_{k}^{2}$ denotes the chi-square distribution with $k$ degrees of freedom.

Thus, the concentration of $M_{1}^{\textnormal{L}}$ follows directly from the tail bound [5, 6] for the chi-square distribution.

The distribution of the second moment of the Laguerre ensemble does not have a simple form. Furthermore, obtaining concentration bounds for it is complicated, so we omit this analysis here.

2.2 Jacobi ensemble

We draw $\mu_{1}\leq\mu_{2}\leq\cdots\leq\mu_{n}$ from the Jacobi ensemble.

Definition 2 (Jacobi ensemble).

The probability density function of the $\beta$-Jacobi ensemble with parameters $a,b>0$ is

f_{\textnormal{Jac}}(\mu_{1},\mu_{2},\ldots,\mu_{n})\propto\prod_{1\leq i<j\leq n}|\mu_{i}-\mu_{j}|^{\beta}\prod_{i=1}^{n}(1-\mu_{i})^{a-1}(1+\mu_{i})^{b-1},\quad -1\leq\mu_{i}\leq 1. (12)

The Jacobi ensemble can be interpreted as the probability density function of the eigenvalues of a random matrix ensemble. In the complex ($\beta=2$) case, let $Q_{1}$ and $Q_{2}$ be uniformly random projectors in $\mathbb{C}^{2n+a+b-2}$ with ranks $n$ and $n+b-1$, respectively. Then, $\frac{1+\mu_{1}}{2},\frac{1+\mu_{2}}{2},\ldots,\frac{1+\mu_{n}}{2}$ are the non-zero eigenvalues of $Q_{1}Q_{2}Q_{1}$ [7]. Equivalently, they are the squared singular values of an $n\times(n+b-1)$ rectangular block within a Haar-random unitary matrix of dimension $2n+a+b-2$. A random matrix interpretation for general $\beta$ is given in Ref. [8], but it has less of a natural connection to applications.

The Jacobi polynomial is defined as

P_{n}^{p,q}(y):=\frac{\Gamma(n+p+1)}{\Gamma(n+p+q+1)}\sum_{i=0}^{n}\frac{\Gamma(n+p+q+i+1)}{i!\,(n-i)!\,\Gamma(p+i+1)}\left(\frac{y-1}{2}\right)^{i}, (13)

where $\Gamma$ is the gamma function. It is well known that all zeros of the Jacobi polynomial are in the interval $(-1,1)$. Let $y_{1}<y_{2}<\cdots<y_{n}$ be the zeros of the Jacobi polynomial $P_{n}^{2a/\beta-1,2b/\beta-1}(y)$.

2.2.1 Pointwise approximation

In this subsubsection, we are interested in the limit $a+b\to\infty$ but do not assume that $\min\{a,b\}\to\infty$.

Theorem 4 (Theorem 2.1 in Ref. [2]).

For any $0<\epsilon\leq 1/2$,

\Pr\left(\max_{1\leq i\leq n}|\mu_{i}-y_{i}|>\epsilon\right)\leq 4(2n-1)\left(1+\frac{\epsilon^{2}}{162+2\epsilon^{2}}\right)^{a+b}e^{-\frac{(a+b)\epsilon^{2}}{162+2\epsilon^{2}}}. (14)

This theorem can be restated as

Corollary 3.

There exist positive constants $C_{1},C_{2}$ such that for any $0<\epsilon\leq 1/2$,

\Pr\left(\max_{1\leq i\leq n}|\mu_{i}-y_{i}|>\epsilon\right)\leq C_{1}ne^{-C_{2}(a+b)\epsilon^{4}}. (15)

As the main result of this subsubsection, Theorem 5 is an improvement of Corollary 3.

Theorem 5.

There exist positive constants $C_{1},C_{2}$ such that for any $\epsilon>0$,

\Pr\left(\max_{1\leq i\leq n}|\mu_{i}-y_{i}|>\epsilon\right)\leq C_{1}ne^{-C_{2}(a+b)\epsilon^{2}}. (16)

Section 3 of Ref. [2] presents several applications of Theorem 4. Most of them can be improved by using Theorem 5. We discuss one of them in detail.

Let $\beta$ be a positive constant. Consider the limit $n\to\infty$ with

a=\omega(n),\quad a=\Theta(b). (17)

Let $\delta(\cdot)$ be the Dirac delta. The semicircle law with radius $r$ is a probability distribution on the interval $[-r,r]$ with density function

f_{\textnormal{SC}}(\mu)\propto\sqrt{r^{2}-\mu^{2}}. (18)
Corollary 4.

The empirical distribution

f(\mu):=\frac{1}{n}\sum_{i=1}^{n}\delta\left(\mu-\sqrt{\frac{a+b}{2abn\beta}}\big((a+b)\mu_{i}+a-b\big)\right) (19)

of linearly transformed $\mu_{i}$ converges weakly to the semicircle law with radius $2$ almost surely.

For $\omega(n)=a=o(n^{2}/\ln n)$, Corollary 4 was proved in Example 3.4 of Ref. [2] using Theorem 4. Using Theorem 5 instead, the same proof becomes valid for any $a=\omega(n)$.
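Corollary 4 can be illustrated with the deterministic Jacobi zeros, around which the ensemble concentrates by Theorem 5. A sketch (parameter values are arbitrary; the semicircle law with radius $2$ has mean $0$ and variance $1$):

```python
import numpy as np
from scipy.special import roots_jacobi

n, beta = 50, 1.0
a = b = 5000.0                          # a = omega(n) regime with a = Theta(b)
y = roots_jacobi(n, 2 * a / beta - 1, 2 * b / beta - 1)[0]
# the linear transformation from (19)
t = np.sqrt((a + b) / (2 * a * b * n * beta)) * ((a + b) * y + a - b)
# first two moments approximate those of the radius-2 semicircle law
assert abs(t.mean()) < 1e-6 and abs(t.var() - 1) < 0.05
```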

Corollary 4 is very similar to Theorem 2.1 in Ref. [9].

2.2.2 Moments

Theorem 5 implies the concentration of any smooth multivariate function of $\mu_{1},\mu_{2},\ldots,\mu_{n}$. The main result of this subsubsection is tighter concentration bounds (than those implied by Theorem 5) for the first and second moments of the Jacobi ensemble.

Let

N:=a+b+\beta(n-1). (20)

Suppose that $\beta=\Theta(1)$ is a positive constant and that $a+b=\Omega(1)$. In this subsubsection, we are interested in the limit $N\to\infty$. This means that $a+b\to\infty$ or $n\to\infty$ or both.

Let

M^{\textnormal{J}}_{1}:=\frac{1}{n}\sum_{i=1}^{n}\mu_{i},\quad M^{\textnormal{J}}_{2}:=\frac{1}{n}\sum_{i=1}^{n}(\mu_{i}-\mathbb{E}M^{\textnormal{J}}_{1})^{2} (21)

be the first and shifted second moments of the Jacobi ensemble. Equation (B.7) of Ref. [10] implies that

\mathbb{E}M^{\textnormal{J}}_{1}=\frac{b-a}{N},\quad\mathbb{E}M^{\textnormal{J}}_{2}=\frac{\beta n(2a+\beta n)(2b+\beta n)}{2N^{3}}+O(1/N). (22)

Indeed, $\mathbb{E}M^{\textnormal{J}}_{2}$ can be calculated exactly in closed form. The exact expression is lengthy and simplifies to the above using big-$O$ notation.

Theorem 6 (concentration of moments).

For any $\epsilon>0$,

\Pr(|M^{\textnormal{J}}_{1}-\mathbb{E}M^{\textnormal{J}}_{1}|>\epsilon)=O(e^{-\Omega(Nn\epsilon^{2})}),\quad\Pr(|M^{\textnormal{J}}_{2}-\mathbb{E}M^{\textnormal{J}}_{2}|>\epsilon)=O(e^{-\Omega(N\epsilon)\min\{N\epsilon,n\}}). (23)

Let

Y_{1}:=\frac{1}{n}\sum_{i=1}^{n}y_{i},\quad Y_{2}:=\frac{1}{n}\sum_{i=1}^{n}(y_{i}-Y_{1})^{2} (24)

be the mean and variance of the zeros of the Jacobi polynomial. From direct calculation (Appendix A) we find that

Y_{1}=\frac{b-a}{N},\quad Y_{2}=\frac{\beta(n-1)\big(2a+\beta(n-1)\big)\big(2b+\beta(n-1)\big)}{N^{2}(2N-\beta)}. (25)

Hence,

\mathbb{E}M^{\textnormal{J}}_{1}=Y_{1},\quad\mathbb{E}M^{\textnormal{J}}_{2}=Y_{2}+O(1/N). (26)
Corollary 5.

For any $\epsilon>0$,

\Pr(|M^{\textnormal{J}}_{1}-Y_{1}|>\epsilon)=O(e^{-\Omega(Nn\epsilon^{2})}),\quad\Pr(|M^{\textnormal{J}}_{2}-Y_{2}|>\epsilon)=O(e^{-\Omega(N\epsilon)\min\{N\epsilon,n\}}). (27)
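The closed forms (25) for $Y_{1}$ and $Y_{2}$ can be sanity-checked numerically against Jacobi zeros computed by scipy (a sketch with arbitrary parameter values):

```python
import numpy as np
from scipy.special import roots_jacobi

n, beta, a, b = 6, 1.0, 2.5, 4.0
N = a + b + beta * (n - 1)
y = roots_jacobi(n, 2 * a / beta - 1, 2 * b / beta - 1)[0]
# right-hand sides of (25)
Y2_formula = (beta * (n - 1) * (2 * a + beta * (n - 1))
              * (2 * b + beta * (n - 1))) / (N ** 2 * (2 * N - beta))
assert np.isclose(y.mean(), (b - a) / N)   # Y_1
assert np.isclose(y.var(), Y2_formula)     # Y_2 (population variance, 1/n)
```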

3 Proofs

The proofs of Theorems 3 and 5 are similar to those of Theorems 1 and 4 in Refs. [1, 2], respectively, but the error analysis is improved and arguably simpler.

The following lemma will be used multiple times.

Lemma 1.

Let $m$ be a positive integer and $p_{i},q_{i}$ be numbers such that $|p_{i}-q_{i}|\leq\delta$ for $i=1,2,\ldots,m$. Then,

\left|\prod_{i=1}^{m}p_{i}-\prod_{i=1}^{m}q_{i}\right|\leq\delta\sum_{k=0}^{m-1}\prod_{i=1}^{m-k-1}|p_{i}|\prod_{j=m+1-k}^{m}|q_{j}|. (28)

Proof.

\left|\prod_{i=1}^{m}p_{i}-\prod_{i=1}^{m}q_{i}\right|\leq\sum_{k=0}^{m-1}\left|\prod_{i=1}^{m-k}p_{i}\prod_{j=m+1-k}^{m}q_{j}-\prod_{i=1}^{m-k-1}p_{i}\prod_{j=m-k}^{m}q_{j}\right|\leq\delta\sum_{k=0}^{m-1}\prod_{i=1}^{m-k-1}|p_{i}|\prod_{j=m+1-k}^{m}|q_{j}|. (29)
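A quick randomized check of Lemma 1 (the values of $m$ and $\delta$ are arbitrary; note that empty products evaluate to $1$):

```python
import numpy as np

rng = np.random.default_rng(0)
m, delta = 5, 0.05
p = rng.uniform(-1, 1, size=m)
q = p + rng.uniform(-delta, delta, size=m)   # |p_i - q_i| <= delta
lhs = abs(np.prod(p) - np.prod(q))
# right-hand side of (28); slices follow the 1-based index ranges of the lemma
rhs = delta * sum(np.prod(np.abs(p[: m - k - 1])) * np.prod(np.abs(q[m - k:]))
                  for k in range(m))
assert lhs <= rhs + 1e-12
```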

Let $C$ be a positive constant. For notational simplicity, we reuse $C$: its value may differ between expressions and equations.

3.1 Laguerre ensemble: Proofs of Theorem 3 and Fact 1

For Theorem 3, it suffices to prove

Theorem 7.

For any $\epsilon>0$,

\Pr\left(\frac{1}{2\alpha}\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|>4\epsilon\right)\leq 4ne^{-\alpha(\sqrt{1+\epsilon}-1)^{2}}. (30)
Proof of Theorem 7.

Let $X_{2\alpha},X_{2\alpha-\beta},X_{2\alpha-2\beta},\ldots,X_{2\alpha-(n-1)\beta},Y_{\beta},Y_{2\beta},\ldots,Y_{(n-1)\beta}$ be independent non-negative random variables with $X_{k}^{2}\sim\chi^{2}_{k}$ and $Y_{l}^{2}\sim\chi^{2}_{l}$. Note that

\mathbb{E}(X_{k}^{2})=k,\quad\operatorname{Var}(X_{k}^{2})=2k. (31)

Lemma A.1 in Ref. [1] gives the tail bound ($\delta$ here and in all probability bounds below is positive)

\Pr(|X_{k}-\sqrt{k}|>\delta)\leq 2(1+\delta/\sqrt{k})^{k}e^{-\delta\sqrt{k}-\delta^{2}/2}\leq 2e^{-\delta^{2}/2}. (32)
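The weaker bound in (32) can be checked by Monte Carlo ($k$, $\delta$, and the sample size below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
k, delta, N = 20, 2.0, 400_000
X = np.sqrt(rng.chisquare(k, size=N))        # X_k with X_k^2 ~ chi^2_k
emp = np.mean(np.abs(X - np.sqrt(k)) > delta)
assert emp <= 2 * np.exp(-delta ** 2 / 2)    # empirical tail vs (32)
```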

Let $\mathbf{L}_{i,j}$ be the element in the $i$th row and $j$th column of a real symmetric $n\times n$ tridiagonal random matrix $\mathbf{L}$. "Tridiagonal" means that $\mathbf{L}_{i,j}=0$ if $|i-j|>1$. The diagonal and subdiagonal matrix elements are, respectively,

\mathbf{L}_{1,1}=X_{2\alpha}^{2}, (33)
\mathbf{L}_{i,i}=X_{2\alpha-(i-1)\beta}^{2}+Y_{(n+1-i)\beta}^{2},\quad i=2,3,\ldots,n, (34)
\mathbf{L}_{i+1,i}=X_{2\alpha-(i-1)\beta}Y_{(n-i)\beta},\quad i=1,2,\ldots,n-1. (35)

The joint eigenvalue distribution of $\mathbf{L}$ is the Laguerre ensemble (Definition 1) [11].

Let $\mathbf{L}^{\prime}$ be a real symmetric $n\times n$ tridiagonal deterministic matrix, whose matrix elements are obtained by replacing $X_{k}^{2},Y_{l}^{2}$ in Eqs. (33), (34) by their expectation values and replacing $X_{k}Y_{l}$ in Eq. (35) by $\sqrt{\mathbb{E}(X_{k}^{2})\mathbb{E}(Y_{l}^{2})}$, i.e.,

\mathbf{L}^{\prime}_{1,1}=2\alpha, (36)
\mathbf{L}^{\prime}_{i,i}=2\alpha+(n+2-2i)\beta,\quad i=2,3,\ldots,n, (37)
\mathbf{L}^{\prime}_{i+1,i}=\sqrt{\big(2\alpha-(i-1)\beta\big)(n-i)\beta},\quad i=1,2,\ldots,n-1. (38)

The eigenvalues of $\mathbf{L}^{\prime}$ are the zeros of the Laguerre polynomial $L_{n}^{(2\alpha/\beta-n)}(x/\beta)$ [1].
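This correspondence can be verified numerically; a sketch with arbitrary parameter values satisfying (1):

```python
import numpy as np
from scipy.special import roots_genlaguerre

n, beta, alpha = 5, 2.0, 9.0    # arbitrary; alpha > (n-1) * beta / 2
# build L' from (36), (37), (38)
d = [2 * alpha] + [2 * alpha + (n + 2 - 2 * i) * beta for i in range(2, n + 1)]
e = [np.sqrt((2 * alpha - (i - 1) * beta) * (n - i) * beta) for i in range(1, n)]
Lp = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)
eig = np.sort(np.linalg.eigvalsh(Lp))
# zeros of L_n^{(2 alpha / beta - n)}(x / beta) are beta times the zeros
# of L_n^{(2 alpha / beta - n)}
zeros = beta * np.sort(roots_genlaguerre(n, 2 * alpha / beta - n)[0])
assert np.allclose(eig, zeros)
```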

Let $\|\cdot\|$ denote the operator norm. Let $\mathbf{L}_{1,0}=\mathbf{L}^{\prime}_{1,0}=\mathbf{L}_{n+1,n}=\mathbf{L}^{\prime}_{n+1,n}:=0$. Let $\delta=\sqrt{2\alpha}(\sqrt{1+\epsilon}-1)$. Since

\max_{1\leq i\leq n}|\lambda_{i}-x_{i}|\leq\|\mathbf{L}-\mathbf{L}^{\prime}\|\leq\max_{1\leq i\leq n}\{|\mathbf{L}_{i,i-1}-\mathbf{L}^{\prime}_{i,i-1}|+|\mathbf{L}_{i,i}-\mathbf{L}^{\prime}_{i,i}|+|\mathbf{L}_{i+1,i}-\mathbf{L}^{\prime}_{i+1,i}|\}, (39)

it suffices to show that

|\mathbf{L}_{i,i}-\mathbf{L}^{\prime}_{i,i}|\leq 4\alpha\epsilon,\quad|\mathbf{L}_{i+1,i}-\mathbf{L}^{\prime}_{i+1,i}|\leq 2\alpha\epsilon,\quad\forall i (40)

under the assumptions that

|X_{k}-\sqrt{k}|\leq\delta,\quad|Y_{l}-\sqrt{l}|\leq\delta,\quad\forall k,l. (41)

Indeed, (41) and Lemma 1 with $m=2$ imply that for any $k,l\leq 2\alpha$,

|X_{k}^{2}-k|\leq\delta(2\sqrt{k}+\delta)\leq\delta(2\sqrt{2\alpha}+\delta), (42)
|X_{k}Y_{l}-\sqrt{kl}|\leq\delta(\sqrt{k}+\sqrt{l}+\delta)\leq\delta(2\sqrt{2\alpha}+\delta)=2\alpha\epsilon. (43)

Proof of Fact 1.

Using the matrix model (33), (34), (35) from Ref. [11], we find that

M_{1}^{\textnormal{L}}\sim\frac{1}{n}\sum_{i=1}^{n}\mathbf{L}_{i,i}=\frac{1}{n}\sum_{i=1}^{n}X_{2\alpha-(i-1)\beta}^{2}+\frac{1}{n}\sum_{i=2}^{n}Y_{(n+1-i)\beta}^{2}\sim\frac{1}{n}\chi^{2}_{2\alpha n}. (44)
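The degrees of freedom in (44) can be checked by direct bookkeeping: the $X$ and $Y$ variables are independent, so their chi-square degrees of freedom add, and the total must be $2\alpha n$. A sketch with arbitrary parameter values:

```python
n, alpha, beta = 7, 10.0, 1.5   # arbitrary; alpha > (n-1) * beta / 2
# degrees of freedom of the chi-square variables appearing in (44)
dof_X = sum(2 * alpha - (i - 1) * beta for i in range(1, n + 1))
dof_Y = sum((n + 1 - i) * beta for i in range(2, n + 1))
assert abs(dof_X + dof_Y - 2 * alpha * n) < 1e-9
```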

3.2 Jacobi ensemble

For $k,l>0$, let $Z\sim B(k,l)$ denote a beta-distributed random variable on the interval $[-1,1]$ with probability density function

f_{\textnormal{beta}}(z)\propto(1-z)^{k-1}(1+z)^{l-1} (45)

so that

\mathbb{E}Z=\frac{l-k}{k+l}. (46)
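The mean (46) can be checked by sampling. Note that numpy's beta sampler lives on $[0,1]$ with density $\propto w^{l-1}(1-w)^{k-1}$ for `rng.beta(l, k)`, so $Z=2W-1$ has density (45); the parameter values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
k, l = 7.0, 3.0
Z = 2 * rng.beta(l, k, size=400_000) - 1     # Z ~ B(k, l) on [-1, 1]
assert abs(Z.mean() - (l - k) / (k + l)) < 0.01
```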

Assume without loss of generality that $k\geq l$. Theorem 8 in Ref. [12] gives the tail bound

\Pr(Z>\mathbb{E}Z+\delta)\leq 2e^{-C\min\left\{\frac{k^{2}\delta^{2}}{l},k\delta\right\}},\quad\Pr(Z<\mathbb{E}Z-\delta)\leq 2e^{-\frac{Ck^{2}\delta^{2}}{l}}. (47)

Note that $\Pr(Z>\mathbb{E}Z+\delta)=0$ for $\delta\geq 1-\mathbb{E}Z$. In this case, the first inequality above holds trivially. The tail bound (47) implies that

\Pr(|Z-\mathbb{E}Z|>\delta)\leq 4e^{-Ck\delta^{2}}, (48)
\Pr(Z>\mathbb{E}Z+2\delta\sqrt{1+\mathbb{E}Z}+\delta^{2})\leq 2e^{-Ck\delta^{2}},\quad\forall\delta>0. (49)

Furthermore, for $0<\delta<\sqrt{1+\mathbb{E}Z}$,

\Pr(Z<\mathbb{E}Z-2\delta\sqrt{1+\mathbb{E}Z}+\delta^{2})\leq 2e^{-Ck\delta^{2}}. (50)

(49) and (50) imply that

\Pr(|\sqrt{1+Z}-\sqrt{1+\mathbb{E}Z}|>\delta)\leq 4e^{-Ck\delta^{2}}. (51)

Similarly,

\Pr(|\sqrt{1-Z}-\sqrt{1-\mathbb{E}Z}|>\delta)\leq 4e^{-Ck\delta^{2}}. (52)

3.2.1 Pointwise approximation: Proof of Theorem 5

Let $Z_{2},Z_{3},Z_{4},\ldots,Z_{2n}$ be independent random variables with distribution

Z_{i}\sim\begin{cases}B\big(a+(2n-i)\beta/4,\,b+(2n-i)\beta/4\big), & \text{even }i\\ B\big(a+b+(2n-1-i)\beta/4,\,(2n+1-i)\beta/4\big), & \text{odd }i\end{cases} (53)

so that

\mathbb{E}Z_{i}=\frac{1}{a+b+(n-i/2)\beta}\times\begin{cases}b-a, & \text{even }i\\ \beta/2-a-b, & \text{odd }i.\end{cases} (54)

Let $Z_{1}:=-1$.

Let $\mathbf{J}_{i,j}$ be the element in the $i$th row and $j$th column of a real symmetric $n\times n$ tridiagonal random matrix $\mathbf{J}$. The diagonal and subdiagonal matrix elements are, respectively,

\mathbf{J}_{i,i}=(1-Z_{2i-1})Z_{2i}-(1+Z_{2i-1})Z_{2i-2},\quad\mathbf{J}_{i+1,i}=\sqrt{(1-Z_{2i-1})(1-Z_{2i}^{2})(1+Z_{2i+1})}. (55)

The joint eigenvalue distribution of $\mathbf{J}/2$ is the Jacobi ensemble (Definition 2) [8].

Let $\mathbf{J}^{\prime}$ be a real symmetric $n\times n$ tridiagonal deterministic matrix, whose matrix elements are obtained by replacing every random variable $Z_{i}$ in (55) by $\mathbb{E}Z_{i}$, i.e.,

\mathbf{J}^{\prime}_{i,i}=(1-\mathbb{E}Z_{2i-1})\mathbb{E}Z_{2i}-(1+\mathbb{E}Z_{2i-1})\mathbb{E}Z_{2i-2}, (56)
\mathbf{J}^{\prime}_{i+1,i}=\sqrt{(1-\mathbb{E}Z_{2i-1})\big(1-(\mathbb{E}Z_{2i})^{2}\big)(1+\mathbb{E}Z_{2i+1})}. (57)

The eigenvalues of $\mathbf{J}^{\prime}/2$ are the zeros of the Jacobi polynomial $P_{n}^{2a/\beta-1,2b/\beta-1}(y)$ [2].
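As in the Laguerre case, this correspondence can be verified numerically; a sketch with arbitrary parameter values (here $Z_{0}$ is a dummy value, since it is multiplied by $1+Z_{1}=0$):

```python
import numpy as np
from scipy.special import roots_jacobi

n, beta, a, b = 4, 2.0, 3.0, 5.0
EZ = {0: 0.0, 1: -1.0}                       # Z_1 := -1; Z_0 is a dummy
for i in range(2, 2 * n + 1):                # expectation values from (54)
    num = (b - a) if i % 2 == 0 else (beta / 2 - a - b)
    EZ[i] = num / (a + b + (n - i / 2) * beta)
# build J' from (56) and (57)
diag = [(1 - EZ[2 * i - 1]) * EZ[2 * i] - (1 + EZ[2 * i - 1]) * EZ[2 * i - 2]
        for i in range(1, n + 1)]
off = [np.sqrt((1 - EZ[2 * i - 1]) * (1 - EZ[2 * i] ** 2) * (1 + EZ[2 * i + 1]))
       for i in range(1, n)]
Jp = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
eig = np.sort(np.linalg.eigvalsh(Jp)) / 2    # eigenvalues of J'/2
zeros = np.sort(roots_jacobi(n, 2 * a / beta - 1, 2 * b / beta - 1)[0])
assert np.allclose(eig, zeros)
```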

Let $\mathbf{J}_{1,0}=\mathbf{J}^{\prime}_{1,0}=\mathbf{J}_{n+1,n}=\mathbf{J}^{\prime}_{n+1,n}:=0$. Using (48), (51), (52) and since

\max_{1\leq i\leq n}|\mu_{i}-y_{i}|\leq\|\mathbf{J}-\mathbf{J}^{\prime}\|/2\leq\max_{1\leq i\leq n}\{|\mathbf{J}_{i,i-1}-\mathbf{J}^{\prime}_{i,i-1}|+|\mathbf{J}_{i,i}-\mathbf{J}^{\prime}_{i,i}|+|\mathbf{J}_{i+1,i}-\mathbf{J}^{\prime}_{i+1,i}|\}/2, (58)

it suffices to show that

|\mathbf{J}_{i,i}-\mathbf{J}^{\prime}_{i,i}|\leq C\epsilon,\quad\forall i, (59)
|\mathbf{J}_{i+1,i}-\mathbf{J}^{\prime}_{i+1,i}|\leq C\epsilon,\quad\forall i (60)

under the assumptions that

|Z_{i}-\mathbb{E}Z_{i}|\leq\epsilon,\quad\forall i, (61)
|\sqrt{1+Z_{i}}-\sqrt{1+\mathbb{E}Z_{i}}|\leq\epsilon,\quad|\sqrt{1-Z_{i}}-\sqrt{1-\mathbb{E}Z_{i}}|\leq\epsilon,\quad\forall i. (62)

(59) follows from (61) and Lemma 1 with $m=2$. (60) follows from (62) and Lemma 1 with $m=4$.

3.2.2 Moments: Proof of Theorem 6

Since $N=O(\max\{a+b,n\})$, it suffices to prove that

\Pr(|M^{\textnormal{J}}_{1}-\mathbb{E}M^{\textnormal{J}}_{1}|>\epsilon)=O(e^{-\Omega(a+b)n\epsilon^{2}}), (63)
\Pr(|M^{\textnormal{J}}_{1}-\mathbb{E}M^{\textnormal{J}}_{1}|>\epsilon)=O(e^{-\Omega(n^{2}\epsilon^{2})}), (64)
\Pr(|M^{\textnormal{J}}_{2}-\mathbb{E}M^{\textnormal{J}}_{2}|>\epsilon)=O(e^{-\Omega(a+b)\epsilon\min\{N\epsilon,n\}}), (65)
\Pr(|M^{\textnormal{J}}_{2}-\mathbb{E}M^{\textnormal{J}}_{2}|>\epsilon)=O(e^{-\Omega(n^{2}\epsilon^{2})}). (66)

We follow the proof of Theorem 5 and use the same notation. We have proved that

\Pr(|\mathbf{J}_{i,i}-\mathbf{J}^{\prime}_{i,i}|>\delta)=O(e^{-\Omega(a+b)\delta^{2}}),\quad\forall i, (67)
\Pr(|\mathbf{J}_{i+1,i}-\mathbf{J}^{\prime}_{i+1,i}|>\delta)=O(e^{-\Omega(a+b)\delta^{2}}),\quad\forall i. (68)

Let $I_{n}$ be the identity matrix of order $n$. A straightforward calculation using (55) yields

M^{\textnormal{J}}_{1}=\frac{1}{n}\operatorname{tr}\frac{\mathbf{J}}{2}=\frac{1}{2n}\sum_{i=1}^{n}\mathbf{J}_{i,i}=\frac{1}{2n}\left(Z_{2n}-\sum_{i=2}^{2n}Z_{i-1}Z_{i}\right), (69)
M^{\textnormal{J}}_{2}=\frac{1}{n}\operatorname{tr}\big((\mathbf{J}/2-Y_{1}I_{n})^{2}\big)=\frac{1}{n}\sum_{i=1}^{n}(\mathbf{J}_{i,i}/2-Y_{1})^{2}+\frac{1}{2n}\sum_{i=1}^{n-1}\mathbf{J}_{i+1,i}^{2} (70)
=Y_{1}^{2}-2Y_{1}M^{\textnormal{J}}_{1}+\frac{1}{2}+\frac{2Z_{2n-1}(1-Z_{2n}^{2})+Z_{2}^{2}+Z_{2n}^{2}}{4n}+M^{\prime}, (71)

where

M^{\prime}:=\frac{1}{4n}\sum_{i=3}^{2n}\big(2Z_{i-2}(Z_{i-1}^{2}-1)Z_{i}+Z_{i-1}^{2}Z_{i}^{2}\big). (72)

We will use the Chernoff bound multiple times.

Lemma 2.

Let $W_{1},W_{2},\ldots,W_{n}$ be independent real-valued random variables such that

\mathbb{E}W_{i}=0,\quad\Pr(|W_{i}|>x)=O\left(e^{-\min\left\{\frac{x}{r},\frac{x^{2}}{s^{2}}\right\}}\right),\quad\forall i (73)

for some $r,s>0$. Then,

\Pr\left(\left|\frac{1}{n}\sum_{i=1}^{n}W_{i}\right|>\delta\right)=O\left(e^{-\Omega(n)\min\left\{\frac{\delta}{r},\frac{\delta^{2}}{r^{2}+s^{2}}\right\}}\right). (74)

Each $W_{i}$ is a subexponential random variable in that its probability distribution satisfies (73). Thus, Lemma 2 is the Chernoff bound for subexponential random variables. For $r=0^{+}$, $W_{i}$ becomes a sub-Gaussian random variable, and Lemma 2 reduces to the Chernoff bound for sub-Gaussian random variables.
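A small simulation illustrating Lemma 2 (a sketch: centered $\mathrm{Exp}(1)$ variables are subexponential with $r,s=O(1)$, so the deviation probability of the sample mean should be very small; the numbers below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, delta = 100, 20_000, 0.5
W = rng.exponential(1.0, size=(trials, n)) - 1.0   # centered, subexponential
emp = np.mean(np.abs(W.mean(axis=1)) > delta)      # empirical deviation prob.
assert emp < 1e-2    # decays like e^{-Omega(n) delta^2}; essentially zero here
```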

Proof of Lemma 2.

The tail bound (73) implies that for any $j>0$,

\mathbb{E}(|W_{i}|^{j})=\int_{0}^{\infty}\Pr(|W_{i}|^{j}>x)\,\mathrm{d}x=\int_{0}^{\infty}jx^{j-1}\Pr(|W_{i}|>x)\,\mathrm{d}x=\int_{0}^{\infty}jx^{j-1}O(e^{-x/r}+e^{-x^{2}/s^{2}})\,\mathrm{d}x=O\big(r^{j}\Gamma(j+1)+s^{j}\Gamma(j/2+1)\big). (75)

Let $t$ be such that $0<t\leq 1/(2r)$. Since $\mathbb{E}W_{i}=0$,

\mathbb{E}e^{tW_{i}}=1+\sum_{j=2}^{\infty}\frac{t^{j}\mathbb{E}(W_{i}^{j})}{j!}=1+\sum_{j=2}^{\infty}O\left((rt)^{j}+\frac{(st)^{j}\Gamma(j/2+1)}{j!}\right). (76)

Using $(st)^{j}\leq(st)^{j-1}+(st)^{j+1}$ for odd $j$,

\mathbb{E}e^{tW_{i}}=1+\frac{O(rt)^{2}}{1-rt}+\sum_{j=1}^{\infty}(st)^{2j}O\left(\frac{\Gamma(j+1/2)}{(2j-1)!}+\frac{j!}{(2j)!}+\frac{\Gamma(j+3/2)}{(2j+1)!}\right)=1+O(rt)^{2}+O(1)\sum_{j=1}^{\infty}\frac{(st)^{2j}}{j!}\leq e^{c(r^{2}+s^{2})t^{2}}, (77)

where $c>0$ is a constant. Recall the standard Chernoff argument:

\Pr\left(\frac{1}{n}\sum_{i=1}^{n}W_{i}>\delta\right)=\Pr(e^{t\sum_{i=1}^{n}W_{i}}>e^{nt\delta})\leq e^{-nt\delta}\mathbb{E}e^{t\sum_{i=1}^{n}W_{i}}=\prod_{i=1}^{n}\mathbb{E}e^{tW_{i}-t\delta}. (78)

If $\delta\leq c(r^{2}+s^{2})/r$, we choose $t=\frac{\delta}{2c(r^{2}+s^{2})}$ so that

\mathbb{E}e^{tW_{i}-t\delta}\leq e^{-\frac{\delta^{2}}{4c(r^{2}+s^{2})}}. (79)

If $\delta>c(r^{2}+s^{2})/r$, we choose $t=1/(2r)$ so that

\mathbb{E}e^{tW_{i}-t\delta}\leq e^{\frac{c(r^{2}+s^{2})}{4r^{2}}-\frac{\delta}{2r}}\leq e^{-\frac{\delta}{4r}}. (80)

We complete the proof by combining these two cases. ∎

Lemma 3.

Let $W_{1},W_{2},\ldots,W_{n}$ be independent random variables on the interval $[-1,1]$ such that

\mathbb{E}W_{i}=0,\quad\Pr(|W_{i}|>x)=O\left(e^{-\min\left\{(r+is)x,\frac{(r+is)^{3}x^{2}}{r^{2}}\right\}}\right),\quad\forall i (81)

for some $r,s=\Omega(1)$. Then,

\Pr\left(\left|\frac{1}{n}\sum_{i=1}^{n}W_{i}\right|>\delta\right)=O(e^{-\Omega(n^{2}\delta^{2})}). (82)
Proof.

For $i\geq(2t-r)/s$, by replacing (73) with (81), (77) implies that

\mathbb{E}e^{tW_{i}}=e^{\frac{O(t^{2})}{(r+is)^{2}}+\frac{O(r^{2}t^{2})}{(r+is)^{3}}}. (83)

Since $|W_{i}|\leq 1$, we trivially have

\mathbb{E}e^{tW_{i}}\leq e^{t}. (84)

The Chernoff argument (78) implies that

\Pr\left(\frac{1}{n}\sum_{i=1}^{n}W_{i}>\delta\right)\leq e^{-nt\delta}\prod_{i=1}^{n}\mathbb{E}e^{tW_{i}}\leq e^{-nt\delta}\prod_{1\leq i<\frac{2t-r}{s}}e^{t}\times\prod_{\max\left\{\frac{2t-r}{s},1\right\}\leq i\leq n}e^{\frac{O(t^{2})}{(r+is)^{2}}+\frac{O(r^{2}t^{2})}{(r+is)^{3}}}\leq e^{-nt\delta+O(t^{2})+\sum_{i=1}^{\infty}\left(\frac{O(t^{2})}{(r+is)^{2}}+\frac{O(r^{2}t^{2})}{(r+is)^{3}}\right)}=e^{O(t^{2})-nt\delta}. (85)

We complete the proof by choosing $t=c^{\prime}n\delta$ for a sufficiently small constant $c^{\prime}>0$. ∎

Proof of Eq. (63).

Using Eq. (67) and Lemma 2 with $r=0^{+}$,

\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{even}~i}(\mathbf{J}_{i,i}-\mathbf{J}^{\prime}_{i,i})\right|>\epsilon\right)=O(e^{-\Omega(a+b)n\epsilon^{2}}), (86)
\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{odd}~i}(\mathbf{J}_{i,i}-\mathbf{J}^{\prime}_{i,i})\right|>\epsilon\right)=O(e^{-\Omega(a+b)n\epsilon^{2}}). (87)

Then, Eq. (63) follows from Eq. (69) and the union bound. ∎

Proof of Eq. (64).

The tail bound (48) implies that

Pr(|Zi𝔼Zi|>δ)=O(eΩ(a+b+(ni/2)β)δ2)\Pr(|Z_{i}-\operatorname*{\mathbb{E}}Z_{i}|>\delta)=O(e^{-\Omega(a+b+(n-i/2)\beta)\delta^{2}}) (88)

so that

Pr(|Zi1Zi𝔼Zi1𝔼Zi|>δ)=O(eΩ(a+b+(ni/2)β)δmin{(a+b+(ni/2)β)2δ(a+b)2,1}).\Pr(|Z_{i-1}Z_{i}-\operatorname*{\mathbb{E}}Z_{i-1}\cdot\operatorname*{\mathbb{E}}Z_{i}|>\delta)=O\left(e^{-\Omega(a+b+(n-i/2)\beta)\delta\min\left\{\frac{(a+b+(n-i/2)\beta)^{2}\delta}{(a+b)^{2}},1\right\}}\right). (89)

Using Lemma 3,

Pr(|1neveni(Zi1Zi𝔼Zi1𝔼Zi)|>ϵ)=O(eΩ(n2ϵ2)),\displaystyle\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{even}~{}i}(Z_{i-1}Z_{i}-\operatorname*{\mathbb{E}}Z_{i-1}\cdot\operatorname*{\mathbb{E}}Z_{i})\right|>\epsilon\right)=O(e^{-\Omega(n^{2}\epsilon^{2})}), (90)
Pr(|1noddi(Zi1Zi𝔼Zi1𝔼Zi)|>ϵ)=O(eΩ(n2ϵ2)).\displaystyle\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{odd}~{}i}(Z_{i-1}Z_{i}-\operatorname*{\mathbb{E}}Z_{i-1}\cdot\operatorname*{\mathbb{E}}Z_{i})\right|>\epsilon\right)=O(e^{-\Omega(n^{2}\epsilon^{2})}). (91)

Then, Eq. (64) follows from Eq. (69) and the union bound. ∎

Proof of Eq. (65).

Equations (54)–(57) imply that

|𝐉i,i/2Y1|=O(n/N),|𝐉i+1,i|=O(n/N),i,\displaystyle|\mathbf{J}^{\prime}_{i,i}/2-Y_{1}|=O(n/N),\quad|\mathbf{J}^{\prime}_{i+1,i}|=O(\sqrt{n/N}),\quad\forall i, (92)
|𝔼((𝐉i,i/2Y1)2)(𝐉i,i/2Y1)2|=O(1)a+b,|𝔼(𝐉i+1,i2)𝐉i+1,i2|=O(n)(a+b)N,i.\displaystyle|\operatorname*{\mathbb{E}}((\mathbf{J}_{i,i}/2-Y_{1})^{2})-(\mathbf{J}^{\prime}_{i,i}/2-Y_{1})^{2}|=\frac{O(1)}{a+b},\quad|\operatorname*{\mathbb{E}}(\mathbf{J}^{2}_{i+1,i})-\mathbf{J}^{\prime 2}_{i+1,i}|=\frac{O(n)}{(a+b)N},\quad\forall i. (93)

Equations (67), (68), (92) imply that

Pr(|(𝐉i,i/2Y1)2(𝐉i,i/2Y1)2|>δ)=O(eΩ(a+b)δmin{N2δ/n2,1}),i,\displaystyle\Pr\big(|(\mathbf{J}_{i,i}/2-Y_{1})^{2}-(\mathbf{J}^{\prime}_{i,i}/2-Y_{1})^{2}|>\delta\big)=O(e^{-\Omega(a+b)\delta\min\{N^{2}\delta/n^{2},1\}}),\quad\forall i, (94)
Pr(|𝐉i+1,i2𝐉i+1,i2|>δ)=O(eΩ(a+b)δmin{Nδ/n,1}),i.\displaystyle\Pr(|\mathbf{J}_{i+1,i}^{2}-\mathbf{J}^{\prime 2}_{i+1,i}|>\delta)=O(e^{-\Omega(a+b)\delta\min\{N\delta/n,1\}}),\quad\forall i. (95)

Using Eq. (93),

Pr(|(𝐉i,i/2Y1)2𝔼((𝐉i,i/2Y1)2)|>δ)=O(eΩ(a+b)δmin{N2δ/n2,1}),i,\displaystyle\Pr\big(|(\mathbf{J}_{i,i}/2-Y_{1})^{2}-\operatorname*{\mathbb{E}}((\mathbf{J}_{i,i}/2-Y_{1})^{2})|>\delta\big)=O(e^{-\Omega(a+b)\delta\min\{N^{2}\delta/n^{2},1\}}),\quad\forall i, (96)
Pr(|𝐉i+1,i2𝔼(𝐉i+1,i2)|>δ)=O(eΩ(a+b)δmin{Nδ/n,1}),i.\displaystyle\Pr\big(|\mathbf{J}_{i+1,i}^{2}-\operatorname*{\mathbb{E}}(\mathbf{J}^{2}_{i+1,i})|>\delta\big)=O(e^{-\Omega(a+b)\delta\min\{N\delta/n,1\}}),\quad\forall i. (97)

Using Lemma 2,

Pr(|1neveni((𝐉i,i/2Y1)2𝔼((𝐉i,i/2Y1)2))|>ϵ)=O(eΩ(a+b)ϵmin{Nϵ,n}),\displaystyle\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{even}~{}i}\big{(}(\mathbf{J}_{i,i}/2-Y_{1})^{2}-\operatorname*{\mathbb{E}}((\mathbf{J}_{i,i}/2-Y_{1})^{2})\big{)}\right|>\epsilon\right)=O(e^{-\Omega(a+b)\epsilon\min\{N\epsilon,n\}}), (98)
Pr(|1noddi((𝐉i,i/2Y1)2𝔼((𝐉i,i/2Y1)2))|>ϵ)=O(eΩ(a+b)ϵmin{Nϵ,n}),\displaystyle\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{odd}~{}i}\big{(}(\mathbf{J}_{i,i}/2-Y_{1})^{2}-\operatorname*{\mathbb{E}}((\mathbf{J}_{i,i}/2-Y_{1})^{2})\big{)}\right|>\epsilon\right)=O(e^{-\Omega(a+b)\epsilon\min\{N\epsilon,n\}}), (99)
Pr(|1neveni(𝐉i+1,i2𝔼(𝐉i+1,i2))|>ϵ)=O(eΩ(a+b)ϵmin{Nϵ,n}),\displaystyle\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{even}~{}i}\big{(}\mathbf{J}_{i+1,i}^{2}-\operatorname*{\mathbb{E}}(\mathbf{J}^{2}_{i+1,i})\big{)}\right|>\epsilon\right)=O(e^{-\Omega(a+b)\epsilon\min\{N\epsilon,n\}}), (100)
Pr(|1noddi(𝐉i+1,i2𝔼(𝐉i+1,i2))|>ϵ)=O(eΩ(a+b)ϵmin{Nϵ,n}).\displaystyle\Pr\left(\left|\frac{1}{n}\sum_{\textnormal{odd}~{}i}\big{(}\mathbf{J}_{i+1,i}^{2}-\operatorname*{\mathbb{E}}(\mathbf{J}^{2}_{i+1,i})\big{)}\right|>\epsilon\right)=O(e^{-\Omega(a+b)\epsilon\min\{N\epsilon,n\}}). (101)

Then, Eq. (65) follows from Eq. (70) and the union bound. ∎

Proof of Eq. (66).

The tail bound (88) implies that

Pr(|2Zi2(Zi121)Zi+Zi12Zi2𝔼(2Zi2(Zi121)Zi+Zi12Zi2)|>δ)=O(eΩ(a+b+(ni/2)β)δmin{(a+b+(ni/2)β)2δ(a+b)2,1}).\Pr\big(|2Z_{i-2}(Z_{i-1}^{2}-1)Z_{i}+Z_{i-1}^{2}Z_{i}^{2}-\operatorname*{\mathbb{E}}(2Z_{i-2}(Z_{i-1}^{2}-1)Z_{i}+Z_{i-1}^{2}Z_{i}^{2})|>\delta\big)\\ =O\left(e^{-\Omega(a+b+(n-i/2)\beta)\delta\min\left\{\frac{(a+b+(n-i/2)\beta)^{2}\delta}{(a+b)^{2}},1\right\}}\right). (102)

Recall the definition (72) of MM^{\prime}. It can be proved in the same way as Eq. (64) that

Pr(|M𝔼M|>ϵ)=O(eΩ(n2ϵ2)).\Pr(|M^{\prime}-\operatorname*{\mathbb{E}}M^{\prime}|>\epsilon)=O(e^{-\Omega(n^{2}\epsilon^{2})}). (103)

Equation (66) follows from Eqs. (64), (71), (103) and the union bound. ∎

Acknowledgments

This material is based upon work supported by the U.S. Department of Energy, Office of Science, National Quantum Information Science Research Centers, Quantum Systems Accelerator. AWH was also supported by NSF grants CCF-1729369 and PHY-1818914 and NTT (Grant AGMT DTD 9/24/20).

Appendix A Proof of Eq. (25)

We write the Jacobi polynomial (13) as

Pnp,q(y)=Γ(p+q+2n+1)2nn!Γ(p+q+n+1)(yn+j=0n1cjyj).P_{n}^{p,q}(y)=\frac{\Gamma(p+q+2n+1)}{2^{n}n!\Gamma(p+q+n+1)}\left(y^{n}+\sum_{j=0}^{n-1}c_{j}y^{j}\right). (104)

Let p=2a/β1p=2a/\beta-1 and q=2b/β1q=2b/\beta-1. From direct calculation we find that

cn1\displaystyle c_{n-1} =2n(p+n)p+q+2nn=n(ab)N,\displaystyle=\frac{2n(p+n)}{p+q+2n}-n=\frac{n(a-b)}{N}, (105)
cn2\displaystyle c_{n-2} =n(n1)(122(p+n)p+q+2n+2(p+n)(p+n1)(p+q+2n)(p+q+2n1))\displaystyle=n(n-1)\left(\frac{1}{2}-\frac{2(p+n)}{p+q+2n}+\frac{2(p+n)(p+n-1)}{(p+q+2n)(p+q+2n-1)}\right)
=n(n1)(2(ab)2βN)2N(2Nβ).\displaystyle=\frac{n(n-1)\big{(}2(a-b)^{2}-\beta N\big{)}}{2N(2N-\beta)}. (106)

Hence,

Y1\displaystyle Y_{1} =cn1/n=(ba)/N,\displaystyle=-c_{n-1}/n=(b-a)/N, (107)
Y2\displaystyle Y_{2} =Y12+1nj=1nyj2=1n(j=1nyj)2Y121njkyjyk=(n1)Y122cn2n\displaystyle=-Y_{1}^{2}+\frac{1}{n}\sum_{j=1}^{n}y_{j}^{2}=\frac{1}{n}\left(\sum_{j=1}^{n}y_{j}\right)^{2}-Y_{1}^{2}-\frac{1}{n}\sum_{j\neq k}y_{j}y_{k}=(n-1)Y_{1}^{2}-\frac{2c_{n-2}}{n}
=β(n1)(1Y12)/(2Nβ).\displaystyle=\beta(n-1)(1-Y_{1}^{2})/(2N-\beta). (108)
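The coefficient formulas (105) and (106) can be spot-checked numerically. The sketch below expands Pnp,qP_{n}^{p,q} via the standard binomial sum Pnp,q(y)=s=0n(n+pns)(n+qs)(y12)s(y+12)nsP_{n}^{p,q}(y)=\sum_{s=0}^{n}\binom{n+p}{n-s}\binom{n+q}{s}\left(\frac{y-1}{2}\right)^{s}\left(\frac{y+1}{2}\right)^{n-s}, rescales it to be monic, and compares cn1c_{n-1} and cn2c_{n-2} with the closed forms above. Integer p,qp,q are assumed so that exact rational arithmetic applies; the test values n=5, p=2, q=3 are arbitrary.

```python
from fractions import Fraction
from math import comb

def poly_mul(a, b):
    # Multiply two polynomials given as coefficient lists (lowest degree first).
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def jacobi_coeffs(n, p, q):
    # P_n^{p,q}(y) = sum_s C(n+p, n-s) C(n+q, s) ((y-1)/2)^s ((y+1)/2)^{n-s}
    total = [Fraction(0)] * (n + 1)
    for s in range(n + 1):
        term = [Fraction(comb(n + p, n - s) * comb(n + q, s))]
        for _ in range(s):
            term = poly_mul(term, [Fraction(-1, 2), Fraction(1, 2)])
        for _ in range(n - s):
            term = poly_mul(term, [Fraction(1, 2), Fraction(1, 2)])
        for k, c in enumerate(term):
            total[k] += c
    return total

n, p, q = 5, 2, 3                      # arbitrary integer test values
coeffs = jacobi_coeffs(n, p, q)
monic = [c / coeffs[n] for c in coeffs]  # divide out the leading coefficient
c_n1 = Fraction(2 * n * (p + n), p + q + 2 * n) - n
c_n2 = (Fraction(n * (n - 1)) *
        (Fraction(1, 2) - Fraction(2 * (p + n), p + q + 2 * n)
         + Fraction(2 * (p + n) * (p + n - 1),
                    (p + q + 2 * n) * (p + q + 2 * n - 1))))
assert monic[n - 1] == c_n1 and monic[n - 2] == c_n2
```

Exact rational arithmetic makes the comparison an identity check rather than a floating-point approximation.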

Appendix B Moments of the Hermite ensemble

Fact 1 and Theorem 6 concern the moments of the Laguerre and Jacobi ensembles, respectively. For the Hermite ensemble, it is simple to calculate the distributions of the first and second moments exactly. The results are presented here for completeness.

Definition 3 (Hermite ensemble).

The probability density function of the β\beta-Hermite ensemble is

fHerm(ν1,ν2,,νn)1i<jn|νiνj|βi=1neνi2/2.f_{\textnormal{Herm}}(\nu_{1},\nu_{2},\ldots,\nu_{n})\propto{\prod_{1\leq i<j\leq n}|\nu_{i}-\nu_{j}|^{\beta}}\prod_{i=1}^{n}e^{-\nu_{i}^{2}/2}. (109)

For β=1,2,4\beta=1,2,4, the Hermite ensemble gives the probability density function of the eigenvalues of an n×nn\times n self-adjoint matrix whose entries are real, complex, or quaternionic Gaussian random variables, respectively.

Let

M1H:=1ni=1nνi,M2H:=1ni=1n(νi𝔼M1H)2=1ni=1nνi2M_{1}^{\textnormal{H}}:=\frac{1}{n}\sum_{i=1}^{n}\nu_{i},\quad M_{2}^{\textnormal{H}}:=\frac{1}{n}\sum_{i=1}^{n}(\nu_{i}-\operatorname*{\mathbb{E}}M_{1}^{\textnormal{H}})^{2}=\frac{1}{n}\sum_{i=1}^{n}\nu_{i}^{2} (110)

be the first and second moments of the Hermite ensemble, where we used the fact that 𝔼M1H=0\operatorname*{\mathbb{E}}M_{1}^{\textnormal{H}}=0.

Fact 2.

M1HM_{1}^{\textnormal{H}} is distributed as 𝒩(0,1/n)\mathcal{N}(0,1/n), where 𝒩(0,σ2)\mathcal{N}(0,\sigma^{2}) denotes the normal distribution with mean 0 and variance σ2\sigma^{2}. M2HM_{2}^{\textnormal{H}} is distributed as 1nχn+βn(n1)/22\frac{1}{n}\chi_{n+\beta n(n-1)/2}^{2}.

Proof.

Let g1,g2,,gn,Xβ,X2β,,X(n1)βg_{1},g_{2},\ldots,g_{n},X_{\beta},X_{2\beta},\ldots,X_{(n-1)\beta} be independent random variables with

gi𝒩(0,1),Xk2χk2,Xk0.g_{i}\sim\mathcal{N}(0,1),\quad X_{k}^{2}\sim\chi_{k}^{2},\quad X_{k}\geq 0. (111)

The eigenvalues of the real symmetric n×nn\times n tridiagonal random matrix

𝐇=12(2g1XβXβ2g2X2βX2β2g3X3βX(n2)β2gn1X(n1)βX(n1)β2gn)\mathbf{H}=\frac{1}{\sqrt{2}}\begin{pmatrix}\sqrt{2}g_{1}&X_{\beta}\\ X_{\beta}&\sqrt{2}g_{2}&X_{2\beta}\\ &X_{2\beta}&\sqrt{2}g_{3}&X_{3\beta}\\ &&\ddots&\ddots&\ddots\\ &&&X_{(n-2)\beta}&\sqrt{2}g_{n-1}&X_{(n-1)\beta}\\ &&&&X_{(n-1)\beta}&\sqrt{2}g_{n}\end{pmatrix} (112)

are distributed according to fHermf_{\text{Herm}} [11] so that

M1H1ntr𝐇=1ni=1ngi𝒩(0,1/n),\displaystyle M_{1}^{\textnormal{H}}\sim\frac{1}{n}\tr\mathbf{H}=\frac{1}{n}\sum_{i=1}^{n}g_{i}\sim\mathcal{N}(0,1/n), (113)
M2H1ntr(𝐇2)=1ni=1ngi2+1ni=1n1Xiβ21nχn+βn(n1)/22.\displaystyle M_{2}^{\textnormal{H}}\sim\frac{1}{n}\tr(\mathbf{H}^{2})=\frac{1}{n}\sum_{i=1}^{n}g_{i}^{2}+\frac{1}{n}\sum_{i=1}^{n-1}X_{i\beta}^{2}\sim\frac{1}{n}\chi_{n+\beta n(n-1)/2}^{2}. (114)
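Fact 2 is easy to spot-check by Monte Carlo: sample the tridiagonal model (112) and compare the empirical statistics of M1HM_{1}^{\textnormal{H}} and M2HM_{2}^{\textnormal{H}} with the stated distributions. A minimal sketch (the sample count and tolerances are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, trials = 5, 2, 2000
m1, m2 = [], []
for _ in range(trials):
    g = rng.standard_normal(n)                           # diagonal of H is g_i
    x = np.sqrt(rng.chisquare(beta * np.arange(1, n)))   # chi variables X_{i*beta}
    H = np.diag(g) + np.diag(x / np.sqrt(2), 1) + np.diag(x / np.sqrt(2), -1)
    ev = np.linalg.eigvalsh(H)
    m1.append(ev.mean())           # M_1^H = tr(H)/n
    m2.append((ev ** 2).mean())    # M_2^H = tr(H^2)/n
# Fact 2: M_1^H ~ N(0, 1/n) and M_2^H ~ chi^2_{n + beta*n*(n-1)/2} / n.
assert abs(np.mean(m1)) < 0.05
assert abs(np.var(m1) - 1 / n) < 0.05
assert abs(np.mean(m2) - (n + beta * n * (n - 1) / 2) / n) < 0.2
```

The mean of M2HM_{2}^{\textnormal{H}} here is (n+βn(n1)/2)/n=1+β(n1)/2(n+\beta n(n-1)/2)/n=1+\beta(n-1)/2, matching the degrees of freedom in Eq. (114).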

References

  • [1] Holger Dette and Lorens A. Imhof “Uniform approximation of eigenvalues in Laguerre and Hermite β\beta-ensembles by roots of orthogonal polynomials” In Transactions of the American Mathematical Society 359.10, 2007, pp. 4999–5018
  • [2] Holger Dette and Jan Nagel “Some Asymptotic Properties of the Spectrum of the Jacobi Ensemble” In SIAM Journal on Mathematical Analysis 41.4, 2009, pp. 1491–1507
  • [3] Aram W. Harrow and Yichen Huang “Thermalization without eigenstate thermalization” arXiv:2209.09826
  • [4] Mourad E. H. Ismail and Xin Li “Bound on the Extreme Zeros of Orthogonal Polynomials” In Proceedings of the American Mathematical Society 115.1 American Mathematical Society, 1992, pp. 131–140
  • [5] B. Laurent and P. Massart “Adaptive estimation of a quadratic functional by model selection” In The Annals of Statistics 28.5 Institute of Mathematical Statistics, 2000, pp. 1302–1338
  • [6] Tadeusz Inglot and Teresa Ledwina “Asymptotic optimality of new adaptive test in regression model” In Annales de l’Institut Henri Poincare (B) Probability and Statistics 42.5, 2006, pp. 579–590
  • [7] Benoît Collins “Product of random projections, Jacobi ensembles and universality problems arising from free probability” In Probability Theory and Related Fields 133.3, 2005, pp. 313–344
  • [8] Rowan Killip and Irina Nenciu “Matrix models for circular ensembles” In International Mathematics Research Notices 2004.50, 2004, pp. 2665–2701
  • [9] Jan Nagel “Nonstandard limit theorems and large deviations for the Jacobi beta ensemble” In Random Matrices: Theory and Applications 3.3, 2014, pp. 1450012
  • [10] Francesco Mezzadri, Alexi K. Reynolds and Brian Winn “Moments of the eigenvalue densities and of the secular coefficients of β\beta-ensembles” In Nonlinearity 30.3 IOP Publishing, 2017, pp. 1034–1057
  • [11] Ioana Dumitriu and Alan Edelman “Matrix models for beta ensembles” In Journal of Mathematical Physics 43.11, 2002, pp. 5830–5847
  • [12] Anru R. Zhang and Yuchen Zhou “On the non-asymptotic and sharp lower tail bounds of random variables” In Stat 9.1, 2020, pp. e314