
Spiked eigenvalues of noncentral Fisher matrix with applications

(The first two authors contributed equally to this work. For correspondence, please contact Zhidong Bai and Jiang Hu.)

Xiaozhuo Zhang ([email protected]), Zhiqiang Hou ([email protected]), Zhidong Bai ([email protected]), and Jiang Hu ([email protected])

School of Mathematics and Statistics, Northeast Normal University; School of Statistics, Shandong University of Finance and Economics
Abstract

In this paper, we investigate the asymptotic behavior of the spiked eigenvalues of the noncentral Fisher matrix defined by $\mathbf{F}_p=\mathbf{C}_n(\mathbf{S}_N)^{-1}$, where $\mathbf{C}_n$ is a noncentral sample covariance matrix defined by $(\boldsymbol{\Xi}+\mathbf{X})(\boldsymbol{\Xi}+\mathbf{X})^{*}/n$ and $\mathbf{S}_N=\mathbf{Y}\mathbf{Y}^{*}/N$. The matrices $\mathbf{X}$ and $\mathbf{Y}$ are two independent Gaussian arrays of dimensions $p\times n$ and $p\times N$, respectively, whose entries are independent and identically distributed (i.i.d.) with mean $0$ and variance $1$. When $p$, $n$, and $N$ grow to infinity proportionally, we establish a phase transition for the spiked eigenvalues of $\mathbf{F}_p$. Furthermore, we derive the central limit theorem (CLT) for the spiked eigenvalues of $\mathbf{F}_p$. As a byproduct of the proof of the above results, the fluctuations of the spiked eigenvalues of $\mathbf{C}_n$ are studied, which is of independent interest. Moreover, using the results on the spiked noncentral Fisher matrix, we derive the limits and the CLT for the sample canonical correlation coefficients and propose three consistent estimators, including estimators of the population spiked eigenvalues and of the population canonical correlation coefficients.

MSC2020 subject classifications: 60F05, 60B20, 62E20, 62H20.

Keywords: Noncentral Fisher matrix, spiked eigenvalues, central limit theorem, canonical correlation analysis.


1 Introduction

The Fisher matrix is one of the most classical and important tools in multivariate statistical analysis (for details see [1], [29], and [30]). [22] provided a remarkable five-way classification of the distribution theory and introduced some representative applications, such as signal detection in noise and testing the equality of group means under an unknown covariance matrix. Among these applications, some statistics can be transformed into a Fisher matrix, while others can be studied through a noncentral Fisher matrix. It is therefore natural to study the spectral properties of both the Fisher matrix and the noncentral Fisher matrix.

Many works have focused on the Fisher matrix: [34] derived the limiting spectral distribution (LSD) of the Fisher matrix, which is the celebrated Wachter distribution; [19] proved that the largest eigenvalue of the Fisher matrix follows the Tracy-Widom (T-W) law (see [33]); [38] was devoted to the CLT for linear spectral statistics (LSS) of the Fisher matrix; and [40] studied the LSD and the CLT of LSS of the so-called general Fisher matrix. All of the above works concern the central Fisher matrix. Before introducing the noncentral Fisher matrix, it is necessary to introduce the large-dimensional information-plus-noise-type matrix

\mathbf{C}_n=\frac{1}{n}(\boldsymbol{\Xi}+\mathbf{X})(\boldsymbol{\Xi}+\mathbf{X})^{*},  (1)

where $\mathbf{X}$ is a $p\times n$ matrix containing i.i.d. entries with mean $0$ and variance $1$, and $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ is a deterministic matrix that is assumed to have an LSD. Here and subsequently, $*$ denotes the conjugate transpose, and $T$ stands for the transpose of a real matrix or vector. Many spectral properties of $\mathbf{C}_n$ have been studied in [17, 18, 14, 5]. In fact, the matrix $\mathbf{C}_n$ is a noncentral sample covariance matrix, and $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ is called the noncentral parameter matrix, whose eigenvalues are arranged in descending order:

l_{1}^{\boldsymbol{\Xi}}\geq l_{2}^{\boldsymbol{\Xi}}\geq\cdots\geq l_{p}^{\boldsymbol{\Xi}}.  (2)

However, many problems, such as signal detection in noise and testing the equality of group means under an unknown covariance matrix, involve the noncentral Fisher matrix, which is constructed from the matrix $\mathbf{C}_n$ as

\mathbf{F}_p=\mathbf{C}_n\mathbf{S}_N^{-1},  (3)

where $\mathbf{S}_N=\mathbf{Y}\mathbf{Y}^{*}/N$ and $\mathbf{Y}$ is independent of $\mathbf{X}$. The entries $\{Y_{ij},1\leq i\leq p,1\leq j\leq N\}$ are i.i.d. with mean $0$ and variance $1$. To the best of our knowledge, only a handful of works are devoted to noncentral Fisher matrices. Under the Gaussian assumption, [27] developed an approximation to the distribution of the largest eigenvalue of the noncentral Fisher matrix, and [12] derived the CLT for the LSS of the large-dimensional noncentral Fisher matrix. In this paper, we concentrate on the outlier eigenvalues of the noncentral Fisher matrix defined in (3). Specifically, we will work under the following assumption.

Assumption a: Assume that $\boldsymbol{\Xi}$ is a $p\times n$ nonrandom matrix and that the empirical spectral distribution (ESD) of $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ satisfies $H_n\overset{w}{\to}H$ ($w$ denoting weak convergence), where $H$ is a nonrandom probability measure. In addition, the eigenvalues of $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ are subject to the condition

l_{j_k+1}^{\boldsymbol{\Xi}}=l_{j_k+2}^{\boldsymbol{\Xi}}=\cdots=l_{j_k+m_k}^{\boldsymbol{\Xi}}=a_k,\quad k\in\{1,\cdots,K\},  (4)

and $a_k$ satisfies the separation condition, that is,

\min_{k\neq j}\left|\frac{a_k}{a_j}-1\right|>d,  (5)

where $d$ is a positive constant independent of $n$, and $a_k$, $k\in\{1,\cdots,K\}$, is allowed to grow at order $o(\sqrt{n})$. In addition, $\mathcal{J}_k=\{j_k+1,\cdots,j_k+m_k\}$ denotes the set of ranks of $a_k$, where $m_k$ is the multiplicity of $a_k$, satisfying $m_1+\cdots+m_K=M$, a fixed integer.

Remark 1.1.

Note that $a_k$, $k\in\{1,\cdots,K\}$, can be located in any gap between the supports of $H$, which means that the $a_k$ are not necessarily the extreme eigenvalues of $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$.

The eigenvalues $a_k$, $k\in\{1,\cdots,K\}$, are called the population spiked eigenvalues of the noncentral sample covariance matrix (1) and of the noncentral Fisher matrix (3). We call these two matrices, when they satisfy (4), the spiked noncentral sample covariance matrix and the spiked noncentral Fisher matrix, respectively. Ideally, the spiked eigenvalues $a_k$ should be allowed to diverge at any rate, but owing to the limitations of our method of proof, we have to assume that they grow at rate $o(\sqrt{n})$. In this paper, we are devoted to exploring the limiting properties of the sample spiked eigenvalues (those corresponding to $a_k$, $k\in\{1,\cdots,K\}$) of the noncentral Fisher matrix. From now on, we refer to the sample spiked eigenvalues simply as spiked eigenvalues when no confusion can arise.

To study principal component analysis (PCA), [26] proposed the spiked model for the covariance matrix. The spiked model has since been studied further and extended to various random matrices, such as the Fisher matrix, the sample canonical correlation matrix, and the separable covariance matrix; see [11, 16] for more details. The main emphasis of research on the spiked model is on the limits and the fluctuations of the spiked eigenvalues of these random matrices. [8, 31, 6, 7, 2, 13, 23] focused on the spiked sample covariance matrix, and [35, 25, 24] concentrated on central spiked Fisher matrices.

The main contribution of this paper is the establishment of the limits and fluctuations of the spiked eigenvalues of the noncentral Fisher matrix under the Gaussian population assumption. Furthermore, we apply these theoretical results to canonical correlation analysis (CCA), derive the limits and fluctuations of the sample canonical correlation coefficients, and propose three consistent estimators, including estimators of the population spiked eigenvalues and of the population canonical correlation coefficients. In addition, we study the properties of the sample spiked eigenvalues of the noncentral sample covariance matrix, which is of independent interest.

The rest of the paper is organized as follows. In Section 2, we define some notation and present the LSDs of the relevant random matrices. In Section 3, we study the limits and fluctuations of the spiked eigenvalues of the noncentral sample covariance matrix and the noncentral Fisher matrix. In Section 4, we present the limits and fluctuations of the spiked eigenvalues of the sample canonical correlation matrix, give three estimators of the population spiked eigenvalues, and conduct a real data analysis of climate and geography by CCA. To illustrate the correctness and rationality of the theorems, we design a series of simulations in Section 5. In Section 6, we summarize the main conclusions and give an outlook. Section 7 presents the technical proofs.

2 Preliminaries

In this section, we collect some notation and preliminary results and assumptions, which will be used throughout the paper. Although some of these notations have been mentioned above, we still provide their precise definitions here.

2.1 Basic notions

For any $n\times n$ matrix $\mathbf{A}_n$ with only real eigenvalues, let $F_n$ be the empirical spectral distribution (ESD) function of $\mathbf{A}_n$, that is,

F_n(x)=\frac{1}{n}\,\mathrm{Card}\{i:\lambda_i^{\mathbf{A}_n}\leq x\},

where $\lambda_i^{\mathbf{A}_n}$ denotes the $i$-th largest eigenvalue of $\mathbf{A}_n$. If $F_n$ has a limiting distribution $F$, then we call $F$ the limiting spectral distribution (LSD) of the sequence $\{\mathbf{A}_n\}$. For any function of bounded variation $G$ on the real line, its Stieltjes transform (ST) is defined by

m(z)=\int\frac{1}{\lambda-z}\,dG(\lambda),\quad z\in\mathbb{C}^{+}.
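For a discrete spectrum, the ESD and its Stieltjes transform can be evaluated directly. The following minimal sketch (our own illustration, not part of the paper) computes $m(z)$ for the ESD of a small matrix with eigenvalues $1,2,3$; since the ST maps the upper half plane to itself, the result has positive imaginary part.

```python
import numpy as np

def empirical_stieltjes(eigvals, z):
    """Stieltjes transform of the ESD supported on `eigvals`,
    evaluated at a point z in the upper half plane."""
    eigvals = np.asarray(eigvals, dtype=float)
    return np.mean(1.0 / (eigvals - z))

# ESD of a matrix with eigenvalues 1, 2, 3, evaluated at z = 2 + i.
m = empirical_stieltjes([1.0, 2.0, 3.0], 2.0 + 1.0j)   # equals 2i/3 here
```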

2.2 Symbols and Assumptions

In this paper, we study the spiked eigenvalues of the noncentral spiked sample covariance matrix $\mathbf{C}_n$ defined in (1) and the noncentral spiked Fisher matrix $\mathbf{F}_p$ defined in (3), with the matrix $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ satisfying (4). To clearly distinguish the symbols for these three matrices, we list the notation for their eigenvalues and STs in Table 1.

Table 1: Notations for $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$, $\mathbf{C}_n$, and $\mathbf{F}_p$.

Matrix                       | $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ | $\mathbf{C}_n$ | $\mathbf{F}_p$
LSD                          | $H$   | $F^{\mathbf{C}}$ | $F$
ST                           | $m_1$ | $m_2$ | $m_3$
Population spiked eigenvalue | $a_k$ | $a_k$ | $a_k$
Sample eigenvalue            | --    | $l_i^{\mathbf{C}_n}$ | $l_i$
Limit                        | --    | $\lambda_k^{\mathbf{C}}=\psi_{\mathbf{C}}(a_k)$ | $\lambda_k=\psi(a_k)=\psi_{\mathbf{F}}(\psi_{\mathbf{C}}(a_k))$

Throughout the paper, we consider the following assumptions about the high-dimensional setting and the moment conditions.
Assumption b: Assume that $p<n$ and $p<N$, with $p/n=c_{1n}\to c_1\in(0,1)$ and $p/N=c_{2N}\to c_2\in(0,1)$ as $\min(p,n,N)\to\infty$.
Assumption c: Assume that the matrices $\mathbf{X}$ and $\mathbf{Y}$ are two independent arrays of i.i.d. standard Gaussian variables $\{X_{jk}:1\leq j\leq p,1\leq k\leq n\}$ and $\{Y_{jk}:1\leq j\leq p,1\leq k\leq N\}$, respectively. If the $X_{ij}$ and $Y_{ij}$ are complex, we require $EX_{ij}^2=0$, $EY_{ij}^2=0$, $E|X_{ij}|^2=1$, and $E|Y_{ij}|^2=1$.

2.3 LSD for 𝐂n{\bf C}_{n} and 𝐅p{\bf F}_{p}

To introduce the conclusions and symbols needed in the following sections, we provide the LSDs of $\mathbf{C}_n$ and $\mathbf{F}_p$ based on the results in [17, 39]. Note that similar results were obtained in [12]. According to Theorem 1.1 of [17], the ST $m_2(z)$ of the LSD of $\mathbf{C}_n$ is the unique solution of the equation

m_2=\int\frac{dH(t)}{\frac{t}{1+c_1m_2}-(1+c_1m_2)z+1-c_1}=(1+c_1m_2)\,m_1\big[(1+c_1m_2)\big((1+c_1m_2)z-(1-c_1)\big)\big],  (6)

where $m_1(z)$ is the ST of $H$. For simplicity, we write $m_2(z)$ as $m_2$ in (6). By Theorem 2.1 of [39], the ST $m_3(z)$ of the LSD of $\mathbf{F}_p$ satisfies the following equation:

m_3=\int\frac{dH(t)}{\frac{t}{1+(c_1+c_2z)m_3}+\frac{1-c_1}{1+c_2zm_3}-\frac{z(1+(c_1+c_2z)m_3)}{1+c_2zm_3}}.  (7)

From (6.12) in [4], we have

\frac{m_3(z)}{1+c_2zm_3(z)}=m_2\big(z(1+c_2zm_3(z))\big).  (8)
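Equation (6) lends itself to numerical solution by damped fixed-point iteration. The sketch below (our own illustration, with hypothetical parameters) solves (6) for a discrete measure $H$. As a consistency check, taking $H=\delta_0$ makes $\mathbf{C}_n$ a central sample covariance matrix, and (6) collapses to the Marchenko-Pastur equation $m=1/(1-c_1-z-c_1zm)$.

```python
import numpy as np

def solve_m2(z, c1, atoms, weights, n_iter=2000, damp=0.5):
    """Solve equation (6) for m2(z) by damped fixed-point iteration,
    for a discrete measure H = sum_k weights[k] * delta_{atoms[k]}.
    The point z must lie in the upper half plane."""
    t = np.asarray(atoms, dtype=float)
    w = np.asarray(weights, dtype=float)
    m2 = -1.0 / z  # initial guess with the correct large-z behavior
    for _ in range(n_iter):
        denom = t / (1 + c1 * m2) - (1 + c1 * m2) * z + 1 - c1
        m2 = damp * np.sum(w / denom) + (1 - damp) * m2
    return m2

# Consistency check: H = delta_0 reduces (6) to the Marchenko-Pastur equation.
z, c1 = 1.0 + 1.0j, 0.5
m2 = solve_m2(z, c1, [0.0], [1.0])
residual = abs(m2 - 1.0 / (1 - c1 - z - c1 * z * m2))
```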

3 Main Results

In this section, we state our main results and briefly summarize our proof strategy. Our main results include the limits and the CLT of the spiked eigenvalues of the noncentral sample covariance matrix 𝐂n{\bf C}_{n} and the noncentral Fisher matrix 𝐅p{\bf F}_{p}.

3.1 Limits and fluctuations for the matrix 𝐂n{\bf C}_{n}

The eigenvalues of the noncentral sample covariance matrix $\mathbf{C}_n$ are arranged in descending order as

l_1^{\mathbf{C}_n}\geq l_2^{\mathbf{C}_n}\geq\cdots\geq l_p^{\mathbf{C}_n}.  (9)
Theorem 3.1.

Suppose that Assumptions a-c hold and that the population spiked eigenvalues $a_k$, $k\in\{1,\cdots,K\}$, satisfy $\psi_{\mathbf{C}}'(a_k)>0$ for $1\leq k\leq K$. Then we have

\frac{l_j^{\mathbf{C}_n}}{\psi_{\mathbf{C}}(a_k)}-1\overset{a.s.}{\longrightarrow}0,\quad j\in\mathcal{J}_k,  (10)

where

\psi_{\mathbf{C}}(a_k)=a_k\left(1-c_{1n}\int\frac{1}{t-a_k}\,dH_n(t)\right)^2+(1-c_{1n})\left(1-c_{1n}\int\frac{1}{t-a_k}\,dH_n(t)\right).  (11)
Remark 3.1.

Considering that the convergence $H_n\to H$ may be slow, we use $H_n$ and $c_{1n}$ instead of $H$ and $c_1$, respectively, in $\psi_{\mathbf{C}}(a_k)$. In the following theorems on the limits of spiked eigenvalues, we adopt the same treatment.
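As a quick numerical illustration of Theorem 3.1 (our own sketch, with hypothetical parameters), take a single spike $a$ with the bulk of $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ vanishing, i.e. $H=\delta_0$. Then $\int(t-a)^{-1}dH(t)=-1/a$ and (11) simplifies to $\psi_{\mathbf{C}}(a)=(a+c_1)(a+1)/a$, which the largest eigenvalue of a simulated $\mathbf{C}_n$ should approach:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, a = 200, 400, 4.0
c1 = p / n

# Rank-one Xi such that Xi Xi*/n has a single eigenvalue a (rest are 0).
Xi = np.zeros((p, n))
Xi[0, 0] = np.sqrt(a * n)
X = rng.standard_normal((p, n))
C = (Xi + X) @ (Xi + X).T / n

l1 = np.linalg.eigvalsh(C)[-1]   # largest sample eigenvalue
psi = (a + c1) * (a + 1) / a     # psi_C(a) for H = delta_0: equals 5.625 here
```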

Having established the limits of the sample spiked eigenvalues, we are now in a position to state their CLT.

Theorem 3.2.

Suppose that Assumptions a-c hold. Then the $m_k$-dimensional random vector

\gamma_k^{\mathbf{C}_n}=\sqrt{n}\left\{\left(\frac{l_j^{\mathbf{C}_n}}{\psi_{\mathbf{C}}(a_k)}-1\right)\frac{1}{\sqrt{\beta\theta_1}},\quad j\in\mathcal{J}_k\right\}

converges weakly to the joint distribution of the $m_k$ eigenvalues of a Gaussian random matrix $\boldsymbol{\Omega}$, where $\boldsymbol{\Omega}$ is an $m_k$-dimensional standard GOE (GUE) matrix. If the samples are real, $\beta=2$; if complex, $\beta=1$. Moreover,

\theta_1=\frac{1}{\left[\lambda_k^{\mathbf{C}}\underline{m}_2'+\frac{a_k(1+c_1m_2+c_1\lambda_k^{\mathbf{C}}m_2')}{\lambda_k^{\mathbf{C}}(1+c_1m_2)^2}\right]^2}\times\left(\underline{m}_2'+\frac{a_k^2c_1m_2'}{(\lambda_k^{\mathbf{C}})^2(1+c_1m_2)^4}+\frac{2a_k(1+\underline{m}_2+\lambda_k^{\mathbf{C}}\underline{m}_2')}{(\lambda_k^{\mathbf{C}})^2(1+c_1m_2)^2}\right),

where $m_2'$ and $\underline{m}_2'$ are the derivatives of $m_2$ and $\underline{m}_2$ at the point $\lambda_k^{\mathbf{C}}$, respectively, and

\underline{m}_2(\lambda_k^{\mathbf{C}})=-\frac{1-c_1}{\lambda_k^{\mathbf{C}}}+c_1m_2(\lambda_k^{\mathbf{C}}).  (12)
Remark 3.2.

It is worth pointing out that $\theta_1$ equals (half of) the variance of $\sqrt{n}(l_j^{\mathbf{C}_n}/\psi_{\mathbf{C}}(a_k)-1)$ when $a_k$ is a simple eigenvalue. When $a_k$ is multiple, the limiting distribution of $\sqrt{n}(l_j^{\mathbf{C}_n}/\psi_{\mathbf{C}}(a_k)-1)$ is that of the eigenvalues of a GOE (GUE) matrix whose diagonal elements have variance $2\theta_1$ ($\theta_1$). In what follows, we also call $\theta_1$ the scale parameter of the GOE (GUE) matrix.

Remark 3.3.

[15] and [10] focused on the noncentral spiked sample covariance matrix: [15] studied the limits and convergence rates of the spiked eigenvalues and eigenvectors of the noncentral sample covariance matrix, and [10] was devoted to the fluctuations of its spiked eigenvectors. Note that both works assume the finite-rank condition; specifically, the rank of $\boldsymbol{\Xi}$ is finite. Compared with [15] and [10], our assumptions on $\boldsymbol{\Xi}$ are more general.

3.2 Limits and fluctuations for the noncentral Fisher matrix 𝐅p{\bf F}_{p}

Having disposed of the noncentral spiked sample covariance matrix $\mathbf{C}_n$, we now return to the noncentral Fisher matrix $\mathbf{F}_p$, whose eigenvalues are sorted in descending order as

l_1\geq l_2\geq\cdots\geq l_p.  (13)
Theorem 3.3.

Let Assumptions a-c hold and let the noncentral Fisher matrix $\mathbf{F}_p$ be defined as in (3). If $a_k$ satisfies $\psi_{\mathbf{F}}'(\psi_{\mathbf{C}}(a_k))>0$ and $\psi_{\mathbf{C}}'(a_k)>0$ for $1\leq k\leq K$, then we have

\frac{l_j}{\psi_{\mathbf{F}}(\psi_{\mathbf{C}}(a_k))}-1\overset{a.s.}{\longrightarrow}0,\quad j\in\mathcal{J}_k,

where

\psi_{\mathbf{F}}(x)=\frac{x}{1+c_{2n}\,x\,m_2^0(x)},  (14)

where $\psi_{\mathbf{F}}'(\cdot)$ is the derivative of $\psi_{\mathbf{F}}(\cdot)$, and $m_2^0$ is defined as $m_2$ but with $c_1$, $c_2$, and $H$ replaced by $c_{1n}$, $c_{2n}$, and $H_n$, respectively.
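Theorem 3.3 can likewise be checked numerically (our own sketch, with hypothetical parameters). For $H=\delta_0$, $m_2$ is the Marchenko-Pastur Stieltjes transform, which has a closed form at a real point outside the support, and the limit of a spiked eigenvalue of $\mathbf{F}_p$ is the composition $\psi_{\mathbf{F}}(\psi_{\mathbf{C}}(a))$:

```python
import numpy as np

def mp_stieltjes(x, c):
    """Marchenko-Pastur Stieltjes transform at a real point x to the
    right of the support (branch with m -> 0 as x -> infinity)."""
    b = x - (1 - c)
    return (-b + np.sqrt(b * b - 4 * c * x)) / (2 * c * x)

rng = np.random.default_rng(0)
p, n, N, a = 200, 400, 800, 4.0
c1, c2 = p / n, p / N

psi_C = (a + c1) * (a + 1) / a                             # limit for C_n (H = delta_0)
lam = psi_C / (1 + c2 * psi_C * mp_stieltjes(psi_C, c1))   # psi_F(psi_C(a)) = 90/11

Xi = np.zeros((p, n)); Xi[0, 0] = np.sqrt(a * n)
X = rng.standard_normal((p, n))
Y = rng.standard_normal((p, N))
C = (Xi + X) @ (Xi + X).T / n
S = Y @ Y.T / N
# S^{-1} C has the same spectrum as F_p = C S^{-1}.
l1 = np.max(np.linalg.eigvals(np.linalg.solve(S, C)).real)
```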

The task is now to show the CLT for the sample spiked eigenvalues of the noncentral Fisher matrix 𝐅p{\bf F}_{p}.

Theorem 3.4.

Suppose that Assumptions a-c hold. Then the $m_k$-dimensional random vector

\gamma_k^{\mathbf{F}_p}\overset{\triangle}{=}\sqrt{n}\left\{\left(\frac{l_j-\psi_{\mathbf{F}}(\psi_{\mathbf{C}}(a_k))}{\psi_{\mathbf{F}}(\psi_{\mathbf{C}}(a_k))}\right)\frac{1}{\sqrt{\beta\theta_2}},\quad j\in\mathcal{J}_k\right\},

converges weakly to the joint distribution of the eigenvalues of a Gaussian random matrix $\boldsymbol{\Omega}$, where

\theta_2=\frac{c_2}{c_1\vartheta}+\left[\frac{1-c_2(\lambda_k^{\mathbf{C}})^2m_2'(\lambda_k^{\mathbf{C}})}{1+c_2\lambda_km_3(\lambda_k)}\right]^2\theta_1,  (15)

$\theta_1$ is defined in Theorem 3.2, and $\vartheta$ satisfies

\vartheta=1+2\lambda_kc_2m_3(\lambda_k)+c_2\lambda_k^2m_3'(\lambda_k).  (16)

4 Applications

In this section, we discuss some applications of our results: the limiting properties of the sample canonical correlation coefficients, and estimators of the population spiked eigenvalues and of the population canonical correlation coefficients. At the end, we present an experiment on a real data set of environmental variables for world countries.

4.1 Limits and fluctuations for the sample canonical correlation matrix

CCA is a general and popular method for investigating the relationship between two random vectors. Under the high-dimensional setting and the Gaussian assumption, [11] studied the limiting properties of the sample canonical correlation coefficients. Under sharp moment conditions, [36, 28] proved that the largest eigenvalue of the sample canonical correlation matrix converges to the T-W law, in order to test the independence of random vectors under two different structures. Moreover, [37] showed that the limiting distribution of the spiked eigenvalues depends on the fourth cumulants of the population distribution. Note that [3, 36, 28, 37] only considered the finite-rank case, in which the number of positive population canonical correlation coefficients between the two groups of high-dimensional Gaussian vectors is finite. In contrast, we generalize the finite-rank case to the infinite-rank case; in other words, we obtain the limits and fluctuations of the sample canonical correlation coefficients in the infinite-rank case. In the following, we introduce these limiting properties in detail.

Let $\mathbf{z}_i=(\mathbf{x}_i^T,\mathbf{y}_i^T)^T$, $i=1,\cdots,n$, be independent observations from a $(p+q)$-dimensional Gaussian distribution with mean zero and covariance matrix

\boldsymbol{\Sigma}=\begin{pmatrix}\boldsymbol{\Sigma}_{xx}&\boldsymbol{\Sigma}_{xy}\\ \boldsymbol{\Sigma}_{yx}&\boldsymbol{\Sigma}_{yy}\end{pmatrix},

where $\mathbf{x}_i$ and $\mathbf{y}_i$ are $p$-dimensional and $q$-dimensional vectors with population covariance matrices $\boldsymbol{\Sigma}_{xx}$ and $\boldsymbol{\Sigma}_{yy}$, respectively. Without loss of generality, we assume that $p\leq q$. Define the corresponding sample covariance matrix as

\boldsymbol{S}_n=\frac{1}{n}\sum_{i=1}^n\boldsymbol{z}_i\boldsymbol{z}_i^T,  (17)

which can be formed as

\boldsymbol{S}_n=\begin{pmatrix}\boldsymbol{S}_{xx}&\boldsymbol{S}_{xy}\\ \boldsymbol{S}_{yx}&\boldsymbol{S}_{yy}\end{pmatrix}=\frac{1}{n}\begin{pmatrix}\boldsymbol{X}\boldsymbol{X}^T&\boldsymbol{X}\boldsymbol{Y}^T\\ \boldsymbol{Y}\boldsymbol{X}^T&\boldsymbol{Y}\boldsymbol{Y}^T\end{pmatrix}

with

\boldsymbol{X}=(\mathbf{x}_1,\cdots,\mathbf{x}_n)_{p\times n},\quad\boldsymbol{Y}=(\mathbf{y}_1,\cdots,\mathbf{y}_n)_{q\times n}.

In the sequel, $\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}$ is called the population canonical correlation matrix, and its eigenvalues are denoted by

1>\rho_1^2\geq\rho_2^2\geq\cdots\geq\rho_p^2.  (18)

By the singular value decomposition, we have that

\boldsymbol{\Sigma}_{xx}^{-\frac{1}{2}}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-\frac{1}{2}}=\mathbf{P}_1\boldsymbol{\Lambda}\mathbf{P}_2^T,  (19)

where

\boldsymbol{\Lambda}=\begin{pmatrix}\boldsymbol{\Lambda}_{11}&\mathbf{0}_{12}\end{pmatrix},  (21)

where $\boldsymbol{\Lambda}_{11}=\mathrm{diag}(\rho_1,\rho_2,\cdots,\rho_p)$, $\mathbf{0}_{12}$ is a $p\times(q-p)$ zero matrix, and $\mathbf{P}_1$ and $\mathbf{P}_2$ are orthogonal matrices of size $p\times p$ and $q\times q$, respectively. It follows that $\rho_1^2,\rho_2^2,\cdots,\rho_p^2$ are also the eigenvalues of the diagonal matrix $\boldsymbol{\Lambda}\boldsymbol{\Lambda}^T$. According to Theorem 12.2.1 of [1], the nonnegative square roots $\rho_1,\cdots,\rho_p$ are the population canonical correlation coefficients. Correspondingly, $\mathbf{S}_{xx}^{-1}\mathbf{S}_{xy}\mathbf{S}_{yy}^{-1}\mathbf{S}_{yx}$ is called the sample canonical correlation matrix, and its eigenvalues are denoted by

\lambda_1^2\geq\lambda_2^2\geq\cdots\geq\lambda_p^2.  (22)
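The identity between the eigenvalues in (18) and the singular values in (19) is easy to verify numerically. The following self-contained sketch (our own toy example, with hypothetical covariance blocks) computes the population canonical correlations both ways:

```python
import numpy as np

def inv_sqrt(A):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

# Toy covariance blocks with p = q = 2 (hypothetical values).
Sxx = np.array([[2.0, 0.5], [0.5, 1.0]])
Syy = np.array([[1.0, 0.2], [0.2, 1.5]])
Sxy = np.array([[0.6, 0.1], [0.0, 0.3]])

# Singular values of Sigma_xx^{-1/2} Sigma_xy Sigma_yy^{-1/2} ...
rho = np.linalg.svd(inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy), compute_uv=False)

# ... agree with the square roots of the eigenvalues of
# Sigma_xx^{-1} Sigma_xy Sigma_yy^{-1} Sigma_yx.
M = np.linalg.inv(Sxx) @ Sxy @ np.linalg.inv(Syy) @ Sxy.T
rho2 = np.sort(np.sqrt(np.linalg.eigvals(M).real))[::-1]
```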

The following theorem describes the functional relation between the sample canonical correlation coefficients and the eigenvalues of a special noncentral Fisher matrix.

Theorem 4.1.

[Theorem 1 in [3]] Suppose that $\lambda_i^2$, $i=1,\cdots,p$, are the ordered eigenvalues of the sample canonical correlation matrix $\mathbf{S}_{xx}^{-1}\mathbf{S}_{xy}\mathbf{S}_{yy}^{-1}\mathbf{S}_{yx}$. Then there exists a noncentral Fisher matrix $\boldsymbol{F}(\boldsymbol{\Xi})$ whose eigenvalues $l_i$ satisfy $l_i=g(\lambda_i)\overset{\triangle}{=}\frac{(n-q)\lambda_i^2}{q(1-\lambda_i^2)}$, $i=1,\cdots,p$, where $\boldsymbol{\Xi}$ is the noncentral parameter matrix,

\boldsymbol{\Xi}=\frac{n}{q}\mathbf{T}\big(n^{-1}\widehat{\mathbf{Y}}\widehat{\mathbf{Y}}^T\big)\mathbf{T}^T,  (23)

where $\mathbf{T}\mathbf{T}^T=\mathrm{diag}\left(\rho_1^2/(1-\rho_1^2),\rho_2^2/(1-\rho_2^2),\cdots,\rho_p^2/(1-\rho_p^2)\right)$ and $\widehat{\mathbf{Y}}$ is a $p\times n$ matrix containing i.i.d. standard Gaussian entries.
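The map $g$ in Theorem 4.1 sends a squared sample canonical correlation $\lambda^2\in[0,1)$ to a Fisher-matrix eigenvalue, and its inverse is what appears as $g^{-1}$ in Theorem 4.2 below. A minimal sketch (our own, acting on $\lambda^2$ rather than $\lambda$ so the map is trivially invertible):

```python
def g(lam_sq, n, q):
    """l = (n - q) * lambda^2 / (q * (1 - lambda^2)) from Theorem 4.1."""
    return (n - q) * lam_sq / (q * (1.0 - lam_sq))

def g_inv(l, n, q):
    """Inverse map: lambda^2 = q*l / (n - q + q*l)."""
    return q * l / (n - q + q * l)

n, q = 100, 20
roundtrip = g_inv(g(0.3, n, q), n, q)   # recovers 0.3
```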

Combining Theorem 4.1 with the properties of the noncentral Fisher matrix, we obtain the limits and fluctuations of the sample canonical correlation coefficients under the following assumption.

Assumption d: The empirical spectral distribution (ESD) of $\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}$, denoted $\mathcal{H}_n$, tends to a proper probability measure $\mathcal{H}$ as $\min(p,q,n)\to\infty$. Assume that the $\rho_i^2$, $i=1,\dots,p$, are subject to the condition

\alpha_k=\rho_{m_{k-1}+1}^2=\cdots=\rho_{m_{k-1}+m_k}^2,\quad k\in\{1,\cdots,K\},  (24)

where $\alpha_k$ lies outside the support of $\mathcal{H}$ and satisfies the separation condition defined in (5). $M=\sum_{i=1}^Km_i$ is a fixed positive integer, with the convention $m_0=0$. In addition, $\alpha_1$ is allowed to approach $1$ at the order $1-o(n^{-1/2})$.

Remark 4.1.

Note that the noncentral parameter matrix (23) is random, so the assumption of multiple roots in (24) is not reasonable. In the following Assumption d', we replace the condition of multiple roots by that of simple roots. However, we believe that the results on the limits and fluctuations in the multiple-root case remain correct, and this conjecture will be supported by simulations in the following section.

Assumption d': The assumptions are the same as in Assumption d, with the additional requirement that $m_i=1$, $i\in\{1,\cdots,K\}$.

Theorem 4.2.

Under the conditions stated in Theorem 4.1, if moreover Assumption d' holds and $\alpha_k$ satisfies $\psi_{\boldsymbol{\Xi}}'(f(\alpha_k))>0$, $\Psi_{\mathbf{C}}'(\alpha_k)>0$, and $\Psi'(\alpha_k)>0$ for $1\leq k\leq K$, then we have

\frac{\lambda_i^2}{t(\alpha_k)}-1\overset{a.s.}{\longrightarrow}0,\quad i\in\mathcal{J}_k,  (25)

where

t(x)=g^{-1}\circ\Psi(x),\quad\Psi(x)=\psi_{\boldsymbol{F}}\circ\psi_{\boldsymbol{C}}\circ\psi_{\boldsymbol{\Xi}}\circ f(x),
\Psi_{\mathbf{C}}(x)=\psi_{\mathbf{C}}\circ\psi_{\boldsymbol{\Xi}}\circ f(x),\quad f(x)=\frac{n}{q}\frac{x}{1-x},\quad\psi_{\boldsymbol{F}}(x)=\frac{x}{1+\frac{p}{n-q}\,x\,m_{\boldsymbol{C}}(x)};
\psi_{\boldsymbol{\Xi}}(x)=x\left(1+\frac{p}{n}\int\frac{t}{x-t}\,d\widetilde{\mathcal{H}}(t)\right),\quad\widetilde{\mathcal{H}}(x)=\mathcal{H}_n\left(\frac{qx}{n+qx}\right),
\psi_{\boldsymbol{C}}(x)=x\left(1-\frac{p}{q}\int\frac{1}{t-x}\,dF^{p/n,\widetilde{\mathcal{H}}}_{mp}(t)\right)^2-\left(1-\frac{p}{q}\right)\left(1-\frac{p}{q}\int\frac{1}{t-x}\,dF^{p/n,\widetilde{\mathcal{H}}}_{mp}(t)\right),

and $g^{-1}\circ\Psi$ stands for the composition of $g^{-1}(\cdot)$ and $\Psi(\cdot)$. $F_{mp}^{p/n,\widetilde{\mathcal{H}}}$ denotes the Marchenko-Pastur (M-P) law with parameters $p/n$ and $\widetilde{\mathcal{H}}$. Moreover, $m_{\boldsymbol{C}}(\cdot)$ stands for the unique solution of (6) with $c_1$ and $H$ replaced by $p/q$ and $F_{mp}^{p/n,\widetilde{\mathcal{H}}}$, respectively.

It is worth noting that the results in Theorem 1.8 of [11] and Theorem 3.1 of [35] can be deduced from Theorem 4.2 when the LSD $\mathcal{H}$ of $\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{xy}\boldsymbol{\Sigma}_{yy}^{-1}\boldsymbol{\Sigma}_{yx}$ degenerates to $\delta_{\{0\}}$.

Corollary 4.2.

Let Assumptions c and d' hold. If, furthermore, the LSD $\mathcal{H}$ degenerates to $\delta_{\{0\}}$ and $\alpha_k$ ($k=1,\cdots,K$) satisfies $\alpha_k>\alpha_r$, then we have

\frac{\lambda_i^2}{\phi(\alpha_k)}-1\overset{a.s.}{\longrightarrow}0,\quad i\in\mathcal{J}_k,  (26)

where

\phi(\alpha_k)=\frac{[\alpha_k(1-r_1)+r_1][\alpha_k(1-r_2)+r_2]}{\alpha_k},\quad\alpha_r=\sqrt{\frac{r_1r_2}{(1-r_1)(1-r_2)}},\quad r_1=p/n,\quad r_2=q/n.

Having established the limits of the squared sample canonical correlation coefficients $\lambda_i^2$, $i\in\mathcal{J}_k$, associated with $\alpha_k$ ($k=1,\cdots,K$), we now turn to their CLT.

Theorem 4.3.

Let Assumptions c and d' hold, and let $\lambda_i^2$ be the squared eigenvalues of the sample canonical correlation matrix. Set $p/q\to c_3$ and $p/(n-q)\to c_4$ as $n$ tends to infinity. Then the $m_k$-dimensional random vector

\gamma_k=\sqrt{q}\left\{\left(\frac{\lambda_i^2-t(\alpha_k)}{t(\alpha_k)}\right)\frac{1}{\sqrt{\beta\eta}},\quad i\in\mathcal{J}_k\right\}

converges weakly to the joint distribution of the eigenvalues of a Gaussian random matrix $\boldsymbol{\Omega}$, where

\eta=\left[\frac{c_4}{c_3\eta_2}+\left(\frac{1-c_4(\Psi_{\mathbf{C}}(\alpha_k))^2m_{\mathbf{C}}'(\Psi_{\mathbf{C}}(\alpha_k))}{1+c_4\Psi(\alpha_k)m_{\mathbf{F}}(\Psi(\alpha_k))}\right)^2\eta_1+\eta_3\right]\times\frac{c_3^2c_4^2\Psi(\alpha_k)^2}{[c_3+c_4\Psi(\alpha_k)]^4[t(\alpha_k)]^2},

\eta_1=\left[\Psi_{\mathbf{C}}(\alpha_k)\underline{m}_{\mathbf{C}}'+\frac{\psi_{\boldsymbol{\Xi}}(f(\alpha_k))(1+c_3m_{\mathbf{C}}+c_3\Psi_{\mathbf{C}}(\alpha_k)m_{\mathbf{C}}')}{\Psi_{\mathbf{C}}(\alpha_k)(1+c_3m_{\mathbf{C}})^2}\right]^{-2}\times\left(\underline{m}_{\mathbf{C}}'+\frac{(\psi_{\boldsymbol{\Xi}}(f(\alpha_k)))^2c_3m_{\mathbf{C}}'}{(\Psi_{\mathbf{C}}(\alpha_k))^2(1+c_3m_{\mathbf{C}})^4}+\frac{2\psi_{\boldsymbol{\Xi}}(f(\alpha_k))(1+\underline{m}_{\mathbf{C}}+\Psi_{\mathbf{C}}(\alpha_k)\underline{m}_{\mathbf{C}}')}{(\Psi_{\mathbf{C}}(\alpha_k))^2(1+c_3m_{\mathbf{C}})^2}\right),

\eta_2=1+2\Psi(\alpha_k)c_4m_{\mathbf{F}}(\Psi(\alpha_k))+c_4(\Psi(\alpha_k))^2m_{\mathbf{F}}'(\Psi(\alpha_k)),

\eta_3=\frac{\frac{q}{n}(\psi_{\mathbf{F}}'(\Psi_{\mathbf{C}}(\alpha_k)))^2(\psi_{\mathbf{C}}'(\psi_{\boldsymbol{\Xi}}(f(\alpha_k))))^2}{\psi_{\boldsymbol{\Xi}}^2(f(\alpha_k))\underline{m}'(\psi_{\boldsymbol{\Xi}}(f(\alpha_k)))}\cdot\frac{\psi_{\boldsymbol{\Xi}}^2(f(\alpha_k))}{\Psi^2(\alpha_k)},

where $m_{{\bf C}}$ and $m_{{\bf C}}^{\prime}$ denote the value and the derivative of $m_{{\bf C}}(\cdot)$ at the point $\Psi_{{\bf C}}(\alpha_{k})$, respectively, and $m_{{\bf F}}$ stands for the unique solution of (7) with $c_{1}$, $c_{2}$, $H$ replaced by $p/q$, $p/(n-q)$, $F_{mp}^{p/n,H}$, respectively.

The proof of this theorem follows from the delta method together with Theorem 3.4 and Theorem 4.1; it is postponed to Subsection 7.5.

4.2 Estimators of the population spiked eigenvalues

In this section, we develop two consistent estimators of the population spiked eigenvalues $a_{k}$ defined in (24), derived from the results for the noncentral sample covariance matrix and for the noncentral Fisher matrix, respectively. We first present the estimator of the population spiked eigenvalues based on the noncentral sample covariance matrix. From the conclusion of Theorem 3.3, we have

λi𝐂=ak(1c1m1(ak))2+(1c1)(1c1m1(ak)),\displaystyle\lambda_{i}^{{\bf C}}=a_{k}\left(1-c_{1}m_{1}(a_{k})\right)^{2}+\left(1-c_{1}\right)\left(1-c_{1}m_{1}(a_{k})\right),

and

ak=λk𝐂(1+c1m2(λk𝐂))2(1c1)(1+c1m2(λk𝐂)).\displaystyle a_{k}=\lambda^{{\bf C}}_{k}\left(1+c_{1}m_{2}(\lambda^{{\bf C}}_{k})\right)^{2}-(1-c_{1})\left(1+c_{1}m_{2}(\lambda^{{\bf C}}_{k})\right).

It is sufficient to consider estimators of $\lambda^{{\bf C}}_{k}$ and $m_{2}(\lambda^{{\bf C}}_{k})$, which are denoted by $\hat{\lambda}^{{\bf C}}_{k}$ and $\hat{m}_{2}(\hat{\lambda}^{{\bf C}}_{k})$, respectively. We adopt an approach similar to that in [23] to estimate $m_{2}(\lambda^{{\bf C}}_{k})$. Define $r_{ik}=|\hat{\lambda}_{i}^{{\bf C}}-\hat{\lambda}_{k}^{{\bf C}}|/|\hat{\lambda}_{k}^{{\bf C}}|$, the set $\mathcal{J}_{k}=\{i\in\{1,\cdots,p\}:r_{ik}\leq 0.2\}$, and $\tilde{c}_{1}=(p-|\mathcal{J}_{k}|)/n$; then,

m^2(λ^k𝐂)=1p|𝒥k|i𝒥k(λ^i𝐂λ^k𝐂)1\displaystyle\hat{m}_{2}(\hat{\lambda}_{k}^{{\bf C}})=\frac{1}{p-|\mathcal{J}_{k}|}\sum\limits_{i\notin\mathcal{J}_{k}}(\hat{\lambda}_{i}^{{\bf C}}-\hat{\lambda}_{k}^{{\bf C}})^{-1} (27)

is a good estimator of m2(λk𝐂)m_{2}(\lambda_{k}^{{\bf C}}), where the set 𝒥k\mathcal{J}_{k} is selected to avoid the effect of multiple roots and to make the estimator more accurate. Then we have

a^k=λ^k𝐂(1+c~1m^2(λ^k𝐂))2(1c~1)(1+c~1m^2(λ^k𝐂)).\displaystyle\hat{a}_{k}=\hat{\lambda}^{{\bf C}}_{k}\left(1+\tilde{c}_{1}\hat{m}_{2}(\hat{\lambda}^{{\bf C}}_{k})\right)^{2}-(1-\tilde{c}_{1})\left(1+\tilde{c}_{1}\hat{m}_{2}(\hat{\lambda}^{{\bf C}}_{k})\right). (28)
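As a concrete illustration, the estimation steps (27)-(28) can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' code: the function name `estimate_spike_cov`, the threshold default, and the toy eigenvalue list are our own illustrative choices.

```python
import numpy as np

def estimate_spike_cov(lam, k, n, thresh=0.2):
    """Estimate the population spiked eigenvalue a_k from the sorted sample
    eigenvalues `lam` of the noncentral sample covariance matrix, following
    (27)-(28).  `k` is the 0-based index of the spike of interest."""
    lam = np.asarray(lam, dtype=float)
    p = lam.size
    # the set J_k of indices whose eigenvalues are within 20% of lam[k]
    r = np.abs(lam - lam[k]) / np.abs(lam[k])
    Jk = r <= thresh
    c1_tilde = (p - Jk.sum()) / n
    # plug-in estimator (27) of the Stieltjes-transform value
    m2_hat = np.mean(1.0 / (lam[~Jk] - lam[k]))
    # estimator (28)
    s = 1.0 + c1_tilde * m2_hat
    return lam[k] * s**2 - (1.0 - c1_tilde) * s

# toy illustration with one clear spike among a few bulk eigenvalues
lam = [10.0, 1.0, 0.9, 0.8]
a_hat = estimate_spike_cov(lam, k=0, n=8)
```

The separation threshold 0.2 matches the definition of $\mathcal{J}_{k}$ above; in practice it only needs to isolate the cluster of sample eigenvalues attached to the same population spike.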

We now turn to the second estimator of the population spiked eigenvalues, based on the results for the noncentral Fisher matrix. From the conclusion of Theorem 3.3, we have

λi=ψ𝐂(ak)1+c2ψ𝐂(ak)m2(ψ𝐂(ak)),\displaystyle\lambda_{i}=\frac{\psi_{{\bf C}}(a_{k})}{1+c_{2}\psi_{{\bf C}}(a_{k})m_{2}(\psi_{{\bf C}}(a_{k}))},

then,

ψ𝐂(ak)=λi(1+c2λim3(λi)).\displaystyle\psi_{{\bf C}}(a_{k})=\lambda_{i}(1+c_{2}\lambda_{i}m_{3}(\lambda_{i})). (29)

According to Theorem 3.1, we know

ψ𝐂(ak)=ak(1c1m1(ak))2+(1c1)(1c1m1(ak)),\displaystyle\psi_{{\bf C}}(a_{k})=a_{k}(1-c_{1}m_{1}(a_{k}))^{2}+(1-c_{1})(1-c_{1}m_{1}(a_{k})),

then,

\displaystyle a_{k}=\psi_{{\bf C}}(a_{k})\left[1+c_{1}m_{2}(\psi_{{\bf C}}(a_{k}))\right]^{2}-(1-c_{1})\left[1+c_{1}m_{2}(\psi_{{\bf C}}(a_{k}))\right].

Noting (29) and that $\hat{\lambda}_{k}$ is the natural estimator of $\lambda_{k}$, we take $\tilde{a}_{k}$ as the estimator of $\psi_{{\bf C}}(a_{k})$:

a~k=λ^k(1+c~2λ^km^3(λ^k)),\displaystyle\tilde{a}_{k}=\hat{\lambda}_{k}(1+\tilde{c}_{2}\hat{\lambda}_{k}\hat{m}_{3}(\hat{\lambda}_{k})),

where $\hat{m}_{3}(\hat{\lambda}_{k})$ is the estimator of $m_{3}(\lambda_{k})$. Applying considerations similar to those for (27), we define $r_{ik}=|\hat{\lambda}_{i}-\hat{\lambda}_{k}|/|\hat{\lambda}_{k}|$, the set $\mathcal{J}_{k}=\{i\in\{1,\cdots,p\}:r_{ik}\leq 0.2\}$, and $\tilde{c}_{2}=(p-|\mathcal{J}_{k}|)/N$; then,

m^3(λ^k)=1p|𝒥k|i𝒥k(λ^iλ^k)1\displaystyle\hat{m}_{3}(\hat{\lambda}_{k})=\frac{1}{p-|\mathcal{J}_{k}|}\sum\limits_{i\notin\mathcal{J}_{k}}(\hat{\lambda}_{i}-\hat{\lambda}_{k})^{-1} (30)

is a good estimator of m3(λk)m_{3}(\lambda_{k}). Then we have,

\displaystyle\hat{a}_{k}=[\tilde{a}_{k}(1+c_{1}\hat{m}_{2}(\tilde{a}_{k}))-(1-c_{1})](1+c_{1}\hat{m}_{2}(\tilde{a}_{k})), (31)

where $\hat{m}_{2}(\tilde{a}_{k})$ is the estimator of $m_{2}$ at $\psi_{{\bf C}}(a_{k})$; by (8) we have

m^2(a~k)=m^3/(1+c~2λ^km^3).\displaystyle\hat{m}_{2}(\tilde{a}_{k})=\hat{m}_{3}/(1+\tilde{c}_{2}\hat{\lambda}_{k}\hat{m}_{3}). (32)
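The Fisher-matrix route (30)-(32) can likewise be sketched in NumPy. This is a hedged sketch: the helper name `estimate_spike_fisher` and the toy numbers are hypothetical, and we use the factored form $\hat{a}_{k}=[\tilde{a}_{k}(1+c_{1}\hat{m}_{2})-(1-c_{1})](1+c_{1}\hat{m}_{2})$ of the inversion of Theorem 3.1.

```python
import numpy as np

def estimate_spike_fisher(lam, k, c1, N, thresh=0.2):
    """Estimate a_k from the sorted sample eigenvalues `lam` of the
    noncentral Fisher matrix via (30), (29), (32) and the inversion of
    Theorem 3.1.  `c1` plays the role of p/n for the covariance part and
    `N` is the second sample size."""
    lam = np.asarray(lam, dtype=float)
    p = lam.size
    r = np.abs(lam - lam[k]) / np.abs(lam[k])
    Jk = r <= thresh
    c2_tilde = (p - Jk.sum()) / N
    m3_hat = np.mean(1.0 / (lam[~Jk] - lam[k]))             # (30)
    denom = 1.0 + c2_tilde * lam[k] * m3_hat
    a_tilde = lam[k] * denom                                # estimate of psi_C(a_k), cf. (29)
    m2_hat = m3_hat / denom                                 # (32)
    s = 1.0 + c1 * m2_hat
    # factored inversion a_k = [psi(1+c1 m2) - (1-c1)](1+c1 m2)
    return (a_tilde * s - (1.0 - c1)) * s

# toy Fisher-matrix eigenvalues with one clear spike
lam_F = [12.0, 1.2, 1.0, 0.8]
a_hat = estimate_spike_fisher(lam_F, k=0, c1=0.5, N=16)
```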

4.3 The environmental variables for world countries data

To illustrate the application of canonical correlation, we apply our result to the environmental variables for world countries data. (The data can be downloaded from https://www.kaggle.com/zanderventer/environmental-variables-for-world-countries.) After deleting the samples with missing values and the variables closely related to others, we obtain $188$ samples with $18$ variables. We divide the $18$ variables into two groups: one ($\mathbf{x}$, $p=7$) contains the elevation above sea level, the percentage of the country covered by cropland, the percentage covered by trees taller than $5$m, and so on; the other ($\mathbf{y}$, $q=11$) contains annual precipitation, mean annual temperature, mean wind speed, average cloudy days per year, and so on. We want to explore the relationship between the geographical conditions $\mathbf{x}$ and the climatic conditions $\mathbf{y}$, so we test whether their first canonical correlation coefficient is zero, i.e.,

H0:ρ12=0v.s.H1:ρ12>0.\displaystyle H_{0}:\rho_{1}^{2}=0\quad\text{v.s.}\quad H_{1}:\rho_{1}^{2}>0.

Therefore, we take the largest eigenvalue $\lambda_{1}^{2}$ of the sample canonical correlation matrix as the test statistic. By formula (2.1) in [11], the normalized largest eigenvalue of the sample CCA matrix tends to the Tracy-Widom law under the null hypothesis. The eigenvalues of ${\mathbf{S}}_{xx}^{-1}{\mathbf{S}}_{xy}{\mathbf{S}}_{yy}^{-1}{\mathbf{S}}_{yx}$ are as follows:

Table 2: The sample canonical correlation coefficients of the real data
λ12\lambda_{1}^{2} λ22\lambda_{2}^{2} λ32\lambda_{3}^{2} λ42\lambda_{4}^{2} λ52\lambda_{5}^{2} λ62\lambda_{6}^{2} λ72\lambda_{7}^{2}
0.9152 0.7755 0.4560 0.4034 0.2548 0.2247 0.0492

According to the data, we obtain a $p$-value that is essentially zero. Thus we have strong evidence to reject the null hypothesis and conclude that the geographical conditions are related to the climatic conditions. Moreover, we use Algorithm 1 to give an estimator of the population canonical correlation coefficients,

ρ^i2=q/na^i1+q/na^i,i=1,,p.\displaystyle\hat{\rho}_{i}^{2}=\frac{q/n*\hat{a}_{i}}{1+q/n*\hat{a}_{i}},\quad i=1,\cdots,p. (33)

By (33), we get the estimated squared first population canonical correlation coefficient $\hat{\rho}_{1}^{2}=0.9064$, which implies that the correlation between geographical conditions and climatic conditions is strong.

Algorithm 1 Estimating the population canonical correlation coefficients

Input: the samples 𝐱i,𝐲i,i=1,,n{\mathbf{x}_{i},\mathbf{y}_{i},i=1,\cdots,n}
Output: ρ^i2,i=1,,p\hat{\rho}_{i}^{2},i=1,\cdots,p

1:{lpSCCl1SCC}\{l_{p}^{SCC}\leq\cdots\leq l_{1}^{SCC}\}=Eigenvalue(𝐒xx1𝐒xy𝐒yy1𝐒yx{\mathbf{S}}_{xx}^{-1}{\mathbf{S}}_{xy}{\mathbf{S}}_{yy}^{-1}{\mathbf{S}}_{yx});
2:liF=liSCC/(1liSCC)(nq)/qi=1,,p;l_{i}^{F}=l_{i}^{SCC}/(1-l_{i}^{SCC})*(n-q)/q\quad i=1,\cdots,p;
3:Use step 2 and (31) to get the estimator for the population spiked eigenvalues of the noncentral Fisher matrix
4:ρ^i2=q/na^i1+q/na^i,i=1,,p\hat{\rho}_{i}^{2}=\frac{q/n*\hat{a}_{i}}{1+q/n*\hat{a}_{i}},\quad i=1,\cdots,p
5:return result ρ^i2\hat{\rho}_{i}^{2}
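The steps of Algorithm 1 can be sketched as follows, assuming the sample CCA eigenvalues have already been computed. The step-3 estimator is inlined from (30)-(32); reading off $c_{1}=p/q$ and second sample size $n-q$ from the mapping in step 2 is our interpretation of the algorithm, and the function name and toy inputs are our own.

```python
import numpy as np

def estimate_rho2(l_scc, n, q, thresh=0.2):
    """Algorithm 1 sketch: estimate the squared population canonical
    correlation coefficients from the sorted eigenvalues `l_scc` of the
    sample CCA matrix S_xx^{-1} S_xy S_yy^{-1} S_yx."""
    l_scc = np.asarray(l_scc, dtype=float)
    p = l_scc.size
    c1 = p / q                      # aspect ratio for the covariance part
    # step 2: map CCA eigenvalues to noncentral-Fisher eigenvalues
    l_f = l_scc / (1.0 - l_scc) * (n - q) / q
    rho2 = np.empty(p)
    for k in range(p):
        # step 3: spiked-eigenvalue estimator (30)-(32) for the Fisher matrix
        r = np.abs(l_f - l_f[k]) / np.abs(l_f[k])
        Jk = r <= thresh
        c2_tilde = (p - Jk.sum()) / (n - q)
        m3_hat = np.mean(1.0 / (l_f[~Jk] - l_f[k]))
        denom = 1.0 + c2_tilde * l_f[k] * m3_hat
        a_tilde = l_f[k] * denom
        m2_hat = m3_hat / denom
        s = 1.0 + c1 * m2_hat
        a_hat = (a_tilde * s - (1.0 - c1)) * s
        # step 4: map back to a squared canonical correlation, cf. (33)
        rho2[k] = (q / n) * a_hat / (1.0 + (q / n) * a_hat)
    return rho2

# toy sample CCA eigenvalues with one dominant correlation
rho2 = estimate_rho2([0.9, 0.5, 0.3, 0.1], n=40, q=8)
```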

5 Simulation

We conduct simulations that support the theoretical results and illustrate the accuracy of the estimators. The simulations are divided into two parts: the first verifies the asymptotic normality in Theorems 3.2, 3.4 and 4.3, and the second confirms the performance of the estimators in (28), (31) and (33).

5.1 Simulations for the asymptotic normality

In this section, we assume the eigenvalues of 𝚵𝚵/n{\mathbf{\Xi}}{\mathbf{\Xi}}^{*}/n satisfy

10, 7.5,1,, 1p2,\displaystyle 10,\,7.5,\,\underbrace{1,\,\cdots,\,1}_{p-2}, (34)

in this situation, we set $\boldsymbol{\Lambda}$, defined in (21), to satisfy

𝚲=(10/11,15/17,1/2,,1/2).\displaystyle\boldsymbol{\Lambda}=(\sqrt{10/11},\sqrt{15/17},\sqrt{1/2},\cdots,\sqrt{1/2}). (35)

In Figures 1-6, we compare the empirical density (the blue histogram) of the two largest eigenvalues of ${\bf C}_{n}$, ${\bf F}_{p}$ and the CCA matrix with the standard normal density curve (the red line), based on 2000 repetitions. To be more convincing, we put the Q-Q plots together with the histograms. According to the histograms and Q-Q plots, we can conclude that Theorems 3.2, 3.4 and 4.3 are well supported.
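A quick Monte Carlo sanity check of this setting can be sketched as follows, assuming real Gaussian entries and $H=\delta_{\{1\}}$: by Theorem 3.1 the two largest eigenvalues of ${\bf C}_{n}$ should concentrate near the limits $\psi_{\bf C}(10)\approx 11.13$ and $\psi_{\bf C}(7.5)\approx 8.65$, while the bulk stays well below. The variable names are our own illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 200, 2000                       # c1 = p/n = 0.1, as in Figure 1
spikes = np.array([10.0, 7.5])         # spiked eigenvalues of Xi Xi*/n in (34)

# Xi Xi*/n has eigenvalues 10, 7.5, 1, ..., 1
d = np.ones(p)
d[:2] = spikes
Xi = np.zeros((p, n))
Xi[np.arange(p), np.arange(p)] = np.sqrt(n * d)

X = rng.standard_normal((p, n))
C = (Xi + X) @ (Xi + X).T / n
l = np.sort(np.linalg.eigvalsh(C))[::-1]   # sample eigenvalues, descending

# limits psi_C(a) = a(1 - c1 m1(a))^2 + (1 - c1)(1 - c1 m1(a)),
# with m1(a) = 1/(1 - a) the Stieltjes transform of H = delta_1
c1 = p / n
m1 = 1.0 / (1.0 - spikes)
s = 1.0 - c1 * m1
psi = spikes * s**2 + (1.0 - c1) * s       # approx [11.133, 8.646]
```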

Figure 1: The asymptotic normality of the largest eigenvalue of noncentral sample covariance matrix with (p,n)=(200,2000)(p,n)=(200,2000).
Figure 2: The asymptotic normality of the second largest eigenvalue of the noncentral sample covariance matrix with (p,n)=(200,2000)(p,n)=(200,2000).
Figure 3: The asymptotic normality of the largest eigenvalue of noncentral Fisher matrix with (p,n,N)=(200,2000,1000)(p,n,N)=(200,2000,1000).
Figure 4: The asymptotic normality of the second largest eigenvalue of noncentral Fisher matrix with (p,n,N)=(200,2000,1000)(p,n,N)=(200,2000,1000).
Figure 5: The asymptotic normality of the largest eigenvalue of the CCA matrix with (p,q,n)=(200,200,1000)(p,q,n)=(200,200,1000).
Figure 6: The asymptotic normality of the second largest eigenvalue of the CCA matrix with (p,q,n)=(200,200,1000)(p,q,n)=(200,200,1000).

5.2 Simulations for the estimators

We conduct the following simulations to verify the accuracy of the estimators in (28), (31) and (33). Unlike (34), in this subsection, the eigenvalues of 𝚵𝚵/n{\mathbf{\Xi}}{\mathbf{\Xi}}^{*}/n are set as

10,7.5,7.5,1,,1p3,\displaystyle 10,7.5,7.5,\underbrace{1,\cdots,1}_{p-3}, (36)

where $a_{1}=10$, $a_{2}=a_{3}=7.5$, and $H=\delta_{\{1\}}$. Note that the second spiked eigenvalue is set as a multiple eigenvalue. According to the single-root condition in Assumption d, we set the eigenvalues of $\boldsymbol{\Lambda}$ to satisfy the model (35).

Remark 5.1.

Compared with the eigenvalue setting in (36), the setting (34) contains no multiple population spiked eigenvalues, because the joint distribution of the eigenvalues of a multi-dimensional GOE matrix cannot be displayed visually.

We consider the estimators $\hat{a}_{1}$ and $\hat{a}_{2}$ with $p=100,200$ and $400$, respectively. The frequency histograms of the estimators are presented in Figures 7-10, based on $5000$ repetitions. In Figures 11-12, we show the accuracy of the estimator $\hat{\rho}_{i}$ for the two largest population canonical correlation coefficients. We conclude that the estimators become more accurate as $p$ increases, since the histograms become more concentrated around the true values.

Figure 7: The estimator of a1a_{1} (a1=10)(a_{1}=10) by the results of the noncentral sample covariance matrix with p=100p=100, 200200, and 400400.
Figure 8: The estimator of a2a_{2} (a2=7.5)(a_{2}=7.5) by the results of the noncentral sample covariance matrix with p=100p=100, 200200, and 400400.
Figure 9: The estimator of a1a_{1} (a1=10)(a_{1}=10) by results of the noncentral Fisher matrix with p=100p=100, 200200, and 400400.
Figure 10: The estimator of a2a_{2} (a2=7.5)(a_{2}=7.5) by results of the noncentral Fisher matrix with p=100p=100, 200200, and 400400.
Figure 11: The estimator of ρ12\rho_{1}^{2} (ρ12=10/110.9091)(\rho_{1}^{2}=10/11\approx 0.9091) by results of the CCA matrix with p=100p=100, 200200, and 400400.
Figure 12: The estimator of ρ22\rho_{2}^{2} (ρ22=15/170.8824)(\rho_{2}^{2}=15/17\approx 0.8824) by results of the CCA matrix with p=100p=100, 200200, and 400400.

The histograms illustrate the accuracy of the spiked-eigenvalue estimates to a certain extent, but they are not convincing enough on their own. We therefore use the mean squared error (MSE) criterion to quantify the accuracy of the estimates in Table 3.

Table 3: The MSE of the three estimators
a1=10a_{1}=10 (ρ1=10/11)\left(\rho_{1}=\sqrt{10/11}\right) a2=7.5a_{2}=7.5 (ρ2=15/17)\left(\rho_{2}=\sqrt{15/17}\right)
pp 100 200 400 100 200 400
SS 1.2928 0.6183 0.3176 0.4017 0.2194 0.1144
FF 0.2077 0.1064 0.0540 0.0821 0.0421 0.0222
CCACCA 4.8001e-05 2.5296e-05 1.2784e-05 7.7782e-05 4.4870e-05 2.3276e-05

The notations $S$, $F$ and $CCA$ in Table 3 are short for the noncentral sample covariance matrix, the noncentral Fisher matrix and the CCA matrix, respectively. According to Table 3, we find that the MSE decreases as the dimension $p$ increases, which is consistent with the simulation results mentioned above. Comparing the MSE of the two estimators in Table 3, the estimator derived from the noncentral Fisher matrix is more accurate than the one derived from the noncentral sample covariance matrix.

5.3 Simulation for multiple roots case

Both Theorem 4.2 and Theorem 4.3 require Assumption d; in particular, each spiked eigenvalue must be a single root. We therefore set $\boldsymbol{\Lambda}$ to satisfy (35) in Subsection 5.1. However, we conjecture that our theory and estimators for the single-root case remain effective under the multiple-root condition. In this subsection, we present simulations showing that the population canonical correlation coefficients can still be estimated reasonably under the multiple-root condition. We assume

𝚲=(10/11,15/17,15/17,1/2,,1/2),\displaystyle\boldsymbol{\Lambda}=(\sqrt{10/11},\sqrt{15/17},\sqrt{15/17},\sqrt{1/2},\cdots,\sqrt{1/2}), (37)

set the ratio $(p:n:N)$ to $(1:3:9)$, and use 1000 repetitions. According to Figure 13 and Table 4, the estimator performs well.

Figure 13: The estimator of ρ22\rho_{2}^{2} (ρ22=15/170.8824)(\rho_{2}^{2}=15/17\approx 0.8824) by results of the CCA matrix with p=100p=100, 200200, and 400400.
Table 4: The MSE of the population canonical correlation coefficient estimators under multiple roots
ρ2=ρ3=15/17\rho_{2}=\rho_{3}=\sqrt{15/17}
pp 100 200 400
CCACCA 3.7055e-05 2.2209e-05 1.2577e-05

6 Conclusion and discussion

In this work, we study the limiting properties of the sample spiked eigenvalues of the noncentral Fisher matrix in the high-dimensional setting under Gaussian assumptions. As in earlier work on spiked models, we find a phase transition in the limiting properties of the sample spiked eigenvalues of the noncentral Fisher matrix. In addition, we present the CLT for the sample spiked eigenvalues. As an accessory to the proof of these results, the fluctuations of the spiked eigenvalues of the noncentral sample covariance matrix ${\mathbf{C}}_{n}$ are studied, which should be of independent interest.

General distribution. It is natural to ask whether the results of the current work can be extended from Gaussian entries to more general distributions of the matrix entries. Our future work will focus on the limiting properties of the sample spiked eigenvalues of the noncentral Fisher matrix under general distributions of the matrix entries.

7 Appendix

7.1 Proof of Theorem 3.1

There is no loss of generality in assuming

𝚵=(𝚵1𝚵2)=(𝐃1100𝐃~22)=(𝐃11000𝐃220),\displaystyle\boldsymbol{\Xi}=\left(\begin{array}[]{c}\boldsymbol{\Xi}_{1}\\ \boldsymbol{\Xi}_{2}\end{array}\right)=\left(\begin{array}[]{cc}{\bf D}_{11}&\textbf{0}\\ \textbf{0}&\tilde{{\bf D}}_{22}\end{array}\right)=\left(\begin{array}[]{ccc}{\bf D}_{11}&\textbf{0}&\textbf{0}\\ \textbf{0}&{\bf D}_{22}&\textbf{0}\end{array}\right), (44)

where $\boldsymbol{\Xi}_{1}$ consists of the first $M$ rows of $\boldsymbol{\Xi}$, the diagonal of ${\bf D}_{11}$ is composed of the $M$ spiked eigenvalues while the diagonal of ${\bf D}_{22}$ is composed of the non-spiked bounded eigenvalues, and $\tilde{{\bf D}}_{22}=({\bf D}_{22},\textbf{0})$. According to the structure of $\boldsymbol{\Xi}$ in (44), we decompose ${\bf X}$ as follows:

𝐗=(𝐗1𝐗2),\displaystyle{\bf X}=\left(\begin{array}[]{c}{\bf X}_{1}\\ {\bf X}_{2}\end{array}\right),

where 𝐗1{\bf X}_{1} denotes the first MM rows of 𝐗{\bf X}, and 𝐗2{\bf X}_{2} stands for the remaining rows of 𝐗{\bf X}. Then the sample eigenvalues of noncentral sample covariance matrix 𝐂n{\bf C}_{n} are sorted in descending order as

l1𝐂nl2𝐂nlp𝐂n.\displaystyle l_{1}^{{\bf C}_{n}}\geq l_{2}^{{\bf C}_{n}}\geq\cdots\geq l_{p}^{{\bf C}_{n}}.

If we only consider the sample spiked eigenvalues of 𝐂n{\bf C}_{n}, li𝐂nl_{i}^{{\bf C}_{n}}, i𝒥ki\in\mathcal{J}_{k}, k=1,,Kk=1,\cdots,K, then the eigen-equation is

0=\displaystyle 0= |li𝐂n𝐈p1n(𝚵+𝐗)(𝚵+𝐗)|\displaystyle\left|l_{i}^{{\bf C}_{n}}{\bf I}_{p}-\frac{1}{n}(\boldsymbol{\Xi}+{\bf X})(\boldsymbol{\Xi}+{\bf X})^{\ast}\right|
=\displaystyle= |li𝐂n𝐈p1n(𝚵1+𝐗1𝚵2+𝐗2)((𝚵1+𝐗1),(𝚵2+𝐗2))|\displaystyle\left|l_{i}^{{\bf C}_{n}}{\bf I}_{p}-\frac{1}{n}\left(\begin{array}[]{c}\boldsymbol{\Xi}_{1}+{\bf X}_{1}\\ \boldsymbol{\Xi}_{2}+{\bf X}_{2}\end{array}\right)\left((\boldsymbol{\Xi}_{1}+{\bf X}_{1})^{\ast},(\boldsymbol{\Xi}_{2}+{\bf X}_{2})^{\ast}\right)\right| (48)
=\displaystyle= |li𝐂n𝐈M1n(𝚵1+𝐗1)(𝚵1+𝐗1)1n(𝚵1+𝐗1)(𝚵2+𝐗2)1n(𝚵2+𝐗2)(𝚵1+𝐗1)li𝐂n𝐈pM1n(𝚵2+𝐗2)(𝚵2+𝐗2)|.\displaystyle\left|\begin{array}[]{cc}l_{i}^{{\bf C}_{n}}{\bf I}_{M}\!-\!\frac{1}{n}(\boldsymbol{\Xi}_{1}\!+\!{\bf X}_{1})(\boldsymbol{\Xi}_{1}\!+\!{\bf X}_{1})^{\ast}&\!-\frac{1}{n}(\boldsymbol{\Xi}_{1}\!+\!{\bf X}_{1})(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})^{\ast}\\ -\frac{1}{n}(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})(\boldsymbol{\Xi}_{1}\!+\!{\bf X}_{1})^{\ast}&l_{i}^{{\bf C}_{n}}{\bf I}_{p-M}-\frac{1}{n}(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})^{\ast}\end{array}\right|. (51)

Because $M$ is fixed, $\boldsymbol{\Xi}_{2}\boldsymbol{\Xi}_{2}^{*}/n$ and $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$ share the same LSD; hence $l_{i}^{{\bf C}_{n}}$ is an outlier for large $n$, i.e., $|l_{i}^{{\bf C}_{n}}{\bf I}_{p-M}-\frac{1}{n}(\boldsymbol{\Xi}_{2}+{\bf X}_{2})(\boldsymbol{\Xi}_{2}^{*}+{\bf X}_{2}^{*})|\neq 0$. Rewriting (7.1) by the inverse of a partitioned matrix ([20], Section 0.7.3) and the in-out exchange formula, we have

0=\displaystyle 0= |li𝐂n𝐈M1n(𝚵1+𝐗1)(𝚵1+𝐗1)1n(𝚵1+𝐗1)(𝚵2+𝐗2)\displaystyle|l_{i}^{{\bf C}_{n}}{\bf I}_{M}-\frac{1}{n}(\boldsymbol{\Xi}_{1}+{\bf X}_{1})(\boldsymbol{\Xi}_{1}+{\bf X}_{1})^{\ast}-\frac{1}{n}(\boldsymbol{\Xi}_{1}+{\bf X}_{1})(\boldsymbol{\Xi}_{2}+{\bf X}_{2})^{\ast}
×[li𝐂n𝐈pM1n(𝚵2+𝐗2)(𝚵2+𝐗2)]11n(𝚵2+𝐗2)(𝚵1+𝐗1)|\displaystyle\times\left[l_{i}^{{\bf C}_{n}}{\bf I}_{p-M}-\frac{1}{n}(\boldsymbol{\Xi}_{2}+{\bf X}_{2})(\boldsymbol{\Xi}_{2}+{\bf X}_{2})^{\ast}\right]^{-1}\frac{1}{n}(\boldsymbol{\Xi}_{2}+{\bf X}_{2})(\boldsymbol{\Xi}_{1}+{\bf X}_{1})^{\ast}|
0=\displaystyle\Longleftrightarrow 0\!=\! |𝐈M1n(𝚵1+𝐗1)(li𝐂n𝐈n1n(𝚵2+𝐗2)(𝚵2+𝐗2))1(𝚵1+𝐗1)|,\displaystyle\left|{\bf I}_{M}\!-\!\frac{1}{n}(\boldsymbol{\Xi}_{1}+{\bf X}_{1})\left(l_{i}^{{\bf C}_{n}}{\bf I}_{n}\!-\!\frac{1}{n}(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})^{\ast}(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})\right)^{-1}(\boldsymbol{\Xi}_{1}\!+\!{\bf X}_{1})^{\ast}\right|,

if li𝐂n0l_{i}^{{\bf C}_{n}}\neq 0. For simplicity, we denote [1n(𝚵2+𝐗2)(𝚵2+𝐗2)li𝐂n𝐈n]1[\frac{1}{n}(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})^{\ast}(\boldsymbol{\Xi}_{2}\!+\!{\bf X}_{2})-l_{i}^{{\bf C}_{n}}{\bf I}_{n}]^{-1} briefly by 𝐀n(li𝐂n){\bf A}_{n}(l_{i}^{{\bf C}_{n}}). When there is no ambiguity, we also rewrite it as 𝐀{\bf A}. Then

𝛀n𝐂n=\displaystyle\boldsymbol{\Omega}_{n}^{{\bf C}_{n}}\!\overset{\triangle}{=} 𝐈M+1n(𝚵1+𝐗1)𝐀n(li𝐂n)(𝚵1+𝐗1)\displaystyle{\bf I}_{M}+\frac{1}{n}(\boldsymbol{\Xi}_{1}+{\bf X}_{1}){\bf A}_{n}(l_{i}^{{\bf C}_{n}})(\boldsymbol{\Xi}_{1}^{*}+{\bf X}_{1}^{*})
=\displaystyle= 𝐈M+1n(tr𝐀n(li𝐂n))𝐈M1li𝐂n(1+c1nm2n(li𝐂n))1n𝚵1𝚵1+𝛀0𝐂n,\displaystyle{\bf I}_{M}+\frac{1}{n}({\rm tr}{\bf A}_{n}(l_{i}^{{\bf C}_{n}})){\bf I}_{M}-\frac{1}{l_{i}^{{\bf C}_{n}}(1+c_{1n}m_{2n}(l_{i}^{{\bf C}_{n}}))}\frac{1}{n}\boldsymbol{\Xi}_{1}\boldsymbol{\Xi}_{1}^{*}+\boldsymbol{\Omega}_{0}^{{\bf C}_{n}}, (52)

where

𝛀0𝐂n(li𝐂n)\displaystyle{\boldsymbol{\Omega}}_{0}^{{\bf C}_{n}}(l_{i}^{{\bf C}_{n}})\! =1n𝐗1𝐀𝐗11n(tr𝐀)𝐈M+1n(𝐗1𝐀𝚵1+𝚵1𝐀𝐗1)\displaystyle=\!\frac{1}{n}{\bf X}_{1}{\bf A}{\bf X}_{1}^{*}\!-\!\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\!+\!\frac{1}{n}({\bf X}_{1}{\bf A}\boldsymbol{\Xi}_{1}^{*}\!+\!\boldsymbol{\Xi}_{1}{\bf A}{\bf X}_{1}^{*})
+1n𝐃11[𝐀11+𝐈li𝐂n(1+c1nm2n(li𝐂n))]𝐃11,\displaystyle+\!\frac{1}{n}{\bf D}_{11}^{*}\left[{\bf A}_{11}\!+\!\frac{{\bf I}}{l_{i}^{{\bf C}_{n}}(1+c_{1n}m_{2n}(l_{i}^{{\bf C}_{n}}))}\right]{\bf D}_{11}, (53)
m2n(li𝐂n)\displaystyle m_{2n}(l_{i}^{{\bf C}_{n}}) =1pMtr[1n(𝐗22+𝐃~22)(𝐗22+𝐃~22)li𝐂n𝐈pM]1\displaystyle=\frac{1}{p-M}{\rm tr}\left[\frac{1}{n}({\bf X}_{22}\!+\!\tilde{{\bf D}}_{22})({\bf X}_{22}\!+\!\tilde{{\bf D}}_{22})^{*}\!-\!l_{i}^{{\bf C}_{n}}{\bf I}_{p\!-\!M}\right]^{-1} (54)

c1n=(pM)/(nM)c_{1n}\!=\!({p\!-\!M})/({n\!-\!M}), and 𝐀11{\bf A}_{11} is the first M×MM\times M major diagonal submatrix of 𝐀n(li𝐂n){\bf A}_{n}(l_{i}^{{\bf C}_{n}}).
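The in-out exchange step above rests on the elementary fact that $WW^{*}/n$ and $W^{*}W/n$ share the same nonzero eigenvalues, which is what converts the $(p-M)$-dimensional determinant into an $n$-dimensional one. A quick numerical check of this fact, with a generic random $W$ standing in for $\boldsymbol{\Xi}_{2}+{\bf X}_{2}$ (the dimensions and variable names are our own):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 12
W = rng.standard_normal((m, n))

# W W*/n (m x m) and W* W/n (n x n) have the same nonzero spectrum;
# the larger matrix just pads the spectrum with n - m zeros
e_small = np.sort(np.linalg.eigvalsh(W @ W.T / n))[::-1]
e_big = np.sort(np.linalg.eigvalsh(W.T @ W / n))[::-1]
```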

Here we state the following almost sure convergence results without proof; the detailed proofs are postponed to Subsection 7.1.1.

𝐀11(li𝐂n)+1li𝐂n+li𝐂nc1nm2n(li𝐂n)𝐈Ma.s.𝟎M×M,\displaystyle{\bf A}_{11}(l_{i}^{{\bf C}_{n}})+\frac{1}{l_{i}^{{\bf C}_{n}}+l_{i}^{{\bf C}_{n}}c_{1n}m_{2n}(l_{i}^{{\bf C}_{n}})}{\bf I}_{M}\overset{a.s.}{\longrightarrow}\boldsymbol{0}_{M\times M}, (55)
1n𝐗1𝐀n(li𝐂n)𝐗11n(tr𝐀n(li𝐂n))𝐈Ma.s.𝟎M×M,\displaystyle\frac{1}{n}{\bf X}_{1}{\bf A}_{n}(l_{i}^{{\bf C}_{n}}){\bf X}_{1}^{\ast}-\frac{1}{n}({\rm tr}{\bf A}_{n}(l_{i}^{{\bf C}_{n}})){\bf I}_{M}\overset{a.s.}{\longrightarrow}\boldsymbol{0}_{M\times M}, (56)
1n𝐗1𝐀n(li𝐂n)𝚵1+1n𝚵1𝐀n(li𝐂n)𝐗1a.s.𝟎M×M.\displaystyle\frac{1}{n}{\bf X}_{1}{\bf A}_{n}(l_{i}^{{\bf C}_{n}})\boldsymbol{\Xi}_{1}^{\ast}+\frac{1}{n}\boldsymbol{\Xi}_{1}{\bf A}_{n}(l_{i}^{{\bf C}_{n}}){\bf X}_{1}^{\ast}\overset{a.s.}{\longrightarrow}\boldsymbol{0}_{M\times M}. (57)

Combining (55), (56) and (57) with (53), we have $\boldsymbol{\Omega}_{0}^{{\bf C}_{n}}\overset{a.s.}{\longrightarrow}\boldsymbol{0}_{M\times M}$. By formula (1.3) in [17], we arrive at

1ntr𝐀n(z)a.s.m¯2(z)uniformly for z.\frac{1}{n}{\rm tr}{\bf A}_{n}(z)\overset{a.s.}{\longrightarrow}\underline{m}_{2}(z)\quad\mbox{uniformly for $z$}.

From

0=|𝐈M+1ntr𝐀n(li𝐂n)𝐈1n𝐃11𝐃11li𝐂n+li𝐂nc1nm2n(li𝐂n)+𝛀0𝐂n(li𝐂n)|,\displaystyle 0=\left|{\bf I}_{M}+\frac{1}{n}{\rm tr}{\bf A}_{n}(l_{i}^{{\bf C}_{n}}){\bf I}-\frac{\frac{1}{n}{\bf D}_{11}{\bf D}_{11}^{\ast}}{l_{i}^{{\bf C}_{n}}+l_{i}^{{\bf C}_{n}}c_{1n}m_{2n}(l_{i}^{{\bf C}_{n}})}+\boldsymbol{\Omega}_{0}^{{\bf C}_{n}}(l_{i}^{{\bf C}_{n}})\right|, (58)

we can obtain

\displaystyle\left|{\bf I}_{M}+\underline{m}_{2n}^{0}(l_{i}^{{\bf C}_{n}}){\bf I}_{M}-\frac{\frac{1}{n}{\bf D}_{11}{\bf D}_{11}^{\ast}}{l_{i}^{{\bf C}_{n}}(1+c_{1n}m_{2n}^{0}(l_{i}^{{\bf C}_{n}}))}\right|\overset{a.s.}{\longrightarrow}0. (59)

For arbitrary kk, let

λnk𝐂=ak(1c1nm1n0(ak))2+(1c1n)(1c1nm1n0(ak))\displaystyle\lambda_{nk}^{{\bf C}}\overset{\triangle}{=}a_{k}\left(1-c_{1n}m_{1n}^{0}(a_{k})\right)^{2}+(1-c_{1n})(1-c_{1n}m_{1n}^{0}(a_{k})) (60)

where $c_{1n}=(p-M)/(n-M)$, and $m_{1n}^{0}$ denotes the ST of the ESD of the $p-M$ bulk eigenvalues of $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$. An easy calculation shows that

\displaystyle 1+\underline{m}_{2n}^{0}(\lambda_{nk}^{{\bf C}})-\frac{a_{k}}{\lambda_{nk}^{{\bf C}}(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{{\bf C}}))}=0,

where $m_{2n}^{0}$ is the ST of the LSD of $(\boldsymbol{\Xi}_{2}+{\bf X}_{2})(\boldsymbol{\Xi}_{2}+{\bf X}_{2})^{*}/n$ with $c_{1}$ and $H$ replaced by $(p-M)/(n-M)$ and the ESD of the $p-M$ bulk eigenvalues of $\boldsymbol{\Xi}\boldsymbol{\Xi}^{*}/n$, respectively. Combining (59) with the fact that the dimension of the matrix is finite, there exists a `$j$' (assume $j\in\mathcal{J}_{k}$) such that the $j$-th diagonal element converges almost surely to zero. For this `$k$' we have

\displaystyle\Big{|}{\bf I}_{M}+\underline{m}_{2n}^{0}(l_{i}^{{\bf C}_{n}}){\bf I}_{M}-\frac{\frac{1}{n}{\bf D}_{11}{\bf D}_{11}^{\ast}}{l_{i}^{{\bf C}_{n}}(1+c_{1n}m_{2n}^{0}(l_{i}^{{\bf C}_{n}}))}
\displaystyle-\left[1+\underline{m}_{2n}^{0}(\lambda_{nk}^{{\bf C}})-\frac{a_{k}}{\lambda_{nk}^{{\bf C}}(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{{\bf C}}))}\right]{\bf I}_{M}\Big{|}\overset{a.s.}{\longrightarrow}0.

Subtracting the $j$-th diagonal element from all the diagonal elements of the matrix in the above determinant, we find that the differences are bounded away from zero outside the $k$-th block containing the $j$-th position, i.e.,

asakli𝐂n(1+c1nm2n0(li𝐂n)),s𝒥k\displaystyle\frac{a_{s}-a_{k}}{l_{i}^{{\bf C}_{n}}(1+c_{1n}m_{2n}^{0}(l_{i}^{{\bf C}_{n}}))},\quad s\notin\mathcal{J}_{k}

is bounded away from zero as a result of (5) in Assumption A. So for the diagonal elements of the $k$-th block, we have

m¯2n0(li𝐂n)akli𝐂n(1+c1nm2n0(li𝐂n))m¯2n0(λnk𝐂)+akλnk𝐂(1+c1nm2n0(λnk𝐂))a.s.0\displaystyle\underline{m}_{2n}^{0}(l_{i}^{{\bf C}_{n}})-\frac{a_{k}}{l_{i}^{{\bf C}_{n}}(1+c_{1n}m_{2n}^{0}(l_{i}^{{\bf C}_{n}}))}-\underline{m}_{2n}^{0}(\lambda_{nk}^{{\bf C}})+\frac{a_{k}}{\lambda_{nk}^{{\bf C}}(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{{\bf C}}))}\overset{a.s.}{\longrightarrow}0
\displaystyle\Leftrightarrow (li𝐂nλnk𝐂λnk𝐂)[λnk𝐂(m¯2n0)(ξ1)+akli𝐂n1+c1nm2n0(λnk𝐂)+li𝐂n(m2n0)(ξ2)(1+c1nm2n0(li𝐂n))(1+c1nm2n0(λnk𝐂))]a.s.0,\displaystyle\left(\frac{l_{i}^{{\bf C}_{n}}-\lambda_{nk}^{{\bf C}}}{\lambda_{nk}^{{\bf C}}}\right)\left[\lambda_{nk}^{{\bf C}}(\underline{m}_{2n}^{0})^{\prime}(\xi_{1})+\frac{a_{k}}{l_{i}^{{\bf C}_{n}}}\frac{1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{{\bf C}})+l_{i}^{{\bf C}_{n}}(m_{2n}^{0})^{\prime}(\xi_{2})}{(1+c_{1n}m_{2n}^{0}(l_{i}^{{\bf C}_{n}}))(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{{\bf C}}))}\right]\overset{a.s.}{\longrightarrow}0,

Since the factor

λnk𝐂(m¯2n0)(ξ1)+akli𝐂n1+c1nm2n0(λnk𝐂)+li𝐂n(m2n0)(ξ2)(1+c1nm2n0(li𝐂n))(1+c1nm2n0(λnk𝐂))\displaystyle\lambda_{nk}^{{\bf C}}(\underline{m}_{2n}^{0})^{\prime}(\xi_{1})+\frac{a_{k}}{l_{i}^{{\bf C}_{n}}}\frac{1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{{\bf C}})+l_{i}^{{\bf C}_{n}}(m_{2n}^{0})^{\prime}(\xi_{2})}{(1+c_{1n}m_{2n}^{0}(l_{i}^{{\bf C}_{n}}))(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{{\bf C}}))}

is bounded below, we get $(l_{i}^{{\bf C}_{n}}-\lambda_{nk}^{{\bf C}})/\lambda_{nk}^{{\bf C}}\overset{a.s.}{\longrightarrow}0$. If $a_{k}$ is bounded, the limit $\lambda_{k}^{{\bf C}}$ of $\lambda_{nk}^{{\bf C}}$ satisfies

0=11c1λk𝐂+c1m2(λk𝐂)+akλk𝐂λk𝐂c1m2(λk𝐂)\displaystyle 0=1-\frac{1-c_{1}}{\lambda^{{\bf C}}_{k}}+c_{1}m_{2}(\lambda^{{\bf C}}_{k})+\frac{a_{k}}{-{\lambda^{{\bf C}}_{k}}-\lambda^{{\bf C}}_{k}c_{1}m_{2}(\lambda^{{\bf C}}_{k})}
\displaystyle\Longleftrightarrow ak=λk𝐂(1+c1m2(λk𝐂))2(1c1)(1+c1m2(λk𝐂))\displaystyle a_{k}=\lambda^{{\bf C}}_{k}\left(1+c_{1}m_{2}(\lambda^{{\bf C}}_{k})\right)^{2}-(1-c_{1})\left(1+c_{1}m_{2}(\lambda^{{\bf C}}_{k})\right)
\displaystyle\Longleftrightarrow ak=[λk𝐂(1+c1m1(ak)1c1m1(ak))(1c1)][1+c1m1(ak)1c1m1(ak)]\displaystyle a_{k}=\left[\lambda^{{\bf C}}_{k}\left(1+c_{1}\frac{m_{1}(a_{k})}{1-c_{1}m_{1}(a_{k})}\right)-(1-c_{1})\right]\left[1+c_{1}\frac{m_{1}(a_{k})}{1-c_{1}m_{1}(a_{k})}\right]
\displaystyle\Longleftrightarrow ψ𝐂(ak)=λk𝐂=ak(1c1m1(ak))2+(1c1)(1c1m1(ak)),\displaystyle\psi_{\mathbf{C}}(a_{k})\overset{\bigtriangleup}{=}\lambda^{{\bf C}}_{k}=a_{k}\left(1-c_{1}m_{1}(a_{k})\right)^{2}+(1-c_{1})\left(1-c_{1}m_{1}(a_{k})\right),

where $m_{1}$ is the Stieltjes transform of the LSD $H$ of $\boldsymbol{\Xi}\boldsymbol{\Xi}^{\ast}/n$. The second equivalence relation is a consequence of (6). The proof of Theorem 3.1 is complete.

Remark 7.1.

If $a_{k}$ tends to infinity as stated in Assumption A, we only need to multiply both sides of (52) by $\sqrt{n\cdot l_{i}}{\bf D}_{11}^{-1}$; applying arguments similar to those above to this case, we obtain the same conclusion, namely $(l_{i}^{{\bf C}_{n}}-\lambda_{nk}^{{\bf C}})/\lambda_{nk}^{{\bf C}}\overset{a.s.}{\longrightarrow}0$. Therefore, we will not repeat the case of $a_{k}$ tending to infinity in the following proofs.

7.1.1 Proofs of (55), (56) and (57)

(57) can be obtained directly from Kolmogorov's law of large numbers. The proofs of (55) and (56) are similar, so we take (56) as an example. Consider the following series for any $\varepsilon>0$:

n=1(1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε)\displaystyle\sum_{n=1}^{\infty}{\mathbb{P}}(\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon)
=\displaystyle= n=1(1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε,𝒜)\displaystyle\sum_{n=1}^{\infty}{\mathbb{P}}(\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon,\mathcal{A})
+n=1(1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε,𝒜c)\displaystyle+\sum_{n=1}^{\infty}{\mathbb{P}}(\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon,\mathcal{A}^{c})

where the event $\mathcal{A}$ means that the spectral norm of ${\bf A}$ is bounded, i.e., $\|{\bf A}\|\leq C$, and $\|\cdot\|_{K}$ denotes the Kolmogorov norm, defined as the largest absolute value among all entries. Then we have

(1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε,𝒜c)(𝒜c)=o(nt),\displaystyle{\mathbb{P}}(\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon,\mathcal{A}^{c})\leq{\mathbb{P}}(\mathcal{A}^{c})=o(n^{-t}),

where the last equality is a consequence of the exact separation result for information-plus-noise type matrices in [5]. For the first term, we have

n=1(1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε𝒜)\displaystyle\sum_{n=1}^{\infty}{\mathbb{P}}(\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon\cap\mathcal{A})
=\displaystyle= n=1𝔼I{1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε𝒜}\displaystyle\sum_{n=1}^{\infty}{\mathbb{E}}I\{\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon\cap\mathcal{A}\}
=\displaystyle= n=1𝔼[𝔼I{1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε𝒜}|𝐀]\displaystyle\sum_{n=1}^{\infty}{\mathbb{E}}[{\mathbb{E}}I\{\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon\cap\mathcal{A}\}|{\bf A}]
=\displaystyle= n=1𝔼[(1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK>ε𝒜)|𝐀]\displaystyle\sum_{n=1}^{\infty}{\mathbb{E}}[{\mathbb{P}}(\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}>\varepsilon\cap\mathcal{A})|{\bf A}]
\displaystyle\leq n=1𝔼[1ε4r𝔼1n𝐗1𝐀n𝐗11n(tr𝐀)𝐈MK4rI𝒜|𝐀].\displaystyle\sum_{n=1}^{\infty}{\mathbb{E}}[\frac{1}{\varepsilon^{4r}}{\mathbb{E}}\|\frac{1}{n}{\bf X}_{1}{\bf A}_{n}{\bf X}_{1}^{*}-\frac{1}{n}({\rm tr}{\bf A}){\bf I}_{M}\|_{K}^{4r}I_{\mathcal{A}}|{\bf A}]. (61)

In fact, there exists a constant $K$, independent of $i$, $l$, and $n$, such that

\[
\mathbb{E}\Big[\Big|\Big(\frac{1}{n}\mathbf{X}_{1}\mathbf{A}_{n}\mathbf{X}_{1}^{*}-\frac{1}{n}(\operatorname{tr}\mathbf{A})\mathbf{I}_{M}\Big)_{il}\Big|^{4}I_{\mathcal{A}}\,\Big|\,\mathbf{A}\Big]\leq\frac{K}{n^{2}},
\]

which implies that the right-hand side of (61) is summable when $r=1$. By the Borel-Cantelli lemma, we have

\[
\frac{1}{n}\mathbf{X}_{1}\mathbf{A}_{n}\mathbf{X}_{1}^{*}-\frac{1}{n}(\operatorname{tr}\mathbf{A})\mathbf{I}_{M}\overset{a.s.}{\longrightarrow}\boldsymbol{0}_{M\times M}.
\]
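This concentration is easy to reproduce numerically. The sketch below is only an illustration under simplifying assumptions: it takes $\mathbf{A}_{n}$ diagonal with entries bounded away from $0$ and $\infty$ (so the boundedness event $\mathcal{A}$ holds trivially) and reports the Kolmogorov norm of the gap for two sample sizes.

```python
import numpy as np

def kolmogorov_gap(M, n, rng):
    """Max-entry (Kolmogorov-norm) gap between X1 A X1^T / n and
    (tr A / n) I_M, for an illustrative diagonal A with bounded norm."""
    X1 = rng.standard_normal((M, n))
    a = rng.uniform(0.5, 1.5, size=n)              # diagonal entries of A
    G = (X1 * a) @ X1.T / n - (a.sum() / n) * np.eye(M)
    return np.abs(G).max()

rng = np.random.default_rng(0)
gaps = {n: kolmogorov_gap(4, n, rng) for n in (100, 10_000)}
print(gaps)  # the gap shrinks as n grows, roughly at rate 1/sqrt(n)
```

The fourth-moment bound above is what turns this entrywise decay into almost sure convergence via Borel-Cantelli.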

7.2 Proof of Theorem 3.2

In this section, we will consider the random vector

\[
\gamma_{k}^{\mathbf{C}_{n}}=\sqrt{n}\big\{l_{i}^{\mathbf{C}_{n}}/\lambda_{nk}^{\mathbf{C}}-1,\ i\in\mathcal{J}_{k}\big\},
\]

where $\lambda_{nk}^{\mathbf{C}}$ is defined in (60). The reason for using $\lambda_{nk}^{\mathbf{C}}$ rather than its limit $\lambda_{k}^{\mathbf{C}}$ is that the convergence may be very slow. The following proof is based on (52) and (7.1); we then have

\[
0=\left|\mathbf{I}_{M}+\underline{m}_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})\mathbf{I}-\frac{\frac{1}{n}\mathbf{D}_{11}\mathbf{D}_{11}^{*}}{\lambda_{nk}^{\mathbf{C}}(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}}))}+\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}})+\varepsilon_{1}\mathbf{I}_{M}+\varepsilon_{2}\frac{1}{n}\mathbf{D}_{11}\mathbf{D}_{11}^{*}+\varepsilon_{3}\right|,
\]

where $m_{2n}^{0}$ is the Stieltjes transform of $F^{\mathbf{C}}$ with the parameters $H$ and $c_{1}$ replaced by $H_{n}$ and $c_{1n}$, and

\begin{align*}
\varepsilon_{1}&=\frac{1}{n}\operatorname{tr}\mathbf{A}_{n}(l_{i}^{\mathbf{C}_{n}})-\underline{m}_{2n}^{0}(\lambda_{nk}^{\mathbf{C}}),\tag{62}\\
\varepsilon_{2}&=\frac{-1}{l_{i}^{\mathbf{C}_{n}}+l_{i}^{\mathbf{C}_{n}}\frac{p-M}{n}m_{2n}(l_{i}^{\mathbf{C}_{n}})}-\frac{-1}{\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})},\tag{63}\\
\varepsilon_{3}&=\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}(l_{i}^{\mathbf{C}_{n}})-\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}}).\tag{64}
\end{align*}

Here we state the estimates of $\varepsilon_{1}$, $\varepsilon_{2}$, and $\varepsilon_{3}$; the detailed proofs are postponed to the following parts:

\begin{align*}
\varepsilon_{1}&=\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\lambda_{k}^{\mathbf{C}_{n}}\big[\underline{m}_{2}^{\prime}(\lambda_{k}^{\mathbf{C}_{n}})+o_{p}(1)\big],\tag{65}\\
\varepsilon_{2}&=\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\left[\frac{1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}_{n}})+c_{1}\lambda_{k}^{\mathbf{C}_{n}}m_{2}^{\prime}(\lambda_{k}^{\mathbf{C}_{n}})}{\lambda_{k}^{\mathbf{C}_{n}}[1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}_{n}})]^{2}}+o_{p}(1)\right],\tag{66}\\
\varepsilon_{3}&=o_{p}\Big(\frac{1}{\sqrt{n}}\Big)\boldsymbol{1}\boldsymbol{1}^{\prime}.\tag{67}
\end{align*}

According to the definition (60), if $i\in\mathcal{J}_{k}$, then we obtain

\[
1+\underline{m}_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})-\frac{a_{k}}{\lambda_{nk}^{\mathbf{C}}(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}}))}=0.\tag{68}
\]

We rewrite the $k$-th diagonal block of $\boldsymbol{\Omega}_{n}^{\mathbf{C}_{n}}$ as

\[
[\boldsymbol{\Omega}_{n}^{\mathbf{C}_{n}}]_{kk}=[\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}})]_{kk}+\varepsilon_{1}\mathbf{I}_{m_{k}}+\varepsilon_{2}a_{k}\mathbf{I}_{m_{k}}+o_{p}\Big(\frac{1}{\sqrt{n}}\Big).
\]

By the discussion of the limiting distribution of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}})$ and the Skorokhod strong representation theorem (for more details, see [32] or [21]), on an appropriate probability space one may redefine the random variables so that $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$ converges to its Gaussian limit with probability one. Then the eigen-equation (52) becomes

\[
0=\begin{vmatrix}
\frac{a_{k}(1-\frac{a_{1}}{a_{k}})}{\lambda_{nk}^{\mathbf{C}}b(\lambda_{nk}^{\mathbf{C}})}+O(n^{-1/2})&\cdots&O(n^{-1/2})\\
\vdots&[\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}]_{kk}+\varepsilon_{1}\mathbf{I}_{m_{k}}+\varepsilon_{2}a_{k}\mathbf{I}_{m_{k}}&\vdots\\
O(n^{-1/2})&\cdots&\frac{a_{k}(1-\frac{a_{M}}{a_{k}})}{\lambda_{nk}^{\mathbf{C}}b(\lambda_{nk}^{\mathbf{C}})}+O(n^{-1/2})
\end{vmatrix},
\]

where $b(\lambda_{nk}^{\mathbf{C}})=1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})$ and $[\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}]_{kk}$ is the $k$-th diagonal block of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$. Multiplying the $k$-th block row and column of the determinant above by $n^{1/4}$ and letting $n\to\infty$, we have

\[
\sqrt{n}\left([\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}]_{kk}+\varepsilon_{1}\mathbf{I}_{m_{k}}+\varepsilon_{2}a_{k}\mathbf{I}_{m_{k}}\right)\overset{a.s.}{\longrightarrow}0.
\]

Simplifying the above, the random vector $\gamma_{k}^{\mathbf{C}_{n}}$ tends to a random vector consisting of the ordered eigenvalues of a GOE (GUE) matrix in the real (complex) case, with the scale parameter

\begin{align*}
\theta_{1}=&\ \frac{1}{\big[\lambda_{k}^{\mathbf{C}}\underline{m}_{2}^{\prime}+\frac{a_{k}(1+c_{1}m_{2}+c_{1}\lambda_{k}^{\mathbf{C}}m_{2}^{\prime})}{\lambda_{k}^{\mathbf{C}}(1+c_{1}m_{2})^{2}}\big]^{2}}\\
&\times\left(\underline{m}_{2}^{\prime}+\frac{a_{k}^{2}c_{1}m_{2}^{\prime}}{(\lambda_{k}^{\mathbf{C}})^{2}(1+c_{1}m_{2})^{4}}+\frac{2a_{k}(1+\underline{m}_{2}+\lambda_{k}^{\mathbf{C}}\underline{m}_{2}^{\prime})}{(\lambda_{k}^{\mathbf{C}})^{2}(1+c_{1}m_{2})^{2}}\right),\tag{69}
\end{align*}

where $m_{2}$ and $\underline{m}_{2}$ are defined in (6) and (12), and $m_{2}^{\prime}$ and $\underline{m}_{2}^{\prime}$ denote their derivatives at $\lambda_{k}^{\mathbf{C}}$, respectively. The proof of Theorem 3.2 will be complete once we establish the limiting distribution of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$ and the limits of $\varepsilon_{1}$, $\varepsilon_{2}$, and $\varepsilon_{3}$. These proofs are given below.

7.2.1 Limiting distribution of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$

In this section, we proceed to show the limiting distribution of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$. According to the definition of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$ in (7.1), the proof is divided into three parts, corresponding to $(\mathbf{X}_{1}\mathbf{A}\mathbf{X}_{1}^{*}-(\operatorname{tr}\mathbf{A})\mathbf{I}_{M})/n$, $(\mathbf{X}_{1}\mathbf{A}\boldsymbol{\Xi}_{1}^{*}+\boldsymbol{\Xi}_{1}\mathbf{A}\mathbf{X}_{1}^{*})/n$, and

\[
\frac{1}{n}\mathbf{D}_{11}^{*}\left[\mathbf{A}_{11}+\frac{\mathbf{I}}{\lambda_{nk}^{\mathbf{C}_{n}}(1+c_{1n}m_{2n}(\lambda_{nk}^{\mathbf{C}_{n}}))}\right]\mathbf{D}_{11}.\tag{70}
\]

By Theorem 7.1 in [6], it is easy to obtain that the $k$-th diagonal block of the $M\times M$ matrix $\frac{1}{\sqrt{n}}\mathbf{X}_{1}\mathbf{A}\mathbf{X}_{1}^{*}-\frac{1}{\sqrt{n}}(\operatorname{tr}\mathbf{A}(\lambda_{nk}^{\mathbf{C}_{n}}))\mathbf{I}_{M}$ tends to an $m_{k}$-dimensional GOE (GUE) matrix with scale parameter $\underline{m}_{2}^{\prime}(\lambda_{k}^{\mathbf{C}})$ in the real (complex) case.
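As a sanity check of this type of CLT in the simplest possible setting, take $\mathbf{A}=\mathbf{I}_{n}$, for which the relevant scale parameter reduces to $1$: the hedged Monte Carlo sketch below (real case) shows the rescaled diagonal and off-diagonal entries of the quadratic-form matrix fluctuating with variances close to $2$ and $1$, the GOE pattern.

```python
import numpy as np

# Illustration only: with A = I_n, sqrt(n) * (X1 X1^T / n - I_M) should
# behave like a GOE matrix in the real case, i.e. independent Gaussian
# entries with variance 2 on the diagonal and 1 off the diagonal.
rng = np.random.default_rng(1)
M, n, reps = 2, 2000, 2000
diag, offdiag = [], []
for _ in range(reps):
    X1 = rng.standard_normal((M, n))
    W = np.sqrt(n) * (X1 @ X1.T / n - np.eye(M))
    diag.append(W[0, 0])
    offdiag.append(W[0, 1])
print(np.var(diag), np.var(offdiag))  # close to 2 and 1
```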

Having disposed of $\frac{1}{\sqrt{n}}\mathbf{X}_{1}\mathbf{A}\mathbf{X}_{1}^{*}-\frac{1}{\sqrt{n}}(\operatorname{tr}\mathbf{A}(\lambda_{nk}^{\mathbf{C}_{n}}))\mathbf{I}_{M}$, we now turn to (70). To this end, we consider the limiting distribution of

\[
\mathbf{A}_{11}(\lambda_{nk}^{\mathbf{C}})+\frac{1}{\lambda_{nk}^{\mathbf{C}}(1+c_{1n}m_{2n}(\lambda_{nk}^{\mathbf{C}}))}\mathbf{I}_{M}.
\]

Rewrite $\boldsymbol{\Xi}_{2}$ as $\boldsymbol{\Xi}_{2}=(\boldsymbol{0}_{p-M,M},(\tilde{\mathbf{D}}_{22})_{p-M,n-M})$; similar considerations apply to $\mathbf{X}_{2}$, namely $\mathbf{X}_{2}=((\mathbf{X}_{21})_{p-M,M},(\mathbf{X}_{22})_{p-M,n-M})$. Then we have

\[
\mathbf{A}=\begin{pmatrix}\frac{1}{n}\mathbf{X}_{21}^{*}\mathbf{X}_{21}-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{M}&-\frac{1}{n}\mathbf{X}_{21}^{*}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})\\-\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}\mathbf{X}_{21}&\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{n-M}\end{pmatrix}^{-1}.\tag{71}
\]

Then the upper-left $M\times M$ diagonal block is

\begin{align*}
\mathbf{A}_{11}=&\left(\frac{1}{n}\mathbf{X}_{21}^{*}\mathbf{X}_{21}-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{M}-\frac{1}{n}\mathbf{X}_{21}^{*}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})\mathbf{A}_{22}\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}\mathbf{X}_{21}\right)^{-1}\\
=&\left(-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{M}+\frac{1}{n}\mathbf{X}_{21}^{*}\left[\mathbf{I}_{p-M}-\frac{1}{n}(\tilde{\mathbf{D}}_{22}+\mathbf{X}_{22})\mathbf{A}_{22}(\tilde{\mathbf{D}}_{22}+\mathbf{X}_{22})^{*}\right]\mathbf{X}_{21}\right)^{-1}\\
=&\left(-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{M}-\frac{\lambda_{nk}^{\mathbf{C}}}{n}\mathbf{X}_{21}^{*}\tilde{\mathbf{A}}_{22}\mathbf{X}_{21}\right)^{-1}\\
=&\left(-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{M}-\frac{\lambda_{nk}^{\mathbf{C}}}{n}(\operatorname{tr}\tilde{\mathbf{A}}_{22})\mathbf{I}_{M}-\boldsymbol{\Omega}_{1}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}},\mathbf{X}_{21})\right)^{-1},
\end{align*}

where

\begin{align*}
\mathbf{A}_{22}&=\Big[\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{n-M}\Big]^{-1},\\
\tilde{\mathbf{A}}_{22}&=\Big[\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}-\lambda_{nk}^{\mathbf{C}}\mathbf{I}_{p-M}\Big]^{-1},\\
\boldsymbol{\Omega}_{1}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}},\mathbf{X}_{21})&=\frac{\lambda_{nk}^{\mathbf{C}}}{n}\Big[\mathbf{X}_{21}^{*}\tilde{\mathbf{A}}_{22}\mathbf{X}_{21}-(\operatorname{tr}\tilde{\mathbf{A}}_{22})\mathbf{I}_{M}\Big].
\end{align*}

We emphasize that both $\mathbf{A}_{22}$ and $\mathbf{A}$ are resolvents of noncentral sample covariance matrices with the same limiting noncentral parameter matrix, so the Stieltjes transform of the LSD associated with $\mathbf{A}_{22}$ is $\underline{m}_{2}(\cdot)$. By similar arguments applied to $\tilde{\mathbf{A}}_{22}$, we conclude that $\frac{1}{p-M}\operatorname{tr}\tilde{\mathbf{A}}_{22}$ tends to $m_{2}(\lambda_{k}^{\mathbf{C}})$ with probability one. By the CLT for quadratic forms, we have

\[
[\boldsymbol{\Omega}_{1}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}},\mathbf{X}_{21})]_{ij}=O_{p}\Big(\frac{\lambda_{nk}^{\mathbf{C}}}{\sqrt{n}}\Big),
\]

and the $k$-th diagonal block of $\frac{1}{\sqrt{p-M}}(\mathbf{X}_{21}^{*}\tilde{\mathbf{A}}_{22}\mathbf{X}_{21}-(\operatorname{tr}\tilde{\mathbf{A}}_{22})\mathbf{I}_{M})$ tends to an $m_{k}$-dimensional GOE (GUE) matrix with scale parameter $m_{2}^{\prime}(\lambda_{k}^{\mathbf{C}})$ in the real (complex) case. Moreover,

\begin{align*}
&\mathbf{A}_{11}-\frac{-1}{\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}\frac{p-M}{n}m_{2n}(\lambda_{nk}^{\mathbf{C}})}\mathbf{I}_{M}\\
=\;&\frac{-1}{\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}\frac{p-M}{n}m_{2n}(\lambda_{nk}^{\mathbf{C}})}\boldsymbol{\Omega}_{1}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}},\mathbf{X}_{21})\mathbf{A}_{11}\\
=\;&\frac{\boldsymbol{\Omega}_{1}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}},\mathbf{X}_{21})}{[\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}\frac{p-M}{n}m_{2n}(\lambda_{nk}^{\mathbf{C}})]^{2}}+\frac{[\boldsymbol{\Omega}_{1}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}},\mathbf{X}_{21})]^{2}\mathbf{A}_{11}}{[\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}\frac{p-M}{n}m_{2n}(\lambda_{nk}^{\mathbf{C}})]^{2}},\tag{72}
\end{align*}

where

\[
\frac{[\boldsymbol{\Omega}_{1}^{\mathbf{C}_{n}}(\lambda_{nk}^{\mathbf{C}},\mathbf{X}_{21})]^{2}\mathbf{A}_{11}}{[\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}\frac{p-M}{n}m_{2n}(\lambda_{nk}^{\mathbf{C}})]^{2}}=O_{p}\Big(\frac{1}{n}\Big)\|\mathbf{A}_{11}\|\boldsymbol{1}\boldsymbol{1}^{\prime}.
\]

By (7.2.1) and the classical CLT, it is easy to see that the corresponding block of $\frac{1}{\sqrt{n}}\mathbf{D}_{11}^{*}\big[\mathbf{A}_{11}+\frac{1}{l_{i}^{\mathbf{C}_{n}}(1+c_{1n}m_{2n}(l_{i}^{\mathbf{C}_{n}}))}\mathbf{I}_{M}\big]\mathbf{D}_{11}$ tends to a GOE (GUE) matrix in the real (complex) case with scale parameter

\[
\frac{a_{k}^{2}c_{1}m_{2}^{\prime}(\lambda_{k}^{\mathbf{C}})}{(\lambda_{k}^{\mathbf{C}})^{2}(1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}}))^{4}}.
\]

The limiting distribution of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$ will be fully established once we derive the limiting distribution of

\[
\frac{1}{\sqrt{n}}\mathbf{X}_{1}\mathbf{A}\boldsymbol{\Xi}_{1}^{*}+\frac{1}{\sqrt{n}}\boldsymbol{\Xi}_{1}\mathbf{A}\mathbf{X}_{1}^{*}.\tag{73}
\]

It is easily seen that the $(s,t)$ entry ($s\leq M$, $t\leq M$) of (73) is $\frac{1}{\sqrt{n}}d_{t}\mathbf{x}_{s}\mathbf{a}_{t}+\frac{1}{\sqrt{n}}d_{s}\mathbf{a}_{s}^{*}\mathbf{x}_{t}^{*}$, where $\mathbf{x}_{s}$ is the $s$-th row of $\mathbf{X}_{1}$ and $\mathbf{a}_{t}$ is the $t$-th column of $\mathbf{A}$. Since $\mathbf{X}_{1}$ is independent of $\mathbf{A}$, the conditional limiting distribution of $\frac{1}{\sqrt{n}}d_{t}\mathbf{x}_{s}\mathbf{a}_{t}+\frac{1}{\sqrt{n}}d_{s}\mathbf{a}_{s}^{*}\mathbf{x}_{t}^{*}$ given $\mathbf{A}$ is Gaussian, with mean zero and a variance that differs between the real and complex cases. For real samples, the variance equals

\[
\sigma_{n}^{2}(s,t)=\begin{cases}4a_{s}\mathbb{E}\,\mathbf{a}_{s}^{T}\mathbf{a}_{s},&\text{if }s=t,\\ a_{s}\mathbb{E}\,\mathbf{a}_{s}^{T}\mathbf{a}_{s}+a_{t}\mathbb{E}\,\mathbf{a}_{t}^{T}\mathbf{a}_{t},&\text{if }s\neq t,\end{cases}\tag{74}
\]

and for complex samples, the variance equals

\[
\sigma_{n}^{2}(s,t)=a_{s}\mathbb{E}\,\mathbf{a}_{s}^{*}\mathbf{a}_{s}+a_{t}\mathbb{E}\,\mathbf{a}_{t}^{*}\mathbf{a}_{t}.\tag{75}
\]

What is left is to compute the limits of (74) and (75). By the blockwise matrix inverse formula, the vector consisting of the first $M$ components of $\mathbf{a}_{t}$ equals the $t$-th column $\mathbf{A}_{11t}$ of $\mathbf{A}_{11}$, and the vector consisting of the remaining $n-M$ components equals $\mathbf{A}_{22}\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}\mathbf{X}_{21}\mathbf{A}_{11t}$. By (71), we have

\begin{align*}
\mathbb{E}\,\mathbf{a}_{t}^{*}\mathbf{a}_{t}=&\ \mathbb{E}\,\mathbf{A}_{11t}^{*}\mathbf{A}_{11t}+\mathbb{E}\,\mathbf{A}_{11t}^{*}\mathbf{X}_{21}^{*}\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})\mathbf{A}_{22}^{2}\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}\mathbf{X}_{21}\mathbf{A}_{11t}\\
=&\ \mathbb{E}\,\mathbf{A}_{11t}^{*}\left[\mathbf{I}_{M}+\frac{1}{n}\mathbf{X}_{21}^{*}\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})\mathbf{A}_{22}^{2}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}\mathbf{X}_{21}\right]\mathbf{A}_{11t}\\
=&\ \mathbb{E}\,\mathbf{A}_{11t}^{*}\left[\mathbf{I}_{M}+\frac{1}{n}\operatorname{tr}\Big[\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})\mathbf{A}_{22}^{2}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}\Big]\mathbf{I}_{M}+O_{p}\Big(\frac{1}{\sqrt{n}}\Big)\right]\mathbf{A}_{11t},
\end{align*}

where

\begin{align*}
&1+\frac{1}{n}\operatorname{tr}\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})\mathbf{A}_{22}^{2}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}\\
=\;&1+\frac{1}{n}\operatorname{tr}\mathbf{A}_{22}^{2}\frac{1}{n}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})^{*}(\mathbf{X}_{22}+\tilde{\mathbf{D}}_{22})\\
=\;&1+\frac{1}{n}\operatorname{tr}\mathbf{A}_{22}+\frac{\lambda_{nk}^{\mathbf{C}_{n}}}{n}\operatorname{tr}\mathbf{A}_{22}^{2}=1+\underline{m}_{2n}(\lambda_{nk}^{\mathbf{C}_{n}})+\lambda_{nk}^{\mathbf{C}_{n}}\underline{m}_{2n}^{\prime}(\lambda_{nk}^{\mathbf{C}_{n}}).
\end{align*}

Then applying (7.2.1), we have

\begin{align*}
\mathbb{E}\,\mathbf{a}_{t}^{*}\mathbf{a}_{t}=&\ \frac{1+\underline{m}_{2n}(\lambda_{nk}^{\mathbf{C}_{n}})+\lambda_{nk}^{\mathbf{C}_{n}}\underline{m}_{2n}^{\prime}(\lambda_{nk}^{\mathbf{C}_{n}})}{(\lambda_{nk}^{\mathbf{C}_{n}})^{2}(1+c_{1n}m_{2n}(\lambda_{nk}^{\mathbf{C}_{n}}))^{2}}\Big(1+o\Big(\frac{1}{n}\Big)\Big)\\
\to&\ \frac{1+\underline{m}_{2}(\lambda_{k}^{\mathbf{C}})+\lambda_{k}^{\mathbf{C}}\underline{m}_{2}^{\prime}(\lambda_{k}^{\mathbf{C}})}{(\lambda_{k}^{\mathbf{C}})^{2}(1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}}))^{2}}.\tag{76}
\end{align*}

In the same manner we can see that

\[
\mathbb{E}\,\mathbf{a}_{s}^{*}\mathbf{a}_{t}\to 0,\quad\text{for }s\neq t,\tag{77}
\]

which implies that the variables in distinct positions are asymptotically independent. From the above analysis, the corresponding block of $\frac{1}{\sqrt{n}}\mathbf{X}_{1}\mathbf{A}\boldsymbol{\Xi}_{1}^{*}+\frac{1}{\sqrt{n}}\boldsymbol{\Xi}_{1}\mathbf{A}\mathbf{X}_{1}^{*}$ tends to an $m_{k}\times m_{k}$ GOE (GUE) matrix in the real (complex) case with scale parameter

\[
\frac{2a_{k}(1+\underline{m}_{2}(\lambda_{k}^{\mathbf{C}})+\lambda_{k}^{\mathbf{C}}\underline{m}_{2}^{\prime}(\lambda_{k}^{\mathbf{C}}))}{(\lambda_{k}^{\mathbf{C}})^{2}(1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}}))^{2}}.
\]

Since the third moment of the normal population is zero, the limiting distributions of $\frac{1}{n}\mathbf{X}_{1}\mathbf{A}\mathbf{X}_{1}^{*}-\frac{1}{n}(\operatorname{tr}\mathbf{A})\mathbf{I}_{M}$, $\frac{1}{n}(\mathbf{X}_{1}\mathbf{A}\boldsymbol{\Xi}_{1}^{*}+\boldsymbol{\Xi}_{1}\mathbf{A}\mathbf{X}_{1}^{*})$, and (70) are mutually independent. We conclude that the limiting distribution of the $k$-th block of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$ is that of an $m_{k}\times m_{k}$ GOE (GUE) matrix in the real (complex) case with scale parameter

\[
\underline{m}_{2}^{\prime}(\lambda_{k}^{\mathbf{C}})+\frac{a_{k}^{2}c_{1}m_{2}^{\prime}(\lambda_{k}^{\mathbf{C}})}{(\lambda_{k}^{\mathbf{C}})^{2}(1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}}))^{4}}+\frac{2a_{k}(1+\underline{m}_{2}(\lambda_{k}^{\mathbf{C}})+\lambda_{k}^{\mathbf{C}}\underline{m}_{2}^{\prime}(\lambda_{k}^{\mathbf{C}}))}{(\lambda_{k}^{\mathbf{C}})^{2}(1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}}))^{2}}.\tag{78}
\]

7.2.2 Limits of $\varepsilon_{1}$, $\varepsilon_{2}$ and $\varepsilon_{3}$

In this section, we derive the limits of $\varepsilon_{1}$, $\varepsilon_{2}$, and $\varepsilon_{3}$ defined in (62)-(64). To deal with $\varepsilon_{1}$, we note that

\begin{align*}
\varepsilon_{1}&=\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\frac{\lambda_{nk}^{\mathbf{C}}}{n}\operatorname{tr}\mathbf{A}_{n}(\lambda_{nk}^{\mathbf{C}})\mathbf{A}_{n}(l_{i}^{\mathbf{C}_{n}})+\frac{1}{n}\operatorname{tr}\mathbf{A}_{n}(\lambda_{nk}^{\mathbf{C}})-\underline{m}_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})\\
&=\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\frac{\lambda_{nk}^{\mathbf{C}}}{n}\operatorname{tr}\mathbf{A}_{n}^{2}(\lambda_{nk}^{\mathbf{C}})+\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\frac{\lambda_{nk}^{\mathbf{C}}}{n}\operatorname{tr}\mathbf{A}_{n}(\lambda_{nk}^{\mathbf{C}})\big[\mathbf{A}_{n}(l_{i}^{\mathbf{C}_{n}})-\mathbf{A}_{n}(\lambda_{nk}^{\mathbf{C}})\big]+O_{p}\Big(\frac{1}{n}\Big),
\end{align*}

where the last equality is a consequence of Theorem 1 in [9]. Since $\|\mathbf{A}_{n}(l_{i}^{\mathbf{C}_{n}})\|$ is bounded a.s., by Theorem 3.1 we have

\begin{align*}
&\frac{1}{n}\operatorname{tr}\mathbf{A}_{n}^{2}(\lambda_{nk}^{\mathbf{C}})\overset{a.s.}{\longrightarrow}\underline{m}_{2}^{\prime}(\lambda_{nk}^{\mathbf{C}}),\\
&\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\frac{\lambda_{nk}^{\mathbf{C}}}{n}\operatorname{tr}\mathbf{A}_{n}(\lambda_{nk}^{\mathbf{C}})\big[\mathbf{A}_{n}(l_{i}^{\mathbf{C}_{n}})-\mathbf{A}_{n}(\lambda_{nk}^{\mathbf{C}})\big]=o_{p}\Big(\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\lambda_{nk}^{\mathbf{C}}\Big).
\end{align*}

The task is now to consider the limit of $\varepsilon_{2}$:

\begin{align*}
\varepsilon_{2}=&\ \frac{-1}{l_{i}^{\mathbf{C}_{n}}+l_{i}^{\mathbf{C}_{n}}\frac{p-M}{n}m_{2n}(l_{i}^{\mathbf{C}_{n}})}-\frac{-1}{\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})}\\
=&\ \frac{l_{i}^{\mathbf{C}_{n}}(1+\frac{p-M}{n}m_{2n}(l_{i}^{\mathbf{C}_{n}}))-\lambda_{nk}^{\mathbf{C}}(1+c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}}))}{[\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})]^{2}}\\
&+\frac{(l_{i}^{\mathbf{C}_{n}}-\lambda_{nk}^{\mathbf{C}})^{2}(1+c_{1n}m_{2n}(l_{i}^{\mathbf{C}_{n}})+\lambda_{nk}^{\mathbf{C}}c_{1n}(m_{2n}^{0})^{\prime}(\lambda_{nk}^{\mathbf{C}}))^{2}+o_{p}(\frac{1}{n})}{[\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})]^{2}(l_{i}^{\mathbf{C}_{n}}+l_{i}^{\mathbf{C}_{n}}\frac{p-M}{n}m_{2n}^{0}(l_{i}^{\mathbf{C}_{n}}))}\\
=&\ \frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\left[\frac{1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}})+c_{1}\lambda_{k}^{\mathbf{C}}m_{2}^{\prime}(\lambda_{k}^{\mathbf{C}})}{\lambda_{k}^{\mathbf{C}}[1+c_{1}m_{2}(\lambda_{k}^{\mathbf{C}})]^{2}}+o_{p}(1)\right],
\end{align*}

the last equality being a consequence of

\[
\frac{(l_{i}^{\mathbf{C}_{n}}-\lambda_{nk}^{\mathbf{C}})^{2}(1+c_{1n}m_{2n}(l_{i}^{\mathbf{C}_{n}})+\lambda_{nk}^{\mathbf{C}}c_{1n}(m_{2n}^{0})^{\prime}(\lambda_{nk}^{\mathbf{C}}))^{2}+o_{p}(\frac{1}{n})}{[\lambda_{nk}^{\mathbf{C}}+\lambda_{nk}^{\mathbf{C}}c_{1n}m_{2n}^{0}(\lambda_{nk}^{\mathbf{C}})]^{2}(l_{i}^{\mathbf{C}_{n}}+l_{i}^{\mathbf{C}_{n}}\frac{p-M}{n}m_{2n}(l_{i}^{\mathbf{C}_{n}}))}=o_{p}\Big(\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\Big).
\]

It remains to show the limit of $\varepsilon_{3}$. Here we focus on $\frac{1}{n}\mathbf{X}_{1}\mathbf{A}\mathbf{X}_{1}^{*}-\frac{1}{n}(\operatorname{tr}\mathbf{A})\mathbf{I}_{M}$; the other terms can be handled in a similar way. By the assumption on $a_{k}$, we have

\begin{align*}
&\frac{1}{n}\mathbf{X}_{1}\mathbf{A}(l_{i}^{\mathbf{C}_{n}})\mathbf{X}_{1}^{*}-\frac{1}{n}(\operatorname{tr}\mathbf{A}(l_{i}^{\mathbf{C}_{n}}))\mathbf{I}_{M}-\Big(\frac{1}{n}\mathbf{X}_{1}\mathbf{A}(\lambda_{nk}^{\mathbf{C}})\mathbf{X}_{1}^{*}-\frac{1}{n}(\operatorname{tr}\mathbf{A}(\lambda_{nk}^{\mathbf{C}}))\mathbf{I}_{M}\Big)\\
=\;&\left[\frac{1}{n}\mathbf{X}_{1}\mathbf{A}(\lambda_{nk}^{\mathbf{C}})\mathbf{A}(l_{i}^{\mathbf{C}_{n}})\mathbf{X}_{1}^{*}-\frac{1}{n}\operatorname{tr}\mathbf{A}(\lambda_{nk}^{\mathbf{C}})\mathbf{A}(l_{i}^{\mathbf{C}_{n}})\mathbf{I}_{M}\right](l_{i}^{\mathbf{C}_{n}}-\lambda_{nk}^{\mathbf{C}})\\
=\;&O_{p}\Big(\frac{\lambda_{k}^{\mathbf{C}}}{\sqrt{n}}\Big)\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\boldsymbol{1}\boldsymbol{1}^{\prime}=o_{p}\Big(\frac{\gamma_{k}^{\mathbf{C}_{n}}}{\sqrt{n}}\Big)\boldsymbol{1}\boldsymbol{1}^{\prime}.
\end{align*}

Note that the assumption on the divergence rate of $a_{k}$ in Assumption A is used here. To be specific, if $a_{k}$ diverged to infinity faster than $\sqrt{n}$, then $\lambda_{k}^{\mathbf{C}}/\sqrt{n}$ would tend to infinity.

Combining this with the limiting distribution of $\boldsymbol{\Omega}_{0}^{\mathbf{C}_{n}}$, we obtain the limiting distribution of the random vector $\{\gamma_{k}^{\mathbf{C}_{n}}\}$, whose key scale parameter is given in (7.2).

7.3 Proof of Theorem 3.3

First, we consider the first-order limit of the spiked eigenvalues of $\mathbf{F}_{p}$ given the matrix sequence $\{\mathbf{C}_{n}\}$. Recall

\begin{align*}
\mathbf{C}_{n}&=\frac{1}{n}(\boldsymbol{\Xi}+\mathbf{X})(\boldsymbol{\Xi}+\mathbf{X})^{*}=\mathbf{U}\begin{pmatrix}\boldsymbol{\Sigma}_{1}&0\\0&\boldsymbol{\Sigma}_{2}\end{pmatrix}\mathbf{U}^{*},\\
\mathbf{S}_{N}&=\frac{1}{N}\mathbf{Y}_{N}\mathbf{Y}_{N}^{*}=\frac{1}{N}\begin{pmatrix}\mathbf{Y}_{1}\\\mathbf{Y}_{2}\end{pmatrix}\begin{pmatrix}\mathbf{Y}_{1}^{*}&\mathbf{Y}_{2}^{*}\end{pmatrix}=\frac{1}{N}\begin{pmatrix}\mathbf{Y}_{1}\mathbf{Y}_{1}^{*}&\mathbf{Y}_{1}\mathbf{Y}_{2}^{*}\\\mathbf{Y}_{2}\mathbf{Y}_{1}^{*}&\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\end{pmatrix},
\end{align*}

where $\boldsymbol{\Sigma}_{1}$ is an $M\times M$ diagonal matrix and $\mathbf{Y}_{1}$ denotes the first $M$ rows of $\mathbf{Y}$. The matrix $\mathbf{C}_{n}$ can be seen as a general non-negative definite matrix with eigenvalues arranged in descending order,

\[
l_{1}^{\mathbf{C}_{n}}\geq l_{2}^{\mathbf{C}_{n}}\geq\cdots\geq l_{p}^{\mathbf{C}_{n}}.\tag{81}
\]

We consider the eigen-equation

\begin{align*}
&\left|\mathbf{C}_{n}\mathbf{S}_{N}^{-1}-\lambda\mathbf{I}\right|=0\Longleftrightarrow\left|\mathbf{C}_{n}-\lambda\mathbf{S}_{N}\right|=0\\
\Longleftrightarrow\;&\left|\mathbf{U}\begin{pmatrix}\boldsymbol{\Sigma}_{1}&0\\0&\boldsymbol{\Sigma}_{2}\end{pmatrix}\mathbf{U}^{*}-\frac{\lambda}{N}\mathbf{Y}_{N}\mathbf{Y}_{N}^{*}\right|=0\\
\Longleftrightarrow\;&\left|\begin{pmatrix}\boldsymbol{\Sigma}_{1}&0\\0&\boldsymbol{\Sigma}_{2}\end{pmatrix}-\frac{\lambda}{N}\mathbf{U}^{*}\mathbf{Y}_{N}\mathbf{Y}_{N}^{*}\mathbf{U}\right|=0.
\end{align*}

We denote $\mathbf{U}^{*}\mathbf{Y}_{N}\mathbf{Y}_{N}^{*}\mathbf{U}/N$ by $\tilde{\mathbf{Y}}_{N}\tilde{\mathbf{Y}}_{N}^{*}/N$. Since the entries of $\mathbf{Y}_{N}$ are standard normal, $\tilde{\mathbf{Y}}_{N}\tilde{\mathbf{Y}}_{N}^{*}/N$ and $\mathbf{Y}_{N}\mathbf{Y}_{N}^{*}/N$ have the same distribution. Where no confusion arises, we still write $\mathbf{Y}_{N}$. Then the eigen-equation becomes

\begin{align*}
&\left|\begin{pmatrix}\boldsymbol{\Sigma}_{1}&0\\0&\boldsymbol{\Sigma}_{2}\end{pmatrix}-\frac{\lambda}{N}\mathbf{Y}_{N}\mathbf{Y}_{N}^{*}\right|=0\\
\Longleftrightarrow\;&\left|\begin{pmatrix}\boldsymbol{\Sigma}_{1}&0\\0&\boldsymbol{\Sigma}_{2}\end{pmatrix}-\lambda\begin{pmatrix}\frac{1}{N}\mathbf{Y}_{1}\mathbf{Y}_{1}^{*}&\frac{1}{N}\mathbf{Y}_{1}\mathbf{Y}_{2}^{*}\\\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{1}^{*}&\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\end{pmatrix}\right|=0.
\end{align*}
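The rotation-invariance step, namely that $\mathbf{U}^{*}\mathbf{Y}_{N}$ has the same distribution as $\mathbf{Y}_{N}$ for a Gaussian array and an orthogonal $\mathbf{U}$, is easy to probe numerically. The following sketch (dimensions chosen arbitrarily for illustration) verifies that the rotated array still looks entrywise standard normal:

```python
import numpy as np

# Rotation invariance of the Gaussian ensemble: U^T Y has i.i.d. N(0,1)
# entries whenever Y does and U is orthogonal, so U^T Y Y^T U / N may be
# replaced by Y Y^T / N in distribution.
rng = np.random.default_rng(2)
p, N = 50, 200
Y = rng.standard_normal((p, N))
U, _ = np.linalg.qr(rng.standard_normal((p, p)))  # a random orthogonal U
Yt = U.T @ Y
print(Yt.mean(), Yt.var())  # entrywise mean ~ 0, variance ~ 1
```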

For the sample spiked eigenvalues $l_{i}$, $i\in\mathcal{J}_{k}$, $k=1,\ldots,K$, of $\boldsymbol{F}_{p}=\mathbf{C}_{n}\mathbf{S}_{N}^{-1}$, we have $|\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}|\neq0$ almost surely; then

\begin{align*}
&\left|\boldsymbol{\Sigma}_{1}-l_{i}\frac{1}{N}\mathbf{Y}_{1}\mathbf{Y}_{1}^{*}-l_{i}^{2}\frac{1}{N}\mathbf{Y}_{1}\mathbf{Y}_{2}^{*}\Big(\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{1}^{*}\right|=0\\
\Longleftrightarrow\;&\left|\boldsymbol{\Sigma}_{1}-l_{i}\frac{1}{N}\mathbf{Y}_{1}\Big[\mathbf{I}_{N}+l_{i}\frac{1}{N}\mathbf{Y}_{2}^{*}\Big(\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}\mathbf{Y}_{2}\Big]\mathbf{Y}_{1}^{*}\right|=0\\
\Longleftrightarrow\;&\left|\boldsymbol{\Sigma}_{1}-l_{i}\frac{1}{N}\operatorname{tr}\Big[\mathbf{I}_{N}+l_{i}\frac{1}{N}\mathbf{Y}_{2}^{*}\Big(\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}\mathbf{Y}_{2}\Big]\mathbf{I}_{M}+\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})\right|=0\\
\Longleftrightarrow\;&\left|\boldsymbol{\Sigma}_{1}-l_{i}\Big(1+l_{i}\frac{p-M}{N}\frac{1}{p-M}\operatorname{tr}\Big[\Big(\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big]\Big)\mathbf{I}_{M}+\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})\right|=0\\
\Longleftrightarrow\;&\left|\boldsymbol{\Sigma}_{1}-l_{i}\Big(1+l_{i}\frac{p-M}{N}\frac{1}{p-M}\operatorname{tr}\Big(\boldsymbol{\Sigma}_{2}\Big(\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}-l_{i}\mathbf{I}\Big)^{-1}\Big)\mathbf{I}+\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})\right|=0,
\end{align*}

where

\begin{align*}
\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})=&\ \frac{l_{i}}{N}\mathbf{Y}_{1}\Big[\mathbf{I}_{N}+l_{i}\frac{1}{N}\mathbf{Y}_{2}^{*}\Big(\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}\mathbf{Y}_{2}\Big]\mathbf{Y}_{1}^{*}\\
&-\frac{l_{i}}{N}\operatorname{tr}\Big[\mathbf{I}_{N}+l_{i}\frac{1}{N}\mathbf{Y}_{2}^{*}\Big(\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}\mathbf{Y}_{2}\Big]\mathbf{I}_{M}.\tag{83}
\end{align*}
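The reduction from the $p\times p$ eigen-equation to the $M\times M$ determinant equation above rests on the Schur-complement determinant identity. A minimal numerical check, with arbitrary block sizes and a generic well-conditioned lower-right block (illustrative choices only), is:

```python
import numpy as np

# Schur complement: for an invertible lower-right block D,
# det([[A, B], [C, D]]) = det(D) * det(A - B D^{-1} C),
# which reduces a (M+q) x (M+q) determinant equation to an M x M one.
rng = np.random.default_rng(3)
M, q = 3, 7
A = rng.standard_normal((M, M))
B = rng.standard_normal((M, q))
C = rng.standard_normal((q, M))
D = rng.standard_normal((q, q)) + 10 * np.eye(q)  # safely invertible
full = np.block([[A, B], [C, D]])
lhs = np.linalg.det(full)
rhs = np.linalg.det(D) * np.linalg.det(A - B @ np.linalg.inv(D) @ C)
print(np.isclose(lhs, rhs, rtol=1e-6))
```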

Similarly to Subsection 7.1.1, by the assumption on $a_{k}$, the strong law of large numbers, and the Stieltjes transform equation of the M-P law, we have $\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})\overset{a.s.}{\longrightarrow}\boldsymbol{0}_{M\times M}$ as $N\to\infty$, and

\[
\frac{1}{p-M}\operatorname{tr}\Big(\boldsymbol{\Sigma}_{2}\Big(\frac{1}{N}\mathbf{Y}_{2}\mathbf{Y}_{2}^{*}\Big)^{-1}-z\mathbf{I}\Big)^{-1}\overset{a.s.}{\longrightarrow}m_{3}(z),\quad\forall z\in\mathbb{C}^{+}\cup S_{F}^{c},\tag{84}
\]

where $m_{3}(\cdot)$ is the Stieltjes transform of $F$. Taking the limit in the eigen-equation, some diagonal block must vanish, which yields $\lambda_{k}^{\mathbf{C}}=\lambda_{k}(1+c_{2}\lambda_{k}m_{3}(\lambda_{k}))$, where $\lambda_{k}^{\mathbf{C}}$ and $\lambda_{k}$ are the almost sure limits of $l_{i}^{\mathbf{C}_{n}}$ and $l_{i}$, $i\in\mathcal{J}_{k}$, respectively. According to (8), we have

\begin{align*}
zm_{3}(z)=&\ (z+c_{2}z^{2}m_{3}(z))\,m_{2}(z+c_{2}z^{2}m_{3}(z))\\
\Longleftrightarrow\ \lambda_{k}m_{3}(\lambda_{k})=&\ \lambda_{k}(1+c_{2}\lambda_{k}m_{3}(\lambda_{k}))\,m_{2}\big(\lambda_{k}(1+c_{2}\lambda_{k}m_{3}(\lambda_{k}))\big)=\lambda_{k}^{\mathbf{C}}m_{2}(\lambda_{k}^{\mathbf{C}}),
\end{align*}

i.e.,

λk𝐂\displaystyle\lambda_{k}^{{\bf C}} =\displaystyle= λk(1+c2λk𝐂m2(λk𝐂))\displaystyle\lambda_{k}(1+c_{2}\lambda_{k}^{{\bf C}}m_{2}(\lambda_{k}^{{\bf C}})) (85)
ψ𝑭(λk𝐂)\displaystyle\psi_{\boldsymbol{F}}(\lambda_{k}^{{\bf C}}) =\displaystyle\overset{\bigtriangleup}{=} λk=λk𝐂1+c2λk𝐂m2(λk𝐂)=λk𝐂1+c2λkm3(λk).\displaystyle\lambda_{k}=\frac{\lambda_{k}^{{\bf C}}}{1+c_{2}\lambda_{k}^{{\bf C}}m_{2}(\lambda_{k}^{{\bf C}})}=\frac{\lambda_{k}^{{\bf C}}}{1+c_{2}\lambda_{k}m_{3}(\lambda_{k})}. (86)

From what has already been proved, we conclude that the first-order limit of lil_{i} does not depend on the particular realization of {𝐂n}\{{\bf C}_{n}\}, but only on the limits of its spiked eigenvalues. An easy computation shows the relationship between λk\lambda_{k} and aka_{k}:

λk=λk𝐂1+c2λk𝐂m2(λk𝐂),λk𝐂=ak(1c1m1(ak))2+(1c1)(1c1m1(ak)).\displaystyle\lambda_{k}=\frac{\lambda_{k}^{{\bf C}}}{1+c_{2}\lambda_{k}^{{\bf C}}m_{2}(\lambda_{k}^{{\bf C}})},\quad\lambda^{{\bf C}}_{k}=a_{k}\left(1-c_{1}m_{1}(a_{k})\right)^{2}+(1-c_{1})\left(1-c_{1}m_{1}(a_{k})\right).
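Although not part of the proof, the first-order relation λk𝐂=λk(1+c2λkm3(λk))\lambda_{k}^{{\bf C}}=\lambda_{k}(1+c_{2}\lambda_{k}m_{3}(\lambda_{k})) lends itself to a quick numerical check, with m3m_{3} replaced by the empirical Stieltjes transform of the spectrum of 𝐒N1{\bf S}_{N}^{-1}. The dimensions, the single spike a1a_{1}, and 𝚺2=𝐈\boldsymbol{\Sigma}_{2}={\bf I} below are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, N, a1 = 200, 600, 800, 50.0        # dimensions and spike size (our choice)
c2 = p / N

# Noncentral part: Xi = sqrt(n*a1) u v^T, so Xi Xi^*/n has the single eigenvalue a1
Xi = np.zeros((p, n))
Xi[0, 0] = np.sqrt(n * a1)
X = rng.standard_normal((p, n))
Y = rng.standard_normal((p, N))

C = (Xi + X) @ (Xi + X).T / n            # noncentral sample covariance C_n
S = Y @ Y.T / N                          # S_N with Sigma_2 = I
lC = np.linalg.eigvalsh(C)[-1]           # spiked eigenvalue of C_n
lF = np.sort(np.linalg.eigvals(np.linalg.solve(S, C)).real)  # spectrum of F_p
l1, l2 = lF[-1], lF[-2]

# Empirical m_3: Stieltjes transform of the spectrum of S^{-1} at the spike
tau = 1.0 / np.linalg.eigvalsh(S)
m3_hat = np.mean(1.0 / (tau - l1))

# First-order relation: l^C ~ l (1 + c2 * l * m3(l)) at the spiked eigenvalue
lC_pred = l1 * (1.0 + c2 * l1 * m3_hat)
print(l1, l2, lC, lC_pred)
```

With a supercritical spike, the largest eigenvalue of 𝐅p{\bf F}_{p} separates far from the bulk, and the predicted and observed spiked eigenvalues of 𝐂n{\bf C}_{n} agree up to finite-sample fluctuations.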

7.4 Proof of Theorem 3.4

We are now in a position to study the asymptotic distribution of the random vector

γNk=(N(liλNk)/λNk,i𝒥k)\displaystyle\gamma_{Nk}=(\sqrt{N}(l_{i}-\lambda_{Nk})/\lambda_{Nk},i\in\mathcal{J}_{k}) (87)

where λNk=ψ𝐅(λnk𝐂n)\lambda_{Nk}=\psi_{{\bf F}}(\lambda_{nk}^{{\bf C}_{n}}) with the parameters of the function ψ𝐅\psi_{{\bf F}} replaced by their empirical counterparts. The proof is divided into two steps. First, we derive the conditional limiting distribution of γNk|𝐂n\gamma_{Nk}|{\bf C}_{n} and show that it does not depend on the choice of the conditioning 𝐂n{\bf C}_{n}. Second, combining the above theorems with the following subsection, we complete the proof of Theorem 3.4.

7.4.1 The conditional limiting distribution of γNk|𝐂n\gamma_{Nk}|{\bf C}_{n}

In this section, we consider the CLT of the random vector γNk|𝐂n={N(liψ𝑭(li𝐂n))/ψ𝑭(li𝐂n),i𝒥k}\gamma_{Nk}|{\bf C}_{n}=\{\sqrt{N}(l_{i}-\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))/\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}),i\in\mathcal{J}_{k}\}. Recall the eigen-equation

|𝚺1li(1+lipMN1pMtr(𝚺2(1N𝐘2𝐘2)1li𝐈pM)1)𝐈M+𝛀N𝑭(li)|=0\displaystyle\left|\boldsymbol{\Sigma}_{1}\!-\!l_{i}\left(1\!+\!l_{i}\frac{p\!-\!M}{N}\frac{1}{p\!-\!M}{\rm tr}\left(\boldsymbol{\Sigma}_{2}\left(\frac{1}{N}{\bf Y}_{2}{\bf Y}_{2}^{\ast}\right)^{-1}\!-\!l_{i}{\bf I}_{p-M}\right)^{-1}\right){\bf I}_{M}\!+\!\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})\right|\!=\!0
|𝚺1ψ𝑭(li𝐂n)(1+pMNψ𝑭(li𝐂n)m3N(ψ𝑭(li𝐂n)))𝐈M+𝛀N𝑭(ψ𝑭(li𝐂n))+ε1|=0\displaystyle\left|\boldsymbol{\Sigma}_{1}\!-\!\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})\left(1\!+\!\frac{p\!-\!M}{N}\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})m_{3N}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))\right){\bf I}_{M}\!+\!\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))\!+\!\varepsilon_{1}\right|=0

where

m3N(ψ𝑭(li𝐂n))=1pMtr(𝚺2(1N𝐘2𝐘2)1ψ𝑭(li𝐂n)𝐈pM)1\displaystyle m_{3N}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))=\frac{1}{p\!-\!M}{\rm tr}\left(\boldsymbol{\Sigma}_{2}\left(\frac{1}{N}{\bf Y}_{2}{\bf Y}_{2}^{\ast}\right)^{\!-\!1}\!-\!\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}){\bf I}_{p-M}\right)^{-1}
ε1=ψ𝑭(li𝐂n)(1+ψ𝑭(li𝐂n)pMNm3N(ψ𝑭(li𝐂n)))𝐈M\displaystyle\varepsilon_{1}=\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})\left(1+\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})\frac{p-M}{N}m_{3N}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))\right){\bf I}_{M}
li(1+lipMNm3N(li))𝐈M+𝛀N𝑭(li)𝛀N𝑭(ψ𝑭(li𝐂n)).\displaystyle\quad\quad-l_{i}\left(1+l_{i}\frac{p-M}{N}m_{3N}(l_{i})\right){\bf I}_{M}+\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})-\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})).

Setting c2N=(pM)/Nc_{2N}=(p-M)/N, a simple calculation gives

ε1=\displaystyle\varepsilon_{1}= [(liψ𝑭(li𝐂n))(li2(ψ𝑭(li𝐂n))2)c2Nm3N(li)\displaystyle[-(l_{i}-\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))-(l_{i}^{2}-(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))^{2})c_{2N}m_{3N}(l_{i})
(ψ𝑭(li𝐂n))2c2Nm3N(ψ𝑭(li𝐂n))(liψ𝑭(li𝐂n))(1+o(1))]𝐈M\displaystyle-(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))^{2}c_{2N}m_{3N}^{\prime}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))(l_{i}-\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))(1+o(1))]\mathbf{I}_{M}
+𝛀N𝑭(li)𝛀N𝑭(ψ𝑭(li𝐂n))\displaystyle+\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(l_{i})-\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))

i.e.

ε1=γNk|𝐂nNλk[1+2λkc2m3(λk)+λk2c2m3(λk)](1+op(1))𝐈+op(liN)𝟏𝟏.\displaystyle\varepsilon_{1}\!=\!-\!\frac{\gamma_{Nk}|{\bf C}_{n}}{\sqrt{N}}\lambda_{k}\left[1\!+\!2\lambda_{k}c_{2}m_{3}(\lambda_{k})\!+\!\lambda_{k}^{2}c_{2}m_{3}^{\prime}(\lambda_{k})\right](1\!+\!o_{p}(1)){\bf I}\!+\!o_{p}(\frac{l_{i}}{\sqrt{N}})\boldsymbol{1}\boldsymbol{1}^{\prime}. (88)

By (86), we have

li𝐂n=ψ𝑭(li𝐂n)(1+c2ψ𝑭(li𝐂n)m3(ψ𝑭(li𝐂n))).l_{i}^{{\bf C}_{n}}=\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})(1+c_{2}\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})m_{3}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))).

We recall

|𝚺1ψ𝑭(li𝐂n)(1+ψ𝑭(li𝐂n)c2Nm3N(ψ𝑭(li𝐂n)))𝐈M+𝛀N𝐅(ψ𝑭(li𝐂n))+ε1𝐈M|=0\displaystyle\left|\boldsymbol{\Sigma}_{1}\!-\!\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})\left(1\!+\!\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})c_{2N}m_{3N}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))\right){\bf I}_{M}\!+\!\boldsymbol{\Omega}_{N}^{{\bf F}}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))+\varepsilon_{1}{\bf I}_{M}\right|=0

becomes

0=|λ1𝐂li𝐂n+O(ψ𝑭(li𝐂n)N)O(ψ𝑭(li𝐂n)N)O(ψ𝑭(li𝐂n)N)O(ψ𝑭(li𝐂n)N)O(ψ𝑭(li𝐂n)N)[𝛀N𝑭]kk+ε1𝐈mkO(ψ𝑭(li𝐂n)N)O(ψ𝑭(li𝐂n)N)O(ψ𝑭(li𝐂n)N)λM𝐂li𝐂n+O(ψ𝑭(li𝐂n)N)|\displaystyle 0=\begin{vmatrix}\lambda_{1}^{{\bf C}}\!-\!l_{i}^{{\bf C}_{n}}\!+\!O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})&\cdots&O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})&\cdots&O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})\cr O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})&\cdots&\cdots&\cdots&O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})\cr\cdots&\cdots&[\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}]_{kk}\!+\!\varepsilon_{1}{\bf I}_{m_{k}}&\cdots&\cdots\cr O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})&\cdots&\cdots&\cdots&O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})\cr O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})&\cdots&\cdots&\cdots&\lambda_{M}^{{\bf C}}\!-\!l_{i}^{{\bf C}_{n}}\!+\!O(\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\sqrt{N}})\cr\end{vmatrix} (89)

where [𝛀N𝑭]kk[\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}]_{kk} is kk-th diagonal block of 𝛀N𝑭(ψ𝑭(li𝐂n))\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})).

By the Skorokhod strong representation theorem (for more details, see [32] or [21]), on an appropriate probability space one may redefine the random variables so that 𝛀N𝑭\boldsymbol{\Omega}_{N}^{\boldsymbol{F}} tends to Gaussian variables with probability one. Multiply the kk-th block row and column of the determinant in (89) by (ψ𝑭(li𝐂n))1/2N1/4(\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}}))^{-1/2}N^{1/4} and let pp\to\infty and NN\to\infty. It is easily seen that all off-diagonal elements tend to zero and all the diagonal entries except the kk-th are bounded away from zero. Therefore,

[N𝛀N𝑭]kk(γNk|𝐂n)λkϑ(λk)𝐈mka.s.0,[\sqrt{N}\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}]_{kk}-(\gamma_{Nk}|{\bf C}_{n})\lambda_{k}\vartheta(\lambda_{k}){\bf I}_{m_{k}}\overset{a.s.}{\to}0,

where

ϑ(λk)=1+2λkc2m3(λk)+c2λk2m3(λk).\displaystyle\vartheta(\lambda_{k})=1+2\lambda_{k}c_{2}m_{3}(\lambda_{k})+c_{2}\lambda_{k}^{2}m_{3}^{\prime}(\lambda_{k}). (90)

By the classical CLT, [N𝛀N𝑭]kk[\sqrt{N}\boldsymbol{\Omega}_{N}^{\boldsymbol{F}}]_{kk} tends to an mkm_{k}-dimensional GOE (GUE) matrix in the real (complex) case with scale parameter λk2ϑ(λk)\lambda_{k}^{2}\vartheta(\lambda_{k}). In fact, the scale parameter is the limit of

li2Ntr[𝐈+li1N𝐘2(𝚺2li1N𝐘2𝐘2)1𝐘2]2\displaystyle\frac{l_{i}^{2}}{N}{\rm tr}\left[{\bf I}+l_{i}\frac{1}{N}{\bf Y}_{2}^{*}\left(\boldsymbol{\Sigma}_{2}-l_{i}\frac{1}{N}{\bf Y}_{2}{\bf Y}_{2}^{*}\right)^{-1}{\bf Y}_{2}\right]^{2}
=\displaystyle= li2Ntr𝐈N+2li3Ntr(𝚺2(1N𝐘2𝐘2)1li𝐈)1+li4Ntr(𝚺2(1N𝐘2𝐘2)1li𝐈)2\displaystyle\frac{l_{i}^{2}}{N}{\rm tr}{\bf I}_{N}+2\frac{l_{i}^{3}}{N}{\rm tr}\left(\boldsymbol{\Sigma}_{2}\left(\frac{1}{N}{\bf Y}_{2}{\bf Y}_{2}^{*}\right)^{-1}-l_{i}{\bf I}\right)^{-1}+\frac{l_{i}^{4}}{N}{\rm tr}\left(\boldsymbol{\Sigma}_{2}\left(\frac{1}{N}{\bf Y}_{2}{\bf Y}_{2}^{*}\right)^{-1}-l_{i}{\bf I}\right)^{-2}
a.s.\displaystyle\overset{a.s.}{\to} λk2(1+2c2λkm3(λk)+λk2c2m3(λk))=λk2ϑ(λk).\displaystyle\lambda_{k}^{2}(1+2c_{2}\lambda_{k}m_{3}(\lambda_{k})+\lambda_{k}^{2}c_{2}m_{3}^{\prime}(\lambda_{k}))=\lambda_{k}^{2}\vartheta(\lambda_{k}).

Then we conclude that the conditional limiting distribution of γNk|𝐂n\gamma_{Nk}|{\bf C}_{n} equals the joint distribution of the eigenvalues of a GOE (GUE) matrix with scale parameter 1/ϑ(λk)1/\vartheta(\lambda_{k}).
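Incidentally, the expansion of the scale parameter above is an exact algebraic identity before any limit is taken, because (𝚺2(𝐘2𝐘2/N)1l𝐈)1=(𝐘2𝐘2/N)(𝚺2l𝐘2𝐘2/N)1(\boldsymbol{\Sigma}_{2}({\bf Y}_{2}{\bf Y}_{2}^{*}/N)^{-1}-l{\bf I})^{-1}=({\bf Y}_{2}{\bf Y}_{2}^{*}/N)(\boldsymbol{\Sigma}_{2}-l{\bf Y}_{2}{\bf Y}_{2}^{*}/N)^{-1}. A minimal numerical check, where the small sizes, the test point ll, and the generic positive definite 𝚺2\boldsymbol{\Sigma}_{2} are arbitrary choices of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
p, N, l = 15, 40, 3.7                       # arbitrary small sizes and test point
Y = rng.standard_normal((p, N))
B0 = rng.standard_normal((p, p))
Sigma2 = B0 @ B0.T + p * np.eye(p)          # generic positive definite Sigma_2

G = Y @ Y.T / N                             # (1/N) Y2 Y2^*
A = Sigma2 - l * G
B = np.eye(N) + l * (Y.T @ np.linalg.solve(A, Y)) / N

# Left-hand side: (l^2/N) tr [I_N + l (1/N) Y2^* (Sigma2 - l G)^{-1} Y2]^2
lhs = (l**2 / N) * np.trace(B @ B)

# Right-hand side: l^2 + (2 l^3/N) tr R + (l^4/N) tr R^2,
# with R = (Sigma2 G^{-1} - l I)^{-1}
R = np.linalg.inv(Sigma2 @ np.linalg.inv(G) - l * np.eye(p))
rhs = l**2 + (2 * l**3 / N) * np.trace(R) + (l**4 / N) * np.trace(R @ R)
print(lhs, rhs)
```

The two sides agree to machine precision, confirming that the three-term expansion is purely algebraic and only the subsequent limits are probabilistic.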

7.4.2 The limiting distribution of γNk\gamma_{Nk}

In this part, we derive the asymptotic distribution of γNk=(n(li/λNk1),i𝒥k)\gamma_{Nk}=(\sqrt{n}(l_{i}/\lambda_{Nk}-1),i\in\mathcal{J}_{k}). It is worth pointing out that this asymptotic distribution is unconditional, that is, it no longer involves conditioning on 𝐂n{{\bf C}_{n}}. According to (87), we have

γNk=\displaystyle\gamma_{Nk}= nliλNkλNk=nliψ𝑭(li𝐂n)+ψ𝑭(li𝐂n)ψ𝑭(λnk𝐂)ψ𝑭(λnk𝐂)\displaystyle\sqrt{n}\frac{l_{i}-\lambda_{Nk}}{\lambda_{Nk}}=\sqrt{n}\frac{l_{i}-\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})+\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})-\psi_{\boldsymbol{F}}(\lambda_{nk}^{{\bf C}})}{\psi_{\boldsymbol{F}}(\lambda_{nk}^{{\bf C}})} (91)
=\displaystyle= nNNliψ𝑭(li𝐂n)ψ𝑭(li𝐂n)ψ𝑭(li𝐂n)ψ𝑭(λnk𝐂)+nli𝐂nλnk𝐂λnk𝐂λnk𝐂ψ𝑭(λnk𝐂)ψ𝑭(λnk𝐂)(1+o(1)),\displaystyle\frac{\sqrt{n}}{\sqrt{N}}\sqrt{N}\frac{l_{i}-\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}\frac{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\psi_{\boldsymbol{F}}(\lambda_{nk}^{{\bf C}})}+\sqrt{n}\frac{l_{i}^{{\bf C}_{n}}-\lambda_{nk}^{{\bf C}}}{\lambda_{nk}^{{\bf C}}}\frac{\lambda_{nk}^{{\bf C}}}{\psi_{\boldsymbol{F}}(\lambda_{nk}^{{\bf C}})}\psi_{\boldsymbol{F}}^{\prime}(\lambda_{nk}^{{\bf C}})(1+o(1)),

where the conditional limiting distribution of

Nliψ𝑭(li𝐂n)ψ𝑭(li𝐂n)\sqrt{N}\frac{l_{i}-\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}{\psi_{\boldsymbol{F}}(l_{i}^{{\bf C}_{n}})}

is independent of the conditioning, so the asymptotic distribution of the first term of (91) is the joint distribution of the ordered eigenvalues of a GOE (GUE) matrix with parameter c2/(c1ϑ)c_{2}/(c_{1}\vartheta). According to Subsection 7.2, it follows that the limiting distribution of

nli𝐂nλnk𝐂λnk𝐂\sqrt{n}\frac{l_{i}^{{\bf C}_{n}}-\lambda_{nk}^{{\bf C}}}{\lambda_{nk}^{{\bf C}}}

equals the joint distribution of the ordered eigenvalues of a GOE (GUE) matrix with parameter θ1\theta_{1} defined in (7.2). Combining (85) with (86), we obtain

λnk𝐂ψ𝑭(λnk𝐂)=λnk𝐂λNka.s.1+c2λkm3(λk),\frac{\lambda_{nk}^{{\bf C}}}{\psi_{\boldsymbol{F}}(\lambda_{nk}^{{\bf C}})}=\frac{\lambda_{nk}^{{\bf C}}}{\lambda_{Nk}}\overset{a.s.}{\longrightarrow}1+c_{2}\lambda_{k}m_{3}(\lambda_{k}),

and

ψ𝑭(λnk𝐂)a.s.1c2(λk𝐂)2m2(λk𝐂)(1+c2λkm3(λk))2.\displaystyle\psi_{\boldsymbol{F}}^{\prime}(\lambda_{nk}^{{\bf C}})\overset{a.s.}{\longrightarrow}\frac{1-c_{2}(\lambda_{k}^{{\bf C}})^{2}m_{2}^{\prime}(\lambda_{k}^{{\bf C}})}{(1+c_{2}\lambda_{k}m_{3}(\lambda_{k}))^{2}}.

Then the asymptotic distribution of the second term of (91) is the same as that of the eigenvalues of a GOE (GUE) matrix with parameter

[1c2(λk𝐂)2m2(λk𝐂)1+c2λkm3(λk)]2θ1.\displaystyle\left[\frac{1-c_{2}(\lambda_{k}^{{\bf C}})^{2}m_{2}^{\prime}(\lambda_{k}^{{\bf C}})}{1+c_{2}\lambda_{k}m_{3}(\lambda_{k})}\right]^{2}\theta_{1}.

In summary, the limiting distribution of γNk\gamma_{Nk} coincides with that of the ordered eigenvalues of a GOE (GUE) matrix with parameter

c2c1ϑ+[1c2(λk𝐂)2m2(λk𝐂)1+c2λkm3(λk)]2θ1.\displaystyle\frac{c_{2}}{c_{1}\cdot\vartheta}+\left[\frac{1-c_{2}(\lambda_{k}^{{\bf C}})^{2}m_{2}^{\prime}(\lambda_{k}^{{\bf C}})}{1+c_{2}\lambda_{k}m_{3}(\lambda_{k})}\right]^{2}\theta_{1}.
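As a Monte Carlo sanity check of the n\sqrt{n} scaling above (not part of the proof; the dimensions, spike size, and number of trials are our own choices), the standardized spiked eigenvalue of the noncentral Fisher matrix should exhibit a nondegenerate O(1)O(1) spread that is stable when all dimensions grow proportionally:

```python
import numpy as np

def spiked_eig_F(p, n, N, a1, rng):
    """Largest eigenvalue of a noncentral Fisher matrix with one spike a1."""
    Xi = np.zeros((p, n))
    Xi[0, 0] = np.sqrt(n * a1)                 # Xi Xi^*/n has one eigenvalue a1
    X = rng.standard_normal((p, n))
    Y = rng.standard_normal((p, N))
    C = (Xi + X) @ (Xi + X).T / n
    S = Y @ Y.T / N
    return np.linalg.eigvals(np.linalg.solve(S, C)).real.max()

rng = np.random.default_rng(2)
a1, trials = 40.0, 150
sd = []
for scale in (1, 2):                           # same ratios c1 = p/n, c2 = p/N
    p, n, N = 40 * scale, 120 * scale, 160 * scale
    ls = np.array([spiked_eig_F(p, n, N, a1, rng) for _ in range(trials)])
    sd.append(np.sqrt(n) * np.std(ls / ls.mean() - 1.0))
print(sd)                                      # roughly equal at both scales
```

The two reported values estimate the same limiting GOE scale, so they should agree up to sampling error and finite-size effects.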

7.5 Proof of Theorem 4.3

According to Theorem 4.1, there exists a functional relation between the sample canonical correlation coefficients and the eigenvalues of a special noncentral Fisher matrix. The noncentral parameter matrix defined in (23) is a random matrix, so Theorem 4.3 cannot be obtained directly from Theorems 3.4 and 4.1. Now we present the details of the proof.
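Before the formal argument, the phenomenon quantified here can be previewed in a small simulation; the sample size, dimensions, the single population canonical correlation ρ\rho, and the QR/SVD computation of the sample coefficients below are our own illustrative choices, not the construction of Theorem 4.1:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q, rho = 1000, 40, 20, 0.8    # sample size, dimensions, one population CC

# Two Gaussian vectors whose only nonzero population canonical correlation is
# rho: the first coordinates share a common factor z; all others are independent.
z = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, 0] = z
Y[:, 0] = rho * z + np.sqrt(1 - rho**2) * Y[:, 0]

# Sample canonical correlations = singular values of Qx^T Qy (cosines of the
# principal angles between the two centered column spaces)
Qx, _ = np.linalg.qr(X - X.mean(axis=0))
Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
r = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
print(r[0], r[1])                   # spiked coefficient vs. the null bulk
```

The largest sample coefficient concentrates near a deterministic limit close to ρ\rho (biased upward in high dimensions), while the remaining coefficients stay inside the null bulk; the theorem makes this limit and its fluctuation precise.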

Consider the random variable

γk0=q(liΨ(αk)Ψ(αk)),fori𝒥k,\displaystyle\gamma_{k}^{0}=\sqrt{q}\left(\frac{l_{i}-\Psi(\alpha_{k})}{\Psi(\alpha_{k})}\right),\quad\mbox{for}\ i\in\mathcal{J}_{k},

we have

γk0=q(liψ𝐅ψ𝐂(li𝚵)ψ𝐅ψ𝐂(li𝚵)ψ𝐅ψ𝐂(li𝚵)Ψ(αk)+ψ𝐅ψ𝐂(li𝚵)Ψ(αk)Ψ(αk)).\displaystyle\gamma_{k}^{0}=\sqrt{q}\left(\frac{l_{i}-\psi_{{\bf F}}\circ\psi_{{\bf C}}(l_{i}^{\boldsymbol{\Xi}})}{\psi_{{\bf F}}\circ\psi_{{\bf C}}(l_{i}^{\boldsymbol{\Xi}})}\frac{\psi_{{\bf F}}\circ\psi_{{\bf C}}(l_{i}^{\boldsymbol{\Xi}})}{\Psi(\alpha_{k})}+\frac{\psi_{{\bf F}}\circ\psi_{{\bf C}}(l_{i}^{\boldsymbol{\Xi}})-\Psi(\alpha_{k})}{\Psi(\alpha_{k})}\right).

Under Assumption d and conditionally on 𝐘^\hat{{\bf Y}}, the limiting distribution of the first term of γk0\gamma_{k}^{0} can be obtained by Theorem 3.4, and its covariance satisfies (15), which is independent of the choice of 𝐘^\hat{{\bf Y}}. Applying the mean value theorem to the second term of γk0\gamma_{k}^{0}, we obtain

qψ𝐅ψ𝐂(li𝚵)Ψ(αk)Ψ(αk)=nli𝚵ψ𝚵(f(αk))ψ𝚵(f(αk))qnψ𝚵(f(αk))Ψ(αk)ψ𝐅(ξ1)ψ𝐂(ξ2),\displaystyle\sqrt{q}\frac{\psi_{{\bf F}}\circ\psi_{{\bf C}}(l_{i}^{\boldsymbol{\Xi}})\!-\!\Psi(\alpha_{k})}{\Psi(\alpha_{k})}=\sqrt{n}\frac{l_{i}^{\boldsymbol{\Xi}}\!-\!\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}{\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}\frac{\sqrt{q}}{\sqrt{n}}\frac{\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}{\Psi(\alpha_{k})}\psi_{{\bf F}}^{\prime}(\xi_{1})\psi_{{\bf C}}^{\prime}(\xi_{2}),

where ξ1(Ψ𝐂(αk),Ψ𝐂(li𝚵))\xi_{1}\in(\Psi_{{\bf C}}(\alpha_{k}),\Psi_{{\bf C}}(l_{i}^{\boldsymbol{\Xi}})) or (Ψ𝐂(li𝚵),Ψ𝐂(αk))(\Psi_{{\bf C}}(l_{i}^{\boldsymbol{\Xi}}),\Psi_{{\bf C}}(\alpha_{k})), and ξ2(ψ𝚵(f(αk)),li𝚵)\xi_{2}\in(\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})),l_{i}^{\boldsymbol{\Xi}}) or (li𝚵,ψ𝚵(f(αk)))(l_{i}^{\boldsymbol{\Xi}},\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))). By Theorem 3.1 and 3.3, we have

ψF(ξ1)ψ𝐅(Ψ𝐂(αk))a.s.0\displaystyle\psi_{F}^{\prime}(\xi_{1})-\psi_{{\bf F}}^{\prime}(\Psi_{{\bf C}}(\alpha_{k}))\overset{a.s.}{\to}0
ψC(ξ2)ψ𝐂(ψ𝚵(f(αk)))a.s.0\displaystyle\psi_{C}^{\prime}(\xi_{2})-\psi_{{\bf C}}^{\prime}(\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})))\overset{a.s.}{\to}0

where

ψ𝐅(Ψ𝐂(αk))=1c4Ψ𝐂2(αk)m𝐂(Ψ𝐂(αk))[1+c4Ψ𝐂(αk)m𝐂(Ψ𝐂(αk))]2,\displaystyle\psi_{{\bf F}}^{\prime}(\Psi_{{\bf C}}(\alpha_{k}))\overset{\triangle}{=}\frac{1-c_{4}\Psi_{{\bf C}}^{2}(\alpha_{k})m_{{\bf C}}^{\prime}(\Psi_{{\bf C}}(\alpha_{k}))}{[1+c_{4}\Psi_{{\bf C}}(\alpha_{k})m_{{\bf C}}(\Psi_{{\bf C}}(\alpha_{k}))]^{2}},
ψ𝐂(ψ𝚵(f(αk)))=(1c3dFmpp/n,H(t)tψ𝚵(f(αk)))2\displaystyle\psi_{{\bf C}}^{\prime}(\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})))\overset{\triangle}{=}\left(1-c_{3}\int\frac{dF_{mp}^{p/n,H}(t)}{t-\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}\right)^{2}
2ψ𝚵(f(αk))(1c3dFmpp/n,H(t)tψ𝚵(f(αk)))c3dFmpp/n,H(t)(tψ𝚵(f(αk)))2\displaystyle\quad\quad-2\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))\left(1-c_{3}\int\frac{dF_{mp}^{p/n,H}(t)}{t-\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}\right)c_{3}\int\frac{dF_{mp}^{p/n,H}(t)}{(t-\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})))^{2}}
(1c3)c3dFmpp/n,H(t)(tψ𝚵(f(αk)))2.\displaystyle\quad\quad-(1-c_{3})c_{3}\int\frac{dF_{mp}^{p/n,H}(t)}{(t-\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})))^{2}}.

And

nli𝚵ψ𝚵(f(αk))ψ𝚵(f(αk))×2ψ𝚵2(f(αk))m¯(ψ𝚵(f(αk)))𝑑N(0,1),\displaystyle\sqrt{n}\frac{l_{i}^{\boldsymbol{\Xi}}\!-\!\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}{\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}\times\sqrt{\frac{2}{\psi_{\boldsymbol{\Xi}}^{2}(f(\alpha_{k}))\underline{m}^{\prime}(\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})))}}\overset{d}{\to}N(0,1),

where

m¯(ψ𝚵(f(αk)))=1p/nψ𝚵(f(αk))+p/ndFmpp/n,H(t)tψ𝚵(f(αk))\displaystyle\underline{m}(\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})))=-\frac{1-p/n}{\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}+p/n\int\frac{dF_{mp}^{p/n,H}(t)}{t-\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))}

Then the covariance function of γk0\gamma_{k}^{0} satisfies

η3+c2c1η2+[1c2(λk𝐂)2m2(λk𝐂)1+c2λkm3(λk)]2η1.\displaystyle\eta_{3}\!+\!\frac{c_{2}}{c_{1}\cdot\eta_{2}}\!+\!\left[\frac{1-c_{2}(\lambda_{k}^{{\bf C}})^{2}m_{2}^{\prime}(\lambda_{k}^{{\bf C}})}{1\!+\!c_{2}\lambda_{k}m_{3}(\lambda_{k})}\right]^{2}\eta_{1}. (92)

where

η3=qn(ψ𝐅(Ψ𝐂(αk)))2(ψ𝐂(ψ𝚵(f(αk))))2ψ𝚵2(f(αk))m¯(ψ𝚵(f(αk)))ψ𝚵2(f(αk))Ψ2(αk)\displaystyle\eta_{3}=\frac{q}{n}\frac{(\psi_{{\bf F}}^{\prime}(\Psi_{{\bf C}}(\alpha_{k})))^{2}(\psi_{{\bf C}}^{\prime}(\psi_{\boldsymbol{\Xi}}(f(\alpha_{k}))))^{2}}{\psi_{\boldsymbol{\Xi}}^{2}(f(\alpha_{k}))\underline{m}^{\prime}(\psi_{\boldsymbol{\Xi}}(f(\alpha_{k})))}\frac{\psi_{\boldsymbol{\Xi}}^{2}(f(\alpha_{k}))}{\Psi^{2}(\alpha_{k})}

According to the delta method, we rewrite

γk=qλi2t(αk)t(αk)=qg1(li)g1(Ψ(αk))Ψ(αk)Ψ(αk)t(αk)\displaystyle\gamma_{k}=\sqrt{q}\frac{\lambda_{i}^{2}-t(\alpha_{k})}{t(\alpha_{k})}=\sqrt{q}\frac{g^{-1}(l_{i})-g^{-1}(\Psi(\alpha_{k}))}{\Psi(\alpha_{k})}\frac{\Psi(\alpha_{k})}{t(\alpha_{k})}

then the covariance function of γk\gamma_{k} will be

(92)(c4[1+c4Ψ(αk)]2)2Ψ2(αk)t2(αk).\displaystyle(\ref{Cov1})\cdot\left(\frac{c_{4}}{[1+c_{4}\Psi(\alpha_{k})]^{2}}\right)^{2}\cdot\frac{\Psi^{2}(\alpha_{k})}{t^{2}(\alpha_{k})}.

References

  • [1] {bbook}[author] \bauthor\bsnmTheodore W. Anderson (\byear2003). \btitleAn Introduction to Multivariate Statistical Analysis, \beditionthird ed. \bpublisherWiley. \endbibitem
  • [2] {barticle}[author] \bauthor\bsnmBai, \bfnmZhidong\binitsZ. and \bauthor\bsnmDing, \bfnmXue\binitsX. (\byear2012). \btitleEstimation of spiked eigenvalues in spiked models. \bjournalRandom Matrices: Theory and Applications \bvolume01 \bpages1150011. \bdoi10.1142/S2010326311500110 \bmrnumber2934717 \endbibitem
  • [3] {barticle}[author] \bauthor\bsnmBai, \bfnmZhidong\binitsZ., \bauthor\bsnmHou, \bfnmZhiqiang\binitsZ., \bauthor\bsnmHu, \bfnmJiang\binitsJ., \bauthor\bsnmJiang, \bfnmDandan\binitsD. and \bauthor\bsnmZhang, \bfnmXiaozhuo\binitsX. (\byear2021). \btitleLimiting canonical distribution of two large dimensional random vectors. \bjournalIn Advances on Methodology and Applications of Statistics - A Volume in Honor of C.R. Rao on the Occasion of his 100th Birthday. Edited by Carlos A. Coelho, N. Balakrishnan and Barry C. Arnold. \endbibitem
  • [4] {bbook}[author] \bauthor\bsnmBai, \bfnmZhidong\binitsZ. and \bauthor\bsnmSilverstein, \bfnmJack W.\binitsJ. W. (\byear2010). \btitleSpectral Analysis of Large Dimensional Random Matrices, \beditionsecond ed. \bseriesSpringer Series in Statistics. \bpublisherSpringer, New York. \bdoi10.1007/978-1-4419-0661-8 \bmrnumber2567175 \endbibitem
  • [5] {barticle}[author] \bauthor\bsnmBai, \bfnmZhidong\binitsZ. and \bauthor\bsnmSilverstein, \bfnmJack W.\binitsJ. W. (\byear2012). \btitleNo eigenvalues outside the support of the limiting spectral distribution of Information-plus-Noise type matrices. \bjournalRandom Matrices: Theory and Applications \bvolume01 \bpages1150004. \bdoi10.1142/S2010326311500043 \bmrnumber2930382 \endbibitem
  • [6] {barticle}[author] \bauthor\bsnmBai, \bfnmZhidong\binitsZ. and \bauthor\bsnmYao, \bfnmJianfeng\binitsJ. (\byear2008). \btitleCentral limit theorems for eigenvalues in a spiked population model. \bjournalAnnales de l’Institut Henri Poincaré, Probabilités et Statistiques \bvolume44 \bpages447–474. \bdoi10.1214/07-AIHP118 \bmrnumber2451053 \endbibitem
  • [7] {barticle}[author] \bauthor\bsnmBai, \bfnmZhidong\binitsZ. and \bauthor\bsnmYao, \bfnmJianfeng\binitsJ. (\byear2012). \btitleOn sample eigenvalues in a generalized spiked population model. \bjournalJournal of Multivariate Analysis \bvolume106 \bpages167–177. \bdoi10.1016/j.jmva.2011.10.009 \bmrnumber2887686 \endbibitem
  • [8] {barticle}[author] \bauthor\bsnmBaik, \bfnmJinho\binitsJ., \bauthor\bsnmArous, \bfnmGérard Ben\binitsG. B. and \bauthor\bsnmPéché, \bfnmSandrine\binitsS. (\byear2005). \btitlePhase transition of the largest eigenvalue for nonnull complex sample covariance matrices. \bjournalAnnals of Probability \bvolume33 \bpages1643–1697. \bdoi10.1214/009117905000000233 \bmrnumberMR2165575 \endbibitem
  • [9] {barticle}[author] \bauthor\bsnmBanna, \bfnmMarwa\binitsM., \bauthor\bsnmNajim, \bfnmJamal\binitsJ. and \bauthor\bsnmYao, \bfnmJianfeng\binitsJ. (\byear2020). \btitleA CLT for linear spectral statistics of large random Information-plus-Noise matrices. \bjournalStochastic Processes and their Applications \bvolume130 \bpages2250–2281. \bdoi10.1016/j.spa.2019.06.017 \endbibitem
  • [10] {barticle}[author] \bauthor\bsnmBao, \bfnmZhigang\binitsZ., \bauthor\bsnmDing, \bfnmXiucai\binitsX. and \bauthor\bsnmWang, \bfnmKe\binitsK. (\byear2021). \btitleSingular vector and singular subspace distribution for the matrix denoising model. \bjournalThe Annals of Statistics \bvolume49 \bpages370-392. \endbibitem
  • [11] {barticle}[author] \bauthor\bsnmBao, \bfnmZhigang\binitsZ., \bauthor\bsnmHu, \bfnmJiang\binitsJ., \bauthor\bsnmPan, \bfnmGuangming\binitsG. and \bauthor\bsnmZhou, \bfnmWang\binitsW. (\byear2019). \btitleCanonical correlation coefficients of high-dimensional Gaussian vectors: finite rank case. \bjournalThe Annals of Statistics \bvolume47 \bpages612–640. \bdoi10.1214/18-AOS1704 \bmrnumber3909944 \endbibitem
  • [12] {barticle}[author] \bauthor\bsnmBodnar, \bfnmTaras\binitsT., \bauthor\bsnmDette, \bfnmHolger\binitsH. and \bauthor\bsnmParolya, \bfnmNestor\binitsN. (\byear2019). \btitleTesting for independence of large dimensional vectors. \bjournalThe Annals of Statistics \bvolume47 \bpages2977–3008. \bdoi10.1214/18-AOS1771 \bmrnumberMR3988779 \endbibitem
  • [13] {barticle}[author] \bauthor\bsnmCai, \bfnmT. Tony\binitsT. T., \bauthor\bsnmHan, \bfnmXiao\binitsX. and \bauthor\bsnmPan, \bfnmGuangming\binitsG. (\byear2020). \btitleLimiting laws for divergent spiked eigenvalues and largest nonspiked eigenvalue of sample covariance matrices. \bjournalAnnals of Statistics \bvolume48 \bpages1255–1280. \bdoi10.1214/18-AOS1798 \bmrnumberMR4124322 \endbibitem
  • [14] {barticle}[author] \bauthor\bsnmCapitaine, \bfnmMireille\binitsM. (\byear2014). \btitleExact separation phenomenon for the eigenvalues of large Information-plus-Noise type matrices. application to spiked models. \bjournalIndiana University Mathematics Journal \bvolume63 \bpages1875–1910. \bdoi10.1512/iumj.2014.63.5432 \endbibitem
  • [15] {barticle}[author] \bauthor\bsnmDing, \bfnmXiucai\binitsX. (\byear2020). \btitleHigh dimensional deformed rectangular matrices with applications in matrix denoising. \bjournalBernoulli \bvolume17 \bpages387-417. \endbibitem
  • [16] {barticle}[author] \bauthor\bsnmDing, \bfnmXiucai\binitsX. and \bauthor\bsnmYang, \bfnmFan\binitsF. (\byear2019). \btitleSpiked separable covariance matrices and principal components. \bjournalarXiv:1905.13060. \endbibitem
  • [17] {barticle}[author] \bauthor\bsnmDozier, \bfnmR Brent\binitsR. B. and \bauthor\bsnmSilverstein, \bfnmJack W\binitsJ. W. (\byear2007). \btitleOn the empirical distribution of eigenvalues of large dimensional Information-plus-Noise-type matrices. \bjournalJournal of Multivariate Analysis \bvolume98 \bpages678–694. \endbibitem
  • [18] {barticle}[author] \bauthor\bsnmDozier, \bfnmR. Brent\binitsR. B. and \bauthor\bsnmSilverstein, \bfnmJack W.\binitsJ. W. (\byear2007). \btitleAnalysis of the limiting spectral distribution of large dimensional Information-plus-Noise type matrices. \bjournalJournal of Multivariate Analysis \bvolume98 \bpages1099–1122. \bdoi10.1016/j.jmva.2006.12.005 \endbibitem
  • [19] {barticle}[author] \bauthor\bsnmHan, \bfnmXiao\binitsX., \bauthor\bsnmPan, \bfnmGuangming\binitsG. and \bauthor\bsnmZhang, \bfnmBo\binitsB. (\byear2016). \btitleThe Tracy-Widom law for the largest eigenvalue of F type matrices. \bjournalThe Annals of Statistics \bvolume44 \bpages1564–1592. \bdoi10.1214/15-AOS1427 \bmrnumberMR3519933 \endbibitem
  • [20] {bbook}[author] \bauthor\bsnmHorn, \bfnmRoger A.\binitsR. A. and \bauthor\bsnmJohnson, \bfnmCharles R.\binitsC. R. (\byear2012). \btitleMatrix Analysis. \bpublisherCambridge university press. \endbibitem
  • [21] {barticle}[author] \bauthor\bsnmHu, \bfnmJiang\binitsJ. and \bauthor\bsnmBai, \bfnmZhiDong\binitsZ. (\byear2014). \btitleStrong representation of weak convergence. \bjournalScience China Mathematics \bvolume57 \bpages2399–2406. \bdoi10.1007/s11425-014-4855-6 \bmrnumber3266500 \endbibitem
  • [22] {barticle}[author] \bauthor\bsnmJames, \bfnmAlan T.\binitsA. T. (\byear1964). \btitleDistributions of matrix variates and latent roots derived from normal samples. \bjournalAnnals of Mathematical Statistics \bvolume35 \bpages475–501. \bdoi10.1214/aoms/1177703550 \bmrnumberMR181057 \endbibitem
  • [23] {barticle}[author] \bauthor\bsnmJiang, \bfnmDandan\binitsD. and \bauthor\bsnmBai, \bfnmZhidong\binitsZ. (\byear2021). \btitleGeneralized four moment theorem and an application to CLT for spiked eigenvalues of high-dimensional covariance matrices. \bjournalBernoulli \bvolume27 \bpages274–294. \bdoi10.3150/20-BEJ1237 \bmrnumberMR4177370 \endbibitem
  • [24] {barticle}[author] \bauthor\bsnmJiang, \bfnmDandan\binitsD., \bauthor\bsnmHou, \bfnmZhiqiang\binitsZ. and \bauthor\bsnmBai, \bfnmZhidong\binitsZ. (\byear2019). \btitleGeneralized Four Moment Theorem with an application to the CLT for the spiked eigenvalues of high-dimensional general Fisher-matrices. \bjournalarXiv:1904.09236. \endbibitem
  • [25] {barticle}[author] \bauthor\bsnmJiang, \bfnmDandan\binitsD., \bauthor\bsnmHou, \bfnmZhiqiang\binitsZ. and \bauthor\bsnmHu, \bfnmJiang\binitsJ. (\byear2021). \btitleThe limits of the sample spiked eigenvalues for a high-dimensional generalized Fisher matrix and its applications. \bjournalJournal of Statistical Planning and Inference. Accept. \endbibitem
  • [26] {barticle}[author] \bauthor\bsnmJohnstone, \bfnmIain M.\binitsI. M. (\byear2001). \btitleOn the distribution of the largest eigenvalue in principal components analysis. \bjournalThe Annals of Statistics \bvolume29 \bpages295–327. \bdoi10.1214/aos/1009210544 \bmrnumberMR1863961 \endbibitem
  • [27] {barticle}[author] \bauthor\bsnmJohnstone, \bfnmI. M.\binitsI. M. and \bauthor\bsnmNadler, \bfnmB.\binitsB. (\byear2017). \btitleRoy’s largest root test under rank-one alternatives. \bjournalBiometrika \bvolume104 \bpages181–193. \endbibitem
  • [28] {barticle}[author] \bauthor\bsnmMa, \bfnmZongming\binitsZ. and \bauthor\bsnmYang, \bfnmFan\binitsF. (\byear2021). \btitleSample canonical correlation coefficients of high-dimensional random vectors with finite rank correlations. \bjournalarXiv:2102.03297. \endbibitem
  • [29] {bbook}[author] \bauthor\bsnmMardia, \bfnmKantilal Varichand\binitsK. V., \bauthor\bsnmKent, \bfnmJohn T\binitsJ. T. and \bauthor\bsnmBibby, \bfnmJohn M\binitsJ. M. (\byear1979). \btitleMultivariate analysis. \bpublisherAcademic press London. \endbibitem
  • [30] {bbook}[author] \bauthor\bsnmRobb J. Muirhead (\byear1982). \btitleAspects of Multivariate Statistical Theory. \bpublisherWiley. \endbibitem
  • [31] {barticle}[author] \bauthor\bsnmPaul, \bfnmDebashis\binitsD. (\byear2007). \btitleAsymptotics of sample eigenstructure for a large dimensional spiked covariance model. \bjournalStatistica Sinica \bvolume17 \bpages1617–1642. \endbibitem
  • [32] {barticle}[author] \bauthor\bsnmSkorokhod, \bfnmAnatoly V\binitsA. V. (\byear1956). \btitleLimit theorems for stochastic processes. \bjournalTheory of Probability & Its Applications \bvolume1 \bpages261–290. \endbibitem
  • [33] {barticle}[author] \bauthor\bsnmTracy, \bfnmCraig A\binitsC. A. and \bauthor\bsnmWidom, \bfnmHarold\binitsH. (\byear1996). \btitleOn orthogonal and symplectic matrix ensembles. \bjournalCommunications in Mathematical Physics \bvolume177 \bpages727–754. \endbibitem
  • [34] {barticle}[author] \bauthor\bsnmWachter, \bfnmKenneth W\binitsK. W. (\byear1980). \btitleThe limiting empirical measure of multiple discriminant ratios. \bjournalThe Annals of Statistics \bpages937–957. \endbibitem
  • [35] {barticle}[author] \bauthor\bsnmWang, \bfnmQinwen\binitsQ. and \bauthor\bsnmYao, \bfnmJianfeng\binitsJ. (\byear2017). \btitleExtreme eigenvalues of large-dimensional spiked Fisher matrices with application. \bjournalThe Annals of Statistics \bvolume45 \bpages415–460. \bdoi10.1214/16-AOS1463 \bmrnumberMR3611497 \endbibitem
  • [36] {barticle}[author] \bauthor\bsnmYang, \bfnmFan\binitsF. (\byear2020). \btitleSample canonical correlation coefficients of high-dimensional random vectors: local law and Tracy-Widom limit. \bjournalarXiv:2002.09643. \endbibitem
  • [37] {barticle}[author] \bauthor\bsnmYang, \bfnmFan\binitsF. (\byear2021). \btitleLimiting Distribution of the Sample Canonical Correlation Coefficients of High-Dimensional Random Vectors. \bjournalarXiv:2103.08014. \endbibitem
  • [38] {barticle}[author] \bauthor\bsnmZheng, \bfnmShurong\binitsS. (\byear2012). \btitleCentral limit theorems for linear spectral statistics of large dimensional F-matrices. \bjournalAnnales de l’I.H.P. Probabilités et statistiques \bvolume48 \bpages444–476. \bdoi10.1214/11-AIHP414 \endbibitem
  • [39] {barticle}[author] \bauthor\bsnmZheng, \bfnmShurong\binitsS., \bauthor\bsnmBai, \bfnmZhidong\binitsZ. and \bauthor\bsnmYao, \bfnmJianfeng\binitsJ. (\byear2015). \btitleCLT for linear spectral statistics of a rescaled sample precision matrix. \bjournalRandom Matrices: Theory and Applications \bvolume04 \bpages1550014. \bdoi10.1142/S2010326315500148 \bmrnumber3418843 \endbibitem
  • [40] {barticle}[author] \bauthor\bsnmZheng, \bfnmShurong\binitsS., \bauthor\bsnmBai, \bfnmZhidong\binitsZ. and \bauthor\bsnmYao, \bfnmJianfeng\binitsJ. (\byear2017). \btitleCLT for eigenvalue statistics of large-dimensional general Fisher matrices with applications. \bjournalBernoulli \bvolume23 \bpages1130–1178. \bdoi10.3150/15-BEJ772 \bmrnumberMR3606762 \endbibitem