Phase Transition in the Generalized Stochastic Block Model

Sun Min Lee¹ and Ji Oon Lee²

¹ Department of Mathematical Sciences, KAIST, Daejeon 34141, Korea; e-mail: [email protected]
² Department of Mathematical Sciences, KAIST, Daejeon 34141, Korea; e-mail: [email protected]

Abstract

We study the problem of detecting the community structure from the generalized stochastic block model (GSBM). Based on the analysis of the Stieltjes transform of the empirical spectral distribution, we prove a BBP-type transition for the largest eigenvalue of the GSBM. For specific models such as a hidden community model and an unbalanced stochastic block model, we provide precise formulas for the two largest eigenvalues, establishing the gap in the BBP-type transition.

1 Introduction

One of the most fundamental and natural problems in data science is to understand an underlying structure from data sets that can be viewed as networks. The problem is known as clustering or community detection, and it appears in diverse fields of study involving real-world networks.

The stochastic block model (SBM) is one of the most fundamental mathematical models to understand the community structure in networks. An SBM is a random graph with $N$ nodes, partitioned into $K$ disjoint subsets, called the communities, $C_{1},C_{2},\dots,C_{K}$ . One can characterize an SBM via its adjacency matrix, which is a symmetric (random) matrix $\widetilde{M}$ , whose $(i,j)$ -entry $\widetilde{M}_{ij}$ is a Bernoulli random variable depending only on the communities to which the nodes $i$ and $j$ belong. For the clustering of an SBM, it is often useful to analyze the eigenvalues of the adjacency matrix and their associated eigenvectors, which is known as a spectral method.

One of the most prominent examples of spectral methods is the principal component analysis (PCA), in which the behavior of the eigenvectors associated with the extremal eigenvalues is used to obtain the community structure of the SBM. For an SBM with two communities, the expectation of its adjacency matrix $\widetilde{M}$ has a block structure, i.e.,

\mathbb{E}[\widetilde{M}]=\left(\begin{array}[]{c|c}P_{11}&P_{12}\\ \hline\cr P_{21}&P_{22}\end{array}\right).

(1.1)

In the simplest case of a balanced SBM, where $P_{11}=P_{22}=p$, $P_{12}=P_{21}=q$, and the two communities are of equal size, it can be easily checked that $\mathbb{E}[\widetilde{M}]$ has at most two non-zero eigenvalues, $N(p+q)/2$ and $N(p-q)/2$. Thus, if $N(p-q)/2$ is sufficiently large, the perturbation $\widetilde{M}-\mathbb{E}[\widetilde{M}]$ is negligible for the two largest eigenvalues of $\widetilde{M}$, and it is possible to determine the community structure from the eigenvector associated with the second largest eigenvalue of $\widetilde{M}$. Equivalently, after subtracting $(p+q)/2$ from each entry, the (shifted) adjacency matrix becomes the sum of a rank-$1$ deterministic matrix and a random matrix with centered entries, and one can use the eigenvector associated with the largest eigenvalue of the shifted adjacency matrix for clustering.
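The eigenvalue claim for the mean matrix can be checked numerically. The following is a minimal sketch, assuming NumPy and illustrative values $N=200$, $p=0.6$, $q=0.2$ that are not taken from the paper. The helper name `expected_adjacency` is hypothetical, and the diagonal of the mean matrix is filled in for simplicity (a true adjacency matrix has zero diagonal, which only shifts every eigenvalue by $-p$).

```python
import numpy as np

def expected_adjacency(N, p, q):
    """Mean matrix of a balanced two-community SBM (diagonal filled in for simplicity)."""
    half = N // 2
    labels = np.r_[np.ones(half), -np.ones(N - half)]   # +1 / -1 community labels
    same = np.equal.outer(labels, labels)                # True inside a community
    return np.where(same, p, q)

N, p, q = 200, 0.6, 0.2
eigs = np.linalg.eigvalsh(expected_adjacency(N, p, q))   # ascending order
top_two = eigs[-2:][::-1]                                # largest first
print(top_two)                                           # numerical eigenvalues
print(N * (p + q) / 2, N * (p - q) / 2)                  # claimed values
```

All remaining eigenvalues vanish up to rounding, which reflects the rank-$2$ structure of the mean matrix.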

The sum of a deterministic matrix and a random matrix has been extensively studied in random matrix theory. When the deterministic matrix is rank-$1$ and the random matrix is a Wigner matrix, it is called a (rank-$1$) spiked Wigner matrix. The behavior of the largest eigenvalue of a spiked Wigner matrix is known to exhibit a sharp phase transition depending on the ratio between the spectral norms of the deterministic part and the random part. This type of phase transition is called the BBP transition, after the seminal work of Baik, Ben Arous, and Péché [5] for spiked (complex) Wishart matrices. From the BBP transition, we can immediately see that the detection of the signal is possible via PCA when the signal-to-noise ratio (SNR) is above a certain threshold.

While the BBP transition has been proved for spiked Wigner matrices under various assumptions [20, 15, 8, 7], it is not directly applicable to the SBM, since the entries in a Wigner matrix are i.i.d. (up to symmetry constraint) whereas those in the adjacency matrix of an SBM are not. The proof of the BBP transition with an SBM is substantially harder. For example, unless the SBM is balanced, the empirical spectral distribution (ESD) of $\widetilde{M}$ does not even converge to the semi-circle distribution, which is the limiting ESD of a Wigner matrix; the limiting ESD in this case is not given by a simple formula as the semi-circle distribution but by an implicit formula via its Stieltjes transform.

Main contribution

In this paper, we consider a model that generalizes the SBM, called the generalized stochastic block model (GSBM), with two communities. In this model, the mean of the matrix has the same block structure as that of the SBM in (1.1), but the entries are not necessarily Bernoulli random variables. See Definition 2.1 for the precise definition of the GSBM.

For the GSBM, we prove the BBP-type transition for its largest eigenvalue (Theorem 2.1). The proof is based on the analysis of the Stieltjes transform of the ESD, which involves the resolvent of the random part of the GSBM. Due to the community structure, the random part is not a Wigner matrix, but a generalization of a Wigner matrix, known as a Wigner-type matrix. The local properties of eigenvalues of Wigner-type matrices are now well-established by recent developments of random matrix theory; see, e.g., [3, 4, 11].

In our main result, Theorem 2.1, we only state the existence of the critical value and the limiting gap between the two largest eigenvalues but refrain from writing the precise formulas for them. We instead apply our results to specific examples naturally arising in applications, the hidden community model and the unbalanced stochastic block model, and present the results from numerical experiments. (In terms of the edge probabilities, the former corresponds to the case $P_{11}=p$ and $P_{12}=P_{21}=P_{22}=q$, while the latter corresponds to $P_{11}=P_{22}=p$ and $P_{12}=P_{21}=q$ with $\gamma\neq 1/2$, where $\gamma$ denotes the fraction of nodes in the first community.)

Related works

The local law for Wigner-type matrices and the behavior of quadratic vector equations (QVE), which are crucial in the analysis of Wigner-type matrices, were thoroughly investigated by Ajanki, Erdős and Krüger [3, 4]. A related result on the local law at the cusp for Wigner-type matrices was also proved [13]. For more results on general Wigner-type matrices, we refer to [11, 14, 23] and references therein.

The phase transition of the largest eigenvalue was first proved by Baik, Ben Arous and Péché [5] for spiked Wishart matrices and later extended to other models, including the spiked Wigner matrix under various assumptions [20, 15, 8, 7]. If the SNR is below the threshold given by the BBP transition, the largest eigenvalue has no information on the signal and we cannot use the PCA for the detection of the signal. In this case, the PCA can be improved by an entrywise transformation that effectively increases the SNR [6, 21]. Reliable detection is impossible below a certain threshold [21], and in that regime one can only consider weak detection, which is a hypothesis test between the null model (without spike) and the alternative (with spike). For more detail about weak detection, we refer to [12, 10, 19].

The problem of recovering a hidden community from a symmetric matrix for two important cases, the Bernoulli and Gaussian entries, was discussed by Hajek, Wu, and Xu [16]. A threshold for exact recovery in the SBM was discussed in [1, 2, 9, 17]. Recovering a community at the Kesten–Stigum threshold for the SBM was considered in [18]. For more results and applications of the SBM, we refer to [22] and references therein.

Organization of the paper

The rest of the paper is organized as follows: In Section 2, we define the model and state the main result. In Section 3, we introduce the hidden community model and the unbalanced stochastic block model and provide the results from numerical experiments around the transition threshold. In Section 4, we prove the main theorem. A summary of our results and future research directions are discussed in Section 5. Appendix A contains the definition of the Wigner-type matrices and preliminary results on this model. The detailed analysis for the specific models can be found in Appendix B.

2 Main Results

In this section, we precisely define the matrix model that we consider in this paper and state our main theorem. We begin by introducing a shifted, rescaled matrix for a generalized stochastic block model with two communities.

Definition 2.1 (Generalized Stochastic Block Model (GSBM)).

An $N\times N$ matrix $M$ is a generalized stochastic block model if

M=H+\lambda uu^{T}

where $\lambda\geq 0$ is a constant, $u=(u_{1},u_{2},\dots,u_{N})\in\mathbb{R}^{N}$ with $\|u\|=1$ , and $H=[H_{ij}]$ is an $N\times N$ real symmetric matrix, satisfying the following:

•

There exist $S\subset[N]:=\{1,2,\dots,N\}$ and constants $\theta_{1},\theta_{2}$ such that

$u_{i}=\begin{cases}\theta_{1}&\text{if}\quad i\in S\,,\\ \theta_{2}&\text{if}\quad i\notin S\,.\end{cases}$

We further assume that $\frac{|S|}{N},(1-\frac{|S|}{N})>c>0$ for some ( $N$ -independent) constant $c$ .

•

Upper triangular entries $H_{ij}$ ($i\leq j$) are centered independent random variables such that

–

there exist ( $N$ -independent) constants $\alpha_{1}$ and $\alpha_{2}$ such that

\mathbb{E}[H_{ij}^{2}]=\begin{cases}\alpha_{1}N^{-1}&\text{if}\quad i,j\in S\\ \alpha_{2}N^{-1}&\text{if}\quad i,j\notin S\\ N^{-1}&\text{otherwise}\end{cases}

–

for any ( $N$ -independent) $D>0$ , there exists a constant $C_{D}$ such that for all $i\leq j$

$\mathbb{E}[|H_{ij}|^{D}]\leq C_{D}N^{-\frac{D}{2}}.$

For an adjacency matrix $\widetilde{M}$ in (1.1), if $P_{11}=p_{1}$ , $P_{22}=p_{2}$ , and $P_{12}=P_{21}=q$ , then after shifting and rescaling, we find that

\alpha_{1}=\frac{p_{1}(1-p_{1})}{q(1-q)},\qquad\alpha_{2}=\frac{p_{2}(1-p_{2})}{q(1-q)}.

(2.1)

(See Appendix B for more detail.)

We remark that $H_{ij}$ are not necessarily Bernoulli random variables. The assumption on the finite moments means that the model is in the dense regime. The most typical balanced stochastic block model with two communities corresponds to the choice of parameters $|S|=N/2$ and $\alpha_{1}=\alpha_{2}>1$.
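To make Definition 2.1 concrete, the following is a minimal sketch of a sampler, assuming NumPy and Gaussian entries (the definition allows any centered entries with the stated variance profile and moment bounds, so the Gaussian choice is only an assumption for illustration). The function name `sample_gsbm` and the parameters `t1`, `t2` (unnormalized values of $\theta_{1},\theta_{2}$) are hypothetical, and $S$ is taken to be the first $\gamma N$ indices.

```python
import numpy as np

def sample_gsbm(N, gamma, alpha1, alpha2, lam, t1, t2, rng):
    """Draw M = H + lam * u u^T with Gaussian noise; S = {0, ..., N1-1}."""
    N1 = int(round(gamma * N))
    in_S = np.arange(N) < N1
    # Variance profile N * E[H_ij^2]: alpha1 on S x S, alpha2 on S^c x S^c, 1 otherwise.
    profile = np.ones((N, N))
    profile[np.ix_(in_S, in_S)] = alpha1
    profile[np.ix_(~in_S, ~in_S)] = alpha2
    # Symmetrize an i.i.d. Gaussian matrix: upper triangle (with diagonal) mirrored below.
    G = rng.standard_normal((N, N))
    G = np.triu(G) + np.triu(G, 1).T
    H = G * np.sqrt(profile / N)
    # Block-constant spike direction, normalized so that ||u|| = 1.
    u = np.where(in_S, t1, t2).astype(float)
    u /= np.linalg.norm(u)
    return H + lam * np.outer(u, u), H, u

rng = np.random.default_rng(0)
M, H, u = sample_gsbm(N=600, gamma=0.25, alpha1=1.5, alpha2=0.8,
                      lam=2.0, t1=1.0, t2=-0.5, rng=rng)
print(np.mean(H[:150, :150] ** 2) * 600, np.mean(H[:150, 150:] ** 2) * 600)
```

The two printed numbers are empirical versions of $\alpha_{1}$ and of the off-diagonal block variance (normalized to $1$), so they should be close to $1.5$ and $1$ for this choice of parameters.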

Our main theorem is the following result on the phase transition for the spectral gap of GSBM.

Theorem 2.1.

Let $M$ be a generalized stochastic block model defined in Definition 2.1. Denote by $\lambda_{1}$ and $\lambda_{2}$ the largest and the second largest eigenvalues of $M$. Assume that $\gamma:=N_{1}/N$ is fixed, where $N_{1}:=|S|$. Then, there exists a constant $\lambda_{c}$, depending only on $\theta_{1},\theta_{2},\alpha_{1},\alpha_{2},\gamma$, such that

•

(Subcritical case) if $\lambda<\lambda_{c}$ , then $\lambda_{1}-\lambda_{2}\to 0$ as $N\to\infty$ , almost surely.
•

(Supercritical case) if $\lambda>\lambda_{c}$ , then $\lambda_{1}-\lambda_{2}\to g$ as $N\to\infty$ , almost surely, for some ( $N$ -independent) constant $g$ .

We do not include the precise formulas for the critical value $\lambda_{c}$ and the gap $g$ in the statement of Theorem 2.1 for the general cases since they are lengthy but not particularly informative. The formulas for the special cases can be found in Section 3.

3 Examples and Experiments

In this section, we focus on several specific models and check how the main result, Theorem 2.1, applies to them.

3.1 Hidden community model

In the hidden community model, only one of the intra-community connection probabilities is larger than the inter-community connection probability, and the other intra-community connection probability coincides with the inter-community connection probability. The precise definition of such a model is as follows:

Definition 3.1 (Hidden Community Model).

Let $C\subset[N]$ with $|C|=K$. Let $S$ be an $N\times N$ symmetric matrix with $S_{ii}=0$ whose entries $S_{ij}$ are independent for $1\leq i\leq j\leq N$ and

\displaystyle S_{ij}\sim\begin{cases}P,&\mbox{if }i,j\in C\\ Q,&\mbox{otherwise}\\ \end{cases}

for given probability measures $P$ and $Q$ .

We consider the BBP-type transition of the hidden community model with Bernoulli entries, i.e., $P=\mathrm{Bernoulli}(p)$ and $Q=\mathrm{Bernoulli}(q)$ with $p\neq q$ , which also corresponds to the case $\alpha_{2}=1$ or $p_{2}=q$ in (2.1). It is not hard to find that the transition occurs in the regime

p_{1}:=p=\frac{w}{\sqrt{N}}+q

for some (possibly $N$ -dependent) $w=\Theta(1)$ . After shifting and rescaling, we find that $\lambda_{2}\to 2$ and

\lambda_{1}\to\begin{cases}\frac{\gamma w}{\sqrt{q(1-q)}}+\frac{\sqrt{q(1-q)}}{\gamma w}&\text{ if }w>\frac{\sqrt{q(1-q)}}{\gamma}\,,\\ 2&\text{ if }w<\frac{\sqrt{q(1-q)}}{\gamma}\,.\end{cases}

(3.1)

See Appendix B.1 for details.

We performed the numerical simulation for the hidden community model. We set $N=2500$ , $\gamma=1/4$ , and $q=0.2$ . Following the analysis in Appendix B.1, we find that an outlier eigenvalue occurs if

p>q+\frac{\sqrt{q(1-q)}}{\gamma\sqrt{N}}=0.232.

In Figure 1, we compare the histograms of the eigenvalues of the shifted, rescaled adjacency matrices with $p=0.2$ and $p=0.25$ , respectively. As predicted by the analysis, the outlier appears only for the case $p=0.25$ .
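The experiment can be reproduced along the following lines. This is a minimal sketch, assuming NumPy; it is not the authors' code, the function names are hypothetical, and the community is taken to be the first $\gamma N$ nodes. The shift by $q$ and the rescaling by $\sqrt{Nq(1-q)}$ follow (B.3), and the predicted limit follows (3.1) with $w=\sqrt{N}(p-q)$.

```python
import numpy as np

def hidden_community_top_eigs(N, gamma, p, q, rng):
    """Two largest eigenvalues of the shifted, rescaled Bernoulli adjacency matrix."""
    N1 = int(round(gamma * N))
    prob = np.full((N, N), q)
    prob[:N1, :N1] = p                                  # denser block on C = {0, ..., N1-1}
    upper = np.triu(rng.random((N, N)) < prob, 1)       # independent edges for i < j
    A = (upper | upper.T).astype(float)                 # symmetric, zero diagonal
    M = (A - q) / np.sqrt(N * q * (1 - q))              # shift by q, rescale as in (B.3)
    eigs = np.linalg.eigvalsh(M)
    return eigs[-1], eigs[-2]

def predicted_top_eig(N, gamma, p, q):
    """Limit in (3.1), with w = sqrt(N) * (p - q)."""
    lam = gamma * np.sqrt(N) * (p - q) / np.sqrt(q * (1 - q))
    return lam + 1 / lam if lam > 1 else 2.0

N, gamma, q = 2500, 0.25, 0.2
threshold = q + np.sqrt(q * (1 - q)) / (gamma * np.sqrt(N))
rng = np.random.default_rng(1)
results = {p: hidden_community_top_eigs(N, gamma, p, q, rng) for p in (0.2, 0.25)}
for p, (l1, l2) in results.items():
    print(p, p > threshold, l1, l2, predicted_top_eig(N, gamma, p, q))
```

At finite $N$ the observed values differ from the limits by corrections of order $N^{-1/2}$, so only approximate agreement with (3.1) should be expected.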

3.2 Unbalanced stochastic block model

We next consider the case $p_{1}=p_{2}$ or $\alpha_{1}=\alpha_{2}$ with $\gamma\neq 1/2$, which we will refer to as an unbalanced stochastic block model. As in the hidden community model, the transition occurs in the regime $p_{1}=p_{2}:=p=\frac{w}{\sqrt{N}}+q$. After shifting and rescaling, we find that $\lambda_{2}\to 2$ and

\lambda_{1}\to\begin{cases}\frac{w}{2\sqrt{q(1-q)}}+\frac{2\sqrt{q(1-q)}}{w}&\text{ if }w>2\sqrt{q(1-q)}\,,\\ 2&\text{ if }w<2\sqrt{q(1-q)}\,.\end{cases}

See Appendix B.2 for details. Note that the transition does not depend on $\gamma$.

We performed the numerical simulation for the unbalanced stochastic block model. As in the hidden community model, we set $N=2500$ , $\gamma=1/4$ , and $q=0.2$ . An outlier eigenvalue occurs if

p>q+\frac{2\sqrt{q(1-q)}}{\sqrt{N}}=0.216.

In Figure 2, we compare the histograms of the eigenvalues of the shifted, rescaled adjacency matrices with $p=0.2$ and $p=0.25$ , respectively. Again, as predicted by the analysis, the outlier appears only for the case $p=0.25$ .
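As a worked instance of the limit formula for the unbalanced model, the parameters of this experiment ($N=2500$, $q=0.2$, $p=0.25$) give the following values; this is plain arithmetic on the formulas above, and the finite-$N$ eigenvalue is expected to match only up to corrections of order $N^{-1/2}$.

```latex
w=\sqrt{N}\,(p-q)=50\times 0.05=2.5>2\sqrt{q(1-q)}=0.8,
\qquad
\lambda_{1}\to\frac{w}{2\sqrt{q(1-q)}}+\frac{2\sqrt{q(1-q)}}{w}=3.125+0.32=3.445 .
```

For $p=0.2$ we have $w=0<0.8$, so the formula gives $\lambda_{1}\to 2$ and no outlier is predicted.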

4 Proof of Theorem 2.1

Recall that we denote by $\lambda_{1}$ and $\lambda_{2}$ the two largest eigenvalues of $M$ . Let $\mu_{1}$ and $\mu_{2}$ be the two largest eigenvalues of $H$ . From the result on the Wigner-type matrices, we find that $\mu_{1}$ and $\mu_{2}$ converge to the rightmost edge of the limiting ESD of $H$ . By the Cauchy interlacing formula, we have the inequality

\mu_{2}\leq\lambda_{2}\leq\mu_{1}\leq\lambda_{1},

which shows that $\lambda_{2}$ also converges to the rightmost edge of the limiting ESD of $H$ .
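The interlacing inequality holds for every realization, which can be seen in a small numerical check. The following is a minimal sketch, assuming NumPy; a Gaussian Wigner matrix is used as a stand-in for $H$, and the values $N=300$, $\lambda=1.7$ and the random unit vector $u$ are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
N, lam = 300, 1.7

# Symmetric Gaussian noise with entry variance 1/N (a stand-in for H).
G = rng.standard_normal((N, N))
H = (np.triu(G) + np.triu(G, 1).T) / np.sqrt(N)

u = rng.standard_normal(N)
u /= np.linalg.norm(u)
M = H + lam * np.outer(u, u)           # rank-one, positive semi-definite perturbation

mu = np.linalg.eigvalsh(H)[::-1]       # eigenvalues of H, largest first
lmb = np.linalg.eigvalsh(M)[::-1]      # eigenvalues of M, largest first
tol = 1e-10                            # slack for floating-point rounding
print(mu[1] <= lmb[1] + tol, lmb[1] <= mu[0] + tol, mu[0] <= lmb[0] + tol)
```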

From the minimax principle,

\lambda_{1}=\max_{\|x\|=1}\langle x,Mx\rangle=\max_{\|x\|=1}\left(\langle x,Hx\rangle+\lambda|\langle x,u\rangle|^{2}\right),

which shows that $\lambda_{1}$ is an increasing function of $\lambda$ . Further, since $\lambda_{1}\geq\mu_{1}$ and

\lambda-\mu_{1}\leq\lambda_{1}\leq\lambda+\mu_{1},

we find that $\lambda_{1}-\mu_{1}=o(1)$ if $\lambda=o(1)$ and $\lambda_{1}>\mu_{1}+1$ if $\lambda>2\mu_{1}+1$. Thus, since $\lambda_{1}$ is a continuous function of $\lambda$, conditional on $H$ there exists $\lambda_{c}$ such that the statement of Theorem 2.1 holds. It thus remains to show that $\lambda_{c}$ and $g$ are deterministic in the sense that the same statement holds without conditioning on $H$.

Our proof is based on the Stieltjes transform method in random matrix theory for which we use the following definition:

Definition 4.1 (Stieltjes Transform).

Let $\mu$ be a probability measure on the real line. The Stieltjes transform of $\mu$ is defined by

S_{\mu}(z)=\int_{\mathbb{R}}\frac{1}{x-z}\mathrm{d}\mu(x)

for $z\in\mathbb{C}\backslash\mathrm{supp}\,(\mu)$ .

For the noise $H$ , we consider its resolvent $G(z)$ defined by

G(z):=(H-zI)^{-1}

(4.1)

for $z\in\mathbb{C}\backslash\mathrm{spec}\,(H)$ . Note that the normalized trace $m:=N^{-1}\operatorname{Tr}G$ is equal to the Stieltjes transform of the empirical spectral distribution (ESD) of $H$ .
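The identity between the normalized trace of the resolvent and the Stieltjes transform of the ESD can be illustrated as follows. This is a minimal sketch, assuming NumPy; $H$ is replaced by a Gaussian Wigner matrix (so that the limiting ESD is the semi-circle distribution), and the spectral parameter $z=0.3+0.5\mathrm{i}$ is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 400
G0 = rng.standard_normal((N, N))
H = (np.triu(G0) + np.triu(G0, 1).T) / np.sqrt(N)

z = 0.3 + 0.5j
resolvent = np.linalg.inv(H - z * np.eye(N))
m_trace = np.trace(resolvent) / N                    # N^{-1} Tr G(z)
m_esd = np.mean(1.0 / (np.linalg.eigvalsh(H) - z))   # integral of 1/(x - z) against the ESD

# Semi-circle Stieltjes transform: the root of m^2 + z m + 1 = 0 with positive imaginary part.
roots = np.roots([1, z, 1])
m_sc = roots[np.argmax(roots.imag)]
print(m_trace, m_esd, m_sc)
```

The first two numbers agree up to rounding for any symmetric $H$, while the third is only the large-$N$ limit in the Wigner case.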

To find the largest eigenvalue of $M=H+\lambda uu^{T}$ , we recall that any eigenvalue $z$ of $M$ satisfies

\det(H+\lambda uu^{T}-zI)=0,

(4.2)

which can be further decomposed into

\begin{split}0&=\det(H+\lambda uu^{T}-zI)=\det\left((H-zI)(I+(H-zI)^{-1}\lambda uu^{T})\right)\\ &=\det(H-zI)\cdot\det(I+(H-zI)^{-1}\lambda uu^{T}).\end{split}

(4.3)

Thus, if $z$ is not an eigenvalue of $H$, we find that $\det(H-zI)\neq 0$, and hence

\det(I+(H-zI)^{-1}\lambda uu^{T})=0.

We now claim that $(H-zI)^{-1}\lambda uu^{T}$ has rank one. To prove the claim, we notice that for any $v\in\mathbb{R}^{N}$ ,

(H-zI)^{-1}\lambda uu^{T}v=\langle u,v\rangle(H-zI)^{-1}\lambda u.

It implies that the range of $(H-zI)^{-1}\lambda uu^{T}$ is contained in $\mathrm{span}\,((H-zI)^{-1}\lambda u)$ , which is a 1-dimensional space.

Since $(H-zI)^{-1}\lambda uu^{T}$ has rank 1, it has only one non-zero eigenvalue, which we call $\lambda_{0}$. Then $\lambda_{0}=-1$, for otherwise every eigenvalue of $I+(H-zI)^{-1}\lambda uu^{T}$ would be non-zero, contradicting (4.3). Furthermore, it is also obvious that $(H-zI)^{-1}\lambda u$ is an eigenvector associated with the eigenvalue $-1$. Thus,

(H-zI)^{-1}\lambda uu^{T}(H-zI)^{-1}u=-(H-zI)^{-1}u,

which leads us to the equation

u^{T}(H-zI)^{-1}u=\langle u,G(z)u\rangle=-\frac{1}{\lambda}.

(4.4)
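Equation (4.4) is an exact identity for any outlier eigenvalue of $M$ that is not an eigenvalue of $H$, so it can be verified on a single sample. The following is a minimal sketch, assuming NumPy; the Gaussian Wigner matrix, $N=300$, $\lambda=2$ and the flat vector $u$ are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(3)
N, lam = 300, 2.0
G0 = rng.standard_normal((N, N))
H = (np.triu(G0) + np.triu(G0, 1).T) / np.sqrt(N)
u = np.ones(N) / np.sqrt(N)
M = H + lam * np.outer(u, u)

z = np.linalg.eigvalsh(M)[-1]                           # largest eigenvalue of M
quad_form = u @ np.linalg.solve(H - z * np.eye(N), u)   # <u, G(z) u> with G(z) = (H - zI)^{-1}
print(quad_form, -1 / lam)
```

Solving the linear system instead of forming the inverse explicitly is a numerical convenience and does not change the quantity being computed.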

For the noise matrix $H$ , which is a Wigner-type matrix considered in [4], we have that

\langle u,G(z)u\rangle\simeq\sum_{i=1}^{N}m_{i}(z)u_{i}^{2}

(4.5)

for any $z$ not contained in an open neighborhood of the support of the limiting ESD of $H$ , where we let $\mathbf{m}:=(m_{1},m_{2},\dots,m_{N})$ be the solution to the quadratic vector equation (QVE)

-\frac{1}{m_{i}(z)}=z+\sum^{N}_{j=1}\mathbb{E}[H_{ij}^{2}]m_{j}(z)

(4.6)

for $i=1,2,\dots,N$. (See Appendix A for the precise statement of (4.5).) We remark that the uniqueness of the solution $\mathbf{m}$ of (4.6) is also known [3].

To solve the equations (4.4) and (4.6), we need to estimate $\mathbf{m}(z)$ using the assumption on the community structure in Definition 2.1. Assume for simplicity that $S=\{1,2,\dots,N_{1}\}$. By symmetry, we make the ansatz

m_{1}(z)=m_{2}(z)=\dots=m_{N_{1}}(z),\quad m_{N_{1}+1}(z)=\dots=m_{N}(z).

Then, we can rewrite (4.6) as

\displaystyle-\frac{1}{m_{i}(z)}=\begin{cases}z+\sum^{N_{1}}_{j=1}\frac{\alpha_{1}}{N}m_{j}(z)+\sum^{N}_{j=N_{1}+1}\frac{1}{N}m_{j}(z)&\text{if }\quad 1\leq i\leq N_{1}\\ z+\sum^{N_{1}}_{j=1}\frac{1}{N}m_{j}(z)+\sum^{N}_{j=N_{1}+1}\frac{\alpha_{2}}{N}m_{j}(z)&\text{if }\quad N_{1}+1\leq i\leq N\\ \end{cases},

which can be further simplified (after omitting the $z$ -dependence) to

\begin{split}-1&=zm_{1}+\alpha_{1}\gamma(m_{1})^{2}+(1-\gamma)m_{1}m_{N}\,\\ -1&=zm_{N}+\gamma m_{1}m_{N}+\alpha_{2}(1-\gamma)(m_{N})^{2}\,.\end{split}

(4.7)
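The reduced system (4.7) has no simple closed-form solution in general, but it is easy to solve numerically for real $z$ to the right of the spectrum. The following is a minimal sketch, assuming NumPy-free plain Python and a naive fixed-point iteration started from the large-$z$ asymptotics $m\approx -1/z$; the iteration scheme is an assumption for illustration (it is not taken from the paper and is only expected to converge for $z$ well outside the support), and the function names are hypothetical.

```python
def solve_block_qve(z, gamma, alpha1, alpha2, iters=500):
    """Fixed-point iteration for (m_1, m_N) in the two-block system (4.7)."""
    m1 = mN = -1.0 / z                          # large-|z| asymptotics as initial guess
    for _ in range(iters):
        m1_new = -1.0 / (z + alpha1 * gamma * m1 + (1 - gamma) * mN)
        mN_new = -1.0 / (z + gamma * m1 + alpha2 * (1 - gamma) * mN)
        m1, mN = m1_new, mN_new
    return m1, mN

def residuals(z, gamma, alpha1, alpha2, m1, mN):
    """Residuals of the two equations in (4.7); both vanish at a solution."""
    r1 = 1 + z * m1 + alpha1 * gamma * m1**2 + (1 - gamma) * m1 * mN
    r2 = 1 + z * mN + gamma * m1 * mN + alpha2 * (1 - gamma) * mN**2
    return r1, r2

m1, mN = solve_block_qve(z=3.0, gamma=0.25, alpha1=1.5, alpha2=0.8)
print(m1, mN, residuals(3.0, 0.25, 1.5, 0.8, m1, mN))
```

For $\alpha_{1}=\alpha_{2}=1$ the two components coincide and reduce to the Stieltjes transform of the semi-circle distribution, which gives a simple sanity check of the solver.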

We can thus conclude that if there exists a real $z$ that solves (4.7) under the assumption

N_{1}m_{1}\theta_{1}^{2}+(N-N_{1})m_{N}\theta_{2}^{2}=N\left(\gamma m_{1}\theta_{1}^{2}+(1-\gamma)m_{N}\theta_{2}^{2}\right)=-\frac{1}{\lambda}

(4.8)

then $\lambda_{1}$ converges to $z$ with high probability. Since (4.8) is deterministic, we find that the gap $g$ is deterministic in the supercritical case $\lambda>\lambda_{c}$ .

It remains to find the critical $\lambda_{c}$. Recall that we set $m:=N^{-1}\operatorname{Tr}G$. From the Stieltjes inversion formula, the (normalized) imaginary part of $m(z)$ corresponds to the density of the limiting ESD of $H$ at $\operatorname{Re}z$. Thus, after reducing (4.7) to a single equation involving $z$ and $m$ only, i.e., $f(z,m)=0$, we find that the upper edge $L_{+}$ of the ESD of $H$ is the largest real number such that $f(L_{+},m)=0$ has a double root when considered as an equation for $m$. (Note that technically the condition can be checked by solving $f(L_{+},m)=0$ and $\frac{\partial}{\partial m}f(L_{+},m)=0$ simultaneously.) We can thus conclude that $\lambda_{c}$ is determined as the largest number such that when $\lambda=\lambda_{c}$ the solution $z$ of the equation (4.7) under the assumption (4.8) coincides with $L_{+}$. This in particular shows that $\lambda_{c}$ is also deterministic and completes the proof of Theorem 2.1.
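As a sanity check of the double-root criterion, consider the Wigner case $\alpha_{1}=\alpha_{2}=1$, which is a special case chosen here for illustration. Then all rows of the variance profile coincide, so $m_{1}=m_{N}=m$ and (4.7) reduces to a single quadratic equation. The computation below recovers the classical values $L_{+}=2$ and $\lambda_{c}=1$ of the BBP transition for spiked Wigner matrices, using $\|u\|=1$ in (4.8).

```latex
f(z,m)=1+zm+m^{2}=0,\qquad \frac{\partial f}{\partial m}(z,m)=z+2m=0
\;\Longrightarrow\; m=-\frac{z}{2},\quad 1-\frac{z^{2}}{4}=0,\quad L_{+}=2,
\\
m=-\frac{1}{\lambda}\ \text{in}\ f(z,m)=0
\;\Longrightarrow\; z=\lambda+\frac{1}{\lambda},\qquad
\lambda+\frac{1}{\lambda}=L_{+}=2\iff\lambda=\lambda_{c}=1,\qquad
g=\lambda+\frac{1}{\lambda}-2\ \ (\lambda>1).
```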

5 Conclusion and Future Works

In this paper, we considered the generalized stochastic block model with two communities. We showed the phase transition in the GSBM where the random part is a Wigner-type matrix, which extends the BBP transition. For the precise formulas, we discussed a hidden community model and an unbalanced stochastic block model with Bernoulli distribution and Gaussian distribution at the Kesten–Stigum threshold. Both models can be improved for a non-Gaussian case.

We believe that it is possible to prove the phase transition for sparse matrices, in which the data matrix is not necessarily symmetric and most of the entries are zero. We also hope to extend our result to the GSBM with more than two communities.

Acknowledgments

The authors were supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2019R1A5A1028324).

Appendix A Local law for Wigner-type matrices

In this section, we provide a precise statement of the local law for Wigner-type matrices, which was used in the proof of Theorem 2.1 in Section 4. Wigner-type matrices are defined as follows:

Definition A.1.

(Wigner-type matrix) We say that an $N\times N$ random matrix $H=(H_{ij})$ is a Wigner-type matrix if $H$ is real symmetric, its entries $H_{ij}$ ($i\leq j$) are independent, and the following conditions hold:

•

$\mathbb{E}(H_{ij})=0$ for all $i,j$ .
•

The variance matrix $\boldsymbol{S}=(S_{ij})$ where $S_{ij}=\mathbb{E}|H_{ij}|^{2}$ satisfies

$(S^{L})_{ij}\geq\frac{\rho}{N}\text{ and }S_{ij}\leq\frac{S_{*}}{N},\ \ 1\leq i,j\leq N$

for finite parameters $\rho,S_{*},L$ .

For the precise statement of the local law, we use the following definitions, which are frequently used in the analysis involving rare events in random matrix theory.

Definition A.2.

(Overwhelming probability) An event $\Omega$ holds with overwhelming probability if for any (large) $D>0$, $P(\Omega)\geq 1-N^{-D}$ for any sufficiently large $N$.

Definition A.3.

(Stochastic domination) Consider two families of non-negative random variables:

\psi=\{\psi^{(N)}(u)|N\in\mathbb{N},\ u\in U^{(N)}\}

\phi=\{\phi^{(N)}(u)|N\in\mathbb{N},\ u\in U^{(N)}\}

where $U^{(N)}$ is a (possibly $N$-dependent) parameter set. Suppose $N_{0}:(0,\infty)^{2}\to\mathbb{N}$ is a given function depending on $p,\ q,\ n$ and $\mu$. If for $\epsilon>0$ small enough and $D>0$ big enough, we have

P\left(\phi^{(N)}>N^{\epsilon}\psi^{(N)}\right)\leq N^{-D},\ \ \ \text{for }N\geq N_{0}(\epsilon,D),

then $\phi$ is said to be stochastically dominated by $\psi$, which is denoted by $\phi\prec\psi$.

We are now ready to state the local law. Let $m_{i}(z)$ be the solution of the QVE in (4.6) and let $\rho$ be the density defined by

\rho(\tau):=\lim_{\eta\searrow 0}\frac{1}{\pi N}\sum_{i=1}^{N}\operatorname{Im}m_{i}(\tau+\mathrm{i}\eta).

(See also Corollary 1.3 of [4] for more detail.)

Theorem A.1 (Local law).

Let $H$ be a Wigner-type matrix and fix an arbitrary $\gamma\in(0,1)$. Then, uniformly for all $z=a+bi$ with $b\geq N^{\gamma-1}$, the resolvent $G(z)=(H-zI)^{-1}$ satisfies

\max_{i,j}\left|G_{ij}(z)-m_{i}(z)\delta_{ij}\right|\prec\frac{1+\sqrt{\rho(z)}}{\sqrt{bN}}+\frac{1}{bN}.

Furthermore, for any deterministic vector $w\in\mathbb{C}^{N}$ with $\max_{i}|w_{i}|\leq 1$, we have

\left|\sum^{N}_{i,j=1}\overline{w_{i}}\left(G_{ij}(z)-m_{i}(z)\right)\right|\prec\frac{1}{\sqrt{bN}}

The local law can be generalized to the anisotropic local law as follows.

Theorem A.2 (Anisotropic law).

Suppose that the assumptions in Theorem A.1 hold. Then, uniformly for all $z=a+bi$ with $b\geq N^{\gamma-1}$ , and for any two deterministic $\ell^{2}$ -normalized vectors $w,v\in\mathbb{C}^{N}$ , we have

\left|\sum^{N}_{i,j=1}\overline{w_{i}}G_{ij}(z)v_{j}-\sum^{N}_{i=1}m_{i}(z)\overline{w_{i}}v_{i}\right|\prec\frac{1+\sqrt{\rho(z)}}{\sqrt{bN}}+\frac{1}{bN}.

Appendix B Examples from stochastic block models

In this appendix, we consider stochastic block models, which correspond to GSBMs with Bernoulli distribution in our setting. Suppose that $\widehat{H}=[\widehat{H}_{ij}]_{i,j=1}^{N}$ is an SBM such that

\widehat{H}_{ij}\sim\begin{cases}\mathrm{Bernoulli}(p_{1}),&\mbox{ if }1\leq i,j\leq N_{1}\,,\\ \mathrm{Bernoulli}(p_{2}),&\mbox{ if }N_{1}+1\leq i,j\leq N\,,\\ \mathrm{Bernoulli}(q),&\mbox{ otherwise}\,.\\ \end{cases}

(B.1)

In what follows, we say that the $(i,j)$-entry is in a diagonal block if $1\leq i,j\leq N_{1}$ or $N_{1}+1\leq i,j\leq N$, and otherwise that it is in an off-diagonal block. In the block matrix form, the model can also be expressed as follows:

\widehat{H}=\left(\begin{array}{c|c}\mathrm{Bernoulli}(p_{1})&\mathrm{Bernoulli}(q)\\ \hline \mathrm{Bernoulli}(q)&\mathrm{Bernoulli}(p_{2})\end{array}\right).

Our goal is to shift and rescale $\widehat{H}$ to convert it into a GSBM $M=H+\lambda uu^{T}$ in Definition 2.1. We first notice that the variances of the entries of $\widehat{H}$ are $p_{1}(1-p_{1})$ and $p_{2}(1-p_{2})$ for the diagonal block and $q(1-q)$ for the off-diagonal block. Since we assume that the variance of the entry $H_{ij}$ in the off-diagonal block is $N^{-1}$ , we find that the matrix must be divided by $\sqrt{Nq(1-q)}$ . It is then immediate to find that

\alpha_{1}=\frac{p_{1}(1-p_{1})}{q(1-q)},\qquad\alpha_{2}=\frac{p_{2}(1-p_{2})}{q(1-q)}.

(B.2)

as in (2.1).

The mean matrix

\mathbb{E}[\widehat{H}]=\begin{pmatrix}p_{1}&q\\ q&p_{2}\end{pmatrix}

is a rank-$2$ matrix, and thus we need to subtract from each entry a deterministic number, which depends on the parameters $p_{1},p_{2}$, and $q$.

B.1 Hidden community model

Suppose that $p_{1}=p$ and $p_{2}=q$. It is then easy to see that $\mathbb{E}[\widehat{H}]$ becomes a rank-$1$ matrix after subtracting $q$ from each entry, i.e., if we let $E_{0}$ be the $N\times N$ matrix all of whose entries are $q$, then

\mathbb{E}[\widehat{H}]-E_{0}=\begin{pmatrix}p-q&0\\ 0&0\end{pmatrix}.

Thus, we find that

M=\frac{1}{\sqrt{Nq(1-q)}}(\widehat{H}-E_{0})

(B.3)

and

\mathbb{E}[M]=\frac{1}{\sqrt{Nq(1-q)}}\begin{pmatrix}p-q&0\\ 0&0\end{pmatrix}.

(B.4)

Recall that $N_{1}=\gamma N$ and $p=\frac{w}{\sqrt{N}}+q$ . Since $\lambda uu^{T}=\mathbb{E}[M]$ , we get

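u=\Bigl(\underbrace{\tfrac{1}{\sqrt{\gamma N}},\dots,\tfrac{1}{\sqrt{\gamma N}}}_{N_{1}},\underbrace{0,\dots,0}_{N-N_{1}}\Bigr)^{T},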

i.e., $\theta_{1}=1/\sqrt{\gamma N}$ and $\theta_{2}=0$ . We also find that

\lambda=\frac{N_{1}(p-q)}{\sqrt{Nq(1-q)}}=\frac{\gamma w}{\sqrt{q(1-q)}}.

Following the proof of Theorem 2.1 in Section 4, we solve the system of equations in (4.7),

\begin{split}-1&=zm_{1}+\frac{p(1-p)}{q(1-q)}\gamma(m_{1})^{2}+(1-\gamma)m_{1}m_{N}\,,\\ -1&=zm_{N}+\gamma m_{1}m_{N}+(1-\gamma)(m_{N})^{2}\,.\end{split}

(B.5)

Since $p-q=O(N^{-1/2})$ , we consider an ansatz $m_{N}=m_{1}+O(N^{-1/2})$ , which shows for $m=\gamma m_{1}+(1-\gamma)m_{N}$ that

1+zm+m^{2}=O(N^{-1/2}).

Following the analysis in the last paragraph of Section 4, we find that the upper edge $L_{+}=2+O(N^{-1/2})$ . By Theorem 2.1, it also implies that $\lambda_{2}\to 2$ as $N\to\infty$ .

In order to determine the location of the largest eigenvalue $\lambda_{1}$ , we consider (B.5) under the assumption in (4.8),

N\left(\gamma m_{1}\theta_{1}^{2}+(1-\gamma)m_{N}\theta_{2}^{2}\right)=m_{1}=-\frac{1}{\lambda}=-\frac{\sqrt{q(1-q)}}{\gamma w}.

(B.6)

We remark that the ansatz $m_{N}=m_{1}+O(N^{-1/2})$ can be directly checked in this case; by plugging (B.6) into (B.5) and eliminating $z$ ,

\left(\frac{p(1-p)}{q(1-q)}-1\right)\gamma\left(\frac{\sqrt{q(1-q)}}{\gamma w}\right)^{2}m_{N}+m_{N}+\frac{\sqrt{q(1-q)}}{\gamma w}=0,

whose solution is

m_{N}=-\frac{\sqrt{q(1-q)}/(\gamma w)}{1+\left(\frac{p(1-p)}{q(1-q)}-1\right)\gamma\left(\frac{\sqrt{q(1-q)}}{\gamma w}\right)^{2}}=-\frac{\sqrt{q(1-q)}}{\gamma w}\left(1-\frac{\sqrt{N}(1-2q)-w}{N\gamma w+\sqrt{N}(1-2q)-w}\right).
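The two expressions for $m_{N}$ and the size of the $O(N^{-1/2})$ correction can be checked numerically. The following is a minimal sketch, assuming NumPy and the parameters of the experiment in Section 3.1 ($N=2500$, $\gamma=1/4$, $q=0.2$, $p=0.25$); the variable names are hypothetical. The value of $z$ is recovered from the second line of (B.5) and then substituted into the first line as a consistency check.

```python
import numpy as np

N, gamma, q, p = 2500, 0.25, 0.2, 0.25
w = np.sqrt(N) * (p - q)
s = np.sqrt(q * (1 - q)) / (gamma * w)          # s = 1/lambda, so m_1 = -s by (B.6)
alpha1 = p * (1 - p) / (q * (1 - q))

# The two closed forms for m_N given in the text.
mN_first = -s / (1 + (alpha1 - 1) * gamma * s**2)
c = np.sqrt(N) * (1 - 2 * q) - w
mN_second = -s * (1 - c / (N * gamma * w + c))
m1 = -s

# z recovered from the second line of (B.5), then checked against the first line.
z = -(1 + gamma * m1 * mN_first + (1 - gamma) * mN_first**2) / mN_first
first_line = 1 + z * m1 + alpha1 * gamma * m1**2 + (1 - gamma) * m1 * mN_first
print(mN_first, mN_second, z, 1 / s + s, first_line)
```

The difference between the printed $z$ and $\lambda+1/\lambda$ is the finite-$N$ correction that is absorbed into the $O(N^{-1/2})$ term of the limiting formula.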

To find the location of the largest eigenvalue, we need to check whether the assumption (B.6) is valid. However, we can instead find the value of $z$ by first assuming that the solution exists. Then,

z=\frac{\gamma w}{\sqrt{q(1-q)}}+\frac{\sqrt{q(1-q)}}{\gamma w}+O(N^{-1/2}).

At the critical $\lambda_{c}$ for the phase transition in Theorem 2.1, the location of the largest eigenvalue coincides with the location of the upper edge $L_{+}$ in the limit $N\to\infty$ , or equivalently, $\frac{\gamma w}{\sqrt{q(1-q)}}=1$ . Thus, we conclude that

\lambda_{1}\to\begin{cases}\frac{\gamma w}{\sqrt{q(1-q)}}+\frac{\sqrt{q(1-q)}}{\gamma w}&\text{ if }w>\frac{\sqrt{q(1-q)}}{\gamma}\,,\\ 2&\text{ if }w<\frac{\sqrt{q(1-q)}}{\gamma}\,.\end{cases}

B.2 Unbalanced stochastic block model

Suppose that $p_{1}=p_{2}=p$. Following the strategy in Appendix B.1, we let $E_{1}$ be the $N\times N$ matrix all of whose entries are $(p+q)/2$. Then,

\mathbb{E}[\widehat{H}]-E_{1}=\begin{pmatrix}(p-q)/2&(q-p)/2\\ (q-p)/2&(p-q)/2\end{pmatrix}.

Thus, we find that

M=\frac{1}{\sqrt{Nq(1-q)}}(\widehat{H}-E_{1})

(B.7)

and

\mathbb{E}[M]=\frac{1}{\sqrt{Nq(1-q)}}\begin{pmatrix}(p-q)/2&(q-p)/2\\ (q-p)/2&(p-q)/2\end{pmatrix}.

(B.8)

From $\lambda uu^{T}=\mathbb{E}[M]$ , we get

u=\frac{1}{\sqrt{N}}\bigl(\underbrace{1,\dots,1}_{N_{1}},\underbrace{-1,\dots,-1}_{N-N_{1}}\bigr)^{T},

i.e., $\theta_{1}=1/\sqrt{N}$ and $\theta_{2}=-1/\sqrt{N}$ . Also,

\lambda=\frac{N(p-q)}{2\sqrt{Nq(1-q)}}=\frac{w}{2\sqrt{q(1-q)}}.

With $\alpha_{1}=\alpha_{2}=\frac{p(1-p)}{q(1-q)}$, we solve the system of equations in (4.7),

\begin{split}-1&=zm_{1}+\frac{p(1-p)}{q(1-q)}\gamma(m_{1})^{2}+(1-\gamma)m_{1}m_{N}\,,\\ -1&=zm_{N}+\gamma m_{1}m_{N}+\frac{p(1-p)}{q(1-q)}(1-\gamma)(m_{N})^{2}\,.\end{split}

(B.9)

Again, we consider an ansatz $m_{N}=m_{1}+O(N^{-1/2})$ , which leads us to the result that the upper edge $L_{+}=2+O(N^{-1/2})$ and $\lambda_{2}\to 2$ as $N\to\infty$ . The assumption in (4.8) becomes

\gamma m_{1}+(1-\gamma)m_{N}=-\frac{1}{\lambda}=-\frac{2\sqrt{q(1-q)}}{w}.

(B.10)

If the solution to the equation (B.9) exists, it would be

z=\frac{w}{2\sqrt{q(1-q)}}+\frac{2\sqrt{q(1-q)}}{w}+O(N^{-1/2}).

At the critical $\lambda_{c}$ , $\frac{w}{2\sqrt{q(1-q)}}=1$ , and thus we conclude that

\lambda_{1}\to\begin{cases}\frac{w}{2\sqrt{q(1-q)}}+\frac{2\sqrt{q(1-q)}}{w}&\text{ if }w>2\sqrt{q(1-q)}\,,\\ 2&\text{ if }w<2\sqrt{q(1-q)}\,.\end{cases}

References

[1] E. Abbe. Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research, 18(1):6446–6531, 2017.
[2] E. Abbe, A. Bandeira, and G. Hall. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2014.
[3] O. Ajanki, L. Erdős, and T. Krüger. Quadratic vector equations on complex upper half-plane. Memoirs of the American Mathematical Society, 261(1261), 2019.
[4] O. Ajanki, L. Erdős, and T. Krüger. Universality for general Wigner-type matrices. Probability Theory and Related Fields, 169(3):667–727, 2017.
[5] J. Baik, G. Ben Arous, and S. Péché. Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643–1697, 2005.
[6] J. Barbier, M. Dia, N. Macris, F. Krzakala, T. Lesieur, and L. Zdeborová. Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula. Advances in Neural Information Processing Systems, 29:424–432, 2016.
[7] F. Benaych-Georges and R. R. Nadakuditi. The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Advances in Mathematics, 227(1):494–521, 2011.
[8] M. Capitaine, C. Donati-Martin, and D. Féral. The largest eigenvalues of finite rank deformation of large Wigner matrices: convergence and nonuniversality of the fluctuations. The Annals of Probability, 37(1):1–47, 2009.
[9] P.-Y. Chen and A. Hero. Universal phase transition in community detectability under a stochastic block model. Physical Review E, 91(3):032804, 2015.
[10] H. W. Chung and J. O. Lee. Weak detection of signal in the spiked Wigner model. In International Conference on Machine Learning, 97:1233–1241, 2019.
[11] I. Dumitriu and Y. Zhu. Sparse general Wigner-type matrices: Local law and eigenvector delocalization. Journal of Mathematical Physics, 60(2):023301, 2019.
[12] A. El Alaoui, F. Krzakala, and M. I. Jordan. Fundamental limits of detection in the spiked Wigner model. The Annals of Statistics, 48(2):863–885, 2020.
[13] L. Erdős, T. Krüger, and D. Schröder. Cusp universality for random matrices I: local law and the complex Hermitian case. Communications in Mathematical Physics, 378(2):1203–1278, 2020.
[14] L. Erdős and P. Mühlbacher. Bounds on the norm of Wigner-type random matrices. Random Matrices: Theory and Applications, 8(03):1950009, 2019.
[15] D. Féral and S. Péché. The largest eigenvalue of rank one deformation of large Wigner matrices. Communications in Mathematical Physics, 272(1):185–228, 2007.
[16] B. Hajek, Y. Wu, and J. Xu. Information limits for recovering a hidden community. IEEE Transactions on Information Theory, 63(8):4729–4745, 2017.
[17] B. Hajek, Y. Wu, and J. Xu. Achieving exact cluster recovery threshold via semidefinite programming. IEEE Transactions on Information Theory, 62(5):2788–2797, 2016.
[18] B. Hajek, Y. Wu, and J. Xu. Recovering a hidden community beyond the Kesten–Stigum threshold in $O(|E|\log^{*}|V|)$ time. Journal of Applied Probability, 55(2):325–352, 2018.
[19] J. H. Jung, H. W. Chung, and J. O. Lee. Weak detection in the spiked Wigner model with general rank. arXiv:2001.05676, 2020.
[20] S. Péché. The largest eigenvalue of small rank perturbations of Hermitian random matrices. Probability Theory and Related Fields, 134(1):127–173, 2006.
[21] A. Perry, A. S. Wein, A. S. Bandeira, and A. Moitra. Optimality and sub-optimality of PCA I: Spiked random matrix models. The Annals of Statistics, 46(5):2416–2451, 2018.
[22] N. Stanley, T. Bonacci, R. Kwitt, M. Niethammer, and P. J. Mucha. Stochastic block models with multiple continuous attributes. Applied Network Science, 4(1):1–22, 2019.
[23] Y. Zhu. A graphon approach to limiting spectral distributions of Wigner-type matrices. Random Structures & Algorithms, 56(1):251–279, 2020.