Phase Transition in the Generalized Stochastic Block Model
Abstract
We study the problem of detecting the community structure from the generalized stochastic block model (GSBM). Based on the analysis of the Stieljtes transform of the empirical spectral distribution, we prove a BBP-type transition for the largest eigenvalue of the GSBM. For specific models such as a hidden community model and an unbalanced stochastic model, we provide precise formulas for the two largest eigenvalues, establishing the gap in the BBP-type transition.
1 Introduction
One of the most fundamental and natural problems in data science is to understand an underlying structure from data sets that can be viewed as networks. The problem is known as the clustering or community detection, and it appears in diverse field of studies involving real-world networks.
The stochastic block model (SBM) is one of the most fundamental mathematical models to understand the community structure in networks. An SBM is a random graph with nodes, partitioned into disjoint subsets, called the communities, . One can characterize an SBM via its adjacency matrix, which is a symmetric (random) matrix , whose -entry is a Bernoulli random variable depending only on the communities to which the nodes and belong. For the clustering of an SBM, it is often useful to analyze the eigenvalues of the adjacency matrix and their associated eigenvectors, which is known as a spectral method.
One of the most prominent examples of spectral methods is the principal component analysis (PCA) in which the behavior of the eigenvectors associated with the extremal eigenvalues are considered to obtain the community structure of the SBM. For an SBM with two communities, the expectation of its adjacency matrix has a block structure, i.e.,
(1.1) |
In the simplest case of a balanced SBM with , , and the two communities are of equal size, it can be easily checked that has at most two non-zero eigenvalues, and . Thus, if is sufficiently large, the perturbation is negligible for the two largest eigenvalues and it is possible to determine the community structure from the eigenvector associated with the second largest eigenvalue of . Equivalently, after subtracting from each entry, the (shifted) adjacency matrix becomes the sum of a rank- deterministic matrix and a random matrix with centered entries, and one can use the eigenvector associated with the largest eigenvalue of the shifted adjacency matrix for clustering.
The sum of a deterministic matrix and a random matrix has been extensively studied in random matrix theory. When the deterministic matrix is rank- and the random matrix is a Wigner matrix, it is called a (rank-) spiked Wigner matrix. The behavior of the largest eigenvalue of a spiked Wigner matrix is known to exhibit a sharp phase transition depending on the ratio between the spectral norms of the deterministic part and the random part. This type of phase transition is called the BBP transition, after the seminal work of Baik, Ben Arous, and Péché [5] for spiked (complex) Wishart matrix. From the BBP transition, we can immediately see that the detection of the signal is possible via PCA when the signal-to-noise ratio (SNR) is above a certain threshold.
While the BBP transition has been proved for spiked Wigner matrices under various assumptions [20, 15, 8, 7], it is not directly applicable to the SBM, since the entries in a Wigner matrix are i.i.d. (up to symmetry constraint) whereas those in the adjacency matrix of an SBM are not. The proof of the BBP transition with an SBM is substantially harder. For example, unless the SBM is balanced, the empirical spectral distribution (ESD) of does not even converge to the semi-circle distribution, which is the limiting ESD of a Wigner matrix; the limiting ESD in this case is not given by a simple formula as the semi-circle distribution but by an implicit formula via its Stieltjes transform.
Main contribution
In this paper, we consider a model that generalizes the SBM, called the generalized stochastic block model (GSBM), with two communities. In this model, the mean of the matrix has the same block structure as that of the SBM in (1.1), but the entries are not necessarily Bernoulli random variables. See Definition 2.1 for the precise definition of the GSBM.
For the GSBM, We prove the BBP-type transition for its largest eigenvalue (Theorem 2.1). The proof is based on the analysis of the Stiejtes transform of the ESD, which involves the resolvent of the random part of the GSBM. Due to the community structure, the random part is not a Wigner matrix, but a generalization of a Wigner matrix, known as a Wigner-type matrix. The local properties of eigenvalues of Wigner-type matrices are now well-established by recent developments of random matrix theory; see, e.g., [3, 4, 11].
In our main result, Theorem 2.1, we only state the existence of the critical values and the limiting gap between the two largest eigenvalue but refrain from writing the precise formulas for them. We instead apply our results to specific examples naturally arising in applications, hidden community model and unbalanced stochastic block model, and present the results from numerical experiments. (In terms of the edge probability, the former corresponds to the case and , while the latter and (but )).
Related works
The local law for Wigner-type matrices and the behavior of Quadratic vector equations (QVE), which are crucial in the analysis for Wigner-type matrices, were thoroughly investigated by Ajanki, Erdős and Krüger [3, 4]. A related result on the local law at the cusp for the Wigner-type matrix was also proved [13]. For more results on general Wigner-type matrices, we refer to [11, 14, 23] and references therein.
The phase transition of the largest eigenvalue was first proved proved by Baik, Ben Arous and Péché [5] for spiked Wishart matrices and later extend to other models, including the spiked Wigner matrix under various assumptions [20, 15, 8, 7]. If the SNR is below the threshold given by the BBP transition, the largest eigenvalue has no information on the signal and we cannot use the PCA for the detection of the signal. For this case, the PCA can be improved by an entrywise transformation that effectively increase the SNR [6, 21]. Reliable detection is impossible below a certain threshold [21], and it is only possible to consider a weak detection, which is a hypothesis testing between the null model (without spike) and the alternative (with spike). For more detail about the weak detection, we refer to [12, 10, 19].
The problem of recovering a hidden community from a symmetric matrix for two important cases, the Bernoulli and Gaussian entries, was discussed by Hajek, Wu, and Xu [16]. A threshold for exact recovery in SBM was discussed in [1, 2, 9, 17]. Recovering community at the Kesten–Stigum threshold for SBM was considered in [18]. For more results and Applications on SBM, we refer to [22] and references therein.
Organization of the paper
The rest of the paper is organized as follows: In Section 2, we define the model and state the main result. In Section 3, we introduce the hidden community model and unbalanced stochastic model to provide the results from numerical experiments around the transition threshold. In Section 4, we prove the main theorem. A summary of our results and future research directions was discussed in Section 5. Appendix A contains the definition of the Wigner-type matrices and preliminary results on this model. The detailed analysis for the specific models can be found in Appendix B.
2 Main Results
In this section, we precisely define the matrix model that we consider in this paper and state our main theorem. We begin by introducing a shifted, rescaled matrix for a generalized stochastic block model with two communities.
Definition 2.1 (Generalized Stochastic Block Model (GSBM)).
An matrix is a generalized stochastic block model if
where is a constant, with , and is an real symmetric matrix, satisfying the following:
-
•
There exist and constants such that
We further assume that for some (-independent) constant .
-
•
Upper diagonal entries are centered independent random variables such that
-
–
there exist (-independent) constants and such that
-
–
for any (-independent) , there exists a constant such that for all
-
–
For an adjacency matrix in (1.1), if , , and , then after shifting and rescaling, we find that
(2.1) |
(See Appendix B for more detail.)
We remark that are not necessarily Bernoulli random variables. The assumption on the finite moment means that the model is in the dense regime. The most typical balanced stochastic block model with two communities correspond to the choice of parameters and .
Our main theorem is the following result on the phase transition for the spectral gap of GSBM.
Theorem 2.1.
Let be a generalized stochastic block model defined in Definition 2.1. Denote by and the largest and the second largest eigenvalue of . Assume that is fixed. Then, there exists a constant , depending only on , such that
-
•
(Subcritical case) if , then as , almost surely.
-
•
(Supercritical case) if , then as , almost surely, for some (-independent) constant .
3 Examples and Experiments
In this section, we focus on several specific models and check how the main result, Theorem 2.1, applies to them.
3.1 Hidden community model
In the hidden community model, only one of the intra-community connection probability is larger than the inter-community connection probability, and the other intra-connection probability coincides with the inter-community connection. The precise definition for such a model is as follows:
Definition 3.1 (Hidden Community Model).
Let such that . Let define that is a symmetric matrix with where are independent for and
for given probability measures and .
We consider the BBP-type transition of the hidden community model with Bernoulli entries, i.e., and with , which also corresponds to the case or in (2.1). It is not hard to find that the transition occurs in the regime
for some (possibly -dependent) . After shifting and rescaling, we find that and
(3.1) |
See Appendix B.1 for the detail.
We performed the numerical simulation for the hidden community model. We set , , and . Following the analysis in Appendix B.1, we find that an outlier eigenvalue occurs if
In Figure 1, we compare the histograms of the eigenvalues of the shifted, rescaled adjacency matrices with and , respectively. As predicted by the analysis, the outlier appears only for the case .


3.2 Unbalanced stochastic block model
We next consider the case or with , which we will refer to an unbalanced stochastic block model. As in the hidden community model, the transition occurs in the regime . After shifting and rescaling, we find that and
See Appendix B.2 for the detail. Note that the transition does not depend on .
We performed the numerical simulation for the unbalanced stochastic block model. As in the hidden community model, we set , , and . An outlier eigenvalue occurs if
In Figure 2, we compare the histograms of the eigenvalues of the shifted, rescaled adjacency matrices with and , respectively. Again, as predicted by the analysis, the outlier appears only for the case .


4 Proof of Theorem 2.1
Recall that we denote by and the two largest eigenvalues of . Let and be the two largest eigenvalues of . From the result on the Wigner-type matrices, we find that and converge to the rightmost edge of the limiting ESD of . By the Cauchy interlacing formula, we have the inequality
which shows that also converges to the rightmost edge of the limiting ESD of .
From the minimax principle,
which shows that is an increasing function of . Further, since and
we find that if and if . Thus, since is a continuous function of , conditional on there exists such that the statement of Theorem 2.1 holds. It thus remains to show that and are deterministic in the sense that the same statement holds without conditioning on .
Our proof is based on the Stieltjes transform method in random matrix theory for which we use the following definition:
Definition 4.1 (Stieltjes Transform).
Let be a probability measure on the real line. The Stieltjes transform of is defined by
for .
For the noise , we consider its resolvent defined by
(4.1) |
for . Note that the normalized trace is equal to the Stieltjes transform of the empirical spectral distribution (ESD) of .
To find the largest eigenvalue of , we recall that any eigenvalue of satisfies
(4.2) |
which can be further decomposed into
(4.3) |
Thus, if is not an eigenvalue of , we find that , and hence
We now claim that has rank one. To prove the claim, we notice that for any ,
It implies that the range of is contained in , which is a 1-dimensional space.
Since has rank 1, it has only one non-zero eigenvalue, which we call . Then, is for otherwise every eigenvalues of is non-zero, contradicting (4.3). Furthermore, it is also obvious that is an eigenvector associated with the eigenvalue . Thus,
which leads us to the equation
(4.4) |
For the noise matrix , which is a Wigner-type matrix considered in [4], we have that
(4.5) |
for any not contained in an open neighborhood of the support of the limiting ESD of , where we let be the solution to the quadratic vector equation (QVE)
(4.6) |
is satisfied for . (See Appendix A for the precise statement of (4.5).) We remark that the uniqueness of the solution for (4.6) is also known [3].
To solve the equations (4.4) and (4.6), we need to estimate from the assumption on the community structure in Definition 2.1. Assume for the simplicity that . From the symmetry, we have an ansatz
Then, we can rewrite (4.6) as
which can be further simplified (after omitting the -dependence) to
(4.7) |
We can thus conclude that if there exists real that solves (4.7) under the assumption
(4.8) |
then converges to with high probability. Since (4.8) is deterministic, we find that the gap is deterministic in the supercritical case .
It remains to find the critical . Recall that we set . From the Stieltjes inversion formula, the (normalized) imaginary part of corresponds to the density of the limiting ESD of at . Thus, after changing (4.7) as a single equation involving and only, i.e., , we find that the upper edge of the ESD of is the largest real number such that has a double root when considered as an equation for . (Note that technically the condition can be checked by solving and simultaneously.) We can thus conclude that is determined as the largest number such that when the solution for the equation (4.7) under the assumption (4.8) coincides with . This in particular shows that is also deterministic and completes the proof of Theorem 2.1.
5 Conclusion and Future Works
In this paper, we considered the generalized stochastic block model with two communities. We showed the phase transition in the GSBM where the random part is the Wigner-type matrix, which extends the BBP transition. For the precise formulas, we discussed a hidden community model and unbalanced stochastic block model with Bernoulli distribution and Gaussian distribution at the Kesten–Stigum threshold. Both models can be improved for a non-Gaussian case.
We believe that it is possible to prove the phase transition for the sparse matrix in which the data matrix is not necessarily symmetric and most of the elements are composed of zeros. We also hope to extend our result to the GSBM with more than two communities.
Acknowledgments
The authors were supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2019R1A5A1028324).
Appendix A Local law for Wigner-type matrices
In this section, we provide a precise statement of the local law for Wigner-type matrices, which was used in the proof of Theorem 2.1 in Section 4. Wigner-type matrices are defined as follows:
Definition A.1.
(Wigner-type matrix) We say an random matrix is a Wigner-type matrix if the entries of are independent real symmetric variables satisfying the following conditions:
-
•
for all .
-
•
The variance matrix where satisfies
for finite parameters .
For the precise statement of the local law, we use the following definitions, which are frequently used in the analysis involving rare events in random matrix theory.
Definition A.2.
(Overwhelming probability) An event holds with overwhelming probability if for any big enough , for any sufficient large .
Definition A.3.
(Stochastic domination) Let consider two families of non-negative random variables:
where is N-dependent parameter set. Suppose is a given function depending on and . If for small enough and big enough, we have
then is stochastically dominated by which denoted by .
We are now ready to state the local law. Let be the solution of QVE in (4.6) and is the density defined as
(See also Corollary 1.3 of [4] for more detail.)
Theorem A.1 (Local law).
Let be a Wigner-type matrix and fix an arbitrary . Then, uniformly for all with , the resolvent satisfy
Furthermore, for any deterministic vector with , we have
The local law can be generalized to the anisotropic local law as follows.
Theorem A.2 (Anisotropic law).
Suppose that the assumptions in Theorem A.1 hold. Then, uniformly for all with , and for any two deterministic -normalized vectors , we have
Appendix B Examples from stochastic block models
In this appendix, we consider stochastic block models, which corresponds to GSBMs with Bernoulli distribution in our setting. Suppose that is an SBM such that
(B.1) |
In what follows, we will call the -entry is in the diagonal block if or , and otherwise it is in the off-diagonal block. In the block matrix form, it can also be expressed as follows:
\explainA\explainB(Bernoulli(p1) Bernoulli(q) Bernoulli(q) Bernoulli(p2) )\explainC\explainD |
Our goal is to shift and rescale to convert it into a GSBM in Definition 2.1. We first notice that the variances of the entries of are and for the diagonal block and for the off-diagonal block. Since we assume that the variance of the entry in the off-diagonal block is , we find that the matrix must be divided by . It is then immediate to find that
(B.2) |
as in (2.1).
The mean matrix
is a rank- matrix, and thus we need to subtract each entry by a deterministic number, which depends on the parameters , and .
B.1 Hidden community model
Suppose that and . It is then easy to find that becomes a rank- matrix after subtracting each entry by , i.e., if we let be the matrix whose all entries are , then
Thus, we find that
(B.3) |
and
(B.4) |
Recall that and . Since , we get
u= (1γN0 )\explainC\explainD, |
i.e., and . We also find that
Following the proof of Theorem 2.1 in Section 4, we solve the system of equation in (4.7),
(B.5) |
Since , we consider an ansatz , which shows for that
Following the analysis in the last paragraph of Section 4, we find that the upper edge . By Theorem 2.1, it also implies that as .
In order to determine the location of the largest eigenvalue , we consider (B.5) under the assumption in (4.8),
(B.6) |
We remark that the ansatz can be directly checked in this case; by plugging (B.6) into (B.5) and eliminating ,
whose solution is
To find the location of the largest eigenvalue, we need to check whether the assumption (B.6) is valid. However, we can instead find the value of by first assuming that the solution exists. Then,
At the critical for the phase transition in Theorem 2.1, the location of the largest eigenvalue coincides with the location of the upper edge in the limit , or equivalently, . Thus, we conclude that
B.2 Unbalanced stochastic model
Suppose that . Following the strategy in Appendix B.1, we let be the matrix whose all entries are . Then,
Thus, we find that
(B.7) |
and
(B.8) |
From , we get
u= (1N-1N)\explainC\explainD, |
i.e., and . Also,
References
- [1] E. Abbe. Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research, 18(1):6446–6531, 2017.
- [2] E. Abbe, A. Bandeira, and G. Hall. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2014.
- [3] O. Ajanki, L. Erdős, and T. Krüger. Quadratic vector equations on complex upper half-plane. American Mathematical Society, 261(1261), 2019.
- [4] O. Ajanki, L. Erdős, and T. Krüger. Universality for general wigner-type matrices. Probability Theory and Related Fields, 169(3):667–727, 2017.
- [5] J. Baik, G. Ben Arous, and S. Péché. Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643–1697, 2005.
- [6] J. Barbier, M. Dia, N. Macris, F. Krzakala, T. Lesieur, and L. Zdeborová. Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula. Advances in Neural Information Processing Systems, 29:424–432, 2016.
- [7] F. Benaych-Georges and R. R. Nadakuditi. The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Advances in Mathematics, 227(1):494–521, 2011.
- [8] M. Capitaine, C. Donati-Martin, and D. Féral. The largest eigenvalues of finite rank deformation of large Wigner matrices: convergence and nonuniversality of the fluctuations. The Annals of Probability, 37(1):1–47, 2009.
- [9] P.-Y. Chen and A. Hero. Universal phase transition in community detectability under a stochastic block model. Physical Review E, 91(3):032804, 2015.
- [10] H. W. Chung and J. O. Lee. Weak detection of signal in the spiked wigner model. In International Conference on Machine Learning, 97:1233–1241, 2019.
- [11] I. Dumitriu and Y. Zhu. Sparse general Wigner-type matrices: Local law and eigenvector delocalization. Journal of Mathematical Physics, 60(2):023301, 2019.
- [12] A. El Alaoui, F. Krzakala, and M. I. Jordan. Fundamental limits of detection in the spiked Wigner model. The Annals of Statistics, 48(2):863–885, 2020.
- [13] L. Erdős, T. Krüger, and D. Schröder. Cusp universality for random matrices I: local law and the complex hermitian case. Communications in Mathematical Physics, 378(2):1203–1278, 2020.
- [14] L. Erdős and P. Mühlbacher. Bounds on the norm of Wigner-type random matrices. Random Matrices: Theory and Applications, 8(03):1950009, 2019.
- [15] D. Féral and S. Péché. The largest eigenvalue of rank one deformation of large Wigner matrices. Communications in mathematical physics, 272(1):185–228, 2007.
- [16] B. Hajek, Y. Wu, and J. Xu. Information limits for recovering a hidden community. IEEE Transactions on Information Theory, 63(8):4729-4745, 2017.
- [17] B. Hajek, Y. Wu, and J. Xu. Achieving exact cluster recovery threshold via semidefinite programming. IEEE Transactions on Information Theory, 62(5):2788–2797, 2016.
- [18] B. Hajek, Y. Wu, and J. Xu. Recovering a hidden community beyond the Kesten–Stigum threshold in time. Journal of Applied Probability, 55(2):325–352, 2018.
- [19] J. H. Jung, H. W. Chung, and J. O. Lee. Weak detection in the spiked wigner model with general rank. arXiv:2001.05676, 2020.
- [20] S. Péché. The largest eigenvalue of small rank perturbations of Hermitian random matrices. Probability Theory and Related Fields, 134(1):127–173, 2006.
- [21] A. Perry, A. S. Wein, A. S. Bandeira, and A. Moitra. Optimality and sub-optimality of PCA I: Spiked random matrix models. The Annals of Statistics, 46(5):2416–2451, 2018.
- [22] N. Stanley, T. Bonacci, R. Kwitt, M. Niethammer, and P. J. Mucha. Stochastic block models with multiple continuous attributes. Applied Network Science, 4(1):1–22, 2019.
- [23] Y. Zhu. A graphon approach to limiting spectral distributions of wigner-type matrices. Random Structures & Algorithms, 56(1):251–279, 2020.