
Time-Inhomogeneous Random Walks on Finite Groups and Cokernels of Random Integer Block Matrices

Elia Gorokhovsky, Harvard University, Cambridge, USA
Abstract.

We study time-inhomogeneous random walks on finite groups in the case where each random walk step need not be supported on a generating set of the group. When the supports of the random walk steps satisfy a natural condition involving normal subgroups of quotients of the group, we show that the random walk converges to the uniform distribution on the group and give bounds for the convergence rate using spectral properties of the random walk steps. As an application, we use the moment method of Wood to prove a universality theorem for cokernels of random integer matrices allowing some dependence between entries.

1. Introduction

The work in this paper is motivated by a question in the theory of integer random matrices but is of independent interest to the study of random walks on groups.

Random walks on finite groups are well-studied in the reversible, time-homogeneous, ergodic regime, where the random walk on a group $G$ consists of a product $X_{1}X_{2}\dots X_{n}$ for i.i.d. $X_{i}$ drawn from a distribution supported on a generating set of $G$. Such random walks are known to converge to the uniform distribution $\pi$ on $G$ exponentially quickly. Namely, if we denote by $\nu_{n}$ the distribution of $X_{1}X_{2}\dots X_{n}$, then

$$||\nu_{n}-\pi||_{L^{2}}\leq\sigma^{n},$$

where $\sigma$ is the second-largest singular value of the Markov operator of the random walk and $||\cdot||_{L^{2}}$ denotes the $L^{2}$ norm of signed measures as functions $G\to\mathbb{R}$. See [Sal04] for an excellent review of these kinds of walks.

The above inequality comes from looking at norms of convolution operators on the space of signed measures on $G$. If $X$ and $Y$ are random elements of $G$ distributed according to $\mu$ and $\nu$ respectively, then $XY$ is distributed according to the convolution

$$(\mu*\nu)(g)=\sum_{h\in G}\mu(h)\nu(h^{-1}g).$$

In particular, if the $X_{i}$ are distributed according to $\mu$, then $\nu_{n}$ is the $n$-fold convolution $\mu^{*n}=\underbrace{\mu*\dots*\mu}_{n\text{ times}}$.
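As a concrete illustration (ours, not from the paper), the following sketch iterates convolution of a lazy step distribution on $\mathbb{Z}/5\mathbb{Z}$ and checks the exponential bound $||\nu_{n}-\pi||_{L^{2}}\leq\sigma^{n}$ numerically; the choice of group and measure is an arbitrary example.

```python
import numpy as np

n = 5  # the cyclic group Z/5
mu = np.zeros(n)
mu[0] = mu[1] = 0.5  # lazy step: stay or add 1; supp(mu) generates Z/5

# Convolution operator as a matrix: (nu * mu)(g) = sum_h nu(h) mu(g - h)
M = np.array([[mu[(g - h) % n] for h in range(n)] for g in range(n)])

sigma = np.linalg.svd(M, compute_uv=False)[1]  # second-largest singular value
pi = np.full(n, 1.0 / n)                       # uniform distribution

nu = np.zeros(n)
nu[0] = 1.0                                    # start at the identity
for k in range(1, 21):
    nu = M @ nu                                # nu_k = mu^{*k}
    assert np.linalg.norm(nu - pi) <= sigma**k + 1e-12
print(sigma)  # cos(pi/5) ~ 0.809
```

The second-largest singular value here is $\cos(\pi/5)$, and the distance to uniform indeed decays at least that fast.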

Some of the assumptions can be relaxed; for instance, Saloff-Coste and Zúñiga [SZ07] studied convergence of time-inhomogeneous Markov chains, including random walks on finite groups, in the case where each step of the random walk is irreducible (in particular, supported on a generating set of $G$). In that case, if we denote by $\sigma_{i}$ the second-largest singular value of the $i$th step,

$$||\nu_{n}-\pi||_{L^{2}}\leq\prod_{i=1}^{n}\sigma_{i}.$$

The condition that each step of the random walk is supported on a generating set is crucial because if the subgroup generated by the supports of the steps is a proper subgroup of $G$, the random walk will surely stay in that subgroup. Nevertheless, one may expect that if the supports of all steps taken together generate $G$, the random walk might still equilibrate to the uniform distribution on $G$.

A consequence of our first main result is the following theorem, which relaxes this “generating” assumption by extending part of [SZ07, Theorem 3.5] to some time-inhomogeneous random walks where the probability measures driving each step need not be irreducible:

Theorem 1.1.

Let $G$ be a finite group, and let $\mu_{1},\mu_{2},\dots,\mu_{n}$ be probability measures on $G$. For each subgroup $H$ of $G$, let $I_{H}=\{i\mid H=\langle\operatorname{supp}\mu_{i}\rangle\}$. Let $\mathcal{S}$ be a finite set of normal subgroups of $G$ such that $G=\left\langle\bigcup_{H\in\mathcal{S}}H\right\rangle$. Write $\nu_{n}=\mu_{1}*\dots*\mu_{n}$. Also, for each $i$, let $\sigma_{i}$ be the second-largest singular value of $*\mu_{i}$ as an operator on $L^{2}(\langle\operatorname{supp}\mu_{i}\rangle)$. Let $\pi$ be the uniform distribution on $G$.

If $I_{H}$ is nonempty for each $H\in\mathcal{S}$, we have

$$||\nu_{n}-\pi||_{L^{2}}\leq\sum_{H\in\mathcal{S}}\left(\prod_{i\in I_{H}}\sigma_{i}\right).$$

We prove a more general version of this result in Theorem 3.1.
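To see the bound of Theorem 1.1 in action, here is a small numerical sketch (our own construction): on $G=\mathbb{Z}/6\mathbb{Z}$, alternate steps supported on the normal subgroups $H_{1}=\{0,2,4\}$ and $H_{2}=\{0,3\}$, which jointly generate $G$ although neither does alone, and compare $||\nu_{n}-\pi||_{L^{2}}$ with $\sum_{H}\prod_{i\in I_{H}}\sigma_{i}$.

```python
import numpy as np

n = 6  # G = Z/6; H1 = {0,2,4} and H2 = {0,3} jointly generate G

def conv_op(mu):
    # matrix of the operator nu -> nu * mu on L^2(Z/6)
    return np.array([[mu[(g - h) % n] for h in range(n)] for g in range(n)])

mu1 = np.zeros(n); mu1[0] = mu1[2] = 0.5        # supported on H1 ~ Z/3
mu2 = np.zeros(n); mu2[0] = 0.7; mu2[3] = 0.3   # supported on H2 ~ Z/2

# second-largest singular values of the steps restricted to their subgroups
C3 = np.array([[0.5 if (g - h) % 3 in (0, 1) else 0.0 for h in range(3)]
               for g in range(3)])              # mu1 viewed as a walk on H1 ~ Z/3
s1 = np.linalg.svd(C3, compute_uv=False)[1]     # = 1/2
s2 = abs(1 - 2 * 0.3)                           # = 0.4, on H2 ~ Z/2

pi = np.full(n, 1.0 / n)
nu = np.zeros(n); nu[0] = 1.0
M1, M2 = conv_op(mu1), conv_op(mu2)
for k in range(1, 11):
    nu = M2 @ (M1 @ nu)           # one step on H1, then one on H2
    bound = s1**k + s2**k         # Theorem 1.1 with S = {H1, H2}
    assert np.linalg.norm(nu - pi) <= bound + 1e-12
print(np.linalg.norm(nu - pi), bound)
```

Neither step distribution is irreducible on $G$, so the bound of [SZ07] does not apply, but the walk still converges at the rate Theorem 1.1 predicts.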

In particular, if a time-inhomogeneous random walk on a finite group has steps supported on enough subgroups, then it converges to the uniform distribution on the group with an exponential rate controlled by subgroups that appear infrequently or mix very slowly. Adding more probability measures to the convolution $\nu_{n}$ may not improve the convergence rate, but it never makes the bound worse because convolution with a probability measure is non-expansive in the $L^{2}$ norm. A nice consequence of this is that $\mathcal{S}$ need not be an exhaustive list of every normal subgroup for which $I_{H}$ is nonempty.

The main difference between this result and [SZ07, Theorem 3.5] is that [SZ07] relaxes the time-homogeneity assumption for random walks but not the assumption that each step is supported on a generating set for the group. The new condition, that the supports of the steps jointly generate $G$, is substantially weaker than the assumption that the support of each step generates $G$.

The conditions of Theorem 1.1 can be weakened so that not all of the subgroups need to be normal (see Theorem 3.1), but see Example 3.3 for why some hypothesis on the subgroups is necessary.

Our main interest in developing this theorem is an application to limiting distributions of cokernels of random matrices. Wood [Woo19, Theorem 2.9] and Nguyen and Wood [NW22, Theorem 1.1] showed that cokernels of integer-valued random matrices approach a universal limiting distribution in the following sense. Let $(M_{n})_{n=1}^{\infty}$ be a sequence with each $M_{n}$ a random $n\times(n+u)$ integer matrix with independent entries ($u\geq 0$). Wood showed that, under very weak conditions on the distributions of the entries of the $M_{n}$, the distribution of the isomorphism class of the random group $\operatorname{coker}(M_{n})\coloneqq\mathbb{Z}^{n}/M_{n}(\mathbb{Z}^{n+u})$ converges weakly (i.e., at any finite collection of primes) as $n\to\infty$ to the distribution $\lambda_{u}$ on isomorphism classes of profinite abelian groups defined as follows: if $A\sim\lambda_{u}$ and $B$ is a finite abelian $p$-group, then

$$\mathbb{P}[A_{p}\cong B]=\frac{1}{|B|^{u}|\operatorname{Aut}(B)|}\prod_{k=u+1}^{\infty}(1-p^{-k})$$

independently for all primes $p$, where $A_{p}$ is the $p$-part of $A$. If further $u\geq 1$, then $\lambda_{u}$ is supported on isomorphism classes of finite abelian groups, and for finite abelian $B$ we have

$$\lambda_{u}(B)=\frac{1}{|B|^{u}|\operatorname{Aut}(B)|}\prod_{k=u+1}^{\infty}\zeta(k)^{-1},$$

where $\zeta$ denotes the Riemann zeta function. Nguyen and Wood weakened the conditions on the entries and showed strong (pointwise) convergence to $\lambda_{u}$. The phenomenon that the limiting distribution of $\mathbb{Z}^{n}/M_{n}(\mathbb{Z}^{n+u})$ is rather insensitive to the distributions of the entries of $M_{n}$ is an example of universality. The distributions $\lambda_{0}$ and $\lambda_{1}$ are known as the Cohen-Lenstra distributions, and are conjectured to describe the distributions of class groups of imaginary and real quadratic number fields, respectively.
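As a quick numerical sanity check (ours, not the paper's), one can evaluate the normalizing product for $u=1$: the $\lambda_{1}$-probability of the trivial group is $\prod_{k\geq 2}\zeta(k)^{-1}\approx 0.4358$, the familiar Cohen-Lenstra prediction for the proportion of trivial class groups.

```python
# Numerically evaluate prod_{k>=2} zeta(k)^{-1}, the lambda_1-mass of the
# trivial group (|B| = 1, |Aut(B)| = 1).
def zeta(k, N=2000):
    # truncated Dirichlet series plus an integral-tail estimate (error ~ N**-k)
    return sum(m**-k for m in range(1, N + 1)) + N**(1 - k) / (k - 1)

p_trivial = 1.0
for k in range(2, 80):       # factors with k >= 80 differ from 1 by < 2**-80
    p_trivial /= zeta(k)
print(p_trivial)             # ~ 0.4358
```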

In her 2022 ICM talk, Wood [Woo23, Open Problem 3.10] asks if the universality class of $\lambda_{u}$ can be extended to cokernels of matrices with some dependent entries. There are a few specific results in this direction. Most recently, Nguyen and Wood [NW22, Theorem 1.1] show that the distribution $\lambda_{1}$ is universal for Laplacians of Erdős-Rényi random directed graphs. Mészáros [Més20] shows that $\lambda_{0}$ is universal for Laplacians of random regular directed graphs. Friedman and Washington [FW89] show that the cokernels of the random matrices $I-M$, where $M$ is drawn at random from the multiplicative Haar measure on $\operatorname{GL}_{2g}(\mathbb{Z}_{p})$, approach the $p$-part of $\lambda_{0}$ as $g\to\infty$. However, when there is too much dependence in the entries of the random matrices, one gets different (but often related) limiting distributions, for example in the case of symmetric matrices ([Woo14]), Laplacians of random regular undirected graphs ([Més20]), products of independent random integral matrices ([NV24]), and quadratic polynomials in Haar-random matrices ([CK24]). It is natural to ask just how much (and what kind of) dependence is allowed between the entries of sequences of random matrices before their cokernels leave the universality class of $\lambda_{u}$.

The main application of Theorem 3.1 in this paper is Theorem 1.2 below, which extends the result of [Woo19] to matrices with some dependence in their rows and columns. We introduce a regularity condition on matrices, $(w,h,\varepsilon)$-balanced, in Definitions 4.2 and 4.7. Roughly, it means that the matrix can be written as a block matrix where the blocks have height at most $h$ and width at most $w$, are all independent, and each satisfies a regularity condition depending on $\varepsilon$. The key detail is that the blocks of the matrix may have dependent entries, as long as there is no dependence between blocks. (The $(w,h,\varepsilon)$-balanced condition is invariant under permutation of rows and columns, so one can also think of a $(w,h,\varepsilon)$-balanced matrix as a block matrix which is at most $h$ blocks tall, at most $w$ blocks wide, and such that the entries of each block are independent of each other.) With this condition, we have:

Theorem 1.2.

Let $u\geq 0$ be an integer. Let $(w_{n})_{n},(h_{n})_{n},(\varepsilon_{n})_{n}$ be sequences of real numbers such that $w_{n}=o(\log n)$, $h_{n}=O(n^{1-\alpha})$, and $\varepsilon_{n}\geq n^{-\beta}$ for some $0<\alpha\leq 1$ and $0<\beta<\alpha/2$.

For each integer $n\geq 0$, let $M_{n}$ be a $(w_{n},h_{n},\varepsilon_{n})$-balanced $n\times(n+u)$ random matrix with entries in $\mathbb{Z}$. Then the distribution of $\operatorname{coker}(M_{n})$ converges weakly to $\lambda_{u}$ as $n\to\infty$. In other words, if $Y\sim\lambda_{u}$, then for every positive integer $a$ and every abelian group $H$ with exponent dividing $a$ we have

$$\lim_{n\to\infty}\mathbb{P}[\operatorname{coker}(M_{n})\otimes\mathbb{Z}/a\mathbb{Z}\cong H]=\mathbb{P}[Y\otimes\mathbb{Z}/a\mathbb{Z}\cong H].$$
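For intuition about what convergence at a single prime means, here is a small simulation (ours; it uses i.i.d. uniform entries mod 2, i.e., the already-known independent case rather than the block-dependent setting of the theorem). For $u=1$ and $a=2$, the prediction is $\mathbb{P}[\operatorname{coker}(M_{n})\otimes\mathbb{Z}/2\mathbb{Z}=0]\to\prod_{k\geq 2}(1-2^{-k})\approx 0.5776$, which is the limiting probability that a random $n\times(n+1)$ matrix over $\mathbb{F}_{2}$ has full rank.

```python
import random

random.seed(1)

def rank_gf2(vecs):
    # rank of F_2-vectors stored as Python-int bitmasks (Gaussian elimination)
    basis = {}  # leading-bit position -> basis vector
    for v in vecs:
        while v:
            lead = v.bit_length() - 1
            if lead in basis:
                v ^= basis[lead]
            else:
                basis[lead] = v
                break
    return len(basis)

n, trials = 40, 4000
hits = 0
for _ in range(trials):
    cols = [random.getrandbits(n) for _ in range(n + 1)]  # n x (n+1) matrix mod 2
    if rank_gf2(cols) == n:   # full rank <=> coker(M) tensor Z/2 is trivial
        hits += 1
est = hits / trials

target = 1.0
for k in range(2, 60):
    target *= 1 - 2.0**-k     # lambda_1 prediction at the prime 2
print(est, target)            # both ~ 0.58
```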

The key idea of the proof of Theorem 1.2 uses the moment method developed in [Woo14] and [Woo19]. Understanding the cokernel of a random integer matrix reduces to finding the probability that each random column maps to zero under an arbitrary surjective group homomorphism $f\colon\mathbb{Z}^{n}\to G$ for an arbitrary abelian group $G$. To handle dependent columns, we treat several columns at a time and look at the induced surjection $(\mathbb{Z}^{n})^{m}\to G^{m}$. We view the image of a random element of $(\mathbb{Z}^{n})^{m}$ as a random walk in $G^{m}$ and apply Theorem 3.1 to approximate the distribution of this image. Since the surjection $f$ is arbitrary, we have very little control over the distribution of the steps of this walk. In particular, they are almost never supported on all of $G^{m}$, which is why we need Theorem 3.1 to handle random walk steps supported on proper subgroups. The $(w,h,\varepsilon)$-balanced condition allows us to bound the singular values of the associated convolution operators and get quantitative bounds on the error in terms of $w$, $h$, and $\varepsilon$.

There is a considerable body of literature pertaining to random matrices with complex entries, with analogous universality results about distributions of eigenvalues. If $\{M_{n}\}$ is a sequence of $n\times n$ random complex matrices whose entries are independent, with appropriately normalized mean and variance, the empirical distribution of the eigenvalues of $M_{n}$ converges to the circular law, which is the uniform distribution on the unit disc in $\mathbb{C}$ [TVK10]. The universality of the circular law for spectra of a wide class of random complex block matrices was proved by Nguyen and O'Rourke in [NO15].

2. Notation and Terminology

For a finite set $S$, we use $L^{2}(S)$ to denote the space of signed measures (equivalently, real-valued functions) on $S$, equipped with the norm $||f||_{L^{2}(S)}^{2}=\sum_{s\in S}|f(s)|^{2}$. When the set $S$ is implicit, we write $||f||_{L^{2}}$ for $||f||_{L^{2}(S)}$. Any set map $f\colon S\to T$ defines a pushforward map $f_{*}\colon L^{2}(S)\to L^{2}(T)$ by $f_{*}\mu(t)=\mu(f^{-1}(t))$. We say a signed measure $\nu$ is uniform on $T\subset S$ if $\nu(t)=\nu(t^{\prime})$ for all $t,t^{\prime}\in T$. For a point $f\in\mathbb{R}^{n}$ with the Euclidean metric and a linear subspace $W\subset\mathbb{R}^{n}$, we write $d_{L^{2}}(f,W)$ for the distance between $f$ and its orthogonal projection onto $W$; note that this is equal to $\inf_{g\in W}|f-g|$. If $G$ is a finite group, any signed measure $\mu$ defines a linear convolution operator (or, if $\mu$ is a probability measure, Markov operator) $*\mu$ on $L^{2}(G)$ given by $\nu\mapsto\nu*\mu$. When we discuss the second-largest singular value of an operator, we are counting with multiplicity; for example, if the singular values of $M$ are $1,1,0$, then its second-largest singular value is $1$.

For two finite or profinite groups $G,G^{\prime}$, we write $\operatorname{Hom}(G,G^{\prime})$ for the set of (continuous) group homomorphisms from $G$ to $G^{\prime}$ and $\operatorname{Sur}(G,G^{\prime})$ for the set of (continuous) surjective group homomorphisms from $G$ to $G^{\prime}$. For a subset $S\subseteq G$, we denote by $\langle S\rangle$ the (closed) subgroup of $G$ generated by $S$. We refer to the identity element of a group as $e$.

A probability measure or distribution is a measure with total mass 1 (not signed). The uniform distribution on $G$ is usually denoted $\pi$ and is the measure on $G$ with $\pi(g)=1/|G|$ for all $g\in G$. We use $\mathbb{P}[\cdot]$ for probability and $\mathbb{E}[\cdot]$ for expectation. We denote by $\operatorname{supp}\mu$ the support of a measure $\mu$. If a random variable $X$ has law $\mu$, we write $X\sim\mu$.

For a positive integer $n$, we write $[n]$ for the set $\{1,\dots,n\}$.

3. Random Walks

This section is devoted to proving the following stronger version of Theorem 1.1 from the introduction:

Theorem 3.1.

Let $G$ be a finite group and suppose we have a sequence of surjective homomorphisms

$$G=G_{0}\xtwoheadrightarrow{Q_{1}}G_{1}\xtwoheadrightarrow{Q_{2}}G_{2}\xtwoheadrightarrow{Q_{3}}\dots\xtwoheadrightarrow{Q_{k-1}}G_{k-1}\xtwoheadrightarrow{Q_{k}}G_{k}=\{e\}.$$

For $0\leq j\leq k$, define $\tilde{Q}_{j}\colon G\twoheadrightarrow G_{j}$ by $\tilde{Q}_{j}=Q_{j}\circ Q_{j-1}\circ\dots\circ Q_{1}$ (so $\tilde{Q}_{0}=\operatorname{id}_{G}$), and for $1\leq j\leq k$ define $H_{j}\trianglelefteq G_{j-1}$ by $H_{j}=\ker Q_{j}$.

Let $\mu_{1},\dots,\mu_{n}$ be probability measures on $G$, and let $\nu_{n}=\mu_{1}*\dots*\mu_{n}$. For each $j=1,\dots,k$, let $I_{j}=\{i\mid\langle\operatorname{supp}(\tilde{Q}_{j-1})_{*}\mu_{i}\rangle=H_{j}\}$. Let $\pi$ be the uniform distribution on $G$.

For $i\in I_{j}$, let $\sigma_{i}$ be the second-largest singular value of the $(\tilde{Q}_{j-1})_{*}\mu_{i}$-random walk on $H_{j}$. If each $I_{j}$ is nonempty, we have

$$||\nu_{n}-\pi||_{L^{2}}^{2}\leq\sum_{j=1}^{k}\frac{|G_{j-1}|-1}{|G|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right)=\sum_{j=1}^{k}\frac{\prod_{i=j}^{k}|H_{i}|-1}{|G|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right).$$

In the case where $k=1$ and $H_{1}=G$, we recover the first part of [SZ07, Theorem 3.5]. We postpone the proof that Theorem 3.1 implies Theorem 1.1 until the end of this section.

The following example gives a case that is covered by Theorem 3.1 but not by Theorem 1.1:

Example 3.2.

Consider the dihedral group $G=D_{2n}=\langle r,s\mid r^{n}=s^{2}=(rs)^{2}=e\rangle$ with $n>2$. The subgroup $H_{1}=\langle r\rangle$ is normal, but the subgroup $\tilde{H}_{2}=\langle s\rangle$ is not normal. We have $G_{1}=D_{2n}/\langle r\rangle=\mathbb{Z}/2\mathbb{Z}$ generated by the image of $s$, so the image of $\langle s\rangle$ in the quotient is normal. Let $Q\colon G\twoheadrightarrow\mathbb{Z}/2\mathbb{Z}$ be the quotient map. Let $\mu$ be a measure on $\langle r\rangle$ with second-largest singular value $\sigma$. Consider the following random walk on $D_{2n}$:

  • For $i$ odd, $\mu_{i}=\mu$.

  • For $i$ even, $\mu_{i}(s)=p$ and $\mu_{i}(e)=1-p$.

Say $i$ is even. In matrix form, the operator $*(Q_{*}\mu_{i})$ on $L^{2}(\mathbb{Z}/2\mathbb{Z})\cong\mathbb{R}^{2}$ is given by $\begin{pmatrix}1-p&p\\ p&1-p\end{pmatrix}$. It is symmetric, so its singular values are the absolute values of its eigenvalues. The vector $\begin{pmatrix}1\\ -1\end{pmatrix}$ is a $(1-2p)$-eigenvector. Thus, the singular values of $*(Q_{*}\mu_{i})$ are $1$ and $|1-2p|$, so $\sigma_{i}=|1-2p|$. Theorem 3.1 says that

$$||\mu_{1}*\dots*\mu_{2k}-\pi||_{L^{2}}^{2}\leq\sigma^{k}+|1-2p|^{k}.$$

In particular, if $p=1/2$, the random walk mixes on $D_{2n}$ half as fast as the $\mu$-random walk mixes on $\langle r\rangle\cong\mathbb{Z}/n\mathbb{Z}$.
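The bound in this example can be checked numerically. The sketch below (our own check) builds the walk on $D_{8}$ (so $n=4$), with $\mu$ uniform on $\{e,r\}$ (giving $\sigma=\sqrt{2}/2$) and $p=0.3$, and verifies the stated inequality after each pair of steps; these parameter choices are ours.

```python
import numpy as np

N = 4                          # G = D_{2N} = D_8; element (i, f) means r^i s^f
elems = [(i, f) for f in range(2) for i in range(N)]
idx = {g: t for t, g in enumerate(elems)}

def mul(a, b):                 # r^i s^f * r^j s^g = r^{i + (-1)^f j} s^{f+g}
    (i, f), (j, g) = a, b
    return ((i + (-1) ** f * j) % N, (f + g) % 2)

def conv_op(mu):               # matrix of nu -> nu * mu on L^2(G)
    M = np.zeros((2 * N, 2 * N))
    for h in elems:
        for x, w in mu.items():
            M[idx[mul(h, x)], idx[h]] += w
    return M

p = 0.3
M_rot = conv_op({(0, 0): 0.5, (1, 0): 0.5})    # odd steps: mu on <r>
M_flip = conv_op({(0, 0): 1 - p, (0, 1): p})   # even steps: flip with prob p

# sigma = second-largest singular value of the mu-walk on <r> ~ Z/4
C = np.array([[0.5 if (g - h) % N in (0, 1) else 0.0 for h in range(N)]
              for g in range(N)])
sigma = np.linalg.svd(C, compute_uv=False)[1]  # = sqrt(2)/2

pi = np.full(2 * N, 1.0 / (2 * N))
nu = np.zeros(2 * N); nu[idx[(0, 0)]] = 1.0
for k in range(1, 13):
    nu = M_flip @ (M_rot @ nu)                 # two more steps of the walk
    bound = sigma**k + abs(1 - 2 * p) ** k     # the bound from the example
    assert np.linalg.norm(nu - pi) ** 2 <= bound + 1e-12
print(sigma, np.linalg.norm(nu - pi) ** 2, bound)
```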

Although Theorem 3.1 makes weaker assumptions on the subgroups than Theorem 1.1, it is not possible to fully remove the normality assumption, as the following example shows:

Example 3.3.

Consider the alternating group $A_{5}$. Recall that $A_{5}$ is generated by the 3-cycles $(1\;2\;3),(1\;2\;4),(1\;2\;5)$. Consider the following three-step time-inhomogeneous “random walk” on $A_{5}$: $X_{1}$ is uniformly distributed on $\langle(1\;2\;3)\rangle$, $X_{2}$ is uniformly distributed on $\langle(1\;2\;4)\rangle$, and $X_{3}$ is uniformly distributed on $\langle(1\;2\;5)\rangle$. The step distributions $\mu_{1},\mu_{2},\mu_{3}$ on the respective cyclic groups all have second-largest singular value zero. However, the product $X_{1}X_{2}X_{3}$ is not uniformly distributed on $A_{5}$. Indeed, when $X_{1}X_{2}X_{3}$ acts on the tuple $(1,2,3,4,5)$, $3$ can never end up in the fourth or fifth position, whereas if $X_{1}X_{2}X_{3}$ were uniform on $A_{5}$, $3$ would end up in the fourth and fifth position with probability $1/5$ each.
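This obstruction is easy to confirm by brute force. The sketch below (ours) composes permutations right-to-left, so that $X_{3}$ acts first, matching the product $X_{1}X_{2}X_{3}$; it enumerates all $27$ choices and records where the point $3$ can land.

```python
from itertools import product

def cyc(*c):                   # cycle on {0,...,4}; cyc(0,1,2) is (1 2 3) 1-indexed
    p = list(range(5))
    for a, b in zip(c, c[1:] + c[:1]):
        p[a] = b
    return tuple(p)

def compose(p, q):             # (p o q)(i) = p(q(i)): apply q first
    return tuple(p[q[i]] for i in range(5))

def power(p, k):
    r = tuple(range(5))
    for _ in range(k):
        r = compose(r, p)
    return r

g1, g2, g3 = cyc(0, 1, 2), cyc(0, 1, 3), cyc(0, 1, 4)  # (1 2 3), (1 2 4), (1 2 5)

images = set()
for a, b, c in product(range(3), repeat=3):   # all 27 choices of (X1, X2, X3)
    x = compose(power(g1, a), compose(power(g2, b), power(g3, c)))
    images.add(x[2])           # where does 3 (index 2) go?
print(sorted(i + 1 for i in images))  # [1, 2, 3]: never position 4 or 5
```

The reason is visible in the code: $X_{3}$ and $X_{2}$ both fix $3$, so $X_{1}X_{2}X_{3}$ sends $3$ into $\{1,2,3\}$, the orbit of $3$ under $\langle(1\;2\;3)\rangle$.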

Consider the space $\mathcal{M}=L^{2}(G)$ of $\mathbb{R}$-valued functions on a finite group $G$. Since $G$ is finite, $\mathcal{M}\cong\mathbb{R}^{G}$ (with the Euclidean norm). Let $\mathcal{M}_{0}=\{\nu\in\mathcal{M}\mid\nu(G)=0\}$. Let $\mathcal{P}\subseteq\mathcal{M}$ be the set of signed measures $\nu$ on $G$ with $\nu(G)=1$. Note that $\mathcal{P}$ and $\mathcal{M}_{0}$ are parallel affine hyperplanes in $\mathbb{R}^{G}$. Probability measures on $G$ are points in the simplex formed by the part of $\mathcal{P}$ in the positive orthant. The orthogonal complement of $\mathcal{M}_{0}$ is the line spanned by the uniform probability measure $\pi$ on $G$, and $\operatorname{span}\{\pi\}$ intersects $\mathcal{P}$ at $\pi$ and nowhere else.

Any measure $\mu_{i}$ on $G$ acts on $\mathcal{M}$ by convolution on the right. If $\mu_{i}$ is a probability measure, the convolution operator $M_{i}\colon\nu\mapsto\nu*\mu_{i}$ also sends $\mathcal{P}$ into itself. The following lemma tells us that, in this case, $M_{i}$ contracts the distance between points of $\mathcal{P}$ and $\pi$.

Lemma 3.4.

Let $G$ be any finite group and $\mu$ a probability measure on $G$. Let $M$ be the convolution operator $\nu\mapsto\nu*\mu$ and let $\sigma$ be the second-largest singular value of $M$ on $L^{2}(G)$.

  1. If $\nu$ is a signed measure on $G$, then

$$||M\nu||_{L^{2}}\leq||\nu||_{L^{2}}.$$

  2. If $\nu,\nu^{\prime}$ are signed measures on $G$ with $\nu(G)=\nu^{\prime}(G)$, then

$$||M\nu-M\nu^{\prime}||_{L^{2}}\leq\sigma||\nu-\nu^{\prime}||_{L^{2}}.$$
Proof.

Part (1) is a case of Young's convolution inequality for unimodular groups. To prove part (2), we will show that $\sigma$ is the $L^{2}$ operator norm of $M$ on the subspace of $L^{2}(G)$ consisting of signed measures with total mass 0.

Let $M^{*}$ be the adjoint operator to $M$. Observe that $M^{*}$ is also a convolution operator, given by $\nu\mapsto\nu*\check{\mu}$, where $\check{\mu}(g)=\mu(g^{-1})$ for $g\in G$. Thus, $M^{*}M$ is the convolution operator given by $\nu\mapsto\nu*\mu*\check{\mu}$. In particular, $M$ and $M^{*}M$ are each given by convolution with a probability measure, so they have a shared 1-eigenvector: the uniform measure $\pi$ on $G$. The largest singular value of a real matrix coincides with its $L^{2}$ operator norm. By part (1), the operator norm of $M$ is at most 1, so the largest eigenvalue of $M^{*}M$ is exactly 1. Let $L^{2}(G)_{0}=\operatorname{span}\{\pi\}^{\perp}$. Since $\pi$ is an eigenvector of $M$, the operator $M$ restricts to an operator on $L^{2}(G)_{0}$, and since $(M|_{L^{2}(G)_{0}})^{*}(M|_{L^{2}(G)_{0}})=(M^{*}M)|_{L^{2}(G)_{0}}$, the singular values of $M|_{L^{2}(G)_{0}}$ are the singular values of $M$ with one copy of 1 (the largest singular value of $M$) excluded. Thus, the operator norm and largest singular value of $M|_{L^{2}(G)_{0}}$ equal the second-largest singular value of $M$, which is $\sigma$. If $\nu(G)=\nu^{\prime}(G)$, then $\nu-\nu^{\prime}\in L^{2}(G)_{0}$, so

$$||M(\nu-\nu^{\prime})||_{L^{2}}\leq\sigma||\nu-\nu^{\prime}||_{L^{2}}.$$
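The key step of this proof, that $\sigma$ equals the operator norm of $M$ on the mass-zero subspace, can be checked numerically. The sketch below (our own) does so for a generic probability measure on $S_{3}$.

```python
import numpy as np
from itertools import permutations

G = list(permutations(range(3)))            # S_3 as tuples; g maps i -> g[i]
idx = {g: i for i, g in enumerate(G)}
mul = lambda a, b: tuple(a[b[i]] for i in range(3))

rng = np.random.default_rng(0)
w = rng.random(len(G))
mu = dict(zip(G, w / w.sum()))              # generic full-support probability measure

# Convolution operator M: nu -> nu * mu, i.e. M[g, h] = mu(h^{-1} g)
M = np.zeros((6, 6))
for h in G:
    for x, wt in mu.items():
        M[idx[mul(h, x)], idx[h]] += wt

sigma = np.linalg.svd(M, compute_uv=False)[1]   # second-largest singular value

# Operator norm of M restricted to the mass-zero subspace span{pi}^perp
u = np.ones(6) / np.sqrt(6)
P = np.eye(6) - np.outer(u, u)              # orthogonal projector onto mass-zero
norm_on_M0 = np.linalg.norm(M @ P, 2)
assert abs(norm_on_M0 - sigma) < 1e-10

# Contraction statement of Lemma 3.4(2) for two probability measures
nu1 = rng.random(6); nu1 /= nu1.sum()
nu2 = rng.random(6); nu2 /= nu2.sum()
assert np.linalg.norm(M @ (nu1 - nu2)) <= sigma * np.linalg.norm(nu1 - nu2) + 1e-12
print(sigma)
```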

Remark 3.5.

When the support of $\mu$ does not generate $G$, the conclusion of Lemma 3.4(2) still holds. However, in that case $\sigma=1$, so Lemma 3.4(2) gives no useful information. For this reason, in Theorem 3.1 we write $I_{j}=\{i\mid\langle\operatorname{supp}(\tilde{Q}_{j-1})_{*}\mu_{i}\rangle=H_{j}\}$ rather than $I_{j}=\{i\mid\operatorname{supp}(\tilde{Q}_{j-1})_{*}\mu_{i}\subseteq H_{j}\}$.

The second part of Lemma 3.4 implies that if the support of $\mu_{i}$ generates $G$ and $\nu$ is a probability measure, then $||M_{i}\nu-\pi||_{L^{2}}\leq\sigma_{i}||\nu-\pi||_{L^{2}}$, so applying $M_{i}$ to a probability measure $n$ times contracts the distance to $\pi$ by a factor of $\sigma_{i}^{n}$. More generally, if the support of each $\mu_{i}$ generates $G$, then applying any combination of the $M_{i}$ in any order contracts the distance to $\pi$ by the appropriate product of $\sigma_{i}$ factors. However, when the support of $\mu_{i}$ is contained in a proper subgroup of $G$, the second-largest singular value of $M_{i}$ as an operator on $\mathcal{M}$ is always 1. In this case, Lemma 3.4(2) applied to $G$ gives no useful information.

The key idea in the proof of Theorem 3.1 is that even though we cannot say outright that applying the operator $M_{i}$ moves a probability measure closer to uniform, we will show in Lemma 3.8 that $M_{i}$ moves a probability measure closer to some (explicit) subspace of $\mathcal{M}$ containing $\pi$. This subspace depends on the subgroup generated by the support of $\mu_{i}$. If we choose enough subspaces that their intersection is just $\{\pi\}$, then we will be able to show that successive application of different $M_{i}$'s moves a probability measure closer to that intersection, that is, to the uniform probability measure. The condition that our chosen subspaces intersect in $\{\pi\}$ is exactly the condition $G=\left\langle\bigcup_{H\in\mathcal{S}}H\right\rangle$ in Theorem 1.1, or the condition that $G_{k}=\{e\}$ in Theorem 3.1.

For each subgroup $H\leq G$, let $\mathcal{M}_{H}\subseteq\mathcal{M}$ be the space of functions on $G$ which are uniform on each left coset of $H$ (i.e., for $\nu\in\mathcal{M}_{H}$ and $g_{1},g_{2}\in G$ with $g_{1}^{-1}g_{2}\in H$, we have $\nu(g_{1})=\nu(g_{2})$).

Lemma 3.6.

Let $G$ be a finite group and $H\leq G$ be a subgroup. Let $\nu\in\mathcal{M}$. Let $\tilde{\nu}\in\mathcal{M}_{H}$ be the signed measure on $G$ given by $\tilde{\nu}(gh)=\frac{\nu(gH)}{|H|}$ for $g\in G$ and $h\in H$. Then $\tilde{\nu}$ is the orthogonal projection of $\nu$ onto the subspace $\mathcal{M}_{H}$ of $\mathcal{M}$. In particular, $d_{L^{2}}(\nu,\mathcal{M}_{H})=||\nu-\tilde{\nu}||_{L^{2}}$.

Proof.

We have decompositions

$$\mathcal{M}_{H}=\bigoplus_{gH\in G/H}\operatorname{span}\{\pi_{gH}\}\subset\bigoplus_{gH\in G/H}L^{2}(gH)=\mathcal{M},$$

where $\pi_{gH}\in L^{2}(gH)$ is given by $\pi_{gH}(gh)=1/\sqrt{|H|}$ for all $h\in H$. The projection operator $\mathcal{M}\to\mathcal{M}_{H}$ decomposes as a direct sum of projection operators, one for each coset of $H$. In $L^{2}(gH)$, projection onto $\operatorname{span}\{\pi_{gH}\}$ is given by the inner product with $\pi_{gH}$, and we have $\left\langle\nu|_{gH},\pi_{gH}\right\rangle\pi_{gH}=\tilde{\nu}|_{gH}$, which means the projection of $\nu$ onto $\mathcal{M}_{H}$ is $\tilde{\nu}$. ∎
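Lemma 3.6's explicit formula is easy to verify numerically; the sketch below (ours) compares the coset-averaging formula with an orthogonal projection computed from an orthonormal basis of $\mathcal{M}_{H}$, for $G=\mathbb{Z}/6\mathbb{Z}$ and $H=\{0,3\}$.

```python
import numpy as np

# G = Z/6, H = {0, 3}; the left cosets are {0,3}, {1,4}, {2,5}
cosets = [[g, g + 3] for g in range(3)]

# Orthonormal basis of M_H: normalized indicator vectors of the cosets
B = np.zeros((6, 3))
for j, c in enumerate(cosets):
    for g in c:
        B[g, j] = 1 / np.sqrt(2)

rng = np.random.default_rng(0)
nu = rng.random(6)                      # a generic signed measure

proj = B @ (B.T @ nu)                   # orthogonal projection onto M_H

tilde = np.zeros(6)                     # Lemma 3.6: average nu over each coset
for c in cosets:
    avg = (nu[c[0]] + nu[c[1]]) / 2
    tilde[c[0]] = tilde[c[1]] = avg

assert np.allclose(proj, tilde)
print(np.linalg.norm(nu - tilde))       # = d_{L^2}(nu, M_H)
```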

Lemma 3.7.

Let $G$ be a finite group and $H$ a normal subgroup of $G$. Let $\mu\in L^{2}(G)$. Then $*\mu\colon L^{2}(G)\to L^{2}(G)$ preserves $\mathcal{M}_{H}$, i.e., $\mathcal{M}_{H}*\mu\subseteq\mathcal{M}_{H}$.

Proof.

Suppose $\nu\in\mathcal{M}_{H}$, so $\nu$ is uniform on each left coset of $H$. Since $H$ is normal, its left and right cosets coincide, so $\nu$ is uniform on each right coset of $H$. Say $X\sim\nu$ and $Y\sim\mu$ are independent, so $\nu*\mu$ is the distribution of $XY$. We have $\mathbb{P}[X=hg]=\mathbb{P}[X=h^{\prime}g]$ for all $h,h^{\prime}\in H$ and $g\in G$. For $y\in G$, we therefore have

$$\mathbb{P}[XY=hg\mid Y=y]=\mathbb{P}[X=hgy^{-1}]=\mathbb{P}[X=h^{\prime}gy^{-1}]=\mathbb{P}[XY=h^{\prime}g\mid Y=y]$$

for all $h,h^{\prime}\in H$ and $g\in G$. Summing over $y$ shows $(\nu*\mu)(hg)=(\nu*\mu)(h^{\prime}g)$ for all $h,h^{\prime}\in H$ and $g\in G$. So $\nu*\mu$ is uniform on right cosets (hence also left cosets) of $H$, and $\nu*\mu\in\mathcal{M}_{H}$. ∎
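Normality is genuinely needed here. The sketch below (our own check) convolves a coset-uniform measure on $S_{3}$ with a generic probability measure: with the normal subgroup $A_{3}$ the result stays coset-uniform, while with the non-normal subgroup $\langle(1\,2)\rangle$ it generically does not.

```python
import numpy as np
from itertools import permutations

G = list(permutations(range(3)))                 # S_3; g maps i -> g[i]
idx = {g: i for i, g in enumerate(G)}
mul = lambda a, b: tuple(a[b[i]] for i in range(3))

def left_cosets(H):
    seen, out = set(), []
    for g in G:
        c = frozenset(mul(g, h) for h in H)
        if c not in seen:
            seen.add(c)
            out.append(sorted(c))
    return out

def uniform_on_cosets(nu, H):                    # is nu in M_H (numerically)?
    return all(max(nu[idx[g]] for g in c) - min(nu[idx[g]] for g in c) < 1e-10
               for c in left_cosets(H))

def convolve(nu, mu):                            # (nu * mu)(g) = sum_{hx=g} nu(h) mu(x)
    out = np.zeros(6)
    for h in G:
        for x in G:
            out[idx[mul(h, x)]] += nu[idx[h]] * mu[idx[x]]
    return out

rng = np.random.default_rng(0)
w = rng.random(6); mu = w / w.sum()              # generic probability measure on G

def coset_uniform_measure(H):                    # random element of M_H
    nu = np.zeros(6)
    for c in left_cosets(H):
        nu[[idx[g] for g in c]] = rng.random()
    return nu

A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]           # normal: e and the two 3-cycles
H2 = [(0, 1, 2), (1, 0, 2)]                      # <(1 2)>, not normal

resN = uniform_on_cosets(convolve(coset_uniform_measure(A3), mu), A3)
res2 = uniform_on_cosets(convolve(coset_uniform_measure(H2), mu), H2)
assert resN
print(res2)  # generically False for the non-normal subgroup
```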

Lemma 3.8.

Let $G$ be a finite group and $H$ a normal subgroup of $G$. Let $\mu_{1},\dots,\mu_{n}$ be probability measures on $G$ and $\nu_{n}=\mu_{1}*\dots*\mu_{n}$. Let $I_{H}=\{1\leq i\leq n\mid\langle\operatorname{supp}\mu_{i}\rangle=H\}$. For each $i\in I_{H}$, let $\sigma_{i}$ be the second-largest singular value of $*\mu_{i}$ on $H$. Let $\mathcal{M}_{H}\subseteq L^{2}(G)$ be the set of signed measures on $G$ that are uniform on left cosets of $H$. Then

$$d_{L^{2}}(\nu_{n},\mathcal{M}_{H})^{2}\leq\frac{|G|-1}{|G|}\prod_{i\in I_{H}}\sigma_{i}^{2}.$$
Proof.

Let $\nu_{0}=\delta_{e}$ be the Dirac measure at the identity of $G$, so that $\nu_{m+1}=\nu_{m}*\mu_{m+1}$ for all $0\leq m<n$.

We will show the following statement $P(n)$ by induction on $n$:

$$d_{L^{2}}(\nu_{n},\mathcal{M}_{H})^{2}\leq\frac{|G|-1}{|G|}\prod_{\begin{subarray}{c}i\in I_{H}\\ i\leq n\end{subarray}}\sigma_{i}^{2}.$$

First, we have $d_{L^{2}}(\delta_{e},\mathcal{M}_{H})^{2}\leq||\delta_{e}-\pi||_{L^{2}}^{2}=1-\frac{1}{|G|}$. This proves $P(0)$.

Now suppose $P(n)$ holds. Let $\tilde{\nu}_{n}$ be the orthogonal projection of $\nu_{n}$ onto $\mathcal{M}_{H}$, described explicitly in Lemma 3.6. Then note that $||\nu_{n}-\tilde{\nu}_{n}||_{L^{2}}=d_{L^{2}}(\nu_{n},\mathcal{M}_{H})$.

There are two cases, depending on whether $n+1\in I_{H}$.

First, suppose $\langle\operatorname{supp}\mu_{n+1}\rangle\neq H$, so $n+1\notin I_{H}$. By Lemma 3.4(1), $||\nu_{n+1}-\tilde{\nu}_{n}*\mu_{n+1}||_{L^{2}}\leq||\nu_{n}-\tilde{\nu}_{n}||_{L^{2}}$.

By Lemma 3.7, we have $\tilde{\nu}_{n}*\mu_{n+1}\in\mathcal{M}_{H}$, and

$$d_{L^{2}}(\nu_{n+1},\mathcal{M}_{H})\leq||\nu_{n+1}-\tilde{\nu}_{n}*\mu_{n+1}||_{L^{2}}\leq||\nu_{n}-\tilde{\nu}_{n}||_{L^{2}}=d_{L^{2}}(\nu_{n},\mathcal{M}_{H}).$$

Now suppose $\langle\operatorname{supp}\mu_{n+1}\rangle=H$, so $n+1\in I_{H}$.

For $g\in G$, define $g_{*}\colon L^{2}(G)\to L^{2}(G)$ by $g_{*}\nu(h)=\nu(g^{-1}h)$. Note that $g_{*}$ is an automorphism of normed spaces, and for signed measures $\nu$ and $\mu$, we have $g_{*}(\nu*\mu)=g_{*}\nu*\mu$. For each left coset $gH$ of $H$ we have

$$\begin{aligned}||(\nu_{n}*\mu_{n+1})|_{gH}-(\tilde{\nu}_{n}*\mu_{n+1})|_{gH}||_{L^{2}(gH)}&=||(g^{-1}_{*}(\nu_{n}*\mu_{n+1}))|_{H}-(g^{-1}_{*}(\tilde{\nu}_{n}*\mu_{n+1}))|_{H}||_{L^{2}(H)}\\&=||(g^{-1}_{*}\nu_{n}*\mu_{n+1})|_{H}-(g^{-1}_{*}\tilde{\nu}_{n}*\mu_{n+1})|_{H}||_{L^{2}(H)}.\end{aligned}$$

Since $\mu_{n+1}$ is supported on a subset of $H$, for any $\nu\in L^{2}(G)$ we have $(\nu*\mu_{n+1})|_{H}=\nu|_{H}*\mu_{n+1}|_{H}$. Thus,

$$||(\nu_{n}*\mu_{n+1})|_{gH}-(\tilde{\nu}_{n}*\mu_{n+1})|_{gH}||_{L^{2}(gH)}=||g^{-1}_{*}\nu_{n}|_{H}*\mu_{n+1}|_{H}-g^{-1}_{*}\tilde{\nu}_{n}|_{H}*\mu_{n+1}|_{H}||_{L^{2}(H)}.$$

By Lemma 3.6, $\tilde{\nu}_{n}|_{gH}$ is uniform on $gH$ with total mass $\nu_{n}(gH)$, so $g^{-1}_{*}\tilde{\nu}_{n}|_{H}$ is uniform on $H$ with total mass $\nu_{n}(gH)$. Thus, $g^{-1}_{*}\tilde{\nu}_{n}|_{H}*\mu_{n+1}|_{H}=g^{-1}_{*}\tilde{\nu}_{n}|_{H}$.

Applying Lemma 3.4(2) on $H$, we get

$$\begin{aligned}||g^{-1}_{*}\nu_{n}|_{H}*\mu_{n+1}|_{H}-g^{-1}_{*}\tilde{\nu}_{n}|_{H}||_{L^{2}(H)}&\leq\sigma_{n+1}||g^{-1}_{*}\nu_{n}|_{H}-g^{-1}_{*}\tilde{\nu}_{n}|_{H}||_{L^{2}(H)}\\&=\sigma_{n+1}||(\nu_{n}-\tilde{\nu}_{n})|_{gH}||_{L^{2}(gH)}.\end{aligned}$$

Adding up over the cosets of $H$ gives

$$||\nu_{n+1}-\tilde{\nu}_{n}||_{L^{2}(G)}\leq\sigma_{n+1}||\nu_{n}-\tilde{\nu}_{n}||_{L^{2}(G)}.$$

Hence,

$$d_{L^{2}}(\nu_{n+1},\mathcal{M}_{H})\leq||\nu_{n+1}-\tilde{\nu}_{n}||_{L^{2}(G)}\leq\sigma_{n+1}||\nu_{n}-\tilde{\nu}_{n}||_{L^{2}(G)}=\sigma_{n+1}d_{L^{2}}(\nu_{n},\mathcal{M}_{H}).$$

By induction, we get

$$d_{L^{2}}(\nu_{n},\mathcal{M}_{H})^{2}\leq\frac{|G|-1}{|G|}\prod_{\begin{subarray}{c}i\in I_{H}\\ i\leq n\end{subarray}}\sigma_{i}^{2}.$$

Note that if $H$ is a subgroup of $G$ and $P\colon G\twoheadrightarrow G/H$ is the projection onto the left coset space, then one can identify $L^{2}(G/H)$ with $\mathcal{M}_{H}$ as follows:

Lemma 3.9.

Let $G$ be a finite group and $H$ a subgroup. Let $\mathcal{M}_{H}\subset L^{2}(G)$ be the space of measures uniform on left cosets of $H$. Let $P\colon G\twoheadrightarrow G/H$ send each element to the corresponding left coset of $H$. Then the map $\phi\colon L^{2}(G)\to L^{2}(G/H)$ sending $\nu$ to $|H|^{-1/2}P_{*}\nu$ restricts to an isometry of normed spaces $\phi|_{\mathcal{M}_{H}}\colon\mathcal{M}_{H}\cong L^{2}(G/H)$. Moreover, $(\phi|_{\mathcal{M}_{H}})^{-1}\circ\phi$ is the orthogonal projection map $L^{2}(G)\twoheadrightarrow\mathcal{M}_{H}$.

Proof.

The map $\phi|_{\mathcal{M}_{H}}$ is norm-preserving because if $\nu$ is uniform on left cosets of $H$, then $\nu(gH)=|H|\nu(g)$ for $g\in G$. Indeed, we have

|||H|1/2PνL2(G/H)2=1|H|gHG/H|ν(gH)|2=1|H|2gG|ν(gH)|2=1|H|2gG|H|2|ν(g)|2=νL2(G)2.|||H|^{-1/2}P_{*}\nu||_{L^{2}(G/H)}^{2}=\frac{1}{|H|}\sum_{gH\in G/H}|\nu(gH)|^{2}=\frac{1}{|H|^{2}}\sum_{g\in G}|\nu(gH)|^{2}=\frac{1}{|H|^{2}}\sum_{g\in G}|H|^{2}|\nu(g)|^{2}=||\nu||_{L^{2}(G)}^{2}.

The inverse map (ϕ|H)1(\phi|_{\mathcal{M}_{H}})^{-1} is given by (ϕ|H)1(μ)(g)=μ(gH)/|H|(\phi|_{\mathcal{M}_{H}})^{-1}(\mu)(g)=\mu(gH)/|H|. The fact that (ϕ|H)1ϕ(\phi|_{\mathcal{M}_{H}})^{-1}\circ\phi is the orthogonal projection map L2(G)HL^{2}(G)\to\mathcal{M}_{H} follows from Lemma 3.6. ∎

Identifying L2(G/H)L^{2}(G/H) with H\mathcal{M}_{H} will allow us to prove Theorem 3.1 by induction. Using Lemma 3.8, we can say that a random walk approaches the subspace H\mathcal{M}_{H}. Then, we can consider its projection onto H\mathcal{M}_{H} as a random walk on G/HG/H. This allows us to ignore all random walk steps supported in HH. The key ingredient that allows us to combine Lemma 3.8 with the inductive hypothesis is the following lemma:

Lemma 3.10.

Let GG be a finite group and HGH\leq G. Let π\pi be the uniform distribution on GG and let μ\mu be any signed measure on GG. Let P:GG/HP\colon G\twoheadrightarrow G/H be the set map sending each element of GG to the corresponding left coset of HH. Let HL2(G)\mathcal{M}_{H}\subseteq L^{2}(G) be the set of signed measures uniform on left cosets of HH. Then

μπL2(G)2=1|H|PμPπL2(G/H)2+dL2(μ,H)2||\mu-\pi||_{L^{2}(G)}^{2}=\frac{1}{|H|}||P_{*}\mu-P_{*}\pi||_{L^{2}(G/H)}^{2}+d_{L^{2}}(\mu,\mathcal{M}_{H})^{2}
Proof.

Let μ~\tilde{\mu} be the orthogonal projection of μ\mu onto H\mathcal{M}_{H}. By Lemma 3.6, we have Pμ=Pμ~P_{*}\mu=P_{*}\tilde{\mu}. Then

μπL2(G)2\displaystyle||\mu-\pi||_{L^{2}(G)}^{2} =μ~πL2(G)2+μμ~L2(G)2\displaystyle=||\tilde{\mu}-\pi||_{L^{2}(G)}^{2}+||\mu-\tilde{\mu}||_{L^{2}(G)}^{2}
=1|H|Pμ~PπL2(G/H)2+dL2(μ,H)2\displaystyle=\frac{1}{|H|}||P_{*}\tilde{\mu}-P_{*}\pi||_{L^{2}(G/H)}^{2}+d_{L^{2}}(\mu,\mathcal{M}_{H})^{2}
=1|H|PμPπL2(G/H)2+dL2(μ,H)2\displaystyle=\frac{1}{|H|}||P_{*}\mu-P_{*}\pi||_{L^{2}(G/H)}^{2}+d_{L^{2}}(\mu,\mathcal{M}_{H})^{2}
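The identity of Lemma 3.10 can be verified directly on a small abelian example. In the Python sketch below (illustrative only; the group G = Z/6Z, the subgroup H = ⟨3⟩, and the measure μ are our own choices), both sides agree to machine precision.

```python
n, H = 6, [0, 3]               # G = Z/6Z, H = <3>
mu = [0.30, 0.05, 0.20, 0.10, 0.25, 0.10]   # an arbitrary probability measure
pi = [1.0 / n] * n

def push(nu):                  # P_*: measures on G -> measures on G/H ≅ Z/3Z
    return [nu[c] + nu[c + 3] for c in range(3)]

lhs = sum((mu[g] - pi[g]) ** 2 for g in range(n))          # ||mu - pi||^2

Pmu, Ppi = push(mu), push(pi)
quot = sum((Pmu[c] - Ppi[c]) ** 2 for c in range(3)) / len(H)

d2 = 0.0                       # squared distance to M_H (project to coset averages)
for c in range(3):
    avg = (mu[c] + mu[c + 3]) / 2
    d2 += (mu[c] - avg) ** 2 + (mu[c + 3] - avg) ** 2

assert abs(lhs - (quot + d2)) < 1e-12
```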

The final lemma before the proof of Theorem 3.1 is a pair of facts about pushforwards to quotients:

Lemma 3.11.

Let GG be a finite group and HH a normal subgroup. Let P:GG/HP\colon G\twoheadrightarrow G/H be the projection. Then:

  1. (1)

    If μ,νL2(G)\mu,\nu\in L^{2}(G), we have P(μν)=PμPνP_{*}(\mu*\nu)=P_{*}\mu*P_{*}\nu.

  2. (2)

    Suppose μL2(G)\mu\in L^{2}(G) and the second-largest singular value of μ*\mu on suppμ\langle\operatorname{supp}\mu\rangle is σ\sigma. Then the second-largest singular value of (Pμ)*(P_{*}\mu) on P(suppμ)P(\langle\operatorname{supp}\mu\rangle) is at most σ\sigma.

Proof.
  1. (1)

    We have

    P(μν)(gH)=hH(μν)(gh)=hHkGμ(k)ν(k1gh)=kHG/HhHμ(kh)ν(h1k1gH)P_{*}(\mu*\nu)(gH)=\sum_{h\in H}(\mu*\nu)(gh)=\sum_{h\in H}\sum_{k\in G}\mu(k)\nu(k^{-1}gh)=\sum_{kH\in G/H}\sum_{h\in H}\mu(kh)\nu(h^{-1}k^{-1}gH)

    Since left cosets of HH are right cosets too, h1k1gH=k1gHh^{-1}k^{-1}gH=k^{-1}gH, so

    P(μν)(gH)=kHG/HhHμ(kh)ν(k1gH)=kHG/Hμ(kH)ν(k1gH)=(PμPν)(gH).P_{*}(\mu*\nu)(gH)=\sum_{kH\in G/H}\sum_{h\in H}\mu(kh)\nu(k^{-1}gH)=\sum_{kH\in G/H}\mu(kH)\nu(k^{-1}gH)=(P_{*}\mu*P_{*}\nu)(gH).
  2. (2)

    By restricting to the projection suppμP(suppμ)\langle\operatorname{supp}\mu\rangle\twoheadrightarrow P(\langle\operatorname{supp}\mu\rangle) we may as well assume suppμ=G\langle\operatorname{supp}\mu\rangle=G.

    Recall (from the proof of Lemma 3.4(2)) that the second-largest singular value of μ*\mu is the operator norm of μ*\mu acting on the subspace L2(G)0L^{2}(G)_{0} of measures with total mass 0. Suppose σ\sigma^{\prime} is the second-largest singular value of (Pμ)*(P_{*}\mu). Then

    σ=supνL2(G/H)ν(G/H)=0νPμL2(G/H)νL2(G/H)\sigma^{\prime}=\sup_{\begin{subarray}{c}\nu\in L^{2}(G/H)\\ \nu(G/H)=0\end{subarray}}\frac{||\nu*P_{*}\mu||_{L^{2}(G/H)}}{||\nu||_{L^{2}(G/H)}}

    Let HL2(G)\mathcal{M}_{H}\subseteq L^{2}(G) be the space of signed measures uniform on cosets of HH. By Lemma 3.9 and part (1) of this lemma, we have

    supνL2(G/H)ν(G/H)=0νPμL2(G/H)νL2(G/H)=supν~Hν~(G)=0Pν~PμL2(G/H)Pν~L2(G/H)=supν~Hν~(G)=0P(ν~μ)L2(G/H)Pν~L2(G/H)\sup_{\begin{subarray}{c}\nu\in L^{2}(G/H)\\ \nu(G/H)=0\end{subarray}}\frac{||\nu*P_{*}\mu||_{L^{2}(G/H)}}{||\nu||_{L^{2}(G/H)}}=\sup_{\begin{subarray}{c}\tilde{\nu}\in\mathcal{M}_{H}\\ \tilde{\nu}(G)=0\end{subarray}}\frac{||P_{*}\tilde{\nu}*P_{*}\mu||_{L^{2}(G/H)}}{||P_{*}\tilde{\nu}||_{L^{2}(G/H)}}=\sup_{\begin{subarray}{c}\tilde{\nu}\in\mathcal{M}_{H}\\ \tilde{\nu}(G)=0\end{subarray}}\frac{||P_{*}(\tilde{\nu}*\mu)||_{L^{2}(G/H)}}{||P_{*}\tilde{\nu}||_{L^{2}(G/H)}}

    Then by Lemma 3.9 again, we have

    σ=supν~Hν~(G)=0P(ν~μ)L2(G/H)Pν~L2(G/H)=supν~Hν~(G)=0ν~μL2(G)ν~L2(G)supν~L2(G)ν~(G)=0ν~μL2(G)ν~L2(G)=σ\sigma^{\prime}=\sup_{\begin{subarray}{c}\tilde{\nu}\in\mathcal{M}_{H}\\ \tilde{\nu}(G)=0\end{subarray}}\frac{||P_{*}(\tilde{\nu}*\mu)||_{L^{2}(G/H)}}{||P_{*}\tilde{\nu}||_{L^{2}(G/H)}}=\sup_{\begin{subarray}{c}\tilde{\nu}\in\mathcal{M}_{H}\\ \tilde{\nu}(G)=0\end{subarray}}\frac{||\tilde{\nu}*\mu||_{L^{2}(G)}}{||\tilde{\nu}||_{L^{2}(G)}}\leq\sup_{\begin{subarray}{c}\tilde{\nu}\in L^{2}(G)\\ \tilde{\nu}(G)=0\end{subarray}}\frac{||\tilde{\nu}*\mu||_{L^{2}(G)}}{||\tilde{\nu}||_{L^{2}(G)}}=\sigma
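Both parts of Lemma 3.11 are easy to test numerically for abelian groups, where the singular values of *μ are the magnitudes of the Fourier coefficients of μ. The sketch below is an illustration under that abelian assumption, with G = Z/6Z and H = ⟨3⟩ chosen by us; it checks that pushforward commutes with convolution and that σ′ ≤ σ.

```python
import cmath

n = 6
mu = [0.35, 0.25, 0.15, 0.10, 0.10, 0.05]
nu = [0.10, 0.30, 0.10, 0.20, 0.10, 0.20]

def convolve(a, b):
    return [sum(a[k] * b[(g - k) % n] for k in range(n)) for g in range(n)]

def push(a):                   # P_*: Z/6Z -> (Z/6Z)/<3> ≅ Z/3Z
    return [a[c] + a[c + 3] for c in range(3)]

# part (1): pushforward commutes with convolution
lhs = push(convolve(mu, nu))
Pm, Pn = push(mu), push(nu)
rhs = [sum(Pm[k] * Pn[(c - k) % 3] for k in range(3)) for c in range(3)]
assert all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs))

# part (2): for abelian groups the singular values of *mu are |hat(mu)(k)|
def fhat(a, k, m):
    return abs(sum(a[g] * cmath.exp(2j * cmath.pi * k * g / m) for g in range(m)))

sigma = max(fhat(mu, k, 6) for k in range(1, 6))       # on G
sigma_q = max(fhat(Pm, j, 3) for j in range(1, 3))     # on G/H
assert sigma_q <= sigma + 1e-12
```

The inequality in part (2) is visible here because the characters of G/H pull back to the even-frequency characters of G, a subset of those entering the maximum defining σ.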

Proof of Theorem 3.1.

We will prove the following statement by induction on rr:

(P(r)P(r)) (Q~r)νn(Q~r)πL2(Gr)2j=r+1k|Gj1|1|Gr|(iIjσi2).\displaystyle||(\tilde{Q}_{r})_{*}\nu_{n}-(\tilde{Q}_{r})_{*}\pi||_{L^{2}(G_{r})}^{2}\leq\sum_{j=r+1}^{k}\frac{|G_{j-1}|-1}{|G_{r}|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right).

When r=kr=k, the right hand side of P(r)P(r) is 0. Since both (Q~r)νn(\tilde{Q}_{r})_{*}\nu_{n} and (Q~r)π(\tilde{Q}_{r})_{*}\pi are the unique probability measure on Gr={e}G_{r}=\{e\}, the left hand side is also 0, so P(k)P(k) holds.

Now suppose P(r+1)P(r+1) holds. We will show P(r)P(r) holds.

Since (Q~r)π(\tilde{Q}_{r})_{*}\pi is the uniform distribution on GrG_{r}, Lemma 3.10 applied to GrG_{r} and Hr+1H_{r+1} says

(Q~r)νn(Q~r)πL2(Gr)2\displaystyle||(\tilde{Q}_{r})_{*}\nu_{n}-(\tilde{Q}_{r})_{*}\pi||_{L^{2}(G_{r})}^{2} =1|Hr+1|(Q~r+1)νn(Q~r+1)πL2(Gr+1)2+dL2((Q~r)νn,Hr+1)2\displaystyle=\frac{1}{|H_{r+1}|}||(\tilde{Q}_{r+1})_{*}\nu_{n}-(\tilde{Q}_{r+1})_{*}\pi||_{L^{2}(G_{r+1})}^{2}+d_{L^{2}}((\tilde{Q}_{r})_{*}\nu_{n},\mathcal{M}_{H_{r+1}})^{2}

where Hr+1\mathcal{M}_{H_{r+1}} is the subspace of L2(Gr)L^{2}(G_{r}) consisting of measures uniform on cosets of Hr+1H_{r+1}. By the inductive hypothesis,

1|Hr+1|(Q~r+1)νn(Q~r+1)πL2(Gr+1)2\displaystyle\frac{1}{|H_{r+1}|}||(\tilde{Q}_{r+1})_{*}\nu_{n}-(\tilde{Q}_{r+1})_{*}\pi||_{L^{2}(G_{r+1})}^{2} j=r+2k|Gj1|1|Gr+1||Hr+1|(iIjσi2)\displaystyle\leq\sum_{j=r+2}^{k}\frac{|G_{j-1}|-1}{|G_{r+1}||H_{r+1}|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right)
=j=r+2k|Gj1|1|Gr|(iIjσi2).\displaystyle=\sum_{j=r+2}^{k}\frac{|G_{j-1}|-1}{|G_{r}|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right).

By Lemma 3.11(1), we have (Q~r)νn=(Q~r)μ1(Q~r)μn(\tilde{Q}_{r})_{*}\nu_{n}=(\tilde{Q}_{r})_{*}\mu_{1}*\dots*(\tilde{Q}_{r})_{*}\mu_{n}, so by Lemma 3.8 applied to GrG_{r}, Hr+1H_{r+1}, and the measures (Q~r)μi(\tilde{Q}_{r})_{*}\mu_{i}, we get

dL2((Q~r)νn,Hr+1)2|Gr|1|Gr|iIr+1σi2.d_{L^{2}}((\tilde{Q}_{r})_{*}\nu_{n},\mathcal{M}_{H_{r+1}})^{2}\leq\frac{|G_{r}|-1}{|G_{r}|}\prod_{i\in I_{r+1}}\sigma_{i}^{2}.

Hence,

(Q~r)νn(Q~r)πL2(Gr)2\displaystyle||(\tilde{Q}_{r})_{*}\nu_{n}-(\tilde{Q}_{r})_{*}\pi||_{L^{2}(G_{r})}^{2} j=r+2k|Gj1|1|Gr|(iIjσi2)+|Gr|1|Gr|iIr+1σi2\displaystyle\leq\sum_{j=r+2}^{k}\frac{|G_{j-1}|-1}{|G_{r}|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right)+\frac{|G_{r}|-1}{|G_{r}|}\prod_{i\in I_{r+1}}\sigma_{i}^{2}
=j=r+1k|Gj1|1|Gr|(iIjσi2),\displaystyle=\sum_{j=r+1}^{k}\frac{|G_{j-1}|-1}{|G_{r}|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right),

completing the induction. When r=0r=0, we get

νnπL22j=1k|Gj1|1|G|(iIjσi2).||\nu_{n}-\pi||_{L^{2}}^{2}\leq\sum_{j=1}^{k}\frac{|G_{j-1}|-1}{|G|}\left(\prod_{i\in I_{j}}\sigma_{i}^{2}\right).

Now we show how Theorem 3.1 implies Theorem 1.1 by giving a corollary slightly stronger than Theorem 1.1.

Corollary 3.12.

Let GG be a finite group, and let μ1,μ2,,μn\mu_{1},\mu_{2},\dots,\mu_{n} be probability measures on GG. For each subgroup HH of GG, let IH={iH=suppμi}I_{H}=\{i\mid H=\langle\operatorname{supp}\mu_{i}\rangle\}. Let H1,,HkH_{1},\dots,H_{k} be a sequence of subgroups of GG such that G=j=1kHjG=\left\langle\bigcup_{j=1}^{k}H_{j}\right\rangle and the image of HjH_{j} in G/H1Hj1G/H_{1}\cdots H_{j-1} is a normal subgroup for all 1jk1\leq j\leq k. Write νn=μ1μn\nu_{n}=\mu_{1}*\dots*\mu_{n}. Also, for each ii, let σi\sigma_{i} be the second-largest singular value of μi*\mu_{i} as an operator on L2(suppμi)L^{2}(\langle\operatorname{supp}\mu_{i}\rangle). Let π\pi be the uniform distribution on GG.

If IHjI_{H_{j}} is nonempty for each 1jk1\leq j\leq k, we have

νnπL2j=1k(iIHjσi).||\nu_{n}-\pi||_{L^{2}}\leq\sum_{j=1}^{k}\left(\prod_{i\in I_{H_{j}}}\sigma_{i}\right).
Proof.

We have j=1kHjH1Hk\bigcup_{j=1}^{k}H_{j}\subseteq H_{1}\cdots H_{k}, so H1Hk=GH_{1}\cdots H_{k}=G. Let Q~j:GG/H1Hj\tilde{Q}_{j}\colon G\to G/H_{1}\cdots H_{j} be the projection. For iIHji\in I_{H_{j}}, let σi\sigma_{i}^{\prime} be the second-largest singular value of (Q~j1)μi*(\tilde{Q}_{j-1})_{*}\mu_{i} on Q~j1(suppμi)=Q~j1(Hj)\tilde{Q}_{j-1}(\langle\operatorname{supp}\mu_{i}\rangle)=\tilde{Q}_{j-1}(H_{j}). By Lemma 3.11(2), we have σiσi\sigma_{i}^{\prime}\leq\sigma_{i}. Then applying Theorem 3.1 to the sequence

GG/H1G/H1H2G/H1Hk={e}G\longrightarrow G/H_{1}\longrightarrow G/H_{1}H_{2}\longrightarrow\dots\longrightarrow G/H_{1}\cdots H_{k}=\{e\}

we get that

νnπL22j=1k|G/H1Hj1|1|G|(iIHj(σi)2)j=1k(iIHjσi2).||\nu_{n}-\pi||_{L^{2}}^{2}\leq\sum_{j=1}^{k}\frac{|G/H_{1}\cdots H_{j-1}|-1}{|G|}\left(\prod_{i\in I_{H_{j}}}(\sigma_{i}^{\prime})^{2}\right)\leq\sum_{j=1}^{k}\left(\prod_{i\in I_{H_{j}}}\sigma_{i}^{2}\right).

Then the corollary follows by subadditivity of square root. ∎

We obtain Theorem 1.1 from Corollary 3.12 because the image of a normal subgroup under a surjection is normal.
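As a concrete illustration of this bound (not part of the paper; all numbers are our own choices), take G = Z/6Z with steps alternating between distributions supported on H₁ = ⟨2⟩ and H₂ = ⟨3⟩. Neither step alone is supported on a generating set of G, yet the walk equidistributes, and the L² error is controlled by one product of second singular values per subgroup:

```python
import cmath

n = 6
step1 = {0: 0.5, 2: 0.3, 4: 0.2}   # supported on H1 = <2> ≅ Z/3Z
step2 = {0: 0.6, 3: 0.4}           # supported on H2 = <3> ≅ Z/2Z
steps = [step1, step2] * 4          # eight alternating steps

def convolve(nu, step):
    out = [0.0] * n
    for g in range(n):
        for k, p in step.items():
            out[g] += nu[(g - k) % n] * p
    return out

def sigma_on_support(step, h, order):
    # second-largest singular value on L^2(<h>), with <h> ≅ Z/order via h*j <-> j
    return max(abs(sum(p * cmath.exp(2j * cmath.pi * m * (k // h) / order)
                       for k, p in step.items()))
               for m in range(1, order))

s1 = sigma_on_support(step1, 2, 3)
s2 = sigma_on_support(step2, 3, 2)   # = |0.6 - 0.4| = 0.2

nu = [0.0] * n
nu[0] = 1.0
for step in steps:
    nu = convolve(nu, step)

l2_err = sum((x - 1.0 / n) ** 2 for x in nu) ** 0.5
bound = s1 ** 4 + s2 ** 4            # four steps on each subgroup
assert l2_err <= bound
```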

4. Universality for Random Groups

The goal of this section is to prove Theorem 1.2.

To prove Theorem 1.2, we will use the moment method of Wood (see [Woo14, Woo19]) as follows. Let X1,X2,X_{1},X_{2},\dots be a sequence of random finitely generated abelian groups and YY be a random finitely generated abelian group. Let a>0a>0 be an integer and AA the set of isomorphism classes of abelian groups with exponent dividing aa. If for every GAG\in A we have

(∗) limn𝔼[#Sur(Xn,G)]=𝔼[#Sur(Y,G)]|2G|\lim_{n\to\infty}\mathbb{E}[\#\operatorname{Sur}(X_{n},G)]=\mathbb{E}[\#\operatorname{Sur}(Y,G)]\leq|\wedge^{2}G|

then for every HAH\in A we have

(∗∗) limn[Xn/aH]=[Y/aH]\lim_{n\to\infty}\mathbb{P}[X_{n}\otimes\mathbb{Z}/a\mathbb{Z}\cong H]=\mathbb{P}[Y\otimes\mathbb{Z}/a\mathbb{Z}\cong H]

[Woo19, Theorem 3.1]. The quantity 𝔼[#Sur(Xn,G)]\mathbb{E}[\#\operatorname{Sur}(X_{n},G)] is called the GG-moment of XnX_{n}.

Remark 4.1.

We can put a topology on the set of (isomorphism classes of) finitely generated abelian groups given by a basis of open sets of the form

Ua,H={X finitely generated abelianX/aH}U_{a,H}=\{X\text{ finitely generated abelian}\mid X\otimes\mathbb{Z}/a\mathbb{Z}\cong H\}

indexed by positive integers aa and abelian groups HH of exponent dividing aa. The assertion that ()(**) holds for all choices of aa and HH is equivalent to the assertion that the distribution of XnX_{n} converges weakly to the distribution of YY in this topology. In particular, if ()(*) holds for all abelian groups GG, then the distribution of XnX_{n} converges weakly to the distribution of YY.

If YλuY\sim\lambda_{u}, then [Woo19, Lemma 3.2] gives

𝔼[#Sur(Y,G)]=|G|u.\mathbb{E}[\#\operatorname{Sur}(Y,G)]=|G|^{-u}.

Following this strategy, we obtain Theorem 1.2 as a corollary of Theorem 4.18, which states that if XnX_{n} are the cokernels of n×(n+u)n\times(n+u) random matrices satisfying appropriate conditions, then limn𝔼[#Sur(Xn,G)]=|G|u\lim_{n}\mathbb{E}[\#\operatorname{Sur}(X_{n},G)]=|G|^{-u}.

When XnX_{n} is the cokernel of a random n×mn\times m matrix MM, the problem of counting surjections from XnX_{n} onto GG can be attacked with combinatorics. Say Xn=n/ΛX_{n}=\mathbb{Z}^{n}/\Lambda, where Λ\Lambda is a random subgroup of n\mathbb{Z}^{n} (e.g., the column space of a random integer matrix). Then surjections XnGX_{n}\to G correspond one-to-one with surjections nG\mathbb{Z}^{n}\to G which vanish on Λ\Lambda. It follows from linearity of expectation that

𝔼[#Sur(n/Λ,G)]=fSur(n,G)[f(Λ)=0].\mathbb{E}[\#\operatorname{Sur}(\mathbb{Z}^{n}/\Lambda,G)]=\sum_{f\in\operatorname{Sur}(\mathbb{Z}^{n},G)}\mathbb{P}[f(\Lambda)=0].
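As a sanity check on this identity (illustrative, not from the paper), the following sketch takes G = Z/2Z and a 2×2 matrix with columns uniform in {0,1}²; it computes the expected number of surjections both by enumerating all 16 matrices and by summing P[f(M) = 0] over the three surjections f.

```python
from itertools import product

# G = Z/2Z, Lambda = column span of a uniform 2x2 matrix over {0, 1}
cols = list(product(range(2), repeat=2))                         # 4 possible columns
surjs = [f for f in product(range(2), repeat=2) if f != (0, 0)]  # nonzero f are surjective

def kills(f, c):
    return (f[0] * c[0] + f[1] * c[1]) % 2 == 0

# direct: average over all 16 matrices of #{surjective f vanishing on both columns}
direct = sum(sum(1 for f in surjs if kills(f, c1) and kills(f, c2))
             for c1, c2 in product(cols, cols)) / 16.0

# via linearity: sum over f of P[f(M) = 0], using column independence
viasum = sum((sum(kills(f, c) for c in cols) / 4.0) ** 2 for f in surjs)

assert abs(direct - viasum) < 1e-12    # both equal 3/4 for this tiny case
```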

In the case of cokernels of random matrices, Λ\Lambda is the subgroup generated by the columns of the random matrix, viewed as random elements of n\mathbb{Z}^{n}. But we can also view MM as a random element of (n)m(\mathbb{Z}^{n})^{m}. Given a map f:nGf\colon\mathbb{Z}^{n}\to G, we get by abuse of notation a map f:(n)mGmf\colon(\mathbb{Z}^{n})^{m}\to G^{m} by applying ff to each component. Then we have that f(Λ)=0f(\Lambda)=0 if and only if f(M)=0f(M)=0. Thus, we want to bound the probabilities [f(M)=0]\mathbb{P}[f(M)=0]. Past work on random matrices with independent entries (e.g., [NW22]) has observed that if ZZ is a random tuple in n\mathbb{Z}^{n} with independent, sufficiently regular components, then for most fSur(n,G)f\in\operatorname{Sur}(\mathbb{Z}^{n},G), the element f(Z)Gf(Z)\in G is close to uniformly distributed. Applying this to each column independently allows us to compute [f(M)=0]\mathbb{P}[f(M)=0]. In this work, we apply the same principle to consider several columns of a random matrix at a time.

4.1. Balanced elements

The following definition captures the idea that a random element in a group is not too concentrated in a particular coset.

Definition 4.2.

Let GG be a group. A GG-valued random variable XX is ε\varepsilon-balanced if for any proper subgroup H<GH<G and element gGg\in G, we have [XgH]1ε\mathbb{P}[X\in gH]\leq 1-\varepsilon.

This definition agrees with the definition in [Woo19] when GG is a finite cyclic group. Here is an example of an ε\varepsilon-balanced random variable that does not take values in a cyclic group.

Example 4.3.

Let GG be a finitely generated group with finite generating set SS containing the identity, and let XX be a random variable supported on SS with mingS[X=g]=ε\min_{g\in S}\mathbb{P}[X=g]=\varepsilon. Then XX is ε\varepsilon-balanced.

Indeed, suppose HH is a subgroup of GG and gGg\in G such that [XgH]>1ε\mathbb{P}[X\in gH]>1-\varepsilon. Then we must have SgHS\subset gH, since any sSgHs\in S\setminus gH would carry probability at least ε\varepsilon, forcing [XgH]1ε\mathbb{P}[X\in gH]\leq 1-\varepsilon. Since SS contains the identity element of GG, we must have gH=HgH=H, and since the generating set SS is contained in gH=HgH=H, we must have H=GH=G.
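The example can be checked by brute force on a small group. The sketch below is illustrative (the group Z/6Z and the distribution are our own choices): it takes S = {0, 1} and verifies the ε-balanced condition over every coset of every proper subgroup.

```python
n = 6
p = {0: 0.7, 1: 0.3}   # supported on S = {0, 1}: contains the identity, generates Z/6Z
eps = min(p.values())  # = 0.3

# proper subgroups of Z/6Z are dZ/6Z for d = 2, 3, 6
for d in (2, 3, 6):
    H = [g for g in range(n) if g % d == 0]
    for g in range(n):
        coset = {(g + h) % n for h in H}
        mass = sum(p.get(x, 0.0) for x in coset)
        assert mass <= 1 - eps + 1e-12
```

The extreme case is the coset 0 + ⟨2⟩ = {0, 2, 4}, which carries mass exactly 0.7 = 1 − ε.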

In this paper, we consider n×mn\times m integer matrices as elements of the abelian group (n)m(\mathbb{Z}^{n})^{m}. For each subset SS of [n]×[m][n]\times[m], we have a quotient map πS\pi_{S} from (n)m(\mathbb{Z}^{n})^{m} onto S\mathbb{Z}^{S} given by taking the entries of a matrix indexed by pairs in SS. We say that a subset of the entries of a random matrix MM with indices SS is jointly ε\varepsilon-balanced if πS(M)\pi_{S}(M) is ε\varepsilon-balanced in S\mathbb{Z}^{S}.

The new definition of ε\varepsilon-balanced has some desirable properties that help construct new examples of ε\varepsilon-balanced random variables.

Lemma 4.4.
  1. (1)

    If π:GQ\pi\colon G\twoheadrightarrow Q is a surjective homomorphism of groups and XX is ε\varepsilon-balanced in GG, then π(X)\pi(X) is ε\varepsilon-balanced in QQ.

  2. (2)

    Let G,GG,G^{\prime} be groups, XX be ε\varepsilon-balanced in GG, and YY be ε\varepsilon-balanced in GG^{\prime}. If XX and YY are independent, then (X,Y)(X,Y) is ε\varepsilon-balanced in G×GG\times G^{\prime}.

Proof.
  1. (1)

    Let qKQqK\subsetneq Q be a coset of a proper subgroup of QQ. Let q~π1(q)\tilde{q}\in\pi^{-1}(q), so π1(qK)=q~π1(K)\pi^{-1}(qK)=\tilde{q}\pi^{-1}(K) is a coset of a proper subgroup of GG. Since XX is ε\varepsilon-balanced,

    [π(X)qK][Xq~π1(K)]1ε,\mathbb{P}[\pi(X)\in qK]\leq\mathbb{P}[X\in\tilde{q}\pi^{-1}(K)]\leq 1-\varepsilon,

    as desired.

  2. (2)

    Let kHkH be a coset of a proper subgroup of G×GG\times G^{\prime}. Note that

    [(X,Y)kH]=[(X,e)(e,Y1)kH]=[(X,e)(e,Y1)kH(G×{e})].\mathbb{P}[(X,Y)\in kH]=\mathbb{P}[(X,e)\in(e,Y^{-1})kH]=\mathbb{P}[(X,e)\in(e,Y^{-1})kH\cap(G\times\{e\})].

    Recall that the intersection of two cosets in a group is either empty or a coset of their intersection. In particular, (e,Y1)kH(G×{e})(e,Y^{-1})kH\cap(G\times\{e\}) is either empty or a coset of a subgroup of G×{e}G\times\{e\}.

    There are two cases, depending on whether (e,Y1)kH(G×{e})(e,Y^{-1})kH\cap(G\times\{e\}) is always a proper subset of G×{e}G\times\{e\}:

    1. i.

      If (e,y1)kH(G×{e})G×{e}(e,y^{-1})kH\cap(G\times\{e\})\subsetneq G\times\{e\} for all yGy\in G^{\prime}:

      Condition on Y=yY=y for some fixed yGy\in G^{\prime}. Since XX and YY are independent, and XX is ε\varepsilon-balanced,

      [(X,e)(e,Y1)kH(G×{e})Y=y]=[(X,e)(e,y1)kH(G×{e})]1ε.\mathbb{P}[(X,e)\in(e,Y^{-1})kH\cap(G\times\{e\})\mid Y=y]=\mathbb{P}[(X,e)\in(e,y^{-1})kH\cap(G\times\{e\})]\leq 1-\varepsilon.
    2. ii.

If G×{e}(e,y1)kHG\times\{e\}\subseteq(e,y^{-1})kH for some yGy\in G^{\prime}, then (e,e)(e,y1)kH(e,e)\in(e,y^{-1})kH, so in particular (e,y1)kH(e,y^{-1})kH is a subgroup of G×GG\times G^{\prime} and we must have (e,y1)kH=H(e,y^{-1})kH=H. We claim that H=G×HH=G\times H^{\prime} for some proper subgroup HH^{\prime} of GG^{\prime}.

Indeed, let π:G×GG\pi\colon G\times G^{\prime}\to G^{\prime} be the projection and let H=π(H)H^{\prime}=\pi(H). On one hand, clearly HG×π(H)H\subseteq G\times\pi(H). On the other, if (g,h)G×H(g,h^{\prime})\in G\times H^{\prime}, then h=π(g,h)h^{\prime}=\pi(g^{\prime},h^{\prime}) for some (g,h)H(g^{\prime},h^{\prime})\in H. Then (g,h)=(g(g)1,e)(g,h)(g,h^{\prime})=(g(g^{\prime})^{-1},e)(g^{\prime},h^{\prime}). Since (g(g)1,e)G×{e}H(g(g^{\prime})^{-1},e)\in G\times\{e\}\subseteq H, we have (g,h)H(g,h^{\prime})\in H. Hence H=G×HH=G\times H^{\prime}. Note that HGH^{\prime}\lneq G^{\prime}, for otherwise H=G×GH=G\times G^{\prime} would not be a proper subgroup.

      Then

[(X,Y)kH]=[YyH]1ε.\mathbb{P}[(X,Y)\in kH]=\mathbb{P}[Y\in yH^{\prime}]\leq 1-\varepsilon.

Hence, in both cases we have [(X,Y)kH]1ε\mathbb{P}[(X,Y)\in kH]\leq 1-\varepsilon, and since this holds for every coset kHkH of every proper subgroup, (X,Y)(X,Y) is ε\varepsilon-balanced. ∎

Note that Lemma 4.4 gives us a nice way to build up ε\varepsilon-balanced matrices. If the entries of a random matrix can be partitioned into independent subsets and each of these subsets of the entries is jointly ε\varepsilon-balanced, then the whole matrix is ε\varepsilon-balanced. For example, any matrix with independent, ε\varepsilon-balanced entries (as in [Woo19]) is ε\varepsilon-balanced as a matrix.

When a random variable is ε\varepsilon-balanced, we can get an upper bound on the associated singular value.

Lemma 4.5.

Suppose GG is a finite group and XX is ε\varepsilon-balanced in GG with distribution μ\mu. Let σ\sigma be the second largest singular value of the operator μ*\mu on L2(G)L^{2}(G). Then

σexp(ε2|G|3).\sigma\leq\exp\left(-\frac{\varepsilon}{2|G|^{3}}\right).
Proof.

Note that σ\sigma is the square root of the second largest eigenvalue of the operator νμμˇ:L2(G)L2(G)*\nu\coloneqq*\mu*\check{\mu}\colon L^{2}(G)\to L^{2}(G), where μˇ*\check{\mu} is the adjoint to the operator μ*\mu, given by μˇ(g)=μ(g1)\check{\mu}(g)=\mu(g^{-1}). The operator ν*\nu is the transition operator for a random walk on GG, where each step is a difference of two independent copies of XX.

In particular, note that ν=νˇ\nu=\check{\nu}. For any generating set Σ\Sigma of GG, [Sal04, Theorem 6.2] applied to ΣΣ1\Sigma\cup\Sigma^{-1} shows that the second-largest eigenvalue σ2\sigma^{2} of ν*\nu is bounded above by

σ21mD2,\sigma^{2}\leq 1-\frac{m}{D^{2}},

where m=minxΣν(x)m=\min_{x\in\Sigma}\nu(x) and DD is the diameter of the Cayley graph of (G,Σ)(G,\Sigma). In particular, D|G|D\leq|G|.

The goal is to choose an appropriate Σ\Sigma to bound mm from below. Note that if X1X_{1} and X2X_{2} are ε\varepsilon-balanced and independent, then so is X1X21X_{1}X_{2}^{-1} (via conditioning on X2X_{2}). In particular, ν\nu is ε\varepsilon-balanced.

We proceed iteratively. Suppose we have chosen x1,,xn1x_{1},\dots,x_{n-1} (for n=1n=1, nothing has been chosen yet). If x1,,xn1=G\langle x_{1},\dots,x_{n-1}\rangle=G, then we are done. Otherwise, x1,,xn1\langle x_{1},\dots,x_{n-1}\rangle is a proper subgroup of GG, so since ν\nu is ε\varepsilon-balanced we have ν(x1,,xn1)1ε\nu(\langle x_{1},\dots,x_{n-1}\rangle)\leq 1-\varepsilon. Choose

xn=argmaxxGx1,,xn1ν(x).x_{n}=\operatorname{argmax}_{x\in G\setminus\langle x_{1},\dots,x_{n-1}\rangle}\nu(x).

Since ν(x1,,xn1)1ε\nu(\langle x_{1},\dots,x_{n-1}\rangle)\leq 1-\varepsilon, we have ν(Gx1,,xn1)ε\nu(G\setminus\langle x_{1},\dots,x_{n-1}\rangle)\geq\varepsilon, so ν(xn)ε|Gx1,,xn1|ε|G|\nu(x_{n})\geq\frac{\varepsilon}{|G\setminus\langle x_{1},\dots,x_{n-1}\rangle|}\geq\frac{\varepsilon}{|G|}.

Hence we have mε|G|m\geq\frac{\varepsilon}{|G|}, so

σ1ε|G|31ε2|G|3exp(ε2|G|3),\sigma\leq\sqrt{1-\frac{\varepsilon}{|G|^{3}}}\leq 1-\frac{\varepsilon}{2|G|^{3}}\leq\exp\left(-\frac{\varepsilon}{2|G|^{3}}\right),

as desired. ∎
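For abelian groups the bound of Lemma 4.5 is easy to test, since the singular values of *μ are the character magnitudes |μ̂(χ)|. A minimal sketch follows (the group Z/5Z and the distribution are our own choices; Z/5Z has no nontrivial proper subgroup, so the ε-balanced condition only constrains point masses):

```python
import cmath
import math

n = 5                                  # G = Z/5Z: the only proper subgroup is {0}
mu = [0.6, 0.1, 0.1, 0.1, 0.1]
eps = 1 - max(mu)                      # 0.4: no point (= coset) has mass > 1 - eps

# abelian case: singular values of *mu are the character magnitudes |hat(mu)(k)|
sigma = max(abs(sum(mu[g] * cmath.exp(2j * cmath.pi * k * g / n)
                    for g in range(n)))
            for k in range(1, n))

bound = math.exp(-eps / (2 * n ** 3))  # Lemma 4.5: sigma <= exp(-eps / (2|G|^3))
assert sigma <= bound
```

Here σ = 0.5 while the bound is about 0.998; the lemma's bound is far from sharp in this case, but it has the uniformity in ε and |G| that the later estimates need.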

Now we will use the ε\varepsilon-balanced condition to give a related balancedness condition for matrices that contains information about how balanced and independent the entries are.

Definition 4.6.

Let SS be a finite set. A partition of SS is a collection 𝒫={P1,,Pk}2S\mathcal{P}=\{P_{1},\dots,P_{k}\}\subseteq 2^{S}, such that S=P1P2PkS=P_{1}\sqcup P_{2}\sqcup\dots\sqcup P_{k} and each PiP_{i} is nonempty. We say |𝒫|=maxi#Pi|\mathcal{P}|=\max_{i}\#P_{i} and #𝒫=k\#\mathcal{P}=k. If σ2S\sigma\subseteq 2^{S}, write σ\cup\sigma for TσT\bigcup_{T\in\sigma}T.

Note that #𝒫|𝒫|#S\#\mathcal{P}\cdot|\mathcal{P}|\geq\#S.

The next definition specifies the kinds of restrictions we will give for the matrices in our universality class. The idea is that we can split up the columns of the matrix and then the rows, so that the resulting sections of the matrix are ε\varepsilon-balanced.

If MM is an n×mn\times m matrix, S={s1<<sk}[n]S=\{s_{1}<\dots<s_{k}\}\subset[n], and T={t1<<t}[m]T=\{t_{1}<\dots<t_{\ell}\}\subset[m], then MS,TM_{S,T} is the k×k\times\ell matrix (Msi,tj)1ik,1j(M_{s_{i},t_{j}})_{1\leq i\leq k,1\leq j\leq\ell}.

Definition 4.7.

An n×mn\times m random matrix MM with entries in a ring RR is (w,h,ε)(w,h,\varepsilon)-balanced if there is a partition 𝒬={Q1,,Qr}\mathcal{Q}=\{Q_{1},\dots,Q_{r}\} of [m][m] and a partition 𝒫={P1,,P}\mathcal{P}=\{P_{1},\dots,P_{\ell}\} of [n][n] with |𝒬|w|\mathcal{Q}|\leq w, |𝒫|h|\mathcal{P}|\leq h, and such that each random matrix MPi,QjM_{P_{i},Q_{j}} is ε\varepsilon-balanced in the additive abelian group (R#Pi)#Qj(R^{\#P_{i}})^{\#Q_{j}} and the random matrices MPi,QjM_{P_{i},Q_{j}} are independent.

If |𝒫|=|𝒬|=1|\mathcal{P}|=|\mathcal{Q}|=1 then we recover the definition of ε\varepsilon-balanced from [Woo19] and other related work.

Now we are ready to state the main theorem of this section:

Theorem 4.8.

Let u0u\geq 0 be an integer. Let (wn)n,(hn)n,(εn)n(w_{n})_{n},(h_{n})_{n},(\varepsilon_{n})_{n} be sequences of real numbers such that wn=o(logn)w_{n}=o(\log n), hn=O(n1α)h_{n}=O(n^{1-\alpha}), and εnnβ\varepsilon_{n}\geq n^{-\beta} for some 0<α10<\alpha\leq 1 and 0<β<α/20<\beta<\alpha/2.

For each integer n0n\geq 0, let MnM_{n} be a (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balanced n×(n+u)n\times(n+u) random matrix with entries in \mathbb{Z}. Let YλuY\sim\lambda_{u}. Then for all positive integers aa and abelian groups HH of exponent dividing aa we have

limn[coker(Mn)/aH]=[Y/aH]=λu(Ua,H).\lim_{n\to\infty}\mathbb{P}[\operatorname{coker}(M_{n})\otimes\mathbb{Z}/a\mathbb{Z}\cong H]=\mathbb{P}[Y\otimes\mathbb{Z}/a\mathbb{Z}\cong H]=\lambda_{u}(U_{a,H}).

Together with Remark 4.1, this gives Theorem 1.2.

As discussed at the beginning of this section, we will prove this by computing the limiting moments of coker(Mn)\operatorname{coker}(M_{n}), which involves estimating [f(Mn)=0]\mathbb{P}[f(M_{n})=0] for maps nG\mathbb{Z}^{n}\to G.

Remark 4.9.

The same proof will work as written when the entries of MnM_{n} come from any ring RR with at most one quotient to /a\mathbb{Z}/a\mathbb{Z} for any positive integer aa. Some examples of interest are the pp-adic integers p\mathbb{Z}_{p} or a product ipi\prod_{i}\mathbb{Z}_{p_{i}} for some collection of distinct primes pip_{i}. We will find that when RR has exactly one quotient to /a\mathbb{Z}/a\mathbb{Z}, then for any finite abelian group GG of exponent dividing aa, the limiting GG-moment of coker(Mn)\operatorname{coker}(M_{n}) is |G|u|G|^{-u}. Then we get the conclusion of Theorem 4.8 for those aa for which RR has a quotient to /a\mathbb{Z}/a\mathbb{Z}.

4.2. Bounds for most maps

It turns out that (w,h,ε)(w,h,\varepsilon)-balanced is a strong enough condition that we can get bounds on [f(M)=0]\mathbb{P}[f(M)=0] for the vast majority of maps ff.

Definition 4.10.

If VV is an abelian group with generating set SS and TST\subseteq S, we write VTV_{\setminus T} for the subgroup ST\langle S\setminus T\rangle of VV. When V=(/a)nV=(\mathbb{Z}/a\mathbb{Z})^{n} or n\mathbb{Z}^{n} we implicitly take SS to be the “standard basis”.

Let 𝒫={P1,,P}\mathcal{P}=\{P_{1},\dots,P_{\ell}\} be a partition of SS and GG be a finite abelian group. A function f:VGf\colon V\to G is a 𝒫\mathcal{P}-code of distance ww if for any σ𝒫\sigma\subset\mathcal{P} with |σ|<w|\cup\sigma|<w, we have f(Vσ)=Gf(V_{\setminus\cup\sigma})=G.

To approximate [f(M)=0]\mathbb{P}[f(M)=0] for codes ff, we will split the matrices MM into independent sets of columns. Each such set of rr random columns gets mapped to something close to uniform in GrG^{r}. The following lemma is analogous to [Woo19, Lemma 2.1].

Lemma 4.11.

Let n,r1n,r\geq 1 be integers. Let GG be a finite abelian group and let aa be a multiple of the exponent of GG. Let NN be the number of subgroups of GG. Let ε>0\varepsilon>0 be a real number. Let V=(/a)nV=(\mathbb{Z}/a\mathbb{Z})^{n}. Let 𝒫={Pi}\mathcal{P}=\{P_{i}\} be a partition of [n][n] and let =|𝒫|\ell=|\mathcal{P}|. Let fHom(V,G)f\in\operatorname{Hom}(V,G) be a 𝒫\mathcal{P}-code of distance w<nw<n.

Let MM be an n×rn\times r random matrix in VrV^{r} such that the matrices MPi,[r]M_{P_{i},[r]} are independent and ε\varepsilon-balanced as random elements of ((/a)#Pi)r((\mathbb{Z}/a\mathbb{Z})^{\#P_{i}})^{r}.

Let g1,,grGg_{1},\dots,g_{r}\in G. Then

|[f(M)=(g1,,gr)]|G|r|Nexp(εw2N|G|3r)|\mathbb{P}[f(M)=(g_{1},\dots,g_{r})]-|G|^{-r}|\leq N\exp\left(-\frac{\varepsilon w}{2\ell N|G|^{3r}}\right)
Proof.

Let e1,,ene_{1},\dots,e_{n} be the standard generating set for VV. For i=1,,#𝒫i=1,\dots,\#\mathcal{P}, let Vi=ejjPi(/a)#PiV_{i}=\langle e_{j}\mid j\in P_{i}\rangle\cong(\mathbb{Z}/a\mathbb{Z})^{\#P_{i}}.

The idea is to treat f(M)f(M) as a random walk in GrG^{r}. We have

f(M)=i=1#𝒫f(MPi,[r]),f(M)=\sum_{i=1}^{\#\mathcal{P}}f(M_{P_{i},[r]}),

where MPi,[r]M_{P_{i},[r]} is interpreted as an ε\varepsilon-balanced random element of Vir((/a)#Pi)rV_{i}^{r}\cong((\mathbb{Z}/a\mathbb{Z})^{\#P_{i}})^{r}, a subgroup of ((/a)n)r((\mathbb{Z}/a\mathbb{Z})^{n})^{r}.

Let S={HGrH=f(Vir) for at least w/N values of i}S=\{H\leq G^{r}\mid H=f(V_{i}^{r})\text{ for at least }w/\ell N\text{ values of }i\}. Note that f(Vir)=f(Vi)rf(V_{i}^{r})=f(V_{i})^{r}, so as ii ranges over 1,,#𝒫1,\dots,\#\mathcal{P} there are at most NN possible values for f(Vir)f(V_{i}^{r}), each an rrth power of a subgroup of GG. Let I={if(Vir)S}I=\{i\mid f(V_{i}^{r})\notin S\}. Each of the at most NN values not in SS is attained fewer than w/Nw/\ell N times, so #I<w/\#I<w/\ell, and since each part of 𝒫\mathcal{P} has at most \ell elements, |iIPi|<w|\bigcup_{i\in I}P_{i}|<w. Since ff is a 𝒫\mathcal{P}-code of distance ww, it remains surjective when we discard all of these indices, which means the images of the VirV_{i}^{r} with f(Vir)Sf(V_{i}^{r})\in S generate GrG^{r}. In other words, we have HSH=Gr\langle\bigcup_{H\in S}H\rangle=G^{r}. The subgroups in SS will be the ones we use in the random walk, applying Theorem 1.1.

By the definition of SS, for each HH in SS we have #IHw/N\#I_{H}\geq w/\ell N. By Lemma 4.4, the steps f(MPi,[r])f(M_{P_{i},[r]}) are ε\varepsilon-balanced, which means that by Lemma 4.5 the second largest singular value σi\sigma_{i} of the iith step f(MPi,[r])f(M_{P_{i},[r]}) is bounded above: σiexp(ε2|G|3r)\sigma_{i}\leq\exp\left(-\frac{\varepsilon}{2|G|^{3r}}\right) (using the fact that each f(MPi,[r])f(M_{P_{i},[r]}) is supported on a subgroup of GrG^{r}).

Hence by Theorem 1.1 we have

|[f(M)=(g1,,gr)]|G|r|HSexp(εw2N|G|3r)Nexp(εw2N|G|3r),|\mathbb{P}[f(M)=(g_{1},\dots,g_{r})]-|G|^{-r}|\leq\sum_{H\in S}\exp\left(-\frac{\varepsilon w}{2\ell N|G|^{3r}}\right)\leq N\exp\left(-\frac{\varepsilon w}{2\ell N|G|^{3r}}\right),

as desired. ∎

To combine these estimates we will use a result in the flavor of [Woo19, Lemma 2.3]:

Lemma 4.12.

Let x1,,xm1x_{1},\dots,x_{m}\geq-1 be real numbers such that i=1mmax{0,xi}log2\sum_{i=1}^{m}\max\{0,x_{i}\}\leq\log 2. Then

|i=1m(1+xi)1|2i=1m|xi|\left|\prod_{i=1}^{m}(1+x_{i})-1\right|\leq 2\sum_{i=1}^{m}|x_{i}|

and

i=1mmin{0,xi}i=1m(1+xi)12i=1mmax{0,xi}.\sum_{i=1}^{m}\min\{0,x_{i}\}\leq\prod_{i=1}^{m}(1+x_{i})-1\leq 2\sum_{i=1}^{m}\max\{0,x_{i}\}.
Proof.

The first statement follows from the second statement because max{0,xi}|xi|\max\{0,x_{i}\}\leq|x_{i}| and min{0,xi}|xi|\min\{0,x_{i}\}\geq-|x_{i}|. So, we will show the second statement.

First, assume xi0x_{i}\leq 0 for all ii. In that case,

i=1m(1+xi)1+i=1mxi.\prod_{i=1}^{m}(1+x_{i})\geq 1+\sum_{i=1}^{m}x_{i}.

Next, assume xi0x_{i}\geq 0 for all ii. Using the fact that 1+xiexi1+x_{i}\leq e^{x_{i}}, we get

i=1m(1+xi)ei=1mxi.\prod_{i=1}^{m}(1+x_{i})\leq e^{\sum_{i=1}^{m}x_{i}}.

We have ex1=2xe^{x}-1=2x at x=0x=0 and ddx(ex1)ddx(2x)\frac{d}{dx}(e^{x}-1)\leq\frac{d}{dx}(2x) for xlog2x\leq\log 2, so ex12xe^{x}-1\leq 2x for 0xlog20\leq x\leq\log 2. Hence, if i=1mxilog2\sum_{i=1}^{m}x_{i}\leq\log 2, then exp(i=1mxi)12i=1mxi\exp\left(\sum_{i=1}^{m}x_{i}\right)-1\leq 2\sum_{i=1}^{m}x_{i}.

Now consider the general case. By replacing each negative xix_{i} with zero, we can only increase the product i=1m(1+xi)\prod_{i=1}^{m}(1+x_{i}). On the other hand, by replacing each positive xix_{i} with zero, we can only decrease it. Hence, for general xix_{i}, we get

i=1mmin{0,xi}i=1m(1+min{0,xi})1i=1m(1+xi)1i=1m(1+max{0,xi})12i=1mmax{0,xi}.\sum_{i=1}^{m}\min\{0,x_{i}\}\leq\prod_{i=1}^{m}(1+\min\{0,x_{i}\})-1\leq\prod_{i=1}^{m}(1+x_{i})-1\leq\prod_{i=1}^{m}(1+\max\{0,x_{i}\})-1\leq 2\sum_{i=1}^{m}\max\{0,x_{i}\}.
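Lemma 4.12 can be stress-tested numerically. The sketch below (illustrative only; the sampling range is our own choice) draws random vectors satisfying the hypothesis and checks both inequalities:

```python
import math
import random

random.seed(0)
checked = 0
for _ in range(1000):
    xs = [random.uniform(-1.0, 0.05) for _ in range(20)]
    if sum(max(0.0, x) for x in xs) > math.log(2):
        continue  # hypothesis of the lemma not satisfied; skip this draw
    prod = 1.0
    for x in xs:
        prod *= 1.0 + x
    lo = sum(min(0.0, x) for x in xs)
    hi = 2.0 * sum(max(0.0, x) for x in xs)
    assert lo - 1e-9 <= prod - 1.0 <= hi + 1e-9
    assert abs(prod - 1.0) <= 2.0 * sum(abs(x) for x in xs) + 1e-9
    checked += 1
```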

Applying this lemma with xix_{i} being the error in Lemma 4.11 multiplied by |G|r|G|^{r} yields an estimate on the probability that the whole matrix maps to zero:

Lemma 4.13.

Let u0u\geq 0 be an integer. Let GG be a finite abelian group and let aa be a multiple of the exponent of GG. Let (wn)n,(hn)n,(δn)n,(εn)n(w_{n})_{n},(h_{n})_{n},(\delta_{n})_{n},(\varepsilon_{n})_{n} be sequences of real numbers such that wn=o(logn)w_{n}=o(\log n), hn=O(n1α)h_{n}=O(n^{1-\alpha}), and εnδnnα+β\varepsilon_{n}\delta_{n}\geq n^{-\alpha+\beta} for some 0<βα10<\beta\leq\alpha\leq 1.

For a natural number nn, let V=(/a)nV=(\mathbb{Z}/a\mathbb{Z})^{n}. Let MM be a (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balanced n×(n+u)n\times(n+u) random matrix with entries in /a\mathbb{Z}/a\mathbb{Z}. Let 𝒫\mathcal{P} be the row partition associated to MM and let fHom(V,G)f\in\operatorname{Hom}(V,G) be a 𝒫\mathcal{P}-code of distance nδnn\delta_{n}.

Then there are constants K,c,γ>0K,c,\gamma>0 depending only on GG, α\alpha, β\beta, and the sequences (wn)n,(hn)n(w_{n})_{n},(h_{n})_{n} such that for all g1,,gn+uGg_{1},\dots,g_{n+u}\in G,

|[f(M)=(g1,,gn+u)]|G|nu|Kexp(cnγ)|G|n+u|\mathbb{P}[f(M)=(g_{1},\dots,g_{n+u})]-|G|^{-n-u}|\leq\frac{K\exp(-cn^{\gamma})}{|G|^{n+u}}
Proof.

Let 𝒫\mathcal{P} and 𝒬\mathcal{Q} be the row and column partitions for MM as in the definition of (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balanced. Let Mi=M[n],QiM_{i}=M_{[n],Q_{i}} for each ii. Let gQi=(gjjQi)g_{Q_{i}}=(g_{j}\mid j\in Q_{i}). By independence,

[f(M)=(g1,,gn+u)]=i[f(Mi)=gQi].\mathbb{P}[f(M)=(g_{1},\dots,g_{n+u})]=\prod_{i}\mathbb{P}[f(M_{i})=g_{Q_{i}}].

For each ii, let xi=|G|#Qi[f(Mi)=gQi]1x_{i}=|G|^{\#Q_{i}}\mathbb{P}[f(M_{i})=g_{Q_{i}}]-1. By Lemma 4.11, we have

|xi|\displaystyle|x_{i}| N|G|#Qiexp(nεnδn2Nhn|G|3#Qi)\displaystyle\leq N|G|^{\#Q_{i}}\exp\left(-\frac{n\varepsilon_{n}\delta_{n}}{2Nh_{n}|G|^{3\#Q_{i}}}\right)
N|G|wnexp(nεnδn2Nhn|G|3wn).\displaystyle\leq N|G|^{w_{n}}\exp\left(-\frac{n\varepsilon_{n}\delta_{n}}{2Nh_{n}|G|^{3w_{n}}}\right).

Hence we have

log|xi|logN+wnlog|G|nεnδn2Nhn|G|3wn.\log|x_{i}|\leq\log N+w_{n}\log|G|-\frac{n\varepsilon_{n}\delta_{n}}{2Nh_{n}|G|^{3w_{n}}}.

Since hn=O(n1α)h_{n}=O(n^{1-\alpha}) and εnδnnα+β\varepsilon_{n}\delta_{n}\geq n^{-\alpha+\beta}, there is a constant CC depending only on the proportionality constant in hnh_{n} such that for large enough nn we have εnδnhnCnβ1\frac{\varepsilon_{n}\delta_{n}}{h_{n}}\geq Cn^{\beta-1} so that nεnδn2Nhn|G|3wnCnβ2N|G|3wn\frac{n\varepsilon_{n}\delta_{n}}{2Nh_{n}|G|^{3w_{n}}}\geq\frac{Cn^{\beta}}{2N|G|^{3w_{n}}}.

Since wn=o(logn)w_{n}=o(\log n), for large enough nn we have wnβlogn6log|G|w_{n}\leq\frac{\beta\log n}{6\log|G|} so that |G|3wn=e3wnlog|G|nβ/2|G|^{3w_{n}}=e^{3w_{n}\log|G|}\leq n^{\beta/2} and, for large enough nn, nεnδn2Nhn|G|3wnCnβ/22N\frac{n\varepsilon_{n}\delta_{n}}{2Nh_{n}|G|^{3w_{n}}}\geq\frac{Cn^{\beta/2}}{2N}.

Finally, since logN+wnlog|G|=o(logn)\log N+w_{n}\log|G|=o(\log n), we also have that for nn large enough, log|xi|C4Nnβ/2\log|x_{i}|\leq-\frac{C}{4N}n^{\beta/2} and |xi|exp(C4Nnβ/2)|x_{i}|\leq\exp\left(-\frac{C}{4N}n^{\beta/2}\right). In particular, for nn large enough,

i=1m|xi|mexp(C4Nnβ/2)nexp(C4Nnβ/2)log2.\sum_{i=1}^{m}|x_{i}|\leq m\exp\left(-\frac{C}{4N}n^{\beta/2}\right)\leq n\exp\left(-\frac{C}{4N}n^{\beta/2}\right)\leq\log 2.

By Lemma 4.12, we therefore have that for such nn,

||G|n+u[f(M)=(g1,,gn+u)]1|\displaystyle||G|^{n+u}\mathbb{P}[f(M)=(g_{1},\dots,g_{n+u})]-1| =|i=1m|G|#Qi[f(Mi)=gQi]1|\displaystyle=\left|\prod_{i=1}^{m}|G|^{\#Q_{i}}\mathbb{P}[f(M_{i})=g_{Q_{i}}]-1\right|
=|i=1m(1+xi)1|\displaystyle=\left|\prod_{i=1}^{m}(1+x_{i})-1\right|
2i=1m|xi|\displaystyle\leq 2\sum_{i=1}^{m}|x_{i}|
2nexp(C4Nnβ/2)\displaystyle\leq 2n\exp\left(-\frac{C}{4N}n^{\beta/2}\right)
=2nexp(C8Nnβ/2)exp(C8Nnβ/2).\displaystyle=2n\exp\left(-\frac{C}{8N}n^{\beta/2}\right)\cdot\exp\left(-\frac{C}{8N}n^{\beta/2}\right).

Since limn2nexp(C8Nnβ/2)=0\lim_{n\to\infty}2n\exp\left(-\frac{C}{8N}n^{\beta/2}\right)=0, the expression 2nexp(C8Nnβ/2)2n\exp\left(-\frac{C}{8N}n^{\beta/2}\right) is uniformly bounded above by some constant for all n0n\geq 0. Then the appropriate constant KK can be chosen so that

||G|n+u[f(M)=(g1,,gn+u)]1|Kexp(C8Nnβ/2),||G|^{n+u}\mathbb{P}[f(M)=(g_{1},\dots,g_{n+u})]-1|\leq K\exp\left(-\frac{C}{8N}n^{\beta/2}\right),

for all nn, as desired. ∎
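To see the exponential decay in Lemma 4.13 concretely, consider the smallest nontrivial example (our own illustration, not taken from the paper): G=/2G=\mathbb{Z}/2\mathbb{Z}, a single column, and ff summing all nn coordinates, a code of full distance. If the entries are i.i.d. Bernoulli(pp) bits — ε\varepsilon-balanced with ε=min{p,1p}\varepsilon=\min\{p,1-p\} — then the probability that the column maps to 0 is (1+(12p)n)/2(1+(1-2p)^{n})/2, so the deviation from |G|1=1/2|G|^{-1}=1/2 is |12p|n/2|1-2p|^{n}/2, decaying exponentially in nn. The sketch below confirms the closed form by enumeration:

```python
from itertools import product

def prob_even_sum(n, p):
    """Exact P[x_1 + ... + x_n is even] for i.i.d. Bernoulli(p) bits,
    computed by brute-force enumeration of all 2^n outcomes."""
    total = 0.0
    for bits in product((0, 1), repeat=n):
        pr = 1.0
        for b in bits:
            pr *= p if b == 1 else 1 - p
        if sum(bits) % 2 == 0:
            total += pr
    return total

# Closed form: (1 + (1-2p)^n) / 2, so |P[even] - 1/2| = |1-2p|^n / 2.
p = 0.4
for n in range(1, 12):
    exact = prob_even_sum(n, p)
    closed = (1 + (1 - 2 * p) ** n) / 2
    assert abs(exact - closed) < 1e-12
    assert abs(exact - 0.5) <= 0.5 * abs(1 - 2 * p) ** n + 1e-12
```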

4.3. Bounds for the rest of the maps

This gives results for the case when ff is a code, but we still need to account for non-codes. To do this, we will show that non-codes make up a negligible proportion of all maps VGV\to G and thus contribute only a small error term to the sum 𝔼[#Sur(coker(M),G)]\mathbb{E}[\#\operatorname{Sur}(\operatorname{coker}(M),G)]. However, it turns out that splitting maps into codes and non-codes is not enough to get this bound. Instead, as in [Woo19], [NW22], and similar work, we will categorize non-codes by how far they are from being codes.

If DD is an integer with prime factorization ipiei\prod_{i}p_{i}^{e_{i}}, we write (D)=iei\ell(D)=\sum_{i}e_{i}.
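In code, (D)\ell(D) is just the number of prime factors of DD counted with multiplicity; a small illustrative helper (ours, used nowhere in the argument):

```python
def ell(D):
    """ell(D): sum of the exponents in the prime factorization of D,
    computed by trial division."""
    count, d = 0, 2
    while d * d <= D:
        while D % d == 0:
            D //= d
            count += 1
        d += 1
    if D > 1:  # leftover prime factor
        count += 1
    return count

# For example, 12 = 2^2 * 3, so ell(12) = 3, and ell of any prime is 1.
```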

Definition 4.14.

If V=(/a)nV=(\mathbb{Z}/a\mathbb{Z})^{n} and 𝒫\mathcal{P} is a partition of the “standard basis” of VV, the (𝒫,δ)(\mathcal{P},\delta)-depth of fHom(V,G)f\in\operatorname{Hom}(V,G) is the maximal positive DD for which there is a σ𝒫\sigma\subset\mathcal{P} with |σ|<(D)δn|\cup\sigma|<\ell(D)\delta n such that D=[G:f(Vσ)]D=[G:f(V_{\setminus\cup\sigma})]; if no such DD exists, the depth is 1.

We can count the number of ff that have given (𝒫,δ)(\mathcal{P},\delta)-depth:

Lemma 4.15.

If D>1D>1, then the number of fHom(V,G)f\in\operatorname{Hom}(V,G) of (𝒫,δ)(\mathcal{P},\delta)-depth DD is at most

K(n(D)δn1)2(D)δn|G|nDn+(D)δn,K\binom{n}{\lceil\ell(D)\delta n\rceil-1}2^{\ell(D)\delta n}|G|^{n}D^{-n+\ell(D)\delta n},

where KK is the number of subgroups of GG of index DD.

Proof.

For each ff of (𝒫,δ)(\mathcal{P},\delta)-depth DD, there is a σ𝒫\sigma\subset\mathcal{P} as described in Definition 4.14. There must be some set S[n]S\subset[n] with #S=(D)δn1\#S=\lceil\ell(D)\delta n\rceil-1 and σS\cup\sigma\subseteq S. There are (n(D)δn1)\binom{n}{\lceil\ell(D)\delta n\rceil-1} choices of SS, and for each choice of SS, there are certainly at most 2#S=2(D)δn12(D)δn2^{\#S}=2^{\lceil\ell(D)\delta n\rceil-1}\leq 2^{\ell(D)\delta n} choices of σ\cup\sigma. Since 𝒫\mathcal{P} is a partition, σ\cup\sigma uniquely determines σ\sigma, so there are at most 2(D)δn2^{\ell(D)\delta n} choices of σ\sigma for each choice of SS.

Now we count how many ff of (𝒫,δ)(\mathcal{P},\delta)-depth DD have each choice of σ\sigma, so fix σ\sigma. There are KK subgroups of GG of index DD, so there are KK options for f(Vσ)f(V_{\setminus\cup\sigma}).

Fix a subgroup HH of GG with index DD. We now count the number of ff with f(Vσ)Hf(V_{\setminus\cup\sigma})\subseteq H. There are at most |H|n|σ||H|^{n-|\cup\sigma|} maps from VσV_{\setminus\cup\sigma} to HH, and for each such map, there are at most |G||σ||G|^{|\cup\sigma|} homomorphisms from VV to GG which restrict appropriately. Hence, there are at most

|H|n|σ||G||σ|=|G|n|σ|Dn+|σ||G||σ|=|G|nDn+|σ||G|nDn+(D)δn|H|^{n-|\cup\sigma|}|G|^{|\cup\sigma|}=|G|^{n-|\cup\sigma|}D^{-n+|\cup\sigma|}|G|^{|\cup\sigma|}=|G|^{n}D^{-n+|\cup\sigma|}\leq|G|^{n}D^{-n+\ell(D)\delta n}

maps ff with f(Vσ)Hf(V_{\setminus\cup\sigma})\subseteq H. Combined with the counts of choices of σ\sigma and subgroups of GG of index DD, we get the lemma. ∎
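As a sanity check on this count, one can brute-force the smallest case (a toy verification of our own, hard-coding simplifications valid only here): G=/2G=\mathbb{Z}/2\mathbb{Z} with 𝒫\mathcal{P} the partition of [n][n] into singletons. A map f(v)=cvf(v)=c\cdot v then has depth 2 exactly when its support has size less than (2)δn=δn\ell(2)\delta n=\delta n (take σ\sigma to be the singletons of the support), and K=1K=1 since /2\mathbb{Z}/2\mathbb{Z} has a unique subgroup of index 2:

```python
from itertools import product
from math import comb, ceil

def depth_Z2(c, delta):
    """(P, delta)-depth of f(v) = c.v from (Z/2)^n to Z/2, with P the
    partition into singletons. Only D = 2 is possible: it occurs exactly
    when supp(c) fits inside some sigma with |sigma| < ell(2)*delta*n."""
    n = len(c)
    support_size = sum(1 for ci in c if ci)
    return 2 if support_size < delta * n else 1

n, delta = 8, 0.3
count = sum(1 for c in product((0, 1), repeat=n) if depth_Z2(c, delta) == 2)
# Lemma 4.15 bound with G = Z/2, D = 2, ell(2) = 1, K = 1:
bound = comb(n, ceil(delta * n) - 1) * 2 ** (delta * n) * 2 ** n * 2 ** (-n + delta * n)
assert count <= bound
```

Here the exact count is the number of vectors of support size at most ⌈δn1\lceil\delta n\rceil-1, comfortably below the bound.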

For non-codes, we do not get precise estimates on [f(M)=0]\mathbb{P}[f(M)=0], but we can get upper bounds.

Lemma 4.16.

Let r1r\geq 1 be an integer. Let GG be a finite abelian group and let aa be a multiple of the exponent of GG. Let NN be the number of subgroups of GG. Let ε>0\varepsilon>0 and δ>0\delta>0 be real numbers. Let V=(/a)nV=(\mathbb{Z}/a\mathbb{Z})^{n}. Let 𝒫={P1,,Pm}\mathcal{P}=\{P_{1},\dots,P_{m}\} be a partition of [n][n] and let =|𝒫|\ell=|\mathcal{P}|. Let fHom(V,G)f\in\operatorname{Hom}(V,G) have (𝒫,δ)(\mathcal{P},\delta)-depth D>1D>1 with [G:f(V)]<D[G:f(V)]<D.

Let MM be an n×rn\times r random matrix in VrV^{r} such that the matrices MPi,[r]M_{P_{i},[r]} are independent and ε\varepsilon-balanced as random elements of ((/a)#Pi)r((\mathbb{Z}/a\mathbb{Z})^{\#P_{i}})^{r}.

Then

[f(M)=0](1ε)(Dr|G|r+Nexp(εδn2N(D1|G|)3r))\mathbb{P}[f(M)=0]\leq(1-\varepsilon)\left(D^{r}|G|^{-r}+N\exp\left(-\frac{\varepsilon\delta n}{2N\ell(D^{-1}|G|)^{3r}}\right)\right)
Proof.

Since ff has (𝒫,δ)(\mathcal{P},\delta)-depth DD, there is a σ𝒫\sigma\subset\mathcal{P} with |σ|<(D)δn|\cup\sigma|<\ell(D)\delta n such that D=[G:f(Vσ)]D=[G:f(V_{\setminus\cup\sigma})]. Let f(Vσ)=:Hf(V_{\setminus\cup\sigma})=\colon H. Since [G:f(V)]<D[G:f(V)]<D, we cannot have that σ\sigma is empty.

Write f(M)=jσf(MPj,[r])+jσf(MPj,[r])f(M)=\sum_{j\notin\sigma}f(M_{P_{j},[r]})+\sum_{j\in\sigma}f(M_{P_{j},[r]}). So,

[f(M)=0]=[f(M)H][jσf(MPj,[r])=jσf(MPj,[r])|f(M)H].\mathbb{P}[f(M)=0]=\mathbb{P}[f(M)\in H]\mathbb{P}\left[\sum_{j\notin\sigma}f(M_{P_{j},[r]})=-\sum_{j\in\sigma}f(M_{P_{j},[r]})\ \middle|\ f(M)\in H\right].

We bound the two probabilities on the right side separately. Note that since jσf(MPj,[r])H\sum_{j\notin\sigma}f(M_{P_{j},[r]})\in H, we have f(M)Hf(M)\in H exactly when jσf(MPj,[r])H\sum_{j\in\sigma}f(M_{P_{j},[r]})\in H. Since [G:f(V)]<[G:H][G:f(V)]<[G:H], there must be some iσi\in\sigma such that f(VPi)f(V_{P_{i}}) is not contained in HH. Conditioning on all other MPk,[r]M_{P_{k},[r]} for kik\neq i, by the ε\varepsilon-balanced assumption we have that

[f(M)H]=[f(MPi,[r])jσ{i}f(MPj,[r])(modH)]1ε.\mathbb{P}\left[f(M)\in H\right]=\mathbb{P}\left[f(M_{P_{i},[r]})\equiv-\sum_{j\in\sigma\setminus\{i\}}f(M_{P_{j},[r]})\pmod{H}\right]\leq 1-\varepsilon.

For the second probability, let 𝒫\mathcal{P}^{\prime} be the partition of [n]σ[n]\setminus\cup\sigma induced by 𝒫\mathcal{P}. Notice that f|Vσf|_{V_{\setminus\cup\sigma}} is a 𝒫\mathcal{P}^{\prime}-code of distance δn\delta n. Indeed, suppose there is some τ𝒫\tau\subset\mathcal{P}^{\prime} with |τ|<δn|\cup\tau|<\delta n inducing some τ𝒫\tau^{\prime}\subset\mathcal{P} with f(V(στ))Hf(V_{\setminus\cup(\sigma\cup\tau)})\neq H. Then the image of f|V(στ)f|_{V_{\setminus\cup(\sigma\cup\tau)}} would be a proper subgroup of HH, so its index DD^{\prime} in GG would be a proper multiple of DD. Since (D)(D)+1\ell(D^{\prime})\geq\ell(D)+1 and |(στ)|<(D)δn+δn(D)δn|\cup(\sigma\cup\tau^{\prime})|<\ell(D)\delta n+\delta n\leq\ell(D^{\prime})\delta n, this would contradict the maximality of DD.

Now we can apply Lemma 4.11 to the submatrix M[n]σ,[r]M_{[n]\setminus\cup\sigma,[r]} and the code ff mapping it into HrH^{r}. If NN^{\prime} is the number of subgroups of HH and =|𝒫|\ell^{\prime}=|\mathcal{P}^{\prime}|, then conditioning on MPj,[r]M_{P_{j},[r]} for jσj\in\sigma gives

[jσf(MPj,[r])=jσf(MPj,[r])|f(M)H]\displaystyle\mathbb{P}\left[\sum_{j\notin\sigma}f(M_{P_{j},[r]})=-\sum_{j\in\sigma}f(M_{P_{j},[r]})\ \middle|\ f(M)\in H\right] |H|r+Nexp(εδn2N|H|3r)\displaystyle\leq|H|^{-r}+N^{\prime}\exp\left(-\frac{\varepsilon\delta n}{2N^{\prime}\ell^{\prime}|H|^{3r}}\right)
Dr|G|r+Nexp(εδn2N(D1|G|)3r),\displaystyle\leq D^{r}|G|^{-r}+N\exp\left(-\frac{\varepsilon\delta n}{2N\ell(D^{-1}|G|)^{3r}}\right),

and the lemma follows. ∎

Finally, we use Lemma 4.12 again to get a bound for the full n×(n+u)n\times(n+u) matrix:

Lemma 4.17.

Let u0u\geq 0 be an integer. Let GG be a finite abelian group and let aa be a multiple of the exponent of GG. Let (wn)n,(hn)n,(δn)n,(εn)n(w_{n})_{n},(h_{n})_{n},(\delta_{n})_{n},(\varepsilon_{n})_{n} be sequences of real numbers such that wn=o(logn)w_{n}=o(\log n), hn=O(n1α)h_{n}=O(n^{1-\alpha}), and εnδnnα+β\varepsilon_{n}\delta_{n}\geq n^{-\alpha+\beta} for some 0<βα10<\beta\leq\alpha\leq 1.

For a natural number nn, let V=(/a)nV=(\mathbb{Z}/a\mathbb{Z})^{n}. Let MM be a (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balanced n×(n+u)n\times(n+u) random matrix with entries in /a\mathbb{Z}/a\mathbb{Z}. Let 𝒫\mathcal{P} be the row partition associated to MM and let fHom(V,G)f\in\operatorname{Hom}(V,G) have (𝒫,δn)(\mathcal{P},\delta_{n})-depth D>1D>1, with [G:f(V)]<D[G:f(V)]<D.

Then there is a constant K>0K>0 depending only on uu, GG, α\alpha, β\beta, and the sequences hnh_{n}, wnw_{n} such that for all nn,

[f(M)=0]Kexp(εnnlogn)Dn|G|n.\mathbb{P}[f(M)=0]\leq K\exp\left(-\varepsilon_{n}\frac{n}{\log n}\right)D^{n}|G|^{-n}.
Proof.

Let 𝒬\mathcal{Q} be the column partition for MM as in the definition of (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balanced. Let Mi=M[n],QiM_{i}=M_{[n],Q_{i}} for each ii. By independence,

[f(M)=0]=i[f(Mi)=0].\mathbb{P}[f(M)=0]=\prod_{i}\mathbb{P}[f(M_{i})=0].

For each ii, let xi=|G|#QiD#Qi1εn[f(Mi)=0]1x_{i}=\frac{|G|^{\#Q_{i}}D^{-\#Q_{i}}}{1-\varepsilon_{n}}\mathbb{P}[f(M_{i})=0]-1. By Lemma 4.16, we have

max{0,xi}\displaystyle\max\{0,x_{i}\} N|G|#QiD#Qiexp(nεnδn2Nhn(D1|G|)3#Qi)\displaystyle\leq N|G|^{\#Q_{i}}D^{-\#Q_{i}}\exp\left(-\frac{n\varepsilon_{n}\delta_{n}}{2Nh_{n}(D^{-1}|G|)^{3\#Q_{i}}}\right)
N|G|wnDwnexp(nεnδn2Nhn(D1|G|)3wn).\displaystyle\leq N|G|^{w_{n}}D^{-w_{n}}\exp\left(-\frac{n\varepsilon_{n}\delta_{n}}{2Nh_{n}(D^{-1}|G|)^{3w_{n}}}\right).

By the same argument as in the proof of Lemma 4.13, there is some constant CC depending only on α\alpha and the sequence hnh_{n} such that for large enough nn (where “large enough” depends on uu, GG, α\alpha, β\beta, and the sequences hnh_{n} and wnw_{n}), we have

i=1mmax{0,xi}nexp(C4Nnβ/2)log2.\sum_{i=1}^{m}\max\{0,x_{i}\}\leq n\exp\left(-\frac{C}{4N}n^{\beta/2}\right)\leq\log 2.

By Lemma 4.12, we therefore have that for such nn,

(D1|G|)n+u(1εn)#𝒬[f(M)=0]1\displaystyle\frac{(D^{-1}|G|)^{n+u}}{(1-\varepsilon_{n})^{\#\mathcal{Q}}}\mathbb{P}[f(M)=0]-1 =i=1m(D1|G|)#Qi1εn[f(Mi)=0]1\displaystyle=\prod_{i=1}^{m}\frac{(D^{-1}|G|)^{\#Q_{i}}}{1-\varepsilon_{n}}\mathbb{P}[f(M_{i})=0]-1
=i=1m(1+xi)1\displaystyle=\prod_{i=1}^{m}(1+x_{i})-1
2i=1mmax{0,xi}\displaystyle\leq 2\sum_{i=1}^{m}\max\{0,x_{i}\}
2nexp(C4Nnβ/2)\displaystyle\leq 2n\exp\left(-\frac{C}{4N}n^{\beta/2}\right)
=2nexp(C8Nnβ/2)exp(C8Nnβ/2).\displaystyle=2n\exp\left(-\frac{C}{8N}n^{\beta/2}\right)\cdot\exp\left(-\frac{C}{8N}n^{\beta/2}\right).

Since limn2nexp(C8Nnβ/2)=0\lim_{n\to\infty}2n\exp\left(-\frac{C}{8N}n^{\beta/2}\right)=0, the expression 2nexp(C8Nnβ/2)2n\exp\left(-\frac{C}{8N}n^{\beta/2}\right) is uniformly bounded above by some constant for all n0n\geq 0. Then the appropriate constant KK^{\prime} can be chosen so that

(D1|G|)n+u(1εn)#𝒬[f(M)=0]1Kexp(C8Nnβ/2),\frac{(D^{-1}|G|)^{n+u}}{(1-\varepsilon_{n})^{\#\mathcal{Q}}}\mathbb{P}[f(M)=0]-1\leq K^{\prime}\exp\left(-\frac{C}{8N}n^{\beta/2}\right),

for all nn. Hence we have

[f(M)=0]\displaystyle\mathbb{P}[f(M)=0] Dn+u|G|nu(1εn)#𝒬(1+Kexp(C8Nnβ/2))\displaystyle\leq D^{n+u}|G|^{-n-u}(1-\varepsilon_{n})^{\#\mathcal{Q}}\left(1+K^{\prime}\exp\left(-\frac{C}{8N}n^{\beta/2}\right)\right)
Dn+u|G|nuexp(εn#𝒬)(1+Kexp(C8Nnβ/2))\displaystyle\leq D^{n+u}|G|^{-n-u}\exp(-\varepsilon_{n}\#\mathcal{Q})\left(1+K^{\prime}\exp\left(-\frac{C}{8N}n^{\beta/2}\right)\right)
(K+1)Dn+u|G|nuexp(εn#𝒬).\displaystyle\leq(K^{\prime}+1)D^{n+u}|G|^{-n-u}\exp(-\varepsilon_{n}\#\mathcal{Q}).

The lemma follows from the fact that for large enough nn, we have wnlognw_{n}\leq\log n, so #𝒬nwnnlogn\#\mathcal{Q}\geq\frac{n}{w_{n}}\geq\frac{n}{\log n}. ∎

4.4. Computing the moments

Finally, we can combine all these results to compute the limiting moments for cokernels of (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balanced random matrices. The most delicate part of the proof is handling the non-codes, which requires a careful choice of the sequence δn\delta_{n}.

Theorem 4.18.

Let u0u\geq 0 be an integer. Let GG be a finite abelian group and let aa be a multiple of the exponent of GG (possibly zero). Let (wn)n,(hn)n,(εn)n(w_{n})_{n},(h_{n})_{n},(\varepsilon_{n})_{n} be sequences of real numbers such that wn=o(logn)w_{n}=o(\log n), hn=O(n1α)h_{n}=O(n^{1-\alpha}), and εnnβ\varepsilon_{n}\geq n^{-\beta} for some 0<α10<\alpha\leq 1 and 0<β<α/20<\beta<\alpha/2.

Then there are c,K,γ>0c,K,\gamma>0 such that the following holds for every sufficiently large natural number nn. Let MM be a (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balanced n×(n+u)n\times(n+u) random matrix with entries in /a\mathbb{Z}/a\mathbb{Z}. Then

|𝔼[#Sur(coker(M),G)]|G|u|Kecnγ.|\mathbb{E}[\#\operatorname{Sur}(\operatorname{coker}(M),G)]-|G|^{-u}|\leq Ke^{-cn^{\gamma}}.
Proof.

Let V=(/a)nV=(\mathbb{Z}/a\mathbb{Z})^{n}. Following the discussion at the beginning of this section, we have

𝔼[#Sur(coker(M),G)]=fSur(V,G)[f(M)=0]\mathbb{E}[\#\operatorname{Sur}(\operatorname{coker}(M),G)]=\sum_{f\in\operatorname{Sur}(V,G)}\mathbb{P}[f(M)=0]

Let 𝒫,𝒬\mathcal{P},\mathcal{Q} be the row and column partitions witnessing the (wn,hn,εn)(w_{n},h_{n},\varepsilon_{n})-balancedness of MM.

Let δn=nα/2\delta_{n}=n^{-\alpha/2}. Note that then εnδnnβα/2\varepsilon_{n}\delta_{n}\geq n^{-\beta-\alpha/2} with βα/2>α-\beta-\alpha/2>-\alpha, so δn\delta_{n} satisfies the conditions for Lemmas 4.13 and 4.17.

For notational convenience, we will allow KK to change in each line as long as it remains a constant depending only on a,u,α,β,(hn)n,(wn)n,Ga,u,\alpha,\beta,(h_{n})_{n},(w_{n})_{n},G.

We have

|𝔼[#Sur(coker(M),G)]1|G|u|\displaystyle\left|\mathbb{E}[\#\operatorname{Sur}(\operatorname{coker}(M),G)]-\frac{1}{|G|^{u}}\right|
=|fSur(V,G)[f(M)=0]1|G|u|\displaystyle\qquad\qquad=\left|\sum_{f\in\operatorname{Sur}(V,G)}\mathbb{P}[f(M)=0]-\frac{1}{|G|^{u}}\right|
=|fSur(V,G)[f(M)=0]fHom(V,G)1|G|n+u|\displaystyle\qquad\qquad=\left|\sum_{f\in\operatorname{Sur}(V,G)}\mathbb{P}[f(M)=0]-\sum_{f\in\operatorname{Hom}(V,G)}\frac{1}{|G|^{n+u}}\right|
(1) fSur(V,G)f code of distance nδn|[f(M)=0]1|G|n+u|\displaystyle\qquad\qquad\leq\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ code of distance }n\delta_{n}\end{subarray}}\left|\mathbb{P}[f(M)=0]-\frac{1}{|G|^{n+u}}\right|
(2) +D>1D|G|fSur(V,G)f of (𝒫,δn)-depth D[f(M)=0]\displaystyle\qquad\qquad\qquad+\sum_{\begin{subarray}{c}D>1\\ D\mid|G|\end{subarray}}\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ of }(\mathcal{P},\delta_{n})\text{-depth }D\end{subarray}}\mathbb{P}[f(M)=0]
(3) +D>1D|G|fSur(V,G)f of (𝒫,δn)-depth D1|G|n+u\displaystyle\qquad\qquad\qquad+\sum_{\begin{subarray}{c}D>1\\ D\mid|G|\end{subarray}}\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ of }(\mathcal{P},\delta_{n})\text{-depth }D\end{subarray}}\frac{1}{|G|^{n+u}}
(4) +fHom(V,G)Sur(V,G)1|G|n+u\displaystyle\qquad\qquad\qquad+\sum_{f\in\operatorname{Hom}(V,G)\setminus\operatorname{Sur}(V,G)}\frac{1}{|G|^{n+u}}

Wood showed in the proof of [Woo19, Theorem 2.9] that (4) is bounded above by Kenlog2Ke^{-n\log 2}. By Lemma 4.13, we can bound (1):

fSur(V,G)f code of distance nδn|[f(M)=0]1|G|n+u|\displaystyle\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ code of distance }n\delta_{n}\end{subarray}}\left|\mathbb{P}[f(M)=0]-\frac{1}{|G|^{n+u}}\right| fSur(V,G)f code of distance nδnKexp(cnγ)|G|n+u\displaystyle\leq\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ code of distance }n\delta_{n}\end{subarray}}\frac{K\exp(-cn^{\gamma})}{|G|^{n+u}}
|G|nKexp(cnγ)|G|n+u\displaystyle\leq|G|^{n}\frac{K\exp(-cn^{\gamma})}{|G|^{n+u}}
=Kexp(cnγ).\displaystyle=K\exp(-cn^{\gamma}).

To bound (2) and (3) we use Lemma 4.15. For each D>1D>1, there are at most

K(n(D)nδn1)2(D)nδn|G|nDn+(D)nδnK\binom{n}{\lceil\ell(D)n\delta_{n}\rceil-1}2^{\ell(D)n\delta_{n}}|G|^{n}D^{-n+\ell(D)n\delta_{n}}

maps of (𝒫,δn)(\mathcal{P},\delta_{n})-depth DD. A standard inequality says that (nk)(nek)k\binom{n}{k}\leq\left(\frac{ne}{k}\right)^{k}, so for (D)nδn2\lceil\ell(D)n\delta_{n}\rceil\geq 2 (which is the case for nn large enough, independent of DD)

(n(D)nδn1)\displaystyle\binom{n}{\lceil\ell(D)n\delta_{n}\rceil-1} (ne(D)nδn1)(D)nδn1\displaystyle\leq\left(\frac{ne}{\lceil\ell(D)n\delta_{n}\rceil-1}\right)^{\lceil\ell(D)n\delta_{n}\rceil-1}
(2ne(D)nδn)(D)nδn\displaystyle\leq\left(\frac{2ne}{\ell(D)n\delta_{n}}\right)^{\ell(D)n\delta_{n}}
=(2e(D)δn)(D)nδn\displaystyle=\left(\frac{2e}{\ell(D)\delta_{n}}\right)^{\ell(D)n\delta_{n}}
=exp((D)nδn(1+log2log(D)logδn)).\displaystyle=\exp\left(\ell(D)n\delta_{n}\left(1+\log 2-\log\ell(D)-\log\delta_{n}\right)\right).
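The binomial estimate invoked above, (nk)(ne/k)k\binom{n}{k}\leq(ne/k)^{k}, in fact holds for all 1kn1\leq k\leq n (since (nk)nk/k!\binom{n}{k}\leq n^{k}/k! and k!(k/e)kk!\geq(k/e)^{k}); a quick illustrative check, separate from the argument:

```python
from math import comb, e

# Verify C(n, k) <= (n*e/k)^k over a grid of small cases.
for n in range(1, 60):
    for k in range(1, n + 1):
        assert comb(n, k) <= (n * e / k) ** k
```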

Hence, the number of maps of (𝒫,δn)(\mathcal{P},\delta_{n})-depth DD is at most

K|G|nDnexp((D)nδn(log4eD(D)logδn))\displaystyle K|G|^{n}D^{-n}\exp\left(\ell(D)n\delta_{n}\left(\log\frac{4eD}{\ell(D)}-\log\delta_{n}\right)\right) =K|G|nexp((D)nδn(log4eD(D)logδn)nlogD)\displaystyle=K|G|^{n}\exp\left(\ell(D)n\delta_{n}\left(\log\frac{4eD}{\ell(D)}-\log\delta_{n}\right)-n\log D\right)
K|G|nexp((|G|)nδn(log4e|G|(|G|)logδn)nlog2)\displaystyle\leq K|G|^{n}\exp\left(\ell(|G|)n\delta_{n}\left(\log\frac{4e|G|}{\ell(|G|)}-\log\delta_{n}\right)-n\log 2\right)

Since limδ0δlogδ=0\lim_{\delta\to 0}\delta\log\delta=0 and δn0\delta_{n}\to 0 as nn\to\infty, for large enough nn (depending only on β\beta and |G||G|) we have (|G|)δn(log4e|G|(|G|)logδn)12log2\ell(|G|)\delta_{n}\left(\log\frac{4e|G|}{\ell(|G|)}-\log\delta_{n}\right)\leq\frac{1}{2}\log 2, which means that for large enough nn,

D>1D|G|fSur(V,G)f of (𝒫,δn)-depth D1|G|n+u\displaystyle\sum_{\begin{subarray}{c}D>1\\ D\mid|G|\end{subarray}}\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ of }(\mathcal{P},\delta_{n})\text{-depth }D\end{subarray}}\frac{1}{|G|^{n+u}} D>1D|G|K|G|uexp((|G|)nδn(log4e|G|(|G|)logδn)nlog2)\displaystyle\leq\sum_{\begin{subarray}{c}D>1\\ D\mid|G|\end{subarray}}K|G|^{-u}\exp\left(\ell(|G|)n\delta_{n}\left(\log\frac{4e|G|}{\ell(|G|)}-\log\delta_{n}\right)-n\log 2\right)
D>1D|G|Kexp(log22n)\displaystyle\leq\sum_{\begin{subarray}{c}D>1\\ D\mid|G|\end{subarray}}K\exp\left(-\frac{\log 2}{2}n\right)
Kexp(log22n),\displaystyle\leq K\exp\left(-\frac{\log 2}{2}n\right),

bounding (3) as desired.

Finally, we need to bound (2). From Lemma 4.17, we have that if ff has (𝒫,δn)(\mathcal{P},\delta_{n})-depth DD,

[f(M)=0]Kexp(εnnlogn)Dn|G|n,\mathbb{P}[f(M)=0]\leq K\exp\left(-\varepsilon_{n}\frac{n}{\log n}\right)D^{n}|G|^{-n},

which means

fSur(V,G)f of (𝒫,δn)-depth D[f(M)=0]\displaystyle\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ of }(\mathcal{P},\delta_{n})\text{-depth }D\end{subarray}}\mathbb{P}[f(M)=0] Kexp((D)nδn(log4eD(D)logδn)εnnlogn)\displaystyle\leq K\exp\left(\ell(D)n\delta_{n}\left(\log\frac{4eD}{\ell(D)}-\log\delta_{n}\right)-\varepsilon_{n}\frac{n}{\log n}\right)
Kexp((|G|)n1α/2(log4e|G|(|G|)+α2logn)n1βlogn).\displaystyle\leq K\exp\left(\ell(|G|)n^{1-\alpha/2}\left(\log\frac{4e|G|}{\ell(|G|)}+\frac{\alpha}{2}\log n\right)-\frac{n^{1-\beta}}{\log n}\right).

Since β<α/2\beta<\alpha/2, we have that n1α/2(logn)2=o(n1β)n^{1-\alpha/2}(\log n)^{2}=o(n^{1-\beta}), so

limn(|G|)n1α/2(log4e|G|(|G|)+α2logn)n1β/logn=0.\lim_{n\to\infty}\frac{\ell(|G|)n^{1-\alpha/2}\left(\log\frac{4e|G|}{\ell(|G|)}+\frac{\alpha}{2}\log n\right)}{n^{1-\beta}/\log n}=0.

Hence for large enough nn (depending only on GG, α\alpha, and β\beta), we have (|G|)n1α/2(log4e|G|(|G|)+α2logn)12n1βlogn\ell(|G|)n^{1-\alpha/2}\left(\log\frac{4e|G|}{\ell(|G|)}+\frac{\alpha}{2}\log n\right)\leq\frac{1}{2}\frac{n^{1-\beta}}{\log n} and

fSur(V,G)f of (𝒫,δn)-depth D[f(M)=0]Kexp(n1β2logn).\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ of }(\mathcal{P},\delta_{n})\text{-depth }D\end{subarray}}\mathbb{P}[f(M)=0]\leq K\exp\left(-\frac{n^{1-\beta}}{2\log n}\right).

Since n1β2logn\frac{n^{1-\beta}}{2\log n} grows faster than 12n1β2\frac{1}{2}n^{\frac{1-\beta}{2}}, for nn large enough (depending only on β\beta) we have

fSur(V,G)f of (𝒫,δn)-depth D[f(M)=0]Kexp(12n1β2),\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ of }(\mathcal{P},\delta_{n})\text{-depth }D\end{subarray}}\mathbb{P}[f(M)=0]\leq K\exp\left(-\frac{1}{2}n^{\frac{1-\beta}{2}}\right),

which means

D>1D|G|fSur(V,G)f of (𝒫,δn)-depth D[f(M)=0]Kexp(12n1β2),\sum_{\begin{subarray}{c}D>1\\ D\mid|G|\end{subarray}}\sum_{\begin{subarray}{c}f\in\operatorname{Sur}(V,G)\\ f\text{ of }(\mathcal{P},\delta_{n})\text{-depth }D\end{subarray}}\mathbb{P}[f(M)=0]\leq K\exp\left(-\frac{1}{2}n^{\frac{1-\beta}{2}}\right),

giving us a bound on (2).

Finally, combining the bounds on (1), (2), (3), and (4) and choosing cc and γ\gamma appropriately yields the desired result. ∎
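Theorem 4.18 can be sanity-checked in the simplest possible case (our own illustration; the function name is ours): u=0u=0, G=/2G=\mathbb{Z}/2\mathbb{Z}, and MM an n×nn\times n matrix with i.i.d. uniform entries in /2\mathbb{Z}/2\mathbb{Z}, which are balanced for any reasonable choice of parameters. Every nonzero functional ff is surjective and satisfies [fM=0]=2n\mathbb{P}[fM=0]=2^{-n}, so 𝔼[#Sur(coker(M),/2)]=(2n1)2n=12n|G|u=1\mathbb{E}[\#\operatorname{Sur}(\operatorname{coker}(M),\mathbb{Z}/2\mathbb{Z})]=(2^{n}-1)2^{-n}=1-2^{-n}\to|G|^{-u}=1. A brute-force computation confirms this for n=3n=3:

```python
from itertools import product

def surjections_to_Z2(M, n):
    """#Sur(coker(M), Z/2): the number of nonzero f in (Z/2)^n
    with f.M = 0 (mod 2), i.e. f vanishing on the column span of M."""
    count = 0
    for f in product((0, 1), repeat=n):
        if any(f) and all(
            sum(f[i] * M[i][j] for i in range(n)) % 2 == 0 for j in range(n)
        ):
            count += 1
    return count

n = 3
total, num = 0, 0
# Average #Sur(coker(M), Z/2) over all 2^(n*n) matrices over Z/2.
for entries in product((0, 1), repeat=n * n):
    M = [entries[i * n:(i + 1) * n] for i in range(n)]
    total += surjections_to_Z2(M, n)
    num += 1
avg = total / num
assert abs(avg - (1 - 2 ** -n)) < 1e-12
```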

Acknowledgements

The author was supported by the NSF Graduate Research Fellowship Program, the Caltech Summer Undergraduate Research Fellowship program, and the Samuel P. and Frances Krown SURF Fellowship. The author thanks Melanie Wood and Omer Tamuz for mentorship and Alexander Gorokhovsky, Seth Berman, Sandra O’Neill, and Hoi Nguyen for insightful conversations. The author also thanks Gilyoung Cheong, Yifeng Huang, Hoi Nguyen, Roger Van Peski, Will Sawin, and Melanie Wood for helpful comments on an earlier draft of this manuscript.

References

  • [CK24] Gilyoung Cheong and Nathan Kaplan “Generalizations of results of Friedman and Washington on cokernels of random p-adic matrices” In Journal of Algebra 604.C, 2024 DOI: 10.1016/j.jalgebra.2022.03.035
  • [FW89] Eduardo Friedman and Lawrence C. Washington “On the distribution of divisor class groups of curves over a finite field” In Proceedings of the International Number Theory Conference held at Université Laval, July 5-18, 1987 Berlin, New York: De Gruyter, 1989, pp. 227–239 DOI: 10.1515/9783110852790.227
  • [Més20] András Mészáros “The Distribution of Sandpile Groups of Random Regular Graphs” In Transactions of the American Mathematical Society 373.9, 2020, pp. 6529–6594 DOI: 10.1090/tran/8127
  • [NO15] Hoi H. Nguyen and Sean O’Rourke “On the Concentration of Random Multilinear Forms and the Universality of Random Block Matrices” In Probability Theory and Related Fields 162.1, 2015, pp. 97–154 DOI: 10.1007/s00440-014-0567-7
  • [NV24] Hoi H. Nguyen and Roger Van Peski “Universality for cokernels of random matrix products” In Advances in Mathematics 438, 2024, pp. 109451 DOI: 10.1016/j.aim.2023.109451
  • [NW22] Hoi H. Nguyen and Melanie Matchett Wood “Random Integral Matrices: Universality of Surjectivity and the Cokernel” In Inventiones mathematicae 228.1, 2022, pp. 1–76 DOI: 10.1007/s00222-021-01082-w
  • [Sal04] Laurent Saloff-Coste “Random Walks on Finite Groups” Series Title: Encyclopaedia of Mathematical Sciences In Probability on Discrete Structures 110 Berlin, Heidelberg: Springer Berlin Heidelberg, 2004, pp. 263–346 DOI: 10.1007/978-3-662-09444-0_5
  • [SZ07] Laurent Saloff-Coste and Jesse Zúñiga “Convergence of some time inhomogeneous Markov chains via spectral techniques” In Stochastic Processes and their Applications 117.8, 2007, pp. 961–979 DOI: 10.1016/j.spa.2006.11.004
  • [TVK10] Terence Tao, Van Vu and Manjunath Krishnapur “Random matrices: Universality of ESDs and the circular law” In The Annals of Probability 38.5 Institute of Mathematical Statistics, 2010, pp. 2023–2065 DOI: 10.1214/10-AOP534
  • [Woo14] Melanie Matchett Wood “The distribution of sandpile groups of random graphs” In Journal of the American Mathematical Society 30.4, 2014
  • [Woo19] Melanie Matchett Wood “Random integral matrices and the Cohen-Lenstra heuristics” In American Journal of Mathematics 141.2, 2019, pp. 383–398
  • [Woo23] Melanie Matchett Wood “Probability Theory for Random Groups Arising in Number Theory” arXiv, 2023 DOI: 10.48550/arXiv.2301.09687