This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Optimal linear sofic approximations of countable groups

Keivan Mallahi-Karai Constructor University, Bremen, Germany [email protected]  and  Maryam Mohammadi Yekta University of Waterloo, Ontario, Canada [email protected]
Abstract.

Let GG be a group. The notion of linear sofic approximations of GG over an arbitrary field FF was introduced and systematically studied by Arzhantseva and Păunescu [AP17]. Inspired by one of the results of [AP17], we introduce and study the invariant κF(G)\kappa_{F}(G) that captures the quality of linear sofic approximations of GG over FF. In this work we show that when FF has characteristic zero and GG is linear sofic over FF, then κF(G)\kappa_{F}(G) takes values in the interval [1/2,1][1/2,1] and 1/21/2 cannot be replaced by any larger value. Further, we show that under the same conditions, κF(G)=1\kappa_{F}(G)=1 when GG is torsion free. These results answer a question posed by Arzhantseva and Păunescu [AP17] for fields of characteristic zero. One of the new ingredients of our proofs is an effective non-concentration estimates for random walks on finitely generated abelian groups, which may be of independent interest.

1. Introduction

Let 𝒢={(Gn,distn)}n1\mathscr{G}=\{(G_{n},\mathrm{dist}_{n})\}_{n\geq 1} be a family of groups, each equipped with a bi-invariant bounded metric. Bi-invariance means that for every x,y,g1,g2Gnx,y,g_{1},g_{2}\in G_{n}, we have the equality

distn(g1xg2,g1yg2)=distn(x,y)\mathrm{dist}_{n}(g_{1}xg_{2},g_{1}yg_{2})=\mathrm{dist}_{n}(x,y)

A 𝒢\mathscr{G}-approximation of a countable group GG consists of an increasing sequence (nk)k1(n_{k})_{k\geq 1} of positive integers and a sequence (ϕk)k1(\phi_{k})_{k\geq 1} of maps

ϕk:GGnk,k1\phi_{k}:G\to G_{n_{k}},\quad k\geq 1

satisfying the following two properties:

  1. (1)

    (Asymptotic homomorphism) For all g,hGg,h\in G, one has

    limkdistnk(ϕk(gh),ϕk(g)ϕk(h))=0.\lim_{k\to\infty}\mathrm{dist}_{n_{k}}(\phi_{k}(gh),\phi_{k}(g)\phi_{k}(h))=0.
  2. (2)

    (Uniform injectivity) There exists κ>0\kappa>0 such that for all gG{eΓ}g\in G\setminus\{e_{\Gamma}\},

    lim supkdistnk(ϕk(g),eGnk)κ.\limsup_{k\to\infty}\mathrm{dist}_{n_{k}}(\phi_{k}(g),e_{{}_{G_{n_{k}}}})\geq\kappa.

We will then also say that GG is κ\kappa-approximable by 𝒢\mathscr{G}. Perhaps the most prominent and well-studied classes of approximable groups are the sofic and hyperlinear groups, which correspond, respectively, to approximation by the family of symmetric groups equipped with the normalized Hamming distance and the family of unitary groups equipped with normalized Hilbert-Schmidt distance. The class of sofic groups was introduced by Gromov in connection with the so-called Gottschalk surjunctivity conjecture [Gro99], while the terminology is due to Weiss [Wei00]. Hyperlinear groups first appeared in the context of Conne’s embedding conjecture. The term hyperlinear was coined by Rădulescu [R0̆8]. Sofic groups are shown to be hyperlinear [ES05, Theorem 2], It is unknown whether every group is sofic, or even hyperlinear.

The class of linear sofic groups over an arbitrary field was introduced by Arzhantseva and Păunescu [AP17, Definition 4.1 and the paragraph following Definition 4.2] who proved fundamental results about this class of groups. This mode of approximation defining linear sofic groups uses general linear groups (over a general field FF fixed in the discussion) as target groups while the metric is defined using the normalized rank. In this regards, linear sofic groups provide a hybrid form of approximation

In order to define this metric, let FF be a field. For d×dd\times d matrices A,BGLd(F)A,B\in\mathrm{GL}_{d}(F), we define

ρd(A,B):=1drank(AB).\rho_{d}(A,B):=\frac{1}{d}\mathrm{rank}(A-B).

The following definition of linear sofic groups will be more convenient for our purpose. The equivalence of two definitions is proven in [AP17, Proposition 4.4].

Definition 1.1.

Let FF be a field, GG a countable group and 0<κ10<\kappa\leq 1. We say that GG is κ\kappa-linear sofic over FF if for every finite set SGS\subseteq G and every δ>0\delta>0 and every 0κ<κ0\leq\kappa^{\prime}<\kappa, there exists d1d\geq 1 and a map ϕ:SGLd(F)\phi:S\to\mathrm{GL}_{d}(F) satisfying the following two properties:

  1. (AH)

    For all g,h,ghSg,h,gh\in S, one has ρd(ϕ(gh),ϕ(g)ϕ(h))<δ\rho_{d}(\phi(gh),\phi(g)\phi(h))<\delta.

  2. (D)

    For all gS{e}g\in S\setminus\{e\}, ρd(ϕ(g),Id)κ\rho_{d}(\phi(g),\mathrm{Id})\geq\kappa^{\prime}.

Such a map is called an (S,δ,κ)(S,\delta,\kappa^{\prime})-map. Roughly speaking, (AH) guarantees that ϕ\phi is almost a homomorphism, while (D) shows that distinct elements are separated out. Following [AP17] we say that GG is linear sofic over FF if it is κ\kappa-linear sofic for some κ>0\kappa>0. It is clear that if κ1<κ2\kappa_{1}<\kappa_{2}, then every κ2\kappa_{2}-linear sofic group is κ1\kappa_{1}-linear sofic. For a countable group GG, we write

κF(G)=sup{κ0:G is κ-linear sofic over F}(0,1].\kappa_{F}(G)=\sup\{\kappa\geq 0:G\text{ is }\kappa\text{-linear sofic }\text{over }F\}\in(0,1].

Note that whenever GG is not κ\kappa-linear sofic over FF for any κ>0\kappa>0, we define κF(G)\kappa_{F}(G) to be zero.

Remark 1.2.

The notion of metric approximation can also be defined using the notion of metric ultraproducts. This alternative definition allows one to avoid limiting processes that require passing to subsequences, and thereby simplifies certain arguments, see [AP17] for examples. Since this point of view will not provide us with any special advantages, we will not use this definition.

1.1. The amplification argument

Let 𝒢={(Gn,distn)}n1\mathscr{G}=\{(G_{n},\mathrm{dist}_{n})\}_{n\geq 1} be as above, and assume that the diameter of GnG_{n} with respect to distn\mathrm{dist}_{n} is normalized to be 11. It is natural to ask whether a κ\kappa-approximable group for some κ>0\kappa>0 is always 11-approximable. Elek and Szabó [ES05] proved that this is the case for sofic groups. A similar statement (with a modified proof) holds for hyperlinear groups. Note that this implies that analogously defined κsofic\kappa_{\textrm{sofic}} and κhyperlinear\kappa_{\textrm{hyperlinear}} can only take values in the set {0,1}\{0,1\}, and the longstanding open question asking whether all groups are sofic is equivalent to κsofic(G)=1\kappa_{\textrm{sofic}}(G)=1 for all groups GG.

Let us recall that both proofs are based on a basic tool, often referred to as amplification, which uses the identity

(1.1) tr(ab)=tr(a)tr(b).\mathrm{tr}(a\otimes b)=\mathrm{tr}(a)\mathrm{tr}(b).

for matrices aa and bb. This identity allows one to show that there exists a function f:(0,1)(0,1)f:(0,1)\to(0,1) such that if one starts with a map ϕ:S𝖲n\phi:S\to\mathsf{S}_{n} with distHamm(ϕ(g),e)β\mathrm{dist}_{\textrm{Hamm}}(\phi(g),e)\geq\beta then the tensor power ϕ2:S𝖲n×n\phi^{\otimes 2}:S\to\mathsf{S}_{n\times n} defined by ϕ2(g)(i,j)=(ϕ(g)(i),ϕ(g)j)\phi^{\otimes 2}(g)(i,j)=(\phi(g)(i),\phi(g)j) satisfies distHamm(ϕ2(g),e)f(β)\mathrm{dist}_{\textrm{Hamm}}(\phi^{\otimes 2}(g),e)\geq f(\beta). Moreover, starting from any β\beta, the sequence of iterates f(n)(β)f^{(n)}(\beta) converges to 11. Hence, by iterating the tensor power operation, one can arrive at arbitrarily well sofic approximation. The case of hyperlinear groups is dealt with in a similar fashion. As it was observed in [AP17], (1.1) does not have an analog for linear sofic approximations. In [AP17], Arzhantseva and Păunescu invented a new amplification argument to prove that every linear sofic group is 1/41/4-linear sofic. A particularly innovative aspect of this argument is that it tracks two different quantities that when coupled together can be used to control the distance to the identity. Then using clever properties of ranks of tensor powers they prove that this amplification argument works. The question of whether the constant 1/41/4 can be improved is left open in [AP17]. We will build upon their work to answer this question in the case of fields of characteristic zero.

1.2. Statement of results

In this paper, we will address the question of optimality of linear sofic approximations. The main results of this paper is the theorem below.

Theorem A.

Let GG be a countable linear sofic group over {\mathbb{C}}. Then

  1. (1)

    If GG is torsion-free, then GG is 11-linear sofic over {\mathbb{C}}.

  2. (2)

    Unconditionally, GG is 1/21/2-linear sofic over {\mathbb{C}}. Moreover, the constant 1/21/2 cannot be improved.

Note that the assertion in Theorem A is in stark contrast with the case of sofic and hyperlinear groups. An interesting observation in [AP17] (see the paragraph before Proposition 5.12) is that the amplification argument does not see the interaction between group elements and will equally work for a subset of a group. Theorem A, however, shows that the optimal constant does indeed depend on the group structure and can even change by passing to a subgroup of finite index.

Remark 1.3.

Although we stated Theorem A over {\mathbb{C}}, one can easily see that it implies the same statement over all fields of characteristic zero; see Remark 5.1. However, an important part of the argument that is based on Lemma 4.2 does not work when FF has positive characteristic. See Remark 5.2 for more details. Henceforth, we write κ\kappa instead of κ\kappa_{\mathbb{C}}.

We briefly outline the proof of Theorem A, which is following the main strategy of [AP17]. Given a finite subset SGS\subseteq G and δ0>0\delta_{0}>0, we start with an (S,δ0,0.24)(S,\delta_{0},0.24)-map ϕ0\phi_{0} (in the sense of Definition 1.1) provided by [AP17]. Using a sequence of functorial operations (see 2.1 for definitions) we replace ϕ0\phi_{0} with an (S,δ0,0.23)(S,\delta_{0},0.23)-map which has the additional property that for every gSg\in S, at least 1/1001/100 of eigenvalues of matrix ϕ(g)\phi(g) are 11. We will then show that the rank of tensor powers of ϕ(g)\phi(g) are controlled by the return probability of a certain random walk on a finitely generated subgroup of {\mathbb{C}}^{\ast}. We will establish required effective non-concentration estimates in Theorem 3.9. This part of proof uses a variety of tools ranging from Fourier analysis to additive combinatorics. Let us note that the effectiveness of these bounds is a key element of the proof: as the asymptotic homomorphism condition (AH) deteriorates after every iteration of tensor power, we need to know in advance the number of required iterations so that we can start with an appropriate δ0\delta_{0}. The counter-intuitive move of adding ones as eigenvalues is needed for this purpose. When ϕ0(g)\phi_{0}(g) is close to unipotent, this argument completely breaks down. In this case, again using the method of [AP17] we will instead show the normalized number of Jordan block of tensor powers tends to zero with an effective bound for speed. This is carried out in Theorem 4.1 by translating the problem to estimating integrals of certain trigonometric sums. In summary, our proof can be viewed as a version of amplification argument where we use additional functors in the process.

Another result of this paper involves determining κ(G)\kappa(G) for finite groups.

Theorem B.

Let GG be a finite group.

  1. (1)

    There exists a finite-dimensional linear representation ψ:GGLd()\psi:G\to\mathrm{GL}_{d}({\mathbb{C}}) of GG such that

    κ(G)=mingGρd(ψ(g),Id).\kappa(G)=\min_{g\in G}\rho_{d}(\psi(g),\mathrm{Id}).

    In particular, κ(G)\kappa(G) is a rational number.

  2. (2)

    κ(G)=1\kappa(G)=1 iff GG has a fixed-point free complex representation.

  3. (3)

    Let p{\mathbb{Z}}_{p} denote the cyclic group of order pp. For prime pp and n2n\geq 2 we have

    κ(pn)=pnpn1pn1.\kappa({\mathbb{Z}}_{p}^{n})=\frac{p^{n}-p^{n-1}}{p^{n}-1}.

    In particular, κ(2n)1/2\kappa({\mathbb{Z}}_{2}^{n})\to 1/2 as nn\to\infty.

One of the main ingredients in the proof of Theorem B is the notion of stability. Broadly described, stability of a group in a mode of metric approximation demands that almost representations be small deformations of exact representations. Finite groups are easily seen to enjoy this property with respect to linear sofic approximations, see Proposition 6.2. Once this is established, the problem reduces to representation theory of finite groups. Proof of Part (a) of Theorem B is based on the simplex method in linear programming. Part (3) implements this for groups pn{\mathbb{Z}}_{p}^{n}, n2n\geq 2. We finally remark that all finite groups to which (2) of Theorem A applies have been classified by Joseph Wolf, [Wol67]. They include groups such as PSL2(𝔽5)\mathrm{PSL}_{2}({\mathbb{F}}_{5}). The next result establishes the value of κF(G)\kappa_{F}(G) for certain classes of groups over fields of positive characteristic.

Theorem C.

Let FF be a field of characteristic pp, and let GG be a finite group such that pp is the smallest prime dividing |G||G|. Then

κF(G)=11p.\kappa_{F}(G)=1-\frac{1}{p}.

One of the problems posed in [AP17] is whether the notions of linear sofic approximation over {\mathbb{C}} and 𝔽p{\mathbb{F}}_{p} are equivalent. Theorem B and Theorem C together show that in general the values of κ(G)\kappa_{\mathbb{C}}(G) and κ𝔽p(G)\kappa_{{\mathbb{F}}_{p}}(G) need not coincide for a finite group GG. This may be viewed as a quantitative reason for the difficulty of the problem of equivalence. We note that quantitative approaches to other metric approximations have also been considered before, see [AC20].

This paper is organized as follows: In Section 2, we will collect basic facts related to the rank metric, and basic theory of random walks on abelian groups and explain how they relate to our question. In Section 3, we prove various non-concentration estimates for random walks on abelian groups. Section 4 contains the proof of Theorem 4.1, involving matrices in Jordan canonical form. These ingredients are put together in Section 5 to prove Theorem A, except for the optimality claim. The optimality, as well as the proof of Theorems B and C, are discussed in Section 6.

Acknowledgement The authors would like to thank Goulnara Arzhantseva for helpful comments and suggestions on an earlier version of this paper. We also thank the anonymous referee for a careful reading of the paper and numerous remarks that significantly improved the exposition of this paper. Special thanks are due to Iosif Pinelis for providing the reference to Theorem 3.2.

2. Preliminaries and notation

In this section, we will set some notation and gather a number of basic facts needed in the rest of the paper. We will denote the group of invertible d×dd\times d matrices over the field {\mathbb{C}} by GLd()\mathrm{GL}_{d}({\mathbb{C}}). This space can be turned into a metric space by defining for A,BGLd()A,B\in\mathrm{GL}_{d}({\mathbb{C}}):

ρd(A,B)=rank(AB)d.\rho_{d}(A,B)=\frac{\mathrm{rank}(A-B)}{d}.

We will often suppress the subscript dd and simply write ρ(A,B)\rho(A,B). Every AGLd()A\in\mathrm{GL}_{d}({\mathbb{C}}) has dd eigenvalues λ1,,λd\lambda_{1},\dots,\lambda_{d}, each counted with multiplicity. We write

𝐦1(A)=1d#{1id:λi=1}.\mathbf{m}_{1}(A)=\frac{1}{d}\#\{1\leq i\leq d:\lambda_{i}=1\}.

For j1j\geq 1, let us denote by WjW_{j} the set of all jj-th roots of unity in {\mathbb{C}}. It will later be convenient to consider the following quantity:

𝐦r(A)=1d#{1id:λij=1rWj}.\mathbf{m}^{\leq r}(A)=\frac{1}{d}\#\{1\leq i\leq d:\lambda_{i}\in\bigcup_{j=1}^{r}W_{j}\}.

The number of Jordan blocks of AA (in its Jordan canonical form) will be denoted by 𝐣(A)\mathbf{j}(A). The number of Jordan blocks corresponding with 11 on the diagonal will be denoted by 𝐣1(A)\mathbf{j}_{1}(A). Note that

𝐣1(A)=dimker(AId)=(1ρ(A,Id))d.\mathbf{j}_{1}(A)=\dim\ker(A-\mathrm{Id})=(1-\rho(A,\mathrm{Id}))d.

The following lemma plays a key role in the arguments used in [AP17]:

Lemma 2.1 ([AP17]).

For every AGLd()A\in\mathrm{GL}_{d}({\mathbb{C}}), we have

ρ(A,Id)max(1𝐦1(A),1𝐣(A)).\rho(A,\mathrm{Id})\geq\max(1-\mathbf{m}_{1}(A),1-\mathbf{j}(A)).

The next lemma will be essential for tracking the multiplicity of eigenvalue 11 in tensor powers of a matrix AA:

Lemma 2.2 ([AP17], Lemma 5.1).

Suppose that {λ1,,λd}\{\lambda_{1},\dots,\lambda_{d}\} is the set of eigenvalues of a d×dd\times d complex matrix AA, each counted with multiplicity. Then, for k1k\geq 1 the set of eigenvalues of AkA^{\otimes k} counted with multiplicity is given by

{λi1λik:1ijd,1jk}.\{\lambda_{i_{1}}\cdots\lambda_{i_{k}}:1\leq i_{j}\leq d,\quad 1\leq j\leq k\}.
Proof.

There exists PGLd()P\in\mathrm{GL}_{d}({\mathbb{C}}) such that P1APP^{-1}AP is upper-triangular. Hence, without loss of generality, we can assume that AA is upper-triangular with diagonal entries λ1,,λd\lambda_{1},\dots,\lambda_{d}. It is easy to see that AkA^{\otimes k} will be upper-triangular in an appropriate ordering of the bases, with diagonal entries given by the list above. Special case k=2k=2 is dealt with in Lemma 5.1 of [AP17]. ∎

2.1. Three functorial operations

Let us now consider three functorial operations that can be applied to a family of matrices in GLd()\mathrm{GL}_{d}({\mathbb{C}}). These will be used to replace an (S,δ,κ)(S,\delta,\kappa)-map by an (S,δ,κ)(S,\delta^{\prime},\kappa^{\prime})-map, which has some better properties. An alternative point of view is that each operation can be viewed as post-composition of the initial map by a representation of GLn\mathrm{GL}_{n}.

  1. (1)

    (Tensors) Consider the representations

    ΨT,m:GLd()GLdm(),AAA,\Psi_{T,m}:\mathrm{GL}_{d}({\mathbb{C}})\to\mathrm{GL}_{d^{m}}({\mathbb{C}}),\quad A\mapsto A\otimes\cdots\otimes A,

    where mm denotes the number of tensors. We will denote ΨT,m(A)\Psi_{T,m}(A) by 𝖳mA{\mathsf{T}}^{m}A.

  2. (2)

    (Direct sums) For m1m\geq 1, let

    ΨS,m:GLd()GLmd(),AAA,\Psi_{S,m}:\mathrm{GL}_{d}({\mathbb{C}})\to\mathrm{GL}_{md}({\mathbb{C}}),\quad A\mapsto A\oplus\cdots\oplus A,

    where the number of summands is mm. Instead of ΨS,m(A)\Psi_{S,m}(A), we write 𝖲mA\mathsf{S}^{m}A.

  3. (3)

    (Adding Identity) For m0m\geq 0, consider the representation

    ΨI,m:GLd()GLm+d(),AAIdm,\Psi_{I,m}:\mathrm{GL}_{d}({\mathbb{C}})\to\mathrm{GL}_{m+d}({\mathbb{C}}),\quad A\mapsto A\oplus\mathrm{Id}_{m},

    where Idm\mathrm{Id}_{m} is the identity matrix of size mm. We write 𝖨mA\mathsf{I}^{m}A for ΨI,m(A)\Psi_{I,m}(A).

Lemma 2.3.

Let AA and BB be d×dd\times d matrices. Then for m,n1m,n\geq 1 we have

  1. (1)

    𝐦1(𝖨md𝖲nA)=𝐦1(A)+mn.\mathbf{m}_{1}(\mathsf{I}^{md}\mathsf{S}^{n}A)=\mathbf{m}_{1}(A)+\frac{m}{n}.

  2. (2)

    ρ(𝖨md𝖲nA,𝖨md𝖲nB)ρ(A,B).\rho(\mathsf{I}^{md}\mathsf{S}^{n}A,\mathsf{I}^{md}\mathsf{S}^{n}B)\leq\rho(A,B).

  3. (3)

    ρ(𝖨md𝖲nA,Id)=nn+mρ(A,Id).\rho(\mathsf{I}^{md}\mathsf{S}^{n}A,\mathrm{Id})=\frac{n}{n+m}\rho(A,\mathrm{Id}).

Proof.

Parts (1) and (2) are straightforward computations. For (3) note that

dim(ker(𝖨md𝖲nA𝖨md𝖲nId))=ndimker(AId)+md.\dim(\ker(\mathsf{I}^{md}\mathsf{S}^{n}A-\mathsf{I}^{md}\mathsf{S}^{n}\mathrm{Id}))=n\dim\ker(A-\mathrm{Id})+md.

The claim follows by a simple computation. ∎

Proposition 2.4.

Let GG be a linear sofic group. Then for every finite SGS\subseteq G, δ>0\delta>0 and κ<0.24\kappa<0.24 there exists an (S,δ,κ)(S,\delta,\kappa)-map ϕ\phi such that 𝐦1(ϕ(g))0.01\mathbf{m}_{1}(\phi(g))\geq 0.01 for all gS{e}g\in S\setminus\{e\}.

Proof.

We know from [AP17] that κ(G)1/4\kappa(G)\geq 1/4. Since κ+0.01<0.25κ(G)\kappa+0.01<0.25\leq\kappa(G), there exists a (S,ϵ,0.01+κ)(S,\epsilon,0.01+\kappa)-map, which we denote by ϕ0\phi_{0}. Set ϕ=𝖨d𝖲100ϕ0\phi=\mathsf{I}^{d}\mathsf{S}^{100}\phi_{0}. By Lemma 2.3, we have 𝐦1(ϕ(g))0.01\mathbf{m}_{1}(\phi(g))\geq 0.01. Moreover, for every gS{e}g\in S\setminus\{e\} we have

ρ(ϕ(g),Id)100101(κ+0.01)κ.\rho(\phi(g),\mathrm{Id})\geq\frac{100}{101}(\kappa+0.01)\geq\kappa.

Lemma 2.5.

Let AA and BB be d×dd\times d matrices. Then ρ(𝖳nA,𝖳nB)nρ(A,B)\rho({\mathsf{T}}^{n}A,{\mathsf{T}}^{n}B)\leq n\rho(A,B).

Proof.

For n=2n=2, we have AABB=A(AB)+(AB)B.A\otimes A-B\otimes B=A\otimes(A-B)+(A-B)\otimes B. The claim follows from [AP17, Proposition 5] and the triangle inequality. The general case follows by a simple inductive argument. ∎

2.2. Preliminaries from probability theory

It will be convenient to reinterpret Lemma 2.2 in a probabilistic language. Let (Λ,+)(\Lambda,+) be a countable abelian group with the neutral element denoted by 0. By a probability measure on Λ\Lambda, we mean a map μ:Λ[0,1]\mu:\Lambda\to[0,1] such that aΛμ(a)=1\sum_{a\in\Lambda}\mu(a)=1. We will then write μ=aΛμ(a)δa\mu=\sum_{a\in\Lambda}\mu(a)\delta_{a}. For BΛB\subseteq\Lambda, we define μ(B)=aBμ(a)\mu(B)=\sum_{a\in B}\mu(a). We say that μ\mu is finitely supported if there exists a finite set BΛB\subseteq\Lambda such that μ(B)=1\mu(B)=1. The smallest set with this property is called the support of μ\mu. Probability measures considered in this paper are finitely supported.

The convolution of probability measures measures μ1\mu_{1} and μ2\mu_{2} on Λ\Lambda is the probability measure defined by

(μ1μ2)(a)=a1+a2=aμ1(a1)μ2(a2).(\mu_{1}\ast\mu_{2})(a)=\sum_{a_{1}+a_{2}=a}\mu_{1}(a_{1})\mu_{2}(a_{2}).

One can see that the convolution is commutative and associative. The kk-th convolution power of μ\mu will be denoted by μ(k)\mu^{(k)}. Given a probability measure μ\mu on the group AA, the μ\mu-random walk on Λ\Lambda (or the random walk governed by μ\mu) is the random process defined as follows. Let (Xk)k1(X_{k})_{k\geq 1} be a sequence of independent random variables, where the law of XiX_{i} is μ\mu. Define the process (Sk)k0(S_{k})_{k\geq 0} by S0=0S_{0}=0 and Sk=X1++XkS_{k}=X_{1}+\cdots+X_{k}. It is easy to see that the law of XkX_{k} is μ(k)\mu^{(k)}. We will use the notation 𝐏[E]\mathbf{P}\left[E\right] to denote the probability of an event EE. Similarly, 𝐄[X]\mathbf{E}[X] denotes the expected value of a random variable XX.

Given AGLd()A\in\mathrm{GL}_{d}({\mathbb{C}}) with eigenvalues λ1,,λd\lambda_{1},\dots,\lambda_{d}, define the probability measure on {\mathbb{C}}

ξA:=1di=1dδλi.\xi_{A}:=\frac{1}{d}\sum_{i=1}^{d}\delta_{\lambda_{i}}.
Proposition 2.6.

Let AA be a d×dd\times d invertible complex matrix. Then

  1. (1)

    ξA\xi_{A} is a finitely supported probability measure on the multiplicative group :={0}{\mathbb{C}}^{\ast}:={\mathbb{C}}\setminus\{0\}.

  2. (2)

    ξA(1)=𝐦1(A).\xi_{A}(1)=\mathbf{m}_{1}(A).

  3. (3)

    For all integer k1k\geq 1 we have

    ξ𝖳k(A)=(ξA)(k).\xi_{{\mathsf{T}}^{k}(A)}=(\xi_{A})^{(k)}.
Proof.

Parts (1) and (2) follow from the definition of ξA\xi_{A}. Part (3) is an immediate corollary of Lemma 2.2. ∎

3. Effective non-concentration bounds on abelian groups

This section is devoted to proving effective non-concentration bounds for random walks on abelian groups. We will use the additive notation in most of this section. However, we will eventually apply these results to subgroups of the multiplicative group of non-zero complex numbers. Let ν\nu be a finitely supported probability measure on a finitely generated abelian group Λ\Lambda. Our goal is to prove effective upper bounds for the return probability ν(n)(0)\nu^{(n)}(0).

Lemma 3.1.

Let ν\nu be a finitely supported probability measure on d{\mathbb{Z}}^{d} such that βν(0)1β\beta\leq\nu(0)\leq 1-\beta for some 0<β<1/20<\beta<1/2. Then

ν(n)(0)Cβn\nu^{(n)}(0)\leq\frac{C}{\sqrt{\beta n}}

where CC is an absolute constant.

Note that the key aspect of Lemma 3.1 is that the decay rate is controlled only by β\beta without no further assumption on the distribution of ν\nu. This fact will be crucial in our application. The proof of Lemma 3.1 is based on a non-concentration estimate in classical probability theory. Before stating the theorem, we need a few definitions. The concentration function Q(X,λ)Q(X,\lambda) of a random variable XX is defined by

Q(X,λ)=supx𝐏[xXx+λ],λ0.Q(X,\lambda)=\sup_{x\in{\mathbb{R}}}{\mathbf{P}}[x\leq X\leq x+\lambda],\quad\lambda\geq 0.

We will use a theorem of Rogozin [Rog61], which generalizes a special case due to Kolmogorov[Kol58]. Our statement of the theorem is taken from [Ess66, Theorem 1], where a new proof using Fourier analysis is given. A variation of this proof can also be found in [Pet75, Chapter].

Theorem 3.2 (Rogozin).

Suppose X1,,XnX_{1},\dots,X_{n} are independent random variables and Sn=X1++XnS_{n}=X_{1}+\cdots+X_{n}. Then for every non-negative λ1,,λnL\lambda_{1},\dots,\lambda_{n}\leq L we have we have

Q(Sn,L)CL(k=1nλk2(1Q(Xk,λk)))1/2,Q(S_{n},L)\leq C\,L\left(\sum_{k=1}^{n}\lambda_{k}^{2}(1-Q(X_{k},\lambda_{k}))\right)^{-1/2},

where CC is an absolute constant.

Proof of Lemma 3.1.

Let ι:d\iota:{\mathbb{Z}}^{d}\to{\mathbb{R}} denote an embedding of d{\mathbb{Z}}^{d} into {\mathbb{R}} as an abelian group. Let ν\nu^{\prime} denote the push-forward of ν\nu which is a finitely supported probability measure defined by ν(x)=ν(ι1(x))\nu^{\prime}(x)=\nu(\iota^{-1}(x)) for every xx\in{\mathbb{R}}. Let X1,,XnX_{1},\dots,X_{n} be independent identically distributed random variables with distribution ν\nu^{\prime}. Then by choosing all λi\lambda_{i} equal to λ>0\lambda>0 we obtain

𝐏[Sn=0]C(n(1Q(X1,λ))1/2.\mathbf{P}\left[S_{n}=0\right]\leq C(n(1-Q(X_{1},\lambda))^{-1/2}.

Letting λ0\lambda\to 0 we obtain ν(n)(0)=𝐏[Sn=0]C(n(1q))1/2\nu^{\prime(n)}(0)=\mathbf{P}\left[S_{n}=0\right]\leq C(n(1-q))^{-1/2} where q=maxx𝐏[Xi=x]q=\max_{x\in{\mathbb{R}}}\mathbf{P}\left[X_{i}=x\right]. Since ν(0)=ν(0)β\nu^{\prime}(0)=\nu(0)\geq\beta, we have ν(x)1β\nu^{\prime}(x)\leq 1-\beta for all x{0}x\in{\mathbb{R}}\setminus\{0\}. Since ν(0)1β\nu(0)\leq 1-\beta, we have q1βq\leq 1-\beta. The claim follows by noticing that ν(n)(0d)=ν(n)(0)\nu^{(n)}(0_{{\mathbb{Z}}_{d}})=\nu^{\prime(n)}(0_{{\mathbb{R}}}). ∎

Remark 3.3.

It might be tempting to expect a similar upper bound for ν(n)(0)\nu^{(n)}(0) under the weaker assumption that ν(0)1β\nu(0)\leq 1-\beta. This is, however, not true. To see this, consider the probability measure νk\nu_{k} with νk(1)=11k\nu_{k}(1)=1-\frac{1}{k} and νk(k)=1k\nu_{k}(-k)=\frac{1}{k}. Although νk(0)=0\nu_{k}(0)=0, we have for all k1k\geq 1 νk(k+1)(0)=k+1k(11k)k1/e\nu_{k}^{(k+1)}(0)=\frac{k+1}{k}(1-\frac{1}{k})^{k}\approx 1/e.

We will now consider a variant of Lemma 3.1 for finite cyclic groups N{\mathbb{Z}}_{N}. In this case the uniform measure is the stationary measure and hence νn(0)\nu^{n}(0) (under irreducibility assumptions) will converge to 1/N1/N as nn\to\infty. Note, however, that the random walk and its frequency of visits to zero can be bounded by how much the measure is supported on small subgroups. The next lemma provides an effective upper upper bound for ν(n)(0)\nu^{(n)}(0) only depending on the mass given to elements of small order.

Lemma 3.4 (Non-concentration for random walks on cyclic groups).

Let ν\nu be a probability measure on N{\mathbb{Z}}_{N}. Further, suppose that for some 0<β1/20<\beta\leq 1/2 and positive integer r>1r>1 the following hold:

  1. (1)

    βν(0)1β\beta\leq\nu(0)\leq 1-\beta.

  2. (2)

    For every subgroup HNH\leq{\mathbb{Z}}_{N} with |H|r|H|\leq r, we have ν(H)1β\nu(H)\leq 1-\beta.

Then for all n1n\geq 1 we have

ν(n)(0)1r+Cβn+enβ/2.\nu^{(n)}(0)\leq\frac{1}{r}+\frac{C^{\prime}}{\sqrt{\beta n}}+{e^{-n\beta/2}}.

where CC^{\prime} is an absolute constant.

The following proof is an adaptation of the proof of Littlewood-Offord estimate [NW21, Theorem 6.3]. This theorem is proven under different conditions, where instead of condition (2), a stronger assumption on the measure of all subgroups is imposed.

The proof uses the Fourier transform of measure and some of Let ν\nu be a probability measure on N{\mathbb{Z}}_{N}. Write eN(x)=exp(2πix/N)e_{N}(x)=\exp(2\pi ix/N) for xNx\in{\mathbb{Z}}_{N}. The following basic property of eNe_{N} is used several times in the sequel. For each ana\in{\mathbb{Z}}_{n}, we have xneN(ax)\sum_{x\in{\mathbb{Z}}_{n}}e_{N}(ax). We will define the (Fourier) transform ν^:N\widehat{\nu}:{\mathbb{Z}}_{N}\to{\mathbb{C}} by

ν^(t):=aNν(a)eN(at).\widehat{\nu}(t):=\sum_{a\in{\mathbb{Z}}_{N}}\nu(a)e_{N}(at).
Lemma 3.5.

Let ν\nu be a probability measure on N{\mathbb{Z}}_{N}. Then we have

  1. (1)

    ν^(0)=1\widehat{\nu}(0)=1.

  2. (2)

    For all n1n\geq 1, we have ν(n)^=(ν^)n\widehat{\nu^{(n)}}=(\widehat{\nu})^{n}.

  3. (3)

    ν(0)=1NtNν^(t).\nu(0)=\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}\widehat{\nu}(t).

Proof.

Part (1) is clear. For (2), suppose that ν1\nu_{1} and ν2\nu_{2} are two probability measures on N{\mathbb{Z}}_{N}. Then we have

(3.1) ν1ν2^(t)=aN(ν1ν2)(a)eN(at)=aNa1+a2=aν1(a1)ν2(a2)eN(a1t)eN(a2t)=a1Nν1(a1)eN(a1t)a2Nν2(a2)eN(a2t)=ν1^(t)ν2^(t).\begin{split}\widehat{\nu_{1}\ast\nu_{2}}(t)&=\sum_{a\in{\mathbb{Z}}_{N}}(\nu_{1}\ast\nu_{2})(a)e_{N}(at)\\ &=\sum_{a\in{\mathbb{Z}}_{N}}\sum_{a_{1}+a_{2}=a}\nu_{1}(a_{1})\nu_{2}(a_{2})e_{N}(a_{1}t)e_{N}(a_{2}t)\\ &=\sum_{a_{1}\in{\mathbb{Z}}_{N}}\nu_{1}(a_{1})e_{N}(a_{1}t)\sum_{a_{2}\in{\mathbb{Z}}_{N}}\nu_{2}(a_{2})e_{N}(a_{2}t)=\widehat{\nu_{1}}(t)\widehat{\nu_{2}}(t).\end{split}

Now, (2) follows by induction. For (3) note that

(3.2) 1NtNν^(t)=1NtNaNν(a)eN(at)=1NaNν(a)tNeN(a)=ν(0),\begin{split}\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}\widehat{\nu}(t)&=\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}\sum_{a\in{\mathbb{Z}}_{N}}\nu(a)e_{N}(at)\\ &=\frac{1}{N}\sum_{a\in{\mathbb{Z}}_{N}}\nu(a)\sum_{t\in{\mathbb{Z}}_{N}}e_{N}(a)=\nu(0),\end{split}

where the last equality follows from the fact that for every aNa\in{\mathbb{Z}}_{N}, we have tNeN(at)=0\sum_{t\in{\mathbb{Z}}_{N}}e_{N}(at)=0 unless a=0a=0, in which case it is equal to NN. ∎

We can now start with the proof of Lemma 3.4.

Proof of Lemma 3.4.

Using parts (2) and (3) of Lemma 3.5 we have

ν(n)(0)=1NtNν^(t)n,\nu^{(n)}(0)=\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}\widehat{\nu}(t)^{n},

Applying riangle inequality we deduce

ν(n)(0)1NtN|ν^(t)|n.\nu^{(n)}(0)\leq\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}|\ \widehat{\nu}(t)|^{n}.

Writing ψ(t)=1|ν^(t)|2\psi(t)=1-|\widehat{\nu}(t)|^{2} and using the inequality |x|exp(1x22)|x|\leq\exp\left(-\frac{1-x^{2}}{2}\right) that holds for all xx\in{\mathbb{R}} we obtain

ν(n)(0)1NtNexp(n2ψ(t)).\nu^{(n)}(0)\leq\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}\exp\left(-\frac{n}{2}\psi(t)\right).

Set f(t)=nψ(t)f(t)=n\psi(t) and define T(w)={t:f(t)w}T(w)=\{t:f(t)\leq w\}. Note that since f(0)=nψ(0)=0f(0)=n\psi(0)=0, hence 0T(w)0\in T(w) for all w0w\geq 0.

By separating the sum into level sets we obtain the following inequality:

(3.3) ν(n)(0)1N0|T(w)|ew/2𝑑w.\nu^{(n)}(0)\leq\frac{1}{N}\int_{0}^{\infty}|T(w)|e^{-w/2}dw.

For an integer k1k\geq 1, and subsets A1,,AkNA_{1},\dots,A_{k}\subseteq{\mathbb{Z}}_{N} we write

A1++Ak:={a1++ak:aiAi,1ik}.A_{1}+\cdots+A_{k}:=\{a_{1}+\cdots+a_{k}:a_{i}\in A_{i},\quad 1\leq i\leq k\}.

We use the shorthand kAkA for A++AA+\cdots+A, where kk is the number of summands. The following lemma is proven in [Map10, Proposition 3.5] for all finite fields. A simple verification shows that it applies verbatim to all finite cyclic groups. For reader’s convenience we will sketch the modification needed in the proof.

Lemma 3.6.

For any w>0w>0 and integer k1k\geq 1, we have

kT(w)T(k2w).kT(w)\subseteq T(k^{2}w).
Proof.

It suffices to show that for all β1,,βkN\beta_{1},\dots,\beta_{k}\in{\mathbb{Z}}_{N} we have

ψ(β1++βk)k(ψ(β1)++ψ(βk)).\psi(\beta_{1}+\cdots+\beta_{k})\leq k(\psi(\beta_{1})+\cdots+\psi(\beta_{k})).

This, in turn can be re-written as

1a,bNμ(a)μ(b)cos(2πN(ab)(β1++βk))k2kj=1ka,bNμ(a)μ(b)cos(2πN(ab)βj).1-\sum_{a,b\in{\mathbb{Z}}_{N}}\mu(a)\mu(b)\cos(\frac{2\pi}{N}(a-b)(\beta_{1}+\cdots+\beta_{k}))\leq k^{2}-k\sum_{j=1}^{k}\sum_{a,b\in{\mathbb{Z}}_{N}}\mu(a)\mu(b)\cos(\frac{2\pi}{N}(a-b)\beta_{j}).

This follows from the trigonometric inequality proven in [Map10, Proposition 3.1]. ∎

We will also need a lemma from additive combinatorics. Define the set of symmetries of a set ANA\subseteq{\mathbb{Z}}_{N} by SymA:={hN:h+A=A}\operatorname{Sym}A:=\{h\in{\mathbb{Z}}_{N}:h+A=A\}. Note that SymA\operatorname{Sym}A is a subgroup of N{\mathbb{Z}}_{N} and if 0A0\in A then Sym(A)A\operatorname{Sym}(A)\subseteq A. The lemma follows by a simple inductive argument from [TV12, Theorem 5.5].

Lemma 3.7 (Kneser’s bound).

Let ZZ be an abelian group and A1,,AkZA_{1},\dots,A_{k}\subseteq Z are finite subsets. Then we have

|A1++Ak|+(k1)|Sym(A1++Ak)||A1|++|Ak|.|A_{1}+\cdots+A_{k}|+(k-1)|\operatorname{Sym}(A_{1}+\cdots+A_{k})|\geq|A_{1}|+\cdots+|A_{k}|.

In particular, for any finite AA we have

k|A||kA|+(k1)|Sym(kA)|,k|A|\leq|kA|+(k-1)|\operatorname{Sym}(kA)|,

where kA=A++AkA=A+\cdots+A, with the sum containing kk copies of AA.

We will now claim that if HH is a subgroup of N{\mathbb{Z}}_{N} with |H|Nr|H|\geq\frac{N}{r} then there exists tHt\in H with f(t)nβf(t)\geq n\beta. To prove this note that

1|H|tHf(t)=n|H|tH(1|ν^(t)|2).\frac{1}{|H|}\sum_{t\in H}f(t)=\frac{n}{|H|}\sum_{t\in H}(1-|\widehat{\nu}(t)|^{2}).

Expanding the definition of ν^\widehat{\nu} in |ν^(t)|2=ν^(t)ν^(t)¯|\widehat{\nu}(t)|^{2}=\widehat{\nu}(t)\cdot\overline{\widehat{\nu}(t)}, and summing for tHt\in H we obtain

(3.4) tH|ν^(t)|2=tHθ1nν(θ1)eN(θ1t)θ2nν(θ2)eN(θ2t)=tHθ1,θ2nν(θ1)ν(θ2)eN((θ1θ2)t)=|H|θ1,θ2Nν(θ1)ν(θ2)𝟏H(θ1θ2)=|H|θ1Nν(θ1)ν(θ1+H).\begin{split}\sum_{t\in H}|\widehat{\nu}(t)|^{2}&=\sum_{t\in H}\sum_{\theta_{1}\in{\mathbb{Z}}_{n}}\nu(\theta_{1})e_{N}(\theta_{1}t)\sum_{\theta_{2}\in{\mathbb{Z}}_{n}}\nu(\theta_{2})e_{N}(-\theta_{2}t)\\ &=\sum_{t\in H}\sum_{\theta_{1},\theta_{2}\in{\mathbb{Z}}_{n}}\nu(\theta_{1})\nu(\theta_{2})e_{N}((\theta_{1}-\theta_{2})t)=|H|\cdot\sum_{\theta_{1},\theta_{2}\in{\mathbb{Z}}_{N}}\nu(\theta_{1})\nu(\theta_{2})\mathbf{1}_{H^{\perp}}(\theta_{1}-\theta_{2})\\ &=|H|\cdot\sum_{\theta_{1}\in{\mathbb{Z}}_{N}}\nu(\theta_{1})\nu(\theta_{1}+H^{\perp}).\end{split}

Here, HH^{\perp} denotes the dual of HH, consisting of xNx\in{\mathbb{Z}}_{N} such that eN(hx)=1e_{N}(hx)=1 for all hHh\in H. Note that HH^{\perp} is a subgroup of N{\mathbb{Z}}_{N}. If θ1H\theta_{1}\not\in H^{\perp} then 0θ1+H0\not\in\theta_{1}+H^{\perp} and hence ν(θ1+H)1ν(0)1β\nu(\theta_{1}+H^{\perp})\leq 1-\nu(0)\leq 1-\beta. If θ1H\theta_{1}\in H^{\perp} then θ1+H=H\theta_{1}+H^{\perp}=H^{\perp}. Since |H|r|H^{\perp}|\leq r, we have ν(θ1+H)1β\nu(\theta_{1}+H^{\perp})\leq 1-\beta in this case as well. This implies that tH|ν^(t)|21β\sum_{t\in H}|\widehat{\nu}(t)|^{2}\leq 1-\beta. This shows that 1|H|tHf(t)nβ,\frac{1}{|H|}\sum_{t\in H}f(t)\geq n\beta, implying that there exists tHt\in H such that f(t)nβf(t)\geq n\beta.

Let us now suppose w<nβw<n\beta. Let kk be the largest positive integer with k2w<nβk^{2}w<n\beta. We claim that |Sym(kT(w))|Nr|\operatorname{Sym}(kT(w))|\leq\frac{N}{r}. In fact, if this is not the case, using Lemma 3.6 we have

kT(w)T(k2w),kT(w)\subseteq T(k^{2}w),

it follows that T(k2w)T(k^{2}w) contains a subgroup HH with more than N/rN/r elements. We will then have

HT(k2w)T(nβ)H\subseteq T(k^{2}w)\subseteq T(n\beta)

which contradicts that claim proved above. It follows from Lemma 3.7 that for all w<nβw<n\beta, we have

|T(w)|1k|T(k2w)|+Nr.|T(w)|\leq\frac{1}{k}|T(k^{2}w)|+\frac{N}{r}.

By the choice of k1k\geq 1 we have

(2k)2w(k+1)2wnβ(2k)^{2}w\geq(k+1)^{2}w\geq n\beta

implying that knβ4wk\geq\sqrt{\frac{n\beta}{4w}}. Putting all these together we obtain

|T(w)|4wnβN+Nr.|T(w)|\leq\sqrt{\frac{4w}{n\beta}}N+\frac{N}{r}.

Inserting the latter bound in the range w<nβw<n\beta and the trivial bound |T(w)|N|T(w)|\leq N for w>nβw>n\beta into (3.3), we arrive at

(3.5) ν(n)(0)12N0nβ(4wnβN+Nr)ew/2𝑑w+12NnβNew/2𝑑wCnβ0wew/2𝑑w+1r+12nβew/2𝑑w1r+Cnβ+enβ/2\begin{split}\nu^{(n)}(0)&\leq\frac{1}{2N}\int_{0}^{n\beta}(\sqrt{\frac{4w}{n\beta}}N+\frac{N}{r})e^{-w/2}dw+\frac{1}{2N}\int_{n\beta}^{\infty}Ne^{-w/2}dw\\ &\leq\frac{C}{\sqrt{n\beta}}\int_{0}^{\infty}\sqrt{w}e^{-w/2}dw+\frac{1}{r}+\frac{1}{2}\int_{n\beta}^{\infty}e^{-w/2}dw\\ &\leq\frac{1}{r}+\frac{C^{\prime}}{\sqrt{n\beta}}+e^{-n\beta/2}\end{split}

This ends the proof of Lemma 3.4.

Lemma 3.8.

Let ν\nu be a probability measure on N{\mathbb{Z}}_{N} and 0<β<1/20<\beta<1/2 such that βν(0)1β\beta\leq\nu(0)\leq 1-\beta. Then for some constant CC^{\prime} and all n1n\geq 1 we have

ν(n)(0)12+Cβn+enβ/2.\nu^{(n)}(0)\leq\frac{1}{2}+\frac{C^{\prime}}{\sqrt{\beta n}}+e^{-n\beta/2}.
Proof.

This is proven is similar to the proof of Lemma 3.4. As before we write

ν(n)(0)1NtN|ν^(t)|n12N0|T(w)|ew/2𝑑w.\nu^{(n)}(0)\leq\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}|\ \widehat{\nu}(t)|^{n}\leq\frac{1}{2N}\int_{0}^{\infty}|T(w)|e^{-w/2}dw.

We now claim that for all w<nβw<n\beta we have

|T(w)|4wnβN+N2.|T(w)|\leq\sqrt{\frac{4w}{n\beta}}N+\frac{N}{2}.

To show this, for w<nβw<n\beta denote by kk the largest positive integer with k2w<nβk^{2}w<n\beta. We claim that |Sym(T(w)++T(w))|N2|\operatorname{Sym}(T(w)+\cdots+T(w))|\leq\frac{N}{2}. Suppose that this is not the case. Recall that T(w)={t:f(t)w}T(w)=\{t:f(t)\leq w\}, where f(t)=n(1|ν^(t)|2)f(t)=n\left(1-|\widehat{\nu}(t)|^{2}\right). Since ν^(0)=1\widehat{\nu}(0)=1 we have f(0)=0f(0)=0. Since w>0w>0, we have 0T(w)0\in T(w). This implies that

Sym(T(w)++T(w))T(w)++T(w)T(k2w),\operatorname{Sym}(T(w)+\cdots+T(w))\subseteq T(w)+\cdots+T(w)\subseteq T(k^{2}w),

it follows that T(k2w)T(k^{2}w) contains a subgroup with more than N/2N/2 elements, hence has to be equal to N{\mathbb{Z}}_{N}, from which it follows that T(nβ)=NT(n\beta)={\mathbb{Z}}_{N}. On the other hand, using the Plancherel formula we have

1NtN|ν^(t)|2=θ1Nν(θ1)2(1β)2+β21β,\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}|\widehat{\nu}(t)|^{2}=\sum_{\theta_{1}\in{\mathbb{Z}}_{N}}\nu(\theta_{1})^{2}\leq(1-\beta)^{2}+\beta^{2}\leq 1-\beta,

where the second inequality follows from (1β)(1β)2β2=β(12β)>0(1-\beta)-(1-\beta)^{2}-\beta^{2}=\beta(1-2\beta)>0. This implies that 1NtN(1|ν^(t)|2)β\frac{1}{N}\sum_{t\in{\mathbb{Z}}_{N}}(1-|\widehat{\nu}(t)|^{2})\geq\beta. Hence there exists tNt\in{\mathbb{Z}}_{N} such that f(t)nβf(t)\geq n\beta. A similar argument as above finishes the proof. ∎

We will now combine Lemmas 3.1, 3.4, 3.8 to prove a non-concentration theorem for random walks on abelian groups with cyclic torsion component. Let Λ:=Λ1×Λ2\Lambda:=\Lambda_{1}\times\Lambda_{2} where Λ1s\Lambda_{1}\sim{\mathbb{Z}}^{s} and Λ2N\Lambda_{2}\simeq{\mathbb{Z}}_{N} are free and torsion components of Λ\Lambda. For i=1,2i=1,2, denote the projection from Λ\Lambda onto Λi\Lambda_{i} by πi\pi_{i}, and for probability measure ν\nu on Λ\Lambda, we write νi:=(πi)ν\nu_{i}:=(\pi_{i})_{\ast}\nu for the push-forward of ν\nu under πi\pi_{i}.

Theorem 3.9 (Non-concentration for random walks on abelian groups).

Let Λ\Lambda be a finitely generated abelian group with a cyclic torsion subgroup and ν\nu a finitely supported probability measure on Λ\Lambda. Let 0<β<1/20<\beta<1/2 and r1r\geq 1 be a positive integer. Assume that

  1. (1)

    βν(0)1β\beta\leq\nu(0)\leq 1-\beta.

  2. (2)

    ν(Λr)1β\nu(\Lambda^{{\tiny{\leq r}}})\leq 1-\beta, where Λr\Lambda^{\leq r} is the subset of Λ\Lambda consisting of elements of order at most rr.

Then for all n1n\geq 1 we have

ν(n)(0)1r+C′′nβ+enβ/4,\nu^{(n)}(0)\leq\frac{1}{r}+\frac{C^{\prime\prime}}{\sqrt{n\beta}}+e^{-n\beta/{\color[rgb]{0,0,0}4}},

where C′′C^{\prime\prime} is an absolute constant. Moreover, only under assumption (1) above, we have

ν(n)(0)12+C′′nβ+enβ/4.\nu^{(n)}(0)\leq\frac{1}{2}+\frac{C^{\prime\prime}}{\sqrt{n\beta}}+e^{-n\beta/{\color[rgb]{0,0,0}4}}.
Proof.

Clearly we have

ν(n)(0)min(ν1(n)(0),ν2(n)(0)).\nu^{(n)}(0)\leq\min(\nu_{1}^{(n)}(0),\nu_{2}^{(n)}(0)).

Moreover, the inequality βνj(0)\beta\leq\nu_{j}(0) is satisfied for both j=1,2j=1,2. We consider two cases: if ν1(0)1β4\nu_{1}(0)\leq 1-\frac{\beta}{4}, then it follows from Lemma 3.1 that

ν(n)(0)ν1(n)(0)2Cβn.\nu^{(n)}(0)\leq\nu_{1}^{(n)}(0)\leq\frac{2C}{\sqrt{\beta n}}.

Hence, suppose that ν1(0)>1β4\nu_{1}(0)>1-\frac{\beta}{4}. We claim that in this case we must have ν2(0)1β2\nu_{2}(0)\leq 1-\frac{\beta}{2}. If this is not the case, then ν2(0)112β\nu_{2}(0)\geq 1-\frac{1}{2}\beta which contradicts assumption (2). We will now verify condition (2) of Lemma 3.4 by finding an upper bound on the measure (under ν2\nu_{2}) of the set of elements of order at most rr in Λ2\Lambda_{2}.

(3.6) ν2{zΛ2:ordΛ2(z)r}=ν{z=(z1,z2)Λ:ordΛ2(z2)r}ν({z=(0,z2)Λ:ord(z)r})+ν({z=(z1,z2)Λ:z10})1β+β4<1β2.\begin{split}\nu_{2}\{z\in\Lambda_{2}:\operatorname{ord}_{\Lambda_{2}}(z)\leq r\}&=\nu\{z=(z_{1},z_{2})\in\Lambda:\operatorname{ord}_{\Lambda_{2}}(z_{2})\leq r\}\\ &\leq\nu(\{z=(0,z_{2})\in\Lambda:\operatorname{ord}(z)\leq r\})+\nu(\{z=(z_{1},z_{2})\in\Lambda:z_{1}\neq 0\})\\ &\leq 1-\beta+\frac{\beta}{4}<1-\frac{\beta}{2}.\end{split}

In particular, if HH is a subgroup with |H|r|H|\leq r, then the order of every element of HH is at most rr, and hence ν2(H)1β2\nu_{2}(H)\leq 1-\frac{\beta}{2}. We can now apply Lemma 3.4 and Lemma 3.8 (with β/2\beta/2 instead of β\beta) to obtain the result. ∎

Corollary 3.10.

Let AA be a matrix in GLd()\mathrm{GL}_{d}({\mathbb{C}}) and β>0\beta>0. Assume that

  1. (1)

    β𝐦1(A)1β\beta\leq\mathbf{m}_{1}(A)\leq 1-\beta.

  2. (2)

    𝐦r(A)(1β)\mathbf{m}^{\leq r}(A)\leq(1-\beta)

Then for every η>0\eta>0 there exists M:=M(β,r,η)M:=M(\beta,r,\eta) such that for all nMn\geq M we have

𝐦1(𝖳nA)1r+η.\mathbf{m}_{1}({\mathsf{T}}^{n}A)\leq\frac{1}{r}+\eta.

Moreover, only under assumption (1), for every η>0\eta>0 there exists M:=M(β,η)M^{\prime}:=M^{\prime}(\beta,\eta) such that, for all nMn\geq M^{\prime} we have

𝐦1(𝖳nA)12+η.\mathbf{m}_{1}({\mathsf{T}}^{n}A)\leq\frac{1}{2}+\eta.
Proof.

Denote by Λ\Lambda the subgroup of {\mathbb{C}}^{\ast} generated by the eigenvalues of AA. Denote the subgroup of torsion elements in Λ\Lambda by Λ2\Lambda_{2} and let Λ1s\Lambda_{1}\simeq{\mathbb{Z}}^{s} be a subgroup of Λ\Lambda such that Λ=Λ1×Λ2\Lambda=\Lambda_{1}\times\Lambda_{2}. First suppose that conditions (1) and (2) hold. These guarantee that ξA\xi_{A} satisfies (1) and (2) of Theorem 3.9. It follows that for all n1n\geq 1 we have

ξA(n)(0)1r+C′′nβ+enβ/4.\xi_{A}^{(n)}(0)\leq\frac{1}{r}+\frac{C^{\prime\prime}}{\sqrt{n\beta}}+e^{-n\beta/{4}}.

Given η>0\eta>0, we can find M=M(β,η)M=M(\beta,\eta) such that for all nMn\geq M we have

C′′nβ+enβ/4<η.\frac{C^{\prime\prime}}{\sqrt{n\beta}}+e^{-n\beta/{4}}<\eta.

This inequality together with the bound 𝐦1(𝖳nA)=ξA(n)(1)\mathbf{m}_{1}({\mathsf{T}}^{n}A)=\xi_{A}^{(n)}(1) establishes the claim. The second part of the statement (involving only assumption (1)) follows by a similar argument. ∎

4. Dynamics of the number of Jordan blocks under tensor powers

In this section, we will study the effect of amplification on matrices AA with 𝐦1(A)\mathbf{m}_{1}(A) close to 11. If AA is such a matrix, for ρ(AId)\rho(A-\mathrm{Id}) to be away from zero, a definite proportion of its Jordan blocks have to be of size at least 22. We will show that under this condition, the proportion of the number of Jordan blocks to the size of matrix tends to zero. Moreover, we will give an effective bound for the number of iterations required to reduce this ratio below any given positive ϵ\epsilon.

We will denote by J(α,s)J(\alpha,s) a Jordan block of size ss with eigenvalue α\alpha. Let AA be a d×dd\times d complex matrix. We denote by 𝐣(A)\mathbf{j}(A) the number of Jordan blocks in the Jordan decomposition of AA divided by dd, hence 0𝐣(A)10\leq\mathbf{j}(A)\leq 1, with 𝐣(A)=1\mathbf{j}(A)=1 iff AA is diagonalizable. Define a sequence of matrices by setting

(4.1) A1=A,Am=Am1Am1,m1.A_{1}=A,\quad A_{m}=A_{m-1}\otimes A_{m-1},\quad m\geq 1.

The main result of this section is the following theorem:

Theorem 4.1.

For every γ>0\gamma>0 and every η>0\eta>0 there exists N(γ,η)N(\gamma,\eta) such that if 𝐣(A)1γ\mathbf{j}(A)\leq 1-\gamma and m>N(γ,η)m>N(\gamma,\eta) then 𝐣(Am)<η\mathbf{j}(A_{m})<\eta, where AmA_{m} is defined by (4.1).

The proof of this theorem will involve estimating certain trigonometric integrals. Before reformulating the problem, we will recall some elementary facts.

Lemma 4.2 ([AP17, II09], Lemma 5.7).

For m,nm,n\in\mathbb{N} and α,β\alpha,\beta\in{\mathbb{C}} we have:

J(α,m)J(β,n)=i=1min(m,n)J(αβ,m+n+12i).J(\alpha,m)\otimes J(\beta,n)=\bigoplus_{i=1}^{\min(m,n)}J(\alpha\beta,m+n+1-2i).
Remark 4.3.

It follows from this lemma that the number of Jordan blocks of a given size in tensor powers of AA only depends on the size of Jordan blocks in AA and not on the corresponding eigenvalues. We will use this simple fact later.

In order to study the asymptotic behavior of the number of Jordan blocks in tensor powers of AA, it will be convenient to set up some algebraic framework. Let us denote by 𝒜\mathcal{A} the set of all n×nn\times n matrices in the Jordan normal form with eigenvalue 11 on the diagonal, where n1n\geq 1 varies over the set of all natural numbers. Note that if A𝒜A\in\mathcal{A} then AA𝒜A\otimes A\in\mathcal{A}. We will denote by \mathcal{E} the set of all formal linear combinations n1cnδn\sum_{n\geq 1}c_{n}\delta_{n}, where cn0c_{n}\geq 0 are integers and cn=0c_{n}=0 for all but finitely many values of nn. We equip \mathcal{E} with a binary operation defined by

δmδn=i=1min(m,n)δm+n+12i\delta_{m}\ast\delta_{n}=\sum_{i=1}^{\min(m,n)}\delta_{m+n+1-2i}

and extended linearly to 𝒯\mathcal{T}. Finally, for n1n\geq 1, consider the Laurent polynomial

Tn(x)=i=1nxn2i+1=xn1+xn3++x(n3)+x(n1),T_{n}(x)=\sum_{i=1}^{n}x^{n-2i+1}=x^{n-1}+x^{n-3}+...+x^{-(n-3)}+x^{-(n-1)},

and denote by 𝒯\mathcal{T} the set of all (finite) integer linear combinations of {Tn(x)}n1\{T_{n}(x)\}_{n\geq 1}. Note that since TnT_{n} have different degrees, they do form a basis for the {\mathbb{Z}}-module 𝒯\mathcal{T}.

Now we will construct natural maps between these objects that will allow us to encode the number and size of Jordan blocks in tensor powers of a matrix using polynomials Tn(x)T_{n}(x). First, define the map Δ:𝒜\Delta:\mathcal{A}\to\mathcal{E} as follows. Let A𝒜A\in\mathcal{A} be a matrix with cic_{i} blocks of size ii for i1i\geq 1. Then define

Δ(A)=n1cnδn.\Delta(A)=\sum_{n\geq 1}c_{n}\delta_{n}.

Note that Δ\Delta is a bijection. We now define Θ:𝒯\Theta:\mathcal{E}\to\mathcal{T} by

Θ(n1cnδn)=n1cnTn(x).\Theta\left(\sum_{n\geq 1}c_{n}\delta_{n}\right)=\sum_{n\geq 1}c_{n}T_{n}(x).
Lemma 4.4.

The following hold:

  1. (1)

    For all A1,A2𝒜A_{1},A_{2}\in\mathcal{A}, we have Δ(A1A2)=Δ(A1)Δ(A2).\Delta(A_{1}\otimes A_{2})=\Delta(A_{1})\ast\Delta(A_{2}).

  2. (2)

    For all x1,x2x_{1},x_{2}\in\mathcal{E} we have Θ(x1x2)=Θ(x1)Θ(x2).\Theta(x_{1}\ast x_{2})=\Theta(x_{1})\Theta(x_{2}).

Proof.

For (1), note that when A1A_{1} consists of one block of size mm and A2A_{2} is a block of size nn, this is a restatement of Lemma 4.2. In this case, we have Δ(A1)=δm\Delta(A_{1})=\delta_{m} and Δ(A2)=δn\Delta(A_{2})=\delta_{n} and Δ(A1A2)\Delta(A_{1}\otimes A_{2}) is precisely the expression defined by δmδn\delta_{m}\ast\delta_{n}. It follows immediately from the definition of tensor product of matrices that the equality extends to all linear combinations with non-negative integer coefficients. This establishes (1).

In order to prove (2), we will show that For m,nm,n\in\mathbb{N}, the following identity holds:

Tm(x)Tn(x)=i=1min(m,n)Tm+n+12i(x).T_{m}(x)T_{n}(x)=\sum_{i=1}^{\min(m,n)}T_{m+n+1-2i}(x).

Without loss of generality, assume that mnm\leq n. We will proceed by calculating the coefficient of xrx^{r} for rr\in{\mathbb{Z}} on both sides. Note that xrx^{r} appears on either the right-hand side or the left-hand side if and only if r=m+n2jr=m+n-2j for some 1jm+n11\leq j\leq m+n-1. Since the coefficients of xjx^{j} and xjx^{-j} are equal, it suffices to consider the coefficient of xrx^{r} for non-negative values of rr, which correspond to 0jm+n20\leq j\leq\frac{m+n}{2}. Fix jj in this range. It follows from the definition that xm+n2jx^{m+n-2j} appears in Tm+n+12kT_{m+n+1-2k} if and only if m+n2km+n2jm+n-2k\geq m+n-2j, or, equivalently if kjk\leq j. Since we have 1km1\leq k\leq m, the term xm+n2jx^{m+n-2j} appears exactly min(j,m)\min(j,m) times on the right-hand side. We now show that the coefficient of xm+n2jx^{m+n-2j} on the left-hand side is the same. Note that the coefficient of the term xrx^{r} with r=m+n2jr=m+n-2j in TmTnT_{m}T_{n} is equal to the number of pairs (j1,j2)(j_{1},j_{2}) such that xm+12j1x^{m+1-2j_{1}} appears in TmT_{m} and xn+12j2x^{n+1-2j_{2}} appears in TnT_{n} where (m+12j1)+(n+12j2)=m+n2j(m+1-2j_{1})+(n+1-2j_{2})=m+n-2j. These conditions can be reformulated as bellow:

(4.2) j1+j2=j+11j1m,1j2n.j_{1}+j_{2}=j+1\qquad 1\leq j_{1}\leq m,1\leq j_{2}\leq n.

Let us distinguish two cases: First suppose that jmj\leq m. In this case we choose 1j1j1\leq j_{1}\leq j and any such choice determines j2=j+1j1j_{2}=j+1-j_{1} uniquely. Hence the number of solutions to (4.2) is jj. When j>mj>m, then j1j_{1} is allowed to vary in the range 1j1m1\leq j_{1}\leq m, and since jm+n2j\leq\frac{m+n}{2}, j2:=j+1j1j_{2}:=j+1-j_{1} will automatically satisfy the inequality 1j2n1\leq j_{2}\leq n. In conclusion, the coefficient of xm+n2jx^{m+n-2j} on the left-hand side equals min(j,m)\min(j,m) as well. This proves the claim. It thus follows that Θ(x1x2)=Θ(x1)Θ(x2)\Theta(x_{1}\ast x_{2})=\Theta(x_{1})\Theta(x_{2}) holds for x1=δmx_{1}=\delta_{m} and x2=δnx_{2}=\delta_{n}. Since Θ\Theta is extended to \mathcal{E} by linearity, the general case follows immediately. ∎

Corollary 4.5.

Let AA be a d×dd\times d invertible complex matrix. Let an(k)a_{n}^{(k)} be the multiplicity of Jordan blocks of size nn in the Jordan decomposition of AkA_{k}.

(n1an(1)Tn(x))2k1=n1an(k)Tn(x)\left(\sum_{n\geq 1}a_{n}^{(1)}T_{n}(x)\right)^{2^{k-1}}=\sum_{n\geq 1}a_{n}^{(k)}T_{n}(x)
Proof.

Using Remark 4.3, we can assume that all eigenvalues are equal to 11, or A𝒜A\in\mathcal{A}. It is clear that Δ(A)=n1an(1)δn\Delta(A)=\sum_{n\geq 1}a_{n}^{(1)}\delta_{n}. Part (1) of Lemma 4.4 implies that

Δ(Am)=Δ(Am1Am1)=Δ(Am1)Δ(Am1).\Delta(A_{m})=\Delta(A_{m-1}\otimes A_{m-1})=\Delta(A_{m-1})\ast\Delta(A_{m-1}).

Since Δ(Am)=n1an(m)δn\Delta(A_{m})=\sum_{n\geq 1}a_{n}^{(m)}\delta_{n}, applying Θ\Theta to the previous equation and using part (2) of Lemma 4.4 give

n1an(m)Tn(x)=(n1an(m1)Tn(x))2.\sum_{n\geq 1}a_{n}^{(m)}T_{n}(x)=\left(\sum_{n\geq 1}a_{n}^{(m-1)}T_{n}(x)\right)^{2}.

The claim follows by induction on mm. ∎

It will be more convenient to work with a trigonometric generating function. This is carried out in the next lemma.

Corollary 4.6.

With the same notation as in Corollary 4.5, set

Pk(θ)=1dn1an(k)sin(nθ)sin(θ).P_{k}(\theta)=\frac{1}{d}\underset{n\geq 1}{\sum}a_{n}^{(k)}\frac{\sin(n\theta)}{\sin(\theta)}.

For all k1k\geq 1 we have

P1(θ)2k=Pk(θ)P_{1}(\theta)^{2^{k}}=P_{k}(\theta)
Proof.

Substituting x=eiθx=e^{i\theta} into Tn(x)T_{n}(x) yields

Tn(eiθ)=i=1nei(n2i+1)θ=einθeinθeiθeiθ=sin(nθ)sin(θ).T_{n}(e^{i\theta})=\sum_{i=1}^{n}e^{i(n-2i+1)\theta}=\frac{e^{in\theta}-e^{-in\theta}}{e^{i\theta}-e^{-i\theta}}=\frac{\sin(n\theta)}{\sin(\theta)}.

By applying Corollary 4.5 we obtain

(n1an(1)sin(nθ)sin(θ))2k=n1an(k)sin(nθ)sin(θ).\left(\underset{n\geq 1}{\sum}a_{n}^{(1)}\frac{\sin(n\theta)}{\sin(\theta)}\right)^{2^{k}}=\underset{n\geq 1}{\sum}a_{n}^{(k)}\frac{\sin(n\theta)}{\sin(\theta)}.

Lemma 4.7.

Let 0ϵ20\leq\epsilon\leq 2 and 2n2\leq n\in\mathbb{N}. For any real number θ\theta such that |cos(θ)|1ϵ2|\cos(\theta)|\leq 1-\frac{\epsilon}{2} we have

|sin(nθ)sin(θ)|nϵ.\left|\frac{\sin(n\theta)}{\sin(\theta)}\right|\leq n-\epsilon.

In particular, |sin(nθ)sin(θ)|n|\frac{\sin(n\theta)}{\sin(\theta)}|\leq n holds for all θ\theta\in\mathbb{R}.

Proof.

We will proceed by induction on nn. For n=2n=2 the required inequality is obvious. Assume that it also holds for nn. One can write:

|sin((n+1)θ)sin(θ)|=|sin(nθ)cos(θ)+sin(θ)cos(nθ)sin(θ)||sin(nθ)sin(θ)||cos(θ)|+|cos(nθ)|nϵ+1\left|\frac{\sin((n+1)\theta)}{\sin(\theta)}\right|=\left|\frac{\sin(n\theta)\cos(\theta)+\sin(\theta)\cos(n\theta)}{\sin(\theta)}\right|\leq\left|\frac{\sin(n\theta)}{\sin(\theta)}\right|\cdot|\cos(\theta)|+|\cos(n\theta)|\leq n-\epsilon+1

Thus the required inequality holds for n+1n+1 as well and we are done. ∎

Corollary 4.8.

For all θ\theta\in{\mathbb{R}} we have |P1(θ)|1|P_{1}(\theta)|\leq 1.

Proof.

This follows immediately from Lemma 4.7 and the equality n1nan(1)=d.\sum_{n\geq 1}na_{n}^{(1)}=d.

For a measurable subset BB\subseteq{\mathbb{R}} we denote by μ(B)\mu(B) is the Lebesgue measure of BB. We will also denote the number of Jordan blocks of size 11 in AA by 𝐣1(A)\mathbf{j}_{1}(A).

Lemma 4.9.

Suppose that AGLd()A\in\mathrm{GL}_{d}(\mathbb{C}) is in the Jordan normal form. For ϵ>0\epsilon>0, set

Bϵ:={θ[0,2π]:|P1(θ)|1(𝐣(A)𝐣1(A))ϵ}.B_{\epsilon}:=\{\theta\in[0,2\pi]:|P_{1}(\theta)|\leq 1-(\mathbf{j}(A)-\mathbf{j}_{1}(A))\epsilon\}.

Then

μ([0,2π]Bϵ)8ϵ.\mu([0,2\pi]\setminus B_{\epsilon})\leq 8\sqrt{\epsilon}.
Proof.

If all Jordan blocks are of size 11, then 𝐣1(A)=𝐣(A)\mathbf{j}_{1}(A)=\mathbf{j}(A) and the claim follow Corollary 4.8. More generally, suppose cos(θ)1ϵ2\cos(\theta)\leq 1-\frac{\epsilon}{2}. Then it follows from Lemma 4.7 that

1|P1(θ)|=1dm2(m|sin(mθ)sin(θ)|)am(1)1dm2ϵam(1)=ϵ(𝐣(A)𝐣1(A))1-|P_{1}(\theta)|=\frac{1}{d}\sum_{m\geq 2}\left(m-\left|\frac{\sin(m\theta)}{\sin(\theta)}\right|\right)a_{m}^{(1)}\geq\frac{1}{d}\sum_{m\geq 2}\epsilon a_{m}^{(1)}=\epsilon(\mathbf{j}(A)-\mathbf{j}_{1}(A))

This implies that θBϵ\theta\in B_{\epsilon}. Hence

μ([0,2π]Bϵ)4cos1(1ϵ/2)8ϵ,\mu([0,2\pi]\setminus B_{\epsilon})\leq 4\cos^{-1}(1-\epsilon/2)\leq 8\sqrt{\epsilon},

where one can easily verify the last inequality for all ϵ(0,1)\epsilon\in(0,1). ∎

Lemma 4.10.

Suppose AGLd()A\in\mathrm{GL}_{d}({\mathbb{C}}) is not unipotent, that is, 𝐣(A)<1\mathbf{j}(A)<1. Then we have 𝐣1(AA)𝐣(A)2\mathbf{j}_{1}(A\otimes A)\leq\mathbf{j}(A)^{2}.

Proof.

First, notice that J(α,s)J(β,t)J(\alpha,s)\bigotimes J(\beta,t) generates a Jordan block of size 1 if and only if the equation s+t+12i=1s+t+1-2i=1 holds for some 1imin(s,t)1\leq i\leq\min(s,t). Equivalently s+t=2is+t=2i for 1imin(s,t)1\leq i\leq\min(s,t). This equation has a solution if and only if s=ts=t. Therefore, we have

𝐣1(AA)=n1(an(1))2d2(n1an(1)d)2=𝐣(A)2.\mathbf{j}_{1}(A\otimes A)=\frac{\sum_{n\geq 1}(a_{n}^{(1)})^{2}}{d^{2}}\leq\left(\frac{\sum_{n\geq 1}a_{n}^{(1)}}{d}\right)^{2}=\mathbf{j}(A)^{2}.

Lemma 4.11.

For a,ba,b\in\mathbb{R}, define In(a,b)=absin(nθ)sin(θ)𝑑θI_{n}(a,b)=\int_{a}^{b}\frac{\sin(n\theta)}{\sin(\theta)}d\theta. Then for all nn\in\mathbb{Z} we have

(n+1)(In+2(a,b)In(a,b))=2[sin((n+1)b)sin((n+1)a)].(n+1)\big{(}I_{n+2}(a,b)-I_{n}(a,b)\big{)}=2\big{[}\sin((n+1)b)-\sin((n+1)a)\big{]}.
Proof.

As a,ba,b are fixed during this proof we write InI_{n} instead of In(a,b)I_{n}(a,b). First note that

In+2=absin((n+2)θ)sin(θ)𝑑θ=absin(nθ)cos(2θ)sin(θ)𝑑θ+absin(2θ)cos(nθ)sin(θ)𝑑θI_{n+2}=\int_{a}^{b}\frac{\sin((n+2)\theta)}{\sin(\theta)}d\theta=\int_{a}^{b}\frac{\sin(n\theta)\cos(2\theta)}{\sin(\theta)}d\theta+\int_{a}^{b}\frac{\sin(2\theta)\cos(n\theta)}{\sin(\theta)}d\theta

Using cos(2θ)=cos2(θ)sin2(θ)=12sin2(θ)\cos(2\theta)=\cos^{2}(\theta)-\sin^{2}(\theta)=1-2\sin^{2}(\theta) we can write:

In+2=\displaystyle I_{n+2}= absin(nθ)(12sin2(θ))sin(θ)𝑑θ+2abcos(θ)cos(nθ)𝑑θ\displaystyle\int_{a}^{b}\frac{\sin(n\theta)(1-2\sin^{2}(\theta))}{\sin(\theta)}d\theta+2\int_{a}^{b}\cos(\theta)\cos(n\theta)d\theta
=\displaystyle= absin(nθ)sin(θ)𝑑θ2absin(θ)sin(nθ)𝑑θ+2abcos(θ)cos(nθ)𝑑θ\displaystyle\int_{a}^{b}\frac{\sin(n\theta)}{\sin(\theta)}d\theta-2\int_{a}^{b}\sin(\theta)\sin(n\theta)d\theta+2\int_{a}^{b}\cos(\theta)\cos(n\theta)d\theta
=\displaystyle= In+2ab(cos(nθ)cos(θ)sin(nθ)sin(θ)dθ\displaystyle I_{n}+2\int_{a}^{b}(\cos(n\theta)\cos(\theta)-\sin(n\theta)\sin(\theta)d\theta
=\displaystyle= In+2abcos((n+1)θ)dθ=In+2n+1(sin((n+1)b)sin((n+1)a).\displaystyle I_{n}+2\int_{a}^{b}\cos((n+1)\theta)d\theta=I_{n}+\frac{2}{n+1}(\sin((n+1)b)-\sin((n+1)a).

Corollary 4.12.
  1. (1)

    For all odd integers nn we have

    In(0,2π)=2π,In(π2,3π2)=π.I_{n}(0,2\pi)=2\pi,\quad I_{n}(\frac{\pi}{2},\frac{3\pi}{2})=\pi.
  2. (2)

    For all even integers nn we have

    In(0,2π)=0,In(π2,3π2)3.I_{n}(0,2\pi)=0,\quad I_{n}(\frac{\pi}{2},\frac{3\pi}{2})\geq 3.
Proof.

A direct computation shows that I2(0,2π)=0I_{2}(0,2\pi)=0 and I1(0,2π)=2πI_{1}(0,2\pi)=2\pi. Repeated application of Lemma 4.11 extends this to all integers nn. In a similar fashion, checking that I1(π/2,3π/2)=πI_{1}(\pi/2,3\pi/2)=\pi together with sin((n+1)π2)=sin(3(n+1)π2)=0\sin((n+1)\frac{\pi}{2})=\sin(3(n+1)\frac{\pi}{2})=0 implies the claim for all odd values of nn. The last inequality is easy to check for n=2,4n=2,4. For n40n\overset{4}{\equiv}0 we have:

In+2=In4n+1=In2+4n14n+1>In23I_{n+2}=I_{n}-\frac{4}{n+1}=I_{n-2}+\frac{4}{n-1}-\frac{4}{n+1}>I_{n-2}\geq 3

and for n42n\overset{4}{\equiv}2 we can write:

In+2=In+4n+1>In23.I_{n+2}=I_{n}+\frac{4}{n+1}>I_{n-2}\geq 3.

Corollary 4.13.

Let AGLd()A\in\mathrm{GL}_{d}({\mathbb{C}}) and Pk(θ)P_{k}(\theta) be defined as above. Then we have

𝐣(Ak)1302πP1(θ)2k𝑑θ.\mathbf{j}(A_{k})\leq\frac{1}{3}\int_{0}^{2\pi}P_{1}(\theta)^{2^{k}}\ d\theta.
Proof.

Using Corollary 4.6 and Corollary 4.12 we can write:

02πP1(θ)2k𝑑θ\displaystyle\int_{0}^{2\pi}P_{1}(\theta)^{2^{k}}d\theta π23π2P1(θ)2k𝑑θ=n1an(k)d2kπ23π2sin(nθ)sin(θ)𝑑θ\displaystyle\geq\int_{\frac{\pi}{2}}^{\frac{3\pi}{2}}P_{1}(\theta)^{2^{k}}d\theta=\sum_{n\geq 1}\frac{a_{n}^{(k)}}{d^{2^{k}}}\int_{\frac{\pi}{2}}^{\frac{3\pi}{2}}\frac{\sin(n\theta)}{\sin(\theta)}d\theta
3n1a2n(k)d2k+2πn1a2n1(k)d2k3𝐣(Ak).\displaystyle\geq 3\sum_{n\geq 1}\frac{a_{2n}^{(k)}}{d^{2^{k}}}+2\pi\sum_{n\geq 1}\frac{a_{2n-1}^{(k)}}{d^{2^{k}}}\geq 3\mathbf{j}(A_{k}).

Proof of Theorem 4.1.

Let AGLd()A\in\mathrm{GL}_{d}(\mathbb{C)} be such that 𝐣(A)1t\mathbf{j}(A)\leq\frac{1}{t}, where t=(1γ)1>1t=(1-\gamma)^{-1}>1. Let τ<12(γ/(1γ))2\tau<\frac{1}{2}(\gamma/(1-\gamma))^{2}. We will find a function N0(γ)N_{0}(\gamma) such that for all mN0(γ)m\geq N_{0}(\gamma) we have 𝐣(Am)1t+τ\mathbf{j}(A_{m})\leq\frac{1}{t+\tau}. By repeating this process we see that for all 1\ell\geq 1 and all mN0(γ)m\geq\ell N_{0}(\gamma) we have 𝐣(Am)1t+τ\mathbf{j}(A_{m})\leq\frac{1}{t+\ell\tau}. Let kk be defined by k=(ητ)1+1k=\left\lceil(\eta\tau)^{-1}\right\rceil+1 so that kτ>η1k\tau>\eta^{-1}. It is clear that

N(γ,η)=kN0(γ)N(\gamma,\eta)=kN_{0}(\gamma)

will satisfy the required property.

If we have 𝐣(AA)1t+τ\mathbf{j}(A\otimes A)\leq\frac{1}{t+\tau} we set m=2m=2, and we are done. Note that Proposition 5.8. of [AP17], we have 𝐣(BB)𝐣(B)\mathbf{j}(B\otimes B)\leq\mathbf{j}(B) for any matrix BB. Since Am+1=AmAmA_{m+1}=A_{m}\otimes A_{m}, it follows that if we once the inequality holds for m=2m=2 then the same inequality will continue to hold for all m2m\geq 2. Set let us assume that 𝐣(AA)>1t+τ\mathbf{j}(A\otimes A)>\frac{1}{t+\tau}. Then we have

02πP1(θ)2k𝑑θ=\displaystyle\int_{0}^{2\pi}P_{1}(\theta)^{2^{k}}d\theta= Bϵ|P1(θ)|2k𝑑θ+[0,2π]Bϵ|P1(θ)|2k𝑑θ\displaystyle\int_{B_{\epsilon}}|P_{1}(\theta)|^{2^{k}}d\theta+\int_{[0,2\pi]\setminus B_{\epsilon}}|P_{1}(\theta)|^{2^{k}}d\theta
\displaystyle\leq Bϵ(1ϵ(𝐣(A)𝐣1(A)))2k𝑑θ+[0,2π]Bϵ𝑑θ\displaystyle\int_{B_{\epsilon}}(1-\epsilon(\mathbf{j}(A)-\mathbf{j}_{1}(A)))^{2^{k}}d\theta+\int_{[0,2\pi]\setminus B_{\epsilon}}d\theta
\displaystyle\leq 2π(1ϵ(𝐣(A)𝐣1(A)))2k+μ([0,2π]Bϵ)\displaystyle 2\pi(1-\epsilon(\mathbf{j}(A)-\mathbf{j}_{1}(A)))^{2^{k}}+\mu([0,2\pi]\setminus B_{\epsilon})
\displaystyle\leq 2π(11t+τϵ+1t2ϵ)2k+8ϵ\displaystyle 2\pi\left(1-\frac{1}{t+\tau}\epsilon+\frac{1}{t^{2}}\epsilon\right)^{2^{k}}+8\sqrt{\epsilon}

It follows from the choice of τ\tau that 1t+τ>2t2\frac{1}{t+\tau}>\frac{2}{t^{2}}. This implies that

02πP1(θ)2k𝑑θ2π(1ϵt2)2k+8ϵ.\int_{0}^{2\pi}P_{1}(\theta)^{2^{k}}d\theta\leq 2\pi\left(1-\frac{\epsilon}{t^{2}}\right)^{2^{k}}+8\sqrt{\epsilon}.

Set ϵ=128(t+τ)2<12t2\epsilon=\frac{1}{2^{8}(t+\tau)^{2}}<\frac{1}{2t^{2}} so that the second term is bounded by 12(t+τ)\frac{1}{2(t+\tau)}. Now, let N0(γ)N_{0}(\gamma) be the smallest positive integer mm such that

2π(1ϵt2)2m<12(t+τ).2\pi\left(1-\frac{\epsilon}{t^{2}}\right)^{2^{m}}<\frac{1}{2(t+\tau)}.

It is clear that for this value of mm we have

02πP1(θ)2m𝑑θ1t+τ.\int_{0}^{2\pi}P_{1}(\theta)^{2^{m}}d\theta\leq\frac{1}{t+\tau}.

The claim will now follow from Corollary 4.13. ∎

5. Proof of Theorem A

In this section, we will prove parts (1) and (2) of Theorem A. The proof crucially depends on the inequality

ρ(A,Id)max(1𝐦1(A),1𝐣(A)).\rho(A,\mathrm{Id})\geq\max(1-\mathbf{m}_{1}(A),1-\mathbf{j}(A)).

from Lemma 2.1. Let GG be a torsion-free group and SS a finite subset of GG with eSe\not\in S. We will show that for every ϵ>0\epsilon>0 and δ>0\delta>0 an (S,δ,1ϵ)(S,\delta,1-\epsilon)-map into GLD()\mathrm{GL}_{D}({\mathbb{C}}) exists. Let r=2/ϵ+1r=\lfloor 2/\epsilon\rfloor+1 and set S¯={gn:gS,1nr!}\overline{S}=\{g^{n}:g\in S,1\leq n\leq r!\}. By Lemma 2.4, there exists an (S¯,δ0,0.23)(\overline{S},\delta_{0},0.23)-map ϕ0\phi_{0} such that

(5.1) 𝐦1(ϕ0(g))0.01\mathbf{m}_{1}(\phi_{0}(g))\geq 0.01

for all gS¯g\in\overline{S}. Here, δ0\delta_{0} is a small quantity whose values will be determined later. We will find NN such that for n>Nn>N, the tensor power 𝖳2nϕ0{\mathsf{T}}^{2^{n}}\phi_{0} is an (S,δ,1ϵ)(S,\delta,1-\epsilon)-map. Fix some gSg\in S. We will consider two different cases.

Case (1): Suppose that 𝐣(ϕ0(gr!))<0.99\mathbf{j}(\phi_{0}(g^{r!}))<0.99. Then it follows from Lemma 2.5 and Theorem 4.1 that for m=N(0.01,ϵ/2)m=N(0.01,\epsilon/2) we have 𝐣(𝖳2mϕ0(gr!))ϵ\mathbf{j}({\mathsf{T}}^{2^{m}}\phi_{0}(g^{r!}))\leq\epsilon. This implies that

ρ(𝖳2mϕ0(gr!),Id)1ϵ/2\rho({\mathsf{T}}^{2^{m}}\phi_{0}(g^{r!}),\mathrm{Id})\geq 1-\epsilon/2

On the other hand, using Lemma 2.5, we have

ρ(𝖳2mϕ0(gr!),𝖳2mϕ(g)r!)2mρ(ϕ0(gr!),ϕ(g)r!)2mr!δ0<ϵ/2\rho({\mathsf{T}}^{2^{m}}\phi_{0}(g^{r!}),{\mathsf{T}}^{2^{m}}\phi(g)^{r!})\leq 2^{m}\rho(\phi_{0}(g^{r!}),\phi(g)^{r!})\leq 2^{m}r!\delta_{0}<\epsilon/2

as soon as δ0<ϵ/2m+1r!\delta_{0}<\epsilon/2^{m+1}r!. From here, we have by the triangle inequality that

ρ(𝖳2mϕ0(g)r!,Id)1ϵ.\rho({\mathsf{T}}^{2^{m}}\phi_{0}(g)^{r!},\mathrm{Id})\geq 1-\epsilon.

Note that if BB is any matrix and m1m\geq 1 then ker(BI)ker(BmI)\ker(B-I)\subseteq\ker(B^{m}-I). In other words, we have ρ(B,I)ρ(Bm,I)\rho(B,I)\geq\rho(B^{m},I). This implies that ρ(𝖳2mϕ0(g),Id)1ϵ.\rho({\mathsf{T}}^{2^{m}}\phi_{0}(g),\mathrm{Id})\geq 1-\epsilon.

Case (2): Suppose that 𝐣(ϕ0(gr!))0.99\mathbf{j}(\phi_{0}(g^{r!}))\geq 0.99. The proportion of blocks corresponding to eigenvalue 11 is 𝐣1(ϕ0(gr!))\mathbf{j}_{1}(\phi_{0}(g^{r!})), and the proportion of other blocks is at most 1ξϕ0(gr!)(1).1-\xi_{\phi_{0}(g^{r!})}(1). This implies that

𝐣1(ϕ0(gr!))+1ξϕ0(gr!)(1)0.99.\mathbf{j}_{1}(\phi_{0}(g^{r!}))+1-\xi_{\phi_{0}(g^{r!})}(1)\geq 0.99.

Since 𝐣1(ϕ0(gr!))=1ρ(ϕ0(gr!),Id)10.23=0.77\mathbf{j}_{1}(\phi_{0}(g^{r!}))=1-\rho(\phi_{0}(g^{r!}),\mathrm{Id})\leq 1-0.23=0.77 we obtain

ρ(ϕ0(gr!),Id)1𝐦1(ϕ0(gr!)=1ξϕ0(gr!)(1)0.22.\rho(\phi_{0}(g^{r!}),\mathrm{Id})\geq 1-\mathbf{m}_{1}(\phi_{0}(g^{r!})=1-\xi_{\phi_{0}(g^{r!})}(1)\geq 0.22.

Also note that

ρ(ϕ0(g)r!,ϕ0(gr!))r!δ0.\rho(\phi_{0}(g)^{r!},\phi_{0}(g^{r!}))\leq r!\delta_{0}.

It thus follows that

ρ(ϕ0(g)r!,Id)0.22r!δ00.21\rho(\phi_{0}(g)^{r!},\mathrm{Id})\geq 0.22-r!\delta_{0}\geq 0.21

as long as δ0<1100r!\delta_{0}<\frac{1}{100\,r!}. If DD denotes the size of the matrix ϕ0(g)r!\phi_{0}(g)^{r!}, then we know that ϕ0(g)r!\phi_{0}(g)^{r!} has at least 0.99D0.99D Jordan blocks. If kk denotes the number of blocks corresponding to the eigenvalue 11, then noting that each such block contributes a one-dimensional subspace to ker(ϕ0(g)r!I)\ker(\phi_{0}(g)^{r!}-I), we deduce that k0.21Dk\leq 0.21D. Note that by the assumption 𝐣(ϕ0(gr!))0.99\mathbf{j}(\phi_{0}(g^{r!}))\geq 0.99, that is, ϕ0(gr!)\phi_{0}(g^{r!}) has at least 0.99D0.99D Jordan blocks. This means that the total number of eigenvalues (with multiplicity) coming from blocks of size at least 22 is at most 0.01D0.01D. Hence, the multiplicity of 11 as an eigenvalue of ϕ0(gr!)\phi_{0}(g^{r!}) is at most 0.23D0.23D.

Note that if (λi)1id(\lambda_{i})_{1\leq i\leq d} are eigenvalues of ρ(g)\rho(g) then (λir!)(\lambda_{i}^{r!}) are eigenvalues of ρ(g)r!\rho(g)^{r!}. If λik=1\lambda_{i}^{k}=1 for some 1kr1\leq k\leq r then λir!=1\lambda_{i}^{r!}=1. This implies that

(5.2) 𝐦r(ϕ0(g))𝐦1(ϕ0(g)r!)=ξϕ0(g)r!(1)1β.\mathbf{m}^{\leq r}(\phi_{0}(g))\leq\mathbf{m}_{1}(\phi_{0}(g)^{r!})=\xi_{\phi_{0}(g)^{r!}}(1)\leq 1-\beta.

for β=0.23\beta=0.23. Applying Lemma 3.10 it follows that for m>M(β,r,1/r)m>M(\beta,r,1/r), we have

𝐦1(𝖳mϕ0(g))2rϵ.\mathbf{m}_{1}({\mathsf{T}}^{m}\phi_{0}(g))\leq\frac{2}{r}\leq\epsilon.

It follows that ρ(𝖳mϕ0(g)),Id)1ϵ\rho({\mathsf{T}}^{m}\phi_{0}(g)),\mathrm{Id})\geq 1-\epsilon. In conclusion, for m>max(M(β,r,1/r),N(0.01,ϵ))m>\max(M(\beta,r,1/r),N(0.01,\epsilon)), we have

ρ(𝖳mϕ0(g)),Id)1ϵ.\rho({\mathsf{T}}^{m}\phi_{0}(g)),\mathrm{Id})\geq 1-\epsilon.

for all gSg\in S. Note, however, that

ρ(𝖳mϕ0(g)𝖳mϕ0(h),𝖳mϕ0(gh))2mδ0.\rho({\mathsf{T}}^{m}\phi_{0}(g){\mathsf{T}}^{m}\phi_{0}(h),{\mathsf{T}}^{m}\phi_{0}(gh))\leq 2^{m}\delta_{0}.

Hence we need to choose δ0<min(1100r!,δ2N,ϵ2m+1r!)\delta_{0}<\min(\frac{1}{100\,r!},\frac{\delta}{2^{N}},\frac{\epsilon}{2^{m+1}r!}) with N=max(M(β,r,1/r),N(0.01,ϵ))N=\max(M(\beta,r,1/r),N(0.01,\epsilon)) for the desired inequality to hold. The proof of (2) is similar and somewhat simpler. This time, we work directly with SS (and not S¯\overline{S}) and apply part (2) of Corollary 3.10.

Remark 5.1.

One can deduce from Theorem A the more general version in which the field of complex numbers is replaced by an arbitrary field of characteristic zero FF. In order to see this, suppose that GG is κ\kappa-linear sofic over FF. This implies that for every finite set SGS\subseteq G and every δ>0\delta>0 and every 0κ<κ0\leq\kappa^{\prime}<\kappa, there exists d1d\geq 1 and a map ϕ:SGLd(F)\phi:S\to\mathrm{GL}_{d}(F) that satisfies properties (AH)(AH) and (D)(D) of Definition 1.1. Since SS is finite, one can replace FF by a finitely generated subfield FF^{\prime} of FF (depending on SS). However, every such field is isomorphic to a subfield of {\mathbb{C}}. This implies that the arguments given above show that one can amplify ϕ\phi. Now, note that all the amplifications are constructed via the functorial operations describe in 2.1. This implies that the image remains in GLm(F)\mathrm{GL}_{m}(F) for some m1m\geq 1, from which the claim follows.

Remark 5.2.

The part of the proof that is based on Lemma 4.2 does not work over fields of positive characteristic. In fact, the entire section 4 uses heavily the special form of this formula. It would be interesting to see if the method can be generalized to fields of positive characteristic.

6. Stability and the proof of Theorems B and C

In this section, we will prove Theorem B. Along the way, we will also address the more general question of determining κ(G)\kappa(G) when GG is an arbitrary finite group. As a byproduct, we will show that the bound κ(G)1/2\kappa(G)\geq 1/2 cannot be improved for finite groups. This will be carried out through computation of κ(pn)\kappa({\mathbb{Z}}_{p}^{n}), from which it will follow that as nn\to\infty

κ(pn)11p.\kappa({\mathbb{Z}}_{p}^{n})\to 1-\frac{1}{p}.

The special case of p=2p=2 will then prove the claim. One ingredient of the proof is the notion of stability for linear sofic representations. Studying stability for different modes of metric approximation has been an active area of research in the last decade. Stability of finite groups for sofic approximation was proved by Glebsky and Rivera [GR09]. Arzhantseva and Păunescu [AP15] showed that abelian groups are stable for sofic approximation. This result was generalized by Becker, Lubotzky, and Thom [BLT19] who established a criterion in terms of invariant random subgroups for (sofic) stability in the class of amenable groups. For other related results, see for instance [AP15, DCGLT20, BLT19, BL20] and references therein. Some progress towards proving the stability of 2{\mathbb{Z}}^{2} in linear sofic approximation has been made in [EG21]. Our first theorem establishes the stability of finite groups in the normalized rank metric. Before stating and proving this result, we will need a simple fact from linear algebra.

Lemma 6.1.

Suppose that W1,,WiW_{1},\dots,W_{i} are subspaces of d{\mathbb{C}}^{d} with dim(Wr)d(1ϵ)\dim(W_{r})\geq d(1-\epsilon), for 1ir1\leq i\leq r. Then, we have dim(i=1rWi)d(1rϵ)\dim(\cap_{i=1}^{r}W_{i})\geq d(1-r\epsilon).

Proof.

We will proceed by induction on rr. For r=1r=1 there is nothing to prove. Assume that the claim is shown for r1r-1. Then, using the induction hypothesis, one can write:

dim(i=1rWi)\displaystyle\dim(\cap_{i=1}^{r}W_{i}) =dim(i=1r1Wi)+dim(Wr)dim(i=1k1Wi+Wr)\displaystyle=\dim(\cap_{i=1}^{r-1}W_{i})+\dim(W_{r})-\dim(\cap_{i=1}^{k-1}W_{i}+W_{r})
d(1(r1)ϵ)+d(1ϵ)d\displaystyle\geq d(1-(r-1)\epsilon)+d(1-\epsilon)-d
=d(1rϵ)\displaystyle=d(1-r\epsilon)

Proposition 6.2.

Let GG be a finite group, ϵ>0\epsilon>0 and φ:GGLd()\varphi:G\to\mathrm{GL}_{d}(\mathbb{C}) is such that for all g,hGg,h\in G we have

ρ(φ(g)φ(h)φ(gh))<ϵ.\rho(\varphi(g)\varphi(h)-\varphi(gh))<\epsilon.

Then there exists a representation ψ:GGLd()\psi:G\to\mathrm{GL}_{d}(\mathbb{C}) such that for every gGg\in G we have

ρ(φ(g),ψ(g))<|G|2ϵ.\rho(\varphi(g),\psi(g))<|G|^{2}\epsilon.
Proof.

For g,hGg,h\in G consider the subspace defined by

Wg,h=ker(φ(g)φ(h)φ(gh))W_{g,h}=\ker(\varphi(g)\varphi(h)-\varphi(gh))

and set W=g,hGWg,hW=\cap_{g,h\in G}W_{g,h}. We claim that WW is a GG-invariant subspace of d{\mathbb{C}}^{d}. Assume that w1w_{1} is an arbitrary element of WW. It suffices to prove that for any kGk\in G, φ(k)w1\varphi(k)w_{1} is an element of WW as well. Since w1Ww_{1}\in W, we can write:

φ(g)φ(h)(φ(k)w1)=\displaystyle\varphi(g)\varphi(h)\big{(}\varphi(k)w_{1}\big{)}= φ(g)(φ(h)φ(k)w1)\displaystyle\varphi(g)\big{(}\varphi(h)\varphi(k)w_{1}\big{)}
=\displaystyle= φ(g)φ(hk)w1=φ(ghk)w1\displaystyle\varphi(g)\varphi(hk)w_{1}=\varphi(ghk)w_{1}
=\displaystyle= φ(gh)(φ(k)w1)\displaystyle\varphi(gh)(\varphi(k)w_{1})

This implies that φ(k)w1Wg,h\varphi(k)w_{1}\in W_{g,h}, proving the claim. In summary, WdW\leq{\mathbb{C}}^{d} is a GG-invariant subspace with the property that the restriction of ψ(g)\psi(g) to WW is a representation of GG. Let WW^{\perp} be a subspace complement of WW. For gGg\in G, define ψ(g)GLd()\psi(g)\in\mathrm{GL}_{d}({\mathbb{C}}) to be the linear transformation that acts on WW via φ\varphi and on WW^{\perp} by identity. It is clear that ψ\psi defined in this way is a GG-representations. Finally, for every gGg\in G we have:

rank(ψ(g)φ(g))dim(W)=ddim(W)d|G|2ϵ.\mathrm{rank}(\psi(g)-\varphi(g))\leq\dim(W^{\perp})=d-\dim(W)\leq d|G|^{2}\epsilon.

This finishes the proof. ∎

Remark 6.3.

It is noteworthy that our proof establishes stability with a linear estimate. However, the constant depends on |G||G|. It would be interesting to see if this dependency can be relaxed for certain families of groups.

We will start by providing a simple description of irreducible representations of the group G=pnG={\mathbb{Z}}_{p}^{n}. We start by setting some notation. Recall that ep:p\mathrm{e}_{p}:{\mathbb{Z}}_{p}\to{\mathbb{C}}^{\ast} denotes the character ep(x)=exp(2πix/p)\mathrm{e}_{p}(x)=\exp(2\pi ix/p). Also, for x=(x1,,xn),y=(y1,,yn)pnx=(x_{1},\dots,x_{n}),y=(y_{1},\dots,y_{n})\in{\mathbb{Z}}_{p}^{n}, we write xy=i=1nxiyix\cdot y=\sum_{i=1}^{n}x_{i}y_{i}. For each apna\in{\mathbb{Z}}_{p}^{n}, define ϕa:pn\phi_{a}:{\mathbb{Z}}_{p}^{n}\to{\mathbb{C}}^{\ast} by ϕa(x)=ep(ax)\phi_{a}(x)=\mathrm{e}_{p}(a\cdot x). It is a well known [Luo09] fact that ϕa\phi_{a}, as apna\in{\mathbb{Z}}_{p}^{n} constitute all irreducible representations of pn{\mathbb{Z}}_{p}^{n}.

Proposition 6.4.

For n2n\geq 2 we have

κ(pn)=pnpn1pn1.\kappa({\mathbb{Z}}_{p}^{n})=\frac{p^{n}-p^{n-1}}{p^{n}-1}.
Proof.

Let

ϕ:=a0ϕa\phi:=\bigoplus_{a\neq 0}\phi_{a}

denote the direct sum of all ϕa\phi_{a} other than the trivial representation ϕ0\phi_{0}. One can regard ϕ(x)\phi(x) as a diagonal matrix of size pn1p^{n}-1 with diagonal entries ϕa(x)\phi_{a}(x) for non-zero apna\in{\mathbb{Z}}_{p}^{n}. We claim that for every x0x\neq 0, the set of {apn{0}:ϕa(x)=1}\{a\in{\mathbb{Z}}_{p}^{n}\setminus\{0\}:\phi_{a}(x)=1\} has cardinality pn1p^{n-1}. In fact, the condition ϕa(x)=1\phi_{a}(x)=1 corresponds to the equation i=1naixi=0\sum_{i=1}^{n}a_{i}x_{i}=0 for aa which has exactly pn11p^{n-1}-1 non-zero solutions. Hence rankϕ(x)=pnpn1\operatorname{rank}\phi(x)=p^{n}-p^{n-1}, from which it follows that

ρ(Id,ϕ(x))=pnpn1pn1.\rho(\mathrm{Id},\phi(x))=\frac{p^{n}-p^{n-1}}{p^{n}-1}.

To prove the reverse inequality, we first suppose ψ\psi is a dd-dimensional representation of GG. Decompose ψ\psi into a direct sum of irreducible representations ϕa\phi_{a} and denote the multiplicity of ϕa\phi_{a} by cac_{a}:

ϕ=acaϕa.\phi=\bigoplus_{a}c_{a}\phi_{a}.

Set

β:=mingG1drank(ψ(g)Id),\beta:=\min_{g\in G}\frac{1}{d}\operatorname{rank}(\psi(g)-\mathrm{Id}),

and write B(x)={apn:ϕa(x)1}B(x)=\{a\in{\mathbb{Z}}_{p}^{n}:\phi_{a}(x)\neq 1\}. This implies that for every non-zero xpnx\in{\mathbb{Z}}_{p}^{n}, we have

(6.1) aB(x)caβdx0aB(x)caβd(pn1).\sum_{a\in B(x)}c_{a}\geq\beta d\hskip 5.69054pt\Rightarrow\hskip 8.53581pt\sum_{x\neq 0}\sum_{a\in B(x)}c_{a}\geq\beta d(p^{n}-1).

Without loss of generality, we can assume that c0=0c_{0}=0, since by removing the trivial representation the value of β\beta can only increase. Note also that for every a0a\neq 0, there are exactly pnpn1p^{n}-p^{n-1} elements xpnx\in{\mathbb{Z}}_{p}^{n} with aB(x)a\in B(x). This implies that

x0aB(x)ca=(pnpn1)a0ca=d(pnpn1),\sum_{x\neq 0}\sum_{a\in B(x)}c_{a}=(p^{n}-p^{n-1})\sum_{a\neq 0}c_{a}=d(p^{n}-p^{n-1}),

This together with (6.1) implies that βpnpn1pn1\beta\leq\frac{p^{n}-p^{n-1}}{p^{n}-1}.

Now, to finish the proof assume that β:=κ(pn)>pnpn1pn1.\beta:=\kappa({\mathbb{Z}}_{p}^{n})>\frac{p^{n}-p^{n-1}}{p^{n}-1}. Choose ϵ>0\epsilon>0 such that p2nϵ<βpnpn1pn1p^{2n}\epsilon<\beta-\frac{p^{n}-p^{n-1}}{p^{n}-1}. Let φ:pnGLd()\varphi:{\mathbb{Z}}_{p}^{n}\to\mathrm{GL}_{d}({\mathbb{C}}) be such that

ρ(φ(x)φ(y)φ(x+y))<ϵ.\rho(\varphi(x)\varphi(y)-\varphi(x+y))<\epsilon.

holds for all x,ypnx,y\in{\mathbb{Z}}_{p}^{n}. Use Lemma 6.2 to find a representation ψ:pnGLd()\psi:{\mathbb{Z}}_{p}^{n}\to\mathrm{GL}_{d}({\mathbb{C}}) such that ρ(φ(x),ψ(x))<p2nϵ\rho(\varphi(x),\psi(x))<p^{2n}\epsilon. Since ρ(φ(x),Id)β\rho(\varphi(x),\mathrm{Id})\geq\beta for all non-zero xpnx\in{\mathbb{Z}}_{p}^{n}, it follows that for all x0x\neq 0

ρ(ψ(x),Id)βp2nϵ>pnpn1pn1.\rho(\psi(x),\mathrm{Id})\geq\beta-p^{2n}\epsilon>\frac{p^{n}-p^{n-1}}{p^{n}-1}.

This is a contradiction. ∎

Corollary 6.5.

The best constant for the class of all groups is 1/21/2.

6.1. The value of κ(G)\kappa(G) for finite groups

Let GG be an arbitrary finite group. Note that it follows from Proposition 6.2 that

κ(G)=supψmingGρ(ψ(g),Id),\kappa(G)=\sup_{\psi}\min_{g\in G}\rho(\psi(g),\mathrm{Id}),

where ψ\psi ranges over all finite-dimensional representations of GG. The first observation is that the supremum can be upgraded to a maximum.

Proposition 6.6.

Let GG be a finite group. Then there exists a finite-dimensional representation ψ:GGLd()\psi:G\to\mathrm{GL}_{d}({\mathbb{C}}) of GG such that

κ(G)=mingGρ(ψ(g),Id).\kappa(G)=\min_{g\in G}\rho(\psi(g),\mathrm{Id}).

In particular, κ(G)\kappa(G) is a rational number.

Proof.

Denote by R={ψ0,,ψc1}R=\{\psi_{0},\dots,\psi_{c-1}\} the set of all irreducible representations of GG up to isomorphism. Pick C={g0,,gc1}C=\{g_{0},\dots,g_{c-1}\} to be a set of representatives for all conjugacy classes of GG. We will assume that ψ0\psi_{0} is the trivial representation and g0:=eg_{0}:=e is the identity element of GG. Consider the (c1)×(c1)(c-1)\times(c-1) matrix KK where

Kij=dimker(ψj(gi)Id).K_{ij}=\dim\ker(\psi_{j}(g_{i})-\mathrm{Id}).

Note that KijK_{ij} does not depend on the choice of the representative gig_{i}. Set β=κ(G)\beta=\kappa(G), and write

Δ={(x1,,xc1):xi0,1ic1xi=1}.\Delta=\{(x_{1},\dots,x_{c-1}):x_{i}\geq 0,\sum_{1\leq i\leq c-1}x_{i}=1\}.

For each representation ψ:GGLd()\psi:G\to\mathrm{GL}_{d}({\mathbb{C}}) which does not contain the trivial representation, we can decompose ρ\rho as ρ=j=1c1niψi\rho=\oplus_{j=1}^{c-1}n_{i}\psi_{i}, and define δ(ψ)Δ\delta(\psi)\in\Delta to be the column vector

δ(ψ)=(n1d,,nc1d)t,\delta(\psi)=\left(\frac{n_{1}}{d},\dots,\frac{n_{c-1}}{d}\right)^{t},

of normalized multiplicities of irreducible representations of GG. It is easy to see that mingGρ(ψ(g),Id)α\min_{g\in G}\rho(\psi(g),\mathrm{Id})\geq\alpha holds iff for all gGg\in G

i=1c1niker(ψi(g)Id)(1α)d.\sum_{i=1}^{c-1}n_{i}\ker(\psi_{i}(g)-\mathrm{Id})\leq(1-\alpha)d.

These conditions can be more succinctly expressed as

Kδ(ψ)(1α)(1,1,,1)t,K\delta(\psi)\leq(1-\alpha)(1,1,\dots,1)^{t},

where we write xyx\leq y for two vectors x,yx,y if every entry of yxy-x is non-negative. By assumption, for every m1m\geq 1, there exists a representation ψm\psi_{m} such that Kδ(ψm)(1β+1m)(1,1,,1)tK\delta(\psi_{m})\leq(1-\beta+\frac{1}{m})(1,1,\dots,1)^{t} holds. In other words, the set

{xΔ:Kx(1β+1m)(1,1,,1)t}\{x\in\Delta:Kx\leq(1-\beta+\frac{1}{m})(1,1,\dots,1)^{t}\}

is non-empty. By compactness of Δ\Delta, it follows that there exists a point xΔx\in\Delta such that Kx(1β)(1,1,,1)tKx\leq(1-\beta)(1,1,\dots,1)^{t}. Note that these constitute a system of inequalities involving x1,,xc1x_{1},\dots,x_{c-1} and β\beta with rational coefficients. Now using the fact the the set of solutions to this system is a rational polytope (or equivalently using the proof of Farkas’ lemma [Mat07]) we deduce that both β\beta and (x1,,xc1)(x_{1},\dots,x_{c-1}) are rational. This proves the claim.

Remark 6.7.

There are noncyclic finite groups FF for which κ(F)=1\kappa(F)=1. For instance, let FF be a finite subgroup of SU2()\mathrm{{SU}}_{2}({\mathbb{C}}). Such groups are cyclic of odd order and double covers of finite subgroups of SO3(){\mathrm{SO}}_{3}({\mathbb{R}}), which include the alternating groups of 4,54,5 letters, and the symmetric group on 44 letters. We claim that the natural representation ρ\rho of FF in GL2()\mathrm{GL}_{2}({\mathbb{C}}) has the property that for geg\neq e the eigenvalues of ρ(g)\rho(g) are not 11. This is clear since if one eigenvalue is 11, then the other has to be 11 as well, contradicting the faithfulness of ρ\rho. These groups include, for instance, SL2(𝔽5)\mathrm{SL}_{2}({\mathbb{F}}_{5}).

Proposition 6.8.

Let GG be a finite group. Then κ(G)=1\kappa(G)=1 iff GG has a fixed-point free complex representation.

These groups have been classified by Joseph A. Wolf. The classification is rather complicated. We refer the reader to [Wol67] for proofs and to [Nak74, Theorem (1.7)] for a concise statement and the table listing these groups. The difficulty of classifying finite groups GG with κ(G)=1\kappa(G)=1 suggests that the problem of determining κ(G)\kappa(G) in terms of GG may be a challenging one.

Remark 6.9.

Given a field FF, one can also study the notion of linear sofic approximation over FF. It is not known whether linear sofic groups over {\mathbb{C}} and other fields coincide. However, one can show that κ(G)\kappa_{{\mathbb{C}}}(G) and κF(G)\kappa_{F}(G) do not need to coincide. This will be seen in the next subsection.

6.2. Optimal linear sofic approximation over fields of positive characteristic

In this subsection, we will prove Theorem C

Proof of Theorem C.

Let FF be a field of characteristic pp. It is easy to see that the proof of Proposition 6.2 works over FF without any changes. Hence, for any finite group GG we have

κF(G)=supψmingGρ(ψ(g),Id),\kappa_{F}(G)=\sup_{\psi}\min_{g\in G}\rho(\psi(g),\mathrm{Id}),

where ψ\psi runs over all representations ψ:GGLd(F).\psi:G\to\mathrm{GL}_{d}(F).

Suppose ρ:GGLd(F)\rho:G\to\mathrm{GL}_{d}(F) is a linear representation. Let gg be an element of order pp in GG. Write ψ(g)=Id+τ(g)\psi(g)=\mathrm{Id}+\tau(g) where τ(g)\tau(g) is a d×dd\times d matrix over FF. From ψ(g)p=Id\psi(g)^{p}=\mathrm{Id}, and FF has characteristic pp, it follows that τ(g)p=0\tau(g)^{p}=0. Let JJ denote the Jordan canonical form of τ(g)\tau(g), consisting of kk blocks. It follows from τ(g)p=0\tau(g)^{p}=0 that each block in JJ is a nilpotent block of size at most pp, implying kpdkp\geq d. Since k=dimkerτ(g)k=\dim\ker\tau(g) we have

rank(ψ(g)Id)=ddimkerτ(g)ddp.\operatorname{rank}(\psi(g)-\mathrm{Id})=d-\dim\ker\tau(g)\leq d-\frac{d}{p}.

This shows that κ(G)11p\kappa(G)\leq 1-\frac{1}{p}. To prove the reverse inequality, let VV denote the vector space consisting of all functions f:GFf:G\to F. Clearly d:=dimV=|G|d:=\dim V=|G|. Consider the left regular representation of GG on VV defined by

(ψ(g)f)(h)=f(gh).(\psi(g)f)(h)=f(gh).

Let gG{e}g\in G\setminus\{e\}, and consider the subspace

W(g)={fV:ψ(g)f=f}.W(g)=\{f\in V:\psi(g)f=f\}.

Any fW(g)f\in W(g) is invariant from the left by the subgroup g\langle g\rangle generated by gg. It follows that

dimW(g)d|g|dp.\dim W(g)\leq\frac{d}{|\langle g\rangle|}\leq\frac{d}{p}.

Hence

rank(ψ(g)Id)=ddimW(g)ddp,\operatorname{rank}(\psi(g)-\mathrm{Id})=d-\dim W(g)\geq d-\frac{d}{p},

proving the claim. ∎

References

  • [AC20] Goulnara Arzhantseva and Pierre-Alain Cherix, Quantifying metric approximations of discrete groups, 2020.
  • [AP15] Goulnara Arzhantseva and Liviu Paunescu, Almost commuting permutations are near commuting permutations, J. Funct. Anal. 269 (2015), no. 3, 745–757. MR 3350728
  • [AP17] Goulnara Arzhantseva and Liviu Păunescu, Linear sofic groups and algebras, Trans. Amer. Math. Soc. 369 (2017), no. 4, 2285–2310. MR 3592512
  • [BL20] Oren Becker and Alexander Lubotzky, Group stability and Property (T), J. Funct. Anal. 278 (2020), no. 1, 108298, 20. MR 4027744
  • [BLT19] Oren Becker, Alexander Lubotzky, and Andreas Thom, Stability and invariant random subgroups, Duke Math. J. 168 (2019), no. 12, 2207–2234. MR 3999445
  • [DCGLT20] Marcus De Chiffre, Lev Glebsky, Alexander Lubotzky, and Andreas Thom, Stability, cohomology vanishing, and nonapproximable groups, Forum Math. Sigma 8 (2020), Paper No. e18, 37. MR 4080477
  • [EG21] Gábor Elek and Łukasz Grabowski, Almost commuting matrices with respect to the rank metric, Groups Geom. Dyn. 15 (2021), no. 3, 1059–1083. MR 4322021
  • [ES05] Gábor Elek and Endre Szabó, Hyperlinearity, essentially free actions and L2L^{2}-invariants. The sofic property, Math. Ann. 332 (2005), no. 2, 421–441. MR 2178069
  • [Ess66] Carl-Gustav Esseen, On the Kolmogorov-Rogozin inequality for the concentration function, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 5 (1966), 210–216. MR 205297
  • [GR09] Lev Glebsky and Luis Manuel Rivera, Almost solutions of equations in permutations, Taiwanese J. Math. 13 (2009), no. 2A, 493–500. MR 2500002
  • [Gro99] Mikhael Gromov, Endomorphisms of symbolic algebraic varieties, J. Eur. Math. Soc. (JEMS) 1 (1999), no. 2, 109–197. MR 1694588
  • [II09] Kei-ichiro Iima and Ryo Iwamatsu, On the Jordan decomposition of tensored matrices of Jordan canonical forms, Math. J. Okayama Univ. 51 (2009), 133–148. MR 2482411
  • [Kol58] Andery Kolmogorov, Sur les propriétés des fonctions de concentrations de M. P. Lévy, Ann. Inst. H. Poincaré 16 (1958), 27–34. MR 101545
  • [Luo09] Bao Luong, Fourier analysis on finite abelian groups, Applied and Numerical Harmonic Analysis, Birkhäuser Boston, Ltd., Boston, MA, 2009. MR 2537697
  • [Map10] Kenneth Maples, Singularity of random matrices over finite fields, 2010.
  • [Mat07] Jiri Matousek, Understanding and using linear programming, Springer, Berlin, 2007.
  • [Nak74] Minoru Nakaoka, Finite groups which act freely on spheres, Osaka Math. J. 11 (1974), 323–330. MR 356108
  • [NW21] Hoi H Nguyen and Melanie Matchett Wood, Random integral matrices: universality of surjectivity and the cokernel, Invent. Math. 208 (2021).
  • [Pet75] Valentin Vladimirovich Petrov, Sums of independent random variables, Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 82, Springer-Verlag, New York-Heidelberg, 1975, Translated from the Russian by A. A. Brown. MR 0388499
  • [Rog61] Boris Alekseevich Rogozin, An estimate of the concentration functions, Teor. Verojatnost. i Primenen 6 (1961), 103–105. MR 0131893
  • [R0̆8] Florin Rădulescu, The von Neumann algebra of the non-residually finite Baumslag group a,b|ab3a1=b2\langle a,b|ab^{3}a^{-1}=b^{2}\rangle embeds into RωR^{\omega}, Hot topics in operator theory, Theta Ser. Adv. Math., vol. 9, Theta, Bucharest, 2008, pp. 173–185. MR 2436761
  • [TV12] Terence Tao and Van Vu, Additive combinatorics, Cambridge University Press, 2012.
  • [Wei00] Benjamin Weiss, Sofic groups and dynamical systems, vol. 62, 2000, Ergodic theory and harmonic analysis (Mumbai, 1999), pp. 350–359. MR 1803462
  • [Wol67] Joseph Albert Wolf, Spaces of constant curvature, McGraw-Hill Book Co., New York-London-Sydney, 1967. MR 0217740