
Cutoff phenomenon for the warp-transpose top with random shuffle

Subhajit Ghosh, Department of Mathematics, Bar-Ilan University, Ramat-Gan 5290002. [email protected]
Abstract.

Let $\{G_n\}_{1}^{\infty}$ be a sequence of non-trivial finite groups. In this paper, we study the properties of a random walk on the complete monomial group $G_n\wr S_n$ generated by the elements of the form $(e,\dots,e,g;\operatorname{id})$ and $(e,\dots,e,g^{-1},e,\dots,e,g;(i,n))$ for $g\in G_n$, $1\leq i<n$, where $e$ denotes the identity of $G_n$. We call this the warp-transpose top with random shuffle on $G_n\wr S_n$. We find the spectrum of the transition probability matrix for this shuffle. We prove that the mixing time for this shuffle is $O\left(n\log n+\frac{1}{2}n\log(|G_n|-1)\right)$. We show that this shuffle exhibits $\ell^2$-cutoff at $n\log n+\frac{1}{2}n\log(|G_n|-1)$ and total variation cutoff at $n\log n$.

Key words and phrases:
random walk, complete monomial group, mixing time, cutoff, Young-Jucys-Murphy elements
2020 Mathematics Subject Classification:
60J10, 60B15, 60C05.

1. Introduction

The number of shuffles required to mix up a deck of cards is the central question in card shuffling problems, and it has received considerable attention in the last few decades. Card shuffling problems can be described as random walks on the symmetric group, and the generalisation obtained by replacing the symmetric group with other finite groups is also a well-studied topic in probability theory [24]. A random walk converges to a unique stationary distribution under certain natural conditions. For a convergent random walk, the mixing time (the number of steps required to reach the stationary distribution up to a given tolerance) is one of the main quantities of interest, and knowing the eigenvalues and eigenvectors of the transition matrix is helpful for studying the convergence rate. In general, convergence rate questions for finite Markov chains arise in many subjects, including statistical physics, computer science, biology and more [22].

In the eighties, Diaconis and Shahshahani introduced non-commutative Fourier analysis techniques in their work on the random transposition shuffle [8]. They proved that this shuffle on $n$ distinct cards has total variation cutoff (sharp mixing time; the formal definition of cutoff is given later) at $\frac{1}{2}n\log n$. The upper bound estimate in this case mainly uses the fact that the total variation distance is at most half of the $\ell^2$-distance, which relies on the spectrum of the transition matrix. After this landmark work, the theory of random walks on finite groups became an independent subject with its own problems and techniques. Other techniques have since been developed for random walks on finite groups (viz. the coupling argument [1] and the strong stationary time approach [2, 3]). However, the spectral approach remains a standard technique for answering mixing time questions for random walks on finite groups [5]. For a random walk model with known cutoff, a natural question is how the transition occurs at cutoff. In 2020, Teyssier [27] studied the limit profile for the random transposition model, providing a precise description of the transition at cutoff. Later, Nestoridi and Olesker-Taylor [20] generalised Teyssier's result to reversible Markov chains. More recently, Nestoridi developed these results further by studying the limit profile for the transpose top with random shuffle (the shuffling algorithm is described in the next paragraph) [19]. The present paper focuses on obtaining the sharp mixing time; the limit profile computation will be considered in future work.

Our model is mainly inspired by the transpose top with random shuffle on the symmetric group $S_n$ [10]. Given a deck of $n$ distinct cards, this shuffle chooses a card from the deck uniformly at random and transposes it with the top card. This shuffle exhibits total variation cutoff at $n\log n$ [5, 6]. The transpose top with random shuffle was recently generalised by the author to cards with two orientations, known as the flip-transpose top with random shuffle on the hyperoctahedral group $B_n$ [13]. The flip-transpose top with random shuffle on $B_n$ has total variation cutoff at $n\log n$. In the extended abstract [11], the author introduced a generalisation of the flip-transpose top with random shuffle to the complete monomial group $G_n\wr S_n$, and announced a result on total variation cutoff under the restriction $|G_n|=o(n^{\delta})$ for all $\delta>0$. In this work, we generalise further by removing the restriction on the size of $G_n$, which motivates us to consider both the $\ell^2$-distance and the total variation distance. Moreover, if $\log(|G_n|-1)\neq o(\log n)$, then our model provides an example where spectral analysis fails to yield the sharp total variation mixing time; in other words, the $\ell^2$-bound on the total variation distance is not sufficient for computing the sharp total variation mixing time. For another notable random walk on the complete monomial groups, we mention the work of Schoolfield Jr. [25], which generalises the random transposition model to $G\wr S_n$ for a finite group $G$. However, that walk is generated by a probability measure which is constant on conjugacy classes, whereas the generating measure of our model is not constant on conjugacy classes.
In general, it is not easy to study a random walk generated by a probability measure which is not constant on the conjugacy classes (cf. [4, 10, 12, 13]). For other random walks on the complete monomial group see [9, 16]. Before describing our random walk model, let us first recall the definition of the complete monomial group.

Definition 1.1.

Let $G$ be a finite group and $S_n$ be the symmetric group of permutations of the set $[n]:=\{1,2,\dots,n\}$. The complete monomial group is the wreath product of $G$ with $S_n$, denoted by $G\wr S_n$, and can be described as follows: the elements of $G\wr S_n$ are $(n+1)$-tuples $(g_1,g_2,\dots,g_n;\pi)$ where $g_i\in G$ and $\pi\in S_n$. The multiplication in $G\wr S_n$ is given by $(g_1,\dots,g_n;\pi)(h_1,\dots,h_n;\eta)=(g_1h_{\pi^{-1}(1)},\dots,g_nh_{\pi^{-1}(n)};\pi\eta)$. Therefore $(g_1,\dots,g_n;\pi)^{-1}=(g_{\pi(1)}^{-1},\dots,g_{\pi(n)}^{-1};\pi^{-1})$.
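As a concrete check of these formulas, the multiplication and inversion rules can be coded directly. The sketch below is an assumed illustration, not code from the paper: the helper names `multiply` and `inverse` are ours, $G=\mathbb{Z}_3$, $n=4$, and permutations are stored as 0-indexed tuples `pi` with `pi[i]` the image of `i`.

```python
n, m = 4, 3  # n cards, G = Z_m (written additively)

def multiply(a, b):
    """(g; pi)(h; eta) = (g_1 h_{pi^{-1}(1)}, ..., g_n h_{pi^{-1}(n)}; pi eta)."""
    g, pi = a
    h, eta = b
    pi_inv = [0] * n
    for i in range(n):
        pi_inv[pi[i]] = i
    comps = tuple((g[i] + h[pi_inv[i]]) % m for i in range(n))
    perm = tuple(pi[eta[i]] for i in range(n))  # (pi eta)(i) = pi(eta(i))
    return (comps, perm)

def inverse(a):
    """(g; pi)^{-1} = (g_{pi(1)}^{-1}, ..., g_{pi(n)}^{-1}; pi^{-1})."""
    g, pi = a
    pi_inv = [0] * n
    for i in range(n):
        pi_inv[pi[i]] = i
    comps = tuple((-g[pi[i]]) % m for i in range(n))
    return (comps, tuple(pi_inv))

identity = (tuple([0] * n), tuple(range(n)))
x = ((1, 2, 0, 1), (2, 0, 3, 1))
assert multiply(x, inverse(x)) == identity
assert multiply(inverse(x), x) == identity
```

The assertions confirm that the stated inverse formula really inverts under the stated multiplication rule.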

Now let $\{G_n\}_{1}^{\infty}$ be a sequence of non-trivial finite groups. We consider the complete monomial group $\mathcal{G}_n:=G_n\wr S_n$ for each positive integer $n$. Let $e$ be the identity of $G_n$ and $\operatorname{id}$ be the identity of $S_n$. For an element $\pi\in S_n$, let $\pi:=(e,\dots,e;\pi)\in\mathcal{G}_n$, and for $g\in G_n$, let

$g^{(i)}:=(e,\dots,e,g,e,\dots,e;\operatorname{id})\in\mathcal{G}_n,\quad\text{with }g\text{ in the }i\text{th position.}$

Unless otherwise stated, from now on $(e,\dots,e,g^{-1},e,\dots,e,g;(i,n))$ denotes the element of $\mathcal{G}_n$ with $g^{-1}$ in the $i$th position and $g$ in the $n$th position, for $g\in G_n$, $1\leq i<n$. One can check that $(g^{-1})^{(i)}g^{(n)}(i,n)$ is equal to $(e,\dots,e,g^{-1},e,\dots,e,g;(i,n))$ for $g\in G_n$, $1\leq i<n$.

In this work we consider a random walk on the complete monomial group $\mathcal{G}_n$ driven by a probability measure $P$, defined as follows:

(1) $P(x)=\begin{cases}\frac{1}{n|G_n|},&\text{ if }x=(e,\dots,e,g;\operatorname{id})\text{ for }g\in G_n,\\ \frac{1}{n|G_n|},&\text{ if }x=(e,\dots,e,g^{-1},e,\dots,e,g;(i,n))\text{ for }g\in G_n,\;1\leq i<n,\\ 0,&\text{ otherwise}.\end{cases}$

We call this the warp-transpose top with random shuffle because, most of the time, the $n$th component is multiplied by $g$ and the $i$th component is simultaneously multiplied by $g^{-1}$, for $g\in G_n$, $1\leq i<n$. We now give a combinatorial description of this model.

Figure 1. Transitions for the warp-transpose top with random shuffle on $\mathbb{Z}_3\wr S_9$. Here $\mathbb{Z}_3$, the additive group of integers modulo $3$, is identified with the colours red, green and blue, with red representing the identity element. Panel $(a)$ shows the transitions when the sixth card is chosen, and panel $(b)$ shows the transitions when the last card is chosen.

Let $\mathcal{A}_n(G)$ denote the set of all arrangements of $n$ coloured cards in a row such that the colours of the cards are indexed by the set $G$. For example, if $\mathbb{Z}_2$ denotes the additive group of integers modulo $2$, then the elements of $\mathcal{A}_n(\mathbb{Z}_2)$ can be identified with the elements of $B_n$ (the hyperoctahedral group). For $g,h\in G$, by updating the colour $g$ using the colour $h$ we mean that the colour $g$ is changed to the colour $g\cdot h$. Elements of $\mathcal{G}_n$ can be identified with the elements of $\mathcal{A}_n(G_n)$ as follows: the element $(g_1,\dots,g_n;\pi)\in\mathcal{G}_n$ is identified with the arrangement in $\mathcal{A}_n(G_n)$ such that the label of the $i$th card is $\pi(i)$ and its colour is $g_{\pi(i)}$, for each $i\in[n]$. Given an arrangement of coloured cards in $\mathcal{A}_n(G_n)$, the warp-transpose top with random shuffle on $\mathcal{G}_n$ proceeds as follows: choose a positive integer $i$ uniformly from $[n]$, and choose a colour $g$ uniformly from $G_n$, independently of the choice of $i$.

(1) If $i=n$: update the colour of the $n$th card using the colour $g$.

(2) If $i<n$: first transpose the $i$th and $n$th cards. Then simultaneously update the colour of the $i$th card using the colour $g$ and the colour of the $n$th card using the colour $g^{-1}$.
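The two steps above can be sketched as a small simulation. This is an assumed illustration under conventions of our own choosing ($G_n=\mathbb{Z}_m$ written additively, a deck stored as a list of (label, colour) pairs, 0-indexed positions); the helper name `shuffle_step` is hypothetical.

```python
import random

def shuffle_step(deck, m, rng=random):
    """One step of the warp-transpose top with random shuffle on Z_m wr S_n."""
    n = len(deck)
    i = rng.randrange(n)   # position chosen uniformly from [n]
    g = rng.randrange(m)   # colour chosen uniformly from G_n = Z_m
    deck = list(deck)
    if i == n - 1:
        # (1) update the colour of the nth card using g
        label, colour = deck[-1]
        deck[-1] = (label, (colour + g) % m)
    else:
        # (2) transpose the ith and nth cards, then update their colours
        #     by g and g^{-1} respectively
        deck[i], deck[-1] = deck[-1], deck[i]
        li, ci = deck[i]
        ln, cn = deck[-1]
        deck[i] = (li, (ci + g) % m)
        deck[-1] = (ln, (cn - g) % m)
    return deck
```

Note that in case (2) the colour updates cancel modulo $m$, so the sum of colours changes only when the last card is chosen; the labels are always a permutation of the original deck.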

The flip-transpose top with random shuffle on the hyperoctahedral group is the special case $|G_n|=2$ for all $n$ [13]. We now state the main theorems of this paper.

Theorem 1.1.

The (total variation and $\ell^2$-) mixing time for the warp-transpose top with random shuffle on $\mathcal{G}_n$ is $O\left(n\log n+\frac{1}{2}n\log(|G_n|-1)\right)$.

Theorem 1.2.

The warp-transpose top with random shuffle on $\mathcal{G}_n$ exhibits $\ell^2$-cutoff at $n\log n+\frac{1}{2}n\log(|G_n|-1)$.

Theorem 1.3.

The warp-transpose top with random shuffle on $\mathcal{G}_n$ exhibits total variation cutoff at time $n\log n$.

In view of Theorem 1.3, the distribution after $n\log n(1+o(1))$ transitions is close to uniform in total variation distance. On the other hand, if $\log(|G_n|-1)\neq o(\log n)$, then Theorem 1.2 and

$\left(n\log n+\frac{1}{2}n\log(|G_n|-1)\right)(1-o(1))>n\log n(1+o(1))$

say that the distribution after $n\log n(1+o(1))$ transitions is far from uniform in $\ell^2$-distance. Therefore, the standard spectral approach for obtaining the total variation mixing time (which mainly uses the fact that the total variation distance is at most half of the $\ell^2$-distance) fails when $\log(|G_n|-1)\neq o(\log n)$. The proofs of Theorems 1.1 and 1.2 are given in Section 3, and the proof of Theorem 1.3 is given at the end of Section 5.

To make this paper self-contained, let us recall some concepts and terminology from the representation theory of finite groups and from discrete time Markov chains with finite state space. Readers with a representation-theoretic background may skip Subsection 1.1, and readers with a probabilistic background may skip Subsection 1.2.

1.1. Representation theory of finite groups

Let $G$ be a finite group and $V$ be a finite dimensional complex vector space. Also let $GL(V)$ be the set of all invertible linear operators on $V$. A linear representation $\rho$ of $G$ is a homomorphism from $G$ to $GL(V)$. Sometimes this representation is denoted by the pair $(\rho,V)$. The dimension of the vector space $V$ is called the dimension $d_\rho$ of the representation, and $V$ is called the $G$-module corresponding to the representation $\rho$. Let $\mathbb{C}[G]$ be the group algebra consisting of complex linear combinations of elements of $G$. In particular, taking $V=\mathbb{C}[G]$, we define the right regular representation $R:G\longrightarrow GL(\mathbb{C}[G])$ of $G$ by

$R(g)\left(\sum_{h\in G}C_h h\right)=\sum_{h\in G}C_h hg^{-1},\quad\text{where }C_h\in\mathbb{C}.$

A vector subspace $W$ of $V$ is said to be stable (or invariant) under $\rho$ if $\rho(g)(W)\subset W$ for all $g$ in $G$. The representation $\rho$ is irreducible if $V$ is non-trivial and $V$ has no non-trivial proper stable subspace. Two representations $(\rho_1,V_1)$ and $(\rho_2,V_2)$ of $G$ are said to be isomorphic if there exists an invertible linear map $T:V_1\rightarrow V_2$ such that the following diagram commutes for all $g\in G$:

$T\circ\rho_1(g)=\rho_2(g)\circ T.$

For each $g\in G$, $\rho(g)$ can also be thought of as an invertible complex matrix of size $d_\rho\times d_\rho$. The trace of the matrix $\rho(g)$ is called the character value of $\rho$ at $g$ and is denoted by $\chi^\rho(g)$. Character values are constant on conjugacy classes; hence characters are class functions. If $\overline{\chi^\rho(g)}$ denotes the complex conjugate of $\chi^\rho(g)$, then one can check that $\chi^\rho(g^{-1})=\overline{\chi^\rho(g)}$ for all $g\in G$. Let $\mathscr{C}(G)$ be the complex vector space of class functions on $G$. A standard inner product $\langle\cdot,\cdot\rangle$ on $\mathscr{C}(G)$ is defined as follows:

$\langle\phi,\psi\rangle=\frac{1}{|G|}\sum_{g\in G}\phi(g)\psi(g^{-1})\quad\text{ for }\quad\phi,\psi\in\mathscr{C}(G).$

An important theorem in this context is the following [26, Theorem 6]: the characters corresponding to the non-isomorphic irreducible representations of $G$ form an orthonormal basis of $\mathscr{C}(G)$ with respect to $\langle\cdot,\cdot\rangle$.
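This orthonormality can be checked numerically on a small example. The sketch below is an assumed illustration, not part of the paper: it uses the well-known character table of $S_3$, and the fact that every element of $S_3$ is conjugate to its inverse, so the inner product above can be computed class by class.

```python
# Character table of S_3. Columns: the classes {id}, {transpositions}, {3-cycles}.
class_sizes = [1, 3, 2]
chars = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}
order = sum(class_sizes)  # |S_3| = 6

def inner(phi, psi):
    # <phi, psi> = (1/|G|) sum_g phi(g) psi(g^{-1}); since every element of
    # S_3 is conjugate to its inverse, we may sum over conjugacy classes.
    return sum(s * a * b for s, a, b in zip(class_sizes, phi, psi)) / order

for n1, c1 in chars.items():
    for n2, c2 in chars.items():
        assert inner(c1, c2) == (1 if n1 == n2 else 0)
```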

If $V_1\otimes V_2$ denotes the tensor product of the vector spaces $V_1$ and $V_2$, then the tensor product of two representations $\rho_1:G\rightarrow GL(V_1)$ and $\rho_2:G\rightarrow GL(V_2)$ is the representation $(\rho_1\otimes\rho_2,V_1\otimes V_2)$ defined by

$(\rho_1\otimes\rho_2)(g)(v_1\otimes v_2)=\rho_1(g)(v_1)\otimes\rho_2(g)(v_2)\quad\text{for }v_1\in V_1,\;v_2\in V_2\text{ and }g\in G.$

We use some results from the representation theory of finite groups without recalling the proofs. For details about finite group representations see [21, 23, 26].

1.2. Discrete time Markov chain with finite state space

Let $\Omega$ be a finite set. A sequence of random variables $X_0,X_1,\dots$ is a discrete time Markov chain with state space $\Omega$ and transition matrix $M$ if for all $x,y\in\Omega$, all $k>1$, and all events $H_{k-1}:=\cap_{0\leq s<k}\{X_s=x_s\}$ satisfying $\mathbb{P}(H_{k-1}\cap\{X_k=x\})>0$, we have

(2) $\mathbb{P}(X_{k+1}=y\mid H_{k-1}\cap\{X_k=x\})=M(x,y).$

Equation (2) says that, given the present, the future is independent of the past. Let $\mathscr{D}_k$ denote the distribution after $k$ transitions, i.e. $\mathscr{D}_k$ is the row (probability) vector $\left(\mathbb{P}(X_k=x)\right)_{x\in\Omega}$. Then $\mathscr{D}_k=\mathscr{D}_{k-1}M$ for all $k\geq 1$, which implies $\mathscr{D}_k=\mathscr{D}_0M^k$. In particular, if the chain starts at $x\in\Omega$, then its distribution after $k$ transitions is $\mathscr{D}_k=\delta_xM^k$, i.e. $\mathbb{P}(X_k=y\mid X_0=x)=M^k(x,y)$. Here $\delta_x$ is defined on $\Omega$ as follows:

$\delta_x(u)=\begin{cases}1&\text{ if }u=x,\\ 0&\text{ if }u\neq x.\end{cases}$

A Markov chain is said to be irreducible if the chain can reach any state from any state using only transitions of positive probability. The period of a state $x\in\Omega$ is defined to be the greatest common divisor of the set of all times at which the chain can return to its starting state $x$. All states of an irreducible Markov chain have the same period [15, Lemma 1.6]. An irreducible Markov chain is said to be aperiodic if the common period of all its states is $1$. A probability distribution $\Pi$ is said to be a stationary distribution of the Markov chain if $\Pi M=\Pi$. Any irreducible Markov chain possesses a unique stationary distribution $\Pi$, with $\Pi(x)>0$ for all $x\in\Omega$ [15, Proposition 1.14]. Moreover, if the chain is aperiodic then $\mathscr{D}_k\longrightarrow\Pi$ as $k\longrightarrow\infty$ [15, Theorem 4.9]. For an irreducible chain, we first define the $\ell^2$-distance between the distribution after $k$ transitions and the stationary distribution.

Definition 1.2.

Let $\mathscr{D}_k$ denote the distribution after $k$ transitions of an irreducible discrete time Markov chain with finite state space $\Omega$, and let $\Pi$ denote its stationary distribution. Then the $\ell^2$-distance between $\mathscr{D}_k$ and $\Pi$ is defined by

$\left|\left|\mathscr{D}_k-\Pi\right|\right|_2:=\left(\sum_{x\in\Omega}\left|\frac{\mathscr{D}_k(x)}{\Pi(x)}-1\right|^2\Pi(x)\right)^{\frac{1}{2}}.$
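As an assumed toy illustration of this definition (a 3-state chain of our own choosing, not from the paper), one can iterate $\mathscr{D}_k=\mathscr{D}_0M^k$ and watch the $\ell^2$-distance to the stationary distribution decay:

```python
import numpy as np

M = np.array([[0.5, 0.25, 0.25],
              [0.2, 0.6,  0.2 ],
              [0.3, 0.3,  0.4 ]])

# stationary distribution: the left eigenvector of M with eigenvalue 1
w, v = np.linalg.eig(M.T)
Pi = np.real(v[:, np.argmin(np.abs(w - 1))])
Pi /= Pi.sum()

def l2_dist(D, Pi):
    # ||D - Pi||_2 = ( sum_x |D(x)/Pi(x) - 1|^2 Pi(x) )^{1/2}
    return np.sqrt(np.sum(np.abs(D / Pi - 1) ** 2 * Pi))

D = np.array([1.0, 0.0, 0.0])  # start at state 0, so D_0 = delta_0
dists = []
for _ in range(50):
    D = D @ M                  # one transition: D_{k+1} = D_k M
    dists.append(l2_dist(D, Pi))

assert dists[0] > dists[-1]
assert dists[-1] < 1e-8        # the chain has essentially reached Pi
```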

We now define the total variation distance between two probability measures.

Definition 1.3.

Let $\mu$ and $\nu$ be two probability measures on $\Omega$. The total variation distance between $\mu$ and $\nu$ is defined by

$\left|\left|\mu-\nu\right|\right|_{\text{TV}}:=\sup_{A\subset\Omega}|\mu(A)-\nu(A)|.$

It can be easily seen that $\left|\left|\mu-\nu\right|\right|_{\text{TV}}=\frac{1}{2}\sum_{x\in\Omega}|\mu(x)-\nu(x)|$ (see [15, Proposition 4.2]).
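The equivalence of the supremum definition and the half-$\ell^1$ formula can be verified by brute force on a small state space; the following is an assumed toy example with distributions of our own choosing.

```python
from itertools import chain, combinations

omega = range(4)
mu = [0.1, 0.2, 0.3, 0.4]
nu = [0.25, 0.25, 0.25, 0.25]

# sup over all subsets A of Omega of |mu(A) - nu(A)|
subsets = chain.from_iterable(combinations(omega, r) for r in range(5))
sup_form = max(abs(sum(mu[x] for x in A) - sum(nu[x] for x in A))
               for A in subsets)

# half-l1 formula: (1/2) sum_x |mu(x) - nu(x)|
half_l1 = 0.5 * sum(abs(m - n) for m, n in zip(mu, nu))

assert abs(sup_form - half_l1) < 1e-12
```

The supremum is attained at $A=\{x:\mu(x)>\nu(x)\}$, which is the standard proof idea behind the identity.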

For an irreducible and aperiodic chain, a central question is the minimum number of transitions $k$ required to approach the stationary distribution $\Pi$ within a given tolerance $\varepsilon>0$. We first define the maximal $\ell^2$-distance (respectively total variation distance) between the distribution after $k$ transitions and the stationary distribution as follows:

$d_2(k):=\underset{x\in\Omega}{\max}\left|\left|M^k(x,\cdot)-\Pi\right|\right|_2\quad\left(\text{respectively }d_{\text{TV}}(k):=\underset{x\in\Omega}{\max}\left|\left|M^k(x,\cdot)-\Pi\right|\right|_{\text{TV}}\right).$

For $\varepsilon>0$, the $\ell^2$-mixing time (respectively total variation mixing time) with tolerance level $\varepsilon$ is defined by

$\tau_{\text{mix}}(\varepsilon):=\min\{k:d_2(k)\leq\varepsilon\}\quad(\text{respectively }t_{\text{mix}}(\varepsilon):=\min\{k:d_{\text{TV}}(k)\leq\varepsilon\}).$
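These definitions can be evaluated directly for a small chain. The sketch below is a hypothetical two-state example with helper names (`d_tv`, `t_mix`) of our own; it computes $d_{\text{TV}}(k)$ and the total variation mixing time straight from the definitions.

```python
import numpy as np

M = np.array([[0.9, 0.1],
              [0.2, 0.8]])
Pi = np.array([2/3, 1/3])  # stationary: Pi M = Pi

def d_tv(k):
    # d_TV(k) = max_x || M^k(x, .) - Pi ||_TV, using the half-l1 formula
    Mk = np.linalg.matrix_power(M, k)
    return max(0.5 * np.abs(Mk[x] - Pi).sum() for x in range(2))

def t_mix(eps):
    # smallest k with d_TV(k) <= eps
    k = 0
    while d_tv(k) > eps:
        k += 1
    return k

assert d_tv(t_mix(0.25)) <= 0.25
assert d_tv(t_mix(0.25) - 1) > 0.25
```

For a chain this small, $d_{\text{TV}}(k)$ decays geometrically at the rate of the second eigenvalue ($0.7$ here), so the mixing time grows logarithmically in $1/\varepsilon$.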

Most of the notation in this subsection is borrowed from [15].

1.3. Non-commutative Fourier analysis and random walks on finite groups

Let $p$ and $q$ be two probability measures on a finite group $G$. We define the convolution $p*q$ of $p$ and $q$ by $(p*q)(x):=\sum_{y\in G}p(xy^{-1})q(y)$. The Fourier transform $\widehat{p}$ of $p$ at the right regular representation $R$ is defined by the matrix $\widehat{p}(R)=\sum_{x\in G}p(x)R(x)$. The matrix $\widehat{p}(R)$ can be thought of as the action of the group algebra element $\sum_{g\in G}p(g)g^{-1}$ on $\mathbb{C}[G]$ by multiplication on the right. It can be easily seen that $\widehat{(p*q)}(R)=\widehat{p}(R)\widehat{q}(R)$.
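The identity $\widehat{(p*q)}(R)=\widehat{p}(R)\widehat{q}(R)$ can be checked numerically. The sketch below is an assumed toy example on the cyclic group $\mathbb{Z}_5$ with helper names of our own; the right regular representation is built explicitly as permutation matrices.

```python
import numpy as np

n = 5  # G = Z_5, written additively

def R(g):
    # right regular representation on C[Z_5]: R(g) sends basis vector h
    # to h g^{-1} = h - g, so the matrix has entries R(g)[h - g, h] = 1
    mat = np.zeros((n, n))
    for h in range(n):
        mat[(h - g) % n, h] = 1.0
    return mat

def fourier(p):
    # p^(R) = sum_x p(x) R(x)
    return sum(p[x] * R(x) for x in range(n))

def convolve(p, q):
    # (p * q)(x) = sum_y p(x y^{-1}) q(y)
    return np.array([sum(p[(x - y) % n] * q[y] for y in range(n))
                     for x in range(n)])

rng = np.random.default_rng(0)
p = rng.random(n); p /= p.sum()
q = rng.random(n); q /= q.sum()
assert np.allclose(fourier(convolve(p, q)), fourier(p) @ fourier(q))
```

The same computation, with $R(x)$ replaced by the regular representation of any finite group, verifies that the Fourier transform turns convolution into matrix multiplication.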

A random walk on a finite group $G$ driven by a probability measure $p$ is a Markov chain with state space $G$ and transition probabilities $M_p(x,y)=p(x^{-1}y)$ for $x,y\in G$. It can be easily seen that the transition matrix $M_p$ is $\widehat{p}(R)$ and that the distribution after $k$ transitions is $p^{*k}$ (the convolution of $p$ with itself $k$ times), i.e., the probability of reaching state $y$ from state $x$ after $k$ transitions is $p^{*k}(x^{-1}y)$. One can easily check that the random walk on $G$ driven by $p$ is irreducible if and only if the support of $p$ generates $G$ [24, Proposition 2.3]. The stationary distribution of an irreducible random walk on $G$ driven by $p$ is the uniform distribution $U_G$ on $G$, since $\sum_{x\in G}M_p(x,y)=\sum_{x\in G}p(x^{-1}y)=\sum_{z\in G}p(z)=1$ (substituting $z=x^{-1}y$) for all $y\in G$. From now on, the uniform distribution on a group $G$ will be denoted by $U_G$. For the random walk on $G$ driven by $p$, it is enough to focus on $\left|\left|p^{*k}-U_G\right|\right|_2$ and $\left|\left|p^{*k}-U_G\right|\right|_{\text{TV}}$ because

Mpk(x,)UGTV=Mpk(y,)UGTV\displaystyle\left|\left|M_{p}^{k}(x,\cdot)-U_{G}\right|\right|_{\text{TV}}=\left|\left|M_{p}^{k}(y,\cdot)-U_{G}\right|\right|_{\text{TV}}
Mpk(x,)UG2=Mpk(y,)UG2\displaystyle\left|\left|M_{p}^{k}(x,\cdot)-U_{G}\right|\right|_{2}=\left|\left|M_{p}^{k}(y,\cdot)-U_{G}\right|\right|_{2}

for any two elements x,yGx,y\in G. We now define the cutoff phenomenon for a sequence of random walks on finite groups.
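These facts can also be observed numerically for a small group. The sketch below (an illustration, not part of the argument) builds Mp(x,y)=p(x1y)M_{p}(x,y)=p(x^{-1}y) on S3S_{3} for a measure supported on the identity and two transpositions, and checks that the total variation distance to uniform is the same from every starting state and tends to zero.

```python
import itertools
import numpy as np

# S_3 as permutation tuples; (s*t)(i) = s(t(i))
elems = list(itertools.permutations(range(3)))
idx = {g: i for i, g in enumerate(elems)}
def mul(s, t):
    return tuple(s[t[i]] for i in range(3))
def inv(s):
    r = [0] * 3
    for i, si in enumerate(s):
        r[si] = i
    return tuple(r)

# symmetric measure on {id, (0 1), (0 2)}: support generates S_3, and the
# identity in the support makes the walk aperiodic
support = [(0, 1, 2), (1, 0, 2), (2, 1, 0)]
p = np.zeros(6)
for g in support:
    p[idx[g]] = 1 / 3

# M_p(x, y) = p(x^{-1} y)
M = np.array([[p[idx[mul(inv(x), y)]] for y in elems] for x in elems])

Mk = np.linalg.matrix_power(M, 50)
tv = [0.5 * np.abs(row - 1 / 6).sum() for row in Mk]
assert max(tv) < 1e-6                    # close to uniform after 50 steps
assert abs(tv[0] - tv[-1]) < 1e-9        # same distance from every start
```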

Definition 1.4.

Let {𝒢n}0\{\mathscr{G}_{n}\}_{0}^{\infty} be a sequence of finite groups. For each nn, let pnp_{n} be a probability measure on 𝒢n\mathscr{G}_{n} such that the support of pnp_{n} generates 𝒢n\mathscr{G}_{n}. Consider the sequence of irreducible and aperiodic random walks on 𝒢n\mathscr{G}_{n} driven by pnp_{n}. We say that the 2\ell^{2}-cutoff phenomenon (respectively total variation cutoff phenomenon) holds for the family {(𝒢n,pn)}0\{(\mathscr{G}_{n},p_{n})\}_{0}^{\infty} if there exists a sequence {𝔗n}0\{\mathfrak{T}_{n}\}_{0}^{\infty} of positive real numbers tending to infinity as nn\rightarrow\infty, such that the following hold:

  1. (1)

    For any ϵ(0,1)\epsilon\in(0,1) and kn=(1+ϵ)𝔗nk_{n}=\lfloor(1+\epsilon)\mathfrak{T}_{n}\rfloor,

    limnpnknU𝒢n2=0(respectively limnpnknU𝒢nTV=0),\lim_{n\rightarrow\infty}\left|\left|p_{n}^{*k_{n}}-U_{\mathscr{G}_{n}}\right|\right|_{2}=0\;\left(\text{respectively }\lim_{n\rightarrow\infty}\left|\left|p_{n}^{*k_{n}}-U_{\mathscr{G}_{n}}\right|\right|_{\text{TV}}=0\right),
  2. (2)

    For any ϵ(0,1)\epsilon\in(0,1) and kn=(1ϵ)𝔗nk_{n}=\lfloor(1-\epsilon)\mathfrak{T}_{n}\rfloor,

    limnpnknU𝒢n2=(respectively limnpnknU𝒢nTV=1).\lim_{n\rightarrow\infty}\left|\left|p_{n}^{*k_{n}}-U_{\mathscr{G}_{n}}\right|\right|_{2}=\infty\;\left(\text{respectively }\lim_{n\rightarrow\infty}\left|\left|p_{n}^{*k_{n}}-U_{\mathscr{G}_{n}}\right|\right|_{\text{TV}}=1\right).

Here x\lfloor x\rfloor denotes the floor of xx (the largest integer less than or equal to xx).

Informally, we will say that {(𝒢n,pn)}0\{(\mathscr{G}_{n},p_{n})\}_{0}^{\infty} has an 2\ell^{2}-cutoff (respectively total variation cutoff) at time 𝔗n\mathfrak{T}_{n}. This says that, for sufficiently large nn, the leading-order term of the mixing time does not depend on the tolerance level ε(>0)\varepsilon(>0). In other words, the distribution after kk transitions is very close to the stationary distribution if kk is slightly larger than 𝔗n\mathfrak{T}_{n}, but far from it if kk is slightly smaller than 𝔗n\mathfrak{T}_{n}. Although in most cases the cutoff phenomenon depends on the multiplicity of the second largest eigenvalue of the transition matrix [7], the behaviour can be different. For the random-to-random shuffle [4], many eigenvalues are almost equal to the second largest eigenvalue, and they impact the (total variation) cutoff time. We now show that the random walk of our concern is irreducible and aperiodic.
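The cutoff phenomenon can be seen concretely in a walk whose spectrum is fully explicit. The sketch below (an illustration, not part of the argument) takes the lazy walk on the hypercube 2n\mathbb{Z}_{2}^{n} driven by the uniform measure on {0,e1,,en}\{0,e_{1},\dots,e_{n}\}: all irreducibles are characters χS\chi_{S} with p^(χS)=12|S|/(n+1)\widehat{p}(\chi_{S})=1-2|S|/(n+1), so the analogue of the 2\ell^{2} distance has a closed form, and the classical cutoff time is (1/4)nlogn(1/4)n\log n.

```python
from math import comb, log

# exact l^2 distance squared for the lazy hypercube walk on Z_2^n:
# d_2(k)^2 = sum over nonempty S of p^(chi_S)^{2k}, grouped by |S| = j
def d2_squared(n, k):
    return sum(comb(n, j) * (1 - 2 * j / (n + 1)) ** (2 * k)
               for j in range(1, n + 1))

n = 100
T = (n + 1) / 4 * log(n)          # the (classical) cutoff time for this walk
assert d2_squared(n, int(2.0 * T)) < 0.05   # past the cutoff: nearly mixed
assert d2_squared(n, int(0.5 * T)) > 100    # before the cutoff: far from mixed
```

The sharp drop of d2(k)d_{2}(k) in a window around TT, rather than a gradual decay, is exactly what Definition 1.4 formalises.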

Proposition 1.4.

The warp-transpose top with random shuffle on 𝒢n\operatorname{\mathcal{G}_{n}} is irreducible and aperiodic.

Proof.

The support of PP is Γ={(g1)(i)g(n)(i,n),g(n)gGn,1i<n}\Gamma=\{(g^{-1})^{(i)}g^{(n)}(i,n),g^{(n)}\mid g\in G_{n},1\leq i<n\} and it can be easily seen that {g(k),(i,n)gGn, 1kn, 1i<n}\{g^{(k)},(i,n)\mid g\in G_{n},\;1\leq k\leq n,\;1\leq i<n\} is a generating set of 𝒢n\operatorname{\mathcal{G}_{n}}.

(3) (g1)(n)((g1)(i)g(n)(i,n))g(n)=(i,n) for each 1i<n and gGn,(k,n)g(n)(k,n)=g(k) for each 1kn and for all gGn.\begin{split}&(g^{-1})^{(n)}\left((g^{-1})^{(i)}g^{(n)}(i,n)\right)g^{(n)}=(i,n)\text{ for each }1\leq i<n\text{ and }g\in G_{n},\\ &(k,n)g^{(n)}(k,n)=g^{(k)}\text{ for each }1\leq k\leq n\text{ and for all }g\in G_{n}.\end{split}

Thus (3) implies that Γ\Gamma generates 𝒢n\operatorname{\mathcal{G}_{n}}, and hence the warp-transpose top with random shuffle on 𝒢n\operatorname{\mathcal{G}_{n}} is irreducible. Moreover, given any π𝒢n\pi\in\operatorname{\mathcal{G}_{n}}, the set of all times at which the chain can return to the starting state π\pi contains the integer 11 (as the support of PP contains the identity element of 𝒢n\operatorname{\mathcal{G}_{n}}). Therefore the period of the state π\pi is 11, and hence by irreducibility all the states of this chain have period 11. Thus this chain is aperiodic. ∎
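The identities (3) can be verified mechanically in a small wreath product. The sketch below (an illustration, not part of the proof) uses 3S4\mathbb{Z}_{3}\wr S_{4} with the multiplication convention (a;π)(b;σ)=(aπ(b);πσ)(a;\pi)(b;\sigma)=(a\cdot\pi(b);\pi\sigma), where π(b)j=bπ1(j)\pi(b)_{j}=b_{\pi^{-1}(j)}; this is one common convention, and slots are 0-indexed, so the paper's position nn is slot n1n-1 in the code.

```python
import itertools

m, n = 3, 4                         # Z_3 wr S_4
e_perm = tuple(range(n))

def pmul(p, q):                     # (p q)(j) = p(q(j))
    return tuple(p[q[j]] for j in range(n))
def pinv(p):
    r = [0] * n
    for j, pj in enumerate(p):
        r[pj] = j
    return tuple(r)
def wmul(x, y):                     # (a;p)(b;q) = (a + p.b mod m, p q)
    (a, p), (b, q) = x, y
    pi = pinv(p)
    return (tuple((a[j] + b[pi[j]]) % m for j in range(n)), pmul(p, q))

def g_at(g, k):                     # g^{(k)}: g in slot k, identity permutation
    t = [0] * n; t[k] = g % m
    return (tuple(t), e_perm)
def transp(i, j):                   # the transposition (i, j)
    p = list(range(n)); p[i], p[j] = p[j], p[i]
    return ((0,) * n, tuple(p))

identity = ((0,) * n, e_perm)
for g in range(m):
    for i in range(n - 1):          # first identity of (3)
        A = wmul(wmul(g_at(-g, i), g_at(g, n - 1)), transp(i, n - 1))
        assert wmul(wmul(g_at(-g, n - 1), A), g_at(g, n - 1)) == transp(i, n - 1)
    for k in range(n):              # second identity of (3)
        lhs = wmul(wmul(transp(k, n - 1), g_at(g, n - 1)), transp(k, n - 1))
        assert lhs == g_at(g, k)
```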

Proposition 1.4 says that the warp-transpose top with random shuffle on 𝒢n\operatorname{\mathcal{G}_{n}} converges to the uniform distribution U𝒢nU_{\operatorname{\mathcal{G}_{n}}} as the number of transitions goes to infinity. In Section 2 we will find the spectrum of P^(R)\widehat{P}(R). We will prove Theorems 1.1 and 1.2 in Section 3. In Section 4, we will obtain an upper bound of PkU𝒢nTV\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}} using the coupling argument. In Section 5, the lower bound of PkU𝒢nTV\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}} will be discussed and Theorem 1.3 will be proved.

2. Spectrum of the transition matrix

In this section we find the eigenvalues of the transition matrix P^(R)\widehat{P}(R), the Fourier transform of PP at the right regular representation RR of 𝒢n\operatorname{\mathcal{G}_{n}}. To find the eigenvalues of P^(R)\widehat{P}(R) we will use the representation theory of the wreath product 𝒢n\operatorname{\mathcal{G}_{n}} of GnG_{n} with the symmetric group SnS_{n}. First we briefly discuss the representation theory of GSnG\wr S_{n}, following the notation from [17]. We refer to the exposition [17] for more details on the representation theory of GSnG\wr S_{n}.

A partition λ\lambda of a positive integer nn (denoted λn\lambda\vdash n) is a weakly decreasing finite sequence (λ1,,λr)(\lambda_{1},\cdots,\lambda_{r}) of positive integers such that i=1rλi=n\sum_{i=1}^{r}\lambda_{i}=n. The partition λ\lambda can be pictorially visualized as a left-justified arrangement of rr rows of boxes with λi\lambda_{i} boxes in the iith row, 1ir1\leq i\leq r. This pictorial arrangement of boxes is known as the Young diagram of λ\lambda. For example there are five partitions of the positive integer 44 viz. (4)(4), (3,1)(3,1), (2,2)(2,2), (2,1,1)(2,1,1) and (1,1,1,1)(1,1,1,1). The Young diagrams corresponding to the partitions of 44 are shown in Figure 2.
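The partitions of a small integer can be enumerated directly; the short recursive sketch below (illustrative, not part of the text) reproduces the five partitions of 44 listed above.

```python
def partitions(n, max_part=None):
    # all weakly decreasing tuples of positive integers summing to n
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

assert list(partitions(4)) == [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
```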

Definition 2.1.

Let 𝒴\operatorname{\mathcal{Y}} denote the set of all Young diagrams (there is a unique Young diagram with zero boxes) and 𝒴n\operatorname{\mathcal{Y}_{n}} denote the set of all Young diagrams with nn boxes. For example, elements of 𝒴4\mathcal{Y}_{4} are shown in Figure 2. For a finite set XX, we define 𝒴(X)={μ:μ is a map from X to 𝒴}\operatorname{\mathcal{Y}(X)}=\{\mu:\mu\text{ is a map from }X\text{ to }\operatorname{\mathcal{Y}}\}. For μ𝒴(X)\mu\in\operatorname{\mathcal{Y}(X)}, define μ=xX|μ(x)|||\mu||=\sum_{x\in X}|\mu(x)|, where |μ(x)||\mu(x)| is the number of boxes of the Young diagram μ(x)\mu(x) and define 𝒴n(X)={μ𝒴(X):μ=n}\operatorname{\mathcal{Y}_{n}(X)}=\{\mu\in\operatorname{\mathcal{Y}(X)}:||\mu||=n\}.

\yng(4)\yng(3,1)\yng(2,2)\yng(2,1,1)\yng(1,1,1,1)(4)(3,1)(2,2)(2,1,1)(1,1,1,1)\begin{array}[]{cclll}\yng(4)&\hskip 14.22636pt\yng(3,1)&\hskip 14.22636pt\yng(2,2)&\hskip 14.22636pt\yng(2,1,1)&\hskip 21.33955pt\yng(1,1,1,1)\\ (4)&\;\quad(3,1)&\quad\;(2,2)&\quad(2,1,1)&\;(1,1,1,1)\end{array}

Figure 2. Young diagrams with 44 boxes.

Let nn be a fixed positive integer. Let G^\widehat{G} denote the (finite) set of all non-isomorphic irreducible representations of GG. Given σG^\sigma\in\widehat{G}, we denote by WσW^{\sigma} the corresponding irreducible GG-module (the space for the corresponding irreducible representation of GG). Elements of 𝒴(G^)\operatorname{\mathcal{Y}}(\widehat{G}) are called Young GG-diagrams and elements of 𝒴n(G^)\operatorname{\mathcal{Y}_{n}}(\widehat{G}) are called Young GG-diagrams with nn boxes. For example, if n=10n=10 and G=10G=\mathbb{Z}_{10} (the additive group of integers modulo 1010), then an element of 𝒴10(^10)\mathcal{Y}_{10}(\widehat{\mathbb{Z}}_{10}) is given in Figure 3. Let μ𝒴\mu\in\operatorname{\mathcal{Y}}. A Young tableau of shape μ\mu is obtained by taking the Young diagram μ\mu and filling its |μ||\mu| boxes (bijectively) with the numbers 1,2,,|μ|1,2,\dots,|\mu|. A Young tableau is said to be standard if the numbers in the boxes strictly increase along each row and each column of the Young diagram of μ\mu. The set of all standard Young tableaux of shape μ\mu is denoted by tab(μ)\operatorname{tab}(\mu). Elements of tab((3,1))\operatorname{tab}((3,1)) are listed in Figure 4. Let μ𝒴(G^)\mu\in\operatorname{\mathcal{Y}}(\widehat{G}). A Young GG-tableau of shape μ\mu is obtained by taking the Young GG-diagram μ\mu and filling its μ||\mu|| boxes (bijectively) with the numbers 1,2,,μ.1,2,\dots,||\mu||. A Young GG-tableau is said to be standard if the numbers in the boxes strictly increase along each row and each column of all Young diagrams occurring in μ\mu. Let tabG(n,μ)\operatorname{tab}_{G}(n,\mu), where μ𝒴n(G^)\mu\in\operatorname{\mathcal{Y}_{n}}(\widehat{G}), denote the set of all standard Young GG-tableaux of shape μ\mu and let tabG(n)=μ𝒴n(G^)tabG(n,μ)\operatorname{tab}_{G}(n)=\cup_{\mu\in\operatorname{\mathcal{Y}_{n}}(\widehat{G})}\operatorname{tab}_{G}(n,\mu). 
An element of tab10(10,μ)\operatorname{tab}_{\mathbb{Z}_{10}}\left(10,\mu\right) is given in Figure 5, the shape μ(𝒴10(^10))\mu\;\left(\in\operatorname{\mathcal{Y}}_{10}(\widehat{\mathbb{Z}}_{10})\right) is given in Figure 3.
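Standard Young tableaux of a given shape can be enumerated by brute force over all fillings; the sketch below (illustrative only) recovers the three tableaux of shape (3,1)(3,1) shown in Figure 4.

```python
from itertools import permutations

def standard_tableaux(shape):
    # brute-force enumeration: a filling is standard if entries increase
    # along every row (left to right) and every column (top to bottom)
    cells = [(r, c) for r, row_len in enumerate(shape) for c in range(row_len)]
    n = len(cells)
    out = []
    for perm in permutations(range(1, n + 1)):
        T = dict(zip(cells, perm))
        ok = all((c == 0 or T[(r, c - 1)] < T[(r, c)]) and
                 (r == 0 or T[(r - 1, c)] < T[(r, c)])
                 for (r, c) in T)
        if ok:
            out.append(T)
    return out

assert len(standard_tableaux((3, 1))) == 3   # matches Figure 4
assert len(standard_tableaux((2, 2))) == 2
```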

Definition 2.2.

Let TtabG(n)T\in\operatorname{tab}_{G}(n) and i[n]i\in[n]. If ii appears in the Young diagram μ(σ)\mu(\sigma), where μ\mu is the shape of TT and σG^\sigma\in\widehat{G}, we write rT(i)=σr_{T}(i)=\sigma. For the example given in Figure 5, we have rT(1)=σ2r_{T}(1)=\sigma_{2}, rT(2)=σ2r_{T}(2)=\sigma_{2}, rT(3)=σ8r_{T}(3)=\sigma_{8}, rT(4)=σ1r_{T}(4)=\sigma_{1}, rT(5)=σ10r_{T}(5)=\sigma_{10}, rT(6)=σ1r_{T}(6)=\sigma_{1}, rT(7)=σ1r_{T}(7)=\sigma_{1}, rT(8)=σ8r_{T}(8)=\sigma_{8}, rT(9)=σ1r_{T}(9)=\sigma_{1}, rT(10)=σ1r_{T}(10)=\sigma_{1}. The content of a box in row pp and column qq of a Young diagram is the integer qpq-p. Let bT(i)b_{T}(i) denote the box of μ(σ)\mu(\sigma) in which the number ii resides, and let c(bT(i))c(b_{T}(i)) denote the content of the box bT(i)b_{T}(i). For the example given in Figure 5, we also have c(bT(1))=0c(b_{T}(1))=0, c(bT(2))=1c(b_{T}(2))=-1, c(bT(3))=0c(b_{T}(3))=0, c(bT(4))=0c(b_{T}(4))=0, c(bT(5))=0c(b_{T}(5))=0, c(bT(6))=1c(b_{T}(6))=1, c(bT(7))=1c(b_{T}(7))=-1, c(bT(8))=1c(b_{T}(8))=-1, c(bT(9))=2c(b_{T}(9))=2, c(bT(10))=0c(b_{T}(10))=0.

μ:=(μ(σ1),μ(σ2),,μ(σ10))=(\yng(3,2),\yng(1,1),,,,,,\yng(1,1),,\yng(1))\mu:=\left(\mu(\sigma_{1}),\mu(\sigma_{2}),\dots,\mu(\sigma_{10})\right)=\left(\begin{array}[]{c}\yng(3,2)\end{array},\begin{array}[]{c}\yng(1,1)\end{array},\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\begin{array}[]{c}\yng(1,1)\end{array},\;\emptyset,\begin{array}[]{c}\yng(1)\end{array}\right)

Figure 3. A Young 10\mathbb{Z}_{10}-diagram μ\mu with 1010 boxes. Here ^10:={σi:1i10}\widehat{\mathbb{Z}}_{10}:=\{\sigma_{i}:1\leq i\leq 10\}.

\young(123,4)\young(124,3)\young(134,2)\begin{array}[]{ccc}\young({{\begin{subarray}{c}1\end{subarray}}}{{\begin{subarray}{c}2\end{subarray}}}{{\begin{subarray}{c}3\end{subarray}}},{{\begin{subarray}{c}4\end{subarray}}})&\quad\young({{\begin{subarray}{c}1\end{subarray}}}{{\begin{subarray}{c}2\end{subarray}}}{{\begin{subarray}{c}4\end{subarray}}},{{\begin{subarray}{c}3\end{subarray}}})&\quad\young({{\begin{subarray}{c}1\end{subarray}}}{{\begin{subarray}{c}3\end{subarray}}}{{\begin{subarray}{c}4\end{subarray}}},{{\begin{subarray}{c}2\end{subarray}}})\end{array}

Figure 4. Standard Young tableaux of shape (3,1)(3,1).

μ(\young(469,710),\young(1,2),,,,,,\young(3,8),,\young(5))\mu\rightsquigarrow\left(\begin{array}[]{c}\young({{\begin{subarray}{c}4\end{subarray}}}{{\begin{subarray}{c}6\end{subarray}}}{{\begin{subarray}{c}9\end{subarray}}},{{\begin{subarray}{c}7\end{subarray}}}{{\begin{subarray}{c}10\end{subarray}}})\end{array},\begin{array}[]{c}\young({{\begin{subarray}{c}1\end{subarray}}},{{\begin{subarray}{c}2\end{subarray}}})\end{array},\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\begin{array}[]{c}\young({{\begin{subarray}{c}3\end{subarray}}},{{\begin{subarray}{c}8\end{subarray}}})\end{array},\;\emptyset,\begin{array}[]{c}\young({{\begin{subarray}{c}5\end{subarray}}})\end{array}\right)

Figure 5. A standard Young 10\mathbb{Z}_{10}-tableau of shape μ\mu, defined in Figure 3.

The irreducible representations of GSnG\wr S_{n} can be parametrised by the elements of 𝒴n(G^)\mathcal{Y}_{n}(\widehat{G}) [17, Lemma 6.2 and Theorem 6.4]. Given μ𝒴(G^)\mu\in\operatorname{\mathcal{Y}}(\widehat{G}) and σG^,μσ\sigma\in\widehat{G},\;\mu\downarrow_{\sigma} denotes the set of all Young GG-diagrams obtained from μ\mu by removing one of the inner corners in the Young diagram μ(σ)\mu(\sigma); see Figure 6 for an example. The branching rule [17, Theorem 6.6] of the pair GSn1GSnG\wr S_{n-1}\subseteq G\wr S_{n} is given as follows: Let VμV^{\mu} (respectively VλV^{\lambda}) denote the irreducible GSnG\wr S_{n}-module (respectively GSn1G\wr S_{n-1}-module) indexed by μ𝒴n(G^)\mu\in\operatorname{\mathcal{Y}_{n}}(\widehat{G}) (respectively λ𝒴n1(G^)\lambda\in\mathcal{Y}_{n-1}(\widehat{G})). Then,

(4) VμGSn1GSnσG^:μ(σ)dim(Wσ)(λμσVλ),V^{\mu}\big{\downarrow}_{G\wr S_{n-1}}^{G\wr S_{n}}\cong\underset{\sigma\in\widehat{G}:\;\mu(\sigma)\neq\emptyset}{\oplus}\operatorname{dim}(W^{\sigma})\left(\underset{\lambda\in\mu\downarrow_{\sigma}}{\oplus}V^{\lambda}\right),

where WσW^{\sigma} is the irreducible GG-module indexed by σ\sigma. For an illustration of (4), consider μ𝒴10(^10)\mu\in\operatorname{\mathcal{Y}}_{10}(\widehat{\mathbb{Z}}_{10}) from Figure 3, and recall μσi\mu\downarrow_{\sigma_{i}} (i=1,2,8,10)(i=1,2,8,10) from Figure 6. Then

Vμ10S910S10dim(Wσ1)(λμσ1Vλ)dim(Wσ2)Vμσ2dim(Wσ8)Vμσ8dim(Wσ10)Vμσ10.V^{\mu}\big{\downarrow}^{\mathbb{Z}_{10}\wr S_{10}}_{\mathbb{Z}_{10}\wr S_{9}}\cong\operatorname{dim}(W^{\sigma_{1}})\left(\underset{\lambda\in\mu\downarrow_{\sigma_{1}}}{\oplus}V^{\lambda}\right)\oplus\operatorname{dim}(W^{\sigma_{2}})V^{\mu\downarrow_{\sigma_{2}}}\oplus\operatorname{dim}(W^{\sigma_{8}})V^{\mu\downarrow_{\sigma_{8}}}\oplus\operatorname{dim}(W^{\sigma_{10}})V^{\mu\downarrow_{\sigma_{10}}}.

Here we have used the same notation for a singleton set and its element.

μσ1={(\yng(2,2),\yng(1,1),,,,,,\yng(1,1),,\yng(1)),(\yng(3,1),\yng(1,1),,,,,,\yng(1,1),,\yng(1))}μσ2={(\yng(3,2),\yng(1),,,,,,\yng(1,1),,\yng(1))}μσ8={(\yng(3,2),\yng(1,1),,,,,,\yng(1),,\yng(1))}μσ10={(\yng(3,2),\yng(1,1),,,,,,\yng(1,1),,)}\begin{array}[]{c}\begin{array}[]{c}\mu\downarrow_{\sigma_{1}}=\bigg{\{}\left(\begin{array}[]{c}\yng(2,2)\end{array},\begin{array}[]{c}\yng(1,1)\end{array},\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\begin{array}[]{c}\yng(1,1)\end{array},\;\emptyset,\begin{array}[]{c}\yng(1)\end{array}\right),\\ \hskip 162.6075pt\left(\begin{array}[]{c}\yng(3,1)\end{array},\begin{array}[]{c}\yng(1,1)\end{array},\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\begin{array}[]{c}\yng(1,1)\end{array},\;\emptyset,\begin{array}[]{c}\yng(1)\end{array}\right)\bigg{\}}\end{array}\\ \mu\downarrow_{\sigma_{2}}=\bigg{\{}\left(\begin{array}[]{c}\yng(3,2)\end{array},\begin{array}[]{c}\yng(1)\end{array},\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\begin{array}[]{c}\yng(1,1)\end{array},\;\emptyset,\begin{array}[]{c}\yng(1)\end{array}\right)\bigg{\}}\\ \mu\downarrow_{\sigma_{8}}=\bigg{\{}\left(\begin{array}[]{c}\yng(3,2)\end{array},\begin{array}[]{c}\yng(1,1)\end{array},\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\begin{array}[]{c}\yng(1)\end{array},\;\emptyset,\begin{array}[]{c}\yng(1)\end{array}\right)\bigg{\}}\\ \mu\downarrow_{\sigma_{10}}=\bigg{\{}\left(\begin{array}[]{c}\yng(3,2)\end{array},\begin{array}[]{c}\yng(1,1)\end{array},\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\;\emptyset,\begin{array}[]{c}\yng(1,1)\end{array},\;\emptyset,\;\emptyset\right)\bigg{\}}\end{array}

Figure 6. μσ\mu\downarrow_{\sigma} for non empty μ(σ)\mu({\sigma}), where μ(𝒴10(^10))\mu\left(\in\mathcal{Y}_{10}(\widehat{\mathbb{Z}}_{10})\right) is defined in Figure 3.
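Because restriction preserves dimension, the branching rule (4) can be cross-checked against the dimension formula of Theorem 2.3 below. The sketch (illustrative, not part of the text) does this for the 10\mathbb{Z}_{10}-diagram of Figure 3, listing only its four nonempty shapes and using di=1d_{i}=1 throughout, as 10\mathbb{Z}_{10} is abelian.

```python
from math import factorial, prod

def f_syt(shape):
    # hook length formula for the number of standard Young tableaux
    if not shape:
        return 1
    n = sum(shape)
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])]
    hooks = prod((shape[r] - c) + (cols[c] - r) - 1
                 for r in range(len(shape)) for c in range(shape[r]))
    return factorial(n) // hooks

def dim_V(mu, dims):
    # dimension formula of Theorem 2.3 for a tuple of shapes mu
    n = sum(sum(s) for s in mu)
    m = [sum(s) for s in mu]
    multinom = factorial(n) // prod(factorial(x) for x in m)
    return multinom * prod(f_syt(s) for s in mu) * prod(d ** k for d, k in zip(dims, m))

def remove_corner(shape):
    # all shapes obtained by removing one removable (inner) corner
    out = []
    for r in range(len(shape)):
        if r == len(shape) - 1 or shape[r] > shape[r + 1]:
            new = list(shape); new[r] -= 1
            out.append(tuple(x for x in new if x > 0))
    return out

# nonempty shapes of Figure 3, at sigma_1, sigma_2, sigma_8, sigma_10
mu = [(3, 2), (1, 1), (1, 1), (1,)]
dims = [1, 1, 1, 1]                 # Z_10 is abelian
lhs = dim_V(mu, dims)
rhs = 0
for j in range(4):
    for lam in remove_corner(mu[j]):
        nu = list(mu); nu[j] = lam
        rhs += dims[j] * dim_V(nu, dims)
assert lhs == rhs                   # branching rule (4) preserves dimension
```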
Definition 2.3.

Let i,n(G)\mathcal{H}_{i,n}(G) be the subgroup

{(g1,,gn,π)GSn:π(j)=j for i+1jn}\{(g_{1},\dots,g_{n},\pi)\in G\wr S_{n}:\pi(j)=j\text{ for }i+1\leq j\leq n\}

of GSnG\wr S_{n} for 0in0\leq i\leq n. In particular 0,n(G)=1,n(G)=Gn\mathcal{H}_{0,n}(G)=\mathcal{H}_{1,n}(G)=G^{n} and n,n(G)=GSn\mathcal{H}_{n,n}(G)=G\wr S_{n}.

The subgroup i,n(G)\mathcal{H}_{i,n}(G) is isomorphic to GSi×GniG\wr S_{i}\times G^{n-i} (direct product of GSiG\wr S_{i} and GniG^{n-i}) by the isomorphism Ψ:i,n(G)GSi×Gni\Psi:\mathcal{H}_{i,n}(G)\rightarrow G\wr S_{i}\times G^{n-i}; sending (g1,,gi,gi+1,,gn;π)i,n(G)(g_{1},\dots,g_{i},g_{i+1},\dots,g_{n};\pi)\in\mathcal{H}_{i,n}(G) to ((g1,,gi;π),(gi+1,,gn))GSi×Gni\left((g_{1},\dots,g_{i};\pi),(g_{i+1},\dots,g_{n})\right)\in G\wr S_{i}\times G^{n-i}. Here we have used the same notation π\pi for permutations of SiS_{i} and SnS_{n}, as π(j)=j\pi(j)=j for i+1jni+1\leq j\leq n. The irreducible GSi×GniG\wr S_{i}\times G^{n-i}-modules are given by the tensor products of the irreducible GSiG\wr S_{i}-modules and the irreducible GniG^{n-i}-modules [26, Theorem 10]. Therefore we may parametrise the irreducible representations of i,n(G)\mathcal{H}_{i,n}(G) by elements of 𝒴i(G^)×G^ni\operatorname{\mathcal{Y}}_{i}(\widehat{G})\times\widehat{G}^{n-i}. The branching rule of the pair i1,n(G)i,n(G)\mathcal{H}_{i-1,n}(G)\subseteq\mathcal{H}_{i,n}(G) is given as follows: Let V(μ,σi+1,σi+2,,σn)V^{(\mu,\sigma_{i+1},\sigma_{i+2},\dots,\sigma_{n})} (respectively V(λ,σ,σi+1,σi+2,,σn)V^{(\lambda,\sigma,\sigma_{i+1},\sigma_{i+2},\dots,\sigma_{n})}) denote the irreducible i,n(G)\mathcal{H}_{i,n}(G)-module (respectively i1,n(G)\mathcal{H}_{i-1,n}(G)-module) indexed by (μ,σi+1,σi+2,,σn)𝒴i(G^)×G^ni(\mu,\sigma_{i+1},\sigma_{i+2},\dots,\sigma_{n})\in\operatorname{\mathcal{Y}}_{i}(\widehat{G})\times\widehat{G}^{n-i} (respectively (λ,σ,σi+1,σi+2,,σn)𝒴i1(G^)×G^ni+1(\lambda,\sigma,\sigma_{i+1},\sigma_{i+2},\dots,\sigma_{n})\in\operatorname{\mathcal{Y}}_{i-1}(\widehat{G})\times\widehat{G}^{n-i+1}). Then,

(5) V(μ,σi+1,σi+2,,σn)i1,n(G)i,n(G)σG^:μ(σ)(λμσV(λ,σ,σi+1,σi+2,,σn)).V^{(\mu,\sigma_{i+1},\sigma_{i+2},\dots,\sigma_{n})}\big{\downarrow}_{\mathcal{H}_{i-1,n}(G)}^{\mathcal{H}_{i,n}(G)}\cong\underset{\sigma\in\widehat{G}:\;\mu(\sigma)\neq\emptyset}{\oplus}\left(\underset{\lambda\in\mu\downarrow_{\sigma}}{\oplus}V^{(\lambda,\sigma,\sigma_{i+1},\sigma_{i+2},\dots,\sigma_{n})}\right).

In particular, V(μ)=VμV^{(\mu)}=V^{\mu} (irreducible GSnG\wr S_{n}-module), for i=ni=n. To illustrate (5), consider μ𝒴10(^10)\mu\in\operatorname{\mathcal{Y}}_{10}(\widehat{\mathbb{Z}}_{10}) from Figure 3, and recall μσi\mu\downarrow_{\sigma_{i}} (i=1,2,8,10)(i=1,2,8,10) from Figure 6. Then

V(μ)9,10(10)10,10(10)(λμσ1V(λ,σ1))V(μσ2,σ2)V(μσ8,σ8)V(μσ10,σ10).V^{(\mu)}\big{\downarrow}_{\mathcal{H}_{9,10}(\mathbb{Z}_{10})}^{\mathcal{H}_{10,10}(\mathbb{Z}_{10})}\cong\left(\underset{\lambda\in\mu\downarrow_{\sigma_{1}}}{\oplus}V^{(\lambda,\sigma_{1})}\right)\oplus V^{(\mu\downarrow_{\sigma_{2}},\sigma_{2})}\oplus V^{(\mu\downarrow_{\sigma_{8}},\sigma_{8})}\oplus V^{(\mu\downarrow_{\sigma_{10}},\sigma_{10})}.

Here we have used the same notation for a singleton set and its element.

Remark 2.1.

Although the simple (multiplicity-free) branching of the pair i1,n(G)i,n(G)\mathcal{H}_{i-1,n}(G)\subseteq\mathcal{H}_{i,n}(G) was established in [17, Section 4], the branching rule itself was not stated there explicitly. To justify (5), we note the following: in view of the isomorphism Ψ\Psi, it suffices to prove (5) for i=ni=n, and that case follows from a straightforward generalisation of [18, Theorem 4.3].

Definition 2.4.

The (generalised) Young-Jucys-Murphy elements X1(G),,Xn(G)X_{1}(G),\dots,X_{n}(G) of n,n(G)\mathcal{H}_{n,n}(G) or [GSn]\mathbb{C}[G\wr S_{n}] are given by X1(G)=0X_{1}(G)=0 and

Xi(G)\displaystyle X_{i}(G) =k=1i1gG(g1)(k)g(i)(k,i)=k=1i1gG(g1)(k)(k,i)g(k), for all 2in.\displaystyle=\sum\limits_{k=1}^{i-1}\sum\limits_{g\in G}(g^{-1})^{(k)}g^{(i)}(k,i)=\displaystyle\sum_{k=1}^{i-1}\sum\limits_{g\in G}(g^{-1})^{(k)}(k,i)g^{(k)},\text{ for all }2\leq i\leq n.

The Young-Jucys-Murphy elements generate a maximal commutative subalgebra of [GSn]\mathbb{C}[G\wr S_{n}] and act as scalars on the Gelfand-Tsetlin subspaces of irreducible GSnG\wr S_{n}-modules. We now define Gelfand-Tsetlin subspaces and the Gelfand-Tsetlin decomposition.
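The commutativity of the Young-Jucys-Murphy elements, and the integrality of their spectrum predicted by Theorem 2.2 below, can be verified numerically in a small case. The sketch (illustrative only) realises X2X_{2} and X3X_{3} of 2S3\mathbb{Z}_{2}\wr S_{3} as 48×4848\times 48 matrices via the right regular representation, using the convention (a;π)(b;σ)=(aπ(b);πσ)(a;\pi)(b;\sigma)=(a\cdot\pi(b);\pi\sigma) and 0-indexed slots (an assumed, common convention).

```python
import itertools
import numpy as np

m, n = 2, 3                          # Z_2 wr S_3, a group of order 48
perms = list(itertools.permutations(range(n)))
elements = [(t, p) for t in itertools.product(range(m), repeat=n) for p in perms]
idx = {x: i for i, x in enumerate(elements)}

def pmul(p, q):
    return tuple(p[q[j]] for j in range(n))
def pinv(p):
    r = [0] * n
    for j, pj in enumerate(p):
        r[pj] = j
    return tuple(r)
def wmul(x, y):
    (a, p), (b, q) = x, y
    pi = pinv(p)
    return (tuple((a[j] + b[pi[j]]) % m for j in range(n)), pmul(p, q))

def R(x):                            # right regular representation
    M = np.zeros((len(elements),) * 2)
    for g in elements:
        M[idx[g], idx[wmul(g, x)]] = 1.0
    return M
def g_at(g, k):
    t = [0] * n; t[k] = g % m
    return (tuple(t), tuple(range(n)))
def transp(i, j):
    p = list(range(n)); p[i], p[j] = p[j], p[i]
    return ((0,) * n, tuple(p))

def X(i):                            # X_i = sum_{k<i} sum_g (g^{-1})^{(k)} g^{(i)} (k,i)
    S = np.zeros((len(elements),) * 2)
    for k in range(i - 1):
        for g in range(m):
            S += R(wmul(wmul(g_at(-g, k), g_at(g, i - 1)), transp(k, i - 1)))
    return S

X2, X3 = X(2), X(3)
assert np.allclose(X2 @ X3, X3 @ X2)       # YJM elements commute
evs = np.linalg.eigvalsh(X3)               # each summand is an involution, so X3 is symmetric
assert np.allclose(evs, 2 * np.round(evs / 2))  # spectrum |G| * content = 2c, even integers
```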

Let μn,n^(G)\mu\in\widehat{\mathcal{H}_{n,n}}(G) and consider the irreducible n,n(G)\mathcal{H}_{n,n}(G)-module VμV^{\mu} (the space for the representation μ\mu). Since the branching is simple (recall (5)), the decomposition into irreducible n1,n(G)\mathcal{H}_{n-1,n}(G)-modules is given by

Vμ=𝜆Vλ,V^{\mu}=\underset{\lambda}{\oplus}V^{\lambda},

where the sum is over all λn1,n^(G)\lambda\in\widehat{\mathcal{H}_{n-1,n}}(G) with λμ\lambda\nearrow\mu (i.e., there is an edge from λ\lambda to μ\mu in the branching multi-graph); this decomposition is canonical. Here we note that μ𝒴n(G^)\mu\in\operatorname{\mathcal{Y}_{n}}(\widehat{G}) and λ𝒴n1(G^)×G^\lambda\in\operatorname{\mathcal{Y}}_{n-1}(\widehat{G})\times\widehat{G}. Iterating, we decompose VμV^{\mu} into irreducible 1,n(G)\mathcal{H}_{1,n}(G)-submodules, i.e.,

(6) Vμ=𝑇VT,V^{\mu}=\underset{T}{\oplus}V_{T},

where the sum is over all possible chains T=μ1μ2μnT=\mu_{1}\nearrow\mu_{2}\nearrow\dots\nearrow\mu_{n} with μii,n^(G)\mu_{i}\in\widehat{\mathcal{H}_{i,n}}(G) and μn=μ\mu_{n}=\mu. We call (6) the Gelfand-Tsetlin decomposition of VμV^{\mu} and each VTV_{T} in (6) a Gelfand-Tsetlin subspace of VμV^{\mu}. We note that if 0vTVT, then [i,n(G)]vT=Vμi0\neq v_{T}\in V_{T},\text{ then }\mathbb{C}[\mathcal{H}_{i,n}(G)]v_{T}=V^{\mu_{i}} from the definition of VTV_{T}.

Theorem 2.2 ([17, Theorem 6.5]).

Let μ𝒴n(G^)\mu\in\operatorname{\mathcal{Y}_{n}}(\widehat{G}). Then we may index the Gelfand-Tsetlin subspaces of VμV^{\mu} by standard Young GG-tableaux of shape μ\mu and write the Gelfand-Tsetlin decomposition as

Vμ=TtabG(n,μ)VT,V^{\mu}=\underset{T\in\operatorname{tab}_{G}(n,\mu)}{\oplus}V_{T},

where each VTV_{T} is closed under the action of GnG^{n} and as a GnG^{n}-module, is isomorphic to the irreducible GnG^{n}-module

WrT(1)WrT(2)WrT(n).W^{r_{T}(1)}\otimes W^{r_{T}(2)}\otimes\dots\otimes W^{r_{T}(n)}.

For i=1,,n;i=1,\dots,n; the eigenvalues of Xi(G)X_{i}(G) on VTV_{T} are given by |G|dim(WrT(i))c(bT(i))\frac{|G|}{\operatorname{dim}(W^{r_{T}(i)})}c(b_{T}(i)). Here we recall rT(i)r_{T}(i) and c(bT(i))c(b_{T}(i)) from Definition 2.2, WrT(i)W^{r_{T}(i)} is the irreducible GG-module indexed by rT(i)r_{T}(i), and Xi(G)X_{i}(G) is the iith Young-Jucys-Murphy element of n,n(G)\mathcal{H}_{n,n}(G).

Theorem 2.3 ([17, Theorem 6.7]).

Let μ𝒴n(G^)\mu\in\operatorname{\mathcal{Y}_{n}}(\widehat{G}). Write the elements of G^\widehat{G} as {σ1,,σt}\{\sigma_{1},\dots,\sigma_{t}\} and set μ(i)=μ(σi),mi=|μ(i)|,di=dim(Wσi)\mu^{(i)}=\mu(\sigma_{i}),\;m_{i}=|\mu^{(i)}|,d_{i}=\operatorname{dim}(W^{\sigma_{i}}) for each 1it1\leq i\leq t. Then

dim(Vμ)=(nm1,,mt)fμ(1)fμ(t)d1m1dtmt.\operatorname{dim}(V^{\mu})=\binom{n}{m_{1},\dots,m_{t}}f^{\mu^{(1)}}\cdots f^{\mu^{(t)}}d_{1}^{m_{1}}\cdots d_{t}^{m_{t}}.

Here fμ(i)f^{\mu^{(i)}} denotes the number of standard Young tableaux of shape μ(i)\mu^{(i)}, for each 1it1\leq i\leq t.

Lemma 2.4.

Let GG be a finite group and ρG^\rho\in\widehat{G}. If Wρ(W^{\rho}\;(respectively χρ)\chi^{\rho}) denotes the irreducible GG-module ((respectively character)) indexed by ρ\rho and dρd_{\rho} is the dimension of WρW^{\rho}, then the action of the group algebra element gGg\sum_{g\in G}g on WρW^{\rho} is given by the scalar matrix

gGg=|G|dρχρ,χ𝟙Idρ.\sum_{g\in G}g=\frac{|G|}{d_{\rho}}\langle\chi^{\rho},\chi^{\operatorname{\mathbb{1}}}\rangle I_{d_{\rho}}.

Here IdρI_{d_{\rho}} is the identity matrix of order dρ×dρd_{\rho}\times d_{\rho} and 𝟙\operatorname{\mathbb{1}} denotes the trivial representation of GG.

Proof.

It is clear that gGg\sum_{g\in G}g is in the centre of [G]\mathbb{C}[G]. Therefore by Schur’s lemma ([26, Proposition 4]), we have gGg=cIdρ\sum_{g\in G}g=cI_{d_{\rho}} for some cc\in\mathbb{C}. The value of cc can be obtained by equating the traces of gGg\sum_{g\in G}g and cIdρcI_{d_{\rho}}. ∎
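Lemma 2.4 can be seen concretely in the permutation representation of S3S_{3}, which splits as the trivial representation plus the 22-dimensional standard one; the sketch below (illustrative only) checks that gGg\sum_{g\in G}g acts as the scalar |G|=6|G|=6 on the trivial component and as 00 on the non-trivial one.

```python
import itertools
import numpy as np

perms = list(itertools.permutations(range(3)))
P = lambda p: np.eye(3)[list(p)]         # permutation matrix of p
S = sum(P(p) for p in perms)             # action of sum_g g; equals 2 * all-ones

ones = np.ones(3) / np.sqrt(3)           # spans the trivial component
assert np.allclose(S @ ones, 6 * ones)   # scalar |G| <chi, chi_1> / d = 6
v = np.array([1.0, -1.0, 0.0])           # lies in the standard component
assert np.allclose(S @ v, 0)             # scalar 0 for a non-trivial irreducible
```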

Remark 2.5.

Our focus will be on n,n(Gn)\mathcal{H}_{n,n}(G_{n}), i.e. GnSnG_{n}\wr S_{n}, with the chain of subgroups

1,n(Gn)i,n(Gn)n,n(Gn).\mathcal{H}_{1,n}(G_{n})\subseteq\cdots\subseteq\mathcal{H}_{i,n}(G_{n})\subseteq\cdots\subseteq\mathcal{H}_{n,n}(G_{n}).

For simplicity we write the Young-Jucys-Murphy elements Xi(Gn)X_{i}(G_{n}) of GnSnG_{n}\wr S_{n} (i.e. 𝒢n\operatorname{\mathcal{G}_{n}}) as XiX_{i} for 1in1\leq i\leq n. Thus Theorems 2.2 and 2.3 are applicable to 𝒢n\operatorname{\mathcal{G}_{n}}.

Let t:=|G^n|t:=|\widehat{G}_{n}| and G^n:={σ1,,σt}\widehat{G}_{n}:=\{\sigma_{1},\dots,\sigma_{t}\}, where σ1=𝟙\sigma_{1}=\operatorname{\mathbb{1}} (the trivial representation of GnG_{n}). We write μ(𝒴n(G^n))\mu\left(\in\mathcal{Y}_{n}(\widehat{G}_{n})\right) as the tuple (μ(1),,μ(t))(\mu^{(1)},\dots,\mu^{(t)}), where μ(i):=μ(σi)\mu^{(i)}:=\mu(\sigma_{i}) for each 1it1\leq i\leq t. We also denote mi:=|μ(i)|,Wσi:=m_{i}:=|\mu^{(i)}|,\;W^{\sigma_{i}}:= the irreducible GnG_{n}-module corresponding to σi\sigma_{i} and di=dim(Wσi)d_{i}=\dim(W^{\sigma_{i}}) for each 1it1\leq i\leq t. Thus t,σi,μ(i),mi,Wσit,\;\sigma_{i},\;\mu^{(i)},\;m_{i},\;W^{\sigma_{i}}, and did_{i} depend on GnG_{n} i.e., on nn. To avoid notational complication the dependence of t,σi,μ(i),mi,Wσit,\;\sigma_{i},\;\mu^{(i)},\;m_{i},\;W^{\sigma_{i}}, and did_{i} on nn is suppressed. We note that for TtabGn(n,μ)T\in\operatorname{tab}_{G_{n}}(n,\mu) the dimension of VTV_{T} is d1m1dtmtd_{1}^{m_{1}}\cdots d_{t}^{m_{t}}.

Theorem 2.6.

For each μ=(μ(1),,μ(t))𝒴n(G^n)\mu=(\mu^{(1)},\dots,\mu^{(t)})\in\mathcal{Y}_{n}(\widehat{G}_{n}), let P^(R)|Vμ\widehat{P}(R)\big{|}_{V^{\mu}} denote the restriction of P^(R)\widehat{P}(R) to the irreducible 𝒢n\operatorname{\mathcal{G}_{n}}-module VμV^{\mu}. Then the eigenvalues of P^(R)|Vμ\widehat{P}(R)\big{|}_{V^{\mu}} are given by,

1ndim(WrT(n))(c(bT(n))+χrT(n),χ𝟙), with multiplicity dim(VT)=d1m1dtmt\frac{1}{n\operatorname{dim}(W^{r_{T}(n)})}\left(c(b_{T}(n))+\langle\chi^{r_{T}(n)},\chi^{\operatorname{\mathbb{1}}}\rangle\right),\text{ with multiplicity }\operatorname{dim}(V_{T})=d_{1}^{m_{1}}\cdots d_{t}^{m_{t}}

for each TtabGn(n,μ)T\in\operatorname{tab}_{G_{n}}(n,\mu).

Proof.

We first find the eigenvalues of Xn+gGn(e1,,e1,g;id)X_{n}+\sum_{g\in G_{n}}(\operatorname{e}1,\dots,\operatorname{e}1,g;\operatorname{id}). Let Idim(VT)I_{\operatorname{dim}(V_{T})} denote the identity matrix of order dim(VT)×dim(VT)\operatorname{dim}(V_{T})\times\operatorname{dim}(V_{T}). Then from Theorem 2.2 we have

(7) Vμ=TtabGn(n,μ)VT and Xn|VT=|Gn|dim(WrT(n))c(bT(n))Idim(VT).V^{\mu}=\underset{T\in\operatorname{tab}_{G_{n}}(n,\mu)}{\oplus}V_{T}\quad\text{ and }\quad X_{n}\big{|}_{V_{T}}=\frac{|G_{n}|}{\operatorname{dim}(W^{r_{T}(n)})}c(b_{T}(n))I_{\operatorname{dim}(V_{T})}.

Again from Theorem 2.2 and Lemma 2.4 we have

(8) gGn(e1,,e1,g;id)|VT=|Gn|dim(WrT(n))χrT(n),χ𝟙Idim(VT).\sum_{g\in G_{n}}(\operatorname{e}1,\dots,\operatorname{e}1,g;\operatorname{id})\big{|}_{V_{T}}=\frac{|G_{n}|}{\operatorname{dim}(W^{r_{T}(n)})}\langle\chi^{r_{T}(n)},\chi^{\operatorname{\mathbb{1}}}\rangle I_{\operatorname{dim}(V_{T})}.
We recall P^(R)=1n|Gn|gGn(R((e1,,e1,g;id))+i=1n1R((e1,,e1,g1,e1,,e1,g;(i,n)))).\text{We recall }\widehat{P}(R)=\frac{1}{n|G_{n}|}\sum_{g\in G_{n}}\left(R\left((\operatorname{e}1,\dots,\operatorname{e}1,g;\operatorname{id})\right)+\sum_{i=1}^{n-1}R\left((\operatorname{e}1,\dots,\operatorname{e}1,g^{-1},\operatorname{e}1,\dots,\operatorname{e}1,g;(i,n))\right)\right).

Therefore n|Gn|P^(R)n|G_{n}|\widehat{P}(R) is the action of Xn+gGn(e1,,e1,g;id)X_{n}+\sum_{g\in G_{n}}(\operatorname{e}1,\dots,\operatorname{e}1,g;\operatorname{id}) on [𝒢n]\mathbb{C}[\operatorname{\mathcal{G}_{n}}] by multiplication on the right. Since dim(VT)=d1m1dtmt\operatorname{dim}(V_{T})=d_{1}^{m_{1}}\cdots d_{t}^{m_{t}}, the theorem follows from (7) and (8). ∎
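Theorem 2.6 can be checked numerically in a small case. The sketch below (illustrative, not part of the proof) builds P^(R)\widehat{P}(R) for Gn=2G_{n}=\mathbb{Z}_{2}, n=3n=3, where all di=1d_{i}=1, so the predicted eigenvalues (c(bT(n))+χrT(n),χ𝟙)/n(c(b_{T}(n))+\langle\chi^{r_{T}(n)},\chi^{\operatorname{\mathbb{1}}}\rangle)/n lie in {k/3:2k3}\{k/3:-2\leq k\leq 3\}, with eigenvalue 11 simple; the wreath multiplication convention is the same assumed one as earlier, with 0-indexed slots.

```python
import itertools
import numpy as np

m, n = 2, 3
perms = list(itertools.permutations(range(n)))
elements = [(t, p) for t in itertools.product(range(m), repeat=n) for p in perms]
idx = {x: i for i, x in enumerate(elements)}

def pmul(p, q):
    return tuple(p[q[j]] for j in range(n))
def pinv(p):
    r = [0] * n
    for j, pj in enumerate(p):
        r[pj] = j
    return tuple(r)
def wmul(x, y):
    (a, p), (b, q) = x, y
    pi = pinv(p)
    return (tuple((a[j] + b[pi[j]]) % m for j in range(n)), pmul(p, q))
def R(x):
    M = np.zeros((len(elements),) * 2)
    for g in elements:
        M[idx[g], idx[wmul(g, x)]] = 1.0
    return M
def g_at(g, k):
    t = [0] * n; t[k] = g % m
    return (tuple(t), tuple(range(n)))
def transp(i, j):
    p = list(range(n)); p[i], p[j] = p[j], p[i]
    return ((0,) * n, tuple(p))

P_hat = np.zeros((len(elements),) * 2)
for g in range(m):
    P_hat += R(g_at(g, n - 1))                 # (e,...,e,g; id)
    for i in range(n - 1):                     # (e,..,g^{-1},..,e,g; (i,n))
        P_hat += R(wmul(wmul(g_at(-g, i), g_at(g, n - 1)), transp(i, n - 1)))
P_hat /= n * m

evs = np.sort(np.linalg.eigvalsh(P_hat))       # the measure is symmetric
allowed = {k / 3 for k in range(-2, 4)}
assert all(any(abs(e - a) < 1e-9 for a in allowed) for e in evs)
assert abs(evs[-1] - 1) < 1e-9 and evs[-2] < 1 - 1e-9  # eigenvalue 1 is simple
```

The simplicity of the eigenvalue 11 is consistent with Proposition 1.4: the walk is irreducible and aperiodic.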

Remark 2.7.

In the regular representation of a finite group, each irreducible representation occurs with multiplicity equal to its dimension [26, Section 2.4]. Therefore, Theorems 2.3 and 2.6 provide all the eigenvalues of P^(R)\widehat{P}(R), together with their multiplicities.

3. Order of the mixing time and 2\ell^{2}-cutoff

In this section, using the spectrum of the transition matrix P^(R)\widehat{P}(R), we find the upper bounds of PkU𝒢n2\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2} and PkU𝒢nTV\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}} when knlogn+12nlog(|Gn|1)+Cn,C>0k\geq n\log n+\frac{1}{2}n\log(|G_{n}|-1)+Cn,\;C>0. We also prove Theorems 1.1 and 1.2 in this section. Before proving the main results of this section, we first fix some notation and prove two useful lemmas. For any positive integer N\operatorname{\text{N}}, we write ξN\xi\vdash\operatorname{\text{N}} to denote that ξ\xi is a partition of N\operatorname{\text{N}}. Given a partition ξ\xi of the integer N\operatorname{\text{N}} (here we allow N\operatorname{\text{N}} to be 0), throughout this section ξ1\xi_{1} denotes the largest part of ξ\xi. In particular, if ξ0\xi\vdash 0 then fξ=1f^{\xi}=1 (as there is a unique Young diagram with zero boxes) and we set ξ1=0\xi_{1}=0.

Theorem 3.1 (Plancherel formula, [5, Theorem 4.1]).

Let f1f_{1} and f2f_{2} be two functions on the finite group GG. Then

gGf1(g1)f2(g)=1|G|ρG^dρtrace(f1^(ρ)f2^(ρ)),\sum_{g\in G}f_{1}(g^{-1})f_{2}(g)=\frac{1}{|G|}\sum_{\rho\in\widehat{G}}d_{\rho}\operatorname{trace}\left(\hat{f_{1}}(\rho)\hat{f_{2}}(\rho)\right),

where the sum is over all irreducible representations ρ\rho of GG and dρd_{\rho} is the dimension of ρ\rho.

Recall that UGU_{G} is the uniform distribution on the group GG. Then using Lemma 2.4 we have the following

U^G(ρ)={1 if ρ=𝟙,0 if ρ𝟙, for ρG^.\widehat{U}_{G}(\rho)=\begin{cases}1&\text{ if }\rho=\operatorname{\mathbb{1}},\\ 0&\text{ if }\rho\neq\operatorname{\mathbb{1}},\end{cases}\quad\text{ for }\rho\in\widehat{G}.

Moreover, given any probability measure pp on the finite group GG, we have p^(𝟙)=1\widehat{p}(\operatorname{\mathbb{1}})=1. Therefore setting f1=f2=pkUGf_{1}=f_{2}=p^{*k}-U_{G}, we have the following

(9) p(x)=p(x1) for all xGpkUG22=ρG^{𝟙}dρtrace((p^(ρ))2k).p(x)=p(x^{-1})\text{ for all }x\in G\implies\left|\left|p^{*k}-U_{G}\right|\right|^{2}_{2}=\sum\limits_{\rho\in\widehat{G}\setminus\{\operatorname{\mathbb{1}}\}}d_{\rho}\operatorname{trace}\left(\left(\widehat{p}(\rho)\right)^{2k}\right).

We now state the Diaconis-Shahshahani upper bound lemma. The proof follows from the Cauchy-Schwarz inequality and (9).

Lemma 3.2 ([5, Lemma 4.2]).

Let pp be a probability measure on a finite group GG such that p(x)=p(x1)p(x)=p(x^{-1}) for all xGx\in G. Suppose the random walk on GG driven by pp is irreducible. Then we have the following

pkUGTV214pkUG22=14ρG^{𝟙}dρtrace((p^(ρ))2k),\left|\left|p^{*k}-U_{G}\right|\right|^{2}_{\emph{TV}}\leq\frac{1}{4}\left|\left|p^{*k}-U_{G}\right|\right|^{2}_{2}=\frac{1}{4}\sum\limits_{\rho\in\widehat{G}\setminus\{\operatorname{\mathbb{1}}\}}d_{\rho}\operatorname{trace}\left(\left(\widehat{p}(\rho)\right)^{2k}\right),

where the sum is over all non-trivial irreducible representations ρ\rho of GG and dρd_{\rho} is the dimension of ρ\rho.
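The identity in (9) is easy to verify numerically for an abelian group, where all irreducibles are one-dimensional characters. The sketch below (illustrative only) takes the symmetric measure on 12\mathbb{Z}_{12} supported on {0,1,1}\{0,1,-1\}, for which p^(j)=(1+2cos(2πj/12))/3\widehat{p}(j)=(1+2\cos(2\pi j/12))/3.

```python
import numpy as np

n, k = 12, 5
p = np.zeros(n)
p[[0, 1, n - 1]] = 1 / 3             # symmetric: p(x) = p(-x)

# k-fold convolution p^{*k}
pk = p.copy()
for _ in range(k - 1):
    pk = np.array([sum(pk[(x - y) % n] * p[y] for y in range(n))
                   for x in range(n)])

lhs = n * np.sum((pk - 1 / n) ** 2)  # ||p^{*k} - U||_2^2
p_hat = np.array([(1 + 2 * np.cos(2 * np.pi * j / n)) / 3 for j in range(n)])
rhs = np.sum(p_hat[1:] ** (2 * k))   # sum over non-trivial characters
assert np.isclose(lhs, rhs)          # identity (9)
```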

Definition 3.1.

Let AA be a non-empty set. Then the indicator function of AA is denoted by 𝔫𝔡A\operatorname{\mathfrak{Ind}}_{A} and is defined by

𝔫𝔡A(x)={1 if xA0 if xA.\operatorname{\mathfrak{Ind}}_{A}(x)=\begin{cases}1&\text{ if }x\in A\\ 0&\text{ if }x\notin A.\end{cases}
Lemma 3.3.

Let N\operatorname{\text{N}} be a positive integer and ss be any non-negative real number. Then we have

λN(fλ)2(λ1sN)2k<e2ksNeN2e2kN.\sum_{\lambda\vdash\operatorname{\text{N}}}(f^{\lambda})^{2}\left(\frac{\lambda_{1}-s}{\operatorname{\text{N}}}\right)^{2k}<e^{-\frac{2ks}{\operatorname{\text{N}}}}e^{\operatorname{\text{N}}^{2}e^{-\frac{2k}{\operatorname{\text{N}}}}}.
Proof.

For ζ(Nλ1)\zeta\vdash\left(\operatorname{\text{N}}-\lambda_{1}\right), recall that ζ1\zeta_{1} denotes the largest part of ζ\zeta. Note that ζ1λ1\zeta_{1}\leq\lambda_{1} implies fλ(Nλ1)fζf^{\lambda}\leq\binom{\operatorname{\text{N}}}{\lambda_{1}}f^{\zeta}. Therefore λN(fλ)2(λ1sN)2k\displaystyle\sum_{\lambda\vdash\operatorname{\text{N}}}(f^{\lambda})^{2}\left(\frac{\lambda_{1}-s}{\operatorname{\text{N}}}\right)^{2k} is less than or equal to

λ1=1Nζ(Nλ1)ζ1λ1(Nλ1)2(fζ)2(λ1sN)2k\displaystyle\sum_{\lambda_{1}=1}^{\operatorname{\text{N}}}\sum_{\begin{subarray}{c}\zeta\vdash(\operatorname{\text{N}}-\lambda_{1})\\ \zeta_{1}\leq\lambda_{1}\end{subarray}}\binom{\operatorname{\text{N}}}{\lambda_{1}}^{2}(f^{\zeta})^{2}\left(\frac{\lambda_{1}-s}{\operatorname{\text{N}}}\right)^{2k} λ1=1N(Nλ1)2(λ1sN)2kζ(Nλ1)(fζ)2\displaystyle\leq\sum_{\lambda_{1}=1}^{\operatorname{\text{N}}}\binom{\operatorname{\text{N}}}{\lambda_{1}}^{2}\left(\frac{\lambda_{1}-s}{\operatorname{\text{N}}}\right)^{2k}\sum_{\zeta\vdash(\operatorname{\text{N}}-\lambda_{1})}(f^{\zeta})^{2}
(10) =u=0N1(Nu)2(1u+sN)2ku!.\displaystyle=\sum_{u=0}^{\operatorname{\text{N}}-1}\binom{\operatorname{\text{N}}}{u}^{2}\left(1-\frac{u+s}{\operatorname{\text{N}}}\right)^{2k}u!.

Equality in (10) is obtained by using ζ(Nλ1)(fζ)2=(Nλ1)!\displaystyle\sum_{\zeta\vdash(N-\lambda_{1})}(f^{\zeta})^{2}=(N-\lambda_{1})!, and writing u=Nλ1u=\operatorname{\text{N}}-\lambda_{1}. Using 1xex1-x\leq e^{-x} for all x0x\geq 0 and (Nu)Nuu!\binom{\operatorname{\text{N}}}{u}\leq\frac{\operatorname{\text{N}}^{u}}{u!}, the expression in the right hand side of (10) is less than or equal to

u=0N1N2uu!e2kN(u+s)<e2ksNu=01u!(N2e2kN)u=e2ksNeN2e2kN.\sum_{u=0}^{\operatorname{\text{N}}-1}\frac{\operatorname{\text{N}}^{2u}}{u!}e^{-\frac{2k}{\operatorname{\text{N}}}(u+s)}<e^{-\frac{2ks}{\operatorname{\text{N}}}}\sum_{u=0}^{\infty}\frac{1}{u!}\left(\operatorname{\text{N}}^{2}e^{-\frac{2k}{\operatorname{\text{N}}}}\right)^{u}=e^{-\frac{2ks}{\operatorname{\text{N}}}}e^{\operatorname{\text{N}}^{2}e^{-\frac{2k}{\operatorname{\text{N}}}}}.\qed
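For small N\operatorname{\text{N}} the inequality of Lemma 3.3 can be verified exhaustively. The sketch below (illustrative only) enumerates the partitions of N\operatorname{\text{N}}, computes fλf^{\lambda} by the hook length formula, and checks both the bound and the identity λN(fλ)2=N!\sum_{\lambda\vdash\operatorname{\text{N}}}(f^{\lambda})^{2}=\operatorname{\text{N}}! used in the proof:

```python
from math import exp, factorial

def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def num_syt(lam):
    """f^lambda, the number of standard Young tableaux of shape lam,
    via the hook length formula."""
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    hooks = 1
    for i, row in enumerate(lam):
        for j in range(row):
            hooks *= row + conj[j] - i - j - 1  # arm + leg + 1
    return factorial(sum(lam)) // hooks

N, s, k = 6, 1.0, 8
lhs = sum(num_syt(lam) ** 2 * ((lam[0] - s) / N) ** (2 * k)
          for lam in partitions(N))
rhs = exp(-2 * k * s / N) * exp(N ** 2 * exp(-2 * k / N))
# Lemma 3.3: lhs < rhs
```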

An immediate corollary of Lemma 3.3 follows from the fact

(fλ)2(λ1sN)2k=(NsN)2k, if λ=(N)N.\left(f^{\lambda}\right)^{2}\left(\frac{\lambda_{1}-s}{\operatorname{\text{N}}}\right)^{2k}=\left(\frac{\operatorname{\text{N}}-s}{\operatorname{\text{N}}}\right)^{2k},\text{ if }\lambda=\left(\operatorname{\text{N}}\right)\vdash\operatorname{\text{N}}.
Corollary 3.4.

Following the notation of Lemma 3.3, we have

λNλ(N)(fλ)2(λ1sN)2k<e2ksNeN2e2kN(NsN)2k.\sum_{\begin{subarray}{c}\lambda\vdash\operatorname{\text{N}}\\ \lambda\neq(\operatorname{\text{N}})\end{subarray}}(f^{\lambda})^{2}\left(\frac{\lambda_{1}-s}{\operatorname{\text{N}}}\right)^{2k}<e^{-\frac{2ks}{\operatorname{\text{N}}}}e^{\operatorname{\text{N}}^{2}e^{-\frac{2k}{\operatorname{\text{N}}}}}-\left(\frac{\operatorname{\text{N}}-s}{\operatorname{\text{N}}}\right)^{2k}.
Lemma 3.5.

Let μ=(μ(1),,μ(t))𝒴n(G^n)\mu=(\mu^{(1)},\dots,\mu^{(t)})\in\mathcal{Y}_{n}(\widehat{G}_{n}). Recall that μ1(j)(\mu^{(j)}_{1}\;(respectively μ1(j))\mu^{(j)^{\prime}}_{1}) denotes the largest part of μ(j)(\mu^{(j)}\;(respectively its conjugate)) for 1jt1\leq j\leq t. Then we have

TtabGn(n,μ)(c(bT(n))+χrT(n),χ𝟙ndim(WrT(n)))2k\displaystyle\sum_{T\in\operatorname{tab}_{G_{n}}(n,\mu)}\left(\frac{c(b_{T}(n))+\langle\chi^{r_{T}(n)},\chi^{\operatorname{\mathbb{1}}}\rangle}{n\dim(W^{r_{T}(n)})}\right)^{2k}
<\displaystyle<\;\; (nm1,,mt)fμ(1)fμ(t)j=1t(j2k+j2k)𝔫𝔡(0,)(mj),\displaystyle\binom{n}{m_{1},\dots,m_{t}}f^{\mu^{(1)}}\cdots f^{\mu^{(t)}}\sum_{j=1}^{t}\left(\operatorname{\mathcal{M}}_{j}^{2k}+\operatorname{\mathcal{M}}^{\prime 2k}_{j}\right)\operatorname{\mathfrak{Ind}}_{(0,\infty)}(m_{j}),

where j:=μ1(j)1+χσj,χ𝟙ndj\operatorname{\mathcal{M}}_{j}:=\frac{\mu^{(j)}_{1}-1+\langle\chi^{\sigma_{j}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{j}} and j:=μ1(j)1+χσj,χ𝟙ndj\operatorname{\mathcal{M}}^{\prime}_{j}:=\frac{\mu^{(j)^{\prime}}_{1}-1+\langle\chi^{\sigma_{j}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{j}} for each 1jt1\leq j\leq t.

Proof.

Let 𝒯i={(T1,,Tt)tabGn(n,μ)bT(n) is in Ti} for each 1it\mathcal{T}_{i}=\{(T_{1},\dots,T_{t})\in\operatorname{tab}_{G_{n}}(n,\mu)\mid b_{T}(n)\text{ is in }T_{i}\}\text{ for each }1\leq i\leq t. Then tabGn(n,μ)\operatorname{tab}_{G_{n}}(n,\mu) is the disjoint union of the sets 𝒯1,,𝒯t\mathcal{T}_{1},\dots,\mathcal{T}_{t}. Therefore we have

TtabGn(n,μ)(c(bT(n))+χrT(n),χ𝟙ndim(WrT(n)))2k=i=1tT𝒯i(c(bT(n))+χσi,χ𝟙ndi)2k𝔫𝔡(0,)(mi)\displaystyle\sum_{T\in\operatorname{tab}_{G_{n}}(n,\mu)}\left(\frac{c(b_{T}(n))+\langle\chi^{r_{T}(n)},\chi^{\operatorname{\mathbb{1}}}\rangle}{n\dim(W^{r_{T}(n)})}\right)^{2k}=\displaystyle\sum_{i=1}^{t}\displaystyle\sum_{T\in\mathcal{T}_{i}}\left(\frac{c(b_{T}(n))+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k}\operatorname{\mathfrak{Ind}}_{(0,\infty)}(m_{i})

and this is equal to,

i=1t(n1m1,..,mi1,..,mt)fμ(1)fμ(t)fμ(i)Titab(μ(i))(c(bTi(mi))+χσi,χ𝟙ndi)2k𝔫𝔡(0,)(mi)\displaystyle\sum_{i=1}^{t}\binom{n-1}{m_{1},..,m_{i}-1,..,m_{t}}\frac{f^{\mu^{(1)}}\cdots f^{\mu^{(t)}}}{f^{\mu^{(i)}}}\sum_{T_{i}\in\operatorname{tab}(\mu^{(i)})}\left(\frac{c(b_{T_{i}}(m_{i}))+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k}\operatorname{\mathfrak{Ind}}_{(0,\infty)}(m_{i})
(11) <\displaystyle< i=1t(nm1,,mt)fμ(1)fμ(t)fμ(i)Titab(μ(i))(i2k+i2k)𝔫𝔡(0,)(mi).\displaystyle\;\sum_{i=1}^{t}\binom{n}{m_{1},\dots,m_{t}}\frac{f^{\mu^{(1)}}\cdots f^{\mu^{(t)}}}{f^{\mu^{(i)}}}\sum_{T_{i}\in\operatorname{tab}(\mu^{(i)})}\left(\operatorname{\mathcal{M}}_{i}^{2k}+\operatorname{\mathcal{M}}^{\prime 2k}_{i}\right)\operatorname{\mathfrak{Ind}}_{(0,\infty)}(m_{i}).

The inequality in (11) holds because Titab(μ(i))T_{i}\in\operatorname{tab}(\mu^{(i)}) implies the following:

(c(bTi(mi))+χσi,χ𝟙ndi)2k\displaystyle\left(\frac{c(b_{T_{i}}(m_{i}))+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k}
\displaystyle\leq max{(μ1(i)1+χσi,χ𝟙ndi)2k,(μ1(i)1χσi,χ𝟙ndi)2k}\displaystyle\max\Bigg{\{}\left(\frac{\mu^{(i)}_{1}-1+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k},\left(\frac{\mu^{(i)^{\prime}}_{1}-1-\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k}\Bigg{\}}
\displaystyle\leq max{(μ1(i)1+χσi,χ𝟙ndi)2k,(μ1(i)1+χσi,χ𝟙ndi)2k}, as χσi,χ𝟙=0 or 1\displaystyle\max\Bigg{\{}\left(\frac{\mu^{(i)}_{1}-1+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k},\left(\frac{\mu^{(i)^{\prime}}_{1}-1+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k}\Bigg{\}},\quad\text{ as }\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle=0\text{ or }1
<\displaystyle< (μ1(i)1+χσi,χ𝟙ndi)2k+(μ1(i)1+χσi,χ𝟙ndi)2k=i2k+i2k.\displaystyle\left(\frac{\mu^{(i)}_{1}-1+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k}+\left(\frac{\mu^{(i)^{\prime}}_{1}-1+\langle\chi^{\sigma_{i}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{i}}\right)^{2k}=\operatorname{\mathcal{M}}_{i}^{2k}+\operatorname{\mathcal{M}}_{i}^{\prime 2k}.

Therefore the result follows from (11) and

Titab(μ(i))(i2k+i2k)=fμ(i)(i2k+i2k).\displaystyle\sum_{T_{i}\in\operatorname{tab}(\mu^{(i)})}\left(\operatorname{\mathcal{M}}_{i}^{2k}+\operatorname{\mathcal{M}}^{\prime 2k}_{i}\right)=f^{\mu^{(i)}}\left(\operatorname{\mathcal{M}}_{i}^{2k}+\operatorname{\mathcal{M}}^{\prime 2k}_{i}\right).\qed
Proposition 3.6.

For the warp-transpose top with random shuffle on 𝒢n\operatorname{\mathcal{G}_{n}}, we have

4\displaystyle 4\; PkU𝒢nTV2PkU𝒢n22< 2(en2e2kn1)+e4kn\displaystyle\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{\emph{TV}}\leq\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{2}\;<\;2\left(e^{n^{2}e^{-\frac{2k}{n}}}-1\right)+e^{-\frac{4k}{n}}
+2en2e2kn(en2(|Gn|1)e2kn1)+2(|G^n|1)n2e2knen2e2kn(1n2+en2(|Gn|1)e2kn1),\displaystyle+2e^{n^{2}e^{-\frac{2k}{n}}}\left(e^{n^{2}\left(|G_{n}|-1\right)e^{-\frac{2k}{n}}}-1\right)+2(|\widehat{G}_{n}|-1)n^{2}e^{-\frac{2k}{n}}e^{n^{2}e^{-\frac{2k}{n}}}\left(\frac{1}{n^{2}}+e^{n^{2}\left(|G_{n}|-1\right)e^{-\frac{2k}{n}}}-1\right),

for all kmax{n,nlogn}k\geq\max\{n,\;n\log n\}.

Proof.

Let us recall that G^n={σ1,,σt} and σ1=𝟙\widehat{G}_{n}=\{\sigma_{1},\dots,\sigma_{t}\}\text{ and }\sigma_{1}=\operatorname{\mathbb{1}}, the trivial representation of GnG_{n}. Given μ𝒴n(G^n)\mu\in\mathcal{Y}_{n}(\widehat{G}_{n}), throughout this proof we write μ=(μ(1),,μ(t))\mu=(\mu^{(1)},\dots,\mu^{(t)}), where μ(i)=μ(σi)\mu^{(i)}=\mu(\sigma_{i}), μ(i)mi\mu^{(i)}\vdash m_{i}, and i=1tmi=n\sum_{i=1}^{t}m_{i}=n. Now using Lemma 3.2, we have

(12) 4PkU𝒢nTV2PkU𝒢n22=μ𝒴n(G^n):μ(𝟙)(n)dim(Vμ)trace((P^(R)|Vμ)2k).4\;\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{\text{TV}}\leq\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{2}=\sum_{\mu\in\mathcal{Y}_{n}(\widehat{G}_{n}):\;\mu(\operatorname{\mathbb{1}})\neq(n)}\operatorname{dim}(V^{\mu})\operatorname{trace}\left(\left(\widehat{P}(R)\big{|}_{V^{\mu}}\right)^{2k}\right).

First we partition the set 𝒴n(G^n)\mathcal{Y}_{n}(\widehat{G}_{n}) into two disjoint subsets 𝒜1 and 𝒜2\mathcal{A}_{1}\text{ and }\mathcal{A}_{2} as follows:

𝒜1=1iti, where i\displaystyle\mathcal{A}_{1}=\underset{1\leq i\leq t}{\cup}\mathcal{B}_{i},\text{ where }\mathcal{B}_{i} ={μ𝒴n(G^n)mi=n,mk=0 for all k[t]{i}}\displaystyle=\{\mu\in\mathcal{Y}_{n}(\widehat{G}_{n})\mid m_{i}=n,\;m_{k}=0\text{ for all }k\in[t]\setminus\{i\}\}
𝒜2\displaystyle\mathcal{A}_{2} ={μ𝒴n(G^n)k=1tmk=n, 0mkn1}.\displaystyle=\{\mu\in\mathcal{Y}_{n}(\widehat{G}_{n})\mid\sum_{k=1}^{t}m_{k}=n,\;0\leq m_{k}\leq n-1\}.

It can be easily seen that the i\mathcal{B}_{i}’s are disjoint. Therefore, by using Theorem 2.6, Remark 2.7, and σ1=𝟙\sigma_{1}=\operatorname{\mathbb{1}}, the inequality (12) becomes

4PkU𝒢nTV2PkU𝒢n22=\displaystyle 4\;\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}}^{2}\leq\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{2}= μ1μ(𝟙)(n)dim(Vμ)TtabGn(n,μ)(c(bT(n))+1nd1)2kd1n\displaystyle\sum_{\begin{subarray}{c}\mu\in\mathcal{B}_{1}\\ \mu(\operatorname{\mathbb{1}})\neq(n)\end{subarray}}\operatorname{dim}(V^{\mu})\sum_{T\in\operatorname{tab}_{G_{n}}(n,\mu)}\left(\frac{c(b_{T}(n))+1}{nd_{1}}\right)^{2k}d_{1}^{n}
(13) +i=2tμidim(Vμ)TtabGn(n,μ)(c(bT(n))ndi)2kdin\displaystyle+\sum_{i=2}^{t}\sum_{\mu\in\mathcal{B}_{i}}\operatorname{dim}(V^{\mu})\sum_{T\in\operatorname{tab}_{G_{n}}(n,\mu)}\left(\frac{c(b_{T}(n))}{nd_{i}}\right)^{2k}d_{i}^{n}
+μ𝒜2dim(Vμ)TtabGn(n,μ)(c(bT(n))+χrT(n),χ𝟙ndim(WrT(n)))2kd1m1dtmt.\displaystyle+\sum_{\mu\in\mathcal{A}_{2}}\operatorname{dim}(V^{\mu})\sum_{T\in\operatorname{tab}_{G_{n}}(n,\mu)}\left(\frac{c(b_{T}(n))+\langle\chi^{r_{T}(n)},\chi^{\operatorname{\mathbb{1}}}\rangle}{n\operatorname{dim}(W^{r_{T}(n)})}\right)^{2k}d_{1}^{m_{1}}\cdots d_{t}^{m_{t}}.

The sum of the first two terms in the right hand side of (13) is equal to

λnλ1nfλd1nTtab(λ)(c(bT(n))+1nd1)2kd1n+i=2tλnfλdinTtab(λ)(c(bT(n))ndi)2kdin\displaystyle\sum_{\begin{subarray}{c}\lambda\vdash n\\ \lambda_{1}\neq n\end{subarray}}f^{\lambda}d_{1}^{n}\sum_{T\in\operatorname{tab}(\lambda)}\left(\frac{c(b_{T}(n))+1}{nd_{1}}\right)^{2k}d_{1}^{n}+\sum_{i=2}^{t}\sum_{\lambda\vdash n}f^{\lambda}d_{i}^{n}\sum_{T\in\operatorname{tab}(\lambda)}\left(\frac{c(b_{T}(n))}{nd_{i}}\right)^{2k}d_{i}^{n}
(14) =\displaystyle= d12nd12kλnλ(n),(1n)fλTtab(λ)(c(bT(n))+1n)2k+(n2n)2k+i=2tdi2ndi2kλnfλTtab(λ)(c(bT(n))n)2k.\displaystyle\;\frac{d_{1}^{2n}}{d_{1}^{2k}}\sum_{\begin{subarray}{c}\lambda\vdash n\\ \lambda\neq(n),(1^{n})\end{subarray}}f^{\lambda}\sum_{T\in\operatorname{tab}(\lambda)}\left(\frac{c(b_{T}(n))+1}{n}\right)^{2k}+\left(\frac{n-2}{n}\right)^{2k}+\sum_{i=2}^{t}\frac{d_{i}^{2n}}{d_{i}^{2k}}\sum_{\lambda\vdash n}f^{\lambda}\sum_{T\in\operatorname{tab}(\lambda)}\left(\frac{c(b_{T}(n))}{n}\right)^{2k}.

Now recalling that λ1(\lambda_{1}\;(respectively λ1)\lambda^{\prime}_{1}) is the largest part of λ(\lambda\;(respectively its conjugate)), we have the following:

(c(bT(n))+xn)2k\displaystyle\left(\frac{c(b_{T}(n))+x}{n}\right)^{2k}\leq max{(λ11+xn)2k,(λ11xn)2k},\displaystyle\;\max\Bigg{\{}\left(\frac{\lambda_{1}-1+x}{n}\right)^{2k},\left(\frac{\lambda^{\prime}_{1}-1-x}{n}\right)^{2k}\Bigg{\}},
<\displaystyle< (λ11+xn)2k+(λ11+xn)2k, for Ttab(λ) and x0.\displaystyle\;\left(\frac{\lambda_{1}-1+x}{n}\right)^{2k}+\left(\frac{\lambda^{\prime}_{1}-1+x}{n}\right)^{2k},\;\;\text{ for }T\in\operatorname{tab}(\lambda)\text{ and }x\geq 0.

This implies

λnλ(n),(1n)fλTtab(λ)(c(bT(n))+1n)2k\displaystyle\sum_{\begin{subarray}{c}\lambda\vdash n\\ \lambda\neq(n),(1^{n})\end{subarray}}f^{\lambda}\sum_{T\in\operatorname{tab}(\lambda)}\left(\frac{c(b_{T}(n))+1}{n}\right)^{2k} <λnλ(n),(1n)(fλ)2((λ1n)2k+(λ1n)2k)\displaystyle<\sum_{\begin{subarray}{c}\lambda\vdash n\\ \lambda\neq(n),(1^{n})\end{subarray}}\left(f^{\lambda}\right)^{2}\left(\left(\frac{\lambda_{1}}{n}\right)^{2k}+\left(\frac{\lambda^{\prime}_{1}}{n}\right)^{2k}\right)
= 2λnλ(n),(1n)(fλ)2(λ1n)2k, using fλ=fλ\displaystyle=\;2\sum_{\begin{subarray}{c}\lambda\vdash n\\ \lambda\neq(n),(1^{n})\end{subarray}}\left(f^{\lambda}\right)^{2}\left(\frac{\lambda_{1}}{n}\right)^{2k},\quad\text{ using }f^{\lambda}=f^{\lambda^{\prime}}
and λnfλTtab(λ)(c(bT(n))n)2k\displaystyle\text{ and }\quad\sum_{\lambda\vdash n}f^{\lambda}\sum_{T\in\operatorname{tab}(\lambda)}\left(\frac{c(b_{T}(n))}{n}\right)^{2k} <λn(fλ)2((λ11n)2k+(λ11n)2k)\displaystyle<\sum_{\lambda\vdash n}\left(f^{\lambda}\right)^{2}\left(\left(\frac{\lambda_{1}-1}{n}\right)^{2k}+\left(\frac{\lambda^{\prime}_{1}-1}{n}\right)^{2k}\right)
= 2λn(fλ)2(λ11n)2k, using fλ=fλ.\displaystyle=\;2\sum_{\lambda\vdash n}\left(f^{\lambda}\right)^{2}\left(\frac{\lambda_{1}-1}{n}\right)^{2k},\quad\text{ using }f^{\lambda}=f^{\lambda^{\prime}}.

Thus using 1xex1-x\leq e^{-x} for x0x\geq 0, knk\geq n, and di1d_{i}\geq 1 for all 1it1\leq i\leq t, the expression in (14) is bounded above by

 2λnλ(n)(fλ)2(λ1n)2k+(12n)2k+2i=2tλn(fλ)2(λ11n)2k\displaystyle\;2\sum_{\begin{subarray}{c}\lambda\vdash n\\ \lambda\neq(n)\end{subarray}}(f^{\lambda})^{2}\left(\frac{\lambda_{1}}{n}\right)^{2k}+\left(1-\frac{2}{n}\right)^{2k}+2\sum_{i=2}^{t}\sum_{\lambda\vdash n}(f^{\lambda})^{2}\left(\frac{\lambda_{1}-1}{n}\right)^{2k}
(15) <\displaystyle<  2(en2e2kn1)+e4kn+2(t1)e2knen2e2kn.\displaystyle\;2\left(e^{n^{2}e^{-\frac{2k}{n}}}-1\right)+e^{-\frac{4k}{n}}+2(t-1)e^{-\frac{2k}{n}}e^{n^{2}e^{-\frac{2k}{n}}}.

The inequality in (15) follows from Corollary 3.4 and Lemma 3.3. Now recalling j:=μ1(j)1+χσj,χ𝟙ndj,j:=μ1(j)1+χσj,χ𝟙ndj\operatorname{\mathcal{M}}_{j}:=\frac{\mu^{(j)}_{1}-1+\langle\chi^{\sigma_{j}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{j}},\;\operatorname{\mathcal{M}}^{\prime}_{j}:=\frac{\mu^{(j)^{\prime}}_{1}-1+\langle\chi^{\sigma_{j}},\chi^{\operatorname{\mathbb{1}}}\rangle}{nd_{j}}, and using Lemma 3.5, the third term in the right hand side of (13) is less than

(16) μ𝒜2(nm1,,mt)2(fμ(1))2(fμ(t))2d12m1dt2mtj=1t(j2k+j2k)𝔫𝔡(0,)(mj).\displaystyle\sum_{\mu\in\mathcal{A}_{2}}\binom{n}{m_{1},\dots,m_{t}}^{2}(f^{\mu^{(1)}})^{2}\cdots(f^{\mu^{(t)}})^{2}d_{1}^{2m_{1}}\dots d_{t}^{2m_{t}}\sum_{j=1}^{t}\left(\operatorname{\mathcal{M}}_{j}^{2k}+\operatorname{\mathcal{M}}^{\prime 2k}_{j}\right)\operatorname{\mathfrak{Ind}}_{(0,\infty)}(m_{j}).

We now deal with (16) by considering two separate cases, namely j=1j=1 and 1<jt1<j\leq t. Now using

μ(1)m1(fμ(1))2(μ1(1)nd1)2k=μ(1)m1(fμ(1))2(μ1(1)nd1)2k,\sum_{\mu^{(1)}\vdash m_{1}}\left(f^{\mu^{(1)}}\right)^{2}\left(\frac{\mu^{(1)^{\prime}}_{1}}{nd_{1}}\right)^{2k}=\sum_{\mu^{(1)}\vdash m_{1}}\left(f^{\mu^{(1)}}\right)^{2}\left(\frac{\mu^{(1)}_{1}}{nd_{1}}\right)^{2k},

the partial sum corresponding to j=1j=1 in (16) is equal to,

(17) m1=1n1(m2,,mt)mk=nm10mkn12μ(i)mi1it(nm1)2(nm1m2,,mt)2(fμ(1))2(fμ(t))2d12m1dt2mt(μ1(1)nd1)2k\sum_{m_{1}=1}^{n-1}\sum_{\begin{subarray}{c}(m_{2},\dots,m_{t})\\ \sum m_{k}=n-m_{1}\\ \\ 0\leq m_{k}\leq n-1\end{subarray}}2\sum_{\begin{subarray}{c}\mu^{(i)}\vdash m_{i}\\ 1\leq i\leq t\end{subarray}}\binom{n}{m_{1}}^{2}\binom{n-m_{1}}{m_{2},\dots,m_{t}}^{2}(f^{\mu^{(1)}})^{2}\cdots(f^{\mu^{(t)}})^{2}d_{1}^{2m_{1}}\dots d_{t}^{2m_{t}}\left(\frac{\mu^{(1)}_{1}}{nd_{1}}\right)^{2k}

Using μ(i)mi(fμ(i))2=mi!\displaystyle\sum_{\mu^{(i)}\vdash m_{i}}\left(f^{\mu^{(i)}}\right)^{2}=m_{i}! for 2it,(nm1m2,,mt)=(nm1)!m2!mt!2\leq i\leq t,\;\binom{n-m_{1}}{m_{2},\dots,m_{t}}=\frac{(n-m_{1})!}{m_{2}!\cdots m_{t}!}, and the multinomial theorem

(m2,,mt)mk=nm10mkn1(nm1m2,,mt)(d22)m2(dt2)mt=(d22++dt2)nm1,\sum_{\begin{subarray}{c}(m_{2},\dots,m_{t})\\ \sum m_{k}=n-m_{1}\\ \\ 0\leq m_{k}\leq n-1\end{subarray}}\binom{n-m_{1}}{m_{2},\dots,m_{t}}(d_{2}^{2})^{m_{2}}\dots(d_{t}^{2})^{m_{t}}=(d_{2}^{2}+\cdots+d_{t}^{2})^{n-m_{1}},

the expression in (17) can be written as

2m1=1n1(d22++dt2)nm1(nm1)2(nm1)!(1d1)2k2m1(m1n)2kμ(1)m1(fμ(1))2(μ1(1)m1)2k\displaystyle 2\sum_{m_{1}=1}^{n-1}(d_{2}^{2}+\cdots+d_{t}^{2})^{n-m_{1}}\binom{n}{m_{1}}^{2}(n-m_{1})!\left(\frac{1}{d_{1}}\right)^{2k-2m_{1}}\left(\frac{m_{1}}{n}\right)^{2k}\sum_{\mu^{(1)}\vdash m_{1}}(f^{\mu^{(1)}})^{2}\left(\frac{\mu^{(1)}_{1}}{m_{1}}\right)^{2k}
(18) <\displaystyle<  2m1=1n1(d22++dt2)nm1(nm1)2(nm1)!(1d1)2k2m1(m1n)2kem12e2km1.\displaystyle\;2\sum_{m_{1}=1}^{n-1}(d_{2}^{2}+\cdots+d_{t}^{2})^{n-m_{1}}\binom{n}{m_{1}}^{2}(n-m_{1})!\left(\frac{1}{d_{1}}\right)^{2k-2m_{1}}\left(\frac{m_{1}}{n}\right)^{2k}e^{m_{1}^{2}e^{-\frac{2k}{m_{1}}}}.

The inequality in (18) follows from Lemma 3.3. As nm1n\geq m_{1}, we have

km1logm1+m1nkm1lognm12e2km1n2e2kn.k\geq m_{1}\log m_{1}+\frac{m_{1}}{n}k-m_{1}\log n\implies m_{1}^{2}e^{-\frac{2k}{m_{1}}}\leq n^{2}e^{-\frac{2k}{n}}.

Thus, writing uu for nm1n-m_{1}, the expression in (18) is less than or equal to

(19) 2en2e2knu=1n1(d22++dt2d12)u(1d1)2k2n(nu)2u!(1un)2k2e^{n^{2}e^{-\frac{2k}{n}}}\;\sum_{u=1}^{n-1}\left(\frac{d_{2}^{2}+\cdots+d_{t}^{2}}{d_{1}^{2}}\right)^{u}\left(\frac{1}{d_{1}}\right)^{2k-2n}\binom{n}{u}^{2}u!\left(1-\frac{u}{n}\right)^{2k}

Now using 1xex1-x\leq e^{-x} for all x0x\geq 0 and d1=1d_{1}=1, the expression in (19) is less than or equal to

(20) 2en2e2knu=1n11u!(n2(|Gn|d121)e2kn)u<2en2e2kn(e(n2(|Gn|d121)e2kn)1).2e^{n^{2}e^{-\frac{2k}{n}}}\;\sum_{u=1}^{n-1}\frac{1}{u!}\left(n^{2}\left(\frac{|G_{n}|}{d_{1}^{2}}-1\right)e^{-\frac{2k}{n}}\right)^{u}\;<2e^{n^{2}e^{-\frac{2k}{n}}}\left(e^{\left(n^{2}\left(\frac{|G_{n}|}{d_{1}^{2}}-1\right)e^{-\frac{2k}{n}}\right)}-1\right).

Now using the notation m1,..,mj^,..,mtm_{1},..,\widehat{m_{j}},..,m_{t} to denote m1,,mj1,mj+1,,mtm_{1},\dots,m_{j-1},m_{j+1},\dots,m_{t}, and

μ(j)mj(fμ(j))2(μ1(j)ndj)2k=μ(j)mj(fμ(j))2(μ1(j)ndj)2k,\sum_{\mu^{(j)}\vdash m_{j}}\left(f^{\mu^{(j)}}\right)^{2}\left(\frac{\mu^{(j)^{\prime}}_{1}}{nd_{j}}\right)^{2k}=\sum_{\mu^{(j)}\vdash m_{j}}\left(f^{\mu^{(j)}}\right)^{2}\left(\frac{\mu^{(j)}_{1}}{nd_{j}}\right)^{2k},

the partial sum corresponding to 1<jt1<j\leq t in (16) turns out to be

(21) mj=1n1(m1,,mj^,,mt)mk=nmj0mkn12μ(i)mi1it(nmj)2(nmjm1,,mj^,,mt)2(fμ(1))2(fμ(t))2d12m1dt2mtζ2k,\sum_{m_{j}=1}^{n-1}\sum_{\begin{subarray}{c}(m_{1},\dots,\widehat{m_{j}},\dots,m_{t})\\ \sum m_{k}=n-m_{j}\\ \\ 0\leq m_{k}\leq n-1\end{subarray}}2\sum_{\begin{subarray}{c}\mu^{(i)}\vdash m_{i}\\ 1\leq i\leq t\end{subarray}}\binom{n}{m_{j}}^{2}\binom{n-m_{j}}{m_{1},\dots,\widehat{m_{j}},\dots,m_{t}}^{2}(f^{\mu^{(1)}})^{2}\cdots(f^{\mu^{(t)}})^{2}d_{1}^{2m_{1}}\dots d_{t}^{2m_{t}}\zeta^{2k},

where ζ=μ1(j)1ndj\zeta=\frac{\mu^{(j)}_{1}-1}{nd_{j}}. Using μ(i)mi(fμ(i))2=mi!\displaystyle\sum_{\mu^{(i)}\vdash m_{i}}\left(f^{\mu^{(i)}}\right)^{2}=m_{i}! for i{1,,t}{j},(nmjm1,,mj^,mt)=(nmj)!kjmk!i\in\{1,\dots,t\}\setminus\{j\},\;\binom{n-m_{j}}{m_{1},\dots,\widehat{m_{j}}\dots,m_{t}}=\frac{(n-m_{j})!}{\displaystyle\prod_{k\neq j}m_{k}!}, and the multinomial theorem

(m1,,mj^,,mt)kjmk=nmj0mkn1(nmjm1,,mj^,,mt)(d12)m1(dj12)mj1(dj+12)mj+1(dt2)mt=(kjdk2)nmj,\sum_{\begin{subarray}{c}(m_{1},\dots,\widehat{m_{j}},\dots,m_{t})\\ \sum_{k\neq j}m_{k}=n-m_{j}\\ \\ 0\leq m_{k}\leq n-1\end{subarray}}\binom{n-m_{j}}{m_{1},\dots,\widehat{m_{j}},\dots,m_{t}}(d_{1}^{2})^{m_{1}}\dots(d_{j-1}^{2})^{m_{j-1}}(d_{j+1}^{2})^{m_{j+1}}\dots(d_{t}^{2})^{m_{t}}=\left(\displaystyle\sum_{k\neq j}d_{k}^{2}\right)^{n-m_{j}},

the expression given in (21) is equal to the following

2mj=1n1(d12++dt2dj2)nmj(nmj)2(nmj)!(1dj)2k2mj(mjn)2kμ(j)mj(fμ(j))2(μ1(j)1mj)2k2\sum_{m_{j}=1}^{n-1}(d_{1}^{2}+\cdots+d_{t}^{2}-d_{j}^{2})^{n-m_{j}}\binom{n}{m_{j}}^{2}(n-m_{j})!\left(\frac{1}{d_{j}}\right)^{2k-2m_{j}}\left(\frac{m_{j}}{n}\right)^{2k}\sum_{\mu^{(j)}\vdash m_{j}}(f^{\mu^{(j)}})^{2}\left(\frac{\mu^{(j)}_{1}-1}{m_{j}}\right)^{2k}
(22) < 2mj=1n1(d12++dt2dj2)nmj(nmj)2(nmj)!(1dj)2k2mj(mjn)2ke2kmjemj2e2kmj.<\;2\sum_{m_{j}=1}^{n-1}(d_{1}^{2}+\cdots+d_{t}^{2}-d_{j}^{2})^{n-m_{j}}\binom{n}{m_{j}}^{2}(n-m_{j})!\left(\frac{1}{d_{j}}\right)^{2k-2m_{j}}\left(\frac{m_{j}}{n}\right)^{2k}e^{-\frac{2k}{m_{j}}}e^{m_{j}^{2}e^{-\frac{2k}{m_{j}}}}.

The inequality in (22) follows from Lemma 3.3. As nmjn\geq m_{j}, we have

kmjlogmj+mjnkmjlognmj2e2kmjn2e2kn.k\geq m_{j}\log m_{j}+\frac{m_{j}}{n}k-m_{j}\log n\implies m_{j}^{2}e^{-\frac{2k}{m_{j}}}\leq n^{2}e^{-\frac{2k}{n}}.

Thus, writing vv for nmjn-m_{j} and using 1mj1\frac{1}{m_{j}}\leq 1, the expression in (22) is less than or equal to

(23) 2n2e2knen2e2knv=1n1(d12++dt2dj2dj2)v(1dj)2k2n(nv)2v!(1vn)2k2n^{2}e^{-\frac{2k}{n}}e^{n^{2}e^{-\frac{2k}{n}}}\;\sum_{v=1}^{n-1}\left(\frac{d_{1}^{2}+\cdots+d_{t}^{2}-d_{j}^{2}}{d_{j}^{2}}\right)^{v}\left(\frac{1}{d_{j}}\right)^{2k-2n}\binom{n}{v}^{2}v!\left(1-\frac{v}{n}\right)^{2k}

Now using 1xex1-x\leq e^{-x} for all x0x\geq 0 and dj2k2n1d_{j}^{2k-2n}\geq 1 for all j{1,,t}j\in\{1,\dots,t\}, the expression in (23) is less than or equal to

2n2e2knen2e2knv=1n11v!(n2(|Gn|dj21)e2kn)v\displaystyle 2n^{2}e^{-\frac{2k}{n}}e^{n^{2}e^{-\frac{2k}{n}}}\;\sum_{v=1}^{n-1}\frac{1}{v!}\left(n^{2}\left(\frac{|G_{n}|}{d_{j}^{2}}-1\right)e^{-\frac{2k}{n}}\right)^{v}
(24) <\displaystyle<  2n2e2knen2e2kn(e(n2(|Gn|dj21)e2kn)1).\displaystyle\;2n^{2}e^{-\frac{2k}{n}}e^{n^{2}e^{-\frac{2k}{n}}}\left(e^{\left(n^{2}\left(\frac{|G_{n}|}{d_{j}^{2}}-1\right)e^{-\frac{2k}{n}}\right)}-1\right).

Therefore the proposition follows from (13), (15), (20), (24) and 1dj1\frac{1}{d_{j}}\leq 1 for all 1jt1\leq j\leq t. ∎
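As a numerical sanity check on Proposition 3.6 (illustrative only, not part of the proof), the helper below evaluates the right hand side of the stated bound. It hedges |G^n||\widehat{G}_{n}| by |Gn||G_{n}|, which is always an upper bound since a finite group has at most |G||G| irreducible representations (with equality exactly in the abelian case):

```python
from math import ceil, exp, log

def l2_upper_bound(n, order_G, k, num_irreps=None):
    """Right hand side of Proposition 3.6.

    num_irreps stands for the number of irreducible representations
    of G_n; it defaults to the worst case |G_n|.
    """
    t = order_G if num_irreps is None else num_irreps
    a = n * n * exp(-2 * k / n)                  # n^2 e^{-2k/n}
    b = n * n * (order_G - 1) * exp(-2 * k / n)  # n^2 (|G_n|-1) e^{-2k/n}
    return (2 * (exp(a) - 1) + exp(-4 * k / n)
            + 2 * exp(a) * (exp(b) - 1)
            + 2 * (t - 1) * a * exp(a) * (1 / n ** 2 + exp(b) - 1))

# sample values: n = 50, |G_n| = 2, k slightly past the cutoff time
n, order_G = 50, 2
k = ceil(n * log(n) + 0.5 * n * log(order_G - 1) + 2 * n)
value = l2_upper_bound(n, order_G, k)
```

For these sample values the bound is already below 1 and decreases further as kk grows, consistent with the statement.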

Theorem 3.7.

For the random walk on 𝒢n\operatorname{\mathcal{G}_{n}} driven by PP we have the following:

(1)

    Let C>0C>0. If knlogn+12nlog(|Gn|1)+Cnk\geq n\log n+\frac{1}{2}n\log(|G_{n}|-1)+Cn, then

    PkU𝒢nTV12PkU𝒢n2<2(e2C+1)eC+o(1).\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\emph{TV}}\leq\frac{1}{2}\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}<\sqrt{2}\left(e^{-2C}+1\right)e^{-C}+o(1).
(2)

    For any ϵ(0,1)\epsilon\in(0,1), if we set kn=(1+ϵ)(nlogn+12nlog(|Gn|1))k_{n}=\big{\lfloor}(1+\epsilon)\left(n\log n+\frac{1}{2}n\log\left(|G_{n}|-1\right)\right)\big{\rfloor}, then

    limnPknU𝒢n2=0.\lim_{n\rightarrow\infty}\left|\left|P^{*k_{n}}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}=0.
Proof.

Using knlogn+12nlog(|Gn|1)+Cnk\geq n\log n+\frac{1}{2}n\log(|G_{n}|-1)+Cn and Proposition 3.6 we have the following:

4PkU𝒢nTV2\displaystyle 4\;\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}}^{2} PkU𝒢n22\displaystyle\leq\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{2}
(25) <2(ee2C|Gn|11)+e4Cn4(|Gn|1)2+2ee2C|Gn|1(ee2C1)\displaystyle<2\left(e^{\frac{e^{-2C}}{|G_{n}|-1}}-1\right)+\frac{e^{-4C}}{n^{4}\left(|G_{n}|-1\right)^{2}}+2e^{\frac{e^{-2C}}{|G_{n}|-1}}\left(e^{e^{-2C}}-1\right)
+2(|G^n|1)×e2C|Gn|1×ee2C|Gn|1(1n2+ee2C1).\displaystyle\hskip 28.45274pt+2(|\widehat{G}_{n}|-1)\times\frac{e^{-2C}}{|G_{n}|-1}\times e^{\frac{e^{-2C}}{|G_{n}|-1}}\left(\frac{1}{n^{2}}+e^{e^{-2C}}-1\right).

The sequence {Gn}1\{G_{n}\}_{1}^{\infty} consists of non-trivial finite groups, thus 1|Gn|11\frac{1}{|G_{n}|-1}\leq 1. Also, |G^n|1|Gn|11\frac{|\widehat{G}_{n}|-1}{|G_{n}|-1}\leq 1. Therefore the expression in the right hand side of (25) is less than

(2+2ee2C+2e2Cee2C)(ee2C1)+e4Cn4+2e2Cee2Cn2\left(2+2e^{e^{-2C}}+2e^{-2C}e^{e^{-2C}}\right)\left(e^{e^{-2C}}-1\right)+\frac{e^{-4C}}{n^{4}}+\frac{2e^{-2C}e^{e^{-2C}}}{n^{2}}

Now using ex1<2xe^{x}-1<2x for 0<x10<x\leq 1 we have

4PkU𝒢nTV2PkU𝒢n22\displaystyle 4\;\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}}^{2}\leq\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{2} <(4+6e2C+4e4C)2e2C+o(1)\displaystyle<\left(4+6e^{-2C}+4e^{-4C}\right)2e^{-2C}+o(1)
PkU𝒢nTV12PkU𝒢n2\displaystyle\implies\;\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}}\leq\frac{1}{2}\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2} <(2+3e2C+2e4C)eC+o(1)\displaystyle<\left(\sqrt{2+3e^{-2C}+2e^{-4C}}\right)e^{-C}+o(1)
<2(e2C+1)eC+o(1).\displaystyle<\sqrt{2}\left(e^{-2C}+1\right)e^{-C}+o(1).

This proves the first part of the theorem.

For any ϵ(0,1)\epsilon\in(0,1), setting kn=(1+ϵ)(nlogn+12nlog(|Gn|1))k_{n}=\left\lfloor(1+\epsilon)\left(n\log n+\frac{1}{2}n\log\left(|G_{n}|-1\right)\right)\right\rfloor, we have

kn+1(1+ϵ)(nlogn+12nlog(|Gn|1))eknne1nn1+ϵ(|Gn|1)1+ϵ2.k_{n}+1\geq(1+\epsilon)\left(n\log n+\frac{1}{2}n\log\left(|G_{n}|-1\right)\right)\implies e^{-\frac{k_{n}}{n}}\leq\frac{e^{\frac{1}{n}}}{n^{1+\epsilon}(|G_{n}|-1)^{\frac{1+\epsilon}{2}}}.

Therefore Proposition 3.6 implies

0\displaystyle 0 4PknU𝒢nTV2PknU𝒢n22\displaystyle\leq 4\;\left|\left|P^{*k_{n}}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}}^{2}\leq\left|\left|P^{*k_{n}}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|^{2}_{2}
(26) <2(ee2nn2ϵ(|Gn|1)1+ϵ1)+e4nn4+4ϵ(|Gn|1)2+2ϵ+2ee2nn2ϵ(|Gn|1)1+ϵ(ee2nn2ϵ(|Gn|1)ϵ1)\displaystyle<2\left(e^{\frac{e^{\frac{2}{n}}}{n^{2\epsilon}(|G_{n}|-1)^{1+\epsilon}}}-1\right)+\frac{e^{\frac{4}{n}}}{n^{4+4\epsilon}\left(|G_{n}|-1\right)^{2+2\epsilon}}+2e^{\frac{e^{\frac{2}{n}}}{n^{2\epsilon}(|G_{n}|-1)^{1+\epsilon}}}\left(e^{\frac{e^{\frac{2}{n}}}{n^{2\epsilon}(|G_{n}|-1)^{\epsilon}}}-1\right)
+2(|G^n|1)×e2nn2ϵ(|Gn|1)1+ϵ×ee2nn2ϵ(|Gn|1)1+ϵ(1n2+ee2nn2ϵ(|Gn|1)ϵ1)\displaystyle\hskip 49.79231pt+2(|\widehat{G}_{n}|-1)\times\frac{e^{\frac{2}{n}}}{n^{2\epsilon}(|G_{n}|-1)^{1+\epsilon}}\times e^{\frac{e^{\frac{2}{n}}}{n^{2\epsilon}(|G_{n}|-1)^{1+\epsilon}}}\left(\frac{1}{n^{2}}+e^{\frac{e^{\frac{2}{n}}}{n^{2\epsilon}(|G_{n}|-1)^{\epsilon}}}-1\right)

The right hand side of (26) converges to zero as nn\rightarrow\infty because 1|Gn|1,|G^n|1|Gn|11\frac{1}{|G_{n}|-1},\frac{|\widehat{G}_{n}|-1}{|G_{n}|-1}\leq 1. Hence the second part follows. ∎
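The decay of the right hand side of (26) can also be watched numerically. The sketch below (illustrative only) bounds |G^n|1|\widehat{G}_{n}|-1 by |Gn|1|G_{n}|-1, as in the text, and evaluates the expression along a sequence of growing nn:

```python
from math import exp

def rhs_26(n, order_G, eps):
    """Right hand side of (26), with |hat G_n| - 1 bounded by |G_n| - 1."""
    x = exp(2 / n) / (n ** (2 * eps) * (order_G - 1) ** (1 + eps))
    y = exp(2 / n) / (n ** (2 * eps) * (order_G - 1) ** eps)
    return (2 * (exp(x) - 1)
            + exp(4 / n) / (n ** (4 + 4 * eps) * (order_G - 1) ** (2 + 2 * eps))
            + 2 * exp(x) * (exp(y) - 1)
            + 2 * (order_G - 1) * x * exp(x) * (1 / n ** 2 + exp(y) - 1))

# sample choice |G_n| = 3, epsilon = 1/2
values = [rhs_26(n, 3, 0.5) for n in (10, 100, 1000)]
```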

Proof of Theorem 1.1.

Let ε>0\varepsilon>0 and τmix(n)(ε)\tau_{\text{mix}}^{(n)}(\varepsilon) (respectively tmix(n)(ε)t_{\text{mix}}^{(n)}(\varepsilon)) be the 2\ell^{2}-mixing time (respectively total variation mixing time) with tolerance level ε\varepsilon for the warp-transpose top with random shuffle on 𝒢n\operatorname{\mathcal{G}_{n}}. We choose Cε>0C_{\varepsilon}>0 such that 2(e2Cε+1)eCε<ε4\sqrt{2}\left(e^{-2C_{\varepsilon}}+1\right)e^{-C_{\varepsilon}}<\frac{\varepsilon}{4}. Then the first part of Theorem 3.7 ensures the existence of a positive integer NN such that the following hold for all nNn\geq N,

knlogn+12nlog(|Gn|1)+Cεn\displaystyle k\geq n\log n+\frac{1}{2}n\log(|G_{n}|-1)+C_{\varepsilon}n\implies PkU𝒢nTV12PkU𝒢n2<ε2\displaystyle\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}}\leq\frac{1}{2}\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}<\frac{\varepsilon}{2}
\displaystyle\implies PkU𝒢nTV<ε and ||PkU𝒢n||2<ε.\displaystyle\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}}<\varepsilon\text{ and }\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}<\varepsilon.

Finally, using nlogn+12nlog(|Gn|1)+Cεn< 2(nlogn+12nlog(|Gn|1))n\log n+\frac{1}{2}n\log(|G_{n}|-1)+C_{\varepsilon}n<\;2\left(n\log n+\frac{1}{2}n\log(|G_{n}|-1)\right) for all nNn\geq N, we can conclude that

τmix(n)(ε)2(nlogn+12nlog(|Gn|1)) and tmix(n)(ε)2(nlogn+12nlog(|Gn|1)).\displaystyle\tau_{\text{mix}}^{(n)}(\varepsilon)\leq 2\left(n\log n+\frac{1}{2}n\log(|G_{n}|-1)\right)\text{ and }t_{\text{mix}}^{(n)}(\varepsilon)\leq 2\left(n\log n+\frac{1}{2}n\log(|G_{n}|-1)\right).

Thus the theorem follows. ∎

We now establish a lower bound of PkU𝒢n2\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2} that will be useful in proving the 2\ell^{2}-cutoff.

Proposition 3.8.

For large nn, we have

PkU𝒢n2>(n2+n(|Gn|1))(n1)ekn.\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}>\sqrt{(n-2+n(|G_{n}|-1))(n-1)}\;e^{-\frac{k}{n}}.
Proof.

Recall that the irreducible representations of 𝒢n\operatorname{\mathcal{G}_{n}} are parameterised by the elements of 𝒴n(G^n)\mathcal{Y}_{n}(\widehat{G}_{n}). We now use Theorem 2.6 to compute the eigenvalues of the restriction of P^(R)\widehat{P}(R) to some irreducible 𝒢n\operatorname{\mathcal{G}_{n}}-modules. The eigenvalues of the restriction of P^(R)\widehat{P}(R) to the irreducible 𝒢n\operatorname{\mathcal{G}_{n}}-module indexed by

(27) n1(\young(,),,,)𝒴n(G^n)\begin{split}&\hskip 49.08118ptn-1\\[-6.45831pt] &\left(\begin{array}[]{c}\overbrace{\young(\;\;{\;$\cdots$\;\;}\;\;,\;)}\end{array},\emptyset,\dots,\emptyset\right)\in\mathcal{Y}_{n}(\widehat{G}_{n})\end{split}

are given below.

The eigenvalue 11n1-\frac{1}{n} occurs with multiplicity n2n-2, and the eigenvalue 0 occurs with multiplicity 11.

The eigenvalues of the restriction of P^(R)\widehat{P}(R) to the irreducible 𝒢n\operatorname{\mathcal{G}_{n}}-modules indexed by Young GnG_{n}-diagrams with nn boxes of the following form

(28) n1(\young(),,,,\yng(1),,,)𝒴n(G^n), for 1<i|G^n|ith position.\begin{split}&\hskip 49.08118ptn-1\\[-10.76385pt] &\left(\begin{array}[]{c}\overbrace{\young(\;\;{\;$\cdots$\;\;}\;\;)}\end{array},\emptyset,\dots,\emptyset,\underset{\uparrow}{\begin{array}[]{c}\yng(1)\end{array}},\emptyset,\dots,\emptyset\right)\in\mathcal{Y}_{n}(\widehat{G}_{n}),\;\text{ for }1<i\leq|\widehat{G}_{n}|\\[-4.30554pt] &\hskip 170.71652pti\text{th position.}\end{split}

are given below.

The eigenvalue 11n1-\frac{1}{n} occurs with multiplicity (n1)di(n-1)d_{i}, and the eigenvalue 0 occurs with multiplicity did_{i}.

Now (9) implies

PkU𝒢n22\displaystyle\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}^{2} >(n1)d1n((n2)(11n)2k)+i=2|G^n|nd1n1di((n1)di(11n)2k)\displaystyle>(n-1)d_{1}^{n}\left((n-2)\left(1-\frac{1}{n}\right)^{2k}\right)+\sum_{i=2}^{|\widehat{G}_{n}|}nd_{1}^{n-1}d_{i}\left((n-1)d_{i}\left(1-\frac{1}{n}\right)^{2k}\right)
(29) (n1)(n2)e2kn+n(n1)(|Gn|1)e2kn.\displaystyle\approx(n-1)(n-2)e^{-\frac{2k}{n}}+n(n-1)(|G_{n}|-1)e^{-\frac{2k}{n}}.

Here ‘anbna_{n}\approx b_{n}’ means ‘ana_{n} is asymptotic to bnb_{n}’, i.e., anbn1\frac{a_{n}}{b_{n}}\rightarrow 1 as nn\rightarrow\infty. We have used d1=1d_{1}=1 and i=1|G^n|di2=|Gn|\displaystyle\sum_{i=1}^{|\widehat{G}_{n}|}d_{i}^{2}=|G_{n}| to obtain (29). Therefore (29) implies

PkU𝒢n2>(n2+n(|Gn|1))(n1)ekn\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}>\sqrt{(n-2+n(|G_{n}|-1))(n-1)}\;e^{-\frac{k}{n}}

for large nn. ∎
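The growth of this lower bound along the sub-cutoff time scale, which drives the proof of Theorem 1.2 below, can be illustrated numerically. The sketch below (illustrative only) fixes |Gn|=2|G_{n}|=2 for every nn and evaluates the bound at k=(1ϵ)(nlogn+12nlog(|Gn|1))k=\lfloor(1-\epsilon)(n\log n+\frac{1}{2}n\log(|G_{n}|-1))\rfloor:

```python
from math import exp, floor, log, sqrt

def lower_bound(n, order_G, k):
    """Right hand side of Proposition 3.8."""
    return sqrt((n - 2 + n * (order_G - 1)) * (n - 1)) * exp(-k / n)

eps, order_G = 0.5, 2
values = []
for n in (10, 100, 1000):
    k = floor((1 - eps) * (n * log(n) + 0.5 * n * log(order_G - 1)))
    values.append(lower_bound(n, order_G, k))
# below the cutoff time the l^2 distance blows up with n
```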

Proof of Theorem 1.2.

For any ϵ(0,1)\epsilon\in(0,1), the second part of Theorem 3.7 implies

(30) limnP(1+ϵ)(nlogn+12nlog(|Gn|1))U𝒢n2=0.\lim_{n\rightarrow\infty}\left|\left|P^{*\left\lfloor(1+\epsilon)\left(n\log n+\frac{1}{2}n\log\left(|G_{n}|-1\right)\right)\right\rfloor}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}=0.

Again, Proposition 3.8 implies the following

(31) P(1ϵ)(nlogn+12nlog(|Gn|1))U𝒢n2>(1+n2n(|Gn|1))(11n)nϵ(|Gn|1)ϵ2,\left|\left|P^{*\left\lfloor(1-\epsilon)\left(n\log n+\frac{1}{2}n\log\left(|G_{n}|-1\right)\right)\right\rfloor}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{2}>\sqrt{\left(1+\frac{n-2}{n(|G_{n}|-1)}\right)\left(1-\frac{1}{n}\right)}n^{\epsilon}(|G_{n}|-1)^{\frac{\epsilon}{2}},

for large $n$. The right-hand side of (31) tends to infinity as $n\rightarrow\infty$. Therefore, from (30) and (31), we conclude that the warp-transpose top with random shuffle on $\mathcal{G}_{n}$ exhibits the $\ell^{2}$-cutoff phenomenon with cutoff time $n\log n+\frac{1}{2}n\log(|G_{n}|-1)$. This completes the proof of Theorem 1.2. ∎
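For the reader tracing the computation, the inequality (31) is obtained from Proposition 3.8 by a direct substitution; the routine steps are recorded here for convenience.

```latex
% Substituting k = \lfloor (1-\epsilon)(n\log n + \tfrac{1}{2} n\log(|G_n|-1)) \rfloor:
e^{-\frac{k}{n}} \geq n^{-(1-\epsilon)}\,(|G_n|-1)^{-\frac{1-\epsilon}{2}},
\qquad
n-2+n(|G_n|-1) = n(|G_n|-1)\Bigl(1+\tfrac{n-2}{n(|G_n|-1)}\Bigr),
% so the bound of Proposition 3.8 becomes
\sqrt{\bigl(n-2+n(|G_n|-1)\bigr)(n-1)}\; e^{-\frac{k}{n}}
\geq \sqrt{\Bigl(1+\tfrac{n-2}{n(|G_n|-1)}\Bigr)\Bigl(1-\tfrac{1}{n}\Bigr)}\;
n^{\epsilon}\,(|G_n|-1)^{\frac{\epsilon}{2}},
% using \sqrt{n(|G_n|-1)(n-1)}\, n^{\epsilon-1}(|G_n|-1)^{\frac{\epsilon-1}{2}}
%     = \sqrt{1-\tfrac{1}{n}}\, n^{\epsilon}(|G_n|-1)^{\frac{\epsilon}{2}}.
```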

4. Upper bound for total variation distance

In this section, we use a coupling argument to obtain an upper bound on $\left|\left|P^{*k}-U_{\mathcal{G}_{n}}\right|\right|_{\text{TV}}$ when $k\geq n\log n+Cn$, $C>1$. Our method combines the known upper bound for the transpose top with random shuffle [5, Theorem 5.1] with a coupling construction. Let us first recall the notion of a coupling of Markov chains.

Definition 4.1.

A coupling of Markov chains with transition matrix $M$ is a process $(X_{t},Y_{t})_{t}$ such that both $X:=(X_{t})_{t}$ and $Y:=(Y_{t})_{t}$ are Markov chains with transition matrix $M$, possibly with different initial distributions.

Given a coupling (Xt,Yt)t(X_{t},Y_{t})_{t} of a Markov chain with transition matrix MM, suppose that TcoupleT_{\text{couple}} is a random time such that

Xt=Ytfor tTcouple.X_{t}=Y_{t}\quad\text{for }t\geq T_{\text{couple}}.

Then for every pair of initial distributions $(\mu,\nu)$,

(32) μMtνMtTV(Tcouple>t).\left|\left|\mu M^{t}-\nu M^{t}\right|\right|_{\text{TV}}\leq\mathbb{P}(T_{\text{couple}}>t).

The coupling is called successful if

(Tcouple<)=1.\mathbb{P}(T_{\text{couple}}<\infty)=1.

The random time TcoupleT_{\text{couple}} is known as the coupling time. The coupling is called maximal if equality holds in (32), i.e.

μMtνMtTV=(Tcouple>t).\left|\left|\mu M^{t}-\nu M^{t}\right|\right|_{\text{TV}}=\mathbb{P}(T_{\text{couple}}>t).
Theorem 4.1 ([14, Theorem 4]).

Any Markov chain $X$ has a maximal coupling (achieving equality in (32)). Thus there exists a successful maximal coupling for $X$ if and only if $X$ is weakly ergodic (i.e., irrespective of the initial distribution, the chain converges to a unique distribution as the number of transitions approaches infinity).

In particular, an irreducible and aperiodic random walk on a finite group is weakly ergodic.

Before going to the main theorem, we recall the coupon collector problem [15, Section 2.2] and prove a useful lemma. Suppose a shop sells $n$ different types of coupons. A collector visits the shop and buys coupons; each coupon bought is equally likely to be any of the $n$ types. The collector desires a complete set of all $n$ types. Let $\mathscr{T}$ be the minimum (random) number of coupons the collector must buy to obtain all $n$ types; $\mathscr{T}$ is usually known as the coupon collector random variable. Then we have the following [15, Proposition 2.4]:

(33) (𝒯>nlogn+Cn)eC, for any C>0.\mathbb{P}\left(\mathscr{T}>\lceil n\log n+C^{\prime}n\rceil\right)\leq e^{-C^{\prime}},\quad\text{ for any }C^{\prime}>0.
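As a quick numerical sanity check of (33) (not used anywhere in the proofs), one can estimate this tail probability by simulation; the parameters $n=50$ and $C'=1$ below are arbitrary choices.

```python
import math
import random

def coupon_collector_time(n, rng):
    """Number of uniform draws from {0,...,n-1} until every type is seen."""
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

rng = random.Random(0)
n, C = 50, 1.0  # C plays the role of C' in (33)
threshold = math.ceil(n * math.log(n) + C * n)
trials = 20_000
exceed = sum(coupon_collector_time(n, rng) > threshold for _ in range(trials))
print(f"empirical tail {exceed / trials:.4f} vs bound e^-C' = {math.exp(-C):.4f}")
```

The empirical tail sits comfortably below the bound $e^{-C'}$, as (33) predicts.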

Let 𝒯n\mathscr{T}_{n} be the minimum (random) number of coupons collected by the collector to have the coupon of type nn. Then we define the twisted coupon collector random variable 𝒯\mathscr{T}^{\prime} as follows:

(34) 𝒯={𝒯, if the last collected coupon is of type n,𝒯+𝒯n, if the last collected coupon is not of type n,\mathscr{T}^{\prime}=\begin{cases}\mathscr{T},&\text{ if the last collected coupon is of type $n$},\\ \mathscr{T}+\mathscr{T}_{n},&\text{ if the last collected coupon is not of type $n$},\end{cases}

i.e., $\mathscr{T}^{\prime}$ is the minimum (random) number of coupons the collector must buy so that every type is collected at least once and the last collected coupon is of type $n$.

Lemma 4.2.

For the twisted coupon collector random variable defined in (34), we have

(𝒯>nlogn+Cn)(e+1)eC2, for any C>0.\mathbb{P}\left(\mathscr{T}^{\prime}>\lceil n\log n+Cn\rceil\right)\leq(e+1)e^{-\frac{C}{2}},\quad\text{ for any }C>0.
Proof.

The definition of the twisted coupon collector random variable 𝒯\mathscr{T}^{\prime} implies that

𝒯+𝒯n𝒯, where the distribution of 𝒯n is geometric with parameter 1n.\mathscr{T}+\mathscr{T}_{n}\geq\mathscr{T}^{\prime},\text{ where the distribution of }\mathscr{T}_{n}\text{ is geometric with parameter }\frac{1}{n}.

Therefore,

(𝒯>nlogn+Cn)\displaystyle\mathbb{P}\left(\mathscr{T}^{\prime}>\lceil n\log n+Cn\rceil\right)
\displaystyle\leq (𝒯+𝒯n>nlogn+Cn)\displaystyle\mathbb{P}\left(\mathscr{T}+\mathscr{T}_{n}>\lceil n\log n+Cn\rceil\right)
=\displaystyle= (𝒯+𝒯n>nlogn+Cn,𝒯n>C2n1)\displaystyle\mathbb{P}\left(\mathscr{T}+\mathscr{T}_{n}>\lceil n\log n+Cn\rceil,\mathscr{T}_{n}>\frac{C}{2}n-1\right)
+(𝒯+𝒯n>nlogn+Cn,𝒯nC2n1)\displaystyle\quad\quad\quad+\mathbb{P}\left(\mathscr{T}+\mathscr{T}_{n}>\lceil n\log n+Cn\rceil,\mathscr{T}_{n}\leq\frac{C}{2}n-1\right)
\displaystyle\leq (𝒯n>C2n1)+(𝒯>nlogn+C2n)\displaystyle\mathbb{P}\left(\mathscr{T}_{n}>\frac{C}{2}n-1\right)+\mathbb{P}\left(\mathscr{T}>\left\lceil n\log n+\frac{C}{2}n\right\rceil\right)
\displaystyle\leq kC2n1+1(11n)k11n+eC2,by using (33) and 𝒯nGeom(1n)\displaystyle\sum_{k\geq\left\lfloor\frac{C}{2}n-1\right\rfloor+1}\left(1-\frac{1}{n}\right)^{k-1}\frac{1}{n}+e^{-\frac{C}{2}},\quad\text{by using \eqref{eq:coupon-collector} and }\mathscr{T}_{n}\sim\text{Geom}\left(\frac{1}{n}\right)
=\displaystyle= (11n)C2n1+eC2(e+1)eC2.\displaystyle\left(1-\frac{1}{n}\right)^{\left\lfloor\frac{C}{2}n-1\right\rfloor}+e^{-\frac{C}{2}}\leq(e+1)e^{-\frac{C}{2}}.\qed
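Lemma 4.2 can likewise be checked empirically (again purely illustrative; the parameters $n=50$ and $C=4$ are arbitrary). The loop below realises $\mathscr{T}^{\prime}$ operationally: draw until every type has been seen and the most recent draw is of type $n$ (type $n-1$ in the zero-indexed code).

```python
import math
import random

def twisted_collector_time(n, rng):
    """Draws until all of {0,...,n-1} is seen AND the last draw is type n-1."""
    seen, draws, last = set(), 0, None
    while len(seen) < n or last != n - 1:
        last = rng.randrange(n)
        seen.add(last)
        draws += 1
    return draws

rng = random.Random(0)
n, C = 50, 4.0
threshold = math.ceil(n * math.log(n) + C * n)
trials = 20_000
exceed = sum(twisted_collector_time(n, rng) > threshold for _ in range(trials))
print(f"empirical tail {exceed / trials:.4f} vs bound {(math.e + 1) * math.exp(-C / 2):.4f}")
```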
Theorem 4.3.

For the random walk on 𝒢n\operatorname{\mathcal{G}_{n}} driven by PP we have the following:

  1. (1)

    Let C>1C>1. If knlogn+Cnk\geq n\log n+Cn, then we have

    PkU𝒢nTV<aeC+(e+1)eC2,\left|\left|P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\emph{TV}}<ae^{-C}+(e+1)e^{-\frac{C}{2}},

    where the constant aa is the universal constant given in (37).

  2. (2)

    For any ϵ(0,1)\epsilon\in(0,1),

    limnP(1+ϵ)nlognU𝒢nTV=0.\lim_{n\rightarrow\infty}\left|\left|P^{*\lfloor(1+\epsilon)n\log n\rfloor}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\emph{TV}}=0.
Proof.

We construct a coupling (𝒳k,𝒴k)k(\mathscr{X}_{k},\mathscr{Y}_{k})_{k} for the warp-transpose top with random shuffle using a successful maximal coupling of the transpose top with random shuffle on SnS_{n}. Let 𝒳0\mathscr{X}_{0} be the identity element (e1,,e1;id)(\operatorname{e}1,\dots,\operatorname{e}1;\operatorname{id}) of 𝒢n\operatorname{\mathcal{G}_{n}}, and 𝒴0\mathscr{Y}_{0} be a random element of 𝒢n\operatorname{\mathcal{G}_{n}} (chosen uniformly). Thus the last element of 𝒳0\mathscr{X}_{0} is the identity permutation id\operatorname{id}, and the last element of 𝒴0\mathscr{Y}_{0} is a random permutation (say) ζ\zeta.

First we focus on the transpose top with random shuffle on SnS_{n}. Let us define the probability measure 𝒫\mathscr{P} on SnS_{n} that generates the transpose top with random shuffle on SnS_{n}.

(35) 𝒫(π)={1n, if π=(i,n), 1in,0, otherwise , for πSn,\mathscr{P}(\pi)=\begin{cases}\frac{1}{n},&\text{ if }\pi=(i,n),\;1\leq i\leq n,\\ 0,&\text{ otherwise },\end{cases}\quad\quad\text{ for }\pi\in S_{n},

where $(i,n)$ denotes the transposition in $S_{n}$ interchanging $i$ and $n$; here we set $(n,n):=\operatorname{id}$. Recall that the transpose top with random shuffle is irreducible and aperiodic. Thus $\mathscr{P}^{*k}$, the distribution after $k$ transitions, converges to the unique stationary distribution $U_{S_{n}}$ as $k\rightarrow\infty$. Therefore, Theorem 4.1 ensures the existence of a successful maximal coupling $(\mathcal{X}_{k},\mathcal{Y}_{k})_{k}$ for the transpose top with random shuffle (on $S_{n}$) such that

  • (𝒳k)k(\mathcal{X}_{k})_{k} starts at identity permutation and (𝒴k)k(\mathcal{Y}_{k})_{k} starts at ζ\zeta.

  • Both $\mathcal{X}_{k}$ and $\mathcal{Y}_{k}$ evolve according to the law $\mathscr{P}$; in particular, $\mathcal{X}_{k}\sim\mathscr{P}^{*k}$ and $\mathcal{Y}_{k}\sim U_{S_{n}}$.

  • 𝒯couple\mathcal{T}_{\text{couple}} is the coupling time, i.e., 𝒳k=𝒴k for all k𝒯couple\mathcal{X}_{k}=\mathcal{Y}_{k}\text{ for all }k\geq\mathcal{T}_{\text{couple}}, (𝒯couple<)=1\mathbb{P}(\mathcal{T}_{\text{couple}}<\infty)=1, and

    (36) 𝒫kUSnTV=(𝒯couple>k).||\mathscr{P}^{*k}-U_{S_{n}}||_{\text{TV}}=\mathbb{P}(\mathcal{T}_{\text{couple}}>k).

Define $L_{k+1}$ by $\mathcal{X}_{k+1}=\mathcal{X}_{k}\cdot(L_{k+1},n)$, and $R_{k+1}$ by $\mathcal{Y}_{k+1}=\mathcal{Y}_{k}\cdot(R_{k+1},n)$. In other words, $L_{k+1}$ is obtained from $\mathcal{X}_{k}^{-1}\mathcal{X}_{k+1}$, and $R_{k+1}$ is obtained from $\mathcal{Y}_{k}^{-1}\mathcal{Y}_{k+1}$. Thus, the sequence $\{L_{k}\}_{k=1}^{\infty}$ (and likewise $\{R_{k}\}_{k=1}^{\infty}$) consists of independent uniformly distributed random variables taking values in $\{1,\dots,n\}$. We note that $L_{k}$ and $R_{k}$ depend on each other via the coupling $(\mathcal{X}_{k},\mathcal{Y}_{k})_{k}$.

The known upper bound of the transpose top with random shuffle on SnS_{n} (see [5, Theorem 5.1]) provides

𝒫nlogn+CnUSnTVaeC for a universal constant a and C>1.\left|\left|\mathscr{P}^{*\lceil n\log n+Cn\rceil}-U_{S_{n}}\right|\right|_{\text{TV}}\leq ae^{-C}\quad\text{ for a universal constant }a\text{ and }C>1.

Therefore, using (36), we have the following:

(37) (𝒯couple>nlogn+Cn)aeC.\mathbb{P}(\mathcal{T}_{\text{couple}}>\lceil n\log n+Cn\rceil)\leq ae^{-C}.

Now we construct the coupling (𝒳k,𝒴k)k(\mathscr{X}_{k},\mathscr{Y}_{k})_{k} as follows:

  • Recall that 𝒳0=(e1,,e1;id)\mathscr{X}_{0}=(\operatorname{e}1,\dots,\operatorname{e}1;\operatorname{id}), and (𝒴k)k(\mathscr{Y}_{k})_{k} starts from a random element of 𝒢n\operatorname{\mathcal{G}_{n}}.

  • For k0k\geq 0, set

    𝒳k+1={𝒳k(e1,e1,𝔤k+1,e1,Lk+1th position𝔤k+11;(Lk+1,n))if Lk+1<n,𝒳k(e1,,e1,,𝔤k+1;id)if Lk+1=n,\mathscr{X}_{k+1}=\begin{cases}\mathscr{X}_{k}\cdot(\operatorname{e}1,\underset{L_{k+1}\text{th position}}{\underset{\uparrow}{\dots\operatorname{e}1,\mathfrak{g}_{k+1},\operatorname{e}1,\dots}}\mathfrak{g}^{-1}_{k+1};(L_{k+1},n))&\quad\text{if }L_{k+1}<n,\\ &\\ \mathscr{X}_{k}\cdot(\operatorname{e}1,\dots,\operatorname{e}1,\dots,\mathfrak{g}_{k+1};\operatorname{id})&\quad\text{if }L_{k+1}=n,\end{cases}
    𝒴k+1={𝒴k(e1,e1,𝔥k+1,e1,Rk+1th position𝔥k+11;(Rk+1,n))if Rk+1<n,𝒴k(e1,,e1,,𝔥k+1;id)if Rk+1=n,\mathscr{Y}_{k+1}=\begin{cases}\mathscr{Y}_{k}\cdot(\operatorname{e}1,\underset{R_{k+1}\text{th position}}{\underset{\uparrow}{\dots\operatorname{e}1,\mathfrak{h}_{k+1},\operatorname{e}1,\dots}}\mathfrak{h}^{-1}_{k+1};(R_{k+1},n))&\quad\text{if }R_{k+1}<n,\\ &\\ \mathscr{Y}_{k}\cdot(\operatorname{e}1,\dots,\operatorname{e}1,\dots,\mathfrak{h}_{k+1};\operatorname{id})&\quad\text{if }R_{k+1}=n,\end{cases}

    where we choose $\mathfrak{g}_{k+1}$ so that the $L_{k+1}$th coordinates of $\mathscr{X}^{-1}_{k+1}$ and $\mathscr{Y}^{-1}_{k+1}$ match; similarly, we choose $\mathfrak{h}_{k+1}$ so that the $R_{k+1}$th coordinates of $\mathscr{X}^{-1}_{k+1}$ and $\mathscr{Y}^{-1}_{k+1}$ match. We note that $L_{k+1}=R_{k+1}$ may happen.

Under the above coupling $(\mathscr{X}_{k},\mathscr{Y}_{k})_{k}$, if a position $i\in\{1,\dots,n-1\}$ matches in $\mathscr{X}^{-1}_{k_{0}}$ and $\mathscr{Y}^{-1}_{k_{0}}$, then position $i$ agrees in $\mathscr{X}^{-1}_{k}$ and $\mathscr{Y}^{-1}_{k}$ for all $k\geq k_{0}$. Therefore, we have the following:

  • The first $n$ positions of $\mathscr{X}^{-1}_{k}$ and $\mathscr{Y}^{-1}_{k}$ match once $k\geq\mathscr{T}^{\prime}$ (defined in (34)), as all of them are updated by time $\mathscr{T}^{\prime}$, with the final update at position $n$.

  • The last position of 𝒳k\mathscr{X}_{k} and 𝒴k\mathscr{Y}_{k} matches when k=𝒯couplek=\mathcal{T}_{\text{couple}} (the coupling time for (𝒳k,𝒴k)k(\mathcal{X}_{k},\mathcal{Y}_{k})_{k}), but does not match when k<𝒯couplek<\mathcal{T}_{\text{couple}}.

Thus, the coupling time $\mathfrak{T}_{\text{couple}}$ for $(\mathscr{X}_{k},\mathscr{Y}_{k})_{k}$ cannot exceed $\max\{\mathcal{T}_{\text{couple}},\mathscr{T}^{\prime}\}$,

i.e., 𝒳k=𝒴k for all kmax{𝒯couple,𝒯}.\text{i.e., }\mathscr{X}_{k}=\mathscr{Y}_{k}\text{ for all }k\geq\max\{\mathcal{T}_{\text{couple}},\mathscr{T}^{\prime}\}.

Hence by using (32), we have

Pnlogn+CnU𝒢nTV\displaystyle\left|\left|P^{*\left\lceil n\log n+Cn\right\rceil}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\text{TV}} (𝔗couple>nlogn+Cn)\displaystyle\leq\mathbb{P}\left(\mathfrak{T}_{\text{couple}}>\left\lceil n\log n+Cn\right\rceil\right)
(max{𝒯couple,𝒯}>nlogn+Cn)\displaystyle\leq\mathbb{P}\left(\max\{\mathcal{T}_{\text{couple}},\mathscr{T}^{\prime}\}>\left\lceil n\log n+Cn\right\rceil\right)
(𝒯couple>nlogn+Cn)+(𝒯>nlogn+Cn)\displaystyle\leq\mathbb{P}\left(\mathcal{T}_{\text{couple}}>\left\lceil n\log n+Cn\right\rceil\right)+\mathbb{P}\left(\mathscr{T}^{\prime}>\left\lceil n\log n+Cn\right\rceil\right)
(38) aeC+(e+1)eC2,\displaystyle\leq ae^{-C}+(e+1)e^{-\frac{C}{2}},

for $C>1$ and the constant $a$ in (37). The inequality in (38) follows from (37) and Lemma 4.2. The first part of this theorem follows from the fact that $\left|\left|P^{*k}-U_{\mathcal{G}_{n}}\right|\right|_{\text{TV}}$ decreases as $k$ increases, and the second part follows from the first part by setting $C=\epsilon\log n-\frac{1}{n}$. ∎

Remark 4.4.

We now describe an explicit coupling (not necessarily optimal) for the transpose top with random shuffle on $S_{n}$ that provides a proof of (37) without using Theorem 4.1. We use the same notation $(\mathcal{X}_{k},\mathcal{Y}_{k})_{k}$ (respectively $\mathcal{T}_{\text{couple}}$) for the coupling (respectively the coupling time). Let $\mathcal{X}_{0}=\operatorname{id}$ and $\mathcal{Y}_{0}=\zeta$. The coupling is described as follows: let $\mathcal{D}_{k}:=\{j:\mathcal{X}_{k}(j)=\mathcal{Y}_{k}(j)\}$, i.e., the set of positions where $\mathcal{X}_{k}$ and $\mathcal{Y}_{k}$ agree. Now choose $i\in\{1,\dots,n\}$ uniformly at random.

  • If i,n𝒟ki,n\in\mathcal{D}_{k}, then 𝒳k+1=𝒳kid\mathcal{X}_{k+1}=\mathcal{X}_{k}\cdot\operatorname{id} and 𝒴k+1=𝒴kid\mathcal{Y}_{k+1}=\mathcal{Y}_{k}\cdot\operatorname{id}. Thus 𝒟k+1=𝒟k\mathcal{D}_{k+1}=\mathcal{D}_{k}.

  • If i𝒟ki\in\mathcal{D}_{k} but n𝒟kn\notin\mathcal{D}_{k}, then 𝒳k+1=𝒳kid\mathcal{X}_{k+1}=\mathcal{X}_{k}\cdot\operatorname{id} and 𝒴k+1=𝒴k(j0,n)\mathcal{Y}_{k+1}=\mathcal{Y}_{k}\cdot(j_{0},n), where j0:=𝒴k1𝒳k(n)j_{0}:=\mathcal{Y}_{k}^{-1}\mathcal{X}_{k}(n). Thus 𝒟k+1=𝒟k{n}\mathcal{D}_{k+1}=\mathcal{D}_{k}\cup\{n\}.

  • If i𝒟ki\notin\mathcal{D}_{k} but n𝒟kn\in\mathcal{D}_{k}, then 𝒳k+1=𝒳k(i,n)\mathcal{X}_{k+1}=\mathcal{X}_{k}\cdot(i,n) and 𝒴k+1=𝒴k(i,n)\mathcal{Y}_{k+1}=\mathcal{Y}_{k}\cdot(i,n). Thus 𝒟k+1=𝒟k{i}{n}\mathcal{D}_{k+1}=\mathcal{D}_{k}\cup\{i\}\setminus\{n\}.

  • If i𝒟ki\notin\mathcal{D}_{k} and n𝒟kn\notin\mathcal{D}_{k}, then 𝒳k+1=𝒳kid\mathcal{X}_{k+1}=\mathcal{X}_{k}\cdot\operatorname{id} and 𝒴k+1=𝒴k(j0,n)\mathcal{Y}_{k+1}=\mathcal{Y}_{k}\cdot(j_{0},n), where j0:=𝒴k1𝒳k(n)j_{0}:=\mathcal{Y}_{k}^{-1}\mathcal{X}_{k}(n). Thus 𝒟k+1=𝒟k{n}\mathcal{D}_{k+1}=\mathcal{D}_{k}\cup\{n\}.

Thus we have the following: each element of $\{1,\dots,n\}$ is equally likely to be added to $\mathcal{D}_{k+1}$ if $n\in\mathcal{D}_{k}$; $j(\neq n)\in\mathcal{D}_{k_{0}}$ implies $j\in\mathcal{D}_{k}$ for all $k\geq k_{0}$; $n\notin\mathcal{D}_{k}$ implies $n\in\mathcal{D}_{k+1}$; and $\mathcal{X}_{k}=\mathcal{Y}_{k}$ when $\mathcal{D}_{k}=\{1,\dots,n\}$. Therefore $\mathcal{T}_{\text{couple}}\leq\mathscr{T}+n$, where $\mathscr{T}$ is the coupon collector random variable satisfying (33). Finally, using the fact that

(𝒯couple>nlogn+Cn)(𝒯+n>nlogn+Cn),\mathbb{P}\left(\mathcal{T}_{\text{couple}}>\lceil n\log n+Cn\rceil\right)\leq\mathbb{P}\left(\mathscr{T}+n>\lceil n\log n+Cn\rceil\right),

nlogn+Cnn=nlogn+(C1)n\lceil n\log n+Cn\rceil-n=\lceil n\log n+(C-1)n\rceil, and (33); we can conclude (37) with a=ea=e.
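The explicit coupling of this remark is easy to implement; the sketch below stores permutations in one-line notation as Python lists (so that right multiplication by $(a,n)$ swaps the entries at positions $a$ and $n$) and checks along the way that agreement at positions other than $n$ is never destroyed.

```python
import random

def coupling_step(X, Y, rng):
    """One step of the explicit coupling of Remark 4.4.
    X, Y are permutations in one-line notation (X[j-1] = X(j))."""
    n = len(X)
    i = rng.randrange(1, n + 1)  # uniform in {1,...,n}

    def swap(P, a, b):  # right multiplication by the transposition (a, b)
        P[a - 1], P[b - 1] = P[b - 1], P[a - 1]

    i_agrees = X[i - 1] == Y[i - 1]
    n_agrees = X[n - 1] == Y[n - 1]
    if i_agrees and n_agrees:
        pass                        # both chains apply id
    elif n_agrees:                  # i disagrees, n agrees: move both chains
        swap(X, i, n)
        swap(Y, i, n)
    else:                           # n disagrees: fix position n in Y
        j0 = Y.index(X[n - 1]) + 1  # j0 = Y^{-1}(X(n))
        swap(Y, j0, n)

rng = random.Random(1)
n = 8
X = list(range(1, n + 1))           # the identity permutation
Y = X[:]
rng.shuffle(Y)                      # the random start zeta
steps = 0
while X != Y and steps < 10_000:
    agreed = {j for j in range(1, n) if X[j - 1] == Y[j - 1]}
    coupling_step(X, Y, rng)
    # agreement off position n is never destroyed, as argued above
    assert agreed <= {j for j in range(1, n) if X[j - 1] == Y[j - 1]}
    steps += 1
print("coupled after", steps, "steps")
```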

5. Lower bound for total variation distance

This section focuses on the lower bound of $\left|\left|P^{*k}-U_{\mathcal{G}_{n}}\right|\right|_{\text{TV}}$ for $k=n\log n+cn$, $c\ll 0$, and proves Theorem 1.3. The idea is to use the fact that a projected chain mixes faster than the original chain. We define a group homomorphism from $\mathcal{G}_{n}$ onto the symmetric group $S_{n}$ that projects the warp-transpose top with random shuffle on $\mathcal{G}_{n}$ to the transpose top with random shuffle on $S_{n}$. Recall the lower bound results for the transpose top with random shuffle on $S_{n}$ from [13, Section 4]. Although the detailed analysis for the transpose top with random shuffle on $S_{n}$ first appeared in [5, Chapter 5(C), p. 27], the lower bound results we need here are directly available in [13, Section 4].

Recall that the transpose top with random shuffle is the random walk on $S_{n}$ driven by $\mathscr{P}$ (defined in (35)), and $\mathscr{P}^{*k}$ is the distribution after $k$ transitions. Then $\mathscr{P}^{*k}\rightarrow U_{S_{n}}$ as $k\rightarrow\infty$. Given $\pi\in S_{n}$, if $\mathfrak{f}(\pi)$ denotes the number of fixed points of $\pi$, then we have

(39) 𝒫kUSnTV14(Ek(𝔣2)(Ek(𝔣))2)(Ek(𝔣))22Ek(𝔣)(see [13, Proposition 4.1]),||\mathscr{P}^{*k}-U_{S_{n}}||_{\text{TV}}\geq 1-\frac{4\left(E_{k}\left(\mathfrak{f}^{2}\right)-\left(E_{k}\left(\mathfrak{f}\right)\right)^{2}\right)}{\left(E_{k}(\mathfrak{f})\right)^{2}}-\frac{2}{E_{k}(\mathfrak{f})}\quad\text{(see \cite[cite]{[\@@bibref{}{FTTR}{}{}, Proposition 4.1]})},

where Ek(𝔣)E_{k}(\mathfrak{f}) (respectively Ek(𝔣2)E_{k}(\mathfrak{f}^{2})) denotes the expected value of 𝔣\mathfrak{f} (respectively 𝔣2\mathfrak{f}^{2}) with respect to the probability measure 𝒫k\mathscr{P}^{*k}. Now recall the expressions for Ek(𝔣)E_{k}(\mathfrak{f}) and Ek(𝔣2)E_{k}(\mathfrak{f}^{2}) obtained in [13].

(40) Ek(𝔣)\displaystyle E_{k}(\mathfrak{f}) 1+(n2)ekn.\displaystyle\approx 1+(n-2)e^{-\frac{k}{n}}.
(41) Ek(𝔣2)\displaystyle E_{k}(\mathfrak{f}^{2}) 2+3(n2)ekn+(n25n+5)e2kn+(n2)(1+(1)knk).\displaystyle\approx 2+3(n-2)e^{-\frac{k}{n}}+(n^{2}-5n+5)e^{-\frac{2k}{n}}+(n-2)\left(\frac{1+(-1)^{k}}{n^{k}}\right).
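The asymptotics (40) can be checked against a direct simulation of the transpose top with random shuffle (a rough numerical illustration; the parameters $n=20$, $k=40$ and the trial count are arbitrary):

```python
import math
import random

def transpose_top_step(p, rng):
    """One step of the transpose top with random shuffle: pick i uniformly
    in {1,...,n} and multiply by the transposition (i,n); (n,n) = id."""
    i = rng.randrange(len(p))
    p[i], p[-1] = p[-1], p[i]

rng = random.Random(0)
n, k, trials = 20, 40, 4000
total_fixed = 0
for _ in range(trials):
    p = list(range(n))  # start at the identity
    for _ in range(k):
        transpose_top_step(p, rng)
    total_fixed += sum(p[j] == j for j in range(n))
empirical = total_fixed / trials
predicted = 1 + (n - 2) * math.exp(-k / n)  # the approximation (40)
print(f"E_k(f): simulated {empirical:.3f}, predicted {predicted:.3f}")
```

For finite $n$ the two values differ by lower-order terms, but they agree to within the accuracy one expects at $n=20$.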

Let us define a homomorphism ff from 𝒢n\operatorname{\mathcal{G}_{n}} onto SnS_{n} as follows:

(42) f:(g1,,gn;π)π, for (g1,,gn;π)𝒢n.f:(g_{1},\dots,g_{n};\pi)\mapsto\pi,\text{ for }(g_{1},\dots,g_{n};\pi)\in\operatorname{\mathcal{G}_{n}}.

It can be checked that the mapping $f$ defined in (42) is a surjective homomorphism. Moreover, $f$ projects the warp-transpose top with random shuffle on $\mathcal{G}_{n}$ to the transpose top with random shuffle on $S_{n}$, i.e., $Pf^{-1}=\mathscr{P}$. Here $f^{-1}$ is defined by $f^{-1}(\pi):=\{\pi^{\prime}\in\mathcal{G}_{n}:f(\pi^{\prime})=\pi\}$ for $\pi\in S_{n}$. We now prove a lemma that will be useful in proving the main result of this section.
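The identity $Pf^{-1}=\mathscr{P}$ can be verified exactly on a toy example; the sketch below takes $G_{n}=\mathbb{Z}/2\mathbb{Z}$ (written additively) and $n=3$, and assumes (as in the introduction) that $P$ puts weight $\frac{1}{n|G_{n}|}$ on each generator.

```python
from fractions import Fraction

n = 3
G = [0, 1]                 # Z/2Z, written additively
w = Fraction(1, n * len(G))

P = {}                     # support of P: (group tuple, permutation) -> weight
for g in G:                # generators (e,...,e,g; id)
    key = (tuple([0] * (n - 1) + [g]), tuple(range(n)))
    P[key] = P.get(key, 0) + w
for i in range(n - 1):     # generators (e,...,g^{-1},...,e,g; (i,n))
    for g in G:
        gs = [0] * n
        gs[i] = (-g) % 2       # g^{-1} in the ith coordinate
        gs[n - 1] = g          # g in the nth coordinate
        pi = list(range(n))
        pi[i], pi[n - 1] = pi[n - 1], pi[i]
        key = (tuple(gs), tuple(pi))
        P[key] = P.get(key, 0) + w

pushforward = {}           # push P forward along f: (g_1,...,g_n; pi) -> pi
for (gs, pi), pr in P.items():
    pushforward[pi] = pushforward.get(pi, 0) + pr

print(pushforward)
```

The pushforward assigns mass $\frac{1}{n}$ to $\operatorname{id}$ and to each transposition $(i,n)$, which is exactly the measure (35).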

Lemma 5.1.

For any positive integer kk we have (Pf1)k=Pkf1\left(Pf^{-1}\right)^{*k}=P^{*k}f^{-1}.

Proof.

We use induction on $k$. The base case $k=1$ holds by definition. Now assume the induction hypothesis, i.e., $\left(Pf^{-1}\right)^{*m}=P^{*m}f^{-1}$ for some positive integer $m\geq 1$. Let $\pi\in S_{n}$ be arbitrary. Then for the inductive step $k=m+1$ we have the following:

(Pf1)(m+1)(π)\displaystyle\left(Pf^{-1}\right)^{*(m+1)}(\pi) =((Pf1)(Pf1)m)(π)\displaystyle=\left(\left(Pf^{-1}\right)*\left(Pf^{-1}\right)^{*m}\right)(\pi)
={ξ,ζSn:ξζ=π}(Pf1)(ξ)(Pf1)m(ζ)\displaystyle=\sum_{\{\xi,\zeta\in S_{n}:\;\xi\zeta=\pi\}}\left(Pf^{-1}\right)(\xi)\left(Pf^{-1}\right)^{*m}(\zeta)
={ξ,ζSn:ξζ=π}(Pf1)(ξ)(Pmf1)(ζ), by the induction hypothesis,\displaystyle=\sum_{\{\xi,\zeta\in S_{n}:\;\xi\zeta=\pi\}}\left(Pf^{-1}\right)(\xi)\left(P^{*m}f^{-1}\right)(\zeta),\;\text{ by the induction hypothesis},
=ξ,ζSnξζ=πP(f1(ξ))Pm(f1(ζ))\displaystyle=\sum_{\begin{subarray}{c}\xi,\zeta\in S_{n}\\ \xi\zeta=\pi\end{subarray}}P\left(f^{-1}(\xi)\right)P^{*m}\left(f^{-1}(\zeta)\right)
(43) =ξ,ζSnξζ=πξf1(ξ)ζf1(ζ)P(ξ)Pm(ζ).\displaystyle=\sum_{\begin{subarray}{c}\xi,\zeta\in S_{n}\\ \xi\zeta=\pi\end{subarray}}\sum_{\begin{subarray}{c}\xi^{\prime}\in f^{-1}(\xi)\\ \zeta^{\prime}\in f^{-1}(\zeta)\end{subarray}}P(\xi^{\prime})P^{*m}(\zeta^{\prime}).

Now using the fact that ff is a homomorphism, we have the following:

{(ξ,ζ)f1(ξ)×f1(ζ):ξ,ζSn and ξζ=π}={(ξ,ζ)𝒢n×𝒢n:ξζf1(π)}.\displaystyle\{(\xi^{\prime},\zeta^{\prime})\in f^{-1}(\xi)\times f^{-1}(\zeta):\;\xi,\zeta\in S_{n}\text{ and }\xi\zeta=\pi\}=\{(\xi^{\prime},\zeta^{\prime})\in\operatorname{\mathcal{G}_{n}}\times\operatorname{\mathcal{G}_{n}}:\;\xi^{\prime}\zeta^{\prime}\in f^{-1}(\pi)\}.

Therefore the expression in (43) becomes

{ξ,ζ𝒢n:ξζf1(π)}P(ξ)Pm(ζ)\displaystyle\sum_{\{\xi^{\prime},\zeta^{\prime}\in\operatorname{\mathcal{G}_{n}}:\;\xi^{\prime}\zeta^{\prime}\in f^{-1}(\pi)\}}P(\xi^{\prime})P^{*m}(\zeta^{\prime}) =πf1(π)ξ,ζ𝒢nξζ=πP(ξ)Pm(ζ)\displaystyle=\sum_{\pi^{\prime}\in f^{-1}(\pi)}\sum_{\begin{subarray}{c}\xi^{\prime},\zeta^{\prime}\in\operatorname{\mathcal{G}_{n}}\\ \xi^{\prime}\zeta^{\prime}=\pi^{\prime}\end{subarray}}P(\xi^{\prime})P^{*m}(\zeta^{\prime})
=πf1(π)P(m+1)(π)=(P(m+1)f1)(π).\displaystyle=\sum_{\pi^{\prime}\in f^{-1}(\pi)}P^{*(m+1)}(\pi^{\prime})=\left(P^{*(m+1)}f^{-1}\right)(\pi).

Thus the lemma follows by induction. ∎

Theorem 5.2.

For the random walk on 𝒢n\operatorname{\mathcal{G}_{n}} driven by PP we have the following:

  1. (1)

    For large nn,

    PkU𝒢nTV12(3+3ec+o(1)(e2c+ec+1))(1+(1+o(1))ec)2,||P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}||_{\emph{TV}}\geq 1-\frac{2\left(3+3e^{-c}+o(1)(e^{-2c}+e^{-c}+1)\right)}{\left(1+(1+o(1))e^{-c}\right)^{2}},

    when k=nlogn+cnk=n\log n+cn and c0c\ll 0.

  2. (2)

    For any ϵ(0,1)\epsilon\in(0,1),

    limnP(1ϵ)nlognU𝒢nTV=1.\displaystyle\lim_{n\rightarrow\infty}\left|\left|P^{*\lfloor(1-\epsilon)n\log n\rfloor}-U_{\operatorname{\mathcal{G}_{n}}}\right|\right|_{\emph{TV}}=1.
Proof.

We know that, given two probability distributions μ\mu and ν\nu on Ω\Omega and a mapping ψ:ΩΛ\psi:\Omega\rightarrow\Lambda, we have μνTVμψ1νψ1TV||\mu-\nu||_{\emph{TV}}\geq||\mu\psi^{-1}-\nu\psi^{-1}||_{\emph{TV}}, where Λ\Lambda is finite [15, Lemma 7.9]. Therefore we have the following:

PkU𝒢nTV\displaystyle||P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}||_{\text{{TV}}} Pkf1U𝒢nf1TV\displaystyle\geq||P^{*k}f^{-1}-U_{\operatorname{\mathcal{G}_{n}}}f^{-1}||_{\text{{TV}}}
$\displaystyle=||\left(Pf^{-1}\right)^{*k}-U_{S_{n}}||_{\text{TV}}$, by Lemma 5.1 and $U_{\mathcal{G}_{n}}f^{-1}=U_{S_{n}}$,
(44) =𝒫kUSnTV,\displaystyle=||\mathscr{P}^{*k}-U_{S_{n}}||_{\text{{TV}}},

using $Pf^{-1}=\mathscr{P}$. Therefore (39), (40), (41), and (44) imply that

PkU𝒢nTV\displaystyle||P^{*k}-U_{\operatorname{\mathcal{G}_{n}}}||_{\text{{TV}}}\geq  12(2Ek(𝔣2)2(Ek(𝔣))2+Ek(𝔣))(Ek(𝔣))2\displaystyle\;1-\frac{2\left(2E_{k}\left(\mathfrak{f}^{2}\right)-2\left(E_{k}\left(\mathfrak{f}\right)\right)^{2}+E_{k}\left(\mathfrak{f}\right)\right)}{\left(E_{k}(\mathfrak{f})\right)^{2}}
(45) =\displaystyle=  12(3+3(n2)ekn2(n1)e2kn+o(1))(1+(n2)ekn)2, for k>1,\displaystyle\;1-\frac{2\left(3+3(n-2)e^{-\frac{k}{n}}-2(n-1)e^{-\frac{2k}{n}}+o(1)\right)}{\left(1+(n-2)e^{-\frac{k}{n}}\right)^{2}},\;\text{ for }k>1,

for sufficiently large $n$. The equality in (45) holds because of the following:

2(Ek(𝔣2)(Ek(𝔣))2)+Ek(𝔣)\displaystyle 2\left(E_{k}\left(\mathfrak{f}^{2}\right)-\left(E_{k}\left(\mathfrak{f}\right)\right)^{2}\right)+E_{k}\left(\mathfrak{f}\right)
\displaystyle\approx  2(2+3(n2)ekn+(n25n+5)e2kn+(n2)(1+(1)knk)(1+(n2)ekn)2)\displaystyle\;2\left(2+3(n-2)e^{-\frac{k}{n}}+(n^{2}-5n+5)e^{-\frac{2k}{n}}+(n-2)\left(\frac{1+(-1)^{k}}{n^{k}}\right)-\left(1+(n-2)e^{-\frac{k}{n}}\right)^{2}\right)
+(1+(n2)ekn)\displaystyle\hskip 346.89621pt+\left(1+(n-2)e^{-\frac{k}{n}}\right)
=\displaystyle=  3+3(n2)ekn2(n1)e2kn+2(n2)(1+(1)knk).\displaystyle\;3+3(n-2)e^{-\frac{k}{n}}-2(n-1)e^{-\frac{2k}{n}}+2(n-2)\left(\frac{1+(-1)^{k}}{n^{k}}\right).

Now if $n$ is large, $c\ll 0$, and $k=n\log n+cn$, then (45) gives the first part of this theorem.

Again, for any $\epsilon\in(0,1)$, from (45) we have

(46) 1P(1ϵ)nlognU𝒢nTV12(3+3nϵ+o(1)(n2ϵ+nϵ+1))(1+(1+o(1))nϵ)2,1\geq||P^{*\lfloor(1-\epsilon)n\log n\rfloor}-U_{\operatorname{\mathcal{G}_{n}}}||_{\text{TV}}\geq 1-\frac{2\left(3+3n^{\epsilon}+o(1)(n^{2\epsilon}+n^{\epsilon}+1)\right)}{\left(1+(1+o(1))n^{\epsilon}\right)^{2}},

for large nn. Therefore, the second part of this theorem follows from (46) and the fact that

limn2(3+3nϵ+o(1)(n2ϵ+nϵ+1))(1+(1+o(1))nϵ)2=limn2(3n2ϵ+3nϵ+o(1)(1n2ϵ+1nϵ+1))(1nϵ+1+o(1))2= 0.\lim_{n\rightarrow\infty}\frac{2\left(3+3n^{\epsilon}+o(1)(n^{2\epsilon}+n^{\epsilon}+1)\right)}{\left(1+(1+o(1))n^{\epsilon}\right)^{2}}=\lim_{n\rightarrow\infty}\frac{2\left(\frac{3}{n^{2\epsilon}}+\frac{3}{n^{\epsilon}}+o(1)(\frac{1}{n^{2\epsilon}}+\frac{1}{n^{\epsilon}}+1)\right)}{\left(\frac{1}{n^{\epsilon}}+1+o(1)\right)^{2}}=\;0.\qed
Proof of Theorem 1.3.

Theorem 1.3 follows from the second part of Theorem 4.3 and the second part of Theorem 5.2. ∎

Acknowledgement

I extend sincere thanks to my PhD advisor Arvind Ayyer for all the insightful discussions during the preparation of this paper. I am very grateful to the anonymous referees of the Journal of Algebraic Combinatorics for many constructive suggestions. I would like to thank an anonymous referee of Algebraic Combinatorics for the valuable comments, which helped improve the total variation upper bound result and simplify the proof of the total variation lower bound. I am grateful to Professor Tullio Ceccherini-Silberstein for his encouragement and inspiring comments. I would also like to thank Guy Blachar, Ashish Mishra, and Shangjie Yang for helpful discussions. I acknowledge support in part by a UGC Centre for Advanced Study grant.

Statements and Declarations

The author has no conflict of interest to disclose. This article is part of the author's PhD dissertation. An extended abstract of this article was accepted at FPSAC 2020 (online).

Data Availability Statements

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Copyright Statements

This version of the article has been accepted for publication in the Journal of Algebraic Combinatorics: An International Journal, after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s10801-023-01271-1. Use of this Accepted Version is subject to the publisher’s Accepted Manuscript terms of use https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms.

References

  • [1] David Aldous. Random walks on finite groups and rapidly mixing Markov chains. In Seminar on probability, XVII, volume 986 of Lecture Notes in Math., pages 243–297. Springer, Berlin, 1983.
  • [2] David Aldous and Persi Diaconis. Shuffling cards and stopping times. Amer. Math. Monthly, 93(5):333–348, 1986.
  • [3] David Aldous and Persi Diaconis. Strong uniform times and finite random walks. Adv. in Appl. Math., 8(1):69–97, 1987.
  • [4] Megan Bernstein and Evita Nestoridi. Cutoff for random to random card shuffle. Ann. Probab., 47(5):3303–3320, 2019.
  • [5] Persi Diaconis. Applications of non-commutative Fourier analysis to probability problems. In École d’Été de Probabilités de Saint-Flour XV–XVII, 1985–87, pages 51–100. Springer, 1988.
  • [6] Persi Diaconis. Group representations in probability and statistics, volume 11 of IMS Lecture Notes—Monograph Series. Institute of Mathematical Statistics, Hayward, CA, 1988. vi+198 pp.
  • [7] Persi Diaconis. The cutoff phenomenon in finite Markov chains. Proc. Nat. Acad. Sci. U.S.A., 93(4):1659–1664, 1996.
  • [8] Persi Diaconis and Mehrdad Shahshahani. Generating a random permutation with random transpositions. Z. Wahrsch. Verw. Gebiete, 57(2):159–179, 1981.
  • [9] James Allen Fill and Clyde H. Schoolfield, Jr. Mixing times for Markov chains on wreath products and related homogeneous spaces. Electron. J. Probab., 6:no. 11, 22, 2001.
  • [10] L. Flatto, A. M. Odlyzko, and D. B. Wales. Random shuffles and group representations. Ann. Probab., 13(1):154–178, 1985.
  • [11] Subhajit Ghosh. Cutoff for the warp-transpose top with random shuffle. Sém. Lothar. Combin., 84B:Art. 69, 12, 2020.
  • [12] Subhajit Ghosh. Total variation cutoff for the transpose top-2 with random shuffle. J. Theoret. Probab., 33(4):1832–1854, 2020.
  • [13] Subhajit Ghosh. Total variation cutoff for the flip-transpose top with random shuffle. ALEA Lat. Am. J. Probab. Math. Stat., 18(1):985–1006, 2021.
  • [14] David Griffeath. A maximal coupling for Markov chains. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 31:95–106, 1974/75.
  • [15] David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. Markov chains and mixing times. American Mathematical Society, Providence, RI, 2009. With a chapter by James G. Propp and David B. Wilson.
  • [16] Oliver Matheau-Raven. Random walks on the symmetric group: cutoff for one-sided transposition shuffles. PhD thesis, University of York, 2020.
  • [17] Ashish Mishra and Murali K. Srinivasan. The Okounkov-Vershik approach to the representation theory of GSnG\sim S_{n}. J. Algebraic Combin., 44(3):519–560, 2016.
  • [18] Ashish Mishra and Shraddha Srivastava. On representation theory of partition algebras for complex reflection groups. Algebr. Comb., 3(2):389–432, 2020.
  • [19] Evita Nestoridi. The limit profile of star transpositions. arXiv preprint arXiv:2111.03622, 2021.
  • [20] Evita Nestoridi and Sam Olesker-Taylor. Limit profiles for reversible Markov chains. Probab. Theory Related Fields, 182(1-2):157–188, 2022.
  • [21] Amritanshu Prasad. Representation theory: a combinatorial viewpoint, volume 147. Cambridge University Press, 2015.
  • [22] Dana Randall. Rapidly mixing Markov chains with applications in computer science and physics. Computing in Science and Engg., 8(2):30 – 41, 2006.
  • [23] Bruce E. Sagan. The symmetric group, volume 203 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 2001. Representations, combinatorial algorithms, and symmetric functions.
  • [24] Laurent Saloff-Coste. Random walks on finite groups. In Probability on discrete structures, volume 110 of Encyclopaedia Math. Sci., pages 263–346. Springer, Berlin, 2004.
  • [25] Clyde H. Schoolfield, Jr. Random walks on wreath products of groups. J. Theoret. Probab., 15(3):667–693, 2002.
  • [26] Jean-Pierre Serre. Linear representations of finite groups. Springer-Verlag, New York-Heidelberg, 1977. Translated from the second French edition by Leonard L. Scott, Graduate Texts in Mathematics, Vol. 42.
  • [27] Lucas Teyssier. Limit profile for random transpositions. Ann. Probab., 48(5):2323–2343, 2020.