Cutoff phenomenon for the warp-transpose top with random shuffle
Abstract.
Let be a sequence of non-trivial finite groups. In this paper, we study a random walk on the complete monomial group generated by elements of the form and for . We call this the warp-transpose top with random shuffle on . We find the spectrum of the transition probability matrix of this shuffle. We prove that the mixing time of this shuffle is . We show that this shuffle exhibits -cutoff at and total variation cutoff at .
Key words and phrases:
random walk, complete monomial group, mixing time, cutoff, Young-Jucys-Murphy elements
2020 Mathematics Subject Classification:
60J10, 60B15, 60C05.
1. Introduction
The number of shuffles required to mix up a deck of cards is the central question in card shuffling problems, and it has received considerable attention over the last few decades. Card shuffling problems can be described as random walks on the symmetric group, and the generalisation obtained by replacing the symmetric group with other finite groups is also well studied in probability theory [24]. A random walk converges to a unique stationary distribution under certain natural conditions. For a convergent random walk, the mixing time (the number of steps required to reach the stationary distribution up to a given tolerance) is one of the main quantities of interest. Knowing the eigenvalues and eigenvectors of the transition matrix is helpful for studying the convergence rate of random walks. More generally, convergence rate questions for finite Markov chains arise in many subjects, including statistical physics, computer science and biology [22].
In the eighties, Diaconis and Shahshahani introduced non-commutative Fourier analysis techniques in their work on the random transposition shuffle [8]. They proved that this shuffle on distinct cards has total variation cutoff (sharp mixing time; the formal definition of cutoff is given later) at . The upper bound estimate in that case mainly uses the fact that the total variation distance is at most half of the -distance, which relies on the spectrum of the transition matrix. After this landmark work, the theory of random walks on finite groups developed into an independent field with its own problems and techniques. Other techniques have since emerged for dealing with random walks on finite groups (viz. the coupling argument [1] and the strong stationary time approach [2, 3]). However, the spectral approach has become a standard technique for answering mixing time questions for random walks on finite groups [5]. For a random walk model with known cutoff, a natural question is how the transition to stationarity occurs at cutoff. In 2020, Teyssier [27] studied the limit profile for the random transposition model, providing a precise description of the transition at cutoff. Later, Nestoridi and Olesker-Taylor [20] generalised Teyssier’s result to reversible Markov chains. More recently, Nestoridi further developed these results by studying the limit profile for the transpose top with random shuffle (the shuffling algorithm is given in the next paragraph) [19]. The present paper focuses on obtaining the sharp mixing time; the limit profile computation will be considered in future work.
Our model is mainly inspired by the transpose top with random shuffle on the symmetric group [10]. Given a deck of distinct cards, this shuffle chooses a card from the deck uniformly at random and transposes it with the top card. This shuffle exhibits total variation cutoff at [5, 6]. The transpose top with random shuffle was recently generalised by the author to cards with two orientations, known as the flip-transpose top with random shuffle on the hyperoctahedral group [13]. The flip-transpose top with random shuffle on has total variation cutoff at . In the extended abstract [11], the author introduced a generalisation of the flip-transpose top with random shuffle to the complete monomial group , and announced a result on total variation cutoff under the restriction for all . In this work, we further generalise it by removing the restriction on the size of , which motivates us to consider both the -distance and the total variation distance. Moreover, if , then our model provides an example where spectral analysis fails to yield the sharp total variation mixing time; in other words, the -bound on the total variation distance is not sufficient for computing the sharp total variation mixing time. Thus we consider both the -distance and the total variation distance in the present paper. For another notable random walk on the complete monomial group, we mention the work of Schoolfield Jr. [25], a generalisation of the random transposition model to for a finite group . However, that walk was generated by a probability measure which is constant on conjugacy classes, whereas the generating measure of our model is not. In general, it is not easy to study a random walk generated by a probability measure that is not constant on conjugacy classes (cf. [4, 10, 12, 13]). For other random walks on the complete monomial group, see [9, 16].
Before describing our random walk model, let us first recall the definition of the complete monomial group.
Definition 1.1.
Let be a finite group and be the symmetric group of permutations of elements of the set . The complete monomial group is the wreath product of with , denoted by , and can be described as follows: The elements of are -tuples where and . The multiplication in is given by . Therefore .
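To make the group law of Definition 1.1 concrete, here is a minimal computational sketch. It takes the base group to be the integers modulo m (the paper allows any finite group), and the precise multiplication convention below — permute the first colour tuple by the second permutation, then combine componentwise — is an assumption, since conventions for wreath products vary; the function names are hypothetical.

```python
# Sketch of the complete monomial group G wr S_n with G = Z/mZ.
# An element is a pair (x, pi): x is an n-tuple of colours in Z/mZ and
# pi is a permutation of {0, ..., n-1}, with pi[i] the image of i.

def wreath_multiply(a, b, m):
    """Product (x, pi) * (y, sigma) under an assumed convention:
    colours of the first factor are permuted by sigma, then combined."""
    (x, pi), (y, sigma) = a, b
    n = len(x)
    new_x = tuple((x[sigma[i]] + y[i]) % m for i in range(n))
    new_pi = tuple(pi[sigma[i]] for i in range(n))  # composition pi o sigma
    return (new_x, new_pi)

def wreath_identity(n):
    """Identity element: all colours trivial, identity permutation."""
    return (tuple([0] * n), tuple(range(n)))
```

Associativity and the identity law can be checked directly on sample elements, and the group has order |G|^n · n!, consistent with the order of the wreath product.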
Now let be a sequence of non-trivial finite groups. We consider the complete monomial groups for each positive integer . Let be the identity of and be the identity of . For an element , let and for , let
Unless otherwise stated, from now on denotes the element of with in the th position and in the th position, for . One can check that is equal to for .
In this work we consider a random walk on the complete monomial group driven by a probability measure , defined as follows:
(1)
We call this the warp-transpose top with random shuffle because at most times the th component is multiplied by and the th component is multiplied by simultaneously, , . We now give a combinatorial description of this model as follows:
Let denote the set of all arrangements of coloured cards in a row such that the colours of the cards are indexed by the set . For example, if denotes the additive group of integers modulo , then elements of can be identified with the elements of (the hyperoctahedral group). For , by saying update the colour using colour we mean the colour is updated to colour . Elements of can be identified with the elements of as follows: The element is identified with the arrangement in such that the label of the th card is , and its colour is , for each . Given an arrangement of coloured cards in , the warp-transpose top with random shuffle on is the following: Choose a positive integer uniformly from . Also choose a colour uniformly from , independent of the choice of the integer .
(1) If : update the colour of the th card using colour .
(2) If : first transpose the th and th cards. Then simultaneously update the colour of the th card using colour and update the colour of the th card using colour .
The flip-transpose top with random shuffle on the hyperoctahedral group corresponds to the case for all [13]. We now state the main theorems of this paper.
Theorem 1.1.
The total variation and mixing times for the warp-transpose top with random shuffle on are .
Theorem 1.2.
The warp-transpose top with random shuffle on exhibits -cutoff at .
Theorem 1.3.
The warp-transpose top with random shuffle on exhibits total variation cutoff at time .
In view of Theorem 1.3, the distribution after transitions is close to uniform in total variation distance. On the other hand, if , then Theorem 1.2 and
say that the distribution after transitions is far from uniform in -distance. Therefore, the standard spectral approach (which mainly uses the fact that the total variation distance is at most half of the -distance) for obtaining the total variation mixing time fails when . The proofs of Theorems 1.1 and 1.2 will be presented in Section 3, and the proof of Theorem 1.3 at the end of Section 5.
Let us recall some concepts and terminology from the representation theory of finite groups and from discrete time Markov chains with finite state space, to make this paper self-contained. Readers with a representation-theoretic background may skip Subsection 1.1, and those with a probabilistic background may skip Subsection 1.2.
1.1. Representation theory of finite groups
Let be a finite group and be a finite dimensional complex vector space. Also let be the set of all invertible linear operators on . A linear representation of is a homomorphism from to . Sometimes this representation is also denoted by the pair . The dimension of the vector space is called the dimension of the representation. is called the -module corresponding to the representation in this case. Let be the group algebra consisting of complex linear combinations of elements of . In particular taking , we define the right regular representation of by
A vector subspace of is said to be stable ( or ‘invariant’) under if for all in . The representation is irreducible if is non-trivial and has no non-trivial proper stable subspace. Two representations and of are said to be isomorphic if there exists an invertible linear map such that the following diagram commutes for all :
For each , can also be thought of as an invertible complex matrix of size . The trace of the matrix is said to be the character value of at and is denoted by . It can be easily seen that the character values are constant on conjugacy classes, hence characters are class functions. If denotes the complex conjugate of , then one can check that for all . Let be the complex vector space of class functions of . Then a ‘standard’ inner product on is defined as follows:
An important theorem in this context is the following [26, Theorem 6]: The characters corresponding to the non-isomorphic irreducible representations of form an -orthonormal basis of .
If denotes the tensor product of the vector spaces and , then the tensor product of two representations and is a representation denoted by and defined by,
We use some results from the representation theory of finite groups without recalling their proofs. For details on representations of finite groups, see [21, 23, 26].
1.2. Discrete time Markov chain with finite state space
Let be a finite set. A sequence of random variables is a discrete time Markov chain with state space and transition matrix if for all , all , and all events satisfying , we have
(2)
Equation (2) says that given the present, the future is independent of the past. Let denote the distribution after transitions, i.e. is the row (probability) vector . Then for all , which implies . In particular if the chain starts at , then its distribution after transitions is , i.e. . Here is defined on as follows:
A Markov chain is said to be irreducible if it is possible for the chain to reach any state from any other state using only transitions of positive probability. The period of a state is defined to be the greatest common divisor of the set of all times at which it is possible for the chain to return to the starting state . All states of an irreducible Markov chain have the same period [15, Lemma 1.6]. An irreducible Markov chain is said to be aperiodic if the common period of all its states is . A probability distribution is said to be a stationary distribution of the Markov chain if . Any irreducible Markov chain possesses a unique stationary distribution with for all [15, Proposition 1.14]. Moreover, if the chain is aperiodic then as [15, Theorem 4.9]. For an irreducible chain, we first define the -distance between the distribution after transitions and the stationary distribution.
Definition 1.2.
Let denote the distribution after transitions of an irreducible discrete time Markov chain with finite state space , and denote its stationary distribution. Then the -distance between and is defined by
We now define the total variation distance between two probability measures.
Definition 1.3.
Let and be two probability measures on . The total variation distance between and is defined by
It can be easily seen that (see [15, Proposition 4.2]).
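For distributions on a finite set, both distances can be written out directly. The sketch below (hypothetical helper names) also makes the bound just quoted easy to verify numerically: by the Cauchy-Schwarz inequality, the total variation distance is at most half of the l2-distance taken with respect to a strictly positive stationary distribution.

```python
import math

# The two distances used in the paper, for distributions given as
# dictionaries over a common finite state space.

def total_variation(mu, nu):
    """Total variation distance: half the l^1 distance."""
    return 0.5 * sum(abs(mu[x] - nu[x]) for x in mu)

def l2_distance(mu, nu):
    """l^2(nu)-distance of mu from nu, i.e. ||mu/nu - 1|| in l^2(nu);
    nu must be strictly positive (e.g. a stationary distribution)."""
    return math.sqrt(sum((mu[x] / nu[x] - 1.0) ** 2 * nu[x] for x in mu))
```

Indeed, the l^1 norm of mu − nu equals the nu-weighted l^1 norm of mu/nu − 1, which Cauchy-Schwarz bounds by the l^2(nu) norm, giving TV ≤ l2/2.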
For an irreducible and aperiodic chain, the quantity of interest is the minimum number of transitions required to get close to stationarity within a given tolerance . We first define the maximal -distance (respectively total variation distance) between the distribution after transitions and the stationary distribution as follows:
For , the -mixing time (respectively total variation mixing time) with tolerance level is defined by
Most of the notations of this subsection are borrowed from [15].
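The convergence statements above can be illustrated numerically. The 3-state birth-and-death chain below is an example chosen here for illustration (it is not from the paper): it is irreducible and aperiodic, and the rows of its matrix powers converge to the stationary distribution.

```python
# Rows of P^t converge to the stationary distribution pi as t grows,
# for an irreducible, aperiodic chain.

def mat_mult(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

pi = [0.25, 0.50, 0.25]   # stationary: one checks pi P = pi directly

Pt = P
for _ in range(49):       # Pt = P^50
    Pt = mat_mult(Pt, P)
# every row of Pt is now numerically indistinguishable from pi
```

The second largest eigenvalue of this particular matrix is 1/2, so the distance to stationarity decays like 2^{-t}, a toy instance of the spectral control of convergence rates discussed above.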
1.3. Non-commutative Fourier analysis and random walks on finite groups
Let be two probability measures on a finite group . We define the convolution of and by . The Fourier transform of at the right regular representation is defined by the matrix . The matrix can be thought of as the action of the group algebra element on by multiplication on the right. It can be easily seen that .
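On a cyclic group, where every irreducible representation is one-dimensional, the Fourier transform reduces to the classical discrete Fourier transform, and the multiplicativity of the Fourier transform under convolution can be checked directly. The sketch below uses the cyclic group as a stand-in for a general finite group; the function names are hypothetical.

```python
import cmath

# Convolution of probability measures on Z/nZ, and the Fourier
# transform at the (one-dimensional) characters of Z/nZ.

def convolve(mu, nu):
    """(mu * nu)(x) = sum_y mu(x - y) nu(y) on Z/nZ (measures as lists)."""
    n = len(mu)
    return [sum(mu[(x - y) % n] * nu[y] for y in range(n)) for x in range(n)]

def fourier(mu, k):
    """Fourier coefficient of mu at the character x -> e^{2 pi i k x / n}."""
    n = len(mu)
    return sum(mu[x] * cmath.exp(2j * cmath.pi * k * x / n) for x in range(n))
```

Convolution powers of a measure whose support generates the group flatten toward the uniform distribution, which is the abelian shadow of the random walk convergence discussed next.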
A random walk on a finite group driven by a probability measure is a Markov chain with state space and transition probabilities , . The transition matrix is , and the distribution after the th transition is (the convolution of with itself times); i.e., the probability of reaching state from state after transitions is . The random walk on driven by is irreducible if and only if the support of generates [24, Proposition 2.3]. The stationary distribution for an irreducible random walk on driven by is the uniform distribution on (since for all ). From now on, the uniform distribution on a group will be denoted by . For the random walk on driven by , it suffices to focus on and because,
for any two elements . We now define the cutoff phenomenon for a sequence of random walks on finite groups.
Definition 1.4.
Let be a sequence of finite groups. For each , let be a probability measure on such that the support of generates . Consider the sequence of irreducible and aperiodic random walks on driven by . We say that the -cutoff phenomenon (respectively total variation cutoff phenomenon) holds for the family if there exists a sequence of positive real numbers tending to infinity as , such that the following hold:
(1) For any and ,
(2) For any and ,
Here denotes the floor of (the largest integer less than or equal to ).
Informally, we will say that has an -cutoff (respectively total variation cutoff) at time . This says that for sufficiently large , the leading order term of the mixing time does not depend on the tolerance level . In other words, the distribution after transitions is very close to the stationary distribution if , but far from it if . Although in most cases the cutoff phenomenon depends on the multiplicity of the second largest eigenvalue of the transition matrix [7], sometimes the behaviour is different: for the random-to-random shuffle [4], many eigenvalues are almost equal to the second largest eigenvalue, and they affect the (total variation) cutoff time. We now show that the random walk under consideration is irreducible and aperiodic.
Proposition 1.4.
The warp-transpose top with random shuffle on is irreducible and aperiodic.
Proof.
The support of is and it can be easily seen that is a generating set of .
(3)
Thus (3) implies that generates , and hence the warp-transpose top with random shuffle on is irreducible. Moreover, given any , the set of all times at which it is possible for the chain to return to the starting state contains the integer (as the support of contains the identity element of ). Therefore the period of the state is , and hence, by irreducibility, all states of this chain have period . Thus the chain is aperiodic. ∎
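The generating-set argument can be checked by brute force in a small case. The sketch below takes the base group to be the integers modulo 2 and n = 3, builds generators of the shapes described in the shuffle (a colour change at the top position, and a transposition with the top position combined with opposite colour updates; the sign convention is an assumption), and verifies that their closure is the whole group of order 2^3 · 3! = 48.

```python
# Brute-force irreducibility check in a small case: the support of the
# driving measure generates the complete monomial group (Z/mZ) wr S_n.

def multiply(a, b, m):
    """Wreath-product multiplication, under an assumed convention."""
    (x, pi), (y, sigma) = a, b
    n = len(x)
    return (tuple((x[sigma[i]] + y[i]) % m for i in range(n)),
            tuple(pi[sigma[i]] for i in range(n)))

def support(n, m):
    """Generator shapes of the warp-transpose top shuffle (assumed)."""
    top = n - 1
    gens = []
    for g in range(m):                    # colour change at the top card
        x = [0] * n; x[top] = g
        gens.append((tuple(x), tuple(range(n))))
    for i in range(n - 1):                # transpose (i, top), colours g / -g
        for g in range(m):
            x = [0] * n; x[i] = g; x[top] = (-g) % m
            perm = list(range(n)); perm[i], perm[top] = perm[top], perm[i]
            gens.append((tuple(x), tuple(perm)))
    return gens

def closure(gens, m):
    """Breadth-first closure of gens under group multiplication."""
    seen, frontier = set(gens), list(gens)
    while frontier:
        nxt = []
        for a in frontier:
            for g in gens:
                c = multiply(a, g, m)
                if c not in seen:
                    seen.add(c); nxt.append(c)
        frontier = nxt
    return seen
```

For m = 2 and n = 3 the closure has all 48 elements, so the support indeed generates the whole group in this toy case.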
Proposition 1.4 says that the warp-transpose top with random shuffle on converges to the uniform distribution as the number of transitions goes to infinity. In Section 2 we will find the spectrum of . We will prove Theorems 1.1 and 1.2 in Section 3. In Section 4, we will obtain an upper bound of using the coupling argument. In Section 5, the lower bound of will be discussed and Theorem 1.3 will be proved.
2. Spectrum of the transition matrix
In this section we find the eigenvalues of the transition matrix , the Fourier transform of at the right regular representation of . To find the eigenvalues of we will use the representation theory of the wreath product of with the symmetric group . First we briefly discuss the representation theory of , following the notation from [17]. We refer to the exposition [17] for more details on the representation theory of .
A partition of a positive integer (denoted ) is a weakly decreasing finite sequence of positive integers such that . The partition can be pictorially visualised as a left-justified arrangement of rows of boxes with boxes in the th row, . This pictorial arrangement of boxes is known as the Young diagram of . For example, there are five partitions of the positive integer , viz. , , , and . The Young diagrams corresponding to the partitions of are shown in Figure 2.
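Partitions are easy to enumerate programmatically; the generator below yields each partition as a weakly decreasing tuple of row lengths, i.e., the row profile of the corresponding Young diagram.

```python
# Enumerate partitions of n as weakly decreasing tuples of parts.

def partitions(n, max_part=None):
    """Yield all partitions of n with every part at most max_part."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()            # the unique empty partition (empty diagram)
        return
    for k in range(max_part, 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest
```

For n = 4 this yields exactly the five partitions mentioned above: (4), (3,1), (2,2), (2,1,1) and (1,1,1,1).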
Definition 2.1.
Let denote the set of all Young diagrams (there is a unique Young diagram with zero boxes) and denote the set of all Young diagrams with boxes. For example, elements of are shown in Figure 2. For a finite set , we define . For , define , where is the number of boxes of the Young diagram and define .
Let be a fixed positive integer. Let denote the (finite) set of all non-isomorphic irreducible representations of . Given , we denote by the corresponding irreducible -module (the space of the corresponding irreducible representation of ). Elements of are called Young -diagrams, and elements of are called Young -diagrams with boxes. For example, if and (the additive group of integers modulo ), then an element of is given in Figure 3. Let . A Young tableau of shape is obtained by taking the Young diagram and filling its boxes (bijectively) with the numbers . A Young tableau is said to be standard if the numbers in the boxes strictly increase along each row and each column of the Young diagram of . The set of all standard Young tableaux of shape is denoted by . Elements of are listed in Figure 4. Let . A Young -tableau of shape is obtained by taking the Young -diagram and filling its boxes (bijectively) with the numbers . A Young -tableau is said to be standard if the numbers in the boxes strictly increase along each row and each column of all Young diagrams occurring in . Let , where , denote the set of all standard Young -tableaux of shape and let . An element of is given in Figure 5; its shape is given in Figure 3.
Definition 2.2.
Let and . If appears in the Young diagram , where is the shape of and , we write . For the example given in Figure 5, we have , , , , , , , , , . The content of a box in row and column of a Young diagram is the integer . Let denote the box of in which the number resides, and let denote the content of the box . For the example given in Figure 5, we also have , , , , , , , , , .
The irreducible representations of can be parametrised by elements of [17, Lemma 6.2 and Theorem 6.4]. Given , denotes the set of all Young -diagrams obtained from by removing one of the inner corners of the Young diagram ; see Figure 6 for an example. The branching rule [17, Theorem 6.6] for the pair is given as follows: Let (respectively ) denote the irreducible -module (respectively -module) indexed by (respectively ). Then,
(4)
where is the irreducible -module indexed by . For an illustration of (4), consider from Figure 3, and recall from Figure 6. Then
Here we have used the same notation for a singleton set and its element.
Definition 2.3.
Let be the subgroup
of for . In particular and .
The subgroup is isomorphic to (the direct product of and ) via the isomorphism sending to . Here we use the same notation for permutations of and of , for . The irreducible -modules are given by tensor products of irreducible -modules with irreducible -modules [26, Theorem 10]. Therefore we may parametrise the irreducible representations of by elements of . The branching rule for the pair is given as follows: Let (respectively ) denote the irreducible -module (respectively -module) indexed by (respectively ). Then,
(5)
In particular, (irreducible -module), for . To illustrate (5), consider from Figure 3, and recall from Figure 6. Then
Here we have used the same notation for a singleton set and its element.
Remark 2.1.
Although the simple (multiplicity-free) branching of the pair was established in [17, Section 4], no explicit branching rule was given there. To see the branching rule (5), we note the following: a straightforward generalisation of [18, Theorem 4.3] proves the branching rule (5) when . In view of the isomorphism , it suffices to prove (5) for .
Definition 2.4.
The (generalised) Young-Jucys-Murphy elements of or are given by and
The Young-Jucys-Murphy elements generate a maximal commutative subalgebra of and act as scalars on the Gelfand-Tsetlin subspaces of irreducible -modules. We now define the Gelfand-Tsetlin subspaces and the Gelfand-Tsetlin decomposition.
Let and consider the irreducible -module (the space for the representation ). Since the branching is simple (recall (5)), the decomposition into irreducible -modules is given by
where the sum is over all with (i.e., there is an edge from to in the branching multi-graph), and the decomposition is canonical. Here we note that and . Iterating this decomposition down to irreducible -submodules, we obtain
(6)
where the sum is over all possible chains with and . We call (6) the Gelfand-Tsetlin decomposition of and each in (6) a Gelfand-Tsetlin subspace of . We note that if from the definition of .
Theorem 2.2 ([17, Theorem 6.5]).
Let . Then we may index the Gelfand-Tsetlin subspaces of by standard Young -tableaux of shape and write the Gelfand-Tsetlin decomposition as
where each is closed under the action of and as a -module, is isomorphic to the irreducible -module
For the eigenvalues of on are given by . Here we recall and from Definition 2.2, is the irreducible -module indexed by , and is the th Young-Jucys-Murphy element of .
Theorem 2.3 ([17, Theorem 6.7]).
Let . Write the elements of as and set for each . Then
Here denotes the number of standard Young tableaux of shape , for each .
Lemma 2.4.
Let G be a finite group and . If (respectively ) denotes the irreducible -module (respectively character) and is the dimension of , then the action of the group algebra element on is given by the following scalar matrix
Here is the identity matrix of order and is the trivial representation of .
Proof.
It is clear that is in the centre of . Therefore by Schur’s lemma ([26, Proposition 4]), we have for some . The value of can be obtained by equating the traces of and . ∎
Remark 2.5.
Let and , where (the trivial representation of ). We write as the tuple , where for each . We also denote by the irreducible -module corresponding to , and for each . Thus , and depend on , i.e., on . To avoid notational complications, the dependence of , and on is suppressed. We note that for , the dimension of is .
Theorem 2.6.
For each , let denote the restriction of to the irreducible -module . Then the eigenvalues of are given by
for each .
Proof.
3. Order of the mixing time and -cutoff
In this section, using the spectrum of the transition matrix , we find upper bounds for and when . We also prove Theorems 1.1 and 1.2. Before proving the main results of this section, we first set some notation and prove two useful lemmas. For any positive integer , we write to denote that is a partition of . Given a partition of the integer (here we allow to take the value ), throughout this section denotes the largest part of . In particular, if then (as there is a unique Young diagram with zero boxes) and we set .
Theorem 3.1 (Plancherel formula, [5, Theorem 4.1]).
Let and be two functions on the finite group G. Then
where the sum is over all irreducible representations of and is the dimension of .
Recall that is the uniform distribution on the group . Then using Lemma 2.4 we have the following
Moreover, given any probability measure on the finite group , we have . Therefore setting , we have the following
(9)
We now state the Diaconis-Shahshahani upper bound lemma. The proof follows from the Cauchy-Schwarz inequality and (9).
Lemma 3.2 ([5, Lemma 4.2]).
Let be a probability measure on a finite group such that for all . Suppose the random walk on driven by is irreducible. Then we have the following
where the sum is over all non-trivial irreducible representations of and is the dimension of .
Definition 3.1.
Let be a non-empty set. Then the indicator function of is denoted by and is defined by
Lemma 3.3.
Let be a positive integer and be any non-negative real number. Then we have
Proof.
An immediate corollary of Lemma 3.3 follows from the fact
Corollary 3.4.
Following the notations of Lemma 3.3, we have
Lemma 3.5.
Let . Recall that (respectively ) denotes the largest part of (respectively its conjugate) for . Then we have
where and for each .
Proof.
Proposition 3.6.
For the warp-transpose top with random shuffle on , we have
for all .
Proof.
Let us recall that , the trivial representation of . Given , throughout this proof we write , where , , and . Now using Lemma 3.2, we have
(12)
First we partition the set into two disjoint subsets as follows:
It can be easily seen that the ’s are disjoint. Therefore, using Theorem 2.6, Remark 2.7, and , the inequality (12) becomes
(13)
The sum of the first two terms on the right hand side of (3) is equal to
(14)
Now, recalling that (respectively ) is the largest part of (respectively its conjugate), we have the following:
This implies
Thus using for , , and for all , the expression in (3) is bounded above by
(15)
The inequality in (3) follows from Corollary 3.4 and Lemma 3.3. Now, recalling , and using Lemma 3.5, the third term on the right hand side of (3) is less than
(16)
We now deal with (16) by considering two separate cases, namely and . Now using
the partial sum corresponding to in (16) is equal to,
(17)
Using for , and the multinomial theorem
the expression in (17) can be written as
(18)
The inequality in (3) follows from Lemma 3.3. As , we have
Thus, writing for , the expression in (3) is less than or equal to
(19)
Now using for all and the expression in (19) is less than or equal to
(20)
Now using the notation to denote , and
the partial sum corresponding to in (16) turns out to be
(21)
where . Using for , and the multinomial theorem
the expression given in (21) is equal to the following
(22)
The inequality in (22) follows from Lemma 3.3. As , we have
Thus, writing for and using , the expression in (22) is less than or equal to
(23)
Now using for all and for all , the expression in (23) is less than or equal to
(24)
Therefore the proposition follows from (3), (3), (20), (3) and for all . ∎
Theorem 3.7.
For the random walk on driven by we have the following:
(1) Let . If , then
(2) For any , if we set , then
Proof.
Proof of Theorem 1.1.
Let , and let (respectively ) be the -mixing time (respectively total variation mixing time) with tolerance level for the warp-transpose top with random shuffle on . We choose such that . Then the first part of Theorem 3.7 ensures the existence of a positive integer such that the following hold for all :
Finally, using for all , we can conclude that
Thus the theorem follows. ∎
We now establish a lower bound of that will be useful in proving the -cutoff.
Proposition 3.8.
For large , we have
Proof.
Recall that the irreducible representations of are parametrised by the elements of . We now use Theorem 2.6 to compute the eigenvalues of the restriction of to some irreducible -modules. The eigenvalues of the restriction of to the irreducible -module indexed by
(27)
are given below.
Eigenvalues:
Multiplicities:
The eigenvalues of the restriction of to the irreducible -modules indexed by Young -diagram with boxes of the following form
(28)
are given below.
Eigenvalues:
Multiplicities:
Now (9) implies
(29)
Here ‘’ means ‘ is asymptotic to ’ i.e. as . We have used and to obtain (29). Therefore (29) implies
for large . ∎
Proof of Theorem 1.2.
For any , the second part of Theorem 3.7 implies
(30)
Again, Proposition 3.8 implies the following
(31)
for large . The right hand side of the inequality (31) tends to infinity as . Therefore from (30) and (31), we can conclude that the warp-transpose top with random shuffle on satisfies the -cutoff phenomenon with cutoff time . This completes the proof of Theorem 1.2. ∎
4. Upper bound for total variation distance
In this section, we use a coupling argument to obtain an upper bound for when . Our method uses the known upper bound for the transpose top with random shuffle [5, Theorem 5.1] together with a coupling argument. Let us first recall the notion of a Markov chain coupling.
Definition 4.1.
A coupling of Markov chains with transition matrix is a process such that both and are Markov chains with transition matrix , possibly with different initial distributions.
Given a coupling of a Markov chain with transition matrix , suppose that is a random time such that
Then for every pair of initial distributions,
(32)
The coupling is called successful if
The random time is known as the coupling time. The coupling is called maximal if equality holds in (32), i.e.
Theorem 4.1 ([14, Theorem 4]).
Any Markov chain has a maximal coupling achieving equality in (32). Thus there exists a successful maximal coupling for if and only if is weakly ergodic, i.e., irrespective of the initial distribution, the chain converges to a unique distribution as the number of transitions approaches infinity.
In particular, an irreducible and aperiodic random walk on a finite group is weakly ergodic.
Before proceeding to the main theorem, we recall the coupon collector problem [15, Section 2.2] and prove a useful lemma. Suppose a shop sells different types of coupons. A collector visits the shop and buys coupons, each of which is equally likely to be any of the types; the collector desires a complete set of all types. Let be the minimum (random) number of coupons the collector must buy to obtain all types; is usually known as the coupon collector random variable. Then we have the following [15, Proposition 2.4]:
(33)
Let be the minimum (random) number of coupons the collector must buy to obtain a coupon of type . Then we define the twisted coupon collector random variable as follows:
(34)
i.e., is the minimum (random) number of coupons the collector must buy so that every type is collected at least once and the last collected coupon is of type .
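The coupon collector random variable is straightforward to simulate; its expectation is n(1 + 1/2 + ... + 1/n), consistent with the n log n behaviour used above. The sketch below is illustrative (sample size and tolerance are ad-hoc choices); simulating the twisted variant only additionally requires recording the type of the last coupon drawn.

```python
import random

# Simulation of the coupon collector random variable: the number of
# uniform draws needed to see all n coupon types at least once.

def coupon_collector_time(n, rng):
    """Draw uniform coupons until all n types are seen; return the count."""
    seen, t = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        t += 1
    return t
```

Averaging over many runs recovers the harmonic-sum mean n H_n to within sampling error.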
Lemma 4.2.
For the twisted coupon collector random variable defined in (34), we have
Proof.
The definition of the twisted coupon collector random variable implies that
Therefore,
Theorem 4.3.
For the random walk on driven by we have the following:
- (1)
-
(2)
For any ,
Proof.
We construct a coupling for the warp-transpose top with random shuffle using a successful maximal coupling of the transpose top with random shuffle on . Let be the identity element of , and be a random element of (chosen uniformly). Thus the last element of is the identity permutation , and the last element of is a random permutation (say) .
First we focus on the transpose top with random shuffle on . Let us define the probability measure on that generates the transpose top with random shuffle on .
(35)
where denotes the transposition in interchanging and ; here we set . Recall that the transpose top with random shuffle is irreducible and aperiodic. Thus , the distribution after transitions, converges to the unique stationary distribution as . Therefore, Theorem 4.1 ensures the existence of a successful maximal coupling for the transpose top with random shuffle (on ) such that
- starts at the identity permutation and starts at .
- Both and evolve according to the law , i.e., and .
- is the coupling time, i.e., , , and
(36)
Define by , and by . In other words, is obtained from , and is obtained from . Thus the sequence (and likewise ) consists of independent uniformly distributed random variables taking values in . We note that and depend on each other via the coupling .
The known upper bound of the transpose top with random shuffle on (see [5, Theorem 5.1]) provides
Therefore, using (36), we have the following:
(37)
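Since the displayed bound concerns the transpose top with random shuffle, its convergence can be seen concretely on a small example. The sketch below (with the illustrative choice n = 4; all identifiers are ours) evolves the exact distribution of the walk started at the identity, where each step applies the identity or a transposition with the top position, each with probability 1/n, and records the total variation distance to the uniform distribution, which is nonincreasing and tends to 0.

```python
import itertools
from collections import defaultdict

n = 4  # illustrative deck size; the paper's n is general
perms = list(itertools.permutations(range(n)))
uniform = 1.0 / len(perms)

def transpose_top(p, i):
    """Transpose positions i and n-1 (the 'top') in the arrangement p."""
    q = list(p)
    q[i], q[n - 1] = q[n - 1], q[i]
    return tuple(q)

# Point mass at the identity, evolved by the exact transition kernel:
# each step picks i uniformly from {1, ..., n}; i = n gives the pass move.
dist = {p: 0.0 for p in perms}
dist[tuple(range(n))] = 1.0
tv_trace = []
for _ in range(16):
    new = defaultdict(float)
    for p, mass in dist.items():
        for i in range(n):  # i = n-1 leaves p unchanged (identity move)
            new[transpose_top(p, i)] += mass / n
    dist = dict(new)
    tv = 0.5 * sum(abs(m - uniform) for m in dist.values())
    tv_trace.append(tv)

print(tv_trace[0], tv_trace[-1])  # the distance decreases toward 0
```

Evolving the full distribution is feasible only for tiny n, but it gives the exact total variation curve rather than a Monte Carlo estimate.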
Now we construct the coupling as follows:
- Recall that , and starts from a random element of .
- For , set
where we choose such that the th positions of and match; also choose such that the th positions of and match. We note that may happen.
Under the above coupling , if a position matches in and , then that position agrees in and for . Therefore, we have the following:
- The first positions of and match when (defined in (34)), as they are updated by time with the final update at position .
- The last position of and matches when (the coupling time for ), but does not match when .
Thus, the coupling time for cannot exceed ,
Hence, by using (32), we have
(38)
for and the constant in (37). The inequality in (4) follows from (37) and Lemma 4.2. The first part of this theorem follows from the fact that decreases as increases, and the second part follows from the first part by setting . ∎
Remark 4.4.
We now describe an explicit coupling (not necessarily optimal) for the transpose top with random shuffle on , which provides a proof of (37) without using Theorem 4.1. We use the same notation (respectively ) for the coupling (respectively the coupling time). Let and . The coupling is described as follows: let , i.e., the set of positions where and agree. Now choose uniformly at random.
- If , then and . Thus .
- If but , then and , where . Thus .
- If but , then and . Thus .
- If and , then and , where . Thus .
Thus we have the following: each element of is equally likely to be added to if , implies for all , implies , and when . Therefore , where is the coupon collector random variable satisfying (33). Finally, (37) follows by using the fact
5. Lower bound for total variation distance
This section focuses on the lower bound of for and proves Theorem 1.3. The idea is to use the fact that a projected chain mixes at least as fast as the original chain. We define a group homomorphism from onto the symmetric group which projects the warp-transpose top with random shuffle on to the transpose top with random shuffle on . Recall the lower bound results for the transpose top with random shuffle on from [13, Section 4]. Although the detailed analysis for the transpose top with random shuffle on first appeared in [5, Chapter 5(C), p. 27], the lower bound results we need here are directly available in [13, Section 4].
Recall that the transpose top with random shuffle is the random walk on driven by (defined in (35)), and is the distribution after transitions. Then as . Given , if denotes the number of fixed points in , then we have
(39)
where (respectively ) denotes the expected value of (respectively ) with respect to the probability measure . Now recall the expressions for and obtained in [13].
(40)
(41)
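The role of the number of fixed points as a distinguishing statistic can be illustrated exactly for small n: under the uniform distribution on the symmetric group the expected number of fixed points is exactly 1, while after only a few steps of the transpose top with random shuffle it is noticeably larger. The sketch below (n = 5 and six steps, chosen purely for illustration; identifiers are ours) computes both expectations exactly by evolving the full distribution.

```python
import itertools
from collections import defaultdict

n = 5  # illustrative; the lower-bound argument takes n large
perms = list(itertools.permutations(range(n)))

def fixed_points(p):
    """Number of indices that p maps to themselves."""
    return sum(1 for i, v in enumerate(p) if v == i)

# Under the uniform distribution the expected number of fixed points is 1:
# each position is fixed in (n-1)! of the n! permutations.
unif_mean = sum(fixed_points(p) for p in perms) / len(perms)

def step(dist):
    """One transition of the transpose top with random shuffle:
    transpose a uniformly chosen position with the top (i = n-1 is a pass)."""
    new = defaultdict(float)
    for p, mass in dist.items():
        for i in range(n):
            q = list(p)
            q[i], q[n - 1] = q[n - 1], q[i]
            new[tuple(q)] += mass / n
    return new

dist = {tuple(range(n)): 1.0}
means = []
for _ in range(6):
    dist = step(dist)
    means.append(sum(fixed_points(p) * m for p, m in dist.items()))

print(unif_mean)  # exactly 1.0
print(means)      # approaches 1 as the walk mixes
```

After one step the mean is 3.4: with probability 1/5 the walk stays at the identity (5 fixed points), otherwise it is a transposition (3 fixed points).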
Let us define a homomorphism from onto as follows:
(42)
It can be checked that the mapping defined in (42) is a surjective homomorphism. Moreover, projects the warp-transpose top with random shuffle on to the transpose top with random shuffle on , i.e., . Here is defined by for . We now prove a lemma that will be useful in proving the main result of this section.
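Before the lemma, the projection onto the permutation part can be made concrete. In the sketch below (our own encoding, with G = Z/2 and n = 3 purely for illustration), an element of the complete monomial group is a pair (vector of group elements, permutation), multiplication twists the vector by the standard wreath-product action, and the map forgetting the vector is checked to be a homomorphism on random pairs.

```python
import itertools
import random

n, m = 3, 2  # illustrative: G = Z/2, coordinates permuted by S_3

def compose(p, q):
    """Permutation composition p after q, with permutations as tuples."""
    return tuple(p[q[i]] for i in range(n))

def multiply(x, y):
    """Multiplication in G wr S_n: (g, p)(h, q) = (g + p.h, p q),
    where (p.h)_i = h_{p^{-1}(i)} is the standard wreath-product twist."""
    (g, p), (h, q) = x, y
    p_inv = tuple(p.index(i) for i in range(n))
    twisted = tuple((g[i] + h[p_inv[i]]) % m for i in range(n))
    return (twisted, compose(p, q))

def phi(x):
    """Projection onto the symmetric group: forget the G-part."""
    return x[1]

rng = random.Random(0)
perms = list(itertools.permutations(range(n)))
for _ in range(200):
    x = (tuple(rng.randrange(m) for _ in range(n)), rng.choice(perms))
    y = (tuple(rng.randrange(m) for _ in range(n)), rng.choice(perms))
    # phi(xy) = phi(x)phi(y): the twist only affects the G-part.
    assert phi(multiply(x, y)) == compose(phi(x), phi(y))
print("homomorphism property verified on 200 random pairs")
```

The check is convention-independent: any semidirect-product multiplication whose permutation part is the product of the permutation parts makes the forgetful map a homomorphism.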
Lemma 5.1.
For any positive integer we have .
Proof.
We use the first principle of mathematical induction on . The base case for is true by definition. Now assume the induction hypothesis, i.e., for some positive integer . Let be chosen arbitrarily. Then for the inductive step we have the following:
(43)
Now using the fact that is a homomorphism, we have the following:
Therefore the expression in (5) becomes
Thus the lemma follows from the first principle of mathematical induction. ∎
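Lemma 5.1 is an instance of a general fact: pushing a distribution forward along a group homomorphism commutes with convolution. The toy check below (Z/6 projected onto Z/3 by reduction mod 3, with a randomly chosen step distribution; every choice here is ours for illustration, not the paper's setting) verifies this numerically for a 4-fold convolution.

```python
import random

def convolve(p, q, add):
    """Convolution of two distributions on a finite group with law `add`."""
    r = [0.0] * len(p)
    for x, px in enumerate(p):
        for y, qy in enumerate(q):
            r[add(x, y)] += px * qy
    return r

def pushforward(p, phi, size):
    """Image of the distribution p under the homomorphism phi."""
    r = [0.0] * size
    for x, px in enumerate(p):
        r[phi(x)] += px
    return r

rng = random.Random(0)
weights = [rng.random() for _ in range(6)]
total = sum(weights)
P = [w / total for w in weights]      # a step distribution on Z/6
phi = lambda x: x % 3                  # homomorphism Z/6 -> Z/3

# 4-fold convolution on Z/6, then project ...
Pt = P
for _ in range(3):
    Pt = convolve(Pt, P, lambda x, y: (x + y) % 6)
lhs = pushforward(Pt, phi, 3)

# ... versus projecting first and convolving on Z/3.
Q = pushforward(P, phi, 3)
rhs = Q
for _ in range(3):
    rhs = convolve(rhs, Q, lambda x, y: (x + y) % 3)

assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
print("pushforward of the 4-fold convolution equals the 4-fold convolution of the pushforward")
```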
Theorem 5.2.
For the random walk on driven by we have the following:
- (1) For large ,
when and .
- (2) For any ,
Proof.
We know that, given two probability distributions and on and a mapping , we have , where is finite [15, Lemma 7.9]. Therefore we have the following:
(44)
using . Therefore, (39), (40), (41), and (5) imply that
(45)
for sufficiently large . The equality in (5) holds because of the following:
Now if is large, and , then by (5), we have the first part of this theorem.
Proof of Theorem 1.3.
Acknowledgement
I extend sincere thanks to my PhD advisor Arvind Ayyer for all the insightful discussions during the preparation of this paper. I am very grateful to the anonymous referees of the Journal of Algebraic Combinatorics for many constructive suggestions. I would like to thank an anonymous referee of Algebraic Combinatorics for the valuable comments, which helped improve the total variation upper bound result and simplify the proof of the total variation lower bound. I am grateful to Professor Tullio Ceccherini-Silberstein for his encouragement and inspiring comments. I would also like to thank Guy Blachar, Ashish Mishra, and Shangjie Yang for helpful discussions. I acknowledge support in part from a UGC Centre for Advanced Study grant.
Statements and Declarations
The author has no conflict of interest to disclose. This article is part of the author's PhD dissertation. An extended abstract of this article was accepted at FPSAC 2020 (online).
Data Availability Statements
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
Copyright Statements
This version of the article has been accepted for publication in the Journal of Algebraic Combinatorics: An International Journal, after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s10801-023-01271-1. Use of this Accepted Version is subject to the publisher’s Accepted Manuscript terms of use https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms.
References
- [1] David Aldous. Random walks on finite groups and rapidly mixing Markov chains. In Seminar on probability, XVII, volume 986 of Lecture Notes in Math., pages 243–297. Springer, Berlin, 1983.
- [2] David Aldous and Persi Diaconis. Shuffling cards and stopping times. Amer. Math. Monthly, 93(5):333–348, 1986.
- [3] David Aldous and Persi Diaconis. Strong uniform times and finite random walks. Adv. in Appl. Math., 8(1):69–97, 1987.
- [4] Megan Bernstein and Evita Nestoridi. Cutoff for random to random card shuffle. Ann. Probab., 47(5):3303–3320, 2019.
- [5] Persi Diaconis. Applications of non-commutative Fourier analysis to probability problems. In École d'Été de Probabilités de Saint-Flour XV–XVII, 1985–87, pages 51–100. Springer, 1988.
- [6] Persi Diaconis. Group representations in probability and statistics, volume 11 of Institute of Mathematical Statistics Lecture Notes—Monograph Series. Institute of Mathematical Statistics, Hayward, CA, 1988.
- [7] Persi Diaconis. The cutoff phenomenon in finite Markov chains. Proc. Nat. Acad. Sci. U.S.A., 93(4):1659–1664, 1996.
- [8] Persi Diaconis and Mehrdad Shahshahani. Generating a random permutation with random transpositions. Z. Wahrsch. Verw. Gebiete, 57(2):159–179, 1981.
- [9] James Allen Fill and Clyde H. Schoolfield, Jr. Mixing times for Markov chains on wreath products and related homogeneous spaces. Electron. J. Probab., 6:no. 11, 22, 2001.
- [10] L. Flatto, A. M. Odlyzko, and D. B. Wales. Random shuffles and group representations. Ann. Probab., 13(1):154–178, 1985.
- [11] Subhajit Ghosh. Cutoff for the warp-transpose top with random shuffle. Sém. Lothar. Combin., 84B:Art. 69, 12, 2020.
- [12] Subhajit Ghosh. Total variation cutoff for the transpose top-2 with random shuffle. J. Theoret. Probab., 33(4):1832–1854, 2020.
- [13] Subhajit Ghosh. Total variation cutoff for the flip-transpose top with random shuffle. ALEA Lat. Am. J. Probab. Math. Stat., 18(1):985–1006, 2021.
- [14] David Griffeath. A maximal coupling for Markov chains. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 31:95–106, 1974/75.
- [15] David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. Markov chains and mixing times. American Mathematical Society, Providence, RI, 2009. With a chapter by James G. Propp and David B. Wilson.
- [16] Oliver Matheau-Raven. Random walks on the symmetric group: cutoff for one-sided transposition shuffles. PhD thesis, University of York, 2020.
- [17] Ashish Mishra and Murali K. Srinivasan. The Okounkov-Vershik approach to the representation theory of . J. Algebraic Combin., 44(3):519–560, 2016.
- [18] Ashish Mishra and Shraddha Srivastava. On representation theory of partition algebras for complex reflection groups. Algebr. Comb., 3(2):389–432, 2020.
- [19] Evita Nestoridi. The limit profile of star transpositions. arXiv preprint arXiv:2111.03622, 2021.
- [20] Evita Nestoridi and Sam Olesker-Taylor. Limit profiles for reversible Markov chains. Probab. Theory Related Fields, 182(1-2):157–188, 2022.
- [21] Amritanshu Prasad. Representation theory: a combinatorial viewpoint, volume 147. Cambridge University Press, 2015.
- [22] Dana Randall. Rapidly mixing Markov chains with applications in computer science and physics. Computing in Science and Engg., 8(2):30 – 41, 2006.
- [23] Bruce E. Sagan. The symmetric group, volume 203 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition, 2001. Representations, combinatorial algorithms, and symmetric functions.
- [24] Laurent Saloff-Coste. Random walks on finite groups. In Probability on discrete structures, volume 110 of Encyclopaedia Math. Sci., pages 263–346. Springer, Berlin, 2004.
- [25] Clyde H. Schoolfield, Jr. Random walks on wreath products of groups. J. Theoret. Probab., 15(3):667–693, 2002.
- [26] Jean-Pierre Serre. Linear representations of finite groups. Springer-Verlag, New York-Heidelberg, 1977. Translated from the second French edition by Leonard L. Scott, Graduate Texts in Mathematics, Vol. 42.
- [27] Lucas Teyssier. Limit profile for random transpositions. Ann. Probab., 48(5):2323–2343, 2020.