Counting Parabolic Double Cosets in Symmetric Groups

Thomas Browning

(August 2021)

Abstract

Billey, Konvalinka, Petersen, Slofstra, and Tenner recently presented a method for counting parabolic double cosets in Coxeter groups, and used it to compute $p_{n}$ , the number of parabolic double cosets in $S_{n}$ , for $n\leq 13$ . In this paper, we derive a new formula for $p_{n}$ and an efficient polynomial time algorithm for evaluating this formula. We use these results to compute $p_{n}$ for $n\leq 5000$ and to prove an asymptotic formula for $p_{n}$ that was conjectured by Billey et al.

1 Introduction

For a Coxeter system $(W,S)$ , each subset $I\subseteq S$ generates a subgroup $W_{I}$ of $W$ . These subgroups of $W$ are called parabolic subgroups. A parabolic double coset in the Coxeter system $(W,S)$ is a double coset of the form $W_{I}wW_{J}$ for $w\in W$ and $I,J\subseteq S$ . Parabolic double cosets have several properties that make them interesting objects to study. For example, the parabolic double cosets of a finite Coxeter group $(W,S)$ are rank-symmetric intervals in the Bruhat order on $W$ [3], and the poset of presentations of parabolic double cosets gives a boolean complex that retains many properties of the Coxeter complex [6].

The problem of counting parabolic double cosets is considered in [1]. Counting parabolic double cosets is hard because a given parabolic double coset $C$ might arise from multiple choices of $I$ and $J$ . To avoid this problem, Billey, Konvalinka, Petersen, Slofstra, and Tenner take the approach of counting lex-minimal presentations [1]. The lex-minimal presentation of a parabolic double coset $C$ is the unique presentation $C=W_{I}wW_{J}$ such that $w$ is the minimal element of $C$ in the Bruhat order and such that $(\left\lvert{I}\right\rvert,\left\lvert{J}\right\rvert)$ is lexicographically minimal among all presentations of $C$ . Theorem 4.26 of [1] gives a formula for the number of parabolic double cosets with a given minimal element $w\in W$ , for any Coxeter group $W$ . In the case of the symmetric group, $W=S_{n}$ , summing this formula over all $n!$ elements of $S_{n}$ will compute $p_{n}$ , the total number of distinct parabolic double cosets in $S_{n}$ . The sequence $\{p_{n}\}$ may be found at [5, A260700]. The values of $p_{n}$ for $n\leq 13$ have been computed using this method [1]. However, the summation over all $n!$ elements of $S_{n}$ limits the number of terms of $\{p_{n}\}$ that can be computed with this approach.

In this paper we restrict our attention to counting parabolic double cosets in $S_{n}$ . Two ways of depicting parabolic double cosets in $S_{n}$ are the balls-in-boxes model considered in [1] and two-way contingency tables, which are matrices of nonnegative integers with nonzero row sums and column sums. Diaconis and Gangolli use the balls-in-boxes model to give a bijection between parabolic double cosets in the double quotient $W_{I}\backslash S_{n}/W_{J}=\{W_{I}wW_{J}:w\in S_{n}\}$ and two-way contingency tables with prescribed row sums and column sums [2]. They attribute the idea behind the bijection to Nantel Bergeron. This bijection is also used in [6], where it provides an alternate construction of the two-sided Coxeter complex of $S_{n}$ in terms of two-way contingency tables.

There are three main results of this paper. The first result is the explicit formula

p_{n}=\frac{1}{n!}\sum_{m=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{m}\sum_{c=0}^{\lfloor m/2\rfloor}\sum_{t=2c}^{m}\binom{m}{t}\left(\sum_{k=c}^{t-c}(-1)^{k}(c+k)!\genfrac{\{}{\}}{0.0pt}{}{t}{c+k}\binom{k-1}{k-c}\right)\left(\sum_{j=0}^{c}\frac{f_{m-t+c,j}f_{m-t+c,c-j}}{j!\,(c-j)!}\right),

(1)

where the values $f_{n,k}=\sum_{l=k}^{n}l!\genfrac{\{}{\}}{0.0pt}{}{n-k}{l-k}$ are generalizations of the Fubini numbers. The second result is an efficient polynomial time algorithm to evaluate this formula, which is used to compute the values of $p_{n}$ for $n\leq 5000$ . These values of $p_{n}$ are summarized in the appendix. The third result is an asymptotic formula for $p_{n}$ that proves the following conjecture.

Conjecture 1.1 (Conjecture 1.4 in [1]).

There exists a constant $K$ so that

\frac{p_{n}}{n!}\sim\frac{K}{(\log 2)^{2n}}.

Theorem 1.2.

Conjecture 1.1 holds with constant

K=\frac{e^{-(\log 2)^{2}/2}}{4(\log 2)^{2}}\approx 0.409223.
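As a quick numerical check of the constant in Theorem 1.2, the closed form can be evaluated directly. The snippet below is only an illustrative sketch; the variable names `K` and `growth` are ours and do not appear in the paper.

```python
import math

log2 = math.log(2)

# Constant from Theorem 1.2: K = exp(-(log 2)^2 / 2) / (4 (log 2)^2).
K = math.exp(-log2 ** 2 / 2) / (4 * log2 ** 2)

# Base of the exponential factor in p_n / n! ~ K / (log 2)^(2n).
growth = 1 / log2 ** 2

print(f"K = {K:.6f}, growth base = {growth:.4f}")
```

In floating point the value of `K` agrees with the stated approximation to six decimal places, and `growth` shows that $p_{n}/n!$ roughly doubles with each increment of $n$ .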

We conclude this introduction by outlining the two key ideas behind formula (1). The first idea is to associate each parabolic double coset with a canonical two-way contingency table. In particular, there is a bijection between parabolic double cosets in $S_{n}$ and two-way contingency tables with sum $n$ that are “maximal” (as defined at the end of section 2). This reduces the problem of computing $p_{n}$ to the problem of counting maximal two-way contingency tables with sum $n$ .

Two-way contingency tables have both a restriction on the rows (nonzero row sums) and a restriction on the columns (nonzero column sums). These two restrictions are entangled, since any nonzero entry in the table makes both its row sum and its column sum nonzero. The second idea is to break this entanglement by transforming the problem from counting two-way contingency tables to counting pairs of weak orders (a weak order is a binary relation that is transitive and has no incomparable pairs of elements).

This transformation is applied in sections 3.2 and 3.3 to the problem of counting maximal two-way contingency tables with sum $n$ , as well as to the simpler problem of counting all two-way contingency tables with sum $n$ . More generally, the transformation applies whenever one counts two-way contingency tables subject to a restriction on the locations of the nonzero entries, but not on their specific values.

2 Background

In this section, we give an overview of parabolic double cosets in $S_{n}$ , and describe two ways of depicting presentations of parabolic double cosets in $S_{n}$ . We will use $s_{i}$ to denote the $i$ th adjacent transposition in $S_{n}$ (the transposition that swaps $i$ and $i+1$ ).

Definition 2.1.

A parabolic subgroup $W_{I}$ in the symmetric group $S_{n}$ is a subgroup $W_{I}=\langle I\rangle=\langle s:s\in I\rangle$ of $S_{n}$ generated by a collection $I\subseteq\{s_{1},\ldots,s_{n-1}\}$ of adjacent transpositions in $S_{n}$ .

Definition 2.2.

A parabolic double coset $C$ in the symmetric group $S_{n}$ is a double coset of the form $C=W_{I}wW_{J}$ for parabolic subgroups $W_{I}$ and $W_{J}$ in $S_{n}$ and an element $w\in S_{n}$ . The triple $(I,w,J)$ is called a presentation of $C$ . The total number of distinct parabolic double cosets in $S_{n}$ is denoted by $p_{n}$ .
The sequence $\{p_{n}\}$ may be found at [5, A260700].

A parabolic double coset $C$ will usually have many presentations. For example, if $(I,w,J)$ is a presentation of $C$ , then so is $(I,w^{\prime},J)$ for any $w^{\prime}\in C$ . As the next example shows, there could also be multiple choices for the collections $I$ and $J$ . In other words, $C$ might be an element of more than one double quotient

W_{I}\backslash S_{n}/W_{J}=\{W_{I}wW_{J}:w\in S_{n}\}.

Example 2.3.

The parabolic double coset $C=S_{2}$ in $S_{2}$ has six different presentations:

$(\{\},12,\{s_{1}\})$	$(\{s_{1}\},12,\{s_{1}\})$	$(\{s_{1}\},12,\{\})$
$(\{\},21,\{s_{1}\})$	$(\{s_{1}\},21,\{s_{1}\})$	$(\{s_{1}\},21,\{\})$
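Example 2.3 can be confirmed by brute force for small $n$ . The following sketch enumerates every triple $(I,w,J)$ , forms the set $W_{I}wW_{J}$ , and collects the distinct sets that arise. All function names are ours, permutations are stored in one-line notation as tuples, and generators $s_{i}$ are represented by their indices $i$ ; this is an exhaustive check that is only feasible for very small $n$ , not the method of this paper.

```python
from itertools import combinations, permutations

def compose(u, v):
    """Return the product u*v (apply v first, then u) in one-line notation."""
    return tuple(u[v[a] - 1] for a in range(len(v)))

def adjacent_transposition(i, n):
    """Return s_i in S_n, the transposition swapping i and i+1."""
    s = list(range(1, n + 1))
    s[i - 1], s[i] = s[i], s[i - 1]
    return tuple(s)

def parabolic_subgroup(I, n):
    """Generate W_I by closing the identity under multiplication by generators."""
    identity = tuple(range(1, n + 1))
    gens = [adjacent_transposition(i, n) for i in I]
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

def double_coset(I, w, J, n):
    """Return W_I w W_J as a frozenset of permutations."""
    WI, WJ = parabolic_subgroup(I, n), parabolic_subgroup(J, n)
    return frozenset(compose(compose(x, w), y) for x in WI for y in WJ)

def generator_subsets(n):
    """All subsets of {1, ..., n-1}, as tuples of generator indices."""
    return [c for r in range(n) for c in combinations(range(1, n), r)]

def presentations(C, n):
    """All triples (I, w, J) with W_I w W_J equal to the given set C."""
    return [(I, w, J)
            for I in generator_subsets(n) for J in generator_subsets(n)
            for w in permutations(range(1, n + 1))
            if double_coset(I, w, J, n) == C]

def count_double_cosets(n):
    """Number of distinct parabolic double cosets in S_n (brute force)."""
    return len({double_coset(I, w, J, n)
                for I in generator_subsets(n) for J in generator_subsets(n)
                for w in permutations(range(1, n + 1))})
```

Calling `presentations(frozenset({(1, 2), (2, 1)}), 2)` lists the presentations of $C=S_{2}$ , and `count_double_cosets(n)` gives $p_{n}$ for small $n$ .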

For each element $w\in S_{n}$ , the balls-in-boxes picture of $w$ is given by placing $n$ balls in an $n\times n$ grid of boxes in the layout of the permutation matrix of $w$ . A ball is placed in column $a$ and row $b$ if and only if $w(a)=b$ . We label the columns left-to-right and the rows bottom-to-top, as in the Cartesian coordinate system. Left-multiplication acts by permuting the rows of the grid and right-multiplication acts by permuting the columns of the grid.

Consider a presentation $(I,w,J)$ of a parabolic double coset $C$ in $S_{n}$ . The group $W_{I}$ acts on the balls-in-boxes picture of $w$ by permuting certain rows of the grid. The group $W_{J}$ acts on the balls-in-boxes picture of $w$ by permuting certain columns of the grid. For each adjacent transposition $s_{i}\notin I$ , we draw a horizontal wall between row $i$ and row $i+1$ . For each adjacent transposition $s_{j}\notin J$ , we draw a vertical wall between column $j$ and column $j+1$ . Then $W_{I}$ acts on the balls-in-boxes picture of $w$ by permuting rows of the grid within the horizontal walls, and $W_{J}$ acts on the balls-in-boxes picture of $w$ by permuting columns of the grid within the vertical walls. The resulting picture is called the balls-in-boxes picture with walls for the presentation $(I,w,J)$ [1].

Example 2.4 (Example 3.5 in [1]).

Let $w=3512467\in S_{7}$ , let $I=\{s_{1},s_{3},s_{4}\}$ , let $J=\{s_{2},s_{3},s_{5},s_{6}\}$ , and let $C=W_{I}wW_{J}$ . Figure 1 depicts the balls-in-boxes picture with walls for the presentation $(I,w,J)$ of $C$ .

Figure 1: A balls-in-boxes picture with walls.

The walls naturally divide the $n\times n$ grid into larger rectangular cells. Notice that the number of balls in each cell remains unchanged under the actions of $W_{I}$ and $W_{J}$ . Thus the number of balls in a given cell is constant over all elements of $C$ . By counting the number of balls in each cell, we obtain a matrix of nonnegative integers such that the sum of all of the entries is $n$ and such that each row sum and column sum is strictly positive.

Definition 2.5.

A two-way contingency table is a matrix of nonnegative integers such that each row sum and column sum is strictly positive. Let $p_{n,0}$ denote the number of two-way contingency tables with sum $n$ .
The sequence $\{p_{n,0}\}$ may be found at [5, A120733].

Thus, each presentation $(I,w,J)$ of $C$ gives rise to a two-way contingency table with sum $n$ . Conversely, given a two-way contingency table with sum $n$ , it is possible to recover the collections $I$ and $J$ , as well as the parabolic double coset $C\in W_{I}\backslash S_{n}/W_{J}$ [2, Section 3B]. As a consequence, two-way contingency tables with sum $n$ are in bijection with triples $(C,I,J)$ consisting of two collections $I,J\subseteq\{s_{1},\ldots,s_{n-1}\}$ and a parabolic double coset $C\in W_{I}\backslash S_{n}/W_{J}$ [6]. Another way to put this is that $p_{n,0}=\sum_{I}\sum_{J}\left\lvert{W_{I}\backslash S_{n}/W_{J}}\right\rvert$ , because the right hand side counts triples $(C,I,J)$ .

Example 2.6.

The two-way contingency table associated to the triple $(C,I,J)$ from Example 2.4 is the matrix

\begin{pmatrix}0&0&1\\ 0&0&1\\ 1&1&1\\ 0&2&0\end{pmatrix}.
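The passage from a presentation to its two-way contingency table is mechanical, and the matrix above can be reproduced with a short sketch. The function name `contingency_table` and the encoding (one-line notation for $w$ , generator indices for $I$ and $J$ ) are our own illustrative choices.

```python
def contingency_table(w, I, J):
    """Count the balls in each cell of the picture for the presentation (I, w, J).

    w is a list with w[a-1] = w(a); I and J contain the indices i of the
    generators s_i. Rows of the result are listed top-to-bottom.
    """
    n = len(w)

    def blocks(generators):
        # A wall sits after position x exactly when s_x is NOT a generator.
        block_of, current = {}, 0
        for x in range(1, n + 1):
            block_of[x] = current
            if x < n and x not in generators:
                current += 1
        return block_of, current + 1

    row_block, n_rows = blocks(set(I))
    col_block, n_cols = blocks(set(J))
    table = [[0] * n_cols for _ in range(n_rows)]
    for a in range(1, n + 1):
        # The ball in column a sits in row w(a).
        table[row_block[w[a - 1]]][col_block[a]] += 1
    # Rows were indexed bottom-to-top, so reverse them for display.
    return table[::-1]

example = contingency_table([3, 5, 1, 2, 4, 6, 7], {1, 3, 4}, {2, 3, 5, 6})
```

Here `example` is the table for the presentation of Example 2.4, with rows listed from top to bottom as in the displayed matrix.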

Even though a given parabolic double coset might have multiple distinct two-way contingency tables arising from multiple choices for the collections $I$ and $J$ , the following proposition shows that it is always possible to choose $I$ and $J$ to be simultaneously maximal, resulting in a canonical two-way contingency table associated to each parabolic double coset.

Proposition 2.7.

Let $C$ be a parabolic double coset in $S_{n}$ . Let $I=\{s_{i}:s_{i}C=C\}$ , let $J=\{s_{i}:Cs_{i}=C\}$ , and let $w\in C$ be arbitrary. Then $(I,w,J)$ is a presentation of $C$ and is maximal in the sense that if $(I^{\prime},w^{\prime},J^{\prime})$ is any other presentation of $C$ , then $I^{\prime}\subseteq I$ and $J^{\prime}\subseteq J$ .

Proof.

Consider the stabilizer subgroups $H_{L}=\{\sigma\in S_{n}\colon\sigma C=C\}$ and $H_{R}=\{\sigma\in S_{n}\colon C\sigma=C\}$ . Then $I=H_{L}\cap\{s_{1},\ldots,s_{n-1}\}$ and $J=H_{R}\cap\{s_{1},\ldots,s_{n-1}\}$ . In particular, $I\subseteq H_{L}$ and $J\subseteq H_{R}$ so $W_{I}=\langle I\rangle\subseteq H_{L}$ and $W_{J}=\langle J\rangle\subseteq H_{R}$ . Since $w\in C$ , this shows that $W_{I}wW_{J}\subseteq C$ . We will now show that $C\subseteq W_{I}wW_{J}$ .

Let $(I^{\prime},w^{\prime},J^{\prime})$ be any presentation of $C$ . Then $C=W_{I^{\prime}}w^{\prime}W_{J^{\prime}}$ . If $\sigma\in I^{\prime}$ , then

\sigma C=(\sigma W_{I^{\prime}})(w^{\prime}W_{J^{\prime}})=(W_{I^{\prime}})(w^{\prime}W_{J^{\prime}})=C

so $\sigma\in H_{L}$ . Thus, $I^{\prime}\subseteq H_{L}$ and similarly $J^{\prime}\subseteq H_{R}$ . Intersecting with $\{s_{1},\ldots,s_{n-1}\}$ shows that $I^{\prime}\subseteq I$ and $J^{\prime}\subseteq J$ . Then $W_{I^{\prime}}\subseteq W_{I}$ and $W_{J^{\prime}}\subseteq W_{J}$ so

C=W_{I^{\prime}}w^{\prime}W_{J^{\prime}}=W_{I^{\prime}}wW_{J^{\prime}}\subseteq W_{I}wW_{J}.

Combining this with the earlier inclusion $W_{I}wW_{J}\subseteq C$ gives the equality $W_{I}wW_{J}=C$ . Thus, $(I,w,J)$ is a presentation of $C$ . Finally, $(I,w,J)$ is maximal since we already showed that if $(I^{\prime},w^{\prime},J^{\prime})$ is any presentation of $C$ , then $I^{\prime}\subseteq I$ and $J^{\prime}\subseteq J$ . ∎

For an alternative proof of Proposition 2.7, see Proposition 3.7 in [1]. Proposition 2.7 gives a bijection

\left\{\begin{subarray}{c}\text{\small{parabolic double}}\\ \text{\small{cosets $C\subseteq S_{n}$}}\end{subarray}\right\}\longleftrightarrow\left\{\begin{subarray}{c}\text{\small{Triples $(C,I,J)$ consisting of two collections}}\\ \text{\small{$I,J\subseteq\{s_{1},\ldots,s_{n-1}\}$ and a parabolic}}\\ \text{\small{double coset $C\in W_{I}\backslash S_{n}/W_{J}$ such that }}\\ \text{\small{$I=\{s_{i}:s_{i}C=C\}$ and $J=\{s_{i}:Cs_{i}=C\}$}}\end{subarray}\right\}.

This bijection is given by

	$\displaystyle C$	$\displaystyle\longmapsto(C,\{s_{i}:s_{i}C=C\},\{s_{i}:Cs_{i}=C\}),$
	$\displaystyle C$	$\displaystyle\longmapsfrom(C,I,J),$

where Proposition 2.7 guarantees that the triple $(C,\{s_{i}:s_{i}C=C\},\{s_{i}:Cs_{i}=C\})$ satisfies the condition $C\in W_{I}\backslash S_{n}/W_{J}$ required by the set on the right hand side of the bijection.

The bijection between triples $(C,I,J)$ and two-way contingency tables described after Definition 2.5 gives a bijection

\left\{\begin{subarray}{c}\text{\small{Triples $(C,I,J)$ consisting of two collections}}\\ \text{\small{$I,J\subseteq\{s_{1},\ldots,s_{n-1}\}$ and a parabolic}}\\ \text{\small{double coset $C\in W_{I}\backslash S_{n}/W_{J}$ such that }}\\ \text{\small{$I=\{s_{i}:s_{i}C=C\}$ and $J=\{s_{i}:Cs_{i}=C\}$}}\end{subarray}\right\}\longleftrightarrow\left\{\begin{subarray}{c}\text{\small{Two-way contingency tables with sum $n$}}\\ \text{\small{whose associated triple $(C,I,J)$ satisfies}}\\ \text{\small{$I=\{s_{i}:s_{i}C=C\}$ and $J=\{s_{i}:Cs_{i}=C\}$}}\end{subarray}\right\}.

This motivates the following definition.

Definition 2.8.

A two-way contingency table is said to be maximal if the associated triple $(C,I,J)$ satisfies $I=\{s_{i}:s_{i}C=C\}$ and $J=\{s_{i}:Cs_{i}=C\}$ .

We remark that the maximal two-way contingency table associated to a given parabolic double coset has the smallest possible dimensions and the largest possible entries.

Putting this all together gives a bijection

\left\{\begin{subarray}{c}\text{\small{parabolic double}}\\ \text{\small{cosets $C\subseteq S_{n}$}}\end{subarray}\right\}\longleftrightarrow\left\{\begin{subarray}{c}\text{\small{Maximal two-way}}\\ \text{\small{contingency tables}}\\ \text{\small{with sum $n$}}\end{subarray}\right\}.

Corollary 2.9.

The sequence $\{p_{n,0}\}$ counts the number of two-way contingency tables with sum $n$ . The sequence $\{p_{n}\}$ counts the number of maximal two-way contingency tables with sum $n$ .

We will now turn our attention to counting maximal two-way contingency tables with sum $n$ .

3 Computation of $p_{n}$

3.1 Characterization of Maximality

The following proposition describes how to determine whether the conditions $s_{i}C=C$ and $Cs_{i}=C$ are satisfied from the balls-in-boxes picture with walls. We will use the term cell to mean one of the rectangular regions formed by the walls of the balls-in-boxes picture with walls.

Proposition 3.1.

Let $C=W_{I}wW_{J}$ be a parabolic double coset in $S_{n}$ , and let $P$ be the balls-in-boxes picture with walls for the presentation $(I,w,J)$ .

1. Let $s_{i}\in\{s_{1},\ldots,s_{n-1}\}\setminus I$ . Then $s_{i}C=C$ if and only if the row of cells of $P$ directly below the $s_{i}$ -wall and the row of cells of $P$ directly above the $s_{i}$ -wall each have only one nonempty cell, the two of which are vertically adjacent.

2. Let $s_{j}\in\{s_{1},\ldots,s_{n-1}\}\setminus J$ . Then $Cs_{j}=C$ if and only if the column of cells of $P$ directly left of the $s_{j}$ -wall and the column of cells of $P$ directly right of the $s_{j}$ -wall each have only one nonempty cell, the two of which are horizontally adjacent.

Proof.

Let $s_{i}\in\{s_{1},\ldots,s_{n-1}\}\setminus I$ . First suppose that the row of cells of $P$ directly below the $s_{i}$ -wall and the row of cells of $P$ directly above the $s_{i}$ -wall each have only one nonempty cell, the two of which are vertically adjacent. Let $A$ be the nonempty cell of $P$ directly below the $s_{i}$ -wall and let $B$ be the nonempty cell of $P$ directly above the $s_{i}$ -wall. Since $A$ and $B$ are each the only nonempty cells in their row of cells, the ball in row $i$ must lie in $A$ and the ball in row $i+1$ must lie in $B$ . Then the ball in column $w^{-1}(i)$ must lie in $A$ and the ball in column $w^{-1}(i+1)$ must lie in $B$ . Since the cells $A$ and $B$ are vertically adjacent, $w^{-1}(i)$ and $w^{-1}(i+1)$ lie in the same column of cells. Then the transposition $w^{-1}s_{i}w$ that swaps $w^{-1}(i)$ and $w^{-1}(i+1)$ is an element of $W_{J}$ . Taking $w^{-1}s_{i}w\in W_{J}$ and multiplying on the left by $w$ gives $s_{i}w\in wW_{J}\subseteq C$ . Since $w\in C$ was arbitrary, this shows that $s_{i}C\subseteq C$ . Since $\left\lvert{s_{i}C}\right\rvert=\left\lvert{C}\right\rvert$ , we must have $s_{i}C=C$ .

For the converse, suppose that $P$ has a nonempty cell $A$ directly below the $s_{i}$ -wall and a nonempty cell $B$ directly above the $s_{i}$ -wall such that $A$ and $B$ are not vertically adjacent. This is just the negation of the original assumption on $P$ . Since $A$ and $B$ are nonempty, we can multiply $w$ on the left by an element $x\in W_{I}$ to make $A$ contain the ball in row $i$ and to make $B$ contain the ball in row $i+1$ . Then $s_{i}xw$ has fewer balls in cells $A$ and $B$ than $xw$ . However, the number of balls in a given cell is constant over all elements of $C$ . Since $xw\in C$ , this forces $s_{i}xw\not\in C$ . In particular, $s_{i}C\neq C$ . This proves the first statement. The proof of the second statement is similar. ∎

By Definition 2.8, a two-way contingency table is maximal precisely when no wall of its balls-in-boxes picture satisfies the condition in either part of Proposition 3.1. This gives a characterization of maximal two-way contingency tables.

Corollary 3.2.

Let $T$ be a two-way contingency table. Then $T$ is maximal if and only if $T$ satisfies both of the following two conditions:

1. There does not exist a pair of vertically adjacent nonzero entries, each of which is the only nonzero entry in its row.

2. There does not exist a pair of horizontally adjacent nonzero entries, each of which is the only nonzero entry in its column.

Example 3.3.

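There is $p_{1}=1$ maximal two-way contingency table with sum 1:

\begin{pmatrix}1\end{pmatrix}.

There are $p_{2}=3$ maximal two-way contingency tables with sum 2:

\begin{pmatrix}2\end{pmatrix},\quad\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\quad\begin{pmatrix}0&1\\ 1&0\end{pmatrix}.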

There are $p_{3}=19$ maximal two-way contingency tables with sum 3:
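The counts in Example 3.3 can be reproduced by enumerating all two-way contingency tables with a small sum and filtering them with the two conditions of Corollary 3.2. The sketch below is a brute-force illustration with function names of our own choosing; it is exponential in $n$ and is not the algorithm developed in this paper.

```python
from itertools import product

def weak_compositions(total, parts):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in weak_compositions(total - first, parts - 1):
            yield (first,) + rest

def tables(n):
    """Yield every two-way contingency table with sum n, as a tuple of rows."""
    for r, c in product(range(1, n + 1), repeat=2):
        for flat in weak_compositions(n, r * c):
            rows = [flat[i * c:(i + 1) * c] for i in range(r)]
            if all(sum(row) > 0 for row in rows) and \
               all(sum(col) > 0 for col in zip(*rows)):
                yield tuple(rows)

def lone_positions(lines):
    """For each line, the index of its single nonzero entry, or None."""
    result = []
    for line in lines:
        nonzero = [j for j, x in enumerate(line) if x != 0]
        result.append(nonzero[0] if len(nonzero) == 1 else None)
    return result

def is_maximal(table):
    """Check conditions 1 (rows) and 2 (columns) of Corollary 3.2."""
    rows = [list(r) for r in table]
    cols = [list(c) for c in zip(*rows)]
    for lines in (rows, cols):
        pos = lone_positions(lines)
        # Two adjacent lines whose only nonzero entries are aligned violate
        # the corresponding condition.
        if any(a is not None and a == b for a, b in zip(pos, pos[1:])):
            return False
    return True

def count_maximal(n):
    return sum(1 for t in tables(n) if is_maximal(t))
```

Calling `[t for t in tables(3) if is_maximal(t)]` lists the maximal tables with sum 3 explicitly.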

3.2 Sequence Transformation

Note that the conditions in Corollary 3.2 only depend on the configuration of nonzero entries of $T$ . Because of this, it will be helpful to ignore the specific values of the nonzero entries of $T$ .

Definition 3.4.

A pattern is a two-way contingency table consisting of 0’s and 1’s.

1. Let $\mathcal{P}_{m,0}$ denote the collection of all patterns with sum $m$ .

2. Let $\mathcal{P}_{m}$ denote the collection of all patterns with sum $m$ that satisfy the conditions of Corollary 3.2.

3. For a two-way contingency table $T$ , we define the pattern associated to $T$ to be the two-way contingency table formed by replacing all nonzero entries of $T$ with 1’s.

For each pattern $P$ with sum $m$ , there are exactly $\binom{n-1}{n-m}$ two-way contingency tables with sum $n$ whose associated pattern is $P$ . This is because two-way contingency tables with sum $n$ whose associated pattern is $P$ are in bijection with ways to surjectively distribute $n$ identical balls among $m$ distinguishable urns.
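The count $\binom{n-1}{n-m}$ can be checked on a concrete pattern by listing the tables directly. In the sketch below, the helper names are ours; a pattern is given as a list of 0/1 rows, and each of its $m$ nonzero cells receives a positive integer so that the entries sum to $n$ .

```python
from math import comb

def positive_compositions(n, m):
    """Yield all m-tuples of positive integers summing to n."""
    if m == 0:
        if n == 0:
            yield ()
        return
    for first in range(1, n - m + 2):
        for rest in positive_compositions(n - first, m - 1):
            yield (first,) + rest

def tables_with_pattern(pattern, n):
    """Yield every table with sum n whose associated pattern is `pattern`."""
    cells = [(i, j) for i, row in enumerate(pattern)
             for j, x in enumerate(row) if x]
    for values in positive_compositions(n, len(cells)):
        table = [[0] * len(pattern[0]) for _ in pattern]
        for (i, j), v in zip(cells, values):
            table[i][j] = v
        yield table

pattern = [[1, 0], [1, 1]]          # a pattern with sum m = 3
count = sum(1 for _ in tables_with_pattern(pattern, 5))
```

For this pattern and $n=5$ , `count` should agree with `comb(4, 2)`, the number of ways to write 5 as an ordered sum of 3 positive integers.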

Then combining Corollary 2.9 with Corollary 3.2 gives the formulas

p_{n,0}=\sum_{m=0}^{n}\binom{n-1}{n-m}\left\lvert{\mathcal{P}_{m,0}}\right\rvert\quad\text{and}\quad p_{n}=\sum_{m=0}^{n}\binom{n-1}{n-m}\left\lvert{\mathcal{P}_{m}}\right\rvert.

Here we use the conventions $\binom{-1}{0}=1$ and also $\binom{k-1}{k}=0$ for $k\geq 1$ , since $\binom{n-1}{n-m}$ is counting ways to surjectively distribute $n$ identical balls among $m$ distinguishable urns. We now invoke the identity

\binom{n-1}{n-m}=\frac{m!}{n!}\sum_{k=m}^{n}\genfrac{[}{]}{0.0pt}{}{n}{k}\genfrac{\{}{\}}{0.0pt}{}{k}{m},

where $\genfrac{[}{]}{0.0pt}{}{\cdot}{\cdot}$ denotes unsigned Stirling numbers of the first kind and $\genfrac{\{}{\}}{0.0pt}{}{\cdot}{\cdot}$ denotes Stirling numbers of the second kind (see the exercise on Lah numbers in [8]). If we define

q_{k,0}=\sum_{m=0}^{k}m!\genfrac{\{}{\}}{0.0pt}{}{k}{m}\left\lvert{\mathcal{P}_{m,0}}\right\rvert\quad\text{and}\quad q_{k}=\sum_{m=0}^{k}m!\genfrac{\{}{\}}{0.0pt}{}{k}{m}\left\lvert{\mathcal{P}_{m}}\right\rvert,

then we obtain the identities

p_{n,0}=\frac{1}{n!}\sum_{k=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{k}q_{k,0}\quad\text{and}\quad p_{n}=\frac{1}{n!}\sum_{k=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{k}q_{k}.

This reduces the computation of the sequences $\{p_{n,0}\}$ and $\{p_{n}\}$ to the computation of the sequences $\{q_{n,0}\}$ and $\{q_{n}\}$ . The following proposition gives a combinatorial interpretation of the sequences $\{q_{n,0}\}$ and $\{q_{n}\}$ .
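The Stirling-number identity for $\binom{n-1}{n-m}$ used in this reduction is easy to test in exact integer arithmetic. The following sketch uses the standard triangle recurrences for both kinds of Stirling numbers; the function names are ours, and the case $n=0$ is skipped because it relies on the convention $\binom{-1}{0}=1$ .

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling1(n, k):
    """Unsigned Stirling numbers of the first kind."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    return stirling1(n - 1, k - 1) + (n - 1) * stirling1(n - 1, k)

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling numbers of the second kind."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def identity_rhs(n, m):
    """Evaluate (m!/n!) * sum_{k=m}^{n} [n,k] {k,m} as an exact integer."""
    total = sum(stirling1(n, k) * stirling2(k, m) for k in range(m, n + 1))
    numerator = factorial(m) * total
    if numerator % factorial(n) != 0:
        raise ArithmeticError("right-hand side is not an integer")
    return numerator // factorial(n)

# Compare both sides for all 1 <= m <= n <= 10.
identity_holds = all(identity_rhs(n, m) == comb(n - 1, n - m)
                     for n in range(1, 11) for m in range(1, n + 1))
```

A finite check like this is of course not a proof, but it guards against index slips when the identity is implemented.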

Proposition 3.5.

The sequence $\{q_{n,0}\}$ counts the number of functions from $n$ distinguishable balls to a rectangular matrix of boxes, such that each row and column is nonempty. The sequence $\{q_{n}\}$ counts the number of such functions that satisfy the following two conditions:

1. There does not exist a pair of vertically adjacent nonempty boxes, each of which is the only nonempty box in its row.

2. There does not exist a pair of horizontally adjacent nonempty boxes, each of which is the only nonempty box in its column.

Proof.

Relabeling indices (from $q_{k,0}$ to $q_{n,0}$ and from $q_{k}$ to $q_{n}$ ) gives

q_{n,0}=\sum_{k=0}^{n}k!\genfrac{\{}{\}}{0.0pt}{}{n}{k}\left\lvert{\mathcal{P}_{k,0}}\right\rvert\quad\text{and}\quad q_{n}=\sum_{k=0}^{n}k!\genfrac{\{}{\}}{0.0pt}{}{n}{k}\left\lvert{\mathcal{P}_{k}}\right\rvert.

Then the proposition follows from the fact that $k!\genfrac{\{}{\}}{0.0pt}{}{n}{k}$ counts the number of surjective functions from $n$ distinguishable balls to the $k$ nonempty boxes prescribed by the pattern in $\mathcal{P}_{k,0}$ or $\mathcal{P}_{k}$ . ∎

3.3 Pairs of Weak Orders

Definition 3.6.

A weak order on a set $S$ is a binary relation $\leq$ on $S$ that is transitive and has no incomparable pairs of elements (so for all $s,t\in S$ , $s\leq t$ or $t\leq s$ ). The total number of weak orders on $\{1,\ldots,n\}$ is denoted by $f_{n}$ . The sequence $\{f_{n}\}$ is known as the Fubini numbers or the ordered Bell numbers, and may be found at [5, A000670].

Given a weak order on a set $S$ , it is possible for two elements $s,t\in S$ to be tied with each other, in the sense that both $s\leq t$ and $t\leq s$ . The “tied” relation is an equivalence relation and partitions the set $S$ into tied subsets. Furthermore, the weak order on $S$ gives a total order on this partition. In other words, we obtain an ordered set-partition $S=S_{1}\cup\cdots\cup S_{k}$ . Conversely, an ordered set-partition $S=S_{1}\cup\cdots\cup S_{k}$ determines a weak order on $S$ by setting $s\leq t$ whenever $s\in S_{i}$ and $t\in S_{j}$ and $i\leq j$ . Thus, there is a bijection between weak orders on $S$ and ordered set-partitions on $S$ . In what follows, we will freely translate back and forth between weak orders and ordered set-partitions.

In the special case where $S=\{1,\ldots,n\}$ , weak orders on $\{1,\ldots,n\}$ correspond to ordered set-partitions $\{1,\ldots,n\}=S_{1}\cup\cdots\cup S_{k}$ . Summing over the possible values for $k$ gives the formula $f_{n}=\sum_{k=0}^{n}k!\genfrac{\{}{\}}{0.0pt}{}{n}{k}$ .
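The formula $f_{n}=\sum_{k=0}^{n}k!\genfrac{\{}{\}}{0.0pt}{}{n}{k}$ translates directly into code. The sketch below builds one row of Stirling numbers of the second kind at a time; the function names are ours.

```python
from math import factorial

def stirling2_row(n):
    """Return [S(n,0), ..., S(n,n)] using S(n,k) = S(n-1,k-1) + k*S(n-1,k)."""
    row = [1]
    for size in range(1, n + 1):
        prev = row + [0]
        row = [0] + [prev[k - 1] + k * prev[k] for k in range(1, size + 1)]
    return row

def fubini(n):
    """Number of weak orders on an n-element set."""
    return sum(factorial(k) * s for k, s in enumerate(stirling2_row(n)))

first_values = [fubini(n) for n in range(6)]
```

The list `first_values` can be compared against the initial terms of [5, A000670].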

Definition 3.7.

Let $((S_{1},\ldots,S_{k}),(T_{1},\ldots,T_{k^{\prime}}))$ be a pair of ordered set-partitions of $\{1,\ldots,n\}$ . A left consecutive embedding is a consecutive pair $(i,i+1)_{l}$ such that $S_{i}\cup S_{i+1}\subseteq T_{j}$ for some (necessarily unique) $T_{j}$ . Similarly, a right consecutive embedding is a consecutive pair $(i,i+1)_{r}$ such that $T_{i}\cup T_{i+1}\subseteq S_{j}$ for some (necessarily unique) $S_{j}$ . The phrase consecutive embedding will refer to either a left consecutive embedding or a right consecutive embedding.

Since weak orders can be viewed as ordered set-partitions, it makes sense to talk about consecutive embeddings in a pair of weak orders.

Proposition 3.8.

The sequence $\{q_{n,0}\}$ counts the total number of pairs of weak orders on $\{1,\ldots,n\}$ , so $q_{n,0}=f_{n}^{2}$ . The sequence $\{q_{n}\}$ counts the number of such pairs that have no consecutive embeddings.

Proof.

Consider one of the $q_{n,0}$ functions from $n$ distinguishable balls to a rectangular matrix of boxes, such that each row and column is nonempty. Considering the row of each ball gives a weak order on $\{1,\ldots,n\}$ . Similarly, considering the column of each ball gives a weak order on $\{1,\ldots,n\}$ . Conversely, given this pair of weak orders on $\{1,\ldots,n\}$ , we can recover the dimensions of the matrix and the position of each ball. In particular, the dimensions of the matrix are equal to the number of parts of the two ordered set-partitions, and the position of the $i$ th ball in the matrix is given by the location of $i$ within the two ordered set-partitions. This gives a bijection

\left\{\begin{subarray}{c}\text{\small{Functions from $n$ distinguishable balls}}\\ \text{\small{to a rectangular matrix of boxes such}}\\ \text{\small{that each row and column is nonempty}}\end{subarray}\right\}\longleftrightarrow\left\{\begin{subarray}{c}\text{\small{Pairs of weak orders}}\\ \text{\small{on $\{1,\ldots,n\}$}}\end{subarray}\right\}.

In particular, the sequence $\{q_{n,0}\}$ counts the total number of pairs of weak orders on $\{1,\ldots,n\}$ .

As for $\{q_{n}\}$ , observe that this bijection takes failures of conditions 1 and 2 of Proposition 3.5 to left and right consecutive embeddings, respectively. ∎

Example 3.9.

There is $q_{1,0}=1$ pair of weak orders on $\{1\}$ :

This pair of weak orders has no consecutive embeddings, so we also have $q_{1}=1$ . Here we write the first weak order in the first column, and the second weak order in the second column. Tied elements of a weak order are written in the same row as each other.

There are $q_{2,0}=9$ pairs of weak orders on $\{1,2\}$ :

[Figure: the nine pairs of weak orders on $\{1,2\}$ , arranged in two rows.]

Of these, the five in the first row have no consecutive embeddings, whereas the first two in the second row each have a right consecutive embedding, and the last two in the second row each have a left consecutive embedding. Thus, $q_{2}=5$ . Then

p_{2,0}=\frac{1}{2!}\left(\genfrac{[}{]}{0.0pt}{}{2}{1}q_{1,0}+\genfrac{[}{]}{0.0pt}{}{2}{2}q_{2,0}\right)=\frac{1}{2}(1+9)=5\qquad\text{and}\qquad p_{2}=\frac{1}{2!}\left(\genfrac{[}{]}{0.0pt}{}{2}{1}q_{1}+\genfrac{[}{]}{0.0pt}{}{2}{2}q_{2}\right)=\frac{1}{2}(1+5)=3.
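The values of $q_{n}$ in this example can be double-checked by enumerating all pairs of ordered set-partitions and testing Definition 3.7 directly. This is a brute-force sketch with names of our own choosing; parts are indexed from 0 in the code rather than from 1, and the enumeration grows like $f_{n}^{2}$ , so it is only practical for very small $n$ .

```python
from itertools import product

def ordered_set_partitions(elements):
    """Yield every ordered set-partition of `elements` as a tuple of frozensets."""
    elements = list(elements)
    if not elements:
        yield ()
        return
    first, rest = elements[0], elements[1:]
    for partition in ordered_set_partitions(rest):
        # Either add `first` to an existing part ...
        for i, part in enumerate(partition):
            yield partition[:i] + (part | {first},) + partition[i + 1:]
        # ... or insert it as a new singleton part at any position.
        for i in range(len(partition) + 1):
            yield partition[:i] + (frozenset({first}),) + partition[i:]

def embeddings(S, T):
    """Indices i such that S[i] and S[i+1] both lie inside one part of T."""
    return [i for i in range(len(S) - 1)
            if any(S[i] | S[i + 1] <= part for part in T)]

def consecutive_embeddings(S, T):
    """All left and right consecutive embeddings of the pair (S, T)."""
    return ([(i, "l") for i in embeddings(S, T)] +
            [(i, "r") for i in embeddings(T, S)])

def q(n):
    """Number of pairs of weak orders on {1..n} with no consecutive embeddings."""
    weak_orders = list(ordered_set_partitions(range(1, n + 1)))
    return sum(1 for S, T in product(weak_orders, repeat=2)
               if not consecutive_embeddings(S, T))
```

For instance, the pair consisting of the order $1<2$ and the order in which 1 and 2 are tied has exactly one consecutive embedding, a left one.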

3.4 Inclusion-Exclusion

Recall that we have reduced the computation of the sequences $\{p_{n,0}\}$ and $\{p_{n}\}$ to the computation of the sequences $\{q_{n,0}\}$ and $\{q_{n}\}$ . The formula $q_{n,0}=f_{n}^{2}$ from Proposition 3.8 gives a fast polynomial-time algorithm for computing the sequence $\{q_{n,0}\}$ . We now restrict our attention to computing the sequence $\{q_{n}\}$ . This subsection will give an inclusion-exclusion formula for $q_{n}$ .
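To illustrate how $q_{n,0}=f_{n}^{2}$ yields $p_{n,0}$ , the sketch below combines it with the identity $p_{n,0}=\frac{1}{n!}\sum_{k}\genfrac{[}{]}{0.0pt}{}{n}{k}q_{k,0}$ from section 3.2. The Fubini numbers are computed here with the standard recurrence $f_{n}=\sum_{j=1}^{n}\binom{n}{j}f_{n-j}$ , which is not stated in the text and is used only for convenience; the function names are ours.

```python
from math import comb, factorial

def fubini_numbers(limit):
    """Return [f_0, ..., f_limit] via f_n = sum_{j=1}^{n} C(n,j) f_{n-j}."""
    f = [1]
    for n in range(1, limit + 1):
        f.append(sum(comb(n, j) * f[n - j] for j in range(1, n + 1)))
    return f

def stirling1_triangle(limit):
    """Rows of unsigned Stirling numbers of the first kind up to n = limit."""
    rows = [[1]]
    for n in range(1, limit + 1):
        prev = rows[-1] + [0]
        rows.append([0] + [prev[k - 1] + (n - 1) * prev[k]
                           for k in range(1, n + 1)])
    return rows

def p0_values(limit):
    """Return [p_{0,0}, ..., p_{limit,0}] using q_{k,0} = f_k^2."""
    f = fubini_numbers(limit)
    c = stirling1_triangle(limit)
    return [sum(c[n][k] * f[k] ** 2 for k in range(n + 1)) // factorial(n)
            for n in range(limit + 1)]
```

The entry of `p0_values(5)` at index 2 reproduces the value $p_{2,0}=5$ computed by hand in Example 3.9.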

Definition 3.10.

1. A pair of weak orders on $\{1,\ldots,n\}$ with distinguished consecutive embeddings consists of a pair of weak orders $(\leq,\leq^{\prime})$ on $\{1,\ldots,n\}$ , together with a set $\mathcal{C}$ of consecutive embeddings in $(\leq,\leq^{\prime})$ . In other words, $\mathcal{C}$ is a subset of the set of all consecutive embeddings in $(\leq,\leq^{\prime})$ .

2. A pair of weak orders on $\{1,\ldots,n\}$ with $k$ distinguished consecutive embeddings is a pair of weak orders on $\{1,\ldots,n\}$ with distinguished consecutive embeddings whose set $\mathcal{C}$ has cardinality $k$ .

3. Let $q_{n,k}$ count the number of pairs of weak orders on $\{1,\ldots,n\}$ with $k$ distinguished consecutive embeddings.

The notation $q_{n,k}$ does not conflict with the notation $q_{n,0}$ because pairs of weak orders with 0 distinguished consecutive embeddings are in bijection with (ordinary) pairs of weak orders.

Example 3.11.

We will circle distinguished consecutive embeddings. There are $q_{2,0}=9$ pairs of weak orders on $\{1,2\}$ with 0 distinguished consecutive embeddings:

[Figure: the nine pairs of weak orders on $\{1,2\}$ , with no consecutive embeddings circled.]

There are $q_{2,1}=4$ pairs of weak orders on $\{1,2\}$ with 1 distinguished consecutive embedding:

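[Figure: the four pairs of weak orders on $\{1,2\}$ that have a consecutive embedding, with that embedding circled.]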

Example 3.12.

Figure 2 depicts a more complicated example which we will return to later. The example in Figure 2 is a pair of weak orders on $\{1,2,3,4,5,6,7,8,9\}$ with 5 total consecutive embeddings, 4 of which are distinguished:


Figure 2: A pair of weak orders with distinguished consecutive embeddings.

The first few values of $q_{n,k}$ are given in Table 1. There are two main observations from this table that we will prove below. First, $q_{n,k}=0$ for $k>n$ (Corollary 3.16). Second, $\sum_{k=0}^{n}(-1)^{k}q_{n,k}=q_{n}$ (Lemma 3.17).

$n$	$q_{n}$	$q_{n,0}$	$q_{n,1}$	$q_{n,2}$	$q_{n,3}$	$q_{n,4}$
$0$	1	1	0	0	0	0
$1$	1	1	0	0	0	0
$2$	5	9	4	0	0	0
$3$	97	169	84	12	0	0
$4$	3365	5625	2812	600	48	0
$5$	177601	292681	145380	34380	4320	240

Table 1: Small Values of $q_{n}$ and $q_{n,k}$ .

We will need the notion of a chain in a pair of weak orders with distinguished consecutive embeddings.

Definition 3.13.

In a pair of weak orders on $\{1,\ldots,n\}$ with distinguished consecutive embeddings, a chain is a maximal sequence of overlapping distinguished consecutive embeddings.

The example in Figure 2 has one chain on the left side and two chains on the right side. The following lemma is a key property of chains.

Lemma 3.14.

In a pair of weak orders on $\{1,\ldots,n\}$ with distinguished consecutive embeddings, any two chains have no elements of $\{1,\ldots,n\}$ in common, even if the two chains are on different sides.

Proof.

It is clear that two chains on the same side have no elements of $\{1,\ldots,n\}$ in common. It remains to show that any two consecutive embeddings on different sides have no elements of $\{1,\ldots,n\}$ in common. Let $(i,i+1)_{l}$ be a left consecutive embedding and let $(i^{\prime},i^{\prime}+1)_{r}$ be a right consecutive embedding. Then $S_{i}\cup S_{i+1}\subseteq T_{j}$ for some (necessarily unique) $T_{j}$ and $T_{i^{\prime}}\cup T_{i^{\prime}+1}\subseteq S_{j^{\prime}}$ for some (necessarily unique) $S_{j^{\prime}}$ . If these two consecutive embeddings have an element of $\{1,\ldots,n\}$ in common, then $T_{j}$ must be either $T_{i^{\prime}}$ or $T_{i^{\prime}+1}$ , and $S_{j^{\prime}}$ must be either $S_{i}$ or $S_{i+1}$ . Then

\left\lvert{S_{i}\cup S_{i+1}}\right\rvert\leq\left\lvert{T_{j}}\right\rvert<\left\lvert{T_{i^{\prime}}\cup T_{i^{\prime}+1}}\right\rvert\leq\left\lvert{S_{j^{\prime}}}\right\rvert<\left\lvert{S_{i}\cup S_{i+1}}\right\rvert

which is a contradiction. ∎

We can now determine $q_{n,k}$ for $k\geq n-1$ .

Proposition 3.15.

If $k\geq n$ and $k\geq 1$ , then $q_{n,k}=0$ . If $k=n-1$ and $k\geq 1$ , then $q_{n,k}=2\cdot n!$ .

Proof.

Consider a pair of weak orders on $\{1,\ldots,n\}$ with $k\geq 1$ distinguished consecutive embeddings. Let $c\geq 1$ be the number of chains. Note that $c+k$ counts the number of parts that are contained in the chains, because each chain has one more part than distinguished consecutive embedding. Note that these $c+k$ parts have no elements in common, since any two parts within a chain have no elements in common, and Lemma 3.14 states that distinct chains have no elements in common. This gives the inequality $c+k\leq n$ .

If $k\geq n$ , then this is impossible because $c\geq 1$ , which shows that $q_{n,k}=0$ . Now suppose that $k=n-1$ . Then $c=1$ , so there is one chain consisting of $n-1$ overlapping distinguished consecutive embeddings, and each of its $n$ parts is a single element. Thus, one side (either left or right) has $n$ parts in one of the $n!$ possible permutations, and the other side has just one part consisting of the unordered set $\{1,\ldots,n\}$ . This gives $2\cdot n!$ possibilities. ∎

Corollary 3.16.

If $k>n$ , then $q_{n,k}=0$ .

Proof.

If $k>n$ , then $k\geq 1$ so Proposition 3.15 gives $q_{n,k}=0$ . ∎

We now give the inclusion-exclusion formula for $q_{n}$ in terms of $q_{n,k}$ .

Lemma 3.17.

We have the inclusion-exclusion formula

q_{n}=\sum_{k=0}^{n}(-1)^{k}q_{n,k}.

Proof.

If a pair of weak orders on $\{1,\ldots,n\}$ has $m$ total consecutive embeddings, then $m\leq n$ by Corollary 3.16, so that pair of weak orders is counted $\sum_{k=0}^{m}(-1)^{k}\binom{m}{k}$ times by the sum. This is equal to 0 if $m\geq 1$ and equal to 1 if $m=0$ . Thus, the sum counts pairs of weak orders with no consecutive embeddings. ∎

3.5 The Final Count

It remains to count pairs of weak orders on $\{1,\ldots,n\}$ with $k$ distinguished consecutive embeddings. To deal with the fact that consecutive embeddings can overlap, we will work with chains. The first step of our formula will be to sum over the number of chains, which we will call $c$ , and the total number of elements from $\{1,\ldots,n\}$ to put inside these chains, which we will call $t$ . Keep in mind Lemma 3.14, which states that distinct chains have no elements in common, even if they are on different sides. In other words, each of the $t$ elements will end up in exactly one of the $c$ chains.

There are $\binom{n}{t}$ ways to choose which $t$ elements from $\{1,\ldots,n\}$ to put inside the $c$ chains. In the proof of Proposition 3.15 we saw that the $c$ chains contain $c+k$ total parts, so $t\geq c+k$ . Then there are $(c+k)!\genfrac{\{}{\}}{0.0pt}{}{t}{c+k}$ ways to partition these $t$ elements into an ordered sequence of $c+k$ parts.

Now we cut the ordered sequence of $c+k$ parts into $c$ chains, each of which must consist of at least 2 parts. Since the parts are already ordered, this just requires choosing how many parts to put into each chain. The number of ways to do this equals the number of ways to distribute $c+k$ identical balls among $c$ distinguishable urns, each of which must get at least 2 balls. This is the same as the number of ways to surjectively distribute $k$ identical balls among $c$ distinguishable urns. There are $\binom{k-1}{k-c}$ such ways.
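The count $\binom{k-1}{k-c}$ can be spot-checked by brute force. The following Python sketch is only an illustration (the function names are hypothetical and not part of the paper's implementation): it enumerates all ways to assign at least 2 parts to each of $c$ chains so that $c+k$ parts are used in total, and compares the count with the closed form for $1\leq c\leq k$ .

```python
from itertools import product
from math import comb


def chain_splits_brute(c, k):
    """Count ways to cut c+k ordered parts into c chains of at least 2 parts each."""
    total = c + k
    return sum(
        1
        for sizes in product(range(2, total + 1), repeat=c)
        if sum(sizes) == total
    )


def chain_splits_formula(c, k):
    """Closed form binom(k-1, k-c), used here only for 1 <= c <= k."""
    return comb(k - 1, k - c)
```

For instance, with $c=2$ and $k=3$ the five parts can be split as $2+3$ or $3+2$ , which matches $\binom{2}{1}=2$ .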

At this point, we have an ordered sequence of $c$ chains, along with $n-t$ non-chain elements. We can minimize the distinction between the $c$ chains and the $n-t$ elements by collapsing each chain into a single element, as depicted in Figure 3. For example, consider the top right chain in Figure 3. Both the left and right side of this chain get collapsed to a single element (which is labeled $c_{2}$ in Figure 3). The key difference between the left $c_{2}$ and the right $c_{2}$ is that the right $c_{2}$ must be in a part by itself. Thus, the circling in the collapsed form indicates which side of the collapsed chain must be in a part by itself.

Figure 3: Collapsing chains. (Figure omitted; in the collapsed form, the three chains become single elements $c_{1}$ , $c_{2}$ , $c_{3}$ alongside the non-chain element 9.)

We will need a generalization of the Fubini numbers that allows for the possibility of elements that are required to be placed in a part by themselves. For $0\leq k\leq n$ , let $f_{n,k}$ count the number of weak orders on $\{1,\ldots,n\}$ where each of the elements $1,\ldots,k$ must be placed in a part by itself. These generalized Fubini numbers have the formula $f_{n,k}=\sum_{l=k}^{n}l!\genfrac{\{}{\}}{0.0pt}{}{n-k}{l-k}$ , because there are $\genfrac{\{}{\}}{0.0pt}{}{n-k}{l-k}$ ways to partition the set $\{k+1,\ldots,n\}$ into $l-k$ parts, and $l!$ ways to order these $l-k$ parts along with the remaining $k$ elements $\{1,\ldots,k\}$ . The ordinary Fubini numbers are the special case when $k=0$ .
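As a sanity check on the formula for $f_{n,k}$ , the following Python sketch (an illustration with hypothetical function names, not the paper's code) evaluates the formula and compares it with a direct enumeration of weak orders in which the elements $1,\ldots,k$ are forced to be alone in their parts.

```python
from functools import lru_cache
from itertools import combinations
from math import factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def f(n, k):
    """Generalized Fubini number f_{n,k} from the formula in the text."""
    return sum(factorial(l) * stirling2(n - k, l - k) for l in range(k, n + 1))


def f_brute(n, k):
    """Count weak orders on {1..n} in which 1..k each sit in a part alone."""
    def count(remaining):
        if not remaining:
            return 1
        total = 0
        # Choose the first part, then order the rest recursively.
        for size in range(1, len(remaining) + 1):
            for block in combinations(remaining, size):
                if size > 1 and any(x <= k for x in block):
                    continue  # a marked element would share its part
                rest = tuple(x for x in remaining if x not in block)
                total += count(rest)
        return total
    return count(tuple(range(1, n + 1)))
```

The enumeration is exponential and only meant for small $n$ ; the formula is what the algorithm in the next subsection relies on.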

Returning to the situation prior to Figure 3, we now sum over the number of chains on the left side, which we will call $j$ . This forces the number of chains on the right side to be $c-j$ . We already have an ordering on the $c$ chains, so we will take the first $j$ chains to be the chains on the left side, and the last $c-j$ chains to be the chains on the right side. After collapsing each of the $c$ chains into a single element on each side, we have $n-t+c$ total elements to distribute. On the left side, $j$ of these elements are marked as having to be placed in a part by themselves. On the right side, $c-j$ of these elements are marked as having to be placed in a part by themselves. Then there are $f_{n-t+c,j}f_{n-t+c,c-j}$ ways to produce a valid pair of weak orders. However, since we already have an ordering for the $c$ chains, we must divide by $j!$ and $(c-j)!$ . This gives the formula

q_{n,k}=\sum_{c=0}^{k}\sum_{t=c+k}^{n}\underbrace{\binom{n}{t}}_{\begin{subarray}{c}\text{choose $t$}\\ \text{elements}\end{subarray}}\underbrace{(c+k)!\genfrac{\{}{\}}{0.0pt}{}{t}{c+k}}_{\begin{subarray}{c}\text{partition the $t$}\\ \text{elements into an}\\ \text{ordered sequence}\\ \text{of $c+k$ parts}\end{subarray}}\underbrace{\binom{k-1}{k-c}}_{\begin{subarray}{c}\text{split the}\\ \text{$c+k$ parts}\\ \text{into $c$ chains}\end{subarray}}\sum_{j=0}^{c}\frac{\overbrace{f_{n-t+c,j}f_{n-t+c,c-j}}^{\begin{subarray}{c}\text{collapsing each chain into}\\ \text{a single element on each side}\\ \text{makes $n$ become $n-t+c$}\end{subarray}}}{\underbrace{j!\,(c-j)!}_{\begin{subarray}{c}\text{the chains are already}\\ \text{ordered at this point,}\\ \text{so the numerator is}\\ \text{overcounting by this factor}\end{subarray}}}.
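The formula for $q_{n,k}$ can be checked against Proposition 3.15 on small cases. The Python sketch below is a direct, unoptimized transcription under the convention $\binom{-1}{0}=1$ (the helper names are hypothetical); exact rational arithmetic is used for the division by $j!\,(c-j)!$ .

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def f(n, k):
    """Generalized Fubini number f_{n,k}."""
    return sum(factorial(l) * stirling2(n - k, l - k) for l in range(k, n + 1))


def shifted_binom(k, c):
    """binom(k-1, k-c), with the convention binom(-1, 0) = 1."""
    return 1 if k == 0 else comb(k - 1, k - c)


def q(n, k):
    """q_{n,k}: pairs of weak orders with k distinguished consecutive embeddings."""
    total = Fraction(0)
    for c in range(k + 1):
        for t in range(c + k, n + 1):
            m = n - t + c  # size after collapsing each chain to one element
            inner = sum(
                Fraction(f(m, j) * f(m, c - j), factorial(j) * factorial(c - j))
                for j in range(c + 1)
            )
            total += (comb(n, t) * factorial(c + k) * stirling2(t, c + k)
                      * shifted_binom(k, c) * inner)
    assert total.denominator == 1
    return int(total)
```

For example, `q(3, 1)` evaluates the two nonzero terms ( $t=2$ and $t=3$ with $c=1$ ) and the alternating sum of `q(3, k)` over $k$ gives $q_{3}$ .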

We should point out that the second sum is zero for $c>n-k$ . Plugging this into Lemma 3.17 and rearranging the summations gives the formula

q_{n}=\sum_{c=0}^{\lfloor n/2\rfloor}\sum_{t=2c}^{n}\binom{n}{t}\left(\sum_{k=c}^{t-c}(-1)^{k}(c+k)!\genfrac{\{}{\}}{0.0pt}{}{t}{c+k}\binom{k-1}{k-c}\right)\left(\sum_{j=0}^{c}\frac{f_{n-t+c,j}f_{n-t+c,c-j}}{j!\,(c-j)!}\right).

(2)

This proves our formula for $p_{n}$ (equation 1 from the introduction).

Theorem 3.18.

For each nonnegative integer $n$ , we have the formula

p_{n}=\frac{1}{n!}\sum_{m=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{m}\sum_{c=0}^{\lfloor m/2\rfloor}\sum_{t=2c}^{m}\binom{m}{t}\left(\sum_{k=c}^{t-c}(-1)^{k}(c+k)!\genfrac{\{}{\}}{0.0pt}{}{t}{c+k}\binom{k-1}{k-c}\right)\left(\sum_{j=0}^{c}\frac{f_{m-t+c,j}f_{m-t+c,c-j}}{j!\,(c-j)!}\right),

where the values $f_{n,k}=\sum_{l=k}^{n}l!\genfrac{\{}{\}}{0.0pt}{}{n-k}{l-k}$ are generalizations of the Fubini numbers.

Equation 2 has been written so that there are several independent components inside the formula. The next subsection shows how we can precompute these components to speed up computation.

3.6 The Algorithm

We now give an algorithm to compute the values $p_{1},\ldots,p_{N}$ in $O(N^{2})$ memory and $O(N^{3})$ time. The statement “ $O(N^{2})$ memory and $O(N^{3})$ time” assumes that storing a number requires constant memory and that arithmetic operations require constant time. This assumption is not true in practice since the computation involves numbers that grow super-exponentially. However, the number of digits of these numbers grows polynomially. Then storing a number requires polynomial memory and arithmetic operations require polynomial time. Thus, even with these practical considerations, the algorithm uses polynomial memory and polynomial time. Empirically, the runtime for the algorithm grows like $N^{4.75}$ .

Here is the algorithm:

1. Compute the values $\displaystyle\binom{n}{k},\genfrac{[}{]}{0.0pt}{}{n}{k},\genfrac{\{}{\}}{0.0pt}{}{n}{k}$ , for $0\leq k\leq n\leq N$ using the standard recurrences.

2. Compute the values $f_{n,k}=\displaystyle\sum_{l=k}^{n}l!\genfrac{\{}{\}}{0.0pt}{}{n-k}{l-k}$ for $0\leq k\leq n\leq N$ .

3. Compute the values $g_{n,c}=\displaystyle\sum_{j=0}^{c}\frac{f_{n,j}f_{n,c-j}}{j!\,(c-j)!}$ for $0\leq c\leq n\leq N$ .

4. Compute the values $h_{t,c}=\displaystyle\sum_{k=c}^{t-c}(-1)^{k}(c+k)!\genfrac{\{}{\}}{0.0pt}{}{t}{c+k}\binom{k-1}{k-c}$ for $0\leq 2c\leq t\leq N$ .

   In this calculation, note that $\binom{-1}{0}=1$ and also $\binom{k-1}{k}=0$ for $k\geq 1$ . We will give a faster and simpler algorithm for $h_{t,c}$ in the next subsection.

5. Compute the values $q_{n}=\displaystyle\sum_{c=0}^{\lfloor n/2\rfloor}\sum_{t=2c}^{n}\binom{n}{t}h_{t,c}g_{n-t+c,c}$ for $0\leq n\leq N$ .

6. Compute the values $p_{n}=\displaystyle\frac{1}{n!}\sum_{k=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{k}q_{k}$ for $0\leq n\leq N$ .

I have written two versions of this algorithm in Java (see Web Resources at the end of this paper).

Version 1 is a straightforward single-threaded implementation of this algorithm, with the faster and simpler step 4 given in the next subsection. Version 1 can compute the values $p_{1},\ldots,p_{1000}$ in 40 minutes on a laptop with a 2.4 GHz processor using 2 GB of memory. Version 1 is limited by memory.

Version 2 is a memory-optimized multi-threaded implementation of the algorithm that was used to compute the values $p_{1},\ldots,p_{5000}$ . The computation took 9.4 days on a server with 20 CPU cores, each running at 2.4 GHz. Version 2 achieves $O(N)$ memory and $O(N^{3})$ time (with the same caveat as before). It achieves $O(N)$ memory by not storing precomputed two-dimensional arrays and instead recomputing values on the fly. Version 2 maintains the $O(N^{3})$ time of Version 1 by rewriting formula (2) for $q_{n}$ so that the outside sum is a sum over the values of $n-t+c$ , which avoids expensive recomputation of the values $g_{n-t+c,c}$ . Version 2 is limited by time, rather than by memory.
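For readers who want to experiment with small $N$ , the six steps of the algorithm can be sketched in a few lines of Python. This is a minimal illustration with hypothetical names; it is not the Java implementation described above, it uses the direct (cubic) form of step 4, and it relies on Python's built-in arbitrary-precision integers.

```python
from math import comb, factorial


def count_parabolic_double_cosets(N):
    """Return [p_0, ..., p_N] by following steps 1-6 of the algorithm."""
    fact = [factorial(i) for i in range(N + 1)]

    # Step 1: unsigned Stirling numbers of the first kind (s1) and Stirling
    # numbers of the second kind (s2); binomials come from math.comb.
    s1 = [[0] * (N + 1) for _ in range(N + 1)]
    s2 = [[0] * (N + 1) for _ in range(N + 1)]
    s1[0][0] = s2[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            s1[n][k] = (n - 1) * s1[n - 1][k] + s1[n - 1][k - 1]
            s2[n][k] = k * s2[n - 1][k] + s2[n - 1][k - 1]

    # Step 2: generalized Fubini numbers f[n][k].
    f = [[sum(fact[l] * s2[n - k][l - k] for l in range(k, n + 1))
          for k in range(n + 1)] for n in range(N + 1)]

    # Step 3: g[n][c]. Each f[n][j] is a sum of multiples of l! with l >= j,
    # so it is divisible by j! and the integer divisions below are exact.
    g = [[sum((f[n][j] // fact[j]) * (f[n][c - j] // fact[c - j])
              for j in range(c + 1))
          for c in range(n + 1)] for n in range(N + 1)]

    # Step 4: h[t][c] for 0 <= 2c <= t, with the convention binom(-1, 0) = 1.
    def shifted_binom(k, c):
        return 1 if k == 0 else comb(k - 1, k - c)

    h = [[sum((-1) ** k * fact[c + k] * s2[t][c + k] * shifted_binom(k, c)
              for k in range(c, t - c + 1))
          for c in range(t // 2 + 1)] for t in range(N + 1)]

    # Step 5: q[n] from equation (2).
    q = [sum(comb(n, t) * h[t][c] * g[n - t + c][c]
             for c in range(n // 2 + 1) for t in range(2 * c, n + 1))
         for n in range(N + 1)]

    # Step 6: p[n]; the division by n! is exact.
    p = []
    for n in range(N + 1):
        numerator = sum(s1[n][k] * q[k] for k in range(n + 1))
        assert numerator % fact[n] == 0
        p.append(numerator // fact[n])
    return p
```

Calling `count_parabolic_double_cosets(5)` reproduces the first rows of the appendix table; the sketch stores full two-dimensional tables, so it has the memory profile of Version 1 rather than Version 2.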

3.7 Combinatorial Interpretation of $h_{t,c}$

We conclude this section by giving a combinatorial interpretation of $h_{t,c}$ . This will give a fast and simple algorithm for $h_{t,c}$ , as well as a bound on $h_{t,c}$ that will be used when proving asymptotic formulas.

Proposition 3.19.

For all integers $0\leq 2c\leq t$ , the quantity $(-1)^{c}h_{t,c}/2^{c}$ counts the number of ordered set-partitions of $\{1,\ldots,t\}$ into $c$ parts of even cardinality. In particular, if $t$ is odd, then $h_{t,c}=0$ .

Proof.

First, we rewrite $h_{t,c}$ as

h_{t,c}=(-1)^{c}\sum_{k=2c}^{t}(-1)^{k}k!\genfrac{\{}{\}}{0.0pt}{}{t}{k}\binom{k-c-1}{k-2c}.

Recall that the quantity

(c+k)!\genfrac{\{}{\}}{0.0pt}{}{t}{c+k}\binom{k-1}{k-c}

from equation (2) counts the number of ways to choose an ordered set-partition of $\{1,\ldots,t\}$ into $c+k$ parts, which are further grouped into $c$ chains of at least two parts each. Then the quantity

k!\genfrac{\{}{\}}{0.0pt}{}{t}{k}\binom{k-c-1}{k-2c}

counts the number of ways to choose an ordered set-partition of $\{1,\ldots,t\}$ into $k$ parts, which are further grouped into $c$ consecutive blocks of at least two parts each. Here is an example with $c=3$ , $k=7$ , and $t=10$ :

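(Figure omitted: an ordered set-partition of $\{1,\ldots,10\}$ into 7 parts, among them $\{8,10\}$ and $\{1,7,9\}$ , grouped into 3 consecutive blocks.)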

Considering the $c$ blocks gives an ordered set-partition of $\{1,\ldots,t\}$ into $c$ parts. If we let $\Omega$ denote the collection of all ordered set-partitions of $\{1,\ldots,t\}$ into $c$ parts, then we can write

k!\genfrac{\{}{\}}{0.0pt}{}{t}{k}\binom{k-c-1}{k-2c}=\sum_{\pi\in\Omega}a_{\pi},

where $a_{\pi}$ counts the number of ways that $\pi\in\Omega$ arises as the $c$ blocks of an ordered set-partition of $\{1,\ldots,t\}$ into $k$ parts, which are further grouped into $c$ consecutive blocks of at least two parts each. If $\pi$ is the ordered set-partition $\{1,\ldots,t\}=\pi_{1}\cup\cdots\cup\pi_{c}$ , then we can write $a_{\pi}$ as

a_{\pi}=\sum_{\begin{subarray}{c}k=k_{1}+\cdots+k_{c}\\ k_{i}\geq 2\end{subarray}}\prod_{i=1}^{c}k_{i}!\genfrac{\{}{\}}{0.0pt}{}{\left\lvert{\pi_{i}}\right\rvert}{k_{i}}.

Thus,

k!\genfrac{\{}{\}}{0.0pt}{}{t}{k}\binom{k-c-1}{k-2c}=\sum_{\pi\in\Omega}\sum_{\begin{subarray}{c}k=k_{1}+\cdots+k_{c}\\ k_{i}\geq 2\end{subarray}}\prod_{i=1}^{c}k_{i}!\genfrac{\{}{\}}{0.0pt}{}{\left\lvert{\pi_{i}}\right\rvert}{k_{i}}

Putting this all together gives

	$\displaystyle h_{t,c}$	$\displaystyle=(-1)^{c}\sum_{k=2c}^{t}(-1)^{k}\sum_{\pi\in\Omega}\sum_{\begin{subarray}{c}k=k_{1}+\cdots+k_{c}\\ k_{i}\geq 2\end{subarray}}\prod_{i=1}^{c}k_{i}!\genfrac{\{}{\}}{0.0pt}{}{\left\lvert{\pi_{i}}\right\rvert}{k_{i}}$
		$\displaystyle=(-1)^{c}\sum_{\pi\in\Omega}\sum_{k=2c}^{t}\sum_{\begin{subarray}{c}k=k_{1}+\cdots+k_{c}\\ k_{i}\geq 2\end{subarray}}\prod_{i=1}^{c}(-1)^{k_{i}}k_{i}!\genfrac{\{}{\}}{0.0pt}{}{\left\lvert{\pi_{i}}\right\rvert}{k_{i}}$
		$\displaystyle=(-1)^{c}\sum_{\pi\in\Omega}\sum_{\begin{subarray}{c}k_{1},\ldots,k_{c}\\ 2\leq k_{i}\leq\left\lvert{\pi_{i}}\right\rvert\end{subarray}}\prod_{i=1}^{c}(-1)^{k_{i}}k_{i}!\genfrac{\{}{\}}{0.0pt}{}{\left\lvert{\pi_{i}}\right\rvert}{k_{i}}$
		$\displaystyle=(-1)^{c}\sum_{\pi\in\Omega}\prod_{i=1}^{c}\sum_{k_{i}=2}^{\left\lvert{\pi_{i}}\right\rvert}(-1)^{k_{i}}k_{i}!\genfrac{\{}{\}}{0.0pt}{}{\left\lvert{\pi_{i}}\right\rvert}{k_{i}}.$

We now invoke the identity

\sum_{k=2}^{n}(-1)^{k}k!\genfrac{\{}{\}}{0.0pt}{}{n}{k}=\begin{cases}2&\text{if }n\text{ is even},\\ 0&\text{if }n\text{ is odd}\end{cases}

which follows from setting $x=-1$ in the identity

\sum_{k=0}^{n}\genfrac{\{}{\}}{0.0pt}{}{n}{k}(x)_{k}=x^{n},

where $(x)_{k}=x(x-1)\cdots(x-k+1)$ denotes the falling factorial, so that $(-1)_{k}=(-1)^{k}k!$ ; for $n\geq 1$ the $k=0$ term vanishes and the $k=1$ term equals $-1$ . Thus, if each $\pi_{i}$ has even cardinality, then $\pi$ contributes $(-1)^{c}2^{c}$ to $h_{t,c}$ , and otherwise $\pi$ contributes nothing to $h_{t,c}$ . The result follows. ∎

Corollary 3.20.

For all integers $0\leq 2c\leq t$ , we have the inequality $\left\lvert{h_{t,c}}\right\rvert\leq 2^{c}c^{t}$ .

Proof.

Proposition 3.19 states that $\left\lvert{h_{t,c}}\right\rvert/2^{c}$ counts the number of ordered set-partitions of $\{1,\ldots,t\}$ into $c$ parts of even cardinality. This is bounded above by $c^{t}$ (the number of functions $\{1,\ldots,t\}\to\{1,\ldots,c\}$ ). ∎
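Proposition 3.19 and Corollary 3.20 can be verified numerically for small $t$ . The Python sketch below (hypothetical helper names, brute-force enumeration, intended only as an illustration) computes $h_{t,c}$ from its defining sum and counts ordered set-partitions into $c$ parts of even cardinality by listing all labelings of $\{1,\ldots,t\}$ by $\{1,\ldots,c\}$ .

```python
from functools import lru_cache
from itertools import product
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def h(t, c):
    """h_{t,c} from its defining sum, with the convention binom(-1, 0) = 1."""
    total = 0
    for k in range(c, t - c + 1):
        binom = 1 if k == 0 else comb(k - 1, k - c)
        total += (-1) ** k * factorial(c + k) * stirling2(t, c + k) * binom
    return total


def even_ordered_partitions(t, c):
    """Count ordered set-partitions of {1..t} into c nonempty even-sized parts."""
    count = 0
    for labels in product(range(c), repeat=t):
        sizes = [labels.count(i) for i in range(c)]
        if all(s > 0 and s % 2 == 0 for s in sizes):
            count += 1
    return count
```

The enumeration visits $c^{t}$ labelings, which is also the quantity that bounds the count in Corollary 3.20.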

The combinatorial interpretation of $h_{t,c}$ given in Proposition 3.19 also gives a recurrence for $h_{t,c}$ . It will be convenient to introduce an auxiliary sequence.

Definition 3.21.

We define integers $T(n,k)$ for $0\leq k\leq n$ by

• $T(n,0)=0$ for $n\geq 1$ ,

• $T(n,n)=1$ for $n\geq 0$ ,

• $T(n,k)=T(n-1,k-1)+k^{2}T(n-1,k)$ for $1\leq k\leq n-1$ .

The values $T(n,k)$ may be found at [5, A036969].

Proposition 3.22.

For all integers $0\leq k\leq n$ , we have $h_{2n,k}=(-1)^{k}(2k)!\,T(n,k)$ .

The significance of Proposition 3.22 is that it reduces the computation of $h_{t,c}$ in step 4 of the algorithm from cubic time to quadratic time. Let $S(n,k)=(-1)^{k}h_{2n,k}/2^{k}$ , which by Proposition 3.19 counts the number of ordered set-partitions of $\{1,\ldots,2n\}$ into $k$ parts of even cardinality. The proof of Proposition 3.22 boils down to showing $S(n,k)$ satisfies the recurrence $S(n,k)=k^{2}S(n-1,k)+k(2k-1)S(n-1,k-1)$ . This recurrence is known [5, A241171]. We include a proof of Proposition 3.22 for completeness.

Proof of Proposition 3.22.

Let $S(n,k)$ be defined as above. It suffices to show that $S(n,k)=(2k)!\,T(n,k)/2^{k}$ , because then $h_{2n,k}=(-1)^{k}2^{k}S(n,k)=(-1)^{k}(2k)!\,T(n,k)$ . We need to prove the following three properties:

1. $S(n,0)=0$ for $n\geq 1$ .

2. $S(n,n)=(2n)!/2^{n}$ for $n\geq 0$ .

3. $S(n,k)=k^{2}S(n-1,k)+k(2k-1)S(n-1,k-1)$ for $1\leq k\leq n-1$ .

The first property follows from the observation that if $n\geq 1$ , then there are no ordered set-partitions of $\{1,\ldots,2n\}$ into 0 parts. For the second property, note that an ordered set-partition of $\{1,\ldots,2n\}$ into $n$ parts of even cardinality consists of pairing up $\{1,\ldots,2n\}$ into an ordered sequence of $n$ unordered pairs. There are $(2n)!/2^{n}$ such pairings. It remains to show the third property (the recurrence). Consider the following two ways of constructing an ordered set-partition of $\{1,\ldots,2n\}$ into $k$ parts of even cardinality:

• Start with an ordered set-partition $\{1,\ldots,2n-2\}=S_{1}\cup\cdots\cup S_{k}$ into $k$ parts of even cardinality, and choose two (possibly equal) indices $i,j\in\{1,\ldots,k\}$ . Consider the smallest element of the union $S_{i}\cup S_{j}$ and flip its position (if it was in $S_{i}$ , then move it to $S_{j}$ , and vice versa). Finally, add $2n-1$ to $S_{i}$ and add $2n$ to $S_{j}$ .

• Start with an ordered set-partition $\{1,\ldots,2n-2\}=S_{1}\cup\cdots\cup S_{k-1}$ into $k-1$ parts of even cardinality, and choose an index $k_{0}\in\{1,\ldots,k\}$ . Inserting the empty set at position $k_{0}$ and relabeling gives $\{1,\ldots,2n-2\}=S_{1}\cup\cdots\cup S_{k}$ , which would be an ordered set-partition except that $S_{k_{0}}$ is empty. Now choose two (possibly equal) indices $i,j\in\{1,\ldots,k\}$ , at least one of which is equal to $k_{0}$ . Consider the smallest element of the union $S_{i}\cup S_{j}$ and flip its position (if $i=j=k_{0}$ , then do nothing). Finally, add $2n-1$ to $S_{i}$ and add $2n$ to $S_{j}$ .

We claim that every ordered set-partition of $\{1,\ldots,2n\}$ into $k$ parts of even cardinality arises uniquely from one of these two constructions. This is because the process can be reversed:

• Start with an ordered set-partition of $\{1,\ldots,2n\}=S_{1}\cup\cdots\cup S_{k}$ into $k$ parts of even cardinality. Let $S_{i}$ be the set containing $2n-1$ and let $S_{j}$ be the set containing $2n$ . Remove $2n-1$ from $S_{i}$ and remove $2n$ from $S_{j}$ . Consider the smallest element of the union $S_{i}\cup S_{j}$ and flip its position (if $S_{i}=S_{j}=\varnothing$ , then do nothing). If $S_{i}$ and $S_{j}$ are still nonempty, then we are in the first case. If either $S_{i}$ or $S_{j}$ is now empty, then we are in the second case.

In the first case, there are $S(n-1,k)$ ways of choosing the initial ordered set-partition, and $k^{2}$ ways of choosing the indices $i,j$ . In the second case, there are $S(n-1,k-1)$ ways of choosing the initial ordered set-partition, $k$ ways of choosing the index $k_{0}$ , and $2k-1$ ways of choosing the indices $i,j$ . This proves the recurrence $S(n,k)=k^{2}S(n-1,k)+k(2k-1)S(n-1,k-1)$ . ∎
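The identity of Proposition 3.22 is easy to test numerically. The following Python sketch (an illustration with hypothetical names) builds the table $T(n,k)$ from Definition 3.21 in quadratic time and compares $(-1)^{k}(2k)!\,T(n,k)$ with $h_{2n,k}$ computed from its defining sum.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def h_direct(t, c):
    """h_{t,c} from its defining sum, with the convention binom(-1, 0) = 1."""
    total = 0
    for k in range(c, t - c + 1):
        binom = 1 if k == 0 else comb(k - 1, k - c)
        total += (-1) ** k * factorial(c + k) * stirling2(t, c + k) * binom
    return total


def central_factorial_table(N):
    """Table T[n][k] for 0 <= k <= n <= N, following Definition 3.21."""
    T = [[0] * (N + 1) for _ in range(N + 1)]
    for n in range(N + 1):
        T[n][n] = 1
        for k in range(1, n):
            T[n][k] = T[n - 1][k - 1] + k * k * T[n - 1][k]
    return T


def h_fast(n, k, T):
    """h_{2n,k} via Proposition 3.22."""
    return (-1) ** k * factorial(2 * k) * T[n][k]
```

Each table entry costs a constant number of arithmetic operations, which is the source of the quadratic running time for step 4.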

4 Asymptotics

4.1 Asymptotics of $f_{n,k}$

We first give a formula for the generalized Fubini numbers $f_{n,k}$ in terms of the ordinary Fubini numbers $f_{n}$ .

Proposition 4.1.

If $k\geq 0$ and $n\geq 0$ are integers, then

f_{n+k,k}=k!\sum_{n_{0}+\cdots+n_{k}=n}\frac{n!}{n_{0}!\cdots n_{k}!}f_{n_{0}}\cdots f_{n_{k}},

where each $n_{i}$ is a nonnegative integer.

Proof.

Recall that $f_{n+k,k}$ counts the number of weak orders on $\{1,\ldots,n+k\}$ where each of the elements $1,\ldots,k$ must be placed in a part by itself. We first choose the permutation of the elements $1,\ldots,k$ . This leaves $k+1$ regions between the elements $1,\ldots,k$ where we must place the remaining $n$ elements $k+1,\ldots,n+k$ . We then choose how many elements $n_{i}$ should go into each of the $k+1$ regions. There are $\frac{n!}{n_{0}!\cdots n_{k}!}$ ways to choose how to distribute the $n$ elements $k+1,\ldots,n+k$ into these $k+1$ regions. Finally, there are $f_{n_{i}}$ choices for the weak order on the $n_{i}$ elements within each of the $k+1$ regions. ∎
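Proposition 4.1 can be checked on small cases by summing over all compositions of $n$ into $k+1$ nonnegative parts. The Python sketch below is only an illustration; the function names are hypothetical.

```python
from functools import lru_cache
from math import factorial, prod


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def f(n, k):
    """Generalized Fubini number f_{n,k}; f(n, 0) is the ordinary Fubini number."""
    return sum(factorial(l) * stirling2(n - k, l - k) for l in range(k, n + 1))


def compositions(n, parts):
    """Yield all tuples of `parts` nonnegative integers that sum to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest


def f_via_proposition(n, k):
    """Right-hand side of Proposition 4.1, which should equal f_{n+k,k}."""
    total = 0
    for ns in compositions(n, k + 1):
        multinomial = factorial(n)
        for ni in ns:
            multinomial //= factorial(ni)  # exact at every step
        total += multinomial * prod(f(ni, 0) for ni in ns)
    return factorial(k) * total
```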

We remark that Proposition 4.1 gives a generating function identity

\sum_{n=0}^{\infty}\frac{f_{n+k,k}}{n!}x^{n}=k!\left(\sum_{n=0}^{\infty}\frac{f_{n}}{n!}x^{n}\right)^{k+1}=\frac{k!}{(2-e^{x})^{k+1}}.

It is possible to prove Lemma 4.3 below using complex-analytic generating function techniques by looking at the poles of $k!/(2-e^{x})^{k+1}$ . However, we will prove Lemma 4.3 using more direct real-analytic techniques. We will need the following technical lemma.

Lemma 4.2.

Let $\{a_{n}\}_{n=0}^{\infty}$ be a sequence of real numbers and let $k\geq 0$ be an integer. If $a_{n}\to 1$ , then

\frac{1}{\binom{n+k}{k}}\sum_{\begin{subarray}{c}n_{0}+\cdots+n_{k}=n\\ \text{all }n_{i}\geq 0\end{subarray}}a_{n_{0}}\cdots a_{n_{k}}\to 1

as $n\to\infty$ .

Proof.

We first remark that $\binom{n+k}{k}$ is the number of terms in the sum, since the number of solutions of $n_{0}+\cdots+n_{k}=n$ equals the number of ways to distribute $n$ identical balls into $k+1$ distinguishable urns. Let $\varepsilon>0$ . Then there exists an $m\geq 0$ such that $a_{n}\in[1-\varepsilon,1+\varepsilon]$ for $n\geq m$ . Let $C=\sup_{n}\left\lvert{a_{n}}\right\rvert$ , which is finite because the sequence converges. For $n\geq(k+1)m$ , we will split up the sum as

\sum_{n_{0}+\cdots+n_{k}=n}a_{n_{0}}\cdots a_{n_{k}}=\sum_{\begin{subarray}{c}n_{0}+\cdots+n_{k}=n\\ \text{all }n_{i}\geq m\end{subarray}}a_{n_{0}}\cdots a_{n_{k}}+\sum_{\begin{subarray}{c}n_{0}+\cdots+n_{k}=n\\ \text{some }n_{i}<m\end{subarray}}a_{n_{0}}\cdots a_{n_{k}}.

The number of terms of the first sum equals the number of solutions to $n_{0}+\cdots+n_{k}=n$ with all $n_{i}\geq m$ , which equals the number of solutions to $n_{0}+\cdots+n_{k}=n-(k+1)m$ . Then the first sum has $\binom{n-(k+1)m+k}{k}$ terms (by the same argument as at the start of the proof), each of which lies in the interval $[(1-\varepsilon)^{k+1},(1+\varepsilon)^{k+1}]$ . The second sum has $\binom{n+k}{k}-\binom{n-(k+1)m+k}{k}$ terms, each of which lies in the interval $[-C^{k+1},C^{k+1}]$ . Thus,

(1-\varepsilon)^{k+1}\binom{n-(k+1)m+k}{k}-C^{k+1}\left(\binom{n+k}{k}-\binom{n-(k+1)m+k}{k}\right)\leq\sum_{n_{0}+\cdots+n_{k}=n}a_{n_{0}}\cdots a_{n_{k}}

and

\sum_{n_{0}+\cdots+n_{k}=n}a_{n_{0}}\cdots a_{n_{k}}\leq(1+\varepsilon)^{k+1}\binom{n-(k+1)m+k}{k}+C^{k+1}\left(\binom{n+k}{k}-\binom{n-(k+1)m+k}{k}\right).

Dividing through by $\binom{n+k}{k}$ gives the inequalities

(1-\varepsilon)^{k+1}\frac{\binom{n-(k+1)m+k}{k}}{\binom{n+k}{k}}-C^{k+1}\left(1-\frac{\binom{n-(k+1)m+k}{k}}{\binom{n+k}{k}}\right)\leq\frac{1}{\binom{n+k}{k}}\sum_{n_{0}+\cdots+n_{k}=n}a_{n_{0}}\cdots a_{n_{k}}

and

\frac{1}{\binom{n+k}{k}}\sum_{n_{0}+\cdots+n_{k}=n}a_{n_{0}}\cdots a_{n_{k}}\leq(1+\varepsilon)^{k+1}\frac{\binom{n-(k+1)m+k}{k}}{\binom{n+k}{k}}+C^{k+1}\left(1-\frac{\binom{n-(k+1)m+k}{k}}{\binom{n+k}{k}}\right).

Note that

\frac{\binom{n-(k+1)m+k}{k}}{\binom{n+k}{k}}=\frac{(n-(k+1)m+1)\cdots(n-(k+1)m+k)}{(n+1)\cdots(n+k)}\to 1\text{ as }n\to\infty.

Then taking $n\to\infty$ gives

\displaystyle(1-\varepsilon)^{k+1}\leq\liminf_{n\to\infty}\frac{1}{\binom{n+k}{k}}\sum_{n_{0}+\cdots+n_{k}=n}a_{n_{0}}\cdots a_{n_{k}}\leq\limsup_{n\to\infty}\frac{1}{\binom{n+k}{k}}\sum_{n_{0}+\cdots+n_{k}=n}a_{n_{0}}\cdots a_{n_{k}}\leq(1+\varepsilon)^{k+1}.

The result follows from taking $\varepsilon\to 0$ . ∎

We can now give an asymptotic formula for $f_{n,k}$ as $n\to\infty$ . Recall that $f(n)\sim g(n)$ means $\lim\limits_{n\to\infty}\frac{f(n)}{g(n)}=1$ .

Lemma 4.3.

For each fixed nonnegative integer $k$ , we have

f_{n,k}\sim\frac{n!}{2^{k+1}(\log 2)^{n+1}}

as $n\to\infty$ .

Proof.

Proposition 4.1 gives the combinatorial identity

f_{n+k,k}=k!\sum_{n_{0}+\cdots+n_{k}=n}\frac{n!}{n_{0}!\cdots n_{k}!}f_{n_{0}}\cdots f_{n_{k}}

which we can rewrite as

\frac{f_{n+k,k}}{\displaystyle\left(\frac{(n+k)!}{2^{k+1}(\log 2)^{n+k+1}}\right)}=\frac{1}{\displaystyle\binom{n+k}{k}}\sum_{n_{0}+\cdots+n_{k}=n}\frac{f_{n_{0}}}{\displaystyle\left(\frac{n_{0}!}{2(\log 2)^{n_{0}+1}}\right)}\cdots\frac{f_{n_{k}}}{\displaystyle\left(\frac{n_{k}!}{2(\log 2)^{n_{k}+1}}\right)}.

Then the result follows from Lemma 4.2, along with the asymptotic formula $f_{n}\,{\sim}\,\frac{n!}{2(\log 2)^{n+1}}$ [9, p.175-176]. ∎
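The asymptotic formula of Lemma 4.3 can be observed numerically. The Python sketch below (hypothetical names, floating-point arithmetic for the ratio only) computes $f_{n,k}$ exactly and divides by $n!/(2^{k+1}(\log 2)^{n+1})$ ; it illustrates the limit and is not part of the proof.

```python
from functools import lru_cache
from math import factorial, log


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via the standard recurrence."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def f(n, k):
    """Generalized Fubini number f_{n,k} (exact integer)."""
    return sum(factorial(l) * stirling2(n - k, l - k) for l in range(k, n + 1))


def ratio(n, k):
    """f_{n,k} divided by the asymptotic estimate of Lemma 4.3."""
    estimate = factorial(n) / (2 ** (k + 1) * log(2) ** (n + 1))
    return f(n, k) / estimate
```

In experiments of this kind the ratio for $k=0$ is already extremely close to 1 for small $n$ , while for $k\geq 1$ it approaches 1 more slowly.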

4.2 Asymptotics of $q_{n}$

We will need a short technical lemma.

Lemma 4.4.

For all integers $0\leq 2c\leq t\leq n$ ,

\binom{n}{t}\left(\frac{(n-t+c)!}{n!}\right)^{2}\leq\frac{1}{t!}.

Proof.

We have

\binom{n}{t}\left(\frac{(n-t+c)!}{n!}\right)^{2}=\frac{1}{t!}\frac{n!}{(n-t)!}\left(\frac{(n-t+c)!}{n!}\right)^{2}=\frac{1}{t!}\frac{(n-t+c)!^{2}}{(n-t)!\,n!},

where

\frac{(n-t+c)!^{2}}{(n-t)!\,n!}\leq\frac{(n-t+c)!}{(n-t)!}\frac{(n-t+c)!}{(n-t+2c)!}=\frac{n-t+1}{n-t+c+1}\cdots\frac{n-t+c}{n-t+2c}\leq 1.\qed

Theorem 4.5.

We have

q_{n}\sim e^{-(\log 2)^{2}}q_{n,0}\sim e^{-(\log 2)^{2}}\frac{n!^{2}}{4\,(\log 2)^{2n+2}}.

Thus, approximately $e^{-(\log 2)^{2}}$ (around $62\%$ ) of pairs of weak orders have no consecutive embeddings.

Proof.

Taking the formula $q_{n}=\sum_{c}\sum_{t}\binom{n}{t}h_{t,c}g_{n-t+c,c}$ in equation (2), unfolding the definition of $g_{n-t+c,c}$ , and dividing through by $f_{n}^{2}$ gives

\frac{q_{n}}{f_{n}^{2}}=\sum_{c=0}^{\lfloor n/2\rfloor}\sum_{t=2c}^{n}\sum_{j=0}^{c}\binom{n}{t}\frac{h_{t,c}}{j!\,(c-j)!}\frac{f_{n-t+c,j}f_{n-t+c,c-j}}{f_{n}^{2}}.

Now fix $c\geq 0$ , $t\geq 2c$ , and $0\leq j\leq c$ , and consider the summand

\binom{n}{t}\frac{h_{t,c}}{j!\,(c-j)!}\frac{f_{n-t+c,j}f_{n-t+c,c-j}}{f_{n}^{2}}

for $n\geq t$ as $n\to\infty$ . By the asymptotic formula for $f_{n,k}$ in Lemma 4.3 and the fact that $f_{n}=f_{n,0}$ we have

	$\displaystyle\binom{n}{t}\frac{h_{t,c}}{j!\,(c-j)!}\frac{f_{n-t+c,j}f_{n-t+c,c-j}}{f_{n}^{2}}$	$\displaystyle\sim\frac{n^{t}}{t!}\frac{h_{t,c}}{j!\,(c-j)!}\frac{\displaystyle\left(\frac{(n-t+c)!}{2^{j+1}(\log 2)^{n-t+c+1}}\right)\left(\frac{(n-t+c)!}{2^{c-j+1}(\log 2)^{n-t+c+1}}\right)}{\displaystyle\left(\frac{n!}{2(\log 2)^{n+1}}\right)^{2}}$
		$\displaystyle\sim\frac{n^{t}}{t!}\frac{h_{t,c}}{j!\,(c-j)!}\frac{(\log 2)^{2t-2c}}{2^{c}n^{2t-2c}}\sim\frac{h_{t,c}}{j!\,(c-j)!}\frac{(\log 2)^{2t-2c}}{2^{c}t!}\frac{1}{n^{t-2c}}.$

If $t>2c$ , then the summand converges to 0. If $t=2c$ , then $h_{t,c}=(-1)^{c}t!$ (this follows from the definition of $h_{t,c}$ , as well as from Proposition 3.22) so the summand converges to

(-1)^{c}\frac{1}{j!\,(c-j)!}\frac{(\log 2)^{2c}}{2^{c}}.

After applying the dominated convergence theorem (justified below), the $t>2c$ terms contribute nothing so we can drop the sum over $t$ and get

\frac{q_{n}}{f_{n}^{2}}\to\sum_{c=0}^{\infty}\sum_{j=0}^{c}(-1)^{c}\frac{1}{j!\,(c-j)!}\frac{(\log 2)^{2c}}{2^{c}}=\sum_{c=0}^{\infty}(-1)^{c}\frac{(\log 2)^{2c}}{c!}=e^{-(\log 2)^{2}}.

The result follows from the identity $q_{n,0}=f_{n}^{2}$ and the asymptotic formula $f_{n}\sim\frac{n!}{2(\log 2)^{n+1}}$ (Lemma 4.3).

It remains to justify this application of the dominated convergence theorem. Note that the asymptotic formula $f_{n}\sim\frac{n!}{2(\log 2)^{n+1}}$ gives a constant $C$ (not depending on $c$ , $t$ , $j$ ) such that we have

	$\displaystyle\left\lvert\binom{n}{t}\frac{h_{t,c}}{j!\,(c-j)!}\frac{f_{n-t+c,j}f_{n-t+c,c-j}}{f_{n}^{2}}\right\rvert$	$\displaystyle=\frac{\left\lvert{h_{t,c}}\right\rvert}{j!\,(c-j)!}\binom{n}{t}\frac{f_{n-t+c,j}f_{n-t+c,c-j}}{f_{n}^{2}}$
		$\displaystyle\leq\frac{\left\lvert{h_{t,c}}\right\rvert}{j!\,(c-j)!}\binom{n}{t}\left(\frac{f_{n-t+c}}{f_{n}}\right)^{2}$
		$\displaystyle\leq C\frac{\left\lvert{h_{t,c}}\right\rvert}{j!\,(c-j)!}\binom{n}{t}\left(\frac{\displaystyle\frac{(n-t+c)!}{(\log 2)^{n-t+c}}}{\displaystyle\frac{n!}{(\log 2)^{n}}}\right)^{2}$
		$\displaystyle\leq C\frac{\left\lvert{h_{t,c}}\right\rvert}{j!\,(c-j)!}\binom{n}{t}\left(\frac{(n-t+c)!}{n!}\right)^{2}$
		$\displaystyle\leq C\frac{\left\lvert{h_{t,c}}\right\rvert}{j!\,(c-j)!}\frac{1}{t!},$

where the last inequality uses Lemma 4.4. It remains to show that the sum

\sum_{c=0}^{\infty}\sum_{t=2c}^{\infty}\sum_{j=0}^{c}\frac{\left\lvert{h_{t,c}}\right\rvert}{j!\,(c-j)!}\frac{1}{t!}

is finite. Applying Corollary 3.20 and the inequality $2c\leq t$ from the bounds of summation gives

\sum_{c=0}^{\infty}\sum_{t=2c}^{\infty}\sum_{j=0}^{c}\frac{\left\lvert{h_{t,c}}\right\rvert}{j!\,(c-j)!}\frac{1}{t!}=\sum_{c=0}^{\infty}\sum_{t=2c}^{\infty}\frac{2^{c}}{c!\,t!}\left\lvert{h_{t,c}}\right\rvert\leq\sum_{c=0}^{\infty}\sum_{t=2c}^{\infty}\frac{2^{2c}c^{t}}{c!\,t!}\leq\sum_{c=0}^{\infty}\sum_{t=2c}^{\infty}\frac{(2c)^{t}}{c!\,t!}\leq\sum_{c=0}^{\infty}\sum_{t=0}^{\infty}\frac{(2c)^{t}}{c!\,t!}=\sum_{c=0}^{\infty}\frac{e^{2c}}{c!},

which is finite by the ratio test. ∎

4.3 Asymptotics of $p_{n}$

We will need the following technical lemma.

Lemma 4.6.

For each fixed nonnegative integer $k$ , we have

\genfrac{[}{]}{0.0pt}{}{n}{n-k}\sim\frac{n^{2k}}{2^{k}k!}

as $n\to\infty$ . We also have

\genfrac{[}{]}{0.0pt}{}{n}{n-k}\leq\frac{n^{2k}}{2^{k}k!}

for all $n\geq k$ .

Proof.

The first statement follows from equation 1.6 of [4]. For the second statement, we will use the recurrence for the Stirling numbers of the first kind. If $k=0$ or $k=n$ , then the inequality is clear. Now suppose that $1\leq k\leq n-1$ , and inductively assume that the inequality holds for smaller values of $n$ . Then

	$\displaystyle\genfrac{[}{]}{0.0pt}{}{n}{n-k}$	$\displaystyle=(n-1)\genfrac{[}{]}{0.0pt}{}{n-1}{n-k}+\genfrac{[}{]}{0.0pt}{}{n-1}{n-k-1}$
		$\displaystyle=(n-1)\genfrac{[}{]}{0.0pt}{}{n-1}{(n-1)-(k-1)}+\genfrac{[}{]}{0.0pt}{}{n-1}{(n-1)-k}$
		$\displaystyle\leq(n-1)\frac{(n-1)^{2(k-1)}}{2^{k-1}(k-1)!}+\frac{(n-1)^{2k}}{2^{k}k!}$
		$\displaystyle=\frac{1}{2^{k}k!}(2k+n-1)(n-1)^{2k-1}$
		$\displaystyle\leq\frac{1}{2^{k}k!}\left(\frac{(2k+n-1)+(2k-1)(n-1)}{2k}\right)^{2k}$
		$\displaystyle=\frac{n^{2k}}{2^{k}k!},$

where the last inequality (the penultimate step) uses the AM-GM inequality. ∎
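The inequality in Lemma 4.6 can be confirmed for small $n$ with exact integer arithmetic. In the Python sketch below (hypothetical names, illustration only), the bound is cleared of denominators so that no rounding occurs.

```python
from math import comb, factorial


def stirling1_table(N):
    """Unsigned Stirling numbers of the first kind s[n][k] for 0 <= k <= n <= N."""
    s = [[0] * (N + 1) for _ in range(N + 1)]
    s[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            s[n][k] = (n - 1) * s[n - 1][k] + s[n - 1][k - 1]
    return s


def bound_holds(n, k, s):
    """Check [n, n-k] <= n^(2k) / (2^k k!) in the form [n, n-k] 2^k k! <= n^(2k)."""
    return s[n][n - k] * 2 ** k * factorial(k) <= n ** (2 * k)
```

For $k=1$ the Stirling number is $\binom{n}{2}=n(n-1)/2$ , so the ratio to the bound $n^{2}/2$ is $(n-1)/n$ , consistent with the first statement of the lemma.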

We can now prove the asymptotic formula for $p_{n}$ (Theorem 1.2 from the introduction).

Theorem 4.7.

We have

p_{n}\sim e^{-(\log 2)^{2}/2}\frac{f_{n}^{2}}{n!}\sim\frac{e^{-(\log 2)^{2}/2}}{4(\log 2)^{2}}\frac{n!}{(\log 2)^{2n}}.

Proof.

Applying the formula $\displaystyle p_{n}=\frac{1}{n!}\sum\limits_{k=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{k}q_{k}$ and replacing $k$ with $n-k$ gives

\frac{n!\,p_{n}}{f_{n}^{2}}=\sum_{k=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{k}\frac{q_{k}}{f_{n}^{2}}=\sum_{k=0}^{n}\genfrac{[}{]}{0.0pt}{}{n}{n-k}\frac{q_{n-k}}{f_{n}^{2}}.

If $k$ is fixed, then applying Lemma 4.3, Theorem 4.5, and Lemma 4.6 gives

\genfrac{[}{]}{0.0pt}{}{n}{n-k}\frac{q_{n-k}}{f_{n}^{2}}\sim\frac{n^{2k}}{2^{k}k!}\frac{\displaystyle e^{-(\log 2)^{2}}\frac{(n-k)!^{2}}{4(\log 2)^{2(n-k)+2}}}{\displaystyle\left(\frac{n!}{2(\log 2)^{n+1}}\right)^{2}}\sim\frac{e^{-(\log 2)^{2}}(\log 2)^{2k}}{2^{k}k!}.

By the dominated convergence theorem (justified below),

\frac{n!\,p_{n}}{f_{n}^{2}}\to\sum_{k=0}^{\infty}\frac{e^{-(\log 2)^{2}}(\log 2)^{2k}}{2^{k}k!}=e^{-(\log 2)^{2}}e^{(\log 2)^{2}/2}=e^{-(\log 2)^{2}/2}.

Then the result follows from the asymptotic formula $f_{n}\sim\frac{n!}{2(\log 2)^{n+1}}$ .

It remains to justify this application of the dominated convergence theorem. Note that the asymptotics $f_{n}\sim\frac{n!}{2(\log 2)^{n+1}}$ and $q_{n}\sim e^{-(\log 2)^{2}}\frac{n!^{2}}{4(\log 2)^{2n+2}}$ give a constant $C$ (not depending on $n$ or $k$ ) such that we have

\genfrac{[}{]}{0.0pt}{}{n}{n-k}\frac{q_{n-k}}{f_{n}^{2}}\leq C\frac{n^{2k}}{2^{k}k!}\frac{\displaystyle\frac{(n-k)!^{2}}{(\log 2)^{2(n-k)}}}{\displaystyle\left(\frac{n!}{(\log 2)^{n}}\right)^{2}}\leq C\frac{n^{2k}}{2^{k}k!}\frac{(n-k)!^{2}}{n!^{2}}

for $n\geq k$ . Now view the expression

n^{2k}\frac{(n-k)!^{2}}{n!^{2}}=\left(\left(\frac{n}{n-k+1}\right)\left(\frac{n}{n-k+2}\right)\cdots\left(\frac{n}{n}\right)\right)^{2}

as a function of $n\geq k$ for fixed $k$ . This function is the square of a product of a fixed number of factors, each of which is nonincreasing as a function of $n$ . In particular, this function is maximized at $n=k$ with value $k^{2k}/k!^{2}$ . This shows that

\genfrac{[}{]}{0.0pt}{}{n}{n-k}\frac{q_{n-k}}{f_{n}^{2}}\leq C\frac{n^{2k}}{2^{k}k!}\frac{(n-k)!^{2}}{n!^{2}}\leq C\frac{k^{2k}}{2^{k}k!^{3}}

for $n\geq k$ . Finally, the sum

\sum_{k=0}^{\infty}\frac{k^{2k}}{2^{k}k!^{3}}

is finite by the ratio test. ∎

5 Further Directions

5.1 Higher Order Asymptotics

Let $K$ denote the constant

K=\frac{e^{-(\log 2)^{2}/2}}{4(\log 2)^{2}}\approx 0.409223.

Theorem 4.7 states that

\frac{p_{n}(\log 2)^{2n}}{n!}\to K.
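The constant $K$ and the normalized values in the last column of the appendix can be reproduced in floating point. The Python sketch below uses the first five values of $p_{n}$ from the appendix; the variable and function names are hypothetical.

```python
from math import exp, factorial, log

LOG2_SQ = log(2) ** 2

# The limiting constant K from Theorem 4.7.
K = exp(-LOG2_SQ / 2) / (4 * LOG2_SQ)


def normalized(n, p_n):
    """Return p_n (log 2)^(2n) / n!, the quantity that converges to K."""
    return p_n * LOG2_SQ ** n / factorial(n)


# Values of p_n for n = 1..5, taken from the appendix.
APPENDIX_VALUES = {1: 1, 2: 3, 3: 19, 4: 167, 5: 1791}

# Differences K - p_n (log 2)^(2n) / n!, the quantity plotted (after taking
# logarithms) once it becomes positive.
gaps = {n: K - normalized(n, p) for n, p in APPENDIX_VALUES.items()}
```

For very large $n$ the factorial exceeds the floating-point range, so a computation covering the whole appendix would need logarithms or exact rational arithmetic instead.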

Figure 4 is a graph of $y=\log\big{(}K-\frac{p_{n}(\log 2)^{2n}}{n!}\big{)}$ against $x=\log n$ for the values of $n$ given in the appendix.

Figure 4: Graph of $y=\log\big{(}K-\frac{p_{n}(\log 2)^{2n}}{n!}\big{)}$ against $x=\log n$

For large $n$ , the points appear to approach a line with slope $-1$ and $y$ -intercept $b\approx-2.23$ . In other words,

\log\left(K-\frac{p_{n}(\log 2)^{2n}}{n!}\right)\approx b-\log n.

Exponentiating and rearranging terms gives

\frac{p_{n}(\log 2)^{2n}}{n!}\approx K-\frac{e^{b}}{n}.

This suggests the following conjecture.

Conjecture 5.1.

There exists a constant $c>0$ such that

\frac{p_{n}(\log 2)^{2n}}{n!}=K-\frac{c}{n}+O\left(\frac{1}{n^{2}}\right).
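If the conjecture holds, then $n\big(K-\frac{p_{n}(\log 2)^{2n}}{n!}\big)=c+O(1/n)$ , so this quantity gives a rough estimate of $c$ . The sketch below evaluates it for three exact values of $p_{n}$ copied from the appendix and then combines the estimates at $n=10$ and $n=20$ to cancel a presumed $1/n$ correction. That extrapolation step is an assumption of this illustration, not something established in the paper.

```python
from math import exp, factorial, log

LOG2_SQ = log(2) ** 2
K = exp(-LOG2_SQ / 2) / (4 * LOG2_SQ)

# Exact values of p_n copied from the appendix table.
APPENDIX_P = {
    10: 2200768698,
    15: 31301382564128969,
    20: 2285943548113541477123970,
}


def c_estimate(n, p_n):
    """n * (K - p_n (log 2)^(2n) / n!), which tends to c if Conjecture 5.1 holds."""
    return n * (K - p_n * LOG2_SQ ** n / factorial(n))


estimates = {n: c_estimate(n, p_n) for n, p_n in APPENDIX_P.items()}

# If the estimate behaves like c + d/n, then 2*E(20) - E(10) removes the d/n term.
extrapolated = 2 * estimates[20] - estimates[10]

for n, value in estimates.items():
    print(n, value)
print("extrapolated:", extrapolated)
```

The extrapolated value can be compared with $e^{b}$ for the fitted intercept $b\approx-2.23$ ; agreement to a couple of digits is the most that should be expected from so few data points.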

5.2 Congruence Conjecture

Theorem 2 of [7] states that if $p$ is prime and $n\geq m$ , then

f_{n+\varphi(p^{m})}\equiv f_{n}\pmod{p^{m}},

where $\varphi$ is Euler’s totient function. In other words, the Fubini numbers are eventually periodic modulo $p^{m}$ , with period dividing $\varphi(p^{m})$ . Squaring both sides gives the congruence

q_{n+\varphi(p^{m}),0}\equiv q_{n,0}\pmod{p^{m}}

for $p$ prime and $n\geq m$ . Replacing $q_{n,0}$ with $q_{n}$ suggests the following conjecture, which is supported by the computed values of $q_{n}$ for $n\leq 5000$ .

Conjecture 5.2.

If $p$ is prime and $n\geq m$ , then

q_{n+\varphi(p^{m})}\equiv q_{n}\pmod{p^{m}}.

Unfortunately, this conjecture does not appear to imply anything about the behavior of the sequence $\{p_{n}\}$ modulo $p^{m}$ , because of the division by $n!$ when converting from $\{q_{n}\}$ to $\{p_{n}\}$ .
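The congruence for the Fubini numbers quoted from [7], and the squared version for $q_{n,0}$ , can be spot-checked for small primes. The sketch below does this over a finite range only; the function names and the tested ranges are choices made for this illustration, and the Fubini numbers are generated with the standard recurrence $f_{n}=\sum_{k=1}^{n}\binom{n}{k}f_{n-k}$ . Testing Conjecture 5.2 itself would additionally require the values of $q_{n}$ , which are not reproduced here.

```python
from math import comb


def fubini_numbers(n_max):
    """Return [f_0, ..., f_{n_max}] via f_n = sum_{k=1}^{n} C(n, k) f_{n-k}."""
    f = [1]
    for n in range(1, n_max + 1):
        f.append(sum(comb(n, k) * f[n - k] for k in range(1, n + 1)))
    return f


def congruence_holds(seq, p, m, n_values):
    """Check seq[n + phi(p^m)] == seq[n] (mod p^m) for every n in n_values."""
    modulus = p ** m
    period = p ** (m - 1) * (p - 1)  # Euler's totient of p^m
    return all((seq[n + period] - seq[n]) % modulus == 0 for n in n_values)


f = fubini_numbers(80)
f_squared = [x * x for x in f]

# Test n = m, ..., m + 19 for a few small prime powers.
cases = [(p, m) for p in (2, 3, 5, 7) for m in (1, 2)]
fubini_checks = {
    (p, m): congruence_holds(f, p, m, range(m, m + 20)) for p, m in cases
}
square_checks = {
    (p, m): congruence_holds(f_squared, p, m, range(m, m + 20)) for p, m in cases
}

print(fubini_checks)
print(square_checks)
```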

6 Acknowledgements

Many thanks to Sara Billey for proposing the problem and for helpful discussions, and to the WXML (Washington Experimental Mathematics Lab) for providing the project that led to this paper. Also thanks to Sara Billey and the anonymous referees for their comments and suggestions on this paper.

7 Web Resources

The basic and memory-optimized implementations of the algorithm described in Section 3.6 and the values of $p_{n}$ for $n\leq 5000$ can be found at

https://github.com/tb65536/ParabolicDoubleCosets

8 Appendix: Values of $p_{n}$

$n$	$p_{n}$	Digits of $p_{n}$	$p_{n}(\log 2)^{2n}/n!$
1	1	(1 digit)	0.480453
2	3	(1 digit)	0.346253
3	19	(2 digits)	0.3512
4	167	(3 digits)	0.370774
5	1791	(4 digits)	0.382093
6	22715	(5 digits)	0.388048
7	334031	(6 digits)	0.391663
8	5597524	(7 digits)	0.394169
9	105351108	(9 digits)	0.396036
10	2200768698	(10 digits)	0.397485
11	50533675542	(11 digits)	0.398644
12	1265155704413	(13 digits)	0.399593
13	34300156146805	(14 digits)	0.400385
14	1001152439025205	(16 digits)	0.401056
15	31301382564128969	(17 digits)	0.401631
16	1043692244938401836	(19 digits)	0.402131
17	36969440518414369896	(20 digits)	0.402569
18	1386377072447199902576	(22 digits)	0.402955
19	54872494774746771827248	(23 digits)	0.403299
20	2285943548113541477123970	(25 digits)	0.403608
30	382079126820…882950534546	(42 digits)	0.405528
40	179736290098…532574927537	(61 digits)	0.406469
50	102365379120…338473199289	(81 digits)	0.407028
60	427699505826…027450945465	(101 digits)	0.407398
70	940027093836…926979570377	(122 digits)	0.407662
80	857360695445…439742054481	(144 digits)	0.407859
90	271659624624…300501685746	(167 digits)	0.408011
100	260443549181…383464403196	(190 digits)	0.408133
200	150691150471…390138470043	(439 digits)	0.40868
300	400039289653…047576602840	(710 digits)	0.408862
400	572423854465…686938545249	(996 digits)	0.408952
500	745894661762…526127432358	(1293 digits)	0.409006
600	529056570650…070570426529	(1599 digits)	0.409042
700	692359539273…658799872850	(1912 digits)	0.409068
800	150717237472…313160125048	(2232 digits)	0.409088
900	902565318506…968550812571	(2556 digits)	0.409103
1000	367762337807…336792083803	(2886 digits)	0.409115
1500	657393993927…489306609387	(4592 digits)	0.409151
2000	677187561025…781759174668	(6372 digits)	0.409169
2500	497164609537…894142980291	(8207 digits)	0.40918
3000	189293873430…434167136044	(10086 digits)	0.409187
3500	163043229993…353705274487	(12001 digits)	0.409192
4000	186384435725…985721119395	(13947 digits)	0.409196
4500	346443781530…440739293425	(15920 digits)	0.409199
5000	962766473267…951984139754	(17917 digits)	0.409201

References

[1] S. Billey, M. Konvalinka, T. K. Petersen, W. Slofstra, and B. E. Tenner. Parabolic Double Cosets in Coxeter Groups. The Electronic Journal of Combinatorics, 25(1), 2018.
[2] P. Diaconis and A. Gangolli. Rectangular Arrays with Fixed Margins. Discrete Probability and Algorithms, pages 15-41. The IMA Volumes in Mathematics and its Applications, vol 72, 1995. Springer, New York.
[3] M. Kobayashi. Two-Sided Structure of Double Cosets in Coxeter Groups, June 2011. Accessed online October 2018.
[4] L. Moser and M. Wyman. Asymptotic Development of the Stirling Numbers of the First Kind. Journal of the London Mathematical Society, 33(2):131-146, 1958.
[5] The On-Line Encyclopedia of Integer Sequences, published electronically at https://oeis.org, October 2018.
[6] T. K. Petersen. A two-sided analogue of the Coxeter complex. The Electronic Journal of Combinatorics, 25(4), 2018.
[7] B. Poonen. Periodicity of a Combinatorial Sequence. The Fibonacci Quarterly, 26(1), 1988.
[8] J. Riordan. An Introduction to Combinatorial Analysis. Princeton University Press, 1978.
[9] H. S. Wilf. generatingfunctionology. Accessed online August 2020.
