This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

An Experimental Mathematics Approach to Several Combinatorial Problems

YUKUN YAO
Abstract

Experimental mathematics is an experimental approach to mathematics in which programming and symbolic computation are used to investigate mathematical objects, identify properties and patterns, discover facts and formulas and even automatically prove theorems.

With an experimental mathematics approach, this dissertation deals with several combinatorial problems and demonstrates the methodology of experimental mathematics.

We start with parking functions and their moments of certain statistics. Then we discuss about spanning trees and “almost diagonal” matrices to illustrate the methodology of experimental mathematics. We also apply experimental mathematics to Quicksort algorithms to study the running time. Finally we talk about the interesting peaceable queens problem.

\phd\program

Mathematics \directorDoron Zeilberger \approvals4 \submissionyear2020 \submissionmonthMay

\beforepreface
Acknowledgements.
First and foremost, I would like to thank my advisor, Doron Zeilberger, for all of his help, guidance and encouragement throughout my mathematical adventure in graduate school at Rutgers. He introduced me to the field of experimental mathematics and many interesting topics in combinatorics. Without him, this dissertation would not be possible. I am grateful to many professors at Rutgers: Michael Kiessling, for serving on my oral qualifying exam and thesis defense committee; Vladimir Retakh, for serving on my defense committee; Shubhangi Saraf and Swastik Kopparty, for teaching me combinatorics and serving on my oral exam committee. I am also grateful to Neil Sloane, for introducing me to the peaceable queens problem and numerous amazing integer sequences and for serving on my defense committee. I would like to thank other combinatorics graduate students here at Rutgers. From them I learned about a lot of topics in this rich and fascinating area. I would like to express my appreciation to my officemate, Lun Zhang, for interesting conversation and useful remarks. I appreciate Shalosh B. Ekhad’s impeccable computing support and the administrative support from graduate director Lev Borisov and graduate secretary Kathleen Guarino. Finally, I thank my parents. They always support and encourage me to pursue what I want in all aspects of my life. \figurespage\afterpreface

Chapter 1 Introduction

Since the creation of computers, they have been playing a more and more important role in our everyday life and the advancement of science and technology, bringing efficiency and convenience and reshaping our world.

Especially, the use of computers is becoming increasingly popular in mathematics. The proof of the four color theorem would be impossible without computers. Compared with human beings, computers are faster, more powerful, tireless, less error-prone. Computers can do much more than numerical computation. With the development of computer science and symbolic computation, experimental mathematics, as an area of mathematics, has been growing fast in the last several decades.

With experimental mathematics, it is much more efficient and easier to look for a pattern, test a conjecture, utilize data to make a discovery, etc. Computers can be programmed to make conjectures and provide rigorous proofs with little or no human intervention. They can also do what humans cannot do or what takes too long to complete, e.g., solving a large linear system, analyzing a large data set, symbolically computing complicated recurrence relations. Just as machine learning revolutionizes computer science, statistics and information technology, experimental mathematics revolutionizes mathematics.

The main theme of the dissertation is to use the methods of experimental mathematics to study different problems, and likewise, to illustrate the methodology and power of experimental mathematics by showing some case studies and how experimental mathematics works under various situations.

In Chapter 2, we discuss the first problem that is related to the area statistic of of parking functions. Our methods are purely finitistic and elementary, taking full advantage, of course, of our beloved silicon servants. We first introduce the background and definition of parking functions and their generalizations. For aa-parking functions, we derive the recurrence relation and the number of them when the length is nn. Furthermore, a bijection between aa-parking functions and labelled rooted forests is discovered (or possibly re-discovered). Then we consider the sum and area statistics. With the weighted counting of these statistics, the explicit formula between expectation and higher moments can be found. We also look at the limiting distribution of the area statistic, which is Airy distribution.

In Chapter 3, we use two instructive case studies on spanning trees of grid graphs and “almost diagonal” matrices, to show that often, just like Alexander the Great before us, the simple, “cheating” solution to a hard problem is the best. So before you spend days (and possibly years) trying to answer a mathematical question by analyzing and trying to ‘understand’ its structure, let your computer generate enough data, and then let it guess the answer. Often its guess can be proved by a quick ‘hand-waving’ (yet fully rigorous) ‘meta-argument’.

In Chapter 4, we apply experimental mathematics to algorithm analysis. Using recurrence relations, combined with symbolic computations, we make a detailed study of the running times of numerous variants of the celebrated Quicksort algorithms, where we consider the variants of single-pivot and multi-pivot Quicksort algorithms as discrete probability problems. With nonlinear difference equations, recurrence relations and experimental mathematics techniques, explicit expressions for expectations, variances and even higher moments of their numbers of comparisons and swaps can be obtained. For some variants, Monte Carlo experiments are performed, the numerical results are demonstrated and the scaled limiting distribution is also discussed.

In Chapter 5, finally, we discuss, and make partial progress on, the peaceable queens problem, the protagonist of OEIS sequence A250000. Symbolically, we prove that Jubin’s construction of two pentagons is at least a local optimum. Numerically, we find the exact numerical optimums for some specific configurations. Our method can be easily applied to more complicated configurations with more parameters.

All accompanying Maple packages and additional input/output files can be found at the author’s homepage:

http://sites.math.rutgers.edu/~yao .

The accompanying Maple package for Chapter 2 is ParkingStatistics.txt. There are lots of output files and nice pictures on the front of this chapter.

The accompanying Maple packages for Chapter 3 are JointConductance.txt, GFMa- trix.txt and SpanningTrees.txt. There are also numerous sample input and output files on the front of this chapter.

The accompanying Maple packages for Chapter 4 are QuickSort.txt and Findrec. txt. QuickSort.txt is the main package of this chapter and all procedures mentioned in the chapter are from this package unless noted otherwise. Findrec.txt is mainly used to find a recurrence relation, i.e., difference equation of moments from the empirical data.

The accompanying Maple package for Chapter 5 is PeaceableQueens.txt. There are lots of output files and nice pictures on the front of this chapter as well.

Experimental mathematics has made a huge impact on mathematics itself and how mathematicians discover new mathematics so far and is the mathematics of tomorrow. In the information era, the skyscraper of mathematics is becoming taller and taller. Hence we need tools better than pure human logic to maintain and continue building this skyscraper. While the computing capacity, patience and time of humans are limited, experimental mathematics, and ultimately, automated theorem proving, will be the choice of history.

Chapter 2 The Statistics of Parking Functions

This chapter is adapted from [48], which has been published on The Mathematical Intelligencer. It is also available on arXiv.org, number 1806.02680.

2.1 Introduction

Once upon a time, way back in the nineteen-sixties, there was a one-way street (with no passing allowed), with nn parking spaces bordering the sidewalk. Entering the street were nn cars, each driven by a loyal husband, and sitting next to him, dozing off, was his capricious (and a little bossy) wife. At a random time (while still along the street), the wife wakes up and orders her husband, park here, darling!. If that space is unoccupied, the hubby gladly obliges, and if the parking space is occupied, he parks, if possible, at the next still-empty parking space. Alas, if all the latter parking spaces are occupied, he has to go around the block, and drive back to the beginning of this one-way street, and then look for the first available spot. Due to construction, this wastes half an hour, making the wife very cranky.

Q: What is the probability that no one has to go around the block?

A: (n+1)n1/nnen+1(n+1)^{n-1}/n^{n}\,\asymp\,\frac{e}{n+1}.

Both the question and its elegant answer are due to Alan Konheim and Benji Weiss [30].

Suppose wife ii (1in1\leq i\leq n) prefers parking space pip_{i}, then the preferences of the wives can be summarized as an array (p1,,pn)(p_{1},\dots,p_{n}), where 1pin1\leq p_{i}\leq n. So altogether there are nnn^{n} possible preference-vectors, starting from (1,,1)(1,\dots,1) where it is clearly possible for everyone to park, and ending with (n,,n)(n,...,n) (all nn), where every wife prefers the last parking space, and of course it is impossible. Given a preference vector (p1,,pn)(p_{1},\dots,p_{n}), let (p(1),,p(n))(p_{(1)},\dots,p_{(n)}) be its sorted version, arranged in (weakly) increasing order. For example if (p1,p2,p3,p4)=(3,1,1,4)(p_{1},p_{2},p_{3},p_{4})=(3,1,1,4) then (p(1),p(2),p(3),p(4))=(1,1,3,4)(p_{(1)},p_{(2)},p_{(3)},p_{(4)})=(1,1,3,4).

We invite our readers to convince themselves that a parking space preference vector (p1,,pn)(p_{1},\dots,p_{n}) makes it possible for every husband to park without inconveniencing his wife if and only if p(i)ip_{(i)}\leq i for 1in1\leq i\leq n. This naturally leads to the following definition.

Definition 2.1 (Parking Function).

A vector of positive integers (p1,,pn)(p_{1},\dots,p_{n}) with 1pin1\leq p_{i}\leq n is a parking function if its (non-decreasing) sorted version (p(1),,p(n))(p_{(1)},\dots,p_{(n)}) (i.e. p(1)p(2)p(n)p_{(1)}\leq p_{(2)}\leq\dots\leq p_{(n)}, and the latter is a permutation of the former) satisfies

p(i)i,(1in).p_{(i)}\leq i,\quad(1\leq i\leq n).

As we have already mentioned above, Alan Konheim and Benji Weiss [30] were the first to state and prove the following theorem.

Theorem 2.2 (The Parking Function Enumeration Theorem).

There are (n+1)n1(n+1)^{n-1} parking functions of length nn.

There are many proofs of this lovely theorem, possibly the slickest is due to the brilliant human Henry Pollak, (who apparently did not deem it worthy of publication. It is quoted, e.g. in [16]). It is nicely described on pp. 4-5 of [42] (see also [43]), hence we will not repeat it here. Instead, as a warm-up to the ‘statistical’ part, and to illustrate the power of experiments, we will give a much uglier proof, that, however, is motivated.

Before going on to present our (very possibly not new) ‘humble’ proof, we should mention that one natural way to prove the Konheim-Weiss theorem is by a bijection with labeled trees on n+1n+1 vertices, that Arthur Cayley famously proved is also enumerated by (n+1)n1(n+1)^{n-1}. The first such bijection, as far as we know, was given by the great formal linguist, Marco Schützenberger [38]. This was followed by an elegant bijection by the classical combinatorial giants Dominique Foata and John Riordan [16], and others.

Since we know (at least!) 1616 different proofs of Cayley’s formula (see, e.g. [54]), and at least four different bijections between parking functions and labeled trees, there are at least 6464 different proofs (see also [45], ex. 5.49) of the Parking Enumeration theorem. To these one must add proofs like Pollak’s, and a few other ones.

Curiously, our ‘new’ proof has some resemblance to the very first one in [30], since they both use recurrences (one of the greatest tools in the experimental mathematician’s tool kit!), but our proof is (i) motivated and (ii) experimental (yet fully rigorous).

2.2 An Experimental Mathematics Motivated Proof

When encountering a new combinatorial family, the first task is to write a computer program to enumerate as many terms as possible, and hope to conjecture a nice formula. One can also try and “cheat” and use the great OEIS, to see whether anyone came up with this sequence before, and see whether this new combinatorial family is mentioned there.

A very brute force approach, that will not go very far (but would suffice to get the first five terms needed for the OEIS) is to list the superset, in this case all the nnn^{n} vectors in {1n}n\{1\dots n\}^{n} and for each of them sort it, and see whether the condition p(i)ip_{(i)}\leq i holds for all 1in1\leq i\leq n. Then count the vectors that pass this test.

But a much better way is to use dynamical programming to express the desired sequence, let’s call it a(n)a(n), in terms of values a(i)a(i) for i<ni<n.

Let’s analyze the anatomy of a typical parking function of length nn. A natural parameter is the number of 11’s that show up, let’s call it kk (0kn0\leq k\leq n). i.e.

p(1)=1,,p(k)=1,2p(k+1)k+1,,p(n)n.p_{(1)}=1,\quad\dots,\quad p_{(k)}=1,\quad 2\leq p_{(k+1)}\leq k+1,\quad\dots,\quad p_{(n)}\leq n.

Removing the 11’s yields a shorter weakly-increasing vector

2p(k+1)p(k+2)p(n),2\leq p_{(k+1)}\leq p_{(k+2)}\leq\dots\quad\leq\,p_{(n)},

satisfying

p(k+1)k+1,p(k+2)k+2,,p(n)n.p_{(k+1)}\leq k+1,\quad p_{(k+2)}\leq k+2,\quad\dots,\quad p_{(n)}\leq n.

Define

(q1,,qnk):=(p(k+1)1,,p(n)1).(q_{1},\dots,q_{n-k})\,:=\,(p_{(k+1)}-1,\dots,p_{(n)}-1).

The vector (q1,,qnk)(q_{1},\dots,q_{n-k}) satisfies

1q1qnk,1\leq q_{1}\leq\dots\leq q_{n-k},

and

q1k,q2k+1,,qnkn1.q_{1}\leq k,\quad q_{2}\leq k+1,\quad\dots,\quad q_{n-k}\leq n-1.

We see that the set of parking functions with exactly kk 11’s may be obtained by taking the above set of vectors of length nkn-k, adding 11 to each component, scrambling it in every which way, and inserting the kk 11’s in every which way.

Alas, the ‘scrambling’ of the set of such qq-vectors is not of the original form. We are forced to consider a more general object, namely scramblings of vectors of the form p(1)p(n)p_{(1)}\leq\dots\leq p_{(n)} with the condition

p(1)a,p(2)a+1,,p(n)a+n1,p_{(1)}\leq a,\quad p_{(2)}\leq a+1,\quad\dots,\quad p_{(n)}\leq a+n-1,

for a general, positive integer aa, not just for a=1a=1. So in order to get the dynamical programming recurrence rolling we are forced to introduce a more general object, called an aa-parking function. This leads to the following definition.

Definition 2.3 (aa-Parking Function).

A vector of positive integers (p1,,pn)(p_{1},\dots,p_{n}) with 1pin+a11\leq p_{i}\leq n+a-1 is an aa-parking function if its (non-decreasing) sorted version (p(1),,p(n))(p_{(1)},\dots,p_{(n)}) (i.e. p(1)p(2)p(n)p_{(1)}\leq p_{(2)}\leq\dots\leq p_{(n)}, and the latter is a permutation of the former) satisfies

p(i)a+i1,(1in).p_{(i)}\leq a+i-1,\quad(1\leq i\leq n).

Note that the usual parking functions are the special case a=1a=1. So if we would be able to find an efficient recurrence for counting aa-parking functions, we would be able to answer our original question.

So let’s redo the above ‘anatomy’ for these more general creatures, and hope that the two parameters nn and aa would suffice to establish a recursive scheme, and we won’t need to introduce yet more general creatures.

Let’s analyze the anatomy of a typical aa-parking function of length nn. Again, a natural parameter is the number of 11’s that show up, let’s call it kk (0kn0\leq k\leq n). i.e.

p(1)=1,,p(k)=1,2p(k+1)a+k,,p(n)a+n1.p_{(1)}=1,\quad\dots,\quad p_{(k)}=1,\quad 2\leq p_{(k+1)}\leq a+k,\quad\dots,p_{(n)}\leq a+n-1.

Removing the 11’s yields a sorted vector

2p(k+1)p(k+2)p(n),2\leq p_{(k+1)}\leq p_{(k+2)}\leq\dots\,\leq\,p_{(n)},

satisfying

p(k+1)k+a,p(k+2)k+a+1,,p(n)n+a1.p_{(k+1)}\leq k+a,\quad p_{(k+2)}\leq k+a+1,\quad\dots,\quad p_{(n)}\leq n+a-1.

Define

(q1,,qnk):=(p(k+1)1,,p(n)1).(q_{1},\dots,q_{n-k})\,:=\,(p_{(k+1)}-1,\quad\dots,\quad p_{(n)}-1).

The vector (q1,,qnk)(q_{1},\dots,q_{n-k}) satisfies

q1qnkq_{1}\leq\dots\leq q_{n-k}

and

q1k+a1,q2k+a,,qnkn+a1.q_{1}\leq k+a-1,\quad q_{2}\leq k+a,\quad\dots,\quad q_{n-k}\leq n+a-1.

We see that the set of aa-parking functions with exactly kk 11’s may be obtained by taking the above set of vectors of length nkn-k, adding 11 to each component, scrambling it in every which way, and inserting the kk 11’s in every which way.

But now the set of scramblings of the vectors (q1,,qnk)(q_{1},\dots,q_{n-k}) is an old friend!. It is the set of (a+k1)(a+k-1)-parking functions of length nkn-k. To get all aa-parking functions of length nn with exactly kk ones we need to take each and every member of the set of (a+k1)(a+k-1)-parking functions of length nkn-k, add 11 to each component, and insert kk ones in every which way. There are (nk){{n}\choose{k}} ways of doing it. Hence the number of aa-parking functions of length nn with exactly kk ones is (nk){{n}\choose{k}} times the number of (a+k1)(a+k-1)-parking functions of length nkn-k. Summing over all kk between 0 and nn we get the following recurrence.

Proposition 2.4 (Fundamental Recurrence for aa-parking functions).

Let p(n,a)p(n,a) be the number of aa-parking functions of length nn. We have the recurrence

p(n,a)=k=0n(nk)p(nk,a+k1),p(n,a)\,=\,\sum_{k=0}^{n}\,{{n}\choose{k}}p(n-k,a+k-1),

subject to the boundary conditions p(n,0)=0p(n,0)=0 for n1n\geq 1, and p(0,a)=1p(0,a)=1 for a0a\geq 0.

Note that in the sense of Wilf [47], this already answers the enumeration problem to compute p(n,a)p(n,a) and hence p(n,1)=p(n)p(n,1)=p(n), since this gives us a polynomial time algorithm to compute p(n)p(n) (and p(n,a)p(n,a)).

Moving the term k=0k=0 from the right to the left, and denoting p(n,a)p(n,a) by pn(a)p_{n}(a) we have

pn(a)pn(a1)=k=1n(nk)pnk(a+k1).p_{n}(a)-p_{n}(a-1)\,=\,\sum_{k=1}^{n}\,{{n}\choose{k}}p_{n-k}(a+k-1).

Hence we can express pn(a)p_{n}(a) as follows, in terms of pm(a)p_{m}(a) with m<nm<n.

pn(a)=b=0a(k=1n(nk)pnk(b+k1)).p_{n}(a)=\sum_{b=0}^{a}\left(\sum_{k=1}^{n}\,{{n}\choose{k}}p_{n-k}(b+k-1)\right).

Here is the Maple code that implements it



p:=proc(n,a) local k,b:
if n=0 then
RETURN(1)
else
factor(subs(b=a,sum(expand(add(binomial(n,k)*subs(a=a+k-1,p(n-k,a)),
k=1..n)),a=1..b))):
fi:
end:

If you copy-and-paste this onto a Maple session, as well as the line below,

[seq(p(i,a),i=1..8)];

you would immediately get

[a,a(a+2),a(a+3)2,a(a+4)3,a(a+5)4,a(a+6)5,a(a+7)6,a(a+8)7].[a,a\left(a+2\right),a\left(a+3\right)^{2},a\left(a+4\right)^{3},a\left(a+5\right)^{4},a\left(a+6\right)^{5},a\left(a+7\right)^{6},a\left(a+8\right)^{7}].

Note that these are rigorously proved exact expressions, in terms of general aa (i.e. symbolic aa) for pn(a)p_{n}(a), for 1n101\leq n\leq 10, and we can easily get more. The following guess immediately comes to mind

p(n,a)=pn(a)=a(a+n)n1.p(n,a)=p_{n}(a)=a(a+n)^{n-1}.

How to prove this rigorously? If you set q(n,a):=a(a+n)n1q(n,a):=a(a+n)^{n-1}, since q(n,0)=0q(n,0)=0 and q(0,a)=1q(0,a)=1, the fact that p(n,a)=q(n,a)p(n,a)=q(n,a) would follow by induction once you prove that q(n,a)q(n,a) also satisfies the same fundamental recurrence.

q(n,a)=k=0n(nk)q(nk,a+k1).q(n,a)\,=\,\sum_{k=0}^{n}\,{{n}\choose{k}}q(n-k,a+k-1).

In other words, in order to prove that p(n,a)=a(n+a)n1p(n,a)=a(n+a)^{n-1}, we have to prove the identity

a(a+n)n1=k=0n(nk)(a+k1)(a+n1)nk1.a(a+n)^{n-1}\,=\,\sum_{k=0}^{n}\,{{n}\choose{k}}(a+k-1)(a+n-1)^{n-k-1}.
Proof.

Let’s define

f(x):=k=0n(nk)(a+k1)xnk1,f(x)\,:=\,\sum_{k=0}^{n}\,{{n}\choose{k}}(a+k-1)x^{n-k-1},

hence

f(x)=a1xk=0n(nk)xnk+k=0nk(nk)xnk1=a1xk=0n(nk)xnk+nk=0n(nk)xnk1k=0n(nk)(nk)xnk1=a1+nxk=0n(nk)xnkk=0n(nk)(nk)xnk1.\begin{split}f(x)&=\frac{a-1}{x}\sum_{k=0}^{n}\,{{n}\choose{k}}x^{n-k}+\sum_{k=0}^{n}\,k{{n}\choose{k}}x^{n-k-1}\\ &=\frac{a-1}{x}\sum_{k=0}^{n}\,{{n}\choose{k}}x^{n-k}+n\sum_{k=0}^{n}\,{{n}\choose{k}}x^{n-k-1}-\sum_{k=0}^{n}\,(n-k){{n}\choose{k}}x^{n-k-1}\\ &=\frac{a-1+n}{x}\sum_{k=0}^{n}\,{{n}\choose{k}}x^{n-k}-\sum_{k=0}^{n}\,(n-k){{n}\choose{k}}x^{n-k-1}.\end{split}

As an immediate consequence of the binomial theorem:

k=0n(nk)xnk=(1+x)n\sum_{k=0}^{n}\,{{n}\choose{k}}x^{n-k}=(1+x)^{n}

and

k=0n(nk)(nk)xnk1=n(1+x)n1,\sum_{k=0}^{n}\,(n-k){{n}\choose{k}}x^{n-k-1}=n(1+x)^{n-1},

which is trivial to both humans and machines, we have

f(x)=a1+xx(1+x)nn(1+x)n1.f(x)=\frac{a-1+x}{x}(1+x)^{n}-n(1+x)^{n-1}.

By setting x=a+n1x=a+n-1, we get

f(x)=(a+n)nn(a+n)n1=a(a+n)n1.\begin{split}f(x)&=(a+n)^{n}-n(a+n)^{n-1}\\ &=a(a+n)^{n-1}.\end{split}

This completes the proof. ∎

We have just rigorously reproved the following well-known theorem.

Theorem 2.5.

The number of aa-parking functions of length nn is

p(n,a)=a(a+n)n1.p(n,a)=a\,(a+n)^{n-1}.

In particular, by substituting a=1a=1, we reproved the original Konheim-Weiss theorem that p(n,1)=(n+1)n1p(n,1)=(n+1)^{n-1}.

2.3 Bijection between aa-Parking Functions & Labelled Rooted Forests

We consider forests with aa components and totally a+na+n vertices where the roots in the aa components are 1,2,,a1,2,\dots,a. Vertices which are not roots are labelled a+1,a+2,,a+na+1,a+2,\dots,a+n.

Let t(n,a)t(n,a) be the number of such labelled rooted forests with aa components and a+na+n vertices.

Theorem 2.6.

The number of labelled rooted forests with aa components and a+na+n vertices is

t(n,a)=a(a+n)n1.t(n,a)=a(a+n)^{n-1}.
Proof.

When n=0n=0, obviously t(n,a)=1t(n,a)=1 for any aa. When n1n\geq 1 and a=0a=0, t(n,a)=0t(n,a)=0 since there does not exist such a tree with zero component and a positive number of vertices.

Since we want to prove t(n,a)=p(n,a)t(n,a)=p(n,a), the number of aa-parking functions of length nn and they satisfy the same boundary condition, it is natural to think about the recurrence relation for t(n,a)t(n,a). Consider the number of neighbors of vertex 1, say, the number is k,(0kn)k,(0\leq k\leq n), then remove them with their own subtrees as new components and delete vertex 1. Then there are a+k1a+k-1 components and nkn-k non-rooted vertices. Though in this case the labeling of vertices does not follow our rule, there is a unique relabeling which makes it do. When the number of neighbors of vertex 1 is kk, there are (nk){{n}\choose{k}} choices, so

t(n,a)=k=0n(nk)t(nk,a+k1).t(n,a)=\sum_{k=0}^{n}{{n}\choose{k}}t(n-k,a+k-1).

It has exactly the same recurrence relation as p(n,a)p(n,a), hence

t(n,a)=p(n,a)=a(a+n)n1.t(n,a)=p(n,a)=a(a+n)^{n-1}.

As p(n,a)p(n,a) and t(n,a)t(n,a) are the same, it would be interesting to find some meaningful bijection between aa-parking functions of length nn and labelled rooted forests with aa components and a+na+n vertices. We discover or possibly re-discover a bijection between them. This bijection can be best demonstrated via an example as follows.

Assume we already have a 2-parking function with length 7, say 5842121, we’d like to map it to a labelled rooted forests with 2 components and 9 vertices where 1 and 2 are the roots for the components. Because the vertices 1 and 2 are already roots, we use the following two-line notation (#):

vertices:\displaystyle vertices: 3\displaystyle 3 4\displaystyle 4 5\displaystyle 5 6\displaystyle 6 7\displaystyle 7 8\displaystyle 8 9\displaystyle 9
2parkingfunction:\displaystyle 2-parkingfunction: 5\displaystyle 5 8\displaystyle 8 4\displaystyle 4 2\displaystyle 2 1\displaystyle 1 2\displaystyle 2 1\displaystyle 1

Let’s consider the weakly-increasing version (*) first where we just sort the second line of (#):

vertices:\displaystyle vertices: 3\displaystyle 3 4\displaystyle 4 5\displaystyle 5 6\displaystyle 6 7\displaystyle 7 8\displaystyle 8 9\displaystyle 9
2parkingfunction:\displaystyle 2-parkingfunction: 1\displaystyle 1 1\displaystyle 1 2\displaystyle 2 2\displaystyle 2 4\displaystyle 4 5\displaystyle 5 8\displaystyle 8

We interpret (*) as follows: the parent of vertices 3 and 4 is 1, 5’s and 6’s parent is 2, etc. Hence we have the following forest.

Refer to caption
Figure 2.1: The Forest Associated with (*)

If we sort both lines of (#) according to the second line, then we have

vertices:\displaystyle vertices: 7\displaystyle 7 9\displaystyle 9 6\displaystyle 6 8\displaystyle 8 5\displaystyle 5 3\displaystyle 3 4\displaystyle 4
2parkingfunction:\displaystyle 2-parkingfunction: 1\displaystyle 1 1\displaystyle 1 2\displaystyle 2 2\displaystyle 2 4\displaystyle 4 5\displaystyle 5 8\displaystyle 8

Comparing the first line with that of (*), we have a map

3\displaystyle 3 4\displaystyle 4 5\displaystyle 5 6\displaystyle 6 7\displaystyle 7 8\displaystyle 8 9\displaystyle 9
\displaystyle\downarrow \displaystyle\downarrow \displaystyle\downarrow \displaystyle\downarrow \displaystyle\downarrow \displaystyle\downarrow \displaystyle\downarrow
7\displaystyle 7 9\displaystyle 9 6\displaystyle 6 8\displaystyle 8 5\displaystyle 5 3\displaystyle 3 4\displaystyle 4

So the 2-parking function 5842121 is mapped to the following forest:

Refer to caption
Figure 2.2: The Labelled Rooted Forest Which the 2-Parking Function 5842121 is mapped to

One convention is that when we draw the forests, for the same parent, we always place its children in an increasing order (from left to right).

Conversely, if we already have the forest in Figure 2.2 and we’d like to map it to a 2-parking function, then we start with indexing each vertex. The rule is that we start from the first level, i.e. the root and start from the left, then we index the vertices 1, 2, …as follows with indexes in the bracket:

Refer to caption
Figure 2.3: The Indexed Labelled Rooted Forest

Now let the first line still be 3456789. For each of them in the first line, the second line number should be the index of its parent. Thus we have

vertices:\displaystyle vertices: 3\displaystyle 3 4\displaystyle 4 5\displaystyle 5 6\displaystyle 6 7\displaystyle 7 8\displaystyle 8 9\displaystyle 9
2parkingfunction:\displaystyle 2-parkingfunction: 5\displaystyle 5 8\displaystyle 8 4\displaystyle 4 2\displaystyle 2 1\displaystyle 1 2\displaystyle 2 1\displaystyle 1

which is exactly (#).

There are other bijections. For example, in the paper [7], the authors define a bijection without using recurrence between the set of aa-parking functions of length nn to the set of rooted labelled forests with aa components and a+na+n vertices, for which

(n+12)Sum(p1,,pn)=inv(F),{{n+1}\choose 2}-Sum(p_{1},\dots,p_{n})=inv(F),

where Sum(p1,,pn)Sum(p_{1},\dots,p_{n}) is the sum statistic of parking functions defined in Section 2.5, inv(F)inv(F) is the inversion of a forest FF, and the parking function (p1,,pn)(p_{1},\dots,p_{n}) is mapped to the forest FF (vice versa).

2.4 From Enumeration to Statistics

Often in enumerative combinatorics, the class of interest has natural ‘statistics’, like height, weight, and IQ for humans, and one is interested rather than, for a finite set AA,

|A|:=aA1,|A|\,:=\,\sum_{a\in A}1,

called the naive counting, and getting a number (obviously a non-negative integer), by the so-called weighted counting,

|A|x:=aAxf(a),|A|_{x}\,:=\,\sum_{a\in A}x^{f(a)},

where f:=AZf:=A\rightarrow Z is the statistic in question. To go from the weighted enumeration (a certain Laurent polynomial) to straight enumeration, one sets x=1x=1, i.e. |A|1=|A||A|_{1}=|A|.

Since this is mathematics, and not accounting, the usual scenario is not just one specific set AA, but a sequence of sets {An}n=0\{A_{n}\}_{n=0}^{\infty}, and then the enumeration problem is to have an efficient description of the numerical sequence an:=|An|a_{n}:=|A_{n}|, ready to be looked-up (or submitted) to the OEIS, and its corresponding sequence of polynomials Pn(x):=|An|xP_{n}(x):=|A_{n}|_{x}.

It often happens that the statistic ff, defined on AnA_{n}, has a scaled limiting distribution. In other words, if you draw a histogram of ff on AnA_{n},, and do the obvious scaling, they get closer and closer to a certain continuous curve, as nn goes to infinity.

The scaling is as follows. Let En(f)E_{n}(f) and Varn(f)Var_{n}(f) the expectation and variance of the statistic ff defined on AnA_{n}, and define the scaled random variable, for aAna\in A_{n}, by

Xn(a):=f(a)En(f)Varn(f).X_{n}(a):=\frac{f(a)-E_{n}(f)}{\sqrt{Var_{n}(f)}}.

If you draw the histograms of Xn(a)X_{n}(a) for large nn, they look practically the same, and converge to some continuous limit.

A famous example is coin tossing. If AnA_{n} is {1,1}n\{-1,1\}^{n}, and f(v)f(v) is the sum of vv, then the limiting distribution is the bell shaped curve aka standard normal distribution aka Gaussian distribution.

As explained in [55], a purely finitistic approach to finding, and proving, a limiting scaled distribution, is via the method of moments. Using symbolic computation, the computer can rigorously prove exact expressions for as many moments as desired, and often (like in the above case, see [55]) find a recurrence for the sequence of moments. This enables one to identify the limits of the scaled moments with the moments of the continuous limit (in the example of coin-tossing [and many other cases], ex2/22π\frac{e^{-x^{2}/2}}{\sqrt{2\pi}}, whose moments are famously 1,0,13,0,135,0,1357,0,1,0,1\cdot 3,0,1\cdot 3\cdot 5,0,1\cdot 3\cdot 5\cdot 7,0,\dots) . Whenever this is the case the discrete family of random variables is called asymptotically normal. Whenever this is not the case, it is interesting and surprising.

2.5 The Sum and Area Statistics of Parking Functions

Let 𝒫(n,a)\mathcal{P}(n,a) be the set of aa-parking functions of length nn.

A natural statistic is the sum

Sum(p1,,pn):=p1+p2++pn=i=1npi.Sum(p_{1},\dots,p_{n}):=p_{1}+p_{2}+\dots+p_{n}=\sum_{i=1}^{n}p_{i}.

Another, even more natural (see the beautiful article [8]), happens to be

Area(p):=n(2a+n1)2Sum(p).Area(p):=\frac{n(2a+n-1)}{2}-Sum(p).

Let P(n,a)(x)P(n,a)(x) be the weighted analog of p(n,a)p(n,a), according to Sum, i.e.

P(n,a)(x):=p𝒫(n,a)xSum(p).P(n,a)(x)\,:=\,\sum_{p\in\mathcal{P}(n,a)}x^{Sum(p)}.

Analogously, let Q(n,a)(x)Q(n,a)(x) be the weighted analog of p(n,a)p(n,a), according to Area, i.e.

Q(n,a)(x):=p𝒫(n,a)xArea(p).Q(n,a)(x)\,:=\,\sum_{p\in\mathcal{P}(n,a)}x^{Area(p)}.

Clearly, one can easily go from one to the other

Q(n,a)(x)=x(2a+n1)n/2P(n,a)(x1),P(n,a)(x)=x(2a+n1)n/2Q(n,a)(x1).Q(n,a)(x)\,=\,x^{(2a+n-1)n/2}\,P(n,a)(x^{-1}),\quad P(n,a)(x)\,=\,x^{(2a+n-1)n/2}\,Q(n,a)(x^{-1}).

How do we compute P(n,a)(x)P(n,a)(x)?, (or equivalently, Q(n,a)(x)Q(n,a)(x)?). It is readily seen that the analog of Fundamental Recurrence for the weighted counting is

P(n,a)(x)=xnk=0n(nk)P(nk,a+k1)(x),P(n,a)(x)\,=\,x^{n}\,\sum_{k=0}^{n}\,{{n}\choose{k}}P(n-k,a+k-1)(x),

subject to the initial conditions P(0,a)(x)=1P(0,a)(x)=1 and P(n,0)(x)=0P(n,0)(x)=0.

So it is almost the same, the “only” change is sticking xnx^{n} in front of the sum on the right hand side.

Equivalently,

Q(n,a)(x)=k=0n(nk)xk(k+2a3)/2Q(nk,a+k1)(x),Q(n,a)(x)\,=\,\,\sum_{k=0}^{n}\,{{n}\choose{k}}x^{k(k+2a-3)/2}\,Q(n-k,a+k-1)(x),

subject to the initial conditions Q(0,a)(x)=1Q(0,a)(x)=1 and Q(n,0)(x)=0Q(n,0)(x)=0.

Once again, in the sense of Wilf, this is already an answer, but because of the extra variable xx, one can not go as far as we did before for the naive, merely numeric, counting.

It is very unlikely that there is a “closed form” expression for P(n,a)(x)P(n,a)(x) (and hence Q(n,a)(x)Q(n,a)(x)), but for statistical purposes it would be nice to get “closed form” expressions for

\bullet the expectation,

\bullet the variance,

\bullet as many factorial moments as possible, from which the ‘raw’ moments, and latter the centralized moments and finally the scaled moments can be gotten. Then we can take the limits as nn goes to infinity, and see if they match the moments of any of the known continuous distributions, and prove rigorously that, at least for that many moments, the conjectured limiting distribution matches.

In our case, the limiting distribution is the intriguing so-called Airy distribution, that Svante Janson prefers to call “the area under Brownian excursion”. This result was stated and proved in [8], by using deep and sophisticated continuous probability theory and continuous martingales. Here we will “almost” prove this result, in the sense of showing that the limits of the scaled moments of the area statistic on parking functions coincide with the scaled moments of the Airy distribution up to the 3030-th moment, and we can go much further.

But we can do much more than continuous probabilists. We (or rather our computers, running Maple) can find exact polynomial expressions in nn and the expectation E1(n)E_{1}(n). We can do it for any desired number of moments, say 3030. Unlike continuous probability theorists, our methods are entirely elementary, only using high school algebra.

We can also do the same thing for the more general aa-parking functions. Now the expressions are polynomials in nn, aa, and the expectation E1(n,a)E_{1}(n,a).

Finally, we believe that our approach, using the fundamental recurrence of area statistic, can be used to give a full proof (for all moments), by doing it asymptotically, and deriving a recurrence for the leading terms of the asymptotics for the factorial moments that would coincide with the well-known recurrence for the moments of the Airy distribution given, for example in Eqs. (4) and (5) of Svante Janson’s article [23]. This is left as a challenge to our readers.

The expectation of the sum statistic, let’s call it Esum(n,a)E_{sum}(n,a) is given by

Esum(n,a)=P(n,a)(1)P(n,a)(1)=P(n,a)(1)a(a+n)n1,E_{sum}(n,a)\,=\,\frac{P^{\prime}(n,a)(1)}{P(n,a)(1)}\,=\,\frac{P^{\prime}(n,a)(1)}{a(a+n)^{n-1}},

where the prime denotes, as usual, differentiation w.r.t. xx.

Can we get a closed-form expression for P(n,a)(1)P^{\prime}(n,a)(1), and hence for Esum(n,a)E_{sum}(n,a)?

Differentiating the fundamental recurrence of P(n,a)(x)P(n,a)(x) with respect to xx, using the product rule, we get

P(n,a)(x)=xnk=0n(nk)P(nk,a+k1)(x)+nxn1k=0n(nk)P(nk,a+k1)(x).P(n,a)^{\prime}(x)\,=\,x^{n}\,\sum_{k=0}^{n}\,{{n}\choose{k}}P(n-k,a+k-1)^{\prime}(x)\,+\,\,nx^{n-1}\,\sum_{k=0}^{n}\,{{n}\choose{k}}\,P(n-k,a+k-1)(x).

Plugging-in x=1x=1 we get that P(n,a)(1)P(n,a)^{\prime}(1) satisfies the recurrence

P(n,a)(1)k=0n(nk)P(nk,a+k1)(1)=nk=0n(nk)P(nk,a+k1)(1)=np(n,a).P(n,a)^{\prime}(1)-\sum_{k=0}^{n}\,{{n}\choose{k}}P(n-k,a+k-1)^{\prime}(1)=n\,\sum_{k=0}^{n}\,{{n}\choose{k}}P(n-k,a+k-1)(1)=n\,p(n,a).

Using this recurrence, we can, just as we did for p(n,a)p(n,a) above, get expressions, as polynomials in aa, for numeric 1n101\leq n\leq 10, say, and then conjecture that

P(n,a)(1)=12an(a+n1)(a+n)n112j=1n(nj)j!a(a+n)nj.P^{\prime}(n,a)(1)=\frac{1}{2}\,a\,n\,(a+n-1)\,(a+n)^{n-1}-\frac{1}{2}\sum_{j=1}^{n}{{n}\choose{j}}\,j!\,a\,(a+n)^{n-j}.

To prove it, one plugs in the left side into the above recurrence of P(n,a)(1)P(n,a)^{\prime}(1), changes the order of summation, and simplifies. This is rather tedious, but since at the end of the day, these are equivalent to polynomial identities in nn and aa, checking it for sufficiently many special values of nn and aa would be a rigorous proof.

It follows that

Esum(n,a)=n(a+n+1)212j=1nn!(nj)!(a+n)j1.E_{sum}(n,a)\,=\,\frac{n(a+n+1)}{2}-\frac{1}{2}\,\sum_{j=1}^{n}\frac{n!}{(n-j)!(a+n)^{j-1}}.

This formula first appears in [31].

Equivalently,

Earea(n,a)=n(a2)2+12j=1nn!(nj)!(a+n)j1.E_{area}(n,a)\,=\,\frac{n\,(a-2)}{2}\,+\,\frac{1}{2}\,\sum_{j=1}^{n}\frac{n!}{(n-j)!(a+n)^{j-1}}.

In particular, for the primary object of interest, the case a=1a=1, we get

Earea(n,1)=n2+12j=1nn!(nj)!(n+1)j1.E_{area}(n,1)\,=\,-\frac{n}{2}\,+\,\frac{1}{2}\,\sum_{j=1}^{n}\frac{n!}{(n-j)!(n+1)^{j-1}}.

This rings a bell! It may written as

Earea(n,1)=n2+12Wn+1,E_{area}(n,1)\,=\,-\frac{n}{2}\,+\,\frac{1}{2}W_{n+1},

where WnW_{n} is the iconic quantity,

Wn=n!nn1k=0n2nkk!,W_{n}\,=\,\frac{n!}{n^{n-1}}\sum_{k=0}^{n-2}\frac{n^{k}}{k!},

proved by Riordan and Sloane [35] to be the expectation of another very important quantity, the sum of the heights on rooted labeled trees on nn vertices. In addition to its considerable mathematical interest, this quantity, WnW_{n}, has great historical significance, it was the first sequence, sequence A435A435 of the amazing On-Line Encyclopedia of Integer Sequences (OEIS), now with more than 300000300000 sequences! See [12] for details, and far-reaching extensions, analogous to the present chapter.

[The reason it is not sequence A1 is that initially the sequences were arranged in lexicographic order.]

Another fact, that will be of great use later in this chapter, is that, as noted in [35], Ramanujan and Watson proved that WnW_{n} (and hence Wn+1W_{n+1}) is asymptotic to

2π2n3/2.\frac{\sqrt{2\pi}}{2}\,n^{3/2}.

It is very possible that the formula Earea(n,1)=n2+12Wn+1E_{area}(n,1)\,=\,-\frac{n}{2}\,+\,\frac{1}{2}W_{n+1} may also be deduced from the Riordan-Sloane result via one of the numerous known bijections between parking functions and rooted labeled trees. More generally, the results below, for the special case a=1a=1, might be deduced, from those of [12], but we believe that the present methodology is interesting for its own sake, and besides in our current approach (that uses recurrences rather than the Lagrange Inversion Formula), it is much faster to compute higher moments, hence, going in the other direction, would produce many more moments for the statistic on rooted labeled trees considered in [12], provided that there is indeed such a correspondence that sends the area statistic on parking functions (suitably tweaked) to the Riordan-Sloane statistic on rooted labeled trees.

2.6 The Limiting Distribution

Given a combinatorial family, one can easily get an idea of the limiting distribution by taking a large enough nn, say n=100n=100, and generating a large enough number of random objects, say 5000050000, and drawing a histogram, see Figure 2 in Diaconis and Hicks’ insightful article [8]. But, one does not have to resort to simulation. While it is impractical to consider all 10199101^{99} parking functions of length 100100, the generating function Q(100,1)(x)Q(100,1)(x) contains the exact count for each conceivable area from 0 to (1002){{100}\choose{2}}.

Refer to caption
Figure 2.4: A Histogram of the Area of Parking Functions of Length 100

But an even more informative way to investigate the limiting distribution is to draw the histogram of the probability generating function of the scaled distribution

Xn(p):=Area(p)EnVarn,X_{n}(p):=\frac{Area(p)-E_{n}}{\sqrt{Var_{n}}},

where EnE_{n} and VarnVar_{n} are the expectation and variance respectively.

Refer to caption
Figure 2.5: The Scaled Distribution of the Area of Parking Functions of Length 100
Refer to caption
Figure 2.6: The Scaled Distribution of the Area of Parking Functions of Length 120

As proved in [8] (using deep results in continuous probability due to David Aldous, Svante Janson, and Chassaing and Marcket) the limiting distribution is the Airy distribution. We will soon “almost” prove it, but do much more by discovering exact expressions for the first 3030 moments, not just their limiting asymptotics.

2.7 Truly Exact Expressions for the Factorial Moments

In [32] there is an “exact” expression for the general moment, that is not very useful for our purposes. If one traces their proof, one can, conceivably, get explicit expressions for each specific moment, but they did not bother to implement it, and the asymptotics are not immediate.

We discovered, the following important fact.

Fact. Let E1(n,a):=Earea(n,a)E_{1}(n,a):=E_{area}(n,a) be the expectation of the area statistic on aa-parking functions of length nn, given above, and let Ek(n,a)E_{k}(n,a) be the kk-th factorial moment

Ek(n,a):=Q(k)(n,a)(1)a(a+n)n1,E_{k}(n,a)\,:=\,\frac{Q^{(k)}(n,a)(1)}{a(a+n)^{n-1}},

then there exist polynomials Ak(n,a)A_{k}(n,a) and Bk(n,a)B_{k}(n,a) such that

Ek(n,a)=Ak(n,a)+Bk(n,a)E1(n,a).E_{k}(n,a)\,=A_{k}(n,a)\,+\,B_{k}(n,a)\,E_{1}(n,a).

The beauty of experimental mathematics is that these can be found by cranking out enough data, using the sequence of probability generating functions Q(n,a)(x)Q(n,a)(x), obtained by using the fundamental recurrence of area statistic, getting sufficiently many numerical data for the moments, and using undetermined coefficients. These can be proved a posteriori by taking these truly exact formulas and verifying that the implied recurrences for the kk-th factorial moment, in terms of the previous ones. But this is not necessary. Since, at the end of the day, it all boils down to verifying polynomial identities, so, once again, verifying them for sufficiently many different values of (n,a)(n,a) constitutes a rigorous proof. To be fully rigorous, one needs to prove a priori bounds for the degrees in nn and aa, but, in our humble opinion, it is not that important, and could be left to the obtuse reader.

Theorem 2.7 (equivalent to a result in [31]).

The expectation of the area statistic on parking functions of length nn is

E1(n):=n2+12(n+1)!(n+1)nk=0n1(n+1)kk!,E_{1}(n):=\,-\frac{n}{2}\,+\,\frac{1}{2}\,\frac{(n+1)!}{(n+1)^{n}}\sum_{k=0}^{n-1}\frac{(n+1)^{k}}{k!},

and asymptotically it equals 2π4n3/2+O(n)\frac{\sqrt{2\pi}}{4}\cdot n^{3/2}+O(n).

Theorem 2.8.

The second factorial moment of the area statistic on parking functions of length nn is

73(n+1)E1(n)+512n3112n213n,-\frac{7}{3}(n+1)\,E_{1}(n)+{\frac{5}{12}}\,{n}^{3}-\frac{1}{12}\,{n}^{2}-\frac{1}{3}\,n,

and asymptotically it equals 512n3+O(n5/2)\frac{5}{12}\cdot n^{3}+O(n^{5/2}).

Theorem 2.9.

The third factorial moment of the area statistic on parking functions of length nn is

175192n4283192n3+199192n2+259192n+(1532n3+52196n2+121996n+74396)E1(n),-{\frac{175}{192}}\,{n}^{4}-{\frac{283}{192}}\,{n}^{3}+{\frac{199}{192}}\,{n}^{2}+{\frac{259}{192}}\,n+\left({\frac{15}{32}}\,{n}^{3}+{\frac{521}{96}}\,{n}^{2}+{\frac{1219}{96}}\,n+{\frac{743}{96}}\right)\,E_{1}(n),

and asymptotically it equals 151282πn9/2+O(n4)\frac{15}{128}\sqrt{2\pi}\cdot n^{9/2}+O(n^{4}).

Theorem 2.10.

The fourth factorial moment of the area statistic on parking functions of length nn is

2211008n6+6373730240n5+10189715120n4+222175040n31375189n218746330240n{\frac{221}{1008}}\,{n}^{6}+{\frac{63737}{30240}}\,{n}^{5}+{\frac{101897}{15120}}\,{n}^{4}+{\frac{22217}{5040}}\,{n}^{3}-{\frac{1375}{189}}\,{n}^{2}-{\frac{187463}{30240}}\,n
+(3516n444927n31302432520n27409105n50380315120)E1(n),+\left(-{\frac{35}{16}}\,{n}^{4}-{\frac{449}{27}}\,{n}^{3}-{\frac{130243}{2520}}\,{n}^{2}-{\frac{7409}{105}}\,n-{\frac{503803}{15120}}\right)\,E_{1}(n),

and asymptotically it equals 2211008n6+O(n11/2)\frac{221}{1008}\cdot n^{6}+O(n^{11/2}).

Theorem 2.11.

The fifth factorial moment of the area statistic on parking functions of length nn is

105845110592n72170159290304n6999556513870720n530773609725760n4-{\frac{105845}{110592}}\,{n}^{7}-{\frac{2170159}{290304}}\,{n}^{6}-{\frac{99955651}{3870720}}\,{n}^{5}-{\frac{30773609}{725760}}\,{n}^{4}
9484690311612160n3+24676991483840n2+39276390111612160n-{\frac{94846903}{11612160}}\,{n}^{3}+{\frac{24676991}{483840}}\,{n}^{2}+{\frac{392763901}{11612160}}\,n
+(5652048n6+1005128n5+9832585165888n4+11113495184n3+8263585271935360n2+({\frac{565}{2048}}\,{n}^{6}+{\frac{1005}{128}}\,{n}^{5}+{\frac{9832585}{165888}}\,{n}^{4}+{\frac{1111349}{5184}}\,{n}^{3}+{\frac{826358527}{1935360}}\,{n}^{2}
+159943787362880n+10245804415806080)E1(n),+{\frac{159943787}{362880}}\,n+{\frac{1024580441}{5806080}})E_{1}(n),

and asymptotically it equals 56581922πn15/2+O(n7)\frac{565}{8192}\sqrt{2\pi}\cdot n^{15/2}+O(n^{7}).

Theorem 2.12.

The sixth factorial moment of the area statistic of parking functions of length nn is

82825576576n9+373340075110702592n8+9401544029332107776n7+14473244813127733760n6+4141393967091660538880n5{\frac{82825}{576576}}\,{n}^{9}+{\frac{373340075}{110702592}}\,{n}^{8}+{\frac{9401544029}{332107776}}\,{n}^{7}+{\frac{14473244813}{127733760}}\,{n}^{6}+{\frac{414139396709}{1660538880}}\,{n}^{5}
+88215445651332107776n418783816473332107776n36433595420291660538880n23589365404091660538880n+{\frac{88215445651}{332107776}}\,{n}^{4}-{\frac{18783816473}{332107776}}\,{n}^{3}-{\frac{643359542029}{1660538880}}\,{n}^{2}-{\frac{358936540409}{1660538880}}\,n
+(39552048n71863496144n62592832731161216n5119912501129024n414986063308163866880n3+(-{\frac{3955}{2048}}\,{n}^{7}-{\frac{186349}{6144}}\,{n}^{6}-{\frac{259283273}{1161216}}\,{n}^{5}-{\frac{119912501}{129024}}\,{n}^{4}-{\frac{149860633081}{63866880}}\,{n}^{3}
601794266581166053888n2864000570107276756480n921390308389830269440)E1(n),-{\frac{601794266581}{166053888}}\,{n}^{2}-{\frac{864000570107}{276756480}}\,n-{\frac{921390308389}{830269440}})\,E_{1}(n),

and asymptotically it equals 82825576576n9+O(n17/2)\frac{82825}{576576}\cdot n^{9}+O(n^{17/2}).

For Theorems 7-30, see the output file

http://sites.math.rutgers.edu/~zeilberg/tokhniot/oParkingStatistics7.txt

Let {ek}k=1\{e_{k}\}_{k=1}^{\infty} be the sequence of moments of the Airy distribution, defined by the recurrence given in Equations (4)(4) and (5)(5) in Svante Janson’s interesting survey paper [23]. Our computers, using our Maple package, proved that

Ek(n)=ekn3k2+O(n3k12),E_{k}(n)\,=\,e_{k}n^{\frac{3k}{2}}+O(n^{\frac{3k-1}{2}}),

for 1k301\leq k\leq 30. It follows that the limiting distribution of the area statistic is (most probably) the Airy distribution, since the first 3030 moments match. Of course, this was already known to continuous probability theorists, and we only proved it for the first 3030 moments, but:

\bullet Our methods are purely elementary and finitistic.

\bullet We can easily go much farther, i.e. prove it for more moments.

\bullet We believe that our approach, using recurrences, can be used to derive a recurrence for the leading asymptotics of the factorial moments, Ek(n)E_{k}(n), that would turn out to be the same as the above mentioned recurrence (Eqs. (4) and (5) in [23]). We leave this as a challenge to the reader.

We also have the results of the exact expressions for the first 1010 moments of the area statistic for general aa-parking. To see expressions in aa, nn, and E1(n,a)E_{1}(n,a), for the first 1010 moments of aa-parking, see

http://sites.math.rutgers.edu/~zeilberg/tokhniot/oParkingStatistics8.txt.

Chapter 3 The Gordian Knot of the CC-finite Ansatz

This chapter is adapted from [49], which has been accepted on Algorithmic Combinatorics-Enumerative Combinatorics, Special Functions, and Computer Algebra: In honor of Peter Paule’s 60th birthday. It is also available on arXiv.org, number 1812.07193.

This chapter is dedicated to Peter Paule, one of the great pioneers of experimental mathematics and symbolic computation. In particular, it is greatly inspired by his masterpiece, co-authored with Manuel Kauers, The Concrete Tetrahedron [25], where a whole chapter is dedicated to our favorite ansatz, the CC-finite ansatz.

3.1 Introduction

Once upon a time there was a knot that no one could untangle, it was so complicated. Then came Alexander the Great and, in one second, cut it with his sword.

Analogously, many mathematical problems are very hard, and the current party line is that in order for it be considered solved, the solution, or answer, should be given a logical, rigorous, deductive proof.

Suppose that you want to answer the following question:

Find a closed-form formula, as an expression in nn, for the real part of the nn-th complex root of the Riemann zeta function, ζ(s)\zeta(s) .

Let’s call this quantity a(n)a(n). Then you compute these real numbers, and find out that a(n)=12a(n)=\frac{1}{2} for n1000n\leq 1000. Later you are told by Andrew Odlyzko that a(n)=12a(n)=\frac{1}{2} for all 1n10101\leq n\leq 10^{10}. Can you conclude that a(n)=12a(n)=\frac{1}{2} for all nn? we would, but, at this time of writing, there is no way to deduce it rigorously, so it remains an open problem. It is very possible that one day it will turn out that a(n)a(n) (the real part of the nn-th complex root of ζ(s)\zeta(s)) belongs to a certain ansatz, and that checking it for the first N0N_{0} cases implies its truth in general, but this remains to be seen.

There are also frameworks, e.g. Pisot sequences (see [11], [59]), where the inductive approach fails miserably.

On the other hand, in order to (rigorously) prove that 13+23+33++n3=(n(n+1)/2)21^{3}+2^{3}+3^{3}+\dots+n^{3}=(n(n+1)/2)^{2}, for every positive integer nn, it suffices to check it for the five special cases 0n40\leq n\leq 4, since both sides are polynomials of degree 44, hence the difference is a polynomial of degree 4\leq 4, given by five ‘degrees of freedom’.

This is an example of what is called the ‘N0N_{0} principle’. In the case of a polynomial identity (like this one), N0N_{0} is simply the degree plus one.

But our favorite ansatz is the CC-finite ansatz. A sequence of numbers {a(n)}\{a(n)\} (0n<0\leq n<\infty) is CC-finite if it satisfies a linear recurrence equation with constant coefficients. For example the Fibonacci sequence that satisfies F(n)F(n1)F(n2)=0F(n)-F(n-1)-F(n-2)=0 for n2n\geq 2.

The CC-finite ansatz is beautifully described in Chapter 4 of the masterpiece The Concrete Tetrahedron [25], by Manuel Kauers and Peter Paule, and discussed at length in [58].

Here the ‘N0N_{0} principle’ also holds (see [60]), i.e. by looking at the ‘big picture’ one can determine a priori, a positive integer, often not that large, such that checking that a(n)=b(n)a(n)=b(n) for 1nN01\leq n\leq N_{0} implies that a(n)=b(n)a(n)=b(n) for all n>0n>0.

A sequence {a(n)}n=0\{a(n)\}_{n=0}^{\infty} is CC-finite if and only if its (ordinary) generating function f(t):=n=0a(n)tnf(t):=\sum_{n=0}^{\infty}a(n)\,t^{n} is a rational function of tt, i.e. f(t)=P(t)/Q(t)f(t)=P(t)/Q(t) for some polynomials P(t)P(t) and Q(t)Q(t). For example, famously, the generating function of the Fibonacci sequence is t/(1tt2)t/(1-t-t^{2}).

Phrased in terms of generating functions, the CC-finite ansatz is the subject of chapter 4 of yet another masterpiece, Richard Stanley’s ‘Enumerative Combinatorics’ (volume 1) [44]. There it is shown, using the ‘transfer matrix method’ (that originated in physics), that in many combinatorial situations, where there are finitely many states, one is guaranteed, a priori, that the generating function is rational.

Alas, finding this transfer matrix, at each specific case, is not easy! The human has to first figure out the set of states, and then using human ingenuity, figure out how they interact.

A better way is to automate it. Let the computer do the research, and using ‘symbolic dynamical programming’, the computer, automatically, finds the set of states, and constructs, all by itself (without any human pre-processing), the set of states and the transfer matrix. But this may not be so efficient for two reasons. First, at the very end, one has to invert a matrix with symbolic entries, hence compute symbolic determinants, that is time-consuming. Second, setting up the ‘infrastructure’ and writing a program that would enable the computer to do ‘machine-learning’ can be very daunting.

In this chapter, we will describe two case studies where, by ‘general nonsense’, we know that the generating functions are rational, and it is easy to bound the degree of the denominator (alias the order of the recurrence satisfied by the sequence). Hence a simple-minded, empirical, approach of computing the first few terms and then ‘fitting’ a recurrence (equivalently rational function) is possible.

The first case-study concerns counting spanning trees in families of grid-graphs, studied by Paul Raff [34], and F.J. Faase [15]. In their research, the human first analyzes the intricate combinatorics, manually sets up the transfer matrix, and only at the end lets a computer-algebra system evaluate the symbolic determinant.

Our key observation, that enabled us to ‘cut the Gordian knot’ is that the terms of the studied sequences are expressible as numerical determinants. Since computing numerical determinants is so fast, it is easy to compute sufficiently many terms, and then fit the data into a rational function. Since we easily have an upper bound for the degree of the denominator of the rational function, everything is rigorous.

The second case-study is computing generating functions for sequences of determinants of ‘almost diagonal matrices’. Here, in addition to the ‘naive’ approach of cranking enough data and then fitting it into a rational function, we also describe the ‘symbolic dynamical programming method’, that surprisingly, is faster for the range of examples that we considered. But we believe that for sufficiently large cases, the naive approach will eventually be more efficient, since the ‘deductive’ approach works equally well for the analogous problem of finding the sequence of permanents of these almost diagonal matrices, for which the naive approach will soon be intractable.

This chapter may be viewed as a tutorial, hence we include lots of implementation details, and Maple code. We hope that it will inspire readers (and their computers!) to apply it in other situations

3.2 The Human Approach to Enumerating Spanning Trees of Grid Graphs

In order to illustrate the advantage of “keeping it simple”, we will review the human approach to the enumeration task that we will later redo using the ‘Gordian knot’ way. While the human approach is definitely interesting for its own sake, it is rather painful.

Our goal is to enumerate the number of spanning trees in certain families of graphs, notably grid graphs and their generalizations. Let’s examine Paul Raff’s interesting approach described in his paper Spanning Trees in Grid Graph [34]. Raff’s approach was inspired by the pioneering work of F. J. Faase [15].

The goal is to find generating functions that enumerate spanning trees in grid graphs and the product of an arbitrary graph and a path or a cycle.

Grid graphs have two parameters, let’s call them, kk and nn. For a k×nk\times n grid graph, let’s think of kk as fixed while nn is the discrete input variable of interest.

Definition 3.1.

The k×nk\times n grid graph Gk(n)G_{k}(n) is the following graph given in terms of its vertex set VV and edge set EE:

V={vij|1ik,1jn},V=\{v_{ij}|1\leq i\leq k,1\leq j\leq n\},
S={{vij,vij}||ii|+|jj|=1}.S=\{\{v_{ij},v_{i^{\prime}j^{\prime}}\}||i-i^{\prime}|+|j-j^{\prime}|=1\}.

The main idea in the human approach is to consider the collection of set-partitions of [k]={1,2,,k}[k]=\{1,2,\dots,k\} and figure out the transition when we extend a k×nk\times n grid graph to a k×(n+1)k\times(n+1) one.

Let k\mathcal{B}_{k} be the collection of all set-partitions of [k][k]. Bk=|k|B_{k}=|\mathcal{B}_{k}| are called the Bell number. Famously, the exponential generating function of BkB_{k}, namely k=0Bkk!tk\sum_{k=0}^{\infty}\frac{B_{k}}{k!}\,t^{k}, equals eet1e^{e^{t}-1}.

A lexicographic ordering on k\mathcal{B}_{k} is defined as follows:

Definition 3.2.

Given two partitions P1P_{1} and P2P_{2} of [k][k], for i[k]i\in[k], let XiX_{i} be the block of P1P_{1} containing ii and YiY_{i} be the block of P2P_{2} containing ii. Let jj be the minimum number such that XiYiX_{i}\neq Y_{i}. Then P1<P2P_{1}<P_{2} iff

1. |P1|<|P2||P_{1}|<|P_{2}| or

2. |P1|=|P2||P_{1}|=|P_{2}| and XjYjX_{j}\prec Y_{j} where \prec denotes the normal lexicographic order.

For example, here is the ordering for k=3k=3:

3={{{1,2,3}},{{1},{2,3},{{1,2},{3}},{{1,3},{2}},{{1},{2},{3}}}.\mathcal{B}_{3}=\{\{\{1,2,3\}\},\{\{1\},\{2,3\},\{\{1,2\},\{3\}\},\{\{1,3\},\{2\}\},\{\{1\},\{2\},\{3\}\}\}.

For simplicity, we can rewrite it as follows:

3={123,1/23,12/3,13/2,1/2/3}.\mathcal{B}_{3}=\{123,1/23,12/3,13/2,1/2/3\}.
Definition 3.3.

Given a spanning forest FF of Gk(n)G_{k}(n), the partition induced by FF is obtained from the equivalence relation

ijvn,i,vn,ji\sim j\Longleftrightarrow v_{n,i},v_{n,j} are in the same component of FF.

For example, the partition induced by any spanning tree of Gk(n)G_{k}(n) is 123k123\dots k because by definition, in a spanning tree, all vn,i,1ikv_{n,i},1\leq i\leq k are in the same component. For the other extreme, where every component only consists of one vertex, the corresponding set-partition is 1/2/3//k1/k1/2/3/\dots/k-1/k because no two vn,i,vn,jv_{n,i},v_{n,j} are in the same component for 1i<jk1\leq i<j\leq k.

Definition 3.4.

Given a spanning forest FF of Gk(n)G_{k}(n) and a set-partition PP of [k][k], we say that FF is consistent with PP if:

1. The number of trees in FF is precisely |P||P|.

2. PP is the partition induced by FF.

Let EnE_{n} be the set of edges E(Gk(n)\E(Gk(n1))E(G_{k}(n)\backslash E(G_{k}(n-1)), then EnE_{n} has 2k12k-1 members.

Given a forest FF of Gk(n1)G_{k}(n-1) and some subset XEnX\subseteq E_{n}, we can combine them to get a forest of Gk(n)G_{k}(n) as follows. We just need to know how many subsets of EnE_{n} can transfer a forest consistent with some partition to a forest consistent with another partition. This leads to the following definition:

Definition 3.5.

Given two partitions P1P_{1} and P2P_{2} in k\mathcal{B}_{k}, a subset XEnX\subseteq E_{n} transfers from P1P_{1} to P2P_{2} if a forest consistent with P1P_{1} becomes a forest consistent with P2P_{2} after the addition of XX. In this case, we write XP1=P2X\diamond P_{1}=P_{2}.

With the above definitions, it is natural to define a Bk×BkB_{k}\times B_{k} transfer matrix AkA_{k} by the following:

Ak(i,j)=|{AEn+1|APj=Pi}|.A_{k}(i,j)=|\{A\subseteq E_{n+1}|A\diamond P_{j}=P_{i}\}|.

Let’s look at the k=2k=2 case as an example. We have

2={12,1/2},En+1={{v1,n,v1,n+1},{v2,n,v2,n+1},{v1,n+1,v2,n+1}}.\mathcal{B}_{2}=\{12,1/2\},\quad E_{n+1}=\{\{v_{1,n},v_{1,n+1}\},\{v_{2,n},v_{2,n+1}\},\{v_{1,n+1},v_{2,n+1}\}\}.

For simplicity, let’s call the edges in En+1E_{n+1} e1,e2,e3e_{1},e_{2},e_{3}. Then to transfer the set-partition P1=12P_{1}=12 to itself, we have the following three ways: {e1,e2},{e1,e3},{e2,e3}\{e_{1},e_{2}\},\{e_{1},e_{3}\},\{e_{2},e_{3}\}. In order to transfer the partition P2=1/2P_{2}=1/2 into P1P_{1}, we only have one way, namely: {e1,e2,e3}\{e_{1},e_{2},e_{3}\}. Similarly, there are two ways to transfer P1P_{1} to P2P_{2} and one way to transfer P2P_{2} to itself Hence the transfer matrix is the following 2×22\times 2 matrix:

A=[3121].A=\begin{bmatrix}3&1\\ 2&1\end{bmatrix}.

Let T1(n),T2(n)T_{1}(n),T_{2}(n) be the number of forests of Gk(n)G_{k}(n) which are consistent with the partitions P1P1 and P2P2, respectively. Let

vn=[T1(n)T2(n)],v_{n}=\begin{bmatrix}T_{1}(n)\\ T_{2}(n)\end{bmatrix},

then

vn=Avn1.v_{n}=Av_{n-1}.

The characteristic polynomial of AA is

χλ(A)=λ24λ+1.\chi_{\lambda}(A)=\lambda^{2}-4\lambda+1.

By the Cayley-Hamilton Theorem, AA satisfies

A24A+1=0.A^{2}-4A+1=0.

Hence the recurrence relation for T1(n)T_{1}(n) is

T1(n)=4T1(n1)T1(n2),T_{1}(n)=4T_{1}(n-1)-T_{1}(n-2),

the sequence is {1,4,15,56,209,780,2911,10864,40545,151316,}\{1,4,15,56,209,780,2911,10864,40545,151316,\dots\} (OEIS A001353) and the generating function is

x14x+x2.\frac{x}{1-4x+x^{2}}.

Similarly, for the k=3k=3 case, the transfer matrix

A3=[8334143221423211001032221].A_{3}=\begin{bmatrix}8&3&3&4&1\\ 4&3&2&2&1\\ 4&2&3&2&1\\ 1&0&0&1&0\\ 3&2&2&2&1\end{bmatrix}.

The transfer matrix method can be generalized to general graphs of the form G×PnG\times P_{n}, especially cylinder graphs.

As one can see, we had to think very hard. First we had to establish a ‘canonical’ ordering over set-partitions, then define the consistence between partitions and forests, then look for the transfer matrix and finally worry about initial conditions.

Rather than think so hard, let’s compute sufficiently many terms of the enumeration sequence, and try to guess a linear recurrence equation with constant coefficients, that would be provable a posteriori just because we know that there exists a transfer matrix without worrying about finding it explicitly. But how do we generate sufficiently many terms? Luckily, we can use the celebrated Matrix Tree Theorem.

Theorem 3.6 (Matrix Tree Theorem).

If A=(aij)A=(a_{ij}) is the adjacency matrix of an arbitrary graph GG, then the number of spanning trees is equal to the determinant of any co-factor of the Laplacian matrix LL of GG, where

L=[a12++a1na12a1,na21a21++a2na2,nan1an2an1++an,n1].L=\begin{bmatrix}a_{12}+\dots+a_{1n}&-a_{12}&\dots&-a_{1,n}\\ -a_{21}&a_{21}+\dots+a_{2n}&\dots&-a_{2,n}\\ \vdots&\vdots&\ddots&\vdots\\ -a_{n1}&-a_{n2}&\dots&a_{n1}+\dots+a_{n,n-1}\end{bmatrix}.

For instance, taking the (n,n)(n,n) co-factor, we have that the number of spanning trees of GG equals

|a12++a1na12a1,n1a21a21++a2na2,n1an1,1an1,2an1,1++an1,n|.\begin{vmatrix}a_{12}+\dots+a_{1n}&-a_{12}&\dots&-a_{1,n-1}\\ -a_{21}&a_{21}+\dots+a_{2n}&\dots&-a_{2,n-1}\\ \vdots&\vdots&\ddots&\vdots\\ -a_{n-1,1}&-a_{n-1,2}&\dots&a_{n-1,1}+\dots+a_{n-1,n}\end{vmatrix}.

Since computing determinants for numeric matrices is very fast, we can find the generating functions for the number of spanning trees in grid graphs and more generalized graphs by experimental methods, using the CC-finite ansatz.

3.3 The GuessRec Maple procedure

Our engine is the Maple procedure GuessRec(L) that resides in the Maple packages accompanying this chapter. Naturally, we need to collect enough data. The input is the data (given as a list) and the output is a conjectured recurrence relation derived from that data.

Procedure GuessRec(L) inputs a list, L, and attempts to output a linear recurrence equation with constant coefficients satisfied by the list. It is based on procedure GuessRec1(L,d) that looks for such a recurrence of order dd.

The output of GuessRec1(L,d) consists of the the list of initial dd values (‘initial conditions’) and the recurrence equation represented as a list. For instance, if the input is L=[1,1,1,1,1,1]L=[1,1,1,1,1,1] and d=1d=1, then the output will be [[1],[1]][[1],[1]]; if the input is L=[1,4,15,56,209,780,2911,10864,40545,151316]L=[1,4,15,56,209,780,2911,10864,40545,151316] as the k=2k=2 case for grid graphs and d=2d=2, then the output will be [[1,4],[4,1]][[1,4],[4,-1]]. This means that our sequence satisfies the recurrence a(n)=4a(n1)a(n2)a(n)=4a(n-1)-a(n-2), subject to the initial conditions a(0)=1,a(1)=4a(0)=1,a(1)=4.

Here is the Maple code:



GuessRec1:=proc(L,d) local eq,var,a,i,n:
if nops(L)<=2*d+2 then
print(‘The list must be of size >=‘, 2*d+3 ):
RETURN(FAIL):
fi:
var:={\{seq(a[i],i=1..d)}\}:
eq:={\{seq(L[n]-add(a[i]*L[n-i],i=1..d),n=d+1..nops(L))}\}:
var:=solve(eq,var):
if var=NULL then
RETURN(FAIL):
else
RETURN([[op(1..d,L)],[seq(subs(var,a[i]),i=1..d)]]):
fi:
end:

The idea is that having a long enough list LL (|L|>2d+2)(|L|>2d+2) of data, we use the data after the dd-th one to discover whether there exists a linear recurrence relation, the first dd data points being the initial condition. With the unknowns a1,a2,,ada_{1},a_{2},\dots,a_{d}, we have a linear systems of no less than d+3d+3 equations. If there is a solution, it is extremely likely that the recurrence relation holds in general. The first list of length dd in the output constitutes the list of initial conditions while the second list, RR, codes the linear recurrence, where [R[1],R[d]][R[1],\dots R[d]] stands for the following recurrence:

L[n]=i=1dR[i]L[ni].L[n]=\sum_{i=1}^{d}R[i]L[n-i].

Here is the Maple procedure GuessRec(L):

GuessRec:=proc(L) local gu,d:
for d from 1 to trunc(nops(L)/2)-2 do
gu:=GuessRec1(L,d):
if gu<>FAIL then
RETURN(gu):
fi:
od:
FAIL:
end:

This procedure inputs a sequence LL and tries to guess a recurrence equation with constant coefficients satisfying it. It returns the initial values and the recurrence equation as a pair of lists. Since the length of LL is limited, the maximum degree of recurrence cannot be more than |L|/22\lfloor|L|/2-2\rfloor. With this procedure, we just need to input L=[1,4,15,56,209,780,2911,10864,40545,151316]L=[1,4,15,56,209,780,2911,10864,40545,151316] to get the recurrence (and initial conditions) [[1,4],[4,1]][[1,4],[4,-1]].

Once the recurrence relation, let’s call it S, is discovered, procedure CtoR(S,t) finds the generating function for the sequence. Here is the Maple code:

CtoR:=proc(S,t) local D1,i,N1,L1,f,f1,L:
if not (type(S,list) and nops(S)=2 and type(S[1],list) and
type(S[2],list) and nops(S[1])=nops(S[2]) and type(t, symbol) ) then
print(‘Bad input‘):
RETURN(FAIL):
fi:
D1:=1-add(S[2][i]*t**i,i=1..nops(S[2])):
N1:=add(S[1][i]*t**(i-1),i=1..nops(S[1])):
L1:=expand(D1*N1):
L1:=add(coeff(L1,t,i)*t**i,i=0..nops(S[1])-1):
f:=L1/D1:
L:=degree(D1,t)+10:
f1:=taylor(f,t=0,L+1):
if expand([seq(coeff(f1,t,i),i=0..L)])<>expand(SeqFromRec(S,L+1))
then
print([seq(coeff(f1,t,i),i=0..L)],SeqFromRec(S,L+1)):
RETURN(FAIL):
else
RETURN(f):
fi:
end:

Procedure SeqFromRec used above (see the package) simply generates many terms using the recurrence.

Procedure CtoR(S,t) outputs the rational function in tt, whose coefficients are the members of the CC-finite sequence SS. For example:

𝙲𝚝𝚘𝚁([[𝟷,𝟷],[𝟷,𝟷]],𝚝)=1t2t+1.{\tt CtoR([[1,1],[1,1]],t)}=\frac{1}{-t^{2}-t+1}.

Briefly, the idea is that the denominator of the rational function can be easily determined by the recurrence relation and we use the initial condition to find the starting terms of the generating function, then multiply it by the denominator, yielding the numerator.

3.4 Application of GuessRec to Enumerating Spanning Trees of Grid Graphs and G×PnG\times P_{n}

With the powerful procedures GuessRec and CtoR, we are able to find generating functions for the number of spanning trees of generalized graphs of the form G×PnG\times P_{n}. We will illustrate the application of GuessRec to finding the generating function for the number of spanning trees in grid graphs.

First, using procedure GridMN(k,n), we get the k×nk\times n grid graph.

Then, procedure SpFn uses the Matrix Tree Theorem to evaluate the determinant of the co-factor of the Laplacian matrix of the grid graph which is the number of spanning trees in this particular graph. For a fixed kk, we need to generate a sufficiently long list of data for the number of spanning trees in Gk(n),n[l(k),u(k)]G_{k}(n),n\in[l(k),u(k)]. The lower bound l(k)l(k) can’t be too small since the first several terms are the initial condition; the upper bound u(k)u(k) can’t be too small as well since we need sufficient data to obtain the recurrence relation. Notice that there is a symmetry for the recurrence relation, and to take advantage of this fact, modified GuessRec to get the more efficient GuessSymRec (requiring less data). Once the recurrence relation, and the initial conditions, are given, applying CtoR(S,t) will give the desirable generating function, that, of course, is a rational function of tt. All the above is incorporated in procedure GFGridKN(k,t) which inputs a positive integer kk and a symbol tt, and outputs the generating function whose coefficient of tnt^{n} is the number of spanning trees in Gk(n)G_{k}(n), i.e. if we let s(k,n)s(k,n) be the number of spanning trees in Gk(n)G_{k}(n), the generating function

Fk(t)=n=0s(k,n)tn.F_{k}(t)=\sum_{n=0}^{\infty}s(k,n)t^{n}.

We now list the generating functions Fk(t)F_{k}(t) for 1k71\leq k\leq 7: Except for k=7k=7, these were already found by Raff [34] and Faase [15], but it is reassuring that, using our new approach, we got the same output. The case k=7k=7 seems to be new.

Theorem 3.7.

The generating function for the number of spanning trees in G1(n)G_{1}(n) is:

F1(t)=t1t.F_{1}(t)=\frac{t}{1-t}.
Theorem 3.8.

The generating function for the number of spanning trees in G2(n)G_{2}(n) is:

F2=tt24t+1.F_{2}=\frac{t}{{t}^{2}-4\,t+1}.
Theorem 3.9.

The generating function for the number of spanning trees in G3(n)G_{3}(n) is:

F3=t3+tt415t3+32t215t+1.F_{3}=\frac{-{t}^{3}+t}{{t}^{4}-15\,{t}^{3}+32\,{t}^{2}-15\,t+1}.
Theorem 3.10.

The generating function for the number of spanning trees in G4(n)G_{4}(n) is:

F4=t749t5+112t449t3+tt856t7+672t62632t5+4094t42632t3+672t256t+1.F_{4}=\frac{{t}^{7}-49\,{t}^{5}+112\,{t}^{4}-49\,{t}^{3}+t}{{t}^{8}-56\,{t}^{7}+672\,{t}^{6}-2632\,{t}^{5}+4094\,{t}^{4}-2632\,{t}^{3}+672\,{t}^{2}-56\,t+1}.

For 5k75\leq k\leq 7, since the formulas are too long, we present their numerators and denominators separately.

Theorem 3.11.

The generating function for the number of spanning trees in G5(n)G_{5}(n) is:

F5=N5D5F_{5}=\frac{N_{5}}{D_{5}}

where

N5=t15+1440t1326752t12+185889t11574750t10+708928t9708928t7N_{5}=-{t}^{15}+1440\,{t}^{13}-26752\,{t}^{12}+185889\,{t}^{11}-574750\,{t}^{10}+708928\,{t}^{9}-708928\,{t}^{7}
+574750t6185889t5+26752t41440t3+t,+574750\,{t}^{6}-185889\,{t}^{5}+26752\,{t}^{4}-1440\,{t}^{3}+t,
D5=t16209t15+11936t14274208t13+3112032t1219456019t11+70651107t10D_{5}={t}^{16}-209\,{t}^{15}+11936\,{t}^{14}-274208\,{t}^{13}+3112032\,{t}^{12}-19456019\,{t}^{11}+70651107\,{t}^{10}
152325888t9+196664896t8152325888t7+70651107t619456019t5-152325888\,{t}^{9}+196664896\,{t}^{8}-152325888\,{t}^{7}+70651107\,{t}^{6}-19456019\,{t}^{5}
+3112032t4274208t3+11936t2209t+1.+3112032\,{t}^{4}-274208\,{t}^{3}+11936\,{t}^{2}-209\,t+1.
Theorem 3.12.

The generating function for the number of spanning trees in G6(n)G_{6}(n) is:

F6=N6D6F_{6}=\frac{N_{6}}{D_{6}}

where

N6=t3133359t29+3642600t28173371343t27+4540320720t2670164186331t25N_{6}={t}^{31}-33359\,{t}^{29}+3642600\,{t}^{28}-173371343\,{t}^{27}+4540320720\,{t}^{26}-70164186331\,{t}^{25}
+634164906960t242844883304348t231842793012320t22+104844096982372t21+634164906960\,{t}^{24}-2844883304348\,{t}^{23}-1842793012320\,{t}^{22}+104844096982372\,{t}^{21}
678752492380560t20+2471590551535210t195926092273213840t18-678752492380560\,{t}^{20}+2471590551535210\,{t}^{19}-5926092273213840\,{t}^{18}
+9869538714631398t1711674018886109840t16+9869538714631398t15+9869538714631398\,{t}^{17}-11674018886109840\,{t}^{16}+9869538714631398\,{t}^{15}
5926092273213840t14+2471590551535210t13678752492380560t12-5926092273213840\,{t}^{14}+2471590551535210\,{t}^{13}-678752492380560\,{t}^{12}
+104844096982372t111842793012320t102844883304348t9+634164906960t8+104844096982372\,{t}^{11}-1842793012320\,{t}^{10}-2844883304348\,{t}^{9}+634164906960\,{t}^{8}
70164186331t7+4540320720t6173371343t5+3642600t433359t3+t,-70164186331\,{t}^{7}+4540320720\,{t}^{6}-173371343\,{t}^{5}+3642600\,{t}^{4}-33359\,{t}^{3}+t,
D6=t32780t31+194881t3022377420t29+1419219792t2855284715980t27D_{6}={t}^{32}-780\,{t}^{31}+194881\,{t}^{30}-22377420\,{t}^{29}+1419219792\,{t}^{28}-55284715980\,{t}^{27}
+1410775106597t2624574215822780t25+300429297446885t24+1410775106597\,{t}^{26}-24574215822780\,{t}^{25}+300429297446885\,{t}^{24}
2629946465331120t23+16741727755133760t2278475174345180080t21-2629946465331120\,{t}^{23}+16741727755133760\,{t}^{22}-78475174345180080\,{t}^{21}
+273689714665707178t20716370537293731320t19+1417056251105102122t18+273689714665707178\,{t}^{20}-716370537293731320\,{t}^{19}+1417056251105102122\,{t}^{18}
2129255507292156360t17+2437932520099475424t162129255507292156360t15-2129255507292156360\,{t}^{17}+2437932520099475424\,{t}^{16}-2129255507292156360\,{t}^{15}
+1417056251105102122t14716370537293731320t13+273689714665707178t12+1417056251105102122\,{t}^{14}-716370537293731320\,{t}^{13}+273689714665707178\,{t}^{12}
78475174345180080t11+16741727755133760t102629946465331120t9-78475174345180080\,{t}^{11}+16741727755133760\,{t}^{10}-2629946465331120\,{t}^{9}
+300429297446885t824574215822780t7+1410775106597t655284715980t5+300429297446885\,{t}^{8}-24574215822780\,{t}^{7}+1410775106597\,{t}^{6}-55284715980\,{t}^{5}
+1419219792t422377420t3+194881t2780t+1.+1419219792\,{t}^{4}-22377420\,{t}^{3}+194881\,{t}^{2}-780\,t+1.
Theorem 3.13.

The generating function for the number of spanning trees in G7(n)G_{7}(n) is:

F7=N7D7F_{7}=\frac{N_{7}}{D_{7}}

where

N7=t47142t46+661245t45279917500t44+53184503243t435570891154842t42N_{7}=-{t}^{47}-142\,{t}^{46}+661245\,{t}^{45}-279917500\,{t}^{44}+53184503243\,{t}^{43}-5570891154842\,{t}^{42}
+341638600598298t4111886702497030032t40+164458937576610742t39+341638600598298\,{t}^{41}-11886702497030032\,{t}^{40}+164458937576610742\,{t}^{39}
+4371158470492451828t38288737344956855301342t37+7736513993329973661368t36+4371158470492451828\,{t}^{38}-288737344956855301342\,{t}^{37}+7736513993329973661368\,{t}^{36}
131582338768322853956994t35+1573202877300834187134466t34-131582338768322853956994\,{t}^{35}+1573202877300834187134466\,{t}^{34}
13805721749199518460916737t33+90975567796174070740787232t32-13805721749199518460916737\,{t}^{33}+90975567796174070740787232\,{t}^{32}
455915282590547643587452175t31+1747901867578637315747826286t30-455915282590547643587452175\,{t}^{31}+1747901867578637315747826286\,{t}^{30}
5126323837327170557921412877t29+11416779122947828869806142972t28-5126323837327170557921412877\,{t}^{29}+11416779122947828869806142972\,{t}^{28}
18924703166237080216745900796t27+22194247945745188489023284104t26-18924703166237080216745900796\,{t}^{27}+22194247945745188489023284104\,{t}^{26}
15563815847174688069871470516t25+15563815847174688069871470516t23-15563815847174688069871470516\,{t}^{25}+15563815847174688069871470516\,{t}^{23}
22194247945745188489023284104t22+18924703166237080216745900796t21-22194247945745188489023284104\,{t}^{22}+18924703166237080216745900796\,{t}^{21}
11416779122947828869806142972t20+5126323837327170557921412877t19-11416779122947828869806142972\,{t}^{20}+5126323837327170557921412877\,{t}^{19}
1747901867578637315747826286t18+455915282590547643587452175t17-1747901867578637315747826286\,{t}^{18}+455915282590547643587452175\,{t}^{17}
90975567796174070740787232t16+13805721749199518460916737t15-90975567796174070740787232\,{t}^{16}+13805721749199518460916737\,{t}^{15}
1573202877300834187134466t14+131582338768322853956994t13-1573202877300834187134466\,{t}^{14}+131582338768322853956994\,{t}^{13}
7736513993329973661368t12+288737344956855301342t11-7736513993329973661368\,{t}^{12}+288737344956855301342\,{t}^{11}
4371158470492451828t10164458937576610742t9-4371158470492451828\,{t}^{10}-164458937576610742\,{t}^{9}
+11886702497030032t8341638600598298t7+5570891154842t653184503243t5+11886702497030032\,{t}^{8}-341638600598298\,{t}^{7}+5570891154842\,{t}^{6}-53184503243\,{t}^{5}
+279917500t4661245t3+142t2+t,+279917500\,{t}^{4}-661245\,{t}^{3}+142\,{t}^{2}+t,
D7=t482769t47+2630641t461195782497t45+305993127089t4448551559344145t43D_{7}={t}^{48}-2769\,{t}^{47}+2630641\,{t}^{46}-1195782497\,{t}^{45}+305993127089\,{t}^{44}-48551559344145\,{t}^{43}
+5083730101530753t42366971376492201338t41+5083730101530753\,{t}^{42}-366971376492201338\,{t}^{41}
+18871718211768417242t40709234610141846974874t39+18871718211768417242\,{t}^{40}-709234610141846974874\,{t}^{39}
+19874722637854592209338t38422023241997789381263002t37+19874722637854592209338\,{t}^{38}-422023241997789381263002\,{t}^{37}
+6880098547452856483997402t3687057778313447181201990522t35+6880098547452856483997402\,{t}^{36}-87057778313447181201990522\,{t}^{35}
+862879164715733847737203343t346750900711491569851736413311t33+862879164715733847737203343\,{t}^{34}-6750900711491569851736413311\,{t}^{33}
+41958615314622858303912597215t32208258356862493902206466194607t31+41958615314622858303912597215\,{t}^{32}-208258356862493902206466194607\,{t}^{31}
+828959040281722890327985220255t302654944041424536277948746010303t29+828959040281722890327985220255\,{t}^{30}-2654944041424536277948746010303\,{t}^{29}
+6859440538554030239641036025103t2814324708604336971207868317957868t27+6859440538554030239641036025103\,{t}^{28}-14324708604336971207868317957868\,{t}^{27}
+24214587194571650834572683444012t2633166490975387358866518005011884t25+24214587194571650834572683444012\,{t}^{26}-33166490975387358866518005011884\,{t}^{25}
+36830850383375837481096026357868t2433166490975387358866518005011884t23+36830850383375837481096026357868\,{t}^{24}-33166490975387358866518005011884\,{t}^{23}
+24214587194571650834572683444012t2214324708604336971207868317957868t21+24214587194571650834572683444012\,{t}^{22}-14324708604336971207868317957868\,{t}^{21}
+6859440538554030239641036025103t202654944041424536277948746010303t19+6859440538554030239641036025103\,{t}^{20}-2654944041424536277948746010303\,{t}^{19}
+828959040281722890327985220255t18208258356862493902206466194607t17+828959040281722890327985220255\,{t}^{18}-208258356862493902206466194607\,{t}^{17}
+41958615314622858303912597215t166750900711491569851736413311t15+41958615314622858303912597215\,{t}^{16}-6750900711491569851736413311\,{t}^{15}
+862879164715733847737203343t1487057778313447181201990522t13+862879164715733847737203343\,{t}^{14}-87057778313447181201990522\,{t}^{13}
+6880098547452856483997402t12422023241997789381263002t11+6880098547452856483997402\,{t}^{12}-422023241997789381263002\,{t}^{11}
+19874722637854592209338t10709234610141846974874t9+18871718211768417242t8+19874722637854592209338\,{t}^{10}-709234610141846974874\,{t}^{9}+18871718211768417242\,{t}^{8}
366971376492201338t7+5083730101530753t648551559344145t5+305993127089t4-366971376492201338\,{t}^{7}+5083730101530753\,{t}^{6}-48551559344145\,{t}^{5}+305993127089\,{t}^{4}
1195782497t3+2630641t22769t+1.-1195782497\,{t}^{3}+2630641\,{t}^{2}-2769\,t+1.

Note that, surprisingly, the degree of the denominator of F7(t)F_{7}(t) is 4848 rather than the expected 6464 since the first six generating functions’ denominator have degree 2k12^{k-1}, 1k61\leq k\leq 6. With a larger computer, one should be able to compute FkF_{k} for larger kk, using this experimental approach.

Generally, for an arbitrary graph GG, we consider the number of spanning trees in G×PnG\times P_{n}. With the same methodology, a list of data can be obtained empirically with which a a generating function follows.

The original motivation for the Matrix Tree Theorem, first discovered by Kirchhoff (of Kirchhoff’s laws fame) came from the desire to efficiently compute joint resistances in an electrical network.

Suppose one is interested in the joint resistance in an electric network in the form of a grid graph between two diagonal vertices [1,1][1,1] and [k,n][k,n]. We assume that each edge has resistance 11 Ohm. To obtain it, all we need is, in addition for the number of spanning trees (that’s the numerator), the number of spanning forests SFk(n)SF_{k}(n) of the graph Gk(n)G_{k}(n) that have exactly two components, each component containing exactly one of the members of the pair {[1,1],[k,n]}\{[1,1],[k,n]\} (this is the denominator). The joint resistance is just the ratio.

In principle, we can apply the same method to obtain the generating function SkS_{k}. Empirically, we found that the denominator of SkS_{k} is always the square of the denominator of FkF_{k} times another polynomial CkC_{k}. Once the denominator is known, we can find the numerator in the same way as above. So our focus is to find CkC_{k}.

The procedure DenomSFKN(k,t) in the Maple package JointConductance.txt, calculates CkC_{k}. For 2k42\leq k\leq 4, we have

C2=t1,C_{2}=t-1,
C3=t48t3+17t28t+1,C_{3}=t^{4}-8t^{3}+17t^{2}-8t+1,
C4=t1246t11+770t106062t9+24579t855388t7+72324t655388t5+24579t4C_{4}=t^{12}-46t^{11}+770t^{10}-6062t^{9}+24579t^{8}-55388t^{7}+72324t^{6}-55388t^{5}+24579t^{4}
6062t3+770t246t+1.-6062t^{3}+770t^{2}-46t+1.

Remark By looking at the output of our Maple package, we conjectured that R(k,n)R(k,n), the resistance between vertex [1,1][1,1] and vertex [k,n][k,n] in the k×nk\times n grid graph, Gk(n)G_{k}(n), where each edge is a resistor of 11 Ohm, is asymptotically n/kn/k, for any fixed kk, as nn\rightarrow\infty. We proved it rigorously for k6k\leq 6, and we wondered whether there is a human-generated “electric proof’. Naturally we emailed Peter Doyle, the co-author of the delightful masterpiece [9], who quickly came up with the following argument.

Making the horizontal resistors into almost resistance-less gold wires gives the lower bound R(k,n)(n1)/kR(k,n)\geq(n-1)/k since it is a parallel circuit of kk resistors of n1n-1 Ohms. For an upper bound of the same order, put 1 Ampere in at [1,1] and out at [k,n][k,n], routing 1/k1/k Ampere up each of the kk verticals. The energy dissipation is k(n1)/k2+C(k)=(n1)/k+C(k)k(n-1)/k^{2}+C(k)=(n-1)/k+C(k), where the constant C(k)C(k) is the energy dissipated along the top and bottom resistors. Specifically, C(k)=2(11/k)2+(12/k)2++(1/k)2)C(k)=2(1-1/k)^{2}+(1-2/k)^{2}+\dots+(1/k)^{2}). So (n1)/kR(k,n)(n1)/k+C(k)(n-1)/k\leq R(k,n)\leq(n-1)/k+C(k).

We thank Peter Doyle for his kind permission to reproduce this electrifying argument.

3.5 The Statistic of the Number of Vertical Edges

As mentioned in Section 2.4, often in enumerative combinatorics, the class of interest has natural ‘statistics’.

In this section, we are interested in the statistic ‘the number of vertical edges’, defined on spanning trees of grid graphs. For given kk and nn, let, as above, Gk(n)G_{k}(n) denote the k×nk\times n grid-graph. Let k,n\mathcal{F}_{k,n} be its set of spanning trees. if the weight is 1, then fk,n1=|k,n|\sum_{f\in\mathcal{F}_{k,n}}1=|\mathcal{F}_{k,n}| is the naive counting. Now let’s define a natural statistic

ver(T)ver(T) = the number of vertical edges in the spanning tree TT

and the weight w(T)=vver(T)w(T)=v^{ver(T)}, then the weighted counting follows:

Verk,n(v)=Tk,nw(T)Ver_{k,n}(v)=\sum_{T\in\mathcal{F}_{k,n}}w(T)

where k,n\mathcal{F}_{k,n} is the set of spanning trees of Gk(n)G_{k}(n).

We define the bivariate generating function

gk(v,t)=n=0Verk,ntn.g_{k}(v,t)=\sum_{n=0}^{\infty}Ver_{k,n}t^{n}.

More generally, with our Maple package GFMatrix.txt, and procedure VerGF, we are able to obtain the bivariate generating function for an arbitrary graph of the form G×PnG\times P_{n}. The procedure VerGF takes inputs GG (an arbitrary graph), NN (an integer determining how many data we use to find the recurrence relation) and two symbols vv and tt.

The main tool for computing VerGF is still the Matrix Tree Theorem and GuessRec. But we need to modify the Laplacian matrix for the graph. Instead of letting aij=1a_{ij}=-1 for iji\neq j and {i,j}E(G×Pn)\{i,j\}\in E(G\times P_{n}), we should consider whether the edge {i,j}\{i,j\} is a vertical edge. If so, we let ai,j=v,aj,i=va_{i,j}=-v,a_{j,i}=-v. The diagonal elements which are (1)×(-1)\times (the sum of the rest entries on the same row) should change accordingly. The following theorems are for grid graphs when 2k42\leq k\leq 4 while k=1k=1 is a trivial case because there are no vertical edges.

Theorem 3.14.

The bivariate generating function for the weighted counting according to the number of vertical edges of spanning trees in G2(n)G_{2}(n) is:

g2(v,t)=vt1(2v+2)t+t2.g_{2}(v,t)=\frac{vt}{1-\left(2\,v+2\right)t+{t}^{2}}.
Theorem 3.15.

The bivariate generating function for the weighted counting according to the number of vertical edges vertical edges of spanning trees in G3(n)G_{3}(n) is:

g3(v,t)=t3v2+v2t1(3v2+8v+4)t(10v216v6)t2(3v2+8v+4)t3+t4.g_{3}(v,t)=\frac{-{t}^{3}{v}^{2}+{v}^{2}t}{1-\left(3\,{v}^{2}+8\,v+4\right)t-\left(-10\,{v}^{2}-16\,v-6\right){t}^{2}-\left(3\,{v}^{2}+8\,v+4\right){t}^{3}+{t}^{4}}.
Theorem 3.16.

The bivariate generating function for the weighted counting according to the number of vertical edges of spanning trees in G4(n)G_{4}(n) is:

g4(v,t)=numer(g4)denom(g4)g_{4}(v,t)=\frac{numer(g_{4})}{denom(g_{4})}

where

numer(g4)=v3t+(16v524v49v3)t3+(8v6+40v5+48v4+16v3)t4numer(g_{4})={v}^{3}t+\left(-16\,{v}^{5}-24\,{v}^{4}-9\,{v}^{3}\right){t}^{3}+\left(8\,{v}^{6}+40\,{v}^{5}+48\,{v}^{4}+16\,{v}^{3}\right){t}^{4}
+(16v524v49v3)t5+v3t7+\left(-16\,{v}^{5}-24\,{v}^{4}-9\,{v}^{3}\right){t}^{5}+{v}^{3}{t}^{7}

and

denom(g4)=1(4v3+20v2+24v+8)t(52v4192v3256v2144v28)t2denom(g_{4})=1-\left(4\,{v}^{3}+20\,{v}^{2}+24\,v+8\right)t-\left(-52\,{v}^{4}-192\,{v}^{3}-256\,{v}^{2}-144\,v-28\right){t}^{2}
(64v5+416v4+892v3+844v2+360v+56)t3-\left(64\,{v}^{5}+416\,{v}^{4}+892\,{v}^{3}+844\,{v}^{2}+360\,v+56\right){t}^{3}
(16v6160v5744v41408v31216v2480v70)t4-\left(-16\,{v}^{6}-160\,{v}^{5}-744\,{v}^{4}-1408\,{v}^{3}-1216\,{v}^{2}-480\,v-70\right){t}^{4}
(64v5+416v4+892v3+844v2+360v+56)t5-\left(64\,{v}^{5}+416\,{v}^{4}+892\,{v}^{3}+844\,{v}^{2}+360\,v+56\right){t}^{5}
(52v4192v3256v2144v28)t6(4v3+20v2+24v+8)t7+t8.-\left(-52\,{v}^{4}-192\,{v}^{3}-256\,{v}^{2}-144\,v-28\right){t}^{6}-\left(4\,{v}^{3}+20\,{v}^{2}+24\,v+8\right){t}^{7}+{t}^{8}.

With the Maple package BiVariateMoms.txt and its Story procedure from

http://sites.math.rutgers.edu/~zeilberg/tokhniot/BiVariateMoms.txt,

the expectation, variance and higher moments can be easily analyzed. We calculated up to the 4th moment for G2(n)G_{2}(n). For k=3,4k=3,4, you can find the output files from

http://sites.math.rutgers.edu/~yao/OutputStatisticVerticalk=3.txt,

http://sites.math.rutgers.edu/~yao/OutputStatisticVerticalk=4.txt.

Theorem 3.17.

The moments of the statistic: the number of vertical edges in the spanning trees of G2(n)G_{2}(n) are as follows:

Let bb be the largest positive root of the polynomial equation

b24b+1=0b^{2}-4b+1=0

whose floating-point approximation is 3.732050808, then the size of the nn-th family (i.e. straight enumeration) is very close to

bn+12+4b.{\frac{{b}^{n+1}}{-2+4\,b}}.

The average of the statistics is, asymptotically

13+13(1+2b)nb.\frac{1}{3}+\frac{1}{3}\,{\frac{\left(-1+2\,b\right)n}{b}}.

The variance of the statistics is, asymptotically

19+19(7b2)n1+4b.-\frac{1}{9}+\frac{1}{9}\,{\frac{\left(7\,b-2\right)n}{-1+4\,b}}.

The skewness of the statistics is, asymptotically

780b209(4053b1086)n3+(7020b+1881)n2+(4053b1086)n780b+209.{\frac{780\,b-209}{\left(4053\,b-1086\right){n}^{3}+\left(-7020\,b+1881\right){n}^{2}+\left(4053\,b-1086\right)n-780\,b+209}}.

The kurtosis of the statistics is, asymptotically

3(32592b8733)n2+(56451b+15126)n+21728b5822(32592b8733)n2+(37634b+10084)n+10864b2911.3\,{\frac{\left(32592\,b-8733\right){n}^{2}+\left(-56451\,b+15126\right)n+21728\,b-5822}{\left(32592\,b-8733\right){n}^{2}+\left(-37634\,b+10084\right)n+10864\,b-2911}}.

3.6 Application of the CC-finite Ansatz to Almost-Diagonal Matrices

So far, we have seen applications of the CC-finite ansatz methodology for automatically computing generating functions for enumerating spanning trees/forests for certain infinite families of graphs.

The second case study is completely different, and in a sense more general, since the former framework may be subsumed in this new context.

Definition 3.18.

Diagonal matrices AA are square matrices in which the entries outside the main diagonal are 0, i.e. aij=0a_{ij}=0 if iji\neq j.

Definition 3.19.

An almost-diagonal matrix AA is a square matrices in which ai,j=0a_{i,j}=0 if jik1j-i\geq k_{1} or ijk2i-j\geq k_{2} for some fixed positive integers k1,k2k_{1},k_{2} and i1,j1,i2,j2\forall i_{1},j_{1},i_{2},j_{2}, if i1j1=i2j2i_{1}-j_{1}=i_{2}-j_{2}, then ai1j1=ai2j2a_{i_{1}j_{1}}=a_{i_{2}j_{2}}.

For simplicity, we use the notation L=L=[nn, [the first k1k_{1} entries in the first row], [the first k2k_{2} entries in the first column]] to denote the n×nn\times n matrix with these specifications. Note that this notation already contains all information we need to reconstruct this matrix. For example, [6, [1,2,3], [1,4]] is the matrix

[123000412300041230004123000412000041].\begin{bmatrix}1&2&3&0&0&0\\ 4&1&2&3&0&0\\ 0&4&1&2&3&0\\ 0&0&4&1&2&3\\ 0&0&0&4&1&2\\ 0&0&0&0&4&1\end{bmatrix}.

The following is the Maple procedure DiagMatrixL (in our Maple package GFMatrix.txt), which inputs such a list LL and outputs the corresponding matrix.

DiagMatrixL:=proc(L) local n, r1, c1,p,q,S,M,i:
n:=L[1]:
r1:=L[2]:
c1:=L[3]:
p:=nops(r1)-1:
q:=nops(c1)-1:
if r1[1] <> c1[1] then
return fail:
fi:
S:=[0$(n-1-q), seq(c1[q-i+1],i=0..q-1), op(r1), 0$(n-1-p)]:
M:=[0$n]:
for i from 1 to n do
M[i]:=[op(max(0,n-1-q)+q+2-i..max(0,n-1-q)+q+1+n-i,S)]:
od:
return M:
end:

For this matrix, k1=3k_{1}=3 and k2=2k_{2}=2. Let k1,k2k_{1},k_{2} be fixed and M1,M2M_{1},M_{2} be two lists of numbers or symbols of length k1k_{1} and k2k_{2} respectively, AkA_{k} is the almost-diagonal matrix represented by the list Lk=[k,M1,M2]L_{k}=[k,M_{1},M_{2}]. Note that the first elements in the lists M1M_{1} and M2M_{2} must be identical.

Having fixed two lists M1M_{1} of length k1k_{1} and M2M_{2} of length k2k_{2}, (where M1[1]=M2[1]M_{1}[1]=M_{2}[1]), it is of interest to derive automatically, the generating function (that is always a rational function for reasons that will soon become clear), k=0aktk\sum_{k=0}^{\infty}a_{k}\,t^{k}, where aka_{k} denotes the determinant of the k×kk\times k almost-diagonal matrix whose first row starts with M1M_{1}, and first column starts with M2M_{2}. Analogously, it is also of interest to do the analogous problem when the determinant is replaced by the permanent.

Here is the Maple procedure GFfamilyDet which takes inputs (i) AA: a name of a Maple procedure that inputs an integer nn and outputs an n×nn\times n matrix according to some rule, e.g., the almost-diagonal matrices, (ii) a variable name tt, (iii) two integers mm and nn which are the lower and upper bounds of the sequence of determinants we consider. It outputs a rational function in tt, say R(t)R(t), which is the generating function of the sequence.

GFfamilyDet:=proc(A,t,m,n) local i,rec,GF,B,gu,Denom,L,Numer:
L:=[seq(det(A(i)),i=1..n)]:
rec:=GuessRec([op(m..n,L)])[2]:
gu:=solve(B-1-add(t**i*rec[i]*B,i=1..nops(rec)), B):
Denom:=denom(subs(gu,B)):
Numer:=Denom*(1+add(L[i]*t**i, i=1..n)):
Numer:=add(coeff(Numer,t,i)*t**i, i=0..degree(Denom,t)):
Numer/Denom:
end:

Similarly we have procedure GFfamilyPer for the permanent. Let’s look at an example. The following is a sample procedure which considers the family of almost diagonal matrices which the first row [2,3][2,3] and the first column [2,4,5][2,4,5].

SampleB:=proc(n) local L,M:
L:=[n, [2,3], [2,4,5]]:
M:=DiagMatrixL(L):
end:

Then GFfamilyDet(SampleB, t, 10, 50) will return the generating function

145t312t2+2t1.-\frac{1}{45\,{t}^{3}-12\,{t}^{2}+2\,t-1}.

It turns out, that for this problem, the more ‘conceptual’ approach of setting up a transfer matrix also works well. But don’t worry, the computer can do the ‘research’ all by itself, with only a minimum amount of human pre-processing.

We will now describe this more conceptual approach, that may be called symbolic dynamical programming, where the computer sets up, automatically, a finite-state scheme, by dynamically discovering the set of states, and automatically figures out the transfer matrix.

3.7 The Symbolic Dynamic Programming Approach

Recall from Linear Algebra 101,

Theorem 3.20 (Cofactor Expansion).

Let |A||A| denote the determinant of an n×nn\times n matrix AA, then

|A|=j=1n(1)i+jaijMij,i[n],|A|=\sum_{j=1}^{n}(-1)^{i+j}a_{ij}M_{ij},\quad\forall i\in[n],

where MijM_{ij} is the (i,j)(i,j)-minor.

We’d like to consider the Cofactor Expansion for almost-diagonal matrices along the first row. For simplicity, we assume while ai,j=0a_{i,j}=0 if jik1j-i\geq k_{1} or ijk2i-j\geq k_{2} for some fixed positive integers k1,k2k_{1},k_{2}, and if k2<j1i1<j2i2<k1-k_{2}<j_{1}-i_{1}<j_{2}-i_{2}<k_{1}, then ai1j1ai2j2a_{i_{1}j_{1}}\neq a_{i_{2}j_{2}}. Under this assumption, for any minors we obtain through recursive Cofactor Expansion along the first row, the dimension, the first row and the first column should provide enough information to reconstruct the matrix.

For an almost-diagonal matrix represented by L=L=[Dimension, [the first k1k_{1} entries in the first row], [the first k2k_{2} entries in the first column]], any minor can be represented by [Dimension, [entries in the first row up to the last nonzero entry], [entries in the first column up to the last nonzero entry]].

Our goal in this section is the same as the last one, to get a generating function for the determinant or permanent of almost-diagonal matrices AkA_{k} with dimension kk. Once we have those almost-diagonal matrices, the first step is to do a one-step expansion as follows:

ExpandMatrixL:=proc(L,L1)
local n,R,C,dim,R1,C1,i,r,S,candidate,newrow,newcol,gu,mu,temp,p,q,j:
n:=L[1]:
R:=L[2]:
C:=L[3]:
p:=nops(R)-1:
q:=nops(C)-1:
dim:=L1[1]:
R1:=L1[2]:
C1:=L1[3]:

if R1=[] or C1=[] then
return :
elif R[1]<>C[1] or R1[1]<>C1[1] or dim>n then
return fail:
else

S:={}:

gu:=[0$(n-1-q), seq(C[q-i+1],i=0..q-1), op(R), 0$(n-1-p)]:
candidate:=[0$nops(R1),R1[-1]]:
for i from 1 to nops(R1) do
mu:=R1[i]:
for j from n-q to nops(gu) do
if gu[j]=mu then
candidate[i]:=gu[j-1]:
fi:
od:
od:

for i from n-q to nops(gu) do
if gu[i] = R1[2] then
temp:=i:
break:
fi:
od:

for i from 1 to nops(R1) do
if i = 1 then
mu:=[R1[i]*(-1)**(i+1), [dim-1,[op(i+1..nops(candidate), candidate)],
[seq(gu[temp-i],i=1..temp-n+q)]]]:
S:=S union mu:
else
mu:=[R1[i]*(-1)**(i+1), [dim-1, [op(1..i-1, candidate),
op(i+1..nops(candidate), candidate)], [op(2..nops(C1), C1)]]]:
S:=S union mu:
fi:
od:

return S:

fi:

end:

The ExpandMatrixL procedure inputs a data structure L=L= [Dimension, first_row=[ ], first_col=[ ]] as the matrix we start and the other data structure L1L1 as the current minor we have, expands L1L1 along its first row and outputs a list of [multiplicity, data structure].

We would like to generate all the ”children” of an almost-diagonal matrix regardless of the dimension, i.e., two lists LL represent the same child as long as their first_rows and first_columns are the same, respectively. The set of ”children” is the scheme of the almost diagonal matrices in this case.

The following is the Maple procedure ChildrenMatrixL which inputs a data structure LL and outputs the set of its ”children” under Cofactor Expansion along the first row:

ChildrenMatrixL:=proc(L) local S,t,T,dim,U,u,s:
dim:=L[1]:
S:={[op(2..3,L)]}:
T:={seq([op(2..3,t[2])],t in ExpandMatrixL(L,L))}:
while T minus S <> {} do
U:=T minus S:
S:=S union T:
T:={}:
for u in U do
T:=T union {seq([op(2..3,t[2])],t in ExpandMatrixL(L,[dim,op(u)]))}:
od:
od:
for s in S do
if s[1]=[] or s[2]=[] then
S:=S minus {s}:
fi:
od:
S:
end:

After we have the scheme SS, by the Cofactor Expansion of any element in the scheme, a system of algebraic equations follows. For children in SS, it’s convenient to let the almost-diagonal matrix be the first one C1C_{1} and for the rest, any arbitrary ordering will do. For example, if after Cofactor Expansion for C1C_{1}, c2c_{2} ”copies” of C2C_{2} and c3c_{3} ”copies” of C3C_{3} are obtained, then the equation will be

C1=1+c2tC2+c3tC3.C_{1}=1+c_{2}tC_{2}+c_{3}tC_{3}.

However, if the above equation is for Ci,i1C_{i},i\neq 1, i.e. CiC_{i} is not the almost-diagonal matrix itself, then the equation will be slightly different:

Ci=c2tC2+c3tC3.C_{i}=c_{2}tC_{2}+c_{3}tC_{3}.

Here tt is a symbol as we assume the generating function is a rational function of tt.

Here is the Maple code that implements how we get the generating function for the determinant of a family of almost-diagonal matrices by solving a system of algebraic equations:

GFMatrixL:=proc(L,t) local S,dim,var,eq,n,A,i,result,gu,mu:
dim:=L[1]:
S:=ChildrenMatrixL(L):
S:=[[op(2..3,L)], op(S minus {[op(2..3,L)]})]:
n:=nops(S):
var:={seq(A[i],i=1..n)}:
eq:={}:
for i from 1 to 1 do
result:=ExpandMatrixL(L,[dim,op(S[i])]):
for gu in result do
if gu[2][2]=[] or gu[2][3]=[] then
result:=result minus {gu}:
fi:
od:
eq:=eq union {A[i] - 1 - add(gu[1]*t*A[CountRank(S,
[op(2..3, gu[2])])], gu in result)}:
od:
for i from 2 to n do
result:=ExpandMatrixL(L,[dim,op(S[i])]):
for gu in result do
if gu[2][2]=[] or gu[2][3]=[] then
result:=result minus gu:
fi:
od:
eq:=eq union {A[i] - add(gu[1]*t*A[CountRank(S, [op(2..3, gu[2])])],
gu in result)}:
od:
gu:=solve(eq, var)[1]:
subs(gu, A[1]):

end:


GFMatrixL([20, [2, 3], [2, 4, 5]], t) returns

145t312t2+2t1.-\frac{1}{45\,{t}^{3}-12\,{t}^{2}+2\,t-1}.

Compared to empirical approach, the ‘symbolic dynamical programming’ method is faster and more efficient for the moderate-size examples that we tried out. However, as the lists will grow larger, it is likely that the former method will win out, since with this non-guessing approach, it is equally fast to get generating functions for determinants and permanents, and as we all know, permanents are hard.

The advantage of the present method is that it is more appealing to humans, and does not require any ‘meta-level’ act of faith. However, both methods are very versatile and are great experimental approaches for enumerative combinatorics problems. We hope that our readers will find other applications.

3.8 Remarks

Rather than trying to tackle each enumeration problem, one at a time, using ad hoc human ingenuity each time, building up an intricate transfer matrix, and only using the computer at the end as a symbolic calculator, it is a much better use of our beloved silicon servants (soon to become our masters!) to replace ‘thinking’ by ‘meta-thinking’, i.e. develop experimental mathematics methods that can handle many different types of problems. In the two case studies discussed here, every thing was made rigorous, but if one can make semi-rigorous and even non-rigorous discoveries, as long as they are interesting, one should not be hung up on rigorous proofs. In other words, if you can find a rigorous justification (like in these two case studies) that’s nice, but if you can’t, that’s also nice!

Chapter 4 Analysis of Quicksort Algorithms

This chapter is adapted from [50], which has been published on Journal of Difference Equations and Applications. It is also available on arXiv.org, number 1905.00118.

4.1 Introduction

A sorting algorithm is an algorithm that rearranges elements of a list in a certain order, the most frequently used orders being numerical order and lexicographical order. Sorting algorithms play a significant role in computer science since efficient sorting is important for optimizing the efficiency of other algorithms which require input data to be in sorted lists. In this chapter, our focus is Quicksort.

Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961. It has been a commonly used algorithm for sorting since then and is still widely used in industry.

The main idea for Quicksort is that we choose a pivot randomly and then compare the other elements with the pivot, smaller elements being placed on the left side of the pivot and larger elements on the right side of the pivot. Then we recursively apply the same operation to the sublists obtained from the partition step. As for the specific implementations, there can be numerous variants, some of which are at least interesting from a theoretical perspective despite their rare use in the real world.

It is well-known that the worst-case performance of Quicksort is O(n2)O(n^{2}) and the average performance is O(nlogn)O(n\log n). However, we are also interested in the explicit closed-form expressions for the moments of Quicksort’s performance, i.e., running time, in terms of the number of comparisons and/or the number of swaps. In this chapter, only lists or arrays containing distinct elements are considered.

The chapter is organized as follows. In Section 4.2, we review related work on the number of comparisons of 1-pivot Quicksort, whose methodology is essential for further study. In Section 4.3, the numbers of swaps of several variants of 1-pivot Quicksort are considered. In Section 4.4, we extend our study to multi-pivot Quicksort. In Section 4.5, the technique to obtain more moments and the scaled limiting distribution are discussed. In the last section we discuss some potential improvements for Quicksort, summarize the main results of this chapter and make final remarks on the methodology of experimental mathematics.

4.2 Related Work

In the masterpiece of Shalosh B. Ekhad and Doron Zeilberger [13], they managed to find the explicit expressions for expectation, variance and higher moments of the number of comparisons of 1-pivot Quicksort with an experimental mathematics approach, which is also considered as some form of “machine learning.” Here we will review the results they discovered or rediscovered.

Let CnC_{n} be the random variable “number of comparisons in Quicksort applied to lists of length nn,” n0n\geq 0.

Theorem 4.1 ([25], p.8, end of section 1.3; [17], Eq. (2.14), p. 29, and other places).
E[Cn]=2(n+1)H1(n)4n.E[C_{n}]=2(n+1)H_{1}(n)-4n.

Here H1(n)H_{1}(n) are the Harmonic numbers

H1(n):=i=1n1i.H_{1}(n):=\sum_{i=1}^{n}\frac{1}{i}.

More generally, in following theorems, we introduce the notation

Hk(n):=i=1n1ik.H_{k}(n):=\sum_{i=1}^{n}\frac{1}{i^{k}}.
Theorem 4.2 (Knuth, [27], answer to Ex. 8(b) in section 6.2.2)).
var[Cn]=n(7n+13) 2(n+1)H1(n)4(n+1)2H2(n).var[C_{n}]=n(7\,n+13)\,-\,2\,(n+1)\,H_{{1}}(n)-4\,(n+1)^{2}H_{{2}}(n).

Its asymptotic expression is

(723π2)n2+(132ln(n)2γ4/3π2)n2ln(n)2γ2/3π2+o(1).(7\,-\,\frac{2}{3}\,{\pi}^{2}){n}^{2}+(13-2\,\ln(n)-2\,\gamma-4/3\,{\pi}^{2})n-2\,\ln(n)-2\,\gamma-2/3\,{\pi}^{2}\,+\,o(1).
Theorem 4.3 (Zeilberger, [13]).

The third moment about the mean of CnC_{n} is

n(19n2+81n+104)+H1(n)(14n+14)+12(n+1)2H2(n)+16(n+1)3H3(n).-n(19\,{n}^{2}+81\,n+104)+H_{{1}}(n)(14\,n+14)+12\,(n+1)^{2}H_{{2}}(n)+16\,(n+1)^{3}H_{{3}}(n).

It is asymptotic to

(19+16ζ(3))n3+(81+2π2+48ζ(3))n2+(104+14ln(n)+14γ+4π2(-19+16\,\zeta(3)){n}^{3}+(-81+2\,{\pi}^{2}+48\,\zeta(3)){n}^{2}+(-104+14\,\ln(n)+14\,\gamma+4\,{\pi}^{2}
+48ζ(3))n+14ln(n)+14γ+2π2+16ζ(3)+o(1).+48\,\zeta(3))n+14\,\ln(n)+14\,\gamma+2\,{\pi}^{2}+16\,\zeta(3)\,+\,o(1).

It follows that the limit of the scaled third moment (skewness) converges to

19+16ζ(3)(72/3π2)3/2= 0.8548818671325885.{\frac{-19+16\,\zeta(3)}{(7-2/3\,{\pi}^{2})^{3/2}}}\,=\,0.8548818671325885\dots\quad.
Theorem 4.4 (Zeilberger, [13]).

The fourth moment about the mean of CnC_{n} is

19n(2260n3+9658n2+15497n+11357)2(n+1)(42n2+78n+77)H1(n)\frac{1}{9}\,n(2260\,{n}^{3}+9658\,{n}^{2}+15497\,n+11357)-2\,(n+1)(42\,{n}^{2}+78\,n+77)H_{{1}}(n)
+12(n+1)2(H1(n))2+(4(42n2+78n+31)(n+1)2+48(n+1)3H1(n))H2(n)+12\,(n+1)^{2}(H_{{1}}(n))^{2}+(-4\,(42\,{n}^{2}+78\,n+31)(n+1)^{2}+48\,(n+1)^{3}H_{{1}}(n))H_{{2}}(n)
+48(n+1)4(H2(n))296(n+1)3H3(n)96(n+1)4H4(n).+48\,(n+1)^{4}(H_{{2}}(n))^{2}-96\,(n+1)^{3}H_{{3}}(n)-96\,(n+1)^{4}H_{{4}}(n).

It is asymptotic to

(2260928π2+415π4)n4+(9658984ln(n)84γ+1/6(648+48ln(n)+48γ)π2({\frac{2260}{9}}-28\,{\pi}^{2}+{\frac{4}{15}}\,{\pi}^{4}){n}^{4}+({\frac{9658}{9}}-84\,\ln(n)-84\,\gamma+1/6\,(-648+48\,\ln(n)+48\,\gamma){\pi}^{2}
+1615π496ζ(3))n3+{\frac{16}{15}}\,{\pi}^{4}-96\,\zeta(3)){n}^{3}
+(154979240ln(n)240γ+12(ln(n)+γ)2+1/6(916+144ln(n)+144γ)π2+({\frac{15497}{9}}-240\,\ln(n)-240\,\gamma+12\,(\ln(n)+\gamma)^{2}+1/6\,(-916+144\,\ln(n)+144\,\gamma){\pi}^{2}
+8/5π4288ζ(3))n2+8/5\,{\pi}^{4}-288\,\zeta(3)){n}^{2}
+(113579310ln(n)310γ+24(ln(n)+γ)2+1/6(560+144ln(n)+144γ)π2+({\frac{11357}{9}}-310\,\ln(n)-310\,\gamma+24\,(\ln(n)+\gamma)^{2}+1/6\,(-560+144\,\ln(n)+144\,\gamma){\pi}^{2}
+1615π4288ζ(3))n+{\frac{16}{15}}\,{\pi}^{4}-288\,\zeta(3))n
154ln(n)154γ+12(ln(n)+γ)2+1/6(124+48ln(n)+48γ)π2-154\,\ln(n)-154\,\gamma+12\,(\ln(n)+\gamma)^{2}+1/6\,(-124+48\,\ln(n)+48\,\gamma){\pi}^{2}
+415π496ζ(3)+o(1).+{\frac{4}{15}}\,{\pi}^{4}-96\,\zeta(3)\,+\,o(1).

It follows that the limit of the scaled fourth moment (kurtosis) converges to

2260928π2+415π4(72/3π2)2= 4.1781156382698542.{\frac{{\frac{2260}{9}}-28\,{\pi}^{2}+{\frac{4}{15}}\,{\pi}^{4}}{(7-2/3\,{\pi}^{2})^{2}}}\,=\,4.1781156382698542\dots\quad.

Results for higher moments, more precisely, up to the eighth moment, are also discovered and discussed by Shalosh B. Ekhad and Doron Zeilberger in [13].

Before this article, there are already human approaches to find the expectation and variance for the number of comparisons. Let cn=E[Cn]c_{n}=E[C_{n}]. Since the pivot can be the kk-th smallest element in the list (k=1,2,,n)(k=1,2,\dots,n), we have the recurrence relation

cn=1nk=1n((n1)+ck1+cnk)=(n1)+1nk=1n(ck1+cnk)=(n1)+2nk=1nck1,c_{n}=\frac{1}{n}\sum_{k=1}^{n}((n-1)+c_{k-1}+c_{n-k})=(n-1)+\frac{1}{n}\sum_{k=1}^{n}(c_{k-1}+c_{n-k})=(n-1)+\frac{2}{n}\sum_{k=1}^{n}\,c_{k-1},

because the expected number of comparisons for the sublist before the pivot is ck1c_{k-1} and that for the sublist after the pivot is cnkc_{n-k}. From this recurrence relation, complicated human-generated manipulatorics is needed to rigorously derive the closed form. For the variance, the calculation is much more complicated. For higher moments, we doubt that human approach is realistic.

The experimental mathematics approach is more straightforward and more powerful. For the expectation, a list of data can be obtained through the recurrence relation and the initial condition. Then with an educated guess that cnc_{n} is a polynomial of degree one in both nn and H1(n)H_{1}(n), i.e.,

cn=a+bn+cH1(n)+dnH1(n)c_{n}=a+bn+cH_{1}(n)+dnH_{1}(n)

where a,b,c,da,b,c,d are undetermined coefficients, we can solve for these coefficients by plugging sufficiently many nn and cnc_{n} in this equation.

For higher moments, there is a similar recurrence relation for the probability generating function of CnC_{n}. With the probability generating function, a list of data of any fixed moment can be obtained. Then with another appropriate educated guess of the form of the higher moments, the explicit expression follows.

In [13], it is already discussed that this experimental mathematics approach, which utilizes a recurrence relation to study the Quicksort algorithms, is actually rigorous by pointing out that this is a finite calculation and by referring to results in [36] and [37].

4.3 Number of Swaps of 1-Pivot Quicksort

The performance of Quicksort depends on the number of swaps and comparisons performed. In reality, a swap usually takes more computing resources than a comparison. The difficulty in studying the number of swaps is that the number of swaps depends on how we implement the Quicksort algorithm while the number of comparisons are the same despite the specific implementations.

Since only the number of comparisons is considered in [13], the Quicksort model in [13] is that one picks the pivot randomly, compares each non-pivot element with the pivot and then places them in one of the two new lists L1L_{1} and L2L_{2} where the former contains all elements smaller than the pivot and the latter contains those greater than the pivot. Under this model there is no swap, but a lot of memory is needed. For convenience, let’s call this model Variant Nulla.

In this section, we consider the random variable, the number of swaps XnX_{n}, in different Quicksort variants. Some of them may not be efficient or widely used in industry; however, we treat them as an interesting problem and model in permutations and discrete mathematics. In the first subsection, we also demonstrate our experimental mathematics approaches step by step.

4.3.1 Variant I

The first variant is that we always choose the first (or equivalently, the last) element in the list of length nn as the pivot, then we compare the other elements with the pivot. We compare the second element with the pivot first. If it is greater than the pivot, it stays where it is, otherwise we remove it from the list and then insert it before the pivot. Though this is somewhat different from the “traditional swap,” we define this operation as a swap. Generally, every time we find an element smaller than the pivot, we insert it before the pivot.

Hence, after n1n-1 comparisons and some number of swaps, the partition is achieved, i.e., all elements on the left of the pivot are smaller than the pivot and all elements on the right of the pivot are greater than the pivot. The difference between this variant and Variant Nulla is that this one does not need to create new lists so that it saves memory.

Let Pn(t)P_{n}(t) be the probability generating function for the number of swaps, i.e.,

Pn(t)=k=0P(Xn=k)tk,P_{n}(t)=\sum_{k=0}^{\infty}P(X_{n}=k)\,t^{k},

where for only finitely many integers kk, we have that P(Xn=k)P(X_{n}=k) is nonzero.

We have the recurrence relation

Pn(t)=1nk=1nPk1(t)Pnk(t)tk1,P_{n}(t)=\frac{1}{n}\sum_{k=1}^{n}P_{k-1}(t)P_{n-k}(t)t^{k-1},

with the initial condition P0(t)=P1(t)=1P_{0}(t)=P_{1}(t)=1 because for any fixed k{1,2,,n}k\in\{1,2,\dots,n\}, the probability that the pivot is the kk-th smallest is 1n\frac{1}{n} and there are exactly k1k-1 swaps when the pivot is the kk-th smallest.

The Maple procedure SwapPQs(n,t) in the package Quicksort.txt implements the recurrence of the probability generating function.

Recall that the rr-th moment is given in terms of the probability generating function

E[Xnr]=(tddt)rPn(t)|t=1.E[X_{n}^{r}]=(t\frac{d}{dt})^{r}P_{n}(t)\,|_{t=1}.

The moment about the mean

mr(n):=E[(Xncn)r],m_{r}(n):=E[(X_{n}-c_{n})^{r}],

can be easily derived from the raw moments {E[Xnl]| 1lr}\{E[X_{n}^{l}]\,|\,1\leq l\leq r\}, using the binomial theorem and linearity of expectation. Another way to get the moment about the mean is by considering

mr(n)=(tddt)r(Pn(t)tcn)|t=1.m_{r}(n)=(t\frac{d}{dt})^{r}(\frac{P_{n}(t)}{t^{c_{n}}})\,|_{t=1}.

Recall that

Hk(n):=i=1n1ik.H_{k}(n):=\sum_{i=1}^{n}\frac{1}{i^{k}}.

Our educated guess is that there exists a polynomial of r+1r+1 variables Fr(x0,x1,,xr)F_{r}(x_{0},x_{1},\dots,x_{r}) such that

mr(n)=Fr(n,H1(n),,Hr(n)).m_{r}(n)=F_{r}(n,H_{1}(n),\dots,H_{r}(n)).

With the Maple procedure QsMFn, we can easily obtain the following theorems by just entering QsMFn(SwapPQs, t, n, Hn, r) where rr represents the moment you are interested in. When r=1r=1, it returns the explicit expression for its mean rather than the trivial “first moment about the mean”.

Theorem 4.5.

The expectation of the number of swaps of Quicksort for a list of length nn under Variant I is

E[Xn]=(n+1)H1(n)2n.E[X_{n}]=(n+1)H_{1}(n)-2n.
Theorem 4.6.

The variance of XnX_{n} is

2n(n+2)(n+1)H1(n)(n+1)2H2(n).2n(n+2)-(n+1)H_{1}(n)-(n+1)^{2}H_{2}(n).
Theorem 4.7.

The third moment about the mean of XnX_{n} is

94n(n+3)2+(4n+4)H1(n)+3(n+1)2H2(n)+2(n+1)3H3(n).-\frac{9}{4}n(n+3)^{2}+(4n+4)H_{1}(n)+3(n+1)^{2}H_{2}(n)+2(n+1)^{3}H_{3}(n).
Theorem 4.8.

The fourth moment about the mean of XnX_{n} is

118n(335n3+1568n2+3067n+2770)3(n+1)(4n2+8n+9)H1(n)+3(n+1)2H1(n)2\frac{1}{18}n(335n^{3}+1568n^{2}+3067n+2770)-3(n+1)(4n^{2}+8n+9)H_{1}(n)+3(n+1)^{2}H_{1}(n)^{2}
+((12n2+24n+19)(n+1)2+6(n+1)3H1(n))H2(n)+3(n+1)4H2(n)2+(-(12n^{2}+24n+19)(n+1)^{2}+6(n+1)^{3}H_{1}(n))H_{2}(n)+3(n+1)^{4}H_{2}(n)^{2}
12(n+1)3H3(n)6(n+1)4H4(n).-12(n+1)^{3}H_{3}(n)-6(n+1)^{4}H_{4}(n).

The explicit expressions for higher moments can be easily calculated automatically with the Maple procedure QsMFn and the interested readers are encouraged to find those formulas on their own.

4.3.2 Variant II

The second variant is similar to the first one. One tiny difference is that instead of choosing the first or last element as the pivot, the index of the pivot is chosen uniformly at random. For example, we choose the ii-th element, which is the kk-th smallest, as the pivot. Then we compare those non-pivot elements with the pivot. If i1i\neq 1, the first element will be compared with the pivot first. If it is smaller than the pivot, it stays there, otherwise it is moved to the end of the list. After comparing all the left-side elements with the pivot, we look at those elements whose indexes are originally greater than ii. If they are greater than the pivot, no swap occurs; otherwise insert them before the pivot.

In this case, the recurrence of the probability generating function Pn(t)P_{n}(t) is more complicated as a consequence of that the number of swaps given that ii and kk is known is still a random variable rather than a fixed number as the case in Variant I.

Let Q(n,k,i,t)Q(n,k,i,t) be the probability generating function for such a random variable. In fact, given a random permutation in the permutation group SnS_{n} and that the ii-th element is kk, the number of swaps equals to the number of elements which are before kk and greater than kk or after kk and smaller than kk. Hence, if there are jj elements which are before kk and smaller than kk, then there are i1ji-1-j elements which are before kk and greater than kk and there are k1jk-1-j elements which are after kk and smaller than kk. So in this case the number of swaps is i+k22ji+k-2-2j.

Then we need to determine the range of jj. Obviously it is at least 0. In total there are k1k-1 elements which are less than kk, at most nin-i of them occurring after kk, so jk1n+ij\geq k-1-n+i. As for the upper bound, since there are only i1i-1 elements before kk, we have ji1j\leq i-1. Evidently, jk1j\leq k-1 as well. Therefore the range of jj is [max(k1n+i,0),min(i1,k1)][\,\max(k-1-n+i,0),\,\min(i-1,k-1)\,].

As for the probability that there are exactly jj elements which are before kk and smaller than kk, it equals to

(i1j)s=0j1k1sn1ss=0ij2nksn1js.\binom{i-1}{j}\prod_{s=0}^{j-1}\frac{k-1-s}{n-1-s}\prod_{s=0}^{i-j-2}\frac{n-k-s}{n-1-j-s}.

Consequently, the probability generating function is

Q(n,k,i,t)=j=max(k1n+i,0)min(i1,k1)(i1j)s=0j1k1sn1ss=0ij2nksn1jsti+k22j,Q(n,k,i,t)=\sum_{j=\max(k-1-n+i,0)}^{\min(i-1,k-1)}\binom{i-1}{j}\prod_{s=0}^{j-1}\frac{k-1-s}{n-1-s}\prod_{s=0}^{i-j-2}\frac{n-k-s}{n-1-j-s}t^{i+k-2-2j},

which is implemented by the Maple procedure PerProb(n, k, i, t). For example, PerProb(9, 5, 5, t) returns

170t8+835t6+1835t4+835t2+170.\frac{1}{70}t^{8}+\frac{8}{35}t^{6}+\frac{18}{35}t^{4}+\frac{8}{35}t^{2}+\frac{1}{70}.

We have the recurrence relation

Pn(t)=1n2k=1ni=1nPk1(t)Pnk(t)Q(n,k,i,t),P_{n}(t)=\frac{1}{n^{2}}\sum_{k=1}^{n}\sum_{i=1}^{n}P_{k-1}(t)P_{n-k}(t)Q(n,k,i,t),

with the initial condition P0(t)=P1(t)=1P_{0}(t)=P_{1}(t)=1, which is implemented by the Maple procedure SwapPQ(n, t). The following theorems follow immediately.

Theorem 4.9.

The expectation of the number of swaps of Quicksort for a list of length nn under Variant II is

E[Xn]=(n+1)H1(n)2n.E[X_{n}]=(n+1)H_{1}(n)-2n.
Theorem 4.10.

The variance of XnX_{n} is

16n(11n+17)13(n+1)H1(n)(n+1)2H2(n).\frac{1}{6}n(11n+17)-\frac{1}{3}(n+1)H_{1}(n)-(n+1)^{2}H_{2}(n).
Theorem 4.11.

The third moment about the mean of XnX_{n} is

16n(14n2+57n+73)+(2n+2)H1(n)+(n+1)2H2(n)+2(n+1)3H3(n).-\frac{1}{6}n(14n^{2}+57n+73)+(2n+2)H_{1}(n)+(n+1)^{2}H_{2}(n)+2(n+1)^{3}H_{3}(n).
Theorem 4.12.

The fourth moment about the mean of XnX_{n} is

190n(1496n3+5531n2+8527n+6922)115(n+1)(55n2+85n+173)H1(n)\frac{1}{90}n(1496n^{3}+5531n^{2}+8527n+6922)-\frac{1}{15}(n+1)(55n^{2}+85n+173)H_{1}(n)
+13(n+1)2H1(n)2+(13(33n2+51n+25)(n+1)2+2(n+1)3H1(n))H2(n)+\frac{1}{3}(n+1)^{2}H_{1}(n)^{2}+(-\frac{1}{3}(33n^{2}+51n+25)(n+1)^{2}+2(n+1)^{3}H_{1}(n))H_{2}(n)
+3(n+1)4H2(n)24(n+1)3H3(n)6(n+1)4H4(n).+3(n+1)^{4}H_{2}(n)^{2}-4(n+1)^{3}H_{3}(n)-6(n+1)^{4}H_{4}(n).

Higher moments can also be easily obtained by entering QsMFn(SwapPQ, t, n, Hn, r) where rr represents the rr-th moment you are interested in.

Comparing with Variant I where the index of the pivot is fixed, we find that these two variants have the same expected number of swaps. However, the variance and actually all even moments of the second variant are smaller. Considering that the average performance is already O(nlogn)O(n\log n) which is not far from the best scenario, it is favorable that a Quicksort algorithm has smaller variance. In conclusion, for this model, a randomly-chosen-index pivot can improve the performance of the algorithm.

4.3.3 Variant III

Next we’d like to study the most widely used in-place Quicksort. This algorithm is called Lomuto partition scheme, which is attributed to Nico Lomuto and popularized by Bentley in his book Programming Pearls and Cormen, et al. in their book Introduction to Algorithms. This scheme chooses a pivot that is typically the last element in the list. Two indexes, ii, the insertion index, and jj, the search index are maintained. The following is the pseudo code for this variant.

algorithm quicksort(A, s, t) is
if s < t then
  p := partition(A, s, t)
  quicksort(A, s, p - 1)
  quicksort(A, p + 1, t)



algorithm partition(A, s, t) is
 pivot := A[t]
 i := s
for j := s to t - 1 do
  if A[j] < pivot then
   swap A[i] with A[j]
   i := i + 1
 swap A[i] with A[t]
return i

From the algorithm we can see that when the pivot is the kk-th smallest, there are k1k-1 elements smaller than the pivot. As a result, there are k1k-1 swaps in the if statement of the algorithm partition. Including the last swap outside the if statement, there are kk total swaps. We have the recurrence relation for its probability generating function

Pn(t)=1nk=1nPk1(t)Pnk(t)tkP_{n}(t)=\frac{1}{n}\sum_{k=1}^{n}P_{k-1}(t)P_{n-k}(t)t^{k}

with the initial condition P0(t)=P1(t)=1P_{0}(t)=P_{1}(t)=1.

Theorem 4.13.

The expectation of the number of swaps of Quicksort for a list of length nn under Variant III

E[Xn]=(n+1)H1(n)43n13.E[X_{n}]=(n+1)H_{1}(n)-\frac{4}{3}n-\frac{1}{3}.
Theorem 4.14.

The variance of the number of swaps of Quicksort for a list of length nn under Variant III

var[Xn]=2n2+18745n+74523n(n2+2n+1)H2(n)(n+1)H1(n).var[X_{n}]=2n^{2}+\frac{187}{45}n+\frac{7}{45}-\frac{2}{3n}-(n^{2}+2n+1)H_{2}(n)-(n+1)H_{1}(n).

Note that for this variant, and some other ones in the remainder of the chapter, to find out its explicit expression of moments, we may need to modify our educated guess to a “rational function” of nn and Hk(n)H_{k}(n) for some kk (see procedure QsMFnRat and QsMFnRatG). Moreover, sometimes when we solve the equations obtained by equalizing the guess with the empirical data, some initial terms should be disregarded since the increasing complexity of the definition of the Quicksort algorithms may lead to the “singularity” of the first several terms of moments. Usually, the higher the moment is, the more initial terms should be disregarded.

4.3.4 Variant IV

In Variant III, every time when A[j] < pivot, we swap A[i] with A[j]. However, it is a waste to swap them when i=ji=j. If we modify the algorithm such that a swap is performed only when the indexes iji\neq j, the expected cost will be reduced. Besides, if the pivot is actually the largest element, there is no need to swap A[i] with A[t] in the partition algorithm. To popularize Maple in mathematical and scientific research, we attach Maple code for the partition part here, the ParIP procedure in the package QuickSort.txt, in which Swap(S, i, j) is a subroutine to swap the ii-th element with the jj-th in a list SS.

ParIP:=proc(L) local pivot,i,j,n,S:
n:=nops(L):
pivot:=L[n]:
S:=L:
i:=1:
for j from 1 to n-1 do
 if S[j]<=pivot then
  if i<>j then
   S:=Swap(S, i, j):
   i:=i+1:
  else
   i:=i+1:
  fi:
 fi:
od:
if i<>n then
 return Swap(S, i, n), i:
else
 return S, i:
fi:
end:

Lemma 4.15.

Let Yn(k)Y_{n}(k) be the number of swaps needed in the first partition step in an in-place Quicksort without swapping the same index for a list LL of length nn when the pivot is the kk-th smallest element, then

Yn(k)={|{i[n]|L[i]pivotj<i,L[j]>pivot}|k<n0k=n.Y_{n}(k)=\begin{cases}|\,\{i\in[n]\,|\,L[i]\leq pivot\wedge\exists j<i,L[j]>pivot\}\,|&k<n\\ 0&k=n\end{cases}.
Proof.

When k=nk=n, it is obvious that for each search index jj, the condition S[j] <= pivot is satisfied, hence the insertion index ii is always equal to jj, which means there is no swap inside the loop. Since eventually i=ni=n, there is also no need to swap the pivot with itself. So the number of swaps is 0 in this case.

When k<nk<n, notice that the first time ii is smaller than jj is when we find the first element greater than the pivot. After that, ii will always be less than jj, which implies that for each element smaller than the pivot and the pivot itself, one swap is performed. ∎

The Maple procedure IPProb(n,k,t) takes inputs n,kn,k as defined above and a symbol tt, and outputs the probability generating function Q(n,k,t)Q(n,k,t) for the number of swaps in the first partition when the length of the list is nn and the last element, which is chosen as the pivot, is the kk-th smallest.

When k<nk<n, the probability that there are ss swaps is

(k1ks)(ks)!(nk)(nk2+s)!(n1)!=nkn1(k1ks)(n2ks).\binom{k-1}{k-s}\frac{(k-s)!(n-k)(n-k-2+s)!}{(n-1)!}=\frac{n-k}{n-1}\frac{\binom{k-1}{k-s}}{\binom{n-2}{k-s}}.

Therefore the probability generating function

Q(n,k,t)=s=1knkn1(k1ks)(n2ks)ts.Q(n,k,t)=\sum_{s=1}^{k}\frac{n-k}{n-1}\frac{\binom{k-1}{k-s}}{\binom{n-2}{k-s}}t^{s}.

The recurrence relation for the probability generating function Pn(t)P_{n}(t) of the total number of swaps follows immediately:

Pn(t)=1nk=1nPk1(t)Pnk(t)Q(n,k,t)P_{n}(t)=\frac{1}{n}\sum_{k=1}^{n}P_{k-1}(t)P_{n-k}(t)Q(n,k,t)

with the initial condition P0(t)=P1(t)=1P_{0}(t)=P_{1}(t)=1.

Theorem 4.16.

The expectation of the number of swaps of Quicksort for a list of length nn under Variant IV

E[Xn]=(n+2)H1(n)52n12.E[X_{n}]=(n+2)H_{1}(n)-\frac{5}{2}n-\frac{1}{2}.
Theorem 4.17.

The variance of the number of swaps of Quicksort for a list of length nn under Variant IV

var[Xn]=2n221512n+112+(11n+14)H1(n)(n22n2)H2(n)(2n+2)H1(n)2.var[X_{n}]=2n^{2}-{\frac{215}{12}}n+\frac{1}{12}+(11n+14)H_{1}(n)-(n^{2}-2n-2)H_{2}(n)-(2n+2)H_{1}(n)^{2}.

Comparing these results with Theorem 4.5 and Theorem 4.9, it shows that Variant IV has better average performances, notwithstanding the “broader” definition of “swap” in the first two subsections. And of course, it is better than the in-place Quicksort which swaps the indexes even when they are the same. We fully believe that the additional cost to check whether the insertion and search indexes are the same is worthwhile.

4.3.5 Variant V

This variant might not be practical, but we find that it is interesting as a combinatorial model. As is well-known, if a close-to-median element is chosen as a pivot, the Quicksort algorithm will have better performance than average in this case. Hence if additional information is available so that the probability distribution of chosen pivots is no longer a uniform distribution but something Gaussian-like, it is to our advantage.

Assume that the list is a permutation of [n][n] and we are trying to sort it, pretending that we do not know the sorted list must be [1,2,,n][1,2,\dots,n]. Now the rule is that we choose the first and last number in the list, look at the numbers and choose the one which is closer to the median. If the two numbers are equally close to the median, then choose one at random.

Without loss of generality, we can always assume that the last element in the list is chosen as the pivot; otherwise we just need to run the algorithm in the last subsection “in reverse”, putting both the insertion and search indexes on the last element and letting larger elements be swapped to the left side of the pivot, etc. So the only difference with Variant IV is the probability distribution of kk, which is no longer 1n\frac{1}{n} for each k[n].k\in[n].

Considering symmetry, Pr(n)(pivot=k)=Pr(n)(pivot=n+1k)Pr^{(n)}(pivot=k)=Pr^{(n)}(pivot=n+1-k), so we only need to consider 1k(n+1)/21\leq k\leq(n+1)/2. When nn is even, let n=2mn=2m. Then Pr(n)(pivot=k)=4k3(2m1)2mPr^{(n)}(pivot=k)=\frac{4k-3}{(2m-1)2m} . When nn is odd, let n=2m1n=2m-1, then Pr(n)(pivot=k)=4k3(2m1)(2m2)Pr^{(n)}(pivot=k)=\frac{4k-3}{(2m-1)(2m-2)} when k<mk<m and Pr(n)(pivot=m)=22m1Pr^{(n)}(pivot=m)=\frac{2}{2m-1}.

With this minor modification, the recurrence relation for Pn(t)P_{n}(t) follows.

Pn(t)=k=1nPk1(t)Pnk(t)Q(n,k,t)Pr(n)(pivot=k)P_{n}(t)=\sum_{k=1}^{n}P_{k-1}(t)P_{n-k}(t)Q(n,k,t)Pr^{(n)}(pivot=k)

with the initial condition P0(t)=P1(t)=1P_{0}(t)=P_{1}(t)=1.

Though an explicit expression seems difficult in this case, we can still analyze the performance of the algorithm by evaluating its expected number of swaps. By exploiting the Maple procedure MomFn(f,t,m,N), which inputs a function name ff, a symbol tt, the order of the moment mm and the upper bound of the length of the list NN and outputs a list of mm-th moments for the Quicksort described by the probability generating function ff of lists of length 1,2,,N1,2,\dots,N, we find that Variant V has better average performance than Variant IV when nn is large enough. For example, MomFn(PQIP, t, 1, 20) returns

[0,12,76,2,17960,4110,747140,18728,204592520,1013105,31208327720,256311980,35320124024,148873790090,[0,\frac{1}{2},\frac{7}{6},2,{\frac{179}{60}},{\frac{41}{10}},{\frac{747}{140}},{\frac{187}{28}},{\frac{20459}{2520}},{\frac{1013}{105}},{\frac{312083}{27720}},{\frac{25631}{1980}},{\frac{353201}{24024}},{\frac{1488737}{90090}},
6634189360360,81493940040,27385591712252240,4983019204204,979300393695120,20210819705432],{\frac{6634189}{360360}},{\frac{814939}{40040}},{\frac{273855917}{12252240}},{\frac{4983019}{204204}},{\frac{97930039}{3695120}},{\frac{20210819}{705432}}],

and MomFn(PQIPk, t, 1, 20) returns

[0,12,43,209,15548,1957450,2341420,4055588,558296720,79481,63054755440,17009513068,12735487864864,[0,\frac{1}{2},\frac{4}{3},{\frac{20}{9}},{\frac{155}{48}},{\frac{1957}{450}},{\frac{2341}{420}},{\frac{4055}{588}},{\frac{55829}{6720}},{\frac{794}{81}},{\frac{630547}{55440}},{\frac{170095}{13068}},{\frac{12735487}{864864}},
3864281234234,2521865137592,364243271801800,4343228489196035840,1077683474463316,15673532207598609440,113659973540209624].{\frac{3864281}{234234}},{\frac{2521865}{137592}},{\frac{36424327}{1801800}},{\frac{4343228489}{196035840}},{\frac{107768347}{4463316}},{\frac{15673532207}{598609440}},{\frac{1136599735}{40209624}}].

We notice that for n14n\geq 14, Variant V consistently has better average performance. From this result we can conclude that it is worth choosing a pivot from two candidates since the gains of efficiency are far more than its cost.

Moreover, we can obtain a recurrence for the expected number xnx_{n} of the random variable XnX_{n}. The Findrec(f,n,N,MaxC) procedure in the Maple package Findrec.txt inputs a list of data LL, two symbols nn and NN, where nn is the discrete variable, and NN is the shift operator, and MaxCMaxC which is the maximum degree plus order. Findrec(MomFn(PQIPk, t, 1, 80), n, N, 11) returns

Refer to caption
Figure 4.1: The Recurrence Relation for the Expected Number of Swaps

As previously mentioned, NN is the shift operator. Since this formula is too long, to see its detailed interpretation, please look at Theorem 4.23 as reference.

We can also look at more elements to choose the middle-most one as the pivot. In case that we do not want to store so much information, some other variants involving a “look twice” idea could be that if the first selected element is within some satisfactory interval, e.g., [n4,3n4][\frac{n}{4},\frac{3n}{4}] for a permutation of nn, then it is chosen as the pivot. Otherwise we choose a second element as the pivot without storing information about the first one. It is also likely to improve the algorithm with “multiple looks” to choose the pivot and the requirement to choose the current element as the pivot without continuing to look at the next one could vary and ideally the requirement should be more relaxed as the number of “looks” increases. We also refer the readers to [26] where P. Kirschenhofer, H. Prodinger and C. Martinez chose the median of a random sample of three elements as the pivot and obtained explicit expressions for this variant’s performance with methods of hypergeometric differential equations. In general, it is ideal to have a probability distribution of pivots where an element closer to the median is more likely to be chosen.

4.4 Explorations for Multi-Pivot Quicksort

With the same general idea as the 1-pivot Quicksort, it is natural to think about “what if we choose more pivots.” In kk-pivot Quicksort, kk indexes are chosen and the correspondent elements become pivots. The kk pivots need to be sorted and then the other elements will be partitioned into one of the k+1k+1 sublists. Compared to 1-pivot Quicksort, multi-pivot Quicksort is much more complicated because we need to determine how to sort the pivots, how to allocate each non-pivot element to the sublist they belong to and how to sort a sublist containing less than kk elements. Therefore, there are a few multi-pivot Quicksort variants. We refer the reader to [21] for other versions of multi-pivot. When kk is large, it seems difficult to have an in-place version, so we mainly consider the random variable, the number of comparisons CnC_{n}, in this section since a swap might be difficult to define in this case.

4.4.1 Dual-Pivot Quicksort

Let’s start from the simplest multi-pivot Quicksort: dual-pivot. The model for dual-pivot Quicksort is that two pivots p1p_{1} and p2p_{2} are randomly chosen. After one comparison, they are sorted, say p1<p2p_{1}<p_{2}. Non-pivot elements should be partitioned into one of the three sublists L1,L2L_{1},L_{2} and L3L_{3}. L1L_{1} contains elements smaller than p1p_{1}, L2L_{2} contains elements between p1p_{1} and p2p_{2} while L3L_{3} contains elements greater than p2p_{2}. Each non-pivot element is compared with p1p_{1} first. If it is smaller than p1p_{1}, we are done. Otherwise it is compared with p2p_{2} to determine whether it should go to L2L_{2} or L3L_{3}.

Given that the list contains nn distinct elements and the two pivots are the ii-th and jj-th smallest element (i<j)(i<j), we need one comparison to sort the pivot and i1+2(ni1)=2ni3i-1+2(n-i-1)=2n-i-3 comparisons to distribute non-pivot elements to the sublists. Hence the total number of comparison is 2ni22n-i-2 and the recurrence relation for the probability generating function Pn(t)P_{n}(t) of the total number of comparisons CnC_{n} of dual-pivot Quicksort is

Pn(t)=1(n2)j=2ni=1j1Pi1(t)Pji1(t)Pnj(t)t2ni2P_{n}(t)=\frac{1}{\binom{n}{2}}\sum_{j=2}^{n}\sum_{i=1}^{j-1}P_{i-1}(t)P_{j-i-1}(t)P_{n-j}(t)t^{2n-i-2}

with the initial condition P0(t)=P1(t)=1P_{0}(t)=P_{1}(t)=1 and P2(t)=tP_{2}(t)=t.

The above recurrence is implemented by the procedure PQc2. With the aforementioned Maple procedure QsMFn it is easy to get the following theorems.

Theorem 4.18.

The expectation of the number of comparisons in dual-pivot Quicksort algorithms is

E[Cn]=2(n+1)H1(n)4n.E[C_{n}]=2(n+1)H_{1}(n)-4n.
Theorem 4.19.

The variance of the number of comparisons in dual-pivot Quicksort algorithms is

var[Cn]=n(7n+13) 2(n+1)H1(n)4(n+1)2H2(n).var[C_{n}]=n(7\,n+13)\,-\,2\,(n+1)\,H_{{1}}(n)-4\,(n+1)^{2}H_{{2}}(n).
Theorem 4.20.

The third moment about the mean of CnC_{n} is

n(19n2+81n+104)+H1(n)(14n+14)+12(n+1)2H2(n)+16(n+1)3H3(n).-n(19\,{n}^{2}+81\,n+104)+H_{{1}}(n)(14\,n+14)+12\,(n+1)^{2}H_{{2}}(n)+16\,(n+1)^{3}H_{{3}}(n).
Theorem 4.21.

The fourth moment about the mean of CnC_{n} is

19n(2260n3+9658n2+15497n+11357)2(n+1)(42n2+78n+77)H1(n)\frac{1}{9}\,n(2260\,{n}^{3}+9658\,{n}^{2}+15497\,n+11357)-2\,(n+1)(42\,{n}^{2}+78\,n+77)H_{{1}}(n)
+12(n+1)2(H1(n))2+(4(42n2+78n+31)(n+1)2+48(n+1)3H1(n))H2(n)+12\,(n+1)^{2}(H_{{1}}(n))^{2}+(-4\,(42\,{n}^{2}+78\,n+31)(n+1)^{2}+48\,(n+1)^{3}H_{{1}}(n))H_{{2}}(n)
+48(n+1)4(H2(n))296(n+1)3H3(n)96(n+1)4H4(n).+48\,(n+1)^{4}(H_{{2}}(n))^{2}-96\,(n+1)^{3}H_{{3}}(n)-96\,(n+1)^{4}H_{{4}}(n).

As usual, any higher moment can be easily obtained with a powerful silicon servant. Careful readers may notice that the above four theorems are exactly the same as the ones in Section 4.2. It is natural to ask whether they indeed have the same probability distribution. The answer is yes. In Section 4.1.2 of [20] the author gives a sketch of proof showing that 1-pivot and dual-pivot Quicksorts’s numbers of comparisons satisfy the same recurrence relation and initial condition. From an experimental mathematical point of view, a semi-rigorous proof obtained by checking sufficiently many special cases is good enough for us. For example, the first 10 probability generating function, Pn(t)P_{n}(t) for 1n101\leq n\leq 10 can be calculated in a nanosecond by entering [seq(PQc2(i, t), i = 1..10)] and we have

P1(t)=1,P_{1}(t)=1,
P2(t)=t,P_{2}(t)=t,
P3(t)=23t3+13t2,P_{3}(t)=\frac{2}{3}t^{3}+\frac{1}{3}t^{2},
P4(t)=13t6+16t5+12t4,P_{4}(t)=\frac{1}{3}{t}^{6}+\frac{1}{6}{t}^{5}+\frac{1}{2}{t}^{4},
P5(t)=215t10+115t9+15t8+415t7+13t6,P_{5}(t)=\frac{2}{15}{t}^{10}+\frac{1}{15}{t}^{9}+\frac{1}{5}{t}^{8}+\frac{4}{15}t^{7}+\frac{1}{3}t^{6},
\vdots

which are exactly the same as the probability generating function for the number of comparisons of 1-pivot Quicksort.

In conclusion, in terms of probability distribution of the number of comparisons, the dual-pivot Quicksort does not appear to be better than the ordinary 1-pivot Quicksort.

As for the random variable XnX_{n}, the number of swaps, it depends on the specific implementation of the algorithm and the definition of a “swap.” As a toy model, we do an analogue of Section 4.3.1. Assume the list is a permutation of [n][n]. The first and last elements are chosen as the pivot. Let’s say they are ii and jj. If i>ji>j then we swap them and still call the smaller pivot ii. For each element less than ii, we move it to the left of ii, and for each element greater than jj, we move it to the right of jj and call this kind of operations a swap.

The recurrence relation for the probability generating function of XnX_{n} is

Pn(t)=1(n2)(12+12t)j=2ni=1j1Pi1(t)Pji1(t)Pnj(t)tn1+ijP_{n}(t)=\frac{1}{\binom{n}{2}}(\frac{1}{2}+\frac{1}{2}t)\sum_{j=2}^{n}\sum_{i=1}^{j-1}P_{i-1}(t)P_{j-i-1}(t)P_{n-j}(t)t^{n-1+i-j}

with the initial conditions P0(t)=P1(t)=1P_{0}(t)=P_{1}(t)=1 and P2(t)=12+12t.P_{2}(t)=\frac{1}{2}+\frac{1}{2}t.

Theorem 4.22.

The expectation of the number of swaps in the above dual-pivot Quicksort variant is

E[Xn]=45(n+1)H1(n)3925n1100.E[X_{n}]=\frac{4}{5}(n+1)H_{1}(n)-\frac{39}{25}n-\frac{1}{100}.

Note that this result is better than those in Sections 4.3.1 and 4.3.2.

4.4.2 Three-Pivot Quicksort

As mentioned at the beginning of this section, to define a 3-pivot Quicksort, we need to define 1) how to sort the pivots, 2) how to partition the list and 3) how to sort a list or sublist containing less than 3 pivots. Since this chapter is to study Quicksort, we choose 1-pivot Quicksort for 1) and 3). For 2), it seems that the binary search is a good option since for each non-pivot element exactly 2 comparisons with the pivots are needed.

The Maple procedure PQs(n,t) outputs the probability generating function for 1-pivot Quicksort of a list of length nn. Hence during the process of sorting the pivots and partitioning the list, the probability generating function of the number of comparisons is PQs(3,t)t2n6PQs(3,t)t^{2n-6}, which equals to

(23t3+13t2)t2n6=23t2n3+13t2n4.(\frac{2}{3}t^{3}+\frac{1}{3}t^{2})t^{2n-6}=\frac{2}{3}t^{2n-3}+\frac{1}{3}t^{2n-4}.

So the recurrence relation for the probability generating function Pn(t)P_{n}(t) of the total number of comparisons for 3-pivot Quicksort of a list of length nn is

Pn(t)=1(n3)k=3nj=2k1i=1j1Pi1(t)Pji1(t)Pkj1(t)Pnk(t)(23t2n3+13t2n4)P_{n}(t)=\frac{1}{\binom{n}{3}}\sum_{k=3}^{n}\sum_{j=2}^{k-1}\sum_{i=1}^{j-1}P_{i-1}(t)P_{j-i-1}(t)P_{k-j-1}(t)P_{n-k}(t)(\frac{2}{3}t^{2n-3}+\frac{1}{3}t^{2n-4})

with initial conditions P0(t)=P1(t)=1,P2(t)=tP_{0}(t)=P_{1}(t)=1,P_{2}(t)=t and P3(t)=23t3+13t2P_{3}(t)=\frac{2}{3}t^{3}+\frac{1}{3}t^{2}. This recurrence is implemented by the procedure PQd3.

The explicit expression seems to be difficult to obtain in this case, but numerical tests imply that 3-pivot Quicksort has better performances than dual-pivot, and of course 1-pivot since it is indeed equivalent to dual-pivot.

By exploiting the Maple procedure MomFn(f,t,m,N) again, we can compare the expectations of different Quicksort variants.

For instance, MomFn(PQc2, t, 1, 20) returns

[0,1,83,296,375,10310,47235,2369140,2593126,307911260,328911155,45299313860,47675312870,49906112012,[0,1,\frac{8}{3},{\frac{29}{6}},{\frac{37}{5}},{\frac{103}{10}},{\frac{472}{35}},{\frac{2369}{140}},{\frac{2593}{126}},{\frac{30791}{1260}},{\frac{32891}{1155}},{\frac{452993}{13860}},{\frac{476753}{12870}},{\frac{499061}{12012}},
208032845045,18358463360360,18999103340340,1241848392042040,1278605111939938,26274175369512],{\frac{2080328}{45045}},{\frac{18358463}{360360}},{\frac{18999103}{340340}},{\frac{124184839}{2042040}},{\frac{127860511}{1939938}},{\frac{26274175}{369512}}],

and MomFn(PQd3, t, 1, 20) returns

[0,1,83,143,10615,495,645,56135,122663,5192225,46531617325,53350917325,71400820475,616157681576575,[0,1,\frac{8}{3},\frac{14}{3},{\frac{106}{15}},{\frac{49}{5}},{\frac{64}{5}},{\frac{561}{35}},{\frac{1226}{63}},{\frac{5192}{225}},{\frac{465316}{17325}},{\frac{533509}{17325}},{\frac{714008}{20475}},{\frac{61615768}{1576575}},
3422348247882875,75460098115765750,140495602726801775,15298397599268017750,31489234438509233725,169792631003925461686250].{\frac{342234824}{7882875}},{\frac{754600981}{15765750}},{\frac{1404956027}{26801775}},{\frac{15298397599}{268017750}},{\frac{31489234438}{509233725}},{\frac{1697926310039}{25461686250}}].

We notice that for each fixed n>3n>3, 3-pivot Quicksort’s average performance is better than 2-pivot and 1-pivot. This numerical test is also possible for all previous Quicksort variants but seems unnecessary when the explicit expressions are easily accessible.

As in Section 4.3.5, a recurrence relation of the expected number of comparisons can be obtained. Findrec(MomFn(PQd3, t, 1, 40),n,N,8) returns

(3n+4)(n25n+12)(n+4)(n+3)(3n+1)(12n4+13n312n2+59n+24)N(3n+1)(n+4)(n+3)(n+2){\frac{\left(3\,n+4\right)\left({n}^{2}-5\,n+12\right)}{\left(n+4\right)\left(n+3\right)\left(3\,n+1\right)}}-{\frac{\left(12\,{n}^{4}+13\,{n}^{3}-12\,{n}^{2}+59\,n+24\right)N}{\left(3\,n+1\right)\left(n+4\right)\left(n+3\right)\left(n+2\right)}}
+3(n+1)(6n+5)nN2(n+4)(n+3)(3n+1)(n+1)(12n+7)N3(n+4)(3n+1)+N4,+3\,{\frac{\left(n+1\right)\left(6\,n+5\right)n{N}^{2}}{\left(n+4\right)\left(n+3\right)\left(3\,n+1\right)}}-{\frac{\left(n+1\right)\left(12\,n+7\right){N}^{3}}{\left(n+4\right)\left(3\,n+1\right)}}+{N}^{4},

which leads to the following theorem.

Theorem 4.23.

The expected number of comparisons CnC_{n} of 3-pivot Quicksort for a list of length nn satisfies the following recurrence relation:

Cn+4=(n+1)(12n+7)(n+4)(3n+1)Cn+33(n+1)(6n+5)n(n+4)(n+3)(3n+1)Cn+2C_{n+4}={\frac{\left(n+1\right)\left(12\,n+7\right)}{\left(n+4\right)\left(3\,n+1\right)}}C_{n+3}-3\,{\frac{\left(n+1\right)\left(6\,n+5\right)n}{\left(n+4\right)\left(n+3\right)\left(3\,n+1\right)}}C_{n+2}
+(12n4+13n312n2+59n+24)(3n+1)(n+4)(n+3)(n+2)Cn+1(3n+4)(n25n+12)(n+4)(n+3)(3n+1)Cn.+{\frac{\left(12\,{n}^{4}+13\,{n}^{3}-12\,{n}^{2}+59\,n+24\right)}{\left(3\,n+1\right)\left(n+4\right)\left(n+3\right)\left(n+2\right)}}C_{n+1}-{\frac{\left(3\,n+4\right)\left({n}^{2}-5\,n+12\right)}{\left(n+4\right)\left(n+3\right)\left(3\,n+1\right)}}C_{n}.

The recurrence relations for higher moments are also obtainable, but a long enough list of data is needed.

4.4.3 kk-Pivot Quicksort

More generally, kk-pivot Quicksort can be considered with the convention that 1) the kk pivots are sorted with 1-pivot Quicksort, 2) binary search is used to partition the list into k+1k+1 sublists, 3) we use 1-pivot Quicksort to sort lists with length less than kk.

In the package QuickSort.txt the procedure PQck(n, t, k) outputs the probability generating function for the number of comparisons of kk-pivot Quicksort where each element is compared with pivots in a linearly increasing order. Obviously this is not efficient when kk is large. However, the problem for binary search is that when k2i1k\neq 2^{i}-1 for some ii, it is hard to get an expression for the number of comparisons in the binary search step since the number highly depends on the specific implementation where some boundary cases may vary and the floor and ceiling functions will be involved, which leads to an increasing difficulty to find the explicit expressions for moments.

There is a procedure QsBC(L, k) which inputs a list of distinct numbers LL and an integer kk representing the number of pivots and outputs the sorted list and the total number of comparisons. For convenience of Monte Carlo experiments, we use MCQsBC(n, k, T) where nn is the length of the list, kk is the number of pivots and TT is the number of times we repeat the experiments. Because of the limit of computing resources, we only test for k=3,4,5,6k=3,4,5,6 and n=10,20,30,40,50.n=10,20,30,40,50.

[𝚜𝚎𝚚(𝙼𝙲𝚀𝚜𝙱𝙲(𝟷𝟶𝚒,𝟹,𝟷𝟶𝟶),𝚒=1..5)]=[22.95,65.75,118.71,178.28,239.45],{\tt[seq(MCQsBC(10*i,3,100),i=1..5)]}=[22.95,65.75,118.71,178.28,239.45],
[𝚜𝚎𝚚(𝙼𝙲𝚀𝚜𝙱𝙲(𝟷𝟶𝚒,𝟺,𝟷𝟶𝟶),𝚒=1..5)]=[23.78,67.77,120.91,180.35,251.19],{\tt[seq(MCQsBC(10*i,4,100),i=1..5)]}=[23.78,67.77,120.91,180.35,251.19],
[𝚜𝚎𝚚(𝙼𝙲𝚀𝚜𝙱𝙲(𝟷𝟶𝚒,𝟻,𝟷𝟶𝟶),𝚒=1..5)]=[23.54,65.74,119.59,178.36,241.03],{\tt[seq(MCQsBC(10*i,5,100),i=1..5)]}=[23.54,65.74,119.59,178.36,241.03],
[𝚜𝚎𝚚(𝙼𝙲𝚀𝚜𝙱𝙲(𝟷𝟶𝚒,𝟼,𝟷𝟶𝟶),𝚒=1..5)]=[23.14,66.22,120.07,176.43,236.46],{\tt[seq(MCQsBC(10*i,6,100),i=1..5)]}=[23.14,66.22,120.07,176.43,236.46],

Our observation is that for large enough nn, the more pivots we use, the less comparisons are needed. However, when kk is too close to nn, the increase of pivots may lead to inefficiency.

4.5 Limiting Distribution

The main purpose of this chapter is to find explicit expressions for the moments of the number of swaps or comparisons of some variants of Quicksort, to compare their performances and to explore more efficient Quicksort algorithms. However, it is also of interest to find more moments for large nn and calculate their floating number approximation of the scaled limiting distribution.

As mentioned in [13], if we are only interested in the first few moments, then it is wasteful to compute the full probability generating function Pn(t).P_{n}(t). Let t=1+wt=1+w and use the fact that

Pn(1+w)=r=0fr(n)r!wrP_{n}(1+w)=\sum_{r=0}^{\infty}\frac{f_{r}(n)}{r!}w^{r}

where fr(n)f_{r}(n) are the factorial moments. The straight moments E[Xnr]E[X_{n}^{r}], and the moments-about-the-mean, mr(n)m_{r}(n) follow immediately.

As a case study, let’s use Variant IV from Section 4.3.4 as an example. The recurrence relation is

Pn(1+w)=1nk=1nPk1(1+w)Pnk(1+w)Q(n,k,1+w),P_{n}(1+w)=\frac{1}{n}\sum_{k=1}^{n}P_{k-1}(1+w)P_{n-k}(1+w)Q(n,k,1+w),

where QQ is as defined in Section 4.3.4.

Since only the first several factorial moments are considered, in each step truncation is performed and only the first several coefficients in ww is kept. With this method we can get more moments in a fixed time. The procedure TrunIP implements the truncated factorial generating function.

With the closed-form expressions for both the expectation, cnc_{n}, and the variance m2(n):=var(Xn)m_{2}(n):=var(X_{n}), the scaled random variable ZnZ_{n} is defined as follows.

Zn:=Xncnm2(n).Z_{n}:=\frac{X_{n}-c_{n}}{\sqrt{m_{2}(n)}}.

We are interested in the floating point approximations of the limiting distribution limnZn\lim_{n\rightarrow\infty}Z_{n}. Of course its expectation is 0 and its variance is 11.

For instance, if we’d like to know the moments up to order 10, TrunIP(100, z, 10) returns

1+761763471283683134464672622416462868654327341323619495089084130905464828354336z1+{\frac{7617634712836831344646726224164628686543}{27341323619495089084130905464828354336}}z
+116914686783624631948031731196044060605778576123443318381348464329517287662514914280390084303910684938635848245569645536000z2+{\frac{1169146867836246319480317311960440606057785761234433183813484643}{29517287662514914280390084303910684938635848245569645536000}}z^{2}
+.+\dots\quad.

For 1r101\leq r\leq 10, the coefficient of zrz^{r} times r!r! is the rr-th factorial moment. By

E[Xr]=j=0r{rj}E[(X)j]E[X^{r}]=\sum_{j=0}^{r}{r\brace j}E[(X)_{j}]

where the curly braces denote Stirling numbers of the second kind, we can get the raw moments. And with the procedure MtoA a sequence of raw moments are transformed to moments about the mean. Divided by m2(n)r2m_{2}(n)^{\frac{r}{2}}, the 3rd through 10th moments in floating point approximations are

[0.7810052982,3.942047050,9.146681877,37.12169647,137.7143092,[0.7810052982,3.942047050,9.146681877,37.12169647,137.7143092,
613.5286860,2872.409923,14709.75560].613.5286860,2872.409923,14709.75560].

The same technique can be applied to other variants of Quicksort in this chapter and we leave this to interested readers.

4.6 Remarks

In such a rich and active research area as Quicksort, there are still several things we could think about to improve the algorithms’ performances. Just to name a few, in 2-pivot Quicksort when we compare non-pivot elements with the pivots to determine which sublist they belong to, if the history is tracked, we might be able to use the history to determine which pivot to compare with first for the next element. The optimal strategy would vary with the additional information about the range of the numbers or the relative ranking of the two pivots among all the elements.

As for kk-pivot Quicksort, our naive approach only distinguishes two cases: whether the currently to-be-sorted list has length less than kk or not. If the length is less than kk, we use 1-pivot Quicksort; otherwise we still choose kk pivots. However, we might be able to improve the performance if the number of pivots varies according to the length of the to-be-sorted list or sublist. Let’s say there is a function g(n)g(n), where nn is the length of the list. So we pick g(n)g(n) pivots at the beginning. After we obtain the g(n)+1g(n)+1 sublists with length ni,1ig(n)+1n_{i},1\leq i\leq g(n)+1, for each one of them, we choose g(ni)g(n_{i}) pivots. It would be interesting whether we can find an optimal gg in terms of its average performance. Additionally, when kk is large, it might make sense to use multi-pivot Quicksort to sort the kk pivots as well.

Of course, it is also interesting to study the explicit expressions of the numbers of swaps in multi-pivot Quicksort. But it appears to be dependent on the specific implementation of the algorithm so it is of significance to look for variants which save time and space complexity.

The main results of this chapter are those explicit expressions of moments and recurrence relations for either the number of comparisons or the number of swaps of various Quicksort variants. Though all of their asymptotics are O(nlogn)O(n\log n), the constant before this term varies a lot and some comparisons of these variants are also discussed. When there is difficulty getting the explicit expressions, numerical tests and Monte Carlo experiments are performed. We also have a demonstration on how to get more moments and find the numeric approximation of the scaled limiting distribution.

Nevertheless, more important than those results is the illustration of a methodology of experimental mathematics. From ansatzes and sufficient data we have an alternative way to obtain results that otherwise might be extremely difficult or even impossible to get via traditional human approaches to algorithm analysis.

Chapter 5 Peaceable Queens Problem

This chapter is adapted from [51], which has been published on Experimental Mathematics. It is also available on arXiv.org, number 1902.05886.

5.1 Introduction

One of the fascinating problems described in the recent article [40], about the great On-Line Encyclopedia of Integer Sequences, and in the beautiful and insightful video [41] is the peaceable queens problem. It was chosen, by popular vote, to be assigned the milestone ‘quarter-million’ A-number, A250000.

The question is the following:

What is the maximal number, mm, such that it is possible to place mm white queens and mm black queens on an n×nn\times n chess board, so that no queen attacks a queen of the opposite color.

Currently only fifteen terms are known:

n:\displaystyle n: 1\displaystyle 1 2\displaystyle 2 3\displaystyle 3 4\displaystyle 4 5\displaystyle 5 6\displaystyle 6 7\displaystyle 7 8\displaystyle 8 9\displaystyle 9 10\displaystyle 10 11\displaystyle 11 12\displaystyle 12 13\displaystyle 13 14\displaystyle 14 15\displaystyle 15
a(n):\displaystyle a(n): 0\displaystyle 0 0\displaystyle 0 1\displaystyle 1 2\displaystyle 2 4\displaystyle 4 5\displaystyle 5 7\displaystyle 7 9\displaystyle 9 12\displaystyle 12 14\displaystyle 14 17\displaystyle 17 21\displaystyle 21 24\displaystyle 24 28\displaystyle 28 32\displaystyle 32

In this chapter, we’d like to consider this peaceable queens problem as a continuous problem by normalizing the chess board to be the unit square U:=[0,1]2={(x,y)| 0x,y1}U:=[0,1]^{2}=\{(x,y)\,|\,0\leq x,y\leq 1\}. Let WUW\subseteq U be the region where white queens are located. Then the non-attacking region BB of WW can be defined as

B={(x,y)U|(u,v)W,xu,yv,x+yu+v,yxvu}.B=\{(x,y)\in U\,|\,\forall(u,v)\in W,x\neq u,y\neq v,x+y\neq u+v,y-x\neq v-u\}.

So the continuous version of the peaceable queens problem is to find

maxW2U(min(Area(W),Area(B))).\max_{W\in 2^{U}}(\min(\textrm{Area}(W),\textrm{Area}(B))).

Considering that the queen is able to move any number of squares vertically, horizontally and diagonally, it is reasonable to let WW be a convex polygon or a disjoint union of convex polygons whose boundary consists of vertical, horizontal and slope ±1\pm 1 line segments, otherwise we can increase the area of white queens without decreasing the area of black queens.

In this chapter, we use a list LL of lists [[a1,b1],[a2,b2],,[an,bn]][\,[a_{1},b_{1}]\,,\,[a_{2},b_{2}]\,,\,\dots\,,\,[a_{n},b_{n}]\,] to denote the nn-gon whose vertices are the nn pairs in the list LL and whose sides are the straight line segments connecting [ai,bi][a_{i},b_{i}] and [ai+1,bi+1],(1in1)[a_{i+1},b_{i+1}],(1\leq i\leq n-1), and [an,bn][a_{n},b_{n}] and [a1,b1][a_{1},b_{1}].

This chapter is organized as follows. At first we look at Jubin’s construction and prove that it is a local optimum. Though there is no rigorous proof, we conjecture and reasonably believe that it is indeed a global optimum at least for “the continuous chess board”, after numerous experiments with one, two and more components. Then we consider the optimal case under more restrictions, or under certain configurations, e.g., only one component or two identical squares or two identical triangles, etc. In some cases, the exact optimal parameters and areas can be obtained. Note that in this chapter’s figures, for convenience of demonstration, the color red is used to represent white queens and blue is for black queens.

5.2 Jubin’s Construction

As mentioned in [40] (and [39], sequence A250000), it is conjectured that Benoit Jubin’s construction given in Fig. 5 of [40], see also here:

http://sites.math.rutgers.edu/~zeilberg/tokhniot/peaceable/P1.html

or Figure 5.1, is optimal for n10n\geq 10. Its value is 7n248\lfloor\frac{7n^{2}}{48}\rfloor.

Refer to caption
Figure 5.1: Benoit Jubin’s Construction for a Unit Square

While we are, at present, unable to prove this, we did manage to prove that when one generalizes Jubin’s construction and replaces the sides of the two pentagons with arbitrary parameters (of course subject to the obvious constraints so that both white and black queens reside in two pentagons), then Jubin’s construction is indeed (asymptotically) optimal, i.e. in the limit as nn goes to infinity.

Lemma 5.1.

Normalizing the chess board to be the unit square {(x,y)| 0x,y1}\{(x,y)\,|\,0\leq x,y\leq 1\}, if the white queens are placed in the union of the interiors of the following two pentagons

[[0,0],[a,a],[a,a+be],[ae,a+be],[0,b]],[\,[0,0]\,,\,[a,a]\,,\,[a,a+b-e]\,,\,[a-e,a+b-e]\,,\,[0,b]\,],

and

[[g,0],[g+c,0],[g+c,c2f+d],[g+cf,cf+d],[g,d]],[\,[g,0]\,,\,[g+c,0]\,,\,[g+c,c-2\,f+d]\,,\,[g+c-f,c-f+d]\,,\,[g,d]\,],

where a,b,c,d,e,f,ga,b,c,d,e,f,g are between 0 and 11 and all coordinates and side lengths in Fig. 5.1 are non-negative and appropriate so that black queens also reside in two pentagons, then the black queens are located in the interiors of the pentagons

[[g,1],[a,1],[a,g+2c2f+da],[12g+12d12b+cf,12g+12d+12b+cf],[\,[g,1]\,,\,[a,1]\,,\,[a,g+2\,c-2\,f+d-a]\,,\,[\frac{1}{2}\,g+\frac{1}{2}\,d-\frac{1}{2}\,b+c-f,\frac{1}{2}\,g+\frac{1}{2}\,d+\frac{1}{2}\,b+c-f]\,,\,
[g,g+b]],[g,g+b]\,],

and

[[1,1],[g+c,g+c],[g+c,a+be],[a+be+gd,a+be],[1,1+dg]].[\,[1,1]\,,\,[g+c,g+c]\,,\,[g+c,a+b-e]\,,\,[a+b-e+g-d,a+b-e]\,,\,[1,1+d-g]\,].
Proof.

Since we only consider cases when the black queens also reside in two pentagons, this requirement provides natural constraints for these parameters a,b,c,d,e,f,ga,b,c,d,e,f,g. Just to name a few, aga\leq g because the two pentagons are not overlapped, g+c1g+c\leq 1 because the right pentagon of white queens should entirely reside in the unit square, dgd\leq g because otherwise the right pentagon of black queens will not exist and cf+dc-f+d, which is the yy-coordinate of the highest point in the right pentagon of white queens, cannot be too large to ensure the right pentagon of black queens does not degenerate to a parallelogram. In these constraints we always use “\leq” instead of “<<” so that the Lagrange multipliers will be able to work in a closed domain.

With school geometry, it is obvious that black queens cannot reside in 0x<a0\leq x<a since it is attacked by the left pentagon of white queens. Similar arguments work for the area 0<ya+b0<y\leq a+b and g<x<g+cg<x<g+c. Now the leftover on the unit square is a union of two rectangles. By excluding x+y<g+2c2f+d,0<yx<bx+y<g+2c-2f+d,0<y-x<b and yx<dgy-x<d-g, these two rectangles are shaped into two pentagons and the coordinates of their vertices follow immediately. ∎

Lemma 5.2.

The area of the white queens is

ab12e2+cd+12c2f2,ab\,-\,\frac{1}{2}\,{e}^{2}+cd\,+\,\frac{1}{2}\,{c}^{2}-{f}^{2},

while the area of the black queens is

a34d2+2gdcdabf212e232c2+2bc2af+3ac+2ad+2cfeced+be-a-\frac{3}{4}\,{d}^{2}+2\,g-d-cd-ab-{f}^{2}-\frac{1}{2}\,{e}^{2}-\frac{3}{2}\,{c}^{2}+2\,bc-2\,af+3\,ac+2\,ad+2\,cf-ec-ed+be
+aebf+fd+32bda234b22gc+12gd12gb+ag+gf74g2.+ae-bf+fd+\frac{3}{2}\,bd-{a}^{2}-\frac{3}{4}\,{b}^{2}-2\,gc+\frac{1}{2}\,gd-\frac{1}{2}\,gb+ag+gf-\frac{7}{4}\,{g}^{2}.
Proof.

For white queens, the left pentagon is a rectangle minus two triangles. Hence the area is

a(a+be)12a212(ae)2=ab12e2.a(a+b-e)-\frac{1}{2}a^{2}-\frac{1}{2}(a-e)^{2}=ab-\frac{1}{2}e^{2}.

The area of the right pentagon is

c(d+cf)12f212(cf)2=12c2+cdf2.c(d+c-f)-\frac{1}{2}f^{2}-\frac{1}{2}(c-f)^{2}=\frac{1}{2}c^{2}+cd-f^{2}.

So the area of the white queens follows. For black queens, similarly, with the coordinates of the vertices in Lemma 5.1, simple calculation leads to the formula of its area. ∎

Theorem 5.3.

The optimal case of the two-pentagon configuration is Jubin’s construction.

Proof.

The procedure MaxC(L,v) in the Maple package PeaceableQueens.txt takes a list of length 2, LL, consisting of polynomials in the list of variables vv, and vv as inputs and outputs all the extreme points of L[1]L[1], subject to the constraint L[1]=L[2]L[1]=L[2], using Lagrange multipliers.

Optimally, the areas of the white queens and black queens should be the same. Maximizing this quantity with the procedure MaxC under this constraint shows that the maximum value is

748,\frac{7}{48},

and this is indeed achieved by Jubin’s construction, in which the white queens are located inside the pentagons

[[0,0],[14,14],[14,12],[16,12],[0,13]],[\,[0,0]\,,\,[\frac{1}{4},\frac{1}{4}]\,,\,[\frac{1}{4},\frac{1}{2}]\,,\,[\frac{1}{6},\frac{1}{2}]\,,\,[0,\frac{1}{3}]\,],

and

[[12,0],[34,0],[34,14],[23,13],[12,16]],[\,[\frac{1}{2},0]\,,\,[\frac{3}{4},0]\,,\,[\frac{3}{4},\frac{1}{4}]\,,\,[\frac{2}{3},\frac{1}{3}]\,,\,[\frac{1}{2},\frac{1}{6}]\,],

and the black queens reside inside the pentagons

[[12,1],[14,1],[14,34],[13,23],[12,56]],[\,[\frac{1}{2},1]\,,\,[\frac{1}{4},1]\,,\,[\frac{1}{4},\frac{3}{4}]\,,\,[\frac{1}{3},\frac{2}{3}]\,,\,[\frac{1}{2},\frac{5}{6}]\,],

and

[[1,1],[34,34],[34,12],[56,12],[1,23]].[\,[1,1]\,,\ \,[\frac{3}{4},\frac{3}{4}]\,,\,[\frac{3}{4},\frac{1}{2}]\,,\,[\frac{5}{6},\frac{1}{2}]\,,\,[1,\frac{2}{3}]\,].

It seems natural that two components are optimal because if there is only one connected component for white queens, black queens still have two connected components. From the view of symmetry, it seems good to add the other component for white queens. In the rest of the chapter, it is shown that with only one connected component it is unlikely to surpass the 748\frac{7}{48} result. And by experimenting with three or more connected components for the white queens, it seems that it is not possible to improve on Jubin’s construction, hence we believe that it is indeed optimal (at least asymptotically). By the way, Donald Knuth kindly informed us that what we (and the OEIS) call Jubin’s construction already appears in Stephen Ainley’s delightful book Mathematical Puzzles [1], p. 31, Fig, 28(A) .

5.3 Single Connected Component

In this section, we try to find the optimal case when WW is a single connected component and when the configuration is restricted to rectangles, parallelograms, triangles and finally obtain a lower bound for the optimal case of one connected component.

5.3.1 A Single Rectangle

Let the rectangle for white queens be [[0,0],[a,0],[a,b],[0,b]][\,[0,0]\,,\,[a,0]\,,\,[a,b]\,,\,[0,b]\,], with the obvious fact that for a rectangle with a given size, placing it in the corner will lead to the largest non-attacking area. The area for white queens is abab and the area for black queens is (1ab)2(1-a-b)^{2}. We’d like to find the maximum of abab under the condition

ab=(max(1ab,0))2,0a,b1.ab=(\max(1-a-b,0))^{2},\quad 0\leq a,b\leq 1.
Refer to caption
Figure 5.2: The Optimal Rectangle for a 120 by 120 Chess Board

Since aa and bb are symmetric, the maximum must be on the line a=ba=b. Hence the optimal case is when

a=b=13a=b=\frac{1}{3}

and the largest area for peaceable queens when the configuration for white queens is a rectangle is 19\frac{1}{9}.

5.3.2 A Single Parallelogram

Let the parallelogram for white queens be [[0,0],[a,a],[a,a+b],[0,b]][\,[0,0]\,,\,[a,a]\,,\,[a,a+b]\,,\,[0,b]\,].

Refer to caption
Figure 5.3: The Optimal Parallelogram for a 120 by 120 Chess Board

Note that as mentioned in the beginning of this section, because the line segment must be vertical, horizontal or of slope ±1\pm 1 and the corner is the best place to locate a shape, there are only two kinds of parallelograms, the other one being [[0,0],[b,0],[a+b,a],[a,a]][[0,0],[b,0],[a+b,a],[a,a]]. Obviously they are symmetric with respect to the line y=xy=x, so let’s focus on one of them.

The area for white queens is still abab and the area for black queens is still (max(1ab,0))2(\max(1-a-b,0))^{2}. So similarly with the rectangle case, the maximum area 19\frac{1}{9} is reached when

a=b=13.a=b=\frac{1}{3}.

5.3.3 A Single Triangle

With similar arguments as in the last subsection, the optimal triangle must have the format: [[0,0],[0,a],[a,a]][\,[0,0]\,,\,[0,a]\,,\,[a,a]\,]. The area for white queens is

12a2\frac{1}{2}a^{2}

and the area for black queens is

12(1a)2.\frac{1}{2}(1-a)^{2}.

with the condition 0a10\leq a\leq 1.

Hence, when a=12a=\frac{1}{2} the area reaches its maximum 18\frac{1}{8}, which is better than the rectangle or parallelogram configuration.

Refer to caption
Figure 5.4: The Optimal Triangle for a 120 by 120 Chess Board

By the way, [[0,0],[0,a],[a,0]][\,[0,0]\,,\,[0,a]\,,\,[a,0]\,] won’t be a good candidate for optimal triangles because we can always extend it to a square [[0,0],[0,a],[a,a],[a,0]][\,[0,0]\,,\,[0,a]\,,\,[a,a]\,,\,[a,0]\,] without decreasing the area of black queens. Then its maximum cannot exceed the maximum of rectangles, which is 19\frac{1}{9}.

5.3.4 A Single Hexagon

After looking at specific configurations in the above three subsections, we’d like to find some numerical lower bounds for the single connected component configuration. It is interesting to find out or at least get a numerical estimation how large the area of white or black queens can be if the white queens are in a single connected component. Note that from rectangles and parallelogram we get a lower bound 190.1111\frac{1}{9}\approx 0.1111 and from triangles we get a better lower bound 18=0.125\frac{1}{8}=0.125.

The natural thing is that we want to place the polygon in a corner. Because of the restriction of the orientations of its sides, at most it can be an octagon. Let’s place the polygon in the lower left corner. Then we immediately realize that it is a waste if the polygon doesn’t fill the lower left corner of the unit square. It is the same for the upper right side of the polygon. If part of its vertices are [[a,b],[a,b+c],[ad,b+c+d],[adf,b+c+d]][\,[a,b]\,,\,[a,b+c]\,,\,[a-d,b+c+d]\,,\,[a-d-f,b+c+d]\,], then we can always extend the polygon to [,[a,b],[a,b+c+d],[adf,b+c+d],][\,\dots\,,\,[a,b]\,,\,[a,b+c+d]\,,\,[a-d-f,b+c+d]\,,\,\dots\,] without decreasing the area of black queens.

Hence the general shape is a hexagon

[[0,0],[a,0],[a+b,b],[a+b,b+c],[d,b+c],[0,b+cd]][\,[0,0]\,,\,[a,0]\,,\,[a+b,b]\,,\,[a+b,b+c]\,,\,[d,b+c]\,,\,[0,b+c-d]\,]

with four parameters. Then the area for white queens is

(a+b)(b+c)12(b2+d2),(a+b)(b+c)-\frac{1}{2}(b^{2}+d^{2}),

and the area for black queens is

12(1abc)2+12(1a2bc+d)2.\frac{1}{2}(1-a-b-c)^{2}+\frac{1}{2}(1-a-2b-c+d)^{2}.

With the procedure MaxC, one of the local maximums found using Lagrange multipliers is when

a=c=d=12,b=0.a=c=d=\frac{1}{2},\,b=0.

However, actually this is the optimal triangle with an area of 18\frac{1}{8}.

Refer to caption
Figure 5.5: The Nearly Best Lower Bound Configuration for a 100 by 100 Chess Board

Another local maximum is when a=b=c=da=b=c=d. In that case, we have

3a2=(13a)2.3a^{2}=(1-3a)^{2}.

Hence when

a=3360.2113248654,a=\frac{3-\sqrt{3}}{6}\approx 0.2113248654,

the area of white queens is maximized at

3a2=2320.1339745962.3a^{2}=\frac{2-\sqrt{3}}{2}\approx 0.1339745962.

The best configuration of hexagons is found and at least we have a numerical lower bound 0.1339745962 for the best single component configuration.

5.4 Two Components

Since in Jubin’s construction, there are two pentagons, it is natural to think of the optimum of certain two-component configurations. The difficulty for analyzing the two-component is that more parameters are introduced and the area formula for black queens becomes a much more complicated piece-wise function.

In this section, the cylindrical algebraic decomposition algorithm in quantifier elimination is applied to find out the exact optimal parameters and the maximum areas. Given a set SS of polynomials in n\mathbb{R}^{n}, a cylindrical algebraic decomposition is a decomposition of n\mathbb{R}^{n} into semi-algebraic connected sets called cells, on which each polynomial has constant sign, either +, - or 0. With such a decomposition it is easy to give a solution of a system of inequalities and equations defined by the polynomials, i.e. a real polynomial system.

5.4.1 Two Identical Squares

To keep the number of parameters as few as possible, the configuration of two identical squares is the first we’d like to study. There are two parameters, the side length aa and the xx-coordinate ss of the lower left vertex of the right square, the left square’s lower left vertex being the origin.

The two squares are

[[0,0],[a,0],[a,a],[0,a]][\,[0,0]\,,\,[a,0]\,,\,[a,a]\,,\,[0,a]\,]

and

[[s,0],[s+a,0],[s+a,a],[s,a]].[\,[s,0]\,,\,[s+a,0]\,,\,[s+a,a]\,,\,[s,a]\,].
Refer to caption
Figure 5.6: The Nearly Optimal Two Identical Squares Configuration for a 200 by 200 Chess Board

Based on this configuration, the domain is

0a12,as1a.0\leq a\leq\frac{1}{2},\quad a\leq s\leq 1-a.

The area of white queens is

2a2.2a^{2}.

Actually the formula for black queens is very complicated, especially when aa is small there may be a lot of components for BB. However, by experimentation (procedure FindM2Square), we found that for all mid-range s[0.24,0.76]s\in[0.24,0.76], aa around 0.23 will always maximize the area. Then we just need to focus on the shape of BB when aa is not far from its optimum.

The area of black queens is

(sa)(1sa)+14(sa)2+(max(1s2a,0))2+max(s2a,0)(1sa).(s-a)(1-s-a)+\frac{1}{4}(s-a)^{2}+(\max(1-s-2a,0))^{2}+\max(s-2a,0)(1-s-a).

The domain for aa and ss is a triangle. The area formula for black queens shows that the two lines s=2as=2a and s=12as=1-2a separate the domain into 4 regions. In each region, we have a polynomial formula for the area of black queens. Since the area of white queens WW is just a simple formula of aa, we need to maximize aa with the condition W=BW=B.

When s2as\geq 2a and s12as\geq 1-2a, by cylindrical algebraic decomposition we obtained

{12(1+2)a<127(1+27)s=4+a7+27419a+9a2127(1+27)a<118(19217)s=4+a7±27419a+9a2a=118(19217)s=4+a727419a+9a2.\begin{cases}\frac{1}{2}(-1+\sqrt{2})\leq a<\frac{1}{27}(1+2\sqrt{7})&s=\frac{4+a}{7}+\frac{2}{7}\sqrt{4-19a+9a^{2}}\\ \frac{1}{27}(1+2\sqrt{7})\leq a<\frac{1}{18}(19-\sqrt{217})&s=\frac{4+a}{7}\pm\frac{2}{7}\sqrt{4-19a+9a^{2}}\\ a=\frac{1}{18}(19-\sqrt{217})&s=\frac{4+a}{7}-\frac{2}{7}\sqrt{4-19a+9a^{2}}\end{cases}.

When s2as\leq 2a and s12as\geq 1-2a, the result is an empty set.

When s2as\leq 2a and s12as\leq 1-2a, we obtained

29a17(32),s=27a22a+9a2.\frac{2}{9}\leq a\leq\frac{1}{7}(3-\sqrt{2}),\quad s=2-7a-2\sqrt{-2a+9a^{2}}.

When s2as\geq 2a and s12as\leq 1-2a, we obtained

29a127(1+27),s=3a2317a+12a2.\frac{2}{9}\leq a\leq\frac{1}{27}(1+2\sqrt{7}),\quad s=3a-\frac{2}{\sqrt{3}}\sqrt{1-7a+12a^{2}}.

Comparing the four cases, we found that the largest area occurred in case 1, when

a=118(19217)0.2371711193,a=\frac{1}{18}(19-\sqrt{217})\approx 0.2371711193,
s=131811262170.6053101598.s=\frac{13}{18}-\frac{1}{126}\sqrt{217}\approx 0.6053101598.

The largest area is 2898119217810.112500281.{\frac{289}{81}}-{\frac{19\,\sqrt{217}}{81}}\approx 0.112500281.

5.4.2 Two Identical Triangles

The configuration of two identical isosceles right triangles with the same orientation is the next to be considered. There are also two parameters, the leg length aa and the xx-coordinate ss of the lower left vertex of the triangle on the right. Note that the slopes of both triangles’ hypotenuses are +1+1.

The two isosceles right triangles are

[[0,0],[a,0],[a,a]][\,[0,0]\,,\,[a,0]\,,\,[a,a]\,]

and

[[s,0],[a+s,0],[a+s,a]].[\,[s,0]\,,\,[a+s,0]\,,\,[a+s,a]\,].

The domain for the two parameters aa and ss is also

0a12,as1a.0\leq a\leq\frac{1}{2},\quad a\leq s\leq 1-a.

The area of white queens is a2a^{2} and for the area of black queens, by numerical experimentation, we found that for all mid-range s[0.32,0.68]s\in[0.32,0.68], the area is maximized when aa is around 0.31. Hence for aa around 0.31, we have that the area of black queens is

2(sa)(1sa)+12(sa)2+12(1sa)2+12(max(1s2a,0))2.2(s-a)(1-s-a)+\frac{1}{2}(s-a)^{2}+\frac{1}{2}(1-s-a)^{2}+\frac{1}{2}(\max(1-s-2a,0))^{2}.

When s12as\geq 1-2a, by cylindrical algebraic decomposition we obtained

{12(22)a<14(1+5)s=12+12312a+8a214(1+5)a<14(33)s=12±12312a+8a2a=14(33)s=1212312a+8a2.\begin{cases}\frac{1}{2}(2-\sqrt{2})\leq a<\frac{1}{4}(-1+\sqrt{5})&s=\frac{1}{2}+\frac{1}{2}\sqrt{3-12a+8a^{2}}\\ \frac{1}{4}(-1+\sqrt{5})\leq a<\frac{1}{4}(3-\sqrt{3})&s=\frac{1}{2}\pm\frac{1}{2}\sqrt{3-12a+8a^{2}}\\ a=\frac{1}{4}(3-\sqrt{3})&s=\frac{1}{2}-\frac{1}{2}\sqrt{3-12a+8a^{2}}\end{cases}.

When s12as\leq 1-2a, we obtained

111(53)a14(1+5),s=2a210a+12a2.\frac{1}{11}(5-\sqrt{3})\leq a\leq\frac{1}{4}(-1+\sqrt{5}),s=2a-\sqrt{2-10a+12a^{2}}.

Hence the area is maximized when

a=14(33)0.316987298,a=\frac{1}{4}(3-\sqrt{3})\approx 0.316987298,
s=12.s=\frac{1}{2}.

The largest area is 343830.1004809470\frac{3}{4}-\frac{3}{8}\sqrt{3}\approx 0.1004809470.

Refer to caption
Figure 5.7: The Nearly Optimal Two Identical Isosceles Right Triangles with the Same Orientation Configuration for a 200 by 200 Chess Board

Actually, a larger area can be obtained if two identical isosceles right triangles with different orientations are considered. For example, if we take the two triangles to be

[[0,0],[a,0],[a,a]][\,[0,0]\,,\,[a,0]\,,\,[a,a]\,]

and

[[1a,0],[1,0],[1a,a]],[\,[1-a,0]\,,\,[1,0]\,,\,[1-a,a]\,],

then the area of black queens is

a(12a)+(12a)2=a2+14.a(1-2a)+(\frac{1}{2}-a)^{2}=-a^{2}+\frac{1}{4}.

Equalizing the areas of white queens and black queens, we get

Area(W)=a2=18,\textrm{Area}(W)=a^{2}=\frac{1}{8},

which is greater than the optimal case of two identical isosceles right triangles with the same orientation.

Refer to caption
Figure 5.8: An Example of Two Identical Isosceles Right Triangles with Different Orientations Configuration for a 200 by 200 Chess Board

5.4.3 One Square and One Triangle with the Same Side Length

Refer to caption
Figure 5.9: The Nearly Optimal One Square and One Triangle (with the same side length) Configuration for a 200 by 200 Chess Board

With the same notations as the above two subsections, let WW be the union of the square

[[0,0],[a,0],[a,a],[0,a]][\,[0,0]\,,\,[a,0]\,,\,[a,a]\,,\,[0,a]\,]

and the triangle

[[s,0],[a+s,0],[a+s,a]].[\,[s,0]\,,\,[a+s,0]\,,\,[a+s,a]\,].

Then the area of white queens is 32a2\frac{3}{2}a^{2} and the area of black queens is

a(sa)(1sa)+14(sa)2+(max(1s2a,0))2a(s-a)(1-s-a)+\frac{1}{4}(s-a)^{2}+(\max(1-s-2a,0))^{2}

when aa is around its optimum 0.27 and s[0.28,0.72]s\in[0.28,0.72]. It is obtained that when s12as\geq 1-2a

{12(2+6)a<121(1+22)s=4a7+171664a+22a2121(1+22)a<211(842)s=4a7±171664a+22a2a=211(842)s=4a7171664a+22a2,\begin{cases}\frac{1}{2}(-2+\sqrt{6})\leq a<\frac{1}{21}(1+\sqrt{22})&s=\frac{4-a}{7}+\frac{1}{7}\sqrt{16-64a+22a^{2}}\\ \frac{1}{21}(1+\sqrt{22})\leq a<\frac{2}{11}(8-\sqrt{42})&s=\frac{4-a}{7}\pm\frac{1}{7}\sqrt{16-64a+22a^{2}}\\ a=\frac{2}{11}(8-\sqrt{42})&s=\frac{4-a}{7}-\frac{1}{7}\sqrt{16-64a+22a^{2}}\end{cases},

and when s12as\leq 1-2a

115(66)a121(1+22),s=7a3131272a+106a2.\frac{1}{15}(6-\sqrt{6})\leq a\leq\frac{1}{21}(1+\sqrt{22}),\quad s=\frac{7a}{3}-\frac{1}{3}\sqrt{12-72a+106a^{2}}.

Consequently, we have the maximized area when

a=211(842)0.276228965,a=\frac{2}{11}(8-\sqrt{42})\approx 0.276228965,
s=1123314334250337+523360.495622162.s=\frac{112}{33}-\frac{14}{33}\sqrt{42}-\frac{50}{33}\sqrt{7}+\frac{52}{33}\sqrt{6}\approx 0.495622162.

The largest area is 63612196121420.1144536616\frac{636}{121}-\frac{96}{121}\sqrt{42}\approx 0.1144536616. Among the three configurations where CAD is applied in this section, we found that this configuration with one square and one triangle has the largest area.

5.5 Remarks

Our method can be easily generalized for configurations with more components and/or more parameters. For instance, let’s consider the configuration of two squares, not necessarily identical. Then there are three parameters, the side length aa of the left square, the side length bb of the right square and the xx-coordinate ss of the right square’s lower left vertex. For fixed bb and ss, we can find the interval of aa in which the optimum is located. Then for each fixed ss, we are able to find the interval of bb such that its corresponding aa will lead to the largest area a2+b2a^{2}+b^{2}. When the estimated optimal parameters are determined, a piece-wise function of the area of black queens follows.

The main difficulty of this peaceable queens problem lies in the number of parameters and the complexity of the area formula of black queens. When there are multiple components, as long as the number of parameters is limited, it should be still doable. For example, the configuration of three identical squares which are placed equidistantly has only one parameter, the side length aa. When the chess board is 240 by 240, the optimal aa is around 40, which means in the unit square the optimal side length is around 16.\frac{1}{6}.

In conclusion, in this chapter we prove that Jubin’s configuration is a local optimum. Optimal cases of some certain configurations are discussed. Future work includes the exact solution of complicated configurations with numerous parameters, whether the white queens have two components under the best configuration, and proof or disproof that Jubin’s configuration is indeed the best.

References

  • [1] Stephen Ainley, Matematical puzzles, Prentice Hall, Upper Saddle River, NJ, 1977.
  • [2] Saugata Basu, Richard Pollack and Marie-Françoise Roy, Algorithms in real algebraic geometry, second edition. Algorithms and Computation in Mathematics 10, Springer-Verlag, Berlin, 2006.
  • [3] Robert Bosch, Peaceably coexisting armies of queens, Optima (Newsletter of the Mathematical Programming Society) 62.6-9: 271, 1999.
  • [4] Christopher W. Brown, Simple CAD Construction and Its Applications, Journal of Symbolic Computation, 31, 521-547, 2001.
  • [5] Bob Caviness and Jeremy Johnson (Eds.), Quantifier Elimination and Cylindrical Algebraic Decomposition, Springer-Verlag, New York, 1998.
  • [6] Michael Cramer, A note concerning the limit distribution of the quicksort algorithm, Informatique Theériques et Applications, 30, 195-207, 1996.
  • [7] Antonio G. De Oliveira and Michel L. Vergnas, Parking Functions and Labelled Trees, Seminaire Lotharingien de Combinatoire 65 (2011), Article B65e.
  • [8] Persi Diaconis and Angela Hicks, Probabilizing Parking Functions, Adv. in Appl. Math. 89 (2017), 125-155.
  • [9] Peter Doyle and Laurie Snell, “Random Walks and Electrical Networks”, Carus Mathematical Monographs (# 22), Math. Assn. of America, 1984.
  • [10] Marianne Durand, Asymptotic analysis of an optimized Quicksort algorithm, Inform. Proc. Lett., 85 (2): 73-77, 2003.
  • [11] Shalosh B. Ekhad, N. J. A. Sloane and Doron Zeilberger, Automated Proof (or Disproof) of Linear Recurrences Satisfied by Pisot Sequences, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger. Available from math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/pisot.html .
  • [12] Shalosh B. Ekhad and Doron Zeilberger, Going Back to Neil Sloane’s FIRST LOVE (OEIS Sequence A435): On the Total Heights in Rooted Labeled Trees, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, July 19, 2016. math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/a435.html .
  • [13] Shalosh B. Ekhad and Doron Zeilberger, A Detailed Analysis of Quicksort Running Time, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, math.rutgers.edu/ zeilberg/pj.html . Also in: arxiv.org/abs/1903.03708 .
  • [14] Shalosh B. Ekhad and Doron Zeilberger, Explicit Expressions for the Variance and Higher Moments of the Size of a Simultaneous Core Partition and its Limiting Distribution, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, math.rutgers.edu/ zeilberg/pj.html. Also in: arxiv.org/abs/1508.07637 .
  • [15] F. J. Faase, On the number of specific spanning subgraphs of the graphs g×png\times p_{n}, Ars Combinatoria, 49 (1998) 129-154.
  • [16] Dominique Foata and John Riordan, Mapping of acyclic and parking functions, Aequationes Mathematicae 10 (1974), 490-515.
  • [17] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, “Concrete Mathematics”, Addison-Wesley, 1989.
  • [18] P. Hennequin, Combinatorial analysis of quicksort algorithm, RAIRO Theor. Inform. Appl., 23 (3): 317-333, 1989.
  • [19] C.A.R. Hoare, Quicksort, The Computer Journal, 5:1, 10-15, 1962.
  • [20] Vasileios Iliopoulos, The Quicksort algorithm and related topics, arxiv.org/abs/1503.02504.
  • [21] Vasileios Iliopoulos, A note on multipivot Quicksort, Journal of Information and Optimization Sciences, 39:5, 1139-1147, 2018.
  • [22] Vasileios Iliopoulos and David B. Penman, Dual pivot Quicksort, Discrete Mathematics, Algorithms and Applications, Vol. 04, No. 03, 1250041, 2012.
  • [23] Svante Janson, Brownian excursion area, Wright’s constants in graph enumeration, and other Brownian areas, Probab. Surveys, 4(2007), 80-145. projecteuclid.org/euclid.ps/1178804352 .
  • [24] Benoit Jubin, Improved lower bound for A250000, oeis.org/A250000/a250000_1.txt .
  • [25] Manuel Kauers and Peter Paule, “The Concrete Tetrahedron”, Springer, 2011.
  • [26] P. Kirschenhofer, H. Prodinger, C. Martinez, Analysis of Hoare’s FIND algorithm with Median-of-three partition, Random Structures and Algorithms, 10:1-2, 143-156, 1997.
  • [27] Donald E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, 1973.
  • [28] Charles Knessl and Wojciech Szpankowski, Quicksort algorithm again revisited, Discrete Mathematics and Theoretical Computer Science, 3, 43-64, 1999.
  • [29] Donald E. Knuth, Satisfiability, Fascicle 6, volume 4 of The Art of Computer Programming, Addison-Wesley, 2015.
  • [30] Alan G. Konheim and Benjamin Weiss, An occupancy discipline and applications, SIAM J. Applied Math. 14 (1966), 1266-1274.
  • [31] J.P. Kung and C. Yan, Expected sums of general parking functions, Annals of Combinatorics 7 (2003), 481-493.
  • [32] J.P. Kung and C. Yan, Exact formulas for the moments of sums of classical parking functions, Advances in Applied Mathematics 31 (2003), 215-241.
  • [33] Alois Panholzer, Analysis of multiple quickselect variants, Theor. Comput. Sci., 302 (1-3): 45-91, 2003.
  • [34] Paul Raff, Spanning Trees in Grid Graph, arxiv.org/abs/0809.2551.
  • [35] John Riordan and Neil J. A. Sloane, The enumeration of rooted trees by total height, J. Australian Math. Soc. 10 (1969), 278-282. neilsloane.com/doc/riordan-enum-trees-by-height.pdf .
  • [36] Carsten Schneider, The Summation package Sigma, A Mathematica package available from www3.risc.jku.at/research/combinat/software/Sigma/index.php .
  • [37] Carsten Schneider, Symbolic Summation Assists Combinatorics, Sem.Lothar.Combin. 56 (2007), Article B56b (36 pages). www3.risc.jku.at/research/combinat/software/Sigma/pub/SLC06.pdf .
  • [38] Marcel-Paul Schützenberger, On an enumeration problem, J. Combinatorial Theory 4 (1968), 219-221.
  • [39] Neil J.A. Sloane, The On-Line Encyclopedia of Integer Sequences, oeis.org/ .
  • [40] Neil J.A. Sloane, The On-Line Encyclopedia of Integer Sequences, Notices of the American Mathematical Society 65 #9 , 1062-1074, 2018.
  • [41] Neil J.A. Sloane, Peaceable Queens - Numberphile video, available from www.youtube.com/watch?v=IN1fPtY9jYg.
  • [42] Richard Stanley, Parking functions, www-math.mit.edu/~rstan/transparencies/parking.pdf .
  • [43] Richard Stanley, A survey of parking functions, www-math.mit.edu/~rstan/transparencies/parking3.pdf .
  • [44] Richard Stanley, “Enumerative Combinatorics, Volume 1”. First edition: Wadsworth & Brooks/Cole, 1986. Second edition: Cambridge University Press, 2011.
  • [45] Richard Stanley, “Enumerative Combinatorics, Volume 2”, Cambridge University Press, 1999.
  • [46] Wikipedia, , Quicksort, en.wikipedia.org/wiki/Quicksort .
  • [47] Herbert S. Wilf, What is an Answer?, The American Mathematical Monthly 89 (1982), 289-292.
  • [48] Yukun Yao and Doron Zeilberger, An Experimental Mathematics Approach to the Area Statistic of Parking Functions, Mathematical Intelligencer volume 41, issue 2 (June 2019), 1-8.
  • [49] Yukun Yao and Doron Zeilberger, Untying The Gordian Knot via Experimental Mathematics, to appear in Algorithmic Combinatorics-Enumerative Combinatorics, Special Functions, and Computer Algebra: In honor of Peter Paule’s 60th birthday, Springer, edited by Veronika Pillwein and Carsten Schneider. arxiv.org/abs/1812.07193 .
  • [50] Yukun Yao, Using Non-Linear Difference Equations to Study Quicksort Algorithms, Journal of Difference Equations and Applications volume 26, issue 2 (2020), 275-294.
  • [51] Yukun Yao and Doron Zeilberger, Numerical and Symbolic Studies of the Peaceable Queens Problem, Experimental Mathematics (2019), DOI: 10.1080/10586458.2019.1616338.
  • [52] Doron Zeilberger, Symbolic Moment Calculus I.: Foundations and Permutation Pattern Statistics, Annals of Combinatorics 8 (2004), 369-378. math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/smcI.html.
  • [53] Doron Zeilberger, Symbolic Moment Calculus II.: Why is Ramsey Theory Sooooo Eeeenormously Hard?, INTEGERS 7(2)(2007), A34. math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/smcII.html.
  • [54] Doron Zeilberger, The nn2n^{n-2}-th proof for the number of labeled trees, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, undated (c. 1998), math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/labtree.html .
  • [55] Doron Zeilberger, The Automatic Central Limit Theorems Generator (and Much More!), “Advances in Combinatorial Mathematics: Proceedings of the Waterloo Workshop in Computer Algebra 2008 in honor of Georgy P. Egorychev”, chapter 8, pp. 165-174, (I.Kotsireas, E.Zima, eds. Springer Verlag, 2009.) math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/georgy.html.
  • [56] Doron Zeilberger, A Holonomic Systems Approach To Special Functions, J. Computational and Applied Math 32 (1990), 321-368, math.rutgers.edu/~zeilberg/mamarim/mamarimPDF/holonomic.pdf.
  • [57] Doron Zeilberger, HISTABRUT: A Maple Package for Symbol-Crunching in Probability theory, the Personal Journal of Shalosh B. Ekhad and Doron Zeilberger, posted Aug. 25, 2010, math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/histabrut.html.
  • [58] Doron Zeilberger, The C-finite Ansatz, Ramanujan Journal 31 (2013), 23-32. Available from math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/cfinite.html .
  • [59] Doron Zeilberger, Why the Cautionary Tales Supplied by Richard Guy’s Strong Law of Small Numbers Should not be Overstated, The Personal Journal of Shalosh B. Ekhad and Doron Zeilberger. Available from math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/small.html.
  • [60] Doron Zeilberger, An Enquiry Concerning Human (and Computer!) [Mathematical] Understanding, in: C.S. Calude, ed., “Randomness & Complexity, from Leibniz to Chaitin” World Scientific, Singapore, 2007, pp. 383-410. Available from math.rutgers.edu/~zeilberg/mamarim/mamarimhtml/enquiry.html.