An Adjacent-Swap Markov Chain on Coalescent Trees
Abstract
The standard coalescent is widely used in evolutionary biology and population genetics to model the ancestral history of a sample of molecular sequences as a rooted and ranked binary tree. In this paper, we present a representation of the space of ranked trees as a space of constrained ordered matched pairs. We use this representation to define ergodic Markov chains on labeled and unlabeled ranked tree shapes analogously to transposition chains on the space of permutations. We show that an adjacent-swap chain on labeled and unlabeled ranked tree shapes has mixing time at least of order , and at most of order . Bayesian inference methods rely on Markov chain Monte Carlo methods on the space of trees. Thus, it is important to define good Markov chains which are easy to simulate and for which rates of convergence can be studied.
1 Introduction
The standard coalescent (Kingman:1982uj, ) is often used in evolutionary biology and population genetics to model the set of ancestral relationships among individuals as a rooted and timed binary tree called a genealogy (wakeley_coalescent_2008, ). In this manuscript, we are concerned with discrete tree topologies only and assume a unit interval length between consecutive coalescing (branching) events.
A labeled ranked tree shape with leaves is a rooted labeled and ranked binary tree. The leaves are labeled by the set and internal nodes are labeled in increasing order, with label at the root (Figure 1(A)). The cardinality of is given by
(1) |
In many applications, the quantity of interest is the overall shape of the tree (Kirkpatrick1993, ; Frost2013, ; Maliet2018, ); the specific labels of the samples are un-important. In fact, a key feature of simple coalescent models, such as the Kingman coalescent, is the assumption that the leaves are exchangeable. The Tajima coalescent, a lumping of the Kingman coalescent that ignores leaf labels, was recently proposed for inferring past population size trajectories from binary molecular data (Palacios2019, ). This motivates the study of unlabeled ranked tree shapes with leaves (Figure 1(B)). The appealing feature of modeling unlabeled ranked tree shapes is the reduction in the cardinality of the state space. The cardinality of is given by the ()-th Euler zig-zag number (kuznetsov1994increasing, ).

Other types of trees are used in the context of evolutionary models, such as cladeograms (or phylogenies). Cladeograms are rooted or unrooted binary trees with labeled leaves and no ranking of internal nodes. In diaconis1998matchings , the space of cladeograms with leaves is shown to be in bijection with the space of perfect matchings of labels, with . The bijection with perfect matchings is used to define a Markov chain on the space of cladeograms. Using the perfect matching perspective, the chain is analogous to a random transpositions chain on the space of permutations of elements. This representation allows the use of known results for the random transpositions chain to be applied to the chain on trees to determine sharp bounds on rates of convergence diaconis2002random . Motivated by this result, we propose a new representation of labeled and unlabeled ranked tree shapes as a type of matching. We use this representation to define an adjacent-swap chain on the spaces of labeled and unlabeled ranked tree shapes, and to study their mixing time. It is important to study the mixing time for chains on ranked trees because these results have implications for the design of MCMC algorithms to infer evolutionary parameters from molecular data in a Bayesian setting (drummond2012bayesian, ).
In the rest of this section we define a bijective representation of ranked trees as constrained ordered matchings and define the adjacent-swap chain. We then state our main results concerning mixing times and discuss related work. In Section 2, we state known results on Markov chains that will be used in the following sections. In section 3 we state important properties of the proposed adjacent-swap chain. The lower bound for the mixing time of the adjacent-swap chains is proven in section 4 and it is obtained by finding a specific function that gives a lower bound for the relaxation time of the chains.In section 5, we prove the upper bounds on the mixing time of the adjacent-swap chains via a coupling argument. In the discussion section we mention future directions and other related chains on labeled and un-labeled coalescent trees which could be defined using the matching representations, as well as discuss implications for MCMC methods applied in practice.
1.1 Matching representations
We first establish some definitions. Let be an integer and a multiset of size . A matching of objects is a partition of the elements into disjoint subsets of size two. An ordered matching is a matching with a linear ordering (ranking) of the pairs.
Definition 1.
The space (Constrained Ordered Matchings) is the set of all ordered matchings of with pairs labeled , which satisfy, for and ,
(2) |
To represent a labeled ranked tree shape as a constrained ordered matching, we take to be the set of leaf and internal node labels, and each pair in the matching represents a coalescence in the tree, with representing the -th coalescence event. The condition (2) then simply ensures that no interior node can be merged before it has been introduced. A leaf can merge at any point and thus there is no constraint on a leaf’s position in the matching.
Lemma 1.
With , the space is in bijection with . With multiset , with repeated times, the space is in bijection with with .
Proof.
First assume that . The first matched pair can be formed by any two leaves in possible ways. The second pair can be formed by any of the remaining leaves or , the label of the first matched pair, in ways; in general, the -th matched pair can be formed in ways, because there are unique labels that could be matched at that time. Thus, and for every element there is a one-to-one mapping between every coalescence event in and every matched pair in one element of . These two facts imply there is a one-to-one mapping between and .
With the leaf labels replaced with the repeated element , the space of matchings is in bijection with due to the fact that the space of is equivalent to the space of with the leaf labels removed. That is, there is a surjection equivalent to the surjection , with and . The bijective matching representation of the ranked trees and of Figure 1 are shown in the legend of Figure 1. ∎
1.2 Markov chain and main results
Definition 2.
The adjacent-swap Markov chain on the space is defined by the following update move:
-
1.
Pick an index uniformly at random.
-
2.
Pick labels and uniformly from the pairs and respectively.
-
(a)
If swapping the positions of does not violate constraint (2), make the swap.
-
(b)
Otherwise, remain in the current state.
-
(a)
For the sets which give the bijections with and , this Markov chain is connected and reversible. In the case of it is also symmetric and hence, it has a uniform stationary distribution. For , the chain is not symmetric; the stationary distribution is the Tajima distribution on the space of ranked un-labeled trees. These facts, as well as the relationship between the two chains, will be proved in a later section.

We are interested in the convergence rates of this Markov chain and how this rate compares depending on the space or . The measure that we study for convergence rate is the mixing time. That is, for a Markov chain on a space with transition probability and stationary distribution , the mixing time is defined as
To state the result, let and be the mixing times for the chains on and respectively. Our main result is:
Theorem 1.
There exists constants such that
and
The lower bound of comes from the relaxation time, and involves a standard trick of finding a specific function for which the variance under the stationary distribution can be bounded. The chain on is a certain type of “lumping” of the chain on , and so the same lower bound applies to both spaces. The upper bounds are obtained using a coupling argument. We conjecture that the mixing time is indeed smaller than the time , though it is not evident whether this is by a significant order, or just a constant factor.
1.3 Related work
Mixing of chains on other tree spaces have been previously studied, in particular, the mixing of Markov chains on the space of cladeograms. An -leaf cladeogram is a rooted or unrooted tree with labeled leaves and unlabeled internal nodes of degree 3. The main difference between cladeograms and the tree topologies studied in this manuscript, is that cladeograms do not rank internal branching events. The cardinality of the space of cladeograms of leaves is , smaller than the cardinality of the space of ranked and labeled trees, but larger than the cardinality of the space of ranked unlabeled trees. Cladeograms are fundamental objects in phylogenetics to model ancestral relationships at the species level, while ranked tree shapes are fundamental objects in population genetics and phylodynamics of infectious diseases. Ranked tree shapes are used to model ancestral relationships of a sample of individuals from a single population of the same species (felsenstein2004inferring, ).
Markov chains on cladeograms. Aldous aldous2000mixing studied a chain on unrooted cladeograms that removes a leaf chosen uniformly at random from the current cladeogram and reattaches it to a random edge. Using coupling methods, Aldous aldous2000mixing showed the relaxation time for this chain is . The same chain was later analyzed by Schweinsberg (schweinsberg2002n2, ) who showed that using a modified method of distinguished paths. In diaconis1998matchings , the authors show that the space of rooted cladeograms with leaves is in bijection with the space of perfect matchings in the complete graph on vertices. This bijection was later used to define a Markov chain by randomly choosing two pairs and transposing two randomly selected entries of each pair diaconis2002random . The authors showed that this random transpositions chain mixed after steps. In SHK2014 , the authors study the nearest neighbor interchange (NNI) chain and a subtree prune and regraft (SPR) chain on the space of rooted cladeograms. The authors showed that the upper bounds on the relaxation time of the SPR and the NNI chains are and respectively.
Other related work include the study of mixing times of chains whose stationary distribution is the posterior distribution over cladeograms (mossel2006limitations, ; vstefankovivc2011fast, ). In (mossel2006limitations, ), the authors show exponential mixing times under a misspecified model in which data is generated from a mixture of cladeograms.
Markov chains on related spaces.
The adjacent swap Markov chain on the space of ranked trees studied in this manuscript is closely related to the adjacent transpositions chain on the space of permutations. As it is often done, one can think of an element as an ordering of a deck of cards with unique labels . In (aldous1983random, , Example 4.10), the chain is defined by picking at random two adjacent cards from the deck; with probability 1/2, the cards are swapped and with probability 1/2 nothing happens. The chain is symmetric with uniform stationary distribution. Aldous (aldous1983random, ) showed that the mixing time of this chain has a lower bound of and an upper bound of on the mixing time. In wilson2004mixing , the now-famous “Wilson’s method” was introduced and applied to the adjacent transpositions chain as an example to improve the lower bound to order with the explicit constant , as well as an upper bound of . Finally, in lacoin2016mixing ,the upper bound was improved to , so that the upper and lower bounds matched; in addition, the chain was proven to follow the cut-off phenomenon. Durrett generalized the adjacent transpositions chain to the -reversal chain, introduced as a model for the evolution of DNA in a chromosome durrett2003shuffling .
To the best of our knowledge, bounds on the mixing time of chains on ranked tree shapes have not been studied before (misra2011optimization, ).
2 Preliminaries
In this section, we review some results of Markov chains theory that will be used to bound the mixing time of the adjacent swap chains on and . We use the coupling method to find the upper bounds. A coupling of Markov chains with transition matrix is a process such that both and are marginally Markov chains with transition matrix . The goal is to define a coupling of two chains from different starting distributions such that the two chains will quickly reach the same state, and once this occurs the two chains stay matched forever. This coupling time gives an upper bound on the mixing time, see Chapter 5 from LevinPeresWilmer2006 for more details.
Theorem 2 (Theorem 5.4 (LevinPeresWilmer2006, )).
Suppose that for each pair of states there is a coupling with and . For each such coupling, let be the coalescence time of the chains, i.e.
Then
To find a lower bound on the mixing time, we bound the relaxation time and use the following result:
Theorem 3 (Theorem 12.5 (LevinPeresWilmer2006, )).
Suppose is the transition matrix of an irreducible, aperiodic and reversible Markov chain. Then
(3) |
The relaxation time is defined as the inverse of the spectral gap: , where , one minus the absolute value of the second-largest eigenvalue of the transition matrix of the chain. We will bound the relaxation time using the following variational characterization:
Theorem 4 (Lemma 13.7 (LevinPeresWilmer2006, )).
Let P be the transition matrix for a reversible Markov chain. The spectral gap satisfies
where
3 The adjacent-swap chain on ranked tree shapes
Lemma 2.
The adjacent-swap Markov chain on is irreducible, aperiodic and reversible with respect to the uniform stationary distribution .
Proof.
This can be seen by noting the transition matrix is symmetric. Let be the transition matrix for the adjacent-swap chain on , then
where is obtained by swapping two randomly chosen elements from two adjacent pairs in , chosen uniformly at random, e.g. consider the following states at pairs and :
In a labeled ranked tree shape , all pairs are formed by distinct elements, so are all unique and the only way to transition from to is to swap the labels and , which happens with probability if or .
To see that the chain is irreducible, we note that any label can be moved to any pair to the right, one pair (or step) at a time with positive probability and that any label can be moved to a pair with index to the left (one pair at a time) as long as the label corresponds to a leaf or an internal node smaller than , satisfying constraint (2). To see that the chain is aperiodic, we note that , for all . Independent of the current state, we can pick pair and attempt to swap any element of the pair with label from the pair with probability . Since this move violates constraint (2), the move is rejected and the chain remains in the current state. ∎
We note that the transition matrix of the adjacent-swap chain on unlabeled ranked tree shapes is not symmetric. For example, consider the transition to of the type:
(4) |
for labels (recall the s represent leaf labels). The probability of this transition is , since once pairs and are chosen, there is a chance of choosing label from pair and label from pair . However, the reverse transition:
has probability . It turns out that the stationary distribution for the chain is the Tajima coalescent distribution veber (also known as Yule distribution). For
where is the number of cherries of , i.e. the number of pairs of the type in the representation. Indeed, for transitions to of the type of (4), has one more cherry than and ; for other types of transitions that do not affect the number of cherries, , and the detailed balance equation is satisfied.
Another way of proving the Tajima coalescent distribution is the stationary distribution on unlabeled ranked tree shapes is to view the chain as a lumping of the chain on . This perspective also gives us an initial comparison of the relaxation times of the chains. The space of unlabeled ranked-tree shapes can be considered as a set of equivalency classes of the trees . That is, for trees , define the equivalence relation if the trees have the same ranked tree shape. From the perspective with , two matchings are equivalent if all internal node labels occur in the same pairs.
This equivalence relation induces a partition of using equivalence classes. That is, we can write as the disjoint union of sets , where , and all trees in have the same ranked-tree shape.
Lemma 3.
For any , equivalence class , and , the following relation holds:
Proof.
If , then the transition from to is of the type:
(5) |
where correspond to leaf labels. Since and differ only at leaf labels, the transition probability of swapping two leaf labels is the same and . If , and therefore , then the transition from to is of the type:
where and are not both in . Since and differ only at leaf labels, again . ∎
In words, Lemma 3 says that probability of transitioning to a different ranked-tree shape is independent of the leaf configuration of the current tree. A well-known result (e.g. Lemma 2.5 in LevinPeresWilmer2006 ) is that the induced chain on the space of equivalence classes defined by is a Markov chain; then observe that the induced chain is equivalent to adjacent-swap Markov chain on the space . This allows a comparison of the spectral gaps of the two chains. Lemma 12.9 in (LevinPeresWilmer2006, ) states the eigenvalues of the transition matrix of the lumped chain are eigenvalues of the transition matrix of the full chain, hence we have the following lemma.
Lemma 4.
Let be the spectral gaps of the chains respectively. Then, .
From Lemma 3 we can immediately see that stationary distribution for is indeed the Tajima coalescent distribution. Suppose corresponds to the equivalence class . Then,
A uniformly sampled labeled ranked tree shape with the leaf-labels erased gives an unlabeled ranked tree shape according to the Tajima distribution (veber, ). This implies that is equal to the Tajima distribution for the ranked tree shape corresponding to .
4 Lower bound
To prove the lower bounds in Theorem 1, we use the variational characterization of the relaxation time:
(6) |
We will use the internal tree length as a function to find a lower bound for . Since the internal tree length is independent of the leaf labels, we get the same lower mixing time bound for the adjacent-swap chains on and . For ease of exposition we will focus the following discussion on .
We will define the internal tree length function on the space using the correspondence with and assuming unit length between consecutive coalescence events. For a label , let denote the pair index such that . In the example of Figure 3, . Note that from the constraints (Eq. (2)), we have that
We now define the internal tree length function on as follows:
(7) |
As an example consider the ranked unlabeled tree shape with depicted in Figure 3. Note that the function is minimized for the “caterpillar tree”: , where for all , giving .

This function is useful in bounding the relaxation time because it is “local” with respect to the Markov chain. That is, suppose and . The change from to could have moved an interior node label to the left, in which case . If it moved an interior node label to the right, then . If two interior node labels were swapped, then . Thus, . The denominator of Equation (6) can then be upper-bounded by .
To find the variance of the internal tree length, we note that VarVar since the internal tree length is independent of the leaf labels. We now re-state the standard coalescent of labeled ranked tree shapes as an urn process.
Consider an urn containing balls labeled . At step k, draw two balls without replacement from the urn. Let be the labels of the balls drawn, then set . Add in a new ball with label . Repeat this process for until only a single ball labeled remains in the urn. The resulting sequence of pairs corresponds to a labeled ranked tree shape drawn from the standard coalescent, i.e. .
We can now simplify the urn process as follows: start with white balls in the urn. At each step, draw two balls and add back in a single red ball (representing an interior node of the tree). Let and be the number of red balls in the urn after coalescence events. Note that after mergers (coalescence events), there are total balls left and .
The simplified urn process is useful because the internal tree length can be computed by counting the number of red balls at each step and adding them all together (Figure 3), i.e.,
The quantities are fairly easy to analyze and have been studied before in various contexts. In janson2011 , the values are used to study the asymptotic behavior of the external tree length of the Kingman coalescent and it is shown that:
Using this, we get
Then, we calculate
(8) | |||
(9) |
The second moment is the sum of these two lines (8) + (9), which gives
Theorem 5.
Let be the relaxation time of the adjacent-swap chains on and , respectively. Then,
5 Upper bound
In this section we prove the upper bounds in Theorem 1 using a coupling argument. The coupling is similar to the one used by Aldous for analyzing the adjacent transpositions chain on aldous1983random .
We analyze a lazy version of the chain to make the coupling. That is, at each step, we generate a random coin flip . If , attempt a move of the chain, and if then make no change. Let and be the two copies of the chain at time , one started from and the other from . We will first describe the coupling for the chain on .
Coupling of unlabeled ranked tree shapes.
We’ll define a coupling that jointly matches the internal node labels of the two copies in the order . Note that label is already jointly matched in the pairs because it can only occur in the final pair. Let be the maximum label that is not jointly matched, i.e., the maximum element that occurs in different pairs in and . Let denote the index of the pair in the matching that contains , i.e. if at time . Then, at time , the elements match in both and , i.e. for every .
For any label , the coupling will have two properties:
Property 1. If , then for all .
Property 2. If and , then and if , then . This condition will ensure that will eventually get matched in the two copies.
Define the following quantities:
-
•
Let be the set of all indices that contain labels jointly matched. That is, there is a label such that . Note that it is possible to have other matches for labels , but we do not keep track of those matches and breaking those matches does not violate the two properties of the coupling.
-
•
Let if label is in pairs and , or if it is in pairs and . Otherwise, set .
At each step, pick an index uniformly at random and consider the following cases:
-
1.
and . There are no joint matchings of labels in pairs and and swapping two elements in both copies will not match label . In this case, propose a swap in and according to their (lazy) marginal dynamics.
-
2.
and/or , and . There is at least one joint match of labels in pairs or and swapping two elements in both copies will not create a new joint match of . To preserve Property 1 it is necessary to perform the same move (or no move at all) on both chains. Toss a single coin to determine whether a move will be proposed on both copies or not.
(a) If and the pairs at and are of the type:With the possibility of . Then, if a matched label e.g. is chosen to be swapped in chain , it must also be chosen for . Now, label since it is a label that merges at ; and since it is a label that was jointly matched before. Then or cannot be label since is already a match. This implies that there are no constraints on and the marginal probability of swapping would be the same in each chain.
(b) If and the pairs at and are of the type:
With the possibility of , or
Then, if a matched label e.g. is chosen to be swapped in chain , it must also be swapped in .
-
3.
and . There are no joint matchings of labels in pairs and , and label is either in and or in and . In order to preserve property 2, simply set . This ensures that label will only possibly move in one of the chains.
-
4.
and/or , and . There is one joint match in pair and/or , and label is either in and or in and . To preserve Properties 1 and 2 while keeping the correct marginal transition probabilities for each chain we define the joint transitions as follows.
(a) Suppose and the pairs at indices are of the type:
By the same argument of case 2, . Table 1 defines the joint proposal probabilities that preserve Properties and and correct marginal transition probabilities.
Table 1: Coupling transition probabilities in case 4(a) Move in Move in Probability No change No change No change No change No change (b) Suppose now that and that pairs at indices and are of the type:
Table 2 defines the joint proposal probabilities that preserve Properties 1 and 2 with correct marginal transition probabilities.
Table 2: Coupling transition probabilities in case 4(b) Move in Move in Probability No change No change No change No change (c) Suppose now that and that pairs at indices and are of the type:
Table 3 defines the joint proposal probabilities that preserve Properties 1 and 2 with correct marginal transition probabilities.
Table 3: Coupling transition probability in case 4(c). Move in Move in Probability No change No change No change No change
Coupling of labeled ranked tree shapes.
The coupling for can proceed exactly as the coupling for until every interior-node label is matched, say at time . After that point, the leaf labels can be matched in any order. We extend the set to contain the indices of any label such that and to be the set of indices such that any label is in pair and or and . The transition probabilities defined above work with these sets , and have the following properties, for and label :
-
1.
Property 1 If , then for all .
-
2.
Property 2 If then for all . If then for all .
5.1 Coupling time
For the chain on unlabeled ranked tree shapes , the time to couple is , where is the time it takes to match interior node label , after the labels are already matched. Property 2 is crucial to study . Suppose that at time when label is matched we have . Let be the time it takes for to hit the left boundary of it’s range, i.e. . Then necessarily at this time, by Property 2, .
Note that . If is not at the left boundary, the probability of moving to the left is:
The probability that moves to the right will depend on the exact configuration and whether or not there is a constraint, e.g. in the situation
in which the probability moves to the right is . Thus, we can bound
The following is an easy result about the hitting time of a symmetric random walk on a line.
Lemma 5.
Let be a random walk on the line with the transitions
for some . Suppose and let . Then .
Proof.
We can prove the result for , then scale. Let be the expected first time of hitting if . Note that the following identities are satisfied:
This is a second-order recursion, with solution . So this means . When the probability of moving is , so the expected time to hit started from is . ∎
Let be a random walk on the line with , defined as in Lemma 5. We can couple with so that almost surely for all , because is moving to the right with probability . In conclusion,
and thus the total coupling time is
Leaf-Labeled Trees
The coupling time for leaf-labeled trees can be written , where is the time from the previous section for the interior labels to match. For the time for the leaves to match, note that time is saved because the leaves are allowed to match in any order. Moreover, leaf labels can be moved without constraints. In analogy with the result from Example 4.10 in aldous1983random for adjacent transpositions on , this would take time of order . Therefore, .
6 Discussion
The representation of ranked tree shapes as ordered matchings can be used to define many other Markov chains in analogy to well-studied chains on . For example, in addition to the adjacent-swap, another natural chain would be a random-swap Markov chain: pick two pairs uniformly at random, within each pair pick a label, and swap the two elements if it is allowed. The internal tree length function defined in Section 4 gives a lower bound of for this chain, because a single move could change the function by at most , and thus the Dirichlet form can be bounded by . A naive coupling argument would give an upper bound of order . The focus of this paper was on the adjacent-swap chain because it is a local-move chain which could be more useful for Metropolis algorithms in applications.
The upper bound on the mixing time relies on a coupling that matches one label at a time in a specific order giving a sub-optimal upper bound. We conjecture that the mixing time of the adjacent-swap chain on unlabeled ranked tree shapes is of the order of as it is in the case of adjacent transpositions on . Figure 4 shows the total variation distance between the adjacent-swap chain on unlabeled ranked tree shapes and the Tajima stationary distribution for trees with leaves. Due to the large size of the state space, this calculation could not be extended for larger trees in order to inform about the presence of a cutoff phenomena.

We also note another connection between trees and permutations: Unlabeled ranked tree shapes are in bijection with alternating permutations (donaghey1975alternating, ; richard1999enumerative, ) and hence some results concerning Markov chains on alternating permutations could be used to analyze the mixing of Markov chains on unlabeled ranked trees and vice versa. However, we are not aware of mixing time results of Markov chains on the space alternating permutations (diaconis2013random, ).
Finally, in this work we analyzed Markov chains on trees whose stationary distributions are coalescent distributions. These coalescent distributions are used as prior distributions over the set of ancestral relationships of samples of DNA (Palacios2019, ). In practice, these Markov chains are used to approximate the posterior distribution, and hence, the state space is more restricted and subject of future exploration.
Acknowledgments.
MS is supported by a National Defense Science & Engineering Graduate fellowship. JAP is supported by National Institutes of Health Grant R01-GM-131404 and the Alfred P. Sloan Foundation.
References
- [1] D. Aldous. Random walks on finite groups and rapidly mixing Markov chains. In Séminaire de Probabilités XVII 1981/82, pages 243–297. Springer, 1983.
- [2] D. J. Aldous. Mixing time for a Markov chain on cladograms. Combinatorics, Probability and Computing, 9(3):191–204, 2000.
- [3] P. Diaconis, S. Holmes, et al. Random walks on trees and matchings. Electronic Journal of Probability, 7, 2002.
- [4] P. Diaconis and P. M. Wood. Random doubly stochastic tridiagonal matrices. Random Structures & Algorithms, 42(4):403–437, 2013.
- [5] P. W. Diaconis and S. P. Holmes. Matchings and phylogenetic trees. Proceedings of the National Academy of Sciences, 95(25):14600–14602, 1998.
- [6] R. Donaghey. Alternating permutations and binary increasing trees. Journal of Combinatorial Theory, Series A, 18(2):141–148, 1975.
- [7] A. Drummond, M. Suchard, D. Xie, and A. Rambaut. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, 29:1969–1973, 2012.
- [8] R. Durrett. Shuffling chromosomes. Journal of Theoretical Probability, 16(3):725–750, 2003.
- [9] J. Felsenstein. Inferring phylogenies, volume 2. Sinauer associates Sunderland, MA, 2004.
- [10] S. D. W. Frost and E. M. Volz. Modelling tree shape and structure in viral phylodynamics. Philosophical Transactions of the Royal Society B: Biological Sciences, 368(1614):20120208, 2013.
- [11] S. Janson and G. Kersting. On the total external length of the Kingman coalescent. Electron. J. Probab., 16:2203–2218, 2011.
- [12] J. Kingman. The coalescent. Stochastic Processes and their Applications, 13(3):235–248, 1982.
- [13] M. Kirkpatrick and M. Slatkin. Searching for evolutionary patterns in the shape of a phylogenetic tree. Evolution, 47(4):1171–1181, 1993.
- [14] A. G. Kuznetsov, I. M. Pak, and A. E. Postnikov. Increasing trees and alternating permutations. Russian Mathematical Surveys, 49(6):79, 1994.
- [15] H. Lacoin et al. Mixing time and cutoff for the adjacent transposition shuffle and the simple exclusion. The Annals of Probability, 44(2):1426–1487, 2016.
- [16] D. A. Levin and Y. Peres. Markov chains and mixing times, volume 107. American Mathematical Society, 2017.
- [17] O. Maliet, F. Gascuel, and A. Lambert. Ranked tree shapes, nonrandom extinctions, and the loss of phylogenetic diversity. Systematic Biology, 67(6):1025–1040, 04 2018.
- [18] N. Misra, G. Blelloch, R. Ravi, and R. Schwartz. An optimization-based sampling scheme for phylogenetic trees. In International Conference on Research in Computational Molecular Biology, pages 252–266. Springer, 2011.
- [19] E. Mossel, E. Vigoda, et al. Limitations of markov chain monte carlo algorithms for bayesian inference of phylogeny. The Annals of Applied Probability, 16(4):2215–2234, 2006.
- [20] J. A. Palacios, A. Véber, L. Cappello, Z. Wang, J. Wakeley, and S. Ramachandran. Bayesian estimation of population size changes by sampling Tajima’s trees. Genetics, 213(3):967–986, 2019.
- [21] P. S. Richard. Enumerative combinatorics. Volume I, 1999.
- [22] R. Sainudiin, T. Stadler, and A. Véber. Finding the best resolution for the Kingman-Tajima coalescent: theory and applications. Journal of Mathematical Biology, 70(6):1207–1247, 2015.
- [23] J. Schweinsberg. An bound for the relaxation time of a Markov chain on cladograms. Random Structures & Algorithms, 20(1):59–70, 2002.
- [24] D. Spade, R. Herbei, and L. Kubatko. A note on the relaxation time of two Markov chains on rooted phylogenetic tree spaces. Statistics & Probability Letters, 84:247–252, 01 2014.
- [25] D. Štefankovič and E. Vigoda. Fast convergence of Markov chain Monte Carlo algorithms for phylogenetic reconstruction with homogeneous data on closely related species. SIAM Journal on Discrete Mathematics, 25(3):1194–1211, 2011.
- [26] J. Wakeley. Coalescent Theory: An Introduction. Roberts & Company Publishers, June 2008.
- [27] D. B. Wilson et al. Mixing times of lozenge tiling and card shuffling Markov chains. The Annals of Applied Probability, 14(1):274–325, 2004.