Bijections for Ranked Tree-Child Networks
Abstract
The class of ranked tree-child networks, tree-child networks arising from an evolution process with a fixed embedding into the plane, has recently been introduced by Bienvenu, Lambert, and Steel. These authors derived counting results for this class. In this note, we will give bijective proofs of three of their results. Two of our bijections answer questions raised in their paper.
1 Introduction
Phylogenetic trees and networks are important discrete structures from biology where they are used to model evolution; see [9, 10, 11]. Of these two types of structures, phylogenetic trees are simpler and more classical but they are less suitable to model evolutionary scenarios that involve reticulation events. Thus, in many recent studies, they have been replaced by (the more general) phylogenetic networks. However, the majority of the classes of phylogenetic networks are not recursive and thus they are a poor model for processes that evolve over time. In order to be able to model such processes, Bienvenu et al. recently proposed the class of ranked tree-child networks; see [1].
We recall some definitions. First, a (rooted, binary) phylogenetic network is a simple DAG (directed acyclic graph) with a unique root of indegree and outdegree such that all other nodes belong to one of the following types:
-
•
leaves, which are nodes with indegree and outdegree ; these are bijectively labeled by the set where is the total number of leaves;
-
•
tree nodes, which are nodes with indegree and outdegree ; and
-
•
reticulation nodes, which are nodes with indegree and outdegree .
A phylogenetic network is called a tree-child network if each internal (i.e., non-leaf) node has at least one child which is not a reticulation node. For such a network, we call a tree node whose two children are both not reticulation nodes a branching event and a reticulation node together with both its parents a reticulation event; see Figure 1-(a). A ranked tree-child network (or RTCN for short) is now a tree-child network which is drawn in such a way that one starts with a branching event and then successively adds either a branching event or a reticulation event until the leaves are reached; see Figure 1-(b).

The class of RTCNs has the advantage over other network classes that the special embedding into the plane makes counting relatively straightforward. (In contrast, few of the other classes of phylogenetic networks have so far been counted; see, e.g., [2, 3, 4, 5, 6, 7, 8] for some recent progress.) For instance, the following simple formula was obtained in [1] for the number of RTCNs with leaves:
(1) |
Curiously, the same number is the answer to the following counting problem: find the number of ways for people to cross a river with a two-person boat where the boat trips follow the pattern send, returns, send, returns, etc; see A167484 in the OEIS. Thus, one can ask for a bijection between RTCNs and such boat sequences; however, in Remark 2.8 in [1] it was claimed that a natural bijection seems to be unlikely because for , out of the RTCNs contain a reticulation event and do not (see bottom resp. top row of Figure 2) whereas all possible boat trips are completely equivalent. Nevertheless, we will give a simple bijection in the next section; again see Figure 2 for the correspondence between RTCNs and boat sequences for arising from our bijection.

Using this bijection, branching events are, e.g., mapped to the following events of boat sequences. Assume that the people are ranked according to their skill at steering the boat. Then, the number of branching events corresponds to the number of return trips where the most skilled person among those on the opposite shore takes the boat back. Since it was proved in Corollary 2.7 in [1] that the number of branching events in a RTCN chosen uniformly at random from all RTCNs with leaves satisfies a central limit theorem, we obtain the following result.
Theorem 1.
For a boat sequence of people chosen uniformly at random from all boat sequences, the number of times that the most skilled person takes the boat back satisfies the limit law:
where denotes convergence in distribution and denotes the standard normal distribution.
In fact, the number of RTCNs with leaves and branching events was counted in [1] as well:
where the bracket denotes the (signless) Stirling numbers of the first kind and is the number of ranked trees (i.e., RTCNs without reticulation events). Since the Stirling numbers of the first kind count permutations with a fixed number of cycles, this result suggests that there should be a simple bijection between RTCNs with a fixed number of branching events and pairs consisting of permutations with a fixed number of cycles and ranked trees. We will give such a bijection in Section 3.

A final bijection discussed in this note is related to another result in [1] which concerns containment of ranked trees in RTCNs. Here, we say that a ranked tree is contained in a RTCN , in symbols , if can be obtained from by choosing one of the incoming edges of each reticulation node of , removing them, and then suppressing resulting nodes with indegree and outdegree ; see Figure 3.
For a fixed ranked tree with leaves, it was proved in [1] that
(2) |
Note that is the number of phylogenetic trees with leaves where a phylogenetic tree is a phylogenetic network without reticulation nodes; see, e.g., Corollary 2.2.4 in [10]. Thus, one again would like to have a simple bijection between RTCNs which contain and phylogetetic trees; see Remark 3.7 in [1]. In the final section of this note, we will give such a bijection.
2 RTCNs and boat sequences
In this section, we will describe a bijection between RTCNs with leaves and boat sequences involving people. This will be done in three steps. First, we will explain how to construct all RTCNs with leaves from those with leaves; then, we will do the same for boat sequences; and finally we will synchronize these two constructions to get the desired bijection.
Recursive construction of RTCNs.
Here, we give a top-down construction which (uniquely) produces all RTCNs with leaves. A similar construction was described in [1] where the authors used the concept of decorated RTCNs. We will sidestep this concept which gives a (slightly) simpler construction.

Assume we have a given RTCN with leaves. Then, we perform either one of the following two operations (see Figure 4 for examples).
- (O-i)
-
Pick two labels from the set , remove the label from the RTCN and attach to its leaf two children with labels . Finally, relabel all the other leaves of the RTCN in an order-consistent way with the remaining labels from .
- (O-ii)
-
Pick two labels from and a label from . Next, find the relative ranks, say , of in . Remove from the RTCN and attach a reticulation event to their leaves with the children corresponding to the leaf which had the label being a leaf with label and the reticulation vertex, the children corresponding to the leaf which had the label being a leaf with label and the (same) reticulation vertex and the child of the reticulation vertex being a leaf with label . Finally, relabel all the other leaves of the RTCN in an order-consistent way with the labels from .
Since both of these operations are reversible, we have the following result.
Proposition 1.
Starting from all RTCNs with leaves and performing for each of them all the operations above gives (exactly once) all the RTCNs with leaves.
Moreover, this construction also immediately gives (1).
Corollary 1.
We have
Proof.
The number of different ways to perform (i) and (ii) above is
Thus,
which proves the claim.

Recursive construction of boat sequences.
Here, we explain how all boat sequences involving people can be constructed from those involving people. (Note that this reveals a recursive structure inherent to boat sequences that was not considered in [1].)
First, note that a boat sequence involving people can be represented by an array with rows where the first row has entries and the second has entries, namely, the first row contains the set of numbers of the people sent in the -th step and the second row contains the singleton of the number of the person who returned in the -th step; see bottom left array in Figure 5 for an example.
Now, each such array can be mapped to a corresponding array where the numbers are replaced by the relative ranks of the people within the group on their shore, e.g., if are sent with at the same shore, then is replaced by . Call the resulting map ; see bottom of Figure 5.
Note that the map is clearly a bijection from the above arrays representing boat sequences to arrays where in the first row, we have a subset of size of , followed by a subset of size of , etc. until the set and in the second row, we have a subset of size of , followed by a subset of size of , etc. until a subset of size of .
Now, we apply the following operations on the latter arrays: we shift all entries of the first row to the right by one position, add a subset of size of in the (now empty) first position and add a subset of size of at the end of the second row; see the downwards arrow in Figure 6.
Finally, by using the inverse of , we obtain an array for a boat sequence involving people; see the bottom left array in Figure 6.

Since the above process is reversible, we obtain the following result.
Proposition 2.
Starting from a boat sequence involving people and using the map , all possible operations above, and the inverse of , we obtain (exactly once) all boat sequences involving people.

Synchronization of the two constructions.
We will now explain how to synchronize the constructions from the above two paragraphs to get a bijection between RTCNs and boat sequences. This bijection is defined recursively.
Assume we have given a RTCN on leaves. Reverse the recursive construction from Proposition 1 to obtain a RTCN on leaves. By induction hypothesis, this RTCN is mapped on a boat sequence which is represented by an array as described in the previous paragraph. We apply the map to this array. Now, depending on which of the two operations was used in Proposition 1 to construct the RTCN on leaves from that of leaves, we do the following to the array of ranks.
- (i)
-
If Operation (O-i) was used, then move all the pairs from the first row by one position and add as the first entry. Also, add at the end of the second row.
- (ii)
-
If Operation (O-ii) was used, then again move all the pairs from the first row by one position and add as the first entry. Add the relative rank of in to the end of the second row.
Finally, apply the inverse of to the above array to obtain the desired boat sequence.
More generally, in order to get the corresponding boat sequence for a RTCN on leaves, we first have to reduce it to the RTCN on leaves by applying Proposition 1 times and retain the operations from the first paragraph for each step. Then, we can start from the (trivial) boat sequence and use the above procedure times to construct the corresponding boat sequence; see Figure 7 for an example.
3 Branching events, permutations and ranked binary trees

In this section, we describe an explicit bijection between the set of RTCNs with leaves and reticulation events (equivalently, with branching events) and the set of pairs where is a ranked binary tree with leaves and is a permutation of consisting of disjoint cycles.
From a RTCN to a ranked binary tree and permutation.
Given a RTCN with leaves labeled to , define its profile as the sequence where if and only if the -th event from the bottom is a branching event (see left of Figure 8). Note that we necessarily have , as only two lineages (i.e., vertical edges) are available to merge for the last event. Suppose , that is, that the RTCN under consideration has reticulation events, and let be the indices such that , corresponding to the reticulation events in question (note that the reticulation event happens at time , with the notation from Figure 1-(b)).
Next, note that we can consider lineages above (and below) each event to be naturally ordered as follows. Assuming that lineages just below event are labeled to and that the event is a branching event, label the lineages just above the event with the integers to so that their order reflects the order after the branching event, with the parent lineage identified with its child whose label is smaller. Similarly, assuming is a reticulation event, order the lineages present just above the event consistently with their order after the event (ignoring the hybrid lineage resulting from the reticulation event). This is by no means the only possible convention (other conventions would yield equivalent bijections), but we may use it to determine the left-to-right order of lineages when the RTCN is drawn in the plane (so that we can say that the lineages above event are labeled “left-to-right” with the numbers to ). The planar embedding induced by this convention is depicted on the right of Figure 8: the RTCN on the left is redrawn on the right with lineages ordered left to right according to their current labels, which are explicitly indicated above and below each event; in the representation on the right, parent lineages of reticulation events are drawn directly above the respective non-hybrid child lineages.
In order to obtain a ranked binary tree, we will replace each of the reticulation events of the RTCN with a branching event; at the same time, we shall keep track of additional information allowing the recovery of the original reticulation event.

Build the corresponding ranked binary tree to our RTCN by considering in turn the events . The result of the reticulation event is given by two lineages , labeled above the reticulation event, and a third hybrid lineage ; replace this by a branching event that yields from a single parent lineage and leaves intact. Some convention is needed in order to match lineages to lineages ; we identify with and assign the evolution of the hybrid lineage to the lineage (Figure 9). Labels assigned to lineages below event are unchanged by this procedure, but labels above the event are recomputed according to the rules we explained previously. In particular, labels immediately above the event range between and ; the lineage that used to be labeled is now labeled and the lineage that used to be labeled is now labeled where the values of depend on : if , then and ; if , then and ; if , then and .
The result of this operation is a RTCN with reticulation events. In order to be able to reinstate the original reticulation, it is enough to keep track of (identifying the branching event to be turned back into a reticulation event) and a number that identifies the lineage to turn back into the original hybrid. Note that the label , for example, would allow the recovery of the original RTCN; is an integer between and , but cannot take all values in the range independently of , as it cannot be equal to , that is, to the label of the parent lineage of the branching event . We therefore define the number as if and otherwise. This way, we have ; since can be determined from , the pair is enough to recover and thus the original RTCN (as we will further discuss later).
We repeat this operation for each reticulation to obtain a final ranked binary tree and a sequence where : see Figure 10 for an example.
This sequence of pairs can be naturally encoded by the permutation in the symmetric group (where we adopt the convention that composition of permutations is obtained via multiplication from the right); this is the product of transpositions such that and . The latter condition is actually equivalent to the permutation being a product of disjoint cycles, as clarified by the following lemma:
Lemma 1.
For all integers , a permutation is the product of disjoint cycles if and only if it can be expressed as a product of transpositions of the form where and for ; moreover, if such an expression for exists, then it is unique.
Proof.
First of all, given a positive integer , we show the fact that any product of transpositions with the properties described in the statement is the product of disjoint cycles by induction on . Given a positive integer , consider a permutation of the form where and for ; assume that has cycles and that each in is the maximum element of the cycle it belongs to in the factorisation of as a product of disjoint cycles. Consider any permutation of the form where . The elements must belong to separate cycles in and each must be the maximum of its own cycle, as neither can be in . Denote by the cycle of containing and by the one containing ; the disjoint cycles of the permutation are those of , with the cycles and merged into one.
Note that the property that elements not in are maximal in their cycles holds for . Indeed, if some element is not in and does not belong to or , then it is maximal in its cycle in , and thus it is maximal in the same cycle of . The elements and , which do not appear in the list , are the maximums of the cycles and , respectively. It follows that the element is maximal in its cycle of , which is the union of and . Finally, suppose by contradiction that an element of the new cycle other than belongs to ; this would imply that it was maximal in its cycle in , and therefore in or ; it would thus belong to , which we have excluded.
It follows from the fact that the identity of (the product of zero transpositions) consists of disjoint cycles of length one (and is thus such that each element in is maximal in its cycle) that any permutation of the form described in the statement is the product of disjoint cycles.
Conversely, consider a permutation made up of disjoint cycles; let us determine positive integers between and such that , for , and . The element must necessarily be the smallest one that is not fixed by . The element must be such that , as is not moved by any of the other transpositions. Consequently, we have . We can repeat this procedure recursively on the permutation (which fixes all elements up to and including ) to obtain the subsequent transpositions .
As a result, from our initial RTCN we have constructed a pair given by a ranked binary tree and a permutation in which, by applying Lemma 1 with , we find to be a product of as many disjoint cycles as the number of branching events in the original RTCN (that is, ). We have already sketched how we can revert each change made to recover the original RTCN, but we give a more complete description of the inverse construction below.
From a ranked binary tree and a permutation to a RTCN.
We are given a ranked binary tree with labeled leaves and a permutation in with cycles.
First of all, by Lemma 1 we can express in a unique way as a product of transpositions of the form previously described, as with and for . We then obtain our pairs as , .
All that is left to do is replace the branching events in our ranked binary tree (numbered from the bottom) by reticulation events. We can still number our lineages from to above branching event (again, think of the order induced by assigning to the parent the label of the smaller child).
For in turn, consider lineages to right above branching event and let be the label assigned to the parent of the branching event ; pick the past of lineage where if and otherwise, to become the past of the larger child issued from the branching event (while the past of the branching event becomes the past of the smaller child), and make the future of lineage into a hybridised version of the two lineages involved in the branching event.
This operation yields a RTCN and is precisely the inverse of the one previously described (see again Figure 10 where one can follow the steps backwards to recover the RTCN from the final ranked binary tree).
4 Ranked tree containment and phylogenetic trees
In this final section, we give a bijection between the set of RTCNs which contain a fixed ranked tree and the set of phylogenetic trees; see the last two paragraphs of Section 1.
We first recall how all RTCNs which contain a fixed ranked tree are constructed from ; see [1] where this was used to prove (2). Starting from the leaves and moving back in time, for every branching event of a decision is taken whether or not the branching event is turned into a reticulation event and if yes, which of the (future) lineages not in the branching event is connected with which of the (future) lineages from the branching event; see Figure 11-(b) for an example. We will call the resulting RTCN and will directly translate each step in the above process of building from into a corresponding step for building a phylogenetic tree (time will, however, be reversed).

In order to describe this, we first draw the ranked tree so that all the leaves are in increasing order when read from left to right; see Figure 11-(a). Then, we consider the resulting (with the same order of the leaves). Now, starting from a tree consisting of a root with one child labeled by , we build the (rooted) tree with leaf labeled by and internal nodes labeled by where is the number of labels of and we assume that , by processing the events from top-down. More precisely, for the -th event in , we do the following:
- (a)
-
If the -th event is a branching event, we insert a node with label into the root edge of and attach to this node a leaf with label ;
- (b)
-
If the -th event is a reticulation event that was created from the -th branching event of by connecting the -th (future) lineage (not counting the lineages of the -th branching event), then, we insert a node with label into one of the edges to the children of the internal node with label of where the child with the smaller (larger) label is chosen depending on whether the left or the right lineage of the -th branching event of was used to form the reticulation event; moreover, we again attach a leaf with label to the inserted node.
See Figure 12 for a visualization of the the above construction for the RTCN from Figure 11-(b).

Next, we remove the labels of all international nodes and the root edge of . The resulting tree is a phylogenetic tree; see the tree on the right of Figure 13. This tree is the image of .

Finally, it is not hard to see that the above construction is reversible: e.g., if the phylogenetic tree on the right of Figure 13 is given, find the largest label (); its parent will have label . Then, remove the largest label, the parent label and the edge which joins them. Continue until all internal nodes have received a label (where the number of the label is decreased by in every step). This gives the right-most tree in Figure 12 which unambiguously encodes the whole tree construction process from Figure 12. Clearly, this process can be used to re-construct .
Acknowledgments
We thank François Bienvenu for putting us into contact. Moreover, we thank the two anonymous referees for a careful reading. The second author acknowledges financial support by the Ministry of Science and Technology, Taiwan under the research grant MOST-109-2115-M004-003-MY2; the third author was partially supported by the same funding agency under the grant MOST-110-2115-M-017-003-MY3.
References
- [1] F. Bienvenu, A. Lambert, M. Steel. Combinatorial and stochastic properties of ranked tree-child networks, Random Struct. Algor., in press. https://doi.org/10.1002/rsa.21048
- [2] M. Bouvel, P. Gambette, M. Mansouri (2020). Counting phylogenetic networks of level 1 and 2, J. Math. Biol., 81:6-7, 1357–1395.
- [3] C. McDiarmid, C. Semple, D. Welsh (2015). Counting phylogenetic networks, Ann. Comb., 19:1, 205–224.
- [4] M. Fuchs, B. Gittenberger, M. Mansouri (2019). Counting phylogenetic networks with few reticulation vertices: tree-child and normal networks, Australas. J. Combin., 73, 385–423.
- [5] M. Fuchs, B. Gittenberger, M. Mansouri (2021). Counting phylogenetic networks with few reticulation vertices: exact enumeration and corrections, Australas. J. Combin., 82:2, 257–282.
- [6] M. Fuchs, E.-Y. Huang, G.-R. Yu. Counting phylogenetic networks with few reticulation vertices: a second approach, arXiv:2104.07842.
- [7] M. Fuchs, G.-R. Yu, L. Zhang (2021). On the asymptotic growth of the number of tree-child networks, European J. Combin., 93, 103278, 20pp.
- [8] M. Fuchs, G.-R. Yu, L. Zhang. Asymptotic enumeration and distributional properties of galled networks, arXiv:2010.13324.
- [9] D. H. Huson, R. Rupp, C. Scornavacca. Phylogenetic Networks: Concepts, Algorithms and Applications, Cambridge University Press, 1st edition, 2010.
- [10] C. Semple and M. Steel. Phylogenetics, Oxford University Press, Oxford, 2003.
- [11] M. Steel. Phylogeny—Discrete and Random Processes in Evolution, CBMS-NSF Regional Conference Series in Applied Mathematics, 89, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2016.