The leftmost column of ordered Chinese Restaurant Process up-down chains: intertwining and convergence
Abstract
Recently there has been significant interest in constructing ordered analogues of Petrov’s two-parameter extension of Ethier and Kurtz’s infinitely-many-neutral-alleles diffusion model. One method for constructing these processes goes through taking an appropriate diffusive limit of Markov chains on integer compositions called ordered Chinese Restaurant Process up-down chains. The resulting processes are diffusions whose state space is the set of open subsets of the open unit interval. In this paper we begin to study nontrivial aspects of the order structure of these diffusions. In particular, for a certain choice of parameters, we take the diffusive limit of the size of the first component of ordered Chinese Restaurant Process up-down chains and describe the generator of the limiting process. We then relate this to the size of the leftmost maximal open subset of the open-set valued diffusions. This is challenging because the function taking an open set to the size of its leftmost maximal open subset is discontinuous. Our methods are based on establishing intertwining relations between the processes we study.
Introduction
The construction and analysis of ordered analogues of Petrov’s [15] two-parameter extension of Ethier and Kurtz’s [3] infinitely-many-neutral-alleles diffusion model has recently attracted significant interest in the literature [7, 8, 20, 21, 23]. Recall that the diffusions constructed in [15] are a family of Feller diffusions on the closure of the Kingman simplex
whose generator acts on the unital algebra generated by , by
In [20], for each , , and , we constructed a Feller diffusion whose state space is the set of open subsets of such that the ranked sequence of lengths of maximal open intervals in is an diffusion. This was done by considering the scaling limit of up-down chains associated to the ordered Chinese Restaurant Process.
While many interesting properties of can be obtained from the corresponding properties for diffusions, properties that depend on the order structure cannot be. In this paper we begin to study nontrivial aspects of the order structure of these diffusions. Motivated by [6, Theorem 2 and Theorem 19] and [5, Theorem 5], which consider similar properties in closely related tree-valued processes, we consider the evolution of the left-most maximal open interval of in running in its -Poisson-Dirichlet interval partition stationarity distribution. Recall that the -Poisson-Dirichlet interval partition is the distribution of where is a -dimensional Bessel process started from . We prove the following result.
Theorem 1.1.
Define by . If is running in its -Poisson-Dirichlet interval partition stationarity distribution, then is a Feller process. Moreover, the generator of its semigroup is given by
for , where the domain of consists of functions satisfying
-
(D1)
and extends continuously to ,
-
(D2)
and
-
(D3)
as .
We consider only the case because the known stationary distribution of is an -Poisson-Dirichlet interval partition and, except in the case, with probability 1 interval partitions with these distributions do not have left-most maximal open intervals. We remark that our theorem statement could be slightly simpler if we knew that had a unique stationary distribution, but this is currently an open problem.
Our proof is based on taking the scaling limit of the left-most coordinate in an up-down chain on compositions based on the ordered Chinese Restaurant Process, which are the same chains that were used in [20] to construct .
Definition 1.1.
For , a composition of is a tuple of positive integers that sum to . The composition of is the empty tuple, which we denote by . We write and when is a composition of with components. We denote the set of all compositions of by and their union by .
An up-down chain on is a Markov chain whose steps can be factored into two parts: 1) an up-step from to according to a kernel followed by 2) a down-step from to according to a kernel . The probability of transitioning from to can then be written as
(1) |
Up-down chains on compositions, and more generally, on graded sets, have been studied in a variety of contexts [2, 6, 9, 10, 11, 15, 16], often in connection with their nice algebraic and combinatorial properties.
In the up-down chains we considered, the up-step kernel is given by an -ordered Chinese Restaurant Process growth step [18]. In the Chinese Restaurant Process analogy, we view as an ordered list of the number of customers at occupied tables in a restaurant, so that is the number of customers at the table on the list. During an up-step, a new customer enters the restaurant and chooses a table to sit at according to the following rules:
-
•
The new customer joins table with probability , resulting in a step from to .
-
•
The new customer starts a new table directly after table with probability , resulting in a step from to .
-
•
The new customer starts a new table at the start of the list with probability , resulting in a step from to .
We note that, for consistency with [7, 8], this up-step is the left-to-right reversal of the growth step in [18].
The down-step kernel we consider can also be thought of in terms of the restaurant analogy. During a down-step, a seated customer gets up and exits the restaurant according to the following rule:
-
•
The seated customer is chosen uniformly at random, resulting in a step from to with probability (the coordinate is to be contracted away if , or if the table is no longer occupied).
Note that, in contrast to the up-step, the down-step does not depend on .
Let be a Markov chain on with transition kernel defined as in Equation (1) using the and just described. A Poissonized version of this chain was considered in [21, 23]. It can be shown that is an aperiodic, irreducible chain. We denote its unique stationary distribution by and note that this is the left-to-right reversal of the -regenerative composition structures introduced in [12].
The projection for gives rise to the leftmost column processes, defined by . Let , the distribution of the leftmost column when the up-down chain is in stationarity. The following result, interesting in its own right, is a key step in our proof of Theorem 1.1.
Theorem 1.2.
For , let be a distribution on . Then, for all , the up-down chain can be initialized so that is a Markov chain with initial distribution . Moreover, for any such sequence of initial conditions for , if the sequence has a limiting distribution , then we have the convergence
in the Skorokhod space , where is a Feller process with generator (as in Theorem 1.1) and initial distribution .
While there are many ways to prove a result like Theorem 1.2, we take an approach based on the algebraic properties of the ordered Chinese Restaurant Process up-down chains. In particular, our proof is based on the following surprising intertwining result. For a positive integer and composition , we use the notation as a shorthand for the composition .
Theorem 1.3.
For , let be the transition kernel from to given by
and let be the transition kernel from to given by
If the initial distribution of is of the form for some distribution on , then the process is Markovian. In this case, the following intertwining relations hold:
-
(i)
where is the transition kernel of , and
-
(ii)
for , where is the semigroup generated by the operator defined in Theorem 1.1 and denotes the identity operator.
This paper is organized as follows. In Section 2, we show that the leftmost column process is intertwined with its corresponding up-down chain and describe its transition kernel explicitly. This establishes part of Theorem 1.3. In Section 3, we state a condition under which the convergence of Markov processes can be obtained from some commutation relations involving generators. In Section 4, we analyze the generator of the limiting process. In Section 5, we show that our generators satisfy the commutation relations appearing in the result of Section 3. In Section 6, we verify the convergence condition appearing in the result in Section 3. In Section 7, we provide general conditions under which commutation relations involving generators lead to the corresponding relations for their semigroups. Finally in Section 8, we prove Theorems 1.1, 1.2, and 1.3.
The following will be used throughout this paper. For a compact topological space , we denote by the space of continuous functions from to equipped with the supremum norm. Finite topological spaces will always be equipped with the discrete topology. Any sum or product over an empty index set will be regarded as a zero or one, respectively. The set of positive integers will be denoted by . The falling factorial will be denoted using factorial exponents – that is, for a real number and nonnegative integer , and by convention. The rising factorial will be denoted by . We denote the gamma function by . Multinomial coefficients will be denoted using the shorthand
The Leftmost Column Process
Our study of the leftmost column process will be mainly focused on the case. However, it will be useful to study the distribution of the leftmost column process when the up-down chain is in stationarity. As we will see, this distribution has a role in the evolution of the process.
Proposition 2.1.
The stationary distribution of is given by
Moreover, the following consistency conditions hold:
(2) |
Proof.
Proposition 2.2.
If has distribution , then has distribution
Proof.
Let and . It can be verified that
(3) |
Summing over concludes the proof. ∎
Let and . Consider taking an up-step from followed by a down-step. Let be the event in which this up-step stacks a box on the first column of , and let be the event in which the down-step removes a box from the first column of a composition. Then, , , , , and do not depend on . Indeed, we have the formulas
(4) |
We use these formulas to define and Moreover, we extend to be zero for all other integer arguments and .
The following is a useful identity relating the transition kernels of the and chains.
Proposition 2.3.
For and , we have the identity
Proof.
Fix and in . Let be the composition obtained by performing an up-step from and be the composition obtained by performing a down-step from . As before, let be the event in which the up-step adds to the first column of a composition and be the event in which the down-step removes from the first column of a composition. Then, we have that
and
To obtain the identity, we note that
and rewrite this probability by conditioning on the above sets. Of particular importance will be the following observations: the conditional distribution of given and is and conditionally given , is independent of and has distribution We also make use of the fact that the events and are identical, since the size of is known to be .
Our first conditional probability is given by
Next, we will condition on . Notice that this is a null set when . When , we have
Conditioning on will require two cases. For , we have
and for , we have
Finally, we condition on . We have that
Collecting the terms above with the appropriate terms in (4) establishes the result. ∎
Let . We define a transition kernel from to by
and a transition kernel from to by
Proposition 2.4.
For , the transition kernel satisfies
(5) |
Consequently, if the initial distribution of is of the form , then is a time-homogeneous Markov chain with transition kernel . Moreover, the transition kernel is given explicitly by
Proof.
Let be the kernel on defined by the right side of the above equation. Fix and . Using Proposition 2.3 and the identities (2) and (3), we compute
The final equality follows from the fact that is supported on . This establishes the identity Observing that is the identity kernel on , we find that
Convergence from Commutation Relations
In this section, we provide a condition under which commutation relations between operators implies the convergence of those operators in an appropriate sense. In the interest of generality, we first state this condition in the setting of Banach spaces, but we then reformulate it in the context of Markov processes to suit our purposes. The general setting is as follows.
Let be Banach spaces and be uniformly bounded linear operators with . These spaces will be equipped with the following mode of convergence.
Definition 3.1.
A sequence with converges to an element (and we write ) if
where for convenience, we denote every norm by the same symbol .
Proposition 3.1.
For , let and be linear operators in addition to . Suppose that for every ,
-
(i)
for large , and
-
(ii)
as (the sequence need only be defined for large ).
Then for , the sequence (defined for large ) satisfies
Proof.
Let and be large enough so that (i) holds. In particular, we can define . Writing
it is clear that . Writing
and noting that , we obtain the other convergence. ∎
In the probabilistic context, the above result has some additional consequences.
Theorem 3.1.
Let be a compact, separable metric space, be the generator of the Feller semigroup on , and be a core for that is invariant under . For each , let be a finite set endowed with the discrete topology, be a Markov chain on , be any function, and be a linear operator. Denote the transition operator of by and the projection by . Let and be positive sequences converging to zero such that Suppose that for , the following statements hold:
-
(a)
for large , and
-
(b)
as (the sequence need only be defined for large ).
Then,
-
(i)
the discrete semigroups converge to in the following sense: for all and ,
-
(ii)
the above convergence is uniform in on bounded intervals, and
-
(iii)
if is conservative and the distributions of converge, say to , then we have the convergence of paths
in the Skorokhod space , where is a Feller process with initial distribution and generator .
Proof.
This is a combination of Proposition 3.1 and standard convergence results. In particular, for , we can define the sequence for large and obtain the convergence
The Limiting Generator
In this section, we introduce the generator of a Feller process on that will be identified as the limiting process. We describe this generator both on a core of polynomials and on its full domain. However, the core description is sufficient for the analysis that will follow.
Let denote the space of polynomials on equipped with the supremum norm. We will study the operator and the functional given by
and
(6) |
Letting we define a family of polynomials by
Note that and has degree . Moreover, these polynomials are related to the Jacobi polynomials and the shifted Jacobi polynomials [19, 24] by the identity
Proposition 4.1.
Let and for The following statements hold:
-
(i)
for all ,
-
(ii)
the family is a Hamel basis for , and
-
(iii)
is a dense subspace of .
Proof.
The claim in (i) can be obtained from the classical theory of Jacobi polynomials (e.g. (4.1.3), (4.21.2), and (4.21.4) in [24]).
Noting that has degree shows that the family is independent. Since , it clearly lies in . To see that the other also lie in , we use (i) to identify them as elements in the range of and observe that this range lies in . Indeed, this can be verified using (4): for , we have that
To obtain equality from the containment we observe that the former space is a maximal subspace of (it has codimension one) while the latter is a proper subspace of .
Proposition 4.2.
The operator is closable and its closure, , is the generator of a Feller semigroup on .
Proof.
We show that satisfies the conditions of the Hille-Yosida Theorem. For , Proposition 4.1(i)-(ii) show that the range of is exactly . Proposition 4.1(iii) then tells us that this range, as well as the domain of , is dense in .
To establish the positive-maximum principle, suppose that has a nonnegative maximum at . If , the tools of differential calculus show that as desired. When , consider the element given by
almost everywhere. Since on , the norm of is given by
Recalling that , it follows that almost everywhere. Together with the continuity of , this implies that and consequently, ∎
The final result in this section is the explicit description of the generator and its domain .
To begin, we define an operator by
We will write whenever can be continuously extended to . Recalling the definition of and from Theorem 1.1, we see that is the restriction of to . We also define functions and by
and
Note that admits the factorization
from which we obtain the formula
(7) |
Another identity that will be useful is
(8) |
Proposition 4.3.
The identity holds, where is as defined in Theorem 1.1.
Proof.
We begin by showing that the following holds:
(9) |
To do this, we will take limits in (7). First we take the limit . The term converges to zero due to (D3) (see Theorem 1.1). The limit of the integral is handled by the dominated convergence theorem – a suitable bound follows from the boundedness of and (8). This establishes the formula for . Taking now the limit (the dominated convergence theorem can be applied as before) establishes the case. The case is trivial.
Now we show that Fixing , there exists a sequence of functions in such that
(10) |
Noting that for all , we can apply (9). In this case, the identity yields
(11) |
Using (10) and the dominated convergence theorem, we can take the limit above. A suitable bound follows from the boundedness of the sequence and (8). We obtain
(12) |
Together with the fact that and this expression implies that . Differentiating the expression yields the identity
(13) |
for all and extend this to by taking the limit . Once again, we apply the dominated convergence theorem. A preliminary bound can be obtained from (8) and (11):
The boundedness of the sequence then provides a suitable bound.
We have shown that and on (see (13)). Therefore, it only remains to show that From Lemma 19.12 in [13], it suffices to show that satisfies the positive maximum principle. To this end, suppose that has a nonnegative maximum at . If , then the desired inequality can be obtained as in Proposition 4.2. If , we use (D1), L’Hôpital’s rule, (D3), and (8) to establish the existence of limits
Noticing that concludes the proof. ∎
Generator Relations
In this section, we show that our generators satisfy the commutation relations appearing in Theorem 3.1. Here, we rely on an alternative description of the limiting generator in terms of Bernstein polynomials.
For , let be the subspace of consisting of polynomials with degree at most . Similarly, define
Recall the Bernstein polynomials
Note that whenever or . For each , the collection forms a basis of and a partition of unity – that is, We also have the relations
(14) | ||||
(15) |
and
(16) |
which hold whenever the relevant quantities are defined.
For , we define a transition kernel from to by
Proposition 5.1.
Let . As an operator from to , is injective and
(17) | ||||
(18) |
Proof.
Let . From the independence of the Bernstein polynomials and the identity
it follows that the range of is an -dimensional space. As a result, is injective. Observing that the right hand side of (17) has dimension at most and contains the range of , it follows that these two spaces are equal. Since also has dimension (see Proposition 4.1(ii)), it only remains to show that the range of is contained in . The containment in is clear. For the containment in , we simply compute, for ,
Proposition 5.2.
The action of on the Bernstein polynomials is given by
Proof.
Let and Applying (14) twice, we see that
Applying now (16), we have that
(19) |
As a result,
Recalling that is zero unless and unless , we can change the lower and upper limits of the sum to and , respectively. This establishes the case. When , we observe that (20) still holds and the first and last quantities of (LABEL:scaledBernsteinSecondDerivativeExpansion) are still equal. When , the claim is trivial. ∎
Proposition 5.3.
For , the following relation holds on :
The Convergence Argument
In this section, we verify the convergence condition appearing in Theorem 3.1. We rely on a description of the inverse of the transition operator in terms of a variant of the Bernstein polynomials.
These variants fall into the class of degenerate Bernstein polynomials [14] and are given by
Proposition 6.1.
For , we have the expansions
Proof.
The expansions of a Bernstein polynomial in the Bernstein bases are given in Equation (2) in [19]. Let us verify that the coefficients in those expansions match the coefficients in the above expansions. Fix . The coefficient of in the above expansion is given by
When or , it is clear that this coefficient is zero. If instead , this coefficient is reduces to
In either case, this coefficient agrees with the coefficient in [19]. ∎
Let be defined by and be the associated projection,
Proposition 6.2.
For , we have the identity
Proof.
It follows from definition that
Meanwhile, Proposition 6.1 gives us the expansion
Upon comparison, we find that the coefficient of is the same in both expressions whenever . Since both functions lie in , the coefficients of must agree as well (see (17)). As a result, the two functions are equal. ∎
Proposition 6.3.
For , we have the convergence
Proof.
We write
and handle each factor separately. The constants converge to and each factor in a product converges to either or . ∎
Proposition 6.4.
Let and fix such that . Then we have the convergence
in the sense of Definition 3.1.
Semigroup Relations from Generator Relations
In this section, we provide general conditions under which commutation relations involving generators lead to the corresponding relations for their semigroups.
Theorem 7.1.
Let and be the generators of the Feller semigroups and , respectively, and let and denote their respective domains. Suppose that there is a subspace , a linear operator , and a set such that
-
(i)
is bounded,
-
(ii)
is unbounded,
-
(iii)
for , and
-
(iv)
on .
Then on for each .
Proof.
Fix and let and be the resolvent operators corresponding to and respectively. It follows from (iii) that is invariant under . Combining this with (iv), we obtain the following relation on :
It then follows easily that
or equivalently, on , where and are the Yosida approximations of and respectively. Noting that is invariant under , this extends to nonnegative integers :
Applying now (i), we have for and the identity
Letting become arbitrarily large (see (ii)) yields This establishes the result on . The extension to follows from the boundedness of . ∎
Corollary 7.1.
Let and be the generators of the Feller semigroups and , respectively, and let and denote their respective domains. Suppose that there is a subspace , a linear operator , and a filtration of by finite dimensional spaces such that
-
(i)
for all , and
-
(ii)
on .
Then on for each .
Proof.
Let . It follows from (i) that is invariant under the injective operators . Together with the fact that is finite-dimensional, this implies that
Letting denote the restriction of to , it follows from (i) and (ii) that
Since is finite-dimensional, is bounded and Applying Theorem 7.1, we find that on for each . Taking a union over extends the identity to . ∎
Proofs of Main Results
Proof of Theorem 1.3.
The first claim was proved in Proposition 2.4. For the second claim, we appeal to Corollary 7.1. We take , , , and for all . The containment holds trivially and the identity was established in Proposition 5.3. Applying Corollary 7.1, we obtain the desired identity in terms of transition operators, which implies the same relation in terms of transition kernels. ∎
Proof of Theorem 1.2.
The claim about the existence of initial distributions for follows from Theorem 1.3. The second claim follows from applying Theorem 3.1 with , , , , , , , , , and . To verify that is the generator of a conservative Feller semigroup on , is a core for , and is invariant under , we appeal to Propositions 4.3, 4.2, and 4.1. Condition (a) can be obtained from the identity in Proposition 5.3 by recalling that is injective (see Proposition 5.1) and that each in lies in for large . Condition (b) is exactly the result of Proposition 6.4. ∎
Proof of Theorem 1.1.
Define by
From [20, Theorem 1.3], we have that if
then
where is the integer part of and the convergence is in distribution on the Skorokhod space , where the metric on is given by the Hausdorff distance between the complements (complements being taken in ). If were continuous, the result would follow immediately, but is discontinuous. However, it is straightforward to show that if in and , then .
Assuming now that is running in stationarity, the fact that converges in distribution to an Poisson-Dirichlet interval partition distribution follows from [18] and the fact that is a Markov chain follows from Theorem 1.3. Observe that is the stationary distribution of and, in the ordered Chinese Restaurant Process growth step, no new table is ever created at the start of the list. Thus, for every , is distributed like the size of the table containing in the usual Chinese Restaurant Process after customers are seated, see [17]. Consequently, since our chain is stationary, for each ,
where has a Beta distribution, see [17].
Therefore, from Theorem 1.2 with as defined there and , passing to a subsequence if necessary, and using the Skorokhod representation theorem, we may assume that
in . Fix . Since Feller processes have no fixed discontinuities, is almost surely continuous at and, therefore, since convergence in implies convergence at continuity points,
Since , and, since
it follows that . Consequently, is a modification of and since has a Feller semigroup, so does . ∎
References
- [1] Béla Bollobás. Linear Analysis: An Introductory Course. Cambridge University Press, 2 edition, 1999.
- [2] Alexei Borodin and Grigori Olshanski. Infinite-dimensional diffusions as limits of random walks on partitions. Probab. Theory Related Fields, 144(1-2):281–318, 2009.
- [3] S. N. Ethier and Thomas G. Kurtz. The infinitely-many-neutral-alleles diffusion model. Adv. in Appl. Probab., 13(3):429–452, 1981.
- [4] Stewart N. Ethier and Thomas G. Kurtz. Markov processes: characterization and convergence. Wiley series in probability and mathematical statistics. J. Wiley & Sons, New York, Chichester, 2005.
- [5] Noah Forman, Soumik Pal, Douglas Rizzolo, and Matthias Winkel. Interval partition evolutions with emigration related to the Aldous diffusion. arXiv preprint arXiv:1804.01205, 2018.
- [6] Noah Forman, Soumik Pal, Douglas Rizzolo, and Matthias Winkel. Projections of the Aldous chain on binary trees: intertwining and consistency. Random Structures Algorithms, 57(3):745–769, 2020.
- [7] Noah Forman, Soumik Pal, Douglas Rizzolo, and Matthias Winkel. Diffusions on a space of interval partitions: Poisson-Dirichlet stationary distributions. Ann. Probab., 49(2):793–831, 2021.
- [8] Noah Forman, Douglas Rizzolo, Quan Shi, and Matthias Winkel. Diffusions on a space of interval partitions: The two-parameter model. arXiv:2008.02823, 2020.
- [9] Jason Fulman. Commutation relations and Markov chains. Probab. Theory Related Fields, 144(1-2):99–136, 2009.
- [10] Jason Fulman. Mixing time for a random walk on rooted trees. Electron. J. Combin., 16(1):Research Paper 139, 13, 2009.
- [11] Han L. Gan and Nathan Ross. Stein’s method for the Poisson-Dirichlet distribution and the Ewens Sampling Formula, with applications to Wright-Fisher models. arXiv:1910.04976, 2020.
- [12] Alexander Gnedin and Jim Pitman. Regenerative composition structures. Ann. Probab., 33(2):445–479, 2005.
- [13] Olav Kallenberg. Foundations of Modern Probability. Probability and its Applications (New York). Springer-Verlag, New York, 2002.
- [14] Taekyun Kim and Dae san Kim. Degenerate Bernstein polynomials, 2018.
- [15] L. A. Petrov. A two-parameter family of infinite-dimensional diffusions on the Kingman simplex. Funktsional. Anal. i Prilozhen., 43(4):45–66, 2009.
- [16] Leonid Petrov. operators and Markov processes on branching graphs. J. Algebraic Combin., 38(3):663–720, 2013.
- [17] J. Pitman. Combinatorial stochastic processes, volume 1875 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2006. Lectures from the 32nd Summer School on Probability Theory held in Saint-Flour, July 7–24, 2002.
- [18] Jim Pitman and Matthias Winkel. Regenerative tree growth: binary self-similar continuum random trees and Poisson–Dirichlet compositions. Ann. Probab., 37(5):1999–2041, 2009.
- [19] Abedallah Rababah. Jacobi-Bernstein basis transformation. Computational Methods in Applied Mathematics, 4:206–214, 06 2004.
- [20] Kelvin Rivera-Lopez and Douglas Rizzolo. Diffusive limits of two-parameter ordered Chinese Restaurant Process up-down chains, 2020. arXiv:2011.06577.
- [21] Dane Rogers and Matthias Winkel. A Ray-Knight representation of up-down Chinese restaurants. arXiv:2006.06334, 2020.
- [22] L. C. G. Rogers and J. W. Pitman. Markov functions. The Annals of Probability, 9(4):573–582, 1981.
- [23] Quan Shi and Matthias Winkel. Up-down ordered Chinese restaurant processes with two-sided immigration, emigration and diffusion limits. arXiv preprint arXiv:2012.15758, 2020.
- [24] Gábor Szegö. Orthogonal Polynomials. American Mathematical Society, Providence, RI, 1975.