Approximation of birth-death processes
Abstract.
The birth-death process is a special type of continuous-time Markov chain with index set . Its resolvent matrix can be fully characterized by a set of parameters , where and are non-negative constants, and is a positive measure on . By employing the Ray-Knight compactification, the birth-death process can be realized as a càdlàg process with strong Markov property on the one-point compactification space , which includes an additional cemetery point . In a certain sense, the three parameters that determine the birth-death process correspond to its killing, reflecting, and jumping behaviors at used for the one-point compactification, respectively.
In general, providing a clear description of the trajectories of a birth-death process, especially in the pathological case where , is challenging. This paper aims to address this issue by studying the birth-death process using approximation methods. Specifically, we will approximate the birth-death process with simpler birth-death processes that are easier to comprehend. For two typical approximation methods, our main results establish the weak convergence of a sequence of probability measures, which are induced by the approximating processes, on the space of all càdlàg functions. This type of convergence is significantly stronger than the convergence of transition matrices typically considered in the theory of continuous-time Markov chains.
Key words and phrases:
Birth-death processes, Continuous-time Markov chains, Ray-Knight compactification, Boundary conditions, Weak convergence, Skorohod topology, Skorohod representation
2020 Mathematics Subject Classification:
Primary 60J27, 60J40, 60J50, 60J46, 60J55, 60J74.
1. Introduction
The birth-death process is a specific type of continuous-time Markov chain with index set . Its -matrix is given by (2.1), and its key characteristic is that it can only transition between adjacent states in . Building upon the work of Feller [7], Yang demonstrated that all birth-death processes can be obtained by solving the resolvent matrix (see [20, Chapter 7]). In essence, each birth-death process is determined by a set of parameters that satisfy certain conditions, where and are two non-negative constants, and is a positive measure on . For further details, please refer to §2.3.
In the study of continuous-time Markov chains, the index set is typically equipped with the discrete topology and considered as the state space of the corresponding process. This perspective is reasonable when our focus is limited to the transition matrix or objects related to the distribution of the process. However, when considering the trajectory of the process, specifically examining its measurability and regularity with respect to time , setting the state space as the discrete space can lead to certain “peculiar” phenomena. The most extreme example is the famous Feller-McKean chain (see [8]), in which every point in is an instantaneous state, meaning that the process does not stay at any point for a period of time. Notably, all the diagonal elements of its -matrix are infinity. In the context of the discrete topology, it is challenging to comprehend the trajectory of the Feller-McKean chain. However, in reality, the Feller-McKean chain can be derived by confining a regular diffusion process on to the set of rational numbers (see [18, III§23]). Regular diffusions, on the other hand, have a well-established theoretical characterization (see, for example, [12]). Consequently, it is not the Feller-McKean chain itself that is difficult to comprehend, but rather the difficulty arises solely from the setting of the discrete topology.
In the early stages of the development of continuous-time Markov chain theory, Doob realized that the index set is not sufficient to accommodate well-behaved process realizations (see [5]). He then found a so-called separable modification for each continuous-time Markov chain, with trajectories of this modification possessing Borel measurability, separability, and lower semicontinuity (see [3, II§4]). It should be noted that may need to be added to the state space as the compactification point of , becoming a state that process trajectories frequently visit (when the index set is , apart from , may also need to be added to the state space). Doob’s modification became a cornerstone of continuous-time Markov chain theory and has been widely applied in various aspects of related theory (see [3] and [20]). However, for some special examples, the one-point compactification topology of does not reveal the underlying structural essence of the model. In the case of the Feller-McKean chain, the one-point compactification topology of is far from the induced topology of the Euclidean metric on . Doob’s modification also falls short of capturing the regular diffusion that generates the Feller-McKean chain.
Another approach, proposed by Ray in 1959 (see [17]), can achieve this goal. It is now well known as the Ray-Knight compactification. Ray’s theory is applicable to almost all processes satisfying the Markov property. It uses the resolvent to introduce a new metric, expanding the state space in a way that completes the metric, and constructs a càdlàg process satisfying the strong Markov property on the new state space, called a Ray process. Ray’s approach can also be applied profoundly in the study of continuous-time Markov chains (see [18, Chapter 6] and [4, Chapter 9]), and in some aspects, it is even superior to using Doob’s separable modification. Specifically, when Ray-Knight compactification is applied to the Feller-McKean chain, it yields the regular diffusion used to construct the Feller-McKean chain (see [18, III (35.7)]).
In a previous article [14], we investigated all birth-death processes using the Ray-Knight compactification approach. The key advantage of this approach is that it enables us to realize every birth-death process as a càdlàg process that satisfies the strong Markov property on (see §2.1 for this symbol). Interestingly, for birth-death processes, Doob’s modification and the Ray-Knight method result in the same topological transformation. Additionally, with the exception of the relatively simple Doob processes (see, e.g., [14, §5]), all birth-death processes are Feller processes. Consequently, to study birth-death processes, we have expanded our toolbox beyond traditional methods for studying continuous-time Markov chains to incorporate the rich theory of general Markov processes.
The primary objective of the paper [14] is to provide a clear characterization of the trajectories of all birth-death processes, particularly their behavior near the boundary point . This issue has not been well-addressed in the existing literature based on continuous-time Markov chain methods, such as [20, 2], and others.
Using the framework of Feller processes, we demonstrated in [14] that the parameters determining a birth-death process reflect its different behaviors at the boundary point . From an analytical perspective, these behaviors are described by the boundary condition (2.13) satisfied by the functions in the domain of the infinitesimal generator. From a probabilistic standpoint, , , and respectively describe the killing, reflecting, and jumping behaviors of the birth-death process at . This probabilistic interpretation is clear when the jumping measure is finite. More precisely, the Doob process corresponds to the case of and , where is the total variation of . It can be obtained by the piecing out transformation (see [11]) of the minimal birth-death process with respect to the distribution on given by
(1.1) |
Intuitively, whenever it is about to reach , the Doob process always jumps back to , and the probability of arriving at position is determined by (1.1). For the case of and , the relevant description requires adjusting the above formulation by considering the birth-death process corresponding to the parameters instead of , and by adapting the random time of approaching to the lifetime of , namely the time when enters the cemetery . Note that can be obtained as a subprocess of the -process (which plays a similar role to a reflecting Brownian motion on ; see [14, §3.3]) under the killing transformation using the multiplicative functional
where is the local time of the -process at . For a rigorous construction of the subprocess, see [1, III, §3].
However, in the case of , the construction of piecing out is no longer effective because given by (1.1) becomes meaningless. Similar to Lévy processes with infinite Lévy measures, in this case, the birth-death process may experience a high frequency of jumps into (from ) at certain times . Namely, for any , there are infinitely many jumps from to occurring within the time interval . This makes it extremely difficult to provide a clear description of the trajectories of the birth-death process. Feller referred to this situation as a “pathological case” in [7], probably for this reason.
In this article, we will explore how to use simpler birth-death processes to approximate complex birth-death processes. The significance of this investigation lies in the fact that if we can establish the convergence of the sequence of approximating processes, then even in the pathological case where determined by (1.1) lacks meaning, interpreting the target birth-death process through piecing out becomes intuitively acceptable.
Let us first discuss how to obtain simplified birth-death processes for approximation. One approach is to optimize the parameters of the target birth-death process and then to apply Feller-Yang’s resolvent approach to generate processes using the new parameters. The simplest example is to truncate the measure directly as follows:
(1.2) |
where represents the measure restricted to , and consider as the birth-death process determined by the parameters . Note that this approach has been briefly mentioned in [14, §9]. Another approach, proposed by Wang in his 1958 doctoral thesis (see [20]), differs significantly from the first approach but is important in the theory of birth-death processes. Wang constructed a sequence of Doob processes with instantaneous measures supported on by removing the part of each trajectory of the target birth-death process that starts from until it returns to . (The original purpose of this approximation method was to provide a probabilistic construction for all birth-death processes.) We will further elaborate on this approximation method in §6.1. Next, let us consider in what sense the constructed sequence of birth-death processes can converge to the target process. Wang’s research is based on the theory of continuous-time Markov chains, where the core object is the transition matrix that defines the birth-death process. Therefore, the established convergence also refers to the convergence of the transition matrices, i.e.,
(1.3) |
This convergence is equivalent to the convergence of the resolvents; see Theorem 3.2. In the context of general Markov processes, it is possible to extend our study. Since the trajectories of and are càdlàg, they can all be realized as probability measures on the space , which consists of all càdlàg functions on . This space is typically equipped with the Skorohod topology. Consequently, we can attempt to establish the weak convergence of this sequence of probability measures on . This weak convergence is significantly stronger than the convergence of the transition matrices (1.3).
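To make the truncation in (1.2) concrete at the level of the jumping measure alone, the following Python sketch restricts a measure on the non-negative integers to an initial segment and normalizes the result. The function names, the dictionary representation, and the harmonic weights are assumptions made purely for illustration. Whether such weights satisfy the admissibility conditions of §2.3 depends on the rates, which are not modelled here, and the normalization is assumed to be division by the total mass.

```python
from fractions import Fraction

def truncate(nu, n):
    """Restrict the measure with weights nu(k), k = 0, 1, 2, ..., to {0, ..., n}."""
    return {k: Fraction(nu(k)) for k in range(n + 1)}

def total_mass(weights):
    """Total mass of a finitely supported measure."""
    return sum(weights.values(), Fraction(0))

def normalise(weights):
    """Turn a finite, non-zero measure into a probability distribution."""
    mass = total_mass(weights)
    if mass == 0:
        raise ValueError("cannot normalise the zero measure")
    return {k: w / mass for k, w in weights.items()}

# Hypothetical weights with infinite total mass (harmonic series).
nu = lambda k: Fraction(1, k + 1)
nu_3 = truncate(nu, 3)      # finite measure on {0, 1, 2, 3}
pi_3 = normalise(nu_3)      # a genuine probability distribution
```

Each truncation has finite mass and can therefore be normalized, even though the masses of the truncations grow without bound in this example.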
The main goal of this paper is to establish the weak convergence of the probability measures on associated with both the optimization parameter approximation and Wang’s approximation. For the first type of approximation, the weak convergence has been easily established under additional assumptions of Feller properties in [14, §9]. However, when the approximating sequence consists of Doob processes, the discussion becomes more challenging. Our proof requires an analytical characterization of Doob processes, which, although not Feller processes, can still yield a strongly continuous contractive semigroup when restricted to a closed subspace of the space of continuous functions. This result will be proven in §4. As for Wang’s approximation, in addition to establishing weak convergence on the Skorohod topological space, we will also consider another topology on the space determined by convergence in (Lebesgue) measure. Under this new topology, we can prove both weak convergence and almost sure convergence of -valued random variables induced by the birth-death process sequence. In other words, in the topology determined by convergence in measure, Wang’s approximation not only satisfies weak convergence of the probability measure sequence, but also provides an intuitive construction of the corresponding Skorohod representation.
Finally, we would like to briefly explain the notation that will be frequently used in this paper. Since it is a follow-up study of [14], we will strive to maintain consistency with the notation used in that paper. However, for the sake of clarity, some symbols have been adjusted. For instance, all symbols related to the minimal birth-death process are denoted with a superscript , and those related to approximating birth-death processes are represented by superscript (n). (Symbols related to the target birth-death process do not carry any superscripts.) Sometimes, the integral of a function with respect to a measure is denoted by . Additionally, we need to correct an error in the derivation of boundary condition (2.13) in [14]. In this equation, the parameter in front of was incorrectly written as in [14]. This error occurred because the scale function used in [14], which is given by (2.6), is half of the scale function defined in Feller [7]. However, in the proof of [14, Theorem 6.3] (the equation above [14, (6.8)]), it was mistakenly overlooked that the result from [7] (i.e., (6.6) in this paper) needs to be multiplied by a factor of two.
2. Preliminaries of birth-death processes
We consider a birth-death density matrix as follows:
(2.1) |
where for and for . (Set for convenience.) A continuous-time Markov chain is called a birth-death -process (or simply a -process) if its transition matrix is standard, and its density matrix is , i.e., for . A -process is called honest if its transition matrix satisfies for all and . In our context, two -processes with the same transition matrix will not be distinguished. For convenience, we will also refer to as a -process when no confusion arises. For further terminology concerning continuous-time Markov chains, readers are referred to [3, 20]; see also [14].
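As a small illustration of the tridiagonal structure in (2.1), the following Python sketch builds a finite upper-left block of a birth-death density matrix. The names `build_q_block`, `birth`, and `death`, as well as the sample rates, are assumptions introduced for illustration only, and the block is a truncation rather than the infinite matrix itself.

```python
from fractions import Fraction

def build_q_block(birth, death, size):
    """Return the size x size upper-left block of a birth-death density matrix.

    birth(k) is the rate of the jump k -> k+1 and death(k) the rate of k -> k-1;
    death(0) is never evaluated and is treated as 0.
    """
    q = [[Fraction(0)] * size for _ in range(size)]
    for k in range(size):
        a_k = Fraction(death(k)) if k > 0 else Fraction(0)
        b_k = Fraction(birth(k))
        q[k][k] = -(a_k + b_k)          # diagonal entry: minus the total jump rate
        if k > 0:
            q[k][k - 1] = a_k           # jump to the left neighbour
        if k + 1 < size:
            q[k][k + 1] = b_k           # jump to the right neighbour
    return q

# Hypothetical sample rates, chosen only for illustration.
Q = build_q_block(birth=lambda k: 2 ** k, death=lambda k: 1, size=5)
```

Every row except the last sums to zero. The last row is missing its birth entry, which lies outside the truncated block.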
2.1. State space
The index set of the transition matrix is typically referred to as the minimal state space. The “real” state space of a -process is its one-point compactification (see, e.g., [14, §2.1])
where can be metrized with the metric
This establishes a topological homeomorphism between and the set
(2.2) |
equipped with the relative topology of . Since we do not always consider honest -processes, it is necessary to introduce a cemetery point , which lies outside the state space . It should be emphasized that, unless explicitly stated otherwise, is always treated as an isolated point distinct from . For instance, we can define the distance between a state and the cemetery point as (), where the inclusion of into is equivalent to adding the point to (2.2).
Given the metric , the set
forms a compact, separable metric space, and its subspace
is a locally compact, separable metric space. For every bounded function defined on either or , we define
Let denote the family of all continuous functions on , where we emphasize that for , the value of may not be equal to . Define
The families , , and consist of all continuous functions on that are bounded, that vanish at infinity, and that have compact support, respectively. In particular, a function defined on belongs to (resp. ) if (resp. if there exists an integer such that for all ). Further, we define and .
2.2. Minimal -process
Given the density matrix (2.1), there exists a particular birth-death process known as the minimal -process, which is denoted by and whose transition matrix is denoted by . Analytically, this process corresponds to the minimal solution to the Kolmogorov backward equation (as discussed in [7, §10]). From a probabilistic perspective, it represents a -process with minimal information, where the trajectories are terminated as they approach at the first time.
We introduce further notations for the minimal -process. Let denote the lifetime of , and define
(2.3) |
The resolvent matrix of this process is given by
(2.4) |
Then, it is straightforward to verify the following relation:
(2.5) |
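The resolvent in (2.4) can be explored numerically on a finite truncation. The sketch below assumes that the resolvent is the Laplace transform of the transition matrix, so that for a chain killed when it leaves {0, ..., N-1} it equals the matrix inverse of (alpha I - Q) on that block. This killed truncation is only a stand-in for the minimal process, and all function names and rates are hypothetical.

```python
from fractions import Fraction

def killed_generator(birth, death, size):
    """Sub-generator on {0, ..., size-1}; the birth jump out of size-1 acts as killing."""
    q = [[Fraction(0)] * size for _ in range(size)]
    for k in range(size):
        a_k = Fraction(death(k)) if k > 0 else Fraction(0)
        b_k = Fraction(birth(k))
        q[k][k] = -(a_k + b_k)
        if k > 0:
            q[k][k - 1] = a_k
        if k + 1 < size:
            q[k][k + 1] = b_k
    return q

def invert(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(q, alpha):
    """Resolvent (alpha * I - q)^{-1} of a finite sub-generator q."""
    n = len(q)
    alpha = Fraction(alpha)
    return invert([[alpha * int(i == j) - q[i][j] for j in range(n)]
                   for i in range(n)])

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

# Hypothetical rates; exact rational arithmetic avoids rounding issues.
q = killed_generator(lambda k: 2 ** k, lambda k: 1, 4)
r1, r2 = resolvent(q, 1), resolvent(q, 2)
```

With exact rational arithmetic the resolvent equation can be checked entry by entry; for the parameters 1 and 2 it reads `r1 - r2 = r1 * r2`.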
Similar to regular diffusions on an interval, the minimal -process can be fully characterized by two parameters on derived from the matrix (2.1): a scale function and a speed measure. The scale function is given by
(2.6) |
the speed measure is
The process is symmetric with respect to in the sense that for any and . Thus, the speed measure is also known as the symmetric measure of . For further details and related results, see [14, §3.1], which provides a characterization involving time change transformation of Brownian motion.
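The scale function and speed measure can be computed recursively from the rates. The sketch below assumes the classical formulas, with the scale function normalized to half of Feller's as mentioned in the introduction; since the displayed formulas are not reproduced here, the symbols `mu`, `c`, and the exact normalization should be read as assumptions rather than as the definitions in (2.6).

```python
from fractions import Fraction

def speed_measure(birth, death, size):
    """Assumed form: mu_0 = 1 and mu_k = (b_0 ... b_{k-1}) / (a_1 ... a_k)."""
    mu = [Fraction(1)]
    for k in range(1, size):
        mu.append(mu[-1] * Fraction(birth(k - 1)) / Fraction(death(k)))
    return mu

def scale_function(birth, death, size):
    """Assumed form: c_0 = 0 and c_{k+1} - c_k = 1 / (2 * b_k * mu_k)."""
    mu = speed_measure(birth, death, size)
    c = [Fraction(0)]
    for k in range(size - 1):
        c.append(c[-1] + 1 / (2 * Fraction(birth(k)) * mu[k]))
    return c

# Hypothetical sample rates.
birth = lambda k: 2 ** k
death = lambda k: 1
mu = speed_measure(birth, death, 5)
c = scale_function(birth, death, 5)
```

Under these assumptions the symmetry of the process is visible at the level of the rates as the detailed balance relation `mu[k] * birth(k) == mu[k + 1] * death(k + 1)`, which holds for any normalization of `mu`.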
Using another two parameters derived from the scale function and the speed measure, we can classify the boundary point in accordance with Feller’s approach. Specifically, we define the following two quantities:
The following classification for the boundary point is very well known.
Definition 2.1.
The boundary point (for ) is called
(1) regular, if ;
(2) an exit, if ;
(3) an entrance, if ;
(4) natural, if .
Remark 2.2.
Note that is regular if and only if , where
If is an exit, then and . If is an entrance, then and . If is natural, then .
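The classification can be probed numerically through partial sums. The sketch below assumes the classical Feller quantities, written here as R (the scale increments weighted by the speed mass to the left) and S (the scale increments weighted by the speed mass to the right, rewritten by Fubini as the sum of `mu_k * c_k`), and the usual dictionary: both finite is regular, only R finite is an exit, only S finite is an entrance, neither finite is natural. These formulas and all names are assumptions, and finitely many terms can only suggest, never prove, convergence or divergence.

```python
from fractions import Fraction

def feller_partial_sums(birth, death, n):
    """Partial sums R_n = sum_{k<n} (c_{k+1} - c_k) * mu([0, k]) and
    S_n = sum_{k<n} mu_k * c_k, under the assumed classical formulas."""
    mu_k, c_k = Fraction(1), Fraction(0)
    mass = Fraction(0)                  # running value of mu([0, k])
    r_n = s_n = Fraction(0)
    for k in range(n):
        mass += mu_k
        step = 1 / (2 * Fraction(birth(k)) * mu_k)   # c_{k+1} - c_k
        r_n += step * mass
        s_n += mu_k * c_k
        c_k += step
        mu_k = mu_k * Fraction(birth(k)) / Fraction(death(k + 1))
    return r_n, s_n

def classify(r_finite, s_finite):
    """Map the finiteness of R and S to the (assumed) boundary type."""
    if r_finite:
        return "regular" if s_finite else "exit"
    return "entrance" if s_finite else "natural"

# Hypothetical rates: births 2^k, deaths 1.
birth = lambda k: 2 ** k
death = lambda k: 1
r_30, _ = feller_partial_sums(birth, death, 30)
_, s_10 = feller_partial_sums(birth, death, 10)
```

For these rates the partial sums of R stay below 2 while those of S grow very quickly, which is consistent with an exit boundary under the stated assumptions.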
It is crucial to highlight that -processes are unique if and only if is classified as an entrance or natural boundary; for more details, refer to, e.g., [14, Theorem 3.5]. In this paper, however, we focus on the non-uniqueness case, where is either regular or an exit. This non-uniqueness implies the condition . Furthermore, it holds that
(2.7) |
see, e.g., [14, Lemma 4.1].
2.3. Parameters determining birth-death processes
From now on, we will assume that is either regular or an exit. In addition to the minimal one, consideration of other -processes, such as Doob processes and the -process (which is only applicable in the regular case), is warranted.
It was first examined by Feller in [7] and then firmly established by Yang in 1965 (see [20, Chapter 7]) that each (non-minimal) -process can be uniquely determined, up to a multiplicative constant, by a triple of parameters . Here, are two non-negative constants and is a positive measure on satisfying the conditions:
(2.8) |
where , and
(2.9) |
Let denote the set of all triples satisfying (2.8) and (2.9). More precisely, the resolvent matrix
(2.10) |
of the -process corresponding to can be expressed as (see, e.g., [14, Theorem B.1])
(2.11) |
where is the minimal resolvent matrix (2.4) and is defined as (2.3). The matrix (2.11) is referred to as the -resolvent matrix. It should be noted that for any constant , the -resolvent matrix is identical to the -resolvent matrix.
We need to provide some observations regarding the expression of the resolvent matrix presented in (2.11). First, the first inequality in (2.8) is equivalent to
(2.12) |
see, e.g., [20, §7.10]. Second, is finite, if and only if is regular or an entrance; see, e.g., [7, Theorem 7.1]. Thus, the condition (2.9) guarantees that the resolvent matrix (2.11) is well-defined in the exit case. Third, when while , the resolvent matrix (2.11) reduces to the minimal one (although the second inequality in (2.8) is not satisfied).
2.4. Doob processes and Feller -processes
In a recent article [14], it was demonstrated, using Ray-Knight compactification, that every -process possesses a càdlàg modification on . This modified version is a Ray process on . In this paper, we do not differentiate between and its Ray-Knight compactification . Additionally, [14, Corollary 5.2] classifies all non-minimal -processes into two categories: Doob processes and Feller -processes.
According to, e.g., [14, Theorem B.1], the -process is a Doob process, if and only if its determining triple belongs to
In this case, whenever it approaches , the process restarts at a random location drawn from the distribution given by (1.1) on . In other words, can be obtained by the piecing out (see [11]) of the minimal -process with respect to the instantaneous distribution , as established in [14, §5]. Readers are also referred to [14, Appendix A] for a detailed description of the piecing out transformation.
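The piecing out description can be imitated by a simple simulation. In the sketch below, a finite level `cap` serves as a numerical stand-in for the boundary point, so the restart happens at the first visit to `cap` rather than at the true explosion time; the restart law `pi`, the rates, and the function name `simulate_doob` are all hypothetical and only meant to convey the mechanism.

```python
import random

def simulate_doob(birth, death, pi, cap, t_max, start=0, seed=0):
    """Simulate a birth-death path that restarts from pi on reaching level cap.

    Returns the path as a list of (jump time, new state) pairs and the list of
    path indices at which a restart occurred. The support of pi must lie in
    {0, ..., cap - 1}.
    """
    rng = random.Random(seed)
    states, weights = zip(*sorted(pi.items()))
    t, k = 0.0, start
    path = [(t, k)]
    restarts = []
    while True:
        a_k = death(k) if k > 0 else 0.0
        b_k = birth(k)
        t += rng.expovariate(a_k + b_k)          # exponential holding time
        if t >= t_max:
            break
        # Move right with probability b_k / (a_k + b_k), otherwise left.
        k = k + 1 if rng.random() < b_k / (a_k + b_k) else k - 1
        if k >= cap:                              # proxy for reaching the boundary
            k = rng.choices(states, weights=weights)[0]
            restarts.append(len(path))
        path.append((t, k))
    return path, restarts

# Hypothetical parameters, chosen only for illustration.
path, restarts = simulate_doob(
    birth=lambda k: 2.0 ** k,
    death=lambda k: 1.0,
    pi={0: 0.5, 3: 0.5},
    cap=12,
    t_max=20.0,
    seed=1,
)
```

Between restarts the simulated path moves only between adjacent states, and at each recorded restart index it lands in the support of `pi`.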
When , the corresponding -process is a Feller process on (with being the cemetery point). Following [14], this -process will be termed a Feller -process. The infinitesimal generator of the Feller -process is derived in [14, Theorem 6.3], and the crucial point is the boundary condition at satisfied by the functions in the generator domain:
(2.13) |
where represents the discrete gradient of with respect to the scale function ; see [14, (6.2)]. Based on this boundary condition, the parameters can be utilized to explain three possible types of boundary behaviours of at : killing, reflecting, and jumping. Please refer to [14, §2.3] and [15] for more details.
2.5. Notations in the context of general Markov processes
Let us introduce some notations for a -process in the context of general Markov processes.
Given the transition matrix and its resolvent matrix as provided in (2.10), we define
for every (with ) and . According to (2.11), for and can be expressed in terms of the parameters as
(2.14) |
where . By utilizing (2.5) and (2.7), we can obtain that
(2.15) |
Define as this limit. For where may not be equal to , it holds that . We define
As established in [14, §4],
is a Ray resolvent in the sense of, e.g., [4, Definition 8.1]. According to, e.g., [4, Theorem 8.2], there exists a Borel measurable Markov transition semigroup on , having as the resolvent, such that is right-continuous for all and . This transition semigroup is known as a Ray semigroup. Note that for and , it holds that for and .
In the case where is a Feller -process, the semigroup is a Feller semigroup in the sense that and
(2.16) |
Particularly, a Feller -process satisfies the normal property on , i.e., for all . However, in the case where is a Doob process, neither (2.16) nor the normal property is satisfied. More precisely, only holds for , and is a branching point of the -process in the sense of, e.g., [4, Definition 8.3]. Specifically, , which is given by (1.1); see [14, Theorem 5.1].
As a Ray process on , the -process has a.s. càdlàg trajectories on according to, e.g., [4, Theorem 8.6] (or [19, Theorem 9.13]). Therefore, we can define the trajectory space as the set of all càdlàg functions from to such that for all . We can then define the projection maps
for all . The translation operators on are defined by for all . Let and , the -algebras on generated by and , respectively. These -algebras are known as the natural filtration on . According to [4, Theorem 8.6] (see also [19, Theorem 9.13]), for any probability measure on , there exists a probability measure on such that
forms a Markov process on with initial distribution (not !) and transition semigroup . Here, for . If for , we write as . Additionally, note that . The natural filtration can be augmented using the standard approach described in [19, I§6], resulting in the augmented natural filtration on . Finally, we obtain a collection
(2.17) |
which forms a realization of the -process .
The lifetime of is . Note that and for all . In the case where is a Feller -process, it also holds that . In contrast, for a Doob process, since . However, when the Doob process is restricted to , it becomes a Borel right process that satisfies the normal property (see [19, Theorem 9.13]). It is worth noting that in the context of Borel right processes, the cemetery point is commonly considered as the compactification point of the state space , which slightly differs from the setup in §2.1.
2.6. Realization on Skorohod topological space
Let denote the set of all càdlàg functions from to . According to [6, §3, Theorem 5.6], the space equipped with the metric inducing the Skorohod topology (defined in [6, §3(5.2)]) is a complete, separable metric space. In addition, utilizing [6, §3, Proposition 7.1], we can identify the Borel -algebra on with respect to the Skorohod topology as , which is generated by all the projection maps .
It is evident that , where is the trajectory space in the realization (2.17) of the -process. Since , it is straightforward to verify that the embedding map
(2.18) |
is measurable. Thus, for any probability measure on , induces an image probability measure on . Since can be regarded as the extension of to by defining , we will still denote this image measure by if no ambiguity arises.
3. Convergence of resolvents
Consider a -process with parameters , which determine its resolvent matrix. Define for and , where is the lifetime of . Additionally, consider a sequence of -processes , with parameters denoted by . Symbols related to this sequence will be distinguished by the superscript . For example, the realization of can be denoted by
where the lifetime is . The semigroup and resolvent of are denoted by and , respectively, and for and .
In this section, our aim is to clarify the relationships between the convergence of the transition matrices, the resolvent matrices, the transition semigroups, and the resolvents for . Among these convergences, the resolvent convergence is comparatively clear and straightforward, according to the resolvent representation (2.14).
A simple case of Kurtz’s lemma [13, Lemma 2.11], as stated below, will be useful in proving our results.
Lemma 3.1.
Let be a Banach space with the norm . Suppose that for each , is a function of taking values in , and that forms a bounded, equicontinuous sequence in the sense that:
(i) There exists such that for all and .
(ii) For every and , there exists such that with implies for all .
Then
(3.1) |
implies
Now we are in a position to present our first result regarding the equivalent conditions for resolvent convergence.
Theorem 3.2.
The following convergences are all equivalent to each other:
(1a) For some (or equivalently, for all ), it holds that for all and .
(1b) for all and .
(1c) for all and .
(2a) For some (or equivalently, for all ), it holds that for all and , and for all .
(2b) It holds that for all and , and for all .
(3a) For some (or equivalently, for all ), it holds that for all and , and for all .
(3b) For some (or equivalently, for all ), it holds that for all and , and for all .
(4a) For some (or equivalently, for all ), it holds that for all and , and for all .
(4b) For some (or equivalently, for all ), it holds that
(3.2) for all and , and for all .
Proof.
The equivalence between (1a), (1b), and (1c) can be easily verified by considering the following facts: for and ; according to (2.14),
in addition, for all .
Clearly, (2b) implies (2a). By taking and in (1b), we obtain
respectively. Thus, (1b) indicates (2b). Now, we will demonstrate that (2a) implies (1a), thereby establishing the equivalence between (1a), (1b), (1c), (2a), and (2b). Suppose that for some , it holds that for all and . For any , there exists an integer such that for all . Therefore, we have
Note that is dense in , and for all . It is straightforward to further obtain that for all . Taking , and defining , we observe that
From this, we can deduce that
Consequently, (1a) holds true.
Next, we will establish the equivalence between (4a), (4b), and (2a). Clearly, (4b) implies (4a), and (4a) implies (2a) (by the dominated convergence theorem). Suppose that (2a) holds. In order to conclude (4b), our goal is to apply Lemma 3.1 with and . It is sufficient to verify condition (ii) of Lemma 3.1. In fact, based on [3, II§3, Theorem 1], for any , we have
(3.3) |
Hence, the equicontinuity of can be easily obtained.
For the remaining conditions (3a) and (3b), it is worth noting that (3b) implies (3a), and (3a) implies (4a) (by taking in (3a)). Additionally, (4a) implies (4b). Furthermore, we can use the same argument as the one used to show that (2a) implies (1a) to deduce that (4b) implies (3b). Therefore, the equivalence of all conditions has been established. ∎
Remark 3.3.
In discussions about general Markov processes, it is commonly assumed that functions take the value of at the cemetery state . This assumption is the reason why we consider in the theorem above. However, in the subsequent discussion in §5, we will examine the weak convergence of a family of probability measures on , where it becomes necessary to address functions that are not equal to at . Fortunately, it is easy to adjust the conditions in the theorem to apply to . To keep the presentation simple, we will not elaborate on this here.
In conditions (3a) and (3b) that describe the convergence of the transition semigroups, it is somewhat limiting that can only be chosen from functions in rather than from all functions in . Moreover, there is a lack of characterization for equivalent conditions where has uniformity. These limitations render it inadequate to guarantee the convergence of the finite-dimensional distributions of to the corresponding finite-dimensional distributions of . The challenge here is that if we plan to apply Lemma 3.1 with and , neither nor is necessarily a Feller -process, which does not always guarantee that . To address this difficulty, we need to take an alternative approach using Theorem 4.2. We will revisit this issue in §5.
4. Infinitesimal generators of Doob processes
In this section, we examine the Doob process, whose resolvent matrix is determined by the parameters . All other notation related to is consistent with that in §2. Note that the resolvent does not satisfy the strong continuity. That is, does not hold for all . However, we will demonstrate that the strong continuity does hold in a smaller Banach space, enabling the transition semigroup of the Doob process to form a “Feller semigroup” on this Banach space.
We first introduce a lemma, which states that the minimal -process is a Feller process on (with being the cemetery).
Lemma 4.1.
The transition semigroup of acts on as a strongly continuous contractive semigroup, i.e.,
(4.1) |
for all .
Proof.
Let be the Dirichlet form associated with on , as described in [14, Lemma 3.1]. Fix . According to [14, Lemma 3.1], we have for any . By utilizing [10, Lemma 1.3.3], we can derive that
as , where . It follows from the Cauchy-Schwarz inequality that for any ,
Consequently, . In other words, we have established that and for all . Note that is dense in with respect to the uniform norm , and holds for all . Therefore, it is straightforward to verify that and for all . Applying the Hille-Yosida theorem, we can obtain (4.1). ∎
Based on the density matrix provided by (2.1), we define a function on for every function on as
(4.2) |
where . Note that can be extended to a function in , if and only if exists. In this case, we represent its extension using the same symbol . Then, by an abuse of notation, the operator defined by (4.2) with domain
is usually referred to as the maximal (discrete) generalized second order differential operator; see, e.g., [7] and [14, Lemma 6.1].
Given the triple , we define
(4.3) |
Since , is a closed subspace of . Thus, it is a Banach space equipped with the uniform norm .
The following result is analogous to the second part of [16, II §5, Theorem 3].
Theorem 4.2.
Let be a Doob process determined by the triple , i.e., . Then, acts on as a strongly continuous contractive semigroup, i.e.,
for all . Furthermore, the infinitesimal generator of on is with domain .
Proof.
We aim to apply the Hille-Yosida theorem (see, e.g., [16, I §1, Theorem 1]) to and . Two facts need to be proven:
(1) is dense in .
(2) For any and , is the unique solution to the equation
(4.4)
Firstly, we prove that for any and . According to (2.15), we have with
(4.5) | ||||
Then, a straightforward computation yields
Hence, . In addition, it follows from [7, Theorems 7.1 and 9.1] that for ,
(4.6) | ||||
Thus, with
(4.7) |
Note that due to . This, together with (4.5) and (4.7), yields that
As a result, . Therefore, is obtained.
Next, we consider the resolvent equation (4.4) and prove that is the unique solution to (4.4). In fact, by using , (4.5), and (4.6), we have
Consequently, is a solution to (4.4). The uniqueness of the solutions to (4.4) can be concluded by the same argument in the last paragraph of the proof of [14, Theorem 6.3].
Thirdly, we demonstrate that
(4.8) |
for all , thereby establishing that is dense in . To prove (4.8), we define . Utilizing (2.5) and (4.5), we can deduce that
(4.9) | ||||
As established in Lemma 4.1, . Since and , it follows that
Note that , since . Thus, the above limit is equal to . From (4.9), we can obtain (4.8).
Finally, by applying the Hille-Yosida theorem, we can conclude that is the infinitesimal generator of the resolvent on . Hence, it admits a strongly continuous contractive semigroup on . Fixing and , we have
Thus, for a.e. . Since as for any and is right-continuous (see [4, Theorem 8.2]), it is easy to see that for all . This completes the proof. ∎
Remark 4.3.
In the reduced case where and , corresponding to the minimal -process, we have . The same argument as presented in this proof indicates that the infinitesimal generator of on is given by with domain .
5. Weak convergence on Skorohod topological space
In this section, we resume the study of convergence that was paused in §3. As explained in §2.6, for each probability measure (resp. ) on , (resp. ) can be considered as a probability measure on the Skorohod topological space . Our goal is to demonstrate that if converges to and converges to in a suitable sense, then converges weakly to on .
It should be noted that the assumption of the target process being a Feller -process, i.e., , appears to be necessary. According to Theorem 4.2, when is a Doob process, its semigroup exhibits improved analytic properties when restricted to the proper subspace of . Thus, establishing convergence properties with respect to all functions in seems challenging. However, it is worth mentioning that Doob processes and Feller -processes with are well understood (see [14]). Therefore, our main focus is on examining the approximation of Feller -processes with . From this perspective, the assumption of does not result in any loss.
5.1. Approximating triples
We begin by examining condition (1c) in Theorem 3.2. According to (2.15), this condition can be expressed in terms of the triples in as
(5.1) |
A straightforward way to establish (1c) is to verify the convergence of the corresponding parts in (5.1). Specifically, we can examine the convergence properties of the triples as follows.
Definition 5.1.
The triple is said to converge to , if
and
(5.2) |
Remark 5.2.
The pointwise convergence is equivalent to the vague convergence of the measure on to , i.e., for any . However, this condition alone is not sufficient to guarantee (5.2), because and or may be infinite measures. The equivalence between the two formulations in condition (5.2) can be established by using the following two inequalities and the generalized dominated convergence theorem (similar to the approach in the proof of Lemma 5.3): and for all .
When monotonically converges pointwise to , i.e., (or ) for all , condition (5.2) is clearly satisfied by the dominated convergence theorem. This case includes the approximating sequence obtained by the simplest truncation method:
which serves as the primary motivation for the study of this section. However, Definition 5.1 is not restricted to the scenario described in this example, where a sequence of simple processes approximates a complex process. It also allows for the opposite situation. For instance, let us consider the case where is a regular boundary and satisfies and . Define
Then, for all , while converges to in the sense of Definition 5.1.
It is easy to prove that the convergence based on the triple, as defined above, implies (5.1).
Lemma 5.3.
5.2. Convergence in finite-dimensional distributions
For each , define if is a Feller -process, and define as (4.3) (with ) if is a Doob process. According to [14, Theorem 6.3] or Theorem 4.2, the transition semigroup acts on as a strongly continuous contractive semigroup. Let with domain denote the infinitesimal generator of on . Similarly, we define if is a Feller -process, and if is a Doob process. Denote by with domain the infinitesimal generator of on .
Lemma 5.4.
Assume that converges to in the sense of Definition 5.1. Then the following conclusions hold:
(1) For any , there exists a sequence such that
(5.3)
(2) For any , there exists a sequence such that
(5.4)
Proof.
The sequence of processes can be divided into at most two subsequences: one consisting of Feller -processes and the other consisting of Doob processes. Thus, it is sufficient to consider each subsequence separately. First, we will examine the subsequence, still denoted by , consisting of Feller -processes. Note that for all . Hence, the existence of is evident. For with , we can take , which satisfies (5.4) by (1b) of Theorem 3.2 and the Hille-Yosida theorem.
It remains to consider the subsequence consisting of Doob processes. From now on, we assume that all are Doob processes. According to Definition 5.1, we have .
(1) Assume without loss of generality that with . According to (2.13) or Theorem 4.2, we have
(5.5) |
Note that because of the condition (2.8). Let such that , and assume without loss of generality that for all . Define
where
We explain why is finite in the definition of . In fact, according to (2.14), we have
Thus,
(5.6) |
which is finite due to (2.12). In addition, it follows from (5.2) and Lemma 5.3 that
Hence, by (5.5), which yields that . Finally, it is straightforward to verify that with by using the definition of . In other words, .
Let denote the family of all probability measures on . A sequence of measures is said to converge weakly to if for all . It is worth noting that this weak convergence is equivalent to the vague convergence, i.e., for all ; see, e.g., [9, §7.3, Exercise 26].
Now we are ready to prove the convergence of in finite-dimensional distributions to a Feller -process .
Theorem 5.5.
Assume that converges to in the sense of Definition 5.1, and that . Then
(5.7) |
for all and . Furthermore, if converges weakly to , then for any , and with ,
(5.8) |
Proof.
Consider , and take a sequence satisfying (5.4). We will first apply Lemma 3.1 with and to conclude that
(5.9) |
Clearly, the first condition (i) and (3.1) in Lemma 3.1 hold for this . It suffices to prove the equicontinuity of . In fact, the Hille-Yosida theorem indicates that
Since , it is straightforward to obtain the equicontinuity of . Therefore, (5.9) is established.
We are now in a position to demonstrate (5.7). Since , it is sufficient to consider . Take an arbitrary . We need to show that there exists an integer such that for any ,
(5.10) |
Since , we can take such that . Let be the sequence for as in (5.4), i.e.,
It follows from (5.9) that
Particularly, there exists an integer such that for any ,
(5.11) |
Note that for any and ,
Before proving (5.8), let us clarify two facts. Firstly, it is not difficult to derive from (5.7) that
(5.12) |
holds for any functions satisfying . Secondly, let and . It can be shown that and are probability measures on (the -fold product space of ) satisfying . Thus, according to [9, §7.3, Exercise 26], to prove (5.8) as required, it is equivalent to proving that converges vaguely to on . In other words, we can assume without loss of generality that in (5.8).
5.3. Weak convergence
Under the assumption stated in Theorem 5.5, we proceed to examine the convergence of . As discussed in § 2.6, the process with initial distribution can be represented as a probability measure on the Skorohod topological space . Similarly, the process with initial distribution can be represented as a probability measure on . We aim to establish the weak convergence of to .
Theorem 5.7.
Assume that converges to in the sense of Definition 5.1. If converges weakly to , then converges weakly to on the Skorohod topological space , i.e.,
(5.14) |
for all , where denotes the family of all bounded continuous functions on equipped with the Skorohod topology.
Proof.
The sequence of processes can be divided into at most two subsequences. One subsequence consists of Doob processes, while the other subsequence consists of Feller -processes. Therefore, it is sufficient to prove (5.14) separately for each of these subsequences. For the subsequence consisting of Feller -processes, we can directly apply the result from [6, §4 Theorem 2.5]. Further details can be found in the proof of [14, Theorem 9.2]. Now, let us focus on the subsequence consisting of Doob processes. Without loss of generality, we assume that all are Doob processes.
Let denote the Skorohod topological space consisting of all càdlàg functions on for . For , the vector-valued function induces a Borel measurable map
where . Thus, the image measures
are probability measures on . We aim to show the weak convergence of to on , and then, (5.14) follows by applying [6, §3, Corollary 9.2].
We first demonstrate that the sequence of probability measures on is relatively compact. According to [6, §3 Theorem 9.4 and Remark 9.5 (b)], it suffices to consider the case and, for fixed , , and , to find a pair of real-valued progressive processes on , adapted to the filtration of , for all , satisfying the following conditions:
(i) , and
(5.15)
is an -martingale.
(ii) It holds that
(5.16)
and
(5.17)
In fact, since is strongly continuous on , there exists such that
(5.18) |
Applying the first statement of Lemma 5.4 to , we can obtain a sequence such that . It follows from (1b) of Theorem 3.2 that
(5.19) |
For each , we define
and verify the conditions listed above as follows. It is evident that the first part of (i) holds true, since . To demonstrate that (5.15) is an -martingale, we note that and . Then
(5.20) |
is adapted to . By virtue of Theorem 4.2 and the Hille-Yosida theorem, we have
(5.21) |
for any and . Since takes values in for all , -a.s., it follows from the Markov property and (5.21) that for any ,
As a result, (5.20) is an -martingale. Additionally, according to (5.18) and (5.19), the left hand side of (5.16) is not greater than
and it follows from that the left hand side of (5.17) is not greater than
Therefore, we have established the existence of , thereby completing the proof of the relative compactness of .
Theorem 5.7 shows that the truncation method (1.3) yields a sequence of simple -processes that converges, in the sense of (5.14), to a target process with an infinite jumping measure. The theorem also covers various other types of examples. For instance, consider the case where is a regular boundary, and choose an infinite measure that satisfies (2.8). Define a sequence of measures as follows:
For any constant , the triple corresponds to an honest -process , which exhibits complex jumping behavior near the boundary , but each jump originating from will only enter states that are beyond . As , the jumps of from into become increasingly compressed, and this sequence of processes converges to the -process in the sense of (5.14).
6. Weak convergence for Wang’s approximation
In his 1958 doctoral thesis (see [20, Chapter 6]), Wang constructed a sequence of honest Doob processes for each honest -process. These processes are designed to converge to the given -process in the sense of (1.3). In fact, Wang’s construction remains effective even in the non-honest case, and its convergence is stronger than (1.3), as it also ensures weak convergence on the Skorohod topological space. This section explains these findings.
6.1. Wang’s approximation
We begin by introducing a transformation on the trajectories. Let be a càdlàg function on . Consider two sequences of positive constants and such that
(These sequences may consist of finitely many terms.) We say that the function is obtained from by the -transformation if
where and . Intuitively speaking, the -transformation discards the trajectory of corresponding to the interval , keeps the segment unchanged, and shifts the remaining parts to the left, connecting them in the original order without intersection, thereby obtaining a new càdlàg trajectory .
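The transformation can be sketched in a few lines of Python. The sketch assumes that the discarded pieces are the half-open intervals [τ_i, β_i), that only finitely many of them are given, and that the path is supplied as a function of time; the helper name `excise` is hypothetical.

```python
from bisect import bisect_right


def excise(path, intervals):
    """Return the path obtained by cutting out [tau_i, beta_i) and shifting the rest left.

    `path` maps a time t >= 0 to a state; `intervals` is a list of pairs
    (tau_i, beta_i) with tau_1 <= beta_1 <= tau_2 <= beta_2 <= ...
    """
    # Kept segments in the old time scale: [0, tau_1), [beta_1, tau_2), ..., [beta_last, inf).
    starts = [0.0] + [beta for _, beta in intervals]
    ends = [tau for tau, _ in intervals] + [float("inf")]

    # Starting time of each kept segment in the new time scale.
    new_starts = []
    elapsed = 0.0
    for s, e in zip(starts, ends):
        new_starts.append(elapsed)
        elapsed += e - s

    def transformed(t):
        # Last kept segment whose new starting time is <= t (empty segments are skipped).
        i = bisect_right(new_starts, t) - 1
        return path(starts[i] + (t - new_starts[i]))

    return transformed


if __name__ == "__main__":
    x = lambda t: int(t)  # a right-continuous step path: state k on [k, k+1)
    y = excise(x, [(1.0, 3.0), (4.0, 5.0)])
    print([y(t) for t in (0.5, 1.0, 1.5, 2.0, 2.7)])
```

In this example the kept pieces [0, 1), [3, 4) and [5, ∞) are glued together in their original order, so the new path visits only the states that the old path occupied on those pieces.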
Let be either a Doob process or a Feller -process with parameters . Fix . Define and (). Then, we define a sequence of stopping times as follows:
and if are already defined, we set
and
For and every , by performing the -transformation on , we obtain a new trajectory, denoted by .
By following the approach of [15, Lemma 7.1], we can derive that
(6.1) |
is a Doob process with instantaneous distribution . Note that the parameters of are (see, e.g., [20, §7.12, Theorem 2] or [15, Lemma 8.1])
(6.2) |
and
(6.3) | ||||
It is straightforward to verify that this special approximating sequence of triples satisfies the conditions in Definition 5.1, if and only if (see also [20, §7.12, Theorem 2]). When (which is applicable only for the case where is regular), it can be obtained that
(6.4) |
and
(6.5) |
by virtue of (see, e.g., [20, §7.10, (3) and (9)] or [7, Theorem 8.1])
(6.6) |
Therefore, whether or , each condition in Theorem 3.2 holds true for Wang’s approximation.
6.2. Weak convergence for Skorohod topology
It is worth noting that the expression (6.1) for the Doob process is not the realization given in (2.17). However, it can be easily proven that the mapping induced by (6.1),
(6.7) |
is measurable. Therefore, similar to what is stated in §2.6, given an initial distribution , can be realized as a probability measure on , denoted by .
Although Wang’s approximation may not satisfy the conditions in Definition 5.1, we can still prove its weak convergence in the Skorohod topology. This is because, upon examining the proofs in §5, we see that the essential role of Definition 5.1 is to ensure that the first statement of Lemma 5.4 holds and to guarantee the resolvent convergence.
Theorem 6.1.
Let be a Feller -process and be its approximating sequence of Doob processes given in (6.1). If converges weakly to , then converges weakly to on equipped with the Skorohod topology, i.e.,
(6.8) |
for all .
Proof.
Our goal is to show that the first statement of Lemma 5.4 still holds true. It suffices to examine the case , and consider with satisfying
where . Note that the measure is given by (6.3).
In the case where , we can define and by the same method as in the proof of Lemma 5.4. Note that . Therefore, it is sufficient to prove that . In fact, it follows from (5.6), (6.4), and (6.5) that
By utilizing the generalized dominated convergence theorem (see [9, §2.3, Exercise 20]), we can deduce from (6.6) that
Thus, according to (2.14) and (6.6), we have
Therefore,
This establishes .
In the case where , we take a sequence of functions as follows:
It is straightforward to verify that and . ∎
According to the Skorohod representation theorem (see [6, §3, Theorem 1.8]), if (6.8) holds, then there exist -valued random variables and on a probability space , such that and have distributions and , respectively, and converges to , -a.s. If , then defined in (6.7) and given by (2.18) are on the same probability space . This raises the question of whether can serve as the Skorohod representation of . An affirmative answer is equivalent to showing that
(6.9) |
holds for -a.s. . Although (6.12) provides a pointwise convergence in time for -a.s. , establishing the convergence in the Skorohod topology still seems challenging. Typically, a sufficient condition for the -convergence (6.9) is the local uniform convergence with respect to time . However, this condition is not guaranteed by (6.12).
6.3. Skorohod representation in the topology of convergence in measure
Finally, let us consider another, simpler metric on that induces the topology of convergence in (Lebesgue) measure. Under this metric, the sequence for Wang’s approximation converges not only weakly but also almost surely to .
For , define
According to, e.g., [9, §2.4, Exercise 32], is a metric on , and additionally, if and only if for any , converges to in (Lebesgue) measure on . Note that the Borel -algebra generated by is also identical to , the -algebra generated by all projection maps ; see, e.g., [4, §8.6]. Thus, , and we do not need to change the expressions for the maps and given by (2.18) and (6.7).
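For intuition, the cited exercise in [9] concerns the metric ρ(f, g) = ∫ |f − g| / (1 + |f − g|) dμ on a finite measure space, for which convergence is the same as convergence in measure. The Python sketch below evaluates this quantity on a single interval [0, T] for real-valued paths by a midpoint rule. It is a simplified stand-in chosen for illustration and not the metric defined above on the path space; the names `rho` and `spike` are hypothetical.

```python
def rho(f, g, T=1.0, steps=10_000):
    """Midpoint-rule approximation of the integral of |f-g| / (1 + |f-g|) over [0, T]."""
    h = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        d = abs(f(t) - g(t))
        total += d / (1.0 + d) * h
    return total


def spike(n):
    """Path equal to n on [0, 1/n) and 0 afterwards; its integral over [0, 1] is 1 for every n."""
    return lambda t: float(n) if t < 1.0 / n else 0.0


if __name__ == "__main__":
    zero = lambda t: 0.0
    for n in (1, 10, 100):
        print(n, rho(spike(n), zero))
```

The spikes keep integral 1 over [0, 1] for every n, so they do not converge to zero in the integral sense, yet their distance to the zero path under `rho` shrinks as n grows: convergence in measure ignores large values taken on small time sets.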
It should be noted that in the following theorem, we do not require the target -process to be a Feller -process.
Theorem 6.2.
Let be a -process and be its approximating sequence of Doob processes given in (6.1). For any , it holds -a.s. that
(6.10) |
Furthermore, if converges weakly to , then converges weakly to on equipped with the topology induced by , i.e.,
(6.11) |
for all , where denotes the family of all bounded continuous functions on with respect to the metric .
Proof.
As established in [20, §6.4] (the non-honest case is examined in [15, Theorem A.1]), the following convergence holds true: for any ,
(6.12) |
Thus, for -a.s. and any , converges to pointwise in . This convergence clearly implies the convergence in (Lebesgue) measure on . Therefore, (6.10) is established.
In order to prove (6.11), we define for ,
and
By utilizing (6.10) and the dominated convergence theorem, we have for any . It follows from [4, Theorem 8.12] that . Thus, . To obtain (6.11), it remains to show
In fact, we have
and
where . Therefore, we can apply the generalized dominated convergence theorem (see [9, §2.3, Exercise 20]) to obtain
This completes the proof of (6.11). ∎
References
- [1] R. M. Blumenthal and R. Getoor. Markov processes and potential theory. Pure and Applied Mathematics, Vol. 29. Academic Press, New York-London, 1968.
- [2] M.-F. Chen. From Markov chains to non-equilibrium particle systems. World Scientific, 2004.
- [3] K. L. Chung. Markov chains with stationary transition probabilities. Second edition. Die Grundlehren der mathematischen Wissenschaften, Band 104. Springer-Verlag New York, Inc., New York, 1967.
- [4] K. L. Chung and J. B. Walsh. Markov processes, Brownian motion, and time symmetry, volume 249 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, New York, NY, second edition, 2005.
- [5] J. L. Doob. Stochastic Processes. Wiley, 1953.
- [6] S. N. Ethier and T. G. Kurtz. Markov processes: characterization and convergence. Wiley Series in Probability and Statistics. Wiley, first edition, 1986.
- [7] W. Feller. The birth and death processes as diffusion processes. J. Math. Pures Appl. (9), 38:301–345, 1959.
- [8] W. Feller and H. P. McKean Jr. A diffusion equivalent to a countable Markov chain. Proc. Natl. Acad. Sci. USA, 42:351–354, 1956.
- [9] G. B. Folland. Real analysis. Pure and Applied Mathematics (New York). John Wiley & Sons, Inc., New York, second edition, 1999.
- [10] M. Fukushima, Y. Oshima, and M. Takeda. Dirichlet forms and symmetric Markov processes, volume 19 of de Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin, extended edition, 2011.
- [11] N. Ikeda, M. Nagasawa, and S. Watanabe. A construction of Markov processes by piecing out. Proc. Japan Acad., 42:370–375, 1966.
- [12] K. Itô and H. P. McKean Jr. Diffusion processes and their sample paths. Springer-Verlag, Berlin-New York, 1974.
- [13] T. G. Kurtz. Extensions of Trotter’s operator semigroup approximation theorems. J. Funct. Anal., 3(3):354–375, 1969.
- [14] L. Li. Ray-Knight compactification of birth and death processes. Stochastic Processes and their Applications, 177:104456, 2024.
- [15] L. Li. Time-changed Feller’s Brownian motions are birth-death processes. arXiv:2408.09364, 2024.
- [16] P. Mandl. Analytical treatment of one-dimensional Markov processes. Die Grundlehren der mathematischen Wissenschaften, Band 151. Academia Publishing House of the Czechoslovak Academy of Sciences, Prague; Springer-Verlag New York Inc., New York, 1968.
- [17] D. Ray. Resolvents, transition functions, and strongly Markovian processes. Ann. of Math. (2), 70:43–72, 1959.
- [18] L. C. G. Rogers and D. Williams. Diffusions, Markov processes, and martingales. I. Cambridge Mathematical Library. Cambridge University Press, Cambridge, second edition, 2000.
- [19] M. Sharpe. General theory of Markov processes, volume 133 of Pure and Applied Mathematics. Academic Press, Inc., Boston, MA, 1988.
- [20] Z. K. Wang and X. Q. Yang. Birth and death processes and Markov chains. Springer-Verlag, Berlin; Science Press Beijing, Beijing, 1992.