This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Some applications of phase-type distributions in recurrent events

\nameRaoufeh Asgharia and Amin Hassan Zadehb CONTACT A. Hassan Zadeh Email: [email protected] aDepartment of Statistics, Shahid Beheshti University, Tehran, Iran b Department of Actuarial Science, Shahid Beheshti University, Tehran, Iran
Abstract

In this paper, the recurrent events that can occur more than one over the follow-up time have been modeled by phase-type distributions. We use the finite-state continuous-time Markov process with multi states for patients with recurrent events. The number of recurrences until time tt, the time stay for every state and the time till death are of importances. The time till death is assumed to have a phase-type distribution (which is defined in a Markov chain environment) with interpretable parameters. The underlying continuous-time Markov chain has one absorbing state (death) and transient states to reflect recovery and disease stages. A system of differential equations is obtained to calculate the probability of various number of transitions, the conditional expected time to stay in a disease stage and the probability of transition from a stage to another. The model has been calibrated via a real and simulated datasets. The bootstrap techniques have been used to construct the confidence intervals for the parameters.

keywords:
Multiple state models; Recurrent events; Markov chain; Phase-type distribution; Bootstrap; Cancer; Stanford heart transplant data.

1 Introduction

Multi-state models are the most common models used for the description of longitudinal survival data. A multi-state model is a model for a stochastic process, which is characterized by a set of states and the possible transitions among them [11]. The states represent different situations of the individual (healthy, diseased, etc) along a follow-up. Special multi-state models that have been widely used in biomedical applications are the three-state progressive model, the illness-death model, or the bivariate model [9]. One of the important features of multiple state models is reversible illness-death model that is the model we, in this paper, drive some formulas in a Markovian context.
In many biomedical studies, the event of interest can occur more than once over the follow-up time for patients. Such events are called recurrent events [2]. The recurrent events are repeated events, which are of the same type such as acute exacerbations in asthmatic children, seizures in epileptics, cancer recurrences, myocardial infarctions, migraine pain, and ear infections. Several statistical models have been proposed in the literature to analyze recurrent events including [14],[15],[16] and [17]. [18] describes the R package TPmsm which aims at implementing nonparametric and semiparametric estimators for the transition probabilities in progressive illness-death models. Also another R package, TP.idm is introduced in [19] that implements a novel non-Markovian estimator for the transition probability matrix in the progressive illness-death model under right censoring. [12] reviews several modeling approaches following the methodology of the multi-state models, and focusing on the estimation of several quantities such as the transition probabilities and survival probabilities.
In this paper, we use a finite-state continuous time Markov process with a single absorbing state. The state space in the Markov process is partitioned to some subspaces. Each subspace will represent a state of a human beings health state. For example, in cancer case, each subspace is interpreted as either a stage of cancer or recovery. Each subspace will include states such that the semi-Markov property can be reflected in the models. See [23] for more details.

As in our model the time to death follows a phase-type (PH) distribution, we can take advantages of the PH properties. It is good to remind that the set of PH is dense in the class of all distributions defined on the non-negative real numbers, see [7]. There are closed-form expressions for the distribution and density functions, and this also applies to the Laplace–Stieltjes transform and we are able to obtain the expected value and the all non-central moments of the PH random variable by successive differentiation of the Laplace transform.
Many authors recently have used PH distribution in their researches. [1] uses PH for modeling disability insurance. The model represents the aging process as the passage through a number of phases of decreasing vitality. When disabled, individuals additionally pass through several stages that represent duration of disability. [23] uses PH for modeling of skin cancer patients in the United States and estimate parameters related to the aging process that can be useful for comparing the physiological age processes of patients with cancer and healthy people. [8] used phase-type distribution for mortality analysis, and obtained conditional survival probabilities of the time of death and the actuarial present values of the whole life insurance and annuity. [26] used phase-type distribution for analyzing data on lengths of stay of hospital patients. [4] uses PH to model the interval time between first and second birth. A special case of PH is Coxian distribution that can be used to represent survival times in terms of phases through which an individual may progress until leaves a system, such as hospital stay or time till death. [5] used the Coxian to model the patients stay in hospital. The contribution of this paper are presented in Theorem 3.1 and by developing formulas, in the PH context, presented in examples. It can be seen that how some formulas in recurrent events context are calculated tractably in a Markov chain, by using the Markov property.
This paper is organized as follows. Section 2 provides an introduction to PH as well as the notational convention that will be used in describing our model. In Section 3, we describe the structure of our model and present the main theorem which is the main contribution of this paper. In this section two examples will be presented. Section 4 closes the paper with some concluding remarks and a brief discussion of possible future studies.

2 Phase-type Distributions

Consider a continuous-time Markov process {Jt,t0}\{J_{t},t\geq 0\} with the space state, Γ={0,1,,m}\Gamma=\{0,1,\ldots,m\}, mm\in\mathbb{N}, where the state 0 is absorbing and the rest are transient. Assume the intensity matrix and initial probability vector associated with JJ, are denoted by Q and 𝜷\boldsymbol{\beta}, respectively. Clearly, we have that

Q=[00t0T]\displaystyle\textbf{Q}=\begin{bmatrix}0&\textbf{0}^{\prime}\\ \textbf{t}_{0}&\textbf{T}\end{bmatrix} (1)

where the matrix T is m×mm\times m, sub–intensity matrix ( matrix 𝐁=(bij)i,j=1,,m\mathbf{B}=(b_{ij})_{i,j=1,\dots,m} is called a sub–intensity matrix if bii0b_{ii}\leq 0, bij0,andk=1mbik0b_{ij}\geq 0,\textrm{and}\,\sum_{k=1}^{m}b_{ik}\leq 0, with strick inequality for at least on ii, for  i,j=1,2,,m;iji,j=1,2,\dots,m;~{}i\neq j). The vector t0\textbf{t}_{0} is the transition rates from the transient states to the absorbing state, 0. Therefore we have T1+t0=0\textbf{T}\textbf{1}+\textbf{t}_{0}=\textbf{0}, (where 0, 1 are column vectors of zeros and ones with proper dimensions, respectively). It can be easily proved that

eQy=[101eyT1eyT],\displaystyle e^{\textbf{Q}y}=\begin{bmatrix}1&\textbf{0}^{\prime}\\ \textbf{1}-e^{y\textbf{T}}\textbf{1}&e^{y\textbf{T}}\end{bmatrix}, (2)

( matrix exponential of the squared matrix 𝐁\mathbf{B} is defined as e𝐁=l=0Bll!e^{\mathbf{B}}=\sum_{l=0}^{\infty}\frac{\textbf{B}^{l}}{l!}).
We also assume that P(J0{0})=β0=0P(J_{0}\in\{0\})=\beta_{0}=0. As a result, 𝜷=(β0,𝜶)\boldsymbol{\beta}=(\beta_{0},\boldsymbol{\alpha}) where we have that 𝜶1=1\boldsymbol{\alpha}\textbf{1}=1.
If we define, Y=inf{t;Jt=0}Y=\inf\{t;J_{t}=0\}, then it is said that YY is a PH random variable with representation (𝜶,T)(\boldsymbol{\alpha},\textbf{T}). For a continuous random variable YY, the event Y>yY>y implies that the process started from any transient state has not reached the absorbing state by time yy, thus the distribution function of YY can be obtained as follows:

F(y)=1𝜶eyT1,y0.\displaystyle F(y)=1-\boldsymbol{\alpha}e^{y\textbf{T}}\textbf{1},~{}~{}~{}y\geq 0.

Thus the survival function of YY is

S(y)=𝜶eyT1,y0.S(y)=\boldsymbol{\alpha}e^{y\textbf{T}}\textbf{1},~{}~{}~{}y\geq 0. (3)

By taking derivatives of F(y)F(y), one can obtain the probability density function of YY as follows ,

fY(y)=𝜶eyTt0,y0.f_{Y}(y)=\boldsymbol{\alpha}e^{y\textbf{T}}\textbf{t}_{0},~{}~{}~{}y\geq 0.

The Laplace-transform and kthk^{th} moments are given in the following.

Φ(s)=𝜶(sIT)1t0,\Phi(s)=\boldsymbol{\alpha}(s\textbf{I}-\textbf{T})^{-1}\textbf{t}_{0},
E(Yk)=k!𝜶(T1)k1,k=0,1,2,.E(Y^{k})=k!\boldsymbol{\alpha}(-\textbf{T}^{-1})^{k}\textbf{1},~{}~{}~{}~{}k=0,1,2,\cdots.

For a complete review of PH distributions refer to [3].

3 Transition probabilities

Based on the structure of a representation of the PH, several classes of phase-type distributions can be distinguished. The structure of a PH representation often has an impact on its application, as some structures allow more efficient solutions. The most important distinction is the one into Acyclic and General Phase-type distributions: Every acyclic phase-type distribution has at least one Markovian representation without cycles in the sub–generator, while for general phase-type distributions cycles are allowed. An application of the general PH is in the reversible illness-death model. In this paper we consider the reversible illness-death model in a Markovian environment.

The analysis in recurrent studies is often performed using the multi–state models. These models are very useful for describing event history data offering a better understanding of the process of the illness, and leading to a better knowledge of the evolution of the disease over time. The complexity of a multi-state model greatly depends on the number of states defined and by the transitions allowed between these states. Our approach is calculating the probability of the number of transitions.
We use the finite-state continuous-time Markov process with multi states. One absorbing state which is the death phase and the other states are recovery and disease phases. From now on, we will use the word ”stage” rather than ”state” or ”phase” in our multi-state model. In fact, we will assume kk stages ( from 1 to kk) for the disease and recovery, (one stage for the recovery and k1k-1 for the disease) and one absorbing state for death. Each stage will have nn transient states. In this model transition from every stage to another is possible. In some cases the states inside stages can be interpreted as physiological ages, see [8] and [23] where aging is transition from one physiological age to the next physiological age and process will end when transition occurs from any other state to the absorbing state, death. For each state i,i=1,2,..,ni,i=1,2,..,n in every stage, several parameters are used for modeling mortality. Issues of interest i.e., the number of recurrences or transitions until time tt, the number of transitions from one stage to another and the expected time stay in every stage will be calculated.
After the diagnosis of special disease, that stage is also known. So patient is at one of the disease stages at time t=0t=0 -arrival time- at age xx. At any time thereafter, he may arrive at death state, recovery or visit another disease stages. Recovered patient may become sick or dead. Our proposed multi-state model is reversible, in the sense that past stages can be revisited. For this model, let EE denote the space state of the underlying Markov chain, then E=j=1kEiDE=\bigcup_{j=1}^{k}E_{i}\bigcup D where sets of E1,E2,.EkE_{1},E_{2},\dots.E_{k} are to represent stages, including recovery, and the state D={0}D=\{0\} represents death. Every stage, including recovery, has a finite set of nn states labeled 1,,n1,...,n, with instantaneous transitions being possible between selected pairs of states, see Figure 1. In general, there are k×n+1k\times n+1 states, where k×nk\times n are for recovery and disease stages and 1 for the death state.
For each t0t\geq 0, the continuous Markovian random process JtJ_{t} is in one of the stages 1,,k1,...,k or in DD. We interpret the event JtEiJ_{t}\in E_{i} to mean that the individual is in stage ii at age x+tx+t, i=1,,ki=1,\dots,k. We assume the rate of transition from the disease stages depend on stage, age and duration of illness. Therefore, we design our Markovian model such that it reflects the semi-Markovian property.
As mentioned earlier, we propose a reversible illness–death model involving nn states in each alive stage. By doing so, the semi-Markov property can be captured in the model. This method have been used in [1]. Patients may move to the next stage of disease, recover or death. The recovery rate will normally be lower for the later stages of disease. Of course, patients may die while in any state of stages, as indicated by the arrows to death in Figure 1.

Refer to caption
Figure 1: Reversible illness-death model: the k+1 stages (boxes) and the possible transition among them (arrows).

Since the Markov process has only a single absorbing state, the time of death (the time of absorption) follows a PH distribution. Note also that the time staying in recovery or disease statuses can be obtained by using the Markov property.
Furthermore, the sub-intensity matrix, T, of the Markov process in (1) is given by

T=[T1T1,2T1,3T1,kT2,1T2T2,3T2,kTk,1Tk,2Tk,3Tk],\textbf{T}=\begin{bmatrix}\textbf{T}_{1}&\textbf{T}_{1,2}&\textbf{T}_{1,3}&\cdots&\textbf{T}_{1,k}\\ \textbf{T}_{2,1}&\textbf{T}_{2}&\textbf{T}_{2,3}&\cdots&\textbf{T}_{2,k}\\ \vdots&\vdots&\vdots&\vdots&\vdots\\ \textbf{T}_{k,1}&\textbf{T}_{k,2}&\textbf{T}_{k,3}&\cdots&\textbf{T}_{k}\end{bmatrix}, (4)

where Ti,j,ij,i,j=1,2,,k\textbf{T}_{i,j},i\neq j,~{}i,j=1,2,...,k are n×nn\times n matrices containing the transition rates from the stage ii to jj. The matrices Ti\textbf{T}_{i}’s, i=1,2,,ki=1,2,...,k are the sub-intensity matrices (with non-zero main and upper diagonal elements) for the Markov chain describing a sojourn in stage ii. The mortality rates, 𝐭0=T𝟏\mathbf{t}_{0}=-\textbf{T}\boldsymbol{1}, is an n×kn\times k-dimensional column vector containing the rate of death from each of the n×kn\times k alive states.
In this paper, we assume that the person at diagnose has age of xx and we will remove xx from the notations.
From now on, it as also assumed that 𝜶=(𝜶𝟏,𝜶𝟐,,𝜶𝒌)\boldsymbol{\alpha}=(\boldsymbol{\alpha_{1}},\boldsymbol{\alpha_{2}},\dots,\boldsymbol{\alpha_{k}}), where 𝜶𝒊\boldsymbol{\alpha_{i}}’s are all 1×n1\times n vectors corresponding to initial probabilities to start from stage ii (𝜶𝟏\boldsymbol{\alpha_{1}} corresponds to the recovery which can be assumed to be 𝟎\boldsymbol{0}). The normalized and the transposed vectors of a vector 𝜷\boldsymbol{\beta} are denoted by 𝜷^\hat{\boldsymbol{\beta}} and 𝜷\boldsymbol{\beta}^{\prime}, respectively.
Now with the notations above and assumptions, by using the Markov property we are able to calculate the probability of various number of transitions until time tt.

Theorem 3.1.

: Let {Jt,t0}\{J_{t},t\geq 0\} be a continuous-time Markov chain with space state E=j=1kEjDE=\bigcup_{j=1}^{k}E_{j}\cup D and with the sub-intensity matrix (4) where DD the only absorbing state and EjE_{j}’s are mutual disjoint subsets of EE. Assume {N(t),t0}\{N(t),t\geq 0\} is the number of transitions between EjE_{j}’s in [0,t][0,t] or from any EjE_{j} to DD, j=1,,kj=1,...,k, and define Pt(i)(l)=P[N(t)=l|J0Ei]P^{(i)}_{t}(l)=P[N(t)=l|J_{0}\in E_{i}]. Then we have the followings:

Pt(i)(0)=𝜶^ietTi1P^{(i)}_{t}(0)=\hat{\boldsymbol{\alpha}}_{i}\,e^{t\textbf{T}_{i}}\textbf{1} (5)
Pt(i)(1)=i1=1i1ik𝜶^i𝐱i1(i)(t)1,P^{(i)}_{t}(1)=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\,\mathbf{x}^{(i)}_{i_{1}}(t)\textbf{1}, (6)
Pt(i)(2)=i1,i2i1ii2i1k𝜶^i𝐱i1,i2(i)(t)1,P^{(i)}_{t}(2)=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\mathbf{x}^{(i)}_{i_{1},i_{2}}(t)\textbf{1}, (7)
\vdots
Pt(i)(l)=i1,i2,,ili1ii2i1ilil1l𝜶^i𝐱i1,i2,il(i)(t)1P^{(i)}_{t}(l)=\sum_{\begin{subarray}{c}i_{1},i_{2},...,i_{l}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\\ \vdots\\ i_{l}\neq i_{l-1}\end{subarray}}^{l}\hat{\boldsymbol{\alpha}}_{i}\mathbf{x}^{(i)}_{i_{1},i_{2}\cdots,i_{l}}(t)\textbf{1} (8)

where 𝐱i1(i)(.),𝐱i1i2(i)(.),,𝐱i1i2ik(i)(.)\mathbf{x}^{(i)}_{i_{1}}(.),\mathbf{x}^{(i)}_{i_{1}i_{2}}(.),\cdots,\mathbf{x}^{(i)}_{i_{1}i_{2}\cdots i_{k}}(.) satisfy the following differential equations:

ddt𝐱i1(i)(t)=eTitTi,i1+𝐱i1(i)(t)Ti1,\frac{d}{dt}\mathbf{x}^{(i)}_{i_{1}}(t)=e^{\textbf{T}_{i}\,t}\textbf{T}_{i,i_{1}}+\mathbf{x}^{(i)}_{i_{1}}(t)\textbf{T}_{i_{1}},
ddt𝐱i1,i2(i)(t)=𝐱i1(i)(t)Ti1,i2+𝐱i1,i2(i)(t)Ti2,\frac{d}{dt}\mathbf{x}^{(i)}_{i_{1},i_{2}}(t)=\mathbf{x}^{(i)}_{i_{1}}(t)\textbf{T}_{i_{1},i_{2}}+\mathbf{x}^{(i)}_{i_{1},i_{2}}(t)\textbf{T}_{i_{2}},

and

ddt𝐱i1,i2,,il(i)(t)=𝐱i1,i2il1(i)(t)Til1,il+𝐱i1,i2,,il(i)(t)Til,\frac{d}{dt}\mathbf{x}^{(i)}_{i_{1},i_{2},\cdots,i_{l}}(t)=\mathbf{x}^{(i)}_{i_{1},i_{2}\cdots i_{l-1}}(t)\textbf{T}_{i_{l-1},i_{l}}+\mathbf{x}^{(i)}_{i_{1},i_{2},\cdots,i_{l}}(t)\textbf{T}_{i_{l}},

with the matrices x satisfy the following recursive equations

xi1,i2,,im1,im(i)(t)=0txi1,i2,,im1(i)(z)Tim1imeTim(tz)𝑑z,\textbf{x}_{i_{1},i_{2},\dots,i_{m-1},i_{m}}^{(i)}(t)=\int_{0}^{t}\textbf{x}_{i_{1},i_{2},\dots,i_{m-1}}^{(i)}(z)\textbf{T}_{i_{m-1}i_{m}}e^{\textbf{T}_{i_{m}}(t-z)}dz,

for m=1,2,,m=1,2,\dots, where

x{}(i)(t)=eTit,\textbf{x}_{\{\}}^{(i)}(t)=e^{\textbf{T}_{i}\,t},

and all x’s are zero matrices, with proper dimension, at t=0t=0.

Proof: See Appendix 1.

To solve the differential equations above, we use ODE-45 function in MATLAB (2018) software. In the following we will consider two examples for the applications.

Example 1: The Stanford Heart Transplantation Study
The Stanford heart transplant study began in October 19671967. This data set can be found in [24] (pp 230-232) or in [25]. The available data covers the period until April 1, 19741974. Some patients died before an appropriate heart was found. Out of the 103103, 6969 received a heart transplant that 45 (65%)of them deceased. Of the 34 patients without transplant, the number of deaths was 30 (88%). The remaining 2828 alive patients contributed with censored survival times. For each individual, an indicator of its final vital status (censored or not), the survival times (time to transplant, time to death) from the entry of the patient into the study (in days), and a vector of covariates including age at acceptance (Age), year of acceptance (Year), previous surgery (Surgery: coded as 11 = yes; 0 = no), and transplant (Transplant: coded as 11 = yes; 0 = no) were recorded. The Transplant covariate is the only time-dependent covariate.
In this data-structure, an individual’s survival data is expressed by three variables: start, stop and event. For the Stanford study, the time-dependent covariate “transplant” represents a treatment intervention. Individuals without change in the time-dependent covariate are represented by only one line of data, whereas patients with a change in the time-dependent covariate must be represented by two lines. For these patients, the first line represents the time period until the transplant; the second line represents the time period that passes from the transplant to the end of the follow-up or death. The remaining (time-fixed) covariates are the same for the two lines. For each row, variables start and stop mark the time interval (start, stop) for the data, while event is an indicator variable taking on value 11 if there was a death at time stop, and 0 otherwise. As an example consider the information available from four patients (from the Stanford study) with identification numbers 2525, 2626, 2727 and 2828 in Table 1 . For the first two patients the time from enrollment to censoring is 18001800 and 14011401 days, respectively, and the first patient had a heart transplant 2525 days after enrollment. The time from enrollment to death for the third and fourth patients are 263263 and 7272 days, respectively, and the last patient received a new heart in day 7171.

Table 1: Stanford heart transplantation in Example 2.
id start stop event transplant age year surgery
25 0 25 0 0 33.2238 1.57426 0
25 25 1800 0 1 33.2238 1.57426 0
26 0 1401 0 0 30.5353 1.58248 0
27 0 263 1 0 8.7858 1.59069 0
28 0 71 0 0 54.0233 1.68378 0
28 71 72 1 1 54.0233 1.68378 0

The descriptive statistics of the data by Age (intervals of 5-year), Year (from 1967 to 1974, intervals of 1) and Surgery (yes=1 or no=0) are presented in Tables 2, 3 and 4, respectively.

Table 2: The descriptive statistics of the data by age in Example 2.
Age N Trans. Death Deathtrans.\textrm{Death}_{\textrm{trans.}} % Deathwithout trans.\textrm{Death}_{\textrm{without trans.}} % Deathafter trans.\textrm{Death}_{\textrm{after trans.}}
<<25 5 2 4 1 100 50
25-30 5 4 1 1 0 25
30-35 4 2 1 0 50 0
35-40 7 4 4 1 100 25
40-45 19 12 15 9 100 75
45-50 29 22 19 13 86 59
50-55 27 17 23 14 90 82
>>55 8 6 8 6 100 100
sum 103 69 75 45

In Table 2, the average age of heart patients is 45.2 years. The percent of transplanted patients with 45-50 years old is maximum with 78% transplanting. The rate of death for transplanted patients(#death of transplanted patients# transplanted patients\frac{\#\text{death of transplanted patients}}{\#\text{ transplanted patients}}) increases with age.

Table 3: The descriptive statistics of the data by year in Example 2.
Year N Trans. Death Deathtrans.\textrm{Death}_{trans.} % Deathwithout trans.\textrm{Death}_{\textrm{without trans.}} % Deathafter trans.\textrm{Death}_{\textrm{after trans.}}
1 16 7 16 7 100 100
2 17 11 15 10 83 91
3 10 8 6 4 100 50
4 17 11 14 8 100 73
5 17 12 12 8 100 67
6 19 16 11 8 100 50
>>6 7 4 1 0 33 0
sum 103 69 75 45

In Table 3, the number of heart patients’ enrollment is about the same for every year but the percent of transplanted patients for the fifth year of study is maximum with 84% transplanting. The mortality rate for transplanted patients has decreased over the years so it is expected that the influence of year of acceptance on hazard is negative.

Table 4: The descriptive statistics of the data by surgery in Example 2.
surgery N Trans. Death Deathtrans.\textrm{Death}_{\textrm{trans.}} % Deathwithout trans.\textrm{Death}_{\textrm{without trans.}} % Deathafter trans.\textrm{Death}_{\textrm{after trans.}}
0 87 56 66 39 87 70
1 16 13 9 6 100 46
sum 103 69 75 45

As can be seen in Table 4, most patients have no history of surgery and patients without surgery are more likely to die, about 76%. Heart patients with surgery are more likely to have heart transplants than those without heart surgery, in other words, 81% of patients with surgery and 64% of patients without surgery have a heart transplant. Therefore it is expected that the influence of previous surgery on hazard is negative.
In this example, we use the illness-death model that can be used to study the effect of binary time-dependent covariates, shown in Figure 2.

Refer to caption
Figure 2: Illness-death model with disease and transplant states.

The heart patients are in the illness status when the disease is diagnosed. At any time thereafter, they may become transplanted or deceased. As explained in Section 1, rates of transition from the illness state depend on the duration of disease. Therefore the model must be a semi-Markov model.
We propose a model involving 22 stages (Illness and Transplant) as shown in Figure 2. In each stage, the patients may move from left to right in the figure as they age. After a heart transplant, patients move right from a state in the illness stage to a state in the transplant stage. Of course, patients may die while in any state, as indicated by the arrows to death state in Figure 2.
In order to describe how probabilities values are calculated, we define some notations for our model. For t0t\geq 0, let JtJ_{t} represent the status of a patient at time tt. We use the finite-state continuous-time Markov process with 2n2n alive states: nn transient states for disease stage, nn transient states for transplant stage, and one absorbing state which is the death state. Suppose that the space state is partitioned into the set E1E_{1} of nn illness states and the set E2E_{2} of nn transplant states and DD, for death.
Hence the sub-intensity matrix (4) will be reduced to

𝐓=[T1T120T2],\mathbf{T}=\begin{bmatrix}\textbf{T}_{1}&\textbf{T}_{12}\\ \textbf{0}&\textbf{T}_{2}\end{bmatrix}, (9)

where the T1\textbf{T}_{1} and T2\textbf{T}_{2} are the sub-intensity (with non-zero main and upper diagonal elements) matrices for the Markov chain describing a sojourn in E1E_{1}, and E2E_{2}, respectively. The matrix 𝐓12\mathbf{T}_{12} contains the transition rates from the illness stage to the transplant stage. Let 𝜶\boldsymbol{\alpha} be an 2n-dimensional column vector providing the initial state probabilities.

The elements of the proposed sub-intensity matrix T are:

T=[(q11+λ0+λ1)λ00λ1000(q21+λ0+λ1)λ00λ1000(qn1+λ1)00λ1000(q12+λ0)λ000000(q22+λ0)λ000000qn2],\textbf{T}=\begin{bmatrix}-(q^{1}_{1}+\lambda_{0}+\lambda_{1})&\lambda_{0}&\cdots&0&\lambda_{1}&0&0&\cdots\\ 0&-(q^{1}_{2}+\lambda_{0}+\lambda_{1})&\lambda_{0}&\cdots&0&\lambda_{1}&0&\cdots\\ \vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&-(q^{1}_{n}+\lambda_{1})&0&0&\cdots&\lambda_{1}\\ 0&0&\cdots&0&-(q^{2}_{1}+\lambda_{0})&\lambda_{0}&0&\cdots\\ 0&0&\cdots&0&0&-(q^{2}_{2}+\lambda_{0})&\lambda_{0}&\cdots\\ \vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&0&0&0&\cdots&-q^{2}_{n}\\ \end{bmatrix}, (10)

where λ0\lambda_{0} is the rate of transition from one state to the next, λ1\lambda_{1} is the rate of transplant from disease. The rate of mortality for the stages decease and transplant are shown by 𝐪i1\mathbf{q}^{1}_{i} and 𝐪i2\mathbf{q}^{2}_{i}, respectively.

𝐪i1\displaystyle\mathbf{q}^{1}_{i} =\displaystyle= a+b+q×i(p+γ1age+γ2year+γ3surgery)\displaystyle a+b+q\times i^{(p+\gamma_{1}age+\gamma_{2}year+\gamma_{3}surgery)}
𝐪i2\displaystyle\mathbf{q}^{2}_{i} =\displaystyle= a+q×i(p+γ1age+γ2year+γ3surgery)\displaystyle a+q\times i^{(p+\gamma_{1}age+\gamma_{2}year+\gamma_{3}surgery)} (11)

for i=1,2,,ni=1,2,\dots,n. Where the constant aa is interpreted as a background rate that a general reflection of the living environment, qq is a scale parameter and pp is a measure of the relative impact of the occurrence of aging. By descriptive statistics from heart data, rate of death for heart patients, before transplant is higher than after. So we add the parameter bb to the elements of 𝐪i1\mathbf{q}^{1}_{i}. The three regression coefficients, γ1,γ2\gamma_{1},\gamma_{2}, and γ3\gamma_{3} are included in the model to represent Age, Year, Surgery effects.
Also it was assumed that patients at acceptance will be in the first phase of the model. That is, 𝜶=(1,0,..,0)\boldsymbol{\alpha}=(1,0,..,0)^{\prime}.

We aim to estimate the parameters by the maximum likelihood method. In the heart transplant data, there are four situations for a patient: He stays in disease stage (stage 1) until the end of the study period, dies before the transplant, dies after the transplant or he is alive after transplant (stage 2) until end of the follow-up. Since the Markov process has only a single absorbing state, the time of death (the time of absorbing) follows a PH distribution. In order to estimate the parameters of the model, by using the phase-type density function (3) with representation (𝜶,𝐓\boldsymbol{\alpha},\mathbf{T}), we can construct expressions for the contribution of the likelihood function.

1. The probability of staying in disease state for a period of time tt:

f1=Pr[JuE1u[0,t]|J0E1]=𝜶^1etT11.f_{1}=Pr[J_{u}\in E_{1}~{}~{}\forall u\in[0,t]|J_{0}\in E_{1}]=\boldsymbol{\hat{\alpha}}_{1}^{\prime}e^{t\textbf{T}_{1}}\textbf{1}. (12)

2. The probability of staying in disease state for ss time units and then stay in transplant state until time tt, will be:

f12=Pr[JuE1u[0,s],Jv+sE2v[0,ts]|J0E1]=𝜶^1eT1sT12eT2(ts)1.f_{12}=Pr[\underset{\forall u\in[0,s]}{J_{u}\in E_{1}},\underset{\forall v\in[0,t-s]}{J_{v+s}\in E_{2}}|J_{0}\in E_{1}]=\boldsymbol{\hat{\alpha}}_{1}^{\prime}\,e^{\textbf{T}_{1}\,s}\textbf{T}_{12}e^{\textbf{T}_{2}(t-s)}\textbf{1}. (13)

3. The probability of death in disease state after tt time units is:

f10=Pr[JuE1u[0,t),JtD|J0E1]=𝜶^1eT1tt10.f_{10}=Pr[\underset{\forall u\in[0,t)}{J_{u}\in E_{1}},J_{t}\in D|J_{0}\in E_{1}]=\boldsymbol{\hat{\alpha}}_{1}^{\prime}e^{\textbf{T}_{1}\,t}\textbf{t}_{10}. (14)

where t10=T11\textbf{t}_{10}=-\textbf{T}_{1}\textbf{1}.

4. The probability of death at time tt from the transplant after staying in disease state for ss time units is:

f20=Pr[JuE1u[0,s]Jν+sE2ν[0,ts],JtD|J0E1]=𝜶1esT1T12eT2(ts)t20\begin{split}f_{20}&=Pr[\underset{\forall u\in[0,s]}{J_{u}\in E_{1}}\,\underset{\forall\nu\in[0,t-s]}{J_{\nu+s}\in E_{2}},J_{t}\in D|J_{0}\in E_{1}]\\ &=\boldsymbol{\alpha}_{1}^{\prime}\,e^{s\textbf{T}_{1}}\textbf{T}_{12}e^{\textbf{T}_{2}(t-s)}\textbf{t}_{20}\end{split} (15)

where 𝐭20=T21\mathbf{t}_{20}=-\textbf{T}_{2}\textbf{1}. We estimate the parameters in (3) by maximizing the likelihood function,

L(θ)=f1n1f12n2f10n3f20n4,L(\theta)=f_{1}^{n_{1}}f_{12}^{n_{2}}f_{10}^{n_{3}}f_{20}^{n_{4}},

where nkn_{k}s, k=1,2,3,4,k=1,2,3,4, represent the number of patients in each scenarios explained in (12)-(15) and θ=(a,b,q,p,λ0,λ1,γ1,γ2,γ3)\theta=(a,~{}b,~{}q,~{}p,~{}\lambda_{0},~{}\lambda_{1},~{}\gamma_{1},~{}\gamma_{2},~{}\gamma_{3}).
The FMINCON function in MATLAB Software has been used for finding the maximum likelihood estimates. We estimated the parameters for different numbers of nn. Based on the results shown in Table 5, n=3n=3 was chosen. Note that the our criterion is maximizing l(θ)=log(L(θ))l(\theta)=log(L(\theta)).

Table 5: Parameters values for different nn values.
nn aa qq pp λ0\lambda_{0} λ1\lambda_{1} bb γ1\gamma_{1} γ2\gamma_{2} γ3\gamma_{3} l(θ)l(\theta)
1 1.7e-03 5.0e-06 7.5 1.50 0.0116 0.0033 1.0 -0.25 -0.54 -896.48
2 1.7e-03 6.7e-08 3.0 0.50 0.0116 0.0034 0.37 -0.17 -0.59 -896.45
3 4.9e-04 6.4e-08 9.3 0.50 0.0115 0.0034 0.098 -0.02 -0.92 -885.17
4 7.7e-04 2.7e-08 7.7 0.49 0.0113 0.0034 0.104 -0.003 -0.98 -885.88
5 7.7e-04 1.2e-08 7.1 0.49 0.0115 0.0036 0.094 -1.2e-07 -0.98 -886.25
6 8.4e-04 8.7e-09 6.5 0.49 0.0116 0.0037 0.089 -0.015 -0.87 -886.78
8 1.0e-03 5.9e-10 6.7 0.30 0.0116 0.0037 0.094 -2.1e-08 -0.86 -888.37
10 8.9e-04 5.1e-09 5.2 0.48 0.0119 0.0038 0.077 -1.3e-08 -0.75 -887.63
20 1.7e-03 7.5e-09 2.2 0.008 0.0116 0.0034 0.49 -0.47 -0.60 -895.97

The parameters values, the standard deviations and the confidence intervals at level 95% are shown in Table 6.
In the fitted model, the influence of Age at acceptance on hazard is positive, while effects of Year and Surgery are both negative as they were expected.
The standard deviation of each parameter is obtained by the bootstrap technique. The main benefit of the bootstrap is that it allows statisticians to set the confidence intervals of the parameters without having to make unreasonable assumptions. It creates multiple resamples (with replacement) from a single set of observations, and computes the effect size of interest on each of these resamples. The bootstrap resamples of the effect size can then be used to determine the 95%95\% confidence interval. See [21] for more about bootstrap techniques.

Table 6: Parameters values with standard deviations and confidence intervals(95%95\%).
Param. a q p λ0\lambda_{0} λ1\lambda_{1} b γ1\gamma_{1} γ2\gamma_{2} γ3\gamma_{3} supθl(θ)sup_{\theta}l(\theta)
Estimate 4.9e-04 6.4e-08 9.3 0.50 0.0115 0.0034 0.098 -0.02 -0.92 -885.17
Std. 5.3e-04 2.9e-06 2.36 0.34 0.0036 0.0018 0.49 0.37 0.22
(Lower,  ) 1.9e-04 9.2e-14 1.5e-08 4.8e-04 0.007 4.9e-04 0.055 -0.50 -0.98
(  ,Upper) 0.0025 1.0e-05 9.7 1.25 0.021 0.0077 1.77 -1.8e-08 -0.0014

It was our interest to know about the significance of the covariates and the parameter bb in the model. To this end, we have conducted hypothesis testing for some null tests. In Table 7 the results are presented.

Table 7: The value of estimates
Parameter a q p λ0\lambda_{0} λ1\lambda_{1} b γ1\gamma_{1} γ2\gamma_{2} γ3\gamma_{3} supθl(θ)sup_{\theta}l(\theta)
without γ1\gamma_{1},γ2\gamma_{2},γ3\gamma_{3} 0.0017 4.3e-09 3.01 0.50 0.0116 0.0033 - - - -896.48
without γ1\gamma_{1} 0.0013 9.9e-06 5.1 0.50 0.0114 0.0029 - -0.50 -0.96 -895.54
without γ2\gamma_{2} 6.1e-04 1.2e-08 10.4 0.21 0.0114 0.0036 0.14 - -0.98 -885.02
without γ3\gamma_{3} 5.0e-04 3.5e-07 7.4 0.50 0.0116 0.0036 0.13 -3.7e-04 - -888.01
without b 0.0022 2.1e-09 0.39 0.0035 0.0112 - 1.71 -0.077 -0.52 -904.7

The likelihood ratio tests (LRTLRT) has been used for the testing. For example, for the null hypothesis H0:i,γi=0H_{0}:\forall i,\gamma_{i}=0 against the alternative H1:i,γi0H_{1}:\exists i,\gamma_{i}\neq 0 can be expressed:

LRT=2log(supΘ0L(θ)supΘL(θ))=2(supΘ0l(θ)supΘl(θ))=22.62,LRT=-2log(\frac{sup_{\Theta_{0}}L(\theta)}{sup_{\Theta}L(\theta)})=-2(sup_{\Theta_{0}}l(\theta)-sup_{\Theta}l(\theta))=22.62, (16)

which is greater than χ0.95,32=7.8\chi^{2}_{0.95,3}=7.8 and therefore the null hypothesis is disproved. As seen in Table 7, we can obtain the values of LRTLRT for the model without every variable separately. Θ\Theta is the parameter space and Θ0\Theta_{0} is the parameter under the null hypothesis.
For testing the null hypothesis H0:b=0H_{0}:b=0 against the alternative H1:b0H_{1}:b\neq 0, LRTLRT is 39.0639.06, by comparing with χ0.95,12=3.8\chi^{2}_{0.95,1}=3.8, we reject H0H_{0}. Hence, it can be said that mortality before heart transplant is greater than after transplant.

Now using the value of parameters, the probability of the number of transition until time tt can be calculated. These probabilities are shown in Table 8 for different Age ( 3030 and 5050), Year (years 33 and 55) and Surgery (0 and 11). As we can seen in this table, the probability without transition decreases as tt increases because the patient may die or transplant after more time of staying in disease stage. The event of no transition decreases with age, increases with year and surgery.
As we said before, mortality increases with age, so probability of staying in the disease stage for older patients are reduced, and he/she is more likely to die. Therefore, it is expected that the probability of one transition increases with age. Mortality decreases with year and surgery, so the probability for no transition also increases as the year and surgery increase. The probability of one transition increases until 6 months but decreases thereafter because the probability of transplant decreases in the later months and the probability of death increases. The probability of two transitions constantly increases. It is obvious that this probability increases with age because of its positive influence on death, decreases with year and surgery because of their negative influence on death.

Table 8: Probability of N(t) for different Age, Year and Surgery.
Age Year Surgery P[N(t)] 1 month 3 months 6 months 1 year 3 years
30 3 0 P[N(t)=0] 0.6254 0.2441 0.0595 0.0033 3.5e-08
P[N(t)=1] 0.3714 0.7338 0.8786 0.8505 0.6082
P[N(t)=2] 0.0032 0.0221 0.0619 0.1462 0.3918
1 P[N(t)=0] 0.6279 0.2474 0.0612 0.0035 4.1e-08
P[N(t)=1] 0.3695 0.7350 0.8892 0.8776 0.6650
P[N(t)=2] 0.0026 0.0176 0.0496 0.1190 0.3350
5 0 P[N(t)=0] 0.6255 0.2443 0.0596 0.0033 3.5e-08
P[N(t)=1] 0.3713 0.7339 0.8793 0.8523 0.6117
P[N(t)=2] 0.0032 0.0218 0.0611 0.1444 0.3883
1 P[N(t)=0] 0.6280 0.2475 0.0612 0.0035 4.2e-08
P[N(t)=1] 0.3695 0.7350 0.8894 0.8783 0.6666
P[N(t)=2] 0.0026 0.0175 0.0493 0.1183 0.3334
50 3 0 P[N(t)=0] 0.5955 0.2077 0.0428 0.0017 4.5e-09
P[N(t)=1] 0.3935 0.7215 0.7764 0.6360 0.3831
P[N(t)=2] 0.0110 0.0709 0.1808 0.3624 0.6169
1 P[N(t)=0] 0.6168 0.2332 0.0542 0.0027 2.0e-08
P[N(t)=1] 0.3777 0.7299 0.8453 0.7712 0.4802
P[N(t)=2] 0.0055 0.0369 0.1005 0.2261 0.5198
5 0 P[N(t)=0] 0.5969 0.2093 0.0434 0.0017 5.0e-09
P[N(t)=1] 0.3925 0.7220 0.7804 0.6426 0.3844
P[N(t)=2] 0.0106 0.0688 0.1762 0.3557 0.6156
1 P[N(t)=0] 0.6173 0.2339 0.0545 0.0027 2.0e-08
P[N(t)=1] 0.3774 0.7301 0.8472 0.7756 0.4859
P[N(t)=2] 0.0053 0.0360 0.0982 0.2217 0.5141

Given JuEiJ_{u}\in E_{i}, the time of staying continuously in stage ii of disease has a PH distribution with representation (𝜶^iu,Ti)(\boldsymbol{\hat{\alpha}}^{u}_{i},\textbf{T}_{i}), where

𝜶^iu=𝜶^eTuIEEi𝜶^eTu1i,\boldsymbol{\hat{\alpha}}^{u}_{i}=\dfrac{\boldsymbol{\hat{\alpha}}^{\prime}\,e^{\textbf{T}u}\textbf{I}_{E\,E_{i}}}{\boldsymbol{\hat{\alpha}}^{\prime}\,e^{\textbf{T}u}\textbf{1}_{i}},

IEEc\textbf{I}_{E\,E_{c}}, c=1,,kc=1,\dots,k, is nk×nnk\times n matrix with (n(c1)+l,l)(n(c-1)+l,l) entry equal to 1, l=1,,nl=1,\dots,n and all other entries are 0, and 1c\textbf{1}_{c} is a column vector with nknk entries such that elements n(c1)+ln(c-1)+l, l=1,,nl=1,\dots,n equal to 11 and all other entries 0.

Hence, given that the process JJ is ithi^{\textrm{th}} stage at time uu, the expected time of continuously staying in this stage, in time interval [0,t][0,t] is equal to:

E[0t𝟏{JνEi,ν[0,ω]|JuEi}𝑑ω]=0tPr[JνEi,ν[0,ω]|JuEi]𝑑ω=0t𝜶^iueTiω1i𝑑ω=𝜶^iuTi1(eTitI)1.\begin{split}E[\int_{0}^{t}\mathbf{1}_{\{J_{\nu}\in E_{i},\,\forall\nu\in[0,\omega]|J_{u}\in E_{i}\}}d\omega]&=\int_{0}^{t}Pr[J_{\nu}\in E_{i},\,\forall\nu\in[0,\omega]|J_{u}\in E_{i}]\,d\omega\\ &=\int_{0}^{t}\hat{\boldsymbol{\alpha}}^{u}_{i}\,e^{\textbf{T}_{i}\,\omega}\textbf{1}_{i}\,d\omega\\ &=\boldsymbol{\hat{\alpha}}^{u}_{i}\,\textbf{T}_{i}^{-1}(e^{\textbf{T}_{i}\,t}-\textbf{I})\textbf{1}.\end{split} (17)

The integrand in (17) can be interpreted as the survival function of a PH distribution with representation of (𝜶^iu,Ti)(\boldsymbol{\hat{\alpha}}^{u}_{i},\textbf{T}_{i}). Given starting from stage illness (stage 1), expected times without transition for u=0u=0, are shown in Table 9 for different ages. It is expected that the 30-year-old patient spends 24 days in disease stage during one month. This expected time decrease as age increases because of transition to death state.

Table 9: Expected sojourn time (days) of without transition
duration
Age 1 month 3 months 6 months 1 year 3 years
30 24 48.3 60.1 63.6 63.9
40 23.9 47.6 58.7 61.9 62.1
50 23.5 45.6 55.0 57.3 57.4
60 22.5 40.5 46.3 47.3 47.3

Finally, the probability of transition from stage ii to stage jj in a period of time tt can be calculated easily as follows

Pr[Ju+tEj|JuEi]=𝜶^eTuIEEiIEEieTt1j𝜶^eTu1i.Pr[J_{u+t}\in E_{j}|J_{u}\in E_{i}]=\dfrac{\hat{\boldsymbol{\alpha}}^{\prime}e^{\textbf{T}u}\textbf{I}_{E\,E_{i}}\textbf{I}^{\prime}_{E\,E_{i}}e^{\textbf{T}t}\textbf{1}_{j}}{\hat{\boldsymbol{\alpha}}^{\prime}e^{\textbf{T}u}\textbf{1}_{i}}. (18)

For Age=30=30, Year=3=3 and Surgery=0=0, and u=0u=0 (18) can be seen in Table 10. The formula for the probability of transition from disease stage to transplant is greater than death. Mortality from disease and transplant stages increase over time but mortality before transplant is greater than after.

Table 10: Probability of transition from stage ii to stage jj for Age=30, Year=3, Surgery=0.
from to 1 months 3 months 6 months 1 year 3 years
disease transplant 0.2726 0.5338 0.6296 0.5866 0.3434
disease death 0.0988 0.2000 0.2490 0.2640 0.2648
transplant death 0.0032 0.0221 0.0619 0.1462 0.3918

Example 2: Cancer Disease
After someone is diagnosed with cancer, doctors will try to figure out if it has spread, and if so, how far. This process is called staging. The stage of a cancer describes how much cancer is in the body. It helps determine how serious the cancer is and how best to treat it. Cancer staging may sometimes include the grading of the cancer. This describes how similar a cancer cell is to a normal cell. Doctors also use a cancer’s stage when talking about survival statistics. The earliest stage of cancers are called stage 0 (carcinoma in situ), and then range from stages I (1) through IV (4). As a rule, the lower the number, the less the cancer has spread. A higher number, such as stage IV, means cancer has spread more. Although each person’s cancer experience is unique, cancers with similar stages tend to have a similar outlook and are often treated in much the same way. When cancer returns after a period of remission, it’s considered a recurrence. A cancer recurrence happens because, in spite of the best efforts to rid of cancer, some cells from cancer remain. These cells could be in the same place where cancer first originated, or they could be in another part of body. These cancer cells may have been dormant for a period of time, but eventually they continued to multiply, resulting in the reappearance of the cancer.
Also by the end of the observation period, the patient under study may not have reached an absorbing state. In survival analysis this would correspond to individual still being alive by the end of the study and this kind of incomplete observation is known as right-censoring.
To illustrate the computation of probabilities of transitions, we use hypothetical transition rate between stages. In the following, we show how the PH model can be used in modeling of recurrent events in cancers.
The model can be considered as a development of [22], where the number of disease stages (which in [22] is called disability stage) is increased from 2 to 5.
We assume 55 disease stages and one recovery stage. Each disease stage and recovery include 5 states, i.e. k=6k=6 and n=5n=5. The stage 11 corresponds to the recovery (R) and the stages 2,3,,62,3,\dots,6 represents cancer stages 0,1,,40,1,\dots,4, respectively. The entries of the sub-generator matrix T=(tij)i,j=1,,30\textbf{T}=(t_{ij})\,\,i,j=1,...,30 will be described in the following.

The rates of transition from one state to the next are given by

ti,i+1=λ,i=1,2,,29,i5,10,15,20,25,t_{i,i+1}=\lambda,\,\,\,i=1,2,...,29,i\neq 5,10,15,20,25,

where λ\lambda is assumed to be 0.20.2.
The rates of recovery from disease stage, defined based on the degree of malignancy of the disease, are

ti+5l,i5=γ(0.1)l,i=6,7,,10,l=0,1,,4,t_{i+5l,i-5}=\gamma(0.1)^{l},~{}~{}~{}i=6,7,...,10,~{}l=0,1,...,4,

where γ\gamma is the coefficient of recovery; assume γ=0.1\gamma=0.1
and the rates of transition from one stage of disease to the stage before, defined based on the degree of malignancy of the disease, are given by

ti+5l,i+5(l1)=β(0.1)l,i=11,12,,15,l=0,,3t_{i+5l,i+5(l-1)}=\beta(0.1)^{l},~{}~{}~{}i=11,12,...,15,~{}l=0,...,3

where β\beta is assumed to be 0.20.2.
The rates of transition from one stage of disease to the next are

ti+5l,i+5(l+1)=(l+1)(a+qip),i=1,2,,5,l=1,,4t_{i+5l,i+5(l+1)}=(l+1)(a+qi^{p}),~{}~{}~{}i=1,2,...,5,~{}l=1,...,4

with a=103a=10^{-3}, q=106q=10^{-6} and p=4.5p=4.5.
The rates of transition from recovery stage to each of the disease stages are given by

ti,i+5(l+1)=(0.1)l(a+qip),i=1,2,,5,l=0,1,,4,t_{i,i+5(l+1)}=(0.1)^{l}(a+qi^{p}),~{}~{}~{}i=1,2,...,5,~{}l=0,1,...,4,

whereas the mortality rates are given by

𝐭0=(t1,0,t2,0,,t30,0)\mathbf{t}^{\prime}_{0}=(t_{1,0},t_{2,0},...,t_{30,0})

where

ti,0=(0.1)5+a+qip,i=1,,5t_{i,0}=(0.1)^{5}+a+qi^{p},~{}~{}~{}i=1,...,5

and for l=0,1,,4l=0,1,...,4

ti+5(l+1),0=(0.1)4l+a+qip,i=1,,5.t_{i+5(l+1),0}=(0.1)^{4-l}+a+qi^{p},~{}~{}~{}i=1,...,5.

So other entries of T are zero except the diagonal entries which are:

ti,i=j=0,ji30ti,j,i=1,2,,30.t_{i,i}=-\sum_{j=0,j\neq i}^{30}t_{i,j},~{}~{}~{}i=1,2,...,30.

Finally, the initial probability is 𝜶=(𝜶1,𝜶2,,𝜶6)\boldsymbol{\alpha}^{\prime}=(\boldsymbol{\alpha}^{\prime}_{1},\boldsymbol{\alpha}^{\prime}_{2},...,\boldsymbol{\alpha}^{\prime}_{6}) where 𝜶i\boldsymbol{\alpha}_{i}’s are 55-dimensional column vector. After the diagnosis of cancer, first stage, ithi^{\textrm{th}}, is also known so 𝜶i\boldsymbol{\alpha}^{\prime}_{i} is (1,0,0,0,0)(1,0,0,0,0) and other elements of 𝜶\boldsymbol{\alpha} will be zero.

Table 11: Probability of N(t) in Example 2.
duration
input stage 6 months 12 months 24 months 36 months
P(N(t)=0)
0 0.5382 0.2885 0.0814 0.0226
1 0.2750 0.0752 0.0055 0.0004
2 0.8046 0.6429 0.3981 0.2403
3 0.5219 0.2701 0.0698 0.0175
4 0.0025 6.04e-6 3.62e-11 2.16e-16
P(N(t)=1)
0 0.4541 0.6880 0.8504 0.8576
1 0.5199 0.4422 0.1960 0.0952
2 0.1406 0.2061 0.2694 0.2959
3 0.4573 0.6915 0.8703 0.9132
4 0.9975 0.9999 0.9998 0.9998
P(N(t)=2)
0 0.0088 0.0204 0.0440 0.0669
1 0.2875 0.4614 0.3268 0.1523
2 0.0639 0.1173 0.1483 0.1552
3 0.0208 0.0377 0.0559 0.0622
4 7.94e-5 1.31e-4 1.71e-4 1.81e-4

Table 11 shows the probability of N(t)N(t), (5)–(7), for different input stages, for different durations. These probabilities behave the way we expect. With entering to every stage, probability of no transition is decreasing as the duration increases. This means that probability of staying in every stage is higher at the earlier times and decreases over time. For example in stage 44, probability is near to zero over time, because stay in this state very low and patient has a higher chance to die. For one transition until tt this probability is near to one, because of the transition to death state. As it is seen in Table 11, the probability of staying in stage 22 is the maximum, compared to other stages, due to the fact that chance of recovery is not as high as in stage 1 and the rate of death is not as high as in stage 3.
The expected time of staying in stage at diagnosis, (17), is shown in Table 12. When a patient is diagnosed with stage 0 cancer, it is expected that he stays in that stage for 4.5 months during the six months. If the first stage is 2, this value will be 5.4 months. As we said before, at this stage, the expected sojourn time is greater than the others because of the low chance of recovery compared to the previous stage and low chance of death compared to the next stage. The patient, who is in stage 4, is expected to eventually survive for 1 month.

Table 12: Expected sojourn time in first stage until tt.
duration
input stage 6 months 12 months 24 months 36 months
0 4.5 6.9 8.8 9.4
1 3.4 4.3 4.6 4.6
2 5.4 9.7 15.9 19.6
3 4.4 6.7 8.5 9.0
4 1 1 1 1

Also we can obtain the probability of one transition from stage ii to stage jj, P[Nij(t)=1]P[N_{ij}(t)=1], by using Equation (18).

Table 13: Probability of one transition between stages.
stage waiting time RR 0 11 22 33 44 DD
0 6 months 0.4440 0 0.0050 0 0 0 0.0051
12 months 0.6750 0 0.0046 0 0 0 0.0084
1 6 months 0.0334 0.4704 0 0.0092 0 0 0.0069
12 months 0.0420 0.3809 0 0.0104 0 0 0.0089
2 6 months 0.0054 0 0.0592 0 0.0165 0 0.0596
12 months 0.0096 0 0.0635 0 0.0247 0 0.1084
3 6 months 4.38e-04 0 0 0.0078 0 0.0032 0.4459
12 months 6.58e-04 0 0 0.0103 0 0.0021 0.6784
4 6 months 9.89e-6 0 0 0 1.16e-4 0 0.9973
12 months 9.72e-6 0 0 0 6.049e-5 0 0.9998

As seen in Table 13, probability of one transition from the lower stages of cancer to recovery stage is higher than the higher stages, this means that the lower the number, the less the cancer has spread and the higher number of cancer has spread more. In higher stages probability of transition to recovery is near to zero. Also probability of transition from the lower stages of cancer to death is lower than the higher stages and probability of transition from higher stages to death is near to one. Other possible probabilities from one stage to another is shown in this table.

4 Conclusion

In this article, the multi-state models, that are widely used in the medical studies for many diseases, has been developed via taking advantages of phase–type distribution, to model the recurrent events. It is important for decision makers and patients to be aware of information about their illness, such as the probability of relapse, the time stay in each stage of recovery or disease, the probability of recovery, and so on. Using Markov’s properties and phase-type distributions, we present a formula for calculating the probability of the number of transitions between the stage os diseases. To calculate these probabilities, which are interdependent and defined as recursive relations, we need to use differential equations to calculate them. In this article, we used two examples to calibrate our model. Example 11 was for 103 Stanford heart patients who have two stages: heart disease and heart transplant. The mortality rate was defined for these stages, which have several parameters. We estimate these parameters by the maximum likelihood method to estimate the matrix T. The standard deviations and the confidence intervals for the parameters are obtained by the bootstrap technique. Finally, the probability of the number of transition until time tt, the expected time of continuously staying in each stage and the probability of transition from stage ii to stage jj in a period of time tt are obtained. Another example is a simulated cancer that has one recovery stage and five cancer stages. In the simulated example, which had a more complex model, similar calculations have been done.

Calculating these probabilities for higher recurrent’s number is time consuming due to more transition between stages. Because of using of the recursive equations in the formulas, we need to use the differential equations of the previous equations to calculate the probability. In other words, we have to start from the beginning to get the probability.
We have used the available functions in Matlab and have not developed any algorithms for solving the differential equations. In future study, we aim to optimize the algorithms for calculating probability distribution of the transitions.

Data Availability

Data availability statement: NA

References

  • [1] Hassanzadeh, A., Jones, B. L. and Stanford, D. A. (2013). The use of phase-type models for disability insurance calculations, Scandinavian Actuarial Journal 2014: 714–728.
  • [2] Kleinbaum, D. G. and Klein, M. (2012). Survival Analysis. A Self-Learning Text, Third Edition, Springer, New York.
  • [3] Neuts, M. F. (1994). Martix-geometric solutions in stochastic models. An algorithmic approach, Mineola, NY: Dover Publications.
  • [4] Odd Aalen, O. (1995). Phase type distributions in survival analysis, Scandinavian Journal of Statistics 22(4): 447–463.
  • [5] Marshall, A. and Zenga, M. (2009). Simulating coxian phase-type distributions for patient survival, International transactions in operational research 16(2): 213–226.
  • [6] Nielsen, B. (2012). Lecture notes on phase-type distributions for 02407 Stochastic Process.
  • [7] Asmussen, S., Nerman, O. and Olsson, M. (1996). Fitting phase-type distributions via the EM algorithm, Scandinavian Journal of Statistics 23(4): 419–441.
  • [8] Lin, X. Sh. and Liu, X. (2007). Markov aging Process and Phase-Type Law of Mortality, North American Actuarial Journal 11(4): 92–109.
  • [9] P Hougaard. Heidelberg (2000). Analysis of Multivariate Survival Data, Springer, 542 pages, ISBN: 0-387-98873-4.
  • [10] David C. M. Dickson, Mary R. Hardy and Howard R. Waters (2009). Actuarial mathematics for life contingent risks,Cambridge University Press, New York.
  • [11] Andersen, P.K.,Borgan, O, Gill, R.D and Keiding, N. (1993). Statistical Models Based on Counting Processes, Springer, New York.
  • [12] Meira-Machado, L., de Una-Alvarez, J., Cadarso-Suarez, C. and Andersen, P.K. (2009). Multi-state models for the analysis of time to event data, Statistical Methods in Medical Research 18: 195–222.
  • [13] Andersen PK and Keiding N. (2002). Multi-state models for event history analysis, Stat Methods Med Res 11: 91–115.
  • [14] Andersen PK and Gill RD (1982). Cox’s regression model for counting processes: a large sample study, Ann Stat 10: 1100–-20.
  • [15] Lin DY, Wei LJ, Yang I and Ying Z (2000). Semiparametric regression for the mean and rate functions of recurrent events, J R Stat Soc B 62:711–-30.
  • [16] Kelly PJ and Lim L L-Y (2000). Survival analysis for recurrent event data: an application to childhood infectious diseases, Stat Med 19: 13-–33.
  • [17] Cook RJ and Lawless JF (2010). The Statistical analysis of recurrent events, 2nd edn. New York, NY: Springer.
  • [18] Araujo A., Meira-Machado L. and Roca-Pardinas J. (2014).TPmsm: Estimation of the Transition Probabilities in 3-State Models., Journal of Statistical Software 62(4): 1–29.
  • [19] Balboa V. and J. de Una-Alvarez (2018). Estimation of Transition Probabilities for the Illness-Death Model: Package TP.idm, Journal of Statistical Software 83(10): 1–19.
  • [20] Efron, B. (1979). Bootstrap methods: Another look at the Jackknife, Ann. Statist 7: 1–26.
  • [21] Efron, B. and Tibshiran, R. (1993). An introduction to the bootstrap, New York: Chapman & Hall.
  • [22] Hassan Zadeh, A. and Jones, B.L. and Stanford, D. A. (2014). The use of phase-type models for disability insurance calculations, Scandinavian Actuarial Journal. 2014(8): 714–728.
  • [23] Asghari, R. and Hassan zadeh, A. (2020). Mortality modeling of skin cancer patients with actuarial applications, Journal of North American Actuarial 0(0): 1–17, DOI:10.1080/10920277.2019.1670070.
  • [24] Kalbfleisch JD. and Prentice RL. (1980). The Statistical Analysis of Failure Time Data, John Wiley & Sons.
  • [25] Crowley J. and Hu M. (1977). Covariance Analysis of Heart Transplant Survival Data, Journal of the American Statistical Association 72: 27–36.
  • [26] Faddy M.J. and McClean S.I. (2000). Analysing data on lengths of stay of hospital patients using phase-type distributions, Applied Stochastic Models and Data Analysis.
  • [27] Latouche, G., Ramaswami, V. (1999). Introduction to Matrix Analytic Methods in Stochastic Modeling, Series on statistics and applied probability. ASA-SIAM.

5 Appendix 1

In this section we prove Theorem 3.1. The Equation (5) is obvious. To prove the rest, we divide [0,t][0,t] into mm equal subintervals, each length will be ds=tmds=\frac{t}{m}. First, we assume that any transition between the stages will occur in a multiplier of dsds. Assume also that sκ=κdss_{\kappa}=\kappa\,ds, κ=0,1,\kappa=0,1,\dots, then we have the followings:

Pt(i)(1)=Pr[N(t)=1|J0Ei]=E[𝟏N(t)=1|J0Ei]=E[𝟏{i1=1i1ikκ=0m1JuEiu[0,sκ],Jsκ+dsEi1,Jsκ+ds+νEi1ν[0,t(sκ+ds)]}|J0Ei]=E[i1=1i1ikκ=0m1 1{JuEiu[0,sκ],Jsκ+dsEi1,Jsκ+ds+νEi1ν[0,t(sκ+ds)]}|J0Ei]=i1=1i1ikκ=0m1E[𝟏{JuEiu[0,sκ],Jsκ+dsEi1,Jsκ+ds+νEi1ν[0,t(sκ+ds)]}|J0Ei]=i1=1i1ikκ=0m1Pr[JuEiu[0,sκ],Jsκ+dsEi1,Jsκ+ds+νEi1ν[0,t(sκ+ds)]|J0Ei]=i1=1i1ikκ=0m1Pr[JuEiu[0,sκ]|J0Ei]Pr[Jsκ+dsEi1|JsκEi]Pr[Jsκ+ds+νEi1ν[0,t(sκ+ds)]|Jsκ+dsEi1].\begin{split}P^{(i)}_{t}(1)&=Pr[N(t)=1|J_{0}\in E_{i}]=E[\mathbf{1}_{N(t)=1}|J_{0}\in E_{i}]\\ &=E[\mathbf{1}_{\{\bigcup_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\bigcup_{\kappa=0}^{m-1}{\underset{\forall u\in[0,s_{\kappa}]}{J_{u}\in E_{i}},~{}J_{s_{\kappa}+ds}\in E_{i_{1}},~{}\underset{\forall\nu\in[0,t-(s_{\kappa}+ds)]}{J_{s_{\kappa}+ds+\nu}\in E_{i_{1}}}}\}}|J_{0}\in E_{i}]\\ &=E[\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\sum_{\kappa=0}^{m-1}\,\mathbf{1}_{\{\underset{\forall u\in[0,s_{\kappa}]}{J_{u}\in E_{i}},J_{s_{\kappa}+ds}\in E_{i_{1}},\underset{\forall\nu\in[0,t-(s_{\kappa}+ds)]}{J_{s_{\kappa}+ds+\nu}\in E_{i_{1}}}\}}|J_{0}\in E_{i}]\\ &=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\sum_{\kappa=0}^{m-1}\,E[\mathbf{1}_{\{\underset{\forall u\in[0,s_{\kappa}]}{J_{u}\in E_{i}},J_{s_{\kappa}+ds}\in E_{i_{1}},\underset{\forall\nu\in[0,t-(s_{\kappa}+ds)]}{J_{s_{\kappa}+ds+\nu}\in E_{i_{1}}}\}}|J_{0}\in E_{i}]\\ &=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\sum_{\kappa=0}^{m-1}Pr[\underset{\forall u\in[0,s_{\kappa}]}{J_{u}\in E_{i}},J_{s_{\kappa}+ds}\in E_{i_{1}},~{}\underset{\forall\nu\in[0,t-(s_{\kappa}+ds)]}{J_{s_{\kappa}+ds+\nu}\in E_{i_{1}}}|J_{0}\,\in E_{i}]\\ &=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\sum_{\kappa=0}^{m-1}Pr[\underset{\forall u\in[0,s_{\kappa}]}{J_{u}\in E_{i}}|J_{0}\in E_{i}]Pr[J_{s_{\kappa}+ds}\in E_{i_{1}}|J_{s_{\kappa}}\in E_{i}]Pr[\underset{\forall\nu\in[0,t-(s_{\kappa}+ds)]}{J_{s_{\kappa}+ds+\nu}\in E_{i_{1}}}|J_{s_{\kappa}+ds}\in E_{i_{1}}].\end{split}

For i,i1=1,,k,Di,i_{1}=1,...,k,D, ii1i\neq i_{1}.
Now, by using equation (5), and letting ds0ds\to 0 we end up with the followings:

=i1=1i1iklimds0κ=0m1(𝜶^ieTiκdsIE,Ei)(eTdsIE,Ei1)(eTi1(t(κ+1)ds)1)\begin{split}=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\lim_{ds\rightarrow 0}\sum_{\kappa=0}^{m-1}(\hat{\boldsymbol{\alpha}}_{i}e^{\textbf{T}_{i}\kappa\,ds}\textbf{I}_{E,E_{i}}^{\prime})(e^{\textbf{T}ds}\textbf{I}_{E,E_{i_{1}}})(e^{\textbf{T}_{i_{1}}(t-(\kappa+1)ds)}\textbf{1})\\ \end{split} (19)

from limϵ0+eTϵIϵ=T\lim_{\epsilon\rightarrow 0^{+}}\frac{e^{\textbf{T}\epsilon}-\textbf{I}}{\epsilon}=\textbf{T}, the continuity of eTse^{\textbf{T}s} and the limit of the Riemann sum, (19) becomes

=i1=1i1ik0t𝜶^ieTisTi,i1eTi1(ts)1𝑑s=i1=1i1ik𝜶^i0te𝐓isTi,i1eTi1(ts)𝑑s1=i1=1i1ik𝜶^ixi1(i)(t)1.\begin{split}&=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\int_{0}^{t}\hat{\boldsymbol{\alpha}}_{i}e^{\textbf{T}_{i}\,s}\textbf{T}_{i,{i_{1}}}e^{\textbf{T}_{i_{1}}(t-s)}\textbf{1}ds\\ &=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\int_{0}^{t}e^{\mathbf{T}_{i}\,s}\textbf{T}_{i,i_{1}}e^{\textbf{T}_{i_{1}}(t-s)}ds\textbf{1}\\ &=\sum_{\begin{subarray}{c}i_{1}=1\\ i_{1}\neq i\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\textbf{x}^{(i)}_{{i_{1}}}(t)\textbf{1}.\end{split} (20)

Where xi1(i)(t)=0teTisTi,i1eTi1(ts)𝑑s\textbf{x}^{(i)}_{{i_{1}}}(t)=\int_{0}^{t}e^{\textbf{T}_{i}\,s}\textbf{T}_{i,i_{1}}e^{\textbf{T}_{i_{1}}(t-s)}ds. It can be seen that

ddtxi1(i)(t)=eTitTi,i1+xi1(i)(t)Ti1.\frac{d}{dt}\textbf{x}^{(i)}_{{i_{1}}}(t)=e^{\textbf{T}_{i}\,t}\textbf{T}_{i,i_{1}}+\textbf{x}^{(i)}_{{i_{1}}}(t)\textbf{T}_{i_{1}}.

The event of N(t)=2N(t)=2 means there is 22 transitions during interval [0,t][0,t]. Assuming the transitions happen at time sκ1s_{\kappa_{1}} and sκ2s_{\kappa_{2}}, we have the following:

Pt(i)(2)=Pr(N(t)=2|J0Ei)=E(𝟏N(t)=2|J0Ei)=E(𝟏i1,i2i1i2kκ1=0m2κ2=κ1+1m1JuEiu[0,sκ1]Jsκ1+dsEi1,Jsκ1+ds+νEi1ν[0,sκ2(sκ1+ds)],Jsκ2+dsEi2,Jsκ2+ds+ωEi2ω[0,t(sκ2+ds)])=i1,i2i1ii2i1kκ1=0m2κ2=κ1+1m1Pr[JuEiu[0,sκ1]|J0Ei]Pr[Jsκ1+dsEi1|Jsκ1Ei]Pr[Jsκ1+ds+νEi1ν[0,sκ2(sκ1+ds)]|Jsκ1+dsEi1]Pr[Jsκ2+dsEi2|Jsκ2Ei1]Pr[Jsκ2+ds+ωEi2ω[0,t(sκ2+ds)]|Jsκ2+dsEi2].\begin{split}P^{(i)}_{t}(2)&=Pr\left(N(t)=2|J_{0}\in E_{i}\right)=E\left(\mathbf{1}_{N(t)=2}|J_{0}\in E_{i}\right)\\ &=E\left(\mathbf{1}_{\bigcup_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i_{2}\end{subarray}}^{k}\bigcup_{\kappa_{1}=0}^{m-2}\bigcup_{\kappa_{2}=\kappa_{1}+1}^{m-1}\underset{\forall~{}u\in[0,s_{\kappa_{1}}]}{\,J_{u}\in E_{i}}\,~{}J_{s_{\kappa_{1}}+ds}\in E_{i_{1}},~{}\underset{~{}\forall~{}\nu\in[0,s_{\kappa_{2}}-(s_{\kappa_{1}}+ds)]}{J_{s_{\kappa_{1}}+ds+\nu}\in E_{i_{1}}},~{}J_{s_{\kappa_{2}}+ds}\in E_{i_{2}},\underset{~{}\forall~{}\omega\in[0,t-(s_{\kappa_{2}}+ds)]}{J_{s_{\kappa_{2}}+ds+\omega}\in E_{i_{2}}}}\right)\\ &=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\sum_{\kappa_{1}=0}^{m-2}\sum_{\kappa_{2}=\kappa_{1}+1}^{m-1}Pr[\underset{\forall u\in[0,s_{\kappa_{1}}]}{J_{u}\in E_{i}}|J_{0}\in E_{i}]\,Pr[J_{s_{\kappa_{1}}+ds}\in E_{i_{1}}|J_{s_{\kappa_{1}}}\in E_{i}]\\ &Pr[\underset{\forall\nu\in[0,s_{\kappa_{2}}-(s_{\kappa_{1}}+ds)]}{J_{s_{\kappa_{1}}+ds+\nu}\in E_{i_{1}}}|J_{s_{\kappa_{1}}+ds}\in E_{i_{1}}]\,Pr[J_{s_{\kappa_{2}}+ds}\in E_{i_{2}}|J_{s_{\kappa_{2}}}\in E_{i_{1}}]\\ &Pr[\underset{\forall~{}\omega\in[0,t-(s_{\kappa_{2}}+ds)]}{J_{s_{\kappa_{2}}+ds+\omega}\in E_{i_{2}}}|J_{s_{\kappa_{2}}+ds}\in E_{i_{2}}].\end{split}

As ds0ds\to 0 and using the same techniques used in (20) we will end with the following.

=i1,i2i1ii2i1k0t0z𝜶^ieTisTii1eTi1(zs)Ti1i2eTi2(tz)1𝑑s𝑑z=i1,i2i1ii2i1k𝜶^i0t0zeTisTii1eTi1(zs)Ti1i2eTi2(tz)1𝑑s𝑑z=i1,i2i1ii2i1k𝜶^i0t𝐱i1(i)(z)Ti1i2eTi2(tz)1𝑑z=i1,i2i1ii2i1k𝜶^i𝐱i1i2(i)(t)1.\begin{split}&=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\int_{0}^{t}\int_{0}^{z}\hat{\boldsymbol{\alpha}}_{i}e^{\textbf{T}_{i}\,s}\textbf{T}_{i\,i_{1}}e^{\textbf{T}_{i_{1}}(z-s)}\textbf{T}_{{i_{1}}{i_{2}}}e^{\textbf{T}_{i_{2}}(t-z)}\textbf{1}~{}dsdz\\ &=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\int_{0}^{t}\int_{0}^{z}e^{\textbf{T}_{i}\,s}\textbf{T}_{i\,i_{1}}e^{\textbf{T}_{i_{1}}(z-s)}\textbf{T}_{{i_{1}}{i_{2}}}e^{\textbf{T}_{i_{2}}(t-z)}\textbf{1}~{}dsdz\\ &=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\int_{0}^{t}\mathbf{x}_{i_{1}}^{(i)}(z)\textbf{T}_{{i_{1}}{i_{2}}}e^{\textbf{T}_{i_{2}}(t-z)}\textbf{1}dz\\ &=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\mathbf{x}_{i_{1}\,i_{2}}^{(i)}(t)\textbf{1}.\end{split}

Where 𝐱i1i2(i)(t)\mathbf{x}_{i_{1}\,i_{2}}^{(i)}(t) satisfies the following differential equation

d𝐱i1i2(i)(t)dt=𝐱i1(i)(t)Ti1i2+𝐱i1i2(i)(t)Ti2.\dfrac{d\mathbf{x}_{i_{1}\,i_{2}}^{(i)}(t)}{dt}=\mathbf{x}_{i_{1}}^{(i)}(t)\textbf{T}_{i_{1}\,i_{2}}+\mathbf{x}_{i_{1}\,i_{2}}^{(i)}(t)\textbf{T}_{i_{2}}.

In general, it can be seen easily that Pt(i)(l)P^{(i)}_{t}(l) equals

=i1,i2,,ili1ii2i1ilil1k0t0zl10z1𝜶^ieTisTii1eTi1(z1s)Ti1i2eTi2(z2z1)Til1ileTil(tzl1)1𝑑s𝑑z1𝑑z2𝑑zl1=i1,i2i1ii2i1k𝜶^i0t𝐱i1,i2,,il1(i)(zl1)Til1ileTil(tzl1)1𝑑zl1=i1,i2i1ii2i1k𝜶^i𝐱i1,i2,,il(i)(t)1,\begin{split}&=\sum_{\begin{subarray}{c}i_{1},i_{2},\dots,i_{l}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\\ \vdots\\ i_{l}\neq i_{l-1}\end{subarray}}^{k}\int_{0}^{t}\int_{0}^{z_{l-1}}\dots\,\int_{0}^{z_{1}}\hat{\boldsymbol{\alpha}}_{i}e^{\textbf{T}_{i}\,s}\textbf{T}_{i\,i_{1}}e^{\textbf{T}_{i_{1}}(z_{1}-s)}\textbf{T}_{{i_{1}}{i_{2}}}e^{\textbf{T}_{i_{2}}(z_{2}-z_{1})}\dots\textbf{T}_{i_{l-1}i_{l}}e^{\textbf{T}_{i_{l}}(t-z_{l-1})}\textbf{1}~{}ds\,dz_{1}\,dz_{2}\,\cdots dz_{l-1}\\ &=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\int_{0}^{t}\mathbf{x}_{i_{1},i_{2},\cdots,i_{l-1}}^{(i)}(z_{l-1})\textbf{T}_{{i_{l-1}}{i_{l}}}e^{\textbf{T}_{i_{l}}(t-z_{l-1})}\textbf{1}dz_{l-1}\\ &=\sum_{\begin{subarray}{c}i_{1},i_{2}\\ i_{1}\neq i\\ i_{2}\neq i_{1}\end{subarray}}^{k}\hat{\boldsymbol{\alpha}}_{i}\mathbf{x}_{i_{1},i_{2},\cdots,i_{l}}^{(i)}(t)\textbf{1},\end{split}

where 𝐱i1,i2,,il(i)(t)\mathbf{x}_{i_{1},i_{2},\cdots,i_{l}}^{(i)}(t) satisfies the following differential equation

d𝐱i1,i2,,il(i)(t)dt=𝐱i1,i2il1(i)(t)Til1,il+𝐱i1,i2,,il(i)(t)Til.\dfrac{d\mathbf{x}^{(i)}_{i_{1},i_{2},\cdots,i_{l}}(t)}{dt}=\mathbf{x}^{(i)}_{i_{1},i_{2}\cdots i_{l-1}}(t)\textbf{T}_{i_{l-1},i_{l}}+\mathbf{x}^{(i)}_{i_{1},i_{2},\cdots,i_{l}}(t)\textbf{T}_{i_{l}}.