
Dynamic Risk Prediction Triggered by Intermediate Events Using Survival Tree Ensembles

Yifei Sun, Sy Han Chiou, Colin O. Wu, Meghan McGarry, and Chiung-Yu Huang

Yifei Sun is Assistant Professor, Department of Biostatistics, Columbia University Mailman School of Public Health, New York, NY 10032 (Email: [email protected]). Sy Han Chiou is Assistant Professor, Department of Mathematical Sciences, University of Texas at Dallas (Email: [email protected]). Meghan McGarry is Assistant Professor, Department of Pediatrics, School of Medicine, University of California San Francisco (Email: [email protected]). Colin O. Wu is Mathematical Statistician, National Heart, Lung, and Blood Institute, National Institutes of Health (Email: [email protected]). Chiung-Yu Huang is Professor, Department of Epidemiology and Biostatistics, School of Medicine, University of California San Francisco, San Francisco, CA 94158 (Email: [email protected]).
Abstract

With the availability of massive amounts of data from electronic health records and registry databases, incorporating time-varying patient information to improve risk prediction has attracted great attention. To exploit the growing amount of predictor information over time, we develop a unified framework for landmark prediction using survival tree ensembles, where an updated prediction can be performed when new information becomes available. Compared to conventional landmark prediction with fixed landmark times, our methods allow the landmark times to be subject-specific and triggered by an intermediate clinical event. Moreover, the nonparametric approach circumvents the thorny issue of model incompatibility at different landmark times. In our framework, both the longitudinal predictors and the event time outcome are subject to right censoring, and thus existing tree-based approaches cannot be directly applied. To tackle the analytical challenges, we propose a risk-set-based ensemble procedure by averaging martingale estimating equations from individual trees. Extensive simulation studies are conducted to evaluate the performance of our methods. The methods are applied to the Cystic Fibrosis Foundation Patient Registry (CFFPR) data to perform dynamic prediction of lung disease in cystic fibrosis patients and to identify important prognostic factors.

KEY WORDS: Dynamic prediction, landmark analysis, multi-state model, survival tree, time-dependent predictors.

1 Introduction

Cystic fibrosis (CF) is a genetic disease characterized by a progressive, irreversible decline in lung function caused by chronic microbial infections of the airways. Despite recent advances in diagnosis and treatment, the burden of CF care remains high, and most patients succumb to respiratory failure. There is currently no cure for CF, so early prevention of lung disease in high-risk patients is essential for successful disease management. The goal of this research is to develop flexible and accurate event risk prediction algorithms for abnormal lung function in pediatric CF patients by exploiting the rich longitudinal information made available by the Cystic Fibrosis Foundation Patient Registry (CFFPR).

The CFFPR is a large electronic health record database that collects encounter-based records of over 300 unique variables on patients from over 120 accredited CF care centers in the United States (Knapp et al., 2016). The CFFPR contains detailed information on potential risk factors, including symptoms, pulmonary infections, medications, test results, and medical history. Analyses of the CFFPR suggest that the variability in spirometry measurements over time is highly predictive of subsequent lung function decline (Morgan et al., 2016). Moreover, the acquisition of chronic, mucoid, or multidrug-resistant subtypes of Pseudomonas aeruginosa (PA) leads to more severe pulmonary disease, accelerating the decline in lung function (McGarry et al., 2020). In this paper, the event of interest is the progressive loss of lung function, defined as the first time that the percent predicted forced expiratory volume in 1 second (ppFEV1) recorded in the CFFPR drops below 80%. Since risk factors such as weight and height in pediatric patients can change substantially over time, models with baseline predictors have limited potential for long-term prognosis. Incorporating repeated measurements and intermediate clinical events would reflect ongoing CF management and result in more accurate prediction.
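As a concrete illustration of this outcome definition, the following sketch derives a subject's event age as the first recorded age at which ppFEV1 falls below 80%. The helper function and the visit data are hypothetical, not drawn from the CFFPR.

```python
def first_decline_age(ages, ppfev1, threshold=80.0):
    """Return the first age at which ppFEV1 falls below `threshold`,
    or None if no observed measurement crosses it (illustrative helper)."""
    for age, value in zip(ages, ppfev1):
        if value < threshold:
            return age
    return None

# Hypothetical subject measured at ages 6-10 with declining lung function.
ages = [6.0, 7.0, 8.0, 9.0, 10.0]
ppfev1 = [95.2, 90.1, 84.7, 78.9, 72.3]
event_age = first_decline_age(ages, ppfev1)  # first crossing at age 9.0
```

In practice the event time would be interval-censored between visits; here, following the registry-based definition above, the first visit with ppFEV1 below 80% is taken as the event time.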

To incorporate longitudinal patient information in risk prediction, one major approach is joint modeling (see, for example, Rizopoulos, 2011; Taylor et al., 2013). Under the joint modeling framework, a longitudinal submodel for the time-dependent variables and a survival submodel for the time-to-event outcome are postulated; the submodels are typically linked via latent variables. Such a model formulation provides a complete specification of the joint distribution, based on which the survival probability given the history of longitudinal measurements can be derived. Most joint modeling methods consider a single, continuous time-dependent variable. Although attempts have been made to incorporate multiple time-dependent predictor variables (Proust-Lima et al., 2016; Wang et al., 2017), correct specification of the model forms for all the time-dependent covariates and their associations with the event outcome remains a major challenge. Moreover, it is not clear how existing joint modeling approaches can further incorporate information on multiple intermediate events, such as the acquisition of different subtypes of PA, in risk prediction.

Another major approach that can account for longitudinal predictors is landmark analysis, where models are constructed at pre-specified landmark times to predict the event risk in a future time interval. For example, at each landmark time, one may postulate a working Cox model with appropriate summaries of the covariate history up to the landmark time (e.g., last observed values) as predictors and then fit the Cox model using data from subjects who are at risk of the event. The estimation can be performed using either a separate model at each landmark time point or a supermodel for all landmark time points (van Houwelingen, 2007; van Houwelingen and Putter, 2008, 2011). This way, multiple and mixed-type time-dependent predictors can be easily incorporated. Moreover, to better exploit the repeated measurements and to handle measurement errors, one may also consider a two-stage landmark approach (Rizopoulos et al., 2017; Sweeting et al., 2017; Ferrer et al., 2019): in the first step, mixed-effects models are used to model the longitudinal predictors; in the second step, functions of the best linear unbiased prediction (BLUP) estimator of the random effects are included as predictors of the landmark Cox model. Other than Cox models, Parast et al. (2012) considered time-varying coefficient models to incorporate a single intermediate event and multiple biomarker measurements. Zheng and Heagerty (2005), Maziarz et al. (2017), and Zhu et al. (2019) further considered the impact of informative observation times of repeated measurements on future risk.
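The landmark data set construction described above can be sketched as follows: at a pre-specified landmark time, keep only subjects still at risk, reset the time origin to the landmark, and carry forward the last observed covariate value. The data layout and the `landmark_dataset` helper are hypothetical; the resulting data set would then be passed to a Cox model fitter.

```python
def landmark_dataset(subjects, s):
    """Build a landmark data set at landmark time s (illustrative).
    subjects: list of dicts with keys
      'Y'     -- observed follow-up time,
      'event' -- 1 if failure observed, 0 if censored,
      'obs'   -- list of (time, value) covariate measurements."""
    data = []
    for subj in subjects:
        if subj['Y'] < s:            # failed or censored before the landmark
            continue
        history = [v for (t, v) in subj['obs'] if t <= s]
        if not history:              # no measurement before the landmark
            continue
        data.append({'time': subj['Y'] - s,      # residual time after s
                     'event': subj['event'],
                     'last_value': history[-1]})  # last observed value
    return data

subjects = [
    {'Y': 10.0, 'event': 1, 'obs': [(1.0, 5.2), (4.0, 4.8)]},
    {'Y': 3.0,  'event': 0, 'obs': [(1.0, 6.1)]},   # censored before s = 5
]
lm = landmark_dataset(subjects, s=5.0)  # only the first subject remains
```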

Direct application of existing landmark analysis methods to the CFFPR data may not be ideal for the following reasons. First, imposing semiparametric working models at different landmark times may result in incompatible models and inconsistent predictions (Jewell and Nielsen, 1993; Rizopoulos et al., 2017). In other words, a joint distribution of predictors and event times that satisfies the models at all the landmark times simultaneously may not exist. Second, the specification of how the predictor history affects the future event risk may require deep clinical insight. For example, researchers have shown that various summaries of the repeated measurements, including the variability (Morgan et al., 2016), the rate of change (Mannino et al., 2006), and the area under the trajectory curve (Domanski et al., 2020), can serve as important predictors of disease risks, while the last observed value has been commonly used in the statistical literature. Therefore, nonparametric statistical learning methods are appealing in landmark prediction, because they require minimal model assumptions and have the potential to deal with a large number of complicated predictors. Tanner et al. (2021) applied super learners for landmark prediction in CF patients, where discrete-time survival analysis was conducted using machine learning algorithms for binary outcomes.

In this paper, we propose a unified framework for landmark prediction using survival tree ensembles, where the landmark times can be subject-specific. A subject-specific landmark can be defined by an intermediate clinical event that modifies patients’ risk profiles and triggers the need for an updated evaluation of future risk. In our application, the acquisition of chronic PA usually leads to accelerated deterioration in pulmonary function and serves as a natural landmark. When the landmark time is random, the number of observed predictors at the landmark time often varies across subjects, creating analytical challenges in fully utilizing the available information. Moreover, unlike static risk prediction models where baseline predictors are completely observed, the observation of the time-dependent predictors is further subject to right censoring. To tackle these problems, we propose a risk-set-based approach to handle the possibly censored predictors. To avoid the instability of a single tree, we propose a novel ensemble procedure based on averaging unbiased martingale estimating equations derived from individual trees. Our ensemble method differs from existing ensemble methods that directly average the cumulative hazard predictions and shows strong empirical performance in dealing with censored data.

The rest of this article is organized as follows. In Section 2, we introduce a landmark prediction framework that incorporates the repeated measurements and intermediate events. In Section 3, we propose tree-based ensemble methods to deal with censored predictors and outcomes. We propose a concordance measure to evaluate the prediction performance in Section 4 and define a permutation variable importance measure in Section 5. The proposed methods are evaluated by extensive simulation studies in Section 6 and are applied to the CFFPR data in Section 7. We conclude the paper with a discussion in Section 8.

2 Model Setup

In contrast to static risk prediction methods that output a conditional survival function given baseline predictors, dynamic landmark prediction focuses on the survival function conditioning on the predictor history up to the landmark time. Since history information involves complicated stochastic processes, challenges arise as to how to partition the history processes when applying tree-based methods. In what follows, we first introduce a generalized definition of the landmark survival function, starting from either a fixed or subject-specific landmark time. We then express the history information as a fixed-length predictor vector on which recursive partition can be applied.

Denote by $T$ a continuous failure event time and by $T_{\rm L}$ a landmark time. The landmark is selected a priori and is usually clinically meaningful. We allow $T_{\rm L}$ to be either fixed or subject-specific. We focus on the subpopulation that is free of the failure event at $T_{\rm L}$ and predict the risk after $T_{\rm L}$. Denote by $\bm{Z}$ the baseline predictors and by $\mathcal{H}(t)$ other information observed on $[0,t]$. Our goal is to predict the probability conditioning on all the available information up to $T_{\rm L}$, that is,

\[
P(T-T_{\rm L}\geq t\mid T\geq T_{\rm L},\,T_{\rm L},\,\mathcal{H}(T_{\rm L}),\,\bm{Z}). \tag{1}
\]

To illustrate the observed history $\mathcal{H}(t)$, we consider two types of predictors that are available in the CFFPR data. The first type is repeated measurements of time-dependent variables such as weight and ppFEV1. It is worthwhile to point out that both internal and external time-dependent predictors can be included, as only their history up to $T_{\rm L}$ will be used. We denote this type of predictors by $\bm{W}(t)$, a $q$-dimensional vector of time-dependent variables, and assume $\bm{W}(\cdot)$ is available at fixed time points $t_1,t_2,\ldots,t_K$. The observed history up to $t$ is $\mathcal{H}_W(t)=\{\bm{W}(s)dO(s),\,0<s\leq t\}$, where $O(t)$ is a counting process that jumps by one when $\bm{W}(\cdot)$ is measured (i.e., $dO(t_k)=1$ for $k=1,\ldots,K$). The second type is the timing of intermediate clinical events such as pseudomonas infections. Denote by $U_j$ the time to the $j$th intermediate event, $j=1,\ldots,J$. The observed history up to $t$ is $\mathcal{H}_U(t)=\{I(U_j\leq s),\,0<s\leq t,\,j=1,\ldots,J\}$. Collectively, we have a system of history processes $\mathcal{H}(t)=(\mathcal{H}_W(t),\mathcal{H}_U(t))$.

In our framework, both $t_k$ and $U_j$ can serve as landmark times. Due to the stochastic nature of $U_j$, the order of $\{t_k, U_j,\,k=1,\ldots,K,\,j=1,\ldots,J\}$ cannot be pre-determined. As a result, the number of available predictors at a given landmark time can vary across subjects. For illustration, we consider one fixed time point $t_1$ at age 7 and one intermediate event, chronic PA (cPA), whose occurrence time is denoted by $U_1$. Figure 1 depicts the observed data of two study subjects. At $t_1=7$, subject 1 has experienced cPA ($U_1=4.8\leq t_1$), while subject 2 remains free of cPA ($U_1=10.2>t_1$). Let $\bm{W}(t)$ be the body weight measured at time $t$ (i.e., $q=1$). The probabilities of interest are given as follows:

  1. (I)

    At a fixed landmark time $T_{\rm L}=t_1$, we predict the risk at $t_1+t$, $t>0$, among those who are at risk, that is, $T\geq t_1$. Note that subjects in the risk set may or may not have experienced the intermediate event prior to $t_1$. Given $\bm{W}(t_1)$ and the partially observed $U_1$, the conditional survival probability (1) can be reexpressed as

    \[
    \begin{cases}
    P(T\geq t+t_1\mid T\geq t_1,\,\bm{Z},\,\bm{W}(t_1),\,U_1), & \text{if } U_1\leq t_1,\\
    P(T\geq t+t_1\mid T\geq t_1,\,\bm{Z},\,\bm{W}(t_1),\,U_1>t_1), & \text{otherwise}.
    \end{cases}
    \]

    In other words, at $t_1$, we output the former for subjects who experienced the intermediate event prior to $t_1$ (subject 1, Figure 1(a)) and the latter for the others (subject 2, Figure 1(b)).

  2. (II)

    At a random landmark time $T_{\rm L}=U_1$, we predict the risk for subjects who have experienced the intermediate event and are free of the failure event. The predictor value $\bm{W}(t_1)$ is available only if $U_1\geq t_1$. In this case, we predict

    \[
    \begin{cases}
    P(T\geq t+U_1\mid T\geq U_1,\,\bm{Z},\,U_1), & \text{if } U_1\leq t_1,\\
    P(T\geq t+U_1\mid T\geq U_1,\,\bm{Z},\,\bm{W}(t_1),\,U_1), & \text{otherwise}.
    \end{cases}
    \]

    Therefore, at $U_1$, we output the former for subjects whose $\bm{W}(t_1)$ is measured after $U_1$ (subject 1, Figure 1(c)) and the latter for the others (subject 2, Figure 1(d)).

(a) Subject 1, landmark at $t_1=7$. (b) Subject 2, landmark at $t_1=7$. (c) Subject 1, landmark at $U_1$. (d) Subject 2, landmark at $U_1$.

Figure 1: Illustration of fixed and random landmark times. Each panel marks the landmark time point, the information available at the landmark time, the information unavailable at the landmark time, and the target prediction interval.

In the above example, the observed predictors vary across subjects. We then represent the history $\mathcal{H}(T_{\rm L})$ as a vector with a fixed length, so that tree-based methods can be applied to estimate the probability in (1). Define the complete predictors $\bm{X}=\{\bm{W}(t_1),\ldots,\bm{W}(t_K),U_1,\ldots,U_J\}$. The information in $\bm{X}$ may not be fully available at a given landmark time. We define the available information up to $t$ by $\bm{X}(t)=\{\bm{W}(t_1,t),\ldots,\bm{W}(t_K,t),U_1(t),\ldots,U_J(t)\}$, where

\[
\bm{W}(t_k,t)=\begin{cases}\bm{W}(t_k), & \text{if } t_k\leq t,\\ \textbf{NA}_q, & \text{otherwise},\end{cases}
\qquad
U_j(t)=\begin{cases}U_j, & \text{if } U_j\leq t,\\ t^+, & \text{otherwise}.\end{cases}
\]

By a slight abuse of notation, we write $U_j(t)=t^+$ if $U_j>t$ and $\bm{W}(t_k,t)=\textbf{NA}_q$ if $t_k>t$, with $\textbf{NA}_q\overset{\rm def}{=}({\rm NA},\ldots,{\rm NA})$ denoting a non-numeric $q$-dimensional vector. Here an NA value indicates that the covariate value is collected after the landmark time and thus should not be used for constructing prediction models. In other words, under our setting NA is treated as an attribute rather than missing data, as the target probability is not conditional on $\bm{W}(t_k)$ for $t_k>T_{\rm L}$. The covariate history $\mathcal{H}(T_{\rm L})$ can then be expressed as $\bm{X}(T_{\rm L})$, which is a $(qK+J)$-dimensional vector. This way, a covariate not being observed is itself predictive of the outcome, and the target survival probability function can be expressed as follows:

\[
S(t\mid a,\bm{z},\bm{x})=P(T\geq t+a\mid T\geq T_{\rm L}=a,\,\bm{Z}=\bm{z},\,\bm{X}(T_{\rm L})=\bm{x}). \tag{2}
\]

In the example depicted in Figure 1, we have $\bm{X}(t)=\{\bm{W}(t_1,t),U_1(t)\}$. For $T_{\rm L}=t_1$, the predictor values in Figures 1(a) and 1(b) correspond to $\bm{x}=(19.7,\,4.8)$ and $\bm{x}=(21.5,\,7^+)$, respectively. For $T_{\rm L}=U_1$, the predictor values in Figures 1(c) and 1(d) correspond to $\bm{x}=({\rm NA},\,4.8)$ and $\bm{x}=(21.5,\,10.2)$, respectively. Since the information at $T_{\rm L}$ involves left-bounded intervals and non-numeric values, applying semiparametric methods to estimate $S(t\mid a,\bm{z},\bm{x})$ is challenging. Hence we propose tree-based methods to handle partially observed predictors.
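A minimal sketch of this fixed-length encoding, using Python's `None` to play the role of NA and the string `'t+'` to mark an intermediate event not yet observed at the landmark (both representations are our own illustrative choices, mirroring Figure 1 with $K=1$, $q=1$, $J=1$):

```python
def available_predictors(t_land, t_k_list, W_vals, U_list):
    """Return X(t_land) = (W(t_1,t),...,W(t_K,t), U_1(t),...,U_J(t)).
    None encodes NA; the string 't+' encodes an event time exceeding
    the landmark (illustrative encoding)."""
    W_part = [w if tk <= t_land else None
              for tk, w in zip(t_k_list, W_vals)]
    U_part = [u if u <= t_land else 't+' for u in U_list]
    return W_part + U_part

# Subject 1 of Figure 1: weight 19.7 kg at t1 = 7, cPA at U1 = 4.8.
x_fixed = available_predictors(7.0, [7.0], [19.7], [4.8])    # [19.7, 4.8]
x_random = available_predictors(4.8, [7.0], [19.7], [4.8])   # [None, 4.8]
```

At the fixed landmark $t_1=7$ both predictors are observed; at the random landmark $U_1=4.8$ the age-7 weight is not yet measured and enters as NA, matching $\bm{x}=({\rm NA},\,4.8)$ above.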

3 Survival trees and ensembles for landmark prediction

At a given landmark time, we build a tree-based model to predict future event risk. Survival trees are popular nonparametric tools for risk prediction. The original survival trees take baseline predictors as input variables and output the survival probability conditioning on the baseline predictors (see, for example, Gordon and Olshen, 1985; Ciampi et al., 1986; Segal, 1988; Davis and Anderson, 1989; LeBlanc and Crowley, 1992, 1993; Zhang, 1995; Molinaro et al., 2004; Steingrimsson et al., 2016), and ensemble methods have been applied to address the instability of a single tree (Hothorn et al., 2004, 2006; Ishwaran et al., 2008; Zhu and Kosorok, 2012; Steingrimsson et al., 2019). However, existing methods cannot be directly applied, because the predictors in $\bm{X}$ are not completely observed at $T_{\rm L}$, and the available predictors $(T_{\rm L},\bm{X}(T_{\rm L}))$ are subject to right censoring. For the case without censoring, we introduce a partition scheme for subjects who are event-free at the landmark time in Section 3.1. To handle censored data, we propose risk-set methods to estimate the partition-based landmark survival probability in Section 3.2 and propose an ensemble procedure in Section 3.3.

3.1 Partition on partially observed predictors at the landmark time

A tree partitions the predictor space into disjoint subsets termed terminal nodes and assigns the same survival prediction to subjects that enter the same terminal node. In dynamic risk prediction, the population of interest is subjects who remain event-free at the landmark time, that is, those with $T\geq T_{\rm L}$. We use $\mathcal{T}=\{\tau_1,\tau_2,\ldots,\tau_M\}$ to denote a partition on the sample space of $(T_{\rm L},\bm{Z},\bm{X}(T_{\rm L}))\mid T\geq T_{\rm L}$, where $\tau_m$, $m=1,\ldots,M$, are the terminal nodes. The terminal nodes are formed recursively using binary partitions, by asking a sequence of yes-or-no questions. Existing implementations of trees usually do not handle mixtures of numeric and nominal variables. Since a variable in $\bm{X}(T_{\rm L})$ may take either numeric/ordinal values or NA, the conventional partition scheme needs to be extended. When a variable in $\bm{W}(t_k)$ is nominal, its counterpart in $\bm{W}(t_k,T_{\rm L})$ is also nominal and can be split applying existing approaches. In what follows, we focus on the case where the longitudinal marker measurements $\bm{W}(t_k)$ are numeric/ordinal.

When $T_{\rm L}$ is random, the partition is based on the variables in $(T_{\rm L},\bm{Z},\bm{X}(T_{\rm L}))$. We consider the following situations:

  1. (R1)

    When the $\bm{W}(t_k,T_{\rm L})$'s are the splitting variables, conventional splitting rules may not be directly applicable because these variables take NA values when $t_k>T_{\rm L}$. Suppose $W$ is an element of $\bm{W}(t_k,T_{\rm L})$ and $c$ is a cutoff value. We consider two possible splits, (a) $\{W>c\}$ versus $\{W\leq c$ or $W={\rm NA}\}$ and (b) $\{W>c$ or $W={\rm NA}\}$ versus $\{W\leq c\}$, and select the split that yields the larger improvement in the splitting criterion. We note that $W={\rm NA}$ (i.e., $T_{\rm L}<t_k$) is treated as an attribute rather than missing data.

  2. (R2)

    Conventional splitting rules cannot be directly applied to $U_j(T_{\rm L})$, because $U_j(T_{\rm L})$ can be $T_{\rm L}^+$ and the support of $U_j(T_{\rm L})$ is not an ordered set. To tackle this problem, we employ a set of transformed predictors $\{U_1(T_{\rm L})/T_{\rm L},\ldots,U_J(T_{\rm L})/T_{\rm L}\}$, which, together with $T_{\rm L}$, contains the same information as $\{U_1(T_{\rm L}),\ldots,U_J(T_{\rm L}),T_{\rm L}\}$. Then the rule $U_j(T_{\rm L})/T_{\rm L}>c$ can be applied for $c\in(0,1]$, and $U_j(T_{\rm L})/T_{\rm L}>1$ is equivalent to $U_j(T_{\rm L})=T_{\rm L}^+$.

The splitting rules in (R1) and (R2) can be implemented by transforming $\bm{X}(T_{\rm L})$ into a numeric/ordinal vector and passing the transformed predictors to a tree algorithm that uses conventional partition rules. First, we create two features for each element $W$ in $\bm{W}(t_k,T_{\rm L})$. The two features, denoted by $W^+$ and $W^-$, take the same value as $W$ when $W\neq{\rm NA}$ (i.e., $t_k\leq T_{\rm L}$), and take the extreme values $M$ and $-M$ otherwise, where $M$ is a large positive number outside the possible range of predictor values. Instead of using $W$, we use $W^+$ and $W^-$ as candidate variables for splitting. The partition "$\{W^+>c\}$ versus $\{W^+\leq c\}$" is equivalent to the type (b) partition "$\{W>c$ or $W={\rm NA}\}$ versus $\{W\leq c\}$", because observations satisfying $W={\rm NA}$ or $W>c$ are assigned to the same child node; similarly, splitting based on $W^-$ yields type (a) partitions. As an example, if the landmark is cPA and $W$ denotes the age-7 weight observed by the landmark time, then the rule "Is $W^+>20$?" is equivalent to "Is the weight at age 7 greater than 20 kg, or does the subject have cPA before age 7?". Implementing such a partition rule is equivalent to the "missings together" approach (Zhang et al., 1996) and missingness incorporated in attributes (Twala et al., 2008). Second, for each transformed intermediate event time $U_j(T_{\rm L})/T_{\rm L}$ $(j=1,\ldots,J)$, we replace the value $1^+$ with $M$. The details are described in Algorithm 1 (see Table 1). The partition scheme also applies to the trivial case of fixed landmark times. Figure 2 illustrates the proposed data preprocessing procedure.
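The preprocessing step can be sketched as follows; the helper name and the particular choice of $M$ are illustrative, and `None` again stands in for NA.

```python
M = 1e6  # a value outside the plausible range of predictor values

def preprocess(t_land, W_avail, U_list):
    """Map the partially observed predictors at the landmark time into a
    purely numeric feature vector: each element W of W(t_k, T_L) becomes
    two features (W+, W-) equal to (M, -M) when W is NA, and each
    intermediate event time becomes U_j(T_L)/T_L, with M substituted when
    the event has not occurred by T_L (illustrative sketch)."""
    features = []
    for w in W_avail:                      # w is None when NA
        if w is None:
            features.extend([M, -M])       # (W+, W-) for an NA value
        else:
            features.extend([w, w])        # both features equal W
    for u in U_list:
        features.append(u / t_land if u <= t_land else M)
    return features

# Subject 1 at the random landmark U1 = 4.8 (age-7 weight not yet seen):
f1 = preprocess(4.8, [None], [4.8])    # [M, -M, 1.0]
# Subject 2 at the fixed landmark t1 = 7 (cPA not yet acquired):
f2 = preprocess(7.0, [21.5], [10.2])   # [21.5, 21.5, M]
```

A split of the form $\{W^+>c\}$ then sends NA observations to the same child as large values of $W$, while $\{W^->c\}$ sends them to the same child as small values, reproducing the type (a) and (b) splits above.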

Landmark time $T_{\rm L}$ | ID | At-risk status $I(T\geq T_{\rm L})$ | $\bm{W}(t_1,T_{\rm L})$: observed | processed | $U_1(T_{\rm L})$: observed | processed
$t_1=7$ | 1 | 1 | 19.7 | (19.7, 19.7) | 4.8 | 4.8/7
$t_1=7$ | 2 | 1 | 21.5 | (21.5, 21.5) | $7^+$ | $M$
$t_1=7$ | 3 | 0 | – | – | – | –
$U_1$ | 1 | 1 | NA | ($M$, $-M$) | 4.8 | 4.8/4.8
$U_1$ | 2 | 1 | 21.5 | (21.5, 21.5) | 10.2 | 10.2/10.2
$U_1$ | 3 | 0 | – | – | – | –

Figure 2: Illustration of observed data and preprocessed data from three subjects at fixed and random landmark times (subject 1: cPA at $U_1=4.8$, $T=9.7$; subject 2: cPA at $U_1=10.2$, $T=11.7$; subject 3: $T=5.7$). At each landmark time $T_{\rm L}$, only subjects with $T\geq T_{\rm L}$ (subjects 1 and 2) are of interest. For each predictor that can take the value NA (e.g., $\bm{W}(t_1,T_{\rm L})$), we create two features that take the extreme values $M$ and $-M$ if the predictor is not observed. For partially observed intermediate events (e.g., $U_1(T_{\rm L})$), we first divide the event time by $T_{\rm L}$ and replace the value $1^+$ with $M$. After preprocessing, zero-variance features and duplicate features can be removed before running the tree algorithm.

The partition scheme described above guarantees that each individual has a well-defined pathway to determine its node membership. Given such a partition $\mathcal{T}$, one can define a partition function $l_{\mathcal{T}}(a,\bm{z},\bm{x})$, which returns the terminal node in $\mathcal{T}$ that contains $(a,\bm{z},\bm{x})$. We define the following partition-based survival function,

\[
S_{\mathcal{T}}(t\mid a,\bm{z},\bm{x})=P(T\geq t+a\mid T\geq T_{\rm L},\,(T_{\rm L},\bm{Z},\bm{X}(T_{\rm L}))\in l_{\mathcal{T}}(a,\bm{z},\bm{x})). \tag{3}
\]

The probability $S_{\mathcal{T}}(t\mid a,\bm{z},\bm{x})$ approximates the target function $S(t\mid a,\bm{z},\bm{x})$.

3.2 Partition-based estimation at the landmark time

The follow-up of a subject can be terminated by loss to follow-up or the end of the study. We now consider estimating $S_{\mathcal{T}}(t\mid a,\bm{z},\bm{x})$ with censored data. Denote by $C$ the censoring time and assume $C$ is independent of $(T_{\rm L},T,\bm{X})$ given $\bm{Z}$. We follow the convention to define $Y=\min(T,C)$ and $\Delta=I(T\leq C)$. We further define $Y_{\rm L}=\min(T_{\rm L},Y)$ and $\Delta_{\rm L}=I(T_{\rm L}\leq T,\,T_{\rm L}\leq C)$. Note that $\Delta_{\rm L}=1$ is the at-risk indicator at $T_{\rm L}$. For subjects who are free of the failure event at $T_{\rm L}$, one can observe $T_{\rm L}$ and $\bm{X}(T_{\rm L})$ when $\Delta_{\rm L}=1$. The training data are $\{(Y_i,\Delta_i,Y_{{\rm L}i},\Delta_{{\rm L}i},\bm{X}_i(Y_{{\rm L}i}),\bm{Z}_i),\,i=1,\ldots,n\}$, which are assumed to be independent and identically distributed replicates of $(Y,\Delta,Y_{\rm L},\Delta_{\rm L},\bm{X}(Y_{\rm L}),\bm{Z})$.
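A small sketch of these observed-data definitions (function name illustrative); in practice only $(Y,\Delta,Y_{\rm L},\Delta_{\rm L})$ would be observed, but computing them from latent $(T,C,T_{\rm L})$ clarifies the construction:

```python
def observed_quantities(T, C, T_L):
    """Compute (Y, Delta, Y_L, Delta_L) from the latent failure time T,
    censoring time C, and landmark time T_L, per the definitions above."""
    Y = min(T, C)                          # observed follow-up time
    Delta = int(T <= C)                    # failure observed?
    Y_L = min(T_L, Y)                      # observed landmark time
    Delta_L = int(T_L <= T and T_L <= C)   # at risk at the landmark?
    return Y, Delta, Y_L, Delta_L

# A subject who reaches the landmark at 7 and fails at 9.7 before
# censoring at 12:
obs = observed_quantities(T=9.7, C=12.0, T_L=7.0)   # (9.7, 1, 7.0, 1)
```

For a subject failing before the landmark (e.g., $T=5.7$, $T_{\rm L}=7$), $\Delta_{\rm L}=0$ and the subject contributes no landmark prediction, matching Figure 2.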

Define $N(t,a)=\Delta I(Y-a\leq t)$. Denote by $\lambda(t\mid a,\bm{z},\bm{x})$ the landmark hazard function, that is, $\lambda(t\mid a,\bm{z},\bm{x})dt=P(T-T_{\rm L}\in[t,t+dt)\mid T-T_{\rm L}\geq t,\,T_{\rm L}=a,\,\bm{Z}=\bm{z},\,\bm{X}(T_{\rm L})=\bm{x})$. The survival function $S(t\mid a,\bm{z},\bm{x})$ and the hazard function $\lambda(t\mid a,\bm{z},\bm{x})$ are in one-to-one correspondence: $S(t\mid a,\bm{z},\bm{x})=\exp\{-\int_0^t\lambda(u\mid a,\bm{z},\bm{x})du\}$. For $t>0$, we have

\begin{align*}
&E\{N(dt,Y_{\rm L})-I(Y-Y_{\rm L}\geq t)\lambda(t\mid a,\bm{z},\bm{x})dt\mid \Delta_{\rm L}=1,\,Y_{\rm L}=a,\,\bm{Z}=\bm{z},\,\bm{X}(Y_{\rm L})=\bm{x}\}\\
&\quad=E\{N(dt,a)-I(T\geq a+t,\,C\geq a+t)\lambda(t\mid a,\bm{z},\bm{x})dt\mid C\geq a,\,T\geq T_{\rm L}=a,\,\bm{Z}=\bm{z},\,\bm{X}(T_{\rm L})=\bm{x}\}\\
&\quad=E\{I(T-T_{\rm L}\in[t,t+dt))-I(T-T_{\rm L}\geq t)\lambda(t\mid a,\bm{z},\bm{x})dt\mid C\geq a,\,T\geq T_{\rm L}=a,\,\bm{Z}=\bm{z},\,\bm{X}(T_{\rm L})=\bm{x}\}\times P(C\geq a+t\mid C\geq a,\,\bm{Z}=\bm{z})\\
&\quad=0.
\end{align*}

Then we have

\[
\lambda(t\mid a,\bm{z},\bm{x})dt=\frac{E\{N(dt,Y_{\rm L})\mid \Delta_{\rm L}=1,\,Y_{\rm L}=a,\,\bm{Z}=\bm{z},\,\bm{X}(Y_{\rm L})=\bm{x}\}}{E\{I(Y-Y_{\rm L}\geq t)\mid \Delta_{\rm L}=1,\,Y_{\rm L}=a,\,\bm{Z}=\bm{z},\,\bm{X}(Y_{\rm L})=\bm{x}\}}. \tag{4}
\]

Conditioning on $\Delta_{\rm L}=1$ and $(Y_{\rm L},\bm{Z},\bm{X}(Y_{\rm L}))$, subjects with $Y-Y_{\rm L}\geq t$ can be viewed as a representative sample of the population with $T-T_{\rm L}\geq t$ for each $t>0$. Heuristically, the numerator and denominator in (4) can be estimated using partition-based estimators in the subsample with $\Delta_{\rm L}=1$. Given a partition $\mathcal{T}$, the function $S_{\mathcal{T}}(t\mid a,\bm{z},\bm{x})$ in (3) can be estimated by the following estimator,

\[
\widehat{S}_{\mathcal{T}}(t\mid a,\bm{z},\bm{x})=\exp\left\{-\int_0^t\frac{\sum_{i=1}^n\Delta_{{\rm L}i}I((Y_{{\rm L}i},\bm{Z}_i,\bm{X}_i(Y_{{\rm L}i}))\in l_{\mathcal{T}}(a,\bm{z},\bm{x}))N_i(du,Y_{{\rm L}i})}{\sum_{i=1}^n\Delta_{{\rm L}i}I((Y_{{\rm L}i},\bm{Z}_i,\bm{X}_i(Y_{{\rm L}i}))\in l_{\mathcal{T}}(a,\bm{z},\bm{x}),\,Y_i-Y_{{\rm L}i}\geq u)}\right\}. \tag{5}
\]

When a new subject is event-free at the landmark time $T_{{\rm L}0}$ with predictors $\bm{Z}_0$ and $\bm{X}_0(T_{{\rm L}0})$, the predicted survival probability based on a single tree is $\widehat{S}_{\mathcal{T}}(t\mid T_{{\rm L}0},\bm{Z}_0,\bm{X}_0(T_{{\rm L}0}))$. In practice, the partition $\mathcal{T}$ can be constructed via a recursive partition algorithm, and split-complexity pruning can be applied to determine the size of the tree (LeBlanc and Crowley, 1993). More details of the tree algorithm are given in the Supplementary Material (Sun et al., 2022).
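The estimator in (5) is, within each terminal node, a Nelson-Aalen-type estimator applied to the residual times $Y_i - Y_{{\rm L}i}$ of the at-risk ($\Delta_{\rm L}=1$) subjects. A minimal sketch for a single node, with a hypothetical data layout:

```python
import math

def node_survival(t, members):
    """Estimate S_T(t | node) as in (5): within a terminal node, apply a
    Nelson-Aalen-type estimator to residual times after the landmark.
    `members` lists (residual_time, delta) pairs for the Delta_L = 1
    subjects whose (Y_L, Z, X(Y_L)) falls in the node (illustrative)."""
    event_times = sorted({r for r, d in members if d == 1 and r <= t})
    cum_haz = 0.0
    for u in event_times:
        d_u = sum(1 for r, d in members if d == 1 and r == u)  # events at u
        n_u = sum(1 for r, d in members if r >= u)             # at risk at u
        cum_haz += d_u / n_u
    return math.exp(-cum_haz)

# Residual times Y_i - Y_Li for three in-node subjects (third censored):
members = [(2.7, 1), (4.7, 1), (6.0, 0)]
S3 = node_survival(3.0, members)   # one event by t = 3 among 3 at risk
```

With one event among three at-risk subjects by $t=3$, the node-level estimate is $\exp(-1/3)$; a new subject falling into this node at its landmark time receives this curve as its prediction.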

3.3 Survival tree ensembles based on martingale estimating equations

Since the prediction based on a single tree is often unstable, ensemble methods such as bagging (Breiman, 1996) and random forests (Breiman, 2001) have been commonly applied. The original random forests perform prediction for a new data point by averaging the predictions from a large number of trees, which are often grown sufficiently deep to achieve low bias (Friedman et al., 2001). However, for censored data, a large tree may result in a small number of observed failures in the terminal nodes, leading to increased estimation bias of the survival or cumulative hazard functions. Existing survival forests inherit this strategy from the original random forests and directly average the cumulative hazard predictions from the individual trees; the node size parameter therefore needs to be carefully tuned to achieve accurate prediction. On the other hand, if the target estimate can be expressed as the solution of an unbiased estimating equation, a natural alternative is to solve the averaged estimating equation. In what follows, we propose an ensemble procedure based on averaging martingale estimating equations.

For b=1,\ldots,B, we draw the bth bootstrap sample from the training data. Let \mathbb{T}=\{{\mathcal{T}}_{b}\}_{b=1}^{B} be a collection of B partitions constructed from the bootstrap datasets. Each partition is constructed via a recursive partitioning procedure where, at each split, m predictors are randomly selected as the candidate splitting variables, with m smaller than the total number of predictors. Let l_{b} be the partition function based on the partition {\mathcal{T}}_{b}. The tree-based estimate from {\mathcal{T}}_{b} can be obtained from the following estimating equation,

\sum_{i=1}^{n}w_{bi}I(({Y_{\rm L}}_{i},{\bm{Z}}_{i},{\bm{X}}_{i}({Y_{\rm L}}_{i}))\in l_{b}(a,{\bm{z}},{\bm{x}})){\Delta_{\rm L}}_{i}\{N_{i}(dt,{Y_{\rm L}}_{i})-I(Y_{i}-{Y_{\rm L}}_{i}\geq t)\lambda(t\mid a,{\bm{z}},{\bm{x}})dt\}=0,

where w_{bi} is the frequency of the ith observation in the bth bootstrap sample. Note that when w_{bi}=1 for all i=1,\ldots,n, solving the above estimating equation yields the estimator in (5). To perform prediction using all the trees, we consider the following averaged estimating equation,

\sum_{i=1}^{n}w_{i}(a,{\bm{z}},{\bm{x}})\{N_{i}(dt,{Y_{\rm L}}_{i})-I(Y_{i}-{Y_{\rm L}}_{i}\geq t)\lambda(t\mid a,{\bm{z}},{\bm{x}})dt\}=0,

where w_{i}(a,{\bm{z}},{\bm{x}})=\sum_{b=1}^{B}w_{bi}I\{({Y_{\rm L}}_{i},{\bm{Z}}_{i},{\bm{X}}_{i}({Y_{\rm L}}_{i}))\in l_{b}(a,{\bm{z}},{\bm{x}})\}{\Delta_{\rm L}}_{i}/B. Solving the averaged estimating equation yields

\widehat{S}_{\mathbb{T}}(t\mid a,{\bm{z}},{\bm{x}})=\exp\left\{-\int_{0}^{t}\frac{\sum_{i=1}^{n}w_{i}(a,{\bm{z}},{\bm{x}})N_{i}(ds,{Y_{\rm L}}_{i})}{\sum_{i=1}^{n}w_{i}(a,{\bm{z}},{\bm{x}})I(Y_{i}-{Y_{\rm L}}_{i}\geq s)}\right\}.

The estimator \widehat{S}_{\mathbb{T}}(t\mid a,{\bm{z}},{\bm{x}}) can be viewed as an adaptive nearest neighbour estimator (Lin and Jeon, 2006), where the weight assigned to each observation is derived from the random forest. The risk prediction algorithm is given in Table 1.
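The weighting construction just described can be sketched in Python as follows; the leaf-assignment arrays, bootstrap counts, and event data below are hypothetical stand-ins for the output of a fitted forest, not an actual implementation of the paper's software.

```python
import math

def forest_weights(leaf_train, leaf_new, boot_counts, delta_L):
    """Compute w_i(a, z, x) = (1/B) * sum_b w_bi * 1{subject i shares the
    terminal node of the new point in tree b} * Delta_Li.
    leaf_train[b][i]: terminal node of subject i in tree b;
    leaf_new[b]: terminal node of the new point (a, z, x) in tree b;
    boot_counts[b][i]: bootstrap frequency w_bi."""
    B, n = len(leaf_train), len(leaf_train[0])
    w = [0.0] * n
    for b in range(B):
        for i in range(n):
            if delta_L[i] == 1 and leaf_train[b][i] == leaf_new[b]:
                w[i] += boot_counts[b][i]
    return [wi / B for wi in w]

def ensemble_survival(w, residual_times, events, t):
    """Weighted Nelson-Aalen survival estimate obtained by solving the
    averaged martingale estimating equation."""
    cum = 0.0
    for u in sorted({rt for rt, d in zip(residual_times, events) if d and rt <= t}):
        num = sum(wi for wi, rt, d in zip(w, residual_times, events) if d and rt == u)
        den = sum(wi for wi, rt in zip(w, residual_times) if rt >= u)
        cum += num / den
    return math.exp(-cum)
```

Averaging the estimating equations (rather than the per-tree survival curves) pools the risk sets across trees before the hazard increment is formed, which is what makes deep trees with few within-node failures usable.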

Table 1: The data preprocessing and risk prediction algorithm

Input: \{(Y_{i},\Delta_{i},{Y_{{\rm L}i}},{\Delta_{\rm L}}_{i},{\bm{X}}_{i}({Y_{{\rm L}i}}),{\bm{Z}}_{i}),i=1,\ldots,n\}
Output: \widehat{S}_{\mathbb{T}}(t\mid a,{\bm{z}},{\bm{x}})

Function Transform(a,{\bm{x}}):
  /* The function transforms the longitudinal predictors {\bm{X}}({T_{\rm L}}) (i.e., {\bm{x}}) into a vector \widetilde{\bm{X}}({T_{\rm L}}) (i.e., \widetilde{\bm{x}}) such that the rules in (R1) and (R2) on {\bm{X}}({T_{\rm L}}) can be achieved using conventional tree-structured rules on \widetilde{\bm{X}}({T_{\rm L}}) */
  \widetilde{\bm{x}}\leftarrow{\rm vector}({\rm size}:2Kq+J)
  for l=1 to Kq+J do
    if 1\leq l\leq Kq and {\bm{x}}[l]\neq{\rm NA} then  /* observed longitudinal measurements */
      \widetilde{\bm{x}}[2l-1]\leftarrow{\bm{x}}[l]; \widetilde{\bm{x}}[2l]\leftarrow{\bm{x}}[l]
    else if 1\leq l\leq Kq and {\bm{x}}[l]={\rm NA} then  /* unobserved longitudinal measurements */
      \widetilde{\bm{x}}[2l-1]\leftarrow M; \widetilde{\bm{x}}[2l]\leftarrow-M
      /* The split \widetilde{\bm{x}}[2l-1]>c vs. \widetilde{\bm{x}}[2l-1]\leq c corresponds to ({\bm{x}}[l]>c or {\bm{x}}[l]={\rm NA}) vs. {\bm{x}}[l]\leq c; the split \widetilde{\bm{x}}[2l]>c vs. \widetilde{\bm{x}}[2l]\leq c corresponds to {\bm{x}}[l]>c vs. ({\bm{x}}[l]\leq c or {\bm{x}}[l]={\rm NA}) */
    else if Kq+1\leq l\leq Kq+J and {\bm{x}}[l]=a+ then  /* unobserved intermediate events */
      \widetilde{\bm{x}}[Kq+l]\leftarrow M
    else  /* observed intermediate events */
      \widetilde{\bm{x}}[Kq+l]\leftarrow{\bm{x}}[l]/a
    end if
    /* The split \widetilde{\bm{x}}[Kq+l]>c vs. \widetilde{\bm{x}}[Kq+l]\leq c corresponds to {\bm{x}}[l]/a>c vs. {\bm{x}}[l]/a\leq c */
  end for
  return \widetilde{\bm{x}}
end

Data preprocessing:
  for i=1 to n do
    if {\Delta_{\rm L}}_{i}=1 then \widetilde{\bm{X}}_{i}({Y_{\rm L}}_{i})\leftarrow Transform({Y_{\rm L}}_{i},{\bm{X}}_{i}({Y_{\rm L}}_{i}))
  end for
  \widetilde{\bm{x}}\leftarrow Transform(a,{\bm{x}})

Survival probability prediction:
  Initialize the weights: w_{i}\leftarrow 0, i=1,\ldots,n
  for b=1 to B do
    Draw the bth bootstrap sample from the training data
    Construct a partition {\mathcal{T}}_{b} using subjects with {\Delta_{\rm L}}=1 in the bth bootstrap sample:
      - the predictors are \{{Y_{\rm L}},{\bm{Z}},\widetilde{\bm{X}}({Y_{\rm L}})\} and the censored outcome is (Y-{Y_{\rm L}},\Delta);
      - at each split, a random selection of m predictors are used as the candidate splitting variables
    for i=1 to n do
      if {\Delta_{\rm L}}_{i}=1 and ({Y_{\rm L}}_{i},{\bm{Z}}_{i},\widetilde{\bm{X}}_{i}({Y_{\rm L}}_{i})) and (a,{\bm{z}},\widetilde{\bm{x}}) are in the same terminal node then
        w_{i}\leftarrow w_{i}+w_{bi}  /* w_{bi} is the frequency of the ith observation in the bth bootstrap sample */
      end if
    end for
  end for
  Predict the survival probability using
  \widehat{S}_{\mathbb{T}}(t\mid a,{\bm{z}},{\bm{x}})=\exp\left\{-\sum_{i=1}^{n}\frac{w_{i}\Delta_{i}I(Y_{i}-{Y_{\rm L}}_{i}\leq t)}{\sum_{j=1}^{n}w_{j}I(Y_{j}-{Y_{\rm L}}_{j}\geq Y_{i}-{Y_{\rm L}}_{i})}\right\}.

Algorithm 1: The survival tree ensemble algorithm
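The Transform step of the algorithm can be sketched in Python as follows; here None is used as a stand-in for NA (a missing longitudinal value) or for an intermediate event not yet observed by the landmark time a, and M is any constant exceeding all observed predictor values. This is an illustrative sketch, not the authors' implementation.

```python
def transform(a, x, Kq, J, M=1e9):
    """Map the raw predictors x (length Kq + J) into x_tilde (length
    2Kq + J) so that splitting rules handling missing values and
    intermediate events reduce to ordinary axis-aligned splits.
    None encodes NA / an intermediate event unobserved by time a."""
    xt = [0.0] * (2 * Kq + J)
    for l in range(1, Kq + J + 1):           # 1-based index as in the algorithm
        if l <= Kq:                          # longitudinal measurements
            if x[l - 1] is not None:         # observed value: duplicate it
                xt[2 * l - 2] = x[l - 1]
                xt[2 * l - 1] = x[l - 1]
            else:                            # NA: send the two copies to opposite extremes
                xt[2 * l - 2] = M            # split ">c" groups NA with the high side
                xt[2 * l - 1] = -M           # split ">c" groups NA with the low side
        else:                                # intermediate event times
            if x[l - 1] is None:             # not observed by the landmark time a
                xt[Kq + l - 1] = M
            else:
                xt[Kq + l - 1] = x[l - 1] / a  # scale by the landmark time
    return xt
```

For instance, with K = 1 monitoring time, q = 2 longitudinal markers, and J = 1 intermediate event, a subject with one missing marker gets the two sentinel values, and an observed intermediate event time is replaced by its ratio to the landmark time.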

    4 Evaluating the landmark prediction performance

To evaluate the performance of the predicted risk score, we extend the cumulative/dynamic receiver operating characteristic (ROC) curves (Heagerty et al., 2000), which have been commonly used when a risk score is based on baseline predictors. We note that ROC curves and concordance indices for dynamic prediction at a fixed landmark time have been studied in the literature (Rizopoulos et al., 2017; Wang et al., 2017). Here we consider a more general case where the landmark time {T_{\rm L}} is random and subject to right censoring. For t>0, subjects with 0\leq T-{T_{\rm L}}<t are considered as cases and subjects with T-{T_{\rm L}}\geq t are considered as controls. The ROC curve then evaluates the performance of a risk score that discriminates between subjects who experience the event prior to {T_{\rm L}}+t and those who do not.

Let g(t,{T_{\rm L}},{\bm{Z}},{\bm{X}}({T_{\rm L}})) denote a risk score based on ({T_{\rm L}},{\bm{Z}},{\bm{X}}({T_{\rm L}})), with a larger value indicating a higher chance of being a case. For each t>0, the true positive rate and false positive rate at a threshold c are defined as follows,

{\rm TPR}_{t}(c)=P(g(t,{T_{\rm L}},{\bm{Z}},{\bm{X}}({T_{\rm L}}))>c\mid 0\leq T-{T_{\rm L}}<t,\,{T_{\rm L}}\leq\tau_{0}),
{\rm FPR}_{t}(c)=P(g(t,{T_{\rm L}},{\bm{Z}},{\bm{X}}({T_{\rm L}}))>c\mid T-{T_{\rm L}}\geq t,\,{T_{\rm L}}\leq\tau_{0}),

    where τ0\tau_{0} is a pre-specified constant. The ROC curve at tt is defined as ROCt(p)=TPRt(FPRt1(p)){\rm ROC}_{t}(p)={\rm TPR}_{t}({\rm FPR}_{t}^{-1}(p)). Following the arguments of McIntosh and Pepe (2002), it can be shown that g(t,a,𝒛,𝒙)=1S(ta,𝒛,𝒙)g(t,a,{\bm{z}},{\bm{x}})=1-S(t\mid a,{\bm{z}},{\bm{x}}) yields the highest ROC curve, which justifies the use of the proposed time-dependent ROC curve. Moreover, the area under the ROC curve is equivalent to the following concordance measure,

{\rm CON}_{t}(g)=P\{g(t,{T_{\rm L}}_{1},{\bm{Z}}_{1},{\bm{X}}_{1}({T_{\rm L}}_{1}))<g(t,{T_{\rm L}}_{2},{\bm{Z}}_{2},{\bm{X}}_{2}({T_{\rm L}}_{2}))\mid 0\leq T_{2}-{T_{\rm L}}_{2}<t\leq T_{1}-{T_{\rm L}}_{1},\,{T_{\rm L}}_{1}\leq\tau_{0},\,{T_{\rm L}}_{2}\leq\tau_{0}\}
\quad+0.5\,P\{g(t,{T_{\rm L}}_{1},{\bm{Z}}_{1},{\bm{X}}_{1}({T_{\rm L}}_{1}))=g(t,{T_{\rm L}}_{2},{\bm{Z}}_{2},{\bm{X}}_{2}({T_{\rm L}}_{2}))\mid 0\leq T_{2}-{T_{\rm L}}_{2}<t\leq T_{1}-{T_{\rm L}}_{1},\,{T_{\rm L}}_{1}\leq\tau_{0},\,{T_{\rm L}}_{2}\leq\tau_{0}\},

    where (TL1,𝒁1,𝑿1(TL1),T1)({T_{\rm L}}_{1},{\bm{Z}}_{1},{\bm{X}}_{1}({T_{\rm L}}_{1}),T_{1}) and (TL2,𝒁2,𝑿2(TL2),T2)({T_{\rm L}}_{2},{\bm{Z}}_{2},{\bm{X}}_{2}({T_{\rm L}}_{2}),T_{2}) are independent pairs of observations, and the second term accounts for potential ties in the risk score.

In practice, one usually builds the model on a training dataset and evaluates its performance on an independent test dataset that is also subject to right censoring. To simplify notation, we construct the estimator for {\rm CON}_{t}(g) using the observed data introduced in Section 3.2, although {\rm CON}_{t} evaluated on the test data should be used in real applications. Define d_{ij}(t)=g(t,{Y_{\rm L}}_{j},{\bm{Z}}_{j},{\bm{X}}_{j}({Y_{\rm L}}_{j}))-g(t,{Y_{\rm L}}_{i},{\bm{Z}}_{i},{\bm{X}}_{i}({Y_{\rm L}}_{i})). The {\rm CON}_{t}(g) measure can be estimated by

\widehat{{\rm CON}}_{t}(g)=\frac{\sum_{i\neq j}\{I(d_{ij}(t)>0)+0.5I(d_{ij}(t)=0)\}\dfrac{\Delta_{j}I(0\leq Y_{j}-{Y_{\rm L}}_{j}\leq t<Y_{i}-{Y_{\rm L}}_{i},\,{Y_{\rm L}}_{i}\leq\tau_{0},\,{Y_{\rm L}}_{j}\leq\tau_{0})}{\widehat{S}_{C}(Y_{j}\mid{\bm{Z}}_{j})\widehat{S}_{C}({Y_{\rm L}}_{i}+t\mid{\bm{Z}}_{i})}}{\sum_{i\neq j}\dfrac{\Delta_{j}I(0\leq Y_{j}-{Y_{\rm L}}_{j}\leq t<Y_{i}-{Y_{\rm L}}_{i},\,{Y_{\rm L}}_{i}\leq\tau_{0},\,{Y_{\rm L}}_{j}\leq\tau_{0})}{\widehat{S}_{C}(Y_{j}\mid{\bm{Z}}_{j})\widehat{S}_{C}({Y_{\rm L}}_{i}+t\mid{\bm{Z}}_{i})}},

    where S^C(t𝒛)\widehat{S}_{C}(t\mid{\bm{z}}) is an estimator for the conditional censoring distribution SC(t𝒛)=P(Ct𝒁=𝒛)S_{C}(t\mid{\bm{z}})=P(C\geq t\mid{\bm{Z}}={\bm{z}}). For example, survival trees or forests can be applied to estimate SC(t𝒛)S_{C}(t\mid{\bm{z}}); when censoring is completely random, the Kaplan-Meier estimator can also be applied. Under regularity conditions, we show that CON^t(g)\widehat{{\rm CON}}_{t}(g) consistently estimates CONt(g){{\rm CON}}_{t}(g) in the Supplementary Material. In practice, one can either use the concordance at a given time point tt or an integrated measure to evaluate the overall concordance on a given time interval [tL,tU][t_{\rm L},t_{\rm U}]. In the latter case, a weighted average of concordance on a grid of time points in [tL,tU][t_{\rm L},t_{\rm U}] can be reported. For example, one may assign equal weights to all the time points. Another example is a weight proportional to the denominator in CON^t(g)\widehat{\rm CON}_{t}(g), that is, ijΔjI(0YjYLjt<YiYLi,YLiτ0,YLjτ0)/S^C(Yj𝒁j)S^C(YLi+t𝒁i)\sum_{i\neq j}{\Delta_{j}I(0\leq Y_{j}-{Y_{\rm L}}_{j}\leq t<Y_{i}-{Y_{\rm L}}_{i},{{Y_{\rm L}}_{i}\leq\tau_{0},{Y_{\rm L}}_{j}\leq\tau_{0}})}/{{\widehat{S}}_{C}(Y_{j}\mid{\bm{Z}}_{j}){\widehat{S}}_{C}({Y_{\rm L}}_{i}+t\mid{\bm{Z}}_{i})}, to avoid potentially unstable estimation for very small or large time points.
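A minimal Python sketch of the inverse-probability-of-censoring-weighted concordance estimator follows; here S_C is supplied as a function argument (e.g., a Kaplan-Meier estimate under completely random censoring), and the variable names are illustrative rather than taken from the paper's software.

```python
def concordance_t(t, g, Y, YL, delta, S_C, tau0=float('inf')):
    """IPCW estimate of CON_t(g): subject j is a case (event within t of
    the landmark), subject i a control (event-free at landmark + t);
    S_C(u) estimates the censoring survival function P(C >= u)."""
    num = den = 0.0
    n = len(Y)
    for j in range(n):
        # case condition: Delta_j = 1, 0 <= Y_j - Y_Lj <= t, Y_Lj <= tau0
        if not (delta[j] == 1 and 0.0 <= Y[j] - YL[j] <= t and YL[j] <= tau0):
            continue
        for i in range(n):
            # control condition: Y_i - Y_Li > t, Y_Li <= tau0
            if i == j or not (Y[i] - YL[i] > t and YL[i] <= tau0):
                continue
            wgt = 1.0 / (S_C(Y[j]) * S_C(YL[i] + t))   # IPCW weight
            d = g[j] - g[i]                            # d_ij(t): case minus control score
            num += ((d > 0) + 0.5 * (d == 0)) * wgt    # ties get half credit
            den += wgt
    return num / den if den > 0 else float('nan')
```

With no censoring (S_C identically 1), the estimator reduces to the usual proportion of concordant case-control pairs among eligible pairs.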

    5 Permutation variable importance

    Variable importance is a useful measure for understanding the impact of predictors in tree ensembles and can be used as a reference for variable selection (Breiman, 2001). In the original random forests, each tree is constructed using a bootstrap sample of the original data, and the out-of-bag (OOB) data can be used to estimate the OOB prediction performance. The permutation variable importance of a predictor is computed as the average decrease in model accuracy on the OOB samples when the respective feature values are randomly permuted.

    To study variable importance in dynamic risk prediction using censored data, we consider an extension of variable importance. Following the original random forests, the OOB prediction for each training observation is made based on trees constructed without using this observation. Applying the same arguments as in Section 4, the prediction based on trees built without the iith subject is

\widehat{S}_{-i}(t\mid a,{\bm{z}},{\bm{x}})=\exp\left\{-\int_{0}^{t}\frac{\sum_{k=1}^{n}w_{k,-i}(a,{\bm{z}},{\bm{x}})N_{k}(ds,{Y_{\rm L}}_{k})}{\sum_{k=1}^{n}w_{k,-i}(a,{\bm{z}},{\bm{x}})I(Y_{k}-{Y_{\rm L}}_{k}\geq s)}\right\},

where w_{k,-i}(a,{\bm{z}},{\bm{x}})=\sum_{b=1}^{B}w_{bk}I(({Y_{\rm L}}_{k},{\bm{Z}}_{k},{\bm{X}}_{k}({Y_{\rm L}}_{k}))\in l_{b}(a,{\bm{z}},{\bm{x}}),w_{bi}=0){\Delta_{\rm L}}_{k}. Define d_{ij}(t)=\widehat{S}_{-i}(t\mid{Y_{\rm L}}_{i},{\bm{Z}}_{i},{\bm{X}}_{i}({Y_{\rm L}}_{i}))-\widehat{S}_{-j}(t\mid{Y_{\rm L}}_{j},{\bm{Z}}_{j},{\bm{X}}_{j}({Y_{\rm L}}_{j})), so that a larger d_{ij}(t) indicates a higher predicted risk for subject j relative to subject i. The OOB concordance at t can be calculated by applying the estimator \widehat{{\rm CON}}_{t} in Section 4. To compute the variable importance of a predictor, we permute this predictor and calculate the OOB concordance after the permutation. We repeat the permutation multiple times (e.g., 100 times) and define the variable importance as the average difference in OOB concordance over all the permutations.

Permuting variables is straightforward in the case of fixed landmark times (i.e., {T_{\rm L}}=t_{k}), where we randomly shuffle the observed values of the predictor among individuals who remain under observation at the landmark time, that is, subjects with {\Delta_{\rm L}}=1. When the landmark time is random (i.e., {T_{\rm L}}=U_{j}), we propose different permutation procedures according to the type of the variable: (1) If the variable of interest is completely observed at the landmark time {T_{\rm L}} (e.g., the value of {T_{\rm L}} and the baseline covariates {\bm{Z}}), we randomly shuffle its values among subjects with {\Delta_{\rm L}}=1. (2) If the variable of interest is an intermediate event time U_{j'} (j'\neq j), we permute its value relative to the landmark time, that is, U_{j'}({T_{\rm L}})/{T_{\rm L}}, among subjects with {\Delta_{\rm L}}=1. We permute the ratio U_{j'}({T_{\rm L}})/{T_{\rm L}} rather than the untransformed intermediate event time to avoid incompatible pairs of the intermediate event time and the landmark time under permutation. (3) If the variable of interest is an element of {\bm{W}}(t_{k}), we permute its values among subjects with complete observations at the landmark time (i.e., {T_{\rm L}}\geq t_{k} and {\Delta_{\rm L}}=1), because {\bm{W}}(t_{k}) is not used in prediction for subjects with {T_{\rm L}}<t_{k}.
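The permutation scheme for an intermediate event time (case (2) above) can be sketched as follows: the ratios U_{j'}/T_L are shuffled among subjects with Delta_L = 1 and then mapped back to the time scale, so the permuted times remain compatible with each subject's landmark time. The function name and arguments are illustrative.

```python
import random

def permute_intermediate_event(U_prime, T_L, delta_L, rng=None):
    """Permute an intermediate event time U_j' for variable importance:
    shuffle the ratios U_j' / T_L among subjects with Delta_L = 1 and map
    them back, leaving subjects with Delta_L = 0 untouched."""
    rng = rng or random.Random()
    idx = [i for i, d in enumerate(delta_L) if d == 1]
    ratios = [U_prime[i] / T_L[i] for i in idx]    # relative event times
    rng.shuffle(ratios)                            # permute among Delta_L = 1
    out = list(U_prime)
    for i, r in zip(idx, ratios):
        out[i] = r * T_L[i]                        # back to the time scale
    return out
```

Shuffling raw times instead could assign a subject an intermediate event time exceeding their own landmark time, which is exactly the incompatibility the ratio avoids.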

    6 Simulation

Simulation studies were conducted to assess the performance of the proposed methods in estimating the landmark survival probability in (1) when the landmark time is fixed or random. In both cases, we generated the time-independent predictors {\bm{Z}}=(Z_{1},\ldots,Z_{10}) from a multivariate normal distribution with E(Z_{i})=1, {\rm Var}(Z_{i})=1, and {\rm Cov}(Z_{i},Z_{j})=0 for i\neq j, i,j=1,\ldots,10. The longitudinal predictors were observed intermittently at the time points specified below.

    In the first set of simulations, the longitudinal predictors 𝑾(){\bm{W}}(\cdot) were measured at fixed landmark time points tk=kt_{k}=k for k=1,,Kk=1,\ldots,K. The longitudinal predictors 𝑾(t)=(W1(t),,W10(t)){\bm{W}}(t)=(W_{1}(t),\ldots,W_{10}(t)) were generated from Wi(t)=aiF(bit)/tW_{i}(t)=a_{i}F(b_{i}t)/t, i=1,,10i=1,\ldots,10, where aia_{i} and bib_{i} are independent standard uniform random variables and F(x)=1exp(x2)F(x)=1-\exp(-x^{2}). The probability in (1) can be expressed as P(Ttk+tTtk,𝑾(t1),,𝑾(tk),𝒁)P(T\geq t_{k}+t\mid T\geq t_{k},{\bm{W}}(t_{1}),\ldots,{\bm{W}}(t_{k}),{\bm{Z}}) for 0=t0<t<tk+1tk0=t_{0}<t<t_{k+1}-t_{k}, k1k\geq 1, and P(Tt𝒁)P(T\geq t\mid{\bm{Z}}) for 0<t<t10<t<t_{1}. For k1k\geq 1, we assume the hazard function of TT on (tk,tk+1)(t_{k},t_{k+1}) depends on the history of 𝑾(){\bm{W}}(\cdot) up to tt only through its value at tkt_{k}. For t(tk,tk+1)t\in(t_{k},t_{k+1}) and k0k\geq 0, we consider the following hazard functions for TT,
(I) \lambda(t\mid{\mathcal{H}}_{W}(t),{\bm{Z}})=t^{2}\exp\left\{-5+\sum_{j=1}^{10}\alpha_{kj}W_{j}(t_{k})+\sum_{j=1}^{10}\beta_{kj}W_{j}(t_{k})Z_{j}+\sum_{j=1}^{3}Z_{j}^{2}\right\}, where \alpha_{kj}=\beta_{kj}=2I(k=1)+4I(k\geq 2) for 1\leq j\leq 3 and \alpha_{kj}=\beta_{kj}=0 for 4\leq j\leq 10;
(II) \lambda(t\mid{\mathcal{H}}_{W}(t),{\bm{Z}})=0.1t^{2}+\exp\left\{-5+\sum_{j=1}^{10}\alpha_{kj}W_{j}(t_{k})+\sum_{j=1}^{10}\beta_{kj}W_{j}(t_{k})Z_{j}+\sum_{j=1}^{3}Z_{j}^{2}\right\}, where \alpha_{kj}=\beta_{kj}=I(k=1)+2I(k\geq 2) for 1\leq j\leq 3 and \alpha_{kj}=\beta_{kj}=0 for 4\leq j\leq 10.
    The closed-form expressions of the true landmark survival probabilities under Models (I) and (II) are given in the Supplementary Material. When evaluating the prediction performance of different approaches, we focus on the landmark probability at t2=2t_{2}=2.

In the second set of simulations, we consider the case where both an intermediate event and longitudinal markers are present. Specifically, we generated the event times from an irreversible multi-state model with three states: healthy, diseased, and death. We assume that all subjects start in the healthy state, disease onset is the intermediate event, and death is the event of interest. We generated the time to the first transition, denoted by D, from a uniform distribution on [0, 5]. Define the disease indicator \Pi, where \Pi=1 indicates that the subject moves from the healthy state to the diseased state at time D, and \Pi=0 indicates that the subject moves from the healthy state to death at time D. The disease indicator was generated from a logistic regression model, \operatorname{logit}P(\Pi=1\mid{\bm{Z}},{\bm{W}}(D))=\sum_{j=1}^{3}W_{j}(D)+\sum_{j=1}^{3}Z_{j}+\gamma, where \gamma is a frailty variable following a gamma distribution with mean 1 and variance 0.5, {\bm{W}}(t)=(W_{1}(t),\ldots,W_{10}(t)) were generated from W_{j}(t)=a_{j}\{1-\exp(-0.04t^{2})\}, and a_{j} follows a uniform distribution on [-1,1] for j=1,\ldots,10. Given that a subject had developed the disease at time D, i.e., \Pi=1, the residual survival time R was generated from the following models,
(III) \log R=-5+\sum_{j=1}^{3}W_{j}(D)+\sum_{j=1}^{3}Z_{j}^{2}+\sum_{j=1}^{3}W_{j}(D)Z_{j}+\log(1+D)+\gamma+\epsilon, where \epsilon is a standard normal random variable;
(IV) \log R=-5+\sum_{j=1}^{3}W_{j}(D)+\sum_{j=1}^{3}Z_{j}^{2}+\sum_{j=1}^{3}W_{j}(D)Z_{j}+\log(1+D)+\gamma+I(Z_{1}>0)\epsilon_{1}+I(Z_{1}\leq 0)\epsilon_{2}, where \epsilon_{1} and \epsilon_{2} are independent normal random variables with variances 1 and 0.25, respectively;
(V) The hazard function of R is t^{2}\exp[-5+\sum_{j=1}^{3}\{2I(1\leq D<2)+4I(D\geq 2)\}\{W_{j}(D)+W_{j}(D)Z_{j}+Z_{j}^{2}\}].
    When Π=1\Pi=1, the time to death is T=D+RT=D+R, and the time to the intermediate event is U=DU=D; when Π=0\Pi=0, the time to death is T=DT=D and the intermediate event does not occur. Under Models (III)-(V), we consider the following three scenarios depending on how the landmark time and the longitudinal markers are observed:

(A) The landmark is the intermediate event, i.e., {T_{\rm L}}=U, and {\bm{W}}(\cdot) is observed intermittently at t_{k}=k, k=1,\ldots,5. The target probability in (1) is

P(T\geq U+t\mid T\geq U,U,{\bm{W}}(t_{1},U),\ldots,{\bm{W}}(t_{K},U),{\bm{Z}}).

(B) The landmark time is fixed at {T_{\rm L}}=a, and {\bm{W}}(\cdot) is observed at the intermediate event. The target probability is

\begin{cases}P(T-a\geq t\mid T\geq a,U,{\bm{W}}(U),{\bm{Z}}),&U\leq a,\\ P(T-a\geq t\mid T\geq a,U>a,{\bm{Z}}),&U>a.\end{cases}

(C) The landmark is the intermediate event, and {\bm{W}}(\cdot) is observed at the intermediate event. The target probability is P(T\geq U+t\mid T\geq U,U,{\bm{W}}(U),{\bm{Z}}).

Scenario (A) is motivated by the CFFPR data, where the longitudinal marker is regularly monitored. Scenarios (B) and (C) are motivated by applications where markers are observed when a disease is diagnosed. In Scenario (B), we set a=2. Due to the complicated relationship between the event times and the longitudinal markers, deriving closed-form expressions of the true probabilities under Models (III)-(V) is challenging. We outline the Monte Carlo method used to approximate the true probabilities in the Supplementary Material.
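As an illustration of the data-generating mechanism, the sketch below simulates one subject from the multi-state model under Model (III), following the distributions stated above (Z_i ~ N(1,1), a_j ~ Uniform(-1,1), gamma frailty with mean 1 and variance 0.5, D ~ Uniform(0,5)). It is a hedged reconstruction for exposition, not the authors' simulation code.

```python
import math
import random

def simulate_subject(rng):
    """One subject from the three-state generator under Model (III):
    healthy -> diseased (intermediate event) or healthy -> death."""
    Z = [rng.gauss(1.0, 1.0) for _ in range(10)]          # baseline predictors
    a = [rng.uniform(-1.0, 1.0) for _ in range(10)]
    W = lambda t: [aj * (1.0 - math.exp(-0.04 * t * t)) for aj in a]
    gamma = rng.gammavariate(2.0, 0.5)                    # mean 1, variance 0.5
    D = rng.uniform(0.0, 5.0)                             # time of first transition
    WD = W(D)
    eta = sum(WD[:3]) + sum(Z[:3]) + gamma                # logistic linear predictor
    if rng.random() < 1.0 / (1.0 + math.exp(-eta)):       # Pi = 1: disease onset at D
        logR = (-5.0 + sum(WD[:3]) + sum(z * z for z in Z[:3])
                + sum(w * z for w, z in zip(WD[:3], Z[:3]))
                + math.log(1.0 + D) + gamma + rng.gauss(0.0, 1.0))
        return {'U': D, 'T': D + math.exp(logR)}          # death after the disease
    return {'U': None, 'T': D}                            # death without the disease
```

Subjects with U = None never enter the diseased state, so under Scenario (A) or (C) they never reach the landmark and contribute with Delta_L = 0.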

For all scenarios, the censoring time was generated from an independent exponential distribution with rate c, where c was chosen to achieve either a 20% or 40% censoring rate at baseline. We simulated 500 training datasets with sample sizes of 200 and 400 at baseline. The results for large sample sizes (n=5000) are included in the Supplementary Material. The trees were constructed with a minimum terminal node size of 15. When a single tree was used for prediction, the size of the tree was determined by split-complexity pruning via ten-fold cross-validation. To grow the trees in the ensemble method, we randomly selected \left\lceil\sqrt{p}\right\rceil variables at each splitting step and did not prune the trees. We applied the log-rank splitting rule in the ensemble method in order to compare the martingale-based ensemble approach with the default ensemble approach in random survival forests. Each fitted model was evaluated on independent test data with 500 observations. The evaluation criteria were the integrated mean absolute error (IMAE), the integrated mean squared error (IMSE), the integrated Brier score (IBS), and the integrated concordance over [0,t_{0}], where t_{0} is set to be approximately the 90th percentile of T-{T_{\rm L}}. The integrated concordance was defined in Section 4, and the other criteria are defined as follows:

{\rm IMAE}=\sum_{i=1}^{500}\int_{0}^{t_{0}}\left|\widehat{S}(t\mid T^{0}_{{\rm L}i},{\bm{Z}}_{i}^{0},{\mathcal{H}}_{i}^{0}(T^{0}_{{\rm L}i}))-S(t\mid T^{0}_{{\rm L}i},{\bm{Z}}_{i}^{0},{\mathcal{H}}_{i}^{0}(T^{0}_{{\rm L}i}))\right|dt/500,
{\rm IMSE}=\sum_{i=1}^{500}\int_{0}^{t_{0}}\left\{\widehat{S}(t\mid T^{0}_{{\rm L}i},{\bm{Z}}_{i}^{0},{\mathcal{H}}_{i}^{0}(T^{0}_{{\rm L}i}))-S(t\mid T^{0}_{{\rm L}i},{\bm{Z}}_{i}^{0},{\mathcal{H}}_{i}^{0}(T^{0}_{{\rm L}i}))\right\}^{2}dt/500,
{\rm IBS}=\sum_{i=1}^{500}\int_{0}^{t_{0}}\left\{\widehat{S}(t\mid T^{0}_{{\rm L}i},{\bm{Z}}_{i}^{0},{\mathcal{H}}_{i}^{0}(T^{0}_{{\rm L}i}))-I(T^{0}_{i}-{T_{\rm L}}^{0}_{i}\geq t)\right\}^{2}dt/500,

    where S(tTL,𝒁,(TL))=P(TTLtTTL,TL,𝒁,(TL))S(t\mid{T_{\rm L}},{\bm{Z}},{\mathcal{H}}({T_{\rm L}}))=P(T-{T_{\rm L}}\geq t\mid T\geq{T_{\rm L}},{T_{\rm L}},{\bm{Z}},{\mathcal{H}}({T_{\rm L}})) is the target survival probability, and the superscript 0 is used to denote the test data.
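These criteria can be approximated by numerical integration on a grid; the sketch below computes one test subject's contribution to the IMAE, IMSE, and IBS (averaging over the 500 test subjects then amounts to summing the contributions and dividing by 500). The function names and the Riemann-sum grid are illustrative choices.

```python
def integrated_errors(S_hat, S_true, residual_time, t_grid):
    """One test subject's contribution to the IMAE, IMSE, and IBS on an
    equally spaced grid over [0, t0]; S_hat and S_true map t to the
    predicted and true survival of the residual time T - T_L, and
    residual_time is the subject's observed T0 - T_L0."""
    dt = t_grid[1] - t_grid[0]                     # grid spacing (Riemann sum)
    imae = sum(abs(S_hat(t) - S_true(t)) for t in t_grid) * dt
    imse = sum((S_hat(t) - S_true(t)) ** 2 for t in t_grid) * dt
    ibs = sum((S_hat(t) - (1.0 if residual_time >= t else 0.0)) ** 2
              for t in t_grid) * dt
    return imae, imse, ibs
```

The IMAE and IMSE require knowing the true survival curve (available in simulations), whereas the IBS compares the prediction with the observed at-risk indicator and is therefore also computable on real data.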

For comparison, we applied the conventional random survival forest implemented in the R package ranger (Wright and Ziegler, 2017) and the simple landmark Cox model in all scenarios; we also applied the two-stage landmark approach in Models (I) and (II). We used the proposed data preprocessing procedure to prepare the predictors for the conventional random survival forest but calculated the predicted survival probabilities by its default (i.e., by averaging the cumulative hazard predictions). Under Models (I) and (II), the predictors in the simple landmark Cox model are \{{\bm{W}}(t_{1}),\ldots,{\bm{W}}(t_{k}),{\bm{Z}}\}. Under Models (III)-(V), the predictors in the simple landmark Cox model are \{U,{\bm{Z}},{\bm{W}}(t_{k})I(t_{k}\leq U),I(t_{k}>U);k\geq 1\}, \{{\bm{Z}},UI(U\leq a),{\bm{W}}(U)I(U\leq a),I(a>U)\}, and \{U,{\bm{Z}},{\bm{W}}(U)\} in sub-scenarios (A), (B), and (C), respectively. For the two-stage landmark approach, we fit a separate linear mixed model for each variable in {\bm{W}}(\cdot): each model includes a time variable and a random intercept and was estimated using the repeated measurements from subjects at risk at the landmark time. The predictors of the landmark Cox model then include the BLUPs of the random effects and {\bm{Z}}.

The simulation results are summarized in Tables 2 and 3, in which the proposed ensemble method outperforms the others under all four evaluation criteria we considered. As expected, when the sample size increases from 200 to 400, the IMAEs, IMSEs, and IBSs of the proposed methods decrease, while the integrated concordance increases. The conventional random survival forest approach performs similarly to the proposed ensemble method when n=200 but loses its edge as the sample size increases. On the other hand, the simple landmark Cox model and the two-stage landmark approach yield similar results under Model (I), but the former yields better results under Model (II). When comparing the tree models with the Cox models, we observe that a small IMSE is not necessarily accompanied by a large integrated concordance. We conjecture this is because the concordance measure depends only on the order of the predicted survival probabilities and is less sensitive to risk calibration. In summary, the proposed ensemble method performs strongly and serves as an appealing tool for dynamic risk prediction.

Table 2: Summaries of integrated mean absolute error (IMAE) (×1000), integrated mean squared error (IMSE) (×1000), integrated Brier score (IBS) (×1000), and integrated concordance (ICON) (×1000) of different methods in the first set of simulations. cen is the censoring percentage; Tr: the proposed tree; E1: the proposed survival tree ensemble; E2: the original random survival forest; C1: the landmark Cox model; C2: the two-stage landmark Cox model.
    IMAE IMSE IBS ICON
    nn cen Tr E1 E2 C1 C2 Tr E1 E2 C1 C2 Tr E1 E2 C1 C2 Tr E1 E2 C1 C2
    Scenario (I)
    200 20% 213 196 219 220 216 74 64 71 91 91 192 183 196 208 208 508 642 634 572 624
    40% 215 202 222 236 239 76 68 74 106 112 185 177 188 212 217 509 621 615 562 607
    400 20% 201 172 207 204 197 65 48 63 74 66 181 165 188 192 184 515 694 687 593 655
    40% 205 178 212 209 206 70 51 66 80 73 176 160 180 188 181 526 675 666 586 647
    Scenario (II)
    200 20% 73 76 83 139 175 13 13 14 38 58 172 172 182 196 214 531 537 537 533 534
    40% 76 79 86 192 236 13 14 15 69 98 161 161 171 208 235 530 538 536 536 536
    400 20% 64 59 81 91 114 11 10 13 19 26 164 160 172 177 184 530 547 546 539 541
    40% 62 64 83 113 145 10 11 14 26 41 156 150 159 172 184 531 544 542 536 538
    Table 3: Summaries of integrated mean absolute error (IMAE) (×1000)(\times 1000), integrated mean squared error (IMSE) (×1000)(\times 1000), integrated Brier score (IBS) (×1000)(\times 1000), and integrated concordance (ICON) (×1000)(\times 1000) of different methods in the second set of simulations. cen is the censoring percentage; Tr: the proposed tree; E1: the proposed survival tree ensemble; E2: the original random survival forest; C1: the landmark Cox model.
    IMAE IMSE IBS ICON
    nn cen Tr E1 E2 C1 Tr E1 E2 C1 Tr E1 E2 C1 Tr E1 E2 C1
    Scenario (III-A)
    200 20% 192 167 178 310 84 72 79 237 98 79 86 260 816 902 900 531
    40% 266 194 198 405 114 85 93 319 140 81 87 323 610 906 906 530
    400 20% 173 154 169 255 68 61 69 185 70 67 78 210 904 909 909 544
    40% 204 178 191 323 85 76 84 239 74 64 76 241 882 917 915 523
    Scenario (III-B)
    200 20% 234 216 219 249 94 81 89 122 143 132 133 161 544 713 700 614
    40% 245 228 215 268 99 83 87 138 144 133 133 169 545 652 645 584
    400 20% 212 181 192 213 84 65 70 92 132 116 126 133 572 764 760 646
    40% 224 196 193 226 92 67 68 101 135 117 127 136 584 708 703 619
    Scenario (III-C)
    200 20% 236 214 228 259 108 93 97 125 150 138 138 168 524 822 713 643
    40% 253 233 256 292 121 106 110 150 160 147 147 187 522 783 678 623
    400 20% 232 187 206 236 95 66 85 98 138 113 128 143 550 901 810 688
    40% 247 204 233 258 106 77 94 114 146 122 134 153 574 872 777 669
    Scenario (IV-A)
    200 20% 183 163 170 311 80 69 69 238 94 80 81 261 844 902 876 519
    40% 248 191 195 411 109 83 81 324 127 81 83 329 680 906 871 526
    400 20% 168 150 165 258 74 59 66 187 79 67 78 209 906 909 879 510
    40% 197 173 187 325 90 72 79 241 83 64 76 240 885 915 876 510
    Scenario (IV-B)
    200 20% 199 180 181 206 73 62 63 91 142 132 134 161 543 717 706 615
    40% 198 178 177 216 72 59 60 98 144 133 133 168 549 655 648 585
    400 20% 184 153 156 190 65 49 53 72 130 115 125 132 581 770 770 647
    40% 186 152 157 188 66 46 50 72 134 117 127 135 580 712 707 620
    Scenario (IV-C)
    200 20% 269 245 243 263 114 98 103 130 150 136 137 167 520 828 833 649
    40% 299 276 275 297 128 112 117 155 159 146 146 185 522 789 789 631
    400 20% 252 209 212 244 101 81 90 114 147 111 125 151 552 904 897 694
    40% 278 239 240 274 114 93 100 131 155 120 132 162 571 878 871 674
    Scenario (V-A)
    200 20% 212 185 190 349 89 75 78 311 133 112 115 333 688 832 824 534
    40% 229 208 215 368 103 90 92 350 115 100 103 316 707 815 820 526
    400 20% 193 166 179 377 75 66 69 310 107 105 105 362 828 859 843 534
    40% 220 190 206 427 91 83 84 365 95 92 93 347 810 847 842 539
    Scenario (V-B)
    200 20% 180 158 154 188 54 42 42 70 161 150 151 175 564 672 677 572
    40% 189 164 164 200 62 46 47 83 147 135 138 164 569 650 651 566
    400 20% 168 136 141 173 53 31 35 55 160 135 143 161 566 715 710 599
    40% 177 141 149 178 63 33 39 62 147 120 128 146 604 691 685 589
    Scenario (V-C)
    200 20% 234 208 207 239 89 71 71 108 157 140 141 174 527 747 733 580
    40% 242 222 223 257 101 80 81 128 145 129 130 167 549 712 694 568
    400 20% 219 179 181 226 83 54 55 91 151 125 127 159 581 827 812 609
    40% 219 192 194 237 90 61 62 101 137 114 112 146 631 808 797 595

    7 Application to the Cystic Fibrosis Foundation Patient Registry Data

    Understanding the risk factors associated with the progressive loss of lung function is crucial in managing CF. The risk for lung disease depends on patient characteristics, and certain patient groups, such as Hispanic patients, are at increased risk of severe disease for reasons not yet known (McGarry et al., 2019). Our goal is to build prediction models for the development of moderate airflow limitation, defined as the first time that ppFEV1 drops below 80% in the CFFPR. Our analysis focused on 5,398 pediatric CF patients who were diagnosed before age one between 2008 and 2013; among them, 419 were Hispanic. The data were subject to right censoring due to loss to follow-up or administrative censoring. A total of 4,507 failure events were observed, with a median follow-up time of 7.71 years.

    The rich information in the CFFPR data makes possible a comprehensive evaluation of important risk factors. We considered baseline predictors including gender, ethnicity, maternal education status ($\geq$16 years of education vs. else), insurance status, geographic location (West, Midwest, Northeast, and South), and mutation class (severe, mild, and unknown). Since baseline factors may have limited predictive ability, we further included repeated measurements and intermediate events as predictors. The longitudinal measurements, ppFEV1, percent predicted forced vital capacity (ppFVC), weight, and height, were assessed regularly throughout the study and were annualized at integer ages via last observation carried forward. The intermediate events include different subtypes of PA (initial acquisition, mucoid, chronic, multidrug-resistant), methicillin-sensitive Staphylococcus aureus (MSSA), methicillin-resistant Staphylococcus aureus (MRSA), and the diagnoses of CF-related diabetes (CFRD) and pancreatic insufficiency.
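The annualization step described above can be sketched in a few lines. Here `annualize_locf` is a hypothetical helper (not code from the paper's software) that carries the last observed value of a marker forward to each integer age, returning a missing value for ages before the first measurement:

```python
from bisect import bisect_right

def annualize_locf(times, values, ages):
    """Annualize an irregularly measured marker at integer ages using
    last observation carried forward (LOCF): for each target age, take
    the most recent measurement at or before that age, or None if no
    measurement has been observed yet. `times` must be sorted."""
    out = []
    for age in ages:
        i = bisect_right(times, age)
        out.append(values[i - 1] if i > 0 else None)
    return out

# Example: a marker measured at ages 6.2, 6.9, and 8.4.
times = [6.2, 6.9, 8.4]
marker = [95.0, 92.0, 88.0]
print(annualize_locf(times, marker, ages=[6, 7, 8, 9]))
# [None, 92.0, 92.0, 88.0]
```

The `None` placeholder at age 6 mirrors the fact that a marker carries no information before its first observation.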

    For landmark prediction, we considered the following fixed and random landmark times:

    1. (LM1)

      The landmark time is age 7, with the target prediction interval $[7, 22]$.

    2. (LM2)

      The landmark time is age 12, with the target prediction interval $[12, 22]$.

    3. (LM3)

      The landmark time is the acquisition of chronic Pseudomonas aeruginosa (cPA), and the target prediction interval is from the time of acquiring cPA to age 22.

    The fixed landmark ages 7 and 12 correspond to middle childhood and preadolescence. The random landmark event cPA was considered because patients with cPA are more likely to develop increased inflammation, leading to an accelerated loss of lung function (Kamata et al., 2017). The median time to cPA in our dataset was 15.5 years. While PA is a frequent pathogen in cystic fibrosis, early PA is eradicated in the majority of patients through inhaled and intravenous antibiotics (Döring et al., 2004; Heltshe et al., 2018). When initial PA is not eradicated, it converts to chronic PA. Since initial PA is frequently eradicated and thus has limited long-term impact on pulmonary function, we chose cPA as the landmark event (Harun et al., 2016). A model using initial PA as the landmark is reported in the Supplementary Material. To perform risk prediction, we used Model (LM1) to obtain the future risk for a patient who is event-free at age 7. An updated prediction can be carried out using Model (LM2) if the patient remains event-free at age 12. Upon converting to cPA, the predicted risk can be updated using Model (LM3). Although landmark prediction models can be constructed at multiple time points, we suggest that practitioners focus on the predicted probabilities given by the landmark model closest in time. To illustrate the event history and landmark times, we show the occurrence of PA during the follow-up period for a random sample of 50 patients in the Supplementary Material. At each landmark time, the exact timing of an intermediate event is available only if the event has occurred before the landmark time.

    The data were partitioned into a training set (60%) and a test set (40%). The landmark prediction models were built using the training data and evaluated on the test data via the proposed concordance measure. Results from the simple landmark Cox model and the two-stage approach are also reported for comparison. As in conventional landmark prediction, the Cox models were constructed using subjects who remained event-free and uncensored at each landmark time. In the Cox models, the $j$th partially observed intermediate event was incorporated via two predictors, $U_{j}I(U_{j}\leq T_{\rm L})$ and $I(U_{j}>T_{\rm L})$. In the simple landmark Cox model, the partially observed repeated measurements at $t_{k}$ were expressed using ${\bm W}(t_{k})I(t_{k}\leq T_{\rm L})$ and $I(t_{k}>T_{\rm L})$ under Model (LM3). The models were built as described in the simulation section.
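The two-predictor encoding of a partially observed intermediate event time $U_j$ at landmark time $T_{\rm L}$ can be sketched as follows; `encode_intermediate_event` is a hypothetical helper illustrating the construction, not code from the paper:

```python
def encode_intermediate_event(u, t_landmark):
    """Encode a partially observed intermediate event time U at landmark
    time T_L via the two predictors used in the landmark Cox models:
    U*I(U <= T_L), the observed event time (0 if not yet observed), and
    I(U > T_L), an indicator that the event has not occurred by T_L.
    `u` may be float('inf') for subjects with no event by the landmark."""
    occurred = u <= t_landmark
    return (u if occurred else 0.0, 0 if occurred else 1)

# A subject who acquired the event at age 5, evaluated at landmark age 7:
print(encode_intermediate_event(5.0, 7.0))           # (5.0, 0)
# A subject still event-free at the landmark:
print(encode_intermediate_event(float('inf'), 7.0))  # (0.0, 1)
```

The indicator component lets the model distinguish "event not yet observed" from "event observed at time 0", so the zero in the first component is never misread as an actual event time.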

    The concordance measures are summarized in Table 4. For landmark prediction at ages 7 and 12, we report the average ${\rm CON}_{t}$ at 50 equally spaced time points on the target prediction intervals. For prediction at the landmark age of 12, both Models (LM1) and (LM2) can be applied: (LM1) uses history up to age 7, while (LM2) uses history up to age 12. As expected, incorporating the additional information between ages 7 and 12 increases the average concordance from 0.711 to 0.739 in our ensemble model. For the landmark model at cPA, we used the concordance at a time horizon of 5 years after cPA as the evaluation criterion. Since we focus on the risk prior to age 22, predicting the 5-year risk for individuals who acquired the chronic form of PA after age 17 is not feasible; therefore, the concordance was evaluated in the subsample of subjects who developed cPA before age 17. The ensemble method yielded better performance than its competitors.

    Table 4: Concordance measures evaluated using the test data in the CFFPR analysis. The column Landmark gives the left bound of the target prediction interval. The integrated concordance is reported for Models (LM1) and (LM2) over the interval where the risk prediction is performed, where $\widehat{\rm CON}_{[a,b]}=\sum_{j=1}^{50}\widehat{\rm CON}_{t_{j}}/50$ and $t_{j}=a+(b-a)j/50$. The concordance at year 5 after cPA is reported for Model (LM3).
    Landmark Measure Model Tree Ensemble Cox-1 Cox-2
    Age 7 $\widehat{\rm CON}_{[0,15]}$ LM1 0.654 0.748 0.695 0.699
    Age 12 $\widehat{\rm CON}_{[0,10]}$ LM2 0.667 0.739 0.674 0.674
    Age 12 $\widehat{\rm CON}_{[0,10]}$ LM1 0.620 0.711 0.611 0.669
    cPA $\widehat{\rm CON}_{5}$ LM3 0.763 0.813 0.788 -

    Note: Cox-1 stands for the simple one-stage landmark Cox models, and Cox-2 stands for the two-stage landmark Cox models.
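The integrated concordance reported for Models (LM1) and (LM2) is a simple average of the time-dependent concordance over 50 equally spaced grid points on the prediction interval. A minimal sketch, with `con_at` standing in for an arbitrary (hypothetical) estimator of ${\rm CON}_{t}$:

```python
def integrated_concordance(con_at, a, b, n_points=50):
    """Average the time-dependent concordance CON_t over n_points equally
    spaced evaluation times t_j = a + (b - a) * j / n_points, j = 1..n_points.
    `con_at` is any callable returning a concordance estimate at time t;
    here it is a stand-in for the estimator described in the paper."""
    ts = [a + (b - a) * j / n_points for j in range(1, n_points + 1)]
    return sum(con_at(t) for t in ts) / n_points

# Sanity check: a constant concordance averages to that constant.
print(integrated_concordance(lambda t: 0.5, a=0.0, b=15.0))  # 0.5
```

Note the grid starts at $t_1 = a + (b-a)/50$ and ends at $t_{50} = b$, matching the formula in the Table 4 caption.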

    To identify important predictors in the ensemble models, we computed the permutation variable importance with 100 permutations. Under the fixed landmark time models, we permuted all repeated measurements of a marker simultaneously to evaluate the overall impact of the longitudinal marker. The results are summarized in Figure 3, where ppFEV1, ppFVC, weight, and height are identified as the top four predictors for both landmark ages 7 and 12 (Figures 3(a) and 3(b)). Following them, intermediate events related to PA and Staphylococcus aureus are moderately important. We note that mucoid PA and MSSA became more important at age 12 than at age 7, likely because these intermediate events are uncommon before age 7. When using cPA as the landmark, the repeated measurements after the acquisition of cPA were not used in prediction, and thus the number of observed repeated measurements varied across subjects. Unlike baseline variables and intermediate events, whose permutations were performed among all subjects who experienced cPA, the permutation of a marker at a specific time point (e.g., age 7) was performed among subjects who experienced cPA after that time point. Accordingly, we plot the variable importance for predictors at different ages separately, so that the variable importance measures in each plot are based on permuting the same set of subjects. Figure 4(a) shows the importance of baseline variables and intermediate events. The timing of cPA plays an important role in predicting future event risk. We note that baseline factors such as ethnicity have relatively low variable importance. This does not mean, however, that ethnicity does not affect the risk of lung function decline; we conjecture that the effect of ethnicity is predominantly mediated through the spirometry measurements, so the variable importance should be interpreted with caution.
    When applying the proposed method in health disparity research, one can further build separate models for Hispanic and non-Hispanic patients. Additional analyses in different ethnic groups are included in the Supplementary Material.
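The permutation variable importance computed above can be sketched generically. Below, `score` is a hypothetical stand-in for evaluating the out-of-bag concordance of a fitted ensemble on a data set, and the toy example only illustrates the mechanics:

```python
import random

def permutation_importance(score, X, n_perm=100, seed=1):
    """Generic sketch of permutation variable importance: for each column j,
    permute its values across subjects n_perm times and average the drop in
    a user-supplied accuracy score (e.g., an out-of-bag concordance).
    X is a list of feature lists; `score(X)` returns a number."""
    rng = random.Random(seed)
    base = score(X)
    importance = {}
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_perm):
            perm = [row[j] for row in X]
            rng.shuffle(perm)
            Xp = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, perm)]
            drops.append(base - score(Xp))
        importance[j] = sum(drops) / n_perm
    return importance

# Toy check: the score rewards column 0 matching the row index and
# ignores column 1, so column 1 gets zero importance.
X = [[i, i % 3] for i in range(20)]
score = lambda data: sum(row[0] == i for i, row in enumerate(data)) / len(data)
imp = permutation_importance(score, X, n_perm=10)
print(imp[1])  # 0.0
```

As in the paper's analysis, permuting an informative predictor produces a large average drop in the score, while permuting an irrelevant one leaves the score unchanged.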

    (a) Variable importance when the landmark time is age 7 and the risk prediction interval is [7,22][7,22].
    (b) Variable importance when the landmark time is age 12 and the risk prediction interval is [12,22][12,22].
    Figure 3: The permutation variable importance in the CFFPR data analysis, when the landmark times are ages 7 and 12. The boxplots show the decreases in OOB concordances from the 100 permutations and are ranked in descending order according to the mean value.
    (a) Variable importance for baseline variables and intermediate events
    (b) Variable importance for markers at age 7.
    (c) Variable importance for markers at age 12.
    Figure 4: The permutation variable importance in the CFFPR data analysis, when the landmark time is cPA. The boxplots show the decreases in OOB concordances from the 100 permutations and are ranked in descending order according to the mean value.

    Our models identify repeated measurements of weight and height as important variables in landmark prediction. To provide more insight into how historical weight measurements affect future risk in our ensemble model, we present the predicted event-free probabilities for hypothetical patients with different weight trajectories in Figure 5. Specifically, we consider Hispanic and non-Hispanic male patients whose weights were at the 10th, 50th, and 90th weight-for-age percentiles of the corresponding ethnicity-gender subgroup. For all six patients, the intermediate event times were fixed at the medians derived from the Kaplan-Meier estimates, while categorical and continuous predictors were fixed at the reference levels and mean values, respectively. At the landmark age 7, patients with the 10th percentile weight trajectories had the highest predicted risk, followed by patients with 90th percentile and those with 50th percentile weight trajectories (Figure 5(a)). At the landmark age 12, the predicted risk for patients with 10th percentile weight remains the highest, followed by patients with 50th percentile and those with 90th percentile weight trajectories (Figure 5(b)). A possible explanation for the flip between the 50th and 90th percentile curves from landmark age 7 to landmark age 12 is a higher degree of survivor bias at the later landmark age; in other words, the event-free individuals at landmark age 12 tended to have better lung function than those at landmark age 7. If being overweight is associated with a higher risk of lung function decline, the 90th percentile patients are more likely to fail before age 12; as a result, the overweight subjects remaining at landmark age 12 are healthier, and their risk beyond age 12 can be lower than that of individuals with 50th percentile weight. For underweight individuals, the degree of selection bias may not be large enough to compensate for their poor lung function, and thus they remain the highest-risk group at landmark age 12. To summarize, the landmark prediction models built at ages 7 and 12 are for different survivor populations and thus cannot be directly compared.

    (a) Landmark time point at age 7, predicting the event risk on [7,22][7,22].
    (b) Landmark time point at age 12, predicting the event risk on [12,22][12,22].
    Figure 5: Survival predictions for patients in different weight groups. Curves on the left show the repeated weight measurements before the landmark time. Curves on the right show the survival predictions over the interval of interest for the corresponding groups. The weight groups were defined by weight-for-age percentile and ethnicity: the 10th, 50th, and 90th percentiles for non-Hispanic patients, and the 10th, 50th, and 90th percentiles for Hispanic patients.

    8 Discussion

    In this paper, we proposed a unified framework for tree-based risk prediction with updated information. Compared with semiparametric methods, our methods can handle a large, growing number of predictors over time and do not impose strong model assumptions. Furthermore, the landmark times at which a prediction is performed are allowed to be subject-specific and defined by intermediate clinical events. Notably, our ensemble procedure averages the unbiased martingale estimating equations instead of survival probabilities, and thus avoids the potential bias arising from small terminal node sizes.

    Our discussion has focused on the case where the time-dependent variables ${\bm W}(\cdot)$ are observed at fixed time points $t_{1},\ldots,t_{K}$. The proposed method can also be applied when repeated measurements are collected at irregular time points, such as hospitalizations. When building prediction models, one can use up to $K$ repeated measurements as predictors, where $K$ is a fixed integer. Denote by $V_{1}<V_{2}<\cdots<V_{K}$ the potential observation times of ${\bm W}(\cdot)$. At time $t$, the available information can be expressed as ${\bm X}(t)=\{{\bm W}(V_{1},t),\ldots,{\bm W}(V_{K},t),V_{1}(t),\ldots,V_{K}(t)\}$, where ${\bm W}(V_{k},t)={\bm W}(V_{k})$ and $V_{k}(t)=V_{k}$ if $V_{k}\leq t$, while ${\bm W}(V_{k},t)=\textbf{NA}_{q}$ and $V_{k}(t)=t^{+}$ otherwise. In other words, the random measurement times $V_{1},\ldots,V_{K}$ are treated as intermediate event times. In this way, our framework can incorporate irregularly observed covariate information.
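The construction of ${\bm X}(t)$ with placeholder codes for not-yet-observed measurement times can be sketched as follows; `features_at` is a hypothetical illustration in which `None` stands in for the $\textbf{NA}_{q}$ code and $t$ itself for $t^{+}$:

```python
def features_at(t, v_times, w_values, na=None):
    """Assemble the predictor vector X(t): for each potential measurement
    time V_k, include (W(V_k), V_k) if V_k <= t, and (NA, t+) otherwise,
    treating the random measurement times as intermediate event times.
    `na` and the use of t for t+ are illustrative encoding choices."""
    W, V = [], []
    for vk, wk in zip(v_times, w_values):
        if vk <= t:
            W.append(wk)
            V.append(vk)
        else:
            W.append(na)
            V.append(t)
    return W + V

# K = 3 potential measurements; only the first two observed by t = 5:
print(features_at(5.0, [1.0, 4.0, 8.0], [0.9, 0.8, 0.7]))
# [0.9, 0.8, None, 1.0, 4.0, 5.0]
```

Because each $V_k$ enters the feature vector alongside its measurement, a tree can split both on the measured values and on how recently (or whether) they were observed.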

    Acknowledgements

    The authors would like to thank the three anonymous referees, the Associate Editor, and the Editor for their helpful comments that improved the quality of this paper. The authors thank the CF Foundation for the use of the CFFPR data to conduct this study. Additionally, we would like to thank the patients, care providers, and care coordinators at CF centers throughout the U.S. for their contributions to the CFFPR. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung and Blood Institute (NHLBI), the National Institutes of Health (NIH) and the U.S. Department of Health and Human Services. Sun’s research was supported by NIH (R21HL156228). Wu’s research was supported in part by the Intramural Research Program of the NHLBI/NIH. Huang’s research was supported by NIH (R01CA193888). McGarry’s research was supported by NIH (1K23HL133437-01A1) and CF Foundation Therapeutics (MCGARR16A0).

    References

    • Breiman (1996) Breiman, L. (1996). Bagging predictors. Machine Learning 24, 123–140.
    • Breiman (2001) Breiman, L. (2001). Random forests. Machine Learning 45, 5–32.
    • Ciampi et al. (1986) Ciampi, A., Thiffault, J., Nakache, J.-P., and Asselain, B. (1986). Stratification by stepwise regression, correspondence analysis and recursive partition: A comparison of three methods of analysis for survival data with covariates. Computational Statistics & Data Analysis 4, 185–204.
    • Davis and Anderson (1989) Davis, R. B. and Anderson, J. R. (1989). Exponential survival trees. Statistics in Medicine 8, 947–961.
    • Domanski et al. (2020) Domanski, M. J., Tian, X., Wu, C. O., Reis, J. P., Dey, A. K., Gu, Y., Zhao, L., Bae, S., Liu, K., Hasan, A. A., et al. (2020). Time course of LDL cholesterol exposure and cardiovascular disease event risk. Journal of the American College of Cardiology 76, 1507–1516.
    • Döring et al. (2004) Döring, G., Hoiby, N., Group, C. S., et al. (2004). Early intervention and prevention of lung disease in cystic fibrosis: A European consensus. Journal of Cystic Fibrosis 3, 67–91.
    • Ferrer et al. (2019) Ferrer, L., Putter, H., and Proust-Lima, C. (2019). Individual dynamic predictions using landmarking and joint modelling: Validation of estimators and robustness assessment. Statistical Methods in Medical Research 28, 3649–3666.
    • Friedman et al. (2001) Friedman, J., Hastie, T., and Tibshirani, R. (2001). The Elements of Statistical Learning. New York: Springer.
    • Gordon and Olshen (1985) Gordon, L. and Olshen, R. A. (1985). Tree-structured survival analysis. Cancer Treatment Reports 69, 1065–1069.
    • Harun et al. (2016) Harun, S. N., Wainwright, C., Klein, K., and Hennig, S. (2016). A systematic review of studies examining the rate of lung function decline in patients with cystic fibrosis. Paediatric Respiratory Reviews 20, 55–66.
    • Heagerty et al. (2000) Heagerty, P. J., Lumley, T., and Pepe, M. S. (2000). Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 56, 337–344.
    • Heltshe et al. (2018) Heltshe, S., Khan, U., Beckett, V., Baines, A., Emerson, J., Sanders, D., Gibson, R., Morgan, W., and Rosenfeld, M. (2018). Longitudinal development of initial, chronic and mucoid pseudomonas aeruginosa infection in young children with cystic fibrosis. Journal of Cystic Fibrosis 17, 341–347.
    • Hothorn et al. (2006) Hothorn, T., Bühlmann, P., Dudoit, S., Molinaro, A., and Van Der Laan, M. J. (2006). Survival ensembles. Biostatistics 7, 355–373.
    • Hothorn et al. (2004) Hothorn, T., Lausen, B., Benner, A., and Radespiel-Tröger, M. (2004). Bagging survival trees. Statistics in Medicine 23, 77–91.
    • Ishwaran et al. (2008) Ishwaran, H., Kogalur, U. B., Blackstone, E. H., and Lauer, M. S. (2008). Random survival forests. The Annals of Applied Statistics 2, 841–860.
    • Jewell and Nielsen (1993) Jewell, N. P. and Nielsen, J. P. (1993). A framework for consistent prediction rules based on markers. Biometrika 80, 153–164.
    • Kamata et al. (2017) Kamata, H., Asakura, T., Suzuki, S., Namkoong, H., Yagi, K., Funatsu, Y., Okamori, S., Uno, S., Uwamino, Y., Fujiwara, H., Nishimura, T., Ishii, M., Betsuyaku, T., and Hasegawa, N. (2017). Impact of chronic pseudomonas aeruginosa infection on health-related quality of life in mycobacterium avium complex lung disease. BMC Pulmonary Medicine 17, 198.
    • Knapp et al. (2016) Knapp, E. A., Fink, A. K., Goss, C. H., Sewall, A., Ostrenga, J., Dowd, C., Elbert, A., Petren, K. M., and Marshall, B. C. (2016). The Cystic Fibrosis Foundation Patient Registry. Design and methods of a national observational disease registry. Annals of the American Thoracic Society 13, 1173–1179.
    • LeBlanc and Crowley (1992) LeBlanc, M. and Crowley, J. (1992). Relative risk trees for censored survival data. Biometrics 48, 411–425.
    • LeBlanc and Crowley (1993) LeBlanc, M. and Crowley, J. (1993). Survival trees by goodness of split. Journal of the American Statistical Association 88, 457–467.
    • Lin and Jeon (2006) Lin, Y. and Jeon, Y. (2006). Random forests and adaptive nearest neighbors. Journal of the American Statistical Association 101, 578–590.
    • Mannino et al. (2006) Mannino, D. M., Reichert, M. M., and Davis, K. J. (2006). Lung function decline and outcomes in an adult population. American Journal of Respiratory and Critical Care Medicine 173, 985–990.
    • Maziarz et al. (2017) Maziarz, M., Heagerty, P., Cai, T., and Zheng, Y. (2017). On longitudinal prediction with time-to-event outcome: Comparison of modeling options. Biometrics 73, 83–93.
    • McGarry et al. (2020) McGarry, M. E., Huang, C.-Y., Nielson, D. W., and Ly, N. P. (2020). Early acquisition and conversion of Pseudomonas aeruginosa in Hispanic youth with cystic fibrosis in the United States. Journal of Cystic Fibrosis, in press.
    • McGarry et al. (2019) McGarry, M. E., Neuhaus, J. M., Nielson, D. W., and Ly, N. P. (2019). Regional variations in longitudinal pulmonary function: A comparison of Hispanic and non-Hispanic subjects with cystic fibrosis in the United States. Pediatric Pulmonology 54, 1382–1390.
    • McIntosh and Pepe (2002) McIntosh, M. W. and Pepe, M. S. (2002). Combining several screening tests: Optimality of the risk score. Biometrics 58, 657–664.
    • Molinaro et al. (2004) Molinaro, A. M., Dudoit, S., and Van der Laan, M. J. (2004). Tree-based multivariate regression and density estimation with right-censored data. Journal of Multivariate Analysis 90, 154–177.
    • Morgan et al. (2016) Morgan, W. J., Vandevanter, D. R., Pasta, D. J., Foreman, A. J., Wagener, J. S., Konstan, M. W., Morgan, W., Konstan, M., Liou, T., McColley, S., et al. (2016). Forced expiratory volume in 1 second variability helps identify patients with cystic fibrosis at risk of greater loss of lung function. The Journal of Pediatrics 169, 116–121.
    • Parast et al. (2012) Parast, L., Cheng, S.-C., and Cai, T. (2012). Landmark prediction of long-term survival incorporating short-term event time information. Journal of the American Statistical Association 107, 1492–1501.
    • Proust-Lima et al. (2016) Proust-Lima, C., Dartigues, J.-F., and Jacqmin-Gadda, H. (2016). Joint modeling of repeated multivariate cognitive measures and competing risks of dementia and death: A latent process and latent class approach. Statistics in Medicine 35, 382–398.
    • Rizopoulos (2011) Rizopoulos, D. (2011). Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67, 819–829.
    • Rizopoulos et al. (2017) Rizopoulos, D., Molenberghs, G., and Lesaffre, E. M. (2017). Dynamic predictions with time-dependent covariates in survival analysis using joint modeling and landmarking. Biometrical Journal 59, 1261–1276.
    • Segal (1988) Segal, M. R. (1988). Regression trees for censored data. Biometrics 44, 35–47.
    • Steingrimsson et al. (2016) Steingrimsson, J. A., Diao, L., Molinaro, A. M., and Strawderman, R. L. (2016). Doubly robust survival trees. Statistics in Medicine 35, 3595–3612.
    • Steingrimsson et al. (2019) Steingrimsson, J. A., Diao, L., and Strawderman, R. L. (2019). Censoring unbiased regression trees and ensembles. Journal of the American Statistical Association 114, 370–383.
    • Sun et al. (2022) Sun, Y., Chiou, S. H., Wu, C. O., McGarry, M., and Huang, C.-Y. (2022). Supplement to “dynamic risk prediction triggered by intermediate events using survival tree ensembles”.
    • Sweeting et al. (2017) Sweeting, M. J., Barrett, J. K., Thompson, S. G., and Wood, A. M. (2017). The use of repeated blood pressure measures for cardiovascular risk prediction: A comparison of statistical models in the ARIC study. Statistics in Medicine 36, 4514–4528.
    • Tanner et al. (2021) Tanner, K. T., Sharples, L. D., Daniel, R. M., and Keogh, R. H. (2021). Dynamic survival prediction combining landmarking with a machine learning ensemble: Methodology and empirical comparison. Journal of the Royal Statistical Society: Series A (Statistics in Society) 184, 3–30.
    • Taylor et al. (2013) Taylor, J. M. G., Park, Y., Ankerst, D. P., Proust-Lima, C., Williams, S., Kestin, L., Bae, K., Pickles, T., and Sandler, H. (2013). Real-time individual predictions of prostate cancer recurrence using joint models. Biometrics 69, 206–213.
    • Twala et al. (2008) Twala, B., Jones, M. C., and Hand, D. J. (2008). Good methods for coping with missing data in decision trees. Pattern Recognition Letters 29, 950–956.
    • van Houwelingen (2007) van Houwelingen, H. C. (2007). Dynamic prediction by landmarking in event history analysis. Scandinavian Journal of Statistics 34, 70–85.
    • van Houwelingen and Putter (2008) van Houwelingen, H. C. and Putter, H. (2008). Dynamic predicting by landmarking as an alternative for multi-state modeling: An application to acute lymphoid leukemia data. Lifetime Data Analysis 14, 447–463.
    • van Houwelingen and Putter (2011) van Houwelingen, H. C. and Putter, H. (2011). Dynamic prediction in clinical survival analysis. Boca Raton: CRC Press.
    • Wang et al. (2017) Wang, J., Luo, S., and Li, L. (2017). Dynamic prediction for multiple repeated measures and event time data: An application to Parkinson’s disease. The Annals of Applied Statistics 11, 1787–1809.
    • Wright and Ziegler (2017) Wright, M. N. and Ziegler, A. (2017). ranger: A fast implementation of random forests for high dimensional data in C++ and R. Journal of Statistical Software 77, 1–17.
    • Zhang (1995) Zhang, H. (1995). Splitting criteria in survival trees. Statistical Modelling 104, 305–313.
    • Zhang et al. (1996) Zhang, H., Holford, T., and Bracken, M. B. (1996). A tree-based method of analysis for prospective studies. Statistics in Medicine 15, 37–49.
    • Zheng and Heagerty (2005) Zheng, Y. and Heagerty, P. J. (2005). Partly conditional survival models for longitudinal data. Biometrics 61, 379–391.
    • Zhu and Kosorok (2012) Zhu, R. and Kosorok, M. R. (2012). Recursively imputed survival trees. Journal of the American Statistical Association 107, 331–340.
    • Zhu et al. (2019) Zhu, Y., Li, L., and Huang, X. (2019). Landmark linear transformation model for dynamic prediction with application to a longitudinal cohort study of chronic disease. Journal of the Royal Statistical Society: Series C (Applied Statistics) 68, 771–791.