This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Instrumental variable estimation of early treatment effect in randomized screening trials

Sudipta Saha Dalla Lana School of Public Health, University of Toronto Zhihui (Amy) Liu Princess Margaret Cancer Centre, University Health Network Olli Saarela Correspondence to: Olli Saarela, Dalla Lana School of Public Health, 155 College Street, Toronto, Ontario M5T 3M7, Canada. Email: [email protected] Dalla Lana School of Public Health, University of Toronto
Abstract

The primary analysis of randomized screening trials for cancer typically adheres to the intention-to-screen principle, measuring cancer-specific mortality reductions between screening and control arms. These mortality reductions result from a combination of the screening regimen, screening technology and the effect of the early, screening-induced, treatment. This motivates addressing these different aspects separately. Here we are interested in the causal effect of early versus delayed treatments on cancer mortality among the screening-detectable subgroup, which under certain assumptions is estimable from conventional randomized screening trial using instrumental variable type methods. To define the causal effect of interest, we formulate a simplified structural multi-state model for screening trials, based on a hypothetical intervention trial where screening detected individuals would be randomized into early versus delayed treatments. The cancer-specific mortality reductions after screening detection are quantified by a cause-specific hazard ratio. For this, we propose two estimators, based on an estimating equation and a likelihood expression. The methods extend existing instrumental variable methods for time-to-event and competing risks outcomes to time-dependent intermediate variables. Using the multi-state model as the basis of a data generating mechanism, we investigate the performance of the new estimators through simulation studies. In addition, we illustrate the proposed method in the context of CT screening for lung cancer using the US National Lung Screening Trial (NLST) data.

Keywords: Causal inference, Instrumental variable estimation, Multi-state model, Randomized screening trial

1 Introduction

The benefits of cancer screening are ideally studied through randomized screening trials, where asymptomatic participants are assigned to either undergo a prespecified regimen of screening examinations, or to control without screening, and then followed up for cancer specific mortality (Hu and Zelen, 1997). The primary analysis of such a trial usually follows the intention to screen (ITS) principle, based on comparison of cancer specific mortality between the screening and control arms at the end of the follow-up period. Any mortality reductions in the screening arm are due to the combination of earlier (asymptomatic) detections due to the screening technology, regimen and uptake, and the effectiveness of the subsequent early (versus delayed, following symptomatic diagnosis) treatments. This motivates studying these two aspects separately. The aim in isolating the effect of the early treatment is to obtain an effect measure that is less dependent on the screening regimen and uptake and potentially more transferable between populations with for example different disease incidence. Herein we are interested in the effectiveness of the early treatments in the subpopulation who would be screening detectable if offered screening. The subgroup effectiveness can be studied based on data collected through conventional screening trials, using instrumental variable type estimators, where the instrument is the randomized assignment.

Instrumental variable (IV) approach has been primarily introduced to adjust for non-compliance with the treatment assignments in therapeutic trials (Baker and Lindeman, 1994; Imbens and Angrist, 1994; Angrist et al., 1996; Baiocchi et al., 2014; Baker et al., 2016). Several authors have utilized the IV approach in screening trials for estimating the effect of screening assignment among the compliers, a latent subgroup of participants who adhere to the assigned screening regimen (Baker, 1998; Roemeling et al., 2007). This is different from the present context where our interest is in the effectiveness of early treatments in the screening detectable subgroup; IV estimators are applicable also for this purpose, and can estimate the mortality risk reduction in the screening detectable subgroup. For this purpose, McIntosh (1999) formulated instrumental variable type estimators in a cohort diagnosed with cancer, assembled after sufficiently long follow-up period to have similar numbers of cancer cases in both screening and control arms to avoid overdiagnosis bias. Saha et al. (2018) defined the effect measures and estimators as functions of time, and formulated the estimators in the entire trial cohort, which avoids any selection bias due to conditioning on intermediate variables in instrumental variable analysis (Swanson et al., 2015).

If the screening is discontinued sufficiently early in the trial and the subsequent follow-up is sufficiently long, the estimators proposed by Saha et al. (2018) also estimate the case-fatality reduction due to the early treatments, though they are valid for subgroup mortality risk reduction even without this interpretation, which is important if the screening regimen continues for the duration of the trial. The effect measures can be connected to a hypothetical intervention trial proposed by Miettinen (2014, 2015), where screening-detected individuals would be randomized into early versus delayed treatments. Although such a trial could not be implemented in practice, it is helpful in formulating meaningful causal estimands that can then be estimated from conventional screening trials. In a different context, treatment effect in a latent subgroup has been studied by Altstein et al. (2011); Altstein and Li (2013), in a diagnostic setting where immediate lymph node removal surgery is received by patients with a positive biopsy test while those with a negative biopsy test receive a control regimen of observation, with the causal effect of the immediate surgery among the positive biopsy subgroup as the quantity of interest. The screening trial context differs from this due to the repeated rounds of screening, making the latent subgroup membership time-dependent.

Recent methodological literature has seen several proposals for instrumental variable estimation with time-to-event and competing risks outcomes. The latter is relevant for the screening context as the interest is in cancer-specific mortality. Instrumental variable estimators generalize to time-to-event outcomes most straightforwardly by considering time-specific risk or survival probability as the outcome and using plug-in estimators that can accommodate censoring such as the Kaplan-Meier estimator (Nie et al., 2011). This approach generalizes to competing risks outcomes by using non-parametric cumulative incidence estimators (Richardson et al., 2017). Instrumental variable approaches to estimate causal effect parameters in structural proportional hazards (Martinussen et al., 2019) and additive hazards (Zheng et al., 2017; Martinussen and Vansteelandt, 2020) models have also been proposed. The current setting differs from these because the latent subgroup is time-dependent, accumulated through repeated screening examinations. Saha et al. (2018) demonstrated that instrumental variable estimation is still applicable in this setting, formulating the estimators under competing risks, and allowing for inclusion of covariates to account for covariate dependent censoring.

While Saha et al. (2018) estimated absolute and proportional subgroup mortality risk reductions, here we are interested in measuring the early treatment effect in terms of a mortality hazard ratio. For estimation, we broadly follow the approach of Loeys and Goetghebeur (2003), Loeys et al. (2005), and Martinussen et al. (2019) for instrumental variable estimation for structural Cox proportional hazards models. These authors proposed two-step estimation procedures where estimators for unknown survival probabilities are substituted into an equation derived under the instrumental variable assumptions, to obtain an estimating equation that can be solved with respect to the effect measure of interest. We extend this approach for time-dependent exposure/latent subgroup and competing risks in a multi-state modeling framework. The methods also have similarities to g-estimation in the presence of time-dependent compliance (Mark and Robins, 1993b, a; Robins, 1994). The challenges in formulating causal estimands with time-dependent exposure in a multi-state model setting have been discussed by von Cube et al. (2019). We make sure the quantities are well-defined by linking them with the hypothetical intervention trial.

Because of our focus on the subgroup treatment effect, we aim to characterize randomized screening trials with a simplified multi-state model that avoids modeling the properties of the screening technology (sensitivity, specificity), sojourn time (the period where asymptomatic cancer is screening-detectable), lead time (the period between screening detection and clinical detection in the absence of screening) and stage transition of the cancer. To test the methods, we also use the model as a basis of a data generating mechanism in simulation studies. In contrast, the well-known Hu-Zelen model and its extensions which have been used for planning screening trials (Zelen, 1993; Hu and Zelen, 1997; Lee and Zelen, 2006), parameter estimation (Shen and Zelen, 1999; Sung et al., 2019) and could also serve as a simulation model. These models are based on modeling the natural history of the disease and the properties of the screening test. We aim to avoid specifying these parts of the model by basing our model on the hypothetical intervention trial, where the causal effects of early treatments following an early detection are quantified through a hazard ratio. We will apply the simulation mechanism to compare the performance of the different measures suggested for subgroup mortality reduction, and further use simulation to answer the question of whether conventional screening trials are powered to detect such subgroup effects.

The remainder of the paper is structured as follows: in Section 2 we introduce the necessary notation, define the causal model and the identifying assumptions for estimation under this model and describe the algorithm to simulate data mimicking screening trials. In Section 3, we introduce new estimators to quantify the effect of early treatments in the screening-detectable subgroup as a hazard ratio. We present simulation studies to compare all the estimators in Section 4. In Section 5, we illustrate the use of the new estimator using data from the US National Lung Screening Trial (NLST). Finally, a brief discussion and future directions are presented in Section 6.

2 A causal model for randomized screening trials

2.1 Conventional screening trial

We introduce the necessary notation in the context of the US National Lung Screening Trial, where 53,45253,452 heavy smokers, currently smoking or having quit within the last 1515 years, with a smoking history of 30+30+ pack years, aged between 557455-74, were assigned at random to undergo three annual screening examinations or control. The participants were asymptomatic of lung cancer at the time of randomization. Participants in the screening arm received three annual low-dose helical CT scans, whereas participants in the control arm received a standard chest X-ray, and both arms were followed up for 77 years for the main analysis (NLST Research Team, 2011). There were 649649 early diagnoses in the screening arm and 469469 and 552552 cancer deaths in the screening arm and control arm, respectively. In the following, we assume the X-ray screening to have negligible mortality benefits, as has been demonstrated by Oken et al. (2011), and equate it to a placebo. We characterize the causal effect of screening-induced early treatments through a simplified structural multi-state model where the mortality benefits manifest after an early detection through a CT scan. From the healthy state (numbered as state 1), participants in the screening arm have three possible transitions: early diagnosed through CT screening (state 2), cancer-specific death (state 3) and other-cause death (state 4). From the early diagnosed state, the participants can move to cancer-specific death or other-cause death states, with potentially different transition rates. We assume that the control arm does not have early diagnosis through CT screening, and thus only direct transitions from state 1 to states 3 and 4 are possible. For concise notation, we use counting processes. We note that since each particular type of transition in a multi-state model can be characterized through a competing risks model, the transition intensities are analogous to the counterfactual cause-specific hazards, as outlined by Young et al. (2020), but conditional on the event histories taken place so far.

Let Nklz(t){0,1}N^{z}_{kl}(t)\in\{0,1\} be the potential (underlying, in the absence of censoring) counting process that counts the transitions of a given individual from state kk to ll by time tt in arm zz, where kl;k,l{1,2,3,4}k\neq l;\;k,l\in\{1,2,3,4\} and z=1z=1, or 0 indicate the screening or control arms, respectively. In the current context each transition can take place at most once, and thus each counting process only counts to one, with states 3 and 4 being absorbing states. The corresponding observed process is formulated as Nkl(t)=ZNkl1(t)+(1Z)Nkl0(t)N_{kl}(t)=ZN_{kl}^{1}(t)+(1-Z)N_{kl}^{0}(t), where ZZ is the arm that the patient actually was assigned to in the randomized screening trial (counterfactual consistency). The counting processes are characterized by the transition intensities defined as

λklz(t)=limΔt0P(Nklz(t+Δt)Nklz(t)=1Hz(t))Δt\displaystyle\lambda_{kl}^{z}(t)=\lim_{\Delta t\to 0}\frac{P(N_{kl}^{z}(t+\Delta t)-N_{kl}^{z}(t^{-})=1\mid H^{z}(t^{-}))}{\Delta t}

where Hz(t)H^{z}(t^{-}) indicates the history of states until time tt^{-}. We note that the model could be alternatively specified in discrete time through the conditional probabilities P(Nklz(t+Δt)Nklz(t)=1Hz(t))λklz(t)ΔtP(N_{kl}^{z}(t+\Delta t)-N_{kl}^{z}(t^{-})=1\mid H^{z}(t^{-}))\approx\lambda_{kl}^{z}(t)\Delta t, but we will use continuous time notation for simplicity. The corresponding cumulative transition intensity function is defined as Λklz(t)=stλklz(t)dt\Lambda_{kl}^{z}(t)=\int_{s}^{t}\lambda_{kl}^{z}(t)\,\textrm{d}t. The transition intensities can be collected into a matrix

λz(t)={blockarray}c@rrrr@&1234{block}r@|@|@rrrr@|@|1.λ12z(t)λ13z(t)λ14z(t)20.λ23z(t)λ24z(t)300.04000.\lambda^{z}(t)=\blockarray{c@{\hskip 1.0pt}rrrr@{\hskip 4.0pt}}&1234\\ \block{r@{\hskip 1.0pt}|@{\hskip 1.0pt}|@{\hskip 1.0pt}rrrr@{\hskip 1.0pt}|@{\hskip 1.0pt}|}1.\lambda_{12}^{z}(t)\lambda_{13}^{z}(t)\lambda_{14}^{z}(t)\\ 20.\lambda_{23}^{z}(t)\lambda_{24}^{z}(t)\\ 300.0\\ 4000.\\

In the conventional screening trials, the ITS estimand would be defined by comparing the outcomes between the two arms at time tt after sufficiently long follow-up, in terms of the cumulative incidences for cancer-specific mortality. For example, the absolute reduction in cancer-specific mortality risk would be measured by

E[N130(t)]E[N131(t)+N231(t)]E[N_{13}^{0}(t)]-E[N_{13}^{1}(t)+N_{23}^{1}(t)] (1)

and the proportional reduction by

1E[N131(t)+N231(t)]E[N130(t)]1-\frac{E[N_{13}^{1}(t)+N_{23}^{1}(t)]}{E[N_{13}^{0}(t)]} (2)

The estimation of these quantities does not require a multi-state model, as the components could be estimated through empirical cumulative incidences in the two arms. However, the multi-state model is needed to specify and estimate the causal effect of the early treatment among the early detectable subgroup. This subgroup is latent at baseline and accumulates in the screening arm during the follow-up of the trial through the repeated screening examinations. Since screening itself is not an intervention, the mortality benefits can manifest only after an early detection. To specify the corresponding causal effect, we need to introduce a well-defined intervention.

2.2 Hypothetical intervention trial

Following and extending Miettinen (2015), we can envision a trial that has similar screening regimen as the screening arm of the conventional trial, but instead of randomizing into screening and non-screening, screens everyone, and at the time of an early detection, randomizes into referral to early treatment vs delayed treatment through withholding the screening result. Delayed treatment refers treatments following a later symptomatic diagnosis. Since the early detection can take place only once for a given individual, the randomization for the assignment in the event of early detection can already take place at study baseline, which is relevant for the indexing of the corresponding potential outcomes. A schematic illustration of the hypothetical trial is presented in Figure 1, and the corresponding multi-state model in Figure 2. While this kind of trial could not be implemented in practice, it is important as a thought experiment to define the causal quantities that we are interested in.

Among N121(t)=1N_{12}^{1}(t)=1 we define N2l1r(t){0,1}N^{1r}_{2l}(t)\in\{0,1\}, l{3,4}l\in\{3,4\} to be potential (underlying) counting processes for subsequent death corresponding to early (r=1r=1) versus delayed (r=0r=0) treatment, characterized by transition intensities

λ2l1r(t)=limΔt0P(N2l1r(t+Δt)N2l1r(t)=1H1(t))Δt.\displaystyle\lambda_{2l}^{1r}(t)=\lim_{\Delta t\to 0}\frac{P(N_{2l}^{1r}(t+\Delta t)-N_{2l}^{1r}(t^{-})=1\mid H^{1}(t^{-}))}{\Delta t}.

The modified transition matrix for screening detectable subgroup is now

λ1r(t)={blockarray}c@rrrr@&1234{block}r@|@|@rrrr@|@|1.λ121(t)λ131(t)λ141(t)20.λ231r(t)λ241r(t)300.04000.\lambda^{1r}(t)=\blockarray{c@{\hskip 1.0pt}rrrr@{\hskip 4.0pt}}&1234\\ \block{r@{\hskip 1.0pt}|@{\hskip 1.0pt}|@{\hskip 1.0pt}rrrr@{\hskip 1.0pt}|@{\hskip 1.0pt}|}1.\lambda_{12}^{1}(t)\lambda_{13}^{1}(t)\lambda_{14}^{1}(t)\\ 20.\lambda_{23}^{1r}(t)\lambda_{24}^{1r}(t)\\ 300.0\\ 4000.\\

Under this model, the absolute cancer mortality risk reduction in the subgroup could be measured by

E[N2310(t)N2311(t)N121(t)=1]E[N_{23}^{10}(t)-N_{23}^{11}(t)\mid N_{12}^{1}(t)=1] (3)

and the proportional reduction by

1E[N2311(t)=1N121(t)=1]E[N2310(t)=1N121(t)=1].1-\frac{E[N_{23}^{11}(t)=1\mid N_{12}^{1}(t)=1]}{E[N_{23}^{10}(t)=1\mid N_{12}^{1}(t)=1]}. (4)

We note that these quantities are distinct from (1) and (2) as they are defined in the early diagnosed subpopulation. Conditioning on this subgroup membership also conditions on survival until the early diagnosis, so (3) and (4) measure mortality reductions after the early diagnosis. In a trial where the screening is discontinued at some point (as it was in NLST after three annual rounds), and the follow-up is long enough so that all the mortality benefits have been realized, these quantities approximate the absolute and proportional case-fatality reduction, contrasting the screening-induced early versus delayed treatment among the screening detectable cases. The estimation of these quantities was discussed by Saha et al. (2018).

Herein we aim to measure the effect by contrasting the transition intensities λ2311(t)\lambda_{23}^{11}(t) and λ2310(t)\lambda_{23}^{10}(t). We note that the formulation in terms of transition intensities does not yet assume anything about the early treatment effect, for example generally this effect could be a function of time and/or function of the time of the early diagnosis. However, we can simplify the model by characterizing the effect as a constant ratio λ2310(t)=θλ2311(t)\lambda_{23}^{10}(t)=\theta\lambda_{23}^{11}(t), characterizing the early treatment effect on cancer-specific mortality through a single parameter, corresponding to the proportionality of the two cause-specific mortality hazards. In addition, although this is not generally required, we take λ2410(t)=λ2411(t)\lambda_{24}^{10}(t)=\lambda_{24}^{11}(t), meaning that the early treatments do not have effect on other-cause mortality. In what follows, we focus on the estimation of the quantity θ\theta as our effect measure of interest. We note that we parametrized this as the ratio of the mortality under delayed vs. early treatment to be consistent with the earlier defined risk reduction measures. Alternatively, the inverse of this can be reported. Herein we take the time scale for all of the transition intensities to be the time since the baseline/study entry. However, it would also be equally possible to use the time since early detection as the time scale for the transitions after the early detection, and to define proportionality of the effect on this time scale.

Since the transition intensities λ2l10(t)\lambda_{2l}^{10}(t) are not directly estimable from data observed under a conventional randomized screening trial, the estimation requires instrumental variable type methods, using the randomized assignment in the conventional trial as an instrument. We will propose estimation methods in Section 3, but will first discuss the use of the model for simulation, and outline the assumptions required for the instrumental variable estimation.

Eligible for screening and randomized Z=1Z=1 (screening arm) Z=0Z=0 (control arm) N131(t)+N231(t)N_{13}^{1}(t)+N_{23}^{1}(t) N130(t)N_{13}^{0}(t) (a) Conventional screening trial      Contrast: 1E[N131(t)+N231(t)]E[N130(t)]1-\frac{E[N_{13}^{1}(t)+N_{23}^{1}(t)]}{E[N_{13}^{0}(t)]}      or E[N130(t)]E[N131(t)+N231(t)]E[N_{13}^{0}(t)]-E[N_{13}^{1}(t)+N_{23}^{1}(t)] Eligible for screening Z=1Z=1 (everyone screened) N121(t)=1N_{12}^{1}(t)=1 (screening-detected and randomized) R=1R=1 (referral to early treatment) R=0R=0 (no referral to early treatment) N2311(t)N_{23}^{11}(t) N2310(t)N_{23}^{10}(t) (b) Hypothetical intervention trial      Contrast: 1E[N2311(t)N121(t)=1]E[N2310(t)N121(t)=1]1-\frac{E[N_{23}^{11}(t)\mid N_{12}^{1}(t)=1]}{E[N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]}      or E[N2310(t)N2311(t)N121(t)=1]E[N_{23}^{10}(t)-N_{23}^{11}(t)\mid N_{12}^{1}(t)=1]
Figure 1: Illustration of a conventional randomized screening trial and a hypothetical intervention trial sharing the same screening regimen, and the corresponding causal contrasts.
Healthy(1)Early detected(2)Cancer-specific death(3)Other-cause death(4)λ121(t)\lambda_{12}^{1}(t)λ131(t)\lambda_{13}^{1}(t)λ141(t)\lambda_{14}^{1}(t)λ2311(t)\lambda_{23}^{11}(t)λ2411(t)\lambda_{24}^{11}(t)λ2310(t)=θλ2311(t)\lambda_{23}^{10}(t)=\theta\lambda_{23}^{11}(t)
Figure 2: A schematic illustrating the transition intensities in the hypothetical intervention trial. Here λkl(t)\lambda_{kl}(t) corresponds to the transition intensity from state kk to state ll at time tt. The dashed arrow represents cancer mortality under delayed treatments, contrasted to early treatments, with the intensity ratio θ\theta quantifying the causal effect of interest.

2.3 Identifying assumptions

In the methods proposed in Section 3, the random assignment in the conventional screening trial is used as an instrumental variable. We outline the identifying assumptions here as they are also used to generate simulated data from the model. In particular, we assume that (i) N121(t)=1N1l0(t)=N2l10(t)N_{12}^{1}(t)=1\Rightarrow N_{1l}^{0}(t)=N_{2l}^{10}(t), l{3,4}l\in\{3,4\}, meaning that for someone who would get early detected by time tt, but not receive early treatment, the potential mortality outcome would be the same as in the control arm of the conventional trial. Further, we assume that (ii) N121(t)=0N1l0(t)=N1l1(t)N_{12}^{1}(t)=0\Rightarrow N_{1l}^{0}(t)=N_{1l}^{1}(t), l{3,4}l\in\{3,4\}, meaning that for someone who would not get early detected, the potential mortality outcome would be the same between the two arms of the conventional trial. Essentially, assumptions (i) and (ii) together state that screening examinations themselves do not have impact on mortality in the absence of the intervention. They broadly correspond to the exclusion restriction assumption of conventional instrumental variable analysis (Angrist et al., 1996), stating that the instrumental variable does not have direct causal effect on the outcome, outside the effect mediated by the intervention. In the present context this assumption could be violated for example if the screening examinations cause the participants to modify their health behaviour in a way that affects their cancer risk. In the NLST also the control arm received screening, so any such effects should be more similar between the two arms, making (i) and (ii) more plausible.

In addition, we assume that (iii) N121(t)=1N2l1(t)=N2l11(t)N_{12}^{1}(t)=1\Rightarrow N_{2l}^{1}(t)=N_{2l}^{11}(t), l{3,4}l\in\{3,4\}, meaning that in practice in the screening arm early treatment follows early diagnosis, which is reasonable at least in the lung cancer context. From this it follows that λ2l1(t)=λ2l11(t)\lambda_{2l}^{1}(t)=\lambda_{2l}^{11}(t), meaning that the mortality rates under early treatment following early diagnosis can be estimated from the screening arm of the conventional trial. For the purpose of identification based on data from the two arms of the conventional trial, we also need counterfactual consistency (iv), formulated in Section 2.1. Finally, we take that λ120(t)=λ230(t)=λ240(t)=0\lambda_{12}^{0}(t)=\lambda_{23}^{0}(t)=\lambda_{24}^{0}(t)=0, meaning that early detection through the screening technology offered in the screening arm is not available in the control arm.

2.4 Description of the simulation algorithm

To simulate data from the proposed multi-state model, one needs to fix the transition intensities in the matrix λ1r(t)\lambda^{1r}(t), to simulate event histories in the two arms in the hypothetical intervention trial, corresponding to r=1r=1 and r=0r=0. These event histories are also used to represent event histories in the conventional screening trial. In particular, under condition (iii) the intervention arm (r=1r=1) event histories can be directly taken to be event histories in the screening arm of the conventional trial. Also, the control arm r=0r=0 event histories can be directly taken to be event histories in the control arm by taking N120(t)=0N_{12}^{0}(t)=0 and N1l0(t)=N1l1(t)+N2l10(t)N_{1l}^{0}(t)=N_{1l}^{1}(t)+N_{2l}^{10}(t), l{3,4}l\in\{3,4\}, which follows from (i) and (ii). If the outcomes are simulated without covariates, the data generating mechanism and the structural model are the same. However, for the data generation, all the intensities can be made conditional on observed and/or unobserved simulated covariates, in which case the structural model will parametrize the marginal effect of interest. In this case, the marginal effect is not directly specified by the data generating mechanism, but can be approximated by fitting a marginal proportional hazards model to the two arms of the hypothetical trial.

For a simulation study, for example exponential or Weibull functional forms can be assumed for the intensities, or the observable ones (λ1l1(t)\lambda_{1l}^{1}(t), l{2,3,4}l\in\{2,3,4\} and λ2l11(t)\lambda_{2l}^{11}(t), l{3,4}l\in\{3,4\}) can be estimated from the screening arm of an existing trial such as the NLST, with the control arm (r=0r=0) counterparts obtained by fixing the effect parameter θ\theta to a value. With the transition intensity matrix λ1r\lambda^{1r}, simulation can then proceed through usual algorithms for simulating event histories from a multi-state model as described for example in Beyersmann et al. (2011, p.  45). Briefly, this proceeds by simulating the time and type of the next event from a competing risks model, moving to a new state, and continuing the same. Starting from the healthy state, this would mean simulating the time of the first event, TT, from the total hazard

Λ1(t)=Λ121(t)+Λ131(t)+Λ141(t),\displaystyle\Lambda^{1}(t)=\Lambda_{12}^{1}(t)+\Lambda_{13}^{1}(t)+\Lambda_{14}^{1}(t),

which specifies the event time distribution F(t)=P(Tt)=1exp(Λ1(t))F(t)=P(T\leq t)=1-\exp(-\Lambda^{1}(t)). Using the inverse cumulative distribution function method, we can take uU[0,1]u\sim U[0,1] and solve Λ1(t)+log(1u)=0\Lambda^{1}(t)+\log(1-u)=0 with respect to tt to get the time. At time T=tT=t the type of the event is then drawn with multinomial probabilities

λ1l1(t)λ121(t)+λ131(t)+λ141(t)\displaystyle\frac{\lambda_{1l}^{1}(t)}{\lambda_{12}^{1}(t)+\lambda_{13}^{1}(t)+\lambda_{14}^{1}(t)}

for l{2,3,4}l\in\{2,3,4\}. If the new state is terminal (l{3,4}l\in\{3,4\}), the algorithm stops, otherwise it continues similarly from state 2 after possible modification of the subsequent transition intensities based on the timing of the first event, where the new total hazard is given by Λ1r(t)=Λ231r(t)+Λ241r(t)\Lambda^{1r}(t)=\Lambda_{23}^{1r}(t)+\Lambda_{24}^{1r}(t), and the event type probabilities by

λ2l1r(t)λ231r(t)+λ241r(t)\displaystyle\frac{\lambda_{2l}^{1r}(t)}{\lambda_{23}^{1r}(t)+\lambda_{24}^{1r}(t)}

for l{3,4}l\in\{3,4\}. We used the NLST as the basis of the simulation study presented in Section 4.1 by estimating the quantities λ11(t)\lambda^{11}(t) from the screening arm of the trial. For the observable transition intensities, we calculated smooth hazard estimates using the muhaz (Hess and Gentleman, 2019) package in R and integrated them numerically to get the corresponding cumulative intensities.

3 Hazard ratio estimation

3.1 Estimating equation

To derive a connection between the observable and unobservable quantities under the identifying assumptions listed in Section 2.3, we can re-express the cumulative incidence of cancer-specific mortality in the control arm of a conventional trial E[N130(t)]E[N_{13}^{0}(t)] as the sum of two different alternative event histories in terms of the latent subgroup membership of the individual, that is, the event history under early detection and no early treatment and the event history under no early detection by time tt. This gives

E[N130(t)]\displaystyle E[N_{13}^{0}(t)] =0texp{(Λ130(u)+Λ140(u))}λ130(u)du\displaystyle=\int_{0}^{t}\exp\left\{-(\Lambda_{13}^{0}(u)+\Lambda_{14}^{0}(u))\right\}\lambda_{13}^{0}(u)\,\textrm{d}u
=E[N130(t)N121(t)=1]P(N121(t)=1)\displaystyle=E[N_{13}^{0}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
+E[N130(t)N121(t)=0]P(N121(t)=0)\displaystyle\quad+E[N_{13}^{0}(t)\mid N_{12}^{1}(t)=0]P(N_{12}^{1}(t)=0)
=(i),(ii)E[N2310(t)N121(t)=1]P(N121(t)=1)\displaystyle\stackrel{{\scriptstyle\textrm{(i),(ii)}}}{{=}}E[N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
+E[N131(t)N121(t)=0]P(N121(t)=0)\displaystyle\quad+E[N_{13}^{1}(t)\mid N_{12}^{1}(t)=0]P(N_{12}^{1}(t)=0)
=0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ121(u)\displaystyle=\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{12}^{1}(u)
×utexp{(us[λ2310(s)+λ2410(s)]ds)}λ2310(v)dvdu\displaystyle\quad\times\int_{u}^{t}\exp\left\{-\left(\int_{u}^{s}[\lambda_{23}^{10}(s)+\lambda_{24}^{10}(s)]\,\textrm{d}s\right)\right\}\lambda_{23}^{10}(v)\,\textrm{d}v\,\textrm{d}u
+0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ131(u)du\displaystyle\quad+\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{13}^{1}(u)\,\textrm{d}u

Here the second equality used assumptions (i) and (ii). Finally, applying assumption (iii), as well as the modeling assumption of proportional hazards, we get

E[N130(t)]\displaystyle E[N_{13}^{0}(t)] =(iii)0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ121(u)\displaystyle\stackrel{{\scriptstyle\textrm{(iii)}}}{{=}}\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{12}^{1}(u)
×utexp{(uv[θλ231(s)+λ241(s)]ds)}θλ231(v)dvdu\displaystyle\quad\times\int_{u}^{t}\exp\left\{-\left(\int_{u}^{v}[\theta\lambda_{23}^{1}(s)+\lambda_{24}^{1}(s)]\,\textrm{d}s\right)\right\}\theta\lambda_{23}^{1}(v)\,\textrm{d}v\,\textrm{d}u
+0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ131(u)du\displaystyle\quad+\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{13}^{1}(u)\,\textrm{d}u (5)

A corresponding equation using time since the early diagnosis as the time scale after transition to state 2 would be obtained by replacing λ2l1(v)\lambda_{2l}^{1}(v) with λ2l1(vu)\lambda_{2l}^{1}(v-u), l{3,4}l\in\{3,4\}. A related equation was considered by McIntosh (1999), but without considering the hazard ratio as the parameter of interest. We note that in the case of only single baseline screening examination at time t=0t=0 and the absence of competing risks, (3.1) would simplify to

E[N130(t)]\displaystyle E[N_{13}^{0}(t)] =E[N2310(t)N121(0)=1]P(N121(0)=1)\displaystyle=E[N_{23}^{10}(t)\mid N_{12}^{1}(0)=1]P(N_{12}^{1}(0)=1)
+E[N131(t)N121(0)=0]P(N121(0)=0)\displaystyle\quad+E[N_{13}^{1}(t)\mid N_{12}^{1}(0)=0]P(N_{12}^{1}(0)=0)
=P(N121(0)=1)0texp{θΛ231(u)}θλ231(u)du\displaystyle=P(N_{12}^{1}(0)=1)\int_{0}^{t}\exp\left\{-\theta\Lambda_{23}^{1}(u)\right\}\theta\lambda_{23}^{1}(u)\,\textrm{d}u
+P(N121(0)=0)0texp{Λ131(u)}λ131(u)du,\displaystyle\quad+P(N_{12}^{1}(0)=0)\int_{0}^{t}\exp\left\{-\Lambda_{13}^{1}(u)\right\}\lambda_{13}^{1}(u)\,\textrm{d}u,

which corresponds to the equality given by Martinussen et al. (2019, p. 69).

In addition to the parameter of interest θ\theta, (3.1) involves other unknown quantities, needing to be estimated. We note that under counterfactual consistency (iv) all the quantities (except θ\theta) on the right hand side of (3.1) can be estimated from the screening arm of the conventional trial, while the left hand size can be estimated from the control arm of the conventional trial. In principle, any associational model can be used to obtain estimators to substitute in for the quantities on the right hand side. For example, proportional cause-specific hazard models where the mortality after the early diagnosis is allowed to depend on time uu of the early diagnosis can be fitted for λ2l1(v)\lambda_{2l}^{1}(v), l{3,4}l\in\{3,4\}. However, here we focus on the special case of Markov multi-state model, where λ2l1(v)\lambda_{2l}^{1}(v) does not depend on uu, in which case we can use non-parametric estimators to substitute in for the unknown quantities.

State occupation probabilities in a Markov multi-state model can be generally estimated using the non-parametric Aalen-Johansen approach (Borgan, 2014), which in the case of a competing risks setting reduces to the non-parametric cumulative incidence estimator for the left hand side of the equation. The required inputs for the right hand side are Nelson-Aalen/Breslow estimates for the cumulative hazards/hazard increments; using these as inputs (with one of them modified by the multiplicative factor θ\theta), the Aalen-Johansen state occupation probabilities can be calculated for example using the msfit and probtrans functions of the mstate package (Putter et al., 2007; de Wreede et al., 2011). Equation (3.1) can then be solved numerically with respect to logθ\log\theta to estimate the early treatment effect.

Equation (3.1) was written for the underlying counting processes in the absence of censoring. However, by substituting in estimators that can accommodate independent censoring (unconditionally on covariates), the proposed approach can accommodate independent censoring. We discuss in Section 6 how covariate-dependent censoring can be accommodated. The fixed time tt at which the estimating equation is evaluated can be chosen as the maximum follow-up length, or, as we demonstrate in Section 5, the effect can be estimated as a function of tt to see if the estimates stabilize over the follow-up period. Alternatively, if reporting a single estimate is desirable, the timepoint for the estimation can be chosen to minimize the empirical variance as θ^=argmintV^(θ^t)\hat{\theta}=\arg\min_{t}\hat{V}(\hat{\theta}_{t}) as suggested by Martinussen et al. (2019), or as the inverse variance weighted average θ^=(tθ^t/V^(θ^t))/(t1/V^(θ^t))\hat{\theta}=\left(\sum_{t}\hat{\theta}_{t}/\hat{V}(\hat{\theta}_{t})\right)/\left(\sum_{t}1/\hat{V}(\hat{\theta}_{t})\right) over an equally spaced grid of times tt. The variance estimates can be obtained through the bootstrap.

We note that we can obtain an analogous equation

E[N140(t)]\displaystyle E[N_{14}^{0}(t)] =0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ121(u)\displaystyle=\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{12}^{1}(u)
×utexp{(uv[θλ231(s)+λ241(s)]ds)}λ241(v)dvdu\displaystyle\quad\times\int_{u}^{t}\exp\left\{-\left(\int_{u}^{v}[\theta\lambda_{23}^{1}(s)+\lambda_{24}^{1}(s)]\,\textrm{d}s\right)\right\}\lambda_{24}^{1}(v)\,\textrm{d}v\,\textrm{d}u
+0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ141(u)du.\displaystyle\quad+\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{14}^{1}(u)\,\textrm{d}u. (6)

for other-cause mortality. However, this is not needed for estimation as we have only one unknown quantity to solve under the assumption that early treatments do not have effect on other-cause mortality. We note that the proposed methodology can be extended to relax this assumption, that is λ2410(t)=λ2411(t)\lambda^{10}_{24}(t)=\lambda^{11}_{24}(t). In that case, we can solve the two equations for two unknowns, characterizing the effect on cancer-specific and other cause mortality, respectively.

3.2 Maximum likelihood estimation

Alternatively to the estimating equation approach, we can consider a multinomial likelihood expression for the parameter θ\theta using data from the mortality outcomes in the conventional trial. This can be written as

L(θ)\displaystyle L(\theta) ={i:Zi=1}{E[N131(t)+N231(t)]N131(t)+N231(t)E[N141(t)+N241(t)]N141(t)+N241(t)\displaystyle=\prod_{\{i:Z_{i}=1\}}\Big{\{}{E[N_{13}^{1}(t)+N_{23}^{1}(t)]}^{N_{13}^{1}(t)+N_{23}^{1}(t)}{E[N_{14}^{1}(t)+N_{24}^{1}(t)]}^{N_{14}^{1}(t)+N_{24}^{1}(t)}
×(1E[N131(t)+N231(t)]E[N141(t)+N241(t)])1N131(t)N231(t)N141(t)N241(t)}\displaystyle\qquad\times\left(1-E[N_{13}^{1}(t)+N_{23}^{1}(t)]-E[N_{14}^{1}(t)+N_{24}^{1}(t)]\right)^{1-N_{13}^{1}(t)-N_{23}^{1}(t)-N_{14}^{1}(t)-N_{24}^{1}(t)}\Big{\}}
×{i:Zi=0}E[N130(t)]N130(t)E[N140(t)]N140(t)\displaystyle\quad\times\prod_{\{i:Z_{i}=0\}}E[N_{13}^{0}(t)]^{N_{13}^{0}(t)}E[N_{14}^{0}(t)]^{N_{14}^{0}(t)}
×(1E[N130(t)]E[N140(t)])(1N130(t)N140(t))\displaystyle\qquad\times\left(1-E[N_{13}^{0}(t)]-E[N_{14}^{0}(t)]\right)^{(1-N_{13}^{0}(t)-N_{14}^{0}(t))}
θ{i:Zi=0}E[N130(t)]N130(t)E[N140(t)]N140(t)\displaystyle\stackrel{{\scriptstyle\theta}}{{\propto}}\prod_{\{i:Z_{i}=0\}}E[N_{13}^{0}(t)]^{N_{13}^{0}(t)}E[N_{14}^{0}(t)]^{N_{14}^{0}(t)}
×(1E[N130(t)]E[N140(t)])1N130(t)N140(t),\displaystyle\qquad\times\left(1-E[N_{13}^{0}(t)]-E[N_{14}^{0}(t)]\right)^{1-N_{13}^{0}(t)-N_{14}^{0}(t)}, (7)

where E[N130(t)]E[N_{13}^{0}(t)] and E[N140(t)]E[N_{14}^{0}(t)] are as in (3.1) and (3.1). The proportionality followed because, considering the transition intensities in the screening arm as fixed quantities, the likelihood contributions of the individuals in the screening arm do not depend on θ\theta. Similarly to the estimating equation in Section 3.1, (3.2) involves unknown quantities that under the stated assumptions can be replaced with estimators from the screening arm of the trial, and the expression L(θ)L(\theta) then maximized numerically with respect to logθ\log\theta. The limitation of the multinomial likelihood expression (3.2) is that it can only accommodate type I censoring. However, instead of evaluating the likelihood expression at a fixed time tt, independent censoring can be accommodated by evaluating the likelihood contributions at the minimum of the time of death or censoring time. Such a likelihood expression could also be informative of time-dependent treatment effects. We will briefly outline this in Section 6.

4 Simulation studies

4.1 Using NLST as basis

To compare the estimators proposed in Section 3 in a setting resembling a real randomized screening trial, we generated data using the algorithm described in Section 2.4, using the NLST as the basis of the simulation study. In particular, because the NLST was powered for the ITS effect, we were interested in determining if such a study is also powered for detecting the subgroup effect using tests based on the proposed estimators. For the purpose of comparison, we also included tests based on these to the two estimators proposed earlier for absolute and proportional subgroup mortality risk reduction (Saha et al., 2018). In the present notational framework, the absolute mortality risk reduction (3) can be expressed as

E[N130(t){N131(t)+N231(t)}]\displaystyle E[N_{13}^{0}(t)-\{N_{13}^{1}(t)+N_{23}^{1}(t)\}]
=E[N130(t){N131(t)+N231(t)}N121(t)=1]P(N121(t)=1)\displaystyle=E[N_{13}^{0}(t)-\{N_{13}^{1}(t)+N_{23}^{1}(t)\}\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
+E[N130(t){N131(t)+N231(t)}N121(t)=0]P(N121(t)=0)\displaystyle\quad+E[N_{13}^{0}(t)-\{N_{13}^{1}(t)+N_{23}^{1}(t)\}\mid N_{12}^{1}(t)=0]P(N_{12}^{1}(t)=0)
=(i),(ii),(iii)E[N2310(t){N131(t)+N2311(t)}N121(t)=1]P(N121(t)=1)\displaystyle\stackrel{{\scriptstyle\textrm{(i),(ii),(iii)}}}{{=}}E[N_{23}^{10}(t)-\{N_{13}^{1}(t)+N_{23}^{11}(t)\}\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
=E[N2310(t)N2311(t)N121(t)=1]E[N121(t)]\displaystyle=E[N_{23}^{10}(t)-N_{23}^{11}(t)\mid N_{12}^{1}(t)=1]E[N_{12}^{1}(t)]

Thus, under (iv) we get,

(3)=E[N13(t)Z=0]E[N13(t)+N23(t)Z=1]E[N12(t)Z=1],\displaystyle\eqref{equation:acfr}=\frac{E[N_{13}(t)\mid Z=0]-E[N_{13}(t)+N_{23}(t)\mid Z=1]}{E[N_{12}(t)\mid Z=1]}, (8)

where the expectations can be estimated non-parametrically as cumulative incidences. Further, we can write

E[N130(t)]\displaystyle E[N_{13}^{0}(t)]
=E[N130(t)N121(t)=1]P(N121(t)=1)\displaystyle=E[N_{13}^{0}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
+E[N130(t)N121(t)=0]P(N121(t)=0)\displaystyle\quad+E[N_{13}^{0}(t)\mid N_{12}^{1}(t)=0]P(N_{12}^{1}(t)=0)
=(i),(ii)E[N2310(t)N121(t)=1]P(N121(t)=1)\displaystyle\stackrel{{\scriptstyle\textrm{(i),(ii)}}}{{=}}E[N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
+E[N131(t)N121(t)=0]P(N121(t)=0)\displaystyle\quad\quad+E[N_{13}^{1}(t)\mid N_{12}^{1}(t)=0]P(N_{12}^{1}(t)=0)
=E[N2310(t)N121(t)=1]P(N121(t)=1)+P(N131(t)=1,N121(t)=0)\displaystyle=E[N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)+P(N_{13}^{1}(t)=1,N_{12}^{1}(t)=0)
=E[N2310(t)N121(t)=1]P(N121(t)=1)+E[N131(t)]\displaystyle=E[N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)+E[N^{1}_{13}(t)]

Using the above two results, we can now re-express (4) as,

1E[N2311(t)N121(t)=1]E[N2310(t)N121(t)=1]\displaystyle 1-\frac{E[N_{23}^{11}(t)\mid N_{12}^{1}(t)=1]}{E[N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]}
=(E[N2310(t)N121(t)=1]E[N2311(t)N121(t)=1])P(N121(t)=1)E[N2310(t)=1N121(t)=1]P(N121(t)=1)\displaystyle=\frac{(E[N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]-E[N_{23}^{11}(t)\mid N_{12}^{1}(t)=1])P(N_{12}^{1}(t)=1)}{E[N_{23}^{10}(t)=1\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)}
=E[N130(t)]E[N131(t)+N231(t)]E[N130(t)]E[N131(t)]\displaystyle=\frac{E[N^{0}_{13}(t)]-E[N^{1}_{13}(t)+N^{1}_{23}(t)]}{E[N_{13}^{0}(t)]-E[N^{1}_{13}(t)]}

Finally, under (iv) we can write that

(4)=E[N13(t)Z=0]E[N13(t)+N23(t)Z=1]E[N13(t)Z=0]E[N13(t)Z=1],\displaystyle\eqref{equation:pcfr}=\frac{E[N_{13}(t)\mid Z=0]-E[N_{13}(t)+N_{23}(t)\mid Z=1]}{E[N_{13}(t)\mid Z=0]-E[N_{13}(t)\mid Z=1]}, (9)

where again the expectations can be replaced with appropriate non-parametric cumulative incidence estimators. While (8) and (9) are estimating mortality risk reductions rather than the hazard ratio, the four estimators are comparable in terms of the power of a Wald-type or confidence interval based test on them. This will help answer the question of whether conventional screening trial such as NLST is powered to detect the subgroup early treatment effect. For this reason, we also calculated estimates for the ITS mortality reductions as

(1)=E[N13(t)Z=0]E[N13(t)+N23(t)Z=1]\eqref{equation:itt}=E[N_{13}(t)\mid Z=0]-E[N_{13}(t)+N_{23}(t)\mid Z=1] (10)

and the proportional reduction by

(2)=1E[N13(t)+N23(t)Z=1]E[N13(t)Z=0],\eqref{equation:ittp}=1-\frac{E[N_{13}(t)+N_{23}(t)\mid Z=1]}{E[N_{13}(t)\mid Z=0]}, (11)

as the trial is powered for these.

To compare the four estimators, we generated 10001000 NLST-like datasets, by fixing the transition intensities to those estimated from the NLST, and fixing the log\log hazard ratio to 0.48040.4804, estimated from the real data using the estimating equation (3.1). While we could have fixed the hazard ratio to any value in the simulation, we used the empirical value to obtain similar death counts to the real data. For each dataset we simulated n=53,452n=53,452 individuals assigned randomly with equal probability to screening or control. Since in the NLST the censoring is mostly administrative at the end of the follow-up at 7 years, we did not simulate random censoring, so the simulated datasets had only type I censoring at 7 years. The true values of the subgroup mortality reductions were evaluated by calculating the quantities (3) and (4) under the specified transition intensities, that is, evaluating the cumulative incidences through numerical integration using the true transition intensity functions used in the simulation. In the estimators, the cumulative incidences were calculated using the cuminc function of the cmprsk R package (Gray, 2019). For more complex state occupation probabilities needed for evaluating the estimating equation and the likelihood expression, we first calculated Nelson-Aalen/Breslow estimators, which were used as inputs to Aalen-Johansen estimators for state occupation probabilities using the functions available in the mstate R package. The estimating equation was solved numerically using the uniroot function and the likelihood maximized numerically using the optim function of R (R Core Team, 2017).

The mortality reductions, as well as the estimating equation (3.1) and the likelihood expression (3.2) were evaluated at 7 years. To estimate standard errors for each estimator, we bootstrapped each simulated dataset 50 times, and took the standard deviation of the point estimates as the standard error. This was compared to the Monte Carlo standard deviation of the point estimates over the simulation rounds. 95% confidence intervals were constructed using the normal approximation and compared to the nominal coverage probability. The power of the confidence interval based test was calculated as the proportion of simulation rounds where the interval did not cover the null value.

The simulation results are presented in Table 1. These indicate that all the estimators could capture the corresponding true value relatively well, based on low bias and near nominal coverage probability. The power of the subgroup treatment effect estimators was also comparable to the ITS analysis. This is consistent with the similar property of g-estimation methods (White, 2005). This can be explained by the fact that all the mortality reduction due to screening is realized through the effect of the early treatments, and thus focusing on the subgroup does not lose any of the effect. The test based on the estimator (9) had the highest power. The two hazard ratio estimators gave very similar results, so we only use the estimating equation for the data analysis in the next section.

Estimator Causal Truth Point SE Power Coverage MCsd MCE
contrast estimate
(3.1) logθ\log\theta 0.4804 0.4893 0.1628 0.85 0.942 0.1657 0.005
(3.2) logθ\log\theta 0.4804 0.4892 0.1628 0.85 0.942 0.1657 0.005
(8) (3) 0.1512 0.1507 0.0521 0.83 0.945 0.0534 0.0017
(9) (4) 0.2986 0.2921 0.0804 0.90 0.944 0.0803 0.0025
(10) (1) 0.0036 0.0036 0.0012 0.85 0.940 0.0012 0.00004
(11) (2) 0.1616 0.1616 0.0500 0.87 0.938 0.0514 0.0016
Table 1: Comparison of subgroup and population-level estimators. The estimators from top to bottom are estimating equation and likelihood based estimation of the subgroup cancer mortality hazard ratio, absolute and proportional subgroup cancer mortality reduction, and absolute and proportional population cancer mortality reduction. SE stands for standard error, MCsd for Monte Carlo standard deviation and MCE for Monte Carlo error.

4.2 Simulating from constant transition rates

To check the performance of the proposed methods under settings not resembling a typical randomized screening trial with large size and rare events, we also simulated scenarios with smaller sample size and relatively more common events. In particular, we simulated scenarios with more than 50%50\% in the screening arm early diagnosed through (compared to 2.4%2.4\% in NLST) and among them majority experiencing cancer-specific death. The screening arm state transitions were simulated from the following constant transition rates, with follow-up type I censored at t=7t=7:

λ11(t)={blockarray}c@rrrr@&1234{block}r@|@|@rrrr@|@|1.0.22800.11480.016820.0.19800.0111300.04000.\lambda^{11}(t)=\blockarray{c@{\hskip 1.0pt}rrrr@{\hskip 4.0pt}}&1234\\ \block{r@{\hskip 1.0pt}|@{\hskip 1.0pt}|@{\hskip 1.0pt}rrrr@{\hskip 1.0pt}|@{\hskip 1.0pt}|}1.0.22800.11480.0168\\ 20.0.19800.0111\\ 300.0\\ 4000.\\

For the control arm, all the intensities in the matrix were the same, except for λ2310(t)=0.1980×θ\lambda^{10}_{23}(t)=0.1980\,\times\theta, where logθ\log\theta was fixed to 0.47000.4700 (i.e. θ=1.6\theta=1.6). From the intensity matrices, we generated screening trial data using the algorithm of Section 2.4 with sample sizes n=500,800,n=500,800, and 10001000 and estimated the hazard ratios in the screening detectable subgroup using the two methods described in Section 3. We did not consider the risk reduction type measures here. The results for a scenario without unmeasured confounding are presented in Table 2 (rows with β=0\beta=0).

β=0\beta=0 Estimator Causal Truth Point SE Power Coverage MCsd MCE
contrast estimate
n=500n=500 (3.1) logθ\log\theta 0.4700 0.4500 0.2467 0.44 0.95 0.2363 0.0106
(3.2) logθ\log\theta 0.4700 0.4544 0.2324 0.50 0.95 0.2216 0.0099
n=800n=800 (3.1) logθ\log\theta 0.4700 0.4708 0.1931 0.69 0.96 0.1828 0.0082
(3.2) logθ\log\theta 0.4700 0.4674 0.1807 0.75 0.96 0.1760 0.0079
n=1000n=1000 (3.1) logθ\log\theta 0.4700 0.4730 0.1698 0.82 0.95 0.1652 0.0074
(3.2) logθ\log\theta 0.4700 0.4686 0.1600 0.85 0.95 0.1601 0.0072
β=0.34\beta=0.34 Estimator Causal Truth Point SE Power Coverage MCsd MCE
contrast estimate
n=500n=500 (3.1) logθ\log\theta 0.3900 0.4034 0.2506 0.37 0.95 0.2256 0.0101
(3.2) logθ\log\theta 0.3900 0.4052 0.2275 0.43 0.96 0.2098 0.0094
n=800n=800 (3.1) logθ\log\theta 0.3941 0.3807 0.1827 0.57 0.96 0.1740 0.0078
(3.2) logθ\log\theta 0.3941 0.3796 0.1681 0.62 0.97 0.1632 0.0073
n=1000n=1000 (3.1) logθ\log\theta 0.4000 0.4268 0.1658 0.77 0.95 0.1642 0.0073
(3.2) logθ\log\theta 0.4000 0.4221 0.1535 0.82 0.95 0.1528 0.0068
β=0.47\beta=0.47 Estimator Causal Truth Point SE Power Coverage MCsd MCE
contrast estimate
n=500n=500 (3.1) logθ\log\theta 0.3794 0.3766 0.2633 0.24 0.96 0.2199 0.0098
(3.2) logθ\log\theta 0.3794 0.3744 0.2288 0.37 0.96 0.1968 0.0088
n=800n=800 (3.1) logθ\log\theta 0.3819 0.3436 0.1849 0.46 0.95 0.1697 0.0076
(3.2) logθ\log\theta 0.3819 0.3423 0.1682 0.53 0.95 0.1604 0.0072
n=1000n=1000 (3.1) logθ\log\theta 0.3876 0.4044 0.1691 0.73 0.96 0.1554 0.0069
(3.2) logθ\log\theta 0.3876 0.4014 0.1540 0.77 0.96 0.1464 0.0065
Table 2: Comparison of subgroup hazard ratio estimates with varied sample sizes, based on 500500 simulated datasets. The two estimators correspond to estimating equation and likelihood based estimation. SE stands for standard error, MCsd for Monte Carlo standard deviation and MCE for Monte Carlo error.

Results in Table 2 indicate that the two estimators again perform very similarly. With n=500n=500, they show some small sample bias, which disappears with increasing sample size. The bootstrap standard errors give a reasonable approximation of the sampling distribution standard deviation. These results indicate that when the modeling and identifying assumptions are satisfied, the methods perform reasonably also with smaller sample size and more common events.

To study how the methods perform in the presence of unmeasured confounding, we modified the above transition intensities for transitions 121\rightarrow 2, 131\rightarrow 3 and 232\rightarrow 3 as λ121(t)exp(βU)\lambda_{12}^{1}(t)\exp(\beta U), λ131(t)exp(βU)\lambda_{13}^{1}(t)\exp(\beta U) and λ2311(t)exp(βU)\lambda_{23}^{11}(t)\exp(\beta U), where UBernoulli(0.5)U\sim\textrm{Bernoulli}(0.5) is an unmeasured baseline covariate. This represents a scenario where some individuals are at higher risk for cancer diagnosis and cancer mortality, but under which the instrumental variable assumptions are still satisfied. However, because the data are simulated from a conditional generating mechanism, some degree of model misspecification is expected in the estimators. In addition to the proportionality of the marginal effect, the estimators use a Markov multi-state model for calculation of the state occupancy probabilities. While the Markov assumption can be relaxed, here we study its effect on the estimates. The true marginal effects were approximated as explained in Section 2.4.

The results with β=0.34\beta=0.34 and β=0.47\beta=0.47 are also summarized in Table 2. They show as expected that the true marginal effects are now smaller compared to the conditional effect used in the simulation. The points estimates were close to the true marginal effects under most of the scenarios and sample sizes, with coverage probabilities close to nominal. This suggests that the estimates are free of confounding bias, as is expected of instrumental variable estimators.

5 Illustration using NLST data

The NLST data were briefly introduced in Section 2.1. The original analysis followed the ITS principle. Here we are interested in quantifying the effect of the early vs late treatments among the screening detectable subgroup in terms of the subgroup cancer-specific mortality hazard ratio. For this purpose, we carried out a secondary analysis of the NLST data using the estimating equation approach based on (3.1). We estimated the hazard ratio as a function of the fixed time point tt, to determine if this effect stabilizes or dilutes over time. The confidence intervals were generated using 500500 bootstrap replicates.

The results are presented in Figure 3. This demonstrates that the hazard ratio becomes statistically significant in the fifth year of follow-up, reaches a maximum of 1.6681.668 at around 6 years, and levels after that. Overall the effect estimate is relatively stable over time which may lend some support towards the proportionality of this effect. The hazard ratio estimate at 7 years indicates a 100%×(11/1.62)=38.3%100\%\times(1-1/1.62)=38.3\% cancer-specific mortality hazard reduction due to the early treatments. This can be contrasted to the reduction in the empirical cancer mortality rate between screening and control arms in the trial which after 7 years of follow-up was 1(469/171,412)/(552/170,355)1-(469/171,412)/(552/170,355), or around 15.6%. The reason for the difference between these two measures is that the latter combines the effect of early treatments after early diagnosis with factors influencing early diagnosis itself, including the screening regimen, screening technology and non-compliance. The statistical significance of the two effects is similar based on the NLST data; even though the subgroup effect is larger in magnitude, it is evaluated in a smaller subpopulation, and these two aspects counterbalance each other compared to the smaller ITS effect in the entire study population.

In addition to reporting the estimate as a function of time and at the end of the follow-up, for choosing a single number effect estimate to report, we also applied the two approaches suggested in Section 3.1 for choosing the timepoint for estimation, namely the time minimizing the empirical variance of the estimate and the inverse variance weighted average. For these, we considered a grid of timepoints from 1 to 7 years with 0.050.05 year intervals. We found that the minimum empirical variance is achieved at t=4.55t=4.55 with hazard ratio of 1.361.36, while the inverse variance weighted average hazard ratio was 1.431.43, suggesting that the choice of the timepoint made relatively small difference to the estimate.

Refer to caption
Figure 3: Estimated effect of delayed versus early treatment on lung cancer mortality among the screening-detectable subgroup. The dashed lines are 95% bootstrap confidence bands. Estimates are generated as a function of time, using the follow-up data up to a given time point tt. The estimates are noisy in the first year of follow-up and are shown from the second year.

6 Discussion

In this paper, we proposed a simplified multi-state model to characterize screening trials, and proposed a new measure (i.e. structural cause-specific hazard ratio) to quantify the impact of early treatments compared to delayed treatments in the screening-detectable subgroup. We note that the subgroup is specific to the screening regimen and screening technology implemented in the trial that is used for the estimation, as the effect is formulated in terms of a hypothetical intervention trial that shares the same regimen and technology. Transferability of the subgroup effect to different settings with different screening technology or regimen, disease incidence and non-adherence rate requires additional assumptions. Nevertheless, the subgroup effect is arguably more transferable compared to the ITS effect, which directly depends on all of these factors. We presented two alternative estimators for the subgroup effect. In addition to comparing properties of estimators, the simulation model may have use in planning new trials targeted for the subgroup effects. While the estimation methods we proposed allow for flexible estimation of the quantities not of interest, the effect of interest is quantified in terms of a single parameter. As a limitation of the present work, we did not propose methods for estimating time-dependent effects, but outline some ideas below.

While we contend that the cause-specific hazard ratio can serve as a useful summary measure of the magnitude of the effect, and simplifies accommodating competing risks into the analysis, we do note that the use of marginal hazard ratios as a causal parameters has been criticized by several authors (e.g. Aalen et al., 2015), who point out even in randomized trials where the arms are initially balanced with respect to observed or unobserved baseline characteristics, this balance is lost over the course of the follow-up as more high risk individuals get removed from the risksets. However, as outlined by Young et al. (2020), counterfactual cause-specific hazards can always be formulated in terms of the underlying potential event time/event type outcomes in the two arms of a trial. Because these quantities are defined conditional on the past, it should be noted that the corresponding hazard ratio being one at a given time does not necessarily imply the absence of a causal effect, but rather some combination of causal effect and selection. If a strictly causal interpretation is sought, one could use the subgroup mortality risk reduction as the measure instead.

With these reservations, we suggest interpreting the cause-specific hazard ratio as a summary measure of the early treatment effect over time. However, we note that the proportionality is more plausible for the subgroup early treatment effect since this effect begins from the early diagnosis, compared to the use of cancer mortality rate ratio to compare the screening and control arms, which is dependent on the screening regimen, among other things. In particular, when the screening is discontinued, the screening effect eventually disappears in the full cohort and continuing the follow-up will dilute the proportional effect. The subgroup effect measure does not suffer from this limitation, as no more subgroup members are accumulated after the screening discontinues. In fact, we suggest that the present proposal can resolve some of the controversies in characterizing screening effect through a hazard/rate ratio, discussed for example by Liu et al. (2013, 2015); Hanley and Njor (2018); Habbema (2018). Rather than the screening effect in the entire population, what could be plausibly assumed proportional is the early treatment effect in the subgroup. On the other hand, the usual residual type diagnostics for proportionality violations are not applicable under the instrumental variable type estimation methods; diagnosing violations of proportionality under the current framework is a topic for further research. As one possible approach, the likelihood-based method could be extended to make use of the observed event times in the control arm, making it potentially informative of the time-varying effects. For this purpose, the likelihood contributions in the control arm could be evaluated at the minimum of observed time of death and censoring time TiT_{i} as

L(θ)\displaystyle L(\theta) θ{i:Zi=0}E[dN130(Ti)]dN130(Ti)E[dN140(Ti)]dN140(Ti)\displaystyle\stackrel{{\scriptstyle\theta}}{{\propto}}\prod_{\{i:Z_{i}=0\}}E[\textrm{d}N_{13}^{0}(T_{i})]^{\textrm{d}N_{13}^{0}(T_{i})}E[\textrm{d}N_{14}^{0}(T_{i})]^{\textrm{d}N_{14}^{0}(T_{i})}
×(1E[N130(Ti)]E[N140(Ti)])1dN130(Ti)dN140(Ti),\displaystyle\qquad\times\left(1-E[N_{13}^{0}(T_{i})]-E[N_{14}^{0}(T_{i})]\right)^{1-\textrm{d}N_{13}^{0}(T_{i})-\textrm{d}N_{14}^{0}(T_{i})}, (12)

where we can characterize for example the sub-density of the cancer-specific death time through

E[dN130(t)]\displaystyle E[\textrm{d}N_{13}^{0}(t)] =exp{(Λ130(t)+Λ140(t))}λ130(t)dt\displaystyle=\exp\left\{-(\Lambda_{13}^{0}(t)+\Lambda_{14}^{0}(t))\right\}\lambda_{13}^{0}(t)\,\textrm{d}t
=E[dN130(t)N121(t)=1]P(N121(t)=1)\displaystyle=E[\textrm{d}N_{13}^{0}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
+E[dN130(t)N121(t)=0]P(N121(t)=0)\displaystyle\quad+E[\textrm{d}N_{13}^{0}(t)\mid N_{12}^{1}(t)=0]P(N_{12}^{1}(t)=0)
=(i),(ii)E[dN2310(t)N121(t)=1]P(N121(t)=1)\displaystyle\stackrel{{\scriptstyle\textrm{(i),(ii)}}}{{=}}E[\textrm{d}N_{23}^{10}(t)\mid N_{12}^{1}(t)=1]P(N_{12}^{1}(t)=1)
+E[dN131(t)N121(t)=0]P(N121(t)=0)\displaystyle\quad+E[\textrm{d}N_{13}^{1}(t)\mid N_{12}^{1}(t)=0]P(N_{12}^{1}(t)=0)
=0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ121(u)\displaystyle=\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{12}^{1}(u)
×exp{(ut[λ2310(v)+λ2410(v)]dv)}λ2310(t)dtdu\displaystyle\quad\times\exp\left\{-\left(\int_{u}^{t}[\lambda_{23}^{10}(v)+\lambda_{24}^{10}(v)]\,\textrm{d}v\right)\right\}\lambda_{23}^{10}(t)\,\textrm{d}t\,\textrm{d}u
+exp{(Λ121(t)+Λ131(t)+Λ141(t))}λ131(t)dt\displaystyle\quad+\exp\left\{-(\Lambda_{12}^{1}(t)+\Lambda_{13}^{1}(t)+\Lambda_{14}^{1}(t))\right\}\lambda_{13}^{1}(t)\,\textrm{d}t

and further, under proportionality,

E[dN130(t)]\displaystyle E[\textrm{d}N_{13}^{0}(t)] =(iii)0texp{(Λ121(u)+Λ131(u)+Λ141(u))}λ121(u)\displaystyle\stackrel{{\scriptstyle\textrm{(iii)}}}{{=}}\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u)+\Lambda_{13}^{1}(u)+\Lambda_{14}^{1}(u))\right\}\lambda_{12}^{1}(u)
×exp{(ut[θλ231(v)+λ241(v)]dv)}θλ231(t)dtdu\displaystyle\quad\times\exp\left\{-\left(\int_{u}^{t}[\theta\lambda_{23}^{1}(v)+\lambda_{24}^{1}(v)]\,\textrm{d}v\right)\right\}\theta\lambda_{23}^{1}(t)\,\textrm{d}t\,\textrm{d}u
+exp{(Λ121(t)+Λ131(t)+Λ141(t))}λ131(t)dt,\displaystyle\quad+\exp\left\{-(\Lambda_{12}^{1}(t)+\Lambda_{13}^{1}(t)+\Lambda_{14}^{1}(t))\right\}\lambda_{13}^{1}(t)\,\textrm{d}t,

with the cumulative incidences in (6) as before. This type of likelihood would allow testing the significance of interaction terms with time entered into the model. The likelihood could also be used to assess on which time scale proportionality fits the data better. It could also accommodate independent censoring which can be seen by introducing a further counting process characterizing the censoring events, and factoring out the corresponding terms from the likelihood. The likelihood could also be written for covariate conditional quantities, accommodating covariate dependent censoring. We note that because the event times in the screening and control arms are distinct, substituting estimators for the sub-densities from the screening arm of the trial would require smooth hazard estimates. This is in contrast to the methods we focused on earlier, where non-parametric estimators could be substituted in for the cumulative incidences.

The proposed estimating equation based method could accommodate independent censoring. It can be extended straightforwardly to the case where the censoring is conditional on baseline covariates, for example, standardizing over the empirical covariate distribution in the control arm of the trial, equation (3.1) can be expressed as

1n{i:Zi=0}E[N130(t)xi]\displaystyle\frac{1}{n}\sum_{\{i:Z_{i}=0\}}E[N_{13}^{0}(t)\mid x_{i}]
=1n{i:Zi=0}[0texp{(Λ121(uxi)+Λ131(uxi)+Λ141(uxi))}λ121(uxi)\displaystyle=\frac{1}{n}\sum_{\{i:Z_{i}=0\}}\Bigg{[}\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u\mid x_{i})+\Lambda_{13}^{1}(u\mid x_{i})+\Lambda_{14}^{1}(u\mid x_{i}))\right\}\lambda_{12}^{1}(u\mid x_{i})
×utexp{(uv[θλ231(sxi)+λ241(sxi)]ds)}θλ231(vxi)dvdu\displaystyle\quad\times\int_{u}^{t}\exp\left\{-\left(\int_{u}^{v}[\theta\lambda_{23}^{1}(s\mid x_{i})+\lambda_{24}^{1}(s\mid x_{i})]\,\textrm{d}s\right)\right\}\theta\lambda_{23}^{1}(v\mid x_{i})\,\textrm{d}v\,\textrm{d}u
+0texp{(Λ121(uxi)+Λ131(uxi)+Λ141(uxi))}λ131(uxi)du],\displaystyle\quad+\int_{0}^{t}\exp\left\{-(\Lambda_{12}^{1}(u\mid x_{i})+\Lambda_{13}^{1}(u\mid x_{i})+\Lambda_{14}^{1}(u\mid x_{i}))\right\}\lambda_{13}^{1}(u\mid x_{i})\,\textrm{d}u\Bigg{]},

where the covariate conditional cause-specific hazards can be estimated using Cox models, and the cumulative incidences on the left hand side either with Cox or Fine & Gray models. In addition to allowing for covariate-dependent censoring, conditioning on the covariates may be useful for relaxing the instrumental variable assumptions outside of randomized trials, or improving the efficiency of the instrumental variable type estimators (Burgess et al., 2017). In this case the estimated hazard ratio will be covariate conditional. Alternatively, to preserve the original marginal estimand, the original estimating equation (3.1) could be used when combined with inverse probability of censoring weighted estimators for the other unknown quantities (e.g. Howe et al., 2016). We are currently pursuing these extensions.

Acknowledgment

This work was supported by the Ontario Institute for Cancer Research through funding provided by the Government of Ontario (to SS) and by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (to OS). The authors thank the National Cancer Institute (NCI) for access to NCI’s data collected by the National Lung Screening Trial. The statements contained herein are solely those of the authors and do not represent or imply concurrence or endorsement by the NCI.

References

  • Aalen et al. (2015) Aalen OO, Cook RJ, Røysland K (2015) Does cox analysis of a randomized survival study yield a causal treatment effect? Lifetime Data Anal 21:579–593
  • Altstein and Li (2013) Altstein LL, Li G (2013) Latent subgroup analysis of a randomized clinical trial through a semiparametric accelerated failure time mixture model. Biometrics 69:52–61
  • Altstein et al. (2011) Altstein LL, Li G, Elashoff RM (2011) A method to estimate treatment efficacy among latent subgroups of a randomized clinical trial. Stat Med 30:709–717
  • Angrist et al. (1996) Angrist JD, Imbens GW, Rubin DB (1996) Identification of causal effects using instrumental variables. J Am Stat Assoc 91:444–455
  • Baiocchi et al. (2014) Baiocchi M, Cheng J, Small DS (2014) Instrumental variable methods for causal inference. Stat Med 33:2297–2340
  • Baker (1998) Baker SG (1998) Analysis of survival data from a randomized trial with all-or-none compliance: estimating the cost-effectiveness of a cancer screening program. J Am Stat Assoc 93:929–934
  • Baker and Lindeman (1994) Baker SG, Lindeman KS (1994) The paired availability design: a proposal for evaluating epidural analgesia during labor. Stat Med 13(21):2269–2278
  • Baker et al. (2016) Baker SG, Kramer BS, Lindeman KS (2016) Latent class instrumental variables: a clinical and biostatistical perspective. Stat Med 35(1):147–160
  • Beyersmann et al. (2011) Beyersmann J, Allignol A, Schumacher M (2011) Competing risks and multistate models with R. Springer Science & Business Media
  • Borgan (2014) Borgan Ø (2014) Aalen-Johansen estimator. Wiley StatsRef: Statistics Reference Online pp 1–13
  • Burgess et al. (2017) Burgess S, Small DS, Thompson SG (2017) A review of instrumental variable estimators for mendelian randomization. Stat Methods Med Res 26:2333–2355
  • von Cube et al. (2019) von Cube M, Schumacher M, Wolkewitz M (2019) Causal inference with multistate models—estimands and estimators of the population attributable fraction. J R Stat Soc: Series A (Statistics in Society)
  • de Wreede et al. (2011) de Wreede LC, Fiocco M, Putter H (2011) mstate: An R package for the analysis of competing risks and multi-state models. J Stat Softw 38:1–30, http://www.jstatsoft.org/v38/i07/
  • Gray (2019) Gray B (2019) cmprsk: Subdistribution Analysis of Competing Risks. https://CRAN.R-project.org/package=cmprsk, r package version 2.2-9
  • Habbema (2018) Habbema D (2018) Statistical analysis and decision making in cancer screening. Eur J Epidemiol 33:433–435
  • Hanley and Njor (2018) Hanley JA, Njor SH (2018) Disaggregating the mortality reductions due to cancer screening: model-based estimates from population-based data. Eur J Epidemiol 33:465–472
  • Hess and Gentleman (2019) Hess K, Gentleman R (2019) muhaz: Hazard Function Estimation in Survival Analysis. https://CRAN.R-project.org/package=muhaz, r package version 1.2.6.1
  • Howe et al. (2016) Howe CJ, Cole SR, Lau B, Napravnik S, Eron Jr JJ (2016) Selection bias due to loss to follow up in cohort studies. Epidemiology (Cambridge, Mass) 27:91
  • Hu and Zelen (1997) Hu P, Zelen M (1997) Planning clinical trials to evaluate early detection programmes. Biometrika 84:817–829
  • Imbens and Angrist (1994) Imbens GW, Angrist JD (1994) Identification and estimation of local average treatment effects. Econometrica 62:467–475
  • Lee and Zelen (2006) Lee S, Zelen M (2006) Chapter 11: a stochastic model for predicting the mortality of breast cancer. JNCI Monographs 2006:79–86
  • Liu et al. (2013) Liu Z, Hanley JA, Strumpf EC (2013) Projecting the yearly mortality reductions due to a cancer screening programme. J Med Screen 20:157–164
  • Liu et al. (2015) Liu Z, Hanley JA, Saarela O, Dendukuri N (2015) A conditional approach to measure mortality reductions due to cancer screening. Int Stat Rev 83:493–510
  • Loeys and Goetghebeur (2003) Loeys T, Goetghebeur E (2003) A causal proportional hazards estimator for the effect of treatment actually received in a randomized trial with all-or-nothing compliance. Biometrics 59:100–105
  • Loeys et al. (2005) Loeys T, Goetghebeur E, Vandebosch A (2005) Causal proportional hazards models and time-constant exposure in randomized clinical trials. Lifetime Data Anal 11:435–449
  • Mark and Robins (1993a) Mark SD, Robins JM (1993a) Estimating the causal effect of smoking cessation in the presence of confounding factors using a rank preserving structural failure time model. Stat Med 12:1605–1628
  • Mark and Robins (1993b) Mark SD, Robins JM (1993b) A method for the analysis of randomized trials with compliance information: an application to the multiple risk factor intervention trial. Control Clin Trials 14:79–97
  • Martinussen and Vansteelandt (2020) Martinussen T, Vansteelandt S (2020) Instrumental variables estimation with competing risk data. Biostatistics 21:158–171
  • Martinussen et al. (2019) Martinussen T, Nørbo Sørensen D, Vansteelandt S (2019) Instrumental variables estimation under a structural Cox model. Biostatistics 20:65–79
  • McIntosh (1999) McIntosh MW (1999) Instrumental variables when evaluating screening trials: estimating the benefit of detecting cancer by screening. Stat Med 18:2775–2794
  • Miettinen (2014) Miettinen OS (2014) Toward Scientific Medicine. Springer
  • Miettinen (2015) Miettinen OS (2015) ‘screening’ for breast cancer: Misguided research misinforming public policies. Epidemiologic Methods 4:3–10
  • Nie et al. (2011) Nie H, Cheng J, Small DS (2011) Inference for the effect of treatment on survival probability in randomized trials with noncompliance and administrative censoring. Biometrics 67:1397–1405
  • NLST Research Team (2011) NLST Research Team (2011) Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 365:395–409
  • Oken et al. (2011) Oken MM, Hocking WG, Kvale PA, Andriole GL, Buys SS, Church TR, Crawford ED, Fouad MN, Isaacs C, Reding DJ, Weissfeld JL, Yokochi LA, O’Brien B, Ragard LR, Rathmell JM, Riley TL, Wright P, Caparaso N, Hu P, Izmirlian G, Pinsky PF, Prorok PC, Kramer BS, Miller AB, Gohagan JK, Berg CD (2011) Screening by chest radiograph and lung cancer mortality: the Prostate, Lung, Colorectal, and Ovarian (PLCO) randomized trial. JAMA 306:1865–1873
  • Putter et al. (2007) Putter H, Geskus RB, Fiocco M (2007) Tutorial in biostatistics: Competing risks and multi-state models. Stat Med 26:2389–2430
  • R Core Team (2017) R Core Team (2017) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/
  • Richardson et al. (2017) Richardson A, Hudgens MG, Fine JP, Brookhart MA (2017) Nonparametric binary instrumental variable analysis of competing risks data. Biostatistics 18:48–61
  • Robins (1994) Robins JM (1994) Correcting for non-compliance in randomized trials using structural nested mean models. Commun Stat A-Theor 23:2379–2412
  • Roemeling et al. (2007) Roemeling S, Roobol MJ, Otto SJ, Habbema DF, Gosselaar C, Lous JJ, Cuzick J, Schröder FH (2007) Feasibility study of adjustment for contamination and non-compliance in a prostate cancer screening trial. The Prostate 67:1053–1060
  • Saha et al. (2018) Saha S, Liu ZA, Saarela O (2018) Estimating case-fatality reduction from randomized screening trials. Epidemiologic Methods 7
  • Shen and Zelen (1999) Shen Y, Zelen M (1999) Parametric estimation procedures for screening programmes: stable and nonstable disease models for multimodality case finding. Biometrika 86:503–515
  • Sung et al. (2019) Sung NY, Jun JK, Kim YN, Jung I, Park S, Kim GR, Nam CM (2019) Estimating age group-dependent sensitivity and mean sojourn time in colorectal cancer screening. J Med Screen 26:19–25
  • Swanson et al. (2015) Swanson SA, Robins JM, Miller M, Hernán MA (2015) Selecting on treatment: a pervasive form of bias in instrumental variable analyses. Am J Epidemiol 181:191–197
  • White (2005) White IR (2005) Uses and limitations of randomization-based efficacy estimators. Stat Methods Med Res 14:327–347
  • Young et al. (2020) Young JG, Stensrud MJ, Tchetgen Tchetgen EJ, Hernán MA (2020) A causal framework for classical statistical estimands in failure-time settings with competing events. Stat Med 39:1199–1236
  • Zelen (1993) Zelen M (1993) Optimal scheduling of examinations for the early detection of disease. Biometrika 80:279–293
  • Zheng et al. (2017) Zheng C, Dai R, Hari PN, Zhang MJ (2017) Instrumental variable with competing risk model. Stat Med 36:1240–1255