This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Evidence synthesis with reconstructed survival data

Chenqi Fu   
Department of Public Health Sciences,
Pennsylvania State University
Shouhao Zhou   
Department of Public Health Sciences,
Pennsylvania State University
Xuelin Huang   
Department of Biostatistics,
The University of Texas MD Anderson Cancer Center
Nicholas J. Short   
Department of Leukemia,
The University of Texas MD Anderson Cancer Center
Farhad Ravandi-Kashani   
Department of Leukemia,
The University of Texas MD Anderson Cancer Center
and
Donald A. Berry   
Department of Biostatistics,
The University of Texas MD Anderson Cancer Center
corresponding author: Shouhao Zhou, [email protected]
Abstract

We present a general approach to synthesizing evidence of time-to-event endpoints in meta-analyses of aggregate data (AD). Our work goes beyond most previous meta-analytic research by using reconstructed survival data as a source of information. A Bayesian multilevel regression model, called the “meta-analysis of reconstructed survival data” (MARS), is introduced, by modeling and integrating reconstructed survival information with other types of summary data, to estimate the hazard ratio function and survival probabilities. The method attempts to reduce selection bias, and relaxes the presumption of proportional hazards in individual clinical studies from the conventional approaches restricted to hazard ratio estimates. Theoretically, we establish the asymptotic consistency of MARS, and investigate its relative efficiency with respect to the individual participant data (IPD) meta-analysis. In simulation studies, the MARS demonstrated comparable performance to IPD meta-analysis with minor deviation from the true values, suggesting great robustness and efficiency achievable in AD meta-analysis with finite sample. Finally, we applied MARS in a meta-analysis of acute myeloid leukemia to assess the association of minimal residual disease with survival, to help respond to FDA’s emerging concerns on translational use of surrogate biomarker in drug development of hematologic malignancies.


Keywords: Aggregate data; Bayesian modeling; Hazard ratio; Meta-analysis; Proportional hazards assumption

1 Introduction

The time to occurrence of a disease-related event is the most important outcome measure in clinical trials to assess the efficacy of an intervention. Because the clinical and regulatory evaluation of a new intervention usually involves multiple trials, evidence synthesis using meta-analysis that integrates the survival data of such trials is essential (FDA, 2018).

In systematic reviews, aggregate data (AD) meta-analysis is the mainstay approach to combining the results of multiple trials. Even though individual patient data (IPD) meta-analysis is considered the gold standard (Stewart & Clarke, 1995; Riley et al., 2010), IPD were hard to obtain (Tierney et al., 2015; Marciniak, 2017). A recent systematic survey showed that less than 1%1\% of 78,629 meta-analyses included complete or partial IPD, while the vast majority were based on AD only (Nevitt et al., 2017). Thus, it is of utmost importance to develop sensible methodology for AD meta-analysis.

We are concerned in this work with AD meta-analytic methods to synthesize evidence in time-to-event data. For survival outcomes, the standard approach entails a fixed or random effects model on aggregated study-level hazard ratio (HR) estimates (Moher et al., 2012; Higgins et al., 2020). Owing to data censoring in studies of survival outcomes, HRs are useful for describing differences in treatment effects between an intervention group and control group. Abundant open-source and commercial computational software programs are also conveniently available for conducting meta-analysis.

However, this standard approach poses several methodological limitations and challenges in analysis and reporting. First, applications are constricted under the assumption of proportional hazards. Recent systematic reviews of randomized controlled trials showed that the proportional hazards assumption could be significantly violated in one-fifth of treatment comparisons (Rulli et al., 2018), while testing of the proportional hazards assumption seldom was conducted (Batson et al., 2016). In fact, significant violations, such as survival curve convergences and crossings, are common in clinical studies, which gives rise to concerns upon general model validity of AD meta-analysis for time-to-event outcomes (Kristiansen, 2012).

Second, the exclusive use of HR estimates impairs the effectiveness and generalizability of meta-analysis. The strength of systematic review is that it allows evidence to be synthesized from all available clinical information in the literature and allows meta-analysis to combine the information comprehensively. Using HR estimates as the only source of “survival data” simplifies analyses but undermines the generalizability of results by excluding studies that alternatively report other kinds of survival data, such as Kaplan-Meier (KM) curves, log-rank test statistics, median survival times or survival rates. Existing methods to convert those types of survival data to HR estimates involve additional assumptions about time-to-event outcomes (Tudur et al., 2001; Tierney et al., 2007) and, given the various duration of studies, could assign study weights improperly (Riley et al., 2007).

The last limitation is pertinent to AD meta-analytic reporting for intervention effect. If the proportional hazards assumption is violated in a single study, the interpretation of the overall meta-analytic HR statistics could be a problem. Moreover, it is preeminent to determine survival curves, rather than a hazard ratio, to gauge the risk of the event over time (Pocock et al., 2002). Although various methods have been developed in the literature for combining published survival curves (Begg et al., 1989; Voest et al., 1989; Shore et al., 1990; Reimold et al., 1992; Hunink & Wong, 1994; Dear, 1994; Arends et al., 2008), they were all rooted in the proportional hazards assumption except Combescure et al. (2014). That approach applies a distribution-free approach to estimating the summary survival curve for single-arm studies, but is subject to a risk of bias when multiple curves from the same study are modeled. In brief, AD meta-analytic reporting that entails comparative trials without the proportional hazards assumption remains unclear.

The main contribution of this paper is to address these challenges and present a novel AD meta-analysis framework for survival-type endpoints. Particularly, a Bayesian multilevel modeling strategy, coined “meta-analysis of reconstructed survival data” (MARS), is proposed to exploit reconstructed survival curves as a richer source of survival information. In general, survival data can be reconstructed by 1) acquiring the positions of censoring and event points on images of KM survival curves; 2) performing iterative numerical computation to convert 2-dimensional position data into time-to-censoring and time-to-events. The use of the reconstructed survival data, which can be regarded an alternative to IPD, enables several advantages similar to using IPD in both modeling and reporting. For example, instead of assuming proportional hazards, MARS can estimate the time-varying hazard rates and HRs. We note that KM curves may not be provided in every published clinical study. To conduct a meta-analysis that thoroughly covers the literature, MARS also simultaneously accommodates 2 more types of AD: HR estimates and survival rates at various time points.

Our goal is to deliver effective estimation for the comparison of efficacy of treatments, and thus expand the traditional framework for AD meta-analysis. The results of simulation study suggest that MARS is competent to provide valid estimation of a treatment effect in both cases of constant HR and time-varying HR, and capable of reporting informative summary metrics, such as survival probabilities, median survival times and restricted mean survival times (RMSTs). Moreover, MARS generates HR estimates that are consistent and have decent asymptotic efficiency with respect to IPD meta-analysis.

The outline of the paper is as follows. In Section 2, we present a data set in adult leukemia with disease-free survival (DFS) as the time-to-event outcome of interest. Our work was originally motivated by a larger AD study consisting of pediatric and adult leukemia patients with DFS endpoints. Here we apply a subset of 64 non-transplantation trials to illustrate the concept. In section 3, we introduce the MARS, which can integrate various types of survival information, and discuss the metrics to quantify treatment effects for informative reporting. The theoretical properties of MARS are investigated in section 4. Simulation studies are illustrated in section 5 for finite sample performance and an application of MARS to leukemia data set is demonstrated in section 6. Lastly, section 7 provides a discussion of our work, with suggestions for further research.

2 Case study: minimal residual disease in leukemia

Responding to the emerging development of targeted and immunological treatments, the U.S. Food and Drug Administration (FDA, 2020) published a guideline clarifying for the use of minimal residual disease (MRD) status in the development of drugs for the treatment of hematologic malignancies. An MRD-positive status means that the disease was still detectable after treatment, whereas an MRD-negative status means that no disease was detected after treatment. Potentially, the MRD status can serve as a prognostic biomarker to assess a patient’s risk of future relapse for regulatory and clinical uses.

Evidence supporting the clinical validity of MRD status as a surrogate biomarker varies across hematologic cancer types and patient populations (Rach et al., 2016; Berry et al., 2017). To gain a better understanding of the state-of-the-science of the use of MRD for specific hematologic malignancies, the FDA recommended a systematic review and meta-analysis. In particular, to ensure the effective validation of MRD to support the marketing approval of drugs and biological products, the FDA gave special meta-analytic considerations to (i) prespecified inclusion criteria to ensure comprehensive literature search of both positive and negative results, to limit selection bias, and (ii) sensitivity analyses (e.g., alternative statistical methods) to demonstrate the robustness of MRD as a surrogate marker.

Short et al. (2020) performed a quantitative overview of clinical studies in pediatric and adult patients with acute myeloid leukemia (AML) to quantify the relationships between disease-free survival (DFS) and MRD status. If a strong association can be demonstrated with robustness across studies, MRD status could be used as a critical indicator of therapeutic benefit in clinical practice to guide the clinical treatment of AML patients and accelerate the drug development by providing early evidence of treatment benefit in regulatory approval. After performing a comprehensive literature search and screening based on inclusion and exclusion criteria, we took sequential steps to extract various survival information from articles selected in the meta-analysis. First, survival data were reconstructed from the KM estimators provided in 16 studies. Second, we accepted the hazard ratios reported in the articles comparing MRD positive group to MRD negative group if the survival data cannot be reconstructed. This consists of 36 studies reported log HRs and corresponding standard errors. Finally, 12 additional studies provided only survival rates at particular time points, which were still included as a type of supplemental survival data in the meta-analysis. In total, we included 64 studies with 8,033 AML patients who did not undergo transplantation to illustrate MARS (proposed in section 3) and to respond to the FDA’s pertinent concerns.

3 Model for survival AD meta-analysis

We present a general framework for survival meta-analysis, called the “meta-analysis of reconstructed survival data” (MARS). MARS attempts to establish a one-stage Bayesian multilevel model with separate AD components, such that the most informative summary survival data available in individual studies can be properly integrated to the parameter estimation and uncertainty quantification.

In this section, we first review the general strategy to reconstruct the survival data from KM curves, and discuss the model specification to relax the proportional hazards assumption. If KM curves in a decent quality are unavailable in some clinical studies, we elicit and incorporate other types of survival data, though less informative, to complement a comprehensive AD meta-analysis and diminish selection bias. We provide the hierarchical prior distributions to complete the model specification, and finally present several alternative summary measures from MARS to better inform AD meta-analysis reporting.

3.1 Reconstructed survival data

KM curves are the richest source of information for AD meta-analysis. In the absence of IPD, it can be considered as a poor man’s IPD if the survival data are properly reconstructed. The overview of a general strategy for data extraction is as follows.

The first step in reconstructing the survival data is to digitize the KM curves using sufficiently large, clear, and high-resolution images. In electronically published articles, images are typically either ‘raster images’ or ‘vector objects’. A raster image consists of pixels, each represented by a dot or square with its own coordinates in a 2-dimensional grid. A graphical digitizing software, such as GraphClick or DigitizeIt, can be employed to convert and store the corresponding coordinates of the change points on the KM curves by clicking on the individual points using a mouse or automatic read-off (Guyot et al., 2012). Alternatively, Liu et al. (2014) proposed directly reading in the lines of the PostScript file of a vector image, rather than using a digitizing software program to read in the coordinates from a raster image, and reported an improvement with 200 times better resolution.

After the curves have been digitized, the next step is to implement a numerical algorithm to elicit survival data. Early work in this area involved dividing the KM curves into several time intervals, which gives a good representation of event rates over time, while limiting the number of events (e.g., 20%) within a given time interval Parmar et al. (1998). It was later extended and implemented in a spreadsheet calculator by Tierney et al. (2007). Using supplementary R code, Guyot et al. (2012) produced an iterative algorithm to derive a close approximation of the original individual patient time-to-event data from the published KM survival curves. Including the number of patients at risk and the number of events noted from the paper, if available, also significantly improved the approximation accuracy. Saluja et al. (2019) compared 4 data reconstruction algorithms (Guyot et al., 2012; Hoyle & Henley, 2011; Parmar et al., 1998; Williamson et al., 2002), and reported that the Guyot method was consistently superior to others.

Although the modeling of reconstructed survival data is lacking in the literature, the general ideas can be analogues to that of IPD without patient-level covariates. In IPD meta-analysis literature for survival outcomes, it was popular to apply a Poisson generalized linear model, which could be implemented with either fixed or random effects and with baseline hazard stratified by trial (Dear, 1994; Arends et al., 2008). Crowther et al. (2012) extended this model to allow non‐proportional hazards of the treatment effects.

In MARS, we assume the reconstructed survival data follows a piecewise exponential distribution in the AD framework. In theory, the piecewise exponential survival model can be equivalent to a Poisson log-linear model (Holford, 1980; Laird & Olivier, 1981). To model the piecewise exponential, we first separate the overall time scale into JJ prespecified disjointed small segments Ij=(tj1,tj]I_{j}=(t_{j-1},t_{j}] for j=1,2,,Jj=1,2,\dots,J, where 0=t0<t1<<tJ<0=t_{0}<t_{1}<\dots<t_{J}<\infty. For each patient, we create a set of pseudo-observations, one for each interval, indicating whether the individual had an event in that interval. The piecewise exponential model incorporates the data by treating the pseudo-observations as following Poisson distributions with means that are equal to the cumulative hazards in the intervals. Thus, the likelihood function of a piecewise exponential model is equivalent to the product of Poisson likelihood functions, one for each combination of an individual and an interval.

Within each interval, the baseline hazard rate and the log HR for group difference are assumed to be constant, denoted by λj\lambda_{j} and βj\beta_{j} for j=1,2,,Jj=1,2,\dots,J. We denote tJt_{J} the largest survival or censored time. If some study has an extraordinary long follow-up period compared with other studies, the reconstructed survival time could be censored by a reasonable maximum time to make the inference of the meta-analysis meaningful and robust. The hazard rate for the iith subject in the kkth study is modeled as

hk,i(t)=hk(t,Xk,i)=λjexp{αk+θk(t)Xk,i},tIj,h_{k,i}(t)=h_{k}(t,X_{k,i})=\lambda_{j}\exp{\{\alpha_{k}+\theta_{k}(t)X_{k,i}\}},\quad t\in I_{j}, (1)

where αk\alpha_{k} allows the baseline hazard rate to vary from study to study and Xk,iX_{k,i} indicates the treatment assignment. In the motivating case study (described in Section 2), Xk,iX_{k,i} is the indicator of MRD status. The study-specific log HR θk(t)\theta_{k}(t) is decomposed into the time-dependent effect βj\beta_{j} and an inter-study effect μk\mu_{k}

θk(t)=βj+μk,tIj.\theta_{k}(t)=\beta_{j}+\mu_{k},\quad t\in I_{j}. (2)

The survival function is uniquely defined given the hazard function. If the study-level covariates (moderators) ZkZ_{k} are available, it is also straightforward to accommodate the study-level covariate effects, γZk\gamma Z_{k}, into μk\mu_{k} in (2) for subgroup analysis.

3.2 Supplemental survival data

One objective of MARS is to diminish selection bias. In meta-analysis applications, studies with survival information on intervention effect may not all publish KM curves, but report alternative summary statistics. To include all available information from eligible studies in a comprehensive systematic review, we incorporate into MARS with other common types of survival data, ranked by their richness of survival information. Without ambiguity, we denote the reconstructed survival data as the type I data hereinafter.

3.2.1 Hazard ratios (type II data)

In MARS, HR estimates are synthesized as type II data if the KM curves are not available or are provided in low-resolution images, which is often the case in early publications. When the reported HRs are estimated under the proportional hazards assumption, they are presumed to be time-independent. Estimating a time-varying HR based solely on reported constant HRs is not feasible. However, given the various lengths of follow-up periods, it is natural to regard the observed constant log HR θ^k\hat{\theta}_{k} as an estimate of the weighted average of stepwise time-dependent effects βj\beta_{j} in addition to the study-specific effect μk\mu_{k},

θ^kN(θ¯k,se^(θ^k)2),\hat{\theta}_{k}\sim N(\bar{\theta}_{k},\hat{se}(\hat{\theta}_{k})^{2}),

with the HR aligned with study duration

θ¯k\displaystyle\bar{\theta}_{k} =\displaystyle= 1Tk0Tkθk(u)𝑑u\displaystyle\frac{1}{T_{k}}\int_{0}^{T_{k}}\theta_{k}(u)du
=\displaystyle= μk+1Tk{l=1j1βl|Il|+βj(Tktj1)},TkIj,\displaystyle\mu_{k}+\frac{1}{T_{k}}\{\sum_{l=1}^{j-1}\beta_{l}|I_{l}|+\beta_{j}(T_{k}-t_{j-1})\},\quad T_{k}\in I_{j},

where θk(t)\theta_{k}(t) is defined in equation (2), |Ij|=tjtj1|I_{j}|=t_{j}-t_{j-1}, and TkT_{k} is the duration of the kkth study. The corresponding likelihood provides no information on the baseline hazard function.

In general, type II survival data can consist of the HRs converted from variants of indirect statistics comparing group differences in clinical studies lacking reported HRs. A typical example is a log-rank test statistics and its p-value, from which the log HR and its variance can be reasonably approximated (Parmar et al., 1998). This allows the totality of evidence to be assessed and improves the efficiency and reliability of meta-analysis for survival-type endpoints (Tudur et al., 2001). Tierney et al. (2007) reviewed the relevant methods that carefully manipulate published or other summary data to obtain HR summary statistics, classified by the type of information presented in trial reports. The resulting converted HRs can then be conveniently used as type II data in the MARS model.

3.2.2 Survival probabilities (type III data)

Estimates of survival probabilities at particular time points (e.g., 1-year or 3-year DFS rates for individual groups) are often reported in randomized controlled clinical studies, but usually ignored and rarely used in meta-analysis even if HR was not published. Compared with HR estimates, survival rates provide limited information and thus are not considered valid surrogate measures for the meta-analysis of survival outcomes (Michiels et al., 2005). Although it can also be used to estimate HR estimates, such estimation requires assumptions in addition to the proportional hazards assumption. Nevertheless, it put an important piece to complete the information synthesis for systematic review when other types of survival data of a study are not available. For example, Berry et al. (2017) used it in a 2-step approach by first estimating the HR after assuming an exponential distribution of the time-to-event outcomes for the totality of evidence.

In MARS, the reported survival rates are treated as the realizations of the survival function derived from equation (1)

Sk(t,X)\displaystyle S_{k}(t,X) =\displaystyle= exp{0thk(u,X)𝑑u}\displaystyle\exp\left\{-\int_{0}^{t}h_{k}(u,X)du\right\} (3)
=\displaystyle= exp{l=1j1λlexp{αk+(μk+βl)X}|Il|\displaystyle\exp\{-\sum_{l=1}^{j-1}\lambda_{l}\exp\{\alpha_{k}+(\mu_{k}+\beta_{l})X\}|I_{l}|
λjexp{αk+(μk+βj)X}(ttj1)},tIj.\displaystyle-\lambda_{j}\exp\{\alpha_{k}+(\mu_{k}+\beta_{j})X\}(t-t_{j-1})\},\quad t\in I_{j}.

When survival rate S^k(t,X=x)\hat{S}_{k}(t,X=x) and its standard error se^(S^k(t,X=x))\hat{se}(\hat{S}_{k}(t,X=x)) for xxth treatment group are provided at a specific time point tt for the kkth study, we assume that logit(S^k(t,X=x))\text{logit}(\hat{S}_{k}(t,X=x)) follows a normal distribution as

logit(S^k(t,x))N(logit(Sk(t,x)),se^2(S^k(t,x))S^k(t,x)2(1S^k(t,x))2)\text{logit}(\hat{S}_{k}(t,x))\sim N\left(\text{logit}(S_{k}(t,x)),\frac{\hat{se}^{2}(\hat{S}_{k}(t,x))}{\hat{S}_{k}(t,x)^{2}(1-\hat{S}_{k}(t,x))^{2}}\right)

to reduce the bias from asymmetry for interval-ranged Sk(t,x)S_{k}(t,x) on (0,1)(0,1). S^k(t,x)=1\hat{S}_{k}(t,x)=1 or 0 has been rarely reported, as the observation would be neglected to avoid the degenerate likelihood.

3.3 Priors specification and model implementation

For the log HR estimation, βj\beta_{j} represents the time-dependent treatment effect in time interval IjI_{j} and μk\mu_{k} describes the inter-study heterogeneity. We assume that prior of μk\mu_{k} follows a normal distribution μk𝒩(μ0,σμ2)\mu_{k}\sim\mathcal{N}(\mu_{0},\sigma_{\mu}^{2}). To specify the prior for 𝜷=(β1,β2,,βJ)\boldsymbol{\beta}=(\beta_{1},\beta_{2},\dots,\beta_{J}), we rely on the multivariate normal prior:

𝜷|Σ𝜷𝒱𝒩(𝝁𝜷,Σ𝜷),\boldsymbol{\beta}|\Sigma_{\boldsymbol{\beta}}\sim\mathcal{MVN}(\boldsymbol{\mu_{\beta}},{\Sigma_{\boldsymbol{\beta}}}),

where the mean 𝝁𝜷=μβ𝟏\boldsymbol{\mu_{\beta}}=\mu_{\beta}\mathbf{1}, and the variance-covariance matrix Σ𝜷\Sigma_{\boldsymbol{\beta}} takes a first-order autoregressive structure

Σ𝜷=σβ2(1ρβρβ2ρβJ1ρβ1ρβρβJ2ρβJ1ρβJ2ρβJ31).\Sigma_{\boldsymbol{\beta}}=\sigma_{\beta}^{2}\begin{pmatrix}1&\rho_{\beta}&\rho_{\beta}^{2}&\dots&\rho_{\beta}^{J-1}\\ \rho_{\beta}&1&\rho_{\beta}&\dots&\rho_{\beta}^{J-2}\\ \dots&\dots&\dots&\dots&\dots\\ \rho_{\beta}^{J-1}&\rho_{\beta}^{J-2}&\rho_{\beta}^{J-3}&\dots&1\end{pmatrix}.

The hyperparameter ρβ\rho_{\beta} induces the association between the time segments and thereby the smoothness of the estimated piecewise HR function. For example, if ρβ=1\rho_{\beta}=1, this model implies that the time-dependent treatment effects are identical, that is, degenerate to a proportional hazards model.

The parameters λj\lambda_{j} and αk\alpha_{k} capture the baseline hazard rates and their variation among studies. Similarly, we assume a normal prior distribution for αk\alpha_{k} as αk𝒩(μα,σα2)\alpha_{k}\sim\mathcal{N}(\mu_{\alpha},\sigma_{\alpha}^{2}), and a multivariate normal distribution for (log(λ1),log(λ2),,log(λJ))(\log(\lambda_{1}),\log(\lambda_{2}),\dots,\log(\lambda_{J})) with the mean 𝝁λ=μλ𝟏\boldsymbol{\mu}_{\lambda}=\mu_{\lambda}\mathbf{1} and first-order autoregressive covariance matrix Σ𝝀\Sigma_{\boldsymbol{\lambda}} of the parameters σλ2\sigma_{\lambda}^{2} and ρλ\rho_{\lambda}. The hyperparameters follow the common choice of non-informative priors

μβ,μλ,μα,μ0\displaystyle\mu_{\beta},\mu_{\lambda},\mu_{\alpha},\mu_{0} \displaystyle\sim 𝒩(0,1000)\displaystyle\mathcal{N}(0,1000)
1/σβ2,1/σμ2,1/σλ2\displaystyle 1/\sigma_{\beta}^{2},1/\sigma_{\mu}^{2},1/\sigma_{\lambda}^{2} \displaystyle\sim Gamma(0.01,0.01)\displaystyle Gamma(0.01,0.01)
ρβ,ρλ\displaystyle\rho_{\beta},\rho_{\lambda} \displaystyle\sim Unif(0.99,0.99).\displaystyle Unif(-0.99,0.99).

For the case study, the joint posterior distributions were generated using Markov Chain Monte Carlo (MCMC) methods (Gelman et al., 2013). We used the R2jags package in statistical software R (version 3.6.1) to interface with the JAGS (Just Another Gibbs Sampler, version 4.3.0) for data analysis. More details can be found in the JAGS model specification in the Supplementary Materials.

3.4 Summary measures for meta-analytic reporting

We specify several summary measures, moving beyond the HR estimator, for statistical reporting and result dissemination using MARS. Having precise estimates of the difference in survival between treatments is extremely important, especially in oncology (Michiels et al., 2005). In conventional meta-analysis, HR estimates are routinely used to summarize differences in survival endpoints between groups (Moher et al., 2012; Higgins et al., 2020). If the underlying proportional hazards assumption is violated, the clinical interpretation of the HR estimate becomes difficult. Alternatively, owing to its flexible Bayesian model structure, MARS can generate more informative summary measurements to quantify the between-group difference with exact inference for uncertainty quantification.

Survival probabilities.        Survival probabilities are essential in survival analysis as they provide crucial summary information about the survival experiences of study populations. MARS readily yields the estimation of the overall survival function

S(t)\displaystyle S(t) =\displaystyle= S(t,X)=exp{0th(u,X)𝑑u}\displaystyle S(t,X)\ =\ \exp\left\{-\int_{0}^{t}h(u,X)du\right\} (4)
=\displaystyle= exp{l=1j1λlexp{βlX}|Il|λjexp{βjX}(ttj1)},tIj.\displaystyle\exp\{-\sum_{l=1}^{j-1}\lambda_{l}\exp\{\beta_{l}X\}|I_{l}|-\lambda_{j}\exp\{\beta_{j}X\}(t-t_{j-1})\},\quad t\in I_{j}.

The Bayesian model can deliver exact inference on parameter estimation without assuming asymptotic normality. Estimated survival probabilities can function similarly as KM estimates of the survival function so that treatment effects summarized in them are intuitive and straightforward. Using MCMC can generate the 95% credible intervals (CrIs) for uncertainty quantification. Moreover, it is appealing to generate graphical representations of meta-analytic survival curves (see, for example, Figure 3A and 3B).

Median survival time.        The median survival time defines as the amount of time tt such that S(t)=0.5,S(t)=0.5, which is equivalent to that the cumulative hazard function H(t)H(t) is H(t)=log(2).H(t)=\log(2).

In MARS, the cumulative hazard is given by a piecewise linear function

H^(t)=l=1j1λ^l|Il|+λ^j(ttj),\hat{H}(t)=\sum_{l=1}^{j-1}\hat{\lambda}_{l}|I_{l}|+\hat{\lambda}_{j}(t-t_{j}),

where tj1<ttjt_{j-1}<t\leq t_{j}. Subsequently, the median survival times tMSt_{MS} can be estimated solving of the following equation: H^(tMS)=log(2).\hat{H}(t_{MS})=\log(2).

Restricted mean survival time (RMST).        The RMST is a well-established alternative measure of the “life expectancy” from the time of treatment to a specific time horizon of interest (Royston & Parmar, 2013). The RMST is valid in survival models under any distribution, including the proportional hazards model. It is as powerful as the HR when the proportional hazards assumption is valid and even more powerful than the HR when the assumption is invalid. Another advantage of the RMST is that it primarily relies on the chosen time period so that it is less subject to follow-up time (Kim et al., 2017).

The RMST is defined as the expected value of survival time up to a pre-specified time point τ\tau. Let TT denote the time to the event and S(t)S(t) denote the corresponding survival function. The RMST equals to the area under the survival curve from baseline to τ\tau.

RMST(τ)=E(min(T,τ))=0τS(u)𝑑uRMST(\tau)=E(\min(T,\tau))=\int_{0}^{\tau}S(u)du

Because the survival curve as be approximated by KM estimates, the RMST can be nonparametrically estimated by using KM estimates of survival function (Guo & Liang, 2019).

In the proposed piecewise exponential model, the RMST(τ)RMST(\tau) can be approximated using the estimated mean survival probabilities S^\hat{S}:

RMST^(τ)=0τS^(u,x)𝑑u=i=1jS^(ti,X)S^(ti1,X)λ^iexp(Xβ),\widehat{RMST}(\tau)=\int_{0}^{\tau}\hat{S}(u,x)du=\sum_{i=1}^{j}\frac{\hat{S}(t_{i},X)-\hat{S}(t_{i-1},X)}{-\hat{\lambda}_{i}\exp{(X\beta)}},

where τ=tj\tau=t_{j} and S^(ti,X)=exp(i=1jλ^iexp(Xβ)|Ii|)\hat{S}(t_{i},X)=\exp(-\sum_{i=1}^{j}\hat{\lambda}_{i}\exp{(X\beta)}|I_{i}|). Using posterior samples from the MCMC algorithm, estimating the RMST and its standard deviation is succinct.

4 Asymptotic properties

Meta-analysis is widely used in the current big data era in all scientific areas for thorough statistical inference. As both the scale and number of individual studies increase, it is essential to understand the theoretical properties of MARS.

4.1 Consistency

To account for the heterogeneity among studies, MARS employs a hierarchical Bayesian structure in meta-analysis. Using a hierarchical Bayesian model provides consistent and accurate estimation (Rouder & Lu, 2005). Assume there are KK studies and nkn_{k} subjects in the kthk^{th} study. We denote nn as average sample size. Let 𝜷=(β1,β2,,βJ)T{\boldsymbol{\beta}}=(\beta_{1},\beta_{2},\dots,\beta_{J})^{T} and 𝜼=(η1,η2,,ηJ)T{\boldsymbol{\eta}}=(\eta_{1},\eta_{2},\dots,\eta_{J})^{T} be the parameters of piecewise log HR and piecewise log baseline hazard rates, respectively, and denote 𝜷^\hat{\boldsymbol{\beta}} and 𝜼^\hat{\boldsymbol{\eta}} as the estimates generated by MARS. We denote 𝒦1\mathcal{K}_{1}, 𝒦2\mathcal{K}_{2}, 𝒦3\mathcal{K}_{3}, and 𝒦\mathcal{K} as sets of studies that provide reconstructed survival data (type I data), HRs (type II data), and survival rates (type III data), respectively. We have the following result:

Theorem 1.

Under the regularity conditions (a)-(f) specified in Appendix A and for k𝒦1k\in\mathcal{K}_{1}, as the sample size nkn_{k}\rightarrow\infty and the number of patients in each interval nkjn_{kj} becomes larger at a constant ratio nkj/nk=cjn_{kj}/n_{k}=c_{j}, j=1,,Jj=1,\dots,J;

Then, as the total number of studies KK\rightarrow\infty and the average sample size nn\rightarrow\infty, (𝛃^𝛈^)\begin{pmatrix}\hat{\boldsymbol{\beta}}\\ \hat{\boldsymbol{\eta}}\end{pmatrix} converges to (𝛃𝛈)\begin{pmatrix}\boldsymbol{\beta}\\ \boldsymbol{\eta}\end{pmatrix} in probability.

According to Theorem 1, MARS yields consistent estimators for 𝜷\boldsymbol{\beta} and 𝜼\boldsymbol{\eta} with sufficient reconstructed survival data. As a result, the estimated summary measures, such as the overall survival rates, median survival time and RMSTs in 3.4, are consistent as well.

4.2 Relative efficiency

The relative efficiency of conventional fixed-effect and random-effect AD meta-analyses has been rigorously studied and shown to be at most as efficient as IPD meta-analysis and asymptotically equivalent (Lin & Zeng, 2010; Chen et al., 2020). Here, we extend the theoretical results and investigate the relative efficiency of MARS. Because survival rates (type III data) can only provide limited information, we are particularly interested in the efficiency of estimating the time-varying hazard ratio function by incorporating the commonly reported supplemental type II data into meta-analysis when reconstructed survival data are unavailable in some studies. Let 𝜷^\hat{\boldsymbol{\beta}}^{*} and 𝜷^IPD\hat{\boldsymbol{\beta}}^{*}_{IPD} be the estimates generated from MARS and IPD meta-analysis using type I and II data, respectively.

Theorem 2.

Under the regularity conditions (a)-(f) specified in Appendix A, we have the following properties that

  1. 1.

    for any fixed KK and nn, Var(𝜷^)Var(𝜷^IPD)Var(\hat{\boldsymbol{\beta}}^{*})\geq Var(\hat{\boldsymbol{\beta}}^{*}_{IPD});

  2. 2.

    if Var(𝜷^k)Var(\hat{\boldsymbol{\beta}}_{k}) coincides with the variance of MLE of 𝜷k\boldsymbol{\beta}_{k} for k𝒦2k\in\mathcal{K}_{2}, as K,nK,n\rightarrow\infty and Kn1/20Kn^{-1/2}\rightarrow 0, the asymptotic variances satisfy

    limn,KKn1/20Var(𝜷^)=limn,KKn1/20Var(𝜷^IPD).\lim_{\begin{subarray}{c}n\to\infty,K\to\infty\\ Kn^{-1/2}\rightarrow 0\end{subarray}}Var(\hat{\boldsymbol{\beta}}^{*})=\lim_{\begin{subarray}{c}n\to\infty,K\to\infty\\ Kn^{-1/2}\rightarrow 0\end{subarray}}Var(\hat{\boldsymbol{\beta}}^{*}_{IPD}).

According to the first inequality, Var(𝜷^)Var(𝜷^IPD)Var(\hat{\boldsymbol{\beta}}^{*})-Var(\hat{\boldsymbol{\beta}}^{*}_{IPD}) is positive semidefinite, indicating that the efficiency of MARS is at most the same as that of IPD meta-analysis. The second statement implies that asymptotically, MARS can achieve the full efficiency when the sample size increases at a higher order rate than the study number. Note that the condition of variance for type II data is met when the hazard ratio is constant. In particular, it holds even if hazard ratio is constant within the period of study duration TkT_{k} and becomes time-varying. This might not be practical. Thus, we encounter acceptable efficiency loss in reality.

Corollary.

Let 𝛃^\hat{\boldsymbol{\beta}} denote the estimates generated from MARS by using all 3 types of data, for fixed KK and nn, Var(𝛃^)Var(𝛃^).Var(\hat{\boldsymbol{\beta}})\leq Var(\hat{\boldsymbol{\beta}}^{*}).

The corollary indicates that for a meta-analysis with finite number of studies, including studies that only reported type III data increases efficiency.

5 Simulation

We empirically evaluated the finite sample performance of the proposed meta-analysis method in 2 simulation studies. First, we compared the MARS to conventional AD meta-analysis method and IPD meta-analysis in a setting in which the proportional hazards assumption is valid (Case 1). Second, we assessed the validity of MARS and its relative efficiency as compared to IPD meta-analysis when the proportional hazard assumption is invalid but two survival curves cross each other (Case 2). The performances of the methods are compared based on 1,000 replications.

5.1 Simulation setting

Case 1.        In each replication, we assumed 13 eligible studies and proportional hazards over time. The random effects for each study were generated as follows:

αk𝒩(0,0.01),μk𝒩(0,0.1)\alpha_{k}\sim\mathcal{N}(0,0.01),\mu_{k}\sim\mathcal{N}(0,0.1)

where k=1,2,,13k=1,2,\dots,13 and XX’s are independent binary variables generated from a Bernoulli distribution with a rate of 0.55. Given αk\alpha_{k} and μk\mu_{k}, the survival times in the kthk^{th} study were independently generated from the hazard function

hk(t)=h0(t)exp(αk+(μk+β)X)h_{k}(t)=h_{0}(t)\exp(\alpha_{k}+(\mu_{k}+\beta)X)

where h0(t)h_{0}(t) is a Weibull distribution with shape 0.1 and scale 0.5 and β=0.6\beta=-0.6 (exp(β)0.55)(\exp(\beta)\approx 0.55) is the time independent effect. We assumed that survival times were right-censored according to an exponential distribution with a censoring rate of 0.01. The sample size for each study was randomly sampled from 50 to 150. The follow-up time for each study was randomly sampled from 80 to 120 months, at multiples of 5 months only. We equally partitioned the 120 months into 10 segments, each 12 months long.

To mimic the real-world scenario that HRs may not be reported in every clinical study, we assumed that 6 studies reported KM plots, 6 studies reported HR (including 3 studies with KM plots and 3 without), and 4 studies reported survival rates. For 3 studies that had both KM data and HR estimates, only the KM data were used for MARS, and only the HR estimates were used for AD random-effects model. Therefore, we applied MARS to 6 studies of type I data, 3 studies of type II data, and 4 studies of type III data. In contrast, in the conventional AD meta-analysis, 6 studies were included to estimate the HR. Original survival data from all 13 studies were included in an IPD meta-analysis as a gold standard for comparison.

We computed the log HRs using MARS, IPD meta-analysis and the conventional AD meta-analysis. For MARS and IPD meta-analysis, we also computed survival probabilities, median survival times and RMSTs at 1, 3, 5, and 10 years for each group.

Case 2.        In each replication, we included 20 studies, for which αk\alpha_{k}, μk\mu_{k} and XX were generated the same as in case 1. The survival times in each study were generated from the hazard function

hk(t)=0.01×exp(αk+(μk+β(t))X)h_{k}(t)=0.01\times\exp(\alpha_{k}+(\mu_{k}+\beta(t))X)

where the log HR β(t)=10.2(t/301)\beta(t)=1-0.2^{(t/30-1)} is time-varying and for simplicity, the baseline hazard rate was constant over time. The censoring mechanism and the follow-up time were the same as those in case 1.

We assumed that 9 studies reported KM plots, 12 studies reported HR (including 5 studies with KM plots), and 4 studies reported survival rates. Thus, in MARS, we accounted for 9 studies of type I data, 7 studies of type II data, and 4 studies of type III data. In contrast, in the conventional AD meta-analysis, 12 studies were included to estimate the HR. Data from all 20 studies were included in an IPD meta-analysis.

As we did in case 1, we calculated survival rates, median survival times, and RMSTs at 1, 3, 5 and 10 years. Alternative to the point estimate of log HR, which is meaningless when the HR is not constant, we computed the average of the squared error between the log HR function β(t)\beta(t) and the estimate log HR function β^(t)\hat{\beta}(t) over time, where β^(t)\hat{\beta}(t) would be a stepwise function in MARS.

5.2 Results

Table 1: Comparison of HR estimates. Abbreviations: CP, coverage probability of 95% confidence interval (CI) or 95% credible interval (CrI).
log HR
Mean (SD) MSE CP Length of CrIs
Case 1: proportional hazards MARS -0.63 (0.10) 2.66 0.98 0.48
IPD -0.64 (0.10) 2.70 0.95 0.49
AD -0.59 (0.12) 2.55 0.96 0.46
Case 2: time-varying hazards MARS 0.19 (0.10) 0.24
IPD 0.23 (0.09) 0.19
AD 0.05 (0.08) 1.48

Case 1.        Figure 1 shows the performance of MARS in which the proportional hazard assumption holds. The piecewise HRs estimated by MARS were approximately constant (Figure 1A), highlighting that MARS is able to identify a proportional hazards situation when the HR is actually constant. Thus, we compared the average piecewise HR estimates generated by MARS with the HR estimates generated by AD meta-analysis and IPD meta-analysis (Table 1). Conventional AD meta-analysis using the available HR estimates from individual studies yielded the log HR estimates closest to the true value of -0.6. Although MARS generated a point estimate slightly smaller than the true value, the coverage probability of the 95% CrIs was non-inferior to those of the other 2 methods.

Refer to caption
Figure 1: Comparison of the performances of MARS and IPD meta-analysis in the simulation study under proportional hazard. (A) Estimated piecewise HRs; (C) Differences in RMSTs; (B) and (D) Survival probabilities of group X=0X=0 and group X=1X=1, respectively.

The estimated survival probabilities of both groupsgroup X=0 and group X=1 are shown in Figure 1B and 1D, respectively. MARS provided similar point estimates with slightly larger variability. Specifically, the true survival rates of the X=0X=0 group at 3, 5, and 10 years were 0.55, 0.46 and 0.33, respectively (table 2). With MARS, the 95% CrIs were more likely to cover the true survival rates at 3 years, 5 years and 10 years, whereas the IPD meta-analysis provided a more precise estimate of those survival rates.

Table 2: Comparison of SR, MS and RMST. Abbreviations: CP, coverage probabitility of 95% CrI; SR, survival rate; MS, median survival time.
[Uncaptioned image]

Regarding the median survival time and RMST (Figure 1C and Table 2), MARS had similar operating characteristics to those of the IPD analysis, and a higher coverage probability due to wider 95% CrIs length. The mean relative efficiency of MARS to IPD meta-analysis was 1.22, indicating that MARS had reasonable efficiency loss compared with IPD meta-analysis.

Refer to caption
Figure 2: Comparison of the performances of MARS and IPD meta-analysis in the simulation study when non proportional hazard. (A)Estimated piecewise HRs. (C) Differences in RMST. (B and D) Survival probabilities of group X=0X=0 and group X=1X=1, respectively.

Case 2.        Figure 2 demonstrates the performance of MARS using 3 types of data to provide estimates of various summary measures when the proportional hazards assumption is violated. The average of squared error of MARS (0.24) was higher than that of the IPD meta-analysis but much smaller than that of conventional AD meta-analysis, which is inappropriate in this situation (Table 1).

MARS had good performance in estimating survival probabilities (Figure 2B and 2D). The survival rates MARS generated for both groups were almost identical to those generated from IPD meta-analysis. In long-term follow-up, both methods slightly over-estimated survival rates for group X=1X=1. This can be explained by the shrinkage effect in Bayesian estimation of time-varying HR function 2A). Even though the HRs were over-estimated in early years, it effected less to the estimated survival rates due to the low hazard rates in that period. The 95% CrIs generated with MARS were a little wider, but also had a higher probability to cover the true values at all time points (Table 2).

MARS was also able to give valid estimates of median survival time and RMSTs at 5 years with minor deviations from the true values (Table 2).

6 Application

In our application of MARS to assess the MRD effects in AML dataset, we used time segments of 0-6 months and every 6 months of follow-up thereafter to a maximum of 11 years (22 intervals total).

The meta-analysis survival curves for DFS are shown in the figure 3A, representing the survival probabilities over years. Generally, the DFS was better for subjects who showed MRD negative than those who had positive MRD status. The estimated DFS at 5 years were 65% (95% CrI, 59%-70%) for MRD negative group and 25% (95% CrI, 19%-32%) for MRD positive group. Figure 3B shows the survival curves for DFS synthesizing only reconstructed survival data (type I data) of 16 studies. Survival probabilities of both groups in figure 3B are lower than those in figure 3A, and the 95% credible interval (CrI) was significantly narrower in figure 3B than CrI in figure 3A. Both discrepancies in point estimates and variability reflect the selection bias and extractor bias resulting from using only type 1 data and consequently suggest to conduct complete data analysis for heterogeneity estimation rather than using partial data, especially when the total number of studies included for meta-analysis is not large. The median survival time for MRD negative group is unavailable as the estimated survival probability at the end of study is about 0.56, while the median survival time for MRD positive group is around 1.08 years.

Refer to caption
Figure 3: Estimated DFS curves, stratified by MRD status, and estimated RMSTs. (A) Survival probabilities, with 95% CrIs, estimated using all 3 types of survival data. (B) Survival probabilities, with 95% CrIs, estimated using only reconstructed survival data. (C) Estimated RMSTs and corresponding 95% CrIs for MRD-negative patients and MRD-positive patients at various times.

The RMSTs of both groups at 1, 3, 5, and 10 years are shown in Figure 3C. Patients who were MRD-negative had a longer RMST than patients who were MRD-positive. The estimated RMST at 5 years was 3.73 years (95% CrI, 3.51-3.93) for MRD negative group, while that for MRD positive group was 2.06 years (95% CrI, 1.79-2.35).

7 Discussion

In this work, we present MARS, a novel AD meta-analytic approach to exploit reconstructed survival data as a valuable source of information. Reconstructed survival data, which can be deemed as an AD alternative to IPD, enables theoretical advantages similar to using IPD in both meta-analytic modeling and reporting. To systematically assess previous research studies, we develop a Bayesian multilevel modeling framework consolidating 3 regular types of survival information: 1) the reconstructed survival data elicited from published Kaplan-Meier curves, 2) log HR estimates, and 3) survival rates at specific time points. For the estimation of the HR function and survival probabilities, the piecewise exponential functions are employed with AR(1) correlations to model time-varying effects, the study effects are intrinsically weighted by the length of follow-up study duration, and MCMC allows exact inference for point estimation and uncertainty quantification. We establish the theoretical large sample properties, including consistency and relative efficiency, and provide conditions under which the MARS may coincide in efficiency with the IPD analysis. In finite sample, the MARS demonstrates comparable performance to IPD meta-analysis through simulation studies, under both the constant and time-varying hazard ratio scenarios. As suggested by the motivating MRD example, this research endeavor can congruously promote clinical research and assist regulatory decisions.

Meta-analysis modeling approaches for continuous and binary outcomes are relatively well developed, but less work has been done with time‐to‐event outcomes (Freeman & Carpenter, 2017). Given the high prevalence of clinical studies in which the proportional hazards assumption is violated, the validity of AD meta-analysis is often in question. The considerable popularity in meta-analysis of survival endpoints suggests the critical need for a modeling approach that can relax proportional hazards assumption (Rulli et al., 2018).

The development of the MARS method contributes to the meta-analysis literature in several ways. In particular, MARS relaxes the proportional hazards assumption in individual clinical studies, which is a fundamental restriction in the meta-analysis using HR estimates (Higgins et al., 2020). To derive valid inference for a given population, MARS comprehensively utilizes the most evidence available in the literature, and thereby mitigates the selection bias by including studies that have partial survival information. Sensible summary measurements can be derived conveniently to enrich AD meta-analytic reporting for time-to-event endpoints. For example, survival probabilities can be generated for visual presentation to inform the risk of study group over time. Alternative statistics, such as survival probabilities and RMSTs, can be derived to better inform meta-analysis reporting especially if the proportional hazard assumption is not met. In summary, by addressing various limitations of current approaches, the MARS method effectively extends the classical framework of AD meta-analysis for survival endpoints.

Only 25% of published IPD meta-analyses have had access to all IPD (Nevitt et al., 2017). A generalization of the MARS method can be applied to combine IPD and AD data for survival endpoints (Riley et al., 2007), or process information together with single-arm trials or multiple treatments to incorporate real-world evidence (Begg & Pilote, 1991; Zhang et al., 2019). Additionally, one future methodological exploration is to employ alternative strategies for estimating time-varying effects. Jansen (2011); Jansen & Cope (2012) used fractional polynomials (Royston & Altman, 1994) to model the time-varying hazard functions in a two‐step IPD analysis of time‐to‐event data. Fractional polynomials are a parsimonious alternative to regular polynomials and provide flexible parameterization for continuous variables. However, the shape of a fractional polynomial may result in unexpected end effects which may jeopardize the overall accuracy. Freeman & Carpenter (2017) applied the Royston‐Parmar model with the baseline log‐cumulative hazard modeled by restricted cubic splines. However, their formulation failed to guarantee the monotonicity of the cumulative hazard function. Possibly, one solution is to adopt the extended multiresolution hazard model (Hagar et al., 2017) for reconstructed survival data.

SUPPLEMENTARY MATERIAL

Title:

Supplementary Materials for ”Evidence synthesis with reconstructed survival data”. (PDF file)

References

  • (1)
  • Arends et al. (2008) Arends, L. R., Hunink, M. M. & Stijnen, T. (2008), ‘Meta-analysis of summary survival curve data’, Statistics in medicine 27(22), 4381–4396.
  • Batson et al. (2016) Batson, S., Greenall, G. & Hudson, P. (2016), ‘Review of the reporting of survival analyses within randomised controlled trials and the implications for meta-analysis’, PLoS One 11(5), e0154870.
  • Begg & Pilote (1991) Begg, C. B. & Pilote, L. (1991), ‘A model for incorporating historical controls into a meta-analysis’, Biometrics pp. 899–906.
  • Begg et al. (1989) Begg, C. B., Pilote, L. & McGlave, P. B. (1989), ‘Bone marrow transplantation versus chemotherapy in acute non-lymphocytic leukemia: a meta-analytic review’, European Journal of Cancer and Clinical Oncology 25(11), 1519–1523.
  • Berry et al. (2017) Berry, D. A., Zhou, S., Higley, H., Mukundan, L., Fu, S., Reaman, G. H., Wood, B. L., Kelloff, G. J., Jessup, J. M. & Radich, J. P. (2017), ‘Association of minimal residual disease with clinical outcome in pediatric and adult acute lymphoblastic leukemia: a meta-analysis’, JAMA oncology 3(7), e170580–e170580.
  • Chen et al. (2020) Chen, D.-G., Liu, D., Min, X. & Zhang, H. (2020), ‘Relative efficiency of using summary versus individual data in random-effects meta-analysis’, Biometrics 76(4), 1319–1329.
  • Combescure et al. (2014) Combescure, C., Foucher, Y. & Jackson, D. (2014), ‘Meta-analysis of single-arm survival studies: a distribution-free approach for estimating summary survival curves with random effects’, Statistics in medicine 33(15), 2521–2537.
  • Crowther et al. (2012) Crowther, M. J., Riley, R. D., Staessen, J. A., Wang, J., Gueyffier, F. & Lambert, P. C. (2012), ‘Individual patient data meta-analysis of survival data using poisson regression models’, BMC Medical Research Methodology 12(1), 34.
  • Dear (1994) Dear, K. B. (1994), ‘Iterative generalized least squares for meta-analysis of survival data at multiple times’, Biometrics pp. 989–1002.
  • FDA (2018) FDA (2018), Guidance for Industry: Meta-Analyses of Randomized, Controlled, Clinical Trials (RCTs) to Evaluate the Safety of Human Drugs or Biologic Products, Technical report.
  • FDA (2020) FDA (2020), Guidance for Industry: Hematologic Malignancies: Regulatory Considerations for Use of Minimal Residual Disease in Development of Drug and Biological Products for Treatment, Technical report.
  • Freeman & Carpenter (2017) Freeman, S. C. & Carpenter, J. R. (2017), ‘Bayesian one-step ipd network meta-analysis of time-to-event data using royston-parmar models’, Research synthesis methods 8(4), 451–464.
  • Gelman et al. (2013) Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. & Rubin, D. B. (2013), Bayesian data analysis, CRC press.
  • Guo & Liang (2019) Guo, C. & Liang, Y. (2019), Analyzing restricted mean survival time using sas/stat, in ‘SAS global forum proceedings Paper SAS3013-2019, SAS Institute Inc’.
  • Guyot et al. (2012) Guyot, P., Ades, A., Ouwens, M. J. & Welton, N. J. (2012), ‘Enhanced secondary analysis of survival data: reconstructing the data from published kaplan-meier survival curves’, BMC medical research methodology 12(1), 9.
  • Hagar et al. (2017) Hagar, Y., Dignam, J. J. & Dukic, V. (2017), ‘Flexible modeling of the hazard rate and treatment effects in long-term survival studies’, Statistical methods in medical research 26(5), 2455–2480.
  • Higgins et al. (2020) Higgins, J. P., Thomas, J., Chandler, J., Cumpston, M., Li, T., Page, M. J. & Welch, V. A. (2020), Cochrane handbook for systematic reviews of interventions, John Wiley & Sons.
  • Holford (1980) Holford, T. R. (1980), ‘The analysis of rates and of survivorship using log-linear models’, Biometrics pp. 299–305.
  • Hoyle & Henley (2011) Hoyle, M. W. & Henley, W. (2011), ‘Improved curve fits to summary survival data: application to economic evaluation of health technologies’, BMC medical research methodology 11(1), 139.
  • Hunink & Wong (1994) Hunink, M. G. & Wong, J. B. (1994), ‘Meta-analysis of failure-time data with adjustment for covariates’, Medical Decision Making 14(1), 59–70.
  • Jansen (2011) Jansen, J. P. (2011), ‘Network meta-analysis of survival data with fractional polynomials’, BMC medical research methodology 11(1), 1–14.
  • Jansen & Cope (2012) Jansen, J. P. & Cope, S. (2012), ‘Meta-regression models to address heterogeneity and inconsistency in network meta-analysis of survival outcomes’, BMC medical research methodology 12(1), 152.
  • Kim et al. (2017) Kim, D. H., Uno, H. & Wei, L.-J. (2017), ‘Restricted mean survival time as a measure to interpret clinical trial results’, JAMA cardiology 2(11), 1179–1180.
  • Kristiansen (2012) Kristiansen, I. (2012), ‘Prm39 survival curve convergences and crossing: a threat to validity of meta-analysis?’, Value in health 15(7), A652.
  • Laird & Olivier (1981) Laird, N. & Olivier, D. (1981), ‘Covariance analysis of censored survival data using log-linear analysis techniques’, Journal of the American Statistical Association 76(374), 231–240.
  • Lin & Zeng (2010) Lin, D.-Y. & Zeng, D. (2010), ‘On the relative efficiency of using summary statistics versus individual-level data in meta-analysis’, Biometrika 97(2), 321–332.
  • Liu et al. (2014) Liu, Z., Rich, B. & Hanley, J. A. (2014), ‘Recovering the raw data behind a non-parametric survival curve’, Systematic reviews 3(1), 151.
  • Marciniak (2017) Marciniak, T. A. (2017), ‘There are no published meta-analyses with all individual patient data’, bmj 357.
  • Michiels et al. (2005) Michiels, S., Piedbois, P., Burdett, S., Syz, N., Stewart, L. & Pignon, J.-P. (2005), ‘Meta-analysis when only the median survival times are known: a comparison with individual patient data results’, International journal of technology assessment in health care 21(1), 119.
  • Moher et al. (2012) Moher, D., Hopewell, S., Schulz, K. F., Montori, V., Gøtzsche, P. C., Devereaux, P., Elbourne, D., Egger, M. & Altman, D. G. (2012), ‘Consort 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials’, International journal of surgery 10(1), 28–55.
  • Nevitt et al. (2017) Nevitt, S. J., Marson, A. G., Davie, B., Reynolds, S., Williams, L. & Smith, C. T. (2017), ‘Exploring changes over time and characteristics associated with data retrieval across individual participant data meta-analyses: systematic review’, bmj 357.
  • Parmar et al. (1998) Parmar, M. K., Torri, V. & Stewart, L. (1998), ‘Extracting summary statistics to perform meta-analyses of the published literature for survival endpoints’, Statistics in medicine 17(24), 2815–2834.
  • Pocock et al. (2002) Pocock, S. J., Clayton, T. C. & Altman, D. G. (2002), ‘Survival plots of time-to-event outcomes in clinical trials: good practice and pitfalls’, The Lancet 359(9318), 1686–1689.
  • Rach et al. (2016) Rach, E. A., Kalra, S., Williams, T. & Hung, H.-L. (2016), ‘Regulatory informatics reveals minimal residual disease trends in hematologic malignancies’, Therapeutic innovation & regulatory science 50(3), 319–329.
  • Reimold et al. (1992) Reimold, S. C., Chalmers, T. C., Berlin, J. A. & Antman, E. M. (1992), ‘Assessment of the efficacy and safety of antiarrhythmic therapy for chronic atrial fibrillation: observations on the role of trial design and implications of drug-related mortality’, American heart journal 124(4), 924–932.
  • Riley et al. (2010) Riley, R. D., Lambert, P. C. & Abo-Zaid, G. (2010), ‘Meta-analysis of individual participant data: rationale, conduct, and reporting’, Bmj 340, c221.
  • Riley et al. (2007) Riley, R. D., Simmonds, M. C. & Look, M. P. (2007), ‘Evidence synthesis combining individual patient data and aggregate data: a systematic review identified current practice and possible methods’, Journal of clinical epidemiology 60(5), 431–e1.
  • Rouder & Lu (2005) Rouder, J. N. & Lu, J. (2005), ‘An introduction to bayesian hierarchical models with an application in the theory of signal detection’, Psychonomic bulletin & review 12(4), 573–604.
  • Royston & Altman (1994) Royston, P. & Altman, D. G. (1994), ‘Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling’, Journal of the Royal Statistical Society: Series C (Applied Statistics) 43(3), 429–453.
  • Royston & Parmar (2013) Royston, P. & Parmar, M. K. (2013), ‘Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome’, BMC medical research methodology 13(1), 1–15.
  • Rulli et al. (2018) Rulli, E., Ghilotti, F., Biagioli, E., Porcu, L., Marabese, M., D’Incalci, M., Bellocco, R. & Torri, V. (2018), ‘Assessment of proportional hazard assumption in aggregate data: a systematic review on statistical methodology in clinical trials using time-to-event endpoint’, British journal of cancer 119(12), 1456–1463.
  • Saluja et al. (2019) Saluja, R., Cheng, S., delos Santos, K. A. & Chan, K. K. (2019), ‘Estimating hazard ratios from published kaplan-meier survival curves: A methods validation study’, Research synthesis methods 10(3), 465–475.
  • Shore et al. (1990) Shore, T., Weinerman, B. & Nelson, N. (1990), ‘A meta-analysis of stages i and ii hodgkin’s disease’, Cancer 65(5), 1155–1160.
  • Short et al. (2020) Short, N. J., Zhou, S., Fu, C., Berry, D. A., Walter, R. B., Freeman, S. D., Hourigan, C. S., Huang, X., Gonzalez, G. N., Hwang, H. et al. (2020), ‘Association of measurable residual disease with survival outcomes in patients with acute myeloid leukemia: a systematic review and meta-analysis’, JAMA oncology 6(12), 1890–1899.
  • Stewart & Clarke (1995) Stewart, L. A. & Clarke, M. (1995), ‘Practical methodology of meta-analyses (overviews) using updated individual patient data. cochrane working group’, Statistics in medicine 14(19), 2057–2079.
  • Tierney et al. (2015) Tierney, J. F., Pignon, J.-P., Gueffyier, F., Clarke, M., Askie, L., Vale, C. L., Burdett, S., Alderson, P., Askie, L., Bennett, D. et al. (2015), ‘How individual participant data meta-analyses have influenced trial design, conduct, and analysis’, Journal of clinical epidemiology 68(11), 1325–1335.
  • Tierney et al. (2007) Tierney, J. F., Stewart, L. A., Ghersi, D., Burdett, S. & Sydes, M. R. (2007), ‘Practical methods for incorporating summary time-to-event data into meta-analysis’, Trials 8(1), 16.
  • Tudur et al. (2001) Tudur, C., Williamson, P. R., Khan, S. & Best, L. Y. (2001), ‘The value of the aggregate data approach in meta-analysis with time-to-event outcomes’, Journal of the Royal Statistical Society: Series A (Statistics in Society) 164(2), 357–370.
  • Voest et al. (1989) Voest, E. E., van Houwelingen, J. C. & Neijt, J. P. (1989), ‘A meta-analysis of prognostic factors in advanced ovarian cancer with median survival and overall survival (measured with the log (relative risk)) as main objectives’, European Journal of Cancer and Clinical Oncology 25(4), 711–720.
  • Williamson et al. (2002) Williamson, P. R., Smith, C. T., Hutton, J. L. & Marson, A. G. (2002), ‘Aggregate data meta-analysis with time-to-event outcomes’, Statistics in medicine 21(22), 3337–3351.
  • Zhang et al. (2019) Zhang, J., Ko, C.-W., Nie, L., Chen, Y. & Tiwari, R. (2019), ‘Bayesian hierarchical methods for meta-analysis combining randomized-controlled and single-arm studies’, Statistical methods in medical research 28(5), 1293–1310.