This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Sensitivity Analysis of Causal Treatment Effect Estimation for Clustered Observational Data with Unmeasured Confounding

Yang Ou1    Lu Tang1    Chung-Chou H. Chang1,2
1Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania 2Department of Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania
Abstract

Identifying causal treatment (or exposure) effects in observational studies requires the data to satisfy the unconfoundedness assumption which is not testable using the observed data. With sensitivity analysis, one can determine how the conclusions might change if assumptions are violated to a certain degree. In this paper, we propose a new technique for sensitivity analysis applicable to clusters observational data with a normally distributed or binary outcome. The proposed methods aim to assess the robustness of estimated treatment effects in a single study as well as in multiple studies, i.e., meta-analysis, against unmeasured confounders. Simulations with various underlying scenarios were conducted to assess the performance of our methods. Unlike other existing sensitivity analysis methods, our methods have no restrictive assumptions on the number of unmeasured confounders or on the relationship between measured and unmeasured confounders, and do not exclude possible interactions between measured confounders and the treatment. Our methods are easy to implement using standard statistical software packages. Sensitivity analysis; Unmeasured confounding; Bias factor; Clustered observational data; Conditional average treatment effect; Mixed model

1 Introduction

When attempting to estimate the causal effect of an experimental treatment, a risk exposure, or a new policy, researchers often prefer the approach of randomized experimental design. Through randomization, patients who are assigned to different treatment arms are comparable in baseline factors thus potential biases can be avoided. However, when it is not feasible to conduct randomized experiments because of ethical, financial, or other concerns, observational studies are brought into play to estimate the casual treatment effect. To identify the average treatment effects (ATE) and the conditional average treatment effects (CATE) from observational data, further assumptions must be made. The no unmeasured confounder assumption, which is also referred to as the exchangeability assumption, is one of the assumptions needed to identify ATE and CATE from observational data which asserts that the confounding mechanism is determined by observed covariates only. One must address the concern of bias caused by the violation of the exchangeability assumption. The bias introduced by unmeasured treatment-outcome confounding will threaten the usefulness of the statistical inferences that rely on this assumption. For example, Willett (1990) claimed dietary carotenoids reduce the risk of lung cancer using observational data, while a randomized clinical trial conducted later (The Alpha-Tocopherol Beta Carotene Cancer Prevention Study Group, 1994) showed that the group that received carotene has a higher incidence of lung cancer than those that did not receive carotene. The contradictory conclusions are most likely because that the observational study was unable to adjust for unmeasured confounders. For example, people taking dietary carotenoids tend to be healthier due to social and behavioral factors that were beneficial to them. However, adjusting for measured confounders only was not enough to adjust for the difference of health condition between groups taking and not taking carotenoids. Similar contradictory conclusions have been seen in observational and randomized studies analyzing the relationship between antioxidant vitamins and cardiovascular disease (Stampfer et al., 1993, Gaetano and Collaborative Group of the Primary Prevention Project, 2001, Lawlor et al., 2004), which could also be results of violation of the exchangeability assumption.

To deal with unmeasured confounding, researchers propose to use instrumental variable analysis with an exogenous variable, i.e., an instrument that is not directly related to the outcome yet can affect the choice of treatments. This method uses an approximated randomization to form comparison groups of similar characteristics including both measured and unmeasured confounders.

The difference-in-differences (DID) approach can be used for estimating ATE in the presence of unobserved confounding when pre- and post-treatment outcome measurements are available. This approach requires that the association between the unobserved confounder and the outcome is equal in the two treatment arms and that it is a constant over time (Sofer et al., 2016).

Another way to account for unmeasured confounder is through the use of a proxy variable. Tchetgen Tchetgen et al. (2020) introduced negative controls which are auxiliary variables not causally associated with the treatment or outcome, and a generalization of negative controls, namely the proximal causal learning framework, that offers an opportunity to learn causal effects in settings which the assumption of no unmeasured confounding fails.

Among all methods dealing with unmeasured confounders in observational studies, the most common approach is using sensitivity analysis to assess the robustness of causal effect estimation and inference against unmeasured confounders, i.e., how much the estimation will be affected by unmeasured confounders. Sensitivity analysis methods for various models and data structures have been proposed in Lin et al. (1998), Rosenbaum and Rubin (1983), Imbens (2003), Marcus (1997), and Robins et al. (2000). These methods are applicable under additional restrictions, for examples, only for a single binary type of unmeasured confounder (Rosenbaum and Rubin, 1983); or assuming conditional independence that requires the measured and unmeasured confounders to be independent conditional on the treatment (Lin et al., 1998). VanderWeele and Ding (2017) and Ding and VanderWeele (2017) proposed a sensitivity method to estimate the E-value for binary outcome, which is the minimum risk ratio a unmeasured confounder must have with both treatment and outcome to explain away the confounded treatment effect. Mathur and VanderWeele (2019b) extended the idea by developing a sensitivity analysis method that is applicable to multiple studies (meta-analysis). To estimate the ATE on a continuous outcome, VanderWeele and Ding (2017) approximated the difference in the mean continuous outcomes between two treatment arms to a risk ratio then applied the E-value on this approximated risk ratio under additional assumptions.

In this study, we propose a general sensitivity analysis technique for normally distributed and binary outcomes without making distributional assumptions for the measured and unmeasured confounders. We successfully incorporated data from clustered observational designs (e.g., hierarchical or longitudinal) without additional restrictions on the number of unmeasured confounders or the interaction effects between treatment and measured confounders. We also extended the idea and developed a method for multiple studies (meta-analysis).

The rest of the paper is organized as follows. Section 2 introduces the concept of our proposed methods, including cases for single and multiple studies. Section 3 shows the interpretation of our methods. Section 4 provides the assessment of our proposed methods under several simulated scenarios. We conclude our findings and provide final comments in Section 5.

2 Methods

2.1 Single Study - Continuous Outcome

Consider a binary treatment variable AA with value 1 for the treated and 0 for the control, and a continuous outcome variable YY. Denote X as a set of measured confounders, and U as a set of unmeasured confounders. Without loss of generality, we assume one measured confounder and one unmeasured confounder. Let YA=aY^{A=a} (or YaY^{a} ) be the counterfactual or potential outcome, which is the outcome that would have been observed if individual had been given treatment aa. Denote YY as the observed outcome that corresponds to the treatment value actually experienced.

Individual causal effects are defined as the difference between potential outcomes. Because only one of those outcomes is observed for each individual, individual effects cannot be identified. The average treatment effects (ATE) and conditional average treatment effects (CATE) were proposed and can be estimated from the observed data when additional assumptions hold. The causal ATE of AA on YY is the difference between the expectation of the potential outcomes in the population while the conditional causal ATE is the ATE in any subpopulation.

ATE=𝔼(YA=1)𝔼(YA=0),\displaystyle\text{ATE}=\mathbb{E}\left(Y^{A=1}\right)-\mathbb{E}\left(Y^{A=0}\right),
CATE=𝔼(YA=1X=x,U=u)𝔼(YA=0X=x,U=u).\displaystyle\text{CATE}=\mathbb{E}\left(Y^{A=1}\mid X=x,U=u\right)-\mathbb{E}\left(Y^{A=0}\mid X=x,U=u\right).

To estimate the ATE or CATE, the following causal assumptions are necessary. The stable unit treatment value assumption (SUTVA) indicates that treatment assignment of one subject does not affect the outcome of another subject. The consistency assumption requires that the potential outcome under treatment A=aA=a, YA=aY^{A=a}, is equal to the observed outcome if treatment A=aA=a is actually received, i.e., Ya=Y if A=a, for all a.Y^{a}=Y\text{ if }A=a,\text{ for all }a. The positivity assumption states that treatment assignment is not deterministic for every set of values of XX and UU, i.e., 0<P(A=aX=x,U=u)<1 for all ax and u.0<P(A=a\mid X=x,U=u)<1\text{ for all }a\text{, }x\text{ and }u. The exchangeability assumption requires that given XX and UU, treatment assignment is independent of the potential outcomes, i.e., YaAX,UY^{a}\perp A\mid X,U for all aa.

Under these assumptions, ATE and CATE can be estimated from the data collected from an observational study:

𝔼(YA=1X=x,U=u)𝔼(YA=0X=x,U=u)=𝔼(YA=1A=1,X=x,U=u)𝔼(YA=0A=0,X=x,U=u)=𝔼(YA=1,X=x,U=u)𝔼(YA=0,X=x,U=u).\begin{split}&\mathbb{E}\left(Y^{A=1}\mid X=x,U=u\right)-\mathbb{E}\left(Y^{A=0}\mid X=x,U=u\right)\\ &=\mathbb{E}\left(Y^{A=1}\mid A=1,X=x,U=u\right)-\mathbb{E}\left(Y^{A=0}\mid A=0,X=x,U=u\right)\\ &=\mathbb{E}\left(Y\mid A=1,X=x,U=u\right)-\mathbb{E}\left(Y\mid A=0,X=x,U=u\right).\end{split}

In clustered observational study, a common approach to estimate 𝔼(YA,X=x,U=u)\mathbb{E}\left(Y\mid A,X=x,U=u\right) is to use a generalized mixed effects model or a generalized linear model with GEE, where all potential confounders are included. We will introduce our sensitivity analysis method based on generalized mixed effects model, where random effects will be used to account for within-cluster correlation. Let ii (i=1,,I)(i=1,\dots,I) and jj (j=1,,J)(j=1,\dots,J) be the indicators for the level-1 (e.g., subjects nested in clusters) and the level-2 unit (e.g., clusters), respectively. Consider a continuous outcome YijY_{ij} with the marginal mean model:

𝔼(Yij|Aij,Xij,Uij)=β0+β1Aij+β2Xij+β3AijXij+θUij;\small\begin{split}\mathbb{E}(Y_{ij}|A_{ij},X_{ij},U_{ij})=\beta_{0}+\beta_{1}A_{ij}+\beta_{2}X_{ij}+\beta_{3}A_{ij}X_{ij}+\theta U_{ij};\end{split}

and the corresponding mixed model:

Yij=β0+β1Aij+β2Xij+β3AijXij+θUij+ζj+ϵij,\small\begin{split}&Y_{ij}=\beta_{0}+\beta_{1}A_{ij}+\beta_{2}X_{ij}+\beta_{3}A_{ij}X_{ij}+\theta U_{ij}+\zeta_{j}+\epsilon_{ij},\end{split} (1)

where 𝜷=(β0,β1,β2,β3)T\boldsymbol{\beta}=(\beta_{0},\beta_{1},\beta_{2},\beta_{3})^{T} and θ\theta are unknown regression parameters; ζj\zeta_{j} and ϵij\epsilon_{ij} are the random intercepts and the level-1 error terms, respectively, where ζjN(0,ν)\zeta_{j}\sim N(0,\nu) and ϵij|ζjN(0,ϕ)\epsilon_{ij}|\zeta_{j}\sim N(0,\phi) with unknown variances ν\nu and ϕ\phi.

Our goal is to estimate the CATE, denoted as ExE_{x}, with the form

Ex𝔼(YA=1,X=x,U=u)𝔼(YA=0,X=x,U=u)=β1+xβ3.\begin{split}E_{x}\coloneqq\mathbb{E}\left(Y\mid A=1,X=x,U=u\right)-\mathbb{E}\left(Y\mid A=0,X=x,U=u\right)=\beta_{1}+x\beta_{3}.\end{split}

Since UU is unmeasured, we can only fit the following confounded model:

Yij=β0c+β1cAij+β2cXij+β3cAijXij+ζjc+ϵijc,Y_{ij}=\beta_{0}^{c}+\beta_{1}^{c}A_{ij}+\beta_{2}^{c}X_{ij}+\beta_{3}^{c}A_{ij}X_{ij}+\zeta_{j}^{c}+\epsilon_{ij}^{c}, (2)

where 𝜷c=(β0c,β1c,β2c,β3c)T\boldsymbol{\beta}^{c}=(\beta_{0}^{c},\beta_{1}^{c},\beta_{2}^{c},\beta_{3}^{c})^{T} are unknown regression parameters; ζjc\zeta_{j}^{c} and ϵijc\epsilon_{ij}^{c} are random intercepts and level-1 error terms, respectively, where ζjcN(0,νc)\zeta_{j}^{c}\sim N(0,\nu^{c}) and ϵijc|ζjcN(0,ϕc)\epsilon_{ij}^{c}|\zeta_{j}^{c}\sim N(0,\phi^{c}) with unknown variances νc\nu^{c} and ϕc\phi^{c}. There is potential violation of assuming a constant variance of random effect and error term, but our methods are robust against this potential violation based on the simulation results.

Denote confounded treatment effect ExcE_{x}^{c} as:

Exc𝔼(YA=1,X=x)𝔼(YA=0,X=x)=β1c+xβ3c.E_{x}^{c}\coloneqq\mathbb{E}\left(Y\mid A=1,X=x\right)-\mathbb{E}\left(Y\mid A=0,X=x\right)=\beta_{1}^{c}+x\beta_{3}^{c}.

We refer 𝜷\boldsymbol{\beta} and 𝜷c\boldsymbol{\beta}^{c} as the true and confounded coefficients, respectively. The element in 𝜷\boldsymbol{\beta} cannot be estimated directly using the observed data while 𝜷c\boldsymbol{\beta}^{c} can. Therefore, our goal is to establish the relationship between 𝜷\boldsymbol{\beta} and 𝜷c\boldsymbol{\beta}^{c} to infer 𝜷\boldsymbol{\beta}.

Let F(U|A,X)F(U|A,X) be the cumulative distribution function of UU given AA and XX. The conditional expectation of YijY_{ij} given AijA_{ij} and XijX_{ij} based on model (LABEL:1) becomes:

𝔼(Yij|Aij,Xij=x)=𝔼𝕌𝔼ζ𝔼(Yij|Aij,Xij=x,Uij,ζj)=𝔼𝕌𝔼(Yij|Aij,Xij=x,Uij,ζj=0)=β0+β1Aij+xβ2+xβ3Aij+θUij𝑑F(Uij|Aij,Xij=x)=(β0+θm0x)+{β1+xβ3+θ(m1xm0x)}Aij+xβ2,\begin{split}\mathbb{E}(Y_{ij}|A_{ij},X_{ij}=x)&=\mathbb{E_{U}}{\mathbb{E}_{\zeta}\mathbb{E}(Y_{ij}|A_{ij},X_{ij}=x,U_{ij},\zeta_{j})}\\ &=\mathbb{E_{U}}\mathbb{E}(Y_{ij}|A_{ij},X_{ij}=x,U_{ij},\zeta_{j}=0)\\ &=\beta_{0}+\beta_{1}A_{ij}+x\beta_{2}+x\beta_{3}A_{ij}+\theta\int_{-\infty}^{\infty}U_{ij}dF(U_{ij}|A_{ij},X_{ij}=x)\\ &=(\beta_{0}+\theta m_{0x})+\{\beta_{1}+x\beta_{3}+\theta(m_{1x}-m_{0x})\}A_{ij}+x\beta_{2},\end{split} (3)

where m1x=𝔼(Uij|Aij=1,Xij=x)m_{1x}=\mathbb{E}(U_{ij}|A_{ij}=1,X_{ij}=x) and m0x=𝔼(Uij|Aij=0,Xij=x)m_{0x}=\mathbb{E}(U_{ij}|A_{ij}=0,X_{ij}=x).

The conditional expectation of YijY_{ij} based on the confounded model described in equation (2) has the form:

𝔼(Yij|Aij,Xij=x)=β0c+(β1c+xβ3c)Aij+xβ2c.\mathbb{E}(Y_{ij}|A_{ij},X_{ij}=x)=\beta_{0}^{c}+(\beta_{1}^{c}+x\beta_{3}^{c})A_{ij}+x\beta_{2}^{c}. (4)

Comparing equations (3) and (4), we have the following relationship:

Ex=ExcBx,E_{x}=E_{x}^{c}-B_{x}, (5)

where Ex=β1+xβ3E_{x}=\beta_{1}+x\beta_{3}, Exc=β1c+xβ3cE_{x}^{c}=\beta_{1}^{c}+x\beta_{3}^{c}, and Bx=(m1xm0x)θB_{x}=(m_{1x}-m_{0x})\theta. Note that ExE_{x} is the CATE conditional on X=xX=x and ExcE_{x}^{c} is the confounded treatment effect conditional on X=xX=x. Equation (5) specifies the relationship between causal CATE and confounded treatment effect. Define BxB_{x} as the bias factor, which is a function of the effect of unmeasured confounders and the difference of the mean unmeasured confounders between the treated and control groups given X=xX=x. Let θ\theta, m1xm_{1x} and m0xm_{0x} be the sensitivity parameters.

2.2 Sensitivity Analysis for Single Study with Continuous Outcome

Denote the estimated confounded treatment effect as E^xc\widehat{E}_{x}^{c} (95% CI, lblb to ubub), where lblb and ubub are the lower and upper bounds of its 95% confidence interval. Following equation (5), the estimated conditional average treatment effects is E^xcBx\widehat{E}_{x}^{c}-B_{x} (95% CI, lbBxlb-B_{x} to ubBxub-B_{x}). We say an unmeasured confounder can explain away the estimated confounded treatment effect if the unmeasured confounder is able to shift the confidence interval of the estimated confounded treatment effect to include the null value (no treatment effect). If confidence interval of estimated confounded treatment effect includes the null value, then no confounding is needed to explain away the results or move confidence interval to include the null value.

Once the sensitivity parameters are specified, we can calculate bias factor and the estimated CATE. However, in practice, we do not know the strengths and distributions of unmeasured confounders, thus, the sensitivity parameters are usually unknown. What we can do is to calculate how the estimate is affected under different values of the bias factor. Among all different bias factors that are able to explain away the confounded treatment effect, minimal bias factor is defined as the smallest value of bias factors that is able to explain away the confounded treatment effect, meaning any bias factor larger than the minimal bias factor can also explain away the confounded treatment effect. The robustness of the estimated confounded treatment effect can be measured by the minimal bias factor. A small value of the minimal bias factor indicates the estimated treatment effect is not robust to unmeasured confounders.

Let E^xc>0\widehat{E}_{x}^{c}>0 and E^xc<0\widehat{E}_{x}^{c}<0 be apparently positive effect and apparently negative effect, respectively. In apparently positive effect case, the confounded treatment effect will be explained away when the lower bound of confidence interval of estimated confounded treatment effect is smaller than 0, i.e., lbBx0lb-{B}_{x}\leq 0. It means any bias factor, BxB_{x}, such that BxlbB_{x}\geq lb, can explain away the confounded treatment effect. Denote BxB_{x}^{*} as the minimal bias factor that can explain away the confounded treatment effect. Thus, when E^xc>0\widehat{E}_{x}^{c}>0, the minimal bias factor that can explain away the confounded treatment effect is lblb, i.e., Bx=lb{B}_{x}^{*}=lb.

In the apparently negative case, define B~x=Bx\tilde{B}_{x}=-{B}_{x}, thus, estimated CATE can be expressed as E^xc+B~x\widehat{E}_{x}^{c}+\tilde{B}_{x} (95% CI, lb+B~xlb+\tilde{B}_{x} to ub+B~xub+\tilde{B}_{x}). The confounded treatment effect will be explained away when ub+B~x0ub+\tilde{B}_{x}\geq 0. It means any bias factor, B~x\tilde{B}_{x}, such that B~xub\tilde{B}_{x}\geq-ub, can explain away the confounded treatment effect. Denote B~x\tilde{B}_{x}^{*} as the minimal bias factor that can explain away the confounded treatment effect. Thus, when E^xc<0\widehat{E}_{x}^{c}<0, the minimal bias factor that can explain away the confounded treatment effect is ub-ub, i.e., B~x=ub\tilde{B}_{x}^{*}=-ub.

As shown above, bias factor is a function of the effects of unmeasured confounders, and the difference of mean of unmeasured confounders between treated and control groups given X=xX=x, such that Bx=(m1xm0x)θB_{x}=(m_{1x}-m_{0x})\theta and B~x=(m1xm0x)θ\tilde{B}_{x}=-(m_{1x}-m_{0x})\theta. It means that, when E^xc>0\widehat{E}_{x}^{c}>0, a set of θ\theta, m1xm_{1x}, and m0xm_{0x}, such that (m1xm0x)θlb(m_{1x}-m_{0x})\theta\geq lb, can explain away estimated confounded treatment effect. When E^xc<0\widehat{E}_{x}^{c}<0, a set of θ\theta, m1xm_{1x}, and m0xm_{0x}, such that (m0xm1x)θub(m_{0x}-m_{1x})\theta\geq-ub, can explain away estimated confounded treatment effect. We recommend to provide a table or contour plot which present different values of θ\theta, m1xm_{1x}, m0xm_{0x}, and bias factor to give researchers a sense of what are combinations of sensitivity parameters that can explain away the confounded treatment effect and therefore a sense of how sensitive the confounded treatment effect is to unmeasured confounders.

2.3 Single Study - Binary Outcome

For a binary outcome, risk difference (RD) and relative risk (RR) can be used to explore causal treatment effect. In this article, RR will be used to explore the causal relationship. With same notations as in the case of continuous outcome, except that YY denotes a binary variable now, when same assumptions hold:

RR=𝔼(YA=1X=x,U=u)𝔼(YA=0X=x,U=u)=𝔼(YA=1A=1,X=x,U=u)𝔼(YA=0A=0,X=x,U=u)=𝔼(YA=1,X=x,U=u)𝔼(YA=0,X=x,U=u).\begin{split}\text{RR}=\frac{\mathbb{E}\left(Y^{A=1}\mid X=x,U=u\right)}{\mathbb{E}\left(Y^{A=0}\mid X=x,U=u\right)}=\frac{\mathbb{E}\left(Y^{A=1}\mid A=1,X=x,U=u\right)}{\mathbb{E}\left(Y^{A=0}\mid A=0,X=x,U=u\right)}=\frac{\mathbb{E}\left(Y\mid A=1,X=x,U=u\right)}{\mathbb{E}\left(Y\mid A=0,X=x,U=u\right)}.\end{split}

When the event is rare in both treatment and control group, it can be shown that odds ratio is approximately equal to the relative risk. A logistic regression model is widely used to estimate odds ratio. Let the response variable YY be related to AA, XX, and UU through following generalized linear mixed model:

logit{(Yij=1|Aij,Xij,Uij,ζj)}=β0+β1Aij+β2Xij+β3AijXij+θUij+ζj,\text{logit}\left\{\mathbb{P}(Y_{ij}=1|A_{ij},X_{ij},U_{ij},\zeta_{j})\right\}=\beta_{0}+\beta_{1}A_{ij}+\beta_{2}X_{ij}+\beta_{3}A_{ij}X_{ij}+\theta U_{ij}+\zeta_{j}, (6)

where 𝜷=(β0,β1,β2,β3)T\boldsymbol{\beta}=(\beta_{0},\beta_{1},\beta_{2},\beta_{3})^{T} and θ\theta are unknown regression parameters, and ζjN(0,ν)\zeta_{j}\sim N(0,\nu).

When vv is small, it can be shown (proof is provided in Appendix A)

logit{(Yij=1|Aij,Xij,Uij)}β0+β1Aij+β2Xij+β3AijXij+θUij.\text{logit}\{\mathbb{P}(Y_{ij}=1|A_{ij},X_{ij},U_{ij})\}\approx\beta_{0}+\beta_{1}A_{ij}+\beta_{2}X_{ij}+\beta_{3}A_{ij}X_{ij}+\theta U_{ij}.

Denote CATE on the log-RR scale as ExE_{x}. Thus, when the event is rare and vv is small, ExE_{x} has the form: Exlog𝔼(YA=1X=x,U=u)𝔼(YA=0X=x,U=u)=log𝔼(YA=1,X=x,U=u)𝔼(YA=0,X=x,U=u)β1+xβ3.E_{x}\coloneqq\text{log}\frac{\mathbb{E}\left(Y^{A=1}\mid X=x,U=u\right)}{\mathbb{E}\left(Y^{A=0}\mid X=x,U=u\right)}=\text{log}\frac{\mathbb{E}\left(Y\mid A=1,X=x,U=u\right)}{\mathbb{E}\left(Y\mid A=0,X=x,U=u\right)}\approx\beta_{1}+x\beta_{3}.

Since UU is unmeasured, we can only fit the following confounded model:

logit{(Yij=1|Aij,Xij,ζjc)}=β0c+β1cAij+β2cXij+β3cAijXij+ζjc,\text{logit}\left\{\mathbb{P}(Y_{ij}=1|A_{ij},X_{ij},\zeta_{j}^{c})\right\}=\beta_{0}^{c}+\beta_{1}^{c}A_{ij}+\beta_{2}^{c}X_{ij}+\beta_{3}^{c}A_{ij}X_{ij}+\zeta_{j}^{c}, (7)

where ζjcN(0,νc)\zeta_{j}^{c}\sim N(0,\nu^{c}) and 𝜷c=(β0c,β1c,β2c,β3c)T\boldsymbol{\beta}^{c}=(\beta_{0}^{c},\beta_{1}^{c},\beta_{2}^{c},\beta_{3}^{c})^{T} are unknown regression parameters. Similarly, when νc\nu^{c} is small, it can be shown that for the confounded model:

logit{(Yij=1|Aij,Xij)}β0c+β1cAij+β2cXij+β3cAijXij.\text{logit}\left\{\mathbb{P}(Y_{ij}=1|A_{ij},X_{ij})\right\}\approx\beta_{0}^{c}+\beta_{1}^{c}A_{ij}+\beta_{2}^{c}X_{ij}+\beta_{3}^{c}A_{ij}X_{ij}. (8)

Denote confounded treatment effect on the log-RR scale as ExcE_{x}^{c} which has the form: Exclog𝔼(YA=1,X=x)𝔼(YA=0,X=x)β1c+xβ3c.E_{x}^{c}\coloneqq\text{log}\frac{\mathbb{E}\left(Y\mid A=1,X=x\right)}{\mathbb{E}\left(Y\mid A=0,X=x\right)}\approx\beta_{1}^{c}+x\beta_{3}^{c}.

When UU is binary and both θ\theta and ν\nu are small, we have (proof is provided in Appendix B):

logit{(Y=1|A,X=x)}β0+β1A+β2x+β3Ax+log{PAxexp(θ+ν2)+(1PAx)exp(ν2)}=β0+β2x+log{P0xexp(θ+ν2)+(1P0x)exp(ν2)}+{β1+β3x+logP1xexp(θ)+1P1xP0xexp(θ)+1P0x}A,\small\begin{split}&\text{logit}\left\{\mathbb{P}(Y=1|A,X=x)\right\}\approx\beta_{0}+\beta_{1}A+\beta_{2}x+\beta_{3}Ax+\text{log}\{P_{Ax}\text{exp}(\theta+\frac{\nu}{2})+(1-P_{Ax})\text{exp}(\frac{\nu}{2})\}\\ &=\beta_{0}+\beta_{2}x+\text{log}\{P_{0x}\text{exp}(\theta+\frac{\nu}{2})+(1-P_{0x})\text{exp}(\frac{\nu}{2})\}+\{\beta_{1}+\beta_{3}x+\text{log}\frac{P_{1x}\text{exp}(\theta)+1-P_{1x}}{P_{0x}\text{exp}(\theta)+1-P_{0x}}\}A,\end{split} (9)

where PAX=P(Uij=1|Aij,Xij)P_{AX}=P(U_{ij}=1|A_{ij},X_{ij}).

Following the confounded model described in equation (8), we have:

logit{(Y=1|A,X=x)}β0c+β1cAij+β2cx+β3cAx=β0c+β2cx+(β1c+β3cx)A.\text{logit}\left\{\mathbb{P}(Y=1|A,X=x)\right\}\approx\beta_{0}^{c}+\beta_{1}^{c}A_{ij}+\beta_{2}^{c}x+\beta_{3}^{c}Ax=\beta_{0}^{c}+\beta_{2}^{c}x+(\beta_{1}^{c}+\beta_{3}^{c}x)A. (10)

When θ\theta, ν\nu, and νc\nu^{c} are small, comparing equations (9) and (10), we have the following relationship:

Ex=ExcBx,E_{x}=E_{x}^{c}-B_{x}, (11)

where Ex=β1+xβ3E_{x}=\beta_{1}+x\beta_{3}, Exc=β1c+xβ3cE_{x}^{c}=\beta_{1}^{c}+x\beta_{3}^{c}, and Bx=logP1xexp(θ)+1P1xP0xexp(θ)+1P0xB_{x}=\text{log}\frac{P_{1x}\text{exp}(\theta)+1-P_{1x}}{P_{0x}\text{exp}(\theta)+1-P_{0x}}.

When UU is normally distributed such that UN(μAX,σu)U\sim N(\mu_{AX},\sigma_{u}), i.e., the expectation of UU is dependent on both AA and XX, then if θ\theta, σu\sigma_{u}, and ν\nu are small, we have (proof is provided in Appendix C):

logit{(Y=1|A,X=x)}β0+β1A+β2x+β3Ax+θμAx+θ2σu+ν2=β0+β2x+μ0x+θ2σu+ν2+(β1+xβ3+θμ1xθμ0x)A.\begin{split}\text{logit}\left\{\mathbb{P}(Y=1|A,X=x)\right\}&\approx\beta_{0}+\beta_{1}A+\beta_{2}x+\beta_{3}Ax+\theta\mu_{Ax}+\frac{\theta^{2}\sigma_{u}+\nu}{2}\\ &=\beta_{0}+\beta_{2}x+\mu_{0x}+\frac{\theta^{2}\sigma_{u}+\nu}{2}+(\beta_{1}+x\beta_{3}+\theta\mu_{1x}-\theta\mu_{0x})A.\end{split} (12)

When θ\theta, σu\sigma_{u}, ν\nu, and νc\nu^{c} are small, comparing equations (10) and (12), we have the following relationship:

Ex=ExcBx,E_{x}=E_{x}^{c}-B_{x}, (13)

where Ex=β1+xβ3E_{x}=\beta_{1}+x\beta_{3}, Exc=β1c+xβ3cE_{x}^{c}=\beta_{1}^{c}+x\beta_{3}^{c}, and Bx=θ(μ1xμ0x)B_{x}=\theta(\mu_{1x}-\mu_{0x}).

Intraclass correlation coefficient (ICC) is the proportion of total variance of the outcome that is attributable to the between-subject variation, i.e., how much variation of the outcome can be explained by the random effects. For a logistic model, ICC=νν+π/3.\text{ICC}=\frac{\nu}{\nu+\pi/3}. It means when θ\theta, ICC, and confounded ICC are small, equations (11) and (13) hold.

With equations (11) and (13), we can calculate the minimal bias factor that can explain away the confounded treatment effect by moving the confidence interval to include the null value. Similarly, if a small minimal bias factor is able to explain away the confounded treatment effect, the confounded treatment effect is not robust to unmeasured confounders.

2.4 Multiple Studies

When multiple observational studies is involved where only study-level estimates not individual-level data can be merged, meta-analysis is the most commonly used analysis technique to estimate the population covariate effects which combines estimates from all studies and takes study heterogeneous into consideration. In this section, we introduce our proposed method to assess how sensitive the estimated population covariate effects from meta-analysis are to the unmeasured confounders.

Suppose we have KK studies which are random samples from a population of interest. For study kk, let YijkY_{ijk} be the continuous response variable, which is related to treatment assignment AijkA_{ijk}, measured confounders XijkX_{ijk}, and unmeasured confounder UijkU_{ijk} through the following random intercepts model:

Yijk=β0k+β1kAijk+β2kXijk+β3kAijkXijk+θkUijk+ζjk+ϵijk,Y_{ijk}=\beta_{0k}+\beta_{1k}A_{ijk}+\beta_{2k}X_{ijk}+\beta_{3k}A_{ijk}X_{ijk}+\theta_{k}U_{ijk}+\zeta_{jk}+\epsilon_{ijk},

where ii (i=1,,I)(i=1,\dots,I) and jj (j=1,,J)(j=1,\dots,J) are the indicators for the level-1 (e.g.,subjects in clusters) and the level-2 unit (e.g., clusters), respectively; and βk\beta_{k}’s and θk\theta_{k} are the regression parameters adjusting for both measured and unmeasured confounders. Similarly, when causal assumptions hold, the model can be used to estimate the causal ATE. Denote Ek|xE_{k|x} as the CATE for study kk given X=xX=x, such that Ek|x=β1k+xβ3kE_{k|x}=\beta_{1k}+x\beta_{3k}.

Since UijkU_{ijk} is unmeasured, we can only fit the following confounded model for each of the KK studies:

Yijk=β0kc+β1kcAijk+β2kcXijk+β3kcAijkXijk+ζjkc+ϵijkc,Y_{ijk}=\beta_{0k}^{c}+\beta_{1k}^{c}A_{ijk}+\beta_{2k}^{c}X_{ijk}+\beta_{3k}^{c}A_{ijk}X_{ijk}+\zeta_{jk}^{c}+\epsilon_{ijk}^{c},

where βkc\beta_{k}^{c}’s are confounded regression parameters only adjusting for measured confounders. Denote Ek|xcE_{k|x}^{c} as the confounded treatment effect for study kk given X=xX=x, such that Ek|xc=β1kc+xβ3kcE_{k|x}^{c}=\beta_{1k}^{c}+x\beta_{3k}^{c}.

In a standard setup of parametric random effects meta-analysis (Borenstein et al., 2007), it assumes that each of the KK studies measures a potentially unique confounded effect size Ek|xcE_{k|x}^{c}, such that Ek|xcN(μExc,VExc)E_{k|x}^{c}\sim N(\mu_{E_{x}^{c}},V_{E_{x}^{c}}), for k=1,,Kk=1,\dots,K. Let E^k|xc\widehat{E}_{k|x}^{c} be the point estimate of the treatment effect and νk\nu_{k} be the known within-study variance for study kk. In random effects meta-analysis setup, E^k|xc\widehat{E}_{k|x}^{c} is determined by the true confounded effect size Ek|xcE_{k|x}^{c} plus the known within-study variance νk\nu_{k}, while Ek|xcE_{k|x}^{c} is determined by the grand mean, μExc\mu_{E_{x}^{c}} and the between-study variance, VExcV_{E_{x}^{c}}. More generally, for any observed effect E^k|xc\widehat{E}_{k|x}^{c},

E^k|xc=μExc+φk+κk,\widehat{E}_{k|x}^{c}=\mu_{E_{x}^{c}}+\varphi_{k}+\kappa_{k},

where φkN(0,VExc)\varphi_{k}\sim N(0,V_{E_{x}^{c}}) and κkN(0,νk)\kappa_{k}\sim N(0,\nu_{k}). Fixed effects meta-analysis can be viewed as a special case where VExc=0V_{E_{x}^{c}}=0.

The relationship between the CATE and confounded treatment effects in equation (5) still hold for each of the KK studies as we illustrated in the single study case,

Ek|x=Ek|xcBk|x;Bk|x=(mk|1xmk|0x)θk,\displaystyle E_{k|x}=E_{k|x}^{c}-B_{k|x};\hskip 36.135ptB_{k|x}=(m_{k|1x}-m_{k|0x})\theta_{k},

where mk|1x=𝔼(Uijk|Aijk=1,Xijk=x)m_{k|1x}=\mathbb{E}(U_{ijk}|A_{ijk}=1,X_{ijk}=x) and mk|0x=𝔼(Uijk|Aijk=0,Xijk=x)m_{k|0x}=\mathbb{E}(U_{ijk}|A_{ijk}=0,X_{ijk}=x), Bk|xN(μBx,VBx)B_{k|x}\sim N(\mu_{B_{x}},V_{B_{x}}) and Ek|xcN(μExc,VExc)E_{k|x}^{c}\sim N(\mu_{E_{x}^{c}},V_{E_{x}^{c}}). It means that we allow the bias factor Bk|xB_{k|x} to vary across studies. Assume that Bk|xB_{k|x} is independent of Ek|xE_{k|x}. Bk|xB_{k|x} is independent of Ek|xE_{k|x} but not Ek|xcE_{k|x}^{c} because studies with a larger bias factor may observe a larger confounded treatment effect. Thus, the distribution of the CATE can be written as Ek|xN(μEx,VEx)E_{k|x}\sim N(\mu_{E_{x}},V_{E_{x}}), where

μEx=μExcμBx;VEx=VExcVBx.\mu_{E_{x}}=\mu_{E_{x}^{c}}-\mu_{B_{x}};\hskip 36.135ptV_{E_{x}}=V_{E_{x}^{c}}-V_{B_{x}}.

2.5 Sensitivity Analysis for Multiple Studies

Although we use continuous outcome as an example to demonstrate the method in multiple studies, similar derivation can be easily extended to the binary outcome case by replacing the error distribution and the link function in the generalized linear mixed model for binary outcomes. Given equations (11) and (13) hold for each of the KK studies, we can similarly derive the distribution of CATE under the case of binary outcomes.

Following the newly designed metrics for meta-analyses of heterogeneous effects in Mathur and VanderWeele (2019a) and Mathur and VanderWeele (2019b), we define that there is a treatment effect if the probability of observing a CATE larger than a scientifically meaningful effect size qq, is larger than a threshold rr. This means that a bias factor will explain away the confounded treatment effect in meta-analysis if the bias factor can reduce the probability of observing a conditional average treatment effects larger than qq to less than rr. Note, for continuous outcome, qq is on the difference of the continuous outcomes scale, while for binary outcome, qq is on the log-RR scale.

Denote μ^Exc\widehat{\mu}_{E_{x}^{c}} and V^Exc\widehat{V}_{E_{x}^{c}} as the estimators of μExc\mu_{E_{x}^{c}} and VExcV_{E_{x}^{c}}. Let μ^Exc>0\widehat{\mu}_{E_{x}^{c}}>0 and μ^Exc<0\widehat{\mu}_{E_{x}^{c}}<0 be the apparently positive effect and apparently negative effect, respectively, in meta-analysis.

For the apparently positive effect, we define the probability of observing a CATE larger than qq as p(q)x=P(Ek|x>q)p(q)_{x}=P\left(E_{k|x}>q\right) where Ek|xN(μEx,VEx)E_{k|x}\sim N(\mu_{E_{x}},V_{E_{x}}). p(q)xp(q)_{x} can be expressed using μBx\mu_{B_{x}}, VBxV_{B_{x}}, μExc\mu_{E_{x}^{c}}, and VExcV_{E_{x}^{c}}:

p(q)x=1Φ(q+μBxμExcVExcVBx),VExc>VBx{p}(q)_{x}=1-\Phi\left(\frac{q+\mu_{B_{x}}-\mu_{E_{x}^{c}}}{\sqrt{V_{E_{x}^{c}}-V_{B_{x}}}}\right),\quad V_{E_{x}^{c}}>V_{B_{x}} (14)

where Φ\Phi is the cumulative function of standard normal distribution. μBx\mu_{B_{x}} and VBxV_{B_{x}} such that 1Φ(q+μBxμ^ExcV^ExcVBx)<r1-\Phi\left(\frac{q+\mu_{B_{x}}-\widehat{\mu}_{E_{x}^{c}}}{\sqrt{\widehat{V}_{E_{x}^{c}}-V_{B_{x}}}}\right)<r will explain away the estimated confounded treatment effect in meta-analysis. When 0<r<0.50<r<0.5, a constant bias factor among studies, i.e., VBx=0V_{B_{x}}=0, will lead to an upper bound of p^(q)x\widehat{p}(q)_{x}, i.e.,

1Φ(q+μBxμ^ExcV^ExcVBx)1Φ(q+μBxμ^ExcV^Exc), 0<r<0.5.1-\Phi\left(\frac{q+\mu_{B_{x}}-\widehat{\mu}_{E_{x}^{c}}}{\sqrt{\widehat{V}_{E_{x}^{c}}-V_{B_{x}}}}\right)\leq 1-\Phi\left(\frac{q+\mu_{B_{x}}-\widehat{\mu}_{E_{x}^{c}}}{\sqrt{\widehat{V}_{E_{x}^{c}}}}\right),\ \ 0<r<0.5.

It means a constant bias factor, such that BXBxB_{X}\geq B_{x}^{*} where

1Φ(q+Bxμ^ExcV^Exc)rBx=Φ1(1r)V^Exc12q+μ^Exc,1-\Phi\left(\frac{q+B_{x}^{*}-\widehat{\mu}_{E_{x}^{c}}}{\sqrt{\widehat{V}_{E_{x}^{c}}}}\right)\equiv r\rightarrow B_{x}^{*}=\Phi^{-1}(1-r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-q+\widehat{\mu}_{E_{x}^{c}}, (15)

and any other non-constant bias factors with mean larger or equal to BxB_{x}^{*} can explain away the confounded treatment effect in meta-analysis. Thus, BxB_{x}^{*} can be viewed as the minimal common constant bias factor that reduce the probability of observing a CATE larger than a scientifically meaningful effect size qq, to a threshold rr.

For the apparently negative case, define B~x=Bx\tilde{B}_{x}=-B_{x}; therefore Ek|x=Ek|xc+B~k|xE_{k|x}=E_{k|x}^{c}+\tilde{B}_{k|x} and Ek|xN(μEx,VEx)E_{k|x}\sim N(\mu_{E_{x}},V_{E_{x}}) where μEx=μExc+μB~x\mu_{E_{x}}=\mu_{E_{x}^{c}}+\mu_{\tilde{B}_{x}} and VEx=VExcVB~xV_{E_{x}}=V_{E_{x}^{c}}-V_{\tilde{B}_{x}}. We define the probability of observing a CATE larger than qq as p(q)x=P(Ek|x<q)p(q)_{x}=P\left(E_{k|x}<q\right). p(q)xp(q)_{x} can be expressed using μB~x\mu_{\tilde{B}_{x}}, VB~xV_{\tilde{B}_{x}}, μExc\mu_{E_{x}^{c}}, and VExcV_{E_{x}^{c}}:

p(q)x=Φ(qμB~xμExcVExcVB~x),VExc>VB~x,{p}(q)_{x}=\Phi\left(\frac{q-\mu_{\tilde{B}_{x}}-\mu_{E_{x}^{c}}}{\sqrt{V_{E_{x}^{c}}-V_{\tilde{B}_{x}}}}\right),\quad V_{E_{x}^{c}}>V_{\tilde{B}_{x}},

Similarly, when 0<r<0.50<r<0.5, a constant bias factor among studies will lead to an upper bound of p^(q)x\widehat{p}(q)_{x}. Let B~x\tilde{B}_{x}^{*} be the minimal common constant bias factor that can reduce the probability of observing a conditional average treatment effects larger than a scientifically meaningful effect size qq, to a threshold rr such that

Φ(qB~xμ^ExcV^Exc)rB~x=qΦ1(r)V^Exc12μ^Exc.\Phi\left(\frac{q-\tilde{B}_{x}^{*}-\widehat{\mu}_{E_{x}^{c}}}{\sqrt{\widehat{V}_{E_{x}^{c}}}}\right)\equiv r\rightarrow\tilde{B}_{x}^{*}=q-\Phi^{-1}(r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-\widehat{\mu}_{E_{x}^{c}}. (16)

Consider fixed mk|1xm_{k|1x}, mk|0xm_{k|0x}, θk\theta_{k} and therefore a fixed bias factor across studies, meaning the unmeasured confounder has the same distribution and effect across all KK studies. In the apparently positive effect case, this means that if each of the KK studies suffers from unmeasured confounder with the same strength such that (mk|1xmk|0x)θkΦ1(1r)V^Exc12q+μ^Exc(m_{k|1x}-m_{k|0x})\theta_{k}\geq\Phi^{-1}(1-r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-q+\widehat{\mu}_{E_{x}^{c}}, the confounded treatment effect will be explained away. In the apparently negative effect case, it means that if (mk|0xmk|1x)θkqΦ1(r)V^Exc12μ^Exc(m_{k|0x}-m_{k|1x})\theta_{k}\geq q-\Phi^{-1}(r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-\widehat{\mu}_{E_{x}^{c}} in each of the KK studies, the confounded treatment effect will be explained away.

Consider variant mk|1xm_{k|1x}, mk|0xm_{k|0x}, θk\theta_{k} and therefore a variant bias factor across studies. In apparently positive effect case, any random bias factor with mean equal or larger than Φ1(1r)V^Exc12q+μ^Exc\Phi^{-1}(1-r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-q+\widehat{\mu}_{E_{x}^{c}} can explain away the confounded treatment effect. In apparently negative effect case, any random bias factor with mean equal or larger than qΦ1(r)V^Exc12μ^Excq-\Phi^{-1}(r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-\widehat{\mu}_{E_{x}^{c}} can explain away the confounded treatment effect.

Similar to the case of the single study, the robustness of the estimated confounded treatment effect can be measured by the minimal common constant bias factor. A small minimal common constant bias factor indicates that the estimated confounded treatment effect is not robust to unmeasured confounders.

Table 1 summarizes how to calculate the minimum (common constant) bias factor for a single study and for multiple studies (meta-analysis) under different scenarios. Once BxB_{x}^{*} or B~x\tilde{B}_{x}^{*} is calculated, one can determine the minimum effect of unmeasured confounders on response and difference between conditional mean of unmeasured confounders, that are sufficient to explain away the confounded treatment effect.

3 Interpretations

3.1 Single study

We take a prospective clustered study, which focuses on identifying potential causal effect of breastfeeding on children’s intelligence quotient (IQ), as an example to explain how to apply our method to assess the robustness of results from an observational study. Luby et al. (2016) examined relationship between breastfeeding and children’s IQ. After adjusting for children’s age, sex, history of depression, caregiver’s highest level of education, and familial income-to-needs ratio, the authors found that the IQ of breastfed child was 5.495.49 (95% CI: 0.750.75 to 10.2310.23) higher than the IQ of non-breastfed child. The investigators did not control for maternal smoking during pregnancy while maternal smoking may be a confounder as it could associate with less breastfeeding as well as lower offspring IQ. We are worried this result is confounded by maternal smoking status, so we conduct a sensitivity analysis to assess how robust the estimate is to the unmeasured confounding: maternal smoking status.

As the study indicated an apparently positive effect of breastfeeding on children’s IQ, the minimal bias factor, BxB_{x}^{*}, that is able to explain away the confounded effect equals the lower bound of confidence interval, i.e., Bx=0.75B_{x}^{*}=0.75, which means that any bias factor larger than 0.750.75 can also explain away the confounded effect. The minimum bias factor 0.750.75 can be decomposed into the product of two factors: the effect of maternal smoking on children’s IQ adjusting for other covariates and the difference between 𝔼(U|A=1,X=x)\mathbb{E}(U|A=1,X=x) and 𝔼(U|A=0,X=x)\mathbb{E}(U|A=0,X=x). Our methods allow investigators to make statements of the following form: the confounded effect of breastfeeding on children’s IQ could be explained away if the difference between the prevalence of maternal smoking in breastfed children and non-breastfed children is larger than 0.250.25 and the effect of smoking on IQ is greater than 3. There are also other decompositions that could explain away the confounded effect of breastfeeding on children’s IQ. A contour plot depicted in Figure 1 can be used to visualize the relationship between bias factors, difference between 𝔼(U|A=1,X=x)\mathbb{E}(U|A=1,X=x) and 𝔼(U|A=0,X=x)\mathbb{E}(U|A=0,X=x) (i.e., m1xm0xm_{1x}-m_{0x}), and the effect of unmeasured confounder θ\theta. Any combination of m1xm0xm_{1x}-m_{0x} and θ\theta in the contour plot that corresponds to a bias factor equal or larger than 0.750.75 are able to explain away the confounded effect. Using both the information of the minimal bias factor and different combinations of sensitivity parameters that are able to explain away the confounded treatment effect, investigators can get a sense of how sensitive the conclusions are to potential unmeasured confounders.

3.2 Multiple studies

Applications of our methods for meta-analysis and single study are similar. Investigators can calculate the minimal common constant bias factor based on equations (15) or (16) by plugging in V^Exc\widehat{V}_{E_{x}^{c}}, μ^Exc\widehat{\mu}_{E_{x}^{c}}, rr, and qq. If each of the studies suffers from an unmeasured confounder with the same distribution and effect on outcome that lead to a bias factor equal or greater than the minimal common constant bias factor, the confounded treatment effect from the meta-analysis can be explained away. Further, any non-fixed unmeasured confounder across studies with expectation equal or larger than minimal common constant bias factor can also explain away the confounded effect. Similar to single study, investigators can use our method to assess whether a set of sensitivity parameters can explain away the confounded effect and get a sense of how sensitive the conclusions are to potential unmeasured confounders.

We take a meta-analysis, which focuses on identifying potential causal effect of body mass index (BMI) in midlife on dementia, as an example to explain how to apply our method to assess the robustness of results from meta-analysis. Albanese et al. (2017) examined relationship between BMI in midlife and dementia by combining results, on relative risk (RR) scale, from 19 longitudinal studies on 589,649 participants (2,040 incident dementia cases) followed up for up to 42 years. The authors found that midlife (age 35 to 65 years) obesity (BMI \geq 30) was associated with larger risk of dementia in late life (RR, 1.33; 95% CI, 1.08 to 1.63). The study reported μ^Exc=log1.33\widehat{\mu}_{E_{x}^{c}}=\text{log}1.33 (μ^Exc\widehat{\mu}_{E_{x}^{c}} is on log RR scale), while the study level summary measures reported from Albanese et al. (2017) can be used to obtain V^Exc\widehat{V}_{E_{x}^{c}}. We estimated V^Exc=0.08\widehat{V}_{E_{x}^{c}}=0.08 via methods illustrated in Borenstein et al. (2007). Note that the choice of qq and rr should be adapted based on the context and specific research questions, which means that different analysis can have difference choice on qq and rr. In this example, let us consider a RR of 1.2 as a scientifically meaningful effect size, meaning q=log1.2q=\text{log}1.2, and consider having at least 40% chance to observe an effect size above qq for results to be of scientific interest, meaning r=0.4r=0.4.

As the meta-analysis indicated an apparently positive effect of BMI on dementia, we have the minimal common constant bias factor, which can reduce the probability of observing a conditional average treatment effects larger than a scientifically meaningful effect size qq by a threshold rr, such that

Bx=Φ1(1r)V^Exc12q+μ^Exc=Φ1(10.4)0.08log1.2+log1.33=0.17.B_{x}^{*}=\Phi^{-1}(1-r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-q+\widehat{\mu}_{E_{x}^{c}}=\Phi^{-1}(1-0.4)*\sqrt{0.08}-\text{log}1.2+\text{log}1.33=0.17.

It means that any constant bias factor across studies such that Bx0.17B_{x}\geq 0.17 and any other non-constant bias factors across studies with mean larger or equal to 0.170.17 can explain away the confounded treatment effect in meta-analysis. The minimum common constant bias factor 0.170.17 can be decomposed into the product of two factors: the effect of unmeasured confounder on dementia and the difference between the mean of unmeasured confounder in different BMI groups. Our methods allow investigators to make statement of the following form: the confounded effect of BMI on dementia could be explained away if an unmeasured confounder exists such that its mean difference between BMI groups is larger than 0.17 and its effect on dementia is greater than 1 in each of the studies. We can similarly refer to the contour plot in Figure 1 to visualize combinations of sensitivity parameters that corresponds to a bias factor equal or larger than 0.17 to explain away the confounded effect. Using both the information of the minimal common constant bias factor and different combination of sensitivity parameters that are able to explain away the confounded treatment effect, investigators can get a sense of how sensitive the conclusions are to potential unmeasured confounders.

4 Simulation studies

We conducted simulation studies to assess the performance of our proposed methods; specifically, we examined the bias and the precision of the estimator E^x\widehat{E}_{x} in equation (5) for a single study setting with continuous outcome, the estimator E^x\widehat{E}_{x} in equation (13) for a single study setting with binary outcome and continuous unmeasured confounder, and estimator p^(q)x\widehat{p}(q)_{x} in equation (14) for the meta-analysis settings.

4.1 Single Study - Continuous Outcome

First, we assessed the performance of the estimator of ExE_{x}, E^x=β^1c+xβ^3cBx\widehat{E}_{x}=\widehat{\beta}_{1}^{c}+x\widehat{\beta}_{3}^{c}-B_{x}. We assumed that AA, XX, and UU are dependent, which is a frequently encountered condition in observational studies.

Let the number of subjects be JJ and the number of repeated measures be II per subject. We simulated a normally distributed unmeasured confounder UijN(0,σu2)U_{ij}\sim N(0,\sigma_{u}^{2}), one measured binary confounder XijBin(0.5)X_{ij}\sim Bin(0.5) if Uij<1U_{ij}<1 and XijBin(0.4)X_{ij}\sim Bin(0.4) if Uij1U_{ij}\geq 1, and treatment assignment AijBin(0.4)A_{ij}\sim Bin(0.4) if Uij+Xij<2U_{ij}+X_{ij}<2 and AijBin(0.5)A_{ij}\sim Bin(0.5) if Uij+Xij2U_{ij}+X_{ij}\geq 2. We simulated random intercept ζj\zeta_{j} and the level-1 residuals ϵij\epsilon_{ij} from the distributions N(0,4)N(0,4) and N(0,1)N(0,1), respectively. We proceeded to simulate the response variable YY using the underlying true model described in (LABEL:1) setting β0=1\beta_{0}=1 and β2=3\beta_{2}=3 while varying the values of β1\beta_{1}, β3\beta_{3}, and θ\theta. E^x\widehat{E}_{x} can be calculated after we estimated the confounded coefficient β1c\beta_{1}^{c} and β3c\beta_{3}^{c} in (2) using R package lme4 (Bates et al., 2015). We simulated 1,000 such studies and inferred the estimated conditional average treatment effects described in equation (5) under X=0X=0 and X=1X=1. We repeated the simulations with different values of II, JJ, β1\beta_{1}, β3\beta_{3}, θ\theta, and σu2\sigma_{u}^{2}.

To measure the performance of our methods in estimating conditional causal treatment effect, we assessed the bias, standard error (SE) and coverage probability (CP) of 95% confidence intervals of E^x\widehat{E}_{x}. Results in Table 2 show that the estimated conditional average treatment effects, E^x\widehat{E}_{x}, is unbiased and the standard error is relatively small. The coverage of 95% confidence interval is nominal. The accuracy of the point estimate appears to be independent of XX.

4.2 Single Study - Binary Outcome

Then, we assessed the performance of the estimator of ExE_{x} in equation (13) for a single study setting with binary outcome and continuous unmeasured confounder. Assuming 200 subjects and the 4 repeated measures per subject, AA, XX, and UU were simulated similarly as for continuous outcome. We simulated random intercept ζj\zeta_{j} and the level-1 residuals ϵij\epsilon_{ij} from the distributions N(0,ν)N(0,\nu) and N(0,1)N(0,1), respectively. We proceeded to simulate the response variable YY using the underlying true model described in (6) setting β0=4.5\beta_{0}=-4.5, β1=1\beta_{1}=1, β2=3\beta_{2}=3, and β3=0.5\beta_{3}=-0.5 while varying the value of θ\theta.

As we have shown, equation (13) holds when θ\theta, σu\sigma_{u}, true ICC, and confounded ICC are small, thus, in order to fully understand the relationship between bias of E^x\widehat{E}_{x}, θ\theta, σu\sigma_{u}, and true ICC, we will estimate bias of E^x\widehat{E}_{x} under a set of different θ\theta, σu\sigma_{u}, and true ICC. Similarly, E^x\widehat{E}_{x} can be calculated after we estimated the confounded coefficient β1c\beta_{1}^{c} and β3c\beta_{3}^{c}. We simulated 1,000 such studies. To measure the performance of our methods in estimating conditional causal treatment effect, we assessed the bias, standard error (SE) and coverage probability (CP) of 95% confidence intervals of E^x\widehat{E}_{x}. Results in Table 3 show that when θ\theta, σu\sigma_{u}, and true ICC is in range of simulated scenarios, our proposed estimators is unbiased and precise. There is no monotone relationship between accuracy of E^x=0\widehat{E}_{x=0} and θ\theta, σu\sigma_{u}, or true ICC, as the approximation that allows equation (13) to hold also depends on the confounded ICC which is related with all parameters.

4.3 Multiple Studies

For multiple studies, we assessed the performance of the estimator of p(q)xp(q)_{x}, p^(q)x=1Φ(q+μBxμ^Excν^ExcνBx)\widehat{p}(q)_{x}=1-\Phi\left(\frac{q+\mu_{B_{x}}-\widehat{\mu}_{E_{x}^{c}}}{\sqrt{\widehat{\nu}_{E_{x}^{c}}-\nu_{B_{x}}}}\right). Let KK be the number of studies, JJ be the subjects in each study and II be the number of repeated measurements for each subject. Let K{15,30,100}K\in\{15,30,100\}, 𝔼(J){100,200}\mathbb{E}(J)\in\{100,200\}, and I{3,5}I\in\{3,5\}. For each study, we simulate JUnif(𝔼(J)50,𝔼(J)+50)J\sim Unif(\mathbb{E}(J)-50,\mathbb{E}(J)+50) to allow for between study variation in sample size. We simulated β1k\beta_{1k} and β3k\beta_{3k} from a bivariate normally distribution with μ1=3\mu_{1}=3, μ3=4\mu_{3}=4, V11=2V_{11}=2, V33=6V_{33}=6, and V13=0.05V_{13}=0.05, and simulated θN(5,0.01)\theta\sim N(5,0.01). Settings used in single study with continuous outcome were used to simulate AA, XX, UU, ζ\zeta, ϵ\epsilon, and YY in each of the KK studies. Then we estimated the effect sizes in each of the KK studies and then fit the random effects meta-analysis using R package meta (Schwarzer et al., 2015). We repeated the process 1,000 times.

Results in Table 4 show that our proposed estimator p^(q)x\widehat{p}(q)_{x} is unbiased and precise, as the SE are relatively small. The coverage of 95% confidence interval is nominal. The proposed estimator performed well for all simulated conditions across various settings on the number of studies, subjects, and repeated measurements per subject.

5 Discussion

In this study, we propose new sensitivity analysis methods to assess robustness of the treatment effect estimated from observational data against violation of the assumption of no unmeasured confounding. We developed methods for clustered outcomes (e.g., repeated measures per subject, patients nested within a clinic) of a single study and of multiple studies (meta-analysis) separately. Our methods enable researchers to estimate the minimal effect of the unmeasured confounder sufficient to explain away the confounded treatment effect, when applied after data analysis using the standard mixed effects modeling approach. If only an unmeasured confounder with strong effect can explain away the confounded effect, it means the observational study is robust to unmeasured confounding. The major advantage of our methods is to avoid unrealistic assumptions, in addition to efficient computation and uncomplicated implementation.

Unlike previously developed methods, our methods can be easily extended to deal with multiple unmeasured confounders. Taking continuous outcome case as an example, equations (LABEL:1) to (5) can be modified by including all unmeasured confounders, U1,,ULU_{1},\dots,U_{L}, and the bias factor, BxB_{x}, can be derived similarly such that Bx=θ1(m11xm10x)++θL(mL1xmL0x)B_{x}=\theta_{1}(m_{11x}-m_{10x})+\dots+\theta_{L}(m_{L1x}-m_{L0x}), where θl\theta_{l} is the effect of UlU_{l} on outcome and mlax=𝔼(Ul|A=a,X=x)m_{lax}=\mathbb{E}(U_{l}|A=a,X=x) for l=1,,Ll=1,\dots,L, if we assume that unmeasured confounders are independent with each other. Our methods can be easily extended to deal with multiple treatments. To find the relationship between the CATE and confounded treatment effects in a study involving SS treatments, we need to obtain msx=𝔼(Uij|Aij=s,Xij=x)m_{sx}=\mathbb{E}(U_{ij}|A_{ij}=s,X_{ij}=x) for s=1,,Ss=1,\dots,S. The bias factor for each treatment effect will have the same form as that for two treatments.

For multilevel data, for example, patients are nested within hospitals so unmeasured confounders could be at the patient or at the hospital level. In our derivations, we considered that the source of unmeasured confounders is at the patient level, so we incorporated UijU_{ij} in the model. To assess the impact of unmeasured confounders at the hospital level, we may replace UijU_{ij} with UjU_{j} in the model and the same subsequent derivation follows.

The minimal bias factor that can explain away the confounded result in a meta-analysis involves parameters qq and rr representing a scientifically meaningful effect size and the probability of observing a study that have an effect size larger than qq, respectively. The choice of qq and rr should be adapted based on the context and specific research questions. For parameter rr, our recommendation is to choose any value of rr less than 0.5 in order to keep equations (15) and (16) valid. Selection of qq should base on the research goals and the associated background knowledge. For example, if the goal of a cohort study is to test whether there is a treatment effect in a new physical therapy, the value of qq could be set to the treatment effect of the current protocol physical therapy.

While the focus of our proposed methods is on moving the confidence interval to include the null, we care not only about the statistical uncertainty of the estimate but also about the point estimate itself in practice. For this reason, researchers can also derive the bias factor that is able to move the point estimate to the null or even to reverse direction of the confounded effect. It is also possible to derive the minimal bias factor that can move a near null confounded treatment effect to a clinically meaningful level if it is the interest. No matter what target is set, a similar sensitivity analysis can be conducted once the (minimal) bias factor is calculated.

In practice, there could be violation against the assumption that the random effect and error term have constant variance in the confounded model, i.e., violation against the assumption that the variance of random effect and error term are independent of measured confounders and treatment. Given different relationship between unmeasured confounder, measured confounders, treatment and outcome, there could be different degree of violation. In our simulated scenarios, the data is generated in the way that this assumption is violated. The accurate and precise results of our simulation indicates that our methods are robust against the violation of the assumption that the random effect and error term have constant variance in the confounded model.

Depending on the context, the same value of minimal bias factor could carry different interpretations. We suggest that for all observational studies whose goal is to investigate causal relationships, to report the minimal bias factor needed to explain away both the point estimate and confidence interval of the observed effect, and then interpret minimal bias factor and assess the robustness of the study to unmeasured confounding in its own context.

Reference

References

  • Albanese et al. (2017) Albanese, E., Launer, L. J., Egger, M., Prince, M. J., Giannakopoulos, P., Wolters, F. J. and Egan, K. (2017). Body mass index in midlife and dementia: Systematic review and meta-regression analysis of 589,649 men and women followed in longitudinal studies. Alzheimers Dement (Amst) 20(8), 165–178.
  • Bates et al. (2015) Bates, D, Mächler, M, Bolker, B and Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1), 1–48.
  • Borenstein et al. (2007) Borenstein, Michael, Hedges, Larry V., Higgins, Julian P.T. and Rothstein, Hannah R. (2007). Introduction to Meta-Analysis, 1th edition. United Kingdom: Wiley.
  • Ding and VanderWeele (2017) Ding, Peng and VanderWeele, Tyler. (2017). Sensitivity analysis without assumptions. Epidemiology 27, 368–377.
  • Gaetano and Collaborative Group of the Primary Prevention Project (2001) Gaetano, G. D. and Collaborative Group of the Primary Prevention Project. (2001). Low-dose aspirin and vitamin e in people at cardiovascular risk: a randomised trial in general practice. Lancet 357, 89–95.
  • Imbens (2003) Imbens, G. W. (2003). Sensitivity to exogeneity assumptions in program evaluation. American Economic Review 93 (2), 126–132.
  • Lawlor et al. (2004) Lawlor, D. A., Davey, S. G., Kundu, D., Bruckdorfer, K. R. and Ebrahim, S. (2004). Those confounded vitamins: what can we learn from the differences between observational versus randomised trial evidence? Lancet 363, 1724–1727.
  • Lin et al. (1998) Lin, D. Y., Psaty, B. M. and Kronmal, R. A. (1998). Assessing the sensitivity of regression results to unmeasured confounders in observational studies. Biometrics 3, 948–963.
  • Luby et al. (2016) Luby, J., Belden, A., Whalen, D., Harms, M. and Barch, D. (2016). Breastfeeding and childhood iq - the mediating role of gray matter volume. Journal of the American Academy of Child and Adolescent Psychiatry 55, 67–75.
  • Marcus (1997) Marcus, S. M. (1997). Using omitted variable bias to assess uncertainty in the estimation of an aids education treatment effect. Journal of Educational and Behavioral Statistics 22, 193–201.
  • Mathur and VanderWeele (2019a) Mathur, M. B. and VanderWeele, T. J. (2019a). New metrics for meta-analyses of heterogeneous effects. Statistics in Medicine 38(1), 1336–1342.
  • Mathur and VanderWeele (2019b) Mathur, Maya B. and VanderWeele, Tyler J. (2019b). Sensitivity analysis for unmeasured confounding in meta-analyses. Journal of the American Statistical Association 115, 163–172.
  • Robins et al. (2000) Robins, J. M., Rotnitzky, A. and Scharfstein, D. O. (2000). Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. Statistical models in epidemiology, the environment, and clinical trials, 1–94.
  • Rosenbaum and Rubin (1983) Rosenbaum, PR and Rubin, DB. (1983). Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. Journal of the Royal Statistical Society 45.2, 212–218.
  • Schwarzer et al. (2015) Schwarzer, G., Carpenter, J. R. and Rucker, G. (2015). Meta-Analysis with R, 1st edition. Springer.
  • Sofer et al. (2016) Sofer, T., Richardson, D. B., Colicino, E., Schwartz, J. and Tchetgen Tchetgen, E. J. (2016). On negative outcome control of unobserved confounding as a generalization of difference-in-differences. Statistical Science 31(3), 348–361.
  • Stampfer et al. (1993) Stampfer, M. J., Hennekens, C. H., Manson, J. E., Colditz, G. A., Rosner, B. and Willett, W. C. (1993). itamin e consumption and the risk of coronary disease in women. Lancet 328, 1444–1449.
  • Tchetgen Tchetgen et al. (2020) Tchetgen Tchetgen, EJ, Ying, A, Cui, Y, Shi, X and Miao, W. (2020). An introduction to proximal causal learning. arXiv.
  • The Alpha-Tocopherol Beta Carotene Cancer Prevention Study Group (1994) The Alpha-Tocopherol Beta Carotene Cancer Prevention Study Group. (1994). The effect of vitamin e and beta carotene on the incidence of lung cancer and other cancers in male smokers. New England Journal of Medicine 330, 1029–1035.
  • VanderWeele and Ding (2017) VanderWeele, Tyler and Ding, Peng. (2017). Sensitivity analysis in observational research: Introducing the e-value. Annals of internal medicine 167.4, 268–274.
  • Willett (1990) Willett, W. C. (1990). Vitamin a and lung cancer. Nutrition Reviews 48(5), 201–211.

Appendix A Binary Outcome

We establish a result that for binary outcome when vv is small, we have

logit{(Yij=1|Aij,Xij,Uij)}β0+β1Aij+β2Xij+β3AijXij+θUij.\text{logit}\{\mathbb{P}(Y_{ij}=1|A_{ij},X_{ij},U_{ij})\}\approx\beta_{0}+\beta_{1}A_{ij}+\beta_{2}X_{ij}+\beta_{3}A_{ij}X_{ij}+\theta U_{ij}.

proof.proof.

logit{(Y=1|A,X,U)}=logexp(β0+β1A+β2X+β3AX+θU+ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(ζ|A,U,X)1exp(β0+β1A+β2X+β3AX+θU+ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(ζ|A,U,X)=β0+β1A+β2X+β3AX+θU+logexp(ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(ζ|A,U,X)1exp(β0+β1A+β2X+β3AX+θU+ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(ζ|A,U,X).\begin{split}&\text{logit}\{\mathbb{P}(Y=1|A,X,U)\}\\ &=\text{log}\frac{\int_{-\infty}^{\infty}\frac{\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(\zeta|A,U,X)}{1-\int_{-\infty}^{\infty}\frac{\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(\zeta|A,U,X)}\\ &=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\text{log}\frac{\int_{-\infty}^{\infty}\frac{\text{exp}(\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(\zeta|A,U,X)}{1-\int_{-\infty}^{\infty}\frac{\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(\zeta|A,U,X)}.\end{split}

Denote h(A,X,U)=exp(ζ)1+exp(Δ+ζ)𝑑F(ζ|A,X,U)h(A,X,U)=\int_{-\infty}^{\infty}\frac{\text{exp}(\zeta)}{1+\text{exp}(\Delta+\zeta)}dF(\zeta|A,X,U), where Δ=β0+β1A+β2X+β3AX+θU\Delta=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U. Thus, logit{(Y=1|A,X,U)}=β0+β1A+β2X+β3AX+θU+f(A,X,U),\text{logit}\{\mathbb{P}(Y=1|A,X,U)\}=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+f(A,X,U), where f(A,X,U)=logh(A,X,U)1exp(Δ)h(A,X,U)f(A,X,U)=\text{log}\frac{h(A,X,U)}{1-\text{exp}(\Delta)*h(A,X,U)}. If v0v\rightarrow 0, it means

h(A,X,U)exp(𝔼(ζ))1+exp(Δ+𝔼(ζ))=11+exp(Δ).h(A,X,U)\rightarrow\frac{\text{exp}(\mathbb{E}(\zeta))}{1+\text{exp}(\Delta+\mathbb{E}(\zeta))}=\frac{1}{1+\text{exp}(\Delta)}.

Thus, when v0v\rightarrow 0

f(A,X,U)log11+exp(Δ)1exp(Δ)1+exp(Δ)=0.f(A,X,U)\approx\text{log}\frac{\frac{1}{1+\text{exp}(\Delta)}}{1-\frac{\text{exp}(\Delta)}{1+\text{exp}(\Delta)}}=0.

Therefore, when v0v\rightarrow 0

logit{(Y=1|A,X,U)}β0+β1A+β2X+β3AX+θU.\text{logit}\{\mathbb{P}(Y=1|A,X,U)\}\approx\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U.

Appendix B Binary Outcome and Binary Unmeasured Confounder

We establish a result that for binary outcome with presence of binary unmeasured confounder in the study, when both vv and θ\theta are small, we have

logit{(Y=1|A,X)}=β0+β1A+β2X+β3AX+log{PAxexp(θ+ν2)+(1PAx)exp(ν2)}\text{logit}\{\mathbb{P}(Y=1|A,X)\}=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\text{log}\{P_{Ax}\text{exp}(\theta+\frac{\nu}{2})+(1-P_{Ax})\text{exp}(\frac{\nu}{2})\}

proof.proof.

logit{(Y=1|A,X)}=logexp(β0+β1A+β2X+β3AX+θU+ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(U,ζ|A,X)1exp(β0+β1A+β2X+β3AX+θU+ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(U,ζ|A,X)=β0+β1A+β2X+β3AX+logexp(θU+ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(U,ζ|A,X)1exp(β0+β1A+β2X+β3AX+θU+ζ)1+exp(β0+β1A+β2X+β3AX+θU+ζ)𝑑F(U,ζ|A,X).\begin{split}&\text{logit}\{\mathbb{P}(Y=1|A,X)\}\\ &=\text{log}\frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(U,\zeta|A,X)}{1-\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(U,\zeta|A,X)}\\ &=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\text{log}\frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\text{exp}(\theta U+\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(U,\zeta|A,X)}{1-\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}{1+\text{exp}(\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta U+\zeta)}dF(U,\zeta|A,X)}.\end{split}

Denote h(A,X)=exp(θU+ζ)1+exp(Δ+θU+ζ)𝑑F(U,ζ|A,X)h(A,X)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\text{exp}(\theta U+\zeta)}{1+\text{exp}(\Delta+\theta U+\zeta)}dF(U,\zeta|A,X), where Δ=β0+β1A+β2X+β3AX\Delta=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX. Thus, logit{(Y=1|A,X)}=β0+β1A+β2X+β3AX+f(A,X),wheref(A,X)=logh(A,X)1exp(Δ)h(A,X)\text{logit}\{\mathbb{P}(Y=1|A,X)\}=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+f(A,X),\ where\ f(A,X)=\text{log}\frac{h(A,X)}{1-\text{exp}(\Delta)*h(A,X)}.

h(A,X)=exp(θU+ζ)1+exp(Δ+θU+ζ)𝑑F(U,ζ|A,X)=exp(θU)exp(ζ)1+exp(Δ+θU+ζ)12πνexp(ζ22ν)𝑑ζ𝑑F(U|A,X)=exp(θU)exp(tν)1+exp(Δ+θU+tν)12πνexp(t22)ν𝑑t𝑑F(U|A,X),t=ζν=exp(θU)12π11+exp(Δ+θU+(s+ν)ν)exp(s22)exp(ν2)𝑑s𝑑F(U|A,X),s=tν=exp(θU+ν2)2π11+exp(Δ+θU+sν+ν)exp(s22)𝑑s𝑑F(U|A,X)=PAzexp(θ+ν2)2π11+exp(Δ+θ+sν+ν)exp(s22)𝑑s+(1PAx)exp(ν2)2π11+exp(Δ+sν+ν)exp(s22)𝑑s,\small\begin{split}h(A,X)&=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\text{exp}(\theta U+\zeta)}{1+\text{exp}(\Delta+\theta U+\zeta)}dF(U,\zeta|A,X)\\ &=\int_{-\infty}^{\infty}\text{exp}(\theta U)\int_{-\infty}^{\infty}\frac{\text{exp}(\zeta)}{1+\text{exp}(\Delta+\theta U+\zeta)}\frac{1}{\sqrt{2\pi\nu}}\text{exp}(\frac{-\zeta^{2}}{2\nu})d\zeta dF(U|A,X)\\ &=\int_{-\infty}^{\infty}\text{exp}(\theta U)\int_{-\infty}^{\infty}\frac{\text{exp}(t\sqrt{\nu})}{1+\text{exp}(\Delta+\theta U+t\sqrt{\nu})}\frac{1}{\sqrt{2\pi\nu}}\text{exp}(\frac{-t^{2}}{2})\sqrt{\nu}dtdF(U|A,X),\ t=\frac{\zeta}{\sqrt{\nu}}\\ &=\int_{-\infty}^{\infty}\text{exp}(\theta U)\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+\theta U+(s+\sqrt{\nu})\sqrt{\nu})}\text{exp}(\frac{-s^{2}}{2})\text{exp}(\frac{\nu}{2})dsdF(U|A,X),\ s=t-\sqrt{\nu}\\ &=\int_{-\infty}^{\infty}\frac{\text{exp}(\theta U+\frac{\nu}{2})}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+\theta U+s\sqrt{\nu}+\nu)}\text{exp}(\frac{-s^{2}}{2})dsdF(U|A,X)\\ &=P_{Az}\frac{\text{exp}(\theta+\frac{\nu}{2})}{\sqrt{2\pi}}\int\frac{1}{1+\text{exp}(\Delta+\theta+s\sqrt{\nu}+\nu)}\text{exp}(\frac{-s^{2}}{2})ds+\\ &(1-P_{Ax})\frac{\text{exp}(\frac{\nu}{2})}{\sqrt{2\pi}}\int\frac{1}{1+\text{exp}(\Delta+s\sqrt{\nu}+\nu)}\text{exp}(\frac{-s^{2}}{2})ds,\end{split} (17)

where PAX=P(Uij=1|Aij,Xij)P_{AX}=P(U_{ij}=1|A_{ij},X_{ij}).

1exp(Δ)h(A,X=x)=PAx+(1PAx)exp(Δ)h(A,X=x)=PAx1exp(Δ+θ+ζ)1+exp(Δ+θ+ζ)dF(ζ|A,X=x)+(1PAx)1exp(Δ+ζ)1+exp(Δ+ζ)dF(ζ|A,X=x)=PAx11+exp(Δ+θ+ζ)12πνexp(ζ22ν)𝑑ζ+(1PAx)11+exp(Δ+ζ)12πνexp(ζ22ν)𝑑ζ=PAx2π11+exp(Δ+θ+tν)exp(t22)𝑑t+(1PAx)2π11+exp(Δ+tν)exp(t22)𝑑t,t=ζν.\begin{split}&1-\text{exp}(\Delta)*h(A,X=x)\\ &=P_{Ax}+(1-P_{Ax})-\text{exp}(\Delta)*h(A,X=x)\\ &=P_{Ax}\int_{-\infty}^{\infty}1-\frac{\text{exp}(\Delta+\theta+\zeta)}{1+\text{exp}(\Delta+\theta+\zeta)}dF(\zeta|A,X=x)+\\ &(1-P_{Ax})\int_{-\infty}^{\infty}1-\frac{\text{exp}(\Delta+\zeta)}{1+\text{exp}(\Delta+\zeta)}dF(\zeta|A,X=x)\\ &=P_{Ax}\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+\theta+\zeta)}\frac{1}{\sqrt{2\pi\nu}}\text{exp}(\frac{-\zeta^{2}}{2\nu})d\zeta+\\ &(1-P_{Ax})\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+\zeta)}\frac{1}{\sqrt{2\pi\nu}}\text{exp}(\frac{-\zeta^{2}}{2\nu})d\zeta\\ &=\frac{P_{Ax}}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+\theta+t\sqrt{\nu})}\text{exp}(\frac{-t^{2}}{2})dt+\\ &\frac{(1-P_{Ax})}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+t\sqrt{\nu})}\text{exp}(\frac{-t^{2}}{2})dt,\ t=\frac{\zeta}{\sqrt{\nu}}.\end{split} (18)

Based on the results from equation (17) and equation (18):

f(A,X=x)=logPAxexp(θ+ν2)11+exp(Δ+θ+sν+ν)exp(s22)𝑑s+(1PAx)exp(ν2)11+exp(Δ+sν+ν)exp(s22)𝑑sPAx11+exp(Δ+θ+tν)exp(t22)𝑑t+(1PAx)11+exp(Δ+tν)exp(t22)𝑑t.\small\begin{split}&f(A,X=x)=\text{log}\frac{P_{Ax}\text{exp}(\theta+\frac{\nu}{2})\int\frac{1}{1+\text{exp}(\Delta+\theta+s\sqrt{\nu}+\nu)}\text{exp}(\frac{-s^{2}}{2})ds+(1-P_{Ax})\text{exp}(\frac{\nu}{2})\int\frac{1}{1+\text{exp}(\Delta+s\sqrt{\nu}+\nu)}\text{exp}(\frac{-s^{2}}{2})ds}{P_{Ax}\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+\theta+t\sqrt{\nu})}\text{exp}(\frac{-t^{2}}{2})dt+(1-P_{Ax})\int_{-\infty}^{\infty}\frac{1}{1+\text{exp}(\Delta+t\sqrt{\nu})}\text{exp}(\frac{-t^{2}}{2})dt}.\end{split}

When both θ\theta and ν\nu are small, we have

f(A,X=x)log{PAxexp(θ+ν2)+(1PAx)exp(ν2)}.f(A,X=x)\approx\text{log}\{P_{Ax}\text{exp}(\theta+\frac{\nu}{2})+(1-P_{Ax})\text{exp}(\frac{\nu}{2})\}.

Appendix C Binary Outcome and Continuous Unmeasured Confounder

We establish a result that for binary outcome with presence of continuous unmeasured confounder in the study, when both vv, θ\theta, and σu\sigma_{u} are small, we have

logit{(Y=1|A,X)}=β0+β1A+β2X+β3AX+θμAX+θ2σu+ν2\text{logit}\{\mathbb{P}(Y=1|A,X)\}=\beta_{0}+\beta_{1}A+\beta_{2}X+\beta_{3}AX+\theta\mu_{AX}+\frac{\theta^{2}\sigma_{u}+\nu}{2}

proof.proof. Denote l=θU+ζl=\theta U+\zeta, as we assume that Uζ|A,XU\perp\zeta\ |\ A,X, thus l=θU+ζN(μl,σl)l=\theta U+\zeta\sim N(\mu_{l},\sigma_{l}), where μl=θμl\mu_{l}=\theta\mu_{l} and σl=θ2σu+ν\sigma_{l}=\theta^{2}\sigma_{u}+\nu. We have,

h(A,X)=exp(μl+σl2)2π11+exp(Δ+μl+mσl+σl)exp(m22)𝑑m,m=lμlσlσl;h(A,X)=\frac{\text{exp}(\mu_{l}+\frac{\sigma_{l}}{2})}{\sqrt{2\pi}}\int\frac{1}{1+\text{exp}(\Delta+\mu_{l}+m\sqrt{\sigma_{l}}+\sigma_{l})}\text{exp}(\frac{-m^{2}}{2})dm,\ m=\frac{l-\mu_{l}}{\sqrt{\sigma_{l}}}-\sqrt{\sigma_{l}};
1exp(Δ)h(A,X)=12π11+exp(Δ+μl+nσl)exp(n22)𝑑n,n=lμlσl;1-\text{exp}(\Delta)*h(A,X)=\frac{1}{\sqrt{2\pi}}\int\frac{1}{1+\text{exp}(\Delta+\mu_{l}+n\sqrt{\sigma_{l}})}\text{exp}(\frac{-n^{2}}{2})dn,\ n=\frac{l-\mu_{l}}{\sqrt{\sigma_{l}}};
f(A,X=x)=logexp(μl+σl2)2π11+exp(Δ+μl+mσl+σl)exp(m22)𝑑m12π11+exp(Δ+μl+nσl)exp(n22)𝑑n.\begin{split}&f(A,X=x)=\text{log}\frac{\frac{\text{exp}(\mu_{l}+\frac{\sigma_{l}}{2})}{\sqrt{2\pi}}\int\frac{1}{1+\text{exp}(\Delta+\mu_{l}+m\sqrt{\sigma_{l}}+\sigma_{l})}\text{exp}(\frac{-m^{2}}{2})dm}{\frac{1}{\sqrt{2\pi}}\int\frac{1}{1+\text{exp}(\Delta+\mu_{l}+n\sqrt{\sigma_{l}})}\text{exp}(\frac{-n^{2}}{2})dn}.\end{split}

When σl\sigma_{l} is small,

f(A,X=x)μl+σl2.f(A,X=x)\approx\mu_{l}+\frac{\sigma_{l}}{2}.
Refer to caption
Figure 1: The contour plot shows the relationship between m1xm0xm_{1x}-m_{0x} (x axis) and θ\theta (y axis) and the bias factor. m0xm_{0x}: mean of unmeasured confounder in control group; m1xm_{1x}: mean of unmeasured confounder in treatment group; θ\theta: effect of unmeasured confounder on outcome
Table 1: Calculating minimum (common constant) bias factor for single study and multiple studies (meta analysis).
Single study Multiple studies (meta-analysis)
Apparently positive treatment effect Bx=lbB_{x}^{*}=lb Bx=Φ1(1r)V^Exc12q+μ^ExcB_{x}^{*}=\Phi^{-1}(1-r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-q+\widehat{\mu}_{E_{x}^{c}}
Apparently negative treatment effect B~x=ub2\tilde{B}_{x}^{*}\ =-ub{\textsuperscript{2}} B~x=qΦ1(r)V^Exc12μ^Exc\tilde{B}_{x}^{*}=q-\Phi^{-1}(r)\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}-\widehat{\mu}_{E_{x}^{c}}

BxB_{x}^{*}: minimum (common constant) bias factor in study with apparently positive treatment effect; B~x\tilde{B}_{x}^{*}: minimum (common constant) bias factor in study with apparently negative treatment effect; lblb: lower bound of confidence interval; ubub: upper bound of confidence interval; μ^Exc\widehat{\mu}_{E_{x}^{c}}: estimated grand mean of confounded treatment effect in meta analysis; V^Exc12\widehat{V}_{E_{x}^{c}}^{\frac{1}{2}}: estimated between study variance of confounded treatment effect in meta analysis; qq: a scientifically meaningful effect size; rr: threshold of probability of observing a conditional average treatment effects larger than a scientifically meaningful effect size qq for results to be of scientific interest

Table 2: Bias, standard error (SE) and coverage probability (CP) of 95% confidence intervals of the estimated conditional average treatment effects, E^x\widehat{E}_{x} conditional on X=0X=0 and X=1X=1 in single study with continuous outcome.
X=0X=0 X=1X=1
JJ II β1\beta_{1} β3\beta_{3} θ\theta σu2\sigma_{u}^{2} Bias SE CP Bias SE CP
5050 33 1-1 11 0.50.5 0.250.25 0.018 0.281 0.948 0.009 0.285 0.956
100100 33 1-1 11 0.50.5 0.250.25 0.03 0.157 0.952 0.017 0.214 0.951
100100 88 1-1 11 0.50.5 0.250.25 0.006 0.109 0.951 0.002 0.115 0.948
100100 33 1-1 11 0.5-0.5 0.250.25 0.004 0.203 0.94 0.002 0.199 0.96
100100 33 11 11 0.50.5 0.250.25 0.01 0.195 0.952 0.009 0.212 0.954
100100 33 11 1-1 0.50.5 0.250.25 0.001 0.159 0.954 0.005 0.210 0.959
100100 33 1-1 11 0.50.5 11 0.002 0.155 0.969 0.020 0.199 0.967

JJ: size of level-1 unit (e.g., clusters); II: size of level-2 unit (e.g., subjects nested in clusters); 𝜷\boldsymbol{\beta}: regression parameters; θ\theta: effect of unmeasured confounder on outcome; σu2\sigma_{u}^{2}: variance of unmeasured confounder

Table 3: Bias, standard error (SE) and coverage probability (CP) of 95% confidence intervals of the estimated conditional average treatment effects, E^x\widehat{E}_{x} conditional on X=0X=0 and X=1X=1 in single study with binary outcome.
X=0X=0 X=1X=1
θ\theta ICC σu2\sigma_{u}^{2} Bias SE CP Bias SE CP
-0.5 0.15 0.25 0.001 0.987 0.983 0.025 0.289 0.960
-0.5 0.15 1 0.047 0.707 0.981 0.031 0.278 0.957
-0.5 0.15 2.25 0.034 0.670 0.968 0.048 0.376 0.959
-0.5 0.25 0.25 0.028 0.637 0.974 0.019 0.301 0.952
-0.5 0.25 1 0.039 0.659 0.972 0.034 0.292 0.953
-0.5 0.25 2.25 0.033 0.588 0.967 0.033 0.287 0.967
-0.5 0.35 0.25 0.045 0.586 0.963 0.008 0.321 0.956
-0.5 0.35 1 0.032 0.575 0.968 0.005 0.304 0.954
-0.5 0.35 2.25 0.079 0.579 0.954 0.056 0.307 0.951
0.5 0.25 0.25 0.012 0.649 0.969 0.004 0.302 0.953
0.5 0.25 1 0.047 0.645 0.971 0.034 0.289 0.954
0.5 0.25 2.25 0.057 0.602 0.968 0.038 0.275 0.958

θ\theta: effect of unmeasured confounder on outcome; ICC: intraclass correlation coefficient; σu2\sigma_{u}^{2}: variance of unmeasured confounder

Table 4: Bias, Standard Error (SE) and coverage probability (CP) of 95% confidence intervals of the estimated probability of observing a conditional average treatment effects larger than qq, p^(q)x\widehat{p}(q)_{x}, conditional on X=0X=0 and X=1X=1 in meta analysis.
X=0 X=1
KK 𝔼(J)\mathbb{E}(J) II Bias SE CP Bias SE CP
15 100 3 0.048 0.125 0.956 0.001 0.126 0.962
15 100 5 0.094 0.124 0.937 0.006 0.126 0.973
15 200 3 0.079 0.124 0.940 0.025 0.125 0.96
15 200 5 0.05 0.125 0.964 0.017 0.124 0.966
30 100 3 0.044 0.089 0.962 0.008 0.090 0.972
30 100 5 0.056 0.089 0.941 0.031 0.090 0.972
30 200 3 0.046 0.088 0.936 0.015 0.090 0.98
30 200 5 0.009 0.090 0.979 0.030 0.091 0.969
100 100 3 0.012 0.049 0.985 0.009 0.048 0.985
100 100 5 0.002 0.048 0.987 0.017 0.049 0.981
100 200 3 0.004 0.049 0.972 0.009 0.049 0.989
100 200 5 0.003 0.048 0.989 0.005 0.048 0.989

KK: size of studies included in meta analysis; 𝔼(J)\mathbb{E}(J): average size of level-1 unit (e.g., clusters); II: size of level-2 unit (e.g., subjects nested in clusters);