Identification of causal direct-indirect effects
without untestable assumptions

Takahiro Hoshino

Department of Economics, Keio University
RIKEN Center for Advanced Intelligence Project

ABSTRACT

In causal mediation analysis, identification of existing causal direct or indirect effects requires untestable assumptions in which potential outcomes and potential mediators are independent. This paper defines a new causal direct and indirect effect that does not require the untestable assumptions. We show that the proposed measure is identifiable from the observed data, even if potential outcomes and potential mediators are dependent, while the existing natural direct or indirect effects may find a pseudo-indirect effect when the untestable assumptions are violated.

Keywords: causal effect, causal mediation, mediation analysis, natural indirect effect, potential outcome

1 Introduction

In recent years, there has been considerable methodological development and applied work on causal mediation analysis based on the potential outcome approach (Rubin, 1974), with the aim of understanding causal mechanisms (for example, Pearl 2001; Van der Weele 2009; Imai et al. 2010a; Tchetgen Tchetgen & Shpitser 2012; Ding & Van der Weele 2016; Miles et al. 2020). Let $T$ denote the exposure or treatment of interest, $Y$ the outcome, $M$ the mediator, and $v$ the baseline covariates, which are not affected by the exposure or the mediator. Following the potential outcome approach, let $Y_j(m)$ be the potential outcome when $T=j$ and $M=m$.

Most recent studies in causal mediation analysis consider the natural direct/indirect effects, which are defined using the expectation of the “never-observed” outcome (not a “potential outcome”) $Y_j(M_k)\ (j \neq k)$, that is, the potential outcome under treatment $j$ with the mediator set to its potential value under treatment $k$, $M_k$. Various assumptions have been proposed to identify these effects. The following assumptions (Pearl, 2001) are often made:

Assumption 1

$Y_j(k) \perp\!\!\!\perp T \mid v,\ \forall j,k,$

Assumption 2

$M_j \perp\!\!\!\perp T \mid v,\ \forall j,$

Assumption 3

$Y_j(k) \perp\!\!\!\perp M \mid T, v,\ \forall j,k,$

Assumption 4

$Y_j(k) \perp\!\!\!\perp M_{j^*} \mid v,\ \forall j,j^*,k.$

Another set of sufficient conditions for identifying the two natural effects is given by the following sequential ignorability conditions (SI1 and SI2) of Imai et al. (2010b):

Assumption SI1

$\{Y_j(k), M_{j^*}\} \perp\!\!\!\perp T \mid v,\ \forall j,j^*,k,$

Assumption SI2

$Y_j(k) \perp\!\!\!\perp M_{j^*} \mid T=j^*, v,\ \forall j,j^*,k.$

For the relationship between these sets of conditions, see Pearl (2014).

As various studies have pointed out, Assumptions 1 and 2, or Assumption SI1, are satisfied if $T$ is randomized. Assumption 4 and Assumption SI2 are not testable, because they state independence between potential outcomes and potential mediators, some of which are never observed simultaneously. These assumptions are not guaranteed to hold even if both $T$ and $M$ are randomized or ignorable given $v$ (Pearl, 2014), whereas Assumption 3 does hold in that case.

This paper defines new causal mediation effects that are identifiable from observed data without the untestable assumptions when both $T$ and $M$ are randomized or ignorable given $v$. The proposed effects are useful even when randomization of the mediator is not possible, because the assumptions required for their identification are weaker than those for the traditional estimands, the natural direct/indirect effects.

The proposed direct and indirect effects will have the following properties:

(a) The indirect effect is zero if $M_0 = M_1$ for all units.

(b) The effects are identifiable without the untestable Assumption 4 or SI2.

(c) The effects are expressed in terms of the potential outcomes $Y_j(m)$ and the potential mediators $M_j$, not $Y_j(M_k)$. Thus, the causal interpretation is straightforward.

2 Notation and existing estimand

Without loss of generality, we consider a binary treatment in this paper (for multi-valued treatments, the results can be generalised using arguments similar to those of Imbens 2000). Using the two potential mediators $M_1$ (under treatment $T=1$) and $M_0$ (under treatment $T=0$), $M$ is expressed as

$$M = T M_1 + (1-T) M_0. \tag{1}$$

We assume that $M$ is a categorical variable (i.e., $M = 0, \cdots, M^*$). Let $M(m)$ be the binary indicator such that $M(m)=1$ if $M=m$ and $M(m)=0$ otherwise. Similarly, let $M_j(m)$ be the binary indicator under treatment $T=j$: $M_j(m)=1$ if the potential mediator under treatment $j$, $M_j$, equals $m$.

The observed outcome YY is expressed by the potential outcomes, potential mediators, and treatment indicator as follows:

$$Y = \sum_{m=0}^{M^*}\Bigl[T M(m) Y_1(m) + (1-T) M(m) Y_0(m)\Bigr] = \sum_{m=0}^{M^*}\Bigl[T M_1(m) Y_1(m) + (1-T) M_0(m) Y_0(m)\Bigr]. \tag{2}$$

The observed outcome can also be expressed by the two potential outcomes $Y_1 = Y_1(M_1)$ (under treatment $T=1$) and $Y_0(M_0)$ (under treatment $T=0$) as follows:

$$Y = T Y_1 + (1-T) Y_0 = T Y_1(M_1) + (1-T) Y_0(M_0) \tag{3}$$

under the composition assumption (Pearl, 2009).
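As a minimal sketch of Equations (1) and (3), the following Python snippet maps potential mediators and potential outcomes to the observed pair $(M, Y)$. The function name `observe` and the array layout `Y_pot[i, j, m]` (holding $Y_j(m)$ for unit `i`) are illustrative assumptions and do not appear in the paper.

```python
import numpy as np

def observe(T, M0, M1, Y_pot):
    """Map potential mediators and outcomes to the observed (M, Y).

    T, M0, M1 : integer arrays of shape (n,)
    Y_pot     : array of shape (n, 2, K), Y_pot[i, j, m] = Y_j(m) for unit i
                (hypothetical layout chosen for this sketch)
    """
    T = np.asarray(T)
    M = T * np.asarray(M1) + (1 - T) * np.asarray(M0)  # Equation (1)
    idx = np.arange(len(T))
    Y = np.asarray(Y_pot)[idx, T, M]                   # Equation (3): Y = Y_T(M_T)
    return M, Y

# Tiny hand-checkable example with a binary mediator (K = 2).
T = np.array([1, 0, 1])
M0 = np.array([0, 1, 1])
M1 = np.array([1, 1, 0])
Y_pot = np.arange(12, dtype=float).reshape(3, 2, 2)  # Y_pot[i, j, m] = 4i + 2j + m
M, Y = observe(T, M0, M1, Y_pot)
print(M, Y)
```

Only one of the four potential outcomes of each unit is selected by the indexing step, which is why quantities such as $Y_1(M_0)$ never appear in the observed data.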

The average treatment effect ($ATE$) is defined as the expectation of the difference between the two potential outcomes:

$$ATE \equiv E[Y_1 - Y_0]. \tag{4}$$

A straightforward way of defining the direct effect is to set the mediator to a pre-specified level $M=m$. Pearl (2001) defined the controlled direct effect with the mediator fixed at $M=m$, $CDE(m)$, in which the mediator is set to $m$ uniformly over the entire population:

$$ICDE(m) \equiv Y_1(m) - Y_0(m), \qquad CDE(m) \equiv E[ICDE(m)], \tag{5}$$

where $ICDE(m)$ is a unit-level version of $CDE(m)$ that we define here for later use. As pointed out in existing studies, the quantity $ATE - CDE$ is not a proper measure of the indirect effect, because it may be nonzero even when $M_0 = M_1$ for all units (Van der Weele, 2009).

In the literature on causal mediation analysis (Robins & Greenland, 1992; Pearl, 2001, 2009), $ATE$ is expressed as the sum of the natural direct effect ($NDE$) and the natural indirect effect ($NIE$), instead of using $CDE$. Following Imai et al. (2010b),

$$\begin{aligned}
NDE(t) &\equiv E[Y_1(M_t) - Y_0(M_t)], \qquad NIE(t) \equiv E[Y_t(M_1) - Y_t(M_0)], \quad (t = 0, 1) \\
NDE &\equiv \frac{1}{2}[NDE(1) + NDE(0)], \qquad NIE \equiv \frac{1}{2}[NIE(1) + NIE(0)] \\
ATE &= E[Y_1 - Y_0] = E[Y_1(M_1) - Y_0(M_0)] = NDE + NIE.
\end{aligned} \tag{6}$$

Note that $NDE$ and $NIE$ are not identified without further assumptions, because quantities such as $Y_1(M_0)$ are not observable. For identification, existing studies assume independence between potential outcomes and potential mediators (given some observable covariates), or related conditions such as sequential ignorability. In Section 5, we show that when this assumption is violated the existing estimator of $NIE$ can be biased, reporting a nonzero indirect effect even when there is no mediation, whereas the proposed measure does not suffer from this problem.

3 Definition of weighted direct effect and estimable indirect effect

Identification of NDE and NIE requires Assumption 4 or SI2 because the causal mediation effects are defined using the “never-observable” outcome (not a “potentially observable” outcome) $Y_j(M_k)\ (j \neq k)$. However, $Y_j(k)$ and $M_j$ are observed for some portion of the units.

We redefine the potential outcomes $Y_j(M_k)$ as functions of the potential outcomes and potential mediators as follows:

$$Y_j(M_k) \equiv \sum_{m=0}^{M^*} M_k(m) Y_j(m). \tag{7}$$

Under this definition, $Y_j(M_k) = Y_j(m)$ when $M_k(m) = 1$.

By using this expression, ATE is written as follows:

$$ATE = E[Y_1(M_1) - Y_0(M_0)] = E\Bigl[\sum_{m=0}^{M^*}\{M_1(m) Y_1(m) - M_0(m) Y_0(m)\}\Bigr]. \tag{8}$$

We then define the weighted controlled direct effect ($WCDE$) as

$$\begin{aligned}
WCDE &\equiv E\Bigl[\sum_{m=0}^{M^*}\{M(m) Y_1(m) - M(m) Y_0(m)\}\Bigr] \\
&= E\Bigl[\sum_{m=0}^{M^*} M(m)(Y_1(m) - Y_0(m))\Bigr] = E\Bigl[\sum_{m=0}^{M^*} M(m)\, ICDE(m)\Bigr].
\end{aligned} \tag{9}$$

Note that in $CDE(m)$ the mediator is set to a specific value, $M=m$, whereas $WCDE$ is the weighted average of $ICDE(m)$ over the observed distribution of $M$.

The implied indirect effect ($IIE$) is expressed as:

$$\begin{aligned}
IIE &\equiv ATE - WCDE \\
&= E\Bigl[\sum_{m=0}^{M^*}\bigl[(M_1(m) - M(m)) Y_1(m) + (M(m) - M_0(m)) Y_0(m)\bigr]\Bigr].
\end{aligned} \tag{10}$$

While the quantity $ATE - CDE$ may be nonzero even when $M_0 = M_1$ for all units, $IIE$ is always zero if $M_1 = M_0$ for all units.

Theorem 1.

$IIE$ is equal to zero if $M_1 = M_0$ for all units.

Proof of Theorem 1.

By Equation 1, if $M_1 = M_0$ then $M = M_1 = M_0$, so that $M_1(m) - M(m) = 0$ and $M(m) - M_0(m) = 0$ for every $m$. Hence every term in Equation 10 vanishes and $IIE = 0$. ∎
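Theorem 1 can be checked numerically whenever the full table of potential outcomes is available, as it is in a simulation. The sketch below is hypothetical: `oracle_effects` and the layout `Y_pot[i, j, m]` (holding $Y_j(m)$ for unit `i`) are assumptions made for illustration. It uses the fact that $\sum_m M(m)\, ICDE(m) = ICDE(M)$ for each unit.

```python
import numpy as np

def oracle_effects(T, M0, M1, Y_pot):
    """Sample analogues of ATE, WCDE and IIE from full potential-outcome data.

    This is only possible in simulations, where every Y_j(m) is known.
    """
    T, M0, M1 = np.asarray(T), np.asarray(M0), np.asarray(M1)
    idx = np.arange(len(T))
    M = T * M1 + (1 - T) * M0                               # Equation (1)
    ate = np.mean(Y_pot[idx, 1, M1] - Y_pot[idx, 0, M0])    # Equation (8)
    wcde = np.mean(Y_pot[idx, 1, M] - Y_pot[idx, 0, M])     # Equation (9)
    return ate, wcde, ate - wcde                            # Equation (10)

rng = np.random.default_rng(0)
n, K = 1000, 3                        # K mediator categories (assumed for the demo)
T = rng.integers(0, 2, size=n)
M0 = rng.integers(0, K, size=n)
Y_pot = rng.normal(size=(n, 2, K))

# Treatment never changes the mediator: M1 = M0 for every unit.
ate, wcde, iie = oracle_effects(T, M0, M0.copy(), Y_pot)
print(ate, wcde, iie)
```

With $M_1 = M_0$ the two averages are computed from identical arrays, so the implied indirect effect is zero regardless of how the potential outcomes are distributed.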

A case of a binary mediator

We consider the case of a binary mediator. By Equation 2, the observed outcome $Y$ is expressed as

$$\begin{aligned}
Y &= T M Y_1(1) + T(1-M) Y_1(0) + (1-T) M Y_0(1) + (1-T)(1-M) Y_0(0) \\
&= T M_1 Y_1(1) + T(1-M_1) Y_1(0) + (1-T) M_0 Y_0(1) + (1-T)(1-M_0) Y_0(0),
\end{aligned} \tag{11}$$

where $M_1(1) = M_1$, $M_0(1) = M_0$, $M_1(0) = 1 - M_1$, and $M_0(0) = 1 - M_0$.

Using Equation 7,

$$\begin{aligned}
Y_1 = Y_1(M_1) &= M_1 Y_1(1) + (1-M_1) Y_1(0), & Y_1(M_0) &= M_0 Y_1(1) + (1-M_0) Y_1(0), \\
Y_0(M_1) &= M_1 Y_0(1) + (1-M_1) Y_0(0), & Y_0 = Y_0(M_0) &= M_0 Y_0(1) + (1-M_0) Y_0(0).
\end{aligned} \tag{12}$$

For example, for a unit with $M_0 = 1$, the potential outcome under $T=1$ with the mediator held at $M_0$, $Y_1(M_0)$, equals $Y_1(1)$.

Then, $ATE$, $WCDE$, and $IIE$ are expressed as follows:

$$\begin{aligned}
ATE &= E[M_1 Y_1(1) - M_0 Y_0(1) + (1-M_1) Y_1(0) - (1-M_0) Y_0(0)] \\
WCDE &= E[M \times ICDE(1) + (1-M) \times ICDE(0)] \\
&= E[M(Y_1(1) - Y_0(1)) + (1-M)(Y_1(0) - Y_0(0))] \\
&= E[(T M_1 + (1-T) M_0)(Y_1(1) - Y_0(1)) + (1 - T M_1 - (1-T) M_0)(Y_1(0) - Y_0(0))] \\
&\neq NDE = \frac{1}{2}[NDE(1) + NDE(0)] \\
&= E\Bigl[\frac{M_1 + M_0}{2} ICDE(1) + \Bigl(1 - \frac{M_1 + M_0}{2}\Bigr) ICDE(0)\Bigr] \\
IIE &= E[(M_1 - M)(Y_1(1) - Y_1(0)) + (M - M_0)(Y_0(1) - Y_0(0))] \\
&\neq NIE = \frac{1}{2}[NIE(1) + NIE(0)] = \frac{1}{2} E[(M_1 - M_0)\{(Y_1(1) - Y_1(0)) + (Y_0(1) - Y_0(0))\}],
\end{aligned} \tag{13}$$

where $M(1) = M$ and $M(0) = 1 - M$.

It is easily shown that, with randomization of $T$, the natural direct effect $NDE = \frac{1}{2}[NDE(1) + NDE(0)]$ evaluates the direct effect as if the treatment distribution were $p(T=1) = 0.5$, which generally differs from the “natural” observed distribution. From these equations, the difference between $WCDE$ and $NDE$ is expected to grow as $P(T=1)$ deviates from $0.5$. Note that in the above equations the expectation is taken over the population distribution of $Y_j(k)\ (j,k = 0,1)$, $M_1$, $M_0$ and $T$. Moreover, as discussed in the next section, $WCDE$ is estimable without Assumption 4 or SI2, while $NDE$ is not.
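The dependence of $WCDE$ on $P(T=1)$ can be illustrated with a small calculation. Assuming $T$ is independent of the potential variables with $P(T=1)=p$, the expression for $WCDE$ in Equation 13 reduces to an average that puts unit-level weight $pM_1 + (1-p)M_0$ on $ICDE(1)$, which coincides with the $NDE$ weight $(M_1+M_0)/2$ at $p=0.5$. The function names and the four-unit toy population below are hypothetical and serve only as a sketch.

```python
import numpy as np

def wcde_at(p, M0, M1, icde0, icde1):
    """WCDE for a binary mediator when T is randomized with P(T=1) = p."""
    w = p * M1 + (1 - p) * M0          # E[M | potential variables]
    return np.mean(w * icde1 + (1 - w) * icde0)

def nde_avg(M0, M1, icde0, icde1):
    """NDE = (NDE(1) + NDE(0)) / 2 for a binary mediator (Equation 13)."""
    w = 0.5 * (M1 + M0)
    return np.mean(w * icde1 + (1 - w) * icde0)

# Toy population of four units (values chosen by hand for illustration).
M0 = np.array([0.0, 0.0, 1.0, 0.0])
M1 = np.array([1.0, 1.0, 1.0, 0.0])
icde0 = np.array([1.0, 1.0, 1.0, 1.0])   # Y_1(0) - Y_0(0)
icde1 = np.array([2.0, 3.0, 2.0, 2.0])   # Y_1(1) - Y_0(1)

nde = nde_avg(M0, M1, icde0, icde1)
wcde_half = wcde_at(0.5, M0, M1, icde0, icde1)
wcde_small = wcde_at(0.1, M0, M1, icde0, icde1)
print(nde, wcde_half, wcde_small)
```

In this toy population the two quantities agree at $p = 0.5$ and separate as $p$ moves away from it, in line with the remark above.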

4 Identification and Estimation

Instead of Assumptions 1-4 (or SI1 and SI2), we introduce the following mean independence versions of Assumptions 1-4, SI1 and SI2:

Assumption $1'$

$E[Y_j(k) \mid T, v] = E[Y_j(k) \mid v]\ \forall j,k,$

Assumption $2'$

$E[M_j \mid T, v] = E[M_j \mid v]\ \forall j,$

Assumption $3'$

$E[Y_j(k) \mid M, T=j, v] = E[Y_j(k) \mid T=j, v]\ \forall j,k,$

Assumption $4'$

$E[Y_j(k) \mid M_{j^*}, v] = E[Y_j(k) \mid v]\ \forall j,j^*,k,$

Assumption $SI1'$

$E[M_{j^*} Y_j(k) \mid T, v] = E[M_{j^*} Y_j(k) \mid v]\ \forall j,j^*,k,$

Assumption $SI2'$

$E[Y_j(k) \mid M_{j^*}, T=j, v] = E[Y_j(k) \mid T=j, v]\ \forall j,j^*,k.$

Assumption $SI1'$ implies the mean independence version of the ignorability assumption (Rosenbaum & Rubin, 1983), $E[Y_j \mid T, v] = E[Y_j \mid v]\ \forall j$, which is sufficient for identifying ATE.

For identification of $WCDE$, we consider the following two cases: Case 1, when both $M$ and $T$ are randomized or ignorable given $v$, and Case 2, when $M$ is not directly manipulable.

Case 1: When both $M$ and $T$ can be randomized or are ignorable given covariates

In this case we can identify ATE, WCDE and IIE in the following ways:

Step 1

Divide the sample into two equivalent subgroups (usually by randomization).

Step 2

In the first group, randomize $T$ given $v$ to obtain a consistent estimator of ATE, $\hat{E}[Y_1 - Y_0]$, and of $p(M) = p(M_1 \mid T=1)p(T=1) + p(M_0 \mid T=0)p(T=0)$.

Step 3

In the second group, randomize both $M$ and $T$ given $v$, with the distributions of $T$ and $M$ set equal to those in the first group, to identify WCDE (and IIE) by using Theorem 2 below.

Note that in the first group, randomizing $T$ ensures that Assumptions $2'$ and $SI1'$ hold automatically, which is sufficient to identify $ATE$ and $p(M)$. In the second group, randomizing both $T$ and $M$ ensures that Assumptions $1'$ and $3'$ hold. By the following theorem, $WCDE$ is identifiable from the second-group data, and hence so is $IIE = ATE - WCDE$, without any additional assumptions such as mean independence between potential outcomes and potential mediators (i.e., Assumption $4'$ or $SI2'$).

Theorem 2.

$WCDE$ is identifiable from observed data under Assumptions $1'$ and $3'$.

Proof of Theorem 2.

Under these assumptions, $WCDE$ is expressed as

$$\begin{aligned}
WCDE &= E_v\Bigl[\sum_{m=0}^{M^*} E[M(m)\, ICDE(m) \mid v]\Bigr] = E_v\Bigl[\sum_{m=0}^{M^*} E_T\bigl[E[M(m) \mid T]\, E[ICDE(m) \mid T] \mid v\bigr]\Bigr] \\
&= E_v \sum_{m=0}^{M^*}\Bigl[p(T=1 \mid v)\, p(M(m)=1 \mid T=1, v)\, E[Y_1(m) - Y_0(m) \mid T=1, v] \\
&\qquad + p(T=0 \mid v)\, p(M(m)=1 \mid T=0, v)\, E[Y_1(m) - Y_0(m) \mid T=0, v]\Bigr] \\
&= E_v \sum_{m=0}^{M^*}\Bigl[p(M(m)=1 \mid v)\,(E[Y_1(m) \mid v] - E[Y_0(m) \mid v])\Bigr] \\
&= E_v \sum_{m=0}^{M^*}\Bigl[p(M(m)=1 \mid v) \times (E[Y_1(m) \mid T=1, M=m, v] - E[Y_0(m) \mid T=0, M=m, v])\Bigr].
\end{aligned} \tag{14}$$

Since $p(M)$ and $E[Y_j(k) \mid T=j, M=k, v]\ (\forall j,k)$ can be obtained from observed data, $WCDE$ is identifiable. ∎

It should be noted again that the identification of NDE and NIE requires Assumption $4'$ or $SI2'$ even after randomizing both $T$ and $M$ (Pearl, 2014).

Case 2: When $M$ is not directly manipulable

If randomization of $M$ is not feasible, some untestable assumptions must be accepted to identify causal direct/indirect effects. In line with Theorem 2, it is sufficient to assume that Assumptions $1'$, $2'$, $3'$ and $SI1'$ hold given sufficiently rich covariates in order to identify WCDE and IIE. Because $M$ is not directly manipulable in this case, Assumption $3'$ (and the other assumptions when $T$ is also not manipulable) is untestable; however, as mentioned earlier, these assumptions are weaker than Assumption $4'$ or $SI2'$.

Estimation

For simplicity, we consider the case without covariates. For Case 1, the estimator of WCDE is expressed by the observed quantities:

$$\widehat{WCDE} = \sum_{m=0}^{M^*} \hat{p}(M(m)=1)\,(\bar{y}_{1|M=m} - \bar{y}_{0|M=m}), \tag{15}$$

where $\hat{p}(M(m)=1)$ is the sample proportion with $M=m$ in the first group and $\bar{y}_{j|M=m}$ is the average of $y$ for units with $T=j$ and $M=m$ in the second group. By a simple application of the delta method, the asymptotic variance of $\widehat{WCDE}$ is expressed as

$$\frac{1}{N}\Bigl\{d^t(\mathrm{diag}(p) - p p^t) d + \sum_{m=0}^{M^*} p_m^2\,[V(\bar{y}_{1|M=m}) + V(\bar{y}_{0|M=m})]\Bigr\}, \tag{16}$$

where $p_m = P(M(m)=1)$, $p = (p_0, \cdots, p_{M^*})^t$, $d_m = E(y_1(m)) - E(y_0(m))$ and $d = (d_0, \cdots, d_{M^*})^t$.
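The point estimator in Equation (15) can be sketched in a few lines of Python. The function name `wcde_hat`, its argument names, and the toy data are assumptions made for illustration; the variance in Equation (16) is not implemented here. Mediator proportions come from the first group and cell means of the outcome from the second group, as described above.

```python
import numpy as np

def wcde_hat(M_first, T_second, M_second, Y_second, n_levels):
    """Plug-in estimator of WCDE following Equation (15), no covariates.

    M_first : mediator values observed in the first group (only T randomized)
    T_second, M_second, Y_second : data from the second group (T and M randomized)
    n_levels : number of mediator categories, M in {0, ..., n_levels - 1}
    Note: an empty (T, M) cell in the second group yields NaN.
    """
    M_first = np.asarray(M_first)
    T2, M2, Y2 = map(np.asarray, (T_second, M_second, Y_second))
    est = 0.0
    for m in range(n_levels):
        p_m = np.mean(M_first == m)                  # p-hat(M(m) = 1)
        y1 = Y2[(T2 == 1) & (M2 == m)].mean()        # y-bar_{1|M=m}
        y0 = Y2[(T2 == 0) & (M2 == m)].mean()        # y-bar_{0|M=m}
        est += p_m * (y1 - y0)
    return est

# Hand-checkable toy data (hypothetical).
M_first = [0, 0, 1, 1, 1]                 # p-hat = (0.4, 0.6)
T_second = [1, 1, 0, 0, 1, 0]
M_second = [0, 0, 0, 1, 1, 1]
Y_second = [3.0, 5.0, 1.0, 2.0, 6.0, 4.0]
est = wcde_hat(M_first, T_second, M_second, Y_second, n_levels=2)
print(est)   # 0.4 * (4 - 1) + 0.6 * (6 - 3)
```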

An unbiased estimator of ATE is $\bar{y}_1 - \bar{y}_0$, where $\bar{y}_j$ is the average of $y$ for units with $T=j$ in the whole sample. The whole sample can be used because Assumptions $1'$ and $3'$ hold in the second group as well, so the difference in average outcomes is also an unbiased estimator of ATE there.

For Case 2, under ignorability given the covariates $v$, ATE and WCDE are expressed as:

$$\begin{aligned}
ATE &= E_v\Bigl[E[Y \mid T=1, v] - E[Y \mid T=0, v]\Bigr] \\
WCDE &= E_v \sum_{m=0}^{M^*}\Bigl[P(M(m)=1 \mid v)\{E[Y \mid T=1, M=m, v] - E[Y \mid T=0, M=m, v]\}\Bigr].
\end{aligned} \tag{17}$$

Then various methods, such as inverse probability weighting or doubly robust estimators, can be used to estimate $ATE$ and $WCDE$.
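For a discrete covariate, Equation (17) also admits a direct nonparametric plug-in, which is sketched below in place of the weighting and doubly robust estimators mentioned above. The function name `plugin_ate_wcde` and the toy data are hypothetical, and the sketch assumes every $(T, M, v)$ cell that occurs in a stratum is populated in both treatment arms.

```python
import numpy as np

def plugin_ate_wcde(T, M, Y, V):
    """Stratified plug-in for ATE and WCDE in Equation (17), discrete V only."""
    T, M, Y, V = map(np.asarray, (T, M, Y, V))
    ate, wcde = 0.0, 0.0
    for v in np.unique(V):
        in_v = V == v
        p_v = in_v.mean()                                    # p(v)
        ate += p_v * (Y[in_v & (T == 1)].mean() - Y[in_v & (T == 0)].mean())
        for m in np.unique(M[in_v]):
            p_m_given_v = np.mean(M[in_v] == m)              # P(M(m) = 1 | v)
            y1 = Y[in_v & (M == m) & (T == 1)].mean()        # E[Y | T=1, M=m, v]
            y0 = Y[in_v & (M == m) & (T == 0)].mean()        # E[Y | T=0, M=m, v]
            wcde += p_v * p_m_given_v * (y1 - y0)
    return ate, wcde

# Hand-checkable toy data with two covariate strata (hypothetical).
T = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
M = [0, 0, 1, 1, 0, 0, 1, 1, 1, 1]
Y = [2, 1, 5, 3, 4, 4, 7, 3, 9, 5]
V = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
ate, wcde = plugin_ate_wcde(T, M, Y, V)
print(ate, wcde, ate - wcde)
```

In this toy data the mediator distribution is the same in both treatment arms within each stratum, so the plug-in ATE and WCDE coincide and the implied indirect effect is zero.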

Hypothetical ratio adjustment for treatment

As stated in the previous section, $WCDE$ evaluates the direct effect under the observed proportion $\hat{p}(T=1)$ of the treatment group. If the researcher wishes to consider the direct effect under a hypothetical proportion (say $p^*$) of the treatment group, each treated individual can be given a weight of $\frac{p^*}{\hat{p}(T=1)}$ and each control individual a weight of $\frac{1-p^*}{1-\hat{p}(T=1)}$.
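The weights described above can be computed as in the following sketch. The function name `ratio_adjustment_weights` is an assumption for illustration; how the weights then enter the estimator of $WCDE$ (for example through weighted proportions and weighted cell means) is not spelled out in the text and is therefore left out.

```python
import numpy as np

def ratio_adjustment_weights(T, p_star):
    """Weights that rebalance the sample to a hypothetical treated share p_star."""
    T = np.asarray(T)
    p_hat = T.mean()                                   # observed p-hat(T = 1)
    return np.where(T == 1,
                    p_star / p_hat,                    # treated units
                    (1 - p_star) / (1 - p_hat))        # control units

T = np.array([1, 0, 0, 0])                             # observed share 0.25
w = ratio_adjustment_weights(T, p_star=0.5)
weighted_share_treated = np.sum(w * T) / np.sum(w)
print(w, weighted_share_treated)
```

After weighting, the treated share of the sample equals the chosen $p^*$, which is the sense in which the weighted analysis refers to the hypothetical proportion.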

Generalization

We can also address the case where the mediator $m$ is continuous. With a continuous mediator $m$, WCDE is expressed as follows:

$$WCDE = \int\!\!\int \{E[Y_1(m) \mid m, v] - E[Y_0(m) \mid m, v]\}\, p(m \mid v)\, p(v)\, dm\, dv. \tag{18}$$

Under Assumptions $1'$ and $3'$, WCDE is expressed in terms of quantities identifiable from observed data:

$$WCDE = \int\!\!\int \{E[Y_1(m) \mid T=1, m, v] - E[Y_0(m) \mid T=0, m, v]\}\, p(m \mid v)\, p(v)\, dm\, dv. \tag{19}$$

ATE is identifiable under Assumption $SI1'$, so IIE is also identified as $IIE = ATE - WCDE$.

5 Illustrative simulation

For illustrative purposes, we present a simulation study that compares the defined effects with the previously proposed ones. We numerically evaluated bias from the true values (population WCDE for the proposed method in Case 1 and population NDE for the existing method with assumptions SI1 and SI2). We consider the data-generating model in which assumption SI2 in Section 1 can be violated.

For simplicity, we consider a binary mediator $M$, and define two latent continuous potential mediators $M_0^L$ and $M_1^L$ such that $M_j = 1$ if $M_j^L > 0$ and $M_j = 0$ otherwise $(j = 0, 1)$. We generated 10,000 samples of size $n = 4{,}000$ from the joint vector of potential mediators and potential outcomes, $W = (M_0^L, M_1^L, Y_0(0), Y_0(1), Y_1(0), Y_1(1))^t$, which follows a finite scale mixture of multivariate normal distributions:

$$W \sim 0.6 \times N(\mu, \Sigma_1) + 0.4 \times N(\mu, \Sigma_2), \tag{20}$$

where $\mu = (-1, 1, 0, 0.2, 0.6, 1)$, $\Sigma_1 = \Sigma$, $\Sigma_2 = 2 \times \Sigma$, and the diagonal elements of $\Sigma$ are set to $1$. Off-diagonal elements are set such that $Cov(Y_j(k), Y_l(m)) = 0.5$ and $Cov(M_0, M_1) = 0.6$. For simplicity, the covariances between potential mediators and potential outcomes are set as follows: $Cov(M_0, Y_0(0)) = Cov(M_0, Y_1(0)) = Cov(M_1, Y_0(1)) = Cov(M_1, Y_1(1)) = \phi$, and $Cov(M_0, Y_0(1)) = Cov(M_0, Y_1(1)) = Cov(M_1, Y_0(0)) = Cov(M_1, Y_1(0)) = -\phi$. In this setup, $p(M_1 = 1) \approx 0.809$ and $p(M_0 = 1) \approx 0.191$.

The treatment indicator $T$ is generated from the Bernoulli distribution with $Pr(T=1) = p$, independently of $W$. We consider four cases with $p = 0.01$, $0.1$, $0.3$ and $0.5$, and in each case $\phi$ varies from $-0.15$ to $0.15$ in increments of $0.05$. Note that Assumption 4 or SI2 is violated in all of these models except when $\phi = 0$.
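The data-generating process described above can be sketched as follows. This is a hypothetical reconstruction, not the author's code: the names `build_sigma` and `draw` are assumptions, the stated covariances involving $M_0$ and $M_1$ are read as covariances of the latent variables $M_0^L$ and $M_1^L$ (since $\Sigma$ is the covariance of $W$), and the scale mixture is implemented by multiplying the covariance by 1 or 2 per unit.

```python
import math
import numpy as np

# Component order of W: (M0L, M1L, Y0(0), Y0(1), Y1(0), Y1(1))
MU = np.array([-1.0, 1.0, 0.0, 0.2, 0.6, 1.0])

def build_sigma(phi):
    """Covariance matrix Sigma with unit variances, as described in the text."""
    S = np.eye(6)
    S[0, 1] = S[1, 0] = 0.6                        # Cov(M0L, M1L)
    for a in range(2, 6):
        for b in range(2, 6):
            if a != b:
                S[a, b] = 0.5                      # Cov between potential outcomes
    k_of_col = {2: 0, 3: 1, 4: 0, 5: 1}            # mediator level k of each Y_j(k)
    for col, k in k_of_col.items():
        for j in (0, 1):                           # row j holds M_j^L
            S[j, col] = S[col, j] = phi if j == k else -phi
    return S

def draw(n, phi, p, rng):
    """Draw one dataset; returns observed (T, M, Y) and the potential variables."""
    scale = np.where(rng.random(n) < 0.6, 1.0, 2.0)         # Sigma_1 or Sigma_2
    Z = rng.multivariate_normal(np.zeros(6), build_sigma(phi), size=n)
    W = MU + np.sqrt(scale)[:, None] * Z
    M0 = (W[:, 0] > 0).astype(int)
    M1 = (W[:, 1] > 0).astype(int)
    Y_pot = W[:, 2:].reshape(n, 2, 2)                        # Y_pot[i, j, m] = Y_j(m)
    T = (rng.random(n) < p).astype(int)                      # independent of W
    M = T * M1 + (1 - T) * M0
    Y = Y_pot[np.arange(n), T, M]
    return T, M, Y, M0, M1, Y_pot

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Implied P(M1 = 1) under the mixture: latent mean 1, variance 1 or 2.
p_m1_theory = 0.6 * std_normal_cdf(1.0) + 0.4 * std_normal_cdf(1.0 / math.sqrt(2.0))

T, M, Y, M0, M1, Y_pot = draw(4000, phi=0.15, p=0.5, rng=np.random.default_rng(0))
print(round(p_m1_theory, 3), M1.mean(), M0.mean())
```

Under this reading, the implied value of $p(M_1 = 1)$ matches the figure quoted in the text, and $\Sigma$ remains positive definite at the extreme values $\phi = \pm 0.15$.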

The true values of $WCDE$ and $NDE$ under this model are difficult to calculate analytically. We therefore evaluate them numerically using the 10,000 simulated datasets.

We compare the proposed estimator of $WCDE$ with the existing estimator of $NDE$ implied by Equation 18 in Imai et al. (2010b), which is frequently used in applied studies. The results are shown in Table 1, and those with $p = 0.5$ are illustrated in Figure 1, in which the horizontal axis is $\phi$ and the vertical axis is the bias from the true values.

As shown in Figure 1 and Table 1, the bias of the previously proposed estimator can be large for large deviations from Assumption SI2 (i.e., $\phi$ large in absolute value), as mentioned in the sensitivity analysis in Imai et al. (2010b), while the proposed method recovers the true values on average. The pattern of the bias does not change with $p$, the proportion of $T=1$, but in setups with small $p$ the variance of both estimators is large because the number of units with $T=1$ is very small.

Table 1 shows the numerically evaluated population $ATE$, $WCDE$ and $NDE$ in each setup. The difference between $WCDE$ and $NDE$ is present but not large for small $p$, while, as mentioned in Section 3, it is negligible if $p(T=1) = 0.5$.

It is also shown that the bias of the existing estimator can be very large relative to the size of the true causal mediation effect. In particular, in the setup with $\phi = -0.15$, where $ATE$ is almost the same as $NDE$ (i.e., $NIE$ is almost zero), the existing method wrongly finds a “(pseudo-)mediation” effect even though the true model contains no mediation effect, because the mediator and potential outcomes are correlated.

Table 1: Simulation results [table image not reproduced]

Figure 1: Result for $p = 0.5$. The error bars indicate one standard deviation calculated from the 10,000 estimates.

6 Discussion

In this paper, we proposed a new definition of causal direct and indirect effects in causal mediation analysis.

Identification of the previously proposed quantities, the natural direct effect and the natural indirect effect, is not possible even when both treatment and mediator are randomized. Therefore, it is unavoidable to employ the untestable assumption of independence between potential outcomes and potential mediators, some of which are never observed simultaneously.

The proposed quantities are identifiable without any assumption when both treatment and mediator are randomized. Even when randomization is not possible, the proposed quantities require weaker assumptions than those for the identification of traditional quantities.

When it is difficult to directly manipulate $M$, Assumption $3'$ is required for identification of the proposed effects. Other approaches to identification, such as the principal stratification approach (Frangakis & Rubin, 2002; Forastiere et al., 2018), are a promising strategy that we will investigate in a future study.

In this paper, we focused on the case with binary treatment, but the definition of WCDE is useful for the case with multi-valued treatment. In multi-valued treatment, the measure of indirect effect should be defined in terms of the sum of squares or variance of the expectations, instead of the traditional “difference of the expectations” formulation in the case of binary treatment, which will be considered elsewhere.

References

  • Ding & Van der Weele (2016) Ding, P., & Van der Weele, T.J. (2016). Sharp sensitivity bounds for mediation under unmeasured mediator-outcome confounding. Biometrika 103, 483–490.
  • Forastiere et al. (2018) Forastiere, L., Mattei, A. & Ding, P. (2018). Principal ignorability in mediation analysis: through and beyond sequential ignorability. Biometrika 105, 979–986.
  • Frangakis & Rubin (2002) Frangakis, C.E., & Rubin, D.B. (2002). Principal stratification in causal inference. Biometrics 58, 21–29.
  • Imai et al. (2010a) Imai, K., Keele, L. & Tingley, D. (2010). A general approach to causal mediation analysis. Psychol. Methods. 15, 309–334.
  • Imai et al. (2010b) Imai, K., Keele, L. & Yamamoto, T. (2010). Identification, Inference and Sensitivity Analysis for Causal Mediation Effects. Statistical. Sci. 25, 51–71.
  • Imbens (2000) Imbens, G.W. (2000). The Role of the Propensity Score in Estimating Dose-Response Functions. Biometrika. 87, 706–710.
  • Miles et al. (2020) Miles, C.H., Shpitser, I., Kanki, P., Meloni, S. & Tchetgen Tchetgen, E.J. (2020). On semiparametric estimation of a path-specific effect in the presence of mediator-outcome confounding. Biometrika 107, 159–172.
  • Pearl (2001) Pearl, J. (2001). Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (eds. J. Breese and D. Koller), 401–420, Morgan Kaufmann, San Francisco.
  • Pearl (2009) Pearl, J. (2009). Causality: Models, Reasoning, and Inference, 2nd ed., Cambridge University Press, New York.
  • Pearl (2014) Pearl, J. (2014). Interpretation and Identification of Causal Mediation. Psychol. Methods. 19, 459–481.
  • Robins & Greenland (1992) Robins, J.M. & Greenland, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology. 3, 143–155.
  • Robins (2003) Robins, J.M. (2003). Semantics of causal DAG models and the identification of direct and indirect effects. In Highly Structured Stochastic Systems (eds. P.J. Green, N.L. Hjort and S. Richardson), 70–81, Oxford University Press.
  • Rosenbaum & Rubin (1983) Rosenbaum, P.R. & Rubin, D.B. (1983). The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika 70, 41–55.
  • Rubin (1974) Rubin, D.B. (1974). Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. J. Educ. Psychol 66, 688–701.
  • Tchetgen Tchetgen & Shpitser (2012) Tchetgen Tchetgen, E.J. & Shpitser, I. (2012). Semiparametric theory for causal mediation analysis: efficiency bounds, multiple robustness and sensitivity analysis. Ann. Statist. 40, 1816–1845.
  • Van der Weele (2009) Van der Weele, T. J. (2009). Mediation and mechanism. Eur. J. Epidemiol. 24, 217–224.