Decomposing Identification Gains and Evaluating Instrument Identification Power for Partially Identified Average Treatment Effects
Abstract
This paper examines the identification power of instrumental variables (IVs) for the average treatment effect (ATE) in partially identified models. We decompose the ATE identification gains into components driven by IV relevancy, IV strength, the direction and degree of treatment endogeneity, and matching via exogenous covariates. Our decomposition is demonstrated with graphical illustrations, simulation studies and an empirical example of childbearing and women’s labour supply. Our analysis offers insights for understanding the complex role of IVs in ATE identification and for selecting IVs in practical policy designs. Simulations also suggest potential uses of our analysis for detecting irrelevant instruments.
JEL Codes: C14, C31, C35, C36
Keywords: Heterogeneous Treatment Effect; Binary Dependent Variables; Propensity Score; Asymmetric Endogeneity; Instrument Identification Power.
1 Introduction
The average treatment effect (ATE) is an important policy-relevant measure in causal analysis (heckman2006understanding; imbens2004nonparametric), but its identification and estimation in empirical research have long been contentious when the treatment is endogenous and instrumental variables (IVs) are used as the identification strategy. This paper takes an empirical causal analyst’s perspective and examines the roles of IVs and other factors in the identification and estimation of the ATE within a partially identified modelling framework. Although there have been significant theoretical developments in the econometric literature on conventional IV estimands and treatment effect bounds under broader assumptions (see e.g. manski1990nonparametric; balke1997bounds; heckman1999local; heckman2001instrumental; heckman2005structural; manski2000monotone; chernozhukov2007estimation; chesher2010instrumental), the exact role of IVs and the associated estimation of various causal effects remain poorly understood in applied economic studies. By synthesising the existing econometric literature on IVs and ATE bounds, and with the help of some novel analyses, we aim to facilitate a better understanding of the complex role of IVs in ATE identification in practical applications.
In empirical causal analyses, it is common for researchers to estimate the causal effect of an endogenous treatment by a conventional IV estimator as the “identification strategy” (see, e.g., nunn2014us). In models with homogeneous treatment responses, using any one of the valid IVs can lead to point identification of the ATE, and conventional IV estimators correctly estimate the ATE. However, as shown by heckman2006understanding, in heterogeneous treatment effect models, different IVs identify different local treatment effects (imbens1994identification), and conventional IV estimates are no longer robust to the choice of alternative IVs. In fact, as explained by heckman2006understanding, the classical IV estimand is not the ATE and may be a quantity with no easily interpretable meaning, regardless of IV strength.
Evidence against homogeneous treatment effects abounds, with estimates based on different sets of valid IVs often producing different treatment effect estimates in practice (carneiro2003understanding; basu2007use; angrist2010extrapolate). Once heterogeneous treatment response is allowed, the ATE is often not point identified. Thus, the analysis in heckman2006understanding presents a convincing argument that if the ATE is of primary interest, the identified set for the ATE based on a partially identified model should be preferred to an analysis based on a conventional IV estimation approach. However, there have only been limited applications of partially identified ATE analysis in empirical studies. Consequently, understanding the role of IVs in this setting is important for promoting heterogeneous treatment models for empirical causal analysis.
In heterogeneous treatment effects models, heckman2001instrumental demonstrate that the property of “identification at infinity” (hereafter IAI, heckman1990varieties), namely, the availability of IVs that produce propensity scores of zero and one in the limit, leads to point identification of the ATE. However, this condition is rarely satisfied in practice, especially when IVs have limited variation. When IAI fails, inference on the ATE can still be carried out by constructing an identified set for the ATE. Therefore, in a partial identification framework, the impact of IVs can be studied by their influence on the ATE bounds.
We focus on models with a binary outcome and a binary endogenous treatment. Such models have been widely used in empirical studies since the pioneering work of heckman1978dummy. See neal1997effects, thornton2008demand and ashraf2014household for examples that use fully parametric bivariate probit models, and see aakvik2005estimating, bhattacharya2008treatment; bhattacharya2012treatment and kreider2012identifying; kreider2016identifying for examples that rely on nonparametric models. The role played by IVs has been a topic of discussion in the literature, including the notion of “identification by functional form” (see e.g. maddala1986limited; wilde2000identification; freedman2010endogeneity; mourifie2014note; han2017identification; li2019bivariate). [Footnote 1: See li2019bivariate for a summary on the topic, including a sufficient condition regarding the support of exogenous regressors for models such as the bivariate probit to achieve ATE point identification without any IVs.] However, once the restrictive parametric assumptions fail to hold, IVs become necessary.
The important role of IVs has been noted for partially identified ATE in heterogeneous treatment effect models (see e.g. manski1990nonparametric; heckman2001instrumental; chesher2005nonparametric; chesher2010instrumental; shaikh2011partial; li2019bivariate). heckman2001instrumental show that for their model with threshold crossing for the treatment, it is the width between the minimum and maximum propensity scores reached by the available instruments that determines the ATE bound width. chesher2010instrumental also points out that the support and the strength of the IVs are important in determining the ATE bounds, whilst li2018bounds present some simulation results on bound width and IV strength. However, the mechanism through which the IV strength translates to identification gains in partially identified models and whether/how other factors also play a part have not been laid bare in a manner that can be readily understood by practitioners.
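The link between propensity score extremes and bound width can be illustrated with a small numerical sketch. It assumes worst-case bounds evaluated at the minimum and maximum propensity scores, which is our reading of the heckman2001instrumental result, and it is not the SV bounds studied in this paper. The function name `hv_ate_bounds`, its arguments and the numerical inputs are hypothetical and chosen only for illustration.

```python
def hv_ate_bounds(p_min, p_max, m1_at_pmax, m0_at_pmin, y_lo=0.0, y_hi=1.0):
    """Worst-case ATE bounds evaluated at the extreme propensity scores.

    Illustrative assumption: the treatment follows a threshold crossing rule.
    m1_at_pmax stands for E[Y | D=1, P=p_max] and
    m0_at_pmin stands for E[Y | D=0, P=p_min].
    """
    if not (0.0 <= p_min <= p_max <= 1.0):
        raise ValueError("need 0 <= p_min <= p_max <= 1")
    # E[Y_1]: the treated part is identified at p_max, the rest is bounded.
    ey1_lo = p_max * m1_at_pmax + (1.0 - p_max) * y_lo
    ey1_hi = p_max * m1_at_pmax + (1.0 - p_max) * y_hi
    # E[Y_0]: the untreated part is identified at p_min, the rest is bounded.
    ey0_lo = (1.0 - p_min) * m0_at_pmin + p_min * y_lo
    ey0_hi = (1.0 - p_min) * m0_at_pmin + p_min * y_hi
    return ey1_lo - ey0_hi, ey1_hi - ey0_lo


# Hypothetical inputs; the conditional means are held fixed for illustration.
weak = hv_ate_bounds(0.4, 0.6, 0.7, 0.3)    # narrow propensity score range
strong = hv_ate_bounds(0.1, 0.9, 0.7, 0.3)  # wide propensity score range
iai = hv_ate_bounds(0.0, 1.0, 0.7, 0.3)     # identification at infinity

for name, (lo, hi) in [("weak", weak), ("strong", strong), ("iai", iai)]:
    print(f"{name}: [{lo:.2f}, {hi:.2f}], width = {hi - lo:.2f}")
```

Under these assumptions the width equals (1 - p_max + p_min)(y_hi - y_lo), so for a binary outcome it shrinks one-for-one with the propensity score range and collapses to a point when the range is the full unit interval.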
In this paper, we examine the role of IVs, as well as their interplay with the degree of endogeneity and exogenous covariates, in the identification of the ATE. Following the partial identification literature, we use the ATE sign identification and the reduction in the size of the ATE identified set as measures of identification gains. [Footnote 2: For example, see Kitagawa (2009) and Swanson et al (2018) among others.] Focusing on the bivariate joint threshold crossing model and the ATE bounds proposed by shaikh2011partial (henceforth referred to as the SV model and SV bounds), we disentangle the various factors determining the ATE identification, and provide practitioners with useful insights into the different sources and natures of identification gains.
To this end, our first contribution is to highlight and demonstrate, for the case of SV bounds, how IVs achieve identification gains via their attained minimum and maximum conditional propensity scores. The implication for empirical researchers is that, unlike in homogeneous treatment effect models, in heterogeneous models, omitting relevant IVs, or misclassifying continuous IVs as binary ones, could result in a loss of identification power and wider ATE bounds.
Second, we show that, unlike the case of heckman2001instrumental, the SV bounds for a binary outcome are additionally affected by the sign and degree of treatment endogeneity. Interestingly, we find that the endogeneity drives the SV bounds asymmetrically. Specifically, the same propensity score extremes could offer much greater identification power when the ATE and the endogeneity direction are of opposite signs than when they are of the same sign. Thus, it is the interaction of IVs with other features of the model that determines the level of the IV identification power for the ATE. A similar asymmetric influence of treatment endogeneity is also noted in nonlinear parametric models (see, e.g., freedman2010endogeneity; frazier).
Our third contribution is to propose a novel decomposition of identification gains into components driven by: (i) the existence of valid IVs that identify the sign of the ATE; (ii) IV strength that determines the size of the outer set of the ATE identified set; and (iii) the variation of the exogenous covariates that further refines the outer set. The last component is the key driver for achieving the ATE sharp identified set in mourifie2015sharp and the ATE point identification in vytlacil2007dummy. Based on the decomposition, we further propose a measure for IV identification power (hereafter IIP), which captures the critical fact that the IV identification information pertaining to the ATE varies with the endogeneity degree.
The decomposition and analysis allow us to shine a light on the internal workings of the ATE partial identification mechanism and thereby characterize the structure of identification gains. This analysis allows us to offer useful insights on the extent of identification gains achievable by each individual factor, and the contribution of each IV when multiple IVs are used. Our analysis includes graphical illustrations of bound reduction anatomy, as well as Monte Carlo results on finite sample performance. We also apply our methods to an empirical study of women’s labour force participation (angrist1998children). Two IVs are used in the study: a dummy for the first two children being same-sex siblings (“Samesex”), and a dummy for the second birth being a twin (“Twins”). Our analysis shows that the identification power of Twins is about 1.4 times that of Samesex when they are used separately. When both IVs are used, the total IV identification power is only marginally larger than that of Twins alone.
Together with the theoretical decomposition analysis, we believe this paper offers useful insights for empirical causal researchers who wish to understand the complex impacts of IVs on ATE partial identification. Furthermore, we offer some practical examples where our analysis can be used for policy-relevant instrument design and selection. Our paper also sheds light on instrument relevancy. Our measure is related to existing approaches in the generalized method of moments (GMM) literature that seek to determine instrument “relevancy”. The ability of our approach to rank sets of IVs by their identification gains, in conjunction with our Monte Carlo simulation results, leads us to document, we believe for the first time, an important feature of bivariate triangular models: while in the population adding irrelevant IVs cannot tighten the ATE bounds, in finite samples using such IVs could lead to a loss in IV identification power and wider bounds when the variation of the covariates is small. We liken this phenomenon to the well-known problem of irrelevant moment conditions in GMM (see breusch1999redundancy; hall2003consistent; hall2005generalized; hall2007information, among others) and leave a more rigorous study of this topic for future research.
The rest of this paper is organized as follows. In Section 2 we present the SV model setup and the SV bounds. In Section LABEL:sectionBounds we establish three key factors that affect the ATE bounds. Section LABEL:subsectionIG introduces our decomposition of identification gains and the index of IIP. A comprehensive numerical analysis and graphical presentation are given in Section LABEL:sectionNA. Finite sample evaluation and implications for empirical causal practice are presented in Section LABEL:sectionER, and an empirical example is given in Section LABEL:emp. The paper closes in Section LABEL:con with some summary remarks. All proofs are relegated to the Appendix.
2 SV Model Setup and the ATE Bounds
Suppose we observe a binary outcome $Y$ and a binary treatment $D$. Consider the joint threshold crossing (JTC) model studied in shaikh2011partial:
$$
Y = \mathbb{1}[\nu_1(D, X) > \varepsilon_1], \qquad D = \mathbb{1}[\nu_2(X, Z) > \varepsilon_2], \tag{1}
$$
where $X$ denotes a vector of exogenous covariates, $Z$ represents a vector of instruments that can be discrete, continuous or mixed, $\nu_1$ and $\nu_2$ are unknown functions, and $\varepsilon_1$ and $\varepsilon_2$ are unobservable error terms. Let $Y_d$ denote the potential outcome for $D = d$ with $d \in \{0, 1\}$. The JTC model allows for flexible forms of heterogeneous treatment effects due to the nonseparable error structure (heckman2006understanding), and is often used in treatment evaluation studies (see, e.g., bhattacharya2008treatment; bhattacharya2012treatment; kreider2012identifying). [Footnote 3: vytlacil2002independence shows that the threshold crossing condition is equivalent to the monotonicity assumption. In the JTC models, it means that all individuals with the same observable characteristics will respond to the treatment and instrument in the same direction. bhattacharya2012treatment demonstrate that the ATE SV bounds under the JTC model (1) still hold under a rank similarity condition, a weaker property that allows heterogeneity in the sign of the ATE.]
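To make the structure of the JTC model concrete, the following is a minimal simulation sketch. It rests on assumptions that are not part of the model above: linear index functions, one scalar covariate, one binary instrument, and bivariate normal errors with correlation `rho` (that is, a bivariate probit design). The function `simulate_jtc` and all parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np


def simulate_jtc(n=200_000, alpha=0.8, beta=0.5, gamma=0.3, delta=1.0,
                 rho=0.5, seed=0):
    """Simulate a joint threshold crossing model with illustrative choices.

    Assumed (hypothetical) index functions:
      outcome index   = alpha * d + beta * x
      treatment index = gamma * x + delta * z
    Errors are standard bivariate normal with correlation rho.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)            # exogenous covariate
    z = rng.integers(0, 2, size=n)    # binary instrument
    eps1 = rng.normal(size=n)
    # Build eps2 so that corr(eps1, eps2) = rho (source of endogeneity).
    eps2 = rho * eps1 + np.sqrt(1.0 - rho ** 2) * rng.normal(size=n)

    d = (gamma * x + delta * z > eps2).astype(int)   # treatment equation
    y1 = (alpha + beta * x > eps1).astype(int)       # potential outcome, d = 1
    y0 = (beta * x > eps1).astype(int)               # potential outcome, d = 0
    y = np.where(d == 1, y1, y0)                     # observed outcome

    return {
        "y": y, "d": d, "x": x, "z": z,
        "ate": float(np.mean(y1 - y0)),              # simulated true ATE
        "naive": float(y[d == 1].mean() - y[d == 0].mean()),
        "p_z0": float(d[z == 0].mean()),             # propensity score, z = 0
        "p_z1": float(d[z == 1].mean()),             # propensity score, z = 1
    }


sim = simulate_jtc()
print(f"true ATE = {sim['ate']:.3f}, naive difference = {sim['naive']:.3f}")
print(f"propensity scores: z=0 -> {sim['p_z0']:.3f}, z=1 -> {sim['p_z1']:.3f}")
```

In this design the naive treated-minus-untreated difference overstates the simulated ATE because the errors are positively correlated and the covariate raises both indices. The binary instrument moves the propensity score over a limited range only, so identification at infinity fails.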
We are interested in the most commonly studied treatment effect, the conditional ATE, defined as
$$
ATE(x) = E[Y_1 - Y_0 \mid X = x].
$$
For notational simplicity, for any generic random variables and , henceforth we will use to represent , unless otherwise stated. The support of is denoted as and the support of conditional on is given by .
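Assumption 1 (shaikh2011partial)
(a) $(\varepsilon_1, \varepsilon_2)$ is independent of $(X, Z)$.
(b) $(\varepsilon_1, \varepsilon_2)$ has a strictly positive density with respect to the Lebesgue measure on $\mathbb{R}^2$.
(c) The support of the distribution of $(X, Z)$, $\Omega_{X,Z}$, is compact.
(d)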
,